Prediction of potential genes in microbial genomes
Time: Thu Jun 30 23:08:36 2011
Seq name: gi|229784128|gb|GG667607.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld0, whole genome shotgun sequence
Length of sequence - 174489 bp
Number of predicted genes - 152, with homology - 144
Number of transcription units - 67, operones - 36 average op.length - 3.4

   N   Tu/Op    Conserved S        Start      End  Score
                pairs(N/Pv)
   1   1 Op  1  .         - CDS       44 -    391    423  ## Closa_3389 DRTGG domain protein
   2   1 Op  2  .         - CDS      442 -    528     66  ##
                          - Prom     676 -    735    4.9
                          + Prom     555 -    614    4.4
   3   2 Op  1  .         + CDS      702 -   1196    476  ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit
   4   2 Op  2  .         + CDS     1207 -   1764    564  ## COG0642 Signal transduction histidine kinase
   5   2 Op  3  1/0.176   + CDS     1789 -   2166    418  ## COG3411 Ferredoxin
   6   2 Op  4  1/0.176   + CDS     2179 -   3966   1862  ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit
   7   2 Op  5  .         + CDS     3984 -   5729   1700  ## COG4624 Iron only hydrogenase large subunit, C-terminal domain
                          + Term    5750 -   5803    5.3
                          + Prom    6164 -   6223    4.1
   8   3 Tu  1  .         + CDS     6300 -   7706   1325  ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase
                          + Term    7751 -   7796   10.2
                          - Term    7734 -   7789   18.1
   9   4 Tu  1  .         - CDS     7805 -   8896   1074  ## COG0371 Glycerol dehydrogenase and related enzymes
                          - Prom    8918 -   8977    8.2
                          + Prom    8985 -   9044    6.4
  10   5 Op  1  .         + CDS     9071 -  10420    958  ## COG0534 Na+-driven multidrug efflux pump
  11   5 Op  2  .         + CDS    10456 -  10629    251  ## Closa_3397 hypothetical protein
  12   5 Op  3  .         + CDS    10642 -  10887    274  ## Closa_3398 Coat F domain protein
                          + Term   10931 -  10977    6.3
                          - Term   10919 -  10965   10.1
  13   6 Tu  1  .         - CDS    10971 -  12557   1768  ## COG2199 FOG: GGDEF domain
                          - Prom   12592 -  12651    6.8
                          + Prom   12589 -  12648    7.2
  14   7 Op  1  .         + CDS    12895 -  13413    604  ## Clole_1677 hypothetical protein
  15   7 Op  2  .         + CDS    13482 -  15227   2235  ## COG2268 Uncharacterized protein conserved in bacteria
  16   7 Op  3  .         + CDS    15272 -  15970    907  ## COG1842 Phage shock protein A (IM30), suppresses sigma54-dependent transcription
                          + Term   15980 -  16035   -0.8
                          - Term   15968 -  16023    1.8
  17   8 Tu  1  .         - CDS    16209 -  18899   1545  ## COG1221 Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain
                          - Prom   18941 -  19000    2.4
  18   9 Op  1  .         - CDS    19016 -  19261    276  ## gi|266619040|ref|ZP_06111975.1| phosphocarrier protein HPr
  19   9 Op  2  .         - CDS    19268 -  19750    410  ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type)
  20   9 Op  3  .         - CDS    19782 -  19991    174  ## gi|266619042|ref|ZP_06111977.1| putative band 3 anion transport protein
  21   9 Op  4  .         - CDS    19991 -  20641    571  ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases
  22   9 Op  5  10/0.000  - CDS    20665 -  22011   1058  ## COG3775 Phosphotransferase system, galactitol-specific IIC component
  23   9 Op  6  .         - CDS    22034 -  22318    331  ## COG3414 Phosphotransferase system, galactitol-specific IIB component
                          - Prom   22377 -  22436    8.6
  24  10 Tu  1  .         + CDS    23073 -  23207     57  ##
                          - Term   24123 -  24177   17.2
  25  11 Op  1  .         - CDS    24181 -  26067   1453  ## COG0296 1,4-alpha-glucan branching enzyme
  26  11 Op  2  .         - CDS    26021 -  26131     95  ##
                          - Prom   26152 -  26211    3.6
                          + Prom   26101 -  26160    7.5
  27  12 Tu  1  .         + CDS    26250 -  26783    654  ## COG1309 Transcriptional regulator
                          + Prom   27754 -  27813   80.4
  28  13 Tu  1  .         + CDS    27929 -  28774    863  ## COG0530 Ca2+/Na+ antiporter
  29  14 Op  1  .         - CDS    29919 -  30773    428  ## COG3943 Virulence protein
  30  14 Op  2  .         - CDS    30780 -  30896    138  ##
                          - Prom   30920 -  30979    4.6
  31  15 Op  1  .         - CDS    31213 -  32664   1569  ## COG4624 Iron only hydrogenase large subunit, C-terminal domain
  32  15 Op  2  .         - CDS    32726 -  33907   1369  ## Closa_3400 protein serine/threonine phosphatase
  33  15 Op  3  1/0.176   - CDS    33907 -  35640   1719  ## COG4624 Iron only hydrogenase large subunit, C-terminal domain
  34  15 Op  4  .         - CDS    35643 -  35882    308  ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit
                          - Prom   35962 -  36021   21.4
  35  16 Op  1  .         - CDS    36923 -  38464   1717  ## COG1001 Adenine deaminase
  36  16 Op  2  .         - CDS    38497 -  39327    896  ## COG0005 Purine nucleoside phosphorylase
                          - Prom   39374 -  39433    7.6
                          + Prom   39508 -  39567    7.3
  37  17 Op  1  21/0.000  + CDS    39754 -  40860   1120  ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components
  38  17 Op  2  .         + CDS    40939 -  41706    734  ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component
  39  17 Op  3  .         + CDS    41716 -  42315    362  ## COG1118 ABC-type sulfate/molybdate transport systems, ATPase component
                          + Term   42350 -  42383    2.5
                          - Term   42290 -  42343   16.4
  40  18 Op  1  .         - CDS    42369 -  43085    858  ## Closa_3408 hypothetical protein
                          - Prom   43125 -  43184    5.8
  41  18 Op  2  .         - CDS    43291 -  43887    611  ## COG0463 Glycosyltransferases involved in cell wall biogenesis
                          - Prom   43978 -  44037   80.4
                          + TRNA   45393 -  45467   86.4  # Pro TGG 0 0
                          + TRNA   45522 -  45592   75.8  # Gly TCC 0 0
                          + TRNA   45596 -  45669   82.9  # Arg TCT 0 0
                          + TRNA   45707 -  45780   67.3  # His GTG 0 0
                          + TRNA   45827 -  45898   71.7  # Gln TTG 0 0
                          + TRNA   45903 -  45975   73.8  # Lys TTT 0 0
                          + Prom   46846 -  46905   80.4
  42  19 Tu  1  .         + CDS    47114 -  47200    122  ##
                          + Term   47282 -  47350   29.7
                          + TRNA   47252 -  47331   63.4  # Leu TAG 0 0
  43  20 Tu  1  .         - CDS    47412 -  48212    674  ## COG2207 AraC-type DNA-binding domain-containing proteins
                          - Prom   48252 -  48311   80.4
                          + Prom   49134 -  49193   25.2
  44  21 Tu  1  .         + CDS    49399 -  50796   1544  ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases
                          + Term   50804 -  50861   13.1
                          - Term   50788 -  50851   16.1
  45  22 Tu  1  .         - CDS    50918 -  52474   1300  ## Amet_0685 PucR family transcriptional regulator
                          - Prom   52545 -  52604    7.0
                          + Prom   52498 -  52557    6.2
  46  23 Op  1  11/0.000  + CDS    52629 -  53135    643  ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs
  47  23 Op  2  .         + CDS    53137 -  55404   2343  ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs
  48  23 Op  3  3/0.118   + CDS    55412 -  57775   2154  ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases
  49  23 Op  4  .         + CDS    57787 -  58416    466  ## COG2068 Uncharacterized MobA-related protein
  50  24 Op  1  7/0.059   - CDS    58382 -  58882    509  ## COG0622 Predicted phosphoesterase
  51  24 Op  2  .         - CDS    58928 -  59494    459  ## COG0127 Xanthosine triphosphate pyrophosphatase
  52  25 Tu  1  .         - CDS    59817 -  60698    836  ## COG4870 Cysteine protease
                          - Prom   60753 -  60812   16.3
  53  26 Op  1  .         - CDS    61714 -  62211    479  ## COG4870 Cysteine protease
  54  26 Op  2  .         - CDS    62216 -  63373   1182  ## COG0116 Predicted N6-adenine-specific DNA methylase
  55  26 Op  3  .         - CDS    63375 -  63923    709  ## COG0622 Predicted phosphoesterase
  56  26 Op  4  .         - CDS    63991 -  64299    355  ## COG0607 Rhodanese-related sulfurtransferase
                          - Prom   64340 -  64399    3.6
                          - Term   64379 -  64442    6.4
  57  27 Tu  1  .         - CDS    64494 -  64721    244  ##
                          - Prom   64822 -  64881    9.3
                          - Term   64952 -  64990    9.2
  58  28 Op  1  7/0.059   - CDS    65030 -  65755    186  ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1
                          - Prom   65817 -  65876    8.5
  59  28 Op  2  .         - CDS    65900 -  66595    176  ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1
                          - Prom   66669 -  66728    7.6
                          - Term   66709 -  66757   13.0
  60  29 Op  1  .         - CDS    66801 -  67247    382  ## Closa_0686 hypothetical protein
                          - Prom   67303 -  67362    8.3
  61  29 Op  2  .         - CDS    67375 -  69312   2369  ## COG3855 Uncharacterized protein conserved in bacteria
                          - Prom   69401 -  69460    3.7
                          + Prom   69483 -  69542    4.6
  62  30 Op  1  .         + CDS    69632 -  70438    592  ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III
  63  30 Op  2  .         + CDS    70492 -  71670   1123  ## COG0739 Membrane proteins related to metalloendopeptidases
                          + Term   71733 -  71785   13.4
                          - Term   71721 -  71773   13.4
  64  31 Op  1  .         - CDS    71786 -  72973   1086  ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family
  65  31 Op  2  2/0.176   - CDS    73027 -  74274   1203  ## COG4198 Uncharacterized conserved protein
  66  31 Op  3  6/0.059   - CDS    74355 -  75518   1534  ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases
  67  31 Op  4  .         - CDS    75537 -  76622   1225  ## COG1932 Phosphoserine aminotransferase
  68  31 Op  5  6/0.059   - CDS    76659 -  78338   1950  ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase]
  69  31 Op  6  1/0.176   - CDS    78358 -  80028   1957  ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase
  70  31 Op  7  .         - CDS    80064 -  81146   1175  ## COG0473 Isocitrate/isopropylmalate dehydrogenase
                          - Prom   81203 -  81262    5.5
  71  32 Tu  1  .         - CDS    81490 -  82668   1214  ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities
                          - Prom   82742 -  82801    3.8
                          + Prom   82771 -  82830    6.3
  72  33 Tu  1  .         + CDS    82892 -  83143    220  ## Closa_3641 hypothetical protein
                          + Term   83151 -  83207   22.4
                          - Term   83141 -  83193   19.4
  73  34 Op  1  .         - CDS    83198 -  83698    585  ## Sgly_1265 regulatory protein MarR
  74  34 Op  2  .         - CDS    83774 -  84790   1170  ## Closa_3658 hypothetical protein
  75  34 Op  3  4/0.059   - CDS    84838 -  86208   1577  ## COG3225 ABC-type uncharacterized transport system involved in gliding motility, auxiliary component
  76  34 Op  4  24/0.000  - CDS    86234 -  87100    856  ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component
  77  34 Op  5  .         - CDS    87101 -  88165    284  ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16
                          - Prom   88330 -  88389    6.1
  78  35 Tu  1  .         - CDS    88511 -  89194    422  ## COG1985 Pyrimidine reductase, riboflavin biosynthesis
                          - Prom   89224 -  89283    6.9
  79  36 Op  1  .         - CDS    89315 -  89935    429  ## COG0279 Phosphoheptose isomerase
  80  36 Op  2  1/0.176   - CDS    89963 -  92392   1865  ## COG0383 Alpha-mannosidase
  81  36 Op  3  .         - CDS    92379 -  94835   1812  ## COG0383 Alpha-mannosidase
  82  36 Op  4  .         - CDS    94854 -  97472   2094  ## Ccel_0950 HI0933 family protein
  83  36 Op  5  1/0.176   - CDS    97514 -  98974   1541  ## COG0366 Glycosidases
  84  36 Op  6  1/0.176   - CDS    98987 -  99967    337  ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33
  85  36 Op  7  .         - CDS    99958 - 101364   1215  ## COG1070 Sugar (pentulose and hexulose) kinases
  86  36 Op  8  .         - CDS   101397 - 101576    180  ## gi|266619111|ref|ZP_06112046.1| hypothetical protein CLOSTHATH_00104
                          - Prom  101754 - 101813    7.1
  87  37 Tu  1  .         - CDS   102715 - 104028    948  ## Mahau_2006 raffinose synthase
                          - Term  104089 - 104123   -0.7
  88  38 Op  1  .         - CDS   104133 - 104384    183  ## gi|266619113|ref|ZP_06112048.1| dihydroxy-acid dehydratase
                          - Prom  104405 - 104464    2.3
  89  38 Op  2  .         - CDS   104474 - 105688    359  ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33
                          - Prom  105726 - 105785    5.3
  90  39 Op  1  38/0.000  - CDS   105802 - 106680    814  ## COG0395 ABC-type sugar transport system, permease component
  91  39 Op  2  35/0.000  - CDS   106680 - 107579    890  ## COG1175 ABC-type sugar transport systems, permease components
  92  39 Op  3  .         - CDS   107698 - 109062   1516  ## COG1653 ABC-type sugar transport system, periplasmic component
                          - Prom  109279 - 109338    8.6
                          - Term  109576 - 109620    4.2
  93  40 Tu  1  .         - CDS   109685 - 109801    177  ##
                          - Prom  109881 - 109940    7.1
                          - Term  109967 - 110017    2.6
  94  41 Op  1  2/0.176   - CDS   110039 - 111298   1567  ## COG1653 ABC-type sugar transport system, periplasmic component
  95  41 Op  2  1/0.176   - CDS   111377 - 112291    967  ## COG2207 AraC-type DNA-binding domain-containing proteins
  96  41 Op  3  .         - CDS   112352 - 113431   1141  ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins
  97  41 Op  4  .         - CDS   113424 - 115952   2389  ## Spico_1278 hypothetical protein
  98  41 Op  5  38/0.000  - CDS   115967 - 116800   1055  ## COG0395 ABC-type sugar transport system, permease component
  99  41 Op  6  2/0.176   - CDS   116790 - 117665    973  ## COG1175 ABC-type sugar transport systems, permease components
                          - Prom  117686 - 117745   12.8
                          - Term  117787 - 117837    3.5
 100  42 Op  1  .         - CDS   117971 - 119326    656  ## COG0534 Na+-driven multidrug efflux pump
 101  42 Op  2  2/0.176   - CDS   119394 - 120239    576  ## COG1082 Sugar phosphate isomerases/epimerases
 102  42 Op  3  .         - CDS   120241 - 121056    319  ## COG1082 Sugar phosphate isomerases/epimerases
                          - Prom  121081 - 121140    1.9
                          - Term  121071 - 121111    7.1
 103  43 Op  1  31/0.000  - CDS   121142 - 121819    507  ## COG0765 ABC-type amino acid transport system, permease component
 104  43 Op  2  16/0.000  - CDS   121831 - 122676    721  ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain
 105  43 Op  3  .         - CDS   122698 - 123429    579  ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
 106  43 Op  4  .         - CDS   123447 - 124226    523  ## gi|288869860|ref|ZP_06112066.2| conserved hypothetical protein
 107  43 Op  5  .         - CDS   124255 - 124995    429  ## gi|266619132|ref|ZP_06112067.1| hypothetical protein CLOSTHATH_00127
                          - Prom  125086 - 125145   11.0
 108  44 Tu  1  .         + CDS   125403 - 126398    555  ## COG1609 Transcriptional regulators
                          + Term  126434 - 126491   21.4
                          - Term  126422 - 126479   11.8
 109  45 Tu  1  .         - CDS   126515 - 127336    702  ## COG0789 Predicted transcriptional regulators
                          - Prom  127409 - 127468    6.7
                          + Prom  127439 - 127498    7.1
 110  46 Tu  1  .         + CDS   127560 - 128915    960  ## COG0534 Na+-driven multidrug efflux pump
                          + Term  128922 - 128966    7.2
                          - Term  128863 - 128898   -1.0
 111  47 Op  1  4/0.059   - CDS   128987 - 130639   1928  ## COG0366 Glycosidases
 112  47 Op  2  38/0.000  - CDS   130666 - 131517    925  ## COG0395 ABC-type sugar transport system, permease component
 113  47 Op  3  35/0.000  - CDS   131517 - 132386   1013  ## COG1175 ABC-type sugar transport systems, permease components
 114  47 Op  4  .         - CDS   132472 - 133788   1758  ## COG1653 ABC-type sugar transport system, periplasmic component
                          - Prom  133867 - 133926    5.1
                          - Term  133829 - 133859    2.9
 115  48 Op  1  .         - CDS   134093 - 134185     59  ##
 116  48 Op  2  .         - CDS   134201 - 135178    948  ## COG1609 Transcriptional regulators
                          - Prom  135245 - 135304    5.0
                          - Term  135279 - 135315    1.3
 117  49 Tu  1  .         - CDS   135371 - 135901    411  ## Rumal_0724 metal-dependent phosphohydrolase HD sub domain
                          - Prom  136114 - 136173    4.8
                          + Prom  135872 - 135931    5.2
 118  50 Tu  1  .         + CDS   136126 - 137337    816  ## COG3547 Transposase and inactivated derivatives
 119  51 Tu  1  .         - CDS   137535 - 137684    157  ## Bcav_3943 binding-protein-dependent transport systems inner membrane component
                          - Prom  137714 - 137773   26.6
 120  52 Tu  1  .         + CDS   138706 - 139086    289  ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases
                          - Term  139094 - 139141   16.3
 121  53 Op  1  .         - CDS   139145 - 140836   1981  ## COG0443 Molecular chaperone
 122  53 Op  2  .         - CDS   140833 - 142422   2029  ## COG0457 FOG: TPR repeat
                          - Prom  142529 - 142588   80.4
 123  54 Op  1  .         - CDS   143442 - 145037   1900  ## Clocel_3378 heat shock protein DnaJ domain-containing protein
 124  54 Op  2  .         - CDS   145051 - 145890    839  ## COG0122 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase
 125  54 Op  3  .         - CDS   145949 - 146563    634  ## COG1011 Predicted hydrolase (HAD superfamily)
                          - Prom  146786 - 146845    7.6
                          + Prom  146906 - 146965    6.6
 126  55 Tu  1  .         + CDS   147007 - 148380   1381  ## COG4624 Iron only hydrogenase large subunit, C-terminal domain
                          + Term  148389 - 148428   -0.6
                          - Term  148369 - 148416   -1.0
 127  56 Op  1  .         - CDS   148422 - 149759   1358  ## COG0534 Na+-driven multidrug efflux pump
 128  56 Op  2  .         - CDS   149756 - 150772    906  ## CAR_c05940 putative sugar permease
                          - Prom  150811 - 150870   16.4
 129  57 Op  1  .         - CDS   151772 - 151909    151  ## gi|266619155|ref|ZP_06112090.1| conserved hypothetical protein
 130  57 Op  2  .         - CDS   151929 - 152432    717  ## COG0782 Transcription elongation factor
                          - Prom  152591 - 152650    4.8
                          + Prom  152565 - 152624    6.9
 131  58 Tu  1  .         + CDS   152647 - 153642    832  ## COG2199 FOG: GGDEF domain
                          + Term  153648 - 153705   16.2
 132  59 Tu  1  .         - CDS   153739 - 155016   1591  ## COG1145 Ferredoxin
                          - Prom  155099 - 155158    4.0
 133  60 Op  1  .         - CDS   155228 - 155668    318  ## COG0232 dGTP triphosphohydrolase
                          - Prom  155696 - 155755    2.4
 134  60 Op  2  .         - CDS   155757 - 156530    840  ## gi|266619160|ref|ZP_06112095.1| hypothetical protein CLOSTHATH_00156
                          - Prom  156560 - 156619    3.4
                          - Term  156618 - 156651    2.0
 135  61 Op  1  9/0.000   - CDS   156726 - 157940   1157  ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
 136  61 Op  2  .         - CDS   157928 - 158653    589  ## COG3279 Response regulator of the LytR/AlgR family
                          - Term  158676 - 158714    7.1
 137  62 Tu  1  .         - CDS   158893 - 160101    538  ## COG0582 Integrase
                          - Prom  160159 - 160218    4.1
                          - Term  160183 - 160226    7.1
 138  63 Op  1  .         - CDS   160264 - 161478    518  ## Closa_3673 integrase family protein
 139  63 Op  2  .         - CDS   161563 - 161793    163  ## Closa_3674 hypothetical protein
                          - Prom  161881 - 161940    2.6
 140  64 Tu  1  .         - CDS   162327 - 162785    268  ## Closa_2766 RNA polymerase, sigma 28 subunit, FliA/WhiG subfamily
                          + Prom  163115 - 163174    3.1
 141  65 Tu  1  .         + CDS   163203 - 163544    104  ## Dtox_1519 transcriptional regulator, XRE family
                          + Term  163545 - 163584    3.0
                          - Term  163533 - 163571    1.2
 142  66 Op  1  .         - CDS   163572 - 163730     78  ## gi|288869868|ref|ZP_06112104.2| mn-dependent transcriptional regulator
 143  66 Op  2  .         - CDS   163822 - 165210    219  ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
 144  66 Op  3  .         - CDS   165207 - 165905    153  ## TDE0256 hypothetical protein
 145  66 Op  4  .         - CDS   165902 - 166144    220  ## TDE0257 hypothetical protein
 146  66 Op  5  .         - CDS   166125 - 166490    180  ## TDE0257 hypothetical protein
 147  66 Op  6  35/0.000  - CDS   166565 - 168286    223  ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P
 148  66 Op  7  .         - CDS   168283 - 170028    225  ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17
                          - Prom  170096 - 170155    4.0
                          - Term  170062 - 170093    0.2
 149  67 Op  1  .         - CDS   170259 - 171389      4  ## COG0031 Cysteine synthase
 150  67 Op  2  13/0.000  - CDS   171466 - 173070    190  ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16
 151  67 Op  3  49/0.000  - CDS   173080 - 173910    496  ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components
 152  67 Op  4  .         - CDS   173924 - 174367    204  ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components
                          - Prom  174417 - 174476    2.6

Predicted protein(s)
>gi|229784128|gb|GG667607.1| GENE 1 44 - 391 423 115 aa, chain - ## HITS:1 COG:no KEGG:Closa_3389 NR:ns ## KEGG: Closa_3389 # Name: not_defined # Def: DRTGG domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 115 1 115 115 190 84.0 1e-47
MTAKDVRDVLGARVLVGEDHLEVEVRSACGSDMMSDVLAFSKDHSVLLTGLCNPQVIRTA
EMLDIVCLVFVRGKKPDEAMLEMARERNLIVMATGHRMFSACGMLYSAGLHGGAI
>gi|229784128|gb|GG667607.1| GENE 2 442 - 528 66 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no
MRRGVGFHKKLYKRLSFPEKNYYNLLMN
>gi|229784128|gb|GG667607.1| GENE 3 702 - 1196 476 164 aa, chain + ## HITS:1 COG:TM0012 KEGG:ns NR:ns ## COG: TM0012 COG1905 # Protein_GI_number: 15642787 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 7 164 13 173 176 154 47.0 5e-38
MACKKQGVLFSGTKEQEAALKEVISELKGTKGALMPIMQKAQDIYGYLPIEVQTMISDET
GIPLEKIYGVATFYAQFALQPKGKYQVSVCLGTACYVKGSGDIYDKLVELLGITNGECTP
DGKFSLDSCRCVGACGLAPVMMINGEVYGRLTPDDVPGILAKYE
>gi|229784128|gb|GG667607.1| GENE 4 1207 - 1764 564 185 aa, chain + ## HITS:1 COG:TM1665 KEGG:ns NR:ns ## COG: TM1665 COG0642 # Protein_GI_number: 15644413 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Thermotoga maritima # 1 182 4 182 186 107 34.0 2e-23 MMTEISLNVLDVAENSTRAGAALVAITAAVDTASDLLTITIADDGCGMTEEQVAHVTDPF FTTRKTRKVGLGIPFFKYAAESTGGSFTIDSTPGAGTTVTAVFGLSHIDRMPLGDMNSTI ETLITCHPDTDFLYTYRYNDASFVLDTREFREILGDIPFDTPEVSAYIKEYLAENKLETD GGAVI >gi|229784128|gb|GG667607.1| GENE 5 1789 - 2166 418 125 aa, chain + ## HITS:1 COG:TM0011 KEGG:ns NR:ns ## COG: TM0011 COG3411 # Protein_GI_number: 15642786 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Thermotoga maritima # 1 116 6 120 128 82 37.0 2e-16 MKTLEELKAIREKMQSQVSMREEDHARTRVLVGMATCGIASGARPVLNKLSTLVQENNMT DRFSVTQTGCIGLCQYEPIVEILEPGKEKVTYIKMNAEKAEEVYKEHLIGGHILENYTLG SADIK >gi|229784128|gb|GG667607.1| GENE 6 2179 - 3966 1862 595 aa, chain + ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 6 524 8 525 527 687 63.0 0 MYRSHVLVCGGTGCTSSGSQQIMVSLREELKKQGLDEEVSVVQTGCHGLCALGPIMIVYP DATFYAMVKEDDIPEIVEEHLLKGRPVQRLLYDETVTPAGVKSLGDTDFYKKQHRIALRN CGVINPEVIEEYIGTGGYQALGKVLTEMTPDDVIQVLLDSGLRGRGGAGFPTGLKWKLAK GNDADQKYVCCNADEGDPGAFMDRSVLEGDPHAVLEAMTIAGYAIGASQGYIYVRAEYPI AVQRLKIAIEQAREMELLGDDIFGSGFSFNIDLRLGAGAFVCGEETALMVSIEGNRGEPR PRPPFPAQKGLFGKPTILNNVETWANIPQIILNGPEWFASMGTEKSKGTKVFALGGKIHN TGLVEVPMGTTLREIIEEIGGGIPNGKKFKAAQTGGPSGGCIPAEHFDIPIDYDNLVAIG SMMGSGGLIVMDEDDCMVDIAKFFLEFTVEESCGKCTPCRIGTKRMLEILEKITKGQATM ADLDKLEELCYYIKANSACALGQTAPNPVLSTLQYFRDEYVAHIVDKTCPAGVCKALLNF YIDPEKCKGCTLCARNCPANAITGTVKNPHVIDGEKCLKCGACMEKCKFGAIYKK >gi|229784128|gb|GG667607.1| GENE 7 3984 - 5729 1700 581 aa, chain + ## HITS:1 COG:CAC0028_2 KEGG:ns NR:ns ## COG: CAC0028_2 
COG4624 # Protein_GI_number: 15893326 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 214 577 1 365 371 392 52.0 1e-108 MENITIKINGMEVSAPEGSTILEAARLAHVEIPTLCYLKDINEIAACRMCTVKINGGKLA AACVYPINPGLEVWTNTPDLVDYRKKTLQLILSNHNRSCLSCVRSGNCELQTLCKELGVE DSDYYDGEKTPSCIDESAAHMVRDNSKCILCRRCSATCEKVQGIGVIGANERGFATFIGS AFEMGLGDTSCVSCGQCIAVCPTGALYEKSCIDEVAAAIADPSKHVIVQTAPAVRAGLGE EFGYPIGTNVEGKMAAALRRIGFDKVFDTNFSADLTIMEEAHEFIERVKNGGTLPMITSC SPGWIKYCEHYFPDMTDNLSTCKSPQQMFGAIAKSYYAEKMGLKPEDIVSVSVMPCTAKK FEIGRDNQDANGCPDVDYSMTTRELARMIKKYGIRFNELPDEEFDAPLGLGTGAAVIFGA TGGVMEAALRTAVETLTGEELEKVDFTDVRGTEGIKEAVYNVAGMDVKVAVASGLGNAKE LLNRVKSGEADYHFIEIMGCPGGCVNGGGQPQQPGSVRNTVDIRALRAKVLYDEDAANTI RKSHENPAIKELYETYLGKPGSEKAHHLLHTTYVKRKVNEI >gi|229784128|gb|GG667607.1| GENE 8 6300 - 7706 1325 468 aa, chain + ## HITS:1 COG:CAC1392 KEGG:ns NR:ns ## COG: CAC1392 COG0034 # Protein_GI_number: 15894671 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Clostridium acetobutylicum # 3 466 21 458 475 189 30.0 1e-47 MGGIFGVVSKKSCTLDVFFGVDYHSHLGTKRGGMAVYGPRGFSRAIHNIENTPFRTKFDG DLDELEGTSGIGCISDNEPQPLLIQSHLGSFAITTVGKINNQDDLIRSAYENGHIHFMEM SGGRINSTELVAALINQKSSITEGLQYAQDRIDGSMTILILTPDGIYAARDRMGRTPIVI GKKEDAFCASFESFAYINLGYSDYYELGPGEIVFFTSEGMESLVPAREEMKICSFLWVYY GYPTSSYEGVNVENMRYECGKLLARRDDVKSDMVAGVPDSGIAHAIGYANESGIPFARPF IKYTPTWPRSFMPQNQGERNLIAKMKLIPVDALIRGKSMLLIDDSIVRGTQLGETTEFLY QSGAKEVHIRPACPPLMFGCPYLNFSSSNSELDLITRRIIRDREGDSVSDEVLADYANPD STNYQEMVEEIRKKLNFTTLRYHRLDDLQASIGISPCKLCTYCWNGKK >gi|229784128|gb|GG667607.1| GENE 9 7805 - 8896 1074 363 aa, chain - ## HITS:1 COG:ECs4874 KEGG:ns NR:ns ## COG: ECs4874 COG0371 # Protein_GI_number: 15834128 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Escherichia coli O157:H7 # 7 360 11 365 380 304 45.0 2e-82 
MEYQNGGSHMRRVINAPSTYIQGPGEGKKLAEHYRSIGEGGVLLITDRFVHEVYLGEMTE SFDKAEITWKEEVFTGECCEKEIRRLAVVAGDCGAVFGIGGGKVQDTAKAAGHYSGKPVI LVPTAVSTDAPCSRIAVLYREDGTFDRYLSLKKNPDLIILDTEIIAHAPVRLFTAGMGDA ISTYYEASACRRSGAVTKAGGCVSIAAEALAKACLNTVLTYGEQARRDVEKQRCTEAVEY LVEANTYLSGIGFESGGLACAHALHNGLALLPELSSVMHGEKVAFCTLVQMVLEGRPEEE ISRIRSFCRKVGLPVSLQDMNAGEISDHRLLEAAVKSCSPEETMSHIGRAVEPEEVVRAM KSL >gi|229784128|gb|GG667607.1| GENE 10 9071 - 10420 958 449 aa, chain + ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 434 1 434 447 243 34.0 7e-64 MTTDLTVGHPRKTLWRFLLPMLVSVMFQQFYNIADSMIAGRFAGENALAAVGASYPITVI FMAFATGSNLGASVVVSRLFGARDFCRMKTAIHTAFISCVVLSILLTVCGIIFCGPMMRW IRTPDDVFSAGALYLRIYTLGLLFLILYNVCTGIFTAFGDSKTPLYFLIGSSIGNILLDI WFVAGLHLGVAGAAWATFLAQGVSCLLAAATLWKRLKNMAGAEKAAMFDWGLFKQIAAIA VPSILQQSVLSVGNLFVQIIVNRYGSAVIAGYSAAIKLNTFAITSFMSLGSCLSSYTAQN LGAGRSERIRPGFQEGVRLSLTASVPFVVLYFVFSRQIMGLFLDSGSEEAIHAGVMFLKI VSPLYFMISVKLMTDGILRGSGAMVYFVLATIPDLILRILFAYLLTPRFGSTGIWMAWPF GWTAATVFTMIFYRKIVPASKKESIAGMD >gi|229784128|gb|GG667607.1| GENE 11 10456 - 10629 251 57 aa, chain + ## HITS:1 COG:no KEGG:Closa_3397 NR:ns ## KEGG: Closa_3397 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 57 1 57 57 76 80.0 4e-13 MSPKELTYIEDALSHEKFLKTQCQEAVTNLQDPELKSFAEQISQKHQQIFDNFYHLV >gi|229784128|gb|GG667607.1| GENE 12 10642 - 10887 274 81 aa, chain + ## HITS:1 COG:no KEGG:Closa_3398 NR:ns ## KEGG: Closa_3398 # Name: not_defined # Def: Coat F domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 81 1 81 81 134 82.0 1e-30 MDDKCIMENILLTTKGVCDLYLHGTIESSTANVHQSFDSALNDSLCIQDDIYRKMSAKGW YTTEQAPQEKITKVKNQFSGM >gi|229784128|gb|GG667607.1| GENE 13 10971 - 12557 1768 528 aa, chain - ## HITS:1 COG:sll1673_3 KEGG:ns NR:ns ## COG: sll1673_3 COG2199 # Protein_GI_number: 16329397 # Func_class: T Signal transduction 
mechanisms # Function: FOG: GGDEF domain # Organism: Synechocystis # 343 511 1 173 176 102 38.0 2e-21 MGNGWKVKSYIEKIKEAHFKNQEEERRLCEELITYSQANHNAYGLGFAYTYLLDYHIAMH DTEGCSPVLKTALEFHNGHEFLDLRMQVFNFAGIYYSYIHDDITAFDYFLKSLELAEALN DSFMKYRLYNNIAVSFHNKKDCESALAYYQQAYDYLNFSHLEGLYTQYQLYLLQNLINCS IALDSKELVLEYCGRLEQLYEEQPRLKEDLSVPWQNVVILSYFGKREEAYEILRSHLCWE EDPLGATDIIELYPHILDILLEWKEKALAEKVLESLCRHLERESPRARQEACSHEIRFYE IFGMTERLPDAYERYYQASKEALGSVAHNQIRAMKEKLYSFEVQQKLNRLQKLSYMDGLC DIYNRRYYTEQLERAMKETSVKRLGIIILDIDYFKEYNDYYGHNSGDLVLKKVALCLKEE AADQITPCRYGGDEFCCICEDLGGEDIGRYLRNVHERLNRLGIEHAMSRAASVVTLSAGY SVRPLEGLNQQELFHEADEALYWAKTRGKNRDCCYQTDQTDPDQCSRP >gi|229784128|gb|GG667607.1| GENE 14 12895 - 13413 604 172 aa, chain + ## HITS:1 COG:no KEGG:Clole_1677 NR:ns ## KEGG: Clole_1677 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 4 157 12 165 179 63 28.0 3e-09 MVIGFLLLSLSVLGEFLENVSGALEAFLNFDFDIDLNFLPISGVSICTGLIAFGGMGLLF NHLIFALVVGYAAAMAVQMVIRRLKRVENEAMNREELYLCDGRIINTVLAGGLGTVEFDN IKGIATTFVCRSTDASEVLKQNTVVKLVEFDGEIAVVRPKDEFAGYGSAEDE >gi|229784128|gb|GG667607.1| GENE 15 13482 - 15227 2235 581 aa, chain + ## HITS:1 COG:BS_yuaG KEGG:ns NR:ns ## COG: BS_yuaG COG2268 # Protein_GI_number: 16080153 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 9 538 3 480 509 207 34.0 4e-53 MSELINKLIPIIIIVAAVILVLMVIMSWWKRVPQDKAGVVTGIKKKVITGGGGIVIPVIN RIDYISLSASSLEITTEDSMSSQKVPINVVSTVVLKVKNDTTSILKAIERFNGKDIKEVK LNMEEIARQILEGKLREVVSTLSVEELYSNREKFANSVQEAAATELSTMGLEIMSFTIKD VTDENGYIKSLGVKQIAEKKKEADIAQAEAERERQIKVSEARRDGEQAKLATEAEISAAN KEKLIKEQAYQKEIQTSKAQADVAYAIQKNITEKDVIQTEMDAELLRQERQKDIEQAAVQ VEISKEVKNRELAERQAETAKASLQATVVQPAIAEREKQAQIADSEKYKKVAEADASAQT LKKQADAEAEATRMRGLATAETNAIAETKKAEAEAAAIRMKAEAEATATKAKQLAEAEGI RAKQLAEAEGIRAKKLAEAEGIKAALLAEAEGMEKKAEAYNKYNKAAVTEMIVNILPEMA GRIAEPLARIEKITVIDSGSGNGENGVGNLAGNVGSVLAKTLETVKETTGIDFKEIIDAD VLGKTTTNVNINGAGQGSGQNSMEQEVSNAEIVEAVLDKRN 
>gi|229784128|gb|GG667607.1| GENE 16 15272 - 15970 907 232 aa, chain + ## HITS:1 COG:BS_ydjF KEGG:ns NR:ns ## COG: BS_ydjF COG1842 # Protein_GI_number: 16077685 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Phage shock protein A (IM30), suppresses sigma54-dependent transcription # Organism: Bacillus subtilis # 3 227 1 225 227 84 32.0 2e-16 MGLDVFGRIKDIVSANIDELLSKAEDPEKMADQMVRKYEDAIKDLLGSTASVMASSREAK AKLDACDAEIKKFGTYAERALRDGNEEKARQAITLKQEQEDLRPTLEHSYVTVKVQADKA EEDYRKLTAGLEKTKARAEAMKNTVRTAKVKNQAGEVQMKSVNTAAGLMSQFDRMEEKAN RMLHEAEAKEELTGRKSTAAELDALYGDQSKSRVDEELEAMKKALSISAETK >gi|229784128|gb|GG667607.1| GENE 17 16209 - 18899 1545 896 aa, chain - ## HITS:1 COG:lin0778_1 KEGG:ns NR:ns ## COG: lin0778_1 COG1221 # Protein_GI_number: 16799852 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Listeria innocua # 1 421 1 462 464 168 28.0 4e-41 MKKKDVVFETVKESFHYDEFISALQKNKVTGLEAQEVSEKAGLVRSNASSLLNRLVDERK LIKITGRPVLFIAREPLEAAVKHTLVKTEYSIQDLRDFCCSSSDPFSKFTGYDTSMKEVI NQAKAAIMYPPGGIHTIISGPRGCGKTVFAKSMYEYAKGQKKKNADEYVFVELNNLMYSD LVECLKKMKTLISGVLFIDDASGLPKVLQERLIHYMESTEWQENVFIILALPEGERLNQS VAERFPITLKLPSLTERSWEERFRLIKLFFSKEAVRIKRPMKIASEVVQALMAYECSGDI KQLKADIKMICADVFFQNLHKTDNKILEVNFGEVPGKIRADVLGGKKISSKNWGYLQVID SDILIRPSEYSEDYPIPKFNIYSEISRISEKLKREGRTEAEITEVVQQRLKDYFAKISDS ILDFRSVKRELKNSLPEDVVDFTIEILEKASMDLGKAISAKLAYALAFHINYLILRSKMG HQESEQTKRVAQEQGTKENQAARMIASRISERFTIAVQEKEVAFLTELLKNQIPQKRVED KVRIMIIAHGEHVATSMADAVNTLMMTDLVYGFDIPVNVKHEDVFENILNIAKAINKGSG ILLMVDMGSLANLEEKITAESGIFVKTIERVSTIYVIEAVRRILYKEETLEEIYQEVLKL QSVPYQMEVPSEKQPIVITTCSSGVGTGIMLKERVERLMREERIEGIAVEALGYQEILKQ SRAYEEIVQRYEVIACIGNMNPGICSRFYDLSTILSEENLNVFADYLRQNQGEQKADPYK KLEKILEEHVTYFNVHKSMRCFEEFFSDTVAMGFRLSNNGMISVAMHLSYMIERIISRVE AKFSDHTEKYVVQNRELYQNLANAVKTFERVFKITISDHEICFLCEIFLNLNKINQ >gi|229784128|gb|GG667607.1| GENE 18 19016 - 19261 276 81 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|266619040|ref|ZP_06111975.1| ## NR: gi|266619040|ref|ZP_06111975.1| phosphocarrier protein HPr [Clostridium hathewayi DSM 13479] phosphocarrier protein HPr [Clostridium hathewayi DSM 13479] # 1 81 3 83 83 145 100.0 1e-33 MARTRLLEDLHARPAIRFSKICSGFQSEVKIITENGSVLDGKNFIELLGKYAKKGQEIIV ISEGKDEREALQAAMAFLKGE >gi|229784128|gb|GG667607.1| GENE 19 19268 - 19750 410 160 aa, chain - ## HITS:1 COG:BH0192 KEGG:ns NR:ns ## COG: BH0192 COG1762 # Protein_GI_number: 15612755 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Bacillus halodurans # 8 143 5 140 160 84 35.0 1e-16 MSGFLASMINPELVFTDLPCTDRQTFLQYISRRLEEKDYVKCSFEGAIIEREKKYPTGLP TVNQHVALPHTDVQHAKNPIIVPVKFRTPVVFKEMGNGINDIPCSMAFVLVVTEPEKQID ILQSLINLFIKKDFLEELYKEKTREGFLKRLLDEAEKDEG >gi|229784128|gb|GG667607.1| GENE 20 19782 - 19991 174 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619042|ref|ZP_06111977.1| ## NR: gi|266619042|ref|ZP_06111977.1| putative band 3 anion transport protein [Clostridium hathewayi DSM 13479] putative band 3 anion transport protein [Clostridium hathewayi DSM 13479] # 1 69 1 69 69 123 100.0 5e-27 MGEERLKKWGIAAWYLLVLVAIVLFLYSFALEDPVMSVAAFILAMVLSKYKDKVTLPGCF TKEENGIYK >gi|229784128|gb|GG667607.1| GENE 21 19991 - 20641 571 216 aa, chain - ## HITS:1 COG:TVN1450 KEGG:ns NR:ns ## COG: TVN1450 COG0235 # Protein_GI_number: 13542281 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Thermoplasma volcanium # 1 211 1 207 218 93 29.0 2e-19 MFEEIRNQVLKQARLAETLGLCQSGGGNFSMIDRESGMVCMTPHDTDRFQMSWRDVLVTD LEGNVLEKGPGLKPTSEIEIHLAVYRAREDILAVAHTHAPYTSVFAGLSMEVKPVLTEAM TYYGRAPLAPFGTPSTPQLAENIVKTLGEKGVAAVMEKHGLITASPFGIWDAVRKNLYVE ETAKAYFRMIQLVGIEHVPSIDMDELDRMTEMLGIR >gi|229784128|gb|GG667607.1| GENE 22 20665 - 22011 1058 448 aa, chain - ## HITS:1 COG:lin2200 KEGG:ns NR:ns ## COG: 
lin2200 COG3775 # Protein_GI_number: 16801265 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Listeria innocua # 1 447 1 445 449 323 43.0 4e-88 MELIKTFFSAITNAGSTVVLPLIFFAIGLILKMKVGKAFSAALTLGVALAGISAVTGFLS SAIGPAAASFVENTGANLNAIDMGWAPALGVAWQWEYAFLMFPLHIAINVIMLAAGMTFT LNVDMWNVANKVATGFFVYTASGNVLLGFAFVILQTVLELLIGDANQKNVQKLSGVPGVT TPHPMFLNLPLLWPIKLLLDKIIPVRKPLDINSLRSKIGIFGESHVIGFLIGLLIGLIGG YSVVESVTMAVQVATALVLIPMAAKLFMTALTPISEAANRFMKARFHDRKFCIGLDWPIL AGSNEIYVTTILSIPVILALAMVLPFNIVLPLAGIMYMCIPITTLLLYQGDLLKMMITQV ITIPFSLYAASYFAPMIDGLARQKGTDLSSLAEGQMMGWYGVDFGFLRWTFCEVTLGNIL AIAIFVGYLVLTFFYLKARKKEEAAIEL >gi|229784128|gb|GG667607.1| GENE 23 22034 - 22318 331 94 aa, chain - ## HITS:1 COG:lin2201 KEGG:ns NR:ns ## COG: lin2201 COG3414 # Protein_GI_number: 16801266 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Listeria innocua # 1 94 1 90 91 76 44.0 9e-15 MKRIVVACGSGIATSGLVANKINNLLEERGMVGKARADSIDIKSIDLEIGTADIFVSITP TFDLSNVRIPTFSGIPFLTGIGEEEIMDQFVGLL >gi|229784128|gb|GG667607.1| GENE 24 23073 - 23207 57 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MASCRKVKAKLDACDAEIQMYGTYVERPVMVMKKAFHITDESIS >gi|229784128|gb|GG667607.1| GENE 25 24181 - 26067 1453 628 aa, chain - ## HITS:1 COG:FN0856 KEGG:ns NR:ns ## COG: FN0856 COG0296 # Protein_GI_number: 19704191 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Fusobacterium nucleatum # 1 589 8 567 611 466 43.0 1e-131 MDYYGFYTGKEFSAYEYLGAHRLENGTVFRTFAPNAVRISVIGEFNDWGETPMDKVYDGN FWECTVPGAKAGMMYKYRIYRRDGSFIDHCDPYGFFMELRPHSASVISSLSGYEFRDSKW QKKHRAGLDKPLNIYEIHAGSWRRKERVDGVEKNAEKHSKKYSKKYLEKDGENGVGRGLV NGWYSYTELADLLIPYLEENGYNYVELMPLSEHPSDESWGYQNTGFFSPTSRYGTPDELK AFVDQCHSHEIGVIMDFVPVHFAVDDYALWNYDGTALYEYPHSDVGRSEWGSCNFMHSRG EVRSFLQSAGNYWLKEFHFDGLRMDAVSNLIYWQGDTARGENRNGIQFLQEMNKGLKSLH 
PETLLIAEDSSVYPGVTKPVEHGGLGFDYKWDMGWMNDTLSYFQAPPSERKERYHQLTFS MQYFYQERYLLPLSHDEVVHGKATILQKMYGAYEGKFPQARALYLYMYAHPGKKLNFMGN EFGQLREWDEKRQQDWEILRYPVHDAFHRFMEDLNRLYLNSPALYERDYEPDGFQWLDCH QEERLIYAFERRSEKQRIAAVFNFSDEKQEGYELRVEDAECLALLFSSEMEQYVRCEGNA VLTLPPYSGRYYLVDPDEKSGKRTEETA >gi|229784128|gb|GG667607.1| GENE 26 26021 - 26131 95 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSIIFEEKDREAVCRIYMRYQYGLLWVLYGERVFSL >gi|229784128|gb|GG667607.1| GENE 27 26250 - 26783 654 177 aa, chain + ## HITS:1 COG:BH0719 KEGG:ns NR:ns ## COG: BH0719 COG1309 # Protein_GI_number: 15613282 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 1 103 5 108 188 63 36.0 3e-10 MDLRIEKTRQSIINAFIALRSGKPLEKITVKELCEKAMINKSTFYFHYADIYELSDFLET EIVTSIISSLDHPEYMVEHPGAFTQELFLAYLSRDTLIQTLFSGSRSGRLIQKIEAGIKE LIYERLPEYRDDPVKNIILSYQIYGAYYAFSENRNENTSEVIAVLGRISEELQKLLH >gi|229784128|gb|GG667607.1| GENE 28 27929 - 28774 863 281 aa, chain + ## HITS:1 COG:BH0465 KEGG:ns NR:ns ## COG: BH0465 COG0530 # Protein_GI_number: 15613028 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Bacillus halodurans # 1 279 1 292 318 175 44.0 7e-44 MMYVLLLLGFFLLIKGADYFVEASSSIARALRIPSIIIGLTIVAFGTSAPELSVSVSASI TGNNDIAVGNVIGSNLFNLLVVLGACGVIRPFSVKLRWDYAASVFVAAALFVMILFDQTV NRMDGIILLVFFLAFVGFTVRDAIVNRIVTSEEIKTLSPVRCAVYIIGGLAAIVWGGDLV VDNASAIAASFGLSQNLIGLTIVALGTSLPELVTSVVASRKGENGLALGNVIGSNIFNIL LVLGASAAIHPLLVNPFSIYDTAFLIIASGITWLLCRSKTS >gi|229784128|gb|GG667607.1| GENE 29 29919 - 30773 428 284 aa, chain - ## HITS:1 COG:NMA1039 KEGG:ns NR:ns ## COG: NMA1039 COG3943 # Protein_GI_number: 15793995 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Neisseria meningitidis Z2491 # 6 132 24 162 336 88 35.0 2e-17 MEDDTVWLTQMQMAELFQRDISVISRHIKNVFQEEVEEKSNLHFLQFAHSDKPITCYSLD VIMSVGYRIKSKRGIAFRKWANQVLKDYIIHGYAVNQSRMEQLGGIIRVMERVNECLDTK QVLHVIERYASALNLLDDYDHQRIEKPGGSKPVCILTYEECRKVIATMNFSEESSLFGNE 
KDESFKSSIGAVYQTFGGSDVYESMEEKAANLLYFITKNHSFSDGNKRIAAAIFLYFLDQ NGILFDDGKKRIDDFALAAITIMIAESRPEEKELMVNLVMKCLI >gi|229784128|gb|GG667607.1| GENE 30 30780 - 30896 138 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVLYLESRFTTVRRLVQMTKEDAKTLVLYQTEDGAVTL >gi|229784128|gb|GG667607.1| GENE 31 31213 - 32664 1569 483 aa, chain - ## HITS:1 COG:CAC3230 KEGG:ns NR:ns ## COG: CAC3230 COG4624 # Protein_GI_number: 15896476 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 129 429 78 379 450 183 37.0 6e-46 MVTNDATLLRLKHQVLYEVAKMAWEGDLEQHRSEIPYKIIPGPKAQFRCCIYREREIIRE RVRLAEGLCPSGKDTKNIVQVISSACEGCPIARYVVTDNCQKCMGKACQNSCNFGAISMG HDRAYIDPDKCKECGKCSQACPYNAIADLTRPCKKSCPVDAITMDEDGIVVIDESKCIQC GACIHSCPFGAIGSKTFLVDVINLIRAGKKVVAMVAPATEGQYGPDITMASWRTALKKIG FADMVEVALGGDLTAAAEAAEWAEAYKEGKKMTTSCCPAFVNMIKQHYPMLLDHMSTTVS PMCAVSRMLKAQDPETITVFIGPCIAKKSETLDLNIKDNADYAMTLGEAHAMMVAKGVDL EPEENTLQQGSVFGKRFGNGGGVTAAVLECLKETGENAEINVMKCNGAAECKKALMLLKI GKLPDDFVEGMVCVGGCVGGPSKHKSELEAKKARDLLIGQADKREVHENLKMQGAEEVAM HRH >gi|229784128|gb|GG667607.1| GENE 32 32726 - 33907 1369 393 aa, chain - ## HITS:1 COG:no KEGG:Closa_3400 NR:ns ## KEGG: Closa_3400 # Name: not_defined # Def: protein serine/threonine phosphatase # Organism: C.saccharolyticum # Pathway: not_defined # 1 393 1 393 393 721 88.0 0 MGVTVDVAYKSLNKFNEVLCGDKVELLQTEDSNIMILADGMGSGVKANILATLTSKILGT MFLNGATLEECVETIVETLPVCQVRQVAYSTFSILQVFHNGDAYLVEFDNPSCIFIRAGQ LVPIPQNIRVIQNKKINEYRFRVKKGDALILMSDGTIHAGVGQLLNFGWLWEDIAAYALK QYRVTISAIRLATALSRACDELYQFRPGDDTTVAVMRIIDRKPVHLMTGPARNPEDDTAM VTDFMSGDEFTKRIVCGGTSANIVARTTKKSLDVSLDYNDPDIPPIAYIDGIELVTEGVL TLNRVLQLLRRYVKNEAVTEEFFLELDKPNGASMVAKMLIEDCTELSLYVGKAINSAYQN PGLPFDLGIRQNLVEQLKHTVEEMGKKVTVTYY >gi|229784128|gb|GG667607.1| GENE 33 33907 - 35640 1719 577 aa, chain - ## HITS:1 COG:TM1421 KEGG:ns NR:ns ## COG: TM1421 COG4624 # Protein_GI_number: 15644172 # Func_class: R General function prediction only # Function: Iron only 
hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 7 325 8 300 301 170 33.0 7e-42 MAIIDFKATKCKHCYKCVRNCEVKAIMIKDERAEIMPDKCILCGKCMQVCPQSAKTLVSD LDIVKGYIANNIPTVVSIAPSYMGLLKYKSIGQINAALRKLGFTDVRETSEGAAMVTAEY ARLLEEGKMETIITTCCPSVNDLIEIYYPQLIPYMAPVVSPMIAHGKMLKEELGKNVKVV FLGPCIAKKKEAGDVRHDSCIDAVLNFNDINRWLKEEEITIEDCEDIPFRHLDPRVNRLY PVTNGVVNSVLATEEKRDGYRKFYVHGARNCIDLCESMVRGEIKGCFIEMNMCSGGCIKG PTVEDESISRFKVKLDMEETIEKDPAEKSEVEEIISRISFQKLFMDRSPREAMPTEAQIQ DILKKTGKTKPEDELNCGACGYSTCREKAIAVFQKKAELGMCIPFMHEKAESLSNLVMET SPNIVLIVDKDMKVLEYSAVGEKYFGKTRQEALTMYLYEFIDPSDFQWVYDSHQNIHGKK VTYSEYNFSMLQNIVYIEKEDVVLATFIDITKEEELAKQEYEKKLETIDLAQRVIHKQMM VAQEIAGLLGETTAETKTTLTKLCRSLLDEGSESEVK >gi|229784128|gb|GG667607.1| GENE 34 35643 - 35882 308 79 aa, chain - ## HITS:1 COG:TM1420 KEGG:ns NR:ns ## COG: TM1420 COG1905 # Protein_GI_number: 15644171 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 1 76 3 75 77 66 45.0 9e-12 MRVTICIGSACHLKGSREIIAQLQQLVKENHLESKVDLNGSFCSGNCVNGVCVTVDGQLF SLKPEDTKEFFDKEIKGRL >gi|229784128|gb|GG667607.1| GENE 35 36923 - 38464 1717 513 aa, chain - ## HITS:1 COG:CAC0887 KEGG:ns NR:ns ## COG: CAC0887 COG1001 # Protein_GI_number: 15894174 # Func_class: F Nucleotide transport and metabolism # Function: Adenine deaminase # Organism: Clostridium acetobutylicum # 5 513 4 506 570 426 43.0 1e-119 MDKKLSERILASAGTKKASLVLKHGTVVNVFTAALEQADIAVEDGYIVGVGDYEGVTEVD LKGAVVCPGLIDGHIHLESSMVAPGEFERTVIPHGTQAVITDPHEIANVAGVEGIRFMME RTAGLTLDVFFMLPSCVPATGLDESGAVLPADKLEPFYKDERVLGLAELMNSYGTTRADQ GILDKLEGALGKNKLVDGHAPGLTGRELNAYVTAGVKSDHECSTASEAIEKLSRGQWIMI REGTAARNLEALMPLFQEPYYHRCMLVTDDKHPGDLLRLGHIDYIIRKAISLGADPVHAV IMGSCNAAQYFGLKDRGAIAPGYQANLIVVSDLENFRVEQVYKNGRLVAENGTMKEGVLA AERKNIETPVRVGDSFHLKELTPEQLYIEKKSGQVRVLCLTPGELTTTERLVPWTEKEGY APGVDVDRDIVKMAVFERHHDTGHIGLGFLGGYGLKHGAVATSIAHDSHNLIVAGTNDAD MILAGNTVRKNRGGLAIAADGKVLGELALPIAG >gi|229784128|gb|GG667607.1| GENE 36 38497 - 39327 896 
276 aa, chain - ## HITS:1 COG:lin2067 KEGG:ns NR:ns ## COG: lin2067 COG0005 # Protein_GI_number: 16801133 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Listeria innocua # 6 274 4 272 272 270 49.0 2e-72 MNEVYEKLQRCLKSVREKTDFKPEIALVLGSGLGEYAEEIEVAETIDYKDIEGFPVSTVP GHKGRFIFGYVNSVPVVIMQGRVHFYEGYPMSDVVLPARLMGLMGAKILFLTNASGGINS TFHAGDFMLIRDHIASFVPSPLIGANIDELGPRFPDMSDIYRRELRDIIKETAKEEAIEL KEGVYLQLTGPSYESPSEVSMCKMLGADAVGMSTACEAVAANHMGMAVCGISCITNMACG ISKEPLSHTEVQETADRVAPLFKRLITASITKLAGR >gi|229784128|gb|GG667607.1| GENE 37 39754 - 40860 1120 368 aa, chain + ## HITS:1 COG:TM0202 KEGG:ns NR:ns ## COG: TM0202 COG0715 # Protein_GI_number: 15642975 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Thermotoga maritima # 81 366 26 299 300 84 26.0 3e-16 MRKVLPVLMALAMTAAALTGCAKGAGETAAETTTAAAAEASTETSAEETAEAPETESETE TESTGTSAAVGYTLRIGSLKGPTSMGLVELMDQAAKGEAKGSYEFTMVTAADELLGKIVS GDLDVALVPSNMASIIFNKTNHGVNVLNINTLGVLYVVSSDDSIKSIADLKGKTVYLTGK GTTPDYALQYLLKANGMTTDDVTLEYKSEPTEVAALLKEKPDAIGLLPQPFVTVAMAQND TLKMVLDLTKEWDAVSGEDGGSLVTGVTICRGELFEEHADAIQTFMEEQKASAAFANENV AETAKLVAAAGIIEKAPVAEKAIPYCSITYIDGTDMKNRLYGYLSALYEMDPATVGGELP TKDFFYIP >gi|229784128|gb|GG667607.1| GENE 38 40939 - 41706 734 255 aa, chain + ## HITS:1 COG:RSc1340 KEGG:ns NR:ns ## COG: RSc1340 COG0600 # Protein_GI_number: 17546059 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Ralstonia solanacearum # 32 215 42 225 264 70 26.0 4e-12 MKPSGLGRKAGIICFWLVLWQLISIAVHNSIVLVGPAEVAGALVSQISTVEFWRTIGLTF GKICSGFSGAFLAGILVGSAACRFPFLKDLLEPVMVLVKSVPVASFVILALIWIGSKNLS VFIAFLVVFPMIYVNTISGLNSADAQLLEMAGVFHMSGWKKFRYIYWPALLPYLINGAAI SLGMSFKSGIAAEVIGVPDHSIGEKLYMAKIYLSTADLFAWTLVIIVISGLFEKVFIQLL KAAKRQNRANGKRGA >gi|229784128|gb|GG667607.1| GENE 39 41716 - 42315 362 199 aa, chain + 
## HITS:1 COG:ECs3293 KEGG:ns NR:ns ## COG: ECs3293 COG1118 # Protein_GI_number: 15832547 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate/molybdate transport systems, ATPase component # Organism: Escherichia coli O157:H7 # 1 181 1 199 365 97 34.0 2e-20 MEITVTNLSKSFEGREVFRQVSQTFRTGGIYCLMGASGSGKTTFFRILLGLDKPDSGTVA GTEGAFLSAVFQENRLCESFSSLDNVLMVFPKVTREIREAAKEDLCKLLPEESIYRPVST LSGGMKRRVAVCRALAVPFDAVLMDEPFTGLDEDTRRKVIAFVKQKTNRKLAIISTHQEE DIPLLGGSLIRLTAERKGI >gi|229784128|gb|GG667607.1| GENE 40 42369 - 43085 858 238 aa, chain - ## HITS:1 COG:no KEGG:Closa_3408 NR:ns ## KEGG: Closa_3408 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 238 1 235 235 137 46.0 3e-31 MRKNSRIITLAVAAAVMMAVTACGSKDKTETTASTEATTEAVTTTAAAETSLSEAESTEA EDQTEEEETDEAEEAAEETTASASGEIGSDGVFQAANGKFEIKVPAGWTIDEGSDDEYVT FLSANGEDMLEIMTISGSSADSVREIYPDTAEEYKNMVSRGDGMEILSYDVNTKEDGSQT FQYSMKYTNPSDGIHYLAESGTYDAAKQTYSCATGTVMSTDEAVAKQIEEAVKSFKIK >gi|229784128|gb|GG667607.1| GENE 41 43291 - 43887 611 198 aa, chain - ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 1 169 70 245 344 67 25.0 1e-11 MSDARNLAMDQAEGKYLQFVDSDDWITTDATRMMVSAAEYGSCDMVITHFFRVNHDKIRR KGHIKEYEIMTRKGFAEHMMEKPANFYYGVMWNKLYRRDILNAYKLRCSRELNWCEDFLF NLQYLCYAERICALPYPTYYYVKTKNSLVAKEATMKNTIQTKKFIFTYYKELYENIDLYE ENKGKINRFLVSIASDGH >gi|229784128|gb|GG667607.1| GENE 42 47114 - 47200 122 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTYLAYIVVFSLLAMELYPIIRNGIHKA >gi|229784128|gb|GG667607.1| GENE 43 47412 - 48212 674 266 aa, chain - ## HITS:1 COG:CAC2608 KEGG:ns NR:ns ## COG: CAC2608 COG2207 # Protein_GI_number: 15895866 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 5 256 35 278 284 89 26.0 4e-18 
MQTKHFHDSFELYFLLEGERYYFIDRETYHVKKGMVVLVNRQQIHKTSLAGKSYHDRILL QISQEGFSPLLEQAGVVSLTRMFEENYGVTELPEKIWEQVKRLLFEIRDELKEKQGKYDG MVKLKLAEILLLIYRCRRNRVYHKRGGELSTVQTPRHQKVHEVADYLLHHYDTDESLEEL AERFFISKSYLSRIFREVTGFSVNEYRNITRIRKAKELLAGSEYSVTEISELLGFESVTY FERVFKKLTDKTPLRYRRICSGDQEE >gi|229784128|gb|GG667607.1| GENE 44 49399 - 50796 1544 465 aa, chain + ## HITS:1 COG:BS_lplD KEGG:ns NR:ns ## COG: BS_lplD COG1486 # Protein_GI_number: 16077780 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Bacillus subtilis # 11 453 10 440 446 484 51.0 1e-136 MKYENNKVRDLNIAYIGGGSRGWAWTFMTDLSMDDSLGGTIRLYDIDEAAAKNNEIIGNK LTSREDTTGKWDYVTCSTIGEALTGCDFVVISILPGTFDEMASDVHLPERLGIYQSVGDT AGPGGMIRALRTIPMFVEIAEAIKKYAPEAWVINYTNPMTLCVKTLYHVFPQIKAFGCCH EVFGTQKVLKGICEETFGLDSIDRRDININVLGINHFTWLDKASYKGIDLFPVYRDYIDS HYEEGYNEPDKNWANSTFDCAHRVKFDLFRRYGLIAAAGDRHLAEFMPGGEYLKDPDTVA SWKFGLTTVDWRKDDLQKRLDRSHRLVSGEEEIKLEPSGEEGILLIKSLCGLERSISNVN IPNTALQIANLPADAVVETNAVFSLDSIQPIVAGAIPEQVKALITPHVENHERILKAALT YDRSLVCEAFLADPLVKGRASEEEVKTLVDDMIANTINYLPDGWK >gi|229784128|gb|GG667607.1| GENE 45 50918 - 52474 1300 518 aa, chain - ## HITS:1 COG:no KEGG:Amet_0685 NR:ns ## KEGG: Amet_0685 # Name: not_defined # Def: PucR family transcriptional regulator # Organism: A.metalliredigens # Pathway: not_defined # 364 502 268 406 411 77 32.0 1e-12 MKLKLEYLAGCGAFTACCFRTPFYGKEVSSLELMCGNGAAVREPQSLLLVCRGLIPDDGE RLNEWLNERKAQEAAGLLVDLSGESEERISKLAEACRERELVFGTAETAGFTALINDYSH LIASRTDGAVKSYDRVLEDLQRRFYTSGTDSLLEGLSYWTGCQAALIVGQDTFVKPSVPV LNEAVFYPAYWKKEPRKSGLSHVSCYSSSFSDNMLLQAELFKNRLPFGVLCLLGDEDVFE PSDDILLNYASILCTGIDDYKRRSRRIEAAVEMICGGQMPDSSVMELFPESGYALVLCDQ EAGEPAEGKKEYLSYLIHHYFPQKLCYSFSAEGSLRLFVSAEDVDHFARRLLAILDGAGK RCRVGVSRHYPASQAVTAFFEAESAAHIAGLLEYGERICYYHDLGIYRLLNYPENSWPIN QMLGEMDELLNQMDEEKRDVLAMTIRTFVKCRFHYQKTADKLYTHVNTIRYRIKLIEDLW DVDLSSDEGRLLFSVLAKLLPLWRKSGCYSGTMPREGE >gi|229784128|gb|GG667607.1| GENE 46 52629 - 53135 643 168 aa, chain 
+ ## HITS:1 COG:SSO2433 KEGG:ns NR:ns ## COG: SSO2433 COG2080 # Protein_GI_number: 15899181 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 3 165 10 162 171 144 46.0 5e-35 MGDNKKSVTITVNGDVYRLGVGNAVGDIAPADTLAETLRERLLLTGTKISCDDGACGCCT VIMNGKAIPSCTVLTIECDGADIRTIEGLQDPNTGELDKVQQAFVDHSAFQCGYCTPGII MTTEALLAENPHPDEEEIKEALAGNYCRCISHYQVLEAVKQAAREGSE >gi|229784128|gb|GG667607.1| GENE 47 53137 - 55404 2343 755 aa, chain + ## HITS:1 COG:SMb20132 KEGG:ns NR:ns ## COG: SMb20132 COG1529 # Protein_GI_number: 16263880 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Sinorhizobium meliloti # 3 751 14 763 772 325 32.0 2e-88 MGKEFRYVGKPTPRIDGKAIVTGSAEYANDIHMHGMLYGRIKMSPHPHAMITKIDVSKAL ELPGVRTIMTYENAFDWVTGMPAQKRLLDRHVRFVGDPVALIAATSNRIAEEAAELIEVE YEVLPAVQDLDEAIADGAPQLYEQFPGNIIERGCPPFGPKALQELVIGDTKKGFEECSVV VEGTSKYETISNPLPAESPGLVAWWDGPNQCNMKGSFQSPNIQKMIMQIAMPNVNVRVIS ANVGGSFGSKNTVPMEGLYCAALAKASGRPVKLVISKEAHLSTYMLRLGSRLTAKVGLLA DGTVHAYEGTWLVGSGVHSDFGQGQIANGLGRAMLLLNKCKHWNYQPHLVATNRCRSGGI RGFGGQESYASLVPVLSMAMAQMNLDPVDFFKKNMCDINGGYYWVEGHWWENHFLSYNDV MDHAAEHFGWKEKFPGWLKPYKTEGSKQYGAGCCVHGSADCGMLHGEAFVKLDGWGTANI NVCIAESGMGQRSAVVKMAAEILNLPLDKVTISTPDSQDNPWDWGLAGSRGTLVYGRMVG DAARSARKQLLEAAAAILHCPAEMLDTRDGIIFMKDNPEVALPWIAVLGPMNTISGHGVF EMDHSKPCFMIAFVEVEVDTDTGDVKIVRVVEGTDVGQVIDPFSLKMQLEGGLGSCGSDS AINEEMILDEKTGKFLTNNLVDFKWRVFPDLPAFETHIEETPNDLSAFGGIGVGEISGAP LPGAIIMAVSNAIGIQLMDYPLTPDQILKALGKAR >gi|229784128|gb|GG667607.1| GENE 48 55412 - 57775 2154 787 aa, chain + ## HITS:1 COG:TM1217_2 KEGG:ns NR:ns ## COG: TM1217_2 COG0493 # Protein_GI_number: 15643973 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 159 565 51 471 
472 231 34.0 4e-60 MKNFNHTDAESLSQAAGILKDEQNNARIIAGGTDLLSVLKHKLEPDYPETIVNIKSIDGL DRIEEDEDGLKIGATAKLCDIASSPLIREEYGALADAARSVATPLIRNLATIGGNLCQEV RCWYYRYPDEIGGRIQCLRKGGDTCNAMTGENKNHSIFGARKVHLSGCSSNCPAHVDIPE YLAKLRAGDMDGAASILLEANPIPAVTSRVCTHFCQMNCKRKEYDEHVNIGGMERYVGDY ILEHSNRFMAPPKTDSGKRVAIVGSGPAGLTAAFYLRSRGHSVRVYDKMEEAGGLLRYAI PEYRLPKDIVRNLIASLEHMGIEFLTGIEIGKDIPVNDLVEQYDSVFLNTGAWKRPLIGL DGEEFTTFGLDFLIEVNKWMESKIGRDIIVVGGGNVAVDVAITAKRLGAANVVMACLEPE HQMPASEEELNRAREEGITILPSWGPKEILRDHGEIVGLVLKRCTAVFDEQGHFAPSYDD HETTTLHGSSILMAVGQKTDLSFLGEALEFKVSRGLISVDPATMKTSVDKVFAVGDVTTG PKTVVDAIADGKKASVAVHCFMTGEPEGDAKHPLGGLITFDASCTGHSQGARLPERPKEE RGLHLEDSYGLTWEQVQEEAKRCFNCGCLAVNPSDVATVLTAMDAEIVTSKRKEPIEEMM KKSHCRLVEEDEIVTEIRIPASAKEYRVGYDKFRVRDSHDFAVVSVASAYKMQEGVITDA RIVLGAVAPIPLRAREAEAYITGKTVTEELAAEAAEIALSGALPLSRNGYKVQIAKTLVR NSLLNAR >gi|229784128|gb|GG667607.1| GENE 49 57787 - 58416 466 209 aa, chain + ## HITS:1 COG:ECs3750 KEGG:ns NR:ns ## COG: ECs3750 COG2068 # Protein_GI_number: 15833004 # Func_class: R General function prediction only # Function: Uncharacterized MobA-related protein # Organism: Escherichia coli O157:H7 # 1 188 1 182 192 80 31.0 2e-15 MGRIEAILLAAGYSSRMGRLKPLLPIGETTVIRRQADILHGLTDRTIVVTGYRGEEVEAH LSGCHADTVRNPAFAEGMFTSVKAGILALDRDVSAFLILPVDYPLVTRKLIADLIEEFQR SEPLVLYPSFSMRKGHPPVISSACIPGILSYQGGAGLKGALTPFNGGAEYFSAEDETCII DMDTPEDYKKVLALVSKASNGTAAPESLR >gi|229784128|gb|GG667607.1| GENE 50 58382 - 58882 509 166 aa, chain - ## HITS:1 COG:lin1203 KEGG:ns NR:ns ## COG: lin1203 COG0622 # Protein_GI_number: 16800272 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Listeria innocua # 1 152 1 148 174 81 32.0 6e-16 MKILIVSDTHRKDDNLKNVIEKTSPLDMLIHLGDAEGSETKIAGWVGEGCDLEMVLGNND FFSNLDREKELKIGEYRVLLTHGHYYNVSLGVERLEQEARDRRLDIVMYGHTHRPFYEVR GGVTILNPGSLSYPRQDGRKPSFMIMELDDQGKAHFTLNFLEPQYH >gi|229784128|gb|GG667607.1| GENE 51 58928 - 59494 459 188 aa, chain - ## HITS:1 COG:ECs3830 KEGG:ns NR:ns ## COG: ECs3830 COG0127 # Protein_GI_number: 15833084 
# Func_class: F Nucleotide transport and metabolism # Function: Xanthosine triphosphate pyrophosphatase # Organism: Escherichia coli O157:H7 # 5 123 3 120 197 110 48.0 2e-24 MSENRIVFATGNAGKMKEIRLILADLGMEILSMKEAGADPVIVENGKTFGENAEIKARAV WAETGGIVLADDSGLVVDCLGGEPGIYSARYMGEDTSYEIKNQTIISRVNEAPGDDRSAR FVCNIAAVLPDGRVLHTEETMEGNHRREGRRKRRLWIRSNPVFTGVRSDQCRDYRGTEKQ DQPSRKSA >gi|229784128|gb|GG667607.1| GENE 52 59817 - 60698 836 293 aa, chain - ## HITS:1 COG:MA3430_1 KEGG:ns NR:ns ## COG: MA3430_1 COG4870 # Protein_GI_number: 20092242 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cysteine protease # Organism: Methanosarcina acetivorans str.C2A # 9 289 236 510 516 198 39.0 1e-50 MPGKDYEKIKRAVLQYGGVQSSLYTSMTNEESRSAHYNEDQYAYCYIGSEPPNHDSVIIG WDDGYPRENFNTEVNGDGAFICMNSWGDDFGDDGYFYVSYYDSNIGLNNIIYTDVEDPDH YDHNYQTDLRGWVGQLGYGSDTVWFSNVYEAQGKETLKAVGFYSTDRHSSYEVYVVRQPG DEEAIKSGDGFRNRRLAASGNLDYAGFYTIPLEAEPEEGLELEPGERFAVMVKLTTPDSV HPAAIEYDAGDGKTIVDISDGEGYISPDGVNFTRVEEKQKCNLCLKAYTTDRR >gi|229784128|gb|GG667607.1| GENE 53 61714 - 62211 479 165 aa, chain - ## HITS:1 COG:MA1513_1 KEGG:ns NR:ns ## COG: MA1513_1 COG4870 # Protein_GI_number: 20090372 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cysteine protease # Organism: Methanosarcina acetivorans str.C2A # 94 165 116 193 511 65 46.0 5e-11 MKAGKWLGTAAVLVLAGTAVIWINGRWPELSGPEQESAVTGTGETGTESQEDSGEQESAG RSSRESHGPLEPESESSEERDGAVREDPVPQVLPAAYDGRTTGRAPSIKNQGSLGTCWAF ASLMALEARLLPDENYDFSEDHMSLNNGFCIPQSDGGEYTMSMAY >gi|229784128|gb|GG667607.1| GENE 54 62216 - 63373 1182 385 aa, chain - ## HITS:1 COG:CAC0223 KEGG:ns NR:ns ## COG: CAC0223 COG0116 # Protein_GI_number: 15893515 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # Organism: Clostridium acetobutylicum # 5 377 3 372 373 381 50.0 1e-105 MKKTFELIAPCHFGMEAVLKREITDLGYDITLVEDGRVTFIGDEETICRANIFLRTAERI 
LLKVGSFHAETFEELFQNTRNIPWEEYLPQDGKFWVAKASSIKSKLFSPSDIQSIMKKAM VERMKKAYGLEWFPETGSSYPLRVFLYKDEVTVGIDTSGESLHKRGYRTLTSKAPITETL TAALIMMTPWNRDRILVDPFCGSGTFPIEAAMIAANMAPGMNRSFLAEDWKNVIPRKFWY EAMDEAADLLDTEVEVDIQGYDIDGEIVKAARANAESAGVDHLIHFQQRPLSALSHPKKY GFIITNPPYGERIEEKENLPALYKEIGERFRALDSWSLYMITAYEEAERYVGRKADKNRK IYNGMMKTYFYQFMGPKPPKRKTGD >gi|229784128|gb|GG667607.1| GENE 55 63375 - 63923 709 182 aa, chain - ## HITS:1 COG:CAC2749 KEGG:ns NR:ns ## COG: CAC2749 COG0622 # Protein_GI_number: 15896006 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Clostridium acetobutylicum # 1 180 1 180 180 182 48.0 4e-46 MKYMIASDIHGSAFYCRKLLEAFEESGANRLVLLGDILYHGPRNDLPKEYAPKEVLAMLN GCKDRIYCVRGNCDTEVDQMVLEFPVMADYALLAIDGITIYATHGHVYHENHLPPMQKGD VLLHGHTHVLRADRCGDITILNPGSVSIPKEGNPPTYAILEDGIFRILDFEGRVIKEKNL GE >gi|229784128|gb|GG667607.1| GENE 56 63991 - 64299 355 102 aa, chain - ## HITS:1 COG:lin0618 KEGG:ns NR:ns ## COG: lin0618 COG0607 # Protein_GI_number: 16799693 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Listeria innocua # 1 96 1 95 99 59 27.0 2e-09 MFKTIPIGEVDYYIENEYDMMVIDLRNAASYQRAHINGAVNIPYEEIDQRISELPRDKIL VFYCARGGQSMMVCRYLSRMGYSVLNVANGIAYYRGKYLVRG >gi|229784128|gb|GG667607.1| GENE 57 64494 - 64721 244 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSNNNYNDSYNDCSNSSRNNSQNSNQNNSQNSSRNNSQNSSKNNSQNSSRNNSQNNSQNS SRNNNQNNSQNKSDY >gi|229784128|gb|GG667607.1| GENE 58 65030 - 65755 186 241 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 99 230 37 162 175 76 35 9e-13 MIKNAWKTLSLLAFCGAIVLANPADAKADEVSGPGLSSGNYIATVEAETVNISTDESGQE VLMAASKGSSFQVVEDMGDGFVKVKVKDTYGYLPVDGNATVTQTDEATIAALEAAYDAPA EFRQSVVNYALQFVGGRYAYGGSDPHTGVDCSGFTRYVMQHAAGISLNRSSGGQASQGVA VSSDQMKPGDLIFYGNGRSINHVAMYIGNGQIVHSSTYKTGIKVSPWNYRTPVKIVNVIG E >gi|229784128|gb|GG667607.1| GENE 59 65900 - 66595 176 231 aa, chain - ## PROTEIN SUPPORTED ## 
NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 98 221 40 162 175 72 35 1e-11 MGNKTWKVLGVIGLYAVLAWNSPVTAKAAEIEVETSTTLYAKIDAPDVVVRKAKNAGGEA VTKVSRGQTYEVAAPAEGGWVKISTAEGEGYIPSNRATLIEKTKEKVDESVKLRQDVVNY ALQFVGNRYVYGGTDPNTGADCSGFTRYIMQHAAGISLSHSSRAQSGEGRSVSYSDRKPG DLIFYGGKGYINHVAMYIGNGQIVHASTERTGIKISNATYRTPVKVVRVLD >gi|229784128|gb|GG667607.1| GENE 60 66801 - 67247 382 148 aa, chain - ## HITS:1 COG:no KEGG:Closa_0686 NR:ns ## KEGG: Closa_0686 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 148 10 157 157 196 60.0 3e-49 MRDTLTGEVPGNVIEENIRYYDDYIRTETAHGQSEDEVTGAIGDPRLIAKTIMEATENAR EGGSRQSAYDTYRNEGRTVYEEEAGPDRHFHFVDLNKWYWKLLAVVVTVLFFFLVASIVT GIFSILIPLMGPLLLILLVIWLIRGPRR >gi|229784128|gb|GG667607.1| GENE 61 67375 - 69312 2369 645 aa, chain - ## HITS:1 COG:CAC1572 KEGG:ns NR:ns ## COG: CAC1572 COG3855 # Protein_GI_number: 15894850 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 3 645 15 664 665 678 51.0 0 MRDLAYLKLMAREYPTIKAASSEIINLTAIQGLPKGTEYFFSDLHGEYEAFIHLLRSASG IIREKITETFGHIIPEKEEVELANMIYYPERVVNQMQLKGMATDDWQRVAIYRLVQICKV VSSKYTRSKVRKKMPPEFAYIIDELLHVDYNDDNKRVYYSEIIRSIIDIRVGDKFIIALC ELIQNLTIDSLHIIGDIFDRGPRADIIMKELMQFHDVDIQWGNHDISWMGAATGNLACIC NVLRIAISYNSFDVLEDGYGINLRPLSMFAAKVYHNDPCERFVPQILDENIYDAVDPGLA AKMHKAIAIIQFKVQGQIIGRHPEYEMDDRRLLEAVDFDRGVVVLEGKEYPMLDLSFPTV DPKDPLALTAEEEELLHTLQLSFRHNELLHKHIRFLYSHGSLYKCYNSNLLYHGCIPMKE DGSFDEIVVDGISYSGQALMDYVDRKVQNAYFMAEESPEKEDAMDFMWYLWCGAKSPVYG KGKMTTFEHYFIEDPATHKEPMNPYYRLSVKEETCDRILEEFGLPRSGSHIINGHVPVKI KEGESPVKAGGKLFIIDGGLSKAYQSRTGIAGYTLIYNSNHLALAEHKPFDPEKESTPRV SIVENMHKRVMVADTDKGVELAGRIADLKELAGAYREGILKEKVE >gi|229784128|gb|GG667607.1| GENE 62 69632 - 70438 592 268 aa, chain + ## HITS:1 COG:CAC1330 KEGG:ns NR:ns ## COG: CAC1330 COG1234 # Protein_GI_number: 15894609 # Func_class: R General function prediction only # Function: 
Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 1 267 1 267 268 275 44.0 7e-74 MEQLYVFGTGHAAVTRCYNTCFAVEKDGEFFMTDAGGGNGILRILQDMNVDMTHIHHIFV THSHTDHILGIVWMVRFIGTRILNGSYEGQLHLYSHEDLLPAIETLCRLTLQEKLCRLIG DRIVLIPVSDGEQRQILGMDVTFFDIHSTKAKQFGFTAVLSDGRRLTCCGDEPFHPLCAP YVENSAWLLHEAFCLYEDRDRFKPYEKHHSTARDAARTAQDMKAEHIVLWHTEDTDLDHR RERYTAEAAEWFHGGIFVPCDEDIIPLS >gi|229784128|gb|GG667607.1| GENE 63 70492 - 71670 1123 392 aa, chain + ## HITS:1 COG:BH3436 KEGG:ns NR:ns ## COG: BH3436 COG0739 # Protein_GI_number: 15615998 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Bacillus halodurans # 223 321 212 305 337 84 42.0 3e-16 MKSWKERIAESPETLHHIIIYIEVIVLTCLLVSLFLISPEGRFSFPASNQDGKATSSEAR NYIKWVDFKVTSKAMQEAFRYDVNTCQSSPHLNWVELLSYLGARYGGDFSRYSTKDLDTL AEQLLSGKTMEELTANMKYYSYYRQAYGAVLDGLVGQYDIQIAAENAPLFASPVVNADGT ADPDRVWVTKYGLKAFSPIAKGFPYQDYDDFGVSRSYGYKRQHLGHDMMGQVGTPVIAVE SGYVEAIGWNQYGGWRLGIRSFDGKRYYYYAHLRKNYPYHKSLKQGSIITAGDVIGYLGR TGYSRTENTNNIDEAHLHFGLQLIFDESQKEGNNEIWVDCYELTKFLSMNRSETLKDQET KEYDRIYDMKDPDIPENFVPHAPLPEEQHIPD >gi|229784128|gb|GG667607.1| GENE 64 71786 - 72973 1086 395 aa, chain - ## HITS:1 COG:CAC3299 KEGG:ns NR:ns ## COG: CAC3299 COG1979 # Protein_GI_number: 15896543 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Clostridium acetobutylicum # 1 392 1 386 389 276 38.0 4e-74 MNNFEYCVPTRVIFGRDTQKRAGELIKEYGFRKVMIHYGGGSVKRSGLLDQVTASLREAG IETVLFGGVQANPTLSKALEGMEICRREGVDFILAVGGGSVIDSAKCIADGAPNPEIDPW KFFMKEAVPPKALPHGNILTLSASGSETSQSCVITNEENGLKRGFNCPAHRPLFAVCNPE LTFTVSRFQTGCGTVDILMHTLERYLGGTTKDTALTDRIAEGLMKAVIEAGTAADLNPED YEARATLMWAGSLSHNDLTGLGREVMMTVHQLEHELSGKYPEVAHGAGLSALFCSWARYV CRDDPMRFAQLAVRVWDVEMNFENPLKTALDGIQRIEDYFKSLNMPVRLSELEADVKEAD FDEMAEKCTNFGRRVLPGIRELGKREMMEIYRMAL >gi|229784128|gb|GG667607.1| GENE 65 73027 - 74274 1203 415 aa, chain - 
## HITS:1 COG:CAC0016 KEGG:ns NR:ns ## COG: CAC0016 COG4198 # Protein_GI_number: 15893314 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 412 1 413 414 462 56.0 1e-130 MAVVKPFPCVRPDKKYAACVAALPYDVYNRKEAKAAASANPKSFLNIDRPETQFPDELDM YDDRVYDKAKEMLESWIADGTFVKDPLAAYYIYELTMDGRKQTGIVACSSIDDYVNGVVK KHENTREEKELDRIRHVDATDAQTGPIFLAYRSNAMINKVVDLKTGTEPLYDFTAEDGIR HRVWEIGNADSILTVEEAFAGIPATYIADGHHRAASAVKVGLKRREEHPDYTGNEPFNYF LSVLFPDDQLMIMPYNRVVKDLNGMTEETFLKRVEEAGFTVASMGTEGFAPVEKGTFGMY LCGQWYCLTAAERLLTSDPVKGLDVSILQDHLLGPVLGIGDPRVDKRIDFIGGIRGLSEL EKRCREDMTVAFSMVPTSIEELFSVADAGLLMPPKSTWFEPKLRSGLFIHLLSKR >gi|229784128|gb|GG667607.1| GENE 66 74355 - 75518 1534 387 aa, chain - ## HITS:1 COG:lin2956 KEGG:ns NR:ns ## COG: lin2956 COG0111 # Protein_GI_number: 16802015 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Listeria innocua # 1 387 1 389 395 368 48.0 1e-101 MKKIYCLNAISSKGTALLNEEYELTENMKEAAGVLVRSASMHEMELPDDLLAVARAGAGV NNIPLDACADKGIVVFNTPGANANGVKELVIAGLMLASRDVVGGINWVKDNKEDENIAKS VEKAKKAFAGCEIKGKKLGVIGLGAIGAEVANAAASLGMEVYGYDPFISVNAAWMLSRNV RHITSVETIYRECDYITIHVPLLDSTREMIKRESLAQMKNGVVILNFSRDVLVNEDDMAE ALASKKVKCYVTDFPNTKSVNMEGAIVIPHLGASTEESEDNCARMAVEEIMDYIDNGNIR NSVNFPACDMGVCQMASRVAVLHLNIPNMIGQVTGTLAAGNVNISDMTNKSRDKYAYTLL DLESVPDSMTIQKLNAIKGVLRVRVIK >gi|229784128|gb|GG667607.1| GENE 67 75537 - 76622 1225 361 aa, chain - ## HITS:1 COG:RSc0903 KEGG:ns NR:ns ## COG: RSc0903 COG1932 # Protein_GI_number: 17545622 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Ralstonia solanacearum # 2 356 13 374 378 392 52.0 1e-109 MSRVYNFSAGPAVLPEEVLKEAADEMMDYRGCGMSVMEMSHRSKMFETIINEAEADLRDL MGIPDNYRVLFLQGGASQQFAMVPMNLMKNRVGDYIITGQWAKKAYQEAQLYGKANAVAS SADKTFSYIPDVSDLPISEDADYVYICENNTIYGTKYKTLPNTKGKTLVADISSCFLSEP 
VDVEKYGLLFGGAQKNVGPAGVVIAIIREDLITEDVLPGTPTMLRYKTHADAASLYNTPP AYGIYICGKVFKWLKNRGGLEAMKEYNEKKAKLLYDFLDNSRMFKGTVEKKDRSLMNVPF VTGDADLDALFVKEAKAAGLENLKGHRSVGGMRASIYNAMPMEGVEALVAFMKDFEARNG K >gi|229784128|gb|GG667607.1| GENE 68 76659 - 78338 1950 559 aa, chain - ## HITS:1 COG:CAC3169 KEGG:ns NR:ns ## COG: CAC3169 COG0028 # Protein_GI_number: 15896417 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Clostridium acetobutylicum # 3 550 2 549 554 644 59.0 0 MNKLTGAEIVIECLKEQGVDTVFGYPGGTILNIYDALYKHQDEITHILTSHEQGAAHAAD GYARATGKVGVCMATSGPGATNLVTGIATACMDSIPVVAITCNVAVSLLGRDSFQEIDIA GITMPITKYNFIVKDITKLADTIRRAFTIAQTGRPGPVLIDITKDVTANTCDYEYKAPEP IVRQNSTIREEDMEKALYLIRNAKKPFIFAGGGAVIADAADELKAFAHKIQAPVADSLMG KGAFDGTDELYTGMVGMHGTKTSNFGITECDLLIVVGARFSDRVTGNAAKFAPNAKILQF DVDPAEIDKNVKTYASVIGDVKTILKKLNARLDPVNHDEWLAHIDRMKDMYPLRYDKNLL TGPYVVEQIYELTQGDAIITTEVGQHQMWAAQFYKYKHPRTFVTSGGLGTMGYGLGAALG AKMGCKDKTVINIAGDGCFRMNMNEIATATRYNIPVIQVVLNNHVLGMVRQWQTLFYGKR YSHTVLNDAVDFVKVAEGMGAKAYRVTSKEEFMPVLKEAIGLNIPVVIDCQIHCDDKVFP MVSPGAPIQDAFDENDLKI >gi|229784128|gb|GG667607.1| GENE 69 78358 - 80028 1957 556 aa, chain - ## HITS:1 COG:CAC3170 KEGG:ns NR:ns ## COG: CAC3170 COG0129 # Protein_GI_number: 15896418 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Clostridium acetobutylicum # 1 551 1 547 552 667 62.0 0 MKSDSVKKGMQQAPHRSLFNALGMTEEEMERPLVGIVSSYNEIVPGHMNLDKIVQAVKMG VAMAGGTPVMVPAIAVCDGIAMGHIGMKYSLVTRDLIADSTECLAKAHAFDALVMVPNCD KNVPGLLMAAARINVPTVFVSGGPMLAGHVQGHKTSLSSMFEAVGAYTAGTMSEEEVREY ECKACPTCGSCSGMYTANSMNCLTEVLGMGLQGNGTIPAVYSERLKLAKHAGMQVMEMYR KNIKPSDIMTEAAFKNALTMDMALGCSTNSMLHLPAIAHEAGVELNVDIANEISAKTPNL CHLAPAGPTYIEDLNEAGGIYAVMKEISKLGLLNLDCMTVTGKTVGENIKNCVNLNPEVI 
RPVENPYSRTGGIAILKGNLAPDSAVVKRSAVAPEMLKHEGPARVFDCEEDAVAAIKGGK IVAGDVVVIRYEGPKGGPGMREMLNPTSVIAGMGLGSEVALITDGRFSGASRGASIGHVC PEAAAGGPIALVEEGDMICIDINNNRLDVKISDEEMAARKAKWQPRVPEVTTGYLARYAA MAAPASKGAILEVPGN >gi|229784128|gb|GG667607.1| GENE 70 80064 - 81146 1175 360 aa, chain - ## HITS:1 COG:PA3118 KEGG:ns NR:ns ## COG: PA3118 COG0473 # Protein_GI_number: 15598314 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Pseudomonas aeruginosa # 1 356 1 353 360 405 59.0 1e-113 MSYNIAVIPGDGIGPEIIREARKVLDRIGTVYGHEFSYTEVLMGGVSIDAYGVPLTDEAL ETAKKSDSVLLGAVGGDVGNSRWYDVAPNLRPEAGLLAIRKGLNLFANIRPAYLYEELAE ACPLKKEIIGDGFDMVIMRELTGGLYFGERHTTEVDGVMTAVDTLTYNENEIRRIAVKAF DIAMKRRKSVISVDKANVLDSSRLWRKVVEEVARDYPEVTLTHMLVDNCAMQLVMNPGQF DVVLTENMFGDILSDEASMITGSIGMLSSASMNEGKFGMYEPSHGSAPDIAGQNIANPIA TILSAAMMLRFSFDLDREADAVEAAVQKVLSDGFRTGDIMSEGCKKVSTSEMGDLIAERI >gi|229784128|gb|GG667607.1| GENE 71 81490 - 82668 1214 392 aa, chain - ## HITS:1 COG:BS_patB KEGG:ns NR:ns ## COG: BS_patB COG1168 # Protein_GI_number: 16080196 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Bacillus subtilis # 4 389 2 386 387 335 42.0 1e-91 MKYDFDTRVERRGTYCCKYDTIPEEAGENAIPVSIADMDLPCAEPIIRALHERIDQKIFG YTVYENEECKQAVMGWYRKRFGWEIDKHDIFFSPGVVPALAFLIQFLTKEGDGVIIQRPV YHPFTGKVEANGRKVVNNPLIRKGNTYEMDFDDLDRKFADDNTKGMILCSPHNPVGRVWR EDELRKVVDIAKKYDKWIISDEIHADLTRIGVTHTPLLKLAPDYSDRIIVCTAPSKTFNL AGMQFSNIIIPNKEYQEQWTEVVSNRLSVGMCSPFGLTAIIAAYTEGEEWLDQARAYIDG NIRYIEEFVKENLPKAVMMDCQGTYLVWLDLNEYCSDSEKLEALMVRKAGIIFDEGYIFG PEGSGFERINAAAQRSTVEECMKRMKAALDTL >gi|229784128|gb|GG667607.1| GENE 72 82892 - 83143 220 83 aa, chain + ## HITS:1 COG:no KEGG:Closa_3641 NR:ns ## KEGG: Closa_3641 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 82 1 84 88 63 48.0 3e-09 
MITIFNRKELAITYSISEQARIRNLLAAEGIKYIIDTKGTFMRNFGSRGATIQQFGENIT AGTEYRIYVLKSDYDKASHLIHK >gi|229784128|gb|GG667607.1| GENE 73 83198 - 83698 585 166 aa, chain - ## HITS:1 COG:no KEGG:Sgly_1265 NR:ns ## KEGG: Sgly_1265 # Name: not_defined # Def: regulatory protein MarR # Organism: S.glycolicus # Pathway: not_defined # 29 148 25 142 150 68 32.0 8e-11 MGVEKRYELFCELLKTLDEGVDAIEEYDSLLHDYRGTVLYQAESQIIKAVGDQPGITASE LSRVFDKTNSACSQLIRKLKKKEWIRQERNEKNNREYNLYLTEEGKVIYKKHEEFENACY ERTYHMLDGISEEEMRTYIGIQKQLNRAFKLDVEESRQLSGNSGAE >gi|229784128|gb|GG667607.1| GENE 74 83774 - 84790 1170 338 aa, chain - ## HITS:1 COG:no KEGG:Closa_3658 NR:ns ## KEGG: Closa_3658 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 335 1 324 326 324 51.0 4e-87 MKKNYGLLAGCLVVAALGVIYAGVGAYQDHTKEKERQQAETEKIYMTDFSDIEAISYDND GNVLAFTKDGDSWTYDGDDQFPVNTTRMDSLAGTVKKLPAVRRLEGGDDLAAYGLDTPLR RVTVSADDGKTVTILIGDKTDGGNYYAVIDGQNVPCLISSSLFDETAYGLEDMMALEEFP AVVGTDIQSITIEKNGVSEHYVKKKLAEETSPQSGSAESEDGTIAWYRGSDDTEDNKLPD NSALNVLADSLSGLVVKSCANYKVTDEELAGYGLDHPQAVLSYTYEKDGEDKTFSLSVGN PGEDGTTYYTRTEDSKYVNEIDKTALDQCMTADTGNDQ >gi|229784128|gb|GG667607.1| GENE 75 84838 - 86208 1577 456 aa, chain - ## HITS:1 COG:slr2105 KEGG:ns NR:ns ## COG: slr2105 COG3225 # Protein_GI_number: 16330592 # Func_class: N Cell motility # Function: ABC-type uncharacterized transport system involved in gliding motility, auxiliary component # Organism: Synechocystis # 15 278 81 329 595 89 27.0 1e-17 MTRKLKKGGYAAILSVIVIAAVVLLNMIVGRLPEKVRRWDLSGTQIYTVGDTTKELLASL DKNVTIYVVADPTMVDDRITSFVNRYADLSDHIKAETVDSVLHPDRVKQLNAENNTILVV CEDNGKTETIQMSDIIKYDQMSYYYYGQTKETEFDGEGQLTSAVSYVTNDVQKNIYVTEG HGEAALGTFASDLLEKSGLTVNTVNLLTGGGIPEDCELLLINAPVSDLADDEKTMVTDYL DGGGKVLLIAGYSDKDRPNLNAVLNAYGLNMEHGLAADTKSCYQNNPYYIFPTLVSGSEI TNGIDRKSTALILQSSALNQLDTLPDGVTVEPFMETTDGGMLVTESSQTPGTYLLGAMAE KTLDSGTARLTVFGTPSLIDDGLNSTFSNLTNLDLFMNAVTANFEDVTNVSIPSKSLEVT YNTVTHGGMWGIVFILVIPVATVAAGLMVWLKRRRL >gi|229784128|gb|GG667607.1| GENE 76 86234 - 87100 856 
288 aa, chain - ## HITS:1 COG:PA4038 KEGG:ns NR:ns ## COG: PA4038 COG1277 # Protein_GI_number: 15599233 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Pseudomonas aeruginosa # 1 170 4 175 244 66 31.0 6e-11 MAAIYKRELKSYFQCMTGYVFIAFLVLFVGIYFMAYNLMSGYPYFSYTLSGMVTIVMIGI PVLTMRSFADDRKTKTDQLLLTAPVSVPNMVLAKYLSMVTVFAVPVLISCLCPLIIKMNG TAYLKADYASILAFFLLGCVYIAIGMFISSTTESQIIAAVGTFGAILLLLLWPSLVNFLP TSASGSLVGFLILWTLCVVILHRVTSHNLLAIVLEAAGVVALVGAYVAKKSMFDRAFVTL VEKIAVTDVFQNFASNYIFDAGGLIYYVSIIFLLVFLTVQSVEKRRWS >gi|229784128|gb|GG667607.1| GENE 77 87101 - 88165 284 354 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 2 329 9 307 309 114 27 4e-24 MIEVKNLVKDYGGHLAVDHLNFTIEDGQIYGFLGPNGAGKSTTMNIMTGYIGATEGDVLI NGHNILEEPEEAKKCIGYLPELPPLYADMTVMEQLDFAAELKQIPKKERKEAIGEVVALA KLEDVQGRLIRNLSKGYRQRVGLAQAVLGMPPVIILDEPTVGLDPKQIIEIRDTIRELGE KHTVILSSHILSEVSAVCDRILIIDHGKLIASDTPENLERQMAGASGMELLVKGQEETIR EILEPIRGVEEIAVNENGGEDVRKVTFRLAAESEGSDYAEAAASSPDGASQTDIRETIFF AFADRKIPILSMQTTKASLEQVFLELTAEDTLDTADGEDASDEDSVMETEKEEV >gi|229784128|gb|GG667607.1| GENE 78 88511 - 89194 422 227 aa, chain - ## HITS:1 COG:VNG1256G KEGG:ns NR:ns ## COG: VNG1256G COG1985 # Protein_GI_number: 15790310 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine reductase, riboflavin biosynthesis # Organism: Halobacterium sp. 
NRC-1 # 13 209 9 210 220 59 28.0 4e-09 MNNRPVTTLFMLMSVDGKITTGASDELDVDKNFPCIKGVKEGLHQYYELEQMTDLWSFNT GRVQKKLGVNEKEFPSKSPVSFVLLDNHHLTEHGVRYFCEWSKEFVLITQNPVHPAFSVT ADNLHIIKQDSLDLADALKVLEEEYGCKRLTIQSGGTLNGLFLRNKLIDYIDIVVAPVLI GGKDISTLIDGSSITSMEELNLLGVLQLETCEVLEDSYIRLRYQVTK >gi|229784128|gb|GG667607.1| GENE 79 89315 - 89935 429 206 aa, chain - ## HITS:1 COG:Ta0854 KEGG:ns NR:ns ## COG: Ta0854 COG0279 # Protein_GI_number: 16081908 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Thermoplasma acidophilum # 26 199 27 182 182 129 40.0 3e-30 MDHLKRLMQRFPVLIDMEGELMRAYDLLEDAYSQGGKLLLCGNGGSAADCDHIVGELMKG FLKKRPLDSAEREGLGDLGDRLQGALPAISLTGHSALSTAFLNDVAPELVFAQQVYGYGR RGDVLAAISTSGNSVNVVNAARVAKARGLSVISMTGKTGGVLKGISDVCLAVPAEVTADV QELHLPVYHTLCAMLEEHFFPDTDAK >gi|229784128|gb|GG667607.1| GENE 80 89963 - 92392 1865 809 aa, chain - ## HITS:1 COG:all0848 KEGG:ns NR:ns ## COG: all0848 COG0383 # Protein_GI_number: 17228343 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Nostoc sp. 
PCC 7120 # 6 772 230 1001 1047 334 29.0 4e-91 MEKDKKLYMIGNAHIDVVWLWQWQEGLQEVKATFKSVLDRMKEYDDFIFTGSSAAYYEWV EENDPSMFEEIRSRVKEGRWVIVGGWWTEPDCNAPCGESFVRQGLYGQRYFEEKFGVKAV CGYNVDSFGHNGNLPQILKKCGMDSYVFMRPGRHEKGIAGENFVWKSADGSAVNAFRLPF EYCTWPDQIEEHVRRCAGEIKNSDGGIMCFYGVGNHGGGPTKKNIDSIHRMNARPDMPEL QLSSPNEYFEDVKKRGRDLPVVCGGLFHHSSGCYSVESRVKALNRAAEMRLLMAERVSVM AGLLTGIRYPAEEYQKAWKSVLFNQFHDILAGTSLRESYEDAAEDYGYALHAAGRGLNSA VQSLSWQINIPMEEGMKPLVVVNPNAFNTKAEVQAESWTLKEGTVLLDENGNQIPCQLVQ SSAALQGRCRICFVADLPSLGWRTYRFAVREKAETFPEVAVSECSAENRWFKLTLDPETG YIASLLKKNDGTEYFSGPAAVPVVIRDESDTWSHAVRIFDEVIGRFKAVSVRTVENGPVK CVIRVTSVWGNSRIIQDFSVFQDLDYIAVKTTVDWHEKQAMLKLQFPMNMNYLRTSWEIP YGMEQREPDGEEYPMQMWLDLEGTNPGMETSMNGLSILNEGKFAGSATGKTASLTVLRSP VYTHHEPYQLQENLEYVYIDQGTQTFTYGLYPHDGSWENAATVRRAKVMNCRPIALFETY HEGRLEQTGSLMEVNQENIVVEVLKKAEDGSGDLILRAYETAGRAVKAELTIGVLEQTIE ADFQPFEIKTLRLPGQKGGEAVWTDMLEE >gi|229784128|gb|GG667607.1| GENE 81 92379 - 94835 1812 818 aa, chain - ## HITS:1 COG:all0848 KEGG:ns NR:ns ## COG: all0848 COG0383 # Protein_GI_number: 17228343 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Nostoc sp. 
PCC 7120 # 3 772 228 1001 1047 229 25.0 1e-59 MEKKQLYMIGNSHIDPVWFWEWEEGMQEVKATFQSALDRMREFEDYRFTSTSTAFFQWIE KLLPDLFEEIRQRVLEGRWEITGGWFIEPDCLLPCGEAFARHGLYGQRYLKDRFGEICRI GSNVDSFGHSGSLPQILKKSGMDSYVFMRPRLDTPVFRWESPDGSSVNAICLPAEYTTWF REPTLKNINDTLERTKEYDKMVCCYGVGNHGGGPTIENIETIREQIEDFDGAELHFASFG EFLEDIRDWRLPVKMGPFEKINQGCYSNDSRFKQLNRLAEGRLMAADTFLSMAAALTGVW PEETGKIEDLWNGLHFNEFHDTMGGTTIKPAREEAVMQLSAVCAKAASVRETAKQRIVNS LDTSGEGYPLFLFNPHGKDFSDYVEVELEWFCQSPLKLSDPFGNEVPYQRIHTAAKVRHT VLGGRRRIVFHASVPSCGYAVYRLRKETPSLGFNQNMELDNPDPYTLENRFLRAEFDPET GMLCALKDRTAGYDALKGPASVRLYVDERDAWGGLQGKRFEDRNVTFRLDSIDKVESGEI RETIRVRLSYEGTRLEQLYSLGADERELRVENRLCFAHPRALLKTAYPTGEGCLITRAET AYGIVERDHKGDDGEYNMQRFLDAADGAGRGLAVSNDGKYAFNLTEGRTQITVARSAIYA QGSSPDWENEIESYEYMDIGTQTFHLILKPHGKRLPASELYRIAEKANGSYEYLMDSAHK PLTNDGRGSIVLSIAGTDRDNVAVQVMKKAEDDDGFILRLLELEGKDTDYHLYLMGNTYP LTIGHHEIQTIKTDREGRNIKIVNLLEWEEERDRNGER >gi|229784128|gb|GG667607.1| GENE 82 94854 - 97472 2094 872 aa, chain - ## HITS:1 COG:no KEGG:Ccel_0950 NR:ns ## KEGG: Ccel_0950 # Name: not_defined # Def: HI0933 family protein # Organism: C.cellulolyticum # Pathway: not_defined # 228 393 16 173 435 90 38.0 3e-16 MGFELGQEILKPESPDRWRETGYLITGNTLEAVKRSICLARRGERVLLAVEETFLAADIG SYLDYESSPELQGFFPDSLWENGCLHPDGYKKYLERLCREQGAEFCYGLYFMDAVSEEKE KIEVRFASKGGLYRVRCRYHEKFSQTNKGGRTVCAAWITDEKTGKNRILKAPFLWKPRAT EAENLLAGRRELLLAFGRSKAENPALNLGRFALRAGKGEKSCSNLKEAPFDVIVVGGGTA GAMAALYAARGGAGTVLIEPQYDLGGTATLGGVSTYWFGKRYRDVEEIDLETDRLEQQYG ISRKPGLWSEYDDFHAGIRGYVLLKLCLEAGVSVVFGQIAYDVVKEGTRVCGVKTAGHAG KKTLRGSLIIDATGDGDLAYFSGAETVYGSGRDCFTYWASLAQYTGTAGYRNNFSSMVMA SDPEDMTRFAVLGRERGEGTFDHGAYVSMRESRHITGKKMIDLRDICTFRTYDDGICTFF SNYDPKGKLDADMVYCGYLPPQVNMQIPLSALVPKHREGYSLEGLYVAGKAVSATHNAFP GLRMQPDLMHQGAVLGLLAARAAKSGCLIEALSIPQRRAWIREATGDSLTLPDKIRYQSG SRNIPDYAFYAGQVTGESRTHWIDVPFTYEETKISPLLALVCGESEHVLPCIRKRIQALE ADEYGRDQGTLAVLKRAALFHGCNDYMEDMQQSIVRRINEAKPGLPVRKGSVMCAQLLPD HGVMPELVYDLNLLSGGDGFSMEPFYLVFEVLKKEERDYMDIRKGTYSYLECFAYAARRS SRTEFIPLLKGLAELPEFQTALQEENQVSLLAERMQMLQLLLYQSLAGLGEREGVKGLER 
LSKVRCLAIRRSAEMVKQSVEEGDLGGAEKIW >gi|229784128|gb|GG667607.1| GENE 83 97514 - 98974 1541 486 aa, chain - ## HITS:1 COG:SP1894 KEGG:ns NR:ns ## COG: SP1894 COG0366 # Protein_GI_number: 15901721 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Streptococcus pneumoniae TIGR4 # 1 433 3 434 480 453 49.0 1e-127 MKNKIMLITYADSFGKNLKEMKGVIDRYFRREIHSIHILPFFPSSGDRGFAPINYREVDE AFGTWEDIREIAEDYELMFDFMINHLSRRSPEFLDFIEKHDESEYADMFLRFRKFWPGGE PTEEQVELLNKRKPCAPCEEIAFRDGTTEKIWCTFDDEQMDLDLRSESAWNYVERTLRFL MDQGLSQLRLDAFAFATKKVGTTCFFLEPEIWEMMDRIQRVLDEKGIPMLPEIHDHYTVQ LKIAEHGYAVYDFVLPVMVLHTLYSGSSARIRHWLAICPRNQQTTLDTHDGLGTVDVVDL LSPEELQAVVDQTEQYGANFKWDYSGGNSSGEKVVYQINCSYYSAVGERDDSYLLSRAIQ FFTPGIPQVYYMGLLAGENDYELMERTNYDRNISRHNYTVEEIERLAEKPVVKQLCVMMQ FRNEYPVFDGEMTVHNTPDERLWISWMAGPCRADLKAELKSHSFAITYLDREGCEKALDL DRDFSF >gi|229784128|gb|GG667607.1| GENE 84 98987 - 99967 337 326 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 3 309 4 305 323 134 31 3e-30 MQIIAGIDIGGTKSAVSFARYADGGIEILDKVKRPTRTADYHSAFHEYIEIIQTQLKASP DWRLCSIGISCGGPLDAEKGIIMAPPNLPEWDNADIFTPLRNAFRVPVMLQNDADACALA EWRLGAGKGTKNMVFLTFGTGMGAGMILNGQLYQGATCMAGEVGHIRLEKDGPYGYGKNG SFEGFCSGGGIANLGRMKAQEALEKGVVPSFCKSMDMLPQVDCKEIGRALEEGDATAREI FDIVAEYLGRGLSILIDILNPDMIVLGSIYARQREALEPGMQAVIAREALAASGKACKIV PATLGEMIGDYAAVSVGMKAYDDFYR >gi|229784128|gb|GG667607.1| GENE 85 99958 - 101364 1215 468 aa, chain - ## HITS:1 COG:BH2676 KEGG:ns NR:ns ## COG: BH2676 COG1070 # Protein_GI_number: 15615239 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus halodurans # 4 462 7 454 500 78 24.0 2e-14 MISLGIDIGTTSICAVLYDLTEDKIIKSLSTPNSFINTGSYLQDPDRIVSLAKELLKELL MEFKAWAGEHEARNIRKPAGVGISSQMHGILYTDRTGRAVSPLYTWKNEDGNQRYRDGLT YARYLVKETGLPFYTGYGSVTHFYLQENQQIPDEAVAFVGIGDYVAMRLTGSRKAVVHKT MAASFGGYHMAEGRFELDKLAAAGVDISCYPAVAREEGQAVGTMVMEQTAVLLAGGTAEP 
EAIPVFCAVGDNQASFLGAVKDKEHSVSINVGTGSQVSVYSGEWDPAAGTDIRPWIDHGY LYVGASLNGGKVYERLAAFFEEVCEEFAGQKVNAYETMERLAMEEQETELRAVPSLYGSR EETDREPEAGIYGLNSGNFHPKDLIRSFTTGMARELFALYSAFPEKACAGKTQIVASGNG IRKNRLLREDVEKVFGLPVVFTDREEEAASGAALYVRQAVIEGGAECR >gi|229784128|gb|GG667607.1| GENE 86 101397 - 101576 180 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619111|ref|ZP_06112046.1| ## NR: gi|266619111|ref|ZP_06112046.1| hypothetical protein CLOSTHATH_00104 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00104 [Clostridium hathewayi DSM 13479] # 1 59 75 133 133 120 98.0 3e-26 MAALVLEAGTLGIAADRPPVRVWIDGRDHEAVCHGGMLWSVYCEKPDCLVEIEWQSGGL >gi|229784128|gb|GG667607.1| GENE 87 102715 - 104028 948 437 aa, chain - ## HITS:1 COG:no KEGG:Mahau_2006 NR:ns ## KEGG: Mahau_2006 # Name: not_defined # Def: raffinose synthase # Organism: M.australiensis # Pathway: not_defined # 12 434 147 566 697 371 41.0 1e-101 MDRYQLNRCSLETDRVNRSGAENRLAVRMASNAGNRNRMEDLSLAIAGGSDPYLCCERAV QAALGRLGRSSMLRKNRKFPEKLEFFGWCTWDAFYHRVSHEGVMEKMKEFRAKQLPVKWV LLDDGWLDADYDKKVLIGLDADRERFPKGLKGCVKELKETWNVDSVGVWHAVMGYWNGLA GESPAAETLKAGTRVLPDGRILPDPEAGKAFTFFETWHKYLKNCCGIDFVKVDGQSAVSL AYGGMETYGHASCGIQKGLNASAALYFDNCIINCMGMAGEDMWNRPSSAVARSSDDFVPQ VPHGFKEHAVQNSYNSLLQGQFYWGDWDMFFSSHEENWQNSILRAVSGGPVYVSDRVGET NPGFIRPLITETGLVIRCREVGMPTTDCLFDNPADTLRPLKIFNRYGENYVIGAFHICEK DDICLGKLEMSDIPVAS >gi|229784128|gb|GG667607.1| GENE 88 104133 - 104384 183 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619113|ref|ZP_06112048.1| ## NR: gi|266619113|ref|ZP_06112048.1| dihydroxy-acid dehydratase [Clostridium hathewayi DSM 13479] dihydroxy-acid dehydratase [Clostridium hathewayi DSM 13479] # 1 83 21 103 103 166 98.0 5e-40 MVFLLSLSPEEGTAGELQGSVSEDTVLITVRAEKEAREPCFGPQAYFDSREGIVYHLDIP ETERFLAVYQHKDWWVRPAFFRT >gi|229784128|gb|GG667607.1| GENE 89 104474 - 105688 359 404 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 85 385 7 308 323 142 30 8e-33 
MADQNNQVQLKNFNRRTVLGYIRRNGTATKAGLASVTGLTFMAIKKILAELQELNLIRDG QMESGGMGRKAVSYVINENYRYTIGIHINKFITGIALLNLRGQILAIERYSMDKEFENQN DFVTMLAEAVELVIEKSGVKREDILGIGVGAPGPLDCESGVILTPPNMPMLDYLPLKETL EGRTGFPVFLNKDTNVIAFAEYWYRNNRDCSNLAYVEVDMGIGSGLIIDGKLNVGANCIA GEFGHITIDINGPLCNCGNRGCLEAMSSGIAVLRMLGEQLENQKDHPLYHKRNALTIEDV FEMTDKKDLLTISILNRSAFYVGVAVSNLINTFDPEMIILGGILIQKYPMYFNIVQDVAN QRKVKGAKENYMAVSVLGENAGVIGAGEIVTDHFFNQFVNEVFH >gi|229784128|gb|GG667607.1| GENE 90 105802 - 106680 814 292 aa, chain - ## HITS:1 COG:SMc04137 KEGG:ns NR:ns ## COG: SMc04137 COG0395 # Protein_GI_number: 15963868 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 18 288 19 292 295 207 42.0 2e-53 MAADTKTVKRGGLSSRRRKKAIVNVICFILLAAASLTVIMPLWFMISTALKSMDEIFTYP ITWYPHEPIWSNFKDAWASAEFSRWFLNSVFVAAFAILGGVLANSLVAYGFAKIRFPGRN IMFSIILATMMIPEFVTMIPQYVLYAKIGWVGTYLPLIVPQFMGSAYFIFMLRQFYAGIP NSVIESAQIDGASHFRIWRSIMLPMAKPAIMTVVVLSFNWSWNDFLKPLLYLMDTKTFTL QLGLKIFVSQSNTQWNYLMAASCIVLLPIIVVFMCLQKYFTDGMNIGGAVKG >gi|229784128|gb|GG667607.1| GENE 91 106680 - 107579 890 299 aa, chain - ## HITS:1 COG:SMc04136 KEGG:ns NR:ns ## COG: SMc04136 COG1175 # Protein_GI_number: 15963869 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 11 296 31 315 315 247 45.0 2e-65 MTKRESREARAGWLFISPWLIGFLCLTGGPLLFSLYASFTNYNMTSRMDFIGLSNYIRMF TKDPVFWKSLGNTLYYVALAVPSSCICAIFLATLLNQKVKGTPLFRMLFYLPTVLSGVAV YQLWMQLLAPQSGLINSVLRLVGIEGPSWLSDPAWTKPSLVMMRVWALGTSMLLYLSSMN SVSKDLYEAADIDGASFLQKFRKITLPQISPIIFFDIITNMTGAFQVFQEALVMSKNGKG DPAGSLLFYNLHIYQEAFTHYDMGYASAMAWFLLLIVMTITVINLVASKYWVHTEEGET >gi|229784128|gb|GG667607.1| GENE 92 107698 - 109062 1516 454 aa, chain - ## HITS:1 COG:lin0762 KEGG:ns NR:ns ## COG: lin0762 COG1653 # Protein_GI_number: 16799836 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic 
component # Organism: Listeria innocua # 4 454 2 415 417 113 26.0 7e-25 MRKKLIGLSIATALGIMSLTACKSQPAENADTKGAADTTVQAEGTKEEGTAKEASGDRVK LEIWDWWGDGQYKYVIEQLCQNFNESQDQYEVVHVNYPWGDVWTKALAATAAGNPPDVII QDIRTVTHRAEAKQATDLTDLIAKEGDGFTDQFYPQLMDAMMYEGHAYGLPYVTDTRILY YDKDAFAEAGLDPEAPPETWAQLVEYARKLDMKKDNGSYERLGFFPGTLEWNAWAFTADG KDYTDADGNVYVNTPEKVKMFENIYNDFYQYYGKKELDSFSAEFGNGMTNPFVAGKVAMW VNTPTEFTKVRDYAPDKNYGVALLPSLEDGGEHYSWGGGFSVEIPYGAKDTEGSWEFVKF ITSKESQIYWASQVYDTVANIEASQDPSLMEIPVFKMALEAMPTTVVTQDRLTAPGASDI VLPYLDEIILGSKTPQEALDEAQAQVEQLVSDNK >gi|229784128|gb|GG667607.1| GENE 93 109685 - 109801 177 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTAYEIITIFIGILTLLMSFGCWIVALIAFLDKRNKRK >gi|229784128|gb|GG667607.1| GENE 94 110039 - 111298 1567 419 aa, chain - ## HITS:1 COG:TM0432 KEGG:ns NR:ns ## COG: TM0432 COG1653 # Protein_GI_number: 15643198 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 38 320 26 321 423 105 27.0 1e-22 MKKWLSVLLAGAMVLGTLTGCSGGGKKAENEVKTLDVVWFSDGKEGESFMKLADKYMEEN PSVKIELIEVPYADLENKIKNMINGKEAPALARLTNIGPFQNQLVDLGEYVSDKDAFQNS FGEGLKFVFDGKILAAPMDVTANGLIYNKTAFEKAGVAVPQSEDEIWTWDEWKEAMKTVM EKSDCKYGLVYDKSPQRFTTLLYEAGGSLLNSDLTASNFNTDETKRAVSFFKELHDEGII PTSVWLGSENPNNMFRTGQVAMHFAGSWMIANYKEEITDFDWGVTYLPKEAMRSSVPGGK YLAAFQNTGVEKEAADFIEWISKAENNAQYCVENSYLSQVKGNESLDYEYGKDFFTIFSQ ELAATGPQPGAEWGYQAFTGAIQNDLRDKLIEVLAGQLTVDQYAEDMDKLITDALEELK >gi|229784128|gb|GG667607.1| GENE 95 111377 - 112291 967 304 aa, chain - ## HITS:1 COG:CAC2608 KEGG:ns NR:ns ## COG: CAC2608 COG2207 # Protein_GI_number: 15895866 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 9 296 18 279 284 111 27.0 2e-24 MKERVFFEEFESLYYFHGSTQNWAMQGFHFHNQYEIILFLNDGALLEIGDRVYHVMKGDL FFINNREYHRTSGAEGKEYNRYVLMFEPELLEPMTKAFGYDFTMFFENRGDDFVHRLYLE GENLKRVERLLAKIETNIGSGSSEEASVKIRLSILELITAINEMYKFFMKEKHTAADSEE 
QKWKEEDGKFRDPILYRERVEQIKKYITAHVEEKLDLDEIAGKFYMNRYYLSHYFKKETG FTVLQYVTNQKIIAAKALLKKGTSVTDVALKLSYNSDSHFISVFKKNTGITPKKYAQNKK NNNV >gi|229784128|gb|GG667607.1| GENE 96 112352 - 113431 1141 359 aa, chain - ## HITS:1 COG:STM1911 KEGG:ns NR:ns ## COG: STM1911 COG4225 # Protein_GI_number: 16765253 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Salmonella typhimurium LT2 # 40 358 63 379 379 273 41.0 3e-73 MVEKQMIEKTLNLVSDKMMVLKNNGMKEKYPVSLIDINCWEWPQGVGLFGLYRYYEMSRK QEVLTFLTEWYDRRMEEGIVEKNVNTTSPMLTLTYLYEITKKESYLSFIKEWADWIMEEK GLIRTGDGCFQHMITGSPNDSEILIDTLFMAVLFLARAGKLLNRQDYIEEATYQVLCHIK YLLNKNEGLFYHGFNFERNDNYGGILWGRGNCWYTIVAMELIQEIPMEDGLKRHFLTVYE NQVKALKRYADPETGLWHTVIDDPDTYLELSASAAFLRGIMQGVRLGILNGEEYEELIEK AMKGLLDNISEDGTVENVSYGTPIGLDKEFYNTIPCCPMTYGQALMILALSEAMTDYWH >gi|229784128|gb|GG667607.1| GENE 97 113424 - 115952 2389 842 aa, chain - ## HITS:1 COG:no KEGG:Spico_1278 NR:ns ## KEGG: Spico_1278 # Name: not_defined # Def: hypothetical protein # Organism: S.coccoides # Pathway: not_defined # 12 839 11 839 839 454 32.0 1e-126 MKYLEYPLFAEGYVNRFLTAGVFTELQQFDKATLYGRVNEWLKKGFAIHENPCRKEFVRR RQEKKPDYLELDGAADGKKVEVFGQTLPLVPYFPFGNTGVDASGFYYRPTWLRMYSYVLL EAEADGDVEFELETCGGAAVWVNGTFVADFTPFTRNMVKRTTIVLPLKAGRNRLLVCLDD LAERDTDYYFRMKRLGDAALTMLVPVRDEVDPSVVGQLENMLEQMYFDKEAYISEMVKLN ITNPFSDPLSVTVAVAPGEFIEKMEGQESLIRTFQYEMGPDQKALELFHSDELPPGYCYF TECVDHQGISLKRKIGNQLVRKEFLEYHEPDLEERKRHILRVIVEYAPENTYKAAALLTL DGDTEAAERILLEELPGVWARKDCSDFHFIIMLYIYRTFQERLSEKMREELEKTMCGFRY WIDEPGDDVMWFFSENHALLFHCCQYLAGSFLPDRIFTCSGKTGRVVSARGEELLREWFE GFFEEFITEWNSNAYIPVDVLGLGTLYNLTEPGSEFHQKAKRALDMIFYSLRVNAHKGAV MTSFGRSYEKEMKGNYNAGTTSLLYLAYGDGYLNRACNGCIPLALGDYAAPEEYRAYGAL KENQELYFMNTQGFEKHVNLYLYKNAYALLSTAAGFKPFQKGYQEHIVQAVIDELAQVFV NHPGESFPYGSGRPNFWAGNGILPLAVQYRNTAILRYRIGEEERIDYTHAYIPISAFTRY LGEDGVIALEKDGAYIAVKAMKGLSMQKEGPNRFREFISQGRDNVWIIRVGRMDEYGDLA 
ALLAAFKKIEIRTDGEETMVKDEKGTEFLTGPDFLLKVNGVPVYDYPLNVEGKLNLEEWD RG >gi|229784128|gb|GG667607.1| GENE 98 115967 - 116800 1055 277 aa, chain - ## HITS:1 COG:AGl624 KEGG:ns NR:ns ## COG: AGl624 COG0395 # Protein_GI_number: 15890429 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 58 276 131 349 349 223 48.0 3e-58 MQTKTKKICAAGLKWILLLLLTFLFLMPVIWVICSSFKSVGELFSWPPSLLGKNPSLDNY TKAMAEGHFGVYFFNTVFVSLVATFLTIVVNVMSGYAFAKYHFKGDKILFGIVLATLMVP LEVIMIPIFKVIVATHLYNNLWGLIIPAVASPTAVFLVRQYYVGIPDAYMEAARIDGASE LNILLKIMLPMAKPVISVLCIFSFMWRWNDYLWPKLVINGKERYTIQLALANYSGEYSVD WNSLLAMSVISMIPVIVVFVTLQKYIIGGMTAGGVKE >gi|229784128|gb|GG667607.1| GENE 99 116790 - 117665 973 291 aa, chain - ## HITS:1 COG:slr1202 KEGG:ns NR:ns ## COG: slr1202 COG1175 # Protein_GI_number: 16329975 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Synechocystis # 10 285 16 293 298 192 38.0 8e-49 MKSFEQKIFPYLLLAPTMIIFGLFLFFPALNGLWISLTKWDGVNPQVFVGVKNYVTLFGD KSFWNSFVRTLVFTLVSVPSVYVAALGLAVLLTGGIRGSNFFRAVFYWPTMISTIIVGLT WRFLLGEDFGLVNYLLTAMDKTPVKWLTDPNNAMGVVIFVTVWSMAGYYMVMFVSGIKAI SETYYEAARIDGAGAWQQFKFITLPLLKPTSLLVLVLSTVSIIKTYPLVYSLTQGGPAGA TKFMVQMIQETGFEKNKMGYASAMTMALFVILALFTVIQFKLNQGGEQDAD >gi|229784128|gb|GG667607.1| GENE 100 117971 - 119326 656 451 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 9 423 4 414 447 175 32.0 2e-43 MRACNKTIDMTTGSSVGCIMYFALPIILGNLFQQLYSLTDAVIVGRFVNLNALAAIGGTG WVRWAILGVCMDCTLGFGIVASRRIGAGDWNGFKEVLAAGVEFVLAAGFLLTLFTCLSLD FVLELLHIPENIYEDARLYLWYASVSIPIGLIYNMTCAFLRAVGNSRGPFAAVVLSTLIN VSLDLYFIVGLRWGVCGAARATLIAQTSAAVFVLFLAVRHELFQTTRKNWKYNPGILRET AGLWIPMFFNSIAISAGGVIVQRYINSSGAIVAAGIDAGEKIYCLLETVEKAVCSAISVF 
VGQNLGAMKIARIRKGMKSMTIFAFFFSVWLAFVLTVYGDSLIGLFLNKNQDPADLEGAY RAARVYLNVQCISVFFMVPMHFYRGAVQALGYAVYPMIAAFLQIAARWITVALFVPALGL VGLCLPDGVAALASLPVVVIPYFVFMRGCKK >gi|229784128|gb|GG667607.1| GENE 101 119394 - 120239 576 281 aa, chain - ## HITS:1 COG:CAC3499 KEGG:ns NR:ns ## COG: CAC3499 COG1082 # Protein_GI_number: 15896736 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Clostridium acetobutylicum # 6 266 2 255 271 110 27.0 4e-24 MLKREQIAGMNQLYRFYSFEYFLNCMEATGIRSLELWMGSPHFWIDTTGYFQCLEKKKQI CGHQLHVVSACVPSMAYQYQYGHGNKEYREKSFQFFSNGIKAAAAMGAKVVVVNSGSCEY GVPFEEAVKGPAELLNRLGRVAGQEGVTMAIESLTADETNIADNLKHTKELIRQADTPSL KAMADTVAIYFAGETLEEWFQAFGKDLIHMHFIDGFIKKRTYDHLAWGDGEFPLAEMVRC MEKYEYTGYLTQELEYEAYYGNPAEAERKNWKALSGTTLCN >gi|229784128|gb|GG667607.1| GENE 102 120241 - 121056 319 271 aa, chain - ## HITS:1 COG:ECs4223 KEGG:ns NR:ns ## COG: ECs4223 COG1082 # Protein_GI_number: 15833477 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Escherichia coli O157:H7 # 13 237 10 242 275 60 22.0 4e-09 MGKICLEGCNRHYTRVDFREFCRDQIRFGKKQVEIWGAVPQLFIDHHGYQDAEKMKAILD DYDLVCSTFTPKTYRYNLGSSSPDVRTWTLSYFKECIKAAAVLGADKMLIYLPAVTRDGD RKGAEEIFAENVQSLAEYAADYKISLVVGGLNAFAEDLPEFLKVMKKINHPSVNCFFETE NLLFSKYELEEWLAELKTALVHVHFADATEIGSCRIGTGCLPMRQCLRDIRKSGYQGGLS PFFTEFSCEQNPGFVSSEHERALKGLLEEEL >gi|229784128|gb|GG667607.1| GENE 103 121142 - 121819 507 225 aa, chain - ## HITS:1 COG:CAC3326 KEGG:ns NR:ns ## COG: CAC3326 COG0765 # Protein_GI_number: 15896569 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Clostridium acetobutylicum # 3 218 9 224 227 218 57.0 6e-57 MERIIQLLISSFWPLMKAGLKMTVPLTLISFALGMILAFITAMCRISKNKLLSKTAQFYV WVIRGTPMIVQLFVIFYGLPKVGIMFSPIVSAIIGLTISEGAYDSEIIRAALTSIPKGQW EACHALGMSKFQTLSNVIIPQAALIAVPSLGNMFISLVKDTSLTAILTVREVFQIGQQIV AVTFEPLWIYLEVGLIYLIYSTVLSQLQGVLERKLGKHMLVKDEK 
>gi|229784128|gb|GG667607.1| GENE 104 121831 - 122676 721 281 aa, chain - ## HITS:1 COG:SMc03891 KEGG:ns NR:ns ## COG: SMc03891 COG0834 # Protein_GI_number: 15967027 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Sinorhizobium meliloti # 32 273 21 256 257 156 39.0 4e-38 MKRTVAVLLAVISALSLAACGSKTQVNNTGSGDTDNSLQTVLERGTLRAGAEGNWNPFVY NDLATGNLVGYEVEIVEEIAKRMGVKVEWSIANQWDGVIAGLQANRYDVVFCGCTLANLE AAPDCVGTIPYREDPIVLVTAEDNNEIKSWEDLKGKLSANALTGDYGAIAKQYGAELTNA SLEQAMELIVQHRVDCSVNSQIAINTYMAEKPDTPVKVAAKYEYPTPEEAYSYGMILKSK VTLTEKINEILQEMLDDGYCRDLAVKYFGQDVADNISLYQK >gi|229784128|gb|GG667607.1| GENE 105 122698 - 123429 579 243 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 242 1 242 245 227 46 2e-58 MAALLEVESLTKSFGENQVLRGVNTKVEKGEVVCVIGPSGSGKSTFLRCINKLEVATGGH IYFEGVDMTGKGPLVDQKIQKLGMVFQQFNLFPHRTVLENIMMAPIKVQKKQRDEAERRA KDLLKIVGLEDKADAYPGRLSGGQQQRIAIARALAMKPDLMLFDEPTSALDPEMVGEVLD VMKALAQDGTTMIVVTHEMGFAREVADRVVFIDEGVITEEGAPKEFFANPKNERLQRFLK KVL >gi|229784128|gb|GG667607.1| GENE 106 123447 - 124226 523 259 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288869860|ref|ZP_06112066.2| ## NR: gi|288869860|ref|ZP_06112066.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 259 15 273 273 546 100.0 1e-154 MAKVLISRGKDYNPTDYVNGLAREEVLPGVCDEITMHRCTLKAGTALKPAVYAFEEKMQI FLFVMGNGYITTDTEAYRIDDRAVFVPNFNKDEILIKAGTEDLSYIHIVGEMTRWDQERM KIDHIVLPRFRLLKDSWEYTEGFTGDAGSNIKSHMVLEHEYLGRYSMGWNCGKGPTFIGT HVHEDLLQWYLNMPGSSFTYHAGDETVSVESSDITFTEIGSPHGSEASEGQCIDYVWFEL AINGYIHDDDILKEHIANE >gi|229784128|gb|GG667607.1| GENE 107 124255 - 124995 429 246 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619132|ref|ZP_06112067.1| ## NR: gi|266619132|ref|ZP_06112067.1| hypothetical protein CLOSTHATH_00127 [Clostridium 
hathewayi DSM 13479] hypothetical protein CLOSTHATH_00127 [Clostridium hathewayi DSM 13479] # 1 246 1 246 246 503 100.0 1e-141 MAINITRKEDIMLKFCDGYAKTEVLKDAYPLVRTYKCLLKAGYTVEPEIYSDKIQVFSFT SGTGYICGKKKAYNIDELSFYIPDFDREKFSFYAAEDMEITLWVVDMLESDKAAYEDTHM VLPSFKRFSECEPYDQSCKGPHTKSWTVIADGNLARVLMGVVKAEGEGTKERGHAEVAQW NYCLPGSDFHFTVENETVNHYEGEFSYVTAGLDHSLIADPGKTVFYIWFEHWVEEKKINP QSYESN >gi|229784128|gb|GG667607.1| GENE 108 125403 - 126398 555 331 aa, chain + ## HITS:1 COG:BH1250 KEGG:ns NR:ns ## COG: BH1250 COG1609 # Protein_GI_number: 15613813 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 3 329 4 326 338 132 29.0 1e-30 MNIKDIAKLSGVSVATVSRVINNQPNVRPEKRELVLSVMADHGYTPNIFARGLNSKSTKS IGILCSELDDINHAKILSLLEKQLREHGFSSFLCCSGEYRPYQRKHFEFLMAKQVDAIIE IGSSYGDANDLEVYREISEEVPLIILNGFVDVPTVYSVRCDERSAICNLVEELYKRNCRH IYYLEDNTTFCAMEKRNGFLEGINKFNLKATSKCMTIPIYSNEILDAQQKIERLIKNGRR IDAVLSADDLLAVGALQALNAAGLQIPVVGFNNSDFAFACTPRLTTVDNQREFVCTNAIN TLLSLLDGKDAAHHIIVSPKLIERDTFRLQS >gi|229784128|gb|GG667607.1| GENE 109 126515 - 127336 702 273 aa, chain - ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 16 125 5 116 117 69 32.0 8e-12 MEPDRGKKRKAKEKYYTIGEIAKLYHLGVDTIRYYEEKGILKPNRGENGYRYYDSQSIWR MNVITNLRNLNFSVSRIRDYFQNHTVETTQMLLCEELEVIAEKINEWTSLQKTVQEQVET IRQAQTLEIGTVRSLELGPRKAFEIRKDYSTDEDMDLLMMKLLEQNHRKIYMIGNNRMAS VMSEGNRDALFESAVMFDDDGDLVIPGGTYLSVCYRGVTDSSNQADIIRDYAAEHGIVLK PPFMDLMWIDVHISADPEEYISEVQVRAEQQEN >gi|229784128|gb|GG667607.1| GENE 110 127560 - 128915 960 451 aa, chain + ## HITS:1 COG:FN1469 KEGG:ns NR:ns ## COG: FN1469 COG0534 # Protein_GI_number: 19704801 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 15 438 13 436 440 231 32.0 2e-60 MNLEGRSLKSNFIRYIVPSILAQWVYSLYSMVDGMFVAKGVNEIALTAVNLSYPFMAFLF 
ALSLLFAVGASTIVSILLGENRHQRACEVFTQNIVLQVIFSLVIAGFVMMNRESFARFLG APNQQTADYVIQYITWIAPFSCVYLLSYSFEILLKTDGYPKKATQIVIFGAVENCILDWL FVMVLHKGIQGAALATSLSQASVIVLYLHHFLSGKGVLSFKRFKPDFGILVRQVRNGFSS GVTEMSSGMVTFIFNQVILMYLTQDALVSYTIISYVNNLVVLSATGIAQGAQPLISYYYG QKRLDQCRSLFRYSLIAAGAMCTVSFTACFILARGIVNIYIGPELQALRDSSVTAFRIFT TSFLLVGFNIAVSGYFTSVERAAEALIISAGRGLILMAGCLIVMTRLFGGAGIWWSPLVS EAVSLAVTGLLLMRYCRKDAFWNQKVRFEVV >gi|229784128|gb|GG667607.1| GENE 111 128987 - 130639 1928 550 aa, chain - ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 1 547 1 558 561 446 43.0 1e-125 MAAEWWKKAVVYQIYPRSFKDSNGDGIGDLNGITEKLDYLKKLGVDVIWISPMYCSPMDD NGYDISDYYQVDPMFGTNDDMDRLLFEAGKRGMKVILDLVVNHCSDEHEWFVKAKQDPDC EEAGYFYFKSPEDGKEPNNWRSNFGGSVWTELPDGRWYFHTFSKKQPDLNWENPKLREKI YEMINWWLEKGVAGFRVDAITFIKKDLTFASRETEDGLRYPIENLTDYPGIGDFLAEMKA KCFDRYGCMTVAEAPGVSGDTFRRYAGENGYFSMIFDFTWENMEGETDKSSVEAVERWKQ RILESQRFTSEAGWSGLFLENHDQSRCVNKYLEKDQIGFAGASAMAVMYFYLCGTPFIYQ GQEIGMTNAEWNSIEEMDDVRAKGMYREALEQNGDPQKVLEYFGELGRDNARTPMQWCDG KNAGFTEGVPWMKVNENYREINVKAQEERGDSLLAFYKKLTALRHQEPYASVFAEGTIRP VLKELPAVIAYEREWDGRIVTVAVNFKKTEQEIPVVGGKCLLSNEGELQQMGERYLLKAY QAVVFDGMIK >gi|229784128|gb|GG667607.1| GENE 112 130666 - 131517 925 283 aa, chain - ## HITS:1 COG:Cgl2406 KEGG:ns NR:ns ## COG: Cgl2406 COG0395 # Protein_GI_number: 19553656 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Corynebacterium glutamicum # 6 283 23 303 304 185 37.0 1e-46 MAETKNSKRDSKGIQRIVMGIAGLFLGCLFLFPIYILVLNSFKNTKGIFTDVIGFPNAAT FTLANYPNAFEALEYIRSFMNSLMITVIATVLILLISAMAAWVLVRYKTKTSKIIFFLFA ASMLIPFQCVMLPLVGFASRIGIMNPQGLIFMYMGFGSSMSIVMFHSFIKNIPEELEEAA TIDGCGSFRLFFSIVIPLMRTILITVAVLNVMWIWNDYLLPSLIINKPGWQTLPLKTYLF FGQFAKRWDLASAGLIMCIIPIIIFYLCCQKYIVKGITDGAIK >gi|229784128|gb|GG667607.1| GENE 113 131517 - 132386 1013 289 aa, chain - ## HITS:1 
COG:Cgl2407 KEGG:ns NR:ns ## COG: Cgl2407 COG1175 # Protein_GI_number: 19553657 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Corynebacterium glutamicum # 5 288 6 280 281 210 45.0 3e-54 MKKSKARFWIFLAPALISFSLVVLIPMLIGFFYAFTDWNGIVGSEITFVGFRNFIEIFTR DSSFMHAFGFTALFSICAVILVNIVGFALALLVTQKFRGATLLRGIFFMPNLIGGLLLGF TWQFIFVSIFGAISKATRWQALNGWLADPVTGFAGLLILTVWQLSGYMMIVYIAQIQQIP ESVKEAARIDGAGFKDMLKSIIMPLMMPAFTIGLFLSISNSFKMFDQNLALTQGGPYKST EMLALNIYNSAFGANEFGFAQAKAIIFMIVVAAIGVTQLVLTKRKEVEM >gi|229784128|gb|GG667607.1| GENE 114 132472 - 133788 1758 438 aa, chain - ## HITS:1 COG:Cgl2408 KEGG:ns NR:ns ## COG: Cgl2408 COG1653 # Protein_GI_number: 19553658 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Corynebacterium glutamicum # 5 436 6 440 443 131 24.0 2e-30 MKAKRTAALGLAAVMAASLLSGCGAGGSKSAKTEITIYQSKIEANEGYKKAIAAYEESHP DVKINLEAVTGNDFAASLKAKMQSDPPTIFSVGGFQDLKDYGDILEDISDLEVLKNALPG TTDMFTKDGRVLAVPLYMEGYGFVVNRQMFEDAGVDFESMMTFEGMKAGFDTLKAKIDSG EMKEKYPNLEAVMEYPTKELWIAGDHDVNVALTHDFDTAKAAYDSEVLPGTGFADYKTMV DFQASYTTNADNTANLNSVDYTASLEGGLAIERVASIKQGNWVAPAVETTDPEVLAKLDM LPYSVPGYSDGKYFVGVSGYWAINSKVTDEQKAAAKDFINWLYSDPAGQKIVVEDCKFVP PYDNFGDLKAGDPLSQRIMDANEAGDTMNGWVYSGAPNTWGQQAAGVEVQKYLAGQATWD EVTESCIKQWASMRESQK >gi|229784128|gb|GG667607.1| GENE 115 134093 - 134185 59 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIFTVDWIKGNLIGTASDREPKFFGGRRGF >gi|229784128|gb|GG667607.1| GENE 116 134201 - 135178 948 325 aa, chain - ## HITS:1 COG:BH3727 KEGG:ns NR:ns ## COG: BH3727 COG1609 # Protein_GI_number: 15616289 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 316 1 313 331 196 33.0 4e-50 MASIREVARLARVAPSTVSRALNGSGYVAEETKEKIREAVDELDYVPNQWIRNLYQQKTG IIGVMAPEIVHPYFSSLWSFLELELNKYGYNMMLCNTSGKKDIERQYLDTLERNLFDGMI VGAAFLPDKYYEQIQKPILSLDRIIPGIPLVTSDHTQGGCIAADKMLEKGCKKVLNLIDP 
QAKTVASSRGGLSFIRKMQEAGAEVITEEFLWDDVIHYPRSIQRTREILAEYPDLDGIMA NDLCASSFLKTARELNRKVPQQLKIIAYDGTYITDFNYRTIASIRQDVALIAREAVQVLV KLINGQPLGNEAVYVPVSYKDGDTI >gi|229784128|gb|GG667607.1| GENE 117 135371 - 135901 411 176 aa, chain - ## HITS:1 COG:no KEGG:Rumal_0724 NR:ns ## KEGG: Rumal_0724 # Name: not_defined # Def: metal-dependent phosphohydrolase HD sub domain # Organism: R.albus # Pathway: not_defined # 21 157 3 143 160 115 43.0 6e-25 MDNSLLAVRDINHKYIVVGGKNVNRIEKVREAVDAVLLNVADDAERRCGYAHLYGVSQAC VLIALKRGENAELAATAGMLHDIHTYAAMDLRDHAHKGAAMARDLLEAMKCFEAGEIDVI CSAVYHHSSKEITHAPFDEVLKDADVMQHCLYNPLFDVMKHETVRFGKLKAEFGLA >gi|229784128|gb|GG667607.1| GENE 118 136126 - 137337 816 403 aa, chain + ## HITS:1 COG:FN1676 KEGG:ns NR:ns ## COG: FN1676 COG3547 # Protein_GI_number: 19704997 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 6 396 4 381 391 102 24.0 1e-21 MNRNAVGIDVSKGKSMAAILRPYGEIVSSPFEIKHTSGNIQSLIQQIKSIEGESRIVMEH TGHYYEPLARELSLAGLFVTAINPKLIKDFGDNSLRKVKSDKADAVKIARYTLDSWTELK QYSLMDEIRNQLKTMNRQFGFYMKHKTAMKNNLIGILDQTYPGVNTYFDSPAREDGSQKW VDFAAAYWHVDCIRKMSLNAFVDHYQKWCKRKKYNFSRDKAEEIYGAARELVPVLPKDDL TKLIVKQAIEQLNTASKTVEELRTLMNETAAKLPEYPVVMGMKGVGPSLGPQLMAEIGDV TRFSHKGAITAFAGVDPGVNESGTYEQKSVPTSKRGSSSLRKTLFQVMDCLIKTKPQDDP VYAFIDKKRAQGKPYYVYMTAGANKFLRIYYGRVKEYLMSLPE >gi|229784128|gb|GG667607.1| GENE 119 137535 - 137684 157 49 aa, chain - ## HITS:1 COG:no KEGG:Bcav_3943 NR:ns ## KEGG: Bcav_3943 # Name: not_defined # Def: binding-protein-dependent transport systems inner membrane component # Organism: B.cavernae # Pathway: not_defined # 1 49 262 310 310 62 59.0 6e-09 MGLYALRGENVVDWAGIAAGASIAIVPVICVFIALQRYFVDGIAGAVKS >gi|229784128|gb|GG667607.1| GENE 120 138706 - 139086 289 126 aa, chain + ## HITS:1 COG:MA3541 KEGG:ns NR:ns ## COG: MA3541 COG0454 # Protein_GI_number: 20092348 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related 
acetyltransferases # Organism: Methanosarcina acetivorans str.C2A # 16 120 213 317 321 89 41.0 2e-18 MYQSPEEIDQSLIRAESGDSRIFAVLSGGSAIACLELSASAETFVSETGGIRNICGAYCL PQHRGKGVYQGLLNHVIETLKAEGWTRLGVDYESINPTAAHFWPKYFEPYTSGVVRRIDE CAVRNN >gi|229784128|gb|GG667607.1| GENE 121 139145 - 140836 1981 563 aa, chain - ## HITS:1 COG:RSp0521 KEGG:ns NR:ns ## COG: RSp0521 COG0443 # Protein_GI_number: 17548742 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Ralstonia solanacearum # 1 561 23 587 593 503 47.0 1e-142 MIIGIDLGTTNSLAAYYTEEGPKIIPNRLGRRLTPSVVSIEGEGQIHVGDSAVERGLLHP ESTASVFKRDMGSGKKFQLGHKSFLAEELSSFVLRALKEDAEYYLGEPVTEAVISVPAYF NDARRRATKRAGELAGLKVERIISEPTAAAIAYGLYQNREHARFLVFDLGGGTFDVSILE LYDTILEVRAVAGDNYLGGEDFTAVLEDMFFEKHRELDRLSLDEKTVRHIHKQAELCKMG FTESRSSGMRCRIGEEIVEFNVELSRYEEACEDLLERIRKPVKRSLSDAHIRLSDIDKVV LVGGATKSPVIRRFVSRLFKMLPDTNVNPDEAVALGAAVQGAMKERKESIREVILTDVCP FTLGTEVVREWEKGVFENGVFCPIIDRNTVIPASRTERLYTASDNQTKIRVNVLQGESRF AANNLSLGELMIDVPAGRAGEEAVDVTYTYDINSILEVEVKVVSSQKQVKEVFKGSNVDM TDEEIRERFETLSYLKIHPRDREENKYLLLRGERIYEESIGEKRLHVEAALHKYEKALST YDTGLIEEAKEEFKKFLEEIEEL >gi|229784128|gb|GG667607.1| GENE 122 140833 - 142422 2029 529 aa, chain - ## HITS:1 COG:FN1787 KEGG:ns NR:ns ## COG: FN1787 COG0457 # Protein_GI_number: 19705092 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 42 317 65 332 628 98 27.0 3e-20 MKVLRNLARTKEDRELPLKLCGELMKEIEGETDIEDVSEVEFETALLHWDNEEFKEALSH LAEAIRNNKDRMQYRLVRGNIYLDMKKYNEALTEYASAAEVYHDSPSLFFNRGLCYEGKK MKVQAAENFEKALELQEGYRDACEKLADYYREKYENQYRRADFDTAIAYISRQIAVTENC YYLVHRGLIYMNAMELDEAIRDFEKALTYVPEDWAAYNNLGCCYKYLGRFEEAVKYFEKA VEYMEDSKSLLPYSNMADCYEALGDYRKAIECYEKDLKLFPEYMSFWKEIGQLYAYLGEY EKAEEAYGHTTKMDDYYSRMGDLWFYQGNEKKALRFYKTGIENAAADKKAERQSDLGEAY MDQMQNYPKAVVWLKRAIGRTTDHGDLFDYERYLARAYYRMGKYGPAREHAKAALEHFKL SEEGTEEDYLAYGPYKPAREGVMGWLYMALGEREKGLELFGQMDQGHRCKGCRYRKCYEK FLFQGWYYESMGDQENALVMYREADRLNPHSIAVSCALDNLQKRVDRKK 
>gi|229784128|gb|GG667607.1| GENE 123 143442 - 145037 1900 531 aa, chain - ## HITS:1 COG:no KEGG:Clocel_3378 NR:ns ## KEGG: Clocel_3378 # Name: not_defined # Def: heat shock protein DnaJ domain-containing protein # Organism: C.cellulovorans # Pathway: not_defined # 1 530 1 519 1112 259 32.0 3e-67 MEKEMSFHILGIEETKDERAVKMAYMRLLKETNPEDDPEGFKRLRTAYETAAAFAAAPEE EGESGAGDVSEEKTELDEWIDRVDACYRDVYGRTDERKWRQFLEDPVCDGLDTSLEARER LLVYLMNHVYLPQKIWQLLDKSFQIVEDIGILEEKFPKNFLNYVEYYVEHEGFIRFEYFR RRGAGEENVDGYLDGYFSVKKKVDRNEFEGCRQELDDLAAFGVWHPYEEVERLRLLLAEH DVDRASRLADALEETCGDSMRDDPYIWSYIGRVRYMEEKKDEAYTIWIRVLKSYPDYYQA KSGAVSCLMEKGRFWQAREWMMEILELDGRDEDILGRLKAANEALIAKLRENPEYHEEGK DVPAEEGAIELAWCLFQNERLDETIRYLESYTPEKEQEYSYTNLFGRVLFQAERYEEARP YLEKWLRMITETVDDGTQENRRRRSRKGRALYILGGCCFQQGESEAAAGYVERAAKEADG QQERMGYLQYLAHILCESKQYERAVDVCDQMTAEDENYYPAYLLRQEACYE >gi|229784128|gb|GG667607.1| GENE 124 145051 - 145890 839 279 aa, chain - ## HITS:1 COG:CAC2707 KEGG:ns NR:ns ## COG: CAC2707 COG0122 # Protein_GI_number: 15895964 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase # Organism: Clostridium acetobutylicum # 2 274 22 284 292 127 29.0 2e-29 MDLGQLAKSGQCFRMRPAAETECGGIWSAAAGEQYVEIMQDKSRFLFSCGEAEFEGIWRS YFDMDTDYEAVKRSVDPEDEYLQAAMAFGGGVRILRQDLWEMIVTFLISQNNNISRIRNS VDALCEKFGTRKTGTGLVLDPNEGVKSVERTYNAFPEAGAVAAGGPEGLGGLGLGYRDKY IWAMALKCSGPDGAAWLDDLRAADYHTAHGMLTAEFGIGRKVADCVCLFGLHHVEAFPVD THVKQIVNAYYPGGFPLERYRGYAGILQQYMFYYKLNDK >gi|229784128|gb|GG667607.1| GENE 125 145949 - 146563 634 204 aa, chain - ## HITS:1 COG:CAC3581 KEGG:ns NR:ns ## COG: CAC3581 COG1011 # Protein_GI_number: 15896815 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Clostridium acetobutylicum # 1 191 1 189 201 112 37.0 4e-25 MIKNIVFDMGKVLVGYDADKVCSHFMDNEEEKKAVCTSVFVSPEWLMLDMGVISEEEALQ RMQARLETDHAKEMAALCLAHWHEYNMWPLESMGELVRELKEMGYGIYLCSNASVRLLEC 
YRIIPGIECFDGILFSAEVKCIKPQKEMYQHLFDRFQLKPEECFFIDDVMVNIEGARACG MDGYCFADGDEAALRAALKRLNRN >gi|229784128|gb|GG667607.1| GENE 126 147007 - 148380 1381 457 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 91 452 14 364 372 327 49.0 3e-89 MSKHLFTDIRVPIEPDNPSVMRHEELCIKCGQCKTVCTESIGVAGYYDLLKTGDTAICIH CGQCANVCPVGCITEVPEWERVKAAVNDPGKIVIFSTSPSVRVALGEEFGMEEGSFVEGR MVALLRALGGDYILDTNFAADLTIMEEANELIRRLTSSNIPIPQLTSCCPAWVKFVETYY PEFKENLSTSKSPIGMQGPTIKTYFANQENLDPKKIVNVAVTPCTAKKFEIRRKEMCSAG SMLGIEGMRDMDYVITTRELAQWARAEGIDFPALEPSGYDPFMGTGSGAGVIFGNTGGVM EAAVRTAYAVLTGEPVPDDLYDLKEVRGLSGTKEASLTIAGTRLDIAVVYGTANARCLLD SIKRGEKFYQFIEVMTCPGGCIGGGGQPKDKNFSGDGLRQKRIDGLYARDRQMKLRLSHE NPQIKEIYKKFYEKPLSPLAEEMLHTSYIDRSGDLGN >gi|229784128|gb|GG667607.1| GENE 127 148422 - 149759 1358 445 aa, chain - ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 438 4 441 452 464 52.0 1e-130 MMKDMTTGTIPSQLIRFSIPLVLGNLFQLTYNAVDSIIVGRWAGENALAAVGTANPVMSI VILGITGICIGASVLMSEFFGAGRGEQLKRQVSTTLIFGCFFSAGVVLLGLACSGAVMKA LRVPEEIFGEAVLYLRIVFLGMPFTYLYNAVASALRSVGDSRTPVRFLAMASVLNGCLDL VFVAGLGLGVVGAALATDIAEAVSALLCVAYVYRKVPLLQLKREDLRVDGELLKLTLKHG SITALQQSCQPIGKLLIQGAINPLGVSTIAAFNAVNRVDDFAFTPEQSISSGMMTFIAQN RGAGKKERVKKGFRYGLLVEAGYWVIICTVILLMKEPLMRLFVSDGDTKMVAPGVEYLSL MAFFYLMPAFTNGIQGFFRGMGNMSITLVSTLIQISFRVVFVYLLVPRMGMTGVACSCLI GWSFMLMAEVPYYFWFKRRHELLRD >gi|229784128|gb|GG667607.1| GENE 128 149756 - 150772 906 338 aa, chain - ## HITS:1 COG:no KEGG:CAR_c05940 NR:ns ## KEGG: CAR_c05940 # Name: ywbF # Def: putative sugar permease # Organism: Carnobacterium_17-4 # Pathway: not_defined # 4 323 80 387 394 194 36.0 7e-48 MTGITCLGIVPAAVLMLAPGSRFLSAGLLMVIMVVVSVQQPLVNAVNGYYISRGKSMNFG 
VARATGSLGFAVLSWIMGYLVAAFGERVIPIAICFLLVTMVAVISSFRMERDKSCNGETI EAAEPFSGSGKRTGTSPRAPGSGWALIKKYRRFFLVLGGIVFLFAFHNMVNTYLIQIMER FGGNSSDMGTSIAIAAVCEIPVMVLFSKLAEKISPNRLIKIAGLGFLLKAGAIWLAGSVL MVHASQLLQALSFAILIPASVYYSDEVMEAQDRIKGQAFITASITAGGMVGNFFGGKIID AAGVPAMLTAGVVCAGAGMVLVWIGAVPAKGRGEDGTL >gi|229784128|gb|GG667607.1| GENE 129 151772 - 151909 151 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619155|ref|ZP_06112090.1| ## NR: gi|266619155|ref|ZP_06112090.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 45 1 45 45 77 100.0 4e-13 MKKTRGDRGLTIKYAALQGNYWMLFGAVFTFIAVFLLSQGFKAGD >gi|229784128|gb|GG667607.1| GENE 130 151929 - 152432 717 167 aa, chain - ## HITS:1 COG:AGc3965 KEGG:ns NR:ns ## COG: AGc3965 COG0782 # Protein_GI_number: 15889462 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 138 7 141 158 89 40.0 3e-18 MYDKLTKNDIQKMQEEIDYRKLVVRREALEAVKEARAHGDLSENFEYHAAKKDKNQNESR IRYLERMIKTAQIVSDESREDEIGLYNTVDLYFEDDDEVETYKLVTTVRGNSMKGLISSE SPLGSAIMGHRVGDRVYVRINEKAGYYVVVKRLENTTDDGSDKLRSY >gi|229784128|gb|GG667607.1| GENE 131 152647 - 153642 832 331 aa, chain + ## HITS:1 COG:CAC0818 KEGG:ns NR:ns ## COG: CAC0818 COG2199 # Protein_GI_number: 15894105 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Clostridium acetobutylicum # 8 165 405 562 571 84 32.0 3e-16 MKQDWAALKEENEILSKLASHDWLTSIYNRGATETKINQLLAEKKTGVLFVLDVDQFKQI NDRYGHITGDSVLQEVVRILSLMTFKHDILGRVGGDEFVIYMPLAQNQNFVDERCRQIRS RLLGIQMTNPLINGISVTVCGSLYQPGDDYKSLFDRADQLLLSEKRKKNKKPFLTAAEER RSAARKGIEIDMALIRTELSEQELTTGAYCQDYDTFKSIYRFVERRLRRSGESAYIILFT LTDKNGDFPKLLTRENQMDTLKSVTQYSLRLGDVFTQYSSCQYLVMVSDVDGQNAELIAR RISEAFYAETADIEDKLLLHHCYPLKPAGTS >gi|229784128|gb|GG667607.1| GENE 132 153739 - 155016 1591 425 aa, chain - ## HITS:1 COG:AF1263 KEGG:ns NR:ns ## COG: AF1263 COG1145 # Protein_GI_number: 11498862 # Func_class: C Energy 
production and conversion # Function: Ferredoxin # Organism: Archaeoglobus fulgidus # 1 351 1 343 369 147 29.0 5e-35 MGHLTTRDAYRNLSDRINWFTQGAPTSETLYKILEVLYTEKEAKWVALLPVRPFTVKKAA KIWQTSEFKAERLLDHLCEKALLVDSYHNGVRKFVMPPPMAGFIEFALMRTRGDIDQHYL GELYYQYMNVEEDFIKDLFFATETKLGRVYVQEPVLANDKTNHILDYERATHIVEEAEYI GLGLCYCRHKMYHAGHPCEINAPWDVCLTFDNVARSLAQHGDYARLISREEALDALERSY ASNLVQIGENVREHPAFICNCCGCCCEALQAARHFSPMQPVATTNYIPEISLERCVGCGK CAKVCPVLAVSMEEGENGKKKAVVNKEICLGCGVCARNCAVKAIELQRRPEQIITPVNST HRFVLQAIEKGTLQNLIFDNQAFANHRAMAAVFASILELPPVKQALASKQLKSVYLDKLL SLHKK >gi|229784128|gb|GG667607.1| GENE 133 155228 - 155668 318 146 aa, chain - ## HITS:1 COG:lin2806 KEGG:ns NR:ns ## COG: lin2806 COG0232 # Protein_GI_number: 16801867 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Listeria innocua # 1 133 1 140 465 134 47.0 7e-32 MEWEKLLSKERFYQSGDGEEPGRSDYDTIIRSTLFRRLQDKAQVLPLESDDYVRTRLTHS LEVSAIGKRLGELVYRRLKEAGKDAWFEHNPETEFSDTLLCAGLVHDIGNPPFGHFGEYA VREWFQRYLGGMTDNYARRLYRELFS >gi|229784128|gb|GG667607.1| GENE 134 155757 - 156530 840 257 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619160|ref|ZP_06112095.1| ## NR: gi|266619160|ref|ZP_06112095.1| hypothetical protein CLOSTHATH_00156 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00156 [Clostridium hathewayi DSM 13479] # 1 257 1 257 257 483 100.0 1e-135 MKGLKKIVFMAAAICVCLAACSGKKSTGYAETPEAAVQSAMEALQTLDLETFNALTDNYV STERNWLGIPVRRRYRVFSELQQPGILEGAKEKANRAFAEKIVKNLTWEITGAETEGERS VISVRITNTDMSDVMGRYTVHVLEKMTDGKGTGLKELFEEIAEADYDRGGVLPYLDEAEG SVTTDVAVTVCRENGRWIMKITDPLIEAFMGNFGAGEFSGEVNARIEELEEEYEKKMVQW GEDFGSRIGQWLEGIFN >gi|229784128|gb|GG667607.1| GENE 135 156726 - 157940 1157 404 aa, chain - ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 187 398 215 430 452 93 28.0 6e-19 
MGIVMRNILLCLMLGIGCGIYYETIVPMRVWRYRWVKDTAVPAFTLGFLAIALTEIPPYI LQPVRVIVVLWLVSWIYFQMKAVQSLILSVLFCALMWIVMAIVVSAVWALPPEYRGLEAL EEELSYSILLCLILLFHYRYRKGLRGLTGGQWARYGYFPVFSMIVIMAISMMTGSGETER LVAVSGFGILNVIALYFIGSILEKDARVQEMRLLQERTQNQMEMYHNMQRNYERQRRELH DYKNQLNCIQGLLLEGRLEETLHYVETLNGNLRESADVVNTNHAVVNVVLNQKYQSALEK NITMILSVNDLSGLTMGEEDLVTLLVNLLDNAVEACEKLEERRIIRFKMVLEDGELILSV SNPVAEPVIIDGKKIATTKKDGRNHGIGLLNVNGVIERLGEPLH >gi|229784128|gb|GG667607.1| GENE 136 157928 - 158653 589 241 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 233 1 229 234 107 27.0 3e-23 MLQIAICDDENYYRETIRTLLTEYLERRGLEYALSLFLSGEEFVSQPENAVKYDVVFMDI SMSHVNGIETAAWMRSLCSETAIVFVTAFLHYAPEGYKVDAVRFIMKDTLEAALPECMDA VLKRKRLEQVEFSCVEGTVRLFTERILYVESRKHKAVFCCQGTEPETCQAYLKLDEVERR LKEYGFLRIHKSYLVNMRHVRRIRNYEAELDNGECLPIPRPRFQAVKEEYAVYKGAMPWG L >gi|229784128|gb|GG667607.1| GENE 137 158893 - 160101 538 402 aa, chain - ## HITS:1 COG:SPy2122 KEGG:ns NR:ns ## COG: SPy2122 COG0582 # Protein_GI_number: 15675872 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pyogenes M1 GAS # 16 388 9 369 381 61 25.0 3e-09 MAKKNRSVPQYGTVMRKGILYYRTRIQDADGKQVSLYATTCEELYEKQQAAKKQVEEIVF RRKHPTVAEYCEKWLLMQSAKVSSATMKAYTSDMRNYIIKPLGDMYMEEVTADDIRLALV PLTNKSEGLYNTVNMLIKCIFYSAERSELITYNPCVGISAKGGKPTKKKDALTDQQVEVL LDTVKGLPPYLFIMICLYSGLRREEALGLHWDCVFLDAPTPYISVRRAWRSEHNRPVVST VLKTPAAKRDIPIPKCLVDCLREAKENSISDYVIADSKGEPLAYSQFQRVWQYVVVRSTK PRNYYKYVNGESIKYTVTPTLGMTQKNQPKIKYTIDFDVTPHQLRHTYITNLLYAGVNPK TVQYLAGHENSKTTMDIYAKVKYNKPEELFEVVNDALHQRGF >gi|229784128|gb|GG667607.1| GENE 138 160264 - 161478 518 404 aa, chain - ## HITS:1 COG:no KEGG:Closa_3673 NR:ns ## KEGG: Closa_3673 # Name: not_defined # Def: integrase family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 403 1 403 404 624 
77.0 1e-177 MANRKRAIPQYGTVTLNGLDYYRTRVQDADGKLRALYARTPEELYDKELNALEQIDNDTF CRKSPTVAEYCEKWLLMQSVHVRATTMTDYTSKVRRHIIKELGDMGMADVTLDDIQLALV PVSKKSASVYKSVVVLYKSIFRAAKESRIIDKNPTVYLTSKGGGVPQKEKEALTDEQAAR LLDAIQGLPPYVFVMLGLYAGLRREEILALQWDSVYLDTDTPYLTVRRAWHTESNRPVIL DELKTKAAERNIPLPICLADCLREAKANSASEYVVPNRDGDPLSYTQFKRLWQYIVTRTT KERVYYRYEDGKRVKHTVTPVLGEKAAHNGKVVYSLDFEVTPHQLRHTYITNLIHSSVDP KTVQYLAGHESSKITMDIYAKVKYNRPDELVKSMGGAFAQWDGV >gi|229784128|gb|GG667607.1| GENE 139 161563 - 161793 163 76 aa, chain - ## HITS:1 COG:no KEGG:Closa_3674 NR:ns ## KEGG: Closa_3674 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 76 2 77 82 102 67.0 3e-21 MEKLITRKEAAEILGISIATLDAARNNGLISYVQYVQNGCVYFTAAGLQEYIAKCTHRAK PVERSTTYRKPRSGRS >gi|229784128|gb|GG667607.1| GENE 140 162327 - 162785 268 152 aa, chain - ## HITS:1 COG:no KEGG:Closa_2766 NR:ns ## KEGG: Closa_2766 # Name: not_defined # Def: RNA polymerase, sigma 28 subunit, FliA/WhiG subfamily # Organism: C.saccharolyticum # Pathway: not_defined # 23 149 18 148 151 80 40.0 2e-14 MSKNREERANYYLYIDGQAVPVSEQIYRVYQHYERKEEYFSYDLKTEKFQKDTASFLPSR EDSYERLLEKDRQFAAPGKNVEQLALEHLEAEQIRFCLAQLNEDEQELIFLLFYQEKTEQ EVGNILHISHQAVNKRKRVLLLKLRKIFEEFF >gi|229784128|gb|GG667607.1| GENE 141 163203 - 163544 104 113 aa, chain + ## HITS:1 COG:no KEGG:Dtox_1519 NR:ns ## KEGG: Dtox_1519 # Name: not_defined # Def: transcriptional regulator, XRE family # Organism: D.acetoxidans # Pathway: not_defined # 8 108 9 109 109 65 35.0 5e-10 MRHIDFSFGASLRNAREKRNYRREQIAERAEISPRFLAAIESGRRKPSLDVLIRLVNAIG ASFDEILAPQMITDSEIVDRIRRLVPQCSQRDQELLLALIDKMLDTKEKKDNK >gi|229784128|gb|GG667607.1| GENE 142 163572 - 163730 78 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288869868|ref|ZP_06112104.2| ## NR: gi|288869868|ref|ZP_06112104.2| mn-dependent transcriptional regulator [Clostridium hathewayi DSM 13479] mn-dependent transcriptional regulator [Clostridium hathewayi DSM 13479] # 1 52 1 52 52 87 100.0 3e-16 
MEQLITAGVNPKTAEADACHMEHTISIEAFEKLRDYYSSQKEKQFARIATDR >gi|229784128|gb|GG667607.1| GENE 143 163822 - 165210 219 462 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 265 453 4 204 245 89 29 1e-16 MIEFKNVSFQYEGSEQQIWNVNLSIAQGECVVLTGVSGCGKTTLTRLMNGLAPSYFKGSI SGSICIDDRDITGMTAWEIGTIVGSVFQNPKSQFFSSELVGEVAFGCENHGFSREAIQER TDKSIETFSLSLIRKRPLDVLSSGEKQRVAIASVYAMRPKVFVCDEPTANLDLGGIEQLL GTLFKLKAQGYTLVIAEHRLSWLSELADRIIYMEKGRIVREYTPKEFTAMPEMERRNKGL RTMHRVQTREPLPLPKLFERGSLIMHDLCKKCGEIEVFTNLSGAFPQGSITAVTGRNGAG KSTLALVLAGLSKKSGGDIIVCGSKNHLSRRRKKVYYCGNDTTTQFFTASVSEELLLNQP LSEERMERARQLLKNMNLYEYRDVHPAALSGGQKQRLAVACAIFSEREILLLDEPTSGLD GANMRRISDALKAAASRGKTVIVITHDPEFMEECCQYCFSLK >gi|229784128|gb|GG667607.1| GENE 144 165207 - 165905 153 232 aa, chain - ## HITS:1 COG:no KEGG:TDE0256 NR:ns ## KEGG: TDE0256 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 7 227 8 228 232 183 47.0 4e-45 MRGEHSFAVPTKFFALLCAMLGIFLVKNILFTTVLVIFCFVYVGFQRNYRLMTSCGAFYL LLSLLLYAIRFWGYHTVIISEFLVLFFWSSMVMLMLFWDLFMTPPGELSSFLSSIHMPTG VILGLMVIFRFFPTMKAELKAVGLSMKNRGLTSVRQFAVHPLASVEYVLVPLLLRILQIA DQLAVSAVARGAECPGVRKSYYETTMHMRDWCFLALWGIVTAAVLCVGGVKI >gi|229784128|gb|GG667607.1| GENE 145 165902 - 166144 220 80 aa, chain - ## HITS:1 COG:no KEGG:TDE0257 NR:ns ## KEGG: TDE0257 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 3 80 123 200 200 105 66.0 5e-22 MVSLFYNGGNLLPLLFFWDTFYAFAVASGMDQSYIDSYIQYYTSPGWLIFIVSFSVICGF LGSILGSRLIGKHFKKAGVL >gi|229784128|gb|GG667607.1| GENE 146 166125 - 166490 180 121 aa, chain - ## HITS:1 COG:no KEGG:TDE0257 NR:ns ## KEGG: TDE0257 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 87 1 87 200 89 55.0 3e-17 MKTNNKWTVKDVITTVLLTVLLIGIHFVVITVSMVNQFFNCVLSPGVSMFFSAPVYMLMV SRVNKRFVSLTYLTILGVFFLLTGNLFRFGRPTMRGGTMEAGILSQRKEINLRMGDGEPV L >gi|229784128|gb|GG667607.1| GENE 147 166565 - 168286 
223 573 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 332 554 132 354 398 90 29 4e-17 MRELFKTHFALTDKGAKDLQKASAASFFVYVINFFPAMLLLLLVDELLLNNVKETGLYLW GSILVLAVMWILLRIEYDALYNTTYQESANLRTEIADILTKLPLSYFSRHDLSDLAQTIM ADVAAIEHAMSHAMAKAVGFLFFFPLLSVLLLLGNVKLGLAIILPILLGFGLLLLSKNLQ IRESFKHYQKLRDNSESFQEAIENQQEIKSFGLTQKIRQTLYQKMEESEKIHLRAEISAG IPMLCSNVILQFAFVLVILIGVQMLHTGEINILYFLGYVLASIKVRESVEAVSMNVAELY YLDSMVKRIREVRETKIQQGKDQTISSYDIEFDQVSFSYDKDTEVLKNISFTAKQNEVTA LVGVSGCGKTSILRLMSRLYDYDGGSIRIGGLDIKEISTKSLFEKIAIVFQDVTLFNASV LENIRIGKKTATDEEVVQAARLANCEEFIRRLPDGYKTMIGENGATLSGGERQRLSIARA FLKDSPIIILDEIAASLDVENEKKIQDSLNRLILDKTVIIISHRLKSVENADRIVVIDCG RVEASGTHLELLKASPTYNNLVEKAKLTEEFQY >gi|229784128|gb|GG667607.1| GENE 148 168283 - 170028 225 581 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 351 552 294 502 563 91 31 3e-17 MFAVYKKLLYYVPKQRPLAYIAIGLTVVSTLLNVGAYYYLYEFLKRLVVDGDMGHAQYNA FLVAGLLIGGSLLYFAAVLLTHMLGFRLETNLRKRGIDGLTNASFRYFDLNSSGKTRKLI DDNAAQTHSIVAHLIPDNAGAILTPFLALVVGFLISLRVGIVLLVLFLLSGVLLVLMTGE KKFMEVYQKALETMSSETVEYIRGIQVLKIFGIDATSFKTLHRAIADYARHALNYSMSCR RPYVLFQLILFGFIAVLVSVAALTLNRMQDPAVLAVELIMTLFLGGVMFTAFMKVMYVGM YAFMGTSAVEKLEQVFSDMQKDRLTFGNKSTFKNYDIEFERVCFGYTDQMVLEDVSFTLK EGRSYALVGSSGGGKSTIAKLISGFYKVDGGAVKIGGEPIEAYTEDALIRNIAFVFQNVR LFHISIYENVRLANPDAQRAEVMEALHQAGCDSILDKFKDRENTVIGSKGVYLSGGEKQR IAIARAMLKNAKIVILDEASAAIDPENEHELQKAFAHLMKGKTVIMIAHRLSSIRNVDEI LVLEDGKIAERGTDAALMATDTKYREYQTLYEQANEWRVVR >gi|229784128|gb|GG667607.1| GENE 149 170259 - 171389 4 376 aa, chain - ## HITS:1 COG:all1169 KEGG:ns NR:ns ## COG: all1169 COG0031 # Protein_GI_number: 17228664 # Func_class: E Amino acid transport and metabolism # Function: Cysteine synthase # Organism: Nostoc sp. 
PCC 7120 # 5 315 34 347 365 263 44.0 3e-70 MHLLETIGNTPLIELRNTTASTEGRLLFKYERGNPGGSIKDRPALFIVTEAEKRGLLKPG GTIIESSSGNFGISLAMIGAAKGYRVIILVDPKTTATNLALLKCFGAEVIVVTEQDDSGS YHKTRISLANKLASEIPDAFRPDQCFNLLNSTAHYQGTAREIFADCPDNVAAIVAAVSTG GQLGGISRYTKTYRPDVKVIGVDAVGSSIFGGESHSYLIPGIGLAWTPCNVAVENIDSVY KVTDEAAFVAARCFARNEGVLMGPSGGACALVALTITKQLSPHDCVVCIISDGGDRYIQT LLDDAWMQSNNFSIETSLEKLLALTGTITPWSIHPDQNANYKPELKSQLCVPASTQTINE NIDEINESASRISFID >gi|229784128|gb|GG667607.1| GENE 150 171466 - 173070 190 534 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 281 503 5 215 305 77 28 3e-13 MDNYLKAALSVEQLKVSYVQNKKRLPVLDGISFVLEKGKILALLGESGSGKSTCGKSLIG VLPPSAKIDSGTVRFGNGDYVDISDNDINWKTIRGMRIGMIYQDAQLALNPVKTIRVHFE ETLRNKPFPSKEAMEKVCYDMLRLLNFDDPERVLNAYPFELSGGMCQRVYIAMILSLEPE VLIADEPTSALDTVSQKEVLQLLRDTQKKLGLSIVLITHDIGVVHEICDRVIVLNEGKIV EQGAVQDVLLHPKEHYTKQLIEARNLPAFSLPCSKCKEPLLKIECIKKAFQKGNKCKNVL NQVDLLVHKGETIGILGFSGCGKSTLARCVTGLELPDQGKIMYNGQDISMLRGRKRQHIC KSLQIVFQDARASLNPRRSALELVQEPLKYLKIGTTKERTEKARFYLKSVGIDGDALHRR PPQLSTGQCQRIAIARALIVEPELLICDEAVSALDMILQKQILDLLLNLQKSIGFAYIMI SHDARVIRHSCERVAIMNNGVFVDMVSTDRLAAQSENEFTHCLLSSELHIAEVS >gi|229784128|gb|GG667607.1| GENE 151 173080 - 173910 496 276 aa, chain - ## HITS:1 COG:BH0030 KEGG:ns NR:ns ## COG: BH0030 COG1173 # Protein_GI_number: 15612593 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 11 268 40 297 301 196 41.0 4e-50 MGNRKIPSTVLLGSILLLVIILLVIAAPLFTSYAPDEAIAERLQEFSFSHPFGTDRYGRD IFCRTLYGGRTTLTACFLALVMALFIGTFFGIITGMRSRGLLDTIIMRVIDVLMAFPFMV FAMVFAALWGTGLANLLIAVVAVWWVPFARLARSIVLQVKNDPSIQAARVLGASGWRMGV FELLPKMIGSLLVLATFELGTLILSVSALSFLGMGALPPTPEWGSMLSDARAHFFQAPHI LLGPALFIVLTVLSLNLIGEGLRDMLDPYEHIDILN >gi|229784128|gb|GG667607.1| GENE 152 173924 - 174367 204 147 aa, chain - ## HITS:1 COG:MA1247 KEGG:ns NR:ns 
## COG: MA1247 COG0601 # Protein_GI_number: 20090111 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 8 136 182 311 316 108 43.0 4e-24
MEWGRLWLPAITLGIIGAPRCIRMVRAAMLTELGQLYVSSALSRGLTSKMVLQGAFQNAL
LPAITTVALSFAHMIGGSVVIEAIFNWPGIGNYAINSILVHDYPAVQGYTVLTVATVILI
NLTVEITYILSNPMVRRGGISRPTAQQ

Prediction of potential genes in microbial genomes
Time: Thu Jun 30 23:13:06 2011
Seq name: gi|229784127|gb|GG667608.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld1, whole genome shotgun sequence
Length of sequence - 158390 bp
Number of predicted genes - 140, with homology - 133
Number of transcription units - 66, operones - 33 average op.length - 3.2

N Tu/Op Conserved   S Start End Score
        pairs(N/Pv)
1 1 Op 1 . + CDS 1 - 699 770 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II
2 1 Op 2 . + CDS 692 - 1249 576 ## CLJU_c39380 hypothetical protein
    + Prom 1253 - 1312 2.2
3 2 Op 1 . + CDS 1334 - 2644 1091 ## Amet_0444 hypothetical protein
4 2 Op 2 . + CDS 2665 - 3966 1343 ## COG2610 H+/gluconate symporter and related permeases
    + Term 4057 - 4107 5.5
    + Prom 4395 - 4454 12.6
5 3 Tu 1 . + CDS 4575 - 5318 423 ## gi|266619184|ref|ZP_06112119.1| conserved hypothetical protein
    + Term 5340 - 5373 -0.6
6 4 Tu 1 . - CDS 5410 - 5604 94 ## gi|266619185|ref|ZP_06112120.1| toxin-antitoxin system, toxin component, RelE family
    - Prom 5648 - 5707 2.7
7 5 Tu 1 . - CDS 5967 - 6839 918 ## COG0583 Transcriptional regulator
    - Prom 6875 - 6934 7.0
    + Prom 6981 - 7040 5.8
8 6 Tu 1 . + CDS 7080 - 8081 980 ## COG0491 Zn-dependent hydrolases, including glyoxylases
    + Term 8203 - 8243 -0.3
    + TRNA 8364 - 8437 69.7 # Arg ACG 0 0
    - Term 8349 - 8417 31.3
9 7 Tu 1 . - CDS 8516 - 9226 419 ## COG1145 Ferredoxin
    - Prom 9305 - 9364 7.6
    + Prom 9216 - 9275 7.3
10 8 Tu 1 .
+ CDS 9367 - 10017 700 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 10070 - 10118 -0.7 + Prom 10061 - 10120 9.1 11 9 Op 1 . + CDS 10298 - 11116 829 ## COG2367 Beta-lactamase class A 12 9 Op 2 . + CDS 11119 - 12204 1252 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 13 9 Op 3 21/0.000 + CDS 12271 - 14226 2397 ## COG4166 ABC-type oligopeptide transport system, periplasmic component 14 9 Op 4 49/0.000 + CDS 14398 - 15339 1185 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 15 9 Op 5 44/0.000 + CDS 15339 - 16349 1129 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 16 9 Op 6 44/0.000 + CDS 16389 - 17441 524 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 17 9 Op 7 . + CDS 17438 - 18373 1112 ## COG4608 ABC-type oligopeptide transport system, ATPase component 18 10 Op 1 . + CDS 18484 - 18630 78 ## 19 10 Op 2 1/0.176 + CDS 18612 - 19586 985 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 20 10 Op 3 . + CDS 19666 - 20511 969 ## COG2367 Beta-lactamase class A 21 11 Tu 1 . - CDS 20502 - 21395 563 ## Cphy_1141 hypothetical protein - Prom 21555 - 21614 5.0 + Prom 21455 - 21514 8.4 22 12 Op 1 . + CDS 21591 - 22358 419 ## COG1451 Predicted metal-dependent hydrolase 23 12 Op 2 . + CDS 22273 - 23499 1609 ## COG0840 Methyl-accepting chemotaxis protein 24 12 Op 3 . + CDS 23417 - 25162 2010 ## COG0608 Single-stranded DNA-specific exonuclease 25 12 Op 4 . + CDS 25188 - 26315 1090 ## COG0673 Predicted dehydrogenases and related proteins + Term 26504 - 26551 9.2 + Prom 26434 - 26493 5.6 26 13 Tu 1 . + CDS 26586 - 27833 1565 ## COG0112 Glycine/serine hydroxymethyltransferase + Term 27876 - 27938 13.3 + Prom 27936 - 27995 4.8 27 14 Tu 1 . + CDS 28031 - 29107 1044 ## COG0500 SAM-dependent methyltransferases 28 15 Tu 1 . 
+ CDS 29587 - 29766 176 ## gi|266619208|ref|ZP_06112143.1| conserved hypothetical protein + Prom 30652 - 30711 80.4 29 16 Op 1 . + CDS 30820 - 31683 510 ## COG0384 Predicted epimerase, PhzC/PhzF homolog 30 16 Op 2 . + CDS 31736 - 32425 543 ## gi|266619210|ref|ZP_06112145.1| hypothetical protein CLOSTHATH_00213 31 16 Op 3 . + CDS 32446 - 32607 118 ## 32 16 Op 4 . + CDS 32525 - 33040 436 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) 33 16 Op 5 . + CDS 32973 - 33146 143 ## 34 16 Op 6 . + CDS 33139 - 33699 625 ## COG1859 RNA:NAD 2'-phosphotransferase 35 16 Op 7 . + CDS 33713 - 34594 760 ## COG1091 dTDP-4-dehydrorhamnose reductase + Term 34740 - 34792 11.2 + Prom 34708 - 34767 5.2 36 17 Op 1 15/0.000 + CDS 34875 - 35639 809 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 37 17 Op 2 11/0.000 + CDS 35643 - 36104 629 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 38 17 Op 3 1/0.176 + CDS 36101 - 36376 461 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs + Prom 37278 - 37337 5.8 39 18 Tu 1 . + CDS 37379 - 39337 2076 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs + Term 39345 - 39408 21.1 - Term 39453 - 39515 5.5 40 19 Op 1 12/0.000 - CDS 39533 - 40438 1152 ## COG1732 Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) 41 19 Op 2 24/0.000 - CDS 40520 - 41242 346 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 42 19 Op 3 . - CDS 41271 - 41915 776 ## COG1174 ABC-type proline/glycine betaine transport systems, permease component - Prom 41971 - 42030 6.8 - Term 42028 - 42064 6.3 43 20 Tu 1 . - CDS 42106 - 43524 1719 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 43544 - 43603 3.2 + Prom 43808 - 43867 5.6 44 21 Tu 1 . + CDS 43906 - 45051 1159 ## COG3711 Transcriptional antiterminator 45 22 Op 1 . 
+ CDS 45968 - 46726 670 ## Thebr_0341 protein-N(pi)-phosphohistidine--sugar phosphotransferase (EC:2.7.1.69) 46 22 Op 2 . + CDS 46720 - 47172 429 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 47 22 Op 3 . + CDS 47184 - 47432 342 ## LM5578_2173 hypothetical protein 48 22 Op 4 3/0.059 + CDS 47474 - 48757 1509 ## COG3037 Uncharacterized protein conserved in bacteria 49 22 Op 5 . + CDS 48828 - 49745 1023 ## COG1735 Predicted metal-dependent hydrolase with the TIM-barrel fold 50 22 Op 6 . + CDS 49757 - 50377 672 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase 51 22 Op 7 . + CDS 50382 - 51098 789 ## COG1402 Uncharacterized protein, putative amidase + Term 51205 - 51253 7.0 + Prom 51237 - 51296 5.7 52 23 Tu 1 . + CDS 51453 - 51515 56 ## 53 24 Op 1 7/0.059 + CDS 51629 - 51931 246 ## COG2190 Phosphotransferase system IIA components 54 24 Op 2 7/0.059 + CDS 51950 - 52786 918 ## COG3711 Transcriptional antiterminator + Prom 52791 - 52850 6.6 55 25 Op 1 8/0.000 + CDS 52917 - 54320 1525 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 56 25 Op 2 . + CDS 54334 - 55791 1117 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 55839 - 55873 8.3 57 26 Tu 1 . - CDS 55889 - 60136 3709 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 58 27 Tu 1 . - CDS 60276 - 60407 71 ## + Prom 60313 - 60372 8.1 59 28 Op 1 . + CDS 60406 - 61347 1203 ## COG0598 Mg2+ and Co2+ transporters 60 28 Op 2 . + CDS 61381 - 62226 847 ## Closa_0669 hypothetical protein 61 28 Op 3 . + CDS 62245 - 63681 1789 ## COG0469 Pyruvate kinase 62 28 Op 4 . + CDS 63717 - 64985 1603 ## COG0019 Diaminopimelate decarboxylase + Term 65069 - 65102 5.4 + Prom 65281 - 65340 5.6 63 29 Tu 1 . + CDS 65393 - 66103 853 ## COG0670 Integral membrane protein, interacts with FtsH + Term 66117 - 66155 9.1 + Prom 66172 - 66231 5.5 64 30 Tu 1 . 
+ CDS 66265 - 68733 2433 ## COG2199 FOG: GGDEF domain - Term 68731 - 68787 15.1 65 31 Tu 1 . - CDS 68821 - 69981 1144 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Prom 70006 - 70065 7.3 + Prom 69991 - 70050 3.6 66 32 Tu 1 . + CDS 70081 - 70971 955 ## COG1404 Subtilisin-like serine proteases + Term 70997 - 71051 18.1 - Term 70984 - 71038 18.1 67 33 Op 1 . - CDS 71069 - 72367 1130 ## COG0513 Superfamily II DNA and RNA helicases - Prom 72409 - 72468 6.6 - Term 72449 - 72507 6.3 68 33 Op 2 . - CDS 72559 - 74022 1195 ## Dhaf_0170 sodium/sulfate symporter - Prom 74064 - 74123 9.9 + Prom 73993 - 74052 11.1 69 34 Op 1 . + CDS 74198 - 74878 777 ## SGGBAA2069_c09090 cAMP-binding protein 70 34 Op 2 . + CDS 74913 - 75368 370 ## ELI_2084 hypothetical protein + Term 75404 - 75445 1.1 + Prom 75401 - 75460 6.6 71 35 Tu 1 . + CDS 75503 - 77368 2111 ## HMPREF9243_1609 hypothetical protein + Term 77380 - 77430 9.0 72 36 Tu 1 . - CDS 77358 - 78227 782 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 78235 - 78294 5.3 73 37 Op 1 . + CDS 78379 - 81456 3022 ## COG3250 Beta-galactosidase/beta-glucuronidase 74 37 Op 2 . + CDS 81489 - 81584 70 ## + Term 81597 - 81627 2.0 + Prom 81737 - 81796 3.7 75 38 Tu 1 . + CDS 81816 - 82814 1401 ## COG0468 RecA/RadA recombinase + Prom 83661 - 83720 80.4 76 39 Op 1 1/0.176 + CDS 83840 - 84457 785 ## COG2137 Uncharacterized protein conserved in bacteria + Prom 84504 - 84563 6.4 77 39 Op 2 . + CDS 84606 - 86165 1270 ## COG1418 Predicted HD superfamily hydrolase + Term 86187 - 86247 13.1 + Prom 86185 - 86244 3.8 78 40 Tu 1 . + CDS 86280 - 86459 90 ## gi|266619257|ref|ZP_06112192.1| putative lipoprotein + Term 86610 - 86647 6.2 - Term 86598 - 86634 2.2 79 41 Tu 1 . - CDS 86639 - 87658 363 ## COG1609 Transcriptional regulators - Prom 87817 - 87876 8.8 + Prom 87810 - 87869 8.7 80 42 Op 1 . 
+ CDS 87933 - 89687 814 ## COG2272 Carboxylesterase type B 81 42 Op 2 8/0.000 + CDS 89702 - 91078 896 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 82 42 Op 3 . + CDS 92091 - 93521 823 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 83 43 Op 1 . - CDS 93693 - 94265 509 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 84 43 Op 2 2/0.059 - CDS 94279 - 95085 364 ## COG0627 Predicted esterase 85 43 Op 3 . - CDS 95102 - 95929 459 ## COG0627 Predicted esterase 86 43 Op 4 11/0.000 - CDS 95919 - 96935 753 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 87 43 Op 5 21/0.000 - CDS 96938 - 97945 898 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 88 43 Op 6 16/0.000 - CDS 97966 - 99453 1056 ## COG1129 ABC-type sugar transport system, ATPase component 89 43 Op 7 . - CDS 99491 - 100642 960 ## COG1879 ABC-type sugar transport system, periplasmic component 90 44 Tu 1 . - CDS 100792 - 100950 92 ## gi|266619269|ref|ZP_06112204.1| putative N utilization substance protein A - Prom 101061 - 101120 6.1 91 45 Op 1 13/0.000 + CDS 101201 - 102211 882 ## COG1609 Transcriptional regulators + Prom 102267 - 102326 2.8 92 45 Op 2 . 
+ CDS 102382 - 103287 332 ## COG0524 Sugar kinases, ribokinase family + Term 103305 - 103363 10.4 93 46 Op 1 7/0.059 - CDS 103717 - 105474 1210 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 94 46 Op 2 3/0.059 - CDS 105458 - 106237 686 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 106269 - 106328 5.1 - Term 106272 - 106321 7.8 95 47 Tu 1 3/0.059 - CDS 106341 - 107342 1077 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase - Prom 107367 - 107426 2.8 96 48 Op 1 13/0.000 - CDS 107526 - 107984 606 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 108067 - 108126 4.0 97 48 Op 2 . - CDS 108137 - 108835 734 ## COG0785 Cytochrome c biogenesis protein - Prom 108859 - 108918 4.3 - Term 108975 - 109025 5.3 98 49 Tu 1 . - CDS 109058 - 110515 591 ## ELI_1104 hypothetical protein - Prom 110556 - 110615 7.3 + Prom 110616 - 110675 6.6 99 50 Tu 1 . + CDS 110797 - 113424 2259 ## COG0642 Signal transduction histidine kinase + Prom 113432 - 113491 5.6 100 51 Op 1 . + CDS 113545 - 113847 324 ## gi|288869901|ref|ZP_06112214.2| high-affinity choline transport protein 101 51 Op 2 40/0.000 + CDS 113898 - 114581 761 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 102 51 Op 3 . + CDS 114575 - 115732 1123 ## COG0642 Signal transduction histidine kinase 103 52 Op 1 . - CDS 115753 - 117183 1051 ## ELI_2456 transcriptional regulator 104 52 Op 2 . - CDS 117226 - 120006 2003 ## COG0642 Signal transduction histidine kinase - Prom 120072 - 120131 8.0 + Prom 120127 - 120186 5.8 105 53 Op 1 . + CDS 120247 - 121161 731 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 121242 - 121270 1.0 106 53 Op 2 . 
+ CDS 121284 - 121520 242 ## Closa_0502 hypothetical protein + Term 121539 - 121595 7.1 + Prom 121532 - 121591 3.3 107 54 Op 1 8/0.000 + CDS 121767 - 122729 837 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 122747 - 122806 5.6 108 54 Op 2 3/0.059 + CDS 122832 - 123710 925 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 123719 - 123789 27.3 + Prom 123727 - 123786 3.5 109 55 Op 1 . + CDS 123884 - 125596 1495 ## COG0840 Methyl-accepting chemotaxis protein 110 55 Op 2 . + CDS 125615 - 125980 482 ## COG3603 Uncharacterized conserved protein + Term 126005 - 126039 3.5 + Prom 126261 - 126320 4.2 111 56 Tu 1 . + CDS 126473 - 126808 249 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 112 57 Op 1 . + CDS 127725 - 127874 181 ## gi|266619292|ref|ZP_06112227.1| conserved hypothetical protein 113 57 Op 2 . + CDS 127852 - 128847 1067 ## Amet_3811 hypothetical protein 114 57 Op 3 40/0.000 + CDS 128935 - 129621 689 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 115 57 Op 4 4/0.059 + CDS 129641 - 130645 590 ## COG0642 Signal transduction histidine kinase + Prom 130647 - 130706 1.9 116 58 Op 1 36/0.000 + CDS 130728 - 131408 306 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 117 58 Op 2 . + CDS 131411 - 134014 1908 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 134016 - 134066 1.3 - Term 134008 - 134044 9.0 118 59 Tu 1 . - CDS 134092 - 135114 1382 ## COG3641 Predicted membrane protein, putative toxin regulator + Prom 135070 - 135129 4.0 119 60 Op 1 . + CDS 135160 - 135309 97 ## gi|288869910|ref|ZP_06112234.2| hypothetical protein CLOSTHATH_00313 120 60 Op 2 . + CDS 135314 - 135436 63 ## + Prom 135682 - 135741 6.2 121 61 Op 1 . + CDS 135796 - 137289 1452 ## Closa_1674 hypothetical protein 122 61 Op 2 . 
+ CDS 137286 - 138155 888 ## COG1180 Pyruvate-formate lyase-activating enzyme 123 61 Op 3 15/0.000 + CDS 138204 - 139310 1350 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 124 61 Op 4 24/0.000 + CDS 139345 - 140880 1871 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 125 61 Op 5 26/0.000 + CDS 140873 - 141997 1352 ## COG4603 ABC-type uncharacterized transport system, permease component 126 61 Op 6 . + CDS 141990 - 142898 1167 ## COG1079 Uncharacterized ABC-type transport system, permease component + Prom 144187 - 144246 6.5 127 62 Op 1 . + CDS 144277 - 145023 835 ## Closa_1680 Crp/Fnr family transcriptional regulator 128 62 Op 2 . + CDS 145050 - 145832 1019 ## COG2820 Uridine phosphorylase 129 62 Op 3 1/0.176 + CDS 145873 - 146580 938 ## COG0813 Purine-nucleoside phosphorylase 130 62 Op 4 3/0.059 + CDS 146654 - 147322 817 ## COG0274 Deoxyribose-phosphate aldolase 131 62 Op 5 . + CDS 147369 - 147818 539 ## COG0295 Cytidine deaminase + Term 147856 - 147909 11.1 132 63 Tu 1 . - CDS 147927 - 148505 430 ## PROTEIN SUPPORTED gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) - Prom 148650 - 148709 6.1 133 64 Op 1 . - CDS 148715 - 149947 480 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 134 64 Op 2 7/0.059 - CDS 149949 - 151565 1089 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 135 64 Op 3 . - CDS 151579 - 153330 891 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 153352 - 153411 7.6 + Prom 153478 - 153537 6.1 136 65 Op 1 35/0.000 + CDS 153604 - 154953 1402 ## COG1653 ABC-type sugar transport system, periplasmic component 137 65 Op 2 38/0.000 + CDS 155011 - 155889 630 ## COG1175 ABC-type sugar transport systems, permease components 138 65 Op 3 . 
+ CDS 155900 - 156730 657 ## COG0395 ABC-type sugar transport system, permease component 139 65 Op 4 . + CDS 156740 - 157936 861 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities + Prom 157938 - 157997 3.4 140 66 Tu 1 . + CDS 158047 - 158389 388 ## COG3192 Ethanolamine utilization protein Predicted protein(s) >gi|229784127|gb|GG667608.1| GENE 1 1 - 699 770 232 aa, chain + ## HITS:1 COG:ECs2408 KEGG:ns NR:ns ## COG: ECs2408 COG0318 # Protein_GI_number: 15831662 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli O157:H7 # 1 225 333 557 566 221 47.0 8e-58 VPGYMVQKAWSFGILLCEVYGSTESVPHVFVRPEEALTLNGTTSGRPIEGVEIRVVDDEG NDVPPGIPGEELSRGPNVFVGYLKDKRITDGALDDDGWFHSGDLCVMDEAGNIHIIGRKK DMIVRGGENLNSNEINDRLEGCPGMGDHTVIGMPDDRLGERICAFAVPLPGFEHLKLEDV TGWLCRNGVPKRFWPERLELIDRIPHTGSGKVKKYLLKEELEKRMEGPEHND >gi|229784127|gb|GG667608.1| GENE 2 692 - 1249 576 185 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c39380 NR:ns ## KEGG: CLJU_c39380 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 1 178 1 175 175 182 48.0 7e-45 MIEQKFGPKRCRDTRKPRADQCPEVVFYRCRSCNSLFPVTGGQALEEKKIFCCSEEAERL VPTDAGEAKEFLELSYQITGGYNDNAVKVSWKTKKQDCVPEWIYLKTFTGGYLKYVRKEK RSPMVFALADTDAFCYCDEDPCLECVFRCKRGFAVYAYSEGTGLLEMPLDRMTAHWQTRE KEAER >gi|229784127|gb|GG667608.1| GENE 3 1334 - 2644 1091 436 aa, chain + ## HITS:1 COG:no KEGG:Amet_0444 NR:ns ## KEGG: Amet_0444 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 20 419 9 374 383 145 27.0 4e-33 MKSENVGSGGTAVDWRAVLVFVSIAFSSQCGGGFASGSTPWTYFFRNTGFYGMLPEVTQS YFCLLMPLVVAAINCFIIYFFMKFAMDFRTFDYGNFLEKYGSRFWGGVLAKPMQIVYELN FNWLLIVCMALAYSTSGSALKELTGIPYIWTTFIVGIIMFLLCMKGTELVRKNAVFMSSV IFIALIVVHVPNLIFNIGSGKLANEMAVMSSSMEAGIASGAGSFFKYLFIAFVWAYIFSG 
QNLAGFGAYVNHAQLFDNKKTLRWAVVLSVIVNWVFLEMNAINLAGNYSAVYDGWANGKA VYTIMVVQNGWAGGAVRAISLTLLTIAIFFATISTAINYAQGFNDRVLNWYQKKSGEDPK VSEARRNRRGAVLTFVYICLTWVISQAGLTTLVSKGLTAASYINLFTLILPTVMNVIIGW PDREYDSGRKVQAGTK >gi|229784127|gb|GG667608.1| GENE 4 2665 - 3966 1343 433 aa, chain + ## HITS:1 COG:BH3897 KEGG:ns NR:ns ## COG: BH3897 COG2610 # Protein_GI_number: 15616459 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Bacillus halodurans # 5 425 2 421 427 138 26.0 2e-32 MLATMGLIFAIALLVVMIIKGIDMICATVITAVIILLSSQLSLSDGLLTTFSGGAGGFVT SWFLILGMGGIFGQLVLETGLATKIAKWLTQKLGARMVGLVILICTFFITLLGINGYIMV FVLYPIADNLLRENKMDRRLLPFMLLGGGASANGFMYTLDICNVLPSGYLGTSLGSAPFL SLVFTAIVAVCYFVYFNYAQAQSHKKHTNEELLAMYTSSNKVMSDEELPGIGMSVLPFIV VFGIVVATSSWGTSTSILFSLTVGILLIMATQFKHIENIKASAKTGAGSGLNSMLTVAAV MGLSKVLSLSPAFQALQQAVLSMNMPVYLKAYFGTAVLTGLTGSAISGETIFLESFGQAF VDMGVNVQALHRIVTESALILNKLPNSSVAILTMSICGCSLKESYKHVILGTMIPAAAAG LVIALMATFGITF >gi|229784127|gb|GG667608.1| GENE 5 4575 - 5318 423 247 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619184|ref|ZP_06112119.1| ## NR: gi|266619184|ref|ZP_06112119.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 247 1 247 247 445 100.0 1e-123 MNHYIYIIVLTFILISANWGENCNNYPQANDIENNNFADIQESKQPETKAMQDKDLNLKE IADHVVESNKEKTNREISKESNEFLSYFVDKVFDSSYHPPKAASGYVMEYIVIDKIEDGN VSGEYSSQAGFIWENGVFNNVRINEDNSIVVINEWTGVNFITEESFPKAYTKIKIIFDRV VDGIPVIKTVKLGPVEGNEEILKEYYYMSEPGTIHEWFSNNYMYYDLNMSEEEMVDWITK NINTPWK >gi|229784127|gb|GG667608.1| GENE 6 5410 - 5604 94 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619185|ref|ZP_06112120.1| ## NR: gi|266619185|ref|ZP_06112120.1| toxin-antitoxin system, toxin component, RelE family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, toxin component, RelE family [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 101 100.0 2e-20 
MLIFTRTQYTLVREPISRIGKTAETSSVTYRYWSKEIIKLLKYELTDRTIVLDILKLKNN ENQD >gi|229784127|gb|GG667608.1| GENE 7 5967 - 6839 918 290 aa, chain - ## HITS:1 COG:lin0491 KEGG:ns NR:ns ## COG: lin0491 COG0583 # Protein_GI_number: 16799566 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 280 6 287 297 180 36.0 2e-45 MYYFRKLAEVQHYTEAAKALYITQPSLSDSIASLEKELGVSLFQKKGRNIQLTKYGEEFY QYVSESLSILEHGISLIKEKSGHISGSIDIGCIPTLLGDFLPEALDLYQQKNPLATLNIY HGMSIEVAQGVSCGKYDIGFCSMVDDMPDLVFVPITYQELIAIVQDNHPLADRGAMYLSD LKDYRLTTYRDTIPIGKVVRRLLKEKGVEAVYSYDDEISIAGRISKSSKVAIVADTPYLR QFNNLSKVRLLDVPKDTRMIYMVYSRKNFITSAVEAFANFMVANCLNMPD >gi|229784127|gb|GG667608.1| GENE 8 7080 - 8081 980 333 aa, chain + ## HITS:1 COG:FN1279 KEGG:ns NR:ns ## COG: FN1279 COG0491 # Protein_GI_number: 19704614 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Fusobacterium nucleatum # 1 270 2 262 326 95 27.0 1e-19 MQEIYHNIFQETISYSDRHMSPRNLYLIRQPGRSLMVDTSFRFERDWEILRSMVETLGVS YKDLDVFITHDHPDHTGLVPALQELGASVFMNPEETRKRADLLHCYLADEASRIQNLRIV GVTKKDTPEVYQAIMEYTGRAYAERGKAQDFSFIPVHPGETLDYGEYVFDVVSLKGHTFG QCGLYERSHGFLFCGDQIMTTIVPIVGSQQKDLGLLKCYMESLGDMKHRYADCRILPCHY GPIEDVGKEANRIIFGYLDKCEIMKRILEEDGGLMTTRDVGVRAYGRSQGPPDYRHFVSC TQIWAKTFSCLEFMYGEGLVERIARDGIIYWKL >gi|229784127|gb|GG667608.1| GENE 9 8516 - 9226 419 236 aa, chain - ## HITS:1 COG:CAC0885_1 KEGG:ns NR:ns ## COG: CAC0885_1 COG1145 # Protein_GI_number: 15894172 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 102 1 99 115 106 50.0 4e-23 MIRKIIHIDTDACTGCGLCAEACHEGAIQMTDGKAVLTREDYCDGLGDCLPACPAGAISF TEREAPAYDHAAVQKAKLARNQAGQAGRESSCACPSAASRSLSGSAAQNPSGIKEVPGCL SQWPVQIKLAPVRAPYFDNADLLIAADCTAFTYGNFHHDFIRNHITLIGCPKLDNTDYSD KLTEILRENTIRSITVTRMEVPCCGGITFAVERAVENSGKDIPCNVFTIGIDGNLL >gi|229784127|gb|GG667608.1| GENE 10 9367 - 10017 700 216 aa, chain + ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # 
Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 4 210 14 216 229 86 25.0 4e-17 MVSLFDGIENGELEQLLPCLGGRRVRFDEGDYIFMAGGRADQVGMVLSGSVLVFSDDFWG NRTIMAHVERGGLFGEAFSFARVEALPVSVTAAEKTEVLFINCERMITVCPSACEFHNRL VRNMLKILAEKNMALTQKIGHMGKRSTREKILSYLSEQAGKAGGESFSIPLNRQEMADYL AVDRSALSKELGRLEKEGQISFHKNQFQLLKAQLQE >gi|229784127|gb|GG667608.1| GENE 11 10298 - 11116 829 272 aa, chain + ## HITS:1 COG:FN1584 KEGG:ns NR:ns ## COG: FN1584 COG2367 # Protein_GI_number: 19704905 # Func_class: V Defense mechanisms # Function: Beta-lactamase class A # Organism: Fusobacterium nucleatum # 23 235 15 224 264 126 36.0 5e-29 MTEGIKPEASGLDTIELNVADQIKALPGKTGFYYENLVTGERAAYHEEERMMAASVIKLF VMTEAFTRFEEGTLSPDRIIRMRREDCVPSCGALTYLHDGIEVTVLDLVTLMIIFSDNTA TNVLIDLLGIEEINRTIRRLGYRDTVLRRKMYDTEKSKQGIQNYITAAETGRLLREMYQG RLVSRTASEAMISILKNQQLCSKIPFYLQALPEEPEIAHKTGEDCGITHDVGIIYAKQPF IVCYCGNDTDTPAYERVMAETALWLYHRNSEV >gi|229784127|gb|GG667608.1| GENE 12 11119 - 12204 1252 361 aa, chain + ## HITS:1 COG:FN1586 KEGG:ns NR:ns ## COG: FN1586 COG4948 # Protein_GI_number: 19704907 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Fusobacterium nucleatum # 1 360 13 372 375 402 54.0 1e-112 MRIAKIETAEVNIPLVTPFKTALRTVNSVNDIVVRVTADDGRMGFGEAPPTAVITGDTKG SIRCAVEEFIAPNLIGMEIDNLDGMMKKLHGCIVKNTSAKAAVDMALYDLYAQSCRKPLY RVLGGSRTEVETDLTISVNGVDEMVEDSLKAVKQGFRILKIKVGKEGVKDIERIREIRAA VGPEIRLRIDANQGWTAKDAVRIIRAMEDSGIEMDLVEQPVNAHDFDGMKFVTANVATPV LADESVFSPEDAVRIIQNRAADLINIKLMKTGGIYEALKICAIAESYGVECMIGCMLESK IAVSAAAHLAAGKGIITRADLDGPSLCSVDPYTGGPVYEGAMIRMNENDGIGITGVPGFE P >gi|229784127|gb|GG667608.1| GENE 13 12271 - 14226 2397 651 aa, chain + ## HITS:1 COG:SP1527 KEGG:ns NR:ns ## COG: SP1527 COG4166 # Protein_GI_number: 
15901372 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 59 648 35 631 652 166 27.0 1e-40 MKKRRLTAALLCAVLAASTVLSGCGGKKTTEGGSTAQAGNEAGKNENGKGEDGGKPVVYK ELYSGEVATLNYLVSASEADFKTAAYCIDTLIEYDSEGKIRPGQATEWEYDEPSKTWTFH LREAKWVDNTGAPVADVTAQDYVDAIKYQLTPEYESGIVQNLFGVIANAKEYYNGLVYNG GADDDGVVWNAIDFSEVGVKAVDDHTLTYTLETEVPYFLSSLAYIVYMPAYGPQLEELGK TFATDADKMYYNGSYYMSEFSPQERRIYKKNTLNYDADIVYIDEIQKIYNAEANTIGPEM IKRGEIDYAIITADILDDWLSNEDTKNLISRERPRNSYSYFYCFNFDPQFDAEYEPDNWR KAVNNENFRKALSSALNKTKEVAVLEPNVPEDYVIQTITPAKFTYNADGTDFTEIGDMAA LGDTFNEAKAVEYRDLAKSELAAEGVTFPVKVLLPYNPTEVNWDKECQVVEQQMESLLGT DFIDIIVQTGPTDGFLTEIRRNGKYAMLKCNWGADYADPETWTDPFYQAKGENGYDPGYK YANLAKAIEDGTPSADAVLEYFTTIEEAKDIKVDINARYEAFAKAEAALINHALVVPFSI SVSKYLATKINVFEGQYAPFGVSNLKYKGQHLQDHYISMEEFEANKEKAGQ >gi|229784127|gb|GG667608.1| GENE 14 14398 - 15339 1185 313 aa, chain + ## HITS:1 COG:lin2299 KEGG:ns NR:ns ## COG: lin2299 COG0601 # Protein_GI_number: 16801363 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Listeria innocua # 1 307 1 305 309 160 34.0 2e-39 MLKYLGKRIARSFLTLIIILTVVFCLLRLMPIEGYFTNFDKLTPQQIQVSLQEMGLTDPL PVQVVRFFNDLVHGDLGVSRIYRANVPVAEILADKVPVSIKLGVLSMIVSLFVGLPMGAL MARYKYRWFDKLGTLFIVCIQAVPAAVYFLYIQLYGTQALGVGLLFQIDDPKYWILPVVS MSLNNIAFYGMWLRRYMVDESNKDYVKLARAKGMSEGGVMFKHIFRNAFVPLAQYIPTAF LNTVIGSIYIESLYSIPGMGGLLVTVIKKHDNTMVQGIVLLYACVGVMGLLLGDILMVLL DPRISLSKKGGDR >gi|229784127|gb|GG667608.1| GENE 15 15339 - 16349 1129 336 aa, chain + ## HITS:1 COG:SP1889 KEGG:ns NR:ns ## COG: SP1889 COG1173 # Protein_GI_number: 15901716 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Streptococcus pneumoniae TIGR4 # 29 335 5 
307 308 222 40.0 9e-58 MKQFSYRHTKEAELMDSMEQAAKGAPQTDEELFSFAGFNEAEVERTGYSNYSYWSSTFRA FLKNKGAVVLLVLLLLLLAFTFIQPHLPGQYDPNTIVNDEAGMQYRNIPPGDQFILGTNS IGQDLWARIWAGTRTSLLIGLSVALAEALIGITVGVIWGYVRKADFILTEIYNIIDNIPN TIVLILISYILKPGVKTLIFAMCLTGWIQMARFIRNQILIIRDRDYNVASRCLGTPTAKI IIKNLLPYLVSVIMLRMALTIPSAIGNEVFITYIGLGLPVNIPSLGNLINEGRALITSPA LRYQLVYPTIILSFVTISFYIIGNAFSDAADPKNHV >gi|229784127|gb|GG667608.1| GENE 16 16389 - 17441 524 350 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 3 329 9 329 329 206 35 5e-52 MPEKILTVEDLEIKFELRGKTLHAIRGISMELHRGEILAIVGESGSGKSVFTKSFMGLLD QNGHVTAGKVTYYGADNREPVELTSLKKEKEWLKIRGREIAMIMQDPMTSLSPLKTIGVQ IMEAVLLHQNVNRAEAKARTLQYLADVGISEPEKRFGQYPHEFSGGMRQRVVIAIAIACN PQILICDEPTTALDVTIQAQILDLIKELRHKYHLSVILITHDLGVVANIADRVAVMYAGD IIEIGTGEDIFYDARHPYTWALLSSLPQVGVKGTDLFSIPGTPPNLFTEIKGDAFAPRNP KALKIDFIKQPPYFDVSPTHKAKTWLLDQRAPKVEPPSVVEKIRNGGLFS >gi|229784127|gb|GG667608.1| GENE 17 17438 - 18373 1112 311 aa, chain + ## HITS:1 COG:SP1887 KEGG:ns NR:ns ## COG: SP1887 COG4608 # Protein_GI_number: 15901714 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Streptococcus pneumoniae TIGR4 # 4 306 3 305 308 386 66.0 1e-107 MKREVLLDVKDLDVTFGERKKRYEAVKKVNFKIYKGETFGLVGESGSGKTTIGRAVMRII PSSGGDIIYKGEKINGRISRELDRKVMGEIQMIFQDPQASLNERAKVDYIISEGLMNIEK GIGEKERKKRVDQALLDVGLLPEFASRFPHEFSGGQRQRIGIARALIMNPEFIIADEPIS ALDVSVRAQVLNLLADMQKKRGITYLFIAHDLSVMRFITDRIAVIRKGEIVEMAETEELI THAIHPYTRALLSAIPMPDPRHEREKKLLVYDASMHHYETDKPSWREVRPEHYVLANGSE YEAYKKLYRDW >gi|229784127|gb|GG667608.1| GENE 18 18484 - 18630 78 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEREERQQEGGVQMAEMNFQTGPGAAFPAARGNPWSPEQEMERCSMEL >gi|229784127|gb|GG667608.1| GENE 19 18612 - 19586 985 324 aa, chain + ## HITS:1 COG:alr3379 KEGG:ns NR:ns ## COG: alr3379 COG0791 # Protein_GI_number: 17230871 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell 
wall-associated hydrolases (invasion-associated proteins) # Organism: Nostoc sp. PCC 7120 # 134 323 44 225 234 73 29.0 6e-13 MQYGIVIQPVATIYEIPAETVVKDGMVLSAIADEAMYGTGLAITGPGVTGAEVNGPEAAG MAVIGRETDETGQRTEMNRGYLPVRTFYGYTGYVRKEAVRLVSLEEMMAWEEGGLMVTTA FCVDVVSVPKVQGVRLVSLFRGSLVSVIPYECGIEGWGKVKLADGRFGYIREQFLENKEF SQAGLWTETLPQKKIVDKEAFRRAVVETAKTYLGIQYRWGGRSTAGIDCSGLTSVSYMMN GILTWRDAKIVEGYPVREIAKDEMKMGDLLYFPGHIAMYMGDGRYIHSTGKAGSGGVVIN SLNPEDADFRADLAESLYAAGSIF >gi|229784127|gb|GG667608.1| GENE 20 19666 - 20511 969 281 aa, chain + ## HITS:1 COG:DR0433 KEGG:ns NR:ns ## COG: DR0433 COG2367 # Protein_GI_number: 15805460 # Func_class: V Defense mechanisms # Function: Beta-lactamase class A # Organism: Deinococcus radiodurans # 18 255 14 254 277 112 31.0 6e-25 MDKRLTIEKRIEAELMSYDGIMGIYADDLHGNIIAIGADEPFETASTIKTYILACLFDQV EKGKASLEDMVEYKEEHTVDGSGVLCALEPGAVLRVKDAATLMIIVSDNVATNMMIDYLG LDTINACIRGLGCRDTVLYNPLHFERYDKLGTSTPRDYASIFTRLAAGTLISPESDAKML EIFKKQHYNSMITKDFPAVYMDSDNTDDVMVTVASKSGSMNACRNDGGIIYTPYGPYVLV MFNRKFSDAMYYPAHPATVFGARVSRLLFDQFIALEGRFQP >gi|229784127|gb|GG667608.1| GENE 21 20502 - 21395 563 297 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1141 NR:ns ## KEGG: Cphy_1141 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 294 1 313 319 259 41.0 1e-67 MKYFHGYQKKHSFFEGWYLKHQSPDGVLALIPAYHIDRQGKPSASLQIVTNTESWVIRFS AGQFRAASSLFYVNLGTSTFSEQGIHLDIAAKEVTLTGHLSYGPFTPPGYDIMGPFCAVP FMQCRHGVLSLSHRLSGQLVLNGKKLNFDGGLGYIEKDWGSSFPSRYVWTQCTWTDKKTK APCCVMLSVADIPFAGTHFTGSVGSVFFRGKEYRLATYKGLKILEFTARELLVSQGGLML HVNLIKDQPLSLAAPSFGSMSRTIKESAACQVRCQFFCQEKKIFDILCHHAGFESYG >gi|229784127|gb|GG667608.1| GENE 22 21591 - 22358 419 255 aa, chain + ## HITS:1 COG:RSc0521 KEGG:ns NR:ns ## COG: RSc0521 COG1451 # Protein_GI_number: 17545240 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Ralstonia solanacearum # 24 248 63 283 288 117 32.0 2e-26 MKTEKQALPDQGILICEDGDFFPYRIVKSARRTMAVSVTRAGEIVVRIPEGLSWKTGHEF 
AQRNQEWIFAHASRIREDLEKREQFHWTEGAELLFHGTLKKLHFEQDYETERFRVCDTGD KFLVSGPFSGRDDDEPLVKAAMESWYRKQARRFLEERTAWWADQMGVSYLRIAIRDQATR WGSCSVRGNINYNWKLVLLPVELTDYVVVHELAHRTEMNHSKDFWKIVERELPDYRQRRR RLKGYESEINQKYQY >gi|229784127|gb|GG667608.1| GENE 23 22273 - 23499 1609 408 aa, chain + ## HITS:1 COG:BS_yvaQ KEGG:ns NR:ns ## COG: BS_yvaQ COG0840 # Protein_GI_number: 16080422 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Bacillus subtilis # 26 284 5 260 566 69 22.0 1e-11 MRENCRITGKEGEDLRVMKVKSIRNISIKTKLLFLGGVSILGLVFIGAESIITARQINEA STEISQSWVPAIIIAEELNTETSDYRIKEFYHAITHDQETMDHLEKEMMAVRDEIDAAFT EYEQFYITDETDRTLMENARAYWDKYLEYSDRLLPVSRGNDNQESLKMIIGESRLLFDEA SSNFLKMAEFNRLGAEAASVRGDQLYVRLARVKIVSICLIALVITLLVIYIIIAIDKPVK AIVEGTRRVSNGDLDVYLPYDSEDEIGILTDSVNQLIERLKNIIDDEKYLFREIGSENFE VKSTCEQAYRGDFAPILYSIASLMSRLDIAKQKKEELKKRLEEQVAAEMLAEKKLKEEKT LAEEGKKLAEEKTPAEDKKTPDADQAGEETKKDGSRTMDASDEEGRLS >gi|229784127|gb|GG667608.1| GENE 24 23417 - 25162 2010 581 aa, chain + ## HITS:1 COG:CAC2232 KEGG:ns NR:ns ## COG: CAC2232 COG0608 # Protein_GI_number: 15895500 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Clostridium acetobutylicum # 15 578 2 587 587 549 45.0 1e-156 MQIRLVRRRKRMAAERWMLQTKKADFHEMAGIHHITPVTARIIRNRDIVGAEAVEKYLRG SLKDLYSPHLLKDMDLAVEILKGKTEQGKRIRIVGDYDIDGVCSTYLLYRALTRIGANVD YEIPDRIKDGYGINESIIRAAADDGVDTILTCDNGIAAVEQVRLAKELGLTVLITDHHDI MQEEGRDVLPPADAVVNPKQSGCSYPYPEICGGMVAYKLVKALYEAFSVPEEEWLSMLEF AAIATVGDVMKLRDENRIIVKEGLKRIADTKSIGLLKLIERNDLDKDHISAYHIGFVIGP CLNAGGRLQTAKLALALLLCEDGEEADRLAMELKELNDQRKDMTKQGTEEAIAQVEQCYR DDKVLVAYLPSCHESLAGIIAGRLREQYQKPAFVLTDGEGCVKGSGRSIEQYHMFEGLVK VRDLLLKFGGHPMAAGLSLEKENIDEFRRRLNEDAELTEDDFVRRIWIDVPMPFDYISEP LIEELELLEPFGQGNEKPLFAQKGLHIRSVRVLGKNRNAVKFSLADEKGTPMDAMLFTDG DTFLEELGNRRVIDVIYYPTVNEYNGSRTLQVIIKNYKIPV >gi|229784127|gb|GG667608.1| GENE 25 25188 - 26315 1090 375 aa, chain + ## HITS:1 COG:AGl1782 KEGG:ns NR:ns ## 
COG: AGl1782 COG0673 # Protein_GI_number: 15891005 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 53 331 58 326 350 105 27.0 2e-22 MEKIRAAVVGFGGMGRQYVQMLDNGEIDGMVLAGVCCRNEKGQNEIRTAYPGVSVYPNVE AMFGCIDEFDAAVVVTPHATHVEIGKMAAAAGKHILLDKPAGISTKEVKELLDAAERAGV SLGMIFNTRMNRVFFRAKEMIERGELGRLNRAVWISNIWFRTPAYHNSASWRSTWAGEHG GLLINQSQHFLDVWQWLFGMPDHVLATVECGKYSDITVDDSMDVQFFYDNGLHGTFIASS GEYPGVNRLEIWGTKGRLCIEDSSRILFDENVVPTDEFSSRNQEIYGMPEHHAREITVEK KGAEGYQKIFQNFTDHLRYGEPLLATGEDGLRGVMIANGAYLSSWLKKKVDFPIDDERYA AMLEEKAAEERRAGK >gi|229784127|gb|GG667608.1| GENE 26 26586 - 27833 1565 415 aa, chain + ## HITS:1 COG:SA1915 KEGG:ns NR:ns ## COG: SA1915 COG0112 # Protein_GI_number: 15927687 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Staphylococcus aureus N315 # 6 415 1 412 412 516 62.0 1e-146 MVNEVMKFITSYDKEVGEAIELECARQRRNLELIASENIVSEPVMMAMGTVLTNKYAEGY PGKRYYGGCEDVDIVENIAIERAKKLFGCDYANVQPHSGAQANMAAFVAMVQPGDTVMGM NLNHGGHLTHGSPVNFSGLYFNIVPYGVNDEGFIDYDEMERIAIENKPKLIIAGASAYGR TIDFKRFREVADKVGAYLMVDMAHIAGLVAAGLHPSPIPYADVVTTTTHKTLRGPRGGMI LANKEAAEKFNFNKAIFPGTQGGPLEHVIAGKAVCFGEALKPEFKEYQEQVVKNAKALAA ALVKQGFNILTGGTDNHLMLIDLRGMEVTGKELQNRCDEVYLTLNKNAVPNDPRSPFVTS GVRVGTPAVTSRGLKEEDMEKIAECIWLAATDFENKADYIRAEVTKICEKYPIYE >gi|229784127|gb|GG667608.1| GENE 27 28031 - 29107 1044 358 aa, chain + ## HITS:1 COG:CAC2451_2 KEGG:ns NR:ns ## COG: CAC2451_2 COG0500 # Protein_GI_number: 15895716 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 141 345 3 212 216 131 33.0 2e-30 MFKINEAADMLHVNPNALRFYEKKGIIAPHRDENNYRMYTMDDIARIQMILLYRRMGFSI EMILKLLEQEESPVDLFFRQYDILNRHVHSMSLIRDSLAGCIDRMLEEMSQPEAECTGAD TLVSAGAVAELEKTAALLAGMTKWEDRWNFDSSASHYDDMVRRGGNGLAFYENYETVLNA 
TARLAKEHGGIMAEIGVGTGELAGRRLSDCDITGIDQSVNMLKEAKKKFPKLKVKLGTFL QLPLDSGSVDTVVSSYAFHHCNEEEKFLAVREMARVLKPEGRVVIADLMFADQEAMQKFA ERCSDAEREDLEDEYFACVDRLEVMMEAMGFDVRHEQVDALIWIVSGDLKEKDQQKGQ >gi|229784127|gb|GG667608.1| GENE 28 29587 - 29766 176 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619208|ref|ZP_06112143.1| ## NR: gi|266619208|ref|ZP_06112143.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 59 1 59 59 70 100.0 4e-11 MKKVEIMSIAVFIVTLLIMGINMFIRPLSDWLVRVDGIVMLISAAVVVYMAVQSHNEKI >gi|229784127|gb|GG667608.1| GENE 29 30820 - 31683 510 287 aa, chain + ## HITS:1 COG:lin0782 KEGG:ns NR:ns ## COG: lin0782 COG0384 # Protein_GI_number: 16799856 # Func_class: R General function prediction only # Function: Predicted epimerase, PhzC/PhzF homolog # Organism: Listeria innocua # 4 287 3 282 282 270 47.0 2e-72 MHFIDVCVASAFSKNNRGGNQAGVVLQGASLNRKEKMRTAHQLGYSETAFLSQSELADYK LEYFTPTGEVPLCGHATIAAFVVLNYFGQLKKCEYTIETKSGILSIAVKDGGVIFMQQNT PEFYETLELADVVDCLGFNADTAPGNLPVQIVSTGLRDIIVPVENPLVLETMVPDFAAVS KLSQKWNCVGIHAFSLVEDGKLTAVCRNFAPLYGIDEESATGTSCCALACYLYQYGVKRN QYVFEQGRSLNRISELYVSVNAGDDTVDGVWIGGCGYLESLKTIEVE >gi|229784127|gb|GG667608.1| GENE 30 31736 - 32425 543 229 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619210|ref|ZP_06112145.1| ## NR: gi|266619210|ref|ZP_06112145.1| hypothetical protein CLOSTHATH_00213 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00213 [Clostridium hathewayi DSM 13479] # 1 229 1 229 229 480 100.0 1e-134 MAAGFMLTVTLCLLILSAAVLFVLIQLVKRCAAAAIDDIMGVEEGECVSKLEQVRLLKEL GISIVPEEGEQEFILGLTPSENEYLASHPYYGFCILAGSRNALNCVFSTGDRECIYQWDS YGKILKGLKAVSGLPFEEISGMERYAVTFRLNGRDYSWKARKNGDWMDTGMAGYLNRILD RQNCFGDRRFYLDNSHEAPLYLYASGAMADEVNRKTGLKFRPARTAHSH >gi|229784127|gb|GG667608.1| GENE 31 32446 - 32607 118 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGSEKSDTAEMEYGMVKSRRIKKRLKLAGAVCGELGSGEKDGTEDRRIPFFQF >gi|229784127|gb|GG667608.1| GENE 32 32525 - 33040 436 171 aa, chain + ## 
HITS:1 COG:mlr0125 KEGG:ns NR:ns ## COG: mlr0125 COG3757 # Protein_GI_number: 13470422 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Mesorhizobium loti # 5 166 145 306 317 117 37.0 8e-27 MQERFAENWEAAKKTELRIGAYHFFSFDSPGESQLKHFTETVPAFDKMLPPVVDFEFYGD KKVNPPDQESVCTQLEVMLQGLEAYYGVKPVIYATEDTLQFYLEGRFEEYPLWIRNVVKK PDTGNRDWLFWQYTNRKRLAGYEGDETYIDVNVFCGSRDDWELWSGNFKME >gi|229784127|gb|GG667608.1| GENE 33 32973 - 33146 143 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSMCFAAAGTIGSSGQGILKWSEIYRMGCECFRYEPQSGSDGRYQAVLEVQERVNNG >gi|229784127|gb|GG667608.1| GENE 34 33139 - 33699 625 186 aa, chain + ## HITS:1 COG:FN1102 KEGG:ns NR:ns ## COG: FN1102 COG1859 # Protein_GI_number: 19704437 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNA:NAD 2'-phosphotransferase # Organism: Fusobacterium nucleatum # 6 181 2 178 179 177 50.0 1e-44 MDKKNDKMKKMSVFLSLVLRHQPDAAGITLDEHGWADVEALIEGFQKTGRPLDLEMLKEI VRTDEKGRYSMSEDGTLIRANQGHSIPVDVELKETKPPKVLFHGTAERFLPSIMTQGLKS MSRLYVHLSADYDTAVKVGKRHGKPVVLRVDAERMARDGAVFYLSENGVWLTGPVDSRYL EVMAGE >gi|229784127|gb|GG667608.1| GENE 35 33713 - 34594 760 293 aa, chain + ## HITS:1 COG:BH3365 KEGG:ns NR:ns ## COG: BH3365 COG1091 # Protein_GI_number: 15615927 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Bacillus halodurans # 1 225 1 218 283 100 27.0 3e-21 MERWMVTGTNGFLGSRIMEYYQDKYEITGANHGNLDITDEGAVTSFVKKARPRLVIHCAA ISNTGTCQENPGLSEAVNVKGAVNLARACRETGSRLIFMSSDQIYAGNRTMEPGKEENTP KPVNVYGLHKRQAEDEIMAILPEAACLRLPWMYDFPWRGLKSNSNLLGNLLKALIHNRPL TLPVYDYRGITWAMEVVKHVEAAGALPGGVYNFGGGNTLSTYETAGKVLAMVTEGEDRSG LLIPDRERYAGQPRNLLMDTEKIRGFGIIFPDTVEGFQRCFEESPEYMLGLIR >gi|229784127|gb|GG667608.1| GENE 36 34875 - 35639 809 254 aa, chain + ## HITS:1 COG:ECs3753 KEGG:ns NR:ns ## COG: ECs3753 COG1319 # Protein_GI_number: 15833007 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide 
dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli O157:H7 # 5 246 3 247 259 59 24.0 6e-09 MVRFKNYVKAGSLSEAYELNQKKSSIVGGGMMWLKLQNRVKMTLIDLSGLGLDTIEETEE EFRIGCMCTLRMLETHEGLNTYFQGVFKECTRAIVGVQFRNGATVGGSVFGRFGFSDIMT CLMALDTYVELYQGGVVSLKEFNHMKYDRDILVRIIIKKDGRRAAYAAQRRSGTDFPLIA CCVAEKEGTWYVSAGARPFRAEVVEAAAVDGRVDCRAVAERFRFGNNTRGSAEYRKHLTE VYTGRLIKLLGGEV >gi|229784127|gb|GG667608.1| GENE 37 35643 - 36104 629 153 aa, chain + ## HITS:1 COG:SSO2433 KEGG:ns NR:ns ## COG: SSO2433 COG2080 # Protein_GI_number: 15899181 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 1 143 13 157 171 106 37.0 1e-23 MEIEFTLNGKVTYAEIKDDTSLFDLLRAKGCYSVKCGCETENCGLCTVLVDGKSVLSCSM LAARVNDRDVVTLEGVQKEAEEFGMFLAGEGAEQCGFCSPGLIMNVLAMEKELNHPGEDE IKEYLAGNLCRCSGYMGQMRAIRKYLARGGEAR >gi|229784127|gb|GG667608.1| GENE 38 36101 - 36376 461 91 aa, chain + ## HITS:1 COG:ygfN_2 KEGG:ns NR:ns ## COG: ygfN_2 COG1529 # Protein_GI_number: 16130783 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli K12 # 14 88 12 86 790 71 44.0 3e-13 MNMEMHAVNSPVIKVDAKALVTGKPVYTDDLAPKDSLIVKVLRSPHAHALIREIDVKAAS EVDGIACVLTYKDVPDQRFTMAGQTYPELAS >gi|229784127|gb|GG667608.1| GENE 39 37379 - 39337 2076 652 aa, chain + ## HITS:1 COG:Z4220_2 KEGG:ns NR:ns ## COG: Z4220_2 COG1529 # Protein_GI_number: 15803418 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli O157:H7 EDL933 # 1 652 109 790 790 355 34.0 1e-97 MAGETEEAVDRALRLIKVEYEVLEPVLDFRQAKDSEILVHPEDNWKSLCPVGADNRRNLC ACGGEENGDVDGTLEKCEVVIDRVYHTKANQQAMMETFRAYTYLDTYGRLNVVASTQIPF HVRRILAHALDIPKSMVRVIKPRIGGGFGAKQTVVAEVYPAIVTMKTGRAAKMVYTRYES QIASSPRHEMEIHVRLGADAEGTIKAIDVYTLSNTGAYGEHGPTTVGLSGHKSIPLYRTP 
EAFRFRYDVVYTNVMSSGAYRGYGATQGIFAVESAVNELAAALHMDPVALREKNMVKEGD VMPAYYGEKLNSCALDRCMARAKEMIGWDQKFPRVDMGNGKVRGVGVAMAMQGSSISNVD VGSVEIRVNDDGFYTLMIGASDMGTGCDTILAQMAADCLECDMDQIVVHGTDTDVSPYDS GSYASSTTYLTGMAVVRACEELRRKIVDKGAEYLNCCGETLEFDGKRVYQPDGELEISLK DIGNRVMCFNEKMLSAGYTHTSPVSPPPFMVGMAEVEVDTETGDVRLIDYAAVVDCGTVI NPNLARVQTEGGIAQGIGMALYEDITYTEKGRLIENSLMQYKLPTRLDVGNVRVEFESSY EPTGPFGAKSIGEIVINTPSPAIAGAVANAVGVQIRELPITAEKVYKGMREL >gi|229784127|gb|GG667608.1| GENE 40 39533 - 40438 1152 301 aa, chain - ## HITS:1 COG:FN2009_2 KEGG:ns NR:ns ## COG: FN2009_2 COG1732 # Protein_GI_number: 19705305 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) # Organism: Fusobacterium nucleatum # 29 300 33 304 307 272 53.0 6e-73 MKKWKKIAAAAVIGTAAIAALSLGGCAKKEQGTIKIASKPMTEQYVLTEIIGQLIEHDLG VKVEITKGVGGGTTNIQPALLKGDFDLYPEYTGTGWLTVLKKEQENDPDVLYANLQKGYE EMGLHWTGLYGFQNSYVLAVRKEAADQYSLKTFSDLAAASNQLVFGGNPDYMEREDGFNY LASAYNMNFKDVKDIDIALKYTAMADKQIDVTNAYTTDAQLSVADVSLLEDDQHIFATYY GATVVRQDTLKKYPKLSETLEKLTGQISDDEMRAMNYAVEVEQKDEKEVAKDFLTEKGLL N >gi|229784127|gb|GG667608.1| GENE 41 40520 - 41242 346 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 229 1 230 245 137 34 2e-31 MNVIELKDITMSYGPNRILEHFNLSVEQGTFLTMIGSSGCGKTTALKLMNGLLTPEQGTV TINGTDISRTDINELRRGIGYVIQEIGLFPHMTIEKNISYVPNLYKSPDKAAIAARARQL AGTVGLDTDMLKRYPSELSGGQRQRVGIARALMNSPKIILMDEPFGAVDEITRRRLQEEI LRIHEELGGTIVFVTHDIKEALKLGSRVLVMDHGNIIQDGSPTELLEHPATDFVRRLVAG >gi|229784127|gb|GG667608.1| GENE 42 41271 - 41915 776 214 aa, chain - ## HITS:1 COG:FN2009_1 KEGG:ns NR:ns ## COG: FN2009_1 COG1174 # Protein_GI_number: 19705305 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, permease component # Organism: Fusobacterium nucleatum # 1 206 1 206 206 186 57.0 2e-47 
MIHGIFSLYVERWPFFQELILQHIKIASTAILISGTIGLFLGILISEHRKFATWIIGIIN VAYTIPSISMLGFLIPFTGIGNKTATIALTIYGLLPMVRNTYTGITTIEASTIEVAKGMG STPWQILYKVKLPLALPVIVAGLRSMVVMTISLSGIASYIGAGGLGVAIYRGITTNNAAM TYAGSILIAIVALASDQLVAYIERHIRKKWHLAP >gi|229784127|gb|GG667608.1| GENE 43 42106 - 43524 1719 472 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 28 156 545 669 744 65 34.0 3e-10 MKRQMKKWIIPCAAALLTIGASMTAFAAQGWSMEGDNWVYLDSSGNRVTDSWKKSGDNWF YLDSNGDMAKSSLIEDDDNYYYVNSVGAMVSNEWREIASDYEEEDSPDTYWYYFQNNGKA YKASSSGKTSFKSIRKANGEWKKYAFDSEGRMLFGWVGEDSERITGDDAWREGVYYCGNS DDGAQVINAWSLQEVEDPDTEDDNFNGTYWFYFGSNGKKTKETTKTINGLKYRFNENGAA ESEWYAKASTSAASGSNLYYNLPEQCWLAKGWFKTIPGVEVDPEAYDDGTEYWFYAQNNG ELVKSQIKTINNYRYAFNEKGEMLHGLYMLTFDDNKKIETYEEIESESDLPDTDDERDVY YFGDSPKEGVMATGKTTIDIDGEKYTYNFRKSGSDKGAGYNGIYDDSIYIKGRLLKADRD AKYEVVEYDGKDYLIGTSGKLAKGKKNLKNGDDKYFSTNKQGIVTYEGYEKE >gi|229784127|gb|GG667608.1| GENE 44 43906 - 45051 1159 381 aa, chain + ## HITS:1 COG:BS_licR_1 KEGG:ns NR:ns ## COG: BS_licR_1 COG3711 # Protein_GI_number: 16080911 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 17 380 17 390 499 72 23.0 1e-12 MYTKTFQVVQYLADTLESYVTSGQLGAHLGVSSRMILRYIKEAEALGKENGFEIRSYKGR GYQIKITDQARCRTFFKEMGSHAGSQGDELIREMVRRILISDGCKMEELEELFHYSRSSM SRITTLSNSYLENFGLELFSKAYTGLYISGNEISIRDCMYHLLEKAGDETIWDSLDIREV PEKDIRKWIGRCLKKYGISAKEKEEQQFFKYLAITIKRISLGKEIYFGYLAHLDGKEPFK EQMKTTRFLLKKYFPKNRLEGECMYLTLIWMQTLGGNCRFNQIDEHNLMFFHGLVMKGMK KIRNNYGVDLSGDEALVNGLILHVASSYGKYLVHMETENPFYEELQKIYPTAYYYAIELA ECITEYTKQNLSVGELGFLAS >gi|229784127|gb|GG667608.1| GENE 45 45968 - 46726 670 252 aa, chain + ## HITS:1 COG:no KEGG:Thebr_0341 NR:ns ## KEGG: Thebr_0341 # Name: not_defined # Def: protein-N(pi)-phosphohistidine--sugar phosphotransferase (EC:2.7.1.69) # Organism: T.brockii # Pathway: not_defined 
# 1 244 407 645 652 79 24.0 1e-13 MYFASSLEQENKIRKFKTAFISETMPGAARLLKSRLEQIYAGLEVLDIGEAGEGNALPEE ADFYITMLPTGRENVLGKEVVTVSPFLNEKDQMRINAALTRLKRSGSLESLCGRDSFFLW DKPVRKKGVLTRICDLMIEQGRLKEEEKNAVMKRESLVSTEISPFAALPHCLIDGESFFV FVLMKNPVPWGKANVKLVILGCFKRGDEKIKEVLERLFLMVSDEQWINKLAGSKCYEEFV TYLKEFDGGYLC >gi|229784127|gb|GG667608.1| GENE 46 46720 - 47172 429 150 aa, chain + ## HITS:1 COG:BH0221 KEGG:ns NR:ns ## COG: BH0221 COG1762 # Protein_GI_number: 15612784 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Bacillus halodurans # 6 144 4 141 147 111 39.0 5e-25 MLTDILTKDVVRLDVEGLTTPEEVIHFSGQLLVNSGKVKETYIVKMEEAFHDLGPYMVMA PGIAMPHARPSGDVSEPCISFIRLKDPVSFHHPFNDPVKLVFTLGGVENDSHLALLQELG RFLEDDKVRERLLTITSYEELEKLTEKESL >gi|229784127|gb|GG667608.1| GENE 47 47184 - 47432 342 82 aa, chain + ## HITS:1 COG:no KEGG:LM5578_2173 NR:ns ## KEGG: LM5578_2173 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: Phosphotransferase system (PTS) [PATH:lmn02060] # 1 82 6 88 93 95 62.0 5e-19 MCGMGFGTSLMLLMTIKDIGKKHHVTVEGEAVDLGSYKGKTCDLIAASSEIAKQIVADVP VLPVTNLLDKKGIEELILPYMK >gi|229784127|gb|GG667608.1| GENE 48 47474 - 48757 1509 427 aa, chain + ## HITS:1 COG:YPO2782 KEGG:ns NR:ns ## COG: YPO2782 COG3037 # Protein_GI_number: 16122986 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 4 419 7 416 418 290 40.0 4e-78 MIKFLVDFLSQPTVVLGLVAMIGLIAMKNSVSDVISGTFKTILGFLILSAGSNVIVGALV PFSSMFNSAFGMAGIVPEDNSLVAAVQTVLGFETPLIMLFSFLINLLLARITPFKYIFLT GHMMFSFAGTMAIVLDQMGIRGWEAVAIGSVVQGISMIVFPAIAQPFVRKILKTDEVAFG FWGSSLVCLSGFVGGLFGKKDRDPEQDTWEKVHVSEKFNFLKDMSILMAIVMIVVYILTA LIAGRETVTALSGGQNYIIYTLIQALTFVAGVLVLLQGVRMFLGEIIPAFKGISEKLVPG ARPALDIPIFYSVGPMATTVGFLAAMVGGIISTMITTRMNVVVLPGVIGLFFMGGAAGVF GDKLGGKKGAVAAGLFLGVFFTLIVAAAYPFVNVGAYGVEGLWFASTDAIIVSVLMRLVG MVFGVPL 
>gi|229784127|gb|GG667608.1| GENE 49 48828 - 49745 1023 305 aa, chain + ## HITS:1 COG:BH0225 KEGG:ns NR:ns ## COG: BH0225 COG1735 # Protein_GI_number: 15612788 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Bacillus halodurans # 4 303 3 328 331 236 39.0 4e-62 MEKFARTVLGDIKGEDMGFTYPHEHLHAVPPPCQKDRDLELSSYENSVQELFHFKEVGGR TLVEASTLDYGRDLKVLKRMAQETGVHIIATTGFNKHIYYPPWVAETSMEEIAGMLAKDV MEGGDGSDARAGFIKIGTYYNMIHPLEKKTAIAAATAQKLCNAPIWGHTEAGTMGMEVLD ILENEGVDLTTVALGHLDRNPDEYYLLKLADRGIYIQFDGPGKVKYHPDSTRVALIRSLI GHGYGRQLLISGDMGRASYLEAYGGGPGFRFIRTKFIARLLDEGISQQDIDTIFIENPKR WLAVF >gi|229784127|gb|GG667608.1| GENE 50 49757 - 50377 672 206 aa, chain + ## HITS:1 COG:BS_kdgA KEGG:ns NR:ns ## COG: BS_kdgA COG0800 # Protein_GI_number: 16079268 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Bacillus subtilis # 2 178 10 182 196 105 33.0 6e-23 MIESIRLMAIIRDVEPVYAAPIAETLLSEGITGIEVSLSDPEKGFGCIEQIQKKYSPGGI CLGAGTVTKKEEVDRLAAMKIPFFLTPGYDDDLVAYGLSKGMEVLPGVLTPGDVQKALNR GVRRLKLFPADAFGMSYIKSLKGPFPQAEFVAVGGVNETNVSQFLKAGFVGAAAGSNLVP RGAADGDLELIRQKARLYVQAMKREV >gi|229784127|gb|GG667608.1| GENE 51 50382 - 51098 789 238 aa, chain + ## HITS:1 COG:BH0226 KEGG:ns NR:ns ## COG: BH0226 COG1402 # Protein_GI_number: 15612789 # Func_class: R General function prediction only # Function: Uncharacterized protein, putative amidase # Organism: Bacillus halodurans # 13 223 11 223 249 135 36.0 9e-32 MAFIVQNMTWPMVKERLEVCSTAIVPIGSTEQHGYHLPLGTDVFLAEHLARLISDRTGAL VFPTLNFGYSWVWRDRIGTVSLPQDHLQLVLKDIVKSVERYGVTKLVFLNGHEANGASMK YAIRDIQDDTPVKVLGMFYPGLQAIYDKYMESPTWGGMFHACEFETSLMLSAREELVHMD LAQAEYPDRPPLYGMDNTSIGDLSVSGTYGDPTAAAKDKGDRMFEEFAAKAAELISKA >gi|229784127|gb|GG667608.1| GENE 52 51453 - 51515 56 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFGFKKKKRTAAFMHRQTGK >gi|229784127|gb|GG667608.1| GENE 53 51629 - 51931 246 100 aa, chain + ## HITS:1 COG:SA2326_3 KEGG:ns 
NR:ns ## COG: SA2326_3 COG2190 # Protein_GI_number: 15928117 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Staphylococcus aureus N315 # 5 85 54 134 147 77 41.0 9e-15 MLADTKHAFGLEAVNGAEILIHIGLDTVEFNGMGFTALKAVNDRVKKGTPVIKLDREYFQ SRNACLITPVIISNGTNYRFELENIGKKVVAKESVVIRFQ >gi|229784127|gb|GG667608.1| GENE 54 51950 - 52786 918 278 aa, chain + ## HITS:1 COG:BS_licT KEGG:ns NR:ns ## COG: BS_licT COG3711 # Protein_GI_number: 16080959 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 270 1 273 277 158 33.0 1e-38 MKVIKKINNNVVICLDHNNRELVAFGRGIGFPAAPYELDDLSKIDRTYYGVNSSYMGLVN EIPEEVFEVAAKIVDMAVNYIDCELSSNLVFTLADHINFAIERNQRNMNLRNPFLYDIRY FYEKEMDIGNLAVKMIRRYLKVSLPEEEAGNIALHFINAEALSEKENEFNVSDTVVEEVT YLIEKELNIRIKREDFNYSRFVSHLQYLMKRKDAVSSISSDNIKLYMEMKEEFPVIYQCV LKIRDYMAGKLDWELSEEELLYLILHVNRLYAREDCNR >gi|229784127|gb|GG667608.1| GENE 55 52917 - 54320 1525 467 aa, chain + ## HITS:1 COG:BS_bglP_2 KEGG:ns NR:ns ## COG: BS_bglP_2 COG1263 # Protein_GI_number: 16080978 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus subtilis # 100 450 1 355 364 227 38.0 3e-59 MANQSYKELAAKILKLIGGKENISFFTHCVTRLRFNLKDQSLAKEEEIGRLEGVLGTKIQ NGQFQVIIGNKVNEVYAAFCEISGLEKQEGIDENLDKVTAEKKKFSLGMLLEVISGCFAP VIPAFAGAGVLKGILTLVTTYGWMSGESGLYVMLNAASDAVFYFLPFILAFTSAKKFKVN EVLALVIAGIYMYPSILNNAGSTINVLGIDVSLIKYASTALPVLISVWVMGYIHRWLEKH IPSCLRVVLVSAILLLIMAPLNLIVIGPLGNNIAVLIGKAFQWLFDKAPVVGGFVDGFTR PLLVFTGTHVTLGPIMINNIQTLGYDMLSPVHCVQAMAAAGMCFGAFLKAKKEDNKAANF SAFISAFIGITEPALYGVAFRFKKPLLALMIGGGVAGGFVAALGAKAISFAMPALISLPI YVGSIPTVLAGLVIAFVLTAVLTYVLGFDENIEKDQKAIDAEKKNVI >gi|229784127|gb|GG667608.1| GENE 56 54334 - 55791 1117 485 aa, chain + ## HITS:1 COG:lin0288 KEGG:ns NR:ns ## COG: lin0288 COG2723 # Protein_GI_number: 16799365 # Func_class: G Carbohydrate transport and metabolism # 
Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase # Organism: Listeria innocua # 3 485 8 486 486 507 52.0 1e-143 MSFPNNFLWGGATAASQVEGGWNEGGKGLDTQDCKPQYPELTREQKNSWQYKQMTNEKYE AGKCCKETGIYPFRFGSDEYHRYKEDIALFAEMGMKIYRLSISWARIFPNGDDEKPNLAG IEYYKNVFAECRKHHIKVFVTMLHYAIPVHLVDEYGGWKNRKLVDFYLRYAKTLFENFKD DVDFWLPFNEINAGRFNPYNGVGLIKEREENYDQSVYQAIHHQFIANALTVKLAHEMMPG SKVGCMIARFCHYAATCNPADQLTQLFDEQYTNWFYTDVMARGRYPKYINRHFKQRNVTV HFEPGDEELLQKYPVDFVAFSYYFTQVSTADETWEKTDGNLVIANKNPYLESSEWGWQKD AAGLRITLNQIYDRYQKPMFIAENGLGAVDVPEPDHSVHDPYRIAYLRDHFKAMSDAIDD GVELTGYTMWGIIDLVSCGSIEMSKRYGVIYVDADDEGNGTFDRYKKDSFYWYKKVIESN GAELE >gi|229784127|gb|GG667608.1| GENE 57 55889 - 60136 3709 1415 aa, chain - ## HITS:1 COG:CAC2401_1 KEGG:ns NR:ns ## COG: CAC2401_1 COG1924 # Protein_GI_number: 15895667 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Clostridium acetobutylicum # 7 662 7 663 663 816 60.0 0 MNTYNTLGIDIGSTTVKIAILDGDHHILFADYERHFANIQETLAALLKKACDLLGPMDLR PAITGSGGLTLSKHLEIPFIQEVVAVAASLTDYAPQTDVAIELGGEDAKIIYFTGGIDQR MNGICAGGTGSFIDQMAALLQTDAAGLNEYAADYKAIYPIAARCGVFAKSDIQPLINEGA TREDLAASIFQAVVNQTISGLACGKPIRGNVAFLGGPLHFLPELRKAFIRTLHLSGDAII APENSHLFAAIGAALNAEERGALLSMEELITRLSSGIRMEFEVKRMEPLFADQADYDEFT KRHENHCVTKSDLSSYHGKCFLGIDAGSTTTKVALVGEDGTLLYSFYSSNNGSPLATAIR AMGEINSRLPADSQIVYSCSTGYGEALLKAAFLLDEGEVETISHYYAAAFFEPDVDCILD IGGQDMKCIRIKNGTVDSVQLNEACSSGCGSFIETFANSLNFSVEDFAKEALFAKNPTDL GTRCTVFMNSNVKQAQKEGASVADISAGLAYSVIKNALFKVIKITSASDLGEHVVVQGGT FYNDAVLRSFEKIAECNAVRPDIAGIMGAFGAALIARERYDETKATSMLSMEKILSLHYE TSMTRCKGCTNSCVLTINRFDGGRQFITGNRCERGLGKERAKREIPNLFDYKNHRMFDYE PLSADLASRGVLGIPRVLNMYENYPFWAVFLKELKFRTVLSPQSTRKIYELGIESIPSES ECYPAKITHGHIEWLIKQGIKTIFYPCIPYERNETPDAGNHFNCPIVTSYAENIKNNVEE LDTENVRFLNPFMAFTNEEVLTKRLIEVFNEEFQIPADEVKEAAHKAWEELLASRRDMEK KGEETLQWMKENDRRGIVLAGRPYHVDPEINHGIPELITSYGFAVLTEDSISQLGEIERP LVVTDQWMYHTRLYRAASYVKSQNNLDLIQLNSFGCGLDAVTTDQVNDILTGSGKIYTVL
KIDEVNNLGAARIRIRSLIAALRVRDKKHYERKVVSSAYHRITFTKDMKKDYTILCPQMS PIHFDLIEPAIRSFGYQVEVLQNDNRSAIDTGLKYVNNDACYPSLIVVGQIMDALLSGKY DLDHTAIFMSQTGGGCRASNYIGFIRRALERAGMGQIPVISVNANGMETNPGFSITLPLL TKAMQAVVYGDIFMRVLYATRPYEKEPGSANALHQKWKARCIKSLSKRVPNMMEFSRNIS GIVRDFDELPRISGLQKPKVGIVGEILVKFSPLANNHIVELLESEGAEAVMPDLMDFLLY CFYNSNFKAANLGGKKSSASLCNMGISLLEYFRKAARRELMKSKHFTAPARIANLAEMAK DFVSIGNQTGEGWFLTGEMLELIHSGVGNIVCTQPFGCLPNHIVGKGVIKELRKAYPNSN IIAVDYDPGASEVNQLNRIKLMLSTAQKNLKNIDI >gi|229784127|gb|GG667608.1| GENE 58 60276 - 60407 71 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSYFPLYLAYLDLYIIESDPFFVKRTLNRFGDLTQILLYFNTF >gi|229784127|gb|GG667608.1| GENE 59 60406 - 61347 1203 313 aa, chain + ## HITS:1 COG:CAC0294 KEGG:ns NR:ns ## COG: CAC0294 COG0598 # Protein_GI_number: 15893586 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Clostridium acetobutylicum # 1 313 1 315 315 295 50.0 9e-80 MIRIFKTENGLIHQKDELSPGSWIALTNPTATEILEIANTYGIDPDDLRAPLDEEERSRI ETEDNYTLILVDIPSIEERNDKDWYVTIPMGIITTEEVIITVCLEETPVLGAFMDGRVRD FHTYMKTRFILQVLYKNASLFLQYLRIIDKKSGVIEEKLHQSTKNRELIELLELEKSLVY FTTSLRSNEVVLEKLMRNEKIKKYPEDTDLLEDVIIENKQAIEMANIYSGILSGTMDAFA SVISNNLNIVMKFLATITIVMSIPTMVASFYGMNVNSRGMPFADSPYGFVIVMGFTLVLT LIVAWIFSKKDLF >gi|229784127|gb|GG667608.1| GENE 60 61381 - 62226 847 281 aa, chain + ## HITS:1 COG:no KEGG:Closa_0669 NR:ns ## KEGG: Closa_0669 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 281 1 281 281 418 77.0 1e-115 MKFFYNFERKFKRYAIPNLMYYIILMYGVGLLMEMFAPGFYWRFLSLDAQAILHGQIWRI VTFMIYPPGGGIFGSLIAMYLYYMVGNTLERVWGAFRFNVYFFMGIIGHVAASLILYVFF GTPVYLTTEFLNFSLFFAFAATFPDLEFLLFFVIPIKAKWLALFNGIYFLYGFVKGGTSI RVTIVLSLLNFILFFFMTRNFNRVNPKEIKRKRDFQRQTKIMPQGSTHHKCAVCGRTEKD DPNLEFRYCSKCEGSYEYCSEHLYTHKHVTREEDEARKTMN >gi|229784127|gb|GG667608.1| GENE 61 62245 - 63681 1789 478 aa, chain + ## HITS:1 COG:CAC0518 KEGG:ns NR:ns ## COG: CAC0518 COG0469 # Protein_GI_number: 15893809 # Func_class: G Carbohydrate 
transport and metabolism # Function: Pyruvate kinase # Organism: Clostridium acetobutylicum # 1 478 1 473 473 441 49.0 1e-123 MKKTKIICTMGPNTNDRNLMKALAENGMDVARFNFSHGDYEEQKMRLDMLKSVREELDLP IAALLDTKGPEIRTGVLKDGKKVTLKEGQTYTLTTDDIIGDETMGHITYERLHEDVKAGN KILIDDGLIELDVVEVNGNNIVCTVVNGGELGEKKGVNVPNVKVKLPALTEKDKADILFG IEQGFDFIAASFVRTAAAILEIKEILSEHGSNMAVIAKIENAEGIENLDAIIEASDGIMV ARGDMGVEIPAQEVPYIQKMIIEKCNVACKPVITATQMLDSMIRNPRPTRAEVTDVANAV YDGTDAVMLSGETAMGKYPVEALSMMASIVEETEKHLDYSAYRQRRVSAANVHNISNAVC SSSVGTAHDLNAKAIVAPSITGFTTRMLSKWRPEALVIGLSPSASAVRQMQLYWGVKPFH AKRAESTDVLIYSSIELLKAKNIVKEDDLVVVTAGVVSPTSKHEPAAHTNILRVVTVD >gi|229784127|gb|GG667608.1| GENE 62 63717 - 64985 1603 422 aa, chain + ## HITS:1 COG:SP1978 KEGG:ns NR:ns ## COG: SP1978 COG0019 # Protein_GI_number: 15901801 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Streptococcus pneumoniae TIGR4 # 3 414 2 414 416 557 64.0 1e-158 MEKRPFVTKEKLEEIVKQYPTPFHIYDEKGIRENAKKLKEAFAWNPGYKEYFAVKATPNP FILNILKEFGCGTDCSSETELMMADALGFSGSDIMFSSNDTPAEEFRFAHKIGAIINLDD ITHIEYLENAIGEIPETISCRYNPGGVFKISNDIMDNPGDSKYGMTTEQMFEAYQVLKEK GAKHFGIHAFLASNTVTNEYYPLLAKILFELAVKLKEETGAHIQFINLSGGIGIPYRPDQ EPNDIMAIGEGVRKAYEEILVPAGMDDVAIYTELGRYMLGPYGALVTRAIHEKHTHKEYV GVDACAVNLMRPAMYGAYHHVTVMGRENDPCDHKYDVVGSLCENNDKFAIDRMLPEIHKG DLLYIHDTGAHGFAMGYNYNGKLKSAELLLKEDGSVEMIRRAETPKDYFATFDFCDILKD LL >gi|229784127|gb|GG667608.1| GENE 63 65393 - 66103 853 236 aa, chain + ## HITS:1 COG:CAC2831 KEGG:ns NR:ns ## COG: CAC2831 COG0670 # Protein_GI_number: 15896086 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Clostridium acetobutylicum # 24 236 16 231 231 128 39.0 1e-29 MDYNSTDQPYVYSDSTSAEPLGKYTAKTFGWMFAGLLITFLVAAFGYTTGTIIYVFAVPY AYLLLGVAEIAVVLFLSARIHKMSVGTARALFFTYAVLNGVVFSAYFLMYDMVDLILVFG ATSIFFGVMAVIGYVTNADFSRIRNFLMGGLIFLVVFWLLSMFINLGQFEMIVCYIGIFI FLGFTAYDTQKIRAYHQYYAQDPEMAAKASIFSALQLYLDFVNLFLYIIRIVGRRK >gi|229784127|gb|GG667608.1| GENE 64 66265 
- 68733 2433 822 aa, chain + ## HITS:1 COG:PA0285_2 KEGG:ns NR:ns ## COG: PA0285_2 COG2199 # Protein_GI_number: 15595482 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 644 816 1 175 175 103 36.0 2e-21 MGGRRWRGRIGILFTVLFCLLLFSFTARAAENGSRVVRVGYPVQQGFTEISEDGTYSGYT YEYLQEIARYTNWEYEFVTAEGTLDEQLSGLLTMLRDGEIDLLGAMNYSERLAELFDYPE YSYGTAHKTISVLKDNLEYTEFDYDSFNDIRVAVVSFDGQEDEKLNGFANMNGFRVNQVL CSTGEEQMELLKTGKADAVLSMDVSLSDRELRVISRFDSRPFYFGVTKGNQAVAGELNKA MEIITATNPFLTAELNRKYFTGEDGLVLSNSEREYISQRGTVRAGVMLGRAPLQYQAKNG EIRGISIEVLKYIEEQTGLSYEIVPFHNWEEYSRAMEDGAVDLVVGIAGDYEVTGAESCA LTLSYLSVPLQLIMNENVKALTDLSGKKLALQKSLLTEQSLQGELHYYDNVEECMEAVHK GEADYCYGNSYAVQYYMADHNYKNLFTFAQSENWTQKYCFGIRRPVDITLLSIMNKVIRA LPEEQVNRFLYQNAYDAGNLTFSEYVMRNPEQSILYLVLFWLLVVTAVLITAEILRRRNM ARKALENQRYEQLSQLSNEFLYEYNIREGCLNLTEQTAGFLGCEKKIRRPEALKQKYPVF GYMISQEENHTEHECLLPGGNVRWLKVISKMVTDTSGRPWYAVGKLVDIQTEREIKEQLE AKAQTDSLTGVYNSATSHQRMRQALMAEGRDGSGAMIIMDIDYFKQINDYLGHYTGDQVL KETAGILKACFRADDIVGRLGGDEFVVFLMEVNDPDLVKQRCTEVLDRVKNVTMEKHGQE VTLSIGAAMAGGETDYDVLYRKADQALYQIKKNGRSGVLVVE >gi|229784127|gb|GG667608.1| GENE 65 68821 - 69981 1144 386 aa, chain - ## HITS:1 COG:mlr1185 KEGG:ns NR:ns ## COG: mlr1185 COG0624 # Protein_GI_number: 13471265 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Mesorhizobium loti # 1 377 9 412 425 160 31.0 5e-39 MDNHNSDAMELTMKLVRIDSTDPGACEGEIGTFIFQYLKDLGVPVIKKEVLPGRFNIMAK VEGEEDAPALVYICHMDTVTTGEGWTVSPFGAEVIRGRIYGRGACDMKSGLACALSAFTS MALKARNGKKPGRSLVFIGTVDEEDFMRGAEAAIADGWVSKDSWVLDTEPTNGQIEVAHK GRTWFEITVNGITAHASTPWKGADAIAAMAEIIAAIRRRIAACPVHPDLGASTVTFGQIE GGYRPYVVPDSCRVWIDMRLVPPTDTAGAAAIVEDAIAAATKEIPGITAAYQITGNRPYV EKDEQSPLLNALSRACEEVTGEPAPVSFFPGYTDTAVIAGTLGNHNCMSYGPGDLELAHK PDEYVPCEDILRCEEVLTRLADNLLF >gi|229784127|gb|GG667608.1| GENE 66 70081 - 70971 955 296 aa, chain + ## HITS:1 COG:BH1930 KEGG:ns NR:ns ## COG: BH1930 
COG1404 # Protein_GI_number: 15614493 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Bacillus halodurans # 18 272 143 414 444 214 44.0 2e-55 MNHAREVIRCEAAYRMGLTGRGIGVAVLDTGIYLHEDFEHRVAAFADMVHRRREPYDDNG HGTHISGIIGGSGTSSDGRYQGVAPECHIIMVKVLDQKGNGYASDVLAGIKWIRENRERL GIRIVNISVGSFTKKGMSENSVLVRGVNAAWDDGLVVCVAAGNMGPGKNTITTPGISRKV ITVGCSDDYKEVNVMGNRMIDYSGRGPTGACICKPEIIAPGAGIISCAPETGAYQSKSGT SMSTPLVSGAIALLLQKYPYMSNRDVKLRLRERAVDLGLPMNQQGWGMLDVERLIE >gi|229784127|gb|GG667608.1| GENE 67 71069 - 72367 1130 432 aa, chain - ## HITS:1 COG:ECs0875 KEGG:ns NR:ns ## COG: ECs0875 COG0513 # Protein_GI_number: 15830129 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 1 395 1 389 455 401 52.0 1e-111 MTFKDLQLSAPLLKALEEKGYTTPSPIQEKAIPHVLAGKDVLGCAQTGTGKTAAFALPII QNLMRPSDKKHSKRVIRSLILTPTRELALQIAENFKEYGSHTPVRCAVIFGGVSAVPQIK ELQRGIDILVATPGRLNDLIHQGEISLSHVEMFVLDEADRMLDMGFIHDVKKIISLLPVK KQTLLFSATMPPEIQALTEKLLHNPAVVEVTPVSSIVDLIEDSLYYVDKENKRALLVHLL KREAITSTLVFTRTKHGADRMAKFLTKNRINAAAIHGDKSQGARQKALSQFKAGTVRVLA ATDIAARGIDIEELSCVINFDLPNVPETYVHRIGRTGRAGLGGRAISFSDIEEKAYVEDI EKLIGKKIPVVKDHPFPMTVFTVPQKETKPRPVKNTGRNDAKRPQKSKVAQTARPRTRVS VNASAGKRTAAK >gi|229784127|gb|GG667608.1| GENE 68 72559 - 74022 1195 487 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_0170 NR:ns ## KEGG: Dhaf_0170 # Name: not_defined # Def: sodium/sulfate symporter # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 4 485 19 504 507 230 33.0 1e-58 MAQTKKKDFYSIHLVIGIGIMLLFRFLPIQLPCITPIGMEILGIFIGTLYLWTTSDPIIS SILSIFLVGLTGYSTMNEVLTATFGNPILVQVLFLLTAINCLIENNLTEYCGRFFLTRKI CLGRPWVLAFFIMLGSMLMGAFMGGFTPIFLFCPILYDIFETVGLKKHDKFPTIMLILVT VATLLGFPIPPFMGNGLALISNYASVTGNMGAVIEINNAGYLLTGLIHATVCIIVLVLFC KFVLRPDTSKLKELKMETLNKNPLPPMSLRQKFIAVSFTVFILILLIPSVIPAHPVGRFV 
KANTYGIAAFYVFLLCAIRISKKPLMDFSSTMKRFSWSTYFLIASAILLGNALTNESTGV SAFLNTVLMPVLQNVPASAFAVIIMILTVVLTNVCNSFVIGLLMQPVIATFCMATGMNSA PVVSMMILFVLSSAAVTPAASPFAALLFGNKDWLKSGEIYKYTTIFVAIELAIVLLVSMP VANMLIR >gi|229784127|gb|GG667608.1| GENE 69 74198 - 74878 777 226 aa, chain + ## HITS:1 COG:no KEGG:SGGBAA2069_c09090 NR:ns ## KEGG: SGGBAA2069_c09090 # Name: not_defined # Def: cAMP-binding protein # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 1 225 9 222 223 95 31.0 2e-18 MDQNGVKQYFERYGMRIEVKRETSICNPRLSDSYIYYLVEGIASLTSLTMDGEEKDFIYF PHDHLLGFAPALMRHYRKVRGEDYTLSGDDCVQEKIPFGIDTKTDCVFYRLDERTFETLL EEDACFLSYIMEAVTCNYVTLVRKFHDTQEECASKRLYKWFLAFSSREGQYRAVPHGFTY AEIAKYLGMHPVTVSKLSSALKKSGIIKKEKGRILIIDEARLKALI >gi|229784127|gb|GG667608.1| GENE 70 74913 - 75368 370 151 aa, chain + ## HITS:1 COG:no KEGG:ELI_2084 NR:ns ## KEGG: ELI_2084 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 151 1 151 152 203 64.0 2e-51 MQDFMKALTSERKSTALPEEFDYFGKLIGSWQINYVESDSSRVIKGEWHFSRVLEGMAVQ DVIILPDYEYGTSLRIFNPDTHAWDVAYGYTGNIIRLEARKQGEMIVLTFTGDERRKWVF TEIGEHKFHWQNITVKDDGEWLINAEIDAER >gi|229784127|gb|GG667608.1| GENE 71 75503 - 77368 2111 621 aa, chain + ## HITS:1 COG:no KEGG:HMPREF9243_1609 NR:ns ## KEGG: HMPREF9243_1609 # Name: not_defined # Def: hypothetical protein # Organism: A.urinae # Pathway: not_defined # 4 620 5 628 629 694 55.0 0 MAGAVYSFEEHMIKKQARAEEAMLKELRSGAYTIDNPLVSYNLYLINPLSAVISFYTEEE TAVTVTVLGKAKEGNITHTFPRAKEHVLPVVGLYSGYVNRVEIREYRGKVHTVEIEVPDV FDGDSPLESMDTTPEYLQDDCIFLSPSGAELAVAFDYAGDVRWCLNIKCVFDMKRLKNGH ILMGTDRLVQMPYYMSGMYETSACGKIYHEYRLPGGSHHDAFEMPDGSLLCLTEDLTSDT VEDMCVLIDRNTGEILKTWDYKNFLEPGLGKSGSWSEKDWFHNNAVWYDEKSHSLTFSGR HMDSIVNIDFETGRLNWILSDPEGWPQEWVDRYFFKPIGTDFEWQYEQHACLITPDGDVM CFDNHHWGSKIRENYRAAKNNYSRGVRYRINTKDMTIEQIWQYGKDRGAKFFSPYICNVE YYNEGHYMVHSGGSAYNKDGEISESLGALEQNMGGTLFATTVELCDDKKMLELHTKGNYY RAEKMKLYAEQGNLELGEGSVLGEMGVTKEFDTDIPAPVSSELIPDRYEARIEDEDDRFT FHAKFEKGQLVMLMLEKGEEKHRYYISTTAVSHKAMCCGTFLESDERVTKTNVNKAGMKG 
TYEVFVVIDDIKYPCGVSIKC >gi|229784127|gb|GG667608.1| GENE 72 77358 - 78227 782 289 aa, chain - ## HITS:1 COG:BS_yisR KEGG:ns NR:ns ## COG: BS_yisR COG2207 # Protein_GI_number: 16078146 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 49 282 36 282 287 90 24.0 4e-18 MYTNTGYLDIADADLEDDTIPLRINCCGVYRLLTRLSLTTIRATGRPDYQLLYIASGKAE FVLNGAAVTIPAGNMVLYGPGEYQQYTYLLEDRPEVFWVHFTGYEAADLLSRTGLSLEDF RILRTGCASRYQDLFLSMIRELQLPRPLSGDFSSLYFKELLLTVERQIAEGGKNKPQIQK EMEHAVHFFHENFSADIDISDYAASLHMSTCWFIRSFKQYVGVPPLQYLTSIRINKAKEL LESTDCPVSEIGSIIGYENPLYFSRIFKKQTGLSPAAYRKVLPSAHTST >gi|229784127|gb|GG667608.1| GENE 73 78379 - 81456 3022 1025 aa, chain + ## HITS:1 COG:ECs3958 KEGG:ns NR:ns ## COG: ECs3958 COG3250 # Protein_GI_number: 15833212 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli O157:H7 # 7 1025 16 1016 1042 578 33.0 1e-164 MIIPRHYENLHVLHENTAANRSYYIPAEERFDDLVENREHSGRFQLLNGMWGFRYFDSIY DLKDEFYREGYGEEGFGEIPVPGCWQNYGFDSHQYTNTRYPFPMDPPYVPEENPCGAYIH HFVYEPDEAAPKAFLNFEGVDSCYYVWLNGSYVGYSQVSHATGEFDVTPYIRAGENKLAV LVLKWCDGSYLEDQDKFRMSGIFRDVYLLKRPEQGIFDYFLKTRIDGQQAEINVSFTFFD EKMPVKCSVYDQEGSLSALAEDVPGDDGLSLTLDQPVLWNAEQPFLYTVVYECGGEVITD RLGIREIKVKDGAVLVNGTKIKFRGVNRHDSDPVTGFVISLDQMKKDFQVMKQHNVNAIR TSHYPNAPQFYQLCDELGFYVIDEADNESHGTNDIYMEDDHEAGRRERWNHAVSDNPAFT EATADRAKLLVERDKNRPCVLIWSMGNECAYGCTFEAALAWTKAFDSTRLTHYESARYRD SNKTYDFSNIDLYSRMYPPLEDIHEYFAGNPEKPFVMCEYCHAMGNGPGDFEDYFEVIEQ YDGLCGGFVWEWCDHGIYQGRTPEGRAMYLYGGDHGEYPHDGNFCMDGLVYPDRTPHTGL KELKNVNRPARVVSFCQEKREAIIHNYLDFTSFKEYASVAWEVTCDGVCIAEGTLEGDAV PDILPHGEAAVRLDFEVPEAGKCFLKLRYIRRESTRLVAAGTELGFDEIPLKNEDGRNQT VLHLLKRCGETCTGSREPGLFTVKEDDRRLYVSGPDFAHVYNKLTGVLEEMNVNQCRLLE RPMEYNLWRAPTDNDRYLKLKWQKAHYDLTYSRAYQTDYSVTAEGVRIHSVIAILSPVIQ RILNMEADWLIGADGRVDINLAVKRDTELPELPRFGLRLFLPKDMEDVTYCGLGPVESYR DKRRASSHGLFTANVAALHEDYIRPQENGSHDDCDFVTLTGRRAGLTAAGDQTFSFNASV 
YTQEELTEKNHNYELVRSPYTVLCLDYRMNGIGSNSCGPRLLEKYRLDEAEFTFGISLIP EGKAE >gi|229784127|gb|GG667608.1| GENE 74 81489 - 81584 70 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIPEQSRLFRYCFGMRKISENTDCAKNAAMV >gi|229784127|gb|GG667608.1| GENE 75 81816 - 82814 1401 332 aa, chain + ## HITS:1 COG:BMEI0787 KEGG:ns NR:ns ## COG: BMEI0787 COG0468 # Protein_GI_number: 17987070 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Brucella melitensis # 9 332 29 352 378 469 71.0 1e-132 MFALEDNMNKDDKLKALDAAITQIEKAYGKGSVMKLGDGSVNMNVETIPTGSLSLDIALG LGGIPKGRVVEIYGPESSGKTTVALHMIAEVQKRGGIAGFIDAEHALDPVYAKKIGVDID NLYISQPDNGEQALEITETMVRSGAVDIIIVDSVAALVPKAEIEGDMGDSHVGLQARLMS QALRKLTGVISKSNCTVLFINQLREKVGVMFGNPETTTGGRALKFYASVRMDIRRIESLK QGGEVVGNRVRVKVVKNKIAPPFKEAEFDIMFGRGISKEGDVLDLAVKEDIVEKSGAWFA YGGAKIGQGRENAKIYLQDNPAVCEEIENKVR >gi|229784127|gb|GG667608.1| GENE 76 83840 - 84457 785 205 aa, chain + ## HITS:1 COG:lin1801 KEGG:ns NR:ns ## COG: lin1801 COG2137 # Protein_GI_number: 16800869 # Func_class: R General function prediction only # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 10 189 12 195 269 59 25.0 4e-09 MTVTEIIPLDKRRSKVILDEDFALALYNGEIKRYHMEAGEDLPEETYREILEEILLKRAV ERVCYLLKSSDKTEQELRQKLKDGYYPGEAIDYAIEFLKKHRYINDEEYGRRYVEYHSAK KSKRQIQYELQRKGLTKEAVSDILEEHPVDEEAQIRDYMRKKRLEPEAMTQEERRKAMAA LGRKGFSFETVNRVLGSRYSDDGAF >gi|229784127|gb|GG667608.1| GENE 77 84606 - 86165 1270 519 aa, chain + ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 13 519 8 514 514 552 64.0 1e-157 MSVSVFTAALIAIVASVVVAFIAWTAAIAYRTKTYESKIGSAEEKSREIIDEALKTAETK KREALLEAKEESLKTKNELEKETRERRAELQRYERRVLSKEENLDKKSDAMEKREAGLAA REEALNKRNAEVESLYEKGIQELEKISGLTSEQAKEYLLKSVEDDVKHDTAKLIKELDNK AKEEADKKAREYVVTAIQRCAADHVAETTVSVVQLPNDEMKGRIIGREGRNIRTLETLTG 
VELIIDDTPEAVVLSGFDPVRREVARIALERLIVDGRIHPARIEEMVEKAQKEVETNMRE EGEAAALEVGIHGLHPELIRLLGKLKYRTSYGQNALKHSIEVAQLSGLLAGEIGLDVRMA KRAGLLHDIGKAVDHEMEGSHIQLGAELCKKYKEPAVVLNTVESHHGDVEPQSLIACIVQ AADTISAARPGARRETLETYTNRLKQLEDITNSFKGVDKSFAIQAGREVRIMVVPEQIND DGMILLARDISKKIEEALEYPGQIKVNVIRESRVTDYAK >gi|229784127|gb|GG667608.1| GENE 78 86280 - 86459 90 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619257|ref|ZP_06112192.1| ## NR: gi|266619257|ref|ZP_06112192.1| putative lipoprotein [Clostridium hathewayi DSM 13479] putative lipoprotein [Clostridium hathewayi DSM 13479] # 20 59 1 40 40 73 100.0 5e-12 MRYNFIAHKCTYCYPLEEAMKKELFNIFIITGVMIACAGCVMQRVETTLYLMEWSKAVH >gi|229784127|gb|GG667608.1| GENE 79 86639 - 87658 363 339 aa, chain - ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 7 315 6 309 333 122 25.0 8e-28 MNSSETVSIRTLAKLCNVSTATISRVLNHNDKVSPETRDKVLKIIEEHHYTLPVSKAAKP SPKSQGCIGVLLESSQSDYYLSLQRHIVLYFMNLGINVLHYNIEDQEKFESAALKTLYKS NIDGLILISVHSNDINQILNPNIPTVWIDCNDICMNDPDSLWVTSDHYIGGKLAAQEFIN KKCKHPIILTNATPTLRAASRIKGFSDTFEKYGVLIEDNQIVQMPGIVNPFQESKDMVQY MFMKGIPFDSIFAINDWRALGCMAGIEALGLRVPDDIKIIGFDGISLASNSVQRITSIRQ NTELIAQNACRLLMQQIKHTEILHRHIVVPTTLIEGSTT >gi|229784127|gb|GG667608.1| GENE 80 87933 - 89687 814 584 aa, chain + ## HITS:1 COG:CC0799 KEGG:ns NR:ns ## COG: CC0799 COG2272 # Protein_GI_number: 16125052 # Func_class: I Lipid transport and metabolism # Function: Carboxylesterase type B # Organism: Caulobacter vibrioides # 68 570 4 496 515 164 27.0 4e-40 MKKKVLSIIALGITAMMFCNACSGGTPGKESTVPVSTENTGATDTKESSSAEKAEETEFA QPVLKSSTEAIKQVAVQLDAGTLMGYEQSGIYHFKGIPYATADRFKGPVPVTKFDNEVQM ALTYGAVAPQDRSLNGTGEVNSHEFMTPSNGTADMVANENCQNLNVWSSDLKAAKPVVVF FHGGGLNNGASSELSYYTGEYFVESEDTVFVSVNHRLNVLGYLDLSEYGGEDYANSGISG IDDCVCALEWVQNNIAQFGGDPSNVTIIGQSGGGTKVTTLACMSNTVDLFDKIVVMSGNY STSSKQEGIENTKLLVDYLGLADDEVIDALSNMSYEDLFNAATEAGCSWTTHYGDGTFTN 
PLFDSETGMVNEYAAQRTWVWGSTFSEFNSNGEGLITGKTGLMYLPKTTDDMAMEALKET YGENAQEVADAFKMAYPDHALAEALYLSQGSSAISRYGIISPRDGILKKFNDSGIPVYNY MVTYKEPYFGGVTMHHTGDVAYWFNSLNTIPYQVQGDEENAYAVALEMSQALKNFIKNGN PSTDSLSWKPYTTDEHNTMVFDVKSELKTDYDTDLYETIMKSQN >gi|229784127|gb|GG667608.1| GENE 81 89702 - 91078 896 458 aa, chain + ## HITS:1 COG:BH0595_2 KEGG:ns NR:ns ## COG: BH0595_2 COG1263 # Protein_GI_number: 15613158 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus halodurans # 88 402 1 327 399 281 46.0 2e-75 MASKYDGLARIIIQNVGGKENIESLTHCITRLRFKLYDEGKAQTDILNETDGIVKVMQSG GQYQVVIGNHVSDVYAAVCERAHINGDAPTNGAVAKEKQNAFNAFISVVTGVFLPFMGVL AALGILKGVLALLVSFHVLNGAGGTYNILYSLADSLFYFFPVILGYTASKKFGLDEFTGL ILGATLVYPSMTSASVADVSNFLGIPVIMPATGDYTTSIIPIIIAVWFASILQKRIKKHI PDAVKSFITPLIVLIVTVPLTFLVIGPIASLVSNGISMFCNFLYGVSPAVLGLFVGLFWQ VLVMFGLHWAIVPIGINNIGLMGYDIVMPCMIATTFAQTGAVLAIMLKTRNSHLKGLCIP AAISAFCGVTEPAIYGITLPKKTPFFITCIIAGIGGMVTAILNVKEYSLGAMGIFAWTTF VGEGEVSGMIRAIIVSLAALAVAFAAVYAVYKDTEKEA >gi|229784127|gb|GG667608.1| GENE 82 92091 - 93521 823 476 aa, chain + ## HITS:1 COG:CAC1408 KEGG:ns NR:ns ## COG: CAC1408 COG2723 # Protein_GI_number: 15894687 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 4 473 5 473 477 678 66.0 0 MNDFPETFLWGGATAANQCEGGYQEEGRGLSTVDTIPYGEDRFPVMRGDKNMLSCDDEHV YPTHTAIDMFHRYKEDIALFAEMGFNCYRMSVAWTRIFPAGDEDKPNEAGLQFYDDIFNE CKKYGIEPVVTICHFDTPIQLIKEFGGWKNRRMIDYYLKYCKTIFERYRKKVKYWISFNE INMILHLPYMGAGIVFEPNDYKEQVKYQAAHHELIASALATKLLHSIIPSAKMGCMLAAG EVYPYSCKPEDVFDAMEKNRDNYFFIDVQSRGGYPSYAKKKLEELQIELQTDPEDEQILR DNTVDFISLSYYASRLTSTDPELLKDMTNGNVFETIKNPYLKTSEWGWQIDPLGFRITLN TLYDRYQKPLFIVENGLGAIDMMTEDEKVHDEYRISYLRDHINAMKDAMNKDGVDIIGYT SWGCIDLVSASNGMMQKRYGFIYVDINDDGSGTGKRIKKDSFYWYQKVIGSNGSVL >gi|229784127|gb|GG667608.1| GENE 83 93693 - 94265 509 190 aa, chain - ## HITS:1 
COG:BS_yckF KEGG:ns NR:ns ## COG: BS_yckF COG0794 # Protein_GI_number: 16077414 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Bacillus subtilis # 14 190 9 185 185 120 35.0 2e-27 MGYYQEQYLKVSNEVIQELDMTLRSIEPESVERLVNEILEADQVFFVGVGRVLLALQCIM KRLAHLGIKTHYVGEITEPAFTDKDLLIVGSGSGSSLFPLNIAKKAKTIGAKIVHVGSNP ESQMKEIVDFMVRIPVRTKLYLEDEIASVQPMTTLFEQSVLLLGDVIAKMIIEERQLDMK SLWRYHANLE >gi|229784127|gb|GG667608.1| GENE 84 94279 - 95085 364 268 aa, chain - ## HITS:1 COG:lin2527 KEGG:ns NR:ns ## COG: lin2527 COG0627 # Protein_GI_number: 16801589 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Listeria innocua # 1 264 1 250 252 192 37.0 4e-49 MAYAQCEIYSKILKKNVNLNILLPSATGFDIQNGAGCYYKRPRTVYKTLYLLHGSFGDGN EWVHFSNLRRFVHTYGMAVVMPSVEFSFSVNMEHGEHYTDYVACELPELLGSVFPLSEKR EDTFIGGLSMGGYGAYRCALEYPERYKAAVSLSGALDAAKLERTDTIQAQMMPESYRRAV FKDFTHLQGTSDDLVVLMKNQLRKGAVLPEFLMLCGTEDILYECNNAFYEQVKDLKADIE YRKYPGAHDWDFWDKHMIEMLNWLSTAK >gi|229784127|gb|GG667608.1| GENE 85 95102 - 95929 459 275 aa, chain - ## HITS:1 COG:lin2527 KEGG:ns NR:ns ## COG: lin2527 COG0627 # Protein_GI_number: 16801589 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Listeria innocua # 2 268 3 252 252 191 39.0 1e-48 MLAKLNIFSETLKFPTDLYVMVPTPVSDDYINGRETNYLKPDVKFQVLYLLHGAYGNHSD WLRYTNIERYAREHKLVVVMPDASNSFYQNMYYGSAYLSYLTDELPRVMQQMFPVSLKRE NTFVAGLSMGGYGAVRSAFERPDLFGYCASLSGALDIVSLINETAGDRYGNGITDVFKWN NIFEHPDQVAGSDADLLFLIKKRMQEGKMLPAVCQMIGREDFLYRQNRAMKEKMEAMGVE LHYREYEGTHDWEFWDHHIQDVLRWLPLTNNPVME >gi|229784127|gb|GG667608.1| GENE 86 95919 - 96935 753 338 aa, chain - ## HITS:1 COG:ECs0378 KEGG:ns NR:ns ## COG: ECs0378 COG1172 # Protein_GI_number: 15829632 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli O157:H7 # 16 309 13 303 318 
154 34.0 2e-37 MEAKKRIGIDGNFTRMLSILVFLLVFAGITRGSSFISVANFQNIAKQLTEYGLMTLGVAI AMISGGIDLSTVYVANLSAICAGTIMQQTAPNGEIAGILLACAAALVIGLACGFMNGLLI SVLHIPAMLATLGSYELFYGIGIVLSKGRAVSSTVGFGNLSGALIFGIIPVPILIFLLAA VVLTFIMGHTTFGKQVHMIGISNKASMFTGINNTKVIITVYVISGFLSSAAGLISLSRVS SVKADFGSSYVMLAILITVLGGCDPNGGFGEIPGVATAVLVMQVIAAYLNTLPGVSNYYR QLLYGILLLAVMTFNYEMRRRKGRVKKLRNKEDDKDAG >gi|229784127|gb|GG667608.1| GENE 87 96938 - 97945 898 335 aa, chain - ## HITS:1 COG:AGl83 KEGG:ns NR:ns ## COG: AGl83 COG1172 # Protein_GI_number: 15890148 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 308 5 307 325 143 36.0 6e-34 MNKLKKIAEKNETYILLIILLLSMVIQGISGMFFASNNILDLIRSIIVNAIYALCALLAF ISTGPDVSFPMIGALASYLTFAAGNKFDLPVIILIPIALIIGAMCGSVNGFFVVKYRFRS LIVTLATSTMCYGIIFGAFGGSRSAVPKSLSSLAGWKLFTVTSVRTGLSSVLPGMFLVMV ALYLIVYLVLNFTTVGRGIYAIGGDEVAAKRAGYNVNAIRFGIFVANGMIAAIGGLAYAL MSNNCTPTEYYGGEMIVIAAIVLGGVRLTGGVGSLTGCILGTLLLSMVTSSLTMVGISVY WQQTFVGVIIIIGAAISSLQAVRAQKIFVQVRKGD >gi|229784127|gb|GG667608.1| GENE 88 97966 - 99453 1056 495 aa, chain - ## HITS:1 COG:AGl85 KEGG:ns NR:ns ## COG: AGl85 COG1129 # Protein_GI_number: 15890149 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 488 40 528 538 398 44.0 1e-110 MSDNILNCKGICKSFGGVHALQNVDLSVKRGEVHCLAGENGSGKSTIIKAISGFQRQDSG TIEIDGKIYKNLTPSAAINAGVQVIYQDLSIFPNLTVQENLAINTVVARRKKLYSRREQK VIAEAALKTLNLDLPLDVRVEKLPVATKQLIAIARALNAEAKLLILDEPTTALTRKEVDR LFEIIFDLKKRGITVLFVSHKLDEMFEISDSITIMRSGSNVHSCPMNELTEKDFIHYMTG RDFEEVDKLDKKRIRTDQSPALEVESLCGPGFQDITFSVLPGEILGITGQLGSGRSELCD ALFGINKTTGGVIKRDSSPVTVKSVQDAMKHGIALVPEDRLSEGLFLPVSIMENITIVNY RKLSKYGYLKREVLERESSKWVKDIHVATSDHTLPVQTLSGGNQQKVGLAKWMSTLPKVL ILNGPTVGVDIGAKYDIYQLLRDLASTGVAIVVASDDIAEVIKLCDRAVVMRGGHMTGVL EGEELNAENLARAAM >gi|229784127|gb|GG667608.1| 
GENE 89 99491 - 100642 960 383 aa, chain - ## HITS:1 COG:SMb20316 KEGG:ns NR:ns ## COG: SMb20316 COG1879 # Protein_GI_number: 16264050 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 66 383 24 333 333 191 38.0 3e-48 MKKKITSILLVACMTLTLAGCSTSTTTQTNQSKETAAETLPKNESGSKNEVKGNQGGFAY DPADFVSEKDPSQYKIAVLIKGTAEWFDRLELGVKKFSEDYGVNAVMIEPANPDAASQLE MLDSLLTQDYDAICVVPNDSSALETSLKGALDKKVVVIGHEASDLVNCLYDTEAFNAEQY GSAIAEQLAQAMGKSGVYGDMVGFTTSVNHMQYSEAELAYMKKNYPDITVVNDTLPTCES QETVTTAYEQAKQVLKSNPEVTGFIGHASGDGLGIAQAVEELGLAGKVHIVCGGTPNMYI DYLEKGTVDCVSVWDPMISGYVMCQAAYNVLSGVEIGDGADLSGKAGFAEGYEKVTQVDG ANRCLIGNAPITATKENAREFGF >gi|229784127|gb|GG667608.1| GENE 90 100792 - 100950 92 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619269|ref|ZP_06112204.1| ## NR: gi|266619269|ref|ZP_06112204.1| putative N utilization substance protein A [Clostridium hathewayi DSM 13479] putative N utilization substance protein A [Clostridium hathewayi DSM 13479] # 1 52 1 52 52 90 100.0 5e-17 MKVVDILKKAEYNPKYGTDSKVVFFKTSQVHKTQILESGQIKGQNFVNERGL >gi|229784127|gb|GG667608.1| GENE 91 101201 - 102211 882 336 aa, chain + ## HITS:1 COG:RSc1014 KEGG:ns NR:ns ## COG: RSc1014 COG1609 # Protein_GI_number: 17545733 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 5 325 3 326 347 185 35.0 1e-46 MAQLTISDIAKKAGVSKATVSRVLNNNPNVKDETREKVLAIMQAKGYSPSAAARSLSKQT SDTIGVIVPEIDNPFFGRLLRGVIDVMGSYNLSMICCNTDDTISKDLESLDMLNNHRVRG LIYTPATDYSIPQDKLQLKKMLQLIDAPVVQIDRMVSGINADCILFRDEEGVYQATQMLI QAGHKKIGIINATLDQYLARIRQKGFIRAMEEARLPVEERYQFFGNYRMSKAYELAKEML AMPDRPTAVITCNNNTTLGLLKALHERNETIPESLSCIGLDSIEVLKYTGNNFNFIERDS YAMGREAMELLIKRIAFPDMPKRTIYLDTSVVIHKL >gi|229784127|gb|GG667608.1| GENE 92 102382 - 103287 332 301 aa, chain + ## HITS:1 COG:YPO0008 KEGG:ns NR:ns ## COG: YPO0008 COG0524 # Protein_GI_number: 16120361 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar 
kinases, ribokinase family # Organism: Yersinia pestis # 8 297 20 307 308 169 38.0 5e-42 MDLIFHGIPRLAEYGESLSGESYSYVPGGKGANQAAAAARMGADTTLAGCLGCDDNGELL LNNLKAFHINTNHILKAPESQTGLALMLVDSASGRYVSYNVMGGNNLITPGQVEDALNDC HYDMILIQLEMPLETVYRTCEMAAERRIPVFLDAGPAMSIPLSRLSGIFIISPNESETKA LTGITVDSNENIRRAAEMLYQRVHPSYVLLKLGERGAFCYDGTDSDFVPAFNDITAVDST AAGDTFGGAFAAAYCSGSSIHDSVLFANAAAAICVSRRGGQPAIPSYEEVTAFLKVRGID I >gi|229784127|gb|GG667608.1| GENE 93 103717 - 105474 1210 585 aa, chain - ## HITS:1 COG:FN0190 KEGG:ns NR:ns ## COG: FN0190 COG2972 # Protein_GI_number: 19703535 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Fusobacterium nucleatum # 11 568 5 549 552 410 39.0 1e-114 MGKQFKLRDWSRRFSVQLSLFFLGAGLTVILLLSGAVYWFASNLLMSETISKTHDLLEMS GANIGTYIARVKGESNVFAGSPSLRQYLSDDEESLRTGLLSQIDTLLQNDSSIKSIVVVS KDGRILSNEKNLDMSVSSDMMKEDWYIESIHNTMPVLTGARMQSFSSDKDNWVISVSTEI TGDSGDNLGVLLMDMEYSVIEDHLRSLDLGREGYVFLLNDKGEPVYHKDTSYFSDPDKLA QLLEIQSAGDGYDKASNLLTCQTQIEKTGWTMVGVVSLDTLKMLERQLFEAVLLTGGLLF IAVLLIGILFTRRLSTPMADLERGMNEIEKLAEVRIRKNSFYEVELLAGNYNRMIHRIRI LMDEISDKEKTLRHAELNALVSQINPHFLYNTLDTIVWMAEFNDSTRVIALTKSLAAFFR LSLSGGRELITVGDELEHVRQYLYIQKERYGDKLNYTIHAPEEVLDYTVPKIILQPIAEN SIYHGIKPLDSPGQITITVQEEGEKLIFTVSDNGAGCRPDAAAADNPSRPGKVGLKNVDE RLKLYYGPGYGVTIHPAPGAGCRVELTVGKQLFSSPVSSSNSSSA >gi|229784127|gb|GG667608.1| GENE 94 105458 - 106237 686 259 aa, chain - ## HITS:1 COG:FN0189 KEGG:ns NR:ns ## COG: FN0189 COG4753 # Protein_GI_number: 19703534 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Fusobacterium nucleatum # 1 250 1 252 261 197 43.0 1e-50 MYHLLVVEDESLIRRGIRAFIDFDRLGIDEFFEAENGEEGLEAVRSHRIDLILADINMPK MNGLEFCRLAKELDPSIKIAILTGYDYLDYAIRAIKIGVDDYALKPLSRDDVQKLLFHLV EKKKEADNFAAVRQAVDKLAGNAGTEDESGLRAQMAGILEEHLSDSHFSLSTMAAELGYN ISYLSTLFKRCFGENFRDYLFDLRLERAKILLLSTQMKNYEIAAAVGIDDPNYFSVCFRR KYKKTPKEFRSEAGHGETI 
>gi|229784127|gb|GG667608.1| GENE 95 106341 - 107342 1077 333 aa, chain - ## HITS:1 COG:HI1455_2 KEGG:ns NR:ns ## COG: HI1455_2 COG0229 # Protein_GI_number: 16273361 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Haemophilus influenzae # 186 327 1 142 148 211 70.0 2e-54 MNNDNLHTIYLAGGCFWGIEAYMKKLPGVRDTDVGYANGNTENPTYEQVCYDNTGHAETV KVVYDPALISTEQLLDGFFKVVDPTSINRQGNDRGSQYRSGIYYVDEADKAIAESAAARQ KENYKDPVVTEILPLNQFYLAEDYHQDYLDKNPGGYCHINLNAADEFIGEEGLGMSDDLS VLIRPEDYPVPDDQVLKEKLTDIQYQVTQNNDTERPYTNEYAATFDKGIYVDVVTGEPLF SSEDKFESGCGWPSFSKPIIPEVVTEHTDTSFNMKRTEVRSRAGDTHLGHVFDDGPKDLG GLRYCINSASIRFIPFDDLETEGYGYLKPLFDM >gi|229784127|gb|GG667608.1| GENE 96 107526 - 107984 606 152 aa, chain - ## HITS:1 COG:SPy1558 KEGG:ns NR:ns ## COG: SPy1558 COG0526 # Protein_GI_number: 15675451 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Streptococcus pyogenes M1 GAS # 3 151 58 206 207 136 46.0 1e-32 MKKESTSEMMNPKTPAPDFEMMDLDGNTVKLSDFNGEKVYLKYWASWCPICLGGLEDINT LSAEENDFRVLTIVAPGQKGEKNTEDFKKWFSGVENTEHITVLFDTDGAYGTQAGVRGYP TSEYIGSDGVMVRLVPGHADNETIKTVFESVK >gi|229784127|gb|GG667608.1| GENE 97 108137 - 108835 734 232 aa, chain - ## HITS:1 COG:HI1454 KEGG:ns NR:ns ## COG: HI1454 COG0785 # Protein_GI_number: 16273360 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis protein # Organism: Haemophilus influenzae # 6 224 6 212 213 157 44.0 1e-38 MTPSFLSISTVLAAGLVSFFSPCILPLLPVYLSLFSSGGVASKEDAARRKLRVFLKSVLF VAGISVCFILLGFGAGALGSVIRSKSFLTVMGLIVMILGLHQTGLIPIKWLYREKRVNLE RSRRGDYIGAFLLGLTFSFGWTPCIGPVLGAILGLSATSSRPLYGALLMAVYSLGFLIPF LALSLFSDVLLAKVTRLHRHLGKIKTAGGVIIILMGFLLMTDHLGSILTVFV >gi|229784127|gb|GG667608.1| GENE 98 109058 - 110515 591 485 aa, chain - ## HITS:1 COG:no KEGG:ELI_1104 
NR:ns ## KEGG: ELI_1104 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 411 1 417 500 125 24.0 5e-27 MDNSTELRQVVYSVLLTQIQFGFYRYGEKLPAIEETSTRLCVSVDTARAAYLKLKTRGYI TLIKNAGATVKVKYDSRETEQFIQTFFSARKHAMIDLGKSLQPLFGTAQWTGLKYAAPED LQTMELLFSKEDAASPYAILEHLNLKYRALGNHILMRLVWQSFMFLHNPFFSLRDNLRYF DRDADYLSAVLVFCREKDWSALRAAVAGSIEKLPSALIRFYEDRITTPPPEAETPFVWSS YKKSRQLCYSLAMELLTSISLGQYPVGSLLPSQEELARQKGVSVSTVRRALLLLDSIGAI KSAKYIGTRVLPFDKTAENSDFTRPVLQRRLMDMAESLQILALSCQAVSQLTLSSLNTTF VEQLCRDLKAHRQRRRGETLSYFLLGLIGNHAPYQTIRTVYTELLRQIFWAHAFHGMKGS ADTIHAIYDPYYDTFIDSLEHLNFLRFSVTLEALILFELRCTVDYLLHFNVPGAENILVP DSSFR >gi|229784127|gb|GG667608.1| GENE 99 110797 - 113424 2259 875 aa, chain + ## HITS:1 COG:SMb20356_1 KEGG:ns NR:ns ## COG: SMb20356_1 COG0642 # Protein_GI_number: 16264090 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Sinorhizobium meliloti # 283 587 336 646 667 186 40.0 2e-46 MQKAVNKKDSSVGRSQQMSIVSTAVLILLMFCYLVMTIRNSASLAAQTEIISTHPFEVVI SAGDVKLYVSEMSLRTGRLERHHNREDVEIAADMLEKLAFSLEKPILRLEELYLGDAEDV QELKETLTSLQTEQAAYLTYCGQAAITEENIEAYAQEHLQPLYEEALHQTEAIIATAQAK KVGYGETAESLRLTTLIGSVVLMTLMVFVLLISQYVLHRQRKELTYRSKLFDNLSLSIDD AFIIRDAGTGAINYRGLNVERILGIPVDYVESLYQGMPEEDAQALREGVSDPQFASPFER MVEYTKPNREKCWMLIRIYRVEDLKTPQLITVFSDRTEEVRSRQILQEAMTNAEQANMAK SEFLSRMSHEIRTPLNAIIGMTTIAAAVVRDPARVEDCLSKITFSSRHLLMLINDVLDMS KIESRSMILQQEPFDMFEIINGFVSTVYAQTKAKGIEFQETMEGFGEDAIFIGDSLRLNQ ILFNLSSNAVKFTPPGGKIRLEVSRYRGRNTADMIRFTLSDTGIGMTKEAVERVFQPFEQ ADASIAKLYGGTGLGMSITRNLITLMGGCIRVDSEPGVGTACIVELPFLKGEESGIQEPD FEPYGLQALIVDDEQTVCEQTAILLEKIKIHAEWHLSGAEAVEQVKEMYREGRNIDLCLI DWKMPDMDGVEVTRRIRREVGDDIPIVMISAYDISEVEEEARAAGVNGFLPKPLYRSSVY AAIKDALEKKGQPSEAVEQKHTDMPLTGMRLLMAEDNALNQEIAATLLHMNGAEVDCVED GQQVLDTFLASGPGDYDAILMDVQMPVMDGHEAARRIRKSNHPTALTIPIIATTANAFSD DIAAALAAGMNAHISKPLDIVQLCKTLTGCIHKNA >gi|229784127|gb|GG667608.1| GENE 100 113545 - 113847 324 100 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|288869901|ref|ZP_06112214.2| ## NR: gi|288869901|ref|ZP_06112214.2| high-affinity choline transport protein [Clostridium hathewayi DSM 13479] high-affinity choline transport protein [Clostridium hathewayi DSM 13479] # 1 100 2 101 101 149 100.0 8e-35 MLKHKTAAGLAAAAMITTSLAFTASAEGKDVQADTSKSIVLDSAVISSVEVQPGFTPELM TDEDGKTYFIAEDGSRVYISWTEDGVNSDGSILFSVVAAD >gi|229784127|gb|GG667608.1| GENE 101 113898 - 114581 761 227 aa, chain + ## HITS:1 COG:XF0389 KEGG:ns NR:ns ## COG: XF0389 COG0745 # Protein_GI_number: 15836991 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Xylella fastidiosa 9a5c # 1 218 1 218 227 169 42.0 6e-42 MRLLIVEDEEDMREALCYGLRKRGYAVDAAGDGADAVEQCGINEYDLVVLDLNLPGLDGM EVLKHIRSMEKPAKVLILSARSELTDKIRGLDSGASDYLTKPFHFEELEARIRMLLRRSF IQTEAALTRGGLCLDSNLKTASFQGRPLEFSMREFSILEYLMIHMGRPVSAEELLEHVWD SEADPFSNQVKVYISVIRRKLQAVTDEEIIRSIRGAGYLIDKEEEEC >gi|229784127|gb|GG667608.1| GENE 102 114575 - 115732 1123 385 aa, chain + ## HITS:1 COG:SA1246 KEGG:ns NR:ns ## COG: SA1246 COG0642 # Protein_GI_number: 15926994 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Staphylococcus aureus N315 # 103 379 173 451 451 163 34.0 6e-40 MLKNISLRMRLTLLSALVMASVAVILTAMFLFGADRIFVRDLGQQMTFQTQDIIVTSVKA ETALDDDSNIQSITISLKKAGTQFNLWGLAALFLVLLLGTGATWLMAGHVLKPLEELSST IEEIGGSDLSSRVEIKGRQDEIGRLARSFNHMMDKVSASFERQKRFSASAAHELKTPLAT ILVNLEVLELDGKTSPARMEKVLSIVKANTERMIRLVENLMRLTSEEAHEMDEEVPLSEV FAMTLDELSPMIQKKNLTVTIKNTPDISLTGSRIMLYRVMSNLLENAAKYNREHGSVTVV TGRDGDAVAVEITDTGIGIPEEELSHIFEPFYRVDQSRSRAVGGAGLGLSLVKDIVEKHG GEITVKSVIGEGTTFMLRFPGRERS >gi|229784127|gb|GG667608.1| GENE 103 115753 - 117183 1051 476 aa, chain - ## HITS:1 COG:no KEGG:ELI_2456 NR:ns ## KEGG: ELI_2456 # Name: not_defined # Def: transcriptional regulator # Organism: E.limosum # Pathway: not_defined # 1 474 1 477 483 356 41.0 1e-96 
MSNTNSKEQTIYRSLAGKIQLGFFDDGERFPSAEEIAERYRVSYCPAQRALKMLERDGLI QLNRGKNTIILGKPYENYLESDVFKRRAAALSDLLKSLHILSPAICLQSLLHCRESLALK KEQALPGRSLYQQFERSLHSLGSQTALSLYYDISSFAESALLDILCLKLGKKEAEAFLHA AALEYTSCFEDFTKESAESIGHRLEHLSETFRKPIEEYLAKLELPPDIEPEAFVWEPNKG RTRYCDIVAIDMICKINQGIYPLGTLLPGGPVLADTYHVSEITIRRTIGLLNTLGVVQTI NGVGTRVIGPGDASIPYRLKELMLDGNLKAFLEALQLLAVTGKPVFLYTFPWIPEEALAA IAGAAAIPEEKSSMVAVISAGMQAVVHYCPLAAVRDIYSKLTLLLLKGSILRLEETGAEK VPGWSFVSAELQESCSKKDGVRLAGAYRQLFQMIFTDTRLALIDIGVHGAAEVAGI >gi|229784127|gb|GG667608.1| GENE 104 117226 - 120006 2003 926 aa, chain - ## HITS:1 COG:all1389_1 KEGG:ns NR:ns ## COG: all1389_1 COG0642 # Protein_GI_number: 17228884 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 394 650 320 577 595 178 36.0 5e-44 MELQDAVKQKAERPDAAGMGMDGLLPMLEMIGSPACQCIFDQELTLLCANPSFYKSSGYS KEEFNSSCPTLRQYLHQHPGEFDIISQALSAAAEHGQNLFSADCRIPVKTGTFLHVRIHG TITEETTSGCPTVFMTLTDISDLVDQAEQKVQYFEWMMDEYIGNVYISDMDSYELLYINQ TAADTLQCTKKAPLGHKCYEIIQGRTEPCPFCTNSRLRTDEFYEWEFDNPLLNRTFMIKN RIIDWNGHRARLELSHDMFSTEYKLAKKDRERDALIRTIPGGLARLDARDCSTIMWYGAN FLDIIGYTEEQFENELGRQCTYIHPDDLSRLQLLLEQIRITGESTAFESRIITRDGQVKI LTMTFCYASGENSWDGIPSFYSIGIDLTKDRAEQERQKKALEEAYQAARVANSAKTNFLS SMSHDIRTPMNAIMGMAAIAQANLSSPEKIRDCLSKINVSSRHLLSLVNEVLDMSRIESG KIDLTPEEVKLPELIENVFAMCKPLISEKDLQFQISASKVRHETVITDGDRLQQVLINLL TNAVKYTPNGGRISLSIKEIPSLSSQKAQFEFIITDTGIGMSEEFLPHIFEPFSRAEDSR ISKIQGTGLGMAITENIVRMMNGSIEVTSRLGEGSQFTVTVQLELVNEEVTDTEELNGLP VLVVDDDQIVCESASALLTELGMRGCWVLSGAEAVTRAEEAHKCGDDFFAVILDWKMPEM DGLETVKVLRKRLGNDVPIIIISAYDYSDIEAEFLQAGADAFISKPLFKSKMLHVLQLFL ESGHSTASGAAEEQSRPALADKRILLAEDNELNREIAIELLEMQEIIVEAVENGQEAVEA FRSSPIGYYNAILMDIQMPVMNGYDAAAAIRSLEREDAPSVPILALTANAFTTDIGKAYS VGMNDHIAKPIDVERLITVLERCMNA >gi|229784127|gb|GG667608.1| GENE 105 120247 - 121161 731 304 aa, chain + ## HITS:1 COG:BH0724 KEGG:ns NR:ns ## COG: BH0724 COG2207 # Protein_GI_number: 15613287 # Func_class: K Transcription # Function: AraC-type DNA-binding 
domain-containing proteins # Organism: Bacillus halodurans # 25 303 23 297 298 121 26.0 2e-27 MKVIHTVELKPGSYEEMLPDFSPDFPYIASYVELEKHLGRQSPWHWHKEVELFYMKEGAL EYNTPQGKTVFPQGSAGFVNSNVLHMSKVRQGEKTSVSMIHIFDPLLVSGQTGSRIDQRY VRPLITAPGIEILGMYPDRPEHAEFLEILRESFELQEDGYAYEIRLRSLLSELWCRLLAM AEPLCHEDAQGSRSSEKMKMMMAFIHEHCADKITVAEIAAAAYISERECFRTFQDCLKMT PVEYMTDYRLQKACHMLAEGNDSITRICQSCGLGSSSYFGKVFREHIGYSPMEYRRKWRN RDIY >gi|229784127|gb|GG667608.1| GENE 106 121284 - 121520 242 78 aa, chain + ## HITS:1 COG:no KEGG:Closa_0502 NR:ns ## KEGG: Closa_0502 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 78 2 79 79 89 53.0 6e-17 MKLRILNMKEFLKTVNGCEGPVYLVDSDGRRENVNKEYGTQVRLQAEYQRNRNTLVLCLA VPVPRDYLNIVNYYAGDC >gi|229784127|gb|GG667608.1| GENE 107 121767 - 122729 837 320 aa, chain + ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 # Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 7 107 7 107 130 97 41.0 5e-20 MHAWEAIDGVLDFMEEHLREEMNTEALAQSAGLSPFYFQRLFKRLTGRPVLEYVKMRRLA LAASALEDREKRILDIAVDYGFSSHASFTRAFKEAYGISPEKYRKDRPMLNTCEKPELSM RYHLVDEGVPLIAGDIVLEIRRMRLTEPEWYLGLETAVEIKKQVPAGESTGIDVPGQLWK AFHEKKPAVDGLAQEGPEMGMSHKADPEMGTFLYFAGALSERGTEYAGAEERSAKDGFPL VKRELPAGEYLVCRIEAETFEKLVTAALDQAGVYLFGTWLPAHGLTFLPFSAEKYFLKNE DDIYMEIWVMPAPATEQDVK >gi|229784127|gb|GG667608.1| GENE 108 122832 - 123710 925 292 aa, chain + ## HITS:1 COG:BH3634_1 KEGG:ns NR:ns ## COG: BH3634_1 COG2207 # Protein_GI_number: 15616196 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 5 117 4 116 132 82 36.0 1e-15 MKEQVEAVQRMQDYIQEHLRETVTLADLSRVSLFSPWHSYRLFRQWTNQTPADYIRRLRL SESALKLRDERCRIADVAFEYGFGSVDGYQRAFFREFGCNPREYAGNPVPLYLFTPYGVK YRELRKEWNAMENVKNVFIQAIDKPVRKVIIKRGIKAEDYFPYCEEVGCDVWGLLTSMKS LCGEPVCLWLPEAYRSPGTSVYVQGVEVAADYGGPVPEGFDVIELPAAKYLMFQGEPFEE 
EDYCEAIEAVRTAVQRYEPSAAGYQWDDENPRIQLEPVGTRGYIELLPVRKL >gi|229784127|gb|GG667608.1| GENE 109 123884 - 125596 1495 570 aa, chain + ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 259 567 210 518 555 199 38.0 2e-50 MCYDLEYLGMNSGRAEMARHIKTLKLGTRFSILAGLLLCISTALIIVFCGAVFRAQFMDM LTKECVQGTNLLDYELSKEDYTDQDRTRLLDELKARTGYEYTIFEGDKRVATTVMKDGTR AVGTTLDPAIAEIVLSERKSFVGETSILGIPHVCSYVPLYDADGTVGGLIFSGVSSYEAK SAMHRALWTMGIVGLLCAAAGVWLILFFTKRAITVPFARLKRFAHTVEQGDFGIASKTPV TSGIVSGDEIGELAGTFELTVKRLREYIGELSYVLGRISNNDLTAGPVLDYVGDFSSIRE SLDHITARLNDTLEEIVQSSAQVAEGAGMVADGAMTLSQGAQEQAASVEELAATLEAASE EVDSTARNAGEASLVSQEAVRVLEAGKVQMEQLTDAMNDITQASDEIGRITKVIEDIALQ TNLLALNAAVEAARAGEAGRGFSVIAEEVRSLAGKSAEASRDTVGLIENANAAVSKGRQI AVSTAGALDQGVTASSQAMDLVRGISAASAQQAEHIRLLQGGMNQIAEIVQSNSATAEEE AAASQTLSDEAGRLDSIVASFRLRQRARTE >gi|229784127|gb|GG667608.1| GENE 110 125615 - 125980 482 121 aa, chain + ## HITS:1 COG:MA2818 KEGG:ns NR:ns ## COG: MA2818 COG3603 # Protein_GI_number: 20091642 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 121 6 132 135 90 40.0 5e-19 MEIKLLEDDFSVCKIRDLSGVNLTDAFYFIGKTDEELSLVCRTEHVPEETTDREDGWKAF RIEGVLDFSLIGILAEISAVLAECSIGIFVVSTFNTDYVLTKKEDFMRAAQALSEKGYAV K >gi|229784127|gb|GG667608.1| GENE 111 126473 - 126808 249 111 aa, chain + ## HITS:1 COG:BS_yhdM KEGG:ns NR:ns ## COG: BS_yhdM COG1595 # Protein_GI_number: 16078017 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 3 73 2 71 163 58 42.0 4e-09 MDSMEQIYLSHARTVYAFLLTKTQNPDLAEELTQETFYQAVKSIHRFKEQSSVSTWLCGI ARNVWFDHLRKQKGRADFKEAEEIAVPSAEEEVFCSWDHLSVMKQLHGLAS >gi|229784127|gb|GG667608.1| GENE 112 127725 - 127874 181 49 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|266619292|ref|ZP_06112227.1| ## NR: gi|266619292|ref|ZP_06112227.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 82 100.0 9e-15 MYLRLSGSLTFGQIGEIMGKSENWARVTFYRGKEKIVKEAKERETEHTL >gi|229784127|gb|GG667608.1| GENE 113 127852 - 128847 1067 331 aa, chain + ## HITS:1 COG:no KEGG:Amet_3811 NR:ns ## KEGG: Amet_3811 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 2 106 23 128 130 72 38.0 2e-11 MKQNIPCEMIQDLLPLYVDGLTSDESSRQIEAHLETCGDCRSRYQRMKEDLERETQVKQK ENEREIDYLKKIRKSSIRKVFLGIFSAFAVILLALFLKLFVIGYPVDSYLVTYANVNGNM LSVGGILYDSSPVYRRYKLIGEEDGNTKLVIYGCLPSVWNRNGAFNLDIDLTEVGTDLTI NGMTVKQDGTIVSRQANELFAAKHPYVGDMSANGRAAQLLGIGNTLGSFKNELQTSAEPY GWTLKFENSAANSAVFDEQMKGYACVLMALTGNLGEVTWTYTVELEDGPAVRQRTMTREE CSEWAGEPVETFAESPEAVQRLLDLIGEKMK >gi|229784127|gb|GG667608.1| GENE 114 128935 - 129621 689 228 aa, chain + ## HITS:1 COG:CAC0524 KEGG:ns NR:ns ## COG: CAC0524 COG0745 # Protein_GI_number: 15893814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 228 1 228 228 357 78.0 1e-98 MTRICLVEDDREIAKNLVLLLRSEGFAVTHAPTREEAFAAAEEHKFDLALIDIALPDGNG FTVYTKIKETQDIPVIFLTASGDETSVVTGLNMGADDYITKPFRPRELIARIGAALRKNG RLGAVFEICGLFVDVAGGVVKKNGSEVFLSALEYRLLLVFVNNPRSIITRGRLLDELWDA AGEYVNDNTLTVYIKRLREKIEDDPANPQIILTVRGTGYRLGGAYASE >gi|229784127|gb|GG667608.1| GENE 115 129641 - 130645 590 334 aa, chain + ## HITS:1 COG:CAC0525 KEGG:ns NR:ns ## COG: CAC0525 COG0642 # Protein_GI_number: 15893815 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 334 12 329 329 432 65.0 1e-121 MVYAFTAAVTVILGFIIHWAAGILAVFSAAAFGTAFLLFTRARYRSIARLSGQIDLVLHH ADRLDLDELEEGELSILHSEITKMLLRIREQNEALKKEKMHLADSLADIAHQLRTPLTSA 
NLILSLLEKDPDEKERRAFVRETEELLVRMDWLLTSLLKLSRLDAGVVVFQKEPVDGYDL INAALRPLLIPMELHGIVVQTDVRKSAPSKSASSQPDSPVVIQGDSGWLAEAVQNILKNC IESIGDNGKIEIGLTDTVLFTEISIHDSGPGFESAELPRIFDRFYSGKSTGSTGRAGYGI GLALCRMIILRQGGTVTAKNHPQGGAVFVIRFPK >gi|229784127|gb|GG667608.1| GENE 116 130728 - 131408 306 226 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 220 1 218 245 122 34 1e-26 MELIKIDDLCKVYGRGENQVAALDHVSLTIEKGEFTAITGPSGSGKSTLLHIIGGVDMPT SGKVFLDCQDIYAGGHDKLAIFRRRQVGLIYQFHNLIPTLNVVENITLPVLMDRRKVNEE RLNELLEMLGLEDRRTHLPNQLSGGQQQRVSVGRALMNAPQVCLADEPTGSLDSRNGQEI ISLLKLSNKKYGQTLIVVTHDESIALQADRVIGISDGRVVRDERMV >gi|229784127|gb|GG667608.1| GENE 117 131411 - 134014 1908 867 aa, chain + ## HITS:1 COG:CAC0527 KEGG:ns NR:ns ## COG: CAC0527 COG0577 # Protein_GI_number: 15893817 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 1 866 1 862 863 848 52.0 0 MNIFHKVTLQSMKKSRTRTVVTIIGVILSTAMITAVTTFSVSLLNYMIKGATVKYGGWHV EFVDIDSAFAEEQARDRDVADTAEIDNIGYAYLNGGKNPDKPYLFITGFSKKAFDTVPVS LVSGRLPENSGEILVPAHVAANGGVNFSVGDTLSLTIGGRMAEDGSLGQHDPFRSRNEQG KAMEKFVPREERTYTVVGICERPAFEEFTAPGYTLITAADPADQTGSCSVFVTLKNPGKA ASYISRTAGQNAYVLNDNVLRFLGVSSDRLFNTLLYSVGGLLTVLIMTGSIFLIYNSFTI SLSERTRQFGILASVGATAGQLRNSVLFEGLCIGAAGIPTGMIMGIGSISLVISAVAEYF KNFGYSTVTLSLTVSGPAIAGAAVISLATILFAAWIPARKAAARPVMESIRQTNEIKVES RDVKTSKAAQRIYGLEGTLALKNFKRNKKRCRSIVLSLTLSVVLFVSANAFSACLKQGAN RSVVNSDYDICFTSQNIDENELFRLYGRLKTAEGIRESSYQALMTYSCAVKAGDLSDDYR KSAGAGIQEETVNLPMDIQFIEDREFSSFIESLGLSPEEYTGQNAKMIAVAKEKTKQKNN GQSGLEDLFAERTMQVSVTPGEEHPAKLQDSAPSRSEKQISLTLVDTIPLDTLPRNASGK KPFVFMIVAPWQLRESFEASEAGEIMGLTFRSDTPSQSVAEMREIIQESGIAVDYNLYNV NKMFEENRSILFVIHVFSYVFVMMISLIAAANVFNTISTNIRLRRRELAQLRSIGMADRA FDKMMNFECIFYSLRTLGIGLPAAGICSWLICRGMIAGGADVSFAVPWGSMAVSALSVFF IVFFTTKYAAGRVKKESIIDALRDDIT >gi|229784127|gb|GG667608.1| GENE 118 134092 - 135114 1382 340 aa, chain - ## HITS:1 COG:BH3254 KEGG:ns 
NR:ns ## COG: BH3254 COG3641 # Protein_GI_number: 15615816 # Func_class: R General function prediction only # Function: Predicted membrane protein, putative toxin regulator # Organism: Bacillus halodurans # 8 339 3 334 336 268 53.0 1e-71 MEKTGVKEFLKRKNVNITVQTYLIDALGAMAFGLFASLLIGTIFATLGDKTGVALFGTIA AYAKSATGAALGVSIAYALKAPQLVLFSAATVGIAGNELGGPVGALVATVVAAELGKIVS KETRVDIIVTPGVTIISGVLIAQFVGPGVAAFMTAFGNLVKTATEMQPFFMGILVSGLIG IALTLPISSAAICIMLALDGLAGGAATAGCCAQMVGFAVLSFRENGWGGLLAQGLGTSML QMGNIVKNPKIWIAPTLASLITGPAATVVFQMKNIETASGMGTCGLVGPIGIYTAMGGGA RMWTGILFVCFLLPAVLTLVFGACLRKAGWIKEGDLKLDL >gi|229784127|gb|GG667608.1| GENE 119 135160 - 135309 97 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288869910|ref|ZP_06112234.2| ## NR: gi|288869910|ref|ZP_06112234.2| hypothetical protein CLOSTHATH_00313 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00313 [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 90 100.0 5e-17 MAEMPKQLVMYFTIEMGKCKNNMKKDQPGELILFSEKVFRFLWGVAKTI >gi|229784127|gb|GG667608.1| GENE 120 135314 - 135436 63 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGGFTIINITHPYKKVFTNFEKIKKKNCAFYQYKSIGYKK >gi|229784127|gb|GG667608.1| GENE 121 135796 - 137289 1452 497 aa, chain + ## HITS:1 COG:no KEGG:Closa_1674 NR:ns ## KEGG: Closa_1674 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 497 1 497 498 913 87.0 0 MNKVLDIVTSKGLTYEQKVVALAHAAENSLEVLDIPDRTRHYMETGAICDLDEGHAPYRP RYIMPDYEKAVKNGCEFLQLDPPKDLDEVLQFLEIFYRHVPSITSYPVFLGNIDKLIEPF LEGVSDEEAEKKLKLFLIYLDRTITDGFCHANLGPEETRTGRIILKLEKELQNAVPNLTL KYDPDITPDSYGELALYTSLYCANPAICNHRMHKETYEDYGISSCYNILPIGGGSYTLTR IVLPKLVPEAESREQFLTEFLPDCLSHMGDYMNERIRFLVEESNFFETSFLSREGFVERD KFLAMFGIVGLAECTNMLMGDPEKIYGHDKEADDLADQIMQVIADYAEHSKALYSEVFGG RFALHAQVGIDSDHGITSGVRIPVGMEPERMYDHLRHSARFHRFFPTGCADIFSFDPTGR NNPAAMLDIVKGAFSLGDKYISFYASDSDLVRITGYLVKRSEMEKYYEGEAVLQNTVYLG GDNYRNAHLENRKVRAL >gi|229784127|gb|GG667608.1| GENE 122 137286 - 138155 888 289 aa, chain + ## HITS:1 COG:STM4565 KEGG:ns 
NR:ns ## COG: STM4565 COG1180 # Protein_GI_number: 16767806 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Salmonella typhimurium LT2 # 8 274 6 268 287 196 41.0 3e-50 MTRFSPEAPINKIIPLSVVDGPGCRTSVFVQGCNIACAYCHNPETQQLCRACGICAGQCP AGALSIEEGGGESSEKRIVWNEKLCIQCDNCIRVCPYFASPKVRRMSAEEVWREIEDNMP FIQGITVSGGECTLYPEFLTELCRNAGKAGLTCFSDSNGCVDLSEYPELMAVTDQVMLDV KAWDYEVFKRLTGGDGSVVKKNLIYLAEQKKLYEVRLVCLDGETDMEAVIAGVADAAAPY LKEFRLKLITFRKYGVRGRLEKRNSPPPERMEELRNLAVRCGFQEIQVV >gi|229784127|gb|GG667608.1| GENE 123 138204 - 139310 1350 368 aa, chain + ## HITS:1 COG:BB0383 KEGG:ns NR:ns ## COG: BB0383 COG1744 # Protein_GI_number: 15594728 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Borrelia burgdorferi # 53 368 23 337 339 243 43.0 6e-64 MKLRRGIVFGLTAVVALSLTACGGAKTGSTTAESAKTENTKTENAGNADSGTSEGKSSFS VAMITDTGGINDQSFNQSAWEGLTELKEEKGVEVNYIESKQASDFVTNLERLGDNGANLL WGVGYACADAVLEAADSNPDIQYAIIDNAYDDTPANVTGVMFRAEEPSFMVGYVAGRMTE TGKVGFVGGISSGLIDQFQYGYEAGVKYAAKELGRDIDISVQYAESFSDAAKGKAIATKM YSDGCDVIYHAAGGSGTGVIEAAKEADKWVIGVDRDQAYLAPENVLTSALKLVGKAVKEV SIEAMEGKTIGGQTLTFGLKEDCVGIPEEHGNMADGVYEDTLKVAESIKNGELVPPTTKE TYEAFTAE >gi|229784127|gb|GG667608.1| GENE 124 139345 - 140880 1871 511 aa, chain + ## HITS:1 COG:lin1426 KEGG:ns NR:ns ## COG: lin1426 COG3845 # Protein_GI_number: 16800494 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Listeria innocua # 7 506 2 499 513 538 58.0 1e-152 MATVSSDYAVQMHGITKRFGAFYALKDMNLDVKKGSIHSLLGENGAGKSTLMNVLYGLYQ ADEGEIYIGGKKTDIKNPNAAIANGIGMVHQHFMLVEDFTVTQNIILGNETTSHAGIVDM RKARKQILGIVEKYGLEVDPDAKISDISVGMQQRVEILKALYRGAELLILDEPTAVLTPQ EIEDLLSIMHNLVDDGKAIIIITHKLKEIKASSDTCTIIRRGEYIRTIPVEEVTEQELAT LMVGHEVKLVVDKTPAKPGKAVLEIKDLVVKNERKLDAVKGLNLTVRKGEIVGIAGIDGN 
GQKELIEAINCLVPAERGTITVNGKSVGNTNPRSVIDSGISTIPEDRQKRGLVLDFSVNE NAVLERYRQEPFSRKGILNKKEMEQFTKKLIEEFDVRPEDCGSKRAGGLSGGNQQKVIIG REISMNPDVLIAVQPTRGLDVGAIETVHRTLIRERDKGKAVLLISFELDEVMNVSDTIAV IYDGKIQDTFPQGTVDENTIGLLMAGGKNHG >gi|229784127|gb|GG667608.1| GENE 125 140873 - 141997 1352 374 aa, chain + ## HITS:1 COG:BB0678 KEGG:ns NR:ns ## COG: BB0678 COG4603 # Protein_GI_number: 15595023 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Borrelia burgdorferi # 14 363 22 369 383 216 35.0 7e-56 MDKTVHILKKPLTMTLIAIFFGFVVAGIILAAVGYPPVRSLGVLLNGVFSSPKHMSNVII KSTPLILTGIGVAFAFKTGLFNIGAEGQFIMGCVAATVVGIVCDFPPVIQIPLVLFAGAA AGAVYGGIAGFLKARFGIHEVLTSIMLNWIALYFSNYVCGLERFHKPDTIGTYAINRSGY TMILGNYKATEEGKAALMQNEFLRNTILKTDVNAGFIIAVILAVLVSFLLYKTAKGFELR AVGANRFAAEFTGINVNRNILHSMLISGAICGLAASLYITGNSPHGIATLAAFENTGFNG LAVCLIAASSPVGCIFAGLLFGGLIYGGQTLQYEVGAPSEIINIVIGIIVFFVALTHIIP PLVERLSKGGKKHD >gi|229784127|gb|GG667608.1| GENE 126 141990 - 142898 1167 302 aa, chain + ## HITS:1 COG:CAC0705 KEGG:ns NR:ns ## COG: CAC0705 COG1079 # Protein_GI_number: 15893993 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Clostridium acetobutylicum # 12 298 14 302 310 214 45.0 2e-55 MINSMTLFIGITLMYSNPLVFAALGGVVSERSGVTNLGIEGMMTIGAVTGATVGYYSGSA WAGFIAAGLAGGLIALLHAFAAITCKADQTISGIAINLIGPGVALFVCRLLFDGATMSQP VPNKIAKLFGGLKVQGALQNLNVDVTVVIAFLLAVFIWFFLYKTKWGLHIRAVGEHPAAA DTMGIHVYVIRYLCVLVSGVLAGFGGAAMTLAIIPQFTPTAISGQGFIALAAVIFGKWTP HGAYGACLLFGAAQALTVTLGGGNFAVPSQILAMLPYFITIVILILFVGRSSAPKASGQL AS >gi|229784127|gb|GG667608.1| GENE 127 144277 - 145023 835 248 aa, chain + ## HITS:1 COG:no KEGG:Closa_1680 NR:ns ## KEGG: Closa_1680 # Name: not_defined # Def: Crp/Fnr family transcriptional regulator # Organism: C.saccharolyticum # Pathway: Two-component system [PATH:csh02020] # 10 248 1 239 239 409 84.0 1e-113 MNTEERGETMNDVVALINELRSEERDYLNNYLANAPKWLLEAFQIVRLKKGTTFIHENAH 
VDTVYILVEGIVKATDYRIQEVAYDYTRFYPIEVFGAMEFLMGYDLYKTTLMTETDCRFL CVSKDQFSRWMLTDIHAVLEQVKAMSVYLVEQVRKERLFLFLQGIDRLFLLFMQIYRDSA RRGVCRIHLTRKDLSNSTGLCVKTVNRCISQMEDEGYISREGRGIIISEEQYQRIKSAMA DKIDENDI >gi|229784127|gb|GG667608.1| GENE 128 145050 - 145832 1019 260 aa, chain + ## HITS:1 COG:SPy1869 KEGG:ns NR:ns ## COG: SPy1869 COG2820 # Protein_GI_number: 15675688 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Streptococcus pyogenes M1 GAS # 1 259 1 259 259 377 71.0 1e-105 MTNYSENEGKQYHIQVGKGDVGRYVIMPGDPKRCALIAKYFDDPKLIADNREYVTYTGTL DGTPVSVTSTGIGGPSASIAMEELVMSGADTFIRIGTCGGMDVNVKSGDLVIANGAIRME GTSREYAPIEFPAVPDFDVTNALVAAAKKLEKPYHVGVVQCKDSFYGQHSPETKPVSYEL LNKWNAWLQLGCKASEMESAALFIVASALKVRAGSVFLVLANQERAKQGLENPIVHDTDA AIRTAVEAIRLMIQEENGKK >gi|229784127|gb|GG667608.1| GENE 129 145873 - 146580 938 235 aa, chain + ## HITS:1 COG:SA1940 KEGG:ns NR:ns ## COG: SA1940 COG0813 # Protein_GI_number: 15927712 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Staphylococcus aureus N315 # 9 235 11 234 236 275 60.0 4e-74 MSVTPTAHNGAKQGEIAKTVLMPGDPLRAQYIAETYLENPVLVTSVRNMFGYTGMYKGRE ISVMGGGMGMPSVGIYTYELFHFYGVEQIIRIGSAGALQDGVKLMDVVIGMGACTDSNYA YQYGLPGTFAPIADYGLLAKAVDTAKAQGTNVVVGNILSSDIFYNADTTVNDKWRSMGVL AVEMEAAALYMNAAAAGKKALCLLTISDLIYGEEKLSAEERQLGFGKMMEIALEL >gi|229784127|gb|GG667608.1| GENE 130 146654 - 147322 817 222 aa, chain + ## HITS:1 COG:SPy1867 KEGG:ns NR:ns ## COG: SPy1867 COG0274 # Protein_GI_number: 15675686 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Streptococcus pyogenes M1 GAS # 1 218 1 218 223 268 66.0 5e-72 MDGKEMLKYVDHTLLTQTATWEEIKKICDEAMEYGTASVCIPPSYVKAAKEYMKDRMAVC TVIGFPNGYATTAVKAFETKDAIADGAAEIDMVINIGWVKDGKYDEVEEEIRTLKGCCGD KILKVIIETCLLTEEEKIKMCEVVTRSGADFIKTSTGFSKAGATFDDVALFRAHVGKNVK IKAAGGISSIEDAEKFLELGAERLGTSRMIKLVKQEKSDGGY >gi|229784127|gb|GG667608.1| GENE 131 147369 - 147818 539 149 aa, chain + ## HITS:1 
COG:CAC1544 KEGG:ns NR:ns ## COG: CAC1544 COG0295 # Protein_GI_number: 15894822 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Clostridium acetobutylicum # 18 148 5 130 132 125 51.0 3e-29 MDRTEELAALKEKLPVEELISQAFSAMAKAYTPYSGFQVGAALLTADGVIYQGCNIENAA YTPSNCAERTAFFKAVSEGVREFQAICIVGGKDGIPSGLTAPCGVCRQVMMEFCDPETFQ IILPSGREEYEIYTLKELLPVGFGPKNLL >gi|229784127|gb|GG667608.1| GENE 132 147927 - 148505 430 192 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) [Campylobacter concisus 13826] # 5 185 1 181 185 170 41 4e-41 MSNELTRCDWAAHDILNQEYHDNEWGKPVHDEITLFKMLILEGMQAGLSWITILKKREAF IKAFDDFDPAVISSYGDEKIEELMQNAGIVRNRLKINAAITNARAYFVLCEKYGSLDHFL WSYVDNKPIKNAWKTMSDIPASTPLAEKISRDLKNLGFKFVGPTIIYAYMQSVGLVNDHL TSCFCYERETSE >gi|229784127|gb|GG667608.1| GENE 133 148715 - 149947 480 410 aa, chain - ## HITS:1 COG:PH1043 KEGG:ns NR:ns ## COG: PH1043 COG1473 # Protein_GI_number: 14590880 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Pyrococcus horikoshii # 7 406 12 386 387 280 38.0 5e-75 MVDFSMLSDYIISIRRRIHKCPELSGQEFQTQRLIRDELEKMGIPHRTLHETDVLAEITG LQTSPAELTGSRPNTVEFRDYETGNHAKTVLLRADMDALPLTEKSDSSYTSQFPGVMHAC GHDSHTAMLLGAARLLQDSRDLFSGTVRLMFQPAEETGKETRTLIDHGMLDRVDTVFALH VEPDLPSGNICILPGPCMAGVDDFSIRLTSPGGHGATPHLGSDTLLAGAHLAINLQQIIS REIDPQKPAVLTIGVFQAGTKVNLLAQEAVLSGNIRFFDKELSDYFKESLTRYSAHTASM FRCSFEVTYTPSLLPTVNDAACCGTAKRAALTVWGKDNLVERPASMTSEDFSRYLEAVPG VMVFLGTSDGTRKTSWPLHHECFDLDESALLNGSRLYAAYALEWLNEHVV >gi|229784127|gb|GG667608.1| GENE 134 149949 - 151565 1089 538 aa, chain - ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 8 525 3 519 526 156 25.0 1e-37 
MYQPQPIKVMIVDDEDKMRGLLKICIPWNELDYVITDDVSSANQALELIAERKPDVIITD IEMPFINGLDFAEMVLEEYPHIKIIILTAHDEFDYAKQGLEIGISSFLLKPIKREELIRT VTEIRNSIREEQKKLYQHELLQKKLNENRDYIIQNFLSNMMVSSLSSDYLWENLNYYQIP LHRDSGFFNILLLTLNVSQDIEESIMQQFQCREILHTAAERINGLLVFQDIHMNLVLLTE NRKINLYNYGFHFSTLLMDKLGIKSYVGVGTPVERLEDIRSSYRQAYQNSQVAKYSHNKS FLANPPQEQNQQELQNLINTVMEELPLFMQLPQEDTALQRVEAAYDSLERIPSHVLSDVM VLSLSIVNIVLATLRDKGISYNEIYNTDHLPYTHILKLTDEKDIRQYVLQMVRFTLYQVN LYTDARGNKLIHTILQYLNENLSKSTLTLKKIAEINYVNPSYLSRIFKEVTQMNFVDYLA TLRIEKAKRLLQTGNLRIYEIAESVGISDPNYFSKLFKKYTNMTPAQYKEEIHRQEDF >gi|229784127|gb|GG667608.1| GENE 135 151579 - 153330 891 583 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 311 582 297 571 577 172 33.0 2e-42 MGKRKHTGLNLWNWAFHSKISRKISTVFFGFLLLSVILTYFSYRYVSKDHVLSNMEQHSR QTLTSIRSNIVEMVENTNNSYSLLLKSGISDLTKLPLKPSDWKFYDNYLFTIIDSYSHLD SIYIMNLKGQLYGVDKFGIKTPVIHSVTEAPWYQIAMEAKGSCILSLHAGNIFLPDYSED FMSAIRVINDLETQKPIGFSIMNIPTRSINQIFEHVFNPSDIEITLYDNSGEIISTYGNI KEGGDHHRICRFIGPDSSYYSTSGQGSLSAGIYIPEYQWKLTATFPLKYESTLSTSLLHI SFWIFFINMVLMLICVIIIAKSIQTPVSQLIQAMGRAENGNFEPVAVRHPDSELGLLEIH YNHLTERLKFLMDSLVQEQKLKRKTELRALQEQMKPHFLYNTLDTIGYIVLTGDTCQAYD AITALGSFYRQSLSRGKQFITLEQELQIIKDYTSLLSLRYEELFTVTYDVDETLYQHEII KLLLQPLVENSVYHGIKPLGEPGIIIISVKRQNELICLTVSDNGVGMDENLLQSLRSSSP SPRDSFGLAGTIERVHISYPEKGAVELHSQKGKGTSVTIKIPF >gi|229784127|gb|GG667608.1| GENE 136 153604 - 154953 1402 449 aa, chain + ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 1 446 1 432 438 229 29.0 1e-59 MKKKTIRTITALALAGTMALGLTACSTGTGNNTAGGDTTGAETVKETTQEANAGTEEVPV IKLWHIFGGDTDPNKAIIDQVVKDAEEKFHVKIESDTAENEAYKMKLKAAIAANETPDIF 
YTWGHGFIKPFVEAGKVEALDDYLTDDFKSHLGPSTLTGFQFDGKTYALTTDQSVACMFY NKEMFENLNLEVPETFDDFLTVCQTFLDNGITPLTVGGKEPWTIAMYHDLLALRAVGSEG VKAATGKETGFDDPGFLEAAQCLKKLVDMGAFPEGSAGISREESEVPFLQGQIPMYLNGS WTATRVYKDSSLVQGKIGVFPFPVLKDGKSGVTDFTGGPDTAFAVSAATKDPKLTTEVAQ YISYELAIGKYKIGSSILPYVNVDVDESEINPLLMEIYDFTKDATSYTIWWDNLLEGKDA TVYLNKLQELFVGSITPEQYVAELQKLNQ >gi|229784127|gb|GG667608.1| GENE 137 155011 - 155889 630 292 aa, chain + ## HITS:1 COG:BH3681 KEGG:ns NR:ns ## COG: BH3681 COG1175 # Protein_GI_number: 15616243 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 1 292 1 288 293 209 41.0 5e-54 MNRVLSNKKAVFIFLFPALILFLTIIIVPIFMSMTYSLTEWDGIGKKVFTGFDNYKELFL TNSDGFWRAVKNSLIFAAGSVFVQLPISLILALILARGVKGERFYVSVYFIPVLISTVVI GQLWMKIYNPQYGLLNTVLRSMGLDKLAGNWLGDTKKVIFAVIVPVLWQYIGYHMLLMYA SVRSISEEIFEAARIDGANGIQTALHITIPLMKPILKVCVTFAVVGSLKNFDLVYVMTDG GPAGASQLPSTLMVETIFSRNMYGYGSSMAIFIILECFLFAWLIRKAFRDNE >gi|229784127|gb|GG667608.1| GENE 138 155900 - 156730 657 276 aa, chain + ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 2 276 19 293 293 218 44.0 7e-57 MKKIKLSRVIIQACLIFWAAIQIFPLYWLLTFSLKDNTEIFQGNMVGLPKVWRFENYTHA FMGGNVGRYLANSIIVTAATILIVCIASLMASYALIRMKWKLRKTFQLIFIMGITIPIHA ALLPVFIMMRQVHLINTYWCLIIPYSAFAVPMAIMISSGFISSIPVELEEAACIDGCSIY KIFIQIILPLMKPSLATIAIFTFLQSWNELMFAVVFISKPAYRTLTVGIQSLVGQYTTDW GPIGAGLVVATFPIIIIYVILSRQVQQSLIAGSVKG >gi|229784127|gb|GG667608.1| GENE 139 156740 - 157936 861 398 aa, chain + ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 6 386 4 380 384 300 39.0 
3e-81 MERAYFDEVVNRRGTNCAKYDEMDEKYGKDVLHCGVADMDFRVPAPIREACRKVTDHGIF GYTNLPSCYPAVVREWIQREYGCLVKDEWILFSPRINIGLNMAVETFTKPGDKIIVHTPA YPALTDAVKKHDRTMIESPLEWNGTRWNMNLSALESEMDSSVKMMILCNPHNPTGRVWEA AELRAVESFCLRHDLLLLSDEIHADIVRQGRKFCSILSLSEQMRKKMIVYQSVTKTFNIP GIMFSNMIIPDDSMRQAMKKTVDREGFHNPNIFAAAVIEPAYRECEQWKRELNHRLDENM RYLTGYLKENMPLFELTEPEGTFLAWIDYRKTGLTEEQILQLFLEKAKVSVYGGSHFGEA GRGYIRLNAAVPKCVLEEILNRIEKAYFLVSNEEKHVR >gi|229784127|gb|GG667608.1| GENE 140 158047 - 158389 388 114 aa, chain + ## HITS:1 COG:FN0089 KEGG:ns NR:ns ## COG: FN0089 COG3192 # Protein_GI_number: 19703441 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Fusobacterium nucleatum # 2 113 3 115 360 102 52.0 1e-22 MINRIIMIIMALGAVAGGIDRIMGNRFGYGKKFEEGFQYLGPTALSMVGIICLAPLVSGT LGKLIIPVYRFLGVDPAMFGSLLAIDMGGYQLSMELAENPMIGRYAGIVAASVF Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:16:31 2011 Seq name: gi|229784126|gb|GG667609.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld2, whole genome shotgun sequence Length of sequence - 107587 bp Number of predicted genes - 84, with homology - 84 Number of transcription units - 34, operones - 17 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 14/0.000 + CDS 2 - 304 222 ## COG1089 GDP-D-mannose dehydratase 2 1 Op 2 . + CDS 319 - 1257 504 ## COG0451 Nucleoside-diphosphate-sugar epimerases 3 1 Op 3 . + CDS 1254 - 2111 374 ## gi|266619323|ref|ZP_06112258.1| conserved hypothetical protein 4 1 Op 4 . + CDS 2135 - 3853 1142 ## COG1109 Phosphomannomutase + Term 3882 - 3946 15.3 5 2 Tu 1 . - CDS 3925 - 4254 211 ## PROTEIN SUPPORTED gi|149020132|ref|ZP_01835106.1| 50S ribosomal protein L9 6 3 Tu 1 . - CDS 4435 - 4599 154 ## gi|266619326|ref|ZP_06112261.1| conserved hypothetical protein - Prom 4672 - 4731 2.5 7 4 Op 1 . 
- CDS 4802 - 5347 -500 ## gi|288869921|ref|ZP_06112262.2| hypothetical protein CLOSTHATH_00346 8 4 Op 2 1/0.250 - CDS 5391 - 5717 209 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 5885 - 5944 6.5 - Term 5779 - 5821 2.1 9 5 Op 1 . - CDS 6065 - 6862 355 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 10 5 Op 2 . - CDS 6852 - 7646 86 ## Ping_0452 xylose isomerase domain-containing protein 11 5 Op 3 . - CDS 7640 - 8566 277 ## BC1001_0667 hypothetical protein 12 5 Op 4 . - CDS 8581 - 9519 184 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 13 5 Op 5 . - CDS 9506 - 10261 259 ## bpr_I0518 FAD dependent oxidoreductase 14 6 Tu 1 . - CDS 10667 - 11350 119 ## gi|288869922|ref|ZP_06112270.2| conserved hypothetical protein 15 7 Tu 1 . - CDS 13264 - 15552 450 ## COG0438 Glycosyltransferase - Prom 15661 - 15720 80.4 16 8 Op 1 . - CDS 16568 - 18070 318 ## Dd586_1549 methyltransferase FkbM family 17 8 Op 2 2/0.083 - CDS 18089 - 19450 288 ## COG0438 Glycosyltransferase - Prom 19514 - 19573 3.1 18 9 Op 1 26/0.000 - CDS 20325 - 21020 197 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 19 9 Op 2 3/0.000 - CDS 21069 - 21854 248 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 20 9 Op 3 12/0.000 - CDS 21847 - 23001 394 ## COG0438 Glycosyltransferase - Prom 23055 - 23114 6.6 21 9 Op 4 . - CDS 23157 - 24545 592 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 22 9 Op 5 . - CDS 24542 - 26194 747 ## Closa_0537 hypothetical protein - Prom 26259 - 26318 6.2 + Prom 26352 - 26411 2.8 23 10 Tu 1 . + CDS 26491 - 27951 1222 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 28001 - 28056 8.2 - Term 28114 - 28166 14.4 24 11 Tu 1 . - CDS 28214 - 29884 1593 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Term 30181 - 30246 1.3 25 12 Tu 1 . 
- CDS 30298 - 31608 801 ## COG1672 Predicted ATPase (AAA+ superfamily) + Prom 31831 - 31890 9.8 26 13 Op 1 1/0.250 + CDS 32082 - 33722 1372 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 33751 - 33789 6.0 27 13 Op 2 . + CDS 33800 - 35269 378 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 35289 - 35331 5.7 - Term 35277 - 35319 9.5 28 14 Tu 1 . - CDS 35361 - 36920 1179 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 36979 - 37038 6.7 + Prom 38233 - 38292 7.2 29 15 Tu 1 . + CDS 38326 - 40146 1592 ## COG4750 CTP:phosphocholine cytidylyltransferase involved in choline phosphorylation for cell surface LPS epitopes + Term 40203 - 40244 -0.8 + Prom 40161 - 40220 8.4 30 16 Op 1 1/0.250 + CDS 40261 - 41385 950 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 41454 - 41495 6.3 31 16 Op 2 . + CDS 41533 - 43383 1720 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 43421 - 43459 9.1 - Term 43408 - 43445 5.1 32 17 Tu 1 . - CDS 43505 - 45622 2087 ## COG3968 Uncharacterized protein related to glutamine synthetase - Prom 45701 - 45760 7.0 - Term 45653 - 45694 10.2 33 18 Op 1 . - CDS 45782 - 46645 963 ## COG0191 Fructose/tagatose bisphosphate aldolase - Prom 46681 - 46740 8.5 34 18 Op 2 . - CDS 46869 - 48356 1639 ## COG0498 Threonine synthase - Prom 48428 - 48487 6.1 + Prom 48507 - 48566 4.2 35 19 Tu 1 . + CDS 48595 - 49404 552 ## COG1489 DNA-binding protein, stimulates sugar fermentation + Term 49584 - 49617 -0.1 36 20 Op 1 . - CDS 49622 - 50497 718 ## COG1512 Beta-propeller domains of methanol dehydrogenase type 37 20 Op 2 . - CDS 50548 - 51633 960 ## Closa_0460 hypothetical protein 38 20 Op 3 . - CDS 51630 - 52868 1162 ## COG4260 Putative virion core protein (lumpy skin disease virus) - Prom 52906 - 52965 5.5 - Term 52882 - 52940 6.3 39 21 Tu 1 . - CDS 52967 - 54034 1066 ## COG1396 Predicted transcriptional regulators - Prom 54074 - 54133 7.7 40 22 Tu 1 . 
- CDS 54182 - 55012 1069 ## COG0648 Endonuclease IV - Term 55076 - 55137 1.2 41 23 Op 1 . - CDS 55206 - 56168 1060 ## Closa_0457 hypothetical protein 42 23 Op 2 . - CDS 56146 - 57108 1205 ## COG1131 ABC-type multidrug transport system, ATPase component 43 23 Op 3 . - CDS 57108 - 59600 2745 ## Closa_0455 hypothetical protein 44 23 Op 4 . - CDS 59608 - 60552 1177 ## COG0642 Signal transduction histidine kinase 45 24 Op 1 . - CDS 61514 - 61642 128 ## Closa_0454 integral membrane sensor signal transduction histidine kinase 46 24 Op 2 . - CDS 61632 - 62339 1126 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 62375 - 62434 5.0 47 25 Op 1 . - CDS 62445 - 63365 883 ## Cphy_2412 hypothetical protein 48 25 Op 2 . - CDS 63438 - 63782 451 ## gi|266619371|ref|ZP_06112306.1| conserved hypothetical protein 49 25 Op 3 . - CDS 63782 - 64453 514 ## Cphy_2414 hypothetical protein 50 25 Op 4 . - CDS 64402 - 64869 209 ## Cphy_2414 hypothetical protein - Prom 64984 - 65043 1.5 - Term 64965 - 65008 8.1 51 25 Op 5 . - CDS 65073 - 66530 1808 ## COG0855 Polyphosphate kinase - Prom 66556 - 66615 80.4 52 26 Op 1 11/0.000 - CDS 67463 - 68104 787 ## COG0855 Polyphosphate kinase 53 26 Op 2 . - CDS 68104 - 69672 1812 ## COG0248 Exopolyphosphatase 54 26 Op 3 . - CDS 69743 - 71047 1582 ## COG2607 Predicted ATPase (AAA+ superfamily) 55 26 Op 4 . - CDS 71112 - 72248 1237 ## COG0077 Prephenate dehydratase 56 26 Op 5 . - CDS 72245 - 72994 857 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 57 26 Op 6 . - CDS 73050 - 73553 292 ## Closa_0447 hypothetical protein - Prom 73574 - 73633 3.4 - Term 73598 - 73644 5.1 58 27 Op 1 7/0.000 - CDS 73655 - 75238 1202 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 59 27 Op 2 1/0.250 - CDS 75259 - 77175 1759 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Term 77201 - 77247 8.7 60 27 Op 3 . 
- CDS 77260 - 78678 1376 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 78796 - 78855 7.8 + Prom 78781 - 78840 7.6 61 28 Op 1 38/0.000 + CDS 78958 - 79845 677 ## COG1175 ABC-type sugar transport systems, permease components 62 28 Op 2 3/0.000 + CDS 79861 - 81213 930 ## COG0395 ABC-type sugar transport system, permease component 63 28 Op 3 . + CDS 81230 - 82498 760 ## COG0673 Predicted dehydrogenases and related proteins 64 28 Op 4 . + CDS 82504 - 83313 242 ## Pjdr2_5268 hypothetical protein - Term 83216 - 83252 -0.8 65 29 Tu 1 . - CDS 83397 - 84893 1400 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 66 30 Op 1 . - CDS 84952 - 86670 1598 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 67 30 Op 2 . - CDS 86694 - 87290 400 ## gi|266619390|ref|ZP_06112325.1| putative prokaryotic N-methylation motif protein 68 30 Op 3 . - CDS 87299 - 87700 360 ## gi|266619391|ref|ZP_06112326.1| putative general secretion protein G 69 30 Op 4 . - CDS 87697 - 88275 476 ## gi|266619392|ref|ZP_06112327.1| conserved hypothetical protein 70 30 Op 5 . - CDS 88290 - 88928 712 ## gi|266619393|ref|ZP_06112328.1| conserved hypothetical protein 71 30 Op 6 . - CDS 88939 - 90417 1413 ## ELI_4139 hypothetical protein 72 30 Op 7 7/0.000 - CDS 90440 - 91627 845 ## COG1459 Type II secretory pathway, component PulF 73 30 Op 8 . - CDS 91644 - 92216 501 ## COG1989 Type II secretory pathway, prepilin signal peptidase PulO and related peptidases 74 30 Op 9 . - CDS 92206 - 92634 557 ## ELI_4145 hypothetical protein 75 30 Op 10 . - CDS 92650 - 93072 561 ## gi|266619398|ref|ZP_06112333.1| general secretion pathway protein G 76 30 Op 11 . - CDS 93098 - 93277 280 ## Closa_0445 hypothetical protein - Prom 93304 - 93363 6.8 + Prom 93324 - 93383 4.5 77 31 Tu 1 . + CDS 93408 - 94412 1065 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D 78 32 Tu 1 .
- CDS 94460 - 97048 1877 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 - Prom 97103 - 97162 6.1 + Prom 97170 - 97229 7.8 79 33 Tu 1 . + CDS 97277 - 97870 544 ## COG5263 FOG: Glucan-binding domain (YG repeat) 80 34 Op 1 1/0.250 - CDS 97964 - 99286 1156 ## COG0534 Na+-driven multidrug efflux pump - Term 99314 - 99344 -0.6 81 34 Op 2 . - CDS 99364 - 100341 1095 ## COG0673 Predicted dehydrogenases and related proteins 82 34 Op 3 . - CDS 100384 - 101136 497 ## COG4509 Uncharacterized protein conserved in bacteria 83 34 Op 4 . - CDS 101136 - 102509 865 ## COG4932 Predicted outer membrane protein 84 34 Op 5 . - CDS 102448 - 107586 3246 ## Closa_0424 LPXTG-motif cell wall anchor domain protein Predicted protein(s) >gi|229784126|gb|GG667609.1| GENE 1 2 - 304 222 100 aa, chain + ## HITS:1 COG:CAC2180 KEGG:ns NR:ns ## COG: CAC2180 COG1089 # Protein_GI_number: 15895449 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: GDP-D-mannose dehydratase # Organism: Clostridium acetobutylicum # 7 99 259 347 355 115 59.0 2e-26 LEPKGFEFVETAFRYAGIDVEWHGSGITENGIDKATGKTIVTVNPEFFRPAEVEVLLGDP SKAEEKLGWKREITFQELVRRMVENDLAIVEKEIKISNIK >gi|229784126|gb|GG667609.1| GENE 2 319 - 1257 504 312 aa, chain + ## HITS:1 COG:MT0121 KEGG:ns NR:ns ## COG: MT0121 COG0451 # Protein_GI_number: 15839493 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Mycobacterium tuberculosis CDC1551 # 3 312 2 311 318 184 34.0 3e-46 MKKALIIGAAGFVGDYLIDHIQKNCIWSITVTKLPQENIVKKGIEILDLNLMKPEEIITI LDRVQPDYIFHLAAQSSVALSWKKPGLTVDINIKGTLNLLDAVRELQKRPRLLLIGSGEE YGHVLEEEIPITEETLTRPGNIYAATKACQNMIGKIYCDAYHMDIMSVRAFNHIGPNQAP LFVVSDFCKQASEIEKGIHEPVIRVGNLTARRDFTDVRDVVRAYVMLMENGLAGETYNVG SGTAVSIKSILDTILHMAHCEIEVSVDPEKIRPVDVPIIEADISKLQAVTGWKPEIPLSR TLAETLDYWRRQ >gi|229784126|gb|GG667609.1| GENE 3 1254 - 2111 374 285 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619323|ref|ZP_06112258.1| ## 
NR: gi|266619323|ref|ZP_06112258.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 285 1 285 285 545 100.0 1e-153 MNRLKTISLSIFVLISLLLLSGCGYRAVPSQYDLKLKTKVSELSCENGGNLAKLEITIDS RRYQDLTSDNNIFLAYHLLDTAGEVLVQDGIRTSLIPVKARGIKNEIWEVLVPLEEGDYI VEADLVEEGVTWFSAQGMETLKIPLTVQNTVIPDYSGIILTMDSTVANSADANSATDAVS ADIHSPITVTIRNNSGIALCSSGYEATQLSYLIKDKKGTVLAEGERVKLPENLASGQTGE LTFTPQAELLSTPGDYMIEIDLLREGTAWFKDLGLIPLQIPVTIK >gi|229784126|gb|GG667609.1| GENE 4 2135 - 3853 1142 572 aa, chain + ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 8 555 6 556 575 501 47.0 1e-141 MESEVLSRWNKWNQEVLEDPDLIDELKSIQQNDRELEDHFYRDLEFGTGGLRGIIGVGTN RMNIYTVGKATQGYANYLNQKCSCPSVAIAYDSRIKSDVFARRAACILAANGIKVHLYQE LMPTPSLSFAVRYLECSGGIVITASHNPARYNGYKVYGSDGGQITTETANSILNEINNID PFSDVKYMDFDAALSQKLIVYIEEKTVTAYIEAVSTQALCGDEINKNISVIYTPLNGSGL HCVMRTLKENGFTNIAVVKEQEQPDGNFPTCPYPNPEMKEALALGIQYASQLGSELLLAT DPDCDRVGTAVKSADGYELLSGNEMGLLLFDYICRRRIALGKMPHNPILVKTIVTTDLAK LIAADYGVEVIDVLTGFKFIGEQIGLLEEKGEAERYIFGFEESYGYLSGGFVRDKDGVNA SLLICEMFAYYKSHGQTLLEVLDALYKKYGYCLNTLRSYTFEGAEGFDTMTKIMENFRTA APTAIAGKAITAISDYKTSVTSNTDGSKTIINLPQSNVMKFLLEGNTSVVIRPSGTEPKL KFYISVSAASRREAELAEVAIASEMEKTYFRK >gi|229784126|gb|GG667609.1| GENE 5 3925 - 4254 211 109 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149020132|ref|ZP_01835106.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP23-BS72] # 11 104 11 103 107 85 41 8e-16 MLSETTGFRAIYIYCGKCDLRKGIDGLATLLKEQFHLDPFQKEILFLFCGCHTERFKGLV WEGDGFCLLYKWIEAGRLRWPRSQEEAATISPEELHLMLTRMTILKRSS >gi|229784126|gb|GG667609.1| GENE 6 4435 - 4599 154 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619326|ref|ZP_06112261.1| ## NR: gi|266619326|ref|ZP_06112261.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium 
hathewayi DSM 13479] # 1 54 1 54 54 94 100.0 3e-18 MNEAMQVRAASWPAMVKQRNDSGLTVKEWCAANAIQESVYYYRLNRLRKIALDV >gi|229784126|gb|GG667609.1| GENE 7 4802 - 5347 -500 181 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288869921|ref|ZP_06112262.2| ## NR: gi|288869921|ref|ZP_06112262.2| hypothetical protein CLOSTHATH_00346 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00346 [Clostridium hathewayi DSM 13479] # 1 181 1 181 181 336 100.0 5e-91 MSRIFLNLSFTVWQPAVIRLCHLNLHRSPFRGYNPCSYSVLFPTRFRRHDGVWRIQVLGH SVHQGMHQLLSYSMLSSSFLTTEERFISRSFSSTSGLCHGLPPYFPVSEWLSISWLPFYP VPWCNREDAAVKVDDTPLIEGSREYLRYGFQHAEVFIAYDELHIVKPARTLCLHSYPQLH R >gi|229784126|gb|GG667609.1| GENE 8 5391 - 5717 209 108 aa, chain - ## HITS:1 COG:CAC2502 KEGG:ns NR:ns ## COG: CAC2502 COG0697 # Protein_GI_number: 15895767 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 5 108 8 111 112 68 44.0 3e-12 MKYYIVLIIMTILGACGSFYLKKASDAKKLKILVITPAFYLGGFLYLLSAILNIYILKFL DYSVVLPLTSITYIWTFGLSYCFLKEGITQKKIIGIVLILLGSIIIAV >gi|229784126|gb|GG667609.1| GENE 9 6065 - 6862 355 265 aa, chain - ## HITS:1 COG:Cgl1444 KEGG:ns NR:ns ## COG: Cgl1444 COG0463 # Protein_GI_number: 19552694 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Corynebacterium glutamicum # 26 244 12 227 270 68 27.0 1e-11 MIYNSIPGVPNFVSEEFVQKRSSYCICIPIINEGRRIHDELKRARKYKIDTIADIIICDG GSIDGSTDSTLLEAFGVNTLLIKKGPGKQGAQLRMGIWWALERGYKGVITIDGNNKDSIE DVPKFIEKLEAGYDFIQGSRFIKGGKAINTPFLRYLSVRLLHAPVISLTAHHHFTDTTNA FRGHSRNYLTNPKVQPLRDIFNTYELLAYLSTRASQIGLRVCEVPVTRAYPIKGKVPTKI SFLKGNSELLKILFSNMLGKYNPQK >gi|229784126|gb|GG667609.1| GENE 10 6852 - 7646 86 264 aa, chain - ## HITS:1 COG:no KEGG:Ping_0452 NR:ns ## KEGG: Ping_0452 # Name: not_defined # Def: xylose isomerase domain-containing protein # 
Organism: P.ingrahamii # Pathway: not_defined # 4 263 2 270 271 131 33.0 3e-29 MLKLSISNISWDSKYDQEIYKYLLDNNIQGLEIAPTRVFPNNPYECVKEAHDFKEYLSKS YHLKISSMQSIWYGKTEKMFQSVKERKQLMEYTKKAIDFAAALECHNLVFGCPKNRSINC SGDYNIAVDFFSAIGEYAHNRNTVIALEANPVIYNTNFITSTAEAFSMVKEVQSEGFLVN LDIGTIIYNEESLDIIADNIGLINHIHISEPGLAGILKRKLHRQLAGILAGQYSKFISIE MGKQEDLQRVFDALSYVKEVFYDI >gi|229784126|gb|GG667609.1| GENE 11 7640 - 8566 277 308 aa, chain - ## HITS:1 COG:no KEGG:BC1001_0667 NR:ns ## KEGG: BC1001_0667 # Name: not_defined # Def: hypothetical protein # Organism: Burkholderia_CCGE1001 # Pathway: not_defined # 1 306 1 252 264 178 33.0 3e-43 MKTMLVGYTGFVGSNLSCQYDFTERYNSKNIEDAYGSKPDLLIYAGIRAEKYLANQDPNK DLRNIENAFYNIQQIQPKKLVLISTIDVYKSPINVNEDTPIISEDLQPYGANRYYLEQKV RDDFPETLIIRLPGLFGKNIKKNFIYDYIHYIPSMIQNEKFGKLATVRHELNNYYKLQDN GFYKCSDISCEEKEALKTIFKELGFSALNFTDSRGVFQFYNLKHLWNHIKYALKNDLYLL NLATEPIGIAELYHYLEGKKFQNIISNNPPYYDFKTKHDTLMGGKRGYIFDKDNVLAEIK TFIQEQSC >gi|229784126|gb|GG667609.1| GENE 12 8581 - 9519 184 312 aa, chain - ## HITS:1 COG:sll0501 KEGG:ns NR:ns ## COG: sll0501 COG0463 # Protein_GI_number: 16332035 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Synechocystis # 10 292 5 291 318 87 26.0 2e-17 MLLNKEKNFISAVVYVYNQETQIYRFLELVNETLRNSFEKYEIICVNDASTDGSIKEIKR YASTVKGNVISILNMSFYQGLELSMNAGVDLAIGDFVFEFDLPHASYSSNLIIDVYQRSL QGYDIVAAAPLHGKRTSSKIFYAVFNRLSHTQYLLRTESFRILSRRGINRVHSMSRTIPY RKAVYANCGLKTDVITYDNESYSSFQDKDTLHKQKDTALDAMILYTDIAYRFSMVFAYIM MFITTLTGIYTVCIYIGGRPVVGWTTTMLFLSFAFLGLFVILTIIIKYMSLILKINFNKQ KYIIESIEKITN >gi|229784126|gb|GG667609.1| GENE 13 9506 - 10261 259 251 aa, chain - ## HITS:1 COG:no KEGG:bpr_I0518 NR:ns ## KEGG: bpr_I0518 # Name: not_defined # Def: FAD dependent oxidoreductase # Organism: B.proteoclasticus # Pathway: not_defined # 1 246 128 378 378 325 64.0 1e-87 MCDGAFLTEEYTYDADILKKYLLNQLSQYPNVEICYNARINKITKKSTYFELTMNDGTTY ETDFLLNATYASTNQISSMLGYEPFKIKYELCEIILCNVTEKLKNVGITVMDGPFFSIMP 
FGKTGYHSLTSVTFTPHITSYDAVPTFDCQKKSNGGCSPKQLGNCNDCPAKPETAWPYMS NLATKYLKEEYGFEYSHSLFSMKPILKASEIDDSRPTVIKQYSTDPTFISVLSGKINTVY DLDEVLNNATK >gi|229784126|gb|GG667609.1| GENE 14 10667 - 11350 119 227 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288869922|ref|ZP_06112270.2| ## NR: gi|288869922|ref|ZP_06112270.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 227 611 837 837 433 99.0 1e-120 MFLILLYIGNMGLKYYDVTQKLSVIDKNGILCAQNDFFKMYLYEDSFYYYIDNKKIPDNG FIYMHIYPDNTQNKDFLNRDINLEDIQEKTVLPFKKKALIQRPIPDYKFEYIMTGISSKV ADNQYKEYSQNTIYSNDIYTNNIYSITAFNLTDENWVNGISQNGNCILLNGNLSDYINLT GKTIQLLSGEKVQIIDVKKVDDNWLYIFLDKKISNQNGYPVQLIISP >gi|229784126|gb|GG667609.1| GENE 15 13264 - 15552 450 762 aa, chain - ## HITS:1 COG:PA1391 KEGG:ns NR:ns ## COG: PA1391 COG0438 # Protein_GI_number: 15596588 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Pseudomonas aeruginosa # 285 653 146 503 591 164 27.0 9e-40 MDFDFTKEPGLIQLKPCLEEKMIPLMSIITPYYNAFKYFEQTFNCVINQTFPWFEWIIVD DGSTQDISLLLEYANSDSRIRIIHQSNKGQASARNKGIIEAASDIIVPLDADDLIIPTYL EYVYWGLKCHPEASWCYTDSLGFQEECYLWKKAFSSEKMKKENLLVCTAAIRKKDLMAVG GYDESNQHYDEDWKLWLELLAMGKYPIHLSVFGFWYRRTASGMGNQVRINRDLKNCSDSL VQQSAQKITKVIEAIEYPRQSSANLFSKPVCSNWTYKIPQITEKKHILMLLPWLEMGGAD LFNLELVKKLNKSIYEITIITTVNSSNTWKQRFEEYVSDIFTLPDFLEISNYPEFITYII NSRQIDLIFLSNSYYGYYLMPWIRKEFFNIAIVDYIHMEEWYWRKGGFARTSGVMKNIIE KTYVCNERTRKILIKDFRKAPENVETVYIGVDKDLYNSVNIESGLVRKQFNLEEDRPIIL FPCRMHPQKRPFMMLEIAKVLKKSEMRVAFVAVGDGPQLEEMISFIHENQLEDTVFFAGR QADMRPYYKDAAITLICSLKEGLALTAYESLSMGTPVITSDVGGQAELIDEKVGRVIPLM QDEATQLDSRLFLTEEILLYTDAINSILADREYYKILCYECRMRIEESFSTDIMIKKMEQ EFLSLDLPAVSALRHKKSDELRKYSDIIDDYLEIYVEYEMAESKQEEIWAAREWYRNLYE SNISCTNKLESANNQRIWKVVFKENKFLRKGFARIWNSLTKR >gi|229784126|gb|GG667609.1| GENE 16 16568 - 18070 318 500 aa, chain - ## HITS:1 COG:no KEGG:Dd586_1549 NR:ns ## KEGG: Dd586_1549 # Name: not_defined # Def: methyltransferase FkbM family # Organism: 
D.dadantii_Ech586 # Pathway: not_defined # 61 306 118 358 361 152 36.0 4e-35 MSGEENKNIFETLKNYVQSHTVDEIIFLGLKHFFDINYDISLEALIEKYRSEKFNELGEN LHYFQDNIRDLKENFCEYEKIYNSLKEEKSRNVFSNMLCAKVFMDISFIREAYDSDDIYF DLSIWAQLSNEVYVDCGGYSGDTALSFICHCPDYKKIFVLEPMEEAEILCKNNLQWFIQE GNVTVIKRAAYDQDIRLTFNEKHGTGDSCIDENGKTSVFATSLDQMISEPVSFIKMDIEG SERKALMGARRHIAQDAPKMAICVYHLKDDFWKIPQLLLSINSNYDIIFRQHRPDVFSET VMYAIPKNSNEVVLTNHRADLIYCRIKNALNHLLLISTDEYTNLITNVKDKAWYLYQIRM HIKEIEHLRSIQTDFNGQFNTLQKQYQEQKLWLDELQTAKDYLDAQNKAYLQEIKNLNSS LVEQKKWTESLQKGKDYLENQNADLIKEVEALQNILSDQKSWTEELQKSKDYLEICVNQQ KLDIEKAKTDIFNNEKQISE >gi|229784126|gb|GG667609.1| GENE 17 18089 - 19450 288 453 aa, chain - ## HITS:1 COG:PA1391 KEGG:ns NR:ns ## COG: PA1391 COG0438 # Protein_GI_number: 15596588 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Pseudomonas aeruginosa # 1 367 150 503 591 179 29.0 1e-44 MIFPWLEMGGADLFNLEIVKKLDKKNYDLTIITTVTSSNTWRQRFDDYVPEIFSLPDFMD VVDYPEFISYIIKTRRTDLIFLSNSYFGYYLLPWIRNSFPNLAVCDYVHMEEWYWRHGGY ARVSGIFGDYIDNTYVCNKGTRNVLIDSFGRNPETVKTLYIGVDSQYYSPGSFVEGKVYK MFDIDIDRPIILFPCRIHPQKRPFLMIEIARTMIDQGSNAAFVVVGDGPQLGELMDASAQ LGLSNTIYFAGRQEDMRPFYIDAAITLICSLKEGLALTAYESLAMGTPVITSDVGGQAEL IDDSVGAVIPLLQKEADSLDCRKFMPDELNLYTNAIFKLLNKGNEEYYQTICKNCRKRIL EGFSTDLMIQKLEYEFKALCSKYPSTIQSFSENLKIAQELTTLYTEYELLEAYAKNVSQP MDTKTELIRIANSKWGNRIIKIAFKLKLNKLFH >gi|229784126|gb|GG667609.1| GENE 18 20325 - 21020 197 231 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 30 230 18 229 311 80 25 3e-14 MMFNKSSEKIDSLKEYMIKLVKRQLMFEEFWALQNISFEIKKGEAVGIVGLNGSGKSTLL KTIAKVLKPTKGEIEVVGTIAPLIELGAGFDSNLSARENIFLNGAVLGYNRTQMREKFES IMDFAELWDFVDVPIKNFSSGMVARLGFSIATSNMPDILIVDEILGVGDYKFQRKCEERM SKIIDNGATIVFVSHSIEQVREVCSRAIWLEKGHMLMDGSVDEVCDKYSES >gi|229784126|gb|GG667609.1| GENE 19 21069 - 21854 248 261 aa, chain - ## HITS:1 COG:CAC2329 KEGG:ns NR:ns ## COG: CAC2329 COG1682 # Protein_GI_number: 15895596 # 
Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Clostridium acetobutylicum # 14 257 11 254 258 121 31.0 1e-27 MDKIKNYYMGFQHYQALLYELVIRDIKVRYKRSFLGLLWTVINPILTMAVMTIVFSKLFR FQIENYTTYFLVGNILFTFFTEATNNSMHSVLDNSNLIKKVYVPKYLFPISKVMSSVVNL FFSFIALIIVMIATRVPFQATMWLTPIVLCYIIMFAAGIGLILATVMVFFRDIAQLYSIV TLLWMYLTPIFYPVDLLMENAPWALTFNPMYHYIDFMRKLVLDGTLPSISENMLCLSISV ITLLIGLIVFYRKQDKYILYI >gi|229784126|gb|GG667609.1| GENE 20 21847 - 23001 394 384 aa, chain - ## HITS:1 COG:SP0353 KEGG:ns NR:ns ## COG: SP0353 COG0438 # Protein_GI_number: 15900282 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 14 348 2 334 372 220 35.0 4e-57 MADNTQEGAQTDCIRVLHFVSTLSRGSGVMSVIMNYYRHIDRNEVQFDFLHFVACEDSYI EEIRALGGRIYCIDKPGSSFQSVKQLDLFFRLHAEEYTWLHNHEVYLTFLLRPIAKRYGL EKFIVHCHATKYSDKTLHAVRNRILCLPIRFMKVDRFACSEAAGKFLYGEKMLKAGKVFI MHNAIDCEKFRFRPELRERLRKEMGLEGKFVIGHVGRFERQKNHEFLIEVFAEVKKSIPQ AVLLLIGDGSLKERIKSKVVEADLKTSVIFLRQRNDVNEMLHVMDVFVLPSYYEGLPVSC IEAQANGLPCVISGTITKEVCINENVSVCNLENPPCVWSELLKHLTNQTISQRILRQDSI LKNGYCLKSESAILRKKYLGGYNG >gi|229784126|gb|GG667609.1| GENE 21 23157 - 24545 592 462 aa, chain - ## HITS:1 COG:CAC2330 KEGG:ns NR:ns ## COG: CAC2330 COG2148 # Protein_GI_number: 15895597 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Clostridium acetobutylicum # 57 446 58 445 461 223 37.0 4e-58 MRNYEQYKRLIKLFSSSVIVLLEVSAYWIVWHRYYNNLIMAPFFRRGNWLMVAVYGILLL FFLKTYGGLKIGFYKTGNVIYSQILSILFVNIITYLQIALLALGFPPVGVMFLMTLTDVV LIVLWAFIFQAVYEKMFPRKRMLLVYGDIPAFHLQEKINSREDKYQICGAVHIRIGIRRI MEMAGDYDAVIIGDVPSHERNCLMKECFGASIRSYTVPKISDILLSSSVELNIFDSPLYL SRNEDLKVEQKFFKRLIDILFSSLGLIVTSPFWAIISIMIKCSDRGPVFYKQLRLTKDGK VFEIYKFRTMVQNAEEDGVARLASESDNRILPVGRFLRMTRLDELPQLINILKGEMSVVG 
PRPERPELAAEIEKEIPEFSYRLKVKAGLTGYAQVYGKYNTTAYDKLKLDLTYIRKYSVF LDLKLILMTPKIMLLKESTEGVKESLAEQSAREEAAAASLKD >gi|229784126|gb|GG667609.1| GENE 22 24542 - 26194 747 550 aa, chain - ## HITS:1 COG:no KEGG:Closa_0537 NR:ns ## KEGG: Closa_0537 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 10 538 2 513 514 488 51.0 1e-136 MSSVKINKSKNSKSDSALLSHIIDIFTFLILCLFPLFVMDKYYNILQAKYYFYIGSVILL VIAVIFAGCIEHNKLEKYITSFKWNDFIKKFSVTDVALLVFLIIAGISTISSDYVYESFW GNEGRFSGLFLLCLYGVAYFCVTRFGKLKRWYLDAFLIANMLVCIFGITDYFKMDLLHFK VGMLEEQKFMFTSTIGNINTYTALVGMVTAVSAVLFAVEDKWKKQSFYFVSLVISFFALV MGISDNAYLSLAALFGFLPLYLFKNKKGVKSYLLILTAFFSVIQCIDWINTGMGDKVLGI DSAFNIIVKYDGLLLIVTALWIICAIVYGLDYLYAKKKSRNVEIRGGDSEENGKKLTGCG VLRIIWLAILVICILGIGYALYDVNIQGNTERYGAIGGYLLFNDDWGTHRGYIWRNAMEC FEQFPVMKKIFGFGPDTFGIVLLDKTKGNIYGQIFDNAHNEYLHYLITVGITGLMAYLVF ISSFVVRVIRKAGKNQYIVAVFFAVLCYSTQAFVNLNLPIATPVMWMFLMLGELNIRERY RQSGSKDKGK >gi|229784126|gb|GG667609.1| GENE 23 26491 - 27951 1222 486 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 29 324 526 744 744 83 26.0 8e-16 MRKQTKLVAVLSAAALLAVGASMTSFAATGWQEENGEWVYYNKDGDQVTDTWAKSGDAWF YLNGDGVMATDELIDDGTNYYYVDSNGAMVTNGWVAVENENAGDTDEPDAYWYYFQANGK AYKSSGSASFKTINGKKYAFDSEGKMLYGWVNKEDATRITGDDAWQSGDYYCGDENDGSR ASGWMKLDVIKDDEDKSYWFYFNPSNGKKYAADDATKFAIKTINSKKYAFDETGVMQDGW TDTLATDADVATITEYQYFNGGDEGWRLQKGWVKVVPAEKVNKSAYDDDSTKWFYADGSG NVYNSALKTINNKKYAFNANGEMISGLWALNIDSTGKILGEMIEIDDAGKFDTAYAAYST NKNYAVYYFGSGDDGAAKTGNQTIDVDGDNYSYSFGTTGATKYKGVNETKKTLLVLGRKI KADKDYRYQSFGEDGKEVAGKVGGYLVNTSASIMKKKTNLKDADGMYYCTDEYGVVTYFG STAKAK >gi|229784126|gb|GG667609.1| GENE 24 28214 - 29884 1593 556 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: 
Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 29 189 526 710 744 75 28.0 2e-13 MRKQTKLVAVLSAAALLAIGASMTSLAATGWQEENGTWVYYDKNGDAVTEQWAKSGNNWF WLNEDGEMATDYLVDDDGNYYYVDANGAMVTNQWVSIENEDYDGDDGDEPLNYWYYFGAN GKAYKSSSNSNSASFKTINGKKYIFNDEAQMLYGWIDTDGQRVTGDDDWREGTYYCGDEN DGAQKAGEWAYLDITDNNWDDVDTGLSSNNLFDDEDQTRYFYFKTNGKKMTDEKGKTING RKYSFDEYGRMNAEWVNWDATPSTASAGSASYTTGYRYYGDPEDGARVTKGWFQVVPDSY LSPDDYEDDENNWYYSDKDGKLVASEVKTINGKKYAFDSYGVMKSGLKAIKFVGSSTTEI DEIKGDDGIANMHFDTEDNFKDNVSDLYGADYKLYYFGSGDDGAMKTGKQTVTIDGEDYS FLFNKSGSSKGAGKLGIDDDKYYLGGMLLKASKDDKYSVIKITKDASGTITALECLSTEE FLDDVNATSSIPTGKADDYDEYYDVKADAADTATTQYRLVNTSGSVQKNKNKAKDGDDRC FKVGSNKEIEAVFVES >gi|229784126|gb|GG667609.1| GENE 25 30298 - 31608 801 436 aa, chain - ## HITS:1 COG:PAB1516 KEGG:ns NR:ns ## COG: PAB1516 COG1672 # Protein_GI_number: 14521501 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Pyrococcus abyssi # 1 401 37 427 469 245 36.0 1e-64 MIQEFMKDKSAVYMTAVEAGIAINLELLSTATYLTFLGEEEATLMPSFKDFRTALQYITQ KAKESKVLLIIDEYPYLAQADKSISSILQATIDQEWKTSNIMLILCGSSMSFMENQVLGY QSPLYGRRTAQFRLDPLDYYESSLFVPRYSAEEKALVYGITGGIPQYLEMVDDSISIKEN ILEMYLNPNAYLYEEPANLMKQELKEPANYNAIVEAIAQGATKLNEISNKVQMENSNVSA CLKSLISLGLVEKESAITEEDNKKKTGYILADHMFRFWYRYVPKCMLLINTRRPERAYEK IIVPDLQNYMGKVFEKMCLQHVAMLNAQEKFPYDILKLGRWWGNNPVLKRQEEIDIMGIN DIDKTALFGECKYRNEVLDLDTLELLIQRSELFSRYRKKGYILYSKKGFSDSVLKLTAQQ SVYMKLFTLDELYDLE >gi|229784126|gb|GG667609.1| GENE 26 32082 - 33722 1372 546 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 29 314 526 736 744 84 26.0 4e-16 MRKQTKLVAVLSAAALLAIGASMTSFAATGWQEESGTWVYYDNDGYKVSNEWKKSGNNWF YLNDDGEMATETLVEDGDYTYYVDENGVMAANRWVQVANDDDNSDDAPANYWYYFQASGK AYKAGDSTSFKTINGKKYAFDDEGKMLYGWVDGTSNRITDDEGWKESTTLYYCGTEDDGA 
QRVGWAQIHVINKDEDDEDQDYWFYFKANGKKLFAEDGDSITNKTINGRKYGFNSWGKMV SGWTEATAGEIPTASNYQNYSTSEDGSRRKGWFKAVPTDKINKEANQDDEEKWFYANNDG YLVVNQLKTINGKKYAFYDKGEMLSGLYALGVDDDNFIWGAEKIDSESKVKSYWAPKEED SKAGIPLSDERATYVDENNIKLNVDVYYFGSSDDGSMKTGVQNVDVDGDTYAYFFGKSGN NKGKGFGSTTVGNYSDTVLGDGTVVKTFIYDKSAMYVQGRKIKADSDLKYEAFDAIGVKV SARNEGYKDEPLFLLNSSGSLMKNKKNVKDGNDMYYCTDGAGRVTYYGSEKCDSKDGVKH DKNNIH >gi|229784126|gb|GG667609.1| GENE 27 33800 - 35269 378 489 aa, chain + ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 371 488 496 610 621 114 46.0 4e-25 MKKVYYQLTLILAILVLTGVFNIFPAKGETLLTSIKDNDVPTMASDSNASKQRDIDKSKN TLISISSANIVFDLHHELDNTYSLPQPINVTIELAQSDTPKTNIHFEVSDNLTEKLSITP ESFDILEPERQADIQVAFIDKNLDTLNQLPYDSGYINILSENEILGQINVTYCIDGLYDE HYTNNIDRYSGTKIRIDGSFYKNIGSINLRNSVPPVMTITFNPYILDTSNVPNDMGVEIN NDSLGRILLNEVGNFEFSDHTFEQNIIDGNKKVTLYIQMKQSVFDDIKEKAKQNPSAYNG SYLLSDLVFEFKDGCRMGGRISPLPLVYKITPESYNNSSNSGGGNSSGSSNSSNTNSGPG IDSTFTSNEYNTIGSNDIGWRQSDNGMWQYLGPDGRVKTGWLWDPTGIWYYIDSTGNMVT GWAQINNIWYLLRPDGSMVTGWSFTADKWYYFNQDGSMRTGWFQDTDKRWYYFQNDGSMA METTTPDGL >gi|229784126|gb|GG667609.1| GENE 28 35361 - 36920 1179 519 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 30 325 527 726 744 79 26.0 2e-14 MRKQTKLVAVLSAAALLAIGASMTSFAAQGWQEENGTWVYYDKDGYQVSDQWQKSGNNWF YLNSDGEMTTDAIVEDDDYTYYVDANGAMVTNQWIQVENEDSYGDDDPDYFWYYFQSNGK AYKAPTSGKTSFKNINGKKYAFDSEGKMLYGWVNESSERQTGDDAWTNGLYFCGDENDGA QANGWALIDVVDDTQDDPDQSYYFYFDASNGKKVKASDASELKSKTINGLKYAFNEYGKM VSEWAMPKATADQASISSYHYFNLSDQGWRAKGWFNVVPEEDVNESAYNDDSAKWFYAES DGDLVKSEIKTISSKKYAFNNKGEMLSGVQALAMSSSSKIVGNKAIDDADLVDALKAGTL SFEGYNGNTYTIGSDVYVYYFGAGSDGAMKIGNQNVDVDGDTYAFSFGKTGSNKGRGLSE 
ADDVLYVNGLRIKADSDMRYQAFKNGTAVEDPTTIIDDQNAYLVNTSGSIMKNKKNLKDT DDGYYCTDSKGHVTGYSETKCGTKKDDGTVYTCSNPKHK >gi|229784126|gb|GG667609.1| GENE 29 38326 - 40146 1592 606 aa, chain + ## HITS:1 COG:TP0107_1 KEGG:ns NR:ns ## COG: TP0107_1 COG4750 # Protein_GI_number: 15639101 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CTP:phosphocholine cytidylyltransferase involved in choline phosphorylation for cell surface LPS epitopes # Organism: Treponema pallidum # 76 343 1 251 251 286 50.0 1e-76 MNRNIIICRSILENPAITQRELAQALDVSLGTANNLVKECISQHLITPAGTGGAYELLPA GLELLKQYKVDGALIIAAGFGSRFVPLTFEMPKGLLEVFGERMIERQIRQLHEVGIKDIT IVVGYLKEKFEYLIDKYDVKLLYNPEYSTKNTLTTIYRAREVLRGRNMYVLSSDNWMRHN MYHAYEGGAWYSSAFMEGETSEWCLDYNKKGRIGGVSVGGTNQWVMYGPSYFSKEFTAKF LPVLEAYYELPGTEQFYWENVYMEMLSGEAAKRLPDLKEAGEDIDMYVNRRPEDEVYEFE NLEELRRFDIKYQRHSDNEAMELVARVFKVPESEITEIRCLKSGMTNKSFLFKVKERHYI CRIPGPGTELLINRRQEKAVYDVVAGLGITEKIIYFDGETGYKIAEYYDGANNADPRNWE QMEKCMAIVRRLHHSGLTVEHSFDMRERIDFYEDLCRRHGEIPFEDYGKVRGWMTELMDR LDSMGRRKVLSHLDDNGDNFLMLEDGGVRLIDWEYAGMCDPLVDVAMCAIYSYYTDEETE RLIRIYLEREPSLDERYSVYAHMALGGFLWSLWAVYKAALGEEFGEYTIIMYRYAKHFYK KCREIG >gi|229784126|gb|GG667609.1| GENE 30 40261 - 41385 950 374 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 32 280 527 743 744 170 36.0 3e-42 MRCGKTMTALLLAGLVGSSVFMSADTAWAAAGWKQEGSQWKYIDTDGSEHKGWAKSTDGT YYYMDLSTGYMVTGWKQINNNWYYFRSNGAMATKWVEDGGKYYYLMDDTGVLVKGWLRIG DDYYYMKGDGAMSTGWREMDGGWFYFKENGKCTLGWGQIGSDWFYFGPDGKMLTGWQQIG GDYYYLNSDSKSVLGKMMTGWLSDGTNKYYMDTSSGKMTHGWKELDGAWYYFNDAGHMMT GWIQLSGVYYYLDPSTGKMAANTTLTIDGVSYTFASNGAYQNSNSSAPSTGGNISAPGSG TNNTNAPGTSNNNTNTSGGPGGSSTGNGSLSSAPGSSDNTPGGSSSPGGSNAPGGSSSNN GVYQLAPGLTNGPG >gi|229784126|gb|GG667609.1| GENE 31 41533 - 43383 1720 616 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 
# Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 311 516 526 731 744 212 46.0 1e-54 MKHRRLTGRITAAALSAVMGCSGAFLYSAPPAYAAGYERVNGYYQMLDGTVIDGVVARGI DVSRWQGNIDWNAVAADDVEFVMLGTRSKGAVDPYFHKNVKEASDAGVRVGAYIYSLATT PQMAVEEADFVLDLVKDYPISFPIAFDAEDSSTLGSLPPAQVTEIINAFCERVEAAGYYP MVYANDYWLANKIDMSNMHYDVWVARYEVKHNFSDPIMWQATSTGSIDGIAGNVDINFLY KDLTPYLPGNLWRTIGDKTYYYQNYQMQKDAWINDGTGWFYMNGDGLASVGWLEKLGLHY YLDETTGRMVTGWKQLSNSWYFFNSSGAMSTGWLDDNGSRYFLNADGTMKTGWLQQMNQY YYLDENSGKMATGWKQLSGKWYFLQDTGIMATGWLDQNGTKYFLNNDGSMVTGWHTEGDK KYFLTDSGAAAVGWLQLDGTWYYFNQGGDMARGWINPDGNWYYLNQDGKMQTGWLDDGGA KYYLSTSSGKMTVGWREVDGAWYYFNGSGAMVTGLTEINGQLYYLNPSDGRMAVSTTLDF DGVDYTVDANGVCTKVEEPADGTQDGQNGAEGNGQSGQSSQNEPAVNGQNTGSATQESPA DSGQTDSKKEIGPGIK >gi|229784126|gb|GG667609.1| GENE 32 43505 - 45622 2087 705 aa, chain - ## HITS:1 COG:CAC2658 KEGG:ns NR:ns ## COG: CAC2658 COG3968 # Protein_GI_number: 15895916 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Clostridium acetobutylicum # 7 705 4 696 696 877 61.0 0 MSEVVNVAEIFGKNVFNETVMRERLPKSVFKKLKKTIEDGAELDPSIADVVAHAMKDWAI ERGATHYTHWFQPLTGITAEKHDSFISAPTSEGKVIMEFSGKELIKGEPDASSFPSGGLR ATFEARGYTAWDCTSPAFLREDAIGVTLCIPTAFCSYTGEALDKKTPLLKSMEAVDEQAL RILRLFGNTTSKRVIPSVGAEQEYFLVDREKYLQRKDLIYAGRTLFGAMPPKGQELEDHY FGAIRERVAAYMKEVNEELWRLGVPAKTQHNEVAPAQHELAPIYEQVNVAVDHNQMVMET LKKVAGRHGLNCLLHEKPFAGVNGSGKHNNWSLTTEDGINLLNPGETPHENIQFLLVLSC ILKAVDRHADLLRESAADVGNDHRLGANEAPPAIISVFIGEQLEDVVDQLCSTGEATHSK VGGTLKTGVRTLPDLFKDATDRNRTSPFAFTGNKFEFRMVGSSDSISSPNVVLNTIAAEA FKEAADILEKAEDFDTAVHDMIKEQLAAHRRIIFNGNGYSDAWVEEAEKRGLPNIKSMVE AIPALTTEASVRMFEEFNVFTKAELESRVEIEYEAYSKAINIEARTMIDMAGKQIIPAAV KYATMLADSLSKVQTACPDADVSVQAELLTETSAYLSDMKVALAALIDATEKCGTIENNK EQANAYHDTVVPAMEALRAPADKLEMMVDKELWPMPSYGDLIFEV >gi|229784126|gb|GG667609.1| GENE 33 45782 - 46645 963 287 aa, chain - ## HITS:1 COG:CAC0827 KEGG:ns NR:ns ## COG: CAC0827 
COG0191 # Protein_GI_number: 15894114 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Clostridium acetobutylicum # 1 287 1 287 287 431 78.0 1e-120 MLVSAKEMLEKARDGKYAVGQFNINNLEWTKAVLLTAEELKSPVILGVSEGAGKYMTGYK TVSAMVKAMIDELNITVPVALHLDHGSYEGCYKCIEAGFSSIMFDGSHYPIAENVEKTKE LVAVCAEKGISLEAEVGSIGGEEDGVVGMGECADPNECKQIADLGVTMLAAGIGNIHGKY PENWAGLSFETLAAVKDKVGDMPLVLHGGTGIPEDQIKKAISLGVAKINVNTECQLSFAD ATRKYIEAGKDLEGKGFDPRKLLAPGTEAIKATVKEKMELFGSVGRA >gi|229784126|gb|GG667609.1| GENE 34 46869 - 48356 1639 495 aa, chain - ## HITS:1 COG:CAC0999 KEGG:ns NR:ns ## COG: CAC0999 COG0498 # Protein_GI_number: 15894286 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Clostridium acetobutylicum # 5 493 6 494 496 601 60.0 1e-172 MELSYQSTRGGETGVTASQAILKGLADDGGLFMPTFIPKLDKTMEELAGMTYQETAYEVM KLFLTDYTEEELKNCIARAYDSKFDTEEIAPLAKVSGAYYLELYHGSTIAFKDMALSILP HFMTTAAKKNQVKNKIVVLTATSGDTGKAAMAGFADVEGTAIIVFYPKDGVSRIQELQMV TQKGDNTHVVAIHGNFDDAQTGVKKIFGDKAFAEALAEKGCQLSSANSINIGRLVPQVVY YVYAYAKLLEAGEIEAGEPINVTVPTGNFGNILAAYIAKQMGVPVKTLICASNENKVLYD FFRTGTYDRNRDFILTTSPSMDILISSNLERLIYLSAGSDAAANAALMKELNEKGSYTVT DEMREFMKDFRGGFASEEENRATIQKIFKDTGYLIDTHTGVAASVYNRYKEETKDAAKTV IASTASPYKFSGSVLEAIDGVRPEEDEFAIVDRLAELSGTVIPQAVEEIRTAPVRHTVQC DTDQMQATVAGILSV >gi|229784126|gb|GG667609.1| GENE 35 48595 - 49404 552 269 aa, chain + ## HITS:1 COG:CAC0144 KEGG:ns NR:ns ## COG: CAC0144 COG1489 # Protein_GI_number: 15893439 # Func_class: R General function prediction only # Function: DNA-binding protein, stimulates sugar fermentation # Organism: Clostridium acetobutylicum # 11 264 12 228 230 179 43.0 6e-45 MRYEHIIQGTFINRPNRFIAHAAIRRNEGAEEEIVVCHVKNTGRCRELLLPGAAVILQFH PEAAASGRKTEYSLIGVWKEQHGEFLLINMDSQAPNQVAAEWLHSMEQAPTLPVSGFDGK KLPSSLTLADIRREVTYGQSRFDLAFHLVFGSSASASQEQRKPAFMEVKGVTLEENGIAM FPDAPTERGIKHILELAEAVKAGYEAYILFVIQMKGIREFTPNKKTHPQFGDALRQAHES GVHVLAYDCMVTVDSLAIDQPVPVFPATF >gi|229784126|gb|GG667609.1| GENE 36 49622 - 
50497 718 291 aa, chain - ## HITS:1 COG:BH1807 KEGG:ns NR:ns ## COG: BH1807 COG1512 # Protein_GI_number: 15614370 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Bacillus halodurans # 38 171 33 166 271 102 35.0 6e-22 MLRLLLLAVMAVFLAGAGSGFTAYASGKSDLAAGEKRVFDEAGLFGETEKSSMEEEIASM RKEMNMDVVIATTDDAEGKSAETYCEDFYINGGFGTGKDYSGVIFLIDMDNRELYIAPVG TMNRFLTDKRWNEILDNAYEGASNGDYAASAEAFLDGVRQYYAAGIPGGQYNYDRDTGKV SIYRSITWLEALVALAVALFTAAGPCIGVMNRYAMKKERRQAGNYLKAYRADCRFCFSAN TDHLVNKTVTHIVIPKSNTGGHTGGGGGSSSGRSTTHSSGGRSFGGGGRKF >gi|229784126|gb|GG667609.1| GENE 37 50548 - 51633 960 361 aa, chain - ## HITS:1 COG:no KEGG:Closa_0460 NR:ns ## KEGG: Closa_0460 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 361 1 356 356 509 67.0 1e-143 MSATITYKCPNCGGGLQFDPEKQKYACEYCLSEFTQDELERLSPEESADQAGFSFDEAAG EENTGEQSREGTSVLYTCPSCGAEIVTDETTAADICYYCHNPVILSGKLSGEYNPDYVIP FQLDREKAVSIFSQWMKKKRYVPRAFYSEEQIQKMSGVYFPYLMYSCRVEGELEARADRL RVWVSGRYQYTETQTYDILREGDMEVSHVTRNALKKANRELVEGVLPYDMKQLEPFTPGY LSGFVAERRDMEEQDFSADVCSEVRQFALDSLKESVGTYDRVSVRRSDAMLRDERWEYAL LPVWTLTYHDQAKNEMYYFTVNGQTGKVCGKLPVDRRKLLILFAEIFFPVLIFMLLMGYL I >gi|229784126|gb|GG667609.1| GENE 38 51630 - 52868 1162 412 aa, chain - ## HITS:1 COG:BH1805 KEGG:ns NR:ns ## COG: BH1805 COG4260 # Protein_GI_number: 15614368 # Func_class: S Function unknown # Function: Putative virion core protein (lumpy skin disease virus) # Organism: Bacillus halodurans # 1 412 1 400 433 494 60.0 1e-139 MGIIKAVTTAISGSLADQWLEVIESGNMGDQTVFTSGVKIRRGSNTKGTEYTISDGSVIH VYPNQFMILVDGGKVIDYTAEEGYYTVRNSSLPSLFNGQFGEALKESFDRVRYGGETPTS QKVYFINLQEIKGIKFGTPNPINYFDQFYNAELFLRAHGTYSIKVTDPLLFYNEAIPKNK TRVEVTDINEQYLSEFLEALQSSINQMSADGIRISYVASKGRELGSYMADTLDSQWKASR GMEIQSVGIASISYDEESKKLINMRNQGAMLSDPGVREGYVQGAVARGMEAAGSNRNGSM AGFMGMGVGMNAGGGFMGAASQANLQQMQMNQMAGNQQMTGAPQMAGNQQMTGTPQMAAA QTGTSWTCQCGAVNTGKFCPECGTPRPAGPWTCECGTVNSGKFCSECGKPRP >gi|229784126|gb|GG667609.1| 
GENE 39 52967 - 54034 1066 355 aa, chain - ## HITS:1 COG:CAC3472_1 KEGG:ns NR:ns ## COG: CAC3472_1 COG1396 # Protein_GI_number: 15896711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 117 1 117 125 109 47.0 8e-24 MAMNTVIREKRKELGLTQEKVAVYLGVSAPAVNKWEKGITCPDVSVLPALARLLKTDLNT LLCFKEELTEQEIAGFCTEVAEAVRADGMSAGIRLVREKVQEYPACGALIHRVAMILDGA WFIKGSCSGEREAYADEILSLYERAAECGDEEVSTSARFMLASKYMVRMEYDKSQRMLDL LPENNHADKRQIQANLYIRQEKLDEAARLMERTLLMRLNELMGVLLSLAEIAVKEGRPGD ATDIGERWKVVAEQMGLWNYNGLVIALQAALDQEDTEQSLRLIKELLKAAEGPWEVNHSP LFCHIPVYGAPETYTKQIIPPILYELENSEKCSFVQDNREFKQLIEQYRKQMKDF >gi|229784126|gb|GG667609.1| GENE 40 54182 - 55012 1069 276 aa, chain - ## HITS:1 COG:BS_yqfS KEGG:ns NR:ns ## COG: BS_yqfS COG0648 # Protein_GI_number: 16079568 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Bacillus subtilis # 1 273 1 283 297 234 44.0 9e-62 MLTIGCHLSSSKGYLAMGKEAVKIDANTFQFFTRNPRGTKAKAIDENDVERFLVFAKENG IERILAHAPYTLNACSADEHLRELARDTMADDLRRMEYTPGNCYNFHPGSHVGQGAEAGI AFIADMLNQILKPEQRTTVLLETMSGKGSEVGREFEELREILDRVECRERMGVCLDTCHV WDGGYDIVNDLDGVIGTFDRIIGLEKLKAIHLNDSMSPLGAHKDRHAKIGEGYIGEEALK RVVTHPAFKDLPFYLETPNELPGYAREIAMMREVCR >gi|229784126|gb|GG667609.1| GENE 41 55206 - 56168 1060 320 aa, chain - ## HITS:1 COG:no KEGG:Closa_0457 NR:ns ## KEGG: Closa_0457 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 302 1 302 302 377 75.0 1e-103 MKMNPVYKRETTVSSRSFRLALIIMVFNGILALVALLNMYSVVARVKITAEIQYSSFLNL YVFVSVVEFVMLMFIMPALTAGSISGERERQTLDLMLTTTMKPSDIILGKLSASFSTMFI LIVSSFPLLAVSFVYGGITFYDVCLLLLCYVAVALLTGSMGICFSAIFKRSTIATVVTYG ILILIAAGTYAINVFALSIARMNMNNTYISSVGSMTEQANSGGFLYLLLFNPVATFYVMI NGQAGDNQVTGGLNRWFGPHPDNFIMNYWVVISIVIQLALSAIFLLIAIRAINPVKKKPV KQQKEKRKVGVSVPAAGSQL >gi|229784126|gb|GG667609.1| GENE 42 56146 - 57108 1205 320 aa, chain - ## HITS:1 COG:alr0970 KEGG:ns NR:ns ## COG: alr0970 COG1131 # Protein_GI_number: 17228465 # 
Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Nostoc sp. PCC 7120 # 2 302 7 310 316 219 39.0 4e-57 MLEIRNLKKTFGKFYALDGLDMDIGDGALYGFVGPNGAGKTTTIKIMTGLLEADEGQVLI NGVNVTGGLSQLKLKIGYVPDFFGVYDNLKVSEYMEFFASCYGIYGLKARTRYMTLLEQV GLDDKVNFYVDGLSRGMKQRLCLARALIHDPALLVMDEPTSGLDPRTRFEFKEILKELRE QGKTILISSHVLSELSELCTDIGIIDQGKMVLSGSMEAILRRVNTSNPLIISILGNKEKA LTILKSQPCVQTISVKEGDIMVQFVGDEEDEAVLLQQLVDAEVMVHGFTREPGSLESLFM QITDHDVEKAVLVHENESGI >gi|229784126|gb|GG667609.1| GENE 43 57108 - 59600 2745 830 aa, chain - ## HITS:1 COG:no KEGG:Closa_0455 NR:ns ## KEGG: Closa_0455 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 830 1 829 829 1065 64.0 0 MRFRKNTLVWAVILSMAGSVLLPASCVTAYAVQTMTEPANNLNTESDIYLASSPVVMDVA YGYDGAAKSGRYVPVRISLANQEQKAFEGTIRIQAMESDYDIYDYDYPVSLGAEEKLEKS LDIPAGRGEILYVKLFDGNGTELVRKRLRINVSREVAELYVGILSDSPDRLNYLNGVGVN YSSVRTKTFHMTADTMPDKAAGLDLLDVLLITDYDTRKLSDSQTGAVWEWVRGGGTLLIG TGGRANDTLAAFREEIVETAFPAPEVRSVDMGVEYATDGPGDSFINLTCADISLKGGTEV LANDEFPVLTSTPKGKGLVGVAAYDFVDIADFCETQRSYVDKLLTALLGEDKLNNLSSYL YYGNSSKYWSVQSILNTGNVDKLPNMPLYVIVILGYIILVGPGLYLLLKKQERRRFYRTG VVVLSLAFAAVIYLMGVGTRFKNTFFTYATILDTSEKSIDEYTYINIRTPYNKPYAVSLD PGYDLLPITRSYYYDMTPVPKFTGTEESKVRVEHGLQETKISAQGVVAFTPKYFSLVKKS PNTEHQGLTGNVTLFDGRISGTVTNRYTFPVEKVGVMMYGQMVVIDELGPGETVSLDGLK IINYPLNSSYLIAERVTGGYQYEKPDIEDENYMLALSRSKILSFYLDTYASVYSPEARIV AFGREQSDSKFLAKGDFETYGMTMFTSALDISTKKNGYTYRSALMKAPHVVSGQYYAPSN SVYGMDPTVLEYSLGNDIEVESLTFEKISDEFLEGEKNNGLSLFSGSIYFYNHDTGIYDL MDSGKDVYTADELEPYLSPGNTMTIKYIFDNNTDYSWVTLPMLTVMGKDK >gi|229784126|gb|GG667609.1| GENE 44 59608 - 60552 1177 314 aa, chain - ## HITS:1 COG:BH1809 KEGG:ns NR:ns ## COG: BH1809 COG0642 # Protein_GI_number: 15614372 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 2 299 50 345 351 219 40.0 4e-57 MYVLFGILVFTVTFMILQEKSLRYITKISDAMQNISEGDLNVTVEVEGDDEFSSMAANLN 
KMVEELKELMDKERESERTKNELITNVAHDLRTPLTSIIGYLELLSGDVKLDPELQKKYI NIAYVKTKRLEKLIEDLFGFTKMNYGKLTMHVAQVDVVKLLSQLLEEFYPSFVDKNLSYE LQSNVPAKVITADGNLLARLFDNLINNAIKYGADGKRIMVKLHADDEIVTVSVINYGYVI PADELPLIFNKFYRVEQSRSTNTGGTGLGLAISKNIVDMHGGTITVTSDLSGTVFTVKLK VNFDVNKENFGKIG >gi|229784126|gb|GG667609.1| GENE 45 61514 - 61642 128 42 aa, chain - ## HITS:1 COG:no KEGG:Closa_0454 NR:ns ## KEGG: Closa_0454 # Name: not_defined # Def: integral membrane sensor signal transduction histidine kinase # Organism: C.saccharolyticum # Pathway: not_defined # 1 39 1 39 378 63 89.0 2e-09 MKNDMNRRFRTRVITNIFYSAVVTVLIEIFLVTNVSLIAIAS >gi|229784126|gb|GG667609.1| GENE 46 61632 - 62339 1126 235 aa, chain - ## HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 3 235 2 231 233 247 56.0 1e-65 MSETTNILVVDDEKEIAELVEIYLVSDGYKVFKANNAMDGLDILDKNEIHLVLLDIMMPG MDGMEMCRKIRETNNIPIIMLSAKSTDLDKIMGLGTGADDYVTKPFNPLELTARVKSQLR RYTQLNPNSAVNEKAGNEITIRGLIINKDNHKVTVYGEEIKLTPIEFDILYLLASNPGRV FSTDEIFEKVWNEKVYEANNTVMVHIRRLRGKMKEDTRQNKIITTVWGVGYKIEK >gi|229784126|gb|GG667609.1| GENE 47 62445 - 63365 883 306 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2412 NR:ns ## KEGG: Cphy_2412 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 39 256 24 229 267 110 28.0 9e-23 MKEMNQKGVSRGPAMNVKRILAASVFILVLASFLLVPGRRASADVIWTPDDTFYEKHYED CKYIGRRYYANGADGYVTVVESPLRRNVIDFIANGPVLFVYITYDKGNSGTWGLVQYVLD EEGKPKTDGNWEEEAAVGWIKMEELVNVYDGISFSEEHGAEIQETDESTAPRVIMPDTGV ICLWEYPGSDISYGKLESLESEISIDRTYRDPGGDLWGHSGYYFGHRDFWINLSSPDETN PKLEPKAAPELIPPADSKTLESLPKTGRSGLGLPLAMAGLIILTIVLTVFAIRFMAVKKR GEEITE >gi|229784126|gb|GG667609.1| GENE 48 63438 - 63782 451 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619371|ref|ZP_06112306.1| ## NR: gi|266619371|ref|ZP_06112306.1| 
conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 114 2 115 115 169 100.0 8e-41 MGLWLVPVLLVSLGLTIALEEAFGWLAGVRDWWNRLLIVLVNILTNPAVVFLYYVNDLYL GWNPMIVTVALEAAAAAVEAVCYRSAGKIRRPWLFSIGANLFSVTLGAVVTKCF >gi|229784126|gb|GG667609.1| GENE 49 63782 - 64453 514 223 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2414 NR:ns ## KEGG: Cphy_2414 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 177 138 327 360 94 36.0 3e-18 MLGRAGKGPEVWLGGLLILSAAAIRLMAGRERLWTSSGNAPVSFAVQSDKTVPAAACFLA AVILRSYSGMVQNFPWKETLAGSGFILAGAVVLGKMAGGVLADKTGVWRTAFCSMCAAAV GFLCLVYPAAGILAVFCWNMSMPLTLWAVAEAFPGAKGFGFGLLTFGLFVGFCPAWLGGS SVGFAAQSGIRQLAGAASSPVGLALMAVLSLLLLDWGLKQVVK >gi|229784126|gb|GG667609.1| GENE 50 64402 - 64869 209 155 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2414 NR:ns ## KEGG: Cphy_2414 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 14 130 14 128 360 95 44.0 5e-19 MSNGKGGVGAVTAVYALVHMAVDLSCAFLVFAYVSGGERWYLWLLIYNFCAFALQMPIGA AADRLDRNSCVAAGGCAGVLAGLLLGSVGFPAAAAVAAGIGNACFHVGGGIDILNRSGGT RRLWGFLWLPVLWEFFWEPCLAGQERGRRSGLEGF >gi|229784126|gb|GG667609.1| GENE 51 65073 - 66530 1808 485 aa, chain - ## HITS:1 COG:BH1392 KEGG:ns NR:ns ## COG: BH1392 COG0855 # Protein_GI_number: 15613955 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Bacillus halodurans # 5 440 236 674 705 527 59.0 1e-149 MIAYSPFRIMRNADLTIDEEEAVDLLEEIQKQLKKRQWGEAIRLEIDEKMDKSLLKILKR ELSISSGDIYEIGGPLDLTFLMKMYGLEGFEHLKAPKYIPQRVPALMNEDDIFTNIRKGD ILLHHPYETFAPVVNFVKSAAKDPDVLAIKQTLYRVSGNSPIIAALAEAADNGKQVSVLV ELKARFDEENNINWAKKLEKAGCHVIYGLVGLKTHSKITLVVRREEDGIRRYVHLGTGNY NDSTAKLYTDLGLMTCNPQIGEDATAVFNMLSGYSEPLHWNKLVVAPIWLRNRFLKMIRR ETQNALKKKPAHIMAKMNSLCDKEIIAALYEASCAGVKIEMVIRGICCLKAGIPNLSENI EVHSIVGNFLEHARVFYFENDGSPEVYMGSADWMPRNLDKRVEIMFPVEEESLREQVMHV LKVQLEDNVKAHILKPDGTYEKPDKRGKVLVSSQEQFCEEAINNVEAELGKMDPVSNRVF 
VPVNS >gi|229784126|gb|GG667609.1| GENE 52 67463 - 68104 787 213 aa, chain - ## HITS:1 COG:BH1392 KEGG:ns NR:ns ## COG: BH1392 COG0855 # Protein_GI_number: 15613955 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Bacillus halodurans # 2 213 10 218 705 186 47.0 2e-47 MADIDERFYNPDNYVNRELSWLEFNYRVLSEARDKNLPLFERLKFLSITSSNLDEFFMVR VASLKDMVHAKYTKPDIAGLKPSEQLEKIGVKTHEMVAMQYSTYNRSILPALKQNGLCVV TSHVDLNAEQGAYVDEYFRENVYPVLTPMAVDSSRPFPLIRNKTLNIAALLQKKSGEEDL EFAMVQVPSVLPRIVEIPAGRKSGRSVILLEEI >gi|229784126|gb|GG667609.1| GENE 53 68104 - 69672 1812 522 aa, chain - ## HITS:1 COG:all3552 KEGG:ns NR:ns ## COG: all3552 COG0248 # Protein_GI_number: 17231044 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Nostoc sp. PCC 7120 # 4 521 21 545 550 111 24.0 4e-24 MAIKLFAAIDVGSFELELGIYEVSPKYGIRRVDHVRHVIALGKDTYKNGKISYELVEEMC GVLQQFAEIMESYKISEYRAYATSAMREARNNQIVLDQIRVRTGIEVQIISNSEQRFISY KAIASKDAEFNKTIQKGTAIVDVGFGSMQISLFDKDALISTQNLLLGVLRIREMMSNMQA DIRMENTLIEELVDNELMTFKKMYLKDREIKNLIGIGECILYLSRGSGTGKPVDRVTAED FKAFYEKLVDMPLYQIEERFGVNSDYAALLMPAAIIYRRVLELTGAEMFWIPGIRLCDGI AAEYAEETRAVKFSHDFTEDILAASRNMAKRYKCHASHTADMEKHVLDIFDSMKRYHGMG KRERLLLQISAIIHSCGKFIGMKYTGESGYNIIMSTEIIGLSHMEREIIANVVRYNSIDF DYNEVIFNEDLFRDTKGELTHNDVTILIAKLTAILRLANAMDRSHKQKLHDCRLNVKEGQ LLVTTTYGGDMTLEAVAFAQKADFFEEIFGIRPVLKQKRRLS >gi|229784126|gb|GG667609.1| GENE 54 69743 - 71047 1582 434 aa, chain - ## HITS:1 COG:CAC3262 KEGG:ns NR:ns ## COG: CAC3262 COG2607 # Protein_GI_number: 15896507 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Clostridium acetobutylicum # 1 424 12 422 426 272 37.0 9e-73 MKSQELIIYRNLQHRELFDHVSTLLAVGPDDKPGTMPDSYACASQLIDMAVTYGFEGNLW HCFLAFCLANNENVYSTSCEIVGHIEGSLNELAAHDFNVIKELFDYDIRTLDGGNGRTGG IWSALSDYHAANGNSRVFNKRIRDQIVALAVELEQAVDVEAFGRAVTEFYRRFGVGKFGL NKAFRIEEIDHEVHINPIPNVEHIYLDDIIGYELQKKKLIENTEAFIQGRAANNVLLFGD 
SGTGKSSSIKAILNEYYDQGLRIIEVYKHQFKSLSEVQEQVKDRNYKFIIYMDDLSFEDS ELEYKYLKAIIEGGLGKKPGNVLIYATSNRRHLIREKFSDKRELDDELHNNDTVQEKLSL VARFGVTIYFGSPDKKQFQNIVRELAERYHVNMPLEELYAEANKWELSHGGLSGRTASQF ITHLMGTREDFTKI >gi|229784126|gb|GG667609.1| GENE 55 71112 - 72248 1237 378 aa, chain - ## HITS:1 COG:ECs3462_2 KEGG:ns NR:ns ## COG: ECs3462_2 COG0077 # Protein_GI_number: 15832716 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Escherichia coli O157:H7 # 113 378 1 271 282 181 38.0 2e-45 MSTFDLQEIRKQLDKIDREIVTLFEERMRLCGDVAEYKIENGRPVYDGDRERQKLESVKA MTDDSFQKQAVEELFSQMMTISRRFQYKLMAEHGLTARMDFQAVKGLPMKGVRVVYQGVE GAYSHEATLQYFGDDVDAYHVQFWEDAMKEVEAGRADYAVLPIENSSAGAVSDNYDLLIK YHNYIVAETFIPVSHALLGLPDAELSDITTVFSHPQALMQSSRYLNSHREWTQYSVENTA ASAKKVLNDGKKNQAAVASETAGRLYGLKVLEPSINFNKDNTTRFIILSREPIYREDASK VSISFELPHTSGSLYNMLSNFIYNNVNMRMIESRPIPGRNWEYRFFVDIEGNLGDAQIQN ALKGIEEEASNMRILGNY >gi|229784126|gb|GG667609.1| GENE 56 72245 - 72994 857 249 aa, chain - ## HITS:1 COG:CAC0965 KEGG:ns NR:ns ## COG: CAC0965 COG0204 # Protein_GI_number: 15894252 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Clostridium acetobutylicum # 2 224 1 220 241 144 36.0 1e-34 MIRFIFVAATVILFLILSIPLMLVEFLIGKKNRHLRDVQSLAVIQCVFRLILRMAGVKIT VKGRENVPTDQAVLYVGNHRSYFDILVGYVTVPSLTGFVAKKEMEKIPLLSSWMKLVNCL FLDRVNLKEGLKTILAGIEQVKRGVSVWIFPEGTRNKNDNPLDLLPFKEGSLKIAEKSGC YVVPVALTGTAEVFEQHFPKIKPSHVTITFGEPFHTKELPPEVKKFPGAYTEEQIKAMLT LQLAEGQDL >gi|229784126|gb|GG667609.1| GENE 57 73050 - 73553 292 167 aa, chain - ## HITS:1 COG:no KEGG:Closa_0447 NR:ns ## KEGG: Closa_0447 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 19 167 8 156 156 247 80.0 1e-64 MKTTTLNTTTFKKFTISSTALQLKQLPVNFCKCGITGWCLEVLFTSAESIMRHDWRLMGR TSLLMFPIYGCGALLGPIGKWIDDWLNPGRVLTVKKSDLAVRHGMLYMVLIFVAEYASGA WLRSRGMCPWDYTGRSTNIDGLIRLDFAPLWFATGLLFEQITKKKGG >gi|229784126|gb|GG667609.1| GENE 58 73655 - 75238 1202 527 aa, 
chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 6 516 1 523 525 177 26.0 4e-44 METKKLYQLLIVDDESFVRQGLSQYVDWEALGYTVAGTADCKETALHFLEENPVDVILTD IRMPDGTGLELLQECRGIQPDVKSVILSGYGEFEYAQKALRLGSFDFLTKPVDFSQLTAI FLKLHDTLSQLETQNRNLQEVYRLKKDNFFHNVTKYSVEPETLAQRCREYNLPPCGRYGI LRIRYPDAENCLPGLPPDCREAVSRLIGESGGGDRCYVFPNEAHEITVALYDIPKGETTG LLELLKPYLDYHEAIAGVGSFYEGIGKLKEAYQEAGQALEYHHLQTEERIIEYGKASGQI DFGAAYSQNSKESMLDCLSEGRSDLLAEFLVSHIDNLIADGSSIDTIKALAINSILVVEE YLKQNTSCRETGNGIQELICRIISMKSRHRIQDIAREYLASTAPLIESAKSRSPVIDRAL VYIREHYSENITLQSLAEDIYLHPVYLSRLFKEKVGQTFLEYLTHYRIEMAKDYLRNPNL RIYDIGQMVGYETPQYFSRVFKELTGTTPKGFREELAGKGEKHEKGV >gi|229784126|gb|GG667609.1| GENE 59 75259 - 77175 1759 638 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 209 630 177 591 602 159 27.0 2e-38 METLEKRNDLVGRLLGICRKTSIFVRLIILSCTLLMIPSCLLIFLTFHVYTSDLFKTTLQ LFKNNNLLIGSSISHVLKQNEQFLYEIYSDEELTGALIELYRLENNPEEDLSDKKEILRD QINSRLYHYYTDGDSLIRSIQIITEYDQFSPVSTYGNMAGAWIPDISAFRDSDIYQKTIR ASGTPVWFDPPQFYNTFYHQTMFNTPLVDTFVLARSIPSYQVNSSLGIIVINLDMKSLLK NVDASTLTSDSNLLLMGSQPLWSFNVNIEGPTLASDDPVMEQVKGREEGDIEAVIRGEAC ELTFLRITGTDMYLINLAYRDRLFERPLMLRRFSFLVLAVSLIIAVFIAYIITVSIVSPL NRLKNTMQTASRGNLEVEYEDPAGDEVGILGQKYNEMIADLKSLIQETYVSELNKRHLEM LKKEAELNAFQMQINPHFLYNTLDMIRWKAIAEEQGDGEVSHMIKQFSDLLRLGIKKNSV IVPFSEELSHVDAYLAIINLRYSEPIQLTVDLPFDPGDVTIAKLSLQPLVENSVIHGFDS TPAEISHKRIAISGFIEQNLIHILVEDNGIGMTQEQMEALNRELKDRVHTQGGIGVFNVN ERLKLYFGDDCGLHYQSSPLGGTGIEMLLPRQSQEQEI >gi|229784126|gb|GG667609.1| GENE 60 77260 - 78678 1376 472 aa, chain - ## HITS:1 COG:BMEII0542 KEGG:ns NR:ns ## COG: BMEII0542 COG1653 # 
Protein_GI_number: 17988887 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Brucella melitensis # 81 366 31 311 398 109 30.0 1e-23 MRRRQVMKKAAAILLTAAVAAGLTACGSKPSETQSTAAEVKKTEEEKTAAENGKAAEDKG NGEAVTLTMWVESKNETDIKQELDEEFMEENPDIILNKVMKEGDPGNDFYQGVAAGNAPD CVTVSFTLMDKYANAGILTPLNQYFDQWDEKDGFSQSYVDMFTKNGNLYGITSIDANFLL AYNKAVFREAGIENPPATWEEAYETAKKITNPDTQTYGYGVLGSEWTDWFFQYYVWQAGG DLTKRNADGTLELTFTDPAVIEAANYYQKLRRDGVSQPDLTLKFNELIEKFAQGKIAMMP FATDWVSTAASLGAKVDDIGLCPFPAGPSGKSVTTSLGYNWVINAKSSPEKQDAAWRYIA FMSSKESMERTIEASADKGAVDPIHYPRTDVNVSELTDMNPEYQEVLKKIEGTGRLEYEG KDIISPYVDRTVQKILADLNADPEKEFAAASEQAQKEVVDQYNKEVLEKAEK >gi|229784126|gb|GG667609.1| GENE 61 78958 - 79845 677 295 aa, chain + ## HITS:1 COG:lin0760 KEGG:ns NR:ns ## COG: lin0760 COG1175 # Protein_GI_number: 16799834 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 1 293 1 293 296 132 27.0 6e-31 MENNRFHSASFRKALSPWLFMLPGIIFTVWLRYYPIIKSFYMSLFSYDAVNPPGHFAGFQ NYINLFTAAYYWDAWKNTFIFLILQLFLVFWVPLVQAIFLNELKRFQKTFSTIYLITALI PISVSVILWKWIWNPDFGLANQIVKAFGLPQQMWLSNPALTKFCIIFPGFVGGGVGVLLF LSAINGISEDVLEASQLDGCSGWQRIRHIILPNIRFIIIIQLVLAVISSMQILDAPYQFA AGGPSGASTSMGVYIYSTFYTDLDYGKATAASCILFLVIAVLTFLQLRMDQSEAE >gi|229784126|gb|GG667609.1| GENE 62 79861 - 81213 930 450 aa, chain + ## HITS:1 COG:SMb20233 KEGG:ns NR:ns ## COG: SMb20233 COG0395 # Protein_GI_number: 16263972 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 240 449 75 282 282 113 29.0 9e-25 MKRKKQTGIKAGGSTISLKIAAYAVSIFFAVVILYPLLYIAATSMKDSKKLYELPPRLLP YAANSVTFELDYSRVPEDQLLEAMRQDQILTGFLTIYEMDEDSVFEVRLVGTRDGVPVFQ SRAHRGALELQRDAGLYSHATMEPKVLLNPKRWEKASDFLGYQYDPAGIPKGDQFRIKNS GEELASRIQAVFQEKYELHGALTAITSHKNNLLLLESFSYYFQLPSVMYSKNELVSRYSF LIFIMNTALVITWAIVTQVVLCGITAFPISRLLPKKLSGIMLFFFLGSTMIPFISIMIPQ 
FTMFQRLGFYNNYRALLMPFLLPYGFYIYLYKGFFDQLPQSLFDAARIDGAGNWYSFVKI CMPLSKPIISLIALQTFLANWNDFFWAWLVTEKQSLWTLNVALYNISKNQFIKMNFIMGL SLFSILPVMLLTIIFSSQIKKSIISSGIKG >gi|229784126|gb|GG667609.1| GENE 63 81230 - 82498 760 422 aa, chain + ## HITS:1 COG:DR1362 KEGG:ns NR:ns ## COG: DR1362 COG0673 # Protein_GI_number: 15806379 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Deinococcus radiodurans # 1 416 1 401 403 280 38.0 4e-75 MKKKITAVLLGAGARGGVYARYALDHPDEFQIVAIAEPDTAKREAMKKAHKIPEDSVYET WEPLLEQPRFADAALICTMDQMHTGPAVKALELGYHVLLEKPMAVREEDCRRIGAAAEKN KRLLSICHVLRYTPFYSEIKRIIDSGRLGKILAVQQIEHVGYWHQAHSFVRGNWADDTTS SPMILQKSCHDMDILSWLTGGTCTKVSSFGSLTHFKPENAPAGAPSRCLDGCPSSGSCPY YAPRFYLEHPKAVSDGFVSMVTTEPGREKLMEALRHGPYGRCVYHCDNHVVDHQAVNLEF DNSAVVSFIMTAFTEDCNRTLTLMGTKGQLTGDMNRSEIRFHLFGSADEEVCCPQLPDVS AAYHHGGGDYRLIENFVAMVREGDFTKNKSSAQQSLQSHLICFAAEKSRLTGKTIELSFE QE >gi|229784126|gb|GG667609.1| GENE 64 82504 - 83313 242 269 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_5268 NR:ns ## KEGG: Pjdr2_5268 # Name: not_defined # Def: hypothetical protein # Organism: Paenibacillus # Pathway: not_defined # 37 264 43 272 276 187 43.0 3e-46 MTDIPFDFPELTAQAREAAACAANLAPWSILEAWRLFHTDLPRFFRTLNAEPDSSLLYLF YYLQFTRFAYRKYQERKIPEDVFYHTFSDIGRWEKVCFEHTGAHGLEEYEWLSLHLRLQL FALGRLQFQPVCCPWDLPAVYGIQKGDPVLNVHIPSGSRLTPSACEASYQQAARFFSMHP AVFICHSWLLCPALKNLLPSGSNILQFQNEYTITGVDAESRQAEERIFGCLKSCPSEYPE NSYLQKSAKTWLLSGNPLPAGYGVRRLFS >gi|229784126|gb|GG667609.1| GENE 65 83397 - 84893 1400 498 aa, chain - ## HITS:1 COG:BS_ywnE KEGG:ns NR:ns ## COG: BS_ywnE COG1502 # Protein_GI_number: 16080712 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Bacillus subtilis # 10 498 2 482 482 453 45.0 1e-127 METAWMAVTAVWSWIMEHLIYINLIFSIIIVFFQRRDPKSVWTWLMVLYFIPVFGFLFYL LLGQDLGKSKMFRVKEVEDRVYFTVRGQEEFLKTHDSSLEAHLSRDYEDLVLYNLETSGA 
VLTVDNTVRIFTDGEAKYTDLRSELRKASRFIHMQYYIIKDDELFDSISPILAERARAGV EVRILCDGMGGRFMPKKKWEELKQCGVKVGIFFPPVLGRLQLRMNYRNHRKIVVIDNKVG YVGGFNIGREYISRDPRFGYWRDTHLKLTGGSVLSLQIRFALDWNYAAGENLFKDMKYFE TGTDLFRLDQISPEIHPLGIQIIASGPDASCRQIRDNYIRLFAKARDHIYIQTPYFVPDD AVLSALSMAARSGVDVRLMIPCKPDHPFVYWATYSYVGDLLSAGARCYTYENGFLHAKGV MTDSRVCSYGTANMDIRSFELNFEVNAVIYDEETTKELENIFLEDLKLCKEITREGYASR DLWIRVKEQCSRLLSPLL >gi|229784126|gb|GG667609.1| GENE 66 84952 - 86670 1598 572 aa, chain - ## HITS:1 COG:aq_1971 KEGG:ns NR:ns ## COG: aq_1971 COG2804 # Protein_GI_number: 15606969 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Aquifex aeolicus # 26 561 26 564 566 372 39.0 1e-102 MKGIHIGEVLVEQGAITKEQLNEGLKLLKAGTNDRRLAEVLTDLGYITERELLDVLGRGM GLEVIDLEFFHIDERAVEKIPKQLALKYTVMAVSMEGSGLTVATADPLDLYALEDIRLVT NMRIQLILAERTQIRHAIELNYSGIDARAAARLASEHAVFSRTFAENLMVNSDEDQAPIV RLLNSLLLKGYNTNASDIHIEPYENETVVRMRRDGMLIPYMTLSPAIHQGIVARTKILAN MDIAEKRKAQDGHFKITLDGREMNVRVSFVPTVYGEKGVLRFLTTNTPIDRSAAFGMSGD NYRRMKQLLRVPHGIIYFTGPTGSGKTTTLYSVLQYLSGRLVNIVTIEDPVERNLPKVNQ IQVNERAGLNFQTGLRSILRQDPDIIMIGETRDPETAEISARAAITGHLVFSTLHTNDAA SSVTRLMDMGIPAYLAAASIAGIISQRLMRKVCPYCQEEYEARPEEVKILYGRPADPGEH VRLRRGKGCYLCNETGYKGRIAIHEILSVDTEMKRMISEGRNEEELKAYAVHRHGMTTLK EEAVKMVLAGITTLEEMERLIYTIDMDYFGEE >gi|229784126|gb|GG667609.1| GENE 67 86694 - 87290 400 198 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619390|ref|ZP_06112325.1| ## NR: gi|266619390|ref|ZP_06112325.1| putative prokaryotic N- methylation motif protein [Clostridium hathewayi DSM 13479] putative prokaryotic N- methylation motif protein [Clostridium hathewayi DSM 13479] # 1 198 5 202 202 391 99.0 1e-107 MGGKWNERMRDERGMTLTELMVSLAIAFILMSFLIRMYTYTGNMFETSASTDDLRRTTQV VIDYLEDRIACADSLVISREELSGEEFGHEILFSEDGRIYCDGEPVCEEEFYRKRTVFCQ ILSAFSEDNTPALKYRITWKNQKNTTLYSADSVVKLVNLELNGREIIRRDLETGAGMSGK SLYIYYKDPDYSHGIVFD >gi|229784126|gb|GG667609.1| GENE 68 87299 - 
87700 360 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619391|ref|ZP_06112326.1| ## NR: gi|266619391|ref|ZP_06112326.1| putative general secretion protein G [Clostridium hathewayi DSM 13479] putative general secretion protein G [Clostridium hathewayi DSM 13479] # 1 133 1 133 133 254 100.0 1e-66 MKRNVFKKHWNLTGRIRGIGAGNADNGFTLVESLAAWTILLIAVTIFLKCLGMAHASMGK GTVMRNQYMTALERVELGGEPLRTKETKLKFIIDKTTFSMDAVMMEYGLPQSGKGETQPV TLKVIGPVSGAGE >gi|229784126|gb|GG667609.1| GENE 69 87697 - 88275 476 192 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619392|ref|ZP_06112327.1| ## NR: gi|266619392|ref|ZP_06112327.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 192 1 192 192 367 100.0 1e-100 MAGQKNGKEGYVLGWVVIMVLVLMILVTAILFASTAYYRQSFREYNDRQAYLTARSIVQT AAADFTGSGSGELRDVLLDEISVYLKDEPDSESQAAPPDEAADKPTDKPTDKPDTGYPIP KIKFQLDSTMGSCTMTGYYMPDEDLLILTAEARKGGSTEIMTVYLQHSPESEEEPWAVLG YERGEAQKGGKS >gi|229784126|gb|GG667609.1| GENE 70 88290 - 88928 712 212 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619393|ref|ZP_06112328.1| ## NR: gi|266619393|ref|ZP_06112328.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 212 1 212 212 331 100.0 2e-89 MRKLTQREKILVYIAAVLLLLTGGYYLVEQPAEMRRVRTETELKILQLQERESEDSLTGK STLEEEIGREQDMAGENTEDYLPYRTNSVLISELAEYLTSFGIVPETMKVESVSVLEEQE TTEAGYHIHRADIHVEALGTMEELYRLLDWAYDRDWISVTTYSGEAMEDKDAGMNGEADE TGDNETLSMTFTCFMLELKNADETKNAEEPRL >gi|229784126|gb|GG667609.1| GENE 71 88939 - 90417 1413 492 aa, chain - ## HITS:1 COG:no KEGG:ELI_4139 NR:ns ## KEGG: ELI_4139 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 467 1 462 515 152 26.0 4e-35 MRLSVYLGENEIRTVLGRSGKTIEIMDCLNVRLKEGALINDVVTEEAAVKEVLNGIRKRY GKYRRHVYLTMGGNQIITKVLSAPRLPRCQMMELVRRELSDLILPSEKDGYVYDYSIVRK KNKDHKGRTILCVAMKRSVIHEYQTLFSECGMKLKAIDVAVDGLNSLVDFLPSFRNKTFI 
MAVADGRNMMTSLYIDGVYAYTNRMRLVDERGTSESLEEMVKVIRSVIHFCKMQRDEFEL DFVCLCGLGKEEVDSLIPEIVNNEDIAVTVPGPESWITVKEGLNYSMDQYMYVTGSLLGS RKSLDLIGAAKQKAGRGEEEKSHFLAACLLPAAVITVFLGNAVRNEIAVRNMREEIQVLE ERLSVKERKEALAEEKQLKEKLASLRILTAGQSAVKKEAAKMPEMNSAVRSYIFGTAAGK LELSEPEYTQGVLRFDGKSRKYEEISDYVRELEESGLFSTVEYSGFTNVNPVTKEKDDWY YVQLACMLKTPE >gi|229784126|gb|GG667609.1| GENE 72 90440 - 91627 845 395 aa, chain - ## HITS:1 COG:hofF KEGG:ns NR:ns ## COG: hofF COG1459 # Protein_GI_number: 16131206 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Escherichia coli K12 # 3 389 2 392 398 185 29.0 1e-46 MADYEYTAKSINGRRMKGKIQAQDRDQAWIKIRGMGLYPVKLKEERGRVRKRPLSCGLLA RFMSEMAVMLNSGIPLAQALNLISVRESRKDLKEIYGKLYHMILHGMDLSQAMEMQDGVF PPVLIGMVRAGEAGGSLADAFGKMAEYFEKEDETRKKVGTAMIYPAFLFLTVAVVVILLF TTVLPSFFELFESMEKIPVSTQVLMAVSIGLQNHFTEIAAAGGCAAAILAGIFTRPGAAV WRDKMILHIPCIGTLLKMVMAGRFARTFSFLYGGGVPFINALELTADALGSRYMKKRLCQ VMEEVKNGMLLSDSLAQIRELCPEFIHSVYIGEESGNLEAMLKRTADSFEIRSETAIKRL LSLLEPTMIIIMAVIIGYLMLSVMVPIYQYYQSIG >gi|229784126|gb|GG667609.1| GENE 73 91644 - 92216 501 190 aa, chain - ## HITS:1 COG:BS_comC KEGG:ns NR:ns ## COG: BS_comC COG1989 # Protein_GI_number: 16079859 # Func_class: N Cell motility; O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, prepilin signal peptidase PulO and related peptidases # Organism: Bacillus subtilis # 34 186 91 243 248 67 33.0 2e-11 MKNEEHMGRIIPGRVVAAAVALFLAGLSWHHCGGLSLRAVMIWLVCLILTAIAYTDFKTM HIPDRMTAALLIPAVLSLGAESGISLFSRALGGAAVSFPMFLMTLAVPGSFGGGDIKLMA VCGLILGWPRVAAAGFLSVVTGGCYAIWLLATRRAGRSDHFAFGPFLAVGVVLSLLYGDQ LLAWYLGFLR >gi|229784126|gb|GG667609.1| GENE 74 92206 - 92634 557 142 aa, chain - ## HITS:1 COG:no KEGG:ELI_4145 NR:ns ## KEGG: ELI_4145 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 7 140 8 146 175 
76 35.0 4e-13 MRNWRYVRENRDGFTLVEVLAVLVIIAILAAVAIPTMSGFISDARKKSYTSQARNVYVAA QAAALELETSKDGTAAASVTVYTRKDGSKYMDRVEAFLGNDVENGSHFTITLDGNKVEEV EYTPENGKKITITGGEGVTYEE >gi|229784126|gb|GG667609.1| GENE 75 92650 - 93072 561 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619398|ref|ZP_06112333.1| ## NR: gi|266619398|ref|ZP_06112333.1| general secretion pathway protein G [Clostridium hathewayi DSM 13479] general secretion pathway protein G [Clostridium hathewayi DSM 13479] # 1 140 1 140 140 242 100.0 5e-63 MQKDDMGFTLIETLTVLAIVAILAATNIPVLNGFIDDAKKKSYVAEAYMVKAAMQSYVIE RIADGTIDDFVMYEEIFYPEVGSEENALYEMLKGSVTKGGKIRLINYDRTTSKVSGIIYD VKNYEIEIKNDTEVEVRDRK >gi|229784126|gb|GG667609.1| GENE 76 93098 - 93277 280 59 aa, chain - ## HITS:1 COG:no KEGG:Closa_0445 NR:ns ## KEGG: Closa_0445 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 59 1 59 59 83 67.0 4e-15 MANIPKDPVMLLSFLNTQLRDYYPSLEDCCLSLGINMEEITGGLALIGYYYNEEKNQFI >gi|229784126|gb|GG667609.1| GENE 77 93408 - 94412 1065 334 aa, chain + ## HITS:1 COG:BS_ansA KEGG:ns NR:ns ## COG: BS_ansA COG0252 # Protein_GI_number: 16079415 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Bacillus subtilis # 1 331 1 328 329 294 45.0 2e-79 MKKILMLGTGGTIACKRGEYGLKPLISSDELLSYVPDAAQFCQADSLQILNIDSTNMQPK HWLAMAEAIETHYEEYDGFVICHGTDTMAYTAAALSYLIQSSAKPVVITGAQKPIDMENT DARINLYDSLRFASDDRACGVTIVFDGKVIAGTRGKKERTKSYNAFSSINFPYIAMIQDG HVIFYLDEKEHLTDHVHFYHTLNPNVALLKLIPSMGADVLDYMAEHYDAVIIESFGVGGL PSYDSGDFYKAIEKWISLGKTVVMTTQVTNEGSNMSVYEVGRTIKKEFGLLEAYDMTLEA TVTKLMWILGQTQDAHAIHDMFYTTVNKDILWRG >gi|229784126|gb|GG667609.1| GENE 78 94460 - 97048 1877 862 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 862 1 813 815 727 45 0.0 MNINKFTQKSLEAVQNCEKLAYEYGNQQIEQEHLLYSLLTVEDSLILKLITKMNVQKEQF 
INETAQAIEKLPKVSGGQLYISNDLNKVLISAEDEAKAMGDEYVSVEHLFLAMLKQPSRM VKELFRSYAVTRENFLQALSTVRGNQRVVSDNPEATYDTLEKYGYDMVERARDQKLDPVI GRDSEIRNVVRILSRKTKNNPVLIGEPGVGKTAVVEGLAQRIVRGDVPEGLKDKKLFALD MGALLAGAKYRGEFEERLKAVLDEVKKSDGQIILFIDELHTIVGAGKTEGSMDAGNMLKP MLARGELHCIGATTLDEYRQYIEMDQALERRFQPVMVAEPTVEDTISILRGLKERYEVYH GVKITDSALVSAATLSDRYISDRFLPDKAIDLVDEACAMIKTELDSMPAELDELSRKIMQ MEIEEAALKKETDHLSKDRLAELQRELAELHDEFAVSKAQWENEKSSVEHLSALREEIEN LNREIQDAKQKYDLNRAAELQYGKLPQLQKELEEEEARVMSQDLSLVHENVTEDEIAKIV SKWTGIPVAKLTESERNKTLHLDEELHKRVIGQNEAVEKVTEAIIRSKAGIKDPTKPIGS FLFLGPTGVGKTELAKALAERLFDDENNIVRIDMSEYMEKHSVSRLIGAPPGYVGYDEGG QLTEAVRRKPYSVVLFDEVEKAHPDVFNVLLQVLDDGRITDSTGKTVDFKNTIIIMTSNI GSQYLLDGIDETGSITPEAEAMVMNDLRGHFRPEFLNRLDEIILFKPLTKENIAGIIDLM IQDLNKRIGDKELKIELTDSAKQFVVDRGYDPVYGARPLKRYLQKHVETLAAKIILGDEV RAGNTIVIDVAENGNQLIAYPE >gi|229784126|gb|GG667609.1| GENE 79 97277 - 97870 544 197 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 47 178 568 689 744 83 35.0 2e-16 MKSCSHKRKIKSIAALAACCLFVSAGTGVFPSFAHTPDESRFLNGTWEKDDHGWWFANKG NPAYFTSTWAEYQGYSYYFGADGYLVTGWNYINNHWYRFNPEKGSHEGAMMTGWIFDPDY NGWFYTNPSGILVTGWNKIGGEWYYFNPDSDGTMGLMAVNQVIDGYYVNSDGKMNEPWTV TCIGAFEIIKRDWNQTA >gi|229784126|gb|GG667609.1| GENE 80 97964 - 99286 1156 440 aa, chain - ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 5 428 92 521 547 187 28.0 3e-47 MFTRKDLTKLLAPLIVEQILAVLVGMVDVMMVAAVGEAAVSGVSLVDSISVLIIQILAAL ATGGAVVSAQYLGKKQSENACRAAGQLVGVTTVLSLVVTTAALIGNRHLLGAVFGKVETA VMNDAVIYFGITALSYPFIAVYNSCAALFRSMGNSRISMVISIVMNTINVVGNAVCIFGL HMGVAGVAYPTLAGRMVAAVLMVCLIQKPDNIIRINRLSELRPNLHMIRNILSIGIPNGL ESGMFQFGKIALQSLVSSLGTAAIASYAVASNLVTLLYLPGNAIGLGLITIVGQCIGAGE 
KGQAKQYTRQLTGINYAILLVLCTVMIVFGGQLVGIYQLSPEASAISKQMITAHSIAMVV WPLAFTIPYTLRASLDAKFTMAVSVFSMWVFRIAFAYLFVRVFQLGVMGVWYGMFIDWIF RAAVFSLRFHGLERRAVSVS >gi|229784126|gb|GG667609.1| GENE 81 99364 - 100341 1095 325 aa, chain - ## HITS:1 COG:CAC1480 KEGG:ns NR:ns ## COG: CAC1480 COG0673 # Protein_GI_number: 15894759 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Clostridium acetobutylicum # 1 318 1 320 320 209 37.0 4e-54 MEDLKIGILGTGAIAATLADTMNRMSGVKLYGAASRSLEKAKAFAERFGIDHAYGSYEEL VADPEIELVYIATPHSEHYKNAVLCLEHGKHVLCEKAFAVNAAQAKEMIALAEEKKLLIT EAMWVRYMPMAKTLREVLDSGIIGEPMTLTASLCYLIDGIQRLVDPNLAGGALLDVGIYT IAFASLVFGDEITSITSTVIKTETGVDSQNSMTFCYPGGKMAVLNSSIHVLSDRRGIIYG TKGYIEVENINNFESIRVFDENRRLKEAYTRPEQISGYEYQIEASRKAIRENRLECPEMP HHVTIRMMETMDALREQWGIRYPFE >gi|229784126|gb|GG667609.1| GENE 82 100384 - 101136 497 250 aa, chain - ## HITS:1 COG:BH3294 KEGG:ns NR:ns ## COG: BH3294 COG4509 # Protein_GI_number: 15615856 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 236 1 233 254 90 28.0 4e-18 MRRIRRRNKVKRLNITILLLAGFLLSVFCLVIRFADYASGRKEYRQLQEQLEAFADSLSN QTEEWNWNREEEEEEDEWTRLMTSRNPDYVFWLWIPGTNISYPVVKNTIPGYYLNHTFSG KENPCGSIFCGEGAENLIIYGHNMKDGSMFGDLKQYINEDYFREHRYIMIYCQGDWQRYM IFSCYIIREEEVGVYQSYFNSEEEKRTYLNNIERKSLYLTGKQPEITDSIITLSTCMGKG KRVIVQAVLL >gi|229784126|gb|GG667609.1| GENE 83 101136 - 102509 865 457 aa, chain - ## HITS:1 COG:L148778 KEGG:ns NR:ns ## COG: L148778 COG4932 # Protein_GI_number: 15672133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Lactococcus lactis # 73 382 1128 1458 1983 66 25.0 1e-10 MVVEKLTLSSAENASPSEADKQKPGFILQILNKDKSPAKAVRDYSGFKEGEELIFTTSQE FKSITGQLTAGGEYWLHEVKPRDGYALAEDVPFTVDRDGNRQIVVMADRPTHVVLSKKAL TGSQELPGNHMAVRDEGGKVLERWISGEKPHEITAKLTAGKTYYLCEETPESGYAYAEEV PFTVSEDGSVDLVEMRNDVTRVRIHKQDQSGNAIKGAILQILEAKKEIVILEFETNGEPV 
DVTGILTAGETYYLHEKQAPPGYLPAADISFTVPRKNRQIEVTMTDLKQHDSESNTMYLM KTDAATGKGLEGFEFTVTGPGRNTFTVVTGRDGKAEFTMPPDGTYIYRETSGQSGYLISE ETYRFTISHGRVTGNSIILVADQPLPPETPSDDTHRIGRITASYKPGFQGRVSAAEEASR QPGARTGDEYPLMALMAAALFCLGGFLYRMRRAGARK >gi|229784126|gb|GG667609.1| GENE 84 102448 - 107586 3246 1712 aa, chain - ## HITS:1 COG:no KEGG:Closa_0424 NR:ns ## KEGG: Closa_0424 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 1690 2452 4148 4700 1909 56.0 0 FDGVEVVRDRNNNVKSIKVLEGYAGNTVEFVNSDDIEGSLKGGTGSGTWTYRTMERKDTD ILYYSLGGLKVTERGSDGRIYGYDRDGKRIQVRNQESIYALKNGRPVFELTGGDLMDAEY SALDKCFTLQSGTELYHVDSDGNRDAKVHPTMGMAYTTELGTDRKGKQYERILVWPVNVS KSANGAVIAQEKIKTYRIASINADTEGEYTIGSFDGTAFNKSLNPIVGSHGLPEYYQRSD QVYKKGEPVYDIDHDYVRYRYNDLLPAFNRNAYKINNCGELRDIGEEEDPADDKKLYHRQ GEAWIMENVWTTGDRYPDDPFHYASGTGQADMLKRVIPGTYIMEEAEAPAGYVKGFPIGL TVRETRQVQNAELVDEKIKVEIVKTDAADQYRIDIVSDYQEILKTTEPKGAYSYGQISGA HLVLYKARRNYTTDTEAYPDGYYLEKTENTPASWTVENPEDNTPVVVTAEWITDGKPKYF EGIPAGDYILEEIEAAGGYVRNSMNIEVQKTGEVQTFHMKNDHTKLEVFKYCKDEYGSMV PLPGEHAAGLALYNAVTDEQGNIQMNGGEPQFAKDRLIDEWKTDDLKAYTENYELSRKLT DRIKAFLGLAVNQSGFILDFENTYRKQGDEFVWLTWHTKAGERSAERISSGKTGVTGSTV QLWRTDDDKIIRITIYRNARNGSLEEDGQLPLNFEYQFNYRELADGIKSYDTLEGMHRID YLPFTGIKDGRRVGNYVLVENEVPEGYEPAKPKALVIEENGSVQRFSLENTEKSVSILKL ISDGDKEYAAEGVKMALYRPDTDGNFNSDENNLIERWISGADGRFTEEDRYQDRIPEGFG VGDLKPHRISRIPDGPYYIVEEEVPAYMVKREPMKVEIGRDTAGIIRMVNLPLKGRLELV KKADDTGEMLENARFLITNRDTKEEWYITTGRDGRAKLQNLAVGKVRADGTIDPYTYTIE EVSPPDYYQLSGGKKFFTFDGKAESQVVTYTCVVENKPTQIRFKKTDFETGMALEGAKIA VYHAVAIDGKYVKSGEAIEVNISGPQGFTIKKKLSAGHVYIMEELEAPAGAHLSPPVIFT VNRAGTGISKIKNNFSVLECASSGGAIDSLLVFGRAADKTYTVLKDLDTGKVLPDIISST DVTLTAEDGIEEGHLYEITEYTRYSDGNTEMSKKGTRRIWFDENGSFLLPSRTYLKTRLR LEKQDRTELDSWTVEMGSVEHKINNPVKKESAVAEIAGTAGNGYMPVKNGDVVKYVITYK NSNAESTDLTVTVELEKGLEFMRASLEPEKKEGTLTWQIRDAAPYSVGQIELVTLVSGQT GEFIRSVFTAGTQTNLMESILENPIAPKGSLTIRNHISGLGKNPDDVFAYRIRFLDSSGR LLSGYQNYTGSKEGRIKGEGQISLKGEEYMIFSGLPYGTKYEIIQEVNRDYEPVSREISG 
LISKEAQGAVYVNNRNDESVREVLTAGGSYCLAETTDYTDGAEQITGIYRFTLNESGRID NVDMEDKPVRLYFSKIDITTGEEISGGSYVLIDAVTKAEIYRFTKEESLPVLIPSDVLIP GNEYILHEDMPPDGYAKEEDIRFAVNKEGVAETIVMQDRKTRIYLEKVDADTGESVTGGY YCVRDMESGEAVFCYTSTGKPVLAEGVLVAGRKYELVEERPPAGYGSCRNIAFSVPLRPE AITIRMKDKKNRSGCGEAYTVFRRECQSVGSR Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:20:19 2011 Seq name: gi|229784125|gb|GG667610.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld3, whole genome shotgun sequence Length of sequence - 101644 bp Number of predicted genes - 98, with homology - 97 Number of transcription units - 37, operones - 21 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 30/0.000 - CDS 2 - 311 173 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 2 1 Op 2 4/0.000 - CDS 304 - 1380 1296 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 3 1 Op 3 . - CDS 1402 - 1935 631 ## COG1396 Predicted transcriptional regulators - Prom 2030 - 2089 7.8 + Prom 1928 - 1987 10.5 4 2 Op 1 . + CDS 2145 - 3203 954 ## COG2706 3-carboxymuconate cyclase 5 2 Op 2 . + CDS 3298 - 4416 1191 ## COG0082 Chorismate synthase 6 2 Op 3 . + CDS 4450 - 4905 561 ## COG0698 Ribose 5-phosphate isomerase RpiB 7 2 Op 4 2/0.000 + CDS 4927 - 7149 2249 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 7152 - 7209 14.1 + Prom 7168 - 7227 3.6 8 3 Op 1 35/0.000 + CDS 7374 - 8735 1494 ## COG1653 ABC-type sugar transport system, periplasmic component 9 3 Op 2 38/0.000 + CDS 8864 - 9700 834 ## COG1175 ABC-type sugar transport systems, permease components 10 3 Op 3 . + CDS 9703 - 10518 939 ## COG0395 ABC-type sugar transport system, permease component 11 3 Op 4 . 
+ CDS 10532 - 12196 1634 ## Closa_4081 hypothetical protein + Term 12202 - 12263 18.1 - Term 12190 - 12251 18.1 12 4 Op 1 3/0.000 - CDS 12254 - 13975 1659 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase - Prom 14036 - 14095 5.7 - Term 14009 - 14054 3.2 13 4 Op 2 . - CDS 14137 - 14847 878 ## COG2186 Transcriptional regulators - Prom 14870 - 14929 5.4 + Prom 15034 - 15093 5.2 14 5 Tu 1 . + CDS 15153 - 15485 585 ## COG2610 H+/gluconate symporter and related permeases 15 6 Op 1 . + CDS 16408 - 17385 1003 ## COG2610 H+/gluconate symporter and related permeases 16 6 Op 2 . + CDS 17407 - 18225 790 ## COG0627 Predicted esterase + Prom 18386 - 18445 7.1 17 7 Op 1 2/0.000 + CDS 18527 - 19864 1747 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases + Prom 19866 - 19925 2.6 18 7 Op 2 . + CDS 19955 - 20818 1147 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 19 8 Tu 1 . - CDS 21890 - 22456 573 ## COG0681 Signal peptidase I - Prom 22589 - 22648 4.7 + Prom 22394 - 22453 4.1 20 9 Tu 1 . + CDS 22650 - 23789 993 ## Closa_3889 hypothetical protein + Term 23823 - 23873 -0.6 + Prom 23844 - 23903 6.8 21 10 Op 1 . + CDS 24055 - 26943 3070 ## COG0296 1,4-alpha-glucan branching enzyme 22 10 Op 2 . + CDS 27009 - 27935 1058 ## COG0668 Small-conductance mechanosensitive channel + Term 27967 - 28024 2.1 23 11 Tu 1 . + CDS 28100 - 29413 1179 ## BDI_2740 putative signal transduction histidine kinase + Prom 30261 - 30320 80.4 24 12 Tu 1 . + CDS 30340 - 31845 1580 ## COG0642 Signal transduction histidine kinase 25 13 Op 1 . - CDS 32835 - 34763 1463 ## COG3525 N-acetyl-beta-hexosaminidase 26 13 Op 2 . - CDS 34776 - 35921 608 ## CPF_1025 acetyltransferase + Prom 36013 - 36072 5.0 27 14 Tu 1 . + CDS 36253 - 37782 1433 ## TTE2409 hypothetical protein + Term 38014 - 38052 6.1 28 15 Op 1 . - CDS 38112 - 39461 1020 ## COG0534 Na+-driven multidrug efflux pump 29 15 Op 2 . 
- CDS 39485 - 40153 708 ## COG2964 Uncharacterized protein conserved in bacteria - Prom 40180 - 40239 8.3 - Term 40273 - 40318 16.7 30 16 Tu 1 . - CDS 40422 - 41501 853 ## COG1609 Transcriptional regulators - Prom 41622 - 41681 6.6 + Prom 41581 - 41640 5.2 31 17 Op 1 . + CDS 41695 - 42657 314 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 32 17 Op 2 . + CDS 42706 - 43446 799 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 33 17 Op 3 . + CDS 43486 - 45915 1644 ## lin0032 hypothetical protein + Prom 46007 - 46066 4.9 34 18 Op 1 35/0.000 + CDS 46158 - 47519 1605 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 47603 - 47653 13.0 + Prom 47613 - 47672 3.9 35 18 Op 2 38/0.000 + CDS 47702 - 48646 906 ## COG1175 ABC-type sugar transport systems, permease components + Prom 48663 - 48722 2.7 36 18 Op 3 2/0.000 + CDS 48750 - 49493 755 ## COG0395 ABC-type sugar transport system, permease component 37 18 Op 4 . + CDS 49519 - 52023 2262 ## COG3250 Beta-galactosidase/beta-glucuronidase 38 18 Op 5 . + CDS 52039 - 53154 1288 ## COG3616 Predicted amino acid aldolase or racemase 39 18 Op 6 1/0.125 + CDS 53151 - 54776 1476 ## COG3653 N-acyl-D-aspartate/D-glutamate deacylase 40 18 Op 7 . + CDS 54813 - 55277 698 ## COG0251 Putative translation initiation inhibitor, yjgF family 41 18 Op 8 . + CDS 55303 - 57447 1918 ## COG1472 Beta-glucosidase-related glycosidases 42 18 Op 9 . + CDS 57441 - 58712 1224 ## COG3538 Uncharacterized conserved protein + Term 58713 - 58776 19.1 - Term 58701 - 58764 15.3 43 19 Op 1 2/0.000 - CDS 58774 - 59526 975 ## COG2968 Uncharacterized conserved protein 44 19 Op 2 . - CDS 59604 - 60500 636 ## COG0583 Transcriptional regulator - Prom 60529 - 60588 10.5 + Prom 60492 - 60551 8.4 45 20 Op 1 7/0.000 + CDS 60607 - 61176 739 ## COG2059 Chromate transport protein ChrA 46 20 Op 2 . + CDS 61173 - 61754 617 ## COG2059 Chromate transport protein ChrA 47 20 Op 3 . 
+ CDS 61751 - 62275 371 ## gi|266619456|ref|ZP_06112391.1| hypothetical protein CLOSTHATH_00482 + Term 62424 - 62470 13.1 + Prom 62630 - 62689 6.9 48 21 Op 1 35/0.000 + CDS 62914 - 64326 1653 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 64345 - 64394 9.4 49 21 Op 2 38/0.000 + CDS 64431 - 65273 935 ## COG1175 ABC-type sugar transport systems, permease components 50 21 Op 3 . + CDS 65273 - 66142 1001 ## COG0395 ABC-type sugar transport system, permease component 51 21 Op 4 1/0.125 + CDS 66161 - 67435 1343 ## COG1640 4-alpha-glucanotransferase + Prom 68282 - 68341 80.4 52 22 Op 1 1/0.125 + CDS 68411 - 68572 56 ## COG1640 4-alpha-glucanotransferase 53 22 Op 2 2/0.000 + CDS 68592 - 69044 442 ## COG1609 Transcriptional regulators 54 22 Op 3 . + CDS 69037 - 69627 573 ## COG1609 Transcriptional regulators 55 23 Tu 1 . - CDS 69768 - 69890 110 ## gi|266619462|ref|ZP_06112397.1| conserved hypothetical protein - Prom 69944 - 70003 12.8 - Term 70075 - 70121 6.2 56 24 Tu 1 . - CDS 70160 - 71629 1119 ## COG2730 Endoglucanase - Prom 71715 - 71774 8.9 + Prom 71710 - 71769 7.6 57 25 Tu 1 . + CDS 71872 - 73305 1462 ## Closa_3338 extracellular solute-binding protein family 1 58 26 Op 1 38/0.000 + CDS 73411 - 74343 989 ## COG1175 ABC-type sugar transport systems, permease components 59 26 Op 2 . + CDS 74352 - 75200 914 ## COG0395 ABC-type sugar transport system, permease component 60 26 Op 3 12/0.000 + CDS 75237 - 76841 925 ## COG0642 Signal transduction histidine kinase 61 26 Op 4 . + CDS 76810 - 77484 885 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 62 26 Op 5 . + CDS 77513 - 78454 1002 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 63 26 Op 6 . + CDS 78512 - 81037 1742 ## COG3250 Beta-galactosidase/beta-glucuronidase 64 26 Op 7 . + CDS 81021 - 82391 1392 ## Phep_0457 hypothetical protein + Prom 82473 - 82532 5.9 65 27 Tu 1 . 
+ CDS 82609 - 83481 924 ## COG0685 5,10-methylenetetrahydrofolate reductase + Term 83626 - 83674 3.0 66 28 Tu 1 . - CDS 83500 - 84423 1094 ## COG0583 Transcriptional regulator - Prom 84664 - 84723 3.8 + Prom 84443 - 84502 7.8 67 29 Tu 1 . + CDS 84653 - 84898 216 ## gi|288869958|ref|ZP_06112409.2| LexA repressor + Term 84935 - 84994 10.1 68 30 Tu 1 . - CDS 85134 - 85820 652 ## Closa_3855 hypothetical protein - Prom 85933 - 85992 6.4 69 31 Tu 1 . - CDS 86020 - 87138 658 ## COG0582 Integrase 70 32 Op 1 . - CDS 87296 - 87838 181 ## PROTEIN SUPPORTED gi|163783284|ref|ZP_02178277.1| 50S ribosomal protein L16 71 32 Op 2 . - CDS 87901 - 88554 80 ## gi|288869961|ref|ZP_06409574.1| conserved hypothetical protein 72 32 Op 3 . - CDS 88595 - 89008 105 ## gi|266619480|ref|ZP_06112415.1| hypothetical protein CLOSTHATH_00508 73 32 Op 4 . - CDS 88992 - 89399 267 ## EUBREC_1258 SOS-response transcriptional repressor, LexA - Prom 89421 - 89480 5.0 + Prom 89462 - 89521 9.9 74 33 Op 1 . + CDS 89546 - 89758 185 ## EUBREC_1259 hypothetical protein 75 33 Op 2 . + CDS 89794 - 89979 275 ## gi|288869965|ref|ZP_06112418.2| hypothetical protein CLOSTHATH_00511 76 33 Op 3 . + CDS 90015 - 90170 249 ## gi|266623816|ref|ZP_06116751.1| conserved hypothetical protein 77 33 Op 4 . + CDS 90182 - 90934 440 ## COG3645 Uncharacterized phage-encoded protein 78 33 Op 5 . + CDS 90927 - 91124 140 ## gi|266626184|ref|ZP_06119119.1| conserved hypothetical protein 79 33 Op 6 . + CDS 91175 - 92605 1100 ## Clole_0795 hypothetical protein 80 33 Op 7 . + CDS 92615 - 93256 461 ## PSPA7_2380 hypothetical protein 81 33 Op 8 . + CDS 93253 - 94050 716 ## EUBREC_2131 hypothetical protein 82 33 Op 9 . + CDS 94054 - 94359 254 ## EUBREC_2068 hypothetical protein + Prom 94589 - 94648 2.3 83 34 Op 1 . + CDS 94682 - 94837 139 ## gi|266619491|ref|ZP_06112426.1| putative prophage Lp2 protein 24 84 34 Op 2 . + CDS 94842 - 95681 436 ## CD3151 hypothetical protein 85 34 Op 3 . 
+ CDS 95678 - 96211 203 ## Mahau_0086 hypothetical protein 86 34 Op 4 . + CDS 96208 - 96444 162 ## gi|266619494|ref|ZP_06112429.1| phosphoserine phosphatase 87 34 Op 5 . + CDS 96458 - 96709 253 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) 88 34 Op 6 . + CDS 96688 - 96852 80 ## 89 34 Op 7 . + CDS 96874 - 97074 289 ## gi|266619496|ref|ZP_06112431.1| putative ATP:guanido phosphotransferase + Prom 97253 - 97312 3.3 90 35 Tu 1 . + CDS 97340 - 97540 104 ## gi|266619498|ref|ZP_06112433.1| nucleoside-diphosphate-sugar epimerase + Prom 97670 - 97729 3.7 91 36 Op 1 . + CDS 97754 - 97957 150 ## gi|266619499|ref|ZP_06112434.1| molyBdopterin-guanine dinucleotide biosynthesis protein b related protein 92 36 Op 2 . + CDS 98030 - 98446 165 ## CLJ_0128 DNA N-4 cytosine methyltransferase M.NgoMXV 93 36 Op 3 . + CDS 98487 - 98807 232 ## gi|266619501|ref|ZP_06112436.1| hypothetical protein CLOSTHATH_00529 94 36 Op 4 . + CDS 98865 - 99902 243 ## CLL_A2764 putative aminotransferase, class V 95 36 Op 5 . + CDS 99941 - 100162 74 ## gi|288869972|ref|ZP_06112438.2| conserved hypothetical protein 96 36 Op 6 . + CDS 100185 - 100760 387 ## gi|266619504|ref|ZP_06112439.1| putative DNA polymerase III, beta chain + Term 100977 - 101014 1.2 + Prom 100960 - 101019 7.1 97 37 Op 1 . + CDS 101087 - 101305 167 ## gi|288869973|ref|ZP_06112440.2| conserved hypothetical protein 98 37 Op 2 . 
+ CDS 101302 - 101607 221 ## gi|266619506|ref|ZP_06112441.1| (ABCB), permease/ATP-binding protein, efflux ABC transporter, heavy metal transporter family Predicted protein(s) >gi|229784125|gb|GG667610.1| GENE 1 2 - 311 173 103 aa, chain - ## HITS:1 COG:CAC0839 KEGG:ns NR:ns ## COG: CAC0839 COG1176 # Protein_GI_number: 15894126 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Clostridium acetobutylicum # 9 101 12 104 277 63 35.0 9e-11 MSKKLLSGPYLMWMIGFTIIPLALIVFYGLTDRSGAFTMANVLSIATAEHSKALWLSLGL SLISTVICFILAYPLALILRARGIGQGSFIVFIFILPMWMNSN >gi|229784125|gb|GG667610.1| GENE 2 304 - 1380 1296 358 aa, chain - ## HITS:1 COG:CAC0840 KEGG:ns NR:ns ## COG: CAC0840 COG3842 # Protein_GI_number: 15894127 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Clostridium acetobutylicum # 5 351 6 351 352 400 56.0 1e-111 MGQPLIDLKNITKTFDGTVVLDDLNLSVKENTFVTLLGPSGCGKTTTLRIIGGFERPDQG TVIFDGQDITSLPPNKRQLNTVFQKYALFTHMSIAENIAFGLKIKNKPKTYIQDKIKYAL KLVNLDGFENRTVDSLSGGQQQRIAIARAIVNEPRVLLLDEPLGALDLKLRQDMQYELIR LKNELGITFIYVTHDQEEALTMSDTIVVMNQGYIQQMGSPESIYNEPENAFVADFIGESN IVPGIMIRDELVEIFGARFVCVDKGFGNNKPVDVVIRPEDIDLVPPGNGSISGVVTHLIF KGVHYEMEVTTPDGFEWLVHSTDMFPVGQEVDIHVDPFDIQIMNKPASEDEEAVGVNE >gi|229784125|gb|GG667610.1| GENE 3 1402 - 1935 631 177 aa, chain - ## HITS:1 COG:CAC0841 KEGG:ns NR:ns ## COG: CAC0841 COG1396 # Protein_GI_number: 15894128 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 177 1 179 179 202 58.0 2e-52 MDIGAKLKELRILKGLTQEELADRAELSKGFISQLERDLTSPSIATLMDILQCLGTSIGE FFNETPEEQIVFGKTDYFEKHDQELKNEIKWIIPNAQKNMMEPILLTLEPGGETYPDNPH EGEEFGYVLQGNISIHIGSKTYKAKKGESFYFVSDKKHYLSSKAGAVLIWVSSPPSF >gi|229784125|gb|GG667610.1| GENE 4 2145 - 3203 954 352 aa, chain + ## HITS:1 COG:BS_ykgB KEGG:ns NR:ns ## COG: BS_ykgB COG2706 # 
Protein_GI_number: 16078366 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Bacillus subtilis # 4 342 3 340 349 139 26.0 1e-32 MKGKYMAYVGSYSYNGQAKGITVYDVDVAKGRFIPRCEVEVDNSSYLIASNDGKTLYSIA DEGVVSFRIHENGAITRLNSANIKGMRGCHLSTDAEDKYIFVSGYHDGKSTVLRLNPDGT VGDIVDGVFHKGLGSVAERNFRPHVSCTRRTPDGRFVMVADLGIDQVKIYRFDEKEEKLI LVDALRCERESAPRCFRFSSDGRFFYLLYELKNVIDVYTYETGERQPKIEKIQTISTTGP KALSQLTAASSMRFSTDEKHMFCGNAGDNTITVYDRDPKTGLLDYICCLPISGSYPKDIC VFPDDRHIASINHENGSITFFSVDYKKGLLIMNGKEIRVNEPNSCVIVKVSD >gi|229784125|gb|GG667610.1| GENE 5 3298 - 4416 1191 372 aa, chain + ## HITS:1 COG:MA0550 KEGG:ns NR:ns ## COG: MA0550 COG0082 # Protein_GI_number: 20089439 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Methanosarcina acetivorans str.C2A # 1 353 1 355 365 352 53.0 1e-96 MAGSTLGTIFKITTWGESHGKGIGVVVDGCPAGLPLSEEDIQAYLDRRKPGQSRFTTKRQ EADRVEILSGVFEGRTTGTPISMAVMNTDQRSRDYGNIMEVYRPGHADYTFDEKYGFRDY RGGGRSSGRETIGRVAAGAIAAKLLASLGITVCAYTKAVGPYEADPGQFNMEEMHRNRLY IPDAGTAALAEAYLDRMMAECNSVGGVVECVIDGLPSGIGDPVFEKYDASLAKAILSIGA VKGFEIGDGFQAAKSVGSENNDAFRVDESGKVQKETNHAGGILGGISDGSRVVLRAAFKP TPSIAQPQRSVTRGLEETELVIKGRHDPIIVPRAVVVVEAMAALTTADLLLCSMTSRLDR IQAFFGRQEAKS >gi|229784125|gb|GG667610.1| GENE 6 4450 - 4905 561 151 aa, chain + ## HITS:1 COG:rpiB KEGG:ns NR:ns ## COG: rpiB COG0698 # Protein_GI_number: 16131916 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Escherichia coli K12 # 2 146 3 147 149 182 61.0 2e-46 MKIAIGNDHTALEMKNEMKKFLEEKGYEVIDYGTNSTESCDYPVYGEKVGNAVAGGEADL GLAICGTGVGISLACNKVKGIRCCVCSEPYTAKLSRMHNDSNVLAFGARVIGVEMAKMIT EEWLNAKYEGGRHQRRVDQVMDIERRNQNAR >gi|229784125|gb|GG667610.1| GENE 7 4927 - 7149 2249 740 aa, chain + ## HITS:1 COG:SA0097_1 KEGG:ns NR:ns ## COG: SA0097_1 COG2207 # Protein_GI_number: 15925805 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Staphylococcus aureus 
N315 # 604 738 121 254 325 81 34.0 7e-15 MKKGKITKSLYLWVRSFIIVSVVMIGMLVALTTWMVKEFSGEIEELNHGLTSVLQTSIDI RLNDIDSFAAQLQMNAVNLKLSRSKELSEIDKDFLIQFSRQLYDYKLSNTFIKEIYVYYP HLNYVVGDLGNFGAEKYYLLKNDLKLDGYEEWLEIAGGQKRDGYSFMAQSDGSLELCLFR QLPYDLSSDKSAVLVILINKSEVESILRNANEGMGNTMTAVVAGDDSVYASFGSGISPAE LKESMRENGGKPLFKLQGNYGSVMESDYYGIQYVTLIERRKLLDSGYFIRNIAYVSIGLC MIFGAGLFVLLGRRNAKPLKDILDKLANKTDEGGRFLDDYDLIDSGISQMLKTNVESTRK LEEQREAIEGLFLCNLLSSEERNNSVIFASMQKFGIQMEYSLFQVGLIRNPMGFSQEEIH DADMRIRSVLKGREDIAVITAEFDGSLALLFHMEPEYGKAEMRDIAAEILNSLHSLVECR IMLGGIYDSMSNIITSFHQAQMVAEFCENDRGRVFFYDETMVGRQENDVYVSVMPEYEMA MLEEKFEEADRLLDLLFNQYIGNDRNVYTSRAKKYAVINPVLMALEHHADDRTGYHMEEY IGSLSEAKDGQKLLKLLHEGFGYLIRWQQKRQLDQKENIAERARQYINLNFEDPMIGLYS ISDMLGVSNTYLSTTFKKKYGIGIAQYINSLRIEKAKRLIVNTDENIKEIALSVGFSSDA AFIRVFKQFENTTPGRYKKK >gi|229784125|gb|GG667610.1| GENE 8 7374 - 8735 1494 453 aa, chain + ## HITS:1 COG:mll4149 KEGG:ns NR:ns ## COG: mll4149 COG1653 # Protein_GI_number: 13473518 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 92 451 46 408 408 125 26.0 2e-28 MKKRMKKCLSLVLAGAMAGMLATGCGANGGKEGTAAGDGSETAGTAKETGISDAAEQKAG GETAGGEKTKLVLWMPPFGTEESLDKEVWGEILKPFEEENNAQVSIEIIPWSNYEEKYLT GISSGQGPDVGYMYMEMINDYIKAGAITPFDEYLTAEDRDNYYYLDKGVIDGKQYAMPIV VGSARVMFYNKAILKEAGVETVPQTWEEFEDACRKVAAIGKTPMLQQWGDKSKGAMNSIF FPYLWQAGGEIFSEDGTKAAFDSEAGVKAAQFLADLRFKDGIMPESITSMSEDQVFAEFK AGNTAFVVSPTNQGSAFKEAGIDWGFFTSLKDQRMGTFASADSLVLISASKNKDLAVKMV KYMLSGPSMTKFHEMAAFPPVAKDEEYHDDPAFKTVYEDNQDALITLPAVQGSAAVYDNL YKNLQLMMLGQMTPEEALKNAADYANNTLSQNK >gi|229784125|gb|GG667610.1| GENE 9 8864 - 9700 834 278 aa, chain + ## HITS:1 COG:lin0760 KEGG:ns NR:ns ## COG: lin0760 COG1175 # Protein_GI_number: 16799834 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 6 271 26 289 296 197 41.0 1e-50 
MLPSFLLLVLFSLIPLCMTGYFSFTEYSVLKPPVWIGLDNYKSMAGDPFFRPAIRNTVIY TLVTVPIQTVLSLVIANLLASRFRGKLGNFFKSTLFIPVISSMVLVGTVWKFMLATEGGI INNILALVGIGKVNWLGRYNLALISVSIVAIWKNVGYFLVIYYAGLMDVPRSHYEAAKVD GASSLQQFFYITVPALKPITYLVVTLGTIWSFQVFDLVYTMTGGGPGTATITMVMSIYQS GFKQYKMGYASAMAFVLFLIVILISILQKKVFSGKGGE >gi|229784125|gb|GG667610.1| GENE 10 9703 - 10518 939 271 aa, chain + ## HITS:1 COG:SPy0255 KEGG:ns NR:ns ## COG: SPy0255 COG0395 # Protein_GI_number: 15674435 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 13 271 14 276 276 171 36.0 1e-42 MEKKIQPVLWAETVLLAVLALAALFPFVYMILTSMRQTYSMELNLGLSGLNLRNYSTIFK NFEFGKYLYNSVSVVVIACVLNAVISSAAAYGFEKKKFVGSELLFAVYLATLMVPSQVIL IPLFRIAQKLHLLNTYLILALPVVNAFGVFLIRQFITSVPDDLIEAARIDGCREFRIFYS IVVPLIKPVLVSLTVFTFITTWNDFVWPLIAITNTNRSTLTLALASLQGNYATNYGLVMA GATLTFMPPFILYIFLQKEFVEGIALSGVKG >gi|229784125|gb|GG667610.1| GENE 11 10532 - 12196 1634 554 aa, chain + ## HITS:1 COG:no KEGG:Closa_4081 NR:ns ## KEGG: Closa_4081 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 550 9 550 552 523 47.0 1e-147 MANYEFILTDSLEKVFPGRRPKRRLCKTISALGGERLSLQLAYYMEYDGCELNEQEITVT FDSAVADRIRMRRVDLVPAALAAYREQLDDNYLFTDPCLCPDLLTPVQGGILRPYASQWR AVWIDLSVTPEMCGGSYPVTVAVKRGTELLWSETVTLKVVRSVLPEQRLIYTEWLHADCL ADYYRVKPFSETHWEILERFIHTAAEHGVNMLLTPVFTPPLDTKVGGERTTVQLVGVKKE ESGYTFDFSLLRRFIAVCRRNGIKYLEISHLFTQWGAKHAPKVIALVDGEERQIFGWDTE AAGEAYRGFLSAFLPELKRVLKEEDMFEQTYFHISDEPHGEQMESYRAAKESIRSLLADC RVIDALSSFFIYQAGVVEHPIVCSDCLEPFLEAGVKDLWTYYCCVQGYEVSNQFMAMPSA RNRILGVQLYLYHIKGFLHWGYNFYNSQHSVSAVNPYLVTDAGGGFPSGDPFVVYPSEDG TPHDSLRFMVLTEAMYDLRALERLEELAGRDYTERLIHEGLEYRITFKKYPKEAEYLLRL REKVNRAIDERMNH >gi|229784125|gb|GG667610.1| GENE 12 12254 - 13975 1659 573 aa, chain - ## HITS:1 COG:CAC3604 KEGG:ns NR:ns ## COG: CAC3604 COG0129 # Protein_GI_number: 15896838 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism 
# Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Clostridium acetobutylicum # 3 573 2 572 572 770 64.0 0 MKLKSQEVRALAPEMDPLRMGMGWKAEDLSKPQIIVESTFGDSHPGSAHLLGLVEESVKG INHSGGKGARYFATDICDGMAQGHDGINYSLASRDTICSMIEIHANATTFDAGVFIASCD KAVPAHLMAIGRLKIPSIVVTGGVMDAGPDLLTLEQIGMYSAKCQRGEITEEQLTWYKQH ACPSCGACSFMGTASTMQIMAEALGLMLPGTALMPATCEDLKKAAYDAGVQSVKLAQTGL KSTDIVTMKSFENAIMVHAAISGSSNSLLHIPAAAHEFGYELDADYFDRMHRNAHYLLDI RPAGKWPAQYFYYAGGVPRIMEEIKSMLHLDVLTVTGKTLGENLEDLKSSGYYEKCDEYL AKAGRKRTDIIRSFQTPIGDNGTIAVLRGNLAPEGAEVKHSAVPKEMFQAVLKAKPFDCE EEAIAAVLNKTIRPGDAVIIRYEGPKGSGMPEMFYTTEAISSDEELSKSIALITDGRFSG ASKGPAIGHVSPEAADGGPIALIEEDDLIEIDIPARILRIVGVRGEKKTADEIDAILAGR RKNWTPKPAKYTSGVLKIFSEHAVSPMKGGYME >gi|229784125|gb|GG667610.1| GENE 13 14137 - 14847 878 236 aa, chain - ## HITS:1 COG:RSc1078 KEGG:ns NR:ns ## COG: RSc1078 COG2186 # Protein_GI_number: 17545797 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 10 227 15 237 255 103 35.0 3e-22 MNSLEFYNGKPLAEQVADYILNYIIDNSLEVGAKIPNEFELGKMINVSRTTIREAIKILV SRQILEIRRGAGTFVSERQGITDDPLGLTFVKDKTKLALDLVSVRLMLEPEIAMMAADHA TPEQIEEMRLQCSQIEELIQSGRDHMEADIRLHTLIASCSGNSVVEKLVPVINSSIAVFV DVTNRALGQETIETHREIVNCIAERDAEGAKCAMYMHLIYNRRVFRQLEKEEKTLR >gi|229784125|gb|GG667610.1| GENE 14 15153 - 15485 585 110 aa, chain + ## HITS:1 COG:BH0805 KEGG:ns NR:ns ## COG: BH0805 COG2610 # Protein_GI_number: 15613368 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Bacillus halodurans # 8 108 6 106 456 74 43.0 5e-14 MDYVANPAQLITATLVGFVVLLFLIIKLRLHALLSVLIAGVIIGVGSGMPFSLVTQAVTD GMGNTLKGIALIVGLGSMFGGILEVSGAAQAIADKLVKVFGEKNSAFAPS >gi|229784125|gb|GG667610.1| GENE 15 16408 - 17385 1003 325 aa, chain + ## HITS:1 COG:ygbN KEGG:ns NR:ns ## COG: ygbN COG2610 # Protein_GI_number: 16130647 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # 
Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli K12 # 2 320 127 450 454 140 31.0 2e-33 MPIVYSIAKKTNKSLLYYVLPMTAGLGIGSAIPPTPAPMLITGSLPVDLGTVTILALVLV IPKYLISYAVAVPFAKRLYVPVPENAVLAPEKKEGAAGGPSFGLVISIVLLPILLILVSA SAGMMTADSAGMQNFLDFCVFIGQPYLALIIANVAGIIFLCVRKGITMETVESVLAKSLQ PVAMIVLVTSAGGVLRYVLDYSGMGTLIGNALQNANLSIILVAYIISTLMRIGVGSTTVA LTMTLGIVASFPQISTYSPMYLACIGMSAMFGATSFSHVNDSGFWLTKEYMGIDLKTAFK SWTLYGGFCSILSFIVLYAISFIWA >gi|229784125|gb|GG667610.1| GENE 16 17407 - 18225 790 272 aa, chain + ## HITS:1 COG:lin2527 KEGG:ns NR:ns ## COG: lin2527 COG0627 # Protein_GI_number: 16801589 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Listeria innocua # 1 262 1 252 252 160 34.0 2e-39 MALLNVDFYSYYLGMDSPLTVLLPEKRGRKPEAAPDKKYPVLYLLHGHADDNTAWIRKSD LELLVRDHDLIVVMPSAHRSFYTNGRYGHLYFDYITKELPVIVGNFFPASARREDTYIGG LSMGGYGALKAALSCPEQYAGVAAMSAANSPFGAMKAAGPMFSVPDFMDNVYRIFGDEAE YKGSKEDLEYLAKQAASSAGSGLKIYHSCGKQDPLYGLNVEFKQFMETECPELDYHYCEC DGSHNWGFWNPQLKCILEYFGLIPASQDKSGT >gi|229784125|gb|GG667610.1| GENE 17 18527 - 19864 1747 445 aa, chain + ## HITS:1 COG:SPy1150 KEGG:ns NR:ns ## COG: SPy1150 COG0446 # Protein_GI_number: 15675127 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Streptococcus pyogenes M1 GAS # 5 445 4 455 456 597 64.0 1e-170 MKETIVVIGANHAGTAALNTILDNYKDKEVIAFDSNSNISFLGCGMALWIGRQISGPEGL FYSDKAAFEAKGAKIYMETPVTSVDYDKKIVYASDREGNTYEQHFDKLILATGSLPIRPA IEGMDLENVQFVKLFQNAQDVIERLKDESVKRVAVVGAGYIGVELAEAFKRNGREVVLID CADTCLAGYYDREFSDQMADHLKEHGVTLAFGETVKKLEGSVRVQKVITDKGSYDADMVV FGIGFKPNTALGGDRIKLFRNGAFLTDLCQETSVPGVYAVGDCATVYNNAIQATDYIALA TNAVRSGIIAAHNACGTKLESVGVQGSNGISIWGLNMVSTGISLAKAEKLGIDALVTDFE DWQKAEFMENGNYKVKLRIVYDKNTRVILGAQLSSDYDISMVIHLFSMAIEEQVTIDKLK LFDAFFLPHFNKPYNYITMAALGAK >gi|229784125|gb|GG667610.1| GENE 18 19955 - 20818 1147 287 aa, chain + ## HITS:1 COG:CAC3342 KEGG:ns NR:ns ## COG: CAC3342 COG2084 # Protein_GI_number: 15896585 # 
Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Clostridium acetobutylicum # 3 285 7 289 292 308 54.0 8e-84 MKKVGFIGIGIMGKSMVRNLMKAGFEVAVYTRTREKAEEVIAEGAVWCKDVKTCAAGRDA VITIVGYPKDVEEVYFGENGIIASADPDTVLIDMTTTSPKLAVRIYEDAKKAGLKALDAP VTGGDTGAREGTLTILAGGDEADFTACMPLFEAMGKNIKYEGKAGNGQHTKMCNQIAIAG AISGVCEAMVYAKAVGLDVAQMVDSIGTGAAGSAQLKTVAPRILAGDFDPGFFIRHFIKD MKLASEEAEDAGVHLGVLEYVLSMYEDMEAQGKGSLGTQALVQYYQW >gi|229784125|gb|GG667610.1| GENE 19 21890 - 22456 573 188 aa, chain - ## HITS:1 COG:alr2975 KEGG:ns NR:ns ## COG: alr2975 COG0681 # Protein_GI_number: 17230467 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Nostoc sp. PCC 7120 # 24 182 25 183 190 117 39.0 1e-26 MNREDNRLEKNEAFDWKAEIISWIKIIAAAAVIAFVLNNFIIANSKVPSGSMENTIMTGD RVIGSRLSYKFGDPKRGDIVIFHFPDDPTGTIYYVKRIIGLPGDTVDIIDGKVYLNGSRT PLDEPYIREPMDPELPACFEVPEDSYFMMGDNRNFSADARRWENKYVKRDKIIAKVLFRY YPGIGKIE >gi|229784125|gb|GG667610.1| GENE 20 22650 - 23789 993 379 aa, chain + ## HITS:1 COG:no KEGG:Closa_3889 NR:ns ## KEGG: Closa_3889 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 31 377 3 349 349 498 67.0 1e-139 MGEKIWKNAHKNKEGASAEFGSEEQGKNLRGRRKGTEGRLLLASGILFAVSLLLQLLARL VPGFGEWYAVTVYPVIVGTLGRVSGSVPFSVVEAAAYLLTAAAVWYTAVHLKRIRMVLVR ALLFLTGIFFLFTCNCGVNYYRKPFSSYSNLEVRNSSKEELIDLCTTLTETVNQYCEDGV GAAMELKDANREGVAAMRRLGEQYPQLAGYYPGPKPLAWSYFFSVQQLCGQYSPFTVEAN YNREMLDYNIPHTICHELSHLRGFMREDEANFIGYLACIGSDSPSYRYSGYLTGWVYATN ALAKTDMEAYLELCGVLDERAWADLRENNEFWARYDGKAAEVSNQLNDTYLKINSQTDGV KSYGRVVDLMLAYARKQAD >gi|229784125|gb|GG667610.1| GENE 21 24055 - 26943 3070 962 aa, chain + ## HITS:1 COG:all0713 KEGG:ns NR:ns ## COG: all0713 COG0296 # Protein_GI_number: 17228208 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Nostoc sp. 
PCC 7120 # 14 750 11 761 764 745 48.0 0 MDKKLYDLMDWAGIEEIVYSEAADPHKLLGPHVTDDGLLISAFIPTAVMVTVRLASSGKE FPMELADEAGFFSALIPRKNIPEYTLLVVYDNGTTEELYDPYAFSPLYTESDLKKFEAGV HYTIYEKMGAHPANINGVAGVYFSVWAPCAMRVSVVGDFNLWDGRRHQMRRLGDSGIFEL FIPGLASGLIYKYEIKNGHGDPQLKADPYGNFCELRPNTASVVWDTDRYEWQDGAWMEQR AKTDSKNQAMSIYEIHLGSWIRKETERDENGNDIVGSEFYNYREIAVRLAEYVKMMGYTH VELLPVMEHPLDASWGYQVTGYYAPTSRYGTPDDFMYFMDYMHGQGIGVILDWVPAHFPR DSFGLANFDGTCVYEHKDPRQGAHPHWGTLIYNYGRPGVSNFLIANALFWADKYHADGIR MDAVASMLYLDYGKNPGEWIPNIYGGHENLEAVEFLKHLNSVFKARTNGAVLIAEESTAW PEITGDIKEGALGFDYKWNMGWMNDFTGYMQCDPYFRKHHYGELTFSMLYAYSEDFILVF SHDEVVHGKGSMIGKMPGEELEIKAANLRAAYGFMMGHPGKKLLFMGQEFAQIHEWNENA ELDWGIVEQPLHKQMQEYVKSLNELYVNYPALHQMDYEPEGFEWVNCTDSEESIVVFLRK TKKKEETLLIVCNFDTVLHEKFRVGVPFAGKYKEISNSDAELYGGKGRTNPRVKNSKKAE KDARPDSIEITVAPLSVMIFTCTPAEDKKAAKPAAAKTAGKTKALKAAEKKTEVKKTGLQ RTAGKKNETEKSADKKPGPQKTADKEPETGKITDKKPEAKKIEDKKTEPQKIADKRPEAK KIEDKKPETQKIADKKPEAKKIEDKKPETRKIADKKPEAKKIEDKKPETQKIADKKPEAK KIEDKKPETQKIADKKPEAKKIEDKKPETQKITDKKPEAKKIEDKKTEPKKKADRGPRDE RS >gi|229784125|gb|GG667610.1| GENE 22 27009 - 27935 1058 308 aa, chain + ## HITS:1 COG:VC0480 KEGG:ns NR:ns ## COG: VC0480 COG0668 # Protein_GI_number: 15640507 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 45 302 30 287 287 182 36.0 6e-46 MITGDSTAGLPDETLAALQQEVQQNLEQLKPNVMLETMKGWVPGLIAFGIKLLIAIAIFA VGSRIIKMIYRMLNRSFTRMDMEISLRKFLLSVLNATMYCLLGFIIAGQIGVNSASIVAL LGSASIAVGLAVQGSLANFAGGVLILMMKPFRVGDYIVSKDGEGTVYTIGLVYTVLNTVD NKQVVIPNGTLSNSPLTNVTAMDKRRIDIKIGIGYGSDLKRAKQIMEEIYVTHPAVLKEE SIDVFVDELGASSVTIGGRGWVATEDYWSTRWNMMEQIKLAFDEAGIEIPYNQLDVHIRD GNTEKQDV >gi|229784125|gb|GG667610.1| GENE 23 28100 - 29413 1179 437 aa, chain + ## HITS:1 COG:no KEGG:BDI_2740 NR:ns ## KEGG: BDI_2740 # Name: not_defined # Def: putative signal transduction histidine kinase # Organism: P.distasonis # Pathway: not_defined # 18 437 16 436 942 471 53.0 1e-131 
MKRTRFMNRFGIWAVFSLIIIMISAISVTPVAAESQPRVLRVAFCEIEGITEKDPDGTRH GLVVDYLNEIAKYTGWEYEYIDTTAETMLEEFADGEYELMGGNYYMPGLEAYYGYPDYSI GNSKAVILAREDDDSIQSYNLKSLNGKTIGVYERATENIRRLKEFLSMNALDCTLKYYAY EDLTNGNLYPCLERGEVDLLLGNSSEQSPGIRVVVSYDSQPYYIVTNVGNQEVLDGLNMA MARIADSNPNFAAERYNANFLNASNVRIRLNEEEQEYIRQKGSVTVAMPRNFHPLACENP NDIHDGLVHDILKEMAEFTGLEFRFISADSYMNAMELVRQGAADLLGFYLGDVSDSMQKG MVLTAPYASMNSIVVRSKASSFPGDGLVGAVIVGRGLPGTIRADRVEFFHNASDALEAVN RGEIDFVYGLASYMEYD >gi|229784125|gb|GG667610.1| GENE 24 30340 - 31845 1580 501 aa, chain + ## HITS:1 COG:all0638_1 KEGG:ns NR:ns ## COG: all0638_1 COG0642 # Protein_GI_number: 17228134 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 102 353 342 595 615 194 41.0 3e-49 MEYEIQKNHFTNVVPVTLVNDRMDVSFALAKPADPELLTIINKAVNSISSDRRNELMDQN LISVGSGQLSLTELIYANPFMFIGVLTTLLLILVTSVLWFNRMRVKSAVMQSSLEKAEAE NRAKGAFLSRMSHELRTPMNAVVGLAELAGMTEGVPENVREMLVKLRASSHYLLDLINDI LDMSRLDSGMLETASEPFSLEELLGELRTMMETEAGKRGLICEMEADVDHSRVTGDSIRL RQVLMNLLSNAFKFTPEGGTVTLRVTEKTGVNGSADFDFRVTDTGVGIAPEDQERIFDTF EQMGTNYSRSQGTGLGLPISRSIVQLMGGELRVKSEPGHGSEFYFTLTLPLDSLSEECED GNPMVTDDNLLEGVRILLAEDNDLNAEIAIQLLELKGAQVSRSENGRLAAERFAASAPGE FQAILMDIQMPEMNGLEATRAIRAMDRPDAAVIPIVAMTANVFKEDVDAAMEAGMNGFEG KPLDVEHLYRQLCRLLGIDTE >gi|229784125|gb|GG667610.1| GENE 25 32835 - 34763 1463 642 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 37 323 75 378 757 66 25.0 1e-10 MYRIFPAPKQCTFQQNTISGIRYGAVTTAPEIEASCRDAGFDPAALLAEAFPGGYSPDGI TLAAEYVTCSPESFHICISSDGIRITCQDGAGYFYALGVLVQLAAQADGVLPCAEIADEP SLSVRGIMLDIGRDKIPSMETLKGLIDLFASMRINHLQLYMEGFSFDYEDFHYLFTNETP VTPEEFRELSQYARAHFIDLVPNQNVLGHMEKWLEKPQFRSLAECEDGYIFENLFWRPPM TLDVRDSRSFELAETLLGTLMDASSSGCINVNMDEPFELGMGKNKQAAEKEGRLNLYFEY VEKLHAYCASRGKRMLMWGDEILHHPDCVSRFPKDAILLDWIYEGDAHFEDHARLMQKTG 
LNYCLCPGTSSWGSLTGRSDNMAKNIRDAASCAIRYGGLGIITTDWGDLGHWQYLSASYP GFAYTAACSWSGPESDSDLTAWFCNRFIYESPAENAFQTAWDLGNYYPLEHAPLYNTTLA FAVMSSKYTYESIEEFDAKMERLLTLSANIAKTNSIPPREPAICIDFAAMKEYLDQVEKE VSEAALACADGELIRREMENGLRMVRHGIHLYYTMTVLRDDPGAFSREMAVLFDDLDDIL KVHYTLWSARNRSGGFQRSTAHMNHLLFFYRKMQKANEQATT >gi|229784125|gb|GG667610.1| GENE 26 34776 - 35921 608 381 aa, chain - ## HITS:1 COG:no KEGG:CPF_1025 NR:ns ## KEGG: CPF_1025 # Name: not_defined # Def: acetyltransferase # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 4 379 5 422 424 189 26.0 2e-46 MTVYCKGQPGELPEILDFINMVFSMHRVPHNFRTLLPKLYGTDRQTEWCHYLAKEDGQIR AVVCVLPIILEHSGHTITCATVGSVSVHPYSRGKGYMKQLMAMAADDMERQGITLGILDG RRHRYQYYGYEPGGHYLEYRFISDNFRHCASRYPDIPITLQPVTAGDTALIEKLSRMYAL LPVHTVRSEQDFFPVCTSWGSGLYAVECEGTAAGYLCGTKDHIYELVLTDEAMLFSALKA WSTLHGCDAFTLAVPSYDTERIRGISGFYERFSVREEDNYRIFNYSDAIRFFLSIKSESE PLTDGRLVLQIGGRSALAVTVSHGEITVSPCGDTPDLCMSDVEAVDLLFSPASFYGREPS SPLYSVNWFPLPLSVSHLDKC >gi|229784125|gb|GG667610.1| GENE 27 36253 - 37782 1433 509 aa, chain + ## HITS:1 COG:no KEGG:TTE2409 NR:ns ## KEGG: TTE2409 # Name: not_defined # Def: hypothetical protein # Organism: T.tengcongensis # Pathway: not_defined # 3 509 5 512 512 445 44.0 1e-123 MSKIVFVPLDERPCNYLYPDYIGRMSGLDLRIPPKEILGDLKKEADVEAVWEWTKGQVKG ASHLVVSMDMLLYGGIVPSRLHHLPEAVCEERLERLKEIRGLEPDIQIYGFQLITRAPAR DGSGEEPDYYEDYGYRIFRYGVIRDKESVNAATPEELEELETICREVPEEYLNDFLERRR VNYHNHIRTIELVEEGVIDYMIIPLDDCREYGYAPSERKKLSSVMAKKNLLSRIAMYPGA DEIGCTLLARAINGQSGLTPAVYVDYSSLRGKTQIPSYEDRSIGETVLSHLLAAGCSEAE TSAEADFVLAVNPPTPFSLKLEKEILTDDIILESERNLSAFLARIRKYRTRGLAAGVADC AIPNGADRALMQFLYEQDMLKELTAYGGWNTSSNTLGTVLSHLCAWNAAERLGRLTGEAR EASEEFLFFRYLEDWGYMAEVRRDVTDHLTDIDPELNRLDLRDKEPVVREIVKKRLEEFQ AKYFPKEPYRFELRMPWNRMFEVEISLYR >gi|229784125|gb|GG667610.1| GENE 28 38112 - 39461 1020 449 aa, chain - ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 8 442 9 
443 445 253 34.0 7e-67 MKERPSASLTEGPITRSILLFTGPLIIGNLFQQLYNTTDSFIVGNFVGSGALAAIGASAQ IINLFIGFIIGLTTGAGIIIAQYYGAGDGTRLHSAVHTSFCFCILGGILVSIAGCLLSPL ILTAMNTPSSVIPDAALYLRVYFLGALFNIFYNMGAGTLQAAGDTKHPLYCLCLSSVINV VLDIVFVTVFHMAVLGVALATLIAQLVSAACVLWKLTHTDDIYRLIPSKIHSDKAMLIKI LKFGIPTGLQTAVTSLSNVIVQSYLNGFGSLAMAGSNIYGKVDSFALLPANCLSLTTTTY IAQNMGARKFDRVKQGFKQFLWLGNLYAAAIGILLFFFSAGPLRLFTAAPEVVHYGAMMG QVLGPGYILLITCQILIGTARGAGDTFSTMLLSILNLCGLRILWLTIMIPFFPSIYTLYL GYPVTWGTAAVCMGIYYCKKIKPRLQRVR >gi|229784125|gb|GG667610.1| GENE 29 39485 - 40153 708 222 aa, chain - ## HITS:1 COG:YPO1671 KEGG:ns NR:ns ## COG: YPO1671 COG2964 # Protein_GI_number: 16121935 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 5 216 15 224 225 85 28.0 1e-16 MKTLDEIMPILQQLLTLVEQHFGNKCEIVLHDLTKDYNHTIVDIRNGDITHRSIGGCGSN LGLEVLRGTVLDGDRFNYVTTTQDGKILRSSSIYLKNDQGEVIGSICVNLDITETLQFEG YLRQFNQFDSFTSNDEEIFAPDVNNLLSHLIQMGQEQIGKPALEMNKNEKIEFIRFLDQK GAFLITKSGEQICELLGISKFTFYNYLESSRSQSDSSDSDQT >gi|229784125|gb|GG667610.1| GENE 30 40422 - 41501 853 359 aa, chain - ## HITS:1 COG:lin0030 KEGG:ns NR:ns ## COG: lin0030 COG1609 # Protein_GI_number: 16799109 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 16 357 4 351 351 311 45.0 1e-84 MGFKHILEGLIMKSNITITQIAQESGVSVATVSRFLNGKVPVSADKKARIEAVVKKYDFT PNALARGLISKRTMTLGVILPDITNPYFATIFQEVQRCAAEEGYSVYLCNTRFHHGDSPI DETGYFRMMMDKKVDGILIIGGQLDMVEPNPAYKEALNKLAASIPVVAAGRAIPGVDCIF LDSENGSGVVTAYNYLASLGHKHIAFAGGQPGVTITETRLEAYKKAAASSGSAVQEDLIS LSDYYLPDGYQAAEALLTRETPFTAVIAMNDNVALGAIRAFADHGLSVPRDVALISCDRF YFADYTMPRLTSIHHHSRRWGQMVVRTLIQSIQGSTENARITFPPELIIGESCGTHLHH >gi|229784125|gb|GG667610.1| GENE 31 41695 - 42657 314 320 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 4 263 5 266 323 125 31 8e-28 MDTYLGIDYGGTKLLIGEVREDGVLLKYRRYPTGLTGQEEIAAHLLQCLDNYVAEEAPEG 
RIRGAGIGIVGISDFKNGIWHSVNHEDGIPLPLAQMAAARLGVETGIDNDVRSAAAAELL WGEGKNCSDFIYVNAGTGLAAGFVIDGKVLHGAHCDAGEIGHMVVDYRSSRNCVCGRKGC CELTASGIGIHRMVEERRADAPAELLELGKSGRYPAAAVFRMAREGSEFCLSIAEEAASV LGCTIMNLVRMSDPDMVVAGGGLMSDPWFFQRVEGYLEKVTMRHVSKGFRVSGFAPAFSG VIGAAAVGRLKSGREGRNII >gi|229784125|gb|GG667610.1| GENE 32 42706 - 43446 799 246 aa, chain + ## HITS:1 COG:BS_ybfT KEGG:ns NR:ns ## COG: BS_ybfT COG0363 # Protein_GI_number: 16077305 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Bacillus subtilis # 1 241 1 243 249 148 37.0 1e-35 MKITIAETEKQFDETGAWRIIGQILARKNSVIGLSTGQTTTNMHRIVSELYKLHPFDVSE TVFFNVDELTNLPRSYEGSCYAMIRSQLLDHLGVPDERFLMPLTESDDFELESRKFENSI REHGGADLEILGIGWNGHIGINQPGTPFGSDTWVSPMDEIFEARVRRETAVPDDHELGGL TLGIKTIMQSRRIILAAKGEKKAEIIRQALEGPVTEAVPASILQLHPNCEVLLDREAAGA LMRKTI >gi|229784125|gb|GG667610.1| GENE 33 43486 - 45915 1644 809 aa, chain + ## HITS:1 COG:no KEGG:lin0032 NR:ns ## KEGG: lin0032 # Name: not_defined # Def: hypothetical protein # Organism: L.innocua # Pathway: not_defined # 15 806 4 795 800 607 39.0 1e-172 MQSDAERTNMHINTEFQQELEESGYIHRPLPLHEEVSLEQREKEKKVTARKTVWSGDGKG CCGDPVSMERGTVETVRLEAETVLAMEAPLMCRCWPEGMPQDGDCAYYGHIHAAFPVKET DWTSFHRVRALIRPECDGARTVSLSMYLENDGERKVPDRYRRQGFHMMNLKNHQWNDCIW EFPSLPRDCVSGFGFRYRINGRDTASGETAAFFVKEIWLEAVEHPGKESGWLPERNTVCY STCGYETGAVKRAVLAEEAETFIIEDRSGRICLKKSLEHVVWKEDAFRIADFSELTQEGE YRIRAGNVTGDWFPVRDGVLNEVTWKGINFLFCERCGYPVPGKHGLCHMDVYAEHNGVKL PYCGGWHDAGDMSQQTVQTAETVESLLELAAERRGNTLLYQRLMEEALWGLEFIFRTRFG DGYRASSIGLIRWTDGKIGNDDDAANVRVHNHALENFICAGVFALAAECLRDYDGELAWR CAKAAEEDFGFALERCRTHGMELPVPWEHTYSSSAALFYGEIVTAAVRIWKVTGKEVYET AAAEYGRKLLDCQEKDSSETGLRGFFYRDKTHRDIVHYNHQAREHVPVTALVLLCRLLDS HPDKMLWEEGIRLYGEYVRDLMACASPYEMLPAGLYSETAEDQELFRLLHLQTDYETEKE NHTLQVRAGMCVPTGTGGKSGYGVRQFPVWFSFRGNTAVMLSAAKGAAAAGGYLNDAFLM EAAASQTDWLFGKNPFGRSLMYGVGTGYQQLFSTFPGICVGQLPVGIETDGNSDVPYWPG GNQSTYKEVWVTSVAKLFAIIAEIYKNHT >gi|229784125|gb|GG667610.1| GENE 34 46158 - 
47519 1605 453 aa, chain + ## HITS:1 COG:lin0762 KEGG:ns NR:ns ## COG: lin0762 COG1653 # Protein_GI_number: 16799836 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Listeria innocua # 1 414 1 367 417 118 27.0 2e-26 MKKRFMSCLLATAMVITSLTACGGGKSTTTDTTAAAGGETAETTAGETAGDTKAESGDRK KVVFWYSHTGDEATCFENAIKAYNESQDKYFVEGLSVSDKQKIIVAISGNEAPDVIEVSN QDVLAYATNGLVESITDLAAADNYDLTGVFSNQSLVANTLNDTVYGAPLASMIIQMFYNK DILEEIGYTEPPKTMEELYEMAVKATEVDDKGTITRLGYPLFPLASARQELIYAFGGKWW SDDGLTLTPDDPAILDSLKMNIDYRSQYGIEKVQEFVATANTNRYTENDMFFAGKQLFRF DGTWMAAMMKNFNSDVNYGVALIPGTEAHPEIAGTSRYETNSLSIPIVANEKEGAWDFVK YFTNSEATKELLIGMANLPTQLALYDDPDILAQPNFDMFIEALKTENGIQYAKIEDLAKY TSLIEEYLDYAYNGMQTPEEAMKGLADQAKMLQ >gi|229784125|gb|GG667610.1| GENE 35 47702 - 48646 906 314 aa, chain + ## HITS:1 COG:BS_yesP KEGG:ns NR:ns ## COG: BS_yesP COG1175 # Protein_GI_number: 16077765 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 6 303 8 298 309 218 41.0 2e-56 MSKPTAKKMSKSERRDNINGFLFALPWIIGFVCFSLIPLLTSFYYSFTSFNPVKPPEWIG LENFKYIFKDPLVFKSLKNTLFMAFVSTPINLFIAMLLASLLNSKFKGRGVARTIFFMPS IIPMVAATMVWIWMFDPTYGYINRVLEMIGINGPSWLVNPAYTKWALVLMGTWCTGTTML ICLAALQDVPNSYYEAAEIDGANAFDKFFRITMPCVAPVLVYQGILNIINSFQYFTQVYV IINASSGGGASNASGGPANSILMYPLYLFNTAFSYMKMGRASAMAWLLFVIVFVLTLVMT RITKKVSENGVGGE >gi|229784125|gb|GG667610.1| GENE 36 48750 - 49493 755 247 aa, chain + ## HITS:1 COG:SMb20969 KEGG:ns NR:ns ## COG: SMb20969 COG0395 # Protein_GI_number: 16264842 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 4 247 27 270 270 182 39.0 6e-46 MVTTALKTNADAFQLPVKLFPREVIWNNFPEAMAKIPYVRYMMNTIFITLLSVVGQMVAT PLVAYSLAKIKWKGAPIISALIMGTMMIPYTVTMIPLYKIWSRLGFTNTYVPLILPTFFG SPFYIIIMRQFFAGLPNSLMEAAKIDGAGEFKRYIAIALPLSRPALTTVGIYAFINAWSD 
YLAPLIYINKTEKLTLSLGLQGFLNQYSVDWTHLMAAATIFVIPVVIFFLFFQRNFVEGI ATSGIKG >gi|229784125|gb|GG667610.1| GENE 37 49519 - 52023 2262 834 aa, chain + ## HITS:1 COG:XF0846 KEGG:ns NR:ns ## COG: XF0846 COG3250 # Protein_GI_number: 15837448 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Xylella fastidiosa 9a5c # 21 834 59 880 891 447 32.0 1e-125 MEHRITLNGTWNWCDTETKEWHKGSVPGTVLTDMADGGLIEDPYWRTNEYETRELSRRDY RYEREFEVPDSFFRETEQCLVFEGLDTIADIYLNDELLLAVRDMHRTWRIDVKGKLKTVN RLAVTFHSPIAFIEKADREGDIFYASTGCMKGNGALRKAHYMFGWDWGPQLPDAGIFRSV YLSGYSGACLEDVRVRQEHGDTGVKLSLESSVRMPGRSETSEAYTLACEITAPDGTGIFV SQTVHPGSTACTSETIEALIEDPQLWWPNGYGEQPLYTVRAELKAAGAVLDVWERTIGLR TVTVCTDADEWGNQFAFVVNGQKIFAMGANYIPEDNLLGHLSEERSERLIRDCARANFNC IRIWGGGYYPEDYVYDACDRYGILVWQDLMFACNVYDLNDEFEADILAETADNVKRIRHH ACLALWCGNNEMEWGWRDWGRLEGHRPKYKADYTKIFEMLLPRLVKQVDDQTYYWLSSPS SGGSFDDPNDFNRGDNHYWEVWHSNKPFTEYRDFYFRFCSEFGFQSFPGKKTLDAFSLPE DQNIFSEVMESHQKNGLANTKIFSYISGYYKYPKDMESIAYISQILQLKAIQYGVEHWRR NWGRCMGSIYWQLNDCWPVASWASVDYYGRWKALHYGARRFYSRFMATACEKEELSTDIG YYIHNESFEERKAVLVVRLFDRDFHTLYETETEAVTGPFEVKQVLAMDFAPFLTEERMKK KAVAEYRLIEDGRVVSRGTTLFVKPKYFEFQKPEYRISVTEREDVFCIHVDADTYVSYAE LSMDEYDVIFEDNFFDITSEEGVDITVDKKEFPNRITAEEAAAALHIRSVADSF >gi|229784125|gb|GG667610.1| GENE 38 52039 - 53154 1288 371 aa, chain + ## HITS:1 COG:mll8328 KEGG:ns NR:ns ## COG: mll8328 COG3616 # Protein_GI_number: 13476878 # Func_class: E Amino acid transport and metabolism # Function: Predicted amino acid aldolase or racemase # Organism: Mesorhizobium loti # 9 370 7 351 352 208 36.0 1e-53 MQKELLDQLETPCVVIDMAKAEENVIRMQAEADAAGCRLRPHIKTHKMPLFARMQLAHGA AGITCAKVSEAEVMADGGADDIFIAYPMVGGFRIKRAVALARRLKRLILAVDSMECAVPL NEAAKAAGITLEVRLEVDTGAKRTGVQRTKAAELAKEVHQLSNLNLTGIYTFKSLVYHDK PTEDKVIAGAEEGDMMEAIADEIRKAGVPIAEISAGSTPTGVEVAKTGKVDEIRPGTYIF KDHMLCKEGAAEPEDIAVRIYATVVSTPCREYAVIDGGTKTFPMDILLDTPPYCYPGYAL IAGNDDLQLRRMNEEHGIITSKKGDTGLKVGDKVELIPIHVCTAINMQNSVYLYDGETLR QEVVAARGMLV >gi|229784125|gb|GG667610.1| GENE 39 
53151 - 54776 1476 541 aa, chain + ## HITS:1 COG:PAB0090 KEGG:ns NR:ns ## COG: PAB0090 COG3653 # Protein_GI_number: 14520359 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: N-acyl-D-aspartate/D-glutamate deacylase # Organism: Pyrococcus abyssi # 4 540 7 523 526 332 36.0 1e-90 MRTLIKNGLVYDGSGDKPFRKDILIRNGVIEAVTENTETAGTEAPPADSGGLEVIDAKGL VVTPGFIDTHRHCDIDALYNPEFGRLEMAQGLTTVIGGNCGLAPIPAPKKYRRAIYDFIE PCLGIAPEEMALERFSDYLEELGKRDLPLHVGSLIATGTLKAAIKGYGKGPFTGPEMEQA KAYIREGLEAGAAGLSMGIMYQPECYSARAEIQELISAAAPFGRPLACHIRGEGDNLVSS VKEVIEVTGAAGVPLNISHFKATGVKNWGSEIYKAIELIDAARAAGQDVTVDFYPYCGGS TTLISLLPPAVMEDSVEMTLKKLGTERGKEELRREIYREHTGWDNMVTAIGWERILLSSV TKEANRKFTGKNFREAASLAGYEEPADFCSDLLAEEQGKVGIIVLSMSQEDVDTVARLPY SMVISDSLYGVSDCPHPRLYGSFPKIIREYVRERGVLTMEEAVKKMTLLPAKRLSLEGRG MIKEGYHADINVFDPEKVRDYAVYENPKQLCSGFRMIMVDGTIAVSDDLLLKRNCGSVIK L >gi|229784125|gb|GG667610.1| GENE 40 54813 - 55277 698 154 aa, chain + ## HITS:1 COG:AGl3105 KEGG:ns NR:ns ## COG: AGl3105 COG0251 # Protein_GI_number: 15891668 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 153 15 163 164 100 43.0 1e-21 MNDVYAKLKELGLELPKAPAKGGVYSSSRVFAGNLVYISGCGPVIDSPVAGKVGKEYTKE EAQVFSRNSMLNVLAVLQDRIGDLNKVKQAVKILVYVASDDDFYEQPYVANGGSQLLVDL FGEEAGAPTRSAVGMNVLPGNIPVETEAIFEIEE >gi|229784125|gb|GG667610.1| GENE 41 55303 - 57447 1918 714 aa, chain + ## HITS:1 COG:SSO3032 KEGG:ns NR:ns ## COG: SSO3032 COG1472 # Protein_GI_number: 15899739 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Sulfolobus solfataricus # 41 681 81 724 754 356 35.0 9e-98 MEPVYLDESRTDEERVRDLVSQMTLEEKVSQLRYDAPAVERLGIPSYNWWNEALHGVARA GAATVFPQAIGLAAMFDEALLEKIGDVTALEGRAKYHEAVRNGDRGLYKGITFWSPNINI FRDPRWGRGHETYGEDPCLTGRMGTAYIKGMQGNGKRLKAAACVKHFAAHSGPEKGRHSF NSVVSKKDLTETYFPAFERCVKEAGVEGVMGGYNRLNGEAACGSHHLITEILREKWGFDG 
YYVSDCGAIKDFHMHHGLTDTPQESAALALKSGCDLNCGAVYLHVMSAYNQGLVSAEDID RAVTHLMMTRMRLGMFDQHTEFDEIPYEINDCAEHHGLALKAAEESMVLLKNDGILPLDK TALKTVAVIGPNGDSEEILKGNYNGTATEKYTILEGIRAVLGKETRIFCSEGSHLYRDNV ENLAEADDRLKEAVSMAVRSDVVFLCLGLNGTLEGEEGDANNSYAGADKADLNLPESQMR LLKAVCGTGTPVILLLAAGSAMAINYAAEHCSAILHIWYPGQMGGLAAARLLTGEAVPSG RLPVTFYQTTEELPEFTDYSMKGRTYRYMEREALYPFGYGLSYGDFEYSNFKAEQTEAGP DGQVRFSVKITNRSKAECDEIAEVYVRIADSELAAPGGSLADFRRIHMKAGESVTVPFTL PVKAFMVVNEEGEYILDGSTAVVTCGGSQPDSRSVKLTGKTPLTLEVKLEDLRW >gi|229784125|gb|GG667610.1| GENE 42 57441 - 58712 1224 423 aa, chain + ## HITS:1 COG:lin0759 KEGG:ns NR:ns ## COG: lin0759 COG3538 # Protein_GI_number: 16799833 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 418 1 419 432 434 48.0 1e-121 MVERYRTILAAEAAELKKRLPEGKLADMFEQCYLNTIDTTVKELEDGSLFVITGDIEAMW LRDSAAQVHHYLPAAARYEEIYDLIRRVLQKQVCYINKDPYANAFNAAANGRHYAPDRSG QTDWTWERKYEVDSLCFPISLAFELWKTTGRTEHLDHALKSACETVTEQWAAEQNHEEES PYRFERETDKETETLSRSGLGSPAAYTGMTWSGFRPSDDACTYGYLIPSNMYAAVALKQM SEMAETVWQDRGLADRAAILGGEIREGIERYGITDHGKYGRVYVYEADGLGNTLLMDDAG IPGLLSMPYFGYCDASDPVYQNTRRLVLSSDNPCFYEGTRLTGIGSPHTKPGHVWPMSLI IQALTSDDDAEIERLVRMLVENDAGTGYIHESIHKDDERIYTRPWFAWVNSLFSEMLMKK IMR >gi|229784125|gb|GG667610.1| GENE 43 58774 - 59526 975 250 aa, chain - ## HITS:1 COG:SMc01556 KEGG:ns NR:ns ## COG: SMc01556 COG2968 # Protein_GI_number: 15966060 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 42 244 49 257 262 85 30.0 8e-17 MKKHLIQTAVILAAVSAFGLTACAQGSTSAPIPDTLKVENVKDHVITVQSSEEVKVVPDM AELVFSVTTQAEDAKACQEQNSRDLANVISFLKDSGIAETSIQTSNYGLDPVYDWNSGRT ITGYEMNTEITVSDVPIDQAGALISSSVEAGINSISQVTYLSSKYDESYQQALKNAIASA RVKAEAIAEAGGCSLGAVVHVQEYSDNQTARYSSYRNAATEDKASGAAAMAVEPGQLSVT AQVTVEFEIQ >gi|229784125|gb|GG667610.1| GENE 44 59604 - 60500 636 298 aa, chain - ## HITS:1 COG:PA3398 KEGG:ns NR:ns ## COG: PA3398 COG0583 # Protein_GI_number: 15598594 # Func_class: K Transcription # Function: Transcriptional 
regulator # Organism: Pseudomonas aeruginosa # 2 290 4 292 308 186 34.0 4e-47 MTLRHMKIFVSVYQNNGITRASEELHLAQPSVSLAIRELEDYYGIRLFDRISRRLYVTEQ GKMFYDYALHIVSLFDEMELGIRNWEHMGTLRIGSSITIGNFLLPGLIKKFTTAHPNMNV KASVHNSSYIEESILNNRIDFALIEGIPESPQIIRDPFMNDRLCLICGTSHPLASRSTVL LSELEQHNFILREPGSGGREILESLLTGFQIHLNAAWESVSTQAIVKAVAEGLGISILPY LLVADDIKNGIVLEKEVEGISLSRKFSIIYHKHKYLTPSAREFIQLCHTDEIIQEFDL >gi|229784125|gb|GG667610.1| GENE 45 60607 - 61176 739 189 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 10 175 8 171 186 96 31.0 3e-20 MKREKGILWKLFLSMISLGTFTFGGGYVIVTLMKKKFADEYHWIDETEMLDLIAIAQSAP GAIAVNGSIVVGYKLAGIPGMLVSILGTILPPFVILSLVSMFYIAFRDNIWVHSMLTGMQ AGVAAVIIAVVLEMGHGILVEKNKISMLIMAAAFVLTFFLNINVVYVVIGCIVFGAARTW RSMKKEAVK >gi|229784125|gb|GG667610.1| GENE 46 61173 - 61754 617 193 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 185 1 172 176 107 36.0 2e-23 MSYIQLFLSFLQIGAFSFGGGYAAMPLIQNQVVTLHHWLTAGEFNDLVTISQMTPGPIAV NSATFVGLRIAGLPGAVVATAGCILPSCILVSILAYVYTKYKKLTMLQGILNTLRPAVVA MIATAGMSIIITSFWGEAGFSLPALISGLKVDAVLIFAAALFVLRRFDINPIYVMVLAGV CQTVITAAGGMAG >gi|229784125|gb|GG667610.1| GENE 47 61751 - 62275 371 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619456|ref|ZP_06112391.1| ## NR: gi|266619456|ref|ZP_06112391.1| hypothetical protein CLOSTHATH_00482 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00482 [Clostridium hathewayi DSM 13479] # 1 174 1 174 174 336 100.0 3e-91 MKIKIKECVGNSRIYGEDGKLAGTIEGWPKNGPVREICGTDGSVHYRVRKERGCCVIENR TEPQAEQEISVPFQYEERTDISGNSKGGGAAVLFRAPLAVKAVVPLTAGAVIVLQNRKRE IVLEDCEGRIGRITRIASLTGHEVECEKELDAYEAAVLFAAAEYMYHDDDIEMV >gi|229784125|gb|GG667610.1| GENE 48 62914 - 64326 
1653 470 aa, chain + ## HITS:1 COG:Cgl2408 KEGG:ns NR:ns ## COG: Cgl2408 COG1653 # Protein_GI_number: 19553658 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Corynebacterium glutamicum # 2 466 3 438 443 447 50.0 1e-125 MKLRKLAALSLAAAMAAGTLAGCSSKPAETKPAETTAAPAGETTEAAKAEDTSAAASDAK GQIYYLNFKPEVADVWVELAEKYTAETGVPMKVQTAAAGTYEQTLKSEVAKKEPPTLFQI NGPIGYKSWAKYCRDLKDTDIYSHLVDKSMAVTDGDGVYGIPYVVEGYGIIYNDAIMQKY FALDGAKAASMDDIKNFDTLKAVAEDMQARKADLGIEGVFASTSFAPGEDWRWQTHLANL PIYYEYKDDNVSDKDAIDFKYSDNFKNIFDLYLNNSCTDPKMVGAKTVEDSMAEFALGKV AMVQNGNWGWSQVAGVEGNTVKEEDVKFLPIYTGVAGEEKQGLCIGTENFFSINSQAKEE DQQASLDFLTWLFSSDTGKDYVTNKLGFIAPFDTFTADEKPTDPLAKEVDRYMSNTDLYS VSWNFTSFPSQTFKDNFGASLLQYAQGGKEWQAVVDDMKADWANEKAMTK >gi|229784125|gb|GG667610.1| GENE 49 64431 - 65273 935 280 aa, chain + ## HITS:1 COG:Cgl2407 KEGG:ns NR:ns ## COG: Cgl2407 COG1175 # Protein_GI_number: 19553657 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Corynebacterium glutamicum # 1 279 1 280 281 320 63.0 3e-87 MQKSIKRYFPIFVLPTLIAFIIAFLYPFIMGIYLSFTEFTTVKDATWVGISNYKKIFLDQ NFINALVFTVKFTIVSVVTINVFGFLMAYALTRGIKGTNLFRTVFFMPNLIGGIVLGYIW QLLLNGILAKFGVTLSFDAKYGFWGLVILMNWQLIGYMMIIYIAGLQNVSPDLIEAAKID GATKTQTLRRIIIPMVMPSFTICLFLTLTNSFKLFDQNLALTAGGPGRQTSMLALDIYST FYGRVGWEGVGQAKAVVFFLMVAVISLTQLYLTRRKEVEN >gi|229784125|gb|GG667610.1| GENE 50 65273 - 66142 1001 289 aa, chain + ## HITS:1 COG:Cgl2406 KEGG:ns NR:ns ## COG: Cgl2406 COG0395 # Protein_GI_number: 19553656 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Corynebacterium glutamicum # 10 289 23 304 304 287 56.0 1e-77 MSVKAKKAENQQMEVSKSTGMMMTVFFAILSLAFLFPIFLVLVNSFKSKLYISDAPFQLP KAGTYVALTNYFEGVRKTGFFPAFGRSLFITVCSVAVIVLFTSMTAWYITRVKSKFNSIL YYIFVFAMIVPFQMVMFTMSKTANDLYLNSMYGLVLLYLGFGAGQAVFLFCGFVKAIPLE 
IEEAAMIDGCSPIQTYFNVVFPMLRPTAVTVAILNAMWIWNDYLLPSLILPQEKGTIPMV IQNLKGGYGSIDMGAMMAMLVLAIVPIIIFYLSCQKYIIKGVAAGAVKG >gi|229784125|gb|GG667610.1| GENE 51 66161 - 67435 1343 424 aa, chain + ## HITS:1 COG:SP2107 KEGG:ns NR:ns ## COG: SP2107 COG1640 # Protein_GI_number: 15901922 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Streptococcus pneumoniae TIGR4 # 2 414 4 420 505 434 51.0 1e-121 MRECGMLLPVASLPSKYGIGAFSKEAYEFIDTLKEAGQHYWQILPLGPTSYGDSPYQSFS AFAGNPYFIDLDKLVEEGLLTKEECDSADFGSNPRDIDYGKIYFNRFPLLRKAYERWLKD GHTAAEAREKLWSETVEYCFYMAVKNQNKGRSWTEWNEDIRLQKREAMERLQYELADEAG FYAFLQMKFEEQWSALKSYANEKGIRIIGDIPIYVALDSADTWYHPELFQFDENREPVAV AGCPPDGFSATGQLWGNPLYQWEYHGKTGYQWWMRRMEYSFRMYDVVRVDHFRGFDEYYS IPAGSENAIHGTWEKGPGIEIFQKMQEKFGKLDIIAEDLGFLTPSVLKLVKDTGFPGMKV LEFAFDSREESDYLPHNYTTNCVVYTGTHDNNTIRGWYEEMDEADRQLSIDYMNNAHTPE DEIH >gi|229784125|gb|GG667610.1| GENE 52 68411 - 68572 56 53 aa, chain + ## HITS:1 COG:alr3871 KEGG:ns NR:ns ## COG: alr3871 COG1640 # Protein_GI_number: 17231363 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Nostoc sp. 
PCC 7120 # 1 52 446 497 502 58 48.0 3e-09 MIPVQDYLGLGGEARINTPSTLGENWRWRMLSGEMTDEIAVKCRKMAKLYGRV >gi|229784125|gb|GG667610.1| GENE 53 68592 - 69044 442 150 aa, chain + ## HITS:1 COG:BH1250 KEGG:ns NR:ns ## COG: BH1250 COG1609 # Protein_GI_number: 15613813 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 6 122 4 120 338 88 35.0 4e-18 MDSINIKDIARICGVGVSTVSRAINNHPDINPETKNMIMQAIKEHNYIPNNSARNLKRTD SKAIAVLIKGISNPFFSKMIRVFEEEIQKKKYLFIMQRVDACEDEVDVALELIKEKKPRG IVFLGGLLLTQQREAGTADCSVCVKYHWND >gi|229784125|gb|GG667610.1| GENE 54 69037 - 69627 573 196 aa, chain + ## HITS:1 COG:TM1218 KEGG:ns NR:ns ## COG: TM1218 COG1609 # Protein_GI_number: 15643974 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 14 196 154 328 328 103 35.0 2e-22 MTDSLEESSSYSWVSVDDYAESYKMTDYLCKLGHKRILILAATQDDESIGKLRLEGYKKA LEDNGIELDEKLCIFSEVGRDCYTMDYGFMSMSRILKEKKVDFTAVYAISDSIAIGACKA ITECGGKVPEDYSVAGFDGLDIAHYYNPTLTTIRQPVEEMAKSTIRILFDVIGKKADHQK RVFKGELVEGESTRRI >gi|229784125|gb|GG667610.1| GENE 55 69768 - 69890 110 40 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619462|ref|ZP_06112397.1| ## NR: gi|266619462|ref|ZP_06112397.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 40 1 40 40 65 100.0 1e-09 MSTYEEFMVIINAAMLIVAILVYIDRKNGTKKIVTPPSQR >gi|229784125|gb|GG667610.1| GENE 56 70160 - 71629 1119 489 aa, chain - ## HITS:1 COG:TM1751 KEGG:ns NR:ns ## COG: TM1751 COG2730 # Protein_GI_number: 15644497 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Thermotoga maritima # 71 365 34 281 317 62 22.0 1e-09 MEFLKTKGKNIITESGTPVWLRGTCVGGWMNMENFINGFPGTEISLREHMRKRLGEENGS YFFQEMIHNFLAEEDLAFIASTGATVVRLALNYRHFESDDAPFVYKEEGFRLLRETVDLC EKHGLYVIFDMHAVQGWQNTHWHSDNDRGISLFWSDACYQRRYYALMQEIASRFKDCPAV AGYELLNEPSSSCRSGDYPFNMYENFTSDYQTFNRVIHTAVDRIREIDKRHIIFIEGDSY 
GHNFSGLEAPFDDNLVYSSHDYIVSSFGPGAYPGQYEMLHNDRVEAGCYWDYEQQIRHIK ETEGWQFSETWQVPLWVGEFGSQYCTGADDIPYRLASMDDQLRAFNELGIHWTTWTYKDC GTMGWVTLDPESEYMKTVSEVQRMKGLLGAENFTAWRSACPGKEVTGQFTRYIMSLLSES PNYTFGTFQKCMNYALLTSFCAGVLQPEYAKAFQDCSAGDIARIMRSFALLNCQVNEPYL ELLRKRLAG >gi|229784125|gb|GG667610.1| GENE 57 71872 - 73305 1462 477 aa, chain + ## HITS:1 COG:no KEGG:Closa_3338 NR:ns ## KEGG: Closa_3338 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: C.saccharolyticum # Pathway: not_defined # 1 467 1 474 480 166 29.0 2e-39 MRLKKYLGILLSGIMVLSMTACAGADKGQTAPPESGQTEGTKKEQTQAEAGSGDVTKITM WSNDQHDQAVYEERIKAFNETIGKEKGISVEYTVYGSDYYTTLDVAVTAGEGPDIFKCNK IGNYAEAGYIMPWEAEAGLKGLVDRFAEYNAPGYGEFGQKTYSIPIRVTTYGIACNMDVF NQYGLKTPETWDEMRACAKAITEGSKESTYGYALAMGYGSYNYFYVNLPNAASVGDEYFN HTTGRYDFESLGGFFDHLQGFIDDKSMFPGYETMDGDTARAQFSAGNIGMIGVMSSDVTT FKNQFPCDFEWTVIPYPVADGSNRYKEPVGAAMSYVINSRAAKDGYSDKVAEFINYLYSD EMFVATNEAQVDISILGDEITSQSKVDGLDPNWIKFSDMDVFCIKYPNPDGDLSIEGDSY QDVYNKILTGIITDVDGALKDLSERYNKALDEAVASGKVNIDDYIDPNIADKMKWNQ >gi|229784125|gb|GG667610.1| GENE 58 73411 - 74343 989 310 aa, chain + ## HITS:1 COG:BS_yesP KEGG:ns NR:ns ## COG: BS_yesP COG1175 # Protein_GI_number: 16077765 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 23 302 26 298 309 164 38.0 1e-40 MIKTKKKIKKLHSANKIQEILMITPMTIGFLLFSVYPIIWVLRWSFFKYNGYSEPVFVGL GNFIRVFSRDPAYWNSLKNTFLIAGMKMIFEIPLALVLAVLLNNKIKGSSFFRVVFFLPS VFSIAVVGLIFSILFGAYNGIVNAILKNIGLITQNISWFSDKGHAMFVIILVSLWTTFGL NMIYFLMGLQNISKSLYECASIDGANEVQQFFYITMPLVAPILQLVLMLSVLGTMKMTDL ILVLTNGAPGGSTEVVMTYIFKYFFSYGESAAMEVQFGYASSMAVVTAVILGIVTLIYLK VSKKMQEVEE >gi|229784125|gb|GG667610.1| GENE 59 74352 - 75200 914 282 aa, chain + ## HITS:1 COG:BMEII0592 KEGG:ns NR:ns ## COG: BMEII0592 COG0395 # Protein_GI_number: 17988937 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Brucella 
melitensis # 5 282 12 293 293 138 32.0 1e-32 MRTVSKSIWKAVIYAFLLVIAVFALFPVIYILLGSFKSNKEILVGGLSILPTEWHFENYI QAWKLADFGRLTFNSIFYAVVVVGGCVVTSITAGYVFARGTTRFSKVVNAMVLCSMFVSI GTLSLYPQLSLAKVFGLSGTLWGPIIIRVFGMNATQVFIATGFVRQISREIDEAAQIDGC SFFKIFWRIIFPLCKPLVATTGLVAFRTAWSDYLLPYVFTIAAKDKWPLVVGVVSLKSSG EAVSSWNLMLAGISISILPMLIVYLFLNKYFISGLTEGSVKG >gi|229784125|gb|GG667610.1| GENE 60 75237 - 76841 925 534 aa, chain + ## HITS:1 COG:PA4398 KEGG:ns NR:ns ## COG: PA4398 COG0642 # Protein_GI_number: 15599594 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 404 517 557 675 698 67 32.0 1e-10 MFYTIGLLVAFSVILLIFDHKSRYSYLFVLMATGATLAFFSIILHINMFASYGDYYGGSI YYRLDYMIYKAITTRLALPIVVNIRLMNVGVALFLLAEMIFNYEFQKNLGRSEPKMEKGS RRMRRLLFFAVPLLSIILYDPVTSTKMYILYHTSGNKPFVYGLYCSLNVIFKLVVLLLLL RPICVLIRYVMVTSVRFLRKRIFLFSMGLMLAAAIFYIFFYIGPFSMSVDKVIRSGFWIF ENVQARIQKVYLTAPSLVLVVISFCMFILLSFRMDMSATPFVKRKIQKNLNIMNEVLGET LHSQKNLFFSQQILITKIENKVDNTAEIPEIGRMRKLIDESLGRTTQMLDELKEIKYHYL NNSVSSIIDEALNEVTVPSYITVEWNGEGYEDICGMYDRYHLCKALVNILNNSVEAIEQS GKEEGKITVRLEFLFRWLIIMIQDNGKGIRFRDRGRIFSPHYSGKQGKMNWGLGLPYVYK VIRAHLGQIKIDSRYGVYTSVFLLLPMSREEKTSVSQRNGGGRKHGENQTGHCG >gi|229784125|gb|GG667610.1| GENE 61 76810 - 77484 885 224 aa, chain + ## HITS:1 COG:CAC1455 KEGG:ns NR:ns ## COG: CAC1455 COG2197 # Protein_GI_number: 15894734 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Clostridium acetobutylicum # 4 203 2 200 225 103 33.0 3e-22 MEKIKLVIADDMEEIRIHLAEEIAARSDDILVVGQAATGHEAFLMVEQLLPSVVLMDIQM ETRTSGITAIGKIHERYPGVKCIALTIHEDEEFIFQAYLAGAVDYIIKTSSIEKIVGAIH DVIDNKLLVRPEVAKKLIDGYQRVYEKQSRLKETFQVMLLINTTEYEILKMVYEGYTYKD IASKRFVEETTIRSQINHILKKFKKKRMKDVVALLRELNIFDES >gi|229784125|gb|GG667610.1| GENE 62 77513 - 78454 1002 313 aa, chain + ## HITS:1 COG:SA0557 KEGG:ns NR:ns ## COG: SA0557 COG0667 # Protein_GI_number: 15926278 # Func_class: 
C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Staphylococcus aureus N315 # 6 300 7 299 312 160 33.0 4e-39 MEYNIFGRTGMKVSDIGFGAWAIGGRGWGDGICDRDAVEALTVSWEKGVNLYDTCDAYGD GHSEELIGEFLKGRRKEAVIVTKGGTNFRLPERSKNFSKEYLMMCLEESLKRLQTDYVDV YLLHVPDAKLQDECGVFETLKEMKRSGRVRFAGLAMWGAGDTMHALERDTERVIDVLECP FNILNKSNLEVVKLAKERNIAVMTSQPLASGILTGKYREGTQFAEGDNRKGFWTAKRFEE VKPDLEIVEACTAETGLSMGHLALAYNMTYPGISCVIPGAKNAAQALDNAAATGLRLNDD IMERLSSTKGFVF >gi|229784125|gb|GG667610.1| GENE 63 78512 - 81037 1742 841 aa, chain + ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 47 806 26 763 785 186 26.0 1e-46 MKKMVLREGWLLKRFPTGMDYSPEELEAGFRVPGTGKDRSFPVLNFPEQVHDVLLDYGVI GNPNETGINRDLWIDEYDWVYRLEFTGDASAAAGGQVYLTLEGVDTFAAVYLNGTLVGKC GDVFLDYRFPVNEVNPGANVLLIWIQSAAKKVSGIRLPERYEGRVPAISAVRAFRSGFHE YCGPVPGLIRCGIYGDVTLMQEDPLLFDEVTADSSLSEDGDCGVVTITASFLGALEEYQG EPLQARVFDSTGQVRAEEVRQIMSRSETMVLRVPHPKLWWPRTMGEPSVYRLELTCGSRT ADFLEKVIGFRRVSMAGDMDFRINGRPLKLWGANLVHPDTLSNRYRPDVMGRLLDLAELG NFNILRVWGESERYPEAFYEECDRRGILIWQDFYLCYTAYPEEAEFLELCRKEAEQMTKR LKSHPSILLWCGGNETLLSRDYENPGGPCFGETIVRQVFPGVCMEQDPERYYHPSSPCGG AYANDPLAGDTHGYTHLWFVPGREFPVFLSENCRVSTPSMKSMRKMMRPEELWPPEYTGR VSRRQRKEWPETWELHNTNQGVIKLGPVEHYYDAGSADELVYRIGAAHAEYIHEQVCRFR RGFRGEEAASAGTARRTKGHMLWKFNNNSNIISYGVVDYYQEPYYPYYELKRCYAPFLVS CELGDHGYLWVTNDTGHAVNGHLEIFLFDILKNERQGTFFQKFHAEPDESKPVCTIDCYG QFRKSNLVCAVAYSDEGEVMGMCAEPCEIERRMEYPEHTGLQLKQEGDILVVTASRFARC VELGGDEDGDEFGWLFEDNYFDLLPGMEKRIRVLGRHERGTVTAKAVYDEDKAVLKYEKR D >gi|229784125|gb|GG667610.1| GENE 64 81021 - 82391 1392 456 aa, chain + ## HITS:1 COG:no KEGG:Phep_0457 NR:ns ## KEGG: Phep_0457 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 6 455 11 458 466 219 33.0 2e-55 
MKREIEEYRKTPVQYECDVVVCGGGTAGFTAALASARNGADTILVERFGHVGGTLVNGAG PLHSFFNLYKAFAGAEKTQVVRGIPQEIVTRMMEASGSPGHLEQDKGGNYDSAITLIDWE IFKDVAFTMLEEAGVRILLHTVIADVIKDGERVSGVIIEGKSGREAILAHTVIDSTGDAD AAALAGAGVVKKHDTTSAGMPFGMTNVDMPRLVEFCREKGIVNQLIEGDKGSSTDHVIRL GFELKKIPEFTEFMEKTGMWGPLGFSLHEREFTYINSAALPDVDAASTEEYSKAEITLRH QVMKLAGMLKQYIPGFEDAYVSWTPASIGVRLTRIVECEYDLSLEEIVSGKRFEDEVYLY GFHDCAPRITIREGGYYGVPYRALIPKRVEGLLVAGRCITGTWEAHMSTRNTVSCMAQGQ AAGTAAALCVKENILPRYLNTEELRKVLREQGVYLG >gi|229784125|gb|GG667610.1| GENE 65 82609 - 83481 924 290 aa, chain + ## HITS:1 COG:Cj1202 KEGG:ns NR:ns ## COG: Cj1202 COG0685 # Protein_GI_number: 15792526 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Campylobacter jejuni # 13 288 4 276 282 308 54.0 5e-84 MKTTDLFAKKPVLSFEVFPPKRTNPVETIYETLDSLRKLNPDFISVTYGAGGSENCKATT EIASKIKNEYGIESVAHLPCIGLTKDDVISLLGGFKRAGIDNILALRGDIPEGGRPAGDF EHASDLISFIKEKEEFKDFNVVAACYPEGHTESENILTDIRSLKTKVDAGADHLITQLFF DNDCFYRFKERALLAGIHVPVEAGIMPVTNKKQIERMVSLCGVSLPSKFTAMMERYEHNP EAMRDAGIAYAVDQIVDLIARGVDGIHLYTMNNPYVASKIHEAVYRLLAA >gi|229784125|gb|GG667610.1| GENE 66 83500 - 84423 1094 307 aa, chain - ## HITS:1 COG:SP0676 KEGG:ns NR:ns ## COG: SP0676 COG0583 # Protein_GI_number: 15900577 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 1 294 21 314 322 354 61.0 1e-97 MTIQQLKYIIKIVECGSITEATRQLFISQPSLSAALKELESEFGIELFYRTAKGISLTAD GTEFLSYARQIIEQTQLLEQRYTNGKPAKRLCSVSTQHYAFAVNAFVNLISRIDTDEYEF TLRETRTYEIIEDVANFRSEVGILYFSQFNEKVLKKLLKENHLLFIPLFEAAPHVFISSM HPLTGRKEVTLEDLEDYPFLAFEQGTFNSFYFSEEILSTAPHKKTIHVSDRATLFNLLIG LNGYTICSGILNSNLNGDNIVSVPLKTEEWMRIGYICSDKFPLSPISLSYIRELKNVIAA EGFVILT >gi|229784125|gb|GG667610.1| GENE 67 84653 - 84898 216 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288869958|ref|ZP_06112409.2| ## NR: gi|288869958|ref|ZP_06112409.2| LexA repressor [Clostridium hathewayi DSM 13479] LexA repressor [Clostridium hathewayi DSM 13479] # 1 81 21 101 101 165 
98.0 1e-39 MRGLRWDYDLQERIYQFIVYFVSENLYSPTIREICDGVGKNSTATVSLYLEELEDIGRII LGHGPRTIKLVGFKVVEECQH >gi|229784125|gb|GG667610.1| GENE 68 85134 - 85820 652 228 aa, chain - ## HITS:1 COG:no KEGG:Closa_3855 NR:ns ## KEGG: Closa_3855 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 214 29 236 236 230 55.0 4e-59 MTVLLFYVLPFIVVNSIIFILVTAAPKGDLTIGEADNFTTTTMELKIKSLFPIKAMTVTL DGNEVELTKTASKTYTAVLGSNGTVKVSLTAFNGMKNVFSEQVNILDDTPPDIKDSIIED GVLSFRLEDTQSGVNYDTIYAYDDDTPEILPLSIDRSTGIITFDMQKENLTICVKDQVGN EARVTITPKGENLNPEEAAALASQEAAQDSDAASGESKEDQTGLESAE >gi|229784125|gb|GG667610.1| GENE 69 86020 - 87138 658 372 aa, chain - ## HITS:1 COG:SPy2122 KEGG:ns NR:ns ## COG: SPy2122 COG0582 # Protein_GI_number: 15675872 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pyogenes M1 GAS # 4 341 7 362 381 96 27.0 9e-20 MPKKRKDGRYSKQVTIGIKDGKPVRKTIYGSTIKELDKNYRDFMSLMDKGIILQEQNTTF KELSELWLTNEKLGSVRDQTISTIKGQLSTVNSYIGDIRIKDLRMSHIESFRSYMIKSKK LAQYNLCLSRIKAIVRYAVQKDIMAKDITAGMKRVKIDKKPKRALTSEERLLFEITDLDS FERCFINLLLYTGLRKCEALALDVKDIDLKKNQIYVSKTLVASKKINTCLQEYTKTAAGL RQIPIPAPLAKILFEFIKGRSGILFPSKSGRYISTLDYKWEKILKKVQAVSSTPLSDDIT PHIFRHTYASDLYKAGVDIKQAQYLLGHDDIKTTLDTYTHFGFFDVEPDKLEDYYNAVKM QSNDNIIPMKHA >gi|229784125|gb|GG667610.1| GENE 70 87296 - 87838 181 180 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163783284|ref|ZP_02178277.1| 50S ribosomal protein L16 [Hydrogenivirga sp. 
128-5-R1-1] # 6 171 5 159 185 74 29 2e-12 MPLPDKKTYSSEDYWNLPEGDRAELIDGQLYAMAPPSRNHQKLIAAFTKILGNYIDSHRG DCEVYPAPFAVNLDADDKNWVEPDISVICDKNKLSDRGCNGAPDFIIEIVSPSSRKMDYT KKNALYSEAGVREYWIVDLAKERTTIYHYEEDAAPIIIPFNEKAEVGIYEDLSIIISELI >gi|229784125|gb|GG667610.1| GENE 71 87901 - 88554 80 217 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288869961|ref|ZP_06409574.1| ## NR: gi|288869961|ref|ZP_06409574.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 93 217 1 125 125 236 100.0 8e-61 MEQTLYFKNGILYKIDPNDGRNYYQARFFISDGEKYDFENKGDIERLPIPNFSRQNGPFP DVTKCLDYIVRMKAGHFYIRHDFELCSTCLRKMIELMKHSTILWGEYDYYRIVQWNIEMG FFEEAERAELDLNNFLYHTPNKVFLRSNVVNTPELIQEQQERQKKNRDRKEYYHIFYQLP EHAPKSFGAYRRMKNRNSKNFQELMQVAEEAGIDIEL >gi|229784125|gb|GG667610.1| GENE 72 88595 - 89008 105 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619480|ref|ZP_06112415.1| ## NR: gi|266619480|ref|ZP_06112415.1| hypothetical protein CLOSTHATH_00508 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00508 [Clostridium hathewayi DSM 13479] # 1 137 1 137 137 245 100.0 8e-64 MTPDVNVVLLDFPAPGNEMVFENEDDSFTIMINARLSYDEQLKAYRHAMRHIENDDFQKE NVQTIEAAAHAAITIPVETQPMSAKRFLQRLKKIQAEQKRIQKELKQLEEHLEVVRSMKG FDDFELADSQQWYGSSL >gi|229784125|gb|GG667610.1| GENE 73 88992 - 89399 267 135 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1258 NR:ns ## KEGG: EUBREC_1258 # Name: not_defined # Def: SOS-response transcriptional repressor, LexA # Organism: E.rectale # Pathway: not_defined # 1 72 1 72 213 73 47.0 3e-12 MPEQEFNAVFSKRLRYYLSKYEITQAELAKHLGVGTTSVYNWCNGIKTPRMDKVDAMCDL FNCKRSDLIEEKDDSNDRYYLNDETAQAAQEIFENKELRALFDVQRDMDADDLRALHNMA LALKRKERGNDDTGC >gi|229784125|gb|GG667610.1| GENE 74 89546 - 89758 185 70 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1259 NR:ns ## KEGG: EUBREC_1259 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 5 66 4 65 66 62 53.0 5e-09 MAEVIQISLAAARVNAKLTQEEVANMMKIGKRTVINWEKGVAMPSFADLNMLSNIYGIPV 
DNIFLSAKST >gi|229784125|gb|GG667610.1| GENE 75 89794 - 89979 275 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288869965|ref|ZP_06112418.2| ## NR: gi|288869965|ref|ZP_06112418.2| hypothetical protein CLOSTHATH_00511 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00511 [Clostridium hathewayi DSM 13479] # 1 61 8 68 68 120 100.0 5e-26 MGDGQILTIKDCMVRHNKSYDTIAALFKRKGSPAFRVGREWQVDVIKWDAYLLKLAEEAK G >gi|229784125|gb|GG667610.1| GENE 76 90015 - 90170 249 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623816|ref|ZP_06116751.1| ## NR: gi|266623816|ref|ZP_06116751.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 50 1 50 50 85 96.0 1e-15 MQKYYTDLDDFEDDSRSRLLEVTERWLMPAVIFVAGVIIILAVCARLEALR >gi|229784125|gb|GG667610.1| GENE 77 90182 - 90934 440 250 aa, chain + ## HITS:1 COG:SPy0946_2 KEGG:ns NR:ns ## COG: SPy0946_2 COG3645 # Protein_GI_number: 15674964 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Streptococcus pyogenes M1 GAS # 125 247 1 123 127 166 62.0 3e-41 MDELMKISYESGQPTVSARDLYDLLSEDGGTKGTERFSKWFERYRGYGFEQGSDFSTPNK KVRVQTEGTRDVRREVEDYDLSVDMAKQICMLQRTNKGMELRQYLLDLEKAWNTPEQVFA RALKMADQTINRLQSDITRMRPKEIFADAVTASHTSILVGDMAKLLKQNGVDMGAQRLFT WLRDNGYLIRRKGADWNMPTQRSMEMGLFEIKESTHLDGNGCNVTTRTPKVTGKGQQYFI NKFLGGEQSA >gi|229784125|gb|GG667610.1| GENE 78 90927 - 91124 140 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626184|ref|ZP_06119119.1| ## NR: gi|266626184|ref|ZP_06119119.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 63 1 63 65 109 93.0 6e-23 MHKIILNIIYRLETLRNQAPACEKSAYTKAIAEVMDIYDIACFSKAEKKDKKKSPAGANC HGSGK >gi|229784125|gb|GG667610.1| GENE 79 91175 - 92605 1100 476 aa, chain + ## HITS:1 COG:no KEGG:Clole_0795 NR:ns ## KEGG: Clole_0795 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined 
# 1 474 1 540 540 204 30.0 8e-51 MKLTKIKIKNLFGIKEYEADGQSVELSGRNGAGKTSVIDAIRLALTNRSDREYIVRDGET EGEILIETDNGLRIDRKIRTNQADYKSVKKDGHEVGSPETFLKDIFTPLQLSPVEFMAMD RKKQNAIILDMIDYPWDMNKIREWFGEIPGWVSYDQNILQVLHDIQSENGEYFQTRQDIN RDIRNKRAFIEDIADVIPAGYDAEKWDNENVGELYREIEKIRKENETIEKAKRMLESRSN KMRAFEADREIELSAIEKEFTRRETNLKEQIASLEEQIRSCKKELSGLNEKKQDKISLAE QTYKTNVAKYDAELSQYEEYASKEVKSTAELTEKAEYAEEMKGHLNEYRRMENLQSEVEK LAAESKALTDKIEKARELPGEILQEATIPIDGLTVKDGIPLIHGLPISNLSDGEKLDLCI DVAIQKPNGLQIILIDGVEKLSSDMRNELYRKCKEKGLQFIATRTTDEPELTVVEL >gi|229784125|gb|GG667610.1| GENE 80 92615 - 93256 461 213 aa, chain + ## HITS:1 COG:no KEGG:PSPA7_2380 NR:ns ## KEGG: PSPA7_2380 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa_PA7 # Pathway: not_defined # 7 191 10 195 264 143 35.0 4e-33 MEEIMNVENTALANPFNNESNFKMLMKMAEAFASTEIIPQNYQNKPADCMIAIDMANRMH VSPMFVMQNLYVVKGKPSWSGQACMSLIKSNTEFKDVKPVYTGEAGTNTWGCHIEAIRRS TGELVRGPEITIGVAKAEKWFSKIDRYGNETSKWQTMPELMLAYRASAFFARVYIPDALL GCSVEGEAEDIIREKDIPEIPDIFGEKEKEAQA >gi|229784125|gb|GG667610.1| GENE 81 93253 - 94050 716 265 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2131 NR:ns ## KEGG: EUBREC_2131 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 261 1 261 265 335 59.0 2e-90 MILTAENYFSKEADREYLSVSQYKNFMGTIGRPACEAEAMAKLNGEWEMKKTTALMVGSY VDAHFEGTLSLFQAQNPEIFTKQGALKAEYRKAEEIINRIERDDLFMKFMGGEKQVIMTA DMFGSPWKIKIDSYLPGKAIVDLKVMRELHKAEYTKDYGYMNFIEYWGYDLQAAVYQEVV YQNTGERLPFFVAAASKEEETDIELIWIPDDHLREKLIEVENNTPKIVALKNGEVEPIRC GLCDYCKHTKVLMRPIHFTELLGEV >gi|229784125|gb|GG667610.1| GENE 82 94054 - 94359 254 101 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2068 NR:ns ## KEGG: EUBREC_2068 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 4 101 5 123 123 120 48.0 2e-26 MISLLTDDMDHCFFCGRPADCEHHLIFGSANRELADEDCLKVPICNNCHTAGKVNSRIHD NPMAEKLSKMLGQMAYEKELALKMVPMGRELFRARYGKSYL >gi|229784125|gb|GG667610.1| GENE 83 94682 - 94837 139 51 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|266619491|ref|ZP_06112426.1| ## NR: gi|266619491|ref|ZP_06112426.1| putative prophage Lp2 protein 24 [Clostridium hathewayi DSM 13479] putative prophage Lp2 protein 24 [Clostridium hathewayi DSM 13479] # 1 51 115 165 165 107 98.0 2e-22 MSSFGRKVIQDALVKCGVLKDDGWDYVIGFTDQFFCDRNEPRIEVLIEERE >gi|229784125|gb|GG667610.1| GENE 84 94842 - 95681 436 279 aa, chain + ## HITS:1 COG:no KEGG:CD3151 NR:ns ## KEGG: CD3151 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 7 246 8 229 249 102 30.0 2e-20 MGGDGNYIKISRNILEWEWYRNINTKVLFLHMLLKANWKEGRFEGTTVPRGSFISSYPRL CEECDLTINELRTALKHLTSTGEITVKTQSKYSVFTVNNYSLYQDINSQTTVNQQSDTSQ CTDKAHSINSLLTTIEEGKKEKREELEEEKEGKKKDNRNYQEIITLYNSLCKSYPHVTKL SDKRRRAIGARLNSGYTADDFRKLFELAEQSEFLKGKNNKNWSATFDWLISDGNMAKVLD GNYSNRPEPSYSPVGKQQDKQNESREMMYNWAMSRGGEE >gi|229784125|gb|GG667610.1| GENE 85 95678 - 96211 203 177 aa, chain + ## HITS:1 COG:no KEGG:Mahau_0086 NR:ns ## KEGG: Mahau_0086 # Name: not_defined # Def: hypothetical protein # Organism: M.australiensis # Pathway: not_defined # 1 170 1 170 173 104 34.0 2e-21 MNTKEFAVFADRIKTAYPKDNLLATGDQMDWWYELLGDIPFQVAIMALKKYALSNKFPPA ISDLRLYAADLMETRIPDADEAWGEVNMAVRRYGYMREAEALKSLSGPVRRAVERTGWQN ICQSPYDQVNTLKAQFRGAYEAEQRRAVEFHKMPEHLKIEQAGIQPEAALPMMEDQK >gi|229784125|gb|GG667610.1| GENE 86 96208 - 96444 162 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619494|ref|ZP_06112429.1| ## NR: gi|266619494|ref|ZP_06112429.1| phosphoserine phosphatase [Clostridium hathewayi DSM 13479] phosphoserine phosphatase [Clostridium hathewayi DSM 13479] # 1 78 1 78 78 132 100.0 1e-29 MNEEAARKLARYRVGEDQHMGKSMTADEIELRRGYVRNMLRNVPGWKDLTDEQLDRVRIY RTGEDWYVEDADFYEYRF >gi|229784125|gb|GG667610.1| GENE 87 96458 - 96709 253 83 aa, chain + ## HITS:1 COG:BH2356 KEGG:ns NR:ns ## COG: BH2356 COG1974 # Protein_GI_number: 15614919 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Bacillus 
halodurans # 1 65 4 68 207 63 46.0 8e-11 MKERHEQIRDYIVQYTISHGWPPSVREIGEGVGLESTSSVQLHLKQMADAGIIKMVPGQP RCIAVPGVKITWEGDAECGKVSV >gi|229784125|gb|GG667610.1| GENE 88 96688 - 96852 80 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKSKCLKMHYPESICMAERIVLFHGTRFRIALSVHEFYCKKCKKIRNLWFINR >gi|229784125|gb|GG667610.1| GENE 89 96874 - 97074 289 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619496|ref|ZP_06112431.1| ## NR: gi|266619496|ref|ZP_06112431.1| putative ATP:guanido phosphotransferase [Clostridium hathewayi DSM 13479] putative ATP:guanido phosphotransferase [Clostridium hathewayi DSM 13479] # 1 66 1 66 66 104 100.0 2e-21 MKDYAQLYDDELDYERDIETGLEQLCELRLKMYREKDTDILKEITPVLNAIIHDAERYRD WIQAQN >gi|229784125|gb|GG667610.1| GENE 90 97340 - 97540 104 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619498|ref|ZP_06112433.1| ## NR: gi|266619498|ref|ZP_06112433.1| nucleoside-diphosphate-sugar epimerase [Clostridium hathewayi DSM 13479] nucleoside-diphosphate-sugar epimerase [Clostridium hathewayi DSM 13479] # 9 66 12 69 69 109 100.0 7e-23 MRKLLLKLAHRILKKYGVIPLDFKDKVFFMGTIYEIQSYVISKEFFKTDVTIEMCDCLKL PDFGES >gi|229784125|gb|GG667610.1| GENE 91 97754 - 97957 150 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619499|ref|ZP_06112434.1| ## NR: gi|266619499|ref|ZP_06112434.1| molyBdopterin-guanine dinucleotide biosynthesis protein b related protein [Clostridium hathewayi DSM 13479] molyBdopterin-guanine dinucleotide biosynthesis protein b related protein [Clostridium hathewayi DSM 13479] # 1 67 1 67 67 120 100.0 3e-26 MGLIDVFEKEDRTEIKLSQLCEMLSAGAKTELLMNAVNCDVPHQYIREMVTGEKEVSDCV LFSTEKR >gi|229784125|gb|GG667610.1| GENE 92 98030 - 98446 165 138 aa, chain + ## HITS:1 COG:no KEGG:CLJ_0128 NR:ns ## KEGG: CLJ_0128 # Name: not_defined # Def: DNA N-4 cytosine methyltransferase M.NgoMXV # Organism: C.botulinum_Ba4 # Pathway: not_defined # 1 138 14 151 152 196 64.0 2e-49 MFYFDRQNPEVIYADNRELETTLCDGRTLLIKPDVKMDFRDMPYPDNSFKVVVFDPPHLI 
HAGTGSWLANKYGILPADWPEYLKQGFSECMRVMEPDGLLIFKWNEDQIKLSEVLRVFDK KPLLGDQRGKTRWLVFIK >gi|229784125|gb|GG667610.1| GENE 93 98487 - 98807 232 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619501|ref|ZP_06112436.1| ## NR: gi|266619501|ref|ZP_06112436.1| hypothetical protein CLOSTHATH_00529 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00529 [Clostridium hathewayi DSM 13479] # 1 106 1 106 106 211 100.0 2e-53 MEGQISLFDFMAKEFQPGDWIEECCLGRELTFNEITDMVGKLIVMDMSTESHNWYKVVQV EKIVEGDSGRRRLVYYDGKRQRGLVDEIYFDPQRSRPEKTYTLKTD >gi|229784125|gb|GG667610.1| GENE 94 98865 - 99902 243 345 aa, chain + ## HITS:1 COG:no KEGG:CLL_A2764 NR:ns ## KEGG: CLL_A2764 # Name: not_defined # Def: putative aminotransferase, class V # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 1 326 1 318 340 418 61.0 1e-115 MFIKEDDLKLNDWQFGQRKYLPWEIKLQLTKTRIREWYDNWGGMVYLSYSGGLDSRVLLH LIRETVGNDVPAVFSNTGLEFPEIVKFARQASGEFREIYPVDKDGKRITFRKVITTYGYP LISKETARKVYKLRHGNLSDRHRNYLLNGDERGSFGKLADKWQFLINAPFETTDKCCEMM KKKPFDKFVRETGRYPYIGITQDEGFKRAHQYAHTGCNVYDGKTIKSQPMGFWTKQDVLR YVVENDLEICSVYGDIKQTPCGEYYLTGEQRTGCMFCAFGAHLEPEPNRFQRMATSYPHQ YDFCMKPVCKGGLGMAEVLDYVGIPWTTWEAQGQMSITDFPEVLP >gi|229784125|gb|GG667610.1| GENE 95 99941 - 100162 74 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288869972|ref|ZP_06112438.2| ## NR: gi|288869972|ref|ZP_06112438.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 8 73 1 66 66 117 100.0 3e-25 MCNCIYDMKKRLEEKGYEHVQPPVEILSGRVYISFTAREPGKKREREVPILLSKCPICGM KYEKELDPRDIID >gi|229784125|gb|GG667610.1| GENE 96 100185 - 100760 387 191 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619504|ref|ZP_06112439.1| ## NR: gi|266619504|ref|ZP_06112439.1| putative DNA polymerase III, beta chain [Clostridium hathewayi DSM 13479] putative DNA polymerase III, beta chain [Clostridium hathewayi DSM 13479] # 1 191 1 191 191 353 100.0 4e-96 MKAKICAGEFKRIIDNTKRFVGDGTLSELMRWTYLEIDAKEKVIRATALEGHRISIEYAE 
LVEADESFTCYIRPTIPKITKRDNYAELEVSNSRLYVQVGESIMGYVQPEGQYYPVDKIL KEYQEKEKMITIGINAKYLKDALDSISTYDSDKKMAKIDIYDSVSPVIIRTGRKGERENL KIVLPARLRDD >gi|229784125|gb|GG667610.1| GENE 97 101087 - 101305 167 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288869973|ref|ZP_06112440.2| ## NR: gi|288869973|ref|ZP_06112440.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 26 72 1 47 47 86 100.0 6e-16 MIDNTSCKIRGCNKCDEYKKHCDDLMQENEALRMRLAEIRERINGMELPHEYHILYTRGW YDAVEEVRRYIN >gi|229784125|gb|GG667610.1| GENE 98 101302 - 101607 221 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619506|ref|ZP_06112441.1| ## NR: gi|266619506|ref|ZP_06112441.1| (ABCB), permease/ATP-binding protein, efflux ABC transporter, heavy metal transporter family [Clostridium hathewayi DSM 13479] (ABCB), permease/ATP-binding protein, efflux ABC transporter, heavy metal transporter family [Clostridium hathewayi DSM 13479] # 1 101 1 101 101 178 100.0 1e-43 MKHQEIPKGYITQKEIQAAQRYYRIGRNVIVHTYKAQGIDSMGHTGEAHRGKIVEHYKHF ALVRLPSGVLDSALWPDLVLQMRKRKRYRQGGEAGGAKQSG Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:24:29 2011 Seq name: gi|229784124|gb|GG667611.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld4, whole genome shotgun sequence Length of sequence - 99933 bp Number of predicted genes - 96, with homology - 93 Number of transcription units - 35, operones - 17 average op.length - 4.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 114 - 2345 2285 ## COG0226 ABC-type phosphate transport system, periplasmic component - Prom 2380 - 2439 5.0 + Prom 2317 - 2376 5.4 2 2 Tu 1 . + CDS 2414 - 2611 60 ## - Term 2471 - 2520 1.3 3 3 Op 1 . - CDS 2566 - 3705 1304 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 4 3 Op 2 . 
- CDS 3726 - 3878 82 ## gi|266619509|ref|ZP_06112444.1| conserved hypothetical protein 5 3 Op 3 40/0.000 - CDS 3893 - 6379 2436 ## COG0642 Signal transduction histidine kinase 6 3 Op 4 . - CDS 6351 - 7064 919 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 7215 - 7274 6.0 7 4 Op 1 . - CDS 7287 - 7760 192 ## Apre_1383 TRAP dicarboxylate transporter, DctP subunit 8 4 Op 2 . - CDS 7782 - 8555 775 ## COG1082 Sugar phosphate isomerases/epimerases 9 4 Op 3 . - CDS 8599 - 9102 382 ## gi|266619514|ref|ZP_06112449.1| conserved hypothetical protein - Prom 9147 - 9206 5.9 + Prom 9222 - 9281 6.9 10 5 Tu 1 . + CDS 9393 - 10325 503 ## EUBREC_0455 hypothetical protein + Term 10372 - 10417 12.4 - Term 10355 - 10409 10.0 11 6 Op 1 . - CDS 10449 - 11396 1001 ## COG3481 Predicted HD-superfamily hydrolase 12 6 Op 2 . - CDS 11442 - 12758 1299 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 12783 - 12842 6.3 + Prom 12845 - 12904 4.7 13 7 Tu 1 . + CDS 12957 - 13550 536 ## COG1592 Rubrerythrin + Prom 13611 - 13670 5.4 14 8 Op 1 . + CDS 13734 - 16133 2270 ## COG1193 Mismatch repair ATPase (MutS family) 15 8 Op 2 . + CDS 16146 - 16856 760 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 16913 - 16954 10.0 16 9 Tu 1 . + CDS 17232 - 17651 292 ## gi|266619521|ref|ZP_06112456.1| entericidin EcnAB + Term 17688 - 17729 1.2 + Prom 17783 - 17842 8.2 17 10 Tu 1 . + CDS 17911 - 19317 1438 ## COG0642 Signal transduction histidine kinase + Term 19362 - 19411 12.1 - Term 19350 - 19399 14.1 18 11 Op 1 . - CDS 19411 - 19905 606 ## COG0219 Predicted rRNA methylase (SpoU class) 19 11 Op 2 3/0.000 - CDS 19914 - 21083 1252 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 20 11 Op 3 . - CDS 21080 - 21562 626 ## COG1522 Transcriptional regulators 21 11 Op 4 . 
- CDS 21604 - 22134 537 ## COG0309 Hydrogenase maturation factor 22 11 Op 5 . - CDS 22180 - 23730 1417 ## COG2720 Uncharacterized vancomycin resistance protein 23 11 Op 6 . - CDS 23775 - 23885 180 ## 24 11 Op 7 . - CDS 23900 - 24607 565 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases 25 11 Op 8 . - CDS 24617 - 25270 180 ## PROTEIN SUPPORTED gi|46129221|ref|ZP_00155777.2| COG1194: A/G-specific DNA glycosylase 26 11 Op 9 . - CDS 25289 - 27028 1774 ## COG1032 Fe-S oxidoreductase 27 11 Op 10 . - CDS 27036 - 27530 625 ## COG2131 Deoxycytidylate deaminase 28 11 Op 11 . - CDS 27548 - 28141 683 ## COG0009 Putative translation factor (SUA5) - Prom 28162 - 28221 17.1 29 12 Tu 1 . - CDS 29123 - 29572 487 ## COG0009 Putative translation factor (SUA5) - Prom 29711 - 29770 6.3 + Prom 29505 - 29564 7.3 30 13 Tu 1 . + CDS 29740 - 30747 969 ## COG2502 Asparagine synthetase A + Term 30815 - 30853 10.2 - Term 30801 - 30843 11.0 31 14 Op 1 . - CDS 30860 - 31540 731 ## COG0860 N-acetylmuramoyl-L-alanine amidase 32 14 Op 2 . - CDS 31620 - 33065 1486 ## COG1409 Predicted phosphohydrolases 33 14 Op 3 . - CDS 33098 - 34759 2005 ## COG3858 Predicted glycosyl hydrolase - Prom 34798 - 34857 5.5 34 15 Op 1 . - CDS 34988 - 35206 242 ## Closa_0332 hypothetical protein 35 15 Op 2 . - CDS 35282 - 35914 498 ## Closa_0331 hypothetical protein 36 15 Op 3 . - CDS 35911 - 37389 981 ## COG0515 Serine/threonine protein kinase - Prom 37420 - 37479 7.7 + Prom 37470 - 37529 9.8 37 16 Tu 1 . + CDS 37594 - 38313 784 ## Closa_0328 Negative regulator of genetic competence + Term 38351 - 38399 15.4 38 17 Op 1 . - CDS 38292 - 38429 79 ## gi|266619543|ref|ZP_06112478.1| conserved hypothetical protein 39 17 Op 2 . 
- CDS 38434 - 39324 971 ## COG0657 Esterase/lipase 40 17 Op 3 7/0.000 - CDS 39356 - 40654 1482 ## COG0534 Na+-driven multidrug efflux pump - Term 40667 - 40704 -0.8 41 17 Op 4 18/0.000 - CDS 40761 - 41210 379 ## COG1846 Transcriptional regulators - Prom 41277 - 41336 6.0 42 17 Op 5 . - CDS 41420 - 42646 889 ## COG0477 Permeases of the major facilitator superfamily - Prom 42717 - 42776 4.3 - Term 42802 - 42833 1.8 43 18 Op 1 . - CDS 42871 - 42966 76 ## 44 18 Op 2 . - CDS 43038 - 44429 1079 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 45 18 Op 3 . - CDS 44426 - 44827 401 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA 46 18 Op 4 . - CDS 44824 - 45648 918 ## COG0434 Predicted TIM-barrel enzyme 47 18 Op 5 . - CDS 45654 - 46586 866 ## COG1184 Translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family 48 18 Op 6 . - CDS 46627 - 47349 726 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 49 18 Op 7 13/0.000 - CDS 47364 - 48197 1056 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 50 18 Op 8 13/0.000 - CDS 48197 - 48940 935 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 51 18 Op 9 . - CDS 48957 - 49442 670 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 52 18 Op 10 . - CDS 49417 - 50181 685 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 53 18 Op 11 1/0.250 - CDS 50168 - 50566 509 ## COG2188 Transcriptional regulators 54 19 Tu 1 . - CDS 51544 - 51855 312 ## COG2188 Transcriptional regulators - Prom 51940 - 51999 11.5 - Term 52101 - 52143 9.2 55 20 Tu 1 . - CDS 52163 - 52363 311 ## Closa_0318 hypothetical protein - Prom 52399 - 52458 5.5 + Prom 52353 - 52412 7.4 56 21 Tu 1 . + CDS 52599 - 52955 443 ## COG1882 Pyruvate-formate lyase + Prom 53805 - 53864 80.4 57 22 Op 1 . 
+ CDS 53936 - 55531 1622 ## COG1882 Pyruvate-formate lyase 58 22 Op 2 . + CDS 55553 - 55822 328 ## Closa_0316 hypothetical protein 59 22 Op 3 11/0.000 + CDS 55850 - 56101 387 ## COG1882 Pyruvate-formate lyase 60 22 Op 4 . + CDS 56154 - 56924 720 ## COG1180 Pyruvate-formate lyase-activating enzyme + Prom 56930 - 56989 6.8 61 22 Op 5 . + CDS 57020 - 57190 227 ## CPR_2442 ferredoxin (FdxA) + Term 57204 - 57256 6.4 - Term 57192 - 57244 6.4 62 23 Tu 1 . - CDS 57283 - 58134 963 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase - Prom 58182 - 58241 5.6 + Prom 58135 - 58194 8.2 63 24 Tu 1 . + CDS 58277 - 58660 398 ## COG0789 Predicted transcriptional regulators - Term 58656 - 58702 8.2 64 25 Tu 1 . - CDS 58740 - 60131 1445 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase 65 26 Op 1 . - CDS 60235 - 61170 853 ## COG1940 Transcriptional regulator/sugar kinase 66 26 Op 2 . - CDS 61172 - 63028 1844 ## Cphy_0607 glycoside hydrolase family protein 67 26 Op 3 . - CDS 63025 - 64524 1329 ## COG3119 Arylsulfatase A and related enzymes 68 26 Op 4 . - CDS 64528 - 67644 2978 ## COG0383 Alpha-mannosidase 69 26 Op 5 7/0.000 - CDS 67648 - 68415 797 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 70 26 Op 6 1/0.250 - CDS 68400 - 70181 1668 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 70223 - 70282 6.4 - Term 70248 - 70281 -0.3 71 27 Op 1 14/0.000 - CDS 70302 - 71825 458 ## PROTEIN SUPPORTED gi|15900035|ref|NP_344639.1| ABC transporter, substrate-binding protein 72 27 Op 2 7/0.000 - CDS 71907 - 72836 971 ## COG0395 ABC-type sugar transport system, permease component 73 27 Op 3 . - CDS 72888 - 73832 978 ## COG4209 ABC-type polysaccharide transport system, permease component - Prom 73877 - 73936 5.8 + Prom 73839 - 73898 9.0 74 28 Op 1 . + CDS 74148 - 75641 1000 ## COG3119 Arylsulfatase A and related enzymes + Prom 75667 - 75726 8.2 75 28 Op 2 .
+ CDS 75825 - 76541 594 ## COG2071 Predicted glutamine amidotransferases 76 29 Op 1 . - CDS 76538 - 77434 878 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 77 29 Op 2 . - CDS 77520 - 79466 2092 ## COG3711 Transcriptional antiterminator 78 29 Op 3 13/0.000 - CDS 79497 - 80312 913 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 79 29 Op 4 13/0.000 - CDS 80305 - 81090 1166 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 80 29 Op 5 9/0.000 - CDS 81109 - 81588 714 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 81 29 Op 6 . - CDS 81607 - 82026 630 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA 82 29 Op 7 . - CDS 82082 - 83173 1039 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains - Prom 83266 - 83325 5.5 - Term 83312 - 83350 3.5 83 30 Tu 1 . - CDS 83355 - 83549 289 ## gi|288870001|ref|ZP_06409592.1| putative protein GrpE - Prom 83692 - 83751 5.0 84 31 Tu 1 . - CDS 83907 - 84746 637 ## COG0582 Integrase - Prom 84772 - 84831 5.8 85 32 Tu 1 . + CDS 84848 - 85018 130 ## gi|266619590|ref|ZP_06112525.1| ABC spermidine/putrescine transporter, inner membrane subunit + Term 85069 - 85105 -0.9 - Term 84861 - 84899 1.3 86 33 Op 1 . - CDS 85015 - 85683 796 ## Closa_0311 MerR family transcriptional regulator - Prom 85726 - 85785 2.2 87 33 Op 2 . - CDS 85814 - 86692 744 ## Selsp_0978 hypothetical protein - Prom 86933 - 86992 20.8 88 34 Op 1 16/0.000 - CDS 87904 - 89241 1097 ## COG0305 Replicative DNA helicase 89 34 Op 2 9/0.000 - CDS 89254 - 89700 574 ## PROTEIN SUPPORTED gi|239629097|ref|ZP_04672128.1| ribosomal protein L9 90 34 Op 3 . 
- CDS 89700 - 91760 653 ## PROTEIN SUPPORTED gi|162447066|ref|YP_001620198.1| bipartite protein: signaling protein and ribosomal protein L9 - Prom 91814 - 91873 5.5 - Term 91846 - 91895 4.1 91 34 Op 4 . - CDS 91947 - 92366 347 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit - Prom 92416 - 92475 80.4 - Term 93531 - 93576 10.2 92 35 Op 1 . - CDS 93629 - 95821 2065 ## COG3345 Alpha-galactosidase 93 35 Op 2 . - CDS 95857 - 96531 593 ## Closa_0515 hypothetical protein 94 35 Op 3 38/0.000 - CDS 96580 - 97404 1032 ## COG0395 ABC-type sugar transport system, permease component 95 35 Op 4 35/0.000 - CDS 97422 - 98333 1082 ## COG1175 ABC-type sugar transport systems, permease components 96 35 Op 5 . - CDS 98361 - 99671 1467 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 99765 - 99824 7.2 Predicted protein(s) >gi|229784124|gb|GG667611.1| GENE 1 114 - 2345 2285 743 aa, chain - ## HITS:1 COG:Z4789 KEGG:ns NR:ns ## COG: Z4789 COG0226 # Protein_GI_number: 15803936 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Escherichia coli O157:H7 EDL933 # 168 426 239 495 502 201 38.0 5e-51 MKRNLYWLPVALAFLIGMAAAGCSNKQKPEETASEAVTADTHEEESGERETESKTAEETR SYPAGSREDDYILPEAQNHIYTQEELSDLTREELRIARNEIYARHGRKFKSDDLNSYFSG KSWYRPSVDGDAFDDSVLNESETGNLAAVKAAEENAPSELVTCPKIGKEDFPRIDGSTAT IPLSQAIYRLAADATEREAERLIRHDKTTQAYLKLIRDQGTDLVIAYEPGESVKKRLKEE GDNLIIKPIGRDALVFMANQGNPVNSLTGRQVIDIYSGNITNWKTVGGSSRAIRAFQRPV DSGSQNLMEKLVMKGTAMAEAPQDFVVSEMGELIEKVSAYDNTGEALGYSVYYYAKNMYQ KPELKFMAVDGVMPSSDTIRDGSYPYVSDFYAAVRKDEPKDSNAYRLFEWLTSDDGQALI NALGYVGVRDEKKLLPQGFEGEDEVFQAEIPLSKGEVILADGDYLYGENGIAVFDRSMNL MKFIRHVESQAISPFLVWDGSGLLSMQDTLTGNYGLYSVAEERWVCEPVYSDIYQTKDGY GLEHAVWVESGRSGEWHYTYDYADKNGAITEKGVQSDRKIWGEEGSEELYYDVKEFMEHY PEIAMGLEVTSEAVSIYGTEFQENIAVIEKGNRNYYYNMRGKCLFEFDKSGLPAGKEQMM FPIIVNDHVAFLSVYDPDGTAVSRDYIYRDGTLVKTLESGSTSGNVSDIEEDFYTRTAGN YLYVYNYQDEPCAKFLMGYYTSD >gi|229784124|gb|GG667611.1| GENE 2 
2414 - 2611 60 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MILQNILEFVRNPVLGPLFYINNKKRDRGISTGSFSACHSLSFLLSTAQLHYSIAVPLQY VCATS >gi|229784124|gb|GG667611.1| GENE 3 2566 - 3705 1304 379 aa, chain - ## HITS:1 COG:BS_nagA KEGG:ns NR:ns ## COG: BS_nagA COG1820 # Protein_GI_number: 16080554 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Bacillus subtilis # 23 375 28 384 396 252 41.0 8e-67 MIIQSKKVWIAGQFIPAQLEVVDGKITAVREYGERSVDQDYGTKRIVPGFIDVHTHGAYG YDTNDGEPEGLRNWMKHIPEEGVTSILPTTVTQMPDVLEKAVANVASVVDDGYEGAEILG IHFEGPYLDMDYKGAQPPEAIAQALVEQFKRYQNAAKGLIKYITLAPEHDADFALTKYCK ETGVVVSMGHSSATYEQALMGIANGATSMTHVYNGMTPYHHRKPGLVGTAFRVRDIYGEI ICDGCHSHLAALNNYFTAKGRDFVIMVTDSLRAKHCPPGGSYQLGGHDIEIRENGLAYLK GTETIAGSTLYMNRGLRILVEDALLPFDQALNSCTINPARCLGVDDRKGKLVAGYDADLV VLEDNYDVAQTYCRGTAML >gi|229784124|gb|GG667611.1| GENE 4 3726 - 3878 82 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619509|ref|ZP_06112444.1| ## NR: gi|266619509|ref|ZP_06112444.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 50 1 50 50 85 100.0 1e-15 MRSGSREEHFRLVDRIFYFLGHFENYFQNDLEIEMGTGYTKDGRTEDCRR >gi|229784124|gb|GG667611.1| GENE 5 3893 - 6379 2436 828 aa, chain - ## HITS:1 COG:BH1154_2 KEGG:ns NR:ns ## COG: BH1154_2 COG0642 # Protein_GI_number: 15613717 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 576 815 35 272 274 234 47.0 7e-61 MDTKLKNRHRISALLSVLIVVAAAVIMTGLYPFFEKEAGRHEEPVYEDQAFLRHLVFGNY VLDLEQSQYENGVVISPFEQFFPEGAYELKNGSMDGFGAQASAAPEGTDQLEDMDRPEGE AAAETDSRTPRVTASAAGENPAVSGESSPGADADVHAADEAAVFAGEDTISFPEDINYEA LTDEEKRYMAEERQEYLRNLRQNCMMEYESWNRYFGSIRNSIDYQVLDDNGSILMNYAKN RNILSDDDYVFRVVMHYDDQGRMDVMQLGGEDTTYLLNVIQKFGEVDPMEEYYWEQYSHE PQLKLENPKNITFAYAMTEQQLAEYSLARSSSWKSYYNSGIVANVFAGLGVFVALAAFLL SCCDVTRADDSRVFRAPLGVVLGVGAGAIGFGYSLTSLVAVTNHGSLYSGLIRANFLPKA 
ADMMVKLWNVFWWCIPFAVIYWAVLCMRDVFHIGLKRYIRERTLCGLFYGWCAGLFRRLY AFLGSINLQDHSDRTIFKIVGVNFIILAVLCSLWFFGIGALIIYSFVLFLVLRKYYRDLT RKYRILLHATNQIAEGNLDVTISEDLGVFEPFKTEVQKIQSGFKVAVDEEVKSQKMKTDL ITNMSHDLKTPLTAIITYVNLLKDDNAVPEQRKAYIDVLEQKSMRLKSLIEDLFEISKAN SNNVTLNLVNVDIVSLLKEVSLELSDKIEESTIDFRWNLPDEKIVLPLDSQKTYRIFDNL LTNIMKYGMPNTRAYIDMKRDDGGVVITMKNVSASELDFNPEEITERFVRGDQSRNTEGS GLGMAIAKSFVELQNGKMNVEVEADLFRVIIRWPVEQSTAAEEVKEEL >gi|229784124|gb|GG667611.1| GENE 6 6351 - 7064 919 237 aa, chain - ## HITS:1 COG:BH1153 KEGG:ns NR:ns ## COG: BH1153 COG0745 # Protein_GI_number: 15613716 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 5 235 6 232 232 256 55.0 3e-68 METNHILLVEDDKEIREGVEIYLKSQGYEVFQAADGIEGLKILDQEEIHLAIVDVMMPRM DGITMTVKLREKYDFPVIILSAKTEEVDKIMGLNIGADDYISKPFTPMELLARVNSQLRR YRKYLDMLEAKELKEKSNIYVIGGLELNEDTVEVTVDGKPAKLTPIEFKILALLMKNPGR VFSAEEIYERVWNERAINTDTIMVHVRKIREKIEINPKEPKYLKVVWGVGYKIEKQA >gi|229784124|gb|GG667611.1| GENE 7 7287 - 7760 192 157 aa, chain - ## HITS:1 COG:no KEGG:Apre_1383 NR:ns ## KEGG: Apre_1383 # Name: not_defined # Def: TRAP dicarboxylate transporter, DctP subunit # Organism: A.prevotii # Pathway: not_defined # 56 95 83 122 340 64 72.0 1e-09 MKDILRKCESLEYHQLKPRAVDGEWYLETTLFANHHNPEVEELFALPLECLLRNLLLSTG AIDFCMASKAILETFSKNYEIFNLPYLFASSETYHLCFYPGGACAGHVYSGCQHGASDDG RAGQITKNQIWFYAVEKPSDTVSVWQAVFCGSKPAVR >gi|229784124|gb|GG667611.1| GENE 8 7782 - 8555 775 257 aa, chain - ## HITS:1 COG:MA4658 KEGG:ns NR:ns ## COG: MA4658 COG1082 # Protein_GI_number: 20093437 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Methanosarcina acetivorans str.C2A # 37 257 40 256 257 63 26.0 5e-10 MEFGMPFLMETDTLEECARLCSQLGLGFIELNMNFPQCQLPELSADRLRRIMDQEHLYFT IHLDENLNVCDFNPKVRQAYLETAVETIELAKAIGAPVINMHLAKGIYLTLPGRKEYLFK 
RYEEEYRRHIAEFGEMCRDAVGDAAIHMAVENTEGFMVHERKALELLLQQPCFGLTLDIG HSHAAGNVDIPFYLAYEDRLIHMHGHDAKGKSCHLAFGDGEIDLEERLLMAEKAGARVVL ETKTIEALTKTVSWLRK >gi|229784124|gb|GG667611.1| GENE 9 8599 - 9102 382 167 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619514|ref|ZP_06112449.1| ## NR: gi|266619514|ref|ZP_06112449.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 167 1 167 167 315 100.0 8e-85 MQTIDFRLHHIVRYCGLERGNAFAVSFEETFHKIKGIEKLLQLNPRFYGWCYFGKIHSMY LYSDYDYEEWLEIQNLRMVMESEDKEYRMTLFFRDVTSFYLAQSAGISGFEIECSDDHAF GDRRNFHVFDFEEGDIRFYCREIEIEEVVNREMIKRKEEGGLAYHGD >gi|229784124|gb|GG667611.1| GENE 10 9393 - 10325 503 310 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0455 NR:ns ## KEGG: EUBREC_0455 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 285 1 277 322 169 35.0 2e-40 MPRQDFYIKKLLQDPARFADLYNAEIFHGKQILKAELLSPVSTESGIAITNRSGRKQTIQ RRRDIAMKASIGACFIVAGCEAQGEIHYGMPIRSLTYDALDYTEQLTEIQKEHRKKKDLA KSPEFLSGITRRDKLQPVLTLVLYCGKDPWDGPKSLYDMLDLRGPTECIPDLLAALPDYR INLVDIRKIENLSLYKTGLQQVFGMLKYSTDKSKFYNYITSNHDQISMLDDNALTAVMGL LGENRRLMKYLAAPGREEGYTMCQAIDDLIADGKLEGKREGKRRGSPYGQPNLHPFERKP AGSHRGDLRE >gi|229784124|gb|GG667611.1| GENE 11 10449 - 11396 1001 315 aa, chain - ## HITS:1 COG:SA1660 KEGG:ns NR:ns ## COG: SA1660 COG3481 # Protein_GI_number: 15927416 # Func_class: R General function prediction only # Function: Predicted HD-superfamily hydrolase # Organism: Staphylococcus aureus N315 # 1 282 1 279 313 159 32.0 6e-39 MRYIETFREGMHVSDVYLCKNKQIALTKAGKEYGNLILQDKTGTIDAKIWDLSSPGVGNF ETMDYIYIDADVTMFQNSNQLNVKRVRKAEESEYIPGDYLPVSAKNISEMYEELTGLIKS IRTPHYRKLAESFFVEDKAFAKAFQFHSAAKSVHHGFVGGLLEHTLSVVKLCDYYAGYYP LINRDLLLTAAMFHDIGKTKELSVFPENDYTDDGQLLGHIIIGTEMVSERIRTIPDFPAK AATELKHCILAHHGELEYGSPKKPALLEALALNFADNTDAKMETMIEVLKGAGDNNGWLG FNRLLETNVRKTSES >gi|229784124|gb|GG667611.1| GENE 12 11442 - 12758 1299 438 aa, chain - ## HITS:1 COG:sll1427 KEGG:ns NR:ns ## COG: sll1427 COG0265 # 
Protein_GI_number: 16330598 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Synechocystis # 161 425 133 401 416 91 28.0 3e-18 MPDKSGPDDKKTEPRRFISEKIVKQPRTKREIMRRAVALLLTAALFGIVAAVTFVVSRPV AEKYLGKETTRETTPVTIPKDDPEPSTTEETEPTQATTKESEPIEDLLESAMKKYPFSID DLNTLYGNLRAVSLKAGKGIVTVHSVKTQTDWFNNPVENSGLYAGAVISSANEELLVLTP EGAVEAADSIRVAFSDGTEVQGTIRQTDKLSGMAIVSVPTAELGRALLEEVKAIELGNSY SVKQGDLVIAIGGPAGMVHSTCYGFVSYIAKNVQMTDGMTRIIYSDLKSNAATGTFLMNT TGQIIGWVTDEYKSEGSEDMTVAMAISDYKSILEKMSNGNAFPYFGIKGQEVSAVMNESG MPLGVYVVDVNADSPAYNAGIQCGDIITVMGGENIVTMKDFQVKLEASTPGEVLPVTVLR SGRDEYKEIEYQITIGAR >gi|229784124|gb|GG667611.1| GENE 13 12957 - 13550 536 197 aa, chain + ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 187 1 188 195 158 44.0 5e-39 MSDLKNSETVKNLMRAFAGESQARNRYTFAASQAKKEGLPVIQAVFSYTAGQEKEHAEIF YQYLKELQGETIFIDGGYPVDLYDQTVDLLKSARHNEYEEYQDVYKNFAEVAKQEGFLQI SSSFSLISEIEKVHGDRFQLFADYMEQNKLFVSDAETGWICLNCGHIHHGLQAPKACPVC HHEQGYFVRVELSPFLR >gi|229784124|gb|GG667611.1| GENE 14 13734 - 16133 2270 799 aa, chain + ## HITS:1 COG:CAC2340 KEGG:ns NR:ns ## COG: CAC2340 COG1193 # Protein_GI_number: 15895607 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 1 799 1 788 788 608 44.0 1e-173 MNQKALKTLEYDKIIVQLAEYASCESGKELCRRLVPSIDYDEIVTAQRETTDAVTRVRQK GGISFGGVKDIRASLKRLEVGSSLGIVELLSVSSLLTTAARAKSYGRHEDSELPEDSLEQ FFSVLEPLTPVNTEIRRCILSEEEVSDDASPGLHHVRRSMKNIHDKIHTQLNSILNSNRT YLQDAVITMRDGRYCLPVKSEHKSNVPGMVHDQSSTGSTLFIEPMAILKLNNDLRALEIQ EQKEIEMVLADLSNQLVPYQDELLTDFEVLTRLDFIFAKAALSRHYQASEPRFNKKGIIH IKDGRHPLLDPSKVVPITVHLGRDFDLLVVTGPNTGGKTVSLKTVGLFTLIGQAGLHIPA FDGSELSVFDEVFADIGDEQSIEQSLSTFSAHMTNIVQILGQADSRSLCLFDELGAGTDP 
TEGAALAIAILSFLHNMKCRTMATTHYSELKVFALTTPGVENACCEFNVETLQPTYRLLI GVPGKSNAFAISSKLGLPDYIIEDAKTHLEAKDETFEDLLTHLEQNRVTIEKERIQIESY KMEVEKLKARLTQKEERLDERRDKMIRDAKEEAQRILRDAKDTADQTIRQINKLASESGV GKELEAERARIRGKLKEVDSSLSLKNQTKEPKQSIDPKKLKLGDGVRVLSMNLNGTVSSL PNSKGDLYVQMGILRSLVNLSDLELLNEQSVSGPTLGSGNGKKKTGSGSSSARMSKSFTI SPEVNLIGMTTDEAIPELDKYLDDAYLAHLPSVRVVHGRGTGALKNAVHKHLKKLKYVKD FRLGVFGEGDTGVTIVTFK >gi|229784124|gb|GG667611.1| GENE 15 16146 - 16856 760 236 aa, chain + ## HITS:1 COG:CAC3220 KEGG:ns NR:ns ## COG: CAC3220 COG0745 # Protein_GI_number: 15896467 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 6 235 7 228 228 258 56.0 5e-69 MAEKQRILIVDDDNNIAELISLYLTKECFETIIVNDGEEALRVFPEFKPNLVLLDLMLPG MDGYSVCREMRAQSPVPIIMLSAKGEVFDKVLGLELGADDYMIKPFDSKELVARVKAVLR RYQQPAVPAAAQTDQQGDYVEYPDLIVNLTNYSVIYMGHSIEMPPKELELLYFLASSPNQ VFTREQLLDHIWGYEYIGDTRTVDVHIKRLREKIKDHASWALTTVWGIGYKFEVKR >gi|229784124|gb|GG667611.1| GENE 16 17232 - 17651 292 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619521|ref|ZP_06112456.1| ## NR: gi|266619521|ref|ZP_06112456.1| entericidin EcnAB [Clostridium hathewayi DSM 13479] entericidin EcnAB [Clostridium hathewayi DSM 13479] # 1 139 1 139 139 186 100.0 4e-46 MKQKLFIIAAILVFTVSAAACSAKNEGSASTAAAKTETTADADRTNDAGTAEADAADLST GDSQEGEGTVTGVVEENKGFMITVAADDDEEAYVFTLDETQSEKYKDIKSGEKITVSYTN GLPTPDNLDTIVTDIQPAK >gi|229784124|gb|GG667611.1| GENE 17 17911 - 19317 1438 468 aa, chain + ## HITS:1 COG:CAC3219 KEGG:ns NR:ns ## COG: CAC3219 COG0642 # Protein_GI_number: 15896466 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 139 462 147 471 475 204 36.0 3e-52 MTHSLYYKFILGYLVFGLLGFASIATLSSRMTERYLIREQADLLYDEANQVASTYSGMYH GKDMDLISSYPQLVGTAHYLHSQIWIIDRQGKIVVDSDQSERTGRVIDRFDPTAAGNKSY 
TIGTFYDQFTYDVLSVHAPITGNYTTYGYVVIHMPLSVLQNTRDGILNIVYLTLAVVFGF SLIILLVFTKTVYFPLKKITAGANEYAQGNLTHHIDVNTRDEMGYLAATLNYMSDELNKM EEYQKTFIANVSHDFRSPLTSIKGYLEAILDGTIPPEMYEKYLTRVITETERLNKLTQGM LTLNTLDSKGYLNRTNFDINRVIKDTAASFEGTCEEKNINFDLTFSNNIQMVYADLGKIQ QVMYNLIDNAIKFSHHDSTIYIQASGKYEKIFISVKDTGIGIPKDSIKKIWERFYKSDLS RGKDKKGTGLGLAIVKEIIQSHGENIDVVSTEGVGTEFIFSLPKSTNL >gi|229784124|gb|GG667611.1| GENE 18 19411 - 19905 606 164 aa, chain - ## HITS:1 COG:FN0809 KEGG:ns NR:ns ## COG: FN0809 COG0219 # Protein_GI_number: 19704144 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Fusobacterium nucleatum # 1 148 1 148 150 182 57.0 3e-46 MNVVLLEPEMPANTGNIGRTCVATGTKLHLIEPLGFRLNDKLIKRAGLDYWDKLDVTVYC DYQDFLERNPGAKIYMATTKGLKVYSDVEYEPDCFLMFGKESAGIPEEILLENQEQCVRI PMWGDIRSLNLSNSVSIVLYEALRQNGFEKMTMQGQLHHLHWKE >gi|229784124|gb|GG667611.1| GENE 19 19914 - 21083 1252 389 aa, chain - ## HITS:1 COG:BH3350 KEGG:ns NR:ns ## COG: BH3350 COG0436 # Protein_GI_number: 15615912 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus halodurans # 2 384 8 390 393 452 55.0 1e-127 MRNPLSERIVEIPPSGIRKFFDIVSEMKDAISLGVGEPDFETPWHIREEGIYSLEKGRTF YTSNAGLRELKVEISNYLKRRFHVTYDPDHEVMVTVGGSEAIDVALRAMLDPGDEVLIPQ PCFVSYVPCTILANGVPVPIPLEEKDNFKLTKEKLLQYITPKSKILVLPFPNNPTGAVMT AEELAVIAEIVVEKDLFVLTDEIYGELTYGMEHTTIASFPGLKERTVLINGFSKAYAMTG WRLGYLAAPAVILEQMLKIHQFAIMCAPTTSQYAAVAALRDGDKDVQMMRESYDQRRRFL MNAFQEMGLECFEPNGAFYAFPSIKRFGMTSDEFATRLLQEEKVAVVPGTAFGDCGEGYL RVSYAYSLKSLKEALGRMERFVKRLDGKA >gi|229784124|gb|GG667611.1| GENE 20 21080 - 21562 626 160 aa, chain - ## HITS:1 COG:BH3351 KEGG:ns NR:ns ## COG: BH3351 COG1522 # Protein_GI_number: 15615913 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 5 160 8 164 164 153 50.0 1e-37 MREKILAVIEKNSRIDLKDLAVLLGESEAAVANEIAEMEKEHIICGYHTLINWDNTSDEK 
VVALIEVKVTPQRGMGFDKIAERIYQYSEVEAVYLMSGAYDFTVFIEGRTMRQVAQFVSE KLSPMESVLSTATHFVLKKYKDHGTVLAEETQDERMLITP >gi|229784124|gb|GG667611.1| GENE 21 21604 - 22134 537 176 aa, chain - ## HITS:1 COG:PH1573 KEGG:ns NR:ns ## COG: PH1573 COG0309 # Protein_GI_number: 14591353 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Pyrococcus horikoshii # 3 176 151 326 326 63 28.0 2e-10 MEPGQNLVVAGYAGYAGTVEIVRQKHGELLTWFTESYLDRIMENEDGTLSGNLERWKAFG ATECEPAGEGGILSALWNLSGAYMTGIEFSLRQIPVKQETIEVCERYNLNPYRLYSDGCL LFVTDNGGEMVLSLEREGIHAAVIGRVTEGIGRIIRHGESTGYLDRPAEDEIHKVL >gi|229784124|gb|GG667611.1| GENE 22 22180 - 23730 1417 516 aa, chain - ## HITS:1 COG:CAC0691 KEGG:ns NR:ns ## COG: CAC0691 COG2720 # Protein_GI_number: 15893979 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Clostridium acetobutylicum # 62 382 65 384 411 121 27.0 3e-27 MKNENLAAGLAAVVLAVGLTLAVPAPARADTLPDGLYVGEYSLGGMTEEEAGEKIQSLVG EMENQKITLVVDGESVETTAKELGFHWSNTDAVNQAAASVTGGNLIHRYLNLKDIEHNHV VIPLETAFDDGKVAAFVEEKCASVVAEPKDASITRENGAFVITPAVVGKSVDVGATKAAL DEAISGGLKEPVTVTAAVADAQPAITTEALSTIQDVLGTFSTDFSSSGNSRATNLRVGAG KINGHVLMPGETLSGYECMHPFTTANGYATAAAYENGQVVDSVGGGVCQISTTLYNASLR AELEITQRQNHSMIVTYVKPSEDAAIAGTYKDLKITNNYSTPIYVEGYTSGKTLTFTIYG KETRPANRTFKFVSETLATMDPGPPKEEVDPSMEPGTRKQVQSAHRGYKSRLWKYVYIDG VEQSKEILHTDTYNASKAIVKVGPAAPAVTVPLPEETTAPVETQPAESSPVEGENGGPGV TLPEAAAPEPAPAPEPEAPAVSPVPAPEAAGAAPAV >gi|229784124|gb|GG667611.1| GENE 23 23775 - 23885 180 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINFNDKKNRKVVSIVLIAVVILLIVAMVLPMMMIR >gi|229784124|gb|GG667611.1| GENE 24 23900 - 24607 565 235 aa, chain - ## HITS:1 COG:aq_932 KEGG:ns NR:ns ## COG: aq_932 COG0847 # Protein_GI_number: 15606258 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Aquifex aeolicus # 4 160 18 175 202 102 34.0 5e-22 
MVNSYVAVDLETTGLDPKRDKIIEIGAVRIEAGEITAEFESFVNPYRMLEAKTRTLTGIR DQDVVHAPGIEEVLPVFLNFAGELPLLGHRIIFDYSFLKRAAVNQGESFSRAGIDTLTLA RLFMPAEERKNLKAACGWFGIDQRETHRAMADAVAAHQLYQAMKKRYGKERPDAFSEKNL IYKVKKEQPASKRQKEHLQDLIKYHKIGLTVQIDHLTRNEISRITDKIIAQYGRI >gi|229784124|gb|GG667611.1| GENE 25 24617 - 25270 180 217 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46129221|ref|ZP_00155777.2| COG1194: A/G-specific DNA glycosylase [Haemophilus influenzae R2846] # 1 212 2 213 378 73 26 3e-12 MAKRETKAEFKARVDHVLAALDQEYGTEYVCYLNHETPWQLLIAVIMSAQCTDARVNIVT ADLFKKYDSIEKFANADLKELEKDIHSIGFYHMKAKNIISCCQGLLERFGGQVPRTIEEL TSLAGVGRKTANVIRGNIYHEPSIVVDTHVKRISRKLGFAKAEDPEKIEMELMKVLPKEH WILWNIQIITLGRSICFARSPKCKECFLREYCPSAEQ >gi|229784124|gb|GG667611.1| GENE 26 25289 - 27028 1774 579 aa, chain - ## HITS:1 COG:CAC1021 KEGG:ns NR:ns ## COG: CAC1021 COG1032 # Protein_GI_number: 15894308 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 2 516 3 512 548 362 40.0 1e-99 MKCLLAALNAKYIHSNPGIYSLKAYAMGRLEQDSACLKREISVELGEYTINNQMGDILKD IYLRKPDIIGFSCYIWNITWVMELVRDLPSVLPEAEIWLGGPEVSYDAADILEREPNVTG IMMGEGEETFLEVLKHYAEGRRDFFTIDGVACRNEKREVVSRPLKKAADMSAIPFPYEDL HDFEHKIVYYESSRGCPFSCSYCLSSLDKSVRFRNLELVKRELDFFLEHRVPQVKFVDRT FNCSKKHTMEVWRHITEHDNGVTNFHFEISADLLDEEELELMAQMRPGLIQLEIGVQSTN PDTIREIRRKTDLNRLKSVVERIHGFGNIHCHLDLIAGLPYEEYESFRNSFNEVYGMKPD QLQLGFLKVLKGSYMAENAEAYGLVYSAKPPYEVLGTKWLDYESLLKLKGVEEMVEVYEN SGQFTVTMKELVKEFDTPFDLFSELAEYYEAHTLTGVSHSRMARYEILDRFVGEHIPDKL ELYRRLMVYDLYLRENLKSRPDFAEDQTPFKEKIREFFAKEAVNHRYLREGYEGFEARQL IKMAHLEVFGGKTAVLFDYRQRDPLSYNAAAFVIDEFFY >gi|229784124|gb|GG667611.1| GENE 27 27036 - 27530 625 164 aa, chain - ## HITS:1 COG:FN1902 KEGG:ns NR:ns ## COG: FN1902 COG2131 # Protein_GI_number: 19705207 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Fusobacterium nucleatum # 5 162 15 168 174 191 59.0 5e-49 MTEKREDYITWDEYFMGVAMLSGKRSKDPSTQVGACIVSQDNKILSMGYNGFPKGCSDDE 
FPWGKEHEKDDPYNAKYFYSTHSELNAILNYRGGSLEGSKLYVTLFPCNECAKAIIQSGI KTIVYGSDKYDGTPAVNASKRMLNAAGVRYYQYQPTGRKIEIEV >gi|229784124|gb|GG667611.1| GENE 28 27548 - 28141 683 197 aa, chain - ## HITS:1 COG:BS_ywlC KEGG:ns NR:ns ## COG: BS_ywlC COG0009 # Protein_GI_number: 16080748 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Bacillus subtilis # 2 182 175 345 346 149 46.0 4e-36 MIVDGGPVGIGVESTIIDVSGQVPVILRPGAVTQEMAGAVLGMEIGLDPAITGDGPMKAD VKPKAPGMKYKHYAPKADLTLVEGEPAEVARTINHLAKEKLQAGYRVGIICTDETRELYP EGMVRSLGLRAREETIAHNLYAVLREFDDLEADYIYSESFSADHLGQAIMNRLKKAAGYH MIHGGCAEEGAEAGRLQ >gi|229784124|gb|GG667611.1| GENE 29 29123 - 29572 487 149 aa, chain - ## HITS:1 COG:PH0435 KEGG:ns NR:ns ## COG: PH0435 COG0009 # Protein_GI_number: 14590350 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Pyrococcus horikoshii # 6 148 3 145 340 174 62.0 5e-44 MVTERIIIEDTSQIDLKKLEKAAAILREGGLVAFPTETVYGLGANALDEEAAKKIYAAKG RPSDNPLIAHIADFSDLKPLVAEIPETAKKLMEAFWPGPLTMIFPKSDAVPYGTTGGLDT VAVRMPSDPVAMTLIRLSGVPVAAPSANT >gi|229784124|gb|GG667611.1| GENE 30 29740 - 30747 969 335 aa, chain + ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 9 335 3 327 327 417 61.0 1e-116 MALMIPENYDPHLTVRETQEAIKYIRDTFQKEFGKEMNLERISAPLFVEKSSGLNDNLNG VERPVQFDIAGIPGETVEVVHSLAKWKRMALHEYGFQPGEGLYTNMNAIRRDEELDNLHS CYVDQWDWEKVITKEDRTIETLKETVRTIFKIIKHMQHEVWYKYPNAVNHLPKEITFITS QELEDRYPGKTPKERENLITKEYGCVFLMQIGDKLASGEPHDGRAPDYDDWQLNGDILFW FENLNCALEISSMGIRVDEKSLEEQLKKAGCEDRRSLPYHRMLLEGQLPYTIGGGIGQSR LCMLLLNRAHVGEVQASIWPNDMIEECRRHNIFLL >gi|229784124|gb|GG667611.1| GENE 31 30860 - 31540 731 226 aa, chain - ## HITS:1 COG:BH0239 KEGG:ns NR:ns ## COG: BH0239 COG0860 # Protein_GI_number: 15612802 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 5 222 10 229 238 134 35.0 2e-31 MEPYLAVLLLVMVYIVSRQAGRMAAGVNVVSGKDRPVVVIDAGHGGNDPGKIGVDGSLEK DINLKIARKVKAYLEASDVQVVLTREDDQGLYTEKDSKKKMADMNRRCQIINETSPALTV SIHQNSYHEEGISGGQVFYYKGSEKGKKLAETLQKRFDFVLGDKNTRVAKANDNYFLLLH VKTPIVIVECGFLSNWNESAMLNSEEYQDRLAWTIHMGIMEYLNGR >gi|229784124|gb|GG667611.1| GENE 32 31620 - 33065 1486 481 aa, chain - ## HITS:1 COG:lin2791 KEGG:ns NR:ns ## COG: lin2791 COG1409 # Protein_GI_number: 16801852 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Listeria innocua # 64 435 26 404 443 243 38.0 7e-64 MKKRTALYAALICTMMTGCSLKSPVSETVTEPESTTETERAEIPQTPAETTVPSTIEEFP HEAEEPTVKPPEQKFEEGCRILLASDIHYLSRGLTDFGDGFKYKVEHGDGKVVNYIWEIT DAFIEAVKKEDPDLVILSGDLTLNGERDSHLDFAKKLKEIEDAGIHVIVIPGNHDINNTA AAQFTGAQALGTAHIGPDDFAEIYRDFGYDEAVSRDTASLSYVYQIDDYNRALMLDTCQY EPYNQVGGMIQEETYDWIENQMEDAWDLGMNVIPVGHHNLLDESEVYVEDCTIEHSEQLI DQLDSWDVPLFLSGHLHVQHFMREDEDYGIYEIVTSSLATPPCQYGMLDYGDDGSFQYHT VPLDMEAWAKEQGSTDKNLLNFKSYSTDFLNRVFYNQAQDEFERMNQFDSLTKRQKDQMA KVYAEINAACYAGKAVEIKDKATEKDGYKLWGENGYPSILSQYLESIVRDATKDYNSLKV E >gi|229784124|gb|GG667611.1| GENE 33 33098 - 34759 2005 553 aa, chain - ## HITS:1 COG:CAC1556 KEGG:ns NR:ns ## COG: CAC1556 COG3858 # Protein_GI_number: 15894834 # Func_class: R General function prediction only # Function: Predicted glycosyl hydrolase # Organism: Clostridium acetobutylicum # 259 549 150 441 446 130 28.0 6e-30 MKRRVVPFLVALGLIVLVIAGFFGIRLVERYTPSKEQADIGRLLGVSGDNVAVYLNNELQ ETKGIYIEEQTYLPIEWVNENLNERFYWDNNEKLLVYALPESIVYADHSTKGSSGKPLIW VNDEGVYLSLGLIANYTDIRLEVFDLAEHKRVFVNNDWSEETKAVADSKGNVREKGGIKS PIITRVEKGSEVTVLETMEKWDKVRTVDGYIGYVEHKRLGGSRSETPVSSFVAPVYKNIS LDEPVCLAWHQVTKPEGNASFDNLIANTKGLNVISPTWYELTDNEGGFNSYADAAYVQKA HDMGLQVWALINNFSNDVQTEVLLSKTSTRQKLIGKLMAEVEQYGLDGLNLDFEGIKKEA GVHYIQFIRELSVSCRKEGIVLSVDNYVPYAGNEFYNRKEQGIVADYVIVMGYDEHYAGG EPGSVASIGYVNDGISNTLKQVPKEKLINAIPLYTRVWTEEADGKTSSVALGIARAKEWA 
VENNVELFWQEELGQYYGELKTEEGTKKVWMEEERSIGLKMDLIRKYDLAGVACWKLGFE PAELWDEIRMDQE >gi|229784124|gb|GG667611.1| GENE 34 34988 - 35206 242 72 aa, chain - ## HITS:1 COG:no KEGG:Closa_0332 NR:ns ## KEGG: Closa_0332 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 9 72 18 81 81 85 68.0 5e-16 MDENGKDSYIAGMKRYRERGIPIVIDGEKCTEPDWNRIFEAREDNSFYMADFVSDEKTGK LTEIRFDRVYHR >gi|229784124|gb|GG667611.1| GENE 35 35282 - 35914 498 210 aa, chain - ## HITS:1 COG:no KEGG:Closa_0331 NR:ns ## KEGG: Closa_0331 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 24 210 21 205 205 206 56.0 5e-52 MKMKLKNRVSRTMAPKRGAAFAEQRLIGLIGAGPAVGVTHTAVMMAGYLTGICRRRCAVL EWNGSGTFDRLEESCLGKRNGGRSRSFPVLDVDYYRNAGTETLVLCKKLRYQDVIVDYGT VLEGNQEEFFRCDRQFLLAGLSEWQTGAFLDTAGSWKRAGAGWETLAVFGSEETRKNMEK ELGLSIRRIPVSVDAFAVTEAVMEFYQQIL >gi|229784124|gb|GG667611.1| GENE 36 35911 - 37389 981 492 aa, chain - ## HITS:1 COG:CAC0404_1 KEGG:ns NR:ns ## COG: CAC0404_1 COG0515 # Protein_GI_number: 15893695 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Clostridium acetobutylicum # 4 261 21 303 306 136 31.0 7e-32 MGKLNTVLFGKYQLCRILGTGRLGTVFLAVHLGLSEYRAIKRVPKSFLEYDQFRREALIL KELRHPGIPIVYDVEEDENYSYLIEEYLEGESLYDLVKSQGYLPQRLVIRYGIQLTDIMN YLHLAGTTPLLYLDLQPRNLLVCHGTVKLIDFDHAAFSDEANAAADRYGTVGCAAPEQYE KNGKLDERTDLYAIGVILHYLYTGVFPAVPYEPVPSMTPELAAIIRTCLEQKKERRFSSA SELNERLKQLEQRGAAATACLQSPLTIALAGSKSGAGTTHIAVGLSVCLRNLGYPNLYEE KNGSGMGAGLGAFTGALRDRCGLMQYHGFVWKPAYGPGVKLPMPSCGLVIADYGHEVEAA LAAEPDALILVCDGREWSRGAAESALKCVAERHTPYAAVYNHLAESARVKPPEGIEQSRC FRAPYFADPLKESEGTREFFCQVLDSLKSASEAGMDAPGRWSCMRISRWKHPARVLREAA GRARRKKRRGVS >gi|229784124|gb|GG667611.1| GENE 37 37594 - 38313 784 239 aa, chain + ## HITS:1 COG:no KEGG:Closa_0328 NR:ns ## KEGG: Closa_0328 # Name: not_defined # Def: Negative regulator of genetic 
competence # Organism: C.saccharolyticum # Pathway: not_defined # 1 239 1 239 239 378 85.0 1e-104 MKIERINENQIRCTLTSFDLSVRNLNLGELAYGSEKARSLFREMIQKASNEVGFEAEDIP LMVEAIPLSNESVMLVITKIEDPEELDTRFAKFSPMSDDDQDTMLNDLASELLEGADGLL GLLGADKKDKEEAPAPKEEAAASTMRIYCFQSLDQISEAARAALQVYDGENTLYKKPEAR QYYLVIKNTLESSLDFSRVCNILAEYGTKIRQDYASEAYYREHYEVLIEGHALQSLAKL >gi|229784124|gb|GG667611.1| GENE 38 38292 - 38429 79 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619543|ref|ZP_06112478.1| ## NR: gi|266619543|ref|ZP_06112478.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 45 1 45 45 78 100.0 1e-13 MKSRCCRKNLLTYKKQPGDAMRLRVILCVMKYYESKGTGQSFARD >gi|229784124|gb|GG667611.1| GENE 39 38434 - 39324 971 296 aa, chain - ## HITS:1 COG:SA2140 KEGG:ns NR:ns ## COG: SA2140 COG0657 # Protein_GI_number: 15927930 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Staphylococcus aureus N315 # 69 291 79 300 306 113 32.0 5e-25 MSFKASMAGFLARLCKADRKMAENLQNPLRNTGALDMERFTKKVRVKNWDIRGFKGVTVN GSYPGRGHILMLPGGAYSLEPGKHHRRIAERFAILDHMKVSVFQYPLSPEYTAVDVRIIL MEAYERLVREYPEDMFFLFGDSSGGGLALAFLQELRDRGGCPMPARTAMVSPWLDISLTN PKIKIARKTEHILPAEALIETGKRYCGGLDKEDPFVSPICGTMDGLGPVLMFSGEEEILT PDCELLAEKMEKVTGTELVFRKAAGMCHDWIQIPCRETDVTLDLIAGFFQEALEAG >gi|229784124|gb|GG667611.1| GENE 40 39356 - 40654 1482 432 aa, chain - ## HITS:1 COG:FN0944 KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 2 425 32 454 455 290 39.0 4e-78 MLVNAIYNIVDQIFIGQGIGYLGNAATTIAFPIVTIILAMSTLLGAGGSAYAAIKLGEKK EEDAEKTLGTVFILTLISSAVVMVLGFLFMTPMLKIFGATVNTIDYARQYTSIILIGTPF NMLSVVLSNMSRTDGSPALSMYALLVGAILNTILDPIYIFVFHWGVSGAAIATITSQIIS AVVLVVYFLKKGKNMRLRRKTFRLDSEICARALPLGISSGITQVASTILQVVMNNSLVYY GNQTTVGGDAALSAMGIVNKISMILISICIGIGIGSQPILGFNKGANQPKRVRKTYLLAI SAATGVSILGWLACQIFPGPILSLFGTQDVEFTQFAVKCLKIYMLGIFTAGFQVVSTSYF 
QSTGQPLKASVLSMLRQLLLLIPLILILPLTFSLDGILYAGPIADVASAVIVSLFVVYEL RKLNKVCSASEA >gi|229784124|gb|GG667611.1| GENE 41 40761 - 41210 379 149 aa, chain - ## HITS:1 COG:MA2051 KEGG:ns NR:ns ## COG: MA2051 COG1846 # Protein_GI_number: 20090898 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 1 126 1 126 153 85 39.0 4e-17 MKEHREIGKYISMIQRLSNIYFANELSSYQIGCGQQFFLLQIFKNPGISLQELASNGSYD KATATRAVKKLEEEAYVRTETDGEDKRVRHIYATEKAKRVVDITRNSVDTLTEVLLEGFT EEERDSAEAFLIRMADNSYRHIIANKGKE >gi|229784124|gb|GG667611.1| GENE 42 41420 - 42646 889 408 aa, chain - ## HITS:1 COG:RSp0310 KEGG:ns NR:ns ## COG: RSp0310 COG0477 # Protein_GI_number: 17548531 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 6 375 45 415 450 137 28.0 4e-32 MNQLNWKHSIARFLTAQTISLFGSSLVQYAIVWYITLTTSSGKMLTISTMCGFLPQIVIS LFAGVWIDRYDRKKMIMLSDSAIALSTLLLAAAFLSGHKSVWFLFAALVVRSAGTGIQTP AVNAMIPQIVPKPYLMKVNGLNSTLSSMMMFLSPAASGAVLSVAPLEAALFIDVLTAVIG VGITATIHAEPYEKRCLPGTAGLEEIRDGFRYLKDNQLIKRLLLFQITILFLVSPSAFLT PLMVSRTFGSEVWRLTASEMTYSLGMVLGGILIASWGGFKKRMNTTLAAGAVYGLIMIGL GNAGVFLLYLLFNTAIGITSPCYNAPVTVTIQEKVSPGMQGRIFSFMQIATSCALPLGML VFGPLADLVPVQYLLNSAGLMILIICCIVWKANGFSVESATENIEQPL >gi|229784124|gb|GG667611.1| GENE 43 42871 - 42966 76 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MERIIIDELKRRMNSVQMVPVKHCTELKLEA >gi|229784124|gb|GG667611.1| GENE 44 43038 - 44429 1079 463 aa, chain - ## HITS:1 COG:AGc4328 KEGG:ns NR:ns ## COG: AGc4328 COG0044 # Protein_GI_number: 15889657 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 453 26 479 506 301 38.0 1e-81 MKLIKNGIILTAETEFEGDILVDGEVIRAVGRGLDGLADDVIDASGKYIFPGGVDEHTHF 
GSFGGRLFETTEAAAVGGTTTVVDFAPQDKGDSLGAAIEKHAARAAGVSCVDFGFHSMVM DMQADTADQLKELPEHGVSCVKMFMAYKGTPFYMEDASILKVMNESRRHGITMMVHAENP DMISFYTERLKAEGRLEPISHAFSRPPVTEEEAVRHAIVMARVEEAPLFIVHVSTRGAME AIRDAYAGGQSVYGETCTHYLVLDTSFLALPDFEGAKYVCAPPLRTREHQQALWEGIDKG WLNAVSSDHCALAGGFETKKEGLGDFTKIPNGIPGVQNRISVLWSQGVAKGRISKQRFVD LIATAPAKNSGLAKKGQIAPGFDADFVIFDPGYRGIMTQKDNLEGIGYGAFEGFEMIGRP EQVFLRGMLIAENGRFTGRKGCGRRIFAKPYAAAYDHYRSEAL >gi|229784124|gb|GG667611.1| GENE 45 44426 - 44827 401 133 aa, chain - ## HITS:1 COG:lin0777 KEGG:ns NR:ns ## COG: lin0777 COG2893 # Protein_GI_number: 16799851 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: Listeria innocua # 2 121 3 121 144 87 35.0 5e-18 MIGIIVAGHGNFASGITTMLELVVGKPEHYEFIDFLQGESQEALEEDYRDKLDKLKDCEK IVIMTDITGGSPFKSAVTYAVSDDRLNVISGTNVPMLVELIFGRMNSDDSEALIQSAMEA GKAQIIRFHKDML >gi|229784124|gb|GG667611.1| GENE 46 44824 - 45648 918 274 aa, chain - ## HITS:1 COG:MJ1115 KEGG:ns NR:ns ## COG: MJ1115 COG0434 # Protein_GI_number: 15669302 # Func_class: R General function prediction only # Function: Predicted TIM-barrel enzyme # Organism: Methanococcus jannaschii # 11 265 11 259 261 117 28.0 3e-26 MMRGKQFPVALAMIQPEPFPGSFRHEGKSFEEIIDISLNEIEMIEANGFDGYIIQNRNDA PVRQHALPETTAYMTALARECRRRFPDMIQGILVDWDGVASLAVADAAGSDFIRVEHTYT GVEVGYAGMMEAQCVDICQFKKRIGSDIPVYADVQEVHYEQLAGKSIVDNAWDTVMNAFA DGLFLGGKSCEESIEIIKCVRKRLGERIPIFLSSGATGDNISKILQYYDGVSVGTWVKNG NMRNPIDPVRARQFMEGVKSARKLRGDSGDEGML >gi|229784124|gb|GG667611.1| GENE 47 45654 - 46586 866 310 aa, chain - ## HITS:1 COG:MJ0122 KEGG:ns NR:ns ## COG: MJ0122 COG1184 # Protein_GI_number: 15668294 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family # Organism: Methanococcus jannaschii # 48 308 55 307 308 68 22.0 1e-11 MNDQASRLFDDIVNSRVLGANNHIRMIGEIMLLTAEDESENGKSVLYKVRETARFFMETR 
GQQSRAVYHAIQYYIKGLDELENCGEEEIRARITERIGAYGVDTAKDMDTLVSYGVHISE NMDTIMIFDYSSTVDRFVAELGRKRTIYIPESRALDGGRPFVKTALEAGHEVHFIPDTTM LVALKQCQAAFMGAETVYPDGTVFNTVGSDILAILCREIRIPLYVLTPMIKVDTRPVSGY IRLSPMPFDYGPRLAGAWEPELREQVDFKGIKLLEIAPEYIRALITERGIIPSSAFFHEA MEYARSLEGR >gi|229784124|gb|GG667611.1| GENE 48 46627 - 47349 726 240 aa, chain - ## HITS:1 COG:PAB0058 KEGG:ns NR:ns ## COG: PAB0058 COG2159 # Protein_GI_number: 14520316 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Pyrococcus abyssi # 3 239 17 243 248 93 27.0 3e-19 MIIDYHSHIKWDRDTNQYDTEALLLDMQENQIDKRVVSALAGFSIREQNDAVAELVKAHP DRIVGSAVINPKERDCVAEMERIAASGCFKTIELDSMEHCYYPETMPVLDTIIEIAVANH MPVNVFTGWGCRTMPAQWAYYARRHPDMTMVLLHMGTTDFGYGCVELVPQYENLMVETSC MYEFPILRKAFSQIPKERFLFGTHYPDKLTVCSVHTFDLLKLPEELKECMFYKNAGRLLA >gi|229784124|gb|GG667611.1| GENE 49 47364 - 48197 1056 277 aa, chain - ## HITS:1 COG:lin2108 KEGG:ns NR:ns ## COG: lin2108 COG3716 # Protein_GI_number: 16801174 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Listeria innocua # 3 276 4 272 273 213 42.0 5e-55 MAENTDLTKLSKKDLMKMYIYSQAFVSGFNYEKQEAPGFAFSMMPVIEKVYKNEEDKKEA YRRHTELFLTEARLSHFVIGVTAAMEERNANEHDIDPVSINAIKAALMGPLAGIGDSLYH GTLRPIIAGLAVSMVIASGYTSMVGPLLFVLVMAGVGQLLRYFGIFKGYEKGVEFVGAMQ SSGLIDKMTRLAGIAAYAVIGGFVYKFVTINVPWTYKAGETVVSLQSTLDGIVPGILPLL YTLLMLWLMDKKKMNPVILMLVTMLVGIAGYAVGILG >gi|229784124|gb|GG667611.1| GENE 50 48197 - 48940 935 247 aa, chain - ## HITS:1 COG:YPO0835 KEGG:ns NR:ns ## COG: YPO0835 COG3715 # Protein_GI_number: 16121143 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Yersinia pestis # 3 239 1 236 262 116 32.0 3e-26 MSLLVKAVLVALVTWLTFFDKYCTQFFTYRPIVIGPIIGLIMGDLKMGLMVGCTIELMFL 
GQVFVGTALPPEETFSTIIATAFACISGSTEVALATALPLAILGQMGMYFRNMVLCVWTQ HRLEAAVERGSRRGITMNCLILPNVFNFVLFSVPVFLAIYLGAGPVQAIINAIPKVIIDG LAVGGGMIGAVGLALLLKCINVKHIWYYFLIGFFFSSFLNINPIGITLIAVVCVAMAYYR EMDREVA >gi|229784124|gb|GG667611.1| GENE 51 48957 - 49442 670 161 aa, chain - ## HITS:1 COG:lin0021 KEGG:ns NR:ns ## COG: lin0021 COG3444 # Protein_GI_number: 16799100 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Listeria innocua # 3 161 6 162 162 100 35.0 1e-21 MVVTHMRIDDRLIHGQIVTAWISDSKAEEILVADDMAAKDPTQQMLLKLAVPAKIKLAIV TVEKAAELLKEDQSDTKVLLIVRNPKTALDLFERGFTIESVNVGNISNARSTVGRKRLLP YIYVEPEDVENLKAIAEKNIRLDVRAVPNDKSIDGLALISK >gi|229784124|gb|GG667611.1| GENE 52 49417 - 50181 685 254 aa, chain - ## HITS:1 COG:PAB0058 KEGG:ns NR:ns ## COG: PAB0058 COG2159 # Protein_GI_number: 14520316 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Pyrococcus abyssi # 2 242 15 245 248 74 28.0 1e-13 MKRYDFHSHLGKTRSGDANNADQLVRELGEFGITKVGICSLSGNGMRPQNDLVYEAMCQY PGIVEGYADIDPKAPDAFDEIHRTLGDLRMNGVKFMAWKHGYAVENCPSLGPVIDEIGKY GVHIQIHGGASPLCTPFVWIDHAKRRPDMRFVFTHVCGREFGYSCIEAIRDLDNFWVETS ANMEIDILRKAVEVLGPERMLFGTDWPYKPTNIEIEKLYHLGLSESELELVFYKNAEKLW ERKEQHHGCNTHEN >gi|229784124|gb|GG667611.1| GENE 53 50168 - 50566 509 132 aa, chain - ## HITS:1 COG:PA2299 KEGG:ns NR:ns ## COG: PA2299 COG2188 # Protein_GI_number: 15597495 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pseudomonas aeruginosa # 1 125 123 247 249 68 33.0 4e-12 METGEKVYFLKRLRLLNGRIIALNETYVSGRLGFAIRGEDFDSTTSLYEYLEQHGIVLGS ADETLEAKMATSEIRRELFLEESQPLVYKERVTYDMNGRPVEFSENSYLSDSYKYYIHIT NVRGEGNTGEKV >gi|229784124|gb|GG667611.1| GENE 54 51544 - 51855 312 103 aa, chain - ## HITS:1 COG:BS_yvoA KEGG:ns NR:ns ## COG: BS_yvoA COG2188 # Protein_GI_number: 16080556 # Func_class: K Transcription # Function: Transcriptional regulators # 
Organism: Bacillus subtilis # 5 101 1 97 243 72 38.0 3e-13 MEQPIRLDKDSKVPIHQQLYSYIKERIVEGFYKENETIPSEKEMQEMFEVSRITVRRAIS DLEHDGYLIKRKGRGTIVCPQKKERDMSVFNSFSGDAKVKGDK >gi|229784124|gb|GG667611.1| GENE 55 52163 - 52363 311 66 aa, chain - ## HITS:1 COG:no KEGG:Closa_0318 NR:ns ## KEGG: Closa_0318 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 66 1 66 66 99 84.0 5e-20 MQISKDMLIGALIQMDDCIAPILMRAGMHCLGCPASQGESLEEACMVHGIDCDTLVSNIN EVLAER >gi|229784124|gb|GG667611.1| GENE 56 52599 - 52955 443 118 aa, chain + ## HITS:1 COG:lin1443 KEGG:ns NR:ns ## COG: lin1443 COG1882 # Protein_GI_number: 16800511 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Listeria innocua # 1 115 1 115 743 176 69.0 9e-45 MRPEWRDFSGGVWEREINVRDFIQKNYTPYDGDDSFLAGPTQATKDLWAQVMDLSAKERE AGGVLDMDTKVVSTITSHGPGYLDQDKEKIVGFQTDKPFKRSLQPYGGIRMAEKACAD >gi|229784124|gb|GG667611.1| GENE 57 53936 - 55531 1622 531 aa, chain + ## HITS:1 COG:lin1443 KEGG:ns NR:ns ## COG: lin1443 COG1882 # Protein_GI_number: 16800511 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Listeria innocua # 1 529 150 679 743 784 71.0 0 MRACRSNHIITGLPDAYGRGRIIGDYRRVALYGIDRLIENKEEQKATTRTIMYSDVIRER EELSEQLRALGELKKLGAIYGFDISRPATNVKEAIQWTYLGYLAAVKEQNGAAMSLGRTS TFIDIYAQRDLAEGTFTEEEIQEFVDHFIMKLRLVKFARTPEYNELFSGDPTWVTESIGG VGIDGRHMVTKMSYRYLHTLQNLGTAPEPNLTVLWSTRLPENFKRFCAKTSIESSSIQYE NDDLMRVTHGDDYAIACCVSSMRVGKEMQFFGARANLAKCLLYAINGGIDEVTKKQVGPK YRPITSEYLDFDEVMEAYKDMMRWLARVYVNTLNIIHYMHDKYSYERVQMALHDKKVTRW FATGIAGLSVVADSLSAIKYAKVKTVRDENGIVVDYIVEGDFPKYGNNDDRVDQLASDLV HTFMNYVKGNHTYRGGIPTTSILTITSNVVYGKNTGATPDGRKAGEAFAPGANPMHHRDS HGAVASLASVAKLPFKDAQDGISNTFSIVPGALGKEDQIFTGDLEVDLDNI >gi|229784124|gb|GG667611.1| GENE 58 55553 - 55822 328 89 aa, chain + ## HITS:1 COG:no KEGG:Closa_0316 NR:ns ## KEGG: Closa_0316 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 
89 1 89 89 97 64.0 2e-19 MAAPDEKKIDDIVAMLDAFMSGGGGHMNIRSDEDGNVSADTTTAKNVTTMNSLDCAAGNL ACSVPTLFEGMDSDEEDPENSGTDFSDNY >gi|229784124|gb|GG667611.1| GENE 59 55850 - 56101 387 83 aa, chain + ## HITS:1 COG:SA0218 KEGG:ns NR:ns ## COG: SA0218 COG1882 # Protein_GI_number: 15925929 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Staphylococcus aureus N315 # 8 83 674 749 749 119 69.0 2e-27 MANTNISDSQIDNLVSLLDGYVQEGGHHLNVNVFTRETLLDAQKHPENYPQLTVRVSGYA VNFIKLTKEQQDEVISRTFHDAI >gi|229784124|gb|GG667611.1| GENE 60 56154 - 56924 720 256 aa, chain + ## HITS:1 COG:SPy0379 KEGG:ns NR:ns ## COG: SPy0379 COG1180 # Protein_GI_number: 15674526 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pyogenes M1 GAS # 3 239 10 248 263 260 51.0 2e-69 MTTGQIHSIESFGTVDGPGIRMVVFTKGCPMRCQYCHNPDTWSLHGGTAMTVSEILDQYE ASRPFYRGGGITVTGGEPLLQMEFVTELFEEACRRDIHTCLDTSGVTFRASDQAYLEQLD RLLASTRLVMLDLKQIDDEKHRALTGHSNRNILQFAEYLDQKQVPMWIRHVVVPGLTDSE KDLYHLGRFIGTLKYAKALDVLPYHDMGKVKYKNLGILYPLEDVPPMSREEAVGAKKIIL HGIKDRRAGTPDLFCS >gi|229784124|gb|GG667611.1| GENE 61 57020 - 57190 227 56 aa, chain + ## HITS:1 COG:no KEGG:CPR_2442 NR:ns ## KEGG: CPR_2442 # Name: not_defined # Def: ferredoxin (FdxA) # Organism: C.perfringens_SM101 # Pathway: not_defined # 1 56 14 69 69 62 67.0 6e-09 MAYVISDACVSCGTCEGECPVSAISEGEGQYVIDADTCISCGTCAGACPVGAISEE >gi|229784124|gb|GG667611.1| GENE 62 57283 - 58134 963 283 aa, chain - ## HITS:1 COG:L66233 KEGG:ns NR:ns ## COG: L66233 COG0656 # Protein_GI_number: 15672250 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Lactococcus lactis # 3 283 2 281 281 321 53.0 9e-88 MKSLTDCYKLANGVEIPCIGFGTWQTPDGDVCVSSVKAAIAAGYRHIDTAEMYENEDSVG KAIKECGVSREELFVTSKLNNTEHGYEKTMAAFEGTMEKLGLKYLDLFLIHWPNPIAFRD HWQEANAGTWKAFEELYKAGRVRAIGISNFRQHHIEALMETATVPPMVNQMKLCPGETQE 
EAADYCRSRNILLEAYSPLGTGQIFQVPEMQELARKYGRSIAQICIRWSLQRGYLPLPKS VNPARIQENANVFDFELEASDVQLIADLKGCVGYASDPDTRTF >gi|229784124|gb|GG667611.1| GENE 63 58277 - 58660 398 127 aa, chain + ## HITS:1 COG:NMA1517 KEGG:ns NR:ns ## COG: NMA1517 COG0789 # Protein_GI_number: 15794411 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Neisseria meningitidis Z2491 # 1 116 1 116 135 89 40.0 2e-18 MGYTIKLAAEKTHLSPNTLRYYEKEGLLLNVARTKSGIRSYSDQDLELLGLISCLKNTDM SIRQIRDFVNLSHQGPETLKERCELLIAHKCEVEKRMEEMQQHLEKVAHKIDHYTKEYEE YQLENKQ >gi|229784124|gb|GG667611.1| GENE 64 58740 - 60131 1445 463 aa, chain - ## HITS:1 COG:lin0583 KEGG:ns NR:ns ## COG: lin0583 COG2723 # Protein_GI_number: 16799658 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 1 460 1 462 464 536 56.0 1e-152 MDYQFSKDFLWGAASSGPQSEGSFPGDQKAEMEWDYWFKQAPELFYEGVGPSVASDFYHK YKEYAGLMKEIGMNSYRTSIQWSRLIKDCEGTVNPEGAAFYNAVIDELLSRGIEPILCLH HFDLPMYWVEKGGFENRETVEAFAAYAAQCFALFGDRVKRWITFNEPIVITEGGYLYQNH YPAVCDAKRAALVSYHLQLASSLAVKAFKKSGRDGQIGIVLNLTPTYCREPENEADKKSA ETADLLFNKSFLDPSIKGEYPKALVELLKEEGICPETREEDLEVIRINTVDFLGVNYYHP RRVTRRSTPYTGPLMPEKYFEEYTYEGQKMNFSRGWEIYEPAIYEIAIHLRDNYGNIPWY ISENGMGVQDEERFIGADGTVEDDYRIEFIRDHLIWLHKAIEEGCSCFGYHLWAPFDCWS WKNAYKNRYGMIRVDIKDHCRLSIKKSGRWFRETAGRGGFNKF >gi|229784124|gb|GG667611.1| GENE 65 60235 - 61170 853 311 aa, chain - ## HITS:1 COG:lin0770 KEGG:ns NR:ns ## COG: lin0770 COG1940 # Protein_GI_number: 16799844 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 5 306 4 287 288 201 38.0 2e-51 MDRYLIFDLGGTSVKYAVSDKDGAFDTAGAFATPREGMERMLEEMERVYEAVQTGPGVTE AAPSGGDSCGTRIVGAAISSPGAVDIRTGMVGGISAIPYIHGIPLAEAVSRRLNGIPVTM ENDGNCAALGELWKGAAAGRRNMVSVVCGTGIGGAIVVDGRVYRGRTNNAGEFGNYLVNR EGGRKRTWSSYTMVNQAVRYEERTGRHADGRELFRLAEAGDCDAVRLVEEFYEAMAVGLF 
QIQFTLDTELIVLGGGISEAPFVIPEICQRMERLAENTEFGFLMPEIVPCRFGNKANLYG ALYHHLTAGYA >gi|229784124|gb|GG667611.1| GENE 66 61172 - 63028 1844 618 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0607 NR:ns ## KEGG: Cphy_0607 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: C.phytofermentans # Pathway: not_defined # 1 616 1 603 606 627 51.0 1e-178 MNLIPQPKKAERREGEFTIGYGSFLVLEESCGPLLTRQAGLFLEALEGVIGYRPALTRGK GRGGDMILRQNDAPGQGYTLEIRPEEVVITGGEKGLWHGMQTLLQIAGQQGAVLPAMSIC DEPDIENRGFYHDVTRGRVPKLTWLKKLADRMAFYKLNQMQLYIEHTFLFRDFSEIWRDD TPLTAEEILELDRYCLERGIELVPSLSTFGHLYKVLGSKHYSHLCELEGSEAMPFSLRGR MHHHTINVSSPESAGFIKGKIREFAALFSSGQFNICADETFDLGKGRSSGRKEELGVHRL YIDYVKELCEFVISCGKRPMFWGDVITGAPELLKELPAGTVCLNWGYAKDQREDETRALD GAGAIQYLCPGVCGWNYLIPLLEDAYSNIKIMCTYARKYKAAGVLNTDWGDFLHINHPEF SIPGMIYGAAFSWNTQIPDFEEINRQISEMEYGDSTGRLVGVLAKTQALNGFTWETAVRF KELKTGAAPEYEEDHRQYIQGHMEALEGVDDKNRRLEEVTGELYGLLPHMDSRNRPAVKA FVVAVEGIGLFNELGKIIAERDYGLKYGEKPDSWELAERLEVWLYHYKELYRSVSRESEL GRIQEIVVWYGDYLREQR >gi|229784124|gb|GG667611.1| GENE 67 63025 - 64524 1329 499 aa, chain - ## HITS:1 COG:BMEII0110 KEGG:ns NR:ns ## COG: BMEII0110 COG3119 # Protein_GI_number: 17988454 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Brucella melitensis # 1 475 1 474 495 372 41.0 1e-103 MKVILFDIDTLRADHMGCYGYGRNTTPVMDQIAKEGVCFERYYCPNAPCLPSRASMVTGL YGIHSGIVGHGGTAADLRLQGETRSFTDQVSENGLFMQFKRAGFHTVSFSSFAERHSAWW FLSGLDEYYNPGLRGGESAEMVTPKVIEWLERNREKDDWFLHVHYWDPHTPYRTPEGEKS RFQDEPLPDDWITDQIFCEHLRHIGPHGANEINMWNDETFSRWPKHPGSLKTKDEAKHFI DLYDDSVKFTDDNIGLITSWLKEHGLYGEDLAIIITADHGEDLGEFGVYGEHGMADEPVC HIPLIIKWPGAEGGRRVSGLYDNTDLLPTVKDLLGTETSGDYRYDGASFKEALFGGKDGG KPYAVLTQCAHVCQRSVRFDHYLYIRTIHGGYHLLPEEMVFDLDSDPHQLYNLVKERPEL CDRGARMILDWTDRMMKQSDSDQDPLWTVMREGGPEHCRGRLESYVERLKETGRAYGIPL LEKQYSSEWKHRKRRGEDR >gi|229784124|gb|GG667611.1| GENE 68 64528 - 67644 2978 1038 aa, chain - ## HITS:1 COG:lin2123 KEGG:ns NR:ns ## COG: lin2123 COG0383 # Protein_GI_number: 16801189 # Func_class: 
G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Listeria innocua # 58 1036 57 1029 1032 722 42.0 0 MIFARERAEVIADKLKKLMVVRKFELDSWQVKEGFFLRPEEADRAAESWEPFDSRLMHWY GPDRHYWFRTVFTVPEEYGGKGLWLRIRTQIEEWDDAKNPQFLVFINGEAVQGADMNHRE VLLDRAATAGDTVTVDLQAYSGTLHPEFRLIADIEEISEPVKDLYYDIQVPLWAMDRMDQ KGKTAIDILTVLNDTISLLDLRDVYSDEFYRSVEEARAYIGKALYEDLAGDDTVIATCIG HTHIDVAWWWTVAQSREKAARSFATVLKLMDEYPEYRFMSSQPVLYTFVKERYPELYAEI QRRALEGRWEPEGGMWLEADCNLTSGESLVRQFLYGKRFFMEEFGVDSRVLWLPDVFGYS GALPQIMKKCGIDYFMTTKLSWNQFNKVPDDTLLWEGIDGTKVLTHFISTLGVGQSVERF FTTYNGILHPDAIMGGWQRYQNKEMNNDILIAYGYGDGGGGPTREMLETGRRMEKGIRGV PKVRQESSLTYFTELEDRVKDSRRLPVWVGEFYFEYHRGTYTSMARNKRSNRKSELLLMD LELLSVLAEKKGMAYPKEELERLWKMVLLNQFHDILPGSSIKEVYEVTKREYEQIAEEGS ALLKAREEAVAGAGDGVTVFNTLGFPRSSLTILPAGVTALTNQGDVLPSQEIGGVRYSLT GEIPSKGYAVYGAAEHGVSDSGKTEPFSVLKTAEGLSITTPFANVDMAADGSFTSLFDLT AGRQVLKENEAGNRLRVYEDKPIYYDNWDIDVFYTEKYWDLDEPASVEVTSVGPLCLQIT VKRSFMHSRMTQDIRFYAHSPRIDFVTWVDWREHQYLLKAGFPADVHTDEAAFDIQFGNV VRKTHTNTSWDRARFESCGHKWMDVSEGGYGVSLLNDCKYGHSVRGGRLELTLIKSGIEP NPDTDNEEHVFTYSLYPHQGTWKEADTQKEAADLNQPLLAVNGGVPGKTYSFAGVEGDSV VLETVKRSEDGSGVILRLYESRNQRVNAKVNLFCTPAKVMECSLLEEPDDGSCGVRMEQN GFAFTIRPYEIKTFKVVF >gi|229784124|gb|GG667611.1| GENE 69 67648 - 68415 797 255 aa, chain - ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 253 5 257 257 122 32.0 9e-28 MVTVLIVDDEPNVLEGVKRLLDWEKLGVGRIWTAKTSLEVMSRIIDWNPDICLVDVRIGN DMGYELIECLNGLGIHSNYIMMSGYDDFHYVREALRCGALDYLLKPVGAGELEACIRKVI VEQLHGTLPEAEQKNCDPILKIEYEAMSPLIHKIVLMVAAEYGGHISLKIIADKFRMNST YLGQIFIKETGMKFSEYLMCYRLSMAREQIVNSDEKIAVIASSVGYTNMNYFYQQFNQYY HTTPSEMRIRGSERN >gi|229784124|gb|GG667611.1| GENE 70 68400 - 70181 1668 593 aa, chain - ## HITS:1 COG:SPy1588 KEGG:ns NR:ns ## COG: SPy1588 COG2972 # Protein_GI_number: 15675475 # Func_class: T 
Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pyogenes M1 GAS # 316 577 305 568 574 156 32.0 1e-37 MKKTSGSFLKISVYRRRFTGTVCSVIFIALFCLIFALCFFTTIWADNRWQEVQSGFMTLE QQKRDRAEQLENYVNGIYAKASLMEDTKTLFESITSQDYTNRREENSARSSMQIAWLPGD MRKLFVNSQLKVNAVTIRSPEGLKVIWLDNNSGNMRVSYGIKGEWQVLDRPESDTVLESF EVRNPESIMKNLGTMTFYGSSKDLYKGSSQAYDWALMKGNMTVYEHVKGEREREWILTAS GKEAFQGWFLWNGRLPVFYYEFVNSSGDSEYISAVDIVSLWRANGEGAATLAVTVLVLAA GAVIITFFNLQGEARFLSHIMDMLKTMEHGSFIEVGTMALKRGGHYNEYTMIADALSEVS VKLDEYIRKEYLLKLKQQETAMRALRHQINPHFLYNTLESIRARALILKDRETAEAIEGL GRLYRTLIRCPEVIPLKKEAGLLEMYLKLMALRFKDTFVYRVDIEEEAGEVETIAFWLQP LAENFFNHGMDRESEFNLLMLEAVKKEGGILVTMSDNGLGIPKDRLLEIRKNMVEGGDDP GADIGLRNVYMRLNYFYGEAFTMNIENQEAGGLKIDIFLPTLPGKEQAAWLQF >gi|229784124|gb|GG667611.1| GENE 71 70302 - 71825 458 507 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900035|ref|NP_344639.1| ABC transporter, substrate-binding protein [Streptococcus pneumoniae TIGR4] # 7 505 4 486 491 181 26 2e-44 MKKTDVWKKAAAAGLAGLLAVGTMTGCSAKTDTKEAEAPAGDTAQKDGAAAKDTAAGTAD GDIPTLVYWTVGGTTPSDFDQDMDAINEYLAEKIHVKLDLKVAGWGDYEKKMNTIINSGE YFDMMFVNNTNYSRFVKMGAFEDISDKLQSTAPELYEMIPEQLWKGVTIGGKVYSVPTYK DSSMAQFWYFDDQYVQKYHIDVNNIRTMQDLDKPFRDMKEGEGKGFYPLQMSQGSPFNGF FNEYDGLTAGLQPLGVKVDDQTRTVICTLEQPDIMENLKLLHQWYQDGIINPDANVLTEP QKKLPFASAQGWSSAVATWQVLNGVEKYDAFKVFGPLYSTDTIQGSMNAVSINSKYKDEC LKFLQLVNTDSKLRDMLAYGVPDKTFVYVGDGVIKKQTDTWPLAAYTQGTFFNMSITEDA DPDQWDQVKKQNEEAVSSSCLGFSLDLTNIQNEMSNCLTVWQKYKYDLLTGASDPETAVP AVIQELKAAGFDTVVQEAQKQINEFYN >gi|229784124|gb|GG667611.1| GENE 72 71907 - 72836 971 309 aa, chain - ## HITS:1 COG:lin2116 KEGG:ns NR:ns ## COG: lin2116 COG0395 # Protein_GI_number: 16801182 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 4 308 23 322 323 268 44.0 1e-71 MSKKQKEAADVQNFNRISRRANILLNIMFLILALICVIPAVLVVSISFSAEQSITDYGYR 
LIPKIISLDGYGYLLKQGALIIRALGVSLFVTIVGTVLGTLLTTLMGYVLSRPDYKLNGF LTMLVFIPMVFNGGLVSTYFIVSQFLHLKNTLWALILPLSVSSFNVVICRTFFKTTIPEE LIESAKMDGATQFKIFFQIVLPISLPVIATIGLFLCFAYWNDWYQSMLYIDNQRLYSLQA LLNAIMTNINMLAQNAATMGASMADMVANMPKEAARMAIVVIIVLPIACAYPFFQKYFIS GLTVGAVKG >gi|229784124|gb|GG667611.1| GENE 73 72888 - 73832 978 314 aa, chain - ## HITS:1 COG:lin2117 KEGG:ns NR:ns ## COG: lin2117 COG4209 # Protein_GI_number: 16801183 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Listeria innocua # 23 313 16 308 309 287 50.0 2e-77 MADMKKRPGKKRKWNRDDTELTILAFPTAVWYILFSFLPMFGIVIAFKKYTINGGFLHSI LTSAWCGLDNFKFLFSSGDIWMILRNTILYNITFIILNIVVPVTMALLIGQLHNQRMAKV FQTAMFLPYFLSWVVVTALVWAFLSFDKGMLNNLMEGLGQDPRQWYMVPKLWPGFLIFMY LWKNLGYSMVVYLATITGIDKTYYEAACIDGASVWQQMKWVTLPLMRTVIIMMFIMAVGR IFYSDFGLFYQVPRDSNSLYNVTYTLDVFVYKQLMSSTTGMASAAAFVQSVAGCITILAA NAVVRNVDRESAMI >gi|229784124|gb|GG667611.1| GENE 74 74148 - 75641 1000 497 aa, chain + ## HITS:1 COG:mll7612 KEGG:ns NR:ns ## COG: mll7612 COG3119 # Protein_GI_number: 13476324 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mesorhizobium loti # 12 433 5 432 509 153 30.0 8e-37 MDERRQMQSDARDNILLITTDQQRFDTINAWGNQSIFTPHINYMAAMGTSFTSCYASCPI CVPSRTTIMTGIDGYESGVVSNADHRAFMEAQTAGRLTLPAVLTDAGYQTCAKGKMHFEP ARACYGFEQMSLPLDYMRSCDKRGDRIRPKVHGVGECLMEPVISTVDVRDSMTTWIGDEA IDFIETRDPLRPFFLWTSFTKPHPPFDPCRDFWELYRQIPMPEPVYGDWSRTLEGTPQGF LAGSYENTDMHLFGPEQIAASRRAYYACITQVDYQLGRLFGALRENGLLENTWILFTSDH GEMLGDHYMSQKNLFFEGSAHVPFIIVPPAGRGIVHNNRIDRVITLADVFPTVLAMAGLP SPKEKRGENLLKWIGGKRQDERIFYGDSLHTNFCVMENRKKLIYTRIGSSLLLFDLETDP MERHNLADDPEYAKCRERLWTLLISHVKKTAPQVLRPDSGTDTSDAFISIPAPRFPGDMP GRWLGFHYHDYTVDTFH >gi|229784124|gb|GG667611.1| GENE 75 75825 - 76541 594 238 aa, chain + ## HITS:1 COG:CAC1764 KEGG:ns NR:ns ## COG: CAC1764 COG2071 # Protein_GI_number: 15895041 # Func_class: R General function prediction only # Function: Predicted 
glutamine amidotransferases # Organism: Clostridium acetobutylicum # 3 231 2 231 241 164 37.0 1e-40 MKKPLIGLTPSHNTDNHDIQMRPTYLKAVTAAGAIPVVLPLTSSEEDLKQLVDTLDGFLF TGGPDVHPFLFGEETLDHCGSVSTERDQMELALLPLVMETGKPILGICRGVQLLNIGLGG TIWQDIPSQVTSDFPLAHTQPFAYTLPSHTVTVKPGSRLAEITGAETLSVNSMHHQAVKD VAPSLTASAFSSDRLVEAVEMPDYPFFIGVQWHPEYLWEKNEAASRLFAAFAKAAGAR >gi|229784124|gb|GG667611.1| GENE 76 76538 - 77434 878 298 aa, chain - ## HITS:1 COG:RSc1002 KEGG:ns NR:ns ## COG: RSc1002 COG0697 # Protein_GI_number: 17545721 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Ralstonia solanacearum # 13 297 20 302 312 62 25.0 9e-10 MKELTQANQTYKGIIFTAVSAVIFGFTPVLARISYDGGANGITMTFLRCFLSLPVLFLIL KIKGIPLRVEKAWRMPVSVCGVFGAFATTVTLYMSYSYIPVGMATTLHFVYPVLVTVGCV LIFKEGITVKKVLALLCGAAGTLLFLENFSAGSGSGAGIFLALLSGLFYSVHMIVMDKSG IKNMYYFKLSFYLCLFGAVLSGIYGGVTGQLTLHLTGQAWFFAFLVSLCTSVGAISLFQL GIRYTGAVAAAILSTLEPITSVILGVLVLGELFTTRKIAGCVCILFSVVLIAAAGKKR >gi|229784124|gb|GG667611.1| GENE 77 77520 - 79466 2092 648 aa, chain - ## HITS:1 COG:BS_licR_1 KEGG:ns NR:ns ## COG: BS_licR_1 COG3711 # Protein_GI_number: 16080911 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 411 1 401 499 124 25.0 6e-28 MVEARTRYVIMIKTMLGRKTYVTSAALAEAAGVSVRTVKYDLKELQSFLKEYGVEIESKR SYGYRLLVGDGSDMEGLSKALRANLRKHKGEVFRYNFQRVIYIINKLLIGKPYYKAEELM DTFYISRSTLTQDLNRARTLLAKFRLEIDVNLRRGIGVTGDELDKRLCIAEYFFRYDDKL QVMVQRRADGGKYDWDAQRDTLVRIVWEACRANKIRLSPFLAVDLASHMYVSVLRMKAGC GIGELPGHTMTVEYVRERYAAGEVADQLEKLYSVSISEKEREYFALHILSKKMPDSGPVG SGEHATLKQCVTEIMREVKDNFELDFTNDPVFLDFLYSSIEPMVLRLRTHLIVRNPLLFE NLRRYLFATKVAHSASGIIEKLFGVQMDNNEFAYLVPAFNMIISTHEKRKKFKIGFCGDL GLSEALIYYNELSESLPRDGYELVWLDRYHNTGYLNQLQYLIYVSDYRLPSELPYYEIQD GDSTSEVCSAIAEYKLEQVQIEQYLKPEFGIFGLEGKSREEVMENLYRCLAARGLIATEL DWKNAFRANEVGNGIAHIQDLGRILKNAGCYVCILKTPILWEQDIIKVLVMIKTKRDGDK 
NLSLLCRIVSNWASSPEKVEHFLKSQSYDVFCGDIKSECLNICFHSII >gi|229784124|gb|GG667611.1| GENE 78 79497 - 80312 913 271 aa, chain - ## HITS:1 COG:STM0574 KEGG:ns NR:ns ## COG: STM0574 COG3716 # Protein_GI_number: 16763951 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Salmonella typhimurium LT2 # 2 259 18 272 284 261 48.0 1e-69 MSKQKMSAEDKKMVNGLFWNSFLLEASYNYERQQALGFSVGMWPAIKRFYKTKEGQAEAL KRHMAIFNTTPHMVSVISGVTAAMEKEASENKDFDKDIINNVKVGLMGPLAGIGDSFFWG TLRIIAAGIGLSLAQQGSVLGAILFLIIFNVPHLLIRYYGTVLGYQFGAGLMSNTKSAGI LKMISKGASIVGLMVIGAMSASMVAMKTPLTFTIGETAFELQGYLDQIFPLLLPLLYTLA MFGLLKKGCKSTTILLITIAVGVIGSLLHIL >gi|229784124|gb|GG667611.1| GENE 79 80305 - 81090 1166 261 aa, chain - ## HITS:1 COG:lin2109 KEGG:ns NR:ns ## COG: lin2109 COG3715 # Protein_GI_number: 16801175 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Listeria innocua # 18 234 20 238 272 147 41.0 3e-35 MQAAILVGLAMAVLWFLEKLGGTPMVIRPIVVSPVIGALLGDLQTGVMVGATLELVFMGA IQIGAAVPPDVLVGAGLGTAFAIQSGQGADIALALALPIAILAQSLKVIVFIIRSWFMDL AMKLAEAGDIKKMHALNIGGLLLQCFMYFVVGFVALLFGSPAVEAFVNNIPQVILNGLSV AGGLLPAVGFALLLLPMMEKRNAIYFVFGFILISYLNLPIMAVTIIGVVLAFVICYERGA GGNVTAAAAVSSEEEEDLFDE >gi|229784124|gb|GG667611.1| GENE 80 81109 - 81588 714 159 aa, chain - ## HITS:1 COG:STM4536 KEGG:ns NR:ns ## COG: STM4536 COG3444 # Protein_GI_number: 16767780 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Salmonella typhimurium LT2 # 1 146 1 146 153 102 37.0 4e-22 MIVKLRIDDRLLHGQVAYSWKSALSYNAIVIASDAAAADEFRKGVIKMCCPEGVKLATRS VEEAAKLINNPKLKDMKVFAICGSPADANGLLKKLEEKPVVNLGGIQMADGKKLFSKAVY VDEEDLRNLDEIAAAGYTIEVQEVPSTAMAKYADLRKKF >gi|229784124|gb|GG667611.1| GENE 81 81607 - 82026 630 139 aa, chain - ## HITS:1 
COG:SPy1057 KEGG:ns NR:ns ## COG: SPy1057 COG2893 # Protein_GI_number: 15675049 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: Streptococcus pyogenes M1 GAS # 2 137 3 139 141 73 32.0 1e-13 MKRMLIAAHARFADGLESALSLIMGKREDITCINAFVTETPLETQIDDYAASLGEEDQVL VITDAFFGSVNQKIMGKRLKNSLIVTGANLPLVLELVTLLDGDAPITADQLKNTVALARE QLMVVEMEDAVDSSDDFDF >gi|229784124|gb|GG667611.1| GENE 82 82082 - 83173 1039 363 aa, chain - ## HITS:1 COG:STM0573 KEGG:ns NR:ns ## COG: STM0573 COG0449 # Protein_GI_number: 16763950 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Salmonella typhimurium LT2 # 30 296 23 276 347 113 27.0 6e-25 MITVLDCIERIPKSLKGIRDSYPERTKNIADYLSGRGIKKVGRLVFIASGSSYNSAHTAK LFIKNQCGIETEFKYPNIFVHYEADTELSREAKDLESTVYVVISQGGETRLVYQALEKIR NAGRPCMAITADEEASIAKMADIHLDMGCGQEEFMYRTIGFSTSAAVCMMLGLAAGVYNG TVTAKEEAEYQRDFDAMIQNLPAVEAATESWYQAHKFSLLRRSKMMMAGTGDLYPIVNEG DIKIMEMVPMMTRSFELEEFIHGPQNSFDDATLFLILHHKGEDDEKAKAIARFIKEQIGF CALVGEEPLEERDLFISPASRYFFGLEYVTVFQVLAYRMADDRGRDLHRGVNAVVSKYIT KTL >gi|229784124|gb|GG667611.1| GENE 83 83355 - 83549 289 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870001|ref|ZP_06409592.1| ## NR: gi|288870001|ref|ZP_06409592.1| putative protein GrpE [Clostridium hathewayi DSM 13479] putative protein GrpE [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 75 100.0 1e-12 MPPQPDQQPADPQQANLQQANLQQNIDFMNYSRSIYEIRQTLLKKQEKEHNMLEYNGKET EKHR >gi|229784124|gb|GG667611.1| GENE 84 83907 - 84746 637 279 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 2 267 1 254 265 116 30.0 6e-26 MMTDTEIQLEKFKQFLIDEERAAATIEKYRRDVQAFFTWLPEKTEVSKEMVLEYKRKLAA 
QYKSTSANSMLVALNRFFGFCGRRDLQVRLLKVQRVSFRERSREMSVEEYKRLVRAAREK KDERLSLLIQTLCSTGIRVSEHRCITVEALRSGSICIDGKGKERAVFLPKKLQKQLKYYC KEKKITTGPVFITKSGKPLNRCNIWAEMKALCKNAGIEPQKVFPHNLRHLFALTYYRLEK DIVRLADILGHANIETTRIYTSTTEEECLRSLSRMKLLL >gi|229784124|gb|GG667611.1| GENE 85 84848 - 85018 130 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619590|ref|ZP_06112525.1| ## NR: gi|266619590|ref|ZP_06112525.1| ABC spermidine/putrescine transporter, inner membrane subunit [Clostridium hathewayi DSM 13479] ABC spermidine/putrescine transporter, inner membrane subunit [Clostridium hathewayi DSM 13479] # 1 56 1 56 56 65 100.0 9e-10 MSFARPQKSGLTIEVTPLSYAYCSYFIFHLCIFLITSFIILFVTVLYGFILGFLLL >gi|229784124|gb|GG667611.1| GENE 86 85015 - 85683 796 222 aa, chain - ## HITS:1 COG:no KEGG:Closa_0311 NR:ns ## KEGG: Closa_0311 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 222 1 219 219 313 83.0 3e-84 MAEIHYLISDASKKVDVESHVLRYWEDELELAIPRNEMGHRYYTEAHIRLFQQIKELKEK GYQLKAIKTALDKIMEDGKDPVIPDELLEESVTRALKESAVVTDSGDGETGLAEVQNTAQ VMIAQEKMEQFQEIMNHIIGRALEVNNEQLSQDVSSIVNDKVVKELEYLMRVKDEKEEER FKQLDELLRSYQKDNKGRAEAAASKIPFFNRKKKFGRNGKKL >gi|229784124|gb|GG667611.1| GENE 87 85814 - 86692 744 292 aa, chain - ## HITS:1 COG:no KEGG:Selsp_0978 NR:ns ## KEGG: Selsp_0978 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 25 274 19 273 273 105 31.0 2e-21 MQENKIYKESKLDNYEEKRKKVSFFHLTSDIFFSKVMEDLQACQEVIQILTEQKLTVKKV KTQYSIRNMENRSVILDVLAEDESGRIVNIEMHPKEDEDHVRRVRYHLSSIDMSFLEKGT SYDTIPEVYLIYITERDFIGENRGINEVERIVKRSGKRIDNGVHELYVNFSGKTDSAEQQ ELLSYMVNSDSNYKTDTFPNLVKRVKLFKEKKEGINIMCDIIDRERAEGKAEGKAEVIAL IRRKYQKQNTPEQAAEALELEIDYVRKVMNMIAADEGQSNEAIALQLLREEM >gi|229784124|gb|GG667611.1| GENE 88 87904 - 89241 1097 445 aa, chain - ## HITS:1 COG:CAC3715 KEGG:ns NR:ns ## COG: CAC3715 COG0305 # Protein_GI_number: 15896946 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: 
Clostridium acetobutylicum # 8 442 6 436 442 461 52.0 1e-129 MDDALIKRVLPHSVEAEQSVIGSMLMDREAIISASEIITADDFYQHQYGVMFESMVELFN ENRPVDLITLQNRLKEKDVPPEVSSLDFVRDIITTVPTSANVKSYANIVREKAVLRRLIK INEDIANTCYVGKEPLETILATTEKTVFDLLQSRNSGDFVPIRQVAMNVLEKIEEASKNQ GTVTGIPTGFIDLDYKTSGLQPSDFVLIAARPSMGKTAFVLNLVDHIAVKKGLPCMVFSL EMSKEQLVNRMLAMESNVDSQKLRTGTLSDSDWDAVVEGIGVIGNSKLIIDDTPGISIME LRSKCRKMKLEYGLSVVIIDYLQLMSGSGKGGGDNRQQEISEISRSLKALARELSAPVIA LSQLSRACETRQDHRPMLSDLRESGAIEQDADVVMFLYRDDYYNKDTDMPNIAEVIIAKQ RNGPIGTVNLVWRPEFTKFANMAKQ >gi|229784124|gb|GG667611.1| GENE 89 89254 - 89700 574 148 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239629097|ref|ZP_04672128.1| ribosomal protein L9 [Clostridiales bacterium 1_7_47_FAA] # 1 148 1 148 148 225 77 6e-58 MEVVLLEDVKTLGKKGQIVKVNDGYARNFILPKKLGIEATSKNLNDLKLQKANEARLAAE QLAAAKELAAQLEKSSITVSIKAGEGGKAFGSVSSKEIASAIASQLSLDIDKKKLVLPEP LKTFGVHKVPIKLHKEVTGKLAVKVVES >gi|229784124|gb|GG667611.1| GENE 90 89700 - 91760 653 686 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|162447066|ref|YP_001620198.1| bipartite protein: signaling protein and ribosomal protein L9 [Acholeplasma laidlawii PG-8A] # 45 677 45 660 818 256 28 2e-92 MKEKIKVKGQLRAYMQWPLYLSAMLIITNAIIGAVSVTAGIIMAVFTLLYIGIALWIYTY RKKRLLNGLVEFSAEYSWMQKQLLTDLELPYGIADESGRVLWGNTALAEVLGEEKPFKNT RKNLSSIFPEITREILDLEEDSVILHCTCDDRKYQVQLKAVYMKDSIEDIADAIGTEDAG QKLIAVYLFDETEILTYKQMVNDQKMVAGLIYLDNYDEALESVEEVRRSLLTALIDRKIT KYITAMNGIVKKLEKDKYFIAIKQCYIQELKDNRFSILEDVKTVNIGNEMAVTLSIGLGM NGESYSQSYDYARIAIDMALGRGGDQAVVKDGERIQYFGGKAQQVEKTTRVKARVKAHAL RELMETKDRLLIMGHRLTDVDSFGAAVGIYRIATAMNKKANIIINEVTSSVKPMMDRFTG NSDYPDDIFMTGARAAELVDNNTILVVVDVNRPSITDAPELLRLVKTIVVLDHHRQSSEI IDNAVLSYVEPYASSACEMVAEVLQYIADGIRIKSAEADAMYAGIVIDTNNFTNQTGVRT FEAAAFLRRNGADVVRVRKLFRDDLDDYKAKAEAVREAEIFEGCFAISTCPSEGIESPTV VGAQAANELLDIAGIKASVVMTFYNNTIYLSARSIDEVNVQVMMEKLGGGGHRTIAGAQL KDMSLEEARDRVKEVIKDMLEKGDIS >gi|229784124|gb|GG667611.1| GENE 91 91947 - 92366 347 139 aa, chain - ## HITS:1 COG:TM0012 KEGG:ns NR:ns ## COG: TM0012 COG1905 # Protein_GI_number: 15642787 
# Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 11 139 37 172 176 77 33.0 8e-15 MDYYSSQRDAGSQENIVEMLREVQELYGYIPSEKTRAMAEATGVKQTLLLQLIKLYPSFK KAPYGHCITVCTGARCGDKGSAEVFEAVLKAVEARESGAFKIVMKECLKQCKTAPNLMVD SDSYGCVKPDEVASILSNY >gi|229784124|gb|GG667611.1| GENE 92 93629 - 95821 2065 730 aa, chain - ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 1 714 1 715 748 657 45.0 0 MAIIMNQAEQVITLQTKRSTYQMKVGDYGVLLHLYYGARVEDCTMDYLLHKKDVGFSGNP YDAGEDRTFSLDTLPQEFPSYGVGDYRNTCVGVCQADGTRAADFRYVSCEIRDGAHKIPG LPCLFDEDEKGETLVVFLEDTASSLKLELHYVVFAERDIIARSARITNGGNGAVRLEKMM SACLELPNGAWEVIHFHGRHAMERRLERLPLMHGTMEVGSRRGTSSHQHNPGVIICSPDT TEEYGDCYGMSLIYSGNFTAEIEQDQMNSVRFVCGINPESFEYLLEPGEAFDTPQLMMTY SGSGLGRMSANFHSIIRHNLCRGKYRFARRPILINNWEATYFDFNEEKILSIARQASELG IEMLVLDDGWFGSRDSDDAGLGDWFVNTDKLKGGLADLVRGINGLGMKFGIWIEPEMVNE DSCLYREHPDWALTIPGRKPCRSRNQLVLDMSRSDVRDYIFDSISAVLKSANVEYVKWDM NRSICDIYSAALPKERQGEVYHRYVLGVYDLMERFTSTFPDILFEGCSGGGGRFDPAILY YSPQIWCSDDTDGIERLEIQYGTSFFYPISAVGSHVSAIPNHQTGRKTPLKTRGVVAMAG SFGYEMDLNLLSPDEKEVVKEQVEDYKKYYDLIHNGDYYRLVSPQGDSDFTAWQFVSGDK AKTLVHVVITHVRANAPDLWFKLRGLAPEKCYRLEENGRIYSGSALMNAGISIPMMMGDY PAVQMELTEV >gi|229784124|gb|GG667611.1| GENE 93 95857 - 96531 593 224 aa, chain - ## HITS:1 COG:no KEGG:Closa_0515 NR:ns ## KEGG: Closa_0515 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 220 9 220 260 110 34.0 3e-23 MNGLSEDSIFGQVFGFLGKMIALNILWIVTSLPIVTIGASTTAMYYTALKLHKDEDVTVW KSFLHSFKQNFIQATAIWAVLAAVGALLFIEGRWLLLSGSSSSMFLSYGLIGIGLIVGVL LLYIFPVIAAFSNTLGKLAGHAFYFAFHKPGYLIATAAITCLPMYFTMMDAKLFPVYLLI WLMCGFSLTAYGNAWFYLRLFQPHLKTADAGSSASDTLQPESGQ >gi|229784124|gb|GG667611.1| GENE 94 96580 - 97404 1032 274 aa, chain - ## HITS:1 COG:MT2099 KEGG:ns NR:ns ## COG: MT2099 COG0395 # 
Protein_GI_number: 15841527 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mycobacterium tuberculosis CDC1551 # 9 271 15 277 280 178 37.0 1e-44 MEKKKKAAAISIHVVLILVSITMLVPFIWMALTAFKTSTEATSVNPFVIFPAVWQTDNFK TVIDNMNFLQLYVNTILLIVGRVAAAVLTATMAAYAFARLEFKGKNVMFSLVLFQMMVPG QIFIIPQYLMVSKMGLLDTVFALIFPGLVTAFGTFLLRQAYLGLPKDLEEAARLDGCNIG QTFLYIMAPLTRSGMVALGIFTAVFAYKDLMWPMICNKTVMPLSAALAKMQGQYTNNYPQ LMAASLLACVPMIVIYLIFQKQFIEGIATSGGKL >gi|229784124|gb|GG667611.1| GENE 95 97422 - 98333 1082 303 aa, chain - ## HITS:1 COG:BH1865 KEGG:ns NR:ns ## COG: BH1865 COG1175 # Protein_GI_number: 15614428 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 3 302 9 309 309 182 39.0 5e-46 MAKETELKIKRKITSREKNEFLWGWAFSLPTMIGLIVLNIYPIIKTIYESFFKTGDFGKG NIFIGLANYQRVLTDPEVWQSLLNTFKYAIVEVPFSIIIALVLAVLLNRKMKGRAAYRTI FFLPMVAAPAAIAMVWRWLFNSEFGLLNNVFGTSINWISNPNIAVYSIGVIGVWSIIGYN MVLFLSGLQEIPKDYYEASNIDGASGIYQFFHITVPLLSPTIFFVTVTRVIGALQVFDLI FMVMDRNNPALSKTQSLVYLFYQYSFVQNNKGLGSTIVVVLLAIIMVITVFQMKAQKKWV YYN >gi|229784124|gb|GG667611.1| GENE 96 98361 - 99671 1467 436 aa, chain - ## HITS:1 COG:BH1864 KEGG:ns NR:ns ## COG: BH1864 COG1653 # Protein_GI_number: 15614427 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 6 435 5 439 461 140 28.0 3e-33 MSRLWKKGIAVTMAATMAVSLTACGGGNSSSGGNKEAGDTQAADAGGKLNVAIWDNGQKP GLDQILADFTKETGIEADLQVITWDSYWTLLEAGASGGDMPDVFWMHSNEATKYMSNDIL LDLTDKVAASEKLEMDKFPSDIKEMYQYDGKTYAVPKDIDTIALWYNKDMFDEAGIAYPD DTWTWDDLYDAAVKLTKDDGSQYGIAMNPSNEQDGWMNVIYSMGGNVISEDLKKSGFDDP NTIKAMEYVDKLVKNAMPPAAVMSETGTDVLLGSGKIAMLSQGSWMVAAFKDNEYISQHC DVAVLPKDAQTGKRVSLYNGLGWAASANTKNPEAAWKLIEFLGTKDMQLKQAQLGVTMAA YEGVSDDWVKNTDKFNLQPYLDMMNSDIVFRPHSRSTLVWWNMMTTELKEAWSGNQEMDT VCNTIAEKMDQMLADE Prediction of potential genes in microbial 
genomes Time: Thu Jun 30 23:26:34 2011 Seq name: gi|229784123|gb|GG667612.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld5, whole genome shotgun sequence Length of sequence - 94052 bp Number of predicted genes - 79, with homology - 73 Number of transcription units - 38, operones - 18 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 94 - 153 9.3 1 1 Op 1 40/0.000 + CDS 193 - 1689 1801 ## COG0642 Signal transduction histidine kinase 2 1 Op 2 . + CDS 1686 - 2375 1064 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 3 1 Op 3 . + CDS 2390 - 3163 886 ## Closa_3969 hypothetical protein + Term 3181 - 3227 11.6 - Term 3169 - 3213 10.6 4 2 Tu 1 . - CDS 3263 - 3904 634 ## Closa_2593 Methyltransferase type 11 - Prom 3969 - 4028 6.2 5 3 Op 1 . - CDS 4030 - 5322 1080 ## COG0534 Na+-driven multidrug efflux pump 6 3 Op 2 . - CDS 5309 - 5758 280 ## ELI_2545 putative MarR family transcriptional regulator - Prom 5796 - 5855 4.2 + Prom 5761 - 5820 4.8 7 4 Op 1 . + CDS 5986 - 7134 803 ## gi|266619609|ref|ZP_06112544.1| conserved hypothetical protein + Prom 7141 - 7200 7.3 8 4 Op 2 . + CDS 7325 - 9010 2005 ## COG0513 Superfamily II DNA and RNA helicases + Term 9101 - 9153 9.4 + Prom 9156 - 9215 4.1 9 5 Op 1 . + CDS 9337 - 10704 1220 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 10 5 Op 2 . + CDS 10727 - 10828 138 ## 11 5 Op 3 . + CDS 10810 - 11589 276 ## BCB4264_A0436 hypothetical protein 12 5 Op 4 . + CDS 11607 - 11696 70 ## + Prom 11705 - 11764 7.6 13 6 Tu 1 . + CDS 11940 - 12713 341 ## Swol_0509 hypothetical protein + Term 12782 - 12816 0.6 + Prom 14531 - 14590 2.1 14 7 Tu 1 . + CDS 14612 - 15256 409 ## COG3943 Virulence protein + Prom 15382 - 15441 6.5 15 8 Op 1 5/0.000 + CDS 15663 - 18929 1949 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 16 8 Op 2 . 
+ CDS 18926 - 20341 1077 ## COG0286 Type I restriction-modification system methyltransferase subunit - Term 20984 - 21023 -0.5 17 9 Tu 1 . - CDS 21260 - 21424 57 ## - Prom 21501 - 21560 5.3 + Prom 21342 - 21401 4.0 18 10 Op 1 . + CDS 21496 - 21711 92 ## Amet_0849 restriction modification system DNA specificity subunit 19 10 Op 2 . + CDS 21768 - 23198 1068 ## Clocel_0831 hypothetical protein 20 10 Op 3 . + CDS 23220 - 23798 530 ## Cbei_2801 hypothetical protein 21 10 Op 4 . + CDS 23804 - 27106 1975 ## Clocel_0833 M protein-like MukB domain-containing protein 22 10 Op 5 . + CDS 27124 - 27870 244 ## Clocel_0834 hypothetical protein 23 11 Tu 1 . - CDS 28795 - 28956 63 ## - Prom 29054 - 29113 5.1 + Prom 28773 - 28832 7.7 24 12 Tu 1 . + CDS 29071 - 29271 216 ## gi|266619623|ref|ZP_06112558.1| conserved hypothetical protein + Prom 29658 - 29717 6.9 25 13 Tu 1 . + CDS 29817 - 30953 711 ## EUBELI_01635 multiple sugar transport system substrate-binding protein + Prom 31855 - 31914 14.2 26 14 Tu 1 . + CDS 32042 - 34606 1816 ## COG0642 Signal transduction histidine kinase - Term 34559 - 34591 -1.0 27 15 Tu 1 . - CDS 34710 - 35585 622 ## COG0583 Transcriptional regulator - Prom 35641 - 35700 10.5 + Prom 35646 - 35705 10.5 28 16 Tu 1 . + CDS 35763 - 36368 669 ## COG2252 Permeases + Prom 37212 - 37271 80.4 29 17 Op 1 . + CDS 37301 - 37759 458 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins + Term 37806 - 37858 12.0 + Prom 37870 - 37929 4.0 30 17 Op 2 . + CDS 38006 - 38578 421 ## PROTEIN SUPPORTED gi|116492196|ref|YP_803931.1| acetyltransferase + Prom 38587 - 38646 2.4 31 18 Op 1 . + CDS 38713 - 39153 365 ## Clole_1993 isoprenylcysteine carboxyl methyltransferase + Prom 39180 - 39239 1.7 32 18 Op 2 . + CDS 39301 - 39543 308 ## Trebr_2404 hypothetical protein + Term 39625 - 39668 8.3 + Prom 39560 - 39619 2.5 33 19 Op 1 2/0.200 + CDS 39681 - 39845 162 ## COG1331 Highly conserved protein containing a thioredoxin domain 34 19 Op 2 . 
+ CDS 39884 - 41752 1567 ## COG1331 Highly conserved protein containing a thioredoxin domain + Term 41780 - 41813 -0.4 35 20 Tu 1 . - CDS 41772 - 42608 454 ## COG3049 Penicillin V acylase and related amidases - Prom 42641 - 42700 2.5 36 21 Tu 1 . - CDS 42850 - 44181 1234 ## COG0733 Na+-dependent transporters of the SNF family - Prom 44265 - 44324 7.6 + Prom 44333 - 44392 5.4 37 22 Tu 1 . + CDS 44445 - 45068 582 ## Closa_2536 TetR family transcriptional regulator + Term 45101 - 45135 2.7 + Prom 45080 - 45139 2.6 38 23 Op 1 35/0.000 + CDS 45216 - 47030 214 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 39 23 Op 2 . + CDS 47027 - 48769 212 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 + Prom 48806 - 48865 4.5 40 24 Op 1 . + CDS 48893 - 49531 708 ## gi|266619640|ref|ZP_06112575.1| conserved hypothetical protein 41 24 Op 2 . + CDS 49548 - 49709 96 ## Thit_0244 two component transcriptional regulator, AraC family + Term 49738 - 49795 15.9 - Term 49725 - 49783 1.1 42 25 Tu 1 . - CDS 49852 - 50445 349 ## COG0583 Transcriptional regulator - Prom 50494 - 50553 80.4 43 26 Tu 1 . - CDS 51393 - 51608 224 ## COG0583 Transcriptional regulator - Prom 51660 - 51719 7.4 + Prom 51705 - 51764 4.7 44 27 Op 1 38/0.000 + CDS 51792 - 53477 1550 ## COG0747 ABC-type dipeptide transport system, periplasmic component 45 27 Op 2 49/0.000 + CDS 53506 - 54459 763 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 46 27 Op 3 44/0.000 + CDS 54459 - 55343 689 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 47 27 Op 4 44/0.000 + CDS 55356 - 56336 780 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 48 27 Op 5 . + CDS 56340 - 57296 772 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 49 27 Op 6 . + CDS 57322 - 57924 402 ## gi|266619650|ref|ZP_06112585.1| putative membrane protein 50 27 Op 7 . 
+ CDS 57957 - 59249 1009 ## Pat9b_4610 hypothetical protein 51 27 Op 8 . + CDS 59227 - 60783 812 ## COG2272 Carboxylesterase type B 52 27 Op 9 . + CDS 60814 - 62193 750 ## Slin_4581 hypothetical protein 53 27 Op 10 5/0.000 + CDS 62201 - 63190 614 ## COG0673 Predicted dehydrogenases and related proteins 54 27 Op 11 . + CDS 63187 - 64407 791 ## COG0477 Permeases of the major facilitator superfamily 55 27 Op 12 . + CDS 64426 - 65751 982 ## Acid_2075 hypothetical protein 56 27 Op 13 . + CDS 65744 - 67126 1173 ## COG0534 Na+-driven multidrug efflux pump + Prom 67187 - 67246 5.4 57 28 Tu 1 . + CDS 67323 - 68330 554 ## PROTEIN SUPPORTED gi|90020579|ref|YP_526406.1| ribosomal protein L22 + Term 68355 - 68405 10.0 58 29 Tu 1 . - CDS 68402 - 71302 2234 ## COG0642 Signal transduction histidine kinase 59 30 Op 1 . - CDS 72540 - 73571 857 ## COG0388 Predicted amidohydrolase 60 30 Op 2 . - CDS 73614 - 75023 1512 ## COG0471 Di- and tricarboxylate transporters - Prom 75213 - 75272 6.3 + Prom 74837 - 74896 2.0 61 31 Tu 1 . + CDS 75050 - 75247 134 ## - Term 75327 - 75363 8.0 62 32 Tu 1 . - CDS 75376 - 76251 658 ## COG0583 Transcriptional regulator - Prom 76290 - 76349 11.0 + Prom 76282 - 76341 11.8 63 33 Op 1 . + CDS 76374 - 78011 1615 ## BMD_0797 malonate decarboxylase subunit alpha 64 33 Op 2 . + CDS 78031 - 79578 1458 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 65 33 Op 3 . + CDS 79590 - 79913 297 ## gi|266619667|ref|ZP_06112602.1| sodium pump decarboxylase, gamma subunit subfamily 66 33 Op 4 9/0.000 + CDS 79942 - 80388 268 ## COG0511 Biotin carboxyl carrier protein 67 33 Op 5 . + CDS 80402 - 81565 1115 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit + Term 81592 - 81631 9.1 + Prom 81606 - 81665 5.8 68 34 Op 1 . + CDS 81736 - 83187 1314 ## COG0513 Superfamily II DNA and RNA helicases + Term 83196 - 83227 1.5 69 34 Op 2 . 
+ CDS 83287 - 84294 800 ## COG0657 Esterase/lipase + Term 84347 - 84394 5.4 70 35 Op 1 . + CDS 84441 - 85496 1079 ## gi|266619672|ref|ZP_06112607.1| conserved hypothetical protein + Prom 85498 - 85557 1.6 71 35 Op 2 . + CDS 85588 - 86406 851 ## COG0789 Predicted transcriptional regulators + Term 86424 - 86469 10.6 + Prom 86442 - 86501 4.2 72 36 Op 1 7/0.000 + CDS 86537 - 88273 1912 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 73 36 Op 2 . + CDS 88251 - 88967 850 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 74 36 Op 3 . + CDS 88982 - 89128 59 ## gi|266619676|ref|ZP_06112611.1| conserved hypothetical protein 75 36 Op 4 . + CDS 89125 - 90513 1703 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 90583 - 90626 7.6 76 37 Tu 1 . - CDS 90491 - 90607 58 ## - Prom 90640 - 90699 5.5 77 38 Op 1 38/0.000 + CDS 90640 - 91527 926 ## COG1175 ABC-type sugar transport systems, permease components 78 38 Op 2 . + CDS 91539 - 92372 1018 ## COG0395 ABC-type sugar transport system, permease component 79 38 Op 3 . 
+ CDS 92383 - 93720 1442 ## COG3119 Arylsulfatase A and related enzymes + Term 93823 - 93871 13.1 Predicted protein(s) >gi|229784123|gb|GG667612.1| GENE 1 193 - 1689 1801 498 aa, chain + ## HITS:1 COG:CAC0317 KEGG:ns NR:ns ## COG: CAC0317 COG0642 # Protein_GI_number: 15893609 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 3 490 1 492 498 340 38.0 3e-93 MKLKLRTRLAVAFVIITVVPLALIYAVVAGLSNYQMKSFRKAYNLTEQVNLFSGNSLQIF NRMTQVEQKEIEKILDSSPEKFSDMEYMQKLNEKLSNKYAYMIVRKDDQIVFDGNSHVTQ ELLDQLPKYEDVDSSIDGAIYLDGDTQHLVKQMDFLYPDEGKGSVFIVSNVDGLLPEIKA MMAEMLLAVILIIVFTDAILMMWVYSSVVHPLGRLQEATKKIRDGNLDFALEVENDDEIG QLCQDFEEMRMRLKENAEEKIQYDKENKELISNISHDLKTPITAIKGYVEGIMDGVASSP EKLDRYIRTIYNKANDMDKLIDELTFYSKIDTNKIPYTFSKINVASYFRDCVDEVGLEME ARNIELGYFNYVDEDVMVIADAEQMRRVINNIVSNSVKYIDKKNGIINIRIKDVGDFIQI EIEDNGKGIAAKDLPNIFDRFYRTDSSRNSSQGGSGIGLSIVKKIIEDHGGRIWATSKEG IGTEIHFVLRKYQEVIAE >gi|229784123|gb|GG667612.1| GENE 2 1686 - 2375 1064 229 aa, chain + ## HITS:1 COG:CAC0321 KEGG:ns NR:ns ## COG: CAC0321 COG0745 # Protein_GI_number: 15893613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 228 1 229 230 317 69.0 1e-86 MSRILIIEDEESIAELEKDYLELSGFEVEIENDGEAGLAKALHEDFDLLILDLMLPGVDG FEICRKVREVKNTPIIMVSAKKEDIDKIRGLGLGADDYITKPFSPSEMVARVKAHLARYE RLIGSGTPDNEIVEIRGLKIDKTARRVWVNGEEKSFTTKEFDLLTFLAQNPNHVFTKEEL FSKIWDMESIGDIATVTVHIKKIREKIEFNTAKPQYIETIWGVGYRFKV >gi|229784123|gb|GG667612.1| GENE 3 2390 - 3163 886 257 aa, chain + ## HITS:1 COG:no KEGG:Closa_3969 NR:ns ## KEGG: Closa_3969 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 257 1 252 252 392 71.0 1e-107 MKEKLYEIPMNDAMDEDDECPFCFVERKTEQELMDFVLGSCASYMESDTREATDKAGFCR VHQKKMFDYGNALGNGWILKTYYKKLIQEMKAEFKDYTPGKTSMMDRITGKAGKRDGGNG 
NPIEKWVSEKERTCYICDRFNQVYERYIATFFHLYKKDAAFREKVKNGKGFCLHHFAELC AGADRFLNDAERKEFYPMIFEVMERNMERLSADVDWFIEKYDYLNRDADWKQSKDAIQRG MQKIRGGYPADPVYRMK >gi|229784123|gb|GG667612.1| GENE 4 3263 - 3904 634 213 aa, chain - ## HITS:1 COG:no KEGG:Closa_2593 NR:ns ## KEGG: Closa_2593 # Name: not_defined # Def: Methyltransferase type 11 # Organism: C.saccharolyticum # Pathway: not_defined # 24 213 4 193 194 314 76.0 1e-84 MNGLDNSSKKAETDQLIIQMGTEQPAAEETWKQIWTRKGKAEGGIENLLAFDGYERTQVN MKEVAAEISRRLDIQKEDKVLEVGCGAGALAQYLDCDYTGIDYSPTLVRRHIELLHNPVL VGEAANLPFKDKTFDKVICYGVFLYFDNKKYAQRATEELLRVAKKGVLIGELPIRSHRTE HLLFTPEEFEGWDISDGFYDPYRKDRFNAVLTF >gi|229784123|gb|GG667612.1| GENE 5 4030 - 5322 1080 430 aa, chain - ## HITS:1 COG:lin2192 KEGG:ns NR:ns ## COG: lin2192 COG0534 # Protein_GI_number: 16801257 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 3 427 13 438 443 157 29.0 4e-38 MKLHKEFFHCVIPSMLAFALSGVYAIADGFFVGNALGDSALAAVNLAYPLTAFLQAVGTG IGMGGAIQYAISAGTENHKRCRQYFGMSLFMLIISGLLLSALLLFSAPVVLRFFGASGTI GELAQEYILYIAFGAIFQVLGTGLVPFIRNMGGAVSAMAAMIAGFLTNIFLDYLFVWHLP FGMMGAAAATVIGQAVTFLVCAAFFIAKKKIPLLRFGGNGAFLFKTILKTGLSPFGLTFS PNITLILINKSAALVGGAAAVTCYAPVSYISAIIILLLQGVSDGCQPLISRSYGEGRHDR TRQFRNLAYCFSAMVSLLCMGLLFLLRGQAAALFGASHQITARVADILPVFLSGYLFVSI SRVTTAYFYAADKDLQACMMIYGEVIILFLLLLILPPIIGISGTWLSVPLSQLAAMCISI LLIVREKKSA >gi|229784123|gb|GG667612.1| GENE 6 5309 - 5758 280 149 aa, chain - ## HITS:1 COG:no KEGG:ELI_2545 NR:ns ## KEGG: ELI_2545 # Name: not_defined # Def: putative MarR family transcriptional regulator # Organism: E.limosum # Pathway: not_defined # 1 135 1 135 148 72 32.0 4e-12 MCDILTEQLQHYTVLWRETSALYEEWAKRHALSYYELLVILSIMNPDGPCLQKDICTHWQ LPKQTVNTILKNFAGRGWITLVPSAEDRRGRVILPTGTGRLFMEATVSDLQAHEKSVWQR MGQENARALLESTALYNKLFKETDSNETA >gi|229784123|gb|GG667612.1| GENE 7 5986 - 7134 803 382 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619609|ref|ZP_06112544.1| ## NR: gi|266619609|ref|ZP_06112544.1| conserved hypothetical protein 
[Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 382 1 382 382 767 100.0 0 MRELEVQLSNKDIRRMMSDMRLKKDGMKESKILAVMLIIAFISLIYTVGVGLLLGGHDPS ILWWLTDLCAGIGLWLGIAVVLAVLIKTVQYFRIIRGPLMHRQLVRFEEHRIEMYTEEGV SCYPYASILYAEKTRHQILVYMKRIGTVKTLLTLPDSAFTGEDEMDCCLAFLKEKQQQEA FMDPLDQQAEIISPEEQIYSFAFIQEEAEWLDALTAGKYYMMRTGFALKMPESMTLILAL FLLTGVGILSFIRDRDPVTAVVYGIIMILFTGISYQILCSRRGIYRGVKRALKRGKTVPD RTGRQVISFGRSGISLCTDKEQWDLTYPMIYRVVESKKEVFIFTKGPYFLNIPVWAFQTE REKQEVLDCLRAHGICVLQKNI >gi|229784123|gb|GG667612.1| GENE 8 7325 - 9010 2005 561 aa, chain + ## HITS:1 COG:BH2384 KEGG:ns NR:ns ## COG: BH2384 COG0513 # Protein_GI_number: 15614947 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus halodurans # 4 555 6 524 539 485 46.0 1e-137 METLRFDELQLDERILRAVADMGFEEASPIQAQAIPVQMEGRDIIGQAQTGTGKTAAFGI PLLQKVDPKSKKLQAIALCPTRELAIQVADEIRRLAKYMHGVKVLPIYGGQDIVKQIRSL KDGTQIIIGTPGRVMDHMRRKTVKFDHIHTVVMDEADEMLNMGFLEDMETILSQLPEDRQ TVMFSATMPQAIADIAHKFQKEPVTVKVVKKELTVPKVTQYYYEVKPKTKVEVMCRLLDM YAPKLSVVFCNTKKGVDELVQALQGRGYFAEGLHGDLKQIQRDRVMNSFRNGRTDILVAT DVAARGIDVDDVEAVFNYDLPQDDEYYVHRIGRTGRAGREGIAFSFVVGKEVYKLRDIQR YCKTKIIPQAIPSLNDVTGIKVDKILENVADTIEESDLSEMINILEKKLLEEDYTSLDLA AALLKMMMGEENEDIIDTREPRSLDELDSYYRGENRNGNGRGRGRNGGGRDSRYEGGRED MARLFINIGKNQNVKPGDILGAIAGESGMPGKMVGSIDMYDKYTFVEVPRENADAVLQAM KDVKIKGKNIHMEKANGGKGK >gi|229784123|gb|GG667612.1| GENE 9 9337 - 10704 1220 455 aa, chain + ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 5 453 10 457 458 462 52.0 1e-130 MEWKKNDRIQVKIEDMGDTGEGIGKTDGFTWFVKDAVIGDEVEAKVMKTKKSYGYARLEK IINASPNRVTPPCPVARQCGGCQLQAMDYEEQLKYKERKIYNNIKRIGGFDEVPMLPVMG 
MEEPWRYRNKAQFPWGTDKDGKIITGFFAGRTHAIIENEDCLLGIEENREILKIIKNHLE RYHIRPYDEASHSGLIRHTLIRKGFHTGELMVCQVINGKSLPHQEELTEQLLKVPGMTSI SVNINREQTNVILGNQVINLYGPGYITDYIGGVKYRISPLSFYQVNPVQTEKLYSTALEY AGLTGGETVWDLYCGIGTISLFLAQKAKKVYGVEIVPQAIDDARENARINGMENVEFFVG KAEEVLPREYEKNQVYADVIVVDPPRKGCDEVCLDTIVKMGPKRVVYVSCDSATLARDMK YLAERGYEVVKVRGCDMFPHSTHTECCVKLERKEK >gi|229784123|gb|GG667612.1| GENE 10 10727 - 10828 138 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTAETLDVVYEQVFVLPNEEKIGFGRNICLEQL >gi|229784123|gb|GG667612.1| GENE 11 10810 - 11589 276 259 aa, chain + ## HITS:1 COG:no KEGG:BCB4264_A0436 NR:ns ## KEGG: BCB4264_A0436 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_B4264 # Pathway: not_defined # 2 255 6 256 262 203 42.0 7e-51 MFGTVIIDAYRKEETIEIADAVEDLCSPNANYGWASAGIYCFWDYYIEEILYIGLAGDLA ERFKQHNGILPLKEGSKQKKIEEYFSKNERLGYTIFVQSPLSQPLVHRNQNMYKDFARQN NAPIEDMTSDEGRENIKIVEGILIEAYRRKYGHFPPWNDIGGSMVGQKRVMPNNINIVKS FCTPDNCYVNPIVSRSTIRELSQNPEWAWYENYLHAVRMNMLIHGMEYQEALEFVRKNDN IDTYNQILKSNYWKKKLVV >gi|229784123|gb|GG667612.1| GENE 12 11607 - 11696 70 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFLCTAGIKVDFQLIVFTRKKSACAGLVE >gi|229784123|gb|GG667612.1| GENE 13 11940 - 12713 341 257 aa, chain + ## HITS:1 COG:no KEGG:Swol_0509 NR:ns ## KEGG: Swol_0509 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 93 256 270 426 432 121 45.0 4e-26 MNCHDDFTNSDYIAFDEKAFITRIKIRIPSLFRDEYDRICAPLHDYEYDQYALLDLIEFF AQNIEDISERWNNERYRNYQTIDCLNTSDVFENFQEAINEIFIESGLLYELTDEKIIERI VENSPLTAEIENNFAVVHEVGTRELLKDAVALYKTPNPSARQDSVEKIWDAFERLKTYYI TLDKKHSSEKIVNDMAGGNDKFVDLFNDEFKMLTEIGNKYRIRHHETDKIDITDIRYYDY LFNRCLSLIALAIEYLM >gi|229784123|gb|GG667612.1| GENE 14 14612 - 15256 409 214 aa, chain + ## HITS:1 COG:NMA1039 KEGG:ns NR:ns ## COG: NMA1039 COG3943 # Protein_GI_number: 15793995 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Neisseria meningitidis Z2491 # 1 209 114 322 336 214 55.0 1e-55 
MKKGFALDDERLKNLGGGGYFKELLERIRDICASEEVLYRQVLEIYATSIDYDPKAEISI QFFKKVQNKIHYAIHGQTTAEVIYTRVDAEKEFMGLMTFAGNQPILKEAVIAKNYLNEKE LRAMGPLVSGYLDFAERQAEREQVMTMSDWAEHLERILTMSGEQLLQGNGSVTHKQAVDK ATSEYRKYKTRILSDVEKDYLNSIKILEQKTDKN >gi|229784123|gb|GG667612.1| GENE 15 15663 - 18929 1949 1088 aa, chain + ## HITS:1 COG:hsdR KEGG:ns NR:ns ## COG: hsdR COG4096 # Protein_GI_number: 16132171 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Escherichia coli K12 # 3 1083 23 1186 1188 693 36.0 0 MASNFEFLDQAFPVLSAFGRQAEAYCLNDSNSCLIKLGMIGETIVNLMFTYDRLPFPKDN TSISRIDTLYREGLLTNDLTDILHLMRKKRNKAVHENYESESDAKILLQMAHSLCQWFMQ TYGDWNYQQSPFIMPSDTTGIPAIIGEQEGDNGETLIRQAESVAIAAPFISPSERKKQAG RAAGMRQKSEAETRYMIDEQLRKVGWEADTETLCYSRGVRPQRGRNMAIAEWPTNSNVGK RGYADYALFIDKKMVAIIEAKAMHKDIPSVIDYQCKDYPRCIREEDADYQIGRWGAYKVP LTFATNGRPYMEQFKTKSGIWFLDLRKPSNIPRALHGWMSPAGIIELLEKDIDAGNQGLQ NMPYDLLRDKNGLNLREYQMKAIQAAEQAIIEGQQKILLAMATGTGKTRTILGMIYRFLK TGRFRRILFLVDRTALGEQAQDVFREVKLEELMTLDALYNINGLEDKVIEKETRIQVATV QGMVKRILYNDGDDMPAVTDFDLIIIDEAHRGYILDKEMDETEQLYRDQREYQSTYRTVI EYFDAVKIGLTATPALQTTEIFGQPVFRYSYREAVIEGYLVDHDAPHQLQTKLSTEGIHY KSGDEVTIYDPVTGELMNSELLEDELDFDIESFNRQVITENFNRAVLEEIARDIDPENPE EQGKTLIFAVDDQHADMIVSILKEIYARTEIDNDAIMKITGSVGGGNKKKVREVIRRFKN DRYPSIVVTVDLLTTGIDVPEITALVFMRRIKSRILFEQMLGRATRLCPEIHKTHFEIYD PVGVYDSLDQVNTMKPVVANPSVTFTQLLEGLEVMEEEQQVKRQIEQIIVKLQRQKASMD DTTREHFIDMAGGLDPNQFISEIQQQKPEEARKRLLAYYEMFQMLQDTKTRGRRPVVVSD AEDELISHKRGYGNSDRPEDYLDAFAKYVQTNINEIAALNVVCTSPRELTRESLKSLRLT LDREGFTTRQLNTAVSQMTNEEIAADIISLIRRYAIGSALINHEAKIRRAVARLKNAHSF SMQELNWIGKMEKYLLEESVLNVSVFDEDSRFKAQGGYVRINKVFQNQLESIVLELNEYM YDDGGRTA >gi|229784123|gb|GG667612.1| GENE 16 18926 - 20341 1077 471 aa, chain + ## HITS:1 COG:hsdM KEGG:ns NR:ns ## COG: hsdM COG0286 # Protein_GI_number: 16132170 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Escherichia coli K12 # 1 
466 1 509 529 441 47.0 1e-123 MNTQEIVSKLWNLCNVLRDDGITYHQYVTELTYILFLKMAKETGVEEQIPEEYRWDCLVS KSGMELRRYYRELLNYLGEECTGRIQEIYQGAATNIDEPKNLEKIITAIDKLDWYSAKEE GLGNLYEGLLEKNANEKKSGAGQYFTPRVLIDVMVRLMKPQVGERCNDPACGTFGFMIAA DKYVKEHNDFWGISADLAEFQHKEAFTGCELVHETHRLALMNAMLHDIEGQIMLADTLSN AGKQLKGYDLVLTNPPFGTEKGGERATRDDFVFSTSNKQLNFLQHIYRSLKPNGKARAAV VLPDNVLFADGDGERIRVDLMERCNLHTVLRLPTGIFYAQGVKTNVLFFTRGTTDKDNTK EVWFYDLRTNMPSLGKTNPLKTEHFADFEKAYEADDRRAVNDERWSVFTREEIVAKGNSL DLGLIRDDSVLDYNDLPDPIESAEEAAAQLEEAVDLLKRVINELKALTGEL >gi|229784123|gb|GG667612.1| GENE 17 21260 - 21424 57 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFFRIDFPPVMIENKDLTYQFVFLQQRYDFDEYSLVFPHSTHYYDPIYQNHHPQ >gi|229784123|gb|GG667612.1| GENE 18 21496 - 21711 92 71 aa, chain + ## HITS:1 COG:no KEGG:Amet_0849 NR:ns ## KEGG: Amet_0849 # Name: not_defined # Def: restriction modification system DNA specificity subunit # Organism: A.metalliredigens # Pathway: not_defined # 8 71 397 459 467 71 59.0 1e-11 MLIPTIEEQREIVNILNFFLGKEEQIKQNCLKLLEKIEEIKKSILSRAFRGELGTNNPDE ESSIELLKTIL >gi|229784123|gb|GG667612.1| GENE 19 21768 - 23198 1068 476 aa, chain + ## HITS:1 COG:no KEGG:Clocel_0831 NR:ns ## KEGG: Clocel_0831 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulovorans # Pathway: not_defined # 1 474 1 466 468 166 27.0 2e-39 MNLFDILNPNFFSPLTGVNKHRYADIIGLIWDRCSQNPLYSTEKSTLLDEVESYFTGLEQ IGEGNIPLDEEEISGEDKPEDTGSTTPRFWASHYLRRLKNTGWLEEKDGGYDDEARYAVN HRIIPIIQAFIDVISPKMVTYQGKLYKIYTLLSGIGTQANPYETVLKEVEEDMEDLNISL RQLASSIEEHMDQLTQGKTPVEILELFNAYEEKVVVGAYHRFKTSENLFYYRAELFEMLE RCEDEYKERLELDYAQIMGVPVDEAGLQIRRLLYKLRDEIKEMGNIIRIIDESHVLYRSR AVQRAQFLLLSDGSIKSKISTILQYYAASMESKEEMLEYDEGLASQIFQIYIQGYFCPES LQKPVTRRKPTEIEDMEALEPLDEEVLKREQERLLKIAREALTEENVNQYANQLLQGHEA VEASSLTEGEDEQMIKLIGLYTYSQSRERTYDIEVKNRVIKKGRTRFTDFTIEKKR >gi|229784123|gb|GG667612.1| GENE 20 23220 - 23798 530 192 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2801 NR:ns ## KEGG: Cbei_2801 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: 
not_defined # 4 178 12 187 201 94 34.0 2e-18 MAVDLREETANYLLNNCFLIGTLENMREKYFDVINHENDYRELFRPLGYTLVIHKALRVV QVISRYLSGRVELKKYESILLLILRLLYVQKRESLSTNENQVLVTVEEIKDEYDKLSFPR KLDQQLLTDGLRTLKRYNLAAAIDKLGDLTARIQVYPSVMLAMPETHIKEAAEKTREYLL LYTKAVEEGDQE >gi|229784123|gb|GG667612.1| GENE 21 23804 - 27106 1975 1100 aa, chain + ## HITS:1 COG:no KEGG:Clocel_0833 NR:ns ## KEGG: Clocel_0833 # Name: not_defined # Def: M protein-like MukB domain-containing protein # Organism: C.cellulovorans # Pathway: not_defined # 1 1096 6 1067 1068 372 27.0 1e-101 MTRVRLVNWHNFTDSILDIKMITYLIGVNAVGKTTIMDAVRYCLTTNKNFNTAGNRRSER TLQGSVHGKQRAVKVYTRPGHTVSYIGVEFYDEIKQKPFVITVRVESENPAQELRHVSQD WYLTKPGYNLEQLPYIIKNRPASREQFRLSGKGLEMAPNQTEARRRISRILGIGEADSPL GKKFHEVFHMGTSLEDINNIREFIYTYILPEPEMNLETLQGDMRELERLSEVLMEAQQRE KSLQEIIDCLDEGRRLDGRVRVVELLIEYARWQEAVEKDRYCELEITRNQRISVTAGEEL KVLEERKSALTRKRDEAIRNLGQNPENQALTYLQEKEEELKKQCRELRTAKDKLDRSVSM LKQLDEQLKIQDFVLGISDEITNIDISLEERKEQIHTLAAGLKQLEPEIKERGHQIWASI ETAEQELKGIGKRLLQLKSGKMVYPRDAEKLKQVINSELQKRGMPEEARIFCELLYMTDE SWQDTVESYLGSLRFHVLVPPRYYQVAKEVFVRMGREVGHAGLVDTIGLERDYHKAEFPE GDFLVGKIESKNPYARMYIKFLLKDTVCCEHESDLEQYRRSVTKDLLRYQNYCLIRMDKR EHYIGLNARKQQIDVLEKRQNTLLTEKRHSENSKVELERLEQLYYPFVQGNAMEDLYQHL DAPERLEETEQQRLNVRKEIEEYENNPILRAMFNQIDCLKNELEDLDKECVNKQASQQTA DQTILKMRQEQEKNRDYIQETRQQYESLLHQYSEYQEDALARYEEYAKTRSPAEIVKNQI NSNALAQLQSRRDNYINAILIPKQNDYNSHYACDYAPGLAGDGAFRMAHVSLVNIDLEKY KEDLRQAQIRCEARFRKDVLFRLKDDITTAKQQIKSLNRVMENLAYGEEQYRFYVDGCKE GELKAFYNIIMAEKNEEYSEDSQLSLFTETRDEAYETQVHDFMERIMVDAKEIAQARAEG KKVAFKPLSQYVDYRTYLDYDMYVKNLSTGYEVPLSEVSGDGSGGENQAPFYVAICASFL QIYEQSENSIRLILLDEAFNKMTSDRIAPMMKMFRDLKMQVLLISTVEKCSSIYPYCDLT YSILKVGNRNSIGLFDREQI >gi|229784123|gb|GG667612.1| GENE 22 27124 - 27870 244 248 aa, chain + ## HITS:1 COG:no KEGG:Clocel_0834 NR:ns ## KEGG: Clocel_0834 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulovorans # Pathway: not_defined # 3 248 5 246 390 99 27.0 9e-20 MDYTQKLIQEGISIYERRRYFVHEPEVKQRAINISINKLFPKYQDNHDHHEYRNVNAVVE 
QLVRDGILEAKTDQRGYYKIVRFRLEAVSYCYQFLKRKSVPEICRDLEHIIDIYDSPEQE ILHLFCQNQRTLLTEYRKLPYGIGFEEEKLEGILIALRGIERLQKETYIRNFSTAVYHDS KKFARFRNCVQSILFDYSERVVEKELILERFHLVDNPTYAMFKGDAKLFGEGLSIELGKL PGGIALAS >gi|229784123|gb|GG667612.1| GENE 23 28795 - 28956 63 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKICRPTVLLIYGLKKPSSDFIVKTPKIYYHSFSRLSIMVSRQVFYSNNLLTI >gi|229784123|gb|GG667612.1| GENE 24 29071 - 29271 216 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619623|ref|ZP_06112558.1| ## NR: gi|266619623|ref|ZP_06112558.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 66 96 161 161 123 100.0 5e-27 MGYCKELDENDRKMMGSKKLDDYRDVLMFMDEHNCKMEQESVMAAEIQIHVLGEMDETKN IGVTQM >gi|229784123|gb|GG667612.1| GENE 25 29817 - 30953 711 378 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01635 NR:ns ## KEGG: EUBELI_01635 # Name: not_defined # Def: multiple sugar transport system substrate-binding protein # Organism: E.eligens # Pathway: not_defined # 50 297 35 292 326 136 31.0 1e-30 MIGCSSIIAICAVTLSVWGCSKNTKPEIEVALWNLDVYDHYTEQVQSLVPDVKIKWVEGQ RNLDYYKQLAETESLPDIITVRKFSMKDSLSLNPYLMDLTRTEVASSYYDIYLENYKTSE GKVFWVPMSGTVDGIVANKELFEQYNIPVPSDFDSFISACQAFETYGIRGYSIDLSIDYN ALHLLQGMGIESLSSVDGIIWRKDYDDGKTNQLDSRVWLPAFEKLERLNTLKILDQDTMY SDDLLSYERFIEGTQAMVNISSESISKLLPGKEVEILPYFGENQNFLLTYPTFNAAITKA GGENSVKKAAAWKVLMAMTSSQAQEELNRYTDGLIPYKRDISFEYDGSMSMVQQYLERNR TYIRLGSEEFKPFRFQPS >gi|229784123|gb|GG667612.1| GENE 26 32042 - 34606 1816 854 aa, chain + ## HITS:1 COG:slr2104_3 KEGG:ns NR:ns ## COG: slr2104_3 COG0642 # Protein_GI_number: 16330590 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 457 705 1 246 270 163 40.0 1e-39 MILKKKEKDISNDKYKRIRNVTVTGILLAISGIIFLMTVFLKNMENNARKENEYHLLEIA HQVSDNINSKLNQNWSLLRTIDYDLSVTRMSEADTQKYRQNLVKEWKFSHIYLIDDAGNC SDEFGEKSRLITKDNAIALLRDRTDVSFLRRDINGNATLFFAIPVEKKQAGSENISAIAL EYKLDNILEILTISAFERKGICYVIDQNGSRLFNTQADNAIKDYNIFSYLEYASKDGKKS 
AEVIQDAMDSGTSGVTVYTKDGHQEYISYNPLENGKWTLLLFVGGDIIGENMNRFSRNVF TSCAVIVGLLIIISGIVFFRLNQLANRKRDADVYNRERMLNLLIERGNEVYMMYNKTEGC LEYVSPNLESVMGWSRQEAEQLFTEGENTDEEDIAGIKDEFIHWDGQGDFVGSIHQHRNR NNGNLSWIRLQVNPVYLSPEEVWIANLSDMTKEWEQRENLEQALMAANSANIAKSNFLSN MSHDIRTPMNAILGMAAIAEQYAGDPEKVINCMKKISYSSRHLLALIDEILDMSRIESGK MLLENKIFSLSGMLEGIVSMFQEQFKGKKLSFNMEKRSIKHDSLIGDEFRLSRILVNILS NAIKYTPEQGQIVFSVTELQAGKEGYARYQFVIKDDGRGMTEEFLKTIFMPFTRMEEKEG SYTQGTGLGMAIAKSMLDLMGGSINVESTLGKGSTFTVDVELETAEEGECREEKKEQNGG QTKRFDFTGKRVLIVEDNEINEEILKELLSIEGALTESAGNGKEAVDKLEQSAPGYYDLI LMDVQMPVMNGYEAAAIIRGLDRSDAETIPIIALTANAFSEDRSKALAAGMNAHVTKPIN MANLCSVLAEVFSS >gi|229784123|gb|GG667612.1| GENE 27 34710 - 35585 622 291 aa, chain - ## HITS:1 COG:CAC2394 KEGG:ns NR:ns ## COG: CAC2394 COG0583 # Protein_GI_number: 15895660 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 284 1 281 286 95 27.0 1e-19 MTLESYRNFVAIVECGSIMAAANQLLIAQPSLSNQLKNIEAYYGAKLLIRNRHKLELTDA GRVFYLHAREICQAEEKLQNEIHNKKTAFSELLKLSIPAGNSAYFLHHLFDDFRKAYPNV NFDFYEIPSDFAIPNVLNHVTEIGLIRSSLPGYASLAVYPYEKEEIVAVCSPDHPLVHKS DILDISQLADYPLAVPFDCLDLVRSSFGHRLLAPNISIITSSNATAIEWARSFGTVALAS LTTADFQHIKDLAVKHFSDENLFITSAFIVNKEYPLSRAARNFLGFHGVHT >gi|229784123|gb|GG667612.1| GENE 28 35763 - 36368 669 201 aa, chain + ## HITS:1 COG:FN2072 KEGG:ns NR:ns ## COG: FN2072 COG2252 # Protein_GI_number: 19705362 # Func_class: R General function prediction only # Function: Permeases # Organism: Fusobacterium nucleatum # 5 172 3 167 355 119 44.0 3e-27 MNALLTDLLAVFGVVLNALPQGLLALSFGFAAVPTAMAFFIGAVGNVVTGNVAPISFQAE TITYAGTSGRNRSERCTMIFIGAVILAVVGACGMLTRIVDFIGTDIANGMMAGVGLILTK AAVNMVKEDRVAGGVSLAAAVITYLLTRSSANALVYTIVISVIASSVASAVLNKEKKNIV IVDDTFKRQKFTINGTVIFAS >gi|229784123|gb|GG667612.1| GENE 29 37301 - 37759 458 152 aa, chain + ## HITS:1 COG:FN2073 KEGG:ns NR:ns ## COG: FN2073 COG0503 # Protein_GI_number: 19705363 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and 
related PRPP-binding proteins # Organism: Fusobacterium nucleatum # 4 139 42 177 177 146 58.0 1e-35 MACAPKLAEKIGDVDVIITAEAKGIALTYEISRLLGKSNFIVARKSIKSYMSGVVSVSVH SITTSGEQHLYLDGHDAECLRGKRVCIVDDVISTGESLYALQALVESAEGIVTKKAAVLA EGNAAERDDIIFLQKLPLFQKSENGEYVVKES >gi|229784123|gb|GG667612.1| GENE 30 38006 - 38578 421 190 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116492196|ref|YP_803931.1| acetyltransferase [Pediococcus pentosaceus ATCC 25745] # 5 184 3 185 185 166 45 3e-40 MKERLETDRLILRPWEAQDAADLYSYAKDDRVGPIAGWPVHTGVENSREIIKNVLSKEGT YAVVLKETMLPVGSIGFMAGKESNLEIPDTEAEIGYWIGVPYWGQGLIPEAVRELIRYGF EELNMEKLWCGYFDGNEQSRRVQEKCGFVYHHTNEKIPWVLMNDIRTEHVTCLTREQWEK AKTQAGREEA >gi|229784123|gb|GG667612.1| GENE 31 38713 - 39153 365 146 aa, chain + ## HITS:1 COG:no KEGG:Clole_1993 NR:ns ## KEGG: Clole_1993 # Name: not_defined # Def: isoprenylcysteine carboxyl methyltransferase # Organism: C.lentocellum # Pathway: not_defined # 1 143 48 190 193 135 48.0 4e-31 MEMTMKAATCAAPVAELISVFTGISLLPVFARYTGVLAAFAGVMCFAVSVYTMRDSWRAG IPEHDKTEMVTTGIYSISRNPAFLGFDLVYLGFLLMFFNPVLLVFTIFAVVMLHLQILQE EKFMADTFGSEYEKYRKHVFRYIGSK >gi|229784123|gb|GG667612.1| GENE 32 39301 - 39543 308 80 aa, chain + ## HITS:1 COG:no KEGG:Trebr_2404 NR:ns ## KEGG: Trebr_2404 # Name: not_defined # Def: hypothetical protein # Organism: T.brennaborense # Pathway: not_defined # 1 80 52 131 132 96 58.0 3e-19 MEKGIAGCYVCEESCSKGLLGKIKPLGFRTFIQRYGVEALLDCLERNEKNGVMYHREGIN GDYDKFENVEDLISFIQSGK >gi|229784123|gb|GG667612.1| GENE 33 39681 - 39845 162 54 aa, chain + ## HITS:1 COG:MA3726 KEGG:ns NR:ns ## COG: MA3726 COG1331 # Protein_GI_number: 20092523 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Highly conserved protein containing a thioredoxin domain # Organism: Methanosarcina acetivorans str.C2A # 1 54 1 54 697 96 79.0 1e-20 MDNHKNIPNRLINEKSPYLLQHAYNPVQWYPWGGEAFEKARLEDKPVFLSIGYS >gi|229784123|gb|GG667612.1| GENE 34 39884 - 41752 1567 622 aa, chain + ## HITS:1 COG:CAC3546 KEGG:ns NR:ns ## COG: CAC3546 
COG1331 # Protein_GI_number: 15896782 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Highly conserved protein containing a thioredoxin domain # Organism: Clostridium acetobutylicum # 1 617 63 673 677 535 43.0 1e-151 MEQESFENDRIAALLNREYVCVKVDREERPDVDAVYMSVCQAMNGQGGWPLTIIMTPDCK PFFSGTYFPPYARYGRVGLEELLTAVAGQWKADRETFLDSAGQIEAHLKAQERITMSAEP GVDAVHQAFRQFLGNFDKKNGGFGGAPKFPTPHNLIFLMEYGVREKKREALAMAETTLVQ MYRGGIFDHIGGGFSRYSTDETWLVPHFEKMLYDNALLVMAYVEAFGLTGRNGYKRVARR ILAYVEAELTDEKGGFYCGQDADSEGLEGKYYVFTPQEICRILGPDAGTDFCSCYGITER GNFEGKSIPNLLKNEAYEAVWENHESPDLKKLYDYRITRTRLHRDDKILVSWNGWMICAC AKAGAVLDDTNYLDMAVRAETFIHENLVRDGRLMVRYREGDSAGEGKLDDYACYILALLE LYRVTFQTDYLTRAAQWAETMVQQFFDRERGGFWMTAEDGEPLIVRTKETYDGAVPSGNS AAALGLYQLARITGETKWQDVLNQQLHYLAGAMEGYPSGHSFALLTMMNVLYPSRELVCT VSPDESGEALSILARRLAYLAETVPGLTVVVKTADNETELTKLAPYIGDYPLPEAGSLFY LCSGSRCMPPVKSLEELAGKWT >gi|229784123|gb|GG667612.1| GENE 35 41772 - 42608 454 278 aa, chain - ## HITS:1 COG:BS_yxeI KEGG:ns NR:ns ## COG: BS_yxeI COG3049 # Protein_GI_number: 16081005 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Bacillus subtilis # 1 251 55 303 328 218 41.0 9e-57 MGIGQDISRIIFADGVNEMGFAAAMLYFPGYAVYDPIASPDSRGLSITALEVVSFLLGMC HSADHAASLLRSVRIVGAEDSVTHSIAPLHWIMADQSGKCLVAEKTADGLHIMENPIGVL ANSPDLPWHLTNLRNYMNVTPDQPPEREWDSVKLTPFGQGGGTFGLPGDYTSPSRFVRTA WLKSHTPIPADRQAAVNTCFHIMNSVTIPKGAVMTSRGTPSYTQYTAFIDLSAKEYFFKT YDNSRITSAKLPSASGGDLKALPLVSLNKPTQFDRLEV >gi|229784123|gb|GG667612.1| GENE 36 42850 - 44181 1234 443 aa, chain - ## HITS:1 COG:BH1128 KEGG:ns NR:ns ## COG: BH1128 COG0733 # Protein_GI_number: 15613691 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus halodurans # 6 433 11 448 453 315 43.0 9e-86 MKTNNKWASKMGFILATAGAAIGLGNLWKFPYLMGRNGGFPFLVAYLFFICILGVPVMIT EMSLGRKTGKNPVLAYDTIHPHARIVGYFGVLAAFVILSYYAIIGGWIIKYFVSYATAFQ 
APADFAAFTAKPVEPLIWFFIFMLITGLICYFGVNGIEKASKFMMPALFVILIVIIIRGV TLPGAGEGLAFIFSPKFEAFNITSVSAALGQVFYSLSLCMGITITYGSYLNKEVSIPRSC MNIAALDTTIAVLAGIAIFPAVFASGLEPASGPSLIFVTLPKVFDALKGGTVFAALFFLL VLFAAVTSAVALLEVCASFVMGTWHWSRKKAVLLLATAIFLLGIPSSLSFGPLADISILN YNIFDFVCMLTDNIFLPLGGIFMCYYVAWKWSPKNLIAEIEQNGVRFRLAKIWIFLIKFI TPVMVAIVTITGFIAIYHTVSGR >gi|229784123|gb|GG667612.1| GENE 37 44445 - 45068 582 207 aa, chain + ## HITS:1 COG:no KEGG:Closa_2536 NR:ns ## KEGG: Closa_2536 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 205 1 205 205 253 59.0 5e-66 MGTQNRDTAAAILEAGRKEFMEYGYEKASLRRIAKEASVTTGAIYGYFAGKEALFAALTE DAAEELVELYTKVHSDFASLPPEEQPAMLNVVTEECVPWLVNYIYDHFEAFKLLLCCGAP GCGERFFDRLAEVEERSCHDFVAAVEQMGYPVPKLGDALIHIVCSTFFRQIQEFVDHDIP REEAVSCSLILSRFQHAGWKKILNLPS >gi|229784123|gb|GG667612.1| GENE 38 45216 - 47030 214 604 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 358 581 132 354 398 87 29 3e-16 MESSNRKIMTRLPETKKVTLMTFAGKYRFLTGAGCVLSGISSVIALAPYLCMWNVVKEVV LSWNTGLDNRHLVYWGWLAVASSLISMLLYFGALMCTHLSAFRTARNMKTVALHHLTRLP VGYFKQEGSGKLRRIIDDGAGQTETYLAHQLPDLAGAVVTPVAVLVLLLVFDWRFGLISL VPMGVGVFFLSRMMGTGLAECMRQYQNALEDMNNEAVEYVRGIPVVKTFQQSIFSFKSFH DSILRYRDWAVNYTISLRIPMCAYTVSINGIFAILIPAGILLAGDISRGASYTTAALDII FYILFSPICVAMMDKIMWTSENTMAARDAMNRVLKILKEEPLSEPPVPKKPKDYTIQFQD VTFSYRKGGIAALDKVSLTVPQGTAAAIVGASGSGKTTLVSLIPRFFDVSSGSITIGGAD VREIGTEELMKRVSFVFQDCHLFKDTLLNNIRAARPQAAEAEVRRALEAARCEDIIEKMP QGLHTVVGTKGVYLSGGEVQRIALARAILKDAPIVLLDEATAFADPDNEYLIQQGFEKLA EGKTVIMIAHRLSTICRADRIFVMEKGRIAEEGSHNELLATGGLYARMWEDYQRSAEWKV GGTA >gi|229784123|gb|GG667612.1| GENE 39 47027 - 48769 212 580 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 342 553 284 502 563 86 31 5e-16 MSKYLQKKFALSEQGGKDLTKAIVCCALTNISLMFPMGILFLFMERLTGPLIGLVPPDLG IGGYVGISVVLLAVIGFFEYLQYNATFLSSYQESADMRIRLAERLRMLPLSFFGKRDLAD LTTTIMADCAFVEKAFSHFIPELIGALISTVLIGIGLLAANWRMGIAVLWVAPVSILLAV LSRPLVDRMERRQKGKKIAASDGIQECIENIQDIRANNQRADYLKKLDRKIMDVESITVK LELLNGTLVTSAQMILKIGMATTVLVGASLLVSGSISFMIFLMFMLAATRLYQPLSGCLQ NLSAVYSTLLVVERMKTIEEQKIQQGREDADCHGYDIVFDRVGFSYKEGEPVLREVSFTA KQGQVTALVGPSGGGKSTSAKLAARFWDADKGRITIGGSDISKIEPETLLKSFSIVFQDV VLFNNTIMENIRLGRKGASDEEVLAAARAAQCEEFISRLPEGYQTRIGENGSTLSGGERQ RLSIARALLKDAPIILMDEATASLDVENETLVQEAISNLVKDKTVLIIAHRMRTVAGADQ IIVLKDGCVAEQGTPEKLLEENGIYRHMMELQNRSLSWSL >gi|229784123|gb|GG667612.1| GENE 40 48893 - 49531 708 212 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619640|ref|ZP_06112575.1| ## NR: gi|266619640|ref|ZP_06112575.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 212 1 212 212 394 100.0 1e-108 MREDKMENQKIRIIKKNNDFSLEYKPGDIFTVDSTWYGGVNVTSKSGIPLSLDREEYELY QEAEEPRREIDRYSYHLGAMDSFCEMVAAGVKKLAMSHPCATKEERDSFLPEVKRICDSY GILFYPEDEAFLTDLFPEELNRGTYNYLFYSTNEVLEAYLGLKEEQKRLMEDGTYTRQQS YETARQFGRLLSYTEEGIARLIEKTEKQKIEG >gi|229784123|gb|GG667612.1| GENE 41 49548 - 49709 96 53 aa, chain + ## HITS:1 COG:no KEGG:Thit_0244 NR:ns ## KEGG: Thit_0244 # Name: not_defined # Def: two component transcriptional regulator, AraC family # Organism: T.italicus # Pathway: Two-component system [PATH:tit02020] # 3 52 481 530 537 62 52.0 5e-09 MEEAGKLLEESTVNVKEIGKAVGYADSNYFAKVFKRTTGQSPTEYRMAIFQRT >gi|229784123|gb|GG667612.1| GENE 42 49852 - 50445 349 197 aa, chain - ## HITS:1 COG:STM0763 KEGG:ns NR:ns ## COG: STM0763 COG0583 # Protein_GI_number: 16764128 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Salmonella typhimurium LT2 # 4 197 88 277 283 77 27.0 2e-14 MTNIVAYRISDIFVSFQRDYPNYHLDIDGGSPYAVYQNLLNGKYDFAFLRYTDKTDLSEI TAIPFTTDRLAVACSRENPLASRTSVTAKDLERENILMFKEHSFMYDFINEACNAAGFKP 
NISLSVHRTENLVELASHNMGICFLMKKVALTFDNPNIAVVDLEPAYNCNIDLCYLKGKK LSAAAQNFMNFTRHFNK >gi|229784123|gb|GG667612.1| GENE 43 51393 - 51608 224 71 aa, chain - ## HITS:1 COG:PA2447 KEGG:ns NR:ns ## COG: PA2447 COG0583 # Protein_GI_number: 15597643 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 9 71 10 72 307 62 49.0 2e-10 MNTTYFKDFLTIAELGSYEKAAEELYTTQSTLTKHIQKLESELGTTLFDRTSRSVRLNEC GKIFLPYALAS >gi|229784123|gb|GG667612.1| GENE 44 51792 - 53477 1550 561 aa, chain + ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 60 440 27 388 511 127 26.0 5e-29 MRRDLRKTLRRSLAALCTISLAAAGMTGCGSKTADSAATTASTAGLTNEATAEGPVVNQK GSEDKTAANLEVINVGIEADPGDLSPWGPNNTGRTATTDCIYENLAHCIDGEIHGVLMKD YELAEDESSMTVHIYDNIYDSAGNHLTASDVKFSFEKCVEAGNIYGLSYIENVEVIDDYT VQFNFNTTLYVYDLETLFETFYVVTQAAYEASPDGMATTPVSTSPYKVSNYVSGYILTME KTDNYWQTDETLINPRDRANVNKINYYIITESSQMTTALENGSIDMSWAVRTDDLSTFQE GGAQADNFWVYQVPDNLVSQIFCNCSEDKLTSNVDLRKAIYYAIDANVILQSVYNGNGTV TYDVARPKCPDYQDSWETEDNYYHSDLEKAKEHLAAAGYKEGEVTLSLMCESTDAMSDTA VLIQAFLGQIGINVEINAVESSLLGTYIKDPTAWDISVISKAASNYVTTCWKNCFSQSYF TWGGTINFAYDDELQKRIDAARLLSTHTDETVQAAHEYIIDMAYGHGLVNYYNNLIIPAN CSEVVLSYKNAIMPGACTYTE >gi|229784123|gb|GG667612.1| GENE 45 53506 - 54459 763 317 aa, chain + ## HITS:1 COG:FN0397 KEGG:ns NR:ns ## COG: FN0397 COG0601 # Protein_GI_number: 19703739 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 308 1 307 308 242 45.0 8e-64 MGKYILKRLVMTLFVIVGAAILIFSIMYFVPGDPAMLLLGSEASPAELEAKRVQLGLTDP YLIRLGKFLYNTFFHFDLGTSWFRGIPVLTELTGRFPRTLFLGIMFVIISTAIALPLGIT AAIHQNGWQDRICMVVAMVCTSVPDFWLALMMVYLFSLKLGWLPSFGIERWQCYIMPVIA 
GSLHGIGQLARQTRSSMLDVCRSDFVTTARAKGLPEKKVIYSHMLPNALIPVITIIGGSF GRSIAGTIIIEQVFSMPGIGSYITTAITGRDYPVVQGCVIVLAIFIAIVMLLVDLVYAYV DPRIKAQYVAQSKRRTK >gi|229784123|gb|GG667612.1| GENE 46 54459 - 55343 689 294 aa, chain + ## HITS:1 COG:MA1246 KEGG:ns NR:ns ## COG: MA1246 COG1173 # Protein_GI_number: 20090110 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 29 293 16 280 288 244 48.0 1e-64 MTGYSMGAAARVGRKEGIVIQTIRRLRTNSTAMFGLIVLALLILLSVFAPVIAPYHYTEM DMLQLNAAPSLKHLFGTDSLGRDILSRLIYGGRYSIFLGFSASVLSMAAAIVLGSLAGYF GGWVDNVVLRICDVVQAIPGILLSIVISAVLGPGFFNTILALAIGGIPSGIRLTRAQILS VRSEEYLEAAASVNCSSMRIMFRHILPNILSPLIVGFTMGIGNTIMLASSLSFIGLGVQP PAPEWGAMLSAGRDFIRNYPWQIIFPGIFVFVTVLSINLFGDGLRDALDPKMKK >gi|229784123|gb|GG667612.1| GENE 47 55356 - 56336 780 326 aa, chain + ## HITS:1 COG:FN0399 KEGG:ns NR:ns ## COG: FN0399 COG0444 # Protein_GI_number: 19703741 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Fusobacterium nucleatum # 4 323 2 321 335 386 58.0 1e-107 MNGETKKFVSVKDLVIEYTSGGSVIHAVNGVSFDLAEGSTIGLVGETGAGKTSIAKAIMR ILPNPPARICGGEIFVDGEDILKKPEKDMLEIRGDKISMIFQDPMTALNPVKTIGEQIAE GILLHNPISRGDARDAAIDMLKMVGITEERYDEYPFQFSGGMKQRVVIAMALACKPKLLI ADEPTTALDVTIQAQVLEMISDLKKQLNTSMILITHDLGVIAEMCDEVAVVYAGQIVEYG KKADIFKHTSHPYTIGLFRSLPSLSGDERRLWPIEGLPPDPSNLPEGCCFSPRCPHATEE CKKVKAELKELKTGHYCRCLYAKRED >gi|229784123|gb|GG667612.1| GENE 48 56340 - 57296 772 318 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 7 309 14 324 329 301 49 6e-81 MSACVEVKNLKKYFDTPAGRLHAVDDISLKIEKGTTMGVVGESGCGKSTLGRTIVHLQDS TEGQIFLNGEDITHVQGKQLKRLHEKMQIIFQDPYSSLNPRMTIGETIQEPLMLSKRFSK 
ADLEKEVTKLMDKAGIEARLRNAYPHELDGGRRQRVGIARALALDPEFIVCDEPVSALDV SIQAQILNLLMDLQEADKLTLMFVTHDLSVVRHISTSICVMYLGQLVETAPSKKLFEMPV HPYTKALLSAIPSTDIDKPSSRILLKGELVSPINPKPGCRFATRCIYATEQCSQEQKLEE VENGHFVSCCRVKELNQM >gi|229784123|gb|GG667612.1| GENE 49 57322 - 57924 402 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619650|ref|ZP_06112585.1| ## NR: gi|266619650|ref|ZP_06112585.1| putative membrane protein [Clostridium hathewayi DSM 13479] putative membrane protein [Clostridium hathewayi DSM 13479] # 1 200 1 200 200 366 100.0 1e-100 MEKEYHSRSYHRFFEGFTEKEQVDENGKRTINRLYTGEYYRPGISEEERQKHKIFNVLLY LIAVVGYGGAAVQRVPINSVWYIALIQSLVLVSFLWLLWPLYYYLVSKQKMTVRVYKDSS LRLQYVSICSALLLEGVALLSILYLMTENSAMAVKSSFAAAGYAVSGLSLFVMFLYEKRV PYLTMENPAKKPEKGVTIRY >gi|229784123|gb|GG667612.1| GENE 50 57957 - 59249 1009 430 aa, chain + ## HITS:1 COG:no KEGG:Pat9b_4610 NR:ns ## KEGG: Pat9b_4610 # Name: not_defined # Def: hypothetical protein # Organism: Pantoea_At-9b # Pathway: not_defined # 3 403 33 433 450 206 30.0 2e-51 MKTDENLKGGAGIAEIKYPADFFPEEGFKGIHDPIHARALILESGERIAVISLELPSLRP YTLIDDMKETVRECTGIDHRNIWICMTHNLAAPHVPDENRTEYKYRIHMEAVKKAIHSAC ETALLNMEPVRIGTGTGYSDINVNRDIETPEGWWQGINPAGISDKTLTVIKFEGVTGKTA AVIYHYAVKSSAMAAAVMEDGSRYVTSDVTGAASRIIETKLEAPAIFFMGAAADQEPKRR AAFDQINGKGDMIHVNLGSAGFDLVEELGEILAGDVVETAGKIVCSENAPVIHLEHTSFW LPGQQFYQGGKPYHPMKNYEYIPSEDQELKVELLLFGNTVLFGLQPEATAIVGMKLREMN PDFVPLMAAMVNGGKDYLSDESAFDRMTFGAIHSVFARGAAELFLMKSKELLDRLRLEYL EESNDSKEIK >gi|229784123|gb|GG667612.1| GENE 51 59227 - 60783 812 518 aa, chain + ## HITS:1 COG:CC0799 KEGG:ns NR:ns ## COG: CC0799 COG2272 # Protein_GI_number: 16125052 # Func_class: I Lipid transport and metabolism # Function: Carboxylesterase type B # Organism: Caulobacter vibrioides # 17 505 10 496 515 183 30.0 7e-46 MIPKKLSETVVCRPDYPVACTKKGKIRGILSESTFIFKGIPYASARRFHKAEELPPWEGI KDALYYGYTAPQLVHTIAADERFIPHYYTVEDENCHYMNIWTQSLDHGSKLPVMVWFHGG GWKNGSSVEQFAYDGEMISKTCGLVFVTFNNRQNCFGALDLSSFGKEYEDSVMAGLSDVV AAMRWIQDNIGAFGGDPGNVTVVGQAGGAKRVLALMQTPEADGLYHKAAVGSCAGECMKV 
PEGITRKQIARRMGELTVRQLGLDYRTVGAIEKLPYGKIVEAVNSSEKLLKQEIPERFRW EPVADNQYIFEEPFQAGFRRETLKIPMMTGTSFGEMASNARVRTGSGNKNTWSESYTQKL LEEQFGSRARSVAAAFKKAYPGRTVADALFIDRMLRGKLTELSMKRAKAGGKVWNWLFDL ESPVDGGTVAWHCSETPFITGNSAYMEASFIAGITEELQEKMMYAWAAFARDGDPGSHGL PAWPQVTEDSVPTMIFGRKSSVRTDHDRDLLHILMEEA >gi|229784123|gb|GG667612.1| GENE 52 60814 - 62193 750 459 aa, chain + ## HITS:1 COG:no KEGG:Slin_4581 NR:ns ## KEGG: Slin_4581 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 5 390 70 440 494 143 30.0 2e-32 MKKQLFCGASKREITPGEDILPYMMGLGKAVFGMIQDKLYVRVIAFSNGEDQSLLVSFDL DKAPYPEENVRLLHERTGVPAENILFFAIHTHTAPLTGYRPFEPSHDITKKSLKVQEAVR EYEAFLTSQMEEAAAEALEKLRPARLGYRYGSSYINVNRCQDYSVMEEDNSTHTECSVGV NGAGPVDHTVFVMKIEDMAGNAIAFFINYAVHCVTMFLNDIGNGMSAVSSDIGGNVSKYM EQRYEGSVAVWSSGAAGDINPVLMTQTIYPDSVSGKPTGTCVKGLENSCLMLNAMAGRHF ADIIQIVKNIKCSVFEVEIGSAVEWSRTPASEGILKNPEYRNKTPDGYHIRLHLVRIGDI ALMGINGELYTTLGQKIREVSPMKNTIIVSHECSLLPDNPGYILDDETIYRCQMSGGGKI PVNGFCGVPDTIASALKSHTKIMFERVLCGNCLIPDFQE >gi|229784123|gb|GG667612.1| GENE 53 62201 - 63190 614 329 aa, chain + ## HITS:1 COG:PAB1139 KEGG:ns NR:ns ## COG: PAB1139 COG0673 # Protein_GI_number: 14521934 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Pyrococcus abyssi # 15 329 11 331 335 89 27.0 6e-18 MIKDQLGIACIGEWHVHGKDFADKLDKYPNCRVTAVCSDIPETGRAWAAHMGCRFEPDYR RVLENPEVDAVMITAATSDHGRLIVDAAKAGKHVFVEKALAVSNEEAYAIQKAVTENRIH FTMSDPVMKAPQMFVKRMVMEGTLGMITNARIRVVHPLGIMGQHKPQFYNKAESGGGTTI DLGCKAAHTLYQLFGKPFSACAMFSTYTDMGKANGIDENAVVLYRFPNGILGVAENGWAS PKYQYSLEVYGTEGCVTIHDAETAYRLNSGCRVTVQEAELPEPMIYPLDDWIDSIKKDRP NRNYGIEEAVHITEMITAAYRSEGMETAI >gi|229784123|gb|GG667612.1| GENE 54 63187 - 64407 791 406 aa, chain + ## HITS:1 COG:BS_ywoG KEGG:ns NR:ns ## COG: BS_ywoG COG0477 # Protein_GI_number: 16080698 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction 
only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 19 392 7 384 396 155 28.0 2e-37 MNLETRTTADPSQDQPKKIFNRCFVSIFLANSLMYLSQWMVNSLVAKYADYLGATASMVG IVSSVFALTALLFKIVSGPALDVYNRRIILILAMCVMGISYFGYSASSNVPMLVSFRLFQ GCGQAFTATCCLALASDTLPPDQFGTGIGIFSLAQAVMQAIGPTAGLALMNRFGYHMTFV ISGSIMVAAAILAACVNVPYKKEKKFRITMDSIFAKEALLPVILIFFLSTAFSVINSFLI IFAGKQGITGNIGLYFTVYACTLLFSRPLIGKLSDRLGMVRVAIPAMFCFALSFLVISFS TTLLSLLTAGLIAGFGYGACQPAIQTLCMKCVSQKRRGAASSSSYIGNDLGQLLGPNVAG AVIVAFGYNAMWRFMMLFIFTAVVLVIIFRKKIYLIEKNFRTGVVR >gi|229784123|gb|GG667612.1| GENE 55 64426 - 65751 982 441 aa, chain + ## HITS:1 COG:no KEGG:Acid_2075 NR:ns ## KEGG: Acid_2075 # Name: not_defined # Def: hypothetical protein # Organism: S.usitatus # Pathway: not_defined # 4 435 29 446 449 187 29.0 1e-45 MGKISAGAAKICITPPSEMMPAYFHGKMIFEGIYEDIYLRALVFDNGDRRMAFLSYESGD MARMEELRSAVKRECGLDPENVCFSAVHNHEAPTFANTHKGVKNIPEKLDWVMRYGDFII RQTVLCVNEAISRMKPARYCVSTGKSFINANRDQLFENGLWGQGRDFEGPSDKELAVLRF TGYDGKIIGALVNYAVHGTACYQGMDEKQEKYLIAGDLPGMISNYLEERYRDDGAVFLWT SGAAANQNPIFFSSYQKYEHDKSHSLQYSVGYGVWTLCRHLAETQAVDVIKVLERMGEGK EKLHAAIVDRKIILPGQVIRYPDDGETAVMARADTPFSGTIEDGEPVELELKLLTLDDYA FLGLNGELDCGTGLRLKEKSPLKNTFIITHTGERAGYLPDKQGYDNHTQEFYASNVKDGC TEEYLIPAVMEMFTDRFEDDE >gi|229784123|gb|GG667612.1| GENE 56 65744 - 67126 1173 460 aa, chain + ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 8 391 2 385 452 146 26.0 8e-35 MNNTIKKEKIGMDMTEGAVMKTLLLFSMPLIAANLLQEFYSMVDLMVIGQFAGSVGTVGV STGGEIADILTPAAMALAAAGQIYIAQLAGAGEEKRIKEAVGTLLSLMLGIAVVFSFLIQ LFCPRLLNLLNCPQEAMSEAVSYLRITALGIPFIFGYNGICGLLRGMGESKKPLVFIMIA AAVNIFLDLLLVAVFHMGTAGTAIATVLAQIGACAAAFAYLYLKREQFDFEWRLSYFKIR RNAAVILLRLAIPRIFQSCCIRFSLLWCNSNINSYGMIASAANSIGNKLQKISTTVLTAV DTGSGAMIGQNIGARKTERVKKIVLSTLAITLTAATIASAFALLCPRTLFGIFTMDKEVM ELGVTYLKIMVITFYLAALLGTFQAVVTGVGFASFGFAIGMLDGVVCRIGFSLFFVYVLD 
WGVKGFFMGHAMARLIPFILSFLYYISGRWKKRKLLIESS >gi|229784123|gb|GG667612.1| GENE 57 67323 - 68330 554 335 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020579|ref|YP_526406.1| ribosomal protein L22 [Saccharophagus degradans 2-40] # 2 316 10 313 331 218 35 1e-55 MRIKAVLLFGLAMVILTACGGRKEDQKMASAPEFVFTYAENQAEDYPTTQGARRFADLVE EKTGGRIKILVKADGDLGDEKTVIEQIQFGGVDFARVSLTPLAEFVPRLNVLQMPYLYTG REHMWKVLDGEIGDDFLNSFDGSGFVALSWYDAGARNFYSSTKPITKLEDMKGMKIRVQE SELMVGMIEALGASAVPMAYDKVYSALQTGLIDGAENNWPSYESTAHYEVAGYYTIDEHN RVPELQLISRVTWEKLTKEDQDIIRECARESSLYERELWEERERSSEKKVRDAGCQVVEL SPEEKARFQEAVTSMYGEYCAEYMDMIDAIAAAGK >gi|229784123|gb|GG667612.1| GENE 58 68402 - 71302 2234 966 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 408 808 250 655 676 171 32.0 8e-42 MGKKTLAEIESLTRKILTTYFCDSDMEFMISTFADDIIWLGGGELQKAEGKEAVAAVFRS GKDGMIACSMYDEQYHSIDLGGGSYLCEAVSRLLSKPESGAYLNTQQRATFVFREKGDGL ETVHIHNSVPFSEIHDDELFPVESGREAFERLKSALDTKKQEYEHQTRFLEQLYHTLPCG ILQFSTDPAHTIISVNPMVWKFYGYDSEAEYREAVSNPLQKVDLQDQEWISTLLDRLTLN GESASYRRHCIRRDGEEAWINVVMGRIINSNGMEVIQAVFTDITEQMRLEQAQEQERVLE NRFLRAAICTAYPLIVSVNLTQDTYHCFLNEQEEYQLSPDGCFTELYASCIPEIYSSYQA DFASTFSPEALQRRFSAGEQEIYMELQGQGQDGKYHWLSLHAIAVENPFNDDVLTICLVK FLDELRQEQARQEQLLRDALASAQAASRAKSDFLSRMSHDIRTPMNAIIGMSTIGQIKLH DSRTVKDCFQKIDTSSKYLLSLINDILDMSRIETDKMEIAHDLFDFRAFVEEINQIIYPQ TLEREIAYEMRHLEPLESHYIGDSLRMKQILMNLLSNALKFTQPGGSVSVDIAEEKRTNG YAYIKFVIEDNGIGMSEEFMDRLFQPFEQEAPGNARNNVGSGLGLSIVYSLVQLMGGTIS VTSHKQEGSRFTVTVPFQLVTDNEEQEWERKRQNLLKGFQVLVVDDDPSVGRQTALILDD IGAATRWVDSGIRAVREVENAMAEHWMFDIAMIDWKMPGMDGIETARRIRRLVGPDTMII MITAYDWRGIEEEAREAGIDYFIAKPLFRSTIYDTLLKLDRKEAPEPAIPDLQTQESVLV GARILLVEDNELNQEIAKTLLEMNGAVVDVAGNGAAAIDCFSSHDPGTYQAVLMDIRMPV MDGLEATRGIRALGREDSDSIPILAMTANAFEEDRKKAFEASMNGYLIKPLDVSVLIHEL EKAIRM >gi|229784123|gb|GG667612.1| GENE 59 72540 - 73571 857 343 aa, chain - ## HITS:1 
COG:RSc1823 KEGG:ns NR:ns ## COG: RSc1823 COG0388 # Protein_GI_number: 17546542 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Ralstonia solanacearum # 18 336 4 320 343 291 44.0 1e-78 MSKKEAVKEVTHTIGDTLPKLRAAAVQAAPVFLNRDATVQKVARLTKEAKDNGADLVVFP ESFIPTFPLWCLFLPPVDQHPFYKRLFENAVTVPGPAFNELQKIARDNSIFLSVGICEKS TTNFGTMWNTTLLFDREGNMIGHHRKLLPTWGEKLVWSFGDGSSLNIHDTEIGRIGALIC GENSNTLARYALVAQGEQVHISVYPPCWPTSRDKGNYADCLRVRTCAHAFEAKVFNICSS ASLDEDAMEQMSVGDPALKEWLHNQSWALTMIAGPNGQPCCPSIENNQEGIIYADCDIAA EITAKGIHDIAGAYQRFDVFQLHVNKTPREPAYFYDEGIGESR >gi|229784123|gb|GG667612.1| GENE 60 73614 - 75023 1512 469 aa, chain - ## HITS:1 COG:BS_yflS KEGG:ns NR:ns ## COG: BS_yflS COG0471 # Protein_GI_number: 16077824 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Bacillus subtilis # 3 352 9 365 478 98 26.0 2e-20 MTKETRKKLISEAAAIAVGLAIALIPAPGGLEQKAMWTMGLLIWAIINWMTNAIPDFVCI FIMCCTWVLLGIVPFTTAFGSFSGTTVWLLIAAMGIGAAVTKSGLLARVALWIMRVCPPT FNGQVLALLGSGVIIGPFIPSTIAKVSIVGAMATDIGEKLGFEDRSKGMAGMWSAMYAGY TLLSQAVLSASFFSYIIMGLLPQTVQAQFSWMFWLRAMLPWLIICTIASYFAIRILYKPA QAPKLTKEDVNKMLTDLGPMSRSEKITLIVLVACIFFWVLERTLNIPAAITAVLGMSALL ALGVITPKDYNSRINWSIIAFMGGAINLANALTTVGIDTWLGNTFGASMASLISNPFLFV FVIATVALLARFIIVDMTTCYTLFIVILTPFCVQAGMSPWIAAMASYCVVYPYFTKYQNI NFLAAFNSAGGDEKLSHTMLVPFCLVFHIVSILALTASVPYWKILGLIP >gi|229784123|gb|GG667612.1| GENE 61 75050 - 75247 134 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNDWAQQSVLYRGSIIPPDNHVSSSLLPDQFAIESKYLCLLPIAYCNSDKKYLLYKELYV QNARM >gi|229784123|gb|GG667612.1| GENE 62 75376 - 76251 658 291 aa, chain - ## HITS:1 COG:BS_yybE KEGG:ns NR:ns ## COG: BS_yybE COG0583 # Protein_GI_number: 16081119 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 15 286 1 276 278 184 33.0 2e-46 MDIHSLTYFKKVAELQHITRASEELHVAQPSLSRTISRLEKELGVQLFERSGKNIILNSY GEILLKHTNRILQELKDIEQEIGDAAGERSRTVTLSLYAASKLLPELVMAFKHEYPSIRL 
QIIQEDLTKNQPNGCDLTLFSSMQPCTQDYSVTLIEEEILLALPDSNPLSKRDNLNLSEV AGEEFICLQQGKSLRTITDTFCKIAGFEPSVVLESDSPETVRELIRAGIGISFIPSITWQ GMDTKNIALVPISFPQCRRYINLSWRSQGYLSPAAILFRDFVQDYFRAFQK >gi|229784123|gb|GG667612.1| GENE 63 76374 - 78011 1615 545 aa, chain + ## HITS:1 COG:no KEGG:BMD_0797 NR:ns ## KEGG: BMD_0797 # Name: mdcA # Def: malonate decarboxylase subunit alpha # Organism: B.megaterium_DSM319 # Pathway: not_defined # 3 541 11 548 550 682 60.0 0 MIWDQERRSKAERMEKAAGYARGKRVGTEEIVPFLESVIRPGDRVVLEGCNQKQAAFLAG ALKEVNQGKVHDLNMIIPSISRDEHLDLFDRGIASEINFSFAGVQSRRLAEMVAENTVKI GAIHTYLELYGRLYVDLTPNVCLIAADKADSNGNLYTGYNTEETPILAEAAAFKNGIVVA QVNEIVGEDDLPRVDIPGGWVDYIVKADEPYPMEPLFTRDPARIQEAHILMGMMTIKGIY AKHGVKSLNHGIGYNGAAIELLLPTFGDYLGLKGKICTNWVLNPHPTLIPAVEAGWVKQV CAFGGELGMERYTAARPDIFFTGKDGMLRSNRAAAQVAGLYGIDLFLGGTLQMDYEGNSS TVTGGRLSGFGGAPNMGNSTGGRRHTTKAWRDMAPLTGSMASGRKLVVQMTKSQSKYGPA FVPELDAVKIGRKAGMDAAPVMVYGEDVTHVVTEQGIAYLYQAQNAEERTKMLGCIAQGT PVGSMVGIGEIEQLRREGKVAYPQDLAISPDAAKRELLAANSLEEIADLSGGLYRVPEQF RKQSE >gi|229784123|gb|GG667612.1| GENE 64 78031 - 79578 1458 515 aa, chain + ## HITS:1 COG:PAB1769 KEGG:ns NR:ns ## COG: PAB1769 COG4799 # Protein_GI_number: 14521095 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Pyrococcus abyssi # 4 515 5 522 522 620 59.0 1e-177 MATKKQEELLNLKERLRMGGGEKAAEKQHLAGKMTARERLFALFDEGSFVETGLFVKHRC TNFGMEKKQSPGEGVVTGYGTVNGRLTYAAAQDFTVIGGSLGEMHADKIVKTQEAAMKAG APMICINDSGGARIQEGVDALAGYGRIFYNNTMASGVIPQISVIMGPCAGGAVYSPALTD FIFMVKNTSQMFITGPSVIAAVTGESVTAEELGGARTHNAISGVAHYMAEDEEDCIRQIK ELLGYLPQNNMETPPEMESGDDWNRQDDFLNELIPENSGKPYDMKTVIEALADNGRFMEY QEYFAPNLITGFIHMGGRSTGVIANQPKAMAGCLDINASDKAARFIRTCDCFHIPLLTLV DVPGFLPGTMQEHNGIIRHGAKLLYAYSEAVVPKITVITRKSYGGAYIGMCSRHLGADAV FAWPDAEIAVMGADGAANIIFSKEIKNASDGAARRKEKIAEYQETMMNPYVAASRGYVDD VIMPSETRKRIISALEAFRGKRVQKIPKKHGNIPL >gi|229784123|gb|GG667612.1| GENE 65 79590 - 79913 297 107 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|266619667|ref|ZP_06112602.1| ## NR: gi|266619667|ref|ZP_06112602.1| sodium pump decarboxylase, gamma subunit subfamily [Clostridium hathewayi DSM 13479] sodium pump decarboxylase, gamma subunit subfamily [Clostridium hathewayi DSM 13479] # 1 107 1 107 107 157 100.0 2e-37 MFGDSLTVADTLLISGFSILSVFLVLLLLSYMIDFCAFLVKRSETLPETGTGTAAPAAND SSDVLLAAAAVAACLSVEPDDIVVRRIRRVEDCGGTWAQSARLESIR >gi|229784123|gb|GG667612.1| GENE 66 79942 - 80388 268 148 aa, chain + ## HITS:1 COG:PAB1771 KEGG:ns NR:ns ## COG: PAB1771 COG0511 # Protein_GI_number: 14521093 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Pyrococcus abyssi # 82 147 79 144 145 64 54.0 6e-11 MGIKVFRIVLDGVVHEVEVEEIHRESRGGGQTAQGKTAGAEDGETRNMRPPVTAEAVSQN RPAGGAGAAAVQKGAFAGAERITAPLQGTILSVPVTSGQTVKCGEVLVIIEAMKMENEII APRDCTVTSIITSKGAAVAAGDPLIEIS >gi|229784123|gb|GG667612.1| GENE 67 80402 - 81565 1115 387 aa, chain + ## HITS:1 COG:SPy1177 KEGG:ns NR:ns ## COG: SPy1177 COG1883 # Protein_GI_number: 15675149 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Streptococcus pyogenes M1 GAS # 1 386 1 375 376 431 66.0 1e-120 MSEILLEFWKSSGLSQLFQCDTVLFGLHLPGELIMIGIACLFLYLAIHKGYEPYLLIPIA FGMLLVNLPRTGLMAAAVGDENGGLLYYLYQGTDLGIYPPLIFLCIGAGTDFGPLLANPR SLLLGAAAQLGIFLTFAGAIMLGFTGPEAASIGIIGGADGPTALYLTSRLAPHLLGAIAI AAYSYMALVPIIQPPIIRLLTTEKERKIKMEQLRAVGKTEKIIFPIIVTLFTVLLLPSAG ALIGMLMLGNLIKESGCVPRLVDALQNSLMYIISVFLGITVGAKASADAFLRVDTLKIIV LGLAAFSVGTAGGVLLGKVMCRLSKGKVNPMIGAAGVSAVPMAARVVQKEGQKYNPSNFL LMHAMGPNVAGVIGSAVAAGMLLLFFR >gi|229784123|gb|GG667612.1| GENE 68 81736 - 83187 1314 483 aa, chain + ## HITS:1 COG:BS_deaD KEGG:ns NR:ns ## COG: BS_deaD COG0513 # Protein_GI_number: 16080962 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus subtilis # 10 483 4 
479 479 433 48.0 1e-121 MPENKNGNGFLPYALAPEIIHALDLLGYQEPTEIQREVIPAVLEGRNVAARSRTGTGKTA AFGIPLCEKIVWEENLPQALVLEPSRELAVQVSNELFHIGRRKRLKVPAVFGGFPIDKQM RTIKQKSHIVVGTPGRVMDLIQRECLKLAGIRYLVIDEADLMLDMGFLEEVKQIMALLPK GGRVLLFSATLDEKVRLLVDEYMADAVMIAPEPDAEEEPAISQVVYRADPEEKYDLLKRT LILENPESCMIFCGTREMVNVLLQKLKRDRIFCGMIHGEMEQRERLKTVDAFRRGGFRFL IATDVAARGIDFDKIDLVVNYDFPGGRETYVHRIGRTGRNGENGRAVSLLCESEERMLHM VEAYMGRELSVTTCPEPDEEATKAFWTRQRERRAPKAQKGAALNAGITRLSIGGGRKSKM RAVDIVGTICSIDGIGAEDIGIIDIRESLTYVEILNGKGQAVLEKLQEKTIKGRLRKVRI ARR >gi|229784123|gb|GG667612.1| GENE 69 83287 - 84294 800 335 aa, chain + ## HITS:1 COG:RSp1108 KEGG:ns NR:ns ## COG: RSp1108 COG0657 # Protein_GI_number: 17549329 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Ralstonia solanacearum # 27 311 50 332 339 238 45.0 1e-62 MNQRVLLEPAAETVCIENSVPPLIFELPPEEGRRVLEEAQDAPVCLYPAQVEASMVDTGQ WGRIPVYRVTPVNSGFAEHVIFYIHGAGWVFGSFHTHEKLVRELAARTGCTVIFPEYSRS PEARYPTAVEQCYSVLCQLPDLFEQAGRRMNGNTLTVAGDSVGGNMAIAMTLFSKYRGGP VIHKQLLYYPVTNACFNTASYCEFAECYYLYRAGMMWFWDQYTTSENCRNQITASPLRAC TEHLEGLPQAMILNGQADVLRDEGEAYAEKLRAAGVPVTAMRFQAIIHDFVMLNALDETN ACRAAMDASTEWINRKNMEWYQDHAGQSADDCSIA >gi|229784123|gb|GG667612.1| GENE 70 84441 - 85496 1079 351 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619672|ref|ZP_06112607.1| ## NR: gi|266619672|ref|ZP_06112607.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 351 1 351 351 692 100.0 0 MNQHAYRKLRMERQHFTKPASKKQYGELFKQMSPVLCVYWSCPGSPPELLYRAAFDDSRY CYKMRESRTIIKGRFQNGNIAYIYADELELFAAVYRKDRPFTDTEQELYDLLQREGPMTI HVMKEFTGLLAKEITPALHRLQEKFLVFEDQTDNEWDRAWYPFESEFPDADLDRYTKEAA AEELVRRFVWLNVWIDAAMVRSFYRLPLKEIKGVLERLVSDETLEPYEGGYIRREDLPLL EGEPAEVPEKNHTVFVLHRNDFLVKSNEHWLKERFTHAEYDVLGYLLIDGEFEGCLLGHF RNGPFELEDVAVRMEKGAADALKEDIRNAVELLYDPETSPLKRYMGEALER >gi|229784123|gb|GG667612.1| GENE 71 85588 - 86406 851 272 aa, chain + ## HITS:1 COG:all0345_1 KEGG:ns NR:ns ## COG: all0345_1 COG0789 # 
Protein_GI_number: 17227841 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Nostoc sp. PCC 7120 # 1 113 21 133 135 83 39.0 5e-16 MLKIGDFSKLSRISIRMLRHYDEIGLLEPRLVDPATGYRYYGEEQLSAAGQIQNLKSMGF GLAVIREILNCCGRPEEMERFLRVKKKELEDQMDGIGTKLRLLDNTIEWIRKDGDMMDYQ VTLKTLPERYVASVRKVIPAYDQEQILWGILHEETGAQNLQMASPGYGMAVFHDEGYKEE DVDVEIQISVEGTYRDTEHVKFKTVPPILIASAVYQGSYSQITRVNEAVARWIHDNGYEY DGASFCIYHVSPHDTSDPEKLVTEVCYPVKKK >gi|229784123|gb|GG667612.1| GENE 72 86537 - 88273 1912 578 aa, chain + ## HITS:1 COG:BH0792 KEGG:ns NR:ns ## COG: BH0792 COG2972 # Protein_GI_number: 15613355 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 273 564 279 573 587 146 29.0 1e-34 MSRNFMKKIWKMGRYRDASIQKKLTLSTWVVAIIPIMIIFAIVFFVFTNVSVENSKRQSQ MLLDKTVEELDAYFAQAQEGMNLMVTDINVQNAIDNYVTGTYKEKLDLRDFIRNRLTNMS TVGRRTGTISIYIKDANQTFSRDFSDQDLGLACSGAPWFEALLSGEASFVREDGNSLQDG RPVWILASNIVSVKNGEVQGLVYVELDKKKMIQPFYDLVEGSGDEIICNGQEIVSDACRE GGHFVPLYSYSTALGQNVEFRLQLKELQRGYGAALVYFLIGMVVLIILIYLIDKLLADWF SRRIITLRDATREIAKGNLDVVVCDDHPDEIGELVQSLNTMVKDMKRLIESEYLVKIESQ QATLRALQSQINPHFIYNTLESISMLALIRDHYEIVDMAQAFSLMMRYSMEPSTLVAVRE EVENVRNFVTIQKIRFPDRFLVEYAIDEECMAEKIPRLTMQPLVENAFKHGFEDTPEHKR LLVSVLKRRGFLVIRIFNDGMAVPAERIARIRELLMPDNQEETLDCFALRNLSRRLKLLF GENSRVTLRSGKGIGTIVSLRIPLREEGEEDDEKDFDL >gi|229784123|gb|GG667612.1| GENE 73 88251 - 88967 850 238 aa, chain + ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 2 238 4 253 257 152 36.0 7e-37 MRKILICDDEAIIRQGIRKILETCYRDMEIRECSDGMEGYKTMAVWNPDAVITDIRMPVL DGLAMIEKAQADGITAQYVVISAYRDFEYARTAIRCGVREYILKPVNRFELTDCVSRLLD MKEPEDVEEETGTAPEESGSIGRAVQYIENNFYRNISLEEVSQVVHMNTAYFSTLFKKQT 
GKKYIDYVTDLRMEKARKLILNTDLKIVNIAEMVGYSSTKHFARIYKEKYGVTPNGER >gi|229784123|gb|GG667612.1| GENE 74 88982 - 89128 59 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619676|ref|ZP_06112611.1| ## NR: gi|266619676|ref|ZP_06112611.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 48 1 48 48 67 100.0 4e-10 MNKITKNYKNSTKTKIWVPSLLVILCYNESIKVKRNETDRNRKKEDFI >gi|229784123|gb|GG667612.1| GENE 75 89125 - 90513 1703 462 aa, chain + ## HITS:1 COG:SMb21595 KEGG:ns NR:ns ## COG: SMb21595 COG1653 # Protein_GI_number: 16264783 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 67 461 29 410 410 122 27.0 1e-27 MKKTLAVVLSAVMVSGLALSGCGGNSASATTAAPKETTTAEPAETKKEETKGEEAKGEAP AGEQVTITISNWLEAEEATSAIFKELLDDFMAANPDIRVESQAIPFNQYKDQVLIAGTSG NAADVIMGNSQMMSAFNGAGILAELDDLASKEVLDDIYPGYLSGTTYDGKVKALSWAPHP IAMYYNKDLFTKAGLDPEKAPATWDEMTEAARAIAKLGKDEAGNTLYGLGIPTGKVAHTG TVFNGIFYTYGGHFVDENGNVDLNNEGNVEALTWTKQLVDEKVIPAGLEIKDLRGLFASG QIGIIFDGDMGRSAFRSSSGLGEEFDKKMGIAVIPTGKTGRSETVYTEHQLGVFAKSEHK EAAVRLVEYLIGKDAMVKYHKSNAVLSARKSIATLPEMNEDSFMEVFNKQSETASPLPAT NAMFDNAMNEVTKAIERIVVNNEDIKTVLEETNKTIKEMYGQ >gi|229784123|gb|GG667612.1| GENE 76 90491 - 90607 58 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQDDLQPLLIITVTRAQSVRVIVQTDEFCSILIVRTFL >gi|229784123|gb|GG667612.1| GENE 77 90640 - 91527 926 295 aa, chain + ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 8 269 6 267 292 158 35.0 1e-38 MGENTRKKKKINWFMVMLLVPALAMLTVTIIIPVFVVVGMSFFHYNLLDMGNIKWNHFSN YKAVFSDAEFIRTFGRTLVYVCGTVFAQFFIGLGVALLLNSGSLKGRKVFRSLLFLPWTI PSLVVAVTWMFIFQPQYGIVNYLLGLGDYSVLGSPKTAMLGVIISAVWKQMPLMMIMLLS GLQTVPEDLKEAAEIDGATGVQKFWCITVPCIMPVVKTVTLTSIVSNFQMFVLFFTMTGG 
GPVRATTTLPLYTYETAFSGFNLGKGAAIGVCWLVFLVIFSTIYNRVLSSKEVEY >gi|229784123|gb|GG667612.1| GENE 78 91539 - 92372 1018 277 aa, chain + ## HITS:1 COG:mlr7227 KEGG:ns NR:ns ## COG: mlr7227 COG0395 # Protein_GI_number: 13476021 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 25 276 30 280 280 184 40.0 1e-46 MKKKTYRKILMYAAILAATLCFVFPIYWMLLTSLKPNEAILRLPPQFLPVNATLANYKGI LTDGKFLIFYKNNIIVSGITTLVTLTLAVLAAYAFSRYRFKGDKFVMMLFLSTQMFPAMT LLIALYNMYFRLGLLNTYTALVLACSTNALPMSVWILKGFFDTISKSLEEAAYIDGCSKG RTLVQVILPLVKPGILAVGLYSFLISWEDFLWGLTLVNKTEMRTLASGIAMTYLGEYNYD WGRVMAAAVGAAVPILIIFIFLQKYMIAGLTAGAVKE >gi|229784123|gb|GG667612.1| GENE 79 92383 - 93720 1442 445 aa, chain + ## HITS:1 COG:PM1682 KEGG:ns NR:ns ## COG: PM1682 COG3119 # Protein_GI_number: 15603547 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 4 430 3 435 453 462 52.0 1e-130 MKQRQNIVFFFSDQQRADTLGCNGQPLPVTPRLDRFACEDAVNFASAFTPQPVCGPARAM LQTGLYPTQTGCYRNAVSLPLNQKTLARYLREAGYRVAYVGKWHLASDEEENHYETIAVP AERRGGYDDYWMAADVLEFTSHGYGGYIFDKDGNKLDFTGYRTDCLTDYAVRYIEEYEDE KPFFLMISHIEPHHQNDRGDYEGPDGSREQFAGFVPPPDLTPGKGDWEQFYPDYLGCCHA LDRNFGRVVDALKARGIYEDTMVIYGSDHGCHFRTHADEVEKGGYDDYKRNSFEGTIHVP LLIKGNGFEKGKREEKVVSLIDLPKTILSAAGYDTAPLGLQGRPLSETEEPDWEEAVYIQ ISESFVGRALRTNRYKYVVHAPDCDPWKESGSTVYKEKYLFDLASDPLETVNLVNDRDYE EIRRGLKERLIRFGEEAGESFAVEG Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:30:19 2011 Seq name: gi|229784122|gb|GG667613.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld6, whole genome shotgun sequence Length of sequence - 116203 bp Number of predicted genes - 95, with homology - 90 Number of transcription units - 49, operones - 24 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 3 - 62 3.1 1 1 Tu 1 . 
+ CDS 201 - 1562 1180 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 1698 - 1758 18.3 + Prom 2683 - 2742 24.3 2 2 Op 1 . + CDS 2910 - 3899 648 ## COG2407 L-fucose isomerase and related proteins + Prom 3985 - 4044 8.1 3 2 Op 2 . + CDS 4084 - 5094 836 ## COG1609 Transcriptional regulators + Term 5128 - 5174 -0.4 + Prom 5121 - 5180 4.2 4 3 Tu 1 . + CDS 5249 - 5440 228 ## + Prom 6287 - 6346 80.4 5 4 Op 1 . + CDS 6367 - 7488 1053 ## COG1653 ABC-type sugar transport system, periplasmic component 6 4 Op 2 . + CDS 7528 - 8934 1036 ## COG3119 Arylsulfatase A and related enzymes 7 4 Op 3 38/0.000 + CDS 8949 - 9869 1018 ## COG1175 ABC-type sugar transport systems, permease components 8 4 Op 4 . + CDS 9886 - 10725 809 ## COG0395 ABC-type sugar transport system, permease component 9 4 Op 5 3/0.167 + CDS 10742 - 11596 903 ## COG0191 Fructose/tagatose bisphosphate aldolase 10 4 Op 6 . + CDS 11638 - 12690 1183 ## COG2222 Predicted phosphosugar isomerases + Term 12723 - 12767 13.5 11 5 Tu 1 . - CDS 12715 - 12783 58 ## - Prom 12915 - 12974 7.7 + Prom 12762 - 12821 6.7 12 6 Tu 1 . + CDS 12954 - 13679 777 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) + Prom 13825 - 13884 6.1 13 7 Tu 1 . + CDS 14047 - 14592 430 ## Calkr_0078 hypothetical protein 14 8 Op 1 . + CDS 14750 - 15424 698 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) 15 8 Op 2 . + CDS 15517 - 15693 232 ## gi|266619697|ref|ZP_06112632.1| conserved hypothetical protein 16 8 Op 3 . + CDS 15769 - 15849 78 ## 17 8 Op 4 . + CDS 15885 - 16082 307 ## Closa_3061 GCN5-related N-acetyltransferase 18 9 Op 1 . + CDS 17010 - 17255 179 ## Closa_3061 GCN5-related N-acetyltransferase 19 9 Op 2 . + CDS 17281 - 17829 501 ## EUBREC_0394 adenosine deaminase 20 10 Tu 1 . 
+ CDS 18998 - 20452 1604 ## COG0747 ABC-type dipeptide transport system, periplasmic component 21 11 Op 1 49/0.000 + CDS 21660 - 22610 1009 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 22 11 Op 2 44/0.000 + CDS 22607 - 23434 870 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 23 11 Op 3 17/0.000 + CDS 23476 - 24405 807 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 24 11 Op 4 . + CDS 24496 - 25101 225 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 25 11 Op 5 2/0.167 + CDS 25164 - 25907 804 ## COG1691 NCAIR mutase (PurE)-related proteins 26 11 Op 6 . + CDS 25908 - 27281 1191 ## COG1641 Uncharacterized conserved protein + Term 27286 - 27321 1.1 + Prom 27294 - 27353 1.6 27 12 Tu 1 . + CDS 27407 - 27697 371 ## Closa_3053 asparagine synthase + Prom 28599 - 28658 11.6 28 13 Tu 1 . + CDS 28692 - 29114 406 ## COG1606 ATP-utilizing enzymes of the PP-loop superfamily + Term 29283 - 29338 1.1 + Prom 29457 - 29516 7.0 29 14 Op 1 . + CDS 29536 - 30747 549 ## COG1373 Predicted ATPase (AAA+ superfamily) 30 14 Op 2 . + CDS 30812 - 30958 83 ## gi|225390338|ref|ZP_03760062.1| hypothetical protein CLOSTASPAR_04091 31 14 Op 3 . + CDS 30981 - 31085 67 ## + Term 31212 - 31267 13.3 + Prom 31423 - 31482 4.9 32 15 Tu 1 . + CDS 31632 - 32210 547 ## COG4209 ABC-type polysaccharide transport system, permease component + Prom 33112 - 33171 9.1 33 16 Op 1 7/0.083 + CDS 33230 - 33481 342 ## COG4209 ABC-type polysaccharide transport system, permease component 34 16 Op 2 14/0.000 + CDS 33499 - 34377 869 ## COG0395 ABC-type sugar transport system, permease component 35 16 Op 3 2/0.167 + CDS 34458 - 36044 1983 ## COG1653 ABC-type sugar transport system, periplasmic component 36 16 Op 4 3/0.167 + CDS 36073 - 37194 1425 ## COG0673 Predicted dehydrogenases and related proteins + Prom 37205 - 37264 5.4 37 17 Op 1 . 
+ CDS 37284 - 39551 2004 ## COG2207 AraC-type DNA-binding domain-containing proteins 38 17 Op 2 . + CDS 39564 - 40403 1096 ## COG1082 Sugar phosphate isomerases/epimerases 39 17 Op 3 . + CDS 40416 - 40946 547 ## gi|266619723|ref|ZP_06112658.1| putative sugar phosphate isomerase/epimerase + Prom 41848 - 41907 12.8 40 18 Op 1 16/0.000 + CDS 41962 - 42711 651 ## COG0673 Predicted dehydrogenases and related proteins 41 18 Op 2 . + CDS 42705 - 43532 849 ## COG1082 Sugar phosphate isomerases/epimerases 42 18 Op 3 . + CDS 43632 - 44984 1417 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 43 19 Tu 1 . + CDS 45044 - 45931 962 ## COG4989 Predicted oxidoreductase + Term 45988 - 46022 -0.6 44 20 Tu 1 . - CDS 45962 - 46879 856 ## COG1893 Ketopantoate reductase - Prom 46913 - 46972 7.0 - Term 47025 - 47059 5.5 45 21 Op 1 . - CDS 47176 - 47259 108 ## - Prom 47331 - 47390 7.4 46 21 Op 2 . - CDS 47538 - 48503 875 ## COG0583 Transcriptional regulator - Prom 48560 - 48619 9.2 + Prom 48561 - 48620 12.1 47 22 Op 1 36/0.000 + CDS 48666 - 49499 945 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 48 22 Op 2 . + CDS 49492 - 50085 527 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 49 23 Op 1 . + CDS 51008 - 51703 805 ## COG3839 ABC-type sugar transport systems, ATPase components 50 23 Op 2 . + CDS 51742 - 52800 1500 ## COG0687 Spermidine/putrescine-binding periplasmic protein 51 23 Op 3 . + CDS 52882 - 54759 1435 ## COG1001 Adenine deaminase 52 23 Op 4 . + CDS 54756 - 56105 1304 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases + Term 56294 - 56339 8.1 + Prom 56110 - 56169 5.8 53 24 Tu 1 . + CDS 56390 - 57520 1099 ## COG0628 Predicted permease + Term 57535 - 57585 10.1 - Term 57523 - 57573 4.5 54 25 Tu 1 . 
- CDS 57591 - 58910 1574 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases - Prom 58951 - 59010 7.8 + Prom 58915 - 58974 5.2 55 26 Op 1 . + CDS 59047 - 59877 779 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 59903 - 59945 2.0 + Prom 59887 - 59946 2.0 56 26 Op 2 26/0.000 + CDS 59972 - 60238 379 ## PROTEIN SUPPORTED gi|227872009|ref|ZP_03990394.1| ribosomal protein S15 + Term 60246 - 60283 8.0 + Prom 60375 - 60434 8.3 57 26 Op 3 . + CDS 60616 - 62268 1131 ## PROTEIN SUPPORTED gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase + Prom 63170 - 63229 22.0 58 27 Op 1 . + CDS 63363 - 63776 173 ## PROTEIN SUPPORTED gi|229213658|ref|ZP_04340006.1| SSU ribosomal protein S1P + Term 63788 - 63838 11.2 59 27 Op 2 . + CDS 63851 - 65479 1719 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold 60 27 Op 3 . + CDS 65502 - 67529 2138 ## COG4289 Uncharacterized protein conserved in bacteria + Term 67738 - 67784 4.3 61 28 Op 1 . + CDS 67940 - 70543 2753 ## COG0474 Cation transport ATPase + Prom 70547 - 70606 4.2 62 28 Op 2 . + CDS 70636 - 71937 1254 ## Closa_3037 hypothetical protein + Term 72166 - 72207 0.3 63 29 Op 1 . - CDS 71950 - 72744 657 ## Cbei_1303 hypothetical protein 64 29 Op 2 . - CDS 72722 - 73441 367 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit - Prom 73572 - 73631 5.3 + Prom 73493 - 73552 5.8 65 30 Tu 1 . + CDS 73583 - 73858 174 ## gi|266619749|ref|ZP_06112684.1| conserved hypothetical protein + Term 73896 - 73947 15.2 66 31 Tu 1 . + CDS 74095 - 76023 1754 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Prom 76038 - 76097 3.6 67 32 Op 1 . + CDS 76142 - 76501 345 ## COG3304 Predicted membrane protein 68 32 Op 2 . + CDS 76504 - 77328 1001 ## Clole_2712 hypothetical protein + Term 77533 - 77571 -0.4 69 33 Tu 1 . 
- CDS 77350 - 78213 685 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 78287 - 78346 7.1 + Prom 78198 - 78257 9.0 70 34 Tu 1 . + CDS 78439 - 79806 1718 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 79948 - 79982 0.0 + Prom 80097 - 80156 3.9 71 35 Op 1 38/0.000 + CDS 80313 - 81164 813 ## COG1175 ABC-type sugar transport systems, permease components 72 35 Op 2 . + CDS 81182 - 82006 877 ## COG0395 ABC-type sugar transport system, permease component 73 35 Op 3 . + CDS 82014 - 82865 856 ## Plim_0551 xylose isomerase 74 35 Op 4 . + CDS 82862 - 83914 1156 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 75 35 Op 5 . + CDS 83916 - 85229 1170 ## Gmet_0448 glycoside hydrolase family protein + Prom 86131 - 86190 17.2 76 36 Tu 1 . + CDS 86236 - 87222 561 ## COG0535 Predicted Fe-S oxidoreductases + Term 87347 - 87391 9.7 + Prom 87255 - 87314 5.7 77 37 Tu 1 . + CDS 87493 - 90168 2735 ## COG0480 Translation elongation factors (GTPases) - Term 90129 - 90187 15.5 78 38 Tu 1 . - CDS 90220 - 92052 1209 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily - Prom 92101 - 92160 5.5 + Prom 92128 - 92187 6.9 79 39 Op 1 19/0.000 + CDS 92218 - 92514 454 ## COG2127 Uncharacterized conserved protein 80 39 Op 2 . + CDS 92511 - 94829 1262 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 94963 - 95012 8.3 81 40 Tu 1 . - CDS 94826 - 95836 876 ## COG1680 Beta-lactamase class C and other penicillin binding proteins - Prom 95891 - 95950 3.5 + Prom 95829 - 95888 5.1 82 41 Op 1 . + CDS 96060 - 97457 1454 ## COG1653 ABC-type sugar transport system, periplasmic component 83 41 Op 2 . + CDS 97555 - 97872 325 ## SCAB_86671 sugar-ABC transporter transmembrane protein + Prom 98774 - 98833 21.0 84 42 Op 1 38/0.000 + CDS 98871 - 99380 573 ## COG1175 ABC-type sugar transport systems, permease components 85 42 Op 2 . 
+ CDS 99380 - 100234 765 ## COG0395 ABC-type sugar transport system, permease component + Term 100271 - 100338 29.6 + Prom 100372 - 100431 8.0 86 43 Op 1 7/0.083 + CDS 100462 - 102297 1897 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 87 43 Op 2 . + CDS 102294 - 103820 1728 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Term 103861 - 103896 4.3 88 44 Tu 1 . - CDS 104025 - 104876 979 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 104932 - 104991 80.4 89 45 Tu 1 . - CDS 105835 - 107208 1317 ## GYMC10_4473 transcriptional regulator, AraC family - Prom 107298 - 107357 4.9 90 46 Tu 1 . + CDS 107498 - 108358 815 ## COG4209 ABC-type polysaccharide transport system, permease component 91 47 Op 1 14/0.000 + CDS 109309 - 110118 708 ## COG0395 ABC-type sugar transport system, permease component + Prom 110130 - 110189 3.9 92 47 Op 2 . + CDS 110211 - 111773 2104 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 111803 - 111868 9.6 93 48 Tu 1 . - CDS 111859 - 112959 677 ## COG1073 Hydrolases of the alpha/beta superfamily - Prom 112997 - 113056 7.5 + Prom 112957 - 113016 4.5 94 49 Op 1 . + CDS 113199 - 114650 1335 ## COG5434 Endopolygalacturonase 95 49 Op 2 . 
+ CDS 114722 - 116201 998 ## Mahau_1687 hypothetical protein Predicted protein(s) >gi|229784122|gb|GG667613.1| GENE 1 201 - 1562 1180 453 aa, chain + ## HITS:1 COG:mll4149 KEGG:ns NR:ns ## COG: mll4149 COG1653 # Protein_GI_number: 13473518 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 60 399 28 354 408 100 26.0 6e-21 MIKRRMSVFLLACLLVLTTGCAKSQSTQEISTDSSEVTTESIPESIPETNTSEADSEVPV TIKIANYAVLEKGYEEFWQNVKTGYESKYPNVTVEWVTAPYGEILNQVINMAGGGDKVDC VFGEDIWIPSLVEAGLAAPMDQVLDSDFLNDYYPDALAAHSMDGTVYAAPLYLTPPVLFY NRELFKEAGLDPEKPPKTYDEMMGMAEKLSQLKSSDGNKVYAFGFPTGSVIVVGSYLQGL IKNFGGEILDADGKLDMKNPGFIQAFEFLQELDEKGYNPQNAKAKDLRNLFALGQLAMYY DKTGGINGVTSINPAAADFTATACPLAGGEGSGETVSQSHCFIAIDNGPEKKEAVKNFIQ YVITPEIMEDYMTNIAPAYPAKKAMEHMDGVNNSAFLAGARDTVNHLSPSLMFPTVADFN LELCALAQAVTMGDEEVSAAAADFEKSVQPLLP >gi|229784122|gb|GG667613.1| GENE 2 2910 - 3899 648 329 aa, chain + ## HITS:1 COG:TM0951 KEGG:ns NR:ns ## COG: TM0951 COG2407 # Protein_GI_number: 15643713 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Thermotoga maritima # 38 316 167 453 471 101 25.0 2e-21 MNAAALRRLGVSCHTIYGSREDDKAAEKVKNLLLAYGVVKSLKHTTLGLLGYRPTAFYNC AFDEGLIRRTFGVKIEETDLKVVFDKMADLPAEKYEADMAKMAEAYDTGKLPEGHLENHS RLYLALKEVMAEQHYDFAAIKCWPEMGNLHTTPCAVLGRLADDGITIGCEGDVDAELAQM TEYYLTGEPSFITDLINIDEEKNVITFWHCGNAAPSLFNKAYEVEIRNHPLAGQGTAFYG ALKPGKVTVARFCNIDGAYKLFLLRGEAVALDRCTKGVTASVKVERPVREIIEGIITEGI PHHYSIVWEDVAEAMKGICGLLNIPVIEM >gi|229784122|gb|GG667613.1| GENE 3 4084 - 5094 836 336 aa, chain + ## HITS:1 COG:YPO0108 KEGG:ns NR:ns ## COG: YPO0108 COG1609 # Protein_GI_number: 16120455 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 1 336 1 337 342 144 28.0 4e-34 MERNRERVTRADVAKAAGVSETIVSYVVNNNRYVAKEKRQRVEDAIAALHYRPNNVARAL KGKRSNQLLFIADQITNEYFSRIVSEMDKYAYEAGFLISLCANRNTPEFVSQVISRQYDG 
IIISSISFPEEYVRHFSKAGIPVLLFEHKKHDTLPENVATLASGLYSGARTGVNHLISRG KKNIIYIDRISRRGNVSGPSDLRLSGYLDELHDHGLEAGGDTVIAGCHTEEELTETVADY LRCHDQVDGMMGRNDMVACIAMQAAIGIGKRVPADIGVVGFDNSSISRFCSPKLTTLEMQ REEISRTVIDMMVQMIGGQMPENATFETKLIIREST >gi|229784122|gb|GG667613.1| GENE 4 5249 - 5440 228 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKRRLAALALACVMALTTGCSGGGGSTAGSTAGTEAKTEAGTKEEAKAADTGSGEKITL KIA >gi|229784122|gb|GG667613.1| GENE 5 6367 - 7488 1053 373 aa, chain + ## HITS:1 COG:SMb20634 KEGG:ns NR:ns ## COG: SMb20634 COG1653 # Protein_GI_number: 16265294 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 3 286 46 321 411 94 26.0 2e-19 MKAGYEEKYPNVTIEWVTAPYGEILNQVINMAGGGDRVDCIFSEMIWLPALVDSGLAVPM KDVLDESFLNDYYANILEAHSIDGEVYGAPLYVSPSLLFYNKDLFKKAGLDPEQPPKTYD EMLTMAEKLAALTTDDGNKVYAFGQPTASVIVIGSSIQAFAANFGGTVIDDNGKLTADDP AFKESLEMLKLLDQKGYNPQNAKPKDLRNLFALGQLAMYYDNSWGFNGAKSINPDVVNFA AAAEPLKGGNGEGASTLQSHCFVAVDNGPDHVEAVKNFIQYVITPEVLNDYIANIAPALP AKNSMKDMEAVKNSPILNGAGDSVEKAEPVYMFPTLSEFNLELCALAQAVTVGGEDVDTA IEGFKTAVQPLLP >gi|229784122|gb|GG667613.1| GENE 6 7528 - 8934 1036 468 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 3 433 28 474 497 168 27.0 2e-41 MAGKKPNVLFILTDDQGIWSMGCYGNSEIQTPNLDKLAKQGVRFDNFFCTSPVCSPARAS LLTGKIPSQHGILDYLSGGNGGASQAAIEFLKDHRGYTDILAEEGYTCGLSGKWHLGDGG HPQKGFSFWYAHQKGGGPYYNAPMFRNGQKIEEKGYITDVITDEAISFIDREKNKEQPFY LSVHYTAPHSPWINCHPKKYTDLYEDCPFETCPQGEVHPWAKTEVIAGYQKPRESLIGYF AAVTAMDDNVGRILKKLEEENLMEDTLIIFSSDNGFNCGHHGIWGKGNGTFPLNMYDSSV KVPLIMCHKGHIPENHVCDEMHSGYDFMPTPLDYLGFKNDEADKLPGKSFLSALMGQEQK GEENSVVVFDEYGPVRMIRSRKYKLVHRYPFGPDEFYDLEVDPGEAYNGIEDESYQDVIR DMKKQMELWFLQYVDPRIDGAKEPVMGGGQKDLAGVLGPGINVYGENI >gi|229784122|gb|GG667613.1| GENE 7 8949 - 9869 1018 306 aa, chain + ## 
HITS:1 COG:BH1245 KEGG:ns NR:ns ## COG: BH1245 COG1175 # Protein_GI_number: 15613808 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 43 304 176 436 445 178 34.0 1e-44 MEKGIKKSLWKNQTGDKIFAIALLIPAIITTVSFILVPVVDSIYRSFFDYKVRNIISGQP GVWNNFANYTKLFSNGKLIPSMTNTLTFVFGVVIAQFVLGMALALILNSNVKFSRFIRSI MMVPWVVPTLISGLVWLWMFQPQYGLVKYFVGVLTKGRITDFAILNNPATAMFGVSVAAL WKQIPLATLLLLAGLANVPEDMQEAAKIDGANGVQRFFRIVLPYMKSVIKVTVSMSIIEN FKQFPLFWTMTGGGPNNSTTTMAILSYREAFVSNNFGSGAAVTTVWMLMMIVVVYIYNRI FKSEDM >gi|229784122|gb|GG667613.1| GENE 8 9886 - 10725 809 279 aa, chain + ## HITS:1 COG:AGl3066 KEGG:ns NR:ns ## COG: AGl3066 COG0395 # Protein_GI_number: 15891649 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 279 40 306 306 176 36.0 4e-44 MKRKRILNGVIKYVVLIIILVFLLFPLLWVLLTSFKTNMEAYKFPPTFIVKNPTFQSYIN LFTVDNDFFTYYKNNFIVAGSTALLTTFMAIISGYALSRFHFKWNKWIIAAFTSAQMFPV VSRLISLYKILGDVHLINTHAGLILAIAAGQLPFTIMLMSSFYDGIPRELEEAARVDGAG RFGTLFRVVVPLVKPGMLAVGIYAFLLSWDDYLHASTLIQTDSLRTLSAGVALRYLGELS YDWSLINTISIVGTIPMVIVFFFFQKYMIKGLVAGAVKG >gi|229784122|gb|GG667613.1| GENE 9 10742 - 11596 903 284 aa, chain + ## HITS:1 COG:YPO0844 KEGG:ns NR:ns ## COG: YPO0844 COG0191 # Protein_GI_number: 16121152 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Yersinia pestis # 2 283 3 283 284 176 37.0 6e-44 MLVTNKENLIEASKNGYAIPALNTQGGNYDIIWAACRAAEEMKSPIILAHYVSTGAYSGH DWFVQVCKWCADKVSVPVSIHLDHGADFDICMEALKLGFTSVMIDGSTSPIEENAAMTNE VIKVAKCFGVPVEAEIGELLRLDNMGTVMENKNIVDPCEVKEFLGLCQPDSLAIGIGNAH GYYKGEPDIHLEVLEAVRKFTDIPLVLHGCTGMKEEIVKEAIKLGVAKINFGTEIRYKYV EHYEEGLKTLDHQGHSWKLSQFANDRLTEDIKDIIRLAGSEGKA >gi|229784122|gb|GG667613.1| GENE 10 11638 - 12690 1183 350 aa, chain + ## HITS:1 COG:TM0813 KEGG:ns NR:ns ## COG: TM0813 COG2222 # 
Protein_GI_number: 15643576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Thermotoga maritima # 43 349 41 330 330 115 30.0 1e-25 MEKRNSEITYKEIYLQPESFEAVNETLEDIYGVLDKVFAEDYDELIFTGCGTSLYLAQAA AHAFSSYNAIPSKAVCCSELYYFPETFVKNRKVLVLPITRKSYTTEVRMAIDKVRSYPGV KSLAITCDKDSALYNDFMILSPDTAEDSVIMTRSFTSMVYLAAVMAMYVGGKKEEIAAMA GYGDHARDALKKMDEMAKRIITEHEGLNLFITLGQGINYGVANECMNKMKEMGLSNSEAY FSLEYRHGPMSLVDENTLIILLSHSETVKEDGKLLAQMKEYGAVTAAIGNTASVDFKDAD YCLDLTYGYNDTQNAALIGFIGQFIGYYIAEKKGIDADSPRHLSQAIVIK >gi|229784122|gb|GG667613.1| GENE 11 12715 - 12783 58 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLNIPDKTTAEGTAKTTAPSVY >gi|229784122|gb|GG667613.1| GENE 12 12954 - 13679 777 241 aa, chain + ## HITS:1 COG:BS_pbpF KEGG:ns NR:ns ## COG: BS_pbpF COG0744 # Protein_GI_number: 16078075 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Bacillus subtilis # 46 241 56 252 714 170 44.0 2e-42 MKKIKVMRILKWFFVLTMVLCICSGFAVAGKGYSMYRNALDQMSLADKVESIQNKEHYTA FEELPEVYVNAVVSVEDHRFYRHSGIDLIAIGRAVFNDIRAGRFVEGGSTITQQLAKNLY FSQEKEMTRKAAEVFMAFDLERNYSKEEILELYVNTIYFGDGYYTVADASEGYFGKAPEE MTKYESTLLAGIPNAPSRYAPTKNPVLAEKRQMQVLRRMEDCGYFSAEEAETVAAQMVAI N >gi|229784122|gb|GG667613.1| GENE 13 14047 - 14592 430 181 aa, chain + ## HITS:1 COG:no KEGG:Calkr_0078 NR:ns ## KEGG: Calkr_0078 # Name: not_defined # Def: hypothetical protein # Organism: C.kristjanssonii # Pathway: not_defined # 24 178 20 186 189 103 38.0 4e-21 MREVPEESYSRKKCTSPGGEPLAERIDGRIYIMKRPKRWHQRIVRILKEMIEEYRKANPG FCEVRTAPFGLTLAGGMTYLIPDLCVILDRRKLTKCGCSGAPEIVIEVVAPESQYMDYRI KLTRYRKAGVKEYWIVDRDWDHGYVYYFDDGEMDLASFSGALGSRVCRGLVFDFSRLKRK K >gi|229784122|gb|GG667613.1| GENE 14 14750 - 15424 698 224 aa, chain + ## HITS:1 COG:BH2418 KEGG:ns NR:ns ## COG: BH2418 COG2176 # Protein_GI_number: 15614981 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) 
# Organism: Bacillus halodurans # 18 167 419 567 1433 121 42.0 1e-27 MAAADNRGKRLSKYVDNYVVFDLETTGISAVKDDIIEISALKVKNHEQVETFSRLVNPRR PIPAGATKVNGITDEMVQSEPGLEFILPEFLDFIDGEILIGHNIQSFDLLFLNRAADEVC RKAVLNDFIDTLFMARALLPGLSRHRLTDLADYFKISSEGAHRAFNDCVMNQICYEHMGK LLKDADIPVCPRCGGELTKRNGKFGPFYGCSNYPACRYTKNIVV >gi|229784122|gb|GG667613.1| GENE 15 15517 - 15693 232 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619697|ref|ZP_06112632.1| ## NR: gi|266619697|ref|ZP_06112632.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 58 1 58 58 104 100.0 2e-21 MEDKNLYNKVMLIAFCLILVLLGAGMYGRRVAAQRASENYVPRGTVQEVQVETEYLTV >gi|229784122|gb|GG667613.1| GENE 16 15769 - 15849 78 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDPGVSDQRAGSRRMKETEKEISKEK >gi|229784122|gb|GG667613.1| GENE 17 15885 - 16082 307 65 aa, chain + ## HITS:1 COG:no KEGG:Closa_3061 NR:ns ## KEGG: Closa_3061 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 65 1 65 172 68 49.0 7e-11 MTFKIVKAEETDYQLFASVIQEVYEALEQKEWFAADNAEYTYHMLKDCNGVGYKAVHEPS GAVAG >gi|229784122|gb|GG667613.1| GENE 18 17010 - 17255 179 81 aa, chain + ## HITS:1 COG:no KEGG:Closa_3061 NR:ns ## KEGG: Closa_3061 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 72 96 167 172 99 63.0 6e-20 MDSVAILPQYRGAGLQKRLMQHAERELTEQGYRYFMCTVHPENRYSRQNVISQGYEPVKS ALKYGGYLREIFLLDRSKELS >gi|229784122|gb|GG667613.1| GENE 19 17281 - 17829 501 182 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0394 NR:ns ## KEGG: EUBREC_0394 # Name: not_defined # Def: adenosine deaminase # Organism: E.rectale # Pathway: Purine metabolism [PATH:ere00230]; Metabolic pathways [PATH:ere01100] # 8 180 3 175 315 209 57.0 4e-53 MRDEWSEAFINALKNRDTERLRQIPKADLHNHFVLGGSRSFIRERKGLMIPALEGVLKSM QEMDQWNQEFIGKYFESSEGRRFLIEAAFVQAREDGVTVLEIGEDVWGLGQYFDHDVNAL 
TEAFFSARDKYAPDLELRLLVGLSRHCPVQWLMEQLEPFWGRPEFCSIDLYGDEMAQPIE TS >gi|229784122|gb|GG667613.1| GENE 20 18998 - 20452 1604 484 aa, chain + ## HITS:1 COG:MA1915 KEGG:ns NR:ns ## COG: MA1915 COG0747 # Protein_GI_number: 20090764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 50 482 41 476 553 332 44.0 1e-90 MKKRGWKLRTAVLLAAAAIAVTGCSAGKSQEATGAAGKEERTEQNPAGNTESAAEAEDTG KNSVIVVMGPSSEPEAGFDPAYGWGAGEHVHEPLIQSTLTVTTADLKIDYDLATEMNVSD DGLTWTVKIRDGVSFTDGEKLTAKDVAFTYNTLRDTSSVNDFTMLDSAEAVDDTTVVFHM KRAYSIWPYTMAIVGIIPEHAYGSDYGEHPIGSGRYIMKQWDRGQQVIFEANPDYYGKAP KMKKVTVLFMEEDAAFAAVQSGQVDLAYTAASYSDQTIPGYELLSFETVDNRGFNLPAVA AGGSDGSGNPLGNDFTADVLVRRAINIAINRDEMIENVLSGYGSPAYSVCDKMPWYNNAA ETTYDEEGAKALLEEAGWKEGKDGIREKDGKKAGFTLMYPAGDSVRQALAADTANQLKAV GIDVKTEGVGWDTAYDRAQAEPLMWGWGAHTPMELYNIYHTMKDTGLAEYSPYGNETVDR YMAS >gi|229784122|gb|GG667613.1| GENE 21 21660 - 22610 1009 316 aa, chain + ## HITS:1 COG:MA1913 KEGG:ns NR:ns ## COG: MA1913 COG0601 # Protein_GI_number: 20090762 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 8 311 25 325 330 334 55.0 1e-91 MAVLLLLVSAAAFFLVSASPLDPLKTNVGQTALGSMSQEQIAKLEEYWGMNTPPLKRFVS WAMDFFRGDMGTSLLYRQPVAHVISVKLKDSLWLMGTAWLISGAAGLLLGVVAGANRGKP VDKIISGYALVTASTPAFWLALVLLMIFAVWLKVLPIGLSVPIGVEAAGVTVSDRIAHGV LPALTLSITGLSNIILHTREKMAEAMESDYVLFAKARGESGRQIILRHGLRNIVLPAMTL QFASISEIFGGSVLVEQVFSYPGIGQAAVTAGLGGDIPLLLGITVISAAIVFAGNFTANV LYGVVDPRIRRGRART >gi|229784122|gb|GG667613.1| GENE 22 22607 - 23434 870 275 aa, chain + ## HITS:1 COG:MA1912 KEGG:ns NR:ns ## COG: MA1912 COG1173 # Protein_GI_number: 20090761 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, 
permease components # Organism: Methanosarcina acetivorans str.C2A # 5 275 16 285 285 259 52.0 3e-69 MKRRMNRRKMMLLLAGLSVLMLAVITVAGQLLSEAAMVTDFSRKNLAPCLAYPFGTDWMG RDMFVRTITGLSMSIRIGLLTAAISAVIAFLMGIVAATMGKIADGLIIGIIDLVMGIPHI LLLVLISFALGKGFWGVVAGISLTHWTSLARLIRGEVLQLKQSQYIKIAEKLGQGKLSIA FRHMTPHLIPQLFVGLVLMFPHAILHEASVTFLGFGLSPEQPAIGVILSESMKYLAMGKW WLALFPGLFLVFVVLLFHFMGSTISQLLDPAHAHQ >gi|229784122|gb|GG667613.1| GENE 23 23476 - 24405 807 309 aa, chain + ## HITS:1 COG:MA1911 KEGG:ns NR:ns ## COG: MA1911 COG0444 # Protein_GI_number: 20090760 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Methanosarcina acetivorans str.C2A # 2 305 10 321 325 287 48.0 2e-77 MEERKTVLSVEHLSVKFTQYERGFRQIELPAIRDLCLTVKEGEMVAVVGSSGSGKSLLAH AVMGILPYNAVTGGRIDYDGVELTEKRIRALRGREIVLVPQSVSYLDPLMKAGAQVRKGR KDSESRRKCREVLARYGLGKETEELYPFELSGGMTRRVLISTAVMEKPGLVIADEPTPGL HLEAAKRVLTHFREIADEGAGVLLITHDLELALEAADRIVVFYAGTNVEEASVSDFEEEE RLRHPYTKALFRAMPQHGFVPVSGSQPYGKDFPAGCPYAARCELADGACGKEVEYTGLRG GKVRCRKAV >gi|229784122|gb|GG667613.1| GENE 24 24496 - 25101 225 201 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 10 196 28 232 329 91 27 2e-17 MKLEARELSFRYDNGNRQILNKMSMTLESTDRLGLIAPSGFGKTTFCKILAGYEEPDSGE VLLEGKPLSAWKGYCPVQMIWQHPELSVNPRWKMGEVLKEGDQVEKRIIDGLGIELDWLN RYPSELSGGELQRFCIARALGQRTRFLLADEISTMLDLITQSQLWNFLLEEVKNREIGLM AVSHSEVLLERICTKIVRLNG >gi|229784122|gb|GG667613.1| GENE 25 25164 - 25907 804 247 aa, chain + ## HITS:1 COG:CAC0776 KEGG:ns NR:ns ## COG: CAC0776 COG1691 # Protein_GI_number: 15894063 # Func_class: R General function prediction only # Function: NCAIR mutase (PurE)-related proteins # Organism: Clostridium acetobutylicum # 2 244 5 248 248 243 50.0 2e-64 MDVKKVLESVSSGTLDVAEAEALLKDLPYEDLGYAKLDHHRKLRSGFGETVFCQGKPDKY LVEIMKRFYERDGEVLGTRASAEQFLLVKEEVPEIEYDTVSRILKVEKKDKERTGCVAVC 
TGGTADIPVAEEAAQTAEYFGCAVDRIYDVGVAGIHRLLSQRERLMKANCIIAVAGMEGA LGTVVAGLVDCPVIAVPTSVGYGANFHGVSALLTMLNSCANGISVVNIDNGYGAGYLATQ INRLAVK >gi|229784122|gb|GG667613.1| GENE 26 25908 - 27281 1191 457 aa, chain + ## HITS:1 COG:CAC0774 KEGG:ns NR:ns ## COG: CAC0774 COG1641 # Protein_GI_number: 15894061 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 3 448 2 416 420 267 38.0 4e-71 MGKILYLECNSGISGDMTVGALLDLGADQTVLEHALDSLGVEGYHLHFGRTVKCGLDAYD FDVHLEEEGHEHGHSHDCEEGHSHDHNHDCEEGHSHEHSHDCEKGHSHDHPHDHEEHHHN EHSHDQGEGHTHGHTHRNLSDIYEIIGRLDSNNAVKELAKNMFLIVAEAEAKAHGLPIDQ VHFHEVGAIDSIVDIISVAVCLDNLGIKEVVVSPLAEGHGQIRCQHGVIPVPVPATANIA SAHGLALRMTDQEGEMVTPTGAAIAAAVRTRDKLPDSCKIIKTGMGAGKKDFAHANVLRA MLLETEEAADDEMWVLESNIDDCGGEMLGFAMEMLLEAGAADVWSTPIYMKKNRPSYMLS VLCREDFIPKMEAVIFEQTTSIGIRRYSVARTVLGRRKAVAETRFGKAEVKICMGNGKTY FYPEYESVRAICREQNVDFQTVYSEVRRAAEEMESKQ >gi|229784122|gb|GG667613.1| GENE 27 27407 - 27697 371 96 aa, chain + ## HITS:1 COG:no KEGG:Closa_3053 NR:ns ## KEGG: Closa_3053 # Name: not_defined # Def: asparagine synthase # Organism: C.saccharolyticum # Pathway: not_defined # 1 94 1 94 276 134 71.0 1e-30 MTENTAEKKMKLEKKMAEYANQDICLAFSGGVDSSLLLKAAAEAVKKTGKKVYAVTFDSR LHPSCDLEIARRVAEELGGIHEVITVDELEQKEIPS >gi|229784122|gb|GG667613.1| GENE 28 28692 - 29114 406 140 aa, chain + ## HITS:1 COG:MA0240 KEGG:ns NR:ns ## COG: MA0240 COG1606 # Protein_GI_number: 20089138 # Func_class: R General function prediction only # Function: ATP-utilizing enzymes of the PP-loop superfamily # Organism: Methanosarcina acetivorans str.C2A # 5 136 131 264 281 123 48.0 8e-29 MHVYRPGIQALKELGIVSPLAELHVTKAEVKAIAAEYGISVASRPSTPCMATRIPYNTDI DYEVLEKIGAGEAYLRTMIGGNVRLRLHGDIVRIEVDLNAFEKVLEMREELIRKLKEMGF LYITLDLEGFRSGSMDVRIS >gi|229784122|gb|GG667613.1| GENE 29 29536 - 30747 549 403 aa, chain + ## HITS:1 COG:FN1382 KEGG:ns NR:ns ## COG: FN1382 COG1373 # Protein_GI_number: 19704717 # Func_class: R General function prediction only # Function: Predicted ATPase 
(AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 400 1 402 402 329 42.0 6e-90 MINRPIYVDKIMAYVNTPFVKILTGIRRCGKSTILRMLMEEMKKRGIRDNQILHYSFDSL EYEDIKTAKTLFTHLKQHLCPEGRTYLFLDEIQEVKSWEKVVNSLMTDYDVDIYVTGSNS RMMSSEISTYLTGRYIAFRIFPLSFSEYMIFRKEYTEVLDPRTELANYLRLGGFPAVHLQ KYTPNEVYTIVKDIYSSTIYTDIVRRNQIRKVDQLERIVKFAFDNVGRTFSAASISKYLK SENRSIDNETVYNYLSKLESAYILHRCSRFDVQGKEILKTQEKFYLADPALRYSVLGYSP DSVAAMLENVIYLELLRRGYDVYVGKLDNAEIDFIAVKQENKVYIQAAQKIGSPETERRE YGRLLDIRDNYPKYVLQTDAFAGGNYEGIKTMHIADFLLSDEY >gi|229784122|gb|GG667613.1| GENE 30 30812 - 30958 83 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225390338|ref|ZP_03760062.1| ## NR: gi|225390338|ref|ZP_03760062.1| hypothetical protein CLOSTASPAR_04091 [Clostridium asparagiforme DSM 15981] hypothetical protein CLOSTASPAR_04091 [Clostridium asparagiforme DSM 15981] # 4 48 40 84 84 76 86.0 6e-13 MALRERRIPFELSADPFYSESNLHYFDKKMEDYKAGRLNFSEHELIEE >gi|229784122|gb|GG667613.1| GENE 31 30981 - 31085 67 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGVWSRRVDEEHRLVYVVEDRAEVVISCRGNYDV >gi|229784122|gb|GG667613.1| GENE 32 31632 - 32210 547 192 aa, chain + ## HITS:1 COG:lin2117 KEGG:ns NR:ns ## COG: lin2117 COG4209 # Protein_GI_number: 16801183 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Listeria innocua # 10 191 2 189 309 145 43.0 3e-35 MGLNAASQKKGIWYELRKNRGLYLFMLPGILYTFIFGYVTLPYMIIAFEDFDYKKGIFSP WVGLENFRFFFESNRAWEVTFNTILLNALFLFFGTLVSVGLALILNEVKSKAFVRISQST MLFPYFISWVIVSYILYAMLSTEKGVFNNILTAVGLDRVRWYSRPELWRGILVFAKVWKS AGYTMIIYLAAS >gi|229784122|gb|GG667613.1| GENE 33 33230 - 33481 342 83 aa, chain + ## HITS:1 COG:lin2117 KEGG:ns NR:ns ## COG: lin2117 COG4209 # Protein_GI_number: 16801183 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Listeria innocua # 1 83 227 309 309 85 55.0 3e-17 MILLEVGKIFCGDFGMVYALVKDSGQLLPKVDVIDTYVFRMLRVTGDPSLSMAVGVYQSI 
MGFIMVFGSNWLVKKLFPEGALF >gi|229784122|gb|GG667613.1| GENE 34 33499 - 34377 869 292 aa, chain + ## HITS:1 COG:BH0795 KEGG:ns NR:ns ## COG: BH0795 COG0395 # Protein_GI_number: 15613358 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 13 291 38 321 322 202 39.0 5e-52 MVQRKRFSIAQGLIYLLLTCISLLCILPMLLVVAVSFTDESCIGRYGYSLIPKKLSLAAY QKLFYPGSSIYASCGITVFITVVGTAIAVLMTFMVAYPLANKNLKYRNQFAMFFFFTTVF NSGMVPWYLICRKLGMYNNILSLLIPSMIFTPFNMFLVRNYVATIPESLMESARLDGASE TRIGFQIYFPLCKPIIATITLFYGINYWNNWFNAVMLIDNKKLYPLQMLLFRLQSDISML SQMVTQGIQLDTPPAESFKIATVIVTIGPLVFLYPFLQKYFVKGIMVGSVKG >gi|229784122|gb|GG667613.1| GENE 35 34458 - 36044 1983 528 aa, chain + ## HITS:1 COG:lin2115 KEGG:ns NR:ns ## COG: lin2115 COG1653 # Protein_GI_number: 16801181 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Listeria innocua # 40 515 19 480 485 77 22.0 1e-13 MRRRVERIGALVVLAAMVCSLTACGSKPAESSPGTTGAKGSAAETAGGTAGEAPKKMRYV APGSDWEKEDEIIGLVNDKLQKDGVNIEVELIRIPWDAWDQKVNLMLSTGEEFELLHVMQ DRKSATVLRSQNAIAPLNDYLDNFPDLKETLSERWPEFTVNGEILAVPVKPNYYISRDYG RIFYRQDIFDKAGGKVPETVDEVIELGQKMQEILMEETGEKCYTWLHTLSRPATWLHRSY EEAPFIVDNTMGIAKCNMDGTVESFYESDIFKRDCEIYSRLYDLGLIDPDILTTDSEDRS NAADYGKFVFGFETIDYQSESTMTKNVGAYLGDFWLNPELGNVEYFGIYNGNAVPVSCTD PNVPLSFLDWLYKDAENYQLFMYGVEGETYKKVTDNTMETIFGADGKPLYKYDDWQMGNA DFRLYDATATDTLMKMYTTPITGTNVRTPMVGFNFDPVNVSNEVANLTNEIITSIYPIKF GVVDYDSNIDQAIQRLKAAGLDKVLEEYSKQYKEHYEANKDLVGVFEK >gi|229784122|gb|GG667613.1| GENE 36 36073 - 37194 1425 373 aa, chain + ## HITS:1 COG:BH2165 KEGG:ns NR:ns ## COG: BH2165 COG0673 # Protein_GI_number: 15614728 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 3 360 4 341 348 198 32.0 2e-50 MNKKYKAAVIGVGFIGAGSHVPAYQTDERCELVAVADLKPGAAEYVAKRYGVEKFYEDPQ 
QMLEECKPDVVSICVPNQYHKEWSIAALEAGANVLCEKPICVTLADAKEIYAAAERAGRI FMPVQNDRMGARTVLRDMIKDGALGDVYFGECENIRRRGIPNWGSFHIKKDNFGGPFCDV GVHFIDSILYMLGNPKFKSISGNTWSKIAPYETDRPNLGTVSKSKNFVPRPYSSSEFTVE DHSAGTLRLGEDILINFKFAWALNSPPFDGIRLNGTKGGLVFNKYAENPFMLYRSKGETM TDTILNLNVHHKFEHIQNPGIYLIVEHFLDVLDGKAEPIVTKEEALNVVTIIQAFYRSAE LKREVTAEEIEGM >gi|229784122|gb|GG667613.1| GENE 37 37284 - 39551 2004 755 aa, chain + ## HITS:1 COG:BH0483 KEGG:ns NR:ns ## COG: BH0483 COG2207 # Protein_GI_number: 15613046 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 290 750 302 767 769 94 24.0 8e-19 MKILIHSFFSQLKRFTLYRKLVLSFMTLLAVFSLLIVTMLYTLFSHVMHQEYKYMAQNSL VSADSVLLNSYDKLSGVINEVLSLNNLSVFFLRDEADRANEANIFLELRRLKNLYPYIEK ISLINLSGNRYLGTDGASDQVDDSIVEHYEQSGAGGLVSFRRTIPEKIRLTNSPCTEVTT FVFFPSGQYRDSAIVLDLKDSYFIDLLNPAALQPLDQIFLVTNESVLIGRTDLPEIDWRP ILEGIEEDNFRIMDIENTRCAVTAKQACIPGWKMVRLMPMAGSGILTRKFSLQLGAALFI FLLAVLYGIWLMIRRIYAPVKTMVREQVDGYDTGTVDEIGVICEELVKRSYEIEKLNSRL THSQPLFNGSWVKSQLLGDMETAKRVEKLALEQGMRAPESPYKCAVVLAVDRYQTYNSTV DVGERGLHDFAVSNILSELLGGRILGVLIRMDEQKLACMVACPAGSIDDEILEALKLFQK NFYEFFKYSVSIGLGIPAAAPEDMAVSYHQACSALNYRFYRGTGSLNPYVREWETKRSAY PVKEEREILEAILLQDSKALEKAVNGFRQSLGTPEETRIKGYMERCINSLVSGLAARSVI VEHQDIHGEVFGTDTEKELADVFAQWCETMMKRCGEEQKKDRPGEGNPIAFAQNYIQENY SRADLSVEYLARLVDLSPAYFGKQFYSTLRVSCSDYIADVRLENAAKFLRETSLPVQEIS ERVGVGSINYFYRLFKKKYGETPVNYRKKQTKQNF >gi|229784122|gb|GG667613.1| GENE 38 39564 - 40403 1096 279 aa, chain + ## HITS:1 COG:lin2265 KEGG:ns NR:ns ## COG: lin2265 COG1082 # Protein_GI_number: 16801329 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Listeria innocua # 7 257 4 220 246 91 30.0 1e-18 MEQNELKIGIQMWSIHNVCKELGFAKAFELVSKMGYQGWEFALGSSATLADRMERPIDLQ EVKQAAAENGIELLGCHVSTERLLEDPEPVIAECRELGLSYAAVGPCFYADRTPFDEQKK CYENTKRFAQMFRDAGIQFQVHGSAFGYLRDYRGRRTIDGMLEECGLDLLQPEFDTAWMI VGGVIPAVYLDKYKGHVDILHFKDFQPPMEDSDYILVRHNEICDNHRGCAVGDHGIQDIG 
TIIPAARACGTKWLMTELWNEPESLKNAEISIENLKKYL >gi|229784122|gb|GG667613.1| GENE 39 40416 - 40946 547 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619723|ref|ZP_06112658.1| ## NR: gi|266619723|ref|ZP_06112658.1| putative sugar phosphate isomerase/epimerase [Clostridium hathewayi DSM 13479] putative sugar phosphate isomerase/epimerase [Clostridium hathewayi DSM 13479] # 1 174 1 174 174 360 100.0 3e-98 MVKAGIVGFLPEDKSDVWEFFKWCADLGYKAIDMDLSYAAPGEDLTESASRLRNLGLEPL TMGCPGDFELAVGRVAEIAKKAHAQGIKRVTAYSSSISKSIHEGYGVHGTYDEMMRDFEY MNRLTELFDEEGLTFCYHNHYHEFITVYHGVNAFDHMLYNVDSRLKFDLDVGWVTS >gi|229784122|gb|GG667613.1| GENE 40 41962 - 42711 651 249 aa, chain + ## HITS:1 COG:AGl3148 KEGG:ns NR:ns ## COG: AGl3148 COG0673 # Protein_GI_number: 15891691 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 21 225 217 409 415 122 38.0 7e-28 MHAFLARNVWPDGARNDVKLENGGTLPFDMAAYYIHAMVHLFGPVQTVAGFADSYPFELR NPYNPRQGESAEVPGITSLYGALRFESGVCGTLVVTGESFMETPRVEVYGSEGTLICPDP NYYGGPVMLMRKGGSRFMEIPITHDHVTTNKFEGAPGTWGDSRRGMGVAEMAWAIRGERP HRCDVGLHLHALELIHGLDVCTREGTVYTMETRPARPAALKAGMIGKCHEAALADWQPDV IKWAGGRIC >gi|229784122|gb|GG667613.1| GENE 41 42705 - 43532 849 275 aa, chain + ## HITS:1 COG:CC1629 KEGG:ns NR:ns ## COG: CC1629 COG1082 # Protein_GI_number: 16125875 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Caulobacter vibrioides # 24 228 75 283 328 73 27.0 3e-13 MLKAGVVGFLPDDRSKTEETWKLLEKYAEIGYRAMDMDLDYAIPGGDLKEKYERVCSLGI KPIMSGASPEVFQDPAAKIKSLHIQEINQVCMYSSSMIASFRRGYGSNAGYDEVMQDFEF MNKAVLFFESEGIHFCYHNHYQEFTTCYNGISGFDLMLLHVDPRLKFNIDLGWAMVAGVD PVKLLKRLEGRIANVHLKDFYDLEAPKHILNSDPSTKVGFTSLGSGLLPVDPILEEMDRQ GLSYACVEQDVLRHLDRVQTLTASYLRMKESGYVK >gi|229784122|gb|GG667613.1| GENE 42 43632 - 44984 1417 450 aa, chain + ## HITS:1 COG:BH1923 KEGG:ns NR:ns ## COG: BH1923 COG2723 # Protein_GI_number: 15614486 # Func_class: G Carbohydrate 
transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Bacillus halodurans # 3 443 6 440 447 367 45.0 1e-101 MGFQKDFVWGAATSSYQIEGAAFEDGKGLSIWDVYAHQPGKVFEGHNGDVACDHYHRFEE DVKLMKQLGIKAYRFSISWPRILPDGIGTVNQKGLDFYSRLTDALLENGITPYVTLYHWD QPYELYLRGGWLNPDSPKWFAEYAAVVARALGDRVKNFITFNEPQVFIGLAFVDGVHAPG HKLPRREALSMAHHVMMAHGLASMEIRSIVPDAKIGYAPTSNVPVPVSSDPKDVEAARNA YFRMPENGDWSWNVSWWSDPVMLGNYPEEGLRILEKDLPVMGPDDMKIIHQKPDFYGQNI YRGIPTKAVPGGWETVPHSPGAPKTAINWHVDFDCLYWGVKFLYERYQTPVVITENGMSS HDWPALDGKIHDYARIDYLHRHLRGLKRAAEEGVDVAGYFQWSLMDNFEWARGYNDRFGL IYVDYATQERIPKDSFEWYRNTIMQNGENL >gi|229784122|gb|GG667613.1| GENE 43 45044 - 45931 962 295 aa, chain + ## HITS:1 COG:BS_ycsN KEGG:ns NR:ns ## COG: BS_ycsN COG4989 # Protein_GI_number: 16077482 # Func_class: R General function prediction only # Function: Predicted oxidoreductase # Organism: Bacillus subtilis # 1 288 1 293 300 235 40.0 6e-62 MERIKMGPIELSRVIQGYWRLTSWDWSPEKLADHMNACVENGVTTFDTAAVYGGGECERQ MGKALAFDPTLRSRIELVTKVGIVPKGENAAFSHYNTTYDHVMRACRDCLERLGTDYVDL LLIHREDPCLDPAECGRALKDLQAEGLIRAYGVSNFDPWKFQALNQATGQSLVTNQIECS PLCFEHFDSGMMDLLTGTGIHPMIWSPLAGGRLFTGSDEACVKVRAVLKELAEQYGTTES AIVYAWLAYHPVKALPICGSNRIDRLLEAVRGVEIRLEHVDWYRIYTASGQKVLR >gi|229784122|gb|GG667613.1| GENE 44 45962 - 46879 856 305 aa, chain - ## HITS:1 COG:CAC1605 KEGG:ns NR:ns ## COG: CAC1605 COG1893 # Protein_GI_number: 15894883 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Clostridium acetobutylicum # 1 301 1 301 301 189 35.0 7e-48 MKEIQSTAIIGMGALGLLYGNQIVSRLGSSGLRFIADRKRIQKYHSMEFTVNGEHRTFPM EDCETAAPADFVIVAVKYNALPEALNTMKNCVGPDTTIISVMNGISSEQMIGERYGREKV LYSIAQGMDAMKFGSSLTYTQAGELRVGALSPEQNNRLEAMTAFFDRAGIAYTVEDDILH RLWGKFMLNVGVNQTCMAYETTYSGTLVPGEAHDTMIGAMREVIRLARAEHVNLTEQDLD SYIDLLKTLSPDGMPSMRQDSLSHRPSEVEMFSGTVREMAKKHGLATPVNDRLYKRIREM ESQYT >gi|229784122|gb|GG667613.1| GENE 45 47176 - 47259 108 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNTLEVLTLVLVIFTVLAYLDNHQKRK 
>gi|229784122|gb|GG667613.1| GENE 46 47538 - 48503 875 321 aa, chain - ## HITS:1 COG:CAC1046 KEGG:ns NR:ns ## COG: CAC1046 COG0583 # Protein_GI_number: 15894333 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 307 1 301 303 229 40.0 7e-60 MELKEARYILAIARHKSISKAAEALFISQPSLSKYLKNLEQQLGTRLFDRIGNGYFPTYV GERYLHYAEKIVEYGLEWDTELDDIMHQNHGRLNIAIPIMLGNSIIGPTLPRFHKQYPHV TVNMMEAVNFVAEHTLTDHTVDLTFYNVHEFPRDLDYQIIGKEEIVLVLSASSPAAHKAV DKEGFRYPWLDLRLLKEEDFILLYPDQNTGGIALKLFSEYCMAPHVLMYTRNSEMSIRLA MEGVGAAFAPESYYHYIKRQESSSPAVRSSACFSIGAEKIENTLIAAYQKNRYLPQYARA YLDMIKEYCLYMDRKKIQGED >gi|229784122|gb|GG667613.1| GENE 47 48666 - 49499 945 277 aa, chain + ## HITS:1 COG:AGl889 KEGG:ns NR:ns ## COG: AGl889 COG1176 # Protein_GI_number: 15890561 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 38 256 92 310 326 165 38.0 1e-40 MKKSSAFVMVVPGLVILAVCLFLPLLRVLVPTLFAGEYPFNSYVDFFRDEYYLKIFARTV KVAVITTAACMAAGVPTAYFISRCGRKWRGILLAVSIFPMMTNSVIRSFAWINILGSNGI INRFLMAAGFADKPVKLLYTDFAIIIGSVYLFLPLMIVTVAGVMENIGDDMMEAAQSLGA NRFTAFLKVVFPMSLPGIIVGGILVFTGTLTAYTTPQLLGGNKNMVLSTFIYQRAMTVGD WDGAAVISLIMIIVTLIVIKGFNALANRLDRRGGSHA >gi|229784122|gb|GG667613.1| GENE 48 49492 - 50085 527 197 aa, chain + ## HITS:1 COG:AGl888 KEGG:ns NR:ns ## COG: AGl888 COG1177 # Protein_GI_number: 15890560 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 197 13 197 259 130 42.0 1e-30 MRKNKLLTIFAGLVFAFLIIPLIIITVTAFGGGNAITFPIESFSMKWFLNVFSLKSFRKS FLTSLEIAFLATFIALLVGIPAAYALARSGMKGKGILKSVFLSPTIVPGIVIGFVLYQFL VLSLRIPVYAGLLAGHFMVTLPYVIRVVGSSLDQFDFSVEEAAWSLGCGKISAFFKVVLP NITSGISAAFMLAFINS >gi|229784122|gb|GG667613.1| GENE 49 51008 - 51703 805 231 aa, chain + ## HITS:1 COG:APE1732 KEGG:ns NR:ns ## COG: 
APE1732 COG3839 # Protein_GI_number: 14601591 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Aeropyrum pernix # 2 224 119 348 358 173 43.0 2e-43 MEVCGLTEFADRFPQQMSGGQRQRVALARALVIEPKLLLLDEPLSNLDAKLRINMRMEIK RIQKQLGITTLFVTHDQEECFSISDKVAVMNRGVIEQYDTPEHIYSRPKTEFVARFIGFE NFLTLHKGGDVYRTDSGEAFTVKTADGVVPSEETVTGTIRPEDILVDFEGRTGNVLTGTV GVRTFLGKSYQYEVHTEAGKFLVNRNADESYEEGQTIRLYLPEEKLILMDR >gi|229784122|gb|GG667613.1| GENE 50 51742 - 52800 1500 352 aa, chain + ## HITS:1 COG:SMb21273 KEGG:ns NR:ns ## COG: SMb21273 COG0687 # Protein_GI_number: 16264525 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Sinorhizobium meliloti # 2 347 1 335 340 168 33.0 2e-41 MMKKMMALSLCAALVLTGCGSSNSAEGGNSDGNSKKLILSTYGLSEDISEEEVYQPFEDE FGCKIVTETGSTNERYTKLSADSETVIDVIELSQAMTAKGIEEGLFEPIDLSKIENSQYL IGAAKTMAEAGQGVAYTINSIGIMYNPEAVGFEIKSFDDLWDARLEGRIAIPDITTTFGP AMVYMASDYKGADVTADQGAAAFEALKELKPNIVKTYAKSSDLINMFTSGEIEAAIVGDF GVPTIQAADPSLVYVTPDVTYANFNTISITKNSKNKELAYEYINYRLSRELQTKTGKALN EAPTNNQVAFTEEESANMTYGDAAENAKVIDYSFVNPILGEWIDQWNRIINS >gi|229784122|gb|GG667613.1| GENE 51 52882 - 54759 1435 625 aa, chain + ## HITS:1 COG:AGc4165 KEGG:ns NR:ns ## COG: AGc4165 COG1001 # Protein_GI_number: 15889568 # Func_class: F Nucleotide transport and metabolism # Function: Adenine deaminase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 32 603 27 569 575 278 32.0 3e-74 MANGNDGGVVVVKVDLVIENAWVYRTFRQCFERMDIAVTGGKFFDVSPVSDFVGDSKLRG GARRGEWEAESVFDGTGKFVIPGLVDIHMHVESSMTYPEEFSRAVIPFGVTTAVADPHEI ANVFGLEGILSFMEQETELDIFYGIPSSVPATRTEMETSGAELGEEELRKQLQDERVVCL GEVMNFKDLAAKEDTKSKRFLKICGECGRNIRIEGHCPGLSGGDLAAFVRAGVDADHTEQ TAESVVEKADMGMFLELQEKSLKEEVVEAVKRHQLYENVALVTDDTMPDRLMEGHLNRIV RLAVEKGMPAEKAVYLATFTPARRMHLDDRGIIAPGKIADFAVLPDLKSFVPEAVFKNGR RCQSRMPDGPDQEMNMDREADKRQEQGLKRRFPDHFYSSVKCRLALASDFVLPITDFWKD NRTALSGTALVNVMQIQTFGTRAKHVQRRIPVREGILCWQEAGLCLAAVFERYGKNGNVS 
WGFVEHGFQEKGAVATTWSHDSHNLLVLGNSVEDMVLAQNEVVHMQGGYVTACGGRVTAA AELPVGGILSDKSLPELAAEIKAVREEIERMGYVNHNVIMSISTLSLLVSPELKLSDQGM FDVKSQREVPLVESFQIQEEKVAEQ >gi|229784122|gb|GG667613.1| GENE 52 54756 - 56105 1304 449 aa, chain + ## HITS:1 COG:MA1276 KEGG:ns NR:ns ## COG: MA1276 COG0402 # Protein_GI_number: 20090140 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanosarcina acetivorans str.C2A # 4 443 15 435 442 249 36.0 1e-65 MSTIITNVVVVTMNTQREVIKHGFAAWEDGKITAVGTMDTLETVIAEARKHETGDGLRLI DGKGGILMPGMVNLHTHMGMIPFRGLGDDCKDRLRVFLLPMEQRAMDAELVYLSTKYAAA EMLLAGVTTVFDMYYYEAEAARAMDEMGIRGIAGETVMEQEACDFKDPYEALSYGESLMK KYQGHPRISGCVAPHGTTTCSSGLLKAAWELDHAYGAPFSLHTAEMDYEMDYFWGMSGQT PAGYLDSLGVLGKQTLAAHCIHMSESDLQLFAAREASVAHCIGSNTKAAKGVAPVSRMRE LGITVGLGTDGPASGNTLDLFTQMKLFANFHKNETKNRSAFPARQVVEMATVMGAQALHM EQMTGSVEVGKRADLVLVETRSANMFPVYDPYSALVYSAGPSNVDMVFVEGACVVKDGKL AEKRLDVLREELERKMAKTAFAENALVSL >gi|229784122|gb|GG667613.1| GENE 53 56390 - 57520 1099 376 aa, chain + ## HITS:1 COG:BS_ytvI KEGG:ns NR:ns ## COG: BS_ytvI COG0628 # Protein_GI_number: 16079968 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus subtilis # 43 363 31 362 371 116 26.0 9e-26 MEESKEQAVQERRMEGRKRFITNVIYFVILGGIFLFVMKRLVPMFFPFLLGLTVAAVLSP LIRKLCGRTGGKRAPVSIAVLLVFYGILVMAALFSASHIVAFIQNLAGKLPQFYMEVIEP ELGILFERIVVSFPEYQDWLSQMSASLENALQSGIMTVSGTLIGWGASWIVGFPAILVQT IFTIISSFFFTIDYDRIWDFVLRQFKEERRKMIAETAAGARTTIWKILKVYALMMTITFA ELYLGFVLLKIPMPLLLAFLVAVVDILPVLGTGTVLIPWALILCIIGKTGLGVGIFILYV IITIVRQTLEPKVIGQQVGLHPIVTLLCIFAGAQLIGVLGIFTFPVIATIVKKMNDEGTI HLIKQGFGETRTGNGL >gi|229784122|gb|GG667613.1| GENE 54 57591 - 58910 1574 439 aa, chain - ## HITS:1 COG:BH2228 KEGG:ns NR:ns ## COG: BH2228 COG1486 # Protein_GI_number: 15614791 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl 
hydrolases # Organism: Bacillus halodurans # 1 439 1 433 434 575 63.0 1e-164 MVKITFMGAGSTVFARNVLEDCMCTPALCDAEIALYDIDGERLNDSYVILSAINKNVNQG RASIHTYLGVPNRKDALRDADFVVNAIQVGGYDPCTITDFEIPKKYGLNQTIADTLGIGG IMRALRTIPVLEDFARDMEEVCPDAYFLNYTNPMAMLTGYMQRYTNVKTIGLCHSVQTCS KTLLTELGMEDKLEGRTELIAGINHMAWLLKLYDKEGNDLYPEIRRLADQKNREEKHKDM VRFEYIRHLGYYCTESSEHNAEYNPLFIKSNYPEMIEDYQIPLDEYPRRCIKQIEGWNQE KDSILMDGAITHTRSLEYASYIMEAIVTNQPYKIGGNVLNHGLIDNLPAEACVEVPCLVD GSGITPCHVGALPTQLAAMNLSNISVQLLTIEAARSRDRRTIYQAAMMDPHTAAELNIPD IIAMCDELIAAHGEYMKEY >gi|229784122|gb|GG667613.1| GENE 55 59047 - 59877 779 276 aa, chain + ## HITS:1 COG:SP1899 KEGG:ns NR:ns ## COG: SP1899 COG2207 # Protein_GI_number: 15901726 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Streptococcus pneumoniae TIGR4 # 1 251 14 272 286 106 26.0 5e-23 MKVYYCGREDCEKGHFFGPAVRSHQLIHFVLKGKGIYRTGYGQYEIKEGEAFLIRPGEVT YYQADFEEPWSYAWIAFDGDGAEALLERYYPDHSFPVCAMGEITAVSGWFEGLLGAFGSA EENRERVLGYFYLIMACLLRSGKGGVPVDEEGYFKRAVAFIRHNYSYPIQISEIAGYVGI DRTYLYRIFMHQAGVSPKQYLSRYRLEEAKEMLVQTEYRITEIAYSCGYHDSSSFCRHFQ KATGNAPARYRSMKKIKPGRESEEKIIDNRGQLMVN >gi|229784122|gb|GG667613.1| GENE 56 59972 - 60238 379 88 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227872009|ref|ZP_03990394.1| ribosomal protein S15 [Oribacterium sinus F0268] # 1 88 1 88 88 150 80 3e-35 MISKEKKAAIIEEYGRKAGDTGSPEVQIAILTARITELTDHLQKNPKDHHSRRGLLMMVG QRRGLLDYLKKTNLDGYRALIEKLGIRK >gi|229784122|gb|GG667613.1| GENE 57 60616 - 62268 1131 551 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase [Brucella abortus bv. 1 str. 
9-941] # 1 551 1 552 714 440 43 1e-122 MFKSYSMELAGRTLTVEVGRVAKQASGAALMHYGDTVVLSTATASDKPREGIDFFPLSVE YEEKLYAVGKIPGGFNKREGKASENAILTSRVIDRPMRPLFPKDYRNDVTLDNIVMSVDQ DCSPELTAMLGSAIATCISDIPFDGPCATTQIGMVDGELVVNPTQAQKQASDMALTVAST REKVIMIEAGANEVPEDIMIEAIFKAHSVNQEVIKFIDTIVAECGKPKHSYISCAVPEEL FAAIKEIVPPAEMEVAVFTDDKQTREENIRTITAKLEEAFADKEEWLAVLGEAVYQYQKK TVRKMILKDHKRPDGRAIDQIRPLAAEIDMLPRVHGSAMFTRGQTQILNICTLAPLSESQ RLDGIDDMETSKRYMHHYNFPSFSVGETKPSRGPGRREIGHGALAERALIPVLPSEEEFP YAIRTVSETMESNGSTSQASICSSSMSLMAAGVPIKSAVAGISCGLVTGDTDDDYIVLTD IQGLEDFFGDMDFKVGGTHKGITAIQMDIKIHGLTRPIIEEAIRRTREARIYILDEVMAK AIAEPRKEVGK >gi|229784122|gb|GG667613.1| GENE 58 63363 - 63776 173 137 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229213658|ref|ZP_04340006.1| SSU ribosomal protein S1P [Dethiosulfovibrio peptidovorans DSM 11002] # 25 125 394 483 512 71 41 2e-11 MIARAIEIIKSIVTDIEPGAVMKGKVVRIMNFGAFVELAPNKDGMVHISKLSDKRVEKVE DAVNIGDEVTVRVLEVDKMGRINLSMKPGDMDPNVDVSTLDSGRRDSDRRDGGRRDSGRR DGGRRDSDRRDSDRRNS >gi|229784122|gb|GG667613.1| GENE 59 63851 - 65479 1719 542 aa, chain + ## HITS:1 COG:FN0649 KEGG:ns NR:ns ## COG: FN0649 COG1574 # Protein_GI_number: 19703984 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Fusobacterium nucleatum # 1 537 2 539 542 281 33.0 2e-75 MREIYYNGVIDTGTGLDAEAMAVEDGRIAAVGSLRDVNHSEDTVFHDLEGRFVVPGFNDS HMHLLNYGHTLKLVDLTQATTSLEAMCLALSGYLLEEKPSPGTWIVGRGWNHDYFQDEKR FPDCSDLDRVSTEHPVLVIRACGHIACANTKAMEAAGITRLTPQPEGGCFDTDEAGNPNG IFREFGVDLILGAVTKPGKKELKEYLRLAMKDLNSRGVTSCQTDDLAAFPGIPFETVLEA YRELEREGAMTVRVYEQCLLPTEQLLEEFLSCGYRTGQGSGYFKIGPLKLLADGSLGART AFLGQPYEDAERPGERGITIYSQEELEKMIVLADGAGMQIAVHAIGDGAMEMTVRAYERA MEENPDRRDRRHGIVHCQITTARLLEEFRRLKLHAYIQSIFLDYDNHIVEARIGAERAGE TYQFKTLLSMGVSVSNGSDCPVERPDVMAGIQCAVTRSTLDGTKPFLKEQALTVEEAITT YTAMGAEASFEEDEKGTLSVGKLADFAVLEQDIRTCAPDRIKDTAILAVYVGGVCVYKKE NS >gi|229784122|gb|GG667613.1| GENE 60 65502 - 67529 2138 675 aa, chain + ## HITS:1 COG:SMb20536 KEGG:ns NR:ns ## COG: 
SMb20536 COG4289 # Protein_GI_number: 16264263 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 71 407 60 373 617 119 28.0 2e-26 MKFTVENPDEKRSPYTGMTREHWISISHFFLEGIFEHVEKIEDPIVVGRHETKVSYPQPD GPKWRIAAERFEGLARSFLIAAPLLHNEPEAAAGGYSLKEYYSKQILLSVTPGTPNYLLR VEEIFPEAQEGEMAFQHTCECASLVIGLSMCREVIWENYAKEERDRIADYLSNFGHSYTG HHNWRLFNMLILAFLDREGYPVDHGMMRDHAAAILSYYAGDGWYRDGHLFDYYCPWAFHV YGPLWNQWYGYEKEPYMAAKIEEYSNRLVETFHRMFDEEGHVTMWGRSGIYRSAASAPLA ANFLLKNPAADPGLARRIASGAVLQFATREECFYEGVPCLGFYGPFQPLIQTYSCAASPF WIANAFVCLALPKDHPFWTDTERNGVWEEMKEREVRTTVLDGPGIVVDNHKDTGITEFRT AKVLMKKDNPSLNTYSRLSFNSQFLWEDFDFKGIEAMQYSVTYDGRPKSLIPNILMYGGV KDGVLYRKEYFEFEFTFQNQASIDLADFPVADGIVRVDRTRIPDKPYTLTLGSYGMPDVD RSGVDVRIRTDQVTGAKAILLKSSEGQMAMVIYGGFDDIEIMSRQGVSAVTKQSCIMYGV SRRRKYYEYLPYAMISVILTKKSQEEWRDEDLFPIRSISYADKEKCGAYGPIVLEMKDGR TVTVDYEGLEGRLHI >gi|229784122|gb|GG667613.1| GENE 61 67940 - 70543 2753 867 aa, chain + ## HITS:1 COG:FN1022 KEGG:ns NR:ns ## COG: FN1022 COG0474 # Protein_GI_number: 19704357 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 5 865 4 862 862 829 53.0 0 MKDWYQKDEQEILKELNVTKEGLTAGQAEQLLLEKGENVLKEGKKKSVLAVFAEQFCDLL VVILIAAAVISMFSGNVESTIVIVAVIILNAVLGTVQHEKAKKSLESLKSLSSPSAKVIR GGQKIQIPSANVVPGDILLLEAGDMVAADGRILNNYSLQVNESSLTGESTNVDKEEGTID GEMPLADRTNMVYSGSLVTYGRAMVVVTGTGMDTEIGKIASLMNATKEKKTPLQVSLDQF SGRLAAVIMVICAIVFALSLYRKMPVLDSLMFAVALAVAAIPEALSSIVTIVQAMGTQKM ARENAIIKELKAVESLGCVSVICSDKTGTLTQNKMTVQEIYTDGRIFRPEELDLHEQLHR YILYDSILTNDSAIVDGKGIGDPTEYALLEMARKVSVSDDVLRTMMLRLEEIPFDSDRKL MSTKYELHGVPTILTKGAVDVLLDRTTQIRTSEGIRAFTEADREEINRQNMEFSRNGLRV LAFAYKEVEDGHVLSLKTENGFTFLGLVSMVDPPRVESKEAVSDARRAGIRPVMITGDHK ITATAIAKQIGIFSEGDLAVTGAELDGMSDQELDEKITKISVYARVSPENKIRIVDAWQR RGSIVSMTGDGVNDAPALKKADIGVAMGITGTEVSKDAASMILADDNFATIIKAVANGRN VYRNIKNAIQFLLSGNTAGILSVLYTSIMALPVPFAPVHLLFINLLTDSLPAIAIGMEPA 
DKDLLSQKPRDPKEGILTKEFMMKLFLQGGLIAVCTMTAFHLGLNQGGPAVASTMAFCTL TLARLFHGFNCRSSHSIFRIGFSGNWYSLGAFLAGVVLLSLVMFVPFLEKLFSVTALTGG QIGLVYLLAVIPTVIIQMTKVIRERSR >gi|229784122|gb|GG667613.1| GENE 62 70636 - 71937 1254 433 aa, chain + ## HITS:1 COG:no KEGG:Closa_3037 NR:ns ## KEGG: Closa_3037 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 433 1 433 433 598 78.0 1e-169 MTGQLLHKFRNHGPLSRRTKFLILALVPVYFMAAGLILQPADEVWRGIVTIIREPDFLIT DYIAIGGIGAAFINAGVLALISIAIVYFLGMDMSGHTITSCFLMFGFSLFGKNILNIWSI MMGVCLYAFYHKTSVTRYIYVGFYGTSLSPIITQLMTVGHLPLAVRLPFSILVGMIIGFV LPPLSTHTHYSHKGYSLYNVGFASGIIATVIVSLMKSFGLQTEARLIWSTGNDVLFARLL LGLFGGMILFSCLIAESVWKRYMEIWKTYGLSGTDYVKSEGFAPTLFNMGVNGITSTLIV LLAGGDLNGPTIGGIFTIVGFSATGKHLRNILPVMAGVILGSFVKTWNISDPSAMLALLL STTLAPIAGEFGVVAGVLAGFLHASVALNVGIVYGGMNLYNNGFAGGIIAMFLVPVIQSV RDRRARARTHDSL >gi|229784122|gb|GG667613.1| GENE 63 71950 - 72744 657 264 aa, chain - ## HITS:1 COG:no KEGG:Cbei_1303 NR:ns ## KEGG: Cbei_1303 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 51 249 69 274 413 86 29.0 1e-15 MKTGIVIEITGKDAIVMKNGGEFITLPAKEGWKKGDIIPVNRRKRPAVPIRRFLAAAAAC LCLAVSGGGYHYYYAQAALISVDVNPSIELSVNRQDRVTSTAALNEDGKALLTGIRLTGM ECGEAVRELLQAESSGQYPADHKNVVVTVYSANEDRQSRLLKEIRETADHALTTRAADSS TEYRAVTSEEVKAAHSCGVTAGKYIYLQKLEEAAPGTDISRYSHCSIDEIKDHISNCEKR HQTNSTESDSGGKDCHSGHDHKNR >gi|229784122|gb|GG667613.1| GENE 64 72722 - 73441 367 239 aa, chain - ## HITS:1 COG:BS_ykoZ KEGG:ns NR:ns ## COG: BS_ykoZ COG1191 # Protein_GI_number: 16078409 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus subtilis # 4 224 18 245 251 89 25.0 5e-18 MDDIENRIKKAQSDSSELELLISDYLPLLKKQAGAAASGILEYEDRLSLAMLTFISCVRQ YTEKRGSFISFFSVCFHNRITDEIRRQNLYLKNVQTLTPDSAELETAAGKASLDAYDREQ ERLSLCEEIDRLSEAIGVYGVAWKDLPRICPKQPRAKATCRQAAAAVLLNPEFHAMFFGQ KKLPQAQLASALFLSPKTLEKHRKYIVTLIILLSGDYPGIRAFLPQYREVIPIEDWNCH >gi|229784122|gb|GG667613.1| GENE 65 73583 - 73858 
174 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619749|ref|ZP_06112684.1| ## NR: gi|266619749|ref|ZP_06112684.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 91 1 91 91 137 100.0 3e-31 MKKIMTGMIASAVVLSMAFPALASTRACDRYEACAVEGCETAACHEHDGVIYRGTCQDGT CSYHSETCGTESGSRSGHHRNHGNHGSHGCH >gi|229784122|gb|GG667613.1| GENE 66 74095 - 76023 1754 642 aa, chain + ## HITS:1 COG:BS_ydiF KEGG:ns NR:ns ## COG: BS_ydiF COG0488 # Protein_GI_number: 16077662 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 558 2 587 642 368 36.0 1e-101 MLYQITDGTVSAGGVPILSHINFEIRGTEKIAVAGKNGAGKTTLLRLLAGELTLDRDDKR NGPGIMTSRKLTVGMMRQQPFDDPSETVEEELLKGCPKRAGTGQNGSAVFSEWDRERFEY EREYDILFTGFGFSKEDKKKKVSSFSGGEQTKIALIRLLLEKPDILLLDEPTNHLDLETV QWLEEYLRQYDKAVVMVSHDRFFLDRTADVIYELAGGKLTRYAGNYSAYREQKLKRIRLQ KKAYESQKEELERLNSVVERFKHKPSKASFARAKKKAMERMNRVEKPEEDEIHIFTGELT PLITGSKWVFESEHLKVGYDRALLEITLRIRRGQKIGILGPNGSGKTTFLKTVAGFLPPL DGEMAIGVNITMGYFDQLSAQVTSEKTVADHFHDLFPSLTEKEARSILGAYLFGGREAQK SVSSLSGGEKARLVLAELLQSRPNFLVLDEPTNHMDIQAKETLESAFRAYTGTLLFVSHD RYFLSEVADALLIFENHSVMYYPFGYGHYLERKRKSEGEGGLAAQVKAEEQALIAGMRAV PKAERHRLKEISAEEAYIDWRLRLVSETMIPAGERAGELYASMVSMKQEWMESEDFWTGG QWEMEPAYVELCRQYEEAAAAWHELCMEWYFVWSGEENEENS >gi|229784122|gb|GG667613.1| GENE 67 76142 - 76501 345 119 aa, chain + ## HITS:1 COG:VCA1051 KEGG:ns NR:ns ## COG: VCA1051 COG3304 # Protein_GI_number: 15601802 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Vibrio cholerae # 4 118 5 129 143 83 44.0 1e-16 MGCLGNLLWFIFGGFLSGLSWTLTGCLWCITIIGIPVGMQCFKFAALSFFPFGKEVRYGG GAGSFLLNLIWLIVSGLPLAIESALIGVVLCITVIGIPFGLQQFKLAKLALMPFGSEVY >gi|229784122|gb|GG667613.1| GENE 68 76504 - 77328 1001 274 aa, chain + ## HITS:1 COG:no KEGG:Clole_2712 NR:ns ## KEGG: Clole_2712 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum 
# Pathway: not_defined # 10 263 5 260 264 213 48.0 5e-54 MEFPGKRPGKKRTALEKEWAALERSEARFVERQKEKGPSPINQKLDQIVPDQLKGTLDAA FYKAFQLVFEKGGTVIEKTYRKEDGEYRQKLNAYAAELKESRKNLKAFSRQAGATKAKNL LISGVEGIGTGVFGIGIPDIPLFTGVILKSLYEIAVSYGYAYESGREQVFLLEIIKTALE RGGELAEDNERINRWIEGTGSLEEKQVVMRKAANRLSEELLYMKFVQGLPIIGIAGGLSD CVYLKRITDYAELKYRRRFLQDRYAAKGGAGNGG >gi|229784122|gb|GG667613.1| GENE 69 77350 - 78213 685 287 aa, chain - ## HITS:1 COG:SP1899 KEGG:ns NR:ns ## COG: SP1899 COG2207 # Protein_GI_number: 15901726 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Streptococcus pneumoniae TIGR4 # 20 270 14 273 286 143 33.0 4e-34 MQRNMIEIFRKNISGESSSLSFGYAGREQCMGSHSFGPARRAHYLIHFIIKGKGVFFADD KKYILKQRDMFLIRPGETTFYYADEVDPWEYAWVAFVGSDAAQITESCGFAYSPVATYPD SPELIESIDDIIHHMQTGDENDYYLLSRLYNIFYYLSKPVINTQNTYKNEIVRRTVNLIR SSYFEKLTIQWLADQVRVDRTYLYRLFKEEVGICPKEYLTQYRIRMATVMLSTTNQSIKE ISYACGFTDTSLFSICFKKYLGFTPQQFRKIDGIQQLSFQMMDEKAK >gi|229784122|gb|GG667613.1| GENE 70 78439 - 79806 1718 455 aa, chain + ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 3 453 2 419 422 79 22.0 2e-14 MKKKTLALVMTAALAFTALTGCGSTGSAPAAKEETTTAAEKKETEAVTEAAAESEKPAER PEVRVLIKWSESQISNWPVLVDEYNADDSNPVKINLEFYGSEGYDDKVKAELMGDNPPEI VQLMKTTFNEYSANGQLMELTGFLGKEGWNYNKGALAWAGPLNNPENGVYGIPDFANTSC IFYNTKLFAELGIEEPANLEELTNAAKTLNDNGYKAIVTGGGNWCAPDLLAKVQAQLVGT DLLIKAYNNEAKYNDPSMVEAMTIVDEMVKSGVIDKSSADYISDDDAIAEFVMGKAGMYT AHTGMASAIDSSKEDDFEYNIIEKMDFVENPKTSVSVTWGSMWCIPANVKNQEAAEEALA FLFGEKVQKSDVTELGKIVNVDEWNAGLTHPALLTAAEQLKGAGTADSFYLLDMVSSKVL DNMIKGIQEMIQGNQTPAGVLDHVQKIWEEEQAQK >gi|229784122|gb|GG667613.1| GENE 71 80313 - 81164 813 283 aa, chain + ## HITS:1 COG:mlr7001 KEGG:ns NR:ns ## COG: mlr7001 COG1175 # Protein_GI_number: 13475831 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type 
sugar transport systems, permease components # Organism: Mesorhizobium loti # 1 268 34 299 317 158 34.0 1e-38 MFSLPALAVYTVFWIFPVFLGFYISFTNWNGIVNLADAQFTGFKNYVNLLYDSILRVSVL NNIKYGIITLIVVPVAAFAVAYLVENFTRKKSFWRTVTYLPAMIPTIVTVFLWKWIYNPQ YGILNEILKAVGLGGLATGWITNTKTALYAVTFTSLWKMIPVYFVLFLAGLQSVSIDLIE AAVLDGAGRWQVIRNVTIPCMKRIITIVYVLVFIDIFRVFDLIYTMTKGGPGYYTTEMIL TYSYKTVFTNSNAGYGMAIISALTLFVIICSFIQMKIQNRTAD >gi|229784122|gb|GG667613.1| GENE 72 81182 - 82006 877 274 aa, chain + ## HITS:1 COG:mlr7002 KEGG:ns NR:ns ## COG: mlr7002 COG0395 # Protein_GI_number: 13475832 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 9 273 23 287 288 150 35.0 2e-36 MKTIKRLFLGFVLAMVFSVTVIPIIWVFFSSFKSKKELMTKPWSLPEKFLFSNYSDAWQK ADFAVLTKNSVTITAFSLLLMLVFATMLAFAISRYKGNLSQVTLIYLLAGQTISAGMIIF PIVIVLRKLGFGSSHAGLILTYAAGGLPFAVFVLQGFFAGIPYELYEAGEIDGAGEFKIF ANIAIPLAKPALATVLLYQFMWVWNEFVLAFTIVSRPEKRTIIAGLYSVVNGYLNTDYVT AFAGAMIVCIPIIIIYLIFQKYLIQGLIAGAVKG >gi|229784122|gb|GG667613.1| GENE 73 82014 - 82865 856 283 aa, chain + ## HITS:1 COG:no KEGG:Plim_0551 NR:ns ## KEGG: Plim_0551 # Name: not_defined # Def: xylose isomerase # Organism: P.limnophilus # Pathway: not_defined # 17 165 82 226 322 68 30.0 3e-10 MKEIRIGAAAWGFRELTLPQQLDLARSLGLVSLELGIANSPADVQTDADVRLLEQVKKEY EAFGISLDTAATGNDFTREDGNAAAEDIGKIKKVINICSFLGVETLRIFAGFTPYREVDG TRFSRMVDALNMVARYGKEHGVVTAMETHGGVMAYPDGVVHFHSAATETGCLKRILRETD DSLKFVFDPANLYAIAGCDVEETAELVGSRIAYAHIKDFVKLPGGHLKPGAIGDEKRDWK RTIETVLKWTDTLMIEYEEPRDIEEGTRRSKQYLETIVKEVWL >gi|229784122|gb|GG667613.1| GENE 74 82862 - 83914 1156 350 aa, chain + ## HITS:1 COG:YPO2502 KEGG:ns NR:ns ## COG: YPO2502 COG1063 # Protein_GI_number: 16122723 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Yersinia pestis # 8 238 59 309 408 71 26.0 3e-12 
MSTIYDYKKTRAVVIERPYEVRLREITLTEPKKDAYIARTYYSSISSGTDMKTYKGLQHP EQCYYPLVPGYETAGEVVAAGEESDGSLKPGDRVMINECRKYGDVCAAWGGGSEYTVKDS DTTNDTFDYMVKIPENLSYIDGVLAYLPCVALKGIRMLNLEGRQTIVVSGAGMVGISAMQ ILKIVNPDLRVIAIEPNKFRRDIAKKYVDLVADPFEAELVIRKATGGAMADQIIECSGNA EVPGILHKYLKDGGWGTDDAPGHIHWQGDYPEKIIMDSYHRWFTKNCTISMSCALAPGCK EQILTWISEGKFDTNGLPVEIWPVSKCAEAFEYKMAKGEDVFKIIFDWSK >gi|229784122|gb|GG667613.1| GENE 75 83916 - 85229 1170 437 aa, chain + ## HITS:1 COG:no KEGG:Gmet_0448 NR:ns ## KEGG: Gmet_0448 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: G.metallireducens # Pathway: not_defined # 10 209 10 208 659 83 30.0 2e-14 MRPKVIFLPHANIQYSQLKMERREWVVKNCYEKLFDLVRDNQYKIGFEASGKTLDEIERL APEVMEKLKALILDGQVEPVASPYTHLMMGNVPREVAVHTLLRSLDTWERYTGIRPTVGW NPECSWNAQIPGIYKEVGIETLIMDADSFFLSFPEIRKATGLCYDVQGHSNKNQLFLIEE YIKDKPEYLKYLTNPSICDDGLKLIFRSDCMANLLLWYLMGATEGVRNEAVRYEEIQSMF CRWNARAQETGSFIMPYAEDAEYIGSSAYFYVKQFNQARFFEEEGDSLKRFKEIMDLSIE SGFVPATPSEVVESAELLENPFILKIENGAAWHGGTAKAWLNTDQALQLDPVCRMILEGI EGIAAYKQIDWHENEALAAAFRAVTEGWVSDARWPPAPTSPGRFNVREALEALYRANDRV AGAMKELGIAGLRSLYS >gi|229784122|gb|GG667613.1| GENE 76 86236 - 87222 561 328 aa, chain + ## HITS:1 COG:CAP0136 KEGG:ns NR:ns ## COG: CAP0136 COG0535 # Protein_GI_number: 15004839 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 6 318 43 383 390 90 23.0 4e-18 MERRLEMIHLQLTRQCNLNCWFCGQRKNSALRDGKESMALKTEEWLGIVNELDAAAGKKP SVMLWGGEPLMSPCFDEIASTLHERGYRVGIVTNGTLLDRHAETVPYCLDKIYVSVDGPE QLHDSIRGAGTFRKIRENLLAVKRAAPLLPVVCMAVLTEELKGQLKEFFAELSDFPVDEV LLQDMIVLNGREIGEYKAVMEREFSIEAEAIESWRQSGEGDSEMEKAGFRLEEPGTHAFK VRYLPHVRSGEIVRPCLSPWRHLHIMWNGETSFCTDFTDFSMGNVREQPLEALFRGEQAE KFRKLVENGRNPSCRHCSWGDKEDFFEL >gi|229784122|gb|GG667613.1| GENE 77 87493 - 90168 2735 891 aa, chain + ## HITS:1 COG:CAC0854 KEGG:ns NR:ns ## COG: CAC0854 COG0480 # Protein_GI_number: 15894141 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation 
factors (GTPases) # Organism: Clostridium acetobutylicum # 6 632 5 640 644 530 41.0 1e-150 MKKLAVGILAHVDAGKTTLSEAMLYIGGSIRKMGRVDRGDAFLDTYELEKKRGITIFSKQ AILKWKDMEMTLLDTPGHVDFSAEMERTLQVLDYAILVVSGTDGVQEHTRTLWRLLARYR VPTFLFINKMDLAELGKGDLLDELRKVLKETFVDFGEAGTDHFYEQIAMSGEEVLDEFLE TGRVEESGIRMQIRNRHLFPCFFGSALKLDGVEEFMRGLEEYTETPSYPQEFGAKIFKIS RDAQGNRMSHMKITGGSLSTREIINEEKVNQIRLYSGDKFETVREAPAGTVCAVLGLTKA VPGQGLGTQSASAMPLLEPVMTYRILLPEGLDAAAVLPKFRELEEEDPELHVKWEESKKE IHVQIMGEIQMEILKSLIAERFGIAVEFGDRSIVYRETIMNTVEGVGHFEPLRHYAEVHL LLEPGEPGSGLQFASACSEDILGKSWQRLVLTHLAEKEHKGVLTGAPITDMKITLMSGRA HIKHTEGGDFRQAVYRAVRQGLMQAESMLLEPFYDFRLELPDRYVGRAMSDVERMAGKVM PPVMEGDRAVLTGSAPAACMGGYQKEVTAYTGGAGRLSFTFSGYGPCHNTEEVVAAIGYE ADGDTANPSGSVFCAHGAGFLVDWDQVPEYMHLESCLETQKDRADGTAGADGRKSQWTGA RGDGTEREETWIGTEEVDAILERTFYANRRGNSSGTKSGWKNREKSNSVPVVRTYKKEAP REEYLLVDGYNIIFAWDELKELAEHTIDGARGKLLDILCNYQGMKKCRLIAVFDAYRVQG HVTECLDYHNIRVVYTKEAETADQYIEKFAHENGRKYDVTVATSDHLEQIIIHGQGCRLI SARELKEEISRLSETVHQEYQEKKTGEKNKNYLLDSMSEEMMNQIKELPEE >gi|229784122|gb|GG667613.1| GENE 78 90220 - 92052 1209 610 aa, chain - ## HITS:1 COG:SMc00195 KEGG:ns NR:ns ## COG: SMc00195 COG1368 # Protein_GI_number: 15965601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Sinorhizobium meliloti # 175 609 215 638 639 161 28.0 4e-39 MARKTVHILFTLFLLTVPCISYFLFEYVTGNLATIPSYMAALNIGWIYTLYLLVFAATGR TRIAVPAVSILLFILSLAEAFVVSFRSRPIMFWDILAFGTAMTVSQNYVFTITKSMKIAG LILLVINVAAIAFPRRISGRRFRLAFAGTAAGAIAAYGVWFFSFLMPSWGLGINMWEVNE TYKEYGYILSTAVSFRYAVKEKPAGYSTSQIKRIYHEYKPEDVVLASAEDVSLGEDTAAP VNIICIMNESLSDLRTGGSFDTNQEYFPFLYSLKEHTISGSLCIPVFGSMTCNSEFEFLT GDSMAFLPSNCSAYQFYIKPGTYSLVSTLKEQGYRTVAMHPYPGENWNRDTCYKNMGFDD FIDETRYEERDFLRNYVSDQSDYENLIRQVENKENPDDRLFLFNVTMQNHGGYEGIYDNF PQEVWLTGSLKDKYPKTDQYLSLMKRSDEALEYLIRYFENSSEPTMIVMFGDHQPSVEDE FYDEIAGMPSADVPTGDHLMWYETPFLIWTNYESKSEQKGKMSAIYLASELLDQAGLDMT PYQRFLLSMEEELPVIHPLGCYDASGTYYSWEETGTDRFPYRETIRDYEYLVYNHSFDPK 
KYKDMYTAGQ >gi|229784122|gb|GG667613.1| GENE 79 92218 - 92514 454 98 aa, chain + ## HITS:1 COG:DR0586 KEGG:ns NR:ns ## COG: DR0586 COG2127 # Protein_GI_number: 15805613 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 9 97 47 133 139 79 44.0 2e-15 MPANTGLSEKTEIRLKQPKQYKVVMYNDDFTPMDVVVEILIDVFRKNYEEAVAIMMTVHK GQRAVVGVYSYDIAMTKAARAVKIAREQGYPFRVEVES >gi|229784122|gb|GG667613.1| GENE 80 92511 - 94829 1262 772 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 2 768 5 808 815 490 35 1e-137 MKIAKEVEELLNDAVLNAGRRGNEYVTPEHVWYVLTDRPVFRAAFQSCGGDVKKLKSRLE EYLGQMDTVEKDAGGTLVSYGLSQAIEEAARKAVSSDRDIIELPHLLYSMMELEESYGAY FVQLQGVECVDLLVELGTGSRDMEMRTGDRRGKRAADGSGGESGKTETGKTEAGKTEAGD AESEDTESEEGTERERLLMKYAVCINEEVKISCPLIGREAELERTMQVLCRRQKNNPLHI GEPGVGKTAITYGLAARINEGRVPEKLKNARIYGIDLGSMVAGTQFRGDFEKRLKAVMEE LKEEENPILYIDEIHNLVGAGAVNGGSMDASNLLKPYLTDGTLKFIGATTYEEYKKYFSK DKSLVRRFQNIDIKEPTVEETVEILNGLKPYYEKFHHVRYQKGTMEHAAVLSNQYINERY LPDKAIDLIDEAGAYLELHPKNKKVQVVDKKLVEEVLAKICRIPKQTVETEERQKLARLE RELKGRVFGQDEAVDQLTNAIKLAKAGLNEENKPVSSLLFVGPTGVGKTELAKSAAKVLG IRLVRFDMSEYAEKHTVAKLIGSPAGYVGYEEGGLLTEAIRKNPHCVLLLDEIEKAHPDI YNILLQIMDYATLTDNQGRKADFRNVILIMTSNAGAEKADRSSIGFGGRQLNEEGITDEV KRVFRPEFRNRLDRVIIFRPLDAAMADRIVEKEFRSLARKAAEKNVVLTIGKAGRKYLAK KGVSREYGAREIKRIIGQEIKPLLVDEMLFGRLKKGGSCQVDYDGERIRILI >gi|229784122|gb|GG667613.1| GENE 81 94826 - 95836 876 336 aa, chain - ## HITS:1 COG:BH2267 KEGG:ns NR:ns ## COG: BH2267 COG1680 # Protein_GI_number: 15614830 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Bacillus halodurans # 25 326 28 327 332 255 40.0 9e-68 MNLEALIDAGFRGCIQIAVHGRTVFEKACGYADLPNRIPNDSSTKFATASAGKVFVAAGI LQLIEKGKLNFNDTIGDMLDFDLKAIDRNITIKELLTHTSGIPDYFDESVMEDYEDLWRD FPNYKIRSNRDLLPLFLEKPMTYPHGSRFQYNNTGFVVLAIIMESITGAVFDEWLEDHIF APCGMTSTGYYELDRLPAKCASNYICSAQDQTYRTNIFSVDAKGTGAGGAFTTVGDIRRF 
WDHLMSGRLLSPEMTADMIRNHSGGAQCYGYGVWLKKQNGVYSPYIQGSDPGVSFISSYR KENETVITLVSNYGDNVWQLHKMLAAELENRGQNSD >gi|229784122|gb|GG667613.1| GENE 82 96060 - 97457 1454 465 aa, chain + ## HITS:1 COG:SP1897 KEGG:ns NR:ns ## COG: SP1897 COG1653 # Protein_GI_number: 15901724 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 84 382 48 350 419 84 25.0 5e-16 MKKRLLASVLCVAMAGTLLAGCGSKDTAGSTSQASAAGTEAKTEAAKNEAEKTEAAAGTE ASIVTEPTELTFIFADGDEGAKASMNEIVNRFNEAYPDITVKIQPGNGGAYSEFLKTKES VGEFPDVMEMRDTPIYVRAGMLEPLPDDIVSMFKTTIPFDGKTYTAPLGGENTNGIIYNK KYFDENGFAEPKTYDEFIELCQKIKDKGDMAPLVVGGQDIWHVGFWFYKAYNDQVMSQDT DFIKHCYEGTKDFSDPAIVATFEEMKTIFQYAQDGWVSTPDAQITTFLVSDMAAMMYSGT HMFSQIKDADPNFEVGWFAVPSPDGKTRLVGGGGAGGLAISAESAKDPNKKAAAEEFIRF FYTPENYKVYCDKLNAIPSTVETPDLAADPVMMEVIEATNTADDLSVMWNGRVGENELPP DFRNFTYKTVIEVIQGQRDIPSACEEMNKTWQVAMQSFNPVTDAK >gi|229784122|gb|GG667613.1| GENE 83 97555 - 97872 325 105 aa, chain + ## HITS:1 COG:no KEGG:SCAB_86671 NR:ns ## KEGG: SCAB_86671 # Name: not_defined # Def: sugar-ABC transporter transmembrane protein # Organism: S.scabiei # Pathway: ABC transporters [PATH:scb02010] # 15 103 35 123 315 68 38.0 7e-11 MGKKKVRFDFVTMGFIAPAFLFFTLFIIVPTIASVYYSFTSWDGLNPVVKFVGLANYKEI FTSARFGNALRNTVILTAFISILENSMALILALIVDNVKWGKNSN >gi|229784122|gb|GG667613.1| GENE 84 98871 - 99380 573 169 aa, chain + ## HITS:1 COG:mlr7001 KEGG:ns NR:ns ## COG: mlr7001 COG1175 # Protein_GI_number: 13475831 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mesorhizobium loti # 1 166 148 314 317 114 40.0 1e-25 MYNYNFGAINSIFRSIGMGEACQDWLGNPKLALIMVGVVLIWKGAGYYMIIYLASLQSVP TDVVEAAAIDGASPIQRFKAITLPLISGAFTINLTLSLINGLKVFDQISVMTDGGPGFTT ETIVYLLYKVGFNEGRQGFGTAVGIVLLFIILILNAVQQSILKRREVQL >gi|229784122|gb|GG667613.1| GENE 85 99380 - 100234 765 284 aa, chain + ## HITS:1 COG:Cgl2406 KEGG:ns NR:ns ## COG: Cgl2406 COG0395 # 
Protein_GI_number: 19553656 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Corynebacterium glutamicum # 19 284 35 304 304 170 36.0 3e-42 MKTAGMKASKRRIRPGSGLLFAVTVFLAVLFLAPVFFAIISAFKSNGDILKNPMSLPASL YLKNFQDLFAQSDFVTAIFHSIVLTLVSEVLIVFIVPMAAYAMERRKTRTTRLIYTYFLA GMMIPFHLYMFPLFKEMKMFGLFGNLAGPVACYISGSVAFGSLLYCSFLNGVPLEIEEAA QIDGCNPFQTFWRVTFPLLGPCTASMVVLNGLGIWNDFLMPYLVLPSDRAKTITVEVYSF VGQYTARWDIVFAGTVCSIVPALIIFVLLQKYFVKGITAGAAKG >gi|229784122|gb|GG667613.1| GENE 86 100462 - 102297 1897 611 aa, chain + ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 198 608 178 596 602 172 30.0 2e-42 MKKRWKRNLSLYQKFCLVIILLGLIPMLFLSTVIVNRMFKEYGNSLRSNYVQAASYVSES LDDMMENYNGVSKMPYYYNYSSEGEFQFNYMSYDNLRQIIYGIGEEPETMEANRYKSMMV FLGNVQAVDSYISGAHFVGNGLRGEKLAFHRSRRNIPSEEEAVFDARMGLDGLDKSSKNL MLILTHKNDYFRMQDSWVFTLARNYFDLTGTVGNYKYVGTLFIDVDLERMEAVFKNLNLN AKDQVYVCDRDNNCYYSSDKSVIGRNLDEEGIRFEETKEEFVVDTDYNASGLRVIITMKT ATAYEKIRRMQKMMYVFIGASLAALLCGSVWFSRRLTKPIRNMMEQMSEIETGNFQVRLP VTSGDEIGVLSRRFNQMSKELESYINQSYVAHMKQAEAEMTALKSQIYPHFLYNTLEIIR MTALDKDDTVVSEMIEALSSQIHYMIGPMQDMVPLEKEVEITRQYVYLLNCRIKGKVHFT ADLNGLSERRVPKLILQPIVENAYVHGMKPKAGSGHVMIEADHVENRMEISVMDNGVGMS TGELEQLKLLLEGSEPGIKNEHNWQSIGLKNVHDRIRYLYGEAYGIEITSTPGVGTMVRI VMPWREGGETE >gi|229784122|gb|GG667613.1| GENE 87 102294 - 103820 1728 508 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 505 1 523 525 155 25.0 2e-37 MIKMILADDEPVIIKGIQKLVDFSRLGIEIVGEYEDGKAAFDGILTEKPDIALLDIYMPK KTGIEILKELKALGIETKVIFVSGFQDFQYAKDALTYGAVNYLLKPVIREELVETLEKCI 
TMLKQEPDGEGSGGESLEDREEVGTAAAYDKLVEMEECRYLPVLTEILYGEQTGKQERRL IQFSVVSFLEQYLEQRQMGIVFLKNGHIVMVFKNPEQEVTGMVLCSIMEEAQAQLHCRLG FIVGNPADSMGEIPKAYEECLSMLGYFYFSSQLQVPVLRVGEPVFKKPAALEEIREVRKQ LIEGMVAQDEKAFGDAAVRFRRLVCLISEGRRDDACFHYGSTIRIIEERFETMGIDGMPV AFKDILEQARQSESYESLTDLFEGYFKEYKDRIKRAVVSSEKKDIIHAREYIEKHYSENL TLEVLAGEIHMNPYYFSSFFKKNSGENFKDYLNKVRMKHAIDLLVSTDKKTAEIAADVGF RDSRSFSELFSRIYGETPSNYRKRVKDG >gi|229784122|gb|GG667613.1| GENE 88 104025 - 104876 979 283 aa, chain - ## HITS:1 COG:SMb21336 KEGG:ns NR:ns ## COG: SMb21336 COG2207 # Protein_GI_number: 16264660 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Sinorhizobium meliloti # 178 277 210 308 314 77 38.0 4e-14 MEAAKDELLTDNHLVLTISLDGVTSGIEDLNRVYRRVLELKNYRFVLGTNQIIYPGRVME LMPEYMTYPDKLADEILACMLHGKQEEFTENVQEFLSILKQYSYQPASLLFNRLYLDLLF QMQKLNAPGKDSYLSAETLHTPATLTEGAGVLLTIFEWYQERKAAAEQLKDNKHFERIEE SRKYIEAHYNDYNLSAGMVAEYLGYSTNYFSRIFKSITGFYINDYIRQIRIVKAQELLMN SDMTITTIAEATGFSNPNYFYSIFKKETGLTPAAYRNAGQRIV >gi|229784122|gb|GG667613.1| GENE 89 105835 - 107208 1317 457 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_4473 NR:ns ## KEGG: GYMC10_4473 # Name: not_defined # Def: transcriptional regulator, AraC family # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 12 369 12 381 759 89 21.0 4e-16 MMKKLIGGQNYFRKMLSTCLLVISVTIFILTVFFYQRYADSLSKNLYDGQEKNLEKSART MNDLVSEISQLYNTVILDSRVVGFSSLKEFDPAENYATYLIVKKFYNINPYVNSLYIYNA MADDAITCGSYRFDLDYCWEYLKEAKKASIFPSPLTGTPEEVLTFAYPVYADNFDELSGG IFINLDMEKLSEHVLGTGSQAGMVLGDNGMILLSDLTEDFTSDPEACEAFLVWPSLTEGE SASSLRTFHGDDYLCAFYKDPARNFTFLSGVPYSEIVGPMKALRNLSLAVAAAIFLTAIL LQYIITKRLYRPLEAITEEFRDSKYAGGSDMDEFTLIRHVYENAIDEIRELEEEHAFYQP RMKSDLIRGLVLGNRDIAQTKELLEKNGWEIPFEGMFLACFFIENSDSSDVLAPIVQTRI SQHLHETLSPHFYTECVPVASDQVICLINTIEDIPIS >gi|229784122|gb|GG667613.1| GENE 90 107498 - 108358 815 286 aa, chain + ## HITS:1 COG:lin2117 KEGG:ns NR:ns ## COG: lin2117 COG4209 # Protein_GI_number: 16801183 # Func_class: G Carbohydrate transport and metabolism # Function: 
ABC-type polysaccharide transport system, permease component # Organism: Listeria innocua # 4 284 2 285 309 231 43.0 9e-61 MKGKGFFKELYKNRALYSMFLPAAAVLILFNYLPMFGMVTAFKDFNFADGIFGSPWISPL TKNFEYLFSSGSATRATWNTIMLNLLFIVFGLVFELGFALMFHELRMKKFKSAVQFGIFL PYFISWIVVGLFSYNLFNYEHGSLNTLLKVFGMEAMNWYENPKMWIVILVVFRVWKMTGY GMIMYIAALGGIDTTYYEAAKIDGASRWQQIKSISLPVLAPTAITLLLLNCGKIMNADFG MFYSMIGNNAQLYSTTDVLDTFIYRNLRLNGDVGMSSAAGLYQSAS >gi|229784122|gb|GG667613.1| GENE 91 109309 - 110118 708 269 aa, chain + ## HITS:1 COG:lin2116 KEGG:ns NR:ns ## COG: lin2116 COG0395 # Protein_GI_number: 16801182 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 2 268 55 322 323 238 45.0 7e-63 MPFWLVLVSSLTPEADIISYGYSLIPRRIDFTAYKILLADAGRILNGYKISIFVTVVGTV LSVAVITLAAYPISQDRVKYHRALNFFVLFTMLFNGGMVPWFIVCRNMLHLKDSVWALIL PYLANAWYIFLTRNYYRSIPAELVESAKIDGASEYRIFFQIILPLAKPVIATISLFISLN YWNDWWLGIMLIDNTDLQPLQLLLRTITSNIQFLSSSSNVNAITQAAGSIPSEGIKMATC IATIGPIILVYPFVQKYFVKGIMVGAVKG >gi|229784122|gb|GG667613.1| GENE 92 110211 - 111773 2104 520 aa, chain + ## HITS:1 COG:BH0796 KEGG:ns NR:ns ## COG: BH0796 COG1653 # Protein_GI_number: 15613359 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 20 516 15 494 500 121 23.0 3e-27 MKKRTGTKALSIALAAALCMTGLAGCGSKTQETQGSTAAKTEAGEASAELKPVTLKLWSC SDKYQAQDEVLAKFCETYKDKLNIEKIEFNYVPFGDYEDKMISLVAGGDDFDGMYVSDWM LYNKMSNKGGLMPLNDFMQQYAPSLYATYQENGSMTACSIDGQLVAMPWTKKKSSKAILL YRKDLADQYGVDTSKLETIEDLDAMLTDAKAKVPDITIFESGFPRGNTYSDVLAILHAKY EMDNMNYHTLTFDLNSEKPVLEAVEETPMFKEAVTWMNKWYNEGIVSKNELSETDTKMFE NGKTFAKVGLHETATQGVVFNIPNAELGYAELYTDGKYRYDSPLNNAFAVNKNAANPERL LMLLELLNTDEATYDLFMYGIEGETYVKGEDGSIQYPEGQEATNSTFLGWFCWPFVRGQF DKPSGLITPKALEMENEWLEKDSLVVSPLTGFNPDTSSIKTELAQRDQLYDEQGKLLLAG IVENNDVDAAINKYIENQKAAGLDKILEFMQAEADKMIGK >gi|229784122|gb|GG667613.1| GENE 93 111859 - 112959 677 366 aa, chain - ## HITS:1 
COG:BS_ytaP KEGG:ns NR:ns ## COG: BS_ytaP COG1073 # Protein_GI_number: 16080077 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Bacillus subtilis # 91 316 51 253 299 68 27.0 3e-11 MTYSYKHFFRQFYESAPPKYACTASDAPSFLIWKDKSRDRLRELLGLSRLEQFVADSAPY SREPELLSTFHEDGYTRYKLSIQTLPEVFMPFYMLVPDRLSNSRPHKAMIAIPAHGASKE SVAGVLTSPGVREKLEAAPKESYGLQLVKNGYTVFCPDPPGYGERLEPVSMEDRAFLGDT PPKPLGSSCKNLTITAEALGLTFAGLVLWDLMRLVDYIETLPTINSESLGCCGFSGGGLY TMWLAAMDDRIRLAVISGYLHSAKESILETHLCPCNFVPGLWRDFDLCDIASLIAPRPAF YENGLRDVLNGPKGIDDPVSQFQKIQRIYSLFGKAGYVRHRTFDGPHMWYGLGYDFTDEC LSSSLQ >gi|229784122|gb|GG667613.1| GENE 94 113199 - 114650 1335 483 aa, chain + ## HITS:1 COG:CC0572 KEGG:ns NR:ns ## COG: CC0572 COG5434 # Protein_GI_number: 16124826 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Caulobacter vibrioides # 2 384 33 438 527 129 28.0 1e-29 METFYSVKDFGAAGNGKSLDTEAIQAAIDRCSQKGGGYVLLEKGEFISGTLYMKSGVYLI VTASAKLTASGSIAAYGTDTHYNRYVNEHDMDRCFIYAEEAENFGIMGEGTIDGNAECFP NEGSIYRPMMIRFLRCRNVKLKGIKLHQAAAWTTAFLDSENIWVEDLEICNDKRYNGDGL DFDGCRNVFVARCRISGTDDNLCLQAGSREYPVENVHISDCHFTSICAGIRIGLKSIGDI RNVTVSGCTFQNVWREGIKIECTEGGNISDICIQGNTMHNVSRPVFILLNNRMEEIGSSV ELKEMPEIGTLKRVTISGLIARDDEEMEKIHYRFQDDVMGRPEFGGIRVDANKDHPIENL VFQNIDYTAIGGVRLADIPEGYPEVPDYMAEYKTEYKADHKSKNKTSYAAGNKSDIWTDA GKTEAAVSENYYPDWSRTTHLDIRNVKNLILAQVMVSTVHEDERPKYLVEGCRCLKEEIY DLD >gi|229784122|gb|GG667613.1| GENE 95 114722 - 116201 998 493 aa, chain + ## HITS:1 COG:no KEGG:Mahau_1687 NR:ns ## KEGG: Mahau_1687 # Name: not_defined # Def: hypothetical protein # Organism: M.australiensis # Pathway: not_defined # 4 466 13 499 1018 167 23.0 1e-39 MRNELNFLAFWAVNDRLEPERMMGQLSAMKEMGFHGTVFHPRYYPGIPAYMSEAYLDLLS RLILYAKEISLQFWIYDENGWPSGSADGRVLEHFPDSRCRWMQYENGRVEWHEVRQFNTF DREEMKYFVGAVYDGYRLGLRPEAFDYVTGFFSDEVGFLYGHGVSIKNGGVPWCEEAGER YEKLYHEDVMEKLELLFVEKEGYHQFRYRYWQILTDLLAESFYQNCNDWCVRYGKRYTAH 
LKAEENLFFQTSCSGSVCWNLKNVNVPAVDALERYPGNHYYPVIASTLAKQFYDGESLAE ALGGSGWGLSPENLENYVDWQAGSGINNMVFHLWQYNRSSASVRDWPPNIPMGLTWRETA PKVFERLKQRWNGIVGRHNHILIVAPARGVMASFNPADAMVINEHNGAGTPPAESGRIST VFSEFVEQCFEMGMQYDVTEERIVETYGNVEHGVLHIGNVGYDIVIEGQGCFWEKEQMIR ELNDSGILYRQDE
Prediction of potential genes in microbial genomes
Time: Thu Jun 30 23:32:30 2011
Seq name: gi|229784121|gb|GG667614.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld7, whole genome shotgun sequence
Length of sequence - 76502 bp
Number of predicted genes - 61, with homology - 60
Number of transcription units - 29, operones - 15 average op.length - 3.1
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 3 - 839 844 ## COG1015 Phosphopentomutase
+ Term 903 - 962 5.2
+ Prom 991 - 1050 7.9
2 2 Op 1 . + CDS 1171 - 2064 607 ## Closa_0110 hypothetical protein
3 2 Op 2 . + CDS 2078 - 2485 457 ## Closa_3324 LPXTG-motif cell wall anchor domain protein
4 3 Tu 1 . + CDS 3420 - 5795 1743 ## Closa_3324 LPXTG-motif cell wall anchor domain protein
5 4 Tu 1 . + CDS 6218 - 6469 80 ## gi|266619784|ref|ZP_06112719.1| hypothetical protein CLOSTHATH_00832
6 5 Tu 1 . + CDS 7418 - 11866 2351 ## Elen_0104 cell wall/surface repeat-containing protein
+ Term 11889 - 11955 22.2
+ Prom 11895 - 11954 4.2
7 6 Tu 1 . + CDS 12054 - 13412 1318 ## COG0534 Na+-driven multidrug efflux pump
+ Term 13455 - 13491 3.1
+ Prom 13537 - 13596 6.7
8 7 Op 1 . + CDS 13644 - 14687 431 ## PROTEIN SUPPORTED gi|148379145|ref|YP_001253686.1| ribosomal protein S1
+ Prom 14735 - 14794 4.0
9 7 Op 2 . + CDS 14819 - 15511 966 ## COG2357 Uncharacterized protein conserved in bacteria
+ Term 15519 - 15564 10.1
10 8 Op 1 . + CDS 15604 - 17406 1822 ## COG0584 Glycerophosphoryl diester phosphodiesterase
11 8 Op 2 . + CDS 17453 - 17899 541 ## COG1490 D-Tyr-tRNAtyr deacylase
12 8 Op 3 . + CDS 17943 - 19343 1673 ## COG1362 Aspartyl aminopeptidase
+ Prom 19397 - 19456 5.4
13 9 Op 1 . + CDS 19484 - 20602 1533 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor)
14 9 Op 2 . + CDS 20631 - 21215 661 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins
15 9 Op 3 3/0.000 + CDS 21212 - 22009 837 ## COG0561 Predicted hydrolases of the HAD superfamily
+ Term 22040 - 22092 16.3
+ Prom 22072 - 22131 4.8
16 10 Tu 1 . + CDS 22158 - 23042 210 ## PROTEIN SUPPORTED gi|90020671|ref|YP_526498.1| ribosomal protein S6
17 11 Tu 1 . - CDS 24033 - 25796 1590 ## CLJ_B0947 hypothetical protein
- Prom 25864 - 25923 80.4
18 12 Op 1 7/0.000 - CDS 26895 - 28643 1652 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
19 12 Op 2 . - CDS 28640 - 29401 851 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
- Prom 29450 - 29509 5.4
+ Prom 29574 - 29633 4.6
20 13 Op 1 35/0.000 + CDS 29676 - 31049 1620 ## COG1653 ABC-type sugar transport system, periplasmic component
+ Term 31086 - 31122 0.5
21 13 Op 2 38/0.000 + CDS 31138 - 32049 982 ## COG1175 ABC-type sugar transport systems, permease components
22 13 Op 3 . + CDS 32062 - 32901 1027 ## COG0395 ABC-type sugar transport system, permease component
23 13 Op 4 . + CDS 32911 - 33936 907 ## Htur_1168 hypothetical protein
24 13 Op 5 . + CDS 34025 - 35539 1310 ## COG3507 Beta-xylosidase
+ Term 35600 - 35655 6.2
- Term 35586 - 35641 6.2
25 14 Tu 1 . - CDS 35652 - 37076 1761 ## COG2509 Uncharacterized FAD-dependent dehydrogenases
- Prom 37189 - 37248 5.0
+ Prom 37261 - 37320 6.2
26 15 Tu 1 . + CDS 37403 - 37519 216 ##
+ Term 37581 - 37619 7.0
+ Prom 37663 - 37722 7.4
27 16 Op 1 7/0.000 + CDS 37915 - 38493 651 ## COG0193 Peptidyl-tRNA hydrolase
28 16 Op 2 . + CDS 38490 - 42026 3899 ## COG1197 Transcription-repair coupling factor (superfamily II helicase)
+ Prom 42046 - 42105 7.4
29 17 Tu 1 . + CDS 42176 - 42361 168 ## Closa_4073 hypothetical protein
+ Term 42456 - 42513 12.1
+ Prom 42431 - 42490 5.3
30 18 Tu 1 . + CDS 42553 - 43626 830 ## COG2706 3-carboxymuconate cyclase
+ Term 43695 - 43730 7.4
+ Prom 43716 - 43775 5.3
31 19 Op 1 . + CDS 43843 - 44406 681 ## COG2002 Regulators of stationary/sporulation gene expression
32 19 Op 2 . + CDS 44435 - 44578 203 ## gi|266619811|ref|ZP_06112746.1| zinc finger CCCH domain-containing protein 18
+ Term 44580 - 44646 24.1
- Term 44575 - 44626 12.2
33 20 Op 1 . - CDS 44649 - 45632 1020 ## Closa_4070 hypothetical protein
- Prom 45659 - 45718 4.7
34 20 Op 2 . - CDS 45768 - 47891 1645 ## COG2936 Predicted acyl esterases
- Prom 47913 - 47972 5.4
+ Prom 47956 - 48015 6.5
35 21 Tu 1 . + CDS 48091 - 49383 1172 ## COG2873 O-acetylhomoserine sulfhydrylase
+ Term 49422 - 49468 15.3
36 22 Op 1 1/0.000 + CDS 49481 - 49861 411 ## COG3956 Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain
37 22 Op 2 1/0.000 + CDS 49968 - 50243 376 ## COG0776 Bacterial nucleoid DNA-binding protein
38 22 Op 3 . + CDS 50247 - 50486 286 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog)
39 22 Op 4 . + CDS 50547 - 50831 394 ## Closa_4066 sporulation protein YabP
40 22 Op 5 . + CDS 50850 - 51212 224 ## Closa_4065 Spore cortex biosynthesis protein, YabQ-like protein
41 22 Op 6 . + CDS 51233 - 51556 479 ## Closa_4064 Septum formation initiator
+ Prom 51616 - 51675 5.1
42 23 Op 1 1/0.000 + CDS 51703 - 53115 1354 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit
+ Term 53184 - 53232 16.5
+ Prom 53255 - 53314 4.2
43 23 Op 2 10/0.000 + CDS 53399 - 54736 1039 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control
44 23 Op 3 11/0.000 + CDS 54729 - 55256 538 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase
45 23 Op 4 . + CDS 55269 - 57089 1247 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9
+ Prom 57435 - 57494 5.0
46 24 Tu 1 . + CDS 57526 - 59592 1798 ## COG0366 Glycosidases
+ TRNA 59745 - 59826 51.0 # Tyr GTA 0 0
+ TRNA 59830 - 59915 72.4 # Leu TAA 0 0
- Term 59910 - 59952 11.2
47 25 Op 1 . - CDS 59978 - 60730 618 ## COG0647 Predicted sugar phosphatases of the HAD superfamily
48 25 Op 2 2/0.000 - CDS 60790 - 62124 1086 ## COG1653 ABC-type sugar transport system, periplasmic component
49 25 Op 3 . - CDS 62140 - 63027 846 ## COG1082 Sugar phosphate isomerases/epimerases
- Prom 63201 - 63260 8.1
50 26 Op 1 . + CDS 63411 - 64292 889 ## GYMC10_0463 glycoside hydrolase family 43
51 26 Op 2 2/0.000 + CDS 64289 - 65170 910 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
52 26 Op 3 35/0.000 + CDS 65249 - 66673 1459 ## COG1653 ABC-type sugar transport system, periplasmic component
+ Term 66696 - 66744 8.2
53 26 Op 4 38/0.000 + CDS 66917 - 67822 970 ## COG1175 ABC-type sugar transport systems, permease components
54 26 Op 5 . + CDS 67837 - 68679 854 ## COG0395 ABC-type sugar transport system, permease component
55 26 Op 6 . + CDS 68747 - 70183 1255 ## AciPR4_2750 hypothetical protein
+ Term 70223 - 70272 4.4
- Term 70217 - 70254 2.1
56 27 Tu 1 . - CDS 70322 - 70714 325 ## COG2033 Desulfoferrodoxin
- Prom 70765 - 70824 2.7
+ Prom 70816 - 70875 2.3
57 28 Op 1 . + CDS 70908 - 71339 651 ## DSY1202 hypothetical protein
58 28 Op 2 . + CDS 71382 - 72302 771 ## Closa_4053 hypothetical protein
59 28 Op 3 . + CDS 72328 - 74175 2076 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain
+ TRNA 74321 - 74393 75.9 # Phe GAA 0 0
+ TRNA 74439 - 74509 75.8 # Gly TCC 0 0
+ TRNA 74590 - 74661 74.9 # Gly GCC 0 0
+ Prom 74592 - 74651 80.2
60 29 Op 1 2/0.000 + CDS 74887 - 75744 891 ## COG1737 Transcriptional regulators
61 29 Op 2 .
+ CDS 75746 - 76502 908 ## COG2103 Predicted sugar phosphate isomerase Predicted protein(s) >gi|229784121|gb|GG667614.1| GENE 1 3 - 839 844 278 aa, chain + ## HITS:1 COG:SA0134 KEGG:ns NR:ns ## COG: SA0134 COG1015 # Protein_GI_number: 15925843 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Staphylococcus aureus N315 # 3 278 120 392 392 340 60.0 1e-93 RRTGHKIVGNKSASGTEILEEYGEHEIATGDMIVYTSADSVLQICGNEETFGLDELYRCC EIARELTMKDEWKVGRVIARPYVGRKKGEFKRTSNRHDYALKPYGETALDALKAAGYDVI SIGKIKDIFDGEGITEAYKSKSSVHGMEQTIEVMDKDFHGLCFVNLVDFDALWGHRRDPI GYGKEIERFDEKLGIVLSKLKENDLLILTADHGNDPTYTGTDHTKEKVPFLAYSPSMSGG GRIPSEDTFAVIGATIADNFGVTMPKGTIGKSMLNELK >gi|229784121|gb|GG667614.1| GENE 2 1171 - 2064 607 297 aa, chain + ## HITS:1 COG:no KEGG:Closa_0110 NR:ns ## KEGG: Closa_0110 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 272 1 273 294 180 35.0 6e-44 MTIAKMSMNRNSSRYFRISVNGFENGCLSGIIYHEGAAPGIRFRNFLEMMLHMNRIFDEL AYPKRTMDYRKFGGTKLPRPPVQEYDQLENGTLATFNLNVKFRYNASWQGEIFWLEGQEK QEFESFLQMTYWIERVLNGPVERQQLQKSSNICQIAVDSFEGGLLTGSVQNAFLNYLEDF TGTIALADAMGHFIGLGSLKEDDSGGQGDDIRIIPDKMWSVYREGGKEATFLIKILFREH STWQGVIRWRETGEKQAFRSFMELVLLMASALEGSDGRILYEERYLNNYRESALMEG >gi|229784121|gb|GG667614.1| GENE 3 2078 - 2485 457 135 aa, chain + ## HITS:1 COG:no KEGG:Closa_3324 NR:ns ## KEGG: Closa_3324 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 135 1 135 2050 176 62.0 2e-43 MKGLKKKFRRGLAGFLSFVLTMTSFNMVSWADVASAFEKENATFVMSGEDLRDSAQAAID NGDEFNFEDLGVDTSDRSLAKEYQRLFETGSVFEFAPAYDMDEEEYADGAELRMFIRING DPEGYQITGDEDIIF >gi|229784121|gb|GG667614.1| GENE 4 3420 - 5795 1743 791 aa, chain + ## HITS:1 COG:no KEGG:Closa_3324 NR:ns ## KEGG: Closa_3324 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 285 740 199 642 2050 254 39.0 8e-66 
MTFRSRIDGYTTQKVTVKGNSTLLEAAGPTVPSDGAGAAGGAGAGTNGAAGGEVQNPNEN AGGENGTVEGNVQNPDSDAADGSVQNPDANAGEGNGTTDGTVENPDAGVNDGTDNGTTGS DIQNPAGNADEGNGTAGETAGNPEINGNNGSGTAGSDIQNPDANADSDKNANTGKDEQIQ NPADHSGSSDNGQNSENGKTEEGQKSEAGADKQSSNDDKGGSNASSDEGNAGSGASDSGS SDSGSSDSGASSSGDAVVSAASVIRNYTHILTTSVNNEAGSDEKASSDEKNEEKEVVSAD SGDSKEKADAGEKKDAEEVKADSEDSNSSDSDNKEAAAAPADDVESNTVDDGDKADSVIP ADREGQNPSDDQADGQGEVTAPADNQDNAVIDDNVKDDADKNAAADSKADDTNVAGNDHG VNADGGKNDNTPAVEPEDEEEKEVVSKTGTTSGKTYGQIVLDESYYAKAYVTTLNKLHVD VSKEGYAVTYTVTPVGTAAVKGAKNVEAGKDLTFTVKPQVGYVIDSVTANGESLDAVEED DATDSNAETGVKRFVVPEVEEEQEIVVAMAETGEHPEFNFSKTLGDVVVSLHAEEGILPA GTVAKVTEVTEKVEEAVKEKTAEETGEDTSANTVLAYDIKLFVENEAGELEALDNSWSEN GYVEVTFAGKAIEEKSAEAESLEIVHVDTGSIDVTERSIKVVAADQVQGVETVSEAVDIS GDKSVSDISFDASHFSIYAIIAAKKVYKEVTINKGETITLHSELGFGTWKSSNSDVAKIK VIMEIQFPYPE >gi|229784121|gb|GG667614.1| GENE 5 6218 - 6469 80 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619784|ref|ZP_06112719.1| ## NR: gi|266619784|ref|ZP_06112719.1| hypothetical protein CLOSTHATH_00832 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00832 [Clostridium hathewayi DSM 13479] # 1 82 143 224 224 155 98.0 1e-36 MSFDVQEVGDENFRPVESYPIIIKKNTNIKDLNVPVVADKTVDGDQYKFDGWYKDKECTV KASFTEGRITENTTFYGRYVQVF >gi|229784121|gb|GG667614.1| GENE 6 7418 - 11866 2351 1482 aa, chain + ## HITS:1 COG:no KEGG:Elen_0104 NR:ns ## KEGG: Elen_0104 # Name: not_defined # Def: cell wall/surface repeat-containing protein # Organism: E.lenta # Pathway: not_defined # 562 1373 1132 1942 2099 140 26.0 4e-31 MTPTKGSVNKNVTVASNTFTRAGYSFSGWNTEREGTGTYYAPGSQYKLLAGDNILYAQWT PDKNTKYSVAYYYQNSDNGKFVEDTKLRDDTRTGVTGTLVEIAEDDKVAEKAGYILDNSQ IKNWQGTISGDGKLVLKVYFKLQYTIAIAGDDHVNITSDSPIYVDWGSNATGSWTIDEGY SISEVTVNRVPVKVASDATSYTMSNIKENIDVYVKTKSEDANYKIEYYQENLPGAEARYT KVKEGSGKGKTGERVQINAENFDGFEYQKDKDEYFVGKTPQSQSSPIIAGDGSLVIKLYY DRKTNLSVEVKYVEYVSDDADYSTAEELIKSEKLTGTFGTVYDIPEKSIQDYDLKLIIDR NNPKVFNENMDTIYVTYTKRAAYTVWFNGIQITDLINDQNQNNYQTSDDEWITKLSWNGM 
ASNNTSVPYSENEIKDLADQMLKIYSDNTGWDNLSYKSFDIVQEIDLPNGKKDWETLYTS GGHESLKNVKLRAGYETRYQIYFDSTMSIKATGANRIYNGELQNLIESLTTQIDGSGAEI TPTYTYEIAKITNNVIGNYKASKNPTEIEAGEYSIRIKAKKDKFKDTETVIRAVISKRPI TVYAGSLTEKYDGSEKEVTKFTYTVSNEDNTAGALEDHTVSAVLENNKRTEAGDQVVRVV ENSVRILSGETNVTGNYVVSLETGLLIISRRDDLKITVDAESLSHVYDGAGHGIKAAEAS DTKGTTIEYSTNGRTWSGTIPQFTDYKEGGYPVYVRAKNLNYSNTAEAEVVFNITKRPVT VSAGTETFTYDGSEKKVTEIKAEPVNETNTEGILPDHTAKAELENNKRTDAGEQIVKIKE GSVTILSGRKDVTENYAVSLGDGKLIINQKGGLRVTVDAENLSHVYDGAGHGIKAAEASD AKGTTIEYSTDGESWSGTIPQFTEYREEGYPVHVRAKNDNYSNVAEADVVFRITKRPIKV AAGILETEYDGSEKAVTEFTYTVSNKENTAGALKDHKVSAVLKNNTRTEAGEQTVSVEEN SVRILSGEADVTKNYAVSLEDGKLTVKRKDGLKVTVDAESLSHVYDGAGHGIKAAEASDA KGTTIEYSTDGESWSGTIPQFTEYQEDGYPVYVRATNPNYSNTAMADVIFMITQRPAHIQ ASNDGKVFGTADPELMASVTGIVTGEKLNYNLNRVAGEDVGTYPIEVILGSNPNYSITYD NAVFEITSADTNKVFITGTTATYDGKAYGVDASVVQNGSTILYSTNQTDWSETAPTFTEA GTYTVYVKATNPNYNETQVAEGTVVISKREVTITADSAEKRYDGTDLTAPTATITGGTLV EGQTLESVTVTGAQRSTGTSANVASNAMIKSGNTDVTANYAITYTDGVLTVTSVGGNSGG NPGGNGGGSTPNENKPYQPGGPGDNNGPTVTIDPDAVPLANAPVDGSPTDNLILIDDGNV PLAGLPKTGDRAGAQAGLAAILSGFLLAAFSMLNNKKKEENK >gi|229784121|gb|GG667614.1| GENE 7 12054 - 13412 1318 452 aa, chain + ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 427 1 427 447 289 40.0 7e-78 MTKDMTQGKPLKLILAFSIPLLIGNIFQQFYNMVDSIIVGQFIGEEALAAVGATGSMVFL VIGFCNGIASGFGVMISQAFGAGDKKRLRNYVAVALMLSAVVMVVLTGVTMVTTEPLLHL MRTPENIMDGASSYLMIIYGGLGATLYYNVLSCILRGVGDSRTPLYFLIVSSLLNVVLDL LFVITFHMGVAGAGLATVIAQGVSAVLCLIYMFKKFPMLRLEKSDFRCRWYTVVQMFSVG LPMALQFSITAVGVMILQSAINDFGSTVVASYTAASKVENLSTQIMITLGAAMATYCGQN MGAGRFDRVKEGVRQSYIIVLGAVGVAVFVSAVCGESIVRWFISNPSEEVLAYATTYLHT ISWYFVFLAFLFLYRNVLQGLGNGILPMLCGVAEMVVRTIIAFTLTGKWGYMAICMASPL AWIAACIPLWLCYQIQIRDPQRWFLRKRTVRA >gi|229784121|gb|GG667614.1| GENE 8 13644 - 14687 431 347 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148379145|ref|YP_001253686.1| ribosomal protein S1 
[Clostridium botulinum A str. ATCC 3502] # 55 318 1 265 381 170 32 2e-41 MSEERNEILEQAAAADAVTETADAVVDTAGTASEEAAALETETPAPQPEARVETMEDYAK ELEASFKRVKEGDILTGTVAGVTEDQVLVDLKYYAGGVINKENINGDPDYQLQAEIHPGD EITATVISTDDGEGNIELSMKEANENRLWDKFAAMMADRTIVNVKIAEIVKGGAVAYLEG VRAFIPASKLAAEYVENLEEYNGKTIEATVITADAEAKKLVLSGKEPARAKLQEEKNKKI ARCEVGAVMDGTVDTLKEYGAFINLENGLSGLLHISQISNQRIKHPGAVLKEGQTVKVKI ISIKDNKISLSMKALEPDEEIDLEGFDYKGDGEVTTGLGALLKGLKF >gi|229784121|gb|GG667614.1| GENE 9 14819 - 15511 966 230 aa, chain + ## HITS:1 COG:lin0794 KEGG:ns NR:ns ## COG: lin0794 COG2357 # Protein_GI_number: 16799868 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 6 204 5 203 212 208 51.0 8e-54 MNIPNSFNQVDQWKSVMFLYDSALKEINTKIEILNNEFVHIYNYNPIEHIKSRLKTPDSI VKKLKRYGFEVTIDNMVEKLSDIAGIRIICSFTSDIYQIAEMITKQSDVTVLYVKDYIKN PKPNGYKSYHMVVTIPIYLTDGPVDTKVEIQIRTIAMDFWASLEHKIYYKFEGNAPAYLQ QELKACADVVNMLDVKMFSLNQAILELAEAQRMQDQEMPDDEEVAEEELP >gi|229784121|gb|GG667614.1| GENE 10 15604 - 17406 1822 600 aa, chain + ## HITS:1 COG:CAP0015 KEGG:ns NR:ns ## COG: CAP0015 COG0584 # Protein_GI_number: 15004719 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Clostridium acetobutylicum # 320 575 4 264 281 175 38.0 2e-43 MRTLTKETCQILKFNQKNALLFEVLYRLLTGPLYLFLLNKGLKWSLKMAGYSYLTAGNFG YFLLKPGTILMILLIGIVGVLFLALETGCLLTLFQGAAYFRKLRVGEIFLGGLSKLSDEV KKKNWKLGLLILADYGISNLYLIYRVLTHVKPLNFVMSELMRQNAGRLAVCAAVVLILLA VVPGLYTFHACMIEQKNFRDGYRRSRRLLKGHLANTVVLLGGYYAGLILLLHVIYIFCIL VAAVGMSLFTDNRLALAILPAVGDRIELVLIFLASMGLALGNFAALSVQYYQYSNRLSRE PRWDFSYPSRKSFRGRILGAGVAVAAALSLFFLFNLVRNGSAITDDILTEIQITAHRGSS KSAPENTMAAMELAVENMADFVEIDVQETADGVVVLGHDSTLKRVAGVNRTIGSYTLKDL QDLDVGKWFSREFEGERIPTLTEVMDYCKGRIKMNIEIKNLGKDSPLPDKVVQLITDHQM KEQCVVTSTRLSYLSRVKELDPDIRTGYIISAAYGDYYSDDDVDFISIRSSFVSGKLVEA AHEKGKAIHAWTVNNKSEMERMKMLGVDNIITDYPVLAREIVYRERATETLFEYLRLVLK >gi|229784121|gb|GG667614.1| GENE 11 17453 - 17899 541 148 aa, chain + ## 
HITS:1 COG:BH1243 KEGG:ns NR:ns ## COG: BH1243 COG1490 # Protein_GI_number: 15613806 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Bacillus halodurans # 1 146 1 146 146 153 53.0 1e-37 MKVVLQRVSRAAVSVDGEVIGAIGKGFLLLLGVSDTDTEQTADRMVDKICKLRIFEDENG KTNLSLADVGGALLVVSQFTLYADCKKGNRPSFIKAGGPKLAEELYEYVISRCRERVQMV EHGSFGAHMIIDLENDGPFTLVLDSEEL >gi|229784121|gb|GG667614.1| GENE 12 17943 - 19343 1673 466 aa, chain + ## HITS:1 COG:CAC1091 KEGG:ns NR:ns ## COG: CAC1091 COG1362 # Protein_GI_number: 15894376 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 9 462 1 462 465 426 48.0 1e-119 MEKTKGELLAEELTWEFPNIAKEAPEQREAAEMFCAGYKAFLDKGKTERECVKEAVKILE AAGYEPFEAGKKYAAGDKVYMVNRKKAVIATTFGSKSVEEGLRFNGAHIDSPRLDLKPNP VYEKKDLAYFKTHYYGGIRKYQWGATPLSMHGVIVKKNGEIVEVNIGEQEGDPVFCVTDL LPHLAADQNMRPLKDGLKGEELNVIIGSIPYVDEAKIKEPVKLLALQLLNERYGITEADF FRAEIELVPAHKASDVGLDRSMIGAYGQDDRVCAYTALMAEVETKNPVYTTVTILTDKEE IGSVGNTGLNSDYVLHYVEELAETQGADVKRALKNSICLSSDVNAAYDPTFPEVYEERNS CFLNKGCVLTKYTGSRGKSSSNDASAEVMAKVIAMMDQEGVYWQIGELGAVDVGGGGTIA QYVAHMNVDVVDLGVPILSMHSPFELASKLDVYNTYKAFLAFYRAS >gi|229784121|gb|GG667614.1| GENE 13 19484 - 20602 1533 372 aa, chain + ## HITS:1 COG:BS_tig KEGG:ns NR:ns ## COG: BS_tig COG0544 # Protein_GI_number: 16079875 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Bacillus subtilis # 65 368 116 423 424 142 35.0 8e-34 MRKFLKVAACGLIMAAVVTGCSKKAAEETTPAETTTETTAEAGSETEASEQLEMMDLSGI DNGTVTLGEYKGIEVTKTPVEVSDEEVDAAILSERESKTTYDDVDRAVQDTDKVNIDYVG TKDGVAFDGGTAQGQDLVIGSGQFIDGFESGLIGAKKGDEVTLDLTFPENYGNADLAGQA VEFKVTVNNVQEKNMPELDDAFVQEVSDFKTVDEYKESKKQTILEQKEGQAQAAVENDIL KAVVENSQIETTQEAVDANFNNYLLNYTNQAAMYGIDLNTFTSAFYGVDEETFKENYVKN IAKSAVEQRLVFHAIADKEGITVSDEDRDNLAAEMGYESKDQMIESAGAYNVDDYLISTK TMKFLVDNAVIK >gi|229784121|gb|GG667614.1| GENE 14 20631 - 
21215 661 194 aa, chain + ## HITS:1 COG:BH1514 KEGG:ns NR:ns ## COG: BH1514 COG0503 # Protein_GI_number: 15614077 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Bacillus halodurans # 1 190 1 188 198 170 49.0 1e-42 MQLLKDRIRKDGKIKSGDVLKVDSFLNHQMDIKLFEEIGKEFKRRFSDADINKILTIEAS GIGIACIVAQYFDVPVVFAKKSKTKNIAGDVYTTKVESFTHGKVYDIMVAKEFLGAGDKV LLIDDFLANGKALEGLAAVVKDSGAELIGAGIVIEKGFQPGGDRLRADGIRVESLAIVES MDEKTGSIRFRGDE >gi|229784121|gb|GG667614.1| GENE 15 21212 - 22009 837 265 aa, chain + ## HITS:1 COG:CAC2244 KEGG:ns NR:ns ## COG: CAC2244 COG0561 # Protein_GI_number: 15895512 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 4 265 2 266 266 177 35.0 3e-44 MTSNRKALFFDIDGTLLSEVNRNVPESARKAVAGARAAGHLVFINSGRTYSLIGPIRDLV EVDGYCCGCGTRVIVGDEVLFSFTIPHERGLQIKKDIIKYNLGGILEGTEHCYFRKERTR FTQLEQIRENIERDGHGAPYTWEDDCFDFDKFCVFADEQSDLAGFSRALGLEYNIIDRGD GFYECVPADYSKATAIDVVLEKYGIALEDAYVFGDSTNDLPMFEHAKNCILMGHHDVELE PYATFFTKNVEDDGIEYAMKKLKLI >gi|229784121|gb|GG667614.1| GENE 16 22158 - 23042 210 294 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020671|ref|YP_526498.1| ribosomal protein S6 [Saccharophagus degradans 2-40] # 18 280 22 277 293 85 25 8e-16 MKAKVNVIAAMLMFGSIGLFVRGIALPSSVIALVRGVIGCFFLVTAGSLMKQPVNWKAVK SNWLVLVLSGAAIGFNWIFLFEAYRYTTVSCATLSYYFAPVFVILVSPVVLGEKLTKRKL CCAAGAVIGMILVSNVLFLDQGENGLIGIFFGLMAAVLYASVMILNKFMKGLTGLETTAA QLGVAAVILLPYTLLTSAEPFSAVLSSVPAVSWILLAFVGIVHTGVGYLLYFSAMKKLPA QTVAVLSYIDPVTAIILSAVILGETMSGPQMAGAVLILGMTWMGGREPQNKQGN >gi|229784121|gb|GG667614.1| GENE 17 24033 - 25796 1590 587 aa, chain - ## HITS:1 COG:no KEGG:CLJ_B0947 NR:ns ## KEGG: CLJ_B0947 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_Ba4 # Pathway: not_defined # 23 530 5 533 535 409 41.0 1e-112 MIIVKGPNLNPLYSDGAFAWCFIFSAYVVLNFLIGLGGIRITTDQASHMSVSVDKNVKGI 
KTGFTLLIIFWVLYFAATVISMPLFNYKAYRDQLGQPEISEFTDTVQPLALNQLPVVDKA LARELADKKVGENPGLGSQVVLGTPVIQKVNDKLVWVVPLEHSGFFKWLKNLDGSAGYIV VSATDVSDVDLVTDHKIKYQENAYLLDDLNRHVRLFEGGLFRGMTDYSFELDDDGVPYWV ITTYKNRWLFSLPEASGVAIVNATTGETNYYDIDHLPEWVDRVQPEAFIMQQIQNQGAYV HGIFNFSNQDKFRTSQGNIIVYNGSDCYLFTGLTSVGSDESAIGFIMVDMVTKEPKMYQM SGATEMAAQQSAQGKVQQYGYTASFPMIINLDGKATYFMTLKDRAGLIKQYAFVSVSNYT NVGTGETIDSALRNFRQVMGSSNDGIITGQGAKEAEGKVYRIASEALGESITYKLILKEK PNQIFTISYDLSNELALTEPGDRVKISYMASGTGICIATAFDNLEFKQISEDEETAPAEP PVPLPPSSPGSEVPDGDEAEGSSGIQNRDEPAAAESSEAMESGNSAE >gi|229784121|gb|GG667614.1| GENE 18 26895 - 28643 1652 582 aa, chain - ## HITS:1 COG:BH1909 KEGG:ns NR:ns ## COG: BH1909 COG2972 # Protein_GI_number: 15614472 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 245 576 255 588 597 128 30.0 2e-29 MRKRNFIKRFLYYTGIITIPIITLCFIFMAVIVYRKQQDIETEAIKSAQAVRDNYSLVLD SAATQYELLARNPRMSISLRNYISHKQIGYLDVILTNSILSNFASLTNHASYIASVYYYL DGYDSLLTSNTGGLTSLAQYSDLNWLPSYAESDQNEWIEKRNMHLYSYTNPVPVLTYYKK LSMFRGVVMVNIYEDQLKQMLESNSSLDQFFIIDSDGNFLLQGRGTEEPDPESKEALIEA VHAASGTPFQDWLSLGRTRYLVRIIPADCGTYIAACTSRSSYYQALYTLLLQLFIVLAVI ICTVVWIAYTITKKNFQQIDYFVTVFSDAERGTYNNEKPAFIKDEYDLILNNIVQLFINQ TFLKNKLALKEQEQKLAEMTALQLQINPHFLFNTLQTLDFKALEYTRMPTALNRIIEALS DILKYSLQNTMSMVTLKDEIEYLKKYDHIQQYRYEDKYILYYDYEEELESTPIIRLILQP LIENSLYHGIKPLDGSGFIKLKITERDNWLTVTVIDTGVGMEKAEITALYEKINHPGNEN IGLANVNSRLVMQYGEESRLKILSKKHMGTCISFRIPFSQSL >gi|229784121|gb|GG667614.1| GENE 19 28640 - 29401 851 253 aa, chain - ## HITS:1 COG:SPy1556 KEGG:ns NR:ns ## COG: SPy1556 COG4753 # Protein_GI_number: 15675449 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pyogenes M1 GAS # 1 245 1 246 246 132 34.0 8e-31 MYQVIIVDDEAKIRNGIAGLFPWSQLGFEIAGCFSNGKAAYEYTLANHVDLVLSDIRMPI MDGLELSEKLLEKRDIKIIFFSGYQDFEYVRRALRTGVVDYLLKPVKYEDLVDCLTRIRD 
RLNSETHAAPAREEENLTYYEKIISLVKNYLDTDYQNATLEQAARLVNLSPNYLSRIMKE HSDVNFSDYLLKTRMENAARMLKDIGCKQYEIAYRVGYDNPKNFSRAFHQYFHMTPSQYR KSSGADHDYGGIS >gi|229784121|gb|GG667614.1| GENE 20 29676 - 31049 1620 457 aa, chain + ## HITS:1 COG:YPO1719 KEGG:ns NR:ns ## COG: YPO1719 COG1653 # Protein_GI_number: 16121979 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 53 443 23 415 430 156 26.0 1e-37 MRKGKKLTCLLMTGLMAASSLTACTNEPAPAASNAPEGGQSTEKAAEKTTEKASGDQITL RFMWWGGDARNEATLAVIDQYEKLHPEIRIEAEMNSDQGYIDKVSTMLANGTAPDILQQN VDSLPDFISRGDFFVNFKDYPDLFDPSGFDETFISQFGTFDGKLLALPTGMSCLATVANT AAAEANGVDLTKQITWDSMLEDGKALHAQNPDYYYMNVDTKILCEYVLRPYLRQMTGESF IIDSEKKISFTKEQLVEVLQYIKDCYDGGVFEPAEDSATFKGQIHTNPKWIDGKFVFAYG PSSSISLLMDAIPETKCSVVQMPMAENRVSDGYFADTPQYMTVNKNSDHVEEAVKFMDYF YNDPQAQETLKDVRSVPPTASARTLCAEKGLLNATVVEAVDLAAGLNGKSDKGYTTSAEV YAIQEDMIESVAYGQSTPEEAAENAIDLINDYLSGLN >gi|229784121|gb|GG667614.1| GENE 21 31138 - 32049 982 303 aa, chain + ## HITS:1 COG:AGl3351 KEGG:ns NR:ns ## COG: AGl3351 COG1175 # Protein_GI_number: 15891796 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 21 301 7 287 293 281 51.0 2e-75 MANVSVKPRPNHGGWTKRNRIGFLYVCIWIFGFLVFQLYPFVSSLFYSFTSYDIFHKPEF VGLDNYVRLFTKDKEFWNSLTVTLKYTFITVPGKVVLALIIAMILNRELKGINFIRTVYY IPSLLSGSVAVAILWKVLFMNDGFINSLLGVIHIAPVKWLGRPEMALTTICMLEIWQFGS SMVLFLSALKQVPQSLYEAARIDGASRVRIFFKITLPMITSIAFFNIIMQLITALQNFTS AFVVTNGGPNKATYVLGMKLYTDAFKYFKMGYACATSWILFVIIMIMTLILFATSRKWVY YDN >gi|229784121|gb|GG667614.1| GENE 22 32062 - 32901 1027 279 aa, chain + ## HITS:1 COG:YPO1721 KEGG:ns NR:ns ## COG: YPO1721 COG0395 # Protein_GI_number: 16121981 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Yersinia pestis # 1 279 28 306 306 275 46.0 8e-74 
MKNKAVKKGLTYVFLGLFGIVMLFPVIWMFFACFKTNNEIFGSLSILPQGWSLEAFVKGW KTTGTYTYAQYFINTFALVVPTTLLTLASCSLVAYGFARFDFPGNQFLFMVLIATLMLPN AVIIIPRYALFNKLGWLDSYMTFFAPAAAGCYPFFVFMLVQFLRGLPRDLDESAYIDGCG PFQCFIQILLPLLKPALFSAGLFQFLWTWNDFFNTNIYINTVSKFPLSLALRVSIDVTSN IQWNQVMSMALVSILPLILLFFAAQKYFVEGIATTGMKG >gi|229784121|gb|GG667614.1| GENE 23 32911 - 33936 907 341 aa, chain + ## HITS:1 COG:no KEGG:Htur_1168 NR:ns ## KEGG: Htur_1168 # Name: not_defined # Def: hypothetical protein # Organism: H.turkmenica # Pathway: not_defined # 24 332 24 332 342 166 35.0 2e-39 MEFNRGLLYHVESLEQDDTYGDYIIGRIPEAVRRQLNPKAQSNMLFVSNSELRFVMEEDA VTIVLRRLPVSGQIKSRGILEVFSGDYQGSCEISPEVIDVNETEITIRRQDWSHICRFPR NPGGFAPQVMRVLLPYDWGCCIREIRGAIRPPKTEEMPEKRLLFYGSSITHGGNASVPSR TYAFQTAWELGYDFINLGSAGSAFMDCAMAEYIAGLTDWDAAVFELGVNVIEEWTDGQLY ERAYQWIDTVKHAHPEKTVIVTDMYYNHYDFEGDERSDAFRKKIEECVSVLEKKYDRLYY KKGTEIMGTFRGLSSDGLHPSDAGHTMIAQNLTNFMRIHGL >gi|229784121|gb|GG667614.1| GENE 24 34025 - 35539 1310 504 aa, chain + ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 3 503 41 528 531 198 29.0 2e-50 MQTYKNPVLYADYSDPDVIRVGEDYYMVASSFTYLPGVPLLHSKDLVHWELINYCVKELP YEKYAVPSHGSGTWAPAIRYHNGTFFVFIPLVDEGILVARSSDPYGEFQLNMLCERKGWI DSCPFWDEDGKAYMIFAYAGSRAGIKHRLMLVEIDPDCRELIGEPVLIFDGEQIAPTTEG PKMYKKDGYYYVLMPSGGVENGWQSCLRSKSVYGPYEYKIVMRQGNSPVNGPHQGGWVTS TDGRDWFLHFQDVIELGRIIHLQPVCFVDGWPFMGQDTDGDGIGEPVAEWNMPAEDMPEY EIVRGDDFSSEKLGLQWQWQANPNPENYSLTAVPGHLRLYCRRNPDRDNLLWYAPNVLTQ IPQQKEFSMIVKLELHGEKTGDFGGIGMVGHSYGYAGLYQGAEGPELRCYEGTVSEKMFL GEAEERCILKVPAEGSCLWLKLSVSGDKTYRFSWSADGTVFQKLEPVFTLCRATWTGAKL CLWSCSRENEESAGYCDYEYVLFE >gi|229784121|gb|GG667614.1| GENE 25 35652 - 37076 1761 474 aa, chain - ## HITS:1 COG:BH1470 KEGG:ns NR:ns ## COG: BH1470 COG2509 # Protein_GI_number: 15614033 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: 
Bacillus halodurans # 8 467 5 477 480 414 46.0 1e-115 MSSNSSNYDVIIIGAGPSGIFCAYELIQQKPDLKILMVEKGRRIENRTCPKRTTKTCVGC KPCSITTGFAGAGAFSDGKLSLSPDVGGNLPEILGYDKTVELLKESDNIYLKFGADTNVY GIDKQKEIQEIRRKAINANLKLIECPIRHLGTEEGYKIYTRLQEHLLAQGIEMEFNTMVK DIIIEDNEAKAIVTDKGETYYAPEIVSAIGREGSDWFSHICDAHGIETRVGTVDIGVRVE VRDEIMEFLNKNLYEAKLVYYTPTFDDKVRTFCTNPSGEVATEYYENGLAVVNGHAYKSK EYKTNNTNFAILVSKNFTKPFKTPIEYGKHIAQLSNMLCDGKILVQTFGDFQRGRRTTEE RLCRNNLIPTLKDAVPGDLSLVFPHRIMVDIKEMLLALDKVTPGIASDETLLYGVEVKFY SNKVVVNSDFETSVTGLRAIGDGASVTRGLQQASANGISVARSILKKLESQKKA >gi|229784121|gb|GG667614.1| GENE 26 37403 - 37519 216 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTAYEIISIFTGILALLMSFGSLLVAVIAFLDKRNKRK >gi|229784121|gb|GG667614.1| GENE 27 37915 - 38493 651 192 aa, chain + ## HITS:1 COG:CAC3217 KEGG:ns NR:ns ## COG: CAC3217 COG0193 # Protein_GI_number: 15896464 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Clostridium acetobutylicum # 1 185 1 183 187 196 52.0 3e-50 MYLIAGLGNPTREYEKTRHNVGFEAIDILADKAGTTVTEKKHKALYGKGYIGGQKVILAK PQTYMNLSGESIREIADFYKIEPENIIILCDDINLSEGQLRIRLKGSAGGHNGLKNIISH LGTQEFPRIRIGVGEKPRGMDLADYVLGRFPKEQQAVMEEAYRDAADAACMMIEEGADAA MNHYNRKHKENS >gi|229784121|gb|GG667614.1| GENE 28 38490 - 42026 3899 1178 aa, chain + ## HITS:1 COG:CAC3216 KEGG:ns NR:ns ## COG: CAC3216 COG1197 # Protein_GI_number: 15896463 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Clostridium acetobutylicum # 3 1177 4 1169 1171 981 44.0 0 MSDVFESPLRELKEFEDLDRDLNRKKGPVLISGCMDSQKVHLMYEAAKEVPWKLVVTYDD TRAREIYEDFRNFEPGVSLYPARDLLFFNADIHGNLLTKQRMAVFRRLMEEKSGVVVTTF DGLMDHLLPLSFLQKQILEVESGQTLDIAEWKEKLVELGYERVAQVDGMGQFSIRGGIID VFPLTEELPVRIELWDDEVDSIRTFDIESQRSIEELERVLVYPATEMVLTKQQITDGREA IEKEAKKYEKELRNQLKTEEAARIHSIVSELLDGLREGWKQHGLDGYIHYFCRDSVSFLD YFPEDNTVIYLDEPERLKEKGATVEEEFRESMISRLEHGYLLPGQTGLLYPAKEMLAKAQ 
RDRTAYLTGLEQKLSGFTVKERYSLSVKNVNSYQNGFELLIKDLTRWKREGYRVILLSAS RTRSSRLASDLREYSLRAWCPDESREEVEVKPGEILVTYGNLHRGFEYPMIKFIVITEGD MFGVEKKKRKRKKTTYEGTKIQSFSDLAVGDYVVHEDHGLGIYRGIEKIEQDGVIKDYLK VEYGDGGNLYLPATRLDGIQKYAGAEAKKPKLNRLGGDQWNKTKTRVKGAVREIAKDLVQ LYAARQDTQGFQYGPDTVWQKEFEEMFPYEETEDQLDAIDSTKSDMESRKIMDRLICGDV GYGKTEIALRAAFKAVQDEKQVVYLVPTTILAQQHYNTFVQRMKDFPVRVDLMSRFRTPS QVKKTLEDLKRGLVDIVIGTHRVLSKDVQFKDLGLLIIDEEQRFGVAHKEKIKKLKENID VLTLTATPIPRTLHMSLVGIRDMSVLEEPPVDRMPIQTYVMEYNDEMVREAINRELSRGG QVYYVYNRVSNIDEVAGHIASLVPEATVTFAHGQMHEHELERIMFDFVNGEIDVLVCTTI IETGLDIPNANTMIIQDADRMGLSQLYQLRGRVGRSSRTSYAFLMYKRDKMLREEAEKRL QAIREFTELGSGIKIAMRDLEIRGAGNVLGAEQHGHMEAVGYDLYCKLLNQAVLELKGQR REEDTYETVVDCDIDAYIPTSYIKNEYQKLDIYKRISSIENEDEYMDMQDELMDRFGDIP KPVENLLRVAGIKALAHSAYVTEVNINSQEIRLTMYQKAKLSVAGIPAMVDQYRGDLKFH MAEEPYFTFIDRRKNQTTAGMMEQAEELLKQLYELTEK >gi|229784121|gb|GG667614.1| GENE 29 42176 - 42361 168 61 aa, chain + ## HITS:1 COG:no KEGG:Closa_4073 NR:ns ## KEGG: Closa_4073 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 61 1 61 61 87 67.0 2e-16 MDVLGMYLLQRVSNELTIQRNESFENRFVAPVPDSTAARVTHEAYVCSSKDVSPRVTEGA F >gi|229784121|gb|GG667614.1| GENE 30 42553 - 43626 830 357 aa, chain + ## HITS:1 COG:BS_ykgB KEGG:ns NR:ns ## COG: BS_ykgB COG2706 # Protein_GI_number: 16078366 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Bacillus subtilis # 5 351 4 343 349 205 35.0 1e-52 MKQNYFVYAGCYGPAEETGIHGYRLTRREGREPQLTEVFAAAGVSNPSYLAVSADGKFLY SVMEDMAYEGREGGGVAAFRIDEGSLTLLNTQGTAGTLPCHLLPDEKRGFLYAANYLSGS VSMFRLNADGSVGELCDLKQHYGHGPNEERQEGPHVHYVGYSADEAGLWCADLGIDRIRY YRIDEENGKLVPDEKHDMIFPEGTGPRHFVLNRERRELMYVVSELSSELFVMECGAEENR ILQRVSTLPESMENNTCAAVHLSEDGRYLCASNRGHDSIAVYAVDAGTGLLTLAGRAGTK GKTPRDFCLCGDILLSANQDSHTITMLHFDEENGTITDMNREISCASPVCLVGVPEP >gi|229784121|gb|GG667614.1| GENE 31 43843 - 44406 681 187 aa, chain + ## HITS:1 COG:CAC3214 KEGG:ns NR:ns ## COG: CAC3214 COG2002 # Protein_GI_number: 15896461 # 
Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Clostridium acetobutylicum # 6 186 1 182 183 201 58.0 6e-52 MEEFYMKATGIVRRIDDLGRVVIPKEIRRTLRLREGTPLEIFTDREGEIILKKYSPMVEL TAFASQYAEAMAQTTGLTVCISDRDQVIAVAGGSKKELLQKTISKQLENVINERTVILAG KDDKNFIPLTGETLEGITAQVIAPIICEGDAIGAVALLSREPRARFGDMETKLATTAAGF LGRQMEG >gi|229784121|gb|GG667614.1| GENE 32 44435 - 44578 203 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619811|ref|ZP_06112746.1| ## NR: gi|266619811|ref|ZP_06112746.1| zinc finger CCCH domain-containing protein 18 [Clostridium hathewayi DSM 13479] zinc finger CCCH domain-containing protein 18 [Clostridium hathewayi DSM 13479] # 1 47 18 64 64 66 100.0 7e-10 MALYDPKEDDYGDRDIPEVDTEPYKDGFYVEPLPESTRPRRDGPGGD >gi|229784121|gb|GG667614.1| GENE 33 44649 - 45632 1020 327 aa, chain - ## HITS:1 COG:no KEGG:Closa_4070 NR:ns ## KEGG: Closa_4070 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 327 7 332 332 469 68.0 1e-131 MKSFNIWKKIAAGILFLNIVLTSFIGQPVAAEGTEGQPAIQRMGFSSKLAIGPGSQLFDG GSLSLLADHGNRQMLSMVMQSKDGTLVVIDGGWDSDSAHLLDVIRSRGGHVSGWFITHPH SDHAGALIEILNNPDSQITIDNVYYKFTDQEWYDTHGSGRGDFVVSLRDALAGLPPEKLH GDITKGQEIQLGDIRAVVMNDPYLLDVTSVNNSSVVYKFYLNGVSILVLGDLGPEGGNLL MSEYTPADLKSDIVQMSHHGQYGVNRDVYAAINPQIALWPCPQWLWDNDNGGGVGSGDWK TLETRQWMEELGVRANYCIKDGDQVIY >gi|229784121|gb|GG667614.1| GENE 34 45768 - 47891 1645 707 aa, chain - ## HITS:1 COG:lin2898 KEGG:ns NR:ns ## COG: lin2898 COG2936 # Protein_GI_number: 16801958 # Func_class: R General function prediction only # Function: Predicted acyl esterases # Organism: Listeria innocua # 171 703 5 553 555 219 30.0 1e-56 MKQWNFYFSGILYGIIRDQNGDGKELYHRTLDVFTNTFHPEEPLHNDDFYEVLCRRKLRD FFESLPEYEIHAKAGEERFTTAGGQVFTRRLSGHGASASGSYIQREVKFPLDLYLSDGKV YAVQMAGRDYTGVLVLNGMEEQTILKEWTADPVCAVHFAGTYMVPTADGEQLATDVYLPA TTADGLQPEPARGFSGSSPRVPAVLIRTPYGKHDGVEQYYRFVQRGYAVVVQDVRGREDS TGEWMPNYHEVEDGSDTLDWIADQPWSDGNVGMTGGSYLGYVQWAAAASGNPHLKAMLSS 
VCAGSPFIDLPRRGGCFTSGSMAWNFAMTEKRFREDRMVRSDWEEVLKLRPVRDMARKAL GIDVPFLNEWLNHPDYDDFWRKANWRERSCGHVVPALVMSGWFDDNGMGTTEALELIDAF YPAGTYKVVLGPWKHSGNANYDIHGIFMGEHALRYDMDLLCFAWLDHFLKNEENGIETGA PVEYYTLGNNCWKHGKHWPLPEAVSTVWYLDREALTLCKPAIQESDSYDYDPESPAAHII DVSENELEVPEDYTEEEKRSDILCYTTPVLDHDLTVTGDITAVLCLSSDCPDTDLFVRIT DVDEAGTSIKLADGVIDVKYRNSFEKPEFMEPGQVYEVRIRTTKLSNTFKAGHRMRFTVT SSAKNFMFPNSNTENGFDSEVNRVAHNTIYRGGALASHILVPVEKDV >gi|229784121|gb|GG667614.1| GENE 35 48091 - 49383 1172 430 aa, chain + ## HITS:1 COG:L75975 KEGG:ns NR:ns ## COG: L75975 COG2873 # Protein_GI_number: 15672055 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Lactococcus lactis # 8 426 7 423 426 504 59.0 1e-142 MSERKGLRFETLQIHAGQEEPDPATGARAVPIYQTTSYVFKNCKEAADRFSLSAAGNIYG RLTNPTEEVLEQRVAALEGGSAALAVASGAAAITYTFQALAREGDHIVAASTIYGGTYNL LAHTLPEFGVTASFVDPDEPGAFERAIRPNTKAVFIETLGNPNSSIIDMEAVAAVAHAHK VPLIVDNTFATPYLFRPLEHGADIVVHSATKFMGGHGTALGGVIVEGGSFDWEASGKFPG LVEPNPSYHGVSFTKAAGPAALVTKIRAVLLRDTGATLAPLHAFLLLQGLETLSLRVERH VFNALRVVEYLNANPHVKKVNHPSLPGSPYHELYNRYFPEGAGSIFTFEIEGGAEKAMEL IDHLKIFSLLANVADVKSLVIHPASTTHAQMTEQELAASGITPETIRLSIGTEHVDDIIE DLEQAFACVF >gi|229784121|gb|GG667614.1| GENE 36 49481 - 49861 411 126 aa, chain + ## HITS:1 COG:BS_yabN KEGG:ns NR:ns ## COG: BS_yabN COG3956 # Protein_GI_number: 16077126 # Func_class: R General function prediction only # Function: Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain # Organism: Bacillus subtilis # 2 122 231 351 489 129 52.0 2e-30 MYSFEDLVRITAELRSEHGCPWDRKQTHESLKPCLKEESEEVLAAIDNQDMENLCEELGD VLFQVLIHSRIAEENGAFTVADVVNGICEKMVRRHPHVFGDAKAATPEESLELWNEIKRR EKMGKK >gi|229784121|gb|GG667614.1| GENE 37 49968 - 50243 376 91 aa, chain + ## HITS:1 COG:BS_hbs KEGG:ns NR:ns ## COG: BS_hbs COG0776 # Protein_GI_number: 16079336 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Bacillus subtilis # 1 89 1 
89 92 97 67.0 5e-21 MNKTELIAAVAEKAELSKKDAEKAVKAFTDVVSEELVNGGKIQLVGFGTFEVSERAAREG RNPKSGEVMNIPASKTPKFKAGKALKDMVNA >gi|229784121|gb|GG667614.1| GENE 38 50247 - 50486 286 79 aa, chain + ## HITS:1 COG:BH0073 KEGG:ns NR:ns ## COG: BH0073 COG1188 # Protein_GI_number: 15612636 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Bacillus halodurans # 1 76 1 76 88 75 61.0 2e-14 MRLDKFLKVSRLIKRRTVANEACDAGRVVVNDKPAKASLNVKEGDIIEIHFGTSSVKVEV LDVAETVKKDEAKELYRYI >gi|229784121|gb|GG667614.1| GENE 39 50547 - 50831 394 94 aa, chain + ## HITS:1 COG:no KEGG:Closa_4066 NR:ns ## KEGG: Closa_4066 # Name: not_defined # Def: sporulation protein YabP # Organism: C.saccharolyticum # Pathway: not_defined # 1 94 1 94 94 145 84.0 7e-34 MEEKMNIRPHKLTLENRGAGTVTGIREVVSFDENQVVLDTDLGLLTVKGKDLHVSRLTLE KGEVDLDGTIDSLNYSSNEALRKSGESLFTRLFK >gi|229784121|gb|GG667614.1| GENE 40 50850 - 51212 224 120 aa, chain + ## HITS:1 COG:no KEGG:Closa_4065 NR:ns ## KEGG: Closa_4065 # Name: not_defined # Def: Spore cortex biosynthesis protein, YabQ-like protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 120 1 120 120 164 73.0 1e-39 MSVHILYEVRLLLYSFATGAGLMMTYDFLRILRIFIPHLPVIMGVEDMVYWVYASLVTFS LLYEQNDGGLRGYVIAGVFTGMFLYDRLISRFFLKFLKNTWKYLRMKLDKCFRRRRKRQR >gi|229784121|gb|GG667614.1| GENE 41 51233 - 51556 479 107 aa, chain + ## HITS:1 COG:no KEGG:Closa_4064 NR:ns ## KEGG: Closa_4064 # Name: not_defined # Def: Septum formation initiator # Organism: C.saccharolyticum # Pathway: not_defined # 1 107 1 107 107 137 80.0 1e-31 MSSFGKSRRRRKDKWGNRMALIGITFVVFSLAVIVTVKGASLKDKELEYQIREENLTAQR DKELERSKELEEYRIYVQTKQYIEEVAKQKLGLVNPDEILLKPKKKE >gi|229784121|gb|GG667614.1| GENE 42 51703 - 53115 1354 470 aa, chain + ## HITS:1 COG:CAC3205 KEGG:ns NR:ns ## COG: CAC3205 COG2208 # Protein_GI_number: 15896452 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine 
phosphatase RsbU, regulator of sigma subunit # Organism: Clostridium acetobutylicum # 66 468 387 792 795 153 26.0 6e-37 MFRRKAEHHDMADVNVYTAKRLNDMAGSLGELARACTDELSAGQGLTKEDGIAAMQTAAA MVCAGCNKCNIYSSQEQENNYYLYYLIRSFEQKGNLDYGDMPRMFLETCKRREEYVDQLN RSLGKATMNLAWKNRFLESRDAVIIQFKELAVILEEFSHQMEKAVDITPEKADAVEKVFR RRHIVIEKMLILEYENHQKEAFLTMRTGNGRCVTAKEAAELLEEAMDENEWCVAHDGRNL ITKQPATIRFVEKGRYRMVYGAARAIKTGESVSGDNYSFSQIMPGQVIMSLSDGMGSGEI AEEESRQVIELTERLLETGFSARAALKLVNTVLLLTGNVQHPATLDLACIDLHTGILEAM KLGAAATYILTPTGVELLEAGEVPMGILNPVEPILLSKKLWDDNRIIMVSDGVLDALPGE DKELSLLEFIAGMPVKNPQDMADRILLFARSFHEFVGDDMTVLTAGIWDR >gi|229784121|gb|GG667614.1| GENE 43 53399 - 54736 1039 445 aa, chain + ## HITS:1 COG:CAC3204 KEGG:ns NR:ns ## COG: CAC3204 COG0037 # Protein_GI_number: 15896451 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Clostridium acetobutylicum # 1 438 14 457 461 236 33.0 9e-62 MLVPGDRVIIGLSGGADSVSLLMILHELKNELNIELFAVHVHHGLRGEEADRDSAYAQEL SENLGVPFVCVHANVAEYARVNGMSEEEAGRHLRYRILEEQRLAHHASKIAVAHHADDQA ETVLYNLFRGSGLKGIGGMKPVRDTIIRPLLSVTRKEILAYLEEKEISYCEDSTNSGTDY IRNRLRHEIIPAVRKRINEGAVSNILQAAKTAAAADTYFEKAAKQILEKHGIREKTDEGG ICSVGIAAEILKQEDSIVRQYVIRQMIGETYQSLKDITSVHVEAAEKLLFKPVGRRIQLP DGGCALRTYGELQIKKEGKTIFSQNGGKETGITPVITTFPYKKGQEIPKNGYTKWFDYDK INSTLSVRYRETGDYITLAGGGRKTVKSFMIDEKIPKEERDKILLAADGSHILWIIGYRI SEYYKITDDTHTVLQIQIGGGKNDE >gi|229784121|gb|GG667614.1| GENE 44 54729 - 55256 538 175 aa, chain + ## HITS:1 COG:CAC3203 KEGG:ns NR:ns ## COG: CAC3203 COG0634 # Protein_GI_number: 15896450 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 3 175 4 175 178 189 58.0 2e-48 MNDKIRVLLSEEEVAKRIREIGDEISRDYEGRPLHLICILKGGVFFTCELSKRISLPLTL DFMSVSSYGAGTVSSGIVKIVKDLDEPIEGKDVLIVEDIIDSGNTLSYLIEVLKQRNPKS IELCTLLDKPERRVKKQVTVKYTCFTVPDQFIVGYGLDYDQKYRNLPYIGVIEQQ 
>gi|229784121|gb|GG667614.1| GENE 45 55269 - 57089 1247 606 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. McKiel] # 68 601 66 593 636 484 49 1e-136 MKQQQPFRGLGLVLILVIMLIVASMFPRQNEKISNQEYMKAIEDGSVVAATVKQNQQTPT GQVVLIISDGSTEQAKVVNVPDVIAAQELLDRNGIVYTTENVPQENYFLTIVLPIILTAV VLVGMFMLMNARAGAGSANAKMMNFGRSRAHMAKDTNKVNFGKVAGLEEEKEELAEVVDF LKNPQKYTSVGARIPKGILLVGPPGTGKTLLAKAVAGEAGVPFFSISGSDFVEMFVGVGA SRVRDLFEEGKKHAPCIIFIDEIDAVARRRGTGMGGGHDEREQTLNQLLVEMDGFGVNEG IIVMAATNRVDILDPAILRPGRFDRKVGVGTPDVKGREEILMVHSKEKPLGEDVDLKRVA QTTAGFTGADLENLMNEAAINAARDNKKFISQADINKAFVKVGIGAEKKSKVISEKEKKI TAYHEAGHAILFHLLPDEGPVHTISIIPTGIGAAGYTMPLPESDRMFNTKGKMLQDIMVD LGGRIAEELVFGDITTGASQDIKQATATARSMVTQYGMSDRVGMINYDNDGDEVFIGRDL AHTKSYGNEVANAIDSEVKRIIDDCYTKAKDIIMKHEEVLHACSRLLIEKEKIGQQEFES LFPVEI >gi|229784121|gb|GG667614.1| GENE 46 57526 - 59592 1798 688 aa, chain + ## HITS:1 COG:ECs0453 KEGG:ns NR:ns ## COG: ECs0453 COG0366 # Protein_GI_number: 15829707 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 77 634 71 569 605 245 29.0 2e-64 MCFQAEALNREALFTDETENFRFPTEPEAGDDVYLRFRTARNNVDYVCYIEDGSSLKEAV MEKTDSDGQFDYYEYKITVGSDKLSYSFKVVKGSEVCYYNRLGATMDNQESFHFRIAPGF STPDWAKGAVMYQIYVDRFCNGDPANDVVDNEYFYIDQNVEHVSDWSRYPSQMDVGCFYG GDLQGVWDKLDYIQNLGVDVIYFNPIFVSPSNHKYDCQDYDHIDPHYGVIVKDGGEPLKS GATDNRQATKYMVRSTAKENLEASDAFFARFVEEVHRRGMKIILDGVFNHCGSFNKWLDG ELLYQMSGDYEAGAYVSEDSPYHTFFKFYKDDAWPCNDSYDGWWGHSTLPKLNYEESPKL YEYIMNIARKWVSPPYNVDGWRLDVAADLGHSSEYNHKFWRDFRSAVKEANPKAIVLAEH YGDPSSWLAGDQWDTVMNYDAFMEPVTWFLTGMEKHSDESNPSLYGDGESFFRSMNYHMS RMMTNSVMVAMNELSNHDHSRFLTRTNRTVGRISTKGAKAASEGVNYGVFREAVLIQMTW PGAPTIYYGDEAGVCGWTDPDNRRTYPWGNENLELIEFHKYMTGLHHRIAALRRGSLKQL LAGRQLIAYGRFCGDSVCAVIVNNRAAERDVQLPVWQLGLADGDTLVRHMLTYETGYNVG KINYEVADGQVSVFMPANSAILLATDRV >gi|229784121|gb|GG667614.1| GENE 47 59978 - 60730 618 250 aa, chain - ## HITS:1 COG:TM1742 KEGG:ns NR:ns ## COG: TM1742 COG0647 # 
Protein_GI_number: 15644488 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Thermotoga maritima # 1 249 10 259 259 251 49.0 1e-66 MDMDGTFYLGEHLLEGSLDFLQTLQETGRRYLFFTNNSSKSSGAYIKKLRSMNCFIDSSQ IMTSGDVMIEYLKHSHPGKTVYLLGTPVLRESFEKAGINLSEEQPDLVVVGFDTTLTYHK LERACHYIRSGAEFLATHLDINCPTEDGFIPDCGSFCAAITLSTGKKPKYVGKPYPETVE MILEKTGVSRDRIAFVGDRLYTDVAAGVNNGAAGLLVLTGETKREDLRHAEISPDGVYLS LKEMGELLRK >gi|229784121|gb|GG667614.1| GENE 48 60790 - 62124 1086 444 aa, chain - ## HITS:1 COG:TM0595 KEGG:ns NR:ns ## COG: TM0595 COG1653 # Protein_GI_number: 15643361 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 52 441 20 413 419 132 26.0 1e-30 MVRKYLVGIALLTVIAISGCSAANKETTAEANTSKAPAAETAADIEAASDVLEIEFWHAL ESQYQETLNRVVGEFNDSHDNIVIKPMYIGNYTALNEAIVSANAANTNLPGVAMANIPYV TAYGAGGLCEDLGPYIQRDGFEIDDFGEGLIQAAKYEDVQVALPFLVSTEVVYYNRDLLN ELGLQIPKTWDEIPSFLEKASDISADGKTNRYGMVFPGWITWYYEPFFINNGVQLVTAEG KTDLASEHSVKLVSDLKDWTKKGYSYLATGEDAASVMRQNFIDGKALSVIYTSSLYDTIV DSCSFDVGMDWLPGGKTKEQVLGGNVLFLPAKNDQKVKDASWEFLSYLMSKDVNMIWASE SGYLPTRKSVQETDEGKAFLAQKPEFQIIFDNLDIITPGIQRTDWSQVSTTWRNWMDEII QEDLDVETALKEMEVEINEILSGS >gi|229784121|gb|GG667614.1| GENE 49 62140 - 63027 846 295 aa, chain - ## HITS:1 COG:MJ1311 KEGG:ns NR:ns ## COG: MJ1311 COG1082 # Protein_GI_number: 15669501 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Methanococcus jannaschii # 77 269 78 266 293 70 29.0 3e-12 MKLSTTTLFYGPRPDGSVGPMLESIRRIHKAGFRDLDLNFAWTRSHKTELYYDNWREWIK ECRALIEELGMTVSQSHAPFYNVINPDYPYREEEEEMVRRSIIASSELGAGIVVIHGGSR PYRPDYRKNKQENIEYFKPHLEFAAKYGVKLAIENLFDLNDSEKRVRVPRYLADPEDLID LVDTLHQSYDNVGIVWDFGHANLMHWNQPECLEMMGDRLIATHVQDNYGVIDDHLLPFLG TVEWEPIMKTLKKINYQGVFAYETHKMTDRLPDPMIDAMMRYAYELGEYLLTLAD >gi|229784121|gb|GG667614.1| GENE 50 63411 - 64292 889 293 aa, chain + ## HITS:1 COG:no 
KEGG:GYMC10_0463 NR:ns ## KEGG: GYMC10_0463 # Name: not_defined # Def: glycoside hydrolase family 43 # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 1 289 1 292 303 293 50.0 4e-78 MKNTEINIRDPFVLAENGTYYMYGTRAATCWMAATGFDCYVSEDMENWDGPYEVFHKPED FWADMNCWAPEVHKYQGAYYMFATFKDSSVHGGTAILKSESPLGPFVMHSDRQITPKDWE CIDGTFYVSPDGKPYMVFVHEWVQISDGSICAVELSRDLREAVSDPVTLFHASEAAGWVK PITNKKRPGLHYVTDGPYLYRTESGRLIMLWSSFGKEGYTEALAYSDNGDITGTWTQDDR LLFEKDGGHGMIFESNSGELYLTLHTPNEHLKEHPVFYRLIEEEDTLRVAETS >gi|229784121|gb|GG667614.1| GENE 51 64289 - 65170 910 293 aa, chain + ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 183 279 158 254 257 83 39.0 4e-16 MKEVFFTGKCDGIGISHVARDCDLTETIGFYKNEYQIYYILDGERFFYQGSKSYRMDRGT LTFIDKRQIPFTNVIGGKYHERILIEVDEKWLVSMGKAMELDLIGFFANYHGVYRLEGKH LAAVEKKLKKMEEVMGQKRAFASAEVKNLLISIIISMLYGAGTRCAEYHMPNGKMLRYAK AREIIQYIMEHYSEVYGLEDLAGIFYMDKSYLSRIFKEVTNFTVNEFINCQRIGHARDML LDESLSMEEISQKLGYERLSYFDRVFKKYVGMSPLQYRKSKRKKNEDEENQSI >gi|229784121|gb|GG667614.1| GENE 52 65249 - 66673 1459 474 aa, chain + ## HITS:1 COG:BS_yesO KEGG:ns NR:ns ## COG: BS_yesO COG1653 # Protein_GI_number: 16077764 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 66 472 7 410 412 214 32.0 4e-55 MFTETYRIRKKLKQVTSENKDSERCKRGSIMKLRRLMALTTAVVMAASLTACGGQKAETN TQTADSGKESAASGEQITLRMAWWGSQTRHDATNKVIEMYEEQNPNVHIEAEFYDFDSYF TKLDTLVAADDVWDIFQMGGNFPKYINSIEPMDSYIEAGTIDVSDTTENFLATTRDNDGT QVGISIGTNTYGIAYDPAMFAEAGLAEPSDNWTWDEWKADCLAITEKLGIYGSSKMDNFI AGVTQRASQAEKDGNFFKKTNDGLEFTDTATFASYMQMIKDLTDAGSYPDTGAIKEIKDI EGDYLVTEDAAMTWVSSNQIASIVNAAGREIKIAPVPRITKDGSYGMGVQSSQQLCMAKS SKNKEEAAKFINYFVNDIEANKVLNGERGVPIMSKVRDVVMEQADDSSKMIYDFVDKIGN FPKEDCNVISPDPKTEIEDQYKLLIEKVQYGDVTPEDAASQLVEFAESKFTRQQ 
>gi|229784121|gb|GG667614.1| GENE 53 66917 - 67822 970 301 aa, chain + ## HITS:1 COG:BS_yesP KEGG:ns NR:ns ## COG: BS_yesP COG1175 # Protein_GI_number: 16077765 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 15 301 18 308 309 284 51.0 2e-76 MEERKASIKKFWYNDNTVGYIFSLPFIIGFLAFTLIPIAVSMYYSVTDYKLGQTPAFIGI QNYLDLLKDERFINSVKVTLIYVVTSVPTKLIFALFVAYLLTQGRRGVTFYRSLYYVPSL IGGSIAVALVWKTIFSRKGLANTILASLGLQKLSWLGDPKLSMGILVLMSAWQFGSSMII FAAGLKEIPGSYYEAAEIDGANKWQRFLRITLPCLSPIILYNLVMQTITAFMAFTQAFVI TQGGPNDATNFYALYVYNHAFKWSNMGYASAMSWLMLVLISLITFILFKSSKFWVFSEAD Q >gi|229784121|gb|GG667614.1| GENE 54 67837 - 68679 854 280 aa, chain + ## HITS:1 COG:BH1119 KEGG:ns NR:ns ## COG: BH1119 COG0395 # Protein_GI_number: 15613682 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 2 279 3 281 281 266 49.0 3e-71 MTKKHKIEAVLYHLFVCLFGLLMIYPVLWMISGSLKNNTEILNGGMNLIPPAWRVENFAN GWKGFGGISFATFFSNSFFVAAVSTFATVLSSACVAYAFSRVKYKGRNIWFTAMICTMMI PTQIILVPQYIIYNKLGFVGTYLPLILPHFFGQAFFIYQMMQFGAGIPKDLDEAAIIDGC SKYTIFTKIFLPLLKPSIVTTIIIQFYWKWDDFMSPMIYLSKPRMYTASVAIKNFADSTS TTDYGAMFAMSTLSLIPVFLIFLFFNRYLVEGISTSGIKG >gi|229784121|gb|GG667614.1| GENE 55 68747 - 70183 1255 478 aa, chain + ## HITS:1 COG:no KEGG:AciPR4_2750 NR:ns ## KEGG: AciPR4_2750 # Name: not_defined # Def: hypothetical protein # Organism: T.saanensis # Pathway: not_defined # 39 449 62 472 518 152 27.0 4e-35 MKNQTSVSFQTSCGLLQKLVDTAERKSRENLKQFGGRLVLIEGGGYEKIWLETQPMGGEM YAKRNLEAGINNQLLFMENQRADGRIPGSVACEDGKIIPQFNKFQGFCFPSAALNVYYLM GKDPGYLDLLYTTLERFDSYLWRVRDSDGDGCLETWCKYDTGEDHALRYGDAPDDWSKET PPEGCSVVPMASMDIMSFSYASRETLAKISRIRNDVEKMEFWKEKAAQVQGKIRTYLWDE TAGACFDRDKNHRVLRTMTHNNLRAMYWNSFTQPMADRFVNDHLLNPEEFWTHMPLPSVA VNDPLFRNVTTNDWSGQAEALTYQRAIRALENYGYDSLIPELGEKLFQAIGEDCIFVQQY DPFTAAPSLVCLEGEQDCYGPALLSVLEYAARMYGVHLEQDTVFWGTVGGYESIYEQIWG 
ERSFKLVQSGNGAEGFINGKKIFEAGPGIKVITGLDGTVLEVKKMSKDGDVRRIRQKI >gi|229784121|gb|GG667614.1| GENE 56 70322 - 70714 325 130 aa, chain - ## HITS:1 COG:TP0823 KEGG:ns NR:ns ## COG: TP0823 COG2033 # Protein_GI_number: 15639809 # Func_class: C Energy production and conversion # Function: Desulfoferrodoxin # Organism: Treponema pallidum # 42 126 42 125 128 87 47.0 8e-18 MNKQPVFLTDRDHLVVLEAITGMPNAAVPDSLKPFEILDPNTSEGASEKHLPIIETDGNR VTVKVGSIFHPMSEEHNIGWVCLETEAGVTMRVPLDSACDPVAYFTLEDGDAPKAAYAWC NLHGFWKTEA >gi|229784121|gb|GG667614.1| GENE 57 70908 - 71339 651 143 aa, chain + ## HITS:1 COG:no KEGG:DSY1202 NR:ns ## KEGG: DSY1202 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 143 4 145 145 145 58.0 5e-34 MSADAEMLNFVYQNSQMGVETLNQLLPMIDNEAFKKRIEAQLKEYKQIHEEAKELLNRHG YDEKGIGALEKIMTYLMIDMKTLMDKSSSHIAEMLIQGSNMGIIDAVKRINQYEKEAEKE VTALMKRLLKFEENNVERLKDAL >gi|229784121|gb|GG667614.1| GENE 58 71382 - 72302 771 306 aa, chain + ## HITS:1 COG:no KEGG:Closa_4053 NR:ns ## KEGG: Closa_4053 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 303 1 303 311 439 66.0 1e-122 MILSVSRRTDIPQYYSDWFFNRMKEGFLYVKNPMNSHQVSRIELSPDLVDLVVFWTKNPE PMMKRIDELGEIPYYIQFTLTGYGRGMEPGLPDKRELIRIFRETAETAGRNRMVWRYDPV LLNERYPAEYHFRAFEAIAEGICGCTDKVVISFLDCYGKTKRNMRGIPLETPDTETMKRM GETFVKTAERFGMRVESCAEAVDLSDVGIRHGSCIDPAMAEQILGVPIHVKKDKNQRPVC GCVESVETGAYDTCLCGCKYCYANDSEEAVKRRRAVYDADSPLLCGTVEEGDRITVRRTA RIRSGE >gi|229784121|gb|GG667614.1| GENE 59 72328 - 74175 2076 615 aa, chain + ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 185 606 337 754 776 174 28.0 4e-43 MKKYFRTLLLLLLGMLVAISGVSIKRISDIRDYGKLINYVGIVRGASQRVVKLETNSRPD DELIAYVEGILNELITGQGVYGLVLADCRQYQLDLDLLKDQWDHVKKEIGEVRQGARADV 
LIDSSEKLFEIANDTVFTIEQYSGEMAEKTARMIFATALVSAAAFAFTVYFYIRKFFQLK VTNEKLEELTSRDELTGAYRADKFYSEVNRMMEREPDAKFAVLYIDFENFKYVNDVFGYS YGDDLLKGYAQLVMSRLKEEELLARDVADRFVLLRRYETQEELLETQKMMDEEFLNSDIV MLNRHMLTIACGICCAEDLVEEPEAVKMVNRANFAQKEVKNKPGKHYAFYGEEIRQKLIA EMAVKDRMQEGLDNEEFVVYLQPKVGVEDGAVKGAEALVRWAIPGNGLLSPGLFIPVLEK NHFIGKVDRYVFEKVCIWLHDRMANGKTVVPISVNVSKIQFYNPDFIPVYAGIKKQYGIP DGMLEIELTESAAFEHEEYLMQVVRELHDNGFLCSLDDFGTGYSSLGLLKDLPIDVMKLD GVFFRVSVDVNRGHTILKYIIDMVKELNISTVAEGVEREEQVEFLKSAGCDLIQGFAFYR PMPVEQFEEVLDGPV >gi|229784121|gb|GG667614.1| GENE 60 74887 - 75744 891 285 aa, chain + ## HITS:1 COG:CAC0191 KEGG:ns NR:ns ## COG: CAC0191 COG1737 # Protein_GI_number: 15893484 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 5 267 3 264 283 139 30.0 5e-33 MAAKGLLLRLKDYQVKASAAEKNAVTIILNHPDRVIGKSIHEFADLAYASPATIIRLCRK LGCAGYRDFQHALVYENALFKDSREISVQEIIPTPSTEDIIQKVTRKNIESLETSRKLVE PEIIDTCVKMIEESRIIQLFGLGSSLLVARDLYLKLIRVEKLCNICDDWHAQLLAARTMR KGDLGIVISYSGLTEEMITCARAAKANGARIIVLTRAADSKLAAEADCVLAVAATELILR SGAMSSRISQLNMVDILYVAYVNKHYESCMRSFPKTHIQKPLEEY >gi|229784121|gb|GG667614.1| GENE 61 75746 - 76502 908 252 aa, chain + ## HITS:1 COG:L144334 KEGG:ns NR:ns ## COG: L144334 COG2103 # Protein_GI_number: 15673112 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Lactococcus lactis # 1 252 1 252 297 290 63.0 1e-78 MLDLTKLTTETRNDKTMNLDQMTPAELAAVMNEEDGNVVKAVREVIPEIATAIEWATESL NRGGRIIYMGAGTSGRLGVLDAVECPPTFGVSPDLVIGLIAGGEGAFVKAVEGAEDSQTL GVQELKDLSLNENDIVIGLAASGRTPYVIHGLAYANSVGCKTVGIACNRQSEVGKAAQLA VEPVTGPEVLTGSTRLKAGTAQKMILNMISTGSMVGFGKVYQNLMVDVLQTNEKLVVRAQ NITMTATGCTRE Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:34:39 2011 Seq name: gi|229784120|gb|GG667615.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld8, whole genome shotgun sequence Length of sequence - 90582 bp Number of predicted genes - 95, with homology - 91 Number of transcription units - 38, 
operones - 20 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 168 - 494 489 ## Closa_3538 type II secretion system protein E 2 1 Op 2 . - CDS 509 - 1468 494 ## Closa_3540 hypothetical protein 3 1 Op 3 . - CDS 1455 - 2306 724 ## Closa_3541 SAF domain protein 4 1 Op 4 . - CDS 2321 - 2713 365 ## Clocel_2329 peptidase A24A prepilin type IV 5 1 Op 5 . - CDS 2738 - 3061 258 ## Closa_3546 hypothetical protein 6 1 Op 6 . - CDS 3075 - 3287 172 ## Closa_3548 hypothetical protein 7 1 Op 7 . - CDS 3297 - 4139 738 ## Closa_3549 hypothetical protein 8 1 Op 8 . - CDS 4170 - 4445 229 ## COG2088 Uncharacterized protein, involved in the regulation of septum location 9 1 Op 9 . - CDS 4468 - 4719 271 ## Closa_3551 hypothetical protein - Prom 4918 - 4977 4.1 10 2 Tu 1 . - CDS 4980 - 6851 717 ## COG3344 Retron-type reverse transcriptase - Term 7339 - 7373 -0.6 11 3 Op 1 . - CDS 7606 - 7878 271 ## Closa_3552 hypothetical protein 12 3 Op 2 . - CDS 7862 - 10114 1093 ## COG3505 Type IV secretory pathway, VirD4 components 13 3 Op 3 . - CDS 10111 - 10788 221 ## Closa_3554 hypothetical protein 14 3 Op 4 . - CDS 10795 - 11427 191 ## gi|266619854|ref|ZP_06112789.1| hypothetical protein CLOSTHATH_00916 15 3 Op 5 . - CDS 11411 - 12220 505 ## Closa_3555 hypothetical protein - Prom 12240 - 12299 6.4 + Prom 12247 - 12306 6.0 16 4 Tu 1 . + CDS 12349 - 12675 275 ## Closa_3556 XRE family transcriptional regulator - Term 12536 - 12580 7.0 17 5 Tu 1 . - CDS 12785 - 13864 825 ## Closa_3559 hypothetical protein - Prom 13979 - 14038 2.8 18 6 Op 1 . - CDS 14047 - 15081 718 ## EUBREC_2168 hypothetical protein 19 6 Op 2 . - CDS 15113 - 15583 414 ## Closa_3563 hypothetical protein 20 6 Op 3 . - CDS 15580 - 16371 511 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes - Prom 16414 - 16473 2.1 - Term 16545 - 16573 -1.0 21 7 Op 1 . 
- CDS 16627 - 16785 76 ## gi|266619861|ref|ZP_06112796.1| conserved hypothetical protein 22 7 Op 2 . - CDS 16775 - 17455 353 ## COG2003 DNA repair proteins 23 7 Op 3 . - CDS 17377 - 17655 171 ## gi|266619863|ref|ZP_06112798.1| conserved hypothetical protein 24 7 Op 4 . - CDS 17694 - 18737 699 ## Closa_3569 zinc finger CHC2-family protein 25 7 Op 5 . - CDS 18715 - 22884 3358 ## COG4646 DNA methylase - Prom 23092 - 23151 2.9 26 8 Tu 1 . - CDS 23154 - 24899 226 ## COG3344 Retron-type reverse transcriptase - Prom 24959 - 25018 1.7 27 9 Op 1 . - CDS 25704 - 28043 968 ## COG0827 Adenine-specific DNA methylase 28 9 Op 2 . - CDS 28027 - 29178 512 ## Closa_3576 ParB domain protein nuclease 29 9 Op 3 . - CDS 29168 - 29956 537 ## COG1192 ATPases involved in chromosome partitioning - Prom 30192 - 30251 4.2 30 10 Tu 1 . - CDS 30259 - 32052 1780 ## COG0370 Fe2+ transport system protein B + Prom 31919 - 31978 5.4 31 11 Tu 1 . + CDS 32125 - 32307 184 ## gi|266619871|ref|ZP_06112806.1| conserved hypothetical protein + Term 32556 - 32588 -0.4 - Term 32494 - 32526 4.5 32 12 Tu 1 . - CDS 32537 - 32749 83 ## gi|266619872|ref|ZP_06112807.1| putative branched-chain amino acid ABC transporter, periplasmic amino acid-binding protein - Prom 32788 - 32847 4.7 33 13 Tu 1 . - CDS 32850 - 33167 296 ## gi|266619873|ref|ZP_06112808.1| putative choline binding protein I - Prom 33219 - 33278 4.1 34 14 Tu 1 . - CDS 34182 - 34496 254 ## CbC4_4176 putative N-acetylmuramoyl-L-alanine amidase - Prom 34519 - 34578 3.4 - Term 34536 - 34576 2.2 35 15 Op 1 . - CDS 34641 - 34841 316 ## gi|266619875|ref|ZP_06112810.1| conserved domain protein 36 15 Op 2 . - CDS 34843 - 35160 207 ## gi|266619876|ref|ZP_06112811.1| conserved hypothetical protein 37 15 Op 3 . - CDS 35177 - 35596 276 ## Cphy_3715 hypothetical protein 38 15 Op 4 . - CDS 35577 - 36317 490 ## gi|266619878|ref|ZP_06112813.1| putative acyl carrier protein - Prom 36383 - 36442 21.1 39 16 Tu 1 . 
- CDS 37520 - 38539 706 ## COG0582 Integrase - Prom 38619 - 38678 25.9 + Prom 39502 - 39561 80.4 40 17 Tu 1 . + CDS 39700 - 40020 136 ## gi|288870074|ref|ZP_06112815.2| hypothetical protein CLOSTHATH_00943 41 18 Op 1 . - CDS 40027 - 40458 370 ## gi|266619881|ref|ZP_06112816.1| conserved hypothetical protein 42 18 Op 2 . - CDS 40455 - 40676 109 ## gi|266619882|ref|ZP_06112817.1| conserved hypothetical protein 43 18 Op 3 . - CDS 40676 - 41269 341 ## gi|266619883|ref|ZP_06112818.1| conserved hypothetical protein 44 18 Op 4 . - CDS 41282 - 42193 786 ## Sfum_4062 hypothetical protein 45 18 Op 5 . - CDS 42203 - 43072 425 ## gi|266619885|ref|ZP_06112820.1| conserved hypothetical protein 46 18 Op 6 . - CDS 43090 - 43512 213 ## gi|266619886|ref|ZP_06112821.1| hypothetical protein CLOSTHATH_00949 47 18 Op 7 . - CDS 43516 - 43995 296 ## gi|266619887|ref|ZP_06112822.1| hypothetical protein CLOSTHATH_00950 48 18 Op 8 . - CDS 43992 - 44432 215 ## gi|266619888|ref|ZP_06112823.1| hypothetical protein CLOSTHATH_00951 49 18 Op 9 . - CDS 44425 - 46203 873 ## gi|266619889|ref|ZP_06112824.1| conserved hypothetical protein 50 18 Op 10 . - CDS 46205 - 46591 157 ## gi|266619890|ref|ZP_06112825.1| conserved hypothetical protein 51 18 Op 11 . - CDS 46597 - 49359 1776 ## LPST_C2001 probable minor tail protein 52 18 Op 12 . - CDS 49405 - 49965 465 ## Sgly_0348 putative protein GP15 53 18 Op 13 . - CDS 49962 - 50411 343 ## gi|266619893|ref|ZP_06112828.1| conserved hypothetical protein - Prom 50539 - 50598 4.1 54 19 Op 1 . - CDS 50640 - 51917 895 ## COG0582 Integrase - Term 51934 - 51989 8.2 55 19 Op 2 . - CDS 51997 - 52200 220 ## Tresu_1933 Excisionase from transposon Tn916 + Prom 52117 - 52176 5.5 56 20 Tu 1 . + CDS 52241 - 52330 71 ## + Term 52417 - 52453 -0.1 57 21 Op 1 . - CDS 52630 - 52869 263 ## gi|160937775|ref|ZP_02085134.1| hypothetical protein CLOBOL_02667 58 21 Op 2 . 
- CDS 52866 - 53309 364 ## lmo1099 hypothetical protein - Prom 53409 - 53468 4.4 - Term 53454 - 53498 11.2 59 22 Op 1 . - CDS 53588 - 53779 83 ## 60 22 Op 2 . - CDS 53800 - 53907 160 ## 61 22 Op 3 . - CDS 53897 - 54934 350 ## COG4124 Beta-mannanase 62 22 Op 4 3/0.000 - CDS 54912 - 56036 407 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis - Prom 56062 - 56121 3.5 63 22 Op 5 . - CDS 56226 - 57413 413 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 64 23 Tu 1 . - CDS 57563 - 57715 71 ## gi|160937781|ref|ZP_02085140.1| hypothetical protein CLOBOL_02673 - Prom 57845 - 57904 4.0 65 24 Op 1 . - CDS 57922 - 58050 75 ## gi|160937783|ref|ZP_02085142.1| hypothetical protein CLOBOL_02675 66 24 Op 2 . - CDS 58063 - 58899 520 ## COG0582 Integrase - Prom 58987 - 59046 5.4 + Prom 59044 - 59103 7.2 67 25 Op 1 . + CDS 59259 - 59600 291 ## CD1101 putative mobilization protein 68 25 Op 2 . + CDS 59567 - 60901 952 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) 69 25 Op 3 . + CDS 60898 - 61251 174 ## EUBREC_3582 hypothetical protein + Term 61256 - 61310 13.1 + Prom 61497 - 61556 5.9 70 26 Tu 1 . + CDS 61630 - 62574 860 ## COG0583 Transcriptional regulator + Term 62598 - 62650 19.1 + Prom 62624 - 62683 5.6 71 27 Op 1 9/0.000 + CDS 62789 - 65059 1267 ## COG0620 Methionine synthase II (cobalamin-independent) 72 27 Op 2 . + CDS 65124 - 65945 585 ## COG0685 5,10-methylenetetrahydrofolate reductase + Prom 66277 - 66336 6.2 73 28 Op 1 . + CDS 66360 - 67001 533 ## Shel_05210 hypothetical protein + Prom 67013 - 67072 1.9 74 28 Op 2 . + CDS 67106 - 67312 193 ## gi|266619910|ref|ZP_06112845.1| membrane protein + Term 67340 - 67380 3.2 + Prom 67405 - 67464 3.3 75 29 Tu 1 . + CDS 67500 - 67709 202 ## CD2291 putative transcriptional regulator + Prom 67740 - 67799 1.6 76 30 Tu 1 . + CDS 67867 - 68112 130 ## gi|288870082|ref|ZP_06112847.2| conserved hypothetical protein 77 31 Op 1 . 
+ CDS 69161 - 69442 113 ## COG0500 SAM-dependent methyltransferases 78 31 Op 2 . + CDS 69510 - 69842 211 ## gi|325264124|ref|ZP_08130856.1| hypothetical protein HMPREF0240_03129 - Term 70149 - 70198 17.1 79 32 Op 1 . - CDS 70215 - 70415 258 ## gi|266619915|ref|ZP_06112850.1| conserved hypothetical protein 80 32 Op 2 . - CDS 70419 - 71363 703 ## EF2322 hypothetical protein 81 32 Op 3 . - CDS 71356 - 76020 3629 ## CD1105 putative DNA primase - Prom 76042 - 76101 1.6 - Term 76045 - 76084 8.1 82 33 Tu 1 . - CDS 76135 - 76263 76 ## - Prom 76425 - 76484 3.6 83 34 Tu 1 . - CDS 76487 - 76948 186 ## Lebu_1563 hypothetical protein - Prom 77162 - 77221 2.9 84 35 Op 1 . - CDS 77262 - 77417 108 ## gi|266619920|ref|ZP_06112855.1| conserved domain protein - Prom 77446 - 77505 4.8 - Term 77542 - 77585 7.4 85 35 Op 2 . - CDS 77628 - 79682 1694 ## COG0550 Topoisomerase IA - Prom 79718 - 79777 80.4 86 36 Op 1 . - CDS 80625 - 81413 799 ## CD1107 hypothetical protein 87 36 Op 2 . - CDS 81403 - 81666 388 ## gi|266619923|ref|ZP_06112858.1| conserved hypothetical protein 88 36 Op 3 . - CDS 81690 - 83489 1705 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 89 36 Op 4 . - CDS 83521 - 83652 71 ## gi|160937802|ref|ZP_02085161.1| hypothetical protein CLOBOL_02694 90 37 Op 1 . - CDS 84590 - 86869 1872 ## COG3451 Type IV secretory pathway, VirB4 components 91 37 Op 2 . - CDS 86772 - 87221 295 ## CD1111 hypothetical protein 92 37 Op 3 . - CDS 87228 - 87689 500 ## COG4725 Transcriptional activator, adenine-specific DNA methyltransferase 93 38 Op 1 . - CDS 87827 - 88690 936 ## Ethha_1894 hypothetical protein 94 38 Op 2 . - CDS 88766 - 88981 255 ## Ethha_1893 putative conjugative transfer protein 95 38 Op 3 . 
- CDS 89046 - 90386 1230 ## COG3505 Type IV secretory pathway, VirD4 components Predicted protein(s) >gi|229784120|gb|GG667615.1| GENE 1 168 - 494 489 108 aa, chain - ## HITS:1 COG:no KEGG:Closa_3538 NR:ns ## KEGG: Closa_3538 # Name: not_defined # Def: type II secretion system protein E # Organism: C.saccharolyticum # Pathway: not_defined # 1 103 1 103 451 115 56.0 5e-25 MQEIVNIRDTMKALEPEAIKKPFAEVLHEVQEYISSTYASVLKDNPDDNRELIQSYIEKY IEQKRVCVEDIDRSELCELLYGEMTGFSFLTKYLNRDDVEEINSATRS >gi|229784120|gb|GG667615.1| GENE 2 509 - 1468 494 319 aa, chain - ## HITS:1 COG:no KEGG:Closa_3540 NR:ns ## KEGG: Closa_3540 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 9 319 1 283 283 253 47.0 1e-65 MSSNDGGSMVQNKIIAVMGSPGSGKTTASIKLAAELAEKKKNVILVFCDPFTPVIPFVVS CPEEQVSVGTLLTTPSLTQKMILDSCIPIKENGYISLLGYRLGDSLMAYPQITRDRAVDF LVCLRYLADFIIMDCSAVFEADVLSMLAMELSDQIIKIGTSNLRGISYFESRSSLLSDRR FQPEKQRMYLGNLKTGQEWEAVSVQYGGVDGILPHLMALEQQYDEQALFESLAGKEAAAY LAAIRKLADQIDEADQVGQEEGNSRRSQTAAEKKEEIRPTKPEKEKVKSKPKEQTKPVKE KAGGLGRFSLMFRKSKGEF >gi|229784120|gb|GG667615.1| GENE 3 1455 - 2306 724 283 aa, chain - ## HITS:1 COG:no KEGG:Closa_3541 NR:ns ## KEGG: Closa_3541 # Name: not_defined # Def: SAF domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 247 1 248 283 330 70.0 3e-89 MKKFLKNRIILGLICIVVSLLICFGITPMFNDALKSKVTLVRVTKEIRTGEQITDKMVTS VEAVGYNLPSNVIYKIEEVVGKYANADLYKGDYILKSKLSDTPMLRNAYLNKLNGENRAI SVSIKSFASGLSGKLEAGDIVSLIAADVGSQRETLVYPELQYVEIIATTGSSGSDQNVQE RGNGEEEELASTITVLASPEQSRLLAELEQTGKLHAALVFRGESTQAQKFLDEQASVLKE LYPEETEDPKKDRDESGGTDDLGEWANPELEPAGDAVVENEQQ >gi|229784120|gb|GG667615.1| GENE 4 2321 - 2713 365 130 aa, chain - ## HITS:1 COG:no KEGG:Clocel_2329 NR:ns ## KEGG: Clocel_2329 # Name: not_defined # Def: peptidase A24A prepilin type IV # Organism: C.cellulovorans # Pathway: not_defined # 1 130 23 156 159 102 44.0 4e-21 MVKNLVFYAILLYASVRDHKTHTVPDSVHVLILLAAIPGIRLLSSLLGFFLVPLPFLAAA 
LKKPDSIGGGDIKLMAASGFFLGMEKGIAAAIIGLTLAVLVQGAFLRNRNAPFALIPYLT AGCMAASLIN >gi|229784120|gb|GG667615.1| GENE 5 2738 - 3061 258 107 aa, chain - ## HITS:1 COG:no KEGG:Closa_3546 NR:ns ## KEGG: Closa_3546 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 9 107 7 105 107 144 71.0 1e-33 MVNTETLILYIPSGVKSENELFNGFGKRELLQSMAGALFGFLAAGAVWLIFRNVAATVVT ILTGIFGSVMMCTKDQNNLSVADQIVNIIRFSRSQQIYPYRAMREWG >gi|229784120|gb|GG667615.1| GENE 6 3075 - 3287 172 70 aa, chain - ## HITS:1 COG:no KEGG:Closa_3548 NR:ns ## KEGG: Closa_3548 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 69 2 68 70 87 65.0 1e-16 MEVTIDDLSLFVISIIRTGAALRFLYCMVRLSGAEEEAGQYRKRSKNTVVFWIIAECIWQ LKDIVLYYYT >gi|229784120|gb|GG667615.1| GENE 7 3297 - 4139 738 280 aa, chain - ## HITS:1 COG:no KEGG:Closa_3549 NR:ns ## KEGG: Closa_3549 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 280 1 280 280 370 70.0 1e-101 MEAILLALLVAVLNGAIAFIDQMLNDLIPMTLYADRYMVATSGGSMVTVLFDIMLGFGIS MIVLKFLKKGFECYIMWTDGDPDMEPAGLMIRFVEAIVVAVCFPLLYGWLAGIAEELINQ LLTGVGAATNYSWQAWVDGMASLGLVTAIFGLIFVICYFILYFQFLMRGLEIMILRIGIP LACVGLLDNDKGVFRNYMMKFFQSTLAVVIQISLCKLGIGMMLNVGINMNIFWGLACLVL AIKTPSFLRDFLLVNSGGGGVINNVYHSVRLAGMVQKIRK >gi|229784120|gb|GG667615.1| GENE 8 4170 - 4445 229 91 aa, chain - ## HITS:1 COG:SA0456 KEGG:ns NR:ns ## COG: SA0456 COG2088 # Protein_GI_number: 15926175 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein, involved in the regulation of septum location # Organism: Staphylococcus aureus N315 # 1 81 9 89 108 66 39.0 9e-12 MKVTARISKGFERAGHLKAYATLCLADSFLVTGVRVVECENGLRVFMPSTKDPDGEYHDV CFPITPDCRTRIETAVLNAYDSFIKETEGDE >gi|229784120|gb|GG667615.1| GENE 9 4468 - 4719 271 83 aa, chain - ## HITS:1 COG:no KEGG:Closa_3551 NR:ns ## KEGG: Closa_3551 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined 
# 4 83 15 94 94 76 53.0 4e-13 MRRLNEKMNQMLIAGYVRGIRVKERVFETLKQNTAEGFVDTALKILISVVVGALLLAGLY TLFKDTILPTLQSKIQNLFNYNG >gi|229784120|gb|GG667615.1| GENE 10 4980 - 6851 717 623 aa, chain - ## HITS:1 COG:SMb21167 KEGG:ns NR:ns ## COG: SMb21167 COG3344 # Protein_GI_number: 16264581 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Sinorhizobium meliloti # 40 418 45 372 453 185 34.0 2e-46 MKTEEKPKKQKLRNAEYYNFQEIQDELYQQSKVNQIFKNLVEVIASEENIKLAYRNIKKN KGSKTAGTDGKTIKHLAKWQDERLIQFVRKKLAWYEPQAVRRVEIPKGNGKTRPLGIPTI MDRLIQQCILQVLEPICEAKFHERSNGFRPNRSTEHALAQCYKFMQVDGLQYVVDIDIHG FFDHVKHGKLLKQLWTLGIRDKKLISIISAMLKAEVAGIGFPTEGTPQGGIISPLLSNVV LNELDWWITSQWEEMKTHTHKAYHRKDNGKLDKGKLYTKLRKTNLKECYIVRYADDFKIF CRNRKDAEKIFAATKSWLKDRLGLEISREKSKIMNLKRHYSEFVGFKIKLHKNGKEKSGK EKYTVKSHVGTKAAKKIKERTIALIKEIQRPSGGKTGYEAVLNYNAYVIGIHNYYSSATH VSLDFSQIAFSVKKSLKARIKKGLKQTGKHLPECIKERYGKSKQLKYVHNTALAPISFIQ HKNPMFKKLIINKYTVEGRNEIHKNLECVDMDMLLYLMRNPVIGRSIEYNDNRLSLYSAQ KGKCAVTKATLEIGDMHCHHKIPTYLGGGDGYMNLSLVTVDVHRLIHAVNADVITKYLEK LNLDNKQLTKVNKFRVLAGREII >gi|229784120|gb|GG667615.1| GENE 11 7606 - 7878 271 90 aa, chain - ## HITS:1 COG:no KEGG:Closa_3552 NR:ns ## KEGG: Closa_3552 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 83 5 88 120 85 60.0 6e-16 MRKTIKEWSEKAAKVSIMAAPLFLTNTITAYATGGISDSKLATGTQQLIQDLTTFLMVIA PIVTGLLIIYFCIRRSAADEMDQSVATRCC >gi|229784120|gb|GG667615.1| GENE 12 7862 - 10114 1093 750 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 234 670 133 562 591 149 26.0 2e-35 MKLKERIILSTALIGIGSLFNLYFTASLHGLMSRQYRTLVLLPIGTCLSGLAEQRQQRLL FLAFEGMILLVCLLFFIQNSRSYQSDLVKVTEDIETPVAVGQYQHGSSRWLREKEKDQIF DVFWLDPSHPVVKQLIRTGFEGLDFLKGPNHGKQETENGVHLKDRRKNPVVVPIEKKETR 
EKAIEKKGEEEGFELVDYQFQTAIERQAAPEPRKVPQLEWDRKEDPYCLMDRGGIVIGMK KEGGREKIHYISDDSHTLTIGATRSGKTRTLVLQSICLMALSGESMVISDPKAELYEYTS AFLEKLNYDVICLDFKNTEKSSRYNLLQPVIDAVSRGDTEHAQMYAWDITNILVGDNTSN EKIWENGEKSTIAAAILCVVMDNARRPEYQNLTNVYWFLAEMCKTVGNKTPMQEYVKRLK PGHPARALLSISDVAPSRTKGSFYTSALTTLRLFTSKSIYSITHRSDRDISDLGRKKQVL FFILPDEKTTYYPIASLMVSQLYELLVHQSDERGGRLKNRVNFIMEEFGNFTKINDLTNK LTVAAGRGCRFHFFLQSFEQLTEKYSKETASIVKSNCQTWVYLQADDKETLSEICEKLGK YTTSAYQLSSQHGRYVNPSTSSSISLVARDLLTTDEIRRVSRPYQIVVSRAHPAMMVSPD LSKWQFNRMLGLDDQEHNRKVREERERKRPIMTETGGEIPLWNIWIYYVKELQMQEGRQR QFGEAMPGSPLFSEKFMKRGGSAGNNEEDD >gi|229784120|gb|GG667615.1| GENE 13 10111 - 10788 221 225 aa, chain - ## HITS:1 COG:no KEGG:Closa_3554 NR:ns ## KEGG: Closa_3554 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 208 1 208 227 241 54.0 2e-62 MRLVYLCSPYRGDYETNIRLAKQYCKNALESGVVAFAPHLYFAQFYPDTIPEQRKAGLEM GLNMLEKSDELWVMGKIHSEGMRGEINFAKEHNIPVFYVPKPLEIKSYPISIDGNELLSE RDCIEESHNRNYESRLVVLSYSSLKPEYRMPRNQIWYASHGPGCGPGAKFSDTVHLYHPI DEDRMAVSRREILGEIRPEVLEMLQQLYPGLQMNRGILETEGPEL >gi|229784120|gb|GG667615.1| GENE 14 10795 - 11427 191 210 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619854|ref|ZP_06112789.1| ## NR: gi|266619854|ref|ZP_06112789.1| hypothetical protein CLOSTHATH_00916 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00916 [Clostridium hathewayi DSM 13479] # 1 210 1 210 210 425 100.0 1e-117 MDMSHKRRFYAGMVGGVVVLSWLVITSCSGKMANTDVSREADNSCYLCGNGGLMGYYGEF DSIGFMNVNTGQIVDIPILSDADGGKSEKEVNGSSYHLITVGDGGSAVAVSTDQRRRFGK GSVMPGENSNLEEEKAGKLFCKNCLSQLLDIYNDRIAEEIPDTTMVDFVERKFYAIDKRY SDYLIRDYYLHFDFLKDRTELLVFYAPERR >gi|229784120|gb|GG667615.1| GENE 15 11411 - 12220 505 269 aa, chain - ## HITS:1 COG:no KEGG:Closa_3555 NR:ns ## KEGG: Closa_3555 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 269 1 278 278 199 41.0 1e-49 MQQPQGASIWGNINTCVEIALNVYYIYAEHGEGIVIGKEKAEQTISAKAVEAGNEDGEYL 
YYGKDNTMDIPLYEILQKRLEFSRRLEAQLVEQMEEIKHHGRISLPEYFGECPPPSEGES ACKVRNGIYFVNREGQTENDSIEFAVEKTVAENFLSPMAYEYGEERNGYLYYSADSCAIP LYELKGIFHECQDCIASEIDLISTLCTCYPAYRHAFNKIVPEAEQIPDSNGMPGDFLNRH EEEKAIRDTEPMEQLDDDYGENIDYGYEP >gi|229784120|gb|GG667615.1| GENE 16 12349 - 12675 275 108 aa, chain + ## HITS:1 COG:no KEGG:Closa_3556 NR:ns ## KEGG: Closa_3556 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 102 1 102 102 100 55.0 2e-20 MKLNDHLKQLRTSKNLSVYKLARMTDVSENHIRSIEKGNSQPSVLTLEKLLTALGTNLAE FFNEEQSVLYPTEMEQELLLAVRRLDPEQTKALIQLAALMNRDEKNPL >gi|229784120|gb|GG667615.1| GENE 17 12785 - 13864 825 359 aa, chain - ## HITS:1 COG:no KEGG:Closa_3559 NR:ns ## KEGG: Closa_3559 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 359 45 406 406 362 49.0 1e-98 MFGKMQPGPIQEFKDWKEIARLVYANSKKNITMYRSVVSFSEETAKELLLYDQKSWQRYI ENHILTIAEKNRMKREHLQWACAVHGEKTHPHIHVVFWDTSIRAKNPYVPPAIPAAIRRQ MIKDTFADKIRLYGQKKDENLRAMRQITDEMVDEFERYLRRLKPGRYQKISKALEKELDC ELAFSEKAVNDLGSRLLKLRRAMPENGRIAYQLLSPEAKMQVDAVVEDMIRHCQPVRDAV EAYCEAKKGQAAFYSTDTTYLAQKAEGYEKEARKVLANRVLSGVRMVIRLEGEMRTMQYI QDRKAIYAEQIVLESLEILAGVTDSKEEFAEGRRSGTLQLSKEGQKEWYLKNQDKGFEH >gi|229784120|gb|GG667615.1| GENE 18 14047 - 15081 718 344 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2168 NR:ns ## KEGG: EUBREC_2168 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 300 1 297 297 123 27.0 1e-26 MNYEEFKSYVLGHIKDYLGEMDPEIDMKIEQTENKLFEEVDALVLSKPDREYSTNIFLDK AYVAYMGTGDMEPIMKDIAATVRKSEKQLNKMEAINLNSFEGIKERLSVQILNAERNQSQ LQVITHRKIPETDLAVVYRIEVASNRDGFESVKVTNKMLDIWGIREEQLYQAALEQNMKK YPFIIADAGQFLFRAEPVPEQYPEEMKEKRFYYLTNSNIVNGAATILYPDILKTIGDKFQ GNYFILPSSIHEVLLMKDDGEINVEELQSTVRSVNDESVPEGDILSDQVYAYDRENERFY QVSEREETWDHMDQGFSGMEERNMSEDVYASYADEVIANEEHER >gi|229784120|gb|GG667615.1| GENE 19 15113 - 15583 414 156 aa, chain - ## HITS:1 COG:no KEGG:Closa_3563 NR:ns ## KEGG: Closa_3563 # Name: not_defined # Def: 
hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 156 1 156 156 179 61.0 4e-44 MNQKTQKRSVNFPSETLKTLDKLAAREHTTTSELIRNFVEEGLKVNGYEEQVDFIARIIR QEITAVYHVEDIKAISDHSTDRLAKMLMKTGKINAAMFFLLVKVLIHLADRRSLEEMEHM VSEAVVLGVDYMQKKDFQINSFLYDTDFLMHLADKL >gi|229784120|gb|GG667615.1| GENE 20 15580 - 16371 511 263 aa, chain - ## HITS:1 COG:PAB0168_2 KEGG:ns NR:ns ## COG: PAB0168_2 COG0175 # Protein_GI_number: 14520463 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Pyrococcus abyssi # 4 192 52 221 271 79 29.0 8e-15 MCDSGGKDSNVIKQLAYLSGVPFEIVHNHTTVDYPETVYYVRREKKRWEKLGISYTVSYP TYKGKRTTFWELIPLKGPPLRHRRWCCNVPKESQNCNRYIITGVRWEESARRKQNRAAYE IQRTTKDRIKLQNDNDAKRKLMEICQVKGKVIVNPIIDWTTTEVWEFIDRYQIVTNPLYQ EGHKRIGCVGCPMVDNKYELEKNPEFKRMYLNAFQKCVENKPPSRHGWKTGQELYDWWVN GGRGMDSDQLCLFDLFPEGDDED >gi|229784120|gb|GG667615.1| GENE 21 16627 - 16785 76 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619861|ref|ZP_06112796.1| ## NR: gi|266619861|ref|ZP_06112796.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 52 1 52 52 92 100.0 7e-18 MSVRNNARRGGVQVRAPTIFFSAFESSAANDGPLAGHQKLRNKIFDFGLRQE >gi|229784120|gb|GG667615.1| GENE 22 16775 - 17455 353 226 aa, chain - ## HITS:1 COG:VC0217 KEGG:ns NR:ns ## COG: VC0217 COG2003 # Protein_GI_number: 15640247 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Vibrio cholerae # 46 201 73 224 224 90 37.0 2e-18 METVRTSREHFMDGFSRITGISKRKIEEYSKQFDLLHIVDHPIAAGVTEAQYKKIVQLRE FLNAYQSLRKIEWEERIILSGSERSKEYFISQLAYYREREMILCAYLDSGCGVISCEKVA EGTVDRSPFFTRELLKRVLQLDAVGVVLAHNHPGNSLKASEQDIAITNQLRDALGALAIK VFDHIIVGADRAVSLYQEGIIPADHQFQDGIAMVNLYESEENGYER >gi|229784120|gb|GG667615.1| GENE 23 17377 - 17655 171 92 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|266619863|ref|ZP_06112798.1| ## NR: gi|266619863|ref|ZP_06112798.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 92 14 105 105 140 100.0 3e-32 MEILLADLLISLLEVDEEEKTEMETFLEKYPFTALLREYPVLGLSVETERKLEAIAMLES YDGRDVSWRQSEPAGNISWTDSAGSPESVKGK >gi|229784120|gb|GG667615.1| GENE 24 17694 - 18737 699 347 aa, chain - ## HITS:1 COG:no KEGG:Closa_3569 NR:ns ## KEGG: Closa_3569 # Name: not_defined # Def: zinc finger CHC2-family protein # Organism: C.saccharolyticum # Pathway: not_defined # 12 347 1 337 337 389 55.0 1e-106 MRIVWNNEEVDMVTKQKPMPFTKEEVEKIYNTNIIDFAVENGLILEKGDANTVHVKNSGG LYLFKHGRGFHWFTTGKHGNIVEFAMEYFGLSKVEAMESVLGSRAYGSTFTVAEPVKEKT KKMVLPPRDRNNNRAALYLVSDRKLDEDIVYALMDQGRIFQVRQEINGKKYINCAFVGYD QEDTPRYCSLREMKLHGGFRQDVENSDKTYGFLMPGRSKRVYEFEAPIDAISHATLCKRF GIDWTKDYRISEGCLSDRALERFLKNHPEIKEIVFCYDNDSEGHLPDGTPHNHGQVKAEE MKKKYEKLGYWTAIQTPHLKDFNQDLMEYYHFPEQEQEEKKGEEMER >gi|229784120|gb|GG667615.1| GENE 25 18715 - 22884 3358 1389 aa, chain - ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 55 1357 1 1310 1315 550 30.0 1e-156 MIGAIRLPKNAFYQVAGTTATTDILFLQKREQEIVPDKREINWLSIEEDENGVPVNSYFI DHPEMILGTMAFDASMYANEATTSCLPFEDQPLEELLNNAVGKLQAVYREPDTELAVDGD SEVKDWLPARPEVKNGCYALIEDKLYYREDSRMYLQAIGGMKEERIKGLLEIKDALWNLI NFQSRTEEERIDDGYPADFDIGLKRYLNELNLVYDRFTARYGYVNSFANITAFAKDSDSP LLRSIEDPKKDEDRKKIKGEYQKAIVFYQATVRPKMVPKRADSAEEALQLSINMKGRIDL DYIQYLYHTPDHENFSKDEIIAELGTRIYQDPAAYDGKPYKGWVIAEEYLSGYVKDKLKE AVFYAKEEPERFNRNVEALQQVQPQPLTPEDISFTLGATWIPTEYYEAFMYEKFETPSYL QGGSSGIGIEFAPYTGEYHITRKSAEPNSVCVNATYGTARKNAYEILEISLNLRTVEVKD RVEYVDSYTGEDKVKYVLNRAETLLAREKQAQMKQEFESWLFEEPERGDALTKLYNDRFN NIRPREFNGDGLMFPDLNSAIHLRKNQRDLIARGIYGNTNVLGAHEVGAGKTFSAVVIAH ERKRLGVCHKPLIAVPNYLVGQWAADYMKLYPQDNLLVATKKDLERKNRRRFVSRIATGD 
YDAIIMAHSSFELINLSKEKQLSAIQEELNEISDAIADEKGRLGKSWTLKQLCAFQKNLQ FRYDYLFNEDKKDNTINFEQLGVDCLIIDEAHTYKNNYSFTKMRNVAGVGGKSSQRAMDM YMKSQYINEITGEKGLIYLTGTPVTNSMSELYVMQKALQPSELRSRGIFLFDSWASTFGV VESSLELRPEGTGYQMKSRFAHFHNLPELMSMFTLVADIKTTDMLPEIPVPKLRTGAFQA VKTMITPEQKEKMAELVMRAEDIRNGRVDSTEDNFLKLTNEARLLAIDPRILDFSISYNP NTKLNVCAQNVASIYHETAEKKSTQLIFCDKGTPKPDGSYSFYQALKEEMTAAGVKDEEI AFIHDYNTDIQRAELFEKVKTGEIRILIGSTEKMGAGMNVQDKLIALHHLDVPWRPADLT QRNGRILRQGNENEEISIFNYITEQTFDAYLWQILEQKQKYISQIMTGRSAARSCEDIDE TVLQYAEFKTLAADPRMKHKMEVDNEIYRLQTLKASWKANHQELQNDVIFSYPSRIEKCR DQIEKYRVDVERYHKNKRVDFSMTLEHKAHRGRSEASDHLELLFWRLGKSEGDTLVIGSY AGFTISLTRGISGIINVELTGKGIYGFTAGPSSLGNITRIENTIGRLDELKLDMEVKLED LEKQLKAAKLELSKPFKDEEKLRELLKEQVHLALELEFQMEEDKRPEDTPVDEVLESKSE SFYEDCMEQ >gi|229784120|gb|GG667615.1| GENE 26 23154 - 24899 226 581 aa, chain - ## HITS:1 COG:BH0224 KEGG:ns NR:ns ## COG: BH0224 COG3344 # Protein_GI_number: 15612787 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Bacillus halodurans # 4 374 5 326 418 176 33.0 1e-43 MPLIESEENIRLAYRNIKRNSGSNTSGIDRLTIKDIEKIPETKFVEIVRKKLQWYKPKAV RRVEIPKPNGKLRPLGIPAIWDRIVQQCILQILEPICEAKFSDRSNGFRPNRSAEHAIAQ AYAFMQKSHLHFVVDIDIKGFFDNVNHSKLIKQMWNIGIQDKKLLCVIKEMLKAPIVLPN GDTIYPQKGTPQGGILSPLLSNIVLNELDWWIISQWEKQPAHNVRDHITKTGAIHRGRAY DAFRKSGLKEMHMVRYADDFKIFCATRKDADRAYHATKAWLQERLKLEISEEKSKIVNLK KGYSEFLGFKLTLMKKGASYVVKSHICDKAKRRETDKLKGQIKKIAHPKNDEERATLINE YNSMIMGMHNYFQIATCVSSDFADMGREVDVVKLSQLKRKALSAKGDYKGFSFIEKKYGK SKQIRFYSGKPLIPVAYVKHKNPMWKKKSVGKYTSEGRREIHKNLGIDTRTMLLLMRTKE VGRSVEYMDNRISLYVAQYGKCAITGKILELHEIHCHHKKPVSQGGDDRYENLIILHKDI HRLLHATKEETIKAYLSQLQLTYKQKAKLNKLRQLANMQAI >gi|229784120|gb|GG667615.1| GENE 27 25704 - 28043 968 779 aa, chain - ## HITS:1 COG:pli0004 KEGG:ns NR:ns ## COG: pli0004 COG0827 # Protein_GI_number: 18450290 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Listeria innocua # 494 773 310 609 756 180 36.0 1e-44 
MSRATDLCELYQNAVKMLTQNPQEWIGLLASAAKFYKLSFDKNVLVYMQRPEAGLIATMR DWNIRTGRYVNKHSKAIAVLDMSDPKARLTYYFDFADTHGDLESLQKTMELIWKVENQYK SDLLMRFQEKYGTAGKSIEEALVQMVMQQTEKILNLYMEGFAVREPESILYGAPLEAVKD EFATYVQNSAVYIVLKKCGLSTDIISEDAFNHISHYGSLELFMQLGACSTAAARTVLRQI YQEIENIKEERSRLYEQRSLNASGIHREGRWNAVSEPEHFKGETVRPDTDRKIRENLEGV HDGMPSGTDDRVNPEGRNQPDDNESRRGSGGAERTADTEIIGESADAGEWNHDEPGGTHE SFNDDSGRSDYSGSSLPDEINRIEDRETARSDGEKGSSGRAFFVPRRYNDGENRQYYREI LTDTILYPVADLAERSVRDRRDFEKDSDSQISLFKFGLVDENTDDVGGILGELPEETRIV IPDKTAIEIPVIKEPPASKGKDEKSSREESVTEKQVTEEQTTEESNKDKSTIEKTDEEES TKEESTKEESTKEESVTEEPPAVRYNYVYSENHHLYDGGPKEKCRNNIEAIRLLKTLKQE GRHATPEEQLTLAKYVGWGGLANALTPEKRGWEKEYDTLQNLLDEDEMRSAAESSLTAYY TEQQIIKHIYAALERFGFQKGNILDPAMGTGNFYSVLPDSMKKSALYGVEIDRLSGSIAK ELYPNAYIEIKGYEDIEFSNNFFDVAVGNIPFNSIKISDRRYDRYGFRIHDYFIGATRS >gi|229784120|gb|GG667615.1| GENE 28 28027 - 29178 512 383 aa, chain - ## HITS:1 COG:no KEGG:Closa_3576 NR:ns ## KEGG: Closa_3576 # Name: not_defined # Def: ParB domain protein nuclease # Organism: C.saccharolyticum # Pathway: not_defined # 57 373 35 333 338 321 54.0 4e-86 MATRIPKIPTKSSLLSISLESADNVLDFPIKGEPEKPGVSEVLTERNQFSAASKEQLEME FDQMKPFSDHRFKLYTGKRLEDMVDSIRNYGILQPLILWHRREEYILLSGYNRRNAGILA GLKKAPVVIKENLSHEEAVLIMAETNLRQRSFSDLSESERAFCLKQHYDALKSQGKRTDL LNELDELLNSDDNAGASTVSESVTRSQNGETPENEASTVSKTVTRFRTDEKLGEKYGLNR DKVAKYVRIAGLIPPLLERFDQGEITFLAAYDLSFIENTAYQEILEEYLQGGYRLDTTKA ALFHSYAKEGNLTEKTIAEILSGQKSEILEIAKHKTFRLKPTVISRYFTPVQSKKEIEET IEKALELYFSKDEMKEQTVVKSN >gi|229784120|gb|GG667615.1| GENE 29 29168 - 29956 537 262 aa, chain - ## HITS:1 COG:VC2773 KEGG:ns NR:ns ## COG: VC2773 COG1192 # Protein_GI_number: 15642766 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Vibrio cholerae # 1 258 1 250 257 151 36.0 1e-36 MGKVIGIVNKKGGVGKTTTATTLSYLLTKRGYKVALIDFDGQRHSTLLSGVLCPEQLPFT IYDLLKRLVMDEPLPEAGEYVIQTENGVHLIPANEKLDNFEKLMSDATFCEYKLKEFVDT IRDSYDYIIIDCMPKMGIPMINVMICADSLIIPLQSETLAAEGMSAFLRAYHKIQSRCNK 
NLKIEGILFTMDNQRTRVSKRVKSQVENSLGEKVHIFSNTIPRSVRVADSVDAGMTICEL EPANPAAVAYERFAQEVIDSGN >gi|229784120|gb|GG667615.1| GENE 30 30259 - 32052 1780 597 aa, chain - ## HITS:1 COG:CAC1031 KEGG:ns NR:ns ## COG: CAC1031 COG0370 # Protein_GI_number: 15894318 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Clostridium acetobutylicum # 17 595 56 592 683 223 28.0 8e-58 MANAQGYASDGEQGYVMVDIPGCYSLMAHSTEEEVARDFICFENPDAVIVVCDATCLERN LNLVLQILEANRRAVVCVNLMDEAKKKSITIRFDILEERLGVPVIGTAARSGKGLEQIYE GLNRVLELKREMVEAEEVEAEEATEAKEATEAEEAVEAEKSVDRRFAENPAPRILIRYPE YIEAAIARLTPAVKKTAGEGVNIRWLCARLLDSNENLMEAVRKYLTPVAESEEVSSLLME IREEWQERGISQKQVSDDMASVFIRKAEFLCRGAVVFENQTYDKKDRLLDRLFTSKATGF PIMFLILLGVFWLTITGANYPSELLSSGLFWLEDRISDLFLTAGMPVVLNDLLVHGVYRV LAWVISVMLPPMAIFFPLFTLLEDFGYLPRVAFNLDRCFKRCAACGKQALTMCMGFGCNA AGIIGCRIIDSPRERLIAMITNNFVPCNGRFPTMIAIITMFFVGSAAGAFSSVLSAAILA GVIVLGVLMTLLISKILSATVLKGVPSSFTLELPPYRKPQIGKVIVRSIFDRTLFVLGRA IAVAAPAGLIIWLMANISVGDATLLAHCSGFLDPFARVIGMDGVILLAFILGFPANA >gi|229784120|gb|GG667615.1| GENE 31 32125 - 32307 184 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619871|ref|ZP_06112806.1| ## NR: gi|266619871|ref|ZP_06112806.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 60 1 60 60 91 100.0 2e-17 MNRTDELFFELIETYQRHAQASRNAKYQETREMAEILLNMDITAMWSLTQRTPDTRYRLM >gi|229784120|gb|GG667615.1| GENE 32 32537 - 32749 83 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619872|ref|ZP_06112807.1| ## NR: gi|266619872|ref|ZP_06112807.1| putative branched-chain amino acid ABC transporter, periplasmic amino acid-binding protein [Clostridium hathewayi DSM 13479] putative branched-chain amino acid ABC transporter, periplasmic amino acid-binding protein [Clostridium hathewayi DSM 13479] # 1 70 1 70 70 68 100.0 2e-10 MVTGVVVEIIAVIIAADAYATIPVRINAIAAITIAVSIHAPEREKKLMRPDLETAVIMLP AGRITVAANG >gi|229784120|gb|GG667615.1| GENE 33 32850 - 33167 296 105 aa, 
chain - ## HITS:1 COG:no KEGG:no NR:gi|266619873|ref|ZP_06112808.1| ## NR: gi|266619873|ref|ZP_06112808.1| putative choline binding protein I [Clostridium hathewayi DSM 13479] putative choline binding protein I [Clostridium hathewayi DSM 13479] # 1 105 33 137 137 214 100.0 2e-54 MMDNPELWTDFCLRIRGQEDDVKSFEDGAGNWHFTINGELQKARWIKYKNKWFYVDDAGN MVTGYAIIGGLAYMLNPSKVDMATYGALLVTNNLSQGNLEVQWVG >gi|229784120|gb|GG667615.1| GENE 34 34182 - 34496 254 104 aa, chain - ## HITS:1 COG:no KEGG:CbC4_4176 NR:ns ## KEGG: CbC4_4176 # Name: not_defined # Def: putative N-acetylmuramoyl-L-alanine amidase # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 4 100 5 101 304 92 51.0 4e-18 MLPITKQIKQINCYASQNHPKYIVIHETDNFKKGAGAESHARAHNNGNLDTSVHYYVDDV AIYQTLNHTDGAWAVGKQYGTPLVAGVNNNNTINIEICVNPVAS >gi|229784120|gb|GG667615.1| GENE 35 34641 - 34841 316 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619875|ref|ZP_06112810.1| ## NR: gi|266619875|ref|ZP_06112810.1| conserved domain protein [Clostridium hathewayi DSM 13479] conserved domain protein [Clostridium hathewayi DSM 13479] # 1 66 1 66 66 87 100.0 3e-16 MNEKTKRWIKAAGIRAVKTMAQTAVATIGTAAVLGDVNGPMVISASVLAGVLSLLTSVAG LPEITE >gi|229784120|gb|GG667615.1| GENE 36 34843 - 35160 207 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619876|ref|ZP_06112811.1| ## NR: gi|266619876|ref|ZP_06112811.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 105 1 105 105 175 100.0 1e-42 MIETEFIERLTKVEERSKSNTHQIDDLKPVINEIHTMSKTMVELIGEVKYTNENVSELKD KVEILEKEPAKQWSATKKTFFTSITSSIGTAVAAGILYLLSRGGF >gi|229784120|gb|GG667615.1| GENE 37 35177 - 35596 276 139 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3715 NR:ns ## KEGG: Cphy_3715 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 4 138 11 145 146 129 50.0 5e-29 MALILRKYLVLMATGGLLYVALELVWRGRSHWTMFLLGGTCFICLGLINEILPWSMALWK QILIGMATITVLEFITGCIVNLWLGWNIWDYSGLSGNILGQICPQYCLLWLPVSLAGIVL 
DDWLRYWWWGEERPRYKLF >gi|229784120|gb|GG667615.1| GENE 38 35577 - 36317 490 246 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619878|ref|ZP_06112813.1| ## NR: gi|266619878|ref|ZP_06112813.1| putative acyl carrier protein [Clostridium hathewayi DSM 13479] putative acyl carrier protein [Clostridium hathewayi DSM 13479] # 1 246 1 246 246 444 100.0 1e-123 MEKIRIGNQETLFEIESIRPVSENVMQLVFPDTVPSVWGDITIYTDDGTEATTLTGYDTV YRDEGQTVYLSNDGSVYTAPETPEGPGEPPEPYIPTLEELQAAKRREISVACQQIIYQGV NVQLSDGSTDHFALTIEDQLNLFGKQIQVTSGAAQIEYHADGQPCRFYSAEDMQAIITAA MWHVSYHTTYCNAINMWIAGCETAEEIQVIFYGADVPPAYQSEVLQSYLTQIAAIAGGNA DGADPA >gi|229784120|gb|GG667615.1| GENE 39 37520 - 38539 706 339 aa, chain - ## HITS:1 COG:SP0890 KEGG:ns NR:ns ## COG: SP0890 COG0582 # Protein_GI_number: 15900773 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 9 338 8 319 321 196 33.0 6e-50 MVEKILENVVNEMAPHLSQDQLEHLSNVLYVNFHGMEVQEQCTELTASGEDGDEAKIRMF VASKKAVNRQTNTLKQYTREICNMLDFLGKRLEDITGMDLRYYYGVMRERRGIKMSTMQT RLHYLSSFWDFMITEDLVSSNPVKKVGLLKIEKTIKKPFSAAEMEALRTSCSELRDRALV EFLYSTGVRVSELVSLNVGDIEMGKQELIVYGKGSKERKTYLTDGAKFYLRRYLRTRCEN EGMTMEELQGRPLFATLDRPHGRLTVAGVQYMLRQLGRRAGVEGVHPHRFRRTIATDLLS RGMPIEQVKEFLGHEKLDTTMIYCTVKTESVQASHRKYA >gi|229784120|gb|GG667615.1| GENE 40 39700 - 40020 136 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870074|ref|ZP_06112815.2| ## NR: gi|288870074|ref|ZP_06112815.2| hypothetical protein CLOSTHATH_00943 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00943 [Clostridium hathewayi DSM 13479] # 1 106 1 106 106 187 100.0 3e-46 MAIGETSNITSASFTGKIIWQNLGRLTTLSFELTTKKELASGATYELISIAEGIPSKKAN CVLTTSDGKPYYCYSSPDGKFYVANYTGAAIPANKVLYGQIAFISA >gi|229784120|gb|GG667615.1| GENE 41 40027 - 40458 370 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619881|ref|ZP_06112816.1| ## NR: gi|266619881|ref|ZP_06112816.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium 
hathewayi DSM 13479] # 1 143 1 143 143 264 100.0 1e-69 MTISNFVAYVKQAWKNKPDTSTPLSAARLTHLEDGIKGNSDAIEKIAAAVVSQIVNDPNK IASMASLYSVNQAVTQLNSDLAIVGNVNGMEFGTDGEKRYFVFKHNDGSKSSLEFYNNGV NLVRYNPKTAQWEIIWSIPVTVN >gi|229784120|gb|GG667615.1| GENE 42 40455 - 40676 109 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619882|ref|ZP_06112817.1| ## NR: gi|266619882|ref|ZP_06112817.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 73 1 73 73 128 100.0 1e-28 MADLYIQSVKITPNPVTTKAQFKIEVEIYTLFPAADLYPALDLYPGEDLFGLFPQEEIFP GTNVYPIEGGMSE >gi|229784120|gb|GG667615.1| GENE 43 40676 - 41269 341 197 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619883|ref|ZP_06112818.1| ## NR: gi|266619883|ref|ZP_06112818.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 197 1 197 197 399 100.0 1e-110 MAVARVFGRVDGAEVVMEQTQGDIWSVPVPLDYDGEYVVEIIAEDGAGNQSYMAKMLFCV DSSGLCVQVLPIPYFAELLDSIYHADVLPALFSAELLEPCCQSGGGVNMQKIIFDIGEHR HVRLKIHAAQDAPFRIKSASWELLRGNTLEASGECEIDEHIIDAYIEAPPGKTSYILRVI YKINDETLVEQLDLVVV >gi|229784120|gb|GG667615.1| GENE 44 41282 - 42193 786 303 aa, chain - ## HITS:1 COG:no KEGG:Sfum_4062 NR:ns ## KEGG: Sfum_4062 # Name: not_defined # Def: hypothetical protein # Organism: S.fumaroxidans # Pathway: not_defined # 78 294 324 528 711 94 31.0 6e-18 MAVRTVQAIINGVTTTLTLNSSTGKWEATVTAPSTSSYNNNDGHYYPVTIKATDEAGNAT TKNDTDVTLGGSLRLRVKEKVAPVILITYPTASALIVNNKPAIRWKVTDNDSGVNPDSIK ITIDTGEAVTAGIVKTPITGGYDCTYTPTAALADGSHTIKVDAADNDGNAATQKSVTFKI DTVPPTLNVTAPVNGLITNKAACTVTGTTNDITSSPVTVTVKLNSGTAEAVPVGADGSFS KALTLAAGSNTITVVATDSAGKSTTVVRTVTLDTVAPTIRAVTLTPNPVDAGKTYVISVE VTD >gi|229784120|gb|GG667615.1| GENE 45 42203 - 43072 425 289 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619885|ref|ZP_06112820.1| ## NR: gi|266619885|ref|ZP_06112820.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium 
hathewayi DSM 13479] # 1 289 1 289 289 555 100.0 1e-156 MILPIRKVGGGPNISELTADPGDVLAAEKFIGTGSEEPQVGKIVQRGSPEYALPINGVQK LPPGNYTGGKVKQTIETMAAQSIGPGARMITIPTAGKYMTGDITIRAVKNLSTSVIKKGQ YVGGVGPGTWEGYVNKDPKVPYYYGAFNGIQSITAFKHLLWDNVGTVSLERDHIKVYVRN NTYYTAVVFNEPIDLTNLNRLIVRLEYSGSALDNIELFRNKVTDYIFDNEKSSSLKRNPN LGDKVAGMNTSGTGGDYTLNLSSVSGTAYLYLLFISPAVTSKIRMVKFE >gi|229784120|gb|GG667615.1| GENE 46 43090 - 43512 213 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619886|ref|ZP_06112821.1| ## NR: gi|266619886|ref|ZP_06112821.1| hypothetical protein CLOSTHATH_00949 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00949 [Clostridium hathewayi DSM 13479] # 1 140 1 140 140 266 100.0 3e-70 MAILPLFFRGQSVDFSGLTAVENLVRKGKKFIGRGSSDIRTGNLEEKNATSYKLPINGTY NIPAGIHNAEDTVDQEIDTMDGQIVTPGAGPVVIQCAGKYMTGDIIVYAVENLTAENIKF GEVVGEGEGAVTGTCQGFFD >gi|229784120|gb|GG667615.1| GENE 47 43516 - 43995 296 159 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619887|ref|ZP_06112822.1| ## NR: gi|266619887|ref|ZP_06112822.1| hypothetical protein CLOSTHATH_00950 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00950 [Clostridium hathewayi DSM 13479] # 1 159 1 159 159 299 100.0 4e-80 MKKLRTNYKNDKYTGKRLYRVTNVSADTVNLDDITIYAEEGDIFSADDINETNAAVNELY EEYAEGISRANRYVEINLPVSGWSATAPYIQTVSVPGMLASDRPVPGLVYPDNLTEALQA QIDKSANMITTIETLDSKVKVTCRFKKPVIALRLGLKGV >gi|229784120|gb|GG667615.1| GENE 48 43992 - 44432 215 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619888|ref|ZP_06112823.1| ## NR: gi|266619888|ref|ZP_06112823.1| hypothetical protein CLOSTHATH_00951 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_00951 [Clostridium hathewayi DSM 13479] # 1 146 1 146 146 295 100.0 9e-79 MSDILDCLIFDRVQADIDAMTDKAYIDYQDLNRVEDAIKWVSHVLNCYGYRNTIIEGAIW QPEDRRTESEMERIRKNLIAIRSAFYTPPSTPQTPERITYTSIYQANFIEKIIYDIGILV ENLIPSIPHLEFKLGCRGIGNRSVSL >gi|229784120|gb|GG667615.1| GENE 49 44425 - 46203 873 592 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619889|ref|ZP_06112824.1| ## NR: 
gi|266619889|ref|ZP_06112824.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 592 1 592 592 1183 100.0 0 MFLKQSIRSDATDQSGIKIVYDDVAPYAKENSNPQVIDAGLCPGADTFPGPEVYPAPTVI RNTFPDLKRDDLKYPGYALCLPRFALLNGDYINFPDDAQGYGFISDEISDSNGRFEYTMT QEPTLPGEYPSIFLYPSPGLEKEIKTPTLEITFNRKFTSVGLLLTFNDMSGDFASRINVK WYSGEQLLSDLNFEPNETKYFCSNYVQLYDRIIITFMETSKPCRPVFLTRIDYGIYRDFM GDEIKEINCLQEINAISESISVNTMDFTVKTKSSVPFDLQKKQKLSLYFNGGLIGNFYLK NGARKSKTDYYMDTHDAVGLLDGNEFPGGIYSGQLVPDVISQIFDGEDFNYLLDETFNNI TLSGYIPYTTKRAALMQIAFAIGAVVDTSNYDGVIIYPKQTKNTGEFDKVFEGLTLDHSD VVTGIRLTGHSYQRSDESEELYNDELSGTVQVTFSEAHHSLTISGGVISQSGDNYAIIKG TGGAVVLTGKKYHHYTFMISRENPNIFFNKNIKEVKEATLINKDNAQQALDRVYEYYQRA ENVTCEVILEDKMIGQVVGIDTDYDGVKVGTIERINYSGIARAIKAEVTIHE >gi|229784120|gb|GG667615.1| GENE 50 46205 - 46591 157 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619890|ref|ZP_06112825.1| ## NR: gi|266619890|ref|ZP_06112825.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 128 1 128 128 244 100.0 1e-63 MSENVFSIDGVDLRLNVTKLDREFSVTDTENSGRLKNYEMYREIAGTFYNYTMEIEPITQ YREDYDTFYQMISAPETKHRLVVPYAQKTLEFEAYVTKGKDSLQRRGDKNLWHGLSVYFV AMSPQRRP >gi|229784120|gb|GG667615.1| GENE 51 46597 - 49359 1776 920 aa, chain - ## HITS:1 COG:no KEGG:LPST_C2001 NR:ns ## KEGG: LPST_C2001 # Name: not_defined # Def: probable minor tail protein # Organism: L.plantarum_plantarum # Pathway: not_defined # 443 745 798 1111 1943 159 31.0 7e-37 MAADGSLKFDTKVNTEGFDAGMSTLTKAVERLSGLIEDLSKKMDGGFTGAGNTAASTAKD IDTVAESAKKAREEVERLNKEKAATFTGTITNNNASPSSIPDDGKRYDIYGNDVDEIIAR NKAIEESAREAAAAENKAFEEAKQGPTMLQNTLEILKRTISDIPTIASSTGHVIMGAFDS GNQSVIALVDKIDLLKEHLYSLEKSGSYFGDPEYDKTYGELQKAIVALNSYKKELEGTGT VQKKVDSSGKKMNKTLAATNKTAIPLTKSILKLSNMFKLMLIRMAMRTAIKAAKEGFENL TQYSDETNKSISMLVSANTRLKNSFATAFAPALEAAAPALKEMIDLLSVGATYAGQLVAA LTGKATFVKAVDVEENYGEALKDSNSELKKKEKLNQKLAFSFDDLIQAQKKSEDGYIGPT 
PDQMFETVEIENDIKDFAAVVKGVFSDLFDPLKQSWMENGPEVNEAVHAALSSMKNLALD VGASFLQVWKNEGYGQKITDDLLITFANLAFTAANLCDQLDKAWTSGDLGVSIMRHLGDL VLEVTGFFRDASGEIRDWSATLDFAPLLKSFDEVLVSLRPIVSKIGDVLLWLLKDILLPI AKWGIENGIPAAFDLISAALKVLNSILDALKPLAMWLWKDFLQPFGEWTGKIIIEALKKV TEWLTKFSDWISEHKQLIEDVTIVVLGFFAAFAFESFVSGVGNMLSVLPNLIGVLGSLVG KLDPLTLLLGLVISLAAYVATAWDDMTPDEKLASKILAVAGAIGLIVAEIGLLLHDPLML GIGVAIAAIAGIAIAGIASSAKSRAGAYSSGSYSPYNTYSLGDVSSYRMPRLATGTVVPP RAGEFAAILGDNNRETEVVSPLSTIEQALDNVMAKYMGEGGNRRPMQIDLIINGQRFARA VYEANNQERQRVGVRMITEG >gi|229784120|gb|GG667615.1| GENE 52 49405 - 49965 465 186 aa, chain - ## HITS:1 COG:no KEGG:Sgly_0348 NR:ns ## KEGG: Sgly_0348 # Name: not_defined # Def: putative protein GP15 # Organism: S.glycolicus # Pathway: not_defined # 4 185 7 188 191 98 34.0 2e-19 MIGRLPTTLEVAGKKMEIRTDFRDILVIMQAFNDPELLPEEKYEVMLEILFMTPEEIPES AYPEAVRQALWFLDCGQEADDKKPSRKVMDWEQDEPILFPAINKVAGREVRAAEYMHWWT FMGYFMEIDDGTFSMVLGIRQKRARGKKLEKWEQEFYQANKAMCDIKTKYTAEEQEEIDY WNKLLG >gi|229784120|gb|GG667615.1| GENE 53 49962 - 50411 343 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619893|ref|ZP_06112828.1| ## NR: gi|266619893|ref|ZP_06112828.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 149 1 149 149 280 100.0 3e-74 MQSINFSDNLKSFSINGDENRVIRFNPADPNILVRADAAQKRITEKQSQIESVKLMPDGA PVENPTEQVRRLLKEFDDLIRDEINYIFNSDVYDTVFAGQSPLCIVGEKKEFLFEAFLKA AMPIIREGVDEFNVGSQHRIEKYTREYSK >gi|229784120|gb|GG667615.1| GENE 54 50640 - 51917 895 425 aa, chain - ## HITS:1 COG:SP1129 KEGG:ns NR:ns ## COG: SP1129 COG0582 # Protein_GI_number: 15900995 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 214 407 204 383 387 68 28.0 2e-11 MSKKERDEKRRDSKGRLLKSGESQRTDGRYAYKYTDTFGEPKFVYSWKLVPTDKIPAGKR PDISLREKIKQIQKDLDDGIDTIGKKMTVCQLYEKYIRQRGNVKRGTHKSRQQLMKLLSE DKIGGASIDSVKLSDAKEWALRMQEKGVAYHTICNGKRSLKAIFHMAVQDDCLRKNPFDF QINEVINDDTVPKVPLTPAQEKELLGFMQSDPVYAKYYDEVLILLETGLRVSELCGLTPA 
DLNFDKRFVNVDHQLLRSTEDGYYIEAPKTDSGYRQVPMSAAAYKAFQRVLHRRKDGKGV VVDGYKGFLFLNRDGLPKAAVNYDSMFQGLAKKFNKFHAEPLPEVMTPHTMRHTFCTRMA NAGMNPKALQYIMGHSNIVMTLNYYAHATFHSAQEEMERLQAKSQTAAAVNAQPASESAQ ESKAA >gi|229784120|gb|GG667615.1| GENE 55 51997 - 52200 220 67 aa, chain - ## HITS:1 COG:no KEGG:Tresu_1933 NR:ns ## KEGG: Tresu_1933 # Name: not_defined # Def: Excisionase from transposon Tn916 # Organism: T.succinifaciens # Pathway: not_defined # 1 67 1 67 67 106 82.0 2e-22 MSNNDVPIWEKYTLTIEEASKYFRIGENKLRRLAEENPSAGWVILNGNRIQIKRQKFEKI IDSLDTI >gi|229784120|gb|GG667615.1| GENE 56 52241 - 52330 71 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWAKHGRLAAQLHGLFTPVWFSHGPTHCL >gi|229784120|gb|GG667615.1| GENE 57 52630 - 52869 263 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937775|ref|ZP_02085134.1| ## NR: gi|160937775|ref|ZP_02085134.1| hypothetical protein CLOBOL_02667 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9475_02725 [Clostridium symbiosum WAL-14673] hypothetical protein CLOBOL_02667 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9475_02725 [Clostridium symbiosum WAL-14673] # 1 79 1 79 79 152 100.0 6e-36 MKEPTFNGRELLPLSVMEAAHAGDAMAMEQVLRYYEDYINKLCIRTLYDSNGIPYVCVDE YMKHRLEIKLIHSIIVALK >gi|229784120|gb|GG667615.1| GENE 58 52866 - 53309 364 147 aa, chain - ## HITS:1 COG:no KEGG:lmo1099 NR:ns ## KEGG: lmo1099 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes # Pathway: not_defined # 1 137 1 136 139 100 37.0 2e-20 METIPRNDFILRHRYDAFCKAVLRNEAKSYWSEMAHRHEREKSLDALTQEEMDKLSVVDD YPSDSYVFSSYGYDLLIDNELVAEAFASLPEQEQSILILHCVLDLADGEIGSLMGMSRSA VQRHRTRTLKQLRMKLMAFMPEGGKRG >gi|229784120|gb|GG667615.1| GENE 59 53588 - 53779 83 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEHQTKGTLRNQAVITSAGSPRATTPRFVRYWMGAIPPNADPIGGFRREPSCVERDIDQR KWE >gi|229784120|gb|GG667615.1| GENE 60 53800 - 53907 160 35 aa, chain - ## 
HITS:0 COG:no KEGG:no NR:no MRNKKFTKYQIMTRVGTIISILLFVVLGFCIEVIF >gi|229784120|gb|GG667615.1| GENE 61 53897 - 54934 350 345 aa, chain - ## HITS:1 COG:AGl3016 KEGG:ns NR:ns ## COG: AGl3016 COG4124 # Protein_GI_number: 15891623 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-mannanase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 122 289 99 273 320 62 27.0 1e-09 MIKWKFLNCICIFAAAIFILWVGCNEQPVDTEKADGSDSIASIEAPSNSQATLEEPTTES IVTESSVVESEYYPITEGQRYLGVYVQGPEETDVLADSINTLAWFDRFDQTSDYKISLCL DDNKYIAFITLQPTDWDLKLVSDGYYDDLIIEYFKKLSSDNRANTELFVRLAHEMEMRPS YKSGWYSWQTDDAHAYVNAWVHIVNLGREYAPNVKWVWSPNRADEYTTKYYPGDEYVDYV GLTLNNTLDSRESFQQFYENEGQRDYLEAYNKPIIFGEIAEHSTSDEVRNEYIQSVFDYL GTYDKCIGFIFLNQDIESARQYKFTDCELILDTFIENARDYICAK >gi|229784120|gb|GG667615.1| GENE 62 54912 - 56036 407 374 aa, chain - ## HITS:1 COG:SSO1299 KEGG:ns NR:ns ## COG: SSO1299 COG1215 # Protein_GI_number: 15898141 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Sulfolobus solfataricus # 2 233 67 283 422 102 29.0 2e-21 MLLPVVDEPLDLFYSVLMKIARQNPSEIIVVINGPKNEGLENLCVDFNRNLPICFTPIQH YYTPVAGKRNGIRVAMEHINPNSDITVLVDSDTVWTEDTLSELLKPFACDQKIGGVTTRQ KILDPDRKLVTMFANLLEEIRAEGTMKAMSVTGKVGCLPGRTIAFRTQILKDVMYDFMNE TFMGFHKEVSDDRSLTNLTLRKGYKTVMQDTSVIYTDAPTEWKKFIRQQLRWSEGSQYNN LRMTPWMLKNAKLMCFIYWSDMISPMMLVSVYANTIICKVLNILGCAIPTLAYTAPWWQI ILFILLGCIISFGSRNIKVMRSVKWYYTLLLPVFILVLTVVMVPIRLLGLMLCSDDMEWG TRKLEEDNDKVEVP >gi|229784120|gb|GG667615.1| GENE 63 56226 - 57413 413 395 aa, chain - ## HITS:1 COG:BH3708 KEGG:ns NR:ns ## COG: BH3708 COG1004 # Protein_GI_number: 15616270 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Bacillus halodurans # 1 395 1 388 388 444 57.0 1e-124 MQITVVGAGYVGLSLATLLGQKHEVIVLDIDEEKVAKVNSRISPVQDVYLEKFFAERKLN LTATVDTAIAYKDAEYVIITTPTNYEDETNSFDTHAVDSTIEICMANNDHCIMIIKSTVP 
VGYTRSVRKKYNTSHILFSPEFLRETKALYDNLYPSRIVVGTDIADPAMVIHAQVFSTIL QECANKEDIPVLIIGLDEAEAAKLFANTYLAMRVAFFNELDTFAEMNGLSTKNIIDAICH DPRIGKHYNNPSFGYGGYCLPKDTKQLKSSFHDIPENLITAVCQANHTKKGHVIKGILNK HPGTVGIYRLTAKSNSDNFRSSAVWGVMEGLSKEKQEIVIYEPLLGDAAEFMGYKVVHSF AEFKRSCDVIVANRVSSELSEVMYKVYTRDIFGRD >gi|229784120|gb|GG667615.1| GENE 64 57563 - 57715 71 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937781|ref|ZP_02085140.1| ## NR: gi|160937781|ref|ZP_02085140.1| hypothetical protein CLOBOL_02673 [Clostridium bolteae ATCC BAA-613] hypothetical protein HMPREF9475_02730 [Clostridium symbiosum WAL-14673] hypothetical protein CLOBOL_02673 [Clostridium bolteae ATCC BAA-613] hypothetical protein HMPREF9475_02730 [Clostridium symbiosum WAL-14673] # 1 50 1 50 50 84 100.0 4e-15 MVGRIVGVLGPEGRMWVSVGFAVSQPMGAIPSVQGAFTCSFGSLFAKNVE >gi|229784120|gb|GG667615.1| GENE 65 57922 - 58050 75 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937783|ref|ZP_02085142.1| ## NR: gi|160937783|ref|ZP_02085142.1| hypothetical protein CLOBOL_02675 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02675 [Clostridium bolteae ATCC BAA-613] # 1 42 1 42 42 63 100.0 5e-09 MILWFHGPLAQLQVELISSIISVSEQIFNHLLHIEQQNKYEK >gi|229784120|gb|GG667615.1| GENE 66 58063 - 58899 520 278 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 9 274 1 262 265 128 28.0 1e-29 MKQNYNEILREYRIYLTEHEKSHATIQKYVRELVWFLSFLQGEEPTKAKVLEYREQLQQS HHARTVNAKLSAIHSYLDYLGLAACKVRFLKIQHTVFVDDSRDLTEAEYHRLLDAAKRKK DSRLYHVMLAICTTGIRVSELSFLTVEALHKGKAEIRMKGKIRTILLTKELCRKLNAYAK EKGIRTGYLFCTRTGKPLDRSNICHDMKKLCRAARVNPEKVFPHNLRHLFAKCYYAIKKN LAYLADILGHASVDTTRIYVAMGTREHERTLQRMHLIS >gi|229784120|gb|GG667615.1| GENE 67 59259 - 59600 291 113 aa, chain + ## HITS:1 COG:no KEGG:CD1101 NR:ns ## KEGG: CD1101 # Name: not_defined # Def: putative mobilization protein # Organism: C.difficile # Pathway: 
not_defined # 7 108 1 102 109 114 59.0 1e-24 MKEGEKMANRTRKFVLRVPVTPEERALIQEKMAQLGTMNFSAYARKMLIDGYIVHVDTSD IRAQTAELQKIGVNVNQIARRINSTGAVYAQDVEDIKGALAQIWQLQRSILSR >gi|229784120|gb|GG667615.1| GENE 68 59567 - 60901 952 444 aa, chain + ## HITS:1 COG:SP1056_1 KEGG:ns NR:ns ## COG: SP1056_1 COG3843 # Protein_GI_number: 15900926 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD2 components (relaxase) # Organism: Streptococcus pneumoniae TIGR4 # 1 193 1 210 402 83 30.0 1e-15 MAVTKIHPIKVNLKAALDYIENPDKTDDKMLVSSFGCSYETADIEFQMLLDQAFKKGNNL AHHLIQAFEPGENTPEQAHEIGRQLADEVLGGKYPYVITTHIDKGHLHNHIIFCAVDMVN QRKYISNRRSYAYIRRTSDRLCREHGLSVVKPGKDKGKSYAEWDAQRKGKSWKAKLKIAI DAAIPQAKDFDDFLRFMQTQGYEIKPGKFVSFRAPGQDRFTRCKTLGEDYTEEAITRRIK GLAVDRGQKRKAEQRISLRIEIENNIKAQQSAGYARWAKLYNLKQAAKALNFLTEHQIES YESLESRLDEISTANDEAAAALKAVERRLGEMALLIKNLSAYKQLRPVALELRNAKDKAA FRRQHESQLILYEAAAKALKEAGVKKFHNLYALKAEYKKLDGERERLSEQYNEVKKELKE YGIIKQNVDSILRVTPGKERIQEI >gi|229784120|gb|GG667615.1| GENE 69 60898 - 61251 174 117 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3582 NR:ns ## KEGG: EUBREC_3582 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 115 4 117 128 132 64.0 6e-30 MNIPRMNYRQYRKARRLTHECCNYCDGNCLLLDDGEECVCVQSISYSVLCRWFQAAVLPL DAALYAELMERSRALRCRECGTLFSPRRPNCLYCPDCAEKRKRQSKKRWARKHREHA >gi|229784120|gb|GG667615.1| GENE 70 61630 - 62574 860 314 aa, chain + ## HITS:1 COG:SPy0898 KEGG:ns NR:ns ## COG: SPy0898 COG0583 # Protein_GI_number: 15674920 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 294 1 294 301 313 50.0 3e-85 MTLQQLRYIVAVAETGTITEAADKLFISQPSLTNAIRELEKEMKILIFHRTKKGIRLSKE GEDFLGYARQVLEQAAILEDKYKGRDGGKKQFCVSTQHYSFAVNAFVDLIKEYGQDEYDF SLRETQTYEIIEDVAHMKSEIGILFLNDFNEAVLSKILKSHDLKFHLLFVATPHVFISRN HPFADRSILTNEQLAPYPYLSFEQGEHNSFYFSEEIFSVSERKKQIRVRDRATLFNLLIG LNGYTVCSGVIDEQLNGKDIIAVPLAEESHMRIGYITHQKGSISRLGTTYLEALKHYVDA EHRAVRKSNYRESV 
>gi|229784120|gb|GG667615.1| GENE 71 62789 - 65059 1267 756 aa, chain + ## HITS:1 COG:SP0585 KEGG:ns NR:ns ## COG: SP0585 COG0620 # Protein_GI_number: 15900494 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Streptococcus pneumoniae TIGR4 # 1 754 1 747 749 918 58.0 0 MQTSVIGFPRIGTLRELKFASEKYFKKEISEEELLQTGKELRAKHWNTQRQAGIDYISCN DFSYYDMVLDTAVLFNLIPKRYKELNLSDLDTYFAMARGYQGTYGDVKALAMKKWFNTNY HYIVPEIEDDMKITLAGNKPFEEYKEAKALGIETKPVIIGAYTLLKLCRYTGKKSLQDIA GSVAKAYQAFLRSCEEIGIPWIQFDEPALVQDMAQEDIALFRQLYDAILSEKRKCHVLLQ TYFGDVRDFYQELLAMPFDGIGLDLIEGKKNVSLIEEYGFPQDKRLFAGVINGKNIWKNH YEPTLRMVKTLQAKNIQCVLSTSCSLLHVPYTLKHEHRLSKHYTSYFAFAEEKLDELAQL KELAQCEDYHAASSYQENKKLFEQARDCVNEAVHKRLSRITEDDYIRKPSRGKRQVLQKQ KLALPEFPTTTIGSFPQTNDVKAKRSSYRKGEVSETEYKEFTKKKIAECVRWQEGIGLDV LVHGEYERNDMVEYFGEALGGFLFTEKAWVQSYGTRCVKPPIIWGDVYREKPITIEWSVY AQSLTNKAIKGMLTGPVTILNWSFPREDISIKESISQIALSIRDEVLDLEKNGIKIIQID EAALREKLPLRKSDWKKEYLDFAIPAFRLTHSGVRPETQIHTHMCYSEFTDIIPAIDDMD ADVITFEASRSDLQILTSLQEHNFETEVGPGVYDIHSPRVPSVEEIVVALHTMLTKIDRT KLWVNPDCGLKTRGIPETEESLTHMVEAAKVIRKDA >gi|229784120|gb|GG667615.1| GENE 72 65124 - 65945 585 273 aa, chain + ## HITS:1 COG:Cj1202 KEGG:ns NR:ns ## COG: Cj1202 COG0685 # Protein_GI_number: 15792526 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Campylobacter jejuni # 2 273 5 276 282 258 44.0 8e-69 MSFEIFPPKKDSELKNIDKTLSVLCELNPDYISVTFGAGGGANRNRTIELAKRIKQEYHV EPVVHLTCLHYDKAEIDEFARVLTNAGLQNVLALRGDKNPNMREKESFLHASDLIAYMKD KWDFCFLGACYPECHPESEDKVCEIKHLKAKVNAGAEVLLSQLCFDNHAFFRFQEACRIA DIQAPVIPGIMPVINAAQIKRMITMCGATFPVRFQKIIQAYESNKEALFDAGMSYALSQI IDLLVSDVEGIHLYTMNNPLVARRICEGIRNII >gi|229784120|gb|GG667615.1| GENE 73 66360 - 67001 533 213 aa, chain + ## HITS:1 COG:no KEGG:Shel_05210 NR:ns ## KEGG: Shel_05210 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 8 209 1 201 201 137 37.0 3e-31 
MQKKERKLSRKEQQRKKNFDAICDTMIANGYIKYDLTVGVLAANVLAILVMAPFMALALW LFLLTVPNSAATFTAWDGAVMLLMIIALIVIHELIHGLTWGCFAKKRLKSIDFGVIWSML TPYCTCAESLSRWQYIFGGLMPTLVLGFGLMAVACMLHSLFWLALAELMILSGGGDFLIV YKILLHKSRGTEAYYYDHPYECGLVVFEKSGLE >gi|229784120|gb|GG667615.1| GENE 74 67106 - 67312 193 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619910|ref|ZP_06112845.1| ## NR: gi|266619910|ref|ZP_06112845.1| membrane protein [Clostridium hathewayi DSM 13479] membrane protein [Clostridium hathewayi DSM 13479] # 1 68 1 68 68 140 100.0 4e-32 MKDCEHNRNQPNKTPICLGIGISLGFIWGAVLKNIPIGLLIGVCVGAYYAVITKRKISTQ EDDDHNDG >gi|229784120|gb|GG667615.1| GENE 75 67500 - 67709 202 69 aa, chain + ## HITS:1 COG:no KEGG:CD2291 NR:ns ## KEGG: CD2291 # Name: not_defined # Def: putative transcriptional regulator # Organism: C.difficile # Pathway: not_defined # 2 67 9 74 77 83 72.0 2e-15 MRTRAKMTQQQLAVQVHVSVRTIISIEKEQYAPSLMLAYRIAETFGVTIEDLCCLKENKA LEDERREDL >gi|229784120|gb|GG667615.1| GENE 76 67867 - 68112 130 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870082|ref|ZP_06112847.2| ## NR: gi|288870082|ref|ZP_06112847.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 81 61 141 141 145 100.0 9e-34 MSAEDKTNEMDERNQLVLLKNSSRAFQLAKYGNLILAGGCFIAAKVANLESLIFVGSGLM LAFALSLFSDILTFIYYDRKI >gi|229784120|gb|GG667615.1| GENE 77 69161 - 69442 113 93 aa, chain + ## HITS:1 COG:CAC0567 KEGG:ns NR:ns ## COG: CAC0567 COG0500 # Protein_GI_number: 15893857 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 1 91 117 209 209 72 35.0 3e-13 MITAFETVYFWPGPVESFQEVWRVLKPGGTFMIVNESDGTKQADEKWTDIIDGMRIFTQE QLMQYLQDAGFSRIAAHVNRKQHWLCLLAEKAR >gi|229784120|gb|GG667615.1| GENE 78 69510 - 69842 211 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|325264124|ref|ZP_08130856.1| ## NR: gi|325264124|ref|ZP_08130856.1| 
hypothetical protein HMPREF0240_03129 [Clostridium sp. D5] hypothetical protein HMPREF0240_03129 [Clostridium sp. D5] # 1 108 1 108 122 170 98.0 3e-41 MTRGEIWKDFLKHTVLTLVIALFLFLIFKSVFTKNGETEYFYVWLCCGIPFGIRRMFVWL VPHGYDLGGTVGIIAVNFIIGGLIGGVILTWRLVVAVWYIPLTIYRLLTN >gi|229784120|gb|GG667615.1| GENE 79 70215 - 70415 258 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619915|ref|ZP_06112850.1| ## NR: gi|266619915|ref|ZP_06112850.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 66 1 66 66 110 100.0 4e-23 MYFTVEEENLICMYHKSDRRRTSGAIRAALPDMSEDMAALARQTLDKLDAIRDADFEAQQ FHFTDE >gi|229784120|gb|GG667615.1| GENE 80 70419 - 71363 703 314 aa, chain - ## HITS:1 COG:no KEGG:EF2322 NR:ns ## KEGG: EF2322 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 1 314 118 433 434 138 31.0 2e-31 MNKFENVDVLAALEQIMRQNTAFYQDDFEIDKDMLRKAAASDKAEDKTLLWMSRPSGTYC FRERDVFLKDTWQYNTWKFNGEQTRDHVLAYAVELTGKVGGKIRGTLYELDYQQHFRHVI AEAVKADNLILHYEKGDKEQPAAQYFNGSPDPKLGAFLRYEAKPNEPENLREVLRQEQRS REPLAPDDFKSHVAALRDGRIMREARRIVAGMKELSAPNSPNKTHFMVELSPYFVQIASS KDTDRLFSMLPYKSLCFTGMNDRHGLYAVIHKDENREKEVRRPRTSIRRQLSETKQAKAT KKTPARTKKNELEV >gi|229784120|gb|GG667615.1| GENE 81 71356 - 76020 3629 1554 aa, chain - ## HITS:1 COG:no KEGG:CD1105 NR:ns ## KEGG: CD1105 # Name: not_defined # Def: putative DNA primase # Organism: C.difficile # Pathway: not_defined # 239 837 162 775 1343 439 46.0 1e-121 MSYSSYDHDNLETADTMKIERRIYFESGKSDLSELVKLPLAELLSLRAESAAAEQEVFDR LKTQATAWEEQAGKTLLLDKALEYARTLPVTHTANQWEQPDNYRHIRSNMVYQMYYSISE NTRYDRAAQTSVPYSWTLSWSLRTNAPGTYRQARIAGQDRKVFASQEALDKYLNGRIKAH AHYFTEISPAIPKEYADYFKVNGCLLPGYTIEGEEPAKTEELPKQEETTQPPQTITTPER RKPVNEVFSIFLDNRAEAQTGGPHGYWLSLPTSAEQVQEALKEIHITADNQQDLFIGGFS APDGQPLELPEELIQAASLDELNFLAVQIQKLDDVERSQLNAIMQSPEIFQTIGQVIDYA ENTDCFTLIDAKDYRTLGDYYLNHSGLMVIPDAWKPAIDTERLGQFIAKEEQGTFTEYGY LLRAGEEWQRVHEGQPVPEEYRVIAYPAPEVLREEAKAQPDQNQEGEAPHPVTPIILNSQ 
NSADRMKEITDRLETGIQELFESERYKAYLTSMAKFHSYSFNNTLLIAMQGGQLVAGYNK WRDEFHRNVKKGEKAIKILAPAPFKAKKEVPKLDAQGRPVMGQDGKPVTEVQEIQVPAFK IVSVFDVSQTEGEPLPSIGVEELTGSVERYGEFFKALEQTSPVPIGFEDIPGGSHGYYHL TEKRIAIQEGMSELQTLKTAIHEIAHSKLHAIDPEAPAMEQADHPDSRTREVQAESVAYA VCQHYGLDTSDYSFGYVAGWSSGKDLKELKASLETIRATSHELITTIDGHLAQLQKERLA QQEQPQAAEQPEPDSVFSKLPPEQQQEMTDSVKAMLQTLIEADLKSTGEVSQGTKEAAQA QGFTIAGDGTLEQAEVPQEAAYHLENGDYLYIQTSETGYDYTLYASDYKELDGGQLDNPD LSLVEAGKEILAIHERPAETMEPLTGDRLDGFLEATEQANAIPQPQAWNGIDGLLNGKPF MPEASPADRAAALIAVAEQNAPRLGGEERQLIMAYAEAVDDTEKVIGLINRLCEQGYEMQ HGHMDDFMKSQIESEIAVARAEQTIAHDPAAEPIVTILWSESPHLKDGQQMPLYEAEAVF GALDSSKRLEREHPDHAGSWYDKTKFRIDFTMQGQPDNYEGRQDFGDGDGSLIEHIQGYH EYYAQDESWKNHVLHHEGPEAWEADKAQREMLLHEFVPYMKQHCTLSHMEQEAQRLLQSG DSLPPEQTAYLTALVGYVTECRPLLNQGQYQLPEPPQLSDFDKSLQDYKIQVQAEVAQEA ENAGLTVEEYLASGSDAPAQPNFSIYQVPRGPEGRDFRYRSYEGLQEDGLSVDRKNYQLV YTAPLDKNTSLDEIYHRFNMEHPTDYTGRSLSLGDIVVFRQDGKQTAYYVDSVGYREVPE FFKEQGQPLTPDKLETGETVKTPRGTFYVTAMSREQMEAAGYGFHHQSEDGRYLIMANGT RAFAIPAQQESHIKTAEMSTEQNYNMIDGILNNAPSMEELEARAKAGEQVSLSDVAAAVK AERQTEKQAHKNAKANHSTKKPSIRAQLAAAKKEQKKQSPAREKSKDMEVGGRE >gi|229784120|gb|GG667615.1| GENE 82 76135 - 76263 76 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEVPCLRRNLPHHKHADGVTRFLKVRMTLDLKYRHTFKTKGV >gi|229784120|gb|GG667615.1| GENE 83 76487 - 76948 186 153 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1563 NR:ns ## KEGG: Lebu_1563 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 153 99 251 251 142 47.0 4e-33 MTYFIADWKFDSKERKWNVLYTEHPWNDPPKAWPRFENNTQAFRSVLHDIQDLAHRLGFE GFANIFYQAGTILDGGKEYPDKAYGLSLPPLPDNHLRVFEAASRADVFGAMGSWNDSPPW AAHEKGLEQEYETLSAELLKQIRFGLLYAINEW >gi|229784120|gb|GG667615.1| GENE 84 77262 - 77417 108 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619920|ref|ZP_06112855.1| ## NR: gi|266619920|ref|ZP_06112855.1| conserved domain protein [Clostridium hathewayi DSM 13479] conserved domain protein [Clostridium hathewayi DSM 13479] # 1 51 1 51 51 81 100.0 2e-14 
MKQPDKKQRFHGMAAIGMGLGLLFGILWENLPLGLLMGASLAGFGNMTTKR >gi|229784120|gb|GG667615.1| GENE 85 77628 - 79682 1694 684 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 621 13 655 709 449 39.0 1e-126 MARSIAGVIGADKKQDGYFEGNGYLVSWCIGHLVSLADAGAYDERFKKWRYGDLPILPQE WQYIIPDEKKKQFEVLRSLMELPDVDGLVCATDAGREGELIFRFVYQMAGCKKPFKRLWI SSMEDAAILEGFQHLRPGADYDALYQSALCRARADWLVGINATRLFSILYHKTLTVGRVQ TPTLKMLVDRESKIDNFKKEKYHVVHIAAGGMEAASERFSSADDADATKTACARAQAVCV SVKREKKTEQPPRLYDLTTLQREANRLFGFTAKQTLDYAQQLYEKKCLTYPRTDSQYLTD DMQPTAESIVSGLWPLLPFAAELDISPQFGRVLNSKKVSDHHAIIPTMEFVQKGFDGLTE GEKKLLSLVCCKLLCAVAAPHVFEAVTATFTCAGNEFTSKGKTILTPGWKEIERRFRACF KADADEDGPELARELPEITEGQTFDKVEASVTEHFTTPPKPYTEDTLLSAMERAGAEDMP EDAERQGLGTPATRASILEKLVQMGFVERKGKQLLPTKDGHNLACVLPNVLTSPQLTAEW ETKLTAIAKGEADPDSFMADIEEMTRGLIAGYSQISEDAQKLFQTERVVIGKCPRCGEAV YEGKKNYYCGNRACQFVMWKNDRFFEERKKAFTPKIAAALLKSGKVKIKGLHSVKTGKTY DGTVLLADTGGKYVNYRIERRAKN >gi|229784120|gb|GG667615.1| GENE 86 80625 - 81413 799 262 aa, chain - ## HITS:1 COG:no KEGG:CD1107 NR:ns ## KEGG: CD1107 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 235 3 228 244 191 51.0 2e-47 MKNKAIRMFTATLAAVLCMAAFSVTAYAGGVDPDPEPLPETTEAPLPEETEEPTTGGIEM EPEGVPVTPEGNAALVDDFFGDKQLITVTTKAGNYFYILIDRANEDKETSVHFLNQVDDT DLQALLEDGEQEPQACTCTAKCEAGAVNTACPVCKNNLTACAGPEPEPADEEEPAAPEKE SGGMGGLVVFLVVVLAGGGAALYFFKFRKPKADVKGGDDLDEYDFGEDEDDEEEAPEPDD TTDGDAQEPEEDLLMEEETDKE >gi|229784120|gb|GG667615.1| GENE 87 81403 - 81666 388 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619923|ref|ZP_06112858.1| ## NR: gi|266619923|ref|ZP_06112858.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 87 1 87 87 109 100.0 7e-23 MAMNKLERIEKDIEKTKDKIAALQKQLRELEAAKTEQENLQIIQLVRGLHMTPQEFTAFV 
RDGALQVPPAPQPDFEQDEQEETADEE >gi|229784120|gb|GG667615.1| GENE 88 81690 - 83489 1705 599 aa, chain - ## HITS:1 COG:BS_yddH_2 KEGG:ns NR:ns ## COG: BS_yddH_2 COG0791 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus subtilis # 479 598 4 123 124 108 42.0 4e-23 MKPLNPRDKVTQRMTRAGLTLDNQTTGESVNISSREAEPEYTAKPGGTAEKALERAVDIR DRLDAAKSALPKKRVIAKATVYDEAKGKTRSKLHFEKVEKHPPKLKPNPASRPVQEAGLY LHGKIHEVEQENVGVESGHKAEELAERQAGKALRNARRRHKLKPYRAAAKAERKSMAANA EFVYQKSLRDNPELAQAVSNPISRLWQKQHIKREYAKAARAAGRSAAGSAKTTASAARKA AEKGKQAASLVARHWKGALLIGGVGLLLMLIMGGLQSCTAMFGSAGTGLAATSYLSEDSD MLGAEAAYAGMEADLQYELDHYETLHPGHDEYRFELDEIGHDPYVLTSILSALHNGVFTL EEVQGDLAMLFEQQYTLTERVEVEIRYRTVTHTDSDGNEYEEEVPYRYSICYVTLKNADL SHLPVYLMSEKQLSLYAAYMQTLGNRQDLFPSGSYPNASTVKEPTYYEIPPEALKDEAFA AMIAEAEKYVGFPYVWGGSSPSTSFDCSGFISWVVNHSGWNVGRQTAQGLYSLCIPVSPE QARPGDLVFFVGTYDTAGMSHVGLYVGNSVMLHCGNPISYTNLNSSYWQQHFYCYGRLP >gi|229784120|gb|GG667615.1| GENE 89 83521 - 83652 71 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937802|ref|ZP_02085161.1| ## NR: gi|160937802|ref|ZP_02085161.1| hypothetical protein CLOBOL_02694 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02694 [Clostridium bolteae ATCC BAA-613] # 2 43 762 803 803 90 97.0 3e-17 MNSEPGEGLLIYENVVLPFKNPIPKHTQLYQIMTTRLGEGATV >gi|229784120|gb|GG667615.1| GENE 90 84590 - 86869 1872 759 aa, chain - ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 230 753 27 578 617 99 23.0 2e-20 MSKKQRNETSAKAPVKLTRAEKKEIQAIIRRYKGDGKPHSAQESIPYEAMYPDGVCRLTP RTFSKCIEFSDISYQLAQADTKAAIFENLCDLYNYLDASIHIQFSFLNHKIDPRQYAKSL EIRAQGDDFDDIRTEYSAILKDQLVSGNNGLVKRKFLTYTIEADNLKLARARLHRIETDL LGYFKSMGAVAWGLDATARLEVMHRMFHPDGEPFSFDWKWLASSGLSTKDFIAPSSFRFG 
NARMFGLGGKYGAVSFLNILSPELSDEMLADFLNTENGIVVNLHVQAIDQSEAIKTVKRK ITDLDAMKIQEQKRAVRSGYDMDILPSDLATYGQDAKELLKTLQSRNERMFQLTFLVLNT ADTRQALENDVFWAAGVAQKYNCSLVRLDYQQEQGLMSSLPLGASHIQIERSLTTSSVAV FVPFVTQELFQDGEAMYYGVNAKTGNMIMLDRKRARCPNGLKLGTPGSGKSMSCKSEILS VFLCTPDDVYVCDPEAEYYPLVKRLHGQVVKLSPTSKSYVNPLDINLNYSEDESPLALKS DFVLSFCELVMGGKNGLDAIEKTVIDRAVQVIYRPYLADPKPENMPILSDLHKALLDQNI PEADRVAQALDLYVNGSLNVFNHRTNVDIESRIVAFDIKELGKQLKKIGMLIVQDQIWGR VTQNRSQGKATWFFCDEFHLLLREEQTAAFSCEIWKRFRKWGGIPTGATQNVKDLLLSPE IENILENSDFICLLNQASGDRHILAERLNLSPQQLRYVE >gi|229784120|gb|GG667615.1| GENE 91 86772 - 87221 295 149 aa, chain - ## HITS:1 COG:no KEGG:CD1111 NR:ns ## KEGG: CD1111 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 140 1 133 134 150 57.0 1e-35 MAYVTVPKDFTKVKSKVVFGLTKRQLLCFGGALLVGVPLFLLIRGRIPTSAAALLMVFAM LPGFLLALYERHGQPLEVVVRQIVECCFIQPKERPYQTNNAYTALVRQFQMEKEVNTIVQ KAKKRNERKSAGQTHPRRKERDSGHHPQV >gi|229784120|gb|GG667615.1| GENE 92 87228 - 87689 500 153 aa, chain - ## HITS:1 COG:all7280 KEGG:ns NR:ns ## COG: all7280 COG4725 # Protein_GI_number: 17233296 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Transcriptional activator, adenine-specific DNA methyltransferase # Organism: Nostoc sp. 
PCC 7120 # 1 142 39 192 210 86 34.0 1e-17 MGIDALCALPVETLAAKDCLLFLWATFPMLPEALRLIKAWGFTFKTVAFVWLKRNKKSPT WFYGLGHWTRGNAEICLLAKRGHPKRYSRSVHQFIISPIEEHSKKPDITREKIIELAGDL PRAELFARQKIPGWDVWGNEVDSDFSLSARETR >gi|229784120|gb|GG667615.1| GENE 93 87827 - 88690 936 287 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1894 NR:ns ## KEGG: Ethha_1894 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 2 287 4 289 289 344 62.0 2e-93 MIDDLIGEWIKGILIDGIKGNLSGLFSTVNTKVGEIASDVGSTPQDWNGGIFSMLQTLSE TVIVPIAAAILALVMCYELIEMIVEKNNMHDFDTSLFFRWMFKSAFAILIVTNTWNIVMG VFDATQAVVNQSAGIIIGETSIDFDTLIPALETRLEAMDIGPLLGLWFQTLVVGLTMHAL SICIFLVTYGRMIEIYAVTALGPIPLATLGNSEWRGVGQNYLKSLLALGFQAFLIMVVVG IYAVLIQDIGTMEDISGAIWGCMGYTVLLCFCLFKTGSISKAVFTAH >gi|229784120|gb|GG667615.1| GENE 94 88766 - 88981 255 71 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1893 NR:ns ## KEGG: Ethha_1893 # Name: not_defined # Def: putative conjugative transfer protein # Organism: E.harbinense # Pathway: not_defined # 1 71 1 71 71 82 70.0 8e-15 MEFFNSAVDTLQTIVVGLGGALCVWGGVNLLEGYGADNPASKSQGIKQLVAGGGVALIGM TLVPLLSGLLG >gi|229784120|gb|GG667615.1| GENE 95 89046 - 90386 1230 446 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 17 399 188 562 591 176 32.0 8e-44 MPTSYLMQCVSKDYPASFIVTDPKGGLIGEVGQLLVRCGYRVKVLNTINFSKSMRYNPFR YIHSEKDILKLVNTLICNTKGEGEKSTEDFWVKSERLLYSALIGYIWYEAPDDEMNFTTL LEMINASEAREDDPEFQSPVDQMFERLEEKDPEHFAVRQYKKFLLSAGKTRSSILISCGA RLAPFDIREVRDLMEDDELELDTIGDEKTALFLIMSDTDTTFNFILAMVQSQLINLLCDR ADDKYGGRLPVHVRMILDEFANIGQIPNFDKLIATIRSREISASIILQSQSQLKAIYKDA AEIISDNCDCTLFLSGRGKNAKEIADVLGKETIDSYNQSENRGAQTSHGLNYQKLGKELM SQDEIATMDGGKCILQVRGVRPFFSEKYDITRHPRYKYLSDADKQNTFDVDRYLSSLRRK KRRVVSEDEPFDLYDIDLSDEDLAAK Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:41:29 2011 
Seq name: gi|229784119|gb|GG667616.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld9, whole genome shotgun sequence Length of sequence - 67802 bp Number of predicted genes - 74, with homology - 71 Number of transcription units - 30, operones - 19 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 12 - 533 438 ## HM1_0587 type II/IV secretion system protein 2 1 Op 2 . + CDS 534 - 1466 756 ## DSY0057 hypothetical protein 3 1 Op 3 . + CDS 1463 - 2341 851 ## LM5578_1887 hypothetical protein 4 1 Op 4 . + CDS 2397 - 2618 199 ## gi|266619935|ref|ZP_06112870.1| conserved hypothetical protein + Term 2629 - 2671 7.5 + Prom 2622 - 2681 4.8 5 2 Op 1 . + CDS 2706 - 3044 173 ## gi|266619936|ref|ZP_06112871.1| conserved hypothetical protein 6 2 Op 2 . + CDS 3115 - 3525 283 ## Pecwa_0684 hypothetical protein 7 2 Op 3 . + CDS 3541 - 4119 538 ## gi|266619938|ref|ZP_06112873.1| flagellar-specific RNA polymerase sigma factor FliA 8 2 Op 4 . + CDS 4175 - 4870 658 ## gi|288870089|ref|ZP_06112874.2| conserved hypothetical protein 9 2 Op 5 . + CDS 4876 - 5343 221 ## gi|266619940|ref|ZP_06112875.1| hypothetical protein CLOSTHATH_01008 10 2 Op 6 . + CDS 5389 - 6024 449 ## gi|266619941|ref|ZP_06112876.1| conserved hypothetical protein + Prom 6027 - 6086 8.7 11 3 Op 1 . + CDS 6114 - 6713 356 ## Apre_0679 hypothetical protein 12 3 Op 2 . + CDS 6710 - 7549 740 ## Apre_0680 hypothetical protein + Prom 7583 - 7642 10.7 13 4 Op 1 . + CDS 7683 - 7916 109 ## 14 4 Op 2 . + CDS 7813 - 8256 307 ## gi|266619944|ref|ZP_06112879.1| single-strand binding protein + Prom 8287 - 8346 4.7 15 5 Op 1 . + CDS 8375 - 9190 469 ## gi|288870090|ref|ZP_06112880.2| conserved hypothetical protein 16 5 Op 2 . + CDS 9223 - 10293 901 ## AM1_A0346 hypothetical protein 17 5 Op 3 . + CDS 10325 - 11164 606 ## ELI_3222 hypothetical protein 18 5 Op 4 . + CDS 11154 - 11627 249 ## gi|266619948|ref|ZP_06112883.1| hypothetical protein CLOSTHATH_01016 19 6 Op 1 . 
+ CDS 12893 - 13465 210 ## gi|288870091|ref|ZP_06112885.2| conserved hypothetical protein 20 6 Op 2 . + CDS 13489 - 14370 647 ## Bsph_p069 hypothetical protein 21 6 Op 3 . + CDS 14383 - 17424 1644 ## Bsph_p071 putative helicase 22 6 Op 4 . + CDS 17440 - 17913 281 ## gi|266619953|ref|ZP_06112888.1| hypothetical protein CLOSTHATH_01021 23 6 Op 5 . + CDS 17925 - 18047 128 ## gi|266619954|ref|ZP_06112889.1| hypothetical protein CLOSTHATH_01022 24 6 Op 6 . + CDS 18034 - 18270 154 ## gi|266619955|ref|ZP_06112890.1| conserved hypothetical protein 25 6 Op 7 . + CDS 18316 - 19173 238 ## Micau_6120 hypothetical protein 26 6 Op 8 . + CDS 19198 - 19707 305 ## gi|266619957|ref|ZP_06112892.1| conserved hypothetical protein + Term 19785 - 19830 -0.9 + Prom 19774 - 19833 3.2 27 7 Op 1 . + CDS 19878 - 20081 225 ## gi|266619958|ref|ZP_06112893.1| conserved hypothetical protein 28 7 Op 2 . + CDS 20091 - 21278 516 ## gi|266619959|ref|ZP_06112894.1| putative sarcolemmal membrane-associated protein 29 7 Op 3 . + CDS 21297 - 21662 303 ## gi|266619960|ref|ZP_06112895.1| conserved hypothetical protein 30 7 Op 4 . + CDS 21712 - 22644 491 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes 31 7 Op 5 . + CDS 22673 - 23113 196 ## gi|266619962|ref|ZP_06112897.1| hypothetical protein CLOSTHATH_01030 + Term 23158 - 23200 1.4 - Term 23146 - 23188 1.4 32 8 Tu 1 . - CDS 23216 - 23425 191 ## gi|266619963|ref|ZP_06112898.1| conserved hypothetical protein - Prom 23445 - 23504 3.9 + Prom 23951 - 24010 2.5 33 9 Tu 1 . + CDS 24118 - 24426 309 ## Closa_0630 XRE family transcriptional regulator 34 10 Tu 1 . - CDS 24440 - 24868 126 ## COG4474 Uncharacterized protein conserved in bacteria - Prom 24996 - 25055 8.8 + Prom 24943 - 25002 7.2 35 11 Tu 1 . + CDS 25098 - 25439 290 ## Closa_1119 XRE family transcriptional regulator + Term 25515 - 25550 -0.3 - Term 25409 - 25454 9.5 36 12 Op 1 . 
- CDS 25458 - 25946 221 ## Closa_1118 hypothetical protein 37 12 Op 2 . - CDS 25952 - 26866 678 ## Closa_1117 hypothetical protein - Prom 26893 - 26952 3.6 38 13 Op 1 . - CDS 26960 - 27376 369 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 39 13 Op 2 . - CDS 27429 - 27830 332 ## SUB1069 membrane protein - Prom 27874 - 27933 1.9 40 14 Op 1 . - CDS 27935 - 28744 487 ## COG2035 Predicted membrane protein 41 14 Op 2 . - CDS 28762 - 29586 588 ## COG1968 Uncharacterized bacitracin resistance protein 42 14 Op 3 . - CDS 29591 - 29740 76 ## gi|266619973|ref|ZP_06112908.1| hypothetical protein CLOSTHATH_01041 - Prom 29844 - 29903 5.1 + Prom 29898 - 29957 8.3 43 15 Op 1 . + CDS 30138 - 31397 1106 ## COG4536 Putative Mg2+ and Co2+ transporter CorB + Prom 31425 - 31484 5.3 44 15 Op 2 . + CDS 31506 - 32288 516 ## COG1396 Predicted transcriptional regulators + Term 32360 - 32408 8.3 45 16 Op 1 . - CDS 32466 - 34646 1920 ## COG0342 Preprotein translocase subunit SecD 46 16 Op 2 . - CDS 34643 - 36307 1443 ## COG1283 Na+/phosphate symporter 47 16 Op 3 . - CDS 36307 - 37395 979 ## COG0628 Predicted permease 48 16 Op 4 . - CDS 37418 - 40081 2169 ## COG0474 Cation transport ATPase - Prom 40120 - 40179 8.0 49 17 Tu 1 . + CDS 40802 - 42334 454 ## COG2199 FOG: GGDEF domain - Term 42304 - 42352 6.5 50 18 Op 1 . - CDS 42353 - 44365 780 ## CLK_A0325 hypothetical protein 51 18 Op 2 . - CDS 44391 - 45146 673 ## Closa_1106 hypothetical protein 52 18 Op 3 . - CDS 45200 - 46132 414 ## Closa_0805 hypothetical protein - Prom 46160 - 46219 3.0 53 19 Tu 1 . - CDS 46241 - 47266 490 ## Closa_1104 hypothetical protein - Prom 47346 - 47405 2.7 54 20 Op 1 . - CDS 48074 - 49117 599 ## Closa_1103 hypothetical protein 55 20 Op 2 . - CDS 49186 - 51465 1456 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 56 20 Op 3 . 
- CDS 51462 - 51677 139 ## gi|288870098|ref|ZP_06409627.1| putative non-ribosomal peptide synthetase 57 20 Op 4 . - CDS 51685 - 52752 587 ## COG5377 Phage-related protein, predicted endonuclease - Term 53079 - 53114 -0.1 58 21 Op 1 . - CDS 53332 - 53502 109 ## COG1476 Predicted transcriptional regulators 59 21 Op 2 . - CDS 53525 - 53815 145 ## gi|266619991|ref|ZP_06112926.1| integral membrane protein - Prom 53869 - 53928 13.1 - Term 53876 - 53925 7.2 60 22 Tu 1 . - CDS 53952 - 54035 88 ## - Prom 54081 - 54140 3.8 + Prom 53925 - 53984 8.4 61 23 Op 1 9/0.000 + CDS 54025 - 55746 765 ## COG3275 Putative regulator of cell autolysis 62 23 Op 2 . + CDS 55788 - 56510 596 ## COG3279 Response regulator of the LytR/AlgR family + Prom 56581 - 56640 9.9 63 24 Tu 1 . + CDS 56672 - 58111 1228 ## COG1966 Carbon starvation protein, predicted membrane protein + Term 58152 - 58212 8.0 64 25 Op 1 17/0.000 - CDS 58242 - 58910 787 ## COG0569 K+ transport systems, NAD-binding component 65 25 Op 2 . - CDS 58921 - 60171 1072 ## COG0168 Trk-type K+ transport systems, membrane components 66 25 Op 3 . - CDS 60083 - 60262 69 ## - Prom 60406 - 60465 8.0 + Prom 60928 - 60987 2.7 67 26 Op 1 . + CDS 61168 - 61611 263 ## gi|288870102|ref|ZP_06409629.1| putative heat shock sigma factor 68 26 Op 2 . + CDS 61568 - 61750 231 ## gi|288870103|ref|ZP_06409630.1| conserved hypothetical protein + Term 61839 - 61881 3.4 - Term 61746 - 61796 1.2 69 27 Tu 1 . - CDS 61822 - 62373 371 ## COG1309 Transcriptional regulator - Prom 62444 - 62503 5.5 70 28 Tu 1 . + CDS 62439 - 63425 167 ## COG2267 Lysophospholipase - Term 63392 - 63446 1.1 71 29 Op 1 . - CDS 63510 - 64031 339 ## CDR20291_1471 hypothetical protein 72 29 Op 2 . - CDS 64090 - 64281 206 ## gi|225386648|ref|ZP_03756412.1| hypothetical protein CLOSTASPAR_00396 73 29 Op 3 . - CDS 64357 - 65130 265 ## COG0739 Membrane proteins related to metalloendopeptidases - Prom 65161 - 65220 5.5 - Term 65273 - 65318 -0.9 74 30 Tu 1 . 
- CDS 65536 - 67764 1043 ## COG0515 Serine/threonine protein kinase Predicted protein(s) >gi|229784119|gb|GG667616.1| GENE 1 12 - 533 438 173 aa, chain + ## HITS:1 COG:no KEGG:HM1_0587 NR:ns ## KEGG: HM1_0587 # Name: not_defined # Def: type II/IV secretion system protein # Organism: H.modesticaldum # Pathway: not_defined # 2 168 313 477 483 175 52.0 7e-43 MNINQDFLLERVLRKHPDVIGVGEMRSAAESLSAAESSRTGHTVCTTIHSNSCNSTYRRM MTLAKRKYNMDDSVLMQIMVEAYPVVVFTKQLEDRSRKIMEIIEGEDYQDGKLIYRSLYK YEVVDNVTLESGETRVVGHHRKVGSISDGLKKRLLDNGISHKELEEFTGKEVD >gi|229784119|gb|GG667616.1| GENE 2 534 - 1466 756 310 aa, chain + ## HITS:1 COG:no KEGG:DSY0057 NR:ns ## KEGG: DSY0057 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 305 1 307 309 283 46.0 5e-75 MHWIYLICFVLINAGLMALLGVRPGDFIDAVFLSRRRAATLSDELNVLLGTPSKGFFNQD YEIKRILRDTGRADRFEALKRLTLILFAVGAVLALLIDNVYMVPIMGIGFSLLPVWYLHS TAASYKKHLNEELETAISIITTSYLRTEDIIRSIQENLPYINEPVKANFEAFVYEARLIN ANTTSAINSLKMKIPNRVFHEWCNILIQCQSDRSMKNTLPTIAQKFSDVRIVQSELEAMM QGPRREAITMMFLVIANIPLLYFLNKDWFHTLLFTTPGKIALAICGAIILFSLTQIMKLS KPIEYGGDGK >gi|229784119|gb|GG667616.1| GENE 3 1463 - 2341 851 292 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1887 NR:ns ## KEGG: LM5578_1887 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 291 1 290 290 217 39.0 4e-55 MTVLALLAVVLAGFSIYYLACAFADIPTSRTSRTMLMVRKQQGGRNEKLAEVYLTKIAVL LAPCLKLDRLRRNKLQSALNIAGLELTPEVYTAKAWVTAGMTGLCALPAVFVMPLMVPLI IGLAVALWFSTYYAAFDFVRKRRKLIEAEIPRFALTIGQNLENDRDVLKILTSYRRVAGR DFGAELDQTIADMITSNYENALLHFETRIGSPMLSDVIRGLVGVLRGDDQRMYFKMICFD MRQIEQNNLKKEAAKRPKKIQRYSMMMLICIMIIYLVVLCTEVLSSLGAFFG >gi|229784119|gb|GG667616.1| GENE 4 2397 - 2618 199 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619935|ref|ZP_06112870.1| ## NR: gi|266619935|ref|ZP_06112870.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 73 1 73 73 97 100.0 3e-19 
MKRMILKLNRTAPRLMLNTRAALSNQRGDFYISDAVKIIIAVVLGALLLAALTLIFNDTV IPRITREIEALFG >gi|229784119|gb|GG667616.1| GENE 5 2706 - 3044 173 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619936|ref|ZP_06112871.1| ## NR: gi|266619936|ref|ZP_06112871.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 112 21 132 132 243 99.0 3e-63 MNQNHCKFYSFMRATRGNWRNTLYVMCGGSFCPHGKNPPCEGYLMAADADGSLIIMPVEI VRQFTGELIEPSECRGTLVRKTFEAVFSRYIEWHTISDSSCPLLQLYQDADR >gi|229784119|gb|GG667616.1| GENE 6 3115 - 3525 283 136 aa, chain + ## HITS:1 COG:no KEGG:Pecwa_0684 NR:ns ## KEGG: Pecwa_0684 # Name: not_defined # Def: hypothetical protein # Organism: P.wasabiae # Pathway: not_defined # 6 123 116 230 240 79 33.0 6e-14 MKRKHQIRCPYCGAYATCRPASVVYGKSGTTKNSYLYICSRWPACDSYVGTHRKDRRPLG TPANKELRRKRILAHRSLDALQQNCHMKKWEAYIWLQAKLGLSEAETHIGMFSEYMCDRT IELCNQALETDHIRAA >gi|229784119|gb|GG667616.1| GENE 7 3541 - 4119 538 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619938|ref|ZP_06112873.1| ## NR: gi|266619938|ref|ZP_06112873.1| flagellar-specific RNA polymerase sigma factor FliA [Clostridium hathewayi DSM 13479] flagellar-specific RNA polymerase sigma factor FliA [Clostridium hathewayi DSM 13479] # 1 192 1 192 192 390 100.0 1e-107 MRKDGILTEEQRCLVTEHMSVVHWVILLNIHVNERIYGFSYDDLFQEGCVWLCRAAVSYD PSLSQFSTYARKVVRNGLLSYCRTMCDKQRHFTRLEIGEQGELIADGAVLNRVDGFSAHI SMLETLELLESRKQDYNGIARLGIEALELKIKGLTIKEIAQMYGVPSSHVGAWISRSAYK LREDRKFLDSVR >gi|229784119|gb|GG667616.1| GENE 8 4175 - 4870 658 231 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870089|ref|ZP_06112874.2| ## NR: gi|288870089|ref|ZP_06112874.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 231 23 253 253 453 100.0 1e-126 MTLKTLFTAVGRMEKRTDSPGRSYPVIVLGGKDYMVDLQEMLVWSSLNWRIARKNEIGEL YNKAVLDYSFQADRTWNACVERLLIRGLLVSGSGATDYDALYDLLSTMYIIPASGSFPLR LFSFLKLVLVKDVPLSMARRLLCGDKRTAGEKQVMRLAGQTLMSTAEIIKCVEKGIHSLP 
NEESVLDKLYDDRDTTSDNIAYLVKSSSNSKPVVLAVANLYLRKQIIFERI >gi|229784119|gb|GG667616.1| GENE 9 4876 - 5343 221 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619940|ref|ZP_06112875.1| ## NR: gi|266619940|ref|ZP_06112875.1| hypothetical protein CLOSTHATH_01008 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01008 [Clostridium hathewayi DSM 13479] # 1 155 1 155 155 302 100.0 5e-81 MKKWNKYPRVLKRKLYLTLLTGTAGVIVSFMVYMVSADSMLLSMSGVIFLFCLFRSVSLW NIIRRDAYAAISGTCTGITVSPLRKYKRVQITDLSGREHSLLFSSHTAVQTGEKYCFYFQ NENSPVLGNDYLDTMLSTNTFLGYEKLLEYHAEIK >gi|229784119|gb|GG667616.1| GENE 10 5389 - 6024 449 211 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619941|ref|ZP_06112876.1| ## NR: gi|266619941|ref|ZP_06112876.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 211 1 211 211 424 100.0 1e-117 MYKPGTIEQYKIYSYLKEKFYLEEFLLYPISRTSMLLEDKAGDKLAFEYQNGSVREIDLP SPPDIEEVKAFLKSFHALEPKPCLNDFESITLWWLEHPNPLTLQQALGLTDDLYRHFLVY PLIDDETVRRIVSKGLVTEKEFLDIRLWYRNGHVMTCWLGQLGLDGTGNIYGLIFRYRKP NATKYEFYLLDDYYRFMNHIPDLASEDAAIN >gi|229784119|gb|GG667616.1| GENE 11 6114 - 6713 356 199 aa, chain + ## HITS:1 COG:no KEGG:Apre_0679 NR:ns ## KEGG: Apre_0679 # Name: not_defined # Def: hypothetical protein # Organism: A.prevotii # Pathway: not_defined # 16 199 11 195 195 165 46.0 1e-39 MDDIWLKVKELAADGNGIVSTKQVERLGISRAVLKKYVEDNRLVRIRKGLYTLHGDLPDE YVALQIRSAKAIFSYGTALYFWGLSDRTPHFIDMTVPQGTNISTIKRDYPQARFHYVMSA MYSIGITETTSPQGGLIRLYDRERCLCDIIRDKKNVDMQLYTQALKDYFRSGSDCRKLLK YGKKFGIEEQIRTYMEVLS >gi|229784119|gb|GG667616.1| GENE 12 6710 - 7549 740 279 aa, chain + ## HITS:1 COG:no KEGG:Apre_0680 NR:ns ## KEGG: Apre_0680 # Name: not_defined # Def: hypothetical protein # Organism: A.prevotii # Pathway: not_defined # 5 275 5 275 278 283 53.0 8e-75 MKTPEQLKGTIRNMAAQKNLRPQEVLQMFLFERVLERLAGSRYRKNFILKGGLLISSMLG IEERTTIDMDTTVRGIRMEEPEITSVIKEILSIDVGDGIDFSFRKIEPIREDDTYSNFRV HIDARYGKINSPMKLDITTGDEITPAAIQYDYPLLFEHKTVPVMAYTLETILAEKYETII 
RRNIGTTRARDFYDLHTLYRERYSEIRPDILRMAVAHTAKKRGSASALADWEEILQDIRE EPMLVSFWHKYTAENPYIGKLLFSEVLDTVERIGRLLQE >gi|229784119|gb|GG667616.1| GENE 13 7683 - 7916 109 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLHIAEEARIAHVVKFVIKCQWYYRAILRPVLSAGTFYIDIKAFGRKECYGTGFRIWLCD CRFAAKAEPKANPLCLF >gi|229784119|gb|GG667616.1| GENE 14 7813 - 8256 307 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619944|ref|ZP_06112879.1| ## NR: gi|266619944|ref|ZP_06112879.1| single-strand binding protein [Clostridium hathewayi DSM 13479] single-strand binding protein [Clostridium hathewayi DSM 13479] # 7 147 1 141 141 291 100.0 8e-78 MAERSVMAQVFVFGYVTADLQLKQSQKQTPYVCFDLAEHIGFGNGQRTIYYQVWTWGEDA LRLIRLKIKKGSMIWISGSQELVDCTSQNGDVKTKRLKVYLDNCGFLPDWQSKDQKLSLP DCQETAAPANSLPPLEILDGERETLPE >gi|229784119|gb|GG667616.1| GENE 15 8375 - 9190 469 271 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870090|ref|ZP_06112880.2| ## NR: gi|288870090|ref|ZP_06112880.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 271 5 275 275 555 100.0 1e-156 MKKDKALWGSISILVGIVIAILALVRGDMQTWLLLGVFSLWGSWIIVILLLPYMRQAKRR QQRKYRLKRLYADGIYTPAPQAKEKKEGAPMEHLLLRHVNHRISSYLRAAYPDVTWSWCE KEPERLAVYGGIGRIRVYGIPDFDHADVTVDQQANIDCSMVKIVPLSHIGAAPDEEHKAP PNRQPVDPQIWYEVQGRKILETLMADLNSRGHNSLTLYENGNICIRQDNEEVAQEQLPAF PERTYWPRLVKVFERNGLAAKITDTGIVVTW >gi|229784119|gb|GG667616.1| GENE 16 9223 - 10293 901 356 aa, chain + ## HITS:1 COG:no KEGG:AM1_A0346 NR:ns ## KEGG: AM1_A0346 # Name: not_defined # Def: hypothetical protein # Organism: A.marina # Pathway: not_defined # 1 351 1 347 351 298 44.0 2e-79 MKEGISLQEMAAEITRQNQLKEDYLVDTRSLKLEPSGNQLYLHMYDNQAEALEPFEINSI AHRQVGSYLKIPADYYDRMRTDYPELLAENVNSWFNREPAKRMLRTIGGTARAFLSNRYR RIDNMEIAKVVLPIIGQMEGAHFESCQITESRMYMKIVNTRLEAEVVPGDIVQAGLIISN SEVGQGSVNIQPLVYRLVCKNGMVVNDARTRQYHTGRINTLEENFQLYSEETLAAEDHAF VKKIEDTVKAAVEEARFSQVIDIMRNATTAKMNTVDVPEIVRLTSRDFHITEDESAGVLQ HLIEGNDLTLYGLSNAVTRHSQNVESYDRATTLESIGYKILTMPPRQWNKINQMAA 
>gi|229784119|gb|GG667616.1| GENE 17 10325 - 11164 606 279 aa, chain + ## HITS:1 COG:no KEGG:ELI_3222 NR:ns ## KEGG: ELI_3222 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 25 235 22 225 251 170 42.0 7e-41 MYNQNTTALMENEKFLLPAMVESEFSTEELAEDMDGLQMSFQRVKIPAGGTIQFELPSDA PENPDYAKTLVGVILHNHATCAYWPEGSEYDDNTAPLCSSVDGKQGIGDPGGACAACALN RFGTDSNGKAKACKNMRILYLLRSGEYMPIQLTLPPTSISPFREFLNQSFAIRRRATFGS VVQIGLKKMNNGTNDYSVATFRRLYDFTGEELVQIRAYADGFKEQVKGILQQRAASTEEL HDDGCDYLETGTLPGDKTGIFPFRPRLTGNGKPYPLNGF >gi|229784119|gb|GG667616.1| GENE 18 11154 - 11627 249 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619948|ref|ZP_06112883.1| ## NR: gi|266619948|ref|ZP_06112883.1| hypothetical protein CLOSTHATH_01016 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01016 [Clostridium hathewayi DSM 13479] # 1 155 1 155 155 317 100.0 2e-85 MVFNSIETINTSAKDISGFGPGVSSFLGGAMKIEIPLSDIQKDFAEENHGLVYAFLNAHS LDEEEFYDVVIFGYLRAVRRYFTEAGLKKYKFGTIAWNCMRVDLLNYYKANRSQKRNAEV VSLHVSLSPDGPPLEHSIPSRNEMMEQLEVKLLLHAS >gi|229784119|gb|GG667616.1| GENE 19 12893 - 13465 210 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870091|ref|ZP_06112885.2| ## NR: gi|288870091|ref|ZP_06112885.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 190 1 190 190 382 100.0 1e-105 MLESLDAYRLANYLRENGDSVTCIDDVYSQYSLPAPLTSDGFPVYLVRTVGPDDACIFLK DGFCSIYPARPRTCRIYPFSVAPGQRGRDFEYFLCTDKTHHLTGGRVLVNDWFYKNFKRE DRDFIKLEYQCAEEIVRFMRQLRPSVRETALFSLLYFRYYNYDLQKPFLPQYEKNQKLLM SELHKTLQTN >gi|229784119|gb|GG667616.1| GENE 20 13489 - 14370 647 293 aa, chain + ## HITS:1 COG:no KEGG:Bsph_p069 NR:ns ## KEGG: Bsph_p069 # Name: not_defined # Def: hypothetical protein # Organism: L.sphaericus # Pathway: not_defined # 1 289 22 288 289 63 22.0 8e-09 MNILAKFETVTIKAAIRLAEADRRFCEAHENAYKNALACFQEMEYIWEDICSRQTDILSE VDSHSPYLSTSGGLDISAEQIREHMVGLHSRFISNIVTHFSRTYHVSIDQNTIEDKLLPQ 
KASESRWTKPEQDEGDMSAWSLLYEDIVKLIFEQMNGRDFGEQALYELKDKCHNAAWSYL NKTAEFEIKKCVLRFACGCSYKNWYGSGEWELYGSLKEILKGIAHFETGRFALLPHGFDS LLGYRHSGSDTVEFSTCSKVSQLKMFKNGRVDIKFASEAIARQFADDYLGRVY >gi|229784119|gb|GG667616.1| GENE 21 14383 - 17424 1644 1013 aa, chain + ## HITS:1 COG:no KEGG:Bsph_p071 NR:ns ## KEGG: Bsph_p071 # Name: not_defined # Def: putative helicase # Organism: L.sphaericus # Pathway: not_defined # 1 1010 1 995 998 827 42.0 0 MKYPYEANPVSFNQRKTLNEKVLYLINHGESDSAGITAADIFNAYTGDGGLHNLQRGDFK NYHEYSDAKKEIENGQFFTPPPICDLVVSCLKPSASDLIADLTCGMGNFFNFLPAESNAY GCEIDHKAYKVAHYLYPKANIELGDIRTYESDIRFDFVIGNPPFHLKWYLEDGSEMLSQI YYCVKAAELLKPFGIMALIVPQSFLADTFTDGRLIQAMENRYSFLGQVALPDNAFLSMGA GRFPTKLQFWQRRSGLEGWVPLPYTTKPGLSLTAGFHIQSEAKRIYDCFLEKARSDLEVH KSHILLELAKSRDTSAEFQYKIKKLLYHIKIHPETKSRYTKCCEYLYRFCTERKPDEMTY EEWCRVRITEKKVLSYLRKTLKRQNLPPERDVIALVKQNYHFAYKGYSAKIRRQMDYDLR VKTPIYQAVLENSPERYPGYARLLRGKRRDYENQNKQFTDMKEDPNIARWLSGFVLWDAE NNEAIHLNELQRHDLNLVLQKRNSLLQWEQGSGKTLAGIAVGQYRMDCQAVRYTWVVSSA ISIRNNWDVVLRNYGISYVFVEKLKDLEKIRSGDFVIITLNKISQYQKQISKWIKRRRQK IQLVLDESDEISNANSQRTKAVLACFRRCRMKLLTTGTSTRNNISEFAPQLELLYNNSIH MLSWCSKLYRHKGKDSHLESHNNPYYRQPIPAYKKGYSLFSASHLPDKITVFGVEQKTQD IYNADVLSEILGKTVITRTFEEVCGKNIKQLHQVPVLFPPEEKEVYEKAMDEFLVMRNDY FNLTGNSRKDSMMKLIQQIVLLLRISAAPDTVKEYGGDTPVKVMAAVELAAGWENEIVAI GVRHKVVLDSYARAIRKYLPNRPLYIVTGSNTTFAKRRALRQTLRESKNGILLCTQQSLA SSVNFEFVNKIIIPELHYNNSGMSQFYFRFIRYTSTEYKDVYFITYAGSIESNLMQMILV KEKLNLFMKGQDTNLDEIYERFDVDYDLLSLLMYQEEDDKGRLRIRWGKQNIA >gi|229784119|gb|GG667616.1| GENE 22 17440 - 17913 281 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619953|ref|ZP_06112888.1| ## NR: gi|266619953|ref|ZP_06112888.1| hypothetical protein CLOSTHATH_01021 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01021 [Clostridium hathewayi DSM 13479] # 1 157 1 157 157 317 100.0 3e-85 MYQYIYQISTKPIERNEWTGLHFFEDQPLPAADSITAAINRAKAVSVLGDWLQQHCLGRL SGEAFRLNTNVRTIYFKKRFQQFKNALGDMQMVSEDQFINHYYEVENLILRLNESFCNQY DTYFILDRNAPIPLDQFIRIARTSTPYYVGAVLDYHY >gi|229784119|gb|GG667616.1| GENE 
23 17925 - 18047 128 40 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619954|ref|ZP_06112889.1| ## NR: gi|266619954|ref|ZP_06112889.1| hypothetical protein CLOSTHATH_01022 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01022 [Clostridium hathewayi DSM 13479] # 1 40 1 40 40 71 100.0 2e-11 MNFYETPAGKMFKSGFCTATQLIFAGLLEPAGKESENEDT >gi|229784119|gb|GG667616.1| GENE 24 18034 - 18270 154 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619955|ref|ZP_06112890.1| ## NR: gi|266619955|ref|ZP_06112890.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 78 1 78 78 150 100.0 3e-35 MKILKFCRHKSGLWEGVIFENNSGKHYITNGIGVWEESEKRLEGLDIVHAIDIPRLCHCL EQHHCQEDLLRQLLERSA >gi|229784119|gb|GG667616.1| GENE 25 18316 - 19173 238 285 aa, chain + ## HITS:1 COG:no KEGG:Micau_6120 NR:ns ## KEGG: Micau_6120 # Name: not_defined # Def: hypothetical protein # Organism: M.aurantiaca # Pathway: not_defined # 10 257 7 247 288 106 29.0 1e-21 MDSDRTKQLKILCCGAGMQSTALALMSCENAKNGIQYPLVPIYDAIIFCDLCSEAPWVYR QVAFIADACKECGISFYVLTTDLYGVFMKNFGNGHVTSIPFWSIGADGKKAKMRRHCTID FKIIKIQQFVRYELLGYRWREKLREADRGAHEMHIGFSFEEKQRIFDSYHPMFVNRFPLV EMGLTRADNYRYCLDVWGLDTKASACIFCPFHRNYFFWHLKRHHPKRFQDLVAFDQTLER EQPNTMIRSKLFISRSRKRIRNLTDTECNDAETFDYRGKPIWNGF >gi|229784119|gb|GG667616.1| GENE 26 19198 - 19707 305 169 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619957|ref|ZP_06112892.1| ## NR: gi|266619957|ref|ZP_06112892.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 169 1 169 169 319 100.0 4e-86 MDITELYQSLSALLDKLKRNRELVSLSELKTKYRKPYTALCEDIRAATSAYVNKAVLQGI RIRKEYSKEAIAIIETSIKQSGLLRQISAAAFKRQDIKEIEQLSHTLNKQIAAALEPFYY QHLGLYLTEDCFADPPRTPDIYSDVNGCILRDGKWIPEEILYGKQNSAA >gi|229784119|gb|GG667616.1| GENE 27 19878 - 20081 225 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619958|ref|ZP_06112893.1| ## NR: gi|266619958|ref|ZP_06112893.1| conserved 
hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 67 1 67 67 124 100.0 2e-27 MKKLKWLDETCNSCNKQINSWDKRISKVLSYKYPCCEACIAKEYDMDIDALRNRMEHYLG IRPCLGL >gi|229784119|gb|GG667616.1| GENE 28 20091 - 21278 516 395 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619959|ref|ZP_06112894.1| ## NR: gi|266619959|ref|ZP_06112894.1| putative sarcolemmal membrane-associated protein [Clostridium hathewayi DSM 13479] putative sarcolemmal membrane-associated protein [Clostridium hathewayi DSM 13479] # 1 395 1 395 395 813 100.0 0 MAEYNLLTQRLLSEGYSVDHYPNYVQIQDSTLPGGDPLNNLGGGFIFKRAVANDCTYKTG CGKYVLGKNVNSDMSYMGILWCHENDNPVIRCPYDIPDCPDNDPLLHGTHGGGLCIMCHC VCHRTEESYDYENSIEKADDERQAEKDKKYEEYSKAHNGRVCLNHMYFDERTREWRLNYE PQRCARICYSQNGWCPVLDRRLSRKKANVYYDLKTSHIRKDGTLFDGEVIVHIEKGIRYF ERPVCMDICEAFIRQNGKEIIWNKYKWNTYTTIKLFDPTFQAEILNVRAESRPSRDLMQD LADIQEGINISHSSDLVKCQKEAKRERRQRARTKRIEKLEAKILKNGYGSLENYSLDKIH ADKWLTPERIEELEKLRLQRIKEEQEQPVQMNLFD >gi|229784119|gb|GG667616.1| GENE 29 21297 - 21662 303 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619960|ref|ZP_06112895.1| ## NR: gi|266619960|ref|ZP_06112895.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 121 1 121 121 251 100.0 1e-65 MANDFITHQLYAPQVTADPFHKDRLFSKQLGMVKAFWQDLICADNFRAAKWNQGGLTYML YHSTRETDAFVIGVIGSDGLPLSHHSYLKSEVGTGKYYADILSNLPYARKDGALEVMVVK S >gi|229784119|gb|GG667616.1| GENE 30 21712 - 22644 491 310 aa, chain + ## HITS:1 COG:MK0621_2 KEGG:ns NR:ns ## COG: MK0621_2 COG0175 # Protein_GI_number: 20094059 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Methanopyrus kandleri AV19 # 34 239 47 260 329 92 31.0 1e-18 MRGQIDLPVYGDYVGQVIKELRVIAGCHTMCLRFSGGKDSVVLKWLFDQAGIPYQARFSR 
TSVDPPELLRFIRKYHSDVIEEPPRISMFHLIIKKGFPPTRVCRYCCSEYKERNTCPKEG HVLTLTGVRKAESVKRRNRSRVETCQAYKGVTFFHPILDWTDEQLWNVIDDNHIPYCELY DEGFSRIGCVGCPLTSSKNMLREFRRWPNFEKAYLWAFERMLEGRQFDKWKTKYDVMDWY MYGVQEQYKKLDEENDKLWKQNTFRAYDNELYYGDYFDNLTHEYESRDVNAAIQILHSSQ KKKEVTRIAC >gi|229784119|gb|GG667616.1| GENE 31 22673 - 23113 196 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266619962|ref|ZP_06112897.1| ## NR: gi|266619962|ref|ZP_06112897.1| hypothetical protein CLOSTHATH_01030 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01030 [Clostridium hathewayi DSM 13479] # 1 146 1 146 146 289 100.0 5e-77 MKINIYTKLPDIASESEVILTPVCSFNHVFQNIEKGENGIDRPASELVYSRSDYDGYRWY TSWFKCHKERLDPELVKEIDEFTEALFLLPEFTSLSAMKRLCAACAEPTRDETEYNLYSE TQRLQIWLRLITRFKDYNLYVHYYLK >gi|229784119|gb|GG667616.1| GENE 32 23216 - 23425 191 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619963|ref|ZP_06112898.1| ## NR: gi|266619963|ref|ZP_06112898.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 69 186 254 254 143 100.0 4e-33 MMSESWSDEPYGIGVIDFRTKDIRLLEPDITAFIFNDYYITCDLREKREEDSSVEMDLLV FYCPERYEK >gi|229784119|gb|GG667616.1| GENE 33 24118 - 24426 309 102 aa, chain + ## HITS:1 COG:no KEGG:Closa_0630 NR:ns ## KEGG: Closa_0630 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 6 102 7 103 108 102 51.0 7e-21 MYENLFYQRLEKLRSEKGVSARDMSLSIGQSAGYINALENRNGFPSMQVFFYICEYLGVT PAEFFDSDNSYPTEYKEMLDDIKALDHEQLGNVKAIIKGLKK >gi|229784119|gb|GG667616.1| GENE 34 24440 - 24868 126 142 aa, chain - ## HITS:1 COG:BH1768 KEGG:ns NR:ns ## COG: BH1768 COG4474 # Protein_GI_number: 15614331 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 3 108 33 138 189 57 29.0 9e-09 MAEQVKILHAIGVRRFYISGSLGVGIWAGELILRLKERPGYGDIEFCIVFPYEGYDANWD ERSRKRMNFLKKHSAECITIRGADGREGFFRQNYYMVDQAQHMVAIYDGNHTVKSVTGQT VKYAMRKGLKIVYIHPDTAKIL 
>gi|229784119|gb|GG667616.1| GENE 35 25098 - 25439 290 113 aa, chain + ## HITS:1 COG:no KEGG:Closa_1119 NR:ns ## KEGG: Closa_1119 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 109 1 109 110 134 67.0 2e-30 MDDLLKEMGKRIHDRRKQLHMTQENLAELANITPQTVSTAELGQKAMRPDTIIKISAALG ISTDYLLLGKVTEEDKSLLSPKVSELTPNQYRHLEDIIGSFVAAVKEKETQEK >gi|229784119|gb|GG667616.1| GENE 36 25458 - 25946 221 162 aa, chain - ## HITS:1 COG:no KEGG:Closa_1118 NR:ns ## KEGG: Closa_1118 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 91 1 91 138 89 52.0 4e-17 MAAKKDRERFTIKFNENDPAHRKVIEILESQGAHSKAPFIVNAILHYIHCPETPDIPVSW LMDKGAIEEMVRGILKEQASEASTPQSVQKTQPTVKVHPGPAAAIHTAADMQPAVNTVQT PEQEETPAAVSRQEQTVPPVCKEEENHARALIMNTLSAFRNN >gi|229784119|gb|GG667616.1| GENE 37 25952 - 26866 678 304 aa, chain - ## HITS:1 COG:no KEGG:Closa_1117 NR:ns ## KEGG: Closa_1117 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 304 3 305 305 451 70.0 1e-125 MLLAIDHGNKQIKTIHCPPFVSGLQQSIEKPFGKNIIQCNGTYYTLSNQRIPYRKDKTED DRFFYLTLFGAANEIEAMKCYSEDVIRIQLAVGLPPAHFGAQNKSFTGYFMNRDVIKFHY HGKPHSIYIEDVVCFPQAYAAAVTMMESLIDSPKALVLDIGGFTADYLLMKYGQADLSTC DSLENGVILLYNKIRSKVNAELDLLLDESDIDAILKGQTDGYTEQAVQLVEYQAQEFIND LFNTLRERMIDLRSGSVVFVGGGAILLRRQIEASGKIGRPLFVDEINANAKGYEFLYQLE MAGR >gi|229784119|gb|GG667616.1| GENE 38 26960 - 27376 369 138 aa, chain - ## HITS:1 COG:all4503 KEGG:ns NR:ns ## COG: all4503 COG0745 # Protein_GI_number: 17231995 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Nostoc sp. 
PCC 7120 # 30 134 148 252 253 73 36.0 8e-14 MGKIVIITLDEAEEHVLNKILDSIEAEKIQFVDISPAPVLSLSNLSLYPHECKVEVNGQE LTLGPRQFALLYLLARNVGRIFTKEQIYYQVWNESTAINVDETIRYHVSEIRKKLEQIYG YDCIETVWGIGYRFKHDK >gi|229784119|gb|GG667616.1| GENE 39 27429 - 27830 332 133 aa, chain - ## HITS:1 COG:no KEGG:SUB1069 NR:ns ## KEGG: SUB1069 # Name: not_defined # Def: membrane protein # Organism: S.uberis # Pathway: not_defined # 10 118 10 119 205 74 41.0 1e-12 MNRITASYFTKIAAFLEFFVAAILLVAILSGAILLAVNLMGDLIHNPQVVDLNQVLGDAL ALVVGIEFIKMLVKHAPEAVVEVLLFAIAREMVVTHAGSAETLMGVIAVGIIFLIRHYHC PAKTAGAERGKEV >gi|229784119|gb|GG667616.1| GENE 40 27935 - 28744 487 269 aa, chain - ## HITS:1 COG:SA0773 KEGG:ns NR:ns ## COG: SA0773 COG2035 # Protein_GI_number: 15926501 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Staphylococcus aureus N315 # 9 240 9 252 283 99 31.0 4e-21 MKGSVLSVMLKGLCVGSTMMVPGVSGGSMAMLLGIYDRLIYSVNSIRSVKSLLLLGTFSA GGFIGILLFSQPLLSLLDRYPMQMLYFFMGAVGGGIPLMLKKAEVRQLSIPVLVWPALGM LLVWTLSVLPEAPVQTGLSMGVKETVFLLMTGIAAAVALILPGISVSYLLLVFGVYDGMI LALHGFHLPVLVPFGAGVLLGIFLSTRTLEHAMKNYPRPTYLIIFGFILASLVEIFPGVP EANQMLLCGTLAGAGFFAVRSLSGLEETT >gi|229784119|gb|GG667616.1| GENE 41 28762 - 29586 588 274 aa, chain - ## HITS:1 COG:CAC0501 KEGG:ns NR:ns ## COG: CAC0501 COG1968 # Protein_GI_number: 15893792 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Clostridium acetobutylicum # 4 273 1 272 274 250 52.0 2e-66 MTIIEVVEILKAIIFGFIQGVTEWLPVSSTGHMILLNEFLPLRQSNEFMELFFVVIQFGS ILAVPLLYWKRMVPWALKGGFRLKPDILLLWGKIVVACIPAGVAALLWNDEINNLFYHPY VVVIMLIVVGILFLVVEARKREPVVHTAVELTWRAALWIGFFQMIAAVLPGTSRSGATII GALMIGVSRIAAAEFTFIMAVPVMVGASLFKILQFGWELTWVEAVILLTGMIVAFLVSLL VIQGLMNFIKKHSFKIFGWYRIGLGIVMAMYLLT >gi|229784119|gb|GG667616.1| GENE 42 29591 - 29740 76 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619973|ref|ZP_06112908.1| ## NR: gi|266619973|ref|ZP_06112908.1| hypothetical protein CLOSTHATH_01041 [Clostridium hathewayi DSM 13479] hypothetical 
protein CLOSTHATH_01041 [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 82 100.0 8e-15 MDDADSDAAPEYKNKNQNKKGITCAYERAQAMSFCILTYINNDLEKKGG >gi|229784119|gb|GG667616.1| GENE 43 30138 - 31397 1106 419 aa, chain + ## HITS:1 COG:RC1079 KEGG:ns NR:ns ## COG: RC1079 COG4536 # Protein_GI_number: 15893002 # Func_class: P Inorganic ion transport and metabolism # Function: Putative Mg2+ and Co2+ transporter CorB # Organism: Rickettsia conorii # 17 409 15 413 424 214 31.0 3e-55 MESDTISLIIIVFCIIMSAYFSATETAFSSLNRIRVKNMAEKGNKRAALVLKLSEDYDGM LSTILIGNNIVNIACASLSTLIFVRFLGDEAGASVSTVVTTIVVLIFGEVSPKSIAKESP ERLAMLSAPLLNVFIKILAPANFLFRQWKKLLSAVFRSSDERGITEEELLTIVDEAEQGG GINKQEGTLIRSAIEFTELEAGDIFTPRIDIIGIPMDADKEEIAELFARTGFSRLPVYED NIDHIIGILYQKDFHNYIIRTDRDIREMIKPAMFIAQTKKIGQLLKELQRDKMHIAVILD EFGGTVGIVTLEDILEELVGDIWDEHDTVVQEIEKISDQEYLVLGSANIDRLFELLQKDQ DFNFLTASGWVMELAGRIPKEGDSFEYQNLNVTVLKMAEKRVEQIRVMVKDNQPPLPEE >gi|229784119|gb|GG667616.1| GENE 44 31506 - 32288 516 260 aa, chain + ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 63 1 63 195 66 49.0 4e-11 MNFGEKLFNLRKEKGLSQEALAEQLNTTRQAISKWENNHGYPETEKLLKLSDIFQVPVDF LLKETTGIENQNSGNIYVTREMAVSYLSSESMVNRYFGLGTAAWLLGGIPYNMLPDGDPL KTPGIAVFLLAGICIWIVGAFKENDTHKILKQEPLILDHVFYQELSAQYMQIKKRYQVAA TVSALLLAVCITFIAIDAKGYIGSAVYHIFIFIGLSLGAFGFIQSVGKIDMYELLIHNEQ YCSRIWFKLMKKVRDRINNL >gi|229784119|gb|GG667616.1| GENE 45 32466 - 34646 1920 726 aa, chain - ## HITS:1 COG:CAC2278 KEGG:ns NR:ns ## COG: CAC2278 COG0342 # Protein_GI_number: 15895546 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Clostridium acetobutylicum # 2 412 3 408 417 201 32.0 5e-51 MKPGRKSLAAMLLLLVISAAVVWFGLGDEIKGARDMRFGIDIRGGVEAVFEPVGLDRAPT EAEVTAARNIMESRLDDKNILDREVTADIDRGYVIVRFPWKSGETQFNPEAAIAELGEMA 
NLTFRDEAGNIMLEGRDVSTSSPVESGGTGLKQYAVELHFTEEGAKKFEDATGKLTGQRM GIFMDDVMISNPVVKDRITGGNAIIDGMESYGEAKELSDKINAGALPYSMATRNFSTISP SLGSAALDIMMGAGIAAFVLLCIFMVIKYRLPGFVACLGMAFQMCLQLLAISVPQYTLTL PGIAGIILSLGMAVDANVIINERIGEELRDGSSLRDAVLAGYKRAYSSVMDGNITSAIVA VILMIFGSGPMLSFGYTLLAGIIINVVIGVYYSKVVLLSLVKEGCFSSPGWFRYKKEQRI YQFMKHRKVWLSISAAALFLGFLGLARNGVSLDTQFTGGTTLKYEIQEETDSERLASSLG EALGRNVSVQFTREDGQGRQSMVVTFAGKEGISPQEQKQATDLIHREYGEMNAELSETYS VQPYIGQQALNRSIIAMGLAAVLITLYVWIRFSILSGLSAGLAAMAALVHDVLIVFGVFL VFAIPLNDAFVAVVLTIIGYSINDTIVIYDRIRENMRKGEGKDIFWLVDMSVSQSLSRSV YTSVTTVIGVLILVAVSWYYQIPSIFEFSLPMLFGIVSGCFSSVCIASILWALWVKKRDA WQTGTE >gi|229784119|gb|GG667616.1| GENE 46 34643 - 36307 1443 554 aa, chain - ## HITS:1 COG:SP0496 KEGG:ns NR:ns ## COG: SP0496 COG1283 # Protein_GI_number: 15900410 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Streptococcus pneumoniae TIGR4 # 5 534 8 535 543 299 36.0 8e-81 MSMGMFFGLLGGLALFLYGMQMMSVNLEAAAGSRMKHILERLTANRFLGVCVGAGITAVI QSSSATTVMVVGFVNSGLMTLRQAVWIIMGANIGTTVTGQLIALDIGELAPLITFIGVAL ALFVKRKQTQFAGGIIAGLGILFIGMGMMGEAMVPLRESEAFVNLMTRFSNPLLGILAGA VFTAIIQSSSASVGILQALAVSGLIGLDSAVFVLFGQNIGTCITAALASIGASRDAKRAT LIHLLFNVIGTAVFTAVCIIWPLTGLVGSFTPHNPAAQIANMHTLFNIVTTLLLLPFGAQ LAQLAGRLLPDKAMGKAEEERWFEDLLESEHVLGVSVIARKQLKEDIGRMLSLAAENVDK GFLTFGDRDEAELEKILEREEEIDLLNARLSRKISKVLQAEHSPAEVEALRHMFTVIGNI ERIGDHAKNLAGYAKNMMDRKLELSDQASDELAEMRQSCRCALRLLCNADYIASGYLAEE AALLEQEIDDKAALYRLNQMERMGVGTCHVETSILYSEILTDFERIGDHVLNIARAWAGL EETDRLPVSEKEKL >gi|229784119|gb|GG667616.1| GENE 47 36307 - 37395 979 362 aa, chain - ## HITS:1 COG:BS_ytvI KEGG:ns NR:ns ## COG: BS_ytvI COG0628 # Protein_GI_number: 16079968 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus subtilis # 15 354 9 360 371 109 27.0 8e-24 MKTEKRRAFIIHFAYFSILFLMAFTVLKYGLSMLSPFVAAFLIAWVLKGPIGFVSGKLRL NWKPAAILVVLLFYSTIGFLIFLLGVKALSAAKELTANLPGIYAYYVEPALISLFDSFEQ SIFRVDNTLLAAFMELETQFVQSAGQMVSGLSMDAMGHISGIASSLPGLFIEILLMIIST 
FFIALDYHRLTGFCLMQLDGKAKDVFFQIKEYVVGTLLVCIRSYALIMTITFVELAVGLS VIGVKNSVLIAFLIALFDILPVLGTGGIMIPWTILTALQGDYPLALGLLLVYLFVTVVRN ILEPKIVGSQIGVHPVVTLAGMFVGAQLFGVLGLFGFPIGISLLRHLNEKGAIKLFKMAE EK >gi|229784119|gb|GG667616.1| GENE 48 37418 - 40081 2169 887 aa, chain - ## HITS:1 COG:SPy0623 KEGG:ns NR:ns ## COG: SPy0623 COG0474 # Protein_GI_number: 15674699 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pyogenes M1 GAS # 9 887 10 891 893 688 44.0 0 MENHKRLPWHATEPEAIYKALHTSEEGLSDAEAAERLARFGRNELRQKPKKTIWQMIKSQ ITDPMVLILIGAAVFSAVLSEWTEAIVILTIVIINAVIGIVQEKKAESSLEALKQMSAPN ARVLRQREESIVPASELVVGDIVLIDDGAMVPADLRLIESANLKIQEASLTGESVPSEKD AKEIMPQECVLGDRANMAYTSSIVTYGHGTGVVVATGMSTEVGNIAGLLENQDELDTPLK RKLNAVGKTLTVVGIIVCALIFAIGAFYGRPLIPQFLVAISLAISIIPEGLPATATIVMA LGVQRMAKQNALIRKLPAVETLGSATVICSDKTGTLTLNQMTVTQIAVNGDFEAGTTTAV ECADKEHPDVYRELVYAGALCNNASLDPDHKGEIIGDPTEGALIFLAQKFGIDHEELEET YPRLFEQPFDSERKRMSTVHEINQKLVSYTKGAVDEMLPLCTGMLTSRGVRPITQTDIRQ IQDMCDSMSQKALRVLGFAVKTLKHLPEDEEENIEFDMTFIGVAGMIDPPRKEVAESVRT CRNAGIRTIMITGDHKVTALAIAKELSIWQNGDTVISGEDLSAMSEEELDQAVEHATVFA RVSPADKLRIIQSLKRNGEVAAMTGDGVNDSPALKAADIGVAMGRTGTDVAKEASDMILL DDSFTTIAYAIKEGRRVYRNIQKVIQFLLAGNIAEITTLFLATLFNWEAPLLAVHILWVN LATATLPALALGVDPASKNIMKHKPVKAGTLFERDLIARVIRQGIFVAAMTIAAYFIGAD QSSHTTGQTMAFSVLAISQMLRAFNQRSNTEPIWVRAEGPNPWLVVSFLVSAGLMACILF IPSLQSAFRLTYLTGTQWLTVMGLSILSIVQMEAVKWVKRARKREKR >gi|229784119|gb|GG667616.1| GENE 49 40802 - 42334 454 510 aa, chain + ## HITS:1 COG:SMa1548_2 KEGG:ns NR:ns ## COG: SMa1548_2 COG2199 # Protein_GI_number: 16263296 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Sinorhizobium meliloti # 346 501 15 171 180 119 42.0 2e-26 MAAADKEGRLYVPDYPETPDISQEPYFQSAMNGSSSVMWLPEVDGKSEKLVISVPITDNG QPEGVLLGRYSMENLSELFTVGAFDGKGYNYVIKSNGSIIFRALTDPIASNYKTLNDIAE NTSSTLTSDDVKVMLRNMKSGTGGSIIYNRDGQERIMKFIPVGINDWNLALVIPSNVIDA KAQSIIRSTSFYSAVIILSFSVVASLLFYLKHQNHRTLQKSYENIQSIYRTVPSSVVQFK 
NNRDYSILSANDAFYNFLAYSPQEYNKRIGSFLLPIIHPQDRDSIASLPVGLSEHEFRVL DANGQTKWVYGNFDCSNTSQTVLCAFFDISEQKKLLMNAEQEAMIDPLTGVKNRLAVEKI LSALLHKENRSGALIMLDLDRFKQVNDTLGHPEGDRVLQKFAHCLIEIFGSNSFVARLGG DEFIVFSNEIALPEDAARKARQLIDIMAQRLEMEEKKCGLAVSIGIAFYPSDADSLTELI KLADQALYHAKHNGGKRYMLYSEKQKREKV >gi|229784119|gb|GG667616.1| GENE 50 42353 - 44365 780 670 aa, chain - ## HITS:1 COG:no KEGG:CLK_A0325 NR:ns ## KEGG: CLK_A0325 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 30 661 27 641 647 250 28.0 2e-64 MKLNKKVCRKFAEPKWSMDLNHGLFCAEQIQYIVKSEVRNVDHQRVLILHIYDRKKVAAG ETRPLWTMFQSRKQYFTLARRDDGSTFWRSAAFDNLDRYYLFKNKCAFYSMNDEKRVMQF CKTDQKDGFSCLNSLQSRMMYQSMLQKRHKMQRKIIEKMKTVPGLPRGIKGWIDREIVPA YIFYTYSKDRKEMDGYCTACKQAVKVSGIKHNAEGSCPHCNKAVTYKSRGRRGTVIDRET VQVLQKISEKELVVRFVKVYRKYPNADVPEDSLYESARIFINSDREGRRKSEHFYHQFSC QDITPWKKGNRPVFSKWQYNFNADHNGYLYDRNLELVLENTPWRYSQLERYYQANRTPFY VINYLQRYLDYPMLEYMVKLGLYRLAADIVNPDYYCYALSQTINEKGKNIQEVLGLNKSH LPMLQEINPGSKQLTLIKKYLKNHIHLDVTLFNWCARNGVSRLENLTVPLQYMTAYKLIR YADEQFEIYKTKSWATAGYRDMESLLNSYRDYLSMCEALEYDLSNSFVLYPANLPKAHDK VNDLSDKEQAKVYEGQIRKLYEKLNSQYAFSKYGYFITLPRTVKEIIEEGHKLRHCIGTY VKMVVKRQCLVFFVRKVNEPEKPLCTVELNGAEIGQTSMFANRAPTPPIKVFLKEWEQEI LQAPVLQSAA >gi|229784119|gb|GG667616.1| GENE 51 44391 - 45146 673 251 aa, chain - ## HITS:1 COG:no KEGG:Closa_1106 NR:ns ## KEGG: Closa_1106 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 25 217 87 279 325 161 58.0 2e-38 MQDSLNIHVLQRDETGVLNPVDTAAEELMLPSDLEDSRAETKEKDEDEIQEEDEKRMEHE ASEARRKEKWEARQQAKKKAEQEQQDRLKTMDNDEVMAASMKRVSADTERLTRRNMKECV SEHIQTLCLESPDFARLVMHPRKNMIHCFRYIYRRAKDFIEQEMKNNNVELTTGLYGEDV PDDLCYQWAEDYFRDLDAEEDKEKEEKFVSRQYHAKTAGKPQKAVRKKTEKNRSRRNRRS RPMKALQVSSC >gi|229784119|gb|GG667616.1| GENE 52 45200 - 46132 414 310 aa, chain - ## HITS:1 COG:no KEGG:Closa_0805 NR:ns ## KEGG: Closa_0805 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 305 1 310 321 
143 29.0 1e-32 MLKIDFIYEVTDRIRNKMVGTCEVLIKRITHNNGQQLTGILFKKHGKESPLIYLDKYYDQ YVDGRMSMVEIVDEIMRIQGEGGVQELIDIGGLADYEAVQSKIRLKLVNYEANKARLEGM PHVPFLDLAIMFYVEIDSNSQRILTAAVEHHHLEMWGIGKERIYNDALYNMRACCPVRIK SVMSVIKEVEEENDGNLDLIISEEDISHEQGFWIMTTKLGIQGASTLVHGDGLKKFAQIN RDNIVILPSSIHEVMLIPQMLANGNYEYLSHMVEEVNRDEVLREDRLSNSIYLYNRNEDS ITIAYRGPEL >gi|229784119|gb|GG667616.1| GENE 53 46241 - 47266 490 341 aa, chain - ## HITS:1 COG:no KEGG:Closa_1104 NR:ns ## KEGG: Closa_1104 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 339 253 589 595 419 60.0 1e-116 MKDKNDPPDKAGTKYIWLSSSNKNMGVTSGSPVHFIGNPCARTVYITEGLLKADIAHMLM DRSFVAVAGANNVGPLDPLFAALARNGTELIVEANDMDKFNNNMTSKGASKIYLMARRYG MDCRPLTWNPNYKGIDDWQLALRRKEKLVQENPKSSFKASYLFGKCSFEAIGSYIADWQT DQKKGISLQEYLGLAKAEYELYTRNEASELECLLNSQRHCWKFRIYQLDFDGEQKTIPFA FMGIEALYKANYTQPPAVEYRLMCEDTLICPADMLEEQVLDYIFSRYSGKLPEHYHGRSL APSDVVELYDQGKRSYFYRDKGNFVPVNFSPFFAKPMKEMN >gi|229784119|gb|GG667616.1| GENE 54 48074 - 49117 599 347 aa, chain - ## HITS:1 COG:no KEGG:Closa_1103 NR:ns ## KEGG: Closa_1103 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 347 1 427 427 251 39.0 3e-65 MSEKKHHVTSLYERIPAVAELNKVPGFDPRKLLRRTVSRTGNEEVLKLDLRYKKLWFRLA HPHGRLKLNALRITEQLAIFEAGVYLDRSDSEPIGNFTAACTREEISNGHYVKAAQEAAL DEALTNAGFGLQFSDICVGRNGERYGSEIPLADVQTVRNDNNVVEPAKIPTRADYRTVTP QTSSTEVKRVTESTSIPEKEERTVTILEADKTLTSEARSDNSGEMDQLPTELMEAVTAES TVKSLKTDTSLPENRAALEPMAQETPQSQVSYTADMPVAEIIKRMTFEEAQNVTVDEGTC SGWTMAEVAERRPPSLKWYVYGYKKNNNILRAAAQIMWDSLNGQKAG >gi|229784119|gb|GG667616.1| GENE 55 49186 - 51465 1456 759 aa, chain - ## HITS:1 COG:all7071 KEGG:ns NR:ns ## COG: all7071 COG0507 # Protein_GI_number: 17233087 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Nostoc sp. 
PCC 7120 # 63 731 64 743 748 340 32.0 4e-93 MKYTGTYEKTIFYNPSNKYCIVIVKSSDQTIPENVRSLERNKDHLIRFIAVGYELPRTDA VELELEGEWVLNKKHGYQLQVEQWQEIVPKTKGGVEGYLSSGLIKGIGPKLAADIVARFG VSVIDILEKQPERLLEVRGITENKLEDIKTSYAENRMLRDLMTLLAPYKLTPKTALKIYQ HFGPASIEILENDPFALCRIPGFGFRRVDSIRQKSGGDLHDPMRIKGALLCVLDEARGKS GHLYLEKKEHIKAALYLLNEKVLLSNNRIREQQVTDTLQELILNGAVVSMKENIYKPNIF EMEDETARQIALRLLDRTPCPNMDSLLEQVKNRFGLSLSNMQAKGVQQACMYNLSIITGP PGSGKTSAIKAIIEMFRILHPKGIIKLMAPTGRASRRMVESTGFEEASTLHSGMCIVSEE DENYRKKEPDLLEADLVIVDEVSLVDMWLMSQFFKRLKRNTKIVLVGDVDQLPSVGAGNV LYEMIHCGLIPVTVLDEIFRQEKGSTIPHNAKLINQGCTRLHGSDDFIFIPSVNQAKAAE TIVERYCQEIAENGIEQVQILSPFREEGAASVAHLNAVIRERVNPFCSTEDEIHTGSKSF RVGDRIMQTKNTEHVSNGDLGFIRYVKDTPEGKRVGMDFGPGRELEYGFEDLSHVELSYA TTVHKGMGSEFSIILFPVLEAHKIMLYRNLIYTAVTRAKTKVMLVGQKSALLTAIHRNGN NKRNTLLGHRIRLYYRAFAKSAGIPIPADLEELELKNAG >gi|229784119|gb|GG667616.1| GENE 56 51462 - 51677 139 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870098|ref|ZP_06409627.1| ## NR: gi|288870098|ref|ZP_06409627.1| putative non-ribosomal peptide synthetase [Clostridium hathewayi DSM 13479] putative non-ribosomal peptide synthetase [Clostridium hathewayi DSM 13479] # 1 71 1 71 71 106 100.0 7e-22 MKGKHMKTKCRYTVMSICVAISLVSGALKVVGVIELEWLAVFAPILAGLAFKLASLLLTL TVILFRWRLRK >gi|229784119|gb|GG667616.1| GENE 57 51685 - 52752 587 355 aa, chain - ## HITS:1 COG:BH3544 KEGG:ns NR:ns ## COG: BH3544 COG5377 # Protein_GI_number: 15616106 # Func_class: L Replication, recombination and repair # Function: Phage-related protein, predicted endonuclease # Organism: Bacillus halodurans # 13 347 3 313 320 132 30.0 9e-31 MALNGALKRDRYRPLVLVDTENLSEEEWLSYRRRGIGGSDVAAILGISPFRTSRDLYYDK LNIVSVNDNEENWVAMSIGHLLEDLVGKIFQYKTGLEIYQIKKMFYHPKYPFMLADLDYF ATLPDGTTALLELKTTNYNARSKWWLDKQEIVPSYYEAQGRHYMSVMDIDRIFYCCLYGN NLNEVIIRELKRDLAYEEEMIFLEQEFWQNHVQAQVPPPYTEDGDLIIKSVRRYAGHADE NASAINFDEKMKEKLIRFQELQEQKKTYEAPAKKVKNELERLKALMIAEMGTSCTAVCEQ DGTNYTVTYNPVREPAISKDNLMRLKILYPEIYKEFVTVSESRRFHIAASAKKAA >gi|229784119|gb|GG667616.1| GENE 58 53332 - 53502 109 56 
aa, chain - ## HITS:1 COG:L126409 KEGG:ns NR:ns ## COG: L126409 COG1476 # Protein_GI_number: 15672309 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 1 53 8 60 97 63 64.0 6e-11 MRKQQGLSQEELAKKCGVSRQTINAIENNKYDPTLALAFHLAKELQTTVDNLFICQ >gi|229784119|gb|GG667616.1| GENE 59 53525 - 53815 145 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619991|ref|ZP_06112926.1| ## NR: gi|266619991|ref|ZP_06112926.1| integral membrane protein [Clostridium hathewayi DSM 13479] integral membrane protein [Clostridium hathewayi DSM 13479] # 1 96 1 96 96 145 100.0 9e-34 MREFGLVLFVVFLLVMVILNVSMIISLIKPGDERRQMIIWKASTWTLIATMGSLVFRVVE SVIKVEKMSVNPFITLTSAAAIYFICLLYYRRKYGD >gi|229784119|gb|GG667616.1| GENE 60 53952 - 54035 88 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSGMDKALLSDTVNMGCYNHFNMKVKV >gi|229784119|gb|GG667616.1| GENE 61 54025 - 55746 765 573 aa, chain + ## HITS:1 COG:BS_lytS KEGG:ns NR:ns ## COG: BS_lytS COG3275 # Protein_GI_number: 16079945 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Bacillus subtilis # 4 558 3 577 593 283 33.0 5e-76 MPDNLFLSLLFNIGFLIFLAFILTRIPTVRNMLTCENLSVPSQLLAAVLFGFVSILATYT GIETGGAIVNTRVIGVLSAGLLGGPVIGIGAALIGGIHRYFYEAGQLTAAACAVSTMAEG MIGVFCSRYFHRGKWDNGMLFLLTAVTEVCQMLLILLIARPFSSALEVVKEIALPMIILN SCGMVVFIGTFRALSMEKDNERTSGISLALHVASKCLPHLRKGLDSPEDIKAAADIIFQS TTCSAVMITNREHILAFSCLSGNKDFTSVESLMNSHIKIAMEQNKTICVSDASAPDPLTS VFKNYVLVAVPLTSRGQVSGCLGLFYKKRWHRSQSRIVFAENLSTLFSTQLELADLNYEK SLRRKAEIKALRSQVNPHFLHNALNTISCMCRENPERARELSRTLSVYYRQTLEPHHEMT DLQTELYQVLRYLEMEKARFEENLLIETDVPEGLNCLIPSFILQPLVENAVRYGADRAGL RSVAIRTRKTPEGIQIEVADHGPGIPETIVEAILSGQEPDKSYGLYNVHTRLKKIYGSSG GLSIQHTDGETRISFLIPFSSAAHRSGDWEAAE >gi|229784119|gb|GG667616.1| GENE 62 55788 - 56510 596 240 aa, chain + ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response 
regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 4 240 5 240 240 136 33.0 4e-32 MKIVVVDDERPARSELAFLITDILPEAEILEAASGAAALELFLSTPDICLAFLDINLNDM EGTTLAAAMQKILPDVPVVFATAYSEYAVRAFELGAADYILKPVEPDRLRSVLKKCMAVP GASLSVSDPMENKIAIRFNRKVIFTDTKDIIYIETSGRGCMIHTVTDGYAENILIGDYEK RLSHLGFFRIHKSYLVNLAFVSEVFPWANNGLALKLKGCGSEIFPIGREKTKAFRQRMKI >gi|229784119|gb|GG667616.1| GENE 63 56672 - 58111 1228 479 aa, chain + ## HITS:1 COG:MA1905 KEGG:ns NR:ns ## COG: MA1905 COG1966 # Protein_GI_number: 20090754 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 1 444 1 448 479 459 57.0 1e-129 MISFFICLFLLVGGFFVYGKFAEKVFGPDDRRTPAIAMEDGADYVPMGTARIFLIQLLNI AGLGPIFGALAGAAWGPSVFLWITLGTILGGGVHDYMIGMMSMRHKGASVSELTGMYLGN GMKQVMRVFSVVLLVLVGVSFSTGPANLLAMLTPEVLDMNFWLIVVLVYYFFATFVPIDK LIGKIYPIFGICLIVMALGVGGMILFSGNYTIPEITLQNLHPNGTPIWPVMFISVACGAI SGFHATQSPLMARCMRSEKSGRNVFYGAMVAEGIIALIWAAAGAAFYNGTGGLLAALGQG QASVVYEICFTLLGPIGAVIAMIGVIACPISSADTAYRSARLTLADWFNYDQKPIKNRLV LTVPLLGVGAFLTQIDVQIIWRYFSWSNQTLAMIALWAASVYLLKRGRAYLVTALPATFM SAVSCTYILMAEEGMRLSTSVAYPIGILFAVGCLGMFTYSCILRKGKNIALPVGDVNFD >gi|229784119|gb|GG667616.1| GENE 64 58242 - 58910 787 222 aa, chain - ## HITS:1 COG:BH2663 KEGG:ns NR:ns ## COG: BH2663 COG0569 # Protein_GI_number: 15615226 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Bacillus halodurans # 9 222 5 219 220 167 44.0 1e-41 MKNTKHTTFGVIGLGRFGTALAKSLAEAGKEVIVVDCNESKVRELRHYTEHAFVAEDLTK ETLEDIGIQNCDTVIICIGEKIDTSILTTLNVVSLGIPQVIAKAISRDQGAVLEKIGAEV VYPERDMALRLGKRLVSNNFLDYISLDNEVEIQQIPVASKMVGMTVQEFNIRQKYGLNII AIERGHVTDIEVSPQYRFGNDDIIVVIGRVEKIKRFESDIED >gi|229784119|gb|GG667616.1| GENE 65 58921 - 60171 1072 416 aa, chain - ## HITS:1 COG:DR1668 KEGG:ns NR:ns ## COG: DR1668 COG0168 # Protein_GI_number: 15806671 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport 
systems, membrane components # Organism: Deinococcus radiodurans # 1 395 78 488 512 237 39.0 4e-62 MGFAAVILFGTFLLLLPVSVNPGMKVEVIDALFTSTSAVCVTGLIAIDTADHFTAFGRAV VALLIQIGGLGVTSVGVGFILAAGKRVGFKGRALVKEALNVDSSKGVIRLVKAILLMTLL FEAMGAALSFIVFVQDYPPLKAMEISIFHSIAAFNNSGFDILGGLQNLIPYQTDVLLNLV TCGLIIFGGLGFLVILDVLKMRSFKKLCLHSKVVITTSVILLAAGTLLLKATEDISWLGA FFQSVSARTAGFSTYPIGEFSNAGLFTLTMLMFIGASPGSTGGGIKTSTFFALLVSVRSL MTKKHYGAFHRSIPAEGISKAFMIAFLSFLVVCGGTFLLCVLEPGYSFIQLMFEVTSAFG TVGLSTGITPELGSISKLVIILIMFIGRLGAFTLASMWVCRQMPHARYIEESIIIG >gi|229784119|gb|GG667616.1| GENE 66 60083 - 60262 69 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVKASAVFRWSTYEPIISVFERSSPGTFDNFGVCSGDLIWNLSAAVAGVCESGHEGRGN >gi|229784119|gb|GG667616.1| GENE 67 61168 - 61611 263 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870102|ref|ZP_06409629.1| ## NR: gi|288870102|ref|ZP_06409629.1| putative heat shock sigma factor [Clostridium hathewayi DSM 13479] putative heat shock sigma factor [Clostridium hathewayi DSM 13479] # 1 147 107 253 253 293 100.0 4e-78 MWQRNNGQGEAEKLQNRFTSYLVTAVNRRRKDYMNQRNRKLRMECSMEDEAWDLEYDEQV YHNLPLPMQLENDSLIYALKQITDRERHVFLSHALEDKSFEELAAELGITYKGAAAIYYR TIQKIKQRMGEGRNGFYGTSEAGKSRK >gi|229784119|gb|GG667616.1| GENE 68 61568 - 61750 231 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870103|ref|ZP_06409630.1| ## NR: gi|288870103|ref|ZP_06409630.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 4 60 1 57 57 100 100.0 3e-20 MDFMELLKQAKAGNEPAIAEILEMYRPLLIKNSIIDGSYDEDLFQELSITLLKCIIQFRV >gi|229784119|gb|GG667616.1| GENE 69 61822 - 62373 371 183 aa, chain - ## HITS:1 COG:CAC0723 KEGG:ns NR:ns ## COG: CAC0723 COG1309 # Protein_GI_number: 15894010 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 5 176 6 179 185 122 40.0 5e-28 MKALKENPIAAQSKQMILKALLELMECKAYNSISIKELAECAGLDRKTFYRNFKSKEEVL YLRLHELCQLYIKELKKLPQFTSYTVSKAYFLICRQNADFFMLLKKNDLLPLALLKFNEY 
LEELITLFLDDPVYRKKSQYEIYYQAGGFWNVTIRWLEEGAKESPEEMAQMIGAIMPPLK VEE >gi|229784119|gb|GG667616.1| GENE 70 62439 - 63425 167 328 aa, chain + ## HITS:1 COG:RSc2319 KEGG:ns NR:ns ## COG: RSc2319 COG2267 # Protein_GI_number: 17547038 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Ralstonia solanacearum # 23 324 5 309 319 215 37.0 1e-55 MYYKIISATSGTVNLKEETNIQTYHCDPFWRMLQPYLPKSNRLTTNNLPEEYFIPILGMD IHIDHYKPVVSKGRLILFHGVGGNGRLLSCIALPLARAGFEVICPDLPLYGCTCYIKDIT YDTWVSCSVEIVKYYQSNESLPIFLFGFSAGGMLAYQIACKSQNIRGLIVSCILDQREKT VTRNSARNPLLGLMAKPLMTAMHTFAGRVKIPMKWIGNMKAIVNQKEVARILMKDKKSSG VRVPINFIYSMLNPEIRIEPEQFTACPVLLSHPGDDRWTDIKLSNLFYNRLACKKQTVIL EGAGHFPMEAIGLKQMEQACIQFLEQNL >gi|229784119|gb|GG667616.1| GENE 71 63510 - 64031 339 173 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1471 NR:ns ## KEGG: CDR20291_1471 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 171 7 177 179 201 56.0 8e-51 MEFRRLTDSNHAMYHQAMDLYQASFPIHEQRESHVQTCIMGNEAYHFNLIYENDMWVGMI LWWETAEFIYVEHFCILPKMRGKKYGQKALELLDGERKTVILEIDPPTDDVSVHRKQFYE RSGYQANRYSHVHPPYKKGFKGHDLVVMSSPDQLSEAEYRAFNSYLSAVVMKS >gi|229784119|gb|GG667616.1| GENE 72 64090 - 64281 206 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225386648|ref|ZP_03756412.1| ## NR: gi|225386648|ref|ZP_03756412.1| hypothetical protein CLOSTASPAR_00396 [Clostridium asparagiforme DSM 15981] hypothetical protein CLOSTASPAR_00396 [Clostridium asparagiforme DSM 15981] # 1 63 1 63 63 98 92.0 1e-19 MKQFVLYALLKAFVISVGFDLVCLIYGVVSGNPYRITLAGGVICFLVLFFVELIECLWKS RKK >gi|229784119|gb|GG667616.1| GENE 73 64357 - 65130 265 257 aa, chain - ## HITS:1 COG:SA2004 KEGG:ns NR:ns ## COG: SA2004 COG0739 # Protein_GI_number: 15927783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Staphylococcus aureus N315 # 37 208 103 268 284 119 37.0 6e-27 MNGSVLIQSLKQLWGIIVLEYKYHGDLPSIESYHPQVKYALPFSGEWVVVNGGVTEKTSH 
SWEIPTQRYAYDFVILDGAGSSFQGEETEAGSFYCYGRDILAPADGTVAEICTGNPDSRI TKNRAASCNARDIRGNYVLICHSDGEYSLLAHLKPDSIRVSVGQSIKRGEKIAECGNSGN TSEPHLHFQVQLGKSFYSSPGLPVEFENIKVKKTLNYEKFDDRYLKEYSENYYPPFIGVG QTVYNVNTEGWNRDSII >gi|229784119|gb|GG667616.1| GENE 74 65536 - 67764 1043 742 aa, chain - ## HITS:1 COG:alr3997 KEGG:ns NR:ns ## COG: alr3997 COG0515 # Protein_GI_number: 17231489 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Nostoc sp. PCC 7120 # 662 740 411 489 496 68 41.0 6e-11 MKRSRLWLLFPILSVFMIIFAFWLSRQEGSYKQTIELSFEPVQDTFTIELQAEEKYMEPH VSHGDAELMWESATGNGKSYLRMKGKLPEHRLGKPVALRIESDSGNRIREMEILSVSGIS AAGCHIDVVKDDTWPLYVGFSIALIAAMAVLLAGSFGVIKKRKKFAYALELAEGLRDDPS AGDCFKEGQRYFLAVERMCIVDTLSGMLFVWISVLWVYRRSYSPGWSRAISWFLLAMAAF FIITYFLRIYRNRIMLRPLLCENRPLSASAACLLSGINGIGNMKERAISLHNAAVGLYRS GRVREALTLEDLAWKLAGKHKGAMLCFLHSDLRSACLRCLGEEQAAEREELKLEIMLEET PWLCKNKDIQMRSLSAKIRGLISGGDLEQAEIEGNYYLDRCHNDYHRLPMLAMIAEVKEM LEKPLEAADLRKRLLSYSPENHEVLKAMAYGPCTYSYTRLKAHDFTGIILRAAYVLAVMF FLFQLADGATHSKGQEDSPVYTETEMSSSAGIDNRETTPSALKETASGEPAESLVRKAAD FSITYPENWDGLYVEKPLEGGGISVFQKSSFEQNGDGLLFSIRIFNNGDYVNEPDYEILG YDGANVYAMSLPTDVTFLLENEQVQHEYEQMVWDIEPVKYSFRIDSDSARYDGSEFIFPN SSDAALNEANLLNLSAQRLRIAKNEIYARHGRRFTDMELQQYFNSCSWYEGKIEPSEFTD DLLNELEKKNVQLIQKQQERME Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:46:58 2011 Seq name: gi|229784118|gb|GG667617.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld10, whole genome shotgun sequence Length of sequence - 63246 bp Number of predicted genes - 52, with homology - 51 Number of transcription units - 18, operones - 13 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 26 - 874 644 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 815 - 860 7.4 2 2 Op 1 . - CDS 886 - 1650 732 ## Sph21_1994 protein of unknown function DUF303 acetylesterase 3 2 Op 2 . 
- CDS 1674 - 2060 509 ## COG3279 Response regulator of the LytR/AlgR family - Prom 2203 - 2262 7.3 + Prom 2195 - 2254 7.5 4 3 Op 1 . + CDS 2282 - 3145 538 ## COG2207 AraC-type DNA-binding domain-containing proteins 5 3 Op 2 . + CDS 3168 - 3641 374 ## Clole_4115 hypothetical protein 6 3 Op 3 . + CDS 3682 - 4191 411 ## CD1883 AraC family transcription regulator - Term 4024 - 4072 -0.8 7 4 Tu 1 . - CDS 4207 - 5937 2119 ## COG1966 Carbon starvation protein, predicted membrane protein - Prom 6104 - 6163 9.2 - Term 6125 - 6165 6.4 8 5 Op 1 . - CDS 6188 - 7405 1523 ## COG0133 Tryptophan synthase beta chain - Prom 7498 - 7557 4.9 9 5 Op 2 . - CDS 7594 - 9816 2129 ## COG0550 Topoisomerase IA - Prom 9856 - 9915 7.5 + Prom 9930 - 9989 7.1 10 6 Op 1 11/0.000 + CDS 10099 - 11127 440 ## PROTEIN SUPPORTED gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 11 6 Op 2 11/0.000 + CDS 11127 - 11597 242 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 12 6 Op 3 . + CDS 11598 - 12905 1077 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 + Term 12976 - 13014 0.4 - Term 12790 - 12827 3.0 13 7 Tu 1 . - CDS 12956 - 13846 930 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 13895 - 13954 5.2 + Prom 13859 - 13918 3.2 14 8 Tu 1 . + CDS 13952 - 15076 1070 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins - Term 15079 - 15110 4.8 15 9 Op 1 7/0.000 - CDS 15187 - 16716 1185 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 16 9 Op 2 . - CDS 16713 - 18398 1442 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 17 9 Op 3 . - CDS 18468 - 19550 919 ## COG0673 Predicted dehydrogenases and related proteins 18 9 Op 4 . 
- CDS 19568 - 20464 655 ## GYMC10_4387 oxidoreductase domain protein 19 9 Op 5 3/0.000 - CDS 20469 - 21566 839 ## COG0673 Predicted dehydrogenases and related proteins 20 9 Op 6 38/0.000 - CDS 21596 - 22435 790 ## COG0395 ABC-type sugar transport system, permease component 21 9 Op 7 35/0.000 - CDS 22435 - 23370 876 ## COG1175 ABC-type sugar transport systems, permease components 22 9 Op 8 . - CDS 23387 - 24787 1434 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 24894 - 24953 4.7 - Term 24917 - 24978 9.2 23 10 Op 1 . - CDS 24995 - 27043 1862 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain 24 10 Op 2 . - CDS 27074 - 27430 319 ## EUBELI_20567 hypothetical protein - Prom 27469 - 27528 5.5 - Term 27536 - 27581 11.9 25 11 Op 1 . - CDS 27595 - 33390 5249 ## COG5263 FOG: Glucan-binding domain (YG repeat) 26 11 Op 2 38/0.000 - CDS 33415 - 34266 735 ## COG0395 ABC-type sugar transport system, permease component 27 11 Op 3 35/0.000 - CDS 34270 - 35163 1112 ## COG1175 ABC-type sugar transport systems, permease components - Term 35178 - 35216 3.6 28 11 Op 4 2/0.000 - CDS 35265 - 36653 1444 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 36724 - 36783 4.8 - Term 36770 - 36809 7.1 29 12 Op 1 7/0.000 - CDS 36821 - 38377 1827 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 30 12 Op 2 . - CDS 38380 - 40134 1779 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 31 12 Op 3 . - CDS 40097 - 42160 1696 ## Cphy_1618 hypothetical protein - Prom 42194 - 42253 4.1 - Term 42271 - 42318 9.5 32 13 Op 1 21/0.000 - CDS 42343 - 42609 417 ## PROTEIN SUPPORTED gi|160940069|ref|ZP_02087414.1| hypothetical protein CLOBOL_04958 33 13 Op 2 24/0.000 - CDS 42655 - 43107 368 ## COG0629 Single-stranded DNA-binding protein 34 13 Op 3 . 
- CDS 43130 - 43417 384 ## PROTEIN SUPPORTED gi|160881892|ref|YP_001560860.1| ribosomal protein S6 35 13 Op 4 . - CDS 43404 - 43523 57 ## 36 13 Op 5 . - CDS 43571 - 43768 184 ## COG4481 Uncharacterized protein conserved in bacteria 37 13 Op 6 . - CDS 43793 - 44782 1158 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 38 13 Op 7 . - CDS 44864 - 45595 817 ## COG0846 NAD-dependent protein deacetylases, SIR2 family - Term 45657 - 45723 17.1 39 14 Op 1 . - CDS 45751 - 46389 682 ## EUBREC_2283 hypothetical protein 40 14 Op 2 . - CDS 46458 - 47510 945 ## COG4552 Predicted acetyltransferase involved in intracellular survival and related acetyltransferases 41 14 Op 3 . - CDS 47507 - 48433 1034 ## COG4866 Uncharacterized conserved protein 42 14 Op 4 . - CDS 48481 - 48888 389 ## Closa_0226 Zn-finger containing protein 43 14 Op 5 . - CDS 48910 - 49053 60 ## gi|288870113|ref|ZP_06409635.1| hypothetical protein CLOSTHATH_01119 44 14 Op 6 . - CDS 49087 - 50247 1063 ## COG0534 Na+-driven multidrug efflux pump - Prom 50369 - 50428 1.6 45 15 Tu 1 . - CDS 50441 - 51562 1327 ## Xcel_1188 extracellular solute-binding protein family 1 protein 46 16 Op 1 . - CDS 52502 - 53104 708 ## Xcel_1188 extracellular solute-binding protein family 1 protein - Prom 53126 - 53185 5.7 47 16 Op 2 7/0.000 - CDS 53187 - 54092 1018 ## COG0395 ABC-type sugar transport system, permease component 48 16 Op 3 . - CDS 54108 - 55064 854 ## COG4209 ABC-type polysaccharide transport system, permease component - Prom 55257 - 55316 80.4 + Prom 56139 - 56198 27.4 49 17 Op 1 . + CDS 56348 - 58108 2052 ## Calow_1346 hypothetical protein 50 17 Op 2 . + CDS 58113 - 59891 2024 ## Csac_0666 hypothetical protein + Term 59923 - 59977 12.1 - Term 59910 - 59965 16.1 51 18 Op 1 7/0.000 - CDS 59974 - 61491 1673 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 52 18 Op 2 . 
- CDS 61488 - 63245 1610 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain Predicted protein(s) >gi|229784118|gb|GG667617.1| GENE 1 26 - 874 644 282 aa, chain + ## HITS:1 COG:SP1899 KEGG:ns NR:ns ## COG: SP1899 COG2207 # Protein_GI_number: 15901726 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Streptococcus pneumoniae TIGR4 # 15 276 12 272 286 132 30.0 8e-31 MSNFSFALFTDKNFVDCTLFQYGWEKCDPHHSFGPAARNHYLFHYIISGKGLLISTDSSK KDTEYHLHAGQGFLISPGQTNTYIADSLDPWEYAWVEFDGLKAPEIIALSGLSFDLPVYN SNNPAMEQVLKDEMIYIAGHTTESSFHLIGHLYLALDALIKSSNRRMTAVGGQLKDFYAR EAVSFVEHNYQNDITVEDIAAFCNLNRSYFGKIFKDVLSVTPQEFLIRFRMAKACEYMET TNLPIGEISARVGYPSQLHFSRAFKKTYGQSPREWRLANQHK >gi|229784118|gb|GG667617.1| GENE 2 886 - 1650 732 254 aa, chain - ## HITS:1 COG:no KEGG:Sph21_1994 NR:ns ## KEGG: Sph21_1994 # Name: not_defined # Def: protein of unknown function DUF303 acetylesterase # Organism: Sphingobacterium_21 # Pathway: not_defined # 5 235 25 259 695 110 31.0 7e-23 MEADILLFMGQSNMAGRGDYRLAPEVLPGAAYEYRAVTEPDTLVPLTEPFGVNENREGGV FEPGMKTGSMAAAFVNACYRKTGRPIIAVSCSKGGSRIQEWQPETPYFKDAAARYQACLS FVQSRQIAVHSTGMVWCQGCTNADDGMAKAEYKEKTKAFFQAVKSLGVDKIFLIQIGNHR EFPDRYVPMQEAQEEIAEEMEDVIMTSRLFRTFRDRGLMKDSFHYKQEAYNLVGEEAGAR TGEILTKEAGKTKL >gi|229784118|gb|GG667617.1| GENE 3 1674 - 2060 509 128 aa, chain - ## HITS:1 COG:ECs3261 KEGG:ns NR:ns ## COG: ECs3261 COG3279 # Protein_GI_number: 15832515 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Escherichia coli O157:H7 # 3 116 1 114 244 72 28.0 2e-13 MSIRVIAVDNSEELLEKITRILKEMDDVEFCGSFGEAIAAIQYVKENPVDIVFSDVVMPD ISGITLAAKIYQLPDPPEVVLLSGIPGFSLEAWKIRAFGFIIKPYTRGQIVDMLNRYRCE KAAEKVLS >gi|229784118|gb|GG667617.1| GENE 4 2282 - 3145 538 287 aa, chain + ## HITS:1 COG:BH0594_1 KEGG:ns NR:ns ## COG: BH0594_1 COG2207 # Protein_GI_number: 15613157 # Func_class: K Transcription # Function: AraC-type DNA-binding 
domain-containing proteins # Organism: Bacillus halodurans # 1 119 1 116 127 119 46.0 8e-27 MEWNEKLKKIIDYVEDHLQRREEPIDENVVAEIAGCSFGFFQKVFSYMNGISFSEYIRSR KLTLAGYDLKSTDLKVVDISYKYGYASPTSFTKAFQQFHGVTPKEARDKETELKIAPKMQ TSNRQQYSWHIEQREAFRLVGKSRTFSCRNGEHYAQIPAFWNECQRDGALAALSSLDRGK NKGIFGLFTHFDEQLNETTYSIMVECDQEAPPGFEETAIRKASWAVFDCRGSIPQSIHNG WNYLNEEWLIKYPFKHAPCPELEWYSDGDQTSDTYKAQIWIPIIEER >gi|229784118|gb|GG667617.1| GENE 5 3168 - 3641 374 157 aa, chain + ## HITS:1 COG:no KEGG:Clole_4115 NR:ns ## KEGG: Clole_4115 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 7 157 5 154 154 110 41.0 2e-23 MGETCTYDDFLVTVEKDNQELVRELHEELTKLGCRAEIRQAKSGFVVSYLKNKKTILNYV FRKKGLMARIYPNHMADYMEVFDDIPEGVAKAIQDAPVCKRMLDPAACNSRCPMGYDFVL KGQRLQKCRSSAFLFLLSEENRAFVKTLLLNEANITA >gi|229784118|gb|GG667617.1| GENE 6 3682 - 4191 411 169 aa, chain + ## HITS:1 COG:no KEGG:CD1883 NR:ns ## KEGG: CD1883 # Name: not_defined # Def: AraC family transcription regulator # Organism: C.difficile # Pathway: not_defined # 5 169 128 284 284 89 34.0 6e-17 MKNTEPPFRIEKKGSFRVVGYGVRTSNQRGQGRTVIPRHWSSFKADGYEKALWPRANKEP HGLFGINIYNTDTTDPRIFDYLIAVSSDTEADGEFTSYEVPARTWAVFPCTLETIGKTEA QAITKWLPKFQYKPLNRGYITGRMKSNAPDIEYYGENGLVEVWIAVEKK >gi|229784118|gb|GG667617.1| GENE 7 4207 - 5937 2119 576 aa, chain - ## HITS:1 COG:PAE1423 KEGG:ns NR:ns ## COG: PAE1423 COG1966 # Protein_GI_number: 18312627 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Pyrobaculum aerophilum # 1 513 1 492 618 252 36.0 1e-66 MNGLTMMILAVAVLGGAYLIYGRYLAKKWGINPDSRTPAYEMEDGVDYVPADTNVVFGHQ FASIAGAGPINGPIQAAMFGWLPVMLWILLGGVFFGAVQDFASMYASVKNKGRSIGYIIE LYIGKLGKRLFLLFTWLFSILVVAAFADIVAGTFNGFDANDATVSANGAVATTSLLFIAF AVGLGFYLKYTKFPKMLNTLFAIALLVLAVGLGLAFPVYVPQSAWLIFVFIYVIIACVTP VWALLQPRDYLNSYLLIAMIVGSVLGICVYNPSMNLPSFTAFKLTAANGSVSYLFPALFV TIACGAVSGFHSLVASGTASKQIKNEKNMLPVSFGAMLLESLLAITALIAAGFVATQEGL 
PAGTPPQLFAQAISIFLTSIGLPESVCYTLITLAISAFALTSLDSVARVGRIAFQEFFTD DSVAPENQTMLNKILTNKYFATILTLVLCYALSRAGYASIWPLFGSANQLLSALALIACA VFLKKTKRQGAMLWGPMVIMLGVTFTALALKITELVTALSGQFVFGNALQLVFAVLLLIL GVIVAFEGIKKLIGKDETAENEEASGSKKTAGNVVA >gi|229784118|gb|GG667617.1| GENE 8 6188 - 7405 1523 405 aa, chain - ## HITS:1 COG:BH1663 KEGG:ns NR:ns ## COG: BH1663 COG0133 # Protein_GI_number: 15614226 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Bacillus halodurans # 10 397 5 393 399 426 53.0 1e-119 MDYRTYLKNYPDKDGRFGPYGGAYLTDELIPAFEEIADAYQTICHSSQFINELRRIRKEF QGRPTPVYHCERLSKSIGNCQIYLKREDLNHTGAHKLNHCMGEGLLAKFMGKKRLIAETG AGQHGVALATAAAFFGLECEIHMGEVDIAKQAPNVTRMKILGAKVVPVTHGLKTLKEAVD SAFESYAKNYKDSIYCIGSALGPHPFPLMVRDFQSVVGYEARDQFYEMTGMFPDEVCACV GGGSNSIGMFIPFLDAPVDITGVEPLGRGEKLGDHAASMKFGEKGVMHGFESIMLKDENG EPAPVYSIASGLDYPSVGPEHAFLHDLGRVKYETVSDEQAMEAFFKLSRYEGIIPAIESS HAVAYAMRRAKEMRQGSILVCLSGRGDKDIDYVVEHYGYGDQFLD >gi|229784118|gb|GG667617.1| GENE 9 7594 - 9816 2129 740 aa, chain - ## HITS:1 COG:BS_topB_1 KEGG:ns NR:ns ## COG: BS_topB_1 COG0550 # Protein_GI_number: 16077493 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Bacillus subtilis # 2 584 3 575 575 709 61.0 0 MKSLVIAEKPSVARDIARVLRCGKNLNGALEGDQYIVTWGLGHLVTLADPEDYDPKYKTW RMEDLPMMPDQFKLEVIKQTGKQFSAVKAQLQRKDTGEIIIATDAGREGELVARLILEKA GCRKPIKRLWISSVTDKAIKDGFAKLKNGHEYDNLYDAAMCRAEADWLVGINATRALTCK YNAQLSCGRVQTPTLAIIASREEEIRNFVPKPYYGIAAKSQTPPLTLTWRDEKSKSFRSF DKERVEGIMKKIEAGSGQGLVTEVKRIPKKTYAPQLYDLTELQREANRKFNYSAKETLNI MQRLYENHKVLTYPRTDSRYLSSDIVPSIKERLEACGVGPYRKLAGRLMNQKIAAKPSFV DDKKVSDHHAIIPTEQFVQLDHMTVEERKIYDLVVRRFIAVLYPPFEYEQTELTVELAGE NFAAKGKIVKNAGWREVYDGQDDDEDDEEAAERLDIRAQKLPDLKRGDRIPKLSVTMTEG KTKPPAPFNEATLLSAMENPTAYMESKDKAMAKTLGETGGLGTVATRADIIEKLFSSFLL EKRGKDIFLTSKAKQLLTLVPEDLRKPELTADWEMRLSKIADGKLKRAAFMKDIRAYSNE LISQIKSGDGQFRHDNLTNTKCPVCGKRMLLVKGKNSEMLVCQDRECGHRETIARTSNAR CPVCHKKMELRGKGDSQIFVCKCGYKEKLSSFKERRAKEGAGVSKKDVTRYLNQQKKEAE 
APINSAFADAFAKLNLDKKK >gi|229784118|gb|GG667617.1| GENE 10 10099 - 11127 440 342 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 [Alteromonas macleodii ATCC 27126] # 69 322 63 313 327 174 38 1e-42 MKSIKKTLGIVLTAAAVVTAGSIFVSVAGNHYSDKIRIKMAHNQSKDSEIADCIAKVAEF AAEDPSMNMEIDIYPSGVLGTEQSAVEMVKAGVLDMAKVSSSMLGQFNDCYSIFSLPYLF TSEGHYYNAMEKSEKVRELFRMNEDEGFLVIGYYANGSRNIYLKEDIKADSPAVLKGKKI RSMTSSTSMRMLELMGASPTPMAASETYTALQQGVVDGAENTELALTVDKHGEVAKSYTY TEHQYTPDVYIISTKTWNRLTEEQQNYLVECLNRSNENFKNKYRQMMEEAIEEAKGMGMT IYYDIDKTAFIEAVQPLHEEYKAKGAFYQELYDDIEQYAGEE >gi|229784118|gb|GG667617.1| GENE 11 11127 - 11597 242 156 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 1 145 4 150 164 97 31 1e-19 IMKTLNKILGYLIAFLVGLMVLGCCWQVITRFLLNNPSKYTEEFLRYALIWMTMLGVPYA YGQERHISINIITKTFSLKGSLFTKMVIEIIVMILCVTVFIAGGIMVTMNSAGQISPALQ LPMPLYYVGLPICGVLTLIYSADRLIRFARQLKEAA >gi|229784118|gb|GG667617.1| GENE 12 11598 - 12905 1077 435 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 1 435 1 433 435 419 49 1e-116 MELSAALMLVIVFFLMLAYSVPVSYSIIASALITIIAFLTPTFGMFVSAQKIAGGMDSFT LLAVPFFILAGLLMSSGGIARRLINLAMLVLGKVPGSLALTNIAGNAMFGSISGSGVAAA TAMGGVLNPLEKEEGYDPGFAAAVNVATAPVGQLIPPTTAFIVYSAASGGTSVAVLFMAG WIPGLLWAGLCMLVAFLYGKKHHYVYQTEKLTGKVVLKTVWDAIPSLFLIVIIIGGILSG YFTPTEASGVAVAYAFFLSVVLYKSIKLKDIPGILMETGLMTTIVMLIIGASSVLSFVMS FTGLPEAISHMVLGISNNKYVILLILNIFLLIVGTFMDMAPALLIFTPIFLPIVTALGMT PVQFGVMIVMNLSVGTITPPVGNVLFIGCSVAKLEVEDVLKRLLPFFAAIFAALLFITYV PAFSLWLPSILGLLD >gi|229784118|gb|GG667617.1| GENE 13 12956 - 13846 930 296 aa, chain - ## HITS:1 COG:BH1906 KEGG:ns NR:ns ## COG: BH1906 COG2207 # Protein_GI_number: 15614469 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 30 281 36 287 299 94 25.0 2e-19 MEAVIYYARTDGIELERMIRKGRFNMHVKHFHNQYEIFYLLEGERRFFFDNRSYLVKSGS 
LILVDENSIHMTMAGSEQEFGHDRIILYIERDKMKEFDQLFPNLNLVRFFHQHYGVFPLN EKQQKDFIDLYIRLQDEFDSRKRNYRAMIETDIIQYFIRFMRENHTPAMEDTDVNAPAKY RNIYAVADYISVHFDEAITLDQLADRFFISKYYLSRTFKEITGYGISEYMNIHRIRAAKR LLEETDMSVLQIAHKLGYESITYFEKVFKTYMTMSPLKYRKTLNTVTYTNGPAEET >gi|229784118|gb|GG667617.1| GENE 14 13952 - 15076 1070 374 aa, chain + ## HITS:1 COG:AGl618 KEGG:ns NR:ns ## COG: AGl618 COG4225 # Protein_GI_number: 15890425 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 50 367 78 387 397 301 47.0 2e-81 MQELKKEEITRKLDLVVNKLLTLGGPENEKELEDGGESIGFFRRDFGISEWDWPQGVGLY GLYKIMMVEKKDEYREFLCSWFKGNMADGLPSRNINTTTPLLTLVQLTELCPDPEFESLC LSWADWLMRNLPRTEEGGFQHVTSANGDRLGVRLNENEMWIDTLFMTVLFLNRMGQKYNR QDWISESIHQVLLHIKYLYDKKTGLFYHGWTFNTRDNFGGVFWCRGNSWFTVGILEYLEM FKGSLDAGVREFIVNTYKSQVHTLKKLQSSSGLWHTVLDDPTSYEEVSGSAAITAGILKG IKLGILDDSCLDCAWKGVRAVMNQIDKDGTVLNVSGGTGMGADREHYKNILIAPMAYGQS LTILALIQALDNLN >gi|229784118|gb|GG667617.1| GENE 15 15187 - 16716 1185 509 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 505 2 523 525 142 23.0 2e-33 MKYRALIVDDERIIVNGIMRLPIWLTLDIEPVPAMTGLQALELMEQSRFDLIITDIRMPE MDGLEFISQMRKVYPDIPTVVLSGYDSFAYAKEAIHYGVVDYLLKPISQLDLSDTITRIF HGIEEKKEARQQEEMLRGTVKQTIRDADAQQLECGLLGGRENDRIDRLLGSRYILAECIP CDMEEENLGGLNLLLENIKQIVLGPRRDTGAVVTLFMGHVVFAVNEETMDFNRLLSEFRD WSEKNEFPFSVSWVLAGPGDILSDRYSLLTRISSYRFYYDEFCILSEKEMKSAEGEPCFE ALEEAVKKGNAESALTEMEKVIDDLRLLNKSPEETRTYCVRLYLILAKYTNEHVRETGIV KLLMTSAYTDFNGLAAMLRQLLTRWETGKSEQNLASFSSIVRKAMLYVHDHLDDEELSIA AVAGTVLFIHPDYFGKLFKKETGIAFTRYVMDLRMDRAKKLIAKRPDIKVYELCEATGFG DNPHYFSQVFRSCCGCTPSEYKRAMGHMK >gi|229784118|gb|GG667617.1| GENE 16 16713 - 18398 
1442 561 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 11 553 28 570 577 244 31.0 3e-64 MMQLVWIDIIIFFLIAGAGVILSQRIYRQDMETETSNVFEASFSNIENTLHFVENIAFNI MNDKTVKEALRAVNQDREDRYPLVDDLYYRLITGSLVANDAVSIIVTDQEGRQYATGSLD DNLTDEEQEKIRFLMEERSSRTEPVWIERGNGQELLYGREIYDTPFGGHNPLGMILIRVN LDKMMAESRQGSLAKGNIVIFYKGTPIYAGDKRIFGSPELLDTVRKEKRGRSGRYFKAES ESEFSGITYLGYERYNEMMTSTNTLMIFFTTAILFILIVSFLATARVVHHITAPIAFLSR KMKAVEQGDFSVQIDRKLLDTDIREFEELAANFNLMTGEIGRLVEDNYLRTIKEKEYQIK VLQAQINPHFLYNTLDAINWLAMDSGRPEISSMVQSLATLFREATKQQQYLIPLSEELAL LSCYVTIQKIRLEERLEVAFSIPEECMNLKIPKLSLQPILENSVKYAAEVTMKPCLIEIR ARKKGNRLILCVWDNGPGIPEPILTQIRERRVPGNTGIGLLNIEERIQMLGEPGSGLKIY SAKGRGTWITLCICQPGGECS >gi|229784118|gb|GG667617.1| GENE 17 18468 - 19550 919 360 aa, chain - ## HITS:1 COG:SP1686 KEGG:ns NR:ns ## COG: SP1686 COG0673 # Protein_GI_number: 15901521 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 1 345 1 347 367 285 38.0 1e-76 MIRIGVLGCGYFGAEFARAVKEMKGIQLAAVYSPGKSSERLSSELGCPRASSVNEIMENA DIDAVIIATPNYLHHQHVLAAAEAGKHIFCEKPFALDAGEAAEMVEACRKAKVTLMVGHI MHFYHGIARVQAMIQEGFFGDILTMHIERTGWEQKKPEVSWKKMQEKSGGHLFHHIHEID IMQWIMGVPTEMYGAGGNLGHRSAEFGDEDDVLLLTASFGGNAFATLQYGSGFRVGNHFI RINGTKAGAVIDFKHAEVRVVSDAGETVFPLFEDEKSAAAICELFARTDGGISYGSPEER PPEYILVSLRKELELFLHVLNGGDIPEDKKDLFDGSSAVHSVQIAQMGCNAVQRHGSRPV >gi|229784118|gb|GG667617.1| GENE 18 19568 - 20464 655 298 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_4387 NR:ns ## KEGG: GYMC10_4387 # Name: not_defined # Def: oxidoreductase domain protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 1 298 3 298 305 292 46.0 1e-77 MNLGIIGLDSSHAVQFSRILNGRKEPFHIGGHPVTAAYPGPVSSDFDMSRDRMENFTREV TDDWGVKLYSSIEEVMNNSDAVFLEQVDARRRLTQFREMVGFGKPIYVDKPFALSTEDCK 
AMLKLAEEYRVPVLSTSSLRFADSLTAALKQCEWHSVVGADFFGPMPFTPTQPGYFWYGV HMADMLYRTMGTGCSKVSVIHTSQHDIITGVWKDGRIGNIRGNRCGNGNFGGTVYHESGS VFVDVSRDTRGYYECLVEQIIKMFDTGQSPVTNQEMMEVVRFLEAAGESLTTGREVVL >gi|229784118|gb|GG667617.1| GENE 19 20469 - 21566 839 365 aa, chain - ## HITS:1 COG:SP1686 KEGG:ns NR:ns ## COG: SP1686 COG0673 # Protein_GI_number: 15901521 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 1 365 1 364 367 278 39.0 1e-74 MTGYGIIGCGYAGGIHADNLAKMPGTRLAAAFDTDYERAKAFAQERGAYTASTIEELCSN REVEAVIITTPNNSHRIPAICAAKAGKHVFCEKPVALSFEDTKEMIDTAEKAGVYFFAAH TTNFIRGVQTAKALLASGAVGELLMIEAVHTDWAGPQKFTGWKQRKDISGGHLYHHMHEA DLICQLAGLPVSVYANGKNLVHYGAGCGDEEDAIFLNIELSGGGFASLTIGSAFHLGDHF VKLQGKTGGILLDFKNSVVRLENDQGEKLYSMQENEEEDEERRNGYRKHKADAGKGFGKP GMKTSSWMSTIFYKELKCFNDIIRTGNTAPEYEPLLDGSAALNCVKVLDGARESMALRLP VRLEE >gi|229784118|gb|GG667617.1| GENE 20 21596 - 22435 790 279 aa, chain - ## HITS:1 COG:PM1760 KEGG:ns NR:ns ## COG: PM1760 COG0395 # Protein_GI_number: 15603625 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Pasteurella multocida # 12 279 11 277 277 238 50.0 1e-62 MRTRLHKNSWTILAYVILTAAVLLAGGPLVIVIFASFKGQFEIVSNTAALLPEQFSLYWF KDLLHKTDFLIYVKNSLLYSSLSTVLGMTISSLAAYAITRFGGKKQKILTRVIVMTYMFP PILLAIPYFILVSNLGLANTLPGLLMAYMSFTVPFCTYLLVSYFRTVPKDIEEAAMIDGL RRPGVFLHIALPLVAPGLVATAIFAFINAWNEFLYSLLIISSGDKQTVSVGLYSLKGGME TLQWGDMMAASVLVVIPSIVFFSIIQKSIVGGLTAGADK >gi|229784118|gb|GG667617.1| GENE 21 22435 - 23370 876 311 aa, chain - ## HITS:1 COG:SP1689 KEGG:ns NR:ns ## COG: SP1689 COG1175 # Protein_GI_number: 15901524 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Streptococcus pneumoniae TIGR4 # 29 309 11 291 294 261 48.0 1e-69 MRQHRNRQIAGKSVGDRIADFVDRHMCGVFITPAFLVTILLLAYPICISVYYSFTNKSLL 
GKAVKFVGFNNYVSVLQNPDFYHALWNTLVYTIISLSLQLFFGFIVALSLHKINRFKGLF RTLVLIPWAFPMIIVTFTWSYLLNDLYGIVNAKLLQWNLIAQPIQFLANPKIAMLTVSLI NVWFGVPLFAINILASLQTISRDLYEAAQIDGASPFQRFRFITLPFVRVVVGLLIILRTI WIFNSFDLIFLLTGGGPGTSTETVPIFAYRMGWTLYSLGRSSAVTILLLLFLTAACLIYF KILDHWEEEVA >gi|229784118|gb|GG667617.1| GENE 22 23387 - 24787 1434 466 aa, chain - ## HITS:1 COG:PM1762 KEGG:ns NR:ns ## COG: PM1762 COG1653 # Protein_GI_number: 15603627 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 34 461 15 445 451 293 37.0 4e-79 MKKRTALIVAAALAASALAGCSGGNGQKETGAPAAESTAAAAGQESETVDPMAAITDDGT KKTITYWHTFTEGARKDYIDEMVAEYMKEHPNVTINVEVYPWNVFTQKWTAGLAAGALPD VSSVNPDNLFAINQAGAVLPMNPLIDALGEDYFLGKPLDNFTDGDTILGIPYYLHSYVMW YRKDVLEEKNLSVPKNWDELLSTVTAVADPPNRYGFTIAMSKNDQMCAQLLSCFAKAEGA PLITEDNKGNLTDPKVVKAIKLMADLYKAGSPEGSVNYGLSEQNDVFFQGKSTFAFGSGF HINGVKQNSPELLDKIAAVPFMGPDGENLGATTFVIGLMGWNQTKYPETVYDFIKFHFEK ERYIDFLHLIPGGQLPATTEAANSAEFYDDETIQKFKPEIEYIQEGIEKGTFIGMDYGLT PAAAALTSQGIIEECLQDIVLNGTAAEEAAQKANDKLNAAIAILME >gi|229784118|gb|GG667617.1| GENE 23 24995 - 27043 1862 682 aa, chain - ## HITS:1 COG:VC1348 KEGG:ns NR:ns ## COG: VC1348 COG3437 # Protein_GI_number: 15641360 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Vibrio cholerae # 5 334 73 410 441 216 38.0 1e-55 MKNKILIVDDAEINRSLLADILSVEYDIMEASNGMEAVSLLNQYHADLSLILLDIVMPEM DGFEVLAIMNRNGWIQSIPVITISAETSSSYIEHAYDLGAADYISRPFDEKTVLRRVSNM LMLYTKQKILEGMVTEQIIDKEKNNFLMVEILSNIVEFRNGESGLHVLHIRVITEMLLKK LVEKTDRIQLSLSDIALIVNASALHDIGKISIPEEILNKPGRLTPEEFEIMKTHAAIGAQ ILKNAPNRQTEKLIQVAHDICRWHHERYDGKGYPDGLKGDEIPIAAQVVSVADVYDALTS ERVYKSAYSHEKAMEMILNGECGAFNPLLLDCLVEVGDRILKELDLHSLDGMSDIEMEGI ASELKSHNELRASNRTLMLLEQERIKYRFFASMSQEVQFEYNIHSDLLEISEWGAKYLRI PELIAHPEQNDELARVFDRRDFADLKKKLRATTPGEPVFSGSYQVIIDGKARWFKVVART 
IWDSDEDTLYSGTIGKFIDIHEEQMELTNLKRLASQDSLTGLYNHKAAKEMIENMIVQRE NKEYAVILIDVDYFKSANDQHGHIFGDEVLKYISQKMRQNIRRDDIAARVGGDEFLIFVE YRGSVMPQIERIFSSLTDSFQGFQISVSMGVACCPENGENYEVLFHCADQALYSAKKNGR NQFCLYDQSIQGFLSVLSPMDN >gi|229784118|gb|GG667617.1| GENE 24 27074 - 27430 319 118 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20567 NR:ns ## KEGG: EUBELI_20567 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 115 1 114 114 110 51.0 2e-23 MTVKDCYEAIGADYQEVLRRLANEERVKRFLLKFPGDDSFSGLVKAVEEGDYETAFRAVH TLKGICLNLSITRLAASSSALTESLRSGGWSEEAGRLFEQVKEEYAQTVQEIKKLEND >gi|229784118|gb|GG667617.1| GENE 25 27595 - 33390 5249 1931 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1804 1911 510 614 693 101 43.0 2e-20 MYSLSRWKTRMTACILSGVVAFGTLFPVPAEAAGKTAVYDGREIVDLAGNSEMKAYGERL VTGGKASVSRDYGGEAGAGISYYLDSESGNDENEGTSEDKPWKSLKMVNNRTFQPGDRIL LKAGSVWKEETLSPKGSGEEGNPIVIGAYGDGDKPKLEGNAAVPEVVSLYNQQYWEISDL EITNTAAGFTGAMNDQNGNKLKDVRGIRIAGQDGGDLNNFYLHDLHVHDVTGVCAWISGD SNGKPGILGKQGWDKSKRTGGIVFEVMEPATDQPTVFHDITVERNVVNNNSFGGIIIKQW KGADTGGEHWASREEGKGNAANNYFCENWKPHTDVTIRDNYLSQENSDYACNTIYLTSVQ GAVVEGNISKEAGTCGIEMYYADDVTVQRNEVYKTRVKAGGADSNAIDPDKASTNILIQY NYIHDTGDGILLCGFIFGTSVVRYNVIKDAQKRYINPHGDKGVNYVYNNILYNTIEKSNL PFIESSGGNSVYLNKSGNMHYFNNNIFYNTAKTTTSVGIGEGTSTSYDSNCYYGKGVQAP EQETHAVCGDPMFEGELTGSGEEGPEELLKLRLKTESPLLNAGIEVEDDPLLSVSANPGT DFAGDPVFNGEPDMGIFEYQGEAGSGILNGYVEDPYGNIMKGASVKLKDTDYAAVTNEQG FFAIAGVKPGSYTAVISKEIYLDGETPDITVEEKAVTRVQLKLGESLSDVGSVKGCVSNS SGPLQGVTVTVSLEDQVYTVKTRNDGCYEIPDVKIGTGYEVTASKEGYKTATEKDVRIMP AAVSEVNLILSREVGKTTYLIADLFDQYDTGAFTNNSRGNELWKVSDISAADSAKASLQI KAEANGNKYLEMAKNGSYNLYAYTKKEYDLNGIITVEARVKRTKETASPNQFGMYSFNKA DFKTGDPTSSSNAIGTFALSKGNIITHNKKGSSSTVNVQPYEKGRWDIIRNVINLDTNTF DFYVNDMNVPKLADQPLRTQGKNIDRFEFFSSSTNGGDLLIDYFKVCTGPAMDYNDAGLS 
GISVDGKEAEWKGENIYEMQIPSGSSEVKVQPSANSIFVKKVMVGGKDATKEAVSVPVPE AGEPLLITVLAEDGETEERYQLNLVKEDVSGLAYLTNLTIEGITLTPEFDFNTMEYEGNV PSDVHSVTLQYETVQASNEVTVKVNNQEVTDRVIPLKPGVNVIEIGIASADGTSFADYTV TVTRDCVIDEIQIDTLPVKTVYERGEEPDFEGLTVGAYCEGDRVRSLEAAEFTVSMPDTS KAGTVSVTVTYETEDGKTLEASFDITVYDRDTMQPQSIKVVKQPEQTVYGTGEAFNPDGM EVRLLMKATASNAAPAEVRILDDGEYDVMDDFTEPGDTMVTIRYSWTDSNGEERYLEDQV AVTVYDEELEYYQTGIKVTKQPKKTVYETGSIFDPDGMEVKRIMKASGSNADYYQETITD YDYETAELTGTGSRKIAIRHEGTDENGDEQTFKAVVTVTVTNREDVLNNAVLKQSVAEAG EKLNKEYVAYMTDEEKDSAANAAKNVFLDVMGAEHTVLTKDMADRMAELEALLKNAYPDI TVRIEGDSSLTEGAKVTGTLLNAPFGQGNVRMVPQITKTKLPEDIPQTKAATAMEIKLLA NGEELQPDIPLYLTLKIPEGLDRKDLIVYCELDDGSIKAVNPEISGEFLHFFVRSAGTFV IANPKENDLTEIEITAKPKKTEYRINESFDAAGLVVTAVYEDGTRTVVTGYEMAGFDSTV AGTKTITVFYQGKTAEFTVMVKADEDGPDDDDKDDDHDKDDDYNSDHGSHGSSGGSSAVR TVLPDTVKGNWRQDENGWWFETTDGGYVVSDWARINEKWYYFNEQGYMAAGWVLDQNHWY YLGEDGVMADYTWILDRGTWYFLQSGGVMAADQWVLWNGNWYYLSSDGSMARDTVITVGY RIDEDGVWRPN >gi|229784118|gb|GG667617.1| GENE 26 33415 - 34266 735 283 aa, chain - ## HITS:1 COG:BS_yesQ KEGG:ns NR:ns ## COG: BS_yesQ COG0395 # Protein_GI_number: 16077766 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 12 282 27 296 296 247 43.0 2e-65 MGMKRKRQLSSVIFHAGACILGFFMIYPLLWLLASSFKSNETMFTNTYSLLPEVWDAAKN YASGFAGIGGVAFSTFFMNSMVVTVVGTVTCVLSSLLAAYTFSRLKFRFSNFWFGCVMMT MMIPAQVMVVPQYIILKKIHLIDTRVALVLPWCFGGAFFIFLMVQFFQGIPKELDEAAEI DGCGKIQTLVRVLIPIVRPAIITSSIFAFYWIWQDFFQPLIFMSSSTKYTIPLALNMYLD PNSYNNYGGLFAMSVISLIPIVVFFIIFQRYLVDGIAMDGIKG >gi|229784118|gb|GG667617.1| GENE 27 34270 - 35163 1112 297 aa, chain - ## HITS:1 COG:BH1118 KEGG:ns NR:ns ## COG: BH1118 COG1175 # Protein_GI_number: 15613681 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 3 294 23 313 318 301 52.0 1e-81 MIKKRYKDNKVAYLLLAPWLVGFIGMWLIPMCISIYYSFTDFNLLNDPQFIGAANYVRAF 
TQDKTFYKALKVTFLYVLLLVPLRLAFALFVAMLLNKKHRGLGLYRTLYYIPSIIGGSIA VSVVWKQIFGNKGVIMTLLGVFGITQKTSLIGNPKTALYVIILMGVWQFGSSMLIFLSAL KQIPGSLYESAKVDGANGFVTFWKITIPMLTPTIFFNLILQIINGFRVFTESYVITDGGP LDSTLSYVLYLYRRAFTYFDMGYSCALAWVLVAIVSVFTIILFKTQKNWVYYEAGRE >gi|229784118|gb|GG667617.1| GENE 28 35265 - 36653 1444 462 aa, chain - ## HITS:1 COG:BS_yesO KEGG:ns NR:ns ## COG: BS_yesO COG1653 # Protein_GI_number: 16077764 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 47 462 3 412 412 293 38.0 3e-79 MRKMKRLLALGLAGMMVLSATGCGSSDKTKGSAADTTQAAATEGTSAGGTTAAAASSDRE DVTLKFTWWGGQTRHELTQQLLDKYTELHPNVTFEAVPSGWDGYFDKLATQAAAGAMPDI VQMDYLYISTYAKNNSIADLTPYIEDGTIDTASIDENLMKSGEIGGKTVGLVYSSSLLAM GYNPEVLEEAGVEVPSMDWTWSDFIELCKTVKDKTGKLGSATGPVDDTNLFNYWVRQHGA SLFSDDQKSLGYDDDQVCIDYLNMWSELMAAGAVPNPDEYAAIQTLGQEAGPVVTGEGAT MIEWNNYANKVASVNDKLKITSLPLSDNTEYKGLWNKPGMFLTISETSPHKKEAAEFINW FVNSEEANDIMMGERGTPASSAVREHMKSTGKLSPQQEEMFNYLDITADICGETPAPDPI GMSEINEAFKNAAYSAFYGQVSAEEAAKAFREEANAILVRNN >gi|229784118|gb|GG667617.1| GENE 29 36821 - 38377 1827 518 aa, chain - ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 1 156 1 156 368 146 44.0 1e-34 MYKLLIVDDEDVEREGMADFIQWEKYGISLAGTAWNGVEGYEEIQNQMPDIVITDIKMPV MNGIELIRKTREDFPDIVFIVLSGYGEYEYTSRAMELGVRHYILKPCDEAKIVEVIGKVT EELKVREEKKRKEVDMRTQVKRLAPRAREQMFRNVLLNREDIHKDFEAYLGGCGIFSKEV FLLAFHKETWFDYLEQFVVSNIITELLGSGHVLLVTSIRNDVLLLLDGRVFDEIQPVIVK VKKEYEKFDSSPLQTAVSGRGSWGQIHGLYEEVSGLLKMGKIEADTDALWHGFYKDINRE ELMLINYRALKAAKSYDTILFELYLGFVKMRLRGDLAGRKEAVILYVLDMLYEDEDNERG MKEKLWNGEGRSGEGSQNAGDDRELLKLLADYISEKQGFYPGGDKEERRMRDILLGIYQN ISRQELSIQWLTKEVLYMNQDYFGRVFSKCMKQKYSSYLLNVRIELAKRLIWYFPEIKLS QIAEEVGYPADGQYFSKTFRKSVGMSPSEYKDYIQKQA >gi|229784118|gb|GG667617.1| GENE 30 
38380 - 40134 1779 584 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 1 579 1 571 577 270 29.0 7e-72 MRKRWDQWVERFRNTSLARKMMVVYVAIFGILCSITLLAMQVTLSIYDGQLYHKSLEELN FFTERVDGELKKVEELSFDLAMDYEIQNQLSIIKSAKNKAEYSFQMSKLRSRLTTEALSA ELVSEFIYTDKFLTRYEVGKRYVELPVGIYDSLLEKFDKAKGAYVYENPTEEFPYLVSGR DIRKHIDASLEYLGSLIFVSDIRVLIEKHLNQLKTDVSSLCVTSDSGLIYQSEDGLYEKI RQVLPESDYEIINLQGRKYFLCQLTSAETGWTYVNMFPYSSIYFMNTMVRYCLIGGFLVL FVAAMLILKRLSTAITRPLNRLTDSIQIVEAGEFQKAKEYLEGEKRGDEAGILTQEFQIM LDKIMVLIHENYEKQILLKDTQYRALQAQINPHFLYNTLNSINWMIRSGKNEEASEMLIS LGDLLRAAFKKEQLATVLEELELARHYIDIQKVRYRRRAEFLVEAEGDLEAYHVPRMILQ PLVENSINHGVDNSLNKCRILVRVKENEEAQTMTLEVADDGPGMTEEELAAVRNFTMVPK GNGIGLKNISERLNLIFSDAVFEISSQPGKGTVVRIEIPKYGGK >gi|229784118|gb|GG667617.1| GENE 31 40097 - 42160 1696 687 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1618 NR:ns ## KEGG: Cphy_1618 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 671 1 706 714 728 50.0 0 MGTFLLDKSVTADMGTMKDRGAVRRAVERFYRDLDMVVDGDLAFGGGIELCEESMEPEQY RILVEEKRMVICAGDELGFVYALLHLSEAYMGILPFWFWNDQKFEKREVVEIPVGTIASS PYPVRYRGWFLNDEVLISRWDGGVSKEYPWEMAFEALLRLGGNLVIPGTDANSKIYAGLA AEMGLWITHHHAEPLGAEMFARVYPDKMPSFAEYPELFYGLWEEGIQRQKNYRVIWNIGF RGQGDRPFWDDDPQYDTPKKRGDLISSLMVRQYELVKKYVENPVFCTNLYGETMELYQQG FLTLPEDVITIWADNGYGKMVSRRQGNHNPRVPALPPAELRDGNHGTYYHVSFYDLQAAN HITMLPNSMEFVEKELKIAYDCGIKSLWVVNCSNVKPHVYPLDFIGNLWRGNSLTAEEHR RRYMESYYAYGQRIDGNDLLEPLQDLFEAYFDCTVPYGNEEDERAGDQFYNYVTRDFVCR IMKDGGRRPEPELNWFAPGESLEEQMEHYRKTIAGGKNAFEPLLEACSRMAEQAGPLMED SLLLQVKIHTYCLRGAVSFCDGYEAYAGKQYLKAFYLFGRAADWYEAADRSMREREHGKW KGFYENDCLCDVKETACCLRRLMGYIRNLGDGPHFYEWQREVTYSEEDKRVVLITNMENH MTDQELYQAMKERESDEKALGSVGGTV >gi|229784118|gb|GG667617.1| GENE 32 42343 - 42609 417 88 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160940069|ref|ZP_02087414.1| 
hypothetical protein CLOBOL_04958 [Clostridium bolteae ATCC BAA-613] # 1 88 1 89 89 165 89 7e-40 MAYNKNDKADGAKFRRGGMRRRKKVCVFCGEKNGTIDYKDVNKLKRYVSERGKILPRRIT GNCAKHQRALTVAIKRARHIALMPYTME >gi|229784118|gb|GG667617.1| GENE 33 42655 - 43107 368 150 aa, chain - ## HITS:1 COG:FN1304 KEGG:ns NR:ns ## COG: FN1304 COG0629 # Protein_GI_number: 19704639 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Fusobacterium nucleatum # 1 109 1 106 154 98 46.0 5e-21 MNRVILMGRLTRDPEVRYSQGERAMAIARYTLAVDRRGRRNQDGNEQTADFINCVAFDRA GEFAEKYFRQGMRVLISGRIQTGSYTNKDGIKVYTTDIIVDDQEFADSKGAASGEGGGYQ PTSRPAPSSAIGDGFMNIPDGVEDEGLPFN >gi|229784118|gb|GG667617.1| GENE 34 43130 - 43417 384 95 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881892|ref|YP_001560860.1| ribosomal protein S6 [Clostridium phytofermentans ISDg] # 1 95 1 95 95 152 70 4e-36 MNKYELAVVLNVKLEDEERAAAIEKITNYITRFGGTVTNIDEWGKKRLAYEIQKMKEGFY YFIQFDGDSTTPNELESRVRIMEPVIRYLCVRQEA >gi|229784118|gb|GG667617.1| GENE 35 43404 - 43523 57 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLFLPISDKILIRDTLNSLLLKLKAGARDHKEVKKHEQV >gi|229784118|gb|GG667617.1| GENE 36 43571 - 43768 184 65 aa, chain - ## HITS:1 COG:CAC3725 KEGG:ns NR:ns ## COG: CAC3725 COG4481 # Protein_GI_number: 15896956 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 3 54 5 56 65 67 55.0 7e-12 MNFEVGDIVKLKKQHPCGSSEWEIMRVGADFRLKCMGCGHMIMVPRKLVEKNARGLKGAD GVEKR >gi|229784118|gb|GG667617.1| GENE 37 43793 - 44782 1158 329 aa, chain - ## HITS:1 COG:STM2406 KEGG:ns NR:ns ## COG: STM2406 COG0667 # Protein_GI_number: 16765732 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Salmonella typhimurium LT2 # 1 329 2 329 332 419 62.0 1e-117 MYQAAEDRYEKMEYKRCGRSGVRLPRVSLGLWQNFGLEKPLEDQEAILCRAFDLGVTHFD LANNYGFPAPGKAEENFGEILRRGLGAYRDEMIISTKAGYDMWPGPYGNWGSRKYLMASL 
DQSLKRMGLEYVDIFYHHRPDPETPLEETMGTLADIVRQGKALYVGLSNYSRQEAEKAVL LLKDMGIHCLIEQPRYNMFERTPEDGLFKAMWEHGVGCICYSPLAQGALTDKYLNGIPQG SRASRVGSTINDRYLSEEKLDQIRRLSEIASGRNQSLAQMALAWVLREKAVTSVLIGASS VKQLENSVHCLDNLEFSAEELKTIDEILK >gi|229784118|gb|GG667617.1| GENE 38 44864 - 45595 817 243 aa, chain - ## HITS:1 COG:CAC0284 KEGG:ns NR:ns ## COG: CAC0284 COG0846 # Protein_GI_number: 15893576 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Clostridium acetobutylicum # 6 237 9 241 245 296 57.0 2e-80 MEAQEQLKQWISESSRIVFFGGAGMSTESGIPDFRGVDGLYHQQYQYPPETIISHSFYRQ NPQEFYRFYKNRMLFPEAEPNRAHLALAKLEAEGKLKAVITQNIDGLHQAAGSKEVLELH GSVHRNYCTRCGKFYSQEDILNMDEPDGIPRCSCGGTIKPDVVLYEESLDQEVLSRSVEY ITRADMLIVGGTSLTVYPAAGLIDYYRGNRMVLINKTVTPMDSRADLVISGQLGEVLGEA AGV >gi|229784118|gb|GG667617.1| GENE 39 45751 - 46389 682 212 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2283 NR:ns ## KEGG: EUBREC_2283 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Pyrimidine metabolism [PATH:ere00240]; Metabolic pathways [PATH:ere01100] # 1 212 1 212 212 241 51.0 2e-62 MENRYIKLQAPGCNVPLKVVPGHFATNHSHINYYIDMTTLKTRVSEAQEVARDLVGLYLY NTFVDTIVCLDGMDVVGAYLAEELTKAGVMSRNAHQTIYVVTPEYNSNSQMIFRDNIQPM ITGKNVILLMASVTTGKTINKGVECIQYYGGKLQGISAIFSAIGELDGMPVNSVFRERDI PDYQTYDFRECPYCKQGIKLDALVNGYGYSKL >gi|229784118|gb|GG667617.1| GENE 40 46458 - 47510 945 350 aa, chain - ## HITS:1 COG:FN1041 KEGG:ns NR:ns ## COG: FN1041 COG4552 # Protein_GI_number: 19704376 # Func_class: R General function prediction only # Function: Predicted acetyltransferase involved in intracellular survival and related acetyltransferases # Organism: Fusobacterium nucleatum # 2 125 3 125 391 75 31.0 1e-13 MIHYLRQEEKNRSRKLWEEIFTEDSQSFVDFYYAEKCRRNRILVAEEENEIVAMLHRNPF DVVVKDRIWKIDYVAGVATAAAWRHQGYMSRLLSRMFADMYMEKMGFCFLHPVNPAIYLP FEFTYIFDQPQLTISEKGMMRLSGRALMEQRTEECEEAARFAGKYLEKHYEVYTMRDEEY YRSLCREVRSADGDLVLLRTRDEKERLTGISAYYGGEETEQRELLVHPDFVKEIAPPKPL 
MMGRIVNLEQFVSVIRLKEDSEVSEMELMIEVSDGQIRQNDGIFRWRLDKNGSKVTRMTE GAIVKPHLSLGIAEMTAWLFGYKMPQIPENRRSMAERIRPLNGVYFDEVV >gi|229784118|gb|GG667617.1| GENE 41 47507 - 48433 1034 308 aa, chain - ## HITS:1 COG:FN0277 KEGG:ns NR:ns ## COG: FN0277 COG4866 # Protein_GI_number: 19703622 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 26 303 22 288 290 108 31.0 1e-23 MDLQFKPVVAKDIERLTPFFALRPNKTCDSVFLDSFIWRDFYHIQCAVSDGKAVQFLMEK DGVTYSAMPMCKEEDLSHYFYEMVEYFNTVLKKPLRIYLADEEAVNDLKLDPEKFEVKEQ EDLKDYLYDAGAMRSLSGKKLHKKKNHLNAFLREYEGRYEYRTLCCSDRDDVWEFLDRWW ENKVEAAEFGTQLDYEVNGIHDILKNCSQLCVRMAGVYIDGKLEAFTIGSYNEFEKMAVI HIEKANPEINGLYQFINREFLVRSFPEEMILVNREDDVGMEGLRRAKMSYNPIDFARKYS IEQKDFKG >gi|229784118|gb|GG667617.1| GENE 42 48481 - 48888 389 135 aa, chain - ## HITS:1 COG:no KEGG:Closa_0226 NR:ns ## KEGG: Closa_0226 # Name: not_defined # Def: Zn-finger containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 135 1 134 134 160 59.0 1e-38 MMNKWREKFAGFMVGRYGADRLGQFLIGVSLFFLLIGIFVRRPWVDILAFACLILCYFRM FSKNIGKRYKEEQVFEGLRYSVTEKFRKYKFRIREKKEYHIYKCPNCGQKIRIPRGKGKI SIHCPKCNTDFVKKS >gi|229784118|gb|GG667617.1| GENE 43 48910 - 49053 60 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870113|ref|ZP_06409635.1| ## NR: gi|288870113|ref|ZP_06409635.1| hypothetical protein CLOSTHATH_01119 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01119 [Clostridium hathewayi DSM 13479] # 1 47 1 47 47 68 100.0 1e-10 MELIQLHFSTAFFLDVSVLERKKSKKIVKHQMTRLTFCSDFIIVKSL >gi|229784118|gb|GG667617.1| GENE 44 49087 - 50247 1063 386 aa, chain - ## HITS:1 COG:BS_yisQ KEGG:ns NR:ns ## COG: BS_yisQ COG0534 # Protein_GI_number: 16078145 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus subtilis # 1 383 67 455 455 227 33.0 2e-59 MLLLFNIVSSAALILISQYLGAGKEKKAVQFYSLSLFLNGSASFLLSVLFILAARPLFTA MRVPGEALSDCILYFRIIAVSFFVQALFVTFSAILKAHAFMKESMAVSVVMNLLNILGNW 
LLIPSMGVAGVALSSTASRMAGLGFLIFFFVKRLHTPVGLSILKTVGLKELRQLLSIGLP TGGEGLSYNLSQMGILIIVNSLGTYAVNTKIYVGILANISCLFAVSISQAVQVIVGHAAG AQDYDYARQRVRSCLGYATAASFVMSVLLYALSGPIFSIFTDYPEILALGKQIMLVEIVL EFGRAVNNVMIRSLQVSGDVRFPTFLAITCQWSVAFGLSCLFAFGFGWGLIGVWLAMAAD ECIRAFVLMWRWHTGAWQRMGFAAHA >gi|229784118|gb|GG667617.1| GENE 45 50441 - 51562 1327 373 aa, chain - ## HITS:1 COG:no KEGG:Xcel_1188 NR:ns ## KEGG: Xcel_1188 # Name: not_defined # Def: extracellular solute-binding protein family 1 protein # Organism: X.cellulosilytica # Pathway: not_defined # 1 370 209 583 587 268 38.0 2e-70 MQDAHPTAENGKKVYGFSLFKDWDSKWMRCASICTWFNGYSESNTGFLFTNGDASDYTQL DDENGIYYKMLETFFKANQMGLVDPDSSAQTFDAMQAKVGEGEVLCHFWPWLTIDNYNTP ERVEQGVGFAYLPIEDQRFVVEGYNPYGTDDLVICMGSKCKNPERVFEFLDWFSSPESMM INYGGPEGLAWEKVDGKPVRTEYGRTAETINAPVPEEYGGGNFKDGKCQFNWCIPFWLDK DPATGEIFNKALWTSTVKNERNRVMDEWTETFGTELPTQYLVDNDLMEIIPGSPYICPVD SSDINIKRSQCGTAVKEASWKMVFARDKEEFDAVWQDMKTKLDGLGYNDVIAVDKANVEG FYQARKAVIEENQ >gi|229784118|gb|GG667617.1| GENE 46 52502 - 53104 708 200 aa, chain - ## HITS:1 COG:no KEGG:Xcel_1188 NR:ns ## KEGG: Xcel_1188 # Name: not_defined # Def: extracellular solute-binding protein family 1 protein # Organism: X.cellulosilytica # Pathway: not_defined # 1 199 1 200 587 140 37.0 3e-32 MKKRKMAAAFMAALMLTASLAGCGSSEKAESASDPSHGGAGKEEELLTIDMFCAPANYQG VQTGWFGKAIKDACNVEVNIIAPNVAGGGDSLYQTRSAAGNLGDLIVIRKDQLEDCYKAG LLYDMTDFVANSKNLKNYSKSSENLSKLLGTDRIYAFANSSSLLSPMEPAWTAVNPSMAI FSRWDYYKELGKPQLKDMDS >gi|229784118|gb|GG667617.1| GENE 47 53187 - 54092 1018 301 aa, chain - ## HITS:1 COG:BH1912 KEGG:ns NR:ns ## COG: BH1912 COG0395 # Protein_GI_number: 15614475 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 10 300 6 296 297 188 35.0 1e-47 MKEKKKKNLGDRIFDLLLMVFFGAFTLICIYPFYYLIINTISANNLSANGDILFLPKQIH FQNYIDVIQLPGLLNAAFVTLARTVLGTLGTLMGSAFLGFMFTQEKMWARKFWYRFVMVT MYFNAGLIPWFLTMNNLGLTNNFLAYVVPTIVQPFNIILVKTYIENIPKELQEAAEIDGA 
GILRVFWQVMLPITKPILATITIFAAVGQWNSFQDTLLLMTNEKLFTLQFILYRYINQAS SLSALIKNTSSTQMLQSLATAQTATSVRMTVSVIVVIPILCVYPFFQKYIVKGVMIGSVK G >gi|229784118|gb|GG667617.1| GENE 48 54108 - 55064 854 318 aa, chain - ## HITS:1 COG:BH1911 KEGG:ns NR:ns ## COG: BH1911 COG4209 # Protein_GI_number: 15614474 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Bacillus halodurans # 28 318 31 319 319 216 39.0 6e-56 MALAAVKRPPAGNGRTINKRRQTGIRYFLLILPFLVLVAIFSYYPLYGWVYSFFNYRPPI KLSDSEFVGLKWFASLVETDVKRRQLLGVLRNTFAMSGLTILTSWLPMAFAIFLTEIRSK WFKKSVQTLTTLPNFVSWVLVYSMAFALFSSSGMMNTLLTKLGLVSQPIQFLQSDSHTWL SMWLWSTWKSLGWSAIMYLAAIAGISDDLYEAARVDGAGRFRLIWNITIPQLIPTFFVLL MLNAANFLNNGLDQYFVFSNAFNMEHIQVLDLYVYNLGMGSGSYSLATAISMLKSLVSVT LLAAINGLSRVMRGESII >gi|229784118|gb|GG667617.1| GENE 49 56348 - 58108 2052 586 aa, chain + ## HITS:1 COG:no KEGG:Calow_1346 NR:ns ## KEGG: Calow_1346 # Name: not_defined # Def: hypothetical protein # Organism: C.owensensis # Pathway: not_defined # 2 533 3 536 571 473 42.0 1e-131 MFQQFKTASYAHAEYLCKTDLPMLKEKIEEFLTWLPLDKVYLETHRGIYDVPREKMEAIK ELFASYGIKTSGGITSTLSIEGHEKHTIFDVLCYTDSVYRKRYLEIVRDTAKMFDEIILD DFFFTSCRCSECIEAKKDRSWPEYRLDLMAGLAEEIVAAAREENPDCFFIIKYPNWYESY QDCGYNPQVQRDIFDGIYTGVEARNPQYDAQHIQRYLSYSLIRFMEQIAPGRNGGGWFDE GGSSDNMNAFVEQASLALFAGARELTLFNFESMASSPLTAVLGQQLERMDQVLGHLGTPM GVSVYEPFDAEGEDQLMNYLGMCGVPLEPTPVFCEEAPSMLLTASSAKDPDVVEKLKEYV RKGGHAVITSGFLSSTMDRGIRDMTTAVPTGRRASGNVYFVDNYNRNHRFYCEGRRPVDL QVLDYKTNATWCQIGLITDECNFPLMLEDFYGEGNLYTLNIPDDFSQLYRLPKDVLTDIS RVMTRNLPVFLSADAKYGLFEYDNSTFAVYSFRPVDENLEIILRGDEYKGIVDLETGREY YPLLTGVEPQKRFDAAKAKSEVREQILEVPFTAGFCRFYRLIGKEE >gi|229784118|gb|GG667617.1| GENE 50 58113 - 59891 2024 592 aa, chain + ## HITS:1 COG:no KEGG:Csac_0666 NR:ns ## KEGG: Csac_0666 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticus # Pathway: not_defined # 2 586 3 569 571 411 36.0 1e-113 MYRNFHIATYPHAEYLYYADLDTVEEAIRYFQKRLPLTKVYLETHRGMYDIEESKMRAVK 
ELFHRYGIETGGAIAPTLSIPGHEKNTLYDVICYQDPVYREAFLDIVRYTASLFDEIILD DFFFTSCKCPLCIDARGDRTWAEYRTDLMLQVSRDMKAAAADVNPGCRLIIKYPNWYESY RQCGYAPDLQKDVFDFVYTGTEARNPKYDPQHVQRYLSYSSFRRITGIVPGRNLGAWIDE GGSNNHINTWVEQAALSLFAGAKELIACRDWMSFRYPTLTECPLQPVLGQQLKRIDSLLD KTGTPIGVSVYEPLEGSGEGQLINYLGMLGIPFEPSAVFREDAPVIFLTASAAGDPQILT KLKSYVRNGGHAVATPAFVEAMSDRGIDEITNAQITNRRASAELFIADGYHRNQRIYCES REPVTFPVYTCSGGAWQAVSMVTEEAGFSILTEEFYGDGSFLILNIPDDYSQLYRLPEAV VAEIAAVLTRGLFVSLSAAPKYSLFVYDNQTFGLISYRPHAGEVDISIRLEDASGIRDLE TGEVFAPLTVDMHPEKRFDSAKIPEALTEQHYKVSIENGEFRFFEVLRKEHA >gi|229784118|gb|GG667617.1| GENE 51 59974 - 61491 1673 505 aa, chain - ## HITS:1 COG:BH1910 KEGG:ns NR:ns ## COG: BH1910 COG4753 # Protein_GI_number: 15614473 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 2 494 3 496 506 226 31.0 6e-59 MNLLIIDDEQIVREGLKTIFPWAEYGFTVCGEAKNGEEGVSLIRSLNPDLVLLDIRLPKL DGLEVLKTVREEGFGGQIIILSGYADFDYAKTAIRLGVCSYLLKPIDEEELLHAVKQAKE TLEKEKRESDQALKSPEQVRRELLLDILNGGAEHKREELLACSMDLTADAYQVLLVVIKD RKPDGPGDMLCKWLQTLNSVSVSEGGCEYYLLLGREAVSQAAHPLLSDSGIFLSYSRMGF DAAEIPQLTAQAREGYEHRFLFSTDGLSIQCRLIPEPSAKAGEGFGDPLVIADKLYNFIS SDQTAKAAEFLKHLELSIQRQRLNYDTVIRLFINSYMQIDMLIKEHYPEAANRVTGANLV IYAICKCRTLREIVNFLTEHITGIIRMVHEVRSDNIIEKLCGYIGKNYRQPLKIETLAEV FGYNGSYLGKLFKQETGESFHSYLDRVRIEKAKKLLETDARIYSVASECGFQSNEYFSNK FKKYVGVSPQAYRKARAEEQGGGRQ >gi|229784118|gb|GG667617.1| GENE 52 61488 - 63245 1610 585 aa, chain - ## HITS:1 COG:BH1909 KEGG:ns NR:ns ## COG: BH1909 COG2972 # Protein_GI_number: 15614472 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 36 571 58 594 597 208 27.0 3e-53 CAAGLVPMGVIGSLMLYNTHRMLYGQYRGQLESENKRVHNIVFEVTYLVSNISNVLGYDS ALNTLLSTSYESDGQVYEAYRAYPLLDSFAENYTELSSIRIYFNNPTMVSSGRFQYAGEE VQASEWYRLAKESSGEPIWICEKQENSQDNLFLVRKVPLYNRDFAVISIGISSNFLKLMI 
NDEQTDTLLSLDEGEVFYAGNASYFGKPLPVSQTALSETRVRTLRDNGLLVGCSALQPVR SHQIIQVLTVDSHALGAVNRTMVLMMAIILFCILIPFVLIFLFTGSFGRRIAVIRGEMHK ISEGNLDIIPLFDGTDELGLLFRDMQQTINNIQNLNRQLYENQLDRQRLMNQQRMIEFQM LANQINPHFLFNTLETIRMNASTHKDPDTAYIITQLGKMMRYSLQTKDQLVSLAQELAYT RSYLEIQHFRYREKISYEILVSDEITQDTYYLMPLLIQPLVENAISHGISPMDGPGFLLI HIEPVKDYLSIVIHDNGVGMDEGALQALRESVEAPEPEYTEDPVPTERHIGLRNVSSRIR LFYGREYGVDIESRPMEGTAVTLRLPLPRTDGPLEMEKEKEDERQ Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:48:24 2011 Seq name: gi|229784117|gb|GG667618.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld11, whole genome shotgun sequence Length of sequence - 63870 bp Number of predicted genes - 58, with homology - 55 Number of transcription units - 27, operones - 13 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 337 323 ## Amir_5047 short-chain dehydrogenase/reductase SDR 2 1 Op 2 38/0.000 - CDS 351 - 1223 535 ## COG1175 ABC-type sugar transport systems, permease components 3 1 Op 3 14/0.000 - CDS 1234 - 2118 590 ## COG0395 ABC-type sugar transport system, permease component 4 1 Op 4 . - CDS 2203 - 3648 1270 ## COG1653 ABC-type sugar transport system, periplasmic component 5 1 Op 5 . - CDS 3666 - 4118 586 ## TTE0360 lactoylglutathione lyase and related lyase - Prom 4236 - 4295 6.9 + Prom 4197 - 4256 6.8 6 2 Tu 1 . + CDS 4360 - 5217 522 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 5265 - 5324 6.6 7 3 Tu 1 . + CDS 5376 - 5867 730 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 8 4 Op 1 . 
+ CDS 6788 - 6925 186 ## gi|266620066|ref|ZP_06113001.1| nucleoside-diphosphate-sugar pyrophosphorylase 9 4 Op 2 2/0.000 + CDS 6922 - 7992 1223 ## COG2605 Predicted kinase related to galactokinase and mevalonate kinase 10 4 Op 3 4/0.000 + CDS 8016 - 8600 519 ## COG0241 Histidinol phosphatase and related phosphatases 11 4 Op 4 . + CDS 8615 - 9250 764 ## COG0279 Phosphoheptose isomerase 12 4 Op 5 . + CDS 9275 - 11155 1656 ## Closa_3795 hypothetical protein 13 4 Op 6 . + CDS 11174 - 12574 1323 ## Closa_3794 hypothetical protein 14 4 Op 7 . + CDS 12600 - 14954 1554 ## Closa_3793 hypothetical protein 15 4 Op 8 . + CDS 14976 - 16556 1523 ## HRM2_29920 hypothetical protein 16 4 Op 9 . + CDS 16575 - 17519 1040 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 17548 - 17604 13.7 + Prom 17591 - 17650 5.4 17 5 Op 1 . + CDS 17778 - 18821 1103 ## COG1363 Cellulase M and related proteins 18 5 Op 2 . + CDS 18889 - 20061 860 ## COG3214 Uncharacterized protein conserved in bacteria + Term 20161 - 20209 11.0 + Prom 20075 - 20134 5.8 19 6 Tu 1 . + CDS 20356 - 20646 375 ## gi|288870122|ref|ZP_06113012.2| hypothetical protein CLOSTHATH_01148 + Prom 20746 - 20805 7.8 20 7 Tu 1 . + CDS 20944 - 21078 67 ## 21 8 Tu 1 . - CDS 21411 - 22307 979 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 22447 - 22506 5.0 + Prom 22315 - 22374 6.1 22 9 Op 1 . + CDS 22493 - 23158 585 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 23 9 Op 2 . + CDS 23240 - 24022 740 ## Closa_0493 glycosyl transferase family 2 24 9 Op 3 5/0.000 + CDS 24106 - 24882 779 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 25 9 Op 4 4/0.000 + CDS 24925 - 26001 1030 ## COG0451 Nucleoside-diphosphate-sugar epimerases + Term 26025 - 26063 3.1 26 9 Op 5 . 
+ CDS 26076 - 27422 1409 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 27 9 Op 6 . + CDS 27487 - 28326 887 ## COG3475 LPS biosynthesis protein + Prom 28351 - 28410 6.1 28 10 Tu 1 . + CDS 28443 - 28682 306 ## Dhaf_0506 XRE family transcriptional regulator - Term 28594 - 28636 1.4 29 11 Tu 1 . - CDS 28679 - 30256 1621 ## COG1292 Choline-glycine betaine transporter - Prom 30300 - 30359 9.7 + Prom 30150 - 30209 5.8 30 12 Tu 1 . + CDS 30443 - 31039 759 ## Closa_0499 hypothetical protein + Term 31113 - 31159 3.3 - Term 31101 - 31147 3.3 31 13 Tu 1 . - CDS 31158 - 32135 772 ## Closa_0500 hypothetical protein - Prom 32286 - 32345 5.8 + Prom 32287 - 32346 5.2 32 14 Op 1 . + CDS 32522 - 33229 782 ## COG0726 Predicted xylanase/chitin deacetylase 33 14 Op 2 . + CDS 33278 - 34663 840 ## PROTEIN SUPPORTED gi|145632256|ref|ZP_01787991.1| 50S ribosomal protein L27 + Term 34685 - 34715 4.3 + Prom 34753 - 34812 8.1 34 15 Op 1 . + CDS 35008 - 35526 560 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family 35 15 Op 2 . + CDS 35566 - 36345 527 ## Closa_0513 GCN5-related N-acetyltransferase + Term 36453 - 36485 -1.0 - Term 36088 - 36124 4.5 36 16 Tu 1 . - CDS 36281 - 36982 759 ## Closa_0514 hypothetical protein - Prom 37075 - 37134 5.9 + Prom 37116 - 37175 5.9 37 17 Op 1 . + CDS 37209 - 37952 873 ## Closa_0515 hypothetical protein 38 17 Op 2 . + CDS 37956 - 38102 115 ## 39 17 Op 3 . + CDS 38122 - 38706 650 ## COG2357 Uncharacterized protein conserved in bacteria + Term 38877 - 38916 -0.8 - Term 38721 - 38773 16.1 40 18 Op 1 24/0.000 - CDS 38805 - 40016 1055 ## COG0004 Ammonia permease - Prom 40072 - 40131 2.9 - Term 40066 - 40115 14.5 41 18 Op 2 . - CDS 40199 - 40537 546 ## COG0347 Nitrogen regulatory protein PII - Prom 40611 - 40670 7.8 + Prom 40700 - 40759 7.2 42 19 Op 1 . + CDS 40823 - 41728 992 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase 43 19 Op 2 . 
+ CDS 41718 - 42395 1056 ## COG1802 Transcriptional regulators 44 19 Op 3 . + CDS 42402 - 42596 384 ## Closa_0521 hypothetical protein + Term 42641 - 42690 8.5 + Prom 42609 - 42668 8.0 45 20 Op 1 . + CDS 42743 - 43366 510 ## Closa_0522 TetR family transcriptional regulator 46 20 Op 2 27/0.000 + CDS 43403 - 44545 1494 ## COG0845 Membrane-fusion protein 47 20 Op 3 . + CDS 44558 - 47596 2979 ## COG0841 Cation/multidrug efflux pump + Prom 47636 - 47695 6.2 48 21 Op 1 . + CDS 47763 - 48920 1210 ## Closa_0525 hypothetical protein 49 21 Op 2 . + CDS 48990 - 50159 1140 ## Closa_0526 hypothetical protein + Term 50202 - 50274 10.5 - Term 50071 - 50096 -0.1 50 22 Tu 1 . - CDS 50225 - 50890 505 ## Closa_0527 stage II sporulation protein R - Prom 50943 - 51002 5.8 + Prom 50920 - 50979 7.9 51 23 Op 1 . + CDS 51047 - 52111 1109 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 52 23 Op 2 . + CDS 52142 - 52960 987 ## COG0796 Glutamate racemase 53 23 Op 3 . + CDS 52962 - 53402 512 ## COG4506 Uncharacterized protein conserved in bacteria + Term 53446 - 53482 -1.0 - Term 54155 - 54201 10.0 54 24 Tu 1 . - CDS 54208 - 54675 665 ## Closa_3786 hypothetical protein - Prom 54717 - 54776 5.2 + Prom 54689 - 54748 4.3 55 25 Op 1 58/0.000 + CDS 54883 - 58749 4014 ## COG0085 DNA-directed RNA polymerase, beta subunit/140 kD subunit 56 25 Op 2 . + CDS 58778 - 62437 3919 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit + Term 62497 - 62524 -0.9 + Prom 62551 - 62610 6.7 57 26 Tu 1 . + CDS 62634 - 62816 74 ## + Prom 62971 - 63030 3.6 58 27 Tu 1 . 
+ CDS 63070 - 63441 570 ## gi|266620119|ref|ZP_06113054.1| conserved hypothetical protein + Term 63460 - 63515 1.4 Predicted protein(s) >gi|229784117|gb|GG667618.1| GENE 1 1 - 337 323 112 aa, chain - ## HITS:1 COG:no KEGG:Amir_5047 NR:ns ## KEGG: Amir_5047 # Name: not_defined # Def: short-chain dehydrogenase/reductase SDR # Organism: A.mirum # Pathway: not_defined # 3 105 4 105 281 62 39.0 8e-09 MATGLITGADMGLGFELTKLGLLRGNTMIAIVLHASGENITGLKSQYGDKLTIIAADITD EASIRDAADRISDKFKQIDFIVNNAGVLFGSKYDKIDEIVDLDVAKFRKTLD >gi|229784117|gb|GG667618.1| GENE 2 351 - 1223 535 290 aa, chain - ## HITS:1 COG:AGl3390 KEGG:ns NR:ns ## COG: AGl3390 COG1175 # Protein_GI_number: 15891813 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 285 24 298 309 188 41.0 1e-47 MDYRKKDIRNALLFISPWIIGFLVFTLYPILSSLYYSFCEYKVIKAPVFIGFKNYIDLFK DKSYLAALGNTVYMLIFGVPVTTIVAVAVSILLNNKHLHHAGIFRVVFFIPTLVPTIVAC FLWIWMFQSGGIVNTVLGYFGISGPAWLSNSTWAKPAFIFYMIWTIGNAVIIYLAGLQDI DPSLYEAAEIDGAGFISQTISVTIPLLRSTILYNTVTLIIGVFQWFAEPLIITEGGPGNA TLFYSLYLYQNAFRFFKMGYACAMGWILLLISLAIILVLFKVFKFGESDN >gi|229784117|gb|GG667618.1| GENE 3 1234 - 2118 590 294 aa, chain - ## HITS:1 COG:SMb20969 KEGG:ns NR:ns ## COG: SMb20969 COG0395 # Protein_GI_number: 16264842 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 25 294 1 270 270 230 46.0 2e-60 MEKKTGISVKTRKYDSIEYSPGVIVRKIVMYAILILISVAFIAPIAFMFFGAFKTTMELA RVPFKWLPDSFSLANFKAVFEKIPFFLYLNNTLIIVFFNMIGSLISNSLVAYGFSRIKWP GRDKVFILVIITMILPFQITMVPLYLMFNKWGWIGTFLPLTVTCFFGSPFYIFLIRQFLI GIPNELSASAKIDGAGEFRIFWQLTLPLAKPVLATVAVFAFMKSWNDYIGPLIFLSDQKL YTLSLAASMLKSNLDPQWTVLLALGSMLVMPVLILFFVLQKYFIQGVTMSGIKG >gi|229784117|gb|GG667618.1| GENE 4 2203 - 3648 1270 481 aa, chain - ## HITS:1 COG:SMb21595 KEGG:ns NR:ns ## COG: SMb21595 COG1653 # Protein_GI_number: 16264783 # Func_class: 
G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 94 451 50 387 410 80 24.0 7e-15 MKLRKIAALALAGITAANLAACSSGTTPSGTEAAKATEATKTEAAAETTASENSAAGTFA GIEKEETDLTGELVYWSAFTGGSGEWDQSRVDLFNEMYADNGIHVTRQQIEGAGIKNGAL MSAVASGTNAPDLIISDSPIDVYSFAVEGAFMAVDDLLPKVGIDPDSFFDGLDDLIKVDG KTYLVPQDANVHLLYYNKKLAREADLDPEAPPTTLEELDAWADALSIPDDKGGFKQLGLI PWIGDGGDTSGDAFHIPFVFGTNVYDPATNKLNLTEDKMIQYFEWIKGYSEKYAPTIQEW GTSAGGVFDPSCPFYTEKVAMYFCGNWMANAIKLTVGDAIEWGVTAVPAPDYGRAKATTL GANPFAIIEGSENAELAAFFIKFCISPAIQEDNFAQWRSIPCSDAAFDGVSLTKNGDAMY ALEREIANNPENGVPAMCSVSTELADQFQKARQEIVFNGADITSTLQALQERMQATLDSA N >gi|229784117|gb|GG667618.1| GENE 5 3666 - 4118 586 150 aa, chain - ## HITS:1 COG:no KEGG:TTE0360 NR:ns ## KEGG: TTE0360 # Name: gloA # Def: lactoylglutathione lyase and related lyase # Organism: T.tengcongensis # Pathway: not_defined # 3 147 5 149 150 142 46.0 4e-33 MGIVDGKTICQVALVVKNIEETAKAYAELFGVDVPDVFTVPPESEAHTKFKGQPTNTRAK LAVFDLGQVVLEITEPDSEPSSWREFLDKNGDGIHHIAFMTRDREPVVQYFEENGMPVRH YGEYTGGNYTVFDSKEKFGTFIQVKEDKNL >gi|229784117|gb|GG667618.1| GENE 6 4360 - 5217 522 285 aa, chain + ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 32 282 30 289 292 94 24.0 2e-19 MILSEYYENLSFNHGLMVNISYAGIPTDFPLHWHTFVEIMISFDGSFELGLGYDTYQMEK LDVAVCYPGVLHSIKSKGCEKELLIVQFPFELITNMKEFSCMSYRLTQMQIMRACESEEQ ILLMKDIFMHMRDVFYSDLPFKEVILYTDLLSIFIHLGNRCFSEMKEAAADPCNAKTTEL IAEACFFIARNCTEPLELETVAKHVGFSKYHFAHTFKKYVNMTFLEFLTTERIKKAEILL ADPNISITEIAMRTGYTNLSTFNRTFKINKGCTPREFRNKVHRLD >gi|229784117|gb|GG667618.1| GENE 7 5376 - 5867 730 163 aa, chain + ## HITS:1 COG:CAC3056 KEGG:ns NR:ns ## COG: CAC3056 COG1208 # Protein_GI_number: 15896307 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar 
pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Clostridium acetobutylicum # 1 157 1 157 234 150 48.0 8e-37 MQAVLLAGGLGTRLRSVVADRPKPMALIGDKPFMEYVVLELKNHGITEIIFAVGYKGSMV EEYFGDGARLGIRVSYAYEETLLGTAGAIKNAGRLVTDDRFFVLNADTFYQIPYGRLVKM SREKNLDMALVLREVPDVSRYGRAVLEGERLTAFDEKTEEASS >gi|229784117|gb|GG667618.1| GENE 8 6788 - 6925 186 45 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620066|ref|ZP_06113001.1| ## NR: gi|266620066|ref|ZP_06113001.1| nucleoside-diphosphate-sugar pyrophosphorylase [Clostridium hathewayi DSM 13479] nucleoside-diphosphate-sugar pyrophosphorylase [Clostridium hathewayi DSM 13479] # 1 45 1 45 45 82 100.0 1e-14 MIPKWLSSGDKALGGIVNDGYFIDIGVPEDYYRFIDDVEKGAVTV >gi|229784117|gb|GG667618.1| GENE 9 6922 - 7992 1223 356 aa, chain + ## HITS:1 COG:CAC3055 KEGG:ns NR:ns ## COG: CAC3055 COG2605 # Protein_GI_number: 15896306 # Func_class: R General function prediction only # Function: Predicted kinase related to galactokinase and mevalonate kinase # Organism: Clostridium acetobutylicum # 1 354 1 355 364 485 71.0 1e-137 MIIRGRAPLRVSFGGGGTDVAPFCVEQGGAIIGSTINKYAYCSIVPRNDDQIIVHSLDFD MTVKYNTKENYVYDGRLDLVTAALKAMDIKQGCEVYLQCDAPPGSGLGTSSTVMVALLIA MAKWKGVYLDGYALADLAYQVEREDLKIDGGYQDQYAATFGGFNFIEFHGRNNVVVNPLR IKKEIIHELQYNLLLCYTGNIHVSANIIKDQVKNYEKKDAFDAMCEVKALAYALKDELLK GNLYSFGKLLDYGWQSKKRMSSKITTPQINELYDEALRAGALGGKLLGAGGGGYLLVYCP YNVRHKVAARLEAAGGQLTDWNFELRGAQAWVTDEERWNYDSVKVHMPNGDYTFKL >gi|229784117|gb|GG667618.1| GENE 10 8016 - 8600 519 194 aa, chain + ## HITS:1 COG:CAC3053 KEGG:ns NR:ns ## COG: CAC3053 COG0241 # Protein_GI_number: 15896304 # Func_class: E Amino acid transport and metabolism # Function: Histidinol phosphatase and related phosphatases # Organism: Clostridium acetobutylicum # 1 158 1 158 181 177 53.0 8e-45 MDRVIFLDRDGTINEEVEYLHRPEDLRFLPGVGDAIRRLRENGFAVVVVTNQAGVARGYY KEEDVILLHEYMNERLKEQGAFIDHFFYCPHHPVHGIGKYKTSCHCRKPETGMFLMAEQY 
YQVDKTHSYMVGDKLLDVEAGHNYGIKSVLVGTGYGAETAAEMTEEEQKKAFEHYSPDLL SAAEWIIGEQSSSL >gi|229784117|gb|GG667618.1| GENE 11 8615 - 9250 764 211 aa, chain + ## HITS:1 COG:Ta0854 KEGG:ns NR:ns ## COG: Ta0854 COG0279 # Protein_GI_number: 16081908 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Thermoplasma acidophilum # 1 210 1 182 182 125 37.0 5e-29 MEPMKYLEELVERYPVLDVVKDDVLSAYKLLEGCYAQGGKLLIAGNGGSCADSEHIVGEL MKGFVKRRAVPDSFAESLKSVDAVRGTELAGKLQGGLPAIALTGHAGLSTAYLNDVDGDL IFAQQTYGYGKPGDVLMGISTSGNAKNVMYALTVAKALGMKTIGLTGKDGGALKREADTA IVVPETETFKIQELHLPVYHALCLMLEERFF >gi|229784117|gb|GG667618.1| GENE 12 9275 - 11155 1656 626 aa, chain + ## HITS:1 COG:no KEGG:Closa_3795 NR:ns ## KEGG: Closa_3795 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 626 1 607 607 778 60.0 0 MTEKHFMDGIGKWKDEIIAAGLLMAFAFFINRGIEIKGLYMDDLYLWSCYGEQTFGQFIF PLGSTRFRFLYYLAAWLELAVIGNHIGWFVPVNIILNGCIAYSVYRFGKRLSRSYLMGLG CGFLYLLSRMSYYQIGQVYGLMESLALWAALGILYCLLRYVDEKGDGRKYLIRSCVLYFA VCFIHERYLALLPLIYLALLYKKGKKKAGWFLPAGVFAVVMAIRVLTTGGVAPAGTGGTN VADTFNIMTSLRYAVDQVLYIFGVNAGPGHLSGASWAESPSWIHVQIIAADLALLVLVIL AVIRLIRDKEGRRKALRDISLFLVFIALCIGASSVTIRVETRWIYVSMTAAYLLAACLYG LITAKPEHGKVVDIKREVRKAADKKAGRQSAVLCGGLFLVYIIIMAPVENYYRGQYHNLY FWADQQRYNSLAEETYEKYGDDIFGRKIYIIGNSYQMSDFTARTFFKEFDRDRKAEGTEV YFIDSIHDIGLITNNMLVLKEDPAHNAFQDVTDFVRNLKCEAVYGYYRDGWMDESAEVRI MTGSSGVIKLEIYYPGALTGSEISTISLNGEPVRDLVINQNTMKLELEAEPNQIASLRFD NNFYLKDAGEQRGEKRFSMIVNFTAD >gi|229784117|gb|GG667618.1| GENE 13 11174 - 12574 1323 466 aa, chain + ## HITS:1 COG:no KEGG:Closa_3794 NR:ns ## KEGG: Closa_3794 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 15 464 15 455 461 536 64.0 1e-151 MKRRYYLSILPLAGILFCMWYIHIAASDVIYSDYIRLVNSYLPDVWNPAKFFVPDVLTRI PVNYLARIINVEFFGFNTMFDRVLGVLALGLSGFILGKYCQLRKVRTVWFVILMALMFSL NKWEMLINGSGWAHFLAFAGFYYHYLVMDRLWTGDERPGDRKLLVILPFAITILTAGPYC 
AIYSVAVMMAYGFMTILDYRKTKRFERTYLLYALCTLAALLLYMWSNSLTVEDHAAAAEV GLFTQLTQTPGFFVRFILISFSSMVVGIEEAIAAFKGNIVPFAVLGLLVIAAYLYALYLN FRYRLYEKSMVPLILIVSGGLNHVLIFLSRWIFLVDDYGASSRYALQFQVGIFGIILTFA LCRNEMVTNKKAREKYRRFSAVAVVFCLLFLAGNVYTTYHELKKAPDRKETFEKRAQMAL DFEEMTDDELRAEFEYRTTRPESGGQIRDALTILKDNGWGVFRDRK >gi|229784117|gb|GG667618.1| GENE 14 12600 - 14954 1554 784 aa, chain + ## HITS:1 COG:no KEGG:Closa_3793 NR:ns ## KEGG: Closa_3793 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 23 784 11 750 753 776 54.0 0 MERTHKSPRGRKSGGSLQSRGLVLLLWVCALGAIAYSADQVFVKGILSWHRQQKEYYNMM AEVIALFCVFAILYGLIRSRILKAAGTILVLSVFFWGHMVFLPVVTAGIYLIYLILLGEF FCRDILKLKESDGTAVYFLGGSLLVILIFCLMSAVGIGAIGDLQIFVIASGILLILRLCM RRGKEDRGLSAWCSRKAGQKRENSPWLTALCIAFIVTMFCIQAGRMNIALDFDSLWYGVR SRYVLDNGNGIYENMGTIGIVYTYSKGWEVMTLPLANLPSYSFLLAFNLWLTAGVIFMAY QAARFYMGKGHSTVFAALLAGVPGIMNMAITTKTDIATLLFQEIMIFYLLKYLKEGRREW RYLAYGFASFFLTWTLKPTALVFSTAILGMSLLFLVWKRLLPSGGGKGKAGQAGGKGNVG QTGALGLFLLSLLALAGIWARTVMITGLPVTSVFSSILTKLGLEMKYPFNVNKIPNSGAE LPFGEQLVNFIRRAFGFLLRPLGVDMDHVILAWGGFLLFIVLLLWIFSLFYRKNVENDRE RLLSSWLNVVFIPFLIVCVVSLYMLVQVDGNYFMLLYVLAALYVMRLVVRISDKAVRNGL CGILVPVMCFCAVVSSLTNWSWVVGFSPVSWKHHGYYDHQEAKHQEMVEKGNGQIWDILA EDPENRLIAVGDHPGVLGFPCNAQSYDDITGSWGNVVLVKTMDNFVEFMDYAKTDYVYTE AGYMEEEQRCYSLVRTLIEYGKLIPICYEEGNMLARVDIDGEYTETSRAALTEFDKCYIK KNVD >gi|229784117|gb|GG667618.1| GENE 15 14976 - 16556 1523 526 aa, chain + ## HITS:1 COG:no KEGG:HRM2_29920 NR:ns ## KEGG: HRM2_29920 # Name: not_defined # Def: hypothetical protein # Organism: D.autotrophicum # Pathway: not_defined # 7 514 18 503 503 121 25.0 6e-26 MKKEVKIGIGLVFLAFYVFLIMQFGKVFVYYDDFGYLSLSYGNTVPDVIGSNYTLSQLLS FMGKHYFYSNGRLFYLFLFSFLNMTGGLAAVQAFMATSVLAILVLVHYVVMRHSKPAGWR PVFLAAFICLLYGTIGILIQRLGTYWYAASFLYVVPAVTFLVLAVLYYETIGEEPKTWKK AACIVLAFFAAFSQEEWLVAVIGYICMVIAWKTLKKYRVRAYDFGTLVAAVAGALPILTS PAVKMRMDNNSEFSNLPFVQKVLTNLRNLMNLFFSSDNQNYLAVLLMALMCLGAYMIWKR MKYTVLNIGYLAFSAAVTVYLFLKLNRHVLGPGGHSQVMIWVLFLYLLVTAVQVIRILYD 
RNQILMGMIFLSAILTLACLVVVPELPQRVLLPFIYLSYTLFGYLFCLILLERKSAVWGI AGLIIVAAVSIPNLKNIYRGYSINYEVLMYNDGVIRDAAARIRDGGEEIKKLTLYQNPDR LCSCEMVYDENFKYMIYWMDEYYDLPSEVELEYEPITDLGAWRQTH >gi|229784117|gb|GG667618.1| GENE 16 16575 - 17519 1040 314 aa, chain + ## HITS:1 COG:BS_ykcC KEGG:ns NR:ns ## COG: BS_ykcC COG0463 # Protein_GI_number: 16078354 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 4 306 6 308 323 243 42.0 2e-64 MEKKLSVVVSCYNEELALRQFYAETAKVLKSLSWDYELIFVNDGSQDGSIGILKELAAGD EKVKVVDFSRNFGHEAAMIAGMDYSSGDGIVCMDADLQHPPECLPGIIAKLDEGYDVINM VRTKNESAGWFKNFAGAAFYRLINILSDVKFEPNASDFFAVSKKAAKVLKTNYREKVRFL RGYVQNIGFKRTTIEYEARTRVAGESKYSIKKLITFSMNTIMCFSNLPLKLGIYAGGFAG VLGIIMMIYTIWSWAEVGTPSGYATTIVLICFMFAILFLIVGIIGNYIAILFAELKDRPI YIVGETKNFTDDEE >gi|229784117|gb|GG667618.1| GENE 17 17778 - 18821 1103 347 aa, chain + ## HITS:1 COG:CAC0690 KEGG:ns NR:ns ## COG: CAC0690 COG1363 # Protein_GI_number: 15893978 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Clostridium acetobutylicum # 15 343 10 339 343 328 48.0 1e-89 MTCDKTDTMNYVVSFLEKLVNTPSPSGFTDEVMALVEKEAAGFGFRSHYSRKGGLIIEVP GNTEAVLGLSAHVDTLGAMVRSISSEGMLRIVPVGGFMMESIEGMYCKVHTRTGKVYTGT ILTKEPSVHTYDDAKTLERKPKNMEVRLDERVRSEDDVKALDISAGDYISFDPMFVYTEN GFIKSRHLDDKASVAVLMGVLKELGESGVKPEHTLKIVISNYEEVGFGASWIPEDIEEFI AVDMGALGDDLAGDEYRVSICALDSSGPYDYKMTSRLIEIAKEREIGYAVDIFPHYGSDV GAALRGGHNIRGALIGQGVHASHGTERTHVEGLEQTLRLIEGYLGLK >gi|229784117|gb|GG667618.1| GENE 18 18889 - 20061 860 390 aa, chain + ## HITS:1 COG:SMc00961 KEGG:ns NR:ns ## COG: SMc00961 COG3214 # Protein_GI_number: 15964644 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 3 390 2 390 399 159 30.0 7e-39 METVKLTREQARRFLLRRHGLLGGYQFEGKRGILEFVKQVGCIQFDPVNVCGRSPELVLL SRVKGFTKQMLHELLYQDRQLIDYFDKNLSIFATEDWPYFECYREQHRQWERSHADIETV 
SEQVKGIIAERGAVCSADLDMPGKVHWYWSDTKLSRAALEHLYFTGDLAVHHKKGTVKYY DLISRCVPEELLMQKDPLPDEREHQKWHVLRRIRSVGLLWNRPSDAWLNILNLNAAGRRS AFESLLKENVIVPVLIEGITEPFYMASEDRELADECGSGKNWKKRCEFLAPLDNLLWDRN LIRMIFDFDYKWEIYTPAEKRKYGHYVLPVLYGDRLIGRIEASYDRKEQQLNVKNIWYEP GVRRTEGMQQALSAAVNRLAEFNRGAEAEL >gi|229784117|gb|GG667618.1| GENE 19 20356 - 20646 375 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870122|ref|ZP_06113012.2| ## NR: gi|288870122|ref|ZP_06113012.2| hypothetical protein CLOSTHATH_01148 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01148 [Clostridium hathewayi DSM 13479] # 1 96 44 139 139 159 100.0 5e-38 MSDKVCFDKLGTEEKLSVLYDEMRKMTQRQVQMMEDLEYMGKQLRRVEIIQDDTIADGIT AIAANIADVSKKMDSFINMETVLNTIKSDVKIIKEL >gi|229784117|gb|GG667618.1| GENE 20 20944 - 21078 67 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no METAIGTAIDKTYNGIAVNVNCLQTRNVISVKTVNAAAAMFTAA >gi|229784117|gb|GG667618.1| GENE 21 21411 - 22307 979 298 aa, chain - ## HITS:1 COG:lin0668 KEGG:ns NR:ns ## COG: lin0668 COG0561 # Protein_GI_number: 16799743 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 3 289 2 288 288 152 34.0 9e-37 MSIRLIASDMDGTLLDPYTNISQANIDAIHRLSDYGVEFLICSGRDLHDAKGMIESCGIS CGYICLSGAVIYDQNGTQMANLPLNSQNLKDISHVFGEFGAPMDILTSRGRYSTADPEKK LLEFYGFLNGRAPFQGEIPDEIQAAARERMDSMTFIRDLSEIPAPTDIYKICGNDLDCDL VIRMKQAFGDYPDLAAASSFPNNIELTNTAAQKGSALKNYAAAKGISLEDVMVLGDSDND ISMFTPEFGWTVAMANSMDCIRERAKYITKSNAEDGVAWAIRKYVFKEDLSSPINNHD >gi|229784117|gb|GG667618.1| GENE 22 22493 - 23158 585 221 aa, chain + ## HITS:1 COG:CAC2331 KEGG:ns NR:ns ## COG: CAC2331 COG1898 # Protein_GI_number: 15895598 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Clostridium acetobutylicum # 1 181 1 180 185 211 56.0 1e-54 MGGFRFEQCGNIEGLYLVEPQVFADERGYNFEAYHAGSFREAGLDMVFVQDNLSMSAKGT LRGMHFQKNYQQGKLVCVVCGEVFDAVVDIREGSRTYGQWFGTVLSAEKKNMLYVPEGFA 
HGFLVLSDTAVFSYKLSEHYHPEDESGIPWNDETVGIRWPIPEDMTVLTSERDSCHPAFG EGSALKPKRNRKTMGAAAHENVMDETGSGRMEHQKFFQNNE >gi|229784117|gb|GG667618.1| GENE 23 23240 - 24022 740 260 aa, chain + ## HITS:1 COG:no KEGG:Closa_0493 NR:ns ## KEGG: Closa_0493 # Name: not_defined # Def: glycosyl transferase family 2 # Organism: C.saccharolyticum # Pathway: not_defined # 1 260 1 260 260 394 75.0 1e-108 MKPHTFAVCAYRDSPYLEACIRSLKGQSVPSDIILCTSTPSPYIMDMADKYGIPVHVREG KSSIMDDWNFAYHMADSSLVTIAHQDDLYQKDYGKLLLESWKRYPDMTLFTGGYTVVKGD TLVKFEKVEFVKRFLRLPLRLRSLSHLRAVKRSALLFGNSICCPACTYNKELLGEPLFDS PYHFALDWDTLWKLAGRDGRFICIERPVMYYRVHEEATTKACIRDNSRAREEAEMFAKIW PKPVVKFLMHFYRKAYEEYE >gi|229784117|gb|GG667618.1| GENE 24 24106 - 24882 779 258 aa, chain + ## HITS:1 COG:alr2825 KEGG:ns NR:ns ## COG: alr2825 COG1208 # Protein_GI_number: 17230317 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Nostoc sp. 
PCC 7120 # 1 256 1 257 257 324 56.0 9e-89 MKVVLLAGGLGTRISEESHLKPKPMIEIGGRPILWHIMKYYSEFGFHDFVICLGYKQYVV KEFFADYFLHTSDVTFDLANNKMEVHNNYSEPWKVTLVDTGLNTMTGGRIKRIQPYIGDE PFMMTYGDGVCTVDLNALVKFHKEHGKTATMTTVNIAQMKGVLDISDDNAVRSFREKDEK DASLINGGFMVLNPEIFSYLEDDTTVFEKEPLQRLAAEGQLMSFHHTGFWQCMDTQREMQ KLEALWQTGAAPWKIWEN >gi|229784117|gb|GG667618.1| GENE 25 24925 - 26001 1030 358 aa, chain + ## HITS:1 COG:STM2091 KEGG:ns NR:ns ## COG: STM2091 COG0451 # Protein_GI_number: 16765421 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Salmonella typhimurium LT2 # 15 358 6 355 359 365 48.0 1e-101 MKEWTKEMQQEFCAFYKGKKVLVTGHTGFKGAWLTRMLTNAGAVVTGYSLEPPTDPSLFC VAGIEDTMESVIGDIRDLPHLMEVFERTQPELVFHLAAQPIVRDSYKDPVYTYETNVMGT VHVLECIRRNPCVKSFLNVTTDKVYENREWEYGYRECDPLDGFDPYSNSKSCSELVTHSY AKSFFADGHTAVSTSRAGNVIGGGDFANDRIIPDCIRAATAGREIVVRNPYSTRPYQLVL EPLAIYMAIAMKQYEDLNYQGYYNVGPDDRDCVTTGELADLFTDFWGGGITWVNRYDGGP HEANFLKLDCSKIKKTFGWRPRYSVKEAVEKTVEWTKAYLAGEDMLAVMDRQIKEFFS >gi|229784117|gb|GG667618.1| GENE 26 26076 - 27422 1409 448 aa, chain + ## HITS:1 COG:YPO3113 KEGG:ns NR:ns ## COG: YPO3113 COG0399 # Protein_GI_number: 16123279 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Yersinia pestis # 6 442 2 431 437 499 55.0 1e-141 MFENKTEQEAKKQILDMVEAYCDTYHNQKKPFKEGDRIPYASRVYDSSEMVNLVDSSLEF WLTSGRYTDEFEKELAGYLGVKYCSLVNSGSSANLVAFMTLTSPLLGERRVKRGDEVITV AAGFPTTVTPLIQYGAVPVFVDVTIPQYNIDVTMLEEAYSEKTKAVMIAHTLGNPFDLAA VKAFCDRHNLWLVEDNCDALGTRYTIDGEERLSGTIGDIGTSSFYPPHHMTMGEGGAVYT NNPMLNKIIRSFRDWGRDCVCPSGRDNMCGHRFDRQFGELPLGYDHKYVYSHFGYNLKAT DMQAAIGCAQLKKFPSFVERRRHNFDRLRAALSEVEGQLILPEPCPNSKPSWFGFLITCR EGVDRNGIVQYVESRGVQTRMLFAGNLTKHPCFDEMRAAKSGYRIVGELKNTDRIMADTF WIGVYPGMTDEMTDYMAAAIIEAVKKNS >gi|229784117|gb|GG667618.1| GENE 27 27487 - 28326 887 279 aa, chain + ## HITS:1 COG:SP1273 
KEGG:ns NR:ns ## COG: SP1273 COG3475 # Protein_GI_number: 15901133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 1 279 1 267 267 116 28.0 5e-26 MENYDLTKVHEANLAILKEIDRICRKYRIKYVLDAGTLLGAVRHKGFIPWDDDADVAFTR PNYDAFLKVVKRELPEGMTLVEPKDLRGGKAFFDFTARIIYEKSQMHEDTEEMQYYGGRL NHLWVDLFTIDELPDNKIASTWTLFLHTVLYGLAMGHRYKLDFGKYGAMGRICVWLLSTV GRLIPMSVIRKLQHMAAVKDRKGKSPLRYYSNYQPDYLYVTLQKEWCEETVDLPFEDTKL MAPKGWHEVLTWIYGDYMTLPPEEKQVPSHSSMKIKIMD >gi|229784117|gb|GG667618.1| GENE 28 28443 - 28682 306 79 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_0506 NR:ns ## KEGG: Dhaf_0506 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 78 1 78 80 79 46.0 6e-14 MTICEATSKRIEQLCRQKRLTGYSLTFQAGMPPSTFKSIMNGKSKNPGICNIKKIADGFG MTIREFFDSDLFNDLDQDD >gi|229784117|gb|GG667618.1| GENE 29 28679 - 30256 1621 525 aa, chain - ## HITS:1 COG:TP0106 KEGG:ns NR:ns ## COG: TP0106 COG1292 # Protein_GI_number: 15639100 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Treponema pallidum # 11 501 4 505 510 450 48.0 1e-126 MDQDEKKYLIKKLDWVTTIIPFVCIIILCLLFMVFPQKSTDMLSQVRFFLGNEFGSYYLL IGLGVFLCSLYMAFSRYGKIRLGNLEKPQYSNFKWGTMMFTAGLAADILFYSLCEWVLYA NEPHIADMGAFQDWASTYPLFHWGPIPWSFYMVLAVAFGFMLHVKKRTKQKYSEACRPVL GKHVDGIGGKIIDLIAVFALLAGTATTFSLATPLLSMAISRVTGIPESNMLTIAILVIIC VIYTITVYFGMKGIAKLAASCTWLFFGLLLYVLIGGGEARYTIETGITAVGNMVQNFIGM STWTDALRTSSFPQNWTIFYWAYWMVWCVATPFFIGTISKGRTIRQTVLGGYFFGLSGTF TSFIILGNYSLALQTKGHLDVMGIYASTGDLYQTIMAVFETLPLAKLGLILLAVTMIAFY ATSFDALTMVASSYSYKELPPDAQPDKHVKLFWAILLMLFPIALIFSENSMANLQTVSII AAFPIGIIILLILFSFFKDAKEYLDHSEGGEPDENKKKPEKTRAN >gi|229784117|gb|GG667618.1| GENE 30 30443 - 31039 759 198 aa, chain + ## HITS:1 COG:no KEGG:Closa_0499 NR:ns ## KEGG: Closa_0499 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 190 2 190 191 213 60.0 4e-54 
MKKKVLIICGIAAMLLAGCTKSNIVTGGPAEPVFETQEGIVLDWGQIGDDLDEQYLNNED YPKAVSINYSVDPDKKTIDLTLMMNAGATTEEAVDFANAVVRTINDEAAVQDFSIETSTE DSYGGFFQDYTLNLIVMPDGMMTDKSVWLVNMTIPAGSNEAIVPAEGAKVMEPTSAEDEM GEDMDGDMEDDGAEEGQE >gi|229784117|gb|GG667618.1| GENE 31 31158 - 32135 772 325 aa, chain - ## HITS:1 COG:no KEGG:Closa_0500 NR:ns ## KEGG: Closa_0500 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 324 1 324 326 547 81.0 1e-154 MPGFTTHYLLGVKAYNDLPNNQLRRIISKYRWLYQLGLQGPDMFFYNIPILRHRDYRNVG SYMHENHVSTFFSVCLNHMATIESKQQREQAISYLSGYFCHYIADSICHPYVYGRIEYDI KNSGNYYHGLHATLENDIDALLLMKYKKKKPSQFNQAATICLNGLETQFISGFLASCINE AYYPINYRNNFRVTPRMVSRSILAMRIGCRTLADPRSRKRNSIAVVENLFLKNPIASKKL VTDIPPDPVKAMNLDHETWCNPWDKRLASQASFPDLFHQSMRKCSDVFYLFNEELISATP LSHQDHGTLLKELGNYSYHSGLAVS >gi|229784117|gb|GG667618.1| GENE 32 32522 - 33229 782 235 aa, chain + ## HITS:1 COG:CAC0358 KEGG:ns NR:ns ## COG: CAC0358 COG0726 # Protein_GI_number: 15893649 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Clostridium acetobutylicum # 4 230 11 221 266 159 36.0 3e-39 MFHGKMKALTFSYDDGVTQDRRLVELFDRYGLKATFNINSELLGKPGTLWRENMWVDHNK IRPEEAAELYRNHEVAVHTLTHPHLTELEEPEIIRQVEEDRINLERLTGRPVMGMAYPGG GVNNDERVAEVIRKHTAMKYARTITSCYHFDLQQDLLRFQPSVYHIEFDRMAEMGEAFLK LQPDKPQIYYIWGHSYEFDYHDTWGEFEKFCQMMSGRDDIFYGTNHEVLNSLYHA >gi|229784117|gb|GG667618.1| GENE 33 33278 - 34663 840 461 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145632256|ref|ZP_01787991.1| 50S ribosomal protein L27 [Haemophilus influenzae 3655] # 1 446 5 433 456 328 40 6e-89 SMIHRIHELVWGPWLLVLFLGIGIIYTVKSGFFQIRRLPYWWKMTIGSIREEELEEDGAV TKFQTACTALAATIGTGNIVGVATALTAGGPGAIFWMWVSAAIGMMTGYAETMLGIRYRY RDKKGKWVCGPMVYLERGMHMPWLGMVYSFLCVMVSLGMGSMVQSNSIAETFRYTFGSRP EVVGVILTGTVFLVIFGGIGRIAMVSERLVPISAGIYMLFSMVVIMSCYDQIPAVFAAIF RDALKPGAAVSGAAGYGISRSLQYGMSRGVFSNEAGLGSMAVLHGAAEETTPEQQGMWAM FEVFFDTILICTMTALVILCMTGGDAAGSGYEGAALTAYCFSKRLGGVGEYLVSAAMMLF 
AFATIIAWYYLGRQASAYLAERLKEHNRCLLLSRILRGRAYTFLYLGAVFLGCVARLETV WELSDIWNGLMALPNIAALIFLMKEVTFPRGKGEEIRTEKT >gi|229784117|gb|GG667618.1| GENE 34 35008 - 35526 560 172 aa, chain + ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Clostridium acetobutylicum # 1 170 1 170 174 236 67.0 1e-62 MKNPVVTITMENDDVMKAELYPEIAPNTVNNFISLVKKGFYDGLIFHRVINGFMIQGGCP DGNGMGGPGYSIKGEFSQNGFSNAFKHTEGVLSMARAMSPNSAGSQFFIMHKVSPHLDGA YAAFGKIIEGMDVVNKIADVRTDYNDRPMKEQKIKSMTVETFGTEYPEPEKC >gi|229784117|gb|GG667618.1| GENE 35 35566 - 36345 527 259 aa, chain + ## HITS:1 COG:no KEGG:Closa_0513 NR:ns ## KEGG: Closa_0513 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 259 1 249 249 234 50.0 3e-60 MTERRTVLIGGTPYPVTISSEQEVLSAAYAAGGAILGLWDKTRSGQSLSPADYVVECMED VDEEFLERVVRRRFHLPWLIAETKRLIIREFTVEDAAHMAAGEAGPGDDIFCSSEKLAAY IDSQYRFFEYGIWALEEKKSKAVIGKAGLFQPDWKFDDAKVFETGTFQAEILPALKKEDT PLELGYHIFSPWRRSGYAKEACREILNYGTGRLARCICAVIEEENTASIRLAESLGFRLT AQRYNGSAGLLYLYEANYS >gi|229784117|gb|GG667618.1| GENE 36 36281 - 36982 759 233 aa, chain - ## HITS:1 COG:no KEGG:Closa_0514 NR:ns ## KEGG: Closa_0514 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 233 1 233 233 295 65.0 1e-78 MTPYENVTFKERLHEEAAKLKEMSFKDKFWYIWEYYKFPIIGVIIAVFLVGSIGSAMYNN RFDTALSCAVLNSRYDSDALTVDQYFNEGFRAFIGLDENTKIDVDYSMSPTFDESAMNEY SYAELAKLTAMISSKGLDVMIGRPDVIDHYGEMDGFLNLEEALPPDLYEQVKDYLYPVTN AETGQESFCGLRLEDTSFGEKTGLILDNPVLTVMSNSPHTDTAIQLIRYIFEQ >gi|229784117|gb|GG667618.1| GENE 37 37209 - 37952 873 247 aa, chain + ## HITS:1 COG:no KEGG:Closa_0515 NR:ns ## KEGG: Closa_0515 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 246 1 251 260 310 67.0 3e-83 
MLSGFFNYDNPVWRFIGKFGDLIILNVLWFVCSIPIFTIGASTTAVYYVTLKLARNDDGY TIRSFFKSFKENFKQATIIWLILLAVGLILGFDMLFFTRGGFNLSQQFKTVMVTIFLAMS IIYLAIMTYIFPLQSRFYNTIKRTFFNAFFMSIRHLFQTIAIIAIDVAVVVAMMMIPQIM MFGVLFGFPLLAFINSYILSPILQKYMPKEEREDGEMRPIFADEEENSKTAGLVKEEDAK TEETTEQ >gi|229784117|gb|GG667618.1| GENE 38 37956 - 38102 115 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTGTILFIIEESVVILIRKSSETTTLFLEFGHGMDTKKRYTEHQYGWF >gi|229784117|gb|GG667618.1| GENE 39 38122 - 38706 650 194 aa, chain + ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 191 20 207 217 154 42.0 7e-38 MTEEEYVVFIQPYEDALKNIRVRVEVLNDDYRRKYQNYPIHHVQDRIKKKESIEQKLARR GCECDVESARDCLTDISGIRVICYFVRDIYAVVSLLKKQTDIVVIKECDYISHPKPNGYR SYHIVFGVPVYHTDGMEYYPVEIQLRTMSMDLWASMEHRICYKGSRNTEAKAAREFREYA EYLKNMEEEMEGYL >gi|229784117|gb|GG667618.1| GENE 40 38805 - 40016 1055 403 aa, chain - ## HITS:1 COG:TM0402 KEGG:ns NR:ns ## COG: TM0402 COG0004 # Protein_GI_number: 15643168 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Thermotoga maritima # 8 401 31 431 435 364 52.0 1e-100 MENTTLLMNIIWTMLGAFLVYFMQAGFAMVETGFTRAKNAGNICMKNMMDFVLGSLFFFI LGFPLMFGKNIAGIIGGSGFFNPYSLADADGMLNGLPIGVFMIFHTVFCATSATIVSGAM AERTKFSSYLIYSAAISIFIYPVTGHWIWGGGWLSELGFHDFAGSTAVHMVGGVCAFVGA KIVGPRLGKYNSDGTPNAIPGHNIPLGALGVFILWFCWFGFNCGSTTAASLNLGDIAITT NLAAAASTFVTLLFTWARYGKPDVSMTLNGALAGLVAITAGCDVVTPYEAVIIGVIAGFI VVFAIEFIDKVVRVDDPVGAVGVHGCCGLVGTLLTGIFGEGCTFIAQLIGVASVLVYTAV LAFIIFTAIKLTIGLRVTDQEQLDGLDMHEHGMTAYSGFRIDK >gi|229784117|gb|GG667618.1| GENE 41 40199 - 40537 546 112 aa, chain - ## HITS:1 COG:HI0337 KEGG:ns NR:ns ## COG: HI0337 COG0347 # Protein_GI_number: 16272289 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Haemophilus influenzae # 1 112 1 112 112 115 50.0 2e-26 
MKKLEIIIRPEKLENLKAILEGCKVNGIMISNVMGYGNQKGYTQMYRGTKYNVNLLPKVK VETVIPKELADPIIDAVVDEINTGNYGDGKIFVYDVQDAIRIRTGEHGRGAL >gi|229784117|gb|GG667618.1| GENE 42 40823 - 41728 992 301 aa, chain + ## HITS:1 COG:BH0061 KEGG:ns NR:ns ## COG: BH0061 COG1947 # Protein_GI_number: 15612624 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Bacillus halodurans # 7 275 5 270 287 252 49.0 7e-67 MIRHLGLKAYGKINLGLDVLGKREDGYHEVRMIMQTVGLYDKIDLYLKETPGITVETNLY YLPVNENNLVYKAAKLLMDEFHINHGVTIHLKKFIPVSAGMAGGSSDAAAVLFGVNKMFN LGLSMEDLMKRGVKIGADVPYCVMRGTALSEGIGEILTPLPPVPQCQVLIAKPGISVSTK FVYEHLDAASLRPADHPDIDGMIAAIKDRDIYGTASLLGNVLETVTIPEYPVIAEIKERL KSLGAVNALMSGSGPTVFGIFTSPKAAEEAYEEMRYGRSANLAKSVYLTNFYNTKEVSHG K >gi|229784117|gb|GG667618.1| GENE 43 41718 - 42395 1056 225 aa, chain + ## HITS:1 COG:SMb20773 KEGG:ns NR:ns ## COG: SMb20773 COG1802 # Protein_GI_number: 16265213 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 15 209 23 217 228 113 36.0 3e-25 MGNDLKVNMNEYLPLRDVVFNTLRQAILKGELAPGERLMEIQLAERLGVSRTPIREAIRK LELEGLVLMIPRKGAEVAKISEKSLRDVLEVRRSLEELAIELACQRMTEEEINSLEQTQE EFKKAVARGDAMKIAETDETYHDVIYKGTGNDRLVQILNNLREQMYRYRLEYIKDEDKRQ ILLLEHDKILKAVKMRHVEEAKEAMREHIDNQEITVSRNIKEQEG >gi|229784117|gb|GG667618.1| GENE 44 42402 - 42596 384 64 aa, chain + ## HITS:1 COG:no KEGG:Closa_0521 NR:ns ## KEGG: Closa_0521 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 63 1 62 71 68 74.0 9e-11 MMTFKDRYLAGEVEFEAIDDYVEEWNGSDDPRTLAQFLGLNEEEEDVWIEESDEALQALL DSKK >gi|229784117|gb|GG667618.1| GENE 45 42743 - 43366 510 207 aa, chain + ## HITS:1 COG:no KEGG:Closa_0522 NR:ns ## KEGG: Closa_0522 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 207 1 203 207 183 49.0 3e-45 MPTERFNKLPEEKKKAIRDAAMEECIRVPFEKVSINKIIQNAGISRGSFYTYFEDKRDVV 
RYIFSDTADKLKDFWTTSVVSNGGDLWSASEELLDQAITFAQKGKTFQMLQSFVLYQDFD KLFSEVHGGNHMGEKKGNEILAALYEVTDRANFQKTDMKSFTLLVSMIMACVMESIGWYN RHMESEENIKKIFREKLEILQHGICKQ >gi|229784117|gb|GG667618.1| GENE 46 43403 - 44545 1494 380 aa, chain + ## HITS:1 COG:VC1674 KEGG:ns NR:ns ## COG: VC1674 COG0845 # Protein_GI_number: 15641678 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 76 359 75 369 369 89 26.0 8e-18 MSKKKVITIAVSVVLVAAAAALIVSNVAGGKKEETVAEVPPVVSAEHPELRSIVVNTELI GTIEPDSIVYVTPKGAGEIISVNAQTGDQVTAGQLLCEIDTKQVEAARLTMETARVSYED AKSNLDRYSVLHAAGDMAEADFQKLADNVELARLQYESAKLGYNLQLESSRVTAPISGRV ESYNVKVHDMVSQQSQICVIAGAGDGKAVTFYASERIVEGLKVGDALTVVKNGVDHAASI TEVSSMVDPQSGLFKVKASVPDGAALATGTSVKLQVVSQRADNVLTVPVDAVYYEGGAPY IYTYGDGVLHKNAVTVGIADNSYIEVKEGINAGDQVVTTWTTEFYDGSKVTLSENQTSED VTPDAQTETEGQTEPADANQ >gi|229784117|gb|GG667618.1| GENE 47 44558 - 47596 2979 1012 aa, chain + ## HITS:1 COG:BH3816 KEGG:ns NR:ns ## COG: BH3816 COG0841 # Protein_GI_number: 15616378 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Bacillus halodurans # 3 1000 1 1004 1093 423 29.0 1e-117 MGITKSVLKRPITTVLAVLCLLVFGLSSVFSSKMELTPEMNFPMMMITAVYAGASPDDVN ELVTKPIEESVGTLSGVKNIQSMSQENISIVMMEYEYGTDMDKAYSDLKKKMDNITMPDD VEVPSIFEFNMNDMPSMTLSVNDPSADNLYNYVNDEIVPEFEKISSVASVDLSGGQEGYI SVRLIPEKMNQYHLTMNSISQAIKSADFTYPAGSTGVGKQDLSVSTSVKFDTVDSLKRIP VVAGSGKTIYLEDVADISETTGKDTSIGRYNGEDTMSLSINKQQKSSAVDVSKSVNRTIE TLKAEHPQLEVIVVNDTSDQITSALKSVTTTMVMAVIISMIIIFLFFGDIKASLIVGSSI PVSILAALILMSVMGFSLNVVTMGSLVLGVGMMVDNSIVVLESCFRSHKGQGFREYMEAA LEGTGKVIQSIFGGTVTTCVVFIPLALLKGMSGQMFAPLGFTIVFCMAASLISAMTIVPL CYSRFRPVEKKNSPASGLVRMMQNGYRKIIGRLLNKRKTVMGVSILLLVLSFLIAGRLGF ALMPDVDQGTIAVTVEVRPGLQIEEVDKIISRVEKIIMDEPDRKSCMVTYGGSGLSLGGS SASLTAYLKSDRELTTKEVVDKWKKEMRDIPDCNITVAASSMMSAFSASNDLQVILQSSQ LDELKEASDSLVNELTQYPGLIKVHSDLENAAPVIKVHVDPIEAAAEGVAPASVGGLLNN MLSGVKAMTMDVDGNNVDVKVEYPDGEYDTLDKVKGITIPTQTGGSVALTDIADIVYADS 
PAGITRKNKQYQVTITGSFTDEVSTKERQAAVLKDINEQVVSKYLNGTVTRAQNSIDESM YEEFGSLFKAIAIAIFLVFVVMAAQFESPKFSLMVMTTIPFSLIGSFGLLAAMDVEISMP SLLGFLMLIGTVVNAGILYVDTVNQYRSTMDKRTAMIEAGATRLRPILMTTLTTIVSMIP LAIGYGSNGELMQGLALVNVGGLTASTVLSLLMLPVYYSIMSGKVDTTPMPD >gi|229784117|gb|GG667618.1| GENE 48 47763 - 48920 1210 385 aa, chain + ## HITS:1 COG:no KEGG:Closa_0525 NR:ns ## KEGG: Closa_0525 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 381 1 380 383 333 53.0 9e-90 MKKWKQISACCMAAVLAAAVPTGTAWAGSPEFARSAEEWAKLRDNVMEYDELAGLIHEYN VTVQNNQRDFNEKKNKTSDEIAQDYRDAADAIRSSMSGEDDAGSILSDAMSEAQALTLEQ QADDSVEDSEIFKMTYDQTEKTLVSNAQANMISYHQKVLDLELKQKNRELLAATYDSVAL KMNNGMATQMDLLTARENLQNADAAILTAQSDLENTRQKLCVMLGWRYDASPEIMGIPSA DLNRITAMDPSADKEKALENNYTLRINRKKLANSSSNSKKESLQMTIDDNIQRIGASVDS AYKNVIQAQTAYQQASVALDAASKNMEAAERKMTIGTISRLDYLSQQYAYLQAQTAMKNA DMDLFQAMESYDWNVNGLASASAGM >gi|229784117|gb|GG667618.1| GENE 49 48990 - 50159 1140 389 aa, chain + ## HITS:1 COG:no KEGG:Closa_0526 NR:ns ## KEGG: Closa_0526 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 25 386 43 394 398 179 34.0 2e-43 MALAGALTFTGPSAAWAKLSDYDAETAARLQDNNLEYDELENLIREYNPEISYAYSQLEF NEDDTASTVRAVRNDIEDNDYKDLVKTLKQQKKQLEAMPDSEEKAKALLELNTTIKGLEG LISAPASTTKATDKQLEKARKQLDTGVKSALKGTQTLMISYDNLRAQVQTLEKVVELQKE ALEAAKLQAGLGMATDTGVLQAQSDLLSSQSQLDNLRNLAGNLRTQLCLLTGWKADSVPE IGSVPAADPSKIASIDLTADIQKAIGNNQTLISQRHASSTKSTAAIDNRLKNIDVSEQQL TVEMQRLYQVMQQKKILNDAAETAYQKALLAKGAAERQYQLNMLSKLQYLGAEVQYYQAE GARKSADTALLQAMLDYDWAVQGFVSVPQ >gi|229784117|gb|GG667618.1| GENE 50 50225 - 50890 505 221 aa, chain - ## HITS:1 COG:no KEGG:Closa_0527 NR:ns ## KEGG: Closa_0527 # Name: not_defined # Def: stage II sporulation protein R # Organism: C.saccharolyticum # Pathway: not_defined # 1 199 1 199 215 315 71.0 9e-85 MNYKRDLFLCITCLLLAFLLTMASGRRNEEAMAARIAPEILRFHVLANSDSDEDQQLKLR VRTLLLDSIYEGLGENASLNDTKEYVLSNKNSLEQQAEDYMKAQGYDYPAHMEVTRCYFP 
TKTYGDMVFPCGTYDAVRVEIGSGKGHNWWCVLYPPLCFVDSTYAVVPDSSKEVLRQSLD AADYQALLIRQPEVHVRIRSKFLDLLKKDSSVTEGREQPDR >gi|229784117|gb|GG667618.1| GENE 51 51047 - 52111 1109 354 aa, chain + ## HITS:1 COG:CAC2895 KEGG:ns NR:ns ## COG: CAC2895 COG1181 # Protein_GI_number: 15896148 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Clostridium acetobutylicum # 3 351 2 340 343 292 45.0 8e-79 MNRKTAVVLFGGQSSEHVVSCMSVVNVINHMNREKYDLLLIGITEDGHWIKTESAEDIES GRWREGKTSAVISPDATKQCVILTEGDRAEFVKVDVVFPVLHGLFGEDGTVQGLLELARI PYVGCGVLASAVSMDKLYTKIVVDDLGVRQAAYVPVMRSDLGDMEAVVGRVEAKFTYPVF VKPSNAGSSKGVARADSREELEAALYEAAKHDRKILVEEMIKGREIECAVFGGGLVPVEA SGVGEILAAAEFYDFDAKYYNSDSRTVVNPELPGDAANTVREAAKAIFRAVDGYGLARVD FFVKDDGTVVFNEINTLPGFTAISMYPMLWEAAGVSKDQLVDRLVDHAFDRYKL >gi|229784117|gb|GG667618.1| GENE 52 52142 - 52960 987 272 aa, chain + ## HITS:1 COG:BS_racE KEGG:ns NR:ns ## COG: BS_racE COG0796 # Protein_GI_number: 16079891 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Bacillus subtilis # 7 271 5 268 272 227 43.0 1e-59 MADRNSPIGVFDSGVGGLTVAREIMRQMPEERIVYFGDTARVPYGNKSGNTVIRFTRQII RFLLTQDVKAIVIACNTATAYALEAVEGELDIPIIGVIHAGARTATQATKNGKIGIIGTE GTIRSGVYTKVMMEMRPDIEVTGKPCPLLVPLVEEGLLHDSVTDEIASRYLSELKGKYID TLVLGCTHYPLLRSTVGRLMGPEVTLVNPAYETALELKSVLTEADLLCSGPAEGQEQYLF YVSDLAEKFTSFATSILPGAVKETRQINIEEF >gi|229784117|gb|GG667618.1| GENE 53 52962 - 53402 512 146 aa, chain + ## HITS:1 COG:CAC2894 KEGG:ns NR:ns ## COG: CAC2894 COG4506 # Protein_GI_number: 15896147 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 138 1 136 137 79 35.0 3e-15 MTRDVLITISGIQLAEGDSNEVEMITAGDYFQKNGSHYILYDEVMEGQNDIIRNTIKIRP QGLDIIKRGSSNVHMTFEKDKKNLSCYATPFGEMMIGINTNDIMIAEDEDSLKVRVSYSL DINYQHVSECNINLDIHSKSTANIHL >gi|229784117|gb|GG667618.1| GENE 54 54208 - 54675 665 155 aa, chain - ## HITS:1 COG:no KEGG:Closa_3786 NR:ns ## KEGG: 
Closa_3786 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 149 1 149 149 184 88.0 9e-46 MSTFLNVLLVILLILVVVLAVLYFLGRKLEKRQVEQQQALEAAAQTVSMLVIDKKKMKLK DAGLPKIVYEQTPKYMRRAKVPIVKAKIGPKVMTLIADAKVFDVLPVKTEAKVVVSGIYI TEIKSIRGGAVPVAPKKKKFMDRFKKGKKEKNDKK >gi|229784117|gb|GG667618.1| GENE 55 54883 - 58749 4014 1288 aa, chain + ## HITS:1 COG:CAC3143 KEGG:ns NR:ns ## COG: CAC3143 COG0085 # Protein_GI_number: 15896392 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta subunit/140 kD subunit # Organism: Clostridium acetobutylicum # 6 1282 2 1228 1241 1625 65.0 0 MEKSRMRPVPAGKSLRMSFSRQKEVLEMPNLIEIQKNSYQWFLDEGLKEVFEDISPIADF AGHLSLEFVDFTLCKDDIKYTIEECKERDATYAAPLKVKVRLCNKDKDEINEHEIFMGDL PLMTDTGSFVINGAERVIVSQLVRSPGIYYGIGHDKVGKELYTCTVIPNRGAWLEYETDS NDIFYVRVDRTRKVPVTVLIRALGFGTNAEIIELFGEEPKLLASFGKDTSDNYQDGLLEL YKKIRPGEPLSVDSAESLLNSMFFDPRRYDLAKVGRYKFNKKLHFKNRIVGHVLSEDVVD TTTGEILATAGTTVDKQLAVDIQNAAVPYVWVEGLERRQKVLSNLMVDLGAYVSFDPKDA GVTELVFYPVLEKILEDVGTEDEEVLKSAIHRSLGELIPKHVTKEDILASINYNMHLEQQ VGNADDIDHLGNRRIRAVGELLQNQYRIGLSRMERVVRERMTTQDMEGITPQSLINIKPV TAAVKEFFGSSQLSQFMDQNNPLAELTHKRRLSALGPGGLSRDRAGFEVRDVHYTHYGRM CPIETPEGPNIGLINSLATYARVNQYGFVEAPYRLVDKSDPNNPIVTDEVVYLTADEEDN FVVAQANEPLDEEGHFIHKNISGRFREETSEFQRKSIDLMDVSPKMVFSVATAMIPFLEN DDANRALMGSNMQRQAVPLLTTEAPVVGTGMEAKAAVDSGVCVVARNAGVVERSASNEII IKRDADGGRDVYHLTKFKRSNQSNCYNQKPIVYKNNHVEKGEVIADGPSTQNGEIALGKN PLIGFMTWEGYNYEDAVLLSERLVQEDVYTSVHIEEYEAEARDTKLGPEEITRDVPGVGE DALKDLDERGIIRIGAEVRAGDILVGKVTPKGETELTAEERLLRAIFGEKAREVRDTSLK VPHGAYGIIVDAKVFTRENGDELSPGVNQTVRIYIAQKRKISVGDKMAGRHGNKGVVSRV LPVEDMPFLPNGRPLDIVLNPLGVPSRMNIGQVLEIHLSLAARALGFNIATPVFDGANEI DIMDTLDVANDYVNMTWEEFQDKYNDTLKPEVIEYLGEHLDHRELWKGVPLSRDGKVRLR DGRTGEYFDSPVTIGHMHYLKLHHLVDDKIHARSTGPYSLVTQQPLGGKAQFGGQRFGEM EVWALEAYGASYTLQEIMTVKSDDVVGRVKTYEAIIKGENIPEPGIPESFKVLLKELQSL ALDVRVLRDDNTEVQIAESMDYEETDLRSIIEGDRRFREEEPLADYGFQKQEFQDDELVS VEEEEDVDGIDDSYEDFDDADLDESDEE >gi|229784117|gb|GG667618.1| GENE 56 58778 - 62437 
3919 1219 aa, chain + ## HITS:1 COG:CAC3142 KEGG:ns NR:ns ## COG: CAC3142 COG0086 # Protein_GI_number: 15896391 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Clostridium acetobutylicum # 13 1187 6 1171 1182 1654 68.0 0 MPENNETYHPMTFDAIKIGLASPEKILEWSHGEVKKPETINYRTLKPEKDGLFCERIFGP SKDWECHCGKYKKIRYKGVICDRCGVEVTKASVRRERMGHIALAAPVSHIWYFKGIPSRM GLILDISPRTLEKVLYFASYIVLDAADTGLQYKQVLSEKEYREEVDKYGYGAFRVGMGAE AILELLQAIDLEKDSEELRKGLKDATGQKRARIIKRLEVVEAFRNSGNKPEWMIMTAIPV IPPDIRPMVQLDGGRFATSDLNDLYRRIINRNNRLARLLELGAPDIIVRNEKRMLQEAVD ALIDNGRRGRPVTGPGNRALKSLSDMLKGKQGRFRQNLLGKRVDYSGRSVIVVGPELKIY QCGLPKEMAIELFKPFVMKELVANGTAHNIKNAKKMVERLQTEVWDVLEDVIKEHPVMLN RAPTLHRLGIQAFEPILVEGKAIKLHPLVCTAFNADFDGDQMAVHLPLSVEAQAECRFLL LSPNNLLKPSDGGPVAVPSQDMVLGIYYLTQERPGAKGEGMVFRNVNEAILAYENGAVTL HSRIKARVTKKMADGTVKTGIVESTVGRFIFNEIVPQDLGFVDRSIPGNELLMEVDFHVG KKQLKQILEKVINVHGAVQTAETLDDIKAIGYKYSTRAAMTVSISDMTVPESKPKLIADA QATVDQIAKNFRRGLITEEERYKEVIDTWKETDDQLTHDLLTGLDKYNNIYMMADSGARG SDKQIKQLAGMRGLMADTTGHTIELPIKSNFREGLDVLEYFISAHGARKGLSDTALRTAD SGYLTRRLVDVSQDLIIREVDCCEGKSIPFMEIKAFMDGQETIEGLEERLTGRYIAETIT DPDTGEVVVKANHMCTPKRAAAVMKVLQKLGRDSVKIRTVLTCKSHLGVCAKCYGANMAT GQPVQVGEAVGIIAAQSIGEPGTQLTMRTFHTGGVAGGDITQGLPRVEELFEARKPKGLA IIAEFGGVATIKDTKKKREIIITDNETGNSKTYLIPYGSRIKIQDGVYLEAGDELTEGSI NPHDILKIKGVRAVQDYMIQEVQRVYRLQGVEINDKHVEMIVRQMLKKIKIEESGDSDVL PGTSMDVLDYNDMNEALLAEGKEPAEGKQVMLGITKASLATDSFLSAASFQETTKVLTEA AINGKVDHLIGLKENVIIGKPIPAGTGMKRYRTVKLSTDGEEMNEDEIILGDDLVIPVTE EYAEAEEVLDIDEIDEEDV >gi|229784117|gb|GG667618.1| GENE 57 62634 - 62816 74 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MADLIIGIILIAAVAAVIWKKRRDKKMGKSGCGGCCGSCGIECHQRNGKEDGAVKRRDAK >gi|229784117|gb|GG667618.1| GENE 58 63070 - 63441 570 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620119|ref|ZP_06113054.1| ## NR: gi|266620119|ref|ZP_06113054.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 123 1 123 
123 157 100.0 3e-37 MADQETLDAIAVMLDGLERRLDQKFDQKLEEKLEEKLEEKLEEKLEEKLKPIRTDILTLY GEVHGLKNEMADVRDNVYELRKDMDTVKKQTVYNYDRIIDNYAAIRELQTVVEGKMDKPR LVQ Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:50:43 2011 Seq name: gi|229784116|gb|GG667619.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld12, whole genome shotgun sequence Length of sequence - 48531 bp Number of predicted genes - 46, with homology - 45 Number of transcription units - 22, operones - 12 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 612 612 ## gi|266620120|ref|ZP_06113055.1| conserved hypothetical protein 2 1 Op 2 . - CDS 664 - 1080 588 ## Closa_0124 hypothetical protein 3 1 Op 3 . - CDS 1098 - 2084 1029 ## Closa_0123 diaminopimelate dehydrogenase (EC:1.4.1.16) - Prom 2124 - 2183 5.2 - Term 2170 - 2214 5.1 4 2 Op 1 . - CDS 2264 - 2551 370 ## COG2088 Uncharacterized protein, involved in the regulation of septum location 5 2 Op 2 7/0.000 - CDS 2578 - 3699 1130 ## COG0448 ADP-glucose pyrophosphorylase 6 2 Op 3 . - CDS 3696 - 4970 1310 ## COG0448 ADP-glucose pyrophosphorylase - Prom 5002 - 5061 9.7 + Prom 5126 - 5185 8.7 7 3 Tu 1 . + CDS 5218 - 6609 1434 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases + Term 6615 - 6669 14.0 + Prom 6611 - 6670 2.0 8 4 Op 1 5/0.000 + CDS 6724 - 7794 1145 ## COG0420 DNA repair exonuclease 9 4 Op 2 . + CDS 7799 - 9661 2067 ## COG4717 Uncharacterized conserved protein 10 4 Op 3 . + CDS 9661 - 9879 121 ## gi|266620129|ref|ZP_06113064.1| regulatory protein, GntR-family + Term 9909 - 9970 2.9 11 5 Op 1 . - CDS 9946 - 11271 1202 ## COG1114 Branched-chain amino acid permeases 12 5 Op 2 . - CDS 11279 - 12457 1435 ## COG0462 Phosphoribosylpyrophosphate synthetase 13 5 Op 3 3/0.000 - CDS 12529 - 13524 978 ## COG1484 DNA replication protein 14 5 Op 4 . - CDS 13535 - 14785 1103 ## COG3935 Putative primosome component and related proteins - Prom 14818 - 14877 7.8 - Term 14994 - 15049 -0.9 15 6 Tu 1 . 
- CDS 15062 - 16441 1278 ## COG0773 UDP-N-acetylmuramate-alanine ligase - Prom 16519 - 16578 5.3 - Term 16566 - 16609 5.0 16 7 Op 1 5/0.000 - CDS 16629 - 17363 805 ## COG0489 ATPases involved in chromosome partitioning 17 7 Op 2 2/0.000 - CDS 17369 - 18100 766 ## COG3944 Capsular polysaccharide biosynthesis protein 18 7 Op 3 . - CDS 18115 - 18843 378 ## COG4464 Capsular polysaccharide biosynthesis protein 19 7 Op 4 . - CDS 18915 - 20522 961 ## gi|288870138|ref|ZP_06113073.2| conserved hypothetical protein - Term 20616 - 20655 4.4 20 8 Tu 1 . - CDS 20674 - 21591 685 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 21742 - 21801 9.4 - Term 21874 - 21926 -0.8 21 9 Tu 1 . - CDS 22099 - 22668 203 ## BALH_4510 antibiotic resistance protein - Prom 22842 - 22901 7.3 22 10 Tu 1 . - CDS 22906 - 23307 79 ## COG4973 Site-specific recombinase XerC - Prom 23412 - 23471 3.8 + Prom 24025 - 24084 6.1 23 11 Op 1 . + CDS 24170 - 25033 305 ## EUBREC_2629 glycosyltransferase 24 11 Op 2 3/0.000 + CDS 25059 - 25898 348 ## COG3475 LPS biosynthesis protein 25 11 Op 3 . + CDS 25886 - 26659 167 ## COG3475 LPS biosynthesis protein + Prom 27184 - 27243 7.7 26 12 Tu 1 . + CDS 27310 - 27909 117 ## gi|288870139|ref|ZP_06113080.2| O-antigen polymerase family protein 27 13 Op 1 . + CDS 28814 - 29311 203 ## EUBREC_2653 hypothetical protein 28 13 Op 2 . + CDS 29349 - 31358 934 ## COG2513 PEP phosphonomutase and related enzymes 29 13 Op 3 . + CDS 31381 - 32502 413 ## COG1454 Alcohol dehydrogenase, class IV 30 13 Op 4 . + CDS 32507 - 33628 335 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] + Term 33842 - 33883 4.1 - Term 33898 - 33929 2.0 31 14 Op 1 . - CDS 34054 - 35343 546 ## COG3436 Transposase and inactivated derivatives 32 14 Op 2 . - CDS 35366 - 35698 217 ## EUBREC_3237 hypothetical protein - Prom 35722 - 35781 4.4 33 15 Op 1 . 
- CDS 35889 - 36332 184 ## COG3436 Transposase and inactivated derivatives 34 15 Op 2 . - CDS 36326 - 36487 61 ## gi|239627482|ref|ZP_04670513.1| conserved hypothetical protein 35 16 Tu 1 . - CDS 36653 - 36742 73 ## - Prom 36911 - 36970 2.7 36 17 Op 1 5/0.000 + CDS 37332 - 37652 360 ## COG3436 Transposase and inactivated derivatives 37 17 Op 2 . + CDS 37692 - 38405 346 ## COG3436 Transposase and inactivated derivatives + Term 38470 - 38497 -0.1 - Term 38752 - 38786 1.1 38 18 Tu 1 . - CDS 38908 - 39123 104 ## EUBREC_3236 transposase - Prom 39204 - 39263 2.7 39 19 Tu 1 . - CDS 39795 - 40484 589 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] - Prom 40515 - 40574 3.2 40 20 Op 1 . - CDS 40587 - 40919 245 ## Olsu_0592 phosphonopyruvate decarboxylase 41 20 Op 2 . - CDS 40977 - 42119 374 ## COG1454 Alcohol dehydrogenase, class IV 42 20 Op 3 . - CDS 42143 - 44152 1091 ## COG2513 PEP phosphonomutase and related enzymes - Prom 44174 - 44233 3.7 43 21 Op 1 . - CDS 44341 - 45666 175 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 44 21 Op 2 . - CDS 45762 - 46907 322 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC 45 21 Op 3 . - CDS 46900 - 47535 217 ## gi|266620161|ref|ZP_06113096.1| conserved hypothetical protein - Prom 47602 - 47661 4.4 46 22 Tu 1 . 
- CDS 48114 - 48317 149 ## gi|266620162|ref|ZP_06113097.1| putative glycosyl transferase - Prom 48421 - 48480 8.1 Predicted protein(s) >gi|229784116|gb|GG667619.1| GENE 1 3 - 612 612 203 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620120|ref|ZP_06113055.1| ## NR: gi|266620120|ref|ZP_06113055.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 203 1 203 204 392 100.0 1e-108 MDNEGKKAAQTDRQALLGRLRDTRELYVIVSACTKLPYVVCAPETFDDEVLVFFKPEEAE ARAKSLAAEKIPVSVVKLENKQLLLFYTNLYTMGVNAISLEDSGKEELIQLEDFVRRRGD EPGEEGKVWVENPALHLTALYYMQEARRQPGPEMNSKLMELQEEIEMHFRKGNFILAIQK EGNGIPLVKMKNGDVYQAAFTDI >gi|229784116|gb|GG667619.1| GENE 2 664 - 1080 588 138 aa, chain - ## HITS:1 COG:no KEGG:Closa_0124 NR:ns ## KEGG: Closa_0124 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 133 5 131 136 176 66.0 2e-43 MAAKEMVLYYQPAKEKEAGNASKAAKLKGVLIRMGVRIKNITPEQTGQTIGYLAGFDGFD EMEPQEGSVAPEIDEEMLVMKNFTNRRIDELLVGLRRAGVPKIELKAVVTETNCGWSFYA LYEELKKEREAMTKKEED >gi|229784116|gb|GG667619.1| GENE 3 1098 - 2084 1029 328 aa, chain - ## HITS:1 COG:no KEGG:Closa_0123 NR:ns ## KEGG: Closa_0123 # Name: not_defined # Def: diaminopimelate dehydrogenase (EC:1.4.1.16) # Organism: C.saccharolyticum # Pathway: Lysine biosynthesis [PATH:csh00300] # 1 328 1 328 328 548 79.0 1e-154 MTIKIGILGYGNLGKGVECAVKHNPDMELAAVFTRRDKASLKVLTPGVKVCSVQEAESMK DEIDVMILCGGSATDLPVQTPELAKNFHVVDSFDTHARIPEHFEAVDAAAKESGHVGIIS VGWDPGMFSLNRLYGSAVLPEGKDYTFWGRGVSQGHSDAIRRVEGVKDARQYTVPVEAAL QAVRNGENPELTTRQKHTRECFVVAEEGADLKRIEEEIVTMPNYFADYDTTVHFISEEEL MRDHKGIPHGGFVICTGKTGWENEHSHVIEYSLKLDSNPEFTASVIAAYARAAYRLGKEG VTGCKTVFDIAPAYLSSMSGEELRKHLL >gi|229784116|gb|GG667619.1| GENE 4 2264 - 2551 370 95 aa, chain - ## HITS:1 COG:CAC3223 KEGG:ns NR:ns ## COG: CAC3223 COG2088 # Protein_GI_number: 15896470 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein, involved in the regulation of 
septum location # Organism: Clostridium acetobutylicum # 1 81 1 81 95 116 74.0 8e-27 MQITDVRVRRIEKEGKMKAIVSITLDNEFVIHDIKVIEGEKGLFIAMPSRKAADGEYRDI AHPINSNTRDMIQRVILDKYETTALELPEEEAAMA >gi|229784116|gb|GG667619.1| GENE 5 2578 - 3699 1130 373 aa, chain - ## HITS:1 COG:TM0239 KEGG:ns NR:ns ## COG: TM0239 COG0448 # Protein_GI_number: 15643011 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Thermotoga maritima # 1 349 1 350 370 313 43.0 2e-85 MRAIGIVLAGGNSKRMRELSNKRAIAAMPIAGSYRSIDFALSNMTNSHIQNVAVLTQYNS RSLHLHLSSSKWWDFGRKQGGLFVFTPTITAESSDWYRGTADALYQNLTFLKNSHEPYVV ISAGDGIYKLDYGKVLEYHIEKKADITVVCTDLPEGEDVTRFGLVGTNEDGRITDFEEKP MVASSNTVSCGIYVIRRRQLIELIERCAMEDRYDFVNDILVRYKNLKRIYAYKMDSYWRN IASVESYYRTNMDFLKPEVRDYFFRQYPDVYSKIDDLPPAKYNPGAVVKNSLVSSGSIIN GVVENSVLFKKAYIGNNCVIKNSIILNDVYIGDNTVIENCIVESRDTIRANTTYIGTPDN IKIVVEKNERYTL >gi|229784116|gb|GG667619.1| GENE 6 3696 - 4970 1310 424 aa, chain - ## HITS:1 COG:TM0240 KEGG:ns NR:ns ## COG: TM0240 COG0448 # Protein_GI_number: 15643012 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Thermotoga maritima # 7 418 5 419 423 473 55.0 1e-133 MIKKEMIAMLLAGGQGSRLGVLTQKVAKPAVSFGGKYRIIDFPLSNCINSGVDTVGVLTQ YQPLRLNTHIGIGIPWDLDRNVGGVTVLPPYEKSKGSDWYTGTANAIYQNLEFMETYNPD YVLILSGDHIYKMDYEVMLEYHKANNADITIAAMPVPIEEASRFGIVITDENNRITEFEE KPANPRSNLASMGIYIFSWKVLKEALIKMSEEPGCDFGKHIIPYCHAAGQRIFAYEYNGY WKDVGTLGSYWEANMELIDIIPEFNLYEEYWKIYTKSDIIPPQYVSEDAVIERSIIGEGS EIYGEVHNSVIGAGVTVAKGAVIRDSIIMRESVIGAGSSIDKAVIAENVKIGSNVALGVG EYAPSKYDPKVYQFDLVTVGENSIIPDNVKVGKNTAIVGETIVEDYPDGLLASGDYIIKA GGMR >gi|229784116|gb|GG667619.1| GENE 7 5218 - 6609 1434 463 aa, chain + ## HITS:1 COG:CAC3260 KEGG:ns NR:ns ## COG: CAC3260 COG0017 # Protein_GI_number: 15896505 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 1 463 1 463 463 659 66.0 0 
MDLVTVRQIYKDREQYLNKEISVGGWVRSLRDSKAFGFIVLSDGTYFETLQVVYHDTMDN FSEISRLSVGTALIIKGTLVATPEAKQPFEIQATEIVIEGASSPDYPLQKKRHSFEYLRT ISHLRPRTNTFQAVFRVRSLIAYAIHQFFQERDFVYVHTPLITGSDCEGAGEMFQVTTLD LNDIPKTDEGKVDFSQDFFCKPTNLTVSGQLNAETYAMAFRNVYTFGPTFRAENSNTTRH AAEFWMIEPEMAFADLNDDMELAESMLKYVIGYVLDHAPEEMNFFNQFVDKGLLDRLHHV LNSEFAHVTYTEAVELLEKHNDEFDYKVFWGCDLQTEHERYLTEQLFKKPVFVTDYPKEI KAFYMKMNPDNKTVAAVDCLVPGIGEIIGGSQREDDYDKLTARMDELGLKKEDYQFYLDL RKYGSTRHAGFGLGFERCVMYLTGMSNIRDVVPFPRTVNNCEL >gi|229784116|gb|GG667619.1| GENE 8 6724 - 7794 1145 356 aa, chain + ## HITS:1 COG:MJ1323 KEGG:ns NR:ns ## COG: MJ1323 COG0420 # Protein_GI_number: 15669513 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Methanococcus jannaschii # 1 207 1 213 366 66 30.0 7e-11 MKFVHIADVHWGMSPDSDKPWSKERSQDIKDTFAKAVAQAGQLEADCLFISGDLFHRQPL ARDLKEVNYLFSTIPGVHVVIIAGNHDRIRNNSALLSFTWAPNVTYLMDEELQCVYFEEI NTEVYGFSYHTTDIPENRLDHLKVPNNGRIHVLLGHGGDANHIPFDKGAMGALDFSYIAM GHIHKPEVLIENRMAFSGSPEPLDKTEAGQHGMFVGEINEVTRMVTSLKFVPLARLQYIS LAVNVTTATTNTELAMKITQEIQNRGPENIFRFRIRGMRDPDISFDLDMLSTRFKIMEII DESEPQYDFSALFAEHPSDMIGFYIQALQKPEMSQVEKKALYYGINALLRTTDERS >gi|229784116|gb|GG667619.1| GENE 9 7799 - 9661 2067 620 aa, chain + ## HITS:1 COG:MA2362 KEGG:ns NR:ns ## COG: MA2362 COG4717 # Protein_GI_number: 20091195 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 245 10 240 1300 60 30.0 8e-09 MKILDIYINGFGKFHGRNLSFEDGLNIVYGKNEAGKSTIHTFIRGMLFGIEKQRGRASRN DLYSKFEPWENSGTYEGQLRLEHKDHIYRIERTFQKNKKEFKVVDETAGREIEPTKAFLD DLLSGLSETAYNNTVSIGQLKSATDEGMVSELKNYIANLNTTGNIALNITKATSFLKSQH KELESQMVPEAARTYTSLLSEIRNTEKEIASPEYENQIQAYQRIRVEVKDTLEVKQKEKE ELIQKVARGKQVLANNQFTDQDSITAYSIKTQGTFDEYTEAKEVCGRKSKKILSVLSLVI ATLLLCGAGAVYYLGDSNYLTAAYGMDSLVYIAAAVGAAIIFYLIGLILYLRLRHRQKDM ELSAKVLQEIFSRHLGDTAISMDAMRAFQARMAEFTRLSSAIAKSETAIEQKAAEITELQ GRQETCGEVIEKQQKTQWELEKKLEHLSACKTQAEGLKHILAENDRIREELAAIDLALET 
MTTLSSSIRDSFGLYLNKTASDLIAGITGGIYTSLSVDENLNVFLNTKTKLVPLEQVSSG TMDQVYLALRLAAARLIIGDHGEMPLIFDDSFVLYDDDRLRTALRWLVKACEGQIIIFTC HQREAQMLTANLIPYHLVEI >gi|229784116|gb|GG667619.1| GENE 10 9661 - 9879 121 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620129|ref|ZP_06113064.1| ## NR: gi|266620129|ref|ZP_06113064.1| regulatory protein, GntR-family [Clostridium hathewayi DSM 13479] regulatory protein, GntR-family [Clostridium hathewayi DSM 13479] # 1 72 1 72 72 130 100.0 3e-29 MCRQGFAAHSRRSIMARSASPQLAASLEVPEGTALLVIENIVYDQYDVPIELSIESISGE THQFSFDVFSNS >gi|229784116|gb|GG667619.1| GENE 11 9946 - 11271 1202 441 aa, chain - ## HITS:1 COG:CAC1610 KEGG:ns NR:ns ## COG: CAC1610 COG1114 # Protein_GI_number: 15894888 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid permeases # Organism: Clostridium acetobutylicum # 1 429 1 431 440 348 49.0 2e-95 MERLSKRNVCLVGLTLFSMFFGAGNLIFPPFLGAMAGTNTWTAMAGFAVTAIGFPILGVI AVAKAGGLTGLAGRVHPGFASVFTLLIYLSIGPCLAIPRTASTSFEMAVIPFLHGRVSVG RAQLLYSCLFFAIAFLVALKPDKLTDRLGKVLTPCLLTLIAILFFGCILKPASVSGYGGP SEMYQKNPLVKGFLEGYLTMDTIAALNFGIIISLNISALGVKAEASIIKETIKAGVIAGG ILVLVYAALAHIGAVAGGAFGSFENGAQTLNQMVRFLFGKAGLIMLAAIFFIACLNTCIG LISCCSRYFCTLLPRIGYRMWAFIFAVVSLIIANAGLNRILEISVPVLNAIYPAAIVLIL LSFLFPDGKRWRLVYVCSIVLTGITSVAMALYQIGITPLGRVLLLLPLHSLNLEWISPAA VGIVIGVLVRFRANKRKLLFE >gi|229784116|gb|GG667619.1| GENE 12 11279 - 12457 1435 392 aa, chain - ## HITS:1 COG:CAC0819 KEGG:ns NR:ns ## COG: CAC0819 COG0462 # Protein_GI_number: 15894106 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Clostridium acetobutylicum # 14 385 11 367 371 403 51.0 1e-112 MLYEEKTIETIPVGPLGLIPLKSCTELGAKVNDYLVDWRRERESEHKSTIAFSGYQRESY IIGASTPRFGTGEAKGTLSESVRGDDLYIMVDVCNYSLTYSLCGMTNHMSPDDHFQDLKR IIAAAAGKARRINVIMPFLYESRQHKRSSRESLDCALALQELTNMGVENIITFDAHDPRV QNAIPLKGFETVQPIYQFIKYLLREEKELQIDSDHMMVISPDEGGMGRAVYFANVLGLDM 
GMFYKRRDYTQIVNGRNPIVAHEFLGSSVEGKDVIIVDDMISSGESMQDVAKELKRRKAR KVFICSTFGLFTNGLAKFDEYYENGTIDRILTTNLVYQTPDLLSRPYYINVDMSKYIALI IDNLNHDASLSELLNPTSRINRLLDRYRAGDR >gi|229784116|gb|GG667619.1| GENE 13 12529 - 13524 978 331 aa, chain - ## HITS:1 COG:CAC3588 KEGG:ns NR:ns ## COG: CAC3588 COG1484 # Protein_GI_number: 15896822 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 1 328 1 325 329 187 34.0 3e-47 MSLSNSQYDAIMRTYQQQQLQNRHEQENRIAEVYKKIPAIKELDDSISSCAVKSARRLLD GDQGALKELRAEIADLREQKSVLLRAYGFSPDYMEMHYKCPDCQDTGYSEGRKCHCFRQA EMKYLYAQSNIEEIVTLENFSTFSFEYFDDSSVLPVLGRTVRQYMKQVVETCHRFVDDFS TEHGNLLFTGPTGVGKTFLTNCIARELIDRYYSVIYLSANDLFEVFSKNRFEYQAEEEIK GMYQYILDCDLLIIDDLGTELNNSFTSSQLFYCINERLNSSKSTIISTNHPLNELRDRYT ERVTSRLISKYTIIPLYGDDIRLRKKAKRAD >gi|229784116|gb|GG667619.1| GENE 14 13535 - 14785 1103 416 aa, chain - ## HITS:1 COG:CAC3587 KEGG:ns NR:ns ## COG: CAC3587 COG3935 # Protein_GI_number: 15896821 # Func_class: L Replication, recombination and repair # Function: Putative primosome component and related proteins # Organism: Clostridium acetobutylicum # 207 408 126 316 328 104 26.0 3e-22 MKSEVMAVDFKSSFKVSATLVANEFIDTYMASANGEYIKEYLFVLRHEGEAVTVSMIADA INHTESDVTRALSYWKKLGVLEASVDKVPVDGPDVDKWPAAAQTACAGGMITESMESPLN GVFPEACVAVQSSYDYVERPVKGSGTGTVVYGEAAVPEKREAVLTAPEPASAQAPVKKTE PRPVYSPEQISGLAGDEDFSQLLYIAQKYMNKVFTQRECEVFAYLYDGLHMTAELLEYVV EYCVQGGHTSIRYIETVAINWHESGLKTVEDAKAYASSFTRDSFAVMKAFGLNDRKPGDA EREMIDRWFKTYGFTRELVLEACNRTLTATHTPSFKYADKILSGWKKAGVCSLEDVKRLD EQHAGRTKGKGSGNGAGTQEYRGGGNGRGRSTNQFQNFPQREIDYDALVLQQLTEN >gi|229784116|gb|GG667619.1| GENE 15 15062 - 16441 1278 459 aa, chain - ## HITS:1 COG:CAC3225 KEGG:ns NR:ns ## COG: CAC3225 COG0773 # Protein_GI_number: 15896472 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Clostridium acetobutylicum # 10 457 11 456 458 428 48.0 1e-120 MYQIDFHKPENIHFIGIGGISMSGLAEILIDEGFTVSGSDSHESELTDHLAAKGARIAYG 
QRAENIVDGIDVVVYTAAVHPDNPEFAAAMEKGLPMLSRAELLGQMMKNYENAVAVSGTH GKTTTTSMITEILLAADADPTISVGGILNSIGGNIRVGGPELFVTEACEYTNSFLSFFPT IEVILNVEADHLDFFKDLDDIRRSFRLFAEKLPKDGILVINSDITDYEKITEGLPCRVIT FGKEESSKYRAEQIRFDELARPVFDLSVDGETVDTVALGVPGEHNVYNSLAAIAVCMELG ISLEMIKKGLWKFTGTNRRFEKKGELGGITIIDDYAHHPQEISASLATALNYPHRKLWVV FQPHTYTRTKAFLDEFAEALSMADEVILADIYAARETDNLGISSKDIVDRIEAKGVKAHY IPSFDEIETFILENCIHGDLLITMGAGDIVKVGEKLLGN >gi|229784116|gb|GG667619.1| GENE 16 16629 - 17363 805 244 aa, chain - ## HITS:1 COG:SP0349 KEGG:ns NR:ns ## COG: SP0349 COG0489 # Protein_GI_number: 15900278 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Streptococcus pneumoniae TIGR4 # 9 221 8 226 227 171 42.0 1e-42 MQKVLLRKHEKLDFRANEAFKTLRTNILFSGHDTRVITLTSSTPNEGKSSLSFQLAISFA EADNRVLFIDADIRKSVLVGRYKAEGNKFGLTHYLSGQKTIDDVICNTDINNMDMIFAGP VSPNPTELLSGDLFAELIQKAREEYDYVIIDAPPLGSVIDAALIARHVDGTIIVVESGAI SYKMVQHVKEQLEKGECRILGVVLNKVDLDRDSYYGNYYGKYYGKYYGKYYGNYGEETKK QDKK >gi|229784116|gb|GG667619.1| GENE 17 17369 - 18100 766 243 aa, chain - ## HITS:1 COG:SP0348 KEGG:ns NR:ns ## COG: SP0348 COG3944 # Protein_GI_number: 15900277 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 5 223 3 224 230 127 37.0 2e-29 MGTNQDDEIEIDLKELFYVIKRKLWIILLTGIVGAVGFGLFTAMVMKPVYTSSTMLYIVN KTTTLTSLTDLQLGTQLTKDYKVLVTSRPVTAQVITNLDLNLSHEQLVRKIKVDNPTDTR ILTISVEDTDPYMAKSIADEFASVASARMAEIMDSAPPNIVEEAYLPTQKTKPSITKNTM IGGLAGVFLAGAIILVLFVMNDAIKTPEDVERYLGLNTLATIPVFEGETGTKKKKSKRKK SGR >gi|229784116|gb|GG667619.1| GENE 18 18115 - 18843 378 242 aa, chain - ## HITS:1 COG:SP0347 KEGG:ns NR:ns ## COG: SP0347 COG4464 # Protein_GI_number: 15900276 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 3 241 1 242 243 115 28.0 
1e-25 MDLYDIHCHLIPGVDDGSQTMEESLEAMKLEYEEGVRHIICTSHFSADCEAGYTSKLQES FEKLKLRLKETTYGAEMRLHLGNELMYSESLLECLDEGRAWTMAGSHYILVEFLPSVRYE ELYRGLRKLASGGYTPILAHMERYRCLYKENDRLDELRDLGICLQVNGASLFGGVFDSAS AVVRKLCKSGRIHFLGTDSHGTHYRKPQIKKAAAWIHDNCPASVAEGILCGNPMAVLEDK LF >gi|229784116|gb|GG667619.1| GENE 19 18915 - 20522 961 535 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870138|ref|ZP_06113073.2| ## NR: gi|288870138|ref|ZP_06113073.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 535 44 578 578 1030 100.0 0 MKNKTRAADHCSLLSLLFSTFLIVIFFCIPAYAGSINGNEQSIVSVINGQFEQDGVIYKV RQEYINSAISYLQQDDVDLTAEQAQSVISEIYSNVQTGVDSGYLEAIGQTAPPETESQPT APAQDPNSGTEGKDKAGDKNTDVQETDETAEAGNQPGESTESGEPAGDDQNGGDGTGDPS GTGSDEPQAATPPTPEVISILELVDKAPPQSYEYLSKDTDALMARIHIPYQTLWYILIIL AVVIAIITAISFLKKLFVHHHNRKLRKYFRFILAAEIGVLSAVCFTAVSVWIGAFQDSAV LNKLADTGYYRTIFEELRRDTSISFALLDIPDNVMDGAITYERVVLAARQQVESDLSRGA YQADTSMLTDRLEADIRSYLEGKSVTMTEQAQTGLDELMNRLDQKYSSLLKWPFAAWWIE MKTEFFSFAKIALPFSLLLAILSQALIIGLNHYKYRGVILGGKGLMVGGLLTAAAFGVGM VMIGQKLSTMTPDYMELFFKIYGNGLCRTGMITGVIGVLLGIIVLVAVHTWKDSK >gi|229784116|gb|GG667619.1| GENE 20 20674 - 21591 685 305 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 22 118 532 625 693 78 38.0 2e-14 MRMKKLAVMALATALSVGSVMTASAAWLQEGSNWRYQNDNGTFQTGTWFRDVDGRWYHFD NNGIMQKGWFQDADGKWYFLAYNGVMQVGLIKVDNQVYYMNASGDLFLGDMTINGTTYNF GLYGTTNGQPNVPSTATYGGNGNQSLPGGGSSSNGGGSTATPAEKVEGAVNEVKNAAKES IKGAESVIAGIEVSDPVTKGDAAVVEVKVDVIDIKDTDDAELVKGSIASVVDTTISELDG AEKVAVSIPGISKSFTVDELRGDKLDDLLDNYVTPDFYKDHKNSSVTVTVPVNGVNVTYT ISLAK >gi|229784116|gb|GG667619.1| GENE 21 22099 - 22668 203 189 aa, chain - ## HITS:1 COG:no KEGG:BALH_4510 NR:ns ## KEGG: BALH_4510 # Name: not_defined # Def: antibiotic resistance protein # Organism: B.thuringiensis_AlHakam # 
Pathway: not_defined # 5 167 75 232 250 63 30.0 5e-09 MFDLLDKAMWSYIIKDIGKELEYLPYALFLGIIVYFVFKIILRKNSKPLYNFYKVIFLVY LFALVHITLFEREPGSRTAISLTLFETLGGSRNNAYVVENVLLFIPYGFLLAILFRPMRN FVLCLAVGAMSSLTIETVQLLTQRGYFQVDDILMNSIGTVIGCAVFLITAGSYRVCKWII TKYCESAKI >gi|229784116|gb|GG667619.1| GENE 22 22906 - 23307 79 133 aa, chain - ## HITS:1 COG:YPO3843 KEGG:ns NR:ns ## COG: YPO3843 COG4973 # Protein_GI_number: 16123978 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerC # Organism: Yersinia pestis # 13 105 193 285 303 89 43.0 1e-18 MLFISSITVNQKLQKWIRTRQEFNLQDDALFVNKYGKRLSIYSIENLFYKYRELAHINSE ATPHYLRHSFATQLLNNGADIRAVQDILGHSSIVTTQIYTEVSLKRKKEVLLNYNGRNFI NLPEDSMIEEWLK >gi|229784116|gb|GG667619.1| GENE 23 24170 - 25033 305 287 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2629 NR:ns ## KEGG: EUBREC_2629 # Name: not_defined # Def: glycosyltransferase # Organism: E.rectale # Pathway: not_defined # 21 287 4 269 269 342 62.0 2e-92 MNKTNNIFSENNTADRPTKVKIVIATHKKYDMPNDIMYIPLHVGAEGKKDACGNDLDLGY LKDNTGDNISLLNPGFCELTGLYWAWKNLEADYIGLVHYRRHFSIKKTKYPFDGILTYNE IKPMLGKIKIFVPDKRKYYIESLYSHYAHTHYSIQLDETKKILEEKYPEYIKSYDTVINR SYGYMFNMMIMEKNLLEKYCIWLFDILFELKKRFEMPELSDYQQRYYGRISEIAFNVWLD YQIRKHNIRRNEIKEIHSIHLERVNWKTKGIAFMQAKLFGKKYEGSF >gi|229784116|gb|GG667619.1| GENE 24 25059 - 25898 348 279 aa, chain + ## HITS:1 COG:SP1273 KEGG:ns NR:ns ## COG: SP1273 COG3475 # Protein_GI_number: 15901133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 1 279 1 266 267 131 31.0 2e-30 MREINIDELKKIEYDMLVDFAKFCDKNNLIYFLSSGTLLGAVRHHDFIPWDDDIDVMLPR KDYEKFVTSYKNKDYFVDDLLINNSCWGRCGKLKCRNTILESNLKENYKEKVFIDIFPID GICDNKLIQKIKLSIIQLFINFHMSTITKCKPTKRYADKNAGILNWKTGVRTLLKYFLIL TIGNTRPQFWTRIINGWLKKTDFNSARYAGFFAGGYYGTKELMPKKIFDKRIPMKFGKYD FWIPEEYDYYLTNLYGNYMKLPPIEKRQPHHAFKAYWLD >gi|229784116|gb|GG667619.1| GENE 25 25886 - 26659 167 257 aa, chain + ## HITS:1 COG:HI1540 KEGG:ns NR:ns ## COG: HI1540 COG3475 # 
Protein_GI_number: 16273440 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Haemophilus influenzae # 12 257 6 264 265 93 29.0 4e-19 MVGLKRGYMKDLREHQLYAIEALEYLTQILDRNNIKYFLLAGSCLGAVRHKGFIPWDDDI DIGIFNEDYDRFETVLIQELPKQFTWINAKTDNLYPRFFGKILNGTTPSVDVFRLVKMPQ NKYKQKQKWILKNILVKLYSRKCHWKFESENTVVYYFSYILSLLFSKAALLKICRWNENR FASEHNYDYVNMYSYYGLKRELISYKLTLSISKVEFEGKKYTTFSDTDTYLTNLYGDYMT PPSPDKRKPEHIANKFV >gi|229784116|gb|GG667619.1| GENE 26 27310 - 27909 117 199 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870139|ref|ZP_06113080.2| ## NR: gi|288870139|ref|ZP_06113080.2| O-antigen polymerase family protein [Clostridium hathewayi DSM 13479] O-antigen polymerase family protein [Clostridium hathewayi DSM 13479] # 1 199 217 415 415 352 100.0 8e-96 MLFNLCDNVRFFNAKNYLIINIEAFILMVLLKLQRIFEFLIVGILGKNLTLSGRIYIWDR TITLILQNPLFGYGVEYNEGRASKYALRTLYHTSSKLASFAGLHAHNRFLETTYRGGIIL LVIYIFILFVAFKSLKRNEKTNCAKVIAIGLFAYLMGMLTEFYRLSYLFFPMMVIAERAS VLDYKLNWSKINETNSKKV >gi|229784116|gb|GG667619.1| GENE 27 28814 - 29311 203 165 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2653 NR:ns ## KEGG: EUBREC_2653 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 145 320 468 486 94 40.0 2e-18 MFLAPEVMKILAPVSYHEGIYVIPSVVAGVYFSVMYYVFANIVYYYKKPKYVMIGSCCSA VINVILNAIFIPIFGYLAAGYTTMFAYGFQAFIDYYALRKVIGKDIYDKKFLLMTSSFVV FVAILLNYIYDKTFIRLGLVGAILLYTIYYFSKNKDIFSMFFQKK >gi|229784116|gb|GG667619.1| GENE 28 29349 - 31358 934 669 aa, chain + ## HITS:1 COG:mlr5882 KEGG:ns NR:ns ## COG: mlr5882 COG2513 # Protein_GI_number: 13474899 # Func_class: G Carbohydrate transport and metabolism # Function: PEP phosphonomutase and related enzymes # Organism: Mesorhizobium loti # 382 665 11 281 302 133 31.0 1e-30 MKALILNSGLGSRMGIMASEQPKCMTKISNSETILSRQLKQLVDAGIDEVVMTTGLFDGV LVNYCQSLDLPVHITFVKNPFYQNTNYIYSIYCAREYLDDDIILMHGDLVFENAVFDTVL SSVKSCIVVSSTKELPEKDFKAVIRNGHVEKIGIEYFHHAMAAQPLYRLRKNDWKIWLNN 
IIKFCESDNVKCYAENAFNEVSNTCMIYPLDIKDMLCGEIDDLEDLAVVSNKLKEIESRT VYMCFSADMLHSGHIAIIKKAQRLGKLMIGVLSDEAVVSYKRYPLLPFTERKAMFENIAG VYKVVEQKTLSYKDNLKKYQPTYVVHGDDWCNGFQKFVRDEVVSVLASYGGILIEYPYSN DKKYQSLEYRAREDLSLPDIRRARLKKAIAMKGTVSAIEAHSGITGLIAEKTVVYQNGEA HQFDAMWVSSLCDSTAKGKPDIELVDMTSRFRTIDDIMEVTTKPIIFDGDTGGLTEHFVY TVKTLERMGVSMIIIEDKTGLKKNSLFGTAVAQTQDSIENFCAKITAGKKAQKTQDFMIC ARIESLILEKGMEDALTRAIAFTKAGADAIMIHSRKKDPAEIFEFVDKFRKEDSITPIVV VPTSFNTVTEEEFKAKGINVVIYANQLTRTGFPAMQNAARTILENHRAKECDDMCMSIKD IITLIPDEV >gi|229784116|gb|GG667619.1| GENE 29 31381 - 32502 413 373 aa, chain + ## HITS:1 COG:PM1453_2 KEGG:ns NR:ns ## COG: PM1453_2 COG1454 # Protein_GI_number: 15603318 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Pasteurella multocida # 16 329 19 357 414 113 25.0 5e-25 MEQKIILTDSSYTGLIEYIQQYKLKKILLVCEKIINLLQIGNFLISNAGKIGFDIILFSD FQPNPSYESVVKGVKLFHQKHCDCIFAIGGGSAMDVAKCIKLYSNMNPNKNYLHQKIIPN DVKLLVAPTTAGTGSEATRFAVIYYNGEKQSVTNDSCIPSAVFIDASALKTLPEYQRKAT MLDAFCHAIESFWSVNSTEESKLFSKEAITMILSNMDGYLKNKWEQNAAMLCAANLAGKA INITQTTAGHAMCYKLTSMYGIAHGHAAALCVRILWRYMIDNTALCTDPRGKEYLENVFA EIADAMGCMDAREAAEKFYNIYDELGLKIPDHRATDFDILISSVNPVRLKNNPIALTADA IELLYWQILRKEK >gi|229784116|gb|GG667619.1| GENE 30 32507 - 33628 335 373 aa, chain + ## HITS:1 COG:MTH1207 KEGG:ns NR:ns ## COG: MTH1207 COG0028 # Protein_GI_number: 15679218 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanothermobacter thermautotrophicus # 182 343 3 155 185 86 33.0 7e-17 MKANKLAEILGADFYTGVPDSQLKALCNYLMDTYGIDPKHHIIAANEGNCTALAAGYHLA TGKIPVVYMQNSGEGNIINPVASLLNDKVYAIPVIFIVGWRGEPGTYDEPQHIYQGEVTI KLLEDMGIKPFIIFKDTTNEEVKSTMEYFGEILKSGKSVAFVVCKDSISYDGKTKYSNNN KMSREEVIRHITKITGEDPVISTTGKASRELFEIRKQDDQSHKYDFLTVGSMGHSSSIAL 
GIAVNKPDTKIWCIDGDGAVLMHMGSMAVLGNYKPANMVHIIINNGAHETVGGMPTVAGS IDLVGIAKSCGYPYAVCVDNFEKLDDELNMAKERNVMSMIEVKCSIASRDNLGRPTTTAF ENKQNFMNYLNLL >gi|229784116|gb|GG667619.1| GENE 31 34054 - 35343 546 429 aa, chain - ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 22 424 54 449 450 221 32.0 3e-57 MFANLPIHYEEVNTLSEEEKQCPECGAGMIPIGHEEIRTELRYTRAKLERIVYIATTCGC PACKDTEDPRFMKDEGSPALIPGGYASASLVSHIMYEKYADALPLYRQKKGFELLGVSIN RTTMANWIITCSQNYLKPIYDYFHRELLKRHFLMADETPIQVLKEPGRRPQNKSYIWLMR SGEDRLPPIILYHYTETRAGGNAADFLDGIDEGSYVMVDGYSGYNRLKKIRRCCCYAHIR RYLMEAIPSGQEKDYSHPAVQGVLYCNKLFEYERSYKAKGLSYAQVYKRRQKDAKPVVEG FMRWLDGQHPEKGSRMDRAVTYIQNRKDTLMTYLEDGRCSLSNNASENPHRPVTLGRKNW LFSDSQDGANASMIVYTMAEMAKAHWLHPYNYLKYLLDSRPGTDTTDAELKDLAPWSEKA RIECNKKSE >gi|229784116|gb|GG667619.1| GENE 32 35366 - 35698 217 110 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3237 NR:ns ## KEGG: EUBREC_3237 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 84 15 91 233 64 49.0 2e-09 MASGAKDIQFRELKDTILQLNMTIRTQNDLIISLQKMLEERNAKDDEKDRIISNLQAQLE YFKQKLFGSSSERRSDLPGQLSLFSGSGSGEEPLPERAGVHRSKDQQAGT >gi|229784116|gb|GG667619.1| GENE 33 35889 - 36332 184 147 aa, chain - ## HITS:1 COG:SP1442 KEGG:ns NR:ns ## COG: SP1442 COG3436 # Protein_GI_number: 15901292 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pneumoniae TIGR4 # 5 105 5 103 116 87 39.0 9e-18 MLGDISMADNIYIVCGATDMRKSIDGLCSIIRDKLSMDPDQSSLFLFCGKRCDRIKILLH EPDGYVLLYKRLSVTQGRYRWPRKSSEAQEITWRQLDWLLSGLDIEQPKAIRTSKKNIVK LPKFPAAFRLDVWITALFFRQIGITSA >gi|229784116|gb|GG667619.1| GENE 34 36326 - 36487 61 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|239627482|ref|ZP_04670513.1| ## NR: gi|239627482|ref|ZP_04670513.1| conserved hypothetical protein [Clostridiales bacterium 
1_7_47_FAA] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] # 1 53 88 140 140 86 84.0 6e-16 MFPVQPALEDPYRIPEISASIEMSLGSATIRIANGADSAILDRVLSLVRGTIC >gi|229784116|gb|GG667619.1| GENE 35 36653 - 36742 73 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSNYNARVPADKQYQLIMECRSSGLTDFQ >gi|229784116|gb|GG667619.1| GENE 36 37332 - 37652 360 106 aa, chain + ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 3 93 106 196 450 77 39.0 7e-15 MKKSPMPYPVIQHSYASPSTVAWVIYQKYELDVPLYRQEKEWADLGVPLSRATMSYWILA SYRDWLSPVVGLLKEKLLEQNYLHIDETPVQVLLEPGRKNTTDSYM >gi|229784116|gb|GG667619.1| GENE 37 37692 - 38405 346 237 aa, chain + ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 1 205 220 420 450 140 35.0 2e-33 MFDYQPGRSGTFPQKFLKGYTGYIHTDAYKGYEKIKGITRCFCWTHLRRYFVDSLPKDVN SPQATIPAQAIRYINRLFELEKNLEVLSPEGRKEERLIQEKPVLEAFWSWAETASAGILP KSKLGEAFTYAFNQKTGLMNYLLDGNCSISNHLAENSIRPFTIGRKNWLFSRSPKGADAS AGIYTLIETARVNGLNSRKYIQYILGDIQDLHSYSIRNTWRTTSHGIQSFKKCVDNR >gi|229784116|gb|GG667619.1| GENE 38 38908 - 39123 104 71 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3236 NR:ns ## KEGG: EUBREC_3236 # Name: not_defined # Def: transposase # Organism: E.rectale # Pathway: not_defined # 2 69 44 111 113 82 50.0 8e-15 MLFLFCYRRCDRIKAIMKEPDSIALIYKRLTAQGHYRWSRDRPDVRSLTWKEFDWMMLGI DIDQQKAIRMS >gi|229784116|gb|GG667619.1| GENE 39 39795 - 40484 589 229 aa, chain - ## HITS:1 COG:MJ0256 KEGG:ns NR:ns ## COG: MJ0256 COG0028 # Protein_GI_number: 15669880 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase 
(cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanococcus jannaschii # 39 166 5 128 188 80 40.0 2e-15 MEDFKDDLGAGKSVAFVVKKGAISYDGKTEYKNNNTMLRETIIQHIVKVSGEDPIISTTG KASRELFEVREANGQNHKYDFLTVGSMGHSSSIALGIALNKPNTKIWCVDGDGAVLMHMG AMAVIGNNRPNNMIHIVINNEAHETVGGMPTVAGGIDMVTIAKSCGYPYAVSVSTSKSLD SELEAAKKRNMLSFIEVKSAIGARDDLGRPTTTSLENKQNFMEYLETLN >gi|229784116|gb|GG667619.1| GENE 40 40587 - 40919 245 110 aa, chain - ## HITS:1 COG:no KEGG:Olsu_0592 NR:ns ## KEGG: Olsu_0592 # Name: not_defined # Def: phosphonopyruvate decarboxylase # Organism: O.uli # Pathway: Phosphonate and phosphinate metabolism [PATH:ols00440]; Metabolic pathways [PATH:ols01100]; Biosynthesis of secondary metabolites [PATH:ols01110]; Microbial metabolism in diverse environments [PATH:ols01120] # 1 110 1 110 373 200 79.0 1e-50 MKVESLVNIIGADFYTGVPDSQLKALCNYLINTYGIDPRHHMIAANEGNCTALAAGYHIA TGKIPVVYMQNSGEGNIVNPVASLLNDMVYAIPVIFIVGWRGEPRIHDEP >gi|229784116|gb|GG667619.1| GENE 41 40977 - 42119 374 380 aa, chain - ## HITS:1 COG:lin1675_2 KEGG:ns NR:ns ## COG: lin1675_2 COG1454 # Protein_GI_number: 16800743 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Listeria innocua # 21 271 24 292 411 135 34.0 1e-31 MLKQQILTARGNYEELDNYLKINEVKRIFLVCDGSLPFLKIGRYFDSLTQRTGIEVIKFS DITPNPLYESVVDGVRLFCEFGADLILAVGGGSAMDVAKCIKLFFPMDPAQNYLEQTIVP NSMKLLAVPTTAGTGSEATRYAVIYFDGEKQSISDYSCIPSAVLMDASVLKTLPIYQKKS TMMDAFCHAIESYWSVNSSEESRQYSRRAIQLIMENKDLYIGNDETGNTQMLKAAHLAGK AINLTQTTAGHAMCYKLTSLYGIAHGHAAALCVSRLWPFMLTHLEYCIDPRGNDYLAGIF EEIANAMGYSDTNTSVRAFQDILKALELEAPDSKDRNDYEVLKSSVNPVRLKNNPIELSE DDLDRLYHQIFSDTQCRFRI >gi|229784116|gb|GG667619.1| GENE 42 42143 - 44152 1091 669 aa, chain - ## HITS:1 COG:mlr9115 KEGG:ns NR:ns ## COG: mlr9115 COG2513 # Protein_GI_number: 13488216 # Func_class: G Carbohydrate transport and metabolism # Function: PEP phosphonomutase and related enzymes # Organism: Mesorhizobium loti # 398 668 34 290 318 132 31.0 2e-30 
MKALILNSGLGKRMGSLTSEHPKCMSEISSRETIISRQLKQIANAGIEEVVITTGMFDTT LINYCQSLELPLHYTFVKNPLYQETNYIYSIYCAREYLDDDIIMMHGDLVFENLVFDAIL ESPKSCVAVSSTIPLPEKDFKAVVRENRVLQIGTEFFNEAVAAQPLYKLLKNDWKIWLNH IDVFCKQKNVSCYAEQAFNEVSDHCRIAPIDIENLLCNEIDNPEDLAVVSSKLREVENRT VYICFSTDMLHSGHIAIIKKAQHFGKLIIGVLSDEAVVSYKRFPLLPFNERKTIFENILG VYKVVGQKTLSYKENIKKYKPTYVVHGDDWVSGFQKPVRDEVVSALAEYGGKLVEFPYSN DEKYQILENRARADLSLPDIRRIRLKKSIQMKGTVTAIEAHSGITGLIAEKTVVYQNGEA HQFDAMWVSSLCDSTAKGKPDIELVDMTSRLRTIDDIMEVTTKPIIFDGDTGGMTEHFVY TVKTLERMGVSMIIIEDKTGLKRNSLFGTEVKQTQDSIQNFCAKIKAGKNAQKTKDFMIC ARIESLILDQSMDDALERAFAYIGAGADAIMIHSRKKDPTEIFEFIANFRSKNQTTPLVV VPTSFNAVTEEEFKQRGVNIVIYANQLTRSGFPAMQQAARTILENHRAKECDDICMPIKD IITLIPADV >gi|229784116|gb|GG667619.1| GENE 43 44341 - 45666 175 441 aa, chain - ## HITS:1 COG:TM0620 KEGG:ns NR:ns ## COG: TM0620 COG2244 # Protein_GI_number: 15643386 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Thermotoga maritima # 2 427 31 457 479 120 24.0 6e-27 MPVITRVIEPSDLGKIDLFVTFCTMLVYASTLGMHQAYMRFFADNLEGITKDNLFAFAVK CSTCGAGIAAIICLIFNKNFTESIMGEQSNHIWVLMALYIFITGFIAISKTRPRMYGNVM SYTVLVVLESLSIKLCYLSLLVTKDVYYSLWVLIFMVGVLMSICLILNRKQIFISISHIP KETKNAIFIYALPTIPVMFLANVNTSLPKIFLNAYFDAGFVGVYTGCLTLVSIITLIQAG INVFWGPFVFENYKSQQRLIQSMHLVITFAIVFCGLGLLATQDIIYCLLGNKYRMSQAFY GFLLCSPIYYTIGETVGIGINIKKKTYWNIVTTSVALISNLVFCYFWVPRFGNLGASFAV AISSIVMLFVKSFIGERYYKVVENPIKTCLTITLFFTGALFSYLFNSQILIRCILIVILM LALVVLYLKDIKQFICQWNIR >gi|229784116|gb|GG667619.1| GENE 44 45762 - 46907 322 381 aa, chain - ## HITS:1 COG:lin1074 KEGG:ns NR:ns ## COG: lin1074 COG1887 # Protein_GI_number: 16800143 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Listeria innocua # 12 312 10 316 384 120 30.0 5e-27 MFKYYLILLLRLLLRILWIVPLKENSIILTSFAGRRYSCNPKYFAEYMLTKSGYEVYFAL 
KKDSDVKVKNGIKRVDFGSLKHIYLLMSCKYILVNSSDYTKYFPYRKGQAYVYTHHGHSV YKVGDMSYLQSTAGRKILNICAKEEDYYLVTSLLEADYVEKRSALDRNKIIAIGYPRNDI FFKHNETLIKSIKKRLSLDDEVGLILYAPTYRGSDKKQTDINGYGFEILDVARVIQAFNK KFNKKHVMLFRAHHDTMPTNLSDECINVSDYPDMQELILVADYLITDYSSTMWDFALQKR PGFLYTPDLEQYSYNIPKSHPIDQWVYPYAVNNEDLIKLINSYENETSEKRIESYFDRTG CQDKGTACNQLFNILVSGKEK >gi|229784116|gb|GG667619.1| GENE 45 46900 - 47535 217 211 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620161|ref|ZP_06113096.1| ## NR: gi|266620161|ref|ZP_06113096.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 211 192 402 402 347 99.0 3e-94 MIPISFWGARGALATGIVAVALYFLFSSKLTMKKIIIIVVTSISAFGIASNLQSIAIYML NKFGYSRTLSIIASGELWTGGGRNTIQEIIVRGITILPHGLFADRIALSIALGVSVDKVS YPHNLFLEILYQYGAIFGSIIVIGLLFLLIRTWNQVIKNDNGYIKILFYAYVITFLFKST ISGSYLTDYTSGIALGICLGLRRNRKGIRNV >gi|229784116|gb|GG667619.1| GENE 46 48114 - 48317 149 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620162|ref|ZP_06113097.1| ## NR: gi|266620162|ref|ZP_06113097.1| putative glycosyl transferase [Clostridium hathewayi DSM 13479] putative glycosyl transferase [Clostridium hathewayi DSM 13479] # 1 67 72 138 138 139 100.0 1e-31 MTHIDIGGNCLWMENADVETIRNNISKVYYDHNLYIEMKIAAESKGRNKFSYLEIAKKTI TPIGEGM Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:52:41 2011 Seq name: gi|229784115|gb|GG667620.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld13, whole genome shotgun sequence Length of sequence - 54926 bp Number of predicted genes - 47, with homology - 47 Number of transcription units - 21, operones - 13 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 + CDS 2 - 1324 1076 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 1336 - 1377 5.3 2 1 Op 2 38/0.000 + CDS 1388 - 2281 818 ## COG1175 ABC-type sugar transport systems, permease components 3 1 Op 3 1/1.000 + CDS 2293 - 3126 526 ## COG0395 ABC-type sugar transport 
system, permease component 4 1 Op 4 4/0.333 + CDS 3123 - 4280 824 ## COG0524 Sugar kinases, ribokinase family 5 1 Op 5 . + CDS 4284 - 5168 718 ## COG0191 Fructose/tagatose bisphosphate aldolase 6 1 Op 6 . + CDS 5263 - 5790 598 ## COG0655 Multimeric flavodoxin WrbA + Term 5794 - 5840 9.3 - Term 5786 - 5823 4.0 7 2 Op 1 . - CDS 5841 - 6131 293 ## BDI_1010 hypothetical protein 8 2 Op 2 1/1.000 - CDS 6128 - 7273 831 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 9 2 Op 3 . - CDS 7258 - 7707 449 ## COG1846 Transcriptional regulators - Prom 7735 - 7794 1.9 - Term 7726 - 7788 6.0 10 2 Op 4 . - CDS 7797 - 8879 723 ## CD2229 hypothetical protein - Prom 8932 - 8991 80.4 11 3 Tu 1 . - CDS 9831 - 10391 712 ## CbC4_0855 membrane protein - Prom 10465 - 10524 4.3 - Term 10495 - 10548 9.0 12 4 Tu 1 . - CDS 10644 - 10892 293 ## Closa_0054 hypothetical protein - Prom 10958 - 11017 8.6 + Prom 10914 - 10973 10.3 13 5 Tu 1 . + CDS 11097 - 11795 806 ## COG0775 Nucleoside phosphorylase + Prom 11825 - 11884 6.0 14 6 Op 1 . + CDS 11922 - 12974 809 ## Closa_0056 hypothetical protein + Prom 12979 - 13038 4.8 15 6 Op 2 . + CDS 13086 - 14435 1618 ## COG0166 Glucose-6-phosphate isomerase + Term 14470 - 14527 15.4 + Prom 14531 - 14590 9.3 16 7 Op 1 . + CDS 14737 - 16176 1324 ## CLOST_0658 conserved exported protein of unknown function 17 7 Op 2 . + CDS 16169 - 18136 1924 ## COG2199 FOG: GGDEF domain + Term 18143 - 18196 18.1 - Term 18129 - 18183 13.7 18 8 Tu 1 . - CDS 18197 - 19042 807 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 19071 - 19130 8.2 + Prom 18971 - 19030 4.2 19 9 Op 1 . + CDS 19114 - 19275 129 ## gi|288870161|ref|ZP_06409655.1| conserved hypothetical protein 20 9 Op 2 35/0.000 + CDS 19329 - 20762 1577 ## COG1653 ABC-type sugar transport system, periplasmic component 21 9 Op 3 38/0.000 + CDS 20880 - 21773 1009 ## COG1175 ABC-type sugar transport systems, permease components 22 9 Op 4 . 
+ CDS 21792 - 22634 979 ## COG0395 ABC-type sugar transport system, permease component 23 9 Op 5 . + CDS 22675 - 24831 2360 ## CPR_0537 hypothetical protein + Term 24861 - 24918 4.9 + Prom 24904 - 24963 4.0 24 10 Op 1 1/1.000 + CDS 25042 - 27330 1760 ## COG3250 Beta-galactosidase/beta-glucuronidase 25 10 Op 2 . + CDS 27368 - 28579 998 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 28603 - 28662 4.9 26 11 Op 1 . + CDS 28717 - 30201 1437 ## COG2211 Na+/melibiose symporter and related transporters + Term 30241 - 30282 -0.9 + Prom 30242 - 30301 5.3 27 11 Op 2 . + CDS 30326 - 31108 858 ## Closa_0058 hypothetical protein + Term 31142 - 31195 18.4 - Term 31128 - 31183 11.2 28 12 Op 1 . - CDS 31198 - 31932 334 ## gi|266620189|ref|ZP_06113124.1| conserved hypothetical protein - Prom 32075 - 32134 5.6 29 12 Op 2 . - CDS 32157 - 32423 426 ## Closa_0067 hypothetical protein - Prom 32471 - 32530 6.1 + Prom 32524 - 32583 7.2 30 13 Tu 1 . + CDS 32634 - 33725 1271 ## Closa_0068 hypothetical protein + Term 33769 - 33819 5.1 + Prom 33923 - 33982 13.0 31 14 Op 1 . + CDS 34033 - 35481 1023 ## Lbys_0361 fumarate reductase/succinate dehydrogenase flavoprotein domain protein 32 14 Op 2 7/0.000 + CDS 35532 - 36434 605 ## COG4209 ABC-type polysaccharide transport system, permease component 33 14 Op 3 14/0.000 + CDS 36413 - 37282 713 ## COG0395 ABC-type sugar transport system, permease component 34 14 Op 4 2/1.000 + CDS 37306 - 38820 1692 ## COG1653 ABC-type sugar transport system, periplasmic component 35 14 Op 5 7/0.000 + CDS 38829 - 40382 1266 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 36 14 Op 6 . + CDS 40393 - 42195 1518 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Prom 42224 - 42283 2.8 37 15 Op 1 . + CDS 42314 - 42913 607 ## Closa_0069 sporulation protein YyaC 38 15 Op 2 . 
+ CDS 42976 - 44346 1071 ## Closa_0070 Peptidoglycan-binding lysin domain protein + Prom 44353 - 44412 3.1 39 16 Op 1 . + CDS 44444 - 45700 1320 ## Closa_0071 hypothetical protein 40 16 Op 2 . + CDS 45720 - 47699 2204 ## COG0021 Transketolase + Prom 47701 - 47760 2.1 41 17 Op 1 . + CDS 47781 - 48725 624 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase + Prom 48735 - 48794 5.8 42 17 Op 2 . + CDS 48825 - 49193 389 ## COG2033 Desulfoferrodoxin + Term 49208 - 49258 10.5 + Prom 49331 - 49390 7.7 43 18 Op 1 . + CDS 49468 - 50649 999 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 44 18 Op 2 . + CDS 50670 - 51218 453 ## gi|266620206|ref|ZP_06113141.1| putative transcriptional regulator + Prom 51233 - 51292 7.0 45 19 Tu 1 . + CDS 51335 - 52117 199 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 46 20 Tu 1 . - CDS 52211 - 53122 652 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 53231 - 53290 14.5 47 21 Tu 1 . 
+ CDS 53627 - 54926 1095 ## COG0534 Na+-driven multidrug efflux pump Predicted protein(s) >gi|229784115|gb|GG667620.1| GENE 1 2 - 1324 1076 440 aa, chain + ## HITS:1 COG:BS_araN KEGG:ns NR:ns ## COG: BS_araN COG1653 # Protein_GI_number: 16079927 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 34 439 24 433 433 322 40.0 6e-88 ILSGCGEARKRTVEPTTAEAAKETTQEAGGETKAETAKMEAEPVDFSFWCISAHEAFYRS RIDAWNKENPDKPVNVDLVSVGGSDRQSKLLVALQTGQGAPDFCDVNIIHFGVYYDFDEI PFVSLTDLVKDEEDKFIKSKLDMYSYNGELYGSPTQAGANVVFYNTKIMEEAGVDIDSIV TWDDFVKAGETVVQKTGKPMTAVETTDMYPFQSMVLQKGSDFFDAEGNVTINNETNVEIL EYLKSWLDDGIAVTMPGGSNTSENFYEFFNNDGMGALIMPLWYMTRLEEYMPDLSGHLAV RPMPVWEEGQQYTSACTGGTGTVITNQCENQELAKEFLYFCKLSYDANKECYLQLGFAPF RSDVWEDEALSVNREFFNNENVFGNVAASLKNSNAIHNNPMVPKAFDIVSSDVMYSVFEA GTSTPKEALDRAADTLNSQK >gi|229784115|gb|GG667620.1| GENE 2 1388 - 2281 818 297 aa, chain + ## HITS:1 COG:BS_araP KEGG:ns NR:ns ## COG: BS_araP COG1175 # Protein_GI_number: 16079926 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 5 272 22 288 313 249 51.0 5e-66 MKYLKKLAYSKKAAPYIFLTPFILSFLIFNLYPIISSFLMSFQKLKGFNSFTWVGLNNYK NLWNKDFFTAVKTTTLVTAVECVVLTVFPLILAVVLNTGLFKRKNFFRAVFFIPTLASTI VAGIVFRMLFSDSGNAFMNSLLNVFGIESQNFMLNYKWSIILMVVLSTWKSAGLYMIYFL SGLQTIPRDVYESAEIDGAGEFKKLLYITIPLLKPIIIYVFTILIFEGYRTFGESYVFWK ESMPGNLGMTIVRYLYQQGFTYGDLGFGSAIGFVLLGIVLVINLIQLKGFGLFRDDD >gi|229784115|gb|GG667620.1| GENE 3 2293 - 3126 526 277 aa, chain + ## HITS:1 COG:BS_araQ KEGG:ns NR:ns ## COG: BS_araQ COG0395 # Protein_GI_number: 16079925 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 4 277 9 281 281 248 48.0 8e-66 MKKTIIRFVVRILFLLLAVMMLFPIYFMVIASLKPTSALFKNGIAMVWQFDNLSIQNYIY TVTTKSGIYLKWYGNSILVMAVSTVLSLFFSSMGGYALGAYKFKGKKFLFTLTMIVMMIP 
LEILLIPLYRMIINMKLINTKTGSILPFLIQPTAMFFFQQFVSGISKDYMDAARVDGCTE FGIYFKIMTPLMRPAFGAMTILLALRNWNAFLWPLIVFTTDSAYTLPVGLASLISSYGNN YEFLIPGAVCALIPLVIIFLFNQNQFVEGLSAGGVKG >gi|229784115|gb|GG667620.1| GENE 4 3123 - 4280 824 385 aa, chain + ## HITS:1 COG:AF0356 KEGG:ns NR:ns ## COG: AF0356 COG0524 # Protein_GI_number: 11497968 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Archaeoglobus fulgidus # 52 335 1 239 250 77 26.0 5e-14 MKKTVVAGSLVLDILPVFPHAVTSDMLLAEGKVTECQGMRIYMGGEVGNTGLALYRLGEP VFLISKTGNDGAGFLIREMMKKEGVPSLIMETEGISSTASIALAVPGRDKSTIHSRGASQ TFVADDIQGELLKDADWFHFGYPTSMKTLYDRNGEELIRMLEKVKQYGLGISVDMSLPDL NTDAGKTDWLPILEKALPYIDIFMPSIEEAVFLFMHREYQEMAARYRGRDMTEVLSEAFI LTIADRALEMGTKAVLLKCGKRGMVLKTGDHCPWGESWNGRELWCTPYKPEKIRSATGAG DTAIAGFLNSFKRGCTPEKALDMAAFTALRCMESYDTVSKIGTYQEMEAIKKEAGERLNG PDSSVWTYCVESGCYVGKNDRRKAR >gi|229784115|gb|GG667620.1| GENE 5 4284 - 5168 718 294 aa, chain + ## HITS:1 COG:lin2239 KEGG:ns NR:ns ## COG: lin2239 COG0191 # Protein_GI_number: 16801304 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Listeria innocua # 1 282 1 282 284 231 41.0 1e-60 MYVSMKPVLEHAHIHGYAVMAMNSINMEMVRAAVNAAQAEHSAVIINIGMGQMKNHAHPE DMVPMIRRLAERALVPVALNLDHGQDFTFICECMKQGFSSIMIDASALPYEENVARTRLV CEMAHPLGICVEGELGHVGQAADGDDENDGLYTEPGQAKEFADRTGVDALAVAVGTAHGN YPKGKTPKLDFERIRLLKKILNMPLVLHGGSGAGEENLKKAVACGINKINVCTDAFGIGS RSIEKQLNDNSEVDYMHLCMGVEKELTSYIRSFMRTIGSSGKYYFGEGPVSGNE >gi|229784115|gb|GG667620.1| GENE 6 5263 - 5790 598 175 aa, chain + ## HITS:1 COG:CAC3334 KEGG:ns NR:ns ## COG: CAC3334 COG0655 # Protein_GI_number: 15896577 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 152 1 155 178 95 33.0 6e-20 MNILVVTGSPRKGGNTEILADAFAQAAKESGHEVAVRKLSRLKVGPCLACEYCFTHGGVC VQDDAMNEILQDVHKADMLVLASPIYWFDISAQMKCFIDRLYACAKTGFHITSAAMLLDS GSPGVYNAAQAQLKDICDYLNWENKGAFTAPGMVEKGSIYQSAALAQVRTFAENL 
>gi|229784115|gb|GG667620.1| GENE 7 5841 - 6131 293 96 aa, chain - ## HITS:1 COG:no KEGG:BDI_1010 NR:ns ## KEGG: BDI_1010 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 96 81 195 195 66 33.0 4e-10 MKHSKLYACLSYLSILIIIPALIPGKDSFVRFHLNQGLVLLIANIVFGCISFIPHMTLIG DLLNCVVLIFAIMGIVSVIQGQKRELPVIGKIRLIR >gi|229784115|gb|GG667620.1| GENE 8 6128 - 7273 831 381 aa, chain - ## HITS:1 COG:Rv1175c_1 KEGG:ns NR:ns ## COG: Rv1175c_1 COG1902 # Protein_GI_number: 15608315 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Mycobacterium tuberculosis H37Rv # 2 361 5 356 356 129 27.0 1e-29 MYETINSPFTIGSVTLKNRIIFAPTSMGLKEEEYLEKLGAIARGGTAMIIIGDVPVLKSP FLSLYSRKGFLRYRRITETLHQNGALAAAQLHMSDSQFSRLIKYLPGMITGKIKPDELRI LLNNTVSDYITGLTDKRISEIIRGFGKAAVLAKEAGFDVIQIHGDRMCGSFSSSIYNRRK DRYGGSPKNRARFACEAVAAVRRALPDITIDFKLAVRQENPHYGNAGILMEELPVFVPLL EQAGVNSFHVTLANHSDLSDTIPPASHPYFHGEGCFLSYCDEVRRLTDLPVCGVGRLCSP DFVEEQLKSGRIQLAAMSRQLIADPEWVKKTASGSVSKLFKCTGCNRECLGGMQAHKGVH CIRDHIQTKNKSLSHERKVVS >gi|229784115|gb|GG667620.1| GENE 9 7258 - 7707 449 149 aa, chain - ## HITS:1 COG:MA0180 KEGG:ns NR:ns ## COG: MA0180 COG1846 # Protein_GI_number: 20089078 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 3 132 19 148 156 63 24.0 1e-10 MPDFLAYYITALRKDLNAYSDARLKQYGLTDGLFYYIIYIGKNPGCSLSETARFLKADNG HVTRCIGKLEQLGYAVRTRDTADHRHYHLTLTAQGETIFKELHTLLYDWDQAVCKNLSDI QREQLLELLRLVTDSREDRKEGSEPCTRP >gi|229784115|gb|GG667620.1| GENE 10 7797 - 8879 723 360 aa, chain - ## HITS:1 COG:no KEGG:CD2229 NR:ns ## KEGG: CD2229 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 330 210 566 673 116 25.0 1e-24 MVYDRRRRELCISDASFALVDSGRGLEHLLILLSQTEHPEDCRDMLEKTLTQLSHFESYV NKKTTAPHLIARTDFITDPEDLRQEEFYNSLEYIRNHLVHMADPEKKSHYRPTLLSLSIR IQAALKVSRVRVIYALRVALLLSLATLLVQSLKLPHGKWLLFTLASVSLPYADDVGQKAG 
KRLFATAAGGIFSVILYSLVPSPTGRTAIMMLSGYLSFYFTDYKGTFTCSTVGALGGAVF MSAFGWAPVSRMLLIRLAYVIAGIVIAYAANCLLFPYHRSTAAETLLKKYDFASKLLSKL CSQGVTDTQFYYHLVIQSQLIEEKLYQITSGEEQKLLHDKICECRSLVRAAHRNNPQAEM >gi|229784115|gb|GG667620.1| GENE 11 9831 - 10391 712 186 aa, chain - ## HITS:1 COG:no KEGG:CbC4_0855 NR:ns ## KEGG: CbC4_0855 # Name: not_defined # Def: membrane protein # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 15 184 4 172 601 89 29.0 7e-17 MAIPKKSSILPVVQDRIRKNLKFATLSFLFIIAYVNVFQKLFGAENSIVGVIFTIMMSAS MARDLTSTPLKHLFAQAAVLVLMAVSACLVSTLNPWAALPVNFAVIYFILYTYTYEYSGH LYFPYILSYLFLVFISPSSPEQLSKRIMAMVTGAVCIILYQLVMGRNRAEETVTDVLLTL IREVAS >gi|229784115|gb|GG667620.1| GENE 12 10644 - 10892 293 82 aa, chain - ## HITS:1 COG:no KEGG:Closa_0054 NR:ns ## KEGG: Closa_0054 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 81 1 81 82 135 77.0 8e-31 MIYKTHGVCSQEINFEVQDNKLTNVRFVGGCSGNTQGLGRLIEGMDVDEAIKRLDGIHCG PRPTSCPDQLAKALKQYKESLS >gi|229784115|gb|GG667620.1| GENE 13 11097 - 11795 806 232 aa, chain + ## HITS:1 COG:CAC2117 KEGG:ns NR:ns ## COG: CAC2117 COG0775 # Protein_GI_number: 15895386 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Clostridium acetobutylicum # 2 223 3 223 230 197 45.0 2e-50 MLGIIGAMDVEVAEVKEAMQGVEVKTMAGMEFYKGTLKGKEAVVVRSGIGKVNAAVCTQI LVDHYGVDAVINTGIAGSLKNEIEIGDVVLSTDTVHHDMDASGFGYPVGQIPQMEVFSFP ADETLRNLALKCCKEVNPDIGVFTGRIVSGDQFISDQVKKDWIAENFGGYCTEMEGAAIA QAAYLNHVPFLIIRAISDKADNSATMDYSEFEARAVRHSVNLILAIAERYSC >gi|229784115|gb|GG667620.1| GENE 14 11922 - 12974 809 350 aa, chain + ## HITS:1 COG:no KEGG:Closa_0056 NR:ns ## KEGG: Closa_0056 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 350 1 313 313 410 63.0 1e-113 MLEFLQTGKALYVLAAVCGLGVVTRWLTRNLYKRLIKETDNMSLTKNKNLRAFKQKTESS YQMNQGIPKIRPYLERQMYDFRFMGMTLNGWNTFSNQMTLLCFLAGGAAAFGAYWYRTDP YYIVLYGSMGILAGLFTILVDHGAGVADKKQQLFTALDNYLENTLIYRLDQERDETQAIS 
GPHMETRENIRSIYPVAAGPAAGQAEEPEKQESVRGKQRNRTERMKNQLKAESQARRQNS GRSDGRQRDGRQGGIIQETISTESGMSPAADEMQLGAGNQNMVPEMVRSGSGPSEMEGSI RDVDYLKKSLEQIAASRERGRTVKQQEDWLKDLNPEELKLIGEILHEYLT >gi|229784115|gb|GG667620.1| GENE 15 13086 - 14435 1618 449 aa, chain + ## HITS:1 COG:BH3343 KEGG:ns NR:ns ## COG: BH3343 COG0166 # Protein_GI_number: 15615905 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus halodurans # 1 449 1 449 450 616 65.0 1e-176 MGKEVVFDYSRAAGFVSAEEMDNMKATVMCARDVLMAGSGAGNDFLGWIDLPVDYDKEEF ERIKKAAAKIQNDSDVLLVVGIGGSYLGARAAIEFLSHSFYNALPKGKRKTPEIYFLGNS ISSKYIQDLKDVLEGKDFSVNIISKSGTTTEPAIAFRVFKEMLIEKYGREEANKRIYATT DKARGALKNLATEEGYESFVVPDDVGGRFSVLTAVGLLPIAVSGADIDKLMEGAAAARKA ALETPYESNSALQYAAVRNILHRKGKAVEIVANYEPSLHYVSEWWKQLFGESEGKDRRGI FPAAVDLTTDLHSMGQFIQDGSRVMFETVLNVEESPAEILLQKEEVDTDGMNYLAGKSVD FVNKSAMNGTILAHTDGDVPNLMVKIPEQDEYCLGQLFYFFEYACGVSGYILGVNPFNQP GVESYKKNMFALLGKPGYEKEREELLKRL >gi|229784115|gb|GG667620.1| GENE 16 14737 - 16176 1324 479 aa, chain + ## HITS:1 COG:no KEGG:CLOST_0658 NR:ns ## KEGG: CLOST_0658 # Name: not_defined # Def: conserved exported protein of unknown function # Organism: C.sticklandii # Pathway: ABC transporters [PATH:cst02010] # 5 475 2 479 479 335 39.0 4e-90 MKRKKIWKAVVLSLTLACAVSGCAQGKKHGLDPKKPTVITVWHAYNAFAKTKFDEKVTEF NETKGLELGIVVDAYGYGSSGELDDALFNSANKIIGSDPLPNVFTVYPDSAYRLDQLAPL VEMEEYFSAEELERYRPEFLSEGVWGEEGGHKLIPVAKSTELLFVNETGWRKFCSDTGFT DEALGTWEGLVSAAEAYYRWSGGQPFLGMNSFNDFSVLTAIQDGADPFPTDGKSVSFAYS KETARKAWDAYYVPHMMGWYKSSVFNQDGVKSGSLLAYIGSSAGAGYFPKEVIVDGENQY PIECSVRPYPTFEGGHGYMSQRGADMGIFRSGEAAEYASAEFLKWFTDSERNIEFTASIG YIPVENAALSSIEELEKFVQKSDNAEAVKKSVTAALEAMQEGKFYARRPFENSYEVNELF SRSLEKKVELDLNELQNRVENGEDRGTVISGFMDDGNFEAWYQNLMREINGEINGEINE >gi|229784115|gb|GG667620.1| GENE 17 16169 - 18136 1924 655 aa, chain + ## HITS:1 COG:TM0972_2 KEGG:ns NR:ns ## COG: TM0972_2 COG2199 # Protein_GI_number: 15643732 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: 
Thermotoga maritima # 502 651 15 163 166 83 34.0 2e-15 MNKKVRSDSIAFKLALTFVASVVIQSLLMTALLSLGGVIGQSRENAYQIFSEKVKNRSVT IEGEMKNVWTNFELYTDELSRYFTEAERDENGELLEKEDLLKAVAPTVLNALYYTKTTGA FLILDRGTEETESYPALYFRNANPNRVDERNSNTYLLAGPWDVAQETQVVTDAAWKYRLE LTDASRAFYEKPFSSVGLAGDEKLLGYWCPPFRLYEGGEEVVTYSIPLVDQRGRAVGVFG VEISLNYLYKYLPPSDLLAVDSYGYVIGIRSGENDGIRPLVTHGALQKRILQAESPMELK AKDSENGIFLLGNHNGNGAVYACLEQMGMYYNNTPFSGEEWYLIGLMEEDALLQYPHRIQ EILFYSFLVSLCAGAGIAVLISRWFTRYSRLLELSEVPVGVFEMGAKGNRVYMTNQIPRL LGLSKDQERTFCRSRNEFQAFLKKICEKPTEDENIFCLEQQEGKRWIRLTLREDGEKKSG VVEDVTEEVLKTQSLRAENDLDGLTRVLNRKAFEHRQQKWEGMLELGKPLTVLMFDLNCL KGVNDEFGHETGDKYIRFSARAISGALKDAWIYRIGGDEFAAVIEGVSENSPEECSRLLA GQVEAYGKENGFKAGIAWGYAGSDPGSKENFSELLSRADHNMYEMKKQMKSADSL >gi|229784115|gb|GG667620.1| GENE 18 18197 - 19042 807 281 aa, chain - ## HITS:1 COG:STM4423 KEGG:ns NR:ns ## COG: STM4423 COG2207 # Protein_GI_number: 16767669 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 25 275 5 262 274 102 29.0 6e-22 MSNERLNFDKNLETLQKLNFQLLYASTSKYEGDWQSLPHTHHFTELFYVISGLGKFLVED QLITVKEHDMVIVNPNVEHTEKSMDSKPLEYIVFGVEGLTFSFGSPEAPQNYGYYNYSAE KKHLLNFSQLMLREIKEKKNGYEQVCHDLLEVLLIYITRNDQFGIISTDTSRMPKECASA KRYLDANYAQPITLDSLAAATHINKFYLSHTFSKYLGMSPITYLIQRRLQTSKDLLTQTD YSIAEIASSTGFASQSYFSQIFKKFTGMSPNQYRKQNTVHQ >gi|229784115|gb|GG667620.1| GENE 19 19114 - 19275 129 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870161|ref|ZP_06409655.1| ## NR: gi|288870161|ref|ZP_06409655.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 53 1 53 53 93 100.0 6e-18 MFKIRENTGYIPFILFIIGDMCGKEKYFIVLKARYIKNEARITLKLGEQYNIL >gi|229784115|gb|GG667620.1| GENE 20 19329 - 20762 1577 477 aa, chain + ## HITS:1 COG:SMa2305 KEGG:ns NR:ns ## COG: SMa2305 COG1653 # Protein_GI_number: 16263694 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # 
Organism: Sinorhizobium meliloti # 140 394 152 387 459 77 26.0 5e-14 MRKTLSILLAASMVAATLAGCGSKPAETTAAPKAEETTTAADAAKTEETTADAAESTGDP VTLKLAIFKGGYGDAFWTAVTDAYTKANPNVTFEIDADPSIGEKVNSSMLAGDIPDFIYC PSSNPSGLAQRLIKDQALADLTDVFEGPLKGKILDGFLDTSLSQPYGDGKVYLAPLYYTA NGLFYNATLFEEAGLNVPTTWDEFFALGDEIKGKDIAGKSDRSLFTYQGGNAPGYMETVI IPLLTAKLGVEGMNDCFTYKEGAWDKEGVKEVFDLVAKLGSGGYLLPNTTGIDHTTAQTA VMNGSALFVPCGSWIVGEMADITGEEINGKPFAWGFAPVPAMDGTTDKYLMSSVEEVYIP AASENVDAAKDFLAFLYSDEAIKLNAEKTGGIPPVVGATDLTKDILDPMILSTFQSFDNG YKPYIGGFAAVDSEVVPKSEFYGHVNAVVDGKETVDDWVAACDKLSDTVRDSLVTAE >gi|229784115|gb|GG667620.1| GENE 21 20880 - 21773 1009 297 aa, chain + ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 19 287 13 281 292 147 34.0 2e-35 MNGLAQRKKKEKRFLACCVAPALIGFCLFILYPAFQVFRMSLYKWSGGTGKKTFIGINNF VTLFTSDKFWLAFVNTLFLMVFVTIVTMALSLFFAAIISRGKLKEANLYRIVLFFPNVLS IVVIGIIFKNLFNPLTGIVNSFLSAMGKANLPNWLGDSKYVLWVIALAMIWQAVGYYMVM YIAGMDGVSAELYEVSELEGANKIQEFFKITLPLIWPTVRTTLVFFILSTMNMSFQFVTV MTNGEPQGHSEVLLSYVNKQAFSNANFGYAMSVSVVVFLFSFILSLIAQALTARKDS >gi|229784115|gb|GG667620.1| GENE 22 21792 - 22634 979 280 aa, chain + ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 2 280 16 293 293 154 32.0 3e-37 MSKEQKSKVVRLLLRILLFVAVVLVILPLIWTFISALKDNNEISQFPMRLPKKLQWNNFA RAWTNADIGKNFLNSIFITVFSMVLSLVMVVPASYALGRYKGKILTPVKALFMAGMFIQK NYLVVPLFLQLLSFNMLNSRVAMCIVYACTTLPFYIFLMSGFMEGIDNAYVEAAGIDGAS HFRTMVSVVFPMCKPGVATMLMFSFMEYWNEYILALQFLRDDTKKTLPVGLKNIMEQARF ATDWGALFAAMVIVMIPTMIFYLMVQKKLTSGLSMGGLKG >gi|229784115|gb|GG667620.1| GENE 23 22675 - 24831 2360 718 aa, chain + ## HITS:1 COG:no KEGG:CPR_0537 NR:ns ## KEGG: CPR_0537 # Name: not_defined # Def: hypothetical 
protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 1 718 1 732 734 914 57.0 0 MNEITGRLTLPTDVDMVDETIRLKNLLGADALRDCDGTNMPEELLKLDGKIYATYYTTRK DNDWAMQNPEEVQQEYLISDRITARGETLRIELMHGFHTEQLKVNTIDDPKRWWEVIDRT TGEVVPVTDWEYDEASGEVGIKTIPYHEYTVSFLAFLIWDPVHMYNFITNDWKDAEHQMT FDVRQPKTQVFVKNKLKRWCEENPDVDVVRFTTFFHQFTLTFDDKKREKYVEWFGYSASV SPYILEQFEKWAGYKFRPEYIVDQGYHNSCFRVPSKEFRDFIEFQQIEVSRLAKEMVDIV HSYGKEAMMFLGDHWIGTEPFGKYFAGIGLDSVVGSVGDGVTMRMISDIKGVEYTEGRLL PYFFPDVFCEGGDPIGEAEDNWMKARRAMLRSPLDRIGYGGYLKLAIEWPGFIEEIQKVV GEFRQIQENMAGSRSHMAPFKVAVLNCWGALRGFMANEVHHAIYHREIYSYVGVLECLSG MPIDVEFIDFDQLREGIDPDIKVIINSGDAYTAWSGGDNWKDEKVVTAVRRFVDQGGGLI GVGEPTACQHQGRYFQLSDVLGVDREMGFSLSTDKYNELNQEHFILEDIEGEIDFGESKS RIYAQGDHYQILNKDGEYSRLVVNEYGQGRSVYFTGLPYSPQNCRILLRAIYYAAHKEEE MKKYYVTNVDTEVAAYEKTKKIVVINNSKEPKTTELYVEGVLKCRLDMDPMEMRWIDM >gi|229784115|gb|GG667620.1| GENE 24 25042 - 27330 1760 762 aa, chain + ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 27 547 29 554 570 184 29.0 5e-46 MRKTVCINENWFFSKTCGVLPDALPAGWEKVNLPHTWNAADGQDGGNDYYRGICWYARSL YQPCGTDGLRSYLEFEGAAMAASVYVNGTNVCSHEGGYSTFRADITDHLRVGENLICVSV DNGVNDRIYPQMADFTFYGGLYRNVNLIIVPETHFNLDYYGAPGLAYTAKTEGETAAVEL KAWVTAPETGDSVCFTILDSQGITAAQGYTPAEACSSTVLCLTSPHVWQGVLDPYLYTVR AELVRRNERIDCVNVKMGIREFSVDPEKGFLLNGIPTPLRGVSRHQDRLCVGSALTREQN LEDAALIREIGANTVRLAHYQHSQDFYEACDAMGFVVWAEIPFISRMSGNPAAHENCILQ LKELILQNYNHPSICFWGISNEITIGGDSPQLHRNLRELNDLAHAMDSTRLTTMAHISML EPDNEQVYLTDVMSYNHYFGWYGGKLTGTEAWMDKFHEAHPNRPVGISEYGAEANITYHS SEPRCRDYSEEYQALYHEHLAEVISRRPYIWATHVWNMFDFGCDARDEGGVKGRNNKGLV TLDRTIKKDAFYIYKAYWSREPFVHICGRRFARRTGDFLSVKVYSNLPDVTLFLDGIRIA DRMGDKVFEFQEIPWTEGIHFVRAVGQDCEDCITLERVTEPWEPYILVDEEAAQGDGAAN WFTNLIVETPPEMTFDPQYFSVKDRIRELLGCNEAMEAVVGAMYSSSGMKFKKSMLSMMG DNTLEDLQKMIDKASGEENSVKPSMKQVMTRLNAVLQTIRKS >gi|229784115|gb|GG667620.1| GENE 25 27368 - 28579 998 403 aa, 
chain + ## HITS:1 COG:BH2728 KEGG:ns NR:ns ## COG: BH2728 COG4753 # Protein_GI_number: 15615291 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 274 401 374 500 510 80 35.0 5e-15 MKEVDLISMIECVTASFHINLSIVKSPEDEKNCFDLGLRENILKCGRPNDVLHNIYAGCG PFILHHVIDRYDSEYDLFALPEDMRSCGDYILIGPYRKEYMDDFRLNRLIQKCGLPISLT SGLRDYFCEVPIVMDSGSWDELLLSIVRMCYDGQEIKKRLVELKEDESMGWKEDLPEDTL VHTKMLEDRYRMENELLHAIASGDADTALRICRKFSSTSVSRRFKDPNRDRRNILITSNA IYRKAAENGCVHPVHIDRLSSGFAKKIETVCTEREASRLALEMVRKYCLLVRNFSLKGHS AVIQKIVNHIDLNLAHDLSLKRLSEEYCVNASYLSTLFKKEMGQSITAYIAEQRIKQAIL YLNSTDMQIQEIALAVGITDLNYFSKLFKKMTGMPPSVYKKNL >gi|229784115|gb|GG667620.1| GENE 26 28717 - 30201 1437 494 aa, chain + ## HITS:1 COG:BS_ydjD KEGG:ns NR:ns ## COG: BS_ydjD COG2211 # Protein_GI_number: 16077683 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 23 485 30 462 463 91 21.0 4e-18 MKSQNNTKVNGIHRAHTWEIGLYALNNTSTNLYAMLMGYVSYYVSGLMGVAAVLVGTLLT VMRVWDGVTDPVIGFVVDKTNGRFGKNRPFIVIGNVIMIITTYILYHFGHKLPDNYGLRL GFFVLIYAVYIIGYTCQCVVTKSAQSCMTNDPEQRPTFAVFDAAYGGLMFAGTGMLVSKY LVPKYTVDGISGFRNPAMFHEFWLIFAGLSLLLAALSIVGLWRKDRPQYYGLGVPQQILI RDYWEILKHNRAIQMLVVAASTDKLFQNIMTNSIVLVIVFGIICGDYGSYGTMSMLIIVP QILLCGFGIRKIAAKMGQRKALWTGSIGAIICASCMALLFIFGNPKDFGFSRITFFTVAL CTLWILEKACSGMAGTLVIPMTADCADYETYRSGKYVPGLMGTLFSFVDKLISSLGTTIV ALELAAVGFAKVQPDQTTPLTPGLFWVGIISFCLVPIIGWLCNLVAMKFYPLTKEKMAEI QDRIAEIKAASFDN >gi|229784115|gb|GG667620.1| GENE 27 30326 - 31108 858 260 aa, chain + ## HITS:1 COG:no KEGG:Closa_0058 NR:ns ## KEGG: Closa_0058 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 259 1 256 256 246 60.0 8e-64 MKKVLALLLSAAMVLSMAGCGGGKDARTIYDEASKKTSELKDMDLTTNVAMTMSQGDQTI DVTTAMDIQMTGANTEDMRYLATGKTSMAGQDIDMSMYYEGGYYYMETMGQKIKYAMDLN 
QLMEQVKQSTEGANMESSYMKEITAKKDGENQRLTFTADAEKMDAYVQDIMSSMGTAAAG MEGVTYKIKEASGEAVVNKDGYFSSMKIKMLMDMEVQGETISIDMDTDAVYRNPGQEVTL AAPSLDGYQEVDPAAMGVQQ >gi|229784115|gb|GG667620.1| GENE 28 31198 - 31932 334 244 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620189|ref|ZP_06113124.1| ## NR: gi|266620189|ref|ZP_06113124.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 244 1 244 244 457 100.0 1e-127 MDSYEFTISDGTYLNDTKYELQTTAYICVSEDGGRNYYMAGIPSDKTSSFTVEHKIIGDS HYHRIFDKNKGEYHISADLTSQSAAQTEPPVNGWMLESSGDQKGNESYKGRLWIYNHRIS PEDFTEVSRKKEDGKTVITLIRSTEQPGYLPLPLQLVRDERLKWHEQELESPPYTDGNRK YTAETLEAAIDSTGRLLTIEYKLTYETADGTPITTLEDSEINSMTKYNYVNLKRFNDASI TIPE >gi|229784115|gb|GG667620.1| GENE 29 32157 - 32423 426 88 aa, chain - ## HITS:1 COG:no KEGG:Closa_0067 NR:ns ## KEGG: Closa_0067 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 88 1 88 88 136 79.0 3e-31 MFRMWGKIVKDTRLVRDTVICISDYSMSRTAMVFQALDDICYEFDLGKPIWLDATIHEFK VHDKARFLQDNFIEHIDFDFLEIQVIEE >gi|229784115|gb|GG667620.1| GENE 30 32634 - 33725 1271 363 aa, chain + ## HITS:1 COG:no KEGG:Closa_0068 NR:ns ## KEGG: Closa_0068 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 362 1 362 363 454 72.0 1e-126 MAQPITDQVGFLSEACRAVQELSASKNSLDKIRLDEKRLEKDLEAEKKAVADAISLTVKK RMEEINGSYDKEIAKGQDRLKRVRSKREKAKSQGIKERIAEETKELHDDNRELKLQMKTT FQKDRVPRFCKSTWYYSLYFTRGFSEFLILLVTILICFLAVPYGIYMLLPEKSTYYLIGI YFAVVVVFGGIYILINNKTKVRHQEALKQGRKIRDLMRTNHKKIRVITNSIRRDRDEAVY DLEKYDDEIAQIEQDLSDITSKKKEALNTFEKVTKTIISDEIAGASREKIERLEEELLRA SEEAKEAEVKVKEQTLFITDNYESYLGTEFMVPEKLDELADFIRMGKATTISEAEALYRS MKG >gi|229784115|gb|GG667620.1| GENE 31 34033 - 35481 1023 482 aa, chain + ## HITS:1 COG:no KEGG:Lbys_0361 NR:ns ## KEGG: Lbys_0361 # Name: not_defined # Def: fumarate reductase/succinate dehydrogenase flavoprotein domain protein # Organism: L.byssophila # Pathway: not_defined # 11 479 3 
450 459 279 37.0 2e-73 MKPKFEFVNGSLWEEKKEIPVARNCDVAIAGGGPSGLAAAIASARTGADTLLVESQSFLG GVATSTMMAALVDAKRANGISEELIKRMGERNGAPLLSPEKNVNTIPFDPETFKTCALQM LMESGASVLFYTIVTEPIVVGDEIRGFIIENKNGRQAILARQVIDCTGDADLAYRAGAPC VKGREEDGKMRPLSLLARMGGVDVYKTLKYLEENPEEIQPQYRNGGILKAGEEDVVQRLS GFYKLVEQAKEEGGLFPECHYFRLENLWASRGTVICNTARAYFLDGTDADDLTKAEIICR RQIDKLLEFARKYVPGFENAFVIDISSRLGVRETRRIKGVYTLTDEDAYGDATFEDPILF LSGNLVKRPLPKDLDVHMPEPIEGSDKDWLEKYPERVPMEHHEYQLPFRCLIPQNIRNLL VAGRSLSVSHMIDSFTRNMIPCMWFGQAAGAAAGLCIKYDTIPAELEYEKLHQELKRQGL TF >gi|229784115|gb|GG667620.1| GENE 32 35532 - 36434 605 300 aa, chain + ## HITS:1 COG:BH0484 KEGG:ns NR:ns ## COG: BH0484 COG4209 # Protein_GI_number: 15613047 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Bacillus halodurans # 2 297 28 321 325 224 43.0 1e-58 MKKKNYKRYISLYVMMSVSICYYVVYRYAPIFGNIIALKDYNIQAGIFRSNWAEPLNKYF KMFFESPNFRQLLTNTAIISFYKLVFVTFGGISLALLLNEARKKRLRQTVQTASFLPYFL SWVVIYSISQVFFSQSTGLINHAIRQSTGETIPFLSSDKWFRFILIMTDVWKEAGWSSII YIAAMAGIDPALYEAARVDGAGRLRMIISITLPCIKNVIIMMLILKLGTIMDAGFEQVFI MYSVPVYPVADILDTWVYRVGLQQMNFSLAAAAGLFKSVISFILVVAANKVSKRWGESMW >gi|229784115|gb|GG667620.1| GENE 33 36413 - 37282 713 289 aa, chain + ## HITS:1 COG:BS_lplC KEGG:ns NR:ns ## COG: BS_lplC COG0395 # Protein_GI_number: 16077779 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 26 289 27 295 295 215 44.0 8e-56 MGGVNVVSKISDKSVDFIVHTILLVIGFIYIFPLIYVIAVSLTPYSEVLKNGGFVLIPKR ITLQGYKEIFRNSGIPRAFMITVFITAVGTFINLLLTVTLAYPLSKKDLPGRKFWLLFVL FPMIFGASIIPTYLVVKATGLLNSVWAMIIPLAVTGYNTIILKTFFEGVPESLAESAMLD GAGEFQVLWHIVLPLSKPAVTTVGMYYAVNHWNEFFNAVYYITDVKLNPLQVVLRNIIDV STMASTEINVPTLTIQMAAVVVVIFPVLIVYFLLQKHIEKGMLSGAVKG >gi|229784115|gb|GG667620.1| GENE 34 37306 - 38820 1692 504 aa, chain + ## HITS:1 COG:BH0482 KEGG:ns NR:ns ## COG: BH0482 COG1653 # Protein_GI_number: 15613045 # Func_class: G Carbohydrate 
transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 58 382 67 387 512 108 26.0 2e-23 MKKQMKYGLAAVLLGLAGMTLLSGCQKKETKEIGQENADKVKLEVLVEGAGMPASSDEDP ILDALNERLNMDMNFSVVQNEYSNQLNVRVAGGNPPDLFSLNYEQMHTYAEQDLLLDWTP YLDQMPHVQEVLKQEDWVKGMVNGKIVALSKRPFIKLHSLWVRKDWLDALGLEVPKTLED LVNVSVAFTKQDPDGNGVDDTYGFTAPGMTAAGSQGLFGLSPVFGAFGTTQEGQFYVKDG KAVYSTEDPAFRDAIEFIRSFIATGSVDPELVVNKNFSERDKAFRGEAGIIYIAWTDMVK GNLQEQMKAVNPDAEWIQIEPPAGPGGSYDGFIDLGDTPNRYAMPKDIAPEKRDQVLKLM DYIASGEGLDLVCYGVEGVHYKKEGDQITALPAMSDITWSYNYQLVGRDDVPYYGVKYPT LTEYRDFAYGLDRINVYTNLIQIPDKFNIVDLRSYEQQELVNFIYGKRDMDEFDDFLKTL EDTYHLSTYKAAAESTLKECGYIK >gi|229784115|gb|GG667620.1| GENE 35 38829 - 40382 1266 517 aa, chain + ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 517 1 514 530 112 22.0 3e-24 MYKVVLIDDNEIFLESLKKNGPWNETDCCLEGVAYDGISGKKMIEELRPDIIITDISMPE MDGLTMVEAIRKECPGSKILIITGYDKFEYAQKALQLKVNQILLKPVSLHDLQCALSAIG EEFKKEEERNNNFISLQKKLDLNKRNYEREQRKRMFQLLLTLPCCDQAWIDKQTEKHECY DGGYTIFVMPADQKSIIEKDSAVIESEMEKYRDIAPLIQIVSFVNSSYFITLVLYKKYMR KDKISYAEEKMRNILRDFGDQKEYLLSVTCADNHFYRMQDRYEKLILQLEERMYFLYETV AGDQLKFSESHIMTYFETVITNVKTIKYEDGVLALDKLVEEISQFYTPDLVLIRTALNGF SVLLVRAFDGMEYGETGAHGLMNGFNTVMNIKQAKEQLHAIYRECLKQRSGEDKKYSALV LAVLQYVRGNVYGDISLERTARYFNVSSGYLSGLVKKETGINYNQIVLQEKINAAKELLK HPEYSIEEIGEKIGYTNYISFYKAFKKATGMSPRECR >gi|229784115|gb|GG667620.1| GENE 36 40393 - 42195 1518 600 aa, chain + ## HITS:1 COG:BH2110 KEGG:ns NR:ns ## COG: BH2110 COG2972 # Protein_GI_number: 15614673 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 272 596 264 580 585 129 24.0 2e-29 MREKIKNYLKERYSDIGIKKEIIMINSAIVSAAVLLTFLFCFFLNLEKSTQNIEKIARQL 
TEEQYHTVNSNIPHLEGQIFALMQNKRLQQLLAEAGTEPLDVFSRDGIRTELEQMCHFYH FIDFVVVKDLQGIWYGSGGSSGLDFIRQWDSENGERLKAQTGKCIWEIEDGEIYVEREIF NVTTMERLGTIVLNINQNYFNELIGLPRNSMNYCYLLSDTGKCYTYTESEGVPRWSDISP KILYEEGYQEKYLFHDGTVYLMQALMGGTGSWQIVNLIPGNVFTDGWKTSIMISILIGGC AVAVAILLISKFSRNLSARIEGVIGQMDRIADGEFHTEYQSERRDEIDKIRCRLNILGKQ IEQLLENTVAEEKQKKEMEVQMLQLRYYAVQSQMNPHFIYNALETISSMALLYGAAEVSN AVCECAALFRANVRRIHQFVTLEDELEYVKLCMRFYELAYPNRIDTEYRISPELQECMVP SFLLHPLIENIIVHGVSKSVKRVKILVSCKQRGDMIKIAVCDNSIGISREKQQQLLDAMK ETDIWLNEDSKAPHTEIGLPNIYYILKRSFGERADLKLYSREERGTCIFLSFPRERQIQN >gi|229784115|gb|GG667620.1| GENE 37 42314 - 42913 607 199 aa, chain + ## HITS:1 COG:no KEGG:Closa_0069 NR:ns ## KEGG: Closa_0069 # Name: not_defined # Def: sporulation protein YyaC # Organism: C.saccharolyticum # Pathway: not_defined # 1 199 1 199 199 361 86.0 8e-99 MRLWERIAIRGNREVYYYDTKQSFEAEEFASRLYDLICMEQENGKTAGVVFLCIGTDRST GDSLGPLIGYKLKEKRLERVEILGTLERPVHAMNLETYQAILKFQYPDHVVVAVDASVGN MEHIGYVTLGKGSLKPGLGVSKELKAVGDIFITGIVGSCGSFDPLMLQSIRLSVVMRMAD CISRSIYLVEKLWENASLL >gi|229784115|gb|GG667620.1| GENE 38 42976 - 44346 1071 456 aa, chain + ## HITS:1 COG:no KEGG:Closa_0070 NR:ns ## KEGG: Closa_0070 # Name: not_defined # Def: Peptidoglycan-binding lysin domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 456 1 416 416 577 65.0 1e-163 MGELYNPFPKLPKNIRQIGERDAIVKLYVEDYVNTYLRRLYPAGGQDLRVGLLLGSVEMN DGTPYIFIDGAMEMEDVTEQGQKVVFSELAWKKAYQSVEQLFPKRSVLGWFLCGAPGNDL SPLNYWKQHVQYFQGPNKLMYLSSGVEGDESVYITSEDGFYKLCGYSIYYERNQMMQDYM VLRKDVRRIESGSDEKVIQEFRKRMDEHKDEVTDRHQTLGLLRGMCMAMSIVILAGGIVM FNNYERMQEMESVIASAIPERVESALMGKDNAAVKDKPESHVVVEEADGGVYPTTAAVTK ETMSETQPGSQNGGEGQNGGDGQAAGSGQKSENTQDSGDVQSSGNGQNSGGGQNGESSQN SGNGQDGGNSQESAQTQGGNKGASKEEGAGSQSSSGGASQAAAGGTQRVHVVQDGETLYG ICISEYHDVNKLKEICELNGLEDENKIVSGQKLLLP >gi|229784115|gb|GG667620.1| GENE 39 44444 - 45700 1320 418 aa, chain + ## HITS:1 COG:no KEGG:Closa_0071 NR:ns ## KEGG: Closa_0071 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: 
not_defined # 1 418 1 418 424 676 77.0 0 MSDLDSMNREIKKRQLRRQIVPPPAPQEEDSEEIVRKARKRVRRKRLKIILIIGVILIAA GIGLFQYFRLYQYTSYDVVWEKELKEGSFVGYMDFGGNVLKYSKDGASYIDNQGKEIWVQ SFEMKSPTACVNGNYAAVADQQGNSIYIFDKSGPTGTATTLLPIIKATVSAHGVVAAILE DSTSNYIYFYKNDGSVLDVKIKALLNGETGYPVDISLSPDGTQLIGAYAYLKNGALNGRV AFHNFAEIGKSVPDRVVGGFDEPYESSLVARVHFLDDTYSCAFADNSVSFYSTKNALSPE LLKQVVVEEEIRAVFYSNKYVGIIVNNAEGENPDRMDVYSPDGNLVFSEAFRYQYTQADI DGEHVILYNEDSCRVYNMSGRLKFEGTFDFPVSKIRNGRFPNTLLVTGPQNMKEIRLK >gi|229784115|gb|GG667620.1| GENE 40 45720 - 47699 2204 659 aa, chain + ## HITS:1 COG:CAC1348 KEGG:ns NR:ns ## COG: CAC1348 COG0021 # Protein_GI_number: 15894627 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Clostridium acetobutylicum # 4 659 3 658 663 810 60.0 0 MSKIETMSVNAIRVLSADAIQKAKSGHPGLPLGCASAAYELWANHMNHNPADPNWANRDR FILSGGHGSMLLYSLLHLFGYGDLSKEDIMNFRQLGSLTPGHPEYGHTVGIEATTGPLGA GMAMAVGMAAAETHLAEVFNKEDYPVVDHYTYVLGGDGCMMEGISSEAFSLAGTLGLSKL IVLYDSNQISIEGSTDIAFTENVQKRMEAFGFQTITVEDGNDLEAIGRAIEEAKEDTEHP SFITIKTQIGFGCPAKQGKASAHGEPLGEENVLAMKENLGWPSMEPFYVPDEVYEHYARI AEEKAETEAAWNVMFTAYCEEYPEMEELWNAYHDENAGEKAIEACAGFWSKPEKADATRN ISGKILNKIKTYMPNLIGGSADLAPSNKTNMSDLGDFSRDNRGGRNLHFGVRELAMTAIG NGMMLHGGLRVFIATFFVFSDYTKPMARLSSLMGVPLTYVFTHDSIGVGEDGPTHEPIEQ MAMLRAMPNFHVFRPCDETETECAWYSALTSKKTPTALVLTRQNLTPMEGSSKEALKGGY VIDDCEGTPDLILIASGSEVELAVNAKKLLSDRKVRVVSMPCMDLFEEQSAEYKESVLPK SVRKRVAVEALSDFGWGRYVGLDGACVTMTGFGASGPANQLFEHFGFTADHVAEVARSL >gi|229784115|gb|GG667620.1| GENE 41 47781 - 48725 624 314 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 1 313 1 318 319 244 41 6e-64 MGKKCIGIDVGGTSVKIGLFETTGDLLLKWEVPTRKEEGGKYIIGDVAASILKTLEEKQI PMEEVVGAGLGVPGPVMPDGSVEVCVNLGWRNVNPGKELSGLLGGMPVKSGNDANVAALG EMWQGGGKGFDDIVMVTLGTGVGGGVIIGQKIVAGKHGLAGEIGHIHIRDEETEHCNCGG VGCVEQISSATGIAREARRKMAASDAPSAMRKFGDRITAKNVLDAAKAGDALALETMEVV GHYLGLALAQISMVVDPEVFVIGGGVSRAGQFLIDTIYKHYDQYTPISKNKSGIVLATLG NDAGIYGAARQILD >gi|229784115|gb|GG667620.1| 
GENE 42 48825 - 49193 389 122 aa, chain + ## HITS:1 COG:AF0833 KEGG:ns NR:ns ## COG: AF0833 COG2033 # Protein_GI_number: 11498439 # Func_class: C Energy production and conversion # Function: Desulfoferrodoxin # Organism: Archaeoglobus fulgidus # 1 122 5 124 125 119 50.0 1e-27 MKFYVCSHCGNIIAYVKNSGVPVVCCGEAMKELVPNTTDAAVEKHVPVIHTDGQKVTVTV GSTSHPMLEEHYIEWIALATKQGNQRKELKPGQEPQAEFMISEDDEVLEVYAYCNLHGLW KA >gi|229784115|gb|GG667620.1| GENE 43 49468 - 50649 999 393 aa, chain + ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 289 383 160 254 257 69 32.0 9e-12 MKFGFNPEKRVITEKSPPVFACESLIFFTASGSGILHVNGVKLKLKKGSFGWLHTYHIYR FEPDWGEVLELFCCPFDYASASYIAYRNSLKITMTVLTESSPVVMLRGENEEYVRSLMEE LIEEGSGSEAADIVHGYSLFLEIVTLFLRYSAREYERQEGKENRFERSLAWKLLQHIFTY SNENLTREKLASAYGVTDRVVSCQLNLVAGQSFASLLNRARVDRACDMMHFGGLTMQYIA RYVGYHSETSFYRSFKQVMGMTPQEYQKAVLPEGGNEAAHIDVRALLILNHLFGHYREPL TTQDTAKALYMEKNSVNPLLKENFGLDHSGILTELRMTYSESLLENSRMPVIDVAFACGY NSSHTFIRIFSERHGMTPGEYRRFSVHGGYESR >gi|229784115|gb|GG667620.1| GENE 44 50670 - 51218 453 182 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620206|ref|ZP_06113141.1| ## NR: gi|266620206|ref|ZP_06113141.1| putative transcriptional regulator [Clostridium hathewayi DSM 13479] putative transcriptional regulator [Clostridium hathewayi DSM 13479] # 1 182 1 182 182 349 100.0 6e-95 MECSLETREDSGFRGKQYFSPELLPERLKAHSFRFKGEGAFENREDVTVLSIRKGRGRMY LNNRIYPLEPGMVFILYPFHIWKMQAEEGGELEGEECVMNMGAMLFLFAIPPYHRPVPVF EENPSVYRLDQEYGSRVERLWDRLLKENESTDCYGGKMQFSLMMRIMVLCRKHGKVIDCS SF >gi|229784115|gb|GG667620.1| GENE 45 51335 - 52117 199 260 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 9 255 1 238 242 81 27 1e-14 MTKCMIGALKGKAAVITGGGSGIGKAMVERYLAEGASVVAADINEESLKVLQEELEPEYG 
DKLKIRRVNVSSKEEVEAMIDYTIEAFGKVDVVINNAGIMDNLLPIAEMPDDMWERIMNI NLNSVMYACRKAVQYFLKENREGTIINTASLSGLCAGRGGLAYTTSKFAVVGLTKNVAFM YSDAGIRCNALCPGAIESNIGSGMREPSKLGFTKAVTGNAGRHRNGSADEVATAAVFLAS DAASFINGETLTIDGGWSAY >gi|229784115|gb|GG667620.1| GENE 46 52211 - 53122 652 303 aa, chain - ## HITS:1 COG:BS_ydeC KEGG:ns NR:ns ## COG: BS_ydeC COG2207 # Protein_GI_number: 16077582 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 9 296 3 287 291 136 26.0 5e-32 MLNNEEKINVDIMQDASEIVRYDETGIPLYIQTGELSRYPDRKALCHWHDDLEFIRIVHG EMNYYINGKNLVLKEHDGIMVNTRQMHYGSSINQNDCSFLCILVHPALFSSNRLIYQKYL LPLLEDRETEFVYLDSGRPDHGFILDCLMEMTQLKEHGEYGYELEIIGLFYLIWRHLARQ FKAGKTAAPGIEDMDTKVQKDMVSFIYQHYTEKLTLSAIAASGGVCRSKCCQIFKKYLCQ SPIDFLNSYRLEVSSHLLKKTDKSITEIATECGFNHLSYYSELFMRHYGCTPRQFRSQHP ADS >gi|229784115|gb|GG667620.1| GENE 47 53627 - 54926 1095 433 aa, chain + ## HITS:1 COG:FN1789 KEGG:ns NR:ns ## COG: FN1789 COG0534 # Protein_GI_number: 19705094 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 8 426 16 434 459 285 40.0 1e-76 MRENDKHFYKNLFVLVLPIAFQNLMSALVSASDAMMLGMVNQDSLSAVSLAAQVQFVLSL FWAALTIGTTILAAQYWGKNDRLSVEKILAIVLKYSFVISFLFFAAAALCPRVLMHIFTN DETLIALGAEYLRIVSWSYLLAGISQIYLCIMRNSGRTLKSTVFGSVSMVLNIFLNGILI FGLLGFPKLGISGAAVATVTARAAELILAMWENRRDDVVRIRWSLLKESHAVLRKDFVKY TTPVMANELVWGCGFTMFTVIMGHLGSDAVAANSIAGIVKNLISCLCIGIGSGSGIIVGN ELGSGNLVRAREYGGRLCRLSLVMGAVSGVILLAFSPWIMRYAANLSPQAQAYLQGMLLM CAYYLIAKSANMTVVAGIFCAGGDTKFGFLCDTVTMWIIIVPVGLLAAFVFKLPVLAVYF LLNLDEFKPFRFQ Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:57:11 2011 Seq name: gi|229784114|gb|GG667621.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld14, whole genome shotgun sequence Length of sequence - 68762 bp Number of predicted genes - 87, with homology - 84 Number of transcription units - 30, operones - 18 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 
. - CDS 2 - 161 68 ## gi|240145800|ref|ZP_04744401.1| hypothetical protein ROSINTL182_07720 - Prom 245 - 304 8.1 2 2 Op 1 . - CDS 350 - 490 140 ## EUBREC_3592 hypothetical protein - Term 539 - 577 1.6 3 2 Op 2 . - CDS 703 - 936 304 ## Ethha_1892 TraG family protein - Prom 992 - 1051 18.3 4 3 Op 1 . - CDS 1953 - 2372 350 ## Closa_3720 TRAG family protein 5 3 Op 2 . - CDS 2369 - 2869 455 ## Ethha_1891 hypothetical protein 6 3 Op 3 . - CDS 2959 - 3696 655 ## COG3617 Prophage antirepressor 7 3 Op 4 . - CDS 3746 - 4624 585 ## BLJ_1240 hypothetical protein - Term 4651 - 4692 4.5 8 4 Op 1 . - CDS 4694 - 4879 89 ## gi|266620217|ref|ZP_06113152.1| conserved hypothetical protein 9 4 Op 2 . - CDS 4912 - 5187 170 ## gi|266620218|ref|ZP_06113153.1| hypothetical protein CLOSTHATH_01299 10 4 Op 3 25/0.000 - CDS 5202 - 6170 838 ## COG1475 Predicted transcriptional regulators 11 4 Op 4 . - CDS 6127 - 6948 546 ## COG1192 ATPases involved in chromosome partitioning - Prom 7075 - 7134 7.5 - Term 7346 - 7396 10.4 12 5 Op 1 59/0.000 - CDS 7425 - 7817 610 ## PROTEIN SUPPORTED gi|240144422|ref|ZP_04743023.1| 30S ribosomal protein S9 13 5 Op 2 7/0.000 - CDS 7844 - 8278 675 ## PROTEIN SUPPORTED gi|239623392|ref|ZP_04666423.1| ribosomal protein L13 - Prom 8376 - 8435 5.3 14 5 Op 3 8/0.000 - CDS 8450 - 9514 1054 ## COG0101 Pseudouridylate synthase 15 5 Op 4 34/0.000 - CDS 9531 - 10331 994 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 16 5 Op 5 15/0.000 - CDS 10325 - 11224 467 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 17 5 Op 6 . - CDS 11214 - 12119 614 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 12222 - 12281 6.5 - Term 12293 - 12339 11.0 18 6 Op 1 1/0.500 - CDS 12355 - 13374 1227 ## COG5263 FOG: Glucan-binding domain (YG repeat) 19 6 Op 2 . - CDS 13401 - 14369 850 ## COG5263 FOG: Glucan-binding domain (YG repeat) 20 7 Tu 1 . 
- CDS 15322 - 15546 268 ## Closa_3736 cell wall binding repeat-containing protein - Prom 15588 - 15647 7.8 21 8 Op 1 . - CDS 15729 - 16796 1229 ## COG0258 5'-3' exonuclease (including N-terminal domain of PolI) 22 8 Op 2 . - CDS 16839 - 17333 583 ## BPUM_1656 hypothetical protein 23 8 Op 3 . - CDS 17367 - 17459 60 ## - Prom 17556 - 17615 23.6 24 9 Op 1 50/0.000 - CDS 18541 - 19077 749 ## PROTEIN SUPPORTED gi|240145848|ref|ZP_04744449.1| LSU ribosomal protein L17P 25 9 Op 2 26/0.000 - CDS 19172 - 20140 1159 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit - Term 20179 - 20229 12.4 26 10 Op 1 36/0.000 - CDS 20258 - 20851 803 ## PROTEIN SUPPORTED gi|240145850|ref|ZP_04744451.1| 30S ribosomal protein S4 27 10 Op 2 48/0.000 - CDS 20868 - 21269 669 ## PROTEIN SUPPORTED gi|239623373|ref|ZP_04666404.1| ribosomal protein S11 - Prom 21309 - 21368 3.0 28 10 Op 3 . - CDS 21389 - 21757 553 ## PROTEIN SUPPORTED gi|240145852|ref|ZP_04744453.1| ribosomal protein S13p/S18e 29 10 Op 4 . - CDS 21786 - 21899 189 ## PROTEIN SUPPORTED gi|160881761|ref|YP_001560729.1| ribosomal protein L36 - Prom 22021 - 22080 6.3 - Term 21924 - 21958 0.0 30 11 Op 1 . - CDS 22084 - 22302 271 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 31 11 Op 2 . 
- CDS 22336 - 22602 274 ## Closa_3744 hypothetical protein 32 11 Op 3 12/0.000 - CDS 22623 - 23348 775 ## COG0024 Methionine aminopeptidase 33 11 Op 4 28/0.000 - CDS 23391 - 24035 711 ## COG0563 Adenylate kinase and related kinases - Term 24063 - 24109 9.1 34 11 Op 5 53/0.000 - CDS 24119 - 25441 832 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 35 11 Op 6 48/0.000 - CDS 25443 - 25883 638 ## PROTEIN SUPPORTED gi|240145859|ref|ZP_04744460.1| 50S ribosomal protein L15 36 11 Op 7 50/0.000 - CDS 25909 - 26091 243 ## PROTEIN SUPPORTED gi|227871783|ref|ZP_03990188.1| ribosomal protein L30 37 11 Op 8 56/0.000 - CDS 26105 - 26614 673 ## PROTEIN SUPPORTED gi|227871784|ref|ZP_03990189.1| ribosomal protein S5 38 11 Op 9 46/0.000 - CDS 26634 - 27002 549 ## PROTEIN SUPPORTED gi|239623363|ref|ZP_04666394.1| ribosomal protein L18 39 11 Op 10 55/0.000 - CDS 27021 - 27563 785 ## PROTEIN SUPPORTED gi|238916280|ref|YP_002929797.1| large subunit ribosomal protein L6 40 11 Op 11 50/0.000 - CDS 27630 - 28031 592 ## PROTEIN SUPPORTED gi|240145864|ref|ZP_04744465.1| ribosomal protein S8 - Term 28065 - 28105 0.7 41 11 Op 12 50/0.000 - CDS 28270 - 28455 310 ## PROTEIN SUPPORTED gi|168334339|ref|ZP_02692526.1| ribosomal protein S14 42 11 Op 13 48/0.000 - CDS 28471 - 29010 824 ## PROTEIN SUPPORTED gi|240145866|ref|ZP_04744467.1| 50S ribosomal protein L5 43 11 Op 14 57/0.000 - CDS 29032 - 29337 380 ## PROTEIN SUPPORTED gi|227871791|ref|ZP_03990195.1| ribosomal protein L24 44 11 Op 15 50/0.000 - CDS 29351 - 29719 582 ## PROTEIN SUPPORTED gi|225376419|ref|ZP_03753640.1| hypothetical protein ROSEINA2194_02061 45 11 Op 16 . - CDS 29746 - 30000 405 ## PROTEIN SUPPORTED gi|160881776|ref|YP_001560744.1| ribosomal protein S17 46 12 Op 1 . 
- CDS 30123 - 30338 292 ## PROTEIN SUPPORTED gi|239623355|ref|ZP_04666386.1| 30S ribosomal protein S17 47 12 Op 2 50/0.000 - CDS 30319 - 30756 666 ## PROTEIN SUPPORTED gi|160881778|ref|YP_001560746.1| 50S ribosomal protein L16 48 12 Op 3 61/0.000 - CDS 30756 - 31412 977 ## PROTEIN SUPPORTED gi|238922836|ref|YP_002936349.1| SSU ribosomal protein S3P 49 12 Op 4 59/0.000 - CDS 31424 - 31810 583 ## PROTEIN SUPPORTED gi|160881780|ref|YP_001560748.1| ribosomal protein L22 50 12 Op 5 60/0.000 - CDS 31842 - 32123 446 ## PROTEIN SUPPORTED gi|240145874|ref|ZP_04744475.1| 30S ribosomal protein S19 51 12 Op 6 61/0.000 - CDS 32143 - 32991 1304 ## PROTEIN SUPPORTED gi|240145875|ref|ZP_04744476.1| 50S ribosomal protein L2 - Term 33016 - 33055 5.7 52 12 Op 7 61/0.000 - CDS 33116 - 33415 417 ## PROTEIN SUPPORTED gi|240145876|ref|ZP_04744477.1| 50S ribosomal protein L23 53 12 Op 8 58/0.000 - CDS 33415 - 34035 832 ## PROTEIN SUPPORTED gi|240145877|ref|ZP_04744478.1| ribosomal protein L4/L1e 54 12 Op 9 40/0.000 - CDS 34065 - 34700 943 ## PROTEIN SUPPORTED gi|238916265|ref|YP_002929782.1| large subunit ribosomal protein L3 - Prom 34735 - 34794 4.6 55 12 Op 10 . - CDS 34967 - 35284 498 ## PROTEIN SUPPORTED gi|160941071|ref|ZP_02088409.1| hypothetical protein CLOBOL_05964 56 13 Op 1 . - CDS 35661 - 35840 76 ## gi|266620265|ref|ZP_06113200.1| conserved hypothetical protein 57 13 Op 2 . - CDS 35880 - 37238 1627 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 37271 - 37330 7.4 + Prom 37221 - 37280 8.1 58 14 Tu 1 . + CDS 37496 - 38365 829 ## COG1092 Predicted SAM-dependent methyltransferases + Term 38480 - 38511 4.1 59 15 Tu 1 . - CDS 38348 - 38761 325 ## Closa_3771 hypothetical protein - Prom 38786 - 38845 80.4 60 16 Tu 1 . 
- CDS 39693 - 40301 593 ## Closa_3771 hypothetical protein - Prom 40333 - 40392 6.5 - Term 40440 - 40487 3.2 61 17 Op 1 38/0.000 - CDS 40529 - 41413 832 ## COG0395 ABC-type sugar transport system, permease component 62 17 Op 2 35/0.000 - CDS 41428 - 42384 877 ## COG1175 ABC-type sugar transport systems, permease components - Term 42398 - 42448 15.9 63 17 Op 3 . - CDS 42478 - 43812 1335 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 43902 - 43961 10.8 - Term 43823 - 43867 1.3 64 18 Tu 1 . - CDS 43989 - 44843 858 ## COG1737 Transcriptional regulators - Term 46262 - 46317 14.1 65 19 Op 1 . - CDS 46423 - 48051 1248 ## COG0029 Aspartate oxidase 66 19 Op 2 44/0.000 - CDS 48023 - 49075 879 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 67 19 Op 3 44/0.000 - CDS 49065 - 50084 592 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 68 19 Op 4 49/0.000 - CDS 50095 - 51024 997 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 69 19 Op 5 . - CDS 51037 - 51999 1144 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Prom 52019 - 52078 1.6 + Prom 51742 - 51801 3.6 70 20 Tu 1 . + CDS 51944 - 52054 57 ## - Term 52009 - 52053 6.8 71 21 Tu 1 . - CDS 52090 - 53796 2076 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 53893 - 53952 6.6 - Term 53814 - 53863 1.2 72 22 Op 1 5/0.000 - CDS 54037 - 54930 281 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase - Term 54942 - 54989 5.1 73 22 Op 2 . - CDS 55008 - 55691 776 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase 74 22 Op 3 . - CDS 55652 - 55783 83 ## gi|288870182|ref|ZP_06409658.1| conserved hypothetical protein - Prom 55803 - 55862 8.3 + Prom 55785 - 55844 10.7 75 23 Op 1 . + CDS 55890 - 56807 952 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase + Term 56827 - 56873 13.1 76 23 Op 2 . 
+ CDS 56882 - 57520 680 ## COG2755 Lysophospholipase L1 and related esterases + Prom 57523 - 57582 4.6 77 24 Tu 1 . + CDS 57670 - 58992 1219 ## COG1404 Subtilisin-like serine proteases + Term 59029 - 59075 9.0 78 25 Tu 1 . - CDS 59047 - 59613 463 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 59647 - 59706 3.7 - Term 59652 - 59712 8.1 79 26 Op 1 . - CDS 59794 - 59970 210 ## Closa_3778 hypothetical protein 80 26 Op 2 . - CDS 59957 - 61501 1742 ## Closa_3778 hypothetical protein - Prom 61535 - 61594 6.6 81 27 Op 1 . - CDS 61626 - 62432 617 ## COG2362 D-aminopeptidase - Prom 62517 - 62576 3.6 - Term 62515 - 62583 19.0 82 27 Op 2 . - CDS 62610 - 63764 1397 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 - Prom 63787 - 63846 80.4 83 28 Op 1 51/0.000 - CDS 64913 - 67033 1816 ## COG0480 Translation elongation factors (GTPases) 84 28 Op 2 56/0.000 - CDS 67049 - 67519 733 ## PROTEIN SUPPORTED gi|240146972|ref|ZP_04745573.1| 30S ribosomal protein S7 - Prom 67634 - 67693 2.0 - Term 67629 - 67657 -0.9 85 28 Op 3 . - CDS 67708 - 68127 677 ## PROTEIN SUPPORTED gi|240146973|ref|ZP_04745574.1| 30S ribosomal protein S12 - Prom 68187 - 68246 1.9 86 29 Tu 1 . - CDS 68269 - 68460 88 ## - Prom 68492 - 68551 3.5 + Prom 68181 - 68240 6.8 87 30 Tu 1 . 
+ CDS 68399 - 68762 284 ## CLJU_c31760 hypothetical protein Predicted protein(s) >gi|229784114|gb|GG667621.1| GENE 1 2 - 161 68 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|240145800|ref|ZP_04744401.1| ## NR: gi|240145800|ref|ZP_04744401.1| hypothetical protein ROSINTL182_07720 [Roseburia intestinalis L1-82] hypothetical protein ROSINTL182_07720 [Roseburia intestinalis L1-82] hypothetical protein RO1_33370 [Roseburia intestinalis XB6B4] # 1 53 1 53 215 103 100.0 6e-21 MRKIKTILGIVFSLAIVTTMNTTVFAANDSGIPERRGVYIRECVGYEPNYTNI >gi|229784114|gb|GG667621.1| GENE 2 350 - 490 140 46 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3592 NR:ns ## KEGG: EUBREC_3592 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 45 3 47 48 70 82.0 2e-11 MAFFNQAITVLQTLVIALGAGLGIWGVINLLEGYGNDNPGANAHVR >gi|229784114|gb|GG667621.1| GENE 3 703 - 936 304 77 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1892 NR:ns ## KEGG: Ethha_1892 # Name: not_defined # Def: TraG family protein # Organism: E.harbinense # Pathway: Bacterial secretion system [PATH:eha03070] # 2 76 516 590 596 115 72.0 4e-25 MSQDELAVMDGGKCILQLRGVRPFLSDKYDITKHPNFKYTADASDKNAFDIEAFLSTRLK LKPNEVCDVYEVDTEGA >gi|229784114|gb|GG667621.1| GENE 4 1953 - 2372 350 139 aa, chain - ## HITS:1 COG:no KEGG:Closa_3720 NR:ns ## KEGG: Closa_3720 # Name: not_defined # Def: TRAG family protein # Organism: C.saccharolyticum # Pathway: Bacterial secretion system [PATH:csh03070] # 8 139 6 137 585 204 69.0 1e-51 MKKQLDIKKLLILNLPYILMGLFATNFGEGWRRAQGADASAKMLSFFSTLPVALASWWPS LHPLDLLVGLCCGAGLRLAVYLKSKNAKKYRHGMEYGSARWGTHEDIAPYIDPVFQNNVI LTKTESLTMNSRPKDPKTA >gi|229784114|gb|GG667621.1| GENE 5 2369 - 2869 455 166 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1891 NR:ns ## KEGG: Ethha_1891 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 166 1 161 163 192 68.0 5e-48 MQEEVENRTLTLVVNGTKFTGRLFKAAICKYLAHCKEKKLNKQRKREIPVKPQGKQTVKQ LVGQNQGVSNIEITDPSIKEFEKIARKYGVDYAVKKDRSTSPPKYLIFFKGRDADALTAA 
FTEYTGKKVKKAEKTERPSVLAKLNQFKELVKNAVVDRTRRKELER >gi|229784114|gb|GG667621.1| GENE 6 2959 - 3696 655 245 aa, chain - ## HITS:1 COG:lin0080_1 KEGG:ns NR:ns ## COG: lin0080_1 COG3617 # Protein_GI_number: 16799158 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Listeria innocua # 1 128 1 129 129 102 43.0 5e-22 MNQMEIFSNQEFGSIRTFEQDGKVLFCGKDIAKALGYQRTADAITAHCKGVCVLPTPSNG GIQRMKFIPEGDVYRLIVHSKLPSAERFERWVFDEVLPSIRKHGAYITREKLWEVATSPE AMMKLCSDLLAEREENAVLRKENAMLEGKAAFYDLFIDLKHSTNLRTTAKELVVPERRFV RFLLEQRFVYRAPSGNVLPYAKPANDGLFTVKDYCNHGHTGSYTLVTPQGKLFFAQLRDS ILLMI >gi|229784114|gb|GG667621.1| GENE 7 3746 - 4624 585 292 aa, chain - ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 292 80 375 381 330 56.0 5e-89 MAIFRIERTRDYTVMSNHHLRNGKLSLKAKGLLSMMLSLPEDWNYTTRGLAKICKEGVDA IGGALRELETAGYIVRHQLRDRQGRISDTEYVIYEQPQPKNPDTPQPDTASPDTENPDME KPDTEKPAELNIEKSNTQKIITDGSSIDSIPFREKTAARLPERKGRDAMSVSEIENYRDL ILENIEYDYLCREFATYREDLDEIVELMVETVCARRKTTRIAGSDFPHEVVRSRFLKLDS EHIRFVMDSLQKNTTEVRNMKQYLLTVLFNAPTTISNHYTSQVNHDMNAGGW >gi|229784114|gb|GG667621.1| GENE 8 4694 - 4879 89 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620217|ref|ZP_06113152.1| ## NR: gi|266620217|ref|ZP_06113152.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 61 1 61 61 88 100.0 2e-16 MCTVKEIREAIREDCRSQHDMETADIDRIFQTLPFSQRNSLLDKPPMPKRPYRRRKKEQK S >gi|229784114|gb|GG667621.1| GENE 9 4912 - 5187 170 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620218|ref|ZP_06113153.1| ## NR: gi|266620218|ref|ZP_06113153.1| hypothetical protein CLOSTHATH_01299 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01299 [Clostridium hathewayi DSM 13479] # 1 91 1 91 91 158 100.0 1e-37 MRDISARELKGHNILAVERFQDNTRRMIEFSVLRSCAYGSPGDEMRLFLTEDGYQAALVS QKRREIKIRRHARIIEGHILDFKPDKHRRHS >gi|229784114|gb|GG667621.1| GENE 
10 5202 - 6170 838 322 aa, chain - ## HITS:1 COG:SP2240 KEGG:ns NR:ns ## COG: SP2240 COG1475 # Protein_GI_number: 15902043 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 64 230 24 171 252 75 34.0 1e-13 MKSSAKKVELASVDDLFTTEQERQEADEERVVRIPLTQLHYFKGYPAMQTEVPFTGQPYR FKDDDPKMQETLDSVKKRGVRAPGLVRPDPDGGYEIISGHRRHRASELAGLEDMPVIIRD MTDEEAVIEMVDANIQREKVMPSEKAWAYRMKLEAIKHQGARTDLTSVQVEQKLSARDKV AKDAGEKSGIQIMRYIRLTELIPELLDMVDEKKIAFNPAYELSFLKKEEQIQLLDAMDSE QATPSLSQAQRLKKFSQEGRLSIDVMRAIMGEEKKSDLDKVTFTSDTLRKYFPKSYTPAR MQETIIKLLEQWQKKRQQQHER >gi|229784114|gb|GG667621.1| GENE 11 6127 - 6948 546 273 aa, chain - ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 4 257 3 249 253 201 45.0 2e-51 MNTQIIAIANQKGGVGKTTTCANLGIGLAQAGKKVLLIDGDPQGSLTISLGYPQPDKLPF TLSDAMGRILMDEPLRPGEGILHHPEGVDLMPADIQLSGMEVSLVNAMSRETILRQYLDT VKGQYSHILIDCQPSLGMLTVNALAAANRVIIPVQAEYLPAKGLEQLLQTVNKVKRQINP KLQIDGILLTMVDNRTNFAKEIAALLRETYGSKIKVFATEIPHSVRAKEISAEGKSIFAH DPGGKVAESYKNLTQEVTKLEKQREKSRAGIGR >gi|229784114|gb|GG667621.1| GENE 12 7425 - 7817 610 130 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240144422|ref|ZP_04743023.1| 30S ribosomal protein S9 [Roseburia intestinalis L1-82] # 1 130 1 130 130 239 88 3e-62 MANAKFYGTGRRKKSIARVYLVPGTGKITVNKRDIDEYFGLETLKVVVRQPLVATETNDK FDVLVNVHGGGFTGQAGAVRHGIARALLQADADYRPILKKAGYLTRDPRMKERKKYGLKA ARRAPQFSKR >gi|229784114|gb|GG667621.1| GENE 13 7844 - 8278 675 144 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623392|ref|ZP_04666423.1| ribosomal protein L13 [Clostridiales bacterium 1_7_47_FAA] # 1 140 1 140 142 264 88 9e-70 MKTFMASPATIERKWYVVDATGHTLGRLASEVAKVLRGKNKPIFTPHMDCGDYVIIVNAD KVNVTGKKLDQKIYYNHSDYVGGMKETTLREMMAKKPEKVMELAVKGMLPKGPLGRSMMT KLHVYAGPEHDHAAQKPEVLEIKF >gi|229784114|gb|GG667621.1| GENE 14 8450 - 9514 
1054 354 aa, chain - ## HITS:1 COG:BH0167 KEGG:ns NR:ns ## COG: BH0167 COG0101 # Protein_GI_number: 15612730 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Bacillus halodurans # 1 237 1 238 263 209 47.0 9e-54 MKRIKLIVAYDGTNYCGWQIQNNGRTIEEVLNEALTALLHEKVAVIGASRTDSGVHAEGN VAVFDTENRMPADKICFALNQRLPEDIRVLKSEEVPLTWHPRKCNCVKTYEYRILNRRIE VPTLRLYAYFCYYPLDVEKMKQAAEYLVGEHDFTSFCAPRTQAEDMVRTIYSLDVGKTGD MITIRVSGSGFLYNMVRIIAGTLMKVGLGVWPPEHVEEILDARSRAAAGPTAVARGLTLI SLEEETELRPVIAAENKEWKYSLDQSGTVAEKQSFLTIERCIPKEFDRLLFRVVHQAVRN GAETVYVNDCESGERIEAGKPYGYYVFQPSEERNGWYVTHDEYRRGAGKGSPKE >gi|229784114|gb|GG667621.1| GENE 15 9531 - 10331 994 266 aa, chain - ## HITS:1 COG:CAC3100 KEGG:ns NR:ns ## COG: CAC3100 COG0619 # Protein_GI_number: 15896351 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Clostridium acetobutylicum # 1 247 1 249 267 263 55.0 3e-70 MLRDITLGQYYPVASVLHRLDPRTKLFGTMVFIISLFVADSIWAYLAATVFLAAAIKISH VPFRFMVRGLKAIIFLLLISVSFNLFLTQGEVLFQLWFLKVTKEGLKTAGFMGVRLIYLV VGSSVMTLTTTPNELTDGLEKSLGFLKKIGLPVHEVSMMMSIALRFIPILVEETDKIMKA QMARGADFESGNIIQRAKNMIPLLVPLFISAFRRATDLAMAMEARCYRGGEGRTKMKPLH YAKRDGVTYLVYIAYLAVIIVLRILI >gi|229784114|gb|GG667621.1| GENE 16 10325 - 11224 467 299 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 22 260 146 380 398 184 43 1e-45 MAIKIEHLNYIYGQGTAFEQQALKDVNLEIQDGQFIGLIGHTGSGKSTLIQHLNGLIKAT SGAIYYNGENIYSDGYDMRALRSHVGLVFQYPEHQLFEIDVFTDVCFGPKNQGLPKEEIE SRAREALTMVGLDESFYKQSPFELSGGQKRRVAIAGVLAMEPEVLILDEPTAGLDPRGRD EILDQIERLHREKKMTIILVSHSMEDIAKYVDRILVMNHGEKVFDDTPREVFKHYRELEA IGLAAPQITYVVHTLRDHGVPIDNNITTVEEARDEILALLKKEGQAARGNGSRGQETIC >gi|229784114|gb|GG667621.1| GENE 17 11214 - 12119 614 301 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 
[Thermanaerovibrio acidaminovorans DSM 6589] # 28 281 146 398 398 241 49 1e-62 MGIIKTSKLIFDYIRRDEEENIEEVNRAIDDVSIEIKEGQFVAVLGHNGSGKSTFAKQLD AILLPTDGTVWIQGLDTSKEENLWEVRKKTGMVFQNPDNQIIGNVVEEDVGFGPENMGIP TDEIWKRVDESLEAVGMTAYRMKSPNKLSGGQKQRVAIAGVMAMRPQCIVLDEPTAMLDP NGRREVVKTVHELNRQEGITVVLITHYMEEVINADRVIVMDDGRVVMDGTPREIFSRVEE LKSYRLDVPQVTELAYELQKNGVELPDGILTLNELMKYLLPIFAERYPGRVTLGDGGSYG N >gi|229784114|gb|GG667621.1| GENE 18 12355 - 13374 1227 339 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 211 338 495 620 621 102 39.0 8e-22 MRGIKGKGRWLILLGAVMTLGTAMTSMAAKSVKIRFENDVRSEWTEDIKVPKVTVNYDEV SPEWSKTPEDWVPGKKVTATFTIAGDYVKSSCSVYGGELVSAKAADGETTIKATYVPVAQ LGAPETAGWSDAAKTKASWKRVPYASKYQLRLYREDQWIKTLTTGSTSIDLLEYLQDGYS YYYEVRAIAKDSSEEKYLKDGEFTVSTDSEVQELGDTSGRWANTQAGKRYRDEDGNYAAN CWKMISGKWYYFNQDGYALTGWQYLNSKWYYMNENAEMVTGWQQIGGKWYYFNANGDMVT GWLQAEPGKWYYLYEDGSMAANTVIDGTYKVNESGQWVP >gi|229784114|gb|GG667621.1| GENE 19 13401 - 14369 850 322 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 135 320 516 743 744 134 34.0 2e-31 MTIGYEPKMRVELHATDSDEYAFKGGYQSSNVTVKGGTYVSASRSGSDTLYVTFKFKPIK GTYSSPDDAYWRENGYGGARWSKVDNSSDAYDVYLYRGSSVVKKVEGLKATSYNFYPYMT KAGTYSFKVRTVPYTESEKKYGKKSDWTESDEVYLPEEKVSDGSGQDNGTPGSTSQVGWI KDGNNWYFRYPDGSYQKNNWAKINEKWYLFDSNGVMLTGWQQRNNLWYFLNDQGAMVTGW VLSNNKWYYLNPSTTTGVEGAMLTGWIAYNNKWYCTDASGAMLEGWRQVDGSWYYFYPGE GSKAVNTTIDGFPVDGNGVWHK >gi|229784114|gb|GG667621.1| GENE 20 15322 - 15546 268 74 aa, chain - ## HITS:1 COG:no KEGG:Closa_3736 NR:ns ## KEGG: Closa_3736 # Name: not_defined # Def: cell wall binding repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 74 1 
73 408 92 66.0 4e-18 MKRFGKLLALTAIVSVLGVSIPFQAFAASKTISSVTIYAGLEEMESGGTLPEETSFELKE NSTGNYVYTNNSRY >gi|229784114|gb|GG667621.1| GENE 21 15729 - 16796 1229 355 aa, chain - ## HITS:1 COG:SA1513_1 KEGG:ns NR:ns ## COG: SA1513_1 COG0258 # Protein_GI_number: 15927268 # Func_class: L Replication, recombination and repair # Function: 5'-3' exonuclease (including N-terminal domain of PolI) # Organism: Staphylococcus aureus N315 # 3 350 2 281 324 132 32.0 9e-31 MKDKFVIVDGSSMLSTCYYAVLPREIMFAKTEEEKQKHYGKILHAGDGTYTNAIFGMLKT IASLLKKQEPGYIAFVFDKTRDTFRRELYAEYKGTRGATPEPLKQQFVLIESILEEAGFR VLYSDDYEADDYAGSLVMKFRSQVPVVVMTKDHDYLQLVNDEYNVRAWMVQAKQEKADEL YEKYYGPYGLTKKDVNLPEKTFEFTAEHVFAEEGVWPEQITDLKGIQGDTSDNIPGVKGV SSAVPPLLAEYGTVEAMYEAIHDAETDKKQLKELQDFWKEKLGITRTPYKALTKTGEDGE LCGEAAALLSKRLAAIKTDIPLDLELADFSVEKYQEKVLRKWCKKLDIKEASVFG >gi|229784114|gb|GG667621.1| GENE 22 16839 - 17333 583 164 aa, chain - ## HITS:1 COG:no KEGG:BPUM_1656 NR:ns ## KEGG: BPUM_1656 # Name: not_defined # Def: hypothetical protein # Organism: B.pumilus # Pathway: not_defined # 6 149 2 138 149 102 39.0 5e-21 MIKYYEGDIFSSNADLICHQVNCQGVFGRGMAGQVKKLFPEVEKSYKILTKQWTEEVGGK TAKLLGRVSAQPVEKDGRWFLIANLYGQDDYGKNGVYTDYEALEKALEEICEFLTVRGRN ETAAFPQGMGCGFGGGDWDVVEAIIKRVFEDYPGEVQIWKYDGK >gi|229784114|gb|GG667621.1| GENE 23 17367 - 17459 60 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKRVVEFRPYIFPHPLKCVMIQKSNIENK >gi|229784114|gb|GG667621.1| GENE 24 18541 - 19077 749 178 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145848|ref|ZP_04744449.1| LSU ribosomal protein L17P [Roseburia intestinalis L1-82] # 1 178 1 178 178 293 84 2e-78 MAGYRKLGRTSSQRKALIRSQVTALLHNGKIVTTEARAKEIRKVAEGLIALAVKEKDNFE TVKVTAKVARKDKDGKRVKEVVNGKKVTVYDEVEKEIKKDLPSRLHARRQMLKVLYDVTE VPQEAAGRKKNTKSVDLPKKLFEEIAPKYVSRNGGYTRIVKIGQRKGDGAMEVLLELV >gi|229784114|gb|GG667621.1| GENE 25 19172 - 20140 1159 322 aa, chain - ## HITS:1 COG:BS_rpoA KEGG:ns NR:ns ## COG: BS_rpoA COG0202 # Protein_GI_number: 16077211 # Func_class: K Transcription # Function: DNA-directed RNA 
polymerase, alpha subunit/40 kD subunit # Organism: Bacillus subtilis # 1 318 1 314 314 390 66.0 1e-108 MFDFEKPNIEIAEISEDKKYGKFVVEPLERGYGTTLGNSLRRIMLSSLPGAAVSQVKIDG VLHEFSSIPGVKEDVTEIIMNIKSLAIKNNSDTNEAKVAYIEFEGEGVVTAADIQADPDI EIMNPDQVIATLSGGTDSKFYMELTITKGRGYVSADKNKSEDLPIGVIAVDSIYTPVERV NMTVQNTRVGQVTDFDKLTLDVYTDGTLAPDEAVSLAAKVLSEHLNLFIDLSENAKTAEV MVEKEDNEKEKVLEMNIDELELSVRSYNCLKRAGINTVEELCNRTSEDMMKVRNLGRKSL EEVLAKLKELGLQLNPSDDASS >gi|229784114|gb|GG667621.1| GENE 26 20258 - 20851 803 197 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145850|ref|ZP_04744451.1| 30S ribosomal protein S4 [Roseburia intestinalis L1-82] # 1 197 1 197 197 313 78 1e-84 MAVDRVPVLKRCRSLGLDPIYLGIDKKSNRELKRANRKMSEYGLQLREKQKAKFIYGVLE KPFRNYYAKASRMNGMVGENLMILLERRLDNVVFRMGFGRTRRETRQMVDHKSILVNGKC VNIPSYLVKAGDVIEVKEKCKGNARFKDVVETTAGRLTPAWVDVDHENLRGTVKELPTRD EIDVPVNEMLIVELYSK >gi|229784114|gb|GG667621.1| GENE 27 20868 - 21269 669 133 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623373|ref|ZP_04666404.1| ribosomal protein S11 [Clostridiales bacterium 1_7_47_FAA] # 1 133 1 133 133 262 99 4e-69 MAKKVSTTKKVTKKRVKKNVERGQAHIQSSFNNTIVTLTDTQGNALSWASAGGLGFRGSR KSTPYAAQMAAETATKAALVHGLKSVDVMVKGPGSGREAAIRALQACGLEVTSIKDVTPV PHNGCRPPKRRRV >gi|229784114|gb|GG667621.1| GENE 28 21389 - 21757 553 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145852|ref|ZP_04744453.1| ribosomal protein S13p/S18e [Roseburia intestinalis L1-82] # 1 122 1 122 122 217 87 1e-55 MARISGVDLPREKRVEIGLTYIYGIGRTSSNRILTEAGVNPDTRVKDLTDDEVKRISAVI ADSQMVEGDLRREIAMNIKRLQEIGCYRGIRHRKSLPVRGQKTKTNARTRKGPRKTVANK KK >gi|229784114|gb|GG667621.1| GENE 29 21786 - 21899 189 37 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881761|ref|YP_001560729.1| ribosomal protein L36 [Clostridium phytofermentans ISDg] # 1 37 1 37 37 77 94 2e-13 MKVRSSVKPICEKCKIIKRKGSIRVICENPKHKQRQG >gi|229784114|gb|GG667621.1| GENE 30 22084 - 22302 271 72 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae 
TIGR4] # 1 72 1 72 72 108 69 6e-23 MSKADVIEIEGTVVEKLPNAMFQVELENGHQVLAHISGKLRMNYIRILPGDKVTIELSPY DLSKGRIIWRDK >gi|229784114|gb|GG667621.1| GENE 31 22336 - 22602 274 88 aa, chain - ## HITS:1 COG:no KEGG:Closa_3744 NR:ns ## KEGG: Closa_3744 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 6 88 8 90 95 89 59.0 3e-17 MEITIGSLVKSLAGHDKDEVFFILKEEVEYVYLVDGKYRTLARPKRKNRKHVEAVSCEAD CPGKKIRENQRVTNEEIARFIKCFKKKQ >gi|229784114|gb|GG667621.1| GENE 32 22623 - 23348 775 241 aa, chain - ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 1 235 13 246 248 256 53.0 3e-68 MREAGRILAKTHEELAKALKPGMTTWDIDHLGEEIIRSYGCIPSFKNYNGYPASICVSVN DEVVHGIPSKKHFIEEGDIVSLDAGVIYKGYHSDAARTYGIGEISPEAGRLIEVTKQSFF EGIKFAKAGNHLNDISSAIQTYAESFGYGVVRDLVGHGIGSHLHEDPEVPNFAGRRRGLK LRPGMTLAIEPMINEGTPEVVWLDDDWTVVTEDGALSAHYENTVLITEGEPELLSLVSEA H >gi|229784114|gb|GG667621.1| GENE 33 23391 - 24035 711 214 aa, chain - ## HITS:1 COG:CAC3112 KEGG:ns NR:ns ## COG: CAC3112 COG0563 # Protein_GI_number: 15896362 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Clostridium acetobutylicum # 1 213 1 213 215 280 64.0 2e-75 MKIIMLGAPGAGKGTQAKKIAEKYQIPHISTGDIFRSNIKEGTELGMKAKAYMDQGGLVP DELTIGMLMDRIQKDDCKNGYVLDGFPRTIPQAESLTNALNERNQKIDYAVNVDVPDENI VNRMSGRRACLSCGATYHIVYKPSKVEGICDVCGDKLVLRDDDKPETVKKRLSVYHDQTQ PLIDYYKEAGVLANVDGTQDMEKVFSDIVAVLGA >gi|229784114|gb|GG667621.1| GENE 34 24119 - 25441 832 440 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 10 438 19 437 447 325 38 5e-88 MLKTLRNAFKIKDIRKKLLYTFAMLVVIRIGSQLPIPGVETAFFKDFFAQQNNDAFGFFN AMTGSSFTNMSVFALSITPYITSSIIMQLLTIAIPKLEEMQRDGEDGRKKIAEYTRYVTV ALALIESIAMAIGFGGQGLLSEFNAISVIIAVVTMTAGSALLMWIGERITENGVGNGISI 
VLLFNIISSLPSDASTLYTRFMSGKSVAVSAVAAIIIVAIVIAIVVFVVVLQDGERRIPV QYSKKMQGRKMVGGQSSNIPLKVNTAGVIPVIFASSIMSFPVVIAQFFGSRINYDSIGGH ILLMLNSSNWFKPERPVYSIGMVIYIALIIVFAYFYTSITFNPLEVANNMKKSGGFIPGI RPGKPTSDYLNSILNYVVFIGACGLTIVCIIPIMVSGLFNVSRLSFGGTSLIIIVSVVLE TIKAIESQMLVRYYKGFLND >gi|229784114|gb|GG667621.1| GENE 35 25443 - 25883 638 146 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145859|ref|ZP_04744460.1| 50S ribosomal protein L15 [Roseburia intestinalis L1-82] # 1 145 1 145 146 250 85 2e-65 MELSNLQPALGSKHSDSFRRGRGHGSGNGKTAGKGHKGQKARSGAPRPGFEGGQMPLYRR LPKRGFTNRNSKTIIGINVSALEQFENDTVVTVETLLESGIVKNPRDGVKILGNGELTKK LTVKVDAFSEGAKTKIEALGGTCEVN >gi|229784114|gb|GG667621.1| GENE 36 25909 - 26091 243 60 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227871783|ref|ZP_03990188.1| ribosomal protein L30 [Oribacterium sinus F0268] # 1 60 1 60 60 98 78 1e-19 MADKLKITLVKSPIGAVPKQKATVEALGLKKLHKTVEMPDNGAVRGMIQSVRHLVKVEEI >gi|229784114|gb|GG667621.1| GENE 37 26105 - 26614 673 169 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227871784|ref|ZP_03990189.1| ribosomal protein S5 [Oribacterium sinus F0268] # 1 169 1 169 169 263 78 1e-69 MKHTIIDASQLELNDKVVAIKRVSKTVKGGRTMRFSALVVVGDGNGHVGAGLGKAGEVPE AIRKGKEAAVKNMVTVPVDENSSIPHDLIGKFGSAAVLLKRAPEGTGIIAGGPARAVVEL AGIKNIRTKSLGSNNKTNVVLATIEGLTQLKTPEEVAKLRGKSVEEILG >gi|229784114|gb|GG667621.1| GENE 38 26634 - 27002 549 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623363|ref|ZP_04666394.1| ribosomal protein L18 [Clostridiales bacterium 1_7_47_FAA] # 1 122 1 122 122 216 90 4e-55 MVSKQSRSEVRVKKHTRLRNRFAGTAERPRLAVFRSNNHMYAQIIDDTVGKTLVSASTVE KEVKAELEKTNNVDAAAYVGTVIAKRALEKGIKEVVFDRGGFIYQGKIQALADAAREAGL EF >gi|229784114|gb|GG667621.1| GENE 39 27021 - 27563 785 180 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916280|ref|YP_002929797.1| large subunit ribosomal protein L6 [Eubacterium eligens ATCC 27750] # 7 180 1 174 174 306 86 2e-82 MSRIGRMPIAIPAGVTVTIAENNKVTVKGPKGTLERVLPAEMSIKEEDGQIIVSRPSDLK KMKSLHGLTRTLINNMIVGVTAGYEKKLEINGVGYRAQKQGKKLVLSLGYSHPVEMEDPE 
GLESTMEGQNVIIVKGIDKEKVGQYAAEIRDKRRPEPYKGKGIKYADEVIRRKVGKTGKK >gi|229784114|gb|GG667621.1| GENE 40 27630 - 28031 592 133 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145864|ref|ZP_04744465.1| ribosomal protein S8 [Roseburia intestinalis L1-82] # 1 133 1 133 133 232 86 4e-60 MTMSDPIADMLTRIRNANTAKHDTVDVPSSKMKLAIADILVKEGYIKKYDLVEDGAFQTI RITLKYGKDKNEKIITGIKRISKPGLRVYANKEELPKVLGGLGTAIISTNQGVVTDKEAR ELGVGGEVLAFVW >gi|229784114|gb|GG667621.1| GENE 41 28270 - 28455 310 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168334339|ref|ZP_02692526.1| ribosomal protein S14 [Epulopiscium sp. 'N.t. morphotype B'] # 1 61 1 61 61 124 88 2e-27 MAKTSMKIKQQRPAKFSTREYNRCRICGRPHAYLRKYGICRICFRELAYKGQIPGVKKAS W >gi|229784114|gb|GG667621.1| GENE 42 28471 - 29010 824 179 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145866|ref|ZP_04744467.1| 50S ribosomal protein L5 [Roseburia intestinalis L1-82] # 1 179 1 179 179 322 88 5e-87 MARLKETYQNEIVEAMIKKFGYKNIMEVPKLDKIVINMGVGEAKENAKVLDSAVRDLEII SGQKAVLTKAKKSIANFKLREGMPIGCKVTLRGERMYEFADRLINLALPRVRDFRGVNPN AFDGRGNYALGIKEQLIFPEIEYDKVDKVRGMDIIFVTTAKTDEEARELLTLFNMPFAK >gi|229784114|gb|GG667621.1| GENE 43 29032 - 29337 380 101 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227871791|ref|ZP_03990195.1| ribosomal protein L24 [Oribacterium sinus F0268] # 1 101 1 101 101 150 71 1e-35 MHKIKRDDLVKVIAGKDRDKQGKVLHVDTKNNKVLVEGCNMITKHVKPGPGNPQGGIVQK EAALDISNVMLVVDGKATRVGFEVKDGKKVRVAKATGKAID >gi|229784114|gb|GG667621.1| GENE 44 29351 - 29719 582 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225376419|ref|ZP_03753640.1| hypothetical protein ROSEINA2194_02061 [Roseburia inulinivorans DSM 16841] # 1 122 1 122 122 228 95 5e-59 MIQQETRLKVADNTGAKELLCIRVMGGSTRRYANIGDIIVATVKDATPGGVVKKGDVVKA VVVRTVKGARRKDGSYIKFDENAAVIIKDDKTPRGTRIFGPVARELREKQFMKIVSLAPE VL >gi|229784114|gb|GG667621.1| GENE 45 29746 - 30000 405 84 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881776|ref|YP_001560744.1| ribosomal protein S17 [Clostridium phytofermentans ISDg] # 1 84 2 85 85 160 92 2e-38 
MERNLRKTRTGKVVSNKMDKTIVVAIVDHVKHPLYGKIVKRTYKLKAHDENNECNMGDTV KVMETRPLSKDKRWRLVEIIEKAK >gi|229784114|gb|GG667621.1| GENE 46 30123 - 30338 292 71 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623355|ref|ZP_04666386.1| 30S ribosomal protein S17 [Clostridiales bacterium 1_7_47_FAA] # 1 68 1 68 70 117 88 2e-25 MITVKINKYVEELKAKSVTELNEELVAAKKELFNLRFQNATNQLDNTSRIKEVRKNIARI QTVMTEKAKLA >gi|229784114|gb|GG667621.1| GENE 47 30319 - 30756 666 145 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881778|ref|YP_001560746.1| 50S ribosomal protein L16 [Clostridium phytofermentans ISDg] # 1 145 1 145 145 261 86 1e-68 MLMPKRVKRRKQFRGSMAGKALRGNTISNGEYGLVALEPCWIKSNQIEAARVAMTRYIKR GGKVWIKIFPDKPVTAKPAETRMGSGKGALEYWVAVVKPGRVLFEIAGVPEETAREALRL AMHKLPCRCKIAAKADLEGGDNSEN >gi|229784114|gb|GG667621.1| GENE 48 30756 - 31412 977 218 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238922836|ref|YP_002936349.1| SSU ribosomal protein S3P [Eubacterium rectale ATCC 33656] # 1 218 1 218 218 380 85 1e-105 MGQKVNPHGLRVGVIKDWDSKWYAEADFADNLVEDYKLRTYLKKRLYSAGVSDIEIERAS DRVKIIIHTAKPGVVIGKGGSEIEKLKAEVQKMTDKKLFVDIKEIKRPDKDAQLVAESIA QQLENRVSFRRAMKSTMGRSMKAGVKGIKTAVAGRLGGADMARTEFYSEGTIPLQTLRAD IDYGFAEADTTYGKVGVKVWIYKGEILPTKGNREGSDK >gi|229784114|gb|GG667621.1| GENE 49 31424 - 31810 583 128 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881780|ref|YP_001560748.1| ribosomal protein L22 [Clostridium phytofermentans ISDg] # 1 128 1 128 128 229 88 4e-59 MAKGHRTQIKRERNAQKDTRPSAKLSYARVSVQKACFVLDAIRGKDAQTALGIVTYNPRY ASTLIEKLLKSAIANAENNNGMDPSKLYVEECYANKGPTMKRVKPRAQGRAYRIEKRMSH ITIVLNER >gi|229784114|gb|GG667621.1| GENE 50 31842 - 32123 446 93 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145874|ref|ZP_04744475.1| 30S ribosomal protein S19 [Roseburia intestinalis L1-82] # 1 91 1 91 93 176 92 3e-43 MARSLKKGPFADASLLKKIDAMNTAGQHQVIKTWSRRSTIFPQMVGHTIAVHDGRKHVPV YVTEDMVGHKLGEFVATRTYRGHGKDEKKSGRK >gi|229784114|gb|GG667621.1| GENE 51 32143 - 32991 1304 282 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145875|ref|ZP_04744476.1| 50S 
ribosomal protein L2 [Roseburia intestinalis L1-82] # 1 282 1 281 281 506 85 1e-142 MGIKKYNPYTPSRRHMTGSDFSEITKKTPEKSLISATMNKQAGRNNQGKITVRHRGGGAK RKYRIIDFKRNSKDGIPATVIGIEYDPNRTANIALICYADGTKSYILAPAGLTDGMKIMS GENAEARIGNCLPLGQIPVGTQVHNIELYPGKGGQLVRSAGNSAQLMAKEGKYATLRLPS GEMRMVPIGCRATIGVVGNGDHSLINIGKAGRKRHMGIRPTVRGSVMNPNDHPHGGGEGK TGIGRPGPCTPWGKPALGLKTRKKNKQSNKLIVRRRDGKTIK >gi|229784114|gb|GG667621.1| GENE 52 33116 - 33415 417 99 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145876|ref|ZP_04744477.1| 50S ribosomal protein L23 [Roseburia intestinalis L1-82] # 1 99 1 99 99 165 80 7e-40 MADIKYFDVIQKPIVTEKSMNAMASKKYTFIVHPEANKAMIKEAVEKMFPGTKVASVNTM NLEGKTKRRGMTFGKTAKTKKAIVQLTADSKDIEIFEGL >gi|229784114|gb|GG667621.1| GENE 53 33415 - 34035 832 206 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145877|ref|ZP_04744478.1| ribosomal protein L4/L1e [Roseburia intestinalis L1-82] # 1 206 1 206 206 325 77 5e-88 MANVSVYNMEGNEVGTLELNDAVFGVNVNEHLVHLAVVSQLANKRQGTQKAKTRAEVSGG GRKPWKQKGTGHARQGSTRSPQWTGGGVVFAPTPRDYTIKLNKKERKLALKSALTSRVNE NKFIVVDELKFDEIKTKKFQNVLNNLKVSKALVVVGDDSTNAVKSARNIPAVKTAFVNTI NVYDILKYNTVVATKTAVAAIEEVYA >gi|229784114|gb|GG667621.1| GENE 54 34065 - 34700 943 211 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916265|ref|YP_002929782.1| large subunit ribosomal protein L3 [Eubacterium eligens ATCC 27750] # 1 210 1 210 211 367 86 1e-101 MKKGILATKVGMTQIFNEDGVLTPVTVLLAGPCVVTQVKTAENDGYDAVQVGFVDKRAKL VSKPVKGHFDKAGVSYKRYVREFKFENASEYSVKDEIKADIFAAGDKIDATAISKGKGFQ GAIKRHGQSRGPMAHGSKFHRHAGSNGSCSDPSKVFKGKKMPGQMGNKRVTIQNLEIVKV DAENNLILVKGAVPGPKKSLVTIKETVKAAV >gi|229784114|gb|GG667621.1| GENE 55 34967 - 35284 498 105 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160941071|ref|ZP_02088409.1| hypothetical protein CLOBOL_05964 [Clostridium bolteae ATCC BAA-613] # 1 105 1 105 105 196 95 3e-49 MASQVMRITLKAYDHQLVDQSASKIIDTVKKTGSQVSGPVPLPTKKEVVTILRAVHKYKD SREQFEQRTHKRLIDIITPSQKTVDALSRLEMPAGVYIDIKMKNK >gi|229784114|gb|GG667621.1| GENE 56 35661 - 35840 76 59 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|266620265|ref|ZP_06113200.1| ## NR: gi|266620265|ref|ZP_06113200.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 59 1 59 59 112 100.0 1e-23 MTGMQDWSRKGGSDPVFCLADPAYLPFAGSAPQLAAFPGPRHGQVGCTGSLQLLLIRRN >gi|229784114|gb|GG667621.1| GENE 57 35880 - 37238 1627 452 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 29 255 527 713 744 91 28.0 4e-18 MKRKGFVVLAVAAVLALGTATMSAWAAEGWAQSGNTWVYYDSNGYKVTNVWKKGADNLWR YLNGNGEMAVNTWVDNTYYMDSNGILVTDKWMKFQETGSSEYKWYYFGSSGKAIMDNWSK INNKWYYFDSNGEMQTGWVLDNMYYCGTDGAMRTGWQKLFPPDSDYDPDRMSPGDEGDDG KHWYYFSDSGKKYMPKDTSGDYGTYKIDGVAYCFDSDGALQTGWKNVGVDNADYDIQNYK YYDSSGKLRTGWYSVEPPEDLTGYEDEVEWFYFSTNGTPKAGPKEGEATTQNLTKINGKT YLFNDKGNPVYGLQKVRIGSSTEYTAYYFGDKKTSTMQKGKIKVSEGDGGEETYYFSDSG RGYTGVKDGYLYYMGRLQRAEDGVRYEPITIPAGNSYTTYVVNSSGKVAKNTTVKNADGV KYKTSSSGSLLKVDDENASGSYREPTEPVWKE >gi|229784114|gb|GG667621.1| GENE 58 37496 - 38365 829 289 aa, chain + ## HITS:1 COG:mlr3209 KEGG:ns NR:ns ## COG: mlr3209 COG1092 # Protein_GI_number: 13472800 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferases # Organism: Mesorhizobium loti # 9 284 58 336 338 223 43.0 4e-58 MWIADGWKDYEVIDCSEGEKLERWGNYTLVRPDPQVIWSTPKTEKGWKHRNGHYHRSAKG GGEWEFFDLPQQWSIQYRELTFQLKPFSFKHTGLFPEQAANWDWFGTKIREAGRPVKVLN LFAYTGGATLAAAAAGASVTHVDASKGMVGWAKENARSSGLESAPIRWIVDDCVKFVERE IRRGNHYDGIIMDPPSYGRGPKGEIWKIEESIHPLVKLCAQLLSDDPLFYLINSYTTGLA PAVLTYMLSIEVGKKHGGFVRSEEVGLPVTRTGLVLPCGASGRWSSTPV >gi|229784114|gb|GG667621.1| GENE 59 38348 - 38761 325 137 aa, chain - ## HITS:1 COG:no KEGG:Closa_3771 NR:ns ## KEGG: Closa_3771 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 137 211 345 345 131 51.0 9e-30 
MVEGVWTRKRLLVGIALMIAGCISLGTDLPGSSLGEAASVTVVYRLLVPVGLWLTVPEER LFEARDFMKNNFFLYAVHFALVRLINKTGALLLPPVPVLAFLLFFFMPLFAVVLSWLAGG LIRRLMPAVWRLLNGGR >gi|229784114|gb|GG667621.1| GENE 60 39693 - 40301 593 202 aa, chain - ## HITS:1 COG:no KEGG:Closa_3771 NR:ns ## KEGG: Closa_3771 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 202 1 202 345 243 67.0 3e-63 MAEKTFRNKIYWFTFLFSVLVIWVHSYNAVLFLGNTKSAASLVRLERFFGDRIAQIAVPG FFMISSYLFFRGYRPEILMRKWNSRIRSVLVPYIVWNSLYYFGYVIGSRLPYISDVIGKG KIPFGLPETVDAILNYTYNYVFWYLYQLILLILLAPLIYLAVKRVWPGIAFLAVLLAGVY LGIDLPLLNLDALFYYSFAAFA >gi|229784114|gb|GG667621.1| GENE 61 40529 - 41413 832 294 aa, chain - ## HITS:1 COG:SPy0255 KEGG:ns NR:ns ## COG: SPy0255 COG0395 # Protein_GI_number: 15674435 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 5 294 3 276 276 293 56.0 2e-79 MENKKKTISPYYIVSLILLAVLAVFFLFPLYWIITGSVKEKSDIIIKSGEMVKWFPTTIT WDNYNRLFKSGTKLFGIGMPMAIRWLFNSVFISVVAMGLTCITSSLAGYALAKKRFWGKS VIFGLFVCAMALPKQVILIPLMRQMSAMGFLKSVWGSMFAVIFPVVGWPFGVFMMKQFSE NIPTSLLEACRIDGAGELTTFVNVAFPIIKPGFGALAIFTFINSWNEYPLQLVMLTQNES KTIALGIASKLSEMSNDFGLIMAGAALASVPIIIVFLCFQRYFTQGITMGAVKG >gi|229784114|gb|GG667621.1| GENE 62 41428 - 42384 877 318 aa, chain - ## HITS:1 COG:SPy0254 KEGG:ns NR:ns ## COG: SPy0254 COG1175 # Protein_GI_number: 15674434 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Streptococcus pyogenes M1 GAS # 15 312 18 314 320 320 60.0 2e-87 MANTTVSTQKPGHSKKKKEKMPARSKVLQRRETFVSYMFLLPALIFFVGFVIFPMASGVI TSFTNASFQDPKGTAVGFANYIRLFQDKIFLKSAVNTVIIVVVSVPVVTAFSLWVGTAIY KMRSGIRSFFRCVFYLPVVTGSVAVVVVWKWMFDKYDGLFNYIIRSFGGQPLPWTSGESM ALWCIILILFTTSIGQPIVLYVAALGNVDASMEEAAKVDGANNFQVFWNIKWPSIMPTTL YVAVITTINSFQCFALIQLLTSGGPNYSTSTVMYLLYEIAFKTTKEFGYANAMGVVLAIV IALFSALQFKVMNGGTSE >gi|229784114|gb|GG667621.1| GENE 63 42478 - 43812 
1335 444 aa, chain - ## HITS:1 COG:SPy0252 KEGG:ns NR:ns ## COG: SPy0252 COG1653 # Protein_GI_number: 15674433 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pyogenes M1 GAS # 1 442 3 432 439 333 44.0 5e-91 MKKILSLGLATAMVVGTLAGCGGNKPAETAAAPETTTAAATDAAKAPETTAEGETTNIVW WAFPTFGVDTGYEQEVVDAFTAKNPDINVKVEYIDFTSGPDKLTAALTSGTAPDILFDAP GRIIEFGEAGYLVPLDDMLDELKSDLTSQSLVETCVGADGTAWMYPISSSPFYMGLNKEA LEKADALQYVNLEGDRTWTTENFVKMCEALRDAAPTQVPGIVYCGGQGGDQGTRALVNNL YGASIVGDDGKWNISENGVKALTLLKSMYDSKALDAGFDMAAADELQKFQQETAAMTFCF GTSSEITYASDDFTQLSVPFPSENGKPSLEYLVNGFAIFDNKDEARAEASKKFVKFICDD PEWGPKSVVKTNAFPVRTSFGDLYPGDEHKAMLASWSQYYGPYYNTRAGFAAMRPLWFNM LQQVFNGTDPQAAADEFNTSANSN >gi|229784114|gb|GG667621.1| GENE 64 43989 - 44843 858 284 aa, chain - ## HITS:1 COG:YPO3017 KEGG:ns NR:ns ## COG: YPO3017 COG1737 # Protein_GI_number: 16123196 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 3 279 11 288 292 179 34.0 6e-45 MQGDFLTKIRAEYNQFTKAEKKVADYILQNPREVLFMSITDLAEVCEVGDTSVFRFCKTM NLKGYQEFKMMLSLSLHDGKPGLGQLDGDISQEDSFPEMAQKVLNTNVNALKETFSLLNE DNFDKVIRYLHNAERICFYGVGASMLTAMKAANKFLRIEPKVYCVQDSHMQAMVASMMKE NEVAVVFSYSGATKDTIQVAELARRAGATVICVTRFVKSPLTSFSDVTLLCGANEGPLQG GSTSAEISQLFLIDLMYTEYYRRYFEKCSKNNEKTSSSVIEKLY >gi|229784114|gb|GG667621.1| GENE 65 46423 - 48051 1248 542 aa, chain - ## HITS:1 COG:VC2469 KEGG:ns NR:ns ## COG: VC2469 COG0029 # Protein_GI_number: 15642465 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Vibrio cholerae # 1 530 16 546 550 121 24.0 3e-27 MRLKNEITCDVLVMGAGIAGIMAAISAAEAGADVCITSGTCICSGSSFYPGTWGLGLVGP ESEADEEDLESVILKVGEGMADPELVSVLVSHIHRGTERLKAFGIQLKEAEQKGEKEFIP CFDYKNRDWHGIVKDSARKVFQARLEELGVRSFPGTRILQFIMTEGRVTGAVAIAEKKGL TAFSCKSLVVASGGLGGLFQYRLNTDDVTGLGQYLALRSGAKLVNIEFMQMMPGYLSPAP KTIYNEKVFKYSNFCRQDTGRSVFEDWRKEDLEERMNVRSTHGPFTSRLLSAPVDIRLFT 
EFLRNPKGVTVTYQDSLKQNQPEFVRTYFEWLKEEKHLTIDDPIQLGIFAHASNGGIKIG PDGSTGVPGLYACGEATGGMHGADRLGGLSTANGLVFGEIAGKSAAVFSGISRPGPEWSD VSLSVIPDAGTCLKQIQEMNFECAMVIRNGARAELALNKLERLKVFCRREAKEFREGDDV AGLDASCRLTAALYLSECLMKAVRLRKESRGSHYRWDYPEKKEKFSKRILSVWDDGVKTG WE >gi|229784114|gb|GG667621.1| GENE 66 48023 - 49075 879 350 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 8 338 6 329 329 343 52 2e-93 MADKSKEKNKEALLTIKGLKTYFPVKGGFMGKTQNYVQAVNDIDLTIYKGETLGLVGESG CGKTTLGKTILQLYKPTEGEMIYEFENGPRNLEKLTSAEMDEARKKIQIVFQDPQSSLNP SFTIYQSLSDPLKKFGVRTKEERRKIIGDLLEAVNMRREYMDRFPHEFSGGQRQRIGIAR ALCINPELVICDEAVSALDVSIQAQVLNLLKKLKEERNLTYIFITHDLSVVEYISDRIAV MYLGRIVELADADELYRNTIHPYTQALLSAIPVADIHHKKKRIILEGDVPSPMNPPSGCH FHGRCSQCREICKNQRPELKKYEIDGREHYVACHFAEENMKHAAEKRNNM >gi|229784114|gb|GG667621.1| GENE 67 49065 - 50084 592 339 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 24 333 36 329 329 232 39 4e-60 MSDNVVISVKNLKTYFYTNERCNRAINGVSMDIKSGKTLCIVGESGCGKSVTATSIMQLL PKLSRIEEGEIIYHSGDKRGDIKIHELERNGKEMRDIRGKDIAMIFQDPMTALNPVYTIG WQIEEMIRSHNKNVTKKEARETALKLLTDMGIPNPQKRIDQYPHEFSGGMRQRAMIAMAM SCNPKVLIADEPTTALDVTIQAQIFELMDHLKKNYNTAIMLITHDMGVVCELADDVAVMY MGNIIESGSAEEVLDHPTHPYTRALLKSIPVLGRGAAQDINPIKGSTPDPYNRPAGCQFA PRCDYACPECDKCMPEEVQLGGTHMVRCARYQEVTNDGR >gi|229784114|gb|GG667621.1| GENE 68 50095 - 51024 997 309 aa, chain - ## HITS:1 COG:BS_appC KEGG:ns NR:ns ## COG: BS_appC COG1173 # Protein_GI_number: 16078205 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 38 305 32 299 303 224 40.0 1e-58 MSIKSKQLHLNRKFDQIRQMEEAGELKNRKSGNRALRKLLNNRLAIVGGVIFLIILLACI FAPLLTGYSPKAVDMKSILKPPSGEHWFGTDKIGRDVFARVLYGGRMSIAIGFGSALGCA FIGVLLGCYSGYRGGWFDRLMVRISEVFMSIPQLILVMMLVAIMGQSAKNIVIIFIICGW 
GSCFRMARSQVLSIREEEYVQSLKVFGLNPFIICYKHILPNALSPIIVNVTLSTAMFILE EASLSFLGLGVPLEVPTWGNILNAAQDMFTLQNNWWMWLPTGIVVSLFVMSINFMGDGIR DTTDPTQQG >gi|229784114|gb|GG667621.1| GENE 69 51037 - 51999 1144 320 aa, chain - ## HITS:1 COG:BS_appB KEGG:ns NR:ns ## COG: BS_appB COG0601 # Protein_GI_number: 16078204 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 1 317 1 316 317 222 39.0 7e-58 MAKYTIKKILGMIPMLLIITFIIYWLLDLTPGDPVSYLMDPEALARLTAEQVAALRAQYG LDDPFFIRYFKWLGRLLTGDFGYSSSSGVPVIQIMAERLPATLELAVAALCISTVLGSIL GVLGAVYKGTIGDNVLTVAGMIGVSIPQFFFGVCAILIFALNLGWLPVGGRTDPTVVNFA GRIKYLVMPALVLGISMTAGVMRYARSSMLDTMNKDYIKTARSKGLSEFRVNLVHGFRVS LTPVIVLVGFRLPTLIAGSVVIETVFQWPGIGSAFNTAVRSQNYNLVMMLALLFVFMTLF ASTLVDILTAALDPRIKLER >gi|229784114|gb|GG667621.1| GENE 70 51944 - 52054 57 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MINSMGIIPRIFLIVYLAILSPFLKVKGLYLIHTIP >gi|229784114|gb|GG667621.1| GENE 71 52090 - 53796 2076 568 aa, chain - ## HITS:1 COG:SA0850 KEGG:ns NR:ns ## COG: SA0850 COG0747 # Protein_GI_number: 15926580 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Staphylococcus aureus N315 # 118 559 105 566 571 116 26.0 1e-25 MKLKKIFSLGLAAAMTASLLAGCGSGNPASSTTAATTGGQAETTAAPAEPAAAGDSGVLY SNGGPTEFFETPWLNPGTYMYNKTLYAHLIFADENLAPVSGEGDLAKSYEMSDDGMTLAF ELRDNIFWHDGEAITADDIQWSIEYALKTTVLNSVFRSTFEALKGAAAYEDGSAEHIEGI AVDGNKITLTFEKVAPDALLTFTQFAPLPKKYLADTDPISLQQAKYFQSPIGSGPFMVDE VQMNNYTTLKPFDKYYGGTANFTIHLNPSAGDSDANFVTNAKAGKLDYAYTKNIADVKAL EGTPGLTIDTVNVRYTRLLYLNKFDKKDGKKAPLADERVRQAIAYALDMKSILDGVFEGA ALPANSLTPDGADKVDGLNNYDYNPEKAKELLKEANWDPNTELDVVYYYTDQMTQDLMAI IQQYLAEVGIKMNARLVEGDLATILWKAPEDPVNGPSAVDWDMCYAANAALSLHEYYDRY QTGYSINSHTPSDPKLDELIAATNSSVDAEVQRKAFFELQKYENETLFELPLYYQPIFLL QSDKIVKGAKLGNPQFNYNWDIQNWEIK >gi|229784114|gb|GG667621.1| GENE 72 54037 - 54930 281 297 
aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 2 286 4 316 319 112 30 4e-24 MRIAALDIGGTSIKSGIWNGQDMVGVKEHATNAKNGGRYVMERSVEILRQYDDFEAIGIS TAGQVNSAEGCIRYANENIPGYTGMKVREIMEREFHVPAAVENDVNAAAIGEGQFGAGRA FKDFLCITYGTGVGGAIVMNKQIYTGNDGSAGEFGGIMIHPEDSVYGEPFCGCYEKYAST TALVRKAMAYNRDLDNGRKIFARLDDPKVRDIVNSWIDEIVYGLITVIHIFNPACIVLGG GVMAQPYILNEVKQKAAARIMSSFRNAELCQAQLGNRAGLMGAAYLAGIRLDQEAQV >gi|229784114|gb|GG667621.1| GENE 73 55008 - 55691 776 227 aa, chain - ## HITS:1 COG:BB0644 KEGG:ns NR:ns ## COG: BB0644 COG3010 # Protein_GI_number: 15594989 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Borrelia burgdorferi # 5 226 3 227 232 235 53.0 5e-62 MDYSVLERIRGGLIVSCQALEEEPLHSSYIMSRMAYAAKEGGAVGIRANTVADIVEIKKT VDLPVIGIIKKVYGDCDVYITPTMKEVDALVECGVDIIATDCTDRIRPDGKTMDEFFAEV RAKYPDQIFMADCSCYEEGLHAAQIGLDLVGTTMNGYTEYTKGAVLPDFDLMKRLVETCG KPVIAEGGIWSPEQLKAALDTGVLAAVVGTAITRPRDITRHFVAAIQ >gi|229784114|gb|GG667621.1| GENE 74 55652 - 55783 83 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870182|ref|ZP_06409658.1| ## NR: gi|288870182|ref|ZP_06409658.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 43 21 63 63 67 100.0 3e-10 MFVLKIFFIRAMIKTVKNPKISKKYKRGYLIWIIQYWNESEVV >gi|229784114|gb|GG667621.1| GENE 75 55890 - 56807 952 305 aa, chain + ## HITS:1 COG:SP1676 KEGG:ns NR:ns ## COG: SP1676 COG0329 # Protein_GI_number: 15901511 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Streptococcus pneumoniae TIGR4 # 5 290 3 291 305 258 46.0 9e-69 MSKYDINKFKHVTVALNTPFDKNGDVDLEATKAVVNYFCSKGIKQMYVCGSTGEGFLLDS EERKAVTETVVKEAAGRMNIIVHVGCPDTRHSAELAKHAEAAGADATSAVPCVYYHMSEE SVYQHWTKITEAADLPFFIYNIPQLTGFNLSMGLFGRMLENERVAGVKCSSDPAQDILRF 
KLAGGRDFIVFNGPDEQFIAGRLMGADAGIGGTYGAMPELYLKMDELVNRGEWEKAREVQ NLVTPLIYRLCSFPSMHGAVKGIISLDGCPMGDPRLPFLPVALDDPKLVELYRDIRKAVE DTKDL >gi|229784114|gb|GG667621.1| GENE 76 56882 - 57520 680 212 aa, chain + ## HITS:1 COG:SMa2002 KEGG:ns NR:ns ## COG: SMa2002 COG2755 # Protein_GI_number: 16263550 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Sinorhizobium meliloti # 3 209 7 212 220 139 38.0 4e-33 MKKHIVCYGDSNTHGYCAADDGRFNENERWPRLLEKKLGDDYLVFEEGLSGRTTCFDDPI HEGLSGLDYIYPCLMSHEPVDLLIIMLGTNDTKERFGASAACIALGLKRLIAKAISTTDC WRDKRPNILVVVPKNIEKEYETTAVGATMGRGCAEKSEGLSSEYKKIAELMGCYFFDANT VVKENNKIDYMHLTKEGHEALASALASLVRKY >gi|229784114|gb|GG667621.1| GENE 77 57670 - 58992 1219 440 aa, chain + ## HITS:1 COG:alr1615_2 KEGG:ns NR:ns ## COG: alr1615_2 COG1404 # Protein_GI_number: 17229107 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. PCC 7120 # 141 426 43 327 416 178 39.0 2e-44 MKKRMCAVSLSFALAITAYAVNPWIPASGSMSGSASDVTASGKTAQEDMLSISQSGQPLQ AFALGPGVVNLSPADQFSTYQWGLKNDGEFQLVEIKAKFQAADNIYSRGKGGFGLPGLGP GDFESTVTDAVTGVDINIRPAWDLYDQSENKRSVVVAIIDTGIDYNHQEFSNAMWTNPGE IPGDGIDNDGNGYVDDIHGWNFYSGNSQIFVGEEDSHGTHAAGTIAAVRGSYGIAGITDN SYVKIMSLKALGGSDGTGSPDSVIEAIHYAEANGASICNLSMGTSSYNEALAQTMQNSKM LFIVACGNGGFAGKGYDTDLYPVYPASFPFDNVISVANLLFDGNLSRDSNYGAASVDIAA PGTYIVSTLPGNDYGFMSGTSMAAPMVTGVAAMLYSYRPELSLQDVKNILLNSSRKSDQL SGKMVSGGILDAYAALSYQQ >gi|229784114|gb|GG667621.1| GENE 78 59047 - 59613 463 188 aa, chain - ## HITS:1 COG:L170990 KEGG:ns NR:ns ## COG: L170990 COG0454 # Protein_GI_number: 15672558 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Lactococcus lactis # 41 180 7 146 147 143 52.0 2e-34 MKWGKPNMIEERTGKKREMQGKNGKKNSQKRDDPYNFVILEKRSSELIACLTGIWRRSVK ETHLFLNDQEIRKIEAYVPEALLKVPVLIIAVNAENEPAAFMGADGGRLEMLFIEPEERG 
KGLGGRLLAYGIEKLGIRELCVNEQNPRAKGFYEHMGFTCYKRTDTDEQGNPYPLLYMRL SERKDVPV >gi|229784114|gb|GG667621.1| GENE 79 59794 - 59970 210 58 aa, chain - ## HITS:1 COG:no KEGG:Closa_3778 NR:ns ## KEGG: Closa_3778 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 57 500 556 557 105 78.0 7e-22 MNKGDTIVATRRRKIWQVFRDIDKKKYTLEATDFAKPIVGINKWSDIIPNFRWRAMEQ >gi|229784114|gb|GG667621.1| GENE 80 59957 - 61501 1742 514 aa, chain - ## HITS:1 COG:no KEGG:Closa_3778 NR:ns ## KEGG: Closa_3778 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 505 1 494 557 739 74.0 0 MDSKLIRKGAAGILITVLLCFTGVFASFASTEETADAAEAETQGPKVESSITSCLINPDK KSVTVKVTSSGDMTGTDGVLYLVEQKPYQKDLEGRLDYAGAAAFGTEVTYQFPLNKGTAD DRLYSRFVAAVWDGTKYIEVSEPHYITNPEAVASNTAEFNDPLTKKGLNIELNMLADAFE LGVKHVGTNIAFHQILGQGIDYQYDGKTYHFDKRVIEDYDRTISALSGKGMTVTAIILNG WNDATPDLIYPGTKKSSNAFYYLFNAATEAGFEQTKAIASFLAERYDGSNPDQGKISNWI IGNEINNQQWNYTGAWDLNSYVQAYQDAFRVFYTAIKSVSANDRVYFSLDYNWNNEIDNK LKYGGKNIVDTFNSIAAVQGQMDWGLSYHPYPAPMTEPEFWDDAQTGWITSDFNSPVINF ANLNVLTDYMAQSAFRSPTGDVRHIILTEEGFTAKSPTRGDIPQIQAAAFAYSYYLVDSN PYIDAYILSRQVDAPLEVRSGLAFGRVGMRYEQG >gi|229784114|gb|GG667621.1| GENE 81 61626 - 62432 617 268 aa, chain - ## HITS:1 COG:mll6661 KEGG:ns NR:ns ## COG: mll6661 COG2362 # Protein_GI_number: 13475561 # Func_class: E Amino acid transport and metabolism # Function: D-aminopeptidase # Organism: Mesorhizobium loti # 1 267 1 265 265 175 38.0 9e-44 MKLFISADIEGCAGLSFTEETHKNEAVYQRYAEEMTREVVLVCEAAHAAGADTIVVKDGH GDASNINPLAMPEYVTLIRGKSGHPYNMMFGLDETFDGVFFLGYHAAAGSPEFAVSHTST GNSQYIRLNGRYMSEFMLNTCTAYSLGIPVLFLSGDEAVCRDARELAPHIAAAGIKHGTG GATFCVPQSVVDRRLREAAGTAVENLKGSLIPAVALPDEFIYEVTYKDWKKAYQMSFYPG MERVDRFTNCLKTEHWMEVVTAHSFVVY >gi|229784114|gb|GG667621.1| GENE 82 62610 - 63764 1397 384 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 384 14 407 407 542 69 1e-153 
NIGTIGHVDHGKTTLTAAITKTLHERLGTGEAVAFENIDKAPEERERGITISTAHVEYET EKRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGVMAQTREHILLSRQVGVPYIV VFMNKCDMVDDAELLELVDMEIRELLNEYEFPGDDTPIIQGSALKALEDPTSEWGDKILE LMNAVDEWVPDPVRETDKPFLMPVEDVFSITGRGTVATGRVERGTLHVSDEVEIVGIHEE TRKVVVTGIEMFRKLLDEAQAGDNIGALLRGVQRTEIERGQCLVKPGSVKCHKKFTAQVY VLTKDEGGRHTPFFNNYRPQFYFRTTDVTGVCDLPEGVEMCMPGDNVEMTVELIHPVAME QGLRFAIREGGRTVGSGKVATIIE >gi|229784114|gb|GG667621.1| GENE 83 64913 - 67033 1816 706 aa, chain - ## HITS:1 COG:CAC3138 KEGG:ns NR:ns ## COG: CAC3138 COG0480 # Protein_GI_number: 15896387 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 3 703 2 686 687 917 64.0 0 MAGREYPLERTRNIGIMAHIDAGKTTLTERILYYTGVNYKIGDTHEGTATMDWMEQEQER GITITSAATTCHWTLQEHCKVKPGALEHRINIIDTPGHVDFTVEVERSLRVLDGAVGVFC AKGGVEPQSENVWRQADTYNVPRMAFINKMDIMGANFYGAVDQIRTRLGKNAICLQLPIG AEDEFKGIIDLFEMQAYIYNDEKGEDITITEIPEDMQVDAELYHTELVEKICELDDDLMM QYLEGEEPSVEELKKALRKATCECTAIPVCCGTAYRNKGVQKLLDAIVEFMPAPTDIPSI KGVDMDGNEIERHSSDDEPFSALAFKIMTDPFVGKLAFFRVYSGSMNSGSYVLNATKGKK ERVGRILQMHANKRQELDKVYAGDIAAAVGFKTTTTGDTICDDQHPVILESMEFPEPVID IAIEPKTKAGQGKMGEALAKLAEEDPTFRAKTNQETGQVIISGMGELHLEIIVDRLLREF NVEANVGAPQVAYKETFTKAVDVDSKYAKQSGGRGQYGHCKVHFEPMDANAEELFKFDSS VVGGAIPKEYIPAVGEGIEEASKCGILGGFPVLGVHATVYDGSYHEVDSSEMAFHIAGSM AFKDAMHKAGAILLEPIMKVEVTMPEEYMGDVIGDINSRRGRIEGMDDLGGGKIVRAFVP LAEMFGYSTDLRSKTQGRGNYSMFFEKYEPVPKSVQEKVLAEKNGK >gi|229784114|gb|GG667621.1| GENE 84 67049 - 67519 733 156 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240146972|ref|ZP_04745573.1| 30S ribosomal protein S7 [Roseburia intestinalis L1-82] # 1 156 1 156 156 286 91 2e-76 MPRKGHTQKRDVLADPLYNNKVVTKLINNIMLDGKKGVAQKIVYGAFGRVEEKTGKPAVE VFEEAMNNIMPVLEVKAKRIGGATYQVPIEVKAERRQALALRWITLYSRKRGEKTQEERL ANEIMDAANNTGASVKKKEDMHKMAEANKAFAHYRF >gi|229784114|gb|GG667621.1| GENE 85 67708 - 68127 677 139 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240146973|ref|ZP_04745574.1| 30S ribosomal protein S12 
[Roseburia intestinalis L1-82] # 1 139 1 139 139 265 94 5e-70 MPTFNQLVRKGRQTSVKKSTAPALQKGYNSLQKRATDVSAPQKRGVCTAVKTATPKKPNS ALRKIARVRLSNGIEVTSYIPGEGHNLQEHSVVLIRGGRVKDLPGTRYHIVRGTLDTAGV ANRKQARSKYGAKRPKDKK >gi|229784114|gb|GG667621.1| GENE 86 68269 - 68460 88 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLEPGWSDNPLITSYFFCSSIDSITFLIVMSSFLDVYLEDYTILGTFSLAVYKDIIINGS YDF >gi|229784114|gb|GG667621.1| GENE 87 68399 - 68762 284 121 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c31760 NR:ns ## KEGG: CLJU_c31760 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 1 120 1 119 187 128 53.0 6e-29 MDEQKKYEVIKGLSDHPGSSKERAALTLGYTVRHINRMLAGYQKSGKEYFSHGNKGRKPA NTIKDETRSVIVDLYRSKYYDANFTHYTELLEKHECISVSPSSVSKILEEEYILSPKVTR A Prediction of potential genes in microbial genomes Time: Thu Jun 30 23:59:01 2011 Seq name: gi|229784113|gb|GG667622.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld15, whole genome shotgun sequence Length of sequence - 82792 bp Number of predicted genes - 82, with homology - 80 Number of transcription units - 36, operones - 20 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 8.1 1 1 Op 1 . + CDS 178 - 738 531 ## COG2207 AraC-type DNA-binding domain-containing proteins 2 1 Op 2 . + CDS 817 - 1146 250 ## COG3323 Uncharacterized protein conserved in bacteria + Term 1202 - 1266 24.4 - Term 1198 - 1245 14.9 3 2 Tu 1 . - CDS 1273 - 2613 1172 ## COG0534 Na+-driven multidrug efflux pump - Prom 2794 - 2853 8.5 4 3 Op 1 3/0.000 + CDS 2989 - 4206 1461 ## COG5441 Uncharacterized conserved protein 5 3 Op 2 . + CDS 4275 - 5105 832 ## COG5564 Predicted TIM-barrel enzyme, possibly a dioxygenase 6 3 Op 3 . + CDS 5175 - 6587 1346 ## COG0471 Di- and tricarboxylate transporters 7 3 Op 4 3/0.000 + CDS 6597 - 6842 344 ## COG0492 Thioredoxin reductase 8 3 Op 5 . + CDS 7765 - 8463 294 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 9 3 Op 6 . 
+ CDS 8485 - 10359 1698 ## Dhaf_3734 arylsulfotransferase + Term 10378 - 10432 9.1 + Prom 10494 - 10553 4.8 10 4 Tu 1 . + CDS 10723 - 11229 542 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family - Term 12333 - 12374 8.0 11 5 Tu 1 . - CDS 12401 - 12715 377 ## gi|266620309|ref|ZP_06113244.1| conserved hypothetical protein 12 6 Tu 1 . - CDS 13659 - 14627 575 ## Closa_2955 hypothetical protein - Prom 14704 - 14763 4.8 + Prom 14663 - 14722 3.7 13 7 Op 1 . + CDS 14758 - 15804 1093 ## COG3854 Uncharacterized protein conserved in bacteria 14 7 Op 2 . + CDS 15797 - 16327 584 ## Closa_3251 Sporulation stage III protein AB 15 7 Op 3 . + CDS 16403 - 16597 239 ## Closa_3250 stage III sporulation protein AC 16 7 Op 4 . + CDS 16611 - 16997 531 ## Closa_3249 stage III sporulation protein AD 17 7 Op 5 . + CDS 16994 - 18238 1312 ## Closa_3248 Sporulation stage III protein AE 18 7 Op 6 . + CDS 18244 - 18990 664 ## Closa_3247 Sporulation stage III protein AF 19 7 Op 7 . + CDS 18965 - 19567 523 ## Closa_3246 hypothetical protein 20 7 Op 8 . + CDS 19584 - 20450 1062 ## Closa_3245 hypothetical protein + Term 20473 - 20530 18.1 + Prom 20575 - 20634 5.8 21 8 Op 1 11/0.000 + CDS 20821 - 21807 930 ## COG4606 ABC-type enterochelin transport system, permease component 22 8 Op 2 10/0.000 + CDS 21804 - 22796 1084 ## COG4605 ABC-type enterochelin transport system, permease component 23 8 Op 3 5/0.000 + CDS 22829 - 23581 203 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 + Prom 23586 - 23645 4.2 24 8 Op 4 . + CDS 23668 - 24720 1349 ## COG4607 ABC-type enterochelin transport system, periplasmic component 25 8 Op 5 . + CDS 24808 - 25194 586 ## COG1302 Uncharacterized protein conserved in bacteria + Term 25216 - 25260 7.2 - Term 25203 - 25247 10.2 26 9 Tu 1 . - CDS 25266 - 25757 295 ## COG0583 Transcriptional regulator 27 10 Tu 1 . 
- CDS 26690 - 26998 188 ## COG0583 Transcriptional regulator - Prom 27028 - 27087 4.2 + Prom 27029 - 27088 3.7 28 11 Op 1 . + CDS 27158 - 27733 563 ## COG0163 3-polyprenyl-4-hydroxybenzoate decarboxylase 29 11 Op 2 . + CDS 27726 - 28112 481 ## SGGBAA2069_c16390 hypothetical protein 30 11 Op 3 . + CDS 28105 - 29574 1454 ## COG0043 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases 31 12 Op 1 . + CDS 29689 - 29862 105 ## + Prom 29864 - 29923 2.7 32 12 Op 2 . + CDS 29945 - 30784 747 ## Dhaf_3107 hypothetical protein + Term 30820 - 30861 4.3 - Term 30636 - 30689 1.3 33 13 Op 1 . - CDS 30828 - 30935 66 ## 34 13 Op 2 . - CDS 30880 - 31383 321 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 35 13 Op 3 . - CDS 31394 - 31963 551 ## gi|266620332|ref|ZP_06113267.1| conserved hypothetical protein - Prom 32100 - 32159 4.7 + Prom 32059 - 32118 5.4 36 14 Op 1 15/0.000 + CDS 32167 - 33057 1078 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 37 14 Op 2 . + CDS 33054 - 33272 329 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 38 15 Op 1 11/0.000 + CDS 34239 - 34424 168 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 39 15 Op 2 1/0.222 + CDS 34408 - 36249 1889 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 40 16 Op 1 . + CDS 37175 - 37588 489 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 41 16 Op 2 . + CDS 37595 - 39535 2061 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family + Term 39550 - 39609 25.9 - Term 39537 - 39597 26.1 42 17 Tu 1 . - CDS 39622 - 39870 362 ## gi|288870201|ref|ZP_06113275.2| putative transposase - Prom 39897 - 39956 9.4 43 18 Tu 1 . - CDS 40858 - 41547 571 ## CLJU_c03680 hypothetical protein - Prom 41747 - 41806 6.6 + Prom 41803 - 41862 3.9 44 19 Op 1 . 
+ CDS 41896 - 42546 461 ## gi|266620343|ref|ZP_06113278.1| hypothetical protein CLOSTHATH_01426 45 19 Op 2 . + CDS 42513 - 43043 520 ## Moth_2125 CdaR family transcriptional regulator + Term 43084 - 43139 9.3 + Prom 43093 - 43152 2.1 46 20 Tu 1 . + CDS 43361 - 43555 205 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 43568 - 43624 11.2 - Term 43556 - 43612 14.4 47 21 Op 1 . - CDS 43646 - 44506 739 ## BC1003_1494 peptidoglycan-binding lysin domain-containing protein 48 21 Op 2 40/0.000 - CDS 44539 - 45879 770 ## COG0642 Signal transduction histidine kinase 49 21 Op 3 . - CDS 45872 - 46561 740 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 46679 - 46738 6.3 + Prom 46632 - 46691 5.6 50 22 Op 1 . + CDS 46727 - 49024 2307 ## Clole_2648 extracellular solute-binding protein family 1 51 22 Op 2 . + CDS 49014 - 49307 221 ## gi|266620350|ref|ZP_06113285.1| putative ABC transporter, permease protein 52 23 Op 1 38/0.000 + CDS 50280 - 50798 507 ## COG1175 ABC-type sugar transport systems, permease components 53 23 Op 2 . + CDS 50841 - 51716 827 ## COG0395 ABC-type sugar transport system, permease component 54 23 Op 3 . + CDS 51721 - 52395 749 ## gi|266620353|ref|ZP_06113288.1| hypothetical protein CLOSTHATH_01436 55 24 Op 1 . + CDS 53329 - 53820 533 ## gi|266620354|ref|ZP_06113289.1| conserved hypothetical protein 56 24 Op 2 . + CDS 53828 - 54919 632 ## gi|266620355|ref|ZP_06113290.1| conserved hypothetical protein 57 25 Tu 1 . - CDS 54864 - 55625 558 ## COG0534 Na+-driven multidrug efflux pump + Prom 56568 - 56627 23.0 58 26 Op 1 . + CDS 56688 - 57521 1132 ## COG0789 Predicted transcriptional regulators 59 26 Op 2 . + CDS 57606 - 57821 112 ## Bcell_0385 xylan 1,4-beta-xylosidase (EC:3.2.1.37) 60 27 Tu 1 . + CDS 58778 - 60364 1427 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 60447 - 60484 3.1 + Prom 60378 - 60437 4.5 61 28 Op 1 . 
+ CDS 60532 - 61317 995 ## COG1712 Predicted dinucleotide-utilizing enzyme 62 28 Op 2 . + CDS 61331 - 62236 1086 ## COG1052 Lactate dehydrogenase and related dehydrogenases + Prom 62740 - 62799 3.9 63 29 Op 1 1/0.222 + CDS 62828 - 63271 690 ## COG0781 Transcription termination factor 64 29 Op 2 . + CDS 63278 - 64555 1306 ## COG1570 Exonuclease VII, large subunit 65 29 Op 3 . + CDS 64572 - 64784 310 ## gi|266620366|ref|ZP_06113301.1| exodeoxyribonuclease VII, small subunit 66 29 Op 4 13/0.000 + CDS 64774 - 65661 1157 ## COG0142 Geranylgeranyl pyrophosphate synthase 67 29 Op 5 . + CDS 65678 - 66616 986 ## COG1154 Deoxyxylulose-5-phosphate synthase 68 29 Op 6 6/0.000 + CDS 66699 - 67595 1067 ## COG1154 Deoxyxylulose-5-phosphate synthase 69 29 Op 7 5/0.000 + CDS 67592 - 68407 1011 ## COG1189 Predicted rRNA methylase 70 29 Op 8 1/0.222 + CDS 68438 - 69271 774 ## COG0061 Predicted sugar kinase 71 29 Op 9 . + CDS 69283 - 69516 389 ## COG1438 Arginine repressor 72 30 Op 1 . + CDS 70464 - 70661 215 ## Closa_3233 arginine repressor, ArgR 73 30 Op 2 . + CDS 70674 - 72338 1978 ## COG0497 ATPase involved in DNA repair 74 31 Tu 1 . - CDS 72463 - 73254 837 ## COG1414 Transcriptional regulator - Prom 73374 - 73433 10.9 + Prom 73390 - 73449 9.9 75 32 Tu 1 . + CDS 73533 - 73847 457 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Prom 74690 - 74749 80.4 76 33 Tu 1 . + CDS 74827 - 75297 400 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 75354 - 75424 24.4 - Term 75333 - 75372 0.4 77 34 Tu 1 . - CDS 75381 - 76262 640 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 76344 - 76403 6.7 + Prom 76332 - 76391 7.1 78 35 Op 1 . + CDS 76525 - 78699 1862 ## COG1472 Beta-glucosidase-related glycosidases 79 35 Op 2 7/0.000 + CDS 78718 - 79632 988 ## COG4209 ABC-type polysaccharide transport system, permease component 80 35 Op 3 . + CDS 79644 - 80510 883 ## COG0395 ABC-type sugar transport system, permease component 81 35 Op 4 . 
+ CDS 80540 - 81151 731 ## Pjdr2_5258 extracellular solute-binding protein family 1 82 36 Tu 1 . + CDS 82083 - 82791 768 ## Pjdr2_5258 extracellular solute-binding protein family 1 Predicted protein(s) >gi|229784113|gb|GG667622.1| GENE 1 178 - 738 531 186 aa, chain + ## HITS:1 COG:lin0157 KEGG:ns NR:ns ## COG: lin0157 COG2207 # Protein_GI_number: 16799234 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 79 168 182 271 277 81 41.0 7e-16 MIYAGPANTPIDMQYLYQNTKCQGYIGGSTFDRIPTERAILNTTKAFKSYGSFDEKDPMS KLLNGNWNPGDYVEFVKKYIEEHYMKEIQLRDLAVVAHVSGSYLSVKFKKEVGCSFTEYL VRFRMNKAKELFEQKKMSCKEVAAMVGYEDYAQFSRMFKKYTGIPPVEYAGKCRNDGNTV EDKREV >gi|229784113|gb|GG667622.1| GENE 2 817 - 1146 250 109 aa, chain + ## HITS:1 COG:BH2593 KEGG:ns NR:ns ## COG: BH2593 COG3323 # Protein_GI_number: 15615156 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 8 108 6 106 107 114 51.0 6e-26 MNFDIKEVKIEIYIPEEYIEEIRNALTLAGACRAGNYDHVLSYQSTKGFWRPLENSHPFH GEKGTICCGTEARVEVRCPVEFLEEAVKTVRRIHPYEEPVINVIPLLDV >gi|229784113|gb|GG667622.1| GENE 3 1273 - 2613 1172 446 aa, chain - ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 7 439 12 445 457 164 27.0 3e-40 MDFLNGTIKSIYLKYLSAAFGSAMITSIYSIVDTAMVGQYQGPSGTAALAVVAPLWNIIY SLGLLMGIGGSVIFSTKRGLDNGRSGNENEYFTVSVLGSIALSLLAWMGLLFLDQPVLLF FGADSTLLPLAQSYMLPIKFVFPLFLFNQMLAAFLRNDKNPGLATIAVLSGGIFNIAGDY MFVFAFDMGIFGAGLATAVGCAISFLVMMTHFFSRKNTLLLVRPSNIWNKLKEISVTGFS TFFIDVAMGILTILFNRQIMKYLGTNALAIYGPIINVSTFVQCCAYSVGQAAQPIISTNY GAGQGMRIKETLRYALYTTAFFSVFWTALSFAFPNLYIRIFMNPTAEILEMAPGIIRSYA LSFLLLPLNIFSTYYFQAILKPMAAFIVSVSRGFFISGIMIMVLPVIAGADFLWFAMPIT ELLVMFYTAGAMRKYTKALIVNSKAT >gi|229784113|gb|GG667622.1| GENE 4 2989 - 4206 1461 405 aa, chain + ## HITS:1 COG:YPO3839 KEGG:ns NR:ns ## COG: 
YPO3839 COG5441 # Protein_GI_number: 16123974 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Yersinia pestis # 2 405 3 402 405 309 45.0 6e-84 MKTIAIAGTFDTKGKEYLYVKGLIEELGMHAFTIHTGVFEPEFPPDVSNAEVAAAAGYDI HAIVERKDRAVATEALSKGMEILVPQLYREGRFDGILSFGGSGGTSMVTPAMRALPIGVP KVMVSTMASGNVEQYVGTSDLVMMPSIVDVAGLNSISKVIFRNAVLAVGGMVGLKEELKP DTGDHKPLVAATMFGVTTPCVGYAKEYLESQGYEVLVFHATGTGGRTMEALTEAGFFKGV LDLTTTEWCDEVVGGVLAAGQRRCEAAARSKIPQVVSVGAMDMVNFGPFDTVPKKFAGRN LYKHNPTVTLMRTTADENREVGEKLAEKLNMASGKTVLLLPLKGVSAIDAEGQPFYGPEE DQVLFQTLRDEVNRDSVEIMELDYNINDQAFALCAAEKLMELMER >gi|229784113|gb|GG667622.1| GENE 5 4275 - 5105 832 276 aa, chain + ## HITS:1 COG:YPO3838 KEGG:ns NR:ns ## COG: YPO3838 COG5564 # Protein_GI_number: 16123973 # Func_class: R General function prediction only # Function: Predicted TIM-barrel enzyme, possibly a dioxygenase # Organism: Yersinia pestis # 1 276 5 280 280 345 61.0 4e-95 MNRMTRSEIMEKFHREVEEGRILVGVGAGTGITAKCSEKGGADMLIIYNSGRFRMAGRGS LSGLLAYGDANKIVQEMGAEVLPVVKNTPVLAGVCGTDPFRVMDLFLKQLKDQGFNGVQN FPTVGLIDGKFRANLEETGMGYGLEVDMIREAHKLDMLTCPYVFDPEQAKAMAEAGADIL VAHMGLTTKGSIGAETALTLDDCCVKIREIIRAGREVCQDIMVICHGGPIADPEDAAYVI KHVPEIDGFFGASSIERLASERGMTAQTEAFKAIEK >gi|229784113|gb|GG667622.1| GENE 6 5175 - 6587 1346 470 aa, chain + ## HITS:1 COG:SA0645 KEGG:ns NR:ns ## COG: SA0645 COG0471 # Protein_GI_number: 15926367 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Staphylococcus aureus N315 # 70 469 103 517 517 68 21.0 2e-11 MDNQGITADKKTAFDLKFFIKWGIAVGVPCILYFAVPQSEAVSPEMRKFFVCTSWAILTW AMNLLPVYISGLLLTVGAILWGVAPANVVLSPWTGSVVWLSLGGLTLSVIFEKSGLMQRI AYFFIVKSGGSYKGIITGITLSGIVVGLLVPNMTGRVALYCALCFGIYNALKIPKNSNTG AGIMFAGFVAAIAPSWLYLSASENLQLVNSYLKETGNSVSWIQYFISNFPVMLVWLIVLL FMVRILFKQDASIDGKEYFKEKYESLGKMSKKEIKFLVILILLVIAMVFSGIDAGWLFIL AVIACFLPGVEVADIEDMKQVNFPMVFFVAATMAIGSMSNAVGVAGLVSELMMPILQNVG NFGLIVFAWLTGVITDMMMTPLAGMAAFSPLFIDIAAKLGLSVIGTTFSFVWGVEQLFFP 
YEWALFLILFSYDVFDMKKAMKFCTVRTILSIIFLMVLIYPYWMMTGFLR >gi|229784113|gb|GG667622.1| GENE 7 6597 - 6842 344 81 aa, chain + ## HITS:1 COG:BS_trxB KEGG:ns NR:ns ## COG: BS_trxB COG0492 # Protein_GI_number: 16080532 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin reductase # Organism: Bacillus subtilis # 2 81 5 84 316 97 52.0 6e-21 MKIYDVIIAGSGPAGMTAAIYAKRANLDVLLLDKLAPGGQIINTYEIQNYPGFASVNGAD LAMQVFEHTQALGIEFDYATV >gi|229784113|gb|GG667622.1| GENE 8 7765 - 8463 294 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 31 226 104 302 306 117 36 2e-25 XEKIEIEDCEEGGGTVKTLICGEGKSYRALAVILATGTKPRMLHVPNELDFAGKSISWCA ICDGAKYRDKSVIVIGGGNSAVEESIYLAEMAERVTVVTMLDLTADPIACDKLKALPNVE VYEWFDIKEFLPGEVFTGLRAVSSKTGEEITVRGDGAFEYIGLQPVTESFAGLGILESHG YIETNERMETAVPGIFGAGDANSKLLRQIVTACSDGAVAAQSAAAYKKKCMG >gi|229784113|gb|GG667622.1| GENE 9 8485 - 10359 1698 624 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_3734 NR:ns ## KEGG: Dhaf_3734 # Name: not_defined # Def: arylsulfotransferase # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 2 621 4 626 628 773 58.0 0 MIRYEKKKSLITQQAEAEQKFLDIFEAGEYTAASPLVVGNPYLISPLTAMILFKTSVKQE VTLTVKGKEPEGDISHTFPADTVHILPVYGLYADYDNTVELVLSGGEREKVIITTEPLPP EVPVPTFCKGSREHMGDNVIFLTQTSKADALACDYCGDVRWYLTVNVCFDLKRLANGRLL VGTDRLIKLPYYVSGVYEMGVHGKIYREYRLPGGYHHDTFEMEDGNILMLSQIPDRDTVE DVLVLVDRQTGEIVRTWDYREILPYNCPTTYSGSASAHDWFHNNAVWYDKKTDSITLSGR HQDAVINIDFQTGALNWILGDPEGWPKEYVENYFFRPVGDPFEWSYEQHGVVVCPDGDIM MFDNGHYRSKRKDSYSRAKDSYSRGVRYHIDREERTIRQVWQYGKERGADFFSPYICNVE YYDEGRYMVHSGGIAYKNGEPLEGLGSMDGTGEGCELNSITCELVGDEVVYELHVPSNVF RAEKLPMYYANETAELGVGETLGSMNRTGEFETEIPAVSTGELIPEHYNASVTEEEDRIL FNATFEKGELAMLLLEEENGVVHRYYINTSAAKNFEAMCVGTFLKNDPRNVDVYVNKSGL SGEVTVRVLLENSIYETGVRIRME >gi|229784113|gb|GG667622.1| GENE 10 10723 - 11229 542 168 aa, chain + ## HITS:1 COG:FN0320 KEGG:ns NR:ns ## COG: FN0320 COG1853 # Protein_GI_number: 19703665 # Func_class: R General 
function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 24 166 23 177 180 89 39.0 3e-18 MSFKEVKMEELNFNPFTKIGTEWMLITAGDEKKFNTMTASWGGVGIMWNKNVVTTYIRPQ RYTKEFVDANEYFTVSFYDKEYKKALNICGTRSGRDCDKAAEAGLTPYFIDGTTAFEEAN MIFVCRKLYCDPMPGDNFLDKENDEKWYPDKDYHTMYISEIVKVLVKD >gi|229784113|gb|GG667622.1| GENE 11 12401 - 12715 377 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620309|ref|ZP_06113244.1| ## NR: gi|266620309|ref|ZP_06113244.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 104 7 110 110 160 100.0 2e-38 MDPSTYGNEAAITADLENGTYTLWEDSDDTIMKTVYEGNIYILKGAVTGTYTNLAEASKT QKKIYAGEPVEEGKEEASAEYTVYGNDLYYNEKGYFTAFYYLGD >gi|229784113|gb|GG667622.1| GENE 12 13659 - 14627 575 322 aa, chain - ## HITS:1 COG:no KEGG:Closa_2955 NR:ns ## KEGG: Closa_2955 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 205 285 303 383 657 82 50.0 2e-14 MKNKLLPGLIIALGLLTSACHTDSGFDAPAESFQKESQAETVSPAEEKPAAKDQSSDPAD HVSIGTEETKDTETAFHTFSVKALNHLNADPSLIYNRYVKAGDIRTAISFSLAQDWKESD SLTCHFQTQDSQYESSWNGTADAVINTESGTITYKLIPEFFGTNQKNGTCEITFHLDETV PYIAVEGDPRLEGEYYSFQDSFSRPDEFTRYLGKADLYLYPTEDLKLLRNEIYAAHGRKF DTESLNQYFSAQAWYRGLLEPNAFSDAVLSDIEKKNITLIRELEEIPYNERNTIDGIQYG DEWDRLPVAPYLSYLHPGHETG >gi|229784113|gb|GG667622.1| GENE 13 14758 - 15804 1093 348 aa, chain + ## HITS:1 COG:BS_spoIIIAA KEGG:ns NR:ns ## COG: BS_spoIIIAA COG3854 # Protein_GI_number: 16079499 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 19 329 2 300 307 260 41.0 3e-69 MSPFTYIEVRGGMKVEKKDELIKIFSKNIREILTRVAVSFDEVQEIRLRVAAPLLMVYKN EEYYISRLGQLSRDCRDAYLVSRNELKETMEYMSNYSLYAFEEEMKQGFLTIQGGHRIGV AGKTILDESGIKTMKFISFINVRLSHQVKGCASPVLPYLYDSEEILHTLIISPPRCGKTT LLRDLIRQISNGTEAHPGMTVGVVDERSEIGACYQGIPQNELGIRTDILDCCPKARGMMM 
LIRTMSPRVIAVDEIGSREDLEAVEYVMNCGCKLIATVHGSSIDDLKQKPVLRRLVEERI FERYIVLNNKGKIGNIDQIYDSRGTQLYKAEVHRMGKRWNSEECMAYG >gi|229784113|gb|GG667622.1| GENE 14 15797 - 16327 584 176 aa, chain + ## HITS:1 COG:no KEGG:Closa_3251 NR:ns ## KEGG: Closa_3251 # Name: not_defined # Def: Sporulation stage III protein AB # Organism: C.saccharolyticum # Pathway: not_defined # 1 176 1 176 176 259 75.0 5e-68 MVNTGIRLLGAVLVVVSCSGMGFFLAGQWGERLKTMEHLRKMIFLLKGEIVYARSALPES FERTGKKGGGEIGDLFVRVAGRMEGQRGEPFYDIWQEEIEKLPKAFCLSKEDRQSLKGLG EHLGYLDLEMQERTILLYLEQLDLTIGYLREHKQERSRLYTSLGIMGGIFLTIMMY >gi|229784113|gb|GG667622.1| GENE 15 16403 - 16597 239 64 aa, chain + ## HITS:1 COG:no KEGG:Closa_3250 NR:ns ## KEGG: Closa_3250 # Name: not_defined # Def: stage III sporulation protein AC # Organism: C.saccharolyticum # Pathway: not_defined # 1 64 1 64 64 97 98.0 1e-19 MGVNLIFKIAAVGILVSVICQVLKHSGREEQAFLTSLAGLILVLFWLVPYIYQLFETIKN LFAL >gi|229784113|gb|GG667622.1| GENE 16 16611 - 16997 531 128 aa, chain + ## HITS:1 COG:no KEGG:Closa_3249 NR:ns ## KEGG: Closa_3249 # Name: not_defined # Def: stage III sporulation protein AD # Organism: C.saccharolyticum # Pathway: not_defined # 1 128 1 128 128 166 90.0 2e-40 MTVVTIAVVGITAVLLAVALKGMKGEYGIYLVMAAGFFIFFYGVGKLTTILDTIKQVQSY IKINSVYLSTLIKMIGITYVAEFAAGICKDAGYGAVGTQIEIFGKLSVLAVSMPILLALI ETLQVFLS >gi|229784113|gb|GG667622.1| GENE 17 16994 - 18238 1312 414 aa, chain + ## HITS:1 COG:no KEGG:Closa_3248 NR:ns ## KEGG: Closa_3248 # Name: not_defined # Def: Sporulation stage III protein AE # Organism: C.saccharolyticum # Pathway: not_defined # 55 414 31 391 391 444 70.0 1e-123 MKGRLLKTAVLLVLALLLAGTGGVFSEGRLCAGFLSFAAEHGGQSPVDSGKTVDEAETKE VSMDLEQYDLSEIQDYINAQGTSGWNLSFKQMMKDIMAGRLGDVMGQIGTSIRDALFSEV RNNSHMMGQIIILGIIGAVFSNFSNVFTGSQISETGFFVTYLLLFTYLAASFFTSITIAG SVVNQILGFMKALMPAYFLAAAFAGGSVSAVASYEFTLFAITVAQTITGSVLLPMVKVYA LLVMAGHIAKEDILSKLTDLLNDVITWSLKTMVGIVLGFHLIQGLVLPYVDSMKTGAVQK LIGIIPGVGQGASAVTQMVIGSGVLVKNTMGAAGVVVLVIITLIPMLKLVILMILYQCVA 
AVLQPVCDKRIVSCISDMARGHKILLSIAASAIVLFVVTVAVVCASTNVTYYAG >gi|229784113|gb|GG667622.1| GENE 18 18244 - 18990 664 248 aa, chain + ## HITS:1 COG:no KEGG:Closa_3247 NR:ns ## KEGG: Closa_3247 # Name: not_defined # Def: Sporulation stage III protein AF # Organism: C.saccharolyticum # Pathway: not_defined # 1 248 4 228 230 236 53.0 9e-61 MEGIYEWIRNITYYLIFMTVVTNLLPDKKYEKYFRLFAGMVLILLVLKPFTGGLRLDDKL AYYFEAISFQKEASELSAQISDMESVRLKSMVSQYEEAVEHDLRTMAESSGFVCVKAKAL IDGDEKNARFGHVVSVSLRLAPDSPGQTDQGNNTKIRPVESVEPVERIRVELGGGSTEVS EKAGERNPAAAGAETGTSAGEITKPAGAAAGQPDHRQEENSALAGLRRRISEYYDLEEQD IEIQLEIG >gi|229784113|gb|GG667622.1| GENE 19 18965 - 19567 523 200 aa, chain + ## HITS:1 COG:no KEGG:Closa_3246 NR:ns ## KEGG: Closa_3246 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 200 1 201 201 249 75.0 4e-65 MKFNWKLGKDKWLILLAAGLLLLVLTIPSGPETEKKMMKDAASGTEAVLQEGLVTEAGEP VAAAANADRTYEQQLEQRVKQILKTVDGVGTVDVMIVLKSSEEKVLRVDKNTSSSSTEEQ DSGGGTRKSSSAEQQESTILTGSGENTAPIVEKEIRPEIEGIIISAQGGGSPTVKAEISG AMEALFDLPPHKIKVLKRVE >gi|229784113|gb|GG667622.1| GENE 20 19584 - 20450 1062 288 aa, chain + ## HITS:1 COG:no KEGG:Closa_3245 NR:ns ## KEGG: Closa_3245 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 288 1 262 262 298 73.0 2e-79 MRSLKDIRESKVPGTPKKMNMKKLFRRNQIIITTLAVMIAAAGYLNYAGKQEMASGTDVY EAGVTDISDEDILAENQALTGDPGNYQEIASLDNDPSDESLAAQDTDGSGSAAASNQEGV AGDAAADTANAADNSAAEGGQLAANDAASSDTETGLENPGEAVLTSGMNVSDYIANVQLS REQVRAKNKETLMNLINNENIEEAAKQEAIQEMIDMTAISEKENAAETLLMAKGFSDPVV SITSGKVDVVINASSITDPQRAQIEDIVKRKTEVGAENIVITLMKLEE >gi|229784113|gb|GG667622.1| GENE 21 20821 - 21807 930 328 aa, chain + ## HITS:1 COG:BH1025 KEGG:ns NR:ns ## COG: BH1025 COG4606 # Protein_GI_number: 15613588 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type enterochelin transport system, permease component # Organism: Bacillus halodurans # 1 302 15 316 317 280 50.0 3e-75 
MVSLFLGVMELSPEGVLRGDFEQMEILFISRLPRLLAILCTGVGMSISGLIMQQLCMNKF VSPTTGATISSAQFGILLALLFMPSSSLWGRAVFAFAAAIAGTWIFVCFIQRIQFKDVVM VPLVGIMFGNVIGGVTSYLAYKYDMTQALSSWLVGHFSLVIRGRYELVYLTLPLVLLAFL FANHFNIVGMGKNFSKNLGVNYNVVLFLGLSIAAMITASIVVVVGSVSYIGLIVPNLVAM FKGDRIRGTLLDTALFGALFVLVCDMIGRTVIAPYELPIELIIGVIGSAVFVGMLIYRLN HGRKSISLPFGKKRAAAVPSPQTGGTEA >gi|229784113|gb|GG667622.1| GENE 22 21804 - 22796 1084 330 aa, chain + ## HITS:1 COG:BS_yclO KEGG:ns NR:ns ## COG: BS_yclO COG4605 # Protein_GI_number: 16077449 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type enterochelin transport system, permease component # Organism: Bacillus subtilis # 25 329 10 314 315 223 39.0 6e-58 MKQETGSRRKKTGSINGKKLVMLSLLAAAAAVLYLFLNVNMKYFSYAMSLRIPKLAVMLI TAFCIGGASVIFQSVINNTIVTPCLLGMNSLYTLIHTAVVFFFGSTSLLARNTNLSFAVD LVLMAIIATVIYSYLFKKTSHNVLYVLLAGTVLSTFFTSMQTTLTRIMDPNEYDSLLASL VASFTNVNTEIIFFSLVLIAVLLLALRRELALLDVITLGKAQAINLGVDYDRTVRRLLLG VTLFIAIATAMVGPISFLGLIIANLSRQLFKTFRHSYLIAGSALIGMVILAGGQLITEQI FTYSVPVSVFITVGGGIYFLYLLLRTSRSA >gi|229784113|gb|GG667622.1| GENE 23 22829 - 23581 203 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 15 228 27 247 563 82 28 5e-15 MFVKELLKKYNGKTVVDGVSFELPAGKVISLIGPNGAGKSTVMGMISRLIAQDDGSVDFE GKDISRWKSRDLAKRLAILTQSSHVQMKLTVRELVAFGRFPYSGSRITAEAEEIIDKAIA YMELEEFEDRFIDELSGGQRQRAYIAMVIAQDTEYILLDEPTNNLDIYHATNMMKIVRRL CDELGKTVILVLHEINYAAFYSDYICAFVDGKISKFGTVEEVMTKENLSEIYRVDFEILQ IEGKPLSIYY >gi|229784113|gb|GG667622.1| GENE 24 23668 - 24720 1349 350 aa, chain + ## HITS:1 COG:BS_yclQ KEGG:ns NR:ns ## COG: BS_yclQ COG4607 # Protein_GI_number: 16077451 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type enterochelin transport system, periplasmic component # Organism: Bacillus subtilis # 1 346 1 315 317 180 32.0 4e-45 MKKMTLLTLAMAAALSMTACGQGTGQSKTTDGQSGEATVSAEAESVLAESRNETTQASET PESVTVTTLNGNGEKVELEVPYNPERVAILDMAVLDMMDNWGLGSQIVGMPKSSKIDYLM EYNNNDAIVNLGTLKEVDMEALMASEPDVIFIGGRLSAQYDELSKIAPVVYTAVDYEEGV IQSVKNNASMIASIFGEEEKAAEELAGIDARVETLRQAAEGKTAVIGMVTSSNFNTLGDG SRCSLIGNEVGFTNLANDVDSTHGNESSFELLVSLNPDYIFVLDRDSAINTEGAKLAGEV MDNELVKKTDAYKNGRIVYLTPTVWYLAEGGITAADVMLSDLEAGILGQE >gi|229784113|gb|GG667622.1| GENE 25 24808 - 25194 586 128 aa, chain + ## HITS:1 COG:lin1395 KEGG:ns NR:ns ## COG: lin1395 COG1302 # Protein_GI_number: 16800463 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 18 126 15 124 135 89 44.0 1e-18 MAETENRNTHKVYEKDKIGEVQIADEVVAIIAGLAATEVDGVDSMAGNITNELVGKLGMK NLSKGVKVEVTEEHVSVDLSLNIKYGFSIPEVCEKVQDKVKSAIENMTGLTVLDVNIKIA GVSMEENR >gi|229784113|gb|GG667622.1| GENE 26 25266 - 25757 295 163 aa, chain - ## HITS:1 COG:CAC1481 KEGG:ns NR:ns ## COG: CAC1481 COG0583 # Protein_GI_number: 15894760 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 38 161 164 288 292 67 25.0 1e-11 MEDEPLRQGFRHGSYDLVIGRPELLTGIQSTCHLLARDRLVLLTPTDHPLAGRAAVPLEE LSGESFILMHPDSSIHRLCLTACQKKGFQPAVTRTAKIESIVSAVAAGEAVSLLPYSNLD VFRHENVVVIPLQEPVAANVVLACLSPKKLSRCSRTLWKYLST >gi|229784113|gb|GG667622.1| GENE 27 26690 - 26998 188 102 aa, chain - ## 
HITS:1 COG:BS_ywbI KEGG:ns NR:ns ## COG: BS_ywbI COG0583 # Protein_GI_number: 16080882 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 1 101 1 101 301 73 34.0 7e-14 MTFDQLQYFVAVVESSTFFDAAESLHIAQSTLSKQIRNLELELDVQLFDRSRRHASLTEA GRALYPDAVKLLKGIQGMKQHLAPYQDSRKSIIHLGVLPILH >gi|229784113|gb|GG667622.1| GENE 28 27158 - 27733 563 191 aa, chain + ## HITS:1 COG:Z4047 KEGG:ns NR:ns ## COG: Z4047 COG0163 # Protein_GI_number: 15803256 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase # Organism: Escherichia coli O157:H7 EDL933 # 1 183 1 183 197 211 53.0 8e-55 MDLIIGVTGASGTIYAVKLLEALKERKEVVTHVIFSDYAWINLEIETEYTKEEILSLADH CYDNRDLAACISSGSCPVDGVVVLPCSMKTLSGIANGYADNLITRVCDVALKERRRLVVC PRETPLNTIHLKHLYELSQMGAVIVPPMPAFYNHPETVEDIVNHQIMKILDQFRIPWEGA KRWGEKVRDHV >gi|229784113|gb|GG667622.1| GENE 29 27726 - 28112 481 128 aa, chain + ## HITS:1 COG:no KEGG:SGGBAA2069_c16390 NR:ns ## KEGG: SGGBAA2069_c16390 # Name: not_defined # Def: hypothetical protein # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 8 115 4 118 135 62 29.0 4e-09 MFEELMDFTMTGHGVSVQFRVLRMGTDCLVLAAGGDTGHIGAIAFGDRDECLSNAREGHR EGVVTELIYRELESVVPGGLAVLAGIHVDGITKEQITGVIALCEEGAHRIAAGLRQDQKE EKGGNQNG >gi|229784113|gb|GG667622.1| GENE 30 28105 - 29574 1454 489 aa, chain + ## HITS:1 COG:MA0246 KEGG:ns NR:ns ## COG: MA0246 COG0043 # Protein_GI_number: 20089144 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases # Organism: Methanosarcina acetivorans str.C2A # 50 477 35 421 422 152 31.0 1e-36 MVEDLRSAVEELKKYENQIACTDTEVDSYAEVAGIYRYVGAGGTVKRPTKEGPALLFRNI KGFPDKQVLIGLLASRKRVGYLLDCPPDKLGFLLKDAAANPVMPEFTESGDVPCQEEIHY ASDEGFDIRKILPAPQNTEEDAGPYITMGMCYASDPVTGDGDVTIHRLCLQSADEMTMFF TPGVRHLDAFREKAEREGVNLPISISIGVDPAIEIASCFEPPTTPLGFNELSIAGALRKK PVRMAKCRTIDEYAIANAEFVIEGELVAGKRMREDINTNTGKAMPEFPGYTGHVSGELPV 
IKIKAVTHRKNPMIQTCIGPSEEHVSMAGIPTEASILTMEEKALAGRVKNVYAHRSGGGK YMAVIQFAKQVPSDEGRQRQAALIALTAFPELKHVIVVDEDVDIFDSDDVLWALNTRYQG DVDTITIPGVRCHPLDPTEGPEYNPMLKDRGISCKTIFDCTVPYGLKDRFQRSKFKEVNM EDYEIRPLV >gi|229784113|gb|GG667622.1| GENE 31 29689 - 29862 105 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCSPVTFIFAVRIQMAAHWDIKPSGTAPHPKPRKNLKDPEKVNPCGLMQLPGIIPIS >gi|229784113|gb|GG667622.1| GENE 32 29945 - 30784 747 279 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_3107 NR:ns ## KEGG: Dhaf_3107 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 11 270 10 268 272 276 50.0 6e-73 MKARRIYPSALVLSLLFLSGCYRGPVQEEETLHLIQENAASLPGGESADLWQDEHNTYDS YRLSDGTSLLTVYKTVGPDNYYVAGTVSFNDLNESARKAIDGYYEEQGLLYDTETELQKA YEDYLSCRQNGTLYQERHVRQEITPSAANDTMVCFITIAAIPVSGQESEELRLGAVFNRE TGAVYSNWDLFILPEEEARQWLLDASDAVSLELRGEMEAALRPEYILLFPTHLEITYPRG TLPSQEYSTGIGLNYDDRLLSVLEPWAVPDGTQSAQADR >gi|229784113|gb|GG667622.1| GENE 33 30828 - 30935 66 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFIMPGILKEQGIAADRHEKPQEIVQHGKPTVIRK >gi|229784113|gb|GG667622.1| GENE 34 30880 - 31383 321 167 aa, chain - ## HITS:1 COG:FN1107 KEGG:ns NR:ns ## COG: FN1107 COG0454 # Protein_GI_number: 19704442 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Fusobacterium nucleatum # 51 155 2 108 108 89 42.0 3e-18 MSKEITLAKAGLSNCEELYQLQTISFHQLLEKYQDYDSNPGAETKDRTLQRLKDPFTDYY FICLSGKHIGAVRISHFETLCRLKQLFILPEFQGHGYAQQAILLAEALYPEIQRWELDTI SQEAKLCHLYEKMGYTKTGRMEKIKDGMDLVYYARHTKRAGNCGRQA >gi|229784113|gb|GG667622.1| GENE 35 31394 - 31963 551 189 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620332|ref|ZP_06113267.1| ## NR: gi|266620332|ref|ZP_06113267.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 189 1 189 189 348 100.0 1e-94 MKKKTVWTASVLGAAAFLAAFCLLAPRSYPVPADTTVGTRSIRYEDRPETAAEIIVARYF 
LNQITGDFNEMKNLLPNTEADRITLKNEKEAFLKGESMETYLIRSITTLDREQYIGDNRA FSYPGAEHKARQENLKDYVVVHVEFYQKWSDKAEARAPQWGSGECGRNFLVGRKALDKNY KIYDYGMMM >gi|229784113|gb|GG667622.1| GENE 36 32167 - 33057 1078 296 aa, chain + ## HITS:1 COG:ZygeT KEGG:ns NR:ns ## COG: ZygeT COG1319 # Protein_GI_number: 15803404 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli O157:H7 EDL933 # 24 287 25 288 292 110 27.0 4e-24 MVQNYIMAKEAKEVPALLAQYEGRGRIIAGGTDLVLDLQSGKYQADCLVDITGIPELTGI RRDGGILEIGAAVTHNQAAASSMIRQYAYALAKASHSVGSNQIRNCSTIGGNLVNGQPAA DSAVALAALGASVEVVNGTGTEVLSMDRLYAGFGKSTVDSTKSLVTRLLVPAAEPGEGSG YARLEQRKALALPMVCVAAFLAVRDGVVELCRIAMAPVGVGPVRAEAAETFLTGRTASKE LFQEAGRLALENANPRDSLIRGSKHYREEVLPVMVERALLEAFSDAGNGKKEVEQE >gi|229784113|gb|GG667622.1| GENE 37 33054 - 33272 329 72 aa, chain + ## HITS:1 COG:SMb20403 KEGG:ns NR:ns ## COG: SMb20403 COG2080 # Protein_GI_number: 16264137 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sinorhizobium meliloti # 3 68 2 67 165 68 51.0 2e-12 MKLSFILNDVPARVEIEPFDLLADVLRDRLGLTGTKKGCGQGECGACTVIVDGKAVNSCM VPAARVEGCRAS >gi|229784113|gb|GG667622.1| GENE 38 34239 - 34424 168 61 aa, chain + ## HITS:1 COG:mlr1701 KEGG:ns NR:ns ## COG: mlr1701 COG2080 # Protein_GI_number: 13471661 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Mesorhizobium loti # 1 47 108 154 160 61 61.0 5e-10 MILSAKALLDEKPHPTREEIRTALAGNLCRCTGYVKIEQAVELAARALALMEGGETDEGG R >gi|229784113|gb|GG667622.1| GENE 39 34408 - 36249 1889 613 aa, chain + ## HITS:1 COG:SMb20132 KEGG:ns NR:ns ## COG: SMb20132 COG1529 # Protein_GI_number: 16263880 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: 
Sinorhizobium meliloti # 7 574 19 582 772 367 38.0 1e-101 MKAEGEVGKPLPRIDALEKATGLAAFTTDLKLPGMLHAKVLRSPYPHARIVSIDTAEAEK LPGVRAVATYKNTTECHFNSSCSEANPPAVPVEDEQVFNQELRYVGDEVAAVAADTEETA KQALKLIRVEYEELPAVFDPLEAMKPDAPPVHPCFIGNNIAGMAKLPMGDLEKGFAESDL ILEERFKLSVVKHCQLETQAAVADYSASGHLTVWSTTQSPHPLKNLLAKLFGMPASRVRV LNPPYIGGAFGSHVGMSGKAEPIAAALAILAGKPVRFVYDRKEDFIASTTRHSGYITVKL GAKRDGSFQALYLNAVLNTGGYANAGPDVTAVLGVTNISIYQIPNVLYDGLCVYTNTTTA GAMRGFGNPQGNFAVESTVDMMAERLGMDPMELRLKNVMTPYAPWLPPYPCATCGLAECM EKGAEKIGWTDRDRLKKTTDSKKVRGIGMAAGTHLSNSWPFAVEFDNAYLTVMEDGSANL AVGVPDMGQGIATTLSQVAASSLGIPMDQVSIVFGDTASTPYDIGSHASRTLYSAGTAVE AAGRDAKQKILEFTAEQLKADLKTLTLENGMISVNGEPCMTIGEAAHLIHRANRQIIGIG QTVPFNAPPWHAS >gi|229784113|gb|GG667622.1| GENE 40 37175 - 37588 489 137 aa, chain + ## HITS:1 COG:mll4880 KEGG:ns NR:ns ## COG: mll4880 COG1529 # Protein_GI_number: 13474083 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Mesorhizobium loti # 1 135 635 764 774 125 48.0 2e-29 MDLETGQTEVLRLAAAHDVGRAINPDIVEGQIHGSVLQGIGYALSEQIVYDKKGRQAQDS FHKYYLPTILETPDIDVILVETNEPSGPFGAKGVGECGLVPTAAAIANAVSHATGIRFHE IPITEEKMLEAIKKEKE >gi|229784113|gb|GG667622.1| GENE 41 37595 - 39535 2061 646 aa, chain + ## HITS:1 COG:CAC1044_1 KEGG:ns NR:ns ## COG: CAC1044_1 COG1902 # Protein_GI_number: 15894331 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Clostridium acetobutylicum # 4 406 1 413 413 243 37.0 1e-63 MKKMFPKLFEPGQIGTLRVKNRLVKAPQTTGLSNKDGTVTQRLVDHYTHLADGGAGLVIV EYAYIDDIASKSCHCQVGISSHEHIAGLGWLADSIKNHGAKAGIQIEHCGRQRFLGPPMK SASAIPWPMLYDQFHAIPEELTIDEIQVLIEAFGDAAKRAVDAGFDLVEIHAAHGYLITN FLSPFTNKRGDWYGGSRENRFRFLGQVVENCRRKVGPDFPLTVRLSGTDYEPDGMTIEDT IYYAKELEKLGIDAFHISGGDHHTMIHQVSPMAMPVCYNVWAAEAVKKEVHVPVMASGSI TLPQYAEDILEQGKADFITLGRPMWADNEWVKKAMEDRPEDIRPCIRCNEGCLQRSSFLG RTVMCAVNPVLGFEEDLAVKPAETKKKVVIAGGGPAGMEAARVLKLKGHDVTIYEMRKLG 
GYLHEASAPEFKEDIRHLIDYQIHQIEKLEIPVVSEELTPEMVKAGGYDVVISAVGAEPV IPAVPGIDGKNVINALAILDRHPEIGKKVVVVGGGMIGTETAIDLAEKGHEVTIVEMKDA IMADCAVTDVIAYYEKIGRNRIAVIPGLRVTEVTEQGVRGVNDRTGRRTELPADSVVIAV GLKPRHAFYDTLAGEPNLEVYEIGDCVKAGKILDAFHTAYKTAVRI >gi|229784113|gb|GG667622.1| GENE 42 39622 - 39870 362 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870201|ref|ZP_06113275.2| ## NR: gi|288870201|ref|ZP_06113275.2| putative transposase [Clostridium hathewayi DSM 13479] putative transposase [Clostridium hathewayi DSM 13479] # 1 82 24 105 105 150 100.0 2e-35 MTPLGQMLVNDGFEKGVEKGIEKGIEKGIEKGIEKGARALISSYQETGLSYDDTLKKLME KLELDSPTAARYMEKFWIRIPV >gi|229784113|gb|GG667622.1| GENE 43 40858 - 41547 571 229 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c03680 NR:ns ## KEGG: CLJU_c03680 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 27 226 3 197 275 121 33.0 2e-26 MEIGSYIAGNGGYMSNTNDADAAKDIPDNFPAEDLVLKSSMQFFGDELLSYLGITAEPVT VGPTEFVYLTAKQLYEDFNFIKPDRSWIHLEFESDSIKEEDLRRFRSYEAVTSYICRVDI TTYVICSSTVKEIRNELKTGHNTYKIIPVRLKNNNADELFARLFEKQKNGEVLNRADLVP LLLTTLMSGTTDQKDRIILANRLITESGALSQGEIVKMQAVLYTLAIAS >gi|229784113|gb|GG667622.1| GENE 44 41896 - 42546 461 216 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620343|ref|ZP_06113278.1| ## NR: gi|266620343|ref|ZP_06113278.1| hypothetical protein CLOSTHATH_01426 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01426 [Clostridium hathewayi DSM 13479] # 1 216 1 216 216 459 100.0 1e-128 MEKESITCFQYKLRNLACDGEILKLLKALHERCGGQWVYSVHQNYEVYPDTPELSEAFFY PEWWKRRVVRPAFPGIQEFGWEEGRSFHFIHSSQTVNGEVRHLLQIRGEEEFTEEDHVLL DYACILTGYMKALPAASEGISTLMGNAYRGIRPETIPEGVFPENGYGIVVLERGDKGILS SRDTWDTDSTAASGNACGIVSWTRRICCCLSLLTML >gi|229784113|gb|GG667622.1| GENE 45 42513 - 43043 520 176 aa, chain + ## HITS:1 COG:no KEGG:Moth_2125 NR:ns ## KEGG: Moth_2125 # Name: not_defined # Def: CdaR family transcriptional regulator # Organism: M.thermoacetica # Pathway: not_defined # 18 166 400 548 553 75 28.0 1e-12 
MLLFVSADNVMKVTEQLEEIIGFGDDDMIMGISGLYSAAEMDQALGEARHSVEVGNLTDM DRTVFFHCELGIFRILDYPKPGSPVNRLLDQMTEKLMEYPKEKRDALSDTMCCFVKNHYN YQKTADAMYAHVNTIRYRISLLENLWSCDLQREEDRLLFEVVSRLLPLWVKETGYE >gi|229784113|gb|GG667622.1| GENE 46 43361 - 43555 205 64 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1 60 605 664 744 72 55.0 2e-13 MLTGWLSSGDKWYYLNADGSMATGWVKLSGKWYYLNQNGDMETASKEIEGKVYSFDENGA CVNP >gi|229784113|gb|GG667622.1| GENE 47 43646 - 44506 739 286 aa, chain - ## HITS:1 COG:no KEGG:BC1003_1494 NR:ns ## KEGG: BC1003_1494 # Name: not_defined # Def: peptidoglycan-binding lysin domain-containing protein # Organism: Burkholderia_CCGE1003 # Pathway: not_defined # 211 280 91 162 165 65 45.0 3e-09 MRKTYLLTFLLAAGLMAESPFAAYAKDSPLTLPEFMEICQQEQFNFSRVITLPGEETGSE KTGGDQKQYETHYKNVILSEHGAKYRESPYGPGYQSLKSGKLYMDMPEGYELSYEYKIKG ESDETGPETTVPGGTIETFYRAEAAAEVKSPGEQYGYIKNEVKDIPVVPDLPQDYNWRGH DEKTLMVSSVLTVRSLTSGRTESWESQITFVTNLDKKEPDCYYLIPTDREAYTVKAGDSL WKIAKQYYGSSENWQFILHRNSDLIKNPDRIYPGQLLVIPDAAAWQ >gi|229784113|gb|GG667622.1| GENE 48 44539 - 45879 770 446 aa, chain - ## HITS:1 COG:all3587 KEGG:ns NR:ns ## COG: all3587 COG0642 # Protein_GI_number: 17231079 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 1 444 1 475 475 145 29.0 1e-34 MIKELRKKLTLLYTVTTGTILTLVVAGLLFYNLRVSEKDALQTFQNQIVAVASRMQYGST ISGSWLAQTEASEKLIIHIEENREPFLYSGSWVPATDRQTLIDLAREEALKEKVNPAVRP VSSSVIQSSVFTVTGEQKDSYYGTILVFPTARSYQSLTLLGYRTPGIVLLKDQGPRFLLF DLLGIAALFLVSWYFVGKSLEPVEVSRKRQNEFIAAASHELRSPLAVIESSVTAIAAAEE SSAAPRSAECMEKQTQYLKNIRAECSRMAALVGDMLLLASTDAGSWSVKRETVDMDTLLI ETYELYEPVCRSSGIRLFLDLPEETLPHITGDSFRLKQLLTILLDNAVSYTPEDRGITIS AAANGNELTISVADQGCGIPPEARAHIFDRFYRGDSSRTDKRHFGLGLSIAKELAVLHGG RISVGETEGGGATFWIKLPVIIENGL >gi|229784113|gb|GG667622.1| GENE 49 45872 - 46561 740 229 aa, chain - ## HITS:1 COG:mlr4457 KEGG:ns NR:ns ## COG: mlr4457 COG0745 # Protein_GI_number: 13473757 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Mesorhizobium loti # 1 224 1 221 227 168 41.0 1e-41 MRILLVEDDKVLCSTVAMQLEQAGYQVDQCCRGDDALYYATEYPYDVIILDRMLPSLDGL SLLQAIRQKSIMTPVIITTALDGLGDRIDGLDAGADDYLVKPFAVEELLARIRAVTRRPG RTADTASLAVGNLSLDAVKHELTRNGAESETVTLSGRESSLLEYLMRNAGRTLPRNAILS YVWGPDSEVEAGNLDNYIYFLRRRLKSLHGNAKIQTVHGVGYRLEEFHD >gi|229784113|gb|GG667622.1| GENE 50 46727 - 49024 2307 765 aa, chain + ## HITS:1 COG:no KEGG:Clole_2648 NR:ns ## KEGG: Clole_2648 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: C.lentocellum # Pathway: not_defined # 280 765 300 776 776 161 25.0 1e-37 MKNSCLITSLTHTQSVRVIVQKTAAVALAAVMAVMTAGCGTKGQPAGDATGTKDGTKQTT GANPGTADAAKGRYVEQAVAYPFEDGTEGVLDIIQDQNGEMVLFTLVGQETEGKKKAYRY DGSTWKEDTASPVKELPDSYYLAYSAYGPDGTLYAVCSDSDYKAHLIKLADGQPMQEMSV GIEDAMVLLNGIYVSDDSTILIPSDDQVIVVSPDGTVEKKLPQKSSFSNFCDSHTLTAHT FLTTGDRGFLRYDLKSLAEKEVIPFQTDETDLYGSLAGGDGDDFYLANAKGIHHMADQGT MWETVVDGSLNSIGMPSVYIKRLLIGSDHDFYLWYTESEEQKLARYTYDAEMPAVPSKTL TVYGLNLSENQTVKQAASLFQMEHPDVRVELIDGNSESGSTLKSDTIRSLNAELLNGSGA DVLVLDGLPVESYIEKGVLEDLGDMITPMAESGELYPNIAENFTGSDGSVYQFPVRVGIP VIYGNEAVIKEMGTIGSLRAWQEANPDQAVFFKTVYENILRQMIYLYYPELAGKDAGQLD 
TDKVRNLLETAKITGDAGGSKTVFDESEDGGRGQVYNSTDTRGFTGLGDYGLLTKTTEVS LIELASMMDVMSPSAMIDKYGYHLEQFNDIYYPKGLLGVTSFGKEKETAREFVKFALSPK VQEGDLRDGFTVSRTASDAWKKRTSNMSISFSFNDTGEMLSAEYPNDKKKEEIMGMLDKL HTPISVDDILLEMIVSETKGYFEGKQTAAEAAGLFENKAKLYYAE >gi|229784113|gb|GG667622.1| GENE 51 49014 - 49307 221 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620350|ref|ZP_06113285.1| ## NR: gi|266620350|ref|ZP_06113285.1| putative ABC transporter, permease protein [Clostridium hathewayi DSM 13479] putative ABC transporter, permease protein [Clostridium hathewayi DSM 13479] # 1 96 1 96 96 153 100.0 3e-36 MRNSTQSGRLARKVYLFLLPSLAGTAVFVLLPYVDVIRRSFFEAAGGRFVAMQNYLTVMG NRAFRLASFNTLRFLVICVPLLVVVSLFCSMMIAGLN >gi|229784113|gb|GG667622.1| GENE 52 50280 - 50798 507 172 aa, chain + ## HITS:1 COG:CAC0665 KEGG:ns NR:ns ## COG: CAC0665 COG1175 # Protein_GI_number: 15893953 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Clostridium acetobutylicum # 1 149 118 266 289 136 45.0 2e-32 MASIVLLWKLFFHPQGLVNQITALFGMPGMNWIDGDTAFGVLVFTYVWKNLGYDVVLWVA GLAAIPDELYEAARVDGAGILARFLYVTLPGLSKTAVLVVSLSVFNSFKVFREAYLIAGE YPDESIYMLQHLFNHWFVTLDIQKMSAASVLLLLAVLLPVILKGLIPGRKEA >gi|229784113|gb|GG667622.1| GENE 53 50841 - 51716 827 291 aa, chain + ## HITS:1 COG:CAC0666 KEGG:ns NR:ns ## COG: CAC0666 COG0395 # Protein_GI_number: 15893954 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Clostridium acetobutylicum # 10 284 4 269 275 163 36.0 4e-40 MRWLRERLPEKKILPLVRRLILSVICLFIWLPLLMMAGNSFLSRGEMLSRFGPVLNGGTG NVKVTLFPDFPTLSPFVELLLDSPGFFVMFWNSCLQTGAVLFGQMVVAVPAAWAFAMYSF PGKKLMFMGYIILMILPFQVLMTPDYFVLNRLHLLDTHLAVILPGIFSTFPVFIMTQFFS SIPLSLLEAARLDGAGEGAVFIRIGIPMGFPGIMSAGILGFLESWNAIEAPMAFLKDRTL WPLSLYLPDITADKVSVAWVASVAAMAPPLFLFLAGQSFLEQGIMVSGIKE >gi|229784113|gb|GG667622.1| GENE 54 51721 - 52395 749 224 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620353|ref|ZP_06113288.1| ## NR: 
gi|266620353|ref|ZP_06113288.1| hypothetical protein CLOSTHATH_01436 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01436 [Clostridium hathewayi DSM 13479] # 1 222 1 222 222 413 100.0 1e-114 MERRKIRKWCRDGFLVFLGAMAVFTVLSRSLDSLTVPQVGTSYGKMGNVTYEMNGEGKFT ASGITYLNPEEGMKVQSVEKTAGQKVEAGEVLFTYQMEGIQKKREELALSVEKMKLELDQ SAVRSKQIPGISEETLAVQSLEAAKRALDFGYQDLEEIKRDSEEKLADLKRDYNQSMTRS EGVMKEGDPGNDFYQGVAAGNAPDCVTVSFTLMDKYANAGILTS >gi|229784113|gb|GG667622.1| GENE 55 53329 - 53820 533 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620354|ref|ZP_06113289.1| ## NR: gi|266620354|ref|ZP_06113289.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 163 5 167 167 273 99.0 2e-72 MAIGDGKLKFEGTVDKKLGELITAGTKISLQYGESRRACEAVVESVDFLSEEEQARFTAS AREDVGVLGATAAFSINLASRQYNQVIPIAGLRQESDGYYVLVVKPQKTILGEELTAEKV PVELLEKSSSQAAVEGAFGNTDRLIVSSSRIIEAGDRVRLSGE >gi|229784113|gb|GG667622.1| GENE 56 53828 - 54919 632 363 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620355|ref|ZP_06113290.1| ## NR: gi|266620355|ref|ZP_06113290.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 363 1 363 363 739 100.0 0 MNRLKRWSKTCRAARVKTVCLAMLLSAALWGISLWQGEKTAPYAGDIQIRYHGQGISRQT LEQITEAGLKQEKKTFPETAAWSVEYDVEARNPVLNFKQDVTCIMVRGEMDQIVKDRLAA GTYGFGDDDEGCVVSSKTAWELFGSTNVTGSWITVKGQRFVIRGVTAASYPMVILPAKRL ETGFFSNISFSYSGQEGLEGQAEEMIMRFGLPGHGIRINGSMYYAVVRFFYTLPGWALLL SFGLFARRVKMGKKRLWNVFSFAVTAGIAAVLLWYGCRFPGDFIPSRWSDFSFYAEKFRQ VRENLFDIATLPKTRWDVEMIRAFKASVFCSAGAALSIITGNLLVSIDNHTFSHVFLKSK DKV >gi|229784113|gb|GG667622.1| GENE 57 54864 - 55625 558 253 aa, chain - ## HITS:1 COG:BH4045 KEGG:ns NR:ns ## COG: BH4045 COG0534 # Protein_GI_number: 15616607 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 1 252 194 446 447 203 47.0 2e-52 
MAGAAIATVTGYMIPAVFGLIYFAVSRRALWLVSPRLSAAELKETLINGSSEMVTNLSSG IITFLFNLLMMHYAGEDGVAAITIIQYSQFLLNALYMGFSQGVSPVISYNYGSRNTKQLK QVFKTSFLFTGVTSILVFLTAQLGGGIIVTIFAHPGTAVYDLARHGFMVFAFGYLFSGVN IFSSALFTALSNGKVSAVISFVRTLCLIVLCLLFLPLIMGLDGVWLAVPAAEAGGFILCL YFLKRHAKTYGYR >gi|229784113|gb|GG667622.1| GENE 58 56688 - 57521 1132 277 aa, chain + ## HITS:1 COG:BS_bltR_1 KEGG:ns NR:ns ## COG: BS_bltR_1 COG0789 # Protein_GI_number: 16079711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 123 1 122 124 108 45.0 1e-23 MEKQSELYFTTGEFAKILGVKKHTLFHYDEIGLFSPAVKDEENHYRYYFVWQMDTFEVIR ALQKLGMPLGEIREYMETRSPEHFLALVDQKEAEIDQEIERLKNMKRFMEWERTTIREAL EAEVDKPMLAERGQEYLFSSEVNLTGERKLAEEIAHHVRVCEQYHISMNAVGAVCLGKDL EAGKYDRYRSVYTRLDKRIPALKPYVKPAGTYVELYYRGYDGDMGKAYPVIRAFAEEQGI SLDEMWYEDLLLDELTVRDTADYVVKVAVRVKSTFAF >gi|229784113|gb|GG667622.1| GENE 59 57606 - 57821 112 71 aa, chain + ## HITS:1 COG:no KEGG:Bcell_0385 NR:ns ## KEGG: Bcell_0385 # Name: not_defined # Def: xylan 1,4-beta-xylosidase (EC:3.2.1.37) # Organism: B.cellulosilyticus # Pathway: not_defined # 1 71 361 431 472 89 56.0 5e-17 MDCRFPKITQDGCDGSETQGFIANLWDGAAAGFKYFSCHGIKQVTVTTRGLGSGVIELLT AWDGKPVGEIP >gi|229784113|gb|GG667622.1| GENE 60 58778 - 60364 1427 528 aa, chain + ## HITS:1 COG:SP1282 KEGG:ns NR:ns ## COG: SP1282 COG0488 # Protein_GI_number: 15901142 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Streptococcus pneumoniae TIGR4 # 1 512 12 511 519 326 38.0 8e-89 MLQIKQLTITHKKDLRVIVKDFSLVLNPGDKAVLIGEEGNGKSTLLKWIYNPELIEGYTE TQGERICTGEHPGYLPQELPAEEKEKTVYEYFCGEDAFGLQTPRELTHLAGQLKLPPDFF YRDQKMDTLSGGEKVKAQMARLLMARPTILLLDEPSNDIDIETLEWLEKLIMEAPQPVLF ISHDETLIERTANMVIHIEQLKRKTESRCSIAKTGYRQYAKNRAEQLASQEQRALVERRE EKKRQEKFRRIEQAVNHDLNSMTRQNPHGGYLLKKKMKAVKSLEKRYEREALDRTEIPET EDAIFFKLGEAVKMPAGKIVLEFSLDRLLTPVSGEEEPRLLAEHLFLRIKGGERVCITGK NGAGKTTLLKQMAEQLLSRTDVRACYMPQNYEDLLEPDRTPVDFLSVTGDKEETTRIRTY 
LGSMKYTVDEMNAPMSELSGGQKAKVLLLKLSLSGADVLILDEPTRNFSPLSGPVIRAIL RQYRGAILSISHDRKYIEEVCTAVYQLTPGGLVRMETEKQQCDRNGQD >gi|229784113|gb|GG667622.1| GENE 61 60532 - 61317 995 261 aa, chain + ## HITS:1 COG:MA0958 KEGG:ns NR:ns ## COG: MA0958 COG1712 # Protein_GI_number: 20089836 # Func_class: R General function prediction only # Function: Predicted dinucleotide-utilizing enzyme # Organism: Methanosarcina acetivorans str.C2A # 4 259 2 267 271 117 33.0 3e-26 MSRLKLGILGSGYLGGIIADAWKNGYLDEYELVGIAGRSEEKTKALAERAGCRPCADVDE LLALKPDYIAEAASVAAVKGMAVKALSAGTSIIVLSIGAFADQDFYEEVKRTAMKHGTKV HIASGAVGGFDVLRTVSLMGDAVSGIETRKGPKSLMNTPLFEEHLMTDTEETRVFTGNAK EAIALLPTKVNVAVAASLATTGPEQTKVTIHSIPGFVGDDHKITAEIPGVKAVVDIYSAD SAIAGWSVVAVLRNLVSPIVF >gi|229784113|gb|GG667622.1| GENE 62 61331 - 62236 1086 301 aa, chain + ## HITS:1 COG:PH0520 KEGG:ns NR:ns ## COG: PH0520 COG1052 # Protein_GI_number: 14590422 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Pyrococcus horikoshii # 11 227 9 237 333 71 29.0 2e-12 MFKKLVAIEPVSLVEEAEKELRQYAEEVVMYGDVPADDDEIIRRIGDADGVLLSYTSKIN SCVFEHCPGIRYVGMCCSLYSKESANVDIAWAEEHGVTVKGIRDYGDRGVVEYVICELVR FLHGYDRPMWKEMPVEITGLKVGMAGMGVSGGMIAEAMKFMGAEVSYFSRSRKPDYEAKG MVYRPLTELLEQSEVVFACLNKNVILMHEEEFKALGNGKILFNTSIGPAFEAEDLKNWLD GGNNYFACDTAGAIGDPTGELLNHPRVFCTNYSAGRTKQAFGLLSEKVLYNIREYLGNAN A >gi|229784113|gb|GG667622.1| GENE 63 62828 - 63271 690 147 aa, chain + ## HITS:1 COG:BH2785 KEGG:ns NR:ns ## COG: BH2785 COG0781 # Protein_GI_number: 15615348 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Bacillus halodurans # 39 145 21 124 134 75 38.0 3e-14 MTRSKLREHCFKMLFCADFYPAEEKEEQIERYFDEPKEDETTPEGVEEILHDVEMSPEEE SYLKTKTEAVMRKIPELDEKIDAVAEGWKTKRMGKAELTILRLALYEILYDEEVPEKVAI NEAVELAKRFGGNEAPAFINGVLAKLV >gi|229784113|gb|GG667622.1| GENE 64 63278 - 64555 1306 425 aa, chain + ## HITS:1 COG:BH2783 KEGG:ns NR:ns ## COG: 
BH2783 COG1570 # Protein_GI_number: 15615346 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Bacillus halodurans # 4 411 7 439 458 300 41.0 3e-81 MASVYSVSQVNAYIKNMFAQDFALNRISIKGEVSNCKYHTSGHIYFTLKDRGAAIACVMF AGQRKGLSFKLEEGQKVVAKGSVEVYERDGRYQLYAQEITKEGVGDLFERFQKLRDELEE MGMFSPEYKQPIPRYAGKVGIVTAQTGAAIRDIMNISARRNPYVQLYLYPALVQGENAKY SIVKGIETLDRLGLDVLIVGRGGGSIEDLWAFNEEIVARAIFACRTPVISAVGHETDVTI ADYVADMRAPTPSAAAELAVFDYSQFEGQVELYRDALGKSMARKLERYRYQTEQFRLRFQ LFDPRRQVREQRQRAVDLEEKLKNMMASRIVTDKNRAAEAESRLRQLIEKQAVRDRHRLE LISSRLDGLSPLKKIGGGYGFLTDGKNRRIESVDQVKQGDPIKVRIRDGRIDAVVSDTAA ADLKQ >gi|229784113|gb|GG667622.1| GENE 65 64572 - 64784 310 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620366|ref|ZP_06113301.1| ## NR: gi|266620366|ref|ZP_06113301.1| exodeoxyribonuclease VII, small subunit [Clostridium hathewayi DSM 13479] exodeoxyribonuclease VII, small subunit [Clostridium hathewayi DSM 13479] # 1 70 1 70 70 69 100.0 9e-11 MAAKKVKSVEETFQELEEIIRKLESGESSLEESFQYYEAGMKLVKSCNEKIDKVEKQIIV LEEDGESHEL >gi|229784113|gb|GG667622.1| GENE 66 64774 - 65661 1157 295 aa, chain + ## HITS:1 COG:CAC2080 KEGG:ns NR:ns ## COG: CAC2080 COG0142 # Protein_GI_number: 15895350 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Clostridium acetobutylicum # 12 295 8 289 289 219 40.0 4e-57 MNFNEMLETRTREVEAVVERYLPAADGYQKTVLDAMNYSVRAGGKRLRPMLMEETYRLFG GSGTVVEPFMAAIEMIHTSSLIHDDLPCMDNDELRRGLPTTWVKYGYDMAVLAGDSLLIY AVETAAKAFALTEDAAVVGRCIGILAQKTGIYGMIGGQTVDVELTNKPVPHEKLDFIYHL KTGALLESSMMIGALLAGADEEETRLVEQMAAAIGLAFQIQDDILDVTSSMEVLGKPVLS DEKNNKTTYVTLEGLEKAKQDVASISDDAVRILHELPGENEFLEALIHMLVSREK >gi|229784113|gb|GG667622.1| GENE 67 65678 - 66616 986 312 aa, chain + ## HITS:1 COG:CAC2077 KEGG:ns NR:ns ## COG: CAC2077 COG1154 # Protein_GI_number: 15895347 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: 
Clostridium acetobutylicum # 2 307 4 308 619 342 57.0 5e-94 MILELINGPEDIKKLTGKELDILRQEIRDFLIGKISRTGGHLASNLGVVELTMAIYLVFD LPKDKIIWDVGHQSYTHKILSGRKGEFDDLRQYGGMSGFPKRKESPCDAFDTGHSSTSIS AGLGLAQARDVSGEDHFVVSVIGDGALTGGMAYEALNNAARIKKNFIIILNDNNMSISEN VGGMSRYLNGIRTGDGYLDLKKYVTNVLSRIPVIGDELIDKISRTKNGIKQLLIPGMLFE NMGITYLGPVDGHDVKALSRALKEAKKLDHAVLVHVITKKGKGYEPAEKNPARFHGVEPF DVVTGESKKEKN >gi|229784113|gb|GG667622.1| GENE 68 66699 - 67595 1067 298 aa, chain + ## HITS:1 COG:CAC2077 KEGG:ns NR:ns ## COG: CAC2077 COG1154 # Protein_GI_number: 15895347 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Clostridium acetobutylicum # 1 274 341 615 619 278 48.0 7e-75 MPDGTGLKRFSRLYPDRFFDVGIAEEHAVTSAAGMAAGGLKPVVAVYSSFLQRGFDQILH DVCIQNLPVVFAVDRAGLVGSDGETHQGIFDLSFLSAIPNMSIFAPKNMWELKAGLEFAV SFGGPFAIRYPRGEAYRGLKEFHAPVEYGRGEMLYEEKDIALLAVGSMVSTGEHVREKLK AEGWNCTLANGRFVKPFDEELVDRLAKNHWLVVTMEENVLQGGYGLMVTRYIHKHYPHVK VMNIAIPDGYVEHGNVSLLRKGLGIDSDSVIWRMKKEYLDTERQNMEWKTINTEEKKA >gi|229784113|gb|GG667622.1| GENE 69 67592 - 68407 1011 271 aa, chain + ## HITS:1 COG:CAC2076 KEGG:ns NR:ns ## COG: CAC2076 COG1189 # Protein_GI_number: 15895346 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Clostridium acetobutylicum # 2 267 5 266 267 297 57.0 1e-80 MKERLDVLLVKRNLAESREKAKAVIMSGIVYVEGQKEDKAGTTFDESVNIEVRGHTLKYV SRGGLKLEKAMSHFGVMLEGKTCMDVGSSTGGFTDCMLQNGAVKVYAVDVGHGQLAWKLR NDERVVCMEKTNIRYVTPDEIQEKIDFSSIDVSFISLTKVLEPVKRLLKEEGEIVCLIKP QFEAGREKVGKKGVVREKSVHLEVIEMVIAYAVSIGFEVLNLEYSPIKGPEGNIEYLLYL KNHPEGELIPDVPVAPKDIVEAAHQELSEGK >gi|229784113|gb|GG667622.1| GENE 70 68438 - 69271 774 277 aa, chain + ## HITS:1 COG:NMA1017 KEGG:ns NR:ns ## COG: NMA1017 COG0061 # Protein_GI_number: 15793973 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Neisseria meningitidis Z2491 # 3 274 14 289 296 150 32.0 2e-36 
MNPEKQGAKETAQVIRDYLTSHGASCLIGKGDGRRGGDYHYTDASMVPPETECIITLGGD GTLIQAARDLAGRNIPMLGINRGTLGYLTQISRTEEIDTALDALLADQYQLEERMMLNGR AYSSTGRLYEDIALNEIVITRNERLKMLHFRVYVNHEFLNEYRADGLIAATPTGSTAYNL SAGGPIIVPDSTLMVLTPICSHALNARSIVMSGDARIRIEILGDPGTSQAAVYDGDTAAE LHSGDYIEIHRSETKTVLIKLKDVSFLDNLRNKMAGI >gi|229784113|gb|GG667622.1| GENE 71 69283 - 69516 389 77 aa, chain + ## HITS:1 COG:CAC2074 KEGG:ns NR:ns ## COG: CAC2074 COG1438 # Protein_GI_number: 15895344 # Func_class: K Transcription # Function: Arginine repressor # Organism: Clostridium acetobutylicum # 1 76 1 77 150 69 54.0 1e-12 MKLERHSKIVELIGKYEIETQEELADYLNQAGFAVTQATVSRDIRELKLTKVQSESGKQR YMVLQNQSSFSDKYITS >gi|229784113|gb|GG667622.1| GENE 72 70464 - 70661 215 65 aa, chain + ## HITS:1 COG:no KEGG:Closa_3233 NR:ns ## KEGG: Closa_3233 # Name: not_defined # Def: arginine repressor, ArgR # Organism: C.saccharolyticum # Pathway: not_defined # 1 65 85 149 149 92 90.0 6e-18 MDMAQNILVIKTVSGMAMAVAAALDALHFNEIVGCIAGDDTIMCAVRSADDTILVMDKLK KLIAG >gi|229784113|gb|GG667622.1| GENE 73 70674 - 72338 1978 554 aa, chain + ## HITS:1 COG:BH2776 KEGG:ns NR:ns ## COG: BH2776 COG0497 # Protein_GI_number: 15615339 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Bacillus halodurans # 1 548 1 558 565 342 36.0 1e-93 MLSELHVKNLALIEKADIEFGEGLNILTGETGAGKSIIIGSVTMALGGKVQKDMIRRGTE YAYVELLFTVTEPDKLRALKEMDVCPDEGGIVIISRKIMPARSLCKINDETVTVGKLKTI TGLLIDIHGQHEHQSLLYKHKHLEILDKYHERTSHDVKQNIAGLYRTYTGLKERLASFQL DEESRLREMDFLRFEIDEIEGANVKDGEEEELTARYRLLTNARRITESLSAAYAAVNTDT LDYALKEVEHVAPFDEKLGAIRDQLFDADAILSDISHEISSYMEDMSFDEEEYRELEERL DVIHNLEAKYGKTAEQIQENLENKRKRLEELEDYDSLKQKTERELRETENRLEALCGQLS GMRKAAAKELTAKIKAGLTDLNFIDVEFSMEFTRLNHFTANGYDEAEFMISTNPGESVKP LGTVASGGELSRIMLAIKTVLADSDDIPTLIFDEIDTGISGRTAQMVSEKLSYIAKSHQV ICITHLPQIAAMADSHYEIAKTAADGRTTTTIRPLGEAEMVEELARLLGGAKITDAVREN AREMKRLAGERKRP >gi|229784113|gb|GG667622.1| GENE 74 72463 - 73254 837 263 aa, chain - ## HITS:1 COG:BH2137 KEGG:ns NR:ns ## COG: BH2137 COG1414 # 
Protein_GI_number: 15614700 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 15 262 3 249 251 148 35.0 1e-35 MEETEQTNRFAKDNQSTDKVLAIIELLSYSKEPMRLIDIANQLHYNTSTTLRFLNSLERN GYVYKDRETLKYQMTYKLCGLASYISNRTGIVEIASLPMKQLSTSLGECVCLAVEQDYTV VYVHVADGPGQMLRTTQRIGSQAPMHCTGVGKVILSEFPESKIDEMIRMKGLTRYTDNTL VTREMLLGELQLIRKRGYAYDNEECELGARCIAFPLRDYSGKIIAALSVTGPLGRMTDEF INSSFRTILETVQDISYQLGYHG >gi|229784113|gb|GG667622.1| GENE 75 73533 - 73847 457 104 aa, chain + ## HITS:1 COG:CAC0089 KEGG:ns NR:ns ## COG: CAC0089 COG0111 # Protein_GI_number: 15893385 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 104 1 104 318 91 41.0 4e-19 MNISLLEPIGVPEEMIESLAEGLKKQGHVFTYYDTKTTDVEELKRRSAGQEIVMIANNPY PDEVVRSCDSLKAIAVAFTGIDHVGLAACREKKIDILNCAGYSS >gi|229784113|gb|GG667622.1| GENE 76 74827 - 75297 400 156 aa, chain + ## HITS:1 COG:CAC0089 KEGG:ns NR:ns ## COG: CAC0089 COG0111 # Protein_GI_number: 15893385 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 4 155 166 317 318 170 54.0 1e-42 MFHAFGAEVLAYARHEKEEWKEAGIRYADMDTLLKESDIVSLHLPLNEGTKGFFDGTMIG KMKKDAILINCARGPIVDNAALAEALNEDKIAGAAIDVFDMEPPIPADYPLCHAKNILLT PHVAFATKEAMVRRAKIEFDNVYAYLNGKPENLCKI >gi|229784113|gb|GG667622.1| GENE 77 75381 - 76262 640 293 aa, chain - ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 33 285 29 288 292 102 28.0 1e-21 MIETLNKHHETVNYKWVDGVRLYHNIKTDSFPPHWHKEAEIIMPLSNSYGVRVGLSGELN SIEEDEIIIIPPGEMHTIYTPPTGARLIVDVDFTPYMNLRGVGSLINSLSPYRIIRKEGN PELSRTLRNYLLLIEEEYNKKDLFCDASIHSLFTSFFVTLGRQDINNLQTDRYKPEQKLQ 
YAEKFADISYYINLHCTEPLRLEDIASYTGFSKFHFCRLFKEFVGCTFHDYVTGCRLAHA QSLLASAATTITEAALQSGFTSLATFNRIFKEKKRCTPTEYRKLLRSSSAGEK >gi|229784113|gb|GG667622.1| GENE 78 76525 - 78699 1862 724 aa, chain + ## HITS:1 COG:XF0845 KEGG:ns NR:ns ## COG: XF0845 COG1472 # Protein_GI_number: 15837447 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Xylella fastidiosa 9a5c # 17 423 32 435 882 415 52.0 1e-115 MKETSEKETALDRQRREKAEYLVKQMTLEEKVFQTMNQAPAIERLGIKAYNWWNEGLHGV ARAGVATIFPQAIGLAATFDEDLIETVGEAVSTEARAKYHMQQRYGDTDIYKGLTLWAPN INIFRDPRWGRGHETYGEDPWLTSRLGIRYIRGLQGSHEKYLKTAACVKHFAVHSGPEEL RHSFDAEVSEKDLRETYLPAFEACVKDGDVEAVMGAYNRVNGVPCCGNEYLLETILRKEW GFHGHVVSDCWAIKDFHEGHGVTDSPVESVSMAMNHGCDLNCGNLFTYLIQAVKEGKVKE ERLDEAVIRLFTTRLKLGALGKMEEDDPYAGISYLEVDSPAMKKLNRSAAGKSVVLLKNT EGLLPIDTKRYKTIGVIGPNADSRRALVGNYEGTASEYVTVLEGIREAAEPEARVLYSEG CHLYKSNVSGLGARNDRLSEVKGICRESDIVIACMGLDSTLEGEQGDTGNIYAGGDKPDL MLPGLQQKILETAYDSGKPVVLVLLAGSAMAVTWADEHLPAILTAWYPGAEGGRGVADVL FGTVNPEGRLPVTFYRTTEELPDFTNYSMEGRTYRFMKQKALYPFGFGLSYTEFSCSGLE VSERDSVDNGVEVKLCVANCGERWGRETIQVYVGRKEDHDRNPQLKAAVKVGLEPGEEKT ISIHLPADAFAVYDENGKRYIDACTYQIFAGGSQPDQRSAELKKQRVLCAEVRSVRKYCI DETI >gi|229784113|gb|GG667622.1| GENE 79 78718 - 79632 988 304 aa, chain + ## HITS:1 COG:BH0484 KEGG:ns NR:ns ## COG: BH0484 COG4209 # Protein_GI_number: 15613047 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Bacillus halodurans # 11 303 32 324 325 236 42.0 4e-62 MEKKKSIWYYLYRDKWLYLMLVPVILYYFIFKYLPMGGIAMAFQDFNMFKGIFGSKFAGL SVFKKIFAQPQFWNSVKNTLILNLLTLVVSFPFTIILSLILNEIRCTWFKRLSQSLLYLP HFISWVVVAGIAVNLFSLNGGTVNNVLNVMGIKSIPFLSDKYWWIFTYVICSVWKEIGWG TIIYLAALTGVDESLYEAAYLDGATRMQRIIYVTLPMVKPVIVTMLILSVSKMMTIGLDA PLLLGNSKVMEVSEVLSTYVYRLGIEKAQYSPATAVGLFQSVVNIAILFMADRFAKAIGE EGIL >gi|229784113|gb|GG667622.1| GENE 80 79644 - 80510 883 288 aa, chain + ## HITS:1 COG:BS_ytcP KEGG:ns NR:ns ## COG: BS_ytcP COG0395 # 
Protein_GI_number: 16080069 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 10 288 7 286 286 227 46.0 2e-59 MKKKKIAFADVMLYLILTVVVLICLYPFLNVVAYSLSGNTAVLSGKVTFYPIDFQLSAYK EILLKQTQIWTAMGISVTVTVLGTLLGLVLTVAAAYALSKEKLKGRGILSGFILFTMYFS GGIIPTFLVVKGVGLYDSISALVIPSAMNVFNFIVMRTFFKELPLELEEAARIDGANDMK ILFKIALPLSMPIIATIGLFYAVSYWNDYFSALLYIQTPEKFSLQLRLRQLLFAGQINQV SMENLGTQVMSESLKMASIVISTIPIILVYPWLQKYFVKGVMLGSVKG >gi|229784113|gb|GG667622.1| GENE 81 80540 - 81151 731 203 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_5258 NR:ns ## KEGG: Pjdr2_5258 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: Paenibacillus # Pathway: not_defined # 53 203 42 189 539 110 42.0 3e-23 MGKKRRAAAVLTAAMAMAFVVSGCGAREEAAPKGESGQTKEAGDTKTEAAGNQTTYGSKQ FDNVTLTVELFDRSNAPQGTTVTDNRWVKYAKEEMAKVGINLEFVAVPRSEEVTKIQTMM AAGTAPDIALTYTRSIAEDYFNQGGTYDLSSYVDGEGQAENLKAYLGEDVLNVARDGDGQ LWAIAARRATAAKDNLFIRKDWL >gi|229784113|gb|GG667622.1| GENE 82 82083 - 82791 768 236 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_5258 NR:ns ## KEGG: Pjdr2_5258 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: Paenibacillus # Pathway: not_defined # 2 235 207 440 539 95 31.0 2e-18 MLKAFKENNPDGRTDVIPFFSVAIGDETADRADVMAMPFMTTLPDEHEFNIKSVFPVYGD EGYADYLRFLNKLYNEELLDQEYYTSNDLSATLAEYVVNGQAGCFVTNVNGNVDNLRGGL LQHLKVNNPDADIVSLPPLKNNHDGEIYNIEYAQNGAYCIVPKTCKNPEAAVTYMDWMAT QEGGFTLFHGFEDEHYKLEDGVPVVIDADFNAVDKDWIRHDMFIIGNQGYFFSEDD Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:02:11 2011 Seq name: gi|229784112|gb|GG667623.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld16, whole genome shotgun sequence Length of sequence - 52499 bp Number of predicted genes - 47, with homology - 46 Number of transcription units - 25, operones - 11 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 501 309 ## gi|266620382|ref|ZP_06113317.1| hypothetical protein CLOSTHATH_01467
+ Term 509 - 540 4.1
- Term 497 - 528 4.1
2 2 Tu 1 . - CDS 575 - 1759 983 ## COG0726 Predicted xylanase/chitin deacetylase
- Prom 1904 - 1963 7.3
- Term 2078 - 2133 5.8
3 3 Tu 1 . - CDS 2175 - 2555 326 ## Closa_4143 hypothetical protein
- Prom 2711 - 2770 6.5
+ Prom 2560 - 2619 7.8
4 4 Tu 1 . + CDS 2726 - 3319 617 ## Ping_2639 P-loop ATPase
- Term 3330 - 3356 0.1
5 5 Op 1 . - CDS 3443 - 4918 1614 ## COG1012 NAD-dependent aldehyde dehydrogenases
6 5 Op 2 . - CDS 4906 - 5043 66 ## gi|288870213|ref|ZP_06409669.1| Yop Translocation L
- Prom 5063 - 5122 9.4
- Term 5177 - 5231 10.6
7 6 Tu 1 . - CDS 5246 - 5458 240 ## COG2155 Uncharacterized conserved protein
- Prom 5588 - 5647 5.2
8 7 Op 1 2/0.000 - CDS 5650 - 7155 1582 ## COG0714 MoxR-like ATPases
9 7 Op 2 . - CDS 7152 - 8471 1110 ## COG3864 Uncharacterized protein conserved in bacteria
10 7 Op 3 . - CDS 8509 - 9537 807 ## Closa_4147 hypothetical protein
- Prom 9603 - 9662 7.1
+ Prom 9882 - 9941 4.0
11 8 Tu 1 . + CDS 9977 - 11878 1829 ## COG0441 Threonyl-tRNA synthetase
+ Term 11883 - 11936 8.1
- Term 11875 - 11921 0.1
12 9 Tu 1 . - CDS 11927 - 12112 237 ## gi|266620392|ref|ZP_06113327.1| choline binding protein
- Prom 12214 - 12273 80.4
13 10 Op 1 . - CDS 13121 - 13513 410 ## gi|288870214|ref|ZP_06113328.2| hypothetical protein CLOSTHATH_01480
- Prom 13534 - 13593 3.4
14 10 Op 2 . - CDS 13613 - 14758 1172 ## COG3858 Predicted glycosyl hydrolase
- Prom 14915 - 14974 4.2
+ Prom 14860 - 14919 5.3
15 11 Tu 1 . + CDS 15105 - 17360 1496 ## COG3250 Beta-galactosidase/beta-glucuronidase
+ Term 17424 - 17481 16.1
16 12 Op 1 .
- CDS 17512 - 19863 2386 ## COG1472 Beta-glucosidase-related glycosidases
17 12 Op 2 38/0.000 - CDS 19877 - 20710 832 ## COG0395 ABC-type sugar transport system, permease component
18 12 Op 3 35/0.000 - CDS 20724 - 21629 1024 ## COG1175 ABC-type sugar transport systems, permease components
- Prom 21685 - 21744 2.6
- Term 21669 - 21737 10.5
19 12 Op 4 1/0.125 - CDS 21747 - 23129 1474 ## COG1653 ABC-type sugar transport system, periplasmic component
- Prom 23192 - 23251 6.7
20 13 Op 1 7/0.000 - CDS 23261 - 25132 1545 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
21 13 Op 2 . - CDS 25129 - 26646 1482 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
+ Prom 26641 - 26700 3.6
22 14 Tu 1 . + CDS 26748 - 27098 158 ## COG2207 AraC-type DNA-binding domain-containing proteins
+ Term 27154 - 27182 0.3
- Term 26890 - 26930 1.1
23 15 Tu 1 . - CDS 27172 - 27519 334 ## gi|266620403|ref|ZP_06113338.1| toxin-antitoxin system, antitoxin component, Xre family
- Prom 27653 - 27712 6.1
+ Prom 27517 - 27576 6.8
24 16 Op 1 40/0.000 + CDS 27686 - 28381 866 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
25 16 Op 2 . + CDS 28390 - 30669 2239 ## COG0642 Signal transduction histidine kinase
- Term 30646 - 30680 1.1
26 17 Op 1 . - CDS 30788 - 32149 1367 ## COG0534 Na+-driven multidrug efflux pump
27 17 Op 2 . - CDS 32170 - 33711 1510 ## COG3409 Putative peptidoglycan-binding domain-containing protein
- Prom 33740 - 33799 6.6
+ Prom 33815 - 33874 5.7
28 18 Tu 1 . + CDS 33916 - 34725 759 ## COG0428 Predicted divalent heavy-metal cations transporter
+ Term 34752 - 34812 13.7
- Term 34745 - 34793 2.3
29 19 Op 1 .
- CDS 34853 - 36052 708 ## COG0006 Xaa-Pro aminopeptidase
30 19 Op 2 38/0.000 - CDS 36070 - 36897 549 ## COG0395 ABC-type sugar transport system, permease component
31 19 Op 3 35/0.000 - CDS 36894 - 37754 557 ## COG1175 ABC-type sugar transport systems, permease components
32 19 Op 4 . - CDS 37792 - 39129 1089 ## COG1653 ABC-type sugar transport system, periplasmic component
- Prom 39237 - 39296 7.5
+ Prom 39188 - 39247 9.2
33 20 Op 1 7/0.000 + CDS 39338 - 41131 929 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
34 20 Op 2 . + CDS 41138 - 42181 524 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
+ Prom 42191 - 42250 5.0
35 21 Tu 1 . + CDS 42296 - 43099 622 ## COG3393 Predicted acetyltransferase
- Term 43137 - 43175 2.1
36 22 Op 1 42/0.000 - CDS 43256 - 43672 601 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit)
37 22 Op 2 42/0.000 - CDS 43685 - 45091 1659 ## COG0055 F0F1-type ATP synthase, beta subunit
38 22 Op 3 42/0.000 - CDS 45131 - 46030 943 ## COG0224 F0F1-type ATP synthase, gamma subunit
39 22 Op 4 . - CDS 46089 - 47606 1659 ## COG0056 F0F1-type ATP synthase, alpha subunit
40 22 Op 5 . - CDS 47612 - 48115 506 ## Closa_4167 ATP synthase F1, delta subunit
41 22 Op 6 . - CDS 48096 - 48596 574 ## COG0711 F0F1-type ATP synthase, subunit b
42 22 Op 7 . - CDS 48648 - 48866 377 ## EUBREC_2902 hypothetical protein
43 22 Op 8 . - CDS 48920 - 49615 709 ## COG0356 F0F1-type ATP synthase, subunit a
- Prom 49680 - 49739 9.4
- Term 49974 - 50017 6.8
44 23 Op 1 . - CDS 50095 - 50214 209 ##
45 23 Op 2 . - CDS 50186 - 50392 150 ## gi|288870220|ref|ZP_06409672.1| putative Rhs element Vgr protein
- Prom 50484 - 50543 2.9
+ Prom 50900 - 50959 5.0
46 24 Tu 1 . + CDS 50990 - 51685 742 ## COG2267 Lysophospholipase
+ Term 51801 - 51838 8.1
- Term 51783 - 51829 5.6
47 25 Tu 1 .
- CDS 51830 - 52498 523 ## EUBELI_01420 hypothetical protein Predicted protein(s) >gi|229784112|gb|GG667623.1| GENE 1 1 - 501 309 166 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620382|ref|ZP_06113317.1| ## NR: gi|266620382|ref|ZP_06113317.1| hypothetical protein CLOSTHATH_01467 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01467 [Clostridium hathewayi DSM 13479] # 1 166 1 166 166 317 100.0 2e-85 EFLRQFKGYETTYDQDICYNITPKDITDRYDFCIFKYDTSCGSFLSYDKEVYPLGIWFGG YGVTSFAVSDLNQDGYFELFFTYSWGSGAHRSLVGYFDSATKETILPDFIYWGNDMVLNT DSNGILGIYHADCDIESFVDIEMEAKDRLASIVWESQEISVVEETE >gi|229784112|gb|GG667623.1| GENE 2 575 - 1759 983 394 aa, chain - ## HITS:1 COG:BS_yjeA_2 KEGG:ns NR:ns ## COG: BS_yjeA_2 COG0726 # Protein_GI_number: 16078275 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus subtilis # 176 376 16 207 217 165 43.0 1e-40 MKRLHRKIKALALVFLVFAGTIQAPVQAISARAEEPVKISYQAFNYPTAWSDKQADDTPY TAPEGTYMTGFQAVLLNKPQNVTGGLEYQVEVSGSGWTEWTAGAVMAGHTEGEAPAAALC MRLTGGLKDLYDVCYSIQQNGTWSPWVRNDEGTAGQEGSGLKITGMRMTVITKGAAIPGE KAEKTGIDPNRPMVALTFDDGPHAPVTNRILDSLEANGGCATFFMVGNRVKGKANTAAVA RMTALGCEVANHTYEHKAITKLSAAGIQSQLEQTNDHIRMAGGVSPVLMRPPGGAKNDVS MKTVGSMGMSAVLWSIDTLDWKTKNKQKTIDAVIGHVKDGDIILMHDIYGPTADAAEVII PKLKAMGFQLVTVSEMASYRGGIQPGKVYSRFRP >gi|229784112|gb|GG667623.1| GENE 3 2175 - 2555 326 126 aa, chain - ## HITS:1 COG:no KEGG:Closa_4143 NR:ns ## KEGG: Closa_4143 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 126 1 126 126 191 70.0 1e-47 MFFVFGISQGKKMLDYAKTVICAQCGGYGRYQVFMTYSYFSLFFIPIIKWNRHYYVQMSC CSTVYELDPEVGKRLGHGEQADITETDLTLVQAGRRTSGWNRKHCEACGYETEEDFEYCP KCGRKF >gi|229784112|gb|GG667623.1| GENE 4 2726 - 3319 617 197 aa, chain + ## HITS:1 COG:no KEGG:Ping_2639 NR:ns ## KEGG: Ping_2639 # Name: not_defined # Def: P-loop ATPase # Organism: P.ingrahamii # Pathway: not_defined # 1 194 1 193 196 261 67.0 1e-68 
MKKMILVTSPPACGKTYISKQLAKNLKHVVYLDKDTLIVLSNRIFDVAGEERNRSSDFFE SNIRNFEYEAIINIAMEALDYDDIVLINAPFTREIRSPEYMDQFRKRLAAKNARLEIIWV VTDPEICHQRMIRRNSSRDTWKLENWDQYISGVDFSIPSAIDRPDSSEDDLLLFYNSTDQ EYQESMKRIVGILEEDC >gi|229784112|gb|GG667623.1| GENE 5 3443 - 4918 1614 491 aa, chain - ## HITS:1 COG:FN0454 KEGG:ns NR:ns ## COG: FN0454 COG1012 # Protein_GI_number: 19703789 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Fusobacterium nucleatum # 1 491 1 491 491 637 60.0 0 MENVARESYNLYINGQWVDASDGGTFKSYCPANGEYLSTCAEATREDVDAAVDAAWAAWE SWKKTTPAERADALLKIAGVIDQNKEKLALLETLDNGKPIRETMAIDIPYAADHFRYFAG AVRTEEGEAAMLDENTLSLILREPIGVVGQIVPWNFPFLMAAWKLAPVLAAGCCTVFKPS SYTSLSVLELAKLIDGILPPGVFNIVTGRGSKSGQYMLEHPGFRKLAFTGSSEVGCQVGC AASEKLIPSTLELGGKSANIFFPDCKWDMAMDGLQLGILFNQGQVCCAGSRVFVHEDIYD RFVEEAVAKFNSVKVGLPWEADTMMGSQIYEAHLNKILDYIEIGKKEGARVACGGERITE GELAKGCFLKPTLLVDVTNDMRVAQEEIFGPVAVVMKFKDEEEVIHMANESEYGLGGAVW TRDINRAVRVCRAIETGRMWVNTYNAIPAGAPFGGYKKSGIGRETHKVILEHYTQMKNIM INLSENPSGFY >gi|229784112|gb|GG667623.1| GENE 6 4906 - 5043 66 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870213|ref|ZP_06409669.1| ## NR: gi|288870213|ref|ZP_06409669.1| Yop Translocation L [Clostridium hathewayi DSM 13479] Yop Translocation L [Clostridium hathewayi DSM 13479] # 1 45 1 45 45 87 100.0 4e-16 MNKISIVAIVFANCYTENRTKECGRVAENALHQNFYQEGDMIWKM >gi|229784112|gb|GG667623.1| GENE 7 5246 - 5458 240 70 aa, chain - ## HITS:1 COG:CAC0976 KEGG:ns NR:ns ## COG: CAC0976 COG2155 # Protein_GI_number: 15894263 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 4 63 2 61 69 73 61.0 8e-14 MGNKALDYTALTIAIIGAVNWGLVGFFNFNLVSFLFGSMSWISRIVYALVGICGLYLLTF YGRSEVSTER >gi|229784112|gb|GG667623.1| GENE 8 5650 - 7155 1582 501 aa, chain - ## HITS:1 COG:DR1171 KEGG:ns NR:ns ## COG: DR1171 COG0714 # Protein_GI_number: 15806190 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: 
Deinococcus radiodurans # 18 238 3 212 340 101 27.0 4e-21 MNIKRAKEEIKDTIEAYLLKDEYGAYEIPSVRQRPVLLIGPPGVGKTQIMEQISRECGIG LVAYTITHHTRQSAIGLPFIVKKEYGGTPYSVTEYTMSEIVASIYNKIEQTGLSEGILFI DEINCVSETLAPAMLQFLQCKTFGNHQIPEGWIIAAAGNPPEYNKSVRDFDVVTLDRIKM IHVEPDYEVWKQYAYEQSIHPAIISYLNARPESFCRIETTVDGRLFATPRGWEDLSRLIE VYEKLGKKTDREVIGQYLQFPKIARDFANYLELYYKYQDDYQIDEILSGSVRETLSNKLK RAQFDEKLNVIGLLLSKLNGGFKQVVRDAEKLEILMAELKKLHDGSLSGDMEARLCTVRK EFKQAWEKKRQAGLLNRQSDYLHRDVEHLLEQYEKEIHFAETPDPEAAWETVKSYFGRER EVYETQFDESGAMLEHAFDFMEITFGDSQEMAVFITELNTNYYSIKFLQEYECGRYYQYN ERLLFNRREDEIKERIDHLGR >gi|229784112|gb|GG667623.1| GENE 9 7152 - 8471 1110 439 aa, chain - ## HITS:1 COG:DR1169 KEGG:ns NR:ns ## COG: DR1169 COG3864 # Protein_GI_number: 15806188 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 50 423 42 367 379 71 24.0 4e-12 MIDPLRQQQLCELEEVQICHDILTNSRNELYLNFRYLDVALSSLGFEADREGRGVATDGF LIYYQPEHLMSMYKRGRVLINRAFLHMVFHCLLSHLDTRGKRAVPYWNLACDIAIESIID EFYVKNVYRHQSVCRREVYAKLKEKLRVLTAEGIYKVLQEMNLSGDEYERIAAEFYVDNH DRWDEEEEPRIRQERQNIWQNNREKMQTEMESFGEKDAEDSRSLLEAVRVENRERYDYKQ FLRRFSVLREENEVDQDSFDYVFYTYGMELYGNMPLIEPLESKEVHRIEDFVIVVDTSMS CSGDLVRRFLEETYSVLCESESYFRKINVHIIQCDEMIRSDETITSQDEMKAYMENFTIS GLGGTDFRPAFEYVNGLMNQGKFKKLKGLLYFTDGKGIYPVRMPVYDTAFIFVEGQYEEV SVPAWAMKIILTEEEIYKL >gi|229784112|gb|GG667623.1| GENE 10 8509 - 9537 807 342 aa, chain - ## HITS:1 COG:no KEGG:Closa_4147 NR:ns ## KEGG: Closa_4147 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 342 1 325 325 405 59.0 1e-111 MFLYQIKNETIAILGCRSYDGLVEIPETIDGKPVTELAAYAFSEGWGRREMLKAAGTILL ADAEGKAKHGMEAGTETGTVEAEELPPVACCGHLKEIILPRTIKKIGNYAFYNCFELGTI RCWSAIEDVGSGLFTGCTSIRHMDIHIVLHRRSCLKEMISELRQELYVNYYVESEAPAKD PVLQAKLVFPEMFEESVENTPARIIMREMHGCGHMYRYCFEQTDFRFSKYDALFPHIKVQ EPEWVVCALVMGRLRYPAELMEQHRGAYEEYLKEHMGGAAEWALSEGDTETFFWLADRYG ERKETLDEMIELAGRKGETEALSGLMDLRHRRFPVKKRTFSL >gi|229784112|gb|GG667623.1| GENE 11 9977 - 11878 1829 
633 aa, chain + ## HITS:1 COG:CAC2362 KEGG:ns NR:ns ## COG: CAC2362 COG0441 # Protein_GI_number: 15895629 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 632 2 633 637 862 64.0 0 MKVTLKNEEVKEFDSPLSVYEIAASLSAGLAKKACAAEIDGTPADLRTVVASDCALTILT FDDEGGRAAFHHTTSHILAQAVKRLFPETKLAIGPSITDGFYYDFDRPEPFSPEDLEALE TEMKKIVKEDLKLEYFTLPRKEAIALMQEKDEPYKVELIEDLPEEEIISFYRQGEYTDLC AGPHLMTTGPVKAFKLTSIAGAYWRGDEHNKMLTRIYGTSFPKASDLEAYLTRIEEAKKR DHRKLGKELGLFTIMDEGPGFPFFLPKGMVLKNALTNFWREIHEREGYVEIATPIMLNRS LWETSGHWDHYKENMYTTTIDDTEFAIKPMNCPGSMLVYKSQPRSYRDLPLRMGELGLVH RHEKSGALHGLMRVRCFTQDDAHIYMTPEQITDEIKGVVRLIDEVYNMFGFKYHVELSTR PEDSMGSDEDWEMATNSLREALDEMGYDYIINEGDGAFYGPKIDFHLEDSLGRTWQCGTI QLDFQLPLRFEAEYTGADGEKHRPIMIHRVLLGSIERFIGILIEHYAGKFPAWLAPVQVK VLPISDKNAAYAKDVYNTLKKNHIRCEIDERPEKIGYKIREAQLEKVPYMLVAGAKEEEN HTVAVRYRDSGENVTMSVEDFAEMLKKEIEERK >gi|229784112|gb|GG667623.1| GENE 12 11927 - 12112 237 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620392|ref|ZP_06113327.1| ## NR: gi|266620392|ref|ZP_06113327.1| choline binding protein [Clostridium hathewayi DSM 13479] choline binding protein [Clostridium hathewayi DSM 13479] # 1 61 29 89 89 110 100.0 5e-23 MDKGKRYYFGEDGIMAANQWLKGWFCWYYAGPDGAMLTNTVTPDGYELDDTGAYYDPALS E >gi|229784112|gb|GG667623.1| GENE 13 13121 - 13513 410 130 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870214|ref|ZP_06113328.2| ## NR: gi|288870214|ref|ZP_06113328.2| hypothetical protein CLOSTHATH_01480 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01480 [Clostridium hathewayi DSM 13479] # 1 130 2 131 131 218 100.0 1e-55 MRKGKCLAALLITAALAMAPVNAYGFETSGEYFEISSEEAGAFLYDGTTVMAGTDGEMIL TKSESGWVRRAAIRTTDMNVITAEIVLNEEGLNMIRYHAKNAGKAQLIVVVPSGAAVNMG TFYVMNDRGD >gi|229784112|gb|GG667623.1| GENE 14 13613 - 14758 1172 381 aa, chain - ## HITS:1 COG:BH2292 KEGG:ns NR:ns ## COG: BH2292 COG3858 # Protein_GI_number: 15614855 # Func_class: R General function prediction only # Function: 
Predicted glycosyl hydrolase # Organism: Bacillus halodurans # 4 379 53 426 426 263 40.0 5e-70 MQIYVVKSGDNVDSIAAEAGIPVSTLLWDNQIVYPYRLAVGQALLLDNERNGKEKSPLHV NGYAYPFINPETLDWTRPYLTDLSVFSYGFTTEGDLVPPLADDSWMVTAAAAAGVRPVLT LTPLGEDGRFNNNLVTTLVRSPDVQKTLIWELGRTMTQKGYGGVDIDFEYVLAEDRVEYA EFVKLVRQVMNLFGYQVSVALAPKTARDQRGLLYEGIDYRLLGEAANRVLLMTYEWGYTY GPPMAVAPINMVRRVAEYALTEIPAEKISLGIPNYGYDWSLPYEQGVTKAETINNLQAIE TAIDHGVPIQFDETAMTPYFRYWQYGIQHEVWFEDVRSLKAKFDLIKEYGLSGAGYWQIM NFFRANWLLLADMFEINKVKK >gi|229784112|gb|GG667623.1| GENE 15 15105 - 17360 1496 751 aa, chain + ## HITS:1 COG:SMb21655 KEGG:ns NR:ns ## COG: SMb21655 COG3250 # Protein_GI_number: 16263752 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 1 739 2 740 755 575 41.0 1e-163 MNNPILLSNSWYFKPQFCQEDLSHFDADADCGGWQKVNLPHTVKELPFNCFSHDETAMIS SYRKDFTIPAEAAGKRIILRFDGVMAYYDLYCNGIRIGDHRGGYSISFMDITEAVRYGAV NHLFLKVDSTERNDIPPFGYIVDFLTYGGIYRDVYLYMTEQTYISNVMARYRLEDDWSLT LYPEVFFHSHETAGREASICCRLIRQDGTTVFSHTESCSVAEGESSFQMSPVSAGMVDLW DIDAPVLYTVETELLIDGHIVDSHSTKTGFRKTECTSHGFYLNGQKIKLRGVNRHQCYPY VGYAMGRGVQRKDAELIRYDMACNTVRTSHYMQSQYFLERCDEIGLLVFEEIPGWHYIGG EIYQNVVMNDVKMMITMDYNHPSIFIWGVRLNESNDCPELYTRTNELAKRLDPDRATTGV RYFKGSELKEDVYAINDFCHKGPKREDVLQSQRQVTGLDHDVPYLVSEFCGHVYPCKPWD NENVREEHARMHARVQSRAACTDNIMGALAWCAFDYQTHGDYGSGDKICYHGVMDMFRMP KFASYLYRSQKDPSEEVVLETTSVFSRGEKGDNKVAPVMIMTNCDFIEVELYGNSIGRFY PSNNYVGLPHPPIEISTHDSFWIEIWQGAVITGFIGNQKVAERRFVRDARLTTLEAAADQ SELSSEYADDTRVTLRFLDQEGNLLPYYPGIIKVACHGDIELRGPEIFPAMGGMSAFWIR TTAAGRPGTASVTIHTLNSDLPDQIVEFSIV >gi|229784112|gb|GG667623.1| GENE 16 17512 - 19863 2386 783 aa, chain - ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 3 694 29 720 793 455 37.0 1e-127 MNEKIKELLKSLTLEEKASLCSGVGLWQTRPLEEKGIPEIWMADGSNGVRIMKPVNQERK 
QDTSDFLKVTDLTQNSPTITNQYEAVCYPSGASLASTWDTELIEEMGEALGDECRYFKVN LLLAPGINIKRSPLGGRGYEYYSEDPYLTGKVAESFIKGVQKTGTGTSIKHFAANNAETL RINMSSDVDERALREIYLAPFEIGVKNAKPWTVMSSYNKVNGVQMAENRELLTDILREEW GFDGVVVSDWGGVKDRIKALEAGNDLDMPENRRNNQSIIDAVRDGILSESVLDQSVERIL ELVFKAKEQERFTDEVDFENHRNLSRKVAEESVVLLKNEKELLPITKEKYKKVAVIGAFA KEPRYQGGGCTLVNPIRISRPYEEMEKLAGDDITLTYAKGYELKNETSDELIREAEETAR AADVAVIFAGLWVAYDREGFDRKHLEIDSSHIRLIEAVSRVQKRVVVVLSNGDAVVMNPW LDHVGAVVEQFLVGETVGEALARVLFGEVNPSGKLPVTFPKRLEDTSAYPYFPGECSHHV YGEGIFVGYRYYEKKKIEPLFPFGYGISYTTFEYSNLRVDRNRFKDTDTVTVSVDVTNTG SVKGKETVQIYISDEESRLKRPEKELKAFGKVELEPQETKTLTFTLGYRDFAYYDPEASD WVVEEGGFYIHAAANAGDIRQSIRVEVTEAKKKFRRLYLDSQHTAVFEHPVARKMYLDFL VEAGVIEADRIDAMVPLLKGNYMGIYNVVTSLLGGNVTKEEMQAVIDRINEVCRKNSSGS VLS >gi|229784112|gb|GG667623.1| GENE 17 19877 - 20710 832 277 aa, chain - ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 8 276 13 281 282 201 41.0 9e-52 MKANTRTLILRIGATLIVALLAVICVFPFIWMVSASFKNEIDVMEFPIRIIPKVVNWTNY STVWLKSDFPNYYRNSLIVTGITLVGTICLSTTAAYGFARLKFRGKGLIFGIYLATLMVP VQVTLLPKYIYFGQLHLNNTFYALILPGIFSAFNTFLMRQYYEGIPYELTEAAQIDGAGH FRIFWQIMLPLTKSGFVTLLLFSFTWTWNDYINPLIYCNKEELLTLTVGLQRFQQAQSTH YALIMAGCTLALAPLIVLFLFTQKYFIESFASSGVKG >gi|229784112|gb|GG667623.1| GENE 18 20724 - 21629 1024 301 aa, chain - ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 21 300 15 290 292 197 38.0 2e-50 MKKTKQRLSKRQLREERAGYLFVIPNFIGILIFVALPIVFSLLLGFTKWNPMQGLSAIEF TGMENFAKMVSDDRLKAALMNNFVYTFTYVPVSVCLALVIAVLLNKFVFCKVPLRLMCFM PYISAMVSVAYVWLILLNPEGGPVNSILAALGVKNLPGWFTKSNSALAGIIVMSIWHDAG YYMIIFLSALQGLSKEVYESARVDGANGLQTFFRITVPMMASNILFVSILATMNSFKVFD 
QINVLTEGGPGYSTNVLVYCIYHYAFREHNIGYASAVAVLLFLIIVIISAVQVKLKEKYT I >gi|229784112|gb|GG667623.1| GENE 19 21747 - 23129 1474 460 aa, chain - ## HITS:1 COG:SMb21595 KEGG:ns NR:ns ## COG: SMb21595 COG1653 # Protein_GI_number: 16264783 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 73 389 45 359 410 90 26.0 8e-18 MKKALTITLACVLAAGVLGGCSGKGGENMTEGVTGGNSEAVSTEAGKEEKAQSAASGEKT VIHYYDWVTMDSSIIDEFNAANPDIEVQYHAIPDNGSDKLTQLDILAMGGGEIDVMPGSD GEQMLRMKNGMYAPIDEFIEKDGIDMEKNFGGILSYASYDGVTYGYPIRSTIEGIWYNKD MFDAAGIEYPDGSWTWDEYKAIAEKLTSGEGDSKVYGTYTHTFNGQWAPVGNEVSPWYTE DGKCNIMAEGFKSSLERRKELDDLGLQLSFSQIIATKANQSSAFLGGKCAMVQAGSWIVQ NIKDQENYPHDFRIGVAPLPRFDDTVKADDNFSVATTILAIPATSGHKEEAWRFIRYMVE EGAERVAGTGNYPSYKPAYNDSLIQTFIEGSGLEVDDVRVLFEDMTAVSQKPTGLGAAEY QQSMQEQTQLYFNGEKTAEDVLGQIEKITNEAIEKEVSGK >gi|229784112|gb|GG667623.1| GENE 20 23261 - 25132 1545 623 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 148 600 148 571 577 161 27.0 5e-39 MRGKKTSIKTYLLLIFIVSIVATICLSGLAMRRSITENARERILEGDAWLVSQLCETADY LSGQAENISATLAFDSDIQSALIAYEYGGAGLMDLDQVRIRINNNLLYKSRFNTALYNCV NIVLFSNDGEVIGSKEAFNQSTCISDYEWFDQVENSHGKSIWLPLGYDLNSKSTSKVLTI PVVRKIFSTQSGKENTLNEVMTVGKALGYITVYIDAAMFSDIVAEDAAVYTKRFFLLDEN NRIISCRNKEKIGETFEYQVKKNGYITVDGADYVMAEKKIENRGWNYICLTEQKEVTRDG TIVLSVCAILGVILILVFAAIGLMLSRSIAIPVKVLADHFKRAENGNVTIQETSGITEFT NLYDSFNHTMEKIHDLANQIYENKLVHQELVLSIKESRIQALQMQINPHFLYNTLDCINW RAQIDGNKEVAEMIRILGKFFRSNMEIKGEFTSLKAEIDNIELYITLSKLRFGTRLHCFI DIDEMLYDCRIMKLLLQPLVENSIKHGLEVEDIDENIWIWCGMDDSDLVISVKDDGRGMA PDTLEYLRGLWASRDEDYKKTESIGLYNVMRRLWLCYREACNLEIFSEAGRGTEIVITFP VSGDYSVGSEHSQAGVTESYDVK >gi|229784112|gb|GG667623.1| GENE 21 25129 - 26646 1482 505 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 
COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 495 1 519 525 148 25.0 3e-35 MYTLIIVDDDELIRKGLEKVIQWEKMGFSVLGTFKGAMDALEYLKNHTADVILTDIRMPQ MTGLELIDEAKKIRKDIKSVIISGYGEFELAKKALLLKVEDYLLKPLGEEEIEAAFFKLK KNLDSERSVTDGDSRQKIGREYELMKLLEKHVRMDSLFEKNNRGKKYYEMILIRIRAKEG GDRTGEQMEACEAMIRQVFSDCLSNDVQGLFAILVPPEQLNGILMHLKREFPKYTGILYQ ILIGREVYSEAEVVSSYWSAVDLGRDEEKSGVIHYERERNIYKKEWDIIQELKNRMVDNF ENGRFQEIDQQIREINSVLEHYESKDMYYFYCNIVMKMLKYFELENNGAVYLFAHRYYAD VEQQYPTTEQLKKNFSQDIDTIRKSLWENSDSMRNLIVAKAKTMIEKEYGNENLSLALVA NQLNISYGYLSTIFTKAEGRSFKTYLVEVRMEKARVLLLSRNYRIYEIAEMVGYKNPRYF TDAFKKYYNYSPADYIARFRGNEEK >gi|229784112|gb|GG667623.1| GENE 22 26748 - 27098 158 116 aa, chain + ## HITS:1 COG:all3171 KEGG:ns NR:ns ## COG: all3171 COG2207 # Protein_GI_number: 17230663 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Nostoc sp. 
PCC 7120 # 39 109 232 302 306 64 39.0 4e-11 MKSKPARGDFCVIEALFNRTKKLPEHRRFSVCVPAAFPLSYLCRVFKKHLNKTPTEFIND IRIANAKLALEYTNRSILQICEEVGFDNTSHFYHLFKKDTGLSPKAFRELSSGSQY >gi|229784112|gb|GG667623.1| GENE 23 27172 - 27519 334 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620403|ref|ZP_06113338.1| ## NR: gi|266620403|ref|ZP_06113338.1| toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] # 1 115 1 115 115 230 100.0 2e-59 MGKNRIYSLRRQKKIPQALLGDLLGVSQQTVSKIESQPIPDISSELLCRIADYFGVTADY LMERSEQSCDFNFADMDPREIGECLTPADKKMWLEMGIRLAGNEFMAKHKKKSGK >gi|229784112|gb|GG667623.1| GENE 24 27686 - 28381 866 231 aa, chain + ## HITS:1 COG:BH1153 KEGG:ns NR:ns ## COG: BH1153 COG0745 # Protein_GI_number: 15613716 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 3 230 6 232 232 270 61.0 2e-72 MQSILVCDDDKQIVEAIDIYLTGEGFTVIKAYDGYEALDLLENNQVDLMIVDVMMPGLDG IRTTLKVRETSSIPIIILSAKSEDTDKILGLNIGADDYITKPFNPLELVARVKSQLRRYT QLGNMNQQTAGQVYKCGGLAINDDNKEVTVDGELIKLTPIEYNILLLLVKNAGKVFSIDE IYEQIWNEEAIGADNTVAVHIRHIREKIEINPREPRYLKVVWGVGYKIEKQ >gi|229784112|gb|GG667623.1| GENE 25 28390 - 30669 2239 759 aa, chain + ## HITS:1 COG:BH1154_2 KEGG:ns NR:ns ## COG: BH1154_2 COG0642 # Protein_GI_number: 15613717 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 509 754 31 274 274 225 47.0 2e-58 MKRKKEHAVLLHILFVILVVTGISIMYNNPHYGEGISWLTNETYEDTPEFIRQFKTDIDD IFNYIKYKEVFESNGKLAYDKPVVRVTTGPGEVQEYTVDEMVRMAKSMGYYLNEKFKVDG RYTPEDGEQDIRVDWRSYDPDHQDSEPGESYTTMADLAKEVLAHVGNYYAFHYKFIQNPS NLKFRVSFKGDSEAPRLYSNTVIDNPDELKALGRYVYVTGESLYIDTNLNTTLKTISAQL EKSNPYNNSNYYMIVAVNTAYPVQDSYSEAARSYDHMRLLFIIGLVSLAIGILGCAASLY FLISMTGHMTSCQDAVTLGRIDRIATETCLILFVAAAMCALFLGEHVVYKVIHLFLSEER 
WYYGELAARLLILYTIGMIAAFSMIRRYKAGTLWTGSLTKRICVNVSLYIKNHRFTTRLA LGYICYFVFNCVMLVAGAFVYFTYDSMESRILTGLLALALLCMDLWVFHQLYKNAWETDQ INEALLKISHGDTKYKINAEAFDGKERELAENINHISRGLETALQEKVSSERLKADLITN VSHDIKTPLTSIINYVDLIKREHLQDPKIQGYLDVLEQKSQRLKTLTEDLVEASKASSGN LKIELTSIDFVELIYQTNGEFEEKFLIRHLELVSSFPEETVIIQADGRHLWRVLENLYNN AFKYAMEHSRVYVDVTAEDSQVLFTIKNVSENPLNIKGDELTERFVRGDVARTTEGSGLG LSIAKSLTELLGGTFTIYIDGDLFKVQLSFPRQNGKGTS >gi|229784112|gb|GG667623.1| GENE 26 30788 - 32149 1367 453 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 9 453 4 447 447 256 35.0 8e-68 MKKDRKAVNMITDSPGRALLLFALPMILGNLFQQFYNIMDSVVVGRFVSEEALASVGASY SITNVFIAIAIGGGIGSSVVVSQFLGAKQLGNMKTAISTTLINFLTLSVLLGTLGFLFNE PILTWMNTPENVFADASVYLSIYFVGLPFLFMYNVQAAIFQSLGDSRTPLYLLIFSSLLN IVLDLLFVIRFHQGVAGVAVATLMAQGLSAVISFTILLKRLKTYEVTEAFHFYDMTMMVN MMRVAIPSTLQQSIVHIGMLLVQSVVNVFGSAVMAGFAAGTRIESICIVPMLAIGNAMST FTAQNIGAGRPDRVKQGYRYCYAMVGFFAVVICIIMQLWGDLFIRGFLNEGSAETAFQTG MAYVRFISFFFVFIGLKATTDGLLRGAGDVVVFTIANLVNLAIRVIVAAVFAPVIGVQAV WYAVPMGWAANYAISFMRYLTGKWSRVHLITPK >gi|229784112|gb|GG667623.1| GENE 27 32170 - 33711 1510 513 aa, chain - ## HITS:1 COG:BH3665_1 KEGG:ns NR:ns ## COG: BH3665_1 COG3409 # Protein_GI_number: 15616227 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative peptidoglycan-binding domain-containing protein # Organism: Bacillus halodurans # 84 365 102 393 408 135 33.0 2e-31 MNKKARWYTSFIHKVLLAAAVAAVLGGCGKKNGAEQETTAPEPSALSIELETEKSPQPVK ILTIGEEPVPVLPDLPVLDENPVPEYLRNGVEHPIVASLQQRLMDLGFMDNDEPTQYFGT MTESAVKTFQRQNHLAQDGIVGPETLNAIMTPSAKYYAVSKDVDGDDIKRIQNRLYELGY LATGDLVTGHFGDDTEAAVIKLQEVNGLNVDGKVGRQTINLLYSDEIKPNFLSYGEKSDV VLACQTRLKELGYLTTTPDGAYGDDTAAAVKQFQARNDLVVDGYLGPSTRVALNSAEAQA NGLMLGEQGETVTRIQQLLNKYGYLSSANVTGYYGEVTEKAVKSFQSSNGLTADGSVGRQ TMNKLTGSNVKKAGSGGSSGGSSSSKGGTVKGGSGVSGLLSIARSKLGCPYVYGAKGPNA 
FDCSGFVYYCLNQAGVRQSYLTSAGWRSVGKYTKITNFNSLQAGDIIVVSGHVGIVAGGG KVVDASSSNGRVVERALSSWWRNNFICGWRIFG >gi|229784112|gb|GG667623.1| GENE 28 33916 - 34725 759 269 aa, chain + ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 14 269 20 269 269 207 50.0 2e-53 MGTSLLWAAGGTGFTFLMTTLGASVVFFFRKEVNASVQRIFLGFAAGVMIAASVWSLLIP AIEEASLNGGIGWIPAAGGFLLGVFFLIALDTYLPHLHPDCSNPGCNEAEGVSSSWKRTT LLVMAVTLHNIPEGMAVGLSFALAAQHGNDPALYTAAMALAIGIGIQNFPEGAAISLPLR QEGLSTGKAFFRGSMSGIVEPVFGILTVLVAGAIEPLMPWLLSFAAGAMLYVVVEELIPE AHLGEHSNAGTLGVMGGFVIMMILDVALG >gi|229784112|gb|GG667623.1| GENE 29 34853 - 36052 708 399 aa, chain - ## HITS:1 COG:YPO1950 KEGG:ns NR:ns ## COG: YPO1950 COG0006 # Protein_GI_number: 16122196 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Yersinia pestis # 10 394 12 401 405 202 30.0 7e-52 MLENRNQLEIEQKLKREMEREGLDALILTEPETILYASGFASHFLYDSRRIGTTLAVVRK EGPVILILNEIEKQTAENQCKDIKMETYPVWIYIDGLEDDGLEKPSQPDMMIPVKMALDF ITERTNKAKIGVQSPSIPHDIWDFLSDCTGVGSLCNCEKVLNRARAIKTDWEIGLLRRTA KITESAMLETAYRIVPGMSEKEILSIYRNAAFSQDLDVIGAYTVSGIGTEYSIVQIPRNI SVKENDIIRLDGGANICGYQADIARTFVVGKPEDKAVRIFEALYRAFDAGRSMIGPGVPF EPVFHAMQDTVRKSGFEKYTRGHCGHSVGCNIFVEERPFVAPGKEQEFTVFMPGMVMSIE VPFYSAAYGAFNIEDTVLITESGCEWFTTVNDSLFWPKL >gi|229784112|gb|GG667623.1| GENE 30 36070 - 36897 549 275 aa, chain - ## HITS:1 COG:SP1895 KEGG:ns NR:ns ## COG: SP1895 COG0395 # Protein_GI_number: 15901722 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 5 274 7 278 278 155 36.0 6e-38 MSIIRKYSIGKNVILLFAAAIMLSPFYISFVYSMKTREQITFTGLRFPSEFHLENFSQAV EASNFYVVLKNSVITTVPTVLLLIIICPMAAYVLARNTGKAYNIIYSLFLTALLIPYQSV MLPQYMNLKQAGLLNTYLGAVMVRAAFNISFNILLFTSFVKTVPIELEEAAMIDGAGRMK 
TFFRIVFPLLKSVLCTAVIINALFSWNDFTITLNILMKEEMKTIPLMLYRFFGEYNVELN LAFAAFTLSMIPILILYFLLQRYIVEGVMSGAVKG >gi|229784112|gb|GG667623.1| GENE 31 36894 - 37754 557 286 aa, chain - ## HITS:1 COG:BH2225 KEGG:ns NR:ns ## COG: BH2225 COG1175 # Protein_GI_number: 15614788 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 19 285 29 303 304 158 32.0 1e-38 MRNKKMVKKTIQNILFLAPCAVVFFVFVIMPFLQGIPYSFRRWDGFSVDYTIVGLENYKN VLTDGSIAAPAVNTIKFAAMEVVFSNLFGLLLALGISRNTRFHRILRTIYFMPYVISIVL ASYIWSYIFSDVCYSILGMKNLLASPKTVIRGLAIISVWKDTGYCMIVFLAALKSVPDEY YEVARVEGSKSLHTFVKITLPLIVPAFTTNATLVLSWGLKVFEYPMVTTGGGPGTSSESL AMYIYKNIFVYYKAGYGQAAAILFTLFLAVISFLVASFFRSREVEV >gi|229784112|gb|GG667623.1| GENE 32 37792 - 39129 1089 445 aa, chain - ## HITS:1 COG:SP1897 KEGG:ns NR:ns ## COG: SP1897 COG1653 # Protein_GI_number: 15901724 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 84 374 58 349 419 117 31.0 5e-26 MRKQKKMMLMVMAALALTACGSSEEPKQEMVQSESTQKQSTEEEKKSSAEESKTGEKVKL VLMHYAAEETKRSGIDAWVQSVTEQYPDIEIEVQAVTPWAQFASSLQTKIAAGDAPDIFM GKPSQFIQLAEAGQVLDLTGQKCLDNLSEVDKAMVSVGEKVYGVPVDRSIAGVFYNKDMF DEYQLKPPTTLSEFNEIIRTFEEQGIVPFVRAYKDNIYPRVDFDSSFGSMVAKEDEKFYE KIQNGEKKFSDYPMFKLQADIYAARLSAKRDDDLGTDASRANQLFASGKYPMCITGSWAI GDIRKNNPDGKFGFFTTPWSDQAEENTLPIGVDTAFMVSAQTKHPDEVMKFMEHLTSAEG VRQWYENAKVPTIIPVETTDLDPIFVDMNAIIESGKVVSKDSYPYFSGEYKSKFESDLQL FAATPEIGAEGLISMLDKDFSGISK >gi|229784112|gb|GG667623.1| GENE 33 39338 - 41131 929 597 aa, chain + ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 7 582 8 571 577 154 24.0 5e-37 MSKFITWYLNLKFKQKLLLSFLVIIFLTTLCISGTNYAVSNLLITRYVSDYSIAISSQIA 
SNFASRVDNIEETNFIKYQNDGAFNEILPVSDYTPIQNRLSYQSAVDNLLYSTDYYKSIA FLDLNNYLYTAGSDLLSDQSIDYFTDEAPEIKKSWGQCVWDSCPNGNLLLKRALYHLPTG TYIGCLFVEIDRDYLIKVYEQSIDRYGGSVVIFNPAHKPMISDHDTFEKIALEAISSATQ AENHTVKMNGETYTAAVYNTPDGQWTLTHIVSLSQLTGKLHVIGYWILFIGFAVFALAIV LAFLISGNISQNLKLLTDSMKQFSNGNFDEHIVPASHDEFAIIADNFNTMGQKIKALMRE VALEQKQKQQIEYQMLEFQYNALQARLNPHFLYNILDSISSLAKLKDEYDISDWICLLSD MLRESIGQKKRLIPLDEEISYVRKYTTLYQMMYRDHVTFIYKSDPSLSFIPVPAFILQPI VENAIVHGIEGISRHGIIHICSFSQDNTLILEVCDNGHGMSQDMIEMILNSRYSSEDSGS PSHAKVGLITVLERIRLLYGPEFGIRIQSVPEEGTRIRIILPVAVSLYHDHNSEDQE >gi|229784112|gb|GG667623.1| GENE 34 41138 - 42181 524 347 aa, chain + ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 1 345 1 361 368 117 27.0 2e-26 MYKIVIIDDNELTRKSIAKTIKRQLPDCAIVGDASNGREGLSCIQSVCPDIIISDIKMPD LDGLSMVRMVQSSLPASQIIFITGYQDFENAHAAIRLHACAFLLKPIHDSQLLEAIALAK DEIQKNAPSMIQGFQELNLFLQQNAPEAFEEKLSSIYRKAIYLVQKAPDDSSDILLNCMR QVCSIILHYFWNLAPQKINLDQETTNLYHFTQVPDSEAEWSAFLMGYYNRLHLSVAREHQ NYSLIITHALQYLESHYQQELSLSEVAKVVSVNPSYLSSLIKKETGKNFIDIIIEARLEH AKKLLQNPSHRINEVADLSGFKDYAYFYQTFKKHVGLSPKEYRNQFL >gi|229784112|gb|GG667623.1| GENE 35 42296 - 43099 622 267 aa, chain + ## HITS:1 COG:lin2312 KEGG:ns NR:ns ## COG: lin2312 COG3393 # Protein_GI_number: 16801376 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Listeria innocua # 1 263 2 261 261 142 36.0 6e-34 MKKLTASHYNDAMAYALLEPEFNLFLIGDLENYGLESENVSVYTADTWTGGTLPYFILDY RSNFLFYSHTTDYNKAEVAEFLSEFQMRNLSGKRELLEPLMPFLKGLELVPTYLARLNQT AIPDDPAFPARRLLPDDIDAIYALLKQIDEFFTMRSKTEEENREDILNSMTNEGRIYGVF ENGTLAAVAGTSAENSMSAMVVSVATLPEYRGRGYATRLVTRLCQDCLADGMKFLCLFYD NPEAGVIYRKIGFKELGQYAMVRSIEK >gi|229784112|gb|GG667623.1| GENE 36 43256 - 43672 601 138 aa, chain - ## HITS:1 COG:CAC2864 KEGG:ns NR:ns ## COG: CAC2864 COG0355 # 
Protein_GI_number: 15896118 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Clostridium acetobutylicum # 3 134 2 132 133 77 34.0 6e-15 MSANTFFLQIIATDKVFFKGPCQKLIIPLLDGEKEVLPHHEDMVIAVAIGEMRLMNESGE WINGVVGNGFVQIINNRVTLLVDTAERPEDIDERRAEEARERAEEQLRQSKSIQEHSHFE ASLARSLARLRLKHKYEI >gi|229784112|gb|GG667623.1| GENE 37 43685 - 45091 1659 468 aa, chain - ## HITS:1 COG:lin2673 KEGG:ns NR:ns ## COG: lin2673 COG0055 # Protein_GI_number: 16801734 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Listeria innocua # 1 465 1 472 473 624 69.0 1e-178 MNKGKIVQVLGPVVDVEFEEGVELPCIKDALEVDNDGKRCVMEVSQHLGNNTVRCIMLAS SEGLCRDMEVTATGSGIRVPVGEQTLGRLFNVLGETIDNGEPLKEGEKWVIHRDPPSFEE QSPVAEILETGIKVIDLLAPYAKGGKIGLFGGAGVGKTVLIQELIHNIATEHGGYSIFTG VGERSREGNDLWTEMGESGVIEKTALVFGQMNEPPGARMRVAETGLTMAEYFRDQEHKNV LLFIDNIFRFVQAGSEVSALLGRMPSAVGYQPTLATEMGELQERITSTKNGSVTSVQAVY VPADDLTDPAPATTFAHLDATTVLSRKIVEQGIYPAVDPLESNSRILEPDVVGEEHYEVA RMVQETLQKYRELQDIIAILGMEELGDEDKTTVNRARKIQKFLSQPFSVAENFTGVAGKY VPLKETVRGFKAIVEGEMDQYPEAAFFNVGTIDEVIEKAKTLGAEEGR >gi|229784112|gb|GG667623.1| GENE 38 45131 - 46030 943 299 aa, chain - ## HITS:1 COG:VC2765 KEGG:ns NR:ns ## COG: VC2765 COG0224 # Protein_GI_number: 15642758 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Vibrio cholerae # 1 292 1 287 288 178 33.0 1e-44 MANAREIQSRMNSIKSTMKITNAMYTISSSKLKKARKNLTDTEPYFYALQRTISRIVRHT PEIEDPYFDTRPQIKPEDRKIGYIVVTADKGLAGAYNHNVFKLVQEQIDKGGNPMLFVVG ELGQQYFVRKGIPVEQDFRYTVQNPNMSRARIIEEKIVDYFLSGKLDEVYMIYTRMVNAM KMEAEITQLLPLPKSNFSVPQGIPIDIHLEEITMKPSPKAVIDAIVPNYIAGFIYGGLVE SYSSEQNSRMMAMQAATKSAQDMLDELAITYNRVRQAAITQEITEVISGAKAQKKKKKK >gi|229784112|gb|GG667623.1| GENE 39 46089 - 47606 1659 505 aa, chain - ## HITS:1 COG:CAC2867 KEGG:ns NR:ns ## COG: CAC2867 COG0056 # Protein_GI_number: 15896121 # Func_class: C Energy production and 
conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Clostridium acetobutylicum # 3 503 2 502 505 595 59.0 1e-170 MGSINSDEIISILRSEIEDYDVHARDQEVGTVIGIGDGIATIYGIDHAMYGEIVIFDNGV KGMVQDIRRDDIGCILFGSDKEIKEGCKVTRTKKRAGIPVGDAYIGRVVNALGAPIDGKG EIKSDDYRPIENEAPGIIDRKSVSVPMETGILAIDSMFPIGRGQRELIIGDRQTGKTSIA IDAILNQKGKDTICIYVAVGQKASTVAKMVNTLSQYGAMDYTIVLSSTASEPAPLQYIAP YSGTALAEYFMYQGKDVLIVYDDLSKHAVAYRALSLLLERSPGREAYPGDVFYLHSRLLE RSSRLSEEKGGGSITALPIIETQAGDVSAYIPTNVISITDGQIFLESDLFFSGMRPAVNV GLSVSRVGGAAQTKAMKKAAGSLRIDLAQCREMEVFTQFSSDLDPATKEMIQYGKGLTEM LKQPLCHPMSLHDQVISLVAANNRLLLDVEVSKIKGFQKEMLTFFEQKHPEIVKEIEEKK VLSDELKQSIVDGVKEFKQNTMASF >gi|229784112|gb|GG667623.1| GENE 40 47612 - 48115 506 167 aa, chain - ## HITS:1 COG:no KEGG:Closa_4167 NR:ns ## KEGG: Closa_4167 # Name: not_defined # Def: ATP synthase F1, delta subunit # Organism: C.saccharolyticum # Pathway: Oxidative phosphorylation [PATH:csh00190]; Metabolic pathways [PATH:csh01100] # 1 167 1 167 167 191 58.0 9e-48 MTQAANNYGQVLYERKVPKEMVADAEALFLEVPQLKEALVSPVVKKSEKNRVIDRVFPAE LRNFLKVLCGHQDMDLIEDVFEAYHAYACEQEGSLRATLYYVTEPDAEQLKKIKQMLMKK YQKQDVELRLVKDSSLGGGFVIRTGDVETDWSIKGRMKQLEQILMRR >gi|229784112|gb|GG667623.1| GENE 41 48096 - 48596 574 166 aa, chain - ## HITS:1 COG:SA1909 KEGG:ns NR:ns ## COG: SA1909 COG0711 # Protein_GI_number: 15927681 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Staphylococcus aureus N315 # 8 159 21 172 173 80 32.0 1e-15 MLRLDFNLVWNIVNVIVLYLLLKHFLIKPVMDIMNKRQAMVDQSITNARESESQASELKS QYEEKLAASAEEGKRLVEEAKAEAKTVQERMIKEAGEQADRIIENAHKTANADQEKAMRE AEAQIAGLVLAAAAKVVQGEMNDRKNQALYDQYIAEAGDSHDAGSK >gi|229784112|gb|GG667623.1| GENE 42 48648 - 48866 377 72 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2902 NR:ns ## KEGG: EUBREC_2902 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Oxidative phosphorylation [PATH:ere00190]; Metabolic pathways [PATH:ere01100] # 1 72 5 76 76 79 90.0 4e-14 
MLTAIGAGIAVLTGVGAGVGIGLATSKAVDAIARQPEAEGKISKSLLLGCALAEATAIYG FVIALLIILFLK >gi|229784112|gb|GG667623.1| GENE 43 48920 - 49615 709 231 aa, chain - ## HITS:1 COG:CAC2871 KEGG:ns NR:ns ## COG: CAC2871 COG0356 # Protein_GI_number: 15896125 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Clostridium acetobutylicum # 14 231 2 220 221 136 39.0 3e-32 MSITEDLAERLMEELNCETVFTIPVFGGIPIAESVVVSWIIMAALTILSICLVRNLKVEN PGKKQLALEVAIGGIYKFFDDLVGEEGRRYIPYLISVAIYIGVANLIGLIGFKPPTKDLN ATAALALMSIVLIEYAGFHKKGLKGWVKSFAEPVAIIAPINILEIFIKPLSLCMRLFGNV LGSFVVMELIKQLVAPIIPIPFSMYFDIFDGLIQAYVFVFLTALFIKEAVE >gi|229784112|gb|GG667623.1| GENE 44 50095 - 50214 209 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKTVKGTYNEAMDDIVNYIDSTVKILQVIKPIYNFKAGE >gi|229784112|gb|GG667623.1| GENE 45 50186 - 50392 150 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870220|ref|ZP_06409672.1| ## NR: gi|288870220|ref|ZP_06409672.1| putative Rhs element Vgr protein [Clostridium hathewayi DSM 13479] putative Rhs element Vgr protein [Clostridium hathewayi DSM 13479] # 1 68 23 90 90 89 100.0 8e-17 MRQVKEEVCLYYMGFGICARGLHVDKKGCCAGCSQYKASGVKIGKKKQGGNGKEGIEVTD ENGKRNVQ >gi|229784112|gb|GG667623.1| GENE 46 50990 - 51685 742 231 aa, chain + ## HITS:1 COG:AF1753 KEGG:ns NR:ns ## COG: AF1753 COG2267 # Protein_GI_number: 11499342 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Archaeoglobus fulgidus # 9 229 7 229 266 95 32.0 6e-20 MGEMISSFDGTKLYFNREMPEQARAAAVIVHGLCEHQGRYDYLAGLFHQAGIGTYRFDHR GHGRSEGERTYYDDFNELLDDTNVVVDMAIADNPDLPVFLIGHSMGGFTVALYGAKYPDK KLRGPITSGALTKDNGGLITGVPKDLDPHTLLPNELGAGVCSVAEVVDWYGKDPYNTKTF TTGLCYALCSGLDWFSGAIRDFTYPVLMLHGEKDGLVSVKDTYDFFAEAPS >gi|229784112|gb|GG667623.1| GENE 47 51830 - 52498 523 222 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01420 NR:ns ## KEGG: EUBELI_01420 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 7 175 131 289 321 77 28.0 4e-13 
LEPKGFEFLGGIRKEDRLKAVFTIVLYYGTEPWRGAGSLYEMLDLTGIPEEIRGMLNNYR IHILEVRHFEKTERFRTDLREVFGFIQKAADKNAAKRFTFQNEERFKELAEDAYDVISAL TESRELEEVKERYREKGGKINMCEAIRGMIEDGRMEGLSEGLSEGIKTGEAMGIIRGDEN RSVITAKNMYDRGFSAEDAAGMIGIDCSRVQEWYRKWGNGVH Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:03:42 2011 Seq name: gi|229784111|gb|GG667624.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld17, whole genome shotgun sequence Length of sequence - 92970 bp Number of predicted genes - 99, with homology - 96 Number of transcription units - 53, operones - 24 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 212 309 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase - Prom 290 - 349 11.0 2 2 Op 1 . + CDS 577 - 774 216 ## COG1476 Predicted transcriptional regulators 3 2 Op 2 . + CDS 779 - 1267 570 ## gi|266620432|ref|ZP_06113367.1| conserved hypothetical protein + Term 1348 - 1409 5.2 - Term 1334 - 1398 3.7 4 3 Tu 1 . - CDS 1410 - 3113 2018 ## Closa_1605 hypothetical protein 5 4 Op 1 . - CDS 4033 - 4476 436 ## Closa_1605 hypothetical protein 6 4 Op 2 . - CDS 4479 - 4799 366 ## Closa_1604 hypothetical protein 7 4 Op 3 . - CDS 4768 - 5259 227 ## COG4767 Glycopeptide antibiotics resistance protein 8 4 Op 4 . - CDS 5271 - 5420 223 ## gi|266620437|ref|ZP_06113372.1| conserved hypothetical protein 9 4 Op 5 1/0.143 - CDS 5433 - 6107 1017 ## COG0461 Orotate phosphoribosyltransferase 10 4 Op 6 . - CDS 6136 - 6843 978 ## COG0167 Dihydroorotate dehydrogenase - Prom 6869 - 6928 80.4 11 5 Op 1 13/0.000 - CDS 7777 - 7971 358 ## COG0167 Dihydroorotate dehydrogenase 12 5 Op 2 1/0.143 - CDS 7971 - 8747 839 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 13 5 Op 3 . - CDS 8756 - 9673 1198 ## COG0284 Orotidine-5'-phosphate decarboxylase - Prom 9708 - 9767 6.4 + Prom 9791 - 9850 6.8 14 6 Tu 1 . 
+ CDS 9912 - 10325 513 ## COG0071 Molecular chaperone (small heat shock protein) + Term 10434 - 10464 1.1 - Term 10369 - 10405 1.1 15 7 Tu 1 . - CDS 10520 - 11146 554 ## COG0778 Nitroreductase 16 8 Op 1 . - CDS 12063 - 12224 241 ## Closa_1595 nitroreductase 17 8 Op 2 . - CDS 12244 - 14025 1738 ## COG1164 Oligoendopeptidase F - Prom 14267 - 14326 5.0 + Prom 14085 - 14144 4.3 18 9 Tu 1 . + CDS 14269 - 15030 479 ## Closa_1593 GCN5-related N-acetyltransferase - Term 15078 - 15129 14.1 19 10 Op 1 . - CDS 15138 - 15911 706 ## TepRe1_1356 GntR domain-containing protein 20 10 Op 2 . - CDS 15985 - 16608 505 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 16652 - 16711 22.0 21 11 Op 1 . - CDS 17613 - 17768 182 ## gi|266620450|ref|ZP_06113385.1| oxidoreductase, short chain dehydrogenase/reductase family 22 11 Op 2 16/0.000 - CDS 17805 - 18956 1182 ## COG1879 ABC-type sugar transport system, periplasmic component 23 11 Op 3 21/0.000 - CDS 19001 - 19960 682 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 24 11 Op 4 . - CDS 19985 - 21103 1048 ## COG1129 ABC-type sugar transport system, ATPase component 25 12 Tu 1 . - CDS 22364 - 23749 799 ## COG0044 Dihydroorotase and related cyclic amidohydrolases - Prom 23808 - 23867 8.2 + Prom 23854 - 23913 11.1 26 13 Tu 1 . + CDS 23967 - 25136 419 ## Plim_3780 hypothetical protein + Term 25190 - 25231 7.0 - Term 25174 - 25222 9.3 27 14 Tu 1 . - CDS 25229 - 25354 80 ## Closa_1592 GCN5-related N-acetyltransferase - Term 25398 - 25449 8.4 28 15 Op 1 . - CDS 25461 - 26189 1036 ## COG0217 Uncharacterized conserved protein 29 15 Op 2 . - CDS 26273 - 27328 993 ## COG1253 Hemolysins and related proteins containing CBS domains + Prom 28212 - 28271 80.4 30 16 Tu 1 . + CDS 28325 - 28426 107 ## - Term 28286 - 28343 20.3 31 17 Op 1 . - CDS 28450 - 28815 210 ## COG2337 Growth inhibitor 32 17 Op 2 . - CDS 28951 - 30135 1091 ## COG0787 Alanine racemase 33 17 Op 3 . 
- CDS 30156 - 30275 139 ## 34 18 Tu 1 . - CDS 31213 - 32580 564 ## PROTEIN SUPPORTED gi|90022317|ref|YP_528144.1| ribosomal protein S15 - Prom 32618 - 32677 6.0 - Term 32628 - 32679 20.1 35 19 Tu 1 . - CDS 32725 - 33426 811 ## COG2344 AT-rich DNA-binding protein - Prom 33533 - 33592 9.8 + Prom 33530 - 33589 5.7 36 20 Tu 1 . + CDS 33731 - 35668 2266 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 35725 - 35767 8.5 - Term 35713 - 35755 13.1 37 21 Op 1 . - CDS 35760 - 36521 939 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 36605 - 36664 6.2 - Term 36728 - 36767 6.3 38 21 Op 2 . - CDS 36774 - 37598 967 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 37724 - 37783 80.4 39 22 Tu 1 . - CDS 38631 - 38843 278 ## gi|266620467|ref|ZP_06113402.1| outer membrane protein H.8 - Prom 38888 - 38947 5.0 + Prom 38960 - 39019 7.8 40 23 Tu 1 . + CDS 39061 - 39276 146 ## Closa_1582 hypothetical protein + Term 39280 - 39334 15.2 - Term 39268 - 39320 18.6 41 24 Op 1 . - CDS 39348 - 40622 1574 ## COG0104 Adenylosuccinate synthase 42 24 Op 2 . - CDS 40616 - 40744 140 ## gi|288870232|ref|ZP_06409674.1| conserved hypothetical protein - Prom 40850 - 40909 3.6 43 25 Op 1 . - CDS 40924 - 41817 942 ## COG0583 Transcriptional regulator - Term 41826 - 41861 2.3 44 25 Op 2 . - CDS 41885 - 43315 1703 ## COG0015 Adenylosuccinate lyase - Term 43642 - 43694 9.1 45 26 Op 1 2/0.143 - CDS 43718 - 45160 1214 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase 46 26 Op 2 . - CDS 45185 - 45892 961 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase - Prom 45956 - 46015 12.2 47 27 Tu 1 . - CDS 46085 - 46336 227 ## gi|266620474|ref|ZP_06113409.1| hypothetical protein CLOSTHATH_01564 - Prom 46473 - 46532 3.4 48 28 Op 1 . - CDS 46540 - 47289 814 ## COG2949 Uncharacterized membrane protein 49 28 Op 2 . 
- CDS 47314 - 47646 361 ## Closa_1575 mucin TcMUCI - Prom 47850 - 47909 4.2 50 29 Tu 1 . + CDS 47961 - 48152 83 ## gi|266620477|ref|ZP_06113412.1| putative ethanolamine utilization protein 51 30 Tu 1 . - CDS 48182 - 49420 519 ## COG4335 DNA alkylation repair enzyme - Prom 49468 - 49527 5.1 52 31 Op 1 6/0.000 - CDS 50815 - 51234 500 ## COG0289 Dihydrodipicolinate reductase - Term 51245 - 51291 10.0 53 31 Op 2 . - CDS 51316 - 52200 1135 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 54 32 Tu 1 . - CDS 53699 - 54205 623 ## COG2109 ATP:corrinoid adenosyltransferase - Prom 54260 - 54319 4.6 55 33 Op 1 . - CDS 54334 - 54960 662 ## COG0629 Single-stranded DNA-binding protein - Prom 55057 - 55116 5.6 - Term 55107 - 55140 0.0 56 33 Op 2 . - CDS 55143 - 55289 80 ## gi|288870238|ref|ZP_06409676.1| hypothetical protein CLOSTHATH_01575 - Prom 55442 - 55501 14.8 57 34 Tu 1 . - CDS 56403 - 58115 2037 ## COG1217 Predicted membrane GTPase involved in stress response - Prom 58151 - 58210 4.2 - Term 58145 - 58188 6.2 58 35 Op 1 . - CDS 58223 - 58495 111 ## gi|266620488|ref|ZP_06113423.1| lipoprotein 59 35 Op 2 25/0.000 - CDS 58431 - 58898 389 ## COG0438 Glycosyltransferase 60 35 Op 3 . - CDS 59856 - 60458 586 ## COG0438 Glycosyltransferase 61 35 Op 4 . - CDS 60471 - 60914 193 ## PROTEIN SUPPORTED gi|154175107|ref|YP_001408238.1| ribosomal protein L22 62 36 Tu 1 . - CDS 61870 - 62067 301 ## gi|266620492|ref|ZP_06113427.1| DedA protein - Prom 62099 - 62158 5.3 - Term 62182 - 62225 9.3 63 37 Tu 1 . - CDS 62244 - 67814 5102 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 67840 - 67899 9.1 64 38 Tu 1 . - CDS 68028 - 68681 689 ## COG1739 Uncharacterized conserved protein - Prom 68720 - 68779 5.2 65 39 Tu 1 . + CDS 68924 - 71611 2507 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) + Term 71634 - 71686 4.2 - Term 71622 - 71674 8.3 66 40 Op 1 . - CDS 71704 - 72057 524 ## Closa_1567 SpoVA protein 67 40 Op 2 . 
- CDS 72119 - 73132 1247 ## Closa_1566 stage V sporulation protein AD - Term 73231 - 73266 7.2 68 41 Op 1 . - CDS 73303 - 73755 630 ## Closa_1565 SpoVA protein 69 41 Op 2 . - CDS 73801 - 74202 485 ## Closa_1564 hypothetical protein 70 41 Op 3 . - CDS 74216 - 74677 462 ## Closa_1563 stage V sporulation protein AA 71 42 Op 1 . - CDS 75620 - 75775 162 ## Closa_1563 stage V sporulation protein AA 72 42 Op 2 . - CDS 75772 - 75933 253 ## gi|266620502|ref|ZP_06113437.1| conserved hypothetical protein 73 42 Op 3 6/0.000 - CDS 75997 - 76710 796 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 74 42 Op 4 8/0.000 - CDS 76716 - 77150 549 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 75 42 Op 5 . - CDS 77161 - 77502 296 ## COG1366 Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) - Term 77530 - 77586 11.1 76 43 Op 1 . - CDS 77599 - 78912 1494 ## Closa_1558 TPR repeat-containing protein 77 43 Op 2 . - CDS 78924 - 79064 268 ## Closa_1557 hypothetical protein 78 43 Op 3 . - CDS 79057 - 80373 1344 ## COG0285 Folylpolyglutamate synthase - Prom 80419 - 80478 2.0 79 44 Op 1 . - CDS 80495 - 80710 62 ## gi|266620509|ref|ZP_06113444.1| conserved hypothetical protein 80 44 Op 2 . - CDS 80718 - 81101 540 ## COG0251 Putative translation initiation inhibitor, yjgF family - Prom 81134 - 81193 5.0 + Prom 81193 - 81252 3.7 81 45 Tu 1 . + CDS 81279 - 81893 796 ## Closa_1554 hypothetical protein + Term 81952 - 81982 -1.0 - Term 81905 - 81969 14.1 82 46 Op 1 . - CDS 81976 - 82275 406 ## COG4496 Uncharacterized protein conserved in bacteria 83 46 Op 2 . - CDS 82364 - 83173 1094 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 83204 - 83263 6.5 + Prom 83223 - 83282 5.8 84 47 Tu 1 . + CDS 83309 - 83881 641 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 83904 - 83950 10.4 - Term 83894 - 83936 11.2 85 48 Tu 1 . 
- CDS 83943 - 84290 86 ## - Prom 84310 - 84369 8.8 + Prom 84354 - 84413 7.3 86 49 Op 1 . + CDS 84597 - 84791 141 ## gi|266620516|ref|ZP_06113451.1| toxin-antitoxin system, antitoxin component, Xre family 87 49 Op 2 4/0.143 + CDS 84788 - 85888 404 ## COG0286 Type I restriction-modification system methyltransferase subunit + Prom 86041 - 86100 5.3 88 50 Tu 1 . + CDS 86263 - 88533 623 ## COG0286 Type I restriction-modification system methyltransferase subunit + Prom 88551 - 88610 5.6 89 51 Tu 1 . + CDS 88797 - 89045 141 ## gi|266620517|ref|ZP_06113452.1| conserved hypothetical protein + Term 89053 - 89089 2.2 - Term 89041 - 89076 4.5 90 52 Tu 1 . - CDS 89099 - 89842 539 ## COG5632 N-acetylmuramoyl-L-alanine amidase - Term 89860 - 89892 4.5 91 53 Op 1 . - CDS 89956 - 90162 350 ## gi|266620519|ref|ZP_06113454.1| conserved domain protein 92 53 Op 2 . - CDS 90164 - 90481 313 ## gi|266620520|ref|ZP_06113455.1| conserved hypothetical protein 93 53 Op 3 . - CDS 90501 - 90695 215 ## gi|266622353|ref|ZP_06115288.1| conserved hypothetical protein 94 53 Op 4 . - CDS 90756 - 91175 225 ## Clole_2737 hypothetical protein 95 53 Op 5 . - CDS 91156 - 91899 403 ## gi|266620523|ref|ZP_06113458.1| conserved hypothetical protein 96 53 Op 6 . - CDS 91914 - 92150 150 ## gi|266620524|ref|ZP_06113459.1| conserved hypothetical protein 97 53 Op 7 . - CDS 92143 - 92433 322 ## gi|266620525|ref|ZP_06113460.1| hypothetical protein CLOSTHATH_01618 98 53 Op 8 . - CDS 92421 - 92723 126 ## gi|266620526|ref|ZP_06113461.1| putative CTP synthase 99 53 Op 9 . 
- CDS 92736 - 92969 202 ## gi|266620527|ref|ZP_06113462.1| fibronectin type III domain protein Predicted protein(s) >gi|229784111|gb|GG667624.1| GENE 1 2 - 212 309 70 aa, chain - ## HITS:1 COG:TM0446 KEGG:ns NR:ns ## COG: TM0446 COG0041 # Protein_GI_number: 15643212 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Thermotoga maritima # 1 70 1 70 171 98 67.0 2e-21 MAKVAIVMGSDSDMPVMAQAADVLEKLGVDFEITVISAHREPDIFFEYAKTAEARGVKVM IAGAGKAAHL >gi|229784111|gb|GG667624.1| GENE 2 577 - 774 216 65 aa, chain + ## HITS:1 COG:PAB7155 KEGG:ns NR:ns ## COG: PAB7155 COG1476 # Protein_GI_number: 14520844 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Pyrococcus abyssi # 1 65 1 65 73 72 60.0 2e-13 MKTRIQELRKQSRITQEELADALGVTRQTIISLENGKYNASLQLAHKIAVYFHMNIEEIF LFEEE >gi|229784111|gb|GG667624.1| GENE 3 779 - 1267 570 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620432|ref|ZP_06113367.1| ## NR: gi|266620432|ref|ZP_06113367.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 162 1 162 162 256 100.0 4e-67 MLLNNLFIDSPINFEKKCKTRITASCFLMALGVITIALSFLAGDQIPVLYMAADHYDFIP GFYMGTGFGLFFAAAATIIKNVRYLKKPEIRKEREIYETDERNRMLGLRSWAYAGYTMLL LLYIGILISGFISILILKTLLIILAVYVVVLFVFRMILQKTM >gi|229784111|gb|GG667624.1| GENE 4 1410 - 3113 2018 567 aa, chain - ## HITS:1 COG:no KEGG:Closa_1605 NR:ns ## KEGG: Closa_1605 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 560 145 691 691 853 74.0 0 VFEETVPPAASVKDVLYWFVSDYCDYTVTYRIREGVDPALTFAKDIIMDSDLNDLRYLYR FGDYISESELLVAQFMNSLPEETVQLMADTYTEGYRKGFEVMGRDLSKKSAVQIRYELGF ERMVKAAIRNFREMGLDTILSRSAVWSVNQKAGRKVGYYGTPANRQYEYDHRYDEAIYLN KAFTDRKASVLRVAYEQYKKEAAAYAGPAVIETFGQEGFQPVNKKEANRLDGRQEKLSAK AANEAAVITNEYIPGDETSFTIIAFPVPEIGRGSHGEWDAESAGERFKKIFEETIAINTL DYEKYKTIQQTIIDALDEAEYVEVKGKGENRTDLRVSLHELGDREKETNFENCVADVNIP 
LGEVFTSPKLCGTQGTLFVGTVYIGDFQFKNLSMTFEDGMIKAYSCDNFEDPNEGRALVK QMILKNHDTLPMGEFAIGTNTTAYAMAERYGILDKLPILIVEKMGPHFAVGDTCYSWAED SAVYNPNGKEIIARDNEISILRKEDLSKAYFSCHTDITIPYSELDTIDAVRADGSRISII ADGRFVLPGVGELNVPLDKEQGKPDNK >gi|229784111|gb|GG667624.1| GENE 5 4033 - 4476 436 147 aa, chain - ## HITS:1 COG:no KEGG:Closa_1605 NR:ns ## KEGG: Closa_1605 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 146 1 146 691 176 62.0 3e-43 MKYCDRFKEENESVMERFQLSMERLHAIESEETVEEPYRSYFRKMASFIGMIGAYREQLE GGLLENASLDELKAWNHRLYEDILPHNYETSYGNPQYAVSALGEEYGQLFSYLYKEIRGG ILFAAENRLTDITILNETVIEIYNMFS >gi|229784111|gb|GG667624.1| GENE 6 4479 - 4799 366 106 aa, chain - ## HITS:1 COG:no KEGG:Closa_1604 NR:ns ## KEGG: Closa_1604 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 106 1 106 106 120 66.0 1e-26 MQQQKKVSYIRKPFARRSFMAIGLAAAALILAVIGVASSVMTAGNAELNVAAVCFCSFLI AIVSLIYGALSFLEKEKKYILSKIGMAVSGLLIIVWTVLIIIGFGG >gi|229784111|gb|GG667624.1| GENE 7 4768 - 5259 227 163 aa, chain - ## HITS:1 COG:CAC0829 KEGG:ns NR:ns ## COG: CAC0829 COG4767 # Protein_GI_number: 15894116 # Func_class: V Defense mechanisms # Function: Glycopeptide antibiotics resistance protein # Organism: Clostridium acetobutylicum # 15 138 18 143 308 80 36.0 1e-15 MIKNTTKRQKLGWVIFIVYLIFLAYFLFFSEDFGRGSHLQEEYAYNLVPFKEIRRFIVYW HVVGIRSFLLNIVGNVVGFMPFGFFLPVISRRSRHWYNTVLLSFLFSLCIETVQLIWKVG SFDVDDMILNTLGGLLGFVFYKIVQTIRVRRKRRAAAEKSILY >gi|229784111|gb|GG667624.1| GENE 8 5271 - 5420 223 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620437|ref|ZP_06113372.1| ## NR: gi|266620437|ref|ZP_06113372.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 64 100.0 3e-09 MERVYKTMRNAGALNIVIGIIITVAGLSVGIMSIINGAVLLKRKSEITF >gi|229784111|gb|GG667624.1| GENE 9 5433 - 6107 1017 224 aa, chain - ## HITS:1 COG:CAC0027 KEGG:ns NR:ns ## COG: CAC0027 COG0461 # 
Protein_GI_number: 15893325 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 224 1 224 224 292 63.0 4e-79 MEQYKQEFIEFMVDSNVLKFEDFTLKSGRKSPFFMNAGAYVTGAQLKKLGEYYAKAIHDN FGDEFDVLFGPAYKGIPLSVATTIAYHELYGKEIRYCSNRKEAKDHGDAGILLGSPIQDG DRVVIIEDVTTSGKSIEETFPIIKAQGDVTILGLMVSLNRMEKGKGDKSALDEIHEKYGF HAKAIVTMAEVIEYLYNKEYNGKIMIDDTLKAAIDAYYEIYGVK >gi|229784111|gb|GG667624.1| GENE 10 6136 - 6843 978 235 aa, chain - ## HITS:1 COG:BS_pyrD KEGG:ns NR:ns ## COG: BS_pyrD COG0167 # Protein_GI_number: 16078618 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Bacillus subtilis # 1 232 65 297 311 250 54.0 2e-66 MINAIGLQNPGIEVFAKRDIPFLRQYDTKIIVNVCGRTTEDYVEVVERLGDEPVDMLEIN ISCPNVKEGGIAFGQDPKAVEAITREVKKHAKQPVIMKLSPNVTDITVMAKAAEAGGADV ISLINTLTGMKIDINRRTFAVANKTGGLSGPAVKPVAVRMVYQAANAVKIPIIGMGGIMN AEDALEFILAGATAVSVGTANFHNPYATAEVVSGIENYMKKYNIEDINELVGAVR >gi|229784111|gb|GG667624.1| GENE 11 7777 - 7971 358 64 aa, chain - ## HITS:1 COG:lin1947 KEGG:ns NR:ns ## COG: lin1947 COG0167 # Protein_GI_number: 16801013 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Listeria innocua # 3 62 4 63 304 66 55.0 1e-11 MNMKVNLAGVELKNPVMTASGTFGSGAEYGEMVDLNGLGAVVTKGVANVPWPGNPTPRIA ETYG >gi|229784111|gb|GG667624.1| GENE 12 7971 - 8747 839 258 aa, chain - ## HITS:1 COG:BH2535 KEGG:ns NR:ns ## COG: BH2535 COG0543 # Protein_GI_number: 15615098 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Bacillus halodurans # 10 258 9 259 259 209 43.0 5e-54 MAQRKVTAFVDRQERLTEDVYSMWIRDEEMAAQAKPGQFISVYTKDGAKLLPRPISICET SKETGMLRIVYRTVGAGTEEFSRYQAGDPVDIMGPLGNGFPLEEAAEGKTAFLIGGGIGI PPMLELAKQLRCKKEVILGYRDVLFLNEEFAPYGDVVLATEDGSAGTKGNVIDAIREHGL KADVIFACGPTPMLRALKEFAASENITCFLSLEEKMACGIGACLACVCKTKNVDEHSHVH 
NARICKDGPVFRAEEVEL >gi|229784111|gb|GG667624.1| GENE 13 8756 - 9673 1198 305 aa, chain - ## HITS:1 COG:CAC2652 KEGG:ns NR:ns ## COG: CAC2652 COG0284 # Protein_GI_number: 15895910 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Clostridium acetobutylicum # 1 305 2 286 286 209 39.0 5e-54 MINQLIQKIQETKAPICVGLDPMLGYIPEHVTRKAFDEFGETLEGAAEAIWQFNKEIVDH VCDIIPSVKPQIAMYEQFGIEGLKIYKRTVDYCQEKGLLVIGDAKRGDIGSTSAAYATAH IGHVKVGGSILTGFGTDFLTVNPYLGTDGVKPFVDVCKEEDRGLFVLVKTSNPSSGEFQD QLIGGRPLYELVADKVVEWGADCMDGAYSNVGAVVGATYPEMSRILRKLMPNTYFLVPGY GAQGGTAEDLKYCFNEDGLGAIVNSSRGIIAAYKQEKYAKFGAEHFGEASRQAVIDMVSD INSVL >gi|229784111|gb|GG667624.1| GENE 14 9912 - 10325 513 137 aa, chain + ## HITS:1 COG:RSc0200 KEGG:ns NR:ns ## COG: RSc0200 COG0071 # Protein_GI_number: 17544919 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Ralstonia solanacearum # 31 136 37 139 140 67 33.0 1e-11 MLLSRRNNLFDEFFNDPFFNDSVQNRTTLMRTDIEDDGTNYIIDVELPGYKKENVRAELK DGYLTIYAEASGSSEDSENKNFIRKERYSGSCKRSFYVGSQLRQEDIKAAFDNGILKLTV PKEAPKQIEENHYITIE >gi|229784111|gb|GG667624.1| GENE 15 10520 - 11146 554 208 aa, chain - ## HITS:1 COG:MTH933 KEGG:ns NR:ns ## COG: MTH933 COG0778 # Protein_GI_number: 15678953 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanothermobacter thermautotrophicus # 21 176 63 213 243 80 32.0 2e-15 MEIKENIGKDLPVKGLWKVEAPYYLVFYSEEKDGWMMNAGYVLEPVLLYMTGKGLGTCYL GSTRIPGPEPAGMKTAVAVAFGYPRSLLYRDPATAKRLPLKELCVFKDEIGEPLKNILKA VRLAPSAMNTQPWRFIVYHDRIYIFSCREFLPSQTMVAMRDLSMGIMLSHLSIAAEELWL NLKIEVEEQMVKKSYKSGAYVATAFLLE >gi|229784111|gb|GG667624.1| GENE 16 12063 - 12224 241 53 aa, chain - ## HITS:1 COG:no KEGG:Closa_1595 NR:ns ## KEGG: Closa_1595 # Name: not_defined # Def: nitroreductase # Organism: C.saccharolyticum # Pathway: not_defined # 1 49 1 49 257 66 69.0 4e-10 
MNLYQMIFKRRSIRKFKYEAVPEQLIKDVLAFADRVATLCPEISTKMEIKVAS >gi|229784111|gb|GG667624.1| GENE 17 12244 - 14025 1738 593 aa, chain - ## HITS:1 COG:BH2856 KEGG:ns NR:ns ## COG: BH2856 COG1164 # Protein_GI_number: 15615419 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Bacillus halodurans # 1 593 5 597 598 586 49.0 1e-167 MKKRDEIPVCDTWNLADILPSDEAWEALFLETERELAGYKEFEGHLAESGEMLFRCLKFD EEISLKIERLFVYGREKSDEDTAKAKYQDFYARAQALSYKAAGLSSFIVPEILAMDEQAL GRFRSTVKELDRYSRTFEIIMKKKAHTLSAGMEELLAKSMEATQSSSDIFNMFNNADAKF PSITDSEGKEIPVTHGTYIPLMENRDRRVRKAAFQSMYSVYGQFANTLAATFAANVKQAS FYAKARNYGSSRAYYLSENEIPESVYDNLTEAIQEGLPLFHEYVSVRKKVLGLDEIHMYD LYTPMVEHDDRKIPYEEAKETVKKGLAPLGDKYLALLQEGFDHRWIDVYENEGKRSGAYS WGVYGTHPYVLLNYSGTLDSVFTLAHEMGHSLHSWFSDQALPYVDAGYKIFVAEVASTCN EALLNRYLLAVTEDKKERAYLLNHFMESFRGTVFRQTMFAEFEAETHRRSEAGEPLTAEL LCRLYHELNEKYFGPEMVVDREIDYEWARIPHFYTPFYVYQYATGFSAAMAISSRILAGD EQALEGYFKFLSGGNSMPPIELLKLCGVDMSTGQPVKEALAVFKELLEEFKSL >gi|229784111|gb|GG667624.1| GENE 18 14269 - 15030 479 253 aa, chain + ## HITS:1 COG:no KEGG:Closa_1593 NR:ns ## KEGG: Closa_1593 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 253 1 255 255 196 45.0 1e-48 MKIIRTEHLDQQQLKDLQALETACRLHDHTSLTFPTEDGGLFFLLYDEETLLSAFSAFFN EFETCSCSAFTLPSHRGRGHFSLLLEELLKESGECDLLFLADDSCPDTKKTLDALEAEFL NREHMMELRLPAPQSGSESHSVEFSPALSTDSGDKRYVISENSRPVGSFHLIFQNDTVYF YDFEIHEKLRNQGIGGRAAAALIGELNRMAKTEPEYFPFSRVMLQVSGDNGPALQLYRKL GFEFTETLAYYLY >gi|229784111|gb|GG667624.1| GENE 19 15138 - 15911 706 257 aa, chain - ## HITS:1 COG:no KEGG:TepRe1_1356 NR:ns ## KEGG: TepRe1_1356 # Name: not_defined # Def: GntR domain-containing protein # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 16 240 4 227 252 105 32.0 2e-21 MDEKKRLLQTGVFKNESDIVLFLLVRQIGHRTGPVGSWTLKEELDSMGIAYGTATVGRYL KMLDYKGYTIHKGNQGRILSDEGKKWLSDMEDSLNRAEVRNESSKALQVNKYSDLIDLMK ARKAIEVETVRLAAIHATQEEITELRKSINIYYRFITEKKDFVEPALDFHGIIADMSHNR 
FMKAILEMLIFEEKQIEDNIDQLETRKRGNTYVIEHDDITRAIEERNPDLAAELMDRHME GILEAVEYQVNLMEKVE >gi|229784111|gb|GG667624.1| GENE 20 15985 - 16608 505 207 aa, chain - ## HITS:1 COG:MA1185 KEGG:ns NR:ns ## COG: MA1185 COG0451 # Protein_GI_number: 20090051 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Methanosarcina acetivorans str.C2A # 4 202 109 309 311 98 33.0 8e-21 MTSLEAARNYGIERYLYMSSGAVYGNVSMDVVTEEVPMHSENPYGATKVACEELVRNYGL DSASLRIGFVYGPGRKYECPVNMILRDCILKSEVMWERGMDQKLDYIYLDDCVKAIATIA MADKLPHTEYNVGGGEIVSYSRIVDKVRELYPKVPVSIGRGDLGYDNLGALSVERAYRDF GWKPEVSIEEGIEKYDRWLRKTMNHCS >gi|229784111|gb|GG667624.1| GENE 21 17613 - 17768 182 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620450|ref|ZP_06113385.1| ## NR: gi|266620450|ref|ZP_06113385.1| oxidoreductase, short chain dehydrogenase/reductase family [Clostridium hathewayi DSM 13479] oxidoreductase, short chain dehydrogenase/reductase family [Clostridium hathewayi DSM 13479] # 1 51 1 51 51 89 100.0 1e-16 MATLLIGGNGLVGTALVRYLTEQGEAVISFSAHSPSEEVNGCTYIQGDVTE >gi|229784111|gb|GG667624.1| GENE 22 17805 - 18956 1182 383 aa, chain - ## HITS:1 COG:AGc5109 KEGG:ns NR:ns ## COG: AGc5109 COG1879 # Protein_GI_number: 15890064 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 73 382 88 357 357 68 24.0 2e-11 MKKTLLLGLAAVCAWSLTACGSGAEAETTTAPETTKQEKTEMETENSKEKDEYYVEMVGN SFAIQFCNINVDGAKAAADELGNVKLNYNASQDSSQVQEQIDILNQAIAKKPDAILIAAA DPDALEAQMIKARDEGIPVVAYDISFNHAPEGALAATVSTNNEAAAGLAAEKLMENEAFV EQVKQATAEKPMVVSCLAPDAVMTAHEQRINGFTLTLYELLQEYQPGAVEITGHTSFEKA SENPAAVTIRAEIPPTKADADIRNQAQKMINGNDVAAVFCVNEASVTALLSATTDGTDLD REKGKYKDLIAIGFDAGKTMKGAVASQYFYGAVAQDPFAMGYEGLKICIDAVNGREVKNL DAPAVWYDHTNMEEDDIAKILYD >gi|229784111|gb|GG667624.1| GENE 23 19001 - 19960 682 319 aa, chain - ## HITS:1 COG:VCA0129 KEGG:ns NR:ns ## COG: 
VCA0129 COG1172 # Protein_GI_number: 15600900 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Vibrio cholerae # 2 309 21 329 332 156 35.0 4e-38 MKKILSSQRAIALLTLGVLFLFFSVFGSNFCTGDTMVNLLESTYYIICVSFGMTFVISTG GIDLSVGTVAMCGALVGGVAYNVWGLPMWFSLIIIVFTGMLFGILNGILVSYLNLPAFVA TLGTMMMSQGAGYIISGVQTMRYPSISEPDGWFKRIFYKSLDGVPMGVIYMVILFAAAFF LFKYTKIGKYACAIGSNKEAARLSGVNVKRWGLLVYVISGIFAGLAGIFYAATYTAILPG SGSGIETNAIAATVIGGTSMTGGSGSMFGTIIGVFIMGILKNGLMTIGIQQQWQVLFTGA AVLLAVLLDIYRNNKIKRG >gi|229784111|gb|GG667624.1| GENE 24 19985 - 21103 1048 372 aa, chain - ## HITS:1 COG:VCA0128 KEGG:ns NR:ns ## COG: VCA0128 COG1129 # Protein_GI_number: 15600899 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Vibrio cholerae # 1 372 126 495 500 348 50.0 8e-96 MLARLELDVSPDTIVGELTVARQQMVEIAKGISYHSRVLIMDEPTAALAVNEIEELFRVI RQLKEDGISIIYISHRLEELFEITDRITVMRDGCYIDTMETKESSMEALIQKMVGRTISW EKKMSSCVRKDADIVLEVKNMISKDIRDISFVLRRGEILGLAGLMGAGRTELARLIFGAD WYRSGKIYRNSREVRIRSTSDAVANGISYLSEDRKGNGLAIDLSVADNVALPNWDEFTRG IVINFNRMEEAVETVTKSVSVKTPSIGQLVRNLSGGNQQKVVIAKWLLKNSEIFIFDEPT RGIDVGAKNEIYKLMEQLIREGKSIIMISSEMPELLRMSDRILVLCEGRLTGELDIGEAT QEKIMQYAVMRS >gi|229784111|gb|GG667624.1| GENE 25 22364 - 23749 799 461 aa, chain - ## HITS:1 COG:DR1153 KEGG:ns NR:ns ## COG: DR1153 COG0044 # Protein_GI_number: 15806173 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Deinococcus radiodurans # 10 448 6 436 448 280 37.0 4e-75 MIDLAEQTTVLKGGMVVLPGCVKKMDLVIKGERIEALVNDFIMDETADCRIIDVSGKVIL PGIIDTHVHMWDPSPFNYREDWYSGSQCAAEGGITTIVDMPLSVPPVVDRQGFGLKYETA NRQSFVDFAFWGGLTPGCLDQMEELNNLGCVAYKGFMSFANPDYPQITDGYLVKGMEIAR KFDGLIGVHAENAEVADFGCRWLAEMGECDPARYDDARPWWVELEAIQRACLFSRAVGNR LYICHMTIAQGAEFLKQEKCRGTRVYVETCPQYLLFDRNVLRSKGAYAKCNPPLRSRENV 
EKLWNYVMDGTIDTIGSDHGPYRDEEKTKEGDFFKELCGFGGFDGLLPSMLTEGVNKRGL PLERLADLTSGNAAKIMGLYPKKGSLLPGTDADIVVVDLNEEWTFDGKKSFSKTKSDKNV YHGMDMKGRVKQTWVRGKLVYRDGSITGKAGYGQYVPRQSR >gi|229784111|gb|GG667624.1| GENE 26 23967 - 25136 419 389 aa, chain + ## HITS:1 COG:no KEGG:Plim_3780 NR:ns ## KEGG: Plim_3780 # Name: not_defined # Def: hypothetical protein # Organism: P.limnophilus # Pathway: not_defined # 37 384 81 418 823 274 42.0 5e-72 MTYINPDSALTPANIYVEATQSRFSDSYRRWQGIPSIEVTGNGRIFVNFYSGQDAEVGGN IMVLCVSDNHGESFRSCVTVVEHPDPECRIYDPNLWIAPDKKLWMFYTQARGFNDGRSGV WVTVCDQPDTDPLTWSAPRRIANGIMMNKPIITTKGEWLFPCAIWCDTSGSVPAERHGLE QEQFSNVYASSDKGKTISLRGHADIPNRSFDENMIVEKKDGSLWMLVRTFDGIGESFSTD GGYTWTPGQKSHIDGPCSRFHISRLKSGRLLLINHYQFDQRIDLEDIMHQGNVKKWKGRS HLSALLSEDDGQTWPYSLLLDERNEVSYPDAKEADDGFIYVTYDHERVTEREILMARFTE EDIIKGKTVTCGSNFKIVVNKATGQPNIH >gi|229784111|gb|GG667624.1| GENE 27 25229 - 25354 80 41 aa, chain - ## HITS:1 COG:no KEGG:Closa_1592 NR:ns ## KEGG: Closa_1592 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 41 170 210 211 62 65.0 8e-09 MISLETHNEKNVAFYRQFGFKVFGVMEKHFDLKQYGMIREV >gi|229784111|gb|GG667624.1| GENE 28 25461 - 26189 1036 242 aa, chain - ## HITS:1 COG:CAC2295 KEGG:ns NR:ns ## COG: CAC2295 COG0217 # Protein_GI_number: 15895562 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 239 1 237 246 247 56.0 2e-65 MSGHSKFANIKHKKERNDAAKGKVFTVIGREIAVAVKEGGPDPANNSKLRDVIAKAKANN MPNDTIDRGIKKAAGDANAVNYETITYEGYGPNGVAIIVDTLTDNKNRTAANVRSAFTKG GGNVGTPGCVSYMFDKKGQIIIDKEECEMDPDELMMMALDAGAEDFSEEEDSYEVITAPE DFSAVREALEAAGIPMVEADVTMIPQTWVELTDEESIKKMNRTLDLLDEDDDVQATYHNW DE >gi|229784111|gb|GG667624.1| GENE 29 26273 - 27328 993 351 aa, chain - ## HITS:1 COG:CAC0460 KEGG:ns NR:ns ## COG: CAC0460 COG1253 # Protein_GI_number: 15893751 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Clostridium 
acetobutylicum # 6 323 124 436 443 198 37.0 2e-50 MLLISFGIVIPKRCAAQNPEKWGYHMLPVVTFFMVPLMPFTWLINAVAFLFLKLIGIDMM SDNENVTEEDIMSMVNEGHEQGVLEAREAEMITNIFELDDKDAGDIMTHRKNLVALDGSM TLREAVNFILKEGYNSRYPVYEKDIDDITGILHMKDALIAAENGSNGMVPICEIDGLLRE AHFIPETRNIDSLFKEMQSQKIHMVIVVDEYGQTAGIVTMEDILEEIVGSIMDEYDVDEE FIAQAEDGSYIISGMAPLDEVAKTLDIEFEEDDYDSYDTINGFLISRLDRIPQEGEQTEV EYQGYGFKILQVENKMIHTIRVQKLVPPPDTPPGPENGGESPADSENSLPD >gi|229784111|gb|GG667624.1| GENE 30 28325 - 28426 107 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTPAPSICMAALLVAMKNDTPLPPICQVLTLVS >gi|229784111|gb|GG667624.1| GENE 31 28450 - 28815 210 121 aa, chain - ## HITS:1 COG:BS_ydcE KEGG:ns NR:ns ## COG: BS_ydcE COG2337 # Protein_GI_number: 16077533 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Bacillus subtilis # 7 119 1 113 116 152 66.0 1e-37 MTECEKVIIRRGDIFYADLRPVVGSEQGGIRPVLIIQNDIGNKHSPTVICAAITSRMNKA KLPTHVELDTRRCDMIKDSVILLEQLRTIDKQRLKEKICHIDEELQEKVDYALKVSLELD T >gi|229784111|gb|GG667624.1| GENE 32 28951 - 30135 1091 394 aa, chain - ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 1 384 1 385 386 332 44.0 5e-91 MERYSRVYETVNLDAIRQNMEAMRANLKEGTGMIGVVKADGYGHGSVPVACAIDPYVSGY ATATPEEAVILRRHGITKPILILGVSPESSFEDLIRYELRPAVFRYESAARLSELSVKAG KISPIHIALDTGMSRIGYPVTAAAAEETVRISRLPGIRIEGLFTHFARADERDKGATARQ MELFTEFISMLTERGITIPVLHCSNSAGILELPQANFNAVRAGISIYGLYPSDEVEKEPV HLTPAMELKSTISYLKTIAPGTPVSYGGTFTAERKTAIATIPVGYGDGYPRSLSGRGDVL IRGRRARILGRVCMDQLMVDTTEIPEAEEGDEVTLIGKDGDEEITVEELARIGGGFHYEI LCDIGKRVPRVYVEDGKVVGKKDYFDDIYEGFGK >gi|229784111|gb|GG667624.1| GENE 33 30156 - 30275 139 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGVYLHGRAGDLVRLQSGEHGLLARELTEALADILKETE >gi|229784111|gb|GG667624.1| GENE 34 31213 - 32580 564 456 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90022317|ref|YP_528144.1| ribosomal protein S15 [Saccharophagus 
degradans 2-40] # 4 456 10 453 500 221 32 8e-57 MRYLVSGSQMKEVDSYTIQSIGIPSLVLMEQAAAAVVREVQKKAEKKDRIWAVCGTGNNG ADGIAAARLLHLKGYTVTVILAGTKENGTDEYRKQLAIAERLGVNLIEYSDFIPGRCELV IDAVFGVGLSREVSGVYRDLLATLSGCGASWVVAVDIPSGIHADTGEIMGTALKADVTVT FGYEKLGTLLYPGRAYSGIVVVADIGFPPVSLERLSPELFTFEPADIRMIPARPAYSNKG TFGRVLIAAGSKNMSGAAWLSAMAAYRMGAGLVKIFTVEENRAVLQTSLPEAIITTYDPE EAQASTVRFQTLLEAQCEWASVIVLGPGIGQEPYTRNLVETVLANAYVPIILDADGLNTI AAFPELTNYFTENIIVTPHLGEMARLTGSPVEDIRRRLIPAAREYADRYGITCVLKDAVT VAALKDQRTYINTSGNSSMAKAGAGDVLTGIIAGLL >gi|229784111|gb|GG667624.1| GENE 35 32725 - 33426 811 233 aa, chain - ## HITS:1 COG:CAC2713 KEGG:ns NR:ns ## COG: CAC2713 COG2344 # Protein_GI_number: 15895970 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Clostridium acetobutylicum # 3 210 4 208 214 231 54.0 8e-61 MAEKEISKAVIARLPRYYRYLGELMEDGVERISSNDLSVRMKVTASQIRQDLNNFGGFGQ QGYGYNVKYLYTEIAKILGIDRQHNIVIIGAGNLGQAIANYANFEKRGFLIKGMFDINPR LIGLVVRGIEIRSVDDLEQFVRDNDVQIAALTIPKTKAPEIADRLVSAGIKAIWNFAHTD LVVPDDVVVENVHLSESLMRLSYKVCSMQDKAEEEKAEREKGKSAKADAAKKS >gi|229784111|gb|GG667624.1| GENE 36 33731 - 35668 2266 645 aa, chain + ## HITS:1 COG:CAC2714 KEGG:ns NR:ns ## COG: CAC2714 COG0488 # Protein_GI_number: 15895971 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 644 2 638 643 535 46.0 1e-151 MILSCSNISKAFGSTEIIRHASFHIEDHEKAAIVGINGAGKSTLLKIIIGELAADEGDVV ISRGKTLGYLAQHQELSGERTVFDEVLEVKRPLIEMEARIRKLELDMKHASGEELENMLS TYSRLNHEFELQNGYAYQSEVTGVLKGLGFTEEEFTKPVSALSGGQKTRVSLGKLLLTKP DIILLDEPTNHLDMNSIAWLEGYLTNYDGSVIIVAHDRYFLDRVVTKIIEIDNGALAVYQ GNYTAYSEKRAMIRDAKMKAYLNQQQEIRHQEEVIAKLKSFNREKSIKRAESREKMLDKI EILEKPAEVNDEMRIRLEPNIISGNDVLTVRGLSKAFGPNHLFDHVDFEIKRGERVAIIG GNGTGKTTILKILNNLLPADTGEIRLGSKVHIGYYDQEHQVLHMEKTLFDELQDTYPTMN NTQIRNILAAFLFTGDDVFKRVRDLSGGERGRVSLAKLMLSEANFLILDEPTNHLDITSK EILEDALNSYTGTVLYVSHDRYFINKTATRILDLTGHTLVNYIGNYDYYLEKKELMTELY 
ASEPAGSQTSGTEGGASQTSESSEVKQDWKAFKEEQARIRKRQNDLKKTEDEIHRLETRD SDIDGLLTQEEVFTDVSRLLELNKEKEEIQTRLEELYEIWETLAE >gi|229784111|gb|GG667624.1| GENE 37 35760 - 36521 939 253 aa, chain - ## HITS:1 COG:TM0441 KEGG:ns NR:ns ## COG: TM0441 COG1028 # Protein_GI_number: 15643207 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Thermotoga maritima # 4 248 7 251 255 143 35.0 4e-34 MKKMEGKVVLITGGARGIGYGIAKAFAEEGANIVITGRTASTLIEARENLEADYGIEALF VVADGGDETAVKNVVKQAIDRFGKLDCLINNAQVSKSGTMLVDHTKEDFDLAVYSGLYAT FFYMREAFPYLKETKGSVINFASGAGLFGKLGQSSYAAAKEGIRGMSRVAAAEWGEYGIN VNVVCPLAMTPGLEKWKDEYPKLYDQTIQSIPLKRFADPEKDIGRVCVFLASDDAGYVTG ETITLQGGSGLRP >gi|229784111|gb|GG667624.1| GENE 38 36774 - 37598 967 274 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1 100 533 631 693 67 37.0 2e-11 MAAGWQKIGGYYYCFRETGDLAIGWCYNDEEEKWYYFDKDGTAKKGWFQDEDGSWYWFSA RGEMASSGYKSIDGKRYYFFEDGQMAANQYVGLFYMDENGQRDKRYDIVIEGKSKESSVA SETKDEITAALERIPRGWIKYFAEHGWEILYYPDKEYFSAPESDGGTYYVYHKLDTSYRK IKFCRPEGLTQAFGEYIGYAADCFDSDSQFAVDLAMNKGNVDEFVDVPDYYTNNMQFYFG KLVEAYLSDSDTRTAMEADSPQVCEILRKVLYAR >gi|229784111|gb|GG667624.1| GENE 39 38631 - 38843 278 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620467|ref|ZP_06113402.1| ## NR: gi|266620467|ref|ZP_06113402.1| outer membrane protein H.8 [Clostridium hathewayi DSM 13479] outer membrane protein H.8 [Clostridium hathewayi DSM 13479] # 1 70 11 80 80 74 98.0 2e-12 MKKRWAAAVAVSMLLSLGSVMAAAADETEAVQTENTTSERENAENGTAENTAGQTEETPE ADPGNEETGT >gi|229784111|gb|GG667624.1| GENE 40 39061 - 39276 146 71 aa, chain + ## HITS:1 COG:no KEGG:Closa_1582 NR:ns ## KEGG: Closa_1582 # Name: not_defined # 
Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 69 1 69 71 72 62.0 6e-12 MKWKRPLAMAGVILILAMYVIAIVSAFSKSTSAKRWLMAALFSTVVIPVILYAFQLVYRL LNPEGKDRDNE >gi|229784111|gb|GG667624.1| GENE 41 39348 - 40622 1574 424 aa, chain - ## HITS:1 COG:MT0373 KEGG:ns NR:ns ## COG: MT0373 COG0104 # Protein_GI_number: 15839743 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 5 424 6 424 432 376 45.0 1e-104 MVRAIVGANWGDEGKGKITDMLAKESDIIIRFQGGSNAGHTIINNYGKFALHLLPSGVFY SHTTSVIGNGVALNIPFLIKEVEELVSKGVPKPHILVSDRAQILMPYHILFDQYEEERLG KKSFGSTKSGIAPFYSDKYAKIGFQVSELFDSESLKEKVARVCETKNVMLEYLYHKPVLD QEELYQTLLQYRDMVAPYVCDVSKYLYDAIKEGKNILLEGQLGSLKDPDHGIYPMVTSSS TLAAYGAIGAGIAPYEIKNITTVVKAYSSAVGAGAFVSEIFGEEADELRRRGGDGGEYGA TTGRPRRMGWFDAVASRYGCRIQGSTEVALTVLDVLGYLDELPVCIGYEIDGEVTKDFPT TVRLERAKPVYKVLPGWKCEIRGIRKYEDLPENCRNYIEFIEKEIETPITMVSNGPGRDE IIYR >gi|229784111|gb|GG667624.1| GENE 42 40616 - 40744 140 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870232|ref|ZP_06409674.1| ## NR: gi|288870232|ref|ZP_06409674.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 42 40 81 81 85 100.0 1e-15 MKRSGIEVHCGLTRSQIEVHCGPIQAPPRLYRNELKEEISKW >gi|229784111|gb|GG667624.1| GENE 43 40924 - 41817 942 297 aa, chain - ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 4 297 2 294 296 192 35.0 5e-49 MSSNLEYYKVFYYAGELGSVTLAAEKLCISQPAVSQAIRQLELSVDAKLFIRTSKGVRLT REGELLYSYVKSGLDHIWYGENMLKRMQDLDTGEVRIGASDMTLQFYLLPFLERFHEKYP KVKVTVSNGPTPETVGFLSRGTIDFGVVSSPVDARPEISVTEVKKIRNVFVAGEQFGYLK GKPLEYGILKELPCIFLEKNTSTRRFMDDYLEQLGIVVEPEFELATSDMIVQFAMRNLGI GCVMSEFAEKEIESGRLFELNFKAGMPERSFCIITDRKNPISPAGGRLLELMLENRE >gi|229784111|gb|GG667624.1| GENE 44 41885 - 43315 1703 476 aa, chain - ## HITS:1 
COG:CAC1821 KEGG:ns NR:ns ## COG: CAC1821 COG0015 # Protein_GI_number: 15895097 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Clostridium acetobutylicum # 1 476 1 476 476 637 63.0 0 MNDRYQSPLSERYASKEMQYIFSPDKKFRTWRKLWIALAETEMELGLPVTKEQIDELKAH QDEINYDVAKQREKEVRHDVMSHVYAYGVQCPKARGIIHLGATSCYVGDNTDIIVMTEAL ALVRKKLVNVIAELAKFADKYKDQPTLAFTHFQPAQPTTVGKRATLWMHELTMDLEDLDY VAGSIRLLGSKGTTGTQASFLELFDGDMDKVRKLDPMIAEKLGYPGCYPVSGQTYSRKVD SRVLNILAGIAQSAHKFSNDIRLLQHLKEIEEPFEKSQIGSSAMAYKRNPMRSERIASLS NYVMSDVMNPMMVASTQWFERTLDDSANKRLSIPEGFLAIDGILDLYLNVVDGLVVYPKV IEKHMMAELPFMATENIMMDAVKAGGDRQELHERIRELSMEAGKNVKENGMDNNLLELIA ADPAFNLSLEELKKNMDPKKYVGCAPAQVEIYLEEVIRPLLLANRDDLGMTAEINV >gi|229784111|gb|GG667624.1| GENE 45 43718 - 45160 1214 480 aa, chain - ## HITS:1 COG:CAC1392 KEGG:ns NR:ns ## COG: CAC1392 COG0034 # Protein_GI_number: 15894671 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Clostridium acetobutylicum # 11 472 14 467 475 483 52.0 1e-136 MRHDLTITAGDELHEECGVFGMYDFDHGDVASTIYYGLFALQHRGQESCGIAVSDTAGPK GKVNAYKGMGLCNEVFTPEILEGLHGNIGVGHVRYSTAGSSTRENAQPLVLNYIKGTLAL AHNGNLVNAPELRRELEYSGAIFQTTIDSEVIAYHIARERVKTSSVEEAVAGAMKKIVGA YSLVVMSPRKMIGARDPFGFKPLCIGKRDNAYILVSESCALDTIGAEFVRDVRPGEIVTI TKDGIASDTRLCLKDPAEEARCVFEYIYFARPDSVFDGVSVYHARLLAGRALAMDSPVEA DLVVGVPESGNGAALGYSMESKIPYGTAFVKNSYVGRTFIKPKQSSRESAVRIKLNVLKE AVAGKRVIMIDDSIVRGTTSALIVGMLREAGAKEVHVRISAPPFLHPCYFGTDIPSEDQL IAHGRTVQEICDMIGADSLSFLRQERLTEMAQGLPICTACFTGKYPIQPPKEDIRGEYVQ >gi|229784111|gb|GG667624.1| GENE 46 45185 - 45892 961 235 aa, chain - ## HITS:1 COG:CAC1391 KEGG:ns NR:ns ## COG: CAC1391 COG0152 # Protein_GI_number: 15894670 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Clostridium acetobutylicum # 1 232 1 232 235 281 62.0 8e-76 MEKKELLYEGKAKKVYTTEDPDVLIVDYKDDATAFNGLKKGTIVGKGAINNRMTNHIFKK 
LEAEGVPTHFIEELSDRETAVKKVEIVPLEVIVRNFSAGSFAKKMGMEEGIKFACPTLEF SYKNDDLGDPFINSYYALALNLATQEEIDAITKYTFKVNDVMREYFDSLGIELIDFKIEF GRYHGQIILADEVSPDTCRLWDKETHEKLDKDRFRRDLGNVEDAYEEVFRRLGIQ >gi|229784111|gb|GG667624.1| GENE 47 46085 - 46336 227 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620474|ref|ZP_06113409.1| ## NR: gi|266620474|ref|ZP_06113409.1| hypothetical protein CLOSTHATH_01564 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01564 [Clostridium hathewayi DSM 13479] # 1 83 1 83 83 145 100.0 9e-34 MTLAELIVIGITDPAIFAEAKHLFDINWSTERSVEQTGFEEEVSPYPSDRTDSGVTIDVK EGKVEFWRTEEKVRREGGESTGP >gi|229784111|gb|GG667624.1| GENE 48 46540 - 47289 814 249 aa, chain - ## HITS:1 COG:all7616 KEGG:ns NR:ns ## COG: all7616 COG2949 # Protein_GI_number: 17158752 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Nostoc sp. PCC 7120 # 80 241 63 228 228 135 42.0 9e-32 MALGPERLEAIFSMLYNEAYNRNEDKPMKKLIKRAFVVLVILGVAGIVCLAAVNTYMIKS TEKRILTWKEASSLGADCILVLGAGVHPDGKPSNMLEDRLLRGIELYEAGASEKLLMSGD HGRKNYDEVNAMKQFAVDRGVPSEDVFMDHAGFSTYESMYRARDVFEAEKIIIVTQQYHL YRALYAAKQLGLDAYGVASDQRTYAGQKRRDVRELLARGKDFMTGIFQPEPTYLGEAIPV NGDGNVTND >gi|229784111|gb|GG667624.1| GENE 49 47314 - 47646 361 110 aa, chain - ## HITS:1 COG:no KEGG:Closa_1575 NR:ns ## KEGG: Closa_1575 # Name: not_defined # Def: mucin TcMUCI # Organism: C.saccharolyticum # Pathway: not_defined # 1 100 1 98 105 68 52.0 1e-10 MKRIKPYVFAVLVIMTVMCLSACGSNKNATNETTGGSSAAATQTSAMESSSAAESSSTDM GNTRNADEDAEESTGVIDGMINDVEKGVDDITGESNGHPTNSADMSSAAE >gi|229784111|gb|GG667624.1| GENE 50 47961 - 48152 83 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620477|ref|ZP_06113412.1| ## NR: gi|266620477|ref|ZP_06113412.1| putative ethanolamine utilization protein [Clostridium hathewayi DSM 13479] putative ethanolamine utilization protein [Clostridium hathewayi DSM 13479] # 1 63 54 116 116 118 100.0 1e-25 MVFTGGAIREAYADKDRRKKVPSKVRRHTLGPTPLTEWAFADAKPWGPAWIYPEPLITVF YRP >gi|229784111|gb|GG667624.1| GENE 51 48182 - 
49420 519 412 aa, chain - ## HITS:1 COG:BS_yhaZ KEGG:ns NR:ns ## COG: BS_yhaZ COG4335 # Protein_GI_number: 16078046 # Func_class: L Replication, recombination and repair # Function: DNA alkylation repair enzyme # Organism: Bacillus subtilis # 55 409 8 356 357 367 50.0 1e-101 MICLTFPLANISVEKAKLSLIAFDGEGFLSQVADLRQSPLEKRNYMPELMKDRYYNYNTI HELAMCINAVYPTFRADDFIANIMDETWDALELKARMRRITINLGKYLPHDYEHALGILD KAIAGYPVGIVDSGLLYFPDFVEMYGQDERHWDLSMAALERYTRYSTAEFAVRPFIIKHE ARMIAQMTAWTEHDNEHVRRLASEGCRPALPWGQALTSFKKDPSPVLQILERLKADPSLY VRKSVANNLNDISKTHPDLITELARDWYGKNEHTDWIVKHGCRTLLKKGNRAVLDIFGFS DVDCVNVVNFSLDAASVCVGEDMTFSFQIETKKETKVRLEYGIDYVRANGRRSRKIFKIS ELLLRENEKKSYRKTHSFVDVSSRKHYPGIHSVTLIVNGVEWDTLNFKLSTL >gi|229784111|gb|GG667624.1| GENE 52 50815 - 51234 500 139 aa, chain - ## HITS:1 COG:CAC2379 KEGG:ns NR:ns ## COG: CAC2379 COG0289 # Protein_GI_number: 15895645 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Clostridium acetobutylicum # 1 130 1 130 250 113 40.0 9e-26 MIKMVMHGCNGAMGQVISKIVEEDENIKIVAGIDLNDTIQHSYPVFPSLEACTVDADVIV DFASAKAVDGLLTYCSEKKMPVVVCTTGLSEEQIQHVEETARQTAVLRSANMSLGINLLL KLVKEAAEVLAAAGFDIDS >gi|229784111|gb|GG667624.1| GENE 53 51316 - 52200 1135 294 aa, chain - ## HITS:1 COG:CAC2378 KEGG:ns NR:ns ## COG: CAC2378 COG0329 # Protein_GI_number: 15895644 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Clostridium acetobutylicum # 1 294 1 292 293 317 55.0 1e-86 MALFEGAGVALITPFKDNGEVNYEKLEELLEEQIAGGTDSIVICGTTGEASTMTHEEHLN VIKYTCEVVNKRIPVVAGTGSNCTETAVYLSKEAEKYGADGLLLVSPYYNKATQNGLIAH FTAVADAVKIPIILYNIPGRTGVTIAPQTIVTLCKNVPNIVGVKEASGNISNIATMLAMA DGCVDVYSGNDDQIVPLLSLGGKGVISVLSNVAPAQTHEICASYFRGDVKESARLQLEAI PLINALFCEVNPIPVKAAMNLMGKKVGPLRLPLTEMEPANQERLKKAMEEYGIL >gi|229784111|gb|GG667624.1| GENE 54 53699 - 54205 623 168 aa, chain - ## HITS:1 COG:PH0075 KEGG:ns NR:ns ## COG: PH0075 COG2109 # 
Protein_GI_number: 14590029 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Pyrococcus horikoshii # 4 162 8 160 175 87 34.0 1e-17 MSKGTIQVICGEGKGKTTAALGMGISALLNGQTVIMIQFLKGCQELASCEILERLEPELK IFRFEKSDAFFETLSEEQQKEERMNIRNGMNFARKVLSTGECDLVILDEVLGLLDQGIIE MDEMKTLLQSRVEEIDLILTGKVFPKELDPFVDGIREIDHLKVDNTKQ >gi|229784111|gb|GG667624.1| GENE 55 54334 - 54960 662 208 aa, chain - ## HITS:1 COG:CAC2382 KEGG:ns NR:ns ## COG: CAC2382 COG0629 # Protein_GI_number: 15895648 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Clostridium acetobutylicum # 3 205 2 202 229 206 50.0 3e-53 MSEKMIENNKVSVIGEIISGFTFSHEVFGEGFYMVDVAVNRLSEQADIIPLMISERLIDV EANYIGCTIEALGQFRSYNRHEGTKNRLVLSIFVREIHFLEEFTDYTKTNQIFLDGYICK APIYRKTPLGREIADLLLAVNRPYGKSDYIPCIAWGRNARYASGFEVGARVKVWGRVQSR EYTKKLSETECEKRIAYEVSVSKLECAE >gi|229784111|gb|GG667624.1| GENE 56 55143 - 55289 80 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870238|ref|ZP_06409676.1| ## NR: gi|288870238|ref|ZP_06409676.1| hypothetical protein CLOSTHATH_01575 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01575 [Clostridium hathewayi DSM 13479] # 1 48 1 48 48 86 100.0 5e-16 MSPSVFVIVPSFSGWLSNAETVDLCGKRAVAVLLSPVFRANKGVTDFL >gi|229784111|gb|GG667624.1| GENE 57 56403 - 58115 2037 570 aa, chain - ## HITS:1 COG:CAC1684 KEGG:ns NR:ns ## COG: CAC1684 COG1217 # Protein_GI_number: 15894961 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Clostridium acetobutylicum # 2 569 3 567 605 680 61.0 0 MKMKREDVRNVAIIAHVDHGKTTLVDALLRQSGIFRENQEVVDRVMDSNDIERERGITIL SKNTAVNYNGTKINIIDTPGHADFGGEVERVLKMVNGVILVVDAYEGVMPQTKFVLRKAL ELGLSVVACINKIDRPEARPDEVEEEVLELLMDLDATEEQLDCPFLYASAKGGFAKKNLD DPEENMSALFQTIIDHIPAPEGDPEAPTQLLISTIDYNEYVGRIGVGKVDNGSIRVNQEC VIVNHHDPEKMRKVKVGKLYEYEGLNKVEVTEAGIGAIVAISGIADIHIGDTLCSPDNPE 
AIPFQKISEPTIAMNFMVNDSPLAGQEGKFITSRHIRERLFRELNTDVSLRVEETDSPDC FKVSGRGELHLSVLIENMRRENFEFAVSKAEVLYQYDERNRKLEPMEIAYIDVPEEFTGA VIQKLTSRKGELQGMSPANGGYTRLEFSIPSRGLIGYRGEFMTDTKGNGIMNTAFDGYAP YKGDLSYRKTGSLIAYESGESITYGLFNAQERGILFIGAGVKVYSGMVIGQNPKAEDIEI NVCKTKKLTNTRSSSADEALKLTTPKEMSS >gi|229784111|gb|GG667624.1| GENE 58 58223 - 58495 111 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620488|ref|ZP_06113423.1| ## NR: gi|266620488|ref|ZP_06113423.1| lipoprotein [Clostridium hathewayi DSM 13479] lipoprotein [Clostridium hathewayi DSM 13479] # 20 90 1 71 71 137 98.0 2e-31 MRNPVVKAGLCQLSYGYLRLLDKTVRLKTEFPAEMDRAGYSRAFSIKKRWDHYKIPLPFT TITVTAEKFGIVDRKSGEGKKLKNTLQSGI >gi|229784111|gb|GG667624.1| GENE 59 58431 - 58898 389 155 aa, chain - ## HITS:1 COG:DR1225 KEGG:ns NR:ns ## COG: DR1225 COG0438 # Protein_GI_number: 15806244 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Deinococcus radiodurans # 2 150 242 402 402 72 32.0 3e-13 MAEELGIGRQIVFAGRVPNTEVSNYLHAADGFLFASKSETQGIVLLEAMAAGCPVVAVRA SGVVDVVRQEKNGFMTEENEEAWASAAARLMTDPPLYRKLSAGAVETASLYKEEAVAKEA EKQYRKMIGRRNCPDEESSGQSRFMPAFLRLFTIA >gi|229784111|gb|GG667624.1| GENE 60 59856 - 60458 586 200 aa, chain - ## HITS:1 COG:TM0744 KEGG:ns NR:ns ## COG: TM0744 COG0438 # Protein_GI_number: 15643507 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 1 200 1 184 406 92 32.0 6e-19 MKIAMLTNNYKPFIGGVPISIERLAEGLRNLGHQVTVFAPEGAGCKEEPGVVRYRVLYEQ RDKGLVIGNFIDKRIEKEFRDGQFDVIHVHHPVVIGFTAVYLSKKYGIPLAYTYHTRYEE YLHYFKLYERMTAGREPVRNAARFSREVLVPKYLTAFADQCDTVFVPTASMKECLVEQGV AAPMEVLPTGISTAAFKLAS >gi|229784111|gb|GG667624.1| GENE 61 60471 - 60914 193 147 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|154175107|ref|YP_001408238.1| ribosomal protein L22 [Campylobacter curvus 525.92] # 10 139 33 162 199 79 34 8e-14 MLEYLNLPGFPAGVIMPLAGIWAAKGQIGLGMALLLALAAGMLGSWILYGLGRLGGHMLL EKYLKKFPAQRPVIEKNFAMLERRGAVGIFVSKLIPMVRTVISIPAGVIRMNFIKYSVSS 
ALGIFVWNFFLIGAGYWLGDRAISLFS >gi|229784111|gb|GG667624.1| GENE 62 61870 - 62067 301 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620492|ref|ZP_06113427.1| ## NR: gi|266620492|ref|ZP_06113427.1| DedA protein [Clostridium hathewayi DSM 13479] DedA protein [Clostridium hathewayi DSM 13479] # 1 65 1 65 65 90 100.0 3e-17 MDIQALTQYFLQYGAVFIFVIVLLEYLNLPGFPAGVIMPLAGIWAAKGQIGLGMALLLAL AARNA >gi|229784111|gb|GG667624.1| GENE 63 62244 - 67814 5102 1856 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1727 1853 511 631 693 108 43.0 7e-23 MKQKSKNRWKRTFSMCLAFQMAVTTPAAGVTLPVGDQGSKINTGRYAAVFDPASGGGSLA LFQGSSPDQAWNTMDEALIPGTTMLWDGSSGFEKLQITPETVLSAGLSTGAAMGIRLGNE AEGVKSWFLFEDEIICAGAGVKAHDSENRRLVTVVDTIPVTGKTKLALTNPSNGFRNVRN ASGEGGVWAPLVNTATQGVHTQRNWLSAASGNGGAKDLDWSYIFGDGESGSYLTKESVYY RLNTKGTDTAQLAELFLAPQETYQYAMVAGKTTDDYSNTSLYQTDARLLVNTTAVQAAAD VGEGVLAVTKWEPGTLRLDNQVVPVTMDSPGALELKKDPSTGNIELTVAKIPSQTVGTVK LVLEAAGKQVLSVSNPKALQEAVMGESVALTFDTAAMGSAPVTVAIEGKVPDALTGDEIT LVRGQDGRIQAPDSLHGPIRWEVKIPKPDGSFVRNAGSSKIKRELKDGETDGTRKAGDTD GSHIASIKGTADGYGIITAREKGTVSVVAADESGTTATWLVTVLYEDPDQLPAADASDFA AIRRRWKESMIGDDLTVLEGGREILQKIDEQAETAWEAYAYKGQESCPGIPWPADEGAAG NAEIPYEDDAVEFRPAFKKLLAMASAYAAKGSRYYQDPVMLSDMKNILDWLCSTCYTPKT QTDNWWTWEIGLPKDLIPALVLLYDDLTPEETALYTEALYFFQPDPYHEGIIGTGSTHAQ GYRTAQGANIIDCSRTALGLGILREDNELVYLASKASAETFVIQSVEDSTKIADQGYTSG FYADGSYLDHSHVPYLGSYGIEFLKGGVGLPPLLAKTPWDYPREVQENLEFYLKEGFLNG MYNGLMLDGLKGRSVSRPGAGNRASGREAMTLMIQLMNSVSSDVEEELKSALKAWIELDP GFLDSLTGAENMAVKEKAMEIRDDDSIAGGVQPVHKNFPLMDRAIHRTEDFQLALSMYSE RIQNTEIMNHENRYGWHQGSGMTYLYNSDILQYTENFWNTVNPLRLAGTTVVSKDIGTGQ PDSSGFAQGGDFRSRESWVGGSVIGNNGINGMSMTGEVRVKEGEGSPSVPYSPEFSARKS WFMFDGQIVCLGAGITNSGESSPVESIIENRRLRAEADNVLTVNGEAEALPVQEAGLEDI VSGSADVSGTSIPNVSWAHLEGNVEGSDIGYYFPGGDQTISVRKGKNAGDWSLIGTSEGA 
ASETYLEMWFDHGANPKNDTYEYVLLPGLSAGETESYSKEPGITVLANTPFVQAVSTADG LLTGANFWTDQPAQLGALSVNRAASVMTEESDDGILTVAVSDPTMKNTGSITVDIERPAE EVVSHDENVDVLLTDNGVRLVVHTKGTNGGSSCASVRVTSPVTPPTAERVVKLITLAAEA AEAAGDYGEEEVIKAVKEAIAAVWEVSNEELAEYCMDELLVLEELYRGVMSAGGIHAEVR IETPGQMDAEVTAAGALLNIPWPESLIPATPSQAVKATSSNAGYEAPGTVKPSVASPSTA TPSSAKRETAKRENVNQVTESPEAAKHAEIRSVIIKVTESGDPGRLEGDWKRSCRFQLYL EDENGVRSGPAVWRAPLKAGMKFDGDQVQTDTIRALLKTGSGELKPLPLSRNEGVISFIL PESADVYLTGNKTPDAEEIFHVSIEEGLTGGVITASPLSGKKGTRVTVKAVPDSGFRLVS LSANGKTLTADGDGRSSFVLREDTHLTGVFKRISGGGSTGSGTGSSSAVRSGWKKSGEVW YYYDAGGSMVTGWLLDNGKWYWLAADGAMKTGWLFNETDGFWYYLNADGSMAEGWNLIHD NWYYFAPQTEGSPGWKQINNRWVYEKPETVSRPHGAMYSGCTTPDGYSVDENGARK >gi|229784111|gb|GG667624.1| GENE 64 68028 - 68681 689 217 aa, chain - ## HITS:1 COG:BH3630 KEGG:ns NR:ns ## COG: BH3630 COG1739 # Protein_GI_number: 15616192 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 1 197 1 197 213 151 41.0 8e-37 MRKPYKILYQGGTAEITEKKSRFIASLRPVQSEEEALLFLEETRKKYWDARHNCYAWIIG ENGEQKRCSDDGEPSQTAGRPMLDVLEGEGICNACAIVTRYFGGTLLGTGGLVRAYSGAV QAGIKSSTVLAVMPAVRTAVTTDYNGVGKIQYLLGQRNITILDSQYSDQVVLTVLIPEEG KERLLKELTDLTNGRAGIEVQDEVAYGILDKRVILID >gi|229784111|gb|GG667624.1| GENE 65 68924 - 71611 2507 895 aa, chain + ## HITS:1 COG:CAC2301 KEGG:ns NR:ns ## COG: CAC2301 COG0744 # Protein_GI_number: 15895568 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Clostridium acetobutylicum # 39 744 36 674 809 291 32.0 4e-78 MNYGKHETEKKIKSVNSKAKKYTSRVFLAFLKSLFVLCLFCLVVAGSIGFGMVKGIIDNA PDVDIATIVPNEYATTVYDSAGNVTETLVTAGSNREEASYDELPGNLINAFVAYEDARFW EHNGIDLRSIMRAVKGVLTGDSSAGGGSTITQQLIKNSVFGGGMEKSFGERLERKMQEWY LAVKLDSAMSKEQIITNYLNTINLGKNSLGVKVAARRYFNKEVSDLTLSECAVLAGITQN PSKFNPVTGQKANSDKQKVILQYMVDQGYISKDEQEEALADDVYSRIQDVEIVTKETSTP YSYFTDELVEQVQDAMKEKLGYTNTQAHNMLYSGGLSIYTTQDPTIQSIVDEEINNPDNY SAARYSIEYRLSVTGADGSTTHYSEENLKRYHRDNGDSSYDGLYDSEEEIQADIENYKNS 
LLKEGDTIIGESLHKTLQPQASFVIMDQKTGQVKAIAGGRGEKTASLTLNRASNTLRQPG STFKVLTAFAPAMDTCGATLGTVYYDSVYTVGKKTFSNWYSSGYQGYSSIRDGIIYSMNI VAVRCLMETVTPQLGVEYAKNFGITSLTDTDYNAALALGGITDGVSNLEMTAAYATIANG GVYTKPVFFTKIIDHNGKVLIDNTPETHRVLKDSTAFLLTDAMADSMESNRKFARPGAGP SSTSSSANIPGMSNAGKSGTTTSNNDIWFVGYTPYYTAGIWGGCDNNQKLTKKNGGTSFH KAIWKKIMTRVHEGMSDPGFTVPDSVETAQICRKSGKLPVAGVCTNDPRGNAVYTEYFAK GTVPTDVCNNHVRATVCAESHCLPTPFCPERTTAVFMALPAGEEGTTDDSVFAMPGYCPI HTEASVIIPPSDDTGMPETVAPGTLPYGPGYVSPNTQTSPAPTAGSPVQMSPGGQ >gi|229784111|gb|GG667624.1| GENE 66 71704 - 72057 524 117 aa, chain - ## HITS:1 COG:no KEGG:Closa_1567 NR:ns ## KEGG: Closa_1567 # Name: not_defined # Def: SpoVA protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 117 1 117 118 179 89.0 3e-44 MEFVKAFIVGGVICALVQILMDNTKLMPGRIMVLLVVSGTILGALGLYQPFADWAGAGAT VPLLGFGNTLWKGVWKSMGEDGLLGVFEGGFTASAVGISGALIFGYLGSVIFKPKMK >gi|229784111|gb|GG667624.1| GENE 67 72119 - 73132 1247 337 aa, chain - ## HITS:1 COG:no KEGG:Closa_1566 NR:ns ## KEGG: Closa_1566 # Name: not_defined # Def: stage V sporulation protein AD # Organism: C.saccharolyticum # Pathway: not_defined # 1 337 1 337 340 614 87.0 1e-174 MQVGKASIKFEEPPVIESMASIVGKKEGEGPLGKLFDVVEQDDMFGADTWEQAESALQKQ TADLAIEKGDIRKKDIRYLFAGDLLGQLIATSFGTVDLEIPLFGLYGACSTMGEALNLGA MTVAAGYADKIIAMASSHFATAEKQFRFPLAYGNQRPFSATWTVTGCGAVVVSKHKNEGI AAITGITTGRMVDMGSKDSMNMGAAMAPAAFHTIEQNFEDFGVDESYYDKIITGDLGEIG RTILLDFMKNKDHKLEEIHTDCGIEIYDQETQDTHAGGSGCGCSAATLCSYILPKIKDGT WKRVLFVPTGALLSTVSFNEGQPIPGIAHAVMLEHIG >gi|229784111|gb|GG667624.1| GENE 68 73303 - 73755 630 150 aa, chain - ## HITS:1 COG:no KEGG:Closa_1565 NR:ns ## KEGG: Closa_1565 # Name: not_defined # Def: SpoVA protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 143 1 143 150 206 77.0 2e-52 MEINKQAYDAYVKEVTPTHNKWLSLVKAFFVGGVICTIGQLVSTWFMNGGMDKDTAASWT LLVLIAGSVILTGLNIYPKIVKFGGAGALVPITGFANSVVAPAVEFKAEGQVFGIGCKIF TIAGPVILYGILSSWILGILYWIGRWAGIF >gi|229784111|gb|GG667624.1| GENE 69 73801 - 74202 485 133 aa, chain - ## HITS:1 COG:no KEGG:Closa_1564 NR:ns 
## KEGG: Closa_1564 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 133 9 141 141 160 78.0 2e-38 MFLGLSAGGIIAAGVFAFLAIIGVFPRLIGKTRTNRHIFLYETVIIIGGVLGNVSDLYEF PLLMGGNMVLGIYGLAVGIFVGCLVMSLAETLKALPVISRRIHLAVGLQYLILSIGLGKL IGSLVYFSGNFGK >gi|229784111|gb|GG667624.1| GENE 70 74216 - 74677 462 153 aa, chain - ## HITS:1 COG:no KEGG:Closa_1563 NR:ns ## KEGG: Closa_1563 # Name: not_defined # Def: stage V sporulation protein AA # Organism: C.saccharolyticum # Pathway: not_defined # 1 153 56 208 208 263 79.0 2e-69 MESTLDVIKKITEMDPSITVNNVGEVDFIIDYHRPKSPNWVWEWIKTIFVCIICFCGASF AIMTFNNDGGVKDVFGEIYQIVMGKESSGFTILEVSYSAGLALGIIGFFNHFAKIKINTD PTPLEVEMRLYEDSISKTLIQNDGRKEQDIDIT >gi|229784111|gb|GG667624.1| GENE 71 75620 - 75775 162 51 aa, chain - ## HITS:1 COG:no KEGG:Closa_1563 NR:ns ## KEGG: Closa_1563 # Name: not_defined # Def: stage V sporulation protein AA # Organism: C.saccharolyticum # Pathway: not_defined # 1 50 1 50 208 88 88.0 1e-16 MSKTVYLNISQITEVHHKEVQLKDVATVYCDDSAVTNKCNALRIKTIHLDS >gi|229784111|gb|GG667624.1| GENE 72 75772 - 75933 253 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620502|ref|ZP_06113437.1| ## NR: gi|266620502|ref|ZP_06113437.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 53 1 53 53 92 100.0 1e-17 MESMKHKLGIALLVVCLIAVAAFVWYFIAAVPDGGEMDGTLVELIRKAGQIRI >gi|229784111|gb|GG667624.1| GENE 73 75997 - 76710 796 237 aa, chain - ## HITS:1 COG:BS_sigF KEGG:ns NR:ns ## COG: BS_sigF COG1191 # Protein_GI_number: 16079402 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus subtilis # 3 237 18 250 255 227 50.0 2e-59 MDETMRLIEMAHEGDKEARDRLVTENMGLIWSIVRRFTGRGYDPEDLFQIGSIGLMKAID KFDMSFDVKFSTYAVPMITGEIKRFLRDDGMIKVSRSIKEMGTRVKCVREALGFTLGREP TIEEIAGELGASKEEVAASIEAGAEVESLYRTVNKNDENSILLIDKIEEESSAQEELLNR MVLRELLTGLSDKDREIIIRRYYYNETQSQIASHLGISQVQVSRLEKKILKQMREKL 
>gi|229784111|gb|GG667624.1| GENE 74 76716 - 77150 549 144 aa, chain - ## HITS:1 COG:CAC2307 KEGG:ns NR:ns ## COG: CAC2307 COG2172 # Protein_GI_number: 15895574 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Clostridium acetobutylicum # 11 141 5 137 143 138 57.0 4e-33 MLEENRPEVLRMEIESLSRNEEFARVVVSVFMARMNPTLEELDDVKTAVSEAVTNAVIHG YQGNGGIIYLEVKILGQELTVTVKDTGIGIPNIPQAMEPMFTTDPEGERSGMGFSFMEAF MDEVSVESEPDKGTVVTMKKKINR >gi|229784111|gb|GG667624.1| GENE 75 77161 - 77502 296 113 aa, chain - ## HITS:1 COG:BH1536 KEGG:ns NR:ns ## COG: BH1536 COG1366 # Protein_GI_number: 15614099 # Func_class: T Signal transduction mechanisms # Function: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) # Organism: Bacillus halodurans # 9 111 8 110 116 79 37.0 2e-15 MNKPLFTYEADGHVLIVHLPEELDHHNCTGLKYETDLILSENYINRIVFDFSGTRFMDSS GIGVLLNRYKQMARSGGKVTIYGAGAQVLRILKMGGILKLMKLYDSKEDAVTG >gi|229784111|gb|GG667624.1| GENE 76 77599 - 78912 1494 437 aa, chain - ## HITS:1 COG:no KEGG:Closa_1558 NR:ns ## KEGG: Closa_1558 # Name: not_defined # Def: TPR repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 437 1 440 440 650 77.0 0 MDYVRKTRVIANSYYNMGLERARLRDLSGAAECLKKSLHFNKYQTDARNLLGLIYYEVGE VGDALVQWVISMNLQPQENRADYYLGEIQRKKGSLDTERHMIRRFNQALVYAQNGSEDLA ILQLVKVVEAKPNYVKAQLLLAILCIAREDYQKAGKAVYKVLQIDRNHPKALWYKSIVKA NTGSRSEREPEKRKLKNVLSHRQMEDDDVIIPPSYRENTKDMAVINILAGLLLGAAVIFF LVMPANTRAINEKHNQEMLKYSESLSQANQKADSLSAQIAGLEEDKKTAEESLSAFTNDS DSVLAQYQNVIGILQAYKKDDFTTAVKLYADLKTELIASPDVLAIVGEVKTDMEERGYQV LESLGDEASGAGDSQTALDYYLKSLAVKPDYWAAKYKAAVIYKGMNQKEQANEMFTDIIN NSKDETLTAQARTERGF >gi|229784111|gb|GG667624.1| GENE 77 78924 - 79064 268 46 aa, chain - ## HITS:1 COG:no KEGG:Closa_1557 NR:ns ## KEGG: Closa_1557 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 46 17 62 62 62 86.0 5e-09 MIDFETEINDYRPSLEVDAIEDAIVRSDLTDMNDLMMELIKEVGKE 
>gi|229784111|gb|GG667624.1| GENE 78 79057 - 80373 1344 438 aa, chain - ## HITS:1 COG:lin1586 KEGG:ns NR:ns ## COG: lin1586 COG0285 # Protein_GI_number: 16800654 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Listeria innocua # 19 432 22 424 429 192 31.0 1e-48 MEKTADEYLADIPMWTRKKNSLEDIRAFLERMDNPDEKRKIIHVAGTNGKGSVCAYITSM LMAGGYHTGAFISPHLTDVRERFLFDGEPVGKELFEDSFREIRDLSERMMGEGYCHPSYF EFLFYMGIKLFDQAESGFTVLETGLGGRLDTTNVIRHPLAVVITSISLDHTQYLGDTVEK IAYQKAGIIKPGVPVIYEREDPRTAAVIEKEAAASHSAVYPVDGQGYVIDGWEEHGIRAR LERLDGSYLPVLIPSQAEYQVMNALAAMRTVEVLAQQNGDIQLSEEALKRGLETMRWPAR MEEAAPGLYLDGAHNPGGMEAFIKTAARLCETGRKRAFLLFSSVSDKAHDTMIREIAAGL PLDAVAVAHIQSDRGLEAEVLEAEFKAVSGCRIYGFGTTKEALMALLAWQDEDHLLFCVG SLYLMGEIKTILGGMSDD >gi|229784111|gb|GG667624.1| GENE 79 80495 - 80710 62 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620509|ref|ZP_06113444.1| ## NR: gi|266620509|ref|ZP_06113444.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 71 1 71 71 125 100.0 9e-28 MITGVFRTYDSECSCFLALRAGGALRAEEAWLFKQPRSGGKESCRGSGSFLRRQRNKNVE IRLYLRKKEVA >gi|229784111|gb|GG667624.1| GENE 80 80718 - 81101 540 127 aa, chain - ## HITS:1 COG:PH0854 KEGG:ns NR:ns ## COG: PH0854 COG0251 # Protein_GI_number: 14590714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrococcus horikoshii # 1 124 12 135 137 139 54.0 1e-33 MKKVLATEKAPAAIGPYSQAVRGGDYVYVSGQLPIDPATGAFAGEDIASQTKQSLSNIKA ILESEGLSMANVVKTTVLLQNIGDFGAMNDVYATFFEGACPARAAFEVAALPKAALVEIE AIAYCGE >gi|229784111|gb|GG667624.1| GENE 81 81279 - 81893 796 204 aa, chain + ## HITS:1 COG:no KEGG:Closa_1554 NR:ns ## KEGG: Closa_1554 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 204 1 204 206 267 72.0 2e-70 MKTNDERIQDILNHLDTLSYIRPEEIPGIDLYMDQVTTFMDEHLKDTKRYPEDKVLTKTM 
INNYAKNNLLPAPNKKKYSKEHILLLIFIYYFKNILSINDIEELFRPITTGHFARKDDLP LEDIYREIFSLESKEMEHLREDVKAKYERAGGTFMDDSLAEEDREYLQLFSFICELSFDV YLKKQMVEQMIDELRETNPARKKK >gi|229784111|gb|GG667624.1| GENE 82 81976 - 82275 406 99 aa, chain - ## HITS:1 COG:lin1875 KEGG:ns NR:ns ## COG: lin1875 COG4496 # Protein_GI_number: 16800941 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 3 97 4 98 98 116 60.0 1e-26 MNKKIKTDAVDHLFQAILTLKTPEECYTFFEDVCTVNELLSLSQRYEVAKMLREGRTYLE IAEKTGASTATISRVNRSLNYGNDGYDMVFKRLEETEEK >gi|229784111|gb|GG667624.1| GENE 83 82364 - 83173 1094 269 aa, chain - ## HITS:1 COG:CAP0070 KEGG:ns NR:ns ## COG: CAP0070 COG0561 # Protein_GI_number: 15004774 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 4 267 5 278 283 184 35.0 2e-46 MAYRMIVLDLDGTLTNRDKVITPRTKEALMKLKSQGGTIVLASGRPTYGIMPLARELGLT EDGGYILSFNGGRIIECRSGETVFAKELPVASNKKIIALAKEHGVNILTYEGDCIITPDS GDIYVKKESDINKLEVRKVENFAEYVDFPVVKFLLLDDGDYLALVEPKVKAALGRDYSVY RSEPYFLEVLPKGIDKAASLERLLTRLDMTKDEMIACGDGYNDLSMIQYAGLGVAMENAV LPVKNAADYVTLSNNDDGVAHVIEKFMLS >gi|229784111|gb|GG667624.1| GENE 84 83309 - 83881 641 190 aa, chain + ## HITS:1 COG:CAC0620 KEGG:ns NR:ns ## COG: CAC0620 COG0715 # Protein_GI_number: 15893908 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Clostridium acetobutylicum # 16 187 18 189 338 184 51.0 1e-46 MKKIVSIVLAVFMLAVCVTGCKTTTRKQLKPVVLNEVAHSIFYAPQYAAIELGYFEDEGL DLKLVNGAGADKVMTALISGDADIGFMGSEASIYVYQQGSDDYAVNFAQLTQRAGNFLVG READSDFTWADLKGKKVLGGRAGGMPEMVFEYILKKNGIDPSTDLTIDQSINFGLTAAAF TSSDADYTVE >gi|229784111|gb|GG667624.1| GENE 85 83943 - 84290 86 115 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLEQPKEIVKVSKNMKFGDLTTQWCYGLAPDRVWKCTCGCGSVCYVKEAALQQNIVENCG NAIHKKKTSSIQPAKRKYIVCGGIERHVCLLCNADMPYDSRKVYCEKCRQLKKLK >gi|229784111|gb|GG667624.1| GENE 
86 84597 - 84791 141 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620516|ref|ZP_06113451.1| ## NR: gi|266620516|ref|ZP_06113451.1| toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 99 100.0 8e-20 MDDEILTVAQAAAYLKVSDKTVLKWIKNDKLVASLVGRIYRIKLSDINDYLAANSNGKKG VKSK >gi|229784111|gb|GG667624.1| GENE 87 84788 - 85888 404 366 aa, chain + ## HITS:1 COG:Cj1051c_1 KEGG:ns NR:ns ## COG: Cj1051c_1 COG0286 # Protein_GI_number: 15792378 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Campylobacter jejuni # 4 348 5 352 885 185 36.0 1e-46 MNIDYIKKLIVKLGFAPQDGISGVFVKRYVAYDNYPIFVDFNEQKIEYAHQSIQQNKRIR LGDLTTSNFDKLENFVVLECIDRLLTKGYRPERLELEKKYPLGRNLKGKLDILIYNENDD FPFLMIECKTWGNEFVKESVKTLKDGGQIFSYYQQDRAAKFLCLYASHLDDKKIEYRNNI VLVEDSWHDLSSAKDIHDYWNKNFKENGIFEEYATPYDIKPKALTYGMLKNLREEDSGKI YNQIMEILRHNAISDKPNAFNKLLNLFVCKIIDENKNPDDELEFQWLESDTDESLQMRLN DLYKDGMWRFLEIRVIDHSEDDVTKALEGIDNAMQKQRLMDMFRDTRLKKALTLPLSRFW MRKLLS >gi|229784111|gb|GG667624.1| GENE 88 86263 - 88533 623 756 aa, chain + ## HITS:1 COG:Cj1051c_1 KEGG:ns NR:ns ## COG: Cj1051c_1 COG0286 # Protein_GI_number: 15792378 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Campylobacter jejuni # 9 378 478 879 885 182 34.0 2e-45 MKFAWAKDSVYGIDLDNRLVKTTKVSAFFNGDGEANIIWANGLANFEKAEEYRGLLRQTQ HYDRKNNGQFDILISNPPYSVEAFKSTLQYGEETFELYDNITDNSSEIECLFVERMKQLL KVGGWAGVILPSSILSNGGIYSKAREIIFKYFRVKAIVELGSGTFMKTGTNTVVLFLERR SDNDVITIEKAISTFFSSPKDVTVMGIENAFSKYVANIYDGLAFDDYISFISGRASVAMQ EHELYSDYIKAFGDDVYTKGIALEKEKMLYFFLTYTQNIVLVKTGKKQDEKTFLGYEFSE RRGHEGIKRLPGGTKLFDENGDLLNPKKANSYIYNAFLGKEIVIDESLSHNVSYGRMSGF ISYGTSKFDKAVNLSKKTTFTSSFPSVRLGELVQIIKGVTYSKEDQVYNETNNVILTADN ITNSGDFDVVKKVFLRADLTIDGTKKLKQNDIFMCFSSGSKSHVGKSAYISYNTEYFAGG 
FMGVLRCKSEDVSMKYLWAILSSNQFRHIISQESTGININNLSANLADIKIPLPPLDVQK KIVAEIEEIDREESYIIEQVDALRYSILSAVKNGAAGEPLEKLGVVASYSQDRISCAELS SDTYVGVDNLLQNMEGKGSSQFVPKSGTAIAYSKGNILLSNIRPYLKKIWLADNDGGSSG DVLVLKMDDTKISSKYLYYLLATDEFFEYEMQHIKGVKMPRADKASVLNYNVPIPSLFKQ QEIVAEIEKIESEITTRKMRLEDLKKQKGKVLDKYL >gi|229784111|gb|GG667624.1| GENE 89 88797 - 89045 141 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620517|ref|ZP_06113452.1| ## NR: gi|266620517|ref|ZP_06113452.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 82 1 82 82 123 100.0 4e-27 MKGKRISVLKQIRYFLKRQIRIDARLLDFVQEHPEIITDEKLEPPPGEFQKIMEEMDRRG IKPAVERQLRLKRWFDRIFRKI >gi|229784111|gb|GG667624.1| GENE 90 89099 - 89842 539 247 aa, chain - ## HITS:1 COG:lin0128 KEGG:ns NR:ns ## COG: lin0128 COG5632 # Protein_GI_number: 16799205 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Listeria innocua # 18 123 32 138 289 78 41.0 1e-14 MRDITLCHPRLQALAERLVEECNKQGLKIKIGETLRTVAEQDALYAQGRTEPGPIVTNAP GSSYSSYHQWGTAFDFFRNDGRGAYYDNDGFFTKVGKIGVSLGLEWGGNWKSPVDKPHFQ LADWGSTTAGIKRLYHNPDEFMRTWEPVEERIGWINTSNGWWYRRPDGTYPANKWEVINH HWYLFNADGYMCTSWHRWNGSVCDPEDGSGDWYYFDPTAGGPYEGACWHSRENGAMEVWF VDQMDTI >gi|229784111|gb|GG667624.1| GENE 91 89956 - 90162 350 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620519|ref|ZP_06113454.1| ## NR: gi|266620519|ref|ZP_06113454.1| conserved domain protein [Clostridium hathewayi DSM 13479] conserved domain protein [Clostridium hathewayi DSM 13479] # 1 68 1 68 68 79 100.0 9e-14 MSEKTKRWIKAAGIRAVKTMAQTAVATIGTAAVLGDVNVPMVISASVLAGALSILTSVAG LPELDKAA >gi|229784111|gb|GG667624.1| GENE 92 90164 - 90481 313 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620520|ref|ZP_06113455.1| ## NR: gi|266620520|ref|ZP_06113455.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 105 1 105 105 182 100.0 7e-45 
MENEEIAVKLTALEHETKSAKHRIDDLEVQNQAIQDLALAVKELTINMGNMLEEQRRQGI DIDKLKAEPAKQWSATKKTFFTSITSSIGTAVAAGILYLLSRGGF >gi|229784111|gb|GG667624.1| GENE 93 90501 - 90695 215 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622353|ref|ZP_06115288.1| ## NR: gi|266622353|ref|ZP_06115288.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 64 12 75 75 101 98.0 2e-20 MNKNKPDLNYKTTVPYGPATGKEDPGRQPVIDSTPYEGDRGPDHRQFKPGHVPGGPGHKD CEHE >gi|229784111|gb|GG667624.1| GENE 94 90756 - 91175 225 139 aa, chain - ## HITS:1 COG:no KEGG:Clole_2737 NR:ns ## KEGG: Clole_2737 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 5 139 7 141 141 129 53.0 3e-29 MALILRKYLALASVGGLLYVVLEMVWRGRSHWTMFLLGGICFAALGLINEILPWSMALWK QILIGTGIITALEFLTGCVVNLCLGWNIWDYSHLPGNILGQICPQYCLLWLPVSLAGIVL DDWLRYWWWGEARPHYKIL >gi|229784111|gb|GG667624.1| GENE 95 91156 - 91899 403 247 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620523|ref|ZP_06113458.1| ## NR: gi|266620523|ref|ZP_06113458.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 247 1 247 247 397 100.0 1e-109 MEQIKIRGEIFSITEIVPVTPNVLRIVFNDEVPASWGDITTYTIGGVEEGEEGGTIVGYE TVYRDEGQTVYLSNDGSVYVPPAPPEPVVPPEPYVPTLEELQAAKRREVSVACQQMIYQG VNVTLTNGSTDHFALTIEDQLNLFGKQIQVTSGAAQIEYHADGQPCRYYTAADMQAIITA AMWHVSYHTTYCNALNMWISGCQTVEEVQTIFYGADVPPAYQSEVLQSYLTQIASQTGGA GDGSDPA >gi|229784111|gb|GG667624.1| GENE 96 91914 - 92150 150 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620524|ref|ZP_06113459.1| ## NR: gi|266620524|ref|ZP_06113459.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 78 1 78 78 121 100.0 2e-26 MIEILDVDISKNPVATNEKFIISVSIRPYFAAGIELNAAVPALISASVQAKPAGYAPVHK KARLQVGVPTVEVVKIKN >gi|229784111|gb|GG667624.1| GENE 97 92143 - 92433 322 96 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|266620525|ref|ZP_06113460.1| ## NR: gi|266620525|ref|ZP_06113460.1| hypothetical protein CLOSTHATH_01618 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01618 [Clostridium hathewayi DSM 13479] # 1 96 1 96 96 173 100.0 4e-42 MDSLRYVIGERKYVRLFVYSYKNEPFYVRAAEYELLNPAGEIETSGTCDVTAVENGTNIL VLVEPKLEGTYTLKIDYDIGEEHLKKSVEIEVVADD >gi|229784111|gb|GG667624.1| GENE 98 92421 - 92723 126 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620526|ref|ZP_06113461.1| ## NR: gi|266620526|ref|ZP_06113461.1| putative CTP synthase [Clostridium hathewayi DSM 13479] putative CTP synthase [Clostridium hathewayi DSM 13479] # 1 100 1 100 100 211 100.0 2e-53 MAVTRVFGRADGVDIVFQSIEGNQWNIAVPFDMDCEYILELYAEDEAGNISFLTRALFTY DPKSLTLKVAPMQYGCKLLPEPCGVDISPSRKERIDIWTA >gi|229784111|gb|GG667624.1| GENE 99 92736 - 92969 202 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620527|ref|ZP_06113462.1| ## NR: gi|266620527|ref|ZP_06113462.1| fibronectin type III domain protein [Clostridium hathewayi DSM 13479] fibronectin type III domain protein [Clostridium hathewayi DSM 13479] # 1 77 1 77 77 78 100.0 1e-13 AAEKVTVNADGTFSKALTLVSGSNTITVVSTDSAGKSSTVTRTVTLDQVAPVIKSVTITP NPVDAGKTYVISVEVTD
Prediction of potential genes in microbial genomes
Time: Fri Jul 1 00:07:36 2011
Seq name: gi|229784110|gb|GG667625.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld18, whole genome shotgun sequence
Length of sequence - 49498 bp
Number of predicted genes - 47, with homology - 43
Number of transcription units - 15, operones - 10 average op.length - 4.2
N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Op 1 . - CDS 2 - 140 123 ## gi|266625365|ref|ZP_06118300.1| conserved hypothetical protein
2 1 Op 2 . - CDS 180 - 1511 1670 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases
3 1 Op 3 1/0.000 - CDS 1530 - 2474 1059 ## COG0524 Sugar kinases, ribokinase family
4 1 Op 4 . - CDS 2488 - 3378 1118 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase
    - Term 3410 - 3452 6.7
5 2 Op 1 5/0.000 - CDS 3463 - 4575 1274 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein
    - Prom 4616 - 4675 1.8
6 2 Op 2 26/0.000 - CDS 4685 - 5620 1114 ## COG1079 Uncharacterized ABC-type transport system, permease component
7 2 Op 3 24/0.000 - CDS 5617 - 6696 978 ## COG4603 ABC-type uncharacterized transport system, permease component
8 2 Op 4 . - CDS 6674 - 8215 196 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16
    - Prom 8266 - 8325 6.8
9 3 Tu 1 . - CDS 8337 - 8729 291 ##
    - Prom 8860 - 8919 5.2
    + Prom 8623 - 8682 3.7
10 4 Tu 1 . + CDS 8728 - 9507 715 ## COG1349 Transcriptional regulators of sugar metabolism
    + Term 9526 - 9577 0.2
    - Term 9330 - 9388 4.4
11 5 Tu 1 . - CDS 9583 - 10557 746 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases)
12 6 Tu 1 . - CDS 10664 - 10783 67 ##
    - Prom 10814 - 10873 3.8
    - Term 10788 - 10840 -0.0
13 7 Op 1 . - CDS 10901 - 11347 287 ## COG3727 DNA G:T-mismatch repair endonuclease
14 7 Op 2 . - CDS 11352 - 12137 266 ## COG3440 Predicted restriction endonuclease
15 7 Op 3 . - CDS 12158 - 13138 355 ## BCAH820_1015 hypothetical protein
16 7 Op 4 . - CDS 13147 - 14004 434 ## GTNG_2006 hypothetical protein
17 7 Op 5 . - CDS 13991 - 14227 117 ## gi|266620543|ref|ZP_06113478.1| putative extracellular arylsulfate sulfotransferase
    - Prom 14267 - 14326 2.6
18 8 Op 1 . - CDS 14373 - 15929 845 ## GTNG_2007 hypothetical protein
19 8 Op 2 . - CDS 15950 - 16183 235 ## gi|266620545|ref|ZP_06113480.1| conserved hypothetical protein
    - Prom 16349 - 16408 9.3
    - Term 16375 - 16415 6.1
20 9 Op 1 . - CDS 16418 - 17983 288 ## COG0270 Site-specific DNA methylase
21 9 Op 2 . - CDS 17977 - 18339 235 ## gi|266620547|ref|ZP_06113482.1| putative transcriptional repressor RghR
    - Prom 18369 - 18428 14.2
22 10 Op 1 1/0.000 - CDS 18983 - 21454 1883 ## COG1061 DNA or RNA helicases of superfamily II
23 10 Op 2 . - CDS 21415 - 22044 169 ## COG0500 SAM-dependent methyltransferases
24 10 Op 3 . - CDS 22124 - 22270 64 ## gi|266620550|ref|ZP_06113485.1| hypothetical protein CLOSTHATH_01643
25 10 Op 4 . - CDS 22245 - 22970 484 ## Dtox_3292 hypothetical protein
    - Prom 23136 - 23195 7.3
    - Term 23302 - 23339 1.2
26 11 Tu 1 . - CDS 23343 - 24203 817 ## COG0491 Zn-dependent hydrolases, including glyoxylases
27 12 Op 1 21/0.000 - CDS 24354 - 25313 1284 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components
28 12 Op 2 16/0.000 - CDS 25300 - 26802 1709 ## COG1129 ABC-type sugar transport system, ATPase component
    - Prom 26861 - 26920 4.2
    - Term 26969 - 27000 1.8
29 12 Op 3 16/0.000 - CDS 27068 - 28096 1265 ## COG1879 ABC-type sugar transport system, periplasmic component
    - Prom 28123 - 28182 5.3
30 12 Op 4 21/0.000 - CDS 28191 - 29099 574 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components
31 12 Op 5 . - CDS 29096 - 30484 1028 ## COG1129 ABC-type sugar transport system, ATPase component
32 12 Op 6 7/0.000 - CDS 30481 - 31329 917 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
33 12 Op 7 . - CDS 31346 - 32947 1781 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
34 12 Op 8 . - CDS 33008 - 36448 3163 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family
35 12 Op 9 . - CDS 36525 - 37658 1062 ## Closa_0371 oxidoreductase domain protein
36 12 Op 10 2/0.000 - CDS 37661 - 38491 870 ## COG3541 Predicted nucleotidyltransferase
37 12 Op 11 . - CDS 38481 - 39608 1104 ## COG3541 Predicted nucleotidyltransferase
    - Prom 39637 - 39696 7.7
    - Term 39739 - 39784 5.4
38 13 Op 1 . - CDS 39881 - 40372 557 ## Cphy_2916 hypothetical protein
39 13 Op 2 . - CDS 40430 - 41020 574 ## DSY2752 hypothetical protein
40 13 Op 3 . - CDS 41031 - 42299 1020 ## Cphy_1588 SecC motif-containing protein
41 13 Op 4 . - CDS 42326 - 42397 63 ##
42 13 Op 5 . - CDS 42466 - 43686 1212 ## COG1373 Predicted ATPase (AAA+ superfamily)
    - Prom 43712 - 43771 7.0
43 14 Op 1 . - CDS 43794 - 43937 110 ##
44 14 Op 2 . - CDS 43879 - 45531 1743 ## Closa_0369 plasmid pRiA4b ORF-3 family protein
    - Prom 45565 - 45624 5.0
45 15 Op 1 . - CDS 45678 - 46754 1230 ## COG0006 Xaa-Pro aminopeptidase
46 15 Op 2 . - CDS 46810 - 48399 1721 ## COG2978 Putative p-aminobenzoyl-glutamate transporter
47 15 Op 3 . - CDS 48415 - 49497 236 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1
Predicted protein(s)
>gi|229784110|gb|GG667625.1| GENE 1 2 - 140 123 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625365|ref|ZP_06118300.1| ## NR: gi|266625365|ref|ZP_06118300.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 46 1 46 155 101 100.0 2e-20 MKTVVIEDIRPDNDYGLIVQMPEEGNVYQDDYFYWKGSLIPACFQT >gi|229784110|gb|GG667625.1| GENE 2 180 - 1511 1670 443 aa, chain - ## HITS:1 COG:CAC0282 KEGG:ns NR:ns ## COG: CAC0282 COG0402 # Protein_GI_number: 15893574 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Clostridium acetobutylicum # 1 436 5 424 428 362 44.0 1e-100 MNHTIFALKGTIAYTPAPQRFVFEEESFLVCRDGKVKGIWKELPDELKGIKVYDYTGKII FPGMCDLHMHAPQYAFRGLGMNLENPDWDMWFEKYAFPDEKRYEDCEYAKMAYERLTDDL LKTTTTRCVMFATIHRQSTEILMDLMEEKGITAYVGKVNMDRNSIPGLLETTEESLEETK QWIADCEDKKKYRKCRPIITPRYIPTCTDELMSGLGDLIAEKKIPVQSHLSEGLDEMEWV RELKPELSFYGEGYDEYHMFGDTVPTVMAHCVYSPEEEIQLMKKRDKLIIAHCPQSNISS SGGIAPVKRYLAEGIRVGLGSDMAGSNTLSLLRAITDAIHVSKARWAFTERGDDPHAKKN VLSLAEAFYLATHGGEGLWEQVGSFEEGYCFDAVVMDDSRISDYCERSSYERTERLILMS
DDRDIEAKFIDGSMVYQRDAKQR >gi|229784110|gb|GG667625.1| GENE 3 1530 - 2474 1059 314 aa, chain - ## HITS:1 COG:HI0505 KEGG:ns NR:ns ## COG: HI0505 COG0524 # Protein_GI_number: 16272449 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Haemophilus influenzae # 4 304 3 303 306 167 34.0 2e-41 MKPKKLITLGSYHTDYFLIGDRIPDKGETVTGNYYFTEGGGKGSNHACACSILGGNVVLI QKLGRDEGGETACREFAEYGLSLDYVRMVEGEKTGLAPIMIDKDGNNAIMVFPGAHGHYT KEDIDRAAREFDDAFMASFVFETNLDMTEYAIKKAYEKNVAVFLDPSPVTKFNPDLYPYL TWIKPNEHEATLYTGIEVTDYESAAEAGKWFLARGVKNALITLGEMGSVLVTPDLIRTFP APKVTVVDTTTAGDVFAGAFIYALSVEMPLDEAVLFASCAGALTATKAGAVKSAPHMEEV QQLLAEFKKSAGLI >gi|229784110|gb|GG667625.1| GENE 4 2488 - 3378 1118 296 aa, chain - ## HITS:1 COG:AF0910 KEGG:ns NR:ns ## COG: AF0910 COG0329 # Protein_GI_number: 11498515 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Archaeoglobus fulgidus # 5 296 1 288 289 132 27.0 9e-31 MNDYLFYGAMTEPIAPMDKEGNIDYELLKAQVQFQLDHKIHAIFVSGLTECMVTSVEEQI EMLRETVKTVNSQVPVMGNICLNRPEDALYAIRAFEEAGADAVSIAQPHTFTYEEDAIYE YYTNLIRGTTLPVYVYNAPQTNNVLSPGLVKRLVEENDNVRGYKNSVQDILHLQSVMELI PKERHFECIAGSDSTIFATLALGGCGIISWVSIMFPELIEELCDAYFRGDIEEARALQFR VQKVRTILKHAPMDSGYRYAGELIGLPMGYPRQPMSQATEEQKNYIREELTKLDMI >gi|229784110|gb|GG667625.1| GENE 5 3463 - 4575 1274 370 aa, chain - ## HITS:1 COG:alr5361 KEGG:ns NR:ns ## COG: alr5361 COG1744 # Protein_GI_number: 17232853 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Nostoc sp. 
PCC 7120 # 65 356 35 327 348 181 34.0 2e-45 MVMAAALTAALLGGCAGGNAKQPAGDTTAAPTTAAQAEAGSTEAEETEAEKTEAAKADDK KAGDSAKTGDGQFKAALIVTGSVNDAGWNAAAYNGMKAIESEMGAEINFSENVSLPDAEP ALRDFSNKGYNLIFAHSMEYGESVTKIAGDYPDTKYAVYTGTVEADNVASISLNEHETGY LAGMLAASMSKTGKIGVIIGSDLPPMIRIAEAYKLGAKEVNPDIQVFDVFAGSWDDIAKG KEAALGQIENGADVIFHVADKTGLGAIQAAQEKGVYAIGSSVDQSAVAPGTVLSSALDHA DKAYLAAAKSVSDGTFKGGIMRMGLKEEAIGMAPYDPAVPEDVIKKIEEKTAQIVSGEFE VPEILERTAK >gi|229784110|gb|GG667625.1| GENE 6 4685 - 5620 1114 311 aa, chain - ## HITS:1 COG:alr5368 KEGG:ns NR:ns ## COG: alr5368 COG1079 # Protein_GI_number: 17232860 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Nostoc sp. PCC 7120 # 5 310 3 307 312 241 47.0 1e-63 MNILSMAFIVGLLTATVRMMVPILLSALGETYSERAGILNIGLEGIMLISAFFAFLGSYY FDSAAVGVLFGAASGLIVGLIAAFLSVTLQINQMISGIALNMVAAGLTSFFYRVMFGVLP IPPQAKSLSVLEIPVLRDIPFLGEILFQHNILVYAAFILVPVSSIILYKTSFGLKIRAVG ENPKAVDTVGVSVNGVRYSAVCICGLLTGLAGTCLTIGQLNMFMDNISAGRGFIALAAVI FGKWNPKGVLFASVIFAFADALQLRLQSLGFQIPYQFLVMLPYVLTVVALAGVVGRSKGP AALGKAYSKEG >gi|229784110|gb|GG667625.1| GENE 7 5617 - 6696 978 359 aa, chain - ## HITS:1 COG:TM0104 KEGG:ns NR:ns ## COG: TM0104 COG4603 # Protein_GI_number: 15642879 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Thermotoga maritima # 15 334 8 327 344 216 39.0 6e-56 MKIKQNDKRKILTQLLKPVAAIAAALLVGAVLILYAGANPVTAYKSLLAGAFGSFYNFTE VLVKASPLWLAGLGVALSFKAGIFNVGAEGQIMMGALATAWIGVVLGNLPMVILLPVTML AAVLAGGIWASIPGMMKAKLGVDEIITTIMFNYIATWVVSYFVTGPLKDPAGVNPQTAEL GAGAQLPILVKATRLHAGLIVTLILIVVMYFVIKKSTFGYKVRLLGSSLSVARASGVNVP LTLITTMIISGGLSGLAGMMEISGLHHRLLNSFSPGYGFTAVVVALLGNLNPLAVFVSGF FFAAMTVGANEMQRSAAIPTSLISVIQGLIVFFVLISEVKHFPIPRIFTSKRENREVTE >gi|229784110|gb|GG667625.1| GENE 8 6674 - 8215 196 513 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 256 480 11 222 318 80 
27 2e-14 MEELLKLQNITKKFGNVVANDHINLSVGRGEIRALLGENGAGKSTLMNIIYGLYQPTSGE IFFGGRQVAIHSPKEAIHLGIGMVHQHFMLIPELSVVENLVLGRRSRREPFLDLEEASEE IRKLSEMYGLDVDPKAKVCDLPVGSQQRVEIIKALYRGANVLILDEPTAVLAPQEAEHLG TVIRTLSEQGKTVIFITHKLMEARDLAHNITILRNGKTIVTVPAREKSNEELAQLMVGHP IATDLPRKEYAPGREALKISGLSMTAKDGKRLLDNINLTVHEGEIVAVAGVDGNGQSEVV EAVTGLRKITDGKVEFFDRDMSQTSVRDRIREGMGHIPQDRHKEGMALDMSLSDNMILET HGDKKFSRCGWIDYDKVNQYTGEMIEKYNVKATGYTAPAGSLSGGNQQKVVLAREVSREP KILIAAQPTRGLDISAVEFVHRKLLESRDSGVAILMISTELEECLSISDRLAVMHQGRIM GVMKREEYDVEKIGLMIAGVNDGEQKNEDKTER >gi|229784110|gb|GG667625.1| GENE 9 8337 - 8729 291 130 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIDLPRFTHYDFIDFNSIEFNFIDSNSINANSINANSIDANSIGSNSINANFIDSNSTNA NSIDSNSTNSNSIDFNSINPNFINSDITESNFTNSNISQFDINIFTLKSKRILQKTSIIL PKTSKLTIYQ >gi|229784110|gb|GG667625.1| GENE 10 8728 - 9507 715 259 aa, chain + ## HITS:1 COG:AGl88 KEGG:ns NR:ns ## COG: AGl88 COG1349 # Protein_GI_number: 15890151 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 241 10 239 243 159 36.0 4e-39 MNKKETRLNNFIHFLKIKNSITVKEMATHLNVSEMTVRRDIKELTADHTINFENGKIMFN PQSSLSEASPSYSLESEKEKRNLQKELIGKFAATLVEPNDTVIIDSGSTTEHLAHYIPDS MPVTALCYSMNILNELITKEDISLIFPGGFYHPNAQMFESQQAIDLINQIRANKVFLSAA GIHRELGITCMHNYEISTMQAVIKSSVTKILLADSSKFDQIRPAYFASLECIDIIVTDHY LAEEWQEYIRGVGIKLYLV >gi|229784110|gb|GG667625.1| GENE 11 9583 - 10557 746 324 aa, chain - ## HITS:1 COG:lin0876 KEGG:ns NR:ns ## COG: lin0876 COG0667 # Protein_GI_number: 16799950 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Listeria innocua # 14 307 20 305 317 141 31.0 1e-33 MNYRQLTGTYLLVSNLCLGTVQFGSAVEEKDCFEQLNEFTDSGGNFLDTAHVYGDWLPGE RARSERVIGKWLKQSGKRNHYVIATKGAHPRLDSMEVSRVNREAIMTDMEESLKCLETDA IDLYFLHRDDENVPVCEILDILETARKKGYIRYYGCSNWKLERLKEAQRAAEEYGFSGFV CNQLMWSLADINTAGLGDPTLVCMDQKSFSYHKEKQLSAMAYMSVAKGYFSKLLKGSAIP 
ENVAAIYENEENYRIAEELKALIAYGVTPAQACISYFNGQPFPAIPLASFRNIEQMRETV AGCDVTLGRDISRRMNDMKRFVSD >gi|229784110|gb|GG667625.1| GENE 12 10664 - 10783 67 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHFVMWVSISLKCCVLQIFYKSRLMSNYKWGIRVIYFSR >gi|229784110|gb|GG667625.1| GENE 13 10901 - 11347 287 148 aa, chain - ## HITS:1 COG:SMc03764 KEGG:ns NR:ns ## COG: SMc03764 COG3727 # Protein_GI_number: 15966902 # Func_class: L Replication, recombination and repair # Function: DNA G:T-mismatch repair endonuclease # Organism: Sinorhizobium meliloti # 1 117 2 116 145 87 41.0 6e-18 MDNLTLEQRRKNMQHIRSNNTKIEEILRKALWNKGYRYRKNFKDLPGKPDIVITKYKIAI FCDGEFFHGKDWEVLKQRLENSNNSEYWINKISKNMHHDDQVNKELMFMGWTVIRFWGND IKKNTDECIKVIDEVVFNIKLGDNIKDN >gi|229784110|gb|GG667625.1| GENE 14 11352 - 12137 266 261 aa, chain - ## HITS:1 COG:alr4915 KEGG:ns NR:ns ## COG: alr4915 COG3440 # Protein_GI_number: 17232407 # Func_class: V Defense mechanisms # Function: Predicted restriction endonuclease # Organism: Nostoc sp. PCC 7120 # 101 260 155 312 313 64 29.0 1e-10 MENERMGIKWSREETILAFDLYCRTPFRKISQGNPEIIELAQLLGRTPGAVGLKMHNLAH YDPELQKRNITAMAHGSKLDGEIFSEFSMDWTNLSYQAQLIRAKIQNKDISEVIDLGDLD EIPTGKYRDYVMKTRVGQYFFRMTVLNSYENRCCITGLKRPELLVASHIKPWRVSDERTE RTNPTNGLCLNSLHDKAFDRGLITLDKNYKIIVSKKLKDTEMDLETKNWIMGYADHQIIL PDKFLPGKKFIEYHNDVIFQR >gi|229784110|gb|GG667625.1| GENE 15 12158 - 13138 355 326 aa, chain - ## HITS:1 COG:no KEGG:BCAH820_1015 NR:ns ## KEGG: BCAH820_1015 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_AH820 # Pathway: not_defined # 1 248 1 238 257 118 33.0 4e-25 MKLYFFKQKFLDILEQNISDNLDKYQESETWVDQYFIDMDKPNYFFDTGIEINDYQLILG GPETDFPNAKILYEALQGHINLVQASDLRLWAYMAHKQHWDYMHTRWGIDIPDDEDEAEG DDTKKTPADKAIDRIGTRYFFKASKGKAFVRQGIARLYWSVYLTYDENNENGNPYEYTEY FFSKQDIFTSITERSYARNKVLVLAALKELKKNSDLSRGEVRAFLAKLNQAGAITILDFL DKTQAEKLCESVMSEIKSIAVLEEGSRFRAINDISGKPFGPEMIIKNGQVIAAGKKILTK PKKLIGKKEGTKFNISGKEYVIKDIK >gi|229784110|gb|GG667625.1| GENE 16 13147 - 14004 434 285 aa, chain - ## HITS:1 COG:no KEGG:GTNG_2006 NR:ns ## 
KEGG: GTNG_2006 # Name: not_defined # Def: hypothetical protein # Organism: G.thermodenitrificans # Pathway: not_defined # 2 270 35 296 315 87 27.0 5e-16 MLNVNINYPYPVIREYTDDYQGTNFIGELKVLLEPDGYAVHTNFEINNKEIQLLLSKNIL TYALEVQCVSTWFRKLYLIQESKVIRLDPQMIHERVELMPCIVVATAIEGFANEDFAEEY QGMKFDLNAGDIIGIGQKRVFDALYQNDIIKNGSSIVDVGGSDKLKEIQCDFSQSTIKIT LPIDQYEDYRSCGYNRSKYKMLNAILIVPVLVEGIGIIAADEKDPEHQSGHGNRAWYKTI VVNLKRFAENDERKYLQLLETPFASAELLLGNNSAKALDFLCQVE >gi|229784110|gb|GG667625.1| GENE 17 13991 - 14227 117 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620543|ref|ZP_06113478.1| ## NR: gi|266620543|ref|ZP_06113478.1| putative extracellular arylsulfate sulfotransferase [Clostridium hathewayi DSM 13479] putative extracellular arylsulfate sulfotransferase [Clostridium hathewayi DSM 13479] # 1 78 21 98 98 127 98.0 3e-28 MLDENAENLAIGIKIGSDDDNLSKANISEAIMDGKKLAIKNGLIELEGGHNEGEKNIIRV RLEEKTRKTLEVRAYAKC >gi|229784110|gb|GG667625.1| GENE 18 14373 - 15929 845 518 aa, chain - ## HITS:1 COG:no KEGG:GTNG_2007 NR:ns ## KEGG: GTNG_2007 # Name: not_defined # Def: hypothetical protein # Organism: G.thermodenitrificans # Pathway: not_defined # 23 511 35 501 665 189 30.0 3e-46 MFQWRENFQWIFSDLGSSDKVGVNESGIGIFKRQPYKGLAKEILQNVTDAKNPELPDEVP VRAKFELIYVDLDDIPGHERLREVIHKCSEYYSVGDDGEKLRTIRDAADNYLSGSIKVPV LKISDYNTTGLRGVKEETGSNWTGLVRERSATNKSNTSSGAFGVGKFAPYNFTSIRTVLY STKTINDEYAFQGKAILTTFKEDGKNKQNIGLFANKDSEDFDAVFDVNDIAPVFRRTEPG TDIFVLGFVKEDEDTWIEQSAISVIEYFFYSIYRGKLEVEIRDGEKVVKITQNNLGEMIA FYDKYCKEHMADDTAFKFTAPSYWRLLQDSRHKIIKAPFEYNGKSMGNYELYLLTGEDAA EKKVLEMRQAGMKIREDCAFRIPINFTAIFIATGEGAASQDPKDNISSFLRKCENQAHDD WAADEYKEEKTKAKGIINKVHQIILDAVKAEMPDFGDESVDAFGLSEFLPNEEADDDQKE EKVFSDFRPLAFEIQSVKTGKRRRQADISMKKKWRFEA >gi|229784110|gb|GG667625.1| GENE 19 15950 - 16183 235 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620545|ref|ZP_06113480.1| ## NR: gi|266620545|ref|ZP_06113480.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 77 1 
77 77 116 100.0 7e-25 MKSEETKRDKFVRIAEARTNKIINMIQLLGNCSNQSLYEYSQKDVNKIFNTIQAELDEAK KRYSKQDSQKGSIFTLD >gi|229784110|gb|GG667625.1| GENE 20 16418 - 17983 288 521 aa, chain - ## HITS:1 COG:jhp0435 KEGG:ns NR:ns ## COG: jhp0435 COG0270 # Protein_GI_number: 15611502 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Helicobacter pylori J99 # 303 521 138 346 351 93 28.0 8e-19 MLKTIDLFAGAGGLSYGFESTGEFLIVAAAENNKNARKTYIENHKGRNDIRMIPDVRGYD FSALASEFDGIDVVIGGPPCQGFSNANRQKNHIISMNNSLVKEFFRAIREIRPKAFVMEN VSMLSSETHRFYDSTIDHGIVTALGVKMREDELVLSDSDYNGYNLMNIIEADEVVNYKIS DELFQLLYVLYKNRNNEERLPKYIDNKSKLIIDRIASQTEEVKNNLDFLGRIAHLINTEQ IKKCFTELGQFIKFQKTFRLKEELDSNKIIYEIESDPETGKLIARVKSYSVIEYVNKILG DDYKKNNEVVNSLWFGVPQERRRFIMIGVRSDIMQQEEIEMPKDSGVKIVTVGQAIEDLK DYDTNEDDNEPEKILYTPSMEKLSDYAIIMREGSVGISNHIVPKSRDKAKARFSALKEGE NFHNLDSGLKDNYAVPERTQNSIYLRLDSNRPSGTVINVRKSMWIHPTIDRAISVREAAR LQSFPDRFVFKGTKDAQYQQVGNAVPPLMAKGIAEVVLKYL >gi|229784110|gb|GG667625.1| GENE 21 17977 - 18339 235 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620547|ref|ZP_06113482.1| ## NR: gi|266620547|ref|ZP_06113482.1| putative transcriptional repressor RghR [Clostridium hathewayi DSM 13479] putative transcriptional repressor RghR [Clostridium hathewayi DSM 13479] # 1 120 1 120 120 205 100.0 7e-52 MGFGSYFRDLRMEKQVTQKQIADIINKKPMLVSNVENGKNGPFSDNDLKKIVTFLELSKE EERQLYKEVAKARGKLPQEMQTYIIKNDEAYYLLEMLARNKLGRESLSQIIQMVEEQVLC >gi|229784110|gb|GG667625.1| GENE 22 18983 - 21454 1883 823 aa, chain - ## HITS:1 COG:CAC2824_2 KEGG:ns NR:ns ## COG: CAC2824_2 COG1061 # Protein_GI_number: 15896079 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Clostridium acetobutylicum # 215 822 3 602 616 627 51.0 1e-179 MVELTFKNEKTGLTQTLEFQGNAITGDGNYLYSRLKPCFQKAQKIDIIVSFLMESGVKLL VNDIKAAVERGARVRILTGNYLNITQPAALYMLRRELGDAVSVRFYNVPNKSFHPKAYIF HYDDKGEIYIGSSNVSFSALTTGIEWNYCFDSKMHPEDFEHFYNTFEDLYENYAYEITDE 
TLKEYSRKWKKPNVQKDLEAYDTEGGREEPETVQPRGAQIEALYHLEKSRKEGWDKGITV IPTGVGKTILAAFDSKNYKRVLFVAHREEILIQAEEAFRKVRGDGKYGFFYGDKKDDDCE VLFATVQTLGKSEYLCDKWFPRDYFDYIVVDEFHHAAADTYKRIIRYFTPDFLLGLTATP DRLDDKNIYEICDYNVVYDVRLKEAVNKGWLVPFRYYGIYDGTVNYDGISYAHGKYDEEE LEKELMIPQRGDLVYHNYSKYNCKRALAFCSTKKHAEYMAEFFQNRGVRAAAVYSGTDGE YNLDRKEALEELKDGRLQIIFSVDMFNEGLDVKNIDMVLFLRPTQSPAVFLQQLGRGLRT AEGKSYLTVLDFIGNYRKANLLPFLLSGQKYQKEMARSNSMSDYEMPDDCMIDLDFQIVD LFKRQAEQEMSTKDKVNEQFEKVRQLVSCEFQRSVPTRVDLFTYMDTDVQDALKRNMRYS PVPNYLKFLAEKNLLNEDEKKLLNGKGAEFINMVETTRMQKSYKMPLLLAFCRDGKFKTV VNENDIYYSFQTYYERGGYKVDMIKDQSSAGFMTWGKKEWVKLAVMNPVKFMLNSHPDCF KKVDNGVIGLVEGMEEVVENEAFVRHVVDAVEWRSVRYFAEKY >gi|229784110|gb|GG667625.1| GENE 23 21415 - 22044 169 209 aa, chain - ## HITS:1 COG:VC0813 KEGG:ns NR:ns ## COG: VC0813 COG0500 # Protein_GI_number: 15640831 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Vibrio cholerae # 14 203 1 189 192 146 38.0 3e-35 MEMDHVKENGAADLTLDYYNKNASEFCSGTKNIAFSEHQNIFLSCLAPQGSILDFGCGSG RDSKAFLDRGYDVESMDGSEALCRLASKTIGKEVKFRKFQELEGKDLYDGVWACASILHL SMADLEDVMKRIYQALKPKSILYASFKYGEYEGIRNGRYFNDMTEKKMELLLQRLGLFKI EKMWRTSDVRPGRSEEKWLNLLLKTKKQD >gi|229784110|gb|GG667625.1| GENE 24 22124 - 22270 64 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620550|ref|ZP_06113485.1| ## NR: gi|266620550|ref|ZP_06113485.1| hypothetical protein CLOSTHATH_01643 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01643 [Clostridium hathewayi DSM 13479] # 7 48 1 42 42 77 97.0 2e-13 MSRLLGLCEERFPEMGEFRKGKEWTCVKGFYLILRNRGIIIKTYESKI >gi|229784110|gb|GG667625.1| GENE 25 22245 - 22970 484 241 aa, chain - ## HITS:1 COG:no KEGG:Dtox_3292 NR:ns ## KEGG: Dtox_3292 # Name: not_defined # Def: hypothetical protein # Organism: D.acetoxidans # Pathway: not_defined # 21 240 27 246 248 266 55.0 6e-70 MAITAVFVRIYFLSRRKYIRTKDDFVIICGDFGFWDESGEQKYWRKWLSHKPFTTLWVDG 
NHENYDLLKTYPVEQWKGGNVQFIAPDIIHLMRGQIFEIEGLRFFTFGGARSHDISGGIL ELDDLEFKRKKKELDHGWEPYRINHLSWWKEEMPDKNEFEEGLRSLDRCGWNVDIIVSHC CASSIQQEVCGDRCEKDELTEYFEMIMGRCEFGKWLFGHYHENRNVGKRFAVLYEQIVRV V >gi|229784110|gb|GG667625.1| GENE 26 23343 - 24203 817 286 aa, chain - ## HITS:1 COG:TM1295 KEGG:ns NR:ns ## COG: TM1295 COG0491 # Protein_GI_number: 15644050 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Thermotoga maritima # 11 251 11 215 218 92 29.0 8e-19 MLSRKLPFYEIENGIYEIDEFDCVSVFVIVGEKRALVVDTGTGIGDLKWIIEEKITDKPY DVVLTHNHGDHTGGAGFFDEVWIHAADMDWTGATAPSVEFRKDYAGTIRRRENKNYDYDV ETDIRPWEKMPKLNEITDGQVFDLGGRRVTAWHCPGHTAGEMVFIDDLTKTLLLGDACNC NLLMGRGFGKDPRDEIRTAKEELERLISMKGQYDHFYNAHHDFRGFGQTLYADALEDAVK CFEGILDGTAKFVEVPDSLFLDNPPKIAAEYGKVQISYMEGRIDEV >gi|229784110|gb|GG667625.1| GENE 27 24354 - 25313 1284 319 aa, chain - ## HITS:1 COG:BH3731 KEGG:ns NR:ns ## COG: BH3731 COG1172 # Protein_GI_number: 15616293 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus halodurans # 11 309 10 309 314 209 43.0 5e-54 MEKTKTSWIDLYKKYGIYIFLVVVCAFFSVAAPNFLSASNLINILRQVSMFGIVVIGVTM VMIGGGMDLSVGGQMAVVGMVVGFLLVKLNLPIPVTALVGILTGIAFGTVNGIVAIKLKI MPIIVTLSTMLILQGVAYLITGGYPITGMPKPFLMLGQGYIGPIPIPVIIFVLFILFGWI VMNKTYLGRMIYALGGNKEAARLAGINVDKLTVMVYAFAGFAASIAALIMVARTNASQPG AGSSYPFDCMTAACLGGISIAGGEGKISGAVIGVIIIGVLDNGLVLMGVNSNFQSVVKGI ILLLAVAIDCYEAKNKKKA >gi|229784110|gb|GG667625.1| GENE 28 25300 - 26802 1709 500 aa, chain - ## HITS:1 COG:AGc5112 KEGG:ns NR:ns ## COG: AGc5112 COG1129 # Protein_GI_number: 15890066 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 493 23 515 521 446 47.0 1e-125 MPENTILSVKNIVKTYPGVIAINHFSMEVREGEVHALIGENGAGKSTLIKTLSGAITPDE 
GTVCVNGKDFTAMTPKLAKEQGIEVIYQEFTLVPGISAAENVFLGEKTKPGLFVDIKERE KRARELFERLNVDIDVSKPVKDLSPAHQQLVEIAKAVSKQVKILIMDEPTAPLTVSEVET LFRIIKDLKSRGVTIIYISHRLEELFEIADRVTVMRDGCFVDTKEIAEIDRKQLISMMAG RELNESYPEKKCEIGEEVLRVEHLTGNGDSDISFSLHKGEILGFAGLVGAGRTELMRVIY GANAKEAGRIFVNGKEEKIRSCQDAIRCGIGYIPEDRKNHGVFLRMSIKWNTVANNLKGY SNGLFVDQKREARAAEEYQEKFKIKTPNLDQLVGNLSGGNQQKVVIAKTLAANSRIIIFD EPTRGIDVGAKHEIYKLMNELAEAGHSIIMVSSDMPELLGMSDRIVVIYEGRKTGEVSKE EFDQNYILDLASGGEENGKN >gi|229784110|gb|GG667625.1| GENE 29 27068 - 28096 1265 342 aa, chain - ## HITS:1 COG:mlr5740 KEGG:ns NR:ns ## COG: mlr5740 COG1879 # Protein_GI_number: 13474776 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 70 338 68 335 380 77 25.0 3e-14 MKKQLAALLCGTMLLTAALSGCTMNSPVEGKTETTTAAKTEGTAAAGGEAGSSEAAVQKA KIGLLAPTLQTEFFINIDDGLKKECEARGWDYVAVSFDNDSATAVTSIENMVTGKCDVIV AMVSDESCDDALRKAQEAGVKIIECGVQTEVFDVCLNTDQYMIGKEIGTMAGEWINSTLD GKGKVVVYTTFQNTDMQNRGNGIQDGLKEAAPDAEILEVVDIGKDVVGSGTTTTETMLQK YPELNCIAAYGDAAAVESSEAAKAAGKSGDSFGIFSCDGTNQALKGIAENDPMHGTIKFA EMGPIMAEYSEKLLAGETFPDPIDFPTEQITASNVSEFYTAE >gi|229784110|gb|GG667625.1| GENE 30 28191 - 29099 574 302 aa, chain - ## HITS:1 COG:L83296 KEGG:ns NR:ns ## COG: L83296 COG1172 # Protein_GI_number: 15673618 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Lactococcus lactis # 45 295 49 305 313 87 27.0 4e-17 MKVQRFIQMNFLNIVVWAAFFIMWAAGMSSNVYWSVTRQLLTMIFMVFGLMIQMISGVLD LSFAAEIAASTCIGACLLRTGTPLVPVMAAVLAFHLGLGALKGFLTASLRVNPVIITLAF QIIVSNLSGLFIGEHIIIFNRKDVYASRAFWLCLSALALVSSFLLWFLLKQTYYGKYLRM LGENPAAVRESGLNYTAIQTIISMAASVFYGIAAMILLFITSSGSSSNGSHYLYPVIAAA CMGGISFLNGKGKVSGAWIGTLSAVMFMHIIVILGIQSSFETILEGLIIIISILLTMGTA NS >gi|229784110|gb|GG667625.1| GENE 31 29096 - 30484 1028 462 aa, chain - ## HITS:1 COG:SMc02325 KEGG:ns NR:ns ## COG: SMc02325 COG1129 # Protein_GI_number: 15964384 # 
Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Sinorhizobium meliloti # 22 456 20 465 503 110 26.0 5e-24 MKRELLLLKKVTVQGMDSGNRLYDVDLILRESECFCLVSAVHQKEVLVEFLQGLGKRISG SVRIRGKELLQADRKTLEKNKVFVISRSTPYMNTLNIGENIFLLRRNSLKKVLINERAIR SQTEYYFKKYKIELDPDAAPETLSNADKIIIGIIRSVSQGGRLIVLYDTSGAFSQKEMPR LISLIETLKAEGIGILISDNNPEAFFCAADRLVIFKHKKIAKKIYDCEQFPLAAEIVLRG TKAVDESQKEKKRGVPSVRINGLKAGERTISIEACRGEIVLVSREAAPIDSLWDQCQRLS ACGPEFIIDGKKLPSHTMADLVKNRVGMMSCEVSRMGVMENLTREDNILMPSYRKISGRL GFYRKASSYILKDNFLFKDGEALDDIDLGKNGWKLVFYRWKLFNPRVLIVHNCITAADLW ERDWIKKQLLEMAERGTAVILLENEAAFCSGFSDRVWTEDQA >gi|229784110|gb|GG667625.1| GENE 32 30481 - 31329 917 282 aa, chain - ## HITS:1 COG:BH2110 KEGG:ns NR:ns ## COG: BH2110 COG2972 # Protein_GI_number: 15614673 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 25 274 342 577 585 146 34.0 4e-35 MMIAVTAAVLASAACVLLGMYCLYLKNKIKNVEAERERQLSELEVKLQDEYSVRMVDKQA KLSALQSQINPHFLYNTLECIRSEALLYDCDSIARMAKALASFFRYSISKKENIVTLREE FNNIENYFLIQSYRFDDKFSFEILAAPEDREAYSCLIPKLSLQPIVENAIFHGLETKPDK GKVTIRVEMTEKNVIIIVSDDGVGIGREELELMRDSLKNSKKETDKECKSPGERGNGIAL ANVSQRIKLIFGDDYGLNLYSTKGIGTDVEIILPIMTDKSLL >gi|229784110|gb|GG667625.1| GENE 33 31346 - 32947 1781 533 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 529 1 520 525 172 28.0 1e-42 MLKVLIADDEYKVGFLVKKLIEWDKLNLEFAGLVQDGMTAYEKICECTPDIVITDIRMPV LSGLELIKKVIDEGRSVHFIVISGYKYFEYAQQAIKYGVEDYLLKPIDEVELNQILKKIA DTETNREKESRSVDTMVKKLHDSRYILHREFLNSISGDGEMNLEEVNKNYGLNFGDGLFR VICCKLDGDNRVEKNEQQINLILKKTGALVEQAFQGCVRDIAITYQKNSAVTAVINYTAE CRGEVEQRIDRLFQSQMDYLSGFEHYELTMGLSTECGEFGRLHVAVETAREAVACRILGG 
AGRQIDERFIERDSTVCAEEICGELEEKLAGAVETARYDDVKYMIGLAFHEARKRRAFAS EYYQLAGLLFKCYLDHNPFHRTVENGEMRERWSEGAGNCSSVSALTEYVTEQITADLMEC VEANREKERRPVLNAAEYIKKNYSQKISLEEMAEQGGFNMNYFSELFKKETGKTFTAYVT EVRMEEAKKLLRDTDLPVYEVAEAVGYKDSKFFSQQFVKNVGIKPMEYRKLYY >gi|229784110|gb|GG667625.1| GENE 34 33008 - 36448 3163 1146 aa, chain - ## HITS:1 COG:CAC3303 KEGG:ns NR:ns ## COG: CAC3303 COG0553 # Protein_GI_number: 15896547 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Clostridium acetobutylicum # 187 1131 161 1069 1077 531 35.0 1e-150 MESSPQSFTRGNQLYRSGKVEDLDFQKSKGRLEASAWVEGSYDNRYQVEMTYDLKRDTFT EYYCECPAYQSYDGMCKHCVAVALEMMEREAEEEEAEEEAADALAGGAAERRGSKADAAS RTVSGRASASASAAASKIRQPPTDTAVKKLIFSSTMRENAHYFQPNLNGNIRLVPALHRN AEGWTVEFKIGSDHLYVLKDITAMVNALKEEKYVYYGQKLAFYHVKSAFTQESRRLADFL EDCVNQEQDFITELYMIRRSYLPTAATGRNLVLSGDKMVRFLQAVGPGVIALESRFSDLA RIRIVEKDPVLGFFIAERSGGGCLLSVPKGEAFAGKSGLCVLEGDTIYLCSQEFKERMGG ICSLMEPYKKQEYHINEEDLPVLCASVLPSLEEFAILKKPDSLERYLPKPCEILVYLDYV DRTLTASIYSQYGDARYNLIEGLEVKDLYRDMEKEQYTAKLAAGYFNCVDEKKQQLYIPE EDEEAQYRLLSTGIAQFQQIGEVYISDSLKRLRVLKAPKITMGITRNSGLLDITLQTERL EAGELEELLSSYRRRRKYYRLKNGDFLELEDNGLSAVAELAEGLELSAGELEAGHIRVPE YRSFYLDQVLREHEGMLEVKRSSSYKAFLRSMKNVEDSDYEVPAGLNAELRPYQKFGFRW LMTLGAMGFGGILADDMGLGKTVQAIAYLAAVKEMREAEGSDGERGDEERSDAKAAAEDG RKQEVSRRSLIICPASLVYNWESEIHRFAPGLTVDTVVGSAGIRKEKIKESRADILLTSY DLLKRDVEAYQETLFDTVFIDEAQNIKNHGTQAAKAVKAVSGARRFALTGTPIENALSEL WSIFDFLMPGFLGGYKHFKEKYEQPVTARQDEVAAERLRRMIRPFILRRLKKEVLRELPD KLEEVVYSRMEDAQREIYEARVQKLLDSLSKQSQEEFRVGKLQILAELTHLRQLCCDPSL VYENYNGGAAKVDTCVELVKNAVEAGNKILLFSQFTSMLDIIRKRLDEEEIGYYILTGAV SKEKRSELVRAFNEDDTPVFLISLKAGGTGLNLTAASIVIHFDPWWNQAAQNQATDRAHR IGQQQVVTVYKLIMKDTLEEKILEMQEKKAGLSEEIITEGSISEVIGSRDEFLEILRGAA KESAGE >gi|229784110|gb|GG667625.1| GENE 35 36525 - 37658 1062 377 aa, chain - ## HITS:1 COG:no KEGG:Closa_0371 NR:ns ## KEGG: Closa_0371 # Name: not_defined # Def: oxidoreductase domain protein # Organism: C.saccharolyticum # 
Pathway: not_defined # 11 373 2 356 356 452 61.0 1e-125 MTMDQSLLEKKPFRFIVVGSGWRSLFYWRIACAYPELFTMETMLCRTDEKAGAMRRQYGV PAITSEAACEAMKPDFVVIAVNKASIFSETKRWAKKGYPVLCETPAALKLEDLKELWRMK TEEGAKIQVAEQYYLYPSFAAAMEVVRRGYLGDVTMMNLSAVHDYHGASVIRRMLGTGLQ NMAVYGKEYPYTLVETDSRDGAVTDGRRAEKTRRRLTFEFEGGKTAFYDFSSAQYHSYIR SRHLNVQGPEGELDDWIVRRIKEEIQPSGRRFLPMEQTMSVEREASGSGIIRIYLGEEQL YVNPFVHMGLKKVLPQDETAIAVLMMGLRRYIEEGTECYPLAEGLQDAYTLILMEEALKD PGRLVRSETQIWAGKQI >gi|229784110|gb|GG667625.1| GENE 36 37661 - 38491 870 276 aa, chain - ## HITS:1 COG:CAC1497 KEGG:ns NR:ns ## COG: CAC1497 COG3541 # Protein_GI_number: 15894776 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Clostridium acetobutylicum # 1 266 1 258 259 322 62.0 5e-88 MRYKDFEGTLEDLVEQKLQEIEEKEHVRILHAVESGSRSWGFASPDSDYDVRFIYVRRPE DYLKLEPVRDVIEWELDETLDINGWDLQKALRQYHRSNSTLFEWSNSPVVYRTTEEWKQI HQEASPYFSVKASMYHYYGTAKSNFMEFLQGDTVKYKKYFYVIRPVLACLWIEEHACPPP VLFSELMEAVLGDQSQKDEYRKVRQAVERLLEIKALTPESGAGARIEVLNCFIEEQLEHS KKLLDTMKDDRTDSWETLDRLFLEHVCKGNAEGRIL >gi|229784110|gb|GG667625.1| GENE 37 38481 - 39608 1104 375 aa, chain - ## HITS:1 COG:CAC1496 KEGG:ns NR:ns ## COG: CAC1496 COG3541 # Protein_GI_number: 15894775 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Clostridium acetobutylicum # 39 375 3 339 339 471 71.0 1e-132 MSDTEKITELPKSTRPHLTQPHLTQPHITQPHITQQLPSQQLLTQPQYSFLKTDPHLGKH IILLGLAGSYSYGTNNENSDIDVRGVTLNRKSDLIGMTSYDQYTDENTDTVIYTFNKIIR LLLECNPNTCELLGLNEEHYLYLSPIGRELLANRRLFLSKRAIQSFGGYADQQLRRLQNA LARDRMASEERERHIYNSVKNAMYEFRERYRISDYGTLKIYIDEAENPEMDTEIFLDAQF SHYPLRDYRNIWGEMNNIVKEYDKIGKRNRKKDDNHLNKHAMHLIRLFMMAIDILEKEEI ITYRKDEHELLMKIRRGEFQKEDGTYRTEFYEILADYEKRLHEAAENTSLPDEPDYERVQ EFVMSVNERVVRDEI >gi|229784110|gb|GG667625.1| GENE 38 39881 - 40372 557 163 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2916 NR:ns ## KEGG: Cphy_2916 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 6 138 6 
139 139 128 54.0 9e-29 MRGMTERDKKLRLLYGAAFLVLMVVETLIALYVHDKVIRPYVGDMLVVLVVYCFVRIVIP VKARLMPLYVFVFAACVEVLQYFDLVTLLGLSGNRFARIVFGSVFDVKDLACYAAGCLLL EVWERRIRRKICGQTGQKLEEPGSGGKNREVAGRTGSGEKNRK >gi|229784110|gb|GG667625.1| GENE 39 40430 - 41020 574 196 aa, chain - ## HITS:1 COG:no KEGG:DSY2752 NR:ns ## KEGG: DSY2752 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 2 189 7 192 199 177 53.0 2e-43 MTRKSRIHYELETGEITSLKPEELRAILRAADELIATGGRGMLVKILKGSKDKKVLEYKL DQCPAYGYYHDLTMDEISRRVDFAIVKRYLRIEYSGRLPMLVFTDKGWEIECETYTEEWF QRFEETVTSKVLHLDMFEELKSVNRQVVFGLLNRIKERGDKEYIPLLKAWQEGEVRKDRE RIGGVIKFLEEGKGCE >gi|229784110|gb|GG667625.1| GENE 40 41031 - 42299 1020 422 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1588 NR:ns ## KEGG: Cphy_1588 # Name: not_defined # Def: SecC motif-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 11 419 17 383 383 89 22.0 2e-16 MVKEPLVESFFLKGEPDHCLGTCLKRCPESLLDVIAEEYGLELGRGKERKRQLESLEHKI IEGLSAKIEAVPANELQLLIKLAVEECETEEAAEGLSLQKNGWVFYYVDEEEMNTTPVVP YEIVDKIKELAGEMGFASRMAYYEMVRSYISTFLRLYGVFETKWLFEVIRKHDIPAESAE EESTAEESTAEESMAEERTAEKSEDMPEKRDWLPDKMALSNEMLEKVVVRLKEESENFGL EAGYLFDPDLEDEEEYKERFDAVKDMPYYEPSFEDLLFYHENYIDEHLKEYRILKRYLSK RMDSSMEADQLLRELSIEAVEELGGIFAVSEIMERYEGIFSSGEELKEFEQLFRDWEDHV RKWNNRGFTNAEMRARGEHGCQTKIEWDLTKIKFKANTPDPDAPCPCGSGKKYRQCCGRV KK >gi|229784110|gb|GG667625.1| GENE 41 42326 - 42397 63 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLEECIYHMENTGEKWTNIYLTI >gi|229784110|gb|GG667625.1| GENE 42 42466 - 43686 1212 406 aa, chain - ## HITS:1 COG:SP0298 KEGG:ns NR:ns ## COG: SP0298 COG1373 # Protein_GI_number: 15900232 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Streptococcus pneumoniae TIGR4 # 1 402 1 402 402 349 48.0 6e-96 MTKINREEYMDWLIRWKDRQIIKVVSGVRRCGKSTMFDMFREYLLTDGVSRSQIIMLNFE DIEYEEITDYQKLYYYIKERLLPDQMNYIFLDEIQHVADFEKAVDSLFIKENCDVYITGS NAYFMSGELATLLSGRYVELKMLPLSFAEFCSTLTETDFSLQQKFSRYLEYSSFPYVTRM 
GLELKDAKEYLMDIYHTVLLKDVVARLNISDITVLENVAKFLLHNVGNLASPTKIANTLK SQGTKVDQKTVDKYLRGLTDSLLLYEAGRYNIKGKQYLTQQCKYYAVDIGLRNVLVRGKD SDIGHILENIVYLELLRRGFQVRVGQMNDGEVDFVAMNSEETVYYQVAATTLEESTLERE LAPLRKIQDNYPKYLLTLDELFGNADYEGIKKRNVLEWLLKENRSL >gi|229784110|gb|GG667625.1| GENE 43 43794 - 43937 110 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIRVRAEAGKNIKSAVERSNTGERSGSCLMGQGLSLLLFGIENGIKS >gi|229784110|gb|GG667625.1| GENE 44 43879 - 45531 1743 550 aa, chain - ## HITS:1 COG:no KEGG:Closa_0369 NR:ns ## KEGG: Closa_0369 # Name: not_defined # Def: plasmid pRiA4b ORF-3 family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 542 1 513 522 519 50.0 1e-145 MGAYQFKIIIKGSKPPIWRRVLIPQGITFETLHQMIQTAFCWSDCHLFEFEFRTMGIRVT DDRLEEEFSMPGTVSGSTVIDELVADTKKFTYTYDLGDNWEHTIEVEKILEEDTEPYARV VKYKGDVIPEDCGGIYGYYHLLDILADPKKLEELKAENGIDYLEWAGNQGMGTYDVELVN EKLKALSFSEGRVPESGKLLLEDIFGCYDKESITALAKRHGMSGYSKYKKEELIRETVDY LIRPEVMRRYFLCARDKEIELFEQILEGNGWIPLYEDENIDYLYAGGYVTEAVQQGTYLV ADDVKTAYGVMNTEQFQAERSRISRIGDCLCAANSLYAVSPVSAVLDIFNEYASPKLTEG ELLEAYETLSAARPVVELVDGRFVDRVLAEQNTYEELYRMQRNIPYYIPSLTELKFMADN GGFLMGKELQKLGEFLTGELGVKDEVIPDILRDVQSAISVGANLQTVIDELGNYGVVFSE QRELEKFSNVIIDVWNNTRMVMNHGYTPHELVRYGFSEAPGQRKNVQKIYPNDPCPCGSG KKYKKCCGKK >gi|229784110|gb|GG667625.1| GENE 45 45678 - 46754 1230 358 aa, chain - ## HITS:1 COG:CAC2788 KEGG:ns NR:ns ## COG: CAC2788 COG0006 # Protein_GI_number: 15896043 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Clostridium acetobutylicum # 1 357 1 357 358 295 44.0 7e-80 MFEERVGRVLHNMEKMGLEQMIISDPPSVFYLTGRWILPNERLLALYLNKNGNHRLFINK LFTVDGDIGVEKVWFSDTDPACQMIADCTDHGKPLGIDKKMAARFLLELMELGAGSSYKN ASECVDGARRIKDEEEKETMILASRLNDEAMARFRGLIKEGVTELEVAAGMCAIYKELGT EGPSFGPLVSFGANAAIGHHKPDGTVLKDGDCVLFDVGCKKNSYCSDMTRTFFYKNASEK GREVYEIVKKANLAAQAAMKPGMKFCEIDKVARDIITEAGYGPYFTHRLGHCIGIEVHDA GDVSSANQDVVQEGMIFSCEPGIYLPGELGVRIEDLMLVTADGAVSLNRDSKEMEIIG >gi|229784110|gb|GG667625.1| GENE 46 46810 - 48399 1721 529 aa, chain - ## HITS:1 COG:FN0470 
KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 12 520 2 502 512 447 47.0 1e-125 MYKGESMKNTTKTKKRSWILRFLDKIETAGNKLPTPLTIFFYLAVAVVIVSGIGSALGWS AAGEMYNSGSGTVEEATVSIVSLFSIKGLQYMLTSAVTNFTSYAPLGFTIVIMLGIGVAE SSGYLNGLIRKVVKVTPAALVTPMVVFIGVMTNVAENAGYVVFIPLAAMIFKAYKKHPLA GIAAGFAGVSGGFSANLLIGSADATLSGFSTAAAQTLDPSYTVTPLGNWFFMIVSTILIT IVGTYITERVIIPRLGPYKENGDGTLIEASDSISAKEEKALKAANWVFVGIVAFYIICCI PKNSFMRNPETGSLIDGSTLMNAIIPLFTLLFFIPSFVYGKMCGTFQSDKDVVSSFNKAI ASISGFVAMAFVAAQFTNYFSYTNIGRVLSFKGAELLKSLNVNSVVLVICFILVAGFVNL FMASASAKYAIFAPVFVPMFMKMGLSPELTQVAYRIGDSTTNIIAPVIAWIPYILTIMKK YDKDTGWGTLVSSMLPYSMAFLFFWSLLLAVWMIAGLPLGPGAPVFYGV >gi|229784110|gb|GG667625.1| GENE 47 48415 - 49497 236 360 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 4 313 110 410 458 95 26 5e-19 LEPKGVLNLQTGEVFEDTYDKLVIGSGAGVRHFPPFDRSYSNLFEIRDVADGTRVKQALF DENQRHVVIVGAGFIGLELSEACRHYGKEVTVVELADHVLPAFDPEVSEALEAELAENGV TVRVGTMVQSLIEEDGRIVGAVLSRGDGTEEIPADIVINSAGIAPATSFITNVEKAKNGA ICVNERMETSIPDVYAAGDCSIMKSAVTGDYMYAPLGTNANKQGRIIGDILGEAQPKPFK LIGSSALRLFGMDAAKVGLSEKEAKARGLNYKAHTITGNSYASYYGTEKLNIKLIYDADT RKILGAETYGQGIVVPRANYYAIAIYAGLTVDEMGFMDLCYSPPFSGVWDAALIASNTAK Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:09:30 2011 Seq name: gi|229784109|gb|GG667626.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld19, whole genome shotgun sequence Length of sequence - 61792 bp Number of predicted genes - 63, with homology - 57 Number of transcription units - 32, operones - 16 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 96 - 155 5.9 1 1 Tu 1 . + CDS 186 - 1883 2167 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 2 2 Op 1 . 
+ CDS 2152 - 3612 1475 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs 3 2 Op 2 . + CDS 3615 - 4532 900 ## Closa_0990 hypothetical protein 4 2 Op 3 . + CDS 4590 - 5594 891 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 5606 - 5635 -0.3 5 3 Op 1 . - CDS 5658 - 6329 467 ## gi|266620578|ref|ZP_06113513.1| conserved hypothetical protein 6 3 Op 2 8/0.000 - CDS 6322 - 7020 182 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 7 3 Op 3 . - CDS 7017 - 7376 366 ## COG1725 Predicted transcriptional regulators - Prom 7409 - 7468 10.4 - Term 7502 - 7560 5.3 8 4 Op 1 17/0.000 - CDS 7607 - 9262 1561 ## COG1178 ABC-type Fe3+ transport system, permease component 9 4 Op 2 7/0.000 - CDS 9252 - 10328 1127 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 10 4 Op 3 1/0.333 - CDS 10368 - 11474 340 ## PROTEIN SUPPORTED gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 - Prom 11525 - 11584 5.3 - Term 11546 - 11600 5.4 11 5 Op 1 7/0.000 - CDS 11606 - 12382 939 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 12 5 Op 2 . - CDS 12370 - 14082 1249 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 14218 - 14277 4.4 + Prom 14169 - 14228 5.5 13 6 Op 1 4/0.000 + CDS 14252 - 15685 1675 ## COG1070 Sugar (pentulose and hexulose) kinases 14 6 Op 2 1/0.333 + CDS 15705 - 16019 417 ## COG3254 Uncharacterized conserved protein + Prom 16046 - 16105 1.6 15 7 Op 1 5/0.000 + CDS 16128 - 17378 1652 ## COG4806 L-rhamnose isomerase 16 7 Op 2 . + CDS 17468 - 18295 1093 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases - Term 18060 - 18088 -0.9 17 8 Tu 1 . - CDS 18261 - 19169 429 ## COG2017 Galactose mutarotase and related enzymes - Prom 19265 - 19324 4.1 - Term 19337 - 19386 3.5 18 9 Tu 1 . 
- CDS 19480 - 21207 1132 ## Tthe_2726 transposase - Prom 21251 - 21310 10.7 + Prom 21366 - 21425 6.2 19 10 Tu 1 . + CDS 21445 - 22608 1124 ## COG1940 Transcriptional regulator/sugar kinase + Prom 22611 - 22670 13.1 20 11 Op 1 . + CDS 22745 - 23821 1226 ## COG3181 Uncharacterized protein conserved in bacteria 21 11 Op 2 . + CDS 23842 - 24285 464 ## gi|266620594|ref|ZP_06113529.1| conserved hypothetical protein 22 11 Op 3 . + CDS 24296 - 24658 282 ## COG3333 Uncharacterized protein conserved in bacteria 23 12 Tu 1 . - CDS 25712 - 25789 82 ## - Prom 25840 - 25899 7.9 24 13 Tu 1 . + CDS 25913 - 26020 79 ## + Term 26085 - 26127 3.3 25 14 Tu 1 . - CDS 26113 - 26190 82 ## - Prom 26242 - 26301 4.6 26 15 Tu 1 . + CDS 26551 - 26739 194 ## gi|288870269|ref|ZP_06113531.2| choline-sulfatase + Prom 26767 - 26826 1.9 27 16 Op 1 . + CDS 26846 - 28189 1520 ## COG0534 Na+-driven multidrug efflux pump 28 16 Op 2 . + CDS 28284 - 28493 197 ## Rumal_2298 hypothetical protein + Term 28517 - 28583 9.9 + Prom 28588 - 28647 10.1 29 17 Op 1 . + CDS 28686 - 28829 140 ## 30 17 Op 2 . + CDS 28874 - 29419 555 ## COG1695 Predicted transcriptional regulators + Prom 29528 - 29587 5.1 31 18 Tu 1 . + CDS 29636 - 30028 318 ## ELI_2299 hypothetical cytosolic protein + Term 30266 - 30297 2.2 - Term 29972 - 30007 1.0 32 19 Tu 1 . - CDS 30100 - 30672 475 ## COG1309 Transcriptional regulator - Prom 30725 - 30784 4.8 + Prom 30697 - 30756 4.5 33 20 Op 1 . + CDS 30809 - 31819 818 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases + Prom 31821 - 31880 1.6 34 20 Op 2 . + CDS 31908 - 32078 242 ## gi|266620604|ref|ZP_06113539.1| putative MFS metabolite transporter 35 21 Op 1 . + CDS 33034 - 33480 366 ## PFREUD_17740 hypothetical protein 36 21 Op 2 40/0.000 + CDS 33516 - 34187 791 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 37 21 Op 3 . + CDS 34184 - 35281 875 ## COG0642 Signal transduction histidine kinase 38 21 Op 4 . 
+ CDS 35278 - 35652 311 ## Tthe_0858 hypothetical protein + Prom 36499 - 36558 80.4 39 22 Op 1 . + CDS 36807 - 37379 332 ## COG4905 Predicted membrane protein + Prom 37383 - 37442 1.7 40 22 Op 2 . + CDS 37485 - 37763 300 ## gi|266620611|ref|ZP_06113546.1| conserved hypothetical protein 41 22 Op 3 . + CDS 37760 - 37876 197 ## 42 22 Op 4 . + CDS 37876 - 38370 260 ## SpiBuddy_2065 hypothetical protein 43 22 Op 5 . + CDS 38420 - 38791 503 ## LEUM_0832 hypothetical protein + Term 38793 - 38825 -0.4 44 23 Op 1 . - CDS 39040 - 39228 215 ## MGAS2096_Spy1114 hypothetical protein 45 23 Op 2 . - CDS 39268 - 40248 962 ## COG1680 Beta-lactamase class C and other penicillin binding proteins - Prom 40422 - 40481 3.1 + Prom 41042 - 41101 7.1 46 24 Tu 1 . + CDS 41134 - 41790 372 ## Apre_0340 helix-turn-helix domain-containing protein + Prom 42633 - 42692 80.4 47 25 Tu 1 . + CDS 42862 - 43512 131 ## gi|266620620|ref|ZP_06113555.1| hypothetical protein CLOSTHATH_01715 48 26 Tu 1 . - CDS 43796 - 43873 85 ## - Prom 43925 - 43984 5.9 + Prom 43958 - 44017 2.3 49 27 Tu 1 . + CDS 44045 - 44536 512 ## COG0394 Protein-tyrosine-phosphatase + Term 44591 - 44624 -0.4 + Prom 44625 - 44684 5.9 50 28 Tu 1 . + CDS 44705 - 46309 1507 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 51 29 Op 1 35/0.000 + CDS 46446 - 47798 1533 ## COG1653 ABC-type sugar transport system, periplasmic component 52 29 Op 2 38/0.000 + CDS 47916 - 48782 1065 ## COG1175 ABC-type sugar transport systems, permease components 53 29 Op 3 . + CDS 48794 - 49633 891 ## COG0395 ABC-type sugar transport system, permease component 54 29 Op 4 . + CDS 49699 - 51438 1510 ## Acid345_0426 glycoside hydrolase family protein 55 29 Op 5 1/0.333 + CDS 51440 - 53380 2181 ## COG2909 ATP-dependent transcriptional regulator 56 29 Op 6 . + CDS 53290 - 53904 641 ## COG2909 ATP-dependent transcriptional regulator + Prom 54005 - 54064 3.3 57 30 Op 1 . 
+ CDS 54084 - 55160 1128 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains 58 30 Op 2 3/0.000 + CDS 55170 - 56120 1113 ## COG2971 Predicted N-acetylglucosamine kinase + Prom 56161 - 56220 8.0 59 30 Op 3 . + CDS 56278 - 57000 622 ## COG2188 Transcriptional regulators + Term 57046 - 57093 16.4 + Prom 57196 - 57255 6.3 60 31 Tu 1 . + CDS 57344 - 57742 265 ## ELI_2236 signal transduction histidine kinase 61 32 Op 1 . + CDS 58689 - 59504 755 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 62 32 Op 2 . + CDS 59501 - 60310 734 ## COG3947 Response regulator containing CheY-like receiver and SARP domains 63 32 Op 3 . + CDS 60340 - 61792 1316 ## COG0840 Methyl-accepting chemotaxis protein Predicted protein(s) >gi|229784109|gb|GG667626.1| GENE 1 186 - 1883 2167 565 aa, chain + ## HITS:1 COG:RSc0791 KEGG:ns NR:ns ## COG: RSc0791 COG0008 # Protein_GI_number: 17545510 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Ralstonia solanacearum # 20 561 16 578 580 624 54.0 1e-178 MEENREVLETAEKEEVVSKNFIEQEIDKDLAEGVYSHVQTRFPPEPNGYLHIGHAKSILL NYGLAKKYGGQFNLRFDDTNPTKEKTEFVESIIEDVKWLGADFEDRLFFASNYFEQMYEC AVFLIKKGKAFVCDLSAEEIREYRGDFKTPGKESPYRNRSIEENLRLFEEMKEGKYQDGE KVLRAKIDMASPNINMRDPVIYRVARMSHHNTGDKWCIYPMYDFAHPIEDAVEHITHSIC TLEFEDHRPLYDWVVRECEFENPPRQIEFAKLYLTNVVTGKRYIKKLVEDGIVDGWDDPR LVSIAALRRRGYTPEAIRMFVDLVGVSKANSSVDYAMLEYCIREDLKLKKPRMMAVLDPV KLVIDNYPEDTIEYLDAPNNLENPELGERKLPFGKELYIEREDFMEEPVKKYFRLFPGNE VRLMNAYFVTCTGCEKDENGNVTVVHCTYDPETKSGSGFTARKVKGTIHWVAAKTAVSVE CRLYENIVDEEKGKLNEDGSLNLNPNSLTIIKNCYVEPALAEAKPYDSYQFVRNGFFCAD CKDSTPEHLVFNRIVSLKSSFKITK >gi|229784109|gb|GG667626.1| GENE 2 2152 - 3612 1475 486 aa, chain + ## HITS:1 COG:BH0578 KEGG:ns NR:ns ## COG: BH0578 COG1167 # Protein_GI_number: 15613141 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a 
DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Bacillus halodurans # 7 478 8 468 469 346 40.0 6e-95 MELMIPLDSHSGMPYYEQIYEYIREEIKKGNLKSPTRLPSTRLLAEHLKVSRSTTQMAYE QLLSEGYIESVPCKGYFVSRIEELVDTGQENGGESPFFGMYPWEVPGDSQPAAAGEGEYR VDFSPRGVDLDSFPFNTWRKITKITLVDDNKEMFATGDPQGEPAFREAIRGYLHSARGVN CTAEQIVVGAGTEYLLMLLSRILGPDHTIAMENPTYKQAYRVFESLGYPVVPVEMDGSGL ETSLLEGSGADIAYVMPSHQYPTGIVMPVKRRQELLSWAYKKEGRYLIEDDYDSEFRYKG KPIPALQGMDGRERVIYSGTFSKSIAPAIRVSYLVLPKPLLAVYKERVNFYTSTVSRIDQ NILYQFMVSGCYERHLNRMRAVYKAKHDALIGALKPFEHQFAVKGEYAGLHLLLTDKKNR TEKWLVESAKRAGVKVYGLSSYLIREGKAVVAPTVILGYAMLSEDAIKEGIRLLRAAWSS DGEQEE >gi|229784109|gb|GG667626.1| GENE 3 3615 - 4532 900 305 aa, chain + ## HITS:1 COG:no KEGG:Closa_0990 NR:ns ## KEGG: Closa_0990 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 62 301 64 284 289 203 47.0 9e-51 MKIPLRAAGAAAVIVVLLLVVGILGNVRISGYLRASMEESIHQVESEIASKEVQESIEES KIEETAVHEIALTYTEEQLLNLLEEKLRANDLEGAARVLNENESAFQDLFFGKLADEICL YDGSVMKDEADGFGLVFQRAGTVFYGTFKDGKPSGQCVALQAITLDEGVRYDYSDGTWVD GVMEGPGECGYDYYEGVTGDGAKKTVKKGTFSNDLMEGEITYTSSRQTPERGTFPDDSGS GESTTWTMTVEKGVIVPDARWISDTSDDGTPVFRLMADTDDGHAYVVEADAMKEARWKNM IEWGE >gi|229784109|gb|GG667626.1| GENE 4 4590 - 5594 891 334 aa, chain + ## HITS:1 COG:lin2983 KEGG:ns NR:ns ## COG: lin2983 COG2207 # Protein_GI_number: 16802041 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 20 328 11 323 326 109 24.0 8e-24 MNQEMLERLSVITEEEREILNGRTEIDRTRYTKGDELVTSRQMEVRGAFHIDSGKMLEHG KLIRIRPHTRFVHFPKHKHNYIEVIYMCKGETTHVIDGETVVLRTGELLFLNQHATQEIL PAGEGDVAVNFIILPEFFDTAFEMMGEEENLLRDFLVGCLCNDTRYASFLHFKVADVLPV QNLVENMVWTLLNDQPNKRSINQITMGLLFLQLMHYTDKISHTSEGFDQRLIFQVLGYID ENYRDGELTELSSMLGYDIYWLSRMVKRLTGRTYKELLQIKRLNQAAFLLLNTRLAVADV ALAVGYDNTSYFYRIFKERYQMSPKEYRKNNRAL >gi|229784109|gb|GG667626.1| GENE 5 5658 - 6329 467 223 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|266620578|ref|ZP_06113513.1| ## NR: gi|266620578|ref|ZP_06113513.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 223 1 223 223 365 100.0 2e-99 MIRVWKAIKWQIQAVFHEIRTAAAFFLVLNAVLLLLPLSACQIIDNMAGFAVVFISVCYA ASLLMALFYGMNCMPEKPDARQNELWRLAEPNPWLRIFSRLTAVAVLMAVSFLNGQAGTS LMQKFADANHSYFRMEMNGSIPRAFLQFAVFLPLLYLFLTLFSRRRQNGHGTALTLIAAL ILGQIVYDMLKSLSLIAAIPAGILLVIFLFWQTGKLEAQRDVI >gi|229784109|gb|GG667626.1| GENE 6 6322 - 7020 182 232 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 13 205 9 218 309 74 27 1e-12 MRMEQMTADTVLLKLQNVTKTRNGWKVLNHVTAEVRENRIIGVIGDNGIGKSTLLKLMAG LLKPDEGRIIHGEAAISYILSGSHFYDWMNADDCVRFYEDFYSDFDREHALMLLEQSRID RRTRLCRLSRGKQERLCMILGLCRYSRLYLFDEPFDGIDPYFKKDMKRFLLENMADNSSI VMATHLLKDLESLFDEVIFLADGRVWQMETEEIRTKHHKSVEQYYLEEICHD >gi|229784109|gb|GG667626.1| GENE 7 7017 - 7376 366 119 aa, chain - ## HITS:1 COG:CAC0265 KEGG:ns NR:ns ## COG: CAC0265 COG1725 # Protein_GI_number: 15893557 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 114 1 116 121 84 45.0 7e-17 MDFQNDQPIYIQIGTFIKEQIINGTIRPGEKLPSVREYAVFFEVSPLTIHRTVQYLESEG IIETRKGIGSFVRTEIQERLSEDMVTGIIQDFIVRMHHCGLTDAQIKAAVSQELEGGLK >gi|229784109|gb|GG667626.1| GENE 8 7607 - 9262 1561 551 aa, chain - ## HITS:1 COG:FN0377 KEGG:ns NR:ns ## COG: FN0377 COG1178 # Protein_GI_number: 19703719 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Fusobacterium nucleatum # 10 551 9 550 550 578 60.0 1e-165 MKDNSKFRLDIWGIITLCTIAVYALFLIYPMAHLIQQSVIDGETGQFSMANFTKFFSKSY YFNTLFNSFKVSIAATVLSIAIGTPLAYIFSVYKIKGKSLLNILIIVASMSAPFIGAYSW ILLLGRSGVITTFFRNIGINIPDIYGFGGIVLVMTLQLFPLVFLYAKGALKNIDNSLVEA ANNLGCSGLKCFFKVVIPLIMPTLLAAALLVFMRSFADFGTPMLIGEGYRTFPVVLYTEF INEVGGNDGFAAAIAVIAIVVTTVVFLAQKYVSNKNAFSLNALHPIEEKTPKGLTGFLAH 
AASYLIVGIAILPQVYVSYTSFKKTEGKIFVSGYSLESYASAFGKLGRSIQNTIIIPLCA LVIIVLLAIMIAYLVVRRKNALTSTVDIISMIPYIVPGTVLGIALLTGFNKKPVMLSGTM LIMVVALVIRRLPYTIRSSVAILQQIPMSIEEAAISLGASKMKSFFKVTVPMMTAGIVSG AILSWVTMISELSTAIILYTGRTKTLTVAVYTEVVRGNYGIAAALSTILTVLTVISLLIF NRVNGGKDLSL >gi|229784109|gb|GG667626.1| GENE 9 9252 - 10328 1127 358 aa, chain - ## HITS:1 COG:FN0376 KEGG:ns NR:ns ## COG: FN0376 COG3842 # Protein_GI_number: 19703718 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Fusobacterium nucleatum # 2 351 7 357 371 425 62.0 1e-119 MIKDAVKKYGNNTVIPDLNATIRDGELFTLLGPSGCGKTTLLRMIAGFNSIEGGDIYFND TRINDMDPSRRNIGMVFQNYAIFPNMTVRDNVAFGLKNRKIEKEKIKTETDKYLNLMQIM QYAERMPNQLSGGQQQRVALARALVITPDVLLMDEPLSNLDAKLRLEMRSVIRHTQKDVG ITTVYVTHDQEEAMAISDTIAVMKDGVIQHVGTPKDIYQRPKNVFVATFIGRTNIIPARV ENGAIIFSSGYRVELECLKQAGRQEVLCSVRPEEFVISPEGTEGIAGVVKEYTYLGLNTH YFVETDDGHGVVEIVEESSIEDELKPGQRVLLQVKKHKINVFNKEGSLNLVRSEAYEG >gi|229784109|gb|GG667626.1| GENE 10 10368 - 11474 340 368 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 [Haemophilus parasuis 29755] # 58 357 30 338 346 135 30 5e-31 MKKTIAVTMALTMTAAALAGCSSSSSVPATTAAPKTEAGSEASGDTQATGSEEKKGGGKL VVYSPNSEGLMNATIPLFEEKYGVDVEVIQAGTGELVKRIQSEKNDPYADVLFGGSWSLA FDNEDLWEPYVSPNDAGVLDAYKNTCGFITGNVLDGSCLIVNTDLIGDIKIEGYADLLNP ELKGKIATADPANSSSAFAQLTNILLAMGGYEDDAAWKYVEDLFTNIDGKIIESSSGVYK GVADGEYLVGLSYEDPCAQLVKDGAPVKIIYPKEGSVYLPASATIVKGAKNMDNAKLFID FIISEEVQNIWGTTLTNRPVLKDAKTSDFMTPMSDIYVIEEDIPYVSSHKKELVNKYTDI FTSIESAK >gi|229784109|gb|GG667626.1| GENE 11 11606 - 12382 939 258 aa, chain - ## HITS:1 COG:SPy1062 KEGG:ns NR:ns ## COG: SPy1062 COG4753 # Protein_GI_number: 15675054 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pyogenes M1 GAS # 1 256 1 250 262 164 35.0 2e-40 
MYSVILVEDEDIIRRGIRNSVPWEEFNCSVVGEARNGEEGGALIAERNPDIVITDINMPV EDGLQMIARTKDTYDYVAIILTGYSDFEYAREAIRNGVSYYVLKPLDMEEMKEALERAAL ELKNIRILRREMEDREKLKHSVLIPETDAMSGMDPVAARILDFIAGNYDQKIALADLEEE LHYSERYLNQRFQKALGTTVIEYLNRYRIQKALAMLADEDNAISDIGWKCGIGDYKYFSY VFKKYMGCSPREYRSKIL >gi|229784109|gb|GG667626.1| GENE 12 12370 - 14082 1249 570 aa, chain - ## HITS:1 COG:SPy1061 KEGG:ns NR:ns ## COG: SPy1061 COG2972 # Protein_GI_number: 15675053 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pyogenes M1 GAS # 172 542 167 531 549 155 28.0 3e-37 MAYKQTREQTIFQKQLEKAATRNIVFLVFIGCIFFSAAIFGINYLNTRFNVQEHIEFLTS TFQTAYHATDTYLNDAENDSIFTRCIENRTDGSEVRYSLSKYNVDSPIRLNLILTDSNHN IVYTSYGNDDMNLHRIAFNGIVCDNTVKNGPGLYHTVYYFTGDTSEYVFSRPLYRDGQLI GFAGAYLNGHDWARLFSNYQYDSIITTANGDIIFCSREGFLPERNTNKFRNEEQQRLVFL HGNRYLLSSHFLKDEGVTIYSFIYYPKNVLYLAIGILTIVFLGVIWFIMTQKLSKLMAEK NARSVGILADEMRIIRHGDSSHIIEIHTGDEFEEIASQINRMVKSINELNTRNTDLIRLN SMIEISNLQTQINPHFIYNTLDTIKYLIMSEPDKAAHLIEKFTHILRYSINNTKQDVLLQ EDMRYIEDYLYIQKTRFGDRFLCETAIAGECMRYDVPKLLLQPLIENSIKYGFKKRMEIR VTICGRCEGDYLYLSVEDNGPGVPRATLETLRAMLRSEELKTEHNGLQNLARRIVLEYGD NSEMTIDSREGEYFRVDIKLKNKERTTCTV >gi|229784109|gb|GG667626.1| GENE 13 14252 - 15685 1675 477 aa, chain + ## HITS:1 COG:BH1551 KEGG:ns NR:ns ## COG: BH1551 COG1070 # Protein_GI_number: 15614114 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus halodurans # 3 477 2 461 467 464 49.0 1e-130 MENYYLAVDIGASSGRHILGTVKDGIISLEEIYRFENGMKKKNGHLCWDVESLFHEIKMG MKQCAVLGKIPCSMGIDTWAVDFVLLDGEGRILGDAVGYRDDRTKGMDEKVYETISPEAL YERTGIQKQIFNTIYQLMAVKEERPEELAGAESMLMIPDYFHYLLTGVKKQEYTNATTTQ LVSPVTKTWNYELIARLGYPEKLFTELSMPGTVVGTLSKEVEDEVGFTCEVVLPATHDTG SAVMAVPVPENASGNSSDADGGLLYISSGTWSLMGTELTEADCSAESRTANLTNEGGYEY RFRYLKNIMGLWMIQSVKKELAAAGETYSFAELCRMASEETIQSIVPCNDDVFLAPESMT EAIKEYLRKSGQEVPETAGALAAVIYNSLAQCYKETVEELEAITGRTYQAVNIVGGGSNA 
DYLNRLTAGATGKTVYAGPGEATAVGNLLAQMIQAGEFKGLQEARAAVYRSFEISTY >gi|229784109|gb|GG667626.1| GENE 14 15705 - 16019 417 104 aa, chain + ## HITS:1 COG:ECs4828 KEGG:ns NR:ns ## COG: ECs4828 COG3254 # Protein_GI_number: 15834082 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 103 1 103 104 110 48.0 5e-25 MIHKSFKMHLYEGMAEEYERRHNLLWPEMKDMIHEYGGHNYSIFLDSETNVLYGYIEIED EDKWAESADTAINRKWWDYMADIMDTNPDNSPVSVDLKPVFHLD >gi|229784109|gb|GG667626.1| GENE 15 16128 - 17378 1652 416 aa, chain + ## HITS:1 COG:BS_yulE KEGG:ns NR:ns ## COG: BS_yulE COG4806 # Protein_GI_number: 16080170 # Func_class: G Carbohydrate transport and metabolism # Function: L-rhamnose isomerase # Organism: Bacillus subtilis # 1 415 1 417 424 540 61.0 1e-153 MTVKERYEAAKEIYRAAGVDTDAALEALKNIPVSMHCWQGDDVIGFDGAGALSGGIQATG NYPGRARTPEELMSDMDKALSLIPGTHRINLHASYAVFEDGEWADRDKLEPKHFAKWVEF AKERGLGIDFNPTLFSHVKAENATLSSEVPEIRRFWIDHVKACIRISEYFAQELGTPCTM NIWIPDGFKDIPADRTAPRARLKDSLDQILSIDYDKSKVYVAVESKVFGIGMESCTVGSH EFYMNYASKNDILCLLDSGHYHPTEVVSDKISSMLLFFDKVALHVTRPVRWDSDHVVLFD DETREIAKEIVRGGADRVLLALDFFDASINRISAWVVGMRNMQKALLNALLLPNGKMAEL QNERKFTELMMLSEEMKTYPLGDVWDYFCEQNGAPVKEAWFEEVKRYEADVLAKRS >gi|229784109|gb|GG667626.1| GENE 16 17468 - 18295 1093 275 aa, chain + ## HITS:1 COG:rhaD KEGG:ns NR:ns ## COG: rhaD COG0235 # Protein_GI_number: 16131742 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 3 266 4 266 274 257 48.0 1e-68 MRVTDAEFVKGFIRLCDDGFKQNWHERNGGNLSYRIKPEEVESVKEELTDSNPWQPIGTS VPKLAGEYFMVTGSGKYFRNVILDPAANSCIIEVDEAGENYRICWGLVNGGRPTSELPSH LMNHEVKKEVTGGKHRVIYHAHPTNVIALTFVLPLEDKVFTRELWEMATECPVVFPDGVG VVGWMVPGGRDIAVATSELMKKYDVAVWAHHGLFASGEDFDLTFGLMHTVEKSAEILVKV LSIRPDKLQTITPQNFKDLAKDFKVTLPEEFLYEK >gi|229784109|gb|GG667626.1| GENE 17 18261 - 19169 429 302 aa, chain - ## HITS:1 COG:lin1322 KEGG:ns NR:ns ## COG: lin1322 COG2017 # Protein_GI_number: 
16800390 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Listeria innocua # 2 281 1 279 290 177 34.0 3e-44 MLTTIRNSELTAVIDSAGAQLISLKDRSRKEYIWQRDPRFWSRCSPLLFPVVGDCRNNRL LIDGSSYRLPKHGFCKDMEFQRLEQSEERASFLLQDNALTKAMYPWSFRLSLTYELNAGS LFLKYRVANIDSSALYYCIGAHPGFRCPMEEDAVFEDYCLIFEKEEDVYAMPFDRFQGEY LSNGRGYELRGRTIPLSRALFRDNALYFPKIASRRVALIHRADGCGVEVSYPDFETIAFW TTYPDQAPFLCIEPWNGSGARTGEGDELIHKNHVRRLEPGASHDLSMAIRPLTFRTETPP EA >gi|229784109|gb|GG667626.1| GENE 18 19480 - 21207 1132 575 aa, chain - ## HITS:1 COG:no KEGG:Tthe_2726 NR:ns ## KEGG: Tthe_2726 # Name: not_defined # Def: transposase # Organism: T.thermosaccharolyticum # Pathway: not_defined # 1 574 1 570 570 524 48.0 1e-147 MKVDTTKSKNAESFYIRQSYVNSEGKSTSKTIRKLGTLNELLVEHGPTRDDVMRWAKKQA EIETEKYKAQKDVKTVLIPLRADKKIDYNMEKRFQGGYLFLQSIYYSLGLNRVCRKIRDK HQFEYDLNAILSDLIYTRVMEPSSKRSSYKAACGFLEKPGYELHDVYRALSVLAQESDLI QAELYKNSNFLTKRNDHILYYDCTNYYFEIEQEEGDKKYGKSKEHRPNPIIQTGLFTDGD GIPLAFSLFPGNQNEQTSLKPLEKKVIEDFGCDRFIFCSDAGLASENNRLFNHTGKRGFI VTQSIKKLAAEYREAALNRTGFRRLSDGKPADLEQLNEADSQELFYKEEPYSSKKLEQRL IITYSPKYAAYQKDLREKQVQRAMDMLKDGKHKKTRKNPNDPARFIDKEAVTKDGETADI LYYLDEAKIAEEARYDGLYAVCTDLFDDQPGDILKVSEGRWQIEACFRIMKTDFLARPVY VQREDRIKAHFLICFTALLVYRLLEKKLEKKYTCSEILETLRKFQFADVGGQGFMPLYKR TKLTDALHSTSGIETDFEFITKSKMKEIQKKSKLR >gi|229784109|gb|GG667626.1| GENE 19 21445 - 22608 1124 387 aa, chain + ## HITS:1 COG:CAC3673 KEGG:ns NR:ns ## COG: CAC3673 COG1940 # Protein_GI_number: 15896905 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Clostridium acetobutylicum # 13 345 6 340 385 118 24.0 2e-26 MLRIENSGKSQMKSRNRVRVFREILNNEGISRKQLEKRLGLSAPSVTRVVEELLREGLIC EEGTEQTHMGRRPVMLCVKKNAYYSIGMNLSRSRLYYCIKNLGGEPVHSGEIRLQGRKRG DQLLSLMESAVEDCLKTAGVEKRHLFAIGIASRGTVNEGKGSVIYTPDSGEEILVKEYMK ERYDCQILVENNVIADLKGQYLDLSGTNRNLVYLYISDGVGGSIICNGQVIDGENSMAGK FAHILVESGGRLCACGKRGHLESYVSKPAMEEAYFEASGKRLELPEICRLANAGETAAAA 
VLTDAIDKLAIGINQIFLVVNPGTLVLYGDLFECVDGIVEYLKQRTKELAFTEEIADNCW LVREKKNVRIEESIARLSVEQALEIIV >gi|229784109|gb|GG667626.1| GENE 20 22745 - 23821 1226 358 aa, chain + ## HITS:1 COG:PM1679 KEGG:ns NR:ns ## COG: PM1679 COG3181 # Protein_GI_number: 15603544 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 66 347 30 309 319 110 27.0 3e-24 MKRRKLMALVLAGAMAAVVGCSSQSGTVKQPETQKEAASQESTKENGNEGNTAGDSTAAQ AGSWKPDHDITIRVPNAAGGTMDTITRILGQGIQQSTGTTVMINNLTGASGAIAANDLLA KDADPCELMTSGIALFTLAPLFNQDIKVSLDDFTIVSGMVSEDFVLCVNPEKSGIHNWEE LKAYGEKERILFGSNTPGGTTHMLGTALFGEAGLNAEAVTSDGTNKDMLALTSGDVICAI GNASACQQFIEEGTAVPIAVFSAEAYDGFEGFTVPSVVDLGYDIQFKSCNFLIAKKGADQ AALDQIHDMILAYTETEECKKLAESAKYVPDIADGASVRAIVEESANMCREIYEKYYH >gi|229784109|gb|GG667626.1| GENE 21 23842 - 24285 464 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620594|ref|ZP_06113529.1| ## NR: gi|266620594|ref|ZP_06113529.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 147 1 147 147 228 100.0 1e-58 MKVKYRTNLCAGIISVIFGLAVLWLIPAQIGTEYSSGRGLTSRAVPGGAAMLFIVCGAAL VFQSLILKKDTVRELDLAKEGKALLYMAVFAIYVILFDYSFIVASIFLGTVTLLFTKSKR KLYYVIVMITVIVLYLLFTQVLHVRLK >gi|229784109|gb|GG667626.1| GENE 22 24296 - 24658 282 120 aa, chain + ## HITS:1 COG:PM1681 KEGG:ns NR:ns ## COG: PM1681 COG3333 # Protein_GI_number: 15603546 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 1 106 1 106 498 94 44.0 4e-20 MKEIIVQALPQLFTLENFVFVNLGIFIGIIFGSIPGLNGNLAITVLIPFTFKLGSIPALL MLTAIFFGSNFGGSITAILINTPGTNAAAATLLDGYPLSKKGKRERLWTWRWQLQPLAAW >gi|229784109|gb|GG667626.1| GENE 23 25712 - 25789 82 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTYETLSLMIAFGVLIVMITGTKK >gi|229784109|gb|GG667626.1| GENE 24 25913 - 26020 79 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MALAIKTYTTEIYLLSIKMLMIEKIMIDRRTGIDL >gi|229784109|gb|GG667626.1| GENE 25 26113 
- 26190 82 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTYETLSLMIAFGVLIVMITGTKK >gi|229784109|gb|GG667626.1| GENE 26 26551 - 26739 194 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870269|ref|ZP_06113531.2| ## NR: gi|288870269|ref|ZP_06113531.2| choline-sulfatase [Clostridium hathewayi DSM 13479] choline-sulfatase [Clostridium hathewayi DSM 13479] # 1 62 28 89 89 103 98.0 3e-21 MLARCNPHHCVLGHEEDPLEEVNLAHNGEFREKRLELEACLREWEEKTEDFENSRIVREE KV >gi|229784109|gb|GG667626.1| GENE 27 26846 - 28189 1520 447 aa, chain + ## HITS:1 COG:TP0901 KEGG:ns NR:ns ## COG: TP0901 COG0534 # Protein_GI_number: 15639886 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Treponema pallidum # 3 445 27 466 470 178 27.0 2e-44 MILAIAVPIMIQNGITNFVGLLDNIMVGMVGTEQMSGVAIVNQLMFVFNISIFGAISGAG IFSAQFYGSGNHEGVRHTFRFKLIFSVLIIVVAGFLFLVWGEDFISMYLHGEENAAGLAL ALDQGHKYMLVMILGMIPFGVEQVYTSTLRECGETVIPMKAGIVAVLVNLVLNYILIFGK FGAPVLGVVGAAVATVISRFIEAGIVIRWTHCHKKKNPFIIGAYRSLYIPGSLVRRIIIK GMPLMVNEVLWAGGMATLMQCYSVRGLAAVAGINISSTIGNVFNVVYLALGNSVAIIIGQ LLGAGKMEEARDTDTKLIAFSVCLCLGIGAVLALLAPLFPMIYNTTDEVKALATAFIRVI ALYMPMGAFLHASYFTLRSGGKTIVTFIFDSVFMWVVSIPCAYVLSRYTAVPIVPLYFTC QMIEIIKCVIGYVLVKKGIWLQNIVEA >gi|229784109|gb|GG667626.1| GENE 28 28284 - 28493 197 69 aa, chain + ## HITS:1 COG:no KEGG:Rumal_2298 NR:ns ## KEGG: Rumal_2298 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 9 66 21 78 78 62 51.0 8e-09 MARMKKRRGAVTLIGSGRQEVKVCDIVNGIPKVKRVEKGEYLEVRLTYSLFHREIKRYDK ETFEEIKRR >gi|229784109|gb|GG667626.1| GENE 29 28686 - 28829 140 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGALMCELNQAYLKAINLKAKKPKGNNTWTQLKPETNIPIPAPTISS >gi|229784109|gb|GG667626.1| GENE 30 28874 - 29419 555 181 aa, chain + ## HITS:1 COG:alr2018 KEGG:ns NR:ns ## COG: alr2018 COG1695 # Protein_GI_number: 17229510 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Nostoc sp. 
PCC 7120 # 2 129 38 165 216 75 34.0 4e-14 MVTIDYTILGMLSWKPMTGYDMKRMMQDSPVMYWSGNNNQIYKALVQLLDMGLVTNEVQY QESAPNKKVYSITDRGKAALEAWILETEPEIPEFKKAFLIQISWAGLMEKEKIDQLLARY EEQVREQISMRQEEKRRQKYFPNRSEQEHYVWDQIYDNLCQSYEAELQWTLRFREGLKAL K >gi|229784109|gb|GG667626.1| GENE 31 29636 - 30028 318 130 aa, chain + ## HITS:1 COG:no KEGG:ELI_2299 NR:ns ## KEGG: ELI_2299 # Name: not_defined # Def: hypothetical cytosolic protein # Organism: E.limosum # Pathway: not_defined # 1 118 1 118 122 147 64.0 2e-34 MQTEIIIMNQAKAAVIYSEECILKDVQSALDFMMSMKYETDCDRIALNKKAVAEEFFILS TGLAGEILQKFINYGIKFAVYGDFSRYTSKPLKDFIYESNHGTDVFFTSDREEAVKRLTG YGKPREARSI >gi|229784109|gb|GG667626.1| GENE 32 30100 - 30672 475 190 aa, chain - ## HITS:1 COG:BH0719 KEGG:ns NR:ns ## COG: BH0719 COG1309 # Protein_GI_number: 15613282 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 1 177 1 176 188 71 26.0 8e-13 MNTKNNQRYRETEIRMDAAMLELMKHTEFEKITVKKICETAGVNRSTFYAHFTDIYDMLD RMEDFLHQELLESYPFHSNPEDLPFSDPLILLFLQHIRKHQYFYRIALQTRRAFPLKQGF EPLWNLIIKPRCQAAGITSEDEMMYYFVYFQAGFTMVLKRWVDTGCKESESRIAQTLKNC IPAVWQRLKP >gi|229784109|gb|GG667626.1| GENE 33 30809 - 31819 818 336 aa, chain + ## HITS:1 COG:YPO0060 KEGG:ns NR:ns ## COG: YPO0060 COG1063 # Protein_GI_number: 16120413 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Yersinia pestis # 8 317 9 322 341 117 27.0 4e-26 MKAAIYMGKEDIEIREIPTPECGDDDVLIKNIYSSICGTDVAVYLHGPNTGHRITTGGEF GHETVSRVAAIGKNVTDFTVGERVYPYPRCAKNDTKRAGTIGGFSEYILVPQAKRNHSLY PVPQEIPDRMACLIEPFTVGCRAARRGQPQAGESAVVFGAGTIGLAAAVSLKYFGMDRVM VCDRSDFRLAAAKRLGFAVCNNDRDDFAAAAVEYFGTALSLTGNTTDIDCFLDAAGAGSI LELFMSLGKIGSRFVSVAVNNALRSLDLLHLTFAQKSIIGSGGYMPEDVRDVMNIMESGR WDLESMITHEFPLDQIETAIKTAADTEHALNVTIKL >gi|229784109|gb|GG667626.1| GENE 34 31908 - 32078 242 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620604|ref|ZP_06113539.1| ## NR: 
gi|266620604|ref|ZP_06113539.1| putative MFS metabolite transporter [Clostridium hathewayi DSM 13479] putative MFS metabolite transporter [Clostridium hathewayi DSM 13479] # 1 56 1 56 56 99 100.0 9e-20 MHILYQWIGDGSQRIRENYSASESRSLHMTGAAGALCMLLGLGKIIFGALSLSVFT >gi|229784109|gb|GG667626.1| GENE 35 33034 - 33480 366 148 aa, chain + ## HITS:1 COG:no KEGG:PFREUD_17740 NR:ns ## KEGG: PFREUD_17740 # Name: not_defined # Def: hypothetical protein # Organism: P.freudenreichii # Pathway: not_defined # 13 147 96 232 259 90 37.0 1e-17 MHSKERKKQYCYYRLAGAVMVAASLLYVVYSLWTIRHPKTVNYGKITAITIAAITFTEIG VNVWGVIKYRKYHSPLLHALKMISLGTSLISLVLTQSAILAFADEAQDPSVNGFLGTVTG ICAALLGVYMIRRINRIEAEDLHREDSY >gi|229784109|gb|GG667626.1| GENE 36 33516 - 34187 791 223 aa, chain + ## HITS:1 COG:lin2728 KEGG:ns NR:ns ## COG: lin2728 COG0745 # Protein_GI_number: 16801789 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 1 219 1 220 225 151 35.0 9e-37 MIHILVADDDEKLTQSICSYLNNCGYEAKGVLSAEYAYDEMYSRLYDLIISDIMMPGTDG FAFAETIRGLNKRIPILFVSARDDLSFKQRGFDLGIDDYMVKPLDLSELLMRVRALLRRA NIEASRRLVVGSMVLDADAMEITQGEQEIPVTTREFNIMFKLLSYPNHTFSRSQLMDEFW GVDSDTSLRAVDVYITKLRDKFSACSEFEIRTIRGLGYKAVLL >gi|229784109|gb|GG667626.1| GENE 37 34184 - 35281 875 365 aa, chain + ## HITS:1 COG:lin2727 KEGG:ns NR:ns ## COG: lin2727 COG0642 # Protein_GI_number: 16801788 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 54 353 161 458 459 168 33.0 2e-41 MTGNHSTEDGRVTRRYFFFPAYVALVFLLAVLMMLPTIFYGGAHLILEESGPYIKWYFLY CIVLAALVSGGLALRKYLNFDRPIKKLSAAARQVAEGDFSVYIPKRHTSNDMDYIDALFV DFNKMVEELGSIETLKNDFAANVSHELKTPLAAIQNYAQLLGTTELTPKQQEYTERILSS TQRLTSLIYNILKLNKLESQKIHPKPEPYDLCRQLSDCAIGFEAIWEQKKINFEADMEDR AVIEADEELMELVWNNLLSNAFKFTKSGGTVSLRQFSDEERIVTEVGDTGCGMTPEMMKH MFDKFYQGDTSHSTEGNGLGLALVLRVLEMSGGTITVNSTEGKGTVFTVSLPRAALAQKG ESQTS 
>gi|229784109|gb|GG667626.1| GENE 38 35278 - 35652 311 124 aa, chain + ## HITS:1 COG:no KEGG:Tthe_0858 NR:ns ## KEGG: Tthe_0858 # Name: not_defined # Def: hypothetical protein # Organism: T.thermosaccharolyticum # Pathway: not_defined # 11 122 9 116 202 99 48.0 4e-20 MTRIRKDGAEYVGYEYKEKSVPALYASLYLDSYPCFGWEEDPNGMSRKANTETASRAVLR FRRDRKLCNRAELTRLQRNFDSCVAEIDALEQSKRSSAVIGALVTGIMGTAFMAGSVFAV TAEP >gi|229784109|gb|GG667626.1| GENE 39 36807 - 37379 332 190 aa, chain + ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: lin2818 COG4905 # Protein_GI_number: 16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 20 187 6 165 270 89 32.0 5e-18 MIIHGRKTGPVSKSGDRLTLMLLFYIAAVCGWLWEVLVYWGTSGFGGSLTELLVFYRGVL HGPWAPIYGTGGILLVLLYRAVQERKGYFLIAAMAVCTVVEYGTSWILEVFFHARWWDYS GQFLNLHGRICFVSIVGFSLIGLWAVQAVVPVCTGWIQKLTYETRKWLCLLVSLLFFLDV IVSVYKLSIS >gi|229784109|gb|GG667626.1| GENE 40 37485 - 37763 300 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620611|ref|ZP_06113546.1| ## NR: gi|266620611|ref|ZP_06113546.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 92 1 92 92 135 100.0 9e-31 MAEYRLKPGKAGGKVISAYEKMEKKFTDAFLEEDAVSDSGYRLKTGKTGESVKGAYKKIE DGVVGAYKKMEDAFVDAFLERCDEEERGEVDK >gi|229784109|gb|GG667626.1| GENE 41 37760 - 37876 197 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNKIGVGMVITAGAVVMAVVAAAAVVRRKGSTTLHKFL >gi|229784109|gb|GG667626.1| GENE 42 37876 - 38370 260 164 aa, chain + ## HITS:1 COG:no KEGG:SpiBuddy_2065 NR:ns ## KEGG: SpiBuddy_2065 # Name: not_defined # Def: hypothetical protein # Organism: Spirochaeta_Buddy # Pathway: not_defined # 4 136 8 150 189 72 31.0 4e-12 MKQICILLTKYSDWISSLVYYIGGQGYTHSSIALGDQPTQFYSFNYRGFAVETTEKHRRR GVRYSRLYRLNVSDTVYERIEERIRHFLENREKYRYTRLGVFCCVLRIPLYWKNHYFCSQ FVAELLSESNALRLSRKPCFYLPNHFIRLLETQPCLAGMEQDFI >gi|229784109|gb|GG667626.1| GENE 43 38420 - 38791 503 123 aa, chain + ## HITS:1 COG:no KEGG:LEUM_0832 NR:ns ## 
KEGG: LEUM_0832 # Name: not_defined # Def: hypothetical protein # Organism: L.mesenteroides # Pathway: not_defined # 1 123 1 123 123 62 38.0 5e-09 MKKSNFTALILGTVGIVFFALGMCMALVEEWGLFRKGIIAGVAGLIILLITVIVWRRMEG KSPVRLNAKTDGSVLAGVLGALLLGVGMCLCMVFDKMIPGVIIGLVGILALLMLIPMTKG IHD >gi|229784109|gb|GG667626.1| GENE 44 39040 - 39228 215 62 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1114 NR:ns ## KEGG: MGAS2096_Spy1114 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 2 62 1 61 61 98 81.0 9e-20 MMNAMWLLCPICGNKTRMKVRDDTELKNFPLYCPKCRQESLIDLKQLQITIIKEPDAKTQ SR >gi|229784109|gb|GG667626.1| GENE 45 39268 - 40248 962 326 aa, chain - ## HITS:1 COG:lin1811 KEGG:ns NR:ns ## COG: lin1811 COG1680 # Protein_GI_number: 16800879 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Listeria innocua # 4 319 10 318 323 243 40.0 4e-64 MNQETIAALEEIIHSRYSNTAGMIVLKNGETQYEGYFNGCTADSCINIFSVTKSIISILI GIAVDRGEIQSVNQKVLDFFPDFPVENYNEGETTVRDITLRDLMTMTAPFKHQEEPYLEY FTRDDWVSTALELLGGPEKAGTFRYTPFVGPDILSGILVKATGQSVLEFAAEHLFTPLGI TVKGSITLENEEAHYAFMNARDVSGWVMDSKGIHAAGWGLTLTARDMAKIGQLYLDGGKW DGKQIVSAQWIAESTREHSRWEELNLPYGYLWWIADENAFAAMGDSGNTIYVNREKKLVI AIASLFKENTGDRMDFIKECVEPMFA >gi|229784109|gb|GG667626.1| GENE 46 41134 - 41790 372 218 aa, chain + ## HITS:1 COG:no KEGG:Apre_0340 NR:ns ## KEGG: Apre_0340 # Name: not_defined # Def: helix-turn-helix domain-containing protein # Organism: A.prevotii # Pathway: not_defined # 6 200 4 197 259 182 48.0 1e-44 MAQKIVQGNEALAGKIKFRRNELGLTIEEAASRAGVGTKTWSRYEAGESIRRDKCKGICK ALNWKGLPDQDGEEDSRISVQEYRNHEAWSRFLENEYGAGAAISFATGSDILLDQIEEDM TALACMPAGSHIGQLDVSWLRESLPDQFLMQYDYLFLYQMKCTLRKMRLRAKNGLAMTAH SVMEELLLYLCYEEASVLVELSGGISGLEEDDGDFEAS >gi|229784109|gb|GG667626.1| GENE 47 42862 - 43512 131 216 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620620|ref|ZP_06113555.1| ## NR: gi|266620620|ref|ZP_06113555.1| hypothetical protein CLOSTHATH_01715 [Clostridium 
hathewayi DSM 13479] hypothetical protein CLOSTHATH_01715 [Clostridium hathewayi DSM 13479] # 1 216 1 216 216 441 100.0 1e-122 MDKVAENGRVYMFETMEWKDVYKLWKTGCCYFPQMLSSFVSAETIFVEYKQEGYFYGYDW LRHDEVAKRKELIEKNPISFIYFTKPTNKGTIQTAEMLTEAQNEEELAAVWIAATTKELS ECRGGSGILRHSDILYMAACKFLQDRYYLWHHAMKRLVPEIMVPDSVIKSVVCNDAEPVI GLIQMNTMLVKSTWSILRYSSLKEGNLPEPYYRRSI >gi|229784109|gb|GG667626.1| GENE 48 43796 - 43873 85 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTYETLSLMIAFGVLVVMIIGTKK >gi|229784109|gb|GG667626.1| GENE 49 44045 - 44536 512 163 aa, chain + ## HITS:1 COG:BH2238 KEGG:ns NR:ns ## COG: BH2238 COG0394 # Protein_GI_number: 15614801 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Bacillus halodurans # 9 159 2 152 160 132 44.0 3e-31 MTHQHQNNIKILFVCHGNICRSPMAEFIMKDMVEKRRQADRFYIASAATSREEIGNPVHH GTRNKLRQFGISTEGKRAVQMTKADYDKYDYLIGMEQWNITNMMRILGSDPERKVSRLLD FGPNPRDIDDPWYTGDFDSTYRDICEGCEALLDYIDRCEPLNR >gi|229784109|gb|GG667626.1| GENE 50 44705 - 46309 1507 534 aa, chain + ## HITS:1 COG:MT3641 KEGG:ns NR:ns ## COG: MT3641 COG1053 # Protein_GI_number: 15843149 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Mycobacterium tuberculosis CDC1551 # 217 526 253 556 566 101 29.0 3e-21 MNQYKGEAWLGEEPQIREESITEAVSADIIVVGGGLAGVAAVRRAAEMGMKVLLFEKCRC VQARSGDFAVMDSRVAKRWGRDHLNKTEIVNNLMHDMAYKASQNILKRWSEEAGKAFDWY LEGLEDIAILDRTDQVPPKGAKCYIQPRRLPLPEGFDNTEENFKCYQVTAWIRPSHIALC RGNFELAEKTGNVTSFFNTRVKKLLKNSEGRVEGVIAETREGSLIKAYASRGVILATGDY MNDEEMLKRFLPGMLETPKQWTSYDLDRQPSNTGDGHKMALWAGAKLQDSPHAPCAHHMG SVFGVADFLLLNTRGKRFVNEDAPGQQLGSQIENLPDKTAWQIVDGNWRSYIPKEYPNHG SVCFVMEDEELEGGAVYDKLCFIDNYIAPKYVRKAVEDGVLLEAETLEELIVKTGLPADT ALASVEQYNRLCEQGEDTDYGKRKDRLFAVKQGPFYAAKFTPAVMIAAMGGIQSDEDAHC YDTEGAVIPGLYAAGNVQGNRVAVDYPLTVPGLSHSLALVYGRIAAESAVGESR >gi|229784109|gb|GG667626.1| GENE 51 46446 - 47798 1533 450 aa, chain + ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # 
Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 84 444 49 412 422 92 23.0 1e-18 MKKTGKAVALMTALAVTGALLGGCGADSKTEVGQTGAAKTEAGSGTEDSAKTEAAGGDAA EGTVKEDIRVVTYFAGSDAYAPVWKEVCADYMKDHPGITIVDESQPTSGSNDLFKTKVQA DLAAGNPADLVLYYTGEAYTKTFEDTGLFVTFEDILKADPEWAGNFKESPLENVQYQGRQ YALPFIGYYEGMFYNKALFEQYGLEEPTTWENIMKANEVFSQNDIVTLSMSLGMPYITTE NFIMGAAGKENHRNYFDESWGIAIDCIAELYQKGGLPKDTFTISEDDVRLLFSEGKAAMM INGSWCTESLQSNPDMRIISMPTLPGGTGGENCAIAGFSSGWHMTKAASERSGETLKFLK YMTSPETMARFIAYGGSPAIVCDAPENSSELLKSAVTMLEKAQYQDAALDSQVNHEAYET LNNGIQYVCEGQMTSAEVLKQAKELNESGQ >gi|229784109|gb|GG667626.1| GENE 52 47916 - 48782 1065 288 aa, chain + ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 1 288 1 292 292 161 35.0 1e-39 MALKGKKYVWICITPAMILILAFLIYPLIKTVIYSFSSWYNFALTSEWIGLDNYKRIFSD MVMPVAFKNTAILIAGVLIFQVGFALLLAIMVDSAHHGFKLFRTVYFFPIVISGTAIGLM FTLIYKYEYGLLNYFVTLLGYDKQVWITPRTAIYLALIPVLWQYIGFYFVIFLTGIAKIP EDIYESAMLDGITPIKKIFYITIPMLKDVLVSTVVLVVSGCFRVFDTIYVITKGGPMNSS QLLSTYMYQTAFEKYNGGYASAIAVVMILIGVLLTIGLRKLLKADGEE >gi|229784109|gb|GG667626.1| GENE 53 48794 - 49633 891 279 aa, chain + ## HITS:1 COG:BH1119 KEGG:ns NR:ns ## COG: BH1119 COG0395 # Protein_GI_number: 15613682 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 1 278 1 281 281 144 31.0 2e-34 MGKKKVKISSILLYAVLIFWLLITVIPILWVFENSFKPSEEILLNGVALPKNLDFSNYRS IFSYPDVNMFRSFFNSFLISGGVVAGVVCLGGLAAFGLGRFDSGIGKVVDAGMVACLLVP AFATMIPNFVTVSRLPIRETYLAVIVPQVAGNLCFAIMLLSSYMRSLPNELDEAAIIDGA GPFQVLRHITVPLSKPMFATVGIMVFIWSYNDLITSLVYISDQKLKPVCVILTMVSNMHG TDYGAMMAAIFITVVPVLILYMMSQEMVIKGLTAGAVKG >gi|229784109|gb|GG667626.1| GENE 54 49699 - 51438 1510 579 aa, 
chain + ## HITS:1 COG:no KEGG:Acid345_0426 NR:ns ## KEGG: Acid345_0426 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: A.bacterium # Pathway: not_defined # 1 571 24 605 613 372 37.0 1e-101 MRVFANHIGFDKTDDKTAVLELEKEAAELTAWLEADDGERYDVSVSGPERIPDWSEGCYY VLDFSDVRREGTYRIAGNADGKAFRSEIIPITGYFLNLRMVNASRAYLKGERSSGEWLKA DRNVPFKGVGEGTLDLHGGWYDATGDFGIHLTQLSHTAVFNPMQSLLPSYVCFDLVGRME REELQDCRILKKELLDEAYWGADYAMRLHREGGGFIRSVDRGEAFSEEQHRTIDYEYFRF SGHEEDRSYLAETIEEKHYETCMRSGSGLAVAVLAQASRYRGASGEYSCGRYLAAAEEAF QYMTEHNEQYSSDGTWNFLDWYCALAAAEELYEATKSQDYLEACRRFFGELERSAIAMED GIWFLSRQGELYFHPSDEGFPLITIMKYGSLEPDEMRRTRAMELVAQAFVHVLSISRKPF GYAAFTQKTGEEERNRYFFPHDTKAAPWWQGENARIASLAAAALRFSCQKEAAEMAGDLR KFAGHQLDWIMGCNPFDSCMIDGFGRNNIQYFFRNQYDFMNSPGGICNGITSREADDKGL EFMMAPTEECDDNWRWAEQWLPHACWYLNAMGWKLKRVK >gi|229784109|gb|GG667626.1| GENE 55 51440 - 53380 2181 646 aa, chain + ## HITS:1 COG:CAC3611 KEGG:ns NR:ns ## COG: CAC3611 COG2909 # Protein_GI_number: 15896845 # Func_class: K Transcription # Function: ATP-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 5 304 21 329 862 94 21.0 5e-19 MASKYIERKRILVKMENIERYPLSIVHAPMGYGKTVAVREYLKTTEADWVWVSLAGSEGE ADYLWARLCDALNGVNEELAARFSAIGFPYDCFKTDALLDVLMDYDYKNHLILVFDDFHT IEENNVFELMKAIVRERLTGLHIVLITRELAKLDAADLYQKQLCFTITEKSLKFNREEVF RYLDFMECGMSEEERERVFQITDGWESMLYITAKGIKQGLPVGKSATADDIIEQNFYHGL SQAEKEVLWRLSVLSSFTKRMAAAVLDDSELYDAFETLISRSVFFVFHESGHTYRLRNIL KDYIYERALIDRVDFTEVYRKSGRWLLADGQYSKAFECLYRAGELEEILSILNKDTQNDK GIWGSGNIRQVFEKAGKDLCYRYPFAYLKYLLACAMDGERNYIPRVKLCLDRLEEYCRGS EYEGEMRELILGEISMVRSRISFNDIAVMAEYAGKSAEFFGGGCSCIVTRETEFTYGLPC MLMGFVKEPGSMAEVRDAFAAEGMSFAEATDGCGIGCDSLAASEAALETGAWEQAELHAY KAYFKAKAAEQLSIMISSKLVLDRLNLIQGSSRDRILSNSTLKDEVIAANNPVLNTTFDL CGAYLNGCLGRLDSIPEWIRKGEIGKGKFFFRGPVFSMWFMENASC >gi|229784109|gb|GG667626.1| GENE 56 53290 - 53904 641 204 aa, chain + ## HITS:1 COG:PA1759 KEGG:ns NR:ns ## COG: PA1759 COG2909 # Protein_GI_number: 15596956 # Func_class: K Transcription # Function: ATP-dependent 
transcriptional regulator # Organism: Pseudomonas aeruginosa # 48 204 742 901 901 73 29.0 2e-13 MDQKRGDRKGKVFFQGTGFFYVVYGKCLLLGKKYIELDSLCESFATFLKPFQYHLVRIYF LIYESVAKSYLESRPRGEEILKEALALAEKDHLIMPFAENTGHVRLMLEALLKREESPYI REVLEACRRYSSQLKLFHACEVSLTDRELEILKLLDEGYTHEEIAGEIYISIATVRYHVK NIYQKLDVNNKILALRKAKDLNFI >gi|229784109|gb|GG667626.1| GENE 57 54084 - 55160 1128 358 aa, chain + ## HITS:1 COG:STM0573 KEGG:ns NR:ns ## COG: STM0573 COG0449 # Protein_GI_number: 16763950 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Salmonella typhimurium LT2 # 42 341 32 332 347 102 29.0 1e-21 MESNAHIVEYWKTQPEVLEACLKNSLPLTENFVKLYQEVRPTHLYLVGSGTSLNAEETAV SFMKEMLDIDVISMPSSYVHDIRGDRPMFVFLSQGGSSTNTLKAMEETAAYPFITVTGEE TCEIASRSSCHMVIGCGEEPVGPKTVGYTASVMILYLMAMEAGRAVGAIGEEKYAYVRTV LETGIAHMRYNMEAIAGWTADHKEKLQSIEKYIFIGHGTGAAALKEGSLKVLETIKYPAF SYEFEEYLHGPILAVDACTGAFLFLSGKPEERERFRQLAECQSKYSQYTFLITDEEGETD GNALRIRKTGAMYTEVFEIVVIPQYLAAVVQGYLGIADGSPLYDEYTALCPTKFRNGK >gi|229784109|gb|GG667626.1| GENE 58 55170 - 56120 1113 316 aa, chain + ## HITS:1 COG:CAC0183 KEGG:ns NR:ns ## COG: CAC0183 COG2971 # Protein_GI_number: 15893476 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted N-acetylglucosamine kinase # Organism: Clostridium acetobutylicum # 1 303 1 301 306 143 33.0 5e-34 MDYVLGIDSGGTKFLVKAVGMDGRELAEYEGHPAGLYRFSRNEAVRTIQDNLEMCIGQAG LTKADCRAIVCGTTGIDSEEDRQEVEEIYRLLPGFSCPALCVNDAEVALYAVTGGTGVVV ISGTGSIAFGRNSSGRTGRCGGWPPCIFGDEGSGAWLNLKALEYMSHVLDGRKERTLLYD LLNQELGIGGAKDLIRICQRIERENAGFLKLGPVVMKALREGDENAVEITRLEAELTFEL ADSVVQKLEIWQEPEFTVGAWGSAIVKNPYHFELFKEQFLAKYPNVTVTVAGRDAAYGAC RIALDVLGAKTGIYRF >gi|229784109|gb|GG667626.1| GENE 59 56278 - 57000 622 240 aa, chain + ## HITS:1 COG:CAC3502 KEGG:ns NR:ns ## COG: CAC3502 COG2188 # Protein_GI_number: 15896739 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 
238 3 235 237 140 32.0 2e-33 MNDSKVTPLYQQVKEDIKNAIEQGRYKAKEKIPSEPELSAEYSVSRITLRRAVEELCNEG YLIKRQGQGTFVSTPRIHRKMAGGNRMESFTNTCRNYGMKAGARLLSRQIVPARQEEQEF FGLGPDDLLVYVERLRTADGLPIFLENQFFPYQKFKGLMEENLNDVSLFEVIERISGRKP VETTRRTLEITRASVEQAQKLGIPLGEPLMFLNSYFIDQEGKSLCIGRQYYIGSRYMFDL >gi|229784109|gb|GG667626.1| GENE 60 57344 - 57742 265 132 aa, chain + ## HITS:1 COG:no KEGG:ELI_2236 NR:ns ## KEGG: ELI_2236 # Name: not_defined # Def: signal transduction histidine kinase # Organism: E.limosum # Pathway: not_defined # 16 132 37 152 407 67 33.0 3e-10 MVFIPSLYMGGNGKDKTGRVFIGILLCNALLLISDASAWLFKGHGSLLCYWGVRIANFLV YILGYLLMALFTEYLTGYLSKRIQVSRRAARFVWGICMIGIASTILSQWNHMYYGFVGCN VYSRGDWFWLSH >gi|229784109|gb|GG667626.1| GENE 61 58689 - 59504 755 271 aa, chain + ## HITS:1 COG:BS_ywpD_2 KEGG:ns NR:ns ## COG: BS_ywpD_2 COG2972 # Protein_GI_number: 16080688 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 61 262 1 198 218 127 38.0 2e-29 MDIGLLFVYRQGLGKNEVWVLLSYIILPLLAMTVQIFIYGIAWLYLATTISIIFVYETLQ ANQAQKQRWKEMELEQSRTAIILSQVQPHFLYNVLLGIKQLCDSNPKKASEALEHLAFYM RRNLNSLTRKQLIPFDEEMCHVNDYLYLEKMRFEEKLTVVLDLEYTDFFLPAMTVQPIAE NAVRWGITKKKGGGTLTIKSQLTGEEVMISVMDDGAGFDPNEIRNDGKTHVGIENVRQRL MLQCGGTLSISSKKGSGTTVTIRLPQKGRNK >gi|229784109|gb|GG667626.1| GENE 62 59501 - 60310 734 269 aa, chain + ## HITS:1 COG:CAC3390 KEGG:ns NR:ns ## COG: CAC3390 COG3947 # Protein_GI_number: 15896631 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver and SARP domains # Organism: Clostridium acetobutylicum # 1 230 1 235 370 89 26.0 9e-18 MNIIVADDERLSVEYMLPLLRRLEPEAEITGFTEAEDVFEYMERNRVDIAFLDIEMGEYS GIELAKRCIALAPCVNIIFVTGHSEYTLDAFQLHVSGYLLKPVRSEDLQAELEHLRHVIP RHHIRVQTFGFFEVFVDSVPLKFTRTKCKECLAYLIDRRGARVTYADLAVILWEARTYDP SVQANTQKVVSDLIKTLREANLEELVIKSRQDIAINRSLVDCDYYAVLDGDTSRLDAFMG EYMSKYSWAEYTVGELVKLKEKVLGADRG >gi|229784109|gb|GG667626.1| GENE 63 60340 - 61792 1316 
484 aa, chain + ## HITS:1 COG:CAC3352 KEGG:ns NR:ns ## COG: CAC3352 COG0840 # Protein_GI_number: 15896595 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 4 386 11 399 703 99 22.0 1e-20 MKSIQTKFIILILSCVFVCSAVIGGAGIISAERVVDEDSAQMMNYRCSQLACEVDAMLSR IEQSVKTLAVYTDENLESVELLKSDGTRKAFTEQIESVAVNAANNTEGAVAVYVRYNPDF TPPTSGVFWSKTNLNGTFQKLVPTDFSRYSPTDVEHVGWYYIPVKNGRAIWLSPYTNENI NIQMISYVIPIYKNNETVGVVGMDIDFSVIKDMINSLRIYESGNAFLTDDKGNVMYHNVY PFGVPMGSVDKSLIPLVAELENGTSGSSLFSYVNENVERKLAFRTLRNGMRLAVTAPLSE IDKNKNMLLMQIVAALLVIAPLSVLVTVLITRRMIRPLKELNEAAKQIAKGDLSISLTQQ TKDEVGTLADSFQQTVNHLQKYINYINSLAYRDALTGVKNKTAYQEAERRLEEQMRNGRP EFAVIVLDINNLKKINDHYGHDFGDMFIVDACRLICKSFPHSPVYRIGGDEFQSLFRLIG GVGV Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:11:43 2011 Seq name: gi|229784108|gb|GG667627.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld20, whole genome shotgun sequence Length of sequence - 66828 bp Number of predicted genes - 46, with homology - 41 Number of transcription units - 23, operones - 13 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 30 - 89 4.7 1 1 Op 1 . + CDS 114 - 989 376 ## Closa_0110 hypothetical protein 2 1 Op 2 . + CDS 1005 - 9896 7708 ## COG3210 Large exoproteins involved in heme utilization or adhesion + Term 9929 - 9982 20.6 - Term 9916 - 9970 22.0 3 2 Tu 1 . - CDS 9997 - 10077 68 ## - Prom 10130 - 10189 80.4 4 3 Tu 1 . - CDS 11033 - 11179 168 ## Closa_1010 hypothetical protein - Prom 11208 - 11267 5.9 5 4 Tu 1 . + CDS 11364 - 11690 549 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain + Term 11774 - 11837 3.2 6 5 Tu 1 . - CDS 11789 - 13198 1302 ## Closa_1011 GerA spore germination protein - TRNA 14548 - 14620 76.2 # Val GAC 0 0 + Prom 14672 - 14731 6.4 7 6 Tu 1 . + CDS 14770 - 15642 796 ## COG1284 Uncharacterized conserved protein - Term 15430 - 15482 1.1 8 7 Op 1 . 
- CDS 15657 - 17051 1248 ## COG0534 Na+-driven multidrug efflux pump - Prom 17076 - 17135 2.8 9 7 Op 2 . - CDS 17139 - 17546 377 ## Closa_1014 hypothetical protein - Prom 17754 - 17813 3.7 + Prom 17595 - 17654 6.6 10 8 Op 1 8/0.000 + CDS 17698 - 21177 3712 ## COG3857 ATP-dependent nuclease, subunit B 11 8 Op 2 . + CDS 21167 - 24901 4097 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) + Term 24913 - 24961 6.2 12 9 Tu 1 . - CDS 24991 - 26460 345 ## COG1672 Predicted ATPase (AAA+ superfamily) - Prom 26497 - 26556 9.2 + Prom 26549 - 26608 7.9 13 10 Op 1 . + CDS 26675 - 27517 773 ## gi|266620648|ref|ZP_06113583.1| hypothetical protein CLOSTHATH_01748 14 10 Op 2 . + CDS 27545 - 28591 1218 ## COG0709 Selenophosphate synthase 15 10 Op 3 . + CDS 28596 - 29729 956 ## COG0520 Selenocysteine lyase 16 10 Op 4 . + CDS 29759 - 30664 964 ## COG0583 Transcriptional regulator + Prom 30673 - 30732 4.5 17 11 Op 1 . + CDS 30771 - 31862 955 ## Ccur_00560 hypothetical protein 18 11 Op 2 . + CDS 31890 - 32105 276 ## gi|266620653|ref|ZP_06113588.1| sulfurtransferase TusA-like protein 19 11 Op 3 . + CDS 32106 - 32342 307 ## Closa_1020 hypothetical protein + Term 32358 - 32391 1.1 20 12 Op 1 2/0.000 + CDS 32821 - 34101 1298 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases + Prom 34131 - 34190 3.6 21 12 Op 2 . + CDS 34248 - 35642 1627 ## COG2252 Permeases + Prom 35735 - 35794 3.2 22 13 Tu 1 1/0.250 + CDS 35854 - 37980 2585 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 23 14 Op 1 12/0.000 + CDS 38903 - 39076 203 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 24 14 Op 2 15/0.000 + CDS 39095 - 39982 1159 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 25 14 Op 3 . 
+ CDS 39975 - 40466 731 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs + Term 40517 - 40568 19.5 - Term 40498 - 40562 24.0 26 15 Op 1 . - CDS 40566 - 41954 965 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins - Prom 41985 - 42044 1.9 27 15 Op 2 . - CDS 42049 - 43044 736 ## COG1609 Transcriptional regulators - Prom 43210 - 43269 7.9 + Prom 43267 - 43326 6.6 28 16 Op 1 19/0.000 + CDS 43375 - 44604 1487 ## COG2182 Maltose-binding periplasmic proteins/domains 29 16 Op 2 20/0.000 + CDS 44618 - 45994 1492 ## COG1175 ABC-type sugar transport systems, permease components 30 16 Op 3 . + CDS 45996 - 46862 919 ## COG3833 ABC-type maltose transport systems, permease component 31 16 Op 4 . + CDS 46866 - 48911 2050 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases 32 16 Op 5 . + CDS 48803 - 50857 1384 ## COG3408 Glycogen debranching enzyme 33 16 Op 6 . + CDS 50875 - 51657 853 ## COG0058 Glucan phosphorylase + Prom 52504 - 52563 80.4 34 17 Op 1 1/0.250 + CDS 52663 - 54084 1705 ## COG0058 Glucan phosphorylase 35 17 Op 2 7/0.000 + CDS 54147 - 54398 308 ## COG0366 Glycosidases 36 17 Op 3 . + CDS 55341 - 56666 1177 ## COG0366 Glycosidases + Term 56795 - 56828 2.5 + Prom 57173 - 57232 6.1 37 18 Op 1 . + CDS 57314 - 58294 1055 ## COG0673 Predicted dehydrogenases and related proteins 38 18 Op 2 . + CDS 58308 - 58847 559 ## COG0350 Methylated DNA-protein cysteine methyltransferase - Term 58634 - 58673 4.4 39 19 Tu 1 . - CDS 58812 - 59060 76 ## - Prom 59083 - 59142 80.4 + Prom 60277 - 60336 5.9 40 20 Op 1 . + CDS 60519 - 60632 73 ## 41 20 Op 2 . + CDS 60649 - 60822 68 ## gi|266620676|ref|ZP_06113611.1| cytochrome c heme lyase 42 21 Tu 1 . + CDS 60930 - 63512 1702 ## COG3291 FOG: PKD repeat + Prom 64359 - 64418 80.4 43 22 Op 1 . + CDS 64478 - 65161 581 ## COG5263 FOG: Glucan-binding domain (YG repeat) 44 22 Op 2 . + CDS 65353 - 65454 71 ## + Term 65503 - 65541 9.3 45 22 Op 3 . 
+ CDS 65673 - 65807 152 ## + Term 65818 - 65874 21.2 - Term 65803 - 65865 24.2 46 23 Tu 1 . - CDS 65875 - 66723 841 ## COG2233 Xanthine/uracil permeases - Prom 66758 - 66817 2.0 Predicted protein(s) >gi|229784108|gb|GG667627.1| GENE 1 114 - 989 376 291 aa, chain + ## HITS:1 COG:no KEGG:Closa_0110 NR:ns ## KEGG: Closa_0110 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 281 1 289 294 185 34.0 2e-45 MSIAKMSMIHNSYRYFTISVDKYQDRCMKGVIYHAGKTPGIRFDNFLEMVIHMNRIFDSM ACPKQAMELRQFSNVEYPEPAIRECNRYQNGRLATFQVYVKFRYNASWQGDITWLEGEQA ESFESLLQMLQLIDRIFTGQSEMEKDIKITRVCQIAVNSSEKGLLIGRVQNAVINRLEKF KGTIGLADAMVRLFEAGEGDHDSGKIISEETWDSYRLGGKEATFLIKILFREHSTWQGII YWRETGKRQAFRSFLEMVILMASALESGKKRSECEDRSITTNNRKKVLMEG >gi|229784108|gb|GG667627.1| GENE 2 1005 - 9896 7708 2963 aa, chain + ## HITS:1 COG:PA4625 KEGG:ns NR:ns ## COG: PA4625 COG3210 # Protein_GI_number: 15599821 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Pseudomonas aeruginosa # 1049 2202 819 1958 2154 80 22.0 5e-14 MKGLRKKFRRGLAAFLAFVLTMTSVNTVSLADVTNALETEKATFVMSGEDIVESAQAAID SGNTFTYEDLGIEETGSTAKEYQKLFTNGTVYEIAPSYDLDEEENADGANLRMFIRVKGE TEGYQITGEEEIIFLYINDSESKIAFRSRIDDYVTDKVTVKGNSSFAETEKPALDENGNA LEGDKPAQNQEEATAPEEENGSVEETKEPETEAPTQVPETEESIQEPETEAPVQEPETTA PAKEPETEAPTQEEIEKPTQAPVQEPETQAPETEAVTQAPEVQEPETEAPAQEPETEAPA PEAETAAPDTANETASIIRNKTFTLTTAMEAAPGESDVSIVDEEIPKADSPKLDAEGNNS SESPAAVSTGSNADKGDHTDSGENKPSEKIEGKTYGHVLLNEESYAKAYVTTLNKMGVSV EVTEGLALTIHHNAFYNDKDISEDEVVRGLKEGDVINPADYAWNKDFLTYAGSSQDSIVI SSQTENRVDLYYNAIAEEKSEDEVTITGAIPRSYFRSVKADDNTTPGKVFPTKTAEWVDE ANGIGKINFTIYGNPIRRGSDVILVIDSSGSMEGEKWSTAKTAAKGFIDNLYQNKDGVVS DDRIAIVDFDSSAKAYPGTNSGSETFLKVDDKITIKNKTYSAKDYLKSYVLDSQMKDTGG TDYNKALQTAQSVINNRRDSSRPAYIVFMSDGEPNGYWDWLTYRYYDGQKYATELKSDGV TIYSLGLNIGSTNFNKFIVPLASDPTSTYAKNIVKTSDLVGIYDAIASSIKIAGTEATIT DVINTEAFEIAADYNNGMWYEASTGTVSKSSDNKTVTWDFGNISANKETLTIYIKVKDTV 
PGGTAPDTNKEARVDYTDPDEQSQTQDIPSPNLAVGDVGTITLNYYLVNENGVTINGTGE VIDFERRQQLGTQLYRDAPQKFGSYSVPAPSTIEYNGIQYQYVAASADHDGSSSPANVEL SPAKKAVHLYYGYQQIKDVTVTFNGNGGTPSAGSVTVPKDTSLGSQIATAERDDHNFLGW NTQPDGTGSSFTSNTIVTDDITVYAQWQKKQALTIKASDVEKVYNGETQSVSAEGYEWSG LKAGHKIINIKTIGSGRDVKDGGYPITFTDTAKIVNGSGVDVTKEYTITTEPGTLTILPK AVTITVDNSSKYYNSNDPVFTGKVEGLVNQTDLGNVTYVRTNEAEDVGVYEKVLAAKYIE NKNYAVTETRGNFEIKVATMPGASLTAKGGSWPYDGKAHYAAASLNKADGYTVYYKVKDG EWTIDPPSVTNVSEGVVTVSVKATRKGYVDLVTDDVKLQITKKPVTITVKNAEKTFGEAD PVFEGTVQGLVKDTDLGEVTYKRTNSTQNVGTYNKVLIPEYTENTNYDVTVNKGDFTIKP AKDPEADLKIAGGSWVYDGKEHKVKLNVIDSALAALKRYKIEYSTDGGLTWSKEVPGVTN VTDGTITVIARGTREGYETLVSNTADLSITPAPVTIEVDDAFKYFGEKDPDFTGAPKGLV KAGDIGTITYSRTNKDEDVGVYLEVLTASYTPNTNYDVTIAEGDFEIKKASIEGAVLTAV GGEWTYDGDAHAAKASLENASGYTIYYKTGNSEWTTAAPSVTNVVEGTVTVSVKADRYGY ETLEADDITLKILPRDVTIQVADAEKFFDKPDPAFSGTVSGLVKAGDIGTVTYFRENTAE EVGKYPKVLSADYTQNDNYTVDVKKGNFEIKTATMEGASLEAAGGSWTYDGETHYAQAKL TNADGYTIYYQVGNGEWKKEIPGVTNVSEGELTVSVKAVKDGYVDLVTEDVKLVITPREA TIVVSDAEKFFDENDPLFSGTVNNLVKTGDLGTVRYYRTNDEEAVGTYTEVLTADYTAND NYHVSVQNGDFTIKTASVTGAELTATGGRWIYDGAAHAAAGEVENASGYTVYYKVGNGGW TTDAPSVTNVADGIVTVSVKATRTGYEELTAAPVTLEIIPKDVTIIVDPAEKFFDEADPK FTGTVTGLVKDGDLTGLTYVRTNDDEEVGIYEGVLAAAYIQNTNYNVTEHKGTFEIKTAT IKDAKVTAAGGSWIYDGNAHAATASLRNADGYTIYYKAGNGDWTTAAPSVTNVADGIVTV SVKATKTGYTDLITEDVTIQITKKPAAITVNNAEKFFNEADPVFTGSVTGLVADGDLGSV TYVRTNSASGVGTYEKVLDAVYTDNTNYEVTVKRGDFKIKEAKDPEAELKIAGGSWIYDG KEHKVKLDVIDSVLAALKRYKIEYSTDGGNTWSETVPGVTNVAEGTVTVTARGTREGYET LVSNETTIQITARTATIRVKDAEKFFDETDPAFTGTEENLVKAGDLGTVTYRRTNTDEAV GTYQKVLTAEYTPNTNYLVTVVKGNFEIKTASIEGAKVSAVGGSWIYDGTAHAAEAALEH ASGYTVYYKAGNGDWTTEAPSVTNVLEGTVTVSVKATRSGYADLMTDDVTLKITPRKATI TVDHADKFFDEQDPDFTGTTANLVASGDLGTVTYKRTNTDEAVGTYQKVLTAEYESNSNY VVIVEKGDFTIKTAIIEDAVLTAAGGSWIYDGMAHEAKAELTNAPGYTIYYKAGNGEWST TAPSVTNVDDGTVTVSVKATRTGYEDLAAEDVTITITARPATIVVDNKSKSYSDIDPLFT GQIQGLIADGDLGSVVYHRIDADKNKEAVGADITLTASYRENSNYKVEVINGKLNITALN TNTVNVTGETVTYDGKTHGLREVTALKEGSTILYSVDNQNFSETAPVFTEAGNHTVYVKA 
VNPNYDDTAVVTGIVVIQKREITITAASAERRYNGLELTAPTAAITNGTLADGQVLGSVK VTGSQTAVGSSQNVASDAVIVANDEEVTANYHITYLAGTLRVTSGSSGGDNGGGGGDTPN PNKPYVPGGPGTVTIEPGEVPLANLPEGSPADNLVLIDDGNVPLAGLPKTGDRAGAHAGL AAVLSGFLMAAFAVLSSKKKEEN >gi|229784108|gb|GG667627.1| GENE 3 9997 - 10077 68 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVGCEELPLSAYISQVVAQVEMLTEK >gi|229784108|gb|GG667627.1| GENE 4 11033 - 11179 168 48 aa, chain - ## HITS:1 COG:no KEGG:Closa_1010 NR:ns ## KEGG: Closa_1010 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 48 1 48 134 65 72.0 6e-10 MKLIVIDGQGGKMGHAVIVQLKKSHPELEITAIGTNSIATSSMLKAGS >gi|229784108|gb|GG667627.1| GENE 5 11364 - 11690 549 108 aa, chain + ## HITS:1 COG:CAC3376 KEGG:ns NR:ns ## COG: CAC3376 COG1917 # Protein_GI_number: 15896618 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Clostridium acetobutylicum # 2 108 5 110 114 92 43.0 2e-19 MYVENKNVPLTDLGGGVVRKVLAYSENLMNVELKFEKGAIGVKHSHPHEQIGYIISGSLL FQEEGKEDKVLVTGDSYYVEPNAVHGVVALEDTMLLDIFTPMRKDFVE >gi|229784108|gb|GG667627.1| GENE 6 11789 - 13198 1302 469 aa, chain - ## HITS:1 COG:no KEGG:Closa_1011 NR:ns ## KEGG: Closa_1011 # Name: not_defined # Def: GerA spore germination protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 469 1 469 469 795 78.0 0 MDFSSKLSDNMSYLKEKLNVDTNFDVVYRVIHVGGREACLYFIDGFTKDEVWLKILQAFS TLKEDAMPEDAHAFSKKYLPYGEVGLVRSDSEMIGQLLMGVSCLFIDGYDCCFTIDCRTY PARGVSEPEKDKVLRGSRDGFVETLIFNTALIRRRIRDPKLTMEITSAGESSHTDIALCY MQGRIDEKLLAMIKDRINKLEVDALTMNQESLAECLYPHKWYNPFPKFKFSERPDTAAAS ILEGNIIILVDNSPAAMILPSSVFDIIEEADDYYFPPVTGTYLRLSRMAISVLTLYLTPL WLLLMQNPDAIPSWLEFIQLSEPPHVPLIFQLLILEFAIDGLRLAAVNTPSMLTTPLSVI AGIVLGEYSVKSGWFNSETMLYMAFVTVANYSQASFELGYAFKFMRMILLVLTAVFNYWG FAIGVILSVCAVIFNKTIAGKSYIYPLIPFNWNKLKKRFLRGRLPHKEK >gi|229784108|gb|GG667627.1| GENE 7 14770 - 15642 796 290 aa, chain + ## HITS:1 COG:lin2365 KEGG:ns NR:ns ## COG: lin2365 COG1284 # 
Protein_GI_number: 16801428 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 16 284 14 282 300 145 32.0 1e-34 MWRKLRKNRRFRTVMTTVAVIGSALLQTFVIQAFIRPAGLLSGGFTGIAILIDRVTSLFG VNISTSLAMVILNIPVAWACSRSISRRFTFFSLLQVFLASIFLKLFHFSPIFEDVILNVI YGGVLYGFCIVLALRGNASTGGTDFIALYVSNKTGNSIWSYVFAGNVVLLCIFGAIFGWD YAGYSILFQFVSTMIVSTFHHRYERVTMQITTSKGPEITNAYVKTFRHGISCVDAIGGYS KKHMYLLHTVVSSYEVEDIITLLHEVDDHVIVNMFRTQQFYGGFYRAPME >gi|229784108|gb|GG667627.1| GENE 8 15657 - 17051 1248 464 aa, chain - ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 10 444 4 437 445 342 41.0 9e-94 MSQKSRTQNAIGQNQITEGVIWKQLLLFFFPILFGTFFQQLYNTADAVIVGRFVGKEALA AVGGPTGTLINLLVGFFVGLSSGATVIISQFYGAKQPDRVSRAVHTSIAFSIAGGIVLMV IGVAGAPWALSAMGTPDDILNHAVLYMRIYFLGVIGNLIYNMGAGILRAIGDSKRPLYFL IASCFTNIILDIVFVVCLHMGVMGAALATILSQLVSAVLVIIVLMKTKESYHLIPKSIRL DLDMLKRIIQIGFPAGLQSVMYSASNVIIQSSVNALGTDTIAAWTAYGKIDSVFWMIINA FGISVTTFVGQNYGAGKKDRVHKGVHVCLAMSFAATFLLSILLYCGGSYIYLLFTTDAAV IEKGTEILQYLVPTFFTYVCIEIYSGSLRGVGDCWIPMIITCLGVCALRVVWIWAAVPLR PTIQTVIFSYPLTWVITSILFFFYFNWFSKLKKYHWYGFRRRNR >gi|229784108|gb|GG667627.1| GENE 9 17139 - 17546 377 135 aa, chain - ## HITS:1 COG:no KEGG:Closa_1014 NR:ns ## KEGG: Closa_1014 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 132 12 143 146 187 83.0 1e-46 MKEIIYTAIFVGLGILLITLFLIMFRPKKDTVPTSGEAAAYVPGVYSSAITLNNQDVNVE VTVDADKITSVTLVPLSEAVTTMYPLMQPAMDTLSEQIVKNQSTKNISYSDETRYTSTVL LKAVDKALSKAEAAK >gi|229784108|gb|GG667627.1| GENE 10 17698 - 21177 3712 1159 aa, chain + ## HITS:1 COG:CAC2263 KEGG:ns NR:ns ## COG: CAC2263 COG3857 # Protein_GI_number: 15895531 # Func_class: L Replication, recombination and repair # Function: ATP-dependent nuclease, subunit B # Organism: Clostridium acetobutylicum # 1 1157 1 1145 1153 615 31.0 1e-175 
MSIQLLLGGSGSGKTHRLYTDLIKDSMENPDTKYFAIVPEQFTMQTQKEIVTLHPNHGVM NIDIVSFQRLAYRVFEELAIVNPDVLDDMGKSMILRKVTGGKQKELPLYQSHLNQNGFIG QLKSMLSELYQYGITPDMLEAKIPETTSPMLRQKLEDISVIYKAFQEYIRDRYITTEEIL DVLCRHLPESKLIRDSVITLDGYTGFTPVQYRLLDLFLRYSRRVVVTVTVDPAVPERGKR GVQDLFYMSCEMIDKLNALARQNHVKRELDIVLDEHPAVRYRRGAKRRAEQEARVERKAL TVSPASSALDFLEQNLYRYSGRVYSGKAEEIRLVQAVNPAEEISCVVREIGKMLREGYRY RDMAVITGDIGSFAGELIHQFDASEIPYFLDDKKSILKNPMVELVRAALETIQKDFSYET MFRYLRTGLVVKPEDERKLDRLENYVIAMGIRGHKRWNTPWEGWYRGGRDLNLEELNQLR EEIMAPLTAFIEAFREEGRTVRTMSEAVVRLLEALSVEEKLLARESAFQEQGEFGLSKEY GQVYGLVVDLMDRLARLLGEEKVSRREYAEILDAGFSEIKVGLIPAAVDRIVVGDITRTR LDHIKILFFIGVNDGIVPQKKENSSLFTDKEREFLGSHHIELAPTVRVESFRQRFYLYLA LTKPEERLYLSYSAMDASGKSLRPSILLGELKKLFPQLTAVAASDEVAGIPFSIREARGR LTRGLRNYGISREDSEFLELFRHFMMNEEQRESVEKLVDAAFYAYEERGIGKMAAKALYG TVLGGSVTRLEQYASCAYAHFLNYGLELAERQQYELAAMDIGNLFHDSIDLCFKRMKEQG GDWKTIGEEERKALVHGVVTEVTEEYGNTILKSSARNAWLARKVEKITDRTIWALAEQLK KGDFTPVGFEVSFSAADNLKAMKIPLSEAEALHLKGRIDRMDLCEDEEHIYVKIIDYKSG GTSFDLTALYYGLQLQLVVYMDAALEMEERRNPDKTVIPAGIFYYNINDPVIEREGDMSP EAIDRRILKELRMNGLVNSELEVISHLDHEIETESDVIPVAMKNGLIQEAKSSVAGGNRF SALKRYVNEKLKTEGREILDGVVAVNPYKQGNKTACDYCPYHAVCGFDLKTSGFGFRKFK PLKSEEIWPVIEGEQQDGN >gi|229784108|gb|GG667627.1| GENE 11 21167 - 24901 4097 1244 aa, chain + ## HITS:1 COG:CAC2262 KEGG:ns NR:ns ## COG: CAC2262 COG1074 # Protein_GI_number: 15895530 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Clostridium acetobutylicum # 4 1244 7 1252 1252 709 35.0 0 MAINWTKEQKAVIESRNRNLLVSAAAGSGKTAVLVERIIRMITEGKNPLDIDQLLVMTFT KAAADEMRERVLLAVDEKLKEDPENSHLQMQAAMIPYARITTIDSFCLGIIREHYNRLDI DPAFRVGDEGELLLLRGSVMEQMLEDYYEAGDEEFSRFVETYATGKSDRGIEDHIMAVYN FSGSNPWPEEWLTACETELENCEEERLMETEWMHFLMWDVAMQTGELCAQLEEAVTVCEE ENGPTAYIPMLTSDLRMLREIGKAEDYASLNELLGCAAFDRLASIRSKDIDADKKAFVTG CRDRVKKAVGKLRDLYCFESMETVVRDMRGTAGAVRMLLKLAGEFNDRYQAAKQEKNLVD 
FGDLEHYALEVLLEKPDGESEECAGDSMDDSLDASGRRPSAVADELSRQFEEILVDEYQD SNDVQETLIHAISRERFGTPNVFMVGDVKQSIYKFRLARPELFLKKYESYPREEGLYQTI ELHQNFRSRDSVLSGINEVFYQIMTKGLGGILYTEDAALHPGAVFEPTEETVGGKLELHL VNTGGIQLKQLEMESDALDYTSREMEARLIVSRMKELVNPETGLKIWDKKEKRYRTACYG DMVILLRSLSGFAEDFVNILMNEGIPAYAERRTGYFTAIEVETVLCFLSVIDNPMQDIPL AAVLKSPMVGASDEDLARLMAVFKRTAKKGQDRGIYGAWLYYLENCPEEEREEELYRKLA LFFDELAEYRRMAGYLSIHELLYVIYENTGYYDYVAAMPAGDARQANLDMLVEKASAYEK TSYSGLFHFIRYIEKLKKYDTDFGEAALAGDGNTVRILSIHKSKGLEFPIIFLAGMGKKF NKQDLYGKILIDPDLGIATDYLDLDLRVKTSTLKKNVLRRRLELEALGEELRVLYVAMTR AKEKLIMTGTDRYLDKKLERFSDIKRTAGQIPFTILSTADSFLDWLLMSLSGKLSESALL SEAGVETGLMTVREYSVADLVGVEIEHQAEKKLSKEELLNFDCSRIYDEAYAAGISAAFS YHYPHTADIGLHTKLSVSELKKQGQMVDEEESTFLPTIPAFLMEENGKKEQGGGAFRGTA YHRALELLEFPAMKTLADVETALETFHREKYMDEESLALLDAGIIWNFLSSPLGRRMSAA QSKGLLYKEQQFVIGIPAREMEVCDSGELVLIQGIIDAYMEEGDGLVLIDYKTDHVVKGH ESLLTERYGIQLEYYKRALEQMTGKRVLEKIIYSLTLQEEIVIE >gi|229784108|gb|GG667627.1| GENE 12 24991 - 26460 345 489 aa, chain - ## HITS:1 COG:PH0977 KEGG:ns NR:ns ## COG: PH0977 COG1672 # Protein_GI_number: 14590822 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Pyrococcus horikoshii # 1 423 38 435 496 262 35.0 1e-69 MFIGRKMELSFLEEKYKSKEGQLIVLYGRRRVGKTETLKEFCKGKPHIFYSCRECTDRQQ LLSYSEKVLKAGIPAAQYIHEFQDWESAITSSLELPGNGKKLLVIDEFPYMCRANGSIPS ILQNLWDSSLKNADIMIILCGSAMSFIEKEILAEKNPLYGRATGIYKMTEMPFSDVIKFF PDYPPEDQITVYAILGGIPHYLKQFDHRISVEENIRKNILTKGSVLYSEVEFLLRQELRE TAVYNTIIEAIALGNTTLNDIFNKTQIEKSKLAVYLKNLMDLQIVKREFSVSDSVKERAK SNKGLYQLTDNFFRFWYSFVFTTYSDLEAGDTEGVMKYVVTPQLHEFTAFVFESLCRDYI RLLNRQEQLPYRYTKIGRWWGKITKLTDGGNGHKSRRTYETEIDIMAVDHQSKHYLLGEC KYKASPFDLRDYKTLKSKWENTGDTECSFYLFSQSGFTEQVLQLEAEEKLVCISMDQMVS RFSKSSEYR >gi|229784108|gb|GG667627.1| GENE 13 26675 - 27517 773 280 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620648|ref|ZP_06113583.1| ## NR: gi|266620648|ref|ZP_06113583.1| hypothetical protein CLOSTHATH_01748 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01748 
[Clostridium hathewayi DSM 13479] # 1 280 1 280 280 578 100.0 1e-163 MDINYHEIAERAASHALKSNITLDYSEKSIAEVESILGTYYDHLAEYDGKDGADTLWNVA VHYGIYLGETMLRLGMKEKGFAWYIDDGMPVLKNQAKAQISPVTKAHKRILNGPEDNVKS FCDVAFLLADGKFPDKNVHRAINVHLPSGQVIENVLYRDIASYITMIETGREDFLILESQ DGFFQFYGVNNQFVCEVRVNLPDGDFHTYSVIDKAKEQQTRRVQLTTPYGQFTPAEREVV SLEVVNMVVRAYYEHVKTDDFLGAIPYIDTTEITKRCMGL >gi|229784108|gb|GG667627.1| GENE 14 27545 - 28591 1218 348 aa, chain + ## HITS:1 COG:selD KEGG:ns NR:ns ## COG: selD COG0709 # Protein_GI_number: 16129718 # Func_class: E Amino acid transport and metabolism # Function: Selenophosphate synthase # Organism: Escherichia coli K12 # 2 328 1 327 347 246 41.0 4e-65 MIPDNEVRLTQMTKTAGCAAKIGPGTLAGVLENLPKADDPNLMVGIETSDDGAVYRVNDE LALIQTLDFFTPVVDDPYTFGQVAAANALSDIYAMGGEPKVALNIVAWPNCVNPAFLGKI LEGGASKVAEAGAVLAGGHSVQDDEPKYGLSVTGFVHPDKIFKNCGARPGDVLILTKPLG TGIVNTAVKADMASAEAKEEVIRVMTSLNRTAKRVIEQFDVHSCTDVTGFGLAGHSAEMA EGSGVTLEIHMSQLPIQKEAPDLAKMGLIPAGAYRNRAFVEHKVDFGDTEEFLCDIFCDP QTSGGLLVSVTPEDGEKIAAGLREAGLETAFAVIGCVKELEDKFVKLR >gi|229784108|gb|GG667627.1| GENE 15 28596 - 29729 956 377 aa, chain + ## HITS:1 COG:CAC2354 KEGG:ns NR:ns ## COG: CAC2354 COG0520 # Protein_GI_number: 15895621 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Clostridium acetobutylicum # 1 374 1 374 379 310 42.0 3e-84 MIYLDNAATSYRKPEAVYQAVLEAMRHMGNSGRGAHGASLDASRQIYAARELLAELFHAG DPSRVAFTANSTESLNMALFGLFGPGDHVITTMLEHNSVLRPLYALEEAWMELTVLPADQ KGCICYGDFEKALRPNTKGIVSTHASNLTGNMVDIGWIGRFCREKKLLYVVDASQTAGVF DINMEEMKIDVLCFTGHKGLLGPQGTGGICVREGIAVRPLIVGGSGVHSYSKTHPADMPA ALEAGTLNGHGIAGLYAALCYIRETGMDQIRERELSLMRQFYDGVRAIPGVTVYGDFSGL RAPIVSLNIRDYDSGEVADELAYRYEICTRAGAHCAPLMHRALGTESTGAVRFSMSHFNT EAEIDEAVRAVSEIAAE >gi|229784108|gb|GG667627.1| GENE 16 29759 - 30664 964 301 aa, chain + ## HITS:1 COG:MJ0300 KEGG:ns NR:ns ## COG: MJ0300 COG0583 # Protein_GI_number: 15668475 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Methanococcus jannaschii # 6 297 8 296 
296 144 32.0 1e-34 MNLKQLEAFVCVAEVKSFSQAAKKLYLTQPTVSAHIHSLEKELGARLFIRTTKDVELSPA GELLYGNARKMLQLEKNILRDFTRADTPGIQKITVGASTVPGQYILPQILSLFSRTYPGN QLELMEADSLEVVRMVLDGKVEIGFTGTRLEDPTCVFEPFYYDRLVVITPNAEAYREYEK TGFPIERFYEEKWIIREEGSGTRKETERYLKEHGVDLRRMNVVATISNQETIKKSVSTAM GIAMISSAAVEDYVASGQLLRFPLAEGDIYRKLYMVWSKSNKPGKAARLFIQFVRELYAY L >gi|229784108|gb|GG667627.1| GENE 17 30771 - 31862 955 363 aa, chain + ## HITS:1 COG:no KEGG:Ccur_00560 NR:ns ## KEGG: Ccur_00560 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 4 354 6 356 381 390 67.0 1e-107 MTESKKTLALSGVMLGITAAALAFFGNPKNMAICIACFIRDTAGAMKLHQAAVVQYVRPE IAGIVAGAFLISLITKEYRSTAGSSPMIRFVLGMIMMTGALVFLGCPLRMLIRMSAGDLN AYVALAGFVGGVFTGTAALKRGFSLGRSYESRSASGYVLPVLLVLILILSVTTSLMAVSE KGPGSMHAPVLIALVGGLVFGAVSQKCRTCFAGSFRDVILMKNFDLITIIGAFFAVMLLF NMVTGGFKLSFDGQPVAHAQHLWNILGMYVVGFAAVLAGGCPLRQLVLAGQGSSDSAVTF LGMFIGAALCHNFGLAGAAAAAATDTSPAAAGGPGTAGKAAVIICIAVLFAIGFLPGIQR EKK >gi|229784108|gb|GG667627.1| GENE 18 31890 - 32105 276 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620653|ref|ZP_06113588.1| ## NR: gi|266620653|ref|ZP_06113588.1| sulfurtransferase TusA-like protein [Clostridium hathewayi DSM 13479] sulfurtransferase TusA-like protein [Clostridium hathewayi DSM 13479] # 1 71 1 71 71 128 100.0 1e-28 MYEVDARGLSCPEPLILVSEALKSHPGEQIKVLVSEPHSRTNVEKFVKNRGLTVLVRETG SEFELTIEGQR >gi|229784108|gb|GG667627.1| GENE 19 32106 - 32342 307 78 aa, chain + ## HITS:1 COG:no KEGG:Closa_1020 NR:ns ## KEGG: Closa_1020 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 78 1 78 78 97 55.0 1e-19 MRQKENKVIVTFHTTAEAIAMEKACLNGKIPGRLIPVPRQISSGCGLAYCAAAELEKMVL TYMEEHGLHYEAVGVYLI >gi|229784108|gb|GG667627.1| GENE 20 32821 - 34101 1298 426 aa, chain + ## HITS:1 COG:CAC0282 KEGG:ns NR:ns ## COG: CAC0282 COG0402 # Protein_GI_number: 15893574 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related 
metal-dependent hydrolases # Organism: Clostridium acetobutylicum # 1 426 1 424 428 399 46.0 1e-111 MTGDKNFVLKGNICYSEDAKNLKLVETGWLVCKEGLCAGVFETLPEEYASYPVTDCGDQI ITPGLVDLHVHAPQYAFRGLGMDLELLDWLNTNTFPEESKYRDLEYAKSAYGIFADAMKN SATTRACIFATIHNQATELLMDLMEETGLKTMIGKVNMDRNSPEYLCEESAEQSAADTVK WLEDTAGKYENVRPILTPRFIPSCTDDLMRKLKEIQKKYDLPVQSHLSENQGEIAWVQEL CPESRFYGDAYDQFGLFGGGVKTIMAHCVSSTEAEIELMRERGVYVAHCPESNTNLSSGV APVREYLDRGVHTGLGSDVAGGSRESIFYAMAHAIQVSKLRWRLQDTGLKPLTVEEAFYL GTLGGGSFFGKVGSFMEGYEFDALILDDSAIRHPQPLTLKERLERFIYLSDERHIKGKIV AGKRIF >gi|229784108|gb|GG667627.1| GENE 21 34248 - 35642 1627 464 aa, chain + ## HITS:1 COG:MJ0326 KEGG:ns NR:ns ## COG: MJ0326 COG2252 # Protein_GI_number: 15668500 # Func_class: R General function prediction only # Function: Permeases # Organism: Methanococcus jannaschii # 9 464 5 434 436 362 52.0 1e-100 METKGNENFLEKMFHLSANHTDVKTEVVAGITTFMTMAYILAVNPSILSAAGMDSGAVFT ATALAALVATLLMAVFANYPFVLAPGMGLNAYFAYTVVLNMGYTWEMALAAVFVEGLIFI LLSLTNVREAIFNAIPMNLKHAVSVGIGLFIAFIGLQNAKIVVDGATLVSVYSFKGALSD GTFNSVGITVLLAIIGILITAILVVKNVKGNILWGILATWILGMICQAVGLYQPNPELGM YSVFPDLSSGFGIKSMAPTFFKVDFSRVLSLDFVVVMFAFLFVDMFDTLGTLIGVASKAD MLDKDGKLPRIKGALMADAVGTSVGALFGTSTTTTFVESASGVSEGGRTGLTAVVAAILF GLSLFLSPIFLAIPSFATAPALVVVGFLMLTSITKIDFNDFTEAIPCYIAIIAMPFMYSI SEGIAMGVISYVVINLVTGHAKEKKISTLMYVLAVLFVLKYILI >gi|229784108|gb|GG667627.1| GENE 22 35854 - 37980 2585 708 aa, chain + ## HITS:1 COG:ygeS KEGG:ns NR:ns ## COG: ygeS COG1529 # Protein_GI_number: 16130768 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli K12 # 8 708 1 698 752 628 46.0 1e-180 MGVGNNMVRVDAIEKVTGGAKYTADLEPKGLLVAKVVRSTIANGVVKSFDLEEALAVPGV VKIVTCFDVPDIQFPTPGHPWSVETAHQDIADRKLLNTRVRVYGDDIAAVIAEDEIAANR AARLVKAEYEEYEPMLTVEAAMAEGATVLHEEKPGNMIAHSSFVVGEGTYKEAVEGEDVV EIDHVYETQSVQHCHIETPISFAYMEKGRIVVTTSTQIPHIVRRVISQALGIPAGEIRVI KPYIGGGFGNKQDVLYEPLNAFLCTQVGGRGVRMEISREETLACTRVRHAIKCHVKAAVK 
KDGTLVARKFEAYSNQGGYASHAHAIVANTSNEFKQIYHDEKTLESDAYTVYTNITTGGA MRGYGIPQAAFAAECMADDLAAAIHMDPLEFRMKNCMPEGYVDPHTHVACNTYGLKECME KGRNYIHWDEKRKEYENQTGPVRKGIGMAIFCYKTGVYPISLETASCRMILNQDGSMQLQ MGATEIGQGADTVFTQMAAETTGITEDRVHILSTQDTDITPFDTGAYASRQTYVSGMAVK KAGLIFKDKILDYAAYMLDKEKDSMDIRNNVVVDKESGEKLLDMAELATTAFYSLDRSIH ITAEATSHCKDNTFATGACFAEVEVDMPLGKVKVTNIVNVHDSGTLINPKLAEAQVHGGM SMGLGYALSEELLFDKNGRPLNDNLLDYKLPTSMDTPDLHVQFVELDD >gi|229784108|gb|GG667627.1| GENE 23 38903 - 39076 203 57 aa, chain + ## HITS:1 COG:ECs3739 KEGG:ns NR:ns ## COG: ECs3739 COG1529 # Protein_GI_number: 15832993 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli O157:H7 # 1 57 696 752 752 66 56.0 1e-11 MDDPTGPFGNKALGEPPAIPVAPAIRNAVLNATGVAVDSLPLDPQKLVAHFKAAELI >gi|229784108|gb|GG667627.1| GENE 24 39095 - 39982 1159 295 aa, chain + ## HITS:1 COG:ECs3740 KEGG:ns NR:ns ## COG: ECs3740 COG1319 # Protein_GI_number: 15832994 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli O157:H7 # 1 292 1 290 292 266 42.0 5e-71 MYDIEKLYQAKDIGDAIRALKEDPEAVVISGGSDVLIKIREGRLAGCSLVSIHGIKELEG VRMEEDGTIVIGPATTFSHVTNNRIIQKYIPILGDAVDQAGGPQLRNIGTVGGNVCNGVT SADSASSLCCLDAILVVEGPEGVREIPIREWYTGPGKTVRKHDEILTAIKIKKESYEGFG GHYIKYGKRNAMEIATMGCAVTVKLTGDKKHIEEIHIGYGVAAPTPIRCFKAEEKVKGME IGNELFETLGKSVLEEVNPRTSWRASKEFRLQLIEEMAKRALKQAIINAGGECNA >gi|229784108|gb|GG667627.1| GENE 25 39975 - 40466 731 163 aa, chain + ## HITS:1 COG:mlr1925 KEGG:ns NR:ns ## COG: mlr1925 COG2080 # Protein_GI_number: 13471825 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Mesorhizobium loti # 1 154 1 154 157 145 48.0 2e-35 MLKIVHMIVNDKEVEVAVDDRESLLETLRNRLGLTSVKKGCEVGECGACTVLVDGEAIDS CIYLTMWAEGRRIMTVEGLKGPNGELSPIQKAFVEEAAVQCGFCTPGLIMTAVEIVGTGK 
RYNREELRKMISGHLCRCTGYENILNAMERIVEETYKVVGKED >gi|229784108|gb|GG667627.1| GENE 26 40566 - 41954 965 462 aa, chain - ## HITS:1 COG:BS_ppsA_1 KEGG:ns NR:ns ## COG: BS_ppsA_1 COG1020 # Protein_GI_number: 16078895 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Bacillus subtilis # 1 458 1 432 1045 90 22.0 5e-18 MAYHLYPLTASQRMLQNACEEFGTTQVLTIGVCVSIRMPVDFSLLKACLTEETERLDCLR VQLTSPDREGGIRQYIVPEHTPDITYRNLKDKTKAEIKDLMTSWSTIGFEKADAPLHRYV MVSLPGGFNGVYLCIDHRIMDSAGLISMMNDTFRLYCHYLYGAPVPAMPARYEEVLQSDL KREQDSLRQEKDLAFWKQLLTAGEPVYTDFGGSRTLEQSRLRHGMPGLRAADRVIQPMEG DMARFTLSKKETARLQSFCREHSVSMTNLFLMSMRTALSFTNGEERDVTIRNYISRRTSR QSRTCGGCRIHCYPCRTVISPDTPFLSGIRQITSCQNSIYRHSNMDPEMVNRLFADTFPM PPLTTYEGAALTYQPLPLSSSNPWLNHIPLRTEWFSNRADIQKFYLTVMQTSRDQGLDFY FKYQKAELGKSDIDAVYELLSAVLSLGVRRESLPVGELMRQL >gi|229784108|gb|GG667627.1| GENE 27 42049 - 43044 736 331 aa, chain - ## HITS:1 COG:BH1928 KEGG:ns NR:ns ## COG: BH1928 COG1609 # Protein_GI_number: 15614491 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 310 1 312 335 140 34.0 4e-33 MATITDIADKLGISKGTVSKALNDAPDISETLRKTILETAVEIGYTKMRRRRSVSKKVCV LIGEGNMEYTSPGHFGYDLIIGFRKMAEPAGLTVDVVPFTDKLQKSTPYDVFMLENDYIG SFVLGISLNDAWMKDFRTSHTPAVLFDNDIKGNPAVASVGVDNNEGMELAVEYLKSLGHK KIGYLSGALGSQVFLLRHRAFYGALKQNGLSSDPQLSGSSYYISECIQKHLPRLLDRGVT AIVCSHDQLANAALVQLKELGKRVPEELSIIGFDDLPISPYTDPPLTTIRQDRPKLGKCG YYALDSLLNDVPIGTILLHSQLVMRGSCGPI >gi|229784108|gb|GG667627.1| GENE 28 43375 - 44604 1487 409 aa, chain + ## HITS:1 COG:SPy1294 KEGG:ns NR:ns ## COG: SPy1294 COG2182 # Protein_GI_number: 15675247 # Func_class: G Carbohydrate transport and metabolism # Function: Maltose-binding periplasmic proteins/domains # Organism: Streptococcus pyogenes M1 GAS # 76 373 73 380 415 136 27.0 6e-32 MKRGLVLALALAVTAAVVSGCGKAALGQGSGEEVRLMDWSSSEDQSKDSGQWLQKSCERF AELHPEWNITFVYGVADEATAANQVSQDAEASADVFLYANDTLTIMTDANALVKFGGKYR 
EEIEAMNSAEVINSVTMDGELYGVPFTTNTWYMYYDKSVFSEEDIKNLDTMLSKGTVSFP FTNSWYLPAFYIGNGCTLFGDGTDESKGVDFGGEKAVEVTEYLVDLIANPNFVVDADGSG MAGLRDGSIHALFSGSWDMASVKEILGGNMGVAALPTYTLNGEEKQMKSYAGSKAVGVNP ESSYMVPAVQLAVYLGSAEAQKLHHELRNVVPCNLELMENPDIASDALVTAQNDTFNRTS ILQPFVAGMNNCWVPVENMGKGIRNKSVTHENAKEQTEAMNAAMNSNGI >gi|229784108|gb|GG667627.1| GENE 29 44618 - 45994 1492 458 aa, chain + ## HITS:1 COG:SPy1295 KEGG:ns NR:ns ## COG: SPy1295 COG1175 # Protein_GI_number: 15675248 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Streptococcus pyogenes M1 GAS # 8 457 4 452 453 343 44.0 6e-94 MGKLNRINEFTTPYSFREALRRGGLETKLSMILAGLGNIVHKQVGKGLLFLAAEAAYVVF MMTTGFRCIAMMGSLGTVGQQEIWDEVNQVYRYTKGDQSVLILLYGVAVLFLTALMAAAW HGAVKSAYKAECLDKAGKHVNSFFEDLKSLLHENLYRLFMTPPLTFIFGFTVLPLIFMIC MAFTNYSKIDNHLVLFDWVGLDNFRALFDSGSILGSTFWSVLSWTLVWAFFATFTNYIFG MILAMAINRKETKAKGFWRFCFVLSCAVPMFVSLLIMRTMLQPNGAVNVLLRNVGLIGAD ASLPFFTNPTWARVTIIVINIWVGVPYTLLQLTGVLQSIPEDLYEAARVDGANSFQTFFK ITLPYMLYITTPYLITTFTGNVNNFNVIYLLSSGDPVTDLASTAGKTDLLVTWLYKLTID KQYYNIGAVIGILTFIILAVGALFTYRNSKSYKEEEAF >gi|229784108|gb|GG667627.1| GENE 30 45996 - 46862 919 288 aa, chain + ## HITS:1 COG:SPy1296 KEGG:ns NR:ns ## COG: SPy1296 COG3833 # Protein_GI_number: 15675249 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type maltose transport systems, permease component # Organism: Streptococcus pyogenes M1 GAS # 17 285 11 275 278 272 54.0 5e-73 MKKMRSAKQVRTVNNILVHTALAILAAVWVFPILWVILTSFRAEKGSYVSTFFPKAFTLA NYSRLFTDTTILNFPRMFANTLFIAVCSCILSTFYVLAVSYCLSRLKFRMRKPYMNMAMI LGLFPGFMSMVAVYFILKALGLTEGNLIRLALILCYSGGAGLTFQIAKGFFDTVPRAIDE AALIDGCNRWQIFSRITLPLSKPIIIYTVLTSFIAPWVDFIFAKVICRVNSDQYTVAIGL WRMLEKEYIDSWYTSFAASAVLISIPIAVLFLCMQKYYVNGMSGAVKG >gi|229784108|gb|GG667627.1| GENE 31 46866 - 48911 2050 681 aa, chain + ## HITS:1 COG:TM1845 KEGG:ns NR:ns ## COG: TM1845 COG1523 # Protein_GI_number: 15644588 # Func_class: G Carbohydrate transport and metabolism # Function: Type 
II secretory pathway, pullulanase PulA and related glycosidases # Organism: Thermotoga maritima # 23 669 218 840 843 408 37.0 1e-113 MTTEKMTARELKRFYADSTFKKNYIYDGHDLGVSCGGDGTVFKLWSPTAERVELHLYPDG SESGCMRSVSMEQGEQGVWLYRTEENLHGTYYDYDVTSGGETRRTADPYAKACGTNGRRS MAVELERTDPEGWKADRAPERQREDIIYELHVKEFSWDESAGFPEELRGTYRAFTEEDTT LSGDGVHPTGIRYLKELGVTHIQLMPVYDFGSVDEAGDREQFNWGYDPVFYNVPEGSYSS DPSRGEVRITELKELVQALHRNGFRVIMDVVYNHTYSADSPFQRVAPWYYYRQNEDGTLS DGSGCGNDFAAEREMAASFILDSVLYWTEEYHMDGFRFDLMGLLPTELMNRIGRALDLRY GAGEKLIYGEPWAADRTPMEPGFHQALKDQEKRLDPRVAMFCDNTRDAIKGHVFDALQPG FVNGGEGLEREILHAVSAWCDRGGAFRAGSPERILSYVSAHDNLTLWDKLILSMTGRCGR NVFYERPEEVIRAYKLAASIYFTCQGHLFFLSGEEFARTKEGNDNSFCAPISLNRLDYNR AYQLEDLVQFYKDLIALRKCLPGLCDKSVEAGKRIFRKNAGRKGVVSFQVDNRCKEEETG NGKQDTGSGGLCDTLFIAYNSRRRPVKMALPAGEWEAFLKDGDSGMWRRPQRETYSGIFM VPPVSAVIMGKKRETMQEAER >gi|229784108|gb|GG667627.1| GENE 32 48803 - 50857 1384 684 aa, chain + ## HITS:1 COG:MA0905 KEGG:ns NR:ns ## COG: MA0905 COG3408 # Protein_GI_number: 20089784 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen debranching enzyme # Organism: Methanosarcina acetivorans str.C2A # 41 678 28 671 680 377 36.0 1e-104 METAAERDVFGDFHGSACLGGYYGEEKGDHAGGGAMKFIYGKQDFKTLERGQETCCLLTN GLGGFSSVTAIGSSARSDHAFFMACMKAPNHRYNMIHRLCERLVTAEKAVFLSSQEFAGE REPEDGFLRLSSFQFEDCPAWLYHADGVEVIKTTAMKQGENTIAVRYEIHNHRREACRLE VTPFMQFVKKGGDLSPDQVFTCSDSRISSAGMTLYFTTGGSVKTFEPVSETYYYACDACD GRRETGTALANHAVCCTTGPGGCSKLELVYSTEPVTETAEKIIDDAVRYRKDLERQAGFK SEAASMLAKSAAQFISRRESTGGDTILAGYPFFEDWGRDTMIALSGCALATGRYGTAKRI LRTFLKAERNGLMPNLFPEGGSEPMYNTVDAALLFINSVYLYFERTNDISLVRECCPVME HIVKQYETGTDFGIHIDEDGLITAGQGFDQVTWMDVRIGDILPTPRHGKPVEINAYWYNA LRIMEKFTESPEAARHYGHLAELVKQSFQEKFWLEEKHCLKDVVSGAEERDIPDNQIRCN QIWAVSMNFSMLPLEKEKQVVETVAEKLYTPYGLRTLSEDDREFRPCYGGSQFDRDLAYH QGTVWVFPLGGYYLAYLKVHGYSEAAKAEVYRQLEVLEGAMREGCAGQLPEIYDGKNPVS SKGCFAQAWSVGEILRVYEALERS >gi|229784108|gb|GG667627.1| GENE 33 50875 - 51657 853 260 aa, chain + ## HITS:1 COG:SP2106 KEGG:ns NR:ns ## COG: SP2106 COG0058 # 
Protein_GI_number: 15901921 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Streptococcus pneumoniae TIGR4 # 6 260 4 259 752 301 62.0 7e-82 MMEQELNKVLGERFGKTIEEACMEEIYHALLTIVKGKMKDLTPNTGSRKLYYISAEFLIG KLLSNNLINLGLFDETAEVLKKHGIDIADIEEAELEPSLGNGGLGRLAACFLDSIATLKL PGDGVGLNYHDGLFKQTFADHKQREEKNPWMTGDDWLTRTDVHFEVPFKDLTVTSTMYDI DVPGYEGGKNSLHLFDLDSVDESIIHDGIEFDKEDIRRGLTLFLYPDDSDEKGQLLRIYQ QYFMVSNAAQLILKETEERG >gi|229784108|gb|GG667627.1| GENE 34 52663 - 54084 1705 473 aa, chain + ## HITS:1 COG:SP2106 KEGG:ns NR:ns ## COG: SP2106 COG0058 # Protein_GI_number: 15901921 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Streptococcus pneumoniae TIGR4 # 1 473 280 751 752 646 66.0 0 MVIPELVRLLMARGFNIDTAIDVVRKTCAYTNHTILAEALEKWPASYLEKVVPQLMPIIR ELDQRVRRDFHDEKVYIIDAQERVHMAHIDIHYGQSVNGVAALHTEILEESELKPFYDIY PEKFNNKTNGITFRRWLLHCNHELSSYIEELIGSGFKIHAEELEKLLAYQDDEAVLDRLR RIKQNAKLDLKKYLKNTQGIEIDENSVYDIQVKRLHEYKRQQMNALYAVYKYLEIKKGNL PKRPVTMIFGAKAAPAYTIAKDIIHLILCLQELIDSDPEVSPYLKVVMVENYNVSKAAKL IPACDISEQISLASKEASGTGNMKFMLNGAVTLGTEDGANVEIHQLVGDENIYIFGKSSD EVIKLYDTSGYCSRKIYESDSLVEELVDFIISKKMILTGDPVNLGRLYKELVSKDWFMTL LDLREYIAVKEQMLADYEDERAWARKSLINIAKAGYFSSDRTIEEYNRDIWHL >gi|229784108|gb|GG667627.1| GENE 35 54147 - 54398 308 83 aa, chain + ## HITS:1 COG:L124628 KEGG:ns NR:ns ## COG: L124628 COG0366 # Protein_GI_number: 15673475 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Lactococcus lactis # 9 82 4 77 515 123 74.0 1e-28 MKERNGQKWWKTSVIYQIYPRSFKDSDGDGIGDLNGIIEKLDYLEELGIDAVWLSPVCKS PQDDNGYDISDYQDIDPLFGSLL >gi|229784108|gb|GG667627.1| GENE 36 55341 - 56666 1177 441 aa, chain + ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 1 440 97 557 561 473 51.0 1e-133 
MDLVLNHTSDEHRWFLEAKKSRDNPYHDYYVWRDGEEGVLPNDMRACFGGPAWEWVPELS QYYFHQFSVRQPDLNWENEAVRREIYDMIRWWMEKGVGGFRLDVIDQIAKEPDLKITGNG PRLHEFIRELSRETFQKGDIVTVGECWGADTERAKLYSAPDGSEFSMVFQFEHIMLDSVP GREKWDLAPLPFVKLKEVLNEWQCELQGSGWNSLFWDNHDLPRIVSRWGNDKEYRVESAK MLATVLHGMQGTPYIYQGEELGMTNTDYEIGEYRDIETLHMYEERLEAGYSEEEIMRSIH AKGRDNARTPMQWSGGEGAGFTTGTPWIRINDNYREINAETEAADPWSVFHYYKRLIHLR KEYPVFVHGSFRLLLEDDPNIFAYVREWEGVRLLVLCNFFGSETWCPLKEMWEGKKRLLC NYQEVENEEILRPYEARMILI >gi|229784108|gb|GG667627.1| GENE 37 57314 - 58294 1055 326 aa, chain + ## HITS:1 COG:BS_yulF KEGG:ns NR:ns ## COG: BS_yulF COG0673 # Protein_GI_number: 16080169 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus subtilis # 1 325 1 327 328 296 46.0 4e-80 MIRCATIGTNFVADWFMKAVQACDGITCSLVYSRNRETAEAFCEKHGAGRWCTDLMEAAE AKDVDAVYIASPNSLHYEQAALMLSHGKHVLCEKTITSNEAELKELIRLAGENGTVLMEA MRSVHEEGVRRLKEVLPMAGTIRRISFQYAKYSSRYDKFKEGIIENAFNPAFSNGALMDI GVYCVHPLVRLFGMPKKIIADSLLLSNGVEGAGTILASYDGMQAELLYSKITNNRLPSQI QGEDGTLVIDNITNICSVVFHDRNGNAETVIDLPPNHTLEGEVNDWLRMIETGEGVQEEN EASLMELRVMDEARRQAGIIFPADRK >gi|229784108|gb|GG667627.1| GENE 38 58308 - 58847 559 179 aa, chain + ## HITS:1 COG:SA2335 KEGG:ns NR:ns ## COG: SA2335 COG0350 # Protein_GI_number: 15928126 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Staphylococcus aureus N315 # 1 177 1 171 173 171 48.0 5e-43 MYYSTTCPSPVGVITLASDGEKLAGLWLEGQKYHGGAIAGSMTEKEDIPVFDAAKRWLER YFAGEKPSASELPLAPVGGEFRQMVWEILCRIPYGSVMTYGDIAKRMAVKLNRPTMSGQA VGGAVGHNPISIIIPCHRVVGANGSLTGYAGGIDTKIKLLELEGVDMSSLFVPARGSAL >gi|229784108|gb|GG667627.1| GENE 39 58812 - 59060 76 82 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYSKTAKIAVKNDPFGFFYIQTGIEAYPVSIKPNKIVFRMLTRLYSGRPILCKGSGLRYL FISISSMILSSSECAPPSRNKQ >gi|229784108|gb|GG667627.1| GENE 40 60519 - 60632 73 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTTRIFCAAEGFIKEWTDPIGDSNHPVRDPARYSRL >gi|229784108|gb|GG667627.1| GENE 
41 60649 - 60822 68 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266620676|ref|ZP_06113611.1| ## NR: gi|266620676|ref|ZP_06113611.1| cytochrome c heme lyase [Clostridium hathewayi DSM 13479] cytochrome c heme lyase [Clostridium hathewayi DSM 13479] # 1 57 1 57 57 94 100.0 3e-18 MIRKMIPLGSGDLHGNWEQTVTGICLFKMTGGSYAKNRWGRGGPWWSARGKLKVKGK >gi|229784108|gb|GG667627.1| GENE 42 60930 - 63512 1702 860 aa, chain + ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 262 741 618 1136 1734 167 27.0 6e-41 MKRKYTRVISALLAVLLISTQSAVMTWADSMVPADHLLGGDEYIGKKTGYATPSDAEQKD EDALKPDDTPKDDEIPSTEIPSTEISSHENLTDGKQPEYENTEKMVVRWEFVDDENLSGG ELSLIGVSPENRADFDTVVSMLPEQVRAEIEEAGEVTLPIADWNCQEYQQDKDGEWPFTG EYKFIAELPEGYVCEPPISVLVTLGGAMVNTINDRFTIDGLRYKELGPDTVQLIGYDGEK PVGTLIIPDKVRKPSNGREYQVASIGHNAFQDCSDLTGDLVIPDTVTEIGDNAFEGCHFT GELTLSDSLVTIGEYAFYECGFTGQLLLPPTLVRVDDYAFAETTFSGQLILPEKLNYIGE LAFYECHFTGDLIIPDGVTIIEYGAFSGNSFTGTLTLPKKLKEIGRESFCKCGFTGELVL PDGLTSIGLYAFKDCSKLTGRLSIPDGITSIESGAFSNTGFDGFDTTKQEIADMLYDSGI GEDKITVGNQPYHHTQLPPVSFFQDGNMTYQIIGSDTVALTDYNGNSDTDISIPDKVTDQ ASGTTYSVTQIGSWAFRYKKITGSLHLPNTLVSIGDNAFFNNKFSGTLSLPNTLVSIGDY AFGKNRFTGDLTIPVSVSHMGAGAFNSAGFTGNLTIEGKLTKLEDYAFFECGFTGALSLP DTLTYIGFAVFKDCGFTGSLQLPAGITYIEAGSFYNCSSFTGALQLPKPITEIGEKAFYG CDGLDSAHLGPNVQKLGAQVFPEALPLSTDSPQVQILINTYLNQDAIADTSWDGKEDVPD GAVATIKQDTTITGDRRIGTEAVITVPSGGILTVDGKLTVDGNLVVNGTIFAEGTLIING SFSGSGTLIIGKNGRVEGDT >gi|229784108|gb|GG667627.1| GENE 43 64478 - 65161 581 227 aa, chain + ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 107 222 518 619 693 74 36.0 2e-13 MKGRLFEAIEAIPGFGDTGVIVGELKITTFVPASEGTSDRPDGMDGSFAFTLLLSKGNSK GGAGGAGTIRARGYSYIPGGRPSGSNDGEDYGSNAVNPDILHGTWERTETGIWKFRQTGG 
AYAVSRWGMVDGLWYYFDGEGRMLTGWQYINQQWYYLCTEEDTKTKIGLKEGAMATGWHF DSAYQAWFYLGTDGAMAIGQREIDGKRYYFNPESDGTKGALQKEEIL >gi|229784108|gb|GG667627.1| GENE 44 65353 - 65454 71 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLEIKKKKEGYARLTSNKEVLRNIQFGQSASGM >gi|229784108|gb|GG667627.1| GENE 45 65673 - 65807 152 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEECLAQAQSLMQLYAGLKSTESRCYTLVFGMDLIINPMCRPFS >gi|229784108|gb|GG667627.1| GENE 46 65875 - 66723 841 282 aa, chain - ## HITS:1 COG:CAC0872 KEGG:ns NR:ns ## COG: CAC0872 COG2233 # Protein_GI_number: 15894159 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Clostridium acetobutylicum # 2 272 164 431 435 151 33.0 2e-36 MAGGTSSPTYGSWQNWAIAFLTLIVVTVLNHFGKGIFKLASILIGIIVGYIVSLFFGMVD FSSIGAAAAFQVPQPLHFGIMFEPSSCIAIGILFAINSIQAIGDFTATTSGSIDREPTDK ELQGGIMGYGVTNILGALLGGLPTATYSQNVGIVTTTKVVNRCVLGLTAAILLAAGLIPK FSALLTTIPQCVLGGATVSVFASIAMTGMKLVMSEEMTYRNSSIVGLAAALGMGISQATA ALSTFPSWVVTIFGRSPVVVATIVAVLLNIILPREGKKANQK Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:13:13 2011 Seq name: gi|229784107|gb|GG667628.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld21, whole genome shotgun sequence Length of sequence - 59118 bp Number of predicted genes - 54, with homology - 52 Number of transcription units - 24, operones - 14 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 117 - 395 361 ## Closa_0987 hypothetical protein 2 1 Op 2 . - CDS 464 - 622 257 ## gi|266620684|ref|ZP_06113619.1| conserved hypothetical protein 3 1 Op 3 . - CDS 613 - 918 268 ## EUBELI_20547 hypothetical protein 4 1 Op 4 35/0.000 - CDS 1000 - 2862 203 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 5 1 Op 5 . - CDS 2862 - 4625 217 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 - Prom 4770 - 4829 3.9 + Prom 4756 - 4815 7.6 6 2 Tu 1 . 
+ CDS 4843 - 6150 1397 ## COG0519 GMP synthase, PP-ATPase domain/subunit + Term 6194 - 6231 3.3 7 3 Op 1 . - CDS 6105 - 6206 88 ## 8 3 Op 2 . - CDS 6254 - 7990 1735 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 8025 - 8084 5.3 9 4 Tu 1 . + CDS 8054 - 8164 78 ## + Term 8272 - 8334 18.0 - Term 8008 - 8044 1.3 10 5 Tu 1 . - CDS 8114 - 8251 96 ## gi|288870301|ref|ZP_06409704.1| conserved hypothetical protein - Prom 8280 - 8339 2.0 - Term 8328 - 8372 3.5 11 6 Op 1 . - CDS 8426 - 10945 2559 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain 12 6 Op 2 . - CDS 10938 - 12596 1547 ## Dehly_1104 GH3 auxin-responsive promoter 13 6 Op 3 . - CDS 12586 - 13176 476 ## Dehly_1105 hypothetical protein 14 6 Op 4 . - CDS 13173 - 14093 1000 ## COG1307 Uncharacterized protein conserved in bacteria - Prom 14230 - 14289 7.0 + Prom 14194 - 14253 8.2 15 7 Tu 1 . + CDS 14364 - 14861 591 ## Closa_0984 hypothetical protein + Term 14890 - 14930 6.8 - Term 14829 - 14871 -0.5 16 8 Op 1 . - CDS 14903 - 15433 678 ## COG1827 Predicted small molecule binding protein (contains 3H domain) 17 8 Op 2 34/0.000 - CDS 15439 - 17163 298 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 18 8 Op 3 . - CDS 17154 - 17927 755 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 19 8 Op 4 . - CDS 17962 - 18564 822 ## Closa_0981 signal transduction histidine kinase, LytS 20 8 Op 5 . - CDS 18557 - 19321 921 ## COG2820 Uridine phosphorylase - Prom 19420 - 19479 5.0 - Term 19530 - 19582 12.0 21 9 Op 1 . - CDS 19599 - 21011 1578 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 21086 - 21145 1.9 22 9 Op 2 . - CDS 21209 - 21868 677 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family - Prom 21900 - 21959 4.9 23 10 Tu 1 . 
- CDS 21961 - 22983 1028 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 23025 - 23084 4.0 - Term 23121 - 23175 -0.8 24 11 Op 1 . - CDS 23220 - 23717 595 ## Closa_2712 Rubrerythrin 25 11 Op 2 . - CDS 23722 - 24399 626 ## COG1773 Rubredoxin - Prom 24476 - 24535 4.8 26 12 Tu 1 . - CDS 24554 - 25318 600 ## gi|266620705|ref|ZP_06113640.1| putative cyclic nucleotide-binding domain protein - Prom 25367 - 25426 6.5 27 13 Tu 1 . + CDS 25624 - 26379 827 ## COG2211 Na+/melibiose symporter and related transporters + Prom 27226 - 27285 80.4 28 14 Op 1 . + CDS 27309 - 27923 579 ## COG2211 Na+/melibiose symporter and related transporters 29 14 Op 2 . + CDS 27943 - 29511 1398 ## COG3119 Arylsulfatase A and related enzymes + Term 29515 - 29572 17.6 - Term 29496 - 29564 20.3 30 15 Tu 1 . - CDS 29597 - 31438 1914 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases - Prom 31475 - 31534 17.7 31 16 Op 1 1/0.000 - CDS 32436 - 32825 337 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases - Prom 32854 - 32913 6.0 32 16 Op 2 . - CDS 32924 - 33811 861 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 33902 - 33961 3.8 - Term 33952 - 34003 9.1 33 17 Op 1 . - CDS 34049 - 35401 1304 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 34 17 Op 2 . - CDS 35419 - 36105 708 ## COG0274 Deoxyribose-phosphate aldolase 35 17 Op 3 . - CDS 36120 - 36980 769 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 36 18 Op 1 . - CDS 37943 - 38881 840 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 37 18 Op 2 . - CDS 38900 - 39658 834 ## COG2188 Transcriptional regulators 38 18 Op 3 . - CDS 39669 - 41192 1171 ## COG1070 Sugar (pentulose and hexulose) kinases 39 18 Op 4 38/0.000 - CDS 41217 - 42068 813 ## COG0395 ABC-type sugar transport system, permease component 40 18 Op 5 . 
- CDS 42072 - 43004 792 ## COG1175 ABC-type sugar transport systems, permease components - Term 43038 - 43070 6.3 41 19 Op 1 . - CDS 43081 - 44496 1645 ## EUBELI_20530 hypothetical protein 42 19 Op 2 . - CDS 44514 - 45509 899 ## BT_4420 hypothetical protein 43 19 Op 3 . - CDS 45527 - 46579 799 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 46627 - 46686 10.6 - Term 46742 - 46788 13.0 44 20 Tu 1 . - CDS 46837 - 47874 1283 ## COG3804 Uncharacterized conserved protein related to dihydrodipicolinate reductase - Prom 47948 - 48007 5.2 45 21 Op 1 . - CDS 48075 - 48950 1150 ## COG0789 Predicted transcriptional regulators - Prom 48980 - 49039 3.4 46 21 Op 2 . - CDS 49042 - 50115 758 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 50193 - 50252 4.3 47 22 Op 1 1/0.000 + CDS 50325 - 52769 1418 ## COG3250 Beta-galactosidase/beta-glucuronidase 48 22 Op 2 . + CDS 52790 - 54130 1078 ## COG2211 Na+/melibiose symporter and related transporters 49 22 Op 3 . + CDS 54143 - 55099 441 ## COG0657 Esterase/lipase + Prom 55107 - 55166 2.4 50 22 Op 4 . + CDS 55192 - 55458 332 ## MPTP_1640 esterase/lipase + Prom 56360 - 56419 14.5 51 23 Tu 1 . + CDS 56460 - 56984 233 ## COG0657 Esterase/lipase - Term 57013 - 57073 16.3 52 24 Op 1 . - CDS 57098 - 58525 1567 ## COG0232 dGTP triphosphohydrolase 53 24 Op 2 . - CDS 58602 - 58820 81 ## gi|266620732|ref|ZP_06113667.1| transcriptional regulator, LysR family 54 24 Op 3 . 
- CDS 58829 - 59116 253 ## gi|266620733|ref|ZP_06113668.1| GGDEF family protein Predicted protein(s) >gi|229784107|gb|GG667628.1| GENE 1 117 - 395 361 92 aa, chain - ## HITS:1 COG:no KEGG:Closa_0987 NR:ns ## KEGG: Closa_0987 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 92 2 90 91 112 65.0 5e-24 MEAVKKESFVPMAFFKKEAYTGSFKGMRYRVIKSEDQFEAVVYPEPYCFEATPDENKVKN TFPFTEEGRESVVNWLNDQYDSRKAEWDSAAR >gi|229784107|gb|GG667628.1| GENE 2 464 - 622 257 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620684|ref|ZP_06113619.1| ## NR: gi|266620684|ref|ZP_06113619.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 52 1 52 52 77 100.0 3e-13 MAVKVKDTFEGLEEEMMDGFSDEEMDTVISLVSRIYENLRKADGKDPAGDQE >gi|229784107|gb|GG667628.1| GENE 3 613 - 918 268 101 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20547 NR:ns ## KEGG: EUBELI_20547 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 99 1 99 148 76 42.0 3e-13 MDRKQEVRDLMFRTEMARKRRIHPVWTDMGLIAGQPRVLSQLFIKDHITQKELAEACCIE PATLSRALDRLEAMGYIIRNDNPQCRRSFLISLTEEGKRWR >gi|229784107|gb|GG667628.1| GENE 4 1000 - 2862 203 620 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 395 602 36 249 329 82 29 4e-15 MAKYIGYKKPKNTKKTLLRLISYMGRHKFSMFAVGILVIVSSLASVFGTYMLKPVINNYI IPKDIPGLVKALVFMGVMYAVGATATLGYNRLMVKTSQNIVKEIRKDLFNHTQTLPLAYF DSHTHGELMSHFTNDVDTIAEALNNSFTLLIQSFITTVGTMVMIVILSFRLSLIVLGCMA LMFLFIRFNSKRSRKYFMEQQKYLGEINGFIEEMVGGQKVEKVFNHEKKDYEEFCRKNER LRQASTNAMAYSGLMVPVVVSISYFNYAVSACVGGLFALAGLMDIGSLASYLVYVRQSAM PINQCTQQVNFILAALSGAERIFEMMDEEPEVDEGTVTLCRVEEKPDGTLGECSRKSGLW AWKITDEERGKSTLVPLTGDVRFEHVVFGYNEKKKILDDISLFAKPGQKIAFVGSTGAGK TTIINLINRFYEIQGGVITYDGIDITKIRKADLRRSLGFVLQDTHLFTGTIAENIRYGNL EASEEDVIRAAKIANAHSFIRRLPEGYETVLESDGANLSQGQRQLLAIARAAVASPPVLV 
LDEATSSIDTRTERLIENGMDALMEGRTVFVIAHRLSTVRNSKAIMVLEHGRIMERGSHE ELLEQKGRYYRLYTGQFELE >gi|229784107|gb|GG667628.1| GENE 5 2862 - 4625 217 587 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 311 550 264 502 563 88 27 1e-16 MFRRFYGYMKDYRKYGLFSCACVALEVMFELVIPLVMADIIDVGVANGDRSYIITKGIEM CILALMALGLGAAYSRFAALCGQGLGAELRKAEYQKIQSFSFSNTDHFTTSSLVTRLTSD VTVIQNAVTTGLRPVVRAPVMMVTALCLSFSINARLALIFLVAAPLLGIILFIIISHVRP LYSRMQRSVDLVNRIVQENLTAVRAVKAYVREGYEMEKFDAVNSELQTVSERAFGLACMN MPAMQYVMYATIVSILWFGGSLIFTGGMQVGELTGFLSYVLQVLNSLMMISNVFMMVTRS MASGKRILEVLDEEIEIKDESSEPLVVEKGGITFDHVSFKYKKNAKEYVLSDVSLDIKPG QTVGIIGGTGSAKTTLVQLIPRLYEATEGEVRIDGKPVKSYPLEHLRDAVAVVLQKNTLF SGTIEDNLRWGKEDASREEMEWACKIAGAAEFVERFPDGYDTYLDQGGVNLSGGQKQRLC IARAVLKRPRILILDDSTSAVDTATETGIRRGLRESLPDMTKIIIAQRITSVQEADQIII MEDGKVHAAGTHKSLLASDPIYQEIYSSQHFLPDGSKGELPQKGVSL >gi|229784107|gb|GG667628.1| GENE 6 4843 - 6150 1397 435 aa, chain + ## HITS:1 COG:BH0607_2 KEGG:ns NR:ns ## COG: BH0607_2 COG0519 # Protein_GI_number: 15613170 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Bacillus halodurans # 121 435 1 315 315 418 63.0 1e-116 MKQDMIVILDLGSTENTRLAREIRAVGVYSEIYPHDITAEELKKLPNVKGVIINGGPNHI VDGVRIDVLPEIYEAGYPVMAAGHEPASCENRAAEWPEDEDTRVSLIREFVFDTCHAQAN WNMTNFIADQVDLIRSQVGDKKVLLALSGGVDSSVVAALLIKAIGKQLTCVHVNHGLMRK GESESVIDVFKHQLDANLVYVDATERFLGKLAGVADPEQKRKIIGAEFIRVFEEEARKLE GIEFLAQGTIYPDIVESGTKTAKMVKSHHNVGGLPEDLQFQLVEPLRQLFKDEVRACGVE LGLPAEMVYRQPFPGPGLGVRCLGAITRERLEAVRESDAILREEFAKAGLDKTVWQYFTI VPDFKSVGVRDNARCFDYPVILRAVNTIDAMSASIEQIDYSVLLKITDRILKEVKNVNRV CYDLSPKPTATIEWE >gi|229784107|gb|GG667628.1| GENE 7 6105 - 6206 88 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLIPLMVSAEFWSLRRIEGYSHSIVAVGFGLKS >gi|229784107|gb|GG667628.1| GENE 8 6254 - 7990 1735 578 aa, chain - ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # 
Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 2 475 5 466 466 348 38.0 2e-95 MEREKLRSRKKKSLGIKGWAAVLGGIAVAGAAAVTVMYIQTGKQYRTVFFPNTTINGVDA SKKTVDEVKQLIASGIDGYTLTIKERGGAEEVISGDDIGLESVFDGSLEKLLEEQEPNDW FSHRKTPQSFEIDTMIQFNEEKLGEVIGALKCFDEAQVVKPQDAAMSEYVSGQGYSVIPA VEGNELDKEKVAAGAQEAVSGLQRELSLEELDAYVKPGIASDDAELLARVQALNRFVNVR VTYTFGDSREVLSGDTIKDWVGIGDDGNAYVSSSAVTEYVKSLASKYDTYNKAKTLNTSY GKTVRITGGSYGWRIDQAAEADELAAIIRSGESQTREPVYKQKAVSHGANDYGSTYVEIN LTAQHLFYYKDGSLVVESDFVSGNEAKGWATPAGVYPLTYKQKDAVLKGEDYKTPVDYWM PFNGGIGLHDATWRSSFGGTLYKNGGSHGCVNLPHSVAQKIFENISAGTPVLCYHLEGTE SKTTSTPAGKPIQPTTAAPETTAATTPAAPTTAATTASPESTAPSEAPVPKPTESQTAAP TKAPETEASTAASSESGEKGPGIPTGSSGNKEIGPGVS >gi|229784107|gb|GG667628.1| GENE 9 8054 - 8164 78 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSNSFLIRFYDIFITIKVYISVPLPPTQSSPVFRFF >gi|229784107|gb|GG667628.1| GENE 10 8114 - 8251 96 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870301|ref|ZP_06409704.1| ## NR: gi|288870301|ref|ZP_06409704.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 35 6 40 50 67 100.0 4e-10 MAAHWDIHSPCSGTHPEAGKKLSEREALNLEKAKNGGGLGGGKGD >gi|229784107|gb|GG667628.1| GENE 11 8426 - 10945 2559 839 aa, chain - ## HITS:1 COG:VCA0210 KEGG:ns NR:ns ## COG: VCA0210 COG3437 # Protein_GI_number: 15600979 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Vibrio cholerae # 645 827 324 508 522 156 42.0 2e-37 MYNYSFISVTALFCYMILFVAFMAAQKTKIIHSFLVVLAAFLLWTGGSLAMRMELWPGVE FWFHFSIVGLIILPFTFFNFIYALVDSDDTILWWAWLILAVVNNAANIKWGFYIPCPEVV RLANGSAGFVYHIGWPIAILFITMSFPIIHMFYLVHKSYRKNTVERNQLKPIFAGIVILF IGHAGTLVPAFHGFPTDVLSGIINAGFMFYALYKKHLFRLTLLVSRGVVYGISTLLVALL FANYINPLQEFIQRKMPVFADSSLLIVAVCFTLVTFLLSSVVKTFIDILFTKGELVQTEN LKKFSTAVSKSLEVGDIVEEISSVIQDTLAVESIYICLADQAGESFETAHRSRPISGVNV 
TIRKDNPMVEWMRQNRRCIQISEFRRTVLFKSMWEEEKRLIRELGIQCMLPLQDEENLAG IVLLATKKKGKSYGYDDLNFLESVSSIASIAVRNSRLYEKACIEARTDELTGLLNRKYFY ELLNQQYEKKPNRELALIIINIDDFKLYNQLYGMKEGDTALQNVAKIISATVGERGFVAR YSGKEFAAALPDFDVLAARNIAESIRMQILNMNKRATDYALKVLTVSCGICTIPYSASNI KQLVENADMAVFSVKRKGKNGVMVFDIGSHIDGSTRDGEPLVREDVYTSYQSTIYALTAA IDAKDHYTFSHSRHVAEYAGILAEEYGMSEDFIELVREAGLLHDIGKIGIPEHILNKPGK LEPEEYEIMKGHVEAAVSIIRNLPSLDYVIPAVIGHHERYDGKGYPRGIAGEDIPLSARI LCIADSFDAMVSKRSYKDRCSLEFAMEEILKQAGMQFDPQLAPLFVEAVREGKIKPEKE >gi|229784107|gb|GG667628.1| GENE 12 10938 - 12596 1547 552 aa, chain - ## HITS:1 COG:no KEGG:Dehly_1104 NR:ns ## KEGG: Dehly_1104 # Name: not_defined # Def: GH3 auxin-responsive promoter # Organism: D.lykanthroporepellens # Pathway: not_defined # 5 513 6 497 543 374 38.0 1e-101 MKFEDKLKERKDWELWKEYCGFLDLTIEEFMAIQNRLMLEQISKWKASGIGKAILKGRNP ETIEEFRKMVPLTSYEDYADSLSLKDKSILPDDPVIWIQTTWEGGRHAIKLAPYTRSMLE TYQNNMLGCLMLSTSSGRGDFDVNPKDTILYALAPLPYATGIFPLLLNDAISIEFLPPVK EAQKMSFSERNKKGFKMGLKKGIDFFFGVGSVTYYVSLSIASLGSGHKSGGGSGSGDGKK KISISPAMVIRLLKAKHICRKEGRDLLPKDLFRLKGFMCAGTDNRLYRDDLEKLWGVRPM EIFAGTEPTCIGTEIWSRDGMYFFPDACFYEFIPEKEMERSLADSSYEPRTCLMNEVEEG EKYELVISVLKGGVFMRYRVGDVYRCIALENERDHVRFPRFEYIDRIPTVIDIAGFTRIT ENSISRVIGLSGLKVKEWTAAKEVTENKRPFLHMYVEMEPECLFSSAVTVNILKEHMGVY FKYVDEDYKDLKKILGIDPLEITILRCGTFDAYYRGGKEKMRRINPTQYEIGELLKSQEL VSAFGRRPQRSV >gi|229784107|gb|GG667628.1| GENE 13 12586 - 13176 476 196 aa, chain - ## HITS:1 COG:no KEGG:Dehly_1105 NR:ns ## KEGG: Dehly_1105 # Name: not_defined # Def: hypothetical protein # Organism: D.lykanthroporepellens # Pathway: not_defined # 1 187 1 183 192 90 34.0 2e-17 MSYLAAGVIAFGMYLLYDLNSFTWKNRILHTFFLIGSVILAVSTAVCFMTLVKAYGPPVW RIVIFGALAVSQAALLVYTLFFALPFEKTYTRLEENPPVYTEGVYSLCRHPGVVWFFFLY LFLALLTGQRLMMAVCICYSGLNVLYVWFQDRVTFRRTLSGYEAYQRETPFLIPNGKSIR KCAGYFRTVEEDGDEI >gi|229784107|gb|GG667628.1| GENE 14 13173 - 14093 1000 306 aa, chain - ## HITS:1 COG:FN1927_2 KEGG:ns NR:ns ## COG: FN1927_2 COG1307 # Protein_GI_number: 19705232 # Func_class: S Function unknown # 
Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 4 298 2 283 285 123 28.0 5e-28 MENSKFAVISDSSCDLPEELAAAKEITVVPFYVSFDGETYKKEGTEIGIREFYREMVEQP GVYPKSSMPSIQDYVDVFQPFAQAHTPVLCICITTKFSGSMQSALNAREMVLSQYEDAQI EVIDATVNTVLQGLFVLEAVSMREEGMTLSEAVAALDEMKESGRIFFTIGSMDYLQHGGR IGKVAGAAASVLGIRPIITLKQGEIFPSGISRSRRKSIDKVIALLLDYLKENGEPMSSYR FTIGFGYDREEAVLFKNQVLAALRTGGYEMEDAWINQIGATIAVHTGPYPLGVGIIRRHH KERQGE >gi|229784107|gb|GG667628.1| GENE 15 14364 - 14861 591 165 aa, chain + ## HITS:1 COG:no KEGG:Closa_0984 NR:ns ## KEGG: Closa_0984 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 163 1 163 169 171 54.0 1e-41 MAQFTPVYGTITSITPLQTADGTQDCTLLFFIMTQDMGPVTFTVTPLTYVEGQETLKPGD PIVAFYDNSVPAPLIYPPRYHAVLMAEADDGLFAAFDYFDENLVNTDATLKLNLMEKGGT EVVMANGQIYYGTPANHYLLVLYSTTTRSIPAMTTPEKIVVFCSE >gi|229784107|gb|GG667628.1| GENE 16 14903 - 15433 678 176 aa, chain - ## HITS:1 COG:lin2129 KEGG:ns NR:ns ## COG: lin2129 COG1827 # Protein_GI_number: 16801195 # Func_class: R General function prediction only # Function: Predicted small molecule binding protein (contains 3H domain) # Organism: Listeria innocua # 2 175 1 172 173 125 37.0 5e-29 MMERVDGEKRRQQIMEILNQETAPVSGTELAKRAGVSRQVVVQDIALLRAENKEIMSTNK GYVLYKPQPGRPKAHVRVFRVNHDTEHMLDELQTIVDYGGRILDVSIEHTLYGQISVDLI INNRLDAQEFVNQMAMSRDEPLKALTGGCHYHTVAAESEKNLDMIEMELQRKGYLA >gi|229784107|gb|GG667628.1| GENE 17 15439 - 17163 298 574 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 327 557 144 375 398 119 33 4e-26 MALIEVTNLKYRYPHTEKLALDGLHFQVEKGSFIGIAGENKAGKSTLCQALVGLVPTMFK GAYGGKIMIDGLEAGKTPVAMLCQKVGLIFQNPFNQLSGAKDTVFGEIAFGLQNLGVPRE EIVRRVEENLRLLDIEQYRDRNPFDLSGGQTQRVAIASILAMQPEIIVLDEPTSQLDPQG SEEVFRVVETLSKTGITIIMVEHKMEKLASYCDKILLLHDGKQIAYDTPEKIFSRDDLEE LGVEAPVFTRVCRKLSVYREEDGKRLYPVTMEQTMALKDQFPPELGGKYKEAEQPGGETE RTTGASDRTLREIFRIKDLSFHYLEHVPIMEDLTLTLDTRPAAVIGQNGAGKTTLVKLLK 
GLLKPVSGTILFGGNDISERTVAQLAGSVGYVFQNPDDQIFKNRVMDEVMVGPLNIGMGR EEAERRAREALSMVGLLDAAEENPYDLDLSERKMVAIASVVAMDTQVLILDEPTIAQDAR GRRVIGSIVRKLSDSGKFVLAILHDMDFVAEYFERVIVMAHGKVLADGPKETVFYEEDAL AAAKLEQPHTAALCRKLGYEGAFMTVNDIKDRTV >gi|229784107|gb|GG667628.1| GENE 18 17154 - 17927 755 257 aa, chain - ## HITS:1 COG:MA1748 KEGG:ns NR:ns ## COG: MA1748 COG0619 # Protein_GI_number: 20090600 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Methanosarcina acetivorans str.C2A # 1 249 12 285 298 77 24.0 3e-14 MKSISLYVDKDTYLTRLHPFAKMLYIAAAISVPVIMGTLWMFGVFIAVSLGLLISGKIMK KVLPLIAFSFTIIITVFLIHGLFNRENQNVLAALGPLKFYKEGLLYASRIGLNILNMLLA FAMFVLTTKPAALVEDLEQAGFSPRFGYMISSVFQIIPQMMGTMNTIMDAQRSRGMETEG SLFVRAKAFIPLISPVVSSSLINTRERAIALEVRGFDSKTKKTFLNDHKIKGRDRAFMIV MVLLIAGAVVWRVMRWL >gi|229784107|gb|GG667628.1| GENE 19 17962 - 18564 822 200 aa, chain - ## HITS:1 COG:no KEGG:Closa_0981 NR:ns ## KEGG: Closa_0981 # Name: not_defined # Def: signal transduction histidine kinase, LytS # Organism: C.saccharolyticum # Pathway: not_defined # 1 200 1 200 200 324 91.0 1e-87 MSNIFKTKLNAACIVLIPACIGINYLGKLFASVLKLPLWLDSIGTCIGAVLGGPIIGGIC GAANNLIYGFTTGDSITLVYALTSLGIGVAVGIMARLGRMEKFSGAVLTACVAGFVAVLI STPLNIIFWGGTTGNLWGDAVFAWSQASGLPVALGSFLDEVIVDVPDKLITLFIVFAIIK GLPKKLTSLYEMGDEVESLD >gi|229784107|gb|GG667628.1| GENE 20 18557 - 19321 921 254 aa, chain - ## HITS:1 COG:L54803 KEGG:ns NR:ns ## COG: L54803 COG2820 # Protein_GI_number: 15672821 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Lactococcus lactis # 1 242 1 238 238 219 48.0 3e-57 MEEFMPHIRLSKEQAAPHVLLPGDPQRLDRIAECLEDVKELAYNREYRSLRGVYKGVPVM AVSTGIGGASAAIAVEELVRIGVRNMIRIGSCGALQKGIRLGDLILVNGAVRDDGASVTY VDSIYPAIPDTELLAACMMSAEELGAKFHVGIARSHDSFYIDDEEAVSTYWRERGVLGSD METAALFVIGRLRGVKTASILNTVVEWEDSLEENINSYTGGESAMMQGERMEILTALEAF VRIDKNKQGGNSYE >gi|229784107|gb|GG667628.1| GENE 21 19599 - 21011 1578 470 aa, chain - 
## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 34 469 39 465 466 288 35.0 1e-77 MSIRMKKTGMLVRIALMAAVLILLPAQAVLADDTTRFVPGTKVNGIGIGGLTVDEAKAQI EGFYSAEYKLTIKERDGKTETLKGGDVGYKATAPDGLSAILDAQNASGRVSGPSTDNSHT MSMSVQYDEGALTNAVNGLGCISGADIVVTQDAHITAYQEGTPFTIIPEVQGNNVDQDKV LALVKEAVKAGSSEVDLAASGCYATVRVTAEDAALKDLCDTMNRCRDMVITYQFGDQTVT LEGETICTWILGSTQDGLVDVNYDKAAAFVKSMADQFDTAGKTRVFHTTTGKDLNLTGPY GWQIDQAGETASLIAMIRTGQSQAREPLYSKKGQDRTVDWGNTYVEIDLTSQHVYMYQDG NLAWDAPCVTGNVSKNYTTPEGIYSLTYKQTDRILRGEKKADGTYEYESHVDYWMPFNGG IGLHDASWRSKFGGTIYQYGGSHGCINLPPAKAKTLYDMVYTGIPVICYN >gi|229784107|gb|GG667628.1| GENE 22 21209 - 21868 677 219 aa, chain - ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Clostridium acetobutylicum # 60 216 6 173 174 165 51.0 6e-41 MKKRIMLLMVLTAAVLAGCSGKAGAADQTISSSAAETAAEETKTDAGTAGETVEAVKHHV KINVKDYGTISVELNPTYAPITVENFLSLAESGFYDGLTFHRIIDGFMIQGGDPLGTGLG GSDTTIKGEFSSNGVENPLSHTRGAISMARSQDKDSASSQFFIVHQDSPHLDGDYAVFGY VTDGMDVVDKICEDTKVTDRNGTVAKENQPVIESVEVVD >gi|229784107|gb|GG667628.1| GENE 23 21961 - 22983 1028 340 aa, chain - ## HITS:1 COG:TM1297 KEGG:ns NR:ns ## COG: TM1297 COG0667 # Protein_GI_number: 15644052 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Thermotoga maritima # 1 191 1 197 285 122 36.0 1e-27 MADVTLGKTKITVNKNGFGALPVQRVSVEEAGTLLKKAFNHGITFFDTARAYSDSEEKLG LAFEGIREKIYIATKTAADTAEGFRRDLEESLRLLKTDYIDLYQFHNPAVCPKPGDGSGL YEAMLEARERGTVRHIGITNHRLHVAAEAVESGLYETLQFPFSYLASEKELELVEACKAA DMGFIAMKALSGGLITNSAAAYAYLAKFDNVLPIWGIQREHELDEFLSYIDNPPVMTEEL 
EAVIEHDRKELLGEFCRGCGYCMPCPAEIEINTCARMSLLIRRSPSAGHLTEASQNMMMK IKDCLHCGQCSSKCPYGLDTPALLERNLKDYEEILGGKAY >gi|229784107|gb|GG667628.1| GENE 24 23220 - 23717 595 165 aa, chain - ## HITS:1 COG:no KEGG:Closa_2712 NR:ns ## KEGG: Closa_2712 # Name: not_defined # Def: Rubrerythrin # Organism: C.saccharolyticum # Pathway: not_defined # 1 164 1 163 396 194 59.0 9e-49 MHAVRNIRLCTKDCLCLYVCPTGATDTETGQVDASKCIGCGICANACPSGAISMVPEKYP PQQKKEEKTTERLNKLAASKVKQEAMAAAIGRGTADPVVKQFAAALERSNRLMAEDLYRE AGYMLPQSRNTHELLMSMVSDDDKLPEDFPKEAARRLLDLLEVNE >gi|229784107|gb|GG667628.1| GENE 25 23722 - 24399 626 225 aa, chain - ## HITS:1 COG:PAB7224 KEGG:ns NR:ns ## COG: PAB7224 COG1773 # Protein_GI_number: 14521100 # Func_class: C Energy production and conversion # Function: Rubredoxin # Organism: Pyrococcus abyssi # 1 49 1 49 53 79 71.0 5e-15 MKKFICSVCGYIYDEAAGIPDRGIEPGTKWEELPPDWVCPLCGAPKSAFREKKEEGAKAV SQEPAGKSGEGGGEELRELSFGELSAVCSNLAKGCEKQYLAEEMELFLQLAGYFKTKAAP ADMENTGELLAMVSKDLETEYPQAEAAAEESGDRGARRALVWSGKVTRMLEALLERYENE QRAFLEHTKVFVCDICGFIYIGDEAPAICPVCKVPSLKIMEVQRR >gi|229784107|gb|GG667628.1| GENE 26 24554 - 25318 600 254 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620705|ref|ZP_06113640.1| ## NR: gi|266620705|ref|ZP_06113640.1| putative cyclic nucleotide-binding domain protein [Clostridium hathewayi DSM 13479] putative cyclic nucleotide-binding domain protein [Clostridium hathewayi DSM 13479] # 1 254 1 254 254 494 100.0 1e-138 MNHTTEHTLYQIMQFVQEQHRLEHPVTSDMIRPYLNRMLAIRMVLKDDYITTVHTPIKKV YYVMSGSFSMIRSSVDGKNTIRIQKAPTFLGVDFTVLSLDGYYADNLALEDCLVLEINQK YFLESIRQNGEMCYEVLYSICNKFFQTSFRYDHTLFYDPCTKLMIYILDHWIANHGQKTK YTIAVPNTRIADEIGVSIRTYYRAVSKLKTDHFISVVSGNICVTYEQIHHIKSYLQNSKA VTALPAFLENSTLM >gi|229784107|gb|GG667628.1| GENE 27 25624 - 26379 827 251 aa, chain + ## HITS:1 COG:yagG KEGG:ns NR:ns ## COG: yagG COG2211 # Protein_GI_number: 16128255 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 18 200 12 200 
460 131 37.0 1e-30 MGKSKTIGKPDYFNFGKYAVGGIGFSMTNQFILVYLTFFCTDIFGISSLTVAGLMLVTKI IDAVTDPLMGFIADQTRTKQGRYRPYLIYGAPILGVLVYLLFSAPELSSTMKVVFIYVVY ILYSLAFTVVGVPFTSLVPVLAKDSTERTMVVSWKNVMIQVGRFFITTFALPLVEVFGGG ATGWGRFGALVGVLITLCFWCVAWGLKPYDTIDLKIERPKVNLLKETQLITKNKPLLMLM SAFGTDMIANS >gi|229784107|gb|GG667628.1| GENE 28 27309 - 27923 579 204 aa, chain + ## HITS:1 COG:BS_ydjD KEGG:ns NR:ns ## COG: BS_ydjD COG2211 # Protein_GI_number: 16077683 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 6 202 265 461 463 103 33.0 2e-22 MNSALLAVNMYYFKYVLGRVDLVPITSVALTMTGVLSNLAIPFLTKKVGKKRLYWYGSLF SIIPLAVLFVKPVVPSAALIVLITLFGLISTVPSALAWAMLPDCVDYAEYVTGVKGNGVV SSTFSFMNKFCGAIGAFLASYLLGIFGFAANQEQTQTVLFIIVVLRFGIPILGYIASLLS MHFYELTDERNMEIRHALDERAQF >gi|229784107|gb|GG667628.1| GENE 29 27943 - 29511 1398 522 aa, chain + ## HITS:1 COG:mlr3684 KEGG:ns NR:ns ## COG: mlr3684 COG3119 # Protein_GI_number: 13473175 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mesorhizobium loti # 2 455 6 509 522 169 29.0 8e-42 MKKPNILLITTDQQRYDTICAMGYDFMETPNLDRLAAEGCCYPNAYSSNPVCMAARHQII TGLTARYHRFDDNYFESDPKTIPFDLPTLPQLLSDAGYDTIAIGKMHFQPCRRHNGFTKM ELMEEIPRYLEDDEYAKYLKENGYGHLQSPHGVRHLLYMVPQRSLIPEEHHGSTWVAKRS VYHLKENGGKRPFFLWSSFIAPHPPFDVPEKWADLYKGKELPPLKESKTPISGIAEWKKY IADYPNESYLRRARELYYASISFVDYNIGTILQQLKDMGEYDNTLILFTSDHGEMLGDHG TFQKMLPYDGSVRIPFIMRYPDQLKAGSEDTRFIDINDILPTVLDVAGVPCPNPERLPGE SIFAVDGRKDRTVEYVEHSHGKLRWISLITKSYKYNYYYGGGKEELFDLENDPDETTNLL YHAPTPEVLAVKEKLKARLTAIEAQYGLKGCVENGQFVEQEEFKPFKYHENNPPLFPKKQ GGDYISLEEEVKRAVAHEDVVKLEELDIPYYEERGVLSREKL >gi|229784107|gb|GG667628.1| GENE 30 29597 - 31438 1914 613 aa, chain - ## HITS:1 COG:YPO3001_1 KEGG:ns NR:ns ## COG: YPO3001_1 COG0446 # Protein_GI_number: 16123182 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Yersinia pestis # 1 219 
235 450 459 219 51.0 9e-57 METDLVLLSIGVRPNSSLAKEAGITLNQRGGIVVDSHLRTSAEHVYAVGDVIEVEHSVLG TRTMIPLAGPANKEARICADILAGDEKEYKGTMGTSVARVFDYTAALTGVNEKTLKAMGK KKGTDYETVLISQKSHAGYYPGAVTLFLKLIFGRKGEIYGAQIAGQEGADKRIDTIASVM GMKGTIYDLEELELAYAPPYSSAKDPVNMLGFTAENVLNHLVSFMSSDELDRLLASEREG DRPLVLDVTEDVERMVYHIPGSYHIPLGQLRNRLGELDRERQIVVYCAIGVRSYNAARIL SQNGFKDVKVLEGGTSFYKSMHHRDFAEQERSEPEQGKHVKEREVKEGTQVKGTQMEGTQ IHTAAAKSVKIVDCCGLQCPGPIMKVHEAIGEMEDGEILEVSATDMGFSRDIASWCRQTG NTLVDTKRKGQESIVLIRKGLENREAEQDTVSTVPRTLNGKTMVVFDGDMDKALAAFIIA NGAVAMGSPVTMFFTFWGLNILRKPEKRKVEKSLIEKMFGAMMPRGAGKLKLSKMNMGGM GTKMMKKVMADKHVDSLEMLMNQAMKNGVKLVACTMSMDVMGIRKEEIIDGVEFAGVATY LGDAENSSVNLFI >gi|229784107|gb|GG667628.1| GENE 31 32436 - 32825 337 129 aa, chain - ## HITS:1 COG:FN1903_1 KEGG:ns NR:ns ## COG: FN1903_1 COG0446 # Protein_GI_number: 19705208 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 4 127 3 125 469 142 54.0 1e-34 MGNKTVIIGGVAGGATTAARLRRRDEKRQIILLERGEYISYANCGLPYYIGDVIKSRDSL LLQKPEVMKARFDIDVRTSSEVTAIHTDRKSVTIKDKKTGETYEESYDDLVIATGSSPLI PPVPGIDGP >gi|229784107|gb|GG667628.1| GENE 32 32924 - 33811 861 295 aa, chain - ## HITS:1 COG:BH2747 KEGG:ns NR:ns ## COG: BH2747 COG0697 # Protein_GI_number: 15615310 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus halodurans # 9 291 11 289 302 140 33.0 3e-33 MTFLEKHPMIMIVIGIIGISLSAIFVKYSEAPSVVTAAYRLLWTVALMTPVVLGKKEIRR ELAGSDKKTVALCAVSGIFLALHFTSWFESLNQTSVASSTAIVCTEVIWVALGYCLFLKG RLSKTAVGSIAVTVAGSLLIALSDYSAGGNHLTGDGLALAAAVFSAVYTLIGREARKHMS TTVYTYIVYVFCALVLCISTAASGLSFTGYGAASVVVGFLLAVFSTLLGHSIFSWCLKFL SPSFVSASKLCEPVVAAMFAAVLFAEIPALLAVAGGIVTISGVLLYSREQRAITS >gi|229784107|gb|GG667628.1| GENE 33 34049 - 35401 1304 450 aa, chain - ## HITS:1 COG:BH1923 KEGG:ns NR:ns ## COG: BH1923 COG2723 # 
Protein_GI_number: 15614486 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Bacillus halodurans # 3 443 6 440 447 368 43.0 1e-101 MGFPEKFLWGTATASYQVEGGFEDGRGYTVWDDFCAIPGKVLDGSDGSMACDHYHRYKED VALMKELGFKAYRFSIAWSRILPEGTGEVSEAGLQFYENLVDELCANDIIPCVTLFHWDL PYALYRRGGWQNPDSPEWFAEYTRVVAERLGKKVPYFITFNEPQCFIGLGHVTGEHAPGN LMSRKSVLEMTHHVLMAHGKAVKVLRDYAPEAKVGIAPTSTVACPATEKPEDVEAARKAY FDVIDQENYMWSVSWWSDPVFFGKYPEEGVKLFAGDMPCIGEQDMELISQPVDFYGQNIY NGYPVASDGKGGYSCLPRKRGASRTACGWPVVPESLYWGPKFLYERYHKPIFITENGISC TDSICLDGKVHDAERIDFLHRYLLALKKCVEEGNEVEAYFVWSLMDNFEWAKGYTERFGL IHVDYETQVRTMKDSAWWFKKMMESNGNDL >gi|229784107|gb|GG667628.1| GENE 34 35419 - 36105 708 228 aa, chain - ## HITS:1 COG:BH1352 KEGG:ns NR:ns ## COG: BH1352 COG0274 # Protein_GI_number: 15613915 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Bacillus halodurans # 11 222 4 215 224 185 49.0 5e-47 MNVLKDLTADSLAQYFDHTQLHAEATEKDFIKLCEESKKYHFRMVAINSAPVALCKGLLK NTDIHVGAAISFPLGQTSIPVKVFETENAIEAGADEIDYVIHIGKLKDGDYNYIEREMAE IVSACRRADVISKVIFENCYLTQKEKEELCHIALNVRPDFVKTSTGFGTGGATAADVRLM KGIVKDHIRVKAAGGIRTLKDTLDMIEAGADRIGSTASTAIVEEFMRM >gi|229784107|gb|GG667628.1| GENE 35 36120 - 36980 769 286 aa, chain - ## HITS:1 COG:BS_yrpG KEGG:ns NR:ns ## COG: BS_yrpG COG0667 # Protein_GI_number: 16079738 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Bacillus subtilis # 1 286 26 316 316 331 55.0 9e-91 MDAALDAGINFFDTANNYGFLTGNEGITEEIIGRWFAQKGSRRERVVLATKVHEDMKNPE DGPNSAAGLSIYKIRRHFKESLRRLQTDHVEIYYMHHIDRSITWTETWDAFQNLYDRGLV DYIGASNFPAWEIARAQGEAEKRNFMGISVEQDRYNLTCRLPELEVLPACEALGVGFVAW APLGGGMLAGKSEDSLRRIGLADSNLEKAEAYRKACGEAGLKEADAALAWLLNNPRLTAP IIGIRTMEQLNSSLHALDIKLSQDFVEELDRIFPGPGGPSPEAYAW >gi|229784107|gb|GG667628.1| GENE 36 37943 - 38881 840 312 aa, chain - ## HITS:1 COG:AGl618 KEGG:ns NR:ns ## COG: AGl618 
COG4225 # Protein_GI_number: 15890425 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 94 306 153 385 397 92 30.0 1e-18 MDKRKKAAMALLCMQRYSWEQGTAMQAFLESGEDEIVLKMAKEAVYRSIEDGRAAILGTL EAVTDPCSVGEALVYAVAKTQDPFLEGGLKKLINWALKLAPRSEDGAVYHVMSEPQIWVD SFYMLPPFLAAAGYPKEAVDQIHRYWSRLYLPDKKLLGHIWDDGKKEWIRRDCWGVGNGW AMAGIARVIDLLPDSFVKEKEELAALDTGLIEAVSAYIRPDGMVHDVLDNPDTFTEVNLP QMFAYTVYRGMSSGWLNEKYLDQAGLCLSTARAHIDDYGFIQNVCGAPDFNKAGMAPEGQ AFYLLLETAAEN >gi|229784107|gb|GG667628.1| GENE 37 38900 - 39658 834 252 aa, chain - ## HITS:1 COG:BH0419 KEGG:ns NR:ns ## COG: BH0419 COG2188 # Protein_GI_number: 15612982 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 12 252 2 240 240 154 36.0 2e-37 MDWEKVFNYRSLDKLLPIPMYYQIKQMILEAIKNGTLTVGDMIPAEMEFCEFCSVSRPTV RQALSELVSEGYLTRLKGKGTYVSRPKIQAMFFNRLLSFNEEMKERGLTPSTRVLALKVV PGCPEACKVLGLPDDGKMILLERIRYANGDPLVYLCTYLPYPAYEKLLDCDFVENSLYKL LETLYGDRVERVHRTIEAVNSEKKDAELLEIPKGSAECLVKTVGYSGQDKPVEYSIARYR GDRNQFSVDLYR >gi|229784107|gb|GG667628.1| GENE 38 39669 - 41192 1171 507 aa, chain - ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 4 494 3 494 500 346 37.0 5e-95 MGEYILAHDLGTSGNKATLYDFEGRLKGSMVCSYDTHFKEEGWAEQDPEDWWRAVCTSAK GLMETTGVSKEEIACVSFSGQMMGCLLIDGKGKPLRPMIIWADTRAREQERKMVEAVGEE RGYKITGHRLSASYSAAKLLWVRDQEPAVYSRAAKMLNAKDYMIYRLTGEIVTDYSDASG TNLLDITQKKWSDELLECFGIERNLLPELYPSSKIVGIITEEAARETGLTAGTPVVAGGG DGSCACVGAGVAAEGKTYCVLGSSSWISAAAAKPFFDEARRTFNWVHLDPELYTPCGTMQ AAGLSYKWIRDVMCGEERKEAKFMGIDTYELLGHLAALSPPGAGGLLYLPYLLGERSPRW NSDARGCFVGLGMSTEKKDMIRAVLEGVGYNLKVILDILENRERVDSVILIGGGAKGREW 
LQILADIWQKTISVPAYLEEATSMGAAVCGGVGIGAFPDFRVVERFNRIVDEIRPDKKNT MVYERMYGAFNEAYDALCSVFPSISVK >gi|229784107|gb|GG667628.1| GENE 39 41217 - 42068 813 283 aa, chain - ## HITS:1 COG:BS_araQ KEGG:ns NR:ns ## COG: BS_araQ COG0395 # Protein_GI_number: 16079925 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 14 283 13 281 281 142 35.0 9e-34 MVKKQRKPVTFRELMIYIVLVLLTCLCLFPFFMLMINATRSHTEIQKGFTFLPGGAIFMN LRNVFHNENLPVLRGTLNSLVIAGGSAALAVYFSSLTAYGLHAYDFRGKKQAFSFILLVM LVPSQVSALGFLREMLALGLRDSFLPLILPAVASPTVVFFMKQYMASAVPLELVSAARID GANEFYAFNRVIIPIMKPAFAVQGIFAFVASWNNYFMPSLLLDSKMKKTLPILIAQLRSA DFLKYDLGQVYMLITIAILPTVIVYLLLSKNIVGGIALGGIKG >gi|229784107|gb|GG667628.1| GENE 40 42072 - 43004 792 310 aa, chain - ## HITS:1 COG:BH1925 KEGG:ns NR:ns ## COG: BH1925 COG1175 # Protein_GI_number: 15614488 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 19 300 20 294 305 141 35.0 1e-33 MAKNYLRKKKSGVNYDKWGYLFIMPFFLCYAVFSLFPIIFTFYNSFFENYMNGLKQVGPN FAGFENYKRILADKDILKYFWNTIYIWILGFVPQIIFSLLLAAWFTNRRLNIRGQRFFKT VIYMPNLIMAAAFSMLFFTLFSDGGPVNDVMMKWGIISEPIRFLAGRNTTRGLIGLMNFM MWFGNTTILLMAAMLGIDVSLFEAAECDGASPGQIFKKITIPMIKPILLYVIITSMVGGI QMFDVPQILTNGEGVPNRTSMTIVMYLNKHLYSKNYGMSGAVSVILFFVTVLLSLVVLRV IREPSQEGGA >gi|229784107|gb|GG667628.1| GENE 41 43081 - 44496 1645 471 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20530 NR:ns ## KEGG: EUBELI_20530 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 6 469 3 451 451 567 61.0 1e-160 MKKNMEKLTAVFMAAVMAASLLSGCGSKKETAPASVPDQTQGEVKESQAEGTDSEGKTLR IYCWNEEFQARFNDIYADKLPADVKVEWVITPTQNNAYQDKLDQDLPNNEKADADDRIDL FTVEADYILKYVDSDVTLDVRGDIGLTDEDLKNQYSYTQEIATDQSGKLKAVSWEADPGV FVYRRSIAKEVLGTDDPDKVQEAISDWNKFEEVAGKMKEKGYKMLSGYDDAYRVFSNNVS APWVNANEEVVIDPMIDQWIDQTKDFTDKGYNNKTQLWSTDWNTDQGPAGKVFGFFYSTW 
GVNFTLLGNALETPLAEGGKKEAGNGIFGDYAICYGPASYYWGGTWICAARGTDNAELVK DIMYTMTCDSESMKKLTYDFEDFTNNQQAMQEISDSGYTSDFLGGQNHIELFLKVAPNVS MKNVTAYDQGITESFQTAMADYFNGQVTKEQAVDNFYTSVIEKYPNLKRPQ >gi|229784107|gb|GG667628.1| GENE 42 44514 - 45509 899 331 aa, chain - ## HITS:1 COG:no KEGG:BT_4420 NR:ns ## KEGG: BT_4420 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 330 36 362 362 380 54.0 1e-104 MKKPELFLKLPEEYVCTPDGMEIDRNGKLILSCPNYAQDDMSGCIVRINRDKEVEKWFDV PVHPETKLARNMGIAFDRDWNLYICDNQGWSEKEEVLYKGRILKVTADEDGTIKETVVVA DHMEHPNGIRIRGDYLYVTQSYLHPVKDPSGKLVSCVYRFRLDEHDVHITNTLEDPNILT TLITENPECQYGADGIAFDSKGDLYVGNFGDGTVWKITLNEDGTVKEKEIYARDPEELVS TDGMVFDDEGNLYIADFCANAIVKVDQNRTVKRLAQSPDTDGFDGGLDQPGEPIIWNGSL IASCFDLVTGPGKVNTRHEMPATMSEIKLDQ >gi|229784107|gb|GG667628.1| GENE 43 45527 - 46579 799 350 aa, chain - ## HITS:1 COG:SMa1417 KEGG:ns NR:ns ## COG: SMa1417 COG1063 # Protein_GI_number: 16263228 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Sinorhizobium meliloti # 1 335 1 326 341 117 27.0 5e-26 MKGLAVHENRKLQIVELPMPEYNEYQALVKVLACGICNGTDSKIIHGTFKGVDTYPILLG HEAVGKVVETGKHVKYLKPGDIVLLPFLYQKTEQMIPTWGGFCEYALVGDDRALEEAGIG ISSPLRDGSYPAQSVMTSKDGIDPAEMVMVITFREVLSAIRRFGFRENESVLIYGAGPVG QCFIKFSKLLGMSEVICVDITDEKTAEAACMGADHVFNSRKTNVVREVRKLLPDGVDYVV DAVGLGSLINEAMELIADHGNICCYGISPKMTQEIDWSKAPYNWNLQFVQFPEKAEEAEA HSQIMAWIRAGVLNPKDFISDQYPFEKILEAFEKVEQHLPDTKKIVITYT >gi|229784107|gb|GG667628.1| GENE 44 46837 - 47874 1283 345 aa, chain - ## HITS:1 COG:mll2179 KEGG:ns NR:ns ## COG: mll2179 COG3804 # Protein_GI_number: 13472019 # Func_class: S Function unknown # Function: Uncharacterized conserved protein related to dihydrodipicolinate reductase # Organism: Mesorhizobium loti # 11 333 3 325 329 122 25.0 9e-28 MNYEKIRVVQYGCGKMAKYILRYLYEKGAEIVGAIDVNPEVVGMDVGEFAELGFKLGVTI RDDADAVLDECDADIAVVTLFSFMDDCYSHFEKCVSRGINVVTTCEEAIYPWTTSSAITN 
KLDLLAKENGCTIVGAGMQDIYWINMIGCVAGGVHRIDRIEGAVSYNVEDYGLALAEAHG AGLTPEEFEAQIAHPTTLEPAYVWNSNEALCNKMGWTIKSQSQKCVPYFYDTDLYSATLG KTIPKGNCIGMSAVVTTETFQGPVIETQCIGKVYGPDDGDMCDWKIIGEPDTTFYVTKPA TVEHTCATIVNRIPTILNAPAGYITVEKLEELKYLSYPMHMYLEH >gi|229784107|gb|GG667628.1| GENE 45 48075 - 48950 1150 291 aa, chain - ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 11 113 4 106 117 71 39.0 1e-12 MIKLTNLDVKKMTIGQMARMNHISEQTLRLYDREGLLSPLGRDEKNGYRYYDIRQSAQLD MIQYMKALGMPLKEIRVHMKKWDTGRLKQLLVENRDAIDRRIEELNLQKRAIERTLDSYE RYEAAPPDGSIVLEYIPKRQMYVKNTNVNFYDYEIDEYERILRDLKEDMAAHAISPLYFA NAGTILRRENFMKGRLYSAEVFVFVDREYVDEELITVIPPSPYLCIYCDRFEKEKEYIDR LLVEIEEKHYTVTGDYICEVIVEMPLDYTERGMFLRLQVPIRMGAYQYGES >gi|229784107|gb|GG667628.1| GENE 46 49042 - 50115 758 357 aa, chain - ## HITS:1 COG:SMa0319 KEGG:ns NR:ns ## COG: SMa0319 COG2207 # Protein_GI_number: 16262625 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Sinorhizobium meliloti # 129 330 130 331 333 72 29.0 9e-13 MSIKKNSKMKNQKHVSPIDFYSPSQMSIPMSFTMCYKINYDAAHTLYHVISPGNIDSHIS SDAKTPHQHDHFELLYVLKGELVNTIEQTSYHYKEGDACLLNRNTRHIDIKGRNSIAVFL NLSTDYVAELMRDSSETSPKHKIHLFLSSNLKGNERFARNYLTFTPTLQSYGDDTLTRIA GLIDSVQLELSEQKPGSQDLIRGLIKRLIYMLEDEKKYHSNFVRLDATHQDFLFARIMNY IKETNGNITREDLSKALNYNAEYLNRIVKARTGVSMMKFRKKFQLARAKELLTETDENIC TIIGELGFSGNSHFYTFFHQETGMSPMEYRNLWKQKIGEESETAQTEREQIQLNKNC >gi|229784107|gb|GG667628.1| GENE 47 50325 - 52769 1418 814 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 3 789 56 860 871 281 26.0 4e-75 MERFECNRDWSFYESNESLSFVYEKPVPKQIHLPHDFIITKPRSADAAGGALNGYFGEGQ GVYKKELAFPESLKNKTVILDLDGAYMNTEVILNGELLGMHPYGYTPYQVDLTPALNFDG 
SKNELKIITQSRQPSSRWYTGGGLYRGAALWIGGSIYLKPWDTFISVTEEGDGKTVIHFD VTVTSRETEERELVVLCEVLDEQGGSVENQPLPGKKQLGNGWSEEDSRIKAARSVTAVVP ALGKAVCKIEAEISSPHRWDIHDPYLYTCRITVMDSAGHTLDTEKQSFGIRTISVDTEQG FRLNGKAIKLKGGCIHHDNGLLGACAYPRAEERKIRLLREAGYNAVRISHHPPSLAMLEA CDRLGMLLLDECFDVWRMGKRPMDYHLYFEDWWERDMEWMVKRDRNHPCVILYSTGNEIG ERDGRNDGAKWSRRLSSKVKSLDSTRPVMAAICNIFEPGREPGTNFDANVKTDARDIWGE KTREYAAPLDITGYNYLPGRYEKDHESFPERVMAGTESHSFTTWDYWKKTKACPWVIGDF IWAAIDYLGEAGVGRVYWKKENETFSFMGPYPWRTSWQSDIMMTGSRRPQSYYRELMWGH TNETYLFTTHPNHYGDEYYGTGWHWPDVHADWSFGEEYTGRMVSVLAYGPGTEAEFRLNG KVMGVINYEKLTAAIDIPYTPGILEAVSYRNGKEISRAVLRTVGKEAVIDLKPEVPEILA NGCDPGYIQIRIQDENGTAYPEDERLLTAELSDHAELIAIGSGSPCTEDQIGAHSCHAFL GEALLIIKAHRPGIITVAVYDENGNRGTCTMEAK >gi|229784107|gb|GG667628.1| GENE 48 52790 - 54130 1078 446 aa, chain + ## HITS:1 COG:yagG KEGG:ns NR:ns ## COG: yagG COG2211 # Protein_GI_number: 16128255 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 6 424 9 417 460 148 28.0 3e-35 MAWGKKIGYGIGDLGGNLVSAAAGAFVTMYYTDSVLLSAAFIGTMMLFTRLLDGISDIIM GIIIDKTYTKYGKARPWVLFSAFPLILSFILMLNVPSVLSDTATKIYVVVTYIFYSVICF TAVNGAYSTLVTLICPNPEERTGFTSARSFCSSLAMLLAQSYSVALMARFGGGQKGFTAM SIIYGVIALICILITGLTCKEYKITTKAEMENKNSSSLIQDVKLLFSSKYTFLTLIAFVF TWFMLTGSGGASVYFVRDVLQDMSLMAPLSMAGTLPSLIILALGIVPKAAKRYGKRNTLM MGAGFLMFGYTVISLFPYQLPVIFAGMICKGIGVGLSTSLLFATTSDVADYINIKNHVTV TGLTHSLTGFGVKVGVGLGSASLGWILAIGNYSAEAANSGMAQSAGALFAERACYSIVPL ICAAILMGSAVFLDVEKQLEKLKAEE >gi|229784107|gb|GG667628.1| GENE 49 54143 - 55099 441 318 aa, chain + ## HITS:1 COG:CAC2917 KEGG:ns NR:ns ## COG: CAC2917 COG0657 # Protein_GI_number: 15896170 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Clostridium acetobutylicum # 1 300 1 271 272 149 32.0 6e-36 MVTKVIPLYEDRTDVTLTAYVIAEKGEMNQQGPRPAVLICPGGGYMNCTDREAEPVALQF TAMGYHAFVLRYSTYGEGDMAYPDLSAPLLPKENCRHPGPVREIGMAMLSIREHAKEWFV DTSRIAVCGFSAGAHNAAMYGVYWDKPLITEYLNAEKEKLCPAALILGYTLSDYLYRNEC 
LERSAWARIFYAARTVLSGSPQPEEEVLRALSPARLVTEHTPPAFIWATSEDKLVPVQHS LRMAHALADHKVPSEIHIFEKGSHGLSVSTQASALSKRQLSADAAKWVELAGCWLEKRFA LELPDNSEFGKRMRSELQ >gi|229784107|gb|GG667628.1| GENE 50 55192 - 55458 332 88 aa, chain + ## HITS:1 COG:no KEGG:MPTP_1640 NR:ns ## KEGG: MPTP_1640 # Name: not_defined # Def: esterase/lipase # Organism: M.plutonius # Pathway: not_defined # 6 88 7 89 293 85 46.0 9e-16 MNGMAEEIRVQFGEMDRKRDEGLTTPSDILRFNDISYGSDFVWNKLDVYYKKGTSENLPV IVSVHGGGWVYGDKDVYQYYTMHLAQMG >gi|229784107|gb|GG667628.1| GENE 51 56460 - 56984 233 174 aa, chain + ## HITS:1 COG:SA0610 KEGG:ns NR:ns ## COG: SA0610 COG0657 # Protein_GI_number: 15926332 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Staphylococcus aureus N315 # 10 153 151 302 347 59 32.0 3e-09 MERQAEQYRIDLQRVFIVGDSAGAQIASQYLAILTNEAYAELFGFSLPRKLRVRAAALNC GVYDVRKYVNETKNAPIKQYLPEEEQNAFHTADTIQAFTDRFPPAFIMTSHYDFLKQWAE PFYQLLRSKGVPCEYHLYGAPEKKEIGHVFHLNIRTEEAQLCNTEECRFFRKFL >gi|229784107|gb|GG667628.1| GENE 52 57098 - 58525 1567 475 aa, chain - ## HITS:1 COG:lin2806 KEGG:ns NR:ns ## COG: lin2806 COG0232 # Protein_GI_number: 16801867 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Listeria innocua # 1 475 1 465 465 421 46.0 1e-117 MDWNTLLCEDRIRSLVKRPSGMDLRTEFEKDYHRIIGSASFRRLQDKTQVFPLDRSDFIR TRLTHSLEVSSFAKSLGQNISASIRTVIQDESFLPEHALAVCDILQCAGLIHDIGNPPFG HFGETAIQDWFKRRLPELRYQKSGEREARPLTEVLNRQMVEDFYHFEGNTQALRVVTKLH FLVDEHGMNLTKALLASIIKYPGSSLEVDKHSGDIKTKKMGYFYADREAFDDIQASCGTN GARHPLAFILEAADDIAYATADIEDAVKKNCLTFEQLVRELTEYKKERQNDRYRDMVSWL TDKYNRAVSKGYSRPDLYAVQNWVVTMQGQMIGGATEGFTNHYREIMEGTYKKELTSGTD AELLMEALGDIAYRYAFVSKPILKLEIAAETIFNFLLDKFVDAVIYFDTGKKQSAVQEKL VSLISDNYKKLYFICSDGKSEEEKLYLRLLLVTDYVCGMTDSYAKDLYQELNGIA >gi|229784107|gb|GG667628.1| GENE 53 58602 - 58820 81 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620732|ref|ZP_06113667.1| ## NR: gi|266620732|ref|ZP_06113667.1| transcriptional regulator, LysR family [Clostridium hathewayi DSM 
13479] transcriptional regulator, LysR family [Clostridium hathewayi DSM 13479] # 1 72 1 72 72 129 100.0 6e-29 MPGAPDNDIRSIERRILSKTRAGELPDQRSGVRRSQAGLRIYALKPRHGEPIDQSVCIRG FPVCIKRGEFGL >gi|229784107|gb|GG667628.1| GENE 54 58829 - 59116 253 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620733|ref|ZP_06113668.1| ## NR: gi|266620733|ref|ZP_06113668.1| GGDEF family protein [Clostridium hathewayi DSM 13479] GGDEF family protein [Clostridium hathewayi DSM 13479] # 1 95 1 95 95 167 100.0 2e-40 GLEFVVILEGADYANCTHLLESFHDAIEEYNRSDQQDKHLSIARGIAVYNSVTDLVFSNV FKRADDAMYHNKADMKRKRVAAEAEENGEITTEKN Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:14:53 2011 Seq name: gi|229784106|gb|GG667629.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld22, whole genome shotgun sequence Length of sequence - 46412 bp Number of predicted genes - 37, with homology - 36 Number of transcription units - 16, operones - 11 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 666 587 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 701 - 748 7.2 - Term 687 - 734 11.2 2 2 Op 1 11/0.000 - CDS 758 - 1600 952 ## COG1091 dTDP-4-dehydrorhamnose reductase 3 2 Op 2 16/0.000 - CDS 1634 - 2656 1177 ## COG1088 dTDP-D-glucose 4,6-dehydratase 4 2 Op 3 1/0.000 - CDS 2712 - 3587 460 ## COG1209 dTDP-glucose pyrophosphorylase 5 2 Op 4 . - CDS 3648 - 4832 1189 ## COG0500 SAM-dependent methyltransferases - Prom 4867 - 4926 10.9 - Term 4917 - 4965 6.5 6 3 Op 1 . - CDS 5013 - 5504 710 ## COG1247 Sortase and related acyltransferases 7 3 Op 2 . - CDS 5553 - 6287 522 ## COG1040 Predicted amidophosphoribosyltransferases - Prom 6364 - 6423 3.4 - Term 6367 - 6405 6.6 8 4 Op 1 . 
- CDS 6429 - 7865 1047 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases
9 4 Op 2 38/0.000 - CDS 7875 - 8720 656 ## COG0395 ABC-type sugar transport system, permease component
10 4 Op 3 35/0.000 - CDS 8726 - 9649 686 ## COG1175 ABC-type sugar transport systems, permease components
- Term 9672 - 9709 6.8
11 4 Op 4 2/0.000 - CDS 9718 - 10089 353 ## COG1653 ABC-type sugar transport system, periplasmic component
- Prom 10121 - 10180 80.4
12 5 Op 1 . - CDS 11029 - 11973 1021 ## COG1653 ABC-type sugar transport system, periplasmic component
13 5 Op 2 . - CDS 12010 - 12981 847 ## Closa_1289 Uroporphyrinogen-III decarboxylase-like protein
14 5 Op 3 . - CDS 12941 - 13159 90 ## gi|288870316|ref|ZP_06409707.1| conserved hypothetical protein
- Prom 13240 - 13299 4.4
+ Prom 13083 - 13142 5.7
15 6 Op 1 7/0.000 + CDS 13190 - 14977 1375 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
16 6 Op 2 . + CDS 14971 - 16506 991 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
- Term 16415 - 16442 -0.8
17 7 Tu 1 . - CDS 16494 - 18719 2634 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member
- Term 18764 - 18803 3.6
18 8 Tu 1 . - CDS 18833 - 19831 1315 ## COG1077 Actin-like ATPase involved in cell morphogenesis
- Prom 19929 - 19988 10.6
- Term 20449 - 20492 12.7
19 9 Tu 1 . - CDS 20538 - 22094 1986 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains
- Prom 22201 - 22260 7.2
+ Prom 22159 - 22218 5.3
20 10 Tu 1 . + CDS 22260 - 23225 817 ## COG4905 Predicted membrane protein
+ Term 23236 - 23268 3.1
+ Prom 23302 - 23361 5.2
21 11 Op 1 11/0.000 + CDS 23407 - 24063 776 ## COG1309 Transcriptional regulator
22 11 Op 2 . + CDS 24047 - 25282 758 ## COG0477 Permeases of the major facilitator superfamily
23 11 Op 3 . + CDS 25286 - 27607 1643 ## pE33L466_0065 ABC transporter permease
24 11 Op 4 . + CDS 27609 - 28367 240 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc)
+ Term 28400 - 28453 10.1
- Term 28394 - 28432 3.1
25 12 Op 1 . - CDS 28457 - 30193 2275 ## COG1109 Phosphomannomutase
26 12 Op 2 . - CDS 30272 - 31228 828 ## Closa_0390 CotS family spore coat protein
- Prom 31321 - 31380 4.2
- Term 31315 - 31384 10.2
27 13 Op 1 . - CDS 31398 - 32630 1192 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase
28 13 Op 2 . - CDS 32693 - 34126 1599 ## COG0442 Prolyl-tRNA synthetase
29 14 Op 1 . - CDS 34542 - 36230 1240 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase
30 14 Op 2 . - CDS 36306 - 37367 1026 ## COG0628 Predicted permease
- Prom 37400 - 37459 2.4
31 15 Op 1 . - CDS 37461 - 37739 389 ## gi|266620766|ref|ZP_06113701.1| conserved hypothetical protein
32 15 Op 2 . - CDS 37814 - 37939 78 ##
- Prom 37959 - 38018 2.6
- TRNA 37978 - 38062 47.9 # Pseudo GCT 0 0
- TRNA 38123 - 38206 48.6 # Tyr GTA 0 0
- Term 38455 - 38495 2.0
33 16 Op 1 . - CDS 38657 - 39265 578 ## LJ1423 Lj928 prophage protein
34 16 Op 2 . - CDS 39288 - 40937 1489 ## COG1626 Neutral trehalase
35 16 Op 3 . - CDS 40966 - 42648 1284 ## COG0366 Glycosidases
36 16 Op 4 2/0.000 - CDS 42652 - 45285 1870 ## COG0383 Alpha-mannosidase
37 16 Op 5 .
- CDS 45334 - 46368 912 ## COG3839 ABC-type sugar transport systems, ATPase components Predicted protein(s) >gi|229784106|gb|GG667629.1| GENE 1 1 - 666 587 221 aa, chain + ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 107 219 142 254 257 74 38.0 2e-13 LFSYQNEFTSSIDQDIVSHITQAIIDKNGTGLKEHTDAFRNRYLENGDEEEVKRLALITV CTIYASLLEMNGDIPDIQMNASVRKLENASTVDEINKHFCSCLFGMLDAKNKISTESYGY IQKAVQYIELNYSQPITIPQIAGFVGVSSIYLTKLFKLSTGKTLSEYLNYHRTQKSLDLL THTEETISWISEAVGYSDVRSYIRFFKKFYYMTPSEYRKEH >gi|229784106|gb|GG667629.1| GENE 2 758 - 1600 952 280 aa, chain - ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 280 1 279 280 247 42.0 2e-65 MRVFVTGVKGQLGHDVVDELEKRGHEAIGVDIDEMDITDAESVNRVIREAAPEAVIHCAA YTAVDAAEDNLELCRRVNAYGTENIAKVCRELDIKMMYISTDYVFNGQGTRPWEPDDERE PLNVYGLTKYEGELAIEENLTKYFTVRIAWVFGVNGRNFIKTMLNLGKTHDKITVVSDQI GSPTYTYDLARLLVDMIETDRYGRYHATNEGLCSWCDFAKEIFKQAGMKVEVVPVTSEEY PSRAKRPMNSRMSKDKLEANGFERLPSWQDALGRYLKEIQ >gi|229784106|gb|GG667629.1| GENE 3 1634 - 2656 1177 340 aa, chain - ## HITS:1 COG:MTH1789 KEGG:ns NR:ns ## COG: MTH1789 COG1088 # Protein_GI_number: 15679777 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Methanothermobacter thermautotrophicus # 2 340 3 334 336 456 62.0 1e-128 MKIIVTGGAGFIGGNFVHHMVNKYPDYEIINLDLLTYAGNLETLKPVEDKPNYKFVKGDI ADRKFIFDLFEKEKPDVIVNFAAESHVDRSITDPESFVRTNVMGTTTLLDACREFGIKRY HQVSTDEVYGDLPLDRPDLFFTEETPLHTSSPYSSSKASADLFVLAYHRTFGLPVTISRC SNNYGPYHFPEKLIPLIISRALADEELPVYGTGENVRDWLHVADHCEAIDLIIHKGREGE IYNVGGHNERTNLEVVKTILKALDKPESLIKYVSDRPGHDRRYAIDPKKLETELGWKPKY 
TFDTGIEQTIQWYLDNEDWWKHIINGEYQNYFENMYGDRL >gi|229784106|gb|GG667629.1| GENE 4 2712 - 3587 460 291 aa, chain - ## HITS:1 COG:SPy0933 KEGG:ns NR:ns ## COG: SPy0933 COG1209 # Protein_GI_number: 15674953 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Streptococcus pyogenes M1 GAS # 1 188 1 188 289 302 78.0 5e-82 MKGIILAGGSGTRLYPLTMVTSKQLLPIYDKPMIYYPLSVLMNAGIRDILIISTPDDTPR FEALLGSGSQFGIRLSYAVQPSPDGLAQAFIIGEEFIGDDNVAMILGDNIFAGHGLTKRL LAAANKKEGATVFGYYVDDPERFGIVEFDSNGKAISIEEKPEHPKSNYCVTGLYFYDNKV VEYAKNFKAFRPRRAGNYRFEPDLSGRGKTGCGTSRTGLYLARHRNPRIPGGRHQLCPHH GDPSAQESGVPGGDRVSQSLDYQRPALRRYRTTEEEPVWPLSHGCSGRQIY >gi|229784106|gb|GG667629.1| GENE 5 3648 - 4832 1189 394 aa, chain - ## HITS:1 COG:VNG0503C KEGG:ns NR:ns ## COG: VNG0503C COG0500 # Protein_GI_number: 15789731 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Halobacterium sp. 
NRC-1 # 139 392 13 257 262 104 30.0 3e-22 MQGYYTSGEFAKKANVSIRTIRYYDKQGLLKPAGMSEAGYRLYTDSDFAKLQKILSLKYL GFSLDEIRSITINDDDNGYVEQSLELQLDLVKKRIEHLKLVEQTLSETSKLVKDNPVIDW NKILHLIHITNMEKSLVEQYKNAANLSIRIDLHRKYARNPMGWFPWLYSQLDLKDGETVL ETGCGNGELWRENRDLIPPDAVICLSDVSEGMIQDAREHLKGVPGRFSYEVFDCCRIPKA EESFDKVSANHVLFYLKDLDGALEEVRRVLKPDGTFFCSTYGREHMKEISQLVKEFDSRI VLSEVNLYDAFGLENGCEILKRHFGRVEEILYEDHLEVRDEGPLMEYILSCHGNQQEYLS ERYDEFRDFLKKKIEKKGTLFVTKRAGIFRCRKN >gi|229784106|gb|GG667629.1| GENE 6 5013 - 5504 710 163 aa, chain - ## HITS:1 COG:ECs2052 KEGG:ns NR:ns ## COG: ECs2052 COG1247 # Protein_GI_number: 15831306 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase and related acyltransferases # Organism: Escherichia coli O157:H7 # 1 163 1 162 172 120 36.0 1e-27 MEIRIAQEEDVPYLLDIYNYEVLHGEATFDINPRTLSERLDWYYAHNISNHPLIVAVEDG HAVGYASLSPYRVKEAYAATVELSVYVHRDYRKRGIARKLTLHILNMAREDERIHTVISV ITGGNQASIHLHEELGFLHCGTIKEVGVKFGKYLDIENYQMMV >gi|229784106|gb|GG667629.1| GENE 7 5553 - 6287 522 244 aa, chain - ## HITS:1 COG:DR1389 KEGG:ns NR:ns ## COG: DR1389 COG1040 # Protein_GI_number: 15806406 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Deinococcus radiodurans # 29 238 20 217 219 98 34.0 1e-20 MRNETITVLLRAFLTDLLFPRRCPVCGDIVLPRGELICPACVKKLSFVKQPVCKKCGKEI SSAEREYCLDCTKHKRSFDYGRALLNYNDTAKKSMADIKYRNRREYLDFYAEAVCLRYGT ALLRLGADALVPVPVHPSRRRQRGFNQAEIFACRIGARLGIPVRTELLVRSKKTAPQKQL NPKERLHNLEEAFTAGEVPEGVRRVILVDDIYTTGSTIEACTRALLRAGVKNVYFIAICI GRGQ >gi|229784106|gb|GG667629.1| GENE 8 6429 - 7865 1047 478 aa, chain - ## HITS:1 COG:FN0278 KEGG:ns NR:ns ## COG: FN0278 COG0624 # Protein_GI_number: 19703623 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Fusobacterium nucleatum # 8 464 2 450 452 187 28.0 3e-47 MNREREAELRQWIDAHEEELVRDIRKLVRIPSVSRPEQERSDDTVKAAHCMMEIAAQYGL 
HAENCEDCCVKILYGAGENEVQVWGHLDVVPAGEGWIYPPWECTRKGDFLIGRGVSDNKG AVIAVLYCLRYLKEHNIKPCFRISQICGLSEETGMADAAWYLTKWPAPDFAFVSDCRFPL CYGEKGRCVLTLKRDSVPDVILDMSAGEASNSVPAEAVICIDRSAHMGDRKVEALQLGMD MREELSGNTVRYTVKGIPGHAAAPDRCVNPIGLAGMLLKDGSGLKPEEKHFLEFFQTACL DGYGKGLGIASSDPVFEKLTCAGTVLEMKNRTVTLTMDIRYQPSVNAADIVKQVQKTAGL YGFVLSAEEHRDGYCDSIDSEEVKALLSAYDHVVKDGKKPYVMGGNTYAGMIPNAVCFGP GIPKDYSELELPAGHGGGHGCDEIQSISSLKKAMEVYVHAFMNLNDLYEKRSRKEERD >gi|229784106|gb|GG667629.1| GENE 9 7875 - 8720 656 281 aa, chain - ## HITS:1 COG:BS_yesQ KEGG:ns NR:ns ## COG: BS_yesQ COG0395 # Protein_GI_number: 16077766 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 9 280 26 296 296 285 51.0 6e-77 MKRKRMVGKIVYHILVSIGAFIMVYPLLWMVMSSFKETNTIFTTAGQLIPKEFTLENYRN GWKGFGKVNFGTFFKNSSLISVLATAGTILSSAFVAYSFSRCSFRGKKLLFSAMLVSMML PAQVLMIPQYLWYQKLGWVGTYLPLIVPYCFAVQGFFVYLIMNFINGIPTELDEAAKIDG CSYYSIFFRIILPLVVPALITAGIFSFIWRWEDFLSPLLYVNKSELYPISLALKLFCDPA STSDYGAMFSMSVLSILPLMIIFILLQRYLVEGIASSGIKG >gi|229784106|gb|GG667629.1| GENE 10 8726 - 9649 686 307 aa, chain - ## HITS:1 COG:BS_yesP KEGG:ns NR:ns ## COG: BS_yesP COG1175 # Protein_GI_number: 16077765 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 12 303 10 303 309 289 51.0 5e-78 MAVVDMDHIWFKIKTELNKNSTAGVVFALPFVIGFLMFMVVPMGMSLYYSFCEFEILGAP KFIGLGNYLKMFTEDETFYKSISVTFFYALVSVPLRLIFALMVAMILQRTTKATGFYRAA YYLPSIIGGSVAISILWKRLFAEDGTFNRILQIIGVNSHFSWLGNKHTAIWVLIILAIWQ FGSSMLIFLSALKQIPFSLYEASRVDGAGKTNQFFRITLPLLTPTIFFNLVMQMISGFLA FTQCFIITQGRPMSSTLFYSVYMYQQSFEFYNTGYGAAMAWVMLLLISVITLFLFWSKKF WVYEGGN >gi|229784106|gb|GG667629.1| GENE 11 9718 - 10089 353 123 aa, chain - ## HITS:1 COG:BS_yesO KEGG:ns NR:ns ## COG: BS_yesO COG1653 # Protein_GI_number: 16077764 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # 
Organism: Bacillus subtilis # 1 119 292 409 412 88 37.0 2e-18 MFFSVSVDSEKPEEAAKLIDYWTNSVECNKILLGERGVPVSSVVADAIAADMSESDQKVV DYINNVVTPKCSTVSPASPNGATEVYDVVYKMQEKICYEEITPEAAAEELLKQGNKILQS KQS >gi|229784106|gb|GG667629.1| GENE 12 11029 - 11973 1021 314 aa, chain - ## HITS:1 COG:BH1117 KEGG:ns NR:ns ## COG: BH1117 COG1653 # Protein_GI_number: 15613680 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 32 259 21 251 434 130 32.0 3e-30 MVAVLAAAGLAGCGGNSAEKQQNTTQAAESTAGQESSSGKETTAEAGNTQGDKTELTITW WGSQQRAELTQEALSLYEAENPGVTFDGQFLSGADYWNKLATSAAGHTMADVVQMDKAYI QQYVDNSMLVDLQPYVDSGVINMEDVDKTVQEFGRIGDGLYAICIGVNSPALLYNQTLLE ENGIELKDNMTMDEFLNVCRQVKEKTGYKTNLQYNNGTSWMDFFLRSKGEYLFGDGAIGA SKENLTEYFNLYKMGIDEGWLIDPSVFAERSIGQPEQEPLVYGSDPSNMSWCAFYFSNNM NAVTNAAPEGMKIG >gi|229784106|gb|GG667629.1| GENE 13 12010 - 12981 847 323 aa, chain - ## HITS:1 COG:no KEGG:Closa_1289 NR:ns ## KEGG: Closa_1289 # Name: not_defined # Def: Uroporphyrinogen-III decarboxylase-like protein # Organism: C.saccharolyticum # Pathway: Porphyrin and chlorophyll metabolism [PATH:csh00860]; Metabolic pathways [PATH:csh01100]; Biosynthesis of secondary metabolites [PATH:csh01110] # 1 322 1 318 319 292 45.0 1e-77 MTKKERVLHAIRGEETKEIPSCFSMHFSPEYANGENAVRAHVDFYKQTDVDILKIMNENL VTCDHELHTIQDYTEFQRITAEQPFMRRQVDLVKRILDETGGECYTIGTLHSVGTSAMYH PFKPQGYGYDQSRKIFDAWLKEDPVKMLEAMKRITDGLCELAVKYIEAGLDGVYVASLGA ERRFGLTREQFLNWIAPFDQMIMRAVKEAGGTCFLHVCNRDVNLDYYRDYDEKLYDVVNW GTFEVPCSLKEGREIFHQRTVMGGLANRQGVLAAGSEEALVNTIRSVTETYGRKGFILGA DCTVATEQDLSRIRLAVDTARKL >gi|229784106|gb|GG667629.1| GENE 14 12941 - 13159 90 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870316|ref|ZP_06409707.1| ## NR: gi|288870316|ref|ZP_06409707.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 72 1 72 72 105 100.0 1e-21 MSRYDDTIIIRKHREKSKNKRRRRRFFDTGYRNIGLKLWIFGDILRAVEDNEKEMRGKED 
DEKGAGTACDQG >gi|229784106|gb|GG667629.1| GENE 15 13190 - 14977 1375 595 aa, chain + ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 18 576 11 561 577 223 28.0 6e-58 MRGLVNRIKLIFSNISLKRKILTIVLVSIFLISGVSLAGISVVSDAYLKLLQKTIANNLS YSATTISYYLSNIETMSGLMLSDSNIQSNLSDVRHSDNPRIRTEAYKKLSYTVLEYYQQY KPNFISYITLNNPEFITYSNFLGSRKTPEEIERSVLNAAHDGDGRPVWICDYGNEYGIFM GRLIKNIQSSNLEELGTLLVSIDLDGLMNSLNKEDPLYENPQYYIMTGDTIIYNDNPISA TIHDQRKASAMRSYDIMNVDGHRYFVTEDVIPGIGWTYVCYLSYDPIFKSVSFVRTLSIV MIILSILLAIIFSESMISFISRHTQGLIRKMEAFSTDENAVIDSDYDYSSRRDEFGLLNR QFDSMAGKIRHLIQVNYVNELWKKDAQLKALETQINPHFLYNTLESVNWRAQVAGNMEIS SMIESLGTLLRATLSNSTSHFDLKKELEIVTAYITIQKYRFEDRLEFSIHCDEALYNAQI PKMCIQPLVENSINYGLEESTELCSIEITIEHQQESGSLMVLVKNDGSLFEECLLEKLRS QEIRPHGFGIGLINIDERIKLMFGKEYGLTLYNDQEWAVARIHMPYITDQEEIVC >gi|229784106|gb|GG667629.1| GENE 16 14971 - 16506 991 511 aa, chain + ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 1 133 1 137 368 161 54.0 3e-39 MLKMIIADDERVIRESISRLIDWEKLGVELTGLCRNGLEAYDMIIDEDPDIVLTDIRMPG MDGLELIRKVSLMDLDTRFILLSGYGEFEYAKEAMKYGVKHYLLKPCNKSQISDCIAEAA KDSYARKRTRYLESGHALLTGSMHHNIVSSVISESLGQKKALPDILRSYEPYLDFYSTPY QLLYVFFLEIDSLDSFLHELKLFCEQNLPQVTMHGTYVHNTLLLFYKDYSADSDLLMQLI RNLSIPSQPVGLEIQKETYPSLYKLLEIILVKLKRFSTIYYINNFHLLYTCNYNSLMSEA EHLYQQIKHESADIFHSLADLLTGIEDAAFLKQLSGNLLLKAAADNPSCSAVTLTEWLIE IDGETDPKRLKPLILDHLEYLLLSPSKENSASSMVEQITTYVENNLGNSVLTLKFISEQV LFMNVDYVSKRFFKETGRKFSDYLTDIRITRAKEYMAAGEKIQVVAEKVGCGNNPHYFSQ LFKKKTGMTPSAYSAALNMTKTGGSCINHST >gi|229784106|gb|GG667629.1| GENE 17 16494 - 18719 2634 741 aa, chain - ## HITS:1 COG:CAC2854 KEGG:ns NR:ns ## COG: CAC2854 COG0507 # 
Protein_GI_number: 15896108 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Clostridium acetobutylicum # 8 741 8 736 739 585 45.0 1e-166 MATVCGFVERIKYRNEENGYTVLSLTDEGDEYTLVGNFHYISEGEMVEATGPMTEHPVYG EQMTVETYEIKTPEDTAAMERYLGSGAVKGIGAALAARIVRRFKADTFRIMEEEPERLSE VKGISEKMAMAISEQVEEKKEMRQAMMFLQEYGISMSLAVKIFQEYGPRLYTVIKENPYQ LADDIAGVGFKMADEIAKRVGIFTDSDFRIKSGVLYTLLQATANGHTYLPEDELLSQASE LLRVEKESIEKHLMDMQIDKRLVIRESEGIRIVYASQYYYMEMSVARMLHDLNIRGREPE ETIRKKLLQIQKEESIELDEKQVQAVVEAVNSGLLIITGGPGTGKTTTINTIIRYFESEE MEILLAAPTGRAAKRMTEATGYEARTIHRLLEISGMPGDERSVGMHFERNEENPLDADAV IIDETSMVDIHLMQSLLKAVNPGTRLILVGDVNQLPSVGPGNVLKDVIEAGCFNVVMLTR IFRQASQSDIVVNAHKINAGETVPLGKKSNDFLFIKRDDPNTIINAMITLVQKKLPGYVG ADPYDIQIMTPMRKGALGVERLNTILQEYLNPPDKSKLEKESGGVTFRVGDKVMQIKNNY NIEWEVRNKYGIPVDKGTGIYNGDIGIIREINLFAELVTVEFDEGRMVEYSFKQMEELEL AYAITIHKSQGSEYPAVVIPIFSGPKMLMTRNLIYTAVTRARACVCLVGVPEVFQAMVDN EMEQRRYSGLKERICEIYQVE >gi|229784106|gb|GG667629.1| GENE 18 18833 - 19831 1315 332 aa, chain - ## HITS:1 COG:CAC2858 KEGG:ns NR:ns ## COG: CAC2858 COG1077 # Protein_GI_number: 15896112 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 2 323 5 326 340 404 64.0 1e-113 MMSNDIGIDLGTASILVYIKGKGVVLKEPSVVAIDRDTNKIKAIGEEARLMIGRTPGNIV AVRPLRQGVISDYTVTEKMLKYFINKAVGKKTLRKPRISVCIPSGATEVEKKAVEDATYQ AGAREVAIIEEPVAAAIGAGIDIAKACGNMIVDIGGGTSDIAVISLGGTVVSTSIKVAGD DFDEALVRYMRKKHNLLIGERTAEEIKINIGAAYRRPEVLTMEVRGRNLVTGLPKTIVVT SDETLDALKEPAMQIVDAVHNVLERTPPELAADIFDRGIVLTGGGSLLSGLDSLIEEKTG INTMIAEDPLTAVAIGTGKFIEFTHDGGKSEF >gi|229784106|gb|GG667629.1| GENE 19 20538 - 22094 1986 518 aa, chain - ## HITS:1 COG:BS_yfmM KEGG:ns NR:ns ## COG: BS_yfmM COG0488 # Protein_GI_number: 16077809 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase 
domains # Organism: Bacillus subtilis # 1 517 1 517 518 816 77.0 0 MSILNVEHLTHGFGDRAIFNDVSFRLLKGEHIGLIGANGEGKSTFMNIITGKLQPDEGKV EWAKRVRVGYLDQHTVLEKGMTIRDVLKSAFAFLFQMEEQMNEICDRMGEAGEEEMESMM EELGTIQDLLMAHDFYIIDSKVEEVARALGLTDIGIDKDVTELSGGQRTKVLLGKLLLEK PDILLLDEPTNYLDEQHIEWLKRYLLEYENAFILISHDIPFLNSVVNIIYHMENQSLDRY VGDYDKFVEVHAVKKAQLEAAYKRQQQEIAELEDFVARNKARVSTRNMAMSRQKKLDKMD VIELAREKPKPEFHFLEARTAGKTIIETTDLIIGYDEPLSRPLNVRMERGEKIVLVGANG IGKTTLLKSILGLVPPLDGKVMLGDYLSIGYFEQEMAPGNKTTCIEEIWKEFPSYTQYQV RSALAKCGLTTKHIESQVRVLSGGEQAKVRLCKLINRETNVLLLDEPTNHLDVEAKEELK RALKEYKGSILLICHEPEFYEGVATKVWDCREWALKLS >gi|229784106|gb|GG667629.1| GENE 20 22260 - 23225 817 321 aa, chain + ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: lin2818 COG4905 # Protein_GI_number: 16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 9 258 6 247 270 125 30.0 1e-28 MYIYTWYQWLLFFFLYCFIGWIIESTYVSLKCGHFVNRGFLRLPMLPLYGSGAIIMLWVS LPFQNNLFLVFLSGMIGASVLEYITGWTMERLFKMKYWDYTEQPLNLNGYICLGTSIAWG FLTIVLTEFIHRPVESVVLDLNPVLCIVLCGIIGVLFVADAVESTKEALDLGRVLESMTK MKAELEEIQLQMALLKAETADKLSDLRDEQIQRAANLKGEAAVKITTLKAETAMRTEEKL NALKESAERRLEAAASIKETAGKAVPDLSALTERLQSLTENRDKLSKHLGFYRKGLLRGN PTASSRQFGEALKELKERLKS >gi|229784106|gb|GG667629.1| GENE 21 23407 - 24063 776 218 aa, chain + ## HITS:1 COG:mlr0908 KEGG:ns NR:ns ## COG: mlr0908 COG1309 # Protein_GI_number: 13471041 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mesorhizobium loti # 2 198 3 214 236 105 31.0 5e-23 MRTVKEPEIRKNEILDAADTLFARKGFDNTSTGDILEMVGIARGTLYYHFKSKEDILDAL IERYNRQILSAAQEAASDKRFSVKERLIRTILALNVSQKGGQELKEQMHRPQNALMHQKT QMAVLNGVTPILTKLIREGAEAGYFQTPYPRECVEMIMVYSNVFFDDAAEASEELRDERV QALIFNIERLLGAETGSLASDLMRVFGSRKVGQNEADS >gi|229784106|gb|GG667629.1| GENE 22 24047 - 25282 758 411 aa, chain + ## HITS:1 COG:MA1161 KEGG:ns NR:ns ## COG: MA1161 COG0477 # Protein_GI_number: 20090027 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion 
transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Methanosarcina acetivorans str.C2A # 6 409 7 426 443 315 42.0 9e-86 MKQTANYFHKFLLLWTGELVSAIGSGLTSFGLGVYVFQQTGLASSTALVTLLAFAPGLIL GVPAGVLADRYDRRLLMILGDGLSALGLLFILLCLSKGEAALWQICVGVTISSVFSSLLD PAYKATVTDLLTPEQYTKASGLVGVAGSAKYLISPVLAGLLLTISDIRLLLLLDTATFFL TVTATLVVRKGIQSKQAMETKSFTREFSEGWNAITQKKGVLVLVILSSAITFFLGFIQTL FGPMILGFADSTVLGISETICASGMLISSLVIGTVSLKKNRVVILSAALFAAGIFMAGFG MRENIVVICVFGFLFFSSLPFANAALDFLVRTNIADDFQGRAWGLIGVLSQLGYVAAYGI SGILADGTGALLSIGSGRGSGLLIVTAGILLAAISFLPCRIKSVRALEETR >gi|229784106|gb|GG667629.1| GENE 23 25286 - 27607 1643 773 aa, chain + ## HITS:1 COG:no KEGG:pE33L466_0065 NR:ns ## KEGG: pE33L466_0065 # Name: not_defined # Def: ABC transporter permease # Organism: B.cereus_ZK # Pathway: not_defined # 5 769 4 768 772 796 53.0 0 MIDCRIIRNDIRKSRLTTIITTLFVALAATLVSLAVILALNLAGALDRLMVQAETPHFMQ MHSGPLDTEPVNRFVSQHEAVDRYQILEFLNIDNSEIELNGHSLISSTQDNGFSTQSGAF DYLLDLDGAVITPAGGEIYVPVCYMQEAAMKTGDIVSVCGIPFRTAGFLRDSQMNSMLSS SKRFLVSPQDFERLRSLGTMEHLIEFRLKDLSALDAFAADYTAAGLPSSGPAVTYPLFRL LGSLSDGMMIAVILLLGFLTVLIAFLCIRYTLLAQLESDYRELGVMKAIGLRVSDIKKLY LAKYAVIAASGCTAGIVFSSLLKNLLLQNIRLNMGTGGNGPAALFLAPAGVLIVFFSILL YVNGVLGRLQKLSASEAMRRGTGSMAFRSAGRLALNRSHFFNVSIFLGIRDVLLEKRLYG VNFTVLLFAAFLLVIPINLYTTMSSRGFVTYMGIGNCDLRIDLRQSDHLAEDAARIIRKM EQDGSILSHTALTTKSFQVETPEGTVRNLLTELGDHRIFPVSYVKGQAPVTPDEIALSAM NAEELRKAPGDSLTLITGSGTQTLTVCGIYSDITNGGKTAKAAFPADSDDTVWAVIYAGL KDPSMLSQIVREYTDTFRDTKVTDIRDYVSQTYGPTLNSIRMAAIAAATVAVTLTGLITL LFVNLLTAKDRYSIAVMKASGFTAGDIRLQYLSRFVLVSVCGIFAGILLANTAGRALAGL VLSGFGAPSFRFAVNPLLSRLLCPLLMICTAVFASLAGTKGIEKITISEYGKE >gi|229784106|gb|GG667629.1| GENE 24 27609 - 28367 240 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 203 1 199 223 97 29 2e-19 MEEIIHAESIVKSFYEGSEHQRILDGISLSVRKGEFISVMGPSGSGKSTLLFALSGMDRI 
DGGSVSFCGSNLTACTENQLADLRRTQMGFVFQQPTLLKNLNILDNIILPSLRDKKKNVS RVTEKARALMQKTGIGELEDRDISRASGGQLQRAGICRALMNDPKILFGDEPTGALNSKS AQEIMDIFTEINTQGTAVMVVTHDARIAARTERVLFLCDGRIVREAHFPKYTGKDMERRI EKIMELMKEAGI >gi|229784106|gb|GG667629.1| GENE 25 28457 - 30193 2275 578 aa, chain - ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 4 575 3 571 575 561 51.0 1e-159 MKDYKSIYEEWLANPYFDEATKEELRSIKEDENEMKERFYQDLEFGTAGLRGVIGAGINR MNIYVVRRATQGLANYIIKQGGADKGVAIAYDSRHMSPEFAEEAALTLAANGIKAYKFES LRPTPELSFAVRELGCIAGINITASHNPPEYNGYKVYWEDGAQFTPPHDKGVTAEVLAIT DLSTVKTTDVESAKAAGKYEIIGEAIDDKYIAQVKAQVVNQDAIDKMQDSIQIVYTPLHG TGNIPARRALAEIGFTHVYVVPEQELPDGDFPTVSYPNPEAEEAFELGLALAKEKDADLV LATDPDADRLGVYVKDTKSGQYIPLTGNMSGSLLCEYVLSQKKERGAIPADGQVVKSIVT TNLVDAVAKSYGCELIEVLTGFKYIGQQILKEENTGYGTYLFGLEESYGCLIGTYARDKD AISATVALCEAAAYYKMKGMTLWDAMIAIYEKYGYYKDAVKSIGLKGIEGQAKIQAIMDT LRNNTPTEVGGYRTVSARDYKLDTIKDMENGEVKPTGLPESNVLYYDLTDDAWLCVRPSG TEPKIKFYYGIKGTSLADADEKSAKLGEAVMAMVDSMM >gi|229784106|gb|GG667629.1| GENE 26 30272 - 31228 828 318 aa, chain - ## HITS:1 COG:no KEGG:Closa_0390 NR:ns ## KEGG: Closa_0390 # Name: not_defined # Def: CotS family spore coat protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 318 8 325 326 520 82.0 1e-146 MLSQYELEILDVRRGRGAWLCETNQGLKLLREYKGTIKRLEFEDQVFTQLEEAWHPFVDR YVRNREGELLSSADDGTRWIVKDWYADRECNLKDDREVLRAITQIASLHKMLRKIEFKEE WNLGSILVQLPGDEMERHNRELIRARSFIRNKRKKTEFELCVIGNYDMFFEQARDARLGM KEFCERYGEEECYLCHGDLNQHHILMCPRDVAVVEFNRMHLGMQMEDLYHFMRKAMEKHD WSLKLGTSMLETYSRILPLSDTDRTCLYYLFLYPEKYWKQINFYYNANKAWIPARNIEKL KDLELQQPLRNEFLSRIK >gi|229784106|gb|GG667629.1| GENE 27 31398 - 32630 1192 410 aa, chain - ## HITS:1 COG:L0324 KEGG:ns NR:ns ## COG: L0324 COG0617 # Protein_GI_number: 15673541 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: 
Lactococcus lactis # 11 389 26 393 411 219 39.0 1e-56 MVKEIRVPADAEQIIEKLNEHGFEAYVVGGCVRDSLLGKEPEDWDITTSARPGEVKSIFS RTIDTGIQHGTVTVMLNRQGYEVTTYRVDGEYEDGRHPKSVDFTTSLLEDLKRRDFTINA MAYNSREGLVDAFGGIEDLENRVIRCVGSAVNRFTEDALRILRAVRFSAQLDFRIEDETY EAISIIAPNLEKVSKERIATELTKLLLSSHPEKMKRVSDTGASRYVSEPFCRAAALNKDE LRPLGLLPAVKYMRWAGLLRREQGETAAAILRDLKMDCDTMDKVKILVSRWRIPISAEKP AIRRVMNQIPPDLLDSLLCFQEVFAAELGEGYRETLKAVKTLAEEIRRDGDPVNLKELAV DGRDLMAAGMKPGPALGKTLNALMEQVLEQPECNTREYLLNAAGLKCELP >gi|229784106|gb|GG667629.1| GENE 28 32693 - 34126 1599 477 aa, chain - ## HITS:1 COG:MYPU_1830 KEGG:ns NR:ns ## COG: MYPU_1830 COG0442 # Protein_GI_number: 15828654 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Mycoplasma pulmonis # 4 477 25 501 501 484 47.0 1e-136 MAKDKKLVEAITSMDVDFAQWYTDVVKKAELIDYSSVKGCMVIKPAGYAIWENIQKELDS RFKTVGVENVYMPMFIPESLLQKEKDHVEGFAPEVAWVTYGGLEPLTERLCVRPTSETLF CDFYARDIHSYRDLPKVYNQWCSVVRWEKTTRPFLRSREFLWQEGHTAHATAEDAQARTE QMLNLYADFCEEVLAMPVIRGQKTDKEKFAGAEATYTIEALMHDGKALQSGTSHNFGDGF AKAFEIQYSDKENKLQYVHQTSWGMSTRLIGGIIMVHGDDNGLVLPPRIAPVQVMIVPIQ QAKEGVLEKAAEIKSVLSGYRVKVDDTDKSPGWKFAESEMRGIPVRVEIGPKDIEAGQAI LVRRDTHEKISVSLEEIGQKVGELLDTIQHDMYERALAHREAHTYEATDFDTFTKTIEEK PGFVKAMWCGCQECEDKIRELTGATSRCMPFKQEKLSDTCICCGKPATKMVYWGRAY >gi|229784106|gb|GG667629.1| GENE 29 34542 - 36230 1240 562 aa, chain - ## HITS:1 COG:AGl3029 KEGG:ns NR:ns ## COG: AGl3029 COG1473 # Protein_GI_number: 15891630 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 197 549 3 371 389 175 33.0 2e-43 MEPGKLIEFLGILEKLKCNTRHNWTTSGRRESVAEHSWRLAVMAFLLKDEFPELDMDRVV DMCLIHDWGEAVTGDIPAFIKGSTDEKTESAVLRTMTGSLPEDLARRLNGLFDEMEALQT KEAKLTKALDKIETLIQHNEAGADTWLPLEYELNLTYGNEISNMSEYTRRLRDLVRQESE RIISEKPLKDQGCGSTGSHSALDDETFKKIKELRKELHEIPELSGQERKTMEVLKMFLRK HTSLSVNDRGSWFYAIHQEDGAGETVVFRADMDAIKGAGNIPYHGCGHDGHSAILAGLCL 
LTEGRVFQKNLCFLFQPAEETGEGGKICCNLLEELGADRVYGFHNLPGYPLGTAVMRRET FSCASRGLIIRLTGKPCHAAYPEQGINPAYLISGIIASLPDFLKPEEYQGMVLASIIEVK VGDESFGVSAGDGTLALTIRAEHLEDLDKLEGRIRDEAESKAQAEHMACCITRRDEFPDT VNTAEIADKSRMLFEKEGIPCLEAAAPFRWSEDFGWYLKKSQGMYFGMGAGEDCPDLHTP DYEFPDELIRNAVRCLYLLAEI >gi|229784106|gb|GG667629.1| GENE 30 36306 - 37367 1026 353 aa, chain - ## HITS:1 COG:BS_ytvI KEGG:ns NR:ns ## COG: BS_ytvI COG0628 # Protein_GI_number: 16079968 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus subtilis # 10 348 11 354 371 102 25.0 9e-22 MEKPSRKLKKTILILGVTGAVYASFKYLLPLVIPFLIAYIFALALRPSAVWLEKKCRLPF KKKEIHLPLAAIGGIQILILMVLFGTLLYYGGRKLFMEANLFFDNLPQMINSLDHWLTGC CFFTENLLRLPRGFLVKMLRDMLVGGGKAMKDAAMPFLMVNSMTVVRCFIEVTVMTVILF IATILSLQEMEDIRMRRDNSIFRHEFALLGGRLSMVGSAWLKTQGSIMFLTMCICTAGLF LMGNQYSLLLGIGIGLLDALPIFGTGTVLIPWALFRLVNGDWLYGIGLFVIYIICYFLRQ VMEAKIMADKVGLTPLETLISMYVGLQLFGLLGFILGPIGLLIIEDLVELYGG >gi|229784106|gb|GG667629.1| GENE 31 37461 - 37739 389 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620766|ref|ZP_06113701.1| ## NR: gi|266620766|ref|ZP_06113701.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 92 1 92 92 183 100.0 4e-45 MHQVLFPLVIVTILKQHGSKEQPLTISQIADKINRQYAPFADGEKVMNRSTVARTLESLV LYTEVGDLLDFCVIEGGSANKKKYYIEHHKIG >gi|229784106|gb|GG667629.1| GENE 32 37814 - 37939 78 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MISKKKGKSKWKPSYYVSDIAAASLSVITTTGFFALQQTIA >gi|229784106|gb|GG667629.1| GENE 33 38657 - 39265 578 202 aa, chain - ## HITS:1 COG:no KEGG:LJ1423 NR:ns ## KEGG: LJ1423 # Name: not_defined # Def: Lj928 prophage protein # Organism: L.johnsonii # Pathway: not_defined # 4 199 681 887 892 67 24.0 3e-10 MKTIAIYGDSISTGTHGEGGYENTLKNSLSLDKVYNFAVGSSGLTRKTPGGMLEVLDKNP VPEDAELFLIWHGSNDWYWGSELEAFSEAIEAAVNRLRAAVPTACLVWVTPIYRFECSDG MAQAGEAYELPNKSGYTMLDYYVELERASKRLGFPLIDMRRLCGIHRDNAELYLEDRVHP NRAGYQRISAVLAREIQRFCIF >gi|229784106|gb|GG667629.1| GENE 34 
39288 - 40937 1489 549 aa, chain - ## HITS:1 COG:TP0931 KEGG:ns NR:ns ## COG: TP0931 COG1626 # Protein_GI_number: 15639916 # Func_class: G Carbohydrate transport and metabolism # Function: Neutral trehalase # Organism: Treponema pallidum # 178 490 99 416 476 137 32.0 7e-32 MDRDYSFLDSLDRGMYFDKKEYDGVPPLSYEEVKDNLPVPIVSSHPEWVECYRYAVQVLY TNIHRPAEGSGFVSNFVDAAFNDDIFLWDTVFMTLFCNLLHPYVPGICSLDNFYCKQFDD GEIPREMVRETGKDFLLWVNAFDSPLYSYFQNHYGFRTLRELGKLPYEDMYKPNMGRIIE KKPYLTLDNLNHPLLAFAEWESYCHTRDAARLHMVLEPLYHYHEAMKYHLRHQNGLYVTD WASMDNSPRNKHLGLAVDTSCEMAMFAGNLIDIMDVLVKRGYEVTDYEKRREGLEKDRSV LIEKINHYMWNEQDGFYYDMTFGERQTRIKTIAGFFPLVSGVADEKQGKRLIEWLEDKET FNRVHRIPVVAADEEGYDPRGGYWRGSVWAPTNALVTCGLEKHGFHKLARDIAINHLDVI AKVYEQTGTIWENYPPDEISSGDADNKDFVGWSGLAPILYLIQYAAGLSLDRKEAEPTVR WEISEHLVRGGVLGCRRYWFAGKTADFVAKDAGGSLEVSIHTEDCFKLNLIYQGAQHSIM VQGDMKLTF >gi|229784106|gb|GG667629.1| GENE 35 40966 - 42648 1284 560 aa, chain - ## HITS:1 COG:lin2973 KEGG:ns NR:ns ## COG: lin2973 COG0366 # Protein_GI_number: 16802031 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Listeria innocua # 1 559 1 559 566 469 42.0 1e-132 MDSIKEHLQHLYPDCWEELEQRIGALTESWKLKMGNTSGEALYPWVSKDDVMLITYGDGI GREGEAPLATLRSFLNEELSGTVSAVHLLPMFPYTSDDGFSVTDFKAINPELGDWEDIEN LRREYDLMFDAVINHVSKSSGYFKEYAAGNPHYRGFFIEADPEGDYSRVIRPRTLPLLTK FETSMGTKYLWTTFSEDQIDLNYRDPEVLMEILDVLLLYAARGARFIRFDAIGFAWKEPG TTCMHLPQTHELIKLMRCVLESCAKGCTIITETNVPHKENISYFGSGYDEAGLVYQFPLP PLTLYSYLSGSAAHLSDWADGLEATTEATTYFNFLASHDGIGLRPVEDILSEGERRLMVA EVMERGGQVGYRSLADGSMVPYELNINYLDAIAGNEKDMGRMVRKFLGSQCILLSVMGMP AIYYHSLLGSRNCYRDFEESGIKRRINREKLNADQLKTELSDPNSLRSRVLAGYRQMLKI RKQEEAFSPNSPQQVLKLDERVFALIRGTQENQILVLINVSDDVVKLETGMAGSGLLSEK QMNQAVTLEPYEYLWLKIKG >gi|229784106|gb|GG667629.1| GENE 36 42652 - 45285 1870 877 aa, chain - ## HITS:1 COG:ybgG KEGG:ns NR:ns ## COG: ybgG COG0383 # Protein_GI_number: 16128707 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Escherichia coli K12 # 1 867 4 868 877 
280 26.0 8e-75 MDKILIAASTHWDREWYRTFQEFQIRLCGLMNRLITLLEQDEEFLCYTFDGQSAVLEDYL EIYPENRGRIEKLAAEGRLVFGPLYNLPDEFLSGGEALIRNFLLGDEVCQSVGGKLNAGY VPDNFGHISQLPQILNGVGIDTAFFFRGMDRTEYKEKEFYWESPDGSRVLCEYMILGYWS LKSWGKMGKSTAEHFYDAYRLLKESSCLNTFLLINGSDHLYQDPDFTAMVREVKEAYPEL DIKNGSIADYAEMAKAAASQSDCLRTVEGEMRDFRYGPDPTAVTSCRGRIKRLLSAVQSE VERYAEPLAVLASRTGGGFDYPGGLFGKIWRNITISLGHDAVSGCSTDEVMEDIRSYLTH ALQSASRISEMAMEELAGREEYGRKYGEESYLAVFNPHPFACSGMTEQTIRLEKAGAWKD FVLYGENDEEIPYEILEWSDEVITREYLYNSKEKVPETCVKILFPVNDVPALSIRHFRIQ KSRLLEKRSQEFYVRSQPSRAEIENQRFSITVNEDASINVLDKKSGRIYRGLNSFVDRGE AGDEYQHVSPLLDEHVMARLTGVSVIHNSPLSQKLKITAELNIPDRAERGFLKRSGAYRV CRIVTVVCLTRGTDRVEFTVQIDNPCSDHILFARFPTALHSPVEYSNSGFDETERDCERK AFAPELKSTQSLLKPMRGYAGIKGADGSFHVMGKGLYEYHTKKSGQGTDFYLTLLRSTSY LFHGLPTSWLDGQESTTPMVETKGAAESGESTVEYAILFDEQNPAREAEKYLYPLYGVNV TCLPDNREACESFLSFDNENILLSALKKHRSKEGTVLRFYNRNSSEEKMVIRTGRQIREC RACSLLEELEGEMEHTTDTITCTVPGKKIITLYIAWR >gi|229784106|gb|GG667629.1| GENE 37 45334 - 46368 912 344 aa, chain - ## HITS:1 COG:PH0022 KEGG:ns NR:ns ## COG: PH0022 COG3839 # Protein_GI_number: 14589983 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Pyrococcus horikoshii # 1 339 31 368 373 319 49.0 4e-87 MNLSIEKGEFIVFLGPSGCGKTTTIRMISGLEDISGGEILIDGEDVIGKKPKDRGVSMIF QSYAIWPHMTVFDNIAYPLKLQKVPKAEIKKRVTAAAEATDIVGLLNRYPAQMSGGQRQR VAVSRAIVVKPKIFLMDEPLSNLDAKLRVSMRTELKNIHIQQNSTSIFVTHDQSEAMSLA DRIVVMYKGKIEQIGTPMEVYQDSATRFVAEFIGTPPTNFFVTKIEKTADGLMAVNDSIH YQVPDSLRDALLPYAGKEVDLGVRPEYIDLSFSMERTKGYLCDTVIDFVEPQGSHAILIT RIGGNEVKIHTTAYMEMAPKTNVALNVKDDKVMFFDRSTSFRIK Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:15:47 2011 Seq name: gi|229784105|gb|GG667630.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld23, whole genome shotgun sequence Length of sequence - 43331 bp Number of predicted genes - 42, with homology - 39 Number of transcription units - 18, operones - 10 average op.length - 3.4 N Tu/Op Conserved S Start End Score 
pairs(N/Pv)
1 1 Tu 1 . - CDS 1 - 1333 1179 ## COG1653 ABC-type sugar transport system, periplasmic component
- Prom 1358 - 1417 9.8
+ Prom 1315 - 1374 7.3
2 2 Tu 1 . + CDS 1611 - 1844 245 ##
- Term 1770 - 1805 -0.8
3 3 Op 1 . - CDS 1811 - 2152 353 ## gi|266620773|ref|ZP_06113708.1| toxin-antitoxin system, antitoxin component, Xre family
- Term 2176 - 2210 5.0
4 3 Op 2 . - CDS 2232 - 3587 1429 ## COG1109 Phosphomannomutase
- Prom 3659 - 3718 2.0
5 4 Op 1 1/0.000 - CDS 3724 - 4533 339 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes
6 4 Op 2 3/0.000 - CDS 4475 - 4930 243 ## COG0451 Nucleoside-diphosphate-sugar epimerases
7 4 Op 3 . - CDS 5003 - 6190 559 ## COG0381 UDP-N-acetylglucosamine 2-epimerase
- Prom 6235 - 6294 7.9
- Term 6495 - 6546 -0.9
8 5 Op 1 . - CDS 6594 - 6923 261 ## Kvar_1555 capsular exopolysaccharide biosynthesis protein (Wzm)
9 5 Op 2 . - CDS 6940 - 8094 465 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid
- Prom 8311 - 8370 6.7
10 6 Op 1 . - CDS 8397 - 8945 152 ## COG0110 Acetyltransferase (isoleucine patch superfamily)
11 6 Op 2 . - CDS 8972 - 9850 267 ## gi|266620779|ref|ZP_06113714.1| hypothetical protein CLOSTHATH_01894
12 7 Op 1 . - CDS 9955 - 11082 312 ## COG1524 Uncharacterized proteins of the AP superfamily
13 7 Op 2 . - CDS 11095 - 12330 -40 ## gi|266620781|ref|ZP_06113716.1| putative membrane protein
14 7 Op 3 . - CDS 12336 - 12785 252 ## COG0438 Glycosyltransferase
- Prom 12911 - 12970 5.5
15 8 Tu 1 . - CDS 13340 - 13417 61 ##
16 9 Op 1 4/0.000 - CDS 14401 - 15411 356 ## COG0438 Glycosyltransferase
- Prom 15541 - 15600 2.9
17 9 Op 2 4/0.000 - CDS 15629 - 16669 639 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases
18 9 Op 3 1/0.000 - CDS 16671 - 17885 179 ## COG0438 Glycosyltransferase
19 9 Op 4 . - CDS 17921 - 18568 231 ## COG1011 Predicted hydrolase (HAD superfamily)
20 9 Op 5 . - CDS 18546 - 19532 187 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ)
21 9 Op 6 3/0.000 - CDS 19529 - 20251 429 ## COG0110 Acetyltransferase (isoleucine patch superfamily)
22 9 Op 7 5/0.000 - CDS 20264 - 20932 144 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis
23 9 Op 8 . - CDS 20929 - 22188 580 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis
24 9 Op 9 . - CDS 22250 - 22756 294 ## COG0250 Transcription antiterminator
- Prom 22779 - 22838 3.3
- Term 22923 - 22964 -1.0
25 10 Op 1 1/0.000 - CDS 22965 - 24608 1398 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases
- Prom 24753 - 24812 5.0
26 10 Op 2 5/0.000 - CDS 24851 - 25663 686 ## COG0489 ATPases involved in chromosome partitioning
27 10 Op 3 2/0.000 - CDS 25664 - 26398 700 ## COG3944 Capsular polysaccharide biosynthesis protein
28 10 Op 4 . - CDS 26418 - 27149 655 ## COG4464 Capsular polysaccharide biosynthesis protein
+ Prom 26961 - 27020 5.6
29 11 Tu 1 . + CDS 27176 - 27256 58 ##
30 12 Op 1 . - CDS 27240 - 28700 801 ## gi|288870332|ref|ZP_06113732.2| conserved hypothetical protein
31 12 Op 2 . - CDS 28764 - 29705 846 ## COG5263 FOG: Glucan-binding domain (YG repeat)
32 12 Op 3 . - CDS 29695 - 30012 170 ## gi|266620799|ref|ZP_06113734.1| conserved domain protein
33 13 Tu 1 . - CDS 30402 - 31259 708 ## COG0582 Integrase
- Prom 31329 - 31388 7.6
+ Prom 31375 - 31434 7.2
34 14 Tu 1 . + CDS 31569 - 32837 1489 ## COG3681 Uncharacterized conserved protein
+ Term 32879 - 32917 0.5
- Term 32830 - 32872 -0.9
35 15 Tu 1 . - CDS 32947 - 34605 1263 ## Closa_4081 hypothetical protein
36 16 Op 1 . - CDS 34682 - 36094 1291 ## COG3119 Arylsulfatase A and related enzymes
37 16 Op 2 . - CDS 36134 - 37189 706 ## COG3119 Arylsulfatase A and related enzymes
- Prom 37248 - 37307 80.4
38 17 Op 1 .
- CDS 38147 - 38497 369 ## COG3119 Arylsulfatase A and related enzymes 39 17 Op 2 14/0.000 - CDS 38507 - 40075 1763 ## COG1653 ABC-type sugar transport system, periplasmic component 40 17 Op 3 7/0.000 - CDS 40141 - 41040 894 ## COG0395 ABC-type sugar transport system, permease component 41 17 Op 4 . - CDS 41040 - 41987 1027 ## COG4209 ABC-type polysaccharide transport system, permease component - Prom 42013 - 42072 5.3 + Prom 42127 - 42186 6.4 42 18 Tu 1 . + CDS 42293 - 43331 921 ## gi|288870335|ref|ZP_06113744.2| conserved hypothetical protein Predicted protein(s) >gi|229784105|gb|GG667630.1| GENE 1 1 - 1333 1179 444 aa, chain - ## HITS:1 COG:AGl3153 KEGG:ns NR:ns ## COG: AGl3153 COG1653 # Protein_GI_number: 15891693 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 68 404 37 390 433 72 27.0 2e-12 MKKILSLVLASAMVASSLAGCGSSKTAETQAPTTQAPAESAAPEGSDAAPEAPAETVAAG AIEDGAELVYWPMWAETEPQGQAISEAIDAFTKSTGVKVDVNWAGSRDTRKTLEPALSAG ETIDIFDEDIERVNGTWGKYLLDIQSMYDASPLNGAQNATLIELAKQQGGGTLKSVPYQP STFIMFYNKDAFDKAGITAVPKTWDEFLAACESLKTAGIIPMTVDDAYMACLFGFLMDRV AGSDTTEAVAAGDFTNEAVLKTAQVLEELTSKGYIDPRAAGNVYPQGQSNIADGSVAMYL NGSWLPNEVKNQTPEGFRWGAFSLPQIAEGGDGAESNQYGAQCFAINKDTKYPNAAFALI QWLTSGEWDQKLADASMGVPMDNDAKWPEALADAKAVLDSTTHRLNWAVGMENDTNVNAA IKTNMAMMISGSTDAQKFKPFRFQ >gi|229784105|gb|GG667630.1| GENE 2 1611 - 1844 245 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVDRCVCCGAVIPEGTEVCINCRYQYLDQYSESGGTPSGSFADSSVSKTRGRDKHKMPPL IAEQKSFYFLPDSLRQR >gi|229784105|gb|GG667630.1| GENE 3 1811 - 2152 353 113 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620773|ref|ZP_06113708.1| ## NR: gi|266620773|ref|ZP_06113708.1| toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] # 1 113 1 113 113 208 100.0 1e-52 
MVYDRAGVAKRIRQRRKELGMSSAEVAGKIGRAVHYYGDIERGTCGMSIETLLDLAHYLD LSVDYILFGQAGEERIDNPDLAYRILKKYDEKTQKEAIEMMKFYLCLRESGKK >gi|229784105|gb|GG667630.1| GENE 4 2232 - 3587 1429 451 aa, chain - ## HITS:1 COG:SP1559 KEGG:ns NR:ns ## COG: SP1559 COG1109 # Protein_GI_number: 15901402 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Streptococcus pneumoniae TIGR4 # 1 443 1 443 450 396 49.0 1e-110 MSKYFGTDGFRGEANETLTAEHAYKTGRFIGWYYGLKRPDGRAKVVIGKDTRRSSYMFEY ALAAGLTASGADAYLLHVTTTPSVSYVVRTEDFDCGIMISASHNPYYDNGIKLINYYGEK MEEETILDVEAYLDGKTGEIPFAKGERIGRTIDYAIGRNRYIGYLISLATRSYGGLKVGL DCADGSAWMIAKNVFDTLGAQTYVIHNNPDGININQDCGSTHMEALQKHVVENGLNVGFA FDGDADRCLCVDEHGNVVDGDAILYIYGCYMKERGKLAGNKVVTTIMSNFGLYKAFDAAG IEYEKTAVGDKYVYENMETNGYRLGGEQSGHIIFRKYAATGDGILTAIKMMEVMLEKKMT LSKLAEPLTIYPQVLKNIRVRDKAAARNDEAVKDAVAIVEASLGAEGRILVRKSGTEPVL RIMVEANDKQACEAYVDTILQTIIDRGYQAE >gi|229784105|gb|GG667630.1| GENE 5 3724 - 4533 339 269 aa, chain - ## HITS:1 COG:SP0359_2 KEGG:ns NR:ns ## COG: SP0359_2 COG1898 # Protein_GI_number: 15900288 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Streptococcus pneumoniae TIGR4 # 148 268 1 127 128 201 73.0 9e-52 MPKKIGTKVLVYRFPNLVGKWMRPNYNSAVGTFCNNIANDLPITINDPSVVLEVLFIDDL VEEMFDALEGKEHYCEFNGVEAVQKVEGKYCCVPVTYKETLGRIAELLYIFHDQPQTLII PEIVPGSFEKKLYSVYLSYLPKEKVAFPLKMSEDDRGSFTELLKTETCGQFSVNISKPGV TKGQHWHNSKWEFFIVVSGCGLIQERKVGTDEIFEFTVSGDNMEAVHMLPGYTHNIINLS KSENLVTVMWANEQFDPMHPDTFFEKVQN >gi|229784105|gb|GG667630.1| GENE 6 4475 - 4930 243 151 aa, chain - ## HITS:1 COG:SP0359_1 KEGG:ns NR:ns ## COG: SP0359_1 COG0451 # Protein_GI_number: 15900288 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Streptococcus pneumoniae TIGR4 # 2 147 4 149 281 184 58.0 5e-47 MNILITGAKGFLGKNLVENLKNIRDGKNCTRPNIHIEKIYEYDIDSIPEQLDDYCREADF 
VFNLAGINRPKDPEEFKKGNFSFANRLLDTLRKYNNTCPVMLSSSVQAALEGRFGTSEYG LSKKAAEDLFFEYAKKNWYKSFGVSFPKPGW >gi|229784105|gb|GG667630.1| GENE 7 5003 - 6190 559 395 aa, chain - ## HITS:1 COG:SP0360 KEGG:ns NR:ns ## COG: SP0360 COG0381 # Protein_GI_number: 15900289 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Streptococcus pneumoniae TIGR4 # 1 394 1 394 394 689 82.0 0 MEIKTDYSDVIFKENGKLKLLIIVGTRPEIIRLAAVINKCRKYFDVVLAHTGQNYDYNLN GIFFENLNISDPEVYMNAVGDDLGATIGNIINCSYKLMYQIKPDALLILGDTNSCLSAIA AKRLHIPIFHMEAGNRCKDECLPEETNRRIVDIISDVNLAYSEHARKYLHECGLPKERVY VTGSPMAEVLHQNLEGIKKSAILEKLVLEPQKYILLSAHREENIDTEKNFMSLFAAINKL AEKYDMPILYSCHPRSQKRLLTSGFELDRRVIQHEPLGFHDYNHLQMNAFAVVSDSGTLP EESSFFSSIGHHFPAVCIRTSTERPEALDKACFVLAGIDEKSLLQAVDTAVQMNLEEDYG IPVPDYVDENVSTKVVKLIQSYAGVVDKMVWRKTN >gi|229784105|gb|GG667630.1| GENE 8 6594 - 6923 261 109 aa, chain - ## HITS:1 COG:no KEGG:Kvar_1555 NR:ns ## KEGG: Kvar_1555 # Name: not_defined # Def: capsular exopolysaccharide biosynthesis protein (Wzm) # Organism: K.variicola # Pathway: not_defined # 9 99 118 209 327 84 45.0 2e-15 MNKILWGEYTKTLYKKILLKKYIHSTRDEHKKHLLESIDLKAINTGCPTMWKLTLEHCAK IPTKKAKAVILTLMDSSVNLKLDQQLINLLINNYSEVFLASRFEGYGVF >gi|229784105|gb|GG667630.1| GENE 9 6940 - 8094 465 384 aa, chain - ## HITS:1 COG:TM0620 KEGG:ns NR:ns ## COG: TM0620 COG2244 # Protein_GI_number: 15643386 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Thermotoga maritima # 86 304 185 397 479 62 25.0 2e-09 MLLLLEPNHLLLNYVFEDGVTKQFIWLFVGYLLFSLFGKFTQLTARMGEFAINYVLSNLI AKAGFVVIIFAVCWVIPDVTLSWVAFSFAVASAFAMFINISTLYKVKTIQVSEHKRVTTT EMLKYGMPFMISNVMMLIIPLIERLIVRELAGWEVLSIYTAASVFYTVIALMKNTIDNIW NPIVFQYYKNEEVFKPLLHDFGLAVTWVTVIGLGFCILLRRWLVLILDSSYFDVYLIAPA IMYTACFEIYTLIYSVGINTAKKTVHMISAPMIQIIISVILCCLLIPEFGLLGAGIASLF SIIISRIYRIVLGLHYYGTERREWKIALLWLICAVAAVFTMYSTTIVSDITVCITLCILS LVVINTEGVNLVKKMNGLLFSKKI >gi|229784105|gb|GG667630.1| 
GENE 10 8397 - 8945 152 182 aa, chain - ## HITS:1 COG:BH3001 KEGG:ns NR:ns ## COG: BH3001 COG0110 # Protein_GI_number: 15615563 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Bacillus halodurans # 63 167 70 185 186 77 40.0 1e-14 MRRMLKKILIRTEQKIFAVIHRVNYHYYLTHYPLLLQKMGIRFSGDIKNTGFISSTVAFD SFDYASYITIGENTIISSEVQMLVHDYTIGNAMLALGAGGVKAGHLPHFLKEIKIGNNCF VGMRSIILPGTEIGDNTIIGAGSVVKGKIPANVIIAGNPAKVIKNIDDYVELHMKLHDYI VN >gi|229784105|gb|GG667630.1| GENE 11 8972 - 9850 267 292 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620779|ref|ZP_06113714.1| ## NR: gi|266620779|ref|ZP_06113714.1| hypothetical protein CLOSTHATH_01894 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_01894 [Clostridium hathewayi DSM 13479] # 1 292 37 328 328 589 100.0 1e-167 MLLGSLSRGEGTWIDRENGFEMLSDIEFFTIHPSGFSDFNSWNKILDDTRFEVFDGDMSL LFHIDNTYVCREKLPDLERKLLTFDARNMGITVVGEDFKNLIPEIDIHNINYYDLKDIMT HRVFSVLYYGFPLKRSGDEISYQYSLAKNSLDLMTVMLVKYGQLLSGFGNRCQAIQRLNV DERIRQYFQFCLSIKLGQTCNISFSISEMEQLFIFISDQLYEEFKVPLHNILVNWKSIIR RNLGKIKRAVFYRHFVYGDHYKRLSATFHRIEPITKREILDNIVLNGYPTEK >gi|229784105|gb|GG667630.1| GENE 12 9955 - 11082 312 375 aa, chain - ## HITS:1 COG:TM0626 KEGG:ns NR:ns ## COG: TM0626 COG1524 # Protein_GI_number: 15643391 # Func_class: R General function prediction only # Function: Uncharacterized proteins of the AP superfamily # Organism: Thermotoga maritima # 5 343 10 343 381 86 26.0 7e-17 MPLSLFCDALPYSEMSVQYKDWFDKLQLAPLMPNIAYSSSLHWQLYCNKYPDERGILVDW VREPEKNKAVRILSTILRPLDVNETLGWFSRKVLDRIVFRRNMFANVPYQFRKDFTEKAR YLFWDYSTYSQEELFKKYTVVSQDEGHLSFETTMDQLNRTIEKGDINIFGVLGFADSMGH QCRRGEEYSRKLKRYMEVLKKSITRYMILHPHEPVLIVSDHGMSTINNSVDLHLEQRFGK QSKKTYIAYSDSAIMCVWAERKDLLADIEAYLKTIEFGHLLTDEERRYYCASDRKFGDLL FILREGNVFATNWFGKSRRKQQTDGAGMHGFWPENTAKDQIASIVLINGKIKLKNYYDYH SAYHIIYSIMCGETE >gi|229784105|gb|GG667630.1| GENE 13 11095 - 12330 -40 411 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620781|ref|ZP_06113716.1| ## NR: 
gi|266620781|ref|ZP_06113716.1| putative membrane protein [Clostridium hathewayi DSM 13479] putative membrane protein [Clostridium hathewayi DSM 13479] # 1 411 3 413 413 700 100.0 0 MRIKISLPALEKVMKAEKVDQYKLKKLLINALIIRLFICVLVLLVGKNMNEIYFISDDAA YENLAKAYLQSASSPIDLGALTLIGATGYLQIFWPYVVCISAYVFHSIYAARFINVFLSI LTIKLIYDLTKSISNNHPTAIRAARIYAYLPYPILVCCFPIKDIYLTVAVLYVFVIFIKF QNLQKITIIQFILSIMLLIGAYFTRGGVVEMMSLFFIAFLMKRFADTHNYAAIMLCITFT FVFLYVFGGLIFDSFSIKIDNYGGYAQMDTTISAIQMSSVTQIYKLPFTYFFASLQPIPL SLFTPEAGKMWSQLIYYGNLTMIPVACGNFLYIFYKKKNSLFWICSAVMYCAVTTLSLGI FRHYLFLLPLEMINYSLYMENATQVKRSNCLVFAACCFVMILLYSCYCLIR >gi|229784105|gb|GG667630.1| GENE 14 12336 - 12785 252 149 aa, chain - ## HITS:1 COG:SP0353 KEGG:ns NR:ns ## COG: SP0353 COG0438 # Protein_GI_number: 15900282 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 1 140 222 363 372 100 40.0 1e-21 MNPKSCLVLIGRGEQEEAVKEKIKLLELTENVKLLGVRDDVPLLLSMMDVFIFPSKYEGL PFTLVETQCNGLKAVSSDAVTEQVKVSECIEFLSLEDSDEIWAETAIRAAQNGHDISARL KVIEGGYDIDVEAKKLRDYYTYLIKRQSE >gi|229784105|gb|GG667630.1| GENE 15 13340 - 13417 61 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVVELTVIYIIIAVEFKVFSLTLLL >gi|229784105|gb|GG667630.1| GENE 16 14401 - 15411 356 336 aa, chain - ## HITS:1 COG:all4426 KEGG:ns NR:ns ## COG: all4426 COG0438 # Protein_GI_number: 17231918 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. 
PCC 7120 # 24 334 94 413 417 114 28.0 2e-25 MKARIFDNEGFNAKKETKKLIEYIGKYKPDIIHLHNLHGYYLNVELLLGYLAERDIPVVY TMHDCWAITGHCAHFAKIKCEKWKTGCFNCAIKNEYPASFLMDSSKKNWEKKRALFQKLN NIIIVTPSEWLANILKQSYLKHQSIIVIPNGVDMEVFQPTESDFRKKYKLENKHIVLGVA TAWNERKGLKEFMLLQKELDDSYQLVLVGLTQNQIQKLPDGVIGIERTNSMQELAGLYST ADIFVNAGQEETMGLTTVEAMACGTPAVVSDLTAVPEVIDSDGGLIFTDYSVECMAEKIR SAVGMSFPYTRRSAMRYEKKRQYEKYLELYETMITR >gi|229784105|gb|GG667630.1| GENE 17 15629 - 16669 639 346 aa, chain - ## HITS:1 COG:SP0358 KEGG:ns NR:ns ## COG: SP0358 COG1086 # Protein_GI_number: 15900287 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Streptococcus pneumoniae TIGR4 # 1 344 1 345 351 516 73.0 1e-146 MSLFTDKTLMITGGTGSFGNAVLNRFLQTDIGEIRIFSRDEKKQDDMRHEFQAKMPEASE KIKFYIGDVRDLQSVNNAMNGVDFIFHAAALKQVPSCEFFPVEAVKTNILGTENVLNAAI EEGVESVICLSTDKAAYPVNAMGISKAMEERVAVAKSRTSKKTKICCTRYGNVMCSRGSV IPLWIDQIRNSNPITLTEPEMTRFIMSLDEAVDLVLFAFEYGESGDILVQKAPACTIQTQ AEAVCELFGGKKEDIKIIGIRHGEKMYETLLTNEECAIAIDMGKFYRVPCDKRGLNYDKY FSQGDAERNPLTEFNSNNTQQLTVDESKKKIAALAYIQEELAKMGK >gi|229784105|gb|GG667630.1| GENE 18 16671 - 17885 179 404 aa, chain - ## HITS:1 COG:TM0631 KEGG:ns NR:ns ## COG: TM0631 COG0438 # Protein_GI_number: 15643396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 18 391 31 415 434 127 26.0 4e-29 MNVLFLSLIDFESFSDRNIYSDLLREFIKNGHNVFSISPVERRIGRETYMIQEGSSCILR LKIGNIQKVNSIEKGISTILIEPLFINGIKKYFSNVRFDLVLYATPPITFAKVIKFIKKR DRAKSYLMLKDIFPQNSLDLGMLSAFGIKGLIYRYFRRKEKALYFLSDTIGCMSRANCNY ILRHNPEVESDKVEICPNSIEPLDLRVSDEDKNRLREMYGLPLDKKIFVYGGNLGRPQDV PFIINCLKACSNIKTAYFVIAGSGTDRHYLEEYIESEKPQHVKLFGFFPKEEYDSMIACC DIGLVFLDHRFTIPNFPSRLLAYMQAGLPVLCCTDKNTDVGKVVVNGRFGWWCPSDDKRQ FVSTVKEAVNKKDYRQMGYYGFLYLKNNFAADLTYKMIKHRGEN >gi|229784105|gb|GG667630.1| GENE 19 17921 - 18568 231 215 aa, chain - ## HITS:1 COG:MJ1437 KEGG:ns NR:ns ## COG: MJ1437 COG1011 # 
Protein_GI_number: 15669628 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Methanococcus jannaschii # 22 211 2 199 228 66 28.0 5e-11 MGGQKIIKTTNILDTVKYIDGLKAIIFDMDDTLYSEKEYIRSGYRKIAELFPQIEGADRQ LWNLFLEGKPAIDEFLKQQNLFSDENKEKCLTAYRLQKPDIHLYPGVKAMLLDLRKRYLV GLITDGRPEGQWAKIVALRIEPLIDEIIVTDELGGIKYRKPNDVAFRLMADRMCMSFEQM CYIGDNARKDFMAPQKLGMRCIWVRNEDGLYVLGG >gi|229784105|gb|GG667630.1| GENE 20 18546 - 19532 187 328 aa, chain - ## HITS:1 COG:CAC2189 KEGG:ns NR:ns ## COG: CAC2189 COG0458 # Protein_GI_number: 15895458 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Clostridium acetobutylicum # 1 279 1 281 315 182 36.0 1e-45 MKILFTSIGRRVELIQAFRAAADKLNIGLTIIGVDITDTAPALFFCDEMRRACRVSKPEY IPWLLSICEQENVDCLIPTIDTDLLLLAENKEKFEAFGTKVLISAVDRIKVSRDKRLTAD YFISLGLKSPLPVDSVERYASGYPAFIKPKDGSSSINAYKAESIYDLQNYAEKIEDYIIQ PFISGREYTIDIFCDYEGNPVYITPRERLAVRSGEVLKTQICQDEKMIAEMQTLIADYKP CGQITVQLIREEKTGYDYYIEINPRFGGGSPLSIKAGADSAESVLRMLNGERLHYIEKAA KSGAVYCRFDQSICINEEGKDVGWPENY >gi|229784105|gb|GG667630.1| GENE 21 19529 - 20251 429 240 aa, chain - ## HITS:1 COG:BS_yvfD KEGG:ns NR:ns ## COG: BS_yvfD COG0110 # Protein_GI_number: 16080477 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Bacillus subtilis # 3 199 2 203 216 123 41.0 3e-28 MNENVIVIGAGGHGKVIADIIKKSGDMVSGFLDDNPSLSDTFMGYPILGSTDVFEDFKKC QFVIAIGDAVIREKITEKLDGVKWYTAIYPSSVISDIGVSIGEGTVIMANVVINTGTIIG KHCIINSGAIIEHDNKIDDFVHVSVGAKLAGTVTIGKGTWIGIGVSVSNNISICADCMVG AGGVVIRNIEKAGTYVGVPVERSDMKEKFRGGENNLINIISYIPSGLHVSKYIANGRNVA >gi|229784105|gb|GG667630.1| GENE 22 20264 - 20932 144 222 aa, chain - ## HITS:1 COG:NMA0639_1 KEGG:ns NR:ns ## COG: NMA0639_1 COG2148 # Protein_GI_number: 15793627 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in 
lipopolysaccharide synthesis # Organism: Neisseria meningitidis Z2491 # 22 220 2 200 223 213 51.0 3e-55 MKRLAERTGTKVFIKKKQGIYARFIKRGMDFVLSFIALVILSPLLLVLTIAGAFAMGGNP FFIQERPGKDEKIFKLIKFRSMTSKKGKDGKLLPDEQRLTPYGKFLRKTSLDELPELFNI LAGQLAVCGPRPLLQSYLLRYNDYQKHRHDVRPGLTGYAQVHGRNAVSWEEKFDMDVEYV NHITFWGDIKIILQTVLTVLKREGISSESSATMEEFMGTQSK >gi|229784105|gb|GG667630.1| GENE 23 20929 - 22188 580 419 aa, chain - ## HITS:1 COG:BS_yvfE KEGG:ns NR:ns ## COG: BS_yvfE COG0399 # Protein_GI_number: 16080476 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Bacillus subtilis # 10 269 3 266 301 269 49.0 9e-72 MEKEYKIFDKKIYLSSPTRHEEMRTYMLEAYDTNWLSTVGENINEVERLTCEKVGCRYAV ALSSGTAALHMAVKLAGVKPGTPCFCSDLTFDATVNPVVYEGGVPVFIDTEYDTWNMDPV ALEKAFELYPAVKVVVAAHLYGSPGKLDEIKRICEKHHATIIEDAAESLGAVYKGIQTGS FGCFNAISFNGNKIITGSAGGMLLTDDKESAEKVRKWSTQSRENAPWYQHEELGYNYRMS NVIAGVVRGQYPYLEEHIAQKKAIYERYKAGFRDLPVTMNPYDEKKSEPNFWLSCMLINP EAMCRQLRDDQAALYVTEAGKTCPTEILETLAKYNVEGRPIWKPMHMQPFYRMNRFITRE GNGRAKTNAYIAGGAIGKDGRPSDVGMDIFNRGLCLPSDNKMTPDQQDIVIDIIRGCFE >gi|229784105|gb|GG667630.1| GENE 24 22250 - 22756 294 168 aa, chain - ## HITS:1 COG:BS_nusG KEGG:ns NR:ns ## COG: BS_nusG COG0250 # Protein_GI_number: 16077169 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Bacillus subtilis # 2 167 5 176 177 63 27.0 3e-10 MWYVIQVRTGLEESIRIQCEKLIDRHAMERCFIPYCEKMKCYYGKWHKEKQILFPGYLFV VSQDVEALFLELNRIIGLTKLLGTGNMIVPLTSEEESVLKQISNDDQIVTMSKGLIVNNQ VIILEGPLKGREGYICKIDRHKRRARLELHMFGRKQNVELGLEILEKR >gi|229784105|gb|GG667630.1| GENE 25 22965 - 24608 1398 547 aa, chain - ## HITS:1 COG:BH3718 KEGG:ns NR:ns ## COG: BH3718 COG1086 # Protein_GI_number: 15616280 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Bacillus halodurans # 1 530 63 585 608 464 45.0 
1e-130 MWSSAGIVELESIVAACGVTFILQIIGMKFLNIEVPRSYHILWLLFMTSVVGISRMSYRA LRMAGRHLGRKREDAVRIMIVGGGQAGTLLIKELNNSEKVPGIPVCIVDDDRNKNGKYIS GVPIRGTREEIPELAEKYRIDEIYIALPTVTGVERKKILEICQQTKCSLKILPGLYQLMS GEVSISKLRDVEIDDLLGREPIKVNLDEIMGYVKDKVVMVTGGGGSIGSELCRQLAGHGV KRLIIFDMYENNAYEIQQELKRNCPELDLVTLIGSVRNTNRLNSVFETYRPDVVYHAAAH KHVPLMEDSPNEAIKNNVIGTYKTARAAMKYGTKHFVLISTDKAVNPTNIMGASKRLCEM VIQMCNSKSSTEFVAVRFGNVLGSNGSVIPLFKKQIENGGPVTVTDPDIIRYFMTIPEAV SLVLQAGAYAKGGEIFVLNMGEPVKILDMAENLIRLSGYEPYKDIDIVFTGLRPGEKMYE ELLMAEEGLQGTKNSRIFIGKPIEMDYGKFEQQLSELDEAAWKETSEMRDLVQKIIPEYH FSRTNSN >gi|229784105|gb|GG667630.1| GENE 26 24851 - 25663 686 270 aa, chain - ## HITS:1 COG:SP0349 KEGG:ns NR:ns ## COG: SP0349 COG0489 # Protein_GI_number: 15900278 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Streptococcus pneumoniae TIGR4 # 2 225 9 226 227 158 38.0 1e-38 MQKVEFKKLPRLDFRANEAFKTLRTNITFCGDAQRVILFTSTIPNEGKTSVVFQLARSFA EDGKRVLLLDADMRKSVLMGHYRVDQETRGLSHYLVGQESAENVAYRTNIDNMDVIFAGP STPNPAELLGNAKFEEMLRGAREHYDYVLIDSPPLGSVIDAAIIARNVDGAVIVVESGVI SYKMVQKVKGQLEKSGCRILGAVLNKVNQESKGYYGGYYSKYYGKNYVKYSGYGSGGSSG KAEKSEEKTEMKTAAAGGEENQVKRSRRRK >gi|229784105|gb|GG667630.1| GENE 27 25664 - 26398 700 244 aa, chain - ## HITS:1 COG:SP0348 KEGG:ns NR:ns ## COG: SP0348 COG3944 # Protein_GI_number: 15900277 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 7 225 5 226 230 133 39.0 2e-31 MKDTRDDEIEIDLLQLFWVMKRRLWIMILAVVVGAAAAMLYTTTLVEPVYTSSTMIYILN KTNNITSLSDLQLGTQLTKDYKVLVTSRPVLEQVTENLGLNLGYQQLNGKITVNNPTDTR ILTISAKDTNPETARAIADEVASVSVARIAEIMDSVPPKIVETGNLPTAPSSPNVKKNTV LGGLAAGIAAAGIIILLYLMNDSIRNAGDIERYLDLNTLGQIPEFEENERRKKKKSRWGR RAVE >gi|229784105|gb|GG667630.1| GENE 28 26418 - 27149 655 243 aa, chain - ## HITS:1 COG:SP0347 KEGG:ns NR:ns ## COG: SP0347 COG4464 # Protein_GI_number: 15900276 # Func_class: G Carbohydrate transport 
and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 5 243 1 243 243 122 28.0 7e-28 MPFNLFDLHCHLLYGVDDGAISLEESLRAVRIACEEGIRHIVFTPHYTPGKLSGEKEVIL RRMKEIREAAENAGIEGIKYYCGNEVLYVSGVEELLKQGEILTIADTKYVLLEFYQGVRY QDMFQGLSHVVRTGFIPVLAHVERYNCLYPKLDRIEELRDLGIVIQMNTECLMPQIPSVR LLWYRNLMKAGYVQLISTDAHGADNKPPRIKRAVEWMERHCDCSLVERVLYENPARVLED RIL >gi|229784105|gb|GG667630.1| GENE 29 27176 - 27256 58 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIIILISVFSFSHSAHAILSHFIFGT >gi|229784105|gb|GG667630.1| GENE 30 27240 - 28700 801 486 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870332|ref|ZP_06113732.2| ## NR: gi|288870332|ref|ZP_06113732.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 486 1 486 486 921 100.0 0 MSFLFFLITPSASYAAGLNASEQELVAKAKETFEWNGKKYRAKQTYLDQLTAYLSRDDVD LSADTCAAAAKEMYGSVERGVMEGYLYEVNSNSSGKQSDGDPVTASETASQEEKRAEQES PSEEILEEKESASVVFQVPRGIVPDEGWNHKYETDIGQTGEASLTEPGSWLSLLNPSFYR LLMIFESVCTVLLSMALAVTVRGRRLRHRRKIRKVMQTAVIFMTAAACLLSGMYAALRVG TFSEQAVMNQIEKTSWYQTVYDDMKRETVMTMYLAGVPENKLENSEVEDTVKYSNVVLTA RQYMKAALEGDSQTPVLSGVFETFRTSVMEYYEKKDPGSEAVRTGVRNLLEGLEARCVKQ MNWAGMEWWRGKTQGFLLWFPALLGGAVLVAGLGTAELICLSRSSYRAVKRIGGSLAGGF ISLVNIGLLIGRFGGTGTAWAKPSYMKEFVRNMSVQAAYSVLMIGIIGCCLAVMTFYISR CIKYQR >gi|229784105|gb|GG667630.1| GENE 31 28764 - 29705 846 313 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 25 154 563 688 744 103 40.0 4e-22 MKGKTWKLAAVTLGLGISAAAAVYAQTPGWNLTGEGWRYLSAEGQYQAAGWFQDTDGRWY HFNQNGIMQTGWFQDTNGVWYFLATDSGAMKTGWHQDPDGKWYFLSYNGAMQTGLIKVDQ KVYFMEPSGALFAGDKEIDGVVYHFTENGTVGSAPPVSTAQVWSGNGNQTSSSDSSSSKE KIRVRDVVESDIRDAVNQTSASEHIVSASVSGQSISVTPVAVSTMVEGLPAAQSVFGVFL 
SSGYINEVTISGPGGSAVVHDTSDLVGAARSLGFLGSGSFSQARGSYTVTVSYSGTLDYG YEDGTLLYSVTVN >gi|229784105|gb|GG667630.1| GENE 32 29695 - 30012 170 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620799|ref|ZP_06113734.1| ## NR: gi|266620799|ref|ZP_06113734.1| conserved domain protein [Clostridium hathewayi DSM 13479] conserved domain protein [Clostridium hathewayi DSM 13479] # 1 105 1 105 105 187 100.0 3e-46 MSKIGREKTGLLDDLWKKTGCLYLSDLKQPAWRAVCCSVIREIQEEEYTLWEWSDAARYL SNADGVIASTKFQSKEEAKRAILKPGQPTGPDIRSKQKQEEKNER >gi|229784105|gb|GG667630.1| GENE 33 30402 - 31259 708 285 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 11 281 1 262 265 125 28.0 7e-29 MEKKLDVVMLMEEYRKYLVYNEKSKSTINKYLRDAGKYLQFLKTREGAGNEITKESTMDY KMELMEKYKPASVNSYLTAVNSFLTFYGKTDCHVRLLKIQKRIFTDQSRELTKAEYIRLV ETARYDKKERLALIMETICGTGIRISELQNITVESLLSGYASVSCKGKERIILIPVKLKE KLREYCFKHEIQSGSIFITRSGKPIDRSNVWTEMNKLCEKANISHGKVFPHNMRHLFART YYGKQKDIVHLADILGHSSIETTRIYTMTSSFEYERQLEEMDLII >gi|229784105|gb|GG667630.1| GENE 34 31569 - 32837 1489 422 aa, chain + ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 19 422 1 411 411 315 43.0 8e-86 MMNQRIHDEFLKILAEELIPAMGCTEPIALAYAGARAREVLGCLPDRVIAHCSGNMIKNV RCVTIPNSEHMVGIEAGVMLGIVGGNASKCMEVLEDLTDEDRKTASAMLKDGVCTVEYLD SEIPLHIILELHKDASVVTLEVRYDHTNITKITKDGEVLFSSDHMEGGCSLADRSLLSVD NIKDFADQVPLKDIHELLDRVIIHNMNIAYEGMAGNYGLGIGKIIKESYPDGVAVRMKAY AAAGSEARMGGCDMPVIINSGSGNQGIASSVPVIVYARENNICQERLYRALAFSSLLTIY QKEFIGKLSAFCGAVSASCASGAAITYLVGGTLEQIKNTIGNVLANIPGIICDGAKISCA AKIATSLDAAMMAHYLAMNNKAYQPDSGILQEDAGETISCVGYIGKEGMKQTDKEIIKIM LQ >gi|229784105|gb|GG667630.1| GENE 35 32947 - 34605 1263 552 aa, chain - ## HITS:1 COG:no KEGG:Closa_4081 NR:ns ## KEGG: Closa_4081 # Name: not_defined # 
Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 8 551 9 550 552 587 52.0 1e-166 MRRDKNYQIKLMSALTKVWPGQEPEEDPCCSVFTMLRGETFSFVAAYTCGGEGNSAAVVR VESPVEQYCRIREIVPVPSIRPVNSRVDSNYLCTRPGMYPDLLTEPAGGCVTFTPGQFKS LWIDMEIPENGPYGKFPFAVVFVSPEGEELCRRETSIQVYQTVLGPQRLIHTEWFHGDCL ADYYRVPVFSEEHWKIMENFIAAAAARGCNMILTPQFTPPLDTAPGGDRTTIQLVGVNVL PNGGYEFNFARLERWVTMCLSCGMVYFEMSHLFTQWGAGHPPKIMAEENGKEIKLFGWED PAVGGRYTRFLEAYLPKLTEKLRELGIAGVTRFHISDEPEPQHLLSYKQARESVAPYLKD FVIMDAVSSLDICREGEIDCPVCSSDHIEPFLEAGVTPLWSYYCTAQDFLVSNRFMAMPS ARCRIYGAQIFRYAMEGILHWGYNFYNSQYSLTKLDPYRSTDADGAFPSGDSFLVYPGAD GKPEESVRMMVMCHTMQDVRAMEQLAEKKGREYVISMLEEDLAEPITFKQYPKSEFWMIQ MRNRINRELEEA >gi|229784105|gb|GG667630.1| GENE 36 34682 - 36094 1291 470 aa, chain - ## HITS:1 COG:CC1172 KEGG:ns NR:ns ## COG: CC1172 COG3119 # Protein_GI_number: 16125424 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Caulobacter vibrioides # 5 449 33 480 521 190 29.0 6e-48 MAVQPNFLFIFMDDMGWRDLACTGSTFYETPNIDRLCRQGMVFANSYASCPVCSPSRASC LTGKYPARLGVTDWIDMEGTSHPLKGKLIDAPYIKHLPEGEYTIAQALKDAGYDTWHVGK WHLGGREFYPEHFGFDVNIGGCSWGHPHDGYFSPYGIETLSEGPEGEYLTDRITDEAVRL LRKRQACGSRKPFYMNLCHYAVHTPIQVKDEDRARFEKKARELGLDKETALVEGEFHHTE DKKGRRVVRRVIQSDPSYAGMIWNLDQNIGRLLEALRECGEEENTVVVFTSDNGGLATSE GSPTCNLPASEGKGWVYEGGTRVPLIVKYPGRVAPGSRCDVPVTTPDFYPTFLELAGVPQ KAGIPIDGRSIVPLLSGNPMPERPIFWHYPHYGNQGGTPASSVVMGDYKYIEFFEDGRGE LYDLKADFSETNNLCEKMPETAARLRMLLHGWQREVCARFPEENAEYVQM >gi|229784105|gb|GG667630.1| GENE 37 36134 - 37189 706 351 aa, chain - ## HITS:1 COG:SMc00127 KEGG:ns NR:ns ## COG: SMc00127 COG3119 # Protein_GI_number: 15964702 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Sinorhizobium meliloti # 56 295 186 429 512 105 31.0 1e-22 MTPQHGFEKWYTIGLGGCCYYHPDIVDHGEITVEHGKYVTELFADKALDYLDELAESEEP FYLAVNFTAPHAPWGKEHHPAKWIDYYENCDFASIPDIPDHPDLLTGPVYGTEKRKENLR 
GYFAAVSAMDEQVGRILDALEERHLTEDTIVIFSADNGMSMGQHGIWGKGNGTFPMNMYE SAVKVPFIISWPCFIKPGSVCCELVSAYDLFPTLLELTGLSGRFPEGLPGKSFCGLLSGK EETRTGEIVIFDEYGPVRMIRDREWKYIHRYPYGAHELYHLTDDPGETRNLYGMPEYEAK VLELKKRMETWFLKYADPAVDGVREGVTGSGQLCRPGIYAVRTDVYGPVGT >gi|229784105|gb|GG667630.1| GENE 38 38147 - 38497 369 116 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 3 78 29 104 497 73 40.0 1e-13 MRKKPNILFILTDDQGAWAMHCSGTPELYTPNLDRIAASGMRFDNFFCASPVCSPARASL LTGKIPSGHGVLDWIRSGSVDADKFAAQGRENPYADGYKNERKPIAYLEGQTTLAS >gi|229784105|gb|GG667630.1| GENE 39 38507 - 40075 1763 522 aa, chain - ## HITS:1 COG:BS_lplA KEGG:ns NR:ns ## COG: BS_lplA COG1653 # Protein_GI_number: 16077777 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 75 522 57 502 502 142 26.0 2e-33 MKKRQVMSLVLAGVMAVSLTACGGGTGGTADKAESGQTEAGKAEASSKSGESIPLRWLTT GDAAAKAIKADDRIVAAINEKLGINLTVEIVPEANTEKVNVAMASGDFPDVVTGAYGTSA TQQWIDDGMVIPLNDYFETMPNIKTWLEDEYQWSAIDGKYYGLPFITQYNAANALIIFRQ DWLDNLGLEYPKTLDDMKNVLTAFTNNDPDGNGKNDTYGYTAEKPSSTSGVTPFDWVFFA YGLPYADYSLNENGEIIPFFEDPSFIPAMHYIKDLWDSKVVDPELMLNDQSKKEEKFYQG KSGAMLAPLFRHVTRHENSVKELYPDASISYGLPPEGPSGARGLNKQGKNGMMTCITTAC KNQDKAAAFIDFMVSKEGNDLLRLGIEGIHYTKDGDTIIFNEEERAKDAFADDGWAHALA WGSFYWPLESGYIPVSDPNRERALETVELATECQMPNLIKQKTPAEIENGSAVNDVFIQY FSDMLQGKISIEDGVVSLGKEWRNQGGDEILKEVSEVFKTEK >gi|229784105|gb|GG667630.1| GENE 40 40141 - 41040 894 299 aa, chain - ## HITS:1 COG:BH0481 KEGG:ns NR:ns ## COG: BH0481 COG0395 # Protein_GI_number: 15613044 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 16 299 6 293 293 243 43.0 5e-64 MLRSKKQEAGSHIKASAGSKVFDAVNVTLLVLLALTTFYPFWDCMVVSFSSLKSYLATNI 
HLWPSEWSLDGYAYMVKSAELWTSYANSIFITVVGTLINMMITTMAAYVLSKKDLKGQRF FMFMAVFTMMFSGGIIPTYIVVKDLHLMNSLWSMILPSAINTYNLIILRNFFMDLPLELE EAALLDGCTEVGVLFRIMLPISKPALTTVTLFYAVDHWNDFFSAIMYINSKQAWPLQLFL RSMLFQNDAAYSSGGESLFLLGQPMKMAAVMMAIIPIMCAYPFFQKYFTTGMTAGAVKG >gi|229784105|gb|GG667630.1| GENE 41 41040 - 41987 1027 315 aa, chain - ## HITS:1 COG:AGl3564 KEGG:ns NR:ns ## COG: AGl3564 COG4209 # Protein_GI_number: 15891904 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 28 307 22 301 309 279 50.0 5e-75 MGKAAVRKNGKVRNGPGIRTYIWKHKYLYLMLVPAIVYYIIFCYVPMYGVTIAFKDFNPM KGIWKSSWVGFETFQKLFSMEKFYSVFWNTIRISLIRLVFGFPFPIIIALMLNELRHVRM KKVIQTAIYIPNFISWVVLGGILTSILSMDSGIVNGVIQGLGFQPIGFLTDEKFFVPTMV VSMIWKTFGWNTIIYLAAITGIDTQLYEAATVDGANRWQKLVHITVPCIRSIIIVILITR IGSLMQAGFEQIFVLYHPGVYGTADIIDTYVYRMGLQDGKFELATAVGLFKSVINFCLVV AANKIARMSGEEGIY >gi|229784105|gb|GG667630.1| GENE 42 42293 - 43331 921 346 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870335|ref|ZP_06113744.2| ## NR: gi|288870335|ref|ZP_06113744.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 4 346 1 343 343 632 100.0 1e-179 MERMGPLGIKHKNTRDSKWKNILKSRHFRSFFFITFLFTTIIFIIGGVVITLQNRMKLHE NLEMKANYVALSVQDNLSNMKNSAVFMGNLASIERVLEAKSPTLDQLVRMTNDVTPYSTL YHYESICLFFNRSQRIFDSSGGMYDYDDFYDKELIETLENMKGEEMWLINVPYKRYYSPD PAVAVITYARRLPLYKSQGRGYVTVSYSLGRLQKAAAEAAGYTPYTATVCFQDHLLWSSS DAVMKNWANGLTASENEGLLLPGTTAFSSFSTIGARCSFHASKLELLTAVSATLLQWFLI YLAAAAADFAASMVYGMIMLRPVDAIMKKIGINPYTEIPGVREDEF Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:17:48 2011 Seq name: gi|229784104|gb|GG667631.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld24, whole genome shotgun sequence Length of sequence - 62572 bp Number of predicted genes - 67, with homology - 62 Number of transcription units - 32, operones - 19 average op.length - 2.8 N 
Tu/Op Conserved S Start End Score pairs(N/Pv)
- Term 114 - 159 6.4
1 1 Op 1 7/0.000 - CDS 204 - 512 533 ## COG1445 Phosphotransferase system fructose-specific component IIB
2 1 Op 2 7/0.000 - CDS 540 - 1607 1252 ## COG1299 Phosphotransferase system, fructose-specific IIC component
3 1 Op 3 . - CDS 1612 - 2064 598 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type)
- Prom 2192 - 2251 11.1
- Term 2238 - 2284 7.3
4 2 Tu 1 . - CDS 2316 - 3890 1603 ## COG0038 Chloride channel protein EriC
- Prom 4038 - 4097 7.1
+ Prom 3679 - 3738 4.1
5 3 Op 1 . + CDS 3856 - 3987 58 ##
+ Prom 4000 - 4059 4.6
6 3 Op 2 . + CDS 4085 - 5104 743 ## COG5263 FOG: Glucan-binding domain (YG repeat)
- Term 5063 - 5092 -0.4
7 4 Op 1 . - CDS 5144 - 5287 63 ##
8 4 Op 2 . - CDS 5236 - 5457 326 ## Closa_3095 hypothetical protein
+ TRNA 5716 - 5806 70.2 # Ser CGA 0 0
- Term 5799 - 5859 9.0
9 5 Op 1 11/0.000 - CDS 5862 - 6779 574 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9
10 5 Op 2 . - CDS 6841 - 7167 512 ## COG0526 Thiol-disulfide isomerase and thioredoxins
- Prom 7224 - 7283 6.8
+ Prom 7325 - 7384 6.6
11 6 Op 1 . + CDS 7411 - 8184 535 ## COG0778 Nitroreductase
12 6 Op 2 . + CDS 8196 - 8540 258 ## COG0640 Predicted transcriptional regulators
13 6 Op 3 . + CDS 8571 - 9326 535 ## COG1145 Ferredoxin
+ Term 9358 - 9399 9.3
- Term 9348 - 9385 6.8
14 7 Tu 1 . - CDS 9429 - 9758 380 ## Dhaf_2872 hypothetical protein
15 8 Op 1 . - CDS 9860 - 12361 2370 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases
16 8 Op 2 . - CDS 12358 - 12528 101 ## gi|266620823|ref|ZP_06113758.1| aerial mycelium formation
17 8 Op 3 . - CDS 12582 - 13064 581 ## Amico_0799 sodium/hydrogen exchanger
- Prom 13090 - 13149 23.0
+ Prom 13973 - 14032 80.4
18 9 Op 1 . + CDS 14225 - 15127 895 ## COG0583 Transcriptional regulator
19 9 Op 2 . + CDS 15132 - 15260 64 ##
20 10 Op 1 . - CDS 16485 - 17498 560 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily
21 10 Op 2 . - CDS 17500 - 18147 825 ## Closa_3203 hypothetical protein
22 10 Op 3 . - CDS 18162 - 19460 1279 ## COG0153 Galactokinase
- Prom 19535 - 19594 7.1
+ Prom 19575 - 19634 5.3
23 11 Tu 1 . + CDS 19662 - 20636 725 ## COG1242 Predicted Fe-S oxidoreductase
- Term 20652 - 20690 6.6
24 12 Tu 1 . - CDS 20736 - 20870 92 ## Cbei_2688 GCN5-related N-acetyltransferase
25 13 Op 1 . - CDS 21825 - 22292 370 ## Cbei_2688 GCN5-related N-acetyltransferase
26 13 Op 2 . - CDS 22329 - 23537 1137 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation)
27 13 Op 3 . - CDS 23569 - 23994 447 ## Clocel_2236 flavodoxin
- Prom 24181 - 24240 10.5
+ Prom 24152 - 24211 10.2
28 14 Tu 1 . + CDS 24450 - 25472 400 ## Closa_1057 hypothetical protein
+ Term 25506 - 25553 8.5
- Term 25494 - 25541 12.3
29 15 Tu 1 . - CDS 25591 - 25689 108 ##
30 16 Tu 1 . - CDS 26661 - 27326 860 ## COG1307 Uncharacterized protein conserved in bacteria
- Prom 27371 - 27430 7.8
31 17 Op 1 . - CDS 27520 - 28077 479 ## CLL_A2176 hypothetical protein
32 17 Op 2 . - CDS 28121 - 28630 545 ## COG1846 Transcriptional regulators
- Term 28651 - 28694 6.7
33 18 Op 1 . - CDS 28750 - 29046 406 ## Closa_3214 hypothetical protein
- Prom 29066 - 29125 4.2
- Term 29062 - 29096 -0.1
34 18 Op 2 . - CDS 29134 - 31152 2669 ## COG0326 Molecular chaperone, HSP90 family
- Prom 31223 - 31282 6.0
- Term 31253 - 31291 7.2
35 19 Tu 1 . - CDS 31300 - 31797 637 ## Closa_3216 hypothetical protein
- Prom 31942 - 32001 4.7
+ Prom 31745 - 31804 6.5
36 20 Tu 1 . + CDS 31974 - 33260 1136 ## COG1362 Aspartyl aminopeptidase
+ Term 33302 - 33371 14.2
- Term 33294 - 33355 13.4
37 21 Op 1 . - CDS 33413 - 33643 195 ## gi|266620843|ref|ZP_06113778.1| conserved hypothetical protein
38 21 Op 2 . - CDS 33671 - 34645 1131 ## Closa_3219 PpiC-type peptidyl-prolyl cis-trans isomerase
39 21 Op 3 . - CDS 34697 - 34936 251 ## COG0419 ATPase involved in DNA repair
- Prom 34963 - 35022 80.4
40 22 Op 1 . - CDS 35866 - 36849 1154 ## COG0419 ATPase involved in DNA repair
41 22 Op 2 28/0.000 - CDS 36874 - 37911 1036 ## COG0419 ATPase involved in DNA repair
42 22 Op 3 . - CDS 37908 - 39062 1100 ## COG0420 DNA repair exonuclease
43 22 Op 4 . - CDS 39130 - 39915 997 ## COG0561 Predicted hydrolases of the HAD superfamily
44 22 Op 5 . - CDS 39964 - 40557 609 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation
- Term 40613 - 40647 3.4
45 23 Op 1 . - CDS 40654 - 40875 197 ## Closa_3222 hypothetical protein
46 23 Op 2 . - CDS 40841 - 41218 423 ## COG4905 Predicted membrane protein
- Prom 41254 - 41313 7.6
47 24 Op 1 . - CDS 41368 - 41826 569 ## COG1963 Uncharacterized protein conserved in bacteria
48 24 Op 2 . - CDS 41876 - 42445 558 ## COG1418 Predicted HD superfamily hydrolase
49 24 Op 3 . - CDS 42475 - 43920 1461 ## COG0297 Glycogen synthase
- Prom 43962 - 44021 2.4
50 25 Tu 1 . - CDS 44024 - 44827 916 ## COG0784 FOG: CheY-like receiver
- Prom 44867 - 44926 4.6
+ Prom 44859 - 44918 5.5
51 26 Tu 1 . + CDS 44960 - 45124 105 ##
+ Term 45156 - 45202 9.0
52 27 Op 1 . - CDS 46030 - 46599 335 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair
53 27 Op 2 . - CDS 46612 - 46749 173 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair
- Prom 46785 - 46844 5.2
54 28 Op 1 . - CDS 46895 - 47194 91 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair
55 28 Op 2 12/0.000 - CDS 47194 - 47361 182 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair
56 28 Op 3 6/0.000 - CDS 47358 - 47891 261 ## COG1468 RecB family exonuclease
- Prom 48017 - 48076 3.4
57 28 Op 4 . - CDS 48230 - 50344 1057 ## COG1203 Predicted helicases
58 28 Op 5 . - CDS 50419 - 51309 612 ## COG3649 Uncharacterized protein predicted to be involved in DNA repair
59 28 Op 6 . - CDS 51337 - 53475 1268 ## Dhaf_2184 hypothetical protein
60 28 Op 7 . - CDS 53479 - 54141 352 ## Dhaf_2183 CRISPR-associated protein Cas5
- Prom 54179 - 54238 11.7
- Term 54538 - 54584 -0.2
61 29 Tu 1 . - CDS 54608 - 55882 1541 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1
- Prom 55966 - 56025 5.8
- Term 55895 - 55937 6.2
62 30 Tu 1 . - CDS 56136 - 57065 579 ## COG0673 Predicted dehydrogenases and related proteins
63 31 Op 1 . - CDS 57173 - 58060 619 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase
64 31 Op 2 . - CDS 58148 - 58303 90 ## gi|266620864|ref|ZP_06113799.1| conserved hypothetical protein
65 32 Op 1 1/0.500 - CDS 59227 - 60837 1317 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase
66 32 Op 2 . - CDS 60922 - 62001 1186 ## COG0673 Predicted dehydrogenases and related proteins
67 32 Op 3 . - CDS 62030 - 62572 523 ## Tthe_0980 oxidoreductase domain protein
Predicted protein(s)
>gi|229784104|gb|GG667631.1| GENE 1 204 - 512 533 102 aa, chain - ## HITS:1 COG:STM4113 KEGG:ns NR:ns ## COG: STM4113 COG1445 # Protein_GI_number: 16767378 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Salmonella typhimurium LT2 # 2 97 3 98 106 93 46.0 9e-20
MKVVGITSCPSGVAHTYMAAEALKLSGQKLGIDVIIETQGGAGVENQLKQKDIDEAACVV
LVNDVALEGLDRFKGKKVLKMGVSDLIKKSDAVMKKIQDTFQ
>gi|229784104|gb|GG667631.1| GENE 2 540 - 1607 1252 355 aa, chain - ## HITS:1 COG:VC1822_2 KEGG:ns NR:ns ## COG: VC1822_2 COG1299 # Protein_GI_number: 15641824 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Vibrio cholerae # 1 330 4 333 358 355 61.0 7e-98
MKEQIKILKKHILTGTSHMIPFIVAGGILFSISVMLNPAGAATPTEGWLAGLAEIGLGGL
TLFVPVLGGYIAYSIADKPGLAPGMIGAYLANSMGAGFLGGMAAGLIAGVVVKQLKKIKL
PISLKTLGSIFIYPLVGTFITGGIIVWGIGTPIAFIMQSLTAWLSGLGDVGRVPLATILG
AMTAFDMGGPINKVATLFAQTQVDTLPYLMGGVGVAICTPPIGMGIATLLAPKKYNAEEK
EAGKAAILMGCVGITEGAIPFAANDPLRVIPSLIVGAVVGNIIPFLTGVLNHAPWGGLIV LPVVEGRIWYIISVLAGGFVTAIMVNLLKKNNTDDMASKAAVEDDGLDEITFDEL >gi|229784104|gb|GG667631.1| GENE 3 1612 - 2064 598 150 aa, chain - ## HITS:1 COG:STM4110_3 KEGG:ns NR:ns ## COG: STM4110_3 COG1762 # Protein_GI_number: 16767376 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Salmonella typhimurium LT2 # 4 149 1 144 145 95 34.0 2e-20 MSKLISKNCIVFDIDAAEKKDVISALVKELDAAGKITDQNEFFEDVLAREAIAPTYVGFD MGLPHGKTDHVLEASVAFARTVRPVVWNEETGETADLIILIAVPLSEAGDTHMKILANLS RQLMHEEFRESLRNSTKNQVFDILTEVLEG >gi|229784104|gb|GG667631.1| GENE 4 2316 - 3890 1603 524 aa, chain - ## HITS:1 COG:L113400 KEGG:ns NR:ns ## COG: L113400 COG0038 # Protein_GI_number: 15673646 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Lactococcus lactis # 22 516 16 511 512 346 42.0 7e-95 MRSEGNTVTNTINRYRSFRYALILEGVAVGAIAGVVVVAFRYLLGYAEILLHNILNYGRT HVWFVPVWFLILAAAAVIVTLLLKWDSLISGSGIPQVEGEIMGEIDECWWRVLTAKLAGG LIGLGCGLSLGREGPSIQLGAMAAKGFSRLTKRVKTEEKLLMTCGATAGLSAAFNAPIAG VLFSLEEIHKHFSPEILLSSMAASITSDFVSRNVFGLKPVFTFNITHMMPLSTYAHVLIL GVIIGLMGVVYNTCLSKSQDLYQKIPGQTLRLLIPFMMAGVFGFLYPSVLGGGHSLVEVL SSGEMVIGSMCLLFVVKFVFSMVSFGSGAPGGIFLPLLVMGAVIGGIYFNAVGMVSGSLD GLLGNFIILGMAGYFSAIVRAPITGIILISEMTGSFSHLLTLSMVSLAAYLVPDIMRCAP VYDQLLHRLLAKQNPEKKAVLTGEKVLVEGMIFHGSAAEGMKVSEIAWPKTCLVVSLMRG EAEFVPRGDTKLLAGDKIVVLCDETAEGQLHRTLQEFCETVKMQ >gi|229784104|gb|GG667631.1| GENE 5 3856 - 3987 58 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVLVTVFPSDRILRSPSLFSFNIRILSLKLYELFCCQYLIITL >gi|229784104|gb|GG667631.1| GENE 6 4085 - 5104 743 339 aa, chain + ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 258 334 542 617 621 62 
37.0 1e-09 MKQFNHRWTRLLMTACLLSICLSLPAHAGERKKVDQIPLSFSSSIEAGKTGGEVYVTVDD PSRAAHCRIVNTAITNPPDTVWEAGTTPEILISLRAESGFYFPSDKADAFRLTGAGAVFD RAEFLDSQSVLHVTVRLSALSDPDAPVVLSGLSWDKGSFTALWDDDSNARYYQVQLFKDG QPVGAAATTYTNYSRLSDKISGKGSYRFTVCFVDNALNRSEWISSDTWEITASEAAALES SLKNTYGPGTVSSPARAAGEWKSDDTGWWFEHPDGTWTTDGWDYIGGEWYYFDHSGYRKS GWIPWQGTWYYCDEDGVMLKNAETPDGYETDSSGAVKAK >gi|229784104|gb|GG667631.1| GENE 7 5144 - 5287 63 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSQYGIFPGIVNRKSGIIEKAGMSPENAVTDRSGVRRIRPGFIFLNP >gi|229784104|gb|GG667631.1| GENE 8 5236 - 5457 326 73 aa, chain - ## HITS:1 COG:no KEGG:Closa_3095 NR:ns ## KEGG: Closa_3095 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 73 1 73 73 88 58.0 9e-17 MKNMFYTFSTVEEMVAFVNKAETCDFDVDICYNHLVIDGKSLMGVMNVGLGKTVEIVCHS TAFSPESLIGKAA >gi|229784104|gb|GG667631.1| GENE 9 5862 - 6779 574 305 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 6 296 2 296 306 225 39 4e-58 MNEKQYDVVIIGGGPGGYSAALYCARSGMSVLVLEKLSPGGQMATTGQVDNYPGFEDGID GFELGEKMKKGADRFGAETAFDEVISVELQSEPKKITTTGGELLAKTVVIATGASPRELG LPEEKKLRGRGVAYCAVCDGMRYKDKTVVVSGGGNSAAEDALFLSKICKKVYLVHRRDAL RASMVYQNALKDSPVEFLWNSRIEEILHEKKVTGVRLSDVKTGEESVVSCDGVFVAIGRV PDTAVFEGQVERNEQGYIVADETTKTNVPGVFAVGDVRTKPLRQIVTAASDGAVASKFIE EYLHK >gi|229784104|gb|GG667631.1| GENE 10 6841 - 7167 512 108 aa, chain - ## HITS:1 COG:BMEI2022 KEGG:ns NR:ns ## COG: BMEI2022 COG0526 # Protein_GI_number: 17988305 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Brucella melitensis # 1 107 1 107 107 97 36.0 4e-21 MDIMTINKDNFDKEVLQAEKPVVLDFWAPWCGYCRRLGPVVDRLAAQYEGKLLVGKVDID ESPELEEKHEIETIPTLIVYQGGRASEPLINPGSMTEIEDWLKENGAV >gi|229784104|gb|GG667631.1| GENE 11 7411 - 8184 535 257 aa, chain + ## HITS:1 COG:CAC3359_2 KEGG:ns NR:ns ## COG: CAC3359_2 COG0778 
# Protein_GI_number: 15896602 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 85 254 1 188 191 118 36.0 1e-26 MVKHQVIIDPDKCVGCGICAKACAAHNIVIGNKRAKIRMDDCVMCGHCTAVCPKNAVTIS GYDTEQIEKKEKVRLNPSDVLDVIRFRRSIRRFQRKDVPSEVIGQILEAGRLTHTAKNMQ DVSFVVLKKEITRIESMAVKLFKIIKPLADLFWPMARNTKIDDHFFFFHAPVVIVILAKE KTNGILAAQNMEFAAEANDLGVLFSGFFTTTANISHKIRKALRVPKGKRVAATLVLGYPD VKFLRSVPRRQLDVIYL >gi|229784104|gb|GG667631.1| GENE 12 8196 - 8540 258 114 aa, chain + ## HITS:1 COG:MJ1325 KEGG:ns NR:ns ## COG: MJ1325 COG0640 # Protein_GI_number: 15669515 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanococcus jannaschii # 16 102 4 86 89 63 42.0 1e-10 MANDCEERLQKIVYGFRECRNAFTAIGDETRQLILLVLLESDFSGIRVGEIAEKTHLTRP SVSHHLQILKEAGIVAMRKEGTKNYYYLSVDETQWKEIADLITLIYESVRHISS >gi|229784104|gb|GG667631.1| GENE 13 8571 - 9326 535 251 aa, chain + ## HITS:1 COG:MA4170 KEGG:ns NR:ns ## COG: MA4170 COG1145 # Protein_GI_number: 20092963 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanosarcina acetivorans str.C2A # 2 235 12 247 294 79 27.0 5e-15 MILYFSGTGNSEYAAKRIGRELDDEVVCLFEKIRSRDFTKMHSDRPWIIVTPTYAWRIPR IIQEWLMKTALAGNRDIYFVMTCGGSIGNAGSCSEKLCSAKKLNYMGCTAVVMPENYIAM FDVPTQKESLEIIQRAEQTINEAILTIKNGKAFSRPAVTFQDRMNSGIVNNLFYPVFVHA KKFYAADTCISCGKCAKICPLNNIRMENRKPVWGKNCTHCMACICHCPTEAIEYGKHSKG KPRYVCPKTDH >gi|229784104|gb|GG667631.1| GENE 14 9429 - 9758 380 109 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2872 NR:ns ## KEGG: Dhaf_2872 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 103 1 103 103 159 64.0 4e-38 MKKVRITVLRKEFYSDFADEYLTDGREAGPCALMNVGDSFIFEGGAQMPEGLCPWAWIDI YHSVSCVAAGGTYKPWNREDGQTIVCCTDGIRPVVFRVEAFGEEDESME >gi|229784104|gb|GG667631.1| GENE 15 9860 - 12361 2370 833 aa, chain - ## HITS:1 COG:BH0704 KEGG:ns NR:ns ## COG: BH0704 COG1501 # Protein_GI_number: 15613267 # Func_class: G Carbohydrate transport and 
metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Bacillus halodurans # 81 739 87 744 801 499 41.0 1e-140 MKVCTRAEEVKKLEDTISITTNCVEIRLLFLTDSIVRIRAGFDGTFAEESYSLVMTAWDD RMDGLMNRFRRRVRPAETVFTDGEKEAVIEGRKLRVVVEKEPFRICIYDKEGTLIHADIV ELGYQEDSNHRRIHTSEISPDDCFYGFGEKSGEWNKAQKQLCMSPKDAMGYNPKETDSLY KHIPFYVKLNRRTKKASGYFYHNPAECDFDMGREKSNYWKMHSRYRADAGDIDLFFIEGP AIRQVVERYTDLTGKSVMLPRTALGYLGSSMYYPELPENCDDAILEFIDTTKEEDIPADG FQLSSGYTSQETAEGLKRCVFTWNKKRFKDPVRFFAEMKKRGITVSPNVKPGVLLSHPDL EEMKAKDIFVKDSVKDEAGVGTWWGGKGVFVDFTKPQARAEWKEMLKRAVLEMGTSSVWN DNCEYDSLVDKDCRCDFEGKGTSIGYIKAVMSNLMCHVTAEAVEETFENIRPFIVCRSGH AGIQRYAQTWAGDNLTCWDSLKYNIATILGMSLSGVANQGCDICGFYGPSPEAELMVRWV QNGIFQPRFSVHSVNTDNTVTEPWMYGGYTDLIRKAVKLRYRMIPYLYSLMERAHETGLP IMEPMYSAFQQDTACYEEGVDFMLGDSLLVANVVEKGAEVRSVYLPEGERFYDFYTREAY EGGQTIEIPVTIESIPLFIRGGAVIPMAENQLNNLASETVTGISLLCSADRDCEFTLYED DGISKDYEHGQYLKTRITVTGGVRTTISFRNEGLYETLVETMSVDLIHREKAPYWISLDG ERLPHYLHRRKFEEAEAGWYYSQTKKSVQIKYRNPRKDYDVVISFEEFDMIGM >gi|229784104|gb|GG667631.1| GENE 16 12358 - 12528 101 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620823|ref|ZP_06113758.1| ## NR: gi|266620823|ref|ZP_06113758.1| aerial mycelium formation [Clostridium hathewayi DSM 13479] aerial mycelium formation [Clostridium hathewayi DSM 13479] # 1 56 1 56 56 110 100.0 4e-23 MAENTQSFDINHAGFDWFSRGIICMMDSEYLWDRGTGAVASVLCERRNTGDRRKKG >gi|229784104|gb|GG667631.1| GENE 17 12582 - 13064 581 160 aa, chain - ## HITS:1 COG:no KEGG:Amico_0799 NR:ns ## KEGG: Amico_0799 # Name: not_defined # Def: sodium/hydrogen exchanger # Organism: A.colombiense # Pathway: not_defined # 1 143 247 389 391 109 45.0 4e-23 MVFGATYMNVTNDKKLFKQVNHFTPPVMSMFFIISGMNLNVTALRTAGIIGVSYFLIRII GKYAGAYIGCSITGMPVVMKRYIGLALIPQAGVAIGLAFLGQRMLPADIGSLLLTIILSS SVLYELIGPACAKMSFFLSGTIKREVIKSVEDYQPAHAVK >gi|229784104|gb|GG667631.1| GENE 18 14225 - 15127 895 300 aa, chain + ## HITS:1 COG:aq_638 KEGG:ns NR:ns ## COG: aq_638 COG0583 # Protein_GI_number: 15606065 # Func_class: K Transcription # Function: 
Transcriptional regulator # Organism: Aquifex aeolicus # 1 296 1 297 303 121 27.0 2e-27 MMDSKLETLIKVNETRSFTKAAELLSLTQPAVSQHIRQLEQDLGATLFIRGEGPLKLTQE GEIAIKYAKRIQTLYQNLEQSLGDLKRHITRLSVGITHTAESNIMVEVLARYSSLNAGCR ITIISDTINNLYQKLKTYELDIAIVEGKIVDNNFNSVMLDTDSLILAVSNHNPLSKKSIV TLEELKKQRLILRLPDSATRNLFISHLESNNVSLDELNVTLEVDNVATIKDLVRRDFGVS ILARSAFAGELRKGKMTGLPIENLSMTREINMVYHKDFEHTDLLQEITKIYYEYSRSKYT >gi|229784104|gb|GG667631.1| GENE 19 15132 - 15260 64 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFLGGGIKSYAAAGVSRGRKGGDTGRGSTPVFSVGRTKRKTV >gi|229784104|gb|GG667631.1| GENE 20 16485 - 17498 560 337 aa, chain - ## HITS:1 COG:CC3033 KEGG:ns NR:ns ## COG: CC3033 COG0697 # Protein_GI_number: 16127263 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Caulobacter vibrioides # 38 324 18 301 310 234 43.0 2e-61 MNRKSKFCHCADKRDMPNSRKPGMQIETANTVSVCLTAMICCALWGSAFPCIKIGYQLFQ IEGNDVSSQLLFAGYRFTLAGLLTILIGSLMNRRPLLPQKPSWLRIIKLSLFQTVIQYLF FYVGLAHTTGVKASIIEASNVFAAILIASLIFRQEKLDKNKVIGCIAGFAGVVIININQG GLNMSLSLMGEGFILLSTIAYAVSSVLIKIYSREDHPVMLSGWQFLLGGIIMILCGYLTG GSVHVWTVPSISMLVYLSAVSAIAYSLWGILLKRNPVSKVAVFGFMNPVFGVILSALFLG EGQQAFGLTTLVALFLVCIGIYIVNKPTPQPAPDNQE >gi|229784104|gb|GG667631.1| GENE 21 17500 - 18147 825 215 aa, chain - ## HITS:1 COG:no KEGG:Closa_3203 NR:ns ## KEGG: Closa_3203 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 215 1 215 215 360 84.0 2e-98 MKEKENLVITIGRQYGSGGRIVGKALAEELGIHFYDEEILTMTSEQSAVGEVFFRLADEK AGNNLLYKIVSGMKPQLGKPSTDGDIVKPENLFRFQSEVIRQLADQESCIIAGRCADYIL SREDVNVPGLVKIFVYADVPALIKRTMEVDKVDEKEAARRVRKINRERKEYYRYYSGGRW EDWNNYDLIINTSKFDLEQTAKLIKDYIKLKGFEL >gi|229784104|gb|GG667631.1| GENE 22 18162 - 19460 1279 432 aa, chain - ## HITS:1 COG:FN2107 KEGG:ns NR:ns ## COG: FN2107 COG0153 # Protein_GI_number: 19705397 # Func_class: G Carbohydrate transport and 
metabolism # Function: Galactokinase # Organism: Fusobacterium nucleatum # 36 416 3 375 389 112 26.0 1e-24 MKVCDTIQMLESEKSRKLMAALYGETAVEANIERYQNLVKSFQKKFAEEDITLFSSPGRT EISGNHTDHNHGKVLAGSINLDCVGVAAKNNSSKVHIISETFNQSFIIDLNDLSPSDKKA GTIDLVKGLLQGFKESGYEVGGFNAYITSNVISAAGVSSSASFEMLLCSILNTFFNEGRM DTVAYAHIGKYSENVYWDKASGLLDQMACAVGGLITIDFMEPASPVVEKIDFDFSSQNHS LIIVNTGKGHADLSADYSSVPIEMKKVAEFFGKEVCAQITEEEVIEHLAEVRAYAGDRSV LRALHFFEENKRVEAEVKALKEGRFTDFLMNITASGNSSWKWLQNCFTNSAYQEQGITVA LALTELFIAEKQRGACRVHGGGFAGVIMAMLPNDLVDEYVAYIEKALGEGNAYRMSIRPY GAICFDTVMEEQ >gi|229784104|gb|GG667631.1| GENE 23 19662 - 20636 725 324 aa, chain + ## HITS:1 COG:BS_ytqA KEGG:ns NR:ns ## COG: BS_ytqA COG1242 # Protein_GI_number: 16080100 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Bacillus subtilis # 5 322 13 314 322 279 42.0 4e-75 MYWNEKPYHSLDYEMKRIYGHKIYKLALDGGMTCPNRDGTLGSNGCIFCSGGGSGEFAES MAAHPSVTEQIEAAKALVEKKMSKAHNNTALDCPKGRYIAYFQSYTNTYAPVSRLRALFT EAIAHPDIAILSIATRPDCLPPEIIALLSELNRIKPVWVELGLQTIHEETARFIRRGYPL SVFEDAWNRLHGAGLTVIAHVILGLPGETRSMMQDTVSYLGGLGTHGIDGVKLQLLHVLE GTDLAVLYRAGGFQTFSLEEYLDLVIDCIALLPPETVIHRISGDGPKKLLVAPVWSGNKR LVLNSMAKRFKERGVCQGDQYRTP >gi|229784104|gb|GG667631.1| GENE 24 20736 - 20870 92 44 aa, chain - ## HITS:1 COG:no KEGG:Cbei_2688 NR:ns ## KEGG: Cbei_2688 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.beijerinckii # Pathway: not_defined # 1 40 128 167 171 62 67.0 5e-09 MDINKPAIHLYMKNGFEQADGIYDEVIDDDLVLHEYGFEIKTSK >gi|229784104|gb|GG667631.1| GENE 25 21825 - 22292 370 155 aa, chain - ## HITS:1 COG:no KEGG:Cbei_2688 NR:ns ## KEGG: Cbei_2688 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.beijerinckii # Pathway: not_defined # 1 153 1 154 171 201 62.0 9e-51 MNLRLAETADLPQLKTVYKELIRKMDENGVSIWDEIYPCECFAEDIGNNRLYVLVEECRI VAAFALCSQAAGADCVKWRYEGGKALYIDRFGVNADYLRKGIGSIALKGAIALAGELGAE YVRLFVVDINKPAIHLYMKNGFEQADGIYDEVIAS >gi|229784104|gb|GG667631.1| GENE 26 22329 - 23537 1137 402 aa, 
chain - ## HITS:1 COG:BS_ywtB KEGG:ns NR:ns ## COG: BS_ywtB COG2843 # Protein_GI_number: 16080641 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Bacillus subtilis # 72 336 44 305 380 135 32.0 2e-31 MKLLIILPVFLLAALLVWIAADGQRGGARETLASSRETAVDSAVQEGSQTADFAPEAATE EVLKLEVSETEGESTAPAAETEPEDPIVHMLFSGDVYFSSHVLAAYDNAGGIHGVLDDAY RDEIARADLYMANQEFPFSDRGTPAPDKQFTFRIPPERVSMMHELGIDIVTIANNHTLDY GEDALVDTCTTLENAGILYVGAGANMDRAKQLETIEVRGRTIGFLAASRVYPDTSWVANS KKPGMVSGYDPSILLKEIEAAGEYCDYLVVYMHWGIERDEKPQEYQRTLGKQLIDAGADL VIGSHPHVLQGIEYYQGKPIVYSLGNFIFGSSIPKTALLRADVDLEQNQVNLSLVPGTSG AGYTKELTDPQKISEFYQYMQGISFGVTIDENGVVQEENAAK >gi|229784104|gb|GG667631.1| GENE 27 23569 - 23994 447 141 aa, chain - ## HITS:1 COG:no KEGG:Clocel_2236 NR:ns ## KEGG: Clocel_2236 # Name: not_defined # Def: flavodoxin # Organism: C.cellulovorans # Pathway: not_defined # 1 127 1 128 135 87 40.0 1e-16 MKTAVRYYSRTGNTKVIAEAIAPVAGCQAESIEVPLEGVTDILFLGGALYWGKIPRTLKE YIKSLSPQQVKYVAVFTTAGILESASEIYKVFLRYHDLKVMKDTYFCNGKRVYDGKIKEE TMLFAEKVMDQASIMNHSIMD >gi|229784104|gb|GG667631.1| GENE 28 24450 - 25472 400 340 aa, chain + ## HITS:1 COG:no KEGG:Closa_1057 NR:ns ## KEGG: Closa_1057 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 18 124 18 120 336 93 41.0 1e-17 MMKKLLTLLCTLFMVIIMAGSAFAGTWRTGQAPNQDKWWYDNDNGTYPANGWKWIDGNDD GIAECYCFDPAGWLYTNTTTPDGYTVNENGAWIENGIVNISIVPRQNSGYSQKNESSVYE AADEYDEEILMPWYVEETIFNPDRSVLKYITLGCDTGCRIYYTTGVKPADPTDQDDIYVT KKRSGFQYGSPSLKPGQTLKAIAVDKATGKKSGVTVLRYEDAINANRGRSGNIAGSSNYN NGSGSNNQSSSSSSNSNSAASTTTAPRQCPICMGKGYTTCTYCHGSGIGQNASFGLGGGM SGGDTGIYDDGGIYQGICPSCGGSGTKTCAGCGGIGTVGY >gi|229784104|gb|GG667631.1| GENE 29 25591 - 25689 108 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCRQATFKDVIITETAGVATVYASDGGIIVAL >gi|229784104|gb|GG667631.1| GENE 30 26661 - 27326 860 221 aa, chain - ## HITS:1 COG:SP0742 KEGG:ns NR:ns ## COG: SP0742 COG1307 # 
Protein_GI_number: 15900637 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 1 217 1 217 281 114 35.0 2e-25 MSYKIIGDSCLDLTADLKKDPHFQIIPLTLQVEHTQVIDDETFDQKKFLELVRSSSECPQ TACPSPEKFKEAFECDVDQIFVITLSEHLSGTYNSAVLAKKLYEEEHGQEGKNIAVFSSD SASSGELNIALYIQSLCQASLPFEEIVEKTRSFIKNMKTYFVLESLDTLRKNGRLTGLQA FFATALNIKPVMGADSGVIIKLDQARGINKALTRMADIAIR >gi|229784104|gb|GG667631.1| GENE 31 27520 - 28077 479 185 aa, chain - ## HITS:1 COG:no KEGG:CLL_A2176 NR:ns ## KEGG: CLL_A2176 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 2 179 3 180 181 128 39.0 2e-28 MKRIILYGSKYGTTKRYAEELARRTGIPCRNCKEVNSLSSCKEVIHLGGIYAGQILGLAH TVKLLRQEPEAKLLVATVGLSDPADETNIHNIRDFIKKQLTEPLWDETKLFHLRGGIDYS KLSVAHRAMMGLLYSRVKKIPPEERNSETQGLIETYGKTVDFTDFESLNEMIGLLTAENG AGERG >gi|229784104|gb|GG667631.1| GENE 32 28121 - 28630 545 169 aa, chain - ## HITS:1 COG:CAC3406 KEGG:ns NR:ns ## COG: CAC3406 COG1846 # Protein_GI_number: 15896647 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 23 169 16 167 188 63 31.0 1e-10 MEEDMKITDLIRRFQGTREKQEKAVFSTLCIAENRLQTVFDKSSPDITLKQFMLLTMVRQ SKDRLTFTQLGKLLGCSRQNIKKLAAVLEQKGFVTILQNTDDKRAAWLVPAAGLDEYFEQ TAAIHRRKLSCLFQHYTDQEMEQLFTLLMKLYDGIDGLEAAEEAATEAT >gi|229784104|gb|GG667631.1| GENE 33 28750 - 29046 406 98 aa, chain - ## HITS:1 COG:no KEGG:Closa_3214 NR:ns ## KEGG: Closa_3214 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 98 1 96 96 82 61.0 6e-15 MTDKIREIKKAVLRLMLALAIGVLAGAAPGAVLVSHAETETVAAAADPASAETASEDDGT TFLLSFFGGIILLILFVVVIVIATATSTAGVVGAQEDE >gi|229784104|gb|GG667631.1| GENE 34 29134 - 31152 2669 672 aa, chain - ## HITS:1 COG:alr2323 KEGG:ns NR:ns ## COG: alr2323 COG0326 # Protein_GI_number: 17229815 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # 
Organism: Nostoc sp. PCC 7120 # 277 669 216 657 658 291 37.0 2e-78 MAKTGSLTINSENIFPIIKKWLYSDHDIFYRELVSNGCDAITKLKKLEIMGEWQKPEDVE FKITVTVDSNEKTITFADNGLGMTMDEVDEYINQIAFSGAQDFLEKYKDKANEDQIIGHF GLGFYSAFMVADKVTIDTLSYQEGAAPVHWESDGGTEYEMEEGDKTSFGTTIKLYLNEES LEFCNEYRAREVLEKYCSFMPVEIYLENSSAEPQYETIEKDELAEKDTIIETIVEEAKTE EKENENGEKEVVEISPAKEKYKILKRPVPENDIHPLWNKHPNECTEEEYKSFYRKVFHDY KEPLFWIHLNMDYPFNLKGILYFPKLNLNYDSLEGVIKLYNNQVFIADNIKEVIPEFLML LKGVIDCPDLPLNVSRSALQNDGFVKKISDYITKKVADKLSGMFKTDRENYEKYWDDINP FIKFGCLKDEKFAEKMNDYLIFKNLEGKYLTLSDCLEENKEKHENTVFYITDEKEQSQYI NMFKQEGMDAVYMTQNIDSPFINFLEQKNENVKFLAINSDLSDSFKEEMTEEAREAMKGD VDALTEVFKKALGNDKLEVRVEKLKNEAVSSMITVSEDSRRMQEMMKMYTMPGMDPSMFG AGGETLVLNYNNKLVQYVLNHREGEHIQEICEQLYDLAALSHGSLTPERMTKFIARSNDI MIFMADMGKEEE >gi|229784104|gb|GG667631.1| GENE 35 31300 - 31797 637 165 aa, chain - ## HITS:1 COG:no KEGG:Closa_3216 NR:ns ## KEGG: Closa_3216 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 165 1 165 165 219 66.0 4e-56 MIALKIEDMKQFTARLFMGETFDHFLIREAEIITFNVFTIDGHIRQGYYSEEELEENQLE ELSSWKMIRPYCFSLIKGKKLPGSFQIVLQMPPQAVEKFVAARQMPLRADQVNGLYLNIR YEEGKLFCVTGTSVSFFTLDKTLDSEWDQAVRSFLKKNEIIFEEE >gi|229784104|gb|GG667631.1| GENE 36 31974 - 33260 1136 428 aa, chain + ## HITS:1 COG:CAC0607 KEGG:ns NR:ns ## COG: CAC0607 COG1362 # Protein_GI_number: 15893896 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 5 427 9 432 433 395 43.0 1e-110 MNDMKQLERLISASVSPYHCIMTAAADLDAAGFTELPLSGSWKLEQGHSYYIKAYDSTLI AFTIGELTDQALKLKLAASHTDWPCLKLKPSPEVTSGRYGKLNTEVYGGPLMHTWFDRPL SMAGKVCVTGDAPGGTKTVFVDFRRPVLTIPNLAIHMNREANQGVAVNPQIDMLPLLTLI KNDLEKEDFFLKALAASAGVEQEDILDYEIYIYNTESGTLCGLDGEFYSAPRLDNLTSVQ ACLSGIMNSTAGNEIHVIALYDNEEIGSSTKQGAASLLMDRVLEKLFLSLGCSRETYLNA LFDGFMLSLDVAHAIHPNHGEKCDIKNQIHMGDGVAIKLSASQSYATDATSTGMIESICR NAKIPYRKFSNRSDMKGGSTLGSISSSYLTMRTVDAGIPMLAMHSAREVMGTADQKALVD LVTAYFNA >gi|229784104|gb|GG667631.1| GENE 37 33413 - 33643 195 76 
aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620843|ref|ZP_06113778.1| ## NR: gi|266620843|ref|ZP_06113778.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 22 76 22 76 76 99 100.0 9e-20 MAEGGRSSRGRSRKRKSKRGKAVFAAVLKIILILFVAAGLLFTSTYVYRYWEGSRTSTAP LETGREIVPETMKSYE >gi|229784104|gb|GG667631.1| GENE 38 33671 - 34645 1131 324 aa, chain - ## HITS:1 COG:no KEGG:Closa_3219 NR:ns ## KEGG: Closa_3219 # Name: not_defined # Def: PpiC-type peptidyl-prolyl cis-trans isomerase # Organism: C.saccharolyticum # Pathway: not_defined # 13 323 13 323 323 392 65.0 1e-108 MNFKYRKKVKQTLAAVLAAMLLTGCTRGLPMVSEVNETKAYTLPQSMIIVATERNRYQQA YTGQIWSVELPDGETFETYLLGQAREFLQEMKLMNLLAKDQEIIITSTEKEEIRKLSEEY YESLTEDDIVYMGVKEDDVRTMYEEYFLSNKVVSELTKDMNLEISDSEAKVITIDQIVLS DGNTAQDVLEQVKEDGADFEAIAREYSESNEIRKQLGRGEAKKAVEDAAFSLTAGEISPV VEDNGTFYIFRCVSDYDEDATQARKDNLYQQRKKDAFGQIYNQFKAENPVTFSNDIWDGI KFSIDDKTTTTNFFDLYRKHFPNQ >gi|229784104|gb|GG667631.1| GENE 39 34697 - 34936 251 79 aa, chain - ## HITS:1 COG:TP0627 KEGG:ns NR:ns ## COG: TP0627 COG0419 # Protein_GI_number: 15639615 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Treponema pallidum # 2 74 970 1042 1047 80 52.0 8e-16 MIQSYAGGIQVEALFVDEGFGSLDAEALEQAVAILTSLSGGSSMVGIISHVEELKEQIGS HILVEKTNQGSRIVTESMV >gi|229784104|gb|GG667631.1| GENE 40 35866 - 36849 1154 327 aa, chain - ## HITS:1 COG:lin1686 KEGG:ns NR:ns ## COG: lin1686 COG0419 # Protein_GI_number: 16800754 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Listeria innocua # 60 326 698 945 1023 82 30.0 1e-15 MKLDGEAKEKKRIYEYQDDCFKKAAAGIIARDLTEGIPCPVCGSTVHPVPASMEGEVPDE KEIRRLKADYEAASEKASAAAGAAAALVGAVKNMEENLGVMDLSDGGEKALLEIGEKLKL LEEETKQEEKEIRDCEKEYQDCSVKLEKQKAAYSQLKASIRVPKQTELVDLSGFEAALLE CERDRKQTAKEKERTGAALALNRQSLSTLKGHLEVKEKLETEYGVVRELERAAGGYNGRN 
LVFEQYVLSVYFEDILRAANQRLLKMSGERYELHRLERARDRRSRESMEMEVLDQYTGKR RSVKSLSGGEAFNAALALALGTSDVIS >gi|229784104|gb|GG667631.1| GENE 41 36874 - 37911 1036 345 aa, chain - ## HITS:1 COG:VCA0521 KEGG:ns NR:ns ## COG: VCA0521 COG0419 # Protein_GI_number: 15601281 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Vibrio cholerae # 1 221 1 223 1013 142 38.0 7e-34 MKPTSLVLSGWGPYRGVEQADFETMQQGLFLVSGPTGSGKTTIFDGITFALYGEVSGSIR EKDSLRSDFADASTPTFVTLNFTHRGKQYRVTRNPKYDRPKLKGEGMTTEPEGGELYEGE TLLASGSSQVTERVTGLLGLDYRQFRQISMIAQGEFQQLLVSSSKERTQIFRDIFQTRIY DTMTALLSARVKALNGRIDEVKHRADEITGTFRIENGDWEELKSKKNRNYRKILDFIEQE DDRLESRERELASCLGELDRSYKEQVRAIEELRSGNQAVRKYENDRKTLEKLEEERDSLK EQKRELAKAYGRIPVRRERLEGKKSELRVLEEQKKGLASWQAACR >gi|229784104|gb|GG667631.1| GENE 42 37908 - 39062 1100 384 aa, chain - ## HITS:1 COG:lin1687 KEGG:ns NR:ns ## COG: lin1687 COG0420 # Protein_GI_number: 16800755 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Listeria innocua # 1 382 1 371 374 266 39.0 4e-71 MRFIHTSDLHIGKTVGGFSMLEEQDAALKQITDYVKERQADAVLLSGDLYDRSVPPAAAV TMFDLFLTGLSELGVTILAIAGNHDCGERIGFANRILEKRGLYLEGKLEDPVRFVDIPDQ WGAVRVHLAPFARPAEVKSLYRCGSSMKTFEESFEEILRHVTYSPGGRDILMAHQFVVNQ GLLPELSDSETRVSVGGTDQVEAALFQPFCYTALGHIHGCQQIGKQPVYYSGSPVKYSFS ECSHQKSVLYGEVGKDGKVSLERLPLTPVHDMRKIRGRLSDLILPEVVSAADPEDYILAV LTDETELLDPIGTLRSVYPNIMQLKWEKRALEQDGSVIHEDIMKEKGPYELFADFYELVT GKAMTKEQEAVMRETAEQAEEERS >gi|229784104|gb|GG667631.1| GENE 43 39130 - 39915 997 261 aa, chain - ## HITS:1 COG:CAC0522 KEGG:ns NR:ns ## COG: CAC0522 COG0561 # Protein_GI_number: 15893812 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 1 258 1 257 265 153 32.0 4e-37 MIKLIVSDIDGTLVPDGSHEMNPELLDVIMKIREKGVQFAAASGRQWASIESIFGPIKEK VFYLSDNGAYVGCHGRNLFLNPIDRETVMDMVRDVRRMEGLEVMLSGPDVVYLETENQDF IDWLVEGYQFRVEQVPDLTKVDSQFIKISVYRETDVEAHTRELREKYSDRMKITIAGDMW 
MDCMKPGINKGQAVKLLQDSLEIKPEETMAFGDQLNDIEMLEQAYYSFAVGNARPEVKAT ARFQTDTNVNDGVLKILKLLL >gi|229784104|gb|GG667631.1| GENE 44 39964 - 40557 609 197 aa, chain - ## HITS:1 COG:BH3033 KEGG:ns NR:ns ## COG: BH3033 COG0424 # Protein_GI_number: 15615595 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus halodurans # 7 189 4 182 190 154 45.0 1e-37 MTEERVIVLASASPRRRELLGQIGIIPEIIPSTLEEKITTDRPEEVVKELSLQKAADVAA RCMEGTIVIGADTVVAAEGRILGKPGSHEEAERMIALLAGKTHQVYTGVTLIVCGREGGI RKTFAEKTDVHVYPMTEEEIHAYAFSEEPMDKAGAYGIQGKFASYIKGIDGDYNNVVGLP VGRVYQEIKGLCSSLNA >gi|229784104|gb|GG667631.1| GENE 45 40654 - 40875 197 73 aa, chain - ## HITS:1 COG:no KEGG:Closa_3222 NR:ns ## KEGG: Closa_3222 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 73 121 187 187 72 55.0 5e-12 MPGDFWDWFFFKFLNGFVSQLVDVVPIGLERLVALGLLSFYVADFTYTMRYQLKAAEMEE EDGVSVVGRLKVY >gi|229784104|gb|GG667631.1| GENE 46 40841 - 41218 423 125 aa, chain - ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: lin2818 COG4905 # Protein_GI_number: 16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 1 123 1 122 270 87 39.0 6e-18 MTIYHTINWFFAFSFIGYLLECTVLSYENRSPVLNRGFGHGPFCVIYGFGALGASLILEP LAGQPVELYFASMVMATSMELVTAHIMIKLFGAFWWDYSQKPFNYKGIICLESSIAWGFL GLVFL >gi|229784104|gb|GG667631.1| GENE 47 41368 - 41826 569 152 aa, chain - ## HITS:1 COG:L26878 KEGG:ns NR:ns ## COG: L26878 COG1963 # Protein_GI_number: 15672981 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 1 148 1 146 147 122 42.0 3e-28 MTFWDELLANQVLMSAVTGWTVAQVLKTIIDLALNKSFNPERLVGSGGMPSSHSATVCGM TTAAAMHYGVGSFEFAVCFILSMVVMYDAMGVRRETGKQAKLLNSILLENPLKLSGEVLQ EKLKEYVGHTPLQVAAGAILGIALAVFMAQYY >gi|229784104|gb|GG667631.1| GENE 48 41876 - 42445 558 189 aa, chain - ## HITS:1 COG:CAC1667 KEGG:ns 
NR:ns ## COG: CAC1667 COG1418 # Protein_GI_number: 15894944 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 33 170 20 155 173 163 51.0 2e-40 MNKKCFTREEKKEVLRWAEDPSWINQERYDEDEEYLACVADILENPVFQSMDRFMQHGDT TCKAHCIKVSYFSYRICRRLGWDYVQTARAGLLHDLFLYDWHTHAKETGEHFHGFTHPRR ALANAEKYFDLTDKEKDMILRHMWPLTPVPPKSREGMAIIYADKFCGLAETAARVKRWIL YSLRAGRTA >gi|229784104|gb|GG667631.1| GENE 49 42475 - 43920 1461 481 aa, chain - ## HITS:1 COG:CAC2239 KEGG:ns NR:ns ## COG: CAC2239 COG0297 # Protein_GI_number: 15895507 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Clostridium acetobutylicum # 3 475 2 473 477 476 51.0 1e-134 MKKILFVASEAVPFIKTGGLADVVGSLPKCFDKEYFDVRVMIPKYLCMKEAWRNDMEYVN HFYMDYMGQSRYVGILQYVYEGITFYFIDNEGYFNGPKPYGDWYCDLEKFCFFSRAALSA LPVIGFQPDVVHCHDWQTALIPVLLKDRFHDGEFFRNMKSVITIHNLKFQGVWDVMTIKR LTGLPDYYFTPDKLEAYKDGNLLKGGIVYADAITTVSETYAEEIKMPFYGEGLDGLMRAR ANSLRGIVNGIDYDDFNPETDPYIVQKYNAKNFRKEKIKNKRALQEELGLPQDDRKFMVG IVSRLTDQKGLDLIQCVLDEMCQDDIQLVVLGTGEERYENMFRHYDWKYNDKVSANIYYS EPMSHKIYAACDAFLMPSLFEPCGLSQLMSLRYGTVPIVRETGGLKDTVEPYNEFESRGT GFSFQNYNAHEMLGVIRYAERIYYDKKREWNKIIDRAMAMDYSWHTSAGKYQELYDWLIG Y >gi|229784104|gb|GG667631.1| GENE 50 44024 - 44827 916 267 aa, chain - ## HITS:1 COG:BS_spo0A_1 KEGG:ns NR:ns ## COG: BS_spo0A_1 COG0784 # Protein_GI_number: 16079478 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Bacillus subtilis # 1 121 1 120 120 112 42.0 9e-25 MSEISIAIADDNLQTLSLLNDILEGEEDFHVVGKADNGEDAYNMIVKTSPDVVLLDVIMP RMDGITVMERVRSNAAVTKMPSFIMVTAAGSENITEDAFRMGANYYIMKPFNKDIVVDKV RRVGHRKSKTTGLSGMKKVKPYVNKNEYMEQNLENDVTQMLHEIGIPAHIKGYQYLRDAI VISVQDQEMLTSVTKILYPSIAKKHQTTPSRVERAIRHAIEVAWSRGKMDTINDLFGYTV STGKGKPTNSEFIALIADKIRLDYKKL >gi|229784104|gb|GG667631.1| GENE 51 44960 - 45124 105 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAPEAEKSPFDPNPGLVTKSDRLHFYVNVEFCGSVSGKNSGRDYRNILPYYGAF 
>gi|229784104|gb|GG667631.1| GENE 52 46030 - 46599 335 189 aa, chain - ## HITS:1 COG:BH0341 KEGG:ns NR:ns ## COG: BH0341 COG1518 # Protein_GI_number: 15612904 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Bacillus halodurans # 2 89 256 343 343 139 68.0 4e-33 MADRFVLTLINKQIISGDAFSVKENGAVIMDDETRKLFLSQWQMKKQETITHPFLNEKIE WGMVPYVQSMLLARYIRGDLDEYPSFLWKQVLMMLVLITYDVNTETAAGRSRLRRVAKQC VNYGQRVQNSVFESNMDAAKCRAVKGILEGIIDKNVDSLRFYYLSDNYKHKVEHIGAKPG FDVTEPLIF >gi|229784104|gb|GG667631.1| GENE 53 46612 - 46749 173 45 aa, chain - ## HITS:1 COG:BH0341 KEGG:ns NR:ns ## COG: BH0341 COG1518 # Protein_GI_number: 15612904 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Bacillus halodurans # 1 45 205 249 343 72 75.0 2e-13 MLSFAYTLLAGLCGAAMETVGLDPYVGFLHTDRPERMSLALDMME >gi|229784104|gb|GG667631.1| GENE 54 46895 - 47194 91 99 aa, chain - ## HITS:1 COG:BH0341 KEGG:ns NR:ns ## COG: BH0341 COG1518 # Protein_GI_number: 15612904 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Bacillus halodurans # 1 99 57 155 343 117 52.0 7e-27 MGACVQRNIDLSFLSASGRFLARVSGEVRGNVTLRKQQYRLSENDGEAIKVARNCILGKV FNSRWVLERAARDYPMRLDSDKLQEKSSYLAESLRKIKS >gi|229784104|gb|GG667631.1| GENE 55 47194 - 47361 182 55 aa, chain - ## HITS:1 COG:BH0341 KEGG:ns NR:ns ## COG: BH0341 COG1518 # Protein_GI_number: 15612904 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Bacillus halodurans # 1 51 1 51 343 78 70.0 4e-15 MKHLLNTLYVTAPERYHSLDGKNVVIRSEDEEIGRIPLHNLEAIVTFGYQGLVLH >gi|229784104|gb|GG667631.1| GENE 56 47358 - 47891 261 177 aa, chain - ## HITS:1 COG:BH0340 KEGG:ns NR:ns ## COG: BH0340 COG1468 # Protein_GI_number: 15612903 # Func_class: L Replication, 
recombination and repair # Function: RecB family exonuclease # Organism: Bacillus halodurans # 52 171 53 172 174 121 50.0 8e-28 MEKQYEQEEYEKSFMLCGFMLLLCGCSFSGEQEVKERKAGLTQTEPESTKTINTEFDVVP VEYKRGEPQKDDVDILQLTAQALCLEEMLCCRIPGGYLYYGETRRRKKIEFDEEIRDKTK DIFLEMHKYYDRRYTPKAKRTRSCNACSLKDICLPMLGRHESAADYIERNLEGKESV >gi|229784104|gb|GG667631.1| GENE 57 48230 - 50344 1057 704 aa, chain - ## HITS:1 COG:BH0336 KEGG:ns NR:ns ## COG: BH0336 COG1203 # Protein_GI_number: 15612899 # Func_class: R General function prediction only # Function: Predicted helicases # Organism: Bacillus halodurans # 21 668 35 758 800 148 24.0 3e-35 MKRALIFAREMKRYCRKDAKQIENILCLAALYHDLGKLAEENQKELHKEGIKSGHLPVNH VDAGAAFLKQKGQEALCSLILVYAHHQGLPDFTVEENRAEMVCYRDTRESVRACFDREME QLLQIHRQLIPESNVHNPEYCEGDMSMFIRMVFSCLVDADHSDTAAAYGQYPEHDNIPKL KPALRLEALNRYVANLGEKNRSKRNDLRTQMYERCRDGQREEGIVSNDGAVGSGKTTAVM AYQLNQAVAKGARRIFVVLPYTNVITQSVEVYRNALVLPGEEREAVVAELHCKADFEDED TRCLTSLWRAPIIVTTAVAFFETLASNRPGALRRLHELPGSIIFMDEAHAALPLKLLPLA WHWMKVLEEEWSCHWILASGSLVRFWQIPELVGIEKKQVPEMVPTNLRSELLGYEKNRIQ FCWNPCPLSRAELIDWVMAKPGPRLVIMNTVQSAAVIADDICRKYGRECVEHLSTALMPE DRAETIKVVKKRLENPGDTNWVLVATSCVEAGVDFSFRTGFRELASVLSLLQAAGRVNRN GSCKDAQMWSFSMQDDTMLTQNPGVKISAGILEEYLRDGLEITPELSTKSIRDELQRGKN ETKEMQALIEAEAIQNFKTVNDRFHVIENNVIPVIVKADVAERIKLGYGNWKEVQKYSVS IRRKNLEKWQVKQIAEDACQWTLFYDSFLGYMAGVLRLSESSDR >gi|229784104|gb|GG667631.1| GENE 58 50419 - 51309 612 296 aa, chain - ## HITS:1 COG:BH0339 KEGG:ns NR:ns ## COG: BH0339 COG3649 # Protein_GI_number: 15612902 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Bacillus halodurans # 12 244 12 239 283 93 29.0 6e-19 MADEMKRATGLLVIEVVNSNPNGDPDREADPRQRPNGCGEISPVSFKRKVRDLLDDPTSP FFESLPEKIRLNQDHYRILEHRGRDRTAITKEMTSGGKLADFDQEEFLKSEFVRKYWDAR VFGNTFLEKDGNKGYIKTGVVQFGVGISISPVNIIRQTNTNKAGVQEGKNAGMAPLAFRV VEHGVYCMPFFVNPNYAKKSGCTADDIELLKLLIPQAYDMNRSAIRPDVRLRHAWYIEHK NLLGSCPDYMLLDALTPERIGNVSEPSENWHDYKDKTELPEELLHKVAGIVDLVCL 
>gi|229784104|gb|GG667631.1| GENE 59 51337 - 53475 1268 712 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2184 NR:ns ## KEGG: Dhaf_2184 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 687 1 686 709 468 40.0 1e-130 MINELYQLTVALDQAGIATEHTHPKYKMIPKVTKKAPCIHIIFDNRQIYKIKSIEMEQAA EIKKYGSNQGTFPALNLAPLYRLTDEAEKRMISDLTEGKTTNFNIEEIRRICRENNWGKK FSNKYRISMQDVPREMSELFQKTGTIFQPLQRLTGETDAFAEAENLHRALEHAAFAMLEE KKGIALALQILFYAGKPGKSVEEDYGTLSMVMDCRALIQEGLSVATAAFTSMLNRKLLEA DAATRAGAVKKESDAFGYVFAPIEEPMPTVKLAAGFEVSLRTMFRGQPCQSRYGRIENAT YPISKEMRFKLQSALAWISAAEHKGITWISTDKDEALFAYPETLRKNMYSMVGFCRRQAT EEGRPECETRDKIRKKAEFESAAKAFISELQKTKEPDTDPMSERIQYFVLRKLDKARTKV VYTYNTTPAEIERSSERWSLGCQNLPAFLFGQPVTLFPLEAADILNLIWRQDGKASSDKF KPIPRYYGMQMLFGAELPEVRRNLYVLMNNVINLAPYLGKAGFARNGYDGSQAVLLRQTQ NTLALTGLLLYQLEIRKEKYMQEFPYLFGQLLKLSDALHEIYCKAVRDNDIPNTLAGSGL YVSGAEQPYRTLAVLGHRMNPYITWAKKYRTKDIQKKGEESWRAGWYLSLYERIATQLHA AWGDQTRFNEEEKAQYFIGYLAAFPAKEEIKGQPCSEQICVVENEISDNSQK >gi|229784104|gb|GG667631.1| GENE 60 53479 - 54141 352 220 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2183 NR:ns ## KEGG: Dhaf_2183 # Name: not_defined # Def: CRISPR-associated protein Cas5 # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 220 1 220 224 296 64.0 4e-79 MIHYRYPITMEIAGATALWSRTDCGDAPVSYAAPTPSAVRGIFESILWGPDIRIVPTAVE ICAPLQYHSYVTNYGGPLRAGKSIQKGNNYQLYATVLMDVCYRLYADVIPNSYKEKLPEK ALEWDRKTTSPGHAYQDIFNRRLKRGQSFSIPVLGWREFTPLYFGPFRPQTEVCRELSDI TIPSMLREVFPDGYNSDFRAVYSQSLRIHEGRLNFPREEE >gi|229784104|gb|GG667631.1| GENE 61 54608 - 55882 1541 424 aa, chain - ## HITS:1 COG:CAC2072 KEGG:ns NR:ns ## COG: CAC2072 COG0750 # Protein_GI_number: 15895342 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Clostridium acetobutylicum # 101 423 65 390 395 204 39.0 2e-52 MMGKKFHRYVVTAFWVSLVFCIGFSYFYMKWVIPDKLNVVAEEEEQFHFSMPFGVTLSSD SEEVVLGNGSNIPSDQIHLSVNQPFSVYSENQGSYKIGLKLFGWLQLKDIQVNVVDTKYA 
VPCGTPVGIYLKSDGIMVIGTGDITKSDGMIVEPAFGVLQSGDYIEAFNGTPLKDKEALV TELNKNGTSEAVLAVRRGDQEIEVKMTPVEGDDGKPKLGVWVRDDTQGIGTMTYIDMNGN FGALGHGISDSDTGGVVQIENGALYETSILGIEKGTFGKPGVMSGVIYYGPGSVLGTIES NTDEGIFGTVNERFKQTVKADAIPIGYRQDVKKGTAYIRSDVSGVVKDYEIEIQKVDYAT THKNKGMVIKVTDPELLAITGGIVQGMSGSPIIQDGKLIGAVTHVFIQDSTRGYGIFIEN MLEH >gi|229784104|gb|GG667631.1| GENE 62 56136 - 57065 579 309 aa, chain - ## HITS:1 COG:Cgl2503 KEGG:ns NR:ns ## COG: Cgl2503 COG0673 # Protein_GI_number: 19553753 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Corynebacterium glutamicum # 8 196 30 217 224 141 39.0 2e-33 MAAVCDVAGDRAKSLAGDHGANWYTSYEEMLDQEAIDVVHICTPHYLHVPMTIDCLKRGK NVFLEKPPVISQSQLDELELAPHIERAGICFQNRYNPGLIRVKAMLQSGETGKILGARGI VTWFRDEEYYAQSGWRGRLETEGGGVLINQSIHTLDLLVYLLGRPDAVCAGMANHHLKEI IEVEDTMEAYIEFENGKAVFYVTTAYCENLPPLIEVSCENMVIRLEEPQITVFHRDGRTE RITASADEALGKKYWGSGHEACIRDYYSAVAEGRKLPITLAEVKDSVRLMTAMYEAAGNG RRIELNSQS >gi|229784104|gb|GG667631.1| GENE 63 57173 - 58060 619 295 aa, chain - ## HITS:1 COG:yjhH KEGG:ns NR:ns ## COG: yjhH COG0329 # Protein_GI_number: 16132119 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Escherichia coli K12 # 6 291 26 311 319 194 34.0 2e-49 MQADYITPAVTIFDDQGHVDIEANKKLYEHLISNGIKGILILGSIGEFYGIPLEEKKQLI REAISCIGSRANVYVGTGGTVFEECVWLSNYAADCGAYGMAVISPYYFYLSDEEAYDFYA ELAKQVKGNILLYNFPARTGYQLSADTVLKLAMKYPNIIGIKDTVGEMGHTRELIRKVKG RRPDFLVYSGFDEFFFHNILSGGDGCIAGLSNIFPETAAAMVRAAEEDRMEDVSRYQKKL DALMAIYGIQDQFVPVIKTAVQMRGVPINPACLFPMKAVTGADREKIRQIMEYEE >gi|229784104|gb|GG667631.1| GENE 64 58148 - 58303 90 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620864|ref|ZP_06113799.1| ## NR: gi|266620864|ref|ZP_06113799.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 2 51 1 50 50 87 100.0 4e-16 
VEIRNSVFFKPENGKAGCEEPGFSKRPIREKHGLFKRYTEGADSAMEGAGY >gi|229784104|gb|GG667631.1| GENE 65 59227 - 60837 1317 536 aa, chain - ## HITS:1 COG:STM3531 KEGG:ns NR:ns ## COG: STM3531 COG0129 # Protein_GI_number: 16766817 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Salmonella typhimurium LT2 # 1 522 1 520 571 645 59.0 0 MEQKSSKERKLWAQFDALQLGSGWTEEDLVKPQILIEDAYGESHPGSVHLNQLTDQVRWG VLEKGGFGAKYHVTDICDGCAQGHDGMNLVLASREAICDMMEVHAGFIPWDGMVLSSSCD KSIPAHLKAAARVGIPTVFLPGGSMRPGPDMTTSLAAGKLSLRMKSDERVEAWEVRDYKL TGCPSAGACSFMGTASTMQCMAEALGMDLPGSALVPATMTDILKYGRRAGQAVMELARKG ITPVQIMTPAAFKNALVVHSAIGGSTNATLHLPSIAKELGLTLPMSWFDEINHRIPHIAN IDPSGSQLTESFWFAGGVPMVQFLLREYLDLNVMTVTGRTLGENLEQLEKDHFYERYIGY LHNYGLSREDVICDPEQTAETGSVAVLKGNLAPEGAVIKYSACRPEMRKHKGTARVFDCE EDAHDAVIKKKIRPGDILVIRYEGPRGSGMPELLMTTEAIVCDPSLNGTVSLITDGRFSG ATRGAAIGHVSPEAARGGPIALVEEGDIIAYDVEKRSVNIVGTADGECTPADMEKI >gi|229784104|gb|GG667631.1| GENE 66 60922 - 62001 1186 359 aa, chain - ## HITS:1 COG:AGl1780 KEGG:ns NR:ns ## COG: AGl1780 COG0673 # Protein_GI_number: 15891004 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 26 356 7 334 338 285 45.0 1e-76 MNALDGMNYAPKGKPEPVVKEGEFYFAAMYLDHGHINGMCNGLTEAGASLKYVYDTDRAR AEAFAAKYPGAVVADREETVLNDPSVKLVAAAAITSKRCEVGLRCMDAGKDYFTDKAPFT TMEQLELAEAKVKETGKKYMVYYSERLHVESAVYAGNLIKQGAIGRVLNVMGLGPHRIGS PENRPAWFFEKESYGGILCDIGSHQIEQYLYFADVKSAKVEYSRVANYNHPDYPGLEDFG DAMLTGDNGATNYFRVDWFTPDGLSSWGDGRTVILGTDGYIELRKYVNIGVEDNSGDNLI LVDKTGEHFIKCRDTVGYPFFGELILDCINRTEHAMTQEHAFLAARLCLECEAGAERIG >gi|229784104|gb|GG667631.1| GENE 67 62030 - 62572 523 180 aa, chain - ## HITS:1 COG:no KEGG:Tthe_0980 NR:ns ## KEGG: Tthe_0980 # Name: not_defined # Def: oxidoreductase domain protein # Organism: T.thermosaccharolyticum # Pathway: not_defined # 2 180 206 385 386 209 56.0 4e-53 
GNAAHDNAEVEDISIAIMKYPEGRLAQVTSSVIHYGEQQAVKFQGEYAAVSVPWEVHASV AKGNGFPDPNPELEQKIQEYYDSLPELQYEKHPGEIQNFLEAVERGERPFITGEDGRLTI ELITAIYEAGFTEKTVELPLKKDDPFYTAEGILKNVRHFHEKNASVENFTDEAISLGREV Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:19:40 2011 Seq name: gi|229784103|gb|GG667632.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld25, whole genome shotgun sequence Length of sequence - 42034 bp Number of predicted genes - 38, with homology - 38 Number of transcription units - 16, operones - 8 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 91 - 164 77.8 # Met CAT 0 0 + TRNA 204 - 276 75.9 # Phe GAA 0 0 + TRNA 281 - 353 73.8 # Lys TTT 0 0 1 1 Tu 1 . + CDS 467 - 1606 839 ## COG2199 FOG: GGDEF domain + Term 1609 - 1661 16.0 2 2 Tu 1 . - CDS 1628 - 2680 806 ## COG1609 Transcriptional regulators - Prom 2778 - 2837 7.9 + Prom 2817 - 2876 5.2 3 3 Op 1 35/0.000 + CDS 2958 - 4334 1454 ## COG1653 ABC-type sugar transport system, periplasmic component 4 3 Op 2 38/0.000 + CDS 4351 - 5223 1041 ## COG1175 ABC-type sugar transport systems, permease components 5 3 Op 3 . + CDS 5223 - 6086 845 ## COG0395 ABC-type sugar transport system, permease component 6 3 Op 4 . + CDS 6157 - 7476 1227 ## COG1621 Beta-fructosidases (levanase/invertase) 7 3 Op 5 . + CDS 7478 - 9364 1618 ## COG0366 Glycosidases + Term 9378 - 9430 5.0 8 4 Tu 1 . - CDS 9419 - 9649 373 ## Closa_0017 hypothetical protein - Prom 9751 - 9810 7.7 + Prom 9777 - 9836 8.8 9 5 Op 1 . + CDS 9873 - 11414 1386 ## EUBREC_2822 hypothetical protein 10 5 Op 2 1/0.600 + CDS 11466 - 13022 1181 ## COG1982 Arginine/lysine/ornithine decarboxylases 11 5 Op 3 . + CDS 13026 - 13619 627 ## COG0194 Guanylate kinase 12 5 Op 4 . 
+ CDS 13650 - 14495 849 ## COG0726 Predicted xylanase/chitin deacetylase + Prom 14567 - 14626 6.1 13 6 Op 1 1/0.600 + CDS 14708 - 15697 1208 ## COG2812 DNA polymerase III, gamma/tau subunits 14 6 Op 2 1/0.600 + CDS 15701 - 16609 1106 ## COG1774 Uncharacterized homolog of PSP1 15 6 Op 3 1/0.600 + CDS 16599 - 17336 756 ## COG4123 Predicted O-methyltransferase 16 6 Op 4 . + CDS 17393 - 18244 832 ## COG0313 Predicted methyltransferases + Term 18272 - 18320 7.8 - Term 18267 - 18300 4.1 17 7 Tu 1 . - CDS 18339 - 18761 500 ## CDR20291_1132 hypothetical protein - Prom 18786 - 18845 4.0 18 8 Tu 1 . - CDS 18905 - 19870 925 ## EUBELI_01420 hypothetical protein - Prom 19919 - 19978 5.2 19 9 Op 1 6/0.000 + CDS 20647 - 21579 788 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III + Prom 21590 - 21649 4.8 20 9 Op 2 4/0.400 + CDS 21676 - 21909 488 ## COG0236 Acyl carrier protein 21 9 Op 3 3/0.400 + CDS 21909 - 22832 1205 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 22 10 Op 1 26/0.000 + CDS 22938 - 23855 1134 ## COG0331 (acyl-carrier-protein) S-malonyltransferase 23 10 Op 2 11/0.000 + CDS 23849 - 24589 235 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Prom 24616 - 24675 2.7 24 10 Op 3 4/0.400 + CDS 24771 - 26009 1443 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 25 10 Op 4 4/0.400 + CDS 26020 - 26532 546 ## COG0511 Biotin carboxyl carrier protein 26 10 Op 5 4/0.400 + CDS 26588 - 27010 595 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases 27 10 Op 6 . + CDS 27022 - 28362 1165 ## COG0439 Biotin carboxylase + Term 28390 - 28445 13.6 + Prom 28463 - 28522 2.8 28 11 Op 1 . + CDS 28579 - 30321 1583 ## COG0825 Acetyl-CoA carboxylase alpha subunit 29 11 Op 2 . + CDS 30321 - 31454 1347 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase + Term 31490 - 31537 7.0 + Prom 31538 - 31597 8.1 30 12 Tu 1 . 
+ CDS 31660 - 32319 353 ## COG2715 Uncharacterized membrane protein, required for spore maturation in B.subtilis. 31 13 Tu 1 . - CDS 32329 - 33357 932 ## COG1609 Transcriptional regulators - Prom 33392 - 33451 12.2 + Prom 33486 - 33545 6.0 32 14 Op 1 35/0.000 + CDS 33640 - 35028 1506 ## COG1653 ABC-type sugar transport system, periplasmic component 33 14 Op 2 38/0.000 + CDS 35099 - 35977 976 ## COG1175 ABC-type sugar transport systems, permease components 34 14 Op 3 . + CDS 35990 - 36811 865 ## COG0395 ABC-type sugar transport system, permease component 35 14 Op 4 . + CDS 36833 - 39220 1929 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Term 39255 - 39307 3.7 - Term 39239 - 39297 4.3 36 15 Op 1 . - CDS 39300 - 40877 480 ## COG3507 Beta-xylosidase 37 15 Op 2 . - CDS 40874 - 41557 461 ## COG1802 Transcriptional regulators - Prom 41736 - 41795 7.9 + Prom 41679 - 41738 6.9 38 16 Tu 1 . + CDS 41833 - 42034 179 ## gi|288870375|ref|ZP_06113840.2| polysaccharide deacetylase family protein Predicted protein(s) >gi|229784103|gb|GG667632.1| GENE 1 467 - 1606 839 379 aa, chain + ## HITS:1 COG:aq_035_2 KEGG:ns NR:ns ## COG: aq_035_2 COG2199 # Protein_GI_number: 15605636 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 183 378 52 245 251 124 37.0 3e-28 MEDKYQAEIKKKLAESVNRAWKIISVFVMAFQFLMLILMAVKPGGPFKTQRRTVYCFLYL VLLLVTGIAMILNQRFWKRENKNYHRYLNAEIVYAAFICFWGCCVTLNDQLGGNDLSVFT YTMLSVAALGFLEPVKAVVIFMSAFVFLNLCLPGIQTIENNMFSNIINSFSIAGISTAIS YTLYRNKVIVTKQEMTIRKQYEEITRVNEILSRQAMVDELTEMNNRRYMEWFIGEKIKIL EEKTGTISCMMLDIDYFKQFNDRYGHISGDICLKKIAEMIRECCQDQNVAAIRYGGEEFF ICLFQWNQDEAVEMAEKIRKRIAECRFKINEEEGAVTISIGVYTWHGEGKVQMDDLIRLS DKALYQAKKNGRNRVEVLS >gi|229784103|gb|GG667632.1| GENE 2 1628 - 2680 806 350 aa, chain - ## HITS:1 COG:BH3727 KEGG:ns NR:ns ## COG: BH3727 COG1609 # Protein_GI_number: 15616289 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 8 349 3 330 331 
121 26.0 2e-27 MAQKKNITFDDIAQYTHFSKTTISRYFNNPDSLTLENQQIISDALVKLNYKENKVARILA NGKTEFVGILIPNLYLHYYSEMLNQILSTYETYGYKFLVFIGHENAETERQYIQELLAYK IEGLIVLSHTLSSRELADLQIPIVTIEREDQYVSSINTDNYMGAIQATSLLARHHCDILI HINTPTAENIPAYGRIQGFMDFCRENNMNHKIFCHRTENTYDSSLQYLKDILEELETSYT GMRKGIFVFNDTHANILLNLIVRKYGRLPDDYYIIGFDDSPVAREAVIPISTIGQQIDLI ANEAVSLLVSQMNERKKRKPAPLKEPVHKTITPILIRRETTEKEDEPKNR >gi|229784103|gb|GG667632.1| GENE 3 2958 - 4334 1454 458 aa, chain + ## HITS:1 COG:Cgl2408 KEGG:ns NR:ns ## COG: Cgl2408 COG1653 # Protein_GI_number: 19553658 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Corynebacterium glutamicum # 64 458 34 438 443 89 23.0 1e-17 MMKSIGTKAIKVMVLMAAAVLMLTGCRFGFASDPDDPNEVVQEEPIDTSVDTFEYDASLS GTKITLLNSKAEIQVALTKMAEEYEKKSGVHVEVMPVTDGDSPYTRVVSMYNSGTPPTMA ILDTTDVIALAEEKAVDLTGEPWTAEAEDYLTEVNGKIYSFPLCIEGRGLIYNRSVIEET LGREFDPLSVTTLDEFTALLDQLTEAGMKRPVSVAKEDWSLGAHQLQYIYETYNGTSEGA QEVIEKIKEGTLDLNSYDRMNQFLNMFDVLREYNVAKKDPLGADYDEMAIDLVDGKTAFW FNGNWAWPNLSEAGAENSDEYGFLPYFLNHDPGDFANQQIQASPSKQIMIDGRIVSDQER AAAREFLNWIVFSEIGQQMIVKTCNLIPPFQNNPYEPADPLSRDIYNRVHEERAFNASAI VPNDHWAVLGASMQKYLAGRSSREELIQSIENYWEEQR >gi|229784103|gb|GG667632.1| GENE 4 4351 - 5223 1041 290 aa, chain + ## HITS:1 COG:Cgl2407 KEGG:ns NR:ns ## COG: Cgl2407 COG1175 # Protein_GI_number: 19553657 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Corynebacterium glutamicum # 8 290 7 281 281 187 41.0 2e-47 MKSKSKGKDFCIFALPGLFCFFAVVIVPFIYGVYLTMTDWNGVSDVKNFIGLQNFAGVMK DGQFWSSLLLTFKYVIVVVVFVNVLAFMIAYLLTRGIKGQNFFRAGFFTPNLIGGIVLGY IWQFVFSRVFVGIGDATGWGLFEASWLSEPTKAFIALVIVTVWQLSGYMILIYVAGFMGL SDDVMEAATIDGASGWTKMKSIILPLMMSSITICLFLTLSRAFMVYDVNLSLTAGAPYGT TEMAAMHVYEKAFTSRRFGVGQAEALILFVIVACISGIQVYLTKKKEVEA >gi|229784103|gb|GG667632.1| GENE 5 5223 - 6086 845 287 aa, chain + ## HITS:1 COG:Cgl2406 KEGG:ns NR:ns ## COG: Cgl2406 COG0395 # Protein_GI_number: 19553656 # 
Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Corynebacterium glutamicum # 15 287 32 303 304 171 39.0 1e-42 MAGEKAGSGKKRKKSDLVATAGLIVIFILYMFPFYMVVINSLKQKRDIIKSPFSWLYTIK GLSFDNFVKAFVQMDFLNAFKNSLIVTVFATALVTLLAAMLAYYIVRNNNAVSRIAFVLM VASMIIPFQAIMIPLVSIYGGKLNVLNHRSTLVFMHTGFSMAMSVFMFHGFIKGNVPVAL EEAANIDGCTHAQTFFRIVFPLLKPIISTMVILNSLAFWNDFLLPSLVLSDKKLLTLPLS TYSFYGTYSADYGTIMAGLLLCVVPVLILYIALQKQIISGVVAGAVK >gi|229784103|gb|GG667632.1| GENE 6 6157 - 7476 1227 439 aa, chain + ## HITS:1 COG:SP1795 KEGG:ns NR:ns ## COG: SP1795 COG1621 # Protein_GI_number: 15901624 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Streptococcus pneumoniae TIGR4 # 7 418 31 428 439 234 34.0 3e-61 MTENTIHIKAPGCWMNDPNGFIYYRGKYHMFYQCFPFDTRWGRMHWGHAVSEDLVHWEHK GIALFPSKHGDRDGCFSGSAVEHEGKLYLYYTGVRYLEEDPEDTNLCLNEKFISAQMMIT SEDGVHFDNIRDKREIIAPLEDRERGDAAHTRDPKVWRGSDAWYMVLGTHGSDSNGRLLF YRSADLENWSYVNAVSKSGLGWMWECPDYYETKGGKVLSFSPMGFLKDGYREENQNICMV VDFEEESCTMRMPEQYQFLDYGLDLYAAQSTVDEDGRRVMAAWVRMPEAADGEWIGMMCS PRVAEVEDGHIYFRMHPAVRGAYTRKIDRPDEAAEAGYLIAADLEEGDRIEIGGYLIYRK GNRICTDRSLTAGNDRDGRMQFETPEIKNGGHVEILVDANLIEVFVNDGEYVISNVVYGV GNEILTANKIKLYTIRETG >gi|229784103|gb|GG667632.1| GENE 7 7478 - 9364 1618 628 aa, chain + ## HITS:1 COG:DR0933 KEGG:ns NR:ns ## COG: DR0933 COG0366 # Protein_GI_number: 15805957 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Deinococcus radiodurans # 7 571 23 596 644 409 38.0 1e-113 MDYCKEYEARFNQYYDELKWLYCELYKNQDEALEQLCGQMYRFYTERKAALKRMDRERAK NPQWYKGNRMVGMMMYTDAFAGDLKGVLKHLNYIEECGVNYLHLMPLLDTPEGRSDGGYA VSDFRKVRPDLGTMEDLECLAAECHKRDVSICLDFVMNHTSEDHEWAKKARMGDPEYQSR YFFYDNYDLPAQFERTVPEVFPTTAPGNFTWVADAGKYVMTTFYPYQWDLNYWNPVVLNE MVGNMLNLVNRGIDVIRIDAVPYIWKQLGTNCRNLPQVHSIVRIMRIICEIVCPGVLLLG EVVMEPDKVVPYFGSVEKPECHMLYNVTTMAATWNSVATKDIRLLRQQMNVVSSLPKDYV 
FLNYLRCHDDIGWGLDYPWLKQFGIDEVSHKKYLNDFLTGQYPGSFGRGELYNSDPESGD ARLCGTTASLCGIEKAAYERDEEALKKAVRLDLMLHAYMLSQSGIPVIYSGDEIGQENDY TYHENPKKWDDSRYLHRGSFRWDLEKKRSVKGSLQETIFEGIKKLESIRAQFPVFCTDAE VWTIDTWDDAVLGLVRRSGEEKLIALFNFSEFDKVAWINEEDGMYKDLISGRRLEAKGVQ IPAYGVYWLMREKWDEKDTAVEHGEKSV >gi|229784103|gb|GG667632.1| GENE 8 9419 - 9649 373 76 aa, chain - ## HITS:1 COG:no KEGG:Closa_0017 NR:ns ## KEGG: Closa_0017 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 76 1 76 76 136 92.0 2e-31 MRTIEFKMERPGLVKAGDEVSIIEREGTSSYSYIIDPAVAMSGCYTGRDRIKSDHGIVKE VKHNDRGYYVIVEFDE >gi|229784103|gb|GG667632.1| GENE 9 9873 - 11414 1386 513 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2822 NR:ns ## KEGG: EUBREC_2822 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 44 451 65 490 692 215 31.0 3e-54 MSEMNSMNSVEPMSSMEPVPAWKKGKQEVMTQQFLLFMGAALGFGILYTFCMYQNPNGIT YPVMIVGVYGVLFWMLKRLSIEWKRGSWFLASAAVLLGIAACRTADSLLIMMIKTAEFLL AVIFVMHQFYDDREWSIGKYLSSMAVYLCCVIAAAVYPFFQGKEFFRKANLKKYKTVSKV ALGLAAALPILLAAGMLLGQADAVFGAIVVHWLEKILNVWTILIMIVMTAFGTFGVYCLI CGACMDQISGEEKMVRTHEPVTAVTCMSAIALLYLFFCGIQVIYLFLGKGSLPEGVTYSS YARQGFFQLLFVAVMNLVMVLMNLKFFRKSRLLNGVLTVICGCTYVMIASAAYRMILYVR EYHLTYLRVLVLWFLLLLACLMAGVVILIYKNRFPLFSWCLVLISVFFVGFSFMRPERVI AEYNVNHVKEWSDSDFQYMTQVLSADAVPVVLKALEDDAVLERFYSGGKKKFLAEYYDWA VSRNYQVRSSDFRTWNLSFEKAEQSFRNEGMGI >gi|229784103|gb|GG667632.1| GENE 10 11466 - 13022 1181 518 aa, chain + ## HITS:1 COG:CAC2338 KEGG:ns NR:ns ## COG: CAC2338 COG1982 # Protein_GI_number: 15895605 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Clostridium acetobutylicum # 5 500 11 483 487 286 34.0 9e-77 MKKDLLSRLIEYGKSDVYPFHMPGHKRRTDSGFVENFPNPFSVDITEIDGFDNLHHPDGI LKESMEWAGKVYGADRTYYLVNGSSCGILSAICGSVSHGGTILMSRNCHKSAYHGVFFSH LKAEYIYPQTIPEFGIQGGLAPEDVRRMLINHPETEAVLVVSPTYDGIVSDIEAIAGIVH ERGIPLIVDEAHGAHFSFGKEAGFPVSALGLGADVVIQSLHKTLPSLTQTAVMHVKAGFA 
DLDGIDRYVHMFQSSSPSYVLMASIENCIRWMDEAGRHEMMAFDGRLDRIRQQLMEMECL TIVDGVKGRCHVFDVDRSKIVVSVGKSGLTGPELQERLRREYRLEMEMCGPDYVTAITTV ADTDEGLERLCRAFMEIDLDRKRSAEGCENSAGSCVGRNGDNAGERSSGIVDNEMEKERV CGWPEIAMTIAQAMDAPGQPVPLDDCEGLVSKEFIYIYPPGIPIVAPGEVLKPELVDEIR KYKRKGLPVQGTADPEVEAVLTVADRENMVNETGKIEG >gi|229784103|gb|GG667632.1| GENE 11 13026 - 13619 627 197 aa, chain + ## HITS:1 COG:CAC0298 KEGG:ns NR:ns ## COG: CAC0298 COG0194 # Protein_GI_number: 15893590 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Clostridium acetobutylicum # 1 191 1 191 195 202 55.0 4e-52 MGKIFYVMGKSASGKDTIYKELRERIPELKTVVLYTTRPIRDGETEGVEYHFTTAERLGD FRREKRLIEERTYQTVYGPWSYFTVDDGQICLDRKDSYLMIGTLESYEKMRGYFGENGLV PVYIEVDDGVRLERALQREKGQKEPKYKELCRRFLADEEDFKEENLTRCGIRRRYRNDDL DVCIEEICRDIRAVTGT >gi|229784103|gb|GG667632.1| GENE 12 13650 - 14495 849 281 aa, chain + ## HITS:1 COG:CAC0358 KEGG:ns NR:ns ## COG: CAC0358 COG0726 # Protein_GI_number: 15893649 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Clostridium acetobutylicum # 1 279 1 266 266 210 41.0 3e-54 MGSMFMRYPGGKKKALTLSYDDGVEQDIYLIELMKKNGLKGTFNLNSGLYAEEGTVYPPG TVHRRMTEKMATEVYGSSGMEVAVHGLTHPFLEQLPVNLCTAEVIKDRENLEKQFQTIVR GMAYPFGTTSDMVVNTLMQSGIAYARTTVSTGDFRLPSDWLRLTATCHHKDGRLKELAKK FVEDTPKYGSWLFYLWGHSYEFEADENWNVIEEFAAYTGNRDEIWYATNLELYDYVTAYD RLLFSVDGLTVYNPTSTDVYFDKDDNSYCLKAGERKLLSGT >gi|229784103|gb|GG667632.1| GENE 13 14708 - 15697 1208 329 aa, chain + ## HITS:1 COG:BS_dnaX KEGG:ns NR:ns ## COG: BS_dnaX COG2812 # Protein_GI_number: 16077087 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Bacillus subtilis # 4 305 15 336 563 137 29.0 2e-32 MPGFNDIIGHEMIKEHFKKAIETNHVSHAYILTGEAGMGRKSLANAFAMTLLCEKGKKEP CMQCHACKQVMSGSHPDLIFVSHEKPNSIGVDDIREQINDTIMIRPYSSYYKIYIVDEAE KMTAQAQNALLKTIEEPPSYAIIMLLTTNQEAFLPTILSRCVQLKLKPLQDSVVKTYLME SQGISEVKAEIYAAFARGNLGKAIHLASSEDFQLMYTELLHMLKHLKDMDITELLDYIKK 
LKDENLDIYECLDFMQLWYRDVLMYKVTQDINLLVFKEEYNTIKEMSAACAYDGVEKILQ SIDKAGIRLNANVNMELAMELMLLVMKEN >gi|229784103|gb|GG667632.1| GENE 14 15701 - 16609 1106 302 aa, chain + ## HITS:1 COG:CAC0301 KEGG:ns NR:ns ## COG: CAC0301 COG1774 # Protein_GI_number: 15893593 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Clostridium acetobutylicum # 1 296 1 297 303 328 58.0 6e-90 MIKVIGVRFRNAGKIYYFDPAGREIHTGDHVIVETARGIEYGYVVLGSREVPDDKVVQPL KSVIRMASKEDEEVELKNHEKEKEAFRICKEKIRKHGLQMKLIDAEYTFDNNKVLFYFTA DGRIDFRELVKDLASVFKTRIELRQVGVRDETKIVGGIGICGRPLCCHSYLSEFIPVSIK MAKEQNLSLNPTKISGVCGRLMCCLKNEEETYEELNSKLPNVGDYVTTDDGLKGEVHSVS VLRQLVKVIVITKDEKEIREYRVDQLKFKPRRRKDKGSVADAELKALEALEKKEGKSKLD DN >gi|229784103|gb|GG667632.1| GENE 15 16599 - 17336 756 245 aa, chain + ## HITS:1 COG:CAC0306 KEGG:ns NR:ns ## COG: CAC0306 COG4123 # Protein_GI_number: 15893598 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Clostridium acetobutylicum # 5 244 4 243 244 224 47.0 2e-58 MIIELKSDERLDDLQRNGYQIIQKKNGFCFGMDAVLLSGFARVKQGEKAIDLGTGTGIIP ILLEAKYEGEHYTGLEIQDEMADMAARSVALNHLEEKVSIVKGDIKEASRLFGAASFDVV TSNPPYMNDAHGLKNPDLPKAISRHEVLCTLDDVTREAARLLRPGGRFYMVHRPHRLIEI ITALTKYKLEPKRMKMVHPFVEKDANMVLIEAVRGGKSMIKVEAPIVVYQEPGVYTQEIY DIYGY >gi|229784103|gb|GG667632.1| GENE 16 17393 - 18244 832 283 aa, chain + ## HITS:1 COG:CAC0307 KEGG:ns NR:ns ## COG: CAC0307 COG0313 # Protein_GI_number: 15893599 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Clostridium acetobutylicum # 7 279 4 276 282 265 49.0 9e-71 MENQKTGKLYLCATPIGNLDDITLRVLNTLKEVDLIAAEDTRHSIKLLNHFHIKTPMTSY HEYNKVDKAKYLVEQMKNGVSIALITDAGTPGISDPGEELVRQCYEAGVPLTSLPGPAAC ITALTISGLATRRFCFEAFLPSDKKEKQWILEELKRETRTIILYEAPHRLVRTLEELREA LGNRRITICRELTKRYETAFQTTFEEALAVYETEEPRGECVVVIEGKSISEIREERMKTF EEMSLEEHMELYEKQGIERKEAMRMVAKDRGISKRDVYQQLMK >gi|229784103|gb|GG667632.1| GENE 17 18339 - 18761 500 140 aa, chain - ## HITS:1 COG:no 
KEGG:CDR20291_1132 NR:ns ## KEGG: CDR20291_1132 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 140 1 140 140 179 63.0 3e-44 MDLQNEKCVMVIDETLPAGLIANTAGIMGITLGKEMPDTVGPDVTDKTGCSHRGIIQFPV PVLKADGAKIRELREQLYRPEYADLTVVDFSDTAQSCKTYDEFIDKAAATEEKSFTYLGI AICGPKKTVNKLTGYLPLLR >gi|229784103|gb|GG667632.1| GENE 18 18905 - 19870 925 321 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01420 NR:ns ## KEGG: EUBELI_01420 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 307 10 311 321 134 29.0 7e-30 MGKKDIALARYFEDEDRYADLINGYVFSGEQVVSGSDILEKNILVTGVLGTVRKWFAVQK YRDAVKRVVMGMNFAVIALEHQDLVHYGMPVRVMLEDAAGYDEQMRRIQKKNRKARGLSR GEFLGGFRKEDRLNAIFTIVLYYGAEPWDGAGDLYSLINFEEIPEGLKNLFNNYRLHVLE VKRFKDTDLFQTDLREVFGFIGFSGDKEAERSYVFGHRESFEQLSEDAYDVITVMSGSKE LEAVKETYREKGEKINMCEAIRGMIEDGRIEGRLEGKLEAEQIMAKNMYLRGMTEEDVAG LCEEDIELVRGWFRDWKKVKN >gi|229784103|gb|GG667632.1| GENE 19 20647 - 21579 788 310 aa, chain + ## HITS:1 COG:CAC3578 KEGG:ns NR:ns ## COG: CAC3578 COG0332 # Protein_GI_number: 15896812 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Clostridium acetobutylicum # 1 310 13 323 325 328 52.0 1e-89 MPKQVVTNDDLAKIVDTSDEWIRPRTGIRERRIAAAESGTTDLAAEAARMAIEQSGIKPE ELDIIVLATSSGDCCFPNGACEVQAAVGAVNAVAFDISAACSGFVFALNTVHSFLSAGIY RTGLVIGADTLSKLVDWNDRSTCVLFGDGAGAAVVRAEDTGVIGLTMGADGTKGDVLKCG GRTTGNFLTGKKPELGYMSMDGQEVFRFAVKTVPEAIKKTLAGSGTELEEIKYFILHQSN YRISESIAKRLKLPMDKFPANLERYGNTSGASVPILLDELNREGKLKPGDKLLLAGFGAG LTWGTTLLEW >gi|229784103|gb|GG667632.1| GENE 20 21676 - 21909 488 77 aa, chain + ## HITS:1 COG:aq_1717a KEGG:ns NR:ns ## COG: aq_1717a COG0236 # Protein_GI_number: 15606797 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Aquifex aeolicus # 3 73 5 75 78 69 61.0 1e-12 MLEKMKEIIAEQLNVDAEAVTAESSFKEDLGADSLDLFELVMALEDEYSVEIPSEDLEKL 
TTVGAVMDYLKAKGVEG >gi|229784103|gb|GG667632.1| GENE 21 21909 - 22832 1205 307 aa, chain + ## HITS:1 COG:CAC3576 KEGG:ns NR:ns ## COG: CAC3576 COG2070 # Protein_GI_number: 15896810 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 1 307 2 308 310 392 66.0 1e-109 MKTRITELLGIKHPIIQGGMAWVAEHHLAAAVSEAGGLGLIGGANAPAEVVREEIRKAKA LTDKPIGVNVMLLSPHADEVAKVVVEEGIGVVTTGAGNPEKYIEMWKAAGVKIIPVVASV ALAKRMERYGADAVVAEGCESGGHIGEQTTMTLVPQVADAVSIPVIAAGGIGDGRGIAAA FMLGAEAVQIGTRFVVSDESIVHENYKQRIIKAKDIDSAVTGRSHGHPVRCLRNQMTREY VKLEQEGKSFEELEYLTLGKLREAVQEGDITNGTVMAGQIAGMISKTQTCKEMIDEMMEQ AEKLLNR >gi|229784103|gb|GG667632.1| GENE 22 22938 - 23855 1134 305 aa, chain + ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 1 302 1 306 308 316 52.0 4e-86 MSKTAFIFPGQGAQYCGMGQDFYENTKTGKAIFDMASELLGFSVPELCFEKNDRLDITEY TQAAMVTASIAMMRVLEETKGVRPDVAAGLSLGEYPALAAAGVMSDADAIRTVRQRGILM QEAVPVGIGAMAAILALDASKIEEVLSGIDDVSIANYNCPGQIVISGRKEAVEEANAKLL EAGAKRAVMLNVSGPFHSAMLAEAGRKLGQVLAGVTIHTPVIPYVANVTAQYVTKAEDVR PLLEKQVSSSVRWQQSVEAMLQDGVDTFIEIGPGKTLAGFMRKITRDVKVINIEKLEDIG RMPEC >gi|229784103|gb|GG667632.1| GENE 23 23849 - 24589 235 246 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 5 242 4 238 242 95 31 6e-19 MLEGKTAVVTGAGRGIGRAIALRLAAEGATVIVNYNGSAKRAEEVVNEITAGGGKAEAIQ CSVSDFESVAAMMDGVIKKYGSIDILVNNAGITKDNLLMKMSEEDFDAVIDTNLKGVFNC MKHVARQMIKQRGGRIINISSVSGVLGNAGQANYCAAKAGVIGLTKSFAREVASRGITVN AVAPGFIETEMTEVLSDAVKAGATEQIPMKHFGTTEDIAGAVAFLASGAAGYITGQVICV DGGMAM >gi|229784103|gb|GG667632.1| GENE 24 24771 - 26009 1443 412 aa, chain + ## HITS:1 COG:CAC3573 KEGG:ns NR:ns ## COG: CAC3573 COG0304 # Protein_GI_number: 15896807 # Func_class: I Lipid transport and 
metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Clostridium acetobutylicum # 1 411 1 411 411 478 57.0 1e-135 MKTRVVVTGMGAITPIGNDVESFWNGLKEKKVGIGPITQFDTTDYKAKLAAEVKDFDAKN YMDPKAAKRMERFSQFAVAASKEALENSGIDMEQEDPYRVGVCLGSGIGSLQAMEREHKK LLEKGPGRVNPLLVPMMICNMAAGNVAIQFGLKGKCFNVVTACATGTHSIGEAFRSIQYG EADVMVAGGTEASIAPIGVAGFTSLTALTTVTDPLRASIPFDQDRSGFVMGEGAGVVVLE SLEHALARNAAIYAEVIGYGSTCDAYHITSPAEDGSGAARAMVNAINDGGIRPEDVDYVN AHGTSTHHNDLFETMAIKSALGDHAKQVKINSTKSMIGHLLGAAGGVEFIACVKSIQDGF VHATAGLETEGEGCDLDYTKGEGVSCEVNVAISNSLGFGGHNASILVKKFEQ >gi|229784103|gb|GG667632.1| GENE 25 26020 - 26532 546 170 aa, chain + ## HITS:1 COG:CAC3572 KEGG:ns NR:ns ## COG: CAC3572 COG0511 # Protein_GI_number: 15896806 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Clostridium acetobutylicum # 1 169 1 156 159 104 40.0 8e-23 MEINDIIRLMQAVSENGLTSFKIEEGNLKLSIRKEKEQIITVAGNSTVQQMVPAAGAALE MAALPAAEIFASAAAGGSEGAQTAGSITEAIDSDKIVASPLVGTFYNSSSPDAEPFVKVG DTVKKGQTLGIIEAMKLMNEIESEYDGVVEAILVSNENVVEYGQPLFRIR >gi|229784103|gb|GG667632.1| GENE 26 26588 - 27010 595 140 aa, chain + ## HITS:1 COG:BH3735 KEGG:ns NR:ns ## COG: BH3735 COG0764 # Protein_GI_number: 15616297 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Bacillus halodurans # 1 139 1 139 140 173 63.0 9e-44 MLGIKEIEEIIPHRHPFLLVDCIEELEPGVRAVGYKCVTFDEPYFQGHFPQEPVMPGVLR IEALAQVGAVAILSEEANKGKLVFFGGIDKAKFKYKVVPGDRMRLECEIIKRKGPVGVGR AVATVDGKIAASAELTFMIG >gi|229784103|gb|GG667632.1| GENE 27 27022 - 28362 1165 446 aa, chain + ## HITS:1 COG:CAC3570 KEGG:ns NR:ns ## COG: CAC3570 COG0439 # Protein_GI_number: 15896804 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Clostridium acetobutylicum # 1 443 1 443 447 578 64.0 1e-165 MFDKILIANRGEIAVRIIRACREMGIKTVAVYSEADRDCLHTLLADEAICIGPAPSTQSY 
LNMERILTATVAMKADAIHPGFGFLSENARFAELCEKCNITFIGPSAEIINKMGNKSEAR KTMMEAGVPVVPGGKEAVHEVEEARLVAEKIGYPVMIKASSGGGGKGMRISRGPEDFDAN FQNAQMESVKGFSDDTMYIEKYIEKPRHIEFQIMADKYGNVVHLGERDCSIQRRHQKVLE ESPSAAISEELRKRMGDTAVRAAKAVGYENAGTIEFLLDKHKNFYFMEMNTRIQVEHPVT ELVSGLDLIKEQIRVAAGEPLSVSQDDIKLTGHAIECRINAENPEKNFMPCPGLITNVHV PGGNGVRVDTHIYNDYKVPANYDSMLMKLIVHGKDRTEAIAKMRSALGELIIEGIETNVD FQFDILSHEAYQAGDIDTDFIPKYFA >gi|229784103|gb|GG667632.1| GENE 28 28579 - 30321 1583 580 aa, chain + ## HITS:1 COG:CAC3568 KEGG:ns NR:ns ## COG: CAC3568 COG0825 # Protein_GI_number: 15896802 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase alpha subunit # Organism: Clostridium acetobutylicum # 323 580 8 265 274 347 61.0 3e-95 MLRDMFKKTYTLIDSKYKTPDKADEPNIPAGLWRKCNKCGQPIYAEDVKNNFYICPKCKG YFRVHAYRRIEMTADAGSFEEWDKEMEFANPLDFPGYEKKIEAAREKTKLNEAIVTGKCT IGGEPAVLGVCDARFIMSSMGHVVGEKIARAVERATKEQLPVIIFACSGGARMQEGIVSL MQMAKTSAALKRHHNAGQLFISVLTDPTTGGVTASFAMLGDIILAEPGALIGFAGPRVIE QTIGQKLPEGFQRSEFLQDHGFVDKIVKREDMKSTIAAILKHHHPHTWAQNGEKSFSPVD NCLKPEAFSGKDKKAKNRRNKKTPWDTVQLSRSAERPVASDYINAVFDDFMEFHGDRYFG DDGAIVGGIASFHGMPVTVIGQEKGKNTKDNIRRNFGMPSPEGYRKALRLMKQAETFGRP IICFVDTPGAFCGLEAEERGQGEAIARNLFEMADITVPILSIVIGEGGSGGALAMAVANE VWMLENSIYSILSPEGFASILYKDSKKASDAARVMKITARDLMELGLIERIIAEEEPASA ENLTTLAGEMETAMEAFFDKYAAMKKEEIASGRYDRFRRM >gi|229784103|gb|GG667632.1| GENE 29 30321 - 31454 1347 377 aa, chain + ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 4 377 3 350 355 331 45.0 2e-90 MGNLKPLVIGELVVPKPVIQGGMGVGISLSSLAGAVAKAGGMGIISTAQIGFKEPDFADH SLEANLRAIGKEFKKAREIAPEGVIGFNIMVATKNYAMYVKEAIKAGADIIISGAGLPVS LPAYLAEAAAEMKSAGLGETVKTKIAPIVSSVKSAMVILKMWDRKFGRVPDLVVIEGPLA GGHLGFSRESLTELGVDTPDVEHTYHQDAYDEEIRGIIRLVGEYAEKYQTKIPVVTAGGI YSHEDVMHQIDLGADGVQVATRFVTTVECDAPEEFKQAYIRSSKEDIVITKSPVGMPGRA 
IHNAFLSGVGQTPFKLEHCYQCLEKCDKKTIPYCITKALVNSAEGHTEDGLVFCGSNAFR AERIETVDEVMESLLGE >gi|229784103|gb|GG667632.1| GENE 30 31660 - 32319 353 219 aa, chain + ## HITS:1 COG:BH1574 KEGG:ns NR:ns ## COG: BH1574 COG2715 # Protein_GI_number: 15614137 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for spore maturation in B.subtilis. # Organism: Bacillus halodurans # 1 208 1 178 197 156 40.0 3e-38 MLNYLWAFMMVAGVLWAAFHGNLSAVTDGALSSAKEAVMLCITMLGVMSFWSGIMEIGQK SGLIERLSHRMGPVLRFLFPRIPKDHPAMDHIATNIIANILGLGWAATPAGLKAMESLQE LEEERREEEQTGLDKNSLESSKKRQVKKRPRAMPPGTASNEMCTFLIINISSLQLIPVNM IAYRSQYGSLNPAAVVGPALVATLISTIVGVIYCRLMDR >gi|229784103|gb|GG667632.1| GENE 31 32329 - 33357 932 342 aa, chain - ## HITS:1 COG:PM0547 KEGG:ns NR:ns ## COG: PM0547 COG1609 # Protein_GI_number: 15602412 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pasteurella multocida # 2 342 3 334 334 161 33.0 1e-39 MTLKDIAEKAGVSMMTVSNVINGKHSRVSAATIEKVNTIIKEYNYVPNLSARSLTVKSSH IIGVIVPLDSEESPVSYFDNPYVSSMIGVIERELRNNGYFVMIRSVQTQEDLSVLMRNWN VDGVIFLYPLLGECFEDSVNAIIEETHLPVVLFDARITNPEVINVYSDDKKGCYLSTRYL ISRGHKKIAFAAYCEDNPLLTDRFNGYRAAMEESGLGCDPDCIFSYSPSYDDGIAVGKAI AGSSHKITAVVTTADICAIGVMEGARLAGLRIPTDLSVIGYDNLSVSTYTTPKLTTISQN VTKKAETAMKLLLEKLRTGHVEQPHVQLDVELIERQSTVSLL >gi|229784103|gb|GG667632.1| GENE 32 33640 - 35028 1506 462 aa, chain + ## HITS:1 COG:AGl456 KEGG:ns NR:ns ## COG: AGl456 COG1653 # Protein_GI_number: 15890336 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 63 440 24 410 421 82 22.0 1e-15 MKKCWRSCRTAVLAVTAAVAVTGLMGCSGADSGNGKDAGSAAVSVPTEPFGDTVKYDPAQ PVNDGKDISVEFWEWGSDNLFQELIDGYTAIHPNVSIKLVNNPWDDYWTKLPLSLGGTNG PAIFNVHNSQHENLIHYMAPYEISQEDLDADFNGVSAHVIDGDVYYIDYGIMTGSVYYNK DMWEAAGLTDADIPKTWDEMREVAKKLTVKEGNSFVQAGLNFNGTFHQNYLLGLNYQLGQ NLFEEDKKTATVNNDAMKKVMQLLVDLYEVDGVGSKDFGAEGADSFGQGQSAMTIQWGHF 
NNTLKTNFPDINFGVFEIPVFTENPYAYNRYNGESTFGINKNAPADQQAVAQDIVKYFLA DDEIQKKFNIAMSTFPAKKSLADDADILANPSMAVLADHIDRYIWPGPMPATMETSLKTA GENIFFNGMSVDDALAEAEETIHNDMKNSSFESVEDLYKFAE >gi|229784103|gb|GG667632.1| GENE 33 35099 - 35977 976 292 aa, chain + ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 28 271 24 269 292 174 42.0 2e-43 MFRKSNPLQSKSEILLRNLTVTGMIIFYVVFLLVPIGIAFAGSFHEWNPLSGIYRFTGIE NYVNVFTSALFGKAMLNTLIFSVVVIFFRVGLGLAIAVAIYSTLVKHKSFFRAVYYMPVV TPMVAVAFVWKFLYNPQIGSINQILGLDINWLMNPKTALIAIMVMTIWKDFGYAVVMFMA GLYSLPSDAMEAAKVDGASSWQTFKYLTLPLLKPMTLFVVITSIISYIQAYVQVLILTEG GPGTATYLSSYIIYNEAFVKYNFGYASAMSFVLFVITAVFTWLSFRVSGDSE >gi|229784103|gb|GG667632.1| GENE 34 35990 - 36811 865 273 aa, chain + ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 5 272 14 281 282 206 39.0 5e-53 MRKRVSSIVLNMVLLVLGVVTVYPFLWMLASSFKKNSEIVALHQTFLPKLFTVSNYISMN SHFNFMRFFANSIFITLAITVIVIYTSTLCGFVLSKYRFKGRDALFGLVLATMMIPWCVT IIPRYSMIKAFGWMNSYAALIVPASFSGFGIFMMKQHIASVPDEILEAARIDGANEFYIF HKIVIPMAKNGISSIAIFQFLWAWEDFLWPYLVIDDSKKQLLSVGLKMFNGQYSTDYGAL FAATTISIIPVLAVYLFFQKQFIAGIAASAVKG >gi|229784103|gb|GG667632.1| GENE 35 36833 - 39220 1929 795 aa, chain + ## HITS:1 COG:SP0312 KEGG:ns NR:ns ## COG: SP0312 COG1501 # Protein_GI_number: 15900245 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Streptococcus pneumoniae TIGR4 # 74 793 1 674 679 644 45.0 0 MRDIFKVKTNPAATGENVIAGTCYRISVLTEGLIRLEYRKDGCFEDRPTQMAWNRDFKPA GYRLIKTEDGIEIITKRVHIIYNEQEFSRHGLSIQVLGNLSAYHSIWHYSEDIHDLKGTA RTLDEVDGKTELESGVLSRFGYSVIDDSRSMVLTEDGWVEARKGEGEDLYFFGYGHDYLD 
CLKDFYYLTGRTPMLPRFALGNWWSRYYKYTEESYLALMDRFEQEAIPFTVAVIDMDWHL VDVDPKYGSGWTGYTWNREFFPDPERFLQTLHRRGMRTTLNVHPADGVQAYEEMYEDMAK DLGVDWEHEDPVRFNITDPDFLEAYFQYLHHPYEKMGVDFWWIDWQQGGNSRIPGLDPLW MLNHYHFLDSGRDGKRPMTFSRYAGPGSHRYPVGFSGDTITTWDSLEFQPYFTANASNIG YGWWSHDIGGHMRGYKDDEMEARWTQFGVFSPIMRLHSSCSEFNGKEPWRFKKETEAVMG DFLRLRHRLMPYLYTMNFRASCMAEPLIMPMYYKHPEAVEAYEVANQYYFGSEMIVAPVT APRIGQINEAKVRVWIPEGTCVDFFSHMVYEGGRVMELYRPLETIPVLVKAGGVIPMTDD IGAADVTANPKTLRIRIFGGADGAFTLYEDDNETCNYEQGVCVTTDMKFDWEGEKQSFTI ASAVGAVELIPEKRNYILEFAAVTDAECSVTLNGGKTACRKSYDADSRTLTLEIDQVSTA KTLVVRFETSMKLSENPVEKLVFEFLNQAEIEFEIKEMLFRRIREGKSLKVFLSELQAME IDGEIKGALVEILTA >gi|229784103|gb|GG667632.1| GENE 36 39300 - 40877 480 525 aa, chain - ## HITS:1 COG:CC2802 KEGG:ns NR:ns ## COG: CC2802 COG3507 # Protein_GI_number: 16127034 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Caulobacter vibrioides # 5 502 36 543 548 317 36.0 4e-86 MSTLRNPLLPGFYPDPSVCRKGDDYYLVTSSFEYFPGIPIFHSRDFVHWHQIGHVLTRKS QANLDGLCPSRGIYAATIRYSEETDTFYVVTTLVGNAPYYDNINFYVTASDPTGPWSDPV IIKGAEGIDPTLVFDGDEVYYLGNKRPNPDIPECESRHIWLQRLDLPTGTLVGERVTLLE HGALYDAVAPEGPHLYHIKDWYYLIIAEGGTDHNHCSTVFRSRSIFGPYECNPRNPLITH RNLKRDYPINSTGHADILELNDGSWWAVLLASRPDGGDYRNLGRETFAVPMVWEDDWPVF SPQTGCVEFSYSGPDLPECIWEPEPACDHFLTQELGMCWNILRTPDYKTFSLNDRPGFLR LYLNCHTLAEVSSPAFIGRRQQHMCFTARTYMEFIPAAKGEAAGITMFYNNQYFYSLMLQ STETGISIVLTCQICGDNTRLYSAPYENNGIYFKIRMRYQSIDFYFSQDGQTWIQAGTTV SGTILNKETAGGFTGTYIALYATADGKESQNYADFDWFEYTGEIN >gi|229784103|gb|GG667632.1| GENE 37 40874 - 41557 461 227 aa, chain - ## HITS:1 COG:YPO3651 KEGG:ns NR:ns ## COG: YPO3651 COG1802 # Protein_GI_number: 16123793 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 14 222 12 218 220 84 28.0 1e-16 MKTERSESVNGRSTLKYSTYEFIKNKIICCEYEPGTFLNEEILCEETGVSRTPVRDALGR LEQEGLVKILPKKGVMISNLSINDMNMTYEFRLLLEPYILKNYGNTLDTNETLRYYQLFS DTKLLEDKSLSATADDDFHMWIVNACPNTYFKNSYIAIHNQNLRYRTLSGIRSREKAADE QTDHLTILSECLKGNWNEAAEAMRRHILNSKETSFRLLFSQNLGGTL 
>gi|229784103|gb|GG667632.1| GENE 38 41833 - 42034 179 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870375|ref|ZP_06113840.2| ## NR: gi|288870375|ref|ZP_06113840.2| polysaccharide deacetylase family protein [Clostridium hathewayi DSM 13479] polysaccharide deacetylase family protein [Clostridium hathewayi DSM 13479] # 1 67 7 73 73 96 100.0 5e-19 MKKLMTLVCGAMLAAGILSGCGSQEKAVEPTTAEAAKETTQEAGGETKAETAKMEAEPVD FSFWCIS Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:20:20 2011 Seq name: gi|229784102|gb|GG667633.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld26, whole genome shotgun sequence Length of sequence - 54007 bp Number of predicted genes - 45, with homology - 44 Number of transcription units - 18, operones - 9 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 211 - 1494 1434 ## COG1301 Na+/H+-dicarboxylate symporters 2 1 Op 2 . - CDS 1496 - 2356 986 ## COG1091 dTDP-4-dehydrorhamnose reductase - Prom 2547 - 2606 6.3 + Prom 2464 - 2523 8.1 3 2 Tu 1 . + CDS 2554 - 3123 696 ## COG0778 Nitroreductase + Term 3135 - 3187 12.0 - Term 3128 - 3170 10.2 4 3 Tu 1 . - CDS 3193 - 4542 1010 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 5 4 Op 1 . + CDS 4869 - 5249 276 ## STAUR_0020 hypothetical protein 6 4 Op 2 . + CDS 5263 - 5910 712 ## STAUR_0020 hypothetical protein + Term 5954 - 6016 -0.8 - Term 5940 - 6002 9.1 7 5 Op 1 1/0.000 - CDS 6046 - 7089 1136 ## COG1609 Transcriptional regulators 8 5 Op 2 . - CDS 7155 - 7913 789 ## COG1533 DNA repair photolyase 9 5 Op 3 38/0.000 - CDS 7933 - 8760 758 ## COG0395 ABC-type sugar transport system, permease component 10 5 Op 4 35/0.000 - CDS 8760 - 9617 813 ## COG1175 ABC-type sugar transport systems, permease components 11 5 Op 5 . - CDS 9659 - 11095 1514 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 11140 - 11199 5.2 - Term 11246 - 11318 12.6 12 6 Tu 1 . 
- CDS 11385 - 12404 799 ## EUBELI_01420 hypothetical protein - Prom 12506 - 12565 6.2 13 7 Op 1 . - CDS 12649 - 13461 727 ## COG0561 Predicted hydrolases of the HAD superfamily - Term 13494 - 13532 6.0 14 7 Op 2 . - CDS 13537 - 14301 704 ## COG2188 Transcriptional regulators 15 8 Op 1 . - CDS 14455 - 15312 715 ## COG0561 Predicted hydrolases of the HAD superfamily 16 8 Op 2 14/0.000 - CDS 15322 - 16665 1566 ## COG1653 ABC-type sugar transport system, periplasmic component 17 8 Op 3 38/0.000 - CDS 16691 - 17530 763 ## COG0395 ABC-type sugar transport system, permease component 18 8 Op 4 10/0.000 - CDS 17543 - 18484 766 ## COG1175 ABC-type sugar transport systems, permease components 19 8 Op 5 . - CDS 18490 - 19536 1012 ## COG3839 ABC-type sugar transport systems, ATPase components 20 8 Op 6 . - CDS 19576 - 20868 1219 ## COG1524 Uncharacterized proteins of the AP superfamily - Prom 20916 - 20975 4.6 21 9 Op 1 . - CDS 21052 - 21144 76 ## 22 9 Op 2 . - CDS 21144 - 22019 925 ## COG0524 Sugar kinases, ribokinase family 23 10 Op 1 . - CDS 22973 - 24085 1270 ## COG3613 Nucleoside 2-deoxyribosyltransferase 24 10 Op 2 . - CDS 24101 - 25054 1260 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 25 10 Op 3 . - CDS 25117 - 25524 509 ## COG0432 Uncharacterized conserved protein 26 10 Op 4 . - CDS 25539 - 26681 1051 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains 27 10 Op 5 1/0.000 - CDS 26683 - 27585 1107 ## COG2820 Uridine phosphorylase - Prom 27629 - 27688 80.4 28 11 Op 1 . - CDS 28529 - 29407 764 ## COG0524 Sugar kinases, ribokinase family 29 11 Op 2 . - CDS 29466 - 31574 2167 ## ANT_01190 hypothetical protein 30 11 Op 3 . 
- CDS 31580 - 33763 2168 ## ANT_01190 hypothetical protein 31 11 Op 4 38/0.000 - CDS 33856 - 34698 789 ## COG0395 ABC-type sugar transport system, permease component 32 11 Op 5 35/0.000 - CDS 34708 - 35577 912 ## COG1175 ABC-type sugar transport systems, permease components - Term 35592 - 35643 10.3 33 11 Op 6 6/0.000 - CDS 35672 - 37108 1228 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 37269 - 37328 80.4 - Term 38349 - 38405 12.0 34 12 Tu 1 . - CDS 38455 - 39480 1004 ## COG1609 Transcriptional regulators - Prom 39515 - 39574 5.1 - Term 39583 - 39633 14.3 35 13 Op 1 26/0.000 - CDS 39660 - 40577 797 ## COG1079 Uncharacterized ABC-type transport system, permease component 36 13 Op 2 24/0.000 - CDS 40561 - 41685 833 ## COG4603 ABC-type uncharacterized transport system, permease component 37 13 Op 3 15/0.000 - CDS 41687 - 43219 1369 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 38 13 Op 4 . - CDS 43249 - 44343 1230 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 39 13 Op 5 . - CDS 44380 - 45351 783 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 40 13 Op 6 . - CDS 45370 - 46302 763 ## COG0524 Sugar kinases, ribokinase family - Prom 46360 - 46419 9.9 + Prom 46364 - 46423 8.5 41 14 Tu 1 . + CDS 46565 - 48457 1115 ## COG0524 Sugar kinases, ribokinase family - Term 48472 - 48520 7.5 42 15 Tu 1 . - CDS 48555 - 50009 1325 ## COG5476 Uncharacterized conserved protein - Prom 50076 - 50135 5.6 + Prom 50000 - 50059 8.7 43 16 Tu 1 . + CDS 50102 - 51376 1044 ## COG2508 Regulator of polyketide synthase expression - Term 51203 - 51233 -0.9 44 17 Tu 1 . - CDS 51416 - 51682 241 ## COG0395 ABC-type sugar transport system, permease component - Prom 51802 - 51861 80.4 - Term 53000 - 53039 5.3 45 18 Tu 1 . 
- CDS 53055 - 53945 711 ## gi|288870389|ref|ZP_06113888.2| choline binding protein PcpA Predicted protein(s) >gi|229784102|gb|GG667633.1| GENE 1 211 - 1494 1434 427 aa, chain - ## HITS:1 COG:VCA0088 KEGG:ns NR:ns ## COG: VCA0088 COG1301 # Protein_GI_number: 15600859 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Vibrio cholerae # 1 423 4 424 424 345 47.0 7e-95 MKNRKKPGLTTKIFIGLLAGAATGILLHYAVPAGAIRDEFLIGGIFYVVGNGFLRLMQML VVPLVFCSLVCGSMAMGDTKKLGTIGVKTLFFYLFTTALAITAAILVANVINPGVGLDMS SIERSEIAVADKVSLPDTLLNIIPTNPLGALARGDMLPIILFALIVGIILAKLGESVETV GNFFSQFNDVMMEMTTMVMKLAPYGVFCLIAKTFSGIGFDAFLPLLKYMAGVLLALVVQC LVVYMMMLKGFTGLSPFLFIKKFLPVMGFAFSTATSNATIPLSIDTLHQKMGVSKRISSF TIPLGATINMDGTAIMQGVAVVFAAQAFGISLSAADYITVILTATLASIGTAGVPGVGLI TLSMVFTSVGLPVEAIALIMGIDRILDMCRTAVNVTGDAVCTTIVACQEKAVKKEVFLNP EAGKLAD >gi|229784102|gb|GG667633.1| GENE 2 1496 - 2356 986 286 aa, chain - ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 6 279 2 275 280 220 38.0 2e-57 MNHMQKVWICGANGRVGRKMTELLENTPVELLLTDLDAVDITDPKAVRDYAGMNRPHFIV NCAGLTDVAMCEENKEEAFKVNALGARNLSVAARMGKARMIQLSTDDVFDGSGQTPYTEF DTPNPQTVYGKSKLAGENFVKEFCNRHIIVRSSWIFGEGSSYFSKILKLAGEGKTIYAAS DQTAAPTGASELAEKIIELMDSAPDGLYHVTGQGSCSRFEFAQEIVRLSGHKAEVVPVAA ADDKLTAMRPSYSVLDNMMLRMSGIAQLPEWRLMLERFMRDRKDFL >gi|229784102|gb|GG667633.1| GENE 3 2554 - 3123 696 189 aa, chain + ## HITS:1 COG:FN1880 KEGG:ns NR:ns ## COG: FN1880 COG0778 # Protein_GI_number: 19705185 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 1 189 1 189 192 89 32.0 4e-18 MLKDLVTRCRSYRRFYEEEKISAETLRELVDLARLTASSANSQALKFRICNTAEENEQVF GTLKWAGALPDWDGPEKGERPSAYIIILCDLALGKNKLQDDGIAAQTIMLGAVEMGYGGC MLGNVERTRLAEHLGIDTAKYSIDLVLALGKPKEKVVVVPVGADGDVRYYRDEEQVHYVP KRSLDEMIL >gi|229784102|gb|GG667633.1| 
GENE 4 3193 - 4542 1010 449 aa, chain - ## HITS:1 COG:MA1276 KEGG:ns NR:ns ## COG: MA1276 COG0402 # Protein_GI_number: 20090140 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanosarcina acetivorans str.C2A # 5 446 13 442 442 219 33.0 9e-57 MLPCDILIKNAAVLMPDMSIAKNQTIAISGSRILEIADTDNSDQTISCQAETVIDDPHLL WMPGLTDGHLHTSQQFLRGSLLDEKPVIWKRINVPFEASLTEETMALSARLAGAEMIKCG TTSFVDAGGPHIEAAAEEYLKMGMRGALTWQTTDGANVPDGLRIDTREALPRLEKFHREY HGKDGLLKVYYSVTSLMACTEDLFYTIFLAAKEQNVTAECHMNEYASEVLDFIERYGERP FLYLDRIGALSPQFVAAHCIMLSESEIQIVQDRGIKVVHCPFSNCGKGVPQTPRLLAAGI PVAFGSDGAGHGGLDLFREMRAFRCIMNVAWGITTGNPEIMPARTLLAMALEGGAQALMN SSLGSVRRGNLADLIAIDLDQPHLTPTGSIVNTLIESASGADVKHSIINGKLVMKDRKLL TIDEDKVLFEANKAMKSHALFTKAGAYTG >gi|229784102|gb|GG667633.1| GENE 5 4869 - 5249 276 126 aa, chain + ## HITS:1 COG:no KEGG:STAUR_0020 NR:ns ## KEGG: STAUR_0020 # Name: not_defined # Def: hypothetical protein # Organism: S.aurantiaca # Pathway: not_defined # 6 116 3 121 394 85 36.0 5e-16 MSSTEWTYHMTISNQTGETLVAHNEERNWGIWYLGGKDKSAPVSINAGETKEVFGIRASK GTATGYQGQCTWIGASGSLTLSIEVPYSKDNKSSLKVSGMYNVDGWNELPPRGHHFDHLL TITKVD >gi|229784102|gb|GG667633.1| GENE 6 5263 - 5910 712 215 aa, chain + ## HITS:1 COG:no KEGG:STAUR_0020 NR:ns ## KEGG: STAUR_0020 # Name: not_defined # Def: hypothetical protein # Organism: S.aurantiaca # Pathway: not_defined # 1 203 166 365 394 76 25.0 7e-13 MDISNWEALKHDLGENTGKPLSQLLNIPYSYPPKPVFLSLSDPCDIPPKYWPAINDPTYD NMMKKQQLVDSYFGAVLYVAYTAPSNIQTFPAGQSQTTTIKISQVSVLKQTNVCSVDVRS LIKGEASGLLKGVTLGLTTELETKFGYSYSTETTDTKEKTITKTVTIPASDKDRILVPWN LSKILVVYRKHKTKGISLICADETIKQSVEKFYTL >gi|229784102|gb|GG667633.1| GENE 7 6046 - 7089 1136 347 aa, chain - ## HITS:1 COG:BS_degA KEGG:ns NR:ns ## COG: BS_degA COG1609 # Protein_GI_number: 16078147 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 11 345 1 333 337 219 36.0 7e-57 
MEVLLDWRRYMKITAKDIAREAGVSEATVSIILNNKAGHYRISEKTRQKVLEIAERYEFT CNPIAVSLATRKTHTIGLILPNLVNPFLANISTGIEKCAQENEYALLLCNCEENVKQCVK YLDILQKRYVDGILLVLPREKEVNANQRLVEAALRKCRVPVILIEHYLQKAPCDFMAVDN VKGGYLAAEYLLKKGHTRIGHIAGEQFTRQRTCGYLKALKDYGIPMDSGLVMQGDFSIES GYECGKRLLEQGVTAVFAGNDYMALGVWRAARELGLRIPEDVSLIGFDDDPVAVVMDVPL TTVRQPGTTLGKAASEMLVQKIGEERGTYQQYYFAPQLIERSSVAAR >gi|229784102|gb|GG667633.1| GENE 8 7155 - 7913 789 252 aa, chain - ## HITS:1 COG:CC1330 KEGG:ns NR:ns ## COG: CC1330 COG1533 # Protein_GI_number: 16125579 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Caulobacter vibrioides # 21 182 72 250 360 84 35.0 1e-16 MIYETRVEKAIEITENGQPYINPYDGCVSGCPFCYWLEREGWEGNIAIRINIAEVLDQEC GRWPQGVRLYIGDYCDPYMPVEETYGLTRQCVEVIKKHGIPLTICTSAKSLLIERDLDLM RGMKELCIVTELCRLDEIEKLKAGKGHAGIETGNRLREAGLPVIATFAPIMPGITDVDLV LDALRPDIPLYIDRFYMKPDSIQEIRLMEYLREKRPELMEEYRRLIREGSDAEYEAVIHR CQGSGRVKQLPF >gi|229784102|gb|GG667633.1| GENE 9 7933 - 8760 758 275 aa, chain - ## HITS:1 COG:CAC0428 KEGG:ns NR:ns ## COG: CAC0428 COG0395 # Protein_GI_number: 15893719 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Clostridium acetobutylicum # 19 274 17 272 273 218 41.0 1e-56 MGKRRIWKSVLYQAAATGIGLLVIFPVLYALGTSFMKSTEILSLNPGMIPRRIDFSNYHA VWENTMMLRFIWNSLFVTTITCCLRVVTAALAAYGYACFDFKGKNLFFYLTVGTMLIPGE ATLLTNYETISKLHMINTYQGIIIMFIGSATSVFIMRQYFLGVSVSIKEASEMDGCGDIR FFTRILIPISKPILVTVFITAFVEIWNVYLWPMLITNRNEMRTVQVGIAQLNSSEGSAYG VIMAGAVIVLIPSLLVFIIFQKQIINGMVTGSVKE >gi|229784102|gb|GG667633.1| GENE 10 8760 - 9617 813 285 aa, chain - ## HITS:1 COG:CAC0427 KEGG:ns NR:ns ## COG: CAC0427 COG1175 # Protein_GI_number: 15893718 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Clostridium acetobutylicum # 10 285 28 303 304 175 35.0 8e-44 MKRKKKITPYLLILPSMFFLVLFIYYPAVKTLFKSLFLINSGNMPVKYVGLGNYRKLFSD 
ETFLKSIVVTFEYALVVIPVTILMGYLLALISASGRRLSPVYELMYAMPMAVSMSVASVI FKLLLNGNLGLVNHVFHLNIMWFTDKKYALWAIIIIGIWMGLGLTYIFLLSAVRSVSEEL LEAATAEGAGMWQKVRYIYTPMISPTLFYLFCVNVGSALMMSGPVIILTQGGPDSSTSTI MYYMYQKGFYVYNYGLSYSAAVVGFIIGFTVILISFIWEKRGVHY >gi|229784102|gb|GG667633.1| GENE 11 9659 - 11095 1514 478 aa, chain - ## HITS:1 COG:CAC0429 KEGG:ns NR:ns ## COG: CAC0429 COG1653 # Protein_GI_number: 15893720 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 1 474 1 446 447 179 28.0 8e-45 MKRKRFYSAVLAAVMAGVMIAGCSSGTGKETADAHAPAGAQTENAGEQRDAPAQEKELIM WHNRGGSNGKFIEEQLIPEFNDTAGKLAGVKVVPVYQDEDIVSKLKALILANDVANMPDL VTIYAGDVEYMSTVKNVVPLDSFLESDADFKKEDMVQSLWDTYNFRGKQYSIPFTGDAMI FYYNKTAFEKAGLDPEKPPKTIAEMADCAEKLLIKDGDTVKQYGITLGLQNVYLNNWIGG QGDYHFIGNNEGGRTGRMTKVTFDEDGTMLTLLNEWQKVLDTGAVQSVDVDDQPKDEFCS GLSAMFCGGNWAMTSIIEAANENGFELGISELPRVLDTDKGGVCPGGTSLYILDRGDSDV VNIGWEFMKYWSSAETQTKLCIATGYIPTNGKTMETEEMKAFLKDNPSYRVAFDSLMKSN PKVQEHLAPTQQEFQLIFRDTSLRWAKGEITAQECVDSMAEQCNRALDEYNEANPVEE >gi|229784102|gb|GG667633.1| GENE 12 11385 - 12404 799 339 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01420 NR:ns ## KEGG: EUBELI_01420 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 306 10 301 321 139 29.0 1e-31 MAEQDYYMKILLEDRARFADFINVNVFHGKQVLAADKLSLLPNEAGIVVVDADGVKRTIQ RRRDVVMKAEFGAYFCVVASENQGKVHYGMAVREMMYDALDYTEQIRKIEEKHRAEGDKL EGADFLSHVTKADRLIPVVTLTLYYGNEAWDGPRSLYEMMGIDEEWEETALVKKCLPDYK INLIDIREGEKLDQYKTSLQHVFGLVKYNKNKQKLYEYTRVHREEINRMDRESKAAALAL IGEQKRLQKILESKREEEMDMCQAIDELIADGEVRGEVRGILMGMEKTKINFIRKQYKKQ LSSSQIANILDLDERYVEKVIKLFKQYPAGSDDEIAGYL >gi|229784102|gb|GG667633.1| GENE 13 12649 - 13461 727 270 aa, chain - ## HITS:1 COG:CAC2941 KEGG:ns NR:ns ## COG: CAC2941 COG0561 # Protein_GI_number: 15896194 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 1 243 1 241 268 
94 28.0 2e-19 MNKFPVQLVVSDIDDTLIHKEDHLNEKIIQTIEELKRRGIQFTLATGRMPYRAESYAGDA KLTVPFIANNGSILYDRGRMIWCKMLYAKIIQEITQRYMERNPQFTVIFSFTDRECPLVR TEWIRKRLGKYKGYDQTLGNTPDVWNQTVHKIYVLDDARSGIIGAFADELEYYRNEISFF QYGEFSIEIVAGGCSKASGLTRLLEYLSIPADCVMAVGDHTNDIELIKKAGIGVAVANAD SQLKAAADYVTAGERDLGVKEAIEKFVLSE >gi|229784102|gb|GG667633.1| GENE 14 13537 - 14301 704 254 aa, chain - ## HITS:1 COG:BH3323 KEGG:ns NR:ns ## COG: BH3323 COG2188 # Protein_GI_number: 15615885 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 8 243 1 235 237 84 27.0 2e-16 MGNSNGIMNESPIPLYYQIYVDLKNKLLINKLVDENGKLPAEKELAESYKVSRVTMRQAI AELEKDGLIVRYRGRGSFLKSQPNPILHKLGLPGQAGRVQIDKSPEIVELSHFETTYEYI GRALNYEGDIFFIKRIFRVDGNPVAINRIWIPAVFTPELDKKGLCVEGSLSKTLKEVYHL KIDRRENVIEAVRPSVSDIELLKITYDTLILQITSKSFLKDNLPYEYSVTSWIGDAIRLE VDVEDIEHGITFLK >gi|229784102|gb|GG667633.1| GENE 15 14455 - 15312 715 285 aa, chain - ## HITS:1 COG:yidA KEGG:ns NR:ns ## COG: yidA COG0561 # Protein_GI_number: 16131565 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli K12 # 16 280 1 270 270 89 28.0 6e-18 MAAAKVRQPAEAGGTMGVKLIVCDVDGTLIDHSETVIPELIQIVDQCRKEHIYFSLASGR TKELIEDIRLKLHVTEPYIAANGACVFVDDTCVLMKGFCADPIRDIISDANRVGLTVTCS DAYQERALSVTDYVAEHRKLGNRFKQLLDFHAVDWKEQKFVKIMFMDEHRTGKIERIRKA LKPYSSQYWITTYSDVAVELGPVQCNKATGVRELAGLMGIGMEEVMACGDFTNDLEMIRE AGIGVAVANANEILKSSADYVAASACACGVIEAVKKFCLSGNGVP >gi|229784102|gb|GG667633.1| GENE 16 15322 - 16665 1566 447 aa, chain - ## HITS:1 COG:TM1120 KEGG:ns NR:ns ## COG: TM1120 COG1653 # Protein_GI_number: 15643877 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 55 447 19 430 436 189 32.0 8e-48 MKKTLAMMSVLLVAGTLLSGCGKKAVEETAAVPDNTQGTTQAENVTAAADNGGEKVKIEF WYAWGDKIGETNEELARRFNESQDKIEVVASYQGSYDDVQSKTQAAFAAGEAPAITMNET ASVGTFARSGISRDLTDLVARDGYDLNNFNTGLMTNSYVDGKLYALPYMRSTPLMFVNTD 
MLEAAGLDPKGPENWDELTHYAEVMAEQGKDALCFPIVNEWYFEAFLGQAGGTIFNKEEN GFDFNNEAGEAVLNYWRDLQKTGGVNILFGSDSSNMSKAEFASQNCAMLFASVADINYFL DVAKDNGFNCATAFLPGNKTFAIPTGGANLIITSNKSEEETNAAWEFMKWVLEDEQTIYA SQNTGYVPVTNSAVESDAMKAFYAEKPQFKVAVDQLQYILARPISKQGPEVHKAIKDMVT EFLMDENVSAKEVLDKTEKQCMEILNE >gi|229784102|gb|GG667633.1| GENE 17 16691 - 17530 763 279 aa, chain - ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 9 279 12 281 282 188 37.0 1e-47 MQLTYRLKKIVLTVIQYIGFAILLVLFIIPFVWMVSTSVKSIGETLTNPPIFIPSDFHFE NYIKAWKSGPFLHFFLNSAIITFSSMILQLLFVVPAAYAFARCKFRGKTILFGITMMTLM IPGQLIFMPLFLIFSRMGLINSYASMILPFSTSAFGIFMLRQSFMQIPEELLEAARLDQA SEWQIIRHIMVPMARPTIVTLMLLTFISRWNDYFWPLVMTTNDKVRTLSIGVSMLKNTEG GASWNVLMAGNVILVLPILIVFVCAQRQIIRAFTYTGEK >gi|229784102|gb|GG667633.1| GENE 18 17543 - 18484 766 313 aa, chain - ## HITS:1 COG:AGpA78 KEGG:ns NR:ns ## COG: AGpA78 COG1175 # Protein_GI_number: 16119287 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 30 313 39 322 323 193 37.0 3e-49 MRDKRKASHAFCRRKAKQTLLKQTSFRDGIRPYMMVLPSIAVFLFCYIYPIFYMIYLSLF KWDFISPTKDFVGLKNFITLFSKAEFHQVLSNTLIYTFCSVGIAIVLALVLAVWLNHPGK MFAFVQGAIFSPHIISLVSVSFIWMWMMEPSYGLLNWLLGLFGVKPSMWLQRPDTALMSL VLVSVWKLIGYDTLILISALQSIPKSIYEAALLDKAPGIVTFFKIILPMISPNLFFLVVM NTLTSFQAFETVSIMTEGGPMNSTNTLVYYIYQNGFRFYKMGYASAAGVVLLVLVGIMTI IYFNLLGRKVHYQ >gi|229784102|gb|GG667633.1| GENE 19 18490 - 19536 1012 348 aa, chain - ## HITS:1 COG:PAB0123 KEGG:ns NR:ns ## COG: PAB0123 COG3839 # Protein_GI_number: 14520399 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Pyrococcus abyssi # 1 300 1 292 371 310 52.0 2e-84 MAEIILKDICKSFGETAVVKHVDLTIEDGAFSVIVGPSGCGKSTILRMIAGLEQPTSGEI 
WISGKCVNYVPPGKRDIAMVFQNYAIYPTMSVRDNIEFGLKNRRVPKEERDSLIGDVSEI VGLTEYLGRKPSELSGGQRQRVALARAMVKKPSVFLMDEPLSNLDAKLRNQMRSELIQLH DRLGTTFIYVTHDQVEAMSMGNKIVIMNQGEIMQVGKPMTVYHCPENIFSAQFIGTPPMN ILAAEGLHDAVCCPPGCRYVGFRPERISLDRECGEALVLPAVVQTRELLGAEAVYTFESE AGKIQMKNYSEYLPEPGEKVNLIIERKHLFFFDGSGKAFGMDCGKREV >gi|229784102|gb|GG667633.1| GENE 20 19576 - 20868 1219 430 aa, chain - ## HITS:1 COG:CAC0477 KEGG:ns NR:ns ## COG: CAC0477 COG1524 # Protein_GI_number: 15893768 # Func_class: R General function prediction only # Function: Uncharacterized proteins of the AP superfamily # Organism: Clostridium acetobutylicum # 2 427 5 433 434 246 32.0 5e-65 MAKKMVIISVDALVGEDLELASTLPGFKKIMEGGARVEQVMSVYPTLTYPIHVVQMTGRN LMNHGIYNNERVMPGRLSPDWFWQIWDCRVPTIFDAARRKGLKTHAVMWPVTAQADIDWL LPEVWDLQKWEDPAIIYKKNCSPQLFDKYFEKHRFKLGWHPKPELDEFGVSVAEDVIRNE KPDLTMIHVSAVDIARHGNGMFGPAIDDAIRKVDSWLCRLEQAVSDSGWFDETNFIIVSD HGHLKVDKQVNLNVLFKEEGLIQTDDHGELASYQAYCHSSGLSAQILLADAADTEVRKKV EAILEYAKDTEAFGIQAIYTRREAEEKFQLSGPFEYVVEGAGGVAFAAKWDGRAIINEED SDYDYVKTSHGHRPHLGPQPVLMAKGPDFKENIVIRECSILDEVPTFAAVLGVEMEGLEG RCLHELLAHY >gi|229784102|gb|GG667633.1| GENE 21 21052 - 21144 76 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MITAGGPPFSCQLRRFQAVVFVRNSVWKAG >gi|229784102|gb|GG667633.1| GENE 22 21144 - 22019 925 291 aa, chain - ## HITS:1 COG:RSc1013 KEGG:ns NR:ns ## COG: RSc1013 COG0524 # Protein_GI_number: 17545732 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Ralstonia solanacearum # 2 290 13 310 315 140 37.0 3e-33 MARILNYGSLNIDHVYSVPHIVRPGETISAVGFKDFPGGKGLNQSVAMARAGAMVFHAGR VGNDGGFLKEILVANGVDCTWLKNSEGPNGSAIIQKDKDGQNAIICYAGSNGEISPAEID EVLSHFREGDILVSQNEISNVPYLIEMAGKRGMRIAFNPSPIDDTVLHMDLSPVSWLVIN ETEGYGLTGEEEPEKIVKNLITRYPAMAVVLTLGEDGSLYCQGDTILRQEACHFGVKDTT AAGDTFLGYFLGTYELEGDVKKALDMAARAAGIAVSREGAVPSIPRGEEVR >gi|229784102|gb|GG667633.1| GENE 23 22973 - 24085 1270 370 aa, chain - ## HITS:1 COG:PA0144 KEGG:ns NR:ns ## COG: PA0144 COG3613 # Protein_GI_number: 15595342 # 
Func_class: F Nucleotide transport and metabolism # Function: Nucleoside 2-deoxyribosyltransferase # Organism: Pseudomonas aeruginosa # 207 370 33 198 208 81 34.0 3e-15 MFEKEKIYIAGPECFYQGGPDILNAMRRRAESLGFGVTLPNEHPLDMGNPELQKRADSIF ADLKMIMKETTVIIADLEAYHGSEADSGTIYEIGMAYADGLKCYGYTRDKRPLAWKDQKY VMKDGIVYDENGKKAPYRELPFSPTVVGSTKIVEGDFDDCLAMLMTDLEEEWKAEGRSGA MAAAQVPEASGEANKAQGRTNETHDPVNAPVKPRVYLSDVIRYEEDAREVYGRLKELCAS YGLEAVTPCDWADGFPKTESANPYVRAAALTENYCRLVRSCDAVIADLNDYRGYECSNDV GFECGMGFEMGKKLFGYMRDARPCIEKIPHLGEAAEFRDMTGCNVENFNYPANLMFGSSM KIYEGDFEHS >gi|229784102|gb|GG667633.1| GENE 24 24101 - 25054 1260 317 aa, chain - ## HITS:1 COG:ECs0690 KEGG:ns NR:ns ## COG: ECs0690 COG1957 # Protein_GI_number: 15829944 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli O157:H7 # 6 311 5 303 311 175 35.0 1e-43 MDKKKIILDCDPGMDDSIAIIMACKSNALEVKAITAVNGNYRVDVTSKNALKVLELIGRT DIPVGKGMPEPMVRKCPKDPFTHGKDGQAENYLPDPVTALSDKNAVELIIDTVKANPGEI YILCTGPMSNVAMAMKKAPEIKEMIAGIYAISGAFGLNRYAYTNATGDTPQSEWNVYVDP EAADIVYNSGVKLVALGLDVATHFDVNLTDEDIRMLEASDKPEAKFFLQAVHFVNEKGFE AYCAVIDCMAVGYAIDATLVKTISGHVGIETKSDLTLGMTVLDRRHHFVWESLPVVEIGE SADYERFLKLLMKLVLA >gi|229784102|gb|GG667633.1| GENE 25 25117 - 25524 509 135 aa, chain - ## HITS:1 COG:MJ1081 KEGG:ns NR:ns ## COG: MJ1081 COG0432 # Protein_GI_number: 15669269 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanococcus jannaschii # 1 134 6 137 138 108 42.0 2e-24 MLKNFKIKTKYNEVYDITEHVKKAVEESGVKEGTCLVYNPHSTAGLAVFSPWDPDGFLDL DEEIRRLVPTRIDFKHQHDTPQDAAGHVKSALLGISGNFIVHEGKLLVGGSQGIYFLEFD GPRNREFYVKIQADA >gi|229784102|gb|GG667633.1| GENE 26 25539 - 26681 1051 380 aa, chain - ## HITS:1 COG:TM0148 KEGG:ns NR:ns ## COG: TM0148 COG0449 # Protein_GI_number: 15642922 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: 
Thermotoga maritima # 46 251 290 488 606 84 29.0 5e-16 MSERLTKEYDNPMCRQAMSLPELIRSQYEDLEPKTRKVLSFQEIFNIQRVVLTGCGDSYA ACMATKHVFEMLTGIQAEVVTAIDFGRYYSERHMGVDCQNPLVIAVSNSGRVARISEAVQ RAKFHDCFVLGITGNLTSPLAQNSDKVLKLDIPPFESAPGTRSYMVSVMALLLLAVRIGE VRGMYTMDQAMDYRYDMKNQGDLLEALLPQMAETCAKTAEEWKDFPCFDFVGAGFDYAAA WFGQAKMLEATGKFSMHINSEEWFHLNCFAKDPEHIGTVVVSNTTNPGLSRTKDVIRYGK KLGRPMLVITDGAGAEGVSCVTVPSPKYPVNMPVTQFVPMSLLTGYLSAMLGEEYGRGCE GPWSFCENGFCVQNNEIVLK >gi|229784102|gb|GG667633.1| GENE 27 26683 - 27585 1107 300 aa, chain - ## HITS:1 COG:APE2105 KEGG:ns NR:ns ## COG: APE2105 COG2820 # Protein_GI_number: 14601845 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Aeropyrum pernix # 64 294 10 240 254 149 42.0 6e-36 MKSEEIRKKPSIRRIHEQYEYYKNVPERGLMIKGRPALTQIDHEKVGDFVLITVRDPLCA YDVDPAKKIADRLDHAELIGNSGMFTSYTGTYKGAKITVVSGGSGSPEMELILYDFMEYT DAGTFLRVGGSGGIGDEVKPGDVVIASGVVREEAMTKAYIPAGYPAVSHYEVVGAMVEAA ESLGVPYHVGVTLSVDSDFVGGGRPGVGGYLQPWNIETAGIYNRAGVLNGDRESAAIVTL SALFGRRGGSICSVADNLCTGEKFEAGGGHEFAIDIALEGCAVLNRLDQEKNEKKAEEKQ >gi|229784102|gb|GG667633.1| GENE 28 28529 - 29407 764 292 aa, chain - ## HITS:1 COG:BMEI1779 KEGG:ns NR:ns ## COG: BMEI1779 COG0524 # Protein_GI_number: 17988062 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Brucella melitensis # 73 288 106 313 330 72 27.0 1e-12 MDYIIASTVVTDEIRFADKKTVVKKAGGAGIYALCGIRLWCSSVMPVTGVGRDYGELFGD WYQKNHISMEGLIEKDEKTPYNIIQYFEDGERSETALYGADHYRKIEVTPNELEPYFQTA KGIYIFKNTSPDFWEQVLPMMEKRKAAVLWEIASDSTCPECLDEVRQIAGSVDIFSINLT EARNLFGTECLDEIIGTFREWLDSSSHSVRQNHLQMIFLRRGSKGAVMITQEETVYVPTA DGAYVVDPTGGGNSSSGAVLYGWCSGHTPEECGLMGSISAAMCLEQYGVLAS >gi|229784102|gb|GG667633.1| GENE 29 29466 - 31574 2167 702 aa, chain - ## HITS:1 COG:no KEGG:ANT_01190 NR:ns ## KEGG: ANT_01190 # Name: not_defined # Def: hypothetical protein # Organism: A.thermophila # Pathway: not_defined # 6 701 7 693 698 510 41.0 1e-142 
MSVLKFREQIYAGVLGKIIGVYLGRPVEGWSYDKIRETFDEVKYYVHEKVGVPLIVADDD ISGTFAFFRALEDNGYDRELPAKAFGDTWLNYIIENRTILWWGGLGRSTEHTAFLNLKNG MNAPESGAIKTNGTTLAEQIGAQIFIDAIAMACPDDPDLAVSLVRKAASVSHDGIAVEAA CHLAALEAMAFTEKDVNVLLDRAGAYVKNQVLKDMIGDVRDICSRESDWRKVRDYLDPKY GYEVFPGCCHMIPNHAMVIAAILLGGDDFQKSISIASSAAWDTDCNAGNVGAFNGIRLGL DGINAGADFRTPVADLMYVVTSDGGSVVSDAVSESRKVLKAAGALHGESVPVSDKRYTFE YPGSLQGFAPCDYDHGSQAAVTLSNYNERSGENGLLISCRHVADGVTANVSTQTFIDFSK LAQNFSTVASPTLYSSQKVVTEAAVVSSPGSSGRNEVTLRPYILYYDIENQVRAIYGDAV LLDETKKTFEWTVPDTKGMSIFKLGYEAASRKRFDGDVVICSIDWKGAPTDFAQRGMLMT SIWNTNSLWLAGFASSAVQFAADFKHTYCVSNVEEDGLVTIGTREWEDYTVSSTVFYSLH QAGGLVVRSRGHKRYYGAVLMDYSRAVVYVQKDRERVILAEVPYSYQEDTGYTLTFGAHR EKLEFSVNGETLIEVTDGTYQGGGAGFTISKGTMTCDSFIVS >gi|229784102|gb|GG667633.1| GENE 30 31580 - 33763 2168 727 aa, chain - ## HITS:1 COG:no KEGG:ANT_01190 NR:ns ## KEGG: ANT_01190 # Name: not_defined # Def: hypothetical protein # Organism: A.thermophila # Pathway: not_defined # 11 723 11 694 698 488 40.0 1e-136 MKNIIRFKNDIYAGVLGKIIGVYLGRPVEGWTYEAIRDTFGEINYYKNHRTGAPLIVPDD DISGTFAFFRSLEDNKFDPGIRAEQIGDSWLNYIIENETILWWGGLSRSTEHTAYLRLKN GIRAPKSGSIQLNGRSMAEQIGSEIFIDTWAMANPGNPERAAKMAREAASVSHDGIAVDA AVYLAVMESMAFVEKDIDRLLDAGLAYIENPEFIRLVGAVREQCAKASDWREVRQWIAEN HGYEKYPGNCPMVTNHLTLLTAFIMGGDDFHKSCMIACSSGWDTDCNSGNVGCLNGIRLG LEGFKKSTDLRKAVADRLYVVTSDGGSCISDAVQESRKILKAACRLEFEPVDFPEERLAF EYPGAVQGVVPYDKNTEEQVLCGIDNPYEETGEYGCRLSYRGLGPGVHATACIDTFIDLQ PKGKEGTSYFDVLCSPSLYSGQEVKAVIAAGEGENPVFTFFIEYFNEKDELETVSSSAFR LHAGDNQLKWLIPDVGGHAVYRLGLSLRSGAQGELPGNPMEGLSGAAGCSRLDGSVTLKS LDWSGAPECYRMRKSMEMTPTLTPWTTTTAWLKTFVSSADHFCPDYTTTFSISHSAPNGV ATTGTMDWDNYTVSSVITFSQQDSAGLVARAKGHRRYYGAVLKDKKAVIYKQCDAERTVV AEAAFDFEIDDTRKLAFTLDGPELSLAVDGKTVVRGLDESYLCGGAGFVVDSGAILADGF EVRKIER >gi|229784102|gb|GG667633.1| GENE 31 33856 - 34698 789 280 aa, chain - ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # 
Organism: Bacillus subtilis # 3 280 22 300 300 154 33.0 3e-37 MGVSSKKKRIWINVLIHVVLILLLLICLLPIALVVINAFKTNAEIVRNPLSIPKVLHLEN FIQAWTTGKFAHGFINSIKLSGCTILIILVCSTLAGYVLSGYRIKGSGIILTYFMMAMTV PIQLFLFPLYYAFAKLNLIGNIPATSLILAAMFMPLSLFLMRTFFLNVPKELEDAARIDG ANTGQVIWHVMRPVVSPGLITVGILIGLQSWNEYLVSSTFLQGEKNFTATLGFLSMNGSY GSNMSILMASSVILIAPIVVFFLCAQRYFVDGMVSGAVKG >gi|229784102|gb|GG667633.1| GENE 32 34708 - 35577 912 289 aa, chain - ## HITS:1 COG:SMc02472 KEGG:ns NR:ns ## COG: SMc02472 COG1175 # Protein_GI_number: 15966816 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 24 282 49 307 312 139 33.0 7e-33 MKSKKINWNYLLIAPALLISVSVILLPGIMTIVYSFTDYNGISQNFNFIGLQNFKELFHD RIFFMAIRNNLIWTIMFITIPVCIGMLAAMLLLSRSRTRSIYQVAFLIPYVLAPAVNAML WLNVIFSPVVGVVSFLKNFGIDLGSPLASMKSAIFACAGVDIWHYWSYLTVIYLAALRQT PTDQVEAARIDGCNGWQLFRYIYLPNIKSTVNLMFVMIVIGSFLAFDYVKLLTGGGPAHS TEVLGTYAYSFAFSEMKVGKAAAVGLFMSFFGLIASLIYTRMSRKENMD >gi|229784102|gb|GG667633.1| GENE 33 35672 - 37108 1228 478 aa, chain - ## HITS:1 COG:BH3690 KEGG:ns NR:ns ## COG: BH3690 COG1653 # Protein_GI_number: 15616252 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 1 193 1 176 420 68 28.0 3e-11 MKKRLFAAALSAAMVGTLLSGCSISTTSTPESDASGTTAAGETASAAQETTAGQDLKGRE EITLWFWGAEPYAQEAMQRILADKYNSSQDQYKLSIEFRPSVDTDMSTALAANQGPDIVY GSGPSFVMPLVEAGKLEPLDEYSEKYGWSDRILKPFYESGTVNGKLYSLISGVSTMGVFY NKKVLADNGWEVPKTIGDMEKIMDDALAKGLYASVTGNKGWKPVNENYASVFLTHVAGPE TMYKCLTGEEKWNNPNVTAAIDKSKEWWDKGYLAGDDYVNLNFSESLQLLSDEKAPFFVG PSIAFQWAASFFTGDKTENIAFIPFPSTETVPNETYTLGASCTLSINANVSQEHKDEAAK IIDMMFQPEFMQEMTAAWPGYWGTPLKDLSTIDTSQMGYLSQSFVDVVKNISAAVDKGNF GYYDNVFFPASTQQSMVNIEDVWYDTVSVNDYLDKVDKDFAEALAKGLVPPIPKPAMN >gi|229784102|gb|GG667633.1| GENE 34 38455 - 39480 1004 341 aa, chain - ## HITS:1 COG:ECs4695 KEGG:ns NR:ns ## COG: ECs4695 COG1609 # Protein_GI_number: 15833949 # 
Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 5 336 4 329 330 169 31.0 9e-42 MKVKVKDIAKAAGVSPTTVSLVLNNRPSRIAEDTKEHILRVAREMQFQKESEIDFSEFKK VKTLGMVVPDVMSSFYHRLAEETSRHAFYQEYTVFQCYTNDDIQCFYMAIEGLMAKNVDG ILVIPPRTMDKENVKLLKSLQKSGVPMVLLDRAAYAVFCDFVTADNKHGGRIATEYLIKH GHTRIGCLVGEMNVYTSRKRVEGYREALAAAKIPFDKDLVYYGNYDIESGRAGMEVLWEK GVTAVVAGNDLMAYGVYLFAKEHRLSIPGDLSVIGYDNTELCGLMDVPLTSVDQNTDTMA SKAVEVLLRRIEVPASDEPEPARNYYFTPFVVERDSVGRLK >gi|229784102|gb|GG667633.1| GENE 35 39660 - 40577 797 305 aa, chain - ## HITS:1 COG:alr5368 KEGG:ns NR:ns ## COG: alr5368 COG1079 # Protein_GI_number: 17232860 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Nostoc sp. PCC 7120 # 1 286 4 287 312 158 34.0 1e-38 MNLFVEILKSALPAMIPALLAALGGLFTWHANVFNISMEGMLLVSAFTSVVGSYMFESWA MGLLFGILGSMLISLLFAFFVLKMKTGEFITGIAINTFVLGATTYFLRQLFQVKGSLISP RIEALPRWNIPFVKDIPVLGQIVSGHAFPLYITLIVIVPLVYILIYKTSFGLRLRASGFD AKVIDSVGLKSELLKFISILICGFLCGIGGTFLSLGIMRMFTENMSNGRGWISLAIIILS KGHPVKVLVTCLIFGMMEGIGLTLQQVSIPNQITDMLPYIAVLVALFINSRNWKKGSSAG MEMSI >gi|229784102|gb|GG667633.1| GENE 36 40561 - 41685 833 374 aa, chain - ## HITS:1 COG:TM0104 KEGG:ns NR:ns ## COG: TM0104 COG4603 # Protein_GI_number: 15642879 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Thermotoga maritima # 27 344 12 331 344 138 31.0 1e-32 MEQNKTNVSFAKRITASPEVKKGFTVFLGVVTAIVIGMLMLASQGYSPMESYASILKYSL STSVGISNTINRAVFLLIAGASASIALGSGASNLGQFGQLLMGAMAATLIGLYVPLPSWL LIPLMIIGGMAAGALYAGLAALGRQRFGMNEFIITLMLNFIADYFTRYLITNPLKDPGAS WPASPVVPGGAILPAMGKIDCAAFIMIAVYIIAVFYMKKSRTGYEMRIMGKNGLFARVGG CKTDKNFMKVMLSSGALAGLVGVMMIMGASQQHRFLGGLGQSYADDGLMISIVSGNQVTG VFIYAIIFSIFQSGSTGMQLDTGVPSEFTKMLIAITVLSVVTFRSYSGIFLDKLAAFKKT RSLHKGVNQSESVR >gi|229784102|gb|GG667633.1| GENE 37 41687 - 43219 1369 510 aa, chain - ## HITS:1 COG:TM0103 KEGG:ns NR:ns ## COG: TM0103 
COG3845 # Protein_GI_number: 15642878 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Thermotoga maritima # 9 506 5 499 507 398 46.0 1e-110 MGTVADSYIEMRNIVKVYPPNNVALKDVSVDITLGEIHSIIGENGAGKSTLMKVLFGMEK LNEGEIAINGSKTAIGSPKEAVAKGIGMVHQEFMLIGEYTVLENVVLGNEPTKNGLLNLS AARKKLEKIMTDFKFDIPLDEKVKNISIAAQQKIEIIKLLYRDVETLILDEPTAVLAPQE VDELFGLLDRIKSEGCTILFISHKLDEVLRISDRITVMRAGRLIWTRKNEGLTKADLASA MVGREVMLTIDKKAAKPGTAVLQIEDLAVENPGIPGKMDLDHVSMTVHKGEIVGVAGVEG NGQYEMVQAVMGLMESKGRVLLEGKDISKMDVHGRRRLIAYVPQDRKISGSSLEDSIETN VMMTHHYVSDRLVGKTRLLDKKKCRQLSNQIISQYQVSCQGPRMPIGALSGGNQQKVIVG REFELDSSLLVVDQPVRGLDVGSIEYIHRKIVDKRDEGEAVLLVSADLDELFNLSDRIVV MHKGKIAAEKLASETTREEIGEYMLGVKSD >gi|229784102|gb|GG667633.1| GENE 38 43249 - 44343 1230 364 aa, chain - ## HITS:1 COG:SMc02884 KEGG:ns NR:ns ## COG: SMc02884 COG1744 # Protein_GI_number: 15963962 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Sinorhizobium meliloti # 64 362 32 322 330 159 36.0 1e-38 MKKLMKVTALLMAAAMTLTACGGKTTSTETGKDTTPAKTESQAEGQTEAAKETKKMRVAY VTAGAQGDNGFTDSVVRGMNRIKDDFGAEITVIENNNDAAKYAESMEACFQWQPDVVFAD AYGFEELFAQYADQYPDVTMVNLDFVIENSNKTVSSYTFISEEGAFLAGVLAAKVTTSDL PYANPEKTVGFVGGQEITVIKGFLSGFKQGVEFVDPEIKVLVSYVGDFFDPVKGKTATKQ LYAQGADIVFQAAGTSGNGALEAANEEQKYVIGVDSNQNGLYPGHVVASVIKDLDGAVYD TFSKIEDGTFEKGTVYSKGAGPSGVYLAIDEYSKEILTPEMITEINAVQDDIVDGKVTVE RYTE >gi|229784102|gb|GG667633.1| GENE 39 44380 - 45351 783 323 aa, chain - ## HITS:1 COG:ECs3054 KEGG:ns NR:ns ## COG: ECs3054 COG1957 # Protein_GI_number: 15832308 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli O157:H7 # 3 318 5 304 313 181 37.0 2e-45 MEKVILDVDTGTDDAIAIMTALQSPELEVLGICSVNGNRGIEFTTENTLRVVEYMGACDV PVHKGCSLPMVVSLTPGRRKEIPYTGPEDPEDNVHGDYVELPPATISAAPGNAVSWLVDT 
LMKSAGDITLIPVGPLTNIAMALRIEPAIAAKIKRIVLMGGGCREVNITPTSEFNFWIDP EAAKIVIDSGCDITIVPLDATHAAAVSIEDAKELRRMGTKASVVSADIIERRWSAYKNWQ PMDDINTVPVHDALAVCAVINPAVLKNVVSVHVDVDINGGASDGQCICDVGKRNKRDVPN AKVALGADKELFAAMLKEILGKQ >gi|229784102|gb|GG667633.1| GENE 40 45370 - 46302 763 310 aa, chain - ## HITS:1 COG:PA1950 KEGG:ns NR:ns ## COG: PA1950 COG0524 # Protein_GI_number: 15597146 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Pseudomonas aeruginosa # 1 304 1 304 308 238 47.0 8e-63 MSKGILVVGSLNMDMSVNLEKIPTVGETVLGNDLSYRAGGKGANQACAAGLLGGEVRMLG CVGKDEFGEKQIENLKKAGVDTSYLKKSDDQPTGTAVICVDSHGDNNIVVIPGANKECDA EYLKRQRELFEWCDYVVLQMEIPYDSVLCAAMMAKEAGKTVILNPAPAPDELPEELCRNV DYLTPNETELMKLSGVTDCSVSSMKKGAMILLEKGAGCVIVTLGEKGALLVKKGEECLYP AKKVKAVDTTAAGDCFNGAFAAALAEGMKEGAALGFANAASAIAVTRKGAQESLPAREEL KDYLGDVLAV >gi|229784102|gb|GG667633.1| GENE 41 46565 - 48457 1115 630 aa, chain + ## HITS:1 COG:PA1950 KEGG:ns NR:ns ## COG: PA1950 COG0524 # Protein_GI_number: 15597146 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Pseudomonas aeruginosa # 327 619 4 301 308 170 34.0 9e-42 MTIKEIAQLAGVSISTVSKIVNNKDENINVETRNRVLKIVKDYNYTPYGTAKNTSEAKTF ILGVLLKSTPKTNLFLNGIILAAQKNGYSILLYDSAESTSVELKNITSLCKNNVDGVIWE PVSHLSLEYERYFNEQGIETCRINGPDDPASYSIDFAQMGYEAAQILIRHGHSKLGCLTK QGSARSAMVFDGFKKCLFDHGIPFYDTMNLPVEKDDWYNAMLSHTPTGIVSTHYASSLIL MKQLAKIQFLVPYDLSLISLRDDVRENLTFPEISSIKIPYYEFGSFVCERLIEKCEKKGP SAAVFSTAYPLENTSSIDVPFSSHAKKIIVVGSINIDVTLNVDELPQPGRTVSTSKHSVI PGGKGANQAVGAAKLGNEVSLIGKVGNDYDSTIVYACMEENHVDVQGIRRDPHSETGKAY IHVQNDGESMITILTGANQHLMPSDITPYGKLFENAGYCLLQTEVPEEAIKTAALLARSH GVCNILKPAAMNRISDSLMELIDIFVPNRKEAELLCPGVPGIEGKAEAFLKKGAKTVIIT LGHSGSYIKAPEFTGYLPAADFTPVDTTGAADAFIAALAVYLSSGHTIAKAAKIATYAAG FCVSRQGVIPALIDRNSLETYIKRIEPDIL >gi|229784102|gb|GG667633.1| GENE 42 48555 - 50009 1325 484 aa, chain - ## HITS:1 COG:AGc4702 KEGG:ns NR:ns ## COG: AGc4702 COG5476 # Protein_GI_number: 15889852 # Func_class: S 
Function unknown # Function: Uncharacterized conserved protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 451 2 445 477 242 32.0 1e-63 MKKKIFVAGIYQESNSFSPLMSEQKDFFTYLKGEELLLETPGVRELEAAGYEVIPGLCAS AWPGGILKLEDFRRMAGEILEKLPLDGSVSGIFFPSHGALEVEFIGSGDTFLISMLRERV GPRVPIALALDLHANNTYTLARLSNIIYGYRTAPHIDIEETSIRAAELLMKAIEEKKEPW TELIRLPFMMPGENMMTESGFGKEVIAMVAELEKVPGVWCASYFAGMPWVDCAQGGSSIV ISGVGDKKPGIKAAVKLGRKIWNRRKEFKFQGTAMEPLEALITLDRCNHYPAFLSDSADN VTAGAAGDNAYMLDLILKQGIKGVLVAHFIDGPAVAFCAAKEKGDRFDISLGGTIDPAGS VRTVLPDAELVEVFRQGGKATAAVVKTDAVTVLILAERGPITSEAVLNSYGLSAYDYRIT VVKQGYLTPEFYEVLKEYIMALTPGNCAQDLTMAEYKNVRRPMEPLDIVGDESRIAETYE AVEE >gi|229784102|gb|GG667633.1| GENE 43 50102 - 51376 1044 424 aa, chain + ## HITS:1 COG:CAC1426 KEGG:ns NR:ns ## COG: CAC1426 COG2508 # Protein_GI_number: 15894705 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 139 392 109 369 397 81 24.0 3e-15 MGNDKAYNKPVYNELIRVSNLYYSNCSLSDLLDYLAGISGFTFVLTDTGFHPVAARLVFG SQKNTFAASDHAFGEIPPEEISLSSPASESESGTAVFCIREETVVCKILPLPNYSYYLCI DGKISLTDSRFLAIVENALPFLALTLDKEKAFTAAGFSNDFFAPFLQVLHSTSSKTPDEI RNICNIYNFNPDLKRVCMTLQLQNAGEDKEELRELIARLRNLLNADKYFLIYSHNFVSVF LLYPLADQDLDIITKSYVLSTSLFKELESRFSLRIGISSAHFGVETIAVSFEESFKSIKV QKQLNLPGSASSYFEQSIYHILNIPELKDFRKLSRDIISPLIEYDAANDTNFFETLQQYF HNSFNIQKTAKALYIHRNTLSYRLQHIKELLKFDFDFEHNTDALFTLYLCICVYQLGYHK DFTS >gi|229784102|gb|GG667633.1| GENE 44 51416 - 51682 241 88 aa, chain - ## HITS:1 COG:BH1866 KEGG:ns NR:ns ## COG: BH1866 COG0395 # Protein_GI_number: 15614429 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 19 88 216 285 285 68 48.0 2e-12 MRGRGVESAEGTLFFVWSWTDFMGPLIYLTKSSKWTISLGLSQFTNTYGVDWALLMAASA IAILPMVILFFFLQKYFVEGSMSSGVKG >gi|229784102|gb|GG667633.1| GENE 45 53055 - 53945 711 296 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|288870389|ref|ZP_06113888.2| ## NR: gi|288870389|ref|ZP_06113888.2| choline binding protein PcpA [Clostridium hathewayi DSM 13479] choline binding protein PcpA [Clostridium hathewayi DSM 13479] # 1 296 21 316 316 600 100.0 1e-170 MRKSTGWKLAIIFLAVTLNPMSAYAGTWKNTSTGCQYQKDDGNYARAEWITENGTSYYLN QNGIMAVNTTTPDGYLVAADGSWVTEQQSLGGYVRTPYDNRPYYYDPEWQVYIFDEDTDY AWVNDTRVLAAVRGIIPASELSEKNQVVYEEVCRFLTGFDYGASDYEKATRIYDEITLRA TYNRGKYTQADDEVYSILVNGTGKCVGFARAYKLLANAVGLKCALREDYVHMWNAVYIDG QAKSIDVSSTRTDACFYLDVVELKCPFCDYRNVFGLREVCHPCPNCKAQLYNPDMN

Prediction of potential genes in microbial genomes
Time: Fri Jul 1 00:21:24 2011
Seq name: gi|229784101|gb|GG667634.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld27, whole genome shotgun sequence
Length of sequence - 37168 bp
Number of predicted genes - 39, with homology - 38
Number of transcription units - 16, operones - 8 average op.length - 3.9

  N Tu/Op      Conserved  S       Start      End  Score
               pairs(N/Pv)
  1   1 Op  1  .          - CDS       3 -    123    133  ## Closa_4287 iron-containing alcohol dehydrogenase
                          - Prom    196 -    255    8.3
  2   1 Op  2  .          - CDS     331 -   1041    759  ## Closa_2903 Crp/Fnr family transcriptional regulator
                          - Prom   1069 -   1128   10.2
                          + Prom   1054 -   1113    9.0
  3   2 Tu  1  .          + CDS    1293 -   1496    227  ## COG1278 Cold shock proteins
                          + Term   1527 -   1581    9.1
                          + Prom   1573 -   1632    5.8
  4   3 Op  1  .          + CDS    1761 -   2591    800  ## Closa_2902 hypothetical protein
  5   3 Op  2  .          + CDS    2607 -   3053    583  ## LC705_00064 hypothetical protein
  6   3 Op  3  .          + CDS    3090 -   4439    839  ## COG0402 Cytosine deaminase and related metal-dependent hydrolases
  7   3 Op  4  .          + CDS    4448 -   5209    780  ## LGG_00070 hypothetical protein
                          + Term   5244 -   5277    5.4
                          + Prom   5222 -   5281    3.6
  8   4 Tu  1  .          + CDS    5444 -   6286    803  ## Closa_4288 Fumble domain protein
                          + Term   6295 -   6354   17.1
                          - Term   6283 -   6340   17.5
  9   5 Op  1  .          - CDS    6420 -   6923    526  ## Closa_4289 hypothetical protein
 10   5 Op  2  25/0.000   - CDS    6950 -   7891    891  ## COG1475 Predicted transcriptional regulators
 11   5 Op  3  .          - CDS    7894 -   8664    604  ## COG1192 ATPases involved in chromosome partitioning
                          - Prom   8864 -   8923    5.7
                          + Prom   8621 -   8680    6.4
 12   6 Tu  1  .          + CDS    8874 -   9857    462  ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily)
                          + Term   9919 -   9956    8.0
                          - Term   9905 -   9947   10.4
 13   7 Op  1  24/0.000   - CDS    9956 -  10687    394  ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division
 14   7 Op  2  11/0.000   - CDS   10758 -  12665   1647  ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division
 15   7 Op  3  4/0.000    - CDS   12683 -  14062   1480  ## COG0486 Predicted GTPase
                          - Term  14071 -  14118    7.0
 16   7 Op  4  16/0.000   - CDS   14132 -  14800    854  ## COG1847 Predicted RNA-binding protein
 17   7 Op  5  18/0.000   - CDS   14813 -  16099   1191  ## COG0706 Preprotein translocase subunit YidC
 18   7 Op  6  16/0.000   - CDS   16127 -  16342     81  ## COG0759 Uncharacterized conserved protein
 19   7 Op  7  .          - CDS   16347 -  16694    180  ## COG0594 RNase P protein component
                          - Term  16731 -  16762    2.5
 20   8 Tu  1  .          - CDS   16844 -  16978    181  ## PROTEIN SUPPORTED gi|160882064|ref|YP_001561032.1| ribosomal protein L34
                          - Prom  17059 -  17118    6.9
                          - Term  17184 -  17237   10.0
 21   9 Tu  1  .          - CDS   17284 -  18243    788  ## gi|288870394|ref|ZP_06409748.1| conserved hypothetical protein
                          - Prom  18477 -  18536    6.0
                          - Term  18564 -  18623   11.2
 22  10 Tu  1  .          - CDS   18653 -  19282    735  ## COG4684 Predicted membrane protein
                          - Prom  19333 -  19392    6.2
                          + Prom  19636 -  19695    5.7
 23  11 Tu  1  .          + CDS   19782 -  20252    474  ## COG0593 ATPase involved in DNA replication initiation
                          + Prom  21099 -  21158   80.4
 24  12 Op  1  16/0.000   + CDS   21224 -  22102    899  ## COG0593 ATPase involved in DNA replication initiation
                          + Term  22169 -  22209   -0.9
                          + Prom  22140 -  22199    4.2
 25  12 Op  2  .          + CDS   22376 -  23497   1167  ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog)
 26  12 Op  3  .          + CDS   23526 -  23735    270  ## Closa_0003 hypothetical protein
 27  12 Op  4  9/0.000    + CDS   23745 -  24830    903  ## COG1195 Recombinational DNA repair ATPase (RecF pathway)
 28  12 Op  5  24/0.000   + CDS   24840 -  26750   2117  ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit
 29  12 Op  6  .          + CDS   26787 -  29309   2617  ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit
                          + Term  29539 -  29600   18.3
 30  13 Op  1  .          + CDS   29641 -  30972   1318  ## COG1253 Hemolysins and related proteins containing CBS domains
 31  13 Op  2  .          + CDS   30975 -  32111   1390  ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit
 32  13 Op  3  .          + CDS   32124 -  32240    222  ##
                          + Prom  32250 -  32309    7.4
 33  14 Tu  1  .          + CDS   32378 -  32791    326  ## COG1959 Predicted transcriptional regulator
                          + Term  32844 -  32901   13.5
                          - Term  32822 -  32896   16.8
 34  15 Op  1  .          - CDS   32923 -  33732    774  ## ELI_0472 hypothetical protein
 35  15 Op  2  8/0.000    - CDS   33726 -  34424    212  ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16
 36  15 Op  3  .          - CDS   34421 -  34792    386  ## COG1725 Predicted transcriptional regulators
                          - Prom  34829 -  34888    6.4
                          + Prom  34875 -  34934    5.0
 37  16 Op  1  11/0.000   + CDS   34973 -  35815    809  ## COG1951 Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain
 38  16 Op  2  .          + CDS   35843 -  36397    486  ## COG1838 Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain
 39  16 Op  3  .
+ CDS 36394 - 37107 195 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase Predicted protein(s) >gi|229784101|gb|GG667634.1| GENE 1 3 - 123 133 40 aa, chain - ## HITS:1 COG:no KEGG:Closa_4287 NR:ns ## KEGG: Closa_4287 # Name: not_defined # Def: iron-containing alcohol dehydrogenase # Organism: C.saccharolyticum # Pathway: Glycolysis / Gluconeogenesis [PATH:csh00010]; Tyrosine metabolism [PATH:csh00350]; Pyruvate metabolism [PATH:csh00620]; Chloroalkane and chloroalkene degradation [PATH:csh00625]; Naphthalene degradation [PATH:csh00626]; Butanoate metabolism [PATH:csh00650]; Metabolic pathways [PATH:csh01100]; Biosynthesis of secondary metabolites [PATH:csh01110]; Microbial metabolism in diverse environments [PATH:csh01120] # 1 37 1 37 871 64 86.0 1e-09 MAKKEEQVKVPEIIDNVDALIAKMDAMREAQREFKPFRFQ >gi|229784101|gb|GG667634.1| GENE 2 331 - 1041 759 236 aa, chain - ## HITS:1 COG:no KEGG:Closa_2903 NR:ns ## KEGG: Closa_2903 # Name: not_defined # Def: Crp/Fnr family transcriptional regulator # Organism: C.saccharolyticum # Pathway: Two-component system [PATH:csh02020] # 1 226 1 229 235 251 53.0 2e-65 MNWEELLKRAPKLTDYIKNMPPDIKNRCVIKVIPPGQIIHQKNYELDYFAFVCCGDHRAI NEFENGNVYMIEKNEAIDFVGEVTILAGQERTSVTLETITECVLLQMPRKDFERWIKEDI DLLYLIASKVAFKLYRSSSKNGATLFYPPNFLLLEYLVQYADKHMTGNKTSITVPFTRNQ LEEELGINIKTLNRTVKKLKDTGMIGIVKGKLTFNKDQYTKAIAELAVLRKGANHW >gi|229784101|gb|GG667634.1| GENE 3 1293 - 1496 227 67 aa, chain + ## HITS:1 COG:RSc2466 KEGG:ns NR:ns ## COG: RSc2466 COG1278 # Protein_GI_number: 17547185 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Ralstonia solanacearum # 1 67 1 67 67 81 59.0 3e-16 MNKGTVKWFNSSKGYGFITNDETGEEVFVHFSGIMTDGYKSLEDGQKVTFDTTQGNRGIQ AVNVYAA >gi|229784101|gb|GG667634.1| GENE 4 1761 - 2591 800 276 aa, chain + ## HITS:1 COG:no KEGG:Closa_2902 NR:ns ## KEGG: Closa_2902 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 272 5 275 286 337 64.0 5e-91 
MKKETFQYSSLKERIKTEYRVYILAFIFILIADSIGQIKIPLGPGMLILFPIFYSLFMGI LSGPQILKIMKEKEVTAASKLVIVAICPFIAKLGINAGASLSTVIQAGPALLLQEIGNLG TIFLALPFALLLGLKREAIGATHSINRETNLALITDMYGPDSAEARGSLSIYIVGGMIGT IYFGFMATMVAALNIFHPYALGMASGVGAGIMMASATASLTEIYPAMADKISALASTSET LSGITGIYVAIFIGIPLCNKLYAFLEPKIARLRGEK >gi|229784101|gb|GG667634.1| GENE 5 2607 - 3053 583 148 aa, chain + ## HITS:1 COG:no KEGG:LC705_00064 NR:ns ## KEGG: LC705_00064 # Name: not_defined # Def: hypothetical protein # Organism: L.rhamnosus_Lc705 # Pathway: not_defined # 1 148 1 148 148 122 51.0 4e-27 MRYKEWGFLLIIACLGIMTANFVGFKVAFLDSLPGVCVLLVISCIGVVLSKIIPLKLPIV AYCSIVGLLAACPISPVSSFVIESANKINFTAPLTMVGAFAGISIGNEIKAFAKQGWKMI IIALLVMTGTFFGSALVANIILSLTHAI >gi|229784101|gb|GG667634.1| GENE 6 3090 - 4439 839 449 aa, chain + ## HITS:1 COG:MA1276 KEGG:ns NR:ns ## COG: MA1276 COG0402 # Protein_GI_number: 20090140 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanosarcina acetivorans str.C2A # 7 434 13 431 442 204 33.0 4e-52 MKEPVYDILIKNTRILTEDMKIRTGVDVAVKDGIIASICDSGSHVPAEADQTVDGSRLLW MPGLTDGHMHTCQQLLRGKILDALPMIWTRIMLPFESTLTPEAVSLSAALCSLEMIRGGT TAFLDAGGIHMDQAAEVYIKSGLRGALTLSTMDDTKVPDSMRADAAESISRLNEYYDTWN GSGDGRLSVCYSLRSLISCSEELIRGVFEAADERSAMVEAHMNEYPNEINYHLERYQIRP VEYLDSLGVLSERFVSAHSILLSEHEIELMAAHGVKAVHCPFSNCGKGVPNTPRLLESGI SVAFGTDGTAHGGMSLFQEMKIFRSIMNIRYGVTDSNPVIMPAETILSMALKGGSSVLGL HNRAGVLKEGCCADFIAIDLDQPHLYPTNNLVHTLVESVSSQDVIHSVAAGKFLMKNREI LTLDEEKIRFEAHSWLTKNHLMEDELCYR >gi|229784101|gb|GG667634.1| GENE 7 4448 - 5209 780 253 aa, chain + ## HITS:1 COG:no KEGG:LGG_00070 NR:ns ## KEGG: LGG_00070 # Name: not_defined # Def: hypothetical protein # Organism: L.rhamnosus # Pathway: not_defined # 11 245 1 222 228 139 42.0 2e-31 MNLIVGTLVGVSGIAGFLLPIVFTGYLGMDLSLSLALSFISFLTSGIIGSYRYQKQGQMD IKFGVLVGAGSIAGAVLGVKLNFMIPLTTAKMFLYLVVLASGLSILLKKEKPETKSETPQ EAQPASPVYTGLLANKPFVLLFGFITGAICSLSGAGGPILVMPILVTLGMPVRTAVGVSL 
FDSIFIALPACVGYLSNCPREGLLVLCLVSFICQAIGVFFGAGQAGRINVKILKKGVAVF SIAIAVYMIIGLI >gi|229784101|gb|GG667634.1| GENE 8 5444 - 6286 803 280 aa, chain + ## HITS:1 COG:no KEGG:Closa_4288 NR:ns ## KEGG: Closa_4288 # Name: not_defined # Def: Fumble domain protein # Organism: C.saccharolyticum # Pathway: Pantothenate and CoA biosynthesis [PATH:csh00770]; Metabolic pathways [PATH:csh01100] # 1 279 1 279 287 397 82.0 1e-109 MGIIIGIDIGGSTTKIVGFDNNNLKIPTFVKANNPIASLFGAFGKFIYDNAIQLGDIEKI MITGVGSANVEQPLYGIPTFKVDEFTSNGLGGRYFTGLNDLIVVSMGTGTSFVQVNGDKI VHTGGIGIGGGTIIGLSSLLLKTQEISKIMELASGGDLSHVDLQICDISKEALPGLPLTA TASNFGKVRGQVNEEDIAIGLINMVLQAIGKSAILSSLNSEINNFVMIGNLAKFPQCKDI FCSLETMFDVHFIIPEQAEYGTAIGAALTESHHANMTEIL >gi|229784101|gb|GG667634.1| GENE 9 6420 - 6923 526 167 aa, chain - ## HITS:1 COG:no KEGG:Closa_4289 NR:ns ## KEGG: Closa_4289 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 167 1 167 168 261 86.0 8e-69 MENSMMDLWPVDPGYMILGLAVLTLILLIIVIICLIQMRRLYRRYDYFMRGKDAETLEEI IMEQMGDIEALKAEDRANKDSIRNTNKNYRSAFQKFGLVKYNAFKGMGGNLSFAMALLDY TNTGFVLNSVHSREGCYVYIKEVERGETDVLLGSEEKDALEQALGYH >gi|229784101|gb|GG667634.1| GENE 10 6950 - 7891 891 313 aa, chain - ## HITS:1 COG:CAC3729 KEGG:ns NR:ns ## COG: CAC3729 COG1475 # Protein_GI_number: 15896960 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 310 1 282 283 217 46.0 3e-56 MAKRTGLGKGLGAIFGDEVMESAAEEQEAKHQAKSKSAQEPEKKEEENDIGKELMVKVTA IEPNREQPRKDFNEEAMEELAESMKVYGVLQPLLVQKKGDYYEIIAGERRWRAAKLAGLK EVPVVIREYTKQQTMEIALIENVQREDLNPIEEAKAYQRLIQEFELKQEEIAARVGKNRV TITNSMRLLKLDERVQDLLIQNQITGGHARALLSVEDGQLQYELAGKIIAENLSVRETEK LVKSLSKKKNPKEKKVEDESLTLIFHDLEERMKSAMGTKVSINRKDKNKGRVEIEYYSES ELERIVELIESIR >gi|229784101|gb|GG667634.1| GENE 11 7894 - 8664 604 256 aa, chain - ## HITS:1 COG:BH4058 KEGG:ns NR:ns ## COG: BH4058 COG1192 # Protein_GI_number: 15616620 # Func_class: D Cell cycle control, cell division, chromosome partitioning # 
Function: ATPases involved in chromosome partitioning # Organism: Bacillus halodurans # 1 253 1 253 253 309 60.0 4e-84 MGRIIAIANQKGGVGKTTTAINLSACLAESGQKVLTVDFDPQGNATSGLGIEKGEIDKTV YDLLVGECDIEECLISNMQENLDLLPSNVDLAGAEIELLEIENKEALLKTYLSKIQNNYD FIIIDCPPSLNLLTINALTAANTVLVPIQCEYYALEGLNQVLKTVNLVKKKLNPSLEMEG VVFTMYDARTNLSLEVVESVKNNLNQNIYKTIIPRNVRLAEAPSHGMPINLYDSRSAGAE SYRLLAAEVISRGEDI >gi|229784101|gb|GG667634.1| GENE 12 8874 - 9857 462 327 aa, chain + ## HITS:1 COG:alr3514 KEGG:ns NR:ns ## COG: alr3514 COG0596 # Protein_GI_number: 17231006 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Nostoc sp. PCC 7120 # 66 252 34 219 300 72 21.0 1e-12 MYGLVHGMEEIMKTKNKLLTLLTLAAGAAAATAVINKCIKVSATSKNILAESKSLCYKWR FGNIHYTKTGSGKPILLVHDLTAFSSSYEWTQLVNYLKDQYTVYTIDLLGCGRSEKPDLT YTNFLYVQLLSDFIKSEIGHRTNVITSGASAALVIMACNQSSELFDQIMLINPDSIVTCS QIPNKHAKVYKFILDLPIIGTLLYHIASSKQAITEEFTRDYYFNPYSVKSSYVDAYYEAA HLGESPKSLYASVKCNYTKCNIVNALKKIDNSIYIVGGTAMDSIKQLIGEYHEYNPAIES SLISESKYLPQLENPMELYKTIQMFFV >gi|229784101|gb|GG667634.1| GENE 13 9956 - 10687 394 243 aa, chain - ## HITS:1 COG:CAC3732 KEGG:ns NR:ns ## COG: CAC3732 COG0357 # Protein_GI_number: 15896963 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Clostridium acetobutylicum # 4 242 2 238 239 223 46.0 2e-58 MTAEFKDRLSRELNQFSIILENSQINQFYQYYELLDEWNKVMNLTAITDQKEVITKHFVD SLALVKAMGEISTKEYKIIDIGTGAGFPGIPLKIAFPQLKITLMDSLNKRIKFLNEVIEQ LGLKEITAVHSRAEDLGRDKDYREQYDLSVSRAVANLSTLSEYCMPFVKPGGFFISYKSG KIEEELSSAKHAIFLLGGKVNGIESFTLDGAEAERTLIKIEKVSEISKKYPRKAGVPGKE PLK >gi|229784101|gb|GG667634.1| GENE 14 10758 - 12665 1647 635 aa, chain - ## HITS:1 COG:CAC3733 KEGG:ns NR:ns ## COG: CAC3733 COG0445 # Protein_GI_number: 15896964 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme 
apparently involved in cell division # Organism: Clostridium acetobutylicum # 8 621 9 620 626 781 61.0 0 MPNLEESYDIVVVGAGHAGCEAALAAARLGLETIMFTVSVDSIALMPCNPNIGGSSKGHL VRELDAMGGEMGKNIDKTFIQSKMLNESKGPAVHSLRAQADKQNYSREMRNTLENTDHLT IRQAEVTDIIVENGVLTGVKTFSGAVYHCKAAVLATGTYLKARCIYGDVSNATGPNGLQA ANHLTDSLKEHGIEMYRFKTGTPARVDKRSIDFSRMEEQFGDKKIVPFSFSTDPETIQKE QVSCWLTYTSEKTHEIIRANLDRSPLFSGAIEGTGPRYCPSIEDKVVKFPDKNRHQVFVE PEGLYTNEMYLGGMSSSLPEDVQHAMYRTVPGLEHVKIVRNAYAIEYDCINSRQLKATLE FKTISGLFSGGQFNGSSGYEEAAVQGFMAGVNASMKVLGRESFVLDRSQAYIGVLIDDLV TKENHEPYRMMTSRAEYRLLLRQDNADLRLMGIGHEIGLVSDEQYEKLLEKESAIETEIK RLESTNIGASKEVQEFLERNGSTLLKTGTTLAELIRRPELNYVMLTELDSKRPELSVDVI EQVDINIKYDGYIRRQKQQVAQYKKLENKKLDADFDYSSVKSLRKEAVQKLDLYRPMSIG QASRISGVSPADISVLLVHLEQLRYQKRDEKSEEK >gi|229784101|gb|GG667634.1| GENE 15 12683 - 14062 1480 459 aa, chain - ## HITS:1 COG:CAC3734 KEGG:ns NR:ns ## COG: CAC3734 COG0486 # Protein_GI_number: 15896965 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 2 459 3 459 459 384 46.0 1e-106 MKMNTIAAIATAMSSSGIGIVRISGDEAFTIIDKIYRVKGKNSRKLSDAPSHTIHYGYIA DGDEIIDEVLVMLMRGPKSFTAEDTVEIDCHGGVLITKKILETVLKYGARPAEPGEFTKR AFLNGRIDLAQAEAVIDVINSKNDYALKASVSQLDGAVSNAVKKLREQIIYQIAFIESAL DDPEHISLDGYGESLLSVIQGLKEGLWKLIRSADNGRVMTEGIKTVILGKPNAGKSSLLN VLVGEERAIVTDVAGTTRDTLEETIRLEDITLNVIDTAGIRDTDDIVEKIGVEKARNAAD AADLIIYVVDGSCPLDENDEEILKFIKDRKAVVLLNKSDLTMEITADMLASRTAHRVIAI SAKERVGIELLEDEIKTMFYHGEIDFNDEVTITNVRHKNALEQAYSSLEMVEQSIQNGMP EDFYSIDLMDAYEQLGLIIGEAVEDDLVHEIFSKFCMGK >gi|229784101|gb|GG667634.1| GENE 16 14132 - 14800 854 222 aa, chain - ## HITS:1 COG:CAC3735 KEGG:ns NR:ns ## COG: CAC3735 COG1847 # Protein_GI_number: 15896966 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein # Organism: Clostridium acetobutylicum # 1 208 1 208 209 177 52.0 2e-44 MDMITVSAKTVDEAITKALIELETTSDKLEYEIVDKGNSGILGFIGSKPAIIRARKKETI DDKAIAFLGEVFGAMDMGVNIETAFNQEEKELSINLAGDDMGILIGKRGQTLDSLQYLVS 
LVVNRESEDYIRVKLDTENYRERRKETLETLAKNIAYKVKRTRRSISLEPMNPYERRIIH SALQNDKYVVTRSEGEEPFRHVVISLKKDYSKKDRYQDNREK >gi|229784101|gb|GG667634.1| GENE 17 14813 - 16099 1191 428 aa, chain - ## HITS:1 COG:BH1169 KEGG:ns NR:ns ## COG: BH1169 COG0706 # Protein_GI_number: 15613732 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Bacillus halodurans # 24 130 45 153 280 86 43.0 1e-16 MEFLVLTKVGGILGPFATVLGIIMDWLFKLTSIFGIQNIGLCIILFTLVTKLLMFPLTLK QQKSSKLMSIMQPELNAIQAKYKGKKDQESMMKQNVEIQALYEKYGTSMTGGCLQLVIQM PILFALYQVIYHIPAYVSSVKDVFMNVVTSITSLGVDHVGMLTQFATDNKVNMAMVKDMA TNNGLVDFLYQLNPTQWSALQDLFPSIQGTIVENAAKIEHMNSFLGINLASTPFSVMFPN GFTGGFHFSLAILIPILAGLSQWLSARLMTVNQPSTGNDDGNSMAQSMKMMNTMMPLMSV FFCFTFASGIGIYWIASSVFQVIQQVLVNRYMDRIDIDEMVKQNVEKANKKRAKKGLPPQ KVAQNATANLKSIQAANEKEEADRLKKIERTQEQMKASNDYYSSGAANPNSISAKARMVQ KYNEKHSK >gi|229784101|gb|GG667634.1| GENE 18 16127 - 16342 81 71 aa, chain - ## HITS:1 COG:TM1462 KEGG:ns NR:ns ## COG: TM1462 COG0759 # Protein_GI_number: 15644211 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 2 71 1 69 81 90 62.0 8e-19 MVKKIMILLIRGYQKFLSPLKVRTHCIYTPTCSQYAIEALQKYGVFKGTFLACKRILRCH PFAKGGYDPVP >gi|229784101|gb|GG667634.1| GENE 19 16347 - 16694 180 115 aa, chain - ## HITS:1 COG:CAC3738 KEGG:ns NR:ns ## COG: CAC3738 COG0594 # Protein_GI_number: 15896969 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase P protein component # Organism: Clostridium acetobutylicum # 5 103 4 104 119 79 47.0 1e-15 MKHFNSIKKNRDFQEVYQTGKSYANKLLVMYVKKTDRPETRIGISVSKKVGNSVVRHHIT RLLRESFRLHEDMTETGLDIVVVARAAAKEENYHSIESAYLHLCGLHNILKKESK >gi|229784101|gb|GG667634.1| GENE 20 16844 - 16978 181 44 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160882064|ref|YP_001561032.1| ribosomal protein L34 [Clostridium phytofermentans ISDg] # 1 44 1 44 44 74 81 1e-12 MKMTFQPKNRQRSKVHGFRARMSTPGGRKVLASRRAKGRAKLSA 
>gi|229784101|gb|GG667634.1| GENE 21 17284 - 18243 788 319 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870394|ref|ZP_06409748.1| ## NR: gi|288870394|ref|ZP_06409748.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 319 13 331 331 518 100.0 1e-145 MNQYHDMYDRVAKKCLTLSNRAIIQFINGTFGVNHPLDSKVTYNWTEHEDDDLRKTLADT IITIGLSAYHMEFQTTEDGSITFRIFEYGFHHAIKIRREENVLEFPDPVLICLYEGKSEP DERDLTIRFGNSGTYIYKVPVIKYLSLSPKELQNRNMIILIPFQLLKLKKSMEKKRTKEN LDALKYLVTHDIIGSIEVNVKAGNITPTDGRKLKNMTLQLYHHIYDRYQEVSDEGVSEAV EEALILDIDVIEMEHKKELREKEQKIEQKEKEIEEKQKEAEEKGEEVKVLKLLLKGKSPK EAARELNISLEKIEKMINS >gi|229784101|gb|GG667634.1| GENE 22 18653 - 19282 735 209 aa, chain - ## HITS:1 COG:CAC0331 KEGG:ns NR:ns ## COG: CAC0331 COG4684 # Protein_GI_number: 15893623 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 6 209 8 192 192 100 34.0 2e-21 MNIENKTRNLVLAAVFMAIIIIMAFTPFGYIPLGFMNATIIHVPVIIGAIILGPKYGGFL GLVFGITSLWKNTYMPNATSFVFSPFIKIGEYGGNFWSLVICLVPRILIGIVAYYVFRGI MKTLKANRARRSIALAAAAVAGSLTNTLLVMNLIYLCFGKEYGAAAKGLTEGIYTVILGI ICLNGIPEAIVAGVLTVAVTQALFKAMKR >gi|229784101|gb|GG667634.1| GENE 23 19782 - 20252 474 156 aa, chain + ## HITS:1 COG:BS_dnaA KEGG:ns NR:ns ## COG: BS_dnaA COG0593 # Protein_GI_number: 16077069 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Bacillus subtilis # 2 156 1 148 446 112 39.0 4e-25 MLEKLKEKWDEILLNLKEEHEITDVSFKTWLLPLKVHSINGDTVTVTVPDVEFLGYIRKK YGFLLKITIEEVTGFECNVDFVVENQVPQEEVPKAQTGNTLINNAMNTVSQSAIINANLN PKYTFDTFVVGANNNLAHAASLAVAESPGEIYNPLF >gi|229784101|gb|GG667634.1| GENE 24 21224 - 22102 899 292 aa, chain + ## HITS:1 COG:CAC0001 KEGG:ns NR:ns ## COG: CAC0001 COG0593 # Protein_GI_number: 15893299 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Clostridium acetobutylicum # 1 292 159 
446 446 328 59.0 8e-90 MHSIGHFILKNNPNAKVLYVTSEKFTNELIDAIRNKNNFSPTEFRDKYRTNDVLLIDDIQ FIIGKESTQEEFFHTFNALYESKKQIIISSDKPPKEIETLEERLRSRFEWGLTVDIQSPD YETRMAILRKKEEMEGYNIDNEVIKYIATNIKSNIRELEGALTKIVALSRLDNKEITVEL AEEALKDIISPNDKREVTPELVIQVVADHYGITPLDICSQKRNKEIVYPRQIVMYLCREI VGTPLQSIGKYLGGRDHTTIIHGIEKIKADVVKNDSNLINTLDILKKKLSPQ >gi|229784101|gb|GG667634.1| GENE 25 22376 - 23497 1167 373 aa, chain + ## HITS:1 COG:CAC0002 KEGG:ns NR:ns ## COG: CAC0002 COG0592 # Protein_GI_number: 15893300 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Clostridium acetobutylicum # 1 366 1 361 366 224 35.0 3e-58 MKLTFQQDAILNGINIVLKAVPSKTTMSILECIFIDASGSEIKLTANDMELGIETKVEGT IHERGKIALDAKLLSEIVRKLSSAGDSVVTIESDEKFTTTISCEKSVFHIQGRDGEEFAY LPYIERDNYICLSQFTLKEVIRQTIFSIAPNDSNKMMTGELFQVSGSQLKVVSLDGHRIS IRNVELKDTYHDIKVIVPGKTLSEVSKILGGDNEKEVLIFFSSNHILFEFDDTIVVSRLI EGEYFRIDQMLSSDYETKITVNKRELLDSIERATILIRENDKKPLIINVVDQELQLKMNS SFGSMNAEVAAHKTGSDIMIGFNPKFLIDALRIIDDEEVTLYMLNPKSPCFIRDEEGKYI YLILPVNFNAASV >gi|229784101|gb|GG667634.1| GENE 26 23526 - 23735 270 69 aa, chain + ## HITS:1 COG:no KEGG:Closa_0003 NR:ns ## KEGG: Closa_0003 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 69 1 69 101 97 79.0 1e-19 MEIIKLRDEYIKLGQALKAAGLVGSGVDAKYVIEEALVKVNGRTELQRGKKLRAGDIVEY DGKTIKIEE >gi|229784101|gb|GG667634.1| GENE 27 23745 - 24830 903 361 aa, chain + ## HITS:1 COG:CAC0004 KEGG:ns NR:ns ## COG: CAC0004 COG1195 # Protein_GI_number: 15893302 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Clostridium acetobutylicum # 1 349 1 355 363 297 43.0 2e-80 MIIESIELKNYRNYDKLHMDFSHGTNILYGDNAQGKTNILEAIYVCATTKSHRGSKDKEI IQFDRDESHIKLNVRKRDVPYRIDMHLKKNRAKGVAVNGVPIKKASELFGIVNVVFFSPE DLNLIKNGPAERRRFIDLELCQLNKLYVHSLVQYNKIITQRNKLLKDIMFRPDYEETLDI WDMQLVQYGREVIRCREAFVSQLNDLIGTIHRQLSGEKESLHICYEPNVTADMFEDTLRK 
SRPSDLKQRTTLTGPHRDDLSFIINDIDIRRFGSQGQQRTAALSLKLAEIELVKKIVNDY PILLLDDVLSELDGSRQNHLLSGINHIQTMITCTGLEDFVNNRFRIDKIFKVVSGEVYSE N >gi|229784101|gb|GG667634.1| GENE 28 24840 - 26750 2117 636 aa, chain + ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 2 636 5 637 637 811 64.0 0 MSEEYGADQIQILEGLEAVRKRPGMYIGSTSVRGLHHLVYEIVDNAVDEALAGYCDMIDV TINEDNSITVVDDGRGIPVGIQKKAGIPAVEVAFTILHAGGKFGGGGYKVSGGLHGVGAS VVNALSNWLEVEITQDGKTYKQRYEKGKTMYPLKVIREAGPDEHGTKVTFLPDETIFEET VFDYDTLKIRLRETAFLTKNLKIVLRDNREEKHEHVFHYEGGIKEFVTYLNRGKTPLYDQ VFYCEGSKDGVYVEVSMQHNDSYTENIYSFVNNINTPEGGTHLSGFKNALTNTLNDYARN NKLLKENEANLSGEDIREGLTAIISVKIEEPQFEGQTKQKLGNSEARGAVDNILREQFTY FLEQNPSVAKTICDKSIMAQRARAAARKARDLTRRKTALEGMALPGKLADCSDKNPENCE IYIVEGDSAGGSAKTARSRDTQAILPLRGKILNVEKARLDRIYSNAEIKAMITAFGTGIH EDFDISKLRYNKIIIMTDADVDGAHISTLLLTFLYRFMPELIKQGHVYLAQPPLYKVEKN KKVWYAYSDEELNNILKEIGRDTNNKIQRYKGLGEMDAEQLWETTMDPERRILLRVAFDE ETASEIDLTFTTLMGDQVEPRREFIEANAKYGNLDI >gi|229784101|gb|GG667634.1| GENE 29 26787 - 29309 2617 840 aa, chain + ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Clostridium acetobutylicum # 8 840 6 830 830 918 57.0 0 MEPNIFDKVHEVDLKKTMEKSYIDYAMSVIAARALPDVRDGLKPVQRRVLFSMIELNNGP DKPHRKCARIVGDTMGKYHPHGDSSIYGALVNMAQEWSTRYPLVDGHGNFGSVDGDGAAA MRYTEARLSKVSMEMLADIGKNTVDFTPNFDETEKEPVVLPARFPNLLVNGTSGIAVGMA TNIPPHNLREVIGAVVKIIDNHVEENRETEMEEVLEIIKGPDFPTGATILGKMGIEEAYR TGRGKIRVRAVTDIETLPNGKSRIIVTELPYMVNKARLIEKMAELAKEKKIDGITDLRDE SDRSGMRIVIELRRDVNANVILNQLLKHTQLQDTFGVIMLALVNNEPRVLNLMQMLKLYL EHQKEVVTRRTQYDLNKAEERAHILQGLLIALDNIDEVIKIIRNSQNVQIAKAELMERFG LSDAQAQAIVDMRLRALTGLEREKLENEFKELEAKIEELRAILADEKKLLGVIREEIMLI 
SDKYGDDRRTSIGFDEFDMSMEDLIPDENTIVAMTKLGYIKRMSVDNFKSQNRGGKGIKG MQTIDQDYIEDLIMTTNHHYIMFFTNTGRVYRLKTYAIPEGSRTARGTAIINLLQLLPGE SITAIIPMKEYDDDKFLFMATKNGMVKKTPMMEYANVRKNGLQAIVLREDDELIEVKATD NSKDIFLITKMGQCIRFNETDVRVTGRVSMGVIGMKLNDDDEVVGMQMDTQGEKLLIVSE NGMGKRTPIDEFTPQKRGGKGVLCYKITEKTGKIVGAKLVHDDHDIMIITNEGIVIRISV EDISVIGRNTSGVKLMNIDQESDIRVASIAKVRDDGSKSEGEGLEDLDLDDVETETETEE >gi|229784101|gb|GG667634.1| GENE 30 29641 - 30972 1318 443 aa, chain + ## HITS:1 COG:CAC0460 KEGG:ns NR:ns ## COG: CAC0460 COG1253 # Protein_GI_number: 15893751 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Clostridium acetobutylicum # 1 438 1 440 443 365 46.0 1e-101 MDSDPGAATIAAQILLLTALTMMNAFFAGAEMAVVSVNKNRIRMLADNGNKKAALIQKLS ENSTGFLSTIQVAITFAGFFSSASAATGISQILGNRMAQSGIPYSQSIAMVLVTIILSYF NLVFGELVPKRVALQKAEAFSLFAVKPIAIISRIMAPFIALLSVSTNGVLRLLGLKTENL EEEVSEEEIRSMLQTGRESGVFNEIEQDMITSIFLFDDKKAREIMTPRQDMIAVDLSTPL LSSMDEILNSRHSRIPVYEDEIDNIVGILFMKDFIIEANKKPVREIDIRAIMQKPYFIPE NRKTDKLFQEMQKNKLKMAVLVDEYGGVSGIVTMEDLIEEIVGDLHDEYEDVEPEITELE PHVYQAAGSILLYDLNEVLHEEIESSCDTLSGYLIERLGYIPNEKQMPIELYEDGIHYTI LKMNEKVIKTVKLELNREKETGG >gi|229784101|gb|GG667634.1| GENE 31 30975 - 32111 1390 378 aa, chain + ## HITS:1 COG:SPy1184 KEGG:ns NR:ns ## COG: SPy1184 COG1883 # Protein_GI_number: 15675155 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Streptococcus pyogenes M1 GAS # 1 372 1 370 373 401 65.0 1e-111 MEFLLEGIMSVTWQQILMYLVGIILIWLAIKKEYEPALLLPMGFGAILVNLPNTGVINQV LSGVGDTNGIIEWLFHVGIEASEALPILLFIGIGAMIDFGPLLSQPVMFLFGAAAQLGIF IAICAAVLLGFDLKDAASIGIIGAADGPTSILVSQILHSKYIGAIAVAAYSYMALVPIVQ PFAIKLVTTKKERTIHMEYKPAAVSKTMRIAFPIVVTIIVGMIAPMSVALVGFLMFGNLL RECGVLSNLSDTAQNGLANLITLLLGITISFSMRADAFVRIDTLMIMGIGLIAFVFDTIG GVLFAKLLNLFRKEKINPMIGAAGISAFPMSARVVQKMAQKEEPGNIILMHAVGANVSGQ IASVIAGGLVISLVTRFL >gi|229784101|gb|GG667634.1| GENE 32 32124 - 32240 222 38 aa, chain + 
## HITS:0 COG:no KEGG:no NR:no MMENLLIALEIMGKGMGGIFVSILIIMGCVIIMGKILK >gi|229784101|gb|GG667634.1| GENE 33 32378 - 32791 326 137 aa, chain + ## HITS:1 COG:SMc01160 KEGG:ns NR:ns ## COG: SMc01160 COG1959 # Protein_GI_number: 15964112 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Sinorhizobium meliloti # 3 130 5 130 152 67 32.0 1e-11 MTSEFAIAVHALVYLNHRQQTVSSEALASNVCTNPARVRKVMAKLKKAGIISTKEGLEGG YHFEKDSSCVNLRQVCDALEVDFVSASWKSGDKEMDCMIASGMAAIMDEIYADLDELCRK RLESITIASIEAKIFKS >gi|229784101|gb|GG667634.1| GENE 34 32923 - 33732 774 269 aa, chain - ## HITS:1 COG:no KEGG:ELI_0472 NR:ns ## KEGG: ELI_0472 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 269 1 268 268 104 32.0 4e-21 MLGKVFKHEMKSTSRLFLPLMIGFIAITLLCKFAFESSYSAFLGNSRLMETITVIFFVLY FIYIIALFVMTSVFIVMHFYKTMVSDQGYLTNTLPVKMGTLINAKLLSAVLWEILAGLLF ILSIFIFFTGHLHLTDLQQIFRDIGTLYQEVSKYLNMPVFLIEVTITCIAGLISGPLMLY AAIALGHLFKKHRVLWAIISYFAIYVVMQIISSIYFSICGYSSPVISNSEYAVQTVKNYM LFTTIFSVACTAGFYAITDYVFTKKLNLE >gi|229784101|gb|GG667634.1| GENE 35 33726 - 34424 212 232 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 222 1 231 305 86 27 3e-16 MNSILQCSHITKKFGSLTALDDVNFTIEPGRIVGLLGPNGSGKSTFIKLATGLLTPTSGT ITICDLPVSVETKRRVSYLPDKNYLNDGLRINQILDYFHDFYDNFDLNKAYEMLKRLNID PKSRLKTLSKGTKEKVQLILVMSRNADLYLLDEPIGGVDPAARDYILNTILSNYNEQGSI IISTHLISDIEPVLDDIIFIKDGKIVLTSSVDAIREENGKSVDALFREVFKC >gi|229784101|gb|GG667634.1| GENE 36 34421 - 34792 386 123 aa, chain - ## HITS:1 COG:SP1714 KEGG:ns NR:ns ## COG: SP1714 COG1725 # Protein_GI_number: 15901548 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 1 104 1 104 121 108 48.0 3e-24 MTWHFENDRPIYTQLLEQIRLRIISGIYAAGSRLPSVRELAGEASVNPNTMQKALSELEQ SGIIYSQRTSGRFVTEDTAMIEHVKEEIAEEKIRDFFKVMNSLGLKPGDTLKLVEKAAKE MSQ >gi|229784101|gb|GG667634.1| GENE 37 34973 - 35815 809 
280 aa, chain + ## HITS:1 COG:CAC3091 KEGG:ns NR:ns ## COG: CAC3091 COG1951 # Protein_GI_number: 15896342 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain # Organism: Clostridium acetobutylicum # 2 275 3 277 282 325 57.0 8e-89 MIRTIQSEEITRNIKEMCIEANHVLSDDMKKVFCDAVGKERSPLGRQILSQLEENLKIAG EDMIPICQDTGMAVIFLKIGQDVHIEGGSLADAVNQGVHDGYVEGYLRKSVVEPVKRINT KDNTPAVIHYEITDGDQIDITVAPKGFGSENMSRIFMLKPADGIDGIKEAVLTAVRDAGP NACPPMVIGVGIGGTFEKCAEMAKHALTRNLDEQPSSGYVGELETELLEKINSLGIGPGG LGGSITALAVNIETYPTHIAGLPVAVNICCHVNRHSHRII >gi|229784101|gb|GG667634.1| GENE 38 35843 - 36397 486 184 aa, chain + ## HITS:1 COG:CAC3090 KEGG:ns NR:ns ## COG: CAC3090 COG1838 # Protein_GI_number: 15896341 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain # Organism: Clostridium acetobutylicum # 1 180 1 180 187 216 57.0 2e-56 MDRYMKAPITKEDAASLQAGDYVYLTGIIYTARDAAHKRMQETLDQGQELPFEISGNMIY YMGPSPAREGRPIGSAGPTTASRMDRYTPTLLDMGLGGMIGKGKRSKEVVDAIVRNGSVY FAAVGGAGALLSKCILSSEVVAYDDLGTEAIRRLEIKDFPVIVVIDSKGNNLYETAIQEY RREP >gi|229784101|gb|GG667634.1| GENE 39 36394 - 37107 195 237 aa, chain + ## HITS:1 COG:CAC0965 KEGG:ns NR:ns ## COG: CAC0965 COG0204 # Protein_GI_number: 15894252 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Clostridium acetobutylicum # 34 232 37 235 241 104 33.0 2e-22 MNRIAYMVLRNLFRAPIWFYHIWKLGRSGDTHTEQQRYDYIRNVVKTVNRTGRVEVEVHG IEHLPSQNGFILFPNHQGLFDVLAIMDSCPNPLGIVVKKEASNIILIKQVIAAMHGMSID RQDVKASLKVITKMTEDVKQGRNYVIFAEGTRSREGNHLLAFKGGTFKSAVNARCPIVPV ALIDCFKPFDVNSIRKQKVQVRFLEPLLPEQYVGLRTIQIADIVHDKIQQEINENMG Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:22:29 2011 Seq name: gi|229784100|gb|GG667635.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld28, whole genome shotgun sequence Length of sequence - 34962 bp Number of 
predicted genes - 53, with homology - 45
Number of transcription units - 19, operones - 11 average op.length - 4.1

  N Tu/Op      Conserved  S       Start      End  Score
               pairs(N/Pv)
  1   1 Op  1  .          - CDS       1 -    397    171  ## EUBELI_00925 hypothetical protein
                          - Term    408 -    439   -0.6
  2   1 Op  2  .          - CDS     480 -    953    414  ## LLKF_1531 phage antirepressor
  3   1 Op  3  .          - CDS     950 -   1270    182  ## gi|266620996|ref|ZP_06113931.1| conserved hypothetical protein
  4   1 Op  4  .          - CDS    1257 -   2276    764  ## bpr_IV157 hypothetical protein
  5   1 Op  5  .          - CDS    2273 -   2557     88  ## gi|266620998|ref|ZP_06113933.1| conserved hypothetical protein
  6   1 Op  6  .          - CDS    2572 -   2733     69  ## gi|288870396|ref|ZP_06113934.2| conserved hypothetical protein
  7   1 Op  7  .          - CDS    2748 -   3299    234  ## gi|266621000|ref|ZP_06113935.1| conserved hypothetical protein
  8   1 Op  8  .          - CDS    3296 -   3475    190  ## gi|266621001|ref|ZP_06113936.1| conserved hypothetical protein
  9   1 Op  9  .          - CDS    3465 -   3629     82  ## gi|266621002|ref|ZP_06113937.1| conserved hypothetical protein
 10   1 Op 10  .          - CDS    3638 -   3787     58  ##
 11   1 Op 11  .          - CDS    3775 -   4833    403  ## Cthe_1739 SNF2-related protein
 12   2 Op  1  .          - CDS    4995 -   5288    233  ## gi|266621004|ref|ZP_06113939.1| putative L-tyrosine decarboxylase
 13   2 Op  2  .          - CDS    5285 -   6187    219  ## bpr_II162 hypothetical protein
                          - Term   6197 -   6240    6.6
 14   3 Op  1  .          - CDS    6250 -   6720    442  ## gi|266621006|ref|ZP_06113941.1| conserved hypothetical protein
 15   3 Op  2  .          - CDS    6744 -  11090   2000  ## gi|266621007|ref|ZP_06113942.1| conserved hypothetical protein
 16   3 Op  3  .          - CDS   11099 -  11332    242  ## gi|266621008|ref|ZP_06113943.1| potassium/ion transporter, voltage-gated ion channel superfamily
 17   3 Op  4  .          - CDS   11332 -  11544    139  ## gi|266621009|ref|ZP_06113944.1| putative methionyl-tRNA synthetase
 18   3 Op  5  .          - CDS   11554 -  12102    256  ## gi|266621010|ref|ZP_06113945.1| hypothetical protein CLOSTHATH_02150
 19   3 Op  6  .          - CDS   12102 -  12698    624  ## gi|266621011|ref|ZP_06113946.1| hypothetical protein CLOSTHATH_02151
 20   3 Op  7  .          - CDS   12691 -  12852    181  ## gi|266621012|ref|ZP_06113947.1| conserved hypothetical protein
 21   3 Op  8  .          - CDS   12852 -  12947    100  ##
 22   3 Op  9  .          - CDS   12964 -  13536    379  ## COG3773 Cell wall hydrolyses involved in spore germination
 23   3 Op 10  .          - CDS   13557 -  14480    979  ## gi|266621014|ref|ZP_06113949.1| conserved hypothetical protein
 24   3 Op 11  .          - CDS   14482 -  14742    301  ## gi|323485209|ref|ZP_08090560.1| hypothetical protein HMPREF9474_02311
 25   3 Op 12  .          - CDS   14754 -  15353    284  ## gi|288870398|ref|ZP_06113951.2| conserved hypothetical protein
 26   3 Op 13  .          - CDS   15368 -  15559    208  ## gi|266621017|ref|ZP_06113952.1| conserved hypothetical protein
                          - Prom  15579 -  15638    2.3
                          - Term  15571 -  15616    7.9
 27   4 Op  1  .          - CDS   15640 -  15921    429  ## gi|288870399|ref|ZP_06113953.2| conserved hypothetical protein
 28   4 Op  2  .          - CDS   15990 -  16763    338  ## gi|266621019|ref|ZP_06113954.1| hypothetical protein CLOSTHATH_02159
                          - Term  16779 -  16807    1.3
 29   4 Op  3  .          - CDS   16812 -  17597    432  ## gi|288870400|ref|ZP_06113955.2| hypothetical protein CLOSTHATH_02160
                          - Prom  17637 -  17696    4.1
                          - Term  17712 -  17753    3.2
 30   5 Tu  1  .          - CDS   17960 -  18085    125  ##
                          - Prom  18125 -  18184    2.5
 31   6 Op  1  .          - CDS   18186 -  18998    563  ## gi|266621022|ref|ZP_06113957.1| hypothetical protein CLOSTHATH_02162
 32   6 Op  2  .          - CDS   19017 -  19334    174  ## gi|266621023|ref|ZP_06113958.1| conserved hypothetical protein
                          - Prom  19434 -  19493    1.9
                          - TRNA  19364 -  19438   55.3  # Undet ??? 0 0
                          - Term  19462 -  19500    4.3
 33   7 Op  1  .          - CDS   19503 -  19721    221  ## gi|266621024|ref|ZP_06113959.1| CTP synthase
 34   7 Op  2  .          - CDS   19797 -  20030    168  ## gi|266621025|ref|ZP_06113960.1| inner membrane transport protein YqeG
                          - Prom  20065 -  20124    3.1
                          - Term  20466 -  20512    9.1
 35   8 Op  1  .          - CDS   20529 -  20777    346  ## gi|288870402|ref|ZP_06113962.2| DNA-directed RNA polymerase I subunit RPA12
 36   8 Op  2  .          - CDS   20764 -  20913     68  ##
 37   8 Op  3  .          - CDS   20951 -  21367    351  ## gi|266621028|ref|ZP_06113963.1| hypothetical protein CLOSTHATH_02168
                          - Prom  21397 -  21456    5.7
                          - Term  21652 -  21705    1.3
 38   9 Tu  1  .          - CDS   21774 -  21938    120  ##
                          - Prom  22020 -  22079    5.8
                          + Prom  21693 -  21752    5.3
 39  10 Tu  1  .          + CDS   21976 -  22200     77  ##
                          + Term  22401 -  22460   -0.2
 40  11 Op  1  .          - CDS   22181 -  22600    191  ## gi|266621032|ref|ZP_06113967.1| conserved hypothetical protein
 41  11 Op  2  .          - CDS   22600 -  22851    176  ## gi|266621033|ref|ZP_06113968.1| putative glycogen synthase
                          - Prom  22875 -  22934    4.0
 42  12 Tu  1  .          - CDS   22946 -  23110     92  ## gi|266621034|ref|ZP_06113969.1| conserved hypothetical protein
                          - Prom  23169 -  23228    2.2
 43  13 Tu  1  .          + CDS   23793 -  24836    139  ## EUBREC_2144 hypothetical protein
                          + Term  24885 -  24927    0.8
                          - Term  24581 -  24616   -0.6
 44  14 Op  1  .          - CDS   24851 -  24988     79  ##
 45  14 Op  2  .          - CDS   24982 -  25899    374  ## COG0677 UDP-N-acetyl-D-mannosaminuronate dehydrogenase
 46  14 Op  3  .          - CDS   25912 -  27555    298  ## Bcell_3907 heparinase II/III
                          - Prom  27597 -  27656    1.7
 47  15 Op  1  12/0.000   - CDS   27693 -  28913    357  ## COG0438 Glycosyltransferase
 48  15 Op  2  .          - CDS   29647 -  31077   1193  ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis
                          - Prom  31191 -  31250    7.5
                          + Prom  31192 -  31251    5.2
 49  16 Tu  1  .          + CDS   31337 -  31639    308  ## gi|288870406|ref|ZP_06113975.2| toxin-antitoxin system, antitoxin component, Xre family
 50  17 Tu  1  .          - CDS   31610 -  31756     72  ##
                          - Prom  31806 -  31865    7.6
 51  18 Tu  1  .          - CDS   31933 -  32436    494  ## COG4708 Predicted membrane protein
                          - Prom  32492 -  32551    3.0
                          - Term  32513 -  32550   -0.8
 52  19 Op  1  .          - CDS   32584 -  33624    865  ## COG3590 Predicted metalloendopeptidase
 53  19 Op  2  .
- CDS 33563 - 34612 777 ## COG3590 Predicted metalloendopeptidase - Prom 34785 - 34844 6.8 Predicted protein(s) >gi|229784100|gb|GG667635.1| GENE 1 1 - 397 171 132 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00925 NR:ns ## KEGG: EUBELI_00925 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 129 1 115 206 124 51.0 1e-27 MIKLEHVVLASPEQMKFIIEGMRNPMNSWEKSDSRTCRQDGAFCMECEHKNNYCLGENDH SLMQRLSNAGTDHRKFMRMLPVYVRITAPLYWWKEFDTYKVGTVANSCSTMHKIQAKEFT MDDFSCEHLMGG >gi|229784100|gb|GG667635.1| GENE 2 480 - 953 414 157 aa, chain - ## HITS:1 COG:no KEGG:LLKF_1531 NR:ns ## KEGG: LLKF_1531 # Name: pp265 # Def: phage antirepressor # Organism: L.lactis_KF147 # Pathway: not_defined # 15 119 9 111 258 67 36.0 2e-10 MSKEIKIAGSISFGGKRLNVYGDLDAPLFKAKDISHAIGYSSGNEWRMLEMCEEDEKLKL PLVVAGQRRSVNFVTENGLYNILAQSRMEIARSWRRVVHDELINMRKEKGRNIAEQFEEW DHAMDNIYFDEETGQLMQSVTVPGGDVIQIPYEKEEE >gi|229784100|gb|GG667635.1| GENE 3 950 - 1270 182 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620996|ref|ZP_06113931.1| ## NR: gi|266620996|ref|ZP_06113931.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein [butyrate-producing bacterium SM4/1] # 1 106 1 106 106 174 100.0 2e-42 MDRNRFIQCMKSNIELSDKERRRIIRRSVESQPWKLKCTIAMEEFAELTQAISKQIRGYD NRIGLLEEMADAYICLEFLKSIFNITPEELQKAMDVKLQRERNKQR >gi|229784100|gb|GG667635.1| GENE 4 1257 - 2276 764 339 aa, chain - ## HITS:1 COG:no KEGG:bpr_IV157 NR:ns ## KEGG: bpr_IV157 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 5 99 7 101 115 124 61.0 8e-27 MKQNIIAVDFDGTLCENKWPEIGMPNEELIEYLKKRQANGEKLILWTSRNEEQTKKAVEW CKKYGLIFDAVNDNLPEIVEAFGGNCRKIFANEYIDDRNRSIGSCRERSNLERWAENEVA IACRREKPDRKDGEWDYGCACYESALKAFGSLCEDGHSGFSIGLTKAILNRLINNKPLLP IEDTDEVWSDISDMSGLKGEERNYQCKRMSSLFKYVYADGTVKYRDVDRYHGVNINCPDD PYHSGLIDTVMDELYPITMPYMPADRAFKIYTEDFLVDPAKGDYDTVGILYVITPSMDKV 
AINRYFKEAPNGFAEIDEAEYKERKEAAKARMEATDGSK >gi|229784100|gb|GG667635.1| GENE 5 2273 - 2557 88 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266620998|ref|ZP_06113933.1| ## NR: gi|266620998|ref|ZP_06113933.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 94 7 100 100 188 98.0 1e-46 MVNAEIEIKHETRLCKVNGKYGIFHLWEEQYTRPIIDELRLIPTEIGSQVFGIVEFADCV KRVQPDEIIFCDEQSDYLAQLNGVHDGTKKGENK >gi|229784100|gb|GG667635.1| GENE 6 2572 - 2733 69 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870396|ref|ZP_06113934.2| ## NR: gi|288870396|ref|ZP_06113934.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 53 7 59 59 93 100.0 4e-18 MNRTTKINILAYASEPDKNYKYEGDIVDYKGKRYFVSLAEERVEFIGIIKDKS >gi|229784100|gb|GG667635.1| GENE 7 2748 - 3299 234 183 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621000|ref|ZP_06113935.1| ## NR: gi|266621000|ref|ZP_06113935.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 183 1 183 183 340 100.0 4e-92 MSFQYDQYLARHRANVKRGFDWLSKNLPGLMTNTLTAGWNTEFAHDQSKNEPDEYEAYDA YFYGNNRSYEVVQRYQRAWLLHIHRNPHHWQHWILIHDDMEDGELETVLEMPYDYIIEMI CDWWSFSWQSGNLYEIFKWYEEHSKYIKLAQTTKITVEYILDNMKKKLQALQYADQSAMQ PGA >gi|229784100|gb|GG667635.1| GENE 8 3296 - 3475 190 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621001|ref|ZP_06113936.1| ## NR: gi|266621001|ref|ZP_06113936.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 59 1 59 59 91 100.0 2e-17 MVDSILVSVDFSNKNDTGVMVVGRKRMNQSVEIINAFQGDEARELYEKLVTKKKKEGQK >gi|229784100|gb|GG667635.1| GENE 9 3465 - 3629 82 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621002|ref|ZP_06113937.1| ## NR: gi|266621002|ref|ZP_06113937.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical 
protein [Clostridium hathewayi DSM 13479] # 1 54 1 54 54 91 100.0 1e-17 MENIYKEVDFKTYCKTCEHKDLEEKFDPCNDCLAEPMNVNSDKPIYWKEAENGR >gi|229784100|gb|GG667635.1| GENE 10 3638 - 3787 58 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METMIYNLWIFLKILSIKLKSMSAEDFYSLLIECDYQQRLYAILLRYYM >gi|229784100|gb|GG667635.1| GENE 11 3775 - 4833 403 352 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1739 NR:ns ## KEGG: Cthe_1739 # Name: not_defined # Def: SNF2-related protein # Organism: C.thermocellum # Pathway: not_defined # 10 349 8 372 375 65 25.0 4e-09 MDDVNIKDLYIITTARKRDTFEWEEELSPFLLSTNKKENLYTNKVVIDSWNNIKKYADVK DAFFIFDEQRVIGSGTWVKAFLKIAKVNEWILLSATPGDTWQDYIPVFVANGFYKNRSEF TREHIVYSRFSKFPKVDRYLNTGRLIRLRNKILVNMDFKRQTVSHHEDIYVKYNIERYKD VGKTRWDPFKKEPIINAAGLCYVWRKIVNTDQSRQIALLEIVEKHPKAIIFYNFDYELEL LKEIFSGYEVREWNGHKHQPVPTSDAWVYLVQYNAGAEGWNCITTDTIIFYSQNYSYKIM AQSAGRIDRMNTPYTDLYYYHLKSRSGIDLAISKALKDKKTFNETRFVKWRQ >gi|229784100|gb|GG667635.1| GENE 12 4995 - 5288 233 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621004|ref|ZP_06113939.1| ## NR: gi|266621004|ref|ZP_06113939.1| putative L-tyrosine decarboxylase [Clostridium hathewayi DSM 13479] putative L-tyrosine decarboxylase [Clostridium hathewayi DSM 13479] # 1 97 1 97 97 184 100.0 2e-45 MKKRYSIPKEQCTCGISELYDNVAKIIGISNVSKAVYDCRKLSITKKVLDCLYKFYHSEN QSDETITTCMLLYGPKADLDGDGYEVEVEDGFVTKGV >gi|229784100|gb|GG667635.1| GENE 13 5285 - 6187 219 300 aa, chain - ## HITS:1 COG:no KEGG:bpr_II162 NR:ns ## KEGG: bpr_II162 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 56 293 1 243 244 145 32.0 2e-33 MFWNKKKPKSKPQIKTTVPKTFKAKEPPPKWQPTFGETKKKGEKPPEVTTKSEPKIDWED KFLKSFQKLTYRRRAWDVWRDYILLHACSISNVLDKDNYDQREKLYLKIIHQYSKEEQAI FPELAAYTTMALDRNQEQDFLGKMFMRLDLGNRSAGQFFTPYHVCELMAEVVATNALEKI EQYGYISINDPCCGAGATLIAGVHVIRKQLEHCEPPRNYQNHILVVAQDVDEIVGLMCYI QISLLGLAGFIKIGNSITDPISTDDSSEKYWYTPMYFSDVWSTRRMLRQINKLFGKGDDE >gi|229784100|gb|GG667635.1| GENE 14 6250 - 6720 442 156 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|266621006|ref|ZP_06113941.1| ## NR: gi|266621006|ref|ZP_06113941.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 156 1 156 156 299 100.0 5e-80 MARVNELIIENARIMFRNFRGEETKYNRAGNRNFCVVIPDADQAQKLGEDGWNVRILPPR DEDEAPLHYIQVAVRFDNIPPNVYMVTRRAKTKLDEESVSSLDYAEIRNVDLVISPSKWE VNGKSGIKAYLKTMYVTIEEDVFAEKYADEEEPPFA >gi|229784100|gb|GG667635.1| GENE 15 6744 - 11090 2000 1448 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621007|ref|ZP_06113942.1| ## NR: gi|266621007|ref|ZP_06113942.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 1448 1 1448 1448 2939 100.0 0 MIDFMVISTRSTKRGVIEIYPKFIIKKSTDLMIRGGDFYAIWIEERGLWSTDEQDALQLI DRELDRYAEENRQRFNSDIKVLHMWDAESGMIDSWHKYCQKQMRDSFHTLDDKLIFSNTE TNKKDYASKKLNYPLEAGDLSAYEKLMSTLYSEEERTKIEWAIGSIVSGESKKLQKFMVL YGAAGTGKSTVLNIIQQLFDGYYSVFDAKALGSSSNSFALEAFKTNPLVAIQHDGDLSRI EDNTRLNSLVSHELMTVNEKFKSTYSNRFKCFLFMGTNKPVKITDAKSGLIRRLIDVSPS GNKLNPKEYKTIVKQVEFELGAIAYHCQEVYSNNPGRYDDYIPITMLGASNDFYNFIIDS YHVFKKENGTTLKAAWEMYKTYCDDAKVGFPFSQRVFKEELKNYFHDFQERFNLDDGTRV RSYYIGFRTEKFEEETVEEKAEVVKPALIQFDSTESIFDDVCSECPAQYASENETPQKKW DSVRTKLSGIDTKKLHYVKVPENHIVIDFDIPDESGNKSFEKNLAEASKWPPTYAELSKS GQGIHLHYIYTGDPTQLSRVYDDHVEVKVFTGKSSLRRMLSKCNNLPIATISSGLPLKGE QKMVNFEAIKSEKGLRTLIKRNLNKEIHPGTKPSIDFIYKILEDAYESDLKYDVTDMRNA VLAFAANSTHQADYCIKLVNKMQFKSADPSTAVKNDDAKLVFYDIEVFPNLFLVNWKIEG EGKPVVRMINPSPSEIEDLMRFRLVGFNCRRYDNHILYARLMGYTNEQLYNLSQKIINGS PNCFFGEAYNVSYTDVYDFASAGNKKSLKKLEIEMGNLTDDDLKKKGFSDEKIRIIKAGT HHQELGLPWDQPVPEELWIKVAEYCDNDVIATEAAFNYLEADWTARQILADLAEMTVNDT TNSLTTRIIFGNNRKPQSEFHYRNLAEPVESLDKESMDFLKEACPKMMEEPHYGWKYNDK NEVPFESHSILPYFPGYVFDHGKSTYRGEEVGEGGFAQGVPGMYGNVALLDISSMHPHSA IAEVLFGPRFTKAFRDIVEGRVSIKHEAWDIVNTMLDGKLTPYIQRVIDGEMTSKDLANA LKTAINSVYGLTSASFDNPFRDPRNIDNIVAKRGALFMIDLKNEVLKRGFQVAHIKTDSI KIPDATPEIIQFVMDFGERYGYTFEHEATYDRMCLVNDAVYIAKYKSAEECQKIYGYVPG 
DNKKKGGKWTATGTQFQIPYVFKKLFSREDIAFEDMCETKSVSSSLYLDLNEELPDVSKE EKEFSKAESDYKKGLLSDTTFESTCQKLTPLIEKGHDYHFIGKVGQFCPMKDGYGAGLLM REKDGRYYAATGSKGYRWMESEMVKELGKEDGIDRSYYDKLVDEAVKTISQYGDFEWFVS DDPYIPELGANDADVDCVVPWAMPCGEDKYRTCFDCPHFNNDNFHMDCNLDYDIADIVMK HAMNPPEN >gi|229784100|gb|GG667635.1| GENE 16 11099 - 11332 242 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621008|ref|ZP_06113943.1| ## NR: gi|266621008|ref|ZP_06113943.1| potassium/ion transporter, voltage-gated ion channel superfamily [Clostridium hathewayi DSM 13479] potassium/ion transporter, voltage-gated ion channel superfamily [Clostridium hathewayi DSM 13479] # 1 77 1 77 77 112 100.0 7e-24 MGEMLTYIFSSLRSSEKRLDVVTRAVSKQRSFNKQLTIFAVMTTANFVVMKIEQKDQALR IRKLEKEIEELKRPEGE >gi|229784100|gb|GG667635.1| GENE 17 11332 - 11544 139 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621009|ref|ZP_06113944.1| ## NR: gi|266621009|ref|ZP_06113944.1| putative methionyl-tRNA synthetase [Clostridium hathewayi DSM 13479] putative methionyl-tRNA synthetase [Clostridium hathewayi DSM 13479] # 1 70 1 70 70 119 100.0 9e-26 MSHSEVYKWFELYFPQYAGDNVETWFQNGKNSIRIRQKNHQEFIFTFNNEGNWRFETVES FMNGLRGGKK >gi|229784100|gb|GG667635.1| GENE 18 11554 - 12102 256 182 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621010|ref|ZP_06113945.1| ## NR: gi|266621010|ref|ZP_06113945.1| hypothetical protein CLOSTHATH_02150 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02150 [Clostridium hathewayi DSM 13479] hypothetical protein [butyrate-producing bacterium SM4/1] # 1 182 2 183 183 344 100.0 2e-93 MTRDELNNAYFDWMYQLVCDDEYSRGLSYRKLLYLLHDTDFTFTIALDGNRYDDGIDLRY RFGNEQGYRDSMIASYLDNRPCSVLEMIIALAIRLEEHIMDDPDIGNRTGQWFWDMIVSL GLGSMDDSKFDKAHAIDVIRRFLNRDYGRDGKGGLFTIEHCRYDMRDIEIWYQANWYLDN IR >gi|229784100|gb|GG667635.1| GENE 19 12102 - 12698 624 198 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621011|ref|ZP_06113946.1| ## NR: gi|266621011|ref|ZP_06113946.1| hypothetical protein CLOSTHATH_02151 [Clostridium hathewayi DSM 13479] hypothetical protein 
CLOSTHATH_02151 [Clostridium hathewayi DSM 13479] # 1 198 1 198 198 325 100.0 1e-87 MSNKSLFSLTFIIGAATGSVATWYLLKDKYEALAQEEIDSVKEVFLRREQELKDQSVKKT VAEGIKDADKEKPDLKEYAERLKKEGYTRYSDFGSDEEEKPVSEAGPYVIPPEQFGDDEE YEQISLTYYADGVLADENDEVIEDVEDAVGIDSLNHFGEYEDDSVFVRNDARKCDYEILL DQRTYSEVAEDMPHQMEV >gi|229784100|gb|GG667635.1| GENE 20 12691 - 12852 181 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621012|ref|ZP_06113947.1| ## NR: gi|266621012|ref|ZP_06113947.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02308 [Clostridium symbiosum WAL-14163] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02308 [Clostridium symbiosum WAL-14163] # 1 53 1 53 53 90 100.0 3e-17 MDGIGNFISMMDYILDTKRKRHITGGILLSASLLFGGLALTVMTIQNEEDEDE >gi|229784100|gb|GG667635.1| GENE 21 12852 - 12947 100 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTGFMGLTFSAFAGICFVSGLAVLMGGKEHH >gi|229784100|gb|GG667635.1| GENE 22 12964 - 13536 379 190 aa, chain - ## HITS:1 COG:BS_ykvT KEGG:ns NR:ns ## COG: BS_ykvT COG3773 # Protein_GI_number: 16078446 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall hydrolyses involved in spore germination # Organism: Bacillus subtilis # 82 187 97 207 208 64 35.0 1e-10 MRNYIRMVVLPALCIFAIICTGFVCSAEQVNRYEYIEIQPTLKAEPIDPIVIISEQPLEE TVSAVEIEEYVEDTLLPREDIELIALVTMAEAEGECEEGKRLVIDTILNRVDSVYFPDTV YGVVYQANQFSSMWNGRVDKCFVDDDICQLVEEELQSRTNVDTIFFTAGEYGKYGRPMFQ VGNHYFSSYE >gi|229784100|gb|GG667635.1| GENE 23 13557 - 14480 979 307 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621014|ref|ZP_06113949.1| ## NR: gi|266621014|ref|ZP_06113949.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 307 1 307 307 560 100.0 1e-158 MKKEEIMKNVSTTFSKVSVKLKKHSPEILIVAGVVGTVASAVMACHATTKLDSVLEKSKK DIDAIHKCAENEELATEYSKDDAKKDLAIVYVQAGVKVAKLYAPAVALGTLSIASIVASH 
NILKKRNVALAAAYATVDKTFKEYRNRVVERFGAEVDKELRYNIKAKKFEETVTDPDSGK EKKVKSTVDVAAPSTNDYARFFDESCEAYESNMDYNLMYLRSQQNLANDKLKANGYLFLS DVYDQLGIKRTKMSQIVGWVYKPEGNENGDNFVDFGILETNRETEDGGYEKAILMEFNVD GPILDLI >gi|229784100|gb|GG667635.1| GENE 24 14482 - 14742 301 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|323485209|ref|ZP_08090560.1| ## NR: gi|323485209|ref|ZP_08090560.1| hypothetical protein HMPREF9474_02311 [Clostridium symbiosum WAL-14163] hypothetical protein HMPREF9474_02311 [Clostridium symbiosum WAL-14163] # 1 86 1 83 83 147 95.0 4e-34 MYESDDKMVSHPSHYQSETGLEVIDVIEAFTFDLKGIEATDTGNIIKYACRWKNKNGIQD LKKIMWYTQHLIDHLEKKEKIEEENN >gi|229784100|gb|GG667635.1| GENE 25 14754 - 15353 284 199 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870398|ref|ZP_06113951.2| ## NR: gi|288870398|ref|ZP_06113951.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 199 7 205 205 328 100.0 1e-88 MEEYKSNSHKSRQNQNDDIPEKRVEKVVSGSVKSKKKNGLQKITNVFVPEDVDDVKSYIF EDIVVPAVKDIILDAVRAFLGVSGNSRGGRSSTSSKISYRKYYDDRDRRDSGNVSRTRTG YDYDDIILESRGEAEDVLERMDELIATYQVVSVADFYDLVGVSGNYTDNKYGWTDIRNAS VIRVRDGYMIKLPKALPLN >gi|229784100|gb|GG667635.1| GENE 26 15368 - 15559 208 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621017|ref|ZP_06113952.1| ## NR: gi|266621017|ref|ZP_06113952.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 63 1 63 63 119 100.0 6e-26 MNQYMYDGPVMEFDTCVANRWQGSTYAASEKKARSNLVYQFKKKTNRIPSTRITLPGKVV TVN >gi|229784100|gb|GG667635.1| GENE 27 15640 - 15921 429 93 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870399|ref|ZP_06113953.2| ## NR: gi|288870399|ref|ZP_06113953.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 23 93 37 107 107 120 100.0 2e-26 MENNEIMNNNEEVIETTTEEIVKAASNGGMKKATTIGLAMIAGALTYKFVVVPATAKFKN WRENRKTVVTQPKGDIVDGEFTDIDEETEEDSE 
>gi|229784100|gb|GG667635.1| GENE 28 15990 - 16763 338 257 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621019|ref|ZP_06113954.1| ## NR: gi|266621019|ref|ZP_06113954.1| hypothetical protein CLOSTHATH_02159 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02159 [Clostridium hathewayi DSM 13479] # 1 257 1 257 257 508 100.0 1e-142 MQKPNLTKICRNVKTATVKHSPEILTGVGIAGMITTTVMAVRATPKAIQLLDEEKRRQQA DKLEPMDVVKTAWKCYIPAAVTGTVSVACLIGASSVNARRNAALTAAYTISESTLRDYQK KVVETIGEKKEQTVRDAVAKERLEKNPVENKEVIVTAKGDTLCFDAVSGRYFKSDIDKLK KAENELNRQMRDEMYISLNDFYYEVGLEPIKLGDDLGWNIDNGYIDLRFSSQLATDGTPC LVIDYGYGPRYDFRNLM >gi|229784100|gb|GG667635.1| GENE 29 16812 - 17597 432 261 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870400|ref|ZP_06113955.2| ## NR: gi|288870400|ref|ZP_06113955.2| hypothetical protein CLOSTHATH_02160 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02160 [Clostridium hathewayi DSM 13479] # 6 261 1 256 256 540 100.0 1e-152 MCTREMTLGEEIINLTKRGIDVPTVERMYRKYIDLDEKGKSEGCYAIDLGPLFPTFDIGD TVRYCRADVEATLNLFRDTVHNPYSTLPADIKVGDKMMVPLGKLGNFTATVQKVTNNKVL FIFDDYVAKRPMNEDGGNAGGYSQSDLKKWIDSELYNMFPAVLKQRMTGLSIPTLGEICG WADKWDRDHIEADGDEQLPLMKQRRNRVAYYKNDCEFGWLRNATKKEFSSATCAFVDGIG FTGYDYASNSYGVRPEFWLVR >gi|229784100|gb|GG667635.1| GENE 30 17960 - 18085 125 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIGVGVNLITDWVNEQKMDEKIEEKVSEALARRDKDEAEES >gi|229784100|gb|GG667635.1| GENE 31 18186 - 18998 563 270 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621022|ref|ZP_06113957.1| ## NR: gi|266621022|ref|ZP_06113957.1| hypothetical protein CLOSTHATH_02162 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02162 [Clostridium hathewayi DSM 13479] # 1 270 2 271 271 541 100.0 1e-152 MKKPNLQRLAQRSKIYLRKASPTILSGLGAAGVIVTSVLAVRATPKALRKIRADSKTNHD GDPEAYSKLEAVKSAWVCYIPAAISGTATIFCIFGANVLSKRQQAALTSAYALLNDSYNN YKDKLKELYGEEANQKIVDAIAAEKAKDVYITSTGLVRNSSLDFDEHDPNDERLFYDAYS NRYFESSINRVIQAEYHLNRDFVISGYLPANHFYQLLGLEPLEGGDTVGWSIDTGIYWID FNHSKVTLDDGLEVLVIDMDWVPDAGWDSE 
>gi|229784100|gb|GG667635.1| GENE 32 19017 - 19334 174 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621023|ref|ZP_06113958.1| ## NR: gi|266621023|ref|ZP_06113958.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 105 1 105 105 204 100.0 1e-51 MSIEQLDLLLCDTYQMDAWFPFGWKWKKELEKSSYSVWAIDELKRYIVGRLYPKKSGSVE DFITFVGDFRRIMNQFSKINPDNNFMFSVAADISTDVLDLLHAMK >gi|229784100|gb|GG667635.1| GENE 33 19503 - 19721 221 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621024|ref|ZP_06113959.1| ## NR: gi|266621024|ref|ZP_06113959.1| CTP synthase [Clostridium hathewayi DSM 13479] CTP synthase [Clostridium hathewayi DSM 13479] # 1 72 1 72 72 114 100.0 2e-24 MDEMRIVSKFTRGIISKAIKMVIRKKTGYNIDIQLNEAITTINDGKTHLHLDVDAELDKD ELMSILKSIGLN >gi|229784100|gb|GG667635.1| GENE 34 19797 - 20030 168 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621025|ref|ZP_06113960.1| ## NR: gi|266621025|ref|ZP_06113960.1| inner membrane transport protein YqeG [Clostridium hathewayi DSM 13479] inner membrane transport protein YqeG [Clostridium hathewayi DSM 13479] hypothetical protein [butyrate-producing bacterium SM4/1] # 1 77 1 77 77 102 100.0 1e-20 MIETYVSIGKVTDYAIGVLKYFATASSILLISIIGALTAWIFWSAVGMIVAIVGIVIATI VLTLGIYELHTQKRRRR >gi|229784100|gb|GG667635.1| GENE 35 20529 - 20777 346 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870402|ref|ZP_06113962.2| ## NR: gi|288870402|ref|ZP_06113962.2| DNA-directed RNA polymerase I subunit RPA12 [Clostridium hathewayi DSM 13479] DNA-directed RNA polymerase I subunit RPA12 [Clostridium hathewayi DSM 13479] # 1 82 13 94 94 141 100.0 1e-32 MSIFNEEQIKAMFSREYICHECGHLMEFEDEWEDTLVCPHCGHSIDLDDYGREGNEEYEN LYPTREEVLGIANDDSEEDSDY >gi|229784100|gb|GG667635.1| GENE 36 20764 - 20913 68 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSSFFSREKCKGYYERIKLYLLNYRQQLVYYMYGSWTVRKEIFSYEYF >gi|229784100|gb|GG667635.1| GENE 37 20951 - 21367 351 138 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|266621028|ref|ZP_06113963.1| ## NR: gi|266621028|ref|ZP_06113963.1| hypothetical protein CLOSTHATH_02168 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02168 [Clostridium hathewayi DSM 13479] # 1 138 1 138 138 243 100.0 3e-63 MEEKNIEELLSEEIAAQIEALSDLQSGSKEKSTAIDDLTKLYKLRIEENKSVWDADEKYN RRMMDEESVTKDGDFKDRQIAEQVKDRYFRVGIAAAELLIPLMCYGIWMNKGFKFEETGT FTSSTFKGLINRFRPTKK >gi|229784100|gb|GG667635.1| GENE 38 21774 - 21938 120 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSNLKNVIIYLLSVLIAFESGVLLFIVGMFTVTNDLKNDRKNRSVSYRSYRKGD >gi|229784100|gb|GG667635.1| GENE 39 21976 - 22200 77 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRVLTFVHRFLPKIRPCNITVLLAYWAKEFANLPTDILKFLTLALKSALAIWTVEPVYLL HSFIDNLHIISAPC >gi|229784100|gb|GG667635.1| GENE 40 22181 - 22600 191 139 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621032|ref|ZP_06113967.1| ## NR: gi|266621032|ref|ZP_06113967.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 139 1 139 139 268 100.0 1e-70 MGTTIRPELSEKNPYWIEKHRYYELKHFCLQYPIWRKAYSVLDGYANPPKDSASFVITST LGDPTAKCAMAKTYYSERTDMVERVAEQTDRELAEYILKAVTEGWSYDILKARLEIPCCK DVYYELYRRFFWLLNKERK >gi|229784100|gb|GG667635.1| GENE 41 22600 - 22851 176 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621033|ref|ZP_06113968.1| ## NR: gi|266621033|ref|ZP_06113968.1| putative glycogen synthase [Clostridium hathewayi DSM 13479] putative glycogen synthase [Clostridium hathewayi DSM 13479] # 1 83 1 83 83 158 100.0 1e-37 MNENEFATGSVPVMVAARIYGKDASWVRAGIISGWLPIGKATRNGQLVTKIEDMNSKYGR INFYISPKRLYEETGYVWKGEKR >gi|229784100|gb|GG667635.1| GENE 42 22946 - 23110 92 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621034|ref|ZP_06113969.1| ## NR: gi|266621034|ref|ZP_06113969.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 54 1 54 54 74 100.0 2e-12 MSVEERHLLNKIRFFEDMLLRSKDYRQQENIGKELTVMRIRLQKLRFNKMRTGA 
>gi|229784100|gb|GG667635.1| GENE 43 23793 - 24836 139 347 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2144 NR:ns ## KEGG: EUBREC_2144 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 19 346 44 375 376 343 52.0 5e-93 MVTVGKDKNGKPICKPLKPESYFPTYNDAYTALVEFNKNPYDLEPSITVKELYDKWTPEY FKTLKSDDSARATTSAWQYCSAVYDMRVMDVRARHIKGCMEEGVATVRGQEQTPSASMKN KIKTLFNQMLDYAVEYELVDRNYSRTFKLTDDTIKEIQTVKKEHIPFSDDEMALLWKNLG HKYGIEFMIIQCYSGWRPQELGLIELADVDLSNWTFKGGIKTDAGENRVVPIHPRIRDLV SKSYEEADQLGSKYLFNYTDEDRRGKNTKLTYNRYSKIFNRIRDELKLNPDHRPHDGRKH FVTKCKDAKVDEYAIKYMVGHKISDITEKVYTAREFEWLRTEIEKIK >gi|229784100|gb|GG667635.1| GENE 44 24851 - 24988 79 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLGGLYRENPVKSRNYEAVKSSKKWENVGNSYIIPALLLYRYSYI >gi|229784100|gb|GG667635.1| GENE 45 24982 - 25899 374 305 aa, chain - ## HITS:1 COG:SA0158 KEGG:ns NR:ns ## COG: SA0158 COG0677 # Protein_GI_number: 15925867 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetyl-D-mannosaminuronate dehydrogenase # Organism: Staphylococcus aureus N315 # 2 297 3 300 420 283 45.0 3e-76 MVNVIGLGYIGLPTALMLASHGVKVIGTDYSNELINTLNAGKTTFQENGLDKLFADAVSA GIEFSTEYQVTDTYIVSVPTPYDKFSKKIDAYYVIAAIKSIMNVCPNGATVVIESTVSPG TIDKQIRPVIEKNGFKLNKDIYLVHAPERIIPGNMVYELLHNNRTIGADDQAIGERIKAL YAQFCQGEIVVTDIKTAEMTKVVENTYRAVNIAFANELAKICRHDNMDVYEIIKICNMHP RVNILQPGPGVGGHCISVDPWFLVGDYPSLAKVIDESMRTNDGMPDFVLNRIYEIMKEKG KIETC >gi|229784100|gb|GG667635.1| GENE 46 25912 - 27555 298 547 aa, chain - ## HITS:1 COG:no KEGG:Bcell_3907 NR:ns ## KEGG: Bcell_3907 # Name: not_defined # Def: heparinase II/III # Organism: B.cellulosilyticus # Pathway: not_defined # 21 494 65 551 638 251 35.0 5e-65 MITKIPYIYKKGIPKTGFLEEADGILNGVFPTVSGQTISFDNKIDWDCTGVTYRLKCFRI NDFKYLLSLSDAFKQTGQIKYLNKGFDLIEDWIQQNSDYITGDKWNPYVIAERLMNWIGF ISENAHQVQKSTGKYTSSIWSQAKELKSSVEYQLGANHLLSEGKALMYAGAFLNNPVFYK YGKKLLIKEAALQFLPDGGHYERSISYHVESLQQLFEATVLMVWRKEILDSRFADVLCNA YTFLNGMISSDGHIPLVNDASYDYPFGAADFLGTSKVIFNDAAPKGKIGFYSSRWEEIKV 
LTRKPDWNPVTIYKDTGYVHDLFEYNGIKHSIYFDVGNCGPDENLGHAHADSLSVLWATE ETQIFADSGVFTYEPGNERNYCRSTKAHNTVEIDGRNSSEIWSAFRVAIRSHGILINSRC SDTDDEFTALHDGYENILDKPVSHIRKLEIDKINGRIKISDYLKGKGKHEAVVRYHLTPG REVVRLDENEVIIDKIYKITSSNAIEIARCKVASNFGMKENSICLEINSTFNLEEKIVTN IYFNTSI >gi|229784100|gb|GG667635.1| GENE 47 27693 - 28913 357 406 aa, chain - ## HITS:1 COG:alr1668 KEGG:ns NR:ns ## COG: alr1668 COG0438 # Protein_GI_number: 17229160 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. PCC 7120 # 1 400 1 401 427 130 26.0 4e-30 MKLLYFSQFYAPESIAPAFRAIDNSKFWTEMGHDVTVFTGYPNYPAGKIFDGYTPRLMND ETINGVRVIRSKLVAKPNTTTFRRLENAISFLFFGIVNLLFNKRKIGKNWDLILGTSGVI FTALLAWGGAKIYKTTFVLEIRDISYKQLQATGHSANSLAVKGMKALELFLCRKAKSVIV VTNGFKNVLQKDGIDPQKIFVISNGVDITEENTKRELENKFVLSYFGTLGLSQNIIGTFD YAQKIKKHCDNFEYLLIGEGAQKNRIEEHIRLQKYDFIRQFSGMSSQELEKYYKLTELSV ITLQKTDDFKYTIPSKLFQVMGRGIAVMYIGPAGEAADIINNYQAGLTLTGTLEEDLKTL DDFFSDRDWNRKIQIMGQNGKKTVKEHYSRRKLAKDYMEIMEQVRE >gi|229784100|gb|GG667635.1| GENE 48 29647 - 31077 1193 476 aa, chain - ## HITS:1 COG:all4160_2 KEGG:ns NR:ns ## COG: all4160_2 COG2148 # Protein_GI_number: 17231652 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. 
PCC 7120 # 263 475 23 225 226 219 51.0 1e-56 MYKKATLGWLKHWDFMILDLICLQVAFISAYFIRHGVHLAYENKLYRNMAFVLFLIQVCV TFFGESFKNVLKRGYYKEFTATFKHVCQIILIAVFYLFATQTGEGYSRITLVLTGIIYAV ISYIARILWKKYLKTKGVLGKGNRSLLIITSEEMMDTVIDTIRNNNYEGFQIKGISLLDA DRVGERINDVPVVATLENVEEYVCREWVDEVFLDLPKEVPLPRDLINDFIEMGITIHLKL IEMAKLEGEVQRVERLGSYTVLTSSINMASWKQALYKRAMDIAGGLIGCIITGILYLIVA PCIYIKSPGPVFFSQIRVGKNGKKFKLYKFRSMYMDAEKRKKELMAQNRVKDGMMFKLDH DPRIIGGDKGIGGFIRKYSIDEFPQFWNVLKGDMSLVGTRPPTIDEWDKYELHHRVRLAI KPGITGMWQVSGRSDITDFEEVVRLDKEYITKWSLGLDVKIVMRTVGVVFGKSGAM >gi|229784100|gb|GG667635.1| GENE 49 31337 - 31639 308 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870406|ref|ZP_06113975.2| ## NR: gi|288870406|ref|ZP_06113975.2| toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] # 1 100 1 100 100 186 100.0 7e-46 MIMGLTQEELAEKIGRAYKYCQDIERGTCGMSIDTMLSIAACLNISLDYLIYGNNEDSTE FADLDTEQQAVIEILGNCSHKKKRYALDLLKLFMKACDER >gi|229784100|gb|GG667635.1| GENE 50 31610 - 31756 72 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAANMCEFSVKARKRSILNHNCCFIYKNMTVVMTNIFVTLALITSLHK >gi|229784100|gb|GG667635.1| GENE 51 31933 - 32436 494 167 aa, chain - ## HITS:1 COG:CAC2413 KEGG:ns NR:ns ## COG: CAC2413 COG4708 # Protein_GI_number: 15895679 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 12 155 10 155 166 74 33.0 7e-14 MLQKSSSRTYFMVYAAAIAAIYTVLTMVFAPISFGPVQFRISEALCILPFFTPAAVPGLF IGCLLSNLLCGAAALDIIFGSLATLIGALGSWMLRKNKWAVCLPPILANTIVIPWVLRFA YGSEDMILYAMITVGIGEVLAIGVLGNLLMGILYKYRQIMFGKKNLA >gi|229784100|gb|GG667635.1| GENE 52 32584 - 33624 865 346 aa, chain - ## HITS:1 COG:MT0208 KEGG:ns NR:ns ## COG: MT0208 COG3590 # Protein_GI_number: 15839578 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Mycobacterium tuberculosis CDC1551 # 22 346 326 662 663 248 40.0 
1e-65 MTVRSHGTIRKRAFKEKKPDKKLAAELVQNTFGFEFGRLYTEKYFSEEDKKAVETLVHQI LSAYKERINGLEWMGETTKEKAIRKLDHMTLKIGYPDNWPDYYENAVILPEGRGSLIDNM ITIYRSLRSFEKEEIKKPVDRGQWGMTPQTVNAYYNPLNNEIVFPAAILQPPFYDPEAGR AANLGGIGMVIAHEISHAFDSSGSCYDENGNYNMWWTEEDLKKFQELTLRVTEYYNDQEA VKGRYVNGQQTLNENIADLGALACVTSIAGDDTEALDSLFRQYARIWASKYTEESMIHRL NTDVYSPAKVRVNAVLSSTEAFYKVYPDLKEGDGMYVAPEKWVKIW >gi|229784100|gb|GG667635.1| GENE 53 33563 - 34612 777 349 aa, chain - ## HITS:1 COG:CC3504 KEGG:ns NR:ns ## COG: CC3504 COG3590 # Protein_GI_number: 16127734 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Caulobacter vibrioides # 44 338 73 362 706 91 25.0 3e-18 MAMAEDSVVGRDGALQSTEKGQTPEDEQNIEDAVRTAGKAETGSVRAEDDYYNYVNQKLL AGKQIPEDSESWSYFYELGQESYHNLSELLDEVINQRSNLAEGSPEQKIVDLYQTAMDME ERKRAGFGALQPYLDSIRGAADIQEYIEAVGAVNNDLGFSSLIALAYFEDMKNSQNYGCY LGSADLGPGKETLEDKTQSVLLEAYRNYIKNIMESTGISWKKAEETASDIYKFQTDLAAA TLPLSDQNDPEKTYNPLSLEELRTLYSNIDIEAYLKAAGADGFHSWVVNDPGQAKKINSY LTEDHLPLLKEYSIFCLVKDFSPFLTPEIRDSTIAWNNTQKGVQGKKTG Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:28:23 2011 Seq name: gi|229784099|gb|GG667636.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld29, whole genome shotgun sequence Length of sequence - 40041 bp Number of predicted genes - 30, with homology - 30 Number of transcription units - 9, operones - 6 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 29 - 88 10.2 1 1 Tu 1 . + CDS 297 - 1241 1065 ## COG0039 Malate/lactate dehydrogenases + Term 1342 - 1403 8.2 - Term 1720 - 1758 2.9 2 2 Tu 1 . - CDS 1794 - 2498 648 ## COG0366 Glycosidases - Prom 2600 - 2659 80.4 3 3 Tu 1 . - CDS 3507 - 4436 961 ## COG0366 Glycosidases - Prom 4515 - 4574 5.6 + Prom 4514 - 4573 7.2 4 4 Op 1 1/0.333 + CDS 4614 - 6935 1626 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases) 5 4 Op 2 . 
+ CDS 6868 - 7860 1028 ## COG1609 Transcriptional regulators 6 4 Op 3 7/0.000 + CDS 7873 - 9654 1895 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 7 4 Op 4 2/0.000 + CDS 9676 - 11151 1138 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 11277 - 11322 7.1 + Prom 11170 - 11229 6.3 8 5 Op 1 35/0.000 + CDS 11337 - 12659 1386 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 12670 - 12717 -0.2 + Prom 12663 - 12722 1.8 9 5 Op 2 38/0.000 + CDS 12749 - 13627 809 ## COG1175 ABC-type sugar transport systems, permease components 10 5 Op 3 . + CDS 13630 - 14478 892 ## COG0395 ABC-type sugar transport system, permease component 11 5 Op 4 . + CDS 14497 - 15327 742 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 12 5 Op 5 . + CDS 15330 - 16604 1202 ## COG2271 Sugar phosphate permease + Term 16621 - 16683 7.7 - Term 16609 - 16671 17.0 13 6 Op 1 . - CDS 16685 - 17608 744 ## COG0167 Dihydroorotate dehydrogenase 14 6 Op 2 . - CDS 17624 - 19207 1082 ## Clos_2471 FMN-binding domain-containing protein - Prom 19398 - 19457 7.2 + Prom 19287 - 19346 7.4 15 7 Op 1 2/0.000 + CDS 19438 - 20967 458 ## PROTEIN SUPPORTED gi|15900035|ref|NP_344639.1| ABC transporter, substrate-binding protein 16 7 Op 2 7/0.000 + CDS 20977 - 21906 871 ## COG4209 ABC-type polysaccharide transport system, permease component 17 7 Op 3 . + CDS 21921 - 22817 798 ## COG0395 ABC-type sugar transport system, permease component 18 7 Op 4 7/0.000 + CDS 22860 - 24608 1414 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 19 7 Op 5 . + CDS 24586 - 26205 1369 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 20 7 Op 6 . + CDS 26242 - 27132 723 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis + Prom 28279 - 28338 5.2 21 8 Op 1 . 
+ CDS 28470 - 29483 959 ## COG1609 Transcriptional regulators 22 8 Op 2 . + CDS 29521 - 30132 681 ## EF2258 hypothetical protein 23 8 Op 3 9/0.000 + CDS 30181 - 31569 1677 ## COG1653 ABC-type sugar transport system, periplasmic component 24 8 Op 4 10/0.000 + CDS 31594 - 32658 786 ## COG3839 ABC-type sugar transport systems, ATPase components 25 8 Op 5 38/0.000 + CDS 32627 - 33556 879 ## COG1175 ABC-type sugar transport systems, permease components 26 8 Op 6 . + CDS 33571 - 34398 570 ## COG0395 ABC-type sugar transport system, permease component 27 8 Op 7 1/0.333 + CDS 34413 - 35054 672 ## COG0274 Deoxyribose-phosphate aldolase 28 8 Op 8 . + CDS 35115 - 36050 863 ## COG0524 Sugar kinases, ribokinase family + Term 36082 - 36132 12.2 - Term 36062 - 36125 15.2 29 9 Op 1 . - CDS 36146 - 38413 1641 ## gi|266621074|ref|ZP_06114009.1| conserved hypothetical protein 30 9 Op 2 . - CDS 38370 - 39563 1183 ## gi|266621075|ref|ZP_06114010.1| FRG domain protein - Prom 39623 - 39682 1.8 Predicted protein(s) >gi|229784099|gb|GG667636.1| GENE 1 297 - 1241 1065 314 aa, chain + ## HITS:1 COG:CAC0267 KEGG:ns NR:ns ## COG: CAC0267 COG0039 # Protein_GI_number: 15893559 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Clostridium acetobutylicum # 6 307 6 307 313 328 50.0 1e-89 MINIQKAAIIGCGFVGTSIAFSLVQKGIFSELVLIDANEKKAEGEAMDLSHGLPFTKPME IRAGSYEDIADCAMIIITAGANQKPGETRLDLVHKNVEIYKSIIPKIVEKNQEATLLIVS NPVDIMTYVALKLSGYPRHKVIGSGTVLDTARLKYLLSRHLDVDSRSIHAFIIGEHGDSE LAVWSAANVSGIPLNHFCELRGYFDHMESMDRIYQSVRDSAYEIIEKKGATYYGVAMAVC RIAESVIRNEHSIMPISVYLDGLYGLHDICLSIPTVVGQEGAEKVLDIPLDLMEMGKLVY SAEELKKIIGELEL >gi|229784099|gb|GG667636.1| GENE 2 1794 - 2498 648 234 aa, chain - ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 1 228 334 561 561 250 53.0 2e-66 
MVSRFGDDGKYRVESAKMLATFLHTLQGTPYIYQGEELGMTNIRFDSIDQYKDIDTLNHY AEAVTVFHEDPEHVMEAIYAKGRDNARTPMQWDASPNAGFTDAVPWIPVNPNYTEINAEA AIRDRESIYHYYQDLIRIRRNHKVVVYGDYNPVCEEHPSIYAYTRRYQGQKLLVVLNFFS DPAVIDLPKECLPENGIRLIGNYEDAPACEVHMKLRAFEAVVFLGGDEVEQITI >gi|229784099|gb|GG667636.1| GENE 3 3507 - 4436 961 309 aa, chain - ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 1 309 1 311 561 373 59.0 1e-103 MKETWWKEAVVYQIYPRSFMDSNGDGIGDLGGIIRKLDYLKELGVNVIWLSPIYRSPNDD NGYDISDYCEIMEEFGTMEEFDRLLNAMHEKGLKLMMDLVVNHTSDEHPWFQAALKDKNG PYRDYYFFRDGKPGGLPPNNWCSHFGFSAWEKEPEGDQYFLHLYTKKQPDLNWDNPSVRE EVYRIMQFWTDKGIDGFRMDVINYISKVAELPDMPEDGPYLATPYFANGPHVHEYLQEMN RRILSQRDFITVGEMVSVTTDQARLYTHEDRHELNMVFTFEHMYLDAVGEDKWQIRQWKL SELKAVFGK >gi|229784099|gb|GG667636.1| GENE 4 4614 - 6935 1626 773 aa, chain + ## HITS:1 COG:ECs1895 KEGG:ns NR:ns ## COG: ECs1895 COG1554 # Protein_GI_number: 15831149 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Escherichia coli O157:H7 # 10 750 6 710 755 449 36.0 1e-125 MEKIELGEWSVSVPRWDRAWQPQMESLFCQGNGYLGVRYAVDEPVPGQKRDTFIAGTYDS CFGEVSELPNLPDVMGKEITIDGELLQLSERNHEDYCVMLHLDNGLAERRFTYKTEQGKR VKVKMERFVSLKNRHLLMQRLSLILEEPAELEIRTGINGQVSNSGAQHFTEGTRHCIPGE REELVFQAWDSGVRFAAACAPRLSAGSYRASIDMDRRRIGTLYQIFAKKGEEIVLENAAA IYTSRDIEYWDGESPFADTPFKESACGKYKSPSIVEDANALAAEAANAGFGYSLNESGEA WQKYWDLHDIRIDSDDVFDSCAARFALYHLRIMTPVHDERMNIGAKGLSGEAYKCHTFWD TEMFLFSSWLYTEPEVARRLLRYRYLTLPKAKEKVAKYGCEGALFPWESAWITDGEVTPS EGAPDVVTGKPIPIYTGIQEIHVNADIAYAVHRYFEATDDQAFMEECGFEMILETALYWA SRAEWNEERKQYDINGVIGPDEYSEHVSNNAYTNYMASWNIHEALRILERLRQTGREDLL AALEQKTGAVTREEWMRERAEKLTLPQPDSRGIIPQDEAFLSLPELDLTPFRKGGKKILE HYNVEQLCGYQVLKQADVVMLLSSMPDLFPEDIILKNYQYYEERCVHDSSLSFNVHSMAA ARLGMEEESFRLFRKAAEVDLNSGLTSAEGIHAAAMGGIWQCIVLGFAGVSVKDGRLSVE PHLPKSWRSVSFSFVWKGERMELTVDHKEVRYGKTGGCSKGLRSVGVTDEPGA 
>gi|229784099|gb|GG667636.1| GENE 5 6868 - 7860 1028 330 aa, chain + ## HITS:1 COG:BH3692 KEGG:ns NR:ns ## COG: BH3692 COG1609 # Protein_GI_number: 15616254 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 330 1 333 337 228 33.0 1e-59 MVKLVDVARACGLSVSQTSRALNGRQDVSEETKRRVSQTAAAMGYVKNLTAQTLSAKTPM QLAVVVTGVSEQTENSSFFFGIMQGVNSFANQNGYETVVYMMEKSPECFETYCRQRGAGG MILMNADYDESAVKKVAEGNFPCVFIDIPYTGERKGCVIVNNAYYSRLAVEHMVKRGRKR IAMITGSGHSIVEKEREEGYMQALRQAGLTVKPEWIVKGDFRYDMAHEQAVRLMEKEPKI DGFFCASDLMALGVINALRGQHIKIPKQVSVFGFDGLLSSEYARPKLSTIRQNQKYKGFL AARMLADILGNEAYEPTVVVPCEVCLRESV >gi|229784099|gb|GG667636.1| GENE 6 7873 - 9654 1895 593 aa, chain + ## HITS:1 COG:BH0792 KEGG:ns NR:ns ## COG: BH0792 COG2972 # Protein_GI_number: 15613355 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 298 589 287 572 587 153 32.0 8e-37 MNWYKHIPMQRKILAACLVVILVPLTILVSLILRRSMDEEIDRSTVNMTQANEMIDDRLE TIVNHLKTTSAMYLVYPDIIQFVKNDFNMDRPEYITTLRELRKNVMSAKINPYVSSITYL SNGNNAYSGAGYNETYMKRIESFAGTMKAEKIKNMLSPVYETVINQQSVKTITYGFSLVD TYTFEPVGDAFINLDLKQFGQSFMVFTEDGSMDTMAVQGDYVIYQGPRVTDAMAAGVRQA LSEKKEEFAANKNQVMDLSLSGEDWRLVATYNEALDLTIISMAAGRVLKLHMMQGMASYI IIVVIMTAVFIGVSILLSVKLTRPIRVLEEGMHQVENGKLVPITQELNRQDDLGRLIHGF NNMTARLKESILREYESKNLQKKAQIMMLESQINPHFLYNTLNVINSIAILEDVPEISEL ATGLGDMFRYNISGGSLVTVMDEITQIERYTNIQKLCMAGQIEAVFDVPEEVRGKKILKF LLQPLVENCFEHGFSRDKRDGRIEISASSQEGVVTLTVEDNGSGMSGARLKQLKEKCREQ GALCLDKEETDSIGLLNVNFRLKNYYGEEYGLILESEEGRGTKITIRIPEGEK >gi|229784099|gb|GG667636.1| GENE 7 9676 - 11151 1138 491 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 5 487 6 522 525 133 26.0 1e-30 
MRFAVADDVLLARKAVEKLILEWDRDSTVCCSCDNGDEVLKVLEDQKVDVLVTDIRMPGL NGIALSEVVRQRFPSVDVVLVSGYATFDYARAALHNGVRDYLLKPLKREDLFHTLDAIKK ARAEEKRQEDLLREVQKRAGSFLLLRYLSGEERPEDVLPREMTGLKNGCFFAAQIMVKGV KFEELSIRLRDLSFSRDLIFEDILHSGAGVIVSCGEERAVAVYDKKIRALGTLAEQERKE GHPAAVGVSMLLTGPDSIREAYSQARRACCLRLQDPERGLYRFDNDRGKELLSKEDLHLI RNHLNAKRVKEAKEFCLKKLIQPDMTTVLQLEQSYQALLGICFTMDMEAIYRPLWDFDDM NGLLDYICSMACCSGETDRTVNEGDIIDEIQAFLEDNYYCEISLNELAATKYFMNPNYLS RLFKARTGMGFSKYLLGLRMKKAEELLENGDMNINEVALMVGYTSPSYFIQNFKKYFGRT PGTQRNKPELR >gi|229784099|gb|GG667636.1| GENE 8 11337 - 12659 1386 440 aa, chain + ## HITS:1 COG:SP1897 KEGG:ns NR:ns ## COG: SP1897 COG1653 # Protein_GI_number: 15901724 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 54 374 29 352 419 161 31.0 2e-39 MKKVLSAAMAAVMVVSMAGCGSAGKTAETTTAAPAAATEKTEETTAADTKEAASSGEVTT IEFFQMKDEGTDYYTALIKEFEAQNPDIKIEYTNVPDAETVLMTRMASDQVPDVFTHFPL DASFREQVKAGYMMDLTGETMLENVADDILDISLIDGKSYSVPVSLNMLGVYYNKDLFDK AGVEIPTTIEDLYKVCDTLKAADITPFTFPDKDVWTVRQFCDRASVTMLEDPVGLFEDIA AGKTTAADSRELRQMGETIVKLRQYGQEDNLGTGQEQAIADFANGKAAMFFSGTFAYPEI VKSNPDLPFAMFRYPSFNGSSIDRLGVNIDTAFSISASTKYPEAAKRFVEYCTSPENAQQ FSDYDGTPSAIKGVNFNKSEFESMFKDYIQTGKTMSVPSNSWPPSFGGEYGNICQELIDT KDVDQWLDNLNQLIMDTYNN >gi|229784099|gb|GG667636.1| GENE 9 12749 - 13627 809 292 aa, chain + ## HITS:1 COG:SP1896 KEGG:ns NR:ns ## COG: SP1896 COG1175 # Protein_GI_number: 15901723 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Streptococcus pneumoniae TIGR4 # 17 290 20 293 296 205 41.0 1e-52 MKNKPKRPKAGRAAYMAFTLPAIILYTVFFVFSVIYGIYYSFTNWNGYAKEYKFVGINNY TRLFSDRVFKRALVFTVEYSAILVVGVITIALILAVILCSKIKQQTFFRAAFFFPATLSL VTIGLVFNEIYYRVIPLIGKSFNIGFLSKNILSSSSTAMWGILIVHIWRSVSIPMVLLIA GMQNISGDLYEAAAIDGANGIQRFFRITLPLLMPVLSVCIILVTKEGLTVFDYITVMTSG GPAGSTQSIAFLIYENAFMNQKFSYAITQSVVIFVLVCVVSYFQFRTTRKED >gi|229784099|gb|GG667636.1| GENE 10 
13630 - 14478 892 282 aa, chain + ## HITS:1 COG:SP1895 KEGG:ns NR:ns ## COG: SP1895 COG0395 # Protein_GI_number: 15901722 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 10 281 7 278 278 216 43.0 4e-56 MNEKKKSRLRPSGILKYLFLVLGCLVIAAPLYVTIITAFKTREESTVNFFSFPSQFYLGN FREIIEKQNYFRYLGNTVIITVCSLILMALVIPSLSYALSRNMRRSKFYRFSYIYLVIGL FVPFQVVMIPLVKIASSFHMSNIWGLIVLNLVFALRDGVFLFVAYMDSVPRELEESAFID GAGVVQTYIRIVFPLVKPMTATIMILNGLSVWNDFMLPLLLLNKSKDSWTLQLFQYNFKT TYSFDYNLAFASFLLSMLPVMVVYVFAQKYIIQGLTTGAVKG >gi|229784099|gb|GG667636.1| GENE 11 14497 - 15327 742 276 aa, chain + ## HITS:1 COG:BB0533 KEGG:ns NR:ns ## COG: BB0533 COG1235 # Protein_GI_number: 15594878 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Borrelia burgdorferi # 5 274 5 251 253 75 28.0 8e-14 MEIKYLGTAAAEGWPAVFCTCEACKRARKLGGKNIRTRSQAVVDNTVLIDLPPDTYLHVL REGMEIDTIESVLITHSHQDHFYPMELLMRGEPYAHRPGAPVLTVYGNDKVEAAYRLAME MNDSPTLHAQLDFKRVRPFEPVGLPGGYVVTPLAANHAKDENCLVYLLEKDGKRLFYGHD SGNYPEETWDYLRGKMIGLASFDCTNIEAPDGNYHMGLPDNRYVKARMIREGCAGENTVF VINHFSHNGKLMHEEIVDCVREDGFVVAYDGMTVGF >gi|229784099|gb|GG667636.1| GENE 12 15330 - 16604 1202 424 aa, chain + ## HITS:1 COG:RC0082 KEGG:ns NR:ns ## COG: RC0082 COG2271 # Protein_GI_number: 15892005 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Rickettsia conorii # 21 421 33 428 431 98 24.0 2e-20 MEMGACLRRKQKEVFLFCWTGYFLTYFGRLNLSVAMNAMALEFASGAAVIGLTGSVFFWV YAVGKLVNGYAGDYLDNRHQVFFGLMVSGAANVAIGFVHNIWIIILLWAVNGFAQSAIWC NMISLIAHWFYEEQHAGAAVWLSTSMVGGTLAGWGICGFIIRAASWQWVFLIPGFALILF AFLWKTYVKNTAVEAGFTDFQSVRVVKPSAAFLEHADSSSGIGNRLGKFILASGLIWIIL ACLAQGVIKDGIGLWGPTMIQDIYGVDAGTASFLLLFIPVMNFMGITLVGMIQRKRLLEE EMLVVIMMVLSVFILASLRVFMGKSLTAGVLLLGGVSAVMYGVNTILLGVFPLHFARANR ASFVSGLLDFCSYVAAGLSSVFSGLVIQLGLGWNSVFLVWMALTVMGIGALGVFGWKYKA QTGR 
>gi|229784099|gb|GG667636.1| GENE 13 16685 - 17608 744 307 aa, chain - ## HITS:1 COG:lin1947 KEGG:ns NR:ns ## COG: lin1947 COG0167 # Protein_GI_number: 16801013 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Listeria innocua # 3 301 4 298 304 144 32.0 3e-34 MNLNVTIPSSIGDIHLKNPLMPAAATLGNLIEHARYFDLNELGAMMPNSMFIDSGSPTMA KKICKTNNGFISALSKNNMSIFEFEKEILPHLPCETTPLIIDMKAVDKYQMEELAASIEQ MDRIMGIEINLNCPYGPGTPYWKNPDELKDLVARVRAAAPHKWLIAKVPGGDIPVEEISV ACQEGGADAITSYSSLNGSCIDIRTGTYRCGGGGSGGFSGPAFKPVGILMCRRAATAVDI PVIGIGGISCAEDVIEYIMAGAYAVQVGSANLSRPDFMHRLLKDLEALAGQMGISSLEEI RGTARIE >gi|229784099|gb|GG667636.1| GENE 14 17624 - 19207 1082 527 aa, chain - ## HITS:1 COG:no KEGG:Clos_2471 NR:ns ## KEGG: Clos_2471 # Name: not_defined # Def: FMN-binding domain-containing protein # Organism: A.oremlandii # Pathway: not_defined # 2 278 47 320 322 238 44.0 6e-61 MKQREKTVSILDRLSWSVQPPKGLIKGCYYREEARFSSCFNGDLGHLGVLEVVVDDGRIV MVEFNEQCSPTYYMRHFQNVDKRLSDYSFFQASKERTAVTGVVLVNGITSVEKQMVAANR LTGDFDLVAGASNSIKRSMLPLAEKIAARMSVPSPARYYGLSKELEPGVTGRLQIVTEAG KIVWLFYDEIFADTQEEISDPELKQYYRQSKYHCPDYISTSGAGFNNLSDLITVSVLKHQ NLLGLIDLPYTEEDNRAPEWDRYLSLARPLQEEMVKDGVWKPADYLPVLDSTVLSVHVMT ANDGRPLHLLELEYPDFTYKPGQFVMIQNQGDGFHWSYPYMIYEGTDRGLKVIAAKNSSL FPLATGAGTAIWGANGIGCTPGTNAVFVAEPATIHLLAPLIHFCASPTLILIGSESAVPK ELLPSVTHFVSDGSQALKHLTDGPDAVYMALNVPVLESIMNCADDSLKERTTVFASTRIG CGIGACKSCYLHSPDIQMGIPVCCNGPYLPFNMIDFEKDRKCFQTFK >gi|229784099|gb|GG667636.1| GENE 15 19438 - 20967 458 509 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900035|ref|NP_344639.1| ABC transporter, substrate-binding protein [Streptococcus pneumoniae TIGR4] # 37 506 18 487 491 181 27 8e-45 MKKRTLKRIASFTLSSMMTVSLLSGCGGGKQAESQPAAGEAAIAQTKGEEKEEPYVVTML TQGEQQEDLPRIMEKVNEILVRDLNMKLNLLVAPYGSINQQRQLMLTSGEPLDLVYLDAS SAIGFMNNGQIIDLSDLIDKYGTNIKKYWGDEAKSANIGGFVFGVPNLNEVGNIPAIGMR KDMVEKYNIDVESIKSLEDLEPVLAMIKEKEPTMTPVHICADQPPISRQLSVIDPLIDGI AVLDNSGQNTTTIIPVTQSEAYKEKCELFHKWYQAGYINQDAATTTVQFESAFKAGSTFS 
AIMVWHPMSPKQFGGVDMAYAFLGEHRALSGATSNADYGIATNSQNPDKAMQLLDYLYGS EEVAQLLNWGEEGKDWVYVDKEQNVVSWPEGVDSNNATYHAQLSWALPNQFMASSWEGVW DPDVFEQMLKFNKEGKKSKGFGFAYNTESVETELTALKNVQEKYRISLETGAVDPAEYLP QYEEALKQAGLDKLIEEKQKQFDEWLAAQ >gi|229784099|gb|GG667636.1| GENE 16 20977 - 21906 871 309 aa, chain + ## HITS:1 COG:lin2117 KEGG:ns NR:ns ## COG: lin2117 COG4209 # Protein_GI_number: 16801183 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Listeria innocua # 19 309 15 309 309 305 53.0 8e-83 MDEKVRKRKFRWKKWIPIFVMMSPGLIYLFINNYIPIFGLVIAFKKVDFREGIMNSPWAG LSNFTYLFKTPDAFIITRNTLLYNVAFIAINTSLSLTFAIFINDMKNKYCKKIYQSAVLL PYLMSVVIIGYIVYALLSPDKGMINNSILKPMGKSPVSWYMEPKYWPFILILVNAWKGVG YNCLVFIAGMAGISPGYYEAAQLDGATKWQQIKYITLPCMKTTIITLVLLNIGRIFYSDF GLFYQIPMNSGALFDVTNTIDTYVYRSLLQLGNIGMASAAGFYQSLVGFTLILICNFIVR KIDRDSALF >gi|229784099|gb|GG667636.1| GENE 17 21921 - 22817 798 298 aa, chain + ## HITS:1 COG:BH0795 KEGG:ns NR:ns ## COG: BH0795 COG0395 # Protein_GI_number: 15613358 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 10 297 36 321 322 213 39.0 4e-55 MRSKDQKVFNLISHTVMILVTVMAVLPFVLVFLSSITEENTLVLNGYSFFPEKFSLYAYE YIVMKGKKIFRAYAVTLFVTVVGTSINVMISAMLAYPLSLKDLPGKRIFTFYVVFTLLFN GGLVPTYLMYTSAFNVKNTIFALIVPNLLMHTMNVLLMRTYYSTSIPAELFEASEIDGAS QFKIFGSIILPLGKPIAVTMALFSGLSYWNDWTNGLYYLTGYDGEKLYSIQNFLNKVVTD IQYLNSSQVGSNSDILAKLPTVSVRMAIAFVAMIPILVLFPFLQKYFSKGIAMGAVKG >gi|229784099|gb|GG667636.1| GENE 18 22860 - 24608 1414 582 aa, chain + ## HITS:1 COG:BH3678 KEGG:ns NR:ns ## COG: BH3678 COG2972 # Protein_GI_number: 15616240 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 274 576 298 603 605 110 28.0 1e-23 MRTEKRTGNSISSRLRRLMTILVPLSALYIGISVVSTYYFKKQIVSYAETFVDFYIDKIE 
TTVTNINRRMSMLILGEGETEIELNSYINSIKTTDNIAFRNYFIGKLRDAFRMYSMEYGS EYQFFAFFPEGKQYVGANANDKLGQEVWEQYRDDITHKLERETLIPNSGSQYWQFVEGSG GCNYIIKFYSIREVYVGCWIRPADLIAPLENAVKDGKTTALLFDQNNRCIAGERTMEPDT ITIERRFSNLPFSVRLTVHDYGLFQKTFLMQMGLVFLALAMLFVTLCSIYFLYNKVLKPI KKFSNNLERLSRGKTGLEEISSSELSELEQANAEFRQLLQKISMLQEEVYEGEIKKQLMY MEYLKLQIEPHFYLNCLNFIYNTIDLGKYTQASRMSVMTAEYMRYLFNNGRDFVCIWEEL EHIGHYLEIQKLRFEHAFDYYLEQEEETREARIPPLVIQTFVENCIKYAVDLDSVLQITV TVFSEEMNGNPYTNICITDSGPGFSTQILEQLADQKKYMEEETGHIGISNTIKRLYYTYG DRASISFYNGPVKGAVVDIHIPYTVKAALEKGDEPDESADCG >gi|229784099|gb|GG667636.1| GENE 19 24586 - 26205 1369 539 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 529 4 522 525 113 22.0 1e-24 MNLLIVDDDLHVIEGIKRNLDWELLPIDHAFQALGVPAAKKSLSANQIDVMICDIEMPQE TGLDLLEWIREEGYPVQAIFLTSYAKFEYAQRAIKLESLEYILKPVDYGQLEKAVVLAAE RSLKSRQNEVFKESSRYWEQNRQNVVEYFWTGVLARNAPSDEEELNCKIEECGLDYIRGK VFLPIVIQMVCDVAESYAGLADAVDALCTNCLEDNTDDGVEAYEFHAKISDHTYFLMLRA RMNSDPDALRRDVERFMDLFMEGIGKAELNVLCGVGMWSTANLVYEDVNCIYTMMYESPR NKKKVMYLQEYQPVSMMYEVPDLKNWWDMMKYGQTGELKDAVERYLDILAGQQKLSSRNL QQLGLDLTQMVYSWLGSMNIYAHMLFDNEENNKIYGRVALSSHNMLQYLEYLLSKAMEYK DVVERPHTIVDTIRDYMDHHFQENLSRDDLSRLVFLNPDYLSRLFKKEMGLSISSYLIQK RIDLAKELLSTTRTPVSVISSQAGYDNFAYFTKVFKEKTGMSPNEYRKKCQSADRRDKR >gi|229784099|gb|GG667636.1| GENE 20 26242 - 27132 723 296 aa, chain + ## HITS:1 COG:TP0796 KEGG:ns NR:ns ## COG: TP0796 COG1477 # Protein_GI_number: 15639783 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Treponema pallidum # 16 281 41 327 362 131 31.0 1e-30 MKKAKMREVNFYAYETYCMVRLEERPEAGTLLAGCRDLALQVEQTLNMYDEDSELSVMCR DYRPGQPYEVSALLYDFLDISLNMARLSKGAFDPTVGALVKKWRIGSGEERIIDEKELTS LINRTGYHHIRLLPGKRAAMIDIEGVTVDPGAVGKGLAIDYIVHYLKNNHVEQACLDFGG 
NLYVMGNTYRIGVRQPEDPEKMMAVVPVKDRAVSTSSWYEHYFECNGRVYGHLIDPASGK PVESEFTSVTVICDKAVYADMLSTALYVLGEENGETVIRSLREREKGGLRGLSLAS >gi|229784099|gb|GG667636.1| GENE 21 28470 - 29483 959 337 aa, chain + ## HITS:1 COG:RSc1014 KEGG:ns NR:ns ## COG: RSc1014 COG1609 # Protein_GI_number: 17545733 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 4 335 3 336 347 176 31.0 8e-44 MSLTIKDIAEKAGVSVATVSRVINGSGYVSEKNKKNIEEIIQESNYCPNAAARSLITQAS DMIGLVMPERVNPFFVKVYDGVTRKADEENITVLFYKTGDDEKKQGEILKQLKAQNVKGI LITPSLYQTSATRKMLEAVERAGIPVVLIDRDTEGGDFDAVFIDNKGAIYKATEQLIQAG HTKIAAVTSPDFSRVGARRQDGFLECMKKHELEVCPDWVIEGELSVESGYEACKRLMEME NSPTAIVAFSSSELIGCVKYLNETGYRIGRDVCLFGFDDIGTFPDFGLQLSTIERPMREM GEMAFELLWERIHNGGKNKRSREIILPTNIKLAENQK >gi|229784099|gb|GG667636.1| GENE 22 29521 - 30132 681 203 aa, chain + ## HITS:1 COG:no KEGG:EF2258 NR:ns ## KEGG: EF2258 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 14 193 9 188 205 155 48.0 1e-36 MKHDNRYSYKPQYDPFENVMTRNGLHADEVISALQKSIRHADEDTAIRCGYEMYVTSMEL ENYLWRRLSVISVEDIGFGEPTAPILIQTLKNMREEFPYSLSGSRALFLFQAIRYLCKCK KERSSDTIKTIVIEEYKRGEYAKLPTEYDIYDWHTKKGREGGRTYLDFLEHFTDVTPFMD QYDPELKEILKQYTRENIELGEE >gi|229784099|gb|GG667636.1| GENE 23 30181 - 31569 1677 462 aa, chain + ## HITS:1 COG:TM0595 KEGG:ns NR:ns ## COG: TM0595 COG1653 # Protein_GI_number: 15643361 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 74 462 20 411 419 149 27.0 9e-36 MSKLRRIAALGLTAVLLGGCVPQKVAAPEKAEETKAEETKAEEGQAKNEGTNGTEAATEA ETAAAIATEITGPVEITIWHQYSDQYEVWFQKVADEFHAKNPDITVKLQYQPADQYEAKV MAAAKADDLPSMVHSAAANMTSYINEGALVNLDDYIYDSAVGYENFDEDINSGLNQMYVP WGNKRYAMPLFLGGNVFFYNKTMFEELGIEAPKTWSEVEAVATAVSDKIGKPGFGFQILV DGYMEVLTQAGGEVIDLDNKTVGFDSETAREKITWYKDLIDKKVARLVGDDKHFTNPFAN GDVGCYVALSGNYGYLATAVGDKFELGAAPLPQEGPKKFVSLSEPVFAIAKTSPEEELAS WLFLKYLLSPEVNAECGTVFGALPASTAAMNQSVYQDFLKVNPVAAAVYEQRDCYGYMPT 
IAGWYELRTSLDKALEEIMLDISDMDTALAEARKSAEAAIRK >gi|229784099|gb|GG667636.1| GENE 24 31594 - 32658 786 354 aa, chain + ## HITS:1 COG:PAB2232 KEGG:ns NR:ns ## COG: PAB2232 COG3839 # Protein_GI_number: 14520406 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Pyrococcus abyssi # 1 353 2 357 362 316 47.0 3e-86 MKIKLQNIRKRFGETVALDDLSLEFCDGKLTTLLGPSGCGKSTLLNLISGILPPTSGSIF FGDRDVTSLSPDKRNIGLVFQNYALYPHMTVGENIAFPLEIKKVSKKERLERAEEFAKLL RIEDYFTRKPSQLSGGQQQRVAIARALAKEPEILLLDEPLSNLDARLRLEMREEIRRLQL ETGITTIFVTHDQEEALSISDRILLLKKGRVQQYGLPQELYDEPCNPFVADFLGNPPINL YDGMVDGDRVVLQEDHTGFFIAGLKGIQSGRKVKLGIRSEAFEAAAEGVYETPATVKNVF LNGKETLYLLEIGGKEFRSTLESGQTYEIGGKISVRLKKKGVHIYDSETEEKIG >gi|229784099|gb|GG667636.1| GENE 25 32627 - 33556 879 309 aa, chain + ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 28 306 15 289 292 174 39.0 1e-43 MILKQRKRLVKVKSDEKITVRTFTEPLIYLLPFILVVSVFMVFPIITVFLNAFLENYDYL ANTYTGMGMGNFAALFSDKEFLLSLKNTFLYVAFVVPVSTVIALTVAILLNQKLKGSSFF QFVYFLPMVTATIAVGIVWKWMYHYDYGIINYVLGIIGIEPVKWMENPRMALAALIIYGI WTKLPFTIILFLSGLQTIGEQYYTAAKVDGAKPAKILFRITLPLLSPTIALVLIINIINT SKVFNEVFALFNGKPGPGYSLYTVVFYIYENFYHKMNVSIACAAALVLFLIVFIMTMLQM TIQKKWVSR >gi|229784099|gb|GG667636.1| GENE 26 33571 - 34398 570 275 aa, chain + ## HITS:1 COG:TM0598 KEGG:ns NR:ns ## COG: TM0598 COG0395 # Protein_GI_number: 15643364 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Thermotoga maritima # 54 274 57 277 277 191 45.0 2e-48 MKNRHLIVKIISYVILSAGAVMMVLPFYWMIITAFKTPEQVTQMPPIWWPPSFLLDNFKY VLSSGEFARYIINSLFVTVSSVILTTTVSILTAFAFSRLKFPGRDIIFSLFLGFMMVPFE LLAITNYVTIVKLHQIDTYKALIVPYTANILYIFILRNFFLEIPGSLYNAARVDGASNWK FLWRIMVPIAKPAIITIILLNSIDSWNSFLWPMLVTNESRMRTVTVGLTSFVQSAGIRYE 
RLMAAAFIVVIPMVVLFFFARKYIVTAVAQGGIKG >gi|229784099|gb|GG667636.1| GENE 27 34413 - 35054 672 213 aa, chain + ## HITS:1 COG:L63310 KEGG:ns NR:ns ## COG: L63310 COG0274 # Protein_GI_number: 15673422 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Lactococcus lactis # 1 213 1 213 220 211 53.0 6e-55 MNIAEKIDHTILRADAREDEVKRYCREAVQYHFASVCVNTCHVPLVAEMLKESDVKVCCV VGFPLGAMSTRAKACEARIAVEDGAEEIDMVVNIGAVKDGNYDRVEEDIRAVVEASGGAV VKVILETCLLTREEIVETCRCAVKAGAGFVKTSTGFSTGGATVEAVHLMRETVGDSVKVK ASGGIRTCGQAKAMIEAGADRIGAGNGIILLEG >gi|229784099|gb|GG667636.1| GENE 28 35115 - 36050 863 311 aa, chain + ## HITS:1 COG:mlr8492 KEGG:ns NR:ns ## COG: mlr8492 COG0524 # Protein_GI_number: 13477007 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Mesorhizobium loti # 2 306 4 307 309 214 43.0 2e-55 MGRKTLVFGSFVVDLMARTPHLPVPGETVKGSEFKMGPGGKGFNQAVATFKAGAEVTLAT KLGMDSFSEIALNTMEKLGMDTRRIFRTERKDTGCALILVGEDSSQNEIVVVPGASLTIT EKEVDSLKELIQECEFVLMQLETNLDAVERVAEYAARSGTKVLLNPAPAADLPEGLWGRI DVITPNEVEAEYYTGIPVHSEADAAKAADWFHSRGIADVIITMADQGAYLSSSSGIREFV PAFSVDAVDTTGAGDAFCGGLLAALAEGKERKEAVVFANATAALSVQKIGTAPSMPDRKE IETFLATRVTN >gi|229784099|gb|GG667636.1| GENE 29 36146 - 38413 1641 755 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621074|ref|ZP_06114009.1| ## NR: gi|266621074|ref|ZP_06114009.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 6 755 1 750 750 1482 100.0 0 MSENRMRARRIENYKEYIRNILSNRDIDCDETWWDSEEAHKALNLLVRAEKWEVTSSVQM GIDSSENFLFLKFTEDSGGLLQSLEEQARILAKLPESAEKNIYIICLVDNMKSRVFLERL FLSAEGKAMKKNIRKEMKPWKGTVNFLYILDNSYHEQDIKLTSVNRFEFIQMPPMKCHIN WENEDRTEVSNRGYVTSVHLYQLVEIYNRIGDKLFKRNVRYGLNEQLGVDRAIKETLRNS PGEFWFKNNGITILVERPDFKLDRVEEVLLEHISEYGELHFSVINGAQTITASAQCLYEM EYDLKELQKNGEAEEAAKLEERIRKSKNAKVLLRIIHISHSAQTAESEADSTREVNEISV ALNRQKPIKAEDIAFASPFVVKLAAFLEREQINGKRYFRLVKRGEGNIARRTVDLVDFAR 
ARKACAGYPGDARSKGTNFLLSTRNETGGEYSFQDKTIFVPEWLEAEDEQEAGIFAKYYG AVYFAVRVADFYGKNAKKIITDNPLKAAVLQNGKWYFTTYMVQLFNGYRSDFSQFTDCFE LIRDKLKDLMELFAELCGEAAQQSGNYPVLDSNTFKKDELYILTRNEADANRFAQIINDN LDDDEKIDLVSYREVTGGNRQPSSDTDSLAQKTPNGKAKFIKLNNGPVERVNSTAHAMVK TVQFVLDNYPGHEEELLISGTDWFTDDRTRAEEGYGYLRRSVEVAGSDGKTYWVGTHSNS VVKYNQIDLLCSLLHVKKHSISWMALDGKTPLFSW >gi|229784099|gb|GG667636.1| GENE 30 38370 - 39563 1183 397 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621075|ref|ZP_06114010.1| ## NR: gi|266621075|ref|ZP_06114010.1| FRG domain protein [Clostridium hathewayi DSM 13479] FRG domain protein [Clostridium hathewayi DSM 13479] # 1 397 8 404 404 762 99.0 0 MNENSAYVLPKDKRVKEISEKPIENVSDYISKIKEYIESRKGENYSYVYRGEPQIYPTSC RANIFRRGVMDGNPFFEKSLFDAMRQNRLTGEKRYLDNAIDAQHGEFPSRLLDVSYNCLT ALYFAVTPYYRKDVDSLDHEDGMVFVFFIDEIFSPSAENTNANYDAIINRDREWHRDKII FEKNHKFIDHTKLNNRIVAQQGAFILFQGNDAADLPAWMTYGIRIPKEAKPLIRKELKEF FGIHTGSIYPEIVNLVDEISNKSRKLNTEKFCCSNELMYAVRMLEKELDYYFDYILDNRK DLDTDEILVYVENMINSYRIGLLDFTQNLEETIQKVYLSEDELNTIVNRYNQLILEFSKE LNQHGVHKFSGGQLLISMERTNDYERKSDESQENRKL Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:29:31 2011 Seq name: gi|229784098|gb|GG667637.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld30, whole genome shotgun sequence Length of sequence - 42782 bp Number of predicted genes - 42, with homology - 41 Number of transcription units - 15, operones - 7 average op.length - 4.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1055 920 ## COG4805 Uncharacterized protein conserved in bacteria + Term 1122 - 1182 16.1 - Term 1114 - 1166 15.5 2 2 Tu 1 . - CDS 1184 - 1702 616 ## Closa_0155 Lipoprotein LpqB, GerMN domain protein 3 3 Tu 1 . - CDS 2658 - 3101 390 ## COG1846 Transcriptional regulators - Prom 3139 - 3198 8.8 - Term 3142 - 3191 -1.0 4 4 Tu 1 . - CDS 3224 - 3751 485 ## COG0700 Uncharacterized membrane protein - Prom 3921 - 3980 5.2 + Prom 3645 - 3704 5.5 5 5 Tu 1 . 
+ CDS 3951 - 4316 286 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family + Term 4330 - 4384 9.2 - Term 4318 - 4371 12.8 6 6 Op 1 . - CDS 4378 - 4881 159 ## gi|266621082|ref|ZP_06114017.1| conserved hypothetical protein 7 6 Op 2 . - CDS 4895 - 5353 176 ## gi|288870421|ref|ZP_06114018.2| putative glucosamine-6-phosphate deaminase - Prom 5483 - 5542 1.5 + Prom 5639 - 5698 4.8 8 7 Tu 1 . + CDS 5819 - 6640 451 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 9 8 Tu 1 . + CDS 7555 - 7818 175 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase + Term 8047 - 8083 -0.8 10 9 Op 1 . - CDS 7846 - 8382 502 ## Closa_0149 RNA polymerase, sigma-24 subunit, ECF subfamily 11 9 Op 2 . - CDS 8407 - 8823 256 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 12 9 Op 3 . - CDS 8843 - 9463 415 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 13 9 Op 4 . - CDS 9511 - 10956 822 ## Closa_0140 FHA domain containing protein 14 9 Op 5 . - CDS 10958 - 11455 263 ## Closa_0139 hypothetical protein 15 9 Op 6 . - CDS 11344 - 11781 309 ## Closa_0138 hypothetical protein 16 9 Op 7 . - CDS 11816 - 12781 388 ## Closa_0137 hypothetical protein 17 9 Op 8 . - CDS 12800 - 14680 1630 ## Closa_0136 hypothetical protein 18 9 Op 9 . - CDS 14682 - 14864 422 ## gi|266621094|ref|ZP_06114029.1| conserved hypothetical protein 19 9 Op 10 . - CDS 14882 - 16117 940 ## Closa_0134 hypothetical protein 20 9 Op 11 8/0.000 - CDS 16120 - 17001 521 ## COG4965 Flp pilus assembly protein TadB 21 9 Op 12 . - CDS 16847 - 18031 833 ## COG4962 Flp pilus assembly protein, ATPase CpaF 22 9 Op 13 . - CDS 18107 - 19105 1048 ## COG1192 ATPases involved in chromosome partitioning 23 9 Op 14 . - CDS 19116 - 19625 409 ## Closa_0130 peptidase A24A prepilin type IV - Prom 19673 - 19732 7.7 + Prom 19905 - 19964 9.2 24 10 Op 1 . 
+ CDS 19997 - 20392 481 ## COG0640 Predicted transcriptional regulators + Prom 20395 - 20454 5.8 25 10 Op 2 . + CDS 20486 - 20704 221 ## Closa_0128 hypothetical protein 26 10 Op 3 4/0.000 + CDS 20730 - 22064 1090 ## COG0366 Glycosidases 27 10 Op 4 . + CDS 22069 - 23979 1890 ## COG0296 1,4-alpha-glucan branching enzyme + Prom 24017 - 24076 4.7 28 11 Tu 1 . + CDS 24126 - 24875 841 ## Closa_0125 aminoglycoside phosphotransferase + Term 24890 - 24926 10.3 - Term 24873 - 24922 1.3 29 12 Op 1 . - CDS 25006 - 27270 1957 ## COG3345 Alpha-galactosidase 30 12 Op 2 3/0.000 - CDS 27334 - 28518 905 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases 31 12 Op 3 1/0.000 - CDS 28614 - 29297 780 ## COG2186 Transcriptional regulators 32 12 Op 4 44/0.000 - CDS 29360 - 30355 992 ## COG4608 ABC-type oligopeptide transport system, ATPase component 33 12 Op 5 44/0.000 - CDS 30352 - 31392 595 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 34 12 Op 6 49/0.000 - CDS 31420 - 32352 1128 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 35 12 Op 7 . - CDS 32364 - 32984 206 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 36 13 Op 1 . - CDS 33909 - 34064 62 ## gi|288870947|ref|ZP_06115879.2| sodium/hydrogen exchanger/universal stress family protein 37 13 Op 2 . - CDS 34086 - 34253 107 ## EUBREC_1606 putative reverse transcriptasematurase of intron 38 14 Op 1 . - CDS 35345 - 37066 2140 ## COG4166 ABC-type oligopeptide transport system, periplasmic component 39 14 Op 2 . - CDS 37111 - 38439 1287 ## COG0673 Predicted dehydrogenases and related proteins - Prom 38661 - 38720 6.2 - Term 38818 - 38858 3.5 40 15 Op 1 35/0.000 - CDS 38871 - 40688 194 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 41 15 Op 2 . - CDS 40701 - 42431 183 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 42 15 Op 3 . 
- CDS 42440 - 42544 63 ## Predicted protein(s) >gi|229784098|gb|GG667637.1| GENE 1 3 - 1055 920 350 aa, chain + ## HITS:1 COG:CC1085 KEGG:ns NR:ns ## COG: CC1085 COG4805 # Protein_GI_number: 16125337 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Caulobacter vibrioides # 8 338 203 550 564 101 25.0 2e-21 YKEQNLAVLKDHFIPAYKNLIDGITALMGTGTNEKGLSEFPQGKEYFEYLVKSNTATSYD TVLKLQKAIEKQLNSDLKALGEIVQTHPELVDALDTYTFQYTKPDEILKSLNGQISKDFP ALPECSYKVKYVPKALEGTLSPAFYLVPPLDRYQDNVIYINGGTRFQNEDLFPTLAHEGY PGHLYQNVYFTSHNTCDLRRILSFSSYSEGWATYVEYYSYTLDNGLDPSMGQLLARNSAI SLGLYALLDICINYDGWDKGQVKDYLSKFYNIGDSDVADTIYYSLVENPTNYMEYYVGYL EIMEMRNTAEKILKDQFDLKAFHTFLLDIGPAQFSIIQPYFRLWLAEQMD >gi|229784098|gb|GG667637.1| GENE 2 1184 - 1702 616 172 aa, chain - ## HITS:1 COG:no KEGG:Closa_0155 NR:ns ## KEGG: Closa_0155 # Name: not_defined # Def: Lipoprotein LpqB, GerMN domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 15 172 20 177 178 224 73.0 8e-58 MCFVGAMLVLSLAACTPTPVKQSESATEPMVEASTGGASDKVPDPNVEPMEIISVYSSNE DATGLNQAMDAVDELTAQSLVDKLIEYGVLEEGTEVLNFDTKDGVGTLDLSKVPSSGTSG EMLMLTAIGNTFTENFELDKLKLLVNGKNYSSGHIEHGDNDYLEYNRDYKKM >gi|229784098|gb|GG667637.1| GENE 3 2658 - 3101 390 147 aa, chain - ## HITS:1 COG:CAC3579 KEGG:ns NR:ns ## COG: CAC3579 COG1846 # Protein_GI_number: 15896813 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 3 142 4 143 154 132 51.0 2e-31 MDTYGTLNEVLVRLFRDIMDIEQQAIITSEFNDLTNNDMHVIEAIGIEEPKNMSTIAKEL SVTVGTLTIAMNSLVKKGYVTRQRGKEDRRVVYIFLSEKGRAAYEHHARFHKDMIDGVIG QLSGDEVEALIKALTKLNTWFRSLEEN >gi|229784098|gb|GG667637.1| GENE 4 3224 - 3751 485 175 aa, chain - ## HITS:1 COG:CAC0470 KEGG:ns NR:ns ## COG: CAC0470 COG0700 # Protein_GI_number: 15893761 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Clostridium acetobutylicum # 6 175 3 172 173 123 36.0 1e-28 MSFILFLSEAMVPLIIFYIVGFGILSKRPVFDDFMNGAKDGMKTVAGIMPTLIGLMTAVG 
VLRASGFLDFLGELLKNPASWLCLPAPVVPVILVRLVSNSAATGLVLDIFKEYGTDSYVG MLTSILMSCTETVFYCLSIYFGTVKIKKTRYTMGGALAATAAGVAASIVLSRFLA >gi|229784098|gb|GG667637.1| GENE 5 3951 - 4316 286 121 aa, chain + ## HITS:1 COG:BH3485 KEGG:ns NR:ns ## COG: BH3485 COG1393 # Protein_GI_number: 15616047 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Bacillus halodurans # 1 118 1 118 119 124 55.0 5e-29 MNILFLQYPPCSTCKRARKWLDDHQIPYEARHIKEQNPSVSELEAWYKKSGLPLKKFFNT SGNLYKEQNLKDRLPEMTETEQLELLATDGMLVKRPIVVGEDFVLVGFKEEEWGSNLKIR F >gi|229784098|gb|GG667637.1| GENE 6 4378 - 4881 159 167 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621082|ref|ZP_06114017.1| ## NR: gi|266621082|ref|ZP_06114017.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 167 1 167 167 335 100.0 9e-91 MKKGLLNIILSSVICIAFATPVFAQNIYFTDSKQITNAAVKTSQDITKQPRGNIISTGIL EVSNPGGGEIGVFMQTLTHVDVDETVFGIYLDRWIESEQRWATVADYKFTYNKENSPDED LMTKAISFNVVGQPAECYYRLRGVHLVSVNGNREMLTTKTDGVLITK >gi|229784098|gb|GG667637.1| GENE 7 4895 - 5353 176 152 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870421|ref|ZP_06114018.2| ## NR: gi|288870421|ref|ZP_06114018.2| putative glucosamine-6-phosphate deaminase [Clostridium hathewayi DSM 13479] putative glucosamine-6-phosphate deaminase [Clostridium hathewayi DSM 13479] # 1 152 93 244 244 293 100.0 2e-78 MRERGRGNGDIIWNNDEYVLTKEDEINDIYLKIEERLNIKPLKFMFLPFGMKFDELKIVG DHANIKFFYENKAFNCVQSKKPVTSSNNVVSDRRVYKQIYNEWLDISIPIEKNKLNDKVE YSAKFDLNGCYFYISGIIDEDNFDKIIENLNY >gi|229784098|gb|GG667637.1| GENE 8 5819 - 6640 451 273 aa, chain + ## HITS:1 COG:CAC2231 KEGG:ns NR:ns ## COG: CAC2231 COG0707 # Protein_GI_number: 15895499 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Clostridium acetobutylicum # 3 271 5 273 359 324 56.0 1e-88 
MKKIILTGGGTAGHVTPNLALIPSLKEHGYEIRYIGSYQGIERKLIEGAGIPYDGISSGK LRRYFDIKNFSDPFRVVKGYGEARKLIKLHRPDVVFSKGGFVAVPVVLAAKHYKIPVIIH ESDMTPGLANKICIPSAAKICCNFPETLGYLPKEKACLTGSPIRKELLQGDRLTGLKHTG LSANRPIILIIGGSLGSVTVNRSVRHILPRLLESFQVIHICGKGNLDESLKATRGYVQYE YVDAPLKHLFAAADLMISRAGANSICEILALAS >gi|229784098|gb|GG667637.1| GENE 9 7555 - 7818 175 87 aa, chain + ## HITS:1 COG:CAC2231 KEGG:ns NR:ns ## COG: CAC2231 COG0707 # Protein_GI_number: 15895499 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Clostridium acetobutylicum # 1 82 271 352 359 81 45.0 4e-16 MALHKPNILIPLSAAASRGDQILNAKSFEKQGFSCVLEEENLSDDSLFQAITQTYHDRQT YISRMEQSELHNAVDTIVDMIESLASN >gi|229784098|gb|GG667637.1| GENE 10 7846 - 8382 502 178 aa, chain - ## HITS:1 COG:no KEGG:Closa_0149 NR:ns ## KEGG: Closa_0149 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: C.saccharolyticum # Pathway: not_defined # 1 178 3 180 180 265 73.0 6e-70 MQQDEDMLLQSIYEEYQGTLRRIARALNVPNMELEDVVQETFIAYFRKYSLTWSPTLKKA MLVKILKGKAIDCLRKNGHYEKVSLDEENSIRCIEMLTTYVVTDPIDIIISEESIQRITT EIANMRQEWKEMAVLYFLEQRTIPEICEMLEIPGTVCRSRIYRTRMCLKKILGPKYDI >gi|229784098|gb|GG667637.1| GENE 11 8407 - 8823 256 138 aa, chain - ## HITS:1 COG:BH1189 KEGG:ns NR:ns ## COG: BH1189 COG0537 # Protein_GI_number: 15613752 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Bacillus halodurans # 3 111 4 112 142 120 47.0 6e-28 MRDDNCIFCKIANGEIPSETIYEDETFRVILDLGPASKGHALILPKQHYSDICELDEDVA ARVLPLAAKIGTAMKKSLNCAGFNVVQNNGVEAGQTVFHFHVHIIPRYEKGPVMVSWVPG EVSPEELTETGAAIRSTL >gi|229784098|gb|GG667637.1| GENE 12 8843 - 9463 415 206 aa, chain - ## HITS:1 COG:BH1421_1 KEGG:ns NR:ns ## COG: BH1421_1 COG0705 # Protein_GI_number: 15613984 # Func_class: R General function prediction only # Function: 
Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Bacillus halodurans # 4 179 180 349 349 110 36.0 1e-24 MEDLYGRKKAYVNIGLIALNVLYFLFLEATGSSENTSFMVAHGAMYAPLVIERGEYYRLI TSVFMHFGISHIMNNMLILFILGDNLERALGHIKYLFFYLICGVGANIVSMIVNLGEYRN VVSAGASGAIFGVIGGLLYAVIINRGRLEDLSTRQLVVMIVCSLYFGFTSTGVDNAAHIA GLLIGIVMGILLYRKPRKKRNTEIWG >gi|229784098|gb|GG667637.1| GENE 13 9511 - 10956 822 481 aa, chain - ## HITS:1 COG:no KEGG:Closa_0140 NR:ns ## KEGG: Closa_0140 # Name: not_defined # Def: FHA domain containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 481 1 437 437 385 48.0 1e-105 MEITYKRELKHNYLIITPEEAFYDSYEIRMLASNCIEGLLKFHVKQVDNRKSYYYEITSR QPLTRILEYQSLGAEHLRCLITGIVRTLERMETYLLQEGQILLEPEYIYVEPEYFSVYLC LIPGRQGGFPEEMTALLQYLLGKVNHQDKECVVMAYGLYQESLKENYGMADLLRLTGLDQ AADRRIRQGEEWSTEENNWQDVEMEESGGDEFLRPVNRRKMETRSEPEWKEHGKIVSENK KILQAGNEKREERRMKKKTLFQNILTLCTILAGGPVLCWLLYGGRGVKHYWPILAGVDVL TALFLAVRTIAFGLHEENPEERHFFEREHKNRDRYGDTEKRHEWQMVFQEEPETNCDIPV SGNPEVKIRNEECNTVLLTEQEQPANARCFRSLEPGVPDIIISYVPFLIGKQEGLADYVL SRDTVSRLHAKIDREGEEYRITDLNSTNGTMVGGRLLETNETAPLLPGEEVYIANVGFIF T >gi|229784098|gb|GG667637.1| GENE 14 10958 - 11455 263 165 aa, chain - ## HITS:1 COG:no KEGG:Closa_0139 NR:ns ## KEGG: Closa_0139 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 37 165 1 129 129 146 56.0 3e-34 MSVSVPVWKKIRKKYTQDGGAVLTVSDSRLDLDGDFMKRRLKGSYTVEAAFVMALVLWAV LFSVQAAYRLRDETVGAMALHGAAEYLRHGEEKTAEEAENYAERLAGRPFSWSGYQFRIE QTERVLKGHRIHALGTAGTWSLEISQDEYDPENFLRMCSLPDREE >gi|229784098|gb|GG667637.1| GENE 15 11344 - 11781 309 145 aa, chain - ## HITS:1 COG:no KEGG:Closa_0138 NR:ns ## KEGG: Closa_0138 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 10 144 8 142 143 97 46.0 1e-19 MEKMKWSHVIFMTFLLAASWQDLRNKSVSIWLYLGYGAAAGVMRLIGGGPVFTILSGTLV GVVLLAAGRLTSGAIGSGDGMFFIVSGLYLSLNENIRLLLNGILWGGVFCLFLFLYGKKY GRNIHRMAVPFLPFLIPGWIWMVIS >gi|229784098|gb|GG667637.1| GENE 16 11816 - 
12781 388 321 aa, chain - ## HITS:1 COG:no KEGG:Closa_0137 NR:ns ## KEGG: Closa_0137 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 321 1 307 307 300 50.0 5e-80 MPFFQRPLSKDLNNKRSVTLQVRIPLELRGLPGKRVLSSASIRSLWCVRPLDCLRMGVSI TVEAALALPLFLFAVMILMMPMILMNESRKIQSELELVCAEISQYAGVLPDSALERGDYE KGGIPEELMEDRTKTGFRIYAEGKIRSRIRTEKAGHFSLTDSRILEDGETIDLILRYEMY LPFPVFRIKSVPMTARSCRRAWIGRTGGKGNGNSGQAENDELVYVGKSSTRYHRDRNCHY LYNHISVISFADVESVRNSDGRKYKPCARCGSQAGQGSSVYIMPSGESYHSDRNCSSIIA YVRAVPLSEAEYLGPCSYCAE >gi|229784098|gb|GG667637.1| GENE 17 12800 - 14680 1630 626 aa, chain - ## HITS:1 COG:no KEGG:Closa_0136 NR:ns ## KEGG: Closa_0136 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 626 1 630 630 596 51.0 1e-168 MRKSGQITVFLSLVLLCVCSLVCGLVESARTAGAGWYLKLAADSAMDSVFSGYHREAWEQ YRLFLLEYEEEEEPGSTWLKYMEPYMEEHGWYPAVIKSAAVKGISGITDKHGANLKQEIQ DYMRFGIWDTVSEESEAENLWKELKEAESLQHVSEAYSGHAREAARMERALEAIWKSLES QEEYRQKAAVRLADRDGAGFRTAAKQLKKEGGRLPGLIKTYEKNADDLAAHLEKTREETK QKWEDLSEGVRSAMEEELAGYESYISENGAKRLEIRALIPEAEGNTAIIDRAVERSYEVE EIIAEWDEEDEGGGPDEGALWDSVEVVWDRVKVPELSFRAGLKEADKQNLLEQIQQMAGL NLLMLVLPEGEEVSKGVIQTKSLPSVHHTEDILKDSFLERVLTDEYIGRFFTCFRSEDEK EVCYEQEYVLGGKSTDEENLKSAAARILAVREGLNLIHILSDSQKREEAHALAAAITGVT GTAPLTGIVAFFVMTVWALGESAADVKALLGGERVPLIKTKETWNLDLDGLLALGEQGKV EAGQEKGVGKNYEEYLKMLMFMEPAERLYYRVMDVVQINLSVKQPGFTMERCACQAEIVG TGTGKHLFWLGGDPCYTVEVHTDKAY >gi|229784098|gb|GG667637.1| GENE 18 14682 - 14864 422 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621094|ref|ZP_06114029.1| ## NR: gi|266621094|ref|ZP_06114029.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 60 1 60 60 71 100.0 2e-11 MEMVKKELMAFLKEEDGVGVIEIVLILVVLIGLVIIFKKQITTLLNNVFKEINSQSKEVY >gi|229784098|gb|GG667637.1| GENE 19 14882 - 16117 940 411 aa, chain - ## HITS:1 COG:no KEGG:Closa_0134 NR:ns ## KEGG: Closa_0134 # Name: not_defined # Def: 
hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 411 2 420 420 383 50.0 1e-105 MRKSIKYPVPKSSVICAAAGILLYAGLMIWQGGKWMPVSSLQRPAHGDGELLYEVIVREL GEEGREIPVTIPVGERQYSDEEAEALYDAVFPEMTERILGGNESLEAVRSDLDLVRVLEP YGLTVRWESENVERVDTFGTVVNSGVPEEGEPVWLKAALTDGVHTREYEIRVTVMPPVLN GDELFQEKIMEACRRLDSVQKTDNQLFLPEEVAGKKVSYYRERETDYTVIPVLGLLLAFL WAARVKMNEQNVKKQREQLLLLDYSEIVSKFMVFISAGMTIRTAWERIAAGYEKTVQTGN RKARPAYEEMCHTISQLKSGMAEGRAYGEFGRRCGLQPYVKLAALLEQNRKTGSKNLKSA LELEMVSAFEQRKNLAKKLGEEAGTKLLLPLFLMLGVVMIMIVVPAFLAIY >gi|229784098|gb|GG667637.1| GENE 20 16120 - 17001 521 293 aa, chain - ## HITS:1 COG:RSc0651 KEGG:ns NR:ns ## COG: RSc0651 COG4965 # Protein_GI_number: 17545370 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein TadB # Organism: Ralstonia solanacearum # 52 293 84 325 325 64 21.0 2e-10 MKKGKSNSIPCMSLKKTLKKGRGRIRPSPAALRRPEQLSTRKSCGPQGLYYDRYRLSAGE WFRYAAAGAGIAAAVSYVFYRSLPVFIMLSPFGLCYPLIKKADLKDQRLKRLNLEFKEGI LILSSFLSAGYSVENAFSASIRELAVMYGASGMMVLEFSHMEGQIRMNRPVEQVLMEFGD RSGLDDVRNFAEVFAAAKRSGGELVSIINHTSGVIRDKIQIQEEIATMTAAKQFEQKIMN LIPFFLVLYVDVSSPGFFDMMYQTGAGRWIMTGCLAIYAAAYVLAGRILRIEV >gi|229784098|gb|GG667637.1| GENE 21 16847 - 18031 833 394 aa, chain - ## HITS:1 COG:SMc02820 KEGG:ns NR:ns ## COG: SMc02820 COG4962 # Protein_GI_number: 15963897 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Sinorhizobium meliloti # 10 366 71 426 466 290 45.0 5e-78 MEVLEGQSHTSDEELYELIDERILEYGQARYLPLKERVELRSSLFDSFRRLGALQELVDD REVTEIMVNGPDHIFLERKGRIETWEKGFESTEQLEDMIQQIVSRVNRTVNVSRPVADAR LPDGSRVHIVLPPIALDGPAVTIRKFPEPITMEKLIRFGSISGEAAGCLKNLVTAGYNIF ISGGTNSGKSTFLNALSAFIPAEERLITIEDSAELQIRSVPNLVRLEARGPNTEGEGAVT IADLIRAALRMNPDRIIVGEVRGREALDMVMAMNTGHDGSLSTGHGNSPKDMLSRLETMM LMGAELPLPAIRSQLASALDILVHLGRLRDRSRRVLSIVEVDCYEEGEIKLNSLYEFEED VEEGKGPDTAVSGCLKKTGAIKHKEKLRAAGIVL >gi|229784098|gb|GG667637.1| GENE 22 18107 - 19105 1048 332 
aa, chain - ## HITS:1 COG:CAC0037 KEGG:ns NR:ns ## COG: CAC0037 COG1192 # Protein_GI_number: 15893335 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Clostridium acetobutylicum # 6 283 6 291 361 83 23.0 6e-16 MTNRIMAVYDVDPFYADRFADFVNQKEKVPFTVMAFTTLERLKNYAADHEIELLLINSSV PKAEIDRLGAVRVVTLADGETVPVDGQYPSVYKYQATDNIMREVMACYCAGTIQPALTVI GNAGDIIGVYSPVNRCLKTSFALTAGQLLSRKSRVLYVNLEDCSGLEKLFGEEHKADLSD LLYFYGQGNYSWVRLSSVIYSWGDLDYIPPVRYPEDLCQITAEEMADLLTRIARESAYDI VVVDLGQFGKKAAPVLEVCSTVYMPVKDDCMSVAKVEEFEEYLIRSGHEALKERIQKIKL PYHSSFGRKENYLEQLLWGELGDYTRQLLNGR >gi|229784098|gb|GG667637.1| GENE 23 19116 - 19625 409 169 aa, chain - ## HITS:1 COG:no KEGG:Closa_0130 NR:ns ## KEGG: Closa_0130 # Name: not_defined # Def: peptidase A24A prepilin type IV # Organism: C.saccharolyticum # Pathway: not_defined # 1 163 10 165 171 147 53.0 2e-34 MLAAGAYYDVREHRIPNWWVAVSAVCGILLSMVESGAPPGPAAYLKECGLFLARMLTVSA LFFPLFICRMIGAGDIKLAALICGYSGFAGGASAIGLGFLIGAFWSFLKMMVKGSFHERF CHLAAYIRRIYHTKTITAYYDKARDGTEAVIPLGVCLFFGTMAFIMMPG >gi|229784098|gb|GG667637.1| GENE 24 19997 - 20392 481 131 aa, chain + ## HITS:1 COG:MJ1325 KEGG:ns NR:ns ## COG: MJ1325 COG0640 # Protein_GI_number: 15669515 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanococcus jannaschii # 17 115 10 85 89 60 36.0 1e-09 MLEAGDLNKLFRDCMPLFIALGDEVRLSIIGVLAGSGLYDDYETDIAVSLNGYRPSGVPG MNVKEITEQTNLSRPAISHHLKILKDAGLVNVRREGTSNYYYLTIADSTKELRRLGNCLQ ELLELYSPKTS >gi|229784098|gb|GG667637.1| GENE 25 20486 - 20704 221 72 aa, chain + ## HITS:1 COG:no KEGG:Closa_0128 NR:ns ## KEGG: Closa_0128 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 72 1 72 72 101 84.0 1e-20 MKNIIFRKRLVRSEEEKNLRREIERSKTAIDSARNHFEQVIDPTLIDCYIYELNAAQLRY QFLLRRFKSREV >gi|229784098|gb|GG667637.1| GENE 26 20730 - 22064 1090 444 aa, chain + ## HITS:1 COG:CAC2686 KEGG:ns NR:ns ## COG: CAC2686 COG0366 # 
Protein_GI_number: 15895944 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Clostridium acetobutylicum # 4 442 3 450 451 397 42.0 1e-110 MSKWYEKGVFYHMYPLGLTGAPKENSGADVTDRFEEMDQWIPHIRSLGCNAIYIGPLFES SSHGYDTRDYRTVDRRLGDNESFKNFVSLCHESDIKVVVDGVFNHTGREFFAFLDIAANR ESSPYRDWYRGVNFGWNSPLGDSFGYEAWQGHYELPCLNLWNPDVKQYLFDTIRFWIDTF DIDGIRLDCANVLDFGFMKELRSQTAAMKEDFWLMGEVIHGDYSRWVNDEMLHSVTNYEL HKSIYSGHNDHNYFEIAHNVKRLEAIGGRLYTFVDNHDEDRIASKLNDPDHLYPVYTLLF TLPGIPSIYYGGEWGVEGRRTKTSDDALRPCIAIADRETLHCGLTDLIARLGTIHDENPE FHGGRYQELLLTNRQYAFARFHQGHAIITAVNNDGAETSVTIPSPVTGTTAVNLLDGEEY PVTDQKITVTLPANRGVILKITEG >gi|229784098|gb|GG667637.1| GENE 27 22069 - 23979 1890 636 aa, chain + ## HITS:1 COG:all0713 KEGG:ns NR:ns ## COG: all0713 COG0296 # Protein_GI_number: 17228208 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Nostoc sp. PCC 7120 # 13 599 109 714 764 622 48.0 1e-178 MAGSKKEIGTGFITEVDRYLFNNGRHYEIFEKMGAHPKKYKGKSGMYFAVWAPHAEQIGV VGDFNSWNPEANPMQPLADSGIWEAFVPGIATGELYKYAITTKSGKILFKADPYAFSAEY RPGTASVTTDISGFQWSDDAWMEKRGSADPLNAPMSIYEVHLGSWRRKNRPEKDGCYTYV EAAKELADYVTDMGYTHVELMGIAEHPYDGSWGYQVTGYFAPTSRYGTPQEFMYFINYLH KKGIGVILDWVPAHFPRDAHGLSEFDGEALYEYADPRKGEHPDWGTKVFDYSKYEVDNFL IANAIYWVEKYHVDGLRVDAVASMLYLDYGRENGNWIPNKYGGNENLEAIEFFKHLNSVM ADRGNGAIVIAEESTAWPKVTQKPENDGLGFTFKWNMGWMHDFLEYMKLDPYFRKYNHHK MTFGLTYFTSEKYILVLSHDEVVHLKCSMINKMPGLLDDKFANLKAGYTFMLGHPGKKLL FMGQDFGQFHEWDEKTALDWYLADEPLNADLQSYVKDLLQLYRKYPALYELDYDWDGFQW INANDGDRSIFSFVRSSRDHKRSLLFVINFTPVERPDYRVGVPKRGTYTLVLDNSHGLYK RGDKAPAFRSVKKECDGQDYSIAFPLPAYGTAVFRF >gi|229784098|gb|GG667637.1| GENE 28 24126 - 24875 841 249 aa, chain + ## HITS:1 COG:no KEGG:Closa_0125 NR:ns ## KEGG: Closa_0125 # Name: not_defined # Def: aminoglycoside phosphotransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 249 1 249 249 460 89.0 1e-128 MAAELIATTATKKVYRDGDKAIKVFNADFPKAEVLNEALITARVEETGGIDVPKVLEVGV 
FEGKWSITFDFIEGKTLQQLMDENPDKLAEYMEQMVDLHLNILSKHCPLLNKLKDKMSRQ ILETEELSDITRYDMRTKLDSMPKHTKLCHGDFNPTNIVVREDGSMCVLDWVHATQGNAS ADVARTYLLFCLQDQKKADMYMDIFCKKTGTAKKYVQTWLPIVAAAQLSKKRPEEVELLQ KWANVCDYQ >gi|229784098|gb|GG667637.1| GENE 29 25006 - 27270 1957 754 aa, chain - ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 1 744 1 740 748 814 53.0 0 MAIAFHERTGEFHLYNREISYIMKLLPDGSVGQLYYGKRLRDREDYGHMLEYAVRDMAPC PFENDNTFSREHVRQEYPVYGTGDMRCPAYRITCHNGSRVTGFAYTGHRIYEGKKPLEGL PAVYVENDREACSLELYLEDRLIHTRLVLTYTVYRDFPVIARHARFECMEGCEGIRLDQA MSLSLDLPDSCYTMLDLAGAWGRERYPDFHPLHQGIQSVHSMRGHSSHQFNPFLALMRPG TGETAGEVIAAALVYSGNFLASAEVDTMGTTRVMIGIHPEGFSWPLAAGEHFQTPEALLI YSDKGLNGMSQVFHQLFRTRLARGWWRDRERPILINNWEATYMNFNEEKILTLAEKAKEV GVELFVLDDGWFGNRDDDTSSLGDWYADRKKLPDGIEGLSQKIRNMGLSFGLWFEPEMIN QNSRLYDAHPDWVLGSGDRPRSVGRHQLVLDFSKDEVIDYIGGLMEDIIERGHLSYIKWD MNRSITEVWSSGRGAAEQGTILHRQILGVYRLYERLIRKFPYVLFESCASGGARFDPGML YYAPQAWTSDNTDGVERIKIQYGTSYVYPLSSMGSHVSAAPNHQTHRSVPLAARAVCAYF GTFGYEMDVAGIGGDELEQMKKQTAFMKEHRRLIAEGTFYRLISPFENERNEAGWMVVSP DKREALAAYFRVLQPVNTGFCRMRLSGLLEDGEYEIREQGLETLPYTYYGDELMNSGLIL SDQASGVRTAGVPQGDYLARLFILNVRDGRDGRR >gi|229784098|gb|GG667637.1| GENE 30 27334 - 28518 905 394 aa, chain - ## HITS:1 COG:APE1226 KEGG:ns NR:ns ## COG: APE1226 COG0626 # Protein_GI_number: 14601268 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Aeropyrum pernix # 17 391 10 381 384 270 40.0 5e-72 MTDQSEYLKNDKLITSHYGEEFEHYYNAVVPPIFMNSLNVFETVDDYYDSDKTDKHVYCY GRVQNPTVRILEDKAAALEQGTGALAFASGMAAATTAVLTVCKAKSHVVCIRSAYGPLKT FLNEYCREHLDMSVTYVKGDDTEEFEHAVTDQTDLIILESPSSVVLSLQDIHAVSEIAKK HHAYVYIDNSFCTPIYQKPLLLGADIVMHTASKYMGGHSDIIGGVLAVKDEELMARLRKQ RELFGGIIGPMEAWLIIRGLRTMEVRVERHQATAMEIAEFLESHPKVRKVYYPGLLSHPQ HELMKRQQGGNTGLLSFEIKGSVEQAKEVAQRLKIFKIGVSWGGFESLVCMPHARQDAES 
CRFLGADQNVLRIHCGLEGAEVLKADLENALSCV >gi|229784098|gb|GG667637.1| GENE 31 28614 - 29297 780 227 aa, chain - ## HITS:1 COG:mll0857 KEGG:ns NR:ns ## COG: mll0857 COG2186 # Protein_GI_number: 13471000 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 5 225 25 249 260 101 28.0 1e-21 MVNKIKRTSLQSEIIRFILDYIREHDLNPGDKLPSQESLLAMMGVSRTSLREAMKTLEAR GILEIQNGKGAYVGGLVDADGVQVIDFTQEKERLLEALEVRKILEREILRMLIHTITDEE LKELGEIKDILMSKYRRGLQQTAEDKKFHYTIYRLCHNQVMYQLILSLNNVMEKFWEFPL NMEDPFLESIPLHEELYEAICEKNVRKAQSINDKLLEAVYRDIRKQR >gi|229784098|gb|GG667637.1| GENE 32 29360 - 30355 992 331 aa, chain - ## HITS:1 COG:CAC3628 KEGG:ns NR:ns ## COG: CAC3628 COG4608 # Protein_GI_number: 15896862 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Clostridium acetobutylicum # 5 324 3 321 322 446 66.0 1e-125 MTGIEKDNQKPMVHVEHLKKYFQVRKKGILKAVDDVSFDICAGETLGLVGESGCGKTTCG RTVVKLYPATSGTIEFDGRDVGRLKTNEEIKWFKKNAQIIFQDPYASLDPRMTVGDIVAE GMKNHDLYPGKERMERVHELLNLVGLNKEHANRFPHEFSGGQRQRIGIARALAVDPKFIV CDEPISALDVSIQAQVVNLLVRLQKELGLTYLFIAHDLSMVKHISDRVGVMYMGHIVELA SSAELYRNPQHPYTKALLSAIPVPDPEFERSKTLIPLEGEVGSPINCSEGCRFEKRCKYA TEKCRSVTPQLTEIETGHYVSCHLLNQAKGH >gi|229784098|gb|GG667637.1| GENE 33 30352 - 31392 595 346 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 327 3 327 329 233 38 1e-60 QASPMSQPLLEVKNLKVSFHTYAGEVQAVRGVDFTLEAGKTMAIVGESGSGKTVTSKAVM GLIASPPGEVKKESSILFEGTDILKYTEKQWQDYRGKDAAMIFQDPMTSLDPTMKIGRQI MESIVEHQKVGKREAMQRAIELLKKVDIPNPDERVHQYPYEFSGGMRQRVVIAIALACNP KLMIADEPTTALDVTIQAQIIGLLKEIQKANNTSVIMITHDLGVVAGMADTIAVMYAGCI VEYGNVFDIFNAPQHPYTNALLNAVPRLNVENKAQLEVIPGTPPDLISPPKGCGFATRCK YCMEICRQAPPEYTEVSDTHKAACWLHHPMAREAFDYDDIRRVFEA >gi|229784098|gb|GG667637.1| GENE 34 31420 - 32352 1128 310 aa, chain - ## HITS:1 COG:BS_dppC KEGG:ns NR:ns ## COG: BS_dppC COG1173 # Protein_GI_number: 16078359 # Func_class: E 
Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 22 310 31 320 320 308 52.0 7e-84 MEQGNEMTVTKDLFRRVGVDEQKSESVVRPSISYWQDAVRRLKQNKVAMVSLVVLLLILV MCAIGPYIYPHPFDEQNIEFTNQAPSLQHWFGLDDLGRDIFARIWMGGRISLTIGIVGAI VSLVIGVLYGGISGYCGGLVDDIMMRIVEILVGVPYMVVVIIMSVVLGKGMTSLLIALCI TSWTNLARIVRGQVLELKESEYVLAARALGTPPLQIIVTHLLPNTLSLIIINTTFSIPSF IFAEAFLSFVGMGIQPPLTSWGAMAALGQQQMSYYPHELIFPALAISLTMLAFNLLGDGL RDAFDPKLRQ >gi|229784098|gb|GG667637.1| GENE 35 32364 - 32984 206 206 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 1 200 90 314 320 84 27 1e-15 LGLLLGILAAFHRGTWIDFITIFIAIVGVSIPSFVFAALLQKYGGGDYFPIVGWVSPGMK IPDIFRYTALPTLAASIGGLATYSRFMRSSVLDVLSSDYILLAKAKGLSNWKIITRHVLR NSITPIISIVAPQVAGIVTGSFVIERIFSIPGLGRYYVDSVNGRDYPMVLATTIFFSFIF ILSIVVMDILYAIVDPRVRKGIIEGK >gi|229784098|gb|GG667637.1| GENE 36 33909 - 34064 62 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870947|ref|ZP_06115879.2| ## NR: gi|288870947|ref|ZP_06115879.2| sodium/hydrogen exchanger/universal stress family protein [Clostridium hathewayi DSM 13479] sodium/hydrogen exchanger/universal stress family protein [Clostridium hathewayi DSM 13479] # 1 51 1 51 431 98 100.0 1e-19 MHGGVRGRELITPSYSIITKARFCPANKNKGFGGYIVMLTSLSLVFLLGML >gi|229784098|gb|GG667637.1| GENE 37 34086 - 34253 107 55 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1606 NR:ns ## KEGG: EUBREC_1606 # Name: not_defined # Def: putative reverse transcriptasematurase of intron # Organism: E.rectale # Pathway: not_defined # 1 54 410 463 464 96 83.0 3e-19 MGVPKELAWKAGNSRRGYWFTTQTVAVNMAMTKERLINRGYYDLATAYQSVHVNR >gi|229784098|gb|GG667637.1| GENE 38 35345 - 37066 2140 573 aa, chain - ## HITS:1 COG:CAC3632 KEGG:ns NR:ns ## COG: CAC3632 COG4166 # Protein_GI_number: 15896866 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide 
transport system, periplasmic component # Organism: Clostridium acetobutylicum # 63 572 45 560 565 257 33.0 5e-68 MKKRWLALCLATVMAASLTACGSSKTSAPDTTGTETAAAGSETKAGGETKAEEAPAGAVE QVVTIPMFSDPDTLDPGRSDDEQKNAIVLEIQETLIRLMDGKVTPGGAESWETSDDGLVW TFKLRDNQYSDGTAVTAQDYVNSIRRIFDPEVNCHNAGIFYCIKGGEDFNTGKGSKEDVA AKAIDDKTLEITLTEPLPYFLQLMTFANVTPVPESKTQGEKNSSYGATAEELSSSGPFYA AEWTRGSKIVLKKNPNYWDAANVKLETVNMVLAQDENTREQMFNQGQLDILRKVRSEYAD TLKAKIDGGEVQLLEGPQPRYSYICFNNEDPDGIFTNEKIRMAFSIALDRESLVKNVMKK DQAAYGMIPYGLSIGEELFRDVYPEPMKDFLAQDPKALLEEGLKEIGKEGEQITVTFLQK NSDNDTKVQAEYYQNQWQTKLGVIVKIDTASDNSSFNNQVSKGLYQVCQTGWGADYNDPM TFMQCYMTGDGNNPAFFSDARYDELVEACKTESDMKVRGEKFAEAEKIITVEHCGLAPIT FTYDKNVAKSSVKGFYINGAGGPAIELKSAYVE >gi|229784098|gb|GG667637.1| GENE 39 37111 - 38439 1287 442 aa, chain - ## HITS:1 COG:DR1362 KEGG:ns NR:ns ## COG: DR1362 COG0673 # Protein_GI_number: 15806379 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Deinococcus radiodurans # 1 422 1 396 403 253 36.0 5e-67 MKRPVTAAIAGLGNRGNDIYAHYQLVAPEEMKVVAVADPVAKKREAARAEYGVAPENCFE TVEDLLKQPKLADVLVIATQDKQHVAQAVEGIKKGYHILCEKPISPSLEECLLLQKTAHE YGRMVAVGHVLRYTPFYSKIKEVIESGQIGDVVSVQGLENVKYWHQAHSFVRGNWRDSVE TSPMILAKCCHDMDIFVWLLGKKCRRVSSYGSTYLFKESMAPEGCAMRCLDGCAAKENCP YDAEKIYITDKGTGIRHVVDEKLTGDDAWPCCVLSQTLTEEAIYEQIKTGPYGRCVYHCD NNVVDHQVVNMEFEGGVTVNFTMSAFTSSGGRDIKVMGTMGDIIGDLHTGIVKVTNFGKE TEIYDINATAGDLSGHAGGDNRLIHDFLEAVVEGETKGQLRTGVDVSIQSHIIALAAEYS RVHKGENVDLEEFIQSEKYKEL >gi|229784098|gb|GG667637.1| GENE 40 38871 - 40688 194 605 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 384 589 147 351 398 79 28 3e-14 MAARSRSEKRLDAAYNEPQGRSLIQRKHTILRLGRYMIQYKWLLLLAFVMTVGSNLLALV GPMLSGYAIDAIEPGIGKVDFSRVFHYAGWMAAFYVISSVLSYALSVLMITISRKVVYRM RKDVFDKMLALPAGYYDLHQTGDIISRISYDIDTVNESLSSDLIQILTTVITVAGALYMM VVISPRLVLVFAFTVPLSACITKFVTGRTRPLFRARSASLGNLNGFVEEMVTGQKTLRAY 
CQEEHVIQKFAVKNKEAVETYYRAEYYGSMVGPMVNFINNLSLSLISVFGALLYLAGYMT VGRISSFVLYSRKFSGPINEAANIMSDLQSALAAAERVFALMDEIPEAADMPGAIELGRG EEEVRGEVELSHVDFGYDRDRIILHDLSLKADPGKLIAVVGPTGAGKTTLINLLMRFYDP NSGEIRVDGHEIRGVTRKSLRKSYAMVLQDTWLFHGSIYENLAYGKEGATMEEVMAAAKA ARIHSFIKRLPDGYETILTDDGTNISKGQKQLLTIARAMLLDARMLILDEATSNVDTRTE IQIQEAMRKLMEGKTCFVIAHRLSTIRNADLILVIKQGEVVEKGNHEELMRAEGEYYQMY TAQGE >gi|229784098|gb|GG667637.1| GENE 41 40701 - 42431 183 576 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 330 529 1 198 312 75 26 6e-13 MKKIFSYYLKPYYLRMAVGFLIKFTGTLMDLFLPWTLAYMIDTVIPANQRPEIFLWGFFM IFCSVLAVVFSVAANRMASRVASSAIETIRGDLFEKVSRLSNTEIDRFTRPSLISRLTSD TYNVHQMLGRVQRLGVRAPILLIGGITMTMALDPVLACILLAVLPLLTLVVVVVSKKTIP MFFVLQDRVDRFVRLIREDIAGIRVIKALSKESYERERFDQANREVVDWERKATVTTSIT NPVMNVLLNLGLVAVILTGAFRVNQGTSEVGKILAFMTYFTIILNALMSISRLFIMISKA AASAARIVTVLEADQEMVLQELETNEEAAMAEAEADSVHVEFDHVSFSYNKVENNLENIS FRLKRGETLGIIGATGAGKTTIVNLLMRFYDADQGTVRIDGKDVRTMGLHELRKRFGAVF QNDTIFEDTIMENINMGRDLSEEAVMEAVLYGRASEFVAEKGGISEQLDIKGANLSGGQK QRILIARALAARPDILILDDSSSALDYKTDAALRKELREHFAETTCVIIAQRISSIMNSD HIMVLEDGQMIGYGTHRELMETCEIYREIGKSQMGI >gi|229784098|gb|GG667637.1| GENE 42 42440 - 42544 63 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLAERNADRSRIPVDAAGILARFGGILYNGVSG Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:31:12 2011 Seq name: gi|229784097|gb|GG667638.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld31, whole genome shotgun sequence Length of sequence - 63736 bp Number of predicted genes - 56, with homology - 52 Number of transcription units - 32, operones - 15 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 + CDS 56 - 1921 1062 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 2 1 Op 2 . 
+ CDS 1918 - 2109 303 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 2956 - 3015 80.4 3 2 Tu 1 . + CDS 3090 - 4400 734 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 4422 - 4478 15.9 4 3 Tu 1 . - CDS 4497 - 5750 1366 ## COG1051 ADP-ribose pyrophosphatase - Prom 5874 - 5933 5.6 + Prom 5833 - 5892 7.1 5 4 Tu 1 . + CDS 5926 - 6957 1051 ## Cphy_3761 hypothetical protein + Prom 7796 - 7855 80.4 6 5 Op 1 40/0.000 + CDS 7986 - 8675 690 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 7 5 Op 2 10/0.000 + CDS 8668 - 9378 544 ## COG0642 Signal transduction histidine kinase 8 5 Op 3 . + CDS 10357 - 10839 364 ## COG0642 Signal transduction histidine kinase + Term 10853 - 10888 1.7 - Term 10623 - 10658 5.8 9 6 Op 1 . - CDS 10878 - 12953 1830 ## COG1680 Beta-lactamase class C and other penicillin binding proteins 10 6 Op 2 . - CDS 12959 - 14431 1572 ## Ethha_0330 protein of unknown function DUF214 - Prom 14495 - 14554 80.4 11 7 Op 1 . - CDS 15394 - 16434 1162 ## Cphy_1027 hypothetical protein 12 7 Op 2 . - CDS 16436 - 17194 203 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 - Prom 17250 - 17309 5.1 + Prom 17294 - 17353 5.2 13 8 Tu 1 . + CDS 17457 - 18248 735 ## COG0789 Predicted transcriptional regulators + Term 18434 - 18469 -0.1 - Term 18218 - 18264 -0.0 14 9 Op 1 . - CDS 18340 - 19170 919 ## COG1082 Sugar phosphate isomerases/epimerases 15 9 Op 2 . - CDS 19226 - 20779 1308 ## COG5434 Endopolygalacturonase 16 9 Op 3 . - CDS 20817 - 21488 733 ## Cphy_2737 glycosy hydrolase family protein - Prom 21608 - 21667 5.5 17 10 Tu 1 . - CDS 21738 - 22595 698 ## Cbei_2264 hypothetical protein + Prom 22694 - 22753 3.0 18 11 Tu 1 . + CDS 22800 - 22910 58 ## - Term 22992 - 23028 4.8 19 12 Tu 1 . 
- CDS 23131 - 23217 62 ## - Prom 23250 - 23309 80.4 20 13 Op 1 3/0.000 - CDS 24150 - 24959 1001 ## COG0205 6-phosphofructokinase 21 13 Op 2 . - CDS 25035 - 28517 4100 ## COG0587 DNA polymerase III, alpha subunit - Prom 28555 - 28614 11.0 - Term 28655 - 28708 8.6 22 14 Op 1 2/0.100 - CDS 28743 - 28997 388 ## COG1925 Phosphotransferase system, HPr-related proteins 23 14 Op 2 . - CDS 29028 - 29996 988 ## COG1481 Uncharacterized protein conserved in bacteria 24 14 Op 3 1/0.100 - CDS 30003 - 30881 1106 ## COG1660 Predicted P-loop-containing kinase 25 14 Op 4 . - CDS 30878 - 31810 848 ## COG0812 UDP-N-acetylmuramate dehydrogenase 26 14 Op 5 . - CDS 31825 - 32757 1115 ## COG1493 Serine kinase of the HPr protein, regulates carbohydrate metabolism - Prom 32783 - 32842 3.8 - Term 32824 - 32864 -1.0 27 15 Op 1 . - CDS 32912 - 34789 1935 ## COG0322 Nuclease subunit of the excinuclease complex 28 15 Op 2 . - CDS 34797 - 35288 604 ## COG4728 Uncharacterized protein conserved in bacteria 29 16 Tu 1 . - CDS 35391 - 35570 204 ## 30 17 Tu 1 . - CDS 36519 - 37814 1559 ## COG1306 Uncharacterized conserved protein - Prom 37867 - 37926 7.1 + Prom 37925 - 37984 9.2 31 18 Op 1 . + CDS 38024 - 38224 256 ## COG1476 Predicted transcriptional regulators 32 18 Op 2 . + CDS 38221 - 38679 395 ## gi|266621150|ref|ZP_06114085.1| permease of the major facilitator family protein + Term 38735 - 38768 5.4 - Term 38723 - 38756 5.4 33 19 Tu 1 . - CDS 38796 - 40880 2183 ## COG3808 Inorganic pyrophosphatase - Prom 40924 - 40983 6.1 34 20 Op 1 . - CDS 41130 - 42005 1125 ## COG0024 Methionine aminopeptidase 35 20 Op 2 13/0.000 - CDS 42074 - 43288 1081 ## COG0845 Membrane-fusion protein 36 20 Op 3 36/0.000 - CDS 43298 - 45649 2287 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 37 20 Op 4 . - CDS 45668 - 46375 280 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 38 20 Op 5 . 
- CDS 46360 - 46614 254 ## BHWA1_00546 TetR family transcriptional regulator AcrR 39 20 Op 6 . - CDS 46592 - 46879 341 ## COG1309 Transcriptional regulator - Prom 47110 - 47169 80.4 + Prom 47993 - 48052 28.5 40 21 Tu 1 . + CDS 48151 - 49629 1601 ## COG4868 Uncharacterized protein conserved in bacteria 41 22 Op 1 . - CDS 49721 - 51076 993 ## COG0534 Na+-driven multidrug efflux pump 42 22 Op 2 . - CDS 51114 - 51290 109 ## gi|288870434|ref|ZP_06409754.1| conserved hypothetical protein + Prom 51195 - 51254 9.8 43 23 Tu 1 . + CDS 51312 - 51983 496 ## COG0306 Phosphate/sulphate permeases 44 24 Op 1 21/0.000 + CDS 52902 - 53318 382 ## COG0306 Phosphate/sulphate permeases 45 24 Op 2 . + CDS 53394 - 54017 644 ## COG1392 Phosphate transport regulator (distant homolog of PhoU) + Term 54019 - 54051 -0.9 - Term 53850 - 53896 -0.6 46 25 Tu 1 . - CDS 54025 - 55110 861 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 55147 - 55206 3.3 - Term 55214 - 55256 11.1 47 26 Op 1 . - CDS 55285 - 56277 714 ## mlr0585 hypothetical protein 48 26 Op 2 . - CDS 56280 - 56501 110 ## - Prom 56598 - 56657 5.5 49 27 Tu 1 . - CDS 56906 - 58351 1006 ## Sgly_1574 collagen triple helix repeat-containing protein - Prom 58376 - 58435 1.8 - Term 58452 - 58487 4.0 50 28 Tu 1 . - CDS 58496 - 58831 353 ## COG3870 Uncharacterized protein conserved in bacteria - Prom 58871 - 58930 8.6 51 29 Tu 1 . - CDS 58946 - 59512 534 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 59746 - 59805 3.9 + Prom 59494 - 59553 8.7 52 30 Tu 1 . + CDS 59611 - 60177 448 ## COG1633 Uncharacterized conserved protein + Term 60283 - 60313 1.0 53 31 Op 1 . - CDS 60170 - 60748 606 ## COG1896 Predicted hydrolases of HD superfamily 54 31 Op 2 . - CDS 60748 - 62670 1553 ## COG1193 Mismatch repair ATPase (MutS family) - Term 62869 - 62899 1.3 55 32 Op 1 . - CDS 63062 - 63544 421 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 56 32 Op 2 . 
- CDS 63586 - 63735 151 ## gi|266621172|ref|ZP_06114107.1| hypothetical protein CLOSTHATH_02315 Predicted protein(s) >gi|229784097|gb|GG667638.1| GENE 1 56 - 1921 1062 621 aa, chain + ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 330 620 309 600 602 162 31.0 2e-39 MQKNSLLKSIRGKITFFTAALTITLTVLAVTICFYAFQSYQKKMMIRSSEFNLQSIADNT SADLNNILAFVKWCCSSSDVAEYMDTIHGQGTLGQISSKNPRVAQMAFDTYDRIWEEYTL HNQSNYIRRLILSSDNPLTYIQMFYSSVPDRTDAASLIASQDFFIPLYEASSFQWNGLVQ NPFNKISKDYIIPVVRPIYSGESSQTIGWLYVEVSSELITNRLKAYPLESDSLLYLTIGE KNYLCKNGELTESPFEFQWISDLTDSTISSNSKAMEAILPELGERTVLTCSLGIPDFHLT EVLSVRRDRAQQTLYTCMIAGILFAALSFGIILMYFLNRTIVKPVHQLRQNMDSISRGNF LKNPDIEWENELGEIGKGINDLSESVVNLMDTRIEQERQKKDLEYQILQSQINPHFLYNT LNSVKWMATIQGATGIADMMTVLARLLKNVSKRSESMITLKEELELAGDYFQIQKYRYGS SISIRYKIASEDLYSCMVHRFSLQPLIENALFHGLEPKRAPGTITVSASSEVSSEATLLI LSVTDDGIGMTKEMIEKVMTCEFPEDSDFFKHIGISNVNNRIKHDFGDSFGITVTSEPGV YTTMTIRIPYIHRKEGVEKQT >gi|229784097|gb|GG667638.1| GENE 2 1918 - 2109 303 63 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 63 1 63 525 77 50.0 9e-15 MIKLLIADDEPLVQIGIKSMLDWDRLGFEICGIAVNGEHALELIEETSPQIVITDIRMPV MNG >gi|229784097|gb|GG667638.1| GENE 3 3090 - 4400 734 436 aa, chain + ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 304 428 396 526 526 91 35.0 3e-18 MIEYLVKLELSEDALKAAITKAVKRLEDIRRVEEPQAVLLNQSMTEKFFMRLLYNLFDNR 
EQFELQARDLKFPFHDFSYQAACCEICESDRFPLTREQLAHLYSSCLQMIHDVLTRYVRC HVISLDQKHFAVIFCLPSGEASCDVTSRALKDANSIVHTYFNVNLRIGAGTAVSTPFAVS ESYQEAREATLKASDNQPVISFSPSSTSSLQSSFNLSIFKRDMIRSFEEFDTDVLSRTLS SIIELFAAHPNRYLQAIDGACNIMYLAMSLLPEGEVNLSEIFEGWNGSYRSIYRCSSVEQ ITVWLETLRNGLCEILKSRKKTYKDHIVTNVKHYINDHIEERLTLNEVSDVFGISHNYLS VLFKKNCGIGFSEYITQTKIARSKSMLLEENLKIYEVADRLGFETPCYFSKVFKKVEGVS PREYLQEKTVAPEPEQ >gi|229784097|gb|GG667638.1| GENE 4 4497 - 5750 1366 417 aa, chain - ## HITS:1 COG:BH3089 KEGG:ns NR:ns ## COG: BH3089 COG1051 # Protein_GI_number: 15615651 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Bacillus halodurans # 217 404 6 194 207 161 47.0 2e-39 MRYQYIVFDVDGTLLNTEEAILQSFQKTLRDGAGIEKECSELTFCLGITGAAALNRLGIG NEEELLSQWKENLTALRDYVCVFDGMELLLKRLQAEGVHMGIVTSRSRAEYEEDFTRLGL YGYFDTVVCEEDTVNHKPEAEPLDFYMRKTGARPEEVLYIGDSIYDMECACAAKAACGLA LWGAGGVKHIHADYYFPDPASVLSVLERLEAEDEGRKWLSWAMELQFIAQAGITYTKDSY DLERFERIRELSGEIMSSYTGLSGEKVTDLFCSETGFQTPKLDTRAAIIENGKILLVKEK DGRYSMPGGWVDVNKTVGENAVKETKEESGLTVVPVRVVALHDMGRHNKQLFAYGVCKVF FLCEIVEGKFAENIETVESGWFSPDELPVLSEDKNTKAQVEMCFDAYADKNWKTVFD >gi|229784097|gb|GG667638.1| GENE 5 5926 - 6957 1051 343 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3761 NR:ns ## KEGG: Cphy_3761 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 338 1 336 354 172 32.0 2e-41 MLETLKKHVTKNLIWRIVISVIIIGAVFFFSGKSILMFLKGPEPITAGMDYDDAEGSYVS FDARYIVDEYVRQTERNTDTKKETLKNISYFVYFEEDGYFFGIEVPSSKEDEMNQYIDDT ISWLNGEVEELTNVKPVKGTWTKLTGKRLQYYQEQITDDLGAEFLEIALPYYIDSNKVGE RTYVSIYICLAVIALTLIYLLYSLIQYFTGSSQKYLKKYLEANPTVAMEHLEADFASASA ISKSIWVGRRWTFHISGIYTKLIENKDLVWAYYYYRSGRHSESTLRLYDINRHLHSLSAS EKEAKAALAVYEQQQPHMVLGYSKEMEKLHDKDFQGFLNLLAS >gi|229784097|gb|GG667638.1| GENE 6 7986 - 8675 690 229 aa, chain + ## HITS:1 COG:CAC0321 KEGG:ns NR:ns ## COG: CAC0321 COG0745 # Protein_GI_number: 15893613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of 
a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 6 221 5 227 230 155 39.0 4e-38 MHYDCLIVDDETAIASSTSEYFNLFELSSAYVTSYEQCIRFLKENTVSLLLLDINLGGRS GFSLCKKLREQTDIPILFISARTSDDDILTALNIGGDDYITKPYTLSVLLAKVKAVLKRC ARSGEAVPLRIADVEIDLSAHKVTVGGKPVKLKEMEFRLLTYLAQNLGRVVTKEELLENV WPDPYVGEGTLSVHIRHLREKIEKDPNHPQLILTVWGTGYRMENGECHD >gi|229784097|gb|GG667638.1| GENE 7 8668 - 9378 544 236 aa, chain + ## HITS:1 COG:CAC0317 KEGG:ns NR:ns ## COG: CAC0317 COG0642 # Protein_GI_number: 15893609 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 145 230 203 288 498 66 36.0 5e-11 MIKKKWTFLFLISWLLLGAVFLFRLFSSMEMPEIDMVSINRIGKETAVQWEQGTFFKPDF SPYDYTVTDLSGTILLKTSASAPSSVAEAVRRHDPVLDVTVGQGVKGHVLIAAGYGDALL KTKKELAIFAVLLLLMPVIPVFLYTAYLNHTVLKPFYRMKEFTRHIAMGDLDFPLPMDRK HIFGAFTESFDIMRDQLKEARRKEAEANRSKKELVASLSHDIKTPVASIKLTSELF >gi|229784097|gb|GG667638.1| GENE 8 10357 - 10839 364 160 aa, chain + ## HITS:1 COG:BH1581 KEGG:ns NR:ns ## COG: BH1581 COG0642 # Protein_GI_number: 15614144 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 49 154 479 588 594 75 33.0 4e-14 MLQASLDELGELTVSLSEVSSSAVEAMIRDADYCGRIASLTIPGCILLLDPFRLEQVIGN ILNNSYKYADTRVNVVSQLTEDGLRIEFMDYGKGVPDEELPSLFGKFYRGSNSDKKADGS GLGLYISKNLMERMGGIIECRNRSDGFSVILYLPLRHPGS >gi|229784097|gb|GG667638.1| GENE 9 10878 - 12953 1830 691 aa, chain - ## HITS:1 COG:CAC2808 KEGG:ns NR:ns ## COG: CAC2808 COG1680 # Protein_GI_number: 15896063 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Clostridium acetobutylicum # 41 689 136 734 739 385 36.0 1e-106 MKKRWRYITAALTAAAVCSGMAGWGQMTCLAKETAPAEPLMYGIGSTSKVVTAAAVMKLE DEGLIDIKKPLTTYIPEFEMADERYRRITPEMLLDHSSGLPGSTLNNAMLLGDNDTENHD MLLERLKKQRLKSDPGEIQVYCNDGFTLAEILTERVTGLSFTEYVEQEFADQLGLTQFKT 
PQSSNLSGRLAKIYDAGSGGELPAENANVIGSGGIYATAMDLCRFSEIFMRNRSGNARLL SDGALTAMETSRYHEEINPNGCDTTLSYGLGWDSVETYPFSRYGIKALVKGGDTNFYHGS LTVLPEEQISCAVLTSGGSSTLNQLAVQEIVMTYLDEIGRIEREEEENVSWEADVRKGKD TDPAVSAETEVSAGMTVPADIAERSGWYAGSDLLKLSVSGDGKMTIASEGSGRRREQVYQ PGTDGGFYSTDGNYISASGELSKGSGGRIGRTRLSFQKGKQGREYLMAESMEVYPGLGRV ATYLPVGTRYTGNVPSEAAVSAWKELDGQEYYLISEKYDSAAWLKRFMVKPLLLDEPEGI LSFENLELRMAVVSDETHADFFTEVPGQAGRDLNDYTRKEENGKQILESGSFRYMAAEDA EAFPGSTCEVTLGAAGEAVWLTTGDVNQNRRVLIEKPENGAWYLYDHSGREMKCVSSSWT LEKDRPFYLPEDGRVVLAGEAGAVFAVRYLD >gi|229784097|gb|GG667638.1| GENE 10 12959 - 14431 1572 490 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0330 NR:ns ## KEGG: Ethha_0330 # Name: not_defined # Def: protein of unknown function DUF214 # Organism: E.harbinense # Pathway: not_defined # 1 259 365 623 780 163 33.0 2e-38 MAVVVFVAWRSAAAAARILPVEALRGGIKSHSFKKEYFPLERTRGSLSMVLGLKSMAFHR KMILMVGVIFAGIAFASAFSIITWWNMGINDDLVLKLTGYEISDIMVYKAPHENYEELSE KVGAMEGVRKVSLYEMESLTVEGELLSCYVSDDFGKLELVEVYEGDFPEYDNEIVITGLL ARSWGKKIGDTITVSANGVSSEYVICGLTQTMNNFGRICFIHETGLLRVEPYYEKSSIQV YLEPGLDIDTVIAAMEQKFRVLSPSVNDAVRESGAEETKTDTNAKPGREEALAAARRKAE EKLTALISMYGVDSAQYALMADGEIVLSGDTNDYEIDRIENNRRLFVTNVDSIADSTRMM AALIITGTVLLVVLVLYMVIKSMLTRRRGEFGIYKALGYTDRQLMEQIAISFLPASIGGT FAGSAAACLSVNRMASLLFERLGVSEMNFVVEPWLILLMALALVAFAFAVSMVLAGKIRG ITVYGLLTEE >gi|229784097|gb|GG667638.1| GENE 11 15394 - 16434 1162 346 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1027 NR:ns ## KEGG: Cphy_1027 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 342 1 349 786 128 26.0 4e-28 MPVFMIAMANIKKKKGVAVSMGVLILLAVAIFNVGLTLLAGINRFYDMENDRLNGAHYMV RFTGNEYREEYLDFFLQDPRVETAETEDAVLMQMASFPEGGIITANFLNMDINRKISGYF VAEQTEVPEDEAVYMPVFMKDLGYGLGDELILNYNKKDYHFRIAGFTQSTWLHSSVSSMV NFYMPEKAYEKLYEQQGGGYILLVRLKDQADLKGLTADFKNSTDVNIEAISMEANAMEIT IDDMRNGTTMVVTIISAVLFVFSFLMVLVALIVMKFRISNHIDSQMRNIGALGAVGYTGR QIRWSVVLEFVIIGILGTCLGIALSYGIIAALGGLITSSVGVTLAS >gi|229784097|gb|GG667638.1| GENE 12 16436 - 17194 203 252 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 24 223 294 498 563 82 28 4e-15 MKEVVIQTDKLCKSFSSAGRQQHVLKNLDLSIYKGDFTVIMGASGSGKSTLLYALSGMDK PTLGEVMYGEVKINHLGISDMAVFRRKNCGFVFQQIHLLDQMSVLDNVLACGLLVSRDKR AVAARAKKLLEEVGIQKEDWGKFPAQLSGGEAQRVGIVRALINEPAVIFADEPTGALNSA FGEQVLNTLTEVNQKGQSIVVVTHDVKTARRADRVVYLKDGMICGECRLGTYATADTARH EKLSAFLTEMGW >gi|229784097|gb|GG667638.1| GENE 13 17457 - 18248 735 263 aa, chain + ## HITS:1 COG:CAC3509 KEGG:ns NR:ns ## COG: CAC3509 COG0789 # Protein_GI_number: 15896746 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 4 245 3 248 250 92 27.0 6e-19 MKNTYSTSDVAKLIGIHPNTVRLYEDLGLITRPERKENGYRIFTDFHLIQFRIARLAFQV EVLQNGLRTQAAAIAKTCASGDIDGALLLTDRYIRQIGEESQNAETAIQIAGQIISGQTH FANPLLLTRKQTAAYLHITVDTLRNWERNGLLVEKRMQDGRRVYTSRDLEQLIIIRSLRC ANYSLASILRMLHALSLNPEASIRDVIDTPRPDDDILTACDHLLSSLDQAGQNAKQIAGL LSELKNLNSLMLSPGQKERQGPA >gi|229784097|gb|GG667638.1| GENE 14 18340 - 19170 919 276 aa, chain - ## HITS:1 COG:YPO3617 KEGG:ns NR:ns ## COG: YPO3617 COG1082 # Protein_GI_number: 16123759 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Yersinia pestis # 12 272 12 291 299 82 24.0 7e-16 MQIGIRLHDIEPGTLEQRVMRAHEQGFTCGHLALAKTVTEHSVANSALTPGYAAYLRKMF AKYDVDIAVLGCYLNLAHPDPEELKKIQERYFAHIRFASILGCSVVGTETGAPNADYHFE PACHTEEALQTFIANVRPVVECAEKFGVILAIEPVWSHIVYDSKQALRVLKEIHSPNLQI ILDPVNLLSVENCGEYESVFAQAIEDLGEYTAVLHMKDFVVKDHTIVSGAPGTGIMKYDT IMDFVRKEKPYIQMTIEDSTPENAVESRKFLERFEK >gi|229784097|gb|GG667638.1| GENE 15 19226 - 20779 1308 517 aa, chain - ## HITS:1 COG:CAC0355 KEGG:ns NR:ns ## COG: CAC0355 COG5434 # Protein_GI_number: 15893646 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Clostridium acetobutylicum # 1 498 1 499 513 484 46.0 1e-136 MELKVICMTARSATLETEAGQIFEFDTQGEVYVNGSLYQRTNRVIFSLFGLKPDTEYEVC 
LKRDGAEAGTVFRTDYEFVTINVKEFGAKGDGLQDDTGFIQAAILACPKNGRVLIPKGTY RITSLFLKSHIRLELGAGAILAADTDRFKYPRLPGMIESYDETEDYNLGTWEGNPLPMFA GIINGIEVENVVIYGEGLIDGQASFENWWKDAGTMRGAFRPRMVFLERCKDITLQGFYLK NSPAWVLHPYFSQGLRFLDLDIENPADSPNTDGLDPESCKDVEITGLHFSLGDDCIAVKS GKIYMGRRYKTPSENIEIRQCLMENGHGAVTVGSEVGAGVKAVRVRDCLFRHTDRGLRVK TRRGRGKDSVLSDISFQHIVMDHVMTPFVVNSFYFCDPDGKTEYVQCREALPADERTPEI QNLSFTDIKAANCHAAASFLCGLPEQKIRQIELRNVDISFAEQAREGVPAMMEGVSACSR QGFTVMNVDTLVCENVTITGQKGDAFIMSNIDHFKEI >gi|229784097|gb|GG667638.1| GENE 16 20817 - 21488 733 223 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2737 NR:ns ## KEGG: Cphy_2737 # Name: not_defined # Def: glycosy hydrolase family protein # Organism: C.phytofermentans # Pathway: not_defined # 7 214 30 278 294 70 30.0 7e-11 MEDIYELPGLIQMYQATGKAEYGERALEQTNRAILRQDGTLLSGPEAGACLFALKQTGKQ EYRKAADLVFNRLVNGETAMPEAAMPFYAEYDTLFNKKAHYGEIAAYFEGKKAWSGREAA VLIDTIEKMSMEIYEYYRALCDLLKQAVRQKLPAEGPRPEVLLNEEEAWLGYAVLKACSL GVLNREKYGEAGLRIWRRFEVQQDKGEGFGNMLKAQYLIFEKN >gi|229784097|gb|GG667638.1| GENE 17 21738 - 22595 698 285 aa, chain - ## HITS:1 COG:no KEGG:Cbei_2264 NR:ns ## KEGG: Cbei_2264 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 11 275 7 272 277 167 34.0 4e-40 MPVPPENLPLEDMILKSAVTYFGEVLLPYLGIDCKITGIAPTELVYLEVKGLIEDFTYIV SDGSWIHLEFESDKVTEKDLRRFRGYEAVTGYTYQVPVTTIVICSAKTAEVLSELHEGIN TYRVMTVRLKDRDADQLFDQLKQKLQTGERIEREEVIPLLLTPLMSGKMAEKERFIEGNR LLQASGEITADEGAKMQAVLYALAVKFLDSKDLEEVKEEFSMTLLGQMLVNDGIEKGRSE TLRELALEMRKNGEPYEKVARYTRTPVETIRTWEKENGNDASIRF >gi|229784097|gb|GG667638.1| GENE 18 22800 - 22910 58 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPDGSMIREHGTEFWKWRQLLMEGYYLFSWYKLLIF >gi|229784097|gb|GG667638.1| GENE 19 23131 - 23217 62 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDFDIQEALNMHKDIPEDQYQISRMLVR >gi|229784097|gb|GG667638.1| GENE 20 24150 - 24959 1001 269 aa, chain - ## HITS:1 COG:CAC0517 KEGG:ns NR:ns ## COG: CAC0517 COG0205 # Protein_GI_number: 15893808 # Func_class: G Carbohydrate transport and metabolism # 
Function: 6-phosphofructokinase # Organism: Clostridium acetobutylicum # 5 267 1 262 319 312 64.0 5e-85 MAKQVKTIGVLTSGGDAPGMNAAIRAVVRTAIHKGLEVKGIMRGYAGLLQEEIVDMTSTS VSDIIHRGGTILYTARCQEFTTAEGQKKGAEICKKHGIDGLVVIGGDGSFRGAGKLSALG INTIGLPGTIDLDIACTDYTIGFDTAVNTAMEAIDKVRDTSTSHERCSIIEVMGRNAGYI ALWCGFANGAEDILLPERYDGDEQALINRIIENRKRGKKHHIIINAEGIGHSSSMAKRIE AATGIETRATILGHMQRGGAPTCKDRVAS >gi|229784097|gb|GG667638.1| GENE 21 25035 - 28517 4100 1160 aa, chain - ## HITS:1 COG:CAC0516 KEGG:ns NR:ns ## COG: CAC0516 COG0587 # Protein_GI_number: 15893807 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Clostridium acetobutylicum # 3 1149 11 1167 1167 1167 53.0 0 MAFTHLHVHTEYSLLDGSCKIKELTARAKELGMDSMAITDHGVMYGVIDFYRAAREVGIK PIIGCEVYVTAGSRFDRETTNGEDRYYHLVLLAENNQGYQNLMKIVSKGFVDGFYYKPRV DYEVLETYHEGIIALSACLAGEVQRYIERGMYEQGKNAALRYQQIFGEGNFFLELQDHGI PAQHTVNQGLMRMSRETGIKLVATNDIHYTFADDATPHDILLCIQTGKKVTDENRMRYEG GQYYCKTEEEMRKLFPYASEAIDNTHDIAERCNVEIEFGVTKLPRYDVPEGYDSWGYLNK LCFEGLAKHYPDDDGTLKARLEYELNVIHTMGYVDYFLIVWDFIHYARSQDIIVGPGRGS AAGSIVSYCLGITNIDPIRYNLLFERFLNPERVSMPDIDIDFCFERRQEVIDYVVRKYGK EQVVQIVTFGTLAAKGVVRDVGRVLDMPYARCDAIAKMIPGDLGMTLEKALKQSPDLRNA YNEDPEVKYLIDMSMRLEGLPRHTSMHAAGVVISRTSVDEYVPLSKAADGTITTQFTMTT LEELGLLKMDFLGLRTLTVIQNAVKAVEKNRGVVLDMDTIDYDDKAVLESIGTGKDDGVF QLESSGMKSFMKELKPQSLEDIIAGISLYRPGPMDFIPKYLKGKNDPSSATYDCPQMEPI LSPTYGCIVYQEQVMQIVRDLAGYTMGRSDLVRRAMSKKKTSVMEKERQNFVYGNPEEGV KGCIANGIDEKTANHIYDEMIDFAKYAFNKSHAAAYAVVSYQTAFLKYYYPQEYMAALLT SVMDNVTKVSEYILSCRQMGISILPPDINEGESGFSVSGGAIRYGLSAIKSVGKSVVDLI VTERESSGLFTSIEDFVDRMSNKEVNKRTLENFIKSGALDTLPGTRKQKILVAPELLDQR SKEKKNSMEGQLTLFDIASEEEKTRYQITFPNVGEFPKEELLAFEKEALGIYVSGHPLEA YETTWRNNITAVTIDFIVDEETEEARVKDGSYVTIGGMITGKTVKTTRNNKLMAFLTLED LVGSVEVIVFPKDYESKRELFVEDAKVFIQGRASIGDDPVGKVICERVIPFQSLPKELWL KFPDKDVYMAREAEILSDLKESEGNDTVIIYLEKERAKKVLPANRNVSASNELVSVLTEK LGEKNVKVVEKKVEKIRKMN >gi|229784097|gb|GG667638.1| GENE 22 28743 - 28997 388 84 aa, chain - ## HITS:1 COG:BS_ptsH KEGG:ns NR:ns 
## COG: BS_ptsH COG1925 # Protein_GI_number: 16078454 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Bacillus subtilis # 1 83 1 83 88 68 46.0 2e-12 MVKKTVTVELASGLEARPVAMLVQVASQYESSIYVEIDSKKVNAKSIMGMMTLGLAAGEQ VIISADGADENQAVTDIEKYLSNN >gi|229784097|gb|GG667638.1| GENE 23 29028 - 29996 988 322 aa, chain - ## HITS:1 COG:CAC0513 KEGG:ns NR:ns ## COG: CAC0513 COG1481 # Protein_GI_number: 15893804 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 313 1 313 317 267 47.0 2e-71 MSFSSKVKDELSRQLSPARHCQIAETAAIISLCGKITISEEDHYAIKIHTENVAVARKYF TLLKKTFNIVTDVSIRRNAYLNKNRTYTVTIREHEDALRVLHAVKLLDEHGEVGENLNVV QNVVIQQSCCRRAFIRGAFLASGSLSDPEKFYHFEIVCATEEKAKQLQGIIATFDLEAKI VKRKRYYIVYIKEGSQIVDILNVMEAPVALMELENIRILKDMRNSVNRQVNCETANINKT VSAAVKQIEDIEYIRDTIGLENLPENLEEIARERVERPEATLKELGEALDPPVGKSGVNH RLRKLCDIAEQLRDRSERSRME >gi|229784097|gb|GG667638.1| GENE 24 30003 - 30881 1106 292 aa, chain - ## HITS:1 COG:CAC0511 KEGG:ns NR:ns ## COG: CAC0511 COG1660 # Protein_GI_number: 15893802 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Clostridium acetobutylicum # 1 286 1 286 294 289 52.0 5e-78 MRLVIVTGMSGAGKTNALKMLEDMGFYCVDNLPIPLVEKFAELTLSNQGGIRNAALGIDI RSGDELSALDRIFDEWSRKRVPYEILFLDASSETLIKRYKESRRAHPLAGEGRIDSGIEK ERLKLAFLKEEADYIIDTSKLLTKELKAELEKIFMANESYKNLYITILSFGFKYGIPSDA DLVFDVRFLPNPYYVDELRPKTGEDQEVRDYVMQNGTADIFLNKLYDMLEFLIPGYVLEG KNQLVIAIGCTGGKHRSVTIAKAVYERLKSHEEFGLKIEHRDIDKDNKRKKE >gi|229784097|gb|GG667638.1| GENE 25 30878 - 31810 848 310 aa, chain - ## HITS:1 COG:SA0693 KEGG:ns NR:ns ## COG: SA0693 COG0812 # Protein_GI_number: 15926415 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Staphylococcus aureus N315 # 3 300 7 301 307 263 45.0 4e-70 MFYEKLLEVSDSSHVKTEEKMSRHTTFRAGGPAAYYVSPADARELGEVIRLCRREEVPYC 
ILGNGSNLLVGDGGFDGVVISMTAGFSGCTVDSDRCEIAAGAGASLSRVGKAALEAGLTG FEFAAGIPGTVGGAVVMNAGAYGSETKDIIESARVMTIEGEEKVLSLPELELGYRTSCIP ANRYIVMEARYRLKPGSKTEIRAYMDELAARRKSKQPLEYPSAGSTFKRPAGNFAGKLIE EAGLAGYRVGGAEVSGKHCGFVINRDHATASDIMTLCQDVKRKVFECSGVELEMEVKTLG TFESRRGDVT >gi|229784097|gb|GG667638.1| GENE 26 31825 - 32757 1115 310 aa, chain - ## HITS:1 COG:lin2626 KEGG:ns NR:ns ## COG: lin2626 COG1493 # Protein_GI_number: 16801688 # Func_class: T Signal transduction mechanisms # Function: Serine kinase of the HPr protein, regulates carbohydrate metabolism # Organism: Listeria innocua # 4 296 5 296 312 271 45.0 2e-72 MHGVAITELIEKMNLRNMTPEIDAEKIVLSHPDVNRPALQLAGFFDHFDNERVQIIGYVE QEYIRQMEHDRKVEMYDKLLSSQIPCLVYSRNQDPDEDMIARCDHYGVPCLVSEQTTSDL MAEIIRWLKVKLAPCISIHGVLVDVFGEGVLIMGESGIGKSEAALELIKRGHRLVTDDVV EIRKVSDETLIGSAPEITRHFIELRGIGIIDVKTLFGVESVKDTQAIDMVIKLEDWDKDK EYDRLGLEDQYTEFLGNRVVCHSIPVRPGRNLAIIVESAAVNYRQKKMGYNAAQELYNRV QANLSRKQDD >gi|229784097|gb|GG667638.1| GENE 27 32912 - 34789 1935 625 aa, chain - ## HITS:1 COG:CAC0508 KEGG:ns NR:ns ## COG: CAC0508 COG0322 # Protein_GI_number: 15893799 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Clostridium acetobutylicum # 9 624 2 617 623 658 52.0 0 MEQSGYNGFNIEEELKKLPAQPGVYIMHDKKDEIIYVGKAISLKNRVRQYFQSSRNKTSK IEQMVSRIARFEYIITDSELEALVLECNLIKEHRPRYNTMLKDDKTYPYIKVTVYEDFPR VLFSRDMKKDKSKYFGPYTSAGSVKDTIDLIHKLYKIRTCNRNLPKDIGRDRPCLNYHIK QCSAPCQGYINKEEYGKAISQALDFLGGKYDPVISMLEEKMQAASDEMDFEKAIEYRDLL TSVRAVAQKQKITSSSMEDRDIIAMAKDEKDAVVQVFFVREGRLIGREHYHVAIATAEDD RQILTSFVKQFYAGTPFVPRELWVQTELEDSGIIREWLSAKRGQKVKLVVPQKGEKERLV ELAERNAALVLSQDKEKIKREELRTIGAMNEVSGWLGITGVHRIEAFDISNISGFESVGS MIVYEDGKPKRNDYRKFKIKWVKGANDYASMREVLTRRFSHGLDEAAALKEKGVDEEFGS FTKFPDLLMMDGGRGQVNIALEVLDELKLSIPVCGMVKDDNHRTRGLYYNNVEIPIDKRS EGFKLITRIQDEAHRFAIEYHRSLRSKGQVHSILDDIPDIGPTRRKALMRYFKDIEAIKE ATIEALTAAPGMNAKSAQSVYDFFH >gi|229784097|gb|GG667638.1| GENE 28 34797 - 35288 604 163 aa, chain - ## HITS:1 COG:CAC1208 KEGG:ns 
NR:ns ## COG: CAC1208 COG4728 # Protein_GI_number: 15894491 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 2 96 74 164 173 88 47.0 6e-18 MDRIPKPGECYRHFKNKLYQIITVAEHSETGEKMVVYQALYGTFGTYVRPLSMFTSEVDH EKYPEVNQKYRFERVELKTEEAEETSEETAVSSKNLLAFLDAETYYEKLEVLKERCEIFS EEELTAVCESLDIGSGTGTRQGMYRAAVNYLEMQTKYDGARLR >gi|229784097|gb|GG667638.1| GENE 29 35391 - 35570 204 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEETEAPVDGTTASEELQTGEDAETEAAEKAAAQPETAAQGEKRSKPNVVIVTTGGAGE >gi|229784097|gb|GG667638.1| GENE 30 36519 - 37814 1559 431 aa, chain - ## HITS:1 COG:FN0456 KEGG:ns NR:ns ## COG: FN0456 COG1306 # Protein_GI_number: 19703791 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 46 388 35 376 380 236 37.0 9e-62 MKKWLLAVLVCTLAVTGCSKYKQVSDLPETDETTETESMSGETEEETLEIQPETEPEEVK KERVRVKGVYISGYMAGSEGFQAILDKISQTEINAVVIDVKNDDGRITFAMDDAPTVNEI GASERYIRDIGSLMADLKARGIYTIARVVAFRDPYLAEKKPEWSLKNADGSLHRDNKGLA WVNPYRQEVWDYLVEVGRQAAKAGFDEIQFDYIRFATDSTMKQVVFDEADTRGRSKTDII TEFITYAYDKLSEEGVFVSADVFGTIIGSKIDADAVGQVYNDMAKHLDYICPMIYPSHYG DGNFGIEHPDTQPYDTILAALMKSKQDLAIVTAAGEDHAIVRPWLQDFTASYLQHYIKYG PAEVRAQIQAVYDAGYDEWILWSASNKYTWDGLLSPEEAGREDQKIAESRAALPPETTAV PPAEAADGETP >gi|229784097|gb|GG667638.1| GENE 31 38024 - 38224 256 66 aa, chain + ## HITS:1 COG:SPy1934 KEGG:ns NR:ns ## COG: SPy1934 COG1476 # Protein_GI_number: 15675737 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 66 1 66 68 90 62.0 6e-19 MKNKRLKIARIEHDLSQEQLGERVGVTRQTISMIESGNYNPTLNLCIAICKELGKTLDEL FWEVSE >gi|229784097|gb|GG667638.1| GENE 32 38221 - 38679 395 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621150|ref|ZP_06114085.1| ## NR: gi|266621150|ref|ZP_06114085.1| permease of the major facilitator family protein [Clostridium hathewayi DSM 13479] permease of the major facilitator family protein [Clostridium 
hathewayi DSM 13479] # 1 152 1 152 152 223 100.0 5e-57 MKSLFKKRIVDERMELQSLKNARKSWNFLLIATGLCLLAELHLLQWELKYVLPQAVIVLA ASLYNIFLDTRDGNIYTAESANRKRLFLLYSISSLAASLIIASGFYRRYSLITAIIIFII LFFFMFGLMYLFDSLIFHMGKKRALKDTEEEE >gi|229784097|gb|GG667638.1| GENE 33 38796 - 40880 2183 694 aa, chain - ## HITS:1 COG:MA3879 KEGG:ns NR:ns ## COG: MA3879 COG3808 # Protein_GI_number: 20092675 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Methanosarcina acetivorans str.C2A # 3 689 13 685 685 539 51.0 1e-153 MNVMYVVAIGSIAALIYAACMFFRVKEQPEGSAGMIKISTAVRKGAGAYLRRQYLGVGVF FAVVFLILLCMAFGGFLSYFTPFAFLTGGFFSGLSGFIGMRTATMANCRTAEGASHSLNK GLKVAFSAGSVMGFTVVGLGLLDLTIWYFILNTVFRSLPEAERIGQITANMLTFGMGASS MALFARVGGGIFTKAADVGADLVGKVEAGIPEDDPRNPAVIADNVGDNVGDVAGMGADLY ESYVGSIVSTSALAVAAGFGVKGVAVPMMLAAFGVIASITGTFFVKTKEDASQKSLLKAL RLGTYISAVLVAVGAFVIIRILLPGHVGIYAAVLSGLIAGVLIGAITEYFTSDSYRPTRN LASSSETGAATVIISGLSLGMLSTVAPVIIVGASVLISYYCSGGNTDFNMGLYGVGVSAV GMLSTLGITLATDAYGPIADNAGGIAEMTHMPPEVRNRTDALDSLGNTTAATGKGFAIGS AALTALALIASYIDKVQQLNPDIALNLTITNPTVLIGLFIGGMLPFLFAALTMDAVGKAA QSIVIEVRRQFKEIRGLMEGKAEPDYASCVDMCTKSAQKLMLAPALIAVIIPVAVGLLLG VNGVAGLLAGNTVTGFVLAVMMANSGGAWDNAKKYIEGGAHGGKGSDQHKAAVVGDTVGD PFKDTSGPSINILIKLTSMVSIVFAGLIVAVHLL >gi|229784097|gb|GG667638.1| GENE 34 41130 - 42005 1125 291 aa, chain - ## HITS:1 COG:ECs0170 KEGG:ns NR:ns ## COG: ECs0170 COG0024 # Protein_GI_number: 15829424 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Escherichia coli O157:H7 # 46 290 5 249 264 255 48.0 8e-68 MLKKLGRNDACWCGSGKKYKNCHLAFDTKIENYAIMGDLVPEHYMIKTPAQIEGIRRAGE VNTKILNLVDSFIKPGISTEDINQLVHENTIKMGGIPAPLNFEGYPKSVCTSVNEVVCHG IPDPGRILKEGDIINVDVTTILDGYYGDASRMYCIGTVTPEAEKLVRVTKESVELALKEA RPWGHLGDIGAAVSEYVYSNGFTVVREIGGHGVGNDFHEEPWVSHIGERGTDMLLVPGMI FTIEPMVNAGRPDVVQDEEDGWTIYTEDGSLSAQWEVTVLITEEGPEVLTY >gi|229784097|gb|GG667638.1| GENE 35 42074 - 43288 1081 404 aa, chain - ## HITS:1 COG:SMa1124 KEGG:ns NR:ns ## COG: SMa1124 
COG0845 # Protein_GI_number: 16263061 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Sinorhizobium meliloti # 154 387 164 386 399 69 26.0 1e-11 MKRNVKWIFLAAIAVISIGAYAAVLLKPVEAQAERAVKGSLESQFTVSASILPVSGMVLN SITAGNVSEIPFLPGMEVKEGEILIKTDSASQTTLDIQREQFRQQLISARQEYERLYGAN GSAAAAQSAAESEYLLAKKNYDNGLVLEQQGGFIARSDLDTLKTQMDIAFQKYNQAKEDN SDARRTAFQDQIASYEKQLELLENTVNPGVVAMPYDGVLWEVYTEEGAYLSPNQPVAKVY RPQEMKLSASVLSEDAAALTPGLTAEVVYADGTRSEAEVGFISKTAKKEMSSIGLEENRC TVELKPLSLPEHVGAGQTADVSFTVIKAEQVFSVPAAAVMPQEHGSAVYVVSSGKAVLTP VETGRRESGRVEIISGLEDGDIVASDPYNDNVEKNGRVKAVLKE >gi|229784097|gb|GG667638.1| GENE 36 43298 - 45649 2287 783 aa, chain - ## HITS:1 COG:SMa1122 KEGG:ns NR:ns ## COG: SMa1122 COG0577 # Protein_GI_number: 16263060 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Sinorhizobium meliloti # 7 783 6 787 787 209 24.0 2e-53 MTKLLWKKMIRDMKRSRAAYLLCILIVAIGFCGFTVLELCYDNLVESRDLFFSQSDFCDG FAEVTDSPVSEAGILNMIPGIYAVEPRLVKDVRVYGFDGDVELHLASWTDGKMNRPVLSR GSLPAEGKREIVIGDGMAKARNLNPGDTVEIIVGGKRIPMEITGIGLTPENIYMIRNMNE LFPSPAVYDAAFTSYRTLAQLYGQEGHANSFLFRMAPGSMWEDVKDQVEQALTPYGLISH FASEDQLAVSMVDEEIKQLEKMSGVVPFLFLAVAGVILYITQSRMVEQQRTQIGTLMALG IPLKKIRLHYMGFGAVTGLAGGLAGGMLGYSMADPMAGFYRMYFNLPEVTAPLSAAYLLE GILLAGIFCSGTSWIIAGVIGKMRPAQALRPPAPKTARPSFLERIPGFLRLFTVPGIMAL RSLSRNRRRTAMSIAGIAFAYMITATLVSMNSLFDVFIFDYWEKTQRQDIMVSFEQPVSV NGALEAVKHPQVEKAEAMMEFAVTLAGPQGKTDCTIQAVSREASLTRLFREDGSRAYPEE EGIVISEHMASLMGVKKGSTIEVKVTYPEERISRVAVTDTIAQYMGSSAYMSFEGAAKIS DYRGVCSTVLLKAPLTVQEELRTRLSDAAAVSAIQSRQGRLEQYRSMMGSMSAIMASMSM LGVIISFAVIYISSLISFEELKRELSTLMMLGLKSRECLEVISTGQWILAVGAVLIGIPM AMGASRLISVSMASDMYTIPGFIDGKALLQAIGLTAVSVGLSSVMMLRKLKKIVPADLLR ERE >gi|229784097|gb|GG667638.1| GENE 37 45668 - 46375 280 235 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 5 221 4 222 223 112 34 
5e-24 MGKPVIRAEHLSVRYEGEGVAVDALKDVSFEVEQGEFIVVLGPSGSGKSTLLNIAGGMDQ PAGGEIWYKEKLLTGCTPDQLSDYRKDVVGFVFQFFNLIPSLTAKENVELAASIVKEHMD PAEVLTMVGLEGREKHFPAQLSGGEQQRVSIARAIVKKPDLLLCDEPTGALDSKNSVAIV KLLLEVGKNLNCPVMTITHNAEMARVADRVFHMKDGRLERITVNEHPCRAEELDW >gi|229784097|gb|GG667638.1| GENE 38 46360 - 46614 254 84 aa, chain - ## HITS:1 COG:no KEGG:BHWA1_00546 NR:ns ## KEGG: BHWA1_00546 # Name: acrR # Def: TetR family transcriptional regulator AcrR # Organism: B.hyodysenteriae # Pathway: not_defined # 6 71 138 203 204 62 45.0 7e-09 MTLQEYFKNIDYTKFKPEVSPQEVYEMLTWMTEGFLYNRRKSELSASLDEVMKKFCVWEK WLKQIAYKEEYQDTEPEGGLWENR >gi|229784097|gb|GG667638.1| GENE 39 46592 - 46879 341 95 aa, chain - ## HITS:1 COG:CAC3271 KEGG:ns NR:ns ## COG: CAC3271 COG1309 # Protein_GI_number: 15896516 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 9 75 6 72 205 58 38.0 4e-09 MNEQFYDLPQEKQMRIINAGLEVFSKNDYKHAVTDEIARKAGISKGLLFYYFHNKKSLYL YLFDYCAKLIMNQVMGSDGEADAEACGKHDFTGVF >gi|229784097|gb|GG667638.1| GENE 40 48151 - 49629 1601 492 aa, chain + ## HITS:1 COG:SPy1343 KEGG:ns NR:ns ## COG: SPy1343 COG4868 # Protein_GI_number: 15675279 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 3 491 28 516 518 602 57.0 1e-172 MKIGFDNNKYLQLQSEHIRKRINQFGDKLYLEFGGKLFDDYHASRVLPGFAPDSKLRMLL QLADQAEIVIAINAADIEKNKIRHDLGITYDVDVLRLIQSFTDQGLYVGSVVITQYSGQK SADIFKSKLEKIGVRVYRHYIIDGYPSNVSLIVSDEGYGRNDYIETTKPLVIITAPGPGS GKMATCLSQLYHENKRGIQAGYAKFETFPIWNIPLKHPVNLAYEAATADLNDVNMIDPFH LDAYGVTTVNYNRDVEIFPVLSAIFEGIYGECPYKSPTDMGVNMVGCCIVDDEACVEASR QEIIRRYYQALSRLAKDMGSKEEVYKIELLMKQAKITTAARRVVGAANERAERTGYPAAA LELDDGTIITGKTTNLLGASAALLLNAVKVLGNIPHDMHLIAPSAIEPIQKLKVNYLGSK NPRLHTDEVLIALSASAATDHNAQIALEQLPKLKGCEVHTSVLLSDVDVKIFKKLGVNLT CEPIYEEKKIYH >gi|229784097|gb|GG667638.1| GENE 41 49721 - 51076 993 451 aa, chain - ## HITS:1 COG:BB0473 KEGG:ns NR:ns ## COG: BB0473 COG0534 # Protein_GI_number: 15594818 # 
Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Borrelia burgdorferi # 6 447 1 444 454 152 25.0 2e-36 MKTKILKTGPVERRDLILNGNIVKTLMMLSMPTLMMSVVQSLMPLSDGLFINNVAGTLVA SAVTYSEPIINMMTALAQGLSVAAMAVIGQTFGQGDFKRVKHISTQTVVFAFCLGLCIAP LLVLIAFPISGHVNQEISHNVWLYIALYAMVLPFSFLESIYNAIMNATGRPEATFIRMVL MLVLKVIFNVVFIVWLNLGIVGCVLASLCANVLICIWMFYELFLQKGDNQLTLKGFHFDG VLIHQLVRVGIPAMITSLMLNFGFFLINNETQKYGPVVLNGQGIANNITSVCFNLPSSFG SAITTMVSMNVGAGQGPRAKKCTLTGCVASMITAVLIISVIVPASSHLTTLFTRQPDVLE IADHALHIYTYSVVGFGICMVVQGAFIGLGRTKVPMVVGLLRIWLLRYLFILATESFLAY YSVFWGNLFSNYMAAAISLVLLFRIKWESAL >gi|229784097|gb|GG667638.1| GENE 42 51114 - 51290 109 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870434|ref|ZP_06409754.1| ## NR: gi|288870434|ref|ZP_06409754.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 58 1 58 58 104 100.0 3e-21 MILHFTWILNFLFFDYTELTGKIYKNKKLNADFTNDNKALKIEGQYAERIKLCYSRDL >gi|229784097|gb|GG667638.1| GENE 43 51312 - 51983 496 223 aa, chain + ## HITS:1 COG:SA0619 KEGG:ns NR:ns ## COG: SA0619 COG0306 # Protein_GI_number: 15926341 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Staphylococcus aureus N315 # 33 214 20 197 335 107 34.0 1e-23 MNEVTITISDFLTGLSSSPAFAVTTVLTLGVILVNGWTDAPNAIATCVSTRAISARHAIL MAAVFNFLGVFLMTMINSSVASTIYNMVDFGGDSHEALIALCAGLTAIVTWATAAWWFGI PTSESHALIAGLSGATIALHNGVDGINGSEWIKVLYGLLLSTFLGFIFGWLSVRLVELLF RNRDRRKTNRFFEGAQIAGGAAMAFMHGAQDGQKFKPFRFQPS >gi|229784097|gb|GG667638.1| GENE 44 52902 - 53318 382 138 aa, chain + ## HITS:1 COG:CAC3093 KEGG:ns NR:ns ## COG: CAC3093 COG0306 # Protein_GI_number: 15896344 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Clostridium acetobutylicum # 20 131 215 326 330 97 46.0 5e-21 MGVFLLGLFLAQGNAGSGHFSIPVWLMILCSLVMAAGTSIGGYRIIKAVGMDMVKLEKYQ 
GFAADLAGAGCLLLSSAAGIPVSTTHTKTTAIMGVGAAKRISCVDWGVVKEMVLTWVLTF PGCGIIGFLMAKLFLAVF >gi|229784097|gb|GG667638.1| GENE 45 53394 - 54017 644 207 aa, chain + ## HITS:1 COG:CAC3094 KEGG:ns NR:ns ## COG: CAC3094 COG1392 # Protein_GI_number: 15896345 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate transport regulator (distant homolog of PhoU) # Organism: Clostridium acetobutylicum # 5 207 7 210 210 103 29.0 2e-22 MTGKRDYLYFELFQEGVRYAMEAAALLHRDLEHFDSSGLDEQIDAMHQLEHQADLTKHQA MEKLIREFITPLEREDILSLTSAIDDVTDSIEDVLLRLYMFNIRTLRPDALEFSKIITDC CHALLDLMEEFPNFRKSKSIKDKIIEINHLEEVGDKLYTEAIHRLHTEQTEAATLLAWTT IYERFEKCCDRCEDVADSVELVILKNS >gi|229784097|gb|GG667638.1| GENE 46 54025 - 55110 861 361 aa, chain - ## HITS:1 COG:CAC3308_1 KEGG:ns NR:ns ## COG: CAC3308_1 COG0463 # Protein_GI_number: 15896551 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 3 86 20 104 194 81 49.0 2e-15 MARCLDSVSSLVDEIIIVDTGSQDKTREIAARYTDHVYDFTWIDDFAAARNFSFSKATMD YQMWLDADDVMEEEDREKFLQMKNTLNTRQKENAEDGPDSTPDVIMLPYHVAFDSHGSPA MAYYRERLLRRKSHFIWQGAVHEVIAPAGKIIYGDAAVCHRKIHPSDPDRNLKIYENMVD SGRKLEPRHQFYYGRELYYHARYEEAARILETFLKEGNGWGENNISACLNLSECLMKLGK SEEALTALFGSFRYDEPRPEICCAIGKWFLERGNRDRQYLKQAVYWYLQAKGQVTDEKSG AFVNPDCHDFIPDIQLCVCYDRLGENAKALEYHEKAKAKKPDNPAVLYNEEYFAKKKGST V >gi|229784097|gb|GG667638.1| GENE 47 55285 - 56277 714 330 aa, chain - ## HITS:1 COG:no KEGG:mlr0585 NR:ns ## KEGG: mlr0585 # Name: not_defined # Def: hypothetical protein # Organism: M.loti # Pathway: not_defined # 33 150 552 669 2147 65 50.0 3e-09 MGDTITGAPGSPASVTNVGTDQDAVLEFVIPQGSEGPQGIKGETGAAGVTGATGATGAAG ASDTITIRSTITAEPGTPASVTDVGTGSDHILDFVIPRGNTGAEGAAGATGATGAAGATG ATGETGETGATGPMGETGATGETGPTGAAATISIGTVITGEPGTQAEVTNSGDENDAVFD FVIPRGVTGGGGAPEVLATVDTSGQQTIPGGPLVFYDTPLVSGTFITHQPGSSDIDISQP GIYQAVFQGTVSVNPGTQIPASLMVRMLLNGYDVIGSTVRHTFSASTEVATLSFHVPFQV SATPSIIQVITDQDGFTVDDLSMTVLRLGD >gi|229784097|gb|GG667638.1| GENE 48 
56280 - 56501 110 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKTIIIKIRSLDREGFIHQDKQHSHHTVPHVRTQHPVPRHVLRPVPRHALRPVPRPVLRP APRRALHRRADVP >gi|229784097|gb|GG667638.1| GENE 49 56906 - 58351 1006 481 aa, chain - ## HITS:1 COG:no KEGG:Sgly_1574 NR:ns ## KEGG: Sgly_1574 # Name: not_defined # Def: collagen triple helix repeat-containing protein # Organism: S.glycolicus # Pathway: not_defined # 72 446 251 625 764 216 56.0 2e-54 MNYRYYPMAGGPFGIPGFGNNMQEGMQDRFNGLQTPCGTNDANSADLQNASRCGGQCCNN NNNYGCCRPVPGPQGPRGPQGIPGLPGPQGPQGFPGPRGPQGATGPQGVQGYPGPQGVQG PRGFQGATGPQGQRGITGPQGPQGVTGLQGLPGVTGATGAAGIMGPMGPTGARGATGATG ITGAAGRDGVSPRITVAGTVTGNPGTQAGVDETFTQEGAQLLFTIPAGPTGPTGAAGGTG ATGAAGATGPTGAAGATGPTGAAGVTGPTGAAGATGPTGAAGATGPTGAAGATGPTGADG VTGPTGADGVTGPTGAAGATGPTGAAGATGPTGAAGATGPTGAAGVTGPTGADGVTGPTG ADGVTGPTGAAGATGPTGADGATGPTGAAGATGPTGAAGATGPTGADGATGPTGAAGATG PTGADGATGPTGAAGATGPTGPAGSTVTSLNLTVDAGTGAVTGGSITLDDGTTVPITVTR T >gi|229784097|gb|GG667638.1| GENE 50 58496 - 58831 353 111 aa, chain - ## HITS:1 COG:BH0043 KEGG:ns NR:ns ## COG: BH0043 COG3870 # Protein_GI_number: 15612606 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 110 1 108 109 98 43.0 2e-21 MKIVYAIVSSDDGNRVTDVLNENSFSVTKLATTGGFLKKGNSTLMIGTNDDKVEEVINLI KDTCGKRQKITCNVPAPNIASISAGYMMMPMTVELGGATIFVTDVERFEKI >gi|229784097|gb|GG667638.1| GENE 51 58946 - 59512 534 188 aa, chain - ## HITS:1 COG:L181238 KEGG:ns NR:ns ## COG: L181238 COG0494 # Protein_GI_number: 15673901 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Lactococcus lactis # 7 183 9 183 186 120 42.0 2e-27 MDLRDALRNYKPFNEQEEKDREQILYFLERGDLVYTRESKDAHMTASAWVTNRARSKIVM AYHNIYDSWAWLGGHADGEKNLLLVAEKEVMEESGLRFVRPVMEDIFSIEILTVSGHMKN GSYVPSHLHLNVTYLLEADEDAVLHEKEDENSGVRWFLTEDGINASKEPWMQTWIYRKLA EKMSRLNC >gi|229784097|gb|GG667638.1| GENE 52 59611 - 60177 448 188 aa, chain 
+ ## HITS:1 COG:CAC1633 KEGG:ns NR:ns ## COG: CAC1633 COG1633 # Protein_GI_number: 15894911 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 13 168 65 229 236 74 30.0 9e-14 MDITTFHFSDHTPYPPVKVDGKNPQYAAAILSNIGSCNSEMTAVSLYFYNSLITREHYQE IAECFNRISVVEMHHLDIFGELALKLGTDPRLWSYNKGRMYYWCPGCNQYPTQINALLTN ALAGELEALRKYHAQSEWIEDGHVRAILNRIIADEELHVHIFRSLLTELSIPETAPAESQ EQTCPDLT >gi|229784097|gb|GG667638.1| GENE 53 60170 - 60748 606 192 aa, chain - ## HITS:1 COG:PA1878 KEGG:ns NR:ns ## COG: PA1878 COG1896 # Protein_GI_number: 15597075 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Pseudomonas aeruginosa # 6 191 1 186 192 138 45.0 6e-33 MENQQMGEERLDQYLNFIKETELLKNVLRTAWGSTGRQESTAEHSWRLALFAALLLGDYP ELDGKRVLFMCLIHDLGELYDGDISAALLPDEQQKHAGEQRSVERLFSFLPEKEREYFMA VWREYNENSTPEAHLVKALDKAETILQHNQGINPPDFDYEFNLQYGASYFKEDGRMAALR ARLDQETKKHIM >gi|229784097|gb|GG667638.1| GENE 54 60748 - 62670 1553 640 aa, chain - ## HITS:1 COG:CAP0099 KEGG:ns NR:ns ## COG: CAP0099 COG1193 # Protein_GI_number: 15004802 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 2 634 3 621 629 350 32.0 4e-96 MNNTFHTLEFDRIVRQLEELACTKSAKEKIRDLEPYLYEGDVRKSQKDTSEARRIIESAG LPPIASMGDVDDILIVIGQDGCLTAEQLESVGMTLTAVKRLKDFLSRCRYLNTGLACYDE ELDPLEHVREQLAEAVRGGRVEDGASRLLKSLRQDMIRMEEKVRAKADSILKSKKVCFAD HYVTSRNGKMCLPVKKEYRSSVPGSVIDKSATGATLFIEPEAVASLNSELELLKIEEENE TRRILYELTAMIAENQEVFEADKRLIEKLDFIFAKGKLSAAMDAKEPQINTERNIRIKNG RHPLMRREEAVPLSFELGERGKGVVITGPNTGGKTVTMKTVGLFCIMAQCGLHVPCEEAV LCMNNLVLCDIGDGQNLSENLSTFSAHITNVLDILKKAGPESLVLLDELGSGTDPAEGMG IAIAILESLRQCGCLFLVTTHYPEVKGYAEAADGIENARMAFDRETLQPLYRLVQGEAGE SCALFIAKKLGMPDAMLLRAEQAAYGSLKPERLQEEGGGKKRKNVILPEGPRIRKQKEEK GHQKELAVKYAVGDSVLVLPDKKTGIVCRTVNEKGVLQVQLQDRKIWINHKRIKLQVAAE ELYPEGYDFSIIFDTVENRKNRHQMEKRHQEGVEIRMEDD >gi|229784097|gb|GG667638.1| GENE 55 63062 - 63544 
421 160 aa, chain - ## HITS:1 COG:FN1791_1 KEGG:ns NR:ns ## COG: FN1791_1 COG0494 # Protein_GI_number: 19705096 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 7 155 3 151 158 154 53.0 5e-38 MPNTNFTTLCYIECEDSYLMLHRVKKEGDMNRDKWLGVGGHFEKGESPEECLLREVREET GLTLLRWRFRGLITFVSDCFPEEYMCLYTADRYEGTIGECREGCLEWIKKDRIGELNLWE GDLIFFKLLRDNQPFFSLKLCYEGERLTEAVLDGKPLRES >gi|229784097|gb|GG667638.1| GENE 56 63586 - 63735 151 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621172|ref|ZP_06114107.1| ## NR: gi|266621172|ref|ZP_06114107.1| hypothetical protein CLOSTHATH_02315 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02315 [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 78 97.0 1e-13 LLDQTAKQAVLTTKNYCISGKKQEILDYLNTVDYTSVMVDRVRLSILER Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:33:01 2011 Seq name: gi|229784096|gb|GG667639.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld32, whole genome shotgun sequence Length of sequence - 52267 bp Number of predicted genes - 55, with homology - 53 Number of transcription units - 26, operones - 14 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.500 - CDS 3 - 309 274 ## COG0629 Single-stranded DNA-binding protein 2 1 Op 2 . - CDS 302 - 1048 514 ## COG4712 Uncharacterized protein conserved in bacteria 3 1 Op 3 . - CDS 1048 - 1236 318 ## gi|266621175|ref|ZP_06114110.1| phosphatase family protein 4 1 Op 4 . - CDS 1240 - 1392 155 ## gi|288870442|ref|ZP_06114111.2| trigger factor 5 1 Op 5 . - CDS 1403 - 1777 319 ## gi|266621177|ref|ZP_06114112.1| conserved hypothetical protein 6 1 Op 6 . - CDS 1830 - 2156 250 ## gi|266621178|ref|ZP_06114113.1| conserved hypothetical protein 7 1 Op 7 . - CDS 2173 - 2382 383 ## gi|266621179|ref|ZP_06114114.1| conserved hypothetical protein - Prom 2458 - 2517 2.8 8 2 Op 1 . 
- CDS 2682 - 2927 275 ## gi|288870443|ref|ZP_06114115.2| conserved hypothetical protein 9 2 Op 2 . - CDS 2893 - 3054 99 ## gi|266621181|ref|ZP_06114116.1| conserved hypothetical protein 10 2 Op 3 . - CDS 3106 - 3615 336 ## COG3646 Uncharacterized phage-encoded protein 11 2 Op 4 . - CDS 3612 - 3815 194 ## Clole_0782 hypothetical protein - Prom 4011 - 4070 26.1 + Prom 4898 - 4957 80.4 12 3 Op 1 . + CDS 5011 - 5436 394 ## Closa_1361 helix-turn-helix domain protein 13 3 Op 2 . + CDS 5445 - 5876 116 ## COG2856 Predicted Zn peptidase 14 3 Op 3 . + CDS 5884 - 6786 118 ## gi|266621187|ref|ZP_06114122.1| hypothetical protein CLOSTHATH_02330 + Prom 6866 - 6925 2.8 15 4 Tu 1 . + CDS 6962 - 8425 304 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 8426 - 8471 3.2 16 5 Op 1 . - CDS 8476 - 8574 87 ## gi|295108255|emb|CBL22208.1| Mn-containing catalase 17 5 Op 2 . - CDS 8574 - 8840 290 ## Closa_2886 hypothetical protein 18 5 Op 3 . - CDS 8840 - 9244 132 ## gi|266621191|ref|ZP_06114126.1| conserved hypothetical protein 19 6 Tu 1 . - CDS 9468 - 10802 879 ## COG0665 Glycine/D-amino acid oxidases (deaminating) - Prom 10905 - 10964 4.2 + Prom 10772 - 10831 8.7 20 7 Op 1 . + CDS 10981 - 11409 316 ## Closa_1534 MarR family transcriptional regulator 21 7 Op 2 . + CDS 11441 - 12202 713 ## COG0730 Predicted permeases 22 8 Tu 1 . - CDS 13166 - 13363 58 ## gi|288870446|ref|ZP_06409759.1| zinc finger protein 207 + Prom 13112 - 13171 27.0 23 9 Tu 1 . + CDS 13327 - 14703 1573 ## COG2200 FOG: EAL domain + Term 14726 - 14784 13.2 - Term 14706 - 14778 19.7 24 10 Tu 1 . - CDS 14830 - 15294 531 ## COG0517 FOG: CBS domain - Prom 15321 - 15380 10.8 - Term 15454 - 15503 12.2 25 11 Op 1 . - CDS 15545 - 16312 834 ## Closa_2891 Peptidase M23 26 11 Op 2 . - CDS 16353 - 17243 1082 ## COG2385 Sporulation protein and related proteins - Prom 17314 - 17373 80.4 27 12 Tu 1 . 
- CDS 18226 - 18771 467 ## COG1077 Actin-like ATPase involved in cell morphogenesis - Term 18888 - 18949 1.3 28 13 Op 1 . - CDS 19009 - 21462 3236 ## COG0058 Glucan phosphorylase 29 13 Op 2 . - CDS 21476 - 24610 3389 ## COG0060 Isoleucyl-tRNA synthetase - Prom 24630 - 24689 1.7 - Term 24634 - 24676 11.1 30 13 Op 3 . - CDS 24697 - 24852 149 ## - Prom 24916 - 24975 7.9 31 14 Op 1 . - CDS 25030 - 26406 1548 ## COG1249 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component, and related enzymes 32 14 Op 2 . - CDS 26438 - 26983 537 ## COG0607 Rhodanese-related sulfurtransferase - Prom 27025 - 27084 6.2 + Prom 27765 - 27824 10.0 33 15 Op 1 . + CDS 28027 - 29487 1542 ## COG1488 Nicotinic acid phosphoribosyltransferase + Prom 29490 - 29549 3.4 34 15 Op 2 . + CDS 29570 - 30289 879 ## EUBREC_2283 hypothetical protein + Term 30317 - 30366 19.8 35 16 Op 1 . - CDS 30305 - 30433 68 ## 36 16 Op 2 . - CDS 30372 - 31568 853 ## Closa_0788 hypothetical protein 37 16 Op 3 . - CDS 31574 - 32149 502 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 32260 - 32319 1.7 38 17 Op 1 . - CDS 32321 - 33358 861 ## gi|266621210|ref|ZP_06114145.1| conserved hypothetical protein 39 17 Op 2 . - CDS 33432 - 34925 1006 ## Elen_3094 regulatory protein GntR HTH - Prom 35087 - 35146 7.1 + Prom 35034 - 35093 7.3 40 18 Tu 1 . + CDS 35137 - 37404 1448 ## COG2200 FOG: EAL domain + Term 37538 - 37582 0.4 - Term 37425 - 37478 1.4 41 19 Op 1 . - CDS 37592 - 37771 282 ## gi|288870454|ref|ZP_06114148.2| conserved hypothetical protein 42 19 Op 2 . - CDS 37802 - 38872 1018 ## COG3274 Uncharacterized protein conserved in bacteria - Prom 38949 - 39008 5.4 - Term 39033 - 39071 4.3 43 20 Tu 1 . - CDS 39104 - 39538 541 ## Dhaf_2679 hypothetical protein - Prom 39656 - 39715 4.1 - Term 39603 - 39651 8.6 44 21 Tu 1 . 
- CDS 39783 - 40235 491 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 40404 - 40463 9.8 45 22 Op 1 . - CDS 41365 - 41727 415 ## Cthe_0079 hypothetical protein 46 22 Op 2 . - CDS 41790 - 42206 432 ## Closa_3192 hypothetical protein 47 23 Op 1 . - CDS 42356 - 42925 372 ## gi|266621219|ref|ZP_06114154.1| conserved hypothetical protein 48 23 Op 2 44/0.000 - CDS 42922 - 43881 795 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 49 23 Op 3 44/0.000 - CDS 43885 - 44874 833 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 50 23 Op 4 49/0.000 - CDS 44888 - 45826 976 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 51 23 Op 5 . - CDS 45819 - 46784 831 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 52 23 Op 6 . - CDS 46807 - 47496 741 ## gi|266621224|ref|ZP_06114159.1| conserved hypothetical protein 53 24 Tu 1 . - CDS 48430 - 49146 632 ## gi|266621225|ref|ZP_06114160.1| hypothetical protein CLOSTHATH_02373 - Prom 49178 - 49237 6.8 54 25 Tu 1 . - CDS 49257 - 50690 1292 ## COG4099 Predicted peptidase - Prom 50869 - 50928 5.4 - Term 50986 - 51020 2.5 55 26 Tu 1 . 
- CDS 51197 - 52210 635 ## EUBREC_1182 hypothetical protein Predicted protein(s) >gi|229784096|gb|GG667639.1| GENE 1 3 - 309 274 102 aa, chain - ## HITS:1 COG:FN1304 KEGG:ns NR:ns ## COG: FN1304 COG0629 # Protein_GI_number: 19704639 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Fusobacterium nucleatum # 1 102 1 98 154 90 45.0 9e-19 MNKVILMGRLTRDPEVRYSQGERTMTLAKYTLAVDRRGRRSQDGNEQTADFLNCVAFDRA GEFAEKYFRQGMRVLISGRIQTGSYINKDGIKVYTTDIIVED >gi|229784096|gb|GG667639.1| GENE 2 302 - 1048 514 248 aa, chain - ## HITS:1 COG:CAC1936 KEGG:ns NR:ns ## COG: CAC1936 COG4712 # Protein_GI_number: 15895209 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 5 170 3 171 229 160 50.0 3e-39 MEKLNFRTLQADEIDCRIATIKESGLSLLLYKDARVDQNMLDEAVGPMNWQRHHSRDNAN CTVAIWDSEKNQWIEKEDTGTESYTEKEKGLASDSFKRACFNWGIGRELYTSPFIWISAQ DCNIKNSGNRFTCNDSFYVSQIGYDEKRNINSLTIKKSKGNSIVFSMGKKPEKPDEGQAK QPEKKSEGITTPMIESIKSLVEKHSGKGLKMEKILAMYKIKDINEMTIEQYKDCMNKLKL YEEKTIHE >gi|229784096|gb|GG667639.1| GENE 3 1048 - 1236 318 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621175|ref|ZP_06114110.1| ## NR: gi|266621175|ref|ZP_06114110.1| phosphatase family protein [Clostridium hathewayi DSM 13479] phosphatase family protein [Clostridium hathewayi DSM 13479] # 1 62 1 62 62 114 100.0 2e-24 MVKAKDLNVGQVIRLECGDAGNWGNFEVDKITVLEDSVDVLCHYGVVHMEFSWELDKMLE VV >gi|229784096|gb|GG667639.1| GENE 4 1240 - 1392 155 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870442|ref|ZP_06114111.2| ## NR: gi|288870442|ref|ZP_06114111.2| trigger factor [Clostridium hathewayi DSM 13479] trigger factor [Clostridium hathewayi DSM 13479] # 1 50 10 59 59 80 100.0 5e-14 MGDYRKLWILLREQLTRQENESEKYSIERDVIHNVLTRMSEMDAAEFLED >gi|229784096|gb|GG667639.1| GENE 5 1403 - 1777 319 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621177|ref|ZP_06114112.1| ## NR: gi|266621177|ref|ZP_06114112.1| conserved hypothetical 
protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 124 1 124 124 244 100.0 2e-63 MEMNTQRMDAFVKDVTDCRKAIDLCRAAIKKLGEQPLHLLVSNDLERITLNSKEIGLSKE SQAAIVMVVKGSLEARENELCCQMDELMGNYGRTTDFEAVPPALTPKQKRRQCYGPINPD DVEW >gi|229784096|gb|GG667639.1| GENE 6 1830 - 2156 250 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621178|ref|ZP_06114113.1| ## NR: gi|266621178|ref|ZP_06114113.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 108 1 108 108 215 100.0 1e-54 MKKRLEKKVEKRKREQIHRLLDLALDINGLEPRSQTITGNLPTVFFEFYGHVGLADIRVY SAGWASGKDPDVDMYMEACSLEKLSNVVHRMEALKAETPGAATPRESR >gi|229784096|gb|GG667639.1| GENE 7 2173 - 2382 383 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621179|ref|ZP_06114114.1| ## NR: gi|266621179|ref|ZP_06114114.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 69 1 69 69 106 100.0 5e-22 MQKEQQLRVWIQKQKRLISEAAEQKDRDYIAMMWQGFLNGLCLTNAITWQEYQELSREIV EFAEGFEAA >gi|229784096|gb|GG667639.1| GENE 8 2682 - 2927 275 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870443|ref|ZP_06114115.2| ## NR: gi|288870443|ref|ZP_06114115.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 16 81 1 66 66 120 100.0 3e-26 MRDTKECRLPVDPGLMDMQWEVMDMQKYIDNLDDFEDDSRPPIVDWVEWLLVGIFDLAGA GACAYIGYLLLVLVVEGRWIG >gi|229784096|gb|GG667639.1| GENE 9 2893 - 3054 99 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621181|ref|ZP_06114116.1| ## NR: gi|266621181|ref|ZP_06114116.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 53 1 53 53 92 100.0 1e-17 MAKKELIIENYIEIDGVDVPMDTLSEEKRAEIAILLQDTAMSYAGYKRVQTPG >gi|229784096|gb|GG667639.1| GENE 10 3106 - 3615 336 169 aa, chain - ## HITS:1 COG:BS_yoqD_1 KEGG:ns 
NR:ns ## COG: BS_yoqD_1 COG3646 # Protein_GI_number: 16079126 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Bacillus subtilis # 9 104 16 111 124 85 42.0 5e-17 MNDLTRTAITSMEAAEWCGKKHNELLKDIRRYISQLGEGKIPHTDFFRESTYVTEQNKTL PCFLVTKKGCEFIAHKMTGQKGTEFTARYINRFHEMEEGKLSCPLNPQIASSVAELGRVT ERVMKNQGSAPYKIAEAFKMECEQFGIQLPADFVKVPEYEQMALSEFIK >gi|229784096|gb|GG667639.1| GENE 11 3612 - 3815 194 67 aa, chain - ## HITS:1 COG:no KEGG:Clole_0782 NR:ns ## KEGG: Clole_0782 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 66 1 66 118 64 48.0 1e-09 MYKKFAELLFQRGLTAYRVSKDTGIPANTFTDWKNGRSKPKFDKLLILAKYFGVPVEYFA DERKEDA >gi|229784096|gb|GG667639.1| GENE 12 5011 - 5436 394 141 aa, chain + ## HITS:1 COG:no KEGG:Closa_1361 NR:ns ## KEGG: Closa_1361 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 133 1 132 135 82 39.0 4e-15 MYEIFVKLLEKYGVTAYKVSKATGIAGSTFTDWKTGRSTPKQDKLQKIADYFGVTVDYLM TGKEEPEKKEITLTPKDERDIAKDLSNIMEKLRKGEAGPASFEGDEIPEETQELFAQQLD IMLHHLKKINKEKYNPYKNKK >gi|229784096|gb|GG667639.1| GENE 13 5445 - 5876 116 143 aa, chain + ## HITS:1 COG:BH3550 KEGG:ns NR:ns ## COG: BH3550 COG2856 # Protein_GI_number: 15616112 # Func_class: E Amino acid transport and metabolism # Function: Predicted Zn peptidase # Organism: Bacillus halodurans # 5 116 3 115 146 70 37.0 7e-13 MGDKERIKRLVAYYIRLCGTADPFKIARQLKIEVQFGPLGEYAGCYLFAKNHRCIFINNC IEGPELLFVMAHELGHAIMHRKQNCYFLRNYTYLNSRVEREANLFSAFMVITNDFITEHS NYSSNDIAELYGCDYYTVESRLR >gi|229784096|gb|GG667639.1| GENE 14 5884 - 6786 118 300 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621187|ref|ZP_06114122.1| ## NR: gi|266621187|ref|ZP_06114122.1| hypothetical protein CLOSTHATH_02330 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02330 [Clostridium hathewayi DSM 13479] # 1 300 1 300 300 586 100.0 1e-166 MDKDLTFDDIFKYKSVSFKIAGVEYDIMKKEDVEKIPCLSVTANVFGKNYGIDYILRKNA 
IHIYKSNGDYELAGTCIRKSNEITLAGYGTQGEDEIERERHYKENRQKKELRQKTMVEIN NNITVDDMAKFPNLPFELRWILNLQHTNGIAWFSLNKNNQYIALSAINYINDIFQQADSY LPDGNDFYICTENIYFDYIKPILLDSLPATYVECTPYTATRKKNKYPMVLHFSEVEGEPI FLNRSSYGSIFFMSDGNIGKADITIGYSTIQLRLVGISLIVRRVDKLINNNYQNIFNYEI >gi|229784096|gb|GG667639.1| GENE 15 6962 - 8425 304 487 aa, chain + ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 14 439 12 465 542 90 24.0 7e-18 MKKETEYLQIGAAYIRVSTDDQTELSPDAQLRVIRASAKEDGFFIPDEFVFIEKRGVSGR RADNREEFQRMIATAKSQTPAPFRRLYLWKFSRFARNQEESTFYKGILRKKCSVDIKSVS EPIAEGMFGRLIETIIEWFDEYYSFNLSGEVLRGMTEKALREGYQSAPCLGYQAVGGGKP FIINESEYAIAEYIHQAYHSGADMRAIARSANDHGYRTKRGNLFDKRAIEGILKNKFYAG IVTWNGFTFQGPHECRTSITAIFEDNQTRMQREYKPLNRRETSSCLHWASGLLKCGYCGA SLGYNKSKNTGRNPSFFQCWKYAKGIHGESCSITVSKAEKSILKSLDMAVNDPHIKYEYI RQSVPDSDASISKLESALSRLAVKESRIRDAYENGIDTLEEYRDNRGRIKKEREELNAQL RELKDGPARNVEDDRKALQSRIQNTIDLLKNPEVCYEVKGNALRGIVKKIVYDKEHGKLK CYYYISI >gi|229784096|gb|GG667639.1| GENE 16 8476 - 8574 87 32 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295108255|emb|CBL22208.1| ## NR: gi|295108255|emb|CBL22208.1| Mn-containing catalase [Ruminococcus obeum A2-162] # 1 32 1 32 140 62 84.0 9e-09 MWRYEKRLQYPVNITTPNPKLATFIMSQYGGP >gi|229784096|gb|GG667639.1| GENE 17 8574 - 8840 290 88 aa, chain - ## HITS:1 COG:no KEGG:Closa_2886 NR:ns ## KEGG: Closa_2886 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 88 1 88 88 124 69.0 9e-28 MANIDRNCSALLRKVYEASFAVDDVILYLDTHPDDQDALNYYQYVSELRKQAMDAYEAQC GPLMIDEVRSDNYWTWVNNPWPWEGECG >gi|229784096|gb|GG667639.1| GENE 18 8840 - 9244 132 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621191|ref|ZP_06114126.1| ## NR: gi|266621191|ref|ZP_06114126.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi 
DSM 13479] # 16 134 52 170 170 114 100.0 2e-24 MPPNMPMGPGPVMPPNMQPEMPMGRGGAMPSNMPMGPGPVMPPNMQPEMPMGRGGAMPSN MPMGPGPVMPPNMPPEMPMNQGFMNSPCNMDQFPVAMAYVPWQRWQQVYPVDKAINRGTI FPDLDKPFSMGRCR >gi|229784096|gb|GG667639.1| GENE 19 9468 - 10802 879 444 aa, chain - ## HITS:1 COG:CAC2596_1 KEGG:ns NR:ns ## COG: CAC2596_1 COG0665 # Protein_GI_number: 15895855 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Clostridium acetobutylicum # 1 374 1 376 399 356 44.0 5e-98 MKSIWTETTKIPGRNPLPGNKKTEVAVIGAGLFGILAACYLKEQGKEVIVLEADRIGSGQ SGYTTAKITSQHNLIYEKLVNTMGEKYASSYAMANQQAIEDYKKLIKKRRIACHFEEKDA YLFSVNESEILRREADCAARCGLPATFVRETELPFPVHGAVRFTGQAQFHPLEFIEALSG DLTVYERTKVREVKGHEIMTDRGVVRADHIVFACHYPFVNIPGFYFLRLYQEKSYVMALG PVKELEGMYLGIDKNSYSFRSAGNTLLLGHGSHRTGKCPEKNPYEEMRRLKEHFYPGTEE FGRWSAEDCMSVDSVPYIGHFSGVLADWYVATGFGKWGMTTSMAAAKIVAAQIMGRKTPY EEVFTPQRFRPKAAAVTFLSHAGHSAAGLMKGAVPKVPRCPHMGCRLVYNRMDGRYECPC HGSQFEQNGKICSGPAQESLPFIN >gi|229784096|gb|GG667639.1| GENE 20 10981 - 11409 316 142 aa, chain + ## HITS:1 COG:no KEGG:Closa_1534 NR:ns ## KEGG: Closa_1534 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 7 138 6 145 160 84 37.0 1e-15 MVISDLLSGAKSFHKIYNQSIQTAASRHGLSLMDGDVILFLYNNPDYDTAKDISTFRMLA KSGVSASVESLTGLGYLEGREDTADRRKIHLKLTEAALPAAQDLKQTQQDFFDRLNRGIS EEERQIFTDLLKRMMDNLKQTD >gi|229784096|gb|GG667639.1| GENE 21 11441 - 12202 713 253 aa, chain + ## HITS:1 COG:Ta0985 KEGG:ns NR:ns ## COG: Ta0985 COG0730 # Protein_GI_number: 16082023 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Thermoplasma acidophilum # 12 245 11 243 255 65 25.0 7e-11 MLEKIIVCLFAGLGAGFGTGFAGMSAAAVISPMLITFLGVPAYQAIGIALASDVLASAVS AYTYGKHKNLDIRNGLVMMAAVLLFTLVGSYISSLVPNHAMGNASVFLTMLLGIKFIVRP VLTTKDAQASKTVRQKLIQSIVSGIVVGFICGFIGAGGGMMMLLLLTSVLGYELKTAVGT SVFIMTFTALTGSLSHFAIGGMPDLFLLVLCILFTFLFARIAAVFANKADPKLLNRLTGI ILFLLGAVVLLVK 
>gi|229784096|gb|GG667639.1| GENE 22 13166 - 13363 58 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870446|ref|ZP_06409759.1| ## NR: gi|288870446|ref|ZP_06409759.1| zinc finger protein 207 [Clostridium hathewayi DSM 13479] zinc finger protein 207 [Clostridium hathewayi DSM 13479] # 5 65 1 61 61 102 98.0 1e-20 MEWGLTSFETNSCGLPPVFVIIPAGNEPDGALTGPSGDSMPPVCRGLPRGLDSQLLTVGI RLSLL >gi|229784096|gb|GG667639.1| GENE 23 13327 - 14703 1573 458 aa, chain + ## HITS:1 COG:alr2306_2 KEGG:ns NR:ns ## COG: alr2306_2 COG2200 # Protein_GI_number: 17229798 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Nostoc sp. PCC 7120 # 194 437 9 254 260 138 31.0 2e-32 MSWFQKKSIPTPSPAESEPVITERLEESKLERDACYRHLSEVMDTPSDRGVVLKVYIENF KRLNQVFGYDYCENLLSQILSYLKEKTGKPVYHYIGVEYIIILDQYSQGQASDLAEEIAE QFDHAWKVNGTDCLCSVQMGMCSYPGHAATPDELLKCLDLAVLKAEEGGPNQAIMFDSTL QKQLLRRHTIALYLKTAIEKEEIEVRYRPTLNMETGKFSRAEFYMRIFIKGLGLIGASEV LPIAEDSGQIRSLEYFALDKVGQCIAKLTEAGKEFDSIALPISSILFLQEDFLDEVRRVI DTYQIPKGKLAIEIQESALTTAYLNINVMLQELSDMGVELVLNEFGSGYSGVATILELPV NTLKLERLFVWQLETNPRSHCVIEGLIHIARDLNLNIIAEGVETENQVNILTKAGCDYQQ GFYYSPTLEQDTLLKVIDNTFADSIPLLNEEKEKMSKA >gi|229784096|gb|GG667639.1| GENE 24 14830 - 15294 531 154 aa, chain - ## HITS:1 COG:CAC3674 KEGG:ns NR:ns ## COG: CAC3674 COG0517 # Protein_GI_number: 15896906 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Clostridium acetobutylicum # 1 131 1 131 140 150 54.0 1e-36 MNILFFLTPKCEVAYIYEDETLRQALEKMEYHKYSAVPIISRTGRYIGTITEGDMLWGIK NKFNLSLKEAERVTVAAIDRRLDNRPVFANSNMEDLIDKALNQNFVPVVDDQKNFIGIIT RKDIIRYYYDKSQECSQAAQTGRNTGDKARLVLG >gi|229784096|gb|GG667639.1| GENE 25 15545 - 16312 834 255 aa, chain - ## HITS:1 COG:no KEGG:Closa_2891 NR:ns ## KEGG: Closa_2891 # Name: not_defined # Def: Peptidase M23 # Organism: C.saccharolyticum # Pathway: not_defined # 1 255 1 253 253 352 76.0 1e-95 MKEKMSQLFKDKVFLVLLVLGLLTIVAAAGVITIQRGNDGGQSPYLEVPDQKGVIAEETI 
PQENQVAVAGDSNAEQNTDADTQVADSARATAKAETEAEVPAVKAGTEKEAAAALVLNFN DATKMAWPVSGNVILDYSMESTIYFPTLDQYKCNPGIVIQGAVSTPVIAPANAKIQEIGS NEELGNYVVLNMGNDYTAVCGQLKELQVVENEYVAQGDVLGYVAEPTKYYSIEGANLYFE LEHENQPIDPLDFMQ >gi|229784096|gb|GG667639.1| GENE 26 16353 - 17243 1082 296 aa, chain - ## HITS:1 COG:CAC2861 KEGG:ns NR:ns ## COG: CAC2861 COG2385 # Protein_GI_number: 15896115 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Clostridium acetobutylicum # 32 291 65 338 345 130 29.0 2e-30 MALLLPYIITLAWTGRIEEKKEVPLITSGKKVILDRKNGESYMDVEEYLPGVVAKQMPAE YGKEALRAQAIIARTFIYQKMNGENEVKESELHMEYLEEKQMEAMWGSESFVTYFQAVEE AVRSTAGRVITCDGRLIEPLFHRASTGMTRSGDESHPYLQAVASKRDVEAEGYLTMMVWS KEEFAGRINQIAGSVPVDAGQLPQSIQMIVRDDGGYVGQIQIGTKVYTGEEVQYALGLPS PSYTLEEYDGGIRAVCQGVGHGYGLSQYGAKCMAEEGKTAEEILDYFYKNIVLISE >gi|229784096|gb|GG667639.1| GENE 27 18226 - 18771 467 181 aa, chain - ## HITS:1 COG:CAC2858 KEGG:ns NR:ns ## COG: CAC2858 COG1077 # Protein_GI_number: 15896112 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 14 148 16 140 340 92 34.0 3e-19 MIESTIGKPGYEPARITIYVKDRGIVLEESSMALVNRDTGLIIAMGNAAEEAIDQAVTPV TAVNPLRRGIIASYMLAERMFCSYLRRALGYDRSMVKRLTGATVKKPRVAVCVPEELTEV EEKAFMDAFYQAGARDVCLTGQPLEEAVRCLEKPCTVFVGITWNGKEKERFCINENCPHR I >gi|229784096|gb|GG667639.1| GENE 28 19009 - 21462 3236 817 aa, chain - ## HITS:1 COG:BH1084 KEGG:ns NR:ns ## COG: BH1084 COG0058 # Protein_GI_number: 15613647 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Bacillus halodurans # 3 815 1 808 815 840 50.0 0 MSMGFNKETFKKSVIYNVKNVFRKTIDEATPEQAFQAVAYAVKDVIIDEWIATHKEYEKK DVKTLYYLSMEFLMGRALGNNIINIMALPEVKEVLDELGFDLNAIEDQEPDPALGNGGLG RLAACFLDSLATLGYPAYGCGIRYRYGMFKQKIENGYQMEVPDDWLKNGYPFEVRRAEYA TEVKFGGYVRTVWDNGREHFVQEGYQSVRAVPYDMPIVGYGNNVVNTLRIWDAEAINTFS 
LDSFDKGDYQKAVEQENLAKTIVEVLYPNDNHYAGKELRLKQQYFFISASVQRAVKKYME KHDDIHKFFEKTVFQLNDTHPTVAVPELMRILLDEYNLTWDEAWAVTTKTCAYTNHTIMS EALEKWPIELFSRLLPRIYQIVEEINRRFVEQIQQMYPGNQDKIRKMAIIYDGQVRMANL AIVGGFSVNGVARLHTEILKKQELRDFYEMMPQKFNNKTNGITQRRFLLHGNPLLAQWVT GKIGNEWITDLPHIHRLAVYADDPKCQQEFMDIKYQNKVRLAKYIKEHNGIDVDPRSIFD VQVKRLHEYKRQLMNILHVIYLYNELKDNPNMDMVPRTFIFGAKAAAGYKRAKLTIKLIN SVADVINNDKTIDGKIKVVFIEDYKVSNAEIIFAAADVSEQISTASKEASGTGNMKFMLN GALTLGTMDGANVEIVEEVGEENAFIFGMSSDEVIGYENRGGYNPMEIFNNNYQIRRVLM QLINGYFAPQDPELFRDIYNSLLNTQSSDRADTYFILKDLPSYAEAQKRIDQAYRNETWW AKAAILNVACSGKFTSDRTIEEYVRDIWHLEKVKVEI >gi|229784096|gb|GG667639.1| GENE 29 21476 - 24610 3389 1044 aa, chain - ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 1035 1 1033 1035 1229 56.0 0 MYNKVPTDLNFVAREKEVEKFWEDHDIFQKSIDNRKKAETYTFYDGPPTANGKPHIGHVL TRVIKDMIPRYRTMKGCDVPRKAGWDTHGLPVELEVEKMLGLDGKEQIEEYGLEPFIKHC KESVWKYKGMWEDFSKTVGFWADMDDPYVTYENYYIESEWWALKQIWEKGLLYKGFKIVP YCPRRGTPLSSHEVAQGYKDVKERSAIARFKVKGEDAYILAWTTTPWTLPSNVALCVNPN ESYVKVKNGDYTYYLAEALCDTVLNSEYTVLERFTGKDLEFKEYEPLFDFVHPAKKAYYV TCDTYVTLTDGTGVVHIAPAFGEDDSKVGRKYDLPFVQLVDSKGEMTKETKWAGTFCKKA DPLVLKDLEERGLLFSAPVFEHSYPHCWRCDTPLIYYARESWFIKMTDVKEDLIRNNNTI NWIPESIGKGRFGDWLENVQDWGISRNRYWGTPLNVWECECGHQHAIGSIEELKSMSPNC PDDIELHRPYIDAVTITCPKCGKQMKRVPEVIDCWFDSGSMPFAQHHYPFENKELFEQQF PADFISEAVDQTRGWFYSLLAISTLIFNKAPYKNVIVLGHVQDENGQKMSKSKGNAVDPF QALEEFGADAIRWYFYINSAPWLPNRFHGKAVMEGQRKFMGTLWNTYAFFVLYANIDEFD ATKYKLEYDKLPVMDKWLLSKLNTLIKTVDGNLGNYQIPEAARALQEFVDDMSNWYVRRS RERFWAKGMEQDKINAYMTLYTALVTVSKVAAPMIPFMTEEIYQNLVCNIDKNAPESIHL CDYPVADESCIDEKLEDNMDEVLKIVVMGRAARNTANIKNRQPIGRMFVKAPHELSVFYQ EIIEEELNVKTVVFTDDVRDFTSYSFKPQLKTVGPKYGKQLNNIRKALSEVDGNTAMDTL NEKGALTFEFDGSEVVLTKEDLLIDTAQMEGYVSEGDNTVTVVLDTNLTPELIEEGFVRE LISKIQTMRKEAGFEVMDHILITSEGNEKIAGILEAHGEVIKSEVLAEDITLGKAAGYTK EWNINGENVTLGVEKIQRIETVNQ 
>gi|229784096|gb|GG667639.1| GENE 30 24697 - 24852 149 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGNPKRAGDGESLVRVVMGTNITLESPAESCVSSKPGRESYVTGNAYVSML >gi|229784096|gb|GG667639.1| GENE 31 25030 - 26406 1548 458 aa, chain - ## HITS:1 COG:ECs0342 KEGG:ns NR:ns ## COG: ECs0342 COG1249 # Protein_GI_number: 15829596 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component, and related enzymes # Organism: Escherichia coli O157:H7 # 1 458 10 448 450 417 48.0 1e-116 MKKFDAVIIGFGKGGKTLAGKLAGEGKNVALIEKSDKMYGGTCINVGCIPSKSLVRSSQI TESKGEISFEEKAELYRTAIEEKRRVTSMLRKKNFDKLDHLETVTIYNGTASFLSNTQVN VVPADGSEEFTIEGDQIFINTGSTPFVPPIEGIEGNPQVYLSETFMDLETLPKKLVIIGG GYIGLEFSSMYSGFGSEVTVIQNEARLIPREDEDIAAEIQKVLEEKGVQFVIGAAISSIR KEGDRSYVHYSADGKEGDLEADAILVATGRRPNTAGLNLEAAGVEVTARGGVKVDDAYRT TAPNIWAMGDAAGGLQFTYVSLDDFRIVWSSLHGGAYDARAARRSHIPYSVFIEPSFSRV GMNEQEARMAGLDVKIARLPASAIPKAAVLNKTKGVLKAVIDAKTNQILGAMLFCEESYE MINIVKLAMDLNADYTVLRDQIYTHPTMSEALNDLFAV >gi|229784096|gb|GG667639.1| GENE 32 26438 - 26983 537 181 aa, chain - ## HITS:1 COG:STM1686 KEGG:ns NR:ns ## COG: STM1686 COG0607 # Protein_GI_number: 16765029 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Salmonella typhimurium LT2 # 98 168 24 94 104 66 42.0 3e-11 MITGRFLITAGIAAAVLALTACSSGQKSAAGKETQKTEAQQEKAGTENRPAQQEENGTEA RTAQQAESRTEAQASQYVKYRPEEAKEVMDSDEALIVVDVRTPEEYAESRIGDAINIPVE EIGEEMPGELPDLDAKIMVYCRSGVRSKNAAEKLLDLGYKNIIDIGGIKDWPYETVSGEK Q >gi|229784096|gb|GG667639.1| GENE 33 28027 - 29487 1542 486 aa, chain + ## HITS:1 COG:CAC1780 KEGG:ns NR:ns ## COG: CAC1780 COG1488 # Protein_GI_number: 15895056 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 4 482 13 486 489 542 55.0 1e-154 MSQRNLTLLTDLYELTMMQGYFKEKDANETVIFDAFYRTNPGGNGYAICAGLEQVIQYIK 
DLHFEEDDVDYLRSTGLFGEDFLEYLRHFKFSGDIYAIPEGTVIFPREPLVKVIAPIMEA QLIETALLNIINHQSLIATKTARVVYAAGGDGVMEFGLRRAQGPDAGIYGARAAMIAGCI GTSNVLCGKMFNVPVKGTHAHSWIMSFPDELTAFRTYARLYPSACILLVDTYDTLKSGVP NAIKVFKEMREAGIPLTFYGIRLDSGDLAYLSKKAKKMLDDAGFPDAVISASNDLDESLI NSLKIQGAAINSWGVGTNLITSKDCPSFGGVYKLAAILDKKSGKFVPKIKLSENAEKITN PGNKCIKRIYSRETGKMIADLICLEGEKYNENHSLLLFDPIETWKKTHLAPNTYHLRDLM VPVFKNGECVYDSPAVMDIQAYCKKELDTLWDESRRLVNPHEVHVDLSNELWHMKNQLLD SYHFSE >gi|229784096|gb|GG667639.1| GENE 34 29570 - 30289 879 239 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2283 NR:ns ## KEGG: EUBREC_2283 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Pyrimidine metabolism [PATH:ere00240]; Metabolic pathways [PATH:ere01100] # 8 238 7 212 212 170 35.0 4e-41 MSMGQETKIRARDNDKLALKVTKGHFSSDRFHINYYIDMTTLKMRQEEAEQVAKSMVKRY VNRVDLSGLIGVGAEELKHYSKAFASKTPIDTIICMDGCEVIGAYVAKELSELGVTTTNI HKTTYIVTPEFDSAGQMVVRDNIKHMLKDKHVLVVLATAMSGRTILKSIRCIESYGGILE GISVIFAAVNEIDGYPVNSVFDSSDLPDFRLSEPDNCPDCKNGIKLDAIVNSYGYTKLD >gi|229784096|gb|GG667639.1| GENE 35 30305 - 30433 68 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTIPMTGTQTMTIPRTNLMIKPPSQNPPGHDRNFRRGPGFMI >gi|229784096|gb|GG667639.1| GENE 36 30372 - 31568 853 398 aa, chain - ## HITS:1 COG:no KEGG:Closa_0788 NR:ns ## KEGG: Closa_0788 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 256 7 249 485 94 29.0 1e-17 MSKKIETSQINQSLNRAVSQLPQPSLQKIIDTPVVKMERMDDITGQEPLKGGWFCRLCSP VQVRRMAAVCGCAVFFLAAGIGGVYSYQNLMVDSIVDMDVNPSFELKINKKDRVLSFVPL NEDAVQAAEGHTYKDWNIEEAVKDLYRIMEEKAYLTDDRKTVLISVENKNPNRVDQLQSQ LSACIRKTAEESKRTVHIVTQEKKKDGALDQTALNYHISSGKLQFIRMMTAAYPDLDEMT LSRMSMEELYRIIFEREKEKPVWLQMDEEDWNEYKEEMRKAKYGDRDSSDDRDDDDFDDD DFDDDDSDDGDSDDNDSDDHVSDDRDPADDDFDDRESDDRNSADHDDLDDADSPGKNVDD DSEDKAPDDDESGGKASDDDNSDDRDSDNDDSEDEPDD >gi|229784096|gb|GG667639.1| GENE 37 31574 - 32149 502 191 aa, chain - ## HITS:1 COG:FN0479 KEGG:ns NR:ns ## COG: FN0479 COG1595 # Protein_GI_number: 19703814 # Func_class: K Transcription # 
Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Fusobacterium nucleatum # 47 183 14 146 149 66 28.0 3e-11 MIFFMVTGGEPEVSIQRLNIDEDLIVRIDRDDRSALEEFYTLTERTLYSYVLAIVKNPYD TQDIVQDTYLKVRASAHLYQKQGKPLAWLFTIARNLAMDLFRRNSRFLSEEAFEMDDTME YSYVTEMTDRLVLETAMNILTERERQILLLHLVAGIRHREIAANLGLPLSTVLSCYQRTL KKLKRYLEGKE >gi|229784096|gb|GG667639.1| GENE 38 32321 - 33358 861 345 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621210|ref|ZP_06114145.1| ## NR: gi|266621210|ref|ZP_06114145.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 241 1 241 345 342 100.0 2e-92 MKGAKDMKKNFMAAMAAAAVVTLGGLGLYLGGPSQYLAFDINPSIEIKANRLNQVVSLKG TNEDGKELLQGYHMEDRNLDRVLEDVVDLLDQAGYFTNKDKNDILLTVKSGSASETTVKN VNRHLAGYLEECQIEGRVLDQHVSLTKELKKLAEEHHISAGRMALIEKILAKDDSVTASQ LADMRISDLAAYARDEGISLDSLEDKLDHANDWYRSDRLEMLEDELDRMAESEAAADGKY GDDDRDDDDRDDDDRYDDNDRYEDDDRYEDDDRYEDDDRYEDDDRYDDDDRYDDDDRYED DDRYEDDDRNDDDDHYDDDDCNEDDDRYEDDDRNDNDDHNDHDDR >gi|229784096|gb|GG667639.1| GENE 39 33432 - 34925 1006 497 aa, chain - ## HITS:1 COG:no KEGG:Elen_3094 NR:ns ## KEGG: Elen_3094 # Name: not_defined # Def: regulatory protein GntR HTH # Organism: E.lenta # Pathway: not_defined # 1 483 1 486 486 254 33.0 6e-66 MEEKSTLFEYLYRNLREQIMTGYLKYGDPLPSMSQLSENYHVGIRTVREVLAALKREGLI HTEERRTSRVAYQPVDEIQKEDVLRSVLERRTAILEVYKTMELIMPRLFACSIASCGGEK VRLSFQHLKRNSKKGIDLRWKASSMALHSLLDASGNLLFRDIFTSLEICARVPFFLEFRE SPAFSASAMAYKDPMWMLEAANSGDLKEMRRCFETMYRAIADDVKQFLEELDAMVPFPVE ETDCCYSWSPEIGRNHYYMQITRDLIDKIGVGIYRDGTFLPSEAALADEYKVSVSTIRKA VAMLNKCGFCHTYNVKGTQVTLFNDDATAQCMKNRVLKNDTLLYLSGLQFMAAAVWPAAM LTAESIKKEAEELVDKMEQPWAVPLDLITDAMKEHLPLKPFQMILQEVGGLLHWGYYYSF YSDGSTRSNELNQRCRIALDCLMKGEKQAYARQLSLCYCHVLEVVREYMVQCGLYEAKNL ITPQPEQWEEYEIQNIW >gi|229784096|gb|GG667639.1| GENE 40 35137 - 37404 1448 755 aa, chain + ## HITS:1 COG:AGl374gl_3 KEGG:ns NR:ns ## COG: AGl374gl_3 COG2200 # Protein_GI_number: 15890301 # Func_class: T Signal 
transduction mechanisms # Function: FOG: EAL domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 513 752 12 249 266 185 37.0 3e-46 MKIRNRKKTDFYNPFFRKGLQISLLSAGILLVFIILIRNAADLRMALNKSTQKYLDDVTV QTARDIHDALVNKMTSLASISDTASRLDEMCSEDEAADLLHRKAEILEFDPLILLHRDGT MISSEGDFSPPWITSTDFFQLDGVLASFEGEVTADYIGGKSIFYSAPVFREGQIDEVLIG VRSKENMQAMISSKNFDGKMLSCITDSSGQVVISPTDLKPFLRLDNIFRENKDRNVVEDI HQMQENMRNNRSGILKFTAVTKEELILSYNSLGVNDWFLLTLIPADFLSSGAERYIMQSF VIIGLTVLLFSLFLFSVYRFYHIHRKQLERLAFVDPLTGGPNHAAFQLKYTELADTMVPN TYSILLMNVNGFKLINERLGIQAGSQILAYIYRVLNHRLRSEDNEFAARVESDHFFLCLK EHDPARIQKRLDGIIREINDFHHTDLTAFRLEFRLGACIVDDPGLEIARLQDRARLACQS QTAGHREKCAFYDNSLTERIMLEQELNDLFETSIENHDFKVYFQPKVLLESNRAEGAEAL VRWIHPEKGVIFPSDFIPLFEMNGKICRLDFYVFTQVCQTLHNWIATGKNPVPVSVNLSR RHFLDQNFLKKFADIAAEYEIPPGMIEFELTESIFFDNRQIDVVKEMIHQMHQLGFLCSL DDFGSGFSSLGLLKEFDVDTIKLDRSFFVNMEGEKAKDVIACLIELSKKLRVHTVAEGIE TSEMVDYLRTIKCDMVQGYIFSKPIPVSEFEARYL >gi|229784096|gb|GG667639.1| GENE 41 37592 - 37771 282 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870454|ref|ZP_06114148.2| ## NR: gi|288870454|ref|ZP_06114148.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 10 59 1 50 50 77 100.0 3e-13 MSGMNCESCMYYTYDEDYECYVCDMDLDEDEMIRFLRGDSGGCAHYQYADEYRVVRKQM >gi|229784096|gb|GG667639.1| GENE 42 37802 - 38872 1018 356 aa, chain - ## HITS:1 COG:RSc3292 KEGG:ns NR:ns ## COG: RSc3292 COG3274 # Protein_GI_number: 17548009 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 9 200 5 197 336 61 23.0 2e-09 MKEREIKYDFLRAVAVIAIMMVHAIPAVSVNDKQWWFAAVMQPLLLSFVGIYFMLSGMFL LDSATGDIMQFYKKRVKVIVIPFLFFSVLYYLYDVSRSLEVLSWWEYPLEFLRDFLSGTV PMADHMWFMYVIIALYLCTPFLARMMKAMNDQELTCFILLILGVQTLTNYAGGLGIGLDQ ILTYMVFQGWLIYYVLGYALKRLFQRKDFKWFAFAGILGLVLTLLQKRFTPGFVPGIHDL APTMIAMSAAVFLLFECWGNLRFKAARTVITWLSRHSYSAYLAHYLILKVAAEPIVDQTM VRHFYVPRIVCTTLLTAVLSFAAAWLLDGTVIKWLQKFVKTDTGKSIKSINVKWKE 
>gi|229784096|gb|GG667639.1| GENE 43 39104 - 39538 541 144 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2679 NR:ns ## KEGG: Dhaf_2679 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 2 139 175 315 337 102 40.0 3e-21 MEEERRKQEAEEAIAAANRVLMNLRQAETALSSAGSWGIWDMLGGGMFTTWIKHSRIDDA RLALEESRRSLRSLRRELMDLEIPGDFKIDIGEFLNFADYFFDGLIIDWMVQTKIREASD NVKEAIRRVERLRTRLYEMLEADA >gi|229784096|gb|GG667639.1| GENE 44 39783 - 40235 491 150 aa, chain - ## HITS:1 COG:BS_bltD KEGG:ns NR:ns ## COG: BS_bltD COG0454 # Protein_GI_number: 16079713 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Bacillus subtilis # 1 146 1 149 152 135 47.0 3e-32 MYIHVEPVNDKNREAVLALKIHETQKGFVESPEQCLKEASEETCWRPVGIYDGDQLIGFS MYGYFWQYLPFGRVWLDRLLIDRDYQGHGYGHVVLPMMIEKLRNEYHRKKIYLSVVEENQ GAIHLYEKFGFRFNGETDVHGEKVMVLKDG >gi|229784096|gb|GG667639.1| GENE 45 41365 - 41727 415 120 aa, chain - ## HITS:1 COG:no KEGG:Cthe_0079 NR:ns ## KEGG: Cthe_0079 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 10 111 11 112 171 149 66.0 4e-35 MEYDMKETGYRREAAVEYAKKWAMGRNPRYLDFENFGGDCTNFASQCIYAGSGIMNYTPV MGWYYNSSTDRTPSWTGVQYLYNFLVNNKSVGPYAVETDQAGVSPGDLVQLGNASGFLAS >gi|229784096|gb|GG667639.1| GENE 46 41790 - 42206 432 138 aa, chain - ## HITS:1 COG:no KEGG:Closa_3192 NR:ns ## KEGG: Closa_3192 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 137 1 138 139 128 57.0 6e-29 MDKVSLTFLVFFEDPFWIGIVERISDGTMEVCRITFGAEPKDYEVYEWLLKNYDGLRFSP AVKAEVKKEHTNPKRRQREARMQTAVSGIGTKSQQALSLQREQLKEERKVRTRLQKEAEK EARFELKQQKRKEKHRGR >gi|229784096|gb|GG667639.1| GENE 47 42356 - 42925 372 189 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621219|ref|ZP_06114154.1| ## NR: gi|266621219|ref|ZP_06114154.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium 
hathewayi DSM 13479] # 1 189 1 189 189 341 100.0 2e-92 MKSLRESFEENYEPVEVPCSNRRGFRIRYEYIGPWYQWGEDAVRRKREKRTIGNACAVSL MLFLAGSTRNLALNYDRYVEFFGMLSAAAFLFEVIGTVQFCTAKEKVTDMNYSDINAKLR LAPTVHALLLLCTAAACMAAMAGNGVTVSGLGVTMCYLGAAAASFMIYIRYSRLTLSSLS GKMQTNMVR >gi|229784096|gb|GG667639.1| GENE 48 42922 - 43881 795 319 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 4 309 11 324 329 310 50 8e-84 MAALLEIQNLKKHFKSAAGMVHAVDDVTFSLEEGETVGLVGESGCGKSTLGRTLIHLNES TDGRIIYKGRDVTKLNKKELQNYRRDVQMIFQDPFSSLDPRQTVQDIIKEPMKLLGTMSE SDLKDRTSELMKTAGIEKRLRMCYPHELDGGRRQRVGIARALALNPKFIVCDEPVSALDV SIQAQILNLLMDLQDQMGLTYLFITHDMSVVKHISNQICVMYLGQIVEKCDKSELFCHPV HPYTKALLSAIPTVDIHAKKERILLKGEIKSPINPGPGCRFATRCIYADESCKQNCCLQE VNPGHFVACNKVKEIEALG >gi|229784096|gb|GG667639.1| GENE 49 43885 - 44874 833 329 aa, chain - ## HITS:1 COG:FN0399 KEGG:ns NR:ns ## COG: FN0399 COG0444 # Protein_GI_number: 19703741 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Fusobacterium nucleatum # 10 321 6 317 335 351 54.0 1e-96 MSEHKKETAFFSIRNLTVEYHSGNSVIYAVNDVSFDLKEGETLGLVGETGAGKTTIAKSV LRILPEHSVERIEGEIFYNGRDILKMTDTELHSLRGSQISMIFQDPMTALNPVKTVLSQI AEGYRQQHGCTKSEANKRAVEMLELVGIGADRAGEYPHQFSGGMKQRVVIALALSCSPKL LLADEPTTALDVTIQAQVLDLIKALRDEMGTAMVLITHDLGVVADVCDKVCVIYAGRIIE SGSKEDIFDHPTHPYTQGLFAALPDLKKDVDRLSPIEGLPPDPSSARKGCDFAERCPHKC HRCDQFDNEMREVAPGHYSRCWKTGRSGD >gi|229784096|gb|GG667639.1| GENE 50 44888 - 45826 976 312 aa, chain - ## HITS:1 COG:BH0030 KEGG:ns NR:ns ## COG: BH0030 COG1173 # Protein_GI_number: 15612593 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 36 312 25 301 301 251 41.0 1e-66 
MAEELKRDHHKSDGDRRHERKKRHRHHIDESIKPGSAKYIWLNICHNKGAVAGMVMLAAI IVLSLLSPFLCKYSYSELDMLHAYSLPSLEHLFGCDELGRDLLSRVLYGARYTLVIGVIS TVLSAVIGIVMGAAAGYFGGVIDSCLMRFLDVFQAFPTLVLAMAFCAVFGTGVDKCILAL GITGIPGFARLMRANILRIRTMEYIEAAATINCPTWKVIASHIIPNAVSPIIVEIAMSIS RNGLASSSLSFLGLGVQPPKPEWGAMLAASRSYIRDYPHMVIIPGIFIVITVLSFNLLGD AFRDALDPNFKD >gi|229784096|gb|GG667639.1| GENE 51 45819 - 46784 831 321 aa, chain - ## HITS:1 COG:FN0397 KEGG:ns NR:ns ## COG: FN0397 COG0601 # Protein_GI_number: 19703739 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 309 1 307 308 252 47.0 8e-67 MSKYVLKRLLIMIPTLIAVGIIMFTLMNFVPGDPAQLALGAGTHTPAEIEMKRAALGLNR PFMVRLGEYLKNIFLHFDFGKSLIDGTEIKWELMRRFPHTFKIAIYSIVLTVVIGLPLGI YTAVKANSIGDKISLFITLLFDCMPSFWTALLLVLLFSLKLGWLPSSGCTSFKYFILPTI ANSLGGLAGFTRQVRASMLEVIRSDYVTTARSKGLPERKVIYGHALPNALIPILTVVGMR FGSMLGGATIIEVIFSIPGIGQYLVNCINNRDYTASTGGIVFIAFTFALIMLLTDLLYAF ADPRIKAQYVSSGRRKVKENG >gi|229784096|gb|GG667639.1| GENE 52 46807 - 47496 741 229 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621224|ref|ZP_06114159.1| ## NR: gi|266621224|ref|ZP_06114159.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 229 5 233 233 448 99.0 1e-124 MNVRKAIAYAFDAAMVLQGSGSSAGTISHDVAPNLGPDYVQEWDNQDYFGMNLEKSAEYL KAAGYEPGQLEIHLMVSSQAPQGPYQALQAMLEQAGIKLVIDAYDRALRQTYDSDPTMWD MSELSQSVSDFTTSFWNILFSEDNYEYGTQGFTKDDKLQELLKAANAERSEENMNAFHDY VIENCYMVGLYTETRSIVTTEGLTDICLEKLNPVLNAMTFTEQYVPVAK >gi|229784096|gb|GG667639.1| GENE 53 48430 - 49146 632 238 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621225|ref|ZP_06114160.1| ## NR: gi|266621225|ref|ZP_06114160.1| hypothetical protein CLOSTHATH_02373 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02373 [Clostridium hathewayi DSM 13479] # 1 237 1 237 237 366 100.0 1e-100 
MKTKRVLALFTASMMTAALLAGCGGSSGSSQTTAPAAQSAEATEKAAEENIQSGNEKTET TVMEEAAESVTVAVDDDSFTIGPWGGASAVRDWTENILWAHLAYRPFIGAMLDDGVQLVA AKSVTKTDDATYDIEIWDNITDSQGNPIKAEDVVYSYNKLAELGYVTDIGTYYAGSEATG DYTLQIKLKNTMDGAIEKVLTSCSIASQTWYEGASENDINLSPATTGAYTVTDMQTGS >gi|229784096|gb|GG667639.1| GENE 54 49257 - 50690 1292 477 aa, chain - ## HITS:1 COG:YPO0986 KEGG:ns NR:ns ## COG: YPO0986 COG4099 # Protein_GI_number: 16121290 # Func_class: R General function prediction only # Function: Predicted peptidase # Organism: Yersinia pestis # 78 477 37 458 458 196 30.0 7e-50 MKKSWMAALLLSSAVLVTACGAGAPQNGSGTENSGTVKTEGAVPDAEQETTETETEGKTT EKAAEKAQSGGISKVITINKSFGDGQKVTHVALQYEKAIDADSLSTDSYEVKNRTVTAVR TNSEAAPAEASVPGNYVILELEIQSPLLDDKYASDGRMEHDEVIDTATVVQIKDVKSTDG EVYPAGSQEYSTPANEGIMGNDSKLYPVRDRFEDNHFYTDPEWKTVLHYNIFKPEGYEDS GETYPLLLFMPDAGSVSSDWEKVLAQGNGGTVWAEEEWQAEHPCFVVTMIYDDKFINDYW EYYDNYVEGTMNLVRSLADQYPVDKDRIYTTGQSMGCMCSLIMMSKEPDLFAGAYCVAGK WEPESLKGLKDSHILILNSEDDSLETGLLMDETVSLWEREGAAVARGAIEGIAEPQVLEA EMDKLLSENADIYYCKIQSGTGSMDLEGNPLNGSHRMTWRLGYDLPGVKEWLFTQSK >gi|229784096|gb|GG667639.1| GENE 55 51197 - 52210 635 337 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1182 NR:ns ## KEGG: EUBREC_1182 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 337 174 510 513 579 82.0 1e-164 MVHKKDRKHFIVHKCVNPKCPYYLHNLKTVDKADLDEDYGKNKYKLHYIYREFTVDFFRM DLNSLPKHASSLRFSKFDQNVLGLCLSYKINLGLSLRKTAQALGDIHGITISHQQVANYL KTAALCVKPFVDNYNYDVGNVFTADETYIKIRGIKGYIWFIMDAAKRSIIGYQVSDNRGV GPCILSMRMALKHLKELPAKFKFIADGYSAYPLAAQQFFIESKGKLKFDITQVIGLTNDD EVSREFRPYKQMIERLNRTYKASYRPTNGFDNIDGANYDLALWVAYYNFLRPHRQFKGNV LNRVEMLEGADNMPGKWQILIFLGQQTIKNLQSGQTS Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:36:21 2011 Seq name: gi|229784095|gb|GG667640.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld33, whole genome shotgun sequence Length of sequence - 40460 bp Number of predicted genes - 44, with homology - 40 Number of transcription units - 19, operones - 8 average op.length - 4.1 N 
Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 14 - 781 282 ## CLJ_B3105 group II intron reverse transcriptase/maturase + Prom 789 - 848 2.1 2 2 Op 1 . + CDS 998 - 1936 753 ## COG4962 Flp pilus assembly protein, ATPase CpaF 3 2 Op 2 . + CDS 1933 - 2865 633 ## Closa_3536 hypothetical protein 4 2 Op 3 . + CDS 2878 - 3750 554 ## Closa_3535 hypothetical protein 5 2 Op 4 . + CDS 3761 - 4159 228 ## Closa_3534 hypothetical protein 6 2 Op 5 . + CDS 4170 - 4712 261 ## Closa_3533 hypothetical protein + Prom 4726 - 4785 5.2 7 3 Tu 1 . + CDS 4845 - 5423 315 ## gi|266621234|ref|ZP_06114169.1| conserved hypothetical protein + Term 5440 - 5480 5.8 + Prom 5444 - 5503 1.5 8 4 Op 1 . + CDS 5535 - 7208 813 ## Closa_3531 hypothetical protein 9 4 Op 2 . + CDS 7226 - 7918 423 ## Closa_3529 hypothetical protein 10 4 Op 3 . + CDS 7915 - 9741 1284 ## COG3451 Type IV secretory pathway, VirB4 components 11 4 Op 4 . + CDS 9741 - 10571 547 ## Closa_3527 hypothetical protein 12 4 Op 5 . + CDS 10574 - 11560 577 ## Closa_3526 hypothetical protein 13 4 Op 6 . + CDS 11586 - 12074 252 ## Closa_3525 transcriptional regulator, GntR family + Term 12165 - 12202 -0.7 14 5 Tu 1 . - CDS 12187 - 13698 1149 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Term 13714 - 13767 9.2 15 6 Op 1 . - CDS 13793 - 13948 201 ## Tresu_1927 hypothetical protein 16 6 Op 2 . - CDS 13992 - 14936 848 ## ELI_1000 plasmid recombination protein - Prom 15055 - 15114 2.5 17 7 Tu 1 . - CDS 15124 - 15519 251 ## ELI_1002 phage protein 18 8 Tu 1 . - CDS 15631 - 16044 239 ## ELI_1003 hypothetical protein + Prom 16323 - 16382 5.5 19 9 Op 1 . + CDS 16410 - 16871 192 ## Cthe_0528 hypothetical protein 20 9 Op 2 . + CDS 16885 - 17685 756 ## ELI_1005 hypothetical protein 21 9 Op 3 . + CDS 17765 - 17974 211 ## 22 10 Tu 1 . - CDS 18042 - 18227 62 ## gi|160941787|ref|ZP_02089114.1| hypothetical protein CLOBOL_06683 + Prom 18544 - 18603 6.2 23 11 Tu 1 . 
+ CDS 18646 - 20826 1254 ## COG1609 Transcriptional regulators - Term 20833 - 20875 -0.8 24 12 Tu 1 . - CDS 20896 - 21045 170 ## gi|288870464|ref|ZP_06114185.2| conserved hypothetical protein - Prom 21222 - 21281 3.6 + Prom 20861 - 20920 2.9 25 13 Tu 1 . + CDS 20962 - 21126 57 ## + Term 21336 - 21380 5.1 + Prom 21921 - 21980 9.2 26 14 Op 1 . + CDS 22021 - 22113 74 ## 27 14 Op 2 35/0.000 + CDS 22178 - 23389 990 ## COG1653 ABC-type sugar transport system, periplasmic component + Prom 23391 - 23450 1.9 28 14 Op 3 38/0.000 + CDS 23495 - 24385 803 ## COG1175 ABC-type sugar transport systems, permease components 29 14 Op 4 1/0.000 + CDS 24398 - 25240 506 ## COG0395 ABC-type sugar transport system, permease component 30 14 Op 5 . + CDS 25240 - 25902 320 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 31 14 Op 6 . + CDS 25922 - 26698 346 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 32 14 Op 7 . + CDS 26634 - 27305 456 ## Hore_20030 PHP domain protein 33 14 Op 8 . + CDS 27311 - 28135 671 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Prom 28640 - 28699 3.0 34 15 Tu 1 . + CDS 28764 - 29003 173 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 29188 - 29241 11.5 + Prom 29213 - 29272 3.6 35 16 Tu 1 . + CDS 29338 - 29715 222 ## EF2290 ECF subfamily RNA polymerase sigma factor + Term 29739 - 29781 8.2 + Prom 30149 - 30208 1.6 36 17 Op 1 . + CDS 30296 - 32593 578 ## COG1199 Rad3-related DNA helicases 37 17 Op 2 . + CDS 32635 - 33102 241 ## gi|266621262|ref|ZP_06114197.1| resolvase, N domain protein + Prom 33116 - 33175 3.1 38 17 Op 3 . + CDS 33196 - 33453 216 ## Closa_3494 hypothetical protein + Prom 33462 - 33521 6.5 39 18 Op 1 2/0.000 + CDS 33591 - 35228 823 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 40 18 Op 2 2/0.000 + CDS 35230 - 36909 661 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 41 18 Op 3 . 
+ CDS 36906 - 38423 851 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 42 18 Op 4 . + CDS 38443 - 38757 376 ## Closa_3319 Ferrous iron transport B domain protein + Term 38773 - 38822 7.2 + Prom 38981 - 39040 10.1 43 19 Op 1 . + CDS 39064 - 39165 89 ## 44 19 Op 2 . + CDS 39271 - 40459 1172 ## Pjdr2_4665 S-layer domain protein Predicted protein(s) >gi|229784095|gb|GG667640.1| GENE 1 14 - 781 282 255 aa, chain + ## HITS:1 COG:no KEGG:CLJ_B3105 NR:ns ## KEGG: CLJ_B3105 # Name: not_defined # Def: group II intron reverse transcriptase/maturase # Organism: C.botulinum_Ba4 # Pathway: not_defined # 1 251 375 626 626 239 48.0 7e-62 MTKKSKEKCKKELKQKIKVLQHDAIGARVQQFNATVLGLHGYYKIASHISKDFAEIAFHV NKSLYCRTKCIRSKRGNKSRAYERFYGKCKVKPLFIQRVALYPIHYVQTTPPVCFSQDIC NYTVNGREKIHKYLQNFSYDTLQYIMKNPVQNQSLEYNDNRISLYVGQRGRCFITGEKLK ISDMEVHHKTMRSEGGTDAYKNLVFVTGAVHKLIHTKEVETIEKYLELLKRKSIDFVKLN NLRRLVGNCEISENK >gi|229784095|gb|GG667640.1| GENE 2 998 - 1936 753 312 aa, chain + ## HITS:1 COG:PM0849 KEGG:ns NR:ns ## COG: PM0849 COG4962 # Protein_GI_number: 15602714 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Pasteurella multocida # 33 291 158 402 425 100 29.0 3e-21 MIFDNAQTIVVGHLNNKVRITVMGDGVIDQDKGLAASIRIVNPRKLTMKQFVEYGTATEE MLELLSQCYSHGVSMCITGATSSGKTTLMSWILSQVPYHKRIVTIEQGCREFDLTVSDES GNVLNNVVHLVTRTSDDPRQNIDLIKLLETTLTINPDCIAVAEMKGGESMQAINAANTGH SVITTIHANSCEDTYYRMVTLCKQEQPGMDDDTLQSLATKAFPIVAFAKRLEDNSRRMME ITECETLPNGTRRMRTLYRFHIADSLVVDGKTKIIGQYENVENLSEGLQKRLRENGLSQK FLDRMLEGEKKV >gi|229784095|gb|GG667640.1| GENE 3 1933 - 2865 633 310 aa, chain + ## HITS:1 COG:no KEGG:Closa_3536 NR:ns ## KEGG: Closa_3536 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 310 1 310 310 417 67.0 1e-115 MTVIITIAAFLCIGGSFLVIGLSPFEFLEGLTKWIKPKDNSLKRRIRESRNQKPPKGVKR LVLEVQAILRITGRQEKFSMICVLSMMLFVAGVLLALSMKNIFMVPVLAAGFPLLPFYYV 
LFTASKQKKQINGELETALSMITSSYMRNKNTFLRAVEENLPYLNPPVSEVFRDFLMESK LIHSNLKEALEKLKLAIDSDVFQEWVEAVIACQDDHNLKSTLVPIVGKLSDMRVVSAELD LLIYEPVKEYITMVILVLGSIPLLYFLNQEWYKTLMFTAFGKMLLAISGGVIFFSLAAVV KHTKPIEYKR >gi|229784095|gb|GG667640.1| GENE 4 2878 - 3750 554 290 aa, chain + ## HITS:1 COG:no KEGG:Closa_3535 NR:ns ## KEGG: Closa_3535 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 290 1 291 291 343 64.0 4e-93 MHYYILLLFALSFSMGIYLILAHLLKISTFRARRAILSVGRQKKKAKNSDAAIMELAVKL SPFVPMDAYKRRKLVMVLHSAGIKETPEVYLAQAYVKSGLVFFGALPCLAVIPILAPAFL IMGIGVLFSETGKAEKTVRDSREAIEYELPRFVATITQELLASRDVLSMLETYQKHAGPA LKRELSVTTADMRTGSYEAALTRMESRVSSAMVSDVVRGLIGVIRGDDGVAYFQMMGHDM KQLEIQRLKKLALERPPKIRRYSFLLLACMLMIYMGVMAYQILGTMSGMF >gi|229784095|gb|GG667640.1| GENE 5 3761 - 4159 228 132 aa, chain + ## HITS:1 COG:no KEGG:Closa_3534 NR:ns ## KEGG: Closa_3534 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 132 2 133 133 194 67.0 1e-48 MRNVLREKRGEGYIDIAVGILCLMLVIALAVSLFPVFMVKQQLNRFADEIVRQAEIVGST SVNRRIEELREETGLDPEISWDCDYFEGSKVQLNGDIAVTLTCRINLGFFQFGSFPVEIR ARASGKSEVYYK >gi|229784095|gb|GG667640.1| GENE 6 4170 - 4712 261 180 aa, chain + ## HITS:1 COG:no KEGG:Closa_3533 NR:ns ## KEGG: Closa_3533 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 179 3 178 178 226 62.0 4e-58 MREIKKLVKDKNGNTTPLTIALILALLFTLCALSEFFRLSVIANGVRNALQSAVISVATT NYDEVYNGLREGYSGGYWLSGDQWEENLDYGDIYEQLDELLDTESDGKYHVKEMQNGYEY RLSNLSVDIKNVRLRPGSAVNNLEVTAKLRLEIPISFGGESFPPVQMQIKVKSAYMPKFE >gi|229784095|gb|GG667640.1| GENE 7 4845 - 5423 315 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621234|ref|ZP_06114169.1| ## NR: gi|266621234|ref|ZP_06114169.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 192 1 192 192 254 100.0 2e-66 MKEMMNRRKLITAGLAVLVVCLIIAVYNLSRPDVRIEETMALEETTELVMESVTVPEITV 
AETAIEAETEESTEETEVQTTKADMEKRESPVEMETHPVTMTTEAVKETEPKLQVPEEAT PPAREREPQEVEPDHEEGDQPTAKPEEPQGGSTNDQGQVYVPGFGYVETGGGVTEIPAHS DGDWNKQVGTMQ >gi|229784095|gb|GG667640.1| GENE 8 5535 - 7208 813 557 aa, chain + ## HITS:1 COG:no KEGG:Closa_3531 NR:ns ## KEGG: Closa_3531 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 11 555 21 557 559 526 50.0 1e-148 MWLVPSAKAFAIQGDGNIDGGGGHVNQGVAGYEWPTGYEGIRITIVDAGTGQQVMSPIDF TNSNVAAVSSDIFNFGKVSKLQYRNGAALNIQTQYEYFHPDTPIPKIITTNSGKADMNAI KTYFCSEGAASMVASYTGIELNDLISGKYKVILEPVIYLWYNHLFFAMTVTEAGLYNRTT NGDLGSHFPTVVMKNFALSMFLERDDLGFKAYTGPKGSARTTNEMILILGIGIVSYAAQP EEDPPVTEMEYDQEYRVDTDVITAVSLYSDREINNRSKATVTFRIKGRRCQMTDIVMPEG GSQLVWVKWHTPSTPQDIKITISSNKGTLSTTQMTVRVVDLNENPPPDPQANDRNDSYRL PNVPNKRDVTELTWGEWDCWWQEHWVYHDGSEDESGYYCDHGWYEFDWIPYFASISASMS IKPDERNPTAEGCVMKSGYGINVKATARVSSDASSAAITGAQNVVTYFPEFAYQTYWRLL ERMAGGYASSFELQKNPYSTYQRRSHFSPVWFPDGPYEVYGEVLDAWTPAGMLKIHLTDD LTVRGNLFFDWHIRPAN >gi|229784095|gb|GG667640.1| GENE 9 7226 - 7918 423 230 aa, chain + ## HITS:1 COG:no KEGG:Closa_3529 NR:ns ## KEGG: Closa_3529 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 229 1 229 231 269 63.0 7e-71 MQTMIIPMVTLTVTLMGGGILLYFLRADKKRQPASGEQAAMQTAQEFMNVKDIRDKFLYS RDGYVFLYLRIHAISIDLYSTSEKNTLIRTLTAELSDIQRPFKFLAVSRPVDISPVIQEM TELLPEAGEKQKELLKQEIVQMSGYALSGDIVERQFYLMIWEKQAEGCEADLQKTAALLC EKFSAGGVLCDILQEKEIVRLLNLVHNPAYSHLEEPDYEATIPLLKEGIV >gi|229784095|gb|GG667640.1| GENE 10 7915 - 9741 1284 608 aa, chain + ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 24 593 288 845 853 147 23.0 8e-35 MKKRDMEVIRENPALLNTITPVGLEFQKNQLSIGENVGKIYGIIRYPQTVESEWLSKITN IPSTIVSIGFNPVDNAELINSISRSVVQQRGIAEAAKDPLSRQRAEKAAEDGETVIAQID 
REGETVGLMSVSVMPVSSDEKIFKRICRRAESSISVMKCKMRVIPNLQKECFMHLSPTYP NHSKIETILQKVVPISTFVGGFPFASSGFNDGSGYYFAKDTSGGLIIVDPWKRGGDRTNS NFCVMGNSGVGKSTAVKHILLSEYMRGTKIIVIDPESEYKEMCQKLSGDWISATGGFKGR LNPLQVLPSPRDDEEEPEENRLYRDEGYGINDLAMHLKNLDIWFGLYLPSLSDMQKATLK RCLVELYAQFRIDWDTDILELEAKDFPVFTDLYLLLKSRAEEQPGEIVYKELLNLLYDLV YGADSFIWNGHSTVASSARFLVLDTHALQETGENVQRAQYFLIMLYNWRLMSWDRSERIL MAGDEAYLMIDQKVPQAMVYLRNMMKRARKYEAAVAIISHSVRDFLSESIRQYGQALLDV PCYKILMGTDGPNLKETRQLYDLTEAEQELLLARRRGHALFIVGAKRLHIQFDIPEYKFQ YMGKAGGR >gi|229784095|gb|GG667640.1| GENE 11 9741 - 10571 547 276 aa, chain + ## HITS:1 COG:no KEGG:Closa_3527 NR:ns ## KEGG: Closa_3527 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 272 3 273 274 347 62.0 3e-94 MISPVVIKTAVGIAARAASDEKARNRMIVLIVTPVAFLLFVISFSLYLITNPLSVLKVFL DAEELELVEEFQIDHGYHQNLGIYENDYLQGNGQTYEGVVFGDAGETEVVYYSQLDKRWA GASYGDSTIGRSGCGPTSMSIVISTLTGQAVDPPHMAGWSYQNGYYCSGSGSYHSLIPGA AESYGLTAKGNLTAQEIVDALKNRELVVAIMAKGHFTKNGHFIVLRGVTREGKILVADPA SIERSNQEWDLSLIMNEARKGAGAGGPFWAIGKKGA >gi|229784095|gb|GG667640.1| GENE 12 10574 - 11560 577 328 aa, chain + ## HITS:1 COG:no KEGG:Closa_3526 NR:ns ## KEGG: Closa_3526 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 327 1 333 334 291 46.0 3e-77 MILYMTTQDHTDLLDFYIEREGTLPVKKMTGSFILKQFVIYDMRNFSHCTELILDRGAFR DSDDAFVQAIEEFLTMYQARVTVIAEGLGQEEALFIKLLEAGIGNIVTERAIERQQEEIE ECLSSQGLRKYRIREKQEIYHEGERYNFSCSRIQIAVVGSQGRIGTTTVALGLANWLTEV GGRSCYVEANESGHMECLAIDYEMVPKDEGFVMDKVGYYRQQSTQEYSFVVTDFGIGEPK TMPDILLLVCGTKPYELPHTLKLLEQYEDQCATILFSFVVQEHRRAYEDSFSTDKHRVLF MEYQPDCFDGVINGSIFKTIIKPYITGK >gi|229784095|gb|GG667640.1| GENE 13 11586 - 12074 252 162 aa, chain + ## HITS:1 COG:no KEGG:Closa_3525 NR:ns ## KEGG: Closa_3525 # Name: not_defined # Def: transcriptional regulator, GntR family # Organism: C.saccharolyticum # Pathway: not_defined # 3 107 4 107 216 129 61.0 5e-29 MNQYEMKQIAYKSTLKSRALQVLVYLIDRSNKEGTCYPAISTMSQELHISVSTVKRALKE 
LTVSGFVKKESRFREKNNGQTSNLYILCFAGISKSKELNEETKKKVHIRKENRNKPHTME DALTSISIFRRWPGYCNIKIISFIALYYISYWPGEGANLIPP >gi|229784095|gb|GG667640.1| GENE 14 12187 - 13698 1149 503 aa, chain - ## HITS:1 COG:CAC1956 KEGG:ns NR:ns ## COG: CAC1956 COG1961 # Protein_GI_number: 15895229 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Clostridium acetobutylicum # 7 451 4 471 531 192 30.0 1e-48 MNNRIDAIYARQSVDKKDSISIESQIEFCKYELKGGNCKEYTDKGYSGKNTDRPKFQELV RDIKRGLIAKVVVYKLDRISRSILDFANMMELFQQYNVEFVSSTEKFDTSTPMGRAMLNI CIVFAQLERETIQKRVTDAYYSRSQRGFKMGGKAPYGFHTEPIKMDGINTKKLVVNPDEA ANIRLMFEMYAQPTTSYGDITRYFAEQGILFNGKELIRPTLAQMLRNPVYVQADLDVYEF FKSQGTVIVNDAADFTGMNGCYLYQGRDVKPSKKNDLKDQMLVLAPHEGIVPSDIWLTCR KKLMNNMKIQSARKATHTWLAGKIKCGNCGYALMSIFNPSGKQYLRCTKRLDNKSCPGCG KIITSELEAVVYQQMIKKLASYKTLTGRKKAAKANPKIAALQVELLHVDSEIEKLVDSLT GANNVLLSYVNVKIAELDGRKQELVKQIAELTVEAISPEQVNQISGYLDTWDNVSFDDKR RVVDLMITTVAATSDSLNITWKI >gi|229784095|gb|GG667640.1| GENE 15 13793 - 13948 201 51 aa, chain - ## HITS:1 COG:no KEGG:Tresu_1927 NR:ns ## KEGG: Tresu_1927 # Name: not_defined # Def: hypothetical protein # Organism: T.succinifaciens # Pathway: not_defined # 1 50 20 69 70 68 94.0 6e-11 MTFQGREITLENLSPVFTPEQEAAKRRELEQQLYEVFRKYADKRQSEEAGA >gi|229784095|gb|GG667640.1| GENE 16 13992 - 14936 848 314 aa, chain - ## HITS:1 COG:no KEGG:ELI_1000 NR:ns ## KEGG: ELI_1000 # Name: not_defined # Def: plasmid recombination protein # Organism: E.limosum # Pathway: not_defined # 1 314 1 313 313 531 92.0 1e-149 MAQHAILRFEKHKGNPARPLEAHHERQKEQYASNPDIDTSRSKYNFHIVKPESRYYHFIQ NRIEQAGCRTRKDSTRFVDTLITASPEFFKKKPPKEIQEFFHRAADFLIGRVGRENIVSA VVHMDEKTPHLHLVFVPLTEDNRLCAKEIIGNRANLSKWQDDFHAYMVEKYPDLERGESA SKTGRKHIPTRLFKQAVNLSKQARAIEATLDGINPLNAGKKKEEALSMLKKWFPQMGNFS GQLKKYKVTINDLLAENEKLEARAKASEKGKMKDAMERAKLKSELDNLQRLVDRIPPDIL AELKRQQRQHGKER >gi|229784095|gb|GG667640.1| GENE 17 15124 - 15519 251 131 aa, chain - ## HITS:1 COG:no KEGG:ELI_1002 NR:ns ## KEGG: ELI_1002 # Name: 
not_defined # Def: phage protein # Organism: E.limosum # Pathway: not_defined # 1 131 1 131 131 261 98.0 5e-69 MANTIYIHQPEKAVSFTRLPNFLFEAPTFTPLSNEAKVLYAFILRRTDLSRKNGWADEYG RIYLYYPINEVVELLHCGRQKAVNTLRELQYAGLVEIQKQGCGKPNRIYPKSYEAVSNTD FKKSGYGTPED >gi|229784095|gb|GG667640.1| GENE 18 15631 - 16044 239 137 aa, chain - ## HITS:1 COG:no KEGG:ELI_1003 NR:ns ## KEGG: ELI_1003 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 137 56 192 192 234 89.0 1e-60 MRDNPYKDLPPLERRPDGSLYRMTPAQRKQANALIRRECCCYEDGNCMFLDDGDTCTCPQ TVSFSVCCKWFRWAVLPLDKTLEAEIFRDKDLKRCAVCGGVFVPKSNRAKYCPDCAARVH RRQKTESERKRRSTVDS >gi|229784095|gb|GG667640.1| GENE 19 16410 - 16871 192 153 aa, chain + ## HITS:1 COG:no KEGG:Cthe_0528 NR:ns ## KEGG: Cthe_0528 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 26 150 1 124 131 119 53.0 4e-26 MYNLNWSRSDLTRQKLGTFCEYYAKMSLASYGVSIYTSEVDDHGIDFIAESKRGFLKFQV KAIRKGTGYVFMREEYFDISDQSLYLFLLLLNDGEHPIEYLIPATTWDNDSSNTFVYHSY EGKKSKPEYGLNISAKNIPQLERFKLENMITAI >gi|229784095|gb|GG667640.1| GENE 20 16885 - 17685 756 266 aa, chain + ## HITS:1 COG:no KEGG:ELI_1005 NR:ns ## KEGG: ELI_1005 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 266 1 266 266 427 87.0 1e-118 MALTIQERLKDLRVERGLTLEQLAEQTHLSKSALGSYEAEDFKDISHYALIKLAKFYSVT ADYLLGLSETKNHPNADLADLRVSDDMIELLKSGLVDNFLLCELAVHPDFPRLMADLEIY VNGTAVKQVQSANAIVDIMSATIMKQHNPGLSDPQLRQLIAAHIDDDSFCRYVIQQDING IALDLREAHKDDFFSVPEDNPLEDFLQAAEAASTPGSDPEQAAMAFICKRLKLNYGKLSE EERKWLKRIAQKSDLLKNPKPQRGRK >gi|229784095|gb|GG667640.1| GENE 21 17765 - 17974 211 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRILYELSWLWETLVIALAVAAVCVVCAMLICKAVKKNLSKKTAAVIGAAAFAGAVLAVI VIARTPMPL >gi|229784095|gb|GG667640.1| GENE 22 18042 - 18227 62 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941787|ref|ZP_02089114.1| ## NR: gi|160941787|ref|ZP_02089114.1| hypothetical protein CLOBOL_06683 [Clostridium bolteae ATCC BAA-613] hypothetical 
protein CLOBOL_06683 [Clostridium bolteae ATCC BAA-613] # 1 61 9 69 69 115 98.0 1e-24 MIAWRGAANAGLVLSADRAYMKAPVPRPTPDFGTKCPELVDTLTKKQTDFSLKIEMGGKP F >gi|229784095|gb|GG667640.1| GENE 23 18646 - 20826 1254 726 aa, chain + ## HITS:1 COG:SP1725 KEGG:ns NR:ns ## COG: SP1725 COG1609 # Protein_GI_number: 15901558 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 1 239 2 237 321 86 29.0 2e-16 MATILDVANLAGVSQGTVSNVLNRKGNVSSEKIKVVEEAAKQLGFTINEKAKMLRKGSSN IIAVILPNIQFRHYRDFYTSLRAYAIKNGYTTNILISNDNPEEELLLIQQAKSAMSEGIA VFSCLEGSENKYGQAGFTNVCFVERRPQFTADYYGFDYSAAGREMAESAVRRGYQDILVF TGSTRFSNENEFAEAFKQMVKVGGDCTVTHISTDAGRISHTVLTALIGERKVDAVFTTNI GFADKIRQIRNSFSGKDELPIHTLAPVFTLPEKDYSKYELNYSYLGREVAENLIKSIKDK PKLENHIMDNDGQRKWHQIELKKPGTKCLNVLALESPEALAMKGLARLYTEKTGTEVKIA VFSYDEIYEAFVNCEDFGIFDVFRIDVTWLSWFADKILLPIDEIDPDISEIFPEYIPALT DKYSKVNGRVYALPVTPSAQLLFYRKDLFENVAMRRQYQEMYREELKVPESFEAFNRIAG FFTKKMNGNSPVRYGTSLTLGNTGVAATEFLTRFFSYQKNLYQENGHIEINNEIGVRALH DLMKSMKCIGPANVPWWTDSAKAFSDGDVAMSIMFSNYASEILGYQSKIIDKVGCAFVPG RNPLIGGGSLGIAKKSRHPEDALAFIKWMTREPVASALAALGSVSPCNKTYEIYDIVDVF PWLELSKDCFAMSETRRRPPDQMKPFNERKFLSILGTAVKNVSVGVMEPGEALDTAQKAI ERDLLN >gi|229784095|gb|GG667640.1| GENE 24 20896 - 21045 170 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870464|ref|ZP_06114185.2| ## NR: gi|288870464|ref|ZP_06114185.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 49 228 276 276 99 100.0 1e-19 MQKEGMANENTRFVITHFAHTFNPLHEHTDSLARTNGFIAAFDGMVLDI >gi|229784095|gb|GG667640.1| GENE 25 20962 - 21126 57 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFVKRIKGMGEMSDDKPGIFIGHAFFLHPFFYQYIIREPHVIWKMVSDFYTGTI >gi|229784095|gb|GG667640.1| GENE 26 22021 - 22113 74 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKVLKKLHYSSGSGSCLYIDPGRLWKFVNC >gi|229784095|gb|GG667640.1| GENE 27 22178 - 23389 990 403 aa, chain + ## HITS:1 COG:mll4149 KEGG:ns NR:ns ## COG: mll4149 COG1653 # 
Protein_GI_number: 13473518 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 85 342 100 339 408 75 27.0 2e-13 MEETAGGPQYSGSLSYWSSWNSTEPQAKVIEAAAAEFMKLYPEVKIDLTFNGRDNRKLVN PALQGGQQIDMYDANADNIQAYWMDTIIPLDDYYEQTYPTTNGEKYIDVIMPSMVSLARQ IGDDKLYYVPYCPQAFMWFCNKTIFEDAGITKVPTTWEEFTEVCQTLKDAGYTPITTDDK YAENMFGYYLSRLKGNDFVEKLAYDTTGELWDDPAVLETAKVFEDMVAKGYFASNVGSNT LPLGQQEMVLEEKIAMYLNGTWLPNEVKDTAGSDFNWGEFAFTTVPGGVDGLEAGAYGSY GIAINNQCQNPDTAFAFAVYLTTGEWDQKMSEEAAAIPMSNDAQWPEALKDAKTVFDQLT YRYPSQTAIRMNSDTQPIIETACIKLYAGQITAEQFISECRPK >gi|229784095|gb|GG667640.1| GENE 28 23495 - 24385 803 296 aa, chain + ## HITS:1 COG:MT2100 KEGG:ns NR:ns ## COG: MT2100 COG1175 # Protein_GI_number: 15841528 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mycobacterium tuberculosis CDC1551 # 10 295 14 294 300 94 29.0 2e-19 MKTKKSTLFIFLGPAFLLYMFVFLYPTVRTTAMSLFQMSNVTTPISQWSFVGLDNFTKLF STPLFRISMINIAKIWLFCGIAALGLAFLFAVLLVGGIKGKGIYRTIIYLPNVISGIAIG YMWLLYVFNNKFGLLAKVFHAIGLKGLAEIPWTSPSYIFTAMCVAYVFSSVGYYMLTYMA AIEGIPEDFYEAARLEGANRLHEMIYITFPLIASTVKSSLTLWSNKVIGFFALSLVFGGA TTITPMVYTYNALFGTEVSVESNAGVAASSAVVMTVIIIILFVATNLLVKDKKYEY >gi|229784095|gb|GG667640.1| GENE 29 24398 - 25240 506 280 aa, chain + ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 32 280 54 300 300 137 32.0 2e-32 MNKEKGKRILKALPAHILVGLWSVFTIVVIGWIVLASFSTTKEIFTDKLLHSGFHIENYI KALFANDALLNLVNSVIYTVPSCILIILFSAPAAYALQRYVFRGNRFLQKLMVIGMGIPS VMIIMPLISVVNAIGMTNSRLTLIILYTATSIPFTTYFLMSYYNNISVTYEEAAAIDGCG PIRTFWSIIFPLIQPAIVTVTIFNFIGKWNEYFMALVFANDKKLKSIGVGLYSTIQSMVT TGDYAGMFASVVIVFMPTLLIYIFMADKIVSGVSQGGLKG >gi|229784095|gb|GG667640.1| GENE 30 25240 - 25902 320 220 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 1 189 1 198 236 127 35 8e-29 MDFYSFMTQSELLLRIVVACICGCIIGYERESRNKGAGIRTHAIVSLAAALIMIVSKYGF SDIPDYDASRVAAQIVSGIGFLGAGIIFVRNNSISGLTTAAGVWATAGIGMAVGSGLYYI GVVTAFFIVLAQILLHQEVFTGKERKGEVVKLVLKNQSAGIDELRSVLSEEQIDIISVDW SQKENGKYDIKASVLFPAGYDRIRLMSLAASGEGTISLEN >gi|229784095|gb|GG667640.1| GENE 31 25922 - 26698 346 258 aa, chain + ## HITS:1 COG:CAC0364 KEGG:ns NR:ns ## COG: CAC0364 COG1234 # Protein_GI_number: 15893655 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 4 159 1 162 167 128 38.0 9e-30 MIVILEFIGCGSAFNPQLGNTSAYIEDGKRFVLLDCGESVYARLFELGIFEQYEEIYVLI THTHADHVGSLPSLISYCYYVKQKKVTVIHPNRSLITLLDHMGIAREAYIFQKPDLLKLT GFEVKAFSVKHVEDLKCYGYLITTYGRTVYYSGDACEVPPAILKRLLEGRIEAVYQDTTE FWTPHLSHCPVELLADMVPVEYRKRVYCMHFSNDFSDKIKALGFNVVELSKKEGRMSYVS DGYTHSYERKQLLCESGC >gi|229784095|gb|GG667640.1| GENE 32 26634 - 27305 456 223 aa, chain + ## HITS:1 COG:no KEGG:Hore_20030 NR:ns ## KEGG: Hore_20030 # Name: not_defined # Def: PHP domain protein # Organism: H.orenii # Pathway: not_defined # 2 222 4 223 225 179 41.0 8e-44 MYQMDIHIHTKESSYCARVAARDVVHLYKEKGYQGLIITDHYNREFFDHFPNETWKEKVD HYLEGYRQAKAEGAKIGMDIFLGIEFRSVEHINDFLIFGLTEQFLYEREELFRLSIEEAS EIFKKAGALVIQAHPMRTGVNVLADPACIEGIEVYNGNHQWPYDAGAAEEAAEKYGLIAL SGSDFHMTEDLAKGGIAFEEPVDGYEDLLRKLRGREITGLIRR >gi|229784095|gb|GG667640.1| GENE 33 27311 - 28135 671 274 aa, chain + ## HITS:1 COG:CC3172 KEGG:ns NR:ns ## COG: CC3172 COG0584 # Protein_GI_number: 16127402 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Caulobacter vibrioides # 13 273 1 269 295 63 24.0 3e-10 MDLFEKVKEKGVLIAAHRGVAGGNIACNTIEAFEAALIQGADILEMDVFKTTDSQLYIFH TGKERMQLNRDIDVTALSSTEVEKLHYVNGDFFETEQCINKLDDVLEALKGRCLINLDRC WEDWKLVRECVERHQMRDQVILKSHPEEKYFRMLEEVAPDYMYMPILSEEDHCTEILEGR 
NLRFVGAECTFTSEQAQIASDEFIDCMKKKGKLLWANGILYSSKVPLSAGHSDDISVTGN PEEGWGWLMDKGFDIIQTDWTGMMREYLKTKKLN >gi|229784095|gb|GG667640.1| GENE 34 28764 - 29003 173 79 aa, chain + ## HITS:1 COG:all4727 KEGG:ns NR:ns ## COG: all4727 COG0745 # Protein_GI_number: 17232219 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Nostoc sp. PCC 7120 # 1 73 159 231 238 65 42.0 2e-11 MTVKEFQLLYYLASNQGQVINKEQIYYHLWKEDIGVGSNSVETLIMRLRKKLEPDKSDPI FIETIRGGGYRFKLESKKC >gi|229784095|gb|GG667640.1| GENE 35 29338 - 29715 222 125 aa, chain + ## HITS:1 COG:no KEGG:EF2290 NR:ns ## KEGG: EF2290 # Name: not_defined # Def: ECF subfamily RNA polymerase sigma factor # Organism: E.faecalis # Pathway: not_defined # 6 123 6 123 126 95 45.0 5e-19 MESINWIEILEQEDRIIANSDRKQRYHFSSLDAMSEELVYQESMTRYQEQESYAGKDEDF TEQVRDERLAAALKLLTTKQKEVIELIFWEGYTQEETARELGCSQSSVSERLSNGLKRLA VHLKQ >gi|229784095|gb|GG667640.1| GENE 36 30296 - 32593 578 765 aa, chain + ## HITS:1 COG:PM0710 KEGG:ns NR:ns ## COG: PM0710 COG1199 # Protein_GI_number: 15602575 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Pasteurella multocida # 118 760 28 647 647 175 26.0 4e-43 MQDKIMIYPADIRNHLYEAGILRENEDVVMVAGQCALLENAGEIRTTYRAINLFDYRTRY LCARKTPFPQESYKIVAKRLSELEINELDENMELELDENKRAQHLLSYIFDVILPQYGMA KRKSQKELALSMLRALQENKLALCEAEVGTGKTHAYILAVTIHNLFSQSKTPTMISTSTI ALQEAITKEYIPQISRILLEHRIIDKPLSFVVRKGKSHFVCDSRLKTYESSIRNQNRPED QGLLAQLGELWREQEHYLDLDGFPLTPYVKERINVSQCSEFCPHSSYCRFASFSRRWLSE SYDFQISNHNYILADVVNRREGKKKLLPYYNIAVFDEAHKLIDAARQIYCTRLSENEIPK LIHLISPERFVDKSSQRQMMELCTGLSEKNTRLFQKTGGAFIQSNQMENISRTIEEDRST RLWIRDLIGMLEKLGRIFVTVKMDSCGRVKTIRNTAQSIEDRLNVFLSAKQNISWIEKSD SGGLEFCSLPRELERILYHDLWHTGVHSIITSGTISVGGNFDHFKRMTGIDLLQPQNRRV MEVSKLSPFNYSQNAILYIPEELPFPDIHDEKYIEAISREIIHLICATFGHTLILFSSYW LMERIFYEVKQEIQSFPLFIMKRGRMDMIEKFRRSGNGVLFASDSAGEGIDLPGDILSSV 
IIVKLPFPVPSPILEYERTKYDDFSQYVKEIITPGMIIKLRQWFGRGIRRETDTAVFSIL DSRVGIHGRYRDDVLKALPDMPVTGRIEEIRRFIAEKKTGEYFLL >gi|229784095|gb|GG667640.1| GENE 37 32635 - 33102 241 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621262|ref|ZP_06114197.1| ## NR: gi|266621262|ref|ZP_06114197.1| resolvase, N domain protein [Clostridium hathewayi DSM 13479] resolvase, N domain protein [Clostridium hathewayi DSM 13479] # 1 155 1 155 155 294 100.0 1e-78 MDQIEKRVWGYCRCGGADQTDIENQVEQLKQYAESCGYTLSGYTVGHGSSLEPDSPDMAE IERAAKEYAADTLLINAASRLSRKTVELLGSLKYIAQIPMNVECVNGTDLSADSPFKDIV MAMSECMRSVGPEYEAKYEEDYELDLNDGSEELGR >gi|229784095|gb|GG667640.1| GENE 38 33196 - 33453 216 85 aa, chain + ## HITS:1 COG:no KEGG:Closa_3494 NR:ns ## KEGG: Closa_3494 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 85 59 137 137 108 59.0 7e-23 MEPVIAMDMEEMKNVDIRTVDRESLADINDVHINRKLPREERMADFIRQIKNPYCYRCGK AVVKVGFTDTEVTLEDRLEHYLKTL >gi|229784095|gb|GG667640.1| GENE 39 33591 - 35228 823 545 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 15 309 12 300 301 177 34.0 7e-44 MKNLTDNRIYRADAYLRLSKEDGDKVESDSIANQRDLIRHFLADKPDIQLVDEQIDDGYS GATFDRPSFQKMMEDVKSGKINCIIVKDLSRFGRNFVEAGRYIDQIFPEYGVRFIAINDN FDSAKGRSQSDNVLLPFLNLVNDAFCRDISIKVRSQLEVKRKKGDFVGSFAVYGYLKDPD NRHKLIVDEFAANVVRDIFKWKIDGLSQQRIADRLNELGILSPMEYKKNCGMAYRTGFQV STKAKWTAVAVGRILRNEFYIGTLVQGKRTTPNHKVKKTIEKPNNEWVRIENNHEPIVTI EEFSAVGRLLLQDTRVAPDQNTVSLFSGLIFCGDCGQNMVKNGVCKAGKTYSYYMCCTNR AKKGCSSHRIPEQAVADSVLMSLKEHISTVLEIEKTLEFLAGLPLQRAEVQKADARLVSK QEEIKRYEKLKATLYEALADGLIDKSEYLELKAVYDKKMQEARSEEVKVKDELESILQNQ SVNSQWIEHFRAYQNIDTLDRRMVVTLIERILVYEDYRLQIEFRYQADYERALAMIQASG VKEVV >gi|229784095|gb|GG667640.1| GENE 40 35230 - 36909 661 559 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 
16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 25 319 10 300 301 181 32.0 5e-45 MARVSRKQTVSPAAPVLVERVYCVGLYVRLSKEDNGYHTSDSIQMQQCLLEQYVKTQQDM KLVGIFCDNGATGTDFDRPDFERMMEAVRRREIDCIVVKDLSRFGRNYVETGYYLEKIFP FLGVRFVAVNDCYDTLKCADGDEMVVSLKNIVNSLFAKDISKKSGTALRRKQEKGEFIGS WTPYGYLKSPEDKNHLIIDPETAPIVRDIFTWRLEGLGYDEIAKRLNEQKILCPTMVRFL RGEFKKDSPRTVGTRWVAQTVKNITNNMVYAGHMAQGKQIRSLCEGIPRKEMERQDWIIV RNTHTPIVDEDTWNRVQRINSEAKTKNTEKRQLPKTENLLVGMVYCKDCGRTMGRYKNCS RKGHIRYTYCCRIRKASKGMSGCTLKSIGEQELFDIILKSIRIQIAATIEQENLLRRLKA QPEYQKTKRKLESSLAEIQTAISKNMARRAALFESYNDNVVSDEEYLYMKQRYDVQAAEL LSRQEQLLAEQVVYAKTLSPENEWIKACRKYDGVKQLSRAMVLELIHHITIEGRNDVHIT WNFKDEYQTLISYTEGVQV >gi|229784095|gb|GG667640.1| GENE 41 36906 - 38423 851 505 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 5 302 10 299 301 152 34.0 1e-36 MNTLALYTRASNEDENLGESATIQNQRDLLYHYIRSKQEFKDWNVLEFQDDGWSGTTFDR PGIGKLLKLAGKEVQCIIVKDFSRFGRNLIEVGNYLDQIFPFLGVRFIAVNEGYDSNDNA GRTIGLDVSLKAMVYEMYSRDLSKKISSVKEEQMKSGHYAGSFAFYGYQKSKDSKSGLEI DPEAAGIVKRIFSLAASGVKTLQIAVLLNKEGILPPLSYRKQKKMDEHFQCPTATEHNIW TQARIAAIIRDERYTGVLVSKKKQRIDISTKKIKLNDRKEWIRVENAMDAIVTREEFGQA QEALVRRGNRNKVTPSEPFRGLLKCGICGKALDRIDCKNPYFRCVTGRFEPDSLCAEIRI ERKELEQTVLSALKYQIQLMQENKPSDKSGSETTIRQNLEKIDGQMGKYKSNQMIQFERF AGGLIDREQFIMEKKKIAEQISALQAEKEELFRQLQNVEQLAQVEEHDLGRYAFVETLSR ELLVALIQEIRIQSDHVIEINWNFR >gi|229784095|gb|GG667640.1| GENE 42 38443 - 38757 376 104 aa, chain + ## HITS:1 COG:no KEGG:Closa_3319 NR:ns ## KEGG: Closa_3319 # Name: not_defined # Def: Ferrous iron transport B domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 103 184 280 281 150 85.0 2e-35 MYVLTQANEIVVPIIIMAYLAEGSLLDISDLSVLRDLLVSHGWTWITAVSTMLFSLMHWP 
CSTTCLTIKKETQSVKWTIASFLVPTVTGIVVCFLFTTAARMLV >gi|229784095|gb|GG667640.1| GENE 43 39064 - 39165 89 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNKKYRKKAYCPDTIGKNLVAQKVSCMRKAISL >gi|229784095|gb|GG667640.1| GENE 44 39271 - 40459 1172 396 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_4665 NR:ns ## KEGG: Pjdr2_4665 # Name: not_defined # Def: S-layer domain protein # Organism: Paenibacillus # Pathway: not_defined # 42 396 39 382 2026 345 51.0 3e-93 MKMKQMAKQSLCFLFSAGMVMNSGMMFNPPAYAAADSNVHEITVDGDVIDPVNTFRGFGA VTCNNTSRLLMDYKEEHPDKYWEMMELLFNPEKGAGLNHIKVEMGGDVNSSSGTEPATMR SPEEEANVLRGAGWHFAADAKTINPDITVEILRWGEPKWTQEGIGYETSENPKYEARYQW YKKTIDAVYDEYGYEINYVSPGQNERRRDYGDNTAWIKYCANRLNEDAEKEDARYDYSQI QIVAADTHSNSKDIAGRMLSDSELMDLVDVVSDHYTLYGNSDLTKVNQEYGKEIWYSEAI APMINAKYRINVDPDRGGVGGKVGMADLATRFINSYSYSDAGDNPARMTRFEFQPAIGAF YEGSAYSPKQLIGAFDPWSGYYDADGGLQMVGHFME Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:38:43 2011 Seq name: gi|229784094|gb|GG667641.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld34, whole genome shotgun sequence Length of sequence - 62471 bp Number of predicted genes - 51, with homology - 48 Number of transcription units - 31, operones - 14 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 781 616 ## COG0789 Predicted transcriptional regulators + Term 806 - 861 18.4 - Term 798 - 845 12.5 2 2 Tu 1 . - CDS 893 - 2749 2209 ## COG3950 Predicted ATP-binding protein involved in virulence - Prom 2786 - 2845 3.9 + Prom 2793 - 2852 1.9 3 3 Op 1 . + CDS 2879 - 4249 1012 ## Closa_1352 hypothetical protein 4 3 Op 2 . + CDS 4297 - 5409 854 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) 5 4 Tu 1 . + CDS 6356 - 7078 575 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) + Term 7136 - 7191 -0.8 + Prom 7116 - 7175 7.9 6 5 Tu 1 . + CDS 7207 - 7557 330 ## gi|266621275|ref|ZP_06114210.1| hypothetical protein CLOSTHATH_02428 + Term 7586 - 7630 -0.7 - Term 7546 - 7605 3.5 7 6 Tu 1 . 
- CDS 7687 - 8052 528 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 8 7 Tu 1 . - CDS 9027 - 9671 631 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases - Prom 9855 - 9914 8.0 + Prom 9796 - 9855 7.3 9 8 Tu 1 . + CDS 9885 - 10682 865 ## COG0789 Predicted transcriptional regulators 10 9 Op 1 . + CDS 11647 - 12081 436 ## COG1970 Large-conductance mechanosensitive channel 11 9 Op 2 . + CDS 12085 - 12216 59 ## - Term 12102 - 12163 16.2 12 10 Op 1 . - CDS 12175 - 13872 2120 ## COG0366 Glycosidases 13 10 Op 2 . - CDS 13782 - 13949 69 ## - Prom 14059 - 14118 5.3 + Prom 14056 - 14115 9.1 14 11 Tu 1 . + CDS 14174 - 14848 781 ## CDR20291_1876 hypothetical protein + Prom 15750 - 15809 7.0 15 12 Tu 1 . + CDS 15851 - 16243 554 ## Closa_1782 hypothetical protein + Term 16289 - 16354 13.7 - Term 16285 - 16333 10.0 16 13 Op 1 . - CDS 16369 - 17199 852 ## Sgly_1299 transcriptional regulator, MerR family - Prom 17219 - 17278 7.0 - Term 17302 - 17344 7.4 17 13 Op 2 . - CDS 17360 - 18937 1695 ## COG0495 Leucyl-tRNA synthetase - Prom 18985 - 19044 80.4 18 14 Tu 1 . - CDS 19884 - 20732 924 ## COG0495 Leucyl-tRNA synthetase - Prom 20929 - 20988 3.0 - Term 20915 - 20954 -0.5 19 15 Tu 1 . - CDS 21038 - 24838 4060 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain 20 16 Tu 1 . - CDS 24917 - 25456 529 ## gi|266621290|ref|ZP_06114225.1| conserved hypothetical protein 21 17 Tu 1 . - CDS 25562 - 26809 1329 ## COG0205 6-phosphofructokinase - Prom 26868 - 26927 7.5 22 18 Tu 1 . - CDS 26982 - 27233 479 ## gi|266621292|ref|ZP_06114227.1| putative nuclease sbcCD subunit C - Prom 27282 - 27341 3.6 23 19 Op 1 . - CDS 27344 - 27955 641 ## COG0406 Fructose-2,6-bisphosphatase 24 19 Op 2 . - CDS 27952 - 28509 691 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) - Prom 28538 - 28597 80.4 25 20 Tu 1 . + CDS 29497 - 29886 328 ## COG0583 Transcriptional regulator + Term 29891 - 29933 10.2 26 21 Op 1 . 
- CDS 29913 - 30422 567 ## COG1670 Acetyltransferases, including N-acetylases of ribosomal proteins 27 21 Op 2 . - CDS 30394 - 32793 2579 ## COG1199 Rad3-related DNA helicases - Term 32824 - 32875 13.4 28 22 Tu 1 . - CDS 32901 - 33143 305 ## Closa_1342 Phosphotransferase system, phosphocarrier protein HPr - Prom 33185 - 33244 6.0 29 23 Tu 1 . - CDS 33513 - 33854 249 ## gi|266621299|ref|ZP_06114234.1| conserved hypothetical protein - TRNA 34072 - 34142 63.0 # Trp CCA 0 0 - Term 34245 - 34284 9.1 30 24 Op 1 . - CDS 34307 - 36367 2384 ## COG0840 Methyl-accepting chemotaxis protein - Term 36421 - 36449 -1.0 31 24 Op 2 . - CDS 36503 - 37072 628 ## ELI_1484 hypothetical cytosolic protein 32 24 Op 3 . - CDS 37069 - 37545 373 ## COG0622 Predicted phosphoesterase 33 25 Op 1 . - CDS 37658 - 39061 1601 ## COG1066 Predicted ATP-dependent serine protease - Term 39073 - 39107 3.5 34 25 Op 2 . - CDS 39125 - 39538 527 ## Closa_1335 hypothetical protein - Prom 39605 - 39664 7.6 35 26 Op 1 . - CDS 39801 - 42251 2422 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 36 26 Op 2 . - CDS 42284 - 42463 163 ## Closa_1333 ATP:guanido phosphotransferase - Prom 42559 - 42618 18.8 37 27 Op 1 7/0.000 - CDS 43522 - 44001 233 ## PROTEIN SUPPORTED gi|163764772|ref|ZP_02171826.1| ribosomal protein L5 38 27 Op 2 . - CDS 43994 - 44569 234 ## PROTEIN SUPPORTED gi|163764773|ref|ZP_02171827.1| ribosomal protein L24 39 27 Op 3 . - CDS 44515 - 44628 57 ## 40 27 Op 4 . - CDS 44635 - 45057 401 ## COG1725 Predicted transcriptional regulators - Term 46042 - 46092 10.0 41 28 Op 1 . - CDS 46110 - 47957 1750 ## Closa_1330 cellulosome anchoring protein cohesin region 42 28 Op 2 . - CDS 47978 - 50017 1697 ## COG4193 Beta- N-acetylglucosaminidase 43 29 Op 1 . - CDS 51262 - 52548 847 ## COG3119 Arylsulfatase A and related enzymes 44 29 Op 2 . - CDS 52545 - 53585 809 ## gi|266621314|ref|ZP_06114249.1| hypothetical protein CLOSTHATH_02469 45 30 Op 1 . 
- CDS 54536 - 55393 458 ## PEPE_1760 hypothetical protein 46 30 Op 2 38/0.000 - CDS 55406 - 56239 720 ## COG0395 ABC-type sugar transport system, permease component 47 30 Op 3 35/0.000 - CDS 56254 - 57144 683 ## COG1175 ABC-type sugar transport systems, permease components 48 30 Op 4 . - CDS 57174 - 58304 1355 ## COG1653 ABC-type sugar transport system, periplasmic component 49 31 Op 1 . - CDS 59226 - 59438 264 ## gi|266621319|ref|ZP_06114254.1| putative acid shock protein - Prom 59463 - 59522 6.2 50 31 Op 2 7/0.000 - CDS 59529 - 61349 1760 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 51 31 Op 3 . - CDS 61342 - 62469 816 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|229784094|gb|GG667641.1| GENE 1 2 - 781 616 259 aa, chain + ## HITS:1 COG:lin2870 KEGG:ns NR:ns ## COG: lin2870 COG0789 # Protein_GI_number: 16801930 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Listeria innocua # 6 96 16 107 244 60 36.0 4e-09 SKFCCISARMLRHYDHIGLLHPAATDEQNGYRYYDSSQIPTFQRIEKLKRYGFSLAEIKR LLTLTEAELNIQILLQYERVKEQAASLTETLTCMERDISHFKEGITMNRNYHIIVMNTPE QRVFSVKRTIAIRARDIHGLISDLMAQAKERGLHRTGPCQLVYMGSEFNEDMMEVEAQLQ VAEAHPDTKLLPDSLSVTTIHQGPLEEIHTGYQAICQYLDEHPEYQLTGISIERYLKDET MVSSAEELETAILFPVIKA >gi|229784094|gb|GG667641.1| GENE 2 893 - 2749 2209 618 aa, chain - ## HITS:1 COG:STM3753 KEGG:ns NR:ns ## COG: STM3753 COG3950 # Protein_GI_number: 16767037 # Func_class: R General function prediction only # Function: Predicted ATP-binding protein involved in virulence # Organism: Salmonella typhimurium LT2 # 1 355 1 373 396 63 22.0 9e-10 MQITSVHIRNFKAIREMEIHGIENALILVGKNNTGKTSVLDAIRAVAGSYQIQESDFNER KQNIEISITLNFTEDDLHMFHRAGIVSQYKRYDAWYRDFCAKLPSCEDETLSFTFVVNWK GESRLYDGYKKNNRYIQEVFPRIYYIDTERQLKQFQDDLFMFQEDEQLIRLRAHVCMFDA AKECNQCFQCIGLINQKKPEDLKLYETSKLLEYKMYQLNLKDFTEKVNANYRKNGGFEEI KFTLNCNPDEMFHVTAEMYHEEHGRLNPIGNLSGGMKSIYMLSLLQTYTDDEKRIPSVII 
VEDPEIFLHPHLQKISSEILYQLSKKNQVIFTTHSPNMLFNFTSRQIRQVVLDEEDYSVV RRHTDIDAILDDLGYGASDFLNVSFVFIVEGKQDKSRLPLLLEKYYSEIHDANGNLSRIS IITTNSCTNIKTYANLKYMNQVYLRDQFLMIRDGDGKNPEELAGQLCRYYDERSLEDVDK LPKVTRRNVLILKYYSFENYFLNPEIMAELGIVKSEEAFYETLFSKWQEYLYRLQSGRHL TEVLGGAISSVEELKAHMEEFKIYMRGHNLYDIFYGPYKKQETEILRKYIELAPREDFKD ILEAIDRFVYFDSRKVDI >gi|229784094|gb|GG667641.1| GENE 3 2879 - 4249 1012 456 aa, chain + ## HITS:1 COG:no KEGG:Closa_1352 NR:ns ## KEGG: Closa_1352 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 456 1 456 456 613 80.0 1e-174 MFDSAVVLTPLHYVYLIGVLVILAVMILRKDTPAVCIAFLFLLGVIGLGSVTGGIMTVFN SILFAAREFMEVIATIALVTALSKCLKDLGSDYLMMVPMSKIMKTPSVTWWILGITMLLF SLFLWPSPSVALVGAIMLPFAVKAGLKPLAAAMAMNLFGHGIALSYDFVIQGAPAVSAGA AGISTSDILTQGRPVFWVMGIATVTAAYLLNRKSMAAGDGVCAEALPEERNTRTKAAIAV AILTPAAFLADILLMLAFDLKGGDATSMVSGTAVLVMCIGAVAGFGRKSLEKVTDYVTDG FLFAIKIFAPVIIIGAFFFLGGNGITTIMGSQYQSGIMNDWAVWLAHNAPLNKYMAAFIQ MAVGSLTGLDGSGFSGLPLTGALAHTFGTAVGASVPVLASLGQITAIFVGGGTIVPWGLI PVAAICNVSPLDLARKNLPPVLIGFLFAFITGCLLL >gi|229784094|gb|GG667641.1| GENE 4 4297 - 5409 854 370 aa, chain + ## HITS:1 COG:CAC2243 KEGG:ns NR:ns ## COG: CAC2243 COG0367 # Protein_GI_number: 15895511 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Clostridium acetobutylicum # 1 368 1 364 397 355 45.0 1e-97 MCGIAGFYHPHRNYLNEEARYRDILKDMAAVQRHRGPDDSGVFLNRHAGLSHARLSIIDL VTGHQPMEKEVGGRTFSIAYNGELYNTEELRKELQALGHSFSTTSDTEVILTGCIEYGPD FVKRLNGIFAFACYDAFHDRVILFRDRSGVKPLFYMVRGDEILFASEIKGILAVPGIRPE LDRKGLNEVFSIGPARTYGCGVFAGMEEVLPGHFLICSPDGISSHCYWKLVSQPHEDSYE ETIEKTSFLVQDSIKRQMVSDVPICTFLSGGVDSSLVSAVCAAELKKQGKRLTTFSFDFT DNDKYFKANSFQPSQDRPFVDRMQKFLDSDHHYLECGNTDQADRLCDSVLAHDLPAMGDI DSSMLHFCSS >gi|229784094|gb|GG667641.1| GENE 5 6356 - 7078 575 240 aa, chain + ## HITS:1 COG:BH1508 KEGG:ns NR:ns ## COG: BH1508 COG0367 # Protein_GI_number: 15614071 # Func_class: E Amino acid transport and metabolism # Function: Asparagine 
synthase (glutamine-hydrolyzing) # Organism: Bacillus halodurans # 1 240 374 614 615 258 47.0 6e-69 MALTGECADEIFGGYPWFHKEECFRAHTFPWTMDLEARKILLKDDFLDYLQMDAYVKETY ERSVSETPTLSEDSPTEARRREISYLNLRWFMQTLLNRMDRASMYSGLEARVPFADHRII EYVWNVPWDMKTRDGLVKNLLRQSGKGLLPDDILFRRKSPYPKTYDTNYEKLLTSRVREI MSDASSPVMQFLDRKKLETFLKSPSDYGKPWYGQLMAGPQMLAYVLQIDYWMKTYKITLL >gi|229784094|gb|GG667641.1| GENE 6 7207 - 7557 330 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621275|ref|ZP_06114210.1| ## NR: gi|266621275|ref|ZP_06114210.1| hypothetical protein CLOSTHATH_02428 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02428 [Clostridium hathewayi DSM 13479] # 1 116 1 116 116 192 100.0 6e-48 MFGSSKGATTEALEKLVQKGKWDKIKKSYLNADAETKVHLAEACASAVGDDSSNVLMALL DSPEDEVKTAALKSLAKVGNDHCVSRIQQMLTSISPDKTALRGEVQTTLQALRGKQ >gi|229784094|gb|GG667641.1| GENE 7 7687 - 8052 528 121 aa, chain - ## HITS:1 COG:TM0395 KEGG:ns NR:ns ## COG: TM0395 COG0446 # Protein_GI_number: 15643161 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Thermotoga maritima # 1 107 264 373 425 60 36.0 8e-10 METTIPSVYACGDCAEYEGINYAIWPQAVEMGRTAGANAAGDGLVYHTVPAALTFNGMET SLFAIGDVGKNGETQYTVEHEEDAEKKIYKTRYYADGRLTGAILIGDTKEMQRVTEEMEQ A >gi|229784094|gb|GG667641.1| GENE 8 9027 - 9671 631 214 aa, chain - ## HITS:1 COG:Cgl2652 KEGG:ns NR:ns ## COG: Cgl2652 COG0446 # Protein_GI_number: 19553902 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Corynebacterium glutamicum # 49 207 3 165 411 114 39.0 2e-25 MGETKKKLVKCLVCGAVFEEGVSVCPVCGVGPENFVPFTEEEKSFHRDTEELFLILGGGA AGLNAAEAIRERNSTASIVMATNEDMLPYSRPMLTKSMLGENNREQLLVHDASWYEKNRI LTLTGKTAESIDPETREVTFSDGIRLKYDKLIYALGSECFIPPIPGHEKPQVAAIRRITD TEKISALLPQTKHAVVIGGGVLGLEAAWELKIAS >gi|229784094|gb|GG667641.1| GENE 9 9885 - 10682 865 265 aa, chain + ## HITS:1 COG:Cgl2096 KEGG:ns NR:ns ## COG: Cgl2096 COG0789 # Protein_GI_number: 19553346 # Func_class: K 
Transcription # Function: Predicted transcriptional regulators # Organism: Corynebacterium glutamicum # 6 131 3 126 334 60 33.0 2e-09 MKQYTIGDVSKRLGISRDSLRFYEKKGIISPQKLENGYRCYSYEDTRKLLDIMFYRRLNF SIEDINRILHQSSFGSYYTMIQEKIAEEEQEVERHRRSLIHLKYLTQLYKNIDDYLNRYD IRPLRRYYKADESLIDKLAVHDLCYIYQEYQLGEGMPEQVDEYYLFAADTAAIIGLEEQL SGRLFIQHEHCIYTVIASASRIPDTRSIMKAVCWARDHGYSLEGTAYSGFLLSCAVDEMQ TERTEEAETSQPVYYIELYLPLKEL >gi|229784094|gb|GG667641.1| GENE 10 11647 - 12081 436 144 aa, chain + ## HITS:1 COG:FN0766 KEGG:ns NR:ns ## COG: FN0766 COG1970 # Protein_GI_number: 19704101 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Fusobacterium nucleatum # 3 137 12 136 142 115 51.0 3e-26 MQEFKAFILKGNIIDMAVGVIIGGAFSKIVTSLVNDILMPLLGAVTGGADFNTFKIILSP AVTDAANQIIKEEAAVKYGLFLQNILDFLIIGACMFFMIKAVAMVSSRFRHEEEKKEEAA PAGPTQEELLAEIRDLLKAQQNAE >gi|229784094|gb|GG667641.1| GENE 11 12085 - 12216 59 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPKKKNETKNRKHQTGPVTAAILVLPVLFVSGFFQHYNDSTAL >gi|229784094|gb|GG667641.1| GENE 12 12175 - 13872 2120 565 aa, chain - ## HITS:1 COG:BH3868 KEGG:ns NR:ns ## COG: BH3868 COG0366 # Protein_GI_number: 15616430 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 1 530 1 529 553 433 43.0 1e-121 MERKWWHEKVAYQIYPKSFYDTNGDGIGDLRGIIEKLDYLKELGIDIVWISPIYQSPFVD QGYDISDYYKIAPEFGTMEEFDELLAEAKKRGIAIVMDLVINHCSDQHEWFQKALADPYG EYADYFYFEKGKNGGAPSNYRSYFGGSVWEPVPGTDLYYLHLFAKEQPDLNWNNPKMKEE LFTMIRWWLEKGVAGFRIDAIINIKKDPDFPDFPADGPDGLAACTKMVEEVEGVGELLEE LRQKAFEPYHAFTVAEVFNMKEEELKEFVGENGHFSTMFDFSAHLLTGGEHGWYDAKPVM FSEWKKTVIESQLEVQNVGFLANIIENHDEPRGASTYLPEHAVNPRGIKMLAAVNILLRG LPFLYQGQEIGMRNCPMDTIDEYDDINTKDQYRTALEAGCTEEEALEACCRYSRDNARTP MQWSREHEAGFTAGTPWLKVNPNYKEINVEDQLRDGDSVLRFYQKLIALRKSEAYREILT YGRFVPAFEADEDIFAFNRESESGEHIFLAANFGKEEKELRLPKKNYRVLLTNDDSASDG TLSTLQEDGTLCLKSCGVIVMLEEA >gi|229784094|gb|GG667641.1| GENE 13 13782 - 13949 69 55 aa, chain - ## HITS:0 COG:no 
KEGG:no NR:no MGSILQEIWKQPYYNDSTFETAGGRPWKENGGMKKWHIRFIPRAFMIQTGMESVI >gi|229784094|gb|GG667641.1| GENE 14 14174 - 14848 781 224 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1876 NR:ns ## KEGG: CDR20291_1876 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 3 221 2 218 399 337 75.0 2e-91 MEERLIGTISRGVRAPIIREGDDLAKIVVDSVINAAASSRYGFEIQDKDIVAVTEAVVAR AQGNYASIDNIAADVKEKFGDASVGVIFPILSRNRFAICLKGIARGVKKVVLMLSYPSDE VGNHLIDEDELDAKGVNPWADVLSEEKYRELFGYKKHTFTGVDYIEYYKDVIKEQGAEVE VVLANDCRTILNYTDNVICCDIHTRARSKRRLLAAGAKKVYLAS >gi|229784094|gb|GG667641.1| GENE 15 15851 - 16243 554 130 aa, chain + ## HITS:1 COG:no KEGG:Closa_1782 NR:ns ## KEGG: Closa_1782 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 130 268 396 396 236 85.0 2e-61 MMMERTGKHVEVMIYGDGAFKDPVGKIWELADPVVSPAYTDGLEGQPNELKLKYLADNDF ADLSGEELKKAISERIRTKDDNLVGDMASQGTTPRRLTDLIGSLCDLTSGSGDKGTPIIF IQGYFDNYTK >gi|229784094|gb|GG667641.1| GENE 16 16369 - 17199 852 276 aa, chain - ## HITS:1 COG:no KEGG:Sgly_1299 NR:ns ## KEGG: Sgly_1299 # Name: not_defined # Def: transcriptional regulator, MerR family # Organism: S.glycolicus # Pathway: not_defined # 1 136 1 134 259 98 40.0 3e-19 MTIKDMEQQVGITKANIRFYEKEGLLSPARGENNYRDYTEEDREVLEKIKYLRTLGVPVS DIRSFQQGKRSLAQLLKERECQLREEEAAISRMKLVCLELKKKDWDFQTLDTELFDLQME FIEKRGAERMKKDRTKPVILARKIVWFLAMCGIVSLVFFPVNSVLRIPVSEEVITIWTLS VTLVTLAASVLQAVTAGLRDDEAEKRRRGELLVRRAENRRLRYEKQIVGFNQVCLSSLLI IPVNRMLEIQWPLWLLVIWFVLVFGSGIAVAVVKNI >gi|229784094|gb|GG667641.1| GENE 17 17360 - 18937 1695 525 aa, chain - ## HITS:1 COG:BS_leuS KEGG:ns NR:ns ## COG: BS_leuS COG0495 # Protein_GI_number: 16080084 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Bacillus subtilis # 2 523 276 802 804 550 50.0 1e-156 MDAIKEYREVCAKKTEFERTQLVKEKTGVCIDGLSAVHPLTGKKIPIYIADYVMMGYGTG AIMAVPAHDDRDYEFAKKFGIDIIPVIEGGDISEAAYTGEGKMINSDFLNGLTNKKDSIA 
RMVKELEKLGVGEAGVQYKMKDWAFNRQRYWGEPIPIIHCPVCGVVPVPYEELPLRLPKV DNFEPGQDGESPLAGIESFVNCTCPKCKGAAKRETDTMPQWAGSSWYFLRYIDPHNQEAL ADPEKLKYWMPVDWYNGGMEHVTRHMIYSRFWHHFLYDMGLVNTEEPYAKRTAQGLILGP DGDKMSKSKGNVVDPLDVVEEFGADVLRTYVLFMGDYGAATPWSDSSVKGCKRFLERVAG LCDIAGGEGATPELETAFHKTIKKVSNDLEEMKFNTAIAALMGLLNEIYAKGSITKDELI IYIKLLCPIAPHLCEEIWESLGGDGFLSMAEWPVYDEAKTMDATVEVGVQVNGKLRSVVA LPANCDKEEAFAIAVKDEKIVPLLEGKTIVKEIYVPNKIVNLVVK >gi|229784094|gb|GG667641.1| GENE 18 19884 - 20732 924 282 aa, chain - ## HITS:1 COG:lin1769 KEGG:ns NR:ns ## COG: lin1769 COG0495 # Protein_GI_number: 16800837 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Listeria innocua # 12 282 1 271 803 301 49.0 1e-81 MVVSAKKWRQIMKHDFKKIEPKWQKKWDDARIFQAVDNDERPKFYGLVEFPYPSGAGMHV GHIKAYSSLEVISRKRRMEGYNVLFPIGFDAFGLPTENYAIKTGVHPRKITDDNIARFSN QLKKVGFSFDWSRVVDTTDEDYYKWTQWIFLKMFEKGLVFRDTALVNYCPSCKVVLSNED SQGGKCDICHSDIVQKSKSVWFLRITEYADKLLKGLDTVDYLPNIRQQQVNWIGKSEGAF VDFSLADKEESFRIYTTRPDTLFGVTFMVIAPEHPMIDKLAS >gi|229784094|gb|GG667641.1| GENE 19 21038 - 24838 4060 1266 aa, chain - ## HITS:1 COG:CAC1655_1 KEGG:ns NR:ns ## COG: CAC1655_1 COG0046 # Protein_GI_number: 15894932 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Clostridium acetobutylicum # 20 975 5 960 985 1076 55.0 0 MWYYCINEKQILSKEEIMSVRRVYVEKKPAFAVKAHELKEEIASYLGIDTVTDVRVLIRY DIENLSDETYRTALATIFSEPPVDEVYEETFPRNEADVVFAVEYLPGQFDQRADSAEQCV KLLKEDEEPVIKSATTYVISGTLTEAEAADIKSFCINPVDSRETDETKPETLLTVFETPA DVIIFDGFQTFGEGALKEVYDSLNLAMTFKDFMHIQNYFRDEEKRDPSMTEIRVLDTYWS DHCRHTTFQTELKDVTFTEGDYRAPMEHTYHKYLSDREVVLKGKSGKFVCLMDLALMAMK KLRMEGKLEDLEVSDEINACSIVVPVEIDGKEEEWLVNFKNETHNHPTEIEPFGGAATCL GGAIRDPLSGRTYVYQAMRVTGAADPTKPLNETLKGKLPQRKIVTGAASGYSSYGNQIGL ATGYVKEIYHPNYVAKRMEIGAVMGAAPRKNVIRETSDPGDIIILLGGRTGRDGCGGATG SSKVHTESSIETCGAEVQKGNAPTERKIQRLFRRAEVSSIIKKCNDFGAGGVSVAIGELA 
DGLKIDLDKVPKKYAGLDGTEIAISESQERMAVVIDPKDVKQFLAYAAEENLEAVEVAVV TEEPRLVMNWRGRNIVDISRAFLDTNGAHQESSVVVEVPNKEGSAFDKKEISDVRNEWMN VLSDLNVCSQKGLVEMFDGSIGAGSVFMPYGGKYQLTETQSMVAKLPVLSGKTDTVTMMS YGFDPYLSSWSPYHGAVYAVVSSVAKIVAAGGDYKKIRFTFQEYFRRMTEEASRWSQPFS ALLGAYEAQLGFGLPSIGGKDSMSGTFNEIDVPPTLVSFAVDTADQKHIITPEFKKAGSK IVVFRIEKDSYDLPVYEQLMDGYEKLYNEITAGRVISAYAVEGHGLAESVAKMAFGNKLG VKIEHNVDPRDFFAAGWGDIVCEVPDGKVGELSISYTVIGEVTDRGVLEYGSAAITMDEA LNAWTETLESVFPTRSGVVNTPVTEGLFDTKDIYVCRHKTAKPNVFIPVFPGTNCEYDSR KAFERAGANVVTKVFKNRNADDIRESVEEYRKEIAKAQIVMFPGGFSAGDEPEGSAKFFA TVFRNAVIREEIEKLLNERDGLMLGICNGFQALIKLGLVPEGTITAQRKDSPTLAMNTIG RHISKMVYTKVVTNKSPWLQEAELGGVYCNPASHGEGRFVANEDWIKKLFHNGQVATQYV DENGNPTMDEFWNPNGSYMAIEGITSPDGRILGKMAHSERRGEAVAMNIYGEQDMKIFES GVKYFK >gi|229784094|gb|GG667641.1| GENE 20 24917 - 25456 529 179 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621290|ref|ZP_06114225.1| ## NR: gi|266621290|ref|ZP_06114225.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 6 179 1 174 174 312 100.0 5e-84 MKGERMADMRIGEYLISDIMMSVGKKSDVYLRKFQRIHEKGYSFNWCAALFTALWFAYRK MWKAAAAIMGFNVVYTVVLSVLESTVLSADNALVFMGTAAGLFVLSMVVFGLLGDRLYCR HISAILDKEGCRARRASEKTARTEALQKAGGTSVIGVIILWAVSLVLQWAVSWLISFFL >gi|229784094|gb|GG667641.1| GENE 21 25562 - 26809 1329 415 aa, chain - ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 8 408 16 413 427 241 38.0 2e-63 MKEVRGNVIVGQSGGPTAVINSSLAGVYKTAKDRGAKRVFGMLHGIQGLLEERYVDLSEH INNDLDIELLKRTPSAYLGSCRYKLPEICEDQEIYKKIFSILEKLEVEYFFYIGGNDSMD TIKKLSDYSILNGSKIRFMGVPKTIDNDLAATDHTPGYGSAAKYIGSITKEVIRDGLVYD QQNVTLLEIMGRNAGWLTGAAALAKCEDCEGPDMIFLPEIPFDVDTFMKKAEDLHKRKKS VVVAISEGVKMADGRYVCELTDSIDYVDAFGHKQLTGTARYLAEKISREVGCKTRAIEFN SLQRSASHIVSRVDITEAFQVGGAAVKAAFEGETGKMIILKRVSDDPYICVTDIYDVHKV ANVEKKVPREWINEAGDYVTEEFVSYIKPLIQAELTPIMVDGLPRHLYYTDVEKK 
>gi|229784094|gb|GG667641.1| GENE 22 26982 - 27233 479 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621292|ref|ZP_06114227.1| ## NR: gi|266621292|ref|ZP_06114227.1| putative nuclease sbcCD subunit C [Clostridium hathewayi DSM 13479] putative nuclease sbcCD subunit C [Clostridium hathewayi DSM 13479] # 1 83 1 83 83 100 100.0 5e-20 MARGVRKSPVEKLQDQLNEVRESIRQYENCLVTLKEKEKTLNEQIELEEFKTIRGLLTEQ GIGMEELKEMLESNVTAVEEQSA >gi|229784094|gb|GG667641.1| GENE 23 27344 - 27955 641 203 aa, chain - ## HITS:1 COG:FN0808 KEGG:ns NR:ns ## COG: FN0808 COG0406 # Protein_GI_number: 19704143 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Fusobacterium nucleatum # 1 203 1 205 206 116 31.0 3e-26 MKLYLIRHGQTDWNIQGKIQGSHDIPLNDTGRAQAKLVAEGMDSRPVTKIFSSTLMRAVE TARMIGDRQHVDIYLVPGLIEVEFGKWEGMTWAEIKEQYPNEYERWFINPVEVAPPGGET QMMVMERVAGAIETVMGMTNGREDIAVVSHGATMAYIVAYLMRNHPEESEIIVDNASITT VNYNPVTQDYMLLEVNDTAHLHQ >gi|229784094|gb|GG667641.1| GENE 24 27952 - 28509 691 185 aa, chain - ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 8 168 117 276 372 79 32.0 4e-15 MKAITPCTQEENTQLLNDAAAGNEGAKKRLIEGNLEAALKCAKEYDGKGVLLTDLVAEAN MALTMAVTEFLTDAGDGDFEGFLAARMKAALDLAVEEQQSAKQTGEELAARVNVLQTVSQ MLAKELGREATVEELADKMKMTAGEVKDIMKMAMDAMSMNAENADLEDLAEVEGIEITGA EDGEE >gi|229784094|gb|GG667641.1| GENE 25 29497 - 29886 328 129 aa, chain + ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 8 127 166 287 296 65 29.0 2e-11 MINEFHDVFVANQEYFPLKGEAVSLQKLQTYPILMLDRKSTTSEFLHHMFQREQLDLVPE IELSSNDLLIDLARIGLGIAFVPDFCIPENDKDLFIVRLTEKMPARQMIVAYNESLPVSQ ASKQFMDMM >gi|229784094|gb|GG667641.1| GENE 26 29913 - 30422 567 169 aa, chain - ## HITS:1 COG:CAC2561 KEGG:ns NR:ns 
## COG: CAC2561 COG1670 # Protein_GI_number: 15895823 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Acetyltransferases, including N-acetylases of ribosomal proteins # Organism: Clostridium acetobutylicum # 4 168 2 166 166 150 40.0 1e-36 MMKLRLRPCKREDGRELMKWFSDERQMRMWCRDGFTFPLTDGQMDAYYAELDQDSRAWGF TALDESGKPVGSFKMAAADYEADSVHMGYIVLAPEVRGRGVGQQMVAMAAAYAVDFLGMK RITLKVFDCNPGAKRCYEKAGFREEEYRPEEFSYRDEMWGNSLMVYNRR >gi|229784094|gb|GG667641.1| GENE 27 30394 - 32793 2579 799 aa, chain - ## HITS:1 COG:CAC1672 KEGG:ns NR:ns ## COG: CAC1672 COG1199 # Protein_GI_number: 15894949 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Clostridium acetobutylicum # 9 785 8 790 791 619 43.0 1e-177 MADGEKKIIKISVRNLVEFVMRSGDLDNRRTAGAEKTAMQEGSRIHRKIQRRMGADYHAE VVLKHLVEENQFEILVEGRADGIIEKPSGVTIDEIKGVYMDLYYLNEPIEVHMAQAMCYG YFYCCDHQLGGAVMQLTYCNIETEEIKRFQVEKSFEELEAWFGGLIHEYVKWADYLYHHG LRRDESIRDLEFPYPYRKGQKELAVSVYRSIARRKNLFIQAPTGIGKTLSTVYPALKAIG EGHGDKLFYLTAKTITRAVAEETYEILRKGGLYFSTVTITAKEKLCFLEKPECNPDACPY AKGHFDRVGDAVFDVIHQEAGITRETVLEYARRHQVCPFEFCLDISNWVDGIICDYNYVF DPNIRLKRYFSEGIAGEYLFLVDEAHNLVPRAREMYSAVVYKEDFLLIKKIVKPKDQKLA GLLDRCNKQLLEMKRECEGYTILTDIKLFMTAIMSLFGEMEKFMDSFPEFEDRDLVLDFY FALRDLLNIYDRVDENYRIYTEFLSDGRFMLRLFNMNPAKNLKECLDKGNSTVFFSATLL PVQYYKELLSGSQEEYAVYAESPFDREKRLLLVANDVSSRYTRRNEREFKKIVEYIRRIV FAHAGNYMVFLPSYQYMREVEQLLSGMEAQQEFAWLAQSAHMTEQEREIFLEEFKENRED SFVGLCVMGGVFSEGIDLKEGRLIGAIIVGTGLPMVCTEQEILKGYFDEKEQKGFDYAYQ YPGMNKVMQAAGRVIRTVADEGVIALLDDRFLKPDYQALFPREWDGYYSVNLNSVSGIVT EFWGERERQNNDETETSAL >gi|229784094|gb|GG667641.1| GENE 28 32901 - 33143 305 80 aa, chain - ## HITS:1 COG:no KEGG:Closa_1342 NR:ns ## KEGG: Closa_1342 # Name: not_defined # Def: Phosphotransferase system, phosphocarrier protein HPr # Organism: C.saccharolyticum # Pathway: not_defined # 1 80 1 80 80 102 78.0 4e-21 MKQYKIMLPGVAEAKEFVAAATKCDFDIDVYYNRVTIDAKSILGVLSLDLTRPLTVEFNG ENEEFETYLMEKAPSRIHAA 
>gi|229784094|gb|GG667641.1| GENE 29 33513 - 33854 249 113 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621299|ref|ZP_06114234.1| ## NR: gi|266621299|ref|ZP_06114234.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 113 1 113 113 226 100.0 3e-58 MKKAILYRISYLVEVDEEEISDHKGKEKFWKLAWGKLPEQEIVLDEQHKAEWQSTSTLYL DPLTMNCGQCARCHGWTTDREKPDPIRGLSNGATVDGELLCDECLPDHHPWAF >gi|229784094|gb|GG667641.1| GENE 30 34307 - 36367 2384 686 aa, chain - ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 381 684 211 514 555 221 44.0 3e-57 MIENVEKAKHHMPEQTGRGEKRATVSGKINRRLIEILFPALSVLIFVTCYIAALAVTNLS YQVTEGQTKNAVDTVDEFFNGKKSAVGILSYNKELCQLLEEAPTAEQIAGAPGKEAAVTL LKRAYDTMVSDGVQAVWLAGIGNDTYLMQTGEAWAANLADIDWDERVRETMGPVVTDPFS DPVTGKMVISIVAPVLSLDGSNITGYVGFDVFQDHFSQLLQDIRIGEGGTLELVTGDESL VYSFDASLINKKVEEVENLGREYLEGIRNQYTGSMTYSYEGKKYYSIFAKSEATGWATVG SIPVEEINAPRNQLIAVMVVISLLILITVIIIVIRLVRKLLSPLSDIAVNVEGFSKGDFS VNLEIDSDDEIGLLADSVKGTIQSLREMISHATEVLAELSDGNLRLTVDGVYDGEFGKIK DALEEIVRSLNDTLGRIYESADQVSEGAAQLSFGAQSLSQGAAEQAGAIENLAAVIQEVS NQVDHNAGNAKTASDKSKELGSEILEQNRQMQKLKEAMDDIRMNSGEIGEIIRVIEDIAF QTNLLALNAAVEAARAGDAGRGFAVVAEEIRSLAGKSAEASKNTAELIEKSVRSVKHGAG IVDTAAQALEVVAQESVKMTDTVDRISKASEEQSRSITEITRNITQIASVVQTNSATAEE SAAASEELSSQAQLLKDYVSRFQLDR >gi|229784094|gb|GG667641.1| GENE 31 36503 - 37072 628 189 aa, chain - ## HITS:1 COG:no KEGG:ELI_1484 NR:ns ## KEGG: ELI_1484 # Name: not_defined # Def: hypothetical cytosolic protein # Organism: E.limosum # Pathway: not_defined # 1 189 1 190 204 290 76.0 2e-77 MSERPRLDRSLDGETFRSYYYLKEELTAFCRENGLPVSGGKLEITERIACFLDTGNVLPV SAKTKNRPEVGVISEDMKIEPDFVCSEKHRAFFREAIGTGFSFNVAFQKWLKSNTGKTYR EAIDAYGRILEEKKKGKTAIDRQFEYNTYIRDFFADNEGRSLEEAIRCWKYKKGLKGHNR YERSDLSAL >gi|229784094|gb|GG667641.1| GENE 32 37069 - 
37545 373 158 aa, chain - ## HITS:1 COG:PA0351 KEGG:ns NR:ns ## COG: PA0351 COG0622 # Protein_GI_number: 15595548 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Pseudomonas aeruginosa # 1 144 8 153 157 133 45.0 1e-31 MRVGILSDTHGLLRSEVKEVLRTCDVILHGGDINRETIIKELEELAPLHVVRGNNDKEWA EGMPEELSFSLEGVSFYMIHNKKEASPEAVSDADVVLYGHSHRYDERIIDGTLWLNPGSC GPRRFHQPITMAVMTLEGGTYQVEKLEFAHEKKGGEDQ >gi|229784094|gb|GG667641.1| GENE 33 37658 - 39061 1601 467 aa, chain - ## HITS:1 COG:BS_sms KEGG:ns NR:ns ## COG: BS_sms COG1066 # Protein_GI_number: 16077155 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus subtilis # 1 464 1 452 458 507 56.0 1e-143 MAKAKATSFFCKECGYESPKWLGQCPGCRQWNTFVEELTAKKEPAARRGLGSASGNGLGG VGEGGLSKIRPSYLSEISLDEQDRMQTGYEELDRVLGGGVVRGSLVLVGGDPGIGKSTLL LQVCRNLAASGQKVLYISGEESLKQIKLRANRIGTVAGELKFLCETNLDTIEGAISGEKP DVVIIDSIQTMFREEVTSAPGSVGQVRESTNLLLQIAKGVGIAIFIVGHVTKEGVVAGPR VLEHMVDTVLYFEGDRSASYRIIRGVKNRFGSTNEIGVFEMRESGLEEVKNPSEYMLSGR PEDASGAVVACSLEGTRPVLLEVQALITPTLFGMPRRTAAGTDYNRVNVLMAVLEKRCHY ELSRYDAYVNIAGGLRMNEPALDLAIVMAVISSLKDRPVNPRTIIFGEVGLAGEVRAVPM AEQRVNEAKKLGFETCILPKTSLDKMRKMEGIRLVGVENIREAISLL >gi|229784094|gb|GG667641.1| GENE 34 39125 - 39538 527 137 aa, chain - ## HITS:1 COG:no KEGG:Closa_1335 NR:ns ## KEGG: Closa_1335 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 137 1 137 137 212 84.0 3e-54 MPVIEELIRTEQDGTISFGNYKLSTKSKLQDFEHDGDLYKVKTFYEITKLERNGMFVYES VPGTAVEHFTVSNNGVEFSVEGDKDAQLTIQLEDDTEYEVFVDDAAVGSMMTNMSGKLSV SVELNEEEPVSVKVVRR >gi|229784094|gb|GG667641.1| GENE 35 39801 - 42251 2422 816 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 805 2 806 815 937 57 0.0 MIDRFTAKARESINLAVQAAEELGHSYVGTEHLLIGLLEEGSGVAARVLSENGVKKEKVI GLVSQLISPDQTVRMEENGYTPSARRVLENSYKEAVRFKARLIGTEHLLISIIRDNDCVA 
SRLLNTIGISIQKLYIDVLAAMGEDAPANKEELLKGSRTKGSTPTLDSYSRDLTALAREG KLDPVIGRETEIKRLIQILSRRTKNNPCLIGEPGVGKTAVVEGLAQMIIEGNVPETIAEK RVLTLDLSGMVAGSKYRGEFEERIKKVIAEVMEDGEVLLFIDEIHTIIGAGGAEGAIDAS NILKPSLARGELQLIGATTIEEYRKYIEKDSALERRFQPVTVEEPSEEEAVAILKGLRGR YEAHHHVTITDEALTAAVKLSARYINDRFMPDKAIDVIDEAASKVRLTAFVEPPEIKDLE KEIEKLEDQKEAAIRDEAYEKAGAIKKKQEKKKEKIDKIREKWQKDKDTRKQVVDEGEIA DVVSSWTKIPVRKLEEGESQRLKNLESILHERVIGQEEAVTAVAKAIRRGRVGLKDPKRP IGSFLFLGPTGVGKTELSKALAEAMFGTENALIRVDMSEYMEKHSVSKMIGSPPGYVGYD EGGQLSEKVRRNPYSVILFDEIEKAHPDVFNILLQVLDDGHITDAQGRKIDFKNTVIIMT SNAGAESIISPKRLGFGAVADEKADYKVMKDRVMEEVKHIFKPEFINRIDEIIVFHPLNK GHMKDIVTIMLKEIMKRTKEQMNITLSVDEAAKEFLINKGYDEKYGARPLRRTIQSCLED RLAEEILDGAVKEGDEVLVSQGEAELKFSVPELIKN >gi|229784094|gb|GG667641.1| GENE 36 42284 - 42463 163 59 aa, chain - ## HITS:1 COG:no KEGG:Closa_1333 NR:ns ## KEGG: Closa_1333 # Name: not_defined # Def: ATP:guanido phosphotransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 57 287 343 343 92 82.0 7e-18 MAGTADGLLAMKEPCSVYRLMLGLQPANLQKISDRPLSKEELDIARAEFIRRELPDLAK >gi|229784094|gb|GG667641.1| GENE 37 43522 - 44001 233 160 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764772|ref|ZP_02171826.1| ribosomal protein L5 [Bacillus selenitireducens MLS10] # 4 160 14 170 365 94 32 1e-18 MLRWFEKTDENRPGVISSRIRLVRNWEEYKFPAMLGTQESEEMVRRLEFGLKDLSEVEGK KYEYAMLEELEELDRAALRERRILNRAAVEKKAPAGIILSEDEDTSILLNGDDHIRIQLL SSGLHLEELWERADALDDYINERFPYAFDDRYGYLTSFPT >gi|229784094|gb|GG667641.1| GENE 38 43994 - 44569 234 191 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764773|ref|ZP_02171827.1| ribosomal protein L24 [Bacillus selenitireducens MLS10] # 1 188 1 174 179 94 32 1e-18 MLCERCGIREANIQYTEVINGVKKEHHFCAQCAKEMDFGPYSAIFDGDFPLGKLLSGLLG IEDTSRVPDKLHQVVCPTCKTSYDEFIKNSQFGCADCYSVFGPLMEESLKQLQGSTTHTG KQPKYQKISYKAPDIKSQQDGAAGEAVCEREEEIMELDAKLKEALQFEDYETAAVCRDKI RELKKGHEANA >gi|229784094|gb|GG667641.1| GENE 39 44515 - 44628 57 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAASGREQTVTNSRKNGGYLCYVRDVEFAKRISSIQR >gi|229784094|gb|GG667641.1| 
GENE 40 44635 - 45057 401 140 aa, chain - ## HITS:1 COG:CAC0599 KEGG:ns NR:ns ## COG: CAC0599 COG1725 # Protein_GI_number: 15893888 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 13 130 1 120 125 101 42.0 5e-22 MCITDITKDGENMILQIDFNSDEAIYIQLRNQIIIGIATETIKEGDPLPSVRQMADNIGI NMHTVNKAYSVLKQEGFVKLDRRKGAVVSLDVDKMRVLDEMRRDLSVVLARGICKNVTCD EVHQLVDEIYMLFSRGQKEE >gi|229784094|gb|GG667641.1| GENE 41 46110 - 47957 1750 615 aa, chain - ## HITS:1 COG:no KEGG:Closa_1330 NR:ns ## KEGG: Closa_1330 # Name: not_defined # Def: cellulosome anchoring protein cohesin region # Organism: C.saccharolyticum # Pathway: not_defined # 1 539 1 574 620 613 63.0 1e-174 MKIKLKRLMAGLMLVCMLAVMGPWGQMVSWAANTRIAFSDPSVMVGNEVTVTMKITSDSA LGKADVMVAYDSAALEFISGNGANGGAGALKLQDTTDAADQKSFSFTLKFKALKAGNTQI SVTSQEIYDADSQIVTVDQQGSSTVKVTSPATSSKDAALKSLKISPGELSPEFSPEVENY TASVEGNTVDLVVNAVAANAGASVRLQGDKGLKAGENQVVVNVTAEDGETVKNYTIAVTK AEGGETAPGGSIGADVSAEFGDAAVTVNGTEYKVAASFDESALPEGFEAGTYDYKGTEVM AGKGLEKDLTLLYLQDAGGNGGFYIYNAEADSWSQFVSVETSSKAIVIVPLDAGVTVPEG FEERQIDIDGVRVTGWIPKSEGDPQYCLFYAMNWNGEKNFYRYDLTEKTIQRYYASGVSM DSYNKMKEDYNSLVRDYNLQRYILIGVGVLAFILLIVILILLKRGGNGSDRGFREERRGR TEEPEGFFTEPDSQAGRVRRYTRDEFEGEGEAPLQIQEEPEEDDFEMEDSLDDMEKKLSA RLAKEAERDAAAPQRTVREQTMAAPRPNPRNAAQAMRGQEGRPQPSRPQPSRPDRAVRYP EMEDDDDFEFMDLDD >gi|229784094|gb|GG667641.1| GENE 42 47978 - 50017 1697 679 aa, chain - ## HITS:1 COG:SA2100 KEGG:ns NR:ns ## COG: SA2100 COG4193 # Protein_GI_number: 15927887 # Func_class: G Carbohydrate transport and metabolism # Function: Beta- N-acetylglucosaminidase # Organism: Staphylococcus aureus N315 # 356 504 112 255 258 61 29.0 6e-09 MAAILASVLVIDSLVGYVSPYIESMAYMERSATVNASSLNVRSGPGTTYSIVTKLTSGAA VTVIDEKTASDGALWYQIRVKGSGGTETTGYVSKSYLKFPVSYTSDADFEAHLNSEGFPE SYKVRLRELHAQYPKWVFRAQHTNLDWNTAVKEESQVGRNLVHTNSMSSWKSIADGAFNW DSSTWSGFDGSSWVAASEDIVKYYMDPRNFLDETNVFQFLTHTYDSNSHTVEGLQTMVKG TFLSGTAAGGGSSSPGGSSPGGGDSSSYGPGVSGGGPGVSLPSTGGGSSGAGGSDTIESA 
APSKGNTPGVSQESPHASITPNTPRVLMEYGPGVSLTGPSAGGSGSGTVTPSGSTNYVDI IMNAGTQSGVNPYVLAAMIIQEQSSSGTGGSITGTEPGYEGYYNFFNIEAYQSGSMSAVQ RGLWWASQSGSYERPWNSPEKSILGGALYYGNNYVKVGQDTFYLKKFNVQGNNLYKHQYM TNIQGAASEGAVYAKAYNDEMKQTALEFKIPVYNNMPDNACSQPTTDGNPNNKLSYLSVD GFVMTPTYNRDTLEYNLIVDPSVSSITVNAGAIDSRAAVSGSGTIQLQNGENDVRIAVTA QNGSVREYVVHVVRQDNGPTYSSSVGGSTGGSGGSSAGPGGGEVGPGVSSGPGVSEPSSG GSAGTPGGSNITIAPDGSN >gi|229784094|gb|GG667641.1| GENE 43 51262 - 52548 847 428 aa, chain - ## HITS:1 COG:YPO0829 KEGG:ns NR:ns ## COG: YPO0829 COG3119 # Protein_GI_number: 16121138 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Yersinia pestis # 2 409 42 453 517 247 34.0 2e-65 MKRPNLLVVFADQWRNTARGIHDPQIVTPNMDQFAEEAFSTDQAVSGCPLCSPYRSELLT GRRAVHTGVFGNCMTGYDMCLSPDELCISQVLKEYGYRTGYIGKWHLDSPELNSVSHPVS GAEGWDAYTPPGKMRHGFEYWHAYNAWNDHLQMHYWEDSSEKIYADAWSPVHETDKALEF MGSVKEQPFALFLSWNPPHPPFERVPKEYYNRYRNLEPDLPPNVEGERFDNQTGEPGFGS REELAEAVRCYYAAITGLDEQFGRIVSWLKEMELYDQTIVLLTADHGEHLGAHGYVGKHT WYEESINIPFLMRYPEKLPAGRNDISVETVDIVPTLLGLLDIAIPPSAEGRCLADWIMCG KKPENEAVYSSAYISRDIFLEAYKEKGLDPKRSGWRCIRTPEYKYVIEKGYMPEQIPRFL LFDRIADP >gi|229784094|gb|GG667641.1| GENE 44 52545 - 53585 809 346 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621314|ref|ZP_06114249.1| ## NR: gi|266621314|ref|ZP_06114249.1| hypothetical protein CLOSTHATH_02469 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02469 [Clostridium hathewayi DSM 13479] # 1 346 12 357 357 704 100.0 0 MTGLQMKLHVLAGADKISMNVFDFMGTPYVQEMPMVELIRDKKPELLKMDGLRRGKEQDG LGVLWYPGQENLLETPGGRLEELVLKREFDTLFPLLGIPVCFREREVNLLSGVNALCCSR EELMRLLEKGLILDGAAAGYLCRRGLGRYIGCEDVGAELRSPSAELLCGEEFHGSFSGNL LPTDWMRLEYKKIRIPLFQYDPDCEVLTVYVDKDRKVLGQGIVLYENRLGGRVCVIPGYI GTWQFAYRSRSWQMMRIVEWLSKGTFPVLVEDSPNIAPFYYSDRETGEGLLALLNTGLDV QTACLRRNANLCMYEEGMEENAYSLEPLELRVIRVGYERNEKENAS >gi|229784094|gb|GG667641.1| GENE 45 54536 - 55393 458 285 aa, chain - ## HITS:1 COG:no KEGG:PEPE_1760 NR:ns ## KEGG: PEPE_1760 # Name: not_defined # Def: 
hypothetical protein # Organism: P.pentosaceus # Pathway: not_defined # 3 263 7 269 674 120 30.0 5e-26 MNYILRAVIHEDQWERQLEDIVWVCHEAGIEEVYLKEQCHQILMSPFPLEKHRRMAEIYG KMAERLKEEKITYSINLATSVGHVDCRLEKKDILPYSKFTGESLQPADSVYCILDEGWQK YIADVCTVYARTHPAVLMVDDDFRSLNHSDSFGCFCSLHARAVSEKLGREIDSSQLLEAV TRNTEEGRAVRKAWLEVNFEGQSKAAARIEQAVHSVDPAVRIGLMNSGEAAHALQGRDME SLLRTFAGANRPLSRPLGGAYSDCLHQDLIAVHQGMALSMAQLAS >gi|229784094|gb|GG667641.1| GENE 46 55406 - 56239 720 277 aa, chain - ## HITS:1 COG:SMc04137 KEGG:ns NR:ns ## COG: SMc04137 COG0395 # Protein_GI_number: 15963868 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 8 276 23 294 295 183 40.0 3e-46 MKQNKKRVGKMVMTAVLLILGAANIFPFIFMLSASFKPLNEIFVYPVKLIPEHFITSNYK EVFSAQYSFVRWYFNTFVMVGTILVLKVGIVTLTAYAFARLRFKGRDILFVVLLSSMMIP PDAVIIPKYVIFKAFGITDSMWSIVLPSIFDVFFVFMVRQFFLGIPETLSEAAIIDGCGH LKIYSQIVLPLAKPAVITMILFTFIWAWNDYMGPYIFITTPAKQMLSVGIKMFQVNNSVD YGLQMAAATMVLIPIIVMFLFSQRYFIEGIATSGMKN >gi|229784094|gb|GG667641.1| GENE 47 56254 - 57144 683 296 aa, chain - ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 1 272 1 268 292 189 38.0 6e-48 MTSSLRKKEARIAWLFITPYLIGFTLFHFLPVLLSAVLSLTDMRYISMIDKTHFIGLGNY FEMSRDPEFWNAFCNSIVYSICFVPFVMAAGLLLAVLVNQKIHFRNGIRAMIFMPYVSNM VAIAIVWSILLDPVDGIVNQTLSALGVNQAPMWLMGSKTALYSVVMIAVWREVGLQFVTY LAALQEVPTELKEAAQIDGAGRFQQFFYVTLPLIRNTTFLLTITSIITSLKNFTIIQVLT AGGPGNSTTVLPLNIVDTAFSSARMGYASAQAMVMFVLVMGVTVIQWVGKNRSEIY >gi|229784094|gb|GG667641.1| GENE 48 57174 - 58304 1355 376 aa, chain - ## HITS:1 COG:lin0220 KEGG:ns NR:ns ## COG: lin0220 COG1653 # Protein_GI_number: 16799297 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Listeria innocua # 18 263 76 316 
418 85 30.0 1e-16 MDIEIIALPDPGAGEANKLLISLMAGDEFDIVEDAYSNMKQFYEAGVIEELGPLAEAAGY DVEKIFGDYPAAFDGKIYGLPAYVDKAITIYNKDVFDQAGVSYPEADDWTWEKFIEIGKK LTDEKAGIYGAYNPDWVHYNYMYAMQKGFSHYKEDGSSNYDDPLFKESVKWYYDLGNTEK IQPSYLVQKSKQMPVDYFTTGRVGMSVCGGWTTNWIIDKEKYPRDWKAGVLPMPHPEGED PSVSVVVSNVWIPATSKNKEKAFDVVKLFAEKQYTLGYGRIPARIDLTDDEIQDYIRDQI LPALEPDGITVEDIQAMWFDPDVNIFDEKPVGTASTEIKNIIVDECGLYGIGEQSLDDTV AHIKTKADAAIETASK >gi|229784094|gb|GG667641.1| GENE 49 59226 - 59438 264 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621319|ref|ZP_06114254.1| ## NR: gi|266621319|ref|ZP_06114254.1| putative acid shock protein [Clostridium hathewayi DSM 13479] putative acid shock protein [Clostridium hathewayi DSM 13479] # 1 67 1 67 67 72 100.0 8e-12 MKKRLSIVLAFAMAVSMMGGCKGADTAQSASGQTSATKEGGTKASENKETEKEQQASEQP GEKVHLKLAS >gi|229784094|gb|GG667641.1| GENE 50 59529 - 61349 1760 606 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 178 602 174 591 602 182 30.0 1e-45 MIRKWTLKRNLVVVNVILLTVALLYSSIMLIRYSGEVSMKNLINSSYDMLTYVCRSVDDF ITTEVKISTEVMQSDSDLKEHLEQKPEDYENQHEKLVDIYLMNRYLTRRNSYSRVSGVLA GIINKNGVVYNLQYPMDMAEEAEAFAESLMKDKKRLEYGYDWYPLQKNCFAPETGNLRND SMIPVLKNLLLDGGGYQGTLIYSLSEEQIYEQYKKSKLLEEGYLYIVDREGRLISHSDES MLLHPEEVVNLAYTKKAMEGTGNYFIENREVVMSQRLNDGNWTAIAVIPLNNMIGEVIGI YKRFIFVIAVILIISVSVSVLLSREAIKPLETMIASMERVENGDFSVRVEEKGPYDIARL IHHYNTMLENTERYIREEYQIKRMKKAAELDALVAQINPHFLYNTLESIVWQARAAGAYK ISDMAYSLGKMFNIMVNKGHSMLTVEKELEHVKLYVHLQNNRYNDRFDLKIDLDDEDLLQ LRTLKLILQPIVENIILHGFTKEQEDCTIRIGVYISHGMLVYDVEDNGIGISKDELDSIR ESLKEQNVIEYDEQNAKLRRNSGRGIGLQNVHQRIGLYYGNQYGLEIFSREGEGTEVIIR IPTGKS >gi|229784094|gb|GG667641.1| GENE 51 61342 - 62469 816 375 aa, chain - ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response 
regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 267 358 426 519 526 73 36.0 6e-13 LAEGAYRETGAIEEKASEYGISLPRNAYVVEIGWRRDNAEENDKDRIKIRSCVQKYFRSS DGFVFVNKRQLITVIVSGTLGEAQLERKIRELMREVWAEWGIGLCFGISSMTDITETAGG MKEADIALHQMIFVESEQIFSYQDLIFDEDCTVVQFPAEPVINGVFLNGEADWERELERF FSYFKRKYIYDYRHLDFLCIQMYLELLRYADKRGIPVLETCSRAEFAEELADIYTLSGKQ ALMTAKISQMAKSVTGTEVKSRLVRDIEQIIREDYSSFNMSLNYIAEKTGRTVNYLSAVY KNETGKNINDEIVRCRMEKAKRLVGEGQMKMYRIAESIGYTDSSYFAKLFKRYTGLSPAG YRETVARQKEEKLYD Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:40:40 2011 Seq name: gi|229784093|gb|GG667642.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld35, whole genome shotgun sequence Length of sequence - 35487 bp Number of predicted genes - 48, with homology - 44 Number of transcription units - 12, operones - 8 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 94 - 134 5.2 1 1 Op 1 . - CDS 151 - 1542 699 ## COG0582 Integrase 2 1 Op 2 . - CDS 1520 - 1957 269 ## gi|160939516|ref|ZP_02086866.1| hypothetical protein CLOBOL_04409 - Prom 1992 - 2051 3.5 - Term 2011 - 2058 3.4 3 2 Op 1 . - CDS 2062 - 2289 258 ## Ethha_1353 DNA binding domain protein, excisionase family 4 2 Op 2 . - CDS 2320 - 2700 193 ## COG2337 Growth inhibitor - Prom 2720 - 2779 3.6 5 3 Tu 1 . + CDS 3509 - 3676 129 ## + Term 3851 - 3888 -0.7 - Term 3902 - 3931 1.1 6 4 Tu 1 . - CDS 4037 - 4573 289 ## Closa_2767 hypothetical protein - Prom 4616 - 4675 2.2 7 5 Op 1 . - CDS 4678 - 5292 157 ## Cthe_1746 hypothetical protein 8 5 Op 2 . - CDS 5304 - 5672 265 ## Cthe_1745 XRE family transcriptional regulator - Prom 5713 - 5772 5.0 - Term 5759 - 5796 3.2 9 6 Op 1 . - CDS 5804 - 6766 463 ## TepRe1_0261 hypothetical protein 10 6 Op 2 . - CDS 6795 - 6935 154 ## 11 6 Op 3 . - CDS 6936 - 7814 822 ## TDE0553 hypothetical protein 12 6 Op 4 . - CDS 7811 - 8686 732 ## gi|266621333|ref|ZP_06114268.1| conserved hypothetical protein 13 6 Op 5 . 
- CDS 8713 - 9492 670 ## gi|266621334|ref|ZP_06114269.1| hypothetical protein CLOSTHATH_02491 14 6 Op 6 . - CDS 9520 - 10164 467 ## BCZK1570 hypothetical protein 15 6 Op 7 . - CDS 10180 - 10539 370 ## gi|223985042|ref|ZP_03635140.1| hypothetical protein HOLDEFILI_02444 16 6 Op 8 . - CDS 10551 - 11087 523 ## Rumal_0263 hypothetical protein 17 6 Op 9 . - CDS 11114 - 11665 641 ## lwe0445 hypothetical protein 18 6 Op 10 . - CDS 11675 - 12538 942 ## COG3196 Uncharacterized protein conserved in bacteria 19 6 Op 11 . - CDS 12554 - 19225 6198 ## COG0457 FOG: TPR repeat 20 6 Op 12 . - CDS 19268 - 20242 759 ## EFER_3822 hypothetical protein 21 6 Op 13 . - CDS 20254 - 20664 299 ## gi|223985035|ref|ZP_03635133.1| hypothetical protein HOLDEFILI_02437 22 6 Op 14 . - CDS 20661 - 21611 702 ## PPSC2_c3802 protein 23 6 Op 15 . - CDS 21627 - 21923 325 ## gi|266621344|ref|ZP_06114279.1| conserved hypothetical protein 24 6 Op 16 . - CDS 21937 - 22200 88 ## gi|239624299|ref|ZP_04667330.1| conserved hypothetical protein 25 6 Op 17 . - CDS 22197 - 22637 378 ## gi|266621346|ref|ZP_06114281.1| conserved hypothetical protein 26 6 Op 18 . - CDS 22665 - 23396 885 ## Sterm_2633 hypothetical protein 27 6 Op 19 . - CDS 23462 - 23611 154 ## gi|223985029|ref|ZP_03635127.1| hypothetical protein HOLDEFILI_02431 28 6 Op 20 . - CDS 23617 - 23943 598 ## Sterm_1192 hypothetical protein 29 6 Op 21 . - CDS 23930 - 24187 163 ## gi|266621350|ref|ZP_06114285.1| lipoprotein, SmpA /OmlA family 30 6 Op 22 . - CDS 24220 - 24774 650 ## gi|266621351|ref|ZP_06114286.1| hypothetical protein CLOSTHATH_02508 31 6 Op 23 . - CDS 24797 - 25393 315 ## gi|288870492|ref|ZP_06114287.2| hypothetical protein CLOSTHATH_02509 32 6 Op 24 . - CDS 25449 - 25859 544 ## PPE_01753 hypothetical protein 33 6 Op 25 . - CDS 25869 - 26708 994 ## TDE0503 hypothetical protein 34 6 Op 26 . - CDS 26721 - 27800 865 ## COG0666 FOG: Ankyrin repeat 35 6 Op 27 . - CDS 27808 - 28149 295 ## Vpar_0189 hypothetical protein 36 6 Op 28 . 
- CDS 28191 - 28733 743 ## Sterm_0689 hypothetical protein 37 6 Op 29 . - CDS 28747 - 28980 228 ## gi|239624294|ref|ZP_04667325.1| conserved hypothetical protein - Term 29036 - 29089 3.1 38 7 Op 1 . - CDS 29312 - 29467 160 ## gi|223985023|ref|ZP_03635121.1| hypothetical protein HOLDEFILI_02425 39 7 Op 2 . - CDS 29430 - 29855 80 ## Closa_1120 hypothetical protein - Prom 29935 - 29994 10.5 + Prom 29843 - 29902 5.1 40 8 Tu 1 . + CDS 30013 - 30327 268 ## Sgly_1172 helix-turn-helix domain protein + Term 30328 - 30366 -0.9 41 9 Op 1 . - CDS 30404 - 30550 74 ## 42 9 Op 2 . - CDS 30516 - 31928 1279 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) - Prom 31961 - 32020 4.2 - Term 32115 - 32164 1.1 43 10 Op 1 . - CDS 32177 - 32299 65 ## 44 10 Op 2 . - CDS 32306 - 33160 609 ## EUBREC_3503 hypothetical protein 45 10 Op 3 . - CDS 33157 - 33750 268 ## EUBREC_3502 hypothetical protein - Prom 33771 - 33830 4.1 46 11 Tu 1 . - CDS 33875 - 34174 376 ## Ethha_1912 membrane protein - Prom 34209 - 34268 4.4 - Term 34256 - 34309 12.0 47 12 Op 1 . - CDS 34330 - 34668 328 ## Ethha_1911 hypothetical protein 48 12 Op 2 . 
- CDS 34671 - 35486 769 ## Ethha_1910 hypothetical protein Predicted protein(s) >gi|229784093|gb|GG667642.1| GENE 1 151 - 1542 699 463 aa, chain - ## HITS:1 COG:mlr0475 KEGG:ns NR:ns ## COG: mlr0475 COG0582 # Protein_GI_number: 13470699 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 20 414 25 391 399 123 23.0 1e-27 MASIVKRKSKYSVVYDYTDENGKRRQRWETFSTNAEAKKRKKQIEYEQDSGTFFIPTAKT LNDLLDEYMSIYGVNTWAMSTYESRRGLARNYITPIIGDMLLSDITPRMMDKYYRDLLSV KTVSVNNRKPTSEYLTPHTVREIHKLLRSAFNQAVRWELISRNPVLNATLPKEEHKERDI WTAETLSKAMEVCDDPILSLALNLAFSCSLRIGEMLGLTWDCIDIAPQSIENGSAYIFVN KELQRVTRGALDDLSDKGVIKKFPPCIASTHTALVLKEPKTKTSIRRVYLPKTVAYMLVE RKKEIDELMDLFGDEYIDNNLVFCSSNGRPMESQVINRAFNKLIKENGLPHVVFHSLRHS SITYKLKLNGGDMKSVQGDSGHAQVKMVADVYSHIIDEDRCINAQRLEEAFYSSKTPDPV EDTEPKTADTAVTESDESDAAKILELLKNPETAALLKQLAKAL >gi|229784093|gb|GG667642.1| GENE 2 1520 - 1957 269 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939516|ref|ZP_02086866.1| ## NR: gi|160939516|ref|ZP_02086866.1| hypothetical protein CLOBOL_04409 [Clostridium bolteae ATCC BAA-613] hypothetical protein HOLDEFILI_02459 [Holdemania filiformis DSM 12042] conserved domain protein [Clostridium hathewayi DSM 13479] hypothetical protein CLOBOL_04409 [Clostridium bolteae ATCC BAA-613] hypothetical protein HOLDEFILI_02459 [Holdemania filiformis DSM 12042] conserved domain protein [Clostridium hathewayi DSM 13479] hypothetical protein FP2_00500 [Faecalibacterium prausnitzii L2-6] # 1 145 1 145 145 280 100.0 3e-74 MQKQRVKTSMSVPEMGKMLGLGKVESYWLVKKNYFKTIQVAGRMRVMLDSFEDWYAGQFH YKKVDGTPPGEKWRHTTMSVPEMADLLGLKSGTAYDLVKRGYFETTLIDRRIRIITSSFE AWYQKQTHYVKISERSNENGIYREA >gi|229784093|gb|GG667642.1| GENE 3 2062 - 2289 258 75 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1353 NR:ns ## KEGG: Ethha_1353 # Name: not_defined # Def: DNA binding domain protein, excisionase family # Organism: E.harbinense # Pathway: not_defined # 15 71 10 66 70 68 56.0 1e-10 
MFEARIAELNRFNEQNPVSYDKRTYTVDEIQDILGISRPTAYNLVKQGVFHSVRVGGHIR ISKKSFDDWLDHADE >gi|229784093|gb|GG667642.1| GENE 4 2320 - 2700 193 126 aa, chain - ## HITS:1 COG:BS_ydcE KEGG:ns NR:ns ## COG: BS_ydcE COG2337 # Protein_GI_number: 16077533 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Bacillus subtilis # 6 115 2 111 116 112 52.0 1e-25 MKEDWIYKRGDLYYANLNPYFGSEQGGTRPVLVLQNNVGNFFCPTLIVAPLTSKWIKKKE LPTHYALESVPELGLKSVVLLEQIKTIDKRRVLSYIGRVSREEMRAIDDALQVSLDIHIP EEMEAP >gi|229784093|gb|GG667642.1| GENE 5 3509 - 3676 129 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFLEIAAVAVLLLVFFKIFARTFHDNAVIFITEHFDCFNKKAPSKATIAACLTER >gi|229784093|gb|GG667642.1| GENE 6 4037 - 4573 289 178 aa, chain - ## HITS:1 COG:no KEGG:Closa_2767 NR:ns ## KEGG: Closa_2767 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 178 11 188 206 153 43.0 2e-36 MRKESTTIQKCNSRLTALGLNSDAAFHRSKLILKIYRDVVWVLSERAEELQETAWIMGEQ DIESGLCYLENFAPDIELQAFEEKVCCLVQNQMLVNIIDRALLRLKRYPDRGELYYEILT KQFIYRFNSTEKELLEELNIERSVFYDRKREAIYLFSVCLFGYSIPEVLEELPRLNPD >gi|229784093|gb|GG667642.1| GENE 7 4678 - 5292 157 204 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1746 NR:ns ## KEGG: Cthe_1746 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 4 204 14 250 262 106 32.0 7e-22 MSIPQIRTEAIEAVGRKILMEYDPSLLSGQPCPVPIETIIETKFDLILEFHTLRKNPKIL GETIFDDGAVVLYDQIQRQYRMIAVRAGTILIDERLCDPSKLGRLRFTCAHELAHWVLHK KLYSGTGDVATYNGNVSSDESHGIIERQADTLASALLMPLPQIKKCFYRLKIGRTDEQLI AEMANIFEVSKQAMQIRLKSRNLI >gi|229784093|gb|GG667642.1| GENE 8 5304 - 5672 265 122 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1745 NR:ns ## KEGG: Cthe_1745 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.thermocellum # Pathway: not_defined # 6 121 5 126 126 87 40.0 2e-16 MGNITFGSFIAEKRKAHKFNLRDTAKHLNIAYGYLCDIEQSRRPAPNGDFVERISAFLNL DKSEHELLLDLAAKSRNTVSADLPDYIMEKDIVRAALRVAKEVDATDEEWQTFMKMLKER KR >gi|229784093|gb|GG667642.1| GENE 9 
5804 - 6766 463 320 aa, chain - ## HITS:1 COG:no KEGG:TepRe1_0261 NR:ns ## KEGG: TepRe1_0261 # Name: not_defined # Def: hypothetical protein # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 17 318 19 322 327 232 40.0 1e-59 MQNSTNMRILELLRFLYERTDENHPATVSDIIAHLNGKGIQAVRQTVYADTNALIDAGID IVVVKSTQNQYFMGNRLFEYPELKMLTDAVASSKIISAKKSEELVQKLCRLTSLHQAKQL QKFAALSSRVKPHNEKVYYIIDNIQTAIGNHQQIRFQYYEYTQEKKKILKHDGYYYVVNP YALEWKNDHYYLIGFSLKHQKIAHFRVDRLTSVENLETYFMPIEGFDVASYTNKMVDMFT SESSKEVTLLCENELMRVIIDHYGEDAAVDRYDDTHFTAKIEVNPSGTFYGWIFKFKGKI KILSPKECITEMQQIAQEFI >gi|229784093|gb|GG667642.1| GENE 10 6795 - 6935 154 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNGVVLLVGGMLLVAVIKLIVDRNWMVLLLCAVALFLVFGAGHHTK >gi|229784093|gb|GG667642.1| GENE 11 6936 - 7814 822 292 aa, chain - ## HITS:1 COG:no KEGG:TDE0553 NR:ns ## KEGG: TDE0553 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 3 292 10 299 299 298 47.0 2e-79 MSNKAFQQNLDDKKGLQPGGPYLIQMLFKEPVEMPDKEKMTAVMEKHIGSTECFCYDKKM AGFAAQEHIAEFKDGKCPVQLMVMKCDKFKGKGFDAFLMSQMWDCQEDRERIFRECKYQV VATDMLAAALPALERANLDADFLEALAELYPTCEAFYFQNCGKLFLAEDVRSHQIEGSDR FIRFGVNVRFFNIEGTEDMLIDTVGMSTLFLPDLQYHFHDMDPNWVVNHAYNVASYILEH DNPIQDGETIDGVADGRMCREIQWKCQYEDALIQPPRGVLDIHMGEYAAGGR >gi|229784093|gb|GG667642.1| GENE 12 7811 - 8686 732 291 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621333|ref|ZP_06114268.1| ## NR: gi|266621333|ref|ZP_06114268.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 291 4 294 294 554 99.0 1e-156 MEQKRPADIIQELLDYLWNGLGLEEKGWKRLKKGDFKKKMKNGLTYQIWFDRSRYNYIDY EIGHGNVEVGFSCIIKQGDDYLYSFRIEPTTGGSFFRMLTEDLRLNTELLDTFLPLIQAH YLDFISRFEVDPAEALQPVCAPFIQPEDYSWCIHVREQMVERYGTSEQLAEYRHQAELRG TPEHKAKNWMGSMLFHLSHANDVDQAWVSSRTREELDQVVEPFVQAKRQTGQWTQEDEAG YQLYRQETDPKKRTFRVWYLIANPRGLPKEFVQKELEFRWKLFPEKKEETK >gi|229784093|gb|GG667642.1| GENE 13 8713 - 9492 670 259 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|266621334|ref|ZP_06114269.1| ## NR: gi|266621334|ref|ZP_06114269.1| hypothetical protein CLOSTHATH_02491 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02491 [Clostridium hathewayi DSM 13479] # 1 259 1 259 259 491 100.0 1e-137 MGLDIYAGPLTRYYSHNWKTVVQQWAEENGYSFNRITPDGEPADNEEEMSPAEVQAAVEN WRDQILSAISQPGQPPYAPWLEDNEKPYYTDKPDWDAFGAMLLVAACRTYEEPVPSTVEK DWIFGEHPLVARLASDEERVWSLFRGATWWLPLSDNFLFQGSLPTDDTAAIATLGGLRKE LEKLNQLAWQADEDTILGWADTEGYPVDGTVDSDGQYSKADIPEHTQYDTQSLAKFAFSM FWRAMRFAEEQQVPILLDY >gi|229784093|gb|GG667642.1| GENE 14 9520 - 10164 467 214 aa, chain - ## HITS:1 COG:no KEGG:BCZK1570 NR:ns ## KEGG: BCZK1570 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_ZK # Pathway: not_defined # 7 209 5 212 213 89 29.0 7e-17 MLAKRFEHILHDLGMAGLEHPLFYHAPVGIRFKIGGEEPIYLDRRAAKLKTNPAYVQGAL DRAAAIYRALPAVPDLLRIDGYPDEEPAESLLTVIRQRVGLPVPDEQLSATEQDEDGDTH AQVQFYWNLSKISFQPELLLREIILGDIGGWNGFVSSVYLAGPGPFLYHLYDDRGLDVLG GSQKLLLPLYHQFHDWILEYDLEKIDQMFAPAKE >gi|229784093|gb|GG667642.1| GENE 15 10180 - 10539 370 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|223985042|ref|ZP_03635140.1| ## NR: gi|223985042|ref|ZP_03635140.1| hypothetical protein HOLDEFILI_02444 [Holdemania filiformis DSM 12042] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HOLDEFILI_02444 [Holdemania filiformis DSM 12042] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 119 1 119 119 213 100.0 3e-54 MSQIASFYLIKNNQRQELSDGDCSGAVYMAIWDWCESELDLDVRFPAPQTEDTLDCALLE GELASQLLAALREQDLPELAAEIAPDWDLPTEAVQSGLNTLRSHLELVQGDAALLYEMT >gi|229784093|gb|GG667642.1| GENE 16 10551 - 11087 523 178 aa, chain - ## HITS:1 COG:no KEGG:Rumal_0263 NR:ns ## KEGG: Rumal_0263 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 9 172 3 162 163 149 48.0 5e-35 
MSNSRVSTWSGIRNKLENDYLCPALRGHIQYFATSYSKSADHEGRAAIRVDGVEVLRSNY YTYFENVWTKFHHLRSTTLKDHDSAKEAINQAHAYALEQGTFDQKVFYEAFGIFDNQSIE KSLVSENPLVRIFALLDRRLGKRRLLALEDSMEQELDWVRAFYVIRLQAEGLMEANNI >gi|229784093|gb|GG667642.1| GENE 17 11114 - 11665 641 183 aa, chain - ## HITS:1 COG:no KEGG:lwe0445 NR:ns ## KEGG: lwe0445 # Name: not_defined # Def: hypothetical protein # Organism: L.welshimeri # Pathway: not_defined # 1 176 1 179 188 152 46.0 5e-36 MEKKDFPKGTKPREIFVYTCERIAEPLNPLGYKYRKSKNDIYKKDGIFVFSFYFSPSIRF GSTTFTAFFDVSSPVIAQWRSEQEGTEETYDGIVGTSIARLTHRYDDFPRYEVSTLLERE RSIQEISGQIQDYALPFFARFSNLPKLLDDVEQEGFFPHRKGFDVPKRNREFIECFREYL QKQ >gi|229784093|gb|GG667642.1| GENE 18 11675 - 12538 942 287 aa, chain - ## HITS:1 COG:yieJ KEGG:ns NR:ns ## COG: yieJ COG3196 # Protein_GI_number: 16131585 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 116 286 8 194 195 169 46.0 7e-42 MTEYQKTYIELKKQFVATNEGPDNVRALYTFKEELEQSEDQQAKEVLVDVYDLLDFKKDA YELLCQIGNRSDKKTLKRLGTLKDYAENWGNHYALPKPKTPEEKQKEKERQAQLGLPAFR YHPNPLETGAFEESADGVVCDCCGKTTHIFYTAPFYAVEDIAYLCPECIANGEAARKYDG SFQDDFSVDDGVDDPEKLDELIHRTPGYSGWQQEYWRAHCGDYCAYLGHVGARELRALGV LEEVLDDPMWDDEQKKMIQESVNGGHLQCYLFQCLHCGKHLVWMDFD >gi|229784093|gb|GG667642.1| GENE 19 12554 - 19225 6198 2223 aa, chain - ## HITS:1 COG:FN0819 KEGG:ns NR:ns ## COG: FN0819 COG0457 # Protein_GI_number: 19704154 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 455 560 13 109 665 94 47.0 2e-18 MGAWGIKALERDEGLDVLDILKNEYVPEHPVMDLGEMIELMKEEVMLGSDFSQIDFLFDN TAMALAELYFQWKDNSKLGYDHEEAIWDKVTGFTASKEALAFLLRQLTDIKNEVPDEDGI REIVDLWKNEDSGEIAPAWLEHLNQLIDRLDSEQEARQMYIKKYWGNFIGGSDDSLNLVA FLEDQKKEEIPLSEIFAKIGLDKQSWDFRQTVEYLEFTHSDGVEMDFHFAIDVVTDLAAI LLECSVSGSVNLQDLDEYNTPVCRIRITATPEEHDAMNKSLADFAQNPLEYDISEMMGED EITDMAYQVEMLRKELYEASGRNRNYHVKAEDVKHLLPDWEGADGCIATNRITVEGYKVG YCYREKPDGGWDSGWRFTAGDESEAYMDDPNNAGIYKLNTICNDDPDIIPLLNTSAPCAF 
ERDENGVFQQIKDWKPDEDEEDPDMDILKQCQKWHEESKQHKIIDALEAIPAEERTPEMD SELARAYNNLADPHKPTCKEMLKKALALLKPHEEYFEDDYYWNFRMGYSYFYLDQEGRAL RYFEKALEVRPGDDDTKEFIERCKKGISLPQFWECFRERTEDWWETFAEMEAELRQMMDE DKDHTRGAELVAQMQETLNLVFDEISFEMGFNGKKHELILTPEGDKVKLFELVYFQKHAP KEVLEHWNILVGRQPLQNIGLRTEDGWDISGEDVQIWLEEQGENSFAISAYCEKLLPMLR EEEGRAWWMLTTLTDQVLGEIPHMRYIDSFDVLEEPKAEPSFLLSQLPDKLREQGLELST DPEAYLESYLGYKMEPKQDPDADWRLDVMAGSTCCVPLINGYLNADNDFMDDLHADGAVA GFFCYPLDTLREEEGSQKIFDFRDKLEEVLTGGDGSEVLTLTGGATGLYCGYVDFIAWDI QEALNMAKEFFEGTDIPWAIFHTFRREAGSVSLKQQDDGTETENQDDELDETLTGMDYIP YTQQNAEAFFAQLEQWNDEDEYTRCIQALNAIPENWRNYRTAYALARALENYAIIGDHDE GTLKSKGDKALLRAIEVLESVREEGQDKAEWNMRMAYGYQYLYGQEEKAIPYAQRWAELD PEDENAPAVIRECKAEIRKRQRSRKKKAKFVPGDTPFEGFDLTNFWDDNWYALKEYVSDP PSDELIASVEEELGYKLPAAYIWLMKQHNGGIPVNTCYPCDEPTCWSDDHVAITGIFGIG REKSCSLCGELGSQFMIDEWEYPAIGVAICDCPSAGHDMIFLDYRACGPQGEPAVVHVDQ ENDYKITHLADSFEEFIRGLEHESLYDPDEDAEGLEDDADEEETDHKGSFAGSVLLSKAE WDKEQLIRDLREEWGIVDEEPDEGDEDVENSDDAVVMRVGNMMLIVTLFHGHIPDNEAEI NAENNYMWPEAVEVAKAHKAHIMVAVLGEEEKLLERGKLFTKAMAVCCKQKYATGVYTSG VVFEPRFYEGLADMLKKDELPIFNWVWFGLYRSEGGLNGYTYGMDVFGKEEMEVLNTDAE PEELRDFLASLASYVLACDVTLQDGETIGFSADDKHTITRSPGVSLPEEQMTLKIGYEPI KGDPEDDSCDHSDNDDTQDEEEFSNPEVYTEEEMEAVEGHIEQYFGKVENVFHELVSPDI HVDICIVPPTEERDYYTLVTMGMGAHRMNVPEELAEYKLERAELAIALPADWKLVQESMQ DERWYWPIRLLKTLARLPIASDTWLGFGHTMDNEEDFAKDTKLCAAILTGPQDTEDGNEV CILPSGEEVNFYQVIPLYRDELEYKLAHDADALLDKMNGISFVVNPTRQDAITRGTLSND DFDGEMDDASYHLESIEEKELPIDPINAYNHMAIYLRWCMEHDLMGEDFLKEYSEVAKQV KADPASVDLRAFIQDELDGCLFSVLFNQQGRAFAGYYYGEGDSPYYPADIDDNALRFFGP ERYHSNEFQQEAYLFIPFDEDYYQAMAEVIGERFENWQGQDFDEDTLEPSEVAQAIMEYL DCECTYFPSMADDDPIMSAYSYAQRLGVREGFVPVLIKADDETLLECLVMNADPEHNADF YEFDLKTVEEYRKKMLSAPIKDGKAVLEELTGQRKEEAEDDDMDWEEEVLGEMEGGEPND RFANYWNDDTGMTYPLILAKIPVKNPWEIFAYLPFGNWNECPDTPDLMAVAKYWFEQHGA IPAAMSHDELEFELPTPISKERAMEVAVEQYGFCPDLDQNEDGSIGSLADVLWQSTVWYF WWD >gi|229784093|gb|GG667642.1| GENE 20 19268 - 20242 759 324 aa, chain - ## HITS:1 COG:no KEGG:EFER_3822 NR:ns ## KEGG: EFER_3822 # Name: not_defined # Def: 
hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 7 320 6 324 324 203 36.0 9e-51 MNKRLFRTQFNQMENIEKQVLMESLAARYDMTFLGLHTFDRWGQNCTTGIFKKDGREFVF VPGDTVTLGWEQFAEGLNQESREELEYLFREWEMEQNPAELIGESMALVRQAAIGPMLVG RELEELCWESVKMDDPRLTAHPDWPKEFRDFAWSDSSSLTLHQSARIERTEDGFQTWIYN RTDYDELLTGLEKQGLSLPTADEWAYLCGGGCRTLFPWGDGLDYSMRLRWFEDMDEDENR PYDMEEPNFFGLSIAYDPYMREVVQADRLTTCGGDGGCNICGGLGPFLGFLPCSPHCKPE VQEDKELNGDYDFYRPIIRVENHD >gi|229784093|gb|GG667642.1| GENE 21 20254 - 20664 299 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|223985035|ref|ZP_03635133.1| ## NR: gi|223985035|ref|ZP_03635133.1| hypothetical protein HOLDEFILI_02437 [Holdemania filiformis DSM 12042] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HOLDEFILI_02437 [Holdemania filiformis DSM 12042] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein FP2_00350 [Faecalibacterium prausnitzii L2-6] # 1 136 1 136 136 274 100.0 2e-72 MKQSDIYTEALTCLRSILLADHPEFQNWIDWLERDIQDWNQRREVAHHLRAYGGMGSFND LPSMRGNHDYIFDFLKSVCYAFGHLYGKREGISPEALMEECLHDVEQAAYHPHKALNQAI AQHLMQGDLQENLDRL >gi|229784093|gb|GG667642.1| GENE 22 20661 - 21611 702 316 aa, chain - ## HITS:1 COG:no KEGG:PPSC2_c3802 NR:ns ## KEGG: PPSC2_c3802 # Name: not_defined # Def: protein # Organism: P.polymyxa_SC2 # Pathway: not_defined # 9 260 5 250 306 158 36.0 3e-37 MNKPIKAKNLPLFSIIDLDQLRREKHLEGTEVTDFFTARDGKVYLLMEQPSETQGKDWLS TPSTYTAVEIQLDWAEQRVLETTLFPLGLLKFQFHYLRPAGDHFLLLGARCAYRENGPDQ NAWIVSRDGAVLSRFCLGDGIQDCVVKKDGTIITSYFDEGVFGNYGWDEPLGACGLIAWT SEGTPLWKNENYSIYDCYAISLDEEENLWFYYYDEFRLVRTNFKEDFVFELPIEGSGAFS VAPSGNTFLFQGGYQQRDKFYFLTAHGDHLGKKQEAIPTCDGNKVAVEQCSLLRSRMLFL GKDGVLYGGIWGSDGQ >gi|229784093|gb|GG667642.1| GENE 23 21627 - 21923 325 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621344|ref|ZP_06114279.1| ## NR: gi|266621344|ref|ZP_06114279.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 98 2 99 99 
184 100.0 2e-45 MEYIKLFWEDAPEGEPSVILYEVDTENERLALHSIDIFADGRTRNIPDLYDGAIEITPIP TVEELNAHVWGEEFHACVIEQAEFEAAWESHTYDGALK >gi|229784093|gb|GG667642.1| GENE 24 21937 - 22200 88 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|239624299|ref|ZP_04667330.1| ## NR: gi|239624299|ref|ZP_04667330.1| conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] S6 modification enzyme RimK [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] S6 modification enzyme RimK [Clostridium hathewayi DSM 13479] # 1 87 1 87 87 174 100.0 2e-42 MKDLTTQTGIIVKCSKTAIEFFQNAQSVDFFSALEIPKEFQDIAVEFYDLIMENDHLAAL LGCRGNYDIAIQIDEVTGTMTGWHWFK >gi|229784093|gb|GG667642.1| GENE 25 22197 - 22637 378 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621346|ref|ZP_06114281.1| ## NR: gi|266621346|ref|ZP_06114281.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 146 1 146 146 308 100.0 9e-83 MKDLCQYGNRPEDEWEILPWIPDPRPPFKIWVKPEQIAPFFLIPHHPYALSLLLKINNGF RTEVFRRLGLTGSSGDWERLVRGVIQEFEENNSGVDLFHFDSDEDVFCVYSQYIDDLMLL SKMIRAACDNEKTMGMYLNMSEVAKA >gi|229784093|gb|GG667642.1| GENE 26 22665 - 23396 885 243 aa, chain - ## HITS:1 COG:no KEGG:Sterm_2633 NR:ns ## KEGG: Sterm_2633 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 2 243 3 244 245 195 41.0 1e-48 MPTTEWLNKYEAIKDKLTCKDDLEAHFTEKVIGNMAVDVLDIGTVHFPTGQIFACDPLVE LEDTLPFLQTIPAGTYPVKICVVPSEQYGDRYACVKVEVSQEKPVRYELGMVGNEDLDEE LGEDEYFGFGVDAGMGCVADIQTQAAFKAYWAKRLEEDPDIDPYNDLFCDLLEENAKAYP KYQGDCGDWLNWTVPGTDCNLPIFASGWGDGYYPVYFGYDAKGKICAVYVRFIDIEASYR EQD >gi|229784093|gb|GG667642.1| GENE 27 23462 - 23611 154 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|223985029|ref|ZP_03635127.1| ## NR: gi|223985029|ref|ZP_03635127.1| hypothetical protein HOLDEFILI_02431 [Holdemania filiformis DSM 12042] hypothetical protein CLOSTHATH_02505 [Clostridium hathewayi DSM 13479] hypothetical protein 
HOLDEFILI_02431 [Holdemania filiformis DSM 12042] hypothetical protein CLOSTHATH_02505 [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 83 100.0 5e-15 MAEKELALCDECGSLFFKGSSQMMGLCQNVPMFSMATRIVTIIFRMAGV >gi|229784093|gb|GG667642.1| GENE 28 23617 - 23943 598 108 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1192 NR:ns ## KEGG: Sterm_1192 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 108 1 108 108 103 59.0 3e-21 MKTFDPNYKLLDEMYQDNYYPTFLVDKVKDELQKVIDLLESGETDTEVVQETLDEAVCGI NDLQEEFDENDSEIETVARECIAATVAYILEWFGIPIDTEEAIRERDW >gi|229784093|gb|GG667642.1| GENE 29 23930 - 24187 163 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621350|ref|ZP_06114285.1| ## NR: gi|266621350|ref|ZP_06114285.1| lipoprotein, SmpA /OmlA family [Clostridium hathewayi DSM 13479] lipoprotein, SmpA /OmlA family [Clostridium hathewayi DSM 13479] # 1 85 13 97 97 159 98.0 9e-38 MQNPQNISPLKFGMSQDEVIEIFGKPDAVSTMRSDGKPLILKYHEIELHFDSKAPHGLYL IYSDDEIELSMTAEHEERSNPYENI >gi|229784093|gb|GG667642.1| GENE 30 24220 - 24774 650 184 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621351|ref|ZP_06114286.1| ## NR: gi|266621351|ref|ZP_06114286.1| hypothetical protein CLOSTHATH_02508 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02508 [Clostridium hathewayi DSM 13479] # 1 184 1 184 184 362 100.0 7e-99 MIKERIPISGDLGSKVKQLMEYAGWYEGRSVDISIAEQYYADHGVPMMKTTQRFYRKYFG LCCEWYLAQKKLKWAADFEFALFPYLVNGIKNHLEDAYFRDMSGCELAEIEQAAGEKCQP IGHIGYYYPAEVWISEYGKLYAKYEYQDEIECFPDVFALIERELRQCKFDSAAMKTVEAL DGKL >gi|229784093|gb|GG667642.1| GENE 31 24797 - 25393 315 198 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870492|ref|ZP_06114287.2| ## NR: gi|288870492|ref|ZP_06114287.2| hypothetical protein CLOSTHATH_02509 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02509 [Clostridium hathewayi DSM 13479] # 1 198 27 224 224 414 100.0 1e-114 MKQNKHQLEHLPTLLVQQIRLQWYKDCRGGNAAAQRNQYPRAMRLPKDFFSYYPFGLPTH FASIIQRPDGFQIDRDCRRLMEWKPNGTMRLHPFELIQQEPGIHVRYRNDWHIGAIPERY 
TYDKTGQKQPLNELALDLIPGDYGRAVCNGRFRDWDTGVWYYVLDILNVMPLAEPTDALT SFTDREPNKIYTQIDRLW >gi|229784093|gb|GG667642.1| GENE 32 25449 - 25859 544 136 aa, chain - ## HITS:1 COG:no KEGG:PPE_01753 NR:ns ## KEGG: PPE_01753 # Name: not_defined # Def: hypothetical protein # Organism: P.polymyxa # Pathway: not_defined # 1 131 7 133 136 72 34.0 4e-12 MEKATAVNTCLGVLKGRDCIYLDQVKQDALNNLTFTGDINGHLISQCRDEKDWFPYTLTF RQVLAYFTCELDTYENMAGTEYLDGSSFDLIEDSTWLKSLPVREDFDKGIYRHYRLFTYD DVYNIIAVSYEFAAEL >gi|229784093|gb|GG667642.1| GENE 33 25869 - 26708 994 279 aa, chain - ## HITS:1 COG:no KEGG:TDE0503 NR:ns ## KEGG: TDE0503 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 3 279 16 283 283 179 37.0 1e-43 MFEEFYEMYESEEQEVVALINRCIGGGYNNRGNFWQMTVVTLGMVFCDTGKVTTKEERLE WPVTDEERNSEKGWGRFQNEQICRLKIRRMKEEWAKDLVAWPWCISEVVKTHEDCPELQA VLDECHKPVVIQDQVLGKLELDKDHDAFEGEIQWRGKDVLLSLEVNAESKPSWTRARSAA KKLLADCETWDKAMRELAAKNLTELANNWLSQDEEPPRNPETDPITEEELARRISLTSLS VTSGGSFTAWFDCDEMFTDHAVTIYGSLKKGLKTANIEG >gi|229784093|gb|GG667642.1| GENE 34 26721 - 27800 865 359 aa, chain - ## HITS:1 COG:all2748 KEGG:ns NR:ns ## COG: all2748 COG0666 # Protein_GI_number: 17230240 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Nostoc sp. 
PCC 7120 # 30 232 122 290 426 64 28.0 3e-10 MYQIAYIGRWETLPETAAAICDYDTSKLEVLLQGGLDLDDPIQLSEYIKLTPLEIAVFRN DVPMIHFLLGHGADPGLAEEQPLLLTAARCCGPEVVALFAGQAAKLSLKQKERAFQEVRW GKRPENIPVLEQAGITVDKFGGEAFRAAVSEGNAKLARLLLEKGADINYHKPDMVFPYAS TPVTEAARSNNFSMVRWLVEQGANITLVNKYGDRPYSVAVQNKNQEMADYLKALEPEDWH NEQEKVRQLMPYKLPAKLVEYLKTGPLRLEFPEQEWVKWAELYAYMDVQEMTWKRKKLLS LMAAMDNYSDYLLLWSPRDKKLWYLDIEHEEFHSLAKWDDFIADPGRYLNGMIEGEFEE >gi|229784093|gb|GG667642.1| GENE 35 27808 - 28149 295 113 aa, chain - ## HITS:1 COG:no KEGG:Vpar_0189 NR:ns ## KEGG: Vpar_0189 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 15 99 12 95 107 90 51.0 2e-17 MIGTDENRAVLHVEVIFWSGKRKIPPSLVSGKYCPHFVVTGTTEYLGVCFLDGTECTFDT PALGNAQPLYPDTIDYAPLENNAEFLIYEGANAVGKGRVLGRTVPYKVKQQRK >gi|229784093|gb|GG667642.1| GENE 36 28191 - 28733 743 180 aa, chain - ## HITS:1 COG:no KEGG:Sterm_0689 NR:ns ## KEGG: Sterm_0689 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 5 180 2 177 177 175 51.0 6e-43 MMTEKYPTWLIGHVKEWAEKRLPTVILCSTTGNELLEVWYYGDLLTVKGEPQSYIIDSDE APGLVTARDPESGEEFVIFDGGRHGYDNMFCDEHDPSELEHRPLKRYEIPASKLVLELGY SIDYEDEKENFEPDEADTVELINGERMPWEQAKRDGIDYIALYYVNEKGKPVQILDAELA >gi|229784093|gb|GG667642.1| GENE 37 28747 - 28980 228 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|239624294|ref|ZP_04667325.1| ## NR: gi|239624294|ref|ZP_04667325.1| conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] cytochrome D ubiquinol oxidase subunit II [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] cytochrome D ubiquinol oxidase subunit II [Clostridium hathewayi DSM 13479] # 1 77 1 77 77 144 100.0 3e-33 MNSEQITAFLQEHWYIASVLIGAVILFGAIRNWNWLCDPTGTRDAHRHSRGYRRVVFFLL GVLLIVVSIWGFVLKLK >gi|229784093|gb|GG667642.1| GENE 38 29312 - 29467 160 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|223985023|ref|ZP_03635121.1| ## NR: gi|223985023|ref|ZP_03635121.1| hypothetical protein HOLDEFILI_02425 
[Holdemania filiformis DSM 12042] hypothetical protein HOLDEFILI_02425 [Holdemania filiformis DSM 12042] # 1 51 130 180 180 108 98.0 2e-22 MVDHAEYIIGVFDNQKKLRSGTAQTVNYALHQGKVITLIHPDTMEITAPAP >gi|229784093|gb|GG667642.1| GENE 39 29430 - 29855 80 141 aa, chain - ## HITS:1 COG:no KEGG:Closa_1120 NR:ns ## KEGG: Closa_1120 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 120 1 118 164 104 44.0 1e-21 MNQYACAITGHRPTRFCFGYNEEDPLCQKLKECLLEQFRILHDEKFVRTFFVGGALGVDM WAGEQLLTLRAQTGYEDIKIIVVVPFIGYDSKWPDQSKHRLKKLIQNANDSIVISHSADV SSYKKKELLYGRSCRVHHRCV >gi|229784093|gb|GG667642.1| GENE 40 30013 - 30327 268 104 aa, chain + ## HITS:1 COG:no KEGG:Sgly_1172 NR:ns ## KEGG: Sgly_1172 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: S.glycolicus # Pathway: not_defined # 1 100 1 100 110 124 55.0 2e-27 MDEEFIRNRITELRLKKGVSEYQMSMELGQNRSYIQAISSGRSMPSMKQFLNICEYFEIT PLQFFDAQENNPQLIKKALDGMRKMSDDDLIMLIGFISRLNTEN >gi|229784093|gb|GG667642.1| GENE 41 30404 - 30550 74 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLKKKKTTASDKLVLVKSVRRTRSHRMKFCVQRGLGALPSTSSLWRNG >gi|229784093|gb|GG667642.1| GENE 42 30516 - 31928 1279 470 aa, chain - ## HITS:1 COG:SP1056_1 KEGG:ns NR:ns ## COG: SP1056_1 COG3843 # Protein_GI_number: 15900926 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD2 components (relaxase) # Organism: Streptococcus pneumoniae TIGR4 # 24 173 18 180 402 80 33.0 9e-15 MATTRIMPLHVGKGRTESRAISDIIDYVANPKKTDNGRLITGYACDSRTADAEFLLAKRQ YLAATGRVRGADDVIAYHVRQSFRPGEITPEEANRLGVEFAKRFTKGNHAFVVCTHIDKS HIHNHIIWSSVSLEYDRKFRNFWGSTKAVRQLSDTICVENGLSIVENPKPHGKSYNKWLG DQAKPSHRELLRVAIDNALSQSPANFEELLKLLKESGCEVSKRGKSYRLKLPGCEKAARM DSLGEGYGLDDLQAVLSGKKAHTPRKKIVTQAETQKVNLLVDIQAKLQAGKGAGYARWAK VFNLKQMAQTMNYLSENNLLEYAVLEEKAAAATAHHNELSAQIKAAEKRMAEIAVLRTHI VNYAKTREVYVAYRKAGYSKKFREEHEEEILLHQAAKNAFDEMGVKKLPKVKELQTEYAK LLEEKKKTYAEYRRSREEMRELLTAKANVDRVLKMEVEQDVEKEKDHGQR 
>gi|229784093|gb|GG667642.1| GENE 43 32177 - 32299 65 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDTIVKFVVNTKPTKNFQAKVMHPTFVNPVLRYQQKSRLN >gi|229784093|gb|GG667642.1| GENE 44 32306 - 33160 609 284 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3503 NR:ns ## KEGG: EUBREC_3503 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 275 1 277 287 375 68.0 1e-103 MIKTARQLKDLIRNLSREKSADAQILMRNYMMERFLERISLSEYRDKFILKGGMLVAAMV GLDARSTMDLDATIKGANVNVEDIENLISSIVTVPIDDGVKFQLKSISEIMDEAEYPGIR VSMSTTFDGVVTPLKIDISTGDAITPREVRYSFKLMLEDRSIDIWAYNLETVLAEKLETI ITRTTTNTRMRDFYDIYILEQLHGTTLNPKILHDALLATAHKRGSEKYLNQAEEVFDEVE NDSVMQKLWEAYRKKFSYASDLEWDVIMKAIRRLYVLCEKGISL >gi|229784093|gb|GG667642.1| GENE 45 33157 - 33750 268 197 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3502 NR:ns ## KEGG: EUBREC_3502 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 197 1 197 197 274 65.0 2e-72 MGQFEQLDQMLTAQEGMLRTSQVVSSGISKPVFYDYVRSRDLDRVAHGIYLSKDSWVDAM YLLHLRFEQAVFSHETALFFHDLTDREPIEYTVTVKTGCNPSKMKAEGIQVFTIKADLHD VGLTTAKTPFGHTVPVYDMERTICDLLRSRSRIEIQTFQGALKAYARRKDKNLRALMQYA GMFKVEKILRQYLEVLL >gi|229784093|gb|GG667642.1| GENE 46 33875 - 34174 376 99 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1912 NR:ns ## KEGG: Ethha_1912 # Name: not_defined # Def: membrane protein # Organism: E.harbinense # Pathway: not_defined # 4 99 1 96 96 79 56.0 4e-14 MKILKGLLMIITAPVILVLTLFVWLCTGLIYISGLVLGLLSTVIALLGVAVLITYSPQNG VILLVMAFLISPMGLPLAAIWLLGKVQSLKFAIQELVYG >gi|229784093|gb|GG667642.1| GENE 47 34330 - 34668 328 112 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1911 NR:ns ## KEGG: Ethha_1911 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 111 1 111 112 126 66.0 2e-28 MAKRKRDMQLNFRVSAEELAVIEQKMSQFGTSNREAYLRKMALDGYVVKLELPELKELVS LMRRSSNNLNQLTRKVHETGRVYDADLEDISQRQEQLWEGVEEILTQLSKLL >gi|229784093|gb|GG667642.1| GENE 48 34671 - 35486 769 271 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1910 NR:ns ## KEGG: 
Ethha_1910 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 105 263 1 166 254 105 37.0 3e-21 NTNDLNTALYEKMAAEQDKFRDWLKRQPPEEILHHTYEYTVREDIVMAMEELELTDAQAQ ALLESPSPLADVYRYFEKLETGHMDVIRDSIENRADDVCRAKEELRTTPVYPHSAAYARE HGELEQYRASNNANLQCKESIEAAVREHFDGMYLSHDAANGVIEIYGMERVSMVLSNTVQ LQDWDGRYSRRNKEWAKTIPNDNPETVRCGYALNSHPAVLDGFIDLVRVEQQRSRTQGEK LQPSRPSVRDKLKQELPAHKPAAPKKREPER Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:44:30 2011 Seq name: gi|229784092|gb|GG667643.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld36, whole genome shotgun sequence Length of sequence - 57432 bp Number of predicted genes - 59, with homology - 54 Number of transcription units - 20, operones - 17 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 99 65 ## 2 1 Op 2 . + CDS 96 - 1148 1202 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 3 1 Op 3 . + CDS 1228 - 2355 1317 ## COG0628 Predicted permease 4 1 Op 4 . + CDS 2451 - 3368 1048 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) 5 1 Op 5 5/0.000 + CDS 3406 - 3876 520 ## COG0534 Na+-driven multidrug efflux pump 6 1 Op 6 . + CDS 4810 - 5658 849 ## COG0534 Na+-driven multidrug efflux pump 7 1 Op 7 . + CDS 5690 - 6301 755 ## gi|266621376|ref|ZP_06114311.1| conserved hypothetical protein + Term 6331 - 6376 -0.7 + Prom 6319 - 6378 7.0 8 2 Op 1 2/0.000 + CDS 6463 - 6879 287 ## COG1342 Predicted DNA-binding proteins 9 2 Op 2 . + CDS 6842 - 7243 447 ## COG1433 Uncharacterized conserved protein + Term 7283 - 7316 2.1 10 3 Tu 1 . - CDS 7268 - 7720 354 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 11 4 Op 1 7/0.000 - CDS 8655 - 10037 1158 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 12 4 Op 2 . 
- CDS 10034 - 11008 751 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 13 4 Op 3 . - CDS 10935 - 11585 562 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 11654 - 11713 5.7 + Prom 11671 - 11730 3.5 14 5 Tu 1 . + CDS 11766 - 12212 488 ## COG1879 ABC-type sugar transport system, periplasmic component + Prom 13114 - 13173 4.1 15 6 Op 1 . + CDS 13200 - 13688 579 ## COG1879 ABC-type sugar transport system, periplasmic component + Prom 13722 - 13781 4.6 16 6 Op 2 . + CDS 13807 - 14043 306 ## gi|239625994|ref|ZP_04669025.1| conserved hypothetical protein + Term 14085 - 14137 16.6 17 7 Op 1 3/0.000 - CDS 15048 - 15629 482 ## COG3694 ABC-type uncharacterized transport system, permease component 18 7 Op 2 . - CDS 15648 - 16091 307 ## COG4587 ABC-type uncharacterized transport system, permease component 19 8 Op 1 . - CDS 17021 - 17344 216 ## Closa_1267 hypothetical protein 20 8 Op 2 . - CDS 17354 - 18154 177 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Prom 18399 - 18458 7.8 + Prom 18457 - 18516 4.0 21 9 Op 1 2/0.000 + CDS 18601 - 18984 519 ## COG0640 Predicted transcriptional regulators 22 9 Op 2 15/0.000 + CDS 19012 - 19236 357 ## COG2608 Copper chaperone 23 9 Op 3 . + CDS 19272 - 21137 2283 ## COG2217 Cation transport ATPase + Prom 21151 - 21210 10.4 24 10 Op 1 . + CDS 21327 - 22664 1521 ## COG0174 Glutamine synthetase 25 10 Op 2 . + CDS 22710 - 23258 545 ## Closa_1273 ANTAR domain protein with unknown sensor 26 10 Op 3 . + CDS 23269 - 24078 980 ## COG0253 Diaminopimelate epimerase 27 10 Op 4 . + CDS 24100 - 25329 1245 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 28 10 Op 5 . + CDS 25387 - 25566 108 ## gi|266621396|ref|ZP_06114331.1| conserved hypothetical protein + Prom 26468 - 26527 8.5 29 11 Op 1 . + CDS 26567 - 27013 593 ## COG1522 Transcriptional regulators + Prom 27022 - 27081 4.4 30 11 Op 2 . 
+ CDS 27103 - 27885 688 ## COG0860 N-acetylmuramoyl-L-alanine amidase + Prom 27977 - 28036 4.2 31 11 Op 3 . + CDS 28082 - 28720 598 ## Closa_3203 hypothetical protein + Term 28724 - 28769 14.3 32 12 Op 1 . + CDS 28830 - 28958 70 ## 33 12 Op 2 . + CDS 28943 - 29587 817 ## COG0637 Predicted phosphatase/phosphohexomutase 34 12 Op 3 . + CDS 29663 - 30559 916 ## COG2035 Predicted membrane protein + Term 30599 - 30647 8.4 - Term 30575 - 30648 9.1 35 13 Tu 1 . - CDS 30675 - 31949 1106 ## COG3409 Putative peptidoglycan-binding domain-containing protein 36 14 Op 1 24/0.000 + CDS 33441 - 34514 833 ## COG0505 Carbamoylphosphate synthase small subunit 37 14 Op 2 . + CDS 34514 - 37708 3886 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) + Term 37718 - 37769 17.0 + Prom 37790 - 37849 6.3 38 15 Op 1 . + CDS 37886 - 38236 361 ## gi|266621407|ref|ZP_06114342.1| conserved hypothetical protein 39 15 Op 2 . + CDS 38229 - 38429 264 ## COG1476 Predicted transcriptional regulators 40 16 Op 1 . - CDS 38452 - 38994 511 ## PPSC2_c1114 ThiJ/PfpI family protein 41 16 Op 2 . - CDS 39040 - 40062 880 ## COG1816 Adenosine deaminase - Prom 40112 - 40171 3.1 + Prom 40419 - 40478 8.1 42 17 Op 1 . + CDS 40545 - 41546 973 ## COG1609 Transcriptional regulators 43 17 Op 2 . + CDS 41600 - 42055 309 ## COG0698 Ribose 5-phosphate isomerase RpiB 44 17 Op 3 12/0.000 + CDS 42036 - 43742 936 ## COG3959 Transketolase, N-terminal subunit 45 17 Op 4 . + CDS 43729 - 44706 679 ## COG3958 Transketolase, C-terminal subunit 46 17 Op 5 . + CDS 44703 - 45890 1045 ## COG1653 ABC-type sugar transport system, periplasmic component 47 18 Op 1 . + CDS 46861 - 46965 75 ## 48 18 Op 2 38/0.000 + CDS 47024 - 47884 806 ## COG1175 ABC-type sugar transport systems, permease components 49 18 Op 3 . + CDS 47901 - 48728 334 ## COG0395 ABC-type sugar transport system, permease component 50 18 Op 4 . + CDS 48784 - 49674 747 ## blr4705 hypothetical protein 51 18 Op 5 . 
+ CDS 49676 - 50578 844 ## Spirs_2575 xylose isomerase 52 18 Op 6 . + CDS 50584 - 51561 873 ## COG1052 Lactate dehydrogenase and related dehydrogenases 53 18 Op 7 . + CDS 51527 - 51688 79 ## 54 18 Op 8 . + CDS 51631 - 51747 70 ## + Term 51787 - 51829 -0.9 + Prom 51792 - 51851 2.8 55 19 Op 1 . + CDS 51988 - 52806 746 ## CA_C0752 DNA ligase III 56 19 Op 2 . + CDS 52806 - 53945 1263 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase 57 19 Op 3 . + CDS 53962 - 54270 362 ## gi|288870515|ref|ZP_06114360.2| conserved hypothetical protein + Term 54274 - 54338 15.0 + Prom 54338 - 54397 3.8 58 20 Op 1 1/0.000 + CDS 54417 - 55457 978 ## COG1609 Transcriptional regulators 59 20 Op 2 . + CDS 55478 - 57431 1569 ## COG1874 Beta-galactosidase Predicted protein(s) >gi|229784092|gb|GG667643.1| GENE 1 1 - 99 65 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no SYCVSYEQIERSLPAFQAVAGEYGLAQQEGER >gi|229784092|gb|GG667643.1| GENE 2 96 - 1148 1202 350 aa, chain + ## HITS:1 COG:CAC3031 KEGG:ns NR:ns ## COG: CAC3031 COG0079 # Protein_GI_number: 15896282 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Clostridium acetobutylicum # 2 348 3 349 352 348 50.0 6e-96 MKPWKANIRKVIPYVPGDQPKGSHLIKLNTNENPYPPSPFVSGALKEMDYDLFRKYPDPA ASVLVDALAEYYNVDSDEIFVGVGSDDVIAMAFMTFFNSSKPVLFPDISYSFYPVWAELF GVPCEKPALNKEFQIVKEDYYRENGGIIFPNPNAPTGLYMPLSEVEDIISHNQESVVIVD EAYIDFGGKSALPLTRKYENLLVVQTFSKSRSMAGMRIGFAIGNPDLIRALNDVKYSYNS YTMNMPSQILGAKAVEDQAYFKETTEKIIRTREQAKVRLRELGFTFPDSMANFIFASHKT AMAKDIYEALRANHIFVRYFNQPGIDDYLRITIGTDEEMEALYRFLEEYL >gi|229784092|gb|GG667643.1| GENE 3 1228 - 2355 1317 375 aa, chain + ## HITS:1 COG:BH0463 KEGG:ns NR:ns ## COG: BH0463 COG0628 # Protein_GI_number: 15613026 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus halodurans # 12 361 6 367 372 115 27.0 1e-25 MNHWKKYCRLILNIVIPLLVIWLVCFIGPRLLKFFLPFVIGWIIAMIANPLVKFLEKRVK 
IVRKHSSMMIVVAVLALIITLLYFVISKLVSETVGFVGDIPKYYESASIEVQKMLLSVER LLQFLPQGVSDSVNQFFGHIGEYLNLAVQKIASPTVTVAGNVVKSIPAALVYTIVTIFSS YLFIVNRDKILDFFKKYMPEGGSKYYKYLRKDVRHLVGGYFLAQFKIMFVVAVVLAAGFL VLRVDYALLIAVIVAFLDFLPVLGTGTILVPWAVIRLFSGDTYFALWMLALYVLTQVLRR VIEPKIVGDTMGLDPLATLLFLYLGFKFNGIAGMVLAVPIGMLFLNLYEFGAFDSLISSV KTLVHDINAFRRGPD >gi|229784092|gb|GG667643.1| GENE 4 2451 - 3368 1048 305 aa, chain + ## HITS:1 COG:yegX KEGG:ns NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 46 229 72 252 275 60 27.0 4e-09 MAAGLMAVMMMAAVPFPAAAAEPWEKENGQYVDASGTPIAGALEKGITVTKYQNRANADK GGIDWAKVAADGVDFAMIRIGYYKDMDPYYSMNMQGASANGIKTGVFFYTQALDTQSAVD EANYVLKMVRDYPVSYPIAYDVESQHLLDNGLTKQQITDNINAFCKVISDAGYRPVVYAN NEWLTNHMDISQIPYDIWYARYGTINNCQNRTIWQCTDKGKVDGITGDVTIEFSFVDYAA LIPPEGWKHVDGNWYYTKNYVKQTGWVQVGEKWYYLDSNGAMIHDTTMTIDGASYTFGPD GAMAE >gi|229784092|gb|GG667643.1| GENE 5 3406 - 3876 520 156 aa, chain + ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 154 1 152 448 93 35.0 1e-19 MKAENNLDSDDMRGLVWRLAFPSMLAQFVSVFYSIVDRMYIGNIAGTGEIALAGVGICGP IVTLISSVAFLVGVGGSPLMSIRLGEKNERAARQILANCFLLLTVLSVVITVISLLVKNH LIMWFGASEATFPYANAYITIYLLGTVFALLATGTS >gi|229784092|gb|GG667643.1| GENE 6 4810 - 5658 849 282 aa, chain + ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 281 178 457 457 129 31.0 5e-30 MCNIILDPVFIFGCHMGVRGAAVATVLSQMASCAYVLRFLFSRRAPVRITFGEYDWQIMK RVLLIGLSPFLIIAFDNILIIALNTVIQKYGGAEQGDMLLTCMTIVQSFMLMVTMPLGGI TSGTQTILGYNYGARRPERIKKAEVHIAALGLVFTTVMFLIAHTIPQFFVRIFTQNETYV ELTVWAIKIYTLGIIPLAAQYTVVDGFTGMGIARVAITLSMFRKMIFLGGAFLIPATMGI 
KYIFYTEPVSDFISAAVSVVIYFLVIDRIINRKPVFGRRNMP >gi|229784092|gb|GG667643.1| GENE 7 5690 - 6301 755 203 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621376|ref|ZP_06114311.1| ## NR: gi|266621376|ref|ZP_06114311.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 203 1 203 203 380 100.0 1e-104 MRKWIAAAAVLLFSTMGTGEAFAETWQYQAAEQPGPGTNGGAMAEEPGSVPEAGSWNGLT FTSSRLGYQVTVPEGFMPLESYMDLSVDDTMTLDLSVAAPDMTVSMVVVTYDKMDEEFKG MDAGQLAAVMGQGLGGYDDYMQIGEVEMVPLGDREYAKLPVIYTGLLNQDLYFCWSGESV MMIEIVYDPYRKAMADEIMGSIR >gi|229784092|gb|GG667643.1| GENE 8 6463 - 6879 287 138 aa, chain + ## HITS:1 COG:CAC3166 KEGG:ns NR:ns ## COG: CAC3166 COG1342 # Protein_GI_number: 15896414 # Func_class: R General function prediction only # Function: Predicted DNA-binding proteins # Organism: Clostridium acetobutylicum # 1 97 1 95 143 72 44.0 3e-13 MARPVKARRICGIPGRMQFGPLEAGTGPEADQFISLTLEEYETIRIIDYLDRTQEECAEQ MGVARTTVQAVYQSARKKMAAMLVEGSALSIGGGNYEVCPRAGGCCKKDCRKRLCPGRRC DGQEYQNGGWNHENCGNV >gi|229784092|gb|GG667643.1| GENE 9 6842 - 7243 447 133 aa, chain + ## HITS:1 COG:CAC3167 KEGG:ns NR:ns ## COG: CAC3167 COG1433 # Protein_GI_number: 15896415 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 6 118 1 114 118 62 34.0 3e-10 MEDGIMKIAVTYENGSVFQHFGHSEQFKIYQAEEGTVVSSEVVGTNGNGHGALAGFLKEM GVETLICGGIGGGARNALDEAGIQLYPGVSGDADKAVEALLKGTLSYDPDTMCSHHHEGE GHDCHHGGGSCHH >gi|229784092|gb|GG667643.1| GENE 10 7268 - 7720 354 150 aa, chain - ## HITS:1 COG:BH2110 KEGG:ns NR:ns ## COG: BH2110 COG2972 # Protein_GI_number: 15614673 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 2 148 444 584 585 83 31.0 2e-16 MYVELQNMRYDNCVSFVVDVPDELSEYTIPKLTFQPIVENAWLHGIMGTEEKRGNILLTG WRQGEDIIFLISDDGIGISPESLESILAEENGDPMSNGTSDSVHIGVYNTNLRLKRLYGE SYGLTFQSSHGTGTEVTIKIPAKRREDSPV 
>gi|229784092|gb|GG667643.1| GENE 11 8655 - 10037 1158 460 aa, chain - ## HITS:1 COG:BH1122 KEGG:ns NR:ns ## COG: BH1122 COG2972 # Protein_GI_number: 15613685 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 243 457 213 425 586 84 28.0 5e-16 MKNFVNRILRHYTYDLKLKNKLVISHALLILFPTAVLSGFIFVRLYGIVMDDSIRSEQAL SAQTAASIENLVSHVTYVSDALRDSSSIHRLFYLSESDAAAFEPEQNRMDSLFHLTDSLT DNTMITGIRIYYDDSVYPDLMKYNTEKNRLFAPVSTVSSSYWYGICSTMQNKDLLCPELY LSPSEAEKFGRLAYISRITYLPPAGSEQENLRASAYLAVYFNSEPFDSALINNASVKGEA AYIINSRDVMVSASDMALAGIYYIPTDEFLSRIGPEKTFTLVSYPDGAAYTAYFPVRNTD WYMISLLPSSHITDAGTRLMIQLIVAYLLFTALALSIALKLSVSIADRIIAVAYQMESVR TGRPMPMDTVETGCDEIGVLTDTYNYMTEEINDLMNSREKASEDLRLAEFRALQAQINPH FLYNTLDMINWLSRTGRKEDVTRAIQTLSRFYKLTLSKRG >gi|229784092|gb|GG667643.1| GENE 12 10034 - 11008 751 324 aa, chain - ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 184 322 375 517 530 78 28.0 2e-14 MRRGRIFLILLTELGLKQVYAFRSDRYVIVHIYSEKKPDKEAVKRCALFLQQSFDKYCRF FLAVGPIVTGTASVHLSYEKAKLLLENSFFHDFNSILTEYDKPAALCPPADVLMDFDLAL SGKQKDKALELAGQLRDSILNGQPLLPSQVKDIYYKYLVKLDENGIMNYVSLWNNEGVNP QSIWDSVLRCSTFSELHSLLLEKIEQYFDHLTRNTGENPLVFQIKEYLHQNYAISSLSVL DVSEYVNRSSSYICTLFKNETGQTLNQYLTDYRIKKSKQFLGDPRYKIADISSKVGYSDG NYYSKTFRKLVGLSPSEYREKMLS >gi|229784092|gb|GG667643.1| GENE 13 10935 - 11585 562 216 aa, chain - ## HITS:1 COG:SPy1556 KEGG:ns NR:ns ## COG: SPy1556 COG4753 # Protein_GI_number: 15675449 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pyogenes M1 GAS # 3 166 4 171 246 89 32.0 6e-18 MKLLIVDDEKLTREGIRDSIARLSLPFDSILLADDGVHGLETAVSERPDIVLTDVRMPRM 
TGVEMAEAILEQNPDTAIIFMSAYSDKEYLKAAIKLKAVRYVEKPLSLSELDDALAEALL NCQVRSHTRSAVMIQEKEQKGHLAHLLTLPDSETAALELSRHLGLAMGPSTLFFSIIVDS VTPLSELPEIPMDEARENFFNPVNGAGIKTGLCLSL >gi|229784092|gb|GG667643.1| GENE 14 11766 - 12212 488 148 aa, chain + ## HITS:1 COG:mll5706 KEGG:ns NR:ns ## COG: mll5706 COG1879 # Protein_GI_number: 13474749 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 17 143 11 142 331 67 33.0 1e-11 MKRFWLIAGILTIFCIAACSAAGMVTESPEYAVIMKSKDNQYNEEVAEAFRQAAEEAGYR CEVLYPETARAQDQVILVRRMMRERVKAIAIAVSDEHALAPVLKEAMAKGIVITTLDSDT EKDSRSIFVSPADPKETGAGLVQAVLAS >gi|229784092|gb|GG667643.1| GENE 15 13200 - 13688 579 162 aa, chain + ## HITS:1 COG:SMc02324 KEGG:ns NR:ns ## COG: SMc02324 COG1879 # Protein_GI_number: 15964383 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 10 116 180 283 329 73 40.0 2e-13 MMEELEKPEYQKMRLVDIVYGEDDYKTSAARTRELIEKYPALRAICAPTTVGIQAAADVV REQGLEGQIRVTGLGIPAKSEDGTVSDDTGVCPALNFWNPAALGRLSAQVSMALVNQELT VREGEVFRTEDGGEYLIHKGAGGGLEVTAGDPLTPDESGSGL >gi|229784092|gb|GG667643.1| GENE 16 13807 - 14043 306 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|239625994|ref|ZP_04669025.1| ## NR: gi|239625994|ref|ZP_04669025.1| conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] # 1 71 1 71 75 96 67.0 8e-19 MKHIFVTFDQAEQIFSFVKIMSKCDFDADVKLGSRIVDAKSIVGVLSLAKSRTVEVIFHT DNCDNIIEQIAAIVPIAA >gi|229784092|gb|GG667643.1| GENE 17 15048 - 15629 482 193 aa, chain - ## HITS:1 COG:DR0204 KEGG:ns NR:ns ## COG: DR0204 COG3694 # Protein_GI_number: 15805240 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Deinococcus radiodurans # 4 192 13 203 277 82 31.0 4e-16 
MRAIRLYFRFFVIQLKSMMQHKASFFFTVLGQFLVSFNVFLGVTFMMERFHEVQGFTYPE VLLCFSITLMAYTLAETFFRSFDTFNQMIGNGEFDRILLRPGSCIFLVLCSKIELTRIGR LLQAAVMLVYGVTKSGIIWTPMRAFTLFLMILGGTLVFASIYIIFASICFFTLEGLEFMN VFTDGAREYGKYS >gi|229784092|gb|GG667643.1| GENE 18 15648 - 16091 307 147 aa, chain - ## HITS:1 COG:SP0637 KEGG:ns NR:ns ## COG: SP0637 COG4587 # Protein_GI_number: 15900543 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 1 147 126 272 272 63 29.0 9e-11 MPILAAALLLPKPYGLTLPSDAGTWFFTILSMILALLVITALNMVVYLSVFYTISSQGIR LIVSSLFDFLSGAVIPLPFLPGTVRAVVSLLPFASVQNVPFRIFSSDLAGREMYVSLLLQ AFWLLTLLAAGRAMERRALQHVVIQGG >gi|229784092|gb|GG667643.1| GENE 19 17021 - 17344 216 107 aa, chain - ## HITS:1 COG:no KEGG:Closa_1267 NR:ns ## KEGG: Closa_1267 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 105 1 105 267 167 83.0 2e-40 MKKYLSFFRLRFIHGLQYRTAAVSGMVTQFVWGSMEILLFRAFYQADASSFPMTFQALSS YVWLQQAFLALYMAWFWEMELFDSITTGNVVYELCRPIRLYDMWYVS >gi|229784092|gb|GG667643.1| GENE 20 17354 - 18154 177 266 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 234 11 241 329 72 25 4e-12 MIQVEHVNKTYKVARRSAGFSEAVRALFHREVEYVHALNDITFTIGDGEMVGYIGPNGAG KSSTIKILSGILTPDSGTCLIDGRVPWKERKEHVKEIGVVFGQRSQLWWDVPVIDSFELL RDIYEIPQPQYEDALDELVNLLSLSEVMRTPARQLSLGQRMRCEIAASLLHRPKILFLDE PTIGLDAVSKLAVRDFIRRQNRIHGTTVLLTTHDMQDIEAMTDRVMLIGRGRLLLDSSSR SLKTQCMQEGGTLDEIVADLYRNYRI >gi|229784092|gb|GG667643.1| GENE 21 18601 - 18984 519 127 aa, chain + ## HITS:1 COG:FN0260 KEGG:ns NR:ns ## COG: FN0260 COG0640 # Protein_GI_number: 19703605 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 17 127 10 120 125 125 54.0 3e-29 MEDENRMDLQDSEAPHCDFISVHEDIVNRVMQVMPADQQLLDLAEFFRVFGDSTRIRILY VLSQSEMCVCDIAGLLKMGQSAISHQLRVLKQMRLVKFRRDGKTVFYSLADGHIETILAQ 
GMEHISE >gi|229784092|gb|GG667643.1| GENE 22 19012 - 19236 357 74 aa, chain + ## HITS:1 COG:FN0259 KEGG:ns NR:ns ## COG: FN0259 COG2608 # Protein_GI_number: 19703604 # Func_class: P Inorganic ion transport and metabolism # Function: Copper chaperone # Organism: Fusobacterium nucleatum # 1 67 1 67 73 64 49.0 4e-11 MRKIIKLEGLCCANCASKIETEVSKLAGVKSAALSFMTQRLTIEAEDDRADAIVEESRKI AYKIEPEAEFKVLR >gi|229784092|gb|GG667643.1| GENE 23 19272 - 21137 2283 621 aa, chain + ## HITS:1 COG:FN0258 KEGG:ns NR:ns ## COG: FN0258 COG2217 # Protein_GI_number: 19703603 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 8 609 12 609 614 608 54.0 1e-173 MAARLSVSAVFLIAGALMEGRTAYAWIPFVISYLSAGYDIPLKAVRNIFHGQVFDENFLM TVATFGAIGCGALEEAVAVMLFYQVGELFSDYAVNKSRKSITELMDINPEYANLVRDGKE EKVDPYEVEIGDTIIIKPGEKVPLDGVIVKGSSSLDTKALTGESMPVEVKERDPIISGSI NLNGLLEVQVTKLFDDSTVAKILELVENASSRKAKAENFITKFARVYTPVVVCLALVLAV IPPLLTGGNWGIWIYRACSFLVVSCPCALVISVPLSFFGGLGAASRQGILMKGSNYLEAV ASLDTVVFDKTGTLTTGKFQVTRVKPAEGTKDQLLELAAYGEFHSNHPIAISVKEAYGKR VEESRIVTAKEIAGHGIHAVLELENGRKDLYIGNERLMKAQGITITEVPEIMGTSLYIAE DGVFLGSIIISDTVREDVPDALHGLRRAGVRRLIMLTGDKPEVGRAVADKLSLDEAYGGL LPADKVLKVEELLTAKPEGSTLGFVGDGINDAPVLARADIGIAMGGIGSDAAVEAADVVI MTDEPSRLVDAIVIARKTMRIVKQNIIFAIGVKVLVLILTALGFATMWAAVFADVGVSVL AILNAIRALNYKKARDKAFEG >gi|229784092|gb|GG667643.1| GENE 24 21327 - 22664 1521 445 aa, chain + ## HITS:1 COG:BS_glnA KEGG:ns NR:ns ## COG: BS_glnA COG0174 # Protein_GI_number: 16078809 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Bacillus subtilis # 1 445 1 444 444 584 63.0 1e-166 MSKYRKEDIFRIVEEEDVEFIRLQFTDIFGMLKNVAITSSQLKKALDNRCMFDGSSIEGF VRIEESDMYLYPDLDTFEIFPWRPQQGKVARLMCDVYRPDKTPFEGDPRYVLRRVLKEAS DMGFVFNAGPECEFFIFQTDEEGRPTTETHEIAGYFDVAPIDQGENVRRDIVLNLEDMGF MIEASHHEMAPGQHEIDFEYAEGLVTADNVMTFKMAVKAIAKRHGLHATFMPKPKAGVNG SGMHLNMSLSGLNGKNLFEDETDQLGLSRIAYQFMAGILSHMKEMTILTNPLVNSYKRLV 
PGYDAPVYIAWSAKANRSPLIRIPSSRGESTRIELRCPDPSVNPYLALAACLAAGLDGIK KGMVPPDSVDTNIYTMAPEEVRARGISRLPETLGEAIEEFKSSEFMKEVLGEHIYTKYLE AKEAEWRMFRAYVTDWETKEYLYKY >gi|229784092|gb|GG667643.1| GENE 25 22710 - 23258 545 182 aa, chain + ## HITS:1 COG:no KEGG:Closa_1273 NR:ns ## KEGG: Closa_1273 # Name: not_defined # Def: ANTAR domain protein with unknown sensor # Organism: C.saccharolyticum # Pathway: not_defined # 1 182 1 182 182 281 77.0 1e-74 MANVIVAFSKPEDAKNIRNILAKNGFAVAAVCTSGAQAINCADELGSGILVCGGRFADMV YDELHDCLPPGVRMLLVASPGLWNGRAPEDVVCLGLPLKVHDLVSTLEMMMESLVRRRKK QRVKPLERSREEQELIKQAKELLMDRNHMTEADAHRYLQKCSMDSGTNMVETAQMIMSLI NR >gi|229784092|gb|GG667643.1| GENE 26 23269 - 24078 980 269 aa, chain + ## HITS:1 COG:BH3412 KEGG:ns NR:ns ## COG: BH3412 COG0253 # Protein_GI_number: 15615974 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Bacillus halodurans # 1 264 1 280 286 251 47.0 1e-66 MKFTKMQGTGNDYVYINCFEETVERPEELAVKVSDRHFGIGSDGLILICPSGQADCRMKM FNADGSESEMCGNGIRCVGKFVYDHHIVEKKEFDVETKAGIKHLKVTDEGGKAALITVDM GIPEVTGDVPEPIVVDGRSYEFIGISMGNPHAIYYMDEIDDLDLEAIGPAFETHERFPER TNSEFIQVVDRSHLRMRVWERGSGETWACGTGATASAVASVLSGRTENTVEVELKGGTLQ ITWDRESGHVYMTGPAVEVFQGEFQPENL >gi|229784092|gb|GG667643.1| GENE 27 24100 - 25329 1245 409 aa, chain + ## HITS:1 COG:MTH52 KEGG:ns NR:ns ## COG: MTH52 COG0436 # Protein_GI_number: 15678081 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanothermobacter thermautotrophicus # 1 404 1 408 410 535 62.0 1e-152 MVKVNDNYLKLPGSYLFSAIAKKVNAYTAANPDKKIIRLGIGDVTQPIAPALIKALHDAV DEMGNAETFHGYAPDLGYGFLREAIAAGDYASRGCEIDADEIFVSDGAKCDCGNIQEIFS EDAVIAVCDPVYPVYVDSNVMAGRTGEYDEKTGKWSRVIYMPCTAKNQFVPELPKETPDL IYLCVPCNPTGTTLTRDQLKVWVDYANRTGAVILYDAAYEAYIAEDTVPHSIFEIPGART CAIEFRSFSKNAGFTGVRLGFTVIPKDLVRGGVTLHSLWARRHGTKFNGAPYIVQKAGEA VYSPEGRAQLKEQVAYYMRNAKVIYDGLKEAGCEVYGGVNAPYIWLVVPDGMTSWEFFDC LLNEAGVVGTPGSGFGPSGEGYFRLTAFGTYENTVEAVERIKNMPSLRK 
>gi|229784092|gb|GG667643.1| GENE 28 25387 - 25566 108 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621396|ref|ZP_06114331.1| ## NR: gi|266621396|ref|ZP_06114331.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 56 1 56 56 86 100.0 6e-16 MVINKKDKSVRTLVVLTLFLILQESAPVVLKSPPDRITLRRTEIMSRKRPAAGGESLAS >gi|229784092|gb|GG667643.1| GENE 29 26567 - 27013 593 148 aa, chain + ## HITS:1 COG:SMc00425 KEGG:ns NR:ns ## COG: SMc00425 COG1522 # Protein_GI_number: 15964097 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 1 139 2 140 145 101 36.0 6e-22 MDEIDKKILQMLQENARYSLKKIAENTYLSSPAVSSRIGRLEREDIITGYHAAVDPVKLG YHIIAFVNLDVLPGEKPGFYARMAEIPNVLECNCVTGDYSMLLKVAFQSTMELDVFIGEV QKYGKTNTQIVFSTPVGPRGADVMTAGI >gi|229784092|gb|GG667643.1| GENE 30 27103 - 27885 688 260 aa, chain + ## HITS:1 COG:BS_cwlC KEGG:ns NR:ns ## COG: BS_cwlC COG0860 # Protein_GI_number: 16078804 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 7 254 4 252 255 130 32.0 3e-30 MAERKIVVIDAGHGGADPGAKFEGRQEKDDALRLALAVGNILSQNGVDVRYTRTDDTYNT PLEKATMANEAGADYFVSIHRNAMPIPGSASGVMSLVFENKGVPAQLANNINEELANTGF ANLGIIERPGLVVLRRTEMPAVLVEVGFIDNEADNQLFDDNLHAIAEAIANGILTTIREG EAEQPEYYQIQTGAYRIRSLAEQQQNTLRSQGFPAFIVSEDGLFKVRAGAFRELDNAVRM EQELRRYGYNTFLVRRPAVS >gi|229784092|gb|GG667643.1| GENE 31 28082 - 28720 598 212 aa, chain + ## HITS:1 COG:no KEGG:Closa_3203 NR:ns ## KEGG: Closa_3203 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 212 5 215 215 220 51.0 3e-56 MGDKKLIITIGRQYGSGGNEIGRKLAEELGIDFYDKNILRMNSDESGIKESYFHLADEKA GSRLLYRIVSGMTPEMREPSFGSDLISADNLFRFQSEVIRKLAEEQSCVIVGRCADYVLE DADDIELVRVFIYADMDARIRRVREKELYTPEDVRKNVKRIDKERRNYYRYYTGRGWADP ENYDLLINTSTTGIKGSVRMIEEYIKIRGYKI >gi|229784092|gb|GG667643.1| GENE 32 28830 - 28958 70 42 aa, chain + ## HITS:0 COG:no 
KEGG:no NR:no MIEPSGLAEERKGSGNGRIGKHGKIPGIQEEGQERGNTWLRQ >gi|229784092|gb|GG667643.1| GENE 33 28943 - 29587 817 214 aa, chain + ## HITS:1 COG:L150333 KEGG:ns NR:ns ## COG: L150333 COG0637 # Protein_GI_number: 15672725 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Lactococcus lactis # 3 214 4 215 222 127 35.0 1e-29 MVKAVIFDMDGVIIDSEGKYLEFQLEFAQKKNPNVRIEQLYPMVGATKKEAWEVLEHAVD NGQTWEELRDECRRRDIYSEVDYREIYRPEVTEVLKTLKEKGYRLALASSTQLDLVERVL RENEIREYFEVVVSGSQFKRSKPNPEIYQYTASRLGVRTEECLAVEDSTIGITAASRAGM KIAAVIDNRYNFDQSLADYHIARVKEVLEVVKEA >gi|229784092|gb|GG667643.1| GENE 34 29663 - 30559 916 298 aa, chain + ## HITS:1 COG:FN0490 KEGG:ns NR:ns ## COG: FN0490 COG2035 # Protein_GI_number: 19703825 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 7 265 5 257 260 89 26.0 9e-18 MEYIGELLKGVVIGIANILPGISGGMLAISMGVYDTIIHAVNRLFKEPVESIRTLFPYGL GAVAGIISLSLVFEYLFGTFPLQTKLAFIGLIGGGLPALFEKTKEPGERRRGVMTICVTA AVVIFITVAGGMSIASGSHGGAGTVKTGAEYLSSGRFWIISMFLVGILAAGTMVVPGVSG SMIMMMIGVYEPLLQTTNGCIRAAAAFDFPALLSDGLVLLPYFTGLGLGIFLFAKIVEHL LVNHERQMYCVIIGLVFSSPVVILWDVAWSRIELYQLLGGISLCLLGYVAAVRCGGEG >gi|229784092|gb|GG667643.1| GENE 35 30675 - 31949 1106 424 aa, chain - ## HITS:1 COG:CAC3244 KEGG:ns NR:ns ## COG: CAC3244 COG3409 # Protein_GI_number: 15896489 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative peptidoglycan-binding domain-containing protein # Organism: Clostridium acetobutylicum # 16 422 4 407 437 344 47.0 2e-94 MRKYMTAAALEPTDTGSLQVNVVSAENNFPIRDAEVSIAYKGDPESTVESTNTNSSGQTG EIRLAAPPLEYSLSPGLTQPYSEYTITIRAQGFAEVAISGTEILPDALAIQPVRMTPLAD ETSPDTPIVIPDHTLYGYYPPKIAEAEVKPVAETGEIVLSRVVVPQTVVVHDGVPTDSTA PNYYVPYRDYIKNVASSEIYATWPRSSITANVLAIMSFTLNRVYTEWYRNQGYDFTITSS TAYDHKWIYGRNIFQSISEVVDEIFDNYLSRPEVKQPILTQYCDGNRVSCQHKGWMTQWG SADLGERGYSPIEILRYFYGDDMYINTAEQISGIPASWPGYDLTIGSSGQKVQQVQEQLD AIATVYSAIPHITPDGIYGPATAAAVREFQSIFGLPVTGVIDFRTWYKISHIYVGITRIA ELNP 
>gi|229784092|gb|GG667643.1| GENE 36 33441 - 34514 833 357 aa, chain + ## HITS:1 COG:lin1950 KEGG:ns NR:ns ## COG: lin1950 COG0505 # Protein_GI_number: 16801016 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Listeria innocua # 2 357 3 360 363 365 51.0 1e-101 MKAFLILEDGTVFTGTSIGSKREVISEIVFNTSMTGYLEVLTDPSYAGQAVVMTYPLIGN YGICRDDMESRKAWPDGYIVRELSRIPSNFRSGDTIQHFLEEQDIPGISGIDTRALTKIL REKGTMNGMITTNENFDCNEAVARMKKYSVTGVVKKTTCAEKYVLPGDGKKVALLDLGAK MNIARSLNERGCQVTVYPATTAAEEILAGNPDGIMLSNGPGDPKECTDIITEIRKLYESD VPIFAICLGHQLMALATGADTHKLKYGHRGGNHPVKDLETGRVYISSQNHGYVVDMDTID PGTAVPAFVNVNDGTNEGLKYVNKNIFTVQYHPEACPGPLDSGYLFDRFLNMMEEGK >gi|229784092|gb|GG667643.1| GENE 37 34514 - 37708 3886 1064 aa, chain + ## HITS:1 COG:BS_pyrAB KEGG:ns NR:ns ## COG: BS_pyrAB COG0458 # Protein_GI_number: 16078616 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus subtilis # 1 1060 1 1062 1071 1197 56.0 0 MPKNPDIKKVLVLGSGPIIIGQAAEFDYAGTQACRSLKEEGIEVVLLNSNPATIMTDKDI ADRVYIEPLTVEVVEQLILKEKPDSVLPTLGGQAGLNLAMELEDAGFFRDHNVRLIGTTA LTIKKAEDREMFKETMEKIGEPVAPSDIVEDVKHGLEIAAEIGYPVVLRPAYTLGGSGGG IAENPEQCAEILENGLRLSRVGQVLVERCIAGWKEIEYEVMRDGAGNVITVCNMENIDPV GVHTGDSIVVAPSQTLGDKEYQMLRTSALKIISELGITGGCNVQYALNPDSFEYCVIEVN PRVSRSSALASKATGYPIAKVAAKIALGYTLDEIKNAVTKKTYASFEPMLDYCVVKMPRL PFDKFISAKRTLGTQMKATGEVMSICTNFEGGLMKAIRSLEQHVDSLMSYDFTGLSDEEL KETLALVDDRRIWVIAEALRRGFTYDTIHAITKIDVWFIDKLAIIVEMEAALKKGPLTVD LLREAKRIEFPDNVIARLTGISEQDIHQMRKDNGIVAAFKMVDTCAAEFAAETPYYYSCF GSENEAAFTNDRKKVLVLGSGPIRIGQGIEFDFCSVHSTWAFSREGYETIIVNNNPETVS TDFDIADKLYFEPLTPEDVESIVDIEKPDGAVVQFGGQTAIKLTESLMKMGVPILGTAAE DVDAAEDRELFDKILEECRIPRPAGHTVFTAEEAKEAAHKLGYPVLVRPSYVLGGQGMQI AISDQDVDEFIGIINQIAQEHPILVDKYIMGKEIEVDAICDGTDILIPGIMQHIERTGIH SGDSISVYPAPDLTEKNIETLVDYTEKLARALHVKGMINIQFIVDGDDVYIIEVNPRSSR 
TVPYISKVTGIPIVPLATQIICGRTIRELGYTPGLQPSADYISIKMPVFSFEKIRGADIS LGPEMKSTGECLGIAKTFNEALYKAFEGAGVRLPKYKNMIITVRDSEKEEVVDIARRFQA QGYRIFSTAGTAKFLNSHGVKALAVRKLEQESPNLLDLILGHEIDLVIDIPPQGADKSRD GFIIRRNAIETGVNVLTSLDTAAALVTSMENRANELTLIDIATI >gi|229784092|gb|GG667643.1| GENE 38 37886 - 38236 361 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621407|ref|ZP_06114342.1| ## NR: gi|266621407|ref|ZP_06114342.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 116 1 116 116 197 100.0 2e-49 MKYGYGCLGFLALLGILGIVTEERSFLAFFGFAVDFQYFFRKTDEMMIEYMNKSAARAFM CGMLITAVVTLIYAALYGINRALLIGIAWGWSASVLVYALSTAWYGIKESWWTGDD >gi|229784092|gb|GG667643.1| GENE 39 38229 - 38429 264 66 aa, chain + ## HITS:1 COG:MA4668 KEGG:ns NR:ns ## COG: MA4668 COG1476 # Protein_GI_number: 20091114 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 2 66 12 76 79 77 60.0 7e-15 MIKNRIKEYRARYDMKQEELADRVGVRRETIGNLEKGKYNPSLVLAWNIAKVFGATIEEI FTVEES >gi|229784092|gb|GG667643.1| GENE 40 38452 - 38994 511 180 aa, chain - ## HITS:1 COG:no KEGG:PPSC2_c1114 NR:ns ## KEGG: PPSC2_c1114 # Name: not_defined # Def: ThiJ/PfpI family protein # Organism: P.polymyxa_SC2 # Pathway: not_defined # 1 177 1 166 168 104 33.0 1e-21 MTNYILIYDDCCFYEIVLLGYFMRYSNMGEQPCSYCMAVPAEGPHRDTIRTAEGFRVQAD TYLEDIDPEEVTSFIIPGGDISRVRGEDLSAFLHSLDRDKSCICAICAGVDLLEEYGFLE GKNSIRTSQDLAVSDGLLVTARPNGYVDFAVEAGKAIGLFTDEADIAETLDFFKFHKSES >gi|229784092|gb|GG667643.1| GENE 41 39040 - 40062 880 340 aa, chain - ## HITS:1 COG:SPBC1683.02 KEGG:ns NR:ns ## COG: SPBC1683.02 COG1816 # Protein_GI_number: 19111850 # Func_class: F Nucleotide transport and metabolism # Function: Adenosine deaminase # Organism: Schizosaccharomyces pombe # 8 340 5 339 339 327 48.0 2e-89 MPQSKVTKEFIYGLPKAELHLHLEGTLEPELKLHLAQKNNIDIGQHTIEEVKAAYQFDSL TSFLKVYYPAMNVLQHTEDFCTLALDYLKKAKEHNIKYAELFFDPQAHTSRGVPIEAVIN 
GYYEATQQAREFGVEANLIMCFLRDMSAESAMETYLAALPYRSKFIGIGLDSDERDNPPQ KFAQVFALAKKDGFHITMHCDIDQEGSIEHIRQALMDISVDRLDHGTNIVEDPSLVAYIK EKKIGLTCCPVSNSFVVDDMKGKEILELLHQGIQVTVNSDDPAYFQSYISDDLYSLAEAY ELSREDVIQIVKNAFSISWISEEKKNGYLKMVDEYVASFE >gi|229784092|gb|GG667643.1| GENE 42 40545 - 41546 973 333 aa, chain + ## HITS:1 COG:BS_ccpA KEGG:ns NR:ns ## COG: BS_ccpA COG1609 # Protein_GI_number: 16080026 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 3 310 5 313 334 218 37.0 1e-56 MSTIKDVAREAGVSLGTVSNVLNGKPNVKKENKEKVQKAIRKVGFQYNMAASFLRTKTSK NIGLIIPSITNPYYPELARGVSDSARIASLTVFLCNDDRDVSKEHRYVDELLSKGVDGII LVKPQMNLSELKELNRQTALVLVDVGEDIDSEFHVINVCDEQGIIQGMTLLRQYGHKKIA FVSGLLESYSSKCRVQTYRRCLADYGFDYRPEYVISGNYDWQSGYTAARQLMQLQDPPTA IFASNDLMALGVIKSALEKGIRIPEELSVLGFDDIDMSNLCTPSLTTIHQPKYEIGIESM NMLLACMNGEMPKNGMKTILETRLVFRDSVGYA >gi|229784092|gb|GG667643.1| GENE 43 41600 - 42055 309 151 aa, chain + ## HITS:1 COG:aq_1138 KEGG:ns NR:ns ## COG: aq_1138 COG0698 # Protein_GI_number: 15606395 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Aquifex aeolicus # 6 136 3 133 154 148 54.0 3e-36 MKEEFIYIGSDHGGFSLKQVIKDCLDRKGILYKDVGSDSAEIVRYPTYAEEVALAVSEHR ATRGILICSTGIGMSIVANRFKGVRASLCTDSYMGKITRAHNDSNILCLGGKITGELEAV DILENWLTTEFEGGRHTISLGLIEELDCNGK >gi|229784092|gb|GG667643.1| GENE 44 42036 - 43742 936 568 aa, chain + ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 295 557 7 269 270 276 50.0 9e-74 MTVTGNEQNTGKKWGRNVKVAVTIGSDQAGDSAFVVWRGFEESIRKAAAFGYDGVELAMR SANDVEACDLKRWLSESGMEVSCITTGRMFAEDHLYFTHPDPEIRKRAKDSYKSLIDMAS EYCNLINIGRVRGEKGKEQSEEEADRLFLDAYREICSYAEKKGVQVLIEPVNRYEMNLIN SLDQGAEFLSRAGCPNGGLHADVFHMNIEDDRIGESLIRNGNWIKYVHIADSNRLAPGSG HLDFNEIFTALKRADYDGWISMEMLPGNDPDEEVKKALKYISPWVSRNDCEEEKMEQLSR 
KFRKELIQLLHSKGTGHPGGSLSCVELLTTLYYKFLRVDPKCPDREGRDRLILSKGHAAP ILYCILAEKGFFPVEELETFRQAGSRLQGHPCRHKTPGIECSSGPLGLGLGAGLGMALAE KLKKSDARIFVVMGDGEIQEGAVWEAASAAVKYRLDHLTAVLDFNGVQLDGTLEEILPPG DLVKRWEAVGWNVISCDGHSIPEIMKSVELAKQCRQKPSIIIAKTVKGRGVSFMEGRHQW HGKVINDEDFKRAMQELDMERGEQSAGE >gi|229784092|gb|GG667643.1| GENE 45 43729 - 44706 679 325 aa, chain + ## HITS:1 COG:PAB0296 KEGG:ns NR:ns ## COG: PAB0296 COG3958 # Protein_GI_number: 14520662 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Pyrococcus abyssi # 12 319 4 311 317 265 43.0 8e-71 MLANEGGKRYCSVREAYGKALAELGRTNESVVVLEADVGGSTKSSLFGGEFPERYFNMGI CELNMVNTAAGLAMEGFTPFVNTFAVFMTSRALDPIQSMIAYDGLNVKLAGAYCGLSDSY DGASHQAITDIAVMRTIPGMTVVSVSDAAEAEAAVRALADYPGPAYLRLSRADAPVIYER GCDFKVGKGIVCRDGNDVTLIGTGTVVSRCLEAAARLKELGIDAAVIDMHTIKPIDESLI LKYAKRTKAIVTVEEHSVCGGFGSAVAEVIVKRYPIPMDIIGIETFAESGDYEELLDKFG LGSQRITEACRQIVQRKQNWKGDSL >gi|229784092|gb|GG667643.1| GENE 46 44703 - 45890 1045 395 aa, chain + ## HITS:1 COG:SP1897 KEGG:ns NR:ns ## COG: SP1897 COG1653 # Protein_GI_number: 15901724 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 76 388 49 361 419 116 26.0 6e-26 MRRSGKMKIAAALLLCSTVVLGACGSESSQNKETAKVTEAAGTAASTELAAGEPEKQEDG LTGELNIFVHVGSMDINPVIEKMQQENPDLKINLEMGSGTQYETVLKTKLSAGEAPDIMT VWPGSRTADYAAKGYLEKLTEEPWAARIPATINSEFSLEDDLYAMCFTVGGEGLIYNQDL FDQYNLSVPQNFDELLNVCETFQKNGVQPFASGYKDDWVIMRYTNSAFATLGYGREPDFD KKLASGELDFSYPGWVELFDKYGVLLDKHYFGDAILSTDSSQAIADFATGKTAMMIHTSS TISEIRKVSPDMKLGLTATPVNNAGEELYANWKSGLGFAVSSKSKNKEAAKRFLEIWADP EVNEALYHNNGEPTILNDVPSTGTDPALQDFYSTS >gi|229784092|gb|GG667643.1| GENE 47 46861 - 46965 75 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSNNWKKKLQEFVGRQINTEQMIDWLNEEYQSMK >gi|229784092|gb|GG667643.1| GENE 48 47024 - 47884 806 286 aa, chain + ## HITS:1 COG:BH2225 KEGG:ns NR:ns ## COG: BH2225 COG1175 # Protein_GI_number: 15614788 # Func_class: G 
Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 5 281 28 301 304 170 36.0 3e-42 MLFMLPAILIVVCVDIIPIFQNFYYSLFSWDILEPSMKFVGLDNFINFFKMDGARGALKN TLIYAVLAAVLKNIVGLGLALVLNSGLKTQKFLRTCILSTTMISLVISGYTWTYLYHPEY GVGYLIEKYTGISFLNQDWLGNPKLALYAVLVVSIWQIAGKYMVIYLAGLQTVPKDLYEA GNMDGASGFSRFRYITAPLIIPSFTVGILNAVIQGLKVFDEIYSMTRGGPGFASETLTTL MYSQTFFYSGKAGFGSAISVILFGIVLIVSLLISSYLRKKEDAVYQ >gi|229784092|gb|GG667643.1| GENE 49 47901 - 48728 334 275 aa, chain + ## HITS:1 COG:BH2224 KEGG:ns NR:ns ## COG: BH2224 COG0395 # Protein_GI_number: 15614787 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 5 274 6 275 276 174 37.0 2e-43 MQKIRLRKILLELVMILFTVIYMFPVLNLVFSSLKSTGDLMRNPVGLPSVLELGNYVVAW TKMDYPRAFFNTVIICGGSIFFIVVIGAMAAYPIARYRTRFNQAMYLFFLATIMIPGQAI MIPLVRLFYKTGLVNTHISLIVYYTAAAVPFAIFLFTGFIKTIPGELEEAASIDGSGAYR TFWRVIFPLMKSSVTTVIILNIMWIWNDFMMPMVLLQSGKMRTLTPSIFNFFEEFVTQWN FAFAASVLVMIPGILVFLLLQKHFIEGMVAGSVKS >gi|229784092|gb|GG667643.1| GENE 50 48784 - 49674 747 296 aa, chain + ## HITS:1 COG:no KEGG:blr4705 NR:ns ## KEGG: blr4705 # Name: not_defined # Def: hypothetical protein # Organism: B.japonicum # Pathway: not_defined # 10 277 14 287 308 122 30.0 3e-26 MNDRMIPQIPPTGEVMRILIDSDVATEIDDLYALTIAFGHPERFLIEGIVAAHFSGNGPS NSKKMSFELLKELLQKSGMEGKFPIAEGSHPMQYANTPNDSEGVDLIIRKAMESTPEDPL WVIGIGAATDLACALLKAPDIVSRVRYVFHARSEHTWPDRNEQFNVYGDIIAAQTLLESE VPLIWFDTGTHICASYEETEEKLAPMGDYGKFLHEFRDRDAWFCQSDKGFFDMGDFSFLY QPDWGLYETVETPEMTRFMYFKHTHRHGKMMRVYDIDVKKTWDLFYEAVNNLRKGV >gi|229784092|gb|GG667643.1| GENE 51 49676 - 50578 844 300 aa, chain + ## HITS:1 COG:no KEGG:Spirs_2575 NR:ns ## KEGG: Spirs_2575 # Name: not_defined # Def: xylose isomerase # Organism: S.smaragdinae # Pathway: not_defined # 1 300 71 368 368 409 63.0 1e-113 MKDPIQKYFDVGTIQWMTYPPQTYPVSEALKKIACDDYFGAVEITHIEDMEARRQVREML 
RQSHLRVSYGAQPALLGTGLNPNDLDDDKRLEAQNLLMAAVDEAEYMGAGGIAFLSGKWK EEAKEQSFRQLLKTCEAVCLYASEKGMMVELEVFDFDMDKAALIGPAPYAAEFAAEIRKR CSNFGLLVDLSHFPTTYETSEFVIRTLRPYITHFHIGNAVVQKAMEAYGDQHPRFGFPGS ANDTAELLDFFRILKAEGFFYCRRPYTLSLEVKPWKNEDGDIILANTKRVINRAWALLEE >gi|229784092|gb|GG667643.1| GENE 52 50584 - 51561 873 325 aa, chain + ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 318 1 323 324 355 54.0 8e-98 MKIVILDGYTENPGDLSWEDLAALGELEVYDRTPAGEIEERMSGAEAVFTNKTPISRHAI ENCPSLKYIGVLATGFDVVDIRAAAEAGITVTNVPSYGTDAVAQYTIALLLELCHHVGAH SDSVKAGEWTKSADWCFWNYPLTELSGKTMGIIGFGRIGQKTARIAEALGMEIKVFDAVC RKELETEHCRYADLEELLAVSDVICLHCPLNSGTKGIINRDTIAKMKDGVLIINDSRGPL IVEKDLKEALESGKVAGAAVDVVSEEPVREDNPLMTAPNMIITPHIAWASRESRKRLMDI AVDNLKSFLQGRPQNVVSVLRQETV >gi|229784092|gb|GG667643.1| GENE 53 51527 - 51688 79 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSYLYCGRKQCKQLTFVLLKLRCICKLPDKGIPYRCILLQKNALDGTLFSAAQ >gi|229784092|gb|GG667643.1| GENE 54 51631 - 51747 70 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYFIAEKCIRWNLVFSGTVTIDNSPRLMVILPWINKRG >gi|229784092|gb|GG667643.1| GENE 55 51988 - 52806 746 272 aa, chain + ## HITS:1 COG:no KEGG:CA_C0752 NR:ns ## KEGG: CA_C0752 # Name: not_defined # Def: DNA ligase III # Organism: C.acetobutylicum # Pathway: not_defined # 1 262 1 262 265 355 63.0 1e-96 MEMIYKYPRTRHIEGSRIQAGDEDLKNVRFEEIRNRFVVLEEKVDGANCGISFSDQGKLM LQSRGHFLNGGYGERQFDLFKTWAGCFQTELYEMLGKRYVMYGEWLYAKHTVYYDRLTHY FMEFDIYDKENGVYLSTAARRKLLRQCPFVESVLVLYEGRLDSLNQLVSYLGRSHFKSDR YEESLREQCEEQGQNWDLVKRQTDRSDLMEGIYIKVEEDDMVTDRFKYVRSTFLNTILDS ETHWLNRPVIPNRLAPGIHLFENPAGRGESGL >gi|229784092|gb|GG667643.1| GENE 56 52806 - 53945 1263 379 aa, chain + ## HITS:1 COG:CAC0753_1 KEGG:ns NR:ns ## COG: CAC0753_1 COG0617 # Protein_GI_number: 15894040 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Clostridium acetobutylicum # 4 233 3 227 228 207 45.0 2e-53 MEKLEDFLERKGYEWEALLPEFPGLMALDETAQDPEYHGEGSVLEHTKRVCRAVVSGTEW KNLNKRDRAVLYMAAMYHDIGKKSRTMQDPSGRIISPGHAIAGAKAFREVCYRELEGGFQ IPFSMREETAWLIRYHGLPLLFMEKAEPSYDLIRAAESVSLPLLYQLGRSDVQGRVCSDQ KNALETTEYFKTYAGELGCYGKKISFANEYTRFSYFEKRDLWYGDRLYDASEFDVYVMAG LPLAGKDTYISEELAGFPVVSLDDIRAEMGIRPDEPSGPVAAEARERAKAYLRAKTSFVW NATNLILDNRQKVCRMCAAYGARVNLKYLEMPYAEILKRNMIRDRSVPVDVINRMIHKMD MAECVEAYRTNYLAGGTQR >gi|229784092|gb|GG667643.1| GENE 57 53962 - 54270 362 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870515|ref|ZP_06114360.2| ## NR: gi|288870515|ref|ZP_06114360.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 102 12 113 113 191 100.0 1e-47 MVFYINKQAVEIKTRFENAPLTSPNEACAAIGMLCKLYGLPIPEKREMVSEMKADLNAAL KGRQAPSKFAEKIISLLDGYFKDDPVTDEIYELLKYSYEEQK >gi|229784092|gb|GG667643.1| GENE 58 54417 - 55457 978 346 aa, chain + ## HITS:1 COG:VC1721 KEGG:ns NR:ns ## COG: VC1721 COG1609 # Protein_GI_number: 15641725 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 1 340 1 334 336 127 27.0 3e-29 MITREDVAKRAGVSVSAVSRAMNHRGYVAKDKKEAIYKAVEELGYRPNPLSHSLKNRQTY QLCFFGADIYNSFYMELFNYMSDYAAERGYTMFLFSEFNEERVKTMLMDGMIVENDTVAE KVQALLGEQFFMPIVTASYGSPIVSLKRIPYVDVDTHDAVEIGLAYLKKMGHKKTAFATP YDFFGDSRVNPRHVAFGNQMRPVYGKRLKDYVIDNTGKARLPFARENFFEEGMNCADQFL ARNCDATAVICFNDEFALGMIRRFEQMGIRIPDDLSILGIDGIENRKYVTPFLSSVSLNI KEQAECCVSTLLDMIEGRRVTFFNSLKPYLVEGNSVRDLSMETIKR >gi|229784092|gb|GG667643.1| GENE 59 55478 - 57431 1569 651 aa, chain + ## HITS:1 COG:TM1195 KEGG:ns NR:ns ## COG: TM1195 COG1874 # Protein_GI_number: 15643951 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Thermotoga maritima # 5 645 2 634 649 409 35.0 1e-114 
MGIHFGVDYYPEHWPRERWETDARLMKEMGLSVVRMAEFSWHKLEPQEGRYDFSWLDDAI
ALLGSCGIRTIVGTPTAAPPAWMANRYPEILPVDREGRTRGFGGRHHDCQSNPVYREFVR
KIVTAMAIHYKDNPNVIGWQPDNELGNSHHDLCMCSSCRKSFQGWLEKKYGTVDRLNEAW
GTAFWSQEYNEFSEVFTPKITVTGENPSAMLDWKCFCSDLIVDFMKEQTEIIRRYCPNHF
ITHNYMGFADKVNYYDLGKQLDFVSHDQYPGGFFADASHERGDMTASALDVVRSFKKQNF
WIMEQQSGITGWEVMGRAPAQGQLSMWSLQSVAHGADAVVYFRWRVCAMGTEQFWHGILP
HSGNPGSRYHELKEMIRNVSPLMERLEGSMPVPEVGIVYSFRQNYALQIQPQNRAMSYIE
QIQEYYRAFYERQIPVDFVPEDGDFGNYKLLAAPLQYLMNPELEERYFNYVRNGGHLILT
MRTGVKNDTNLCMTDRELPGRLSELTGSEVLDYDCLRDVKVPVICQNTEYTARNWCDLMR
VREGVEVLAVYAGEFYAGEPCVTRNNYGKGCCYYVGTEPDQALMAALVGCMADAAGITEL
GSAEEGVELAAREKDGKRWLFAINHTAGERTYTVDSAYRMIYGGEAGNAEG

Prediction of potential genes in microbial genomes
Time: Fri Jul 1 00:46:10 2011
Seq name: gi|229784091|gb|GG667644.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld37, whole genome shotgun sequence
Length of sequence - 37840 bp
Number of predicted genes - 38, with homology - 37
Number of transcription units - 14, operones - 8 average op.length - 4.0

  N  Tu/Op     Conserved   S       Start      End  Score
               pairs(N/Pv)
  1   1 Op  1  .           - CDS       3 -    492    248  ## COG5263 FOG: Glucan-binding domain (YG repeat)
  2   1 Op  2  .           - CDS     489 -   1406    683  ## Closa_0425 hypothetical protein
                           - Term   1818 -   1855   -0.1
  3   2 Tu  1  .           - CDS    2006 -   3889   1792  ## COG4886 Leucine-rich repeat (LRR) protein
                           - Prom   3995 -   4054    5.8
                           - Term   4099 -   4146    6.1
  4   3 Tu  1  .           - CDS    4212 -   4802    234  ## Closa_3908 hypothetical protein
                           - Prom   4832 -   4891    3.5
                           - Term   4876 -   4913    7.1
  5   4 Op  1  32/0.000    - CDS    4983 -   6059   1297  ## COG0216 Protein chain release factor A
  6   4 Op  2  1/0.000     - CDS    6110 -   7090    276  ## PROTEIN SUPPORTED gi|227996692|ref|ZP_04043706.1| (LSU ribosomal protein L3P)-glutamine N5-methyltransferase
  7   4 Op  3  .           - CDS    7087 -   8031   1043  ## COG3872 Predicted metal-dependent enzyme
                           - Term   8043 -   8104    4.0
  8   5 Op  1  9/0.000     - CDS    8112 -   8321    327  ## PROTEIN SUPPORTED gi|238917368|ref|YP_002930885.1| large subunit ribosomal protein L31
                           - Prom   8419 -   8478    7.3
                           - Term   8424 -   8465    0.6
  9   5 Op  2  .           - CDS    8658 -  10370   1643  ## COG1158 Transcription termination factor
                           - Prom  10430 -  10489    3.7
 10   6 Op  1  .           - CDS   10539 -  11840   1221  ## COG1524 Uncharacterized proteins of the AP superfamily
 11   6 Op  2  .           - CDS   11860 -  13194   1678  ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase
                           - Prom  13240 -  13299    2.1
                           - Term  13211 -  13259    7.5
 12   7 Tu  1  .           - CDS   13310 -  14035    779  ## COG0726 Predicted xylanase/chitin deacetylase
                           - Prom  14055 -  14114    5.7
                           + Prom  14089 -  14148    8.3
 13   8 Tu  1  .           + CDS   14212 -  14817    584  ## Rumal_2967 cytidylate kinase
                           + Term  14885 -  14946    8.2
 14   9 Op  1  .           - CDS   14998 -  15192    290  ## gi|266621443|ref|ZP_06114378.1| conserved hypothetical protein
                           - Prom  15212 -  15271    3.2
 15   9 Op  2  15/0.000    - CDS   15277 -  17295   1906  ## COG2205 Osmosensitive K+ channel histidine kinase
 16   9 Op  3  .           - CDS   17364 -  17978    727  ## COG2156 K+-transporting ATPase, c chain
 17   9 Op  4  .           - CDS   17995 -  18126    148  ## gi|266621446|ref|ZP_06114381.1| hypothetical protein CLOSTHATH_02609
 18   9 Op  5  20/0.000    - CDS   18144 -  20219   2548  ## COG2216 High-affinity K+ transport system, ATPase chain B
 19   9 Op  6  .           - CDS   20234 -  21985   1810  ## COG2060 K+-transporting ATPase, A chain
 20   9 Op  7  .           - CDS   21982 -  22077    126  ##
                           - Prom  22125 -  22184    3.5
                           + Prom  22170 -  22229    3.9
 21  10 Tu  1  .           + CDS   22256 -  23440    778  ## Closa_2898 acyltransferase 3
                           + Term  23446 -  23501   15.1
                           - Term  23432 -  23491   16.8
 22  11 Op  1  16/0.000    - CDS   23504 -  24190    669  ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
 23  11 Op  2  .           - CDS   24183 -  25673   1413  ## COG2205 Osmosensitive K+ channel histidine kinase
 24  11 Op  3  17/0.000    - CDS   25663 -  27231   1393  ## COG0168 Trk-type K+ transport systems, membrane components
 25  11 Op  4  1/0.000     - CDS   27236 -  27910    844  ## COG0569 K+ transport systems, NAD-binding component
 26  11 Op  5  .           - CDS   27907 -  28347    540  ## COG0569 K+ transport systems, NAD-binding component
                           - Prom  28371 -  28430    5.6
                           - Term  28529 -  28575    6.3
 27  12 Tu  1  .           - CDS   28607 -  28810    240  ## COG1278 Cold shock proteins
                           - Prom  28830 -  28889   10.5
                           - Term  29080 -  29120    6.6
 28  13 Op  1  47/0.000    - CDS   29136 -  29510    519  ## PROTEIN SUPPORTED gi|240143815|ref|ZP_04742416.1| 50S ribosomal protein L7/L12
 29  13 Op  2  43/0.000    - CDS   29631 -  30134    624  ## PROTEIN SUPPORTED gi|239623242|ref|ZP_04666273.1| ribosomal protein L10
                           - Prom  30261 -  30320    6.3
                           - Term  30401 -  30441   -0.7
 30  13 Op  3  55/0.000    - CDS   30446 -  31141   1017  ## PROTEIN SUPPORTED gi|238916246|ref|YP_002929763.1| large subunit ribosomal protein L1
 31  13 Op  4  45/0.000    - CDS   31209 -  31634    636  ## PROTEIN SUPPORTED gi|240143818|ref|ZP_04742419.1| ribosomal protein L11
 32  13 Op  5  .           - CDS   31722 -  32237    619  ## COG0250 Transcription antiterminator
 33  13 Op  6  .           - CDS   32261 -  32464    265  ## Closa_0408 preprotein translocase, SecE subunit
 34  13 Op  7  .           - CDS   32492 -  32641    242  ## PROTEIN SUPPORTED gi|160881814|ref|YP_001560782.1| ribosomal protein L33
                           - Prom  32839 -  32898   11.2
                           - Term  32891 -  32933    7.1
 35  13 Op  8  .           - CDS   32988 -  33353    389  ## Closa_3981 hypothetical protein
                           - Prom  33391 -  33450   80.4
 36  14 Op  1  .           - CDS   34294 -  35838   1060  ## Closa_3981 hypothetical protein
 37  14 Op  2  38/0.000    - CDS   35867 -  36736    943  ## COG0395 ABC-type sugar transport system, permease component
 38  14 Op  3  .
- CDS 36750 - 37694 1056 ## COG1175 ABC-type sugar transport systems, permease components - Prom 37725 - 37784 3.2 Predicted protein(s) >gi|229784091|gb|GG667644.1| GENE 1 3 - 492 248 163 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 39 145 528 620 621 65 32.0 5e-11 MMWTIRKKSRWFAFWLGLLLTAAFAVPSHAECAEKNRADTGRWVYRTDYGAAEENQNSME QGGAWYRLNAADEVDTGWIFTDGFWYFLDPGDGMQKGRMMTGWQWIDGRCYYFAETSSDF YPQGAMYAACKTPDGYLVGTGGAWLDEHGDEWYIAGRGLSALK >gi|229784091|gb|GG667644.1| GENE 2 489 - 1406 683 305 aa, chain - ## HITS:1 COG:no KEGG:Closa_0425 NR:ns ## KEGG: Closa_0425 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 13 234 65 280 361 89 29.0 3e-16 MCRKWLFFSLAVLTALMVIPADPVYARPKPEITVLSEIMFGTSEEDVPEIIHEEGGQLYR LKSWELEPVIVNPREKYVEQEIFYESVEGIEGVPERRNITSEEDFSGLKVSADYPVIKRK KVKEEWRDDFTFPVVFHSYGAEEYFLGVENVPVDGDALRLELYEEALLSEIGVTTENYRI TSTVWDGAPYMDEDDILCRNATAFGKRKVRDYLVTYGGTVRFPELEGYRCRAVYTLKEFE RTPAEEKSIVRNGAVNTAEEERDSDWIIRREAMVITVSLVLVLFIVMLGAWLIKRGVEKR EERSE >gi|229784091|gb|GG667644.1| GENE 3 2006 - 3889 1792 627 aa, chain - ## HITS:1 COG:CAC3274_2 KEGG:ns NR:ns ## COG: CAC3274_2 COG4886 # Protein_GI_number: 15896519 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Clostridium acetobutylicum # 279 604 41 362 394 68 25.0 4e-11 MKLTALKCPSCGGPIELDEKDENVGYCQYCHTKYYLESEQPKVFIEYKPEAFRNSMATKK ESGSKAAAAAGAVIGVIGLLALILVPLSGTSKNSSRYPDRDMTAAAAASGEQSREESKVP VTYSSSFEVLIKEALGMSISSVTEEELGRFKYLAMRKGEAGWLITYSFENPAEAAEQGQD ADIHVVVCQEALKVQDIAALQKLEYLNLDHTELPEGSLETLESLKMITCDYIQLDMLRLA LGSRAQNLERLNASHLKSLEELSGFPGLKALEMEDCRDIADISALAQVKDLERLKLCDMD EVNDFTVLNVLKKLKSLTIDAENLKDISFINHMPELAHLEITDSRILLLDPLLGKESLKE LALVDNSEVRDYAPVGTLAGLETLVLNKYTSQDDPDLTALAGLKHGEFHGMMGIRFLGSL 
TGLEELKVQGCNVDQPSVIASLPLLKKLVYRKNWGNSDDLSFLAGCQGLTYLDMNHSEFY GDVSYAFNLPALETLILDSSSFEINFDRLQDNPALKALSLNRVKLYKNIEMWSDGMVRSI NYDNVLLDENTEFLAHYPSLEYLSLRENQLTDLQFAAGLKNLRELDVADNYVTDTHVLDQ LEYLGKGSSVKRPVADVDWIDTAGDFH >gi|229784091|gb|GG667644.1| GENE 4 4212 - 4802 234 196 aa, chain - ## HITS:1 COG:no KEGG:Closa_3908 NR:ns ## KEGG: Closa_3908 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 195 10 197 198 149 44.0 6e-35 MANDSLGSIITQGNFLRIDRALVEEVSSSGRNTGFIIISYSVPWQSGITTIQQLRLNINQ NTAVMNSLGMPIRLSDIRRGMRVDATFSPNMTRSIPPQSAAFTIVTRQPSRPSVSTTTQR VVWIDCSNSQLLAGMPNNISRMTRYNITNSTIILNRNGLPIRLCDLRPGQLVEITHASFQ TASIPPQTTAYRIQVR >gi|229784091|gb|GG667644.1| GENE 5 4983 - 6059 1297 358 aa, chain - ## HITS:1 COG:CAC2884 KEGG:ns NR:ns ## COG: CAC2884 COG0216 # Protein_GI_number: 15896138 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Clostridium acetobutylicum # 1 354 1 354 359 383 55.0 1e-106 MFDKLDDILIHYEELMLELGNPSVTEDQNRFRRLMKEQADLAPLVETYTEYKKAKTAVED SLAILDEENDEEMRELAKEELSDAKKQIEALEQELKILLLPKDPNDDKNIILEIRAGAGG DEAALFAAELYRMYVNYSESQHWKVELISVNENGIGGFKEVVAMVTGKGAYSKLKYESGV HRVQRVPETESGGRIHTSTATVAVMPEAEDVDVVIEDKDCRIDVMRASGNGGQCVNTTDS AVRLTHIPTGIVIYSQTEKSQLQNKEKAFRLLRSKLYDMELEKRQNSEAEERRSQIGTGD RSEKIRTYNFPQGRVTDHRIKLTLYKIDSIMNGDITELLDSLIAADQAAKLAKVNENV >gi|229784091|gb|GG667644.1| GENE 6 6110 - 7090 276 326 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227996692|ref|ZP_04043706.1| (LSU ribosomal protein L3P)-glutamine N5-methyltransferase [Kangiella koreensis DSM 16069] # 67 288 81 264 304 110 34 1e-23 MKLTLQTLWEEGAESLRRSGIGEAELDAKYLLFEAFQTDMVHFLMHRNEEVKDEDSVKRT VADYRRMIEKRSERIPLQQITGSREFMGLEFSVNEYVLIPRQDTETLVEQVLKDFQGKNP EVLDLCTGSGCIGISLSILGGWQEVTLADISLKALLVAKKNAEDLMTAKLHPIRLSSEGQ KESPWRWRLTSELPGMEDKTAGVQNITLVESDLFSSLSGDGKRKFDVIVSNPPYIPTNVI EELEPEVRDHEPRLALDGMEDGLYFYRRLAAECGSYLKPGGTVYFEIGYDQGQAVSGLLR EAGFQNVLVYQDAPGLDRVVKGTAAS >gi|229784091|gb|GG667644.1| GENE 7 
7087 - 8031 1043 314 aa, chain - ## HITS:1 COG:CAC2886 KEGG:ns NR:ns ## COG: CAC2886 COG3872 # Protein_GI_number: 15896140 # Func_class: R General function prediction only # Function: Predicted metal-dependent enzyme # Organism: Clostridium acetobutylicum # 2 295 3 296 317 242 42.0 5e-64 MKYSGIGGQAVIEGVMMKNGDEYATAVRKPDGEIEVKKDTYVSMGEKVKFFSLPFVRGVF SFVDSMVLGMRSLTFSASFFEDDEDAGEPGKFEQFLDRVFGEKLEKALMAVVMVFSVIMA IGIFMVLPLFLANIFHRFIASQTVMAVVEGVIRIAIFIAYIKIISHMEDIRRTFMYHGAE HKCINCLEHGLDLTVENVRKSSKEHKRCGTSFLLIVMVISILFFMVVRVDTVWLRVVSRI VLIPVIAGVSYEFLRLAGRSNSKIMDILSRPGMWMQGLTTTEPDDSMIEVGIASVEAVFD WREFLNKNFPEKQG >gi|229784091|gb|GG667644.1| GENE 8 8112 - 8321 327 69 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238917368|ref|YP_002930885.1| large subunit ribosomal protein L31 [Eubacterium eligens ATCC 27750] # 1 65 1 65 68 130 86 1e-29 MKEGIHPNYYQAKVVCNCGNEFVTGSTKEDIHVEVCSKCHSFYTGQQKAAAARGRIDKFN RKYGVKAAE >gi|229784091|gb|GG667644.1| GENE 9 8658 - 10370 1643 570 aa, chain - ## HITS:1 COG:CAC2889 KEGG:ns NR:ns ## COG: CAC2889 COG1158 # Protein_GI_number: 15896143 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Clostridium acetobutylicum # 135 554 57 468 483 406 53.0 1e-113 MREKLETLSLAKLKEFAKEQEIKGYGSMRKAELIDILCAKAEEKEGISAPAVVSTGNDEP SANDVEPAKKAGEPVKREAEPAKEEKAPAPAQTQAAPANRVNPNEKKRTVVRSYRQEGTQ RTGNYRQGQSDAAERNDRQERNDRQDRPEQRQERNDRQERQEQRQDMNRMESIRETAEEA RPEFQPSPELQELDSGVDAYGILEVMPDGFGFIRCENYLPGENDVYVAPSQIRRFNMKTG DIIQGSRRVKTAAEKFAALLFVKYINGYPASVAEHRPNFENMTPIFPNERLHMETTGGRN TTAMRVLDLLSPIGKGQRGMIVSPPKAGKTTLLKQVAKAVTVNHPDMHVIILLIDERPEE VTDIKESITGNNVEVLYSTFDELPERHKRVSEMVIERAKRLVEHGRDVIILLDSITRLAR AYNLTVAPSGRTLSGGLDPAALHMPKRFFGAARNMREGGSLTILATALVDTGSRMDDVIY EEFKGTGNMELVLDRKLSEKRIFPAIDILKSGTRRDDLLLTREEAEAVDLIRKATNTLKA EDAVEKILDLFTRTRTNREFVDMSKKIFRY >gi|229784091|gb|GG667644.1| GENE 10 10539 - 11840 1221 433 aa, chain - ## HITS:1 COG:CAC0477 KEGG:ns NR:ns ## COG: CAC0477 COG1524 # Protein_GI_number: 15893768 # Func_class: R General function 
prediction only # Function: Uncharacterized proteins of the AP superfamily # Organism: Clostridium acetobutylicum # 9 432 6 433 434 356 41.0 5e-98 MSIPDRNKRRMIILSLDAVGGRDLAYMETLPNFGSFFKRAAGCGEVVSVYPSLTYPAHVS IVTGRYPKNHGIVNNLQVQPEREPSDWFWQRKYVRGTTLYDEAVKRGYRVASLLWPVMGK AKVHYNLPEVLPNRPWQNQLGVCVMNGTPLYELELQMKFGSLRDGIRQPQLDHFVHASML HTMEKYRPDLMLVHLTDVDTNRHLYGLDSPEAMAALRRHDERLGELLGLLRKMGLEEETD VVILGDHCQLDVKEAIYPNYYFVKAGLAEVKKGRIRNWRAIARDCDGCCYIYVKDPECVR RTETLVTRMMERENSGIARMFTREEAAALGADPECAFVLEAADGYYFQNGWEEYRRKAGA SDHHTDFHIQAATHGYLPDRDGYRTFFMAAGPHFLKGARVEKMSLVDEGPTLAGILGLDL GQVDGRTVNELIR >gi|229784091|gb|GG667644.1| GENE 11 11860 - 13194 1678 444 aa, chain - ## HITS:1 COG:BS_yrvN KEGG:ns NR:ns ## COG: BS_yrvN COG2256 # Protein_GI_number: 16079807 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Bacillus subtilis # 17 432 3 412 421 409 47.0 1e-114 MDLFDFMRENTMDKESPLASRLRPATLDEVVGQQHIIGKDKLLYRAIKADKLGSVIFYGP PGTGKTTLAKVIANTTSADFKQINATVAGKKDMEEIVKEARDSLGMFGRKTILFVDEIHR FNKGQQDYLLPFVEDGTLTLIGATTENPYFEVNGALLSRSRVFELKPLEREDIRELIRRA VYDRDKGMGAYNAVIDEEAMNFLADVANGDARAALNAVELGIMTTERDAADGKIHIDIDV AQECIQKRVVRYDKDGDNHYDTISAFIKSMRGSDPDAAVYYLARMLYAGEDMKFIARRIM ICAAEDVGNADPQALVVAVAAAQAAERIGLPEAQIILSQAVTYVASAPKSNAACNAVFEA LEAVRSSKTMPVPVHLQDAHYKGAAKLGHGQGYLYAHDFPNHYVKQQYLPDGMEERVFYR PTENGYEKKIKEHMEFIRSEAEES >gi|229784091|gb|GG667644.1| GENE 12 13310 - 14035 779 241 aa, chain - ## HITS:1 COG:SP1479 KEGG:ns NR:ns ## COG: SP1479 COG0726 # Protein_GI_number: 15901329 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Streptococcus pneumoniae TIGR4 # 19 239 233 448 463 159 36.0 3e-39 MEMKLRWKIICLFAVLALDIALGRSMYGFYQEAKSTLFLPADAGDSAQSEVRRQFSGEKQ IALTFDDGPHSVYTSQLLDGLRERGVHATFFLLGENIDGKEGVIRQMKEDGHLIGNHGYT HVQMSKETLESACGQIGETNRKIEEITGQKPEYLRPPYGSWNEELECATDMTVVLWSVDP LDWKLQNTNKVVKKIVKSAEPGDIILLHDVFPTSVQAALEVVDTLKKQGYTFVTVDELLV D 
>gi|229784091|gb|GG667644.1| GENE 13 14212 - 14817 584 201 aa, chain + ## HITS:1 COG:no KEGG:Rumal_2967 NR:ns ## KEGG: Rumal_2967 # Name: not_defined # Def: cytidylate kinase # Organism: R.albus # Pathway: not_defined # 5 197 4 200 209 176 44.0 6e-43 MGHKIITIGRQFGSGGHEIGERLSKALDIPLYDRDLVEMAAEKMGLSEISVEEVDESVLN TFLSAYRYPDHVNSFTGYGLPLNDSTYLAQSSIIESLAKRGPCIIVGRCADFVLRQNPDC LNIFICADMKDRIERIMDRYHLTEKEAATAIKRTDRRRKNYYETYTDQNWGSPDSHQLLI NVSKLGLERSLDLIEYLYKQP >gi|229784091|gb|GG667644.1| GENE 14 14998 - 15192 290 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621443|ref|ZP_06114378.1| ## NR: gi|266621443|ref|ZP_06114378.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 79 100.0 8e-14 MLEAFLTNLAPVKRMADRLEQKHSVLSMLVFQIAVALFLIAAVGGIALAGGGLIWLFYHL VGTM >gi|229784091|gb|GG667644.1| GENE 15 15277 - 17295 1906 672 aa, chain - ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 27 666 14 641 888 539 45.0 1e-153 MDKKRDPEKILQEIKEQEEEQEETGIKRRGKLKIFFGYAAGVGKTYAMLDTAGEAKAAGA DVVAGYIEPHARPDTIALLEGLEVLPPKLIDHKGIKLREFDLDAALWRKPEILLVDELAH TNAEGCRHRKRFQDIQELLDAGIDVYTTVNVQHLESLHDIVASITHIRVNERVPDRIFDE ADKIELVDIEPEELLTRLAAGKIYGKNQAGRAMSNFFTKENLTALREIALRRTADQVNKA VMKERQAAGRDYYTGEHVLVCLSPSPTNEKVIRTAARMAGAFRAELSAVVVETLEAAERE EKVRKTMENNITLAKQFGAKVVTLYGDDIAEQIAGYAKNNGVSKIVIGRTVRPTGLSAIV RRRRNLIDRLIALVPGMDIYVIPDVRAAVQKHRHLTGKNPAAVEPVRVAVDAAAAVFLLA SATILGQGIEACGLGEMNQIMIYIMAVILISGFSRYRVTGVMASAASVLLFNFFFTVPRF TFQAVASVYPFTFLTMLLCALTVSSLAARLKRQVRLSKEESEKMQILIAISNKLKLAADS DEILDICAKQVMKLLEKNVIIYHAVGGEVKQPRVYLSNDAEEMAGGILLNEDETAVAGWV CKNCHKAGCTTNTLPGAAGLYIPIQKDGRVFGVIGIDLKEGAPIKPNDKNVLHVILDETA VVLKAQMIHGYL >gi|229784091|gb|GG667644.1| GENE 16 17364 - 17978 727 204 aa, chain - ## HITS:1 COG:RSc3384 KEGG:ns NR:ns ## COG: RSc3384 COG2156 # 
Protein_GI_number: 17548101 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Ralstonia solanacearum # 12 197 23 200 204 145 43.0 7e-35 MKTIKSVIPKMLGFFLILTVICGVVFPLVVTGIAKVFFPKQATGSILVGEDGTKYGSELL GQQFTENKYLWGRIMNVDTSTFTGENGEPLMYSWASNKTPAGEELEELIAERVEKIREAQ PEKEGQPVPQDLVTCSGSGLDPAISPAAAEFQVSRIARERGITEDDVRAIIETYTDGRVL GVFGEPAVNVLKVNLALDGITWKE >gi|229784091|gb|GG667644.1| GENE 17 17995 - 18126 148 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621446|ref|ZP_06114381.1| ## NR: gi|266621446|ref|ZP_06114381.1| hypothetical protein CLOSTHATH_02609 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02609 [Clostridium hathewayi DSM 13479] # 1 43 1 43 43 62 100.0 7e-09 MYDDIFSNKVYILFRNAVITVILLALGSLIASGIVETVVCEWI >gi|229784091|gb|GG667644.1| GENE 18 18144 - 20219 2548 691 aa, chain - ## HITS:1 COG:lin2829 KEGG:ns NR:ns ## COG: lin2829 COG2216 # Protein_GI_number: 16801889 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Listeria innocua # 15 688 6 679 681 901 72.0 0 MSEKKKTSIFADKRIMTRALKDSLIKLDPREQIKNPVMFMVYVSAILTLVMFVFSLAGIR DAKPWFILAVSVILWFTDLFGNFAEAIAEGRGKAQADALRANRKDVMALKIPSPSQKEKG TRVTSTELRKGDLVYVAAGEQIPADGDVVDGAASVDESAITGESAPVIRESGGDRSAVTG GTTVTSDWLVIKVSNNPGESFMDKMIAMVEGASRKKTPNELALQIFLVALSIIFVLVTLS LYAYSMFSAGQKGIENPTAVTSLVALLICLAPTTIGALLSAIGIAGMSRLNQANVLAMSG RAIEAAGDVDILMLDKTGTITLGNRQADALIPVDGADRTELADAAQLSSLADETPEGRSI VVLAKEQFGLRGRNMEELHAEFVEFTARTRMSGINYQGNEIRKGAADAVKAYVLANGGVY SAECDQVVTKIANQGGTPLVVAKNHRILGVIELKDIVKEGVKEKFADLRKMGIKTIMITG DNPLTAASIAAEAGVDDFLAEATPEGKLKMIRDFQSQGHLVAMTGDGTNDAPALAQADVA VAMNTGTQAAKEAGNMVDLDSSPTKLIDIVKIGKQLLMTRGALTTFSIANDVAKYFAIIP ALFMGLYPGLAALNIMNLSSPYSAIFSALIYNALIIVALIPLALKGVKYREVPAAKMLTG NLLVYGLGGIIIPFIAIKVIDVMIVALGFAL >gi|229784091|gb|GG667644.1| GENE 19 20234 - 21985 1810 583 aa, chain - ## HITS:1 COG:lin2830 KEGG:ns NR:ns ## COG: lin2830 COG2060 # Protein_GI_number: 16801890 # 
Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Listeria innocua # 1 579 1 557 561 542 55.0 1e-154 MTNTAIQMILYCAVLIVLAIPLGRYIGKVMNGEKTVLTRLLSPVENGIYKVLKINREEEM DWKKYAVCAGVFSVISLAVLWLILMLQNILPLNPEGQAGTSWHLGFNTAASFVTNTNWQA YSGESALSYFSQMIGLGVQNFVTPAVGMAVLFALIRGFIRVKKKGIGNFYTDVTKSTVYV LIPLSLIVSVFIVSRGVPQTFRGHEEVQLLEPVTVENEDGTATVVDKQIVPLGPAASQIS IKQLGTNGGGFYGVNSAHPLENPDPWSNLFEMISLLLIPAALCFTFGRNIKDKKQGVAVF LAMFIMLALALAVVGYTEQVSTPQIAQGGAVDISTAGQAGGNMEGKETRFGIATSGTWAA FTTAASNGSVNSMHDSYTPLGGMIPMLLMMLGEVVFGGVGCGLYGMLAFAILTVFIAGLM VGRTPEYLGKKIEPYEMKMAVLICLATPIGILVGSGIAAVLPATADSLNNPGAHGLSEVL YAYASAGGNNGSAFAGFNANTPFLNTSIGLVMLFVRFLPMFGALAIAGSMAQKKKVAVSA GTLPTHNAMFIFLLIVVVLLIGALSFFPALALGPIAEFFEMIG >gi|229784091|gb|GG667644.1| GENE 20 21982 - 22077 126 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEMGFMLGIIGLTALFILVYLIIILLRGDKQ >gi|229784091|gb|GG667644.1| GENE 21 22256 - 23440 778 394 aa, chain + ## HITS:1 COG:no KEGG:Closa_2898 NR:ns ## KEGG: Closa_2898 # Name: not_defined # Def: acyltransferase 3 # Organism: C.saccharolyticum # Pathway: not_defined # 51 332 10 295 356 126 32.0 2e-27 MLKTKYILMIFILLIVLHAHYRRLKQAGSFSSYIRSMLWDPSYDRNRIKAYDYIRVLAVV FVIAAHVVQSDSEPAAGTSAFMALRIMAVLFLNCNLLFVMLSGALLLNQKEEPVSVFYRK RIFKVVIPMAVYYLFYLYMGLYQSGLTDPKNLLDACRRFLSGPSQWNPHFWLMYVILAFY LAAPFFKVMVQNMSDRLLLSFVVLTFFMNGALSWLPLLGIEFHFVTLLCGWESVFVFGYF WTRPFSSRYRKPFMILGAASFLITAAISCVLPDYTDAFLNKAPTMLFMSGAIFSWFTGME EKLKKPGPTVRIISKYGFSILLIHWYILHHTAEEQLGLYASSWGILGGTAVTVAVTLVLS FLFAFVFDQTAVLCVNTLCSALSRGLGQLKRLRR >gi|229784091|gb|GG667644.1| GENE 22 23504 - 24190 669 228 aa, chain - ## HITS:1 COG:CAC3677 KEGG:ns NR:ns ## COG: CAC3677 COG0745 # Protein_GI_number: 15896909 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 227 3 231 232 265 58.0 4e-71 
MNDADILVIEDDKTIQNFIKITLKTKGYRCVLADDGLTGISYFYANNPDLILLDLGLPDI DGMEVLGQIRQESEVPVIVVSARGMEQEKVGALDAGADDYITKPFNAGELLARIRVALRH RSSAPRQEPVFELDTLKMDFEKRKVSVRGEEVHLTPIEYKMLSLLVNNSGKVLTHHYIQQ EVWGYDTTDDYQSLRVFMANIRRKIEEDSTKPRFILTEVGVGYRFVES >gi|229784091|gb|GG667644.1| GENE 23 24183 - 25673 1413 496 aa, chain - ## HITS:1 COG:CAC3678 KEGG:ns NR:ns ## COG: CAC3678 COG2205 # Protein_GI_number: 15896910 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Clostridium acetobutylicum # 15 492 403 900 900 325 36.0 1e-88 MRNDIASAEHIVKILKNASKVAVIWVLSTMFAIVLNHMGVRIENLLLIYVVGVVICSVET SSMCWGIGSSVVFVFTFNFLFTAPKFTFQVDDPNYVISLMIFIIVAFIVASLTVKLQRQM DIANRRTEITTKLNAIGSGFLNLAGRPAIARYSCDSLKKLTEKRVDILIRSSDNEEFSDS IAEWCYRNSMICGHGEAQFGENTSLYVPIKNSNKTYGVIIFDCCDGDLEEEERVYVDTVI SQITLVLERERLNEEKEETKIQIEKERLKSTLLRSISHDLRTPLTGIAGSSNFLYDNYGS LDEETAKSMLNDICTDAEWLNSMVENLLNMTRIQEGRLDIQKKKEVVDDLISGAVTLVSK RLGDHRLKTVTPHDIVLLPVDARLFTQVLVNLIDNAIRHSGNDTTITVSARVMGNFITFK VSDNGSGIPEERLDKIFDNFFTTAYENGDKQRGVGLGLTICKAIVEAHGGHIKAVNNEGG GTTFQVDMPMEVRQNE >gi|229784091|gb|GG667644.1| GENE 24 25663 - 27231 1393 522 aa, chain - ## HITS:1 COG:TM1089 KEGG:ns NR:ns ## COG: TM1089 COG0168 # Protein_GI_number: 15643846 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Thermotoga maritima # 22 503 9 490 495 291 38.0 2e-78 MILWETLGAENHHERHVLKGYRLIAGYMGIVMILAGIITLLPLFTLFFYPEEMGQAQYFV APGVSSILAGYLLSLVLRGRRAGNLEKNQEMIVVLGTWVIVIFVTAIPFVLSGTYSVTQA VFETTSGLSTTGLSVVDVSTAPHIFLIHRTILLFFGGIGLVLVMTSVLSDVYGMRLYHAE GHSDRLLPNLIESARLIIGIYSGYILGGTFLYILFGMSPFDAINHAVAALSTGGFSTRAE SIGYYNSTAIEMVTVILMLLGSTNFFVHLLLLKGKFKEFISHCEVRLMLFLLALFVPILT VLLMNGVYSNVPVSLRAALFQAVSALTTTGFQTVESFAAWTPAMMFLMILLMLIGGGAGS TAGGLKIYRVYVMLKEILWKLVRDAHPDRVVFTEQINRGGKKEVITDREKNRINAFVFFY LLLFAAGSFVFCCYGYSIQDSMFEFASAISTVGLSVGITSYGASPVIHWTAIAGMFLGRL EIYVVLIAAARMASDGKNYLHEKQMKRKRHRNRNRKRDHYEK >gi|229784091|gb|GG667644.1| GENE 25 27236 - 
27910 844 224 aa, chain - ## HITS:1 COG:TM1088 KEGG:ns NR:ns ## COG: TM1088 COG0569 # Protein_GI_number: 15644627 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Thermotoga maritima # 1 219 1 217 218 122 30.0 8e-28 MKIVVAGGRDEADFLIGLLLMGKHKIIAINGDMGYCEHLAASYNAASVIYGDPCKEYVMD EAGVKGYDILIALREKDADNLEICQMAKRLFAIKKTVCTVRNPKNVEIFEMLGVDRAISA TYMLAHYIEQASVIEDLVRVMPLENQKVLVNEIKVNSDCPVVDRRIADMELPFNTIISCI IRETEVIVPNGQNKIMAGDRLVVITTPQSQKESVKAILGSEDNE >gi|229784091|gb|GG667644.1| GENE 26 27907 - 28347 540 146 aa, chain - ## HITS:1 COG:Rv2691 KEGG:ns NR:ns ## COG: Rv2691 COG0569 # Protein_GI_number: 15609828 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Mycobacterium tuberculosis H37Rv # 9 122 3 116 227 75 32.0 4e-14 MFFHKTKTIIIAGASRFGAGLAGKLSGRSTRIVVIDLNEDAFRKLPEDFSGYQIVGDGTD IDLLKQVGIESAEVFIAATDDDNVNILASQIACRIFHVPEVFSRLNDKNKEKLIRGFNIK PICPFVLSVNEFDRLTEGEYQEAGIS >gi|229784091|gb|GG667644.1| GENE 27 28607 - 28810 240 67 aa, chain - ## HITS:1 COG:RSc2466 KEGG:ns NR:ns ## COG: RSc2466 COG1278 # Protein_GI_number: 17547185 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Ralstonia solanacearum # 1 64 1 64 67 83 62.0 8e-17 MNNGTVKWFNGAKGYGFITNDETGEEVFVHFSGIMADGYKTLEEGQKVTYDTTEGNRGLQ AVNVCLA >gi|229784091|gb|GG667644.1| GENE 28 29136 - 29510 519 124 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240143815|ref|ZP_04742416.1| 50S ribosomal protein L7/L12 [Roseburia intestinalis L1-82] # 1 124 1 125 125 204 89 6e-52 MAKLTTAEFIEAIKELSVLELNDLVKACEEEFGVSAAAGVVVAAAGAGAAAEEEKTEFDV ELTEVGPNKVKVIKVVREATGLGLKEAKDVVDSAPKVVKAGASKEEAEQIKASLEAEGAK VTLK >gi|229784091|gb|GG667644.1| GENE 29 29631 - 30134 624 167 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623242|ref|ZP_04666273.1| ribosomal protein L10 [Clostridiales bacterium 1_7_47_FAA] # 1 167 1 166 166 244 76 4e-64 MAKVELKQPVVDEIKELFDGAETAVVVDYRGLTVAQDTELRKKLREAGVVYKVYKNTMIR 
FAGKGTAFEALEPNLEGPTALAVSKTDATAPARILAEFAKTAPALEIKGGVVEGVYYDEK GMAQISSIPSREVLLGKLLGSIQSPITNFARVLKQIAEAQGGESAEA >gi|229784091|gb|GG667644.1| GENE 30 30446 - 31141 1017 231 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916246|ref|YP_002929763.1| large subunit ribosomal protein L1 [Eubacterium eligens ATCC 27750] # 1 228 1 228 231 396 84 1e-109 MKRGKRYAEAAKTIDRTVLYDAPEAISLVKKAASAKFDETIEAHIRTGCDGRHADQQIRG AVVLPHGTGKTVRILVFAKGPKADEALAAGADFVGAEELIPKIQNEGWLDFDVVVATPDM MGVVGRLGRVLGPKGLMPNPKAGTVTMDVTKAINDIKAGKVEYRLDKTNIIHVPVGKASF EEDKLADNFQTLIDAIMKAKPSAVKGAYLKSVTLTSTMGPGVKLNVAKLMN >gi|229784091|gb|GG667644.1| GENE 31 31209 - 31634 636 141 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240143818|ref|ZP_04742419.1| ribosomal protein L11 [Roseburia intestinalis L1-82] # 1 141 1 141 141 249 88 2e-65 MAKKVTGYIKLQIPAGKATPAPPVGPALGQHGVNIVQFTKEFNARTADQGDLIIPVVITV YADRSFSFITKTPPAAVLLKKACKLKSGSAAPNKTKVATISKEDLQKIAETKMPDLNAAS LEAAMSMIAGTARSMGITVAE >gi|229784091|gb|GG667644.1| GENE 32 31722 - 32237 619 171 aa, chain - ## HITS:1 COG:CAC3149 KEGG:ns NR:ns ## COG: CAC3149 COG0250 # Protein_GI_number: 15896397 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Clostridium acetobutylicum # 1 171 1 172 173 193 60.0 1e-49 MSEAHWYVVHTYSGYENKVKVDIEKTIENRNLQDQILEVSVPLESVIELKNGVEKKADKK MFPGYVLIHMIMNDDTWYVVRNTRGVTGFVGPGSKPVPLTEEEMANLGFRNEEVIVDFEV GDTVVVISGAWKDTVGAIKSINEGKKSITINVEMFGRETPVELNFNEIKKM >gi|229784091|gb|GG667644.1| GENE 33 32261 - 32464 265 67 aa, chain - ## HITS:1 COG:no KEGG:Closa_0408 NR:ns ## KEGG: Closa_0408 # Name: not_defined # Def: preprotein translocase, SecE subunit # Organism: C.saccharolyticum # Pathway: Protein export [PATH:csh03060]; Bacterial secretion system [PATH:csh03070] # 1 67 1 67 67 82 83.0 5e-15 MGDTVNNEGAQKKNFFQGLKTEFKKIVWPDQETVTKQTIAVLAVSIALGLIIAILDLIIK FGLSFIL >gi|229784091|gb|GG667644.1| GENE 34 32492 - 32641 242 49 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881814|ref|YP_001560782.1| ribosomal protein L33 [Clostridium 
phytofermentans ISDg] # 1 49 1 49 49 97 83 9e-20 MRTKITLACTECKQRNYNMTKDKKTHPDRMETKKYCRFCKTHTLHKETK >gi|229784091|gb|GG667644.1| GENE 35 32988 - 33353 389 121 aa, chain - ## HITS:1 COG:no KEGG:Closa_3981 NR:ns ## KEGG: Closa_3981 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 12 113 547 649 651 89 45.0 4e-17 MEQGLMERAGDLSYIKEGFRVLARCGGYDALIRVYEQLPEEVASESRVVLEYIMALYRTG SFRKAYELLTADGGLVAADLREGENSIADLWLDLRRALNMPEEEVPHVFDFSAEIRQNRK K >gi|229784091|gb|GG667644.1| GENE 36 34294 - 35838 1060 514 aa, chain - ## HITS:1 COG:no KEGG:Closa_3981 NR:ns ## KEGG: Closa_3981 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 480 1 479 651 538 55.0 1e-151 MENYMRPTLIFDQVYLKTAPKGPEASVPNLLGMDNMQNKTEFLLDEYDEIYEGYGTLNSV YPYRDYTCYTRECGLHPVKTAVLENEYLRAVFLPEFGGRLWQLTDLKSGRELLYTNDVIR PSNLALRDAWFSGGVEWNIGLIGHTPLTMEPLFTARLETGSGLPVLRMYEYERIRRMVYQ MDFWLDECGAYLNCRLRVVNHNQEVTPMYWWSNMAVPEYEDGRVIVPADSAYSSGGGSVY KVPVPVVDGIDISHYRNIPGQVDYFFHIPEESPRYIANVASDGFGLLQYSTRRLCGRKLF SWGNNSASARWQEYLTEEAGRYIEIQAGLGKTQYGCIPMAPHTAWEWIERYGAVQLNGRS DSFEEDRETLTEMVRAEAGDTLEKVLENSREWAVKPGEVIYRGSGYADLENACRLRRGER PLSPHLDFSSEDQRQAPWRVFLETGRFPSADPAEIPADCMADDFWYEMLKKQAEQDQREN LQPQAEPQSRGDRKPGKASPDWHLLYHLALNYIS >gi|229784091|gb|GG667644.1| GENE 37 35867 - 36736 943 289 aa, chain - ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 6 289 21 300 300 149 31.0 9e-36 MAKQKEIKPKLNVTREIPRLPGYILCVIWLIFTVLLIGWVMLASLSTSREILGGELFKFT SGLHFENYTNAWKSQKISVFFLNSLIYTAVSCTLIIMVAAPAAYALSRFKFRGNMILQNM FATALGIPVIMIVMPLFSIITSLKLTNTRGLIMFLYIAMNVPFSIFFLLAFFSNLSYTFE EAAAIDGCSPIQTFWKIMFPLAQPGIITVTIFNFIAIWNEFFMSMLFASKQNVRPIAVGL YNMVKGMQYTNDYGGMFAAVVIVFAPTFLLYLFLSDKIIAGVTGGAIKG >gi|229784091|gb|GG667644.1| GENE 38 36750 - 37694 1056 314 aa, chain - ## HITS:1 
COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 15 252 6 232 292 99 32.0 1e-20 MAKQKAVKKGGNPLRRNTTPMLVMFLTPAVLLYLIVFLYPTIRTFIMSFCKVEGVTDPVS TWQFSGLTNYVKSFNNNMFQTAMLNLGRLWLVGGIAVMALALIFAVALTNGMQGKKFFRS VIYLPNLVSAVAMGTMWLYYALSKENYGLLNTIIGWFGGSNVMWTDPHHKFWSMLIAYCF GMVGYHMLIFMSGIERISADYYEAASIEGANVFKKFWYITFPLLRGSIRTNLVMWTVSSV GFFIWGQVFDPVNLSEQTVMPLNYMYELVFGASNAVQAARDSGVGAAIGVMMAVIVVAVF SLTTLIIKNDDAEL Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:47:11 2011 Seq name: gi|229784090|gb|GG667645.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld38, whole genome shotgun sequence Length of sequence - 51426 bp Number of predicted genes - 43, with homology - 43 Number of transcription units - 19, operones - 11 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 2521 2438 ## COG2200 FOG: EAL domain - Prom 2656 - 2715 1.7 - Term 2657 - 2705 7.1 2 2 Op 1 . - CDS 2726 - 3583 901 ## COG0253 Diaminopimelate epimerase 3 2 Op 2 . - CDS 3580 - 4605 915 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes - Prom 4708 - 4767 6.6 + Prom 4667 - 4726 6.7 4 3 Tu 1 . + CDS 4751 - 5590 830 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 5580 - 5620 10.3 5 4 Tu 1 . - CDS 5666 - 6451 863 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 6513 - 6572 7.4 + Prom 6557 - 6616 6.5 6 5 Tu 1 . + CDS 6727 - 7749 786 ## Closa_1256 hypothetical protein + Term 7781 - 7828 10.5 7 6 Op 1 . - CDS 7792 - 8805 1114 ## Ccel_3208 hypothetical protein - Term 8858 - 8895 2.1 8 6 Op 2 . 
- CDS 8904 - 10196 1235 ## Closa_1252 hypothetical protein - Prom 10254 - 10313 2.6 - Term 10213 - 10249 6.2 9 7 Op 1 25/0.000 - CDS 10357 - 12063 2100 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 10 7 Op 2 . - CDS 12096 - 12359 321 ## COG1925 Phosphotransferase system, HPr-related proteins 11 7 Op 3 . - CDS 12387 - 13727 1573 ## Closa_1249 YbbR family protein 12 7 Op 4 . - CDS 13705 - 14577 807 ## COG1624 Uncharacterized conserved protein 13 7 Op 5 . - CDS 14608 - 15555 1082 ## Closa_1247 hypothetical protein 14 7 Op 6 . - CDS 15552 - 15953 409 ## Closa_1246 hypothetical protein - Prom 16008 - 16067 6.9 - Term 16072 - 16120 7.1 15 8 Op 1 10/0.000 - CDS 16157 - 17368 350 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 16 8 Op 2 36/0.000 - CDS 17386 - 18594 317 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 17 8 Op 3 24/0.000 - CDS 18585 - 19379 319 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 18 8 Op 4 . - CDS 19396 - 20526 1540 ## COG0845 Membrane-fusion protein - Prom 20602 - 20661 80.4 19 9 Tu 1 . - CDS 21501 - 22583 1047 ## COG2199 FOG: GGDEF domain - Prom 22691 - 22750 7.0 - Term 22863 - 22896 2.0 20 10 Op 1 . - CDS 22939 - 24168 1266 ## gi|266621489|ref|ZP_06114424.1| conserved hypothetical protein 21 10 Op 2 1/0.250 - CDS 24168 - 25388 987 ## COG0477 Permeases of the major facilitator superfamily 22 10 Op 3 44/0.000 - CDS 25419 - 26378 724 ## COG4608 ABC-type oligopeptide transport system, ATPase component 23 10 Op 4 44/0.000 - CDS 26380 - 27381 562 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 24 10 Op 5 49/0.000 - CDS 27404 - 28303 729 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 25 10 Op 6 38/0.000 - CDS 28284 - 29252 994 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 26 10 Op 7 . 
- CDS 29263 - 30915 1871 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Prom 31006 - 31065 7.4 27 11 Op 1 1/0.250 + CDS 31243 - 32019 272 ## COG2188 Transcriptional regulators + Term 32076 - 32115 -0.4 + Prom 32093 - 32152 8.4 28 11 Op 2 . + CDS 32197 - 32952 673 ## COG2188 Transcriptional regulators + Term 32971 - 33022 9.3 29 12 Op 1 . - CDS 33058 - 34077 844 ## COG5279 Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain 30 12 Op 2 . - CDS 34096 - 34791 681 ## EUBELI_01840 hypothetical protein 31 12 Op 3 . - CDS 34831 - 36369 1473 ## EUBELI_01841 hypothetical protein - Prom 36407 - 36466 3.8 32 13 Op 1 . - CDS 36521 - 37996 1266 ## ELI_2097 regulatory protein GntR 33 13 Op 2 . - CDS 38019 - 38387 410 ## COG2200 FOG: EAL domain - Prom 38427 - 38486 10.8 34 14 Op 1 . - CDS 39388 - 41694 1913 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain 35 14 Op 2 1/0.250 - CDS 41703 - 43256 1458 ## COG2199 FOG: GGDEF domain - Prom 43386 - 43445 8.0 - Term 43477 - 43521 9.8 36 15 Op 1 . - CDS 43537 - 44670 1015 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family 37 15 Op 2 . - CDS 44643 - 44792 122 ## gi|266621506|ref|ZP_06114441.1| hypothetical protein CLOSTHATH_02669 38 15 Op 3 . - CDS 44851 - 46314 1447 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 39 16 Op 1 . - CDS 47246 - 47569 376 ## Clole_4004 xenobiotic-transporting ATPase (EC:3.6.3.44) 40 16 Op 2 . - CDS 47562 - 49292 219 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 49341 - 49400 4.3 41 17 Tu 1 . - CDS 49452 - 50162 329 ## COG2378 Predicted transcriptional regulator - Prom 50188 - 50247 7.5 + Prom 50192 - 50251 5.1 42 18 Tu 1 . + CDS 50282 - 50773 445 ## Sgly_1223 isochorismatase hydrolase + Term 50775 - 50821 12.2 43 19 Tu 1 . 
- CDS 50839 - 51321 377 ## Dhaf_2153 GCN5-related N-acetyltransferase - Prom 51355 - 51414 1.8 Predicted protein(s) >gi|229784090|gb|GG667645.1| GENE 1 1 - 2521 2438 840 aa, chain - ## HITS:1 COG:AGl374gl_3 KEGG:ns NR:ns ## COG: AGl374gl_3 COG2200 # Protein_GI_number: 15890301 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 432 679 6 251 266 184 37.0 5e-46 MIECLQLWLRQGGGVMNQLPAAGEGTGIYVIDGEFRVVYFNETAKAVCPNLRVGDLCYRG ICNGSAPCSSCPDRNEDCSRLLLYDAVNRQWIELSSGRIEWPGHGRCRLMLFKVVDRQSM SLFYDLTETSAYEELFELNLAENTYKILFHQNGKYVIPDMEGRLDSMCMEVADHMICPED RERFLEFWNFDTLAQRLSGNNHTIRGEFRKLLVSGEYCWVAQTAIQLHTGEQDKTVIMVF IQDIDKQKKQEIAAANSGKSGLEEKDSMMGLYRYGPFLERAEQFLREKSGSKFSMVAIDI EHFKLFNEWYGEEAGDRFLTSIAEQLKEVEKQYDSMAGYMGGDDFAIIIPREESVLRNLE RGIRHFVKQYGGSVGFLPAFGVYEIEEWDVSVSMMYDRASIALASVKGNYLNRMGRYDAG MKKKMEDDQLLLSEVQRALDNHEFIFYAQPQCNMLTGKIIGLESLVRWKHPVRGVVQPGE FIPLLERNGFITNLDLYVWDMVCARLHDWIAAGHKPVPIAVNVSRIDIYSMNVTNTFVEL TGRYGIAPALLAIEITESAYAEDYNLIRQVVIDLRKAGFTVFMDDFGSGYSSLNMLKDVN VDVIKIDTKFLDMNEHSQNRGMGILETIVRMARLTQLRVIAEGVEKKEHVDFLRNVGCIY GQGYYYYRPMPIEQFEPLLLDEENVDYRGILAHQLEELRLEDLVNKDVTSDAILNNMLGG MALYDVCGDKIELLQVNEGYYRVTGCNPVDLEERRKSILAQIHEEDRDKVFDLFETAYKN PLKGADGTIRRYRLNGELMWISLRIFFLKERDEHRLFYGSVSDVTEGKRREEKLIASRKI >gi|229784090|gb|GG667645.1| GENE 2 2726 - 3583 901 285 aa, chain - ## HITS:1 COG:alr4841 KEGG:ns NR:ns ## COG: alr4841 COG0253 # Protein_GI_number: 17232333 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Nostoc sp. 
PCC 7120 # 7 278 5 276 285 238 44.0 1e-62 MKITLEKYHGLGNDYLIFDPNKNEWKLTKEAVRLICNRNFGVGSDGLLVGPILGENKMEV KILNPDGSEAELSGNGVRIFGKYLKDAGYVQKNHFAVNTLSGEQQIHYLNEEGTRIKVSM GHLSFWSDEIPMTGPRREVLNETLTFGSIPYRVTCVSIGNPHCVIWLNDISKDLVCRIGR HSEAADCFPEKVNTELLRVVDRTNIEIEIYERGAGYTLASGTSGCAAAGAAYKMGLTDPK MYVHMPGGVLEVEIEEDVSVLMTGDVGYVGKFTLSHEMTEQLRSL >gi|229784090|gb|GG667645.1| GENE 3 3580 - 4605 915 341 aa, chain - ## HITS:1 COG:PA4201 KEGG:ns NR:ns ## COG: PA4201 COG1181 # Protein_GI_number: 15599396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Pseudomonas aeruginosa # 1 336 6 334 346 231 42.0 1e-60 MKIIVLAGGYSPEREVSLSSGSLIANALMEQGHEVFFLDPWPGIREGEEVRYRTREEGYH FSSEIGMEAPDLEKLRAGHGMKKTLIGPGVLKACREADMVFIALHGGAGEDGQIQAVLEA YGIPHTGSSWKGCLLAMDKNISKIFMRMAGVATPDWYLAEKGKTSRPDFLPCVVKPLDCG SSVGVTIVEREEEWSAALEAAFSHGGRVLVEKKITGREFSVGVLGDRALPAIEIIPESGF YDYQNKYQPGMTREICPAWIDGMQEVKMKETALTVHRILELGYYSRVDFMMDEDGAIYCL EANTLPGMTPLSLLPQEAKAAGITYHELCEAIVNNGKGAAL >gi|229784090|gb|GG667645.1| GENE 4 4751 - 5590 830 279 aa, chain + ## HITS:1 COG:CAC2818 KEGG:ns NR:ns ## COG: CAC2818 COG2207 # Protein_GI_number: 15896073 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 6 275 4 274 279 233 43.0 3e-61 MTGYEKRGYLNSDFRLFHLTDTKKQDFEYHYHDFDKIIIFIRGRVTYRIEGQEYQLQPYD IVLVSHNDIHRPDIDPSVPYERIIVYLSPRFITSYRSDTYDLSDCFHKAKELRSNVLRIH SLEKSGLFKTITDLEYACAHDGYAKELYCQVVFLEFMIKLNRAALTGRVEYLSPGTGDRR ISAVMEYISGHLTEDLSVDQIADAAYLSRYHLMHLFKQATGYTLGSYITEKRLLLARDLL RSGTPVTSVCFDCGFKNYSTFSRAYKKYFESTPSDARQK >gi|229784090|gb|GG667645.1| GENE 5 5666 - 6451 863 261 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 27 118 517 604 693 67 36.0 2e-11 
MGKKWITAVLAAALSVAMASSAFAGTWENQEGRWKYHNDDGSYAAAQWVEDQGNWYYVEQ DGFMKTGWYQDAAGVWYYLDENTGAMASNTTRNIGGTDYTFDGSGAWMQPQVRLNDGWNG TTYVNRLIGYQITLPDSYQAVSGNSVSLPENAQYDMVAAAPDGNSIIVTAVINMSELDGE NWSDEELAYYFKYIYRSDSDVIGASFNDLGFAKITLVQSEDGVADVYFRRCGSYVLMIET VFIPSRWQEIDQILHTMKIPD >gi|229784090|gb|GG667645.1| GENE 6 6727 - 7749 786 340 aa, chain + ## HITS:1 COG:no KEGG:Closa_1256 NR:ns ## KEGG: Closa_1256 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 300 1 297 320 303 53.0 1e-80 MKKMVYFLIPFFAAAVGITACGAASKSVMVAETTGAYSETAAETVSNDSESYDYASQKAE GGTTLNTESFSTPKLPEGRKLIRNITMQVETEEFDRMISGVNAKLSELGGYTEQSDISGT SLNYRGQPIPRYANITARIPDDKLDRFVGAVSDSGNVTNKSESTQDVTLQYSDMESRKKS LEIEQERIWQFLEKAESIDTVITLEQRLSDIRYQLESMESQLRLYDNQVDYSTVYLNIAE VTSYTPAAPESAGTRMKNGLAKNFESLITALTNLLIFIITTCPFWIPLAVIAAVILSAVR HHNRKIYRISEEEKVRSSAPRKSEELPDAEQDDTSDDSRD >gi|229784090|gb|GG667645.1| GENE 7 7792 - 8805 1114 337 aa, chain - ## HITS:1 COG:no KEGG:Ccel_3208 NR:ns ## KEGG: Ccel_3208 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 3 337 2 337 338 209 33.0 1e-52 MEKEDIRVTHVIIHILDSTVGTAVMSGGLLEHGSDFSDFVRSHIYRVMTSDEGKSCQFDG DSPVYQMVSEMEEDTFIQKSQEIAQKLFDIMYSNIDIPSADLMVARYDAGQQCGIALLKM NYKSSYTHMTNYTENGNCNDIIRQNAILPAENQKLAEACLIDLNDYSLRVMEKKYEVNGV KTNYFSQMFLQCHGSMSPKTELSIVTRAVEQVQKGYYDESEQWEAHMEAKSILSQELAEQ GTIDIPTVAEKIFHEKPELKEEFTEKMEKYNLSEKQVTPQNPSTTRKFEKQYLTTDTGIE IKIPMEEYQNRNSVEFITNADGSISVLIKNIGSITSR >gi|229784090|gb|GG667645.1| GENE 8 8904 - 10196 1235 430 aa, chain - ## HITS:1 COG:no KEGG:Closa_1252 NR:ns ## KEGG: Closa_1252 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 430 1 430 430 679 82.0 0 MGVYLICYAASYLFARAGYYWLSGLVLILAALWLYWYDYRRTENIIHLRGLFSLFWVGGQ GIACLKLSELSSDWNIVTWLCFLAALLGFWVTYEFLMRAQGESGGQRSRWFNFQGYERPL FWSMVIVTAVSLGAFLAEALILGFVPFFVRGVPHAYSTFHITGVHYFTVSCVLVPALSVL YFCIEQGRNGVRMVLVILMDILACAIPVLCVSRFQLILAVGLAVLTYIAMDNRLKISYAI 
GLVLGMIPIYVLLTFARGHDVSYLNGIFQMKNSEMPIFITQPYMYIANNYDNFNCMVEWL PSHTFGLRMLFPLWALTGLKFLVPSLVDFPIYVTKEELTTVTLFYDSFYDFGIVGVVLFA CILGMAAFLLVRMMKRIRNPVGYLFYAQFAMYLVLSFFTTWFSNPATWFYLIVTGLLWGF CSWRNDGRHG >gi|229784090|gb|GG667645.1| GENE 9 10357 - 12063 2100 568 aa, chain - ## HITS:1 COG:BH3073 KEGG:ns NR:ns ## COG: BH3073 COG1080 # Protein_GI_number: 15615635 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Bacillus halodurans # 2 566 5 568 572 538 50.0 1e-152 MMKGTSASAGIGIGKAAIVEETELVIKKETISDAAAEKERFQAALKQAMEETEALAKDLA TRVGEKEAEILNGHLMLLSDPMLTGEIENTIAGENACSEYAIENVCNMYADMFASMGDEL MQQRATDMRDIKTRMQKILLGVSSVDIASLPAGSVIVAKDLTPSMTAGINPANVCGIVTE LGGKTSHSAILARALEIPAVVAVEGFLNSVKDGDTVVLDGSEGVVFVNPEEAVTAEYEAK RTAYLKEKKELDQYIGKPTVTKDGVTIELVANIGKPEDVDKVLQYDAEGIGLFRTEFLFM DRNSMPTEDEQFEAYQKVAIAMNGKPVIIRTLDIGGDKEIPYMGLKKDENPFLGYRAIRF CLDRREDVYRPQLRALLRASAFGNIKIMVPMVTCLEEFREAKAMIEEIKAELDSRGIAYK KDIQVGIMVETAAASLMADAFAKEVDFFSIGTNDLTQYTMSVDRGNDKVSYLYSPLNPAV LRSIRHIIQCGRKEGIMVGMCGEAASDPLMIPLLLAFGLNEFSMSASAVLNARKLITGYS IAELQAIADKAMSFVTAGEVEAYMREQQ >gi|229784090|gb|GG667645.1| GENE 10 12096 - 12359 321 87 aa, chain - ## HITS:1 COG:MYPU_6030 KEGG:ns NR:ns ## COG: MYPU_6030 COG1925 # Protein_GI_number: 15829074 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Mycoplasma pulmonis # 1 83 1 83 87 65 42.0 3e-11 MVSAKVVIKNPTGLHLRPAGILCKEAMNFKSSVNFRFKNTTANAKSVLSVLGACVKSGDE IEFVCEGVDEKEALAAMVQAVEDGLGE >gi|229784090|gb|GG667645.1| GENE 11 12387 - 13727 1573 446 aa, chain - ## HITS:1 COG:no KEGG:Closa_1249 NR:ns ## KEGG: Closa_1249 # Name: not_defined # Def: YbbR family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 444 1 431 434 431 56.0 1e-119 MKEKLLNNLGLKVVSLVLAFVIWMAIVNISNPVVSGQPFTIPVDVRNEEILDNANLTYEI VEKDSVTVNYQVNTRNSSLFRASDFSAYIDLKDYNDVTGAVPITIEVNKNKESLIRDGEV 
TAKPMVLHVKTEAMQNKKFDLKVNYVGEVEDGYASGITTLSSDYVVARGPESVIGQIASI GVEISREGANADLEGDVVPKCYDANDKELTDLGDKLTITPNTIHYNMAILKAKELTLNFE VKGTVAPGYRFTGVDSDLKSVPVVATKSVLASVNALSIADDKLTIEGATADKVIHLDLNN YLPPSTAIVPNAQSEVTVTLKVEPLTTKSYTLNLAELASRGTDPDYEYTFSRDSVDVNIR GLKEDLDTLDVSDLNAVLDMTGLEPGTHPGKLAFEVGEGFEVVNYSDFDVMVVADGEDES EGPGQTTADQGTDEPATAGTSKKAAE >gi|229784090|gb|GG667645.1| GENE 12 13705 - 14577 807 290 aa, chain - ## HITS:1 COG:BS_ybbP KEGG:ns NR:ns ## COG: BS_ybbP COG1624 # Protein_GI_number: 16077243 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 22 262 14 249 273 207 48.0 1e-53 MADIIDGLYSWVSFLPRITKTNLIEIIIIAVLLYELLVWIMNTRAWTLLKGILLILCFYV LTYIFRLDNIKWILNTIATPAVTMAIVIFQPELRKALEQLGSRNPLTGILTFDDGKDNSG FTDRTINELVKATFEMAKVKTGALMVIERGDSLKEIERTGIEINGLVSSQLLINIFEHNT PLHDGAVVIHGNRVAAATCYLPLSDNMSISKELGTRHRAAVGVSEVTDSLTIVVSEETGR VSVASEGGLKRAVDADALRELLMDLKHVEGEGSRFKILKGRRKNEGKIAK >gi|229784090|gb|GG667645.1| GENE 13 14608 - 15555 1082 315 aa, chain - ## HITS:1 COG:no KEGG:Closa_1247 NR:ns ## KEGG: Closa_1247 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 315 1 315 315 409 76.0 1e-113 MMSLLVFKERLKEFYTKFDIYITPVIKFVYSYLAFWLMNRNIGFMAKLTNPLVPLVLALV CSFLPYSAVSFLAAVFMLAHLWAVSFEVTLVMAVFVLVVALLYYGFHPGDSCLLILTPIF FVLKIPYAIPLIVGLSGGIVSVIPVGCGVFIYYTLLYVKQNAGVLTNEFSADMMQRFSQI VKSLFGNQLMLVLIAAFAAAIVVVFILKNMSFDYSWIIAIVAGTIAQLVVIFIGDITFDV SVPVSELIIGVAVSLILSGIYTFFVFAVDYSRTEYVQFEDDDYYYYVKAVPKLTVSTPDV KVQKINARKLQRPQR >gi|229784090|gb|GG667645.1| GENE 14 15552 - 15953 409 133 aa, chain - ## HITS:1 COG:no KEGG:Closa_1246 NR:ns ## KEGG: Closa_1246 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 132 10 141 142 179 71.0 2e-44 MTNLAMFEKNEGNRIFPVNRYFKSDYISSKLLRSFFSYTLSFLLGLVMWVLCDIEKWMNV LRIDTLIEVGMKAGVIYLAGLVVYLIISLCVYAKRYDYASRGMKVYMAKLKRLDKRYEGN SRPGARTKGGRTQ >gi|229784090|gb|GG667645.1| GENE 15 16157 - 17368 350 403 aa, chain - ## 
PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 4 403 7 413 413 139 28 3e-32 MLLENMFMALHAIRANKMRSFLTMLGIIIGIGSVIAIVSIGDTMRSMVAGLYANVGMTRA LIYIWPADGSDQRDSDYFNMEQMERIKEVFQDDIEYIDSNAYYSGEALHGRNKVKFQFQG IDYNYPDVQKINIIYGRNLNEADVKGRKKNVVLESKGAMELFGTENAVGKTFRTTVYGET DDYNVVGVYKKELTPIEAMMMGTGQTKEAFIPYTILTWPNDYLFQVNLFAREGTDMKEFE GRIVPYIARLKDRQPADVTFYSALDEMSGMDTMMGGLSAAVGGIAAISLLVGGIGIMNIM LVSVTERTREIGIRKALGARTHDVMIQFLTESAILSAFGGILGVVIGAGLVMAGGALFGL SVVVKPQVVLVAVGFSALVGLFFGLYPASKAAKKDPIDALRYE >gi|229784090|gb|GG667645.1| GENE 16 17386 - 18594 317 402 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 4 402 7 413 413 126 27 2e-28 MLLENMTMALHAIRSNKMRSFLTMLGIIIGIASVIAIVSVGDSMRSLYADAYKNVGVNRA VVLVSWNVDDFRMSDYFTLDELDRAKEVFKDKIAYMDSDAYASTDAVYGRNKVKYSFQGI DYNYPQVQPMNIIYGRELNEADVKGRRNNVVLEDKGAEMLFGTANCVGRTFRMTINKSTE EFTVVGVYRKDLSPLEAMMMGNGRTQSGFIPYTILTRPSDYFQDLNFYVKDGVDMKAFLP EIARYVAKIKGRPVSEIMTQSVVEQMGSVDTVLGGMSAAVGGIAAISLLVGGIGIMNIMM VSVTERTREIGIRKALGARTRDIMMQFLTESALMSACGGIIGIVLGVGIVTIGGSAMGMV PVVKVSVILVAVGFSALVGIFFGLYPASKAAKADPIDALRYE >gi|229784090|gb|GG667645.1| GENE 17 18585 - 19379 319 264 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 40 238 17 218 245 127 38 1e-28 MWKREKKLPSCTNEKASDDLIRLVDMVKIYDTGSIKVLGLKRINLTIKRGEFVAIMGQSG SGKSTLMNILGCLDRPTMGHYYLDGIDTADMSPDALSAIRNRKIGFVFQSFNLISRTSAL KNVELPMTYAHKSKKMREQRALELLERVGLGERYEHMPNELSGGQRQRVAIARALANEPP LILADEPTGNLDTASSVEIMELFSRLHKEGATVVVVTHEEDIAAFTERIIRFRDGQVVSD KRNIPKELTEEEAVKMKKEGTPCC >gi|229784090|gb|GG667645.1| GENE 18 19396 - 20526 1540 376 aa, chain - ## HITS:1 COG:AGc3332 KEGG:ns NR:ns ## COG: AGc3332 COG0845 # Protein_GI_number: 15889118 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 316 72 358 
437 74 25.0 3e-13 MAATFPLSRGEVVEMLSVSGPVEGTDSVEVVSNLHAEILEIPVKEGDRVEKGQLLATLDE KDAKKEVDIAQNAYDLAVTTYNEKQIEAENGYAKARQDYDAAKANYDRTNVLFQAGSASQ LELENVSDALNNAQRELKTFTLKDGRPVANESYSLQIENAAFELDQKKELLANTKVTSPI AGTVVRVNCKVGRFADKTDDDKPMFIIENLDVLEMKINVSEYSIGKVKIGQPVEISADIL NGENVKGEVTAISPTGEEKGNGSTERVIPTTIRIIDQNTRLIAGITAKAKIELQKAEDAW VVPVSALYQQPDGSTAIVAVEDNIAHMIPVTTGVESDIQVQVIPAEEGKLTEGMQIIETP AAYLTDGMAVVAVPAA >gi|229784090|gb|GG667645.1| GENE 19 21501 - 22583 1047 360 aa, chain - ## HITS:1 COG:PA3311_2 KEGG:ns NR:ns ## COG: PA3311_2 COG2199 # Protein_GI_number: 15598507 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 166 298 65 197 204 78 33.0 2e-14 MASEDELFAGFGADAGMFYSALVRSTDNYIYIINMKTDVALVSPNMQRDFELEGRLVPGL VTVWGNLVHPRDFKRYSKSIEEMLDGETDVHDVEYQVLNRKGEYVWVCCRGVLKRNEEGE PVIFAGVVTDLGSKGKIDYTTGLFTNRECEKEVGRIFDSEEHAEGGILLLGLDDFVRIND LNSHIFGDAVLRQFAQSVQQLLPPEAAIYRFDGDEFAVVYPGAGEAELRALYQKIHVYCN RPHKIDDSSYFLTVSVGVAMMGKDGNGYQELLTCALSALEVSKQNGKNTVTFFSQDMMHT RRRALELSSRLQLSVMNGMESFHLAYQPFVHSDTLQVSGAEALLRWSCEPYGVVSPVVAS >gi|229784090|gb|GG667645.1| GENE 20 22939 - 24168 1266 409 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621489|ref|ZP_06114424.1| ## NR: gi|266621489|ref|ZP_06114424.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 409 1 409 409 838 100.0 0 MNRKDYWSPETNPETGEKFARVLEYFHTAHNGSADIDSMIDSPYGEACARRMQLRFKYEH DAMGEAAMEYYRGCGLKKEMYDGEDYYARWVILTPVEMETEEGRKKKYPIVFYNHGGGNS IECEEFSLGFAELAGRDKFMVAYLQNTNWENFERVLELISERYPLDRERVYLCGYSQGGY QVTSAYFRIPWKLTAVGPCGNDIYREYDNFNVPYTEEETQNLKNALVPFMQVVGVCEASS FVPINDWKPRKDWGRECSGETYLDDRRDDSKDPTRIHGGRRRFSDMPVPPEGEDRHEWMI GRLNKRMDTLNCEPRDSKTCISYLNTPEDELHHVLGFYGDKEEIIWHYGYKYYTLNIWNR EHINAFRYVAVENNPHWPPVLMAELLWDFFKQFRRDGKTGKIVEEEYRY >gi|229784090|gb|GG667645.1| GENE 21 24168 - 25388 987 406 aa, chain - ## HITS:1 COG:Cgl1041 KEGG:ns NR:ns ## COG: Cgl1041 COG0477 # Protein_GI_number: 19552291 # Func_class: 
G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Corynebacterium glutamicum # 29 398 3 376 391 150 27.0 3e-36 MSSSIALKTSKTEEKIPDKIWNASFISVFIANMLMFMGQQMMNTLVAKYANHLGAAPIIV GLVTSAFAYTALIFKIFAAPAIDTFNKRYILVFAMLIMAASYAGYSFSGSIPVLLASRLL QGAGQAFSATCCLALATDTLPPDRLGTGISYFSLAQVICQSIGPTIGLSLSRTIGYHYTF MIGAVTMCLAAVAASRIRNKHVKSVKRYSISLDSIIAREAILPAVLMFLLCMVYYNITTF LVIYAEERGCGVNIGYFFTVYAVTLLVTRPVIGRFGDKHGLVKIFIPAMFCFALAFLLIS FATNIWMFLLAAFVSAFGYGACQPSVQTLCMKCVPKEKRGAGSSTNYIGQDLGNLVGPLI AGSLAGRFGYAAMWRIMILPVALALVIVLMFRTRINSAGQEKEEAV >gi|229784090|gb|GG667645.1| GENE 22 25419 - 26378 724 319 aa, chain - ## HITS:1 COG:FN0400 KEGG:ns NR:ns ## COG: FN0400 COG4608 # Protein_GI_number: 19703742 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Fusobacterium nucleatum # 4 317 1 315 320 410 61.0 1e-114 MEALIEARNLKKHFDSPKGKVHAVDDVSFKIMPGRTMGVVGESGCGKSTLGRTMIHLLQS TEGSIYFEGEDVTKVKDKELKQLREKMQIIFQDPYASLNPRMTVKAIIQEPLILSKRFSD RGSLEAETVRIMDLVGIEERLRNAYPHELDGGRRQRVGIGRALALNPKFVVCDEPVSALD VSIQAQILNLLMDLQDEYHLTYMFITHDLSVVRHISHDICVMYLGQLVETAPSGELFEKQ LHPYSRALLSAIPSVDIHAKRERIILKGEITSPVNPKPGCRFASRCPYVCDKCHEPQHLE ELLPNHFVSCCRAREIGGL >gi|229784090|gb|GG667645.1| GENE 23 26380 - 27381 562 333 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 323 1 325 329 221 37 9e-57 MINKKIETDSFLSVKELVVEYTMEGEVVHAVNGVSFHLEKGKTLALVGETGAGKTSIAKA ILRVLPDPPARVPSGSVYLDGVDLLQLKEEEMRKVRGNKISMIFQDPMTALNPVKRVGEQ IAEGIRQHQKISRAEALNRACDMLNMVGISDDRVNDYPHQFSGGMKQRVVIAMALACSPD LLLADEPTTALDVTIQAQVLEMIKALRDQMNTAMIMITHDLGVVADVADDVAVVYAGEII EYGSKEAVYDNPSHPYTRGLFGALPDLSKEVRRLNPIEGLPPDPIHLPEGCAFHPRCPYA TEECRKHPVPLREVETGHWNRCLKNGTVQKKED >gi|229784090|gb|GG667645.1| GENE 24 27404 - 28303 729 299 aa, 
chain - ## HITS:1 COG:BH0030 KEGG:ns NR:ns ## COG: BH0030 COG1173 # Protein_GI_number: 15612593 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 2 296 6 298 301 265 43.0 6e-71 MEKIKNNTASAAAGGTKRKNESMFSIFLRRMARRKEAMFGLSILIILAVLAILAPYICRY NYAEMNLQNAFAPPSREHWLGTDSMGRDMLSRILYGGRFSLSVGIISVAISGTGGLILGS IAGFFGGKVDNLIMRFLDIFHSIPQVLMAICVSSVLGAGFFKTVLAVGIAGIPNFARTIR ANILAIRGLEYIEAAESINCRNARIIFRHVIPNAITPFIVHCTLAIAGGLIVSATLSYIG LGVQPPLPEWGAMLSDARSYVRQYPYLMICPGVFIMIVVMSFNLIGDAVRDALDPKLNK >gi|229784090|gb|GG667645.1| GENE 25 28284 - 29252 994 322 aa, chain - ## HITS:1 COG:FN0397 KEGG:ns NR:ns ## COG: FN0397 COG0601 # Protein_GI_number: 19703739 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 309 1 308 308 270 50.0 2e-72 MGKYIAKRILWLIPVIMGVTFLIFTIMFFIPGDPVVIMLGSSATPAEIEQAREMLGLNGS YLHRFWNYASGVFLRFDLGTSFQYGTSVTADMLTRFPRTFTLAVASMLISICVGVPLGVI AATHQNTVSDRASMIVAMFGVSMPSFWLAILLVLLFSMKLGLLPSYGIGGLKFYILPAIA NSFGGIAGFARQTRSSVLECIRADYVTTARAKGVSEKKVLMGHILPNSMIPIITYAGTQF SGLLGGAIVIENVFTIPGIGTYMVQAINYRDYGAVQGSVIFSAITFSVVMLIVDVIYAYV DPRIKAQYENKQRSTKNGKNKK >gi|229784090|gb|GG667645.1| GENE 26 29263 - 30915 1871 550 aa, chain - ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 425 1 384 511 113 26.0 1e-24 MKLKKVLAVALAVTMLGSLTACGGNGGSGETAVKTPVGKEASAEGEKIVNIAMTTTMGDL DPFAPPTQGRNFLRYAVYDNIAIFKDFGTSWDQMQWVMAKNIEQKDDVTFEIELYNYIHD ALGNPINANDVVFCLNSISQSGNFTRFTNYMESATVIDDYNLEIKLKTTTVGAFEYMLAQ CCIVSQKTYEEYKDQFSTKPITSGAYQVTECVSGSYYVLEKNPDYWQTDKNLRSYAANAY 
DQGVDKLVYNIVTEASQATIALQMGEDDIITVCNNAEIGYFMNQDGTPTEGYSVNKVLSS APMILAFNMDEGQMFDGNLALRQAICYAINQEDIIAGGLRGNGEAIRDLGDKLCGDYNPD WNNEDYYDFDLNQAKAMLDEAGYPGGIDPATGAPLHIRLLIDQQYKDTAVVIQSQLIAAG LDVEINAYDNSLLATYQYDPAQWDLYVFIQGTECYVTSIYDGIFDAGADGKAPKCFVKDE ALQNLVEAAHEISTHGDETVEALHDYVRDNAYAYGLYQSYTFSVGRDTIGLVNHPWGQLV GPACDYTNFK >gi|229784090|gb|GG667645.1| GENE 27 31243 - 32019 272 258 aa, chain + ## HITS:1 COG:BMEII0116 KEGG:ns NR:ns ## COG: BMEII0116 COG2188 # Protein_GI_number: 17988460 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Brucella melitensis # 2 238 3 232 244 91 27.0 2e-18 MVQTKNKLKYMEIYRDIKERIASRNYGAGSLLPTEDELTQLYSASRTTIRRAIALLQEDG LVKTVQGKGTEVISPKKWSRPYPFQEYRNLNHATSFRGSPLMEGEIITQEALVDMVPAEI KIAEALSIAIGSDVYRIQRIKSLNDTVFAYVISYVPCSLAPGLEQYSGKFFILYHCLQEY YGLVVTKVEETIASASSGFLESRLLGVTPGSPLLTFQRTAYWKEGIMEYSESSFRPDIYQ IAVTIEGPFEWTSSDQLS >gi|229784090|gb|GG667645.1| GENE 28 32197 - 32952 673 251 aa, chain + ## HITS:1 COG:BMEII0116 KEGG:ns NR:ns ## COG: BMEII0116 COG2188 # Protein_GI_number: 17988460 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Brucella melitensis # 4 238 7 235 244 101 28.0 2e-21 MDKNAPKYVSVYNAIKADILSSKYPTDSFLPPENELMEIYGVSRTTIRNAVNLLRDEHIV EVHQGRGTRVLQSAKYITPYKFLSVKNQASVTTQFRLEGSCSVSAQGAVIDIIPAELKVA RALNIENGSDVYRLQRVKMVNDTIFAYVVSYVLPEIAPGLEQYNGEIYFLYRFLKEKYDI NFENGKDIISAGIAGFMESRLLNINPGSPLLIFNRTTYQHGTPFEYSESINRGDLLELVV DSFASSPNSYY >gi|229784090|gb|GG667645.1| GENE 29 33058 - 34077 844 339 aa, chain - ## HITS:1 COG:SPy0210 KEGG:ns NR:ns ## COG: SPy0210 COG5279 # Protein_GI_number: 15674407 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain # Organism: Streptococcus pyogenes M1 GAS # 4 290 56 374 410 77 23.0 3e-14 MEPYYYSRMNRTEQSAYHAMKTGLTALAPSFAVPRLENRELADIFFRLRLDCPEIFYGTG FHYRFYQNSGSVEMIPEYLFDKGKVKDHQKAMKARIAKLARPAESLTEWEKERYIHDFIC 
TGVRYDKLKKPYSHEIIGPLGQGVGVCEGIAKTVKILCDSLGIWCMIAISEANPEKNIKY RHAWNIVRIGGRYYHLDATFDNSLGHGETIRYDYFNLDDRQLFRDHEPALYPAPPCSDGD HFYYREKKLSFTKTEDVAKRAAQAVKKGKPLVFHWRGGYLTRAMVEELLTLLEEAAKQKG KHVRAGLNWPQAVFSAEFIDELPGEELLVETANEGESIG >gi|229784090|gb|GG667645.1| GENE 30 34096 - 34791 681 231 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01840 NR:ns ## KEGG: EUBELI_01840 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 229 3 231 239 242 54.0 9e-63 MSELYLMVTITERTGVKKFHTLYGHHGADVTFITVGNGTAVSETLDYLGLEETEKAVMLS IVTGTVWKEIKSDLQKHLKIDVPGTGIAFIIPLSSIGGKKPLLFLTENQNFEKGEESSLK DTKYELLIVIANQGYTNMVMDAAREVEEVGGTVIHARGTGLERAEKYLGVSLVAEKEMVF IVVKSGMKNRVMKNIMEKAGPASKARAVAFSLPVTSTAGMRLMEEAKDDND >gi|229784090|gb|GG667645.1| GENE 31 34831 - 36369 1473 512 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01841 NR:ns ## KEGG: EUBELI_01841 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 7 512 20 525 525 560 66.0 1e-158 MKNSLLKLREKLSEALHSVLPIIGIVMVLCFTIAPISSSILLCFLMGAAMIVIGTMFFTL GAEISMTPMGERLGTAVTRSKKLSIIVILGFVLGFIITISEPDLQVLAEQVPSIPNRTLI FSVALGVGVFLVIALLRMLFSIALPHMLITFYILAFLLTLFVPQSFLAVAFDSGGVTTGP MTVPFIMAFGIGISAIRNDRHASDDSFGLVALCSIGPILAVLILGMIYRPADSAYIPPVL PEISNSVELWKLFRVGLPTYIREIAVSLFPIILFFGIFQIAVLKLSKRNLRKIIVGLVYT YIGLVLFLTGANVGFMPAGNYLGQVIASLDYNWIIVLIGMLIGFFIVKAEPAVYVLNKQV EEITDGAIPARAMSISLSIGVAVSIGLAMLRVLTGISILWFIIPGYAIALGLSFFVPKIF TAIAFDSGGVASGPMTATFLLPFAQGACTAVGGNIVTDAFGVVAMVAMTPLITIQILGMI YQMQQRSAPKDQTLPEPAAGFDQLDDDAIIEL >gi|229784090|gb|GG667645.1| GENE 32 36521 - 37996 1266 491 aa, chain - ## HITS:1 COG:no KEGG:ELI_2097 NR:ns ## KEGG: ELI_2097 # Name: not_defined # Def: regulatory protein GntR # Organism: E.limosum # Pathway: not_defined # 1 491 1 485 485 482 49.0 1e-134 MKNDVVKYERICEILKNKIICGLLPPGTALPGRAKLCREFHTSERTVRRAVEVLVQEGFL EVVPKKRPVVACGYLDKRREEQDERETADAVAANDILKSGILLCYPVNRRGMLLCRGDDW KTPETIVANMNPDDPTRFWRLSNRLWRFFICRNENELALRAVDSLGLGEVDPLPGTRENR 
IDYLKSLKELIQTIKRGGRPEEVHFDALFSLYGIVIDRIEGEPVCRVTPDSPLRVGTKNF EQQFRRSQERYSSVYLDILGLIAMGRYQPGDRLPTHEQLQEIYGVSIDTTIRAIRVLQDW GVVTAAPRNGIFVAKDLEGLRKIYIAPEQIAGHVRRYFDSLQLVALTVEGVAEHAALHAK PDETRRMLALMERQWNDPFCHQLIPLTLLEFITDHIRYNALKSVYELLLKNFSIGRGIPK LVQKDKTPVNSELYRQCMEAAGMLKAGEEKRFAARAAGIFQQIHRLVIQECKRLGYWESA GNVYDGTALWK >gi|229784090|gb|GG667645.1| GENE 33 38019 - 38387 410 122 aa, chain - ## HITS:1 COG:BS_ykoWm_4 KEGG:ns NR:ns ## COG: BS_ykoWm_4 COG2200 # Protein_GI_number: 16081166 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Bacillus subtilis # 1 117 139 255 259 106 40.0 9e-24 MSKAEGIIRDLSENGIRISLDDFGIGYSSLNYLMNLPVHSLKIDRSMTSKINGSPKQYAL LKAIIAMAKVNDIEVVVEGVENEEERRAIIDAEADYIQGFYYSKPLPQTELELFLTDFAV RD >gi|229784090|gb|GG667645.1| GENE 34 39388 - 41694 1913 768 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 458 766 316 625 776 100 28.0 2e-20 MKRIIKRRRIIYFFVLTLFLLISAVFGLETQSRYAEERKADAGSILELYRENIVLVMREQ LNYAQGLVGLASGTMKDSEWFQWKAEELMAMDGVSSILVFEGDTVYRAFPECVQEVMAGR NLKDLSYLYTLAKVVKEPVVEGPVIMEGTDETVFLFLQPVLEGRAYQGEIALALSEEYVK EKMNLESLYQLGYEYYLWKVNSQDGSKEVVASSNAMIDYSHAAKTEFYLPSQWNLSIMPI EGWIPRKRQILYAAESILLWLLLSSLFVLSTALAKRYQYDRNIGNTDWQTGMLNRQGIGK ELNTWIRKGEKAFSVFYFALEDYNRIALLAGPEKEEEYLKSIPVIMKDYIEGPYAAARIS SESFFVAVKEEMTEGEMAEFAKGLSLKMIWKIRHQGGKVFLNAKYQYVSYPEEGNDVRTL LRNLIEKYYSKLFRESSVRELTEKCRQLAAGRSDVEFGEYAEPWITDLSQAINHYRKQVE QAVYYDPVFGIGNRMKYIRDADMMIAYDERRRFRLYGVDIRSFGKYNELFSVETGDALLR EVTKRLSAIFGSNLYRINGDVFLGISLDHASRDKENDEMISRIQEAFRIPVKVQEAVITL DAFIGVCDYPANARKPEKLLECLQSAINYGKVPEHGVTDGIVIFNNRLLEIRRREGKILQ LIRDSIHEGTLEVWYQPIYQMKAGRFTAAEALVRLPDGTGGYISAGQMIEVAERNGLVCQ IGEYVIHQACTFMKEKGEALGLGRVGINLSVQQLLVEDSAAHLLESIS >gi|229784090|gb|GG667645.1| GENE 35 41703 - 43256 1458 517 aa, chain - ## HITS:1 
COG:aq_035_2 KEGG:ns NR:ns ## COG: aq_035_2 COG2199 # Protein_GI_number: 15605636 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 355 517 84 247 251 99 34.0 2e-20 MTKYKNGREKQNEMTYTYMPVLKIIILLCILTIIIMAAGGLAIAFQRMKKEAAAAHENVT AQVSNRVEESLKLLESMAIQPEFYDPSVPPLEKVAKLDRITETYGYIMICFVDSDINVYT IGEEPASLASREYMQQLFSTGQPQITDSFAAGADGITLNYTVAYPLKQNGEITGCLFCAI YFDEIVNILKEASGFSGTQAVLIGSRGQIMSSTNNLQYGEPYLDHVREAKNLGITTDQLE TELLAKRPGAYWSIRQGRLCYTIYSNVKNTNWDLLTVIEFWDVYKTQVPAYLVISVFLLL LCAFAVRLTKRFVNEQKQAVNMLVHSIEELEEKIYQNERPDNVDFKEIIRLTSNGLSDGL TGVVTRSVFFKQAEAQLDKVPEERLVALCFVDLDNLKKLNDTHGHAVGDTALKSIGYILR EYEKKYDGVVGRYGGDEFVLLLTDIDDEKELKRVMESLVLRLHSDIGTGGIPLPIQCSVG VAVRQSGEKLDEMIAHADEALYFVKQNGKGYYKIYQD >gi|229784090|gb|GG667645.1| GENE 36 43537 - 44670 1015 377 aa, chain - ## HITS:1 COG:MA3683 KEGG:ns NR:ns ## COG: MA3683 COG1453 # Protein_GI_number: 20092483 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Methanosarcina acetivorans str.C2A # 9 376 4 373 376 263 37.0 5e-70 MVYKEFQDLKLSALGMGAMRLPVTEGDDSVIDEAATAEMVAYAMEKGINYFDTAWGYHGG HSEEVMGRVLSRYPRESFYLATKFPGYDLSNMDQVEAIFEKQLEKCGVEYFDFYLFHNVC EMNIDAYLDEKYGIFQYLLKQKEAGRIRHLGFSAHGSCEVMKRFLEAYGSEMEFCQIQLN YLDWSFQSAEAKMELLRSCGLPVWVMEPLRGGKLAVLSGEAMERLNQFRPEEKAPAWAFR FLQSVPGVTMILSGMSDFRQLQENIHTFEEDKPLQKEEMDTVLAIADEMLKKSALPCTAC RYCVSHCPKQLDIPALLELYNEHCFTGGGFLAPMALMAVPEDKHPDACIGCRSCEAVCPQ QIKISEAMADFAAKLNS >gi|229784090|gb|GG667645.1| GENE 37 44643 - 44792 122 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621506|ref|ZP_06114441.1| ## NR: gi|266621506|ref|ZP_06114441.1| hypothetical protein CLOSTHATH_02669 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02669 [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 78 100.0 1e-13 MAAAYAKNSEGKSPLKRQKEEHAMSRTLYVEGEERKGECGIWYTKSFRI >gi|229784090|gb|GG667645.1| GENE 38 44851 - 46314 1447 487 aa, chain - ## HITS:1 COG:CAC3281 KEGG:ns NR:ns ## 
COG: CAC3281 COG1132 # Protein_GI_number: 15896526 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 1 486 217 703 706 464 49.0 1e-130 MRNEVQNKIRRLPVRYFDTNSFGDVLSRVTNDVEAVSNALQQSFSQVISGILTLVLALWM MFSINRVMACIACLIIPVGALITRGIVKISQSQFKAQQDSLGDMNGAITELYTGYNEILL FGQQEQSRRQFETVNDSLQKHAFKAQFLSSLMSPLISLTTYLSIGIIAVIGCFAILAGTL SVGNLQAFIRYIWQVNDPLTQVSQLSAQIQAAFAGLKRIFEILDEPEETESAPAGLLPGP AKGEVTFEHVSFSYGDTPVIRDFNVTVKSGQMVAIVGPTGAGKTTLINLLLRFYDVNSGR ILVDGVDIRDMNREDLRSMFGMVLQDTWLFSGTIFDNIRYGNLSARKDEVIDAAKMANVH HFIRTLPQGYNMVINEEGSNISQGEKQLLTIARAILKNPQIMILDEATSSVDTRLEKMLQ SAMYTVLKDRTSFVIAHRLSTIKNADLILVLRDGDIAEMGNHESLMAQNGFYSQLYNSQF AWGAEEA >gi|229784090|gb|GG667645.1| GENE 39 47246 - 47569 376 107 aa, chain - ## HITS:1 COG:no KEGG:Clole_4004 NR:ns ## KEGG: Clole_4004 # Name: not_defined # Def: xenobiotic-transporting ATPase (EC:3.6.3.44) # Organism: C.lentocellum # Pathway: not_defined # 5 106 1 102 597 67 37.0 2e-10 MRKKIKDIGRMLGKLKPFLAPYSFQLFVSVFMIVISIAAITAAPRIEGMITSRLAADVAG LASGAAGARIHFEVIRNILLVLLSIYLIKTVSQIVGAFCLTNSIQNA >gi|229784090|gb|GG667645.1| GENE 40 47562 - 49292 219 576 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 320 575 118 378 398 89 25 5e-17 MKLIIRYLKNYKKMFLLNVVSVLGFALVELGIPTIVAQMIDNGVSRGNPAYIWKMGLVIA AISFAGVAGTILLGYCCAYLSTSVTRDIRNDIFKKTQEFSHSEFHEFGIASLITRTGNDA FQIQMFMNILLRTALMTPVMMIGSIFLVLRASLSLSVVILATIPLIVGGVILVARISEPI SERQQSSLEDINRISKENLSGIRVIRAFTNDDYEKQRFDGANSSYMGNSKKLFKLMSVTQ PVFFLLMNLAGMVIYWIASIMLSQGTLQIGELVAFMDYLFHVMFSIMLFCLVFMMYPKAA VSAGRIQKIFDAEPSIKNPETNAQSVEHIDEIVFDHVDFVYPDGEEAVLKDVSFSVKRGE TIAFIGSTGSGKSTLINLIPRAYDVSAGRVLMNGKDIRTYDLESLRRAIGVIPQKAMLFS GTIADNLRFGKEDATMEDLVQAAKTAQAYDFIMEKGGFDEQITESATNVSGGQRQRLSIA RALIRKPDVYVFDDSFSALDFKTDAKLRRMLKKETGNAIVMVVAQRISSIMDADQIIVLN EGSIVGIGTHKELLNNCRIYREIALSQLSEEELSRA >gi|229784090|gb|GG667645.1| GENE 41 49452 - 
50162 329 236 aa, chain - ## HITS:1 COG:lin0464 KEGG:ns NR:ns ## COG: lin0464 COG2378 # Protein_GI_number: 16799540 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Listeria innocua # 1 198 1 198 310 283 67.0 2e-76 MKIDRLVSIIMVLLDKKRVGAQELADMFEVSPRTIYRDIDAINMAGIPVRGASGVGGGFE IMENYKIDRNVFSAADLSAILMGLSSLSNMIQGEELINALAKVKSFIPADRAKDIELKSN QIAVDLTPWMGSRNTQPYLEMIKTALQESRLLSFEYEDRHGNKTTRTAEPYQLVLKGSHW YWQGYWITARMNRFFRMAMSITLSAFLSLRTIITTIFFSVLAINASVWSRYTSARR >gi|229784090|gb|GG667645.1| GENE 42 50282 - 50773 445 163 aa, chain + ## HITS:1 COG:no KEGG:Sgly_1223 NR:ns ## KEGG: Sgly_1223 # Name: not_defined # Def: isochorismatase hydrolase # Organism: S.glycolicus # Pathway: not_defined # 1 162 1 162 163 264 82.0 8e-70 MQKKALVIIDIQNDITKNYKDIIGNINQAIDWAVSHTIPVVYIRHENLSAGTRTFKPNTY GAELASDLNVVSKNIFTKYKGNALSCEEFTDFIRTNELCEFYIAGADAAACVKSTCYNLC KANYGVYVLSDCITSYDKRKIDEMIRYYEQKGCKITGLQDLVL >gi|229784090|gb|GG667645.1| GENE 43 50839 - 51321 377 160 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2153 NR:ns ## KEGG: Dhaf_2153 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 160 7 166 166 120 40.0 1e-26 MEYFKASREHGELIFNLVQDTINAVYPRYYPKPVVEFFCGIHSRDNINRDIEEGCVRVLW KDGRLVGTGSRKGNHITRVFVAPELQGQGCGSCIMEKLKDEIAQEFRFICLDASLPAVHF YENMGYKTVGHGRIEVGNNAILIYDMMEKQLCGQNSADID Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:48:45 2011 Seq name: gi|229784089|gb|GG667646.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld39, whole genome shotgun sequence Length of sequence - 44064 bp Number of predicted genes - 39, with homology - 38 Number of transcription units - 19, operones - 10 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 9/0.000 - CDS 2 - 1169 846 ## COG0477 Permeases of the major facilitator superfamily 2 1 Op 2 . - CDS 1194 - 1994 346 ## COG0500 SAM-dependent methyltransferases 3 1 Op 3 . 
- CDS 2006 - 2809 331 ## Amir_4779 hypothetical protein 4 1 Op 4 . - CDS 2796 - 4064 458 ## COG0439 Biotin carboxylase - Prom 4183 - 4242 4.9 + Prom 4102 - 4161 4.8 5 2 Tu 1 . + CDS 4214 - 5074 201 ## COG0583 Transcriptional regulator + Term 5103 - 5143 5.9 - Term 4782 - 4820 2.0 6 3 Tu 1 . - CDS 5016 - 5522 100 ## COG1002 Type II restriction enzyme, methylase subunits 7 4 Tu 1 . - CDS 5682 - 7097 1298 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) - Term 7116 - 7148 3.2 8 5 Op 1 . - CDS 7154 - 7582 332 ## COG5586 Uncharacterized conserved protein 9 5 Op 2 . - CDS 7575 - 8615 725 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 - Prom 8660 - 8719 5.2 10 6 Tu 1 . - CDS 8721 - 9032 510 ## Ethha_1912 membrane protein - Prom 9069 - 9128 5.1 - Term 9114 - 9168 11.1 11 7 Op 1 . - CDS 9189 - 9533 292 ## Ethha_1911 hypothetical protein 12 7 Op 2 . - CDS 9530 - 10336 908 ## Ethha_1910 hypothetical protein 13 7 Op 3 . - CDS 10339 - 14610 3406 ## COG4646 DNA methylase - Prom 14664 - 14723 80.4 14 8 Tu 1 . - CDS 15563 - 16729 1112 ## COG0827 Adenine-specific DNA methylase - Prom 16821 - 16880 80.4 15 9 Tu 1 . - CDS 17724 - 20795 1270 ## COG0827 Adenine-specific DNA methylase - Term 20822 - 20859 6.4 16 10 Op 1 . - CDS 20878 - 21426 130 ## COG0500 SAM-dependent methyltransferases 17 10 Op 2 . - CDS 21530 - 21757 207 ## gi|266621530|ref|ZP_06114465.1| peptidase M50 - Prom 21778 - 21837 6.5 - Term 21776 - 21813 -0.5 18 11 Op 1 . - CDS 21851 - 22066 325 ## Ethha_1906 hypothetical protein 19 11 Op 2 . - CDS 22068 - 24539 2271 ## Ethha_1905 hypothetical protein - Prom 24667 - 24726 15.2 20 12 Op 1 . - CDS 25628 - 25993 230 ## COG0358 DNA primase (bacterial type) 21 12 Op 2 . - CDS 25996 - 26277 243 ## Ethha_1904 hypothetical protein 22 12 Op 3 . - CDS 26274 - 28397 1519 ## COG0550 Topoisomerase IA - Prom 28485 - 28544 22.7 - Term 29494 - 29535 11.8 23 13 Op 1 . - CDS 29582 - 30352 1017 ## CD1107 hypothetical protein 24 13 Op 2 . 
- CDS 30342 - 30596 397 ## CD1107A hypothetical protein 25 13 Op 3 . - CDS 30624 - 32357 1708 ## CD1108 putative DNA-repair protein 26 13 Op 4 . - CDS 32350 - 32718 563 ## gi|266621540|ref|ZP_06114475.1| conserved hypothetical protein 27 13 Op 5 . - CDS 32699 - 33163 396 ## Ethha_1897 hypothetical protein 28 14 Tu 1 . - CDS 34088 - 34693 643 ## Ethha_1897 hypothetical protein 29 15 Op 1 . - CDS 35659 - 35922 169 ## gi|160915395|ref|ZP_02077607.1| hypothetical protein EUBDOL_01403 30 15 Op 2 . - CDS 35807 - 36274 302 ## Ethha_1896 hypothetical protein 31 15 Op 3 . - CDS 36291 - 37160 938 ## Ethha_1894 hypothetical protein - Prom 37347 - 37406 3.4 32 16 Tu 1 . - CDS 37411 - 39021 961 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 39060 - 39119 3.7 33 17 Op 1 . - CDS 39133 - 39423 177 ## EUBREC_3579 hypothetical protein 34 17 Op 2 . - CDS 39318 - 40172 742 ## COG1484 DNA replication protein 35 17 Op 3 . - CDS 40169 - 40918 689 ## CDR20291_1762 phage protein 36 17 Op 4 . - CDS 40908 - 41120 178 ## gi|160945844|ref|ZP_02093070.1| hypothetical protein FAEPRAM212_03377 37 18 Op 1 . - CDS 41306 - 42928 1211 ## EUBREC_3583 hypothetical protein 38 18 Op 2 . - CDS 42900 - 43298 326 ## EUBREC_3584 hypothetical protein - Prom 43415 - 43474 1.8 + Prom 43467 - 43526 3.9 39 19 Tu 1 . 
+ CDS 43627 - 43773 74 ## + Term 43783 - 43821 4.3 Predicted protein(s) >gi|229784089|gb|GG667646.1| GENE 1 2 - 1169 846 389 aa, chain - ## HITS:1 COG:YPO2040 KEGG:ns NR:ns ## COG: YPO2040 COG0477 # Protein_GI_number: 16122279 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 10 375 11 376 401 174 31.0 3e-43 METQINQRRGLLILLFFTFFMVVGFDMIMPLVIGHYVNNVGLTATAVSVALAIRQFSQQG LAMIGGALADRFQVKTLISVGVFLRVLGFLTLAFSSAYGMLIAAMILIGLGGVLFEMPYQ SAIAMLTTNDNRPKYYSLNNTITGIASAIGPLLGVALLRLDFKVVCFGAASCFMINFVLS RVAMPPIMRQAKSYAVMPAVKRIAKDRPYQIFVLLMMVFWLGASQIDITYPLRVQEIYGS ADGVGIMYTVYAVVMAVLQYPLVALASKRFSSRAAVVIGTGIITAALIATPFAADVKSFM AIVACYAVGMILARPNQQNIAVSMADTRALGMYLGVNSTAFAIGKGFGTIIGGASFDMAK KYTLENAPWYLYGVLVGTAMIGFMLYKNF >gi|229784089|gb|GG667646.1| GENE 2 1194 - 1994 346 266 aa, chain - ## HITS:1 COG:AGl2320 KEGG:ns NR:ns ## COG: AGl2320 COG0500 # Protein_GI_number: 15891269 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 29 146 31 148 298 71 28.0 2e-12 MIRLKTNTDNLLHSNPELYAVFENDKSFEMARFIHAILQKYETGKKVLDVGSGLGREVAF LREKGYDAVGIDASEDMVAWSQKHYPNANFYIGQQAMFDLNQTFDALVCVGSTFLYNYTN EDAGSTLTNFRKHLKDGGILYLDMRNAAFFLTKEGQRWLTDELLEEAIFDGKPAIVKTRF YIDLAKQILKRDYNWKIGTREAIIEHLEHRLFFPQEIAGLLSANGFEVVELFDKPEPHIS SFTDIRPFQFGYELSGRRLQVIAKAI >gi|229784089|gb|GG667646.1| GENE 3 2006 - 2809 331 267 aa, chain - ## HITS:1 COG:no KEGG:Amir_4779 NR:ns ## KEGG: Amir_4779 # Name: not_defined # Def: hypothetical protein # Organism: A.mirum # Pathway: not_defined # 7 267 5 266 270 245 48.0 1e-63 MKQYKNVNELWGAIVNGDMDPSVSDFFVTGASYIYQTTQFPNSINKYHNYYLLLRVGTYF GACSHVPEQLGIEAGREFSGKSLKSCVIDPRLPVRIAAMDAYLGTVYPHERQCSDAVELP 
GGTPAEKADFRDNLIADIANIEQGMRIALIGVVNPLIAAIRKRGGTCLPCDLQMERTQWG DPVEKDMEKVLDSADGVICTGMTLSNGTFDRVVERVLERNIPLTVYAQTGSAVAARFVGN GITSLLAESFPFSQFSANATRVFCYKG >gi|229784089|gb|GG667646.1| GENE 4 2796 - 4064 458 422 aa, chain - ## HITS:1 COG:mll9226_1 KEGG:ns NR:ns ## COG: mll9226_1 COG0439 # Protein_GI_number: 13488126 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Mesorhizobium loti # 4 349 28 363 458 124 26.0 2e-28 MAHLLMIESWVGASGNLLPPLLRELGHSYTFVTRKASHYQNPMSSEKHAVFRYADDVVET ETNDLQQLIESVRHVRFDGVITVCDYYIETVCEVAKAYNVPCPFSANVKTARQKHLMRQA VDRAGLSNPKYMLAYNWEEVEKAAQEIGYPLVLKPVDLASSAFVKLIQTSNDLQQAYRDL ENFPLNFRDQERERAFLIEEYICGEEVSVETVSNNGEITVVGITDKSVTGSPYFIENGHM FPAKLDEDVKEEIGDYVRNVLLATGYDHGIAHTEIKLTKDGPKVVEINPRTAGNYIVELI EYVTGVNMLKAFVSLALGQQPSVTVSDTNVSSAAIMFIVPDCGGKIRQIKGADTLREDSN IIRYRIEDCVGKSIEKPIDNACYLGHVITRDCDGYGARMFAETALRRLEIEFEETERVHE AI >gi|229784089|gb|GG667646.1| GENE 5 4214 - 5074 201 286 aa, chain + ## HITS:1 COG:CAC3466 KEGG:ns NR:ns ## COG: CAC3466 COG0583 # Protein_GI_number: 15896705 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 282 1 285 292 166 32.0 7e-41 MNIDYLRTFLTLAENGSFSETAKEHIVVQSTVSSRIQELEKELGQSLFIRSYNHAELTLA GQAFLEYAKNIVELENRAIDRIGLVGQFSERLTIGTVYAFHKCYLVQGTRRFLEKHSDIS VRIEFGHSRHIISAMHQGKIDIGYSHHPFRHTGFQCDLVCEDEIILVTSKQPTDLNECIT LKDMETLRIYNSNFLYADTHNRIFSRYKAFQLDIDIGENIVPYLLNDDGYAFVPRKMVEQ ELSNESLFEVTILNEEIPSMKNYVIYKPTSPKRKLIQQWLNEVVSD >gi|229784089|gb|GG667646.1| GENE 6 5016 - 5522 100 168 aa, chain - ## HITS:1 COG:CAC3535 KEGG:ns NR:ns ## COG: CAC3535 COG1002 # Protein_GI_number: 15896771 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Clostridium acetobutylicum # 14 133 563 684 993 102 45.0 4e-22 MPDLPRIRDKYYKQFDKYFIFVEQAIRKTKDAGYICYIVPNKFFKVGAGEHLRSLIAKGR YLVSLDDFGDAQLFEDKTIYSSILLLCKSPQEKFRYTGVDSADKLWVGEEVSSIEFGSSI LNKLRMAITQGRDGSKSYLSLFIITANNPFNPKQLRSAIAVSVFVLGK 
>gi|229784089|gb|GG667646.1| GENE 7 5682 - 7097 1298 471 aa, chain - ## HITS:1 COG:SP1056_1 KEGG:ns NR:ns ## COG: SP1056_1 COG3843 # Protein_GI_number: 15900926 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD2 components (relaxase) # Organism: Streptococcus pneumoniae TIGR4 # 24 173 18 180 402 83 33.0 1e-15 MATTRIMPLHIGKGRTESQAVSDIIDYVANPQKTDNGKLITGYGCDSRTADAEFLLAKRQ YIAATGRVRGADDVIAYHVRQSFKPGEITPEEANRLGVEFAMRFTKGNHAFVVCTHIDKS HIHNHIIWSAVNMDCDRKFRNFWGSTRAVRRLSDTICIENGLSIVENPKPHGKSYNKWLG DQAKPSHREQLRVMIDRALEQKPADFDALLKLLSEMGCEVSRRGKAIRLKAPGWKNVARM DDKLGAGYSEAEIRAVLAGEKQHTPRKKSIVQPEPPKVNLLVDIQAKLQAGKGAGYARWA KVFNLKQMAQTVNFLTEHHLLDYAELAEKAAAATAHHNELSAQIKAAEKCMAEIAVLRTH IVNYAKTRETYVAYRKAGYSKKFREEHEEEILLHQAAKNAFDEMGVKKLPKVKDLQAEYA KLLEEKKKTYAEYRRSREEMRELLTAKANVDRLLKMDEEQKKEQEKDHGQR >gi|229784089|gb|GG667646.1| GENE 8 7154 - 7582 332 142 aa, chain - ## HITS:1 COG:Rv2802c KEGG:ns NR:ns ## COG: Rv2802c COG5586 # Protein_GI_number: 15609939 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis H37Rv # 23 137 23 131 347 61 33.0 5e-10 MTEKELIGKVHSAVYHQCQRRGYAAPVDVLMEVGVLPKQKYEDWRFGRVDYLERVCTVNL RKLSFIMHQMRVYAQKTGLKPSFCYYKQWGVKKKNGQGHKPVIPLRFSKSGNSEIEKWYA THFVDTKRIAALKAQQPVENSD >gi|229784089|gb|GG667646.1| GENE 9 7575 - 8615 725 346 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 8 331 5 328 339 283 43 9e-76 MENQLTPNNSMILEIRELLENARKNVAQQVNTQLLTTYWNIGRIIVEYEQQNQIRADYGK QTLKELSKELTREFGKGFSRSNLQNMRAFYLAYEKCQTVSGKLSWSHYCELLSITDENKR SFYEKESVNSGWSVRELKRQIDSSLYERLLLSSEDVNKEKVLSLAQKGVEISQPADIIRD PYVFEFLGVPENKPMLESDLEKALVAQIEKFLLELGRGFMFVGTQQRVTLNNTHYYVDMV FYNKILRAYVLIELKTKKLTPEAAGQLNMYLNYYAAEVNDPDDNPPIGIILCTEKDSIAA EYALGGLSNNIFASRYVLYMPDKEQLIAQVEAVLKNWHEKKDNRHD >gi|229784089|gb|GG667646.1| GENE 10 8721 - 9032 510 103 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1912 NR:ns ## 
KEGG: Ethha_1912 # Name: not_defined # Def: membrane protein # Organism: E.harbinense # Pathway: not_defined # 8 103 1 96 96 84 63.0 1e-15 MRALVFLLKLLLKILVAPVILALTLFVWICVGIVYISGLVLGLISMVIALLGVAVLITYS LQNGIILLVMAFLISPFGLPMAAIWLLGKVQDLKFAIQDLVYG >gi|229784089|gb|GG667646.1| GENE 11 9189 - 9533 292 114 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1911 NR:ns ## KEGG: Ethha_1911 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 6 113 4 111 112 129 64.0 5e-29 MSSPKRKRDVPVLFWVSADEMELIQQKMAQFGTKNLSAYLRKMAVDGYVVQLDLPELKEL VSLLRRSSNNLNQLTRKVHETGRIYDADLEDISQRQEQLWDGVKEILTRLSKLS >gi|229784089|gb|GG667646.1| GENE 12 9530 - 10336 908 268 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1910 NR:ns ## KEGG: Ethha_1910 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 106 261 1 164 254 102 37.0 1e-20 MNTNDLNTALYEKMAAEQDKYRDWLKSQPPEETLHHTYEYAIREDIVMAMEELELTDAQA QALLESPSPLADVYRYFEKLETGYMDAIRDSIENRADDACRAQEELRTAPLYPHSAAYAS QHGEMAQYNRSYQANSACKEAIEQTISAHYAENRLDTEAAVKDVLEKFGAERVQFILANT IQHKNRDGRISQDNKVWAKTIPMPEDSGTSRHCAYLVVDGVNPGLTDLFTRQARKTMQEQ QKSSVLQKLRQEPSIHKPAAPKKREPER >gi|229784089|gb|GG667646.1| GENE 13 10339 - 14610 3406 1423 aa, chain - ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 84 1380 1 1308 1315 644 32.0 0 MRPGGVVAFVTSRYTMDAKDSTVRRYLAQRAELLGAIRLPNDAFKKNAGAEVVSDIIFLQ KRDRPLDIVPEWTQTGQTEDGFAINRYFLDHPEMVLGRQEPESTAHGMDYTVNPIEGLEL ADQLHDAVKHIRGTYQEADLPELGEGEAIDTSIPADPNVKNYSYTVVDGDVYFRENSRMV RPDLNATAEARVQGLVGLRECVQQLIDLQMDAATPDSAIRDKQAELNRLYDSFSAKYGLI NDRANRLAFADDSSYYLLCALEVIDEDGKLERKADMFTKRTIKPHKAVETVDTASEALAV SIAERACVDMAYMSELTGKTSDELAAELQGVIFRVPGQVEKDGTPHYVTADEYLSGNVRR KLRQAQRAAQQDPSFAANVEALTAAQPKDLDASEIEVRLGATWIDKEYIQQFMYETFDTP FYMQRNIEVNYTPFTAEWQITGKSSISQNNVAAYTTYGTSRANAYKILEDSLNLRDVRIY 
DTVEDADGRERRVLNAKETTLAAQKQQAIRDAFKDWIWKDPDRRQALVRQYNEEMNSTRP REYDGGHITFGGMNPAITLREHQKNAIAHVLYGGNTLLAHEVGAGKTFEMVAAAMESKRL GLCQKSLFVVPNHLTEQWASEFLRLYPSANILVTTKKDFETHNRKKFCARIATGDYDAII MGHSQFEKIPISRERQERLLYEQIDEITEGIAEVQASGGERFTVKQLERTRKSLEARLEK LQAESRKDDVVTFEQLGVDRLFVDEAHNYKNLFLYTKMRNVAGLSTSDAQKSSDMFAKCR YMDEITGNRGVIFATGTPVSNSMTELYTMQRYLQYDRLQELNMTHFDCWASRFGETVTAL ELAPEGTGYRARTRFSKFFNLPELMNLFKEVADIKTADQLNLPTPEVEYHNIVAQPTEHQ QEMVKALSERASEVHRGSVDPSVDNMLKITSDGRKLGLDQRIINQLLPDEPGTKVNQCVD NIMQIWRDGDADKLTQLVFCDISTPQAAPYKKAAKQLDNPLLHGLEEAIPLDEPEPAFTI YEDIRQKLIAQGMPADQIAFIHEANTEVRKKELFSKVRTGQVRVLLGSTAKMGAGTNVQD RLVALHDLDCPWRPGDLAQRKGRIERQGNQNPLVHVYRYVTEGTFDAYLWQTVENKQKFI SQIMTSKSPVRSCDDVDETALSFAEIKALCAGDPRIKERMDLDVEVAKLKLMKADHQSKQ YRLEDQLLKYFPQEIETNKGYIQGFEVDLETLVAHPHPADGFAGMEIRGDVLTDKENAGA ALLDACKEVKTSDPVQIGSYRGFIMSVEFEAWKQEYTLLLKGQMTHRATLGTDPRGNLTR IDNALAQMPQRLEAVKNQLENLYQQQAAAKEEVGKPFPFEDDLRIKSARLVELDTLLNID GKGHAQPETVAAKSARPSVLDSLKRPVPPRSPEKKPKQHEEVR >gi|229784089|gb|GG667646.1| GENE 14 15563 - 16729 1112 388 aa, chain - ## HITS:1 COG:pli0004 KEGG:ns NR:ns ## COG: pli0004 COG0827 # Protein_GI_number: 18450290 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Listeria innocua # 251 386 379 520 756 117 47.0 3e-26 MGDTVYIEDDAYQITELRDDTVQLLPTGMVYPIYRAERKEQFEQLLRADRRNAYYTEFLP IDPDKADQDLRDVLAHGLMDEADKKQVSTLLQSGRSNSEIAYWLSRAYPREIETLDLETG DIADYRTTAQGMELEVLDAEEKRLAVLYIRWDEVAPLLRGMYARQLDGFGQEQPQPSAES PAFHSETVAVYPGDKNNLPYDVVVERLHIEEPEPPAPVTEPEKTFEEVLDEHPVSIPVNG QWQTFPNARAAEEASYEEYKANLRHNAQNFRITDAHLGEGGPKAKFQANIEAIKLLKHLE ETTGQATPEQQEILSRYVGWGGLADAFDPEKPAWAAEYAQLKELLTPEEYAAARSSTLNA HYTSPTVIQAIYEAVGRMGFETGNILAS >gi|229784089|gb|GG667646.1| GENE 15 17724 - 20795 1270 1023 aa, chain - ## HITS:1 COG:pli0004 KEGG:ns NR:ns ## COG: pli0004 COG0827 # Protein_GI_number: 18450290 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Listeria innocua # 2 100 12 103 756 69 40.0 
3e-11 MPSKTEFYRQMADHVATQLTGSWQEWAGFLTTAARLYKYPFHEQLLIYAQRPDATACAEY DLWNEKMGRYVRRGSKGIALVDDSGDRPRLRYVFDISDTGTREHSRTPWLWQLEERHLDS VQAMLERTYDVSGDDLAGQLTEVAGKLAEEYWTEHQQDFFYIVDGSFLEEYDEYNIGVQF KAAATVSITYALMSRCGLEPERYFDHEDFMAIFDFNTPSTIGALGTAVSQINQQVLRQIG VTVRNAEREANQERSKQDEQSHDLHPERRLSDSRPEAEPAAGETPGQVRQDEENLPERTP SHPLQPDAAEREVVPAPSGDRRDRPEQTGADDAPAGERSGSHRATESQRSHEVGGADEYL QSTGRGDPDGGAYQQLTLNLFLSEAEQIQSIDEAENVAHTSSAFSFAQNDIDHVLRLGGN TDRQRERVVAAFEKQKTTAEIAEILKTLYHGGNGLGSVSAWYAEDGIHLSHGKSVRYDRS AKVISWESAAERIGELLESGQFASNVELAEAAGYERSLLAEKLWHLYHDFSDKARDSGYL SCLSGIQRTGFPEETAWLTEQLNSPEFRQTLAEEYAAFWTAYQQDRELLRFHYHKPREIW ESLQDLSLPRKSFSSEMQDVPAVKQFITEDEIDAAMTGGSGIEGGKGRIFTFFKNPHTDK EKVDFLKSEYGIGGHSHALSGAMGSNEDHDGKGLHYKKDGCPDMHFTWEKVAKRITGLIQ KGRYLTEQEQAQYDKIQAEKALAEEDALQAQQPTPEIWEYNGVKERHSDDIVLYQMGDFF ELYGEDAKTAAAELDFHLTTRAIPGGGRVEMCGFPANRLEQVVEHLRDQHDVTISAVPEG GRERQEYSMLSIDHEAEQHINAQEAEFGADGTRVFRDMEPEQATPTIRELYEKYKPIVME AVTQDTRYRNACGHSDYENAMIECNAAVRRTILDSHDIELIRLFSDVPEFRQWLHREVAD ETYPKLHELLRPLSQEDIDSALCAWNGNIESKHAVVRYMKDHARRKTPPHGWLRNTAAVT ACS >gi|229784089|gb|GG667646.1| GENE 16 20878 - 21426 130 182 aa, chain - ## HITS:1 COG:CAC0567 KEGG:ns NR:ns ## COG: CAC0567 COG0500 # Protein_GI_number: 15893857 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 1 182 26 209 209 139 38.0 3e-33 MNKGHAAVSDWGMEHITKLHPSKIIDLGCGGGRNAAELLKRFPAATVHALDYSEVSVQKT KQFNQQAIKNGRLQVTHANVLNLPFSADTFDLATAFETVYFWPGPLESFKEVYRILKKGG CFLIVNESDGTNKADDKWLKIIDNMKIYTVEQLEFFLTEAGFSEITIDHKNKKHHICILA KK >gi|229784089|gb|GG667646.1| GENE 17 21530 - 21757 207 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621530|ref|ZP_06114465.1| ## NR: gi|266621530|ref|ZP_06114465.1| peptidase M50 [Clostridium hathewayi DSM 13479] peptidase M50 [Clostridium hathewayi DSM 13479] # 1 75 1 75 75 110 100.0 3e-23 MKKTLIGGFLTLSGTIGIVILLVACILNPVTSWITPPGRLICTMLEHGIAVPIGCFLIIF ITGLFILGIEYRKKI 
>gi|229784089|gb|GG667646.1| GENE 18 21851 - 22066 325 71 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1906 NR:ns ## KEGG: Ethha_1906 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 4 71 2 69 72 71 63.0 1e-11 MGNFTFEEMNLMCIYNTGSRTGLMEALTEMRGELSPEETELRELTDSALMKLQAMSDAEF SQLELYPDFDE >gi|229784089|gb|GG667646.1| GENE 19 22068 - 24539 2271 823 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1905 NR:ns ## KEGG: Ethha_1905 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 418 822 393 816 817 341 46.0 1e-91 MAENKSTNRERLREITDGIEQGIKELFESEKYMRYLSVMSRFHRYSVNNTMLIYMQKPDA TLVAGFNKWKNQFERHVKKGEHGITIIAPTPYKKKIEEMKRDPDTHAPILDADGKAVMEK KEIEIPMFRPVKVFDVSQTDGKPLPELASSLSGTVPHYEAFLEAVRRSAPVPIEFEPMAD NMDGYFSPERQRIAIREGMSEVQTVSAAVHEIAHSKLHNQKKIQIDNDQQYQEVELFEKP ALFSNGRISHDDLPEGVYCYDLRGSDDDPGFPICVEERVVVNHAGSVILTAPLEFSEEGR LYFTDENGLNFNGGMLTLSQFLQEQKKDRRTEEVEAESISYAVCQYFGIQTGENSFGYIA SWSKGKELKELRASLETINKTSCELINDIERHYKEICKERGIDLTAKQEPEQADIPQKAL FLLNDATYLHIQPCDTGWDYTLYDKETMKELDGGQLDEPELSRSAAVRQICEGLELENPS IQDAPLSMIETLQDAACQQMQEQVSQQTAETAATQLPDAQEQSLDEYPMPDSEVSVSDMQ EYGYLYDGMLPVTRERALELDAAGLTVYLLHEDNTESMVFDTEEIMEHGGLFGVEHEEWE QSPAFHEKVLERQERQMEREQAFLAHEGSCFAIYQVSRDDPQNVRFMNLDWLQSHNLAVD RNNYDMIYTAPLNGSGSAMEQLETLYEQFNLQKPVDFHSPSMSVSDIVAIKQDGKVSCHY CDSVGFTEIPGFLPDNPLKNAEMMLEDDYGMIDGIINNGPKERTVAQLEQQARSGQPISL MDLAAAAHREERDKKKSVMEQLKSQPRTERKKAAPKKSAEREI >gi|229784089|gb|GG667646.1| GENE 20 25628 - 25993 230 121 aa, chain - ## HITS:1 COG:RC1330 KEGG:ns NR:ns ## COG: RC1330 COG0358 # Protein_GI_number: 15893253 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Rickettsia conorii # 4 117 7 124 595 68 32.0 3e-12 MDNLFETVKQSITIREAAERYGIEVKRGGMVCCPFHDDKNPSMKLNKEYFYCFGCGATGD VIDLASRLYNLSPKEAAEKLAQDFGLIYDSQAPPRRNYVRQKTETQQFREDRQRCYRILS D >gi|229784089|gb|GG667646.1| GENE 21 25996 - 26277 243 93 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1904 NR:ns ## KEGG: Ethha_1904 # 
Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 93 1 98 100 73 55.0 2e-12 MKLSLYEQETILLYNQAEDTAEVYTHDRKLMEKLTSLSKKHPEQVCKKDKHNFTVPKRCV SVREPYSAERRKAASERAKAAGYRPPVRKADSE >gi|229784089|gb|GG667646.1| GENE 22 26274 - 28397 1519 707 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 5 642 5 656 709 461 39.0 1e-129 MGCKLVIAEKPSVAQSLAAVIGATVRKDGYLEGNGWRVSWCVGHLAGLADADSYDPKYAK WRYDDLPILPEHWQMVVDKDKKKQFDVLKKLMNAPDVTEVVNACDAGREGELIFRSVYEL AGCQKPMKRLWISSMEDSAIREGFANLRPGADYDGLHQAALCRAKADWLVGINATRLFSV LYHRTLNIGRVMSPTLALIVQREAEIDTFKPVPFYTIVLELPEFSVSGERMADKAAAEQL KTACQGADVTVKKVERKDKTEKPPALYDLTTLQRDANRLLGYTAQQTLDYLQNLYEKKLC TYPRTDSRYLTSDMAEGLPVLVNLVANAMPFRKGIAISCDEALVINDKKVTDHHAVIPTR NLHGADLSGLPVGERMILELVALRLLCAVAEPHTYSETAVTVECAGAEFTTKGKTVKRPG WRALDAAYRAGLKNAEPDKETEDKALPDGGRLPELSEGQTLPLSSSAVKEGKTSPPKHFT EDTLLSAMETAGSKEMPDDAERKGLGTPATRAGILEKLVSTGFLERKKSKKAVQLMPSKD AVSLITVLPEQLQSPLLTAEWEHRLSEIERGELSPEDFMGGICAMLRELVGTYQVIKGTE YLFTPPREVVGKCPRCGGDVAELQKGFFCQTESCKFAIWKNNKWWEMKRNQPTKAIVTAL LKEGRAHVSGLYSEKTGKTYDATVVLVDDGKYVGFKLEFDRQKGGKR >gi|229784089|gb|GG667646.1| GENE 23 29582 - 30352 1017 256 aa, chain - ## HITS:1 COG:no KEGG:CD1107 NR:ns ## KEGG: CD1107 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 231 3 229 244 226 62.0 7e-58 MKNKRIFKTFSALCAALVLMMGLSVTAFAQGTEQPPAEDATNDENVVVEKTEDSPALTPE GNAALVDDFGGNKQLITVTTKAGNYFYILIDRANEDKETAVHFLNQVDEADLMALMEDGQ SAEKPPVVCNCTEKCEAGAVNTKCEVCSTDMTGCTGKEAEPETPAEPAEPEKKESAGLNP VVLLLVLAVMGGAAFTYFKFIKGKANHKNSSNPDDYDYEDGEELPDEEEIELEDEESGEP DADGGSAEEDDEDSVK >gi|229784089|gb|GG667646.1| GENE 24 30342 - 30596 397 84 aa, chain - ## HITS:1 COG:no KEGG:CD1107A NR:ns ## KEGG: CD1107A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 4 67 5 68 
85 82 78.0 5e-15 MATKIERIDREITKTREKIAEYQEKLKTLEAQKTEAENLEIVQMVRALRLTPEQLNAMLS GGTVPGMAAASADYNEQEDTAHEE >gi|229784089|gb|GG667646.1| GENE 25 30624 - 32357 1708 577 aa, chain - ## HITS:1 COG:no KEGG:CD1108 NR:ns ## KEGG: CD1108 # Name: not_defined # Def: putative DNA-repair protein # Organism: C.difficile # Pathway: not_defined # 6 577 71 646 646 673 62.0 0 MRKEPRLRFTDEERADPALEKPIRKVDRAAVKADKAQAKIPKKQVRQKAVDLGTGKVITK LVLEDKKKPPSKLSHAVRDAPGDAALGKLHKEIRETEQDNVGVESAHKSEEAVETGARLV REGYRSHKLKPYRKAAQAEQKLEKANVNALYQKSLRENPQLTSNPFSRWQQKQAIKKEYA AAKRAGQAAGNTANTAKKTGKAAKTVKEKAQQAGAFVMRHKKGFLLVGAIFLLICLLLNT MSSCSMMAQSIGSAISGSTYPSDDLELVAVEADYAAKEAALQAEIDNIEISHPGYDEYRY DLDMIGHDPHELAAYLSAVLQGYTRQSAQAELERVFDAQYQLTLTEEVEVRYRTETRTDS EGNSYTVEVPYNYYILNVKLTSKPISSVASELLTPEQLEMYQVYRQTLGNKPLIFGGGST NTSDSESLEGVEFVNGTRPGNPELVELAKRQVGNVGGQPYWSWYGFNSRVEWCACFVSWC YGQMGLSEPRFAGCQSQGVPWFQSHGQWGARGYDNLAPGDAIFFDWDLDGSADHVGIVIG TDGSRVYTVEGNSGDACKIRSYDVNYECIKGYGLMNW >gi|229784089|gb|GG667646.1| GENE 26 32350 - 32718 563 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621540|ref|ZP_06114475.1| ## NR: gi|266621540|ref|ZP_06114475.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 122 1 122 122 251 100.0 2e-65 MTKKEIEKGAGLLLAAVSGCEQMVELLRAAVHYCYEQEKTAKNLTPKGYAGFTRIHHIWT EGIPDVPKLFMGSPIQDRDMARGTLMALFLSTTVFPAETIQIPVYAETCVLLDDLMAGGW YA >gi|229784089|gb|GG667646.1| GENE 27 32699 - 33163 396 154 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1897 NR:ns ## KEGG: Ethha_1897 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 145 644 788 794 276 86.0 3e-73 MWNRVTINRAAHKSTRYYIDEMHLLLKEEQTAAYTVEIWKRFRKWGGIPTGITQNVKDLL SSREVENIFENSDFVYMLNQAGGDRQILAKQLGISPHQLSYVTHSGEGEGLLFYGSTILP FVDHFPKNTELYRIMTTKPLDLKKEDEQHDKERN >gi|229784089|gb|GG667646.1| GENE 28 34088 - 34693 643 201 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1897 NR:ns ## KEGG: Ethha_1897 # Name: not_defined # Def: hypothetical protein 
# Organism: E.harbinense # Pathway: not_defined # 1 200 92 291 794 306 73.0 5e-82 MNLAASEETFARAINIPLQGDDFDSIRIEYMTMLQNQLAKGNNGLIKTKYLTFGIDADSI KAAKPRLERIETDILNNFKRLGVAAETLDGKARLAQLHGIFHMDEQLPFRFEWDWLPSSG LSTKDFIAPSSFEFRTGKQFRMGKKYGAVSFVQILAPELNDRMLADFLDMESNLIVSLHI QSVDQIKAIKTVKRKITDLDK >gi|229784089|gb|GG667646.1| GENE 29 35659 - 35922 169 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160915395|ref|ZP_02077607.1| ## NR: gi|160915395|ref|ZP_02077607.1| hypothetical protein EUBDOL_01403 [Eubacterium dolichum DSM 3991] hypothetical protein EUBDOL_01403 [Eubacterium dolichum DSM 3991] # 1 51 3 53 821 96 90.0 5e-19 MFTAIKRWLHRLFGKNEEKAVQPVKTKKKLSRADRKQIEAAIARANRTDKKENRHRTVSL MNECGRTVSAGYQTVTTQKPFNFRTST >gi|229784089|gb|GG667646.1| GENE 30 35807 - 36274 302 155 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1896 NR:ns ## KEGG: Ethha_1896 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 155 1 141 152 172 62.0 5e-42 MAYVPVPKDLTKVKTKVAFNLTKRQLVCFGGGALLGVPLFFLLRGSAGNSTAALCMMFVM LPFFLLAMYEKHGQPLEKIVGNIIRVTVIRPRQRPYRTNNFYAVLQRQANLDEEVSHIVY GNQTLAAPAVRKKRGKSCAAGQNKEKAVPRRQKAD >gi|229784089|gb|GG667646.1| GENE 31 36291 - 37160 938 289 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1894 NR:ns ## KEGG: Ethha_1894 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 289 1 289 289 398 72.0 1e-109 MDFLLDALTEWLKEMLVGGIMGNLSGMFDSVNQQVGDISVQVGQTPQGWNGGIFNMIQNL SNSIMVPIAGVILAIVMTLELIQMITDKNNLHDVDTWMIFKWVFKSAAAILLVTNTWNIV MGVFDMAQSVVAQASGIISSDAAIDISSVLTDMEPRLMEMDLGPLFGLWFQSMFIGVTMW ALYICIFIVIYGRMIEIYLVTSVAPIPMAAMMGKEWGGMGQNYLRSLLALGFQAFLIIVC VAIYAVLVQGIATEEDVIMAIWSCVGYTVLLCFTLFKTGSLAKAVFQAH >gi|229784089|gb|GG667646.1| GENE 32 37411 - 39021 961 536 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 6 301 2 
301 301 325 56.0 2e-88 MLRQTNQQPITALYPRLSHEDELQGESNSISNQKRILETYAKQNGFSNLRWYTDDGYSGA NFQRPGFQAMLADIEAGKVGTVIVKDMSRLGRNYLQVGMYTEMIFPQKGVRFIAINDGVD SAQGDNDFAPLRNIFNEWLVRDTSKKIKAVKRSKGMSGKPITSKPVYGYLMDEDENFIID EEAAPVVKQIYNLCLAGNGPTKIARMLTEQQIPTPGTLEYRRTGSTRRYHPGYECKWATN TVVHILENREYTGCLVNFKTEKLSYKVKHSVENPLEKQVIFENHHEPIIDTQTWERVQEL RKQRKRPNRYDEVGLFSGILFCADCGSVMYQQRYQTDKRKQDCYICGNYKKRTHDCTAHF IRTDLLTAGVLSNLRKVTSYAAKHEARFMKLLIEQNEDGGKRRNAAKKKELEATEKRIAE LSAIFKRLYEDSVTGRISDERFTELSADYEAEQRELKERAAAIQAELSKAQEATVNAEKF MNVVRRHTSFEELTPTLLREFVEKIVVHECSYDENKTRRQDIEIYYSFVGKVDLPE >gi|229784089|gb|GG667646.1| GENE 33 39133 - 39423 177 96 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3579 NR:ns ## KEGG: EUBREC_3579 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 35 96 1 62 62 88 100.0 1e-16 MRPRPLHGQQLPKGNRTGKAGTLKATDEAAKGEPMTETKQTSTTKTDRRPDCVTEIRMGN SVLTVSGFFKQGATDTAADKMMKVLEAEAATQKTAI >gi|229784089|gb|GG667646.1| GENE 34 39318 - 40172 742 284 aa, chain - ## HITS:1 COG:CAC1933 KEGG:ns NR:ns ## COG: CAC1933 COG1484 # Protein_GI_number: 15895206 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 56 277 49 281 282 114 28.0 2e-25 MKNEIEAMITDITATTAEAEDYTGEDGLLYCGKCHTPKEAYFAEGKTCFERDRHPTDCDC QRAAREKQQAAESRQKHLEKVEDLKRRGFTDPAMRNWTFEHDNGRNPQTETARFYVESWE TMQAENIGYLFWGGVGTGKSYLAACIANALMEKEVAVCMTNFATILNDLAASFDGRNEYI SRLCSYPLLILDDFGMERGTEYGLEQVYSVIDSRYRSDKPLIATTNLTLEELQHPQDTPH ARIYDRLTSMCAPVRFTGSNFRKETAQEKLERLKQLMKQRKENL >gi|229784089|gb|GG667646.1| GENE 35 40169 - 40918 689 249 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1762 NR:ns ## KEGG: CDR20291_1762 # Name: not_defined # Def: phage protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 249 1 244 244 287 69.0 2e-76 MADNRKYYYLKLKENFFDSDSIVLLEDMKDGILYSNILLKLYLKSLKNGGKLQLDEHIPY TAQMIATLTRHQIGTVERALEIFRQLGLVEQLDSGAFYMTDIELMIGQSSTEAERKRAAR LENKALLPPRTKGGHLSDIRPPEKEIELEKEIEIEKEREGETGHPAPAAYGRYNNVILTD 
TELSGLKTELPDKWEYYIDRLSCHIASTGKQYHSHAATIYKWAQEDAAKGKAAPKQGIPD YSCKEGESL >gi|229784089|gb|GG667646.1| GENE 36 40908 - 41120 178 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160945844|ref|ZP_02093070.1| ## NR: gi|160945844|ref|ZP_02093070.1| hypothetical protein FAEPRAM212_03377 [Faecalibacterium prausnitzii M21/2] hypothetical protein FAEPRAM212_03377 [Faecalibacterium prausnitzii M21/2] # 1 70 1 70 70 118 95.0 2e-25 MPRMSKKRRLEWSFFLRQVKVGNSTCDRITYNDLCRGCTHSCKQSFRAIIILCPHYYSKR RKKEDRDNGR >gi|229784089|gb|GG667646.1| GENE 37 41306 - 42928 1211 540 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3583 NR:ns ## KEGG: EUBREC_3583 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 538 3 539 543 929 94.0 0 MATFKHISSKNADYGAAEAYLTFEHDEFTMKPTLDENGRLIPREDYRISSLNCGGEDFAV ACMRANLRYEKNQKREDVKSHHYIISFDPRDGTDNGLTTDRAQELGEQFCKAHFPGHQAL ICTHPDGHNHSGNIHVHIVINSLRIYEVPLLPYMDRPADTREGCKHRCTNAAMEYFKSEV MEMCHREGLYQIDLLNGSKERITEREYWAAKKGQLALDKENAAREAAGQPTKPTKFETDK AKLRRTIRQALSQAASFDEFSSLLLREGVTVKESRGRLSYLTPDRTKPITARKLGDDFDK AAVLALLTQNANRAAEQTKAIPEYPAAAKKPSQGEKTAKTTPADNTLQRMVDREAKRAEG KGVGYDRWAAKHNLKQMAATVTAYQQYGFSSPEELDEACSAAYAAMQESLAELKQVEKTL NGKKELQRQVLAYSKTRPVRDGLKQQKNAKAKAAYRQKYESDFIIADAAARYFRENGISK LPSYKAMQAEIETLIQEKNSGYNDYRAKREEYRRLQTVKGNIDQILHRERKPVKRQEQER >gi|229784089|gb|GG667646.1| GENE 38 42900 - 43298 326 132 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3584 NR:ns ## KEGG: EUBREC_3584 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 18 132 1 115 115 214 99.0 1e-54 MRKRYNTPHRSRVVKTRMTEEEYAEFAERLSAYNMSQAEFIRQAITGAAIRPIITVSPVN DELLAAVGKLTAEYGRIGGNLNQIARTLNEWHSPYPQLAGEVRAAVSDLAALKFEVLQKV GDAVGNIQTYQL >gi|229784089|gb|GG667646.1| GENE 39 43627 - 43773 74 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIENGNFPLLTYQYRYFLDLPKMGAKKAANLFQTYCFMCRYGAAVIQL Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:50:40 2011 Seq name: gi|229784088|gb|GG667647.1| Clostridium hathewayi DSM 13479 genomic 
scaffold Scfld40, whole genome shotgun sequence Length of sequence - 42832 bp Number of predicted genes - 42, with homology - 41 Number of transcription units - 16, operones - 8 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1400 1539 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 2 1 Op 2 . - CDS 1430 - 2680 1017 ## gi|266621556|ref|ZP_06114491.1| conserved hypothetical protein - Prom 2703 - 2762 6.4 + Prom 2747 - 2806 7.3 3 2 Tu 1 . + CDS 2833 - 3327 394 ## gi|288870545|ref|ZP_06114492.2| conserved hypothetical protein - Term 4505 - 4555 17.1 4 3 Op 1 2/0.000 - CDS 4568 - 5125 705 ## COG4720 Predicted membrane protein 5 3 Op 2 . - CDS 5190 - 6032 1013 ## COG1912 Uncharacterized conserved protein 6 3 Op 3 34/0.000 - CDS 6060 - 6896 935 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 7 3 Op 4 . - CDS 6893 - 8608 295 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 8 3 Op 5 . - CDS 8630 - 10150 1612 ## COG2188 Transcriptional regulators - Prom 10300 - 10359 6.1 + Prom 10259 - 10318 5.2 9 4 Tu 1 . + CDS 10342 - 11412 881 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases + Term 11461 - 11493 1.4 - Term 11416 - 11456 2.7 10 5 Op 1 . - CDS 11511 - 12371 980 ## COG1912 Uncharacterized conserved protein 11 5 Op 2 . - CDS 12468 - 12980 817 ## COG1528 Ferritin-like protein - Prom 13107 - 13166 5.8 12 6 Tu 1 . - CDS 13172 - 14710 1703 ## COG0696 Phosphoglyceromutase - Prom 14764 - 14823 5.2 13 7 Tu 1 . - CDS 14835 - 15323 403 ## Closa_3846 SARP family transcriptional regulator - Prom 15371 - 15430 4.8 - Term 15430 - 15487 15.1 14 8 Op 1 13/0.000 - CDS 15525 - 16277 888 ## COG0149 Triosephosphate isomerase 15 8 Op 2 26/0.000 - CDS 16346 - 17560 1517 ## COG0126 3-phosphoglycerate kinase - Prom 17618 - 17677 2.1 - Term 17677 - 17717 9.6 16 8 Op 3 . 
- CDS 17735 - 18766 939 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase - Prom 18878 - 18937 6.9 - Term 18810 - 18845 -0.9 17 9 Tu 1 . - CDS 18960 - 20627 1579 ## COG1283 Na+/phosphate symporter - Prom 20721 - 20780 6.3 + Prom 20794 - 20853 6.7 18 10 Tu 1 . + CDS 20881 - 21678 1005 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family + Term 21719 - 21768 1.2 - Term 21707 - 21754 5.4 19 11 Op 1 . - CDS 21817 - 22083 484 ## Cbei_4689 hypothetical protein 20 11 Op 2 . - CDS 22088 - 23167 1233 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 21 11 Op 3 2/0.000 - CDS 23201 - 23488 435 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 22 11 Op 4 . - CDS 23518 - 24201 765 ## COG4869 Propanediol utilization protein - Prom 24312 - 24371 20.8 23 12 Op 1 2/0.000 - CDS 25275 - 26687 408 ## PROTEIN SUPPORTED gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P - Term 26746 - 26774 -0.9 24 12 Op 2 . - CDS 26802 - 27383 711 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 25 12 Op 3 . - CDS 27397 - 28032 697 ## COG4766 Ethanolamine utilization protein 26 12 Op 4 . - CDS 28087 - 28350 369 ## COG4576 Carbon dioxide concentrating mechanism/carboxysome shell protein 27 12 Op 5 . - CDS 28344 - 28649 336 ## Closa_4011 microcompartments protein 28 12 Op 6 . - CDS 28666 - 29511 1008 ## COG4820 Ethanolamine utilization protein, possible chaperonin 29 12 Op 7 . - CDS 29493 - 30173 406 ## Cphy_1422 hypothetical protein - Prom 30321 - 30380 80.4 30 13 Op 1 . - CDS 31220 - 32254 1224 ## COG1454 Alcohol dehydrogenase, class IV 31 13 Op 2 4/0.000 - CDS 32283 - 32702 429 ## COG4917 Ethanolamine utilization protein 32 13 Op 3 . - CDS 32717 - 33067 424 ## COG4810 Ethanolamine utilization protein 33 13 Op 4 11/0.000 - CDS 33081 - 34034 1048 ## COG1180 Pyruvate-formate lyase-activating enzyme 34 13 Op 5 . 
- CDS 34063 - 36603 2743 ## COG1882 Pyruvate-formate lyase 35 13 Op 6 2/0.000 - CDS 36627 - 38093 301 ## PROTEIN SUPPORTED gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P 36 13 Op 7 5/0.000 - CDS 38112 - 38420 464 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 37 13 Op 8 5/0.000 - CDS 38453 - 38749 456 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 38 13 Op 9 . - CDS 38765 - 39052 419 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein - Prom 39196 - 39255 12.8 - Term 39325 - 39376 8.4 39 14 Tu 1 . - CDS 39463 - 40320 869 ## COG0789 Predicted transcriptional regulators - Prom 40382 - 40441 5.1 40 15 Tu 1 . + CDS 40259 - 40372 56 ## + Term 40406 - 40445 -0.4 41 16 Op 1 . - CDS 40466 - 41833 1697 ## COG0165 Argininosuccinate lyase 42 16 Op 2 . - CDS 41847 - 42830 941 ## COG4485 Predicted membrane protein Predicted protein(s) >gi|229784088|gb|GG667647.1| GENE 1 2 - 1400 1539 466 aa, chain - ## HITS:1 COG:DR2379 KEGG:ns NR:ns ## COG: DR2379 COG1132 # Protein_GI_number: 15807369 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Deinococcus radiodurans # 10 466 14 470 588 396 47.0 1e-110 MEQKEMKTGALLKRFVPYYKKYVRIMVFDLLCASLTTVCELVLPLILRYITNQGLKDLAS LTVQTIVGIGVLYFALRIVDGLASFYMAYTGHVMGAAIETDMRQDAFEHLQKLSDNYFNN TKVGQIMSRITSDLFDVTEFAHHCPEEFFIAFLKTAVSFVILAGINLPLTVIIFVFIPVM AVSCSYFNIQVKRAFKKQRNHIGELNARIEDSLLGNKVVRAFANEGVEIEKFNRDNQEFL NIKRQTYKYMAAFQNTIRMFDGLMYVVVIVAGGIFMIKGLIDPGDLVAYTMYVTTLLATI RRIIEFAEQFQRGMTGIERFTELMDASVDIFDEEEAVPLRDVHGSITFEHVSFEYPDDHN PVLSGIDLKIRPGEKVALVGPSGGGKTTLCNLIPRFYDPTEGRILLDGQDIRNVTLQSLR GSVGVVQQDVYLFSGTVYENIEYGHPGATREEVLEAAKMAGAHEFI >gi|229784088|gb|GG667647.1| GENE 2 1430 - 2680 1017 416 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621556|ref|ZP_06114491.1| ## NR: gi|266621556|ref|ZP_06114491.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical 
protein [Clostridium hathewayi DSM 13479] # 1 416 1 416 416 884 100.0 0 MSIRIRFALRPKNRLTGSREWTPEAAETMIKELAGEQGYTCESLSGCLLCRFCPEGYIWF HWDKRILRGESQTNIVGPGFHVAVIEFLEQLAGRGKLKLKVDDKTGYYWKRDFLTMRRKY FYQWFMDLMKLVSDWENGPEYTFCWPSSYYIPEKQEGKLITHIRPFSFTEIRGVVNSGLS MAFARDFFIWSEIEKDAVYYRNCALVLLNQSCFFMPSSRSEADRKVNDGIIYCLEKALSM DSGLQFPVKEYLEICRLAEHEPVDVSHVDPLSGELFVGCRRQLIYRKLGNMKFCLPGSFL FDENCRNGMDHYYDGVGYGGHDLYIYAVVLGTGGTSRFKKAWFEQGTPEQEFLFDADKGK GKAVFYAPEEKDGEVLYNMSAQVLYKEQRTNINITCRKPGETEWALDLVKRITISE >gi|229784088|gb|GG667647.1| GENE 3 2833 - 3327 394 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870545|ref|ZP_06114492.2| ## NR: gi|288870545|ref|ZP_06114492.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 162 46 207 207 315 100.0 8e-85 MPYCPKCDMEFIKGVTVCSDCGGSLFESKEEALAVQNASQPGEPDALPPEYPKEDASTEE SGELPRQEKPARPPAAYVKKSQKYEDRKSSASAFLLVGGAITIFALLCWAGIFDMPMTGF SKYLIQGTLTVMGIGCLIIAVMTFRSAKAMAGDIEAEEKQTEAS >gi|229784088|gb|GG667647.1| GENE 4 4568 - 5125 705 185 aa, chain - ## HITS:1 COG:SA2477 KEGG:ns NR:ns ## COG: SA2477 COG4720 # Protein_GI_number: 15928271 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Staphylococcus aureus N315 # 6 185 5 184 184 172 51.0 2e-43 MKKLFEFKTKTIVATGLGAALFTLLFMYVKIPTGIPETDIQTAYGIGAFFAALFGPIAGG LIAFIGHALSDSIQYGSAWWSWVVASGLSCFIIGLVYPKLHVEDGEFGKKDIILFNVYQI IGNAIAWIIVAPILDILIYAEPANLVFTQGVVAAISNAISSGVIGTILLVLYSRTRSKKG SLTKE >gi|229784088|gb|GG667647.1| GENE 5 5190 - 6032 1013 280 aa, chain - ## HITS:1 COG:lin2732 KEGG:ns NR:ns ## COG: lin2732 COG1912 # Protein_GI_number: 16801793 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 2 280 3 279 280 280 48.0 3e-75 MKPILVFQTDFTYKEGAVSSMYGVVKCVDRELEIMDGTHELPQFDTWSASYRLYQSLQFW PEGTVYVSVVDPGVGTRRRACVAKTVDGYYIVTPDNGSLTHVKKMIGIEAVREIDETVNR LRGKGTEGVAIFHGRDLFGYTAARLASGIIDFEGVGPEYPVEEIVEHPILEPEITEGKAC 
GVFEINDPNFGNLWTNIRLVDFNRAGFAYGDYVNVTVKHEGEVAFQQKVLFHQSFGFAEK GDPMIYNNELMKVSLAVSQGSFCEKYGLGYGPEWTVEFTK >gi|229784088|gb|GG667647.1| GENE 6 6060 - 6896 935 278 aa, chain - ## HITS:1 COG:SP0484 KEGG:ns NR:ns ## COG: SP0484 COG0619 # Protein_GI_number: 15900399 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Streptococcus pneumoniae TIGR4 # 1 277 1 275 276 218 41.0 1e-56 MKSKLFDYIDRDNFIFNLSGLTKLVCFIGLTFAVMFSYDIRYILFVMVLSVIFFAMSGLQ FKQIRLMTIYVVIFLSVNFVLTYIFSPQYGTEIYGTSHELFRINSRYIVTEEQLLYQVTK VTKYASVVPLGMIFLLTTNPSEFAASINRVGCNYKAAYALSLTLRYFPDTIRNYQDIALA QQSRGLDLSKKEKLGTRVKNVMNICVPLIFSTLDRIELISNAMDLRGFGKKKKRTWYVAK PLKTGDWCAMAVALVIFLGSLCMTFFVNGGSRFYNPFI >gi|229784088|gb|GG667647.1| GENE 7 6893 - 8608 295 571 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 321 555 147 384 398 118 32 7e-26 MDTNKKPMIQFNSFQFHYFSQAEPTLHDINLTIYEGEKVLIVGPSGSGKSTLGNCLNGLI PFAYKGEIEGSLKINGRETREMNIFELSKMVGTVLQDADGQFIGLTVGEDIAFALENDCV DQGTMKETVQRVADMVDMGKLLKSSPFELSGGQKQRVSFAGVMVDDVDILLFDEPLANLD PATGKTAIDLIDHVWKELGKTVVIIEHRLEDVLYRDVDRIVVVNDGRIAADMTPDELIAA DILPGMGIREPLYATALKYAGVSVTPEMRPGRMETLRIEMVKEAVQKWALSRVPDREVND RPDILTAEHVSFQYTKKRRILKDLSFHIREGEMVSIVGTNGAGKSTLAKVICGFVTEDEG RLLYEGEDLKGKTIKERSRMIGFVMQNPNQMICKPMIYDEVALGLRIRGVDEAEIDKRVE KALKICGLAPFKKWPISALSFGQKKRVTIASALVLDPKILILDEPTAGQDFHHYTEIMEF LKSLNSEGVTIIMITHDMHLMLEYTPHAIVIADGEKIGDATSAEILTNESITERANLKIT SLYELAVECGVEDPEGFVQNFIDYERGLRHV >gi|229784088|gb|GG667647.1| GENE 8 8630 - 10150 1612 506 aa, chain - ## HITS:1 COG:SPy1202 KEGG:ns NR:ns ## COG: SPy1202 COG2188 # Protein_GI_number: 15675167 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 273 499 11 235 241 94 27.0 6e-19 MVKRCTGEDESRLLDFLEKDAVYHTFLLADIRNYGFDQEFQTVYAGLEEERITHVYLKFY 
GNLIVAGDAKGIDGEFLKTLTEDWKPDVVMGKAELTKAAGRWLPGYELAEKNLYLLEEGT HLIDENGLPEGVTMKKGVPGDEDKIHGFLMTIPEIRALYTSKEMIADRLKSGDGTHLYLE RNGEIIAHVNSAAQSPFTVMLGGAAVKETERGKNYEAMLVSRISRDILAEGKKPCLFCDR ESEHNLFVHIGFIKAGAWGTLTPRKAEEGKSRLPSYIPVYNRLFQDLMDGVYEKDSLLPS ENVLAAAYKVSRNTLRQALTILAQDGFIYKRQGKGTFVSYDCRRREKKNLYNFLRECAKE PVSRVTMDYNIGLPTNIAKQKLQLKNGEEVLASNNVYWGEEGPIGQSFLQIPMELITGSG IDGKEPDEEQLLHFMDSMIYQKAVKAHMTIQVVAADEQVISYMNIPEGTVLLHMEQILYG PDFSPAARIKYYFIPDKYQVDCQLQA >gi|229784088|gb|GG667647.1| GENE 9 10342 - 11412 881 356 aa, chain + ## HITS:1 COG:NMB1395 KEGG:ns NR:ns ## COG: NMB1395 COG1063 # Protein_GI_number: 15677256 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Neisseria meningitidis MC58 # 11 354 1 342 346 255 38.0 9e-68 MSYNHHTKKTMKAAVYEGNGILRFEERPVPHIISPKDAIVRVTLTTICSSDIHILHGAVP RAVPGTILGHEFVGVVEEAGSEVTRVKPGDRVAVNVETFCGECFYCRRGYVNNCTDEHGG WALGCRIDGGQAEYVRIPFADNGLTPIPDSVTDESALFTGDILSTGYWGAGLADIKEGDS VAVIGAGPTGLCTMMCARLYHPGKLIAIDTSEERLDLARSRGLADYTLNPSSGPILERVL ELTSGRGADSVLEVAGGENTFRTAWQIARPNAVVCVVALYEEPQILPLPDMYGKNLIFKT GGVDANSCGEIMKLIDEGRLDTSCLITHRTDLAHIMEAYEVFEKQADHVIKYAVTV >gi|229784088|gb|GG667647.1| GENE 10 11511 - 12371 980 286 aa, chain - ## HITS:1 COG:lin2732 KEGG:ns NR:ns ## COG: lin2732 COG1912 # Protein_GI_number: 16801793 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 2 278 3 279 280 338 56.0 8e-93 MRKPLIIQTDFGTADGAVCAMYGVSYGVDPELRITDLTHEIPQYDIWEASYRLIQTVNYW PEGSVFVSVVDPGVGSTRRSIVAKTTTGQYIVTPDNGTLTHVKRMCGITEARVIDENVNR LPGSGESYTFHGRDIYAYTGARLAAGVITFEGAGPSVDVESVIELPVVEAAYNGDRVSGT IDVLDVRFGSLWTNISRELFLKLGVKHGQRVEISIENETRTLYKNILVYAKSFADVYVGE PLIYVNSLDCMAVAINQGSFAKAYNIGTGNKWQITMRKAPKIIYED >gi|229784088|gb|GG667647.1| GENE 11 12468 - 12980 817 170 aa, chain - ## HITS:1 COG:CAC0845 KEGG:ns NR:ns ## COG: CAC0845 COG1528 # Protein_GI_number: 15894132 # Func_class: P Inorganic ion 
transport and metabolism # Function: Ferritin-like protein # Organism: Clostridium acetobutylicum # 13 170 13 170 170 191 59.0 6e-49 MLDKKVAELINVQVNKEFYSAYLYLDFANYYREAELNGFANWYQVQAQEERDHAMLFMQY LQNNSGKVTLEAIDKPDKQFEDFGGPLKAGLEHEIYVTGLIHAIYDAAYSVKDFRTMQFL DWFVKEQGEEEKNADNLVKRFELFGHDPKGLYMLDSEMAARVYNAPSLVL >gi|229784088|gb|GG667647.1| GENE 12 13172 - 14710 1703 512 aa, chain - ## HITS:1 COG:CAC0712 KEGG:ns NR:ns ## COG: CAC0712 COG0696 # Protein_GI_number: 15894000 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Clostridium acetobutylicum # 1 510 1 509 510 611 59.0 1e-175 MSKKPVVLMILDGYGLNDNCDHNAVCEGRTPVMDQLMSQCPFVKGNASGLAVGLPDGQMG NSEVGHLNMGAGRIVYQELTRITKSIQDGDFFDVPEFLQAVENCKKNHSALHLWGLVSDG GVHSHNTHIYGLLELAKRNGLDKVYVHCFLDGRDTPPASGKGFVEELEAKMKEIGVGKVA SVMGRYYAMDRDNRWDRVERAYNALTKGEGKTAVSAADGIQASYDAEVNDEFVEPFVVVE DGKPVAVVNDHDSVIFFNFRPDRAREITRAFCDDEFKGFAREKRLDLTYVCFTDYDDTIA NKLVAFKKESIVNTFGQYLADHNMTQARIAETEKYAHVTFFFNGGVEEPNKGEDRILVPS PKVATYDLQPEMSAPAVCDKLVEAIKSGKYDVIIINFANPDMVGHTGIEDAAIKAIETVD ACVGRTVDAVKETDGILFICADHGNAEQLVDYETGTPFTAHTTNPVPFILVNADPSFKLR EGGCLADIAPTLIELMGMEQPKEMTGKSLLVK >gi|229784088|gb|GG667647.1| GENE 13 14835 - 15323 403 162 aa, chain - ## HITS:1 COG:no KEGG:Closa_3846 NR:ns ## KEGG: Closa_3846 # Name: not_defined # Def: SARP family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 10 134 257 381 399 73 31.0 3e-12 MDTKEPELMQKSIEIDMERIRKELCEQELLPGAYCQEYEIFKVIYRFMARRLRRTKEDCY IILLTLTDRKGELPCLKNRGGQMELLREMIQYSIRCGDVYTKYSSCQFLIMVTDLSEPEA DMVANRITDAFYQKLGGDADQVLLHHCYPLKAAETTAAGSEG >gi|229784088|gb|GG667647.1| GENE 14 15525 - 16277 888 250 aa, chain - ## HITS:1 COG:BS_tpi KEGG:ns NR:ns ## COG: BS_tpi COG0149 # Protein_GI_number: 16080445 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Bacillus subtilis # 3 240 2 240 253 248 54.0 1e-65 MARKKIIAGNWKMNMTPTEAVELVNTLKPLVAGGDADVVFCVPAIDIIPVVEAAKGTNIL 
VGAENMYYEDKGAYTGEISPAMLVDAGVKYVVIGHSERREYFAESDETVNKKVLKAFEYG ITPIICCGESLTQREQGITIDWIRQQIKIAFLNVPADKAAAAVIAYEPIWAIGTGKVATT EQAQEVCGAIRACIAEIYDEATAEAIRIQYGGSVSAGSAPELFAQPDIDGGLVGGASLKP DFGKIVCYNK >gi|229784088|gb|GG667647.1| GENE 15 16346 - 17560 1517 404 aa, chain - ## HITS:1 COG:CAC0710 KEGG:ns NR:ns ## COG: CAC0710 COG0126 # Protein_GI_number: 15893998 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Clostridium acetobutylicum # 2 403 3 397 397 493 67.0 1e-139 MLNKKTVDDLTGLQGKKVLVRCDFNVPLKDGVIQNYNRIDGAIPTIKKLLDQGAKVILCS HLGKPKGEPLPEMSLAPVAPALSERLGVEVKFVDDPKVTGPETQKVAAELKDGEVLLLQN TRYRIEETKYGKDENAEGYAKELADLCDGIFVNDAFGTAHRAHCSNVGVTKYTTTNVVGY LMEKEIKFLGQAVENPVRPFVAILGGAKVSDKINVINNLLDKVDTLIIGGGMAYTFEKAQ GEEIGNSLCEEDKLDYALEMIKKAEEKGVKLLLPVDHVEGREFSNDTERKVVDVIDAGWS GFDIGPKTIALYKAALEGAKTVVWNGPMGVFEFSNFAEGTLEVCRAVADLQDATTVIGGG DSVNAVKRLGFADKMTHISTGGGASLEFLEGKELPGVAAADNRL >gi|229784088|gb|GG667647.1| GENE 16 17735 - 18766 939 343 aa, chain - ## HITS:1 COG:NMA0246 KEGG:ns NR:ns ## COG: NMA0246 COG0057 # Protein_GI_number: 15793264 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Neisseria meningitidis Z2491 # 1 340 1 331 334 425 65.0 1e-119 MAVKVAINGFGRIGRLAFRQMFGAEGYDVVAINDLTDPKMLAHLLKYDTAQGGYTGRIGE GLHTVEAGEDSITVDGKTIKIYAEKDAANLPWGEIGVDVVLECTGFYCSKDKSQAHINAG AKKVVISAPAGSDLPTIVYSVNEKTLTADDTIISAASCTTNCLAPMAKALNDYAPIQSGI MSTIHAYTGDQMILDGPHRKGDLRRARAGAANIVPNSTGAAKAIGLVIPELNGKLIGSAQ RVPVPTGSTTILVAVVKGKDVTKEGINAAMKAAASESFGYEEDPIVSSDVIGMRYGSLFD ATQTMVSSLGDDLYQVQVVSWYDNENSYTSQMVRTIKYFAELG >gi|229784088|gb|GG667647.1| GENE 17 18960 - 20627 1579 555 aa, chain - ## HITS:1 COG:SP0496 KEGG:ns NR:ns ## COG: SP0496 COG1283 # Protein_GI_number: 15900410 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Streptococcus pneumoniae TIGR4 # 1 548 1 535 543 332 34.0 1e-90 
MSVNDISSLFSFIGGLGMFLYGMNIMADGMQKTAGSKMSQFLGMLTNNRLMAVLLGALIT AIIQSSGATTVMVVGFVSAGVLNLTQAVGVIMGANIGTTITAWIVSMNQLGDAFAVFQPA FFAPLLIGIGAIFMLFGKKQKMKTAGEILVGLGLLFIGLDFMSSSISPYTDAPVFSEAFR LLGSNPILGMLIGALVTALLQSSSASVGILQTLAMNGVVTTNAAIFITLGQNIGSCVTAM ISSIGGSRTAKRAAVIHLTFNMMGAVIFGVISFVLFSLYPLLAAHNITSVQISIFHTIFN LTNTALLFPFANQLVKLSGIFVPEDKKEPVATDEESETMKHLDERIFESPAFALETAAME VVHMGQITMENVRRAMDAVLTKNADEVEDVYKTEQTINNMEKMLTEYLVKVNNLSLTERQ KLIVNDLFYSINDIERVGDHAENLAEQAEYMVQHNISFSETGESDLHIICETAFKSFQHS IEARRKGDMDDVRKVSQYEDEVDTLEEELREKHIERLSAGKCDPSAGVVFLDLISNLERI SDHAYNLAGYVKDEM >gi|229784088|gb|GG667647.1| GENE 18 20881 - 21678 1005 265 aa, chain + ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 4 259 5 259 269 150 33.0 3e-36 MLPDYHLHTDFSGDSTTPPRAQIERAIQLGMDSLCITDHHDYDVDSIIDFTLDLDPYMSS LASLQEEYRDRIDVRIGIELGLQVHLKDYFKELTRRYPFDFVIGSTHFIDRKDPAYPEFF KERGEASAYLQYFETTLENVSHLDDYDVAGHIDYIVRYGPNKAAFYSYEAYRDILDRILK AVIEHGKGIECNTAGLRKGISQPNPAADVLKRYRELGGEILTIGSDAHIPEDLGADFEQT RELLKECGFSYYTVFRKRKPVYLPL >gi|229784088|gb|GG667647.1| GENE 19 21817 - 22083 484 88 aa, chain - ## HITS:1 COG:no KEGG:Cbei_4689 NR:ns ## KEGG: Cbei_4689 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 87 1 87 88 130 77.0 2e-29 MKPLSYAIIKHFTKVPEACAEDVIDALKGEYGKFKGLTLKAVIETLMTDEANGLLEESRF ELDEAGNLRIYYRANEEQRATINRYIKD >gi|229784088|gb|GG667647.1| GENE 20 22088 - 23167 1233 359 aa, chain - ## HITS:1 COG:FN1236 KEGG:ns NR:ns ## COG: FN1236 COG0697 # Protein_GI_number: 19704571 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 21 350 5 313 322 84 27.0 
4e-16 MNEGAGAAAIAAKKKLSAQFFKKGVFVAILSGICYGLYSAFMTLGMTKGVWADWYGPNTA ALSAFVITYLLGSLGSAINDTCSAVWALLIAGIKGKLGDFFRCLKTKPGWIMIAAALVGG PISSAAYVVGLQMAGSIVIPISALCPAIGAILGRILYKQELNKRMVAGIVICVLSSFMIG STSLGGDAPEGRLLGICIAFIAALGWGIEGCVAGYGTTLIDYEIGITIRQCTSGLSNLCI LVPIFGIIAGNIGIAPHLTAQAFTSGPAMVFFVISGFFALFAYSLWYKGNSMCGAALGMA CNGAFSFWGPFFCWIILGVVCGIDGWSLPPIVWAAAILMIFGILMIAMNPLDLFRKKED >gi|229784088|gb|GG667647.1| GENE 21 23201 - 23488 435 95 aa, chain - ## HITS:1 COG:FN0083 KEGG:ns NR:ns ## COG: FN0083 COG4577 # Protein_GI_number: 19703435 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Fusobacterium nucleatum # 4 91 5 92 94 107 86.0 5e-24 MKYDALGMIETKGLVGSIEAADAMVKAANVTLIGKEFVGGGLVTVMVRGDVGAVKAATDA GAAAAQRVGELVSVHVIPRPHAEVETILPAAAREE >gi|229784088|gb|GG667647.1| GENE 22 23518 - 24201 765 227 aa, chain - ## HITS:1 COG:TM0375 KEGG:ns NR:ns ## COG: TM0375 COG4869 # Protein_GI_number: 15643143 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Propanediol utilization protein # Organism: Thermotoga maritima # 37 210 25 199 210 205 59.0 5e-53 MDQYEAMIKLLLEAVKNEGGTASGDASAKPRANTVPVGISNRHVHLSRADLDTLFGCGYE LTKAKDLSQPGQFACKETVTVCGPKGAIEKVRILGPVRSRTQVEILAGDCFKLGTKAEPR LSGDLDGTPGITLVGPKGSVQTKEGLMVAQRHIHMHPDDAAEFGVHDGQSVSIEIDGIRG GIYRNVVIRVTPDSALECHVDTEEANGMGLGGAATAALIIESSKTDI >gi|229784088|gb|GG667647.1| GENE 23 25275 - 26687 408 471 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P [Lactobacillus reuteri DSM 20016] # 23 433 47 463 477 161 27 5e-39 MENFDFDLRSIQEARDLARKGEIAAKKMATYSQEQIEIILQNMVRAAEEHAVCLGKMAAE ETGFGKAEDKAYKNHAASTLLYEEIKDMKTVGIIDEDPVKHTMDVAAPVGLVMGIVPSTN PTSTTIFKSIIAVKSGNAIVFSPHPSAAKCTLKAAELMRDAAVEAGAPEDIIGCITMPTM GATNELMKCKEVNLIIATGGPGMVKAAYSAGKPALGVGAGNSPAYIERTADIKQAVSNIM ASKTFDYGTICASEQSIVVEECNKDAVVAELKSQGAYFMTPEETEKVCALLFKSGSHNMN 
AKFVGRSPQVIAEAAKITIPAGTRVLIGEQGGVGEGYPLSYEKLTTVLAFYTVKDWQEAC ELSIRLLQNGIGHTMSIHTEDREMVRRFAAKPASRILVNTGGSMGGTGISTGLATSFTLG CGTCGGSSVSENVSPKHLINIKKVAFGVKNCATLVQEDKTFRHPELSPQSS >gi|229784088|gb|GG667647.1| GENE 24 26802 - 27383 711 193 aa, chain - ## HITS:1 COG:sll1029 KEGG:ns NR:ns ## COG: sll1029 COG4577 # Protein_GI_number: 16329368 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Synechocystis # 1 84 3 86 103 75 50.0 7e-14 MALGLIETRGTLAAVEAADAMVKAADVTLIEKTRVGGGLVAIAVTGDVAAVQAAVDAGAA AVERITASSLAARLVIPRPHEELDILFTPGKPGGKPGADVEVREPEPAEIPENAVELEVS ALPEKLDRQTLDSIVEQYGAAEGLAMADRLKVTALRTLAREYGELKISGREVSRANKQLL LEELKNYYENRND >gi|229784088|gb|GG667647.1| GENE 25 27397 - 28032 697 211 aa, chain - ## HITS:1 COG:eutQ KEGG:ns NR:ns ## COG: eutQ COG4766 # Protein_GI_number: 16130385 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli K12 # 1 210 1 233 233 112 32.0 4e-25 MKKLVCARDIEECGKQGKTEYCVDSDTIITPSAKDAAEAYGIRFTDSISQPEPACAGVFA GLDKMDAAGLCQLLKTILEQGGLQTPSKPYEEEHHGNGLKVVRGRTVKMDVFDTGNPDAT AYFQELVSKDESHISAGFLIIENSKFDWELTYEEIDYVIEGTLTITIDGKTYTAHAGDVL FVPANSKVVWGSPDRARVFYATYPANWADLL >gi|229784088|gb|GG667647.1| GENE 26 28087 - 28350 369 87 aa, chain - ## HITS:1 COG:FN0087 KEGG:ns NR:ns ## COG: FN0087 COG4576 # Protein_GI_number: 19703439 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Fusobacterium nucleatum # 1 81 1 77 82 71 48.0 5e-13 MVTAKLIGNVWATRKAESLSGFKFMMAEIISGTRTGERMIVVDTISAGIGDRVIITTGSS ARRMLGDDGIPVDAVVVGIIDEDCEFI >gi|229784088|gb|GG667647.1| GENE 27 28344 - 28649 336 101 aa, chain - ## HITS:1 COG:no KEGG:Closa_4011 NR:ns ## KEGG: Closa_4011 # Name: not_defined # Def: microcompartments protein # 
Organism: C.saccharolyticum # Pathway: not_defined # 1 101 1 101 101 151 81.0 9e-36 MEIRIIKSPAQGTLDILLRRKGSGASIPIESVDAVGLVQGRMIEMVCAADVAEKAVGVTV EDIRGNCPQNMVLLAIFGDTASVEAALGEIKRKEKEGTPEW >gi|229784088|gb|GG667647.1| GENE 28 28666 - 29511 1008 281 aa, chain - ## HITS:1 COG:FN1783 KEGG:ns NR:ns ## COG: FN1783 COG4820 # Protein_GI_number: 19705088 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin # Organism: Fusobacterium nucleatum # 6 272 1 268 274 251 46.0 1e-66 MSKEGITFEECDSLVKEFERVVKSPVTNGSSVYYTGVDLGTACVVVAVLDENFRPAAGAY RYADVVRDGMVVDYIGAIRIVREMKEELEEKLGCELLYAAAAIPPGTDELDSGAIKNVVQ GAGFEITALLDEPTAANAVLKIKNGAVVDIGGGTTGISILKDGKVVYVADEATGGTHFSL VISGAYGLPFGEAERYKRDPSHHKELLPVLKPVVEKVSSIINRHLEGHQVEGIYLVGGTC CLAGIEDIIEKRTGIPTYKPDNPMFVTPLGIARSCTREACE >gi|229784088|gb|GG667647.1| GENE 29 29493 - 30173 406 226 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1422 NR:ns ## KEGG: Cphy_1422 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 225 1 210 211 164 41.0 3e-39 MKYITEADLRDLYRTTPFTTYETEPGTRLTPGARQFLGDRGIRVPDESISSKRYGDNGTS ETASGGKAADEKTSDETALNEKADGKKKLQNRLKSARALFLETGLQFKEHDVLLAQRIFV MERCLAGLSEEAETCLLPCQACSGIGPDNFSEEMEDCFEITAFHVQSDRGLDLIRLHRLR CLLRELGPYVEQELGDGKKRELHQVINGLSQLICQLFGGNTCQKKE >gi|229784088|gb|GG667647.1| GENE 30 31220 - 32254 1224 344 aa, chain - ## HITS:1 COG:lin1135 KEGG:ns NR:ns ## COG: lin1135 COG1454 # Protein_GI_number: 16800204 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Listeria innocua # 1 343 1 346 379 325 48.0 7e-89 MKQFIMKTKVYLGEASLETLRDMGIKKAFVICDPFMKSSHKVDMVTELLEEAGAEFTVFA EVVPDPTIAVVSKAIEQMLVCRPDAVIALGGGSAIDTAKAAGLLYSGMSQTERPELIAIP TTSGTGSEVTSFAVISDPEKQAKYPLVDDSMVPDTALLDPRFTVSVPPHITADTGMDVLT HALEAYVSTDAGDFTDACAEKAVRLVWRYLKTAVEDGSCMEAREKMHNASCLAGVAFNGA SLGICHSLAHALGGRFHIPHGRSNAMLLCHVIAYNAGLDTPGETAALKRYVEVANMLGIS AGTDKATVHGLIRQIRNLMNQIHIPQYITECGVDREEFLEAVAS 
>gi|229784088|gb|GG667647.1| GENE 31 32283 - 32702 429 139 aa, chain - ## HITS:1 COG:lin1109 KEGG:ns NR:ns ## COG: lin1109 COG4917 # Protein_GI_number: 16800178 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 1 119 5 125 143 77 35.0 1e-14 MVIGAGGSGKTTLAHALNGDHSPVKRTPCLVYGARTIDAPAAYLESPWMVQHLIAEAQNA SYVLMLVDQTKTREVYSPGFALAFRAPVIGVITKAGKSREQLERCEKELLKAGVQPPFYR VVFPSGEGLDELERRLNSY >gi|229784088|gb|GG667647.1| GENE 32 32717 - 33067 424 116 aa, chain - ## HITS:1 COG:lin1108 KEGG:ns NR:ns ## COG: lin1108 COG4810 # Protein_GI_number: 16800177 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 10 116 9 116 116 113 54.0 9e-26 MGKELTGVSRVIEESVPGKQVTIAHVIASPVPEVYECLGIDAAGAIGILTLSPFETAVIA ADIAAKAADVEVGFLDRFTGSVVIAGDVQSVETALASVCRTLKTVLGFAVTPVTRT >gi|229784088|gb|GG667647.1| GENE 33 33081 - 34034 1048 317 aa, chain - ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 15 294 8 280 302 224 43.0 1e-58 MNQTHAGSIERKALIFNIQKYNMYDGPGIRTIVFFKGCPLRCKWCSNPEGMDRSFQVMFK KRSCVDCGACAAACPAGIHVISPETGKHEILRDRDCTGCRKCMEACPREALTISGEVRTI SELLKIISEDEAFYDQSGGGVTLGGGEVTSQPEAALNLLMACKQEGINTAIETCGFAKTE TILKIAEYVDLFLFDIKHMDPERHNELAGVNNELILENLKELLHRRYNVKVRMPMLKGIN DSREEIDQVIRFLMPFRDYKNFKGIDLLPYHKMGVNKYNQLDRPYPIDGDPSLSDGDLSR IEGWLKEYQFPVTVVRH >gi|229784088|gb|GG667647.1| GENE 34 34063 - 36603 2743 846 aa, chain - ## HITS:1 COG:SP0251 KEGG:ns NR:ns ## COG: SP0251 COG1882 # Protein_GI_number: 15900186 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Streptococcus pneumoniae TIGR4 # 58 841 19 808 812 445 34.0 1e-124 MDIREFTDKFTDATRNMSPEQRAALVKMLESVSGDLPKSAPEKSGVYSGEGTAVPDGITE 
RLQLLKENYLKQVPSITTYRARAITKIAKENPGMPKILLRAKCFRYCCETAPLVIQDHEL IVGAPCGAPRAGAFSPDIAWRWMEDEIDTIGTRAQDPFYISEEDKKIMREELFPFWKGKS VDEYCEDQYREAGVWELSGESFVSDCSYHALNGGGDSNPGYDVILMKKGMLDIQREAKEH LEQLDYENPDDIEKIYFYKSIIDTTEGVMIYAKRLSEYAAELAAKEADPKRKAELLKISE VNARVPAHKPETFWEAIQAVWTIESLLVVEENQTGMSIGRVDQYMYPFYKGDIESGRLND FQAFELAGCMLIKMSEMMWITSEGGSKFFAGYQPFVNMCVGGVTRDGRDATNDLTYLLMD AVRHVKIYQPSLACRIHNKSPKKYMKKIVDVIRSGMGFPACHFDDAHIKMMLAKGVSIED ARDYCLMGCVEPQKSGRLYQWTSTAYTQWPVCIELVLNHGVPLWYGKQVCPDMGDLSRFK TFEEFDAAVKEEIKYVTKWTDVATVISQRVHKELAPKPLMSIMYEGCMEKGKDVSSGGAM YNYGPGVVWSGLATYTDSMAAIKKLVFDDKKYTLEQLNEALKADFVGYDQIKADCLNAPK YGNDDDYADLIAADLVDFTEAEHRKYKTLYSHLSHGTLSISNNTPFGQMTGASANGRGAW LPLSDGISPTQGADYKGPTAIIKSVSKMANDSMNIGLVHNFKIMAGLLDTPEGEESLITL LRTASIFGNGEMQFNYLDNNTLLEAQKHPEKYRDLIVRVAGYSAFFVELCKDVQDEIISR TMLTHL >gi|229784088|gb|GG667647.1| GENE 35 36627 - 38093 301 488 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P [Lactobacillus reuteri DSM 20016] # 102 413 128 434 477 120 26 1e-26 MRFVDKDLLSVQEARILMENAGEARKVLAGFEQERLDRIVVHMAQAAEKHARELAVMSCE ETEYGNWQDKYRKNRYICKYLPAELKNARCVGIISEDREKKVMKVGVPLGIIAALTPAEN PVSTAVYQALIAVKSGNAIVFAPHERAVKTTARALDILMEAAAEAGLPDGALSYLGTVTR QGTEALIGHPAVSLVISTGVPEMTEAVKAAGKPFIYGGTVGSPAFIERTADIPKAVKAIV ASRSFDCGIVPAAEQYIVADSCIAGEVKREFQRNGAYFMSGEEEKRLIGLLCPDNGSQDP ELVGKPARELARRAGFSVPDAVKVLVSEQKYISDRNPYARALLCPVLVYYIENDWIHACE KCIELLANESHGHTLVIHSKDEDVIRQFALKKPVARMLVNTPAVFGSMGVTTNLFPAMTL GSITAGVGITSDSVSPMNLIYIRNVGYGVREIEDRMAGGVLKNEAENGLDDDLERAMKKL LEWMAEEN >gi|229784088|gb|GG667647.1| GENE 36 38112 - 38420 464 102 aa, chain - ## HITS:1 COG:lin1123 KEGG:ns NR:ns ## COG: lin1123 COG4577 # Protein_GI_number: 16800192 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Listeria innocua # 6 92 4 89 91 79 56.0 2e-15 MERYEAIGSIETFGLVFVLEAADAMCKAADVELIGYENVASGYISVLVRGDVGACKTAVE 
AGVKAVEAMGAEVYSSVVIPRPHPDLEKIITRYSLKNLLPAE >gi|229784088|gb|GG667647.1| GENE 37 38453 - 38749 456 98 aa, chain - ## HITS:1 COG:lin1115 KEGG:ns NR:ns ## COG: lin1115 COG4577 # Protein_GI_number: 16800184 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Listeria innocua # 6 93 2 89 95 70 53.0 6e-13 MRYYGEEALGLVETIGLTPALEAADKMLKAADVELISYENVGSTLVTIMVKGDVAAVRSS VEAGAAAAAAIGKLTAKNVMPRPIQGVGDIVSVHDVDA >gi|229784088|gb|GG667647.1| GENE 38 38765 - 39052 419 95 aa, chain - ## HITS:1 COG:FN0083 KEGG:ns NR:ns ## COG: FN0083 COG4577 # Protein_GI_number: 19703435 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Fusobacterium nucleatum # 4 89 5 90 94 109 88.0 1e-24 MKYDALGMIETKGLVGSIEAADAMVKAANVTLIGKEHVGGGLVTVMVRGDVGAVKAATDA GAAAAQRVGELVSVHVIPRPHAEVECILPVCRKPE >gi|229784088|gb|GG667647.1| GENE 39 39463 - 40320 869 285 aa, chain - ## HITS:1 COG:BS_bmrR_1 KEGG:ns NR:ns ## COG: BS_bmrR_1 COG0789 # Protein_GI_number: 16079458 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 11 98 6 96 120 66 40.0 5e-11 MGVKQSTKKLYSIGEVSRICNVSKKALRFYDQIGVISPDMVSEENNYRYYSEETLLLVPV VKYYKQMGFSLEQMQNLVEGSTYQCLERSFLSKSEELQNERQELLYRYTAVKDWYELVRE AELVSQRKEHDVTIKYVNSTVYCSKEQDFSYNYMESIINIPWVNYLESIHNEITGPVILN FPSFREKRDGTCKRARIMQMPLRPLVKEAELKTLGGTMMASVYHVGSLDTISREYEKIEE WALKKGYLCGEESFERYVVDYWTTRVASEFVTEVLVPVTRMEECR >gi|229784088|gb|GG667647.1| GENE 40 40259 - 40372 56 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQIRLTSPIEYNFFVLCFTPIANHTASSDQILIYPHV >gi|229784088|gb|GG667647.1| GENE 41 40466 - 41833 1697 455 aa, chain - ## HITS:1 COG:BH3186 KEGG:ns NR:ns ## COG: BH3186 COG0165 # Protein_GI_number: 15615748 # Func_class: E Amino acid transport and 
metabolism # Function: Argininosuccinate lyase # Organism: Bacillus halodurans # 2 448 3 449 458 505 55.0 1e-143 MKLWGGRFTKETNQLVHNFNASLSFDQKFYHQDIEGSIAHVKMLAKQGILTTEDRDTIIE GLEGIRRDLESGALVFTAEHEDIHSFVEAVLTERIGDAGKRLHTGRSRNDQVALDMKLYT RDEIDELDGLVKALLEELLKLMEENLDTYMPGFTHLQKAQPITLAHHMGAYFEMFYRDRT RLSDIRKRMNYCPLGSGALAGTTYPLDREYTAELLGFEGPTLNSMDSVADRDYIIELLSA LSTISMHLSRFCEEIIIWNTNEYQFVEIDDSYSTGSSIMPQKKNPDIAELIRGKTGRVYG ALVSILTTMKGLPLAYNKDMQEDKEMTFDAIDTVKGCLALFTGMISTMTFKKDAMEASAK NGFTNATDAADYLVNHGVAFRDAHGIVGQLVLYCIGKGIALDEMSLEEFQAISPVFEEDI YDAISMETCVKKRMTIGAPGQEAMKRVIEACRERL >gi|229784088|gb|GG667647.1| GENE 42 41847 - 42830 941 327 aa, chain - ## HITS:1 COG:SP2231 KEGG:ns NR:ns ## COG: SP2231 COG4485 # Protein_GI_number: 15902035 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 2 317 506 846 850 77 25.0 3e-14 GLNFPSVSLFSSTANADLSKFFKKLGCESSTNAYSITGSTPLVDSLFSVKYALYSETPAD TGLLTPLEVEDHTYLYRNEYTLPLGVMVPYDLEDNWQLDITNPADVQNDLAVVLGADPVL EEVPSEILGTSFTFTPEVSGDYYVYVSNKKVEKVSALMGENTKSFDNVNRGYMLELGWIT AGEEVTLRNDDNEQDLVAVAYRFIPEGLESVYHVLNRNSMELTKKTDTEITGRIDTEKAG LLYLSIPYDKGWSIYVDGEETEPYKLFDTFLSVRLNAGTHIVELRYMPQGLKTGGMVTAG SAAVLLLIAGITAGRRKKPVMRRRIKC Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:51:38 2011 Seq name: gi|229784087|gb|GG667648.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld41, whole genome shotgun sequence Length of sequence - 44293 bp Number of predicted genes - 43, with homology - 42 Number of transcription units - 13, operones - 7 average op.length - 5.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1139 1133 ## COG1205 Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster - Prom 1164 - 1223 6.3 - Term 1311 - 1359 5.5 2 2 Op 1 12/0.000 - CDS 1406 - 2329 614 ## COG3958 Transketolase, C-terminal subunit 3 2 Op 2 . 
- CDS 2342 - 2950 542 ## COG3959 Transketolase, N-terminal subunit - Prom 3064 - 3123 3.4 4 3 Op 1 . - CDS 3198 - 4064 764 ## COG1082 Sugar phosphate isomerases/epimerases 5 3 Op 2 . - CDS 4137 - 5249 773 ## COG1363 Cellulase M and related proteins 6 3 Op 3 . - CDS 5251 - 5883 461 ## COG0036 Pentose-5-phosphate-3-epimerase 7 3 Op 4 2/0.000 - CDS 5915 - 6667 260 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 8 3 Op 5 24/0.000 - CDS 6674 - 7384 637 ## COG1116 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component 9 3 Op 6 2/0.000 - CDS 7414 - 8214 478 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 10 3 Op 7 . - CDS 8214 - 8990 434 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 11 3 Op 8 . - CDS 9003 - 10199 943 ## DvMF_0519 nitrate/sulfonate/bicarbonate ABC transporter periplasmic protein 12 3 Op 9 . - CDS 10219 - 11022 697 ## COG0434 Predicted TIM-barrel enzyme 13 3 Op 10 . - CDS 11054 - 11842 626 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 11924 - 11983 7.6 - Term 12002 - 12055 10.0 14 4 Op 1 . - CDS 12080 - 12772 334 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 15 4 Op 2 1/0.000 - CDS 12808 - 13485 803 ## COG1285 Uncharacterized membrane protein - Prom 13553 - 13612 1.9 - Term 13500 - 13535 3.4 16 4 Op 3 17/0.000 - CDS 13615 - 15363 2049 ## COG1178 ABC-type Fe3+ transport system, permease component 17 4 Op 4 7/0.000 - CDS 15379 - 16383 1283 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components - Term 16393 - 16439 7.1 18 4 Op 5 . - CDS 16449 - 17588 1303 ## COG1840 ABC-type Fe3+ transport system, periplasmic component 19 4 Op 6 . - CDS 17599 - 18438 998 ## COG0584 Glycerophosphoryl diester phosphodiesterase - Prom 18514 - 18573 8.2 - Term 18704 - 18768 12.1 20 5 Op 1 . 
- CDS 18810 - 20999 1965 ## COG1609 Transcriptional regulators - Prom 21031 - 21090 6.4 21 5 Op 2 . - CDS 21092 - 22357 1454 ## COG4536 Putative Mg2+ and Co2+ transporter CorB - Term 22387 - 22435 7.1 22 5 Op 3 . - CDS 22438 - 22548 76 ## - Prom 22587 - 22646 6.7 23 6 Tu 1 . - CDS 22697 - 22990 393 ## gi|266621620|ref|ZP_06114555.1| phosphocarrier protein hpr - Prom 23088 - 23147 5.5 + Prom 23140 - 23199 6.2 24 7 Tu 1 . + CDS 23327 - 25207 1551 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains - Term 25169 - 25212 -0.4 25 8 Tu 1 . - CDS 25259 - 25912 459 ## gi|266621623|ref|ZP_06114558.1| hypothetical protein CLOSTHATH_02787 - Prom 25949 - 26008 5.7 + Prom 26011 - 26070 6.2 26 9 Tu 1 . + CDS 26261 - 26518 267 ## gi|288870562|ref|ZP_06114559.2| conserved hypothetical protein - Term 26561 - 26604 9.1 27 10 Op 1 . - CDS 26618 - 27190 666 ## COG2068 Uncharacterized MobA-related protein 28 10 Op 2 6/0.000 - CDS 27208 - 28254 989 ## COG1118 ABC-type sulfate/molybdate transport systems, ATPase component 29 10 Op 3 23/0.000 - CDS 28264 - 28932 693 ## COG4149 ABC-type molybdate transport system, permease component 30 10 Op 4 . - CDS 28992 - 29870 1075 ## COG0725 ABC-type molybdate transport system, periplasmic component 31 10 Op 5 . - CDS 29870 - 30670 575 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family 32 10 Op 6 . - CDS 30731 - 31222 286 ## PROTEIN SUPPORTED gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 33 10 Op 7 . - CDS 31201 - 31644 494 ## COG2258 Uncharacterized protein conserved in bacteria 34 10 Op 8 6/0.000 - CDS 31649 - 32653 837 ## COG2896 Molybdenum cofactor biosynthesis enzyme 35 10 Op 9 . - CDS 32650 - 33138 548 ## COG0315 Molybdenum cofactor biosynthesis enzyme 36 10 Op 10 . - CDS 33140 - 33253 125 ## Closa_1040 molybdopterin binding domain protein 37 11 Tu 1 . - CDS 34174 - 35028 1044 ## COG0303 Molybdopterin biosynthesis enzyme - Prom 35111 - 35170 3.7 38 12 Op 1 . 
- CDS 35173 - 35520 376 ## COG2005 N-terminal domain of molybdenum-binding protein 39 12 Op 2 . - CDS 35545 - 36378 857 ## Closa_1038 hypothetical protein 40 12 Op 3 . - CDS 36427 - 37803 1623 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 41 12 Op 4 . - CDS 37852 - 38361 436 ## COG3467 Predicted flavin-nucleotide-binding protein - Prom 38436 - 38495 6.7 - Term 38460 - 38491 -0.2 42 13 Op 1 . - CDS 38571 - 41138 3233 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 43 13 Op 2 . - CDS 41156 - 44176 3132 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 44233 - 44292 1.6 Predicted protein(s) >gi|229784087|gb|GG667648.1| GENE 1 2 - 1139 1133 379 aa, chain - ## HITS:1 COG:BS_yprA KEGG:ns NR:ns ## COG: BS_yprA COG1205 # Protein_GI_number: 16079280 # Func_class: R General function prediction only # Function: Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster # Organism: Bacillus subtilis # 4 379 16 380 749 322 43.0 9e-88 MDINDLIIYRRTMPERSARYAGFPAELPEELREYLQKNGVRGLYIHQAEMFLRALKGENV VITTSTASGKTLGFLLPVIEEILRNPLARAVFIYPTKALASDQYRAILPYIEYFGKNRIS AGVYDGDTPAGERSRIRKEANIILTNPEMVNGAFLPNHSKFGFDFIFANLKFVVIDELHT YRGVFGSHLANVFRRLHRICRYYGSSPVFLCSSATIANPVELAEEICGRKFVRIEWDGSP AAERTYLLIKPPKILGHDKKYYGQVQATAVAAELIPELVEQRKNFIAFARSRRNVEVILK EARDRLEAESALNASQADKISGYRGGYTPAERKEIEHKMVTGALCGLVSTNALELGIDIG KIDATVLAGYPGTRASFWQ >gi|229784087|gb|GG667648.1| GENE 2 1406 - 2329 614 307 aa, chain - ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 3 307 5 309 309 249 41.0 6e-66 MIANRVAYGNMVHKLAMEDERITVVDSDCINVLNYGEFIRDFPDRFFECGIAEQNMVSIA AGLASCGMIPFVASFAVFTSMRALDQVRNMICYNGYSVKIVGTHAGLETGFDGATHQAIE DMAIMRAIPGLRVLAPSTPNMTAKLTRLMAETDGPFYMRFGREVNQEYYPEDMEFTLGGS 
RVLRDGDRLTVMACGRMVDFAVRAADQLIKEGIRVRVVDMYSIKPIDGKAIEAAVSDTAC ILTIEDHNTIGGFGGAVSEYVTEHCPCKVLKMGMRDEFGRSGSSADLFEMYGLTADRIAA RIRELLE >gi|229784087|gb|GG667648.1| GENE 3 2342 - 2950 542 202 aa, chain - ## HITS:1 COG:MJ0681 KEGG:ns NR:ns ## COG: MJ0681 COG3959 # Protein_GI_number: 15668862 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Methanococcus jannaschii # 7 192 95 274 274 191 51.0 6e-49 MEEMSHLRQADSHLQGAPNPKTPGIDMSSGPLGQGLSAAVGMALAARTMKKDFFVYCMLG DGEIEEGQIWEAAMTASKYRLNRLIAILDHNKVQMSGTNEEIMPLGDIKDKFDSFGWKTY KINGHSISEIIEILDHVTQESHEEKSKQDKPVMIIANTVKGKGVSFMEGKAEWHGAVPTE EQYDAALTELGKGESNFDHLRT >gi|229784087|gb|GG667648.1| GENE 4 3198 - 4064 764 288 aa, chain - ## HITS:1 COG:MJ1311 KEGG:ns NR:ns ## COG: MJ1311 COG1082 # Protein_GI_number: 15669501 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Methanococcus jannaschii # 2 263 30 266 293 66 26.0 7e-11 MNLREKMEFAKRLGFDGIELVIVEKGELQPDCRKEELMTIRRTAKETGIEIISVSSSLNW KCSFTSNDAQIRKKAGEMLRKQLYTAAELNSEVALALPGFVTADFISDEIHPSFRIHGEE DYHPGLEIIDYKTAYERSLQAFEEIAPIAERVGVTVGIENIWNHFLMSPLEMRQFIDEIG SNYVQVYFDIGNVLPSGYPEQWIRILGSRMKRVHVKDFRKGCSSVQGFVNLLSGDIDFVK VMQALHETGYNGWITAEVNEKRDYPEFAAKASALALEQLLMERSNSDV >gi|229784087|gb|GG667648.1| GENE 5 4137 - 5249 773 370 aa, chain - ## HITS:1 COG:sgcX KEGG:ns NR:ns ## COG: sgcX COG1363 # Protein_GI_number: 16132126 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Escherichia coli K12 # 2 359 14 370 383 245 39.0 1e-64 MNAEEVLKELCLAPAPSGYEKRAAEVFSSKIRPFVDRIMLDRMGNVMATLDGTDPESPSV MMYAHLDQLGFIVRRIEPDGYIQLDRLGGIPEKVLPALRLWIRTVRGDFIPGLIGNKAHH AASAEEKYKVDMVTNLYVDVGAATEQEVRELGVDVGCPAIYEPSFQPLAGDKVSGTALDD RGGLTALVLAAERLWEDRPASTVHFVGTVWEEFNIRGAAFAVRRLKPDVALGLDVVLAGD TPDLKTKYQNALDKGPTINFYNFHGRGTLNGMIPHEGLARLAEECALREKIPHQRFASLG IITESAYVQMENEGVACLDLGYPARYTHSPVETCSLSDIKNLGLLAAATAKAITPEFSIN RYSIDCCSDS 
>gi|229784087|gb|GG667648.1| GENE 6 5251 - 5883 461 210 aa, chain - ## HITS:1 COG:DR1401_2 KEGG:ns NR:ns ## COG: DR1401_2 COG0036 # Protein_GI_number: 15806418 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Deinococcus radiodurans # 1 195 1 197 214 122 36.0 5e-28 MKLMPSIASADPLAVGEVLKKLEAWPYLHIDIEDGNFVPNITFGMKAVRAICKKAGKRYI QVHLMVTHPETYLDGLAACGVDSVIAHLEALDYPLHFLNHCRALGLEPGLAVNIKTPVEA VKPFLGNMDQLLLMTSEPDYLEEKVYGPAVERLIRLAGNIPAGVELYADGGVTRETLREL EEAGVHGAVLGRLVFQSPDPLEMLKELKEE >gi|229784087|gb|GG667648.1| GENE 7 5915 - 6667 260 250 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 210 14 235 329 104 30 8e-22 MQYMTENYKIEVKELTKNFGDLKVLDHISFKVKKGEFICVVGPTGCGKTTFLNLLTQLIQ PTSGEILIDGKTADPKIHNISFAFQEPSAIPWLTVEENIMFGMKLKKLPKEEIYRRTEEI LSLVGLTRFRKQYPHQLSTSSEQRIVIGRSFALYPDLLLMDEPYGQMDIKMRYYLEDEVI RLWKKFESTVLFITHNIEEAVYLADRIIILTNKPAAIKEELVVDLARPRDFTDPQFVELR KYVTEQIRWW >gi|229784087|gb|GG667648.1| GENE 8 6674 - 7384 637 236 aa, chain - ## HITS:1 COG:MJ0412 KEGG:ns NR:ns ## COG: MJ0412 COG1116 # Protein_GI_number: 15668588 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component # Organism: Methanococcus jannaschii # 3 228 29 254 267 244 50.0 1e-64 MENGKNHEVVRELSLTVRDNEFLILLGPGHCGKTVILNMIAGLMQPDYGNIYLDGKKVVG PNPDMGMVFQKIGTLPWKTVMENVELGPKMRGIPKRERREKAQHYIDLVGLHGFEEHLPK QLSGGMKQRVGIARAYANDPQILLLDEPFGQLDAQTRYNMEEEIARIWEKEKRTILFVTN NIEEALMLGDRIILLTECPARVKQVYQVDLPRPRDMMDCRFLQMRREISENTDLSI >gi|229784087|gb|GG667648.1| GENE 9 7414 - 8214 478 266 aa, chain - ## HITS:1 COG:mlr4519 KEGG:ns NR:ns ## COG: mlr4519 COG0600 # Protein_GI_number: 13473801 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Mesorhizobium loti # 41 260 64 283 286 146 35.0 3e-35 
MKKVTNWKKGFEHILYTIAPALSLVLLFLLWVQISRGNSDLVPGPDLVMKRFVKTFEKPI NGYMIFGHILASLKRIFLALLFSLGIGIPFGILIGWNKTFNAIFGTLFEFIRPIPPIAWL PLIVMWFGIGEFPKVIIVFIGTFTPVVVNTATGIRLVEPLYLDVGRMFHANDRKLLTEIA LPSAMQSIFAGFRTATSGGLMVVLAAEMIGAKAGLGFLITRGQDAFDVPLIMVGMIVIAI VGTVISIFTDLLERRICPWSENIKSD >gi|229784087|gb|GG667648.1| GENE 10 8214 - 8990 434 258 aa, chain - ## HITS:1 COG:mlr4519 KEGG:ns NR:ns ## COG: mlr4519 COG0600 # Protein_GI_number: 13473801 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Mesorhizobium loti # 10 255 38 282 286 149 34.0 4e-36 MKKILDRYAVISVLSVLTFLLIWQGCVTAGILSDKILPAPTKVFAALFDKLNNKSPDGSL LQTNILVSLKVACSGLLLAIVIGVPLGWLMGWYKTVSRFVRPIFEIMRPIPPIAWIPLTI LWVGIGLEAKAMIIFFASFVPCVINATAGIKSTNQTLINVAKTYGASDFTIFLKIGIPSS LPMTFAGIRIALNSAWSTLVAAELLAANAGLGYMINMGRQFGRVDIIVLGMVTIGILGFA FSWIFTKLEHIFVKWRTV >gi|229784087|gb|GG667648.1| GENE 11 9003 - 10199 943 398 aa, chain - ## HITS:1 COG:no KEGG:DvMF_0519 NR:ns ## KEGG: DvMF_0519 # Name: not_defined # Def: nitrate/sulfonate/bicarbonate ABC transporter periplasmic protein # Organism: D.vulgaris_Miyazaki_F # Pathway: ABC transporters [PATH:dvm02010] # 57 388 29 344 354 99 23.0 4e-19 MRRRLQKWFCLLLSLLLVLGTLAACSQSAEKTDNNGVAGNSGKEETGREEAGIYDGIEPL ETETELNFGYLTGSHHGMIIYLIDKMGGYEKTGIKANIETFGNGPVMVEAMASDSWDCGT IGLGGVLTGVISQDMVVIGAAARDYDSLRIFAKNDSDVVSAGNTLKDYPAIYGTADTWKG KEILIPTGSTLHYVLSTGLSKFGLSDTDVAMTHMDVPGINTALRANKGEIGALWGSFTYA DEMKKNYTVVMSANDLGIELPTVMCANPRSYEDPAKYKAIKKWMELYFATVDWIYSSEEH FDKAVEMFTEINEEAGATGTLEENRTVLENNRHYSLSDNVKLFNEKTEDGAMLLIEAMHN DPLLFYIKNGSYQKGDDTKLLGGYFKADIINEIAAMKE >gi|229784087|gb|GG667648.1| GENE 12 10219 - 11022 697 267 aa, chain - ## HITS:1 COG:STM1615 KEGG:ns NR:ns ## COG: STM1615 COG0434 # Protein_GI_number: 16764959 # Func_class: R General function prediction only # Function: Predicted TIM-barrel enzyme # Organism: Salmonella typhimurium LT2 # 1 267 1 267 268 374 67.0 1e-103 
MNWLKEVFKTEKPIIAMCHLQALPGDSGYDQEKGMEWVIEKAREDLYALQDGGVDAVMFS NEFSLPYLTDVKKETLAAMGRVIGEIRSEITVPFGVNVLWDAKASLDLAVATGACFVREI FTGVYASDFGLWNTNCGETVRHQHAVGAQNVKLLFNIVPEAAVYVEQRNIREVAKTTVFN CRPDALCVSGLTAGESTDTQILSEVKSAVPDTVVLANTGCRIENITTQLSVADGAIVGTT FKKEGKFENGVDYNRVKAFMDLVKEFR >gi|229784087|gb|GG667648.1| GENE 13 11054 - 11842 626 262 aa, chain - ## HITS:1 COG:BH0801 KEGG:ns NR:ns ## COG: BH0801 COG1349 # Protein_GI_number: 15613364 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Bacillus halodurans # 12 241 13 238 259 141 40.0 1e-33 MEQKTSIIPAGRLQKIMEYMKEHGSVQIKELADYLDVSDATVRRDLDDLDTQGLLVRTHG GAVRKNDNTSSFEWQHNEKMTIMLHEKKQIAKLAASFVHEGDTILLDSGTTTYYLASELS DVPNLTVITYDLFIGGNLTLHPTSTMIVTGGIRRQGYNNVLLGSMVEDYVRNIRVDKAFL GADAIDVDFGISNTNVMEASIKRLLVQAGKQVILIGDHSKMGRVALIKVCDLTDLDGVII DKAIDENSLHRMKEKVKNVYLA >gi|229784087|gb|GG667648.1| GENE 14 12080 - 12772 334 230 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 21 227 16 230 236 133 34 2e-30 METVTTALEYLEGINLASITLRIVMSMVCGGVIGIERGKAHQPAGMRTYMLVCMGAAMVM LTGQYMYYHFQTGDPARLGAQVVSGIGFLGAGSIIISGKTKIKGLTTAAGLWTAACIGLS IGIGFYEAGIIATLAVSLIMTHLKKLEYHLIFDDVWLSVYVELDDAVKMTDLVQEFSSVG LVIGEVRIEKKRKGFQKAVISLKNTEHKSREAVLRFLENVEGIQFVKYTA >gi|229784087|gb|GG667648.1| GENE 15 12808 - 13485 803 225 aa, chain - ## HITS:1 COG:CAC3658 KEGG:ns NR:ns ## COG: CAC3658 COG1285 # Protein_GI_number: 15896891 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Clostridium acetobutylicum # 1 223 1 227 229 158 38.0 8e-39 MSDFMMEQWIYVARLIGAAVCGGLIGYERQSRKKTAGVRTHVIISVAAATMMIISKYGFN DVLGEYVRLDPSRVAAGIVTAVGFIGSGIIFFRNNNVSGITTSAGIWATVGVGMAMGCGM YVLGVVSTVIVMVSEVFLGRRGYFSRKAGEERELEIEYKDSDKEDIFSYINKTIEEAGCR MENIKLTDEDSLCVLNAVLKAPDELDAALLVGELKKKKSVKRVSV >gi|229784087|gb|GG667648.1| GENE 16 13615 - 15363 2049 582 aa, chain - ## HITS:1 COG:SP1824 KEGG:ns NR:ns ## 
COG: SP1824 COG1178 # Protein_GI_number: 15901653 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 7 563 2 558 563 606 56.0 1e-173 MKQRSKRKSFSIQTLILKIVIVWAIVAFVIYPNLNLLVSVFYKNGQFSTEVFGKVFGSAR AMKSLVNSFILAFTLVVTVNIVGTLSVLLTEYWDIKGAKILKLAYMSSLIYGGVVLVSGY KYVYGANGLLTKLLVSVFPGMNPNWFVGYGAVVFVMTFSCTQNHMMFLKNAIHGLDYHTI EAAKNMGASGSRIFFQIVMPVLKPTFFAITILTFLTGLGAMSAPLIVGGENFQTINPMII TFAKSSFSREVAAFLAVLLGIATMILLAVFSRIEKGGNYISVSKTKAKLKKQKITNPVFN VAAHVVAYAMFAIYVIPIILVIVYSFCDSLTIKTGNLTLNSFTLANYQKLFTSAEAIKPY LVSIVYSLLAAVLAVLMAVICARFVRTAKHKYDVFFEWGMLIPWLLPATLIALGLTTTYG ERQPILFNYVMVGTLVLMLVGYTVVKLPFSYRMVKAAFFSIDNDLEEAAKCMGASTLYTM IKVILPVIFPAIMSIIILNFNGMLSDYDLSVFLYHPLFQPLGIVVKAASDESASIDARAM GFVYTVVLMILSSIALYFGQGDGMERIRRWRQRRFEKKVARG >gi|229784087|gb|GG667648.1| GENE 17 15379 - 16383 1283 334 aa, chain - ## HITS:1 COG:SP1825 KEGG:ns NR:ns ## COG: SP1825 COG3842 # Protein_GI_number: 15901654 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Streptococcus pneumoniae TIGR4 # 1 334 1 336 336 401 61.0 1e-112 MIEFKHIEIKYGDFVAIRDLNLTIGDGEFFTFLGPSGCGKTTTLRALVGFLEPSAGQILI DGKDVTHTPVEKRNIGMVFQSYALFPTMTVYENLAFGLKVKKLSAAEIDKKVKDVAKVIE IKSEQLVRNVSELSGGQQQRVALARAIVLEPKILCLDEPLSNLDAKLRVGLRGELKKLQK NLGITTLYVTHDQEEALTLSDRIAVFNNGFVEQVDKPYNIYNHSGSEFVCDFIGEINRLN ASVLSEITTQTGHALDGSKGGYVRLERIRLNEIPGYAKLHGVIRDYDYNGVMTKYRVEVM GCELRVIHVNDGQPFMEIGQELDIYVNPRDIMQF >gi|229784087|gb|GG667648.1| GENE 18 16449 - 17588 1303 379 aa, chain - ## HITS:1 COG:SP1826 KEGG:ns NR:ns ## COG: SP1826 COG1840 # Protein_GI_number: 15901655 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 1 379 1 355 355 309 42.0 5e-84 MKNLKKFAALGLSGAMVLSLAACGSGTKNTTTDTTAAGTEAAAKTEAAGDTEAAADTAAA 
GGNVDKSQKLIIYTNSGSNGRDAWLKEKAGTEGYNVEVVQIQGGDLANRIIAEKNNPQAD MVFGLNSIEYEKLKAENVLDQWEPSWAGDVDMTLGDKDGYYYPIVIQPLVNMMNADLQNP PKDYTDLVNPEWKDKYTILNFGGGTGKTILASLLVRYRDDSGEYGISDEGWDFVKNWIQN GHMEVQGEDYVGNAISGTVPITEMWGSGVLQNQTERDYKFQIMVPEVGVPYVTEQVAMIT GSKKRDLAVDFANWFGSAEVQAEWMNQFGTIPCQPKAVEEAPADIKEFMDQVHPQDMDWG FVAANIDQWVEKCELEFLQ >gi|229784087|gb|GG667648.1| GENE 19 17599 - 18438 998 279 aa, chain - ## HITS:1 COG:AGl598 KEGG:ns NR:ns ## COG: AGl598 COG0584 # Protein_GI_number: 15890416 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 263 32 284 306 61 26.0 1e-09 MYTKNQSIRELIDKNRILIAAHRGTCGGNVVQNTCLAYKNALLHGADMIEVDASMTTDGV FYAFHNGEEPLELGIERDIRTMSSEEVEKLFTLNSLRQVCGQKLERLEEVLESFRGKCLI NIDRSWFYWKEIIALLDRMDMKDQILLKSGVEEELLKELEESGTGIMYMPIMKSPEEWEI LEKYEINVAAAELIFTDTDSEFLSPEFMGRLKRAGIAPWVNAITLNDEIVLSGLLDDNNA IKIGFDETWGKLTDLGFEIIQTDWPALLKGYITNKPNKK >gi|229784087|gb|GG667648.1| GENE 20 18810 - 20999 1965 729 aa, chain - ## HITS:1 COG:CAC2962 KEGG:ns NR:ns ## COG: CAC2962 COG1609 # Protein_GI_number: 15896215 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 3 128 4 129 337 71 33.0 5e-12 MPTIKDIAAKAGVSHGTVSNVLNKRGNVSAEKIQLVERIAKEMGYKMNVQAQQLRAGTAR RVCVIVPKISLKCYNDLYTGLEQVLREHDISIELISTNNLERDEEKAVKKALSLNPMAVV VVSSLLKNRGVFTPDTRFFFAERKVKGMPENSVYISFGFEQAGRDIALKCIGDGHQNVAL LCENSIYSNNKSFMNGVVDVLEDDNCSCKLFPGDDSMWFNKAFDILGSKEEFDAVIAMSQ EDADYLKAAHQYNPDKKMPVIYALTSKRIGIDPDVCRYELNYKLMGRNIAGQIIHAGGED AEAAPAKKQQIIMENDGFYEKSLPSAGKKESICFLTIQNLTSKAIGMLLPSFTRETGIEV NMIEVSYDEHYKMAQTCVQNSPYDLVRIDMAWMTELGDKIYRPFDGSSSSVQWLRQQILP SLSENYSMVHGIQYAFPLDACVQILFYRKDLFEDELIKREFFEKYKRRLEIPKTFEEYNE VARFFTRKENPDSRTRYGAVTSYGRTFLAACDFLPRFRELKKDIFDARGNVNILIPEMKQ AVRNYLDTCRYAGSAVYQWWGEPTRQFSEGSTAMHIVFSNYASEMIHNPESKVVGRIAYD TVPGGQPLSGGGSVGISRFSKKCDACMSFLQWLYRKDIAETITYLGGYVCNRQISRNMDI LERYPWLENMEKSFEPGWRLYKHDRNPDFNEFLFEDILGKAIRSIASGIEGMDDALEKAQ 
AECEQAFNR >gi|229784087|gb|GG667648.1| GENE 21 21092 - 22357 1454 421 aa, chain - ## HITS:1 COG:SMc00697 KEGG:ns NR:ns ## COG: SMc00697 COG4536 # Protein_GI_number: 15966428 # Func_class: P Inorganic ion transport and metabolism # Function: Putative Mg2+ and Co2+ transporter CorB # Organism: Sinorhizobium meliloti # 1 407 9 424 434 221 33.0 2e-57 MEEQSISLMIIIGCIIMSAYFSATETAFSSLNRIRIKNMAEKGNKRAQLVLTLSENYDGL LSTILIGNNIVNIASASLATVIFVKLLGDEAGASISTVVTTIVVLIFGEVSPKSIAKESP EQFAMFSAPFLNAFMVVLTPANYLFKQWKKLLSLLIRTSGDPGITEEELLAIVEEAKQDG GIDEQEGSLIKSAIEFTELEATDIATPRVDVTGISIDADKEEIAAVFDETGYSRLPVYKE TIDDITGIIYQKDFYNRMYRGTCGVEAIVRPALYVAKSKKINVLLKELQKNKMHIAVVID EFGGTMGIVTLEDILEELVGEIWDEHDVVVQEIEKISDREYLVSGNASVEKLMGELGSEE TFESFTVSGWVMELAERIPEEGDVFYYDHMSITVIKMRDRRVEQVRLVLEAEPMEPCAAS F >gi|229784087|gb|GG667648.1| GENE 22 22438 - 22548 76 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDDADSDAEPDTHRANFYTTVTYRLIKSNKRMKLLT >gi|229784087|gb|GG667648.1| GENE 23 22697 - 22990 393 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621620|ref|ZP_06114555.1| ## NR: gi|266621620|ref|ZP_06114555.1| phosphocarrier protein hpr [Clostridium hathewayi DSM 13479] phosphocarrier protein hpr [Clostridium hathewayi DSM 13479] # 1 97 1 97 97 161 100.0 2e-38 MDEEERTGYGYMKSREYNLAPEEEKARKQMMELILLSSRFASSITVENEEKHFNGKSILK AGELKKGEHFYVVADGSDAEAAIEAFDQLFPQAAGGI >gi|229784087|gb|GG667648.1| GENE 24 23327 - 25207 1551 626 aa, chain + ## HITS:1 COG:aq_1967 KEGG:ns NR:ns ## COG: aq_1967 COG0749 # Protein_GI_number: 15606966 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Aquifex aeolicus # 48 623 1 571 574 283 34.0 7e-76 MENVISQSNDRIEPPHLQTSLGAEINSGTPDLPLESGREISFSTDLSVEFRIISQPSDLT PYFHTVKEAPLLAIDTETTGLDPHSDRLRLIQISAPGIPVLVIDCAAFLPDGFACLKELL NTPSEKIFHNARFDLQFLMGIGIDCFPVFDTMLAAQLLRPCGGPLKAGLAVVADHYLGIK LDKTEQTGSWDSASLTGSQLAYAALDAWILLKLYDVMNPLLARHGLGRTASIEFACVSAI ARTEYDGINLDLEKWDELRRKTEKQYDAALETLYTYSGTPSCQLTLWGGEEALDVNFESN 
PYVLKLLNRYGIPVTSTSRRSLAPYHSQPLVQALTEYRRHSKSLSSFLHPIPAQIHPVTH RLHPKYMQIGAWSGRMSCYNPNIQQIPREADFRSCFTAPDGRKLILADYSQIELRVAAQI SGDSRMKSACQKGEDLHALTASLISSIPVSQVTKAQRQAAKAVNFGLIFGMGAEGLKQYA GQSYGVEMTLEEAEQFRSAFFQSYRGINWWHHDLRESHPLEGRTLTGRKFLFSPNTGLAD LANTPVQGTAADILKHALGLLASRIRGSDKKIVAIVHDEILMEVPVEEADDTVVLLKTAM EEAAQAILPDVPCEADAKAADSWAGK >gi|229784087|gb|GG667648.1| GENE 25 25259 - 25912 459 217 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621623|ref|ZP_06114558.1| ## NR: gi|266621623|ref|ZP_06114558.1| hypothetical protein CLOSTHATH_02787 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02787 [Clostridium hathewayi DSM 13479] # 1 217 1 217 217 438 100.0 1e-121 MDWEQLMTELRKKEDPKVYAELGYRDASVIPLFIEVMEKEKTAVKYQAEKAVRKISEERP AMLLPYVDRLIGLLDSDNNFIRWGMLLTLPGLLEAGGKDIWMKMRTRYLLSFRSRQVAEF GNAVSSTGKILRACPEEEETIIPLLLKIDSHTFFHHGVPSPECLNVAKGHILECFLEQFP DTAYRTEMASFAKQNLENDRKQVCIRARRLMKLAGEV >gi|229784087|gb|GG667648.1| GENE 26 26261 - 26518 267 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870562|ref|ZP_06114559.2| ## NR: gi|288870562|ref|ZP_06114559.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 85 4 88 88 130 98.0 4e-29 MILTEYDEQQHLAGERELGVTLGCLMVQIQLVRKKFEKHLVPETAAEELELDKAYVTQIY TLLREHGELDDRRLMELLAHELRYI >gi|229784087|gb|GG667648.1| GENE 27 26618 - 27190 666 190 aa, chain - ## HITS:1 COG:APE2223 KEGG:ns NR:ns ## COG: APE2223 COG2068 # Protein_GI_number: 14601923 # Func_class: R General function prediction only # Function: Uncharacterized MobA-related protein # Organism: Aeropyrum pernix # 7 187 1 182 205 74 31.0 1e-13 MKLTLILLAAGDSRRFDGNKLLTPFHGKAMYQYIVEEVAKLPDDLFDKKIVVTQYREIME DLGKRGYLVVENVQSSLGISHSIHLALEAASPQKAEEAAYCFAVCDQPYLKADTIEDLIE GWKNSKRGIGCLCNMGEFGNPAIFSDMYREELMTLEGDVGGKRVLRRHIDDLYLHEVDDG LELVDIDVRG >gi|229784087|gb|GG667648.1| GENE 28 27208 - 28254 989 348 aa, chain - ## HITS:1 COG:sll0739_2 KEGG:ns NR:ns ## COG: sll0739_2 COG1118 # Protein_GI_number: 16331977 # 
Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate/molybdate transport systems, ATPase component # Organism: Synechocystis # 3 278 38 311 395 248 45.0 1e-65 MSLKVEIKKKLRDMTLSVAFDTDRDSGITGILGASGCGKSMTLKCIAGIFTPDEGRIELN GRVLFDSEKGINVKIQDRGVGYLFQNYALFPHMTIRENVEMALSCPKRERRQAAMRYLDM YHIGELADRYPSRLSGGQQQRAALARIMASKPEVLMLDEPFSALDYYLKEKLQLEQLEEL KNYDGDVLLVTHSRDEIYRFCSTIHVINEGEIVVSGAVKEVFKEPETVSAAKLTGCKNIL DIKYVDDHTFYVPNWDMQVIIRDRNVPKAAPFIGIRAHYFTAAREEAEVNVMECRLKQLL DDPFEVTLILENDLWWKIPKARWKDEMKERVPERLVVPEESIFFLKDR >gi|229784087|gb|GG667648.1| GENE 29 28264 - 28932 693 222 aa, chain - ## HITS:1 COG:alr2433_1 KEGG:ns NR:ns ## COG: alr2433_1 COG4149 # Protein_GI_number: 17229925 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, permease component # Organism: Nostoc sp. PCC 7120 # 1 212 3 212 223 179 47.0 4e-45 MDYSPILISLKTSVISIIFTFFLGIWAARAVIALKSERAKAILDGVLTLPLVLPPTVAGF FLLYIFGVKRPVGKLLLAVFSYKIVFTWQATVLAAVLISFPLMYRSARGAFELVDENVLN AARTLGLPERKVFWKILMPMAMPGVLSGGVLAFARGLGEFGATAMIAGNIAGKTRTLPLA IYSEVAAGDMEGAGKYVVIIVAISLLIVAGMNVAATGKKKYL >gi|229784087|gb|GG667648.1| GENE 30 28992 - 29870 1075 292 aa, chain - ## HITS:1 COG:BS_yvgL KEGG:ns NR:ns ## COG: BS_yvgL COG0725 # Protein_GI_number: 16080391 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, periplasmic component # Organism: Bacillus subtilis # 53 290 22 259 260 163 41.0 5e-40 MKKAILLAGALTVAMAALTACSGNGKAEPETTAQAAESAAETQKETTAKAEETTAEAAKE AEAQTVTIAAAASLETCLRDELIPLFEEQNPGFTVVGTYDSSGKLQTQIEEGAGIDVFFS AATKQMDALNDEGMIQEGSMKNLLENKIVFIVPTGDEGSYSSFEDIAKAENAALGDPDSV PVGQYSKEALTNLGLWDEVLAKASLGTNVTEVLNWVAEGSAKAGIVYATDAAQTDKVTIV AEAPEGSVKKAIYPAGIIKDCENPEGAQKFIDFLAGSEAMEIFVKNGFAAAQ >gi|229784087|gb|GG667648.1| GENE 31 29870 - 30670 575 266 aa, chain - ## HITS:1 COG:yqeB KEGG:ns NR:ns ## COG: yqeB COG1975 # Protein_GI_number: 16130777 # Func_class: O Posttranslational modification, protein turnover, 
chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli K12 # 4 256 271 523 541 235 49.0 6e-62 MERVLIKGAGDLATGIGCRLHRCGFSVAMTEIAVPTTVRRTVAFSRAVYEGSALVEDVEG VLCRNTGDIDRVVNQDKVAVIVDESCETVRSWKPDIVVDAILAKVNLGTEITDADIVIGV GPGFTAGRDCHVVVETKRGHNLGRCIWEGSAFPNTGVPGMIGGYDKERIIRAASDGIFTG AVEIGTLVEKGDVVGSSGGTPIYAEVGGVVRGLLQDGVAVTKGMKSGDIDPRGVTEHCWT ISDKATAIGGGVLEGIFHIKKNRRVK >gi|229784087|gb|GG667648.1| GENE 32 30731 - 31222 286 163 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 [Burkholderia pseudomallei 305] # 5 140 3 144 194 114 45 8e-25 MYRAGIVTLSDKGAAGEREDKSGAVIREILEAAGYEVAAQSLLPDEGEALKKELIRLSDQ VQCDLVLTTGGTGFSRRDVTPEATMAVAERNAPGIAEAIRACSMTVTKRAMLSRGVSVIR GGTLIINLPGSPKAVRESLEYVLDTLPHGLDILSGRGGECARR >gi|229784087|gb|GG667648.1| GENE 33 31201 - 31644 494 147 aa, chain - ## HITS:1 COG:CAC1991 KEGG:ns NR:ns ## COG: CAC1991 COG2258 # Protein_GI_number: 15895261 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 141 2 142 145 137 48.0 1e-32 MGQIMAVCISEKKGTQKKGISEGHFIEEFGIEHDAHAGKWHRQVSLLSYETIEAFKARGA EIGDGAFGENVIVKGIDLVHLPVGTKLSCGDILLEVTQIGKECHSHCEIFKKMGECIMPT NGIFTRVLHGGVMKEGDEINVCTEQES >gi|229784087|gb|GG667648.1| GENE 34 31649 - 32653 837 334 aa, chain - ## HITS:1 COG:lin1039 KEGG:ns NR:ns ## COG: lin1039 COG2896 # Protein_GI_number: 16800108 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Listeria innocua # 1 334 4 333 333 227 36.0 3e-59 MRDENGRTIDYLRISVTDRCNLRCRYCMPEEGVALMNHEDILTYQEILRIAEAGAGLGIR RVKVTGGEPLVRKGVIQLIRKLTAIDGIQDVTMTTNGILLGGMAQELAASGLSSVNISLD TLDPVRFHRITRRDCFADVMKGIEAAGKSGLMVKLNCAVMEELCREDVLAFAEYSVKNGI PVRFIEMMPIGQGRKYHAMDNEELLAVLSEEYSDVRISTEPRGNGPAVYYTFASGAGCIG LISAVHHKFCAGCNRVRLTSDGFLKLCLDSPAGVDLRTPLRSGISDETLKDLMEQWIRKK PESHHFLSSAGGEKEGLFFDDTEKQCLNMNQIGG >gi|229784087|gb|GG667648.1| GENE 35 32650 - 
33138 548 162 aa, chain - ## HITS:1 COG:RSc0560 KEGG:ns NR:ns ## COG: RSc0560 COG0315 # Protein_GI_number: 17545279 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Ralstonia solanacearum # 1 153 1 153 158 168 60.0 5e-42 MEGLTHFDEHGNARMVDVGEKAVTDRIAEAHGVIHVNREVYEAIKQGTAKKGDVLGVARV AGIMGAKRTSELIPLCHVLMLTKCTVDFVLHDEIMAVEAVCTVKTSGKTGVEMEALTGVN VALLTIYDMCKAMDRGMTMTDISLLMKDGGKSGLYRKDGERL >gi|229784087|gb|GG667648.1| GENE 36 33140 - 33253 125 37 aa, chain - ## HITS:1 COG:no KEGG:Closa_1040 NR:ns ## KEGG: Closa_1040 # Name: not_defined # Def: molybdopterin binding domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 37 308 344 344 79 86.0 4e-14 MADDPVTKTELDKLGRGGLCLSCEVCHYPNCGFGKGW >gi|229784087|gb|GG667648.1| GENE 37 34174 - 35028 1044 284 aa, chain - ## HITS:1 COG:mlr0093_1 KEGG:ns NR:ns ## COG: mlr0093_1 COG0303 # Protein_GI_number: 13470396 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Mesorhizobium loti # 4 274 6 271 330 92 26.0 1e-18 MREIRTEDAVGHVLCHDITQIIPGVIKDAKFKKGHVVTEEDIPVLLSLGKEHLYIWEKDE SKYHENEAAEILCGLCRGSHMERSGVKEGKIELTSAVRGLLKIDTELLDRINSLGDMMIA SRHNNTPVEPGDKLCGTRIIPLVIEKEKMEEAVAVCGGKKIFSILPYQDKKVGIVTTGSE VYHGRIQDAFGPVLKEKVKEFGGEVMGQTITDDDPDHTRKAIEEFIAQGADMVLVSGGMS VDPDDKTPLAIRSTGARVVSYGAPVLPGAMFLLAYYEKDGRQIS >gi|229784087|gb|GG667648.1| GENE 38 35173 - 35520 376 115 aa, chain - ## HITS:1 COG:PAE0813 KEGG:ns NR:ns ## COG: PAE0813 COG2005 # Protein_GI_number: 18312198 # Func_class: R General function prediction only # Function: N-terminal domain of molybdenum-binding protein # Organism: Pyrobaculum aerophilum # 23 105 20 102 136 62 37.0 2e-10 MEQEKKLRFHMKLRLYYEERNFGPGVAGLMQLVRERGSLAAACQEMHMAYSKAWKIIHKA EEDLGFPLMEGKRGGENGGATVLTEEGEAFLNRYLQFVNEAEHAVAELFEKYYCS >gi|229784087|gb|GG667648.1| GENE 39 35545 - 36378 857 277 aa, chain - ## HITS:1 COG:no KEGG:Closa_1038 NR:ns ## KEGG: Closa_1038 # Name: not_defined # Def: 
hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 15 265 22 262 265 216 51.0 8e-55 MVFYSWEHDGLRECGSLWEALALSVKRHRVIALTGGGGKTSVMFRLADELADMGKTVIVT TTTHIFHPADRNVVESDRASDAAEWLARNERKPERSRTDHGNGTVLVAGVPAAEGKLKSL PLTETAVLKELADVLLIEADGAKRLPIKVPGNGEPVIPEYTDVVIGCMGLNCIGGELEEF CFRTEQAAALLGLKKEPNGRYSHRITCGDAAKILTSDAGTRKLVGERDYRVVINKADDEG LLAKAAEIAGEVERLSQVRCAVSCFLDRKHPQQEREV >gi|229784087|gb|GG667648.1| GENE 40 36427 - 37803 1623 458 aa, chain - ## HITS:1 COG:mll1629 KEGG:ns NR:ns ## COG: mll1629 COG0044 # Protein_GI_number: 13471607 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Mesorhizobium loti # 1 440 1 439 483 328 37.0 9e-90 MKKVLKGGTVVSGEGCKKADVLIEDGKIGAVAPEITDEEAEVVDVSGKLLFPGFIDAHTH FDLHVSGTVTADNFETGTKAALSGGTTMIIDFATQYKGESLHKALENWHEKADGNASCDY GFHLAIADWNPDISEELRQIVDEGVTSFKIYMTYDDMVLDDKSIYQVLKRLKEVGGIAGV HCENKGIIEALTEEEKAKGHLTTDAHPKTRPAVVEAEAISRLLKIADLVHTPVIIVHLSS GAGYQEVKYARERGQEVILETCPQYLILNESVYSLPDFEGSKYVIAPTIKKETDSTRLWN AIRKDHIQTISTDHCSFTTGQKAAGKDDFTKIPCGMPGVESRPALIYSFGVLERDLKLEQ MCRLLSENPAKLYGVYPQKGAVAPGSDADIVVWNPDTEWTMSVENQVANVDYCPFEGTKV KGKAELVFLKGELAAKDGKVVLEKTGAYVPRKRRMELD >gi|229784087|gb|GG667648.1| GENE 41 37852 - 38361 436 169 aa, chain - ## HITS:1 COG:CAC2475 KEGG:ns NR:ns ## COG: CAC2475 COG3467 # Protein_GI_number: 15895740 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Clostridium acetobutylicum # 1 153 1 152 154 132 44.0 2e-31 MFRELRRKNQQLSETEARSILKNGTYGVLSVQGDDGYPYGVPMNYVYGDDAIYFHCAKEG HKMESFQRSDKVSFCVVGEDEVIPEAFSTAYASVIVFGRASVVTDEKEKREALKLLAERF SPEYTKEGEAAIESSWNDVCVVRLKIEHAAGKAGMERIKREEARYASFL >gi|229784087|gb|GG667648.1| GENE 42 38571 - 41138 3233 855 aa, chain - ## HITS:1 COG:BH0748 KEGG:ns NR:ns ## COG: BH0748 COG1529 # Protein_GI_number: 15613311 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL 
homologs # Organism: Bacillus halodurans # 169 849 15 744 760 370 34.0 1e-102 MYTLQVNGTSVAVEEDRKLIDVLRNDLGLKSVKDGCSEGACGTCTVLIDGKATKACVQKV SKMEGKSIVTVEGLSDREKEVFVYAFGEAGAVQCGFCIPGMVMCAKGLIDQNPDPTRDEA AFAIRGNICRCTGYKKIIDAILLAAKLFREGAKDLKAPVVDHIGQRAHRVDVPEKVLGYG EYVDDMEVEGMIYGSAIRTPYPRARVLSIDKSEALALPGVRAVLTADDVPGGNKVGHLKK DWDTMIPVGEITRYVGDAVCLVAADTPEILEEAKKLVKIEYEELEGVFSPQEALKPDAPL VHETGNVLAHEHLVRGDAKKAIENSKYVVTKEYRTPFTEHAFLEPECAVAIVDREKEEAL IYSSDQGTYDTQRECAGMLGWELSRVHVINKLVGGGFGGKEDMAVQHHAALLAYHTGKPV KVRLTRKESIMIHPKRHPAWMEFTTACDENGYLTGMTAKVITDTGAYASLGGPVLQRLCT HAAGPYNYQNIDIDGTAVYTNNPPAGAFRGFGVTQSCFATECNLNLLAEMVGITPWEIRY RNAIRPGQELPNGQIASPGTALAETLEAVKEVYDSHPYVGIGCAMKNAGVGVGIPDWGRC RMTIEDGKVYVRSGASCIGQGLGTVLEQVVCEALGVTGDRVVYVAAETNLAPDSGTTSGS RQTLVTGEAARRACMDILEDLKGGKTLEDLNGKEYYGEYLAATDPMGCGKKNPVSHVAYG YATQVVCLNEDGTIEKVVAAHDVGKAINPLSVEGQIEGGVVMSLGFALTEDFPLDHGKPL AKFGTLGLFRADKTPPVESIIVEENKDPLAYGAVGIGEITSIPTAPAVQGAYYKLDGNFR TSLPLEDTAYRKSKK >gi|229784087|gb|GG667648.1| GENE 43 41156 - 44176 3132 1006 aa, chain - ## HITS:1 COG:ygfK_2 KEGG:ns NR:ns ## COG: ygfK_2 COG0493 # Protein_GI_number: 16130780 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 448 1006 1 578 582 387 39.0 1e-107 MGDRMTPMPFSQLVEWVMKEHKEKGTVFGVYRPYQAKAENDRKIFGRTLETPIGPAAGPH TQLTQNIAAAYYAGSRFFELKTVQIIDGEDLPVNKPCIKADDECYNCEWSTELYVPQAQE EYIKAWFLLSIMAKEYGLGSMDGFQFNMSVGYDLEGIKSPKIDSFIESMKDAKDTETFRS CRQYLLENVSMFQHVTKEDIEGISSNICNSVTLSTLHGCPPQEIERIAKYLIEVKGLHTF IKCNPTLLGYEFARKTMDEMGYDYVAFGDFHFLDDLQYQDAVPMLQRLMKLAGEKNLEFG VKITNTFPVDVKQNELPSEEMYMSGKSLYPLSISLAAKLAKEFDGKLRISYSGGADYHNI KGIIDAGIWPVTVATTLLKPGGYQRLDQMAKELELVSPHLRSAECGSVEFTGVDPEKTAK LAESAITDEHHVKAVKPLPSRKMNKPVPLTDCYVAPCKEGCPIHQDVTTYLQLSGAGKYE EAMKVIAEKNPLPFITGTICAHNCMSKCNRNFYEDPVNIRGVKLVCAEGGYDALMKEIKA GELKGKKTAVVGGGPAGLAAAYFLTKNGYPATIFEAGDSLGGVVKHVIPGFRISDEAIAK 
DVALATAYGAEVKLNTRVESLEDLKKAGYDYIILATGASKPGVLKLEAGEAMNALEFLAQ FKASDGNVNIGKNVVVIGGGNTAMDTARAVKRTKGVEHSYLVYRRTKRFMPADEEELVMA VEDGVEFKELLSPVKLENGKLLCNVMKLSDYDASGRRGVSETGETVEIPADTVIAAVGEQ VPTDFLKANGIHVNDRGRAMVNEETLETSVSGVYAVGDGLYGPKTVVEAMRDARMAVEAI IGGKAARDFDDTVKDLGKLYDRRGILAGTKEYKEDSTRCLGCSNICENCAEVCPNRANIA IRVPGMEKHQIIHVDYMCNECGNCKSFCPYDSAPYLDKFTLFANEADMENSKNQGFVVLD RAAVSCKVRYLGNIYTWNAADSESVIPEGLRKVMETVCKDYDYLLA Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:52:28 2011 Seq name: gi|229784086|gb|GG667649.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld42, whole genome shotgun sequence Length of sequence - 31977 bp Number of predicted genes - 30, with homology - 29 Number of transcription units - 14, operones - 10 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 689 665 ## COG2205 Osmosensitive K+ channel histidine kinase 2 1 Op 2 . - CDS 738 - 2327 1420 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 2352 - 2411 2.6 3 2 Op 1 . - CDS 2413 - 3108 501 ## Closa_4227 integral membrane protein TIGR01906 4 2 Op 2 . - CDS 3137 - 3787 756 ## COG2860 Predicted membrane protein - Prom 3835 - 3894 7.6 + Prom 3943 - 4002 5.1 5 3 Op 1 . + CDS 4027 - 4155 102 ## gi|288870567|ref|ZP_06409806.1| hypothetical protein CLOSTHATH_02809 6 3 Op 2 . + CDS 4152 - 5510 1594 ## COG0372 Citrate synthase + Term 5537 - 5589 11.3 + Prom 5583 - 5642 5.6 7 4 Op 1 . + CDS 5705 - 6073 466 ## Elen_2024 Hpt protein 8 4 Op 2 . + CDS 6070 - 9855 3406 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain + Term 9871 - 9913 5.1 - Term 9858 - 9900 5.1 9 5 Op 1 16/0.000 - CDS 9912 - 10532 805 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D 10 5 Op 2 16/0.000 - CDS 10544 - 11947 1788 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B 11 5 Op 3 . - CDS 11962 - 13728 1883 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A 12 5 Op 4 . 
- CDS 13721 - 13873 82 ## Closa_4197 hypothetical protein
    - Prom 13934 - 13993 23.0
13 6 Op 1 . - CDS 14895 - 15257 510 ## Closa_4197 hypothetical protein
14 6 Op 2 . - CDS 15274 - 15582 366 ## Closa_4196 Vacuolar H+transporting two-sector ATPase F subunit
15 6 Op 3 . - CDS 15586 - 16014 427 ## Closa_4195 H+transporting two-sector ATPase C subunit
16 6 Op 4 . - CDS 16042 - 17970 2129 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I
17 6 Op 5 . - CDS 17986 - 19035 976 ## COG1527 Archaeal/vacuolar-type H+-ATPase subunit C
18 6 Op 6 . - CDS 19040 - 19351 391 ## Closa_4192 hypothetical protein
    - Prom 19592 - 19651 4.2
    - Term 19704 - 19759 3.4
19 7 Tu 1 . - CDS 19817 - 20797 438 ## Closa_4191 SH3 type 3 domain protein
    - Prom 20870 - 20929 5.8
    + Prom 20826 - 20885 6.7
20 8 Op 1 . + CDS 21025 - 21672 649 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase
21 8 Op 2 . + CDS 21741 - 21911 201 ## gi|288870572|ref|ZP_06114598.2| conserved hypothetical protein
    + Term 22095 - 22152 8.0
    + TRNA 22019 - 22094 85.8 # Lys CTT 0 0
    - Term 22146 - 22174 1.0
22 9 Tu 1 . - CDS 22267 - 23262 975 ## COG1609 Transcriptional regulators
    - Prom 23318 - 23377 5.5
    + Prom 23242 - 23301 7.5
23 10 Tu 1 . + CDS 23334 - 23423 64 ##
    + Prom 23432 - 23491 10.1
24 11 Op 1 . + CDS 23710 - 24681 1034 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase
    + Term 24730 - 24769 9.5
25 11 Op 2 . + CDS 24779 - 26026 1420 ## COG1914 Mn2+ and Fe2+ transporters of the NRAMP family
    + Term 26087 - 26132 -0.9
    - Term 26065 - 26129 18.3
26 12 Op 1 9/0.000 - CDS 26180 - 27490 1225 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
27 12 Op 2 . - CDS 27487 - 28356 500 ## COG3279 Response regulator of the LytR/AlgR family
    - Prom 28503 - 28562 2.2
    + Prom 28276 - 28335 5.5
28 13 Op 1 1/0.000 + CDS 28378 - 29802 1538 ## COG0531 Amino acid transporters
29 13 Op 2 .
+ CDS 29832 - 31088 1193 ## COG1228 Imidazolonepropionase and related amidohydrolases + Term 31120 - 31164 9.5 + Prom 31288 - 31347 8.1 30 14 Tu 1 . + CDS 31537 - 31975 358 ## EUBELI_01420 hypothetical protein Predicted protein(s) >gi|229784086|gb|GG667649.1| GENE 1 2 - 689 665 229 aa, chain - ## HITS:1 COG:lin2827 KEGG:ns NR:ns ## COG: lin2827 COG2205 # Protein_GI_number: 16801887 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 109 227 654 772 896 127 50.0 1e-29 MPKEDLMRNIVVTSVFLLLATICSTLFLFIGGNRTNVGIVFMLAVMLTARYTDGYVPGII ASVIGVICVNYVFTYPFMELDFTLDGYPVTFIAMLLISGITSTTTTHLKEQNHILMEAEK ETMRANLLRAVSHDLRTPLTGIIGASSAYLENRDHMSETEKTAMVSNINEDANWLLNMVE NLLSVTRIRDSETQVTKSSEPLEEVASEAVQRFHKRLPDTVVHVTVPDE >gi|229784086|gb|GG667649.1| GENE 2 738 - 2327 1420 529 aa, chain - ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 5 287 1 288 425 203 38.0 6e-52 MTQLIKRMSAFLLCLMLLMGAFPATALAAPDWPSGVSVQAESGIVIDADTGTVLWGQNIH NQYFPASITKIMTALIVIETCSLDETVTFSHNAVFNVEAGSSNAGINEGDKLSVKDCLYA LLLKSANEAANALAEHVAGSTEAFADMMNAKAKELGCTDTHFANPSGLNDPEHYTSTYDM ALIARAAFLNPTFEEIDSTAYYKLPPNSINPEGLTIYPGHKMLKKSTPYYYPGIVGGKTG YTTLAGNTLVTCARKNGLKLIAVILKGSTPQYWTDTKNLLDFGFENFVSVRAADHETKFS PVSSDLTFGGLALDKPAALILDPDGRIILPKTAEFSDAEAALSYDISDSDPDHAVAKICY RYNERQIGCTYLETNPALFESAASSQPLAADAGTEESAAETDPAGSDAETAGTETSAAET AGSQDETELPKETEDYREKALRPFEIPSIAWIILGSVAGIILLGALITFLVLRKNREEQE YHFRHEQRLKRLQDSGVSADEFNSLMEQRRSAYTSKRKKGRGNRHLKFK >gi|229784086|gb|GG667649.1| GENE 3 2413 - 3108 501 231 aa, chain - ## HITS:1 COG:no KEGG:Closa_4227 NR:ns ## KEGG: Closa_4227 # Name: not_defined # Def: integral membrane protein TIGR01906 # Organism: C.saccharolyticum # Pathway: not_defined # 1 228 1 228 236 310 71.0 2e-83 MKTLIHYTAGIVFSLCMMAALLITSVEAVVYWTPGYFEKEYAKYNVTEAVDMTMEDLLDV 
TDQMMAYLRGKRADLHVATTMGGVSREFFNEREIAHMEDVRGLFLDALSIRRGCLLVMAL CVIILFLLKADFKRVFPKSVCLGTGLFFGITAVLAAIISTDFTKYFIMFHHIFFHNDLWI LDPATDMLINIVPEGFFSDTVLHIGITFFLCVAVVFGLALFFLRKSKKNSV >gi|229784086|gb|GG667649.1| GENE 4 3137 - 3787 756 216 aa, chain - ## HITS:1 COG:L169795 KEGG:ns NR:ns ## COG: L169795 COG2860 # Protein_GI_number: 15673892 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 1 211 1 215 219 131 39.0 1e-30 MELQLSLLFFVEAVGTIAFASSGAMVAIKKQLDLLGVIVLGVTTAVGGGMLRDIIIGNVP PNLFKDPVYVLLAFITVMILFVIVRCNQKILASRSIEMYEKVMNIFDAVGLGAFTVVGID TAVLAGYGRYHFLIIFLGVITGVGGGLLRDIMAGETPYILKKHVYACASITGAVLYAYLQ DYMNNDAAMLIGAGSVILIRILATRYCWDLPTATKK >gi|229784086|gb|GG667649.1| GENE 5 4027 - 4155 102 42 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870567|ref|ZP_06409806.1| ## NR: gi|288870567|ref|ZP_06409806.1| hypothetical protein CLOSTHATH_02809 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02809 [Clostridium hathewayi DSM 13479] # 1 42 1 42 42 81 100.0 2e-14 MNNREVTISIEGPRGPKACDHAGGLPGKQRNLIINFKKEVFA >gi|229784086|gb|GG667649.1| GENE 6 4152 - 5510 1594 452 aa, chain + ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 19 452 11 441 441 524 58.0 1e-148 MIDQKEFVKNVDEWSSICLNDERIDLGLYEKYDVKRGLRDKNGNGVVAGLTKVSKILSSK TVDGVKMPCEGQLFYRGYNIYDLVGGVVKDSRFGYEEIAYLLLFGELPNTEQLERFSQTL AYSRTLPTNFVRDVVMKAPSKDMMSCLARSILTLSSYDDMASDIFVPNVLRQSLMLISVM PMLAVYSYHAFNHYERDESMYIHRPDENLSTAENLLRLLRPDKNYSKVEAHVLDLALILH MEHGGGNNSTFTTHVVTSSGTDTYSVVAAALASLKGPKHGGANIKVVEMMEDLRKEVHDL KDEEEVEFYLRRLLHKEAFDKKGLIYGMGHAVYSKSDPRAEIFKGFVHQLSEEKGRMDDF NLYSMIERLAPQVIADERKIYKGVSANVDFYSGFVYSMLDIPKELFTPIFAIARIVGWSA HRIEELINMDKIIRPAYTSVMMEEEYTRLDDR >gi|229784086|gb|GG667649.1| GENE 7 5705 - 6073 466 122 aa, chain + ## HITS:1 COG:no KEGG:Elen_2024 NR:ns ## KEGG: Elen_2024 # Name: not_defined # Def: Hpt protein # 
Organism: E.lenta # Pathway: not_defined # 1 114 1 114 119 70 35.0 2e-11 MTDRTLDILKQASVDVEEALAHFSGNSRLYEKFLIKFLQDDCFKRIEGAVKARDSQEMLT ASHTLKGVAGNLGFVALMDACAEMVRQLRSNNMEGALEAYPGVKETYETVCAAINSMEDM GG >gi|229784086|gb|GG667649.1| GENE 8 6070 - 9855 3406 1261 aa, chain + ## HITS:1 COG:slr2100 KEGG:ns NR:ns ## COG: slr2100 COG3437 # Protein_GI_number: 16330586 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Synechocystis # 3 351 9 366 368 238 36.0 8e-62 MKTRDTLLIVDDMEVNRAILRSLFEHQYNLLEAENGEQALLLLREYEEHVAAMLLDLVMP VKDGYEVMEEMGKSGLVSRIPVIVITSDDTAESEVRAFDLGAADIIMKPFEPHVVKRRVQ NAVELCRHKLHLEELVEEQSASLRESNAIMIDVLSSIIEYRSVESGQHIRRIRMFTRILL EDVAKTYQEYNLDGRTIEIIADASSMHDIGKIAIPDSILNKPGPLTREEFEVMKTHTTKG CEMLSGLERMNDKEYIGYAYNICRYHHERWNGKGYPDGLKGDNIPICAQVVAIADCYDAL TTDRVYKKALPPEQAFNMILNGECGEFSPRLLECFKNVREDFAALSRAYADGLAPTKKPE RKEVRPSSTIDTAGTLEQGQMKYLTLLRYVDSTVMEVDLNTGVYHVVYMPDRDFELLRSG SRFEDSVKAFVQGAVYPEDRDIVLGLLGDYMEEFFSDGLMRRTRKYRVLDQAAGVYYWCY ATLLRVDPGNPNQKKVMLIWHRASDEPLEVPHSGRAEEDLFRETVIGRLAGGVIKCRNDK WYTFADNGKGLHGLTGCTEQVIRDKYHNRYKEVIYPADRDLVSQGLVSQMGEGRIAELEY RLVTEDGRLIWVMDRSILVTEEDGMEYIYTLLVDITQSRKAMEGLQESVERHKIILNQTK DVIFEWNVTTDEISYSSNWECKYGYQPIIEGIRRRLPYASHINPDDIPDVMKFTNDMMTG RQLGEVEFRLANAEGRYCWCRVRATAQFDPAGNTDKVIGIIEDIDDEKRETEKLIDRAER DVLTKLYNKRAARHKINRFLELSQGREGSAMLIIDVDNFKMVNDRYGHMFGDAVLTELAA RITSHFRDSDIISRIGGDEFMVFMPGTVNHEMVTERAGRLMASLQTMFKNNMKDVPFSCS IGIAYSPEDGSDFQSLFQNSDAALYQAKLEGKNRWMQYRRPAVNAALELKEEADRPEEMR ENTEHRGQGTVARVISMTFQELYDAADFDRAMNSILEITGRMFHVSRVYIFENCTDSDDF CTNTFEWCREGITPQRAKLRNVAYTDQGGDYRDNFNEEGVFYCLDIAELRPELQYMLAEQ GISSMLQCSITAEGVFKGFVGFDECEAGRIWTQEQIDCLVFLSKMLSTFLLKKRAEDKLR ESMQNLEGILNNQNSWIYVIDGETYQLRYINEKTYQIAPETRLGMCCYEAFFKRNEPCEH CPVRECAGKESSILEVYNPVLNVWTLADSSRIRWGDREAFLLACHDISPYKRMEKEGAAN R >gi|229784086|gb|GG667649.1| GENE 9 9912 - 10532 805 206 aa, chain - ## HITS:1 COG:PH1972 KEGG:ns NR:ns ## COG: PH1972 COG1394 # 
Protein_GI_number: 14591709 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Pyrococcus horikoshii # 7 200 9 205 214 108 35.0 5e-24 MDPNTFPTKGNLILAKSSLTLARQGYELMDKKRNILIKELMGLIEEAKGIQSEIDTTFTA AYKALQKANIELGINYVQDIAAAVPVDDTVRIKTRSIMGTEIPLVQHEERPLNLTYAYYS TKESLDEARYHFEKVKELTVKLSMVENSAYRLANSIKRTQKRANALKNITIPRYEALTRN ITNALEEKDREEFTRLKVIKRNKTSA >gi|229784086|gb|GG667649.1| GENE 10 10544 - 11947 1788 467 aa, chain - ## HITS:1 COG:PH1974 KEGG:ns NR:ns ## COG: PH1974 COG1156 # Protein_GI_number: 14591711 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Pyrococcus horikoshii # 1 457 4 457 465 550 58.0 1e-156 MAIEYLGLSAINGPMVVLEGVQDAAFDEIVEMTVEGREKKIGRIIEVYEDKAIIQVFEGT EGLALRNVHTRLTGRPMELAVSEDMLGRTFNGVGEPIDGLGDISSDIRLDVNGKPLNPVT REYPRDYIRTGISAIDGLMTLIRGQKLPIFSGNGLPHDDLAAQIVKQASLGDDAQSSEKF AVVFAAMGVKYDVADFFRRTFEESGVSDHVAMFINLANDPVVERLITPKVALTLAEYLAF EKGMHILVILTDMTAYAEALREVSSSKGEIPSRKGYPGYLYSELAAIYERAGIVKGRHGS VTQIPILTMPNDDITHPIPDLTGYITEGQIVLDRNLYGQSIYPPINVLPSLSRLMKDGIG EGFTRADHQGVANQLFSCYAKVGDARALASVIGEDELSPIDKKYLVFGKEFERRFVGQGN HANRNIIETLTIGWELLGLLPRAELDRIDTKVLDQYYKPTTAEASEE >gi|229784086|gb|GG667649.1| GENE 11 11962 - 13728 1883 588 aa, chain - ## HITS:1 COG:MK1017 KEGG:ns NR:ns ## COG: MK1017 COG1155 # Protein_GI_number: 20094453 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Methanopyrus kandleri AV19 # 1 585 1 589 592 646 54.0 0 MNNTGTISGINGPVIYLKGDPGFKMNEMVYVGRDNLVGEVIGLTSGRTIVQVYEETSGLK PGETVTASGSPVSVTLAPGILNNIFDGIERPLTEIAKTGGAYIDRGVHVDSLDGEKKWNT HITVKKGDRLFPGSIIAEVPETPAIVHKVMIPPDMEGYVLDVVSDGSYTIHDEILTLQLP DGSEKKLTMTQKWPIRVPRPTLKRYPASRPLITGQRIIDTLFPLAKGGTACIPGGFGTGK TMTQHQIAKWSDADIIIYIGCGERGNEMTQVLEEFSELIDPRSGNPLMDRTTLIANTSNM PVAAREASLYSGLTLAEYYRDMGYNVAIMADSTSRWAEALRELSGRLEEMPAEEGFPAYL ASRLSAFYERAGMMQNMNGTEGSVTIIGAVSPQGGDFSEPVTQNTKRFVRCFWGLDRNLA 
NERHFPAIHWLSSYSEYLTDLAPWYVEHVDKKFVDYRNRLVFLLTQESSLMEIVKLIGGD MLPDDQKLILEIAKVIRIGFLQQNAFHKDDTCVPMEKQFKMMDLILYLYKKSRSLVSMGM PMSVLKEDPIFDKIISIKYDVPNDRLDMFDDYRKQIDAFYESVIERNA >gi|229784086|gb|GG667649.1| GENE 12 13721 - 13873 82 50 aa, chain - ## HITS:1 COG:no KEGG:Closa_4197 NR:ns ## KEGG: Closa_4197 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: Oxidative phosphorylation [PATH:csh00190]; Methane metabolism [PATH:csh00680]; Metabolic pathways [PATH:csh01100] # 1 50 151 200 200 79 80.0 3e-14 MSEYSFGGGTRAVIPSRHILIDNSFQTKLEEARRDFRFNLKDLEGGITNE >gi|229784086|gb|GG667649.1| GENE 13 14895 - 15257 510 120 aa, chain - ## HITS:1 COG:no KEGG:Closa_4197 NR:ns ## KEGG: Closa_4197 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: Oxidative phosphorylation [PATH:csh00190]; Methane metabolism [PATH:csh00680]; Metabolic pathways [PATH:csh01100] # 1 118 1 118 200 142 75.0 5e-33 MTTEEKLQHFLDFCMEDARNRSAKMLDEYTAALEQTFEEHQADARRRAEQQVELESERIE HETNKKLSLEQIGIKRELGHKQEELKDKIFVELKDVLEQYMETPEYTQLLEKQIRHAREF >gi|229784086|gb|GG667649.1| GENE 14 15274 - 15582 366 102 aa, chain - ## HITS:1 COG:no KEGG:Closa_4196 NR:ns ## KEGG: Closa_4196 # Name: not_defined # Def: Vacuolar H+transporting two-sector ATPase F subunit # Organism: C.saccharolyticum # Pathway: Oxidative phosphorylation [PATH:csh00190]; Methane metabolism [PATH:csh00680]; Metabolic pathways [PATH:csh01100] # 1 102 1 102 102 187 94.0 2e-46 MKMYLISDNIDTWTGMRLAGVEGAVVHEKAELKQELDKVLADKEIGIVLLTEKFGREFPD IIDDVKLNRKLPLIIEIPDRHGTGRKPNFITDYVNEAIGLKL >gi|229784086|gb|GG667649.1| GENE 15 15586 - 16014 427 142 aa, chain - ## HITS:1 COG:no KEGG:Closa_4195 NR:ns ## KEGG: Closa_4195 # Name: not_defined # Def: H+transporting two-sector ATPase C subunit # Organism: C.saccharolyticum # Pathway: Oxidative phosphorylation [PATH:csh00190]; Methane metabolism [PATH:csh00680]; Metabolic pathways [PATH:csh01100] # 1 142 1 143 143 134 88.0 2e-30 
MSTLVKITLAAALILSIAVPFGAFAMGEKSKGRYKKALGANALLFFGTLMVSAVFMFSGS AVAAEPAAAASGVAGMGYLAAALSTGLACIGGGIAVSAAASAALGAISEDGSILGKSLIF VGLAEGVCLYGLIISFMILGKL >gi|229784086|gb|GG667649.1| GENE 16 16042 - 17970 2129 642 aa, chain - ## HITS:1 COG:SSO0559 KEGG:ns NR:ns ## COG: SSO0559 COG1269 # Protein_GI_number: 15897481 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Sulfolobus solfataricus # 3 637 5 693 701 137 24.0 5e-32 MIEKMKFLSITGPKEDIDRVIDTYLSKYEIHLENALSELKTVKDLRPYIEINPYRDTYQR AAELVELLPATAPKTTATAVSIEEAIGTIEKLDATVKELTEKEAVLTGERDSLQESLDKV VPFTGLNYDLSSILHFKYIKFRFGRISHEYYKKFESYVYDSIDTVLYKCREDEEYVWVVY FVPETISNQIDAIYSSMHFERFFLPDEYEGTPLDAIHTLEDKISALQADIDSIHRQMAEV LGSQRDKLLASHDKLSTFSTNFNVRKLAACTKQNVNTFYILCGWMSEKEAKAFQKEISND EKTFCIVEDDHNNIISRPPTKLHNPKLFKPFEMFIQMYGLPAYNEIDPTILIGITYSFLF GFMFGDVGQGLCLLLGGFLLYRFKKMNLAAIISCCGFFSTIFGFMFGSVFGFEDIIGAVW LRPMEHMTNLPFIGKLNTVFIVAVSIGMGIILLTMVLNIINSLRDHDPEKTWFDTNGVAG LVFYASLVLTIVLYMTGNPVPATILLVIMFGIPLLIMFFKEPLTHILEKKAQIMPAGKGM FVVQGFFELFEVLLSYFSNTLSFVRVGAFAVSHAAMMEVVLMLSGAEAGSPNWAIVVLGN LFVCGMEGLIVGIQVLRLEYYELFSRFYRGTGRAFKPYGKTN >gi|229784086|gb|GG667649.1| GENE 17 17986 - 19035 976 349 aa, chain - ## HITS:1 COG:MTH957 KEGG:ns NR:ns ## COG: MTH957 COG1527 # Protein_GI_number: 15678975 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit C # Organism: Methanothermobacter thermautotrophicus # 4 342 46 380 385 67 21.0 4e-11 MGGLLSYSGITTKVRAMESHLITDRQFGEMSTLESVSDAVEYLRRLPAYGMIFSNLEGVD LHRGAIEQRLILSQYQDFAKLYRFANLSQRKFLDLYFMHYEIDILKKCFRNALGHQKPEI DLSVFQEFFEKHSKLDLIKLSSSGSVQEFISNLDGSIYYDLLTHLDDVEHPSLFDYEIHL DLLYFKTIWKVKGKYLSRTEQKLLTDCFGSKLDLLNIQWIYRSKKYYNLQPADIYALLIP IEFHIKKEQITKLVEAGTLDEFFHALQTTYYGRLEDMDASDQTELEMLAEEALNKIYSAT SRKNPYSIATLNSYLYFKEEEIQKIITVIESIRYGVSPDEILSYVVKNQ >gi|229784086|gb|GG667649.1| GENE 18 19040 - 19351 391 103 aa, chain - ## HITS:1 COG:no KEGG:Closa_4192 NR:ns ## KEGG: Closa_4192 # Name: not_defined # Def: 
hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 103 1 103 103 90 80.0 2e-17 MDTVIEKISEIESAATGIMEDANERKKAFAKEMEERTAAFDAQLEAETNKRIEELRARME IEMNERLEKQRNDSQNVLRAMEQRYQEHHTEYVEELFKKMIKE >gi|229784086|gb|GG667649.1| GENE 19 19817 - 20797 438 326 aa, chain - ## HITS:1 COG:no KEGG:Closa_4191 NR:ns ## KEGG: Closa_4191 # Name: not_defined # Def: SH3 type 3 domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 318 1 363 377 278 52.0 2e-73 MGTTDNNTWTRIQQSVRETERLIGQKQYNLAMIKSRQTLEFMVNCLGEKALIVDGDLADS IDQLFEGHWISQATKDHYHRIRVLGNKAVHDGNDSPYDANEAFQLLSQEATAFADIYSGR RRSTTPTRPQQRPASRSSQPAQRSTGQRSSSPQRNGQRTAQRPSSANRSRRRSKKKGFDP YDLLRPAVIFLIILVVVLAAMGLFKLFGGKDDKKETSAPSTSPEVTTEATAAPEPATEPE PATEAPMIYKTTATPNLNVRAEPSTTGAVLGRLAPGTVVDFVQTYDQQWSVIMFEGKQAY VSSQYLAAEEAPQPSDAGTEATTAAQ >gi|229784086|gb|GG667649.1| GENE 20 21025 - 21672 649 215 aa, chain + ## HITS:1 COG:BS_ytmQ KEGG:ns NR:ns ## COG: BS_ytmQ COG0220 # Protein_GI_number: 16080042 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Bacillus subtilis # 1 210 1 210 213 229 51.0 4e-60 MRLRHIPGSEEEIANSPYVVSNPEEKKGRWNGVFGNDNPIEIEVGMGKGRFIMELAAKNP EINYVGIERYCSVLLRGIQKRREMELSNIYFMCVDARELSEIFAPGEVKKIYLNFSDPWP KDRHAKRRLTSEPFMEVYDKILASDGVVEFKTDNRGLFDYSLESIPAAGWEIREYTYDLH HSDMAGGNIMTEYEEKFSSMGNAIFKLVAERKRNE >gi|229784086|gb|GG667649.1| GENE 21 21741 - 21911 201 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870572|ref|ZP_06114598.2| ## NR: gi|288870572|ref|ZP_06114598.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 56 1 56 56 67 100.0 4e-10 MTGAVHMGMKDKVTVKLLQATSDKTELNKKALQWKKSFVEVNKILGTSSPKKKKNK >gi|229784086|gb|GG667649.1| GENE 22 22267 - 23262 975 331 aa, chain - ## HITS:1 COG:CAC3037 KEGG:ns NR:ns ## COG: CAC3037 COG1609 # Protein_GI_number: 15896288 # Func_class: K Transcription # Function: Transcriptional regulators # 
Organism: Clostridium acetobutylicum # 2 330 3 332 334 149 29.0 6e-36 MATLKDVARLACVDVSTVSRALNNTSYVHPDTKARIFAAVKELSYQPNVLAQGLRQGKRH TIGVVVPRLHLTLFADITQSIESESRKLGYATLICHTEDDPLIEKECLNRLRNGFVDGII IAGTGRNGHLLRDIHAGGIAVTQIVRKQENSISSVIADYDACGYDTVKYLYSKGCREIGL INGTTSLAPYRERYNGYHRAMEELELTETCATSPMPGNNFEYGYQCTEDLLNQNPHLDAI MAAVDIQGMAAIRALKDRKIRVPEQIRLVSLTSHSIGGLLETTMTSLEIPAHEMGEKATH MVIDEIEAPSDSKPSVQHLVFSASLVERESS >gi|229784086|gb|GG667649.1| GENE 23 23334 - 23423 64 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKAKYKQNAKELQEKHLGTDVLIHISLFY >gi|229784086|gb|GG667649.1| GENE 24 23710 - 24681 1034 323 aa, chain + ## HITS:1 COG:mll1010 KEGG:ns NR:ns ## COG: mll1010 COG0329 # Protein_GI_number: 13471125 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Mesorhizobium loti # 10 314 13 311 321 127 28.0 3e-29 MAKVKNFKTIFPAVSVPLNDDYTINEPEFRAYLRWMKTFYNQGIQGLVCNGHTGEITGLS RAERKRVVEICSEECGDIMTIISGVNCENTIESIEMAKEAKEAGADGILLMPPHMWLRFG MHADAPFEYVKDVAEGADIDIIIHLYPATSKAFYPVETLIKMCREIDHVKCIKMGTRVTS IYEHDVRLLRQECPDISLITCHDETLCVSWFPGMDGALIGFAGCVPEIICPAREVFANPD KHTLKEAQDWSDRIYHISQAIYGGGQPSGEAHARLKEALVQRGIFSSGLMRKPVLPLNQE EKDWIALGLKNSGMEKVDMSQFQ >gi|229784086|gb|GG667649.1| GENE 25 24779 - 26026 1420 415 aa, chain + ## HITS:1 COG:BMEI0569 KEGG:ns NR:ns ## COG: BMEI0569 COG1914 # Protein_GI_number: 17986852 # Func_class: P Inorganic ion transport and metabolism # Function: Mn2+ and Fe2+ transporters of the NRAMP family # Organism: Brucella melitensis # 13 367 42 405 456 92 27.0 1e-18 MGEAFDQTKVKKASFLELLKRIGPGIILAGVVIGPGNITTSAMLGSNYGYSMIWLVVPIA FMGITFMLTTYRISMLTGMPVIHAIRHYYGKGAAVIVGMATFFACLFFTMGNITGSGAGM NLIFGMNWKLGSLIMIAIILVCYFTKGVYSKVEKLITLCILGMIVCFYATLAATGGPAWG ELGKGLTHWTVAEGSLTTALAYISTNAAITAGIYGTYLGVEKKWKKEDLFNGIMLADAIA HVAAVILISGAIVLVGAIVLHPTGTAIKAPAQLGDLLIPFLGKAAPVVMGIALLAAGFSS LLGNTQRGMVLLAAGIDKPVGLETKFIRWGCLLCIAFAAVICYSYGGSPTQLIFIANIAT 
AVATPFGGFFVCRLIFRRDVNAGLKTPRVLQICMVISYAFALIMTGSALWKMFFL >gi|229784086|gb|GG667649.1| GENE 26 26180 - 27490 1225 436 aa, chain - ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 189 432 191 432 433 100 28.0 8e-21 MMVYYQLVLDTILSCAETLILFQMYRAFLRRNSRPNKYYYLGLFSYFVFQLVTYAGRWPL FSAWIYYLLFTLLLACLFFADTLQIKILVTYLFVTLHYSCKLACSTFFMALKHADLPSVP SHLIQSPLSQIAACIVFILFTWLFIYFRSMRRHNKYNLYSAITYLAPAGILFIVIHQFYL RTAGQTAPFYLSESGILMCTSFALFYLIDKTEMIDEASERSLMASKLLEHQKDYYKSVEK SQHEVAAMRHDLKNHLHCIASFIELEQYHDALHYIEEIYANSRHLSSTVNIGNNLISILL NDAKERASQNNIRMTVNVMVPPDLPIDNVDLCVILGNLLDNAREACCRMEGNEEDRFIEV EIVFRKSFLIIKVTNSFNGQYILKENRYESVKKDQHFCGIGLSNVRTTVEKYDGEMKVTP NEKEFVVTVMLMLLGN >gi|229784086|gb|GG667649.1| GENE 27 27487 - 28356 500 289 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 53 285 1 231 234 108 29.0 9e-24 MVYVSLDKSHFNSYDFIVNCLCTKDIFHVRNGTCFVNFDLFARRIGSERRSFMYKVAICD DNPADLQLITDYLTEPEFHYPLELFSFHDGVELAHAYQKEGQFDLIILDMMMDRLNGIET ALKIRKVDQNVAILIVTATVEYAIEGYKINAARYIVKPVSKREFQKITRNIFAFIDKKRG AYYRFPSKSGTTVISMEDIFYFESDIRSIHISSRQGDYTFTGRISAVEEQTEGHGFLRVH KSFIVNLKHVHNIFKDSVTLDNGEVIPLSRHRHREVNQKFLDYMEEQII >gi|229784086|gb|GG667649.1| GENE 28 28378 - 29802 1538 474 aa, chain + ## HITS:1 COG:SA0541 KEGG:ns NR:ns ## COG: SA0541 COG0531 # Protein_GI_number: 15926262 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Staphylococcus aureus N315 # 14 467 13 453 494 152 28.0 2e-36 MAGSKQTAPEGPGMQKTLSIWNYFTIGFGAIIGTGWVLLVGDWMIIGGGPVAAMIAFFIG ALFLLPIGAVFGELTAAIPISGGIVEYVDRTFGKNVSYITGWFLALGNGILCPWEAIAIS 
TLVSDMFGDMLPFLRSVKLYTIMGADVYLFPTIIALGFACYVIFLNFKGASSAAKLQAFL TKALLGGMLLAMAISFVKGGPHNILPAFGQVEGAGSSTSATNMFAGIVSVLVMTPFFYAG FDTIPQQAEEAAEGLDWNKFGKVISMALLAAGGFYMICIYSFGTIIPWSDFVKNSVPALA CLRDINMILYVIMLVIATLGPMGPMNSFFGATSRIMLAMGRKGQLPEKFAEIDPKSGAPK MANTLLACLTIVGPFLGKKMLVPLTNVSALAFIFSCTMVSFACLRMRTTEPDLPRPYKVP GGRLGIGAACLAGSIIIALMVIPMSPAALKPVEWLIVAGWLIVGLILKMFSQKK >gi|229784086|gb|GG667649.1| GENE 29 29832 - 31088 1193 418 aa, chain + ## HITS:1 COG:BH2935 KEGG:ns NR:ns ## COG: BH2935 COG1228 # Protein_GI_number: 15615497 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Bacillus halodurans # 52 412 46 385 394 162 31.0 1e-39 MKKELKDTAFVGGLLIDGTGAEPVKDSLVLVRDGKVAYAGAGKEVGPEYEIVDVSGKTIM PGLIDTHLHFSGNLTDDDSDWVLESVEQKAVVAVQQAHECLETGLTTVGEIGRNGLAIRD MVEAGIMEGPRVVGTGLGFCRTAGHGDSHKLLKEHNDLGHPWAERIDGPWDLRKAVRRRL RDNPDAIKIWATGGGIWRWDQKLDQHYCMEEIQAVVDECNMVGIPVWAHCEGFGGALDCA KAGVHLIIHGQALNDECLDIMAEKGIYFCPTIQFLHEWFKTYPPTYVPEIHDQFEGGDVV EKELNRVYANLRRAKEKGIGLTIGSDSFCSSLTPYGYTAMGEMYSFVECAGISEMDTIVA ATKVGAEMLKVDDVTGTLEEGKSADLLVLDGNPLENIRNICVENMKVIMKEGRFVKRH >gi|229784086|gb|GG667649.1| GENE 30 31537 - 31975 358 146 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01420 NR:ns ## KEGG: EUBELI_01420 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 5 144 12 153 321 100 36.0 2e-20 MGKCDVALAQYFEDEYRYADLINAYIFNGRQIVKAEDIVSGNPVINGLLGRFKEWVTIQK YRDAVRKIIFGMNFIIVGLEHQNLIHYGMPVRIMLEDAAGYDEQLRTLQRQNRRLKKLSS KEFLGGIRKEDRLKAVFTIVLYYGTE Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:53:17 2011 Seq name: gi|229784085|gb|GG667650.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld43, whole genome shotgun sequence Length of sequence - 28290 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 12, operones - 8 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 1246 352 ## COG3344 Retron-type reverse transcriptase
2 2 Op 1 . - CDS 2013 - 2339 464 ## Sgly_0064 type II secretion system protein E
3 2 Op 2 . - CDS 2323 - 3126 775 ## HM1_0586 hypothetical protein
4 2 Op 3 . - CDS 3152 - 3973 531 ## CKR_3434 hypothetical protein
5 2 Op 4 . - CDS 3993 - 4511 389 ## gi|266621677|ref|ZP_06114612.1| putative prepilin peptidase
6 2 Op 5 . - CDS 4459 - 6165 892 ## DSY4654 hypothetical protein
- Prom 6185 - 6244 3.2
- Term 6195 - 6232 6.0
7 3 Op 1 . - CDS 6248 - 6826 310 ## gi|266621679|ref|ZP_06114614.1| conserved hypothetical protein
8 3 Op 2 . - CDS 6898 - 7398 467 ## gi|288870577|ref|ZP_06409810.1| hypothetical protein CLOSTHATH_02845
9 3 Op 3 . - CDS 7407 - 7850 284 ## gi|288870578|ref|ZP_06409811.1| hypothetical protein CLOSTHATH_02846
- Prom 7925 - 7984 4.6
- Term 7870 - 7907 3.1
10 4 Op 1 . - CDS 8019 - 9476 273 ## PROTEIN SUPPORTED gi|20808441|ref|NP_623612.1| ribosomal protein S1
11 4 Op 2 . - CDS 9499 - 11400 1260 ## COG3505 Type IV secretory pathway, VirD4 components
12 4 Op 3 . - CDS 11424 - 12401 628 ## gi|288870579|ref|ZP_06409812.1| hypothetical protein CLOSTHATH_02849
- Prom 12616 - 12675 2.3
- Term 12485 - 12517 -0.1
13 5 Op 1 . - CDS 12688 - 13350 453 ## gi|266621685|ref|ZP_06114620.1| hypothetical protein CLOSTHATH_02851
14 5 Op 2 . - CDS 13351 - 13896 331 ## gi|266621686|ref|ZP_06114621.1| conserved hypothetical protein
- Prom 13954 - 14013 2.1
- Term 14012 - 14048 -0.4
15 6 Tu 1 . - CDS 14058 - 14693 273 ## COG1533 DNA repair photolyase
- Prom 14909 - 14968 4.5
- Term 15054 - 15098 8.6
16 7 Op 1 . - CDS 15101 - 18202 1878 ## COG3451 Type IV secretory pathway, VirB4 components
17 7 Op 2 . - CDS 18234 - 19610 708 ## Closa_3527 hypothetical protein
18 7 Op 3 . - CDS 19612 - 20688 664 ## gi|266621690|ref|ZP_06114625.1| conserved hypothetical protein
19 7 Op 4 . - CDS 20704 - 21414 702 ## gi|266621691|ref|ZP_06114626.1| conserved hypothetical protein
20 7 Op 5 .
- CDS 21455 - 21853 221 ## gi|288870583|ref|ZP_06114627.2| hypothetical fimbrial protein like protein
- Prom 21998 - 22057 4.3
- Term 21882 - 21938 13.0
21 8 Op 1 . - CDS 22085 - 22636 396 ## gi|266621693|ref|ZP_06114628.1| hypothetical protein CLOSTHATH_02860
22 8 Op 2 . - CDS 22639 - 23793 209 ## Closa_1149 hypothetical protein
- Prom 23869 - 23928 4.1
- Term 24085 - 24128 3.2
23 9 Op 1 . - CDS 24321 - 25832 816 ## Ethha_1352 integrase family protein
24 9 Op 2 . - CDS 25841 - 26053 288 ## gi|288870586|ref|ZP_06114631.2| conserved domain protein
- Prom 26094 - 26153 5.6
+ Prom 26070 - 26129 6.6
25 10 Tu 1 . + CDS 26297 - 26998 406 ## Shel_19180 predicted transcriptional regulator
+ Term 27004 - 27059 4.1
- Term 26992 - 27046 -0.8
26 11 Tu 1 . - CDS 27055 - 27252 233 ## gi|266621698|ref|ZP_06114633.1| ribbon-helix-helix protein, CopG family
- Prom 27307 - 27366 5.3
+ Prom 27259 - 27318 9.7
27 12 Op 1 . + CDS 27343 - 27819 435 ## gi|266621699|ref|ZP_06114634.1| conserved hypothetical protein
28 12 Op 2 .
+ CDS 27869 - 28288 113 ## gi|266621700|ref|ZP_06114635.1| conserved hypothetical protein Predicted protein(s) >gi|229784085|gb|GG667650.1| GENE 1 1 - 1246 352 415 aa, chain - ## HITS:1 COG:BH0224 KEGG:ns NR:ns ## COG: BH0224 COG3344 # Protein_GI_number: 15612787 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Bacillus halodurans # 29 387 1 326 418 177 34.0 3e-44 MRNNEYYDTQAIYDELYSKAKENRRFTNLYDLIVSRENILLAYRNIKKNKGSKTKGVNAT TILDIGKDNPERLISYVRNRMSNFQPMPVRRVEIPKDNGKVRPLGIPTMEDRLIQQCIKQ VLEPICEAKFYKHSYGFRPNRSTHHAIARVLALTNRHNFQYVVDIDIKGFFDNVNHGKLM KQLWSLGIQDKQLLCIISKMLKTEIKGIGVPQKGVPQGGILSPLLSNVVLNELDWWIASQ WENHKTHHSYKNPTRKYELLRRGNLKEVFIVRYADDFKLLCKDRKTAEKMFAATKQWLEE RLGLEISPEKSKIVNLRKRYSEFLGFKLKLWKKGQKYVVKSHMTKKSKEKCKKELKQKIK VLQHDAIGARVQQFNATVLGLHGYYKIASHISKDFAEIAFHVNKSLYCRTKCIRS >gi|229784085|gb|GG667650.1| GENE 2 2013 - 2339 464 108 aa, chain - ## HITS:1 COG:no KEGG:Sgly_0064 NR:ns ## KEGG: Sgly_0064 # Name: not_defined # Def: type II secretion system protein E # Organism: S.glycolicus # Pathway: not_defined # 1 103 61 158 510 65 31.0 5e-10 MNNLNDIFFAPAANDNLTYDQILEDVQRYFSENHASTIAEAGEENTDKATTVLKELMVQY IVKRKYTADGLTTKELCEKLYEDMAGYSFLNQWIYKPGVEEVNSATRS >gi|229784085|gb|GG667650.1| GENE 3 2323 - 3126 775 267 aa, chain - ## HITS:1 COG:no KEGG:HM1_0586 NR:ns ## KEGG: HM1_0586 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 3 224 28 250 271 124 30.0 5e-27 MAKVITVWGSPGSGKSMFCCILAKALTRDKSKAILINADISVPMLPVWLPEQLAPANASI GQVLSSVEIDTALAASHVTVLKSYPFIGLMGYCAGENPLSYPEMKYDMAMALIAAASRLV DFVILDCSSAMTNVFTPAAIESGDLVIRILTPDLKGIHYLKAHQPLLADSRFRYHEHLTF AGLARPFHAIEEMGHLIGGFDGLLPYGKEVDRCATEGGMFHAISYCNPKYTASLNKVLEM LEPKEEERQQDDFVEERQEERADEQSE >gi|229784085|gb|GG667650.1| GENE 4 3152 - 3973 531 273 aa, chain - ## HITS:1 COG:no KEGG:CKR_3434 NR:ns ## KEGG: CKR_3434 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 3 260 4 266 276 
166 39.0 1e-39 MRLNNRFLFGLLSLLLAAVIAFVALPTIARQTNGKTEIVRVSKPVLKGAIITDKNVELAE VGSYNLPANIARSTEDVVGKYAVSDLMEGDYILSSKISALPLTSDIALNDIPSGRVAISL SVKTLASGLSDKLQAGDIVRIYHFLETAREIPELRFVRVLSVTGADGTNIDNTAEPAEDE EKQQSATVTVMATPEQARIITELENDGVAHVALISRNNEELAAELLAEQDQMLSDLESPA LPEEMSDSGHVFSTSSSEAPEASTTGEEPGQNP >gi|229784085|gb|GG667650.1| GENE 5 3993 - 4511 389 172 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621677|ref|ZP_06114612.1| ## NR: gi|266621677|ref|ZP_06114612.1| putative prepilin peptidase [Clostridium hathewayi DSM 13479] putative prepilin peptidase [Clostridium hathewayi DSM 13479] # 9 172 1 164 164 233 100.0 4e-60 MYPYGGVFMMTGILTGNNIPVLLYETLVFCAITGFAIYDLFHKRVPDRALVLFCPAALAA PFLLNAGSFNWQMVSEAWLASLAGAAAGFGVLLTAALVSKDGTGVGGGDIKLAGIMGFIY GVNRMTAILLTAAALAAIAVFLTRRNQLNENLAIPFVPFLAMGNLVVAAALK >gi|229784085|gb|GG667650.1| GENE 6 4459 - 6165 892 568 aa, chain - ## HITS:1 COG:no KEGG:DSY4654 NR:ns ## KEGG: DSY4654 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 19 564 20 558 564 513 49.0 1e-144 MKQLFAILCSLIMMLVLVPASALAAGEGNIDQGGGNMGQGTSQNKWSPGNDGVRITVVDA ETGIPVSGSVDFSNRPQPATIYYFGTNNKIEYRNGTPLAMESAVPYQCRQPAYSMPPIVN SKSRPTSIEIIKRYFCSEYACQMVAEATGVNYEKMLSGEYKILLEPIAYVTFNGMNYCFT ATEAALYDQLSGGALRQRLPTVAFQNLPLAMFLEYDDIGFFAWTGPKTGVQSNADIINYL GLGLVWFDEGPEEPENPGEDIEAPDMEYRVNTDVITAVTLRSDRDLTPDNPASVTFHIMG RTYRIQDIVIPAGDSQVVWVKWRTPGTPQTVPITISVRRAYTAQDTFTAKIVDLNEKVPP DPLATDTNPGYSVPSLPDAPQKRTANWGVWRCYWVPVWDWCDHDDGGHWVDNGYWEYDYT GYSASITGSMSLMPDDLVPTASGKNMKSGYGVKTDVRATLSTSAPSSHITHPQTAFSVFP EFDYGTYLRLLQRTSGGRSAKFTFRPNAFSTYNRSVHFTPLWFPDHTQYSVYTQVWDTWT PDGMLSINLNDHVSIRGSLYDDWYTNRE >gi|229784085|gb|GG667650.1| GENE 7 6248 - 6826 310 192 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621679|ref|ZP_06114614.1| ## NR: gi|266621679|ref|ZP_06114614.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 192 1 192 192 277 100.0 3e-73 
MKKFFKSRGFLITMLSVCCVAILGICWAVNRDKNSQFTADEPPPSTASQEWVESSSETAP ELEETTRAVPETTLSATTAPAETTTAEYPKIAEKTEKDVAVEFSPTEKPTETPPAPEGKT IIEDPGPEHPVNPAPDVTAPAPEAESAPAPGSTDGNGAYYDPVFGWVTPAEVIQSTIDSD GDPDKMVGNMGD >gi|229784085|gb|GG667650.1| GENE 8 6898 - 7398 467 166 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870577|ref|ZP_06409810.1| ## NR: gi|288870577|ref|ZP_06409810.1| hypothetical protein CLOSTHATH_02845 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02845 [Clostridium hathewayi DSM 13479] # 1 166 5 170 170 295 100.0 8e-79 MKYIYRRLRGRLCNEHGEIAIYSCFFVVGIVMLISFLLLYASIRINCINIRNGAKMELNN LSASIYADTYRSQRETNFSEYLSTLYSSSSYTEMLEDTVREGLSTKVPLSTDDYKIKNIT LDFHVTGDRVKYTFYCDAEFYVKMFGRSYPVITQGIELTGYHNTKF >gi|229784085|gb|GG667650.1| GENE 9 7407 - 7850 284 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870578|ref|ZP_06409811.1| ## NR: gi|288870578|ref|ZP_06409811.1| hypothetical protein CLOSTHATH_02846 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02846 [Clostridium hathewayi DSM 13479] # 1 147 6 152 152 290 99.0 2e-77 MKRIRVILGSQRAEASYFSTMVFIFIAVLLLAFIIDLFSIISTKQELDYCADEMVKQIQL SGGINSETDEMFEFLCSQIEGAENITYSVDATYHSPTPPGMSYGIQLGTPFYITIEGSAK LGGFWNLRLVRINIVARGSGVSEHYWK >gi|229784085|gb|GG667650.1| GENE 10 8019 - 9476 273 485 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|20808441|ref|NP_623612.1| ribosomal protein S1 [Thermoanaerobacter tengcongensis MB4] # 207 483 5 255 257 109 27 2e-23 MAAKKKELLEQEPAVIGTENSEAADSGAILLEEMADCTDAPAAAGEVEPELDELLSDMGP DLTGNAIPGSSDDELSCETADWEDSEVEVTGINDGASDPPPSASPKRAARKKSDTRSTKA VEQSDAMEGSVTSSFITEANGGETEALTAEMTEFTEDLAEGREPASASSPPASRRASRGR PAAEAPVLTIEARSEVETAEEREDVIWHEIHNAYRTRRILMGQLCGIEQTDSGKTIAIAD YKGFRIVIPLKEMMIQVGRSPSGQAYADLMLRQNKILNNMLGAHIDFVVKGIDSKTRSVV ASRQEAMLKKRQIFYLDSDSAGMYRIYEGRVVQARIIAVAEKVIRVEIFGVECSIMARDL AWDWVGDAHERFSVGDEVLVRVLGVRRDSLEEISVKADIKSVTQNNNSENLKKCRIQSKY AGKVTDVHKGVVYVRLSNGVNAVAHSCYDQRMPGKKDDVSFAVTRIDEERGVAVGIITRI IRQNL >gi|229784085|gb|GG667650.1| GENE 11 9499 - 11400 1260 633 aa, chain - ## HITS:1 
COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 75 572 87 574 591 446 46.0 1e-125 MIAAPLAVLSILYAGGYLAQFIGNYAVWQQGGGVPGDGTSPVMPTPDFFTCLTLVFRPPY GVYGVLICIGLFAILLVMVMRLGYSEDGEYDKDRNFTYSTKGTYGTSGWMNREEMDGVLE LVSDLRGHPGTILGMLNNKAVCVPKETRLNRNLAVYGASGSMKTRSFCMNRILQSVACGE SLIICDPKSELYEKSSEYLRDKGYIVRVFNLVSAENSDSWNCLCEIDGQELMAQLFVDVI IKNTTTNGKGDHFWDSAEMNLLKALVLYVDRSYAPENKNIGQVYQLLTLNAESDLNNLFD TLPSTHPAKAPFSLFKQASDSVRSGVIIGLGSRLQVFQSEIIKKITARDEIDLELPGQKP CAYFCITSDQDSTFDFLSSLFLSFVFIKLVRFADKNCEGGRLPVPVHVLGEELCACGVIP DLSRKISVIRSRNISMSCVFQNLAGLQNRYPLNQWQEILGNCDIQYFLGCTDELTAEFIS SRTGLASVAVSSKSKQLGTWRISNYTPEYRETSGVGKRPVLTPDEVLRLPIDQALVIIRG KKALKVNKFDYSKHPEYPKLQSCKASAHIPEWRRLEQEAQDMAKAKPKTPKKAAARKPKE TRAQTAAKEAAPAEPEVTFTADIITSDKDSILS >gi|229784085|gb|GG667650.1| GENE 12 11424 - 12401 628 325 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870579|ref|ZP_06409812.1| ## NR: gi|288870579|ref|ZP_06409812.1| hypothetical protein CLOSTHATH_02849 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02849 [Clostridium hathewayi DSM 13479] # 15 325 1 311 311 643 99.0 0 MKREGLIRTFYRDHLRTYRLTAQAKAQLMAEQPGRFSFYLEGNSDTNLLKGEFTRRLRLY QIAVVYVNMKLAGVHIFRDIKPEVFSPSGSPVSHIGEPAFYSSREVKDLGMEAAKIRGSR MAGVLLSPSGIFSVYNNGDSLTRWENRSELRVKALLQMELCQRRLHSQYQMDDVKALLFG DSMELAYEILVAGREKKQNYFILDGNYEHFYYLTNNPHGETLLRLLCNPAMTAELDRILG QGFGERIPDWHIENDAIDRSGEPVLFGYFFDLPRIARFNAALSLQGKGGTLVCFDFQCDV LRRYCIPAMRFQTIDFTKFNRRFFP >gi|229784085|gb|GG667650.1| GENE 13 12688 - 13350 453 220 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621685|ref|ZP_06114620.1| ## NR: gi|266621685|ref|ZP_06114620.1| hypothetical protein CLOSTHATH_02851 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02851 [Clostridium hathewayi DSM 13479] # 1 220 1 220 220 456 100.0 1e-127 
MERKHLSQQIGQILDDLARLSNTLYAMGTADIQRYPDNYEVLSTDAALRAEKIACELRHL IFSTGGIKKPEYHGLACEVHGVEILYEDEILEVTLPSLLPKRRNRKSVEFLLDPLHFYLS QYAGQNTLPKYRECVVCFSHTYSMELPARRVHDYDNMELKQILDVLASYIMVDDTGLLCD AYNTTEFGEKDCTRIFVIPKNRFPAWLAKREKGLKNISDF >gi|229784085|gb|GG667650.1| GENE 14 13351 - 13896 331 181 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621686|ref|ZP_06114621.1| ## NR: gi|266621686|ref|ZP_06114621.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 181 1 181 181 374 100.0 1e-102 MIQIKTRAELYGQEATELLRLIAMYPGLSETQLCRFYPDKADKVKNLLAHLQRQARIVPT NVGRYFPAGTTDFTIDSGLNRAVWVLLDFIDRVEYHSSGEFPVKITFFLSGEIYEIIYAA AGQEVLISHILRQHGIHDGRRIVLVDSHEQLHQLDFPGITGFCTAELSGEVQYYKKTNGG M >gi|229784085|gb|GG667650.1| GENE 15 14058 - 14693 273 211 aa, chain - ## HITS:1 COG:CC1330 KEGG:ns NR:ns ## COG: CC1330 COG1533 # Protein_GI_number: 16125579 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Caulobacter vibrioides # 6 122 127 245 360 68 35.0 7e-12 MNQLSPGTIIRLGGMTDCFQPCEREQRVTYHTILEMNRRRIGYLIVTKSALIAEPEYMEI LDRSLAHIQITVTCLDNKKAFTYEKASPPSQRILAIQKLQESGFDVSIRLSPVIEEFMDF DMLNSLGIERCVIEFLRINTWIRQWFKEVDFTKYSLQHGGYRHLPLDVKLKILDKIHIPE KTICEDVTEHYEYWRQHVNPNREDCCNLRIL >gi|229784085|gb|GG667650.1| GENE 16 15101 - 18202 1878 1033 aa, chain - ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 573 994 421 834 853 96 24.0 2e-19 MNNQNDHDTYIIPPNFVETGTFFGGMFKARNVIEAGILAFAIGLPVFLFLPFGLTARIIV LCLTALPLALLALIGISGESLSSFLMIFLKFLKNRRVVGKETSNDTVSPKSEKQKLEKQK RVRAAKSMKDHRTAKAQKSAKGSMEKKRRTLADSDFPAEFDEIREYKIKQKLRPCKGVKR KRTVKSKPDKQKHHGKQRPHLKKAARTQEPAFSCLNPVADYFPITKIENGIIYTKDHRYI KVVEVVPINFMLRSAREQRSIIYSFISYLKISPVKLQFKVLTRRADISRHMDAVRCEMAT 
ETNEQCRLMQEDYLKFVQQVGSREAVTRRFFLIFEYEPWSNTKRSEEEGEAIGCLQSAVR TASNYLRQCGNEVIIPDSEDEFTVDILYNLLCRNESAVKPLSERVQEVIAKYMENGRESD VDRIPANEFFAPSSIDFTRSRSVCIDGLYYSYLLIPSDGFKPQVPAGWLSLIVNAGDGID LDMFLTRQPKERIVQKVGQQLRINRSKLKDASDTNTDFDSIDGAIRSGYFLKEGLSGNED FYFLNLLITITAHSEDDLEWKVSEMKKLLLSQDMRVNTCHFREEQAFLTSLPLAAMEKKL YEQSKRNLLTEGAASCYPFTSYEMCDDNGILLGVNKYNSSLIIADIFNSAVYKNANIAIL GTSGAGKTFTMQLMALRMRRKGIQVFILAPLKAHEFHRACANVGGEFIQISPASPHCINI MEIRRIDHTVNDLLDGPSIHLSELAAKIQRLHIFFSLLIPDMTHEERQLLDDALIRTYNA RGITHDNASLEDPKRPGNYREMPILGELYELLKQSQDTRRLANIINRLVNGSARSFNQPT NVSLDNKYTVLDISSLTGDMLAVGMFVALDFVWDRAKENRTEEKAIFIDECWQLLSGAGA TGTRLAGDFILEIFKTIRGYGGSAVCASQDLNDFFNLDEGRFGKGIINNSKTKIILNLED DEAMRVQESLHLSDAETMEVTHFERGHGLISTNNNNIMVEFKASHLEKELITTDRRELKE LVDRKRKENLILQ >gi|229784085|gb|GG667650.1| GENE 17 18234 - 19610 708 458 aa, chain - ## HITS:1 COG:no KEGG:Closa_3527 NR:ns ## KEGG: Closa_3527 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 285 458 100 273 274 210 59.0 1e-52 MAEEQKNGLSEAVETGAPAAQAIHGAIKTGKAVSAAAKGAAVGGPYGAAVGAVLAGRRHI GKIVAAIAVLVMLPVLFVLMLPGLLFGGLTGSGSSGTDGLPVLNDEQAVIENANDITFSI NQILSEGLDDVMARIDADFAASDGDEMEVHNPYASAPVYNANLFISQYCAARNSDFRTIS LSDMEAVLRRNKSYFYSFTRKEEFRDREETDPDTNEKSTVTEKWMVYTISYNGEAHFADN VFQLTDEQKELAADYASNLSLFLGDGMLQNLSAWNGNSIPSLGDIRFADGGTPVVYFNQM DERYAGKPYGTDHVGSYGCGPSAMAIVVSSLTDEIVDPAKMADWAYKNGFWCKGSGSYHA LVPGAAKAWGLPVSGCSASEPQRIIDALSEGKLVVAIMAKGHFTSSGHFIVLRGVQDGKI MVADSGSYKRSNQLWDLSIILNEASRRAAAGGPFWIIG >gi|229784085|gb|GG667650.1| GENE 18 19612 - 20688 664 358 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621690|ref|ZP_06114625.1| ## NR: gi|266621690|ref|ZP_06114625.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 358 1 358 358 562 100.0 1e-158 MKKEYARKGFLWFITSGKASNFNDAAMCPAVCKFLVGVLAVALLVCLSCQPAFAAISESD VQAQVDSAGKEAVAGNVLIWFLCAVSFLKISQKIDSFMASLGVNVGHTGGSMLAEAMIAT KGLSTVRNASSRHFGGGRSSSTHTNSVRGSKSGAAGAMGGFMSGGLAGVVSRSIQNGAAR 
AATGSSETGTPGGIGGVGKDSGGIGGQIFASSVSKGGNFANSVISSIATGSIAASGSMTG EKAAQALQSYMGYAALGEGAEHIPSFSDVEIGGGRITGTEVSEEYPEGAAFGMYHTAQYT APSGSYTTIQTADGASWYKQYAADAVEKSPYMASDGSIAYNESIVKKLPPPPQRKDRI >gi|229784085|gb|GG667650.1| GENE 19 20704 - 21414 702 236 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621691|ref|ZP_06114626.1| ## NR: gi|266621691|ref|ZP_06114626.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 236 1 236 236 397 100.0 1e-109 MGILDGIVEWIAEQVMNGLDLVTTSVLGALGCDMTVFLRYFPAAETMYHVFVALSIGLIL LNLVWQLFKNFGLGTGVEAEDPVKLTIRSVLFILLAYFSDNIVNTVLEIGGTPYGWILDS ELPALSFADFNSVMLIIIGVCANGAVALITLVLVLILAWNYLKLLFEAAERYILLGVLVY TAPVAFSMGAAQATSNIFKSWCRMLGGQVFLLIMNAWCLRLFTSMVGAFIANPLSV >gi|229784085|gb|GG667650.1| GENE 20 21455 - 21853 221 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870583|ref|ZP_06114627.2| ## NR: gi|288870583|ref|ZP_06114627.2| hypothetical fimbrial protein like protein [Clostridium hathewayi DSM 13479] hypothetical fimbrial protein like protein [Clostridium hathewayi DSM 13479] # 4 132 1 129 129 244 100.0 1e-63 MKKMTVSQTAKPPEPMGKRRYRFPVRKAAYCAYLAVCLSVMYSQPAYAADVWTKAKEIMQ DVYNQIVLISTVAAIVTASVALLMMNFSRSGKTVDESRAWLKRICLSWAIINGLGFIMAY ITPFFADGKWTG >gi|229784085|gb|GG667650.1| GENE 21 22085 - 22636 396 183 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621693|ref|ZP_06114628.1| ## NR: gi|266621693|ref|ZP_06114628.1| hypothetical protein CLOSTHATH_02860 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02860 [Clostridium hathewayi DSM 13479] # 1 183 3 185 185 357 100.0 2e-97 MKKKNELNHADNPLYHNTWTLLKKYRDVVWSLELSVQQVRNTFEIEYGNAVEEFLDSIYL AGADLTGSAIEHHAQCIEQSHKMLKLLDTAIDLLRTKHKNGESFYWLLYYSFLSPQQLRN VDEIIEKLRPHIRDISFRTYYRRRREAIDALSSVLWGYTAKDSMAILEQFFPDQLHGKKL AQD >gi|229784085|gb|GG667650.1| GENE 22 22639 - 23793 209 384 aa, chain - ## HITS:1 COG:no KEGG:Closa_1149 NR:ns ## KEGG: Closa_1149 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # 
Pathway: not_defined # 1 335 1 336 338 536 73.0 1e-151 MSEYQLQIKQIVDYPRCRIYRQFVRTLMDDRSFRASEGSGLFHYVVLCNFANFRTSYRHI DGISYTIYPGEWICTTKELAAWFRTRFQYQAVNILEKLQSLHLISFLQLNRGKVIKYKIR DWKKHNTVLDYNCPCQKDTGFFFMPVSTAMELLSTSRCSEMDIALDLWISTVYNDEQVDG SDVGPVVYIRNGTGNPLVTYTELSARWGISKATVGRVLKRLSDLEYLTLLTFPGRHGNVI YLQSYLSTMFQISDVMIDKEEVAMILNIKLELPDEAESGDNQPVPEYNVCVSNELSSVSK AHIEIILRQMAKVLSAQGIPCFECPKSQYKLYPLSNACMEESLIRARGISESRIGLAVLC GDGKQVCTFELTLSPCAEMEPRRK >gi|229784085|gb|GG667650.1| GENE 23 24321 - 25832 816 503 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1352 NR:ns ## KEGG: Ethha_1352 # Name: not_defined # Def: integrase family protein # Organism: E.harbinense # Pathway: not_defined # 28 432 10 392 464 134 27.0 9e-30 MGRKPKKKDIAEYLWHSEPTKINATNDNWRIRYGYLDSEGKEHQTTETVNNERAKNQFLS YLLLREKEHKAAKGTTVASKVDKTEIETVEQLLYAYAEHTKYLHTIGDRYGWDEGTYRAN IGKIRNYIVPVFGSFPIWKITRVDMRQGFKDMLQLPQANGNHKANGARVSQRTVYDCKKI ISRAFKYATDDLELITENPMLGVTVKQPKSSRREVWTEEQFNFAYDNCSDPQLKLLISLM VALSSRIGECLALTWSDVYDNEDKHEPSYIRNYKELTYRTEKFIKETNGRGILKVFDQVR STRPDAKRRAVFHVTKTEKEIEAKTDRIYIAPEVIFMLRDYRLVQNQHKQLAGDSYIDDD LIFAHEDGTHISSQYADDMFRSLYEPLGLPKVDLYSLRHFSITKKLDYNKHDYVSLAKDT AHRQLSTIQDYYETPEDSVRIDTAKAIGKFLSRSGKSDSESKATISLADEDFAKKLEEVA SDSAGKQMLEFAIAMYEKNNSGN >gi|229784085|gb|GG667650.1| GENE 24 25841 - 26053 288 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870586|ref|ZP_06114631.2| ## NR: gi|288870586|ref|ZP_06114631.2| conserved domain protein [Clostridium hathewayi DSM 13479] conserved domain protein [Clostridium hathewayi DSM 13479] # 1 70 4 73 73 127 100.0 3e-28 MAGVTQSIMECRNYSVQDIMNIYGVGEESVRNFIKKHQKNNDFRVFMVGKYLRIDKNSFE DWYNNHPNEF >gi|229784085|gb|GG667650.1| GENE 25 26297 - 26998 406 233 aa, chain + ## HITS:1 COG:no KEGG:Shel_19180 NR:ns ## KEGG: Shel_19180 # Name: not_defined # Def: predicted transcriptional regulator # Organism: S.heliotrinireducens # Pathway: not_defined # 17 152 1 139 141 81 33.0 3e-14 MGLWDKFRSDKKSGRYLGTGTKIRRIRSQKDTTVVELANALGVNEAAIRNYEIGYRQASR 
DKLELIAQRLGVPVETLIDRQIDSYNDAIHILFELSEKYDLVPIELPQEPKYAIQTKDET ILQALQAWYNERRKWENGDITQAELQEWTDTFPLQCEENEPPAEKVESRYTDFERILGLK SSLEQFDMIVNDNVELIEDCIAHKDYKTAREHLRTLKATVHTLSQVDIKRYGK >gi|229784085|gb|GG667650.1| GENE 26 27055 - 27252 233 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621698|ref|ZP_06114633.1| ## NR: gi|266621698|ref|ZP_06114633.1| ribbon-helix-helix protein, CopG family [Clostridium hathewayi DSM 13479] ribbon-helix-helix protein, CopG family [Clostridium hathewayi DSM 13479] # 1 65 1 65 65 115 100.0 1e-24 MKEEKLIIHPKRPKGDDGYKTFSVRIREDVVQRIDEISAQTGRSRNELIGILLEFSLDRC SIEPK >gi|229784085|gb|GG667650.1| GENE 27 27343 - 27819 435 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621699|ref|ZP_06114634.1| ## NR: gi|266621699|ref|ZP_06114634.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 158 1 158 158 285 100.0 9e-76 MNSDELYTLKSMVLDNYEQYHDGLRILLESDKGLFKAFGQSPRTKYLKTENFYLTSKERL TYMMGFIDALAWLTRQGKLVPDTFPTDITGFTDPEETPEERAEYFEQRYKGLSPEQRAEL EAFDKENEELWLEQVEESIKENKVINNLILRFITGEDF >gi|229784085|gb|GG667650.1| GENE 28 27869 - 28288 113 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621700|ref|ZP_06114635.1| ## NR: gi|266621700|ref|ZP_06114635.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 139 1 139 139 216 100.0 4e-55 MSRITTKILKMIIFLGIFAWYVYSQFQKGGAIFSLEIIGAGLGLASLAYISLSLYGFILD VSKRYLLAFIITVAIALFVSFKIDGIIASIPWLSEDGFVFIIIVLTVLCVVRDVKTIRTI SQIQKSSVTQTNSTDERNT Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:56:45 2011 Seq name: gi|229784084|gb|GG667651.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld44, whole genome shotgun sequence Length of sequence - 38191 bp Number of predicted genes - 35, with homology - 35 Number of transcription units - 15, operones - 8 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 187 - 1179 1108 ## COG1816 Adenosine deaminase
- Prom 1207 - 1266 8.7
+ Prom 1315 - 1374 4.6
2 2 Op 1 . + CDS 1423 - 1929 398 ## COG3837 Uncharacterized conserved protein, contains double-stranded beta-helix domain
3 2 Op 2 . + CDS 1935 - 2483 661 ## COG0778 Nitroreductase
4 2 Op 3 . + CDS 2462 - 3658 925 ## Hore_15700 hypothetical protein
- Term 3669 - 3701 -0.2
5 3 Tu 1 . - CDS 3742 - 4989 1446 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases
- Prom 5050 - 5109 8.0
6 4 Tu 1 . - CDS 5119 - 5949 967 ## COG1396 Predicted transcriptional regulators
- Prom 6024 - 6083 4.3
- Term 6183 - 6243 -0.9
7 5 Op 1 . - CDS 6265 - 7878 1507 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold
8 5 Op 2 . - CDS 7897 - 8421 771 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins
- Prom 8562 - 8621 5.7
- Term 8494 - 8531 5.4
9 6 Op 1 . - CDS 8637 - 9011 468 ## COG2314 Predicted membrane protein
10 6 Op 2 16/0.000 - CDS 9077 - 9943 1065 ## COG1082 Sugar phosphate isomerases/epimerases
11 6 Op 3 9/0.000 - CDS 9964 - 10968 1104 ## COG0673 Predicted dehydrogenases and related proteins
12 6 Op 4 16/0.000 - CDS 11024 - 12220 1251 ## COG0673 Predicted dehydrogenases and related proteins
13 6 Op 5 . - CDS 12241 - 13125 994 ## COG1082 Sugar phosphate isomerases/epimerases
14 6 Op 6 . - CDS 13143 - 14615 1495 ## COG2211 Na+/melibiose symporter and related transporters
15 6 Op 7 . - CDS 14669 - 16393 1893 ## COG1472 Beta-glucosidase-related glycosidases
- Prom 16413 - 16472 3.8
16 7 Tu 1 . - CDS 16498 - 17724 1212 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
- Prom 17835 - 17894 4.6
- Term 17892 - 17929 4.0
17 8 Tu 1 . - CDS 17948 - 18835 980 ## COG4974 Site-specific recombinase XerD
- Prom 18888 - 18947 6.2
- Term 18962 - 19018 2.4
18 9 Op 1 . - CDS 19096 - 19722 520 ## Closa_3613 hypothetical protein
19 9 Op 2 .
- CDS 19787 - 20335 748 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
20 9 Op 3 . - CDS 20395 - 21192 1051 ## COG0345 Pyrroline-5-carboxylate reductase
- Prom 21430 - 21489 5.3
+ Prom 21419 - 21478 6.6
21 10 Op 1 . + CDS 21564 - 22271 764 ## COG2964 Uncharacterized protein conserved in bacteria
22 10 Op 2 1/0.000 + CDS 22268 - 23425 930 ## COG1921 Selenocysteine synthase [seryl-tRNASer selenium transferase]
23 10 Op 3 . + CDS 23415 - 24155 532 ## COG3964 Predicted amidohydrolase
+ Prom 25057 - 25116 6.7
24 11 Tu 1 . + CDS 25177 - 25443 175 ## gi|266621725|ref|ZP_06114660.1| putative adenine deaminase dihydroorotase
+ Term 25479 - 25533 11.6
- Term 25467 - 25521 16.2
25 12 Op 1 . - CDS 25591 - 26370 818 ## COG1414 Transcriptional regulator
26 12 Op 2 38/0.000 - CDS 26508 - 27341 733 ## COG0395 ABC-type sugar transport system, permease component
27 12 Op 3 35/0.000 - CDS 27334 - 28224 889 ## COG1175 ABC-type sugar transport systems, permease components
- Term 28236 - 28284 8.6
28 12 Op 4 1/0.000 - CDS 28297 - 29643 1385 ## COG1653 ABC-type sugar transport system, periplasmic component
29 12 Op 5 . - CDS 29686 - 30822 1199 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily
+ Prom 30979 - 31038 9.6
30 13 Op 1 2/0.000 + CDS 31193 - 31825 552 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase
31 13 Op 2 . + CDS 31845 - 32825 612 ## COG3734 2-keto-3-deoxy-galactonokinase
+ Term 32876 - 32916 6.1
- Term 32859 - 32909 17.1
32 14 Tu 1 . - CDS 32947 - 33696 849 ## COG1349 Transcriptional regulators of sugar metabolism
- Prom 33745 - 33804 5.6
- Term 33728 - 33772 4.1
33 15 Op 1 2/0.000 - CDS 33814 - 35214 1748 ## COG2610 H+/gluconate symporter and related permeases
34 15 Op 2 3/0.000 - CDS 35309 - 36310 411 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32
35 15 Op 3 .
- CDS 36338 - 37645 1492 ## COG3395 Uncharacterized protein conserved in bacteria - Prom 37843 - 37902 5.9 Predicted protein(s) >gi|229784084|gb|GG667651.1| GENE 1 187 - 1179 1108 330 aa, chain - ## HITS:1 COG:L87453 KEGG:ns NR:ns ## COG: L87453 COG1816 # Protein_GI_number: 15672269 # Func_class: F Nucleotide transport and metabolism # Function: Adenosine deaminase # Organism: Lactococcus lactis # 4 327 19 343 352 202 38.0 7e-52 MGRPETDLHLHLDGSLSIDVVKTLADRIGYDFGGKNVKELLSVGETCESLPDYLKCFDLP GILLQTEEALEYAACDLVKRLAGQGLVYAEIRFAPQLHKQKGLTQGQAVEAVVKGVRTGC KETGIYAGVLLCAMVNGSDAENEETMELAKAFCKEKAVNGVVGADIAGPEGFVPMAHFEG MFKRVYQAGVPFTIHAGECGDYENVVRAVAFGAKRIGHGCAAIQSEECMRLLEREKITLE MCVVSNLQTKAVAGIEDHPLKKFFDRGIRVTYNTDNMTVSDTSLEREAELITRKLGFREE DLRKMNEYAVEGAFAGEEVKQELKKRFGGQ >gi|229784084|gb|GG667651.1| GENE 2 1423 - 1929 398 168 aa, chain + ## HITS:1 COG:CAC1388 KEGG:ns NR:ns ## COG: CAC1388 COG3837 # Protein_GI_number: 15894667 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Clostridium acetobutylicum # 17 159 17 153 154 106 40.0 2e-23 MKNKKENSKTGNSRSLDPSFIWKDAAGLITGDLGKSVGSERIYIHLDTIPPGAFSTTYHS HSCQEEFFYIMSGTGTLRLNDETYPVGPDDFLAKPAGRNIAHSFYNSGEENLCILDIGTV DPEDTCYYPDEQIYLHKSNGERRIYRADATESDWSSAPNQPHRQKKGE >gi|229784084|gb|GG667651.1| GENE 3 1935 - 2483 661 182 aa, chain + ## HITS:1 COG:MA0330 KEGG:ns NR:ns ## COG: MA0330 COG0778 # Protein_GI_number: 20089228 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 1 172 6 175 179 97 39.0 2e-20 MDTMNAIRSRKSFRGPFTDAPVSREDLTELLEAGYSAPSGCNLQTTKLIGVDDPELLAGL AEIYNHEWAKGATAAILLIGSFTMSPSGVSYHVHDYCAAAENIYLAAADKGLGTVWIEGQ IRGERAEKMGRLLGVPEDQTVYVYMPVGYPAKPGPQAKKKPFEERAWFNRYGGTGSCSSP ES >gi|229784084|gb|GG667651.1| GENE 4 2462 - 3658 925 398 aa, chain + ## HITS:1 COG:no KEGG:Hore_15700 NR:ns ## KEGG: Hore_15700 # Name: not_defined # Def: hypothetical protein # Organism: H.orenii # Pathway: 
not_defined # 81 397 39 356 357 419 62.0 1e-115 MQFTGIIKKAGVLPLLALTLALGLTACSKAPAQAAESVSQSAVQQTAESASQPDPQQTAE PAAQPTSQQAAESGDVQPQAGQIYLYGEQHADEKILNREVEIWNDHYHNHNMRHLFVELP YYTAEYLNLWMKSDSDEILNALYDDWEGSAAHNPYIKPFYQAIKSQCPETVFHGTDVGHQ YFSTGERYLEYLKENGLEASEQYRISQEIIEQANYYYDHSDNVYRENMMVENFIREYDKL NGEAIMGIYGGAHTNPEALDSTGTIPCMANQLKEHYGDVIHSEDLILATKDSTPVRTDKL TVNGKEYEAAYFGKQDMTGFQDYSFREFWRLEHAYDDFKDCELTSDNLPYNNYPMRIETG QVFVIDYTKTDGSVTRLYYRSDGDQWNGSPSTNGFSVP >gi|229784084|gb|GG667651.1| GENE 5 3742 - 4989 1446 415 aa, chain - ## HITS:1 COG:CAC0764 KEGG:ns NR:ns ## COG: CAC0764 COG0493 # Protein_GI_number: 15894051 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Clostridium acetobutylicum # 7 405 9 411 411 410 50.0 1e-114 MAVHVIDEANRCLNCKKPLCQQGCPIHTPIPKMIEAFKEGDLNRAGDMVFANNPMSLVCS LVCNHENQCEGHCILGRKGQPVHISSIENYVSDTIFDKMKIECQPKNGKKVAVIGAGPAG ITIAILLTKKGYSVTIFDSKDKIGGVLQYGIPAFRLPKTILDRYKKKLLEIGVKIRPNTA IGTALEIKDLFRDGYESIFIGTGVWRPKTLGVKGESLGNVHYAIDYLANPDAYDLGERVA VIGMGNSAMDVARTVIRHGAREVTLYARGLVSNASEHETAYARLDGAAFEFGKQIVEITD DGPVFETILYDAEGRQTGVEETTEQVYADSTIISISQGPKSKLVNTTEGLKASKNGLLMT DERGQTTIPGIFASGDVVLGARTVVEAAAYSRVVAEAMDEYMKTNSPLPQEHAGK >gi|229784084|gb|GG667651.1| GENE 6 5119 - 5949 967 276 aa, chain - ## HITS:1 COG:L12334 KEGG:ns NR:ns ## COG: L12334 COG1396 # Protein_GI_number: 15671989 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 1 57 16 72 107 68 40.0 1e-11 MTQEQLAEQLEVSRQAVSKWESGQSYPEMEKLLIICDMFHCDMDSLVKGDLTLEDQADAA GYDRFMNGFSCRITGSITLILAAVTLMAFLDGLVHEDILAVFLIAAAGVAVAVMIVSGIN KEAFEKKYPHIQDFYSEEERSGFTRRFGIAIAAAVCLILFGICVNIAVDSMSFPLKEQLS GGFFLVCVTVAVGMFVYFGLQKEKYEVEEYNKRHDGSREKGNAIVGKASGVIMLLATAVF LYFGLVWDMFRSAWIVYPIGGILCAVVAVIWGDKKG >gi|229784084|gb|GG667651.1| GENE 7 6265 - 7878 1507 537 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # 
Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 3 537 16 550 553 323 35.0 6e-88 MDKILYHGTFLTMAENRPEPEAVLIRDGVIGGTGTLEEMRKLAPDGMLRDMEGHTVLPGF IDGHSHLSAVAYQLLVANLKPSPLGPCNCVEDVVRELKGFLETHSLRPGQWLMGMGYDNS VFPEGAHPTKEDLDRVSTEIPVAAAHISGHLCVVNTKGMELLGYTGEHFKVPQGGVVEPA GLLKEQAFLGKNGEMQGPAPEEVVKAVGDASKLYASYGLTTVHDGKVPEGQYQLLKGAAA MGLLKNDVVMYLAPELADQLLTGPGPEALAAKGYEHHLRPAGVKLFLDGSPQGKTAWLSE PYYVVPDGEQPDYRGFPVQSEAYVMDMMRTCVKNRWQINVHANGDEAIEQMIRCYQSVLE ETGSDRDLRPVVIHCQTVREDQLARMKEIGMVASFFLDHVYYWGDYHYESVLGPERAERI SPARSALKHGVSFTLHQDSPVAPPDVMGAVHNAVNRKTEKGRVLGQEQTITVMEALKAVT LNGAYQIFEEDKKGSIEVGKTADFAVLERNPLTVPKEEIREIKVLETIKSGETIFRR >gi|229784084|gb|GG667651.1| GENE 8 7897 - 8421 771 174 aa, chain - ## HITS:1 COG:SA1461 KEGG:ns NR:ns ## COG: SA1461 COG0503 # Protein_GI_number: 15927215 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Staphylococcus aureus N315 # 4 172 3 171 172 191 57.0 7e-49 MKKLEDYVTSIPDFPEPGIIFRDVTTILQDADGLSLAVDGVRGMLKDVDYDVVVGPESRG FIFGVPVAYAEHKGFVPVRKKGKLPREVISADYELEYGQATIEMHRDSIQPGQKVVIIDD LIATGGTIEAIIRLVKEMGGEVVKICFIMELKGLNGREKLKGYDVESLIAYEGN >gi|229784084|gb|GG667651.1| GENE 9 8637 - 9011 468 124 aa, chain - ## HITS:1 COG:SA1057 KEGG:ns NR:ns ## COG: SA1057 COG2314 # Protein_GI_number: 15926797 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Staphylococcus aureus N315 # 65 124 8 67 67 62 46.0 2e-10 MGKRRKQDDPDYVDPKKMIALEEEMAALKKEYGIEDKPRWWVRIGDFIANHAGTRQPVPV FRKKYLLLALCLGWAGGHRFYTKQYALGVIYLLFCWTGIPFAMTLIDLMIALPMKADEQG RMLL >gi|229784084|gb|GG667651.1| GENE 10 9077 - 9943 1065 288 aa, chain - ## HITS:1 COG:lin2265 KEGG:ns NR:ns ## COG: lin2265 COG1082 # Protein_GI_number: 16801329 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate 
isomerases/epimerases # Organism: Listeria innocua # 3 269 2 232 246 86 27.0 5e-17 MKKGLIGIQMSTIKDKIAELGAYGTLKACAELGYHCVEVSQIPMTEENVAGMKRACEEFG IKIAALSAAVEPMMPGMPGEYLTTDFDKIVADCKTLNCDMLRIGMLPMTCMGSREKALDF VKRADEMAGRLKEHGIDLYYHNHHVEFVKYDGEYLLDIIKNNTRNMGFELDIHWIHRGGE NPVEFIRKYDGRIRLLHLKDYRIGQLKMPEGAFDPKTFMAAFTDIVEFAEIGEGTLPVKE CIEAGFAGGSEYFLIEQDTTYGRDPFESLRISKENLEKMGYSDWFLLD >gi|229784084|gb|GG667651.1| GENE 11 9964 - 10968 1104 334 aa, chain - ## HITS:1 COG:TM0585 KEGG:ns NR:ns ## COG: TM0585 COG0673 # Protein_GI_number: 15643351 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Thermotoga maritima # 3 332 4 347 360 131 30.0 2e-30 MKKAAVIGLGDISKIHIPVIEANPDITLCAVCDNNPELKETAPEGAAFYTDYREMAEKEK PDCVHVCLPHYLHYPVTKELVEMGCNVFCEKPVALNSEEAELFVRLEEAHPEIRIGICLQ NRLNESVELLKEIIESGEYGQVVGTRGIVPWYRPAEYYSAKPWRGRWDTAGGGCMINQSV HTLDLLYYLGGDIKGLNASVSQILDYGIEVEDTVTARLDYVSGAKGLFFATNANYKNEAV QIAIQLERGEFRIEENILYQILPEGGRKKLAEDRKFPGTHFYYGASHGKLIDRFYRVLEM GEGEYIRVKDAKMSIRLIDAILESGKSGSYKFIL >gi|229784084|gb|GG667651.1| GENE 12 11024 - 12220 1251 398 aa, chain - ## HITS:1 COG:SSO3049 KEGG:ns NR:ns ## COG: SSO3049 COG0673 # Protein_GI_number: 15899754 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Sulfolobus solfataricus # 63 377 70 366 371 104 27.0 3e-22 MNKVRFGLIGIGAQGGAYAGFLTGRGGFPGMPAPECPPHCALGALCDTDPAKEAMCKEKY PETPFFKDWKDLVNSGTVDAVVTTVPHYLHTEIAIYCLERGVAVLVEKPAGVYAKSVREM NECAAAHPEVPFGIMFNQRTNKLYQRVKAIVESGELGQIRRSNWIINSWWRPDSYYRQSD WRATWGGEGGGVLVNQAPHQLDLWQWICGVPNRVYAKCIYGDHRDIIVENDVTIVTEYPN GATGSFVTCTHDPIGTDRLEIDLDGGKIVVEDSRTATVYRLKKNEDELNRTMSMMELVKL TQGNAAGADGGLYTVETFENTDGWGVQHMTVMENFACHMVEGTPLLAPGSDGINGVNLAN ATLLSSWLGREVDMPVDEDLYLEELNRKIAAEGKFPTR >gi|229784084|gb|GG667651.1| GENE 13 12241 - 13125 994 294 aa, chain - ## HITS:1 COG:Cgl2502 KEGG:ns NR:ns ## COG: Cgl2502 COG1082 # Protein_GI_number: 19553752 # Func_class: G Carbohydrate transport and 
metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Corynebacterium glutamicum # 58 255 1 201 226 151 39.0 1e-36 MEKIMISGFSDEISADFDEQLKVVSELGMEYISLRSADGKGIAEYTAKEAEEKLLPGLQA AGIRVSSLGSPIGKVGIDDEEGFANQLIQLEELCRIAKVLDCRYIRMFSFYIPEGKDPED YREKVIEKLAKFDAIAKAHDIVLIHENEKDIYGDRKERCRVILDALGSDHFKAVFDFANF VQCGEDTMECWELLRDQVVYIHIKDAVASDKENVLCGTGEGKIKEILTRAIREEGYEGFL TLEPHLVLFDALQSLEVSDAESVIKENKAKDGAEGYSMQYHALKEILNSIERQD >gi|229784084|gb|GG667651.1| GENE 14 13143 - 14615 1495 490 aa, chain - ## HITS:1 COG:ECs2323 KEGG:ns NR:ns ## COG: ECs2323 COG2211 # Protein_GI_number: 15831577 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli O157:H7 # 10 489 1 450 457 97 25.0 4e-20 MGKTTNPSGMHRAKLWQIGGFAFNNTATNLYMFMMNFIAYYLTGYVGVGVVVASTLITTM RIWDGVTDPFIGYMVDKTNSKFGKNRPFMVIGNVILAVGSFILLHVTHLLPTGARFIFFV LIYMVYIIGYTFQCVVTKSAQSCLTNDPKQRPLFSIFDGVYNTIVFALIPIFVSSYLVPK HGGFNAAFFHEFWCYVAAASGILTCIAIYCISSKDRLEYFGTGKVVKIGFKDYWEVLKNN RAIQMLVVAASTDKLASSVRSNSTIGVILYAIICGNYALSGAVAGYTSVPNMLILFFGVG WIATRLGQRKAMLFGTWGGLIFTTLSIVLFAVGDPTQLSFPGIAGFAGWNFFTLAFLALY ILGQGCMGVSGNIVIPMTADCADYEVYRSGKYVPGLMGTLFSFVDKLISSLAASIVGVMC AMIGFKETLPGVDTPYSPQLKFIGLFCFFGLILFGLICNLIAMKFYPLTREKMEEIQEKI AEIKAQAQQA >gi|229784084|gb|GG667651.1| GENE 15 14669 - 16393 1893 574 aa, chain - ## HITS:1 COG:BH0675 KEGG:ns NR:ns ## COG: BH0675 COG1472 # Protein_GI_number: 15613238 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus halodurans # 26 553 138 657 686 169 27.0 1e-41 MIDLRSKPFYLSDEDIAWVEETKAGLSLDEKIGQLFVDMLWGDTDEEIKRRIDTYGISGF RYNNLPAEDLYHQNEVIQRCSRVPALIAANVESGGNGAVTGGTRLGDPVAIGASGSAENA YYMGYYGCKEAASVGCNWTFAPVVDINKNWRNCVVSNRCFGSDADMVLEMGKEYMRGAGD AGLACCMKHFPGDGLDERDQHIVTTINTMTCEEWDETFGKVYKGMIDAGVQSVMVGHIAL PSYSRKLCPGIRDEDIMPATISKELLTGLLRGQLGFNGLIITDATHMVGLTSKMKRSEFV PYVIEAGCDMVLYYRDKDEDVENFKAGLESGLLSRERFDEALTRVLAMKAMLKLHRKQKE 
GTLMPPKEELSVIGCPEHRKKANEIIDNAITLVKNTRNQLPLTPETHKRIMLYSIDSTPA VMRSMMGGGQDTAESMMKEYLEEAGFSVTVFTPEGKNGKQLLAGTPVKAFVSQFDAVILV ANVSGFSQSNERRLHWSMPMGPDIPWYVTELPTVFVSVNNPFHLIDVPMVPTYINAYSPD RDVIRQVVDKITGKSEFKGRSTVDAFCGAWDTRL >gi|229784084|gb|GG667651.1| GENE 16 16498 - 17724 1212 408 aa, chain - ## HITS:1 COG:BH0793 KEGG:ns NR:ns ## COG: BH0793 COG4753 # Protein_GI_number: 15613356 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 296 401 399 504 508 84 37.0 4e-16 MQVDLLEVTEAFVVSFGVNMTVIKTPDDDIGNFDLGIRRTFRKEESAGQTVKHLYRICRE GALYFLDDCFEMEYCLFRIPREESQYGELVLIGPYQKELVDEYQLNLLVQSHRIPMGLMA ELQEYYHAVPVILLYETWMSVLTAMVRKLYGDREVEIVLRESLDEYGDTNMVSSHPDEPL AVKLIEDRYKAEEELMAAITQGNMEKALKAQGKFRNFQIARRYKDPARNFRNLMITANTL CRKAAQAGGVHPVHIDELSCKCAKKIETLMTKQEADRFNLEMVRKYCMLVRNYSLQGYSP LVQKVVNHIDLNLVSDLSLRNLAAEFNVNPSYLSSLFKKEMSVTITAYVNQQRMKQAIRY LNTTNMQIQNIAADVGISDVNYFSKLFKRATGKTPSEYRELILVRSQL >gi|229784084|gb|GG667651.1| GENE 17 17948 - 18835 980 295 aa, chain - ## HITS:1 COG:BH1529 KEGG:ns NR:ns ## COG: BH1529 COG4974 # Protein_GI_number: 15614092 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus halodurans # 5 294 8 298 299 251 44.0 1e-66 MVTEINEFVKYLREVKKTSKNTEVSYHRDLLQMASYLEEQGITDVEKVTKTSLNSYVLHL EKEGRATTTISRTLASMKAFFHYEYGEGRIRRDPAELIKAPKIEKKAPTILTVEEVNSLL EQPTGDSPKELRDKAMLELLYATGIRVSELIHLKVEDMNLAVGFITCRDEHKERMIPFGK VARQAMVNYMESGRASLLKGQESEWLFTNCSGRPMSRQGFWKIIKYYGEKAGIQADITPH TLRHSFAAHLLGNGADIHAVQAMMGHSDMATTQMYMNYTRGEAVRSAYAGAHPRN >gi|229784084|gb|GG667651.1| GENE 18 19096 - 19722 520 208 aa, chain - ## HITS:1 COG:no KEGG:Closa_3613 NR:ns ## KEGG: Closa_3613 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 41 206 52 217 220 176 57.0 7e-43 MAIRLRSMGRGLRIRSGTFTPIHFQKNPGTAAAAGYGSSGNRFILFFFAGLLAGTAAANF 
LHASLSDQAGYYLNLLSHQSDLGRSEQLALFGDICRQRIIEVLIAWLIGLTAYAVPCFCV LSAGFGLSVGVVLSVMTGLKGILGLPFFLASVMPQAFLYVPVWCLLLLWGIRRNGRLRIP ALLLVLAFVAAGSACEVWLNPFFLGMAG >gi|229784084|gb|GG667651.1| GENE 19 19787 - 20335 748 182 aa, chain - ## HITS:1 COG:VCA0764 KEGG:ns NR:ns ## COG: VCA0764 COG0494 # Protein_GI_number: 15601519 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Vibrio cholerae # 18 172 24 176 185 114 41.0 1e-25 MEQPPVKRLNRELKYQGTILKIYEDTVEANGHEAHWDFIHHNGAAAVVPVTKEGKLLMVR QYRNALDRYTLEIPAGALDYPEEPKLDCAHRELEEETGFKTEKEKMEYLISVNTTVAFCD EAIDIFVARELEPSRQNLDDDECIDVEEWTVADLEKLIFEGTITDGKTIAAILAYARKYH VE >gi|229784084|gb|GG667651.1| GENE 20 20395 - 21192 1051 265 aa, chain - ## HITS:1 COG:CAC3252 KEGG:ns NR:ns ## COG: CAC3252 COG0345 # Protein_GI_number: 15896497 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Clostridium acetobutylicum # 2 264 3 267 270 239 50.0 3e-63 MAKIGIIGIGNMGSAILKGLLHVYGKNDIIFTDVNREKCEEITKETGVAYAGSNAECANQ AKYIILAVKPQYFDPVLKNIRNVVTEANVIISIAPGITIGQLKEKLGIEKRVVRAMPNTP ALLGEGMTGVCYDAEAFDDGEKDTIRDIFTSFGEMCIVEERLMSAVVCASGSSPAYVYMF IEALADSAVKYGLPRDTAYRMAAQTVLGSAKMVLETGEHPGALKDKVCSPGGTTIAAVAA LEEHGFRSAVIKAGDACYEKCENMK >gi|229784084|gb|GG667651.1| GENE 21 21564 - 22271 764 235 aa, chain + ## HITS:1 COG:VC0355 KEGG:ns NR:ns ## COG: VC0355 COG2964 # Protein_GI_number: 15640382 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Vibrio cholerae # 11 219 34 236 240 76 25.0 5e-14 MKTVYDNWAFWQRLLNMLENEFGNRCELILHDLTKDYAHTVVDIRNGHITGRKIGDCGSN LGLEVLRGEVKDGDRYNYVVHTPDGKILRSSTMFLYDDDGKVIGSFCINLDITQTVALEG ILHQYNQYVPGMETEKPVEECRETFAGNISDVLDFLIEQAGEMIGKPVDEMNRLEKIRFT GYLDAKGAFLITKASERVCDYLKISRYTLYNYLDLARTEHGDSDSNTKREGTEEK >gi|229784084|gb|GG667651.1| GENE 22 22268 - 23425 930 385 aa, chain + ## HITS:1 COG:AGl3109 KEGG:ns NR:ns ## COG: 
AGl3109 COG1921 # Protein_GI_number: 15891670 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine synthase [seryl-tRNASer selenium transferase] # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 300 45 338 439 109 29.0 1e-23 MNIYEELNVDTIINASDTYTRIGGSRMSRRTLEAMQEAAESFADIGELSDAICRRIAERT GNETAFISSGAGACVVLTASACMTMGNEELMYQLPDASACPKNEIIVFASQKNCPILPYW HLTELSGASLIPVEDSISALREAIGPKTAAVFFFTGTVYEWTTPDLKDIIETAHAAGVPV IVDAAAQLPPKSLMSWYTVDLKADAVIFSGGKFINGPQTTGIVLGRHQILDHCRAMASPN VRIGRPYKVGKEEYAAFYRACMDFLDMDEEADYQKLKTILENIQSSLSDVPGYHSYIEEN GRLGQRIPMLYLQFTDGTTGKECYDFLYSAPERIDIGTFHPGDPTGDPCRVFLNAINLRE PDLPVLIKKLNRFLTLPERRNRHER >gi|229784084|gb|GG667651.1| GENE 23 23415 - 24155 532 246 aa, chain + ## HITS:1 COG:STM4445 KEGG:ns NR:ns ## COG: STM4445 COG3964 # Protein_GI_number: 16767691 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Salmonella typhimurium LT2 # 36 243 33 246 377 142 35.0 6e-34 MNDKTLVQYYDADKNRFLNGELSGCDGDWHVEYRNDGPAETDLFLSPGWIDMHTHIFDGF GLFGTEADAVGWKTGTCLLVDAGTVGEYTIHGFTKYVAPAIETNIRLFLCISPIGVIFHH DYNAMQYLDADRCAACIAEYPGLISGVKVRMGSETIRHEGLEPLRLASLAARKANVPMMV HVGGNPPYLKDMEPYFEKGDILTHVFNGRGGDVWNPDGTPSDALQKLIDRGVWLDVGHGS SSFLAS >gi|229784084|gb|GG667651.1| GENE 24 25177 - 25443 175 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621725|ref|ZP_06114660.1| ## NR: gi|266621725|ref|ZP_06114660.1| putative adenine deaminase dihydroorotase [Clostridium hathewayi DSM 13479] putative adenine deaminase dihydroorotase [Clostridium hathewayi DSM 13479] # 1 88 36 123 123 188 100.0 1e-46 MPVLLSKLYGCGVSLEDILYGVTAGPAMALGLTDWCRLSTVRNATLFRIVDHTEAYEDCQ GNTRTFHRAFLTEGVILNGKYRKAPAQL >gi|229784084|gb|GG667651.1| GENE 25 25591 - 26370 818 259 aa, chain - ## HITS:1 COG:Cj0480c KEGG:ns NR:ns ## COG: Cj0480c COG1414 # Protein_GI_number: 15791844 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Campylobacter jejuni # 6 251 1 243 253 136 32.0 3e-32 
MPNEELHRATKRVIDTLEQLTAAPPQGMTLTELASALKAPKSSLFPIVHTLCELHMLSLN SDTGRYCIGYKAYEIGNVYLRRGGLTTDIQTQMHTIVETCGETCYFAQLVDGDVFYLYKV DSTEPVRSVVSPGQRLPAYATGIGKALLSGKSKDELKRLYPDGLKALTPNTITDIDRLSY QLDEIQRNGLAFECEESTQYIRCVAVPIRRDGGVIAAMSIAMPVFRYDPEKEALTIRLLR EAQLRLEKLLKNADFNGIM >gi|229784084|gb|GG667651.1| GENE 26 26508 - 27341 733 277 aa, chain - ## HITS:1 COG:mlr7227 KEGG:ns NR:ns ## COG: mlr7227 COG0395 # Protein_GI_number: 13476021 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 25 276 30 280 280 186 38.0 3e-47 MNSQKRVNALCYIGLTLGSLIVLVPILYMASTSFKSINEIMTSSTVTMFPKKFSTEAYKN VITEYPFFTYLKNSVVISVTATVLAVLFSTLAGYGFSRFQFRGKGMIMMFILVTQMFPAV MLFVPYYKLLTIYGLANSRRGIILVHIASVIPFCSWMMYGFFNSISRELDEAARIDGCSH LKIFWKIIAPLTLPGLISTTIYAFIQSWNEYMFTMLCITSDKMKTLPVAIGQMASYDKIM WNDLMAASLLSSLPVVVMFIFLQRYFISSMTQGAVKG >gi|229784084|gb|GG667651.1| GENE 27 27334 - 28224 889 296 aa, chain - ## HITS:1 COG:mlr2327 KEGG:ns NR:ns ## COG: mlr2327 COG1175 # Protein_GI_number: 13472129 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mesorhizobium loti # 1 293 14 309 319 188 35.0 1e-47 MKHGMNGTMNRVKTRAAAYGFVTPAVITLALLLLYPFCYGIYISFFKTNLVNKWNFIGLK NYAKVLTEADFYSSMKITLIFTVGVVLGHFIFGFLFAVMLNRDIKGRTIFRAVLLLPWLF PEVVIANLWRWIFNPTMGFLNSTLVSVGILKEPMSWLGSPKLALAVLIFICIWKGYPLVM IQLLAGLQTVPGDIIEAARIDGAGNWKTFWYVTVPSMKSTLSVTLILDVVWWFKHVTMIW LLTQGGPNGATNTIGVNIYKRAFEFFDFGPSSALAVVVFLICIVISILQRRLLKDE >gi|229784084|gb|GG667651.1| GENE 28 28297 - 29643 1385 448 aa, chain - ## HITS:1 COG:PM1762 KEGG:ns NR:ns ## COG: PM1762 COG1653 # Protein_GI_number: 15603627 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 2 444 8 447 451 122 24.0 2e-27 MAIGLSVVMAASMLAGCGGKEAAPETTAAAAETQQTEAAKTEGDTEAAEEKSGENVTITY 
ISSTILESPEGDFEKKCIEEFNALGNGITVEVEGVSANDLMKKYITLATSNSMPDFFMAN MLDTATIVDMGLAAEIAPIFGQDYVDGFEDVSLASASVDGVAYGAPWFSGASGIVYRKDI FDEKGVAVPKTWDEYVEVSKALTGDGAYGNTLVGANNGSGAGRFQYVIRNFGVDEFIQGD DGTWTTDIGSQKYIDALRAYTDLDVTYHTCPPGVIETDYPTAVNLFSSGKAAMLITGSNA IGAITSQVPELKGKLGSFPVPAVERSVSTPGGFGYFITPGKHEKESAEFIKFMLEKERAL EFSVMTGRLPTRTETLDDPSINDMPELAGFLEARQTIYETPNIPGYSEVNDVHGQAYQSV FTGESTVEEAAAKAQKRAQEICDAANEG >gi|229784084|gb|GG667651.1| GENE 29 29686 - 30822 1199 378 aa, chain - ## HITS:1 COG:SMb20510 KEGG:ns NR:ns ## COG: SMb20510 COG4948 # Protein_GI_number: 16264240 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Sinorhizobium meliloti # 1 378 6 382 382 471 56.0 1e-132 MKLYFMPHRFLLLKIETDAGICGWGEPLVEGRAATLEASVNEWRDYFIGKDPLRIEEHWQ TMYRRAFYRGGPVLMSTIAGIDQALWDIKGHYYHAPVHEMLGGRVKDKIKVYRSIHGDTP EEVAADAKLAVKEGYTVIKTSPTDPTHYVDTLQSVNKLVEKVGAIRDAIGNNIDMAIDFH GRIHKPMAKVLVRELDQFHPLFMEEPVLPENKEALREVAKYTSAPIAIGERMYSRWDYKS LFEQGYVDIIQPDLSHAGGISEARRIAAMAEAYDVAVAPHCPLDAIAFSSCIQMGAATPN LVLQEQSIDIHDASDTNPRINFLKNKEVFHFENGFVEIPEGPGLGVEVDEEIVQEQNKNR HNWKNPMWRTYDGTPIEW >gi|229784084|gb|GG667651.1| GENE 30 31193 - 31825 552 210 aa, chain + ## HITS:1 COG:TM0066 KEGG:ns NR:ns ## COG: TM0066 COG0800 # Protein_GI_number: 15642841 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Thermotoga maritima # 12 210 12 200 205 133 37.0 2e-31 MTTFDKILAGGIITIIRGLSPDCAEKTVEAIHAGGLHLAEITFDQTAPPKVTADIIRTLS RQFEGKVLIGAGTVMTLEQLHAAYNAGAAYIISPNADSSVIRETKRLGLLSMPGAYTATE VARCYAEGADIVKVFPSDSAGPGYIKALRGPLHHIPLAAVGGVNLDNIRDFFDAGACCVG IGSNIVSKQAVQAGDFDRIRLLAAAYAAKL >gi|229784084|gb|GG667651.1| GENE 31 31845 - 32825 612 326 aa, chain + ## HITS:1 COG:RSc2759 KEGG:ns NR:ns ## COG: RSc2759 COG3734 # Protein_GI_number: 17547478 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-galactonokinase 
# Organism: Ralstonia solanacearum # 1 263 1 260 331 121 29.0 1e-27 MYYFYFDSGTTNTRACLLKDRTIICRGDIPVGSRDSALHQDRTVLISALKQLYDQLLTKS GITDTQIAEIYMSGMISSPSGLIEIEHLSTPVDFKKLKSSIVVYYEGQFFQRNVHIIPGI KTLPQGTTASVDTVVNANNMRGEETELLGILHRCPGLASGRSIILLPGSHTQAAFLLDGT IENISSTITGELYHALAGHTILGSSINGEKCTELLPNMVCMGYDIVHRYGFNRALYVVRS MDLFAGSTLAERRSYLEGVINGGVMDAIVSVTQDKPATIAVAGPHMQYEIFSAFADKLFP QFKITEVPVTPSFPFAVEGFLSLMEQ >gi|229784084|gb|GG667651.1| GENE 32 32947 - 33696 849 249 aa, chain - ## HITS:1 COG:BH0801 KEGG:ns NR:ns ## COG: BH0801 COG1349 # Protein_GI_number: 15613364 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Bacillus halodurans # 4 231 10 241 259 144 34.0 1e-34 MMLMAERQLKILGMIQDNGSVQVDELARKLDVSPMTIRRDLEKLQKDGLIERCHGGAVNK TEVNYAVKSVSNHTQKEMIARKAAEFIRGGTTIFLDAGTTTYEIAKHIMEYENMTVVTND LEVALLLKNSRVELFICGGYIQKSTGSAVGYYATQMMENFRFDMGFFGAAAINEEFHVLT PTIDKAFLKRQLTHQCQQSFLVVDDSKFNRQGMNFINKLSDYTGVITDHEFKETEKILLR KEGARIITV >gi|229784084|gb|GG667651.1| GENE 33 33814 - 35214 1748 466 aa, chain - ## HITS:1 COG:FN0225 KEGG:ns NR:ns ## COG: FN0225 COG2610 # Protein_GI_number: 19703570 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Fusobacterium nucleatum # 8 455 3 440 452 418 58.0 1e-116 MVGGLSGQRMLIGLLIGIIVLVVLVLKTKIHAFLTLIMSAVIIGLVGGMPMVNVTLENGK ILGIINSITTGFGGTLGSIGIIIGFGVMMGQIFERTGAAKRMAQTFLKLFGKGREEEALG ITGFLVSIPIFCDSGFVILAPIAKAISKVTKKSVISLGVALAAGLVITHTLVPPTPGPLG VCGIFGVDVGKFLLFGIVIAIPMAIACTIYARWVGKKIYQIPNDDEGFDRLPYQEPDYSI DFTTDFSGLPSTLESFAPLVLPIILILISTVSSALGLTTGFMEIFQFLGSPIVAVGLGLV VAIVTLGRNLTREEAIKEMERGMASAGIIMLVTGGGGALGQIIKDSGLGTYMAEGLAGTA VPIVVLPFIISTAMRFIQGSGTVAMTTAASISAPIIIAAGVNPILGALACCVGSLFFSYF NDSYFWVVNRTLGVGEVKDQIMTYSITSTIAWAVGFVEVLILSIFL >gi|229784084|gb|GG667651.1| GENE 34 35309 - 36310 411 333 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S 
ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 5 320 8 320 346 162 30 2e-39 MNKPLIAIPIGDPAGVGPEITAMALADSKVTEAARCIIVGDKKIMEQAVRITGAELKVNV VKEPEEGIYEPGILNLIDLDNIDMDHFEFGKVSGMCGKAAYEYIAESIRLANEGKADAVA TTPINKEALKAGDIHFIGHTEIFGALTGTEDPLTMFETNGMRVFFLTRHVSLRQMLDMIT KERIKDYVKRCLEALEKLGVTGGTMAIAGLNPHCGEHGLFGDEEVREIMPAIEELQAEGY PVTGPIGADSVFHQAAQGRFNSVLSLYHDQGHIATKTLDFDRTIAITNGMPILRTSVDHG TAFDIAGTGKVSAVSMIEAILLAAKYSPNFTKK >gi|229784084|gb|GG667651.1| GENE 35 36338 - 37645 1492 435 aa, chain - ## HITS:1 COG:FN0227 KEGG:ns NR:ns ## COG: FN0227 COG3395 # Protein_GI_number: 19703572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 433 1 425 425 259 34.0 6e-69 MPQCIVVADDLTGANATGVLLKKMNYQTYTVMNSERLELSLFDSCDCVMYPTDSRGVSSS LAYNRVYNVTNLLKEDGVKVYAKRIDSTMRGNVGSETDAILDALGDDYIAIAAPCFPASG RIVIGGYMLVKGLPLHKTEVALDPKTPVTVSDVRQIFEQQSKYPVGSIMMNDLMNGKHYL ADRMNELVSDGCRIIVADCVTQEDLDLIADAVITSQLKFVAVDPGVFTATISRKLIVPAD KKEKKKILAVVGSVNPVTRQQMEELWLAQRTAYNIFAATRRFLESGEEREAEITRITEEV LEASRAYDILTVTGDGIYPENRIDFKQYEDRFPGGVDDMTGTINDSFAEIAHRVFTAHQG FKGIYSSGGDITVSVCRRFQTAGLCLLDEVLPLAAYGQFLKGDFEGVHIITKGGMVGESD AANRCITYLKEKLYM Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:57:16 2011 Seq name: gi|229784083|gb|GG667652.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld45, whole genome shotgun sequence Length of sequence - 40900 bp Number of predicted genes - 33, with homology - 31 Number of transcription units - 17, operones - 6 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 672 613 ## COG0423 Glycyl-tRNA synthetase (class II) 2 1 Op 2 . - CDS 717 - 1343 653 ## Closa_0907 hypothetical protein - Prom 1371 - 1430 7.8 3 1 Op 3 . - CDS 1452 - 2216 393 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen - Prom 2256 - 2315 80.4 4 2 Op 1 . 
- CDS 3163 - 4005 1040 ## COG1159 GTPase 5 2 Op 2 . - CDS 4032 - 6956 3179 ## COG1026 Predicted Zn-dependent peptidases, insulinase-like - Prom 6990 - 7049 2.2 6 3 Tu 1 . - CDS 7096 - 7920 910 ## COG1434 Uncharacterized conserved protein - Prom 8154 - 8213 6.8 + Prom 7871 - 7930 2.2 7 4 Tu 1 . + CDS 8034 - 8156 79 ## - Term 8185 - 8225 -0.9 8 5 Tu 1 . - CDS 8319 - 16808 6692 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 16834 - 16893 10.1 + Prom 16925 - 16984 16.4 9 6 Tu 1 . + CDS 17107 - 17907 713 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 10 7 Tu 1 . - CDS 17840 - 18691 666 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 18731 - 18790 6.3 11 8 Op 1 4/0.000 - CDS 18920 - 20155 1486 ## COG0826 Collagenase and related proteases 12 8 Op 2 . - CDS 20160 - 20852 642 ## COG4122 Predicted O-methyltransferase 13 8 Op 3 . - CDS 20849 - 21244 477 ## Closa_0897 aminodeoxychorismate lyase 14 8 Op 4 1/0.000 - CDS 21340 - 23007 1984 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily 15 8 Op 5 . - CDS 23028 - 23495 496 ## COG0735 Fe2+/Zn2+ uptake regulation proteins 16 8 Op 6 . - CDS 23513 - 23794 378 ## Closa_0894 hypothetical protein 17 8 Op 7 6/0.000 - CDS 23802 - 24233 511 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) - Prom 24428 - 24487 1.5 18 8 Op 8 . - CDS 24513 - 24773 361 ## COG4472 Uncharacterized protein conserved in bacteria - Prom 24793 - 24852 9.2 19 9 Tu 1 . - CDS 24856 - 25806 592 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 20 10 Tu 1 . - CDS 26748 - 27083 287 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 - Prom 27197 - 27256 3.9 + Prom 27075 - 27134 7.2 21 11 Tu 1 . 
+ CDS 27234 - 27467 302 ## Closa_0890 Phosphotransferase system, phosphocarrier protein HPr + Term 27577 - 27615 7.1 - Term 27485 - 27517 2.2 22 12 Op 1 7/0.000 - CDS 27630 - 28808 1358 ## COG0301 Thiamine biosynthesis ATP pyrophosphatase 23 12 Op 2 . - CDS 28828 - 29982 1287 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 24 12 Op 3 9/0.000 - CDS 30000 - 30737 846 ## COG1385 Uncharacterized protein conserved in bacteria 25 12 Op 4 . - CDS 30747 - 31703 984 ## PROTEIN SUPPORTED gi|238917093|ref|YP_002930610.1| ribosomal protein L11 methyltransferase 26 12 Op 5 . - CDS 31713 - 32459 796 ## COG0730 Predicted permeases - Prom 32629 - 32688 7.5 + Prom 32464 - 32523 1.7 27 13 Tu 1 . + CDS 32580 - 32693 86 ## + Term 32875 - 32917 6.2 - Term 32857 - 32909 18.1 28 14 Op 1 . - CDS 32978 - 33547 523 ## COG5263 FOG: Glucan-binding domain (YG repeat) 29 14 Op 2 . - CDS 33570 - 36260 2335 ## COG2199 FOG: GGDEF domain - Prom 36314 - 36373 4.2 - Term 36412 - 36470 13.1 30 15 Tu 1 . - CDS 36479 - 37081 584 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain - Prom 37238 - 37297 11.2 31 16 Tu 1 31/0.000 - CDS 38201 - 38551 402 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain - Term 38616 - 38664 12.2 32 17 Op 1 29/0.000 - CDS 38691 - 40562 2244 ## COG0443 Molecular chaperone - Prom 40603 - 40662 2.2 33 17 Op 2 . 
- CDS 40664 - 40858 292 ## COG0576 Molecular chaperone GrpE (heat shock protein) Predicted protein(s) >gi|229784083|gb|GG667652.1| GENE 1 3 - 672 613 223 aa, chain - ## HITS:1 COG:CAC3195 KEGG:ns NR:ns ## COG: CAC3195 COG0423 # Protein_GI_number: 15896443 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 2 223 4 224 462 375 77.0 1e-104 MEKTMEKIVALAKSRGFVYPGSEIYGGLANTWDYGNLGVELKNNVKKAWWQKFVQESPYN VGVDCAILMNSQTWVASGHLGGFSDPLMDCKACKERFRADKLIEDYMAEHGITIEGSVDA WSQEEMKKYIDDNNICCPSCGKHDFTDIRQFNLMFKTFQGVTEDAKNTVYLRPETAQGIF VNFKNVQRTSRKKIPFGIGQIGKSFRNEITPGNFTFRTREFEQ >gi|229784083|gb|GG667652.1| GENE 2 717 - 1343 653 208 aa, chain - ## HITS:1 COG:no KEGG:Closa_0907 NR:ns ## KEGG: Closa_0907 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 208 1 209 209 181 45.0 2e-44 MKKMKAVLAVLITAGLLAGCAGRGGTSTFEPKSSGIFVTDAGTFSTATVETYDDQDYYNE EEWKTFLEENVASYNAEHGEGAVALQSCSLKDGTASMIFDYATGSDLAQFTALYEDTANS VNSIDIIPVAQALEEAGTAGTIFIKTADGKTASTDEIAKKTDYHVVAVDGGPVKIQTEGK IMYTSDGVKLNSSFIAEISEGKNYIIFK >gi|229784083|gb|GG667652.1| GENE 3 1452 - 2216 393 254 aa, chain - ## HITS:1 COG:UU038 KEGG:ns NR:ns ## COG: UU038 COG2865 # Protein_GI_number: 13357594 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Ureaplasma urealyticum # 1 247 222 459 463 205 43.0 5e-53 MDALDDKEFSGSLVVLLENGLDFVRTNSKVKWQKAGEERIELPDYPKRSVLEGLVNALIH RQYLDLGSEVHIDLFDDRLEIYSPGGMCDGSLVQEQDIDKVPSKRRNPVIADIFNRLKYM ERRGSGFKKIQSVYQSQYLYTDEMRPQFYSDKDTFILTLKNLNYKKGAEKGAEKGAEKGA EKKQFHKMEERREKILLLIKQNPHITQGEIMENLNLSRKQVQSLMRILNNQGQIRRVGSN RNGYWEVFREGRGW >gi|229784083|gb|GG667652.1| GENE 4 3163 - 4005 1040 280 aa, chain - ## HITS:1 COG:BH1367 KEGG:ns NR:ns ## COG: BH1367 COG1159 # Protein_GI_number: 15613930 # Func_class: R General function prediction 
only # Function: GTPase # Organism: Bacillus halodurans # 5 272 8 275 304 291 57.0 7e-79 MDNNYKSGFVTLIGRPNVGKSTLMNHLIGQKIAITSDKPQTTRNRIQTVYTDERGQIIFL DTPGIHKAKNKLGEYMVSVAEHTLKEVDVVLWLVEPTTFIGAGERHIAEQLQNVKTPVIL VINKIDTIKNQDEILEFIAAYKDVCSFAEIVPVSALKDKNTDLMLDLIYKYLPCGPQYYD EDTVTDQPMRQIASELIREKALRLLSDEIPHGIAVTIEKMTERKNGMMDIEATIVCERDS HKGIIIGKGGAMLKRIGSAARREIEDLMDTQVNSNPFGSN >gi|229784083|gb|GG667652.1| GENE 5 4032 - 6956 3179 974 aa, chain - ## HITS:1 COG:CAC3006 KEGG:ns NR:ns ## COG: CAC3006 COG1026 # Protein_GI_number: 15896258 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases, insulinase-like # Organism: Clostridium acetobutylicum # 9 972 12 975 976 874 46.0 0 MRQLEHLTAYEVVEEKQLDEMNATGTVLRHKKSGARIFAVSCEDENKVFSIGFRTPPEDS TGVAHILEHSTLCGSGKFPVKDPFVELVKGSLNTFLNAMTYPDKTVYPVASCNEKDFQNL MDVYMDAVFNPNIYKEPKIFMQEGWHYELESPEADLIYNGVVYNEMKGAFSSPEEVLDRY TRKVLFPDNCYGQESGGDPAFIPDLTYDQFLNFHRRYYHPSNSYIYLYGDMDMVEKLEWL DDQYLSKYDTLEIDSRILPQEPFDRPVEEETHYSITDGESEEQATYLSINTVVGDDLDPH LYVAFQVLEYTLLDAPGAPLKQALIDAGIGQDILGGYQNGILQPYFSVIAKNADKEQKGE FLAVVKGTLRKLADEGINKKSLKAGLNFYEFRYREADYGSAPKGLMYGLQCMDSWLYDGD PTVHLTYQKTFDYLKQAVDEGYFERLIRDYLLDNPFEAVLVVSPEKNLTAKEDEKTARRL AAYKASLSNEEIAALVEQTRALREYQETPSPQEELKKIPMLSREDIGREPEAIIWEEKEV EGVKVIHHKMFTSGIGYLKLLFDTSRIPEEDLCYVGLLKSVLGFVDTEHYTYGDLTSEIH LNSGGINLSVTSYPNLKDGADFKGVFIASVRVLYDKLDFGFSILGEILKNSILDDEKRLG EVISETRSRGRMKLEGACHSAAVARATSYFSPTSYYNDRTGGIGYYQFLEQLDREYPEHK KEIIARLKQVMARLFTVKNLLVSYTADEEGFRLLPEALRSLKEMLPEGSEETYPFTFPAG NRNEGFKTASQVNYVARCGTFAGSGYAYTGALRILKVILSYDYLWIHLRVKGGAYGCMSG FGRSGEGYLTSYRDPNLKETNEIYEGIVEYLEHFDVDDRDMTKYVIGTISDMDVPYPPST KGSRGLSAYLSGVDEAMMQQERDEILNATKEDIRALAPIVKAVLATGSLCVIGNEEKIEA NKELFHETETLFHS >gi|229784083|gb|GG667652.1| GENE 6 7096 - 7920 910 274 aa, chain - ## HITS:1 COG:CAC0441 KEGG:ns NR:ns ## COG: CAC0441 COG1434 # Protein_GI_number: 15893732 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 7 267 15 252 259 
110 28.0 4e-24 MTWILWLLAAACVIYYVVIVIYSGFTTSFSFIWMVFAAACLFLALGWEYYMKHKDRVPLW LPVSVITFCCTGILVFSMVEILIFTGAASRDASNLDYLIVLGARVKEDGLSKSLKSRLDK AIDYLEENPGTVLVLSGGQGEDEPVSEAAAMRDYLVFNGVSERQLILETRSFSTVENIAY SRIAIEEDQAERKARHARSDIIMEPGTYEEISDKPIRIGVLTSDFHVFRAEQIAKKWGIP DIYGISSESDPVLLVHFCVRECAAILKDKLVGNM >gi|229784083|gb|GG667652.1| GENE 7 8034 - 8156 79 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIQPKAGGIRNNTTECLSVPATAIGGHACPPIEIPINNKL >gi|229784083|gb|GG667652.1| GENE 8 8319 - 16808 6692 2829 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 2710 2828 465 620 621 102 38.0 2e-20 MEKKRMKIIWNQVLCAGKKCGKRWFSGILAVILAVSGGVVRDPMEAHADQISQEVTSFKT GEAWKDTAGKQIQAHGGLIQKFGDTYYWYGEDKTRGGRPIDGVRAYSSKDLYNWTDEGTV LKVMENREQFEMDEYFKTLYFDYNDAEKDEVYLNLRSSNCIVERPKVLYNEKTGKYVMWF HSDGPEAGKEEDSSASRYSRAMAGVAVSDTPDGPFQYVDSFKLHWVEGYAGVQRRGDSRD MNIFQDEDGSAYIIYSSEMNAYLYIAKLTDDYMGLQTPAGKVTPEVGKSGDGETWQARIL PDTSREAPAVFKSGEYYYMITSGTSGWDPNPAKYYRSKDIMAEKWEAMGDPCEGGSKTTF DSQSTYVVPIDPENGFYLYMGDRWKNGDLKNSSYVWLPLIVNDDGTIVIKNYDEWNLEET PRTIMKKSEGLEVYYTTDQGKVPALPEEITVTWLNGQKETAAIVWENLSAADFEQPFSKK NVSGTVKGQKIYTDIEVVPENLVYFVDCGTGSWGPESTDYPEIASKNPGLLNQVSDQEYK DGAEWGYVKTTAAKALVLTPDNASGKAESKYDVGVRTALDYITYKMKLTAGTYQFTAGFH EWWSGQKRTMKPVVTYTDASGAEMSAEGGTFTTSGSDMTVTEEFTIPADGVVSYTLKKTA NSAPVLSFLAVGKTKDAQEPDKPEEEKPELVLFASAFMLKGENEASNIKLNWNPIEGAAS YRVYRTDEKGENRQLLKETAGIMQDDYGVSEASWYVTEALDESGEILAVSEPAEGTPVKL GGEMKTRSNMAPGSDFVYDSKAVQVFAENGVSYKYSIEKDGETGKKLVEYQALDGVTYDA GRTVLSSKQYEIMGDCKFEGVAQMRKDGKIIIWAHFEPASGYSRAEISCMSGTMGGDDFT FYSERPHGNESRDLNIFNDNGTLYAISAANNNNDLNIYKIDESWTRVLPESEFPAITVCE GQHREAPNMVKVDGWYYLFTSEANGWYPSQGMYCSASTIQGLADAALVPIHVTTFGTQSG WMSSVGDNRLLVGSVWASSGQFGEGKNWTKVFPVSFKDGYAVYNYYPELAYNDDRVMVPV QNGKILSADQPAGIADGDAPGEAGCEAALVTDGRSDDKNVYYKSVESIGVPYSLTVDLGG LCSVSQVDVSFREPKGSDTRNLYKIYGSKDGIIFDEVLVDASSNKNPGFDSRAVENKGLY 
RYVKITVSEVRDMRHNDNKVSWSRGIHEMTIYGTPSDMKITGYEPEMAVDLGRKPELPAA VTLEYSDGRTMEEPVEWEYVSASELSEPFVKKNVKGTLRNFPDMTISTTLEVIPKELKYF IDCGTGDWGVKSEEFERISSVEGVSLFNDTSDRQFSSSEELTGQWGYSSKTSDTSSLKLT PDNTTGSSASKYEVGLRTNLSSGISYKLYLPAGEYMFSGGYHEWWSNQNRKIQPSVIYED ENGQSHEEKLSVVSLPGLGSDVMSSDIVKVPKDGVVTFQLSAASSTKPVISFLTVTMMTD DNRPAGKVTVSPEPGRYELDVIKPVELTSESEDSIYYTLDGTDPMTGDGSEEAAEGAELY DAPILFEEEGIYEINARSVRNASDGSVIWGPVTTARYNVTESSGPVERYDSVPVGKPWYD NNGEMIQAHGGGFLQMEDEQGAVYYWVGENKDHNGSSFNGINLYSSRDLLNWKFENTVLK PDSENPALRDNKIERPKLIYNKTTGQFILWGHWETADSYASSQICVAVSDTVNGDYTFLG HWRPGGKEKNWRTKSVNGSTIYVKDEEYESAVTNQVTPDGNMSRDMTVYVDGDQGYLVSA CADKHSICIYELNDEFTDILPGSEYHVFESDKLEAPAIIKSGDYYYLMGSGQSGWYPNQA RYAYTKDISNPEGWSELELIGNNTSFYSQPTNIMELTSPAGQKNYVYMGDRWNSKKLGES TYVWLPLEIDGTDMSLSYIPEWSLNAEDGTVQYEEAVVVSTGKPVTCSVEGQEGYGVEKA NDGDYFNTNKSGTSSSYFRPVSLPFTWTVDLEEVYDLSRIDISFNHWNGSEAYHQYHIYG SVDGNRWKQLVDESDNKTTGFKSHGLKGQYRYIKLEVIKAVKDKDGQSAAFAAGLVEVEV FAEAKAETADITGIEITARPFKTTYETGEDFNSEGLEVAAKYSDGTSAKLESGQYELSEV DTLVTGKKDVVVTYTTGEGKVYSDTFYIIVYDPDEMYADKLKVVKQPDRTIYQTGEEYDT AGMEVRILMKASASNAVPTKVKELSSQEYELEYDFGKPGKEKVTVVYYGIGKNGEERKLT DTVLVTVIDGETEYYQTGIAVEKEPDKKVYKIGGSFERTGMIVTRTMKASPSDASPSNVS YTEEIDNYRVEGDDFAEAGKKTVNILYDGTDRDGNDKVFKTSVTVMVTDCTSDVVEAEFE TIRSGLCEVLTGENAAFATEEEKQTAAAEAKEKLLGVLEENEGEWILTDEIMNLIESIET YYLMAYTNVRKIIEGPAFLTDGMKITGAGLSANPQSDYQRVEITVSEAETPEGLEEIEGK QANAMMFELAVNGEAVRLAAPVSVSMKVPYGIPEENLKILFVGEDGYTELIRPKVKDGFM VFGVKSFGTLVVFSDTENGGGEVDEKTLDHIAVTKKPVKTEYRTGEEFEPEGMVVTAYYS DGTKEVLTEYEVSGFDSGKTGTQMITVTYKGKTAEFEVTVRKRQSSYGGSGSGSSSWGVS AAVPNVPGTWKQDEKGWWYEKKAGGYLKAEWGRINGTWYYFNEEGYMETGWVQTGGQWYF LNADGSMVENNWVLSGDQWYYLKNGGSMAAGQWVTWKEKSYYLNQDGTMAMNTVTPDGYR VDENGVWIQ >gi|229784083|gb|GG667652.1| GENE 9 17107 - 17907 713 266 aa, chain + ## HITS:1 COG:BH2728 KEGG:ns NR:ns ## COG: BH2728 COG4753 # Protein_GI_number: 15615291 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 
132 232 399 499 510 81 44.0 2e-15 MRRKQIYIAEREELVIRGLDMLFKTLPEEVSIFGYGRDSEEVLQDFTFHTPDILFIDEML MFHQTMPLYVFLSIKYPQTSTVLMSSEALLPEKRNTHTDYYISKAILDENQIKALWTQLS NQPHRQKNAAESFDICRIKDYIVEHLKEDLSLETIANKFGYNYSYLSTCFSKYVKKGFKQ YVNELRIQHACRLLTESETTVSEAGSESGYSSQSYFTQVFKKYTGCTPSAYQLRHGGSDC QMESATLSDVSCIPPETYSHICEKCG >gi|229784083|gb|GG667652.1| GENE 10 17840 - 18691 666 283 aa, chain - ## HITS:1 COG:CAC2818 KEGG:ns NR:ns ## COG: CAC2818 COG2207 # Protein_GI_number: 15896073 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 4 277 1 275 279 120 28.0 4e-27 MTKMREKNGIYERQPFELFAYHDKKPTLDHSVFHFHNNYEILLYVKGNLEIYIEQSRYLP VPGTMFVISTEEVHKGTILDSEDYERYVIHVSSSVIQPLLGYDVDLLACFHGHEPGKQNA VVLSMEEIVEFCELFSKLKQILDTEPYGKEVLKISCVAQLMTMVNHAFSHTIMPARPVSN LASGIMDYVENHMSEPLSLESIAAALFMDKYYLAHQFKKQTGDSLYHFILMKRIALARRL LGEGLPVTEVCDRSGFNNYNNFIRTFRKYVNMSPGEYRKHQTE >gi|229784083|gb|GG667652.1| GENE 11 18920 - 20155 1486 411 aa, chain - ## HITS:1 COG:CAC1687 KEGG:ns NR:ns ## COG: CAC1687 COG0826 # Protein_GI_number: 15894964 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Clostridium acetobutylicum # 1 403 1 401 406 451 54.0 1e-127 MRKTELLIPAGSLDVLKTAVIYGADAVYIGGEAFGLRAKAKNFTNEEIREGIAFAHEHGV KVYITANILAHNDDLPGVEAYFEELKEIGPDALIISDPGVFAIARRVLPEMEIHISTQAN NTNYGTYRFWYELGAKRVVTARELSLKEIKEIRNQIPEDMEIESFIHGAMCISYSGRCLL SNFMTGRDANQGACTHPCRWKYSIVEEKRPGEYMPVYENERGTFIFNSKDLCMVDHIPEM IDAGIDSFKIEGRMKTALYVATVARAYRKAIDDYLKDPKLYEANLAWYREEIAKCVNREF TTGFYFGKPDASAQIYENSTYVKDYTYLGTVASVDGRGYARIEQKNKFSVGETIEVMKPD GRNLAAVVSAIYDEDGNPQESAPHARQMVDVDLGIEMEEYDILRRCESREE >gi|229784083|gb|GG667652.1| GENE 12 20160 - 20852 642 230 aa, chain - ## HITS:1 COG:SP0980 KEGG:ns NR:ns ## COG: SP0980 COG4122 # Protein_GI_number: 15900857 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 2 207 17 
221 237 146 37.0 3e-35 MIVNDRITDYINSLERSRGPLLTEIASKARKERVPVIREETASFLISLVESKRPAAILEV GTAVGYSTLLMAGVMPENCRITTIEKYEKRIPEALRNFERAGESDRITLLAGDAGEILKK LTGPYDFIFMDAAKGQYLVWLPEIMRLMVPGSMLVSDNVLQDGDIIESRYAVERRDRTIH ARMREYLYELKHTDRLATSIIPIGDGVAFSVLKVPEITNVCPSSAGEKEM >gi|229784083|gb|GG667652.1| GENE 13 20849 - 21244 477 131 aa, chain - ## HITS:1 COG:no KEGG:Closa_0897 NR:ns ## KEGG: Closa_0897 # Name: not_defined # Def: aminodeoxychorismate lyase # Organism: C.saccharolyticum # Pathway: not_defined # 1 131 1 131 131 164 73.0 1e-39 MSRATKEINRITGAIIGISGRLIVCALVVLLLYEGVTKGYEFGHDIFYATSVEAAPGRDR SITVEEGTSITDAAKLLKSYGLITNEYSFVIQAIFFDYEVNPGTYTVNTSMTSKEILQMM NENTGEEEEKE >gi|229784083|gb|GG667652.1| GENE 14 21340 - 23007 1984 555 aa, chain - ## HITS:1 COG:CAC1683 KEGG:ns NR:ns ## COG: CAC1683 COG0595 # Protein_GI_number: 15894960 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Clostridium acetobutylicum # 9 555 7 555 555 675 60.0 0 MKKQGSNAKIKIIPLGGMEQIGMNITAFEYEDSIIVVDCGLAFPSDDMLGIDLVIPDVTY LKQNIDKVKGFVITHGHEDHIGALPYILKEVNVPVYGTKLTIALIENKLKEHNLLKNTKR KVIKHGQSINLGCFRIEFIKTNHSIADASALAIYSPAGVILHTGDFKIDYTPVFGEPADL ERFAELGKKGVLALMCDSTNAIRPGFTPSEKTVGKTFDNIFADHKFNRIIVASFASNVDR VQQVINSAAKYGRKVVVEGRSMVNVIGTASELGYIDIPEGTLIDIDQLKNYPDEQTVLIT TGSQGESMAALSRMAGSTHKKVSIKPTDVVVLSSTPIPGNEKAVSKVINELAMKGAEVIN QDTHVSGHACQEEIKLIYALVRPKYAIPVHGEYRHLMAQKSLVQAMGIPKDNIVIMSSGD VVELGQESWGIVDHVQSGGILVDGLGVGDVGNIVLRDRQNLAQNGIIIVVLTLEKYSNQL LAGPDIVSRGFVYVRESEDLMEDARAVVNDAVQDCLDRHVNDWGKIKNIIRDSLSDFLWK RMKRNPMILPIIMEV >gi|229784083|gb|GG667652.1| GENE 15 23028 - 23495 496 155 aa, chain - ## HITS:1 COG:CAC1682 KEGG:ns NR:ns ## COG: CAC1682 COG0735 # Protein_GI_number: 15894959 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Clostridium acetobutylicum # 1 145 7 149 151 142 49.0 2e-34 MDQEVFKDMLRKKGLKVTNQRMLVLEIMAEHPGEHLTAEEIYDLAKKKCPEIGLATIYRT 
VQVLVDLSVIDKVSFDDGFARYELGGLESESRHHHHHAICSRCGRVFSFEGDLLDTLEQT ISDTMGFKVTDHEVKLYGYCRDCQKIMEKEEQDGK >gi|229784083|gb|GG667652.1| GENE 16 23513 - 23794 378 93 aa, chain - ## HITS:1 COG:no KEGG:Closa_0894 NR:ns ## KEGG: Closa_0894 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 92 1 92 93 108 83.0 7e-23 MNSEEKKITLVTDDGEAVDFFVLEETRINGMNYLLVTDAADDDEEGECYILKDLSESGDA DALYEFVEDDNEIDYLYKIFSELLEDADVDIEK >gi|229784083|gb|GG667652.1| GENE 17 23802 - 24233 511 143 aa, chain - ## HITS:1 COG:lin1537 KEGG:ns NR:ns ## COG: lin1537 COG0816 # Protein_GI_number: 16800605 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) # Organism: Listeria innocua # 1 134 4 136 138 141 58.0 3e-34 MGLDYGSKTVGVAVSDSLGLTAQALETITREDENKLRKTCARIEELIREYEIETIVLGYP KNMNNTLGDRVEKTESFKAMLERRTGLPVVLWDERLTTVQSERILQESGVRRENRKAVID KVAAGLILQGYLDSLPGGTTRGE >gi|229784083|gb|GG667652.1| GENE 18 24513 - 24773 361 86 aa, chain - ## HITS:1 COG:lin1538 KEGG:ns NR:ns ## COG: lin1538 COG4472 # Protein_GI_number: 16800606 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 19 81 20 82 90 90 65.0 9e-19 MGDIHNTQFFGSVQENKLDVSQVLEQVYIALSEKGYNPVNQIVGYIMSGDPTYITSHKNA RSLIMKVERDEILEELMRVYIDTMLK >gi|229784083|gb|GG667652.1| GENE 19 24856 - 25806 592 316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 
168] # 1 299 122 419 451 232 42 2e-60 MIDISAAKEYEALTIHKIADHTRAFIKVTDGCNQFCSYCIIPYTRGRVRSRRMEDVRAEV ERLVAGGYKEIVLTGIHLSSYGVDFREEERRTLLDLIVYLHEVDGLLRIRLGSLEPRIIT KEFAEALAALPKVCPHFHLSLQSGCDATLARMNRHYTTADYLERCDILRAAFDNPAITTD VIVGFPGETGEEFQTTEAFLRTVHFYEMHVFKYSRREGTRAAVMPDQVPEPVKTERSGVL LSLEKIMSLEYRKQFLGKMTEVLMEEEFEWDGRRYMIGYTKEYVKAAVPFEEGLKGAMVC GTLTEMVNDEVILLTR >gi|229784083|gb|GG667652.1| GENE 20 26748 - 27083 287 112 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 168] # 1 112 1 112 451 115 46 6e-25 MRKAALHNLGCKVNSYETEAMQQLLEEAGYEIVPFAEGADVYIINTCSVTNIADRKSRQM LHRAKKMNPQAVVVAAGCYVQSAGDALKMDEAVDLVIGNNKKTELVKILEEY >gi|229784083|gb|GG667652.1| GENE 21 27234 - 27467 302 77 aa, chain + ## HITS:1 COG:no KEGG:Closa_0890 NR:ns ## KEGG: Closa_0890 # Name: not_defined # Def: Phosphotransferase system, phosphocarrier protein HPr # Organism: C.saccharolyticum # Pathway: not_defined # 1 77 1 77 77 113 92.0 2e-24 MKTVRISLNSIDKVKSFVNDLTKFDTDFDLVSGRYVIDAKSIMGIFSLDLSKPIDLNIHS ESSVDEIVNILKPYIVD >gi|229784083|gb|GG667652.1| GENE 22 27630 - 28808 1358 392 aa, chain - ## HITS:1 COG:CAC2971 KEGG:ns NR:ns ## COG: CAC2971 COG0301 # Protein_GI_number: 15896224 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis ATP pyrophosphatase # Organism: Clostridium acetobutylicum # 4 381 3 374 384 360 50.0 2e-99 MQYKSFLIKYAEIGVKGKNRFMFEDALVTQIRHALKDIDGDFMVVKESGRIYATAESEYD FDEAVEALRRIFGIAAICPMVQVDDCGYEDLKKQVLAYVDEAYEDKNFTFKVNARRGNKK YPVNSDQINRDLGEVILDTFPETRVDVHQPEVMLHVEVRNRINIYSKVIPGPGGMPVGTN GKAMLLLSGGIDSPVAGYMISKRGVKIDAVYFHAPPYTSERAKQKVVDLAKLVSKYSGPI HLHIVNFTDIQLYIYDQCPHEELTIIMRRYMMRIAERLAADAGAQALITGESIGQVASQT MQSLAATDAACTMPVFRPVIGFDKQEIVDVAEKIGTYETSVLPFEDCCTIFVAKHPVTKP NLKMIERSEEKLKEKIDEMVETAISTVEKVWC >gi|229784083|gb|GG667652.1| GENE 23 28828 - 29982 1287 384 aa, chain - ## HITS:1 COG:CAC2972 KEGG:ns NR:ns ## COG: CAC2972 COG1104 # Protein_GI_number: 15896225 # Func_class: E 
Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Clostridium acetobutylicum # 1 382 1 378 379 338 47.0 1e-92 MDVYFDNSATTKVLAPVADLVMKVMTEDYGNPSAKHGKGMRAEQYIKEAAEIIAGTLKVS PKEIVFTSGGSESNNMALIETAMANRRAGNHIISTAIEHASVYNPLAYLEEQGFEVTYLP VDHNGHISLEDLERAVRRETILVSVMYVNNEIGAVEPVEEIAKLIHKINPAILFHVDAIQ AYGKFVIRPKRQGIDLLSVSGHKIHAPKGIGFLYVDSRVKIKPLIYGGGQQRGLRSGTEN VPGIAGLGAAAREMYRDHSERLKRMYEIKDYMISRLGEVEGTTVNSLPGDQSAPQIVSAS FSGVRSEVLLHALEEKGIYVSSGSACSSNHPAVSGTLKGIGVKKELLDSTLRFSFGVFNT KDEVDYCISVLQELLPVLRRYQRG >gi|229784083|gb|GG667652.1| GENE 24 30000 - 30737 846 245 aa, chain - ## HITS:1 COG:CAC1285 KEGG:ns NR:ns ## COG: CAC1285 COG1385 # Protein_GI_number: 15894567 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 243 1 243 250 172 38.0 7e-43 MYHFFVNQEQIGQDTVTVTGPDVNHIKNVLRMKPGEEVLISNGVDKDYVCRLQTITSDAV TAEIVTVTEGGTELPAKFYLFQGLPKSDKMEFLIQKAVELGVYQVIPVETRRTVVKLDAK KVEAKVRRWNAVSESAAKQSKRLIIPEVTGVMTFQEALNYAGGFDKNIIPFEHAEGMADT KAYFAGIRPGMSCGIFIGPEGGFEDEEISQASEAGVKPVTLGKRILRTETAGLAVLSVLM FQMES >gi|229784083|gb|GG667652.1| GENE 25 30747 - 31703 984 318 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238917093|ref|YP_002930610.1| ribosomal protein L11 methyltransferase [Eubacterium eligens ATCC 27750] # 1 318 1 318 318 383 58 1e-106 MKWKKFTLKTTTQAVDLVSSMFDEIGIEGIEIEDNIPLTAEETKGMFIDILPELPPDEGV AFVSFYLDDSQDVAGILERVEEGLNDLSMFTDLGERTITESETEDKDWINNWKQYFKPFT VDDILIKPTWEEIPEEHKDKLLIQIDPGTAFGTGQHETTQLCIRQLRKYVTPETVLLDVG TGSGILGITALKLGANTVFGTDLDENAITAVGENLEANGISGESFTVLQGNIIDDKAVQD AAGYEKYDIAVANILADVIIMLQREIPVHLKKGGIFITSGIINMKEEAVRAAFAANDAFE VIEVTYQGEWLSVTARKK >gi|229784083|gb|GG667652.1| GENE 26 31713 - 32459 796 248 aa, chain - ## HITS:1 COG:FN1706 KEGG:ns NR:ns ## COG: FN1706 COG0730 # Protein_GI_number: 19705027 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 3 
245 7 248 254 151 40.0 8e-37 MHQYLIVCPLVFLAGFVDSIAGGGGLISLPAYLAAGVPPHLALGTNKMGSTMGTVISTAR FAKSGFIKWKLSLFAAACAIVGSIIGSNLSLLASEKFLKGMMLFALPVVAFYVLKNKDMG DNKDTGSLPEKKMLAVSMAAALLIGTYDGFYGPGTGTFLLLVLTGAAKMDLRTASGTTKV VNLSSNIAALVTFLINGKVLLPLGVTAGVFCIAGHYIGSGLVVKSGRKIVRPVVLVVIMC LFVKIVKG >gi|229784083|gb|GG667652.1| GENE 27 32580 - 32693 86 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIPVIFILESLKQQIQKKILKNFEIDAIKPFADALIY >gi|229784083|gb|GG667652.1| GENE 28 32978 - 33547 523 189 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 76 188 512 614 693 67 37.0 1e-11 MRKIRKFVLTAALACCTVLWFSMAAFADNAQEVRDFPGFNGSGSKQTGPIEGSISYDCWY DEDDSDDDNSGPGAYTYKRGWRYSPGGWWFQNSDGSWPSNGWKYIDGRWYMFDGGGHMMT GWYTDGNGYKFYLNPTDDGTMGSMRIGWQIIDGKAYYFNTMSDGTLGRLLVNTTTPDGYH VGTDGIMAQ >gi|229784083|gb|GG667652.1| GENE 29 33570 - 36260 2335 896 aa, chain - ## HITS:1 COG:CAC0631_1 KEGG:ns NR:ns ## COG: CAC0631_1 COG2199 # Protein_GI_number: 15893919 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Clostridium acetobutylicum # 399 595 319 523 525 122 31.0 3e-27 MTEERQVIEFVRRFHQTCLEQQDVEQIGQMLSDDIEWTGAGIYNRIKGRENVLNAYAANQ NSDLHEYQILNADYTASALAGHIYSFSGSCYVRCRRKQQEDGERVVRVSGICRFEDGAGY MMRLHHSLPNRAWESDLQHMTQEEVWALRSLVGLQSKELEEKNSNLDALIQNIPGGVICC RDDADLELIFYSDGFLKMFGYTRDEMETIFENKFSRMIVPEDLKSTRLEVIRQMKNGNTK EIEYRVICRDGNYLVIQDRGQLVMRDGQPVFYCILIDITERRKADEELKMSLERYQIILN QATDIIFEWDIKSDTLTCSPNWVKKFGYQAVSDHISEKILKRGNLHSDDEKVFIQMQKEV LLGSPYGENEIRLAAKDGRYIWCRVRYTLQTDQEKRPVRAIGLIADIDREKREKERLLEL AEQDSLTGLYNRGAVQTLIQRYVVKARPSDRCVLMILDVDNFKKVNDIYGHLSGDAMLVD IADMLRRLFPENAVLARVGGDEFAVLLYHVKTMEEAAEKADQILKSFRSILKQQNDMISC SIGISTAPENGDNFASIYKNADAALYQAKRQGKNRYAFYAETLEDVIQPESQTDVAEAAV MDQKLADQNLTGYVLDILSRSHSMEKSINQLLEIVGRQVDVSRVYIFEDDEDGSISSNTF 
EWCKEGIQPEKENLQSIDMDSLDHYYDNFSEEGIFFCRDVSTLLTHQRMLLESQGIKSVL QCRIMDQGKPRGFIGFDECTENRYWTGSQIRLLQTISRILGIFLMKERAQERLCRAMNGF SAALEEQRDWFYILDMETCEFLYTSKKVQERDPQAVNGGVCYRAFFGADRMCEGCPMLLL DSVRTSASSLVWNRNMNCYVHTRAQSVLWPGGKEACMISCRIVEGLEKQREELERN >gi|229784083|gb|GG667652.1| GENE 30 36479 - 37081 584 200 aa, chain - ## HITS:1 COG:VC0856 KEGG:ns NR:ns ## COG: VC0856 COG0484 # Protein_GI_number: 15640872 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Vibrio cholerae # 4 199 178 369 381 153 43.0 2e-37 MYTQQSFFGQVQNVQTCPDCGGTGRIIKEKCPDCYGTGYITKRKKFKVTIPAGIDNGQSV RLAGAGEPGTNGGERGDLLVEAVVSGHPIFKRQDTSIFSTVPISFTRAALGGPIRIKTVD GDVEYEVKPGTQTDTKVRLKGKGVPSLRNKAVRGDHYVTLVVQVPERMNEAQKEALRRFD EAMGGDSSEGEKHRKKGIFK >gi|229784083|gb|GG667652.1| GENE 31 38201 - 38551 402 116 aa, chain - ## HITS:1 COG:BU152 KEGG:ns NR:ns ## COG: BU152 COG0484 # Protein_GI_number: 15616772 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Buchnera sp. 
APS # 4 114 3 112 377 124 54.0 6e-29 MAEKRDYYEVLGIPKDADDAAIKKAYRALAKKYHPDTNPGDAAAAEKFKEASEAYSVLSD PDKRRQYDQFGSAAFDGSGGPGGFGGFDFNGADMGDIFGDIFGDIFGGGRSRSSAS >gi|229784083|gb|GG667652.1| GENE 32 38691 - 40562 2244 623 aa, chain - ## HITS:1 COG:CAC1282 KEGG:ns NR:ns ## COG: CAC1282 COG0443 # Protein_GI_number: 15894564 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Clostridium acetobutylicum # 1 621 1 610 615 720 67.0 0 MGKIIGIDLGTTNSCVAVMEGGKPVVIPNSEGVRTTPSIVAFTKNGERLVGEPAKRQAVT NADRTISSIKRHMGTDYKVAIDGKNYTPQEISAMILQKLKADAEAYLGEKVTEAVITVPA YFNDAQRQATKDAGKIAGLDVKRIINEPTAAALAYGLDNEKEQKIMVYDLGGGTFDVSII EIGDGVIEVLSTNGDTRLGGDDFDNRITQWMVDEFKKTEGVDLSGDKMAMQRLKEAAEKA KKELSSSTTTNINLPFITATAEGPKHLDMNLTRAKFDELTLDLIERTAVPVQNALRDAGI TASELGKVLLVGGSTRMLAAQEKVKQLTGKEPSKTLNPDECVAIGASIQGGKLAGDAGAG DILLLDVTPLSLSIETMGGVATRLIERNTTIPTKKSQIFSTAADNQTAVDIHVVQGERQF ARDNKTLGQFRLDGIPPARRGVPQIEVTFDIDANGIVNVSAKDLGTGKEQHITITSGSNM SDDDIDKAVKEAAEFEAQDKKRKEGIDARNDADSMVFQTEKALQEVGDKIDANDKAAVEA DLNALKEAINRAPIEEMTDAQIEDIKAGKEKLMNSAQALFAKVYEAAQGSAGAGPDMGAG AGNAGGGASYQDSDVVDGDYKEV >gi|229784083|gb|GG667652.1| GENE 33 40664 - 40858 292 64 aa, chain - ## HITS:1 COG:BS_grpE KEGG:ns NR:ns ## COG: BS_grpE COG0576 # Protein_GI_number: 16079602 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Bacillus subtilis # 1 62 124 185 187 67 54.0 8e-12 MKTLEDTGVKPIEAVGQPFDPNFHNAVMHIDDESLGENTVAMELQKGYTYRDTVVRHSMV QVAN Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:57:47 2011 Seq name: gi|229784082|gb|GG667653.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld46, whole genome shotgun sequence Length of sequence - 33440 bp Number of predicted genes - 31, with homology - 29 Number of transcription units - 11, operones - 5 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 33 - 707 201 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 2 1 Op 2 3/0.000 + CDS 764 - 1717 1064 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 3 1 Op 3 16/0.000 + CDS 1733 - 2851 1181 ## COG1879 ABC-type sugar transport system, periplasmic component + Prom 2879 - 2938 2.9 4 1 Op 4 10/0.000 + CDS 3011 - 4525 1533 ## COG1129 ABC-type sugar transport system, ATPase component 5 1 Op 5 . + CDS 4590 - 5591 1050 ## COG4211 ABC-type glucose/galactose transport system, permease component 6 1 Op 6 . + CDS 5652 - 7142 1333 ## COG2407 L-fucose isomerase and related proteins 7 1 Op 7 4/0.000 + CDS 7162 - 8007 1057 ## COG0191 Fructose/tagatose bisphosphate aldolase 8 1 Op 8 . + CDS 8020 - 8991 882 ## COG0524 Sugar kinases, ribokinase family 9 1 Op 9 . + CDS 9004 - 9396 416 ## Aaci_1428 glyoxalase/bleomycin resistance protein/dioxygenase 10 1 Op 10 . + CDS 9411 - 10544 1093 ## COG0657 Esterase/lipase 11 1 Op 11 . + CDS 10558 - 11061 537 ## gi|266621780|ref|ZP_06114715.1| conserved hypothetical protein 12 1 Op 12 . + CDS 11071 - 12459 1091 ## COG2407 L-fucose isomerase and related proteins + Prom 12540 - 12599 3.9 13 2 Tu 1 . + CDS 12646 - 12885 236 ## Closa_1693 hypothetical protein + Term 12935 - 12997 -0.2 + Prom 12969 - 13028 6.4 14 3 Tu 1 . + CDS 13081 - 13242 63 ## gi|288870613|ref|ZP_06409831.1| conserved hypothetical protein 15 4 Tu 1 . - CDS 14155 - 15639 271 ## gi|225570110|ref|ZP_03779135.1| hypothetical protein CLOHYLEM_06206 - Prom 15821 - 15880 5.5 + Prom 15827 - 15886 5.2 16 5 Tu 1 . + CDS 16010 - 16513 479 ## Closa_0702 CarD family transcriptional regulator + Term 16514 - 16563 10.2 + Prom 16614 - 16673 5.8 17 6 Op 1 1/0.000 + CDS 16813 - 17862 1043 ## COG3392 Adenine-specific DNA methylase 18 6 Op 2 . 
+ CDS 17853 - 18722 942 ## COG0338 Site-specific DNA methylase + Term 18728 - 18790 4.6 + Prom 18765 - 18824 6.7 19 7 Op 1 11/0.000 + CDS 19004 - 19705 480 ## COG1180 Pyruvate-formate lyase-activating enzyme 20 7 Op 2 . + CDS 19702 - 21783 1578 ## COG1882 Pyruvate-formate lyase 21 8 Op 1 . - CDS 21832 - 22014 135 ## EUBELI_01710 two-component system, sensor histidine kinase YesM 22 8 Op 2 7/0.000 - CDS 22025 - 23524 1465 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 23 8 Op 3 . - CDS 23566 - 25053 882 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 25175 - 25234 6.3 + Prom 25149 - 25208 5.8 24 9 Op 1 . + CDS 25265 - 27019 1505 ## COG1653 ABC-type sugar transport system, periplasmic component 25 9 Op 2 . + CDS 27033 - 27146 56 ## 26 9 Op 3 7/0.000 + CDS 27134 - 28075 874 ## COG4209 ABC-type polysaccharide transport system, permease component 27 9 Op 4 . + CDS 28089 - 28976 884 ## COG0395 ABC-type sugar transport system, permease component 28 9 Op 5 . + CDS 29005 - 30984 1855 ## GYMC10_2964 hypothetical protein 29 9 Op 6 . + CDS 30988 - 32493 1574 ## COG3534 Alpha-L-arabinofuranosidase + Term 32521 - 32586 6.9 - Term 32717 - 32771 4.1 30 10 Tu 1 . - CDS 32778 - 33068 133 ## gi|266621797|ref|ZP_06114732.1| hypothetical protein CLOSTHATH_02977 - Prom 33307 - 33366 3.8 + Prom 32807 - 32866 3.4 31 11 Tu 1 . 
+ CDS 32989 - 33156 100 ## Predicted protein(s) >gi|229784082|gb|GG667653.1| GENE 1 33 - 707 201 224 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 1 207 105 312 323 82 28 4e-15 PIMIDNVGRFLGYAELLEDPSLNHKTVAALFMHSTGSVGGCVIEKGELLYGRHGYQGEFG HLLVETNQGRRCVCGNDGCLESMMSPESLRSIWKKYEAAYPHNALDSRKAGDIREVFLEA NSGNLLAGAVVDEFAVYMARAIYNIVLMYDPQQIILQGLLAYGGEYLEKKLNDLVRIFPS YGNADTGLVTYSKINDSDEGFITGAALYALNSCLFGDKYLLDKA >gi|229784082|gb|GG667653.1| GENE 2 764 - 1717 1064 317 aa, chain + ## HITS:1 COG:SMb20500 KEGG:ns NR:ns ## COG: SMb20500 COG0667 # Protein_GI_number: 16264230 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Sinorhizobium meliloti # 9 307 13 319 331 198 36.0 1e-50 MRYKHFKNAGVSVSALAVGTWAIGGQNYGQVNRSDSVRAIRTMIDQGVNLVDTAPCYGNG ASEKIVGEALRGIPRDQILISTKFGLITDVYSGEYIKHAGYKSVMREVESSLMNLETDYI DFYYVHWPDVNTPIDETMAALNTLKKQGKIRFIGVSNFSREQILAAEEYAQIDVQQPPYS MVNQKFTDLMKWGYEKGIDSMTYGSLGSGILTGAIRTMPDFEPGDMRLTFYDFFREPVFS KIMEFLKTMDRIAEDHGVPVAQVALNWSTQKEFVGTALCGVRNEQEALQNCKAFEWSLTE DEITVLDKELTRLNIGV >gi|229784082|gb|GG667653.1| GENE 3 1733 - 2851 1181 372 aa, chain + ## HITS:1 COG:YPO1507 KEGG:ns NR:ns ## COG: YPO1507 COG1879 # Protein_GI_number: 16121780 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 60 365 33 332 335 183 38.0 4e-46 MKRTVKKAFALSLTAVLCMTACTQKTAETSPAADGGGTGTTAAEAAAAENKSNGERPVAG MTCYMYSDTHIANVRNTVLKAAETGTVAVETSDSQFDVGLQMNAMDAFVTKGCKYLVINN INTNAKDQVIDVARKADLPIIFWNTDSPSDEEMDSYGPCYFVSSAAEQSGVVQGEAAAKY WKEHPEADRNGNGKMDYVMLMGQIGNYDTEMRTKYSIDTVKEAGVETNCLLEVVCEWQRA KAQDQMASVISANADDIDIVFANNDDMALGAIEALKAAGYFTDESNYIPVLGVDATKVGL QALEDGTMLATSLNNPVIMGKCIYKILELLEAGEEITSENLGYDIDDHKRVWLDYVAITK DNLEDADLSKYE >gi|229784082|gb|GG667653.1| GENE 4 3011 - 4525 1533 504 aa, chain + ## HITS:1 COG:HI0823 KEGG:ns NR:ns ## 
COG: HI0823 COG1129 # Protein_GI_number: 16272764 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Haemophilus influenzae # 2 502 8 504 506 518 51.0 1e-146 MEDKSIILEMKNISKSFPGVKALNSVSFTLRRGTVHALMGENGAGKSTLMKCAFGLYDPD EGEVYFNGSQVHFDNPRQGLEAGISMIHQELNPIRRRSIAENIWVGRMPEKKIAGLSFID HKKMYEETRRLFDQLHMEADPRTQAIELSASVLQLGEIARSISYGAKVIIMDEPTSSLTE TETATLFEVIHKLTSEGIAIVYISHKIDEILKISDEVTIMRDGCRIGTWPAEELTEELIV NRMVGRQMDNWFPELDNKPGEVVLEVKNFTSPNPMSFQDCSFQLRRGEILGLGGLVGAQR TELMEALFGIRSIKSGEVYKDGKKITIKSSADAIANKIGLLTEERRATGIIPGASVLDNT VIAKLGEYSNVLGMLDLKKCRQDTDEYIKTMNIKTPTRDTRIADLSGGNQQKVLLARWLL TNPDILILDEPTRGIDVGAKYEIYTIMLNLAKEGKSIIMISSEMPELMGMSNRVMIMCEG HVTGFLEGKDIDDSKIMQFASKVS >gi|229784082|gb|GG667653.1| GENE 5 4590 - 5591 1050 333 aa, chain + ## HITS:1 COG:HI0824 KEGG:ns NR:ns ## COG: HI0824 COG4211 # Protein_GI_number: 16272765 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Haemophilus influenzae # 2 333 6 336 336 266 48.0 4e-71 MQTKQFDIKKWAQNNAIYLVLIAIIIMITVKNPAFLSLRVFRDILMQSSTRLIMAMGCMF VILSGGADLSGGRMLGAAAVLAGSLAQMSTYANKFFPNLAELPIIVPIIACLLMGMAMGL LNGIVVAKLNVPAFIGTLGVQLIIYGGISIYYNMEPNRSQPLGGFLKTYSKLGTGSFFGI PYILIIAVVLMIACHIVLNYHRFGKTLFAVGGNQEAARVSGINVQKIIMVVYIIAGAFYG LTGALEAARTGGATNNYGQGYELDAIAACVVGGCSVAGGVGAVPGVAIGVTVFTVIQYGM TFVGINPYWQNVVKGAIIVIAVAIDVRKYARKR >gi|229784082|gb|GG667653.1| GENE 6 5652 - 7142 1333 496 aa, chain + ## HITS:1 COG:TM0951 KEGG:ns NR:ns ## COG: TM0951 COG2407 # Protein_GI_number: 15643713 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Thermotoga maritima # 10 487 4 465 471 130 26.0 4e-30 MIRYTQDRTVVLGYVPVRRDMFPSGPASELKSRVKARFDEMAEKCGDVELVTIDDTVEGG MLWDLRDADRVVELFRAKKVDAIAFPHCNFGQEEVVAKVAAEIGKPVLLWGPRDPSPETD KSGKTGPRDFDTQCGMFATSRALKRYHVPYTYIENCWLDSPVLDQGFDDFIRVVSVVKAF 
RGMRIAQIGSRPRQFLSVKVNESELLSKFGIEIVPIWPEEVKRMIEKLKRGLPPSPEGCE LPMGQPPLPPLKGLLPDPRIASVAAEIESSVDWSAVGHGAVETMAYLEVAIMDLAKIHQC QAAAVDCWGFSQDAYGIPSCFVLGELFDRGLPAACETDIHAAIGAVLLQAASRYSSPAFI ADVTIRHPEDDNTELLWHCGPFAKSLVKKGVIPAVKECKGMYEIEGGPLTVVRFDQDDGN YLLLADEGEGTEGPKTTGNYVWFKVNDWVKWEKKLMYGPYIHHVSGIHGHFAKVLKEACK YLGPVTHDSVEPVEEM >gi|229784082|gb|GG667653.1| GENE 7 7162 - 8007 1057 281 aa, chain + ## HITS:1 COG:lin2238 KEGG:ns NR:ns ## COG: lin2238 COG0191 # Protein_GI_number: 16801303 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Listeria innocua # 3 234 2 235 299 170 39.0 2e-42 MSFVRVKDLLEMGKKNNACVLAFDAADGNMIKSVIKGAEAAGKPVMVMLYPGMRNIMPFG VFAETTKYYAERASVPVGLHLDHCSDFDMIMEAIHEGFSSVMADGSVLSFEENVAFTRKV VEVARNFDVDVEGELGHVGTAADSDEFSRKEFFTRPDEVKEFVERTKVTSLAVSFGSAHG LYKSTPKLDIDHLKVLSAATDTPLVLHGGTGIPDDQLALAFANGINKINIGTEFFNLNTK LNQEFYAQDFTKCDPFGFPEYIQPKLTEYIAGRLELSKITL >gi|229784082|gb|GG667653.1| GENE 8 8020 - 8991 882 323 aa, chain + ## HITS:1 COG:APE0012 KEGG:ns NR:ns ## COG: APE0012 COG0524 # Protein_GI_number: 14600388 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Aeropyrum pernix # 1 297 2 302 310 126 27.0 5e-29 MEKVYDVIGVENPIMDFAVSIDRLPRTDSMSVMYDYLWQSGGNASSAIAALARLGARCSM LGVVGNDAFGAFCRDDMIRHGVDVSHLYTQEGDTTFTVCLAEEETKGRSFLGKMGVNGAL DDEQVDEAFIAGTRYIHTSMIECSAKKKAIEYARKHGVLVSVDGGAYTDEADFVIGNSDI LIISEEFYSAVFSDDSYMENCRKLTEQGPQIVIVTLGAKGCAGAKRGGAAFQLPPFEGHK IVDTTGAGDVFHGGFLYAHSQGWELEYCAKFASAVSYINCTSLGGRVGIPNRKMVEQFLK DHTIDYSDIEPRREFYRNVMKFE >gi|229784082|gb|GG667653.1| GENE 9 9004 - 9396 416 130 aa, chain + ## HITS:1 COG:no KEGG:Aaci_1428 NR:ns ## KEGG: Aaci_1428 # Name: not_defined # Def: glyoxalase/bleomycin resistance protein/dioxygenase # Organism: A.acidocaldarius # Pathway: Pyruvate metabolism [PATH:aac00620] # 7 127 4 123 129 71 37.0 9e-12 MSAEKNIRSSHVGLNVRDLDRSMAFYEVHLGFRPVSRIELENGTRIGFVELPEQFQLELV 
EDVNYEAGRDGPWNHVCMRVADFDESYRRLIREGIQFETEKLFMKEFWENGMKFAFFRGP DGERLEIAEY >gi|229784082|gb|GG667653.1| GENE 10 9411 - 10544 1093 377 aa, chain + ## HITS:1 COG:SSO2521 KEGG:ns NR:ns ## COG: SSO2521 COG0657 # Protein_GI_number: 15899257 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Sulfolobus solfataricus # 121 352 66 281 311 140 31.0 3e-33 MEQELYQRLLRSLAAGEYRIDAGGAEIAVRDCFDRIPGHLDPRTEAAASVTREDFETRAG EIDPELTRNQDKLLGMDNPEGHSLLEQLRYQFGWRTLDRSKGIETRYETFLVDGCEGTRW IYKPKTRKEDRPCLIYIHGGGFFGGDTLTVENQCKRFAELADAVVISIDYPLSPETKYPV AFHMCCETVSWVWDQAAHLGIDRKKIGISGDSAGGTMAVAVTLWDRFEGSKRLAYAGLIY PSIVWGPDRNGTYSVWKKEMFDNPEDNQYIKEQIETTGKLKKEAADWYLKEGENLHTPYL NPVEGDFTGFPKTLVMTAEYDYLRPECELFTKKLKDAGVHVRHIRYGGIVHGTFDRLGYA PQVEDMLMEMAKDFESL >gi|229784082|gb|GG667653.1| GENE 11 10558 - 11061 537 167 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621780|ref|ZP_06114715.1| ## NR: gi|266621780|ref|ZP_06114715.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 167 1 167 167 340 100.0 3e-92 MREIKVQQLNEEAFRKYGMYQSLTDNEQMKSRVIPCGSFYPDLLTLDFAQTTLPTISCCH VFRQEQMVVDFMEYHTCTCEGLIALDDDVVIYVGTPDMGELKIENLEAFYVPMHTFVKFN PMIIHGGQYPVHKEEAHLICMLPGRTFNNDMVFRKIEKEEEKGVLIL >gi|229784082|gb|GG667653.1| GENE 12 11071 - 12459 1091 462 aa, chain + ## HITS:1 COG:TM0951 KEGG:ns NR:ns ## COG: TM0951 COG2407 # Protein_GI_number: 15643713 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Thermotoga maritima # 4 450 2 463 471 154 25.0 5e-37 MQKKKIILGVAPTKRSFLSMEEAVRQKERFMAVIRNTLSDSVEIVDIDDISENGICWNMD ETRRIIEKFRTAGIDALFIPFCDFGEEQVAAAIASGFHLPVLIWGARDERPNSDTSRGRD TQCGMFAATKVLRRHGVTYSYIVNCETESDEFRQGYDRFIRVAAVLKALRNLRIAKIGNR PVPFMSVMTNEAELLRRLGVTVIPVSPADVSARARKIAEESGTTFLQYYGDLTSRMDASA MNAEEVKLSAAMKPALLEVMKENNCTAGAFECWSAFPSLIGVCPCVVLGEMADEGYPLAC ENDINGAVTLAILQACNLYKEPPFLADLTIRHPENDNAELLWHCGPFPYSLKAEESKARL 
VDGQERFELKQGHITTCRFDDAEGRYYLFAGEGDTTRGPETNGTYTYLEVDDWKRWEEKL MFGPYIHHLGCAYGSYKEVLQEAARYLGIIFDDASVPGVSSL >gi|229784082|gb|GG667653.1| GENE 13 12646 - 12885 236 79 aa, chain + ## HITS:1 COG:no KEGG:Closa_1693 NR:ns ## KEGG: Closa_1693 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 79 1 79 95 89 58.0 5e-17 MKFENIQNVEKLFQIIRDECTGTVELVSTEGDRINLKSRLSQYISMVRLLDAEFVRELQL VASDPADVERLLRFMRDGE >gi|229784082|gb|GG667653.1| GENE 14 13081 - 13242 63 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870613|ref|ZP_06409831.1| ## NR: gi|288870613|ref|ZP_06409831.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 50 6 55 55 83 100.0 6e-15 MENESGSLPAVPLRCGSRRIQADMETMIGADCMAAPEDTRDPGYMATAENLAS >gi|229784082|gb|GG667653.1| GENE 15 14155 - 15639 271 494 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225570110|ref|ZP_03779135.1| ## NR: gi|225570110|ref|ZP_03779135.1| hypothetical protein CLOHYLEM_06206 [Clostridium hylemonae DSM 15053] hypothetical protein CLOHYLEM_06206 [Clostridium hylemonae DSM 15053] # 31 452 10 416 590 383 53.0 1e-104 MSNTDISPAAPNSPGTQIAVTGPSGTVTPSSRPFILYTGTDLTGTATGDPEVRSYCALPD TEASRYRLNRDDHNQIFRFTFTANGTTGPAYCLDKFKDGPYGDVPYRSQTFETLFPDATE RQKQMLAWILANAYPTVSASQTFNLVGVDDTAAPPLDNNDAYAAVQVAIWVLLGQISPSE AVFLNCSDGAAHPKSDRLNQTVLRLIEMAGNFADTAARPAYSYPPGTGASCCRQEFIYCC NTGTIPSNPSESYLVFHGCPSEIRNVCGRLLVGPFSLRSSLTGTPILTIETFCDCGEDFS ASFMDFCGNPITSPAVGQEFYIALRLVHNAMCFRVNASLSGTVTRVVLMDPTAETTLNYQ PIGATLEDYPVTTTASICVCIVNPSMHSSGSSGKGGICITNNNSNSTNNNNNNNNSNNNN NNNNNSSNNGGGNNGFPWFPPCFWGLPPYPGWPFPPVGPWPPYPPQPPQPPYPPEPPWPP QPPWPPQPPWPPQP >gi|229784082|gb|GG667653.1| GENE 16 16010 - 16513 479 167 aa, chain + ## HITS:1 COG:no KEGG:Closa_0702 NR:ns ## KEGG: Closa_0702 # Name: not_defined # Def: CarD family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 166 1 167 170 125 41.0 6e-28 
MFQVNDHVVYGNYGICVVKAIGSLEMDSVVKDRLYYTLEPLYSEKNTIYTPVDKEDSMRC AITEQEAWKLIDGIQAQEMIQVADEKRAEQAYREIMRTNECSGWSRIIKTIYLKNRKRLA EGKRYTAKDDIYLRLAEDFLFRELAAALKVKKEDVESIISDRVKQLG >gi|229784082|gb|GG667653.1| GENE 17 16813 - 17862 1043 349 aa, chain + ## HITS:1 COG:FN1935 KEGG:ns NR:ns ## COG: FN1935 COG3392 # Protein_GI_number: 19705240 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Fusobacterium nucleatum # 1 327 1 312 333 202 36.0 8e-52 MRFIGSKTLLLDHIRAVAEEKAPDARSFCDIFSGTAAVARYFKQWYEVYSNDLLYFSYVL QRATIENDSIPEFDRLKEELGITDPVEYFNGKETRDMESLEQKRRFFQNTYAPTGGRMYL NDDNALRIDYARCTVEDWKTAGLLNENEYYYLVACIVEGIPFVSNTSGTYGAYHKSWERR SYKRYQLFQLEVTTNHKENRCYNEDGAELLKRLQGDLLYIDPPYNERQYLPNYHVLETAA RYDYPEVTGVTGQRPYENQKSEFCMKKNVTDAFERLISNARFQHIILSYSTDGLMTTEDI ERIMKMYGKPETFHIYEIPYRRYKSRKVKETERLKELIVYVEKQVEPCI >gi|229784082|gb|GG667653.1| GENE 18 17853 - 18722 942 289 aa, chain + ## HITS:1 COG:FN1923_2 KEGG:ns NR:ns ## COG: FN1923_2 COG0338 # Protein_GI_number: 19705228 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Fusobacterium nucleatum # 3 287 1 304 304 198 38.0 8e-51 MYIKSPLNYTGGKYKILKPVLGAFPRQIGSFVDLFAGGFNVGINVEADRIFCNDQITYLI GMYDMFRKTDTEVLLTRIRSLIGSYQLTQQNKEGYYALRRHYNESRDLTELFVLTCYAFN HQIRFNNSHEFNSPFGKNRSSFNGTIEKNLTAFCRALKEKNISFYTTDFREMNLDFLGEN DLIYCDPPYLISTGSYNDGNRGFKDWHEEEEEALLTLLDRLNDQGTRFALSNVLYHKGLS NDLLIDWSKKYHITCIDKSYANCNYHFKDRDAVTVEVLITNYVPEGMEE >gi|229784082|gb|GG667653.1| GENE 19 19004 - 19705 480 233 aa, chain + ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 16 206 80 269 302 107 33.0 3e-23 MVGTASCTGCGKCREVCRHKTCISCGECIPVCPLHLRRIAGEKMTSEELIFRIRKSSDYY ARYGGGVTFSGGEPLMQAEFLTEVLSGIPEVHRAVETSGYCEEDVFRKVIAHLDYVMMDI 
KMFDAVLHKKYTGVDNKKILGNARILCAGEIPFVIRIPLIPGVNDNEENFRSTAKWIAGA KALIKVELLPYHKTAGAKYAMVKKEYRPAFDPEQTVWVSQKVFEEYGIRSEVL >gi|229784082|gb|GG667653.1| GENE 20 19702 - 21783 1578 693 aa, chain + ## HITS:1 COG:pflD KEGG:ns NR:ns ## COG: pflD COG1882 # Protein_GI_number: 16131789 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 102 686 156 760 765 256 31.0 1e-67 MTENIEKMRQFFVVDKGQKAMWQEPEDPYILAEQFAAEQIPDVDRAARRLIYVLDKEEPV LFEGERIAFTRTVTTIPEIFTEEEMRDLKKTHWIHEKGDVCNISVDYTKLLNQGFYAKKT ELEALREEFFQRGEREKAHYLQLQIDILDHVLSLASRYQKLAEQRGNEVVAKTLLSVPAH APQSFLEALSMFRIIHFTMWCGRNYHNTVGRFDQYIYPYLKADLDKGIHTKESALELLEE FFLTFNRDSDLYPGMQQGDNGQSLVLGGLNEDGTDSYNLLSELCLKASLELKVIDPKINL RVHEKTPLDTYILGTQLTKQGLGFPQYSNDDVVIPGLMELGYKKKDAYRYVVAACWEFII PGTAMDIPNIEALSFTKAVSDAVMEKLEQCRSYEEFEEAVRNKINSQVEALCKQVNGVYL YPAPFLSLMMEGCCEKGIDVSLGCKYNNYGFHGTGIATAADSMAAVKKFVYEEKSTDALS LVKALKNDFESDELLCNRLRYESPKMGNDNDYVDGIAVRLLNIFADSLKGRSNDRGGIYR AGTGSAMYYIRHAENLPATPDGRRKGEGFGANYSPSLFSRCRGPVSIIQSFTKPDLKRVI NGGPLTLELHDTVFRTEESCAKVAVLVKSFMDLGGHQLQLNSVNRDTMLEAQKHPENYKN LIVRVWGWSGYFVELDKEYQDHIIQRMELMLAV >gi|229784082|gb|GG667653.1| GENE 21 21832 - 22014 135 60 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01710 NR:ns ## KEGG: EUBELI_01710 # Name: not_defined # Def: two-component system, sensor histidine kinase YesM # Organism: E.eligens # Pathway: Two-component system [PATH:eel02020] # 1 53 525 577 585 63 49.0 2e-09 MEKEAVEHLFSVQTSGYGMKNVNDRLRLLYGDKYTLTITSKPDRGTTAFVRLPCDIGQKK >gi|229784082|gb|GG667653.1| GENE 22 22025 - 23524 1465 499 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 1 469 12 492 577 117 25.0 5e-26 MGIRRKVLIVFLLVGALPFLCYIILSNRYTNRIILERERSLTELALEQAVSSVDNKLITY NNLSNFMFNNNAIMKVLNTSYHDDYFQMYKAYSDTIEPLFLTYYALYPDLEAITIYSSGD 
IHPYNNYILGLDSLYQKEWFPQVHNQYIPTWVSAVEDGKPYLYSTRLIGDIRQYKSSNYL SFRINYDMFFEPFLSLSEDKYDVAVTAQDGSLIFSTSDLPREELTAFLTDLSDTRYLKLT SSIPETGWNICYYKTYTSIHSAVNAITRGTYAFGWIILLFLGVMVLIISVSIVTPIENLT EKIDEVRTGDMNDLSVSLGSSRHDEIGTLFQTFSRMMGEINHYIMVNLKHELEKKNYQQK ILYAQINPHFLYNSLSLINSRAILSGQNDISEMVILLSTFYRTALNKGKDITTLENELMN IQAYVKIQLFSYSESIEVVYNIEEALAGVPFPNFILQPLVENALDHGLKNSLKKDKKLTV TVKKEEYMAVDFISIWIEG >gi|229784082|gb|GG667653.1| GENE 23 23566 - 25053 882 495 aa, chain - ## HITS:1 COG:BH0793 KEGG:ns NR:ns ## COG: BH0793 COG4753 # Protein_GI_number: 15613356 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 491 1 503 508 150 25.0 7e-36 MYQILVVDDSVLDVDCITFLINKYELPLEVATAVNGQEALGLFQNPENHFDILFTDIRMP FLDGLGLSKEVRKLSPNTRIVIFSGFNDFEYAKTAITIGVEDYLMKPVIPDEFVSVMTRV INGVEETNRQIQTRRSETRLLKNHMLWLAVNGKNGTPADSSSFFKDRYTGLMLIECSNEF FSLEGISFQEKLEGILTVPFDYLNLDPARSILFFKEEAFLFPCAQAVCLCAEKEFNQKCC VAFQDLPDNSSISQIYTQLEKRLENHFFFPEQNIFLPDGPDHPYAASGHISIDIVSDDLR LKDYDSLMHHLEDLFESLKHEKTHSLIFVKYCFTELMKEIIHYLPDDQKPDLNKVAEKIY SSANIVELMDMTCSQAKKLLSQTQKNQGSCLKSDKIRQYIYQNYTQPLSLNDISSHFYIS PNYLCSVFKKETGCNLMKFINEYRLEQAAKLLLETEMKVNKIAETVGFSNPSYFSQRFRD YYGESPENYRQKETY >gi|229784082|gb|GG667653.1| GENE 24 25265 - 27019 1505 584 aa, chain + ## HITS:1 COG:BH1064 KEGG:ns NR:ns ## COG: BH1064 COG1653 # Protein_GI_number: 15613627 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 71 362 36 331 550 122 29.0 2e-27 MEVTMKRRKKLVSLLVTACMAGGCMAGCGGNTKAPEDSSVKTESTTKDAAKESNVQSAAV TEEVDPYGPVNEETTVIHVGRAESANVTYPDGEDSHNNYIVKYLQDQLNTEYIYDFSVDD ASYPTKVAMAIASGDMPDVMNVTYTQLVQLVNAGAVEDMTEAFATYRSDALKKCFDSTNG MSEGLATFDGKLMAIPDIQPGMDSVPLVYIRGDWMEELGLEEPKSLDDIVNIAKTFMEKN PGGNVTDGIAVGLSSDKKLVQKNGGTWHINGLFTLFDAYPKMWIKQDDGSVVYGSITEEA KTALGEIRTLVEEGVIDPSFAVRDSEQCVEMISNGQAGIFFGAWWSNQWPVVSMLEGADD 
SVKWNSYVAPLNAEGKLNVPMKSPTTTYIVVKKGCSEDIKEAVVKTINYQYDLDQSQAEN VRPNGMDSPFSWHYYPINVLHCDYDAKEKQIQSVMDCIDGKVAYDDLSGDGKTWYNGYTA VQKDGFRKTVEKNISTANGWGWANGAWTVQSQADKLNLQYAASYADSPSMESLWATLETQ EDEFYLQVLTGNTTIDEFDKFVSQWKALGGDTITQEMSELAGAN >gi|229784082|gb|GG667653.1| GENE 25 27033 - 27146 56 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRNGKPEVKNLRFSKSGRAGIPNYREFKKEGSFVWEK >gi|229784082|gb|GG667653.1| GENE 26 27134 - 28075 874 313 aa, chain + ## HITS:1 COG:BH1065 KEGG:ns NR:ns ## COG: BH1065 COG4209 # Protein_GI_number: 15613628 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Bacillus halodurans # 10 313 15 317 317 279 47.0 6e-75 MGKVAGRKRRRKEGRHVTIRQTYQYHLLMLPGFAILFIYTIIPFFGNIMAFQNYQPIMGF LKSKWVGLANFKHMLLLPDTARIFRNSFTIAVGKLVLTMVLSIFFAILLNEITSVKFKKS VQTICFLPHFLSWVILATIFKNLLDTGGILNQGLMSLGILKEPLMFLGSNKLFQGVVIST DVWKEFGYGAIIFIAALMGINPELYEAASVDGAGRLAKIWHITLPSIRVTIVMVATLNIA NILNAGFDQIYNMYSPVVYQSGDIIDTYVYRLSFVNAQYSLATAIGLLKSAISFIMIVTA HKLAKKFANYQIF >gi|229784082|gb|GG667653.1| GENE 27 28089 - 28976 884 295 aa, chain + ## HITS:1 COG:BH1066 KEGG:ns NR:ns ## COG: BH1066 COG0395 # Protein_GI_number: 15613629 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 1 295 1 293 293 232 43.0 7e-61 MKKYNTWQNRLFDFCVFLLLLILGLSCLIPLLHVIALSLSSKTAALSGKVTLWPVELTFT PYKVLMTDHKFMQAMWISVQRVVLGGAINLVLTVITAFALSQDETDFPARKVYMWFMIFA MLFGASLVPWYFVIKATGLMDTIWALVIPGALPVYNTILLMNFFRNLPKEIKESAMLDGI TPFQMMMRIFVPLAKPAIATVTVFSVVFHWNNFFDGLLLMNTPSNMPLQTYIQTLTNTTV SVMSSSLTPAEIEALTSLKTFNAAKIVVASLPIVIFYPFMQKFFVSGLTLGSVKE >gi|229784082|gb|GG667653.1| GENE 28 29005 - 30984 1855 659 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_2964 NR:ns ## KEGG: GYMC10_2964 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 5 650 6 665 676 491 40.0 1e-137 
MRINRMASVPLLVHDPYFSIWSSADCLYDADTAHWSGKEKRLYGHITADGERFRFLGGED GYPVIPQTSLEVTPTATTYTFENEKLRLFVRFLSPLLPEDPVLVSRPCTYVDFTVSRKQE VSVKIDFTVTGDLVFDTPGKIIGGSHENETYHFHYGTMRKAFQTPLGHSGDRITIDWGDM ILASEDEAVTVFFDPEYSLLKASAELGTDQMACTVIAAYDDLLSIFYFGSWQKAYWTTTY ETIFDAIGESFRDREEVTEKAQAFDRSLEEQARNTGGDDYAFLCSISYRQVMAAHKLIAD QEGNLVFLSKENDSNGCTGTVDISYPSAPMFLLFNPEYVKAMLRPVFRFAGCPVWEYDFA PHDVGRYPYAAGQVYGLSKEGENREFCCDNGAIFPFYYEYPAGLSVYDVRDQMPVEESGN MMILTAAVCELERSGAFASPYYEVLKTWAGYLLAYGEDPGEQLCTDDFAGHLAHNVNLSA KAVMGIEAFSRISACLGKEEEAKRYHEAAVRMAQSWETRSDAGDHYRLTFDCEESWSLKY NLVWDHFFGSGLFKPEVFEKETAFYLKKSSTFGVPLDSRKTYTKSDWILWCAAFAPDRDA LYRLMAPVAEYARSTEDRVPFSDWYDAESGRYCHFIGRSVQGGIYMPMLIEKRKIKQEA >gi|229784082|gb|GG667653.1| GENE 29 30988 - 32493 1574 501 aa, chain + ## HITS:1 COG:BS_abfA KEGG:ns NR:ns ## COG: BS_abfA COG3534 # Protein_GI_number: 16079924 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Bacillus subtilis # 1 498 1 497 500 654 60.0 0 MKKAVIKIDKDYKIGPIDNRIYGNFIEHIGRAVYGGIYDPGHPLADDMGFRKDVIALTKE LKVPIVRYPGGNFVSGYNWEDGIGPKELRPRRTELAWFAIEDNQVGTDEFCEWSRRAHTD VMMAVNLGTRGADAARNLVEYCNFPGGTYWSDLRIKNGYCDPHGVKLWNLGNEMDGPWQI GHKTAEEYGRAACETAKVMKWVDPSIELVACGSSNSGMNTCYEWEAAVLDHTYDFVDYVS MHQYYGNEKNDTPSFLANTMDMDRFINHIVSVCDYIQAKKRSKKVLNISFDEWNVWYHAF ADNAKAEKWKEKPAYNEDQYNFEDALLVGGMMITLLKHADRVKVACMAQLVNVIAPIMTE NGGRIWKQTIYYPYLHASLYGRGTALNVLVDAPKYDCNEYTDVPALESIAVESEDGAYVT IFAMNRGEEDLEVTCCLRDYPGCCVEEFITMAGYDLKQYNTADQPDRVIPQSSSAYEMEG DCLTIKFSAYSWNVVRVRVGR >gi|229784082|gb|GG667653.1| GENE 30 32778 - 33068 133 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621797|ref|ZP_06114732.1| ## NR: gi|266621797|ref|ZP_06114732.1| hypothetical protein CLOSTHATH_02977 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02977 [Clostridium hathewayi DSM 13479] # 1 96 125 220 220 187 100.0 3e-46 MRIKTSLTSSLVSLCAEGYGIFFSTPMLLKHLYDTQPKCFESLHVFPVMEFQGTRKTLLL YHKNKYLSKPLSSSIHIIQQIYKEHKCVMDRIQAER >gi|229784082|gb|GG667653.1| GENE 31 32989 - 33156 100 55 
aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGVLKNIPYPSAQRDTRELVSDVFMRINGGVPFWTSFDTIASCRRLEAGCKNIGT Prediction of potential genes in microbial genomes Time: Fri Jul 1 00:59:12 2011 Seq name: gi|229784081|gb|GG667654.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld47, whole genome shotgun sequence Length of sequence - 31175 bp Number of predicted genes - 31, with homology - 29 Number of transcription units - 17, operones - 6 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 408 - 1130 671 ## Closa_0179 Colicin V production protein 2 1 Op 2 . - CDS 1142 - 1975 748 ## COG0561 Predicted hydrolases of the HAD superfamily 3 1 Op 3 . - CDS 2070 - 4796 2812 ## Closa_0177 hypothetical protein 4 1 Op 4 23/0.000 - CDS 4827 - 5423 554 ## COG0353 Recombinational DNA repair protein (RecF pathway) 5 1 Op 5 30/0.000 - CDS 5426 - 5779 168 ## PROTEIN SUPPORTED gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 6 1 Op 6 . - CDS 5796 - 7418 1565 ## COG2812 DNA polymerase III, gamma/tau subunits 7 1 Op 7 . - CDS 7432 - 8511 1311 ## COG0205 6-phosphofructokinase - Prom 8569 - 8628 6.5 - Term 8964 - 9013 0.0 8 2 Tu 1 . - CDS 9205 - 9702 510 ## COG0590 Cytosine/adenosine deaminases - Prom 9924 - 9983 6.5 + Prom 9690 - 9749 7.6 9 3 Tu 1 . + CDS 9962 - 10657 490 ## COG1180 Pyruvate-formate lyase-activating enzyme - Term 10572 - 10610 1.2 10 4 Tu 1 . - CDS 10674 - 12809 1416 ## Closa_0170 hypothetical protein - Prom 12861 - 12920 5.0 + Prom 12825 - 12884 10.2 11 5 Tu 1 . + CDS 12914 - 13669 794 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase + Term 13705 - 13745 1.1 - Term 14031 - 14082 6.3 12 6 Tu 1 . - CDS 14111 - 15502 607 ## SZO_02320 transposase - Prom 15552 - 15611 5.6 + Prom 15626 - 15685 6.1 13 7 Tu 1 . + CDS 15713 - 15871 154 ## gi|288870625|ref|ZP_06409838.1| hypothetical protein CLOSTHATH_02993 + Term 15897 - 15936 7.5 + Prom 15939 - 15998 3.7 14 8 Op 1 . 
+ CDS 16019 - 16849 534 ## COG4587 ABC-type uncharacterized transport system, permease component 15 8 Op 2 . + CDS 16839 - 17699 350 ## Sgly_2655 protein of unknown function DUF990 16 8 Op 3 . + CDS 17653 - 18699 797 ## COG4586 ABC-type uncharacterized transport system, ATPase component + Prom 18721 - 18780 6.5 17 9 Tu 1 . + CDS 18851 - 18934 86 ## + Term 19022 - 19058 7.3 18 10 Op 1 . - CDS 19049 - 20014 544 ## COG2801 Transposase and inactivated derivatives 19 10 Op 2 . - CDS 20011 - 20688 323 ## Clole_0189 transposase IS3/IS911 family protein - Prom 20836 - 20895 6.0 20 11 Op 1 2/0.000 - CDS 20923 - 22134 1102 ## COG1015 Phosphopentomutase 21 11 Op 2 . - CDS 22131 - 23288 807 ## COG3457 Predicted amino acid racemase 22 11 Op 3 . - CDS 23289 - 24359 990 ## Closa_2021 hypothetical protein 23 12 Op 1 . - CDS 24465 - 24821 476 ## Closa_2025 hypothetical protein 24 12 Op 2 . - CDS 24856 - 26136 1477 ## Spico_1515 hypothetical protein 25 12 Op 3 . - CDS 26222 - 26584 282 ## Closa_2026 PRD domain-containing protein - Prom 26660 - 26719 1.7 - Term 26862 - 26898 0.6 26 13 Tu 1 . - CDS 26906 - 27838 821 ## Closa_2027 hypothetical protein - Prom 27952 - 28011 4.6 27 14 Op 1 . - CDS 28045 - 28143 57 ## 28 14 Op 2 . - CDS 28115 - 28285 175 ## gi|266621822|ref|ZP_06114757.1| ATP phosphoribosyltransferase - Prom 28422 - 28481 8.3 + Prom 28383 - 28442 6.6 29 15 Tu 1 . + CDS 28556 - 28927 199 ## ELI_2972 hypothetical protein - Term 29018 - 29053 7.4 30 16 Tu 1 . - CDS 29081 - 29917 669 ## COG0789 Predicted transcriptional regulators - Prom 29943 - 30002 4.4 31 17 Tu 1 . 
- CDS 30080 - 31174 1054 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific Predicted protein(s) >gi|229784081|gb|GG667654.1| GENE 1 408 - 1130 671 240 aa, chain - ## HITS:1 COG:no KEGG:Closa_0179 NR:ns ## KEGG: Closa_0179 # Name: not_defined # Def: Colicin V production protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 240 6 244 244 313 68.0 4e-84 MENWLSIAAGVYLLSMVLYGHYRGFIRLAVSMVALVAALAIVHVSMPKVTEYLKENTSIQ QTLSDSMKQAAGIGLPDETGDGEEKDVPSVQRTIIENLNLPQNVKAALIENKNHEVYELL GVNAFADYIGNYLADMILNSVGFVLLFAIVYLLIRLVVRWLDLIARLPILSGMNKIAGAL LGGVQGLLFLWILCLVLTACSGTAWGMTLIRQVEASKWLSFLYEHNFLNAIVLGVIHGML >gi|229784081|gb|GG667654.1| GENE 2 1142 - 1975 748 277 aa, chain - ## HITS:1 COG:BH0497 KEGG:ns NR:ns ## COG: BH0497 COG0561 # Protein_GI_number: 15613060 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Bacillus halodurans # 4 274 6 245 247 133 32.0 4e-31 MDIRLIALDLDGTLLDSQKRLSSRNKKALTECLRRGIHIVPTTGRTVSGIPQVVRELPGV RYAITTNGAVVEDMKDDLTLSECTIPWQLALNILKIVDHYHVMYDPYIERRGITEPRFYE HLTEYGLAPELQEVVRLTRDVHPNIIEFVETARKPVEKINLFFPEIEERARVREALEAID GILITSSMPMNLEINAPGATKGGGIRRLAEHLGLKREQTMAMGDGENDFSMILEAGIGVA MKNGRPDLCEAADYITDTNDEDGVASAIYRFVLETEI >gi|229784081|gb|GG667654.1| GENE 3 2070 - 4796 2812 908 aa, chain - ## HITS:1 COG:no KEGG:Closa_0177 NR:ns ## KEGG: Closa_0177 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 300 1 304 765 392 76.0 1e-107 MDKYEFNIKVEQIKKLVGKSDYETAMKIADTIDWRRVRNTNLLSMVAMVYEKNEDYEEAR EILLLAFERAPIGKRLLYKLAELALKEGNIDEAEAYYREFSDLAGDDPRQQLLRYLILKA KGAPAQQLIHSLESYTSQEIDEKWLYELAELYSIAGMADRCVETCDKIMLMFGLGKYVDK AMELKIQYAPLTSYQMDLVENRDKYEAKLRAVEQEYGLGGNQMQVQDQEEEYYDDGNQGM DMPEEPDFEDLQARMQEAEVQEGLAREMSRMSYEEPPAAQRAMRHDRTRVLDDIRRINRP VYHDASMAAGETAAAGAAYAENAFAGAAGAVYGAADSAVYMAGNAERLASAMPRPRGMMR MPEMDEEAAAEPVIEEDYAYGNSGYNGSSYGNTAYDEPVYGEDAYDNTGYGETAYDDPAY 
GENAYEDNTYGEPFNGENTSGEPAFGEADPEEEAYSDYSDMESSGDDPACGENAYEEPEL SYAESETEESADVPVEAEYTAPESDPAESEVLNSQEPEEEALEIEDLDDDAEEPPVLNHL MIEARTPEKGLKIAVEALKQIHNETGIKNPVAKITGEKLSRRGVLASAEKLAGKDLVIEE AGDLTPEALEELYTLMNRDASGMIVVLIDNPKQMENLHRNHPRLASKFECIGSGEAYEPG EAYDQPVSPVREPKQAVRAEVSAPGKAAPKAAVRPLYPERGEAPVRRQPVPPEESYEEDS YEHDSYEQDPYEQDIRNAESYETDSYGEEGYEEIPNEEKVHNKKKGLFRNRKKPVYEELE PEESYDENDDYYVEEDEDGENGMTYRNDNGPGSHGEEMDIDEFAQYACRYANEIDCSITG KSMLALYERIEIMEEDGVALTRENAEGLIEEAADRAEKPSFGKRVKGIFSSKYDKDGLLI LKEEHFIY >gi|229784081|gb|GG667654.1| GENE 4 4827 - 5423 554 198 aa, chain - ## HITS:1 COG:CAC0127 KEGG:ns NR:ns ## COG: CAC0127 COG0353 # Protein_GI_number: 15893423 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Clostridium acetobutylicum # 1 198 1 198 198 233 54.0 2e-61 MDYYSSQITKLIEELSKLPGIGSKSAQRLAFHIINMPEESVTSLASSITEAKHNVHYCKE CFTLTDQELCPICKSDKRNHKVIMVVEDTRDLAAYEKTGKYDGVYHVLHGAISPMLGIGP NDIKLKELMQRLQGDVEEVIIATNSSLEGETTAMYLSKLIKPTGVRVTRIASGVPVGGDL EYIDEVTLLRALEGRVEL >gi|229784081|gb|GG667654.1| GENE 5 5426 - 5779 168 117 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 [Roseobacter sp. 
AzwK-3b] # 10 109 6 105 114 69 35 3e-11 MAKRGGFSGGMPGNMNNLMKQAQRMQRQMEETTKELEAKEYTATAGGGAVSVTVSGKKEV TAVKLSEEVVDPDDIEMLEDLIVAATNEAFRQMEEESSAAMSKLTGGLGGMGGGFPF >gi|229784081|gb|GG667654.1| GENE 6 5796 - 7418 1565 540 aa, chain - ## HITS:1 COG:BH0034 KEGG:ns NR:ns ## COG: BH0034 COG2812 # Protein_GI_number: 15612597 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Bacillus halodurans # 1 526 1 535 564 378 38.0 1e-104 MSYTALYRKWRPASFEDVRGQDHVVQTLKNQMVSDRIGHAYLFCGTRGTGKTSIAKIFAK AVNCESPVDGSPCGECRTCKNIAAGSSLNVVEIDAASNNGVENIREIRDEVQYPPTEGKY RVYIIDEVHMLSTGAFNALLKTLEEPPSYVIFILATTEVQKIPVTVMSRCQRYDFKRITV DTIVTHLQKLTEAEHISVEDKALAYIAKAADGAMRDALSLLDQCVAFHYGEVLTYDNVLD VLGAVDITVFSQLFRAVVENRTKDCIQSLEEMVIQGRELGQFVTDFIWYLRNLLIVKSVD DAENMLDMSTENMNLLKEEAGLIDGETLLRFIRVFSDLSNQLRYASQKRVLIEVALIKLT RPEMEENLDSVIQRIGNIEKQLEEGVFVAGTMPATGNGTDSVPAGDNGRMGGNGAPSEPL TPAERVVLPEAQLEDLNLVRNEWGKIVKDLGGPIRASFRDTVVEPAGDSCLCVVFTDQSN FIIGSRPTTLGDIERYVEEHYQKSIYFKARLRGGGERLDTIYVSDEELKENIHMDIIIED >gi|229784081|gb|GG667654.1| GENE 7 7432 - 8511 1311 359 aa, chain - ## HITS:1 COG:Cgl1221 KEGG:ns NR:ns ## COG: Cgl1221 COG0205 # Protein_GI_number: 19552471 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Corynebacterium glutamicum # 2 349 4 343 346 249 39.0 5e-66 MMRVGMLTSGGDCQSLNATMRGVAKALYRMYDDVEIIGFEDGYKGLIYADYRVMKQSEFS GILTEGGTILGTSRQPFKLMRTPDENGLDKVEAMKHTYKKLKLECLVVLGGNGSQKTANL LREEGLNIIHLPKTIDNDLYGTDVTFGFQSAINVATNAIDCIHTTAASHGRVFIVEVMGH KVGWLTLYAGIAGGADIILLPEIPYDLDIVVEALKSRTKRGKRFSILAVAEGAISKEDAA LTKKELKEKKKSGVIYPSVAYEIGAQITERTGQEVRVTVPGHMQRGGEPCPYDRVLSTRL GAEAAYLIKNKEYGKMVAVINNEIVKVPLEEVAGKLKTVDPESSIIKEAKMIGISFGDE >gi|229784081|gb|GG667654.1| GENE 8 9205 - 9702 510 165 aa, chain - ## HITS:1 COG:BH0033 KEGG:ns NR:ns ## COG: BH0033 COG0590 # Protein_GI_number: 15612596 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: 
Cytosine/adenosine deaminases # Organism: Bacillus halodurans # 4 152 5 153 159 185 59.0 3e-47 MNADEKYMRAAIGQARKAGAIGEVPIGCVIVYEDKIIARGYNRRTIDKNVLSHAEIIAIK KACKKIGDWRLEGCTMYVTLEPCPMCAGAIVQARIPKVVIGCMNPKAGCAGSVLDLLHED GFNHQVEMEKGVLEEECSRLMKEFFKALREKKAAVKKQKQEETTP >gi|229784081|gb|GG667654.1| GENE 9 9962 - 10657 490 231 aa, chain + ## HITS:1 COG:MTH1586 KEGG:ns NR:ns ## COG: MTH1586 COG1180 # Protein_GI_number: 15679581 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Methanothermobacter thermautotrophicus # 1 191 4 186 233 108 35.0 7e-24 MKICGLQKTTLLDFPGHVAATIFLGGCNFRCPFCHNSGLLGNDAEEAMTEDSLFTFLKRR TSVLEGVCITGGEPTLSDDLEEFIRRIRQMGYLIKLDTNGYRPAVLKDLAGKGLLDYIAM DIKAGRENYPRTAGIPGIQIKYIEESAAFLMNGSIPYEFRTTVVKELHGSRDFLDIGQWL KGCSHYYLQNYVDSGEVLCTGFTSCTKEELLAFAEILHPYIADVSLRGIDY >gi|229784081|gb|GG667654.1| GENE 10 10674 - 12809 1416 711 aa, chain - ## HITS:1 COG:no KEGG:Closa_0170 NR:ns ## KEGG: Closa_0170 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 495 710 343 560 561 235 60.0 7e-60 MSNYRRLISYIYAYEGGIKGKNIGFAKIETRGSQCKITVNVKKVYVGGNDIGVYLLAGEK EILLGNIFIRGGSGEFRTVVSVSDVEHSGIPMDQCYGLTVHDVENTWRSYTTIWEDAVAH AAEVELSNTLPETKEREETAQEAQIKKAIKEIEEEFPVENEQDTKESEKQDKMSREQLKA FTESAREQSKAFAESIRKQVEQSADTTEQTAEAWIRQQEARMEWSRAKASYIESVNEPAK PEMAPVEEPPRNVHGTKEPAAEQEIPIREPVRQQQEPLTMAHTSELSYGKKDNSLRSYTV FSGVPAVEYASELMPDIETVTTLHTGMLEAEANNQPPFMEERQSAGESAEMLYNQEEKEI QSEFLEEIQGETICYRADASRPFAVRPESRMYPQQEADMGAPKSADENLTDFHASEHRQE QVDRYETQNTDEITGIHLLADREKIPGTESVGNTQNPEMPDKAMSVKAMPDKAMSDTTMS DTASGTQNSSDSGMEEFMQERPEMYLADDAQDCSETSASENQAAQSVQENRDAAESYGTD EGAAATPSHAAVERQKEPEIQKAQEAQPVPGDPMELERLLKTEEEEENSSDRIWDQLRRE HTKILDFDYEKGCEILTIKPQDIGLLPRETWVYGNNSFLLHGYYNYRYLILAKLFNPEGT PRYLLGVPGHYYSNERYMASMFGFPNFVLSKNQPMEDGRFGYWYTDVKIGN >gi|229784081|gb|GG667654.1| GENE 11 12914 - 13669 794 251 aa, chain + ## HITS:1 COG:CAC0187 KEGG:ns NR:ns ## COG: CAC0187 COG0363 # 
Protein_GI_number: 15893480 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Clostridium acetobutylicum # 6 246 1 241 241 267 53.0 2e-71 MEGYKMRLYRAKDYNDMSRKAAHIISAQVIMKPDCVLGLATGSTPIGTYKQLIEWYNNGD LDFAEVKTANLDEYKGLTRDNDQSYYYFMHENLFKHVNIKEENTNIPDGTEPDAAKECAR YENVVHELGGVDLQLLGLGHNGHIGFNEPADEYPKETHIVDLQESTIEANKRFFASIDDV PRQAYTMGIGTIMSAKKILLVVSGEDKAAILREVICGPVTPRVPASILQLHPDVTIVADE AALAKLGDCCK >gi|229784081|gb|GG667654.1| GENE 12 14111 - 15502 607 463 aa, chain - ## HITS:1 COG:no KEGG:SZO_02320 NR:ns ## KEGG: SZO_02320 # Name: not_defined # Def: transposase # Organism: S.equi_zooepidemicus # Pathway: not_defined # 1 449 8 453 472 364 44.0 5e-99 MDEQKKYEVIKGLSDHPGSSKERAALTLGYSVRHINRMLAGYQKSGKEYFSHGNKGRKPA NTIKDETRSVIVDLYRSKYYDANFTHYTELLEKHECISVSPSSVSKILEEEYILSPKVTR AKQKRIKKHLKDLKKAARPQKEADTIQTNLVAVEDAHSRRPRAAYFGELIQMDATPYAWI PGVIWHLHVAIDDATGRIVAAWFDTQETLNAYYHIFHQILTTYGIPYKFFTDRRTVFTYK KKNSPSLDEDTYTQFAYACNQLGVELESSSIPQAKGRVERLNQTLQSRLPIEMRLAGITT IEAANEFLNLQVKEFNEKFSLPLNTIKSVFEVQPSDEKINLILAILSERTVDTGHCLRHN NHYYRMLDKNGVQVHYRKGTKTMFIQAFDGNQYCCVNDTTIHALDCIPEHEAKSKDLDLD YKEPKPKKQYIPPMNHPWRKSVFGKFVRKQEHHWNDDEKVYAS >gi|229784081|gb|GG667654.1| GENE 13 15713 - 15871 154 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870625|ref|ZP_06409838.1| ## NR: gi|288870625|ref|ZP_06409838.1| hypothetical protein CLOSTHATH_02993 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_02993 [Clostridium hathewayi DSM 13479] # 1 52 1 52 52 89 100.0 9e-17 MHCITIKATTLWLNIDNVAVITYKNDASVYQCAHFCDNSSVLPSYSSIFRLS >gi|229784081|gb|GG667654.1| GENE 14 16019 - 16849 534 276 aa, chain + ## HITS:1 COG:SPy0519 KEGG:ns NR:ns ## COG: SPy0519 COG4587 # Protein_GI_number: 15674623 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 53 276 42 272 272 67 25.0 2e-11 
MNKFKTFLAGHGCGPRAIVSTVLMCTKQIFDGGILCIAGGYLVSAVQFVMLVFIWTALDR EGAGLGGMTLNQLMTYTLMATVFHQELNIISPATSSLWEGSVIGRFTRPVPVPVSFIAET IGRWWIPNFLFFGLPLLIISPFLHVNPAPADSASGLMAFISLMLSASLGFALDLLFAAFA MFLKNGCWAAASIREAAFSLLSGEMIPFALFPWGLDKFFSCLPFGSIAHAPLTIYTGMAA SPFRTIGLQLIWNVILWITAAMVFKKSKERMISFGG >gi|229784081|gb|GG667654.1| GENE 15 16839 - 17699 350 286 aa, chain + ## HITS:1 COG:no KEGG:Sgly_2655 NR:ns ## KEGG: Sgly_2655 # Name: not_defined # Def: protein of unknown function DUF990 # Organism: S.glycolicus # Pathway: not_defined # 17 286 19 281 282 193 44.0 6e-48 MVDNLKILFRLYRQYTKMDLLWFLRDTRYCLLQIFSDTVCAVCTIAGVFLLSEKFGGFGG MNQGEILFMMGFSTLVDGIYMMFFIGNNTSMISRIIGRGQLDHIMIQPVPLWAELLAQGF SPLSGSSMLVCGIGLTAYTVRRLPLSATLPWLLLLLIYAVSSTILVLSVMVLLSCAAFYA PAAAEEIAQTGRDLFTSLKTYPLGTMNHSVKRLFLTVLPVGLAAWFPSELLLKAGNGGLS AVLLLQACYLPAAALALSLFTIYVFKKGMKYYAVNGSPRYSGFGHR >gi|229784081|gb|GG667654.1| GENE 16 17653 - 18699 797 348 aa, chain + ## HITS:1 COG:SP0636 KEGG:ns NR:ns ## COG: SP0636 COG4586 # Protein_GI_number: 15900542 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, ATPase component # Organism: Streptococcus pneumoniae TIGR4 # 14 342 6 326 330 231 36.0 1e-60 MQSTVRHDTPALVIDNVTKQYSQWQRSGHARDILKNMLRPQKRIITALDHLSFEVAPGEF VAYAGANGAGKSTTIKILSGILSPSEGTVSVSGFNPATDRIELMRHVGVLFGQRTETWWD HPVITSFEWKKEVWGIPDAVYKKNLALVTELLDLKDLLHTFARELSLGQRMRADIAMLLL HDPSVIFLDEPTLGLDVLAKQQMIHFLKEINRECGTTIIVTSHDMDDLEEMAQRIILLNK GQIAFDGNFDELRNTAGACSRIVVTMRSPGSDKQSAAAGGLLSAPAIPGLQLLSSNGGIY EYEFNRNQTAIHEVLGYLAEFEGIEDVEIRRAPIEDVIARLYFSWKKS >gi|229784081|gb|GG667654.1| GENE 17 18851 - 18934 86 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKFDFKDLMTFGLFILALLTFVFTFCR >gi|229784081|gb|GG667654.1| GENE 18 19049 - 20014 544 321 aa, chain - ## HITS:1 COG:RSc1436 KEGG:ns NR:ns ## COG: RSc1436 COG2801 # Protein_GI_number: 17546155 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Ralstonia solanacearum # 22 298 6 
278 278 157 35.0 3e-38 MKLGKLRYESKYLAIRHFHETEQWSIAWMCKQLEISRAAYYKWLDREIPKQESENLKLAE LIKEYDQRFGHILGYRRMTSWVNRFNHTAYGKKRVHRIMKKLGIQSVIRKKKKKYVSSPP ETTAENKLCRDFYASAPNQKWSTDVTEFKIPGEKKKLYLSVIIDLYDRYPVSYVISSRND NQLVFKTFDKAVISNPNAAPIFHSDRGFQYTSRVFQKKLKEHEMEQSMSRAGHCIDNGPT EGFWGIIKSEMYQMYEITDETTLRHAIKEYMKFYSAERPQDRYHCKTPLEVRNEAIAAEH PEEYPIPKNKRIEKYKEKWCA >gi|229784081|gb|GG667654.1| GENE 19 20011 - 20688 323 225 aa, chain - ## HITS:1 COG:no KEGG:Clole_0189 NR:ns ## KEGG: Clole_0189 # Name: not_defined # Def: transposase IS3/IS911 family protein # Organism: C.lentocellum # Pathway: not_defined # 1 225 1 225 225 315 73.0 1e-84 MSKSPHTSEFRAKVSQEYLDGLGSYNYLATKYNIGCKTLKEWVAKYRIHGIGAFIRKTGN ASYSSDFKTMCVEAVLSGDGSVDDIAAKYNISSRRVLSQWVVSYNANRELKDYVPKREVY MAEARRKTTVAERKEIVEYCIKHNHDYKGTAGIYNVSYSQVYSWVKKYDVNGEDGLTDKR GHHRADNEVDELERLRRENLRLKRQLEEKDKAVELLKKVKEFERM >gi|229784081|gb|GG667654.1| GENE 20 20923 - 22134 1102 403 aa, chain - ## HITS:1 COG:yhfW KEGG:ns NR:ns ## COG: yhfW COG1015 # Protein_GI_number: 16131258 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Escherichia coli K12 # 4 403 3 402 408 389 48.0 1e-108 MKKRFIVIILDGFGIGAMKDACIVRPGDETANTLGSILKDSPDMKLPVLERLGLMNAFGR ESAAMKFSPSANYGRCELMHYGADTFMGHQEIMGTLPIKPTAHPFLEKAEEVSRYLRQNG HDVHVIEKQGLKYLIVDRYATVADNLEADPGMCYNVTAPLDFMPFEEEVEIARLVREVVT VGRVIVFGGTGNTMEDLQRAEEIKLGNYIGIASAKSKSYEHGYQCRHLGYGVDKQVQAPT ILTNAGVHCTLIGKVADIVANDGGESISCVPTEEVMRLTVQAVKNMEKGFVCTNVQETDL AGHSQSAAAYREILETADRGIGAVLEEMEAEDILVVMADHGNDPNIGHSRHTRECVPLLI YTRGMSGREIGTRKTLSDVGASVCSFFGVKSTQNGTSFLEDLR >gi|229784081|gb|GG667654.1| GENE 21 22131 - 23288 807 385 aa, chain - ## HITS:1 COG:yhfX KEGG:ns NR:ns ## COG: yhfX COG3457 # Protein_GI_number: 16131259 # Func_class: E Amino acid transport and metabolism # Function: Predicted amino acid racemase # Organism: Escherichia coli K12 # 1 382 1 382 387 348 45.0 7e-96 MFLEQTVKRNPELVRLAFSLHRQGIIEPDSYLIDVDTFLQNAEEMLECSRQNGVKLYFML 
KQLGRNPYLAKELMRLGYSGAVAVDYKEAAVMMEHGIPIGNAGHLVQIPTAQVRQIVSCR PEVITVYSEEKIHQIDQAASELGVIQEILLRVYGEGDMIYSGQTAGFHLNTLEELADRIV GSYPHVRIAGVTSFPCYLYDEQLDDIVPTANLKTVTRAVAILKSRGILCRIINTPSATCC RTIERMKRDGGNCGEPGHGFSGTTPMHAAHELPERPCVVYISEVSHNFEGRGYCFGGGHY RRSHMKNALVGDSFLNARHVEVIPPADDSIDYHFGLTKECPVGQTVVMAFRYQIFVTRSD VVLVTGLRKGAPRIEGIYDSAGREK >gi|229784081|gb|GG667654.1| GENE 22 23289 - 24359 990 356 aa, chain - ## HITS:1 COG:no KEGG:Closa_2021 NR:ns ## KEGG: Closa_2021 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 356 9 364 364 555 75.0 1e-156 MSVEEAAMLQFQVVDCITKVFEGHESLTRGDLGVVMGINMPVTTRKAEKVIAEVFGAEAC ILVRGAGSAAIRFGLHSMMGPGDRLLVHRAPIYNTTAASLEMMGIEAVEADFNDSKEIRR VLTDNPDIKGALVQYTRQLPEDRYDMHEVVRVIKQTADIPVLTDDNYAAMKVKDIGVQCG ADLSCFSAFKLLGPEGVGVIVGKQELVGRLVKESYSGGMQVQGHEALDVLHGMVYAPVSL ALSARVNEECVRRLNSGEIPAVKQAFLANAQSKVLLVEFRENIAERVLEEAEKSGAAPNP VGAESKYELVPMFYRISGTFRKADPELEHRMIRINPMRAGADTVLRIIKESVERVV >gi|229784081|gb|GG667654.1| GENE 23 24465 - 24821 476 118 aa, chain - ## HITS:1 COG:no KEGG:Closa_2025 NR:ns ## KEGG: Closa_2025 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 110 1 114 122 94 49.0 1e-18 MKIVIGGQMGKEEIKDLIVTTMGTECEVAIRDDVAAAMDVKSGNADLYLGACMTGGGGAL AMAIAILGYGACMTVSDADENAIRDALKSGKKAFGFTPSMSRTVVPVIVRLAKDAVKN >gi|229784081|gb|GG667654.1| GENE 24 24856 - 26136 1477 426 aa, chain - ## HITS:1 COG:no KEGG:Spico_1515 NR:ns ## KEGG: Spico_1515 # Name: not_defined # Def: hypothetical protein # Organism: S.coccoides # Pathway: not_defined # 1 426 3 428 428 318 44.0 4e-85 MRYIVIFLAGAFGALLANRGIAVFNDAVRPVVPEYREGRMTRLEFATTTFALSFGLVIGF GIPYSIMSPIILVHSLWLGTDVIGIFFPAKNIEKWYLDKESLIGAGLSVLAGGLYGVLLL AGLQSFVNMMQALPVNIFDAWQNISGPVISAFIAFPCVVITMDYGWKKGLVSLVVSVLLR QIMVFFGKGDIADGVALLTGLVFIIVFAVRDKSESTGNLASIFGDRVKNIRKNIIWIAIM GAIYGLACNMHLLMEGPQSLVALKDGNVSSAVSISFARALSFIPLKALTSLTTGTFVTDG FGFTATAGLLAPNGVLAAVFGAVVMSLEALSLAVVAKLFDKVPSIKKSADSIRVAMTRLL 
EVAILVGSMSAAEGMAPGFGFLVVGGFFVMNEYFKKPIVRMAVGPIAVIAVGVLVNVLAV LGLYSV >gi|229784081|gb|GG667654.1| GENE 25 26222 - 26584 282 120 aa, chain - ## HITS:1 COG:no KEGG:Closa_2026 NR:ns ## KEGG: Closa_2026 # Name: not_defined # Def: PRD domain-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 119 1 119 119 127 52.0 1e-28 MNLEEALNQRLEILLNSGVISQRISDYCKTAVKLLIKQKPEADGDRAAMFITHLAMAGQR ILEGKEEKPLDGQILESVKQEPAYSKALDFLTYMLNQTDLEFPDTEKDFLTVHLCNLFMP >gi|229784081|gb|GG667654.1| GENE 26 26906 - 27838 821 310 aa, chain - ## HITS:1 COG:no KEGG:Closa_2027 NR:ns ## KEGG: Closa_2027 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 310 7 311 311 449 69.0 1e-125 MEYEVGKSILYQKTGLAKSLLALDLLSKNAGDRIQPISEYQEKFGVSRGTVQNAFTYLKE CGAVKLEHHGHQGTYIEKLDYRKLQENCMRKELLGIMPLPYSLTYEGFATAIYSELDELN FNMAYARGAVGRIELVESGTYQFAVCSRYAAEQSISCGKEIEIAFDFGPGSFLSKHVLLL ADESKDGICDGMKVAYDSKSLDQSRITDNIVKGKEITLIPIRTQQTIGALMDGTIDAGVW NYDDVLENHHDRLKVVSLPESDYNNSFSTAVLVIKTGDEALKALLKKYISVERVLGILSE VREKRREPFF >gi|229784081|gb|GG667654.1| GENE 27 28045 - 28143 57 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVWMKTKRSESVYDKVLSFDLDWIEVKEPGHA >gi|229784081|gb|GG667654.1| GENE 28 28115 - 28285 175 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621822|ref|ZP_06114757.1| ## NR: gi|266621822|ref|ZP_06114757.1| ATP phosphoribosyltransferase [Clostridium hathewayi DSM 13479] ATP phosphoribosyltransferase [Clostridium hathewayi DSM 13479] # 1 56 1 56 56 79 100.0 9e-14 MKSIGRDQEKNRAEIILEESGKLSVEEQERVLATLKGMVFTRDCILKNGVDEDKKE >gi|229784081|gb|GG667654.1| GENE 29 28556 - 28927 199 123 aa, chain + ## HITS:1 COG:no KEGG:ELI_2972 NR:ns ## KEGG: ELI_2972 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 110 1 111 118 97 47.0 1e-19 MGIGARLKQARSSKGYTQNSLADAIGVSRGVITNIEHEKSEPQMLVIRGICYELGINSSW LLQGEGPMESEKDAERSAQLLAEIYHLSTLLSKEEQDYVLDMIKTFQKHKENLCSNSNPE QES >gi|229784081|gb|GG667654.1| GENE 30 29081 - 29917 669 278 
aa, chain - ## HITS:1 COG:BS_ydfL_1 KEGG:ns NR:ns ## COG: BS_ydfL_1 COG0789 # Protein_GI_number: 16077613 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 116 1 116 117 127 51.0 3e-29 MNKLFTIGEMAKLFGINAKTLRYYDEIGLIRPEHTDPMTGYRYYSTGQFERLNTIKYLRA LDMPLAKIRRFFENKDIGQMMEILKEQQEEVFVRRRELERIEHKISERLSQLHDALYTVY NRVEEVTLPNRKLAVLRTEIPATDDLEFPIRQLERQTQLEAAMFLGKVGVSISMDHVKSH KFDEFCSIFVVLEPGDSYDGEAVDIPGGQYLTLRYCGTHTQSSFYYEMLLKEMSENGYEL GGNSIEITLVDAGMTNDTSRFVTELQLPVSGGNNAKGI >gi|229784081|gb|GG667654.1| GENE 31 30080 - 31174 1054 364 aa, chain - ## HITS:1 COG:SPy0572_2 KEGG:ns NR:ns ## COG: SPy0572_2 COG1263 # Protein_GI_number: 15674662 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Streptococcus pyogenes M1 GAS # 8 357 10 362 364 177 35.0 3e-44 KQDTEAKKGNVMNTVMSFIGGTFSPVIPVLIAGGLTGAVLTLLTTLFGVSTESGTYTVIY AINQATFYFLPIFIGFSAASRLKSNGFLGAFLGAILLFSTINNVEGLNFLGIPIQQISYN STVFPVILGVLFMAVIYRFLSRVVPVYLRTIFVPLITMLITVPVTLIVLGPIGNLMGTWL ANGVFAIYRAAPPVAVMIIGITTPLMVFFGMNNATYPIVFALLAEVGSDPLICTGMAPAN VAVGGACLAAALIAKDVNDRSVATGAGITALCGITEPGVYGVLFTKKYPLIGAMAGGGIG GLLGGLAGMTQYVISTPGFISIPAYINPDGTSSNLIWAMAVMILAVAVAFGVTYVLGRRM EKQS Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:00:50 2011 Seq name: gi|229784080|gb|GG667655.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld48, whole genome shotgun sequence Length of sequence - 42882 bp Number of predicted genes - 45, with homology - 44 Number of transcription units - 21, operones - 13 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 352 340 ## gi|266621826|ref|ZP_06114761.1| conserved hypothetical protein 2 1 Op 2 . + CDS 423 - 2192 1236 ## BBR47_28160 hypothetical protein - Term 2019 - 2051 -0.1 3 2 Op 1 . - CDS 2206 - 3084 684 ## COG2207 AraC-type DNA-binding domain-containing proteins 4 2 Op 2 . 
- CDS 3149 - 4273 889 ## COG4927 Predicted choloylglycine hydrolase - Prom 4327 - 4386 5.0 + Prom 4353 - 4412 5.1 5 3 Op 1 . + CDS 4452 - 5111 670 ## COG1357 Uncharacterized low-complexity proteins 6 3 Op 2 . + CDS 5168 - 6127 677 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 6170 - 6226 10.6 - Term 6163 - 6209 10.3 7 4 Tu 1 . - CDS 6291 - 6986 817 ## COG1802 Transcriptional regulators - Prom 7131 - 7190 10.7 8 5 Op 1 . + CDS 7373 - 7723 403 ## COG0662 Mannose-6-phosphate isomerase 9 5 Op 2 . + CDS 7745 - 8668 1014 ## COG2513 PEP phosphonomutase and related enzymes 10 5 Op 3 . + CDS 8686 - 9564 771 ## COG2513 PEP phosphonomutase and related enzymes 11 5 Op 4 38/0.000 + CDS 9625 - 11256 1401 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 11275 - 11323 7.2 12 5 Op 5 49/0.000 + CDS 11342 - 12292 300 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 13 5 Op 6 44/0.000 + CDS 12298 - 13170 695 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 14 5 Op 7 44/0.000 + CDS 13167 - 14177 503 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 15 5 Op 8 . + CDS 14170 - 15159 774 ## COG4608 ABC-type oligopeptide transport system, ATPase component + Term 15166 - 15210 8.6 16 6 Op 1 . + CDS 15233 - 15850 149 ## CLJU_c36100 putative transporter protein 17 6 Op 2 . + CDS 15832 - 16410 427 ## CLJU_c24980 putative transporter protein 18 7 Tu 1 . + CDS 16499 - 17827 488 ## COG0534 Na+-driven multidrug efflux pump + Term 17969 - 18013 -0.8 + Prom 17977 - 18036 5.3 19 8 Tu 1 . + CDS 18087 - 18935 783 ## CHU_2510 hypothetical protein + Term 19134 - 19181 5.9 + Prom 19053 - 19112 3.9 20 9 Op 1 6/0.000 + CDS 19229 - 20050 902 ## COG2145 Hydroxyethylthiazole kinase, sugar kinase family 21 9 Op 2 . + CDS 20037 - 20693 717 ## COG0352 Thiamine monophosphate synthase 22 9 Op 3 . 
+ CDS 20790 - 21455 553 ## COG0637 Predicted phosphatase/phosphohexomutase 23 10 Tu 1 . - CDS 21396 - 21581 87 ## - Prom 21603 - 21662 2.4 + Prom 21544 - 21603 1.7 24 11 Op 1 . + CDS 21631 - 22515 889 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase 25 11 Op 2 . + CDS 22505 - 23713 1301 ## COG1457 Purine-cytosine permease and related proteins + Term 23794 - 23850 7.2 + Prom 23888 - 23947 5.6 26 12 Op 1 . + CDS 24054 - 25508 1087 ## Dhaf_4599 sodium/sulfate symporter 27 12 Op 2 . + CDS 25540 - 25791 329 ## gi|266621852|ref|ZP_06114787.1| ferredoxin, 4Fe-4S 28 13 Op 1 . + CDS 26779 - 28374 1117 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 29 13 Op 2 . + CDS 28388 - 29692 1194 ## COG3681 Uncharacterized conserved protein 30 13 Op 3 . + CDS 29732 - 30421 405 ## BDI_0433 NAD/NADP octopine/nopaline dehydrogenase 31 14 Tu 1 . + CDS 31385 - 31708 319 ## Vapar_6050 NAD/NADP octopine/nopaline dehydrogenase + Term 31720 - 31753 -0.2 + Prom 31727 - 31786 6.6 32 15 Op 1 . + CDS 31822 - 32757 726 ## COG0583 Transcriptional regulator 33 15 Op 2 . + CDS 32822 - 33088 255 ## Cbei_3717 methyltransferase type 11 + Prom 33107 - 33166 8.3 34 16 Tu 1 . + CDS 33224 - 33523 250 ## ELI_2135 sigma factor-related protein 35 17 Op 1 . + CDS 34456 - 34659 85 ## Amet_2000 ECF subfamily RNA polymerase sigma-24 factor 36 17 Op 2 . + CDS 34656 - 35435 790 ## Clole_4048 hypothetical protein 37 17 Op 3 . + CDS 35420 - 36289 287 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 38 17 Op 4 . + CDS 36289 - 37368 1023 ## CD1890 hypothetical protein + Prom 37373 - 37432 5.4 39 18 Op 1 . + CDS 37477 - 38415 1156 ## COG3608 Predicted deacylase 40 18 Op 2 . + CDS 38412 - 39347 904 ## COG3608 Predicted deacylase 41 19 Tu 1 . - CDS 39357 - 39734 176 ## Cthe_1487 hypothetical protein - Prom 39800 - 39859 9.7 + Prom 39905 - 39964 6.3 42 20 Op 1 . + CDS 40024 - 41019 1061 ## COG1940 Transcriptional regulator/sugar kinase 43 20 Op 2 . 
+ CDS 41041 - 41652 837 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 44 20 Op 3 . + CDS 41729 - 42022 264 ## EUBELI_20209 hypothetical protein + Term 42053 - 42090 8.5 + Prom 42129 - 42188 4.9 45 21 Tu 1 . + CDS 42234 - 42882 680 ## COG5564 Predicted TIM-barrel enzyme, possibly a dioxygenase Predicted protein(s) >gi|229784080|gb|GG667655.1| GENE 1 2 - 352 340 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621826|ref|ZP_06114761.1| ## NR: gi|266621826|ref|ZP_06114761.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 116 1 116 116 202 100.0 5e-51 ADSGFAVPAVIIGGSRTDAALSYDEASKILTLELSEIPTEKNIEVCFETGMRVAAANRGA QAYEILNRAQISYDKKEAMFEAVKKQRGDALLTILSMEENTTLTGALAEIMSDPLP >gi|229784080|gb|GG667655.1| GENE 2 423 - 2192 1236 589 aa, chain + ## HITS:1 COG:no KEGG:BBR47_28160 NR:ns ## KEGG: BBR47_28160 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: Galactose metabolism [PATH:bbe00052]; Glycerolipid metabolism [PATH:bbe00561]; Sphingolipid metabolism [PATH:bbe00600] # 50 564 44 563 586 426 41.0 1e-117 MERILNYGEYEVFSEAAASCTASVMEKGEGRAVTEFCFRSDCVSEKGSAVEIRYSFPILD ICGRWHPNCRADRALKADWELPAASMTAVSAPVVCFFNSESRNRHTIAASELCQEVDINA GVHEEDGRMAVKIQFVLSAQKMSEGYSFRLWESQENIPYWDTLNQVRQWWEDGMPPVMEV PEEAREPVYSFWYSWHQDINASVVEEEAARAAEMGFDTIIVDDGWQTGDNNRGYAFCGDW APEPGKFPDFPEHIRRVHAMGIKYMLWFSVPFVGKNTGCWDRFSDKLLCYDERQRAGVLD LRYQEVREYLLAVYIKAVREWDLDGLKLDFIDEFYMREETPVWRAGMDYRDIQEALDVFM TAVRETLRREKESLLIEFRQRYIGPNMRRYGNMFRVADCPCSPVTNRVGCVDLRLLCGDS AVHSDMLMWNRGERPEEAAMQVLSCLFGVPQISVRLADIDDGMAEMLRFWLNFMRQHRTL LQMTVIRAKEPENLYPEVSVENETEEILVHYSAGRVIRPAKGKKICSCVNALHAQEQILI LDPVRPVTVTVKDCRGNTVESMELNGTDNRIQDLIRLSMPACGLCELRW >gi|229784080|gb|GG667655.1| GENE 3 2206 - 3084 684 292 aa, chain - ## HITS:1 COG:BH3634_1 KEGG:ns NR:ns ## COG: BH3634_1 COG2207 # Protein_GI_number: 15616196 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: 
Bacillus halodurans # 1 107 1 107 132 107 48.0 2e-23 MDYLKALERAILYIEHHLGEDIKVEDAAAAAGYSYYHFTRQFNALLGESVGSYIRKRRIA KAAKELLYTDRRILDIALDCGFESAESFSRAFKTLYLTSPASYRKNRLDLFIGSKPQLEG ERLTHITKNLTVHPAIVELPDIMAAGLRGSTTLNDNVLPGLWAAFMELSPQIPNQLPCGR GFGICEACEEGNTLYNMNGDVLFSEVAAVEVSSFNGLPDQFIPKIIKGGRYAVFTHTGSL SLLQDSYSYIWGTWFLGTNEKLDAREDFELYDQRFLGYDHPESQIDLYIPIR >gi|229784080|gb|GG667655.1| GENE 4 3149 - 4273 889 374 aa, chain - ## HITS:1 COG:SA1739 KEGG:ns NR:ns ## COG: SA1739 COG4927 # Protein_GI_number: 15927499 # Func_class: R General function prediction only # Function: Predicted choloylglycine hydrolase # Organism: Staphylococcus aureus N315 # 1 238 1 229 346 81 26.0 3e-15 MKTITARTLELSGTSYEIGQALGRMAASNPRMKKFYTAGYENFGENEVQEAEKMFNEWCP GLNEELAGFAGELAVPIKSVVYYAMTYLRPSCSHLALLPSKTTGGHPLIARNYEFNDELE DFQLIRTSVKGRYTHMGTSVLSFGREDGFNEHGLAVTMSSCGFPVGADHCMRRPALKGLQ YWAVIRSILENCRDTREALLFLKGMPIAYNINLILLDRSGNGALVETLDGSMAVRMLNET SPVPYTHATNHAVIRELASREPEAMVHSLKRYEYIKNVADHSETLTVNQLKDMLLSPYPD GLCCHCYKDFFGTTKSMIIDPADGIIDLCWGGRAENGWHAYRISEPLPTREHDVTITCEP AAPTMFRFDPNPFA >gi|229784080|gb|GG667655.1| GENE 5 4452 - 5111 670 219 aa, chain + ## HITS:1 COG:BS_yisX KEGG:ns NR:ns ## COG: BS_yisX COG1357 # Protein_GI_number: 16078153 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Bacillus subtilis # 1 218 1 211 212 84 27.0 2e-16 MAGIKLAEPWLPEVYELVTGDTELLHQCREEDERIQNAYLKNMEAQEEDFSGLCFSRVKF ENCSFEGCNFEKAEFSDVMFVSCNFSNCLFRDSFLRRVSFQRSKGTGARMTESSLKDVSI MECSMQYLNLDDSKLESFRVEDSDLENGNFAQCRCKGVTFSQVSLQNASFFKTPLKGMDF TTCDIKGLVLSDECSEVRGAVMDLYQAAELAKRLGIVIK >gi|229784080|gb|GG667655.1| GENE 6 5168 - 6127 677 319 aa, chain + ## HITS:1 COG:BH0594_1 KEGG:ns NR:ns ## COG: BH0594_1 COG2207 # Protein_GI_number: 15613157 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 6 125 1 118 127 112 44.0 9e-25 MTEVIMNSSKTFNDVILYLEKIIPDGCEIDYHEISRIAMSPAALFQRIFNFISGISISDY 
VRKRRLTLAGYELKNTDISVLDAAVKYGFQSHSSFSRAFKEHHGITPSQARMKNARLNDY LPINFSDMRFVGGKRIMAEMKRIIYKDTPERLMEGLPRETSFTDAGFVWQEFFQGDTIEK LDRLADVKCCDDIDENDGIGFMYDFSDRMNFKIVIGDFVRMQTEIPEGLFVKHIPKGRAA YVQIEGSNVAEILDSAYLLITEAIEKTGGTIDFDNFYWCEIYTHERYSEPFARGEKVVID YMMPVKADAETTGECPEEK >gi|229784080|gb|GG667655.1| GENE 7 6291 - 6986 817 231 aa, chain - ## HITS:1 COG:mlr7144 KEGG:ns NR:ns ## COG: mlr7144 COG1802 # Protein_GI_number: 13475949 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 16 197 50 233 253 84 28.0 1e-16 MAANNRDLKNKAYETIKERLLDCTYEPGILLNEARLAEDLGFSRTPVREAISRLESDGFV KIMPKKGIYVSDILLSDVLQIFQTRIEIEPVALRMAAPYLPESELIQFRGKFLEDFEDIP NSFRLDTAMHLFIIEYCGNRYIIDMMHRVFDDNTRVIIASKQNRVQIHDARAEHMGILDA LLEKDVDKAQALMKTHVESCRRAALDYFYSLQAYNTTPPETYKKQLKDWSS >gi|229784080|gb|GG667655.1| GENE 8 7373 - 7723 403 116 aa, chain + ## HITS:1 COG:TM1287 KEGG:ns NR:ns ## COG: TM1287 COG0662 # Protein_GI_number: 15644042 # Func_class: G Carbohydrate transport and metabolism # Function: Mannose-6-phosphate isomerase # Organism: Thermotoga maritima # 1 115 7 120 121 85 41.0 2e-17 MIKRAEELVTEHKPSPFNGPGTISCKELLKVPEEMYNKGRFFGHTTVMPGSGIGYHVHEK ESETYYILKGTGRFNDNGTIQTVKPGDVTFTGAGEGHGLEAVGEEPLEVIALILFQ >gi|229784080|gb|GG667655.1| GENE 9 7745 - 8668 1014 307 aa, chain + ## HITS:1 COG:PAE1716 KEGG:ns NR:ns ## COG: PAE1716 COG2513 # Protein_GI_number: 18312827 # Func_class: G Carbohydrate transport and metabolism # Function: PEP phosphonomutase and related enzymes # Organism: Pyrobaculum aerophilum # 13 226 22 235 308 90 29.0 3e-18 MNIPTLQSVLAEGQLIAPDTFDGISTRAAEYMGCKGALLSASALTHTVHGTPDEGTLSTS EMVWSLTRFVEDNCSIPLIVEVRSGFSDNLRIMPYDLERIVKAGAAAMLLDDRTFGCGSD SPEITLVSPEVFARKVAIAVEAASNTDCMVLARSYAADQADAIARCKAAQAAGAQMVGAV CMHTEDDAREFAGAIAGSKLWSDLTVKDGSPEVSAETLMELGYGLVFITFMEKASWFGDI DFGTKNWANKNTVYADLHDFDGYLLDEEGNLRDYHYIFSYWKKWMPLEKKFEDLSELGKE AFQEVKK >gi|229784080|gb|GG667655.1| GENE 10 8686 - 9564 771 292 aa, chain + ## HITS:1 COG:RSc2000 KEGG:ns NR:ns ## COG: RSc2000 
COG2513 # Protein_GI_number: 17546719 # Func_class: G Carbohydrate transport and metabolism # Function: PEP phosphonomutase and related enzymes # Organism: Ralstonia solanacearum # 21 205 39 206 303 98 34.0 2e-20 MELFEKGEQVFAPCVYDCMTAMAAERSGYPCMMLSGGAIAYSMDGQPDMAFGTLDEVITI VEAITNCTDIPLLVDFDDGYGESPAVIYRNVKRLIRAGAAGFTLDDSTGHRGFERMEYYN ANPDEVNPALLDPSYKTIRRDHWLSKIKAAVAACEGTGCVCIARTESGPSYGMDEAITRC VLAKKLGAPMTTVCGGMNDLAMGTRVAQKDLGWKMWPDVIVVNGKPCVELEDIKKLDFNL VTMHVFEKAALYGMIKYGMEDLREKNVVYSTGHAMGGLTDDEQKKIISMRRM >gi|229784080|gb|GG667655.1| GENE 11 9625 - 11256 1401 543 aa, chain + ## HITS:1 COG:AGl861 KEGG:ns NR:ns ## COG: AGl861 COG0747 # Protein_GI_number: 15890545 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 95 422 101 425 540 147 31.0 7e-35 MKKLIAVVLAVTMSMSLAACGSKSASETMAGESKANESEVTTSKEAGEESTAAPAAESGE PVSGGTVNIPITDDPTTLQGWMLRNSNESVLAPAIYETLLQYDETGKPQPYLVESFEGDP EALTYTIVVKDGITFQDGTPLDAEAIKWNLDYYKENGVLTGSYFNYVDSVEVVDEKTVVV HMSKWDALFDYGLARTCYICSPSAVESLGVDGFNETPVGTGAFKVTKWVHGEGIYTERYD GYWRGTPYLDGVNFKIYASTATQQAALEVGDLDIMNLSGDAITADALKAKGYSLTNAAIP STGYTLCFNSQADGPLKDLRVRQAIAYAVDSDAICESLLGNGAYGTASTQWAVSSSAEYN SEVTGYGYDVEKAKALLKEAGYDNGFELTINFQVGDFAKSVCQIIQAELQQVGITVTLNQ IEVANYANYLGEWDGILFHPMGLGNGQFSQVAANMIPGARFGGATFLHDDKSVSLINEAI VSDQETLSKNLKEAVKILFQDNVELYTVSITYSTAVTSPKLHGDYGAVQQYVATWNELWK EAE >gi|229784080|gb|GG667655.1| GENE 12 11342 - 12292 300 316 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 61 315 35 316 320 120 28 2e-26 MGRYILKRLGISVAVIFLVSVISFTLVHMLPGDPAVLALGSEASEQDIATFREKYYLDRP IVEQYFIWITDTLRGDLGDSITYSRPVLTCLAERLPRTISIGLPALILSSILGILCGIIS AIRRGKFIDNLITFVSTIGLGAPQFWLGIVGIYIFGMKLGWLPLSGYVSPGENFGQYIYY GVMPILILSFSLLASVARQTRSNMLEVVNQDYIRTARANGLAEKSVIYKHALKNALIPVI TTIGLQVRVVVGGSLMVEQVFNIAGIGTLIVSAINGRDYMVVQGCVFVISVFTVLVNFLI DLLYGFVDPRIRKSWR 
>gi|229784080|gb|GG667655.1| GENE 13 12298 - 13170 695 290 aa, chain + ## HITS:1 COG:BH0030 KEGG:ns NR:ns ## COG: BH0030 COG1173 # Protein_GI_number: 15612593 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 30 285 43 298 301 209 38.0 6e-54 MSKKNSKRAVGMKRFFRIFFSRGWVTKLCFGIITAFVFIAVFAPVLTRYTPYEQNLRYIL AGPSGEHLLGCDNIGRDTLTRLLYGARVSLITGLLSSIWAAILGTSIGLIAGYVGGWLNT LVMRITDAMIAIPPIIFTMVLSAMVTGNLIGISVVIGVSIIPSYIRMVNGLVISLRENDY VTASNLIGQKPLVIMFRHLLPNCLASLIVTFTMNVGTAIMLESTLSYLGVGIIPPTPAWG VMVSEGYKYLFNQPLVAILPGICIMLVVIAFNVVGDALRDALDPRLRGKI >gi|229784080|gb|GG667655.1| GENE 14 13167 - 14177 503 336 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 327 13 326 329 198 37 5e-50 IMEEKKVLLDVKNLRTSFFTEAGEVKSVNDVSFHVNEREVVAVVGESGSGKSVTQLSVLQ LIQSPPGKILGGEVWFEGENLLTYSKKQMCNIRGNGISMIFQEPMTSLNPVYTIGNQLTE VLRAHHKNMSKQEAWKRGVEALAAVGIPDPEARMKNYPFEMSGGMRQRAMIAISVACNSK LIIADEPTTALDVTTQAQVMELLLSLVEEKGTSIIVVTHNLGLVTRYADRIYVMYAGRIV ESGTTEALLTNPRHPYTLGLLASVPKLVDNRNEKLIPIEGAPPNLANIPEYCSFYPRCPY ACDKCRNAAAPELRPIDGNGHFMACHLDVRGGKTSA >gi|229784080|gb|GG667655.1| GENE 15 14170 - 15159 774 329 aa, chain + ## HITS:1 COG:BH0027 KEGG:ns NR:ns ## COG: BH0027 COG4608 # Protein_GI_number: 15612590 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Bacillus halodurans # 4 324 2 322 338 374 56.0 1e-103 MPEKETLLKVENLKMYFPAAKSVFGKTTSCVKAVDDVSFEVEKGKTFGLVGESGCGKTTT GKCILRINTPTEGHIYYKGMDLSRASKDEIKSIRREIQLIFQDPYGSLDPRQSAYSIIKE AVVTGREKYTEAAVAARVKELLETVELLPEMGDRYPHEMSGGQRQRLGIARALACNPELI VCDEPVSALDVSIQAQVINLFQSLQERLGLTYVFIAHDLAVVRHIADVIGVMYLGHIVEI MDAAEIYTNPIHPYTKALLSAVPITDYYEEKKRNRILLEGEVPSPIHAPSGCPFHPRCAY ATDACRTEVPSLKDAGGGHLVACHLSKKN >gi|229784080|gb|GG667655.1| GENE 16 15233 - 15850 
149 205 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c36100 NR:ns ## KEGG: CLJU_c36100 # Name: not_defined # Def: putative transporter protein # Organism: C.ljungdahlii # Pathway: not_defined # 1 200 1 198 389 125 36.0 1e-27 MIQNKKATSYTLVALSFLTLASSAVSPVLASVGEAYPDIPSTMISLLTTIPSLLAIPMTL FCGQIAGKKVSYRVLTVCGLTCNLISGVAPFFTHNFYLLLLWRALFGCGTGILTPLIMPV MMSVFRGAEVHQQASLNAVFTNIGAVMFQMMGGVACARWGWQATFLIYVVVFPALLIVIR FMPEPPALTDEEKIKKFRFPCFALY >gi|229784080|gb|GG667655.1| GENE 17 15832 - 16410 427 192 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c24980 NR:ns ## KEGG: CLJU_c24980 # Name: not_defined # Def: putative transporter protein # Organism: C.ljungdahlii # Pathway: not_defined # 8 184 207 387 391 66 27.0 6e-10 MFRPILKWCNLYFVHMILFYVCVTETSDVVMKSGFGSSMTAAVILSLVTLSGVAGGWLYK FINKWEIRALGGAYLFLSAGYLLMALSSNAVIMGVGAMLVGIGFGVNMPALQVFVGLEIP GYARSNAASFLNVFGSLGSFLSKFIMTAIAGMFGYENGRFNFVVCVNAYLVMALFCFLTG QKKRHSDVVLCK >gi|229784080|gb|GG667655.1| GENE 18 16499 - 17827 488 442 aa, chain + ## HITS:1 COG:MA2050 KEGG:ns NR:ns ## COG: MA2050 COG0534 # Protein_GI_number: 20090897 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 6 433 16 444 468 181 29.0 3e-45 MKIPYLLFNMACPLTVSLMVQAMYNIIDSLFISRLGENALAAISLAVTAQNLMSSTAAGL GVGINALVTSCLGKKDRDQASAAALNGILIEFFCMLIFLIFGLFFTEAYFLLQTDNREII FYGVQYLQLCMIGSFGIFGEVTFECLLQATGKNTCTMVTQGTGAAINILLDAVFIFGMFG FPAMGIRGAALATVCGQWIAMGLAVWLNLRKNTELSFSLKRFRPDRHMMAAVFRVGIPTT VMASVTSLMSFLMNRLVAAYSTTAVAVFGIYNRLQHFVQTPMIGINSSMVSIAAYNYGAK NKKRIMASVKWGLIYGCFIMALASSCFVFFPEALMSPFQPTREMLRVGIPIFRIVGFTFL LSPVSVICSSLFQGLGNGHYSLIVTTSRQLVIRIPLAYFLTQFGDMALIWFCWPISEVCS ILISIFLFLRIYKRRILLLETD >gi|229784080|gb|GG667655.1| GENE 19 18087 - 18935 783 282 aa, chain + ## HITS:1 COG:no KEGG:CHU_2510 NR:ns ## KEGG: CHU_2510 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 1 280 1 288 293 69 26.0 2e-10 MIELILALLCSASMAIALRLSGRPNSSRYGLLVGNYLTCVLMAWLSLSEKTVLPVSGAMW 
TAMAAGVFNGFIFLADLLLFQLNIRKNGAVLATTISKLGLLIPVGASILFLGERPTALQM VGMLLVLAAILLIHFEKGEAKASFKTGLLLLLLFYGIGDGMAKVFEHIGERRYDGLFLFY TFITALFLSSSLFAAELKREKKKLQAAELLSGITVGVPNYLSSLLLLKAVTKLPAYVVYP CYSVGAVLVVCVVSILFLKDRMTKHQAFGCGVILAALVLLNV >gi|229784080|gb|GG667655.1| GENE 20 19229 - 20050 902 273 aa, chain + ## HITS:1 COG:CAC3096 KEGG:ns NR:ns ## COG: CAC3096 COG2145 # Protein_GI_number: 15896347 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxyethylthiazole kinase, sugar kinase family # Organism: Clostridium acetobutylicum # 1 271 1 271 273 323 61.0 3e-88 MLKECLENVRKKAPLVHNITNYVTVNDVANILLACGASPIMADDRNDVEDITTICSGLNI NIGTLNQNTVPSMLLAGKKANELGHQVLLDPVGAGAGTFRTETALNLMKEVRFDVIRGNI SEIKTLARGFGRTKGVDADAADAVTEETLSSAITFVKEMAAGLGCIIAVTGAIDLVSDGS RTFVIRNGRPEMGQITGTGCQLSGLMTAYLAANPDNLLEAAAAAVCAMGLAGEIGWSHLE SFEGNATYRNRIIDAVCQMDGDTLEKGARYEIR >gi|229784080|gb|GG667655.1| GENE 21 20037 - 20693 717 218 aa, chain + ## HITS:1 COG:CAC0495 KEGG:ns NR:ns ## COG: CAC0495 COG0352 # Protein_GI_number: 15893786 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Clostridium acetobutylicum # 10 201 8 199 211 177 50.0 1e-44 MKLDKKSLLLYAVTDRSWLGEETLYSQVERALKGGATMIQLREKEMTEEAFIAEAQSIKE LCGRYGVPFIINDSTAVAVAVDADGVHVGQSDMEAGDVRRRLGKNKIIGVSAQTVEQALL AEKQGADYLGVGAVFHTGTKKDASDVSFQTLKAICGAVQIPVVAIGGITADNMKHLAGSG ICGVAVVSAIFAEADIEAAARKLKRAAAETADAEKEAD >gi|229784080|gb|GG667655.1| GENE 22 20790 - 21455 553 221 aa, chain + ## HITS:1 COG:CAC3231 KEGG:ns NR:ns ## COG: CAC3231 COG0637 # Protein_GI_number: 15896477 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 1 211 5 215 215 110 31.0 2e-24 MEGAIFDIDGTLLDSMSMWDHAGEWYLEHSGRKAEPGLGRILLPMSMEQAANYMREQYGL DQTEAEIVKGVEKTVSRYYREEVLPKPGVKEFLTRLETHGFQMAAATSSSRTEVEAALSR CGLKPFFKQIFTCTETGAGKDRPDIYYAALSCLGTEKIRTWVFEDALYAAATAKTAGFPV AGVYDEASERDQEELKRISDYYLTDFNDFPAFFRIASGGGK >gi|229784080|gb|GG667655.1| GENE 23 21396 
- 21581 87 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHSWIVTLLLSDRRWCPQSAAGLQAALSRSVNSRPPPYSISFSFAASGSNPEKGGKIIEI R >gi|229784080|gb|GG667655.1| GENE 24 21631 - 22515 889 294 aa, chain + ## HITS:1 COG:CAC3095 KEGG:ns NR:ns ## COG: CAC3095 COG0351 # Protein_GI_number: 15896346 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Clostridium acetobutylicum # 1 259 4 262 265 310 61.0 2e-84 MKTALTIAGTDPSGGAGIQADLKTMTAHRVYGMSAITAVVAQNTTGVTGIMEVTPEFLAE QLDDVFTDIFPDAVKIGMVSSSALIRVIAEKLKQYHAKNIVLDPVMVATAGSKLISDEAI STLKEMLIPLADVITPNIPEMEVLSGMAIQSAQDMERAAKETGDAYHCAVLCKGGHQLND ANDLLYADGEYLWFYGKRIATANTHGTGCTLSSAIASNLAKGRSLAESVGRAKAYLSGAL GDMLDLGKGSGPMNHAFAIAGEYGESEEMSGWECAGKSGVNGVRTACGGTGDEK >gi|229784080|gb|GG667655.1| GENE 25 22505 - 23713 1301 402 aa, chain + ## HITS:1 COG:NMA0365 KEGG:ns NR:ns ## COG: NMA0365 COG1457 # Protein_GI_number: 15793373 # Func_class: F Nucleotide transport and metabolism # Function: Purine-cytosine permease and related proteins # Organism: Neisseria meningitidis Z2491 # 12 373 15 383 437 306 51.0 7e-83 MKNKGTSTLSNGLIWFGAGVSIAEILTGTYLAPLGFEKGLAAILLGHFIGCAMLFFAGLI GGMKRLSAMETVKLSFGGKGCLLFAVLNVLQLVGWTAIMIYDGALAANSIYNTGAWVWCL VIGALIIVWILIGLKNLGRLNTVAMALLFLLTLILGRVIFSSHGGSSFVSDESMSFGAAV ELSVAMPLSWLPLISDYTREAEKPFQATLASTLVYGVVSCFMYVIGMGAALFTGTYDIAV IMVKAGLGIAGLFIIVLSTVTTTFLDAFSAGVSSTVISHAVKEKWAAIAVTVIGTIAAVV YPMDNITGFLYLIGSVFAPMIAIQIADTFILKNDFRSTKFHVPNLIIWICGFVLYRILMH VDTPVGNTLPDMAVTIGICLLYHEVSYAVNGRKREKSEMERV >gi|229784080|gb|GG667655.1| GENE 26 24054 - 25508 1087 484 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_4599 NR:ns ## KEGG: Dhaf_4599 # Name: not_defined # Def: sodium/sulfate symporter # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 6 464 7 464 487 221 33.0 6e-56 MAADLTATKKDGNVKYYINSAICLIIMFGFGYLPPIAPITVLGMQILGIFLGMVYGWIFV GIAWPSLAGLIALMQTGYMTAGEVIKSSFGEANVVLMFFIFIFCGTFAYYGLSRIIAVWS ITRKMVLGRPWVFTFVFLMVMAFLGGVTSATPTIVIGWSLAYVISEQCGFQKGDSYPLIL 
IFGTAFAAQIGDCIIPFKSIPLLVMGVYESLAGRSIPLGPYMLIVILATVLCIIVYLLGA KLIFKPDMSSLRDLDAKSLVGNEGIKMNRVQKVVLFFLVLLMFMMLLPGFFPGSSFFAIV FLKKIGSTGICILLVLLMCLIRIEGVPIIDFKKMVGNVEWNSILLLAAAIPVANAMSNEA SGFPAALKAIINPLATNTSAFVFVLLIGLAATLITNVMTNGPVGMILMAAVFAATQSVGM NPTGAVVTVIMCVHISVLFPSGSSTAALINGNEWIGTRRLWKIGPMACISAWMVISVVAG IFGL >gi|229784080|gb|GG667655.1| GENE 27 25540 - 25791 329 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621852|ref|ZP_06114787.1| ## NR: gi|266621852|ref|ZP_06114787.1| ferredoxin, 4Fe-4S [Clostridium hathewayi DSM 13479] ferredoxin, 4Fe-4S [Clostridium hathewayi DSM 13479] # 1 83 1 83 83 150 100.0 3e-35 MPPVIDKSKCISCTTCAQICPLDVYGPVIKGQKPEVLYPNECWHCRACEFDCPVGAISMH YPLPMMLLRRPKSKIEKEGARND >gi|229784080|gb|GG667655.1| GENE 28 26779 - 28374 1117 531 aa, chain + ## HITS:1 COG:PA2298 KEGG:ns NR:ns ## COG: PA2298 COG1053 # Protein_GI_number: 15597494 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Pseudomonas aeruginosa # 1 504 21 521 574 133 27.0 9e-31 MAAVAAAERGARVIVAEKADTRRSGGGATGNDHFACYIPGYHESVDSYMHELEQGMCKGA DPRVQRVFAERSFEIVKDWEKWGINMRPFGDEYVFEGHTLPGHPRCFLKIDGRDQKKCLT NQARRCGVMIDNRTSVTRYISREGRVVGAIAIDISRPEPEITLYEAKSIISVTGTGARVY PAVAPSCLFNVANCPAGTGTGRVASYEIGAVMVNFDKPRAHIGPRYFVRGGKGTWIGACC DAGGNAVGIAGRPEPGEQDITTDICHEVFDLSKKNGTGPVFMDCTENPKDTMEHMKECFY SEGISSLLDAMEQQDIHLEEDMIEWGRYLPHIQYSGTCINEKCETSTPGLYSAGDECGNF FCGVSGAAVTGRIAGESASEYIKGVESFTDIAEAPEVLEAQRFYSSLMEREEGSHWEELN HAVQNIMRDYASIETPRSETLLSCGVSYLEELNQIACKRIKCANSHELMRTLEAFDILAL GRLILLSALARKESRGLHKRVDYPYTNPLWNSLSMTIKKMNGKDYTEMQKL >gi|229784080|gb|GG667655.1| GENE 29 28388 - 29692 1194 434 aa, chain + ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 17 412 3 405 411 186 30.0 7e-47 MNDTELYSILREEIKPALGCTGPIAICFCAAQAYDAIGGEIKSIDATLDWCMSAKNDDVA 
FPGSRMLGVEMAIALGAVCGDPGAGLEVLHTVTPEGERKARKVAECVKINPQWNLQELEI YCDIAVHTDKGTGRAVVSQRSDGLILKERNDEILLELEPDISAQSGKLPILKYKVKDFYD MVTSLSEDELAFLRDAVAYNSRLANATLENRLGAGIGYELYHSDYQNYITRAKAYAAAGC EARMSGVSHPAMTCGNKGNVGIAASMPLVSMAKDLNLEEMDLLRGIALSYLTAISIIHRI GKAPSMCSCEVAAALGIAAGATLMRGGTYEQVAAAMQNTIPNVFGVVCDGAKLACALRIS SGTGIALECSDLALSQVRIANNQGVLGATIDDSIEIIGKTALYAMVGSDRDLCRLLFEKR RIFPLMNFEERQSQ >gi|229784080|gb|GG667655.1| GENE 30 29732 - 30421 405 229 aa, chain + ## HITS:1 COG:no KEGG:BDI_0433 NR:ns ## KEGG: BDI_0433 # Name: not_defined # Def: NAD/NADP octopine/nopaline dehydrogenase # Organism: P.distasonis # Pathway: not_defined # 4 225 2 224 358 91 28.0 2e-17 MDERIAIVGAGAAGVITAAWLTGKGHKVILCDSKEQCREDFAVIARKGIVVKGPGLISGL CPDRMTYEIAEAMEARLVLVCVSSGRQEEAAAWMAPWVRPDHDLLLMPGNLGTVIFHRKF KEKGVTYGLLAELAECLWACRRLGPGSYVSAMPPGQRRIAAFPSTETGRALERFGKLFSL KAGASLVENCLNSPNVLTHLSGTLLNLGGIEQKKGDFALFTDGMSDLAS >gi|229784080|gb|GG667655.1| GENE 31 31385 - 31708 319 107 aa, chain + ## HITS:1 COG:no KEGG:Vapar_6050 NR:ns ## KEGG: Vapar_6050 # Name: not_defined # Def: NAD/NADP octopine/nopaline dehydrogenase # Organism: V.paradoxus # Pathway: not_defined # 27 94 279 346 362 62 48.0 8e-09 MGAECYAAPLRPFMTMLKASADYPEMEAFKKLSGPDGLWHRYVTEDAACGVALLVSLAEK FDVEVPAAKALLTLASRLTGIEYRQSGRTWEWMGNDAVGCLQTVNKP >gi|229784080|gb|GG667655.1| GENE 32 31822 - 32757 726 311 aa, chain + ## HITS:1 COG:VC1617 KEGG:ns NR:ns ## COG: VC1617 COG0583 # Protein_GI_number: 15641625 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Vibrio cholerae # 1 201 8 198 296 89 25.0 9e-18 MQLTQLEYLIAVEKYGSISRAARELYTSQSSISTSIKTLETELGVEVLRRGTKGVHITAE GQYILENAKIICQRMEDIKAVRERISGAVRGNVTVGGDGYCCMNILGNTAVNLKDRYPEI SVALKGGNQRQILSELSDGSLDLGIFQINQFNEKYVRAKVEYAQLTSSDILKSHLVVGVS PQHPLYGRTEVTLEDMMPYEIATGFVRAEDLVYWSLFCEMRKQGYTKSITCLGDVGVSRL YTIKRNCIQLVPRVSLEMTNPLFGTPLYPIEMNQRYELKYLLVYREESLGQPQELFLDEV LRYLKPYQDTE >gi|229784080|gb|GG667655.1| GENE 33 32822 - 33088 255 88 aa, chain + ## HITS:1 COG:no KEGG:Cbei_3717 NR:ns ## KEGG: 
Cbei_3717 # Name: not_defined # Def: methyltransferase type 11 # Organism: C.beijerinckii # Pathway: not_defined # 1 80 42 152 265 88 42.0 9e-17 MAEELRSMDFKGKKAAQFCCNNGREILSLMQFVACIILDIDEMYHNSFDFIMFTIEAITW FEDMNPLFQKTEACLKPGGVSLLIKSLR >gi|229784080|gb|GG667655.1| GENE 34 33224 - 33523 250 99 aa, chain + ## HITS:1 COG:no KEGG:ELI_2135 NR:ns ## KEGG: ELI_2135 # Name: not_defined # Def: sigma factor-related protein # Organism: E.limosum # Pathway: not_defined # 3 84 4 85 175 77 41.0 2e-13 MNEKEMVRRIKQGEKELFEPLAKKYYQEIYHFCFYKTGDAESAFDCTQDTFLHVIRFLDG YAERDHFRAWIFGIARNVCNDYFRSRKHVTMETGALDAS >gi|229784080|gb|GG667655.1| GENE 35 34456 - 34659 85 67 aa, chain + ## HITS:1 COG:no KEGG:Amet_2000 NR:ns ## KEGG: Amet_2000 # Name: not_defined # Def: ECF subfamily RNA polymerase sigma-24 factor # Organism: A.metalliredigens # Pathway: not_defined # 4 62 76 134 139 67 52.0 1e-10 MRNSIQAALDRLPEMQKDVIILRFYYDMKFKEIAAVVGVGIPTVKSRLRQGMEKMKQFLG KEGDLRI >gi|229784080|gb|GG667655.1| GENE 36 34656 - 35435 790 259 aa, chain + ## HITS:1 COG:no KEGG:Clole_4048 NR:ns ## KEGG: Clole_4048 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 39 257 21 238 243 67 24.0 4e-10 MKEKFEASEEEKRLRSMMRQYEAPDMTALQEQKLWVLLKEEADKKRLAPDRSMWRHMWRL AQFIAPSVWMLQAVFLLLAIGISNGQNEEWYLWNVSAMIPFLGVICVPELTKSFSLGMWE LEQSCCFHLRKLMAAKMAVMGIVDGVFLIFLMAAVGNRGSGLSGAVLAVLVPFLLSNAVY LWLFRVLKRHCSGYVLTAAGICMAGGMLMLKQYLPDLPSAVSSGAAAAIGTAGFVLLLVS GYGVLKQMDREETVVWNCV >gi|229784080|gb|GG667655.1| GENE 37 35420 - 36289 287 289 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 3 238 12 252 318 115 28 6e-25 MELRVDRLTKQYGSKIAVDRMDFTMKAGVYGLLGANGAGKTTLMRMICDILRPTSGEVFY DGRPISEQGDAYRMVLGYLPQQFGYYPEFTADKFMMYMAALKGLGKSAAAARTGELLELV GLSEVKNKKIKTFSGGMKQRLGIAQAMLNDPEVLVLDEPTAGLDPKERVRFRNLISSFSK DKIVLLSTHIVSDIEYIANEILVMKSGKLIHQGKPEAVTKEITGYVWECRVPAGEAEHWN EKYTIGNLRNDGEYSVLRVISEKRPAGNAVPVKPVLEDLYLYYFEEEND 
>gi|229784080|gb|GG667655.1| GENE 38 36289 - 37368 1023 359 aa, chain + ## HITS:1 COG:no KEGG:CD1890 NR:ns ## KEGG: CD1890 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 57 358 115 412 413 89 25.0 2e-16 MKRLIKFEFSKLIKKKLVLIAIGIFAILYATMLWSWIFGNEWAVTQDGEQLFGAEAEAYN TEITNRFAGPLTDEKVQEILAAFPRTDGMTTGDVSNNTYYPVANLFAERDGTWNGKTVQE VFPEFDEPPVLGMSSRWESFLYSMMYLVLMSGILVVIIVSPVFSDEYTSGMDALILTSRY GKRRCVLAKVFSAFCFSITMEAVILLIGFMMFCAGRGLAGWDTDIQLSEMMVFSRIRQPL KCCEAALMTAFLAMMSTITVTGLTLLFSVLCPTSFVSIILAAVTYLAPMFINPGSEAARR ILILFPANSISISGVMTAGGIPAAGLMIPLTVMVGVIAAVAAAAGIWGSKSIFSRHQVC >gi|229784080|gb|GG667655.1| GENE 39 37477 - 38415 1156 312 aa, chain + ## HITS:1 COG:mlr6093 KEGG:ns NR:ns ## COG: mlr6093 COG3608 # Protein_GI_number: 13475087 # Func_class: R General function prediction only # Function: Predicted deacylase # Organism: Mesorhizobium loti # 30 304 42 323 331 63 25.0 4e-10 MIETVASVGLAIDERLTIYKNRIQPEQMTGNEKRICIVTGIHGDELEGQYVCYELNRVLK EQKEHVKGIVDIYPAMNPLGVDSITRGVPGFDIDMNRIFPGSDEGAVTEHMAGQIMGDIA GADFCVDIHASNIFLREILQVRMSEGTAERLLPYAKLLNTDFIWVHGAATVLESTLAHGL NAGGVPTVVVEMGVGMRITRSYGLRLTEGILCLMKQLGIWDGRTIVPKEPLTSDHRTIGL VHANTPGIFIPSAEHQGTVRAGERIGEILNPLDGTVLETVTSPVSGMVFTLREYPVVSGG SLIARVLGGEEE >gi|229784080|gb|GG667655.1| GENE 40 38412 - 39347 904 311 aa, chain + ## HITS:1 COG:VC2282 KEGG:ns NR:ns ## COG: VC2282 COG3608 # Protein_GI_number: 15642280 # Func_class: R General function prediction only # Function: Predicted deacylase # Organism: Vibrio cholerae # 33 305 65 325 362 71 24.0 2e-12 MRKEVIYTQKNLYRADMNIYGYHFGKGEQSACIVGACRGNEIQQLYICSQLVKKLKELEE RGAIVKNNEILVIPSVNPSSMNISKRFWPTDDTDINRMYPGDSGGETTKRIAAGVFDVAR QYHYGIQFTSFYIPGDFVPHVRMMDTGYQNPSLANLFGLPYVVVRKVSPFDTTTLNYNWQ MNQTNAFSVYSSATDKIDEESAKRAVAAVLRFLTRMGMVRYQSHSGYIATVIEEGDLASV RTDRAGIYKPFFRPGEEAARGDVLAEIIDPYEGEVISRILSPADGIVFFAHSQPLVMEGT IIFKIIKRLHE >gi|229784080|gb|GG667655.1| GENE 41 39357 - 39734 176 125 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1487 NR:ns ## KEGG: Cthe_1487 # Name: 
not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 1 125 4 128 132 144 52.0 1e-33 MWFWLFMFCCNLLIPCTMIVFGWIMREHPPKDINSSFGYRTRMSMLNEDTWKFAHKTCGT LWRKTGWITLFPSALVQFPFLHSSDDTIGTVGLILCLLQCAVLIGSIFPVERALKRTFHP DGTRK >gi|229784080|gb|GG667655.1| GENE 42 40024 - 41019 1061 331 aa, chain + ## HITS:1 COG:lin0520 KEGG:ns NR:ns ## COG: lin0520 COG1940 # Protein_GI_number: 16799595 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 8 330 7 329 334 151 27.0 2e-36 MRADAKLMKELNKRQLRRVLRQCGEATKPELADRTGLSVVTINSLLEDFLKTGEAVQSGM APSGGGRPSALYQYRPDYRYAVLLYGHQSEGRNLMHLAVVNLTGECVWRKEEYFDEIRVE SFESVLDDVIGRFPETGLLAFGLPGEVVNDVVTIHDYEALIGPDFMAHYKERYHLPVIFE NDINAMTYGACRDDEMERATVAGIYLPRIYPPGAGLVIQGKIYYGTSHCAGELAGLPVPV SWDSLDYGSAEEVLENLKLLTAVYGFTTAPETLIYYGDFFTEELQEQLRTYSERLLEGKF HMNLVFEENLEHDYEEGMKRLALEALWEEME >gi|229784080|gb|GG667655.1| GENE 43 41041 - 41652 837 203 aa, chain + ## HITS:1 COG:CAC2692 KEGG:ns NR:ns ## COG: CAC2692 COG0110 # Protein_GI_number: 15895950 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Clostridium acetobutylicum # 1 202 1 202 204 290 70.0 2e-78 MNQRERMLSELPYKAWLDGLEEERKESRKRVYRYNNLPPEAEKEKDALIREIFGKTGETI FVETPFRCDYGTNIEAGNNFYANFNCVILDVAKVVIGENVMFAPNVAVYTAGHPVHPDSR NSGYEYGIGVTIGDNVWVGGNTVINPGVHIGNNVVIGSGSVVTKDIPDNAIAVGNPCRVI RYITEEDRKYYFKDREFDVDDYK >gi|229784080|gb|GG667655.1| GENE 44 41729 - 42022 264 97 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20209 NR:ns ## KEGG: EUBELI_20209 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 8 97 89 181 181 95 50.0 5e-19 MIFPGDLFTVPGCESFTYENLKETAFESLRISEKFTPIIYHEENGAFVGKSVSMFSPVLK FTLEERFDSEVLEVSETFEVNGKRTFGYDLPLEYRRV >gi|229784080|gb|GG667655.1| GENE 45 42234 - 42882 680 216 aa, chain + ## HITS:1 COG:mll9387 KEGG:ns NR:ns ## COG: mll9387 COG5564 # Protein_GI_number: 
13488176 # Func_class: R General function prediction only # Function: Predicted TIM-barrel enzyme, possibly a dioxygenase # Organism: Mesorhizobium loti # 4 216 12 223 285 175 41.0 7e-44 MIPRNEILKRLRAQINVNGHIIGTVAGSGMTARYSAMGGADLLLALSAGKFRIMGRSSFS SYLCYGDSNTIVMDMGCNELLPIIRDTPVLFGLFASDPMIHLYDYLQKIRENGFSGVVNY PTLSLIDGIFGEALSEEGNTYEKEVEAIKLAHFLDLFTIAFVVNAEQARAMTLAGADVIC AHLGLTKGGFLGAKKYISINDARKISDEIFNASDEI Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:02:13 2011 Seq name: gi|229784079|gb|GG667656.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld49, whole genome shotgun sequence Length of sequence - 32515 bp Number of predicted genes - 34, with homology - 33 Number of transcription units - 13, operones - 9 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 173 - 1438 929 ## gi|266621871|ref|ZP_06114806.1| hypothetical protein CLOSTHATH_03059 2 1 Op 2 . + CDS 1449 - 1586 78 ## gi|266621872|ref|ZP_06114807.1| conserved hypothetical protein 3 1 Op 3 . + CDS 1661 - 1789 84 ## gi|266621873|ref|ZP_06114808.1| conserved hypothetical protein + Term 1846 - 1885 2.1 + Prom 1851 - 1910 9.9 4 2 Tu 1 . + CDS 1937 - 2224 161 ## BLD_1961 putative transcriptional regulator + Prom 2242 - 2301 2.7 5 3 Op 1 . + CDS 2380 - 2571 148 ## gi|266621875|ref|ZP_06114810.1| putative transcriptional regulator 6 3 Op 2 . + CDS 2637 - 2846 247 ## COG0556 Helicase subunit of the DNA excision repair complex + Prom 3037 - 3096 4.6 7 4 Op 1 . + CDS 3144 - 4160 481 ## COG3943 Virulence protein + Prom 4175 - 4234 3.8 8 4 Op 2 . + CDS 4254 - 4493 88 ## HSM_1189 hypothetical protein + Prom 4917 - 4976 3.6 9 5 Tu 1 . + CDS 5006 - 5203 155 ## gi|266621879|ref|ZP_06114814.1| putative cytoplasmic protein + Prom 5213 - 5272 5.8 10 6 Tu 1 . 
+ CDS 5403 - 5717 280 ## COG0556 Helicase subunit of the DNA excision repair complex + Term 5898 - 5946 9.5 + Prom 6343 - 6402 3.1 11 7 Op 1 25/0.000 + CDS 6599 - 7396 701 ## COG1192 ATPases involved in chromosome partitioning 12 7 Op 2 . + CDS 7374 - 8267 281 ## COG1475 Predicted transcriptional regulators + Term 8311 - 8368 1.0 + Prom 8306 - 8365 2.5 13 8 Op 1 . + CDS 8418 - 8972 268 ## gi|266621884|ref|ZP_06114819.1| hypothetical protein CLOSTHATH_03073 14 8 Op 2 . + CDS 9011 - 9190 178 ## + Term 9323 - 9356 -0.2 + Prom 9194 - 9253 6.0 15 9 Op 1 . + CDS 9482 - 9934 281 ## gi|266621886|ref|ZP_06114821.1| conserved hypothetical protein 16 9 Op 2 . + CDS 9951 - 10184 173 ## gi|288870646|ref|ZP_06114822.2| conserved hypothetical protein + Term 10335 - 10371 5.0 + Prom 10212 - 10271 8.8 17 10 Op 1 . + CDS 10389 - 10769 241 ## gi|266621889|ref|ZP_06114824.1| conserved hypothetical protein 18 10 Op 2 . + CDS 10845 - 13091 1166 ## gi|266621890|ref|ZP_06114825.1| hypothetical protein CLOSTHATH_03077 + Term 13096 - 13132 6.3 19 10 Op 3 . + CDS 13137 - 15128 924 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 20 10 Op 4 . + CDS 15133 - 15786 233 ## gi|266621892|ref|ZP_06114827.1| conserved hypothetical protein 21 10 Op 5 . + CDS 15801 - 16346 391 ## gi|266621893|ref|ZP_06114828.1| conserved hypothetical protein 22 10 Op 6 . + CDS 16168 - 18804 1328 ## COG3451 Type IV secretory pathway, VirB4 components 23 10 Op 7 . + CDS 18809 - 19351 455 ## gi|288870647|ref|ZP_06114830.2| hypothetical protein CLOSTHATH_03082 24 10 Op 8 . + CDS 19362 - 19865 31 ## gi|266621896|ref|ZP_06114831.1| conserved hypothetical protein + Term 19957 - 19988 -0.7 + Prom 20078 - 20137 3.3 25 11 Op 1 . + CDS 20209 - 21300 299 ## gi|266621897|ref|ZP_06114832.1| conserved hypothetical protein 26 11 Op 2 . + CDS 21317 - 22363 677 ## AB57_3247 prophage LambdaBa04, DnaD replication protein, putative 27 11 Op 3 . 
+ CDS 22360 - 24288 756 ## COG3505 Type IV secretory pathway, VirD4 components 28 11 Op 4 . + CDS 24281 - 24574 147 ## gi|266621900|ref|ZP_06114835.1| conserved hypothetical protein 29 11 Op 5 . + CDS 24571 - 25359 276 ## COG4509 Uncharacterized protein conserved in bacteria - Term 25712 - 25743 3.2 30 12 Tu 1 . - CDS 25832 - 27295 322 ## Elen_3094 regulatory protein GntR HTH - Prom 27416 - 27475 6.1 + Prom 27375 - 27434 8.0 31 13 Op 1 . + CDS 27464 - 29116 789 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain 32 13 Op 2 . + CDS 29135 - 29497 309 ## EUBELI_20567 hypothetical protein 33 13 Op 3 . + CDS 29503 - 32064 1117 ## COG0642 Signal transduction histidine kinase 34 13 Op 4 . + CDS 32148 - 32514 264 ## gi|266621906|ref|ZP_06114841.1| putative transglycosylase SceD 3 Predicted protein(s) >gi|229784079|gb|GG667656.1| GENE 1 173 - 1438 929 421 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621871|ref|ZP_06114806.1| ## NR: gi|266621871|ref|ZP_06114806.1| hypothetical protein CLOSTHATH_03059 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03059 [Clostridium hathewayi DSM 13479] # 1 421 58 478 478 850 100.0 0 MLFPRSCRMDMEGFEMYCLWGGTDQEDDRHLWIEETVAVWRNVEKPFLFGKQIMVQTFEL SDLEGRKIIGALHDSEEEGWNLLIEGGMKLYLHKEIPYLRERIGVSDAGLFTVSEIDRIT SNPVYAYGKAFQARELCEEWHKVFFFTAALTEQVWDEDAFVPVYERFLAFLEENISDIWE AETIIEKKMFWSVLTGHIDEIRGYLKGEEEVAVSKDFLLLLNSRYAYLPYLHELIKPAGD RAVKKTADCENHAFSPEKMRQIIELSGTGDSFQKGVQWEAAAEYFISHTKGLKVSGRRIH VGLQEIDLSVVNTSLDSNLWEMGAYFLTECKNWTDKVGIPVIRALSYSSNMKGNKTTLLF AANGVTRDGKREILRAAVNRQFILCFTKQELLEVKNGEECYELLVGKWRELKERVEREIM V >gi|229784079|gb|GG667656.1| GENE 2 1449 - 1586 78 45 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621872|ref|ZP_06114807.1| ## NR: gi|266621872|ref|ZP_06114807.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 45 1 45 45 77 100.0 3e-13 MAEFMRKKWKKQEEQMFVICLYFLQGMWYNFIGLPGSQLKEYDAA 
>gi|229784079|gb|GG667656.1| GENE 3 1661 - 1789 84 42 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621873|ref|ZP_06114808.1| ## NR: gi|266621873|ref|ZP_06114808.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 42 1 42 42 68 100.0 2e-10 MSSILKGLGQLLTQRKMAKLHPTAAGLLMLGAEYHIAREFPE >gi|229784079|gb|GG667656.1| GENE 4 1937 - 2224 161 95 aa, chain + ## HITS:1 COG:no KEGG:BLD_1961 NR:ns ## KEGG: BLD_1961 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.longum_DJO10A # Pathway: not_defined # 1 94 289 382 515 117 56.0 1e-25 MEGGNRIDDTPVHKVLREALANCLVNADFYVGRGIVVRKNQESIVIENPGYVRIEKRQML KGVISDPRNKTLMKMFNMIGIGERAGSGVPYIFSV >gi|229784079|gb|GG667656.1| GENE 5 2380 - 2571 148 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621875|ref|ZP_06114810.1| ## NR: gi|266621875|ref|ZP_06114810.1| putative transcriptional regulator [Clostridium hathewayi DSM 13479] putative transcriptional regulator [Clostridium hathewayi DSM 13479] # 1 63 1 63 63 103 100.0 4e-21 MMEPGKEYKTTEIAGWVDLKSSRMRELLKVLSENGEVEAIGNNRERTYKRMQLAQSSKDS RCV >gi|229784079|gb|GG667656.1| GENE 6 2637 - 2846 247 69 aa, chain + ## HITS:1 COG:TP0116 KEGG:ns NR:ns ## COG: TP0116 COG0556 # Protein_GI_number: 15639110 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Treponema pallidum # 2 69 3 70 668 90 60.0 5e-19 MEFKLHSEYQPTGDQPQAIEALVKGFKEGSQFETLLGVTGFGKTFTMANAIQQLQKSTLI IARNKTLAS >gi|229784079|gb|GG667656.1| GENE 7 3144 - 4160 481 338 aa, chain + ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 14 336 13 342 345 258 43.0 2e-68 MEDKTIQIRNSTVDFLVFTKDAGEEGIEVRVQAGNVWLTQKAIAQLFDVERSVVTKHIKN IYGSGELEESLTCANFAQVADNGKTYQYKFYSLPAIIAVGYRANGQRALQFRQWATKVLD 
TFTKQGYVLDKNRLINGQIFDEDYFDHLISEIQEIRASERRFYQKITDIYATAVDYSLDS QTTKDFFATVQNKMHYAVHGNTAAEVIMKRADHMKEHMGLMTWRNAPQGKIVKADVSIAK NYLLKDEMQELNEIVTMYLDYAARQARRHIPMTMADWASKLDAFLQFNDAELLQDKGKVT AAIAKAFAESEFEKYRILQDKEYMSDFDRLLLEKSENS >gi|229784079|gb|GG667656.1| GENE 8 4254 - 4493 88 79 aa, chain + ## HITS:1 COG:no KEGG:HSM_1189 NR:ns ## KEGG: HSM_1189 # Name: not_defined # Def: hypothetical protein # Organism: H.somnus_2336 # Pathway: not_defined # 12 78 505 571 571 92 64.0 7e-18 MKIKLFNRQEEIDLLPNLDVGECIVVEDAIKLPTKILLDEPKETPKSSTIDFGDRWNDQE NTIYDLDSAILNLVRQSRS >gi|229784079|gb|GG667656.1| GENE 9 5006 - 5203 155 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621879|ref|ZP_06114814.1| ## NR: gi|266621879|ref|ZP_06114814.1| putative cytoplasmic protein [Clostridium hathewayi DSM 13479] putative cytoplasmic protein [Clostridium hathewayi DSM 13479] # 1 65 107 171 171 119 98.0 7e-26 MERNIKSGYYKRILSTQLQDPLAENYEFVKDPYVLEFMGLPENYEYKENELERAILLDNV NQAKE >gi|229784079|gb|GG667656.1| GENE 10 5403 - 5717 280 104 aa, chain + ## HITS:1 COG:MTH442 KEGG:ns NR:ns ## COG: MTH442 COG0556 # Protein_GI_number: 15678470 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Methanothermobacter thermautotrophicus # 1 92 2 93 646 148 73.0 3e-36 MEFKLHSEYQPTGDQPQAIEALVKGFKEGSQFETLLGVTGSGKTFTMANVIQQLQKPTLI IAHNKTLAAQLYSEFKEFFPENAVEYFVSYYEKQNKPLLIDYDW >gi|229784079|gb|GG667656.1| GENE 11 6599 - 7396 701 265 aa, chain + ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 1 260 1 253 253 196 44.0 5e-50 MNKIIAIANQKGGVGKTTTCVNLGIGLVRKGKKVLLIDADAQGNLAACLGIDEPDNLEVT LVNILAKVVNDEPLDVTEGILHHEEGVDFLPANIELAGLETTLVNVMSRETVLRQYVDEI KDRYDYILIDCMPSLGMMTINSLVAAGSVLIPVEAAYLPTKGLQQLIKTIGRVHRQLNPE 
LGIEGILLTKVDRRTNFARDISDKLRMAYGGQIHIFENCIPLSVRAVETSAEGKSIFLHD PKGIVADGYAALTEEVLAYGKPQCE >gi|229784079|gb|GG667656.1| GENE 12 7374 - 8267 281 297 aa, chain + ## HITS:1 COG:DR0012 KEGG:ns NR:ns ## COG: DR0012 COG1475 # Protein_GI_number: 15805053 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Deinococcus radiodurans # 21 232 20 220 288 93 36.0 5e-19 MGSRSANKVKFTSYNDLFGNESIEQSAEGQVIHLSLDILHSFKNHPFRVLDDEKMMETVD SVRKYGVLIPGVVRKDKQGGYEIVAGHRRKRACELAGYQDMPVIVRDLNDDEATIIMVDS NIQREDLLPSEKAKAYRMKMEALSHQGVKGEEYTADLIGKGAKKTGRTVQRYVRLTYLLP ELLEYVDLKKLLLIPAEKLSFITAEEQGWVLGVILDKRLFPNEIQAEALKDESKAGTLNK KRVPEILISDSKMPSMPKITLSPKKLRDYFPKEYTKGQIEAVIYDLLESWRDSHKVG >gi|229784079|gb|GG667656.1| GENE 13 8418 - 8972 268 184 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621884|ref|ZP_06114819.1| ## NR: gi|266621884|ref|ZP_06114819.1| hypothetical protein CLOSTHATH_03073 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03073 [Clostridium hathewayi DSM 13479] # 1 184 1 184 184 347 100.0 2e-94 MGKSGVITIESKDNQVYHDVAVLFKIYRSVSWRMQVQIGQVKHRFQKEYGTDVDQFLESI YQAGMDINDDEANIKSRVEAINSSNKFLKLIDEAVELMRKYHPQGERYYWILYYTYMSAY QPNNITEIINKLEPHLSQNVRINRSTYFRWRESALKAVEGILWGYEMESRMLLEHYRDTW VIDV >gi|229784079|gb|GG667656.1| GENE 14 9011 - 9190 178 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFENRDTCRKIIQEKSLLSIFDMIAQEEEMMLSERARGTINLQNNVTKQLSSGEVLDNK >gi|229784079|gb|GG667656.1| GENE 15 9482 - 9934 281 150 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621886|ref|ZP_06114821.1| ## NR: gi|266621886|ref|ZP_06114821.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 150 1 150 150 275 100.0 1e-72 MKRDKFKKMKLDIQTLEPVSHPDNCVLLLSEVKMENPEPMELSIGQAMQKKSWMPDGTIL VGLEAKGIRTEDEKLRNLGTSKQAEILDYNFFRSRLPKTVITEVKAELLDFRLRKTIPFS VKKLEFAFGNGKKVDYTDKVSVLSLDQLAS >gi|229784079|gb|GG667656.1| GENE 16 9951 - 10184 173 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870646|ref|ZP_06114822.2| 
## NR: gi|288870646|ref|ZP_06114822.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 77 2 78 78 150 100.0 3e-35 MKNNLQAKRIVITAGEVQLPIKVGEEVCYRYDGHLMWTDRVHRILEVAADYVRIETTSYY YTIERNGRECGQLKGAA >gi|229784079|gb|GG667656.1| GENE 17 10389 - 10769 241 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621889|ref|ZP_06114824.1| ## NR: gi|266621889|ref|ZP_06114824.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 3 126 1 124 124 231 100.0 2e-59 MSMDRQAGFLVFNQKKHINIPALTGQFYVVSMMVLFRCQTAYADIFDTAKSAMQQVYTDV AGIATVGAVVCAAVCLFLMNFSKSGKTVDESRAWLKRIIICWAALMTLGGIVTYMESIIP KSMFSS >gi|229784079|gb|GG667656.1| GENE 18 10845 - 13091 1166 748 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621890|ref|ZP_06114825.1| ## NR: gi|266621890|ref|ZP_06114825.1| hypothetical protein CLOSTHATH_03077 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03077 [Clostridium hathewayi DSM 13479] # 1 748 1 748 748 1314 100.0 0 MGWLLELLFESIKEMCSQFIIDMMDVASSMFTEILSCNLNLFEELFGVAGNLYRDGIMPI AVMILLMILVWQLFKGIFSKITTSSEDPIELILRSAISLFMIVYAKDIVNYILNIVGTPY QWVAGSAITVYSFSAYVTAAEAVVSVLGIDIISIQMLLLIMHFVVAWNYFKMLFILAERY VLLGIFSYTAPLAFAAGGSKSTNNITTTWVRMFGGQILIIILDAWCLKMFLSGYGNLMSS SYGFTKYFAATMCLIGFCKITAKLDSYMASLGVNLGRIGGGLSGMGALLMAGRLLHMGGS GGRTGGGNQEKGASLMGFGTGKPIPMGSGSPNFGVAGISGMRNGSNIGGMNGGVNAEEKP EPGEDFGWHGAENEFDSGMPGTESANAPFGMPDESDFDAQEGSFDGMESREGMDYPGFNY DEDIPMDSSYDEAGEIPGMGYGKDGPFGASLDEPEEAFGTEIGMDEGNISGFVGAEPGGG MEVSGETGGITGMQDGGQEAEPGAGMADTGILSYDSESAQAKEHAGTLPGGAEAMAADYA GGLDSPASSVQSGLPGTSTAGAGISDTDTAINEMAGGGAMDEGQPHSMGMGGSVSNGAGA VSGAESMQEVKGVSQGGMQSGTVMETGTGTYGQSGKGSSDENYGSYLAERDGERYMRYDA SRFEKPQGAYQTIHEDGKTYYELPEKEQAPAVLPETKAVLEKNGMIRLEKVYKETSQERG VPKQTPKKKPENKKNTGKRYADGKGEKR >gi|229784079|gb|GG667656.1| GENE 19 13137 - 15128 924 663 aa, chain + ## HITS:1 COG:CAC2677 KEGG:ns NR:ns ## COG: CAC2677 COG0741 # Protein_GI_number: 
15895935 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Clostridium acetobutylicum # 210 344 107 234 237 106 49.0 1e-22 MNEDRSNEPLDKAAETAARIVRAAGQTEKIAHSIQAVHAATGAVRAGGAASGATVGTALG GPLGTLVGALLTSKTVWKVIGSILLAILLFVYLLVNSVSLIFMYLGFADADAYVSQARTA ECNNIKIQIEALFAADPAFESEIYGLVEALRDKAVEEIRADYDNNQSGYDGYEVDDEYET VLKPRLSQYLAVLMEESWSGKQIEGFNGYGTGGGTSSLTSAYDEYFTLASATYQVSEVLL KAMAKAESDFNPNCVSSAGAIGIMQLMPETAANLCVTNPYDPKQNIMGGANYISQQLAAF SVYPNGLELAIAAYNAGPGAVKRAGYQIPQNGETPAYVQKVLGYMGEGGGNTIPGITQTD SSISKVLLKSFIDEKESAFFDWAKTGTHTETIGSGDEEEEIEIVDYSIVVKLNPHLSSTG TGYSFRYVTDPNSFNNVLKLFQLVQNGKDGALDILFKAASWKNYILGAGASGGIYTSTIA TGGDKIIYETVPGCVREIVYYNQGEEPWASLSYGGSTIRTSGCGPTALAIVISTLTGEQI TPELTAQYAMDTGGYVSGRGTSHAYPANAARNWGLSVERVRREQMNYVVAELKNGKLAVV ICAENTISGNSGHFIVLTGVTSDGNITIADPGSRGRTGNLYSPSTIRSYARDLSDGGIWI IGE >gi|229784079|gb|GG667656.1| GENE 20 15133 - 15786 233 217 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621892|ref|ZP_06114827.1| ## NR: gi|266621892|ref|ZP_06114827.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 217 1 217 217 431 100.0 1e-119 MKHRSKWSFVPSLALSLVLGISGVCYADGLAGAKGETMWDEEYTDTDTGKGTLAVRCQVF QGFQGAVTIYFTGLTGGRAFTVSLDKDVGYIANLSLAGERYKVTGVTAISELREYDCHVE PDTVCVEAGRISICKVFVNPGSIQKFPEKTETLMTEQAEKQEIIERKKQEEQIKPAVEEE MLKTESDRRQIPVLGILGMALILICAGSLFYLRKQDK >gi|229784079|gb|GG667656.1| GENE 21 15801 - 16346 391 181 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621893|ref|ZP_06114828.1| ## NR: gi|266621893|ref|ZP_06114828.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 181 1 181 181 300 100.0 2e-80 MNENNRIDIYTIPPNFAEEGTILSGRVKTRNAFETALILAGLVPLIMSLSVTGKTKLYIG MIVLVPVIILAILGVQGESLFAFIASFFKFLKRRRSLSVPDKRYRLEQSRKKERSERGGK 
IKRERGKKHRKPKKSLEAGTPESQKETDTGAKKKKTKAKWRKCSPKTEEGESHSSEAGGV L >gi|229784079|gb|GG667656.1| GENE 22 16168 - 18804 1328 878 aa, chain + ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 360 853 124 616 617 104 23.0 1e-21 MKEGKSTENQRKAWRRELRRVRKKLIQEQKRRKPKQNGENAPQKRKREKVTAAKQEEFYD LPYTRASKEIPIESIRDGVIYTSDGRYVKIVEVLPINFLHRSASEQRNIIYSYIGYLKIA PPELQIKSISKKADIGLYVEAVKQDIEAEPDGRCRMLQKDYIKLLRSVGYREAVTRRFFL VFSVDSRRGTESEIAATLNSYAQTAKKYLYQCGNEVVISDNPTQDTVEMLYLLLNRKTST TVSLSQRLNQVAAWYIRENGEESTGRIPVAEYFAPESLDFRHGNYVIMDGVYHTYLFVPS ARYRNRVPAGWLSLLVNAGEGIDVDLFLYKQDKVKTIERIGRRIRLNRSKIKDASDTNTD FDDLAESIQSGYYLKNGLAANEEFYYIAILITVTGYSAKEVEWRAKEMRKLLNAQDLDTV SCLFQEEQAFLSALPLLKLEKSIYERSKRNVLTSGAASCYPFTSFEMSDKDGILMGVNTA NSSLVIVDIFNSSLYKNANISIMGTTGAGKTFLLQLMALRMRRKNIQVFIIAPDKGHEFA RACTNIGGEFIKISSSSKSCINVMEIRKKNKETNAILDGAVMEYSELAAKIQDLHIFFSL LIPDMSHEEKQLLDEALVLTYQQKGITHENESLIDPENPEQYKEMPVLGDVYNELIRRQE TKRLANILNRLVHGSASSFNQQTNVDLKNKYIVLDISELTGDLLPVGMFVAVDYVWSKVK EDRTEKKMVFLDEVWQLIGASSNELAAEYVLEIFKIIRGYAGGAVCATQDLSDFMALKEG KYGRGIINATKTKIVLNLENDEAKKVQELLHLSDAERMEIVHFERGKGMISTNSNNLTLE FRASQLENDLITTDRKDLQNLKERLEKFGNHAYGRMGG >gi|229784079|gb|GG667656.1| GENE 23 18809 - 19351 455 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870647|ref|ZP_06114830.2| ## NR: gi|288870647|ref|ZP_06114830.2| hypothetical protein CLOSTHATH_03082 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03082 [Clostridium hathewayi DSM 13479] # 1 180 9 188 188 328 100.0 8e-89 MAVYVSPEAMEERTLMVIRKYNVLYMKQLLHFFEGEEKAAGKAVSRLLKNRQIYRNPHTG LLASNEFAYSLKDEGTIQSLWVLIDMMHKREVDGHYLAVKEDFPVRILFFCGQDIYDIIY IGIGDLKLVNGMFAKSRRSEENHIIVLEDKVMISQIEVPGIIGFCVVKEGGEVEYYRKRT >gi|229784079|gb|GG667656.1| GENE 24 19362 - 19865 31 167 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621896|ref|ZP_06114831.1| ## NR: 
gi|266621896|ref|ZP_06114831.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 167 1 167 167 327 100.0 2e-88 MELLEHQLLEVQRVAGVVQDKLAEAQSVKTLYDRSASSFQYGRLLDCMMEASEHAECLTD RLRGFVLRNTFNSPLKEDYCLKLVQIHEIRVEYEDCILKAELPMLLPHRKKNIQTISINL FFWHCRIGVKSVSSRGRKFLYTRRQRSVLFMSMIKKCRKPGFGTMTI >gi|229784079|gb|GG667656.1| GENE 25 20209 - 21300 299 363 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621897|ref|ZP_06114832.1| ## NR: gi|266621897|ref|ZP_06114832.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 363 1 363 363 753 100.0 0 MKIDRKDELLLLIAVSGEIPSDWVGRAVGSESYAVALLTRLKREGEVKLRSKDGVRGYLL RNKAKQYLLVHYGDDVRLYLSGANSTNHVKSEPEKRLRLHRMSMVWIYFHHAGIRIFQSE KTELFPVFHQAPFSTPDIKGSAPVSYYGTMEWKQATDMEIKGSRACGVLAAERFYVVYNM LDSLMKWVPKTERNVRSRLEVRLRKNRGCITISTIMMGTDMAIVRRILTSMGGLKGNLFS LDDVYECYYYIPFLEEAVIQLRLLCSEAGQVRLYQFLCRALKQVNDNCFTPEAGADENGN LVYFCYMMDLWQVKRIMSLPLRRGGRVFCFTYQAGVLRSILPECFKVEAIRPEKVYRYLG WKE >gi|229784079|gb|GG667656.1| GENE 26 21317 - 22363 677 348 aa, chain + ## HITS:1 COG:no KEGG:AB57_3247 NR:ns ## KEGG: AB57_3247 # Name: not_defined # Def: prophage LambdaBa04, DnaD replication protein, putative # Organism: A.baumannii_AB0057 # Pathway: not_defined # 4 142 3 141 297 152 50.0 2e-35 MEPIYQTGNPTVDRLSRIRLTGNVIPPAWYHTILRETGKPYLIAIVILSDIVYWYRAAEV RDEGSGQLLGYRKRFKADLLQRSYQQMADQFGITKRDATNAIVELEKLGVIRRVFRTLTI NGQSIPNVLFIELDVDILERLTFPEDYAEEVESVKRVQKKSQKKAQGTLTIGYPQFKGDV SPKSEEGLSDWNGASISKKARSVSELGETNTEITYKDYNNEYPIYSYLKVKEEFKNQIEY EILRRDLKELEELEELVEVAVEVLVSSADTIRVNREDKPAELVKEQYRRLNMFHIQYVLT CLRETETKARNIRAVMVTALYNSVNTIGTYYGNLVKYHQKIKYKEGTE >gi|229784079|gb|GG667656.1| GENE 27 22360 - 24288 756 642 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, 
VirD4 components # Organism: Clostridium acetobutylicum # 62 565 74 574 591 394 42.0 1e-109 MNEEQRNRKKLIAGLLCLLILALYAGGVLRQFMVDAGHISLNPIRCLYFAVSSIHGIKTT VFTILLFFAFAGIVVFQGSISDTSGEDKERNFTYSKLGTYGTAGYMEDAERRRVLKEDKD IRKVEGIIFGLEMMSGNVLSLPVDSKLNRNIAVCGSQGSMKSRAFARNMILQCVRRGESM FITDPKSELYEDMVAYLKAQRYTVRQWNLIDLENSDAWDCLAEIDEGGLIDIFVDVVIRN TTDKFDHFYDNVEMDLLKALCLYVYHEYKDKDRTFAESYKILINQSLEAIDGIFDRLPTS HPAKGPYRLFAKAEKVKGNAVLGLGTRLQIMQNEKVQKITSHKEIDLTLPGREKCAYFCI TSDQDSTYDVLATLFVSFLCIKLIRYADRQKNRKLPVPVHFILDEFPNIGIVPDFKKKLA TARGRNIGMSIIYQNIPQLQNRYPDGQWEEILGGCDMSLFLGCNDMTTAMYFSNRSGEVT VSVSSVRKNLNTVRMSNYVPDYSETSSVGKRMLLLPDEILRFPLDQALVIIRGQRILRVR KMDYTMHPDAKKLIQEKTDCHVPEWRREEELRAEQDSKVRQEDIGEQESGFEQGIWSGKD LKKEDESQEVGKSKANEKNIQEEPTERKLGIIEKVEDLFTDE >gi|229784079|gb|GG667656.1| GENE 28 24281 - 24574 147 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621900|ref|ZP_06114835.1| ## NR: gi|266621900|ref|ZP_06114835.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 97 1 97 97 164 100.0 2e-39 MNDSLTFEKKELKFLLEREFFIHIDSFEDEDYAKARRAIFVYENADEFFEATGWERNNPE LRTEDYLTENRVCRWWNGRFIYFSRILWEDNWGTGLE >gi|229784079|gb|GG667656.1| GENE 29 24571 - 25359 276 262 aa, chain + ## HITS:1 COG:lin2285 KEGG:ns NR:ns ## COG: lin2285 COG4509 # Protein_GI_number: 16801349 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 41 199 31 191 246 92 34.0 8e-19 MMKGIVMTTLFILTIFLTVLLAHNWQGLDSEQQAFVELAAYIEDKKKNVLDEGTDSENVK TVDNKGKIEDKEAKILPEYHELALKNPDFAGWIVIDDTAINYPIMQTPKELEYYLHRDFK GKASYAGTPFVGRGNLQEKRDLFLYGHNMRNGTMFADLLKYQKKPFWEAHQVIQVDNLYE HREYRVFSVFYAEETEWSEEGGLFADAGLGSMKREELIEILIERGMHENPIFPDHRAPLL FLITCGYWKKDSRLVVVAIREK >gi|229784079|gb|GG667656.1| GENE 30 25832 - 27295 322 487 aa, chain - ## HITS:1 COG:no KEGG:Elen_3094 NR:ns ## KEGG: Elen_3094 # Name: not_defined # Def: regulatory protein GntR HTH # Organism: E.lenta # Pathway: not_defined # 1 462 1 465 486 
129 24.0 2e-28 MNPQETKFTFVYNEIKQCILEGKILPGNALPSSRLYCEQFHVSRYTINRVFNALRKEGLV DIQPRLAPIVVSTIDAYDSSNAVFEILKQKEGILQVYQTFALILPSLMVFSLQGCDVEVL PHYKQAMKALRLGYAAGGWRPPSKLGYDILRIGGNSLFSELYSTFGLYNKLTFFAEVCPY FSTHFLQESVSATNDIMDILKVNEPLTRYNQLSNMYQRLTDAIQNTLNYLSETTPKCPAQ TGLKFSWNPMRGQDYYYSKIIDDLNLKIGLGAYSVGMFLPYEKQLAKQYDVSISTVRKAL SELEQRGFVKTLNGKGTIVIKPDDTKLHQLALNSRYVEKAQRYLHALQLMVLIIRPAALV AAPRFSREELDELADKFTSSDSIHLADILESIMKHTTLEPLYVILSKTNHLLEWGHYFAY YPSKKHTLSHLNKQVILALEQLREGNADSFADGIADCYRYNLVLMKKRMVEKYKFHNVAN IRVPEKY >gi|229784079|gb|GG667656.1| GENE 31 27464 - 29116 789 550 aa, chain + ## HITS:1 COG:slr2100 KEGG:ns NR:ns ## COG: slr2100 COG3437 # Protein_GI_number: 16330586 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Synechocystis # 3 343 10 358 368 206 36.0 7e-53 MEKQKILIVDDSEMNRALLADILEEQYDVVEAENGIEAISLLSRQRADFSLLLLDIMMPE MDGFEVLAYINKYHWNDTFAVIMISADDSPANIKRAYDLGAFDYISRPFDSTIVKRRISN TMFLYARQQRLEKIIAEQFHEQEKNNKLMISILSHIVEFRNGESGLHILHVNTITKYLLK QLVRCTDQYPLSKADISLISTASALHDIGKIAISDTILNKPGRLTAEELEVMKTHSMVGA RMLSDLPFEQQEAPLVKVASEICRWHHERYDGNGYPDGLKGDEIPIAAQVVALADVYDAL TSERCYKKAYSHKEALNMILEGQCGAFNPTLLLCLQEIADTLENELTDNSSEQETKNIQD IRNKIDYDRLFSYEKYTFLSRKQRHLQLLYIDSLTSVYNRRYYDEHFQGSDNIQAMVVID VDNFKHINDNYGHDVGDIVLQNIAQTVLSCVRKTDAVIRYGGDEFVIIFFSIPANIFEKK LERIRYSVDSLIIDGHPELHMTVSIGGVYGTGTAKGLFKAADSMMYQSKNTKNKVTICYL DENKDIADNI >gi|229784079|gb|GG667656.1| GENE 32 29135 - 29497 309 120 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20567 NR:ns ## KEGG: EUBELI_20567 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 112 1 112 114 87 38.0 2e-16 MNLKDCYLKFGGDFDEVLGRLRREQTVGKFVFKFLDDKSFSLFEASMVKKDYSEALRAVH TLKGICQNLSFTRLFESSSLVTNALKENDWNKAVDMMPKLSKDYYETINVIKDFKNSREE >gi|229784079|gb|GG667656.1| GENE 33 29503 - 32064 1117 853 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # 
Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 331 588 71 323 328 169 37.0 2e-41 MRNSKTKDSTTRFLIYSFIGLLIFSIIIFSLLGIYMSRKSEKTVYEIGQIYMSGMNKQMS NHFETVIKLRFGQVSGIVSVVSADNNNKEKLYEELVYRAQVRNFDHLALCSTEGNFQTLY GQSIQPLNPEPFVEALVRGEQRVALGSDPAGNLVVLFGVDATAYPMQNGSMSTGLVAAVP LEYITDFLSLEDEGQLMYYHIIRPDGSFVIQNPNTELWNFFEQLQKQSDSAVNESSAESP VEKFNAALKKHEEYAATLEVNGEECQIYGISLPYSEWYLVSVMPYSILDDAINNLSSQRM FMTLLSCASVLIILTLIFLRYFSITRSQLNELEKARQTALEANKAKSEFLANMSHDIRTP MNAIVGMTAIATAHMDDRKQVENCLRKITLSSKHLLGLINDVLDMSKIESGKLTLTTEQI SLKEVVEGIVNIMQPQVKTKKQTFDIHVENILTENVWCDGIRLNQVLLNLLSNATKYTPE GGSIQLSLSEEKSPKGENYVRIYIKVKDNGIGMSPEFLKRIYESYSRADGARIHKTEGAG LGMAITKYIVDAMEGTIDIQSEPDKGTEFLLMFDFEKAAAMEMDMVLPAWNMLVVDDDEL LCETAIDALKSIGIRAEWTLSGEKAIELVNEHHKKREDYQIILLDWKLPGMNGIQVAKEI RSNLGDEVPILLISAYDWSEFEAEAREAGISGFISKPLFKSTLYHALCQYMDVGTEHEQT LNQNIDLSGRRILLAEDNELNWEVAKELLTDLGVELDWAEDGRICLDKFQKSSEGYYDII LMDIRMPHMTGYEATKAIRGLNHPDALSIPIIAMSADAFSDDIQRCLEFGMNAHIAKPID IVEVSRLLKRFLT >gi|229784079|gb|GG667656.1| GENE 34 32148 - 32514 264 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621906|ref|ZP_06114841.1| ## NR: gi|266621906|ref|ZP_06114841.1| putative transglycosylase SceD 3 [Clostridium hathewayi DSM 13479] putative transglycosylase SceD 3 [Clostridium hathewayi DSM 13479] # 1 122 1 122 123 210 100.0 2e-53 MKKRVRKLFLMASILCVVSLVVCACGGKKLEQIDAVETAPTQQPEESGETPSEEKASTLE SFTENTFEKWEDSNPNLEGSIKELQEGQFTVVEAITEKADNGGDIVVSPGNGDDSEFNKV TV Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:05:37 2011 Seq name: gi|229784078|gb|GG667657.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld50, whole genome shotgun sequence Length of sequence - 35082 bp Number of predicted genes - 39, with homology - 39 Number of transcription units - 7, operones - 5 average op.length - 7.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 3 - 480 662 ## Closa_3994 hypothetical protein 2 1 Op 2 36/0.000 - CDS 537 - 4244 4449 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 3 1 Op 3 . - CDS 4275 - 4976 322 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 5059 - 5118 3.8 4 2 Op 1 28/0.000 - CDS 5153 - 6058 1153 ## COG2177 Cell division protein 5 2 Op 2 . - CDS 6048 - 6764 318 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 6 2 Op 3 10/0.000 - CDS 6802 - 7863 1248 ## COG1377 Flagellar biosynthesis pathway, component FlhB 7 2 Op 4 17/0.000 - CDS 7906 - 8688 1040 ## COG1684 Flagellar biosynthesis pathway, component FliR 8 2 Op 5 16/0.000 - CDS 8704 - 8967 497 ## COG1987 Flagellar biosynthesis pathway, component FliQ 9 2 Op 6 . - CDS 8984 - 9760 865 ## COG1338 Flagellar biosynthesis pathway, component FliP 10 2 Op 7 . - CDS 9760 - 10134 376 ## gi|266621916|ref|ZP_06114851.1| putative flagellar protein FliO 11 2 Op 8 20/0.000 - CDS 10136 - 11092 1000 ## COG1886 Flagellar motor switch/type III secretory pathway protein 12 2 Op 9 . - CDS 11107 - 11994 1145 ## COG1868 Flagellar motor switch protein 13 2 Op 10 . - CDS 11991 - 12842 1072 ## COG1360 Flagellar motor protein 14 2 Op 11 24/0.000 - CDS 12844 - 13278 579 ## COG1558 Flagellar basal body rod protein 15 2 Op 12 . - CDS 13291 - 13647 473 ## COG1815 Flagellar basal body protein 16 2 Op 13 15/0.000 - CDS 13664 - 14044 553 ## COG1516 Flagellin-specific chaperone FliS 17 2 Op 14 . - CDS 14060 - 16891 2152 ## COG1345 Flagellar capping protein 18 2 Op 15 . - CDS 16903 - 17379 522 ## gi|266621924|ref|ZP_06114859.1| hypothetical protein CLOSTHATH_03111 19 2 Op 16 5/0.000 - CDS 17366 - 17647 300 ## COG1551 Carbon storage regulator (could also regulate swarming and quorum sensing) 20 2 Op 17 . - CDS 17641 - 18087 558 ## COG1699 Uncharacterized protein conserved in bacteria 21 2 Op 18 . - CDS 18116 - 18706 681 ## Closa_3431 hypothetical protein 22 2 Op 19 . 
- CDS 18731 - 19714 1068 ## COG1344 Flagellin and related hook-associated proteins 23 2 Op 20 . - CDS 19725 - 19871 188 ## gi|239628960|ref|ZP_04671991.1| flagellar hook-protein FlgK - Prom 19944 - 20003 11.2 24 3 Op 1 . - CDS 20907 - 21443 669 ## COG1256 Flagellar hook-associated protein 25 3 Op 2 . - CDS 21472 - 21918 606 ## gi|288870656|ref|ZP_06114866.2| hypothetical protein CLOSTHATH_03118 26 3 Op 3 . - CDS 21932 - 22234 210 ## gi|266621932|ref|ZP_06114867.1| anti-sigma-28 factor, FlgM - Prom 22280 - 22339 3.7 27 4 Op 1 . - CDS 22479 - 22967 364 ## Cphy_1341 hypothetical protein 28 4 Op 2 . - CDS 22960 - 23733 813 ## COG1192 ATPases involved in chromosome partitioning - Prom 23760 - 23819 5.9 29 4 Op 3 . - CDS 23830 - 24666 702 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase - Prom 24865 - 24924 7.7 + Prom 24827 - 24886 6.3 30 5 Tu 1 . + CDS 25134 - 26135 787 ## EUBELI_01420 hypothetical protein + Term 26232 - 26272 8.2 - Term 26217 - 26264 10.8 31 6 Tu 1 . - CDS 26309 - 27808 862 ## COG1344 Flagellin and related hook-associated proteins - Prom 27950 - 28009 3.4 - Term 27972 - 28014 12.1 32 7 Op 1 . - CDS 28016 - 28900 776 ## COG1876 D-alanyl-D-alanine carboxypeptidase 33 7 Op 2 . - CDS 28938 - 29588 635 ## gi|266621939|ref|ZP_06114874.1| type IV pilus assembly protein PilZ 34 7 Op 3 . - CDS 29652 - 30035 342 ## COG2257 Uncharacterized homolog of the cytoplasmic domain of flagellar protein FhlB 35 7 Op 4 . - CDS 30022 - 31485 1524 ## Closa_3422 hypothetical protein 36 7 Op 5 . - CDS 31485 - 33188 1451 ## COG1315 Predicted polymerase, most proteins contain PALM domain, HD hydrolase domain and Zn-ribbon domain 37 7 Op 6 8/0.000 - CDS 33227 - 34006 984 ## COG4786 Flagellar basal body rod protein 38 7 Op 7 . - CDS 34020 - 34760 991 ## COG4786 Flagellar basal body rod protein 39 7 Op 8 . 
- CDS 34769 - 35080 462 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit Predicted protein(s) >gi|229784078|gb|GG667657.1| GENE 1 3 - 480 662 159 aa, chain - ## HITS:1 COG:no KEGG:Closa_3994 NR:ns ## KEGG: Closa_3994 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 159 1 159 310 276 84.0 3e-73 MKNTALVIMAAGIGSRFGGGIKQLEPVGPSGEIIMDYSIHDALEAGFNKIVFIIRKDLEK DFKEIIGRRIEKIAPVEYAFQELTDLPEGYEKPADRTKPWGTGQAILCAKQVIHEPFIVI NADDYYGKEGFKKIHEYMVNEMDTESDVFDICMGGFILG >gi|229784078|gb|GG667657.1| GENE 2 537 - 4244 4449 1235 aa, chain - ## HITS:1 COG:lin1187 KEGG:ns NR:ns ## COG: lin1187 COG0577 # Protein_GI_number: 16800256 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Listeria innocua # 2 1235 3 1136 1136 653 34.0 0 MKNAKRKDFMVEIGRSMNRFLSILFIVALGVAFFAGIRSTRPDMELSADVYYDDTNLMDI RIVGTLGLTEDDVEAVRQLPETEHAEGAYTTDVLCESPDREQTVRIFSMSGQINQFTLLD GRMPERAGECLMDRGKAYKGQFGLGDTVKVCSGTDEELSDTLKCDTFTIVGIGMSSRYLS LERGTTTIGNGSLDTFMAVTPDNFSMDVFTDIYVQTKQGKSFLSYSDEYQDYIDLVTKEI EDTLSEPRSAIRYRDVTDEANRALNDARRELTEKTAETDRKLDDAARQIADAQKKIDEGK DDIKKGRADLEKGKTDLENAKNDLEDGKRTLAEKEQELADAKVTLAEKEQELEDGKVTLA EKEQELEDAKKEVTDKEKELEDARKKFDEELPGARMELDDGWLEYEIGLEEYEDGKAELE EQQETLKEGWKEYRKNAPEVERNLKKALNGLIAMGVPLDITPSEISRIQVEEFEIDAYAP WLEMIPGGEGAMLVEGVHTLQASYQELTYGDAAVKEGKKELAKAKRELAKGKKKLDQGEM KLKNARRKLDEGEEQLADARQKIIDGEQKVEDAKQDIVDGEAKIADAHRDIADGEEKIAD AHRDIADGEAKIPKAEKDIAKGEADIEEAQKKLNDGEADLEEAKGKYEDSRREAEEKIAD AGKKIDDAQKEIDDIKEPEWYVLDRNSIQTYVEFGSDAERIGAIGKVFPFIFFLVAALVS LTTMTRMVEEQRLQIGTLKALGYGKWSIASKYIGYALAATLAGSILGAAAGEKLLPYVIM TAYFILYENLPVMLTPFNLHYALLASGLALACTVLATVFSSYRELLSSPAALMRPVPPKQ GKRVLLEYVTPLWRRLTFIQKATVRNLFRYKKRCLMTIFGTGGCMALLLVGFGLNDSIST LGEKQYNDIWTYDVLVQTDDEADEAEQEKLRSFVSQGDVEAWLRVEETILDVSAGDVTKE AYLFVPENLDEVNRFVHLRSRESREPYVLDDDHVLITEKMAGLLDVKVGDTIFLKEDEGT HHPVVVGGIAENYLHHYVYMTPALYEKVYGEAPEYNTMFLNLKDLNEETQNRLGTELLSF 
DSVMAVTFIGTMRGQINDMLDSLNVVIWVLIISAGLLAFVVLYNLNNININERQRELATM KVLGFYNPELAAYVYRENIMLTILGSMVGIVMGYFLHRYVMTTVEVDMLMFGRQIRPVSY AYSVLMTYAFSAFVNLAMYDKLKGIDMVESLKSVE >gi|229784078|gb|GG667657.1| GENE 3 4275 - 4976 322 233 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 214 2 215 245 128 35 4e-29 MAFVELKDVRKSYHMGEITIHAVDGISFAIEKGEFVVIVGPSGAGKTTVLNILGGMDTAT SGEILVDGVNIASYPEKKLTGYRRNDIGFVFQFYNLLQNLTALENVELALQVCKNPLDAE RVLIDVGLGERLNNFPAQLSGGEQQRVSIARALAKNPKLLLCDEPTGALDYQTGKAILKL LQDTCREKKMTVIVITHNSALAPMADRVIQIKNGKTASIVLNEKPLPVEKIEW >gi|229784078|gb|GG667657.1| GENE 4 5153 - 6058 1153 301 aa, chain - ## HITS:1 COG:BS_ftsX KEGG:ns NR:ns ## COG: BS_ftsX COG2177 # Protein_GI_number: 16080578 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Bacillus subtilis # 11 301 10 292 296 108 26.0 1e-23 MAVRTWLYLCRLGFKNLWHNKVYTAASVLTMAACIFLFGIFYIAVLNLDGVLTKTEEDVY VAVFFEETAAPERVEEIGNLIRERPEVERTVYTSADDAWEGFRNDYFEDGELLDGVFDGD NPLAVSNHYQVYIKGIEQQEGFVEYVKGLEGVRKVTHSADTVRALLRIKTVISRLTAGSA AILLLISILLIHNTLSVGIESQKGKTRVMRLMGAREEFISVPFLVEGLVMGLAGMCIPIL LLLAGYRWGLELVAAGLSLSGGTVSLLPPAEVFPKLTAASVILGLVTGVAGGWSVMGKLK K >gi|229784078|gb|GG667657.1| GENE 5 6048 - 6764 318 238 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 11 230 1 222 245 127 33 1e-28 MEKGSTVDNTMISFDRVTKTYGTGQKALDDVSLQIEKGEFVFLVGNSGAGKTTLLELILR ETEADQGNIVVNGIDLRQLKERQVCRYRRYIGMVFQDFCLFPDFTVYENVAFAQRVVEAD HKRMRSRVMEVLGQVGLEKKAGCYPGQLSGGEKQRTALARAVVNQPVLLLADEPTGNLDQ KNAEEIMRLLEKINRQGTTVLVVSHNQELVKYMHRCQIALRYGKVIKDSSRGGLMYGC >gi|229784078|gb|GG667657.1| GENE 6 6802 - 7863 1248 353 aa, chain - ## HITS:1 COG:BS_flhB KEGG:ns NR:ns ## COG: BS_flhB COG1377 # Protein_GI_number: 16078701 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, 
component FlhB # Organism: Bacillus subtilis # 3 351 11 358 360 290 44.0 4e-78 MAADEKTEKATPKRRQDERKKGNVFQSSDVIAVCSLLVLFNSLNALAPMIYRTLKASMEL FFSYAASTEPLTAEDLPDKLLKAMFLFGQAALPLLFIGVAVAVVTTFFQTRLSFSADVMK FKMDRISLLKGFKRMFSIRSVVELLKAMIKITILAWVIYSYIKGRIHEFARLMDGTVQAA FVYVGDTAVSLVNTVGAAFIFLAAFDYLYQWWEYEKNLRMSKQEIKDEYKQTEGDPQIKG RIREKQRQMASMRMMQNVPKADVIIRNPTHFAVALGYDSNTNRAPVVLAKGADHLALKIV EVGEANGVYIMEDKPLARGLYASVEVDMEIPEEFYQTVAKVLAFVYKLKKKDI >gi|229784078|gb|GG667657.1| GENE 7 7906 - 8688 1040 260 aa, chain - ## HITS:1 COG:BH2440 KEGG:ns NR:ns ## COG: BH2440 COG1684 # Protein_GI_number: 15615003 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliR # Organism: Bacillus halodurans # 6 259 5 256 258 99 28.0 8e-21 MDSGVLEYFDVFLLVFARMGGLIFVHPVFARRGIPAMVKTGLVLTLSLVIAPTAVQGSAA IQAYSTFQMAEALMREVIMGLVIGSVFQLFFYMIYAAGDLMDTVFGLSMGKIMDPAGGVQ TAILGQFIHVFFFLYFFATGCHLLTVKLFAYTYEVIPVGAARLVTQDIAWYLVSIFGSVF LMVIKLALPFVAAEFVLEVSMGVLMKLIPQIHVFVINIQSKILLGILLMMLFARPIGAFI DSYFGMMMSEVQKVMMMFAS >gi|229784078|gb|GG667657.1| GENE 8 8704 - 8967 497 87 aa, chain - ## HITS:1 COG:RSp0374 KEGG:ns NR:ns ## COG: RSp0374 COG1987 # Protein_GI_number: 17548595 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliQ # Organism: Ralstonia solanacearum # 1 81 1 81 89 59 43.0 1e-09 MTADQVMEVMKEAMMVAFEIAGPLLIISIVVGLIVAIFQAATQIHEQTLTFVPKLLVIAL VLLALGSWMFKVMDEFVVELFAIMASL >gi|229784078|gb|GG667657.1| GENE 9 8984 - 9760 865 258 aa, chain - ## HITS:1 COG:BS_fliP KEGG:ns NR:ns ## COG: BS_fliP COG1338 # Protein_GI_number: 16078698 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliP # Organism: Bacillus subtilis # 59 257 29 221 221 199 56.0 5e-51 MKGIRKIAGGVLGLAAFIRVSAACGQIAFASDISLTLEGSSGSGKPMDMLDIMFLFLFLA 
VVPSLLIMMTSFTRIVIVLSFLRNALGTQQSPPNQVMIGLALFLTLFIMSPVFAKVNTQA YEPYKAGAITREEAFTRAQEPMKEFMLKQTEKKSLDLFLSISKTEVPDVSGGPEGYMKLG LTVLVPSFILSELNKAFTMGFLIYIPFLIIDLVVSSTLMSMGMVMLPPTMIALPFKIMMF VLVDGWSLVIKTLVTGFR >gi|229784078|gb|GG667657.1| GENE 10 9760 - 10134 376 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621916|ref|ZP_06114851.1| ## NR: gi|266621916|ref|ZP_06114851.1| putative flagellar protein FliO [Clostridium hathewayi DSM 13479] putative flagellar protein FliO [Clostridium hathewayi DSM 13479] # 1 124 1 124 124 202 100.0 5e-51 MAVLKVLFYLIVLILVLILAYVTTRMLGRGMNGRQQSGDMKILDRLAVGRDSYLMIVDVQ GRILLVGVSPAGITRLEELETYERKSPAEVSPDFASVFTEQLKTLVKKDGRQQDERQKKN GRDL >gi|229784078|gb|GG667657.1| GENE 11 10136 - 11092 1000 318 aa, chain - ## HITS:1 COG:BB0277 KEGG:ns NR:ns ## COG: BB0277 COG1886 # Protein_GI_number: 15594622 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar motor switch/type III secretory pathway protein # Organism: Borrelia burgdorferi # 224 318 13 112 113 71 40.0 2e-12 MGYLDQEALTELLDMEAVPVCNTLASLLGRNARAAVTELGETDRETLTEYLPHFNVVIEA ERTAEGLRVPQLYVFDRADMLKISNFIMGLPVNTESPLDEIALSTLKEVASQCMGAAMEN LGDFLGREMGEVLTRVTAFDSAERILDTISLWDKVPHLLLIRFHLEIEGVLSAEVLGIAT EELQEIFGIPVMGQEPAAEEAVHTEELPFRKKEKTVAFREVSFPEFKYVPLDNQVDHIGE ERKKLRDITLDVSVRIGGTVCSVKDILALQKGQILTLDKQAGSPADVVVNGKLIGKSDVL VTDDKFAARIIEIIGKRD >gi|229784078|gb|GG667657.1| GENE 12 11107 - 11994 1145 295 aa, chain - ## HITS:1 COG:Cj0060c KEGG:ns NR:ns ## COG: Cj0060c COG1868 # Protein_GI_number: 15791452 # Func_class: N Cell motility # Function: Flagellar motor switch protein # Organism: Campylobacter jejuni # 2 291 38 320 359 139 28.0 6e-33 MIKTYDFKSPKKFTKERMSTVENLYDSFARALAPYLTGLMQSYCEISVTGIVERRYQEFS NSVPDRSLFGMITLIPDNKDYNEAPLVLEVDTRLAFFMIERLLGGAGTEYELSRDFTDIE KAILQYLLGKITEFIGDSWKEYLDVDAALTGLQTNPHLLQVSAPEDVVIQVELEVRIDQL SAVLNLVMPAPNVEELTSKFGYTYAVSQKKDEAKQLAKRGYIEQHLLESEVELRAIFHEF SLDAQDILQLQPGDVVPLNKKVDSSIDIYVEDIACFEARLGHTKLRKAVEINKVL 
>gi|229784078|gb|GG667657.1| GENE 13 11991 - 12842 1072 283 aa, chain - ## HITS:1 COG:TP0724 KEGG:ns NR:ns ## COG: TP0724 COG1360 # Protein_GI_number: 15639711 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Treponema pallidum # 2 245 4 233 238 84 31.0 3e-16 MKKRREEGGGQEWLNTYADMITLVLTFFVLLYSISNVNISKLEEIAAAMQRQLGIESTND LEDVPQDLKYPSISEGAGAPEGEGSQFNTGTGGGARSASSLAMADMAKDIQSYFQMENLD AVVSDSENAVYIRFKNDLLFAPDSAVLQENSKSMLDALGIMLKDRQDEIMAIYINGHTAQ AANSLINDRLLSSERADNVAIYLEDHVGLEPKKLICRGYGKYYPIADNSTKEGRELNRRV DMIILGNDYEVSQDNLDGIETMDPLFPVDMPADMSGGQEGTAQ >gi|229784078|gb|GG667657.1| GENE 14 12844 - 13278 579 144 aa, chain - ## HITS:1 COG:BH2460 KEGG:ns NR:ns ## COG: BH2460 COG1558 # Protein_GI_number: 15615023 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Bacillus halodurans # 1 144 1 152 152 139 51.0 2e-33 MGYLDSLNITGSALTAERFRTDIILQNLANQNTTRTAEGGPYRRKQVALRENQLDFKSEL GKAMTKADRGGVYVEEVVESQNPLVPVYDPDHPDANEDGYVMMPNVNSAEEMVDLMAATR AYEANVTALNIAKSMALKALEIGK >gi|229784078|gb|GG667657.1| GENE 15 13291 - 13647 473 118 aa, chain - ## HITS:1 COG:BH2461 KEGG:ns NR:ns ## COG: BH2461 COG1815 # Protein_GI_number: 15615024 # Func_class: N Cell motility # Function: Flagellar basal body protein # Organism: Bacillus halodurans # 1 113 1 128 132 64 34.0 5e-11 MPLFEDAAFRALQSGLDAMWLKQQVAGHNIANVETPGFKAKKVEFKQVLKEAADGSGPVY KAVVGTDNDTEARPDGNNVQVEKEELDLWKAYTEYSAMTSRMSGKLSTLRYVINNTGK >gi|229784078|gb|GG667657.1| GENE 16 13664 - 14044 553 126 aa, chain - ## HITS:1 COG:BS_fliS KEGG:ns NR:ns ## COG: BS_fliS COG1516 # Protein_GI_number: 16080586 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport; O Posttranslational modification, protein turnover, chaperones # Function: Flagellin-specific chaperone FliS # Organism: Bacillus subtilis # 1 126 3 128 133 68 29.0 2e-12 MQNPYARYKEQSVMTMTQGDMINLLFDEVINRLNKGLVSLEQNDYEGSNAHFKKAQAIVA HLSSTLDHQYEVAGGLASLYEYFNYQILQANIKKSPEPVEEVLPMIAELKEAFSQADKQV RMGHTG 
>gi|229784078|gb|GG667657.1| GENE 17 14060 - 16891 2152 943 aa, chain - ## HITS:1 COG:BS_fliD KEGG:ns NR:ns ## COG: BS_fliD COG1345 # Protein_GI_number: 16080587 # Func_class: N Cell motility # Function: Flagellar capping protein # Organism: Bacillus subtilis # 592 939 153 494 498 152 33.0 4e-36 MSSTSSINTVSNSGNKNYIAGLASGMDTESMVQSMLSGTQSKIDKQTGLKQQLEWKQDIY RSLITKINTFSDKYFSYYGSGDTNLLSSSLYNTMTGVSSSSNIKVTSVSGNAVSSMKIDS IERLATACTVKTAGTVTGTPVGSGADMSQFTNGDTYSFSATLDGVNRTITFTAGANDADT IKNINQALYRNFGTAVGMKYTAGTGSDKGTMELVQLDAAGNPTDVPVDSSRRVIIQSSGD AGTVENLGFGTGFSNKLDYGTSLKNMNFANKPQGGRYEFEINGVTIKGLTEDSTLSDVIS AINNSGAGVKVSYSSAADKFIMESTSTGDISNITMSQTYGTLLTTMFGVDAKGVQSSLFS KRLAGGEAEVLDSIRTDLNAGHDKKFVFQVDDKTVEVTLEGKNGSSNYKTNQSVIDALNS QLDKKLGTGAVCFTIETKNNAGTDVDTIILASKTHKISFVQDNLDNGIAGALGFSDGINN LLGTGNSLRAAGFTGTMVINGVTLTESSKISELVNALGAGAAYSADGRIEISGNTTISIS GDNADGKAALKKLFGVENISIAPVNGSFTSNAASSIPGSGIIYVKAGNKNIMVNMASSSP PDLDAFKNAVKEATGGIVEFNGDNKLTITGGTNIATISGTDAATSQFIKELFGTDNYTFS PQMTAQITAGQNAKLTVDGVTMERNTNTFELDGITMELTGTTGAGDPPISLITSRDTDKI IDSLKSFVEDYNTLIEELNSYVSEEATYKKYAPLTDAQKKEMSDREIELWEEKSKKGLLH NDSNVSSFLSDMRMVLYSSVEDAGLALYDIGIETSDNWRDNGKLVIDEDTLKSMVATNAD NIRKLFTDKEQGVAVKLQQAIREAANVSSGSPGSMVRYAGTKDVLITSNTIYDEMKRISE TLSKLNTKYELERTRYWKQFNAMEQAISNMNSQSSWLTQQFSS >gi|229784078|gb|GG667657.1| GENE 18 16903 - 17379 522 158 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621924|ref|ZP_06114859.1| ## NR: gi|266621924|ref|ZP_06114859.1| hypothetical protein CLOSTHATH_03111 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03111 [Clostridium hathewayi DSM 13479] # 1 158 1 158 158 244 100.0 2e-63 MEMIDEMRKLLHEKKACFLQYEEETCRIASQECREAEEFEAGIAARQRLIEEIDGIDERL KAMKNFSAEGERLFEISKNRWDFRKLSEEEQALFSEGQEIFTVMTRIRELEPGALAEMER VRNLLQDKIRKNNTNTRFTGYLKQMDQGTKGMLYNQKR >gi|229784078|gb|GG667657.1| GENE 19 17366 - 17647 300 93 aa, chain - ## HITS:1 COG:BH3617 KEGG:ns NR:ns ## COG: BH3617 COG1551 # Protein_GI_number: 15616179 # Func_class: T Signal transduction mechanisms # Function: Carbon storage 
regulator (could also regulate swarming and quorum sensing) # Organism: Bacillus halodurans # 1 63 1 64 75 57 50.0 6e-09 MLILTRKKGESIKIGDDIEIFVTEVKGDKVRLGISAPEDMKISRTELYLTVENNKEASDK VDLLKVFQLSRNLRAMSSEEEEEKQQEENHGDD >gi|229784078|gb|GG667657.1| GENE 20 17641 - 18087 558 148 aa, chain - ## HITS:1 COG:BH3618 KEGG:ns NR:ns ## COG: BH3618 COG1699 # Protein_GI_number: 15616180 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 3 142 4 143 151 73 32.0 1e-13 MKIETRDFGILEIEEKNIITFKQPIFGFEEFTQFVLVNDSNMGNGICWLQSIEQKNVCFI MLNPLEVRRDYAPVVMQDVLIMLQAVPEDDLDCWVLMVIGETFRKSTVNLKSPIIINHKT NLAVQVILEQDYAIRQPIFDEEAEDDVC >gi|229784078|gb|GG667657.1| GENE 21 18116 - 18706 681 196 aa, chain - ## HITS:1 COG:no KEGG:Closa_3431 NR:ns ## KEGG: Closa_3431 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 190 4 189 202 88 30.0 2e-16 MNVLKITTTPIKLSMTSQRARLDSQIPDPEIGIIQNPGRLNMKSETIKVNIDTSRSRDSM GFKTARGLMSDAAQAGIKAAADATAQYSRVGNQMMQIQDGVTVADIMKNQMLAQTSVTTG IGFIPSVGADISWTPADLAVDFTPASVDFKPMVQKTEAVYVPGELNVNVEQYPKVDIEYI GEPMYVPPSANPAYEE >gi|229784078|gb|GG667657.1| GENE 22 18731 - 19714 1068 327 aa, chain - ## HITS:1 COG:lin0714 KEGG:ns NR:ns ## COG: lin0714 COG1344 # Protein_GI_number: 16799789 # Func_class: N Cell motility # Function: Flagellin and related hook-associated proteins # Organism: Listeria innocua # 1 326 1 291 291 68 25.0 1e-11 MRITENMMTAMYNRNLQRNVANLASSNLKLTSQRQYNHVSEDPAAAAKAFTVRDQIARSE EHINTVKNAVGELDTADSNIATINSILETVFEKATRAGGASSQDNLDAIAEELGGLKEEI LQTMNARYGDKFLFSGSANGEAPFTLDAEGNLLFNGKAVDAYKPDDPATHFNENKPVYLD IGFGTYASGTNTEGTGIRISTSGVDVLGYGTDDQGIPNNIYSLIGKIESQLKDGDKSGAM DTLSQLKKKQSNISIATSEIGTREKLLDRTKDRLESGLVNLKETQKNLEAVSVETEAVNN KSYETAWMVTLQLGSSIIPPSIFDFMK >gi|229784078|gb|GG667657.1| GENE 23 19725 - 19871 188 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|239628960|ref|ZP_04671991.1| ## NR: gi|239628960|ref|ZP_04671991.1| flagellar hook-protein FlgK 
[Clostridiales bacterium 1_7_47_FAA] flagellar hook-protein FlgK [Clostridiales bacterium 1_7_47FAA] # 1 48 442 489 489 87 93.0 4e-16 MGVSGVSLNEEGINMMTYNKAYSALGRLMTTMDEQLDMIINQMGLVGR >gi|229784078|gb|GG667657.1| GENE 24 20907 - 21443 669 178 aa, chain - ## HITS:1 COG:BS_flgK KEGG:ns NR:ns ## COG: BS_flgK COG1256 # Protein_GI_number: 16080594 # Func_class: N Cell motility # Function: Flagellar hook-associated protein # Organism: Bacillus subtilis # 1 175 1 174 507 95 32.0 5e-20 MRPTFMGLETAKRGLMVNQKALDIIGNNISNINTKGYTRQRLDTVSVQVYGSDRFQYSSI PLAGQGVDARGVSQIRNPYLDTKFREQYSDVGYFDQKAAIMEQMEGILSDPEVEGTGIKD ALTVLSQALADFSQHPYQETNANIVLNAFKGVTQVLNEYDTNLKSLEEQTKNDLSIAS >gi|229784078|gb|GG667657.1| GENE 25 21472 - 21918 606 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870656|ref|ZP_06114866.2| ## NR: gi|288870656|ref|ZP_06114866.2| hypothetical protein CLOSTHATH_03118 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03118 [Clostridium hathewayi DSM 13479] # 14 148 1 135 135 207 100.0 3e-52 MEELIRFLTEYTEMFEKMEQKQVEKLGLLMTKELDKIEESIMMQQAMDKQLQNMEQRRQE LFAKAGLNGKTLQEVAELCGGEERKQLTDLYRRLDGAIGNIRFYNEKAESLAKSELEKMG LDARLVGNPTGIYGKSIGPKGQRFEKKV >gi|229784078|gb|GG667657.1| GENE 26 21932 - 22234 210 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621932|ref|ZP_06114867.1| ## NR: gi|266621932|ref|ZP_06114867.1| anti-sigma-28 factor, FlgM [Clostridium hathewayi DSM 13479] anti-sigma-28 factor, FlgM [Clostridium hathewayi DSM 13479] # 1 100 1 100 100 141 100.0 1e-32 MDVKVTRNYSVYQNAVNSGRHAEKSVSVPASADKKVRGDAICISSDGAKKSEASSFAAAL RKSMDEGAPADRIAALKQQVSEGTYQVSAEQIAKRLMSGL >gi|229784078|gb|GG667657.1| GENE 27 22479 - 22967 364 162 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1341 NR:ns ## KEGG: Cphy_1341 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 77 162 2 88 93 84 43.0 2e-15 MAKSRSEIDKEMMYRKIMPSAAKKSGSAVSGPDESAVTVDMGAAGLMGSTAAVPTAASMA KAIRRPAVSLPVKEEQNMVLINLMEELVLSKLDATLDRFNCCKCDKCKKDIAALALNRLK 
PHYVVMSEQDEDHRRKNEEMYASEVTSALIQAILMVKKEPRH >gi|229784078|gb|GG667657.1| GENE 28 22960 - 23733 813 257 aa, chain - ## HITS:1 COG:CC3753 KEGG:ns NR:ns ## COG: CC3753 COG1192 # Protein_GI_number: 16127983 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Caulobacter vibrioides # 5 251 9 258 267 167 41.0 2e-41 MAATIVLSNQKGGVGKTTSAYVLASVFKSKGYRVLAVDMDPQGNLSFSMGAETDGCATIY DVLKGELKPKYAVQKSSLVDLIPSNILLSSIELEFTGARREFLLKMALDSLKPYYDYIFI DSPPALGILTVNAFTAADYILVPMLSDIFSLQGIMQLNETIERVRSYCNPDIKVLGAFLT KHNPRTRFSREVEGTLNMVAADLDFSVMNTYIRESVALREAQSLQKSVLEYAPGCNAVKD YEKLADELFQRGLEANG >gi|229784078|gb|GG667657.1| GENE 29 23830 - 24666 702 278 aa, chain - ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 1 250 5 258 290 162 34.0 5e-40 MKRIVTLQDISCVGCCSITVALPVISAMGVECGILPTAVLSTHTMFKTFTCKDLSDQIAP ISQAWKAEQISFDGIYTGYLASAEQCEQICDFFDQFATGENLVLVDPAMADNGKLYPAFD ENFPAAMAKVCAKADVILPNITEAALLTGMPYRTDYDESYIREMLERLLALGCKTAALTG VSFEPDKLGVAYLDREGESFSYFTHRCPQSYHGTGDLYSSVVLGGMMRGLSLGSALTLAA DFVVLCIEATAAAGSSRWYGVEFESQIPRLCEMLEKCL >gi|229784078|gb|GG667657.1| GENE 30 25134 - 26135 787 333 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01420 NR:ns ## KEGG: EUBELI_01420 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 276 10 279 321 154 33.0 6e-36 MGQKDISLARYFEDESRYADLINGFIFGGKQVVSGNDIQDLDSRVTGLMSKIKKGFKVQK YRDSVRKVVLGLGFAIIGLENQDRIHYAMPIRIMLEDAAGYDKQMRQIQKHHRNKKDLQK EEFLSGFTIHDKVYPVITICIYYGDKPYNGATELYQIMEYETLPDKLKPFLNNYKIHVLE IKHFHDIDCFQTDLREVFGFIQRSGSPDEERKFTFENREKFQELDEDAFDVITTVTNSKE LELLKNEYQDKRGKINMCEAILGMIEAGRVEGRTEGEAKIVAMIRRKYDRKKNLLSISDE LELDYSYVKEVIDLINTYPDWTDLQIGETLIKQ >gi|229784078|gb|GG667657.1| GENE 31 26309 - 27808 862 499 aa, chain - ## HITS:1 COG:BH1477 KEGG:ns NR:ns ## COG: 
BH1477 COG1344 # Protein_GI_number: 15614040 # Func_class: N Cell motility # Function: Flagellin and related hook-associated proteins # Organism: Bacillus halodurans # 1 499 1 464 464 157 32.0 6e-38 MRIQHNIMALNSNRQLGINNSAVARSLEKLSSGYRINRAGDDAAGLAISEKMRAQIKGLE AATDNSQDAISLVQTAEGGLQEVHSMLNRMTELATKSANGTYTDDVDRKALQDEISALKD EINRIADGTDFNGIKLLDGTMGTGSAGKVDTVAAGVTATTMDAATKFTFSGPEGYAVKIQ AAEKNTASSAAWNGTTLTITLNGDKSKTYTQEDIDNLIRNASGKPKDAGVNIKLTVDRDF KLGDDTGTGATSDIKTLTSEAASQAFAVDNTAGIKVTAKKAGLNNSSLTFKNTPADKVGG SITGAGAYEVNLVAATSYTSSEINKMLADAGIELQVDFEGTKTGADMAAAGTTGAFTLAG GTGLPEGGGLKLQIGDTSDSYNQLELGIADMHVNALDLTSVDISTRDGAAAAMSKIKTAI NTVSTSRGKLGAIQNRLEHTINNLGVTTENITSAESRIRDVDMAKEMMNYTKNSVLVQSA QAMLAQANQQPQSVLQLLQ >gi|229784078|gb|GG667657.1| GENE 32 28016 - 28900 776 294 aa, chain - ## HITS:1 COG:BS_yodJ KEGG:ns NR:ns ## COG: BS_yodJ COG1876 # Protein_GI_number: 16079020 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus subtilis # 101 287 82 267 273 120 36.0 3e-27 MGKKKPENRKVIRNRRVMIYTFLLIFMLLYISAHFTVQERKAAEAEQRAEAVTVAAVEIR EEKLMQESSDRNAPAGELKEGGSMEETADGGSAESKGLVINPEDMWCLILTNAEYPVPED YTVTLKDVPGTDQTVDERIYEPLMKMLEAMKAEGLSPVVCSGYRTLDKQEKLFNRKVSSY VKKGHSKEEANALARQTLSIPGSGEHCLGLAVDFYTRSYHQLERAFERTPEGKWMREHAQ DYGFTLRYEEGKEEITGIGYEPWHFRYVGVEVAQYLKEHNLSLEEFYIEESLYG >gi|229784078|gb|GG667657.1| GENE 33 28938 - 29588 635 216 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621939|ref|ZP_06114874.1| ## NR: gi|266621939|ref|ZP_06114874.1| type IV pilus assembly protein PilZ [Clostridium hathewayi DSM 13479] type IV pilus assembly protein PilZ [Clostridium hathewayi DSM 13479] # 1 216 1 216 216 420 100.0 1e-116 MVDLSDGYVGSSCVIKAKNNDLITMGVMHRIGKNFIDIGSSRNELPSIPYNLLVKIEIYN TQIGFKVLIGRVYLSSPKLARVIDLNEATNDERREYFRISTRDTGVIYNCMRGDDILDME EESTDYNGLKVHLVDISLGGLMFRSREIFKIKDRFNIVIPAMGDSMLFTCEVRRSVDRPD GEFGYGCEFMEMATKQEDLLYRYILKRQSDQLRRIR >gi|229784078|gb|GG667657.1| GENE 34 29652 - 30035 342 127 aa, chain - ## HITS:1 COG:BH2473 
KEGG:ns NR:ns ## COG: BH2473 COG2257 # Protein_GI_number: 15615036 # Func_class: S Function unknown # Function: Uncharacterized homolog of the cytoplasmic domain of flagellar protein FhlB # Organism: Bacillus halodurans # 5 78 6 79 92 60 43.0 6e-10 MSKYKKNKAVALKYNVEEDASPVVIASGYGTVAEHIIDIAEKKGIPVFKDDSAASLLCML EVGSNIPVELYEVVAAIYCKLIETSASIRGSETAAEAAAQVRGKESGAGERLRRNLASGK KNEESAK >gi|229784078|gb|GG667657.1| GENE 35 30022 - 31485 1524 487 aa, chain - ## HITS:1 COG:no KEGG:Closa_3422 NR:ns ## KEGG: Closa_3422 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 487 1 473 473 89 20.0 4e-16 MADILRVTSPVINKNIIQPDRAQKDPSVPFDMQEIRHITKNPSGSGLLGQHNVFQKEAGA ATLMNLLKDPAVTVNYLKNIYMLEEIISLLPVNNNTMTQEIRQLFNALLIKPDDIVGELV DQEYASTLFKGPLFDFLRNLLNEKPELRLEMADFLKAVNGSVSRNDVLDSVANSLEFLSG QLAGSRNLSGTFGMLSDRIRAMIGGDGGAEPNSGVLDRGAGLKNRLTEWEAAKADILDAL KELENSVLFSPQTERMIPNLLYNMSRYQDNESYLQEALLNLLIHVNSREDKDQLKDLLQD YVNQFGSPEHRENRSRIMDTLAEIISKQDRETPMNSLNGEKIEKIITSLLSSPCNYTPLL HFVIPVEYQGMKAFSELWIDPKDEGGGQEQDRERDDHVHVLITFDIPGIGQMEAEFKVAG REMQFYLYCPESYTSVFGRLAPDFRKIMEEYGYHAAEVEVSGLDHVRSLMDVFKNLPYRR TGVDVKI >gi|229784078|gb|GG667657.1| GENE 36 31485 - 33188 1451 567 aa, chain - ## HITS:1 COG:TP0710 KEGG:ns NR:ns ## COG: TP0710 COG1315 # Protein_GI_number: 15639697 # Func_class: L Replication, recombination and repair # Function: Predicted polymerase, most proteins contain PALM domain, HD hydrolase domain and Zn-ribbon domain # Organism: Treponema pallidum # 113 565 178 637 656 174 27.0 4e-43 MSEKSLIFSMFSKFLDNGEEDETKEREVRPYTEDQKPKAETDQRIETVEDWNDAERTRRE RLERELAGENGSEEADHAEDAEAEEDVEAGEDVETGTDRDGGGKSSGEEECRDAGIELTV SDDRMAVSVMLIEPSGGGRDITRERIEQELEAHRIIHGIDEEKVSTIAAEHLYRQMFIIA RGTPAVDGNDGRIKDYFPREAKIKYASKGNGGIDFKSMNLIHNVGKGELVCELTMPTEPQ DGMDVFGQPVRGKNGTMPPIPQGKNVVYSPERDRLLTACEGNLTFRNGRFQVENVFVVSG NVDNSIGNIDFTGSVTIHGDVFEGYTVKAKGDITFMGIVEGAALQAGGSILLHKGMRGMK SGILEAGVDITAKFLEDCTIYAQHNIQAEYIINSEVSCGHDLTLIGKRGAFIAGSCAVHN 
CMNVKTVGASSHVSTVVTLGVTPQLMEEVEKISADLVAVSRKQAETAKDISYLNGKLKDG SITPRQKERLAKLKLEAPIHNLKEKRLKTQAADLARQLREVGKCRLTAGEVHPGTVINIG DCRMTIVKREESCSFYYLDGEIRKGIR >gi|229784078|gb|GG667657.1| GENE 37 33227 - 34006 984 259 aa, chain - ## HITS:1 COG:TM1542 KEGG:ns NR:ns ## COG: TM1542 COG4786 # Protein_GI_number: 15644290 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Thermotoga maritima # 1 259 1 260 261 123 33.0 3e-28 MNISYYTAVSAMNAFQSELDVTANNMANVSTPGYKVLRSSFDDLLYTQMDTKNANQMVGH GVKTNGAETVFEQGIFEKTERELDFAILGKAYFAVESENEDAEEPYYTRDGSFQISATDE GNYLTTRDGHYVLSRDGDRIELEYKTVEESNGKKQFTNELDLSGLSEVIGLYTCENPDGL VPVGKNLYSTGAASGEWIPIEDMDETQAGSRLLTGTLELSSTYVPTEMINLLQAQRAFQL NSRIVSAADQMEEMINNLR >gi|229784078|gb|GG667657.1| GENE 38 34020 - 34760 991 246 aa, chain - ## HITS:1 COG:BB0775 KEGG:ns NR:ns ## COG: BB0775 COG4786 # Protein_GI_number: 15595120 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Borrelia burgdorferi # 4 245 22 299 300 111 29.0 1e-24 MFSGFYTAASGVLMNQRALNISANNIANVKTSGYRSKRLVKTTFDEQLVRQMNGQTTAIG GGSTISLAEREMTTHRQGALVDTGKNFDMAISGDGLFVIQGEDRQYLTRNGHFTQDEDGY LVLKGVGQVMGDGGPIQVEEGGFRVGENGIIYDNENGELEQIQIVVPDNYDNLTFYDNGT YGAENGVALTQVYPEVFQGKLEESGVNLNSEMTRAMEVQRAFQSCSRALTIIDQMNQKSA SEIGKL >gi|229784078|gb|GG667657.1| GENE 39 34769 - 35080 462 103 aa, chain - ## HITS:1 COG:RSp1390 KEGG:ns NR:ns ## COG: RSp1390 COG1191 # Protein_GI_number: 17549609 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Ralstonia solanacearum # 23 99 169 245 253 68 41.0 3e-12 AVLSFEAVLQDMTGVIAKDKLEATDIDTKPEESLFYRDLQRTLAEAVEALGDKERLVVSL YYYEELKYSEIAEIMGIGQSRVCQIHTKAMKKLKSSLEEYVRG Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:06:52 2011 Seq name: gi|229784077|gb|GG667658.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld51, whole genome shotgun sequence Length of sequence - 31672 bp Number of predicted genes - 25, with homology - 25 Number of 
transcription units - 13, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 95 - 1282 1534 ## COG0192 S-adenosylmethionine synthetase - Prom 1307 - 1366 4.1 + Prom 1605 - 1664 4.4 2 2 Tu 1 . + CDS 1715 - 2632 956 ## COG0618 Exopolyphosphatase-related proteins + Term 2688 - 2732 4.2 - Term 2676 - 2715 10.0 3 3 Op 1 . - CDS 2722 - 3738 1114 ## COG1087 UDP-glucose 4-epimerase 4 3 Op 2 . - CDS 3759 - 4865 1175 ## Closa_3988 aminoglycoside phosphotransferase 5 3 Op 3 . - CDS 4865 - 5794 1180 ## Closa_3989 hypothetical protein - Prom 5835 - 5894 80.4 + Prom 6668 - 6727 80.4 6 4 Op 1 2/0.000 + CDS 6931 - 7866 710 ## COG2207 AraC-type DNA-binding domain-containing proteins 7 4 Op 2 . + CDS 7907 - 9391 1206 ## COG3119 Arylsulfatase A and related enzymes + Term 9620 - 9653 2.6 8 5 Op 1 24/0.000 - CDS 9410 - 10018 699 ## COG1174 ABC-type proline/glycine betaine transport systems, permease component 9 5 Op 2 12/0.000 - CDS 10011 - 10784 241 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 10 5 Op 3 13/0.000 - CDS 10800 - 11774 1092 ## COG1732 Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) 11 5 Op 4 . - CDS 11845 - 12570 634 ## COG1174 ABC-type proline/glycine betaine transport systems, permease component - Prom 12631 - 12690 10.3 + Prom 12546 - 12605 9.2 12 6 Op 1 . + CDS 12661 - 13656 460 ## COG2706 3-carboxymuconate cyclase 13 6 Op 2 . + CDS 13719 - 15170 1169 ## COG0362 6-phosphogluconate dehydrogenase 14 6 Op 3 . + CDS 15142 - 16557 1129 ## COG0364 Glucose-6-phosphate 1-dehydrogenase 15 6 Op 4 . + CDS 16571 - 17266 596 ## COG0120 Ribose 5-phosphate isomerase + Term 17283 - 17313 1.1 - TRNA 17364 - 17436 77.3 # Ala CGC 0 0 - Term 17500 - 17563 8.2 16 7 Tu 1 . - CDS 17594 - 18871 1666 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase - Prom 18915 - 18974 5.6 17 8 Tu 1 . 
- CDS 18995 - 21472 2697 ## Closa_3992 hypothetical protein - Prom 21502 - 21561 5.3 18 9 Tu 1 . - CDS 21598 - 21852 252 ## Closa_1116 XRE family transcriptional regulator - Prom 21974 - 22033 5.4 + Prom 21861 - 21920 8.6 19 10 Tu 1 . + CDS 22012 - 22215 160 ## gi|266621965|ref|ZP_06114900.1| N-acetyl-gamma-glutamyl-phosphate reductase + Term 22429 - 22463 0.1 20 11 Tu 1 . - CDS 23835 - 24341 452 ## Closa_3993 hypothetical protein - Prom 24566 - 24625 9.3 + Prom 24580 - 24639 6.9 21 12 Tu 1 . + CDS 24755 - 25684 924 ## COG4866 Uncharacterized conserved protein - Term 25659 - 25703 9.0 22 13 Op 1 . - CDS 25721 - 27079 1105 ## COG2730 Endoglucanase 23 13 Op 2 . - CDS 27163 - 28662 1058 ## COG3119 Arylsulfatase A and related enzymes 24 13 Op 3 . - CDS 28732 - 30957 2004 ## COG1472 Beta-glucosidase-related glycosidases 25 13 Op 4 . - CDS 30980 - 31672 598 ## COG1874 Beta-galactosidase Predicted protein(s) >gi|229784077|gb|GG667658.1| GENE 1 95 - 1282 1534 395 aa, chain - ## HITS:1 COG:BS_metK KEGG:ns NR:ns ## COG: BS_metK COG0192 # Protein_GI_number: 16080107 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Bacillus subtilis # 5 393 7 395 400 606 72.0 1e-173 MEKLLFTSESVTEGHPDKMCDQISDAILDEMLKQDPMSRVACETCCTTGLVLVMGEVTTH AYVDIQKVVRETVRQIGYDRAKYGFDCDTCGVIVSLDEQSPDIAMGVNKALEAKEHTMSE AELDAIGAGDQGMMFGFASNETEEYMPYPIALAHKLALQLTKVRKDGTLKYLRPDGKTQV TVEYDENGRPIRLEAVVLSTQHDEAVSQEQIHEDIKKYVFDAIIPADMIDEETKFFINPT GRFVIGGPQGDSGLTGRKIIVDSYGGYARHGGGAFSGKDCTKVDRSAAYAARYVAKNMVA AGLADKCEIQLSYAIGVAHPTSIMVDTFGTGKLSDDRLVEIIRENFDLRPAGIIRMLDLR RPIYKQTAAYGHFGRNDLDLPWEKLDKVELLKKYL >gi|229784077|gb|GG667658.1| GENE 2 1715 - 2632 956 305 aa, chain + ## HITS:1 COG:VCA0593 KEGG:ns NR:ns ## COG: VCA0593 COG0618 # Protein_GI_number: 15601352 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Vibrio cholerae # 2 287 7 293 310 256 49.0 3e-68 
MRLVTRSDFDGLVCGALLKEAGVIDHWKFAHPKDLQDGLVEITEDDCLANVPFVEGCGLW FDHHSSEFERLELEGKYKGESRLAPSCARIIYEYYGGNERFSHFNEMMEAVDKVDSGHLT IDEIQNPKGWILIGFLMDPRTGLGRWRNFTISNYQLMERLIDACRTMDTAEILNLPDVKE RIDVYFEQTEKFKAMIAQHTRVDHDVIISDLRGVDPIYTGNRFLIYSMYPEQNISAWIVN GKGGIGCSVAVGYSILNRTASLDVGSLMLKYGGGGHKNVGTCQFGDETMEEELPKLLHDL VYPTL >gi|229784077|gb|GG667658.1| GENE 3 2722 - 3738 1114 338 aa, chain - ## HITS:1 COG:BS_galE KEGG:ns NR:ns ## COG: BS_galE COG1087 # Protein_GI_number: 16080937 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Bacillus subtilis # 1 338 1 336 339 511 71.0 1e-145 MRILVTGGAGYIGSHTCVELLNQGQEVVVVDNLCNSSEESLNRVKQITGKDVTFYKADLL DKDAMEEIFSKETIDAVIHFAGLKAVGESVAKPLEYYHNNITGTLVLCDVMRNHGVKKII FSSSATVYGDPAFVPITEDCPKGAITNPYGQTKSMLEQILTDLHTADPEWSVILLRYFNP VGAHKSGLIGEDPAGIPNNLTPYITQVAVGKLKEVGVFGNDYDTPDGTGVRDYIHVVDLA IGHVKAVDKLAQSEPDVRIYNLGTGKGFSVLQMIEAFSKACGKQIPYVIKPRRPGDIAEC YADAALAKKELGWEAERGIDEMCEDSWRWQSNNPNGYR >gi|229784077|gb|GG667658.1| GENE 4 3759 - 4865 1175 368 aa, chain - ## HITS:1 COG:no KEGG:Closa_3988 NR:ns ## KEGG: Closa_3988 # Name: not_defined # Def: aminoglycoside phosphotransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 367 1 367 370 631 80.0 1e-179 MGDAQCGKIREAIGLLSFEGEPVSWERYGSGHINDTFRLVCRDGEKENLYILQRINHEIF TDPVSLMRNIAGVTSFLREKITAQGGDPYRETLNIVKARDGRDYVQDSDGNYWRGYLFIS GATCFDKVRNPDDFYQSGKAFGHFQRQLAEYPAEELTETIVNFHNTPVRLETFKKAVEED VCGRAHLVQEEIQFVMDRADEAGAAMNMLKEGTLPLRVTHNDTKLNNIMIDDITGEALCI IDLDTIMPGLSIFDFGDSIRFGANTADEDEPDVSLVSLSLPLFEIYTKGFLEGCQGSLTK AERDMLPMGAKLMTFECGVRFLTDFLQGDTYFRISRENHNLDRTRTQFALVADMEKKWAD MEAIVRKY >gi|229784077|gb|GG667658.1| GENE 5 4865 - 5794 1180 309 aa, chain - ## HITS:1 COG:no KEGG:Closa_3989 NR:ns ## KEGG: Closa_3989 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 308 1 308 309 545 85.0 1e-153 MSKKPVLVIMAAGMGSRYGGLKQIDPVDPYGNKIIDFSIYDAVEAGFEKVVFIIKKAIEQ 
EFKEQIGDRMAKYVQVEYVFQELDKLPEGYSVPEGRVKPWGTAHAILCCKDVVDGPFAVI NADDYYGKSAFQSIYDQLVAYSDNEKYQYTMVGYKLYNTLTENGHVARGVCTTDENSRLV DIHERTRIEKHGDIAEFTEDDGATWTELPEETIVSMNMWGFTASVLKEIDARFAAFLDRE LPKNPIKCEYFLPFVVDELLKEGKADVTVLKSIDRWYGVTYKEDKETVVNAIKGLKEAGF YPEKLWEEA >gi|229784077|gb|GG667658.1| GENE 6 6931 - 7866 710 311 aa, chain + ## HITS:1 COG:lin0157 KEGG:ns NR:ns ## COG: lin0157 COG2207 # Protein_GI_number: 16799234 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 36 286 32 271 277 84 28.0 2e-16 MEYKSTVLRPSAVVHRVISIHYFEYMSDFSFPGESHDFWEFVCVDKGVIDVMAGEKRIPL KRGNIIFHQPGEFHNIITNGEVAPNLVVIGFECHSPCMKAFEGKILTVSETERELLARII IEARNAFSGRMDDPYQEELVRNPSPLAFGAEQMIANYLEELIIHLYRRYFENPGQFKTRR QPEVHIKSDAYNRIIRYMEEHIGERLSLDTICRDTLTGRSQLQKIFREAHGCGVIDYFSS MKIDTAKQLIRDNHLNFTEISDRLGYTSVHYFSRQFKKLTGMTPSEYATSIRRLSEYSTF GSRKEEGGQKD >gi|229784077|gb|GG667658.1| GENE 7 7907 - 9391 1206 494 aa, chain + ## HITS:1 COG:mlr3684 KEGG:ns NR:ns ## COG: mlr3684 COG3119 # Protein_GI_number: 13473175 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mesorhizobium loti # 9 442 8 470 522 165 30.0 2e-40 MSQNRELPPNIVLIMTDQMRGDCLGIAGHPDVKTPYLDSIAAKGILFDHAYSACPSCVPA RAALHTGMRQEHHGRVGYQDMVNWNYPHTMAGELAAAGYYTQCVGKMHVHPLRNLMGFHN IELHDGYLHAYRDPAAAWEESQKQADDYFYWLKQELGADADVTDTGMECNSWVSRPWIYE EKYHPTNWVSTRSIDFLRRRDTSKPFFLMASYLRPHPPFDAPQYYFDLYRDKQLTPPAVG DWEDEDFTGDYQRLGRIYDSATGPVDPELIRQAQIGYYACITHLDHQIGRLIQALVEYKL MDNTIILFTSDHGEELCDHHLFRKSRPYEGSCRIPMLLSGPERLIHAAPGTVCHSVAELR DVMPTLLDAAGAPIPETVDGKSMIPDPDGTLPVIRQWLHGEHEAGVNSNHFIVTEHDKYV WYSQTGREQYFNLDEDRRELHNGIADTQYQERIGLLRGLLIEELKEREEGYSDGQRLIIG RETTVLLQSVFGTI >gi|229784077|gb|GG667658.1| GENE 8 9410 - 10018 699 202 aa, chain - ## HITS:1 COG:CAC1476 KEGG:ns NR:ns ## COG: CAC1476 COG1174 # Protein_GI_number: 15894755 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, permease 
component # Organism: Clostridium acetobutylicum # 1 200 1 200 202 204 63.0 7e-53 MIEYALKYPEKLYGALGQHLMLVAVTLVLSLILAAALTVCAMYFKTVSNGLIHLFSVIYS IPSLAMFAMLIPVTGLGTKTALIVLTLYNQYLLLRNFTAGLNGVDSSVIEAAAGMGMTTM QILLKIRLPLAKRSVFTGIRLAIVSTTGIATIAATINAGGLGTILFDGLRTLNVVKILWG TVLSAGLAIVLNAGLERVERRL >gi|229784077|gb|GG667658.1| GENE 9 10011 - 10784 241 257 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 218 7 219 318 97 30 1e-19 MSQTAIEFQNVTKKFNNAALPSVDRVSLTIEEGEFITILGSSGSGKTTLLKMVNRLYEPT EGKIFLFGEDISMVDVVKVRRRIGYVIQQIGLFPHMTIADNISVVPKLLNWDKKQTDERV DELLNQVGLLPEEFKRRYPSQLSGGQQQRVGLARALAVNPKIMLLDEPFGAIDAITRMKL QDELLRIHGGLKKTFLFVTHDINEAFKLGSRVIVMNEGTVRQFDTPARIVKNPADDFVAS LIRSAREQEEFWRMTID >gi|229784077|gb|GG667658.1| GENE 10 10800 - 11774 1092 324 aa, chain - ## HITS:1 COG:CAC1474 KEGG:ns NR:ns ## COG: CAC1474 COG1732 # Protein_GI_number: 15894753 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) # Organism: Clostridium acetobutylicum # 30 323 4 302 303 399 66.0 1e-111 MKKQDGIKAYDGLKKQDRIKRSSGLKRHSSIKTAAALVLTAVMAASLTACGASSSGKDVI RVGSKDFTESLVVSEIYALALQDAGYKVERKQDIAGSVVHTSLINNEIDLYPEYTGTGLL SILQMDMITDPEEVYRTVKDEYEKRFQVTWLDYSKANDGQGLVIRTDVAEQLGIRTISDL QKHAKELRFASQGEFDAREDGLPALEQKYGAFDWKSSRVYDGGLKYEVLSNGEADVAPAY TTEGQLAQKDKYMLLEDDKQVWPPYNLAPVVRDDVLEKNPGIADILNKISASLDTDTVTA LNAEVDVNKREYEEVAREYYDSLN >gi|229784077|gb|GG667658.1| GENE 11 11845 - 12570 634 241 aa, chain - ## HITS:1 COG:CAC1473 KEGG:ns NR:ns ## COG: CAC1473 COG1174 # Protein_GI_number: 15894752 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, permease component # Organism: Clostridium acetobutylicum # 19 223 9 213 218 207 60.0 1e-53 MVFPIISIGKEGIDTISLIWEYFGSHMDDYLTVLREHVAISLMALAISVILGISAGYVCV 
RSPRLEKWLTGTFQTLRIVPSLAILILLIPFLGTGVKPALVALVLLAIPPILLNTVTGLT EVPDFMLETASGLGMTEKQVMYRVKFPMAFPLIMTGVKTAMIEIVASAALAAKIGAGGLG GLIFTGLGLNRLDLLLVGGISVALLSLVSGAILDGIEKMMTPYLQTAPKRRIRGRRKAGR L >gi|229784077|gb|GG667658.1| GENE 12 12661 - 13656 460 331 aa, chain + ## HITS:1 COG:SP1506 KEGG:ns NR:ns ## COG: SP1506 COG2706 # Protein_GI_number: 15901353 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Streptococcus pneumoniae TIGR4 # 7 323 6 333 337 127 29.0 4e-29 MQVQNGYLGTYFSEQSAGIYRFQFDSDTGIITEPELYYTAPDSKYLSLSHGLLASPLVKE GRAGVCLIDTTGERPLLAGELFEETKSACYVIQDDTFIYTANYHEGSILIYRKEAHSISL SKRIETGYQAGCHQILFHGHIMMVPCLLRDQILLFDMDRDFAPAGSITFPQGTGPRHGVF DRAHRRFFLVSELSNELFLYEAGTGPQFTLRSVQKILPEGRVWSPAAASAAVRLSPDERF LYLSTRFADLITVYQLDGFQAVKIQQTESGGSHPRDFILTGDGRFLLAANRYEGGLASFR INPGTGMITGLCSRIAAPEAVAVVLDGPHTL >gi|229784077|gb|GG667658.1| GENE 13 13719 - 15170 1169 483 aa, chain + ## HITS:1 COG:L0046 KEGG:ns NR:ns ## COG: L0046 COG0362 # Protein_GI_number: 15672604 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconate dehydrogenase # Organism: Lactococcus lactis # 16 477 5 468 472 516 53.0 1e-146 MNENLKRTTPEHELLDIGLIGLSVMGRSLALNMADHGFRVAGYNRSPEVTQKVMEEYPHP NLTPFFSLPELVSALRKPRKIMVMIQAGAPVDAVIDQLIPLLEEGDFIMDGGNSYFGDTN RRAALLKEKGLHYLGVGISGGEEGARFGPSIMPGGSREAYEAVGQILQTIAAKADNTPCC AYIGPDGAGHYVKMVHNGIEYADMQLIAESYLLLKYAGGFTNRELAEIYRTWNEGELHSY LIGITADIFLEADDMADGDLIDHILDSAGQKGTGRWASIEALRQGVDISMITAACNARIM SNHLERRQKASSLIAAPVITAAGDKQAFTEMVREALYISKIIAYAQGFSLLQDASRRFGW DLDLGSIASIFRAGCIIQAEFLNDITRAFLKAPDLESLIFDDFFLSRINRYQENLRKTAG LAVSSGLPVPALFNAVSYLDSFRTKTAGANLIQAQRDLFGAHTFERTDRKGIFHHEWRAV HEL >gi|229784077|gb|GG667658.1| GENE 14 15142 - 16557 1129 471 aa, chain + ## HITS:1 COG:lin2085 KEGG:ns NR:ns ## COG: lin2085 COG0364 # Protein_GI_number: 16801151 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate 1-dehydrogenase # Organism: Listeria innocua # 13 470 11 488 491 353 38.0 5e-97 
MNGGPSMNYDVCITIFGGTGDLTFRKLLPALYTMSRTRKLPVHSKILIIGRRDYDTAAYI SQARDWIRKFSRLPYSEELLEDFAAHLEYYRMDFSDRSAYEGLGQAYEAMQAASHIFYFA VAPRFFSVIAAGLKTITGLTDGKIIIEKPFGETLASARELNRQLESCFGPDNIYRIDHYL GKEMIRNIQTIRFSNPVFANAWNGKMIEAVEISALEDVGVETRGGYYDASGALKDMVQNH LFQILSILAMERPEAFTGEEMHARQLEVLKALRPVSPEDIEQSLVLGQYDGYRKEPLVSP ESVTETYACMRLFIDNERWSGTPFYIRTGKKTGLRETEVAVIFRRPDAETEPDILLIKIQ PTEGIYLRFNIKRPGDSEEVITASMDFCQNCILENQVNTPEAYERLLTACIEGERSWFSQ WDQVEISWNYIEQLKQLAASRNLPLYPYLPGTRGPREAETLLNTFGHHWLE >gi|229784077|gb|GG667658.1| GENE 15 16571 - 17266 596 231 aa, chain + ## HITS:1 COG:SP0828 KEGG:ns NR:ns ## COG: SP0828 COG0120 # Protein_GI_number: 15900715 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase # Organism: Streptococcus pneumoniae TIGR4 # 13 227 5 220 227 180 44.0 2e-45 MPDTKDKQIKEQKKAAAVMAAGELRDGMVAGLGTGTTVYYLIQEIARLVKEGLTIHVLPT SEQTRSLASQLNIPLLSTDTAPAADLAIDGVDGVDSHFWSVKGGGGALFREKLIASNARR VIWILDESKLLNHLSEVPLPIEVVPFGFSYVMKEVAALGFRPRLRLNDGMPFLTDNGNYI LDLAGDSSMDYREKSALLKSLTGVVETGLFPDFCEKLIIGGADGVRVLENR >gi|229784077|gb|GG667658.1| GENE 16 17594 - 18871 1666 425 aa, chain - ## HITS:1 COG:CAC3539 KEGG:ns NR:ns ## COG: CAC3539 COG0766 # Protein_GI_number: 15896775 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Clostridium acetobutylicum # 1 410 6 415 418 439 59.0 1e-123 MKGGNPLAGEVTIGGAKNAALGILAASIMTDEDVVIENLPDVRDINVLLEAIEEIGADVE RIDRHTVRINGKYIHEVSVDDEYIRKIRASYYFIGAMLGKYKSAQVPLPGGCNIGSRPID QHLKGFRALGAEVKIERGAVIAHAIDLVASHIYLDVVSVGATINIMMAATLAEGQTILEN VAKEPHVVDVANFLNSMGANIKGAGTDTIRIKGVRKLHGTEYSIIPDQIEAGTFMCAAAI TRGDVTVKNVIPKHLEAISAKLLEMGCEVVEFDDAVRVVGKTLQRHTDIKTLPYPGFPTD MQPQMTVTLALAEGASVVTESIFENRFRYVDELSRMGGNVKVEGNVAVIDGVKKLTGASV NAPDLRAGAALVIAGLAAEGYTIVEEIGYIQRGYECFEEKLQGLGAMIEKVDSEKDAKKF KLKVG >gi|229784077|gb|GG667658.1| GENE 17 18995 - 21472 2697 825 aa, chain - ## HITS:1 COG:no KEGG:Closa_3992 NR:ns ## KEGG: Closa_3992 # Name: not_defined # Def: hypothetical 
protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 825 1 825 825 1202 69.0 0 MKRLGKKIGILFIIFIAAIAIYFVWNQKRTEKPDSIVYTAMDEAALPVVSATMYGRDVNF LHGYTQDMKQAAARDSLTVLPPDRALNIRIADYDGSIRGISYEVRSLNLERLVERTSLTE WDTADGNTTAVLPIQNLLSKDREYLLILSLDTAERGEVRYYTRILWTDNENAKEMIDFAV NFTTKTFDYDQARELTTYLETNASEDNSSLGHVTIRSSFSQLTWAGLDVKPVGDIRVTLK DLDGIMCNVRLDYQITRTADNGMNELYEVEDDFTMKWDSKRIYLMDFERSTNQIFSGQRD LFSGKRIMLGIMNDNEVRTKASPDGRHIAYVVNRDLWSYDQSSEKAVKIFSFRSGDEDEI RNNYDQHDIRILSIGDNGDVNFIVYGYMNRGIHEGAMGIAMYQYTSDENAINEKFFTPVT QSFEMLKGDIDKLAHLGSNGMLYMMADHAVYGIDLNSNEYMVLADSLTEGSFAVSRSERR FAWQEGNDPYGSRVIHLMDLDTGSKKEITGEDGSLCRPLGFVGDDFIYGLAREGDMWIEN GRVKDAPMYRLEIMDDTGAIVNRYEYPDTYIADVEVEENRIRLTKIARTGDQTYTELKDD IIVCNEALKEDPLKGIGWYASEDRRKLYFVQLDKEAASHDVKVSVPKKMAYESSEVVELK SATGITENLFYAYGGGHYLGSSRSFTEALNLAYDKMGVVTDENQEIVWNRVNRLPARNIK DPENKGALLIRHMEEFSGSKNYDDGILMLDARGCILNQVLYFISKGCPVAVYTENGGYEL IAGYDQYNVTIFNPETKEARKMGLNDASQYFSSVGNDFVCGVFTE >gi|229784077|gb|GG667658.1| GENE 18 21598 - 21852 252 84 aa, chain - ## HITS:1 COG:no KEGG:Closa_1116 NR:ns ## KEGG: Closa_1116 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 3 75 2 74 89 68 43.0 1e-10 MEQKIKQDMVIGYNIRRLRKRTGLTQEQTVAKMQLMECNITRGNYAKIEVGLANIRATEM MALKRIFGVDYGEFFVPVPEERKT >gi|229784077|gb|GG667658.1| GENE 19 22012 - 22215 160 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621965|ref|ZP_06114900.1| ## NR: gi|266621965|ref|ZP_06114900.1| N-acetyl-gamma-glutamyl-phosphate reductase [Clostridium hathewayi DSM 13479] N-acetyl-gamma-glutamyl-phosphate reductase [Clostridium hathewayi DSM 13479] # 1 67 1 67 67 103 100.0 3e-21 MIRKKTGELTPKEERELMQEFDRVDGYCDTAIPPAPKGEFEKIIAEMKKRNIVPQIREEL RKRFGRS >gi|229784077|gb|GG667658.1| GENE 20 23835 - 24341 452 168 aa, chain - ## HITS:1 COG:no KEGG:Closa_3993 NR:ns ## KEGG: Closa_3993 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 131 1 131 255 143 56.0 2e-33 
MQDNLEHTYGNSFQVLEPYMLYGDRSTVEDLPKMQREHMVQDLEYLQGMYPGHMKRIQEY VMAACDRLDYKNSPMYDEFPDRIMVNQVCDSVSRQILEDGVVNGSELGLQNTPEQTGEET AETVEEVEEVEVYEVNSVNTQDADWRMQQVVSQNPPAWGPPPGQRLAS >gi|229784077|gb|GG667658.1| GENE 21 24755 - 25684 924 309 aa, chain + ## HITS:1 COG:FN0277 KEGG:ns NR:ns ## COG: FN0277 COG4866 # Protein_GI_number: 19703622 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 22 308 10 288 290 123 32.0 5e-28 MKDKVRQYMELQFKDLDSDALKTLAPHFTFRENKTCDSVFLGLFLWKDYYQVRYAIGAEN NIFLRITADGKEYAALPICQAADLPHSFDVLKDYFNRILGLKLNIFLADEAAVRLLNLPQ DEFEIEELEDARDYLYSGDALRSLAGRKLHKKKNLVNSFVKSWEGRFEYRRLSCADGPEI KSFLARWGLAREERMEGQLEHEILGIHEILRCCSLLDITMAGIYIDGNLEAFTIGSYNPL EQMAVIHIEKANPEIKGLYQFINQQFLIHEFPDAVLVNREDDLGDPGLRQAKLSYCPIDF ARKYRIRQR >gi|229784077|gb|GG667658.1| GENE 22 25721 - 27079 1105 452 aa, chain - ## HITS:1 COG:SPBC1105.05 KEGG:ns NR:ns ## COG: SPBC1105.05 COG2730 # Protein_GI_number: 19113253 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Schizosaccharomyces pombe # 6 189 25 215 407 72 27.0 2e-12 MEFVRVKGNEFIYQGQSIRFAGLGIGSWLNLEHFMLGIPTPEKQMKEAFTEVFGPEKSAV FFDDFVCSFCSEGDFKLLKDTGINLIRVPFNYRLFLDDQNPELQKEEGFRYFDRLLDLCR KYEIYLLPDLHSVPGGQNPDWHSDNQTGTPAFWHYDVFQQQIISLWREIAARYKDEPYLL GYDVLNEPFLMPAAEGKLQRFYERVTAAVREVDQNHIIFLEGDSFAMDFSCLKEIRDAQT ALTFHFYPTVWEADLCDPDYPRGERRQVFEQRFRTMLESLLPFNRPLLCGEAGYDIAGHS LGHVMEMVEDTLDLFCKYGVSWTLWCYKDAQFMGLVYPKDDSEWMRFAKEIRTKWTHYGE MAMGRSLVETMCGLFPGEVEEELKYRLQFRQRALLYTLQKEQILKPQLRAWGWERVSRMP ESFLFENCSYYKEYQELLADYTKKIRQLAQRS >gi|229784077|gb|GG667658.1| GENE 23 27163 - 28662 1058 499 aa, chain - ## HITS:1 COG:STM0886 KEGG:ns NR:ns ## COG: STM0886 COG3119 # Protein_GI_number: 16764247 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 1 488 1 492 495 343 38.0 4e-94 MRTVLILMDTLNRRFLKSYHENAKGITPNMDAFAEEAVRFDNHFIASAPCMPARRDIFTG 
RMNFLERGWGGMEPYDRTLQGELRSHGVYTHITTDHAHYFELGGENYCSLFDTWDYHRGQ ENDPWISRVSPPFKEPDSYGRKSVQHLLNRQEYAREEKTYPTPATFLSACRWARENRGAD NFFLMVEAFDPHEPFDAPVSYHELYGDHYSGKEFNWPGYAPVTEPEAAVLHLRNCYLATM TMADRWLGHFLDALKENGLYDGTLVILTTDHGHMLGEHGFTGKNYMHAYNEMAHIPLFVK MPNGECAGEQRKQLTQNIDLMPTILKHHGIPVPVSVKGRDLSEMILRRSPSRDAVIYGWF GRAVNVYDGRYTYFRAPASLDNRPCYQYCAVPTTLGRFLGEEYADEMEMGRFLPHTNYPV YRIPVHNETDCMGDLRFIEDSCLYDLERDYAQEHPMTDEAVETAMCRKLIAGMREAQAPA EQYERLGLWTVRKGEDRQR >gi|229784077|gb|GG667658.1| GENE 24 28732 - 30957 2004 741 aa, chain - ## HITS:1 COG:lin1840 KEGG:ns NR:ns ## COG: lin1840 COG1472 # Protein_GI_number: 16800907 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Listeria innocua # 1 740 1 718 723 577 44.0 1e-164 MEQKQLIDLLNDMSLEEKAGQLFQVTGDWFDGTKELEQTGPEAEETAKIRKNYGHLAGSV LGVYGAASIKKIQKAYMERQPHHIPLLFMLDIINGFKTIYPIPLALGAMFDPELAGDCAA MAAKEGAAGGLHVTFSPMVDLVRDPRWGRVMESTGEDGYLNSLYAKALVEGYQRQQGGMD SLAACVKHFAGYGAPEGGREYQTMELSDETFREFYLPAYESAVKAGSRMVMTAFQTVKGI PATVNRSLLRGILRGEMEFDGVLISDYSAIGETVVHGVSEDRADAAVRALEAGVDIDMMS GVYPECLAGLVRSGRLDEALLDESVLRVLELKNWLGLFEDPFHGADEAKESGTYLCEDHR KLAYEAAVKSFVLLKNEDGVLPLSPERKTAFIGPYTESQEIMGSWSFTGGAGTEMVKTIR MAAEEALAGWPVTYHTGCPLLAPGTVLEGFDRYRPCSASGEELACWKEEALKAAGEADVV VMPLGEHCFQSGEAASRAFLEIPEIQQDLFRAVCEVNSNVVAVVFSGRPLDLRFLSGHAK AVLFVWLPGMEGGPAIVDVLTGKRAPSGRLPMSMPYCVGQIPVHYDSLMTGRPYREGKPE RYVSRYQDIPADPLFSFGYGLTYTEMELSGVRLSKTVLSPGETMEASVTLKNTGRREGTE TVQLYLRDVTGSTARPVKQLKGWKHITLKPGEEAEVVFSISEEQLRFYQGPGNWDSEAGT FYVFIGTDSKTENAASFRLEK >gi|229784077|gb|GG667658.1| GENE 25 30980 - 31672 598 230 aa, chain - ## HITS:1 COG:BH3701 KEGG:ns NR:ns ## COG: BH3701 COG1874 # Protein_GI_number: 15616263 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Bacillus halodurans # 3 227 453 671 672 212 48.0 6e-55 GGYELIAAPLLYMVKNGVSERLEEFVRTGGTAVFSYLSGYVDENDRITLGGYPGKLRELC GIWVEETDSLPETEQNSFCYEGELYPAGLLCDIMHTEGAEVLARYREDFYAGTPIITRNQ 
YGGGLAYYVGTRSGEDFYLRFFADRCKEKGLRTASHDTVETAAALSEKGIEITVREKDGV EYLFLLNHSGKRQELAVSAGGTDLLSGREIHGGEAFAIDAAGVMLVKAAE Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:07:36 2011 Seq name: gi|229784076|gb|GG667659.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld52, whole genome shotgun sequence Length of sequence - 36050 bp Number of predicted genes - 29, with homology - 27 Number of transcription units - 14, operones - 6 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 24/0.000 - CDS 3 - 510 618 ## COG4603 ABC-type uncharacterized transport system, permease component 2 1 Op 2 15/0.000 - CDS 521 - 2089 1297 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 3 1 Op 3 . - CDS 2109 - 3230 1270 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 4 1 Op 4 . - CDS 3265 - 4038 824 ## COG1349 Transcriptional regulators of sugar metabolism 5 1 Op 5 . - CDS 4117 - 5166 930 ## COG0182 Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family - Prom 5212 - 5271 7.4 - Term 5397 - 5465 19.2 6 2 Op 1 15/0.000 - CDS 5492 - 5788 466 ## COG1862 Preprotein translocase subunit YajC 7 2 Op 2 . - CDS 5842 - 6984 1133 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 8 2 Op 3 1/0.000 - CDS 7028 - 9202 2832 ## COG0342 Preprotein translocase subunit SecD 9 2 Op 4 . - CDS 9221 - 10615 1434 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) - Prom 10635 - 10694 4.4 - Term 10673 - 10711 3.8 10 3 Tu 1 . - CDS 10746 - 10892 116 ## - Prom 10919 - 10978 5.3 11 4 Tu 1 . - CDS 11037 - 13256 1810 ## gi|288870672|ref|ZP_06114919.2| hypothetical protein CLOSTHATH_03175 - Prom 13352 - 13411 2.9 - Term 13340 - 13401 0.5 12 5 Op 1 1/0.000 - CDS 13462 - 14406 862 ## COG0524 Sugar kinases, ribokinase family 13 5 Op 2 . - CDS 14420 - 15061 591 ## COG0274 Deoxyribose-phosphate aldolase 14 5 Op 3 . 
- CDS 15058 - 16053 1049 ## CDR20291_0789 hypothetical protein 15 5 Op 4 38/0.000 - CDS 16115 - 16948 823 ## COG0395 ABC-type sugar transport system, permease component 16 5 Op 5 35/0.000 - CDS 16959 - 17849 1024 ## COG1175 ABC-type sugar transport systems, permease components 17 5 Op 6 . - CDS 17868 - 19262 1715 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 19298 - 19357 2.8 + Prom 19442 - 19501 7.3 18 6 Op 1 7/0.000 + CDS 19576 - 21345 1457 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 19 6 Op 2 . + CDS 21350 - 22933 1090 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Term 22742 - 22788 0.3 20 7 Tu 1 . - CDS 22938 - 23027 68 ## 21 8 Tu 1 . + CDS 23002 - 23361 439 ## Closa_0598 hypothetical protein + Term 23424 - 23492 20.3 - Term 23411 - 23438 -0.5 22 9 Op 1 . - CDS 23505 - 23846 369 ## COG1416 Uncharacterized conserved protein 23 9 Op 2 . - CDS 23859 - 25127 1687 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 25156 - 25215 7.0 - Term 25162 - 25209 3.0 24 10 Tu 1 . - CDS 25229 - 25696 564 ## COG0597 Lipoprotein signal peptidase - Prom 25721 - 25780 2.9 + Prom 25671 - 25730 4.9 25 11 Tu 1 . + CDS 25813 - 27570 1478 ## COG1404 Subtilisin-like serine proteases 26 12 Tu 1 . - CDS 27572 - 29017 277 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 - Prom 29038 - 29097 2.9 27 13 Tu 1 . - CDS 30318 - 32069 1818 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 32145 - 32204 6.8 - Term 32170 - 32219 4.2 28 14 Op 1 . - CDS 32243 - 34777 1716 ## COG1316 Transcriptional regulator 29 14 Op 2 . 
- CDS 34821 - 36050 1245 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis Predicted protein(s) >gi|229784076|gb|GG667659.1| GENE 1 3 - 510 618 169 aa, chain - ## HITS:1 COG:BMEII0086 KEGG:ns NR:ns ## COG: BMEII0086 COG4603 # Protein_GI_number: 17988430 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Brucella melitensis # 9 169 14 174 359 94 33.0 8e-20 MSRFELLRTLLAVGIAVIFAVVIIFLVSSQPLEAIYKFFVAPLTSFRYIANIIELMTPLL FTGLAVTFILQTKVYNLAVEGMFYAGGLAAALVATLLKMGPGVHAFAALFAGLAAGALIA FLPAVLKLKFEANEIVSSLMLNYITFYIGDFLLKTYMKDPKSTHVASLK >gi|229784076|gb|GG667659.1| GENE 2 521 - 2089 1297 522 aa, chain - ## HITS:1 COG:TM0103 KEGG:ns NR:ns ## COG: TM0103 COG3845 # Protein_GI_number: 15642878 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Thermotoga maritima # 10 517 7 507 507 431 49.0 1e-120 MEKAREILSMKNITKVYGNGVLANSHVDFSINKGEIHALMGENGAGKSTLMKILFGTQQP DEGTIQLDGKTVKIASSNDALRLGIGMVHQHFKLVPSLTVAENIVLGKEPKKNGFLDRKK AVEITRALSERFRFDVDPTAAVSDLSVGKKQKVEILKALYREVKLLILDEPTAVLTPQET EELFEQLNKLREAGLTIIFISHKLNEIKEICQRITIMRHGKSVGTYDVAGLTVEDISKLM VGRDIILKVEKKPAKPKEVRLSVRNLSCKNEVGRQVLDRVSFSVRSGEILGIAGVEGSGQ EELVELITGMRHLDRRNGEVLIGAERTAGCSVKEIRRKGMSYIPEDRMVYGAAGHVTIRE NMISSICDSDQVNSGLFLNKKKIDRWVEAGIRDYDIRCGSPLDEIGMLSGGNMQKVIVAR EFSTEPSILIADQPTRGVDVGAIEFIHKKIVEIRDRGCAVLLVSADLNEILELSDSIIVM CQGRISGYFPQAKEMSEEELGYYMLGVKNQPEEDLRRACCEE >gi|229784076|gb|GG667659.1| GENE 3 2109 - 3230 1270 373 aa, chain - ## HITS:1 COG:CAC0702 KEGG:ns NR:ns ## COG: CAC0702 COG1744 # Protein_GI_number: 15893990 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Clostridium acetobutylicum # 2 358 4 346 357 177 35.0 4e-44 MKKVWAICLTAALTAGMLAGCGSAKTSDTAGESVSTKASVGEEKKDGDLSVLLLVPGTLG 
DKSFFDSANAGLALIEDHYGASTKVVEMGTDITKYSATLEDVIHEGWDIIITGGVNISQP LQEAAEEYPDQKFILYDESVDFSDGANSNIYCMTYKCYEGGYLAGVLGASASKTGQLGFA GGFDIPLINDWLVGYIEGARTVNPDIKINTGYMNSFTDAAKGKEIGIALYHAEADIVFQA AGGAGLGVLDAAKETGNHAIGTDSDQSELFSSDAEKADAIYASILKRVDISIEKAVGDYL DGKLEFGTVKTYGLSDGCIEITDNSWYQKNVPAAAREKVAEAKQNIIDGKVEIKSAFDMT DEEIASLKGEVQP >gi|229784076|gb|GG667659.1| GENE 4 3265 - 4038 824 257 aa, chain - ## HITS:1 COG:BS_glcR KEGG:ns NR:ns ## COG: BS_glcR COG1349 # Protein_GI_number: 16080683 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Bacillus subtilis # 7 233 4 230 258 147 41.0 2e-35 MKTFSAEERKKFILDYVKENEFVDITYLKDTLMVSEMTIRRDLKKLEEDNSLVRVLGGAR SIPGGTYEDPVDKRVKFHSEEKDRIARYAASLVNEGDSIFMDASSTVYAMADYLEVHATV ITNNISICMKLKDKEKIDVILIGGSLRKSAMSLVGIEAVRMLDNYYVDKAFLSSKSVDEK HGISDATRDEAEVKKAMIKASEEVYFLMDHHKLASRAFYRVCGLEEITELIVDASDDEYV TEFAEECRKNNKRICCV >gi|229784076|gb|GG667659.1| GENE 5 4117 - 5166 930 349 aa, chain - ## HITS:1 COG:TM0911 KEGG:ns NR:ns ## COG: TM0911 COG0182 # Protein_GI_number: 15643673 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family # Organism: Thermotoga maritima # 6 348 7 342 343 328 51.0 1e-89 MREIITLKWEKGNLFLLDQTLLPNTVTYVACKSLEDVYQAIQTMIVRGAPAIGVSAGYGM VLSANAFDGTETASFLDYFREQGEYLKSSRPTAVNLMWAADRMVKRAETLVDEGYPMEAV KKELEAEAVAIHEEDKQTNRNIGENLLSLLKDGDTVLTHCNAGSLATSEYGTALSVFYMA QEKGMNIKAFADETRPRLQGANLTAFELVEHGVDVTLICDNMAAVVLSQGKINALIVGCD RVAANGDTANKIGTFSVSVLANQFHVPVYIAAPTTTIDMECPTGKEIPIEERSREEVLCI NGEYIAPKEVKTYNLAFDVTPAEYITAIVTEKGIVYPPFRDNLKILMKK >gi|229784076|gb|GG667659.1| GENE 6 5492 - 5788 466 98 aa, chain - ## HITS:1 COG:aq_1254 KEGG:ns NR:ns ## COG: aq_1254 COG1862 # Protein_GI_number: 15606479 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Aquifex aeolicus # 13 91 12 88 102 64 
39.0 4e-11 MALATSGGMGMGFWIIYIAVIFGFMYFIAIRPQKKEKKKQQELLASIAIGDSVLTTSGFY GVIIDMTDDTVIVEFGNNKNCRIPMQKAAIVQVEKPEV >gi|229784076|gb|GG667659.1| GENE 7 5842 - 6984 1133 380 aa, chain - ## HITS:1 COG:CAC2282 KEGG:ns NR:ns ## COG: CAC2282 COG0343 # Protein_GI_number: 15895550 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Clostridium acetobutylicum # 3 370 2 369 376 595 73.0 1e-170 MNYKIIAKDGRAKRAEVNTVHGTIQTPVFMNVGTVGAIKGAVSTDDLKEIRTQVELSNTY HLHVRTGDKLIKEFGGLHRFMNWDRPILTDSGGFQVFSLTGLRKIKEEGVYFQSHIDGHR IFMGPEESMQIQSNLGSTIAMALDECPPHPATREYMINSVDRTTRWLERCKAEMARLNSL PDTINREQLLFGINQGGTYEDIRIRHAKTISAMDLDGYALGGLAVGESHEEMYRILDETV PYLPENKPTYLMGVGTPANILEAVDRGVDFFDCVYPSRNGRHGHVYTNQGKRNLFNSRYE LDHSPIEEGCGCPACRSYSRAYIRHLLKAKEMLGMRLCVLHNLYFYNTMMEEIRDAIEQH RYAEYKAEKLAGMMAVQAEK >gi|229784076|gb|GG667659.1| GENE 8 7028 - 9202 2832 724 aa, chain - ## HITS:1 COG:CAC2278 KEGG:ns NR:ns ## COG: CAC2278 COG0342 # Protein_GI_number: 15895546 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Clostridium acetobutylicum # 5 393 4 401 417 255 40.0 2e-67 MKSNKGKGLIGLIIALALVGLFGYFGYTTMNDIKLGLDLAGGVSITYQAKEENPSAEDMS DTIYKLQQRVQNYSTEAEVYQEGSNRINVDIPGVSDANAILEELGKPGSLIFVDEAGQTI LNGNQVATAKPVITDENGIKKYMVDLTFTDDGKTVFADATTKNVGKRIAIIYDGKIYSNP VVNEPITQGQCQISGMVSYEEAETLASTIRIGSLSLELEELRSNVVGAKLGQEAIATSLK AGAIGFGIVAVFMLLVYLVPGLAAVIALSIYVGVILILLSAFSVTLTLPGVAGIILSIGM AVDANVIIFTRIKEEIGLGKTVQSAIKSGFNKALSAIIDGNVTTLIAAGVLFWRGSGTVK GFASTLAIGIVLSMITALFITKFVLNTLFNLGFQDPKFYGTKTDKKTIDFLGKRKIWFAV SLVVIVIGLGGLVINKTQTGDILNYSMEFKGGTSTNVTFNEDMTLDEISSKVVPVVENVT GDAQVQTQKVAGTNEVIIKTRTLTVEERQALDQAMVDNFGVEAEKITAESISGAISKEMK EDAIIAVIIATICMLLYIWLRFKDIKFAGSAVLALLHDVLVVLAFYALLKWSVGSTFIAC MLTIVGYSINATIVIFDRIRENLKVHSKLELAEIVNMSITQTFTRSINTSLTTFVMVFVL FLLGVSSIREFALPLMVGIVCGTYSSVCITGSLWYVMTVYKNKRMEEKKAAEKASRAAAK SSKK >gi|229784076|gb|GG667659.1| GENE 9 9221 - 10615 1434 
464 aa, chain - ## HITS:1 COG:CAC2279 KEGG:ns NR:ns ## COG: CAC2279 COG0641 # Protein_GI_number: 15895547 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Clostridium acetobutylicum # 1 463 3 454 454 483 52.0 1e-136 MIHQYINNGFHIIMDVNSGSVHSVDPVMYDAVEIVAERVPELAEPQPLPAEVAEEVKERL SPTYGEAEVLEALEEIQYLIDAEELLTTDQYHDYVVDFKKRKTVVKALCLHIAHDCNLAC QYCFAEEGEYHGRRALMSFEVGKKALDFLIANSGNRRNLEVDFFGGEPLMNWEVVKQLVE YGRSKEKEYNKNFRFTMTTNGVLLNDEIMEYCNREMSNVVLSLDGRKEVNDKMRPFRGGK GSYDLIVPKFRKFAEMRGDRDYYVRGTFTRHNLDFSKDVMEFADLGFRSMSIEPVVAAPE EEYAIREEDLPQIMEEYDRLAEEYLKRKKEGRGFNFFHFNIDLNQGPCVAKRLSGCGSGT EYLAVTPWGDLYPCHQFVGQEEFLLGNVDTGVTNERIRDEFKLCNVYAKDKCRDCFARFY CSGGCAANSYNFHGSITDAYDIGCAMQKKRIECAIMLKAALAED >gi|229784076|gb|GG667659.1| GENE 10 10746 - 10892 116 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKHVKTLNSSTLKESMKKGGCGECQTSCQSACKTSCTVGNQSCENSNR >gi|229784076|gb|GG667659.1| GENE 11 11037 - 13256 1810 739 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870672|ref|ZP_06114919.2| ## NR: gi|288870672|ref|ZP_06114919.2| hypothetical protein CLOSTHATH_03175 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03175 [Clostridium hathewayi DSM 13479] # 1 739 14 752 752 1438 100.0 0 MEIPPKFVYNNGNLPGGRQMKEKLTCQLSEKEYREFVHWYHGTYIKMRRAKPGKTVFSCL KEPVELTAKEDMVYGKSGACEVYLLYGAVEKLYELETLFVFYDMKRFWAVPKRNLGSKEE AAEWRTAFESRIRELGKMRISFQEAAEACADSKFTPRRYTRQIDEVKAEYGLLRISRREM ERLRKLGTFPYRMIGEQMLAAGKRGIIEISERAAVLHPYETITGAFYTEQTLYLAEKEGD GILVPLAALGGLEGAEALMNVCDTRCRFNNPAFRLRKKSIPLSVRWERPGMAVAAALCIA AAGIAFLGTVYWKGENRSTAALSSEVPGNDRAETGEPADGKHGGGEAGGSAFQDLTAVIP DDTVFDVVTEEGTFASSTLYYQLILPKGEWKIWQDRNNAYDSLVSDWGSISVTGGPTGSF EGYGLTQMLPKTKAEYQARANADSGSFGAETEVIDYSLEEMGDCLIVKREVHNKGKEGSR SVIGLEVYGPERCYSLSITPDTEDAVSMQTARAVRDSFRIADMTTGICKEMEAEVFHGYY GDNTYMTSCLVLMDRAMSDEEIAEGLREMKKLRSAEYPASPYSASGDALAARTPDSKWLG IDCASVQQNCTKETAKEVARIFQAEVIIYDEFDGDLLMVACSDADRKHAYERATANAKWI LESEFQCYGKEQDFPEMLLKYMDISKEEAEAVWKSGDYIFQMDKWAELTGHMTKMPVPQE 
FIGLYDIDALDERFRVIRE >gi|229784076|gb|GG667659.1| GENE 12 13462 - 14406 862 314 aa, chain - ## HITS:1 COG:HI0505 KEGG:ns NR:ns ## COG: HI0505 COG0524 # Protein_GI_number: 16272449 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Haemophilus influenzae # 3 307 2 304 306 216 41.0 5e-56 MGRKSLVFGSFVADIMARAPHLPAPKETVMGNGFKIGPGGKGFNQAVAAFKAGAEVKLVT KLGTDHFAEIARDTMCQLRMDQSGVFITKEYDTGCALILVGEDSGQNEIVVVPGACLTFT SLEVDCVKDQICGNDCEYVLMQLETNLDAVTAIAECAAAGGKQVILNPAPAAVLPEQIWR WIDIVTPNEVEAEFYTGISVESEESAGSAAAWFHERGVNCVIITMGEKGAYLSVGGSGYM VPPFRVKAVDTTGAGDAFNGGLLAGLSEGMDWKEAVIFASAAAALSVRKIGTAPAMPKRE EILALMEGKEIPAE >gi|229784076|gb|GG667659.1| GENE 13 14420 - 15061 591 213 aa, chain - ## HITS:1 COG:CAC1545 KEGG:ns NR:ns ## COG: CAC1545 COG0274 # Protein_GI_number: 15894823 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Clostridium acetobutylicum # 1 213 1 214 215 209 58.0 2e-54 MKITDKIDHTILRADARAVEVRRYCSEAVEHHFASVCVNTCHVPLAAELLRDSGVKVCCV VGFPLGAMSTQAKAYEAYTAVKDGAEEVDMVMNIGALKDGNDHLVEADIREVVAASGSAA VKVILETCLLTNDEIRRACELSVKAGAAFVKTSTGFSTGGATEEDVRLMRETVGDRAKVK ASGGIRTHEQACRLISAGADRIGAGNGLLLLEE >gi|229784076|gb|GG667659.1| GENE 14 15058 - 16053 1049 331 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_0789 NR:ns ## KEGG: CDR20291_0789 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 209 323 18 132 209 84 36.0 5e-15 MDRANITDYSTRSGMKVSAAFLKLHRDILAGSTDEAISTAYDLYRTSAVMLEELWKQLEL EAVYTIGLAEQGALAYIYNLRCICYHTEPYMPDAGGMQALSFIQAIRYLCGCEKEQSLEQ EIRKVQDCYDAGVYPVVPAFARDHHNHAGRELGHTPLDFLYPDGGSKVVPEAEGAEPYKK RLIELLTPIYGGENGKAFRSDAYNHFYEMESPHGLNLELLQSAFQKSIRRALEPEVLMLT YEAFLSGGEMEAYLWERIVIMSVEDIGMGDVECNRIMYAYSRVKDQFADRDEVRLGFLMQ AVRILCRSPKERGTELIKGILVQECRNGDRK >gi|229784076|gb|GG667659.1| GENE 15 16115 - 16948 823 277 aa, chain - ## HITS:1 COG:mlr7002 KEGG:ns NR:ns ## COG: mlr7002 COG0395 # Protein_GI_number: 13475832 # Func_class: G 
Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 22 276 32 287 288 120 30.0 2e-27 MKKRKITPGGILYACAAGIWLITTIFPLYFAVLSSFKDDQTIFADFFALPQRFGLDNYIS AEKMVHILRATANSLFLSAGSICLMLGVSVMGAYVTARKRIPGSEGVTLFLIAAMMIPIQ SAIVPIVQMVSAIGQRNNLFVLMVIYAGVNLSMVFFILKGYIEGIPKELDEAAMIDGASL FQTLLRVIIPVAKPALSTCAITSFLFIYNELPIANVLITKPQLKPISVALLNLKGDFGTL YAVSFASIVISIIPTVIFYLIAQEKVESSICSGVVKG >gi|229784076|gb|GG667659.1| GENE 16 16959 - 17849 1024 296 aa, chain - ## HITS:1 COG:SMc02472 KEGG:ns NR:ns ## COG: SMc02472 COG1175 # Protein_GI_number: 15966816 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 26 294 46 312 312 149 31.0 5e-36 MLIKVRGKKYRPEAFFMIAPVILLYTFVIIYPLANMVFTSFFEWNGIPTDPYRFVGMGNY VTFFKDYTTKTAFINLGILCATGLFVTIPVSLFLATVVSKKFFGIRFVKTCYFLPVVINR IAICLMFTFIMLPGQGPVPVLLKELGIAEHFNLFNDVHTAMWGVALVNMWSNVGFQMIIF SSGMAAISEDVYEAAAMDGVTPWQRLVYITLPQLKPTFKIVVVFVLTGAFKVFDFIMGLT GGGPGYATDVPNTLLYKNAFTYSKFGYANAIAVLMIVLCLVITVVTNRVFAERDGE >gi|229784076|gb|GG667659.1| GENE 17 17868 - 19262 1715 464 aa, chain - ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 1 385 1 342 422 87 22.0 5e-17 MMKRIAALVLTAAMIAAVSGCSAKTTETKDTSGAAVSTETKQEETAAKAEAGQADGEKVK ISYLSRYTNPELPRAKYYMEKLEEFRAANPNIEVEDVSIADAESYKSSLKASVAAGSPPT LFICSDAFPHYDWAKNGVIKDLTPLIESPEWTGPVDEGVFGSFSFERKGLEGIYGVPNSV IGSPVYVNTKLLEENGIETPKTWEDVMTMTEKLKEVDPSIVPFSVAAKTKADLGRFFSEL AVRMNGLEFRDKFISHEVKWSDPEMMAVLNKMKKLMDAGVFGRDAISYEVENNVTSFGEG KIAMLFTASYYFDRFNGMEFADQIDCVNFPYFESAPENKNIWFASTSEGFCISAEPGTPE YDAACKLMAFMLSKETFEGYAEVAGGGVYPVDIDYDASKSPNPMKTFMEGYATRSDTTDI MAAYLDDASVINITNTELQTMFVERPVEDIAKTLDTEYAKLFPQ >gi|229784076|gb|GG667659.1| GENE 18 19576 - 21345 1457 589 aa, chain + ## 
HITS:1 COG:BH1909 KEGG:ns NR:ns ## COG: BH1909 COG2972 # Protein_GI_number: 15614472 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 33 586 33 596 597 144 24.0 4e-34 MSGKGYFSRLESKFTFMYTIILIVVITCIACSIGQYSNSILKKKSLDSCIQKLDFAGERF GLILDRIENESLLLTLNQASRHEAAKKEAASPYEEHMEAASFSSYLVEFLSTQSAVESIS YYSPDGSFFYQDNYGVLPPVKIEIPQEIREEFFSSSLKSSWYIHEAPNSEAGGNLTFTCL KKSFAFTGEPLGFFALTVSPSQIREIYSGFFSENEIFMITDKNGLICASTKETLPGKRVK EVFSDGASLQTGNTITLDTTKYLCTFQPEKDLNLFALTPESQVYHDSRSLVSVILAIGII SSIFTFFLFRYSTRKIMLPVNQIISNVRAMSDGDYGVRISVREGQTDEISLLAGQINRMA KNTQDLLARVHRENELKRRYELSCIQLQMQPHFLYNTLETLCGMIEMNNKSEAIGLVNLI SGFYRGALGKGKEIITLEQELKITRDYLKIMQKRYPGCFYFETSMEPEVLSCCIPKLTLQ PLVENSIIHGLQLFTSDRAGLITITGYTEDSLICLKITDNGSGMDDETVARLNSQIFSEE RSSFGIHSIKKRLKLYFAVADIVVTSKISEGTVITIRISRQQENPDRKD >gi|229784076|gb|GG667659.1| GENE 19 21350 - 22933 1090 527 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 126 2 126 525 104 40.0 5e-22 MTYQFLIVDDEYFVRQRIRLCIPWKEYGFECAGEAANVEQALHFLQTVPIDLVLLDISMP GPNGLELLRMMKEKNSQAKFIILSGFATFEYARQAIAYNVTNYLLKPINTEELIQAITGI RETLDKNNHLKQERLAWQEARLIAEHVEKNTFFQNIFSGHALGSESAGLAEFGVIPDTPC TVVIFDGQPKITHSLSFQERSSFQQALGNLSGCLLSGKSGCVQVTDIYNHQVLIIPQTQL PFGPDHFLGRLSAAALENFSKDIVCGYGNGEDGLAAAISSAYQTALQFFTFRTVYGTESG AFHTRLPEKQALEQLNTSIRGIAGGLLKKDHQVILNSLHSVFQIIGQEQFSLPALESALS SLLSAAIEYAVHSGIELFSGSGGSAYTAAGIIYHGCSLGEIEEKFTNLFLSLPGSREMDE VPMIQRIVLQAAELIEQEYSRTDMGLQYIASVLLISPAYLSRNFKRIKQYSVMQYITKCR MEHAWDLLSGGRLSIAAAAEKTGYQDAFYFSKCFKRYFGISPSQVDR >gi|229784076|gb|GG667659.1| GENE 20 22938 - 23027 68 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MACSFDFSICFSPNLVFWFSIFRGGWGYS >gi|229784076|gb|GG667659.1| GENE 21 23002 - 23361 439 119 aa, chain + ## HITS:1 
COG:no KEGG:Closa_0598 NR:ns ## KEGG: Closa_0598 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 119 1 119 119 137 83.0 2e-31 MEKSKLQATLKTLLISYILTGILLVVLAFALYKFRLKEGQIRIGVNAVYIITCLIGGLIM GKSIRQRRFFWGLTLGLVYFLVLLAVSFLLNKGLNGTMNQILTTMAICAASGTIGGMIS >gi|229784076|gb|GG667659.1| GENE 22 23505 - 23846 369 113 aa, chain - ## HITS:1 COG:AF0913 KEGG:ns NR:ns ## COG: AF0913 COG1416 # Protein_GI_number: 11498518 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Archaeoglobus fulgidus # 1 113 1 112 112 89 44.0 1e-18 MKVIYHIDETDRWPLVLGNVKNMNAYYKEAKESCMIEVLANSAAVEGYALSSPEHETAMK TLSEEGIRFAACGNALKGTGLTKDDIFDFVTVVPAGVVELAERQAEGYAYIRP >gi|229784076|gb|GG667659.1| GENE 23 23859 - 25127 1687 422 aa, chain - ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 19 410 17 414 426 251 38.0 1e-66 MDSSDYIQLLILILLIGLSAFFSSAETALTTVNKIRIRNLAEAGDKSAVTLTKVLEDQGK MLSAILVGNNVVNLTASSMSTTLAMNIWSNKAVGVATGVLTLVILVFGEISPKTISTLYS EKISLKYAKFIYLFMTVMTPVIYAVNVLSSGFLRLVHVDPNRKQEAITEDELRTIVEVSH EEGVIESEEKKIINNVFDFGDSVAKDIMVPRIDMAMVEVDATYDELIDIFREEKYTRMPV YEETTDNVIGIINMKDVLLIDRNEEFHIRDLLREPLYTYEYKNTAELMVEMRQTSNNMII VLDEYGATAGMITLEDLLEEIVGEIRDEYDEDEEQELVKVGPGEYVVEGSMKLDDLNDQL ELELESEDYDSIGGLIIGQLDRLPEEGESVVCDGIRLVVDRLDKNRIDRVHMYLPNEQNV DA >gi|229784076|gb|GG667659.1| GENE 24 25229 - 25696 564 155 aa, chain - ## HITS:1 COG:SMc01129 KEGG:ns NR:ns ## COG: SMc01129 COG0597 # Protein_GI_number: 15964144 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Sinorhizobium meliloti # 40 155 48 162 166 58 33.0 4e-09 MVFIGIIVLLAAVDLCIKKAIEEQEESSFPKELNGTGGRIVLHKNHNPGFSFGFLKEKPE 
YVKMVPLAVASFIGGGLVWMLPKKGNIADKLALSVTLGGAVSNLYDRLVRDYVVDYFSIQ FGRLKRVVFNLGDMFIFFGAGMMLVLELIRAWKER >gi|229784076|gb|GG667659.1| GENE 25 25813 - 27570 1478 585 aa, chain + ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 64 584 597 1098 1118 293 34.0 8e-79 MPVCDFNPASEEYADFIIRHNNSSPEDILLQQKAGCIDYVDQEFAVVYSQLEPLLPITTK TYTYFAIPTLYTLLDTTSMESSGIIQTAQQPSLNLRGEGTMIGFIDTGIDYRNPLFRLPD GSSRIAGIWDQTLPEPTGTMDASFGIPASTSAMSFSYGQEFLKPQIDEALRSSDPLSVVP SMDTSGHGTFMAGIAAGGATPNEDFTGAAPNCSIGVVKLKPAKQYLRDFYLIPENAIAYQ ENDIMMGIKYLKLLALRQNLPLVIYIGLGTNYGSHDGTSPLGLVLNNLVRFVGVSAVLPA GNEAGLRHHYMGRLLNDQEYDDVEIRVANGERGFFIELWATEPELYTVGFVSPTGEVIPR LPIGLGAEITVPFTLEQTVITVNYRTTEIGSGSQFVLMRFEAPTAGIWHIRVYNSLFITG IFHMWLPVEGFISEDTFFLRSDPNTTITEPANAAVPITVSTYNHINNSIYIHSSRGYTRG GLIKPDLAAPGVNVYGPGLSPGGAGETFPMTRRTGSSVAAAHVAGAVADLFTWGIVRGNN TAMSDASVRAYLIRGANRNPAYTYPNREWGYGTLDLYQTFLRIRE >gi|229784076|gb|GG667659.1| GENE 26 27572 - 29017 277 481 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 225 450 258 432 466 111 33 7e-24 MEDKEYLEEAKERKEEDRAVSEEAEKKPAKDDDEYEKICYVCRRPESKTGPMVTMGGMNL CHDCMQKAFDSVTQGGFDLSKIQNMPYMNLNLSDFSNLNQQNVEIPKKQKIKKRTAETKK PEFDLKDIPAPHIIKQRLDEYVIGQEQAKKVISVAVYNHYKRVFLVDSDKKEEENVQIEK SNILMIGPTGSGKTYLVKTLAKLLDVPLAIADATSLTEAGYIGDDIESVVSKLLAAADND VDRAQRGIIFIDEIDKIAKKKNTNSRDVSGESVQQELLKLLEGSNVEVPVGSNQKNALTP MTTVSTDNILFICGGAFPDLEDIIKERLTNKSSIGFAAELKDKYDKDPDILGRVTNEDLR KFGMIPEFLGRLPITVTLQSLTKELLVRILKEPRNAILKQYEKLLELDEVKLVFEDDALE WIAEQAMKKETGARALRSIIEDFMLDIMYEIPKDPEIGSVVITRAYLDKNGGPLIQMRGY H >gi|229784076|gb|GG667659.1| GENE 27 30318 - 32069 1818 583 aa, chain - ## HITS:1 COG:TM0571 KEGG:ns NR:ns ## COG: TM0571 COG0265 # Protein_GI_number: 15643337 # Func_class: O Posttranslational modification, protein turnover, 
chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Thermotoga maritima # 258 559 32 337 459 182 35.0 2e-45 MYNENENNDLGRMTPEEHKEEPVTTNFTMRDPDPEETKHESGYEKKESQPEYTRQEDRSA YGQQMYQENQNQGMNQSQGMNQGQSMNQGMNSNQTQNQQPYRTAQPMGQNQYQGGQNVYQ GSQNGQSQNAYRSAYGYQQGSPNQQPYGGQQGIPHYQEHQYQRMAQGSVPEGPKPPKKKS GFAKRAAAVVGAALVFGVVAGSTMVGINWAAGSYNQKNQVEISQAETLPSSDGSAAASSN STTGQVTGTSANMDVSAIVDKAMPSVVAITSKVIYESQTFFGPMQREGVGSGSGIIVGKN DDELLIVTNNHVVQGAEELKVTFIDQTAVDAAIKGTDADSDLAVIAVALKDIPSDTLSQI AIASLGNSDTIKVGQGVVAIGNALGYGQSVTVGYISALDRTVQTEDGVSRDLLQTDAAIN PGNSGGALLNMQGEVIGINSAKYSSTEVEGMGYAIPISKAQNIIDTLMTRKTRTQVADTE QGYLGIQCKNIDAATSQQLGMPQGVFIYKIVEGGAASKSDLREKDIITKFDGQSIKTYDD LTNMLKYYEGGTTVTVTVQSLENGQYVERNVDITLDKKPAENS >gi|229784076|gb|GG667659.1| GENE 28 32243 - 34777 1716 844 aa, chain - ## HITS:1 COG:CAC3063 KEGG:ns NR:ns ## COG: CAC3063 COG1316 # Protein_GI_number: 15896314 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 371 583 70 272 339 119 37.0 2e-26 MNKEYDDDLGPAGYSDAGGDTYGRQQAGRPVSRPAEGTQGQGRSSADTGRTPGNGAPRGG NARVPGNGRAQDGNARPTGSNGAQGGAGRTAGNGGAQNGSARPAGNSGAQSGNVRPAGNS GMQGASKRPAGNGGSQGGSARPSGSGNPQGGSVRPSGNGGPQGGNARQAGGGPQNGNARV PGSGAPQGSSANGSRVSGSGQPQGGARPAGAPGNGRGRARMEGARPDVGNASGASRMAGR PAGSGQSQGNDNRRRMSGTSGYEPRYQSLQGTQTGDGGSGSGGMGGRRPQDSASRTAASR SAAQAAKKKKIRRIIVMAVAEVLTLICIGFYGYALRLNSGLKHSNDYDPDKVLNSELPVE KTEHMKGYWTIAVFGVDSRDGNVHKGTNTDVIMLCNINRDTGEIKLVSVFRDTYMNTNDN GTYNKINSAYANGGAVQALSALNRNLDLNVKDYVTFNWKAVADGINMLGGIEDVEISKPE FRYINSFITETVQKTGVPSTQLKSAGVQHLDGVQAVAYGRLRLMDTDYARTERQRLIIQK AFEKAKKADLGLLNRILLMEVEQIETSLSFSDFTSLILDIGKYHIGETGGFPFARDAMIM GKKGDCVIPQTLESNVAELHKFLYNEEGYQPTDLVKKISAKISADSGMYKQGVSVDHVPT NEGYVPKETKETEKATKESDETDERESSSEGMSIEETDENGNPVKPTKPGETTAAEGTRP TKPGETTAAEGTKPTKPGETTAAEGTKPGETAATDGTKPSNPVRPGETTEAVRPSSPGDG VTESPDSSTGPGVKPTTAPHESTSAPETTSPGPGGSPGPGGGGTVQPGGSESGEVITVPP PNAA >gi|229784076|gb|GG667659.1| GENE 29 34821 - 36050 1245 409 aa, 
chain - ## HITS:1 COG:wcaJ KEGG:ns NR:ns ## COG: wcaJ COG2148 # Protein_GI_number: 16129987 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Escherichia coli K12 # 59 409 115 464 464 246 38.0 6e-65 VLYGIFHLYAPKRVQQRRYEFANICKANLIGFLLVTMVLFLLNKNPYFWNFSKRMVFYFF AINIILETLERNLIRYALHSMRSKGYNQKHILLIGYSRAAEGFIDRVQQNLEWGYQVKGI LDDHKEWGTEYKSIHVIGKVSDLDEILALNSLDEIAITLSIDEYANLEKIVASCEKSGVH TKFIPDYNNIIPTRPYTEDLQGLPVINIRHVPLTDPVNAIMKRIVDIFGAIAAIVLFSPV MLVTAALIKLTSPGPLIYSQERVGLHNRPFKMYKFRSMVVQAPSEEKSRWTTPHDSRVTP IGRFIRKTSIDEMPQFFNVLMGDMSLVGPRPERPLFVEKFKEEIPRYMIKHQVRPGITGW AQVNGYRGDTSITKRIEHDLYYIENWTLGFDFKILFLTVFKGFINKNAY Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:08:30 2011 Seq name: gi|229784075|gb|GG667660.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld53, whole genome shotgun sequence Length of sequence - 43942 bp Number of predicted genes - 41, with homology - 40 Number of transcription units - 15, operones - 11 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 559 437 ## COG3505 Type IV secretory pathway, VirD4 components 2 1 Op 2 . + CDS 552 - 734 116 ## gi|266622004|ref|ZP_06114939.1| conserved hypothetical protein 3 1 Op 3 . + CDS 731 - 895 233 ## + Term 1041 - 1077 4.2 + Prom 1031 - 1090 4.1 4 2 Op 1 . + CDS 1137 - 1352 293 ## Closa_3718 conjugative transfer protein 5 2 Op 2 . + CDS 1372 - 2241 693 ## CD1112 hypothetical protein 6 2 Op 3 . + CDS 2259 - 2666 322 ## CD1111 hypothetical protein 7 2 Op 4 . + CDS 2611 - 5049 1662 ## COG3451 Type IV secretory pathway, VirB4 components 8 2 Op 5 . + CDS 5046 - 5942 339 ## COG0270 Site-specific DNA methylase 9 3 Op 1 . + CDS 6089 - 6802 557 ## smi_1322 hypothetical protein 10 3 Op 2 . + CDS 6777 - 8858 1339 ## CD1108 putative DNA-repair protein 11 3 Op 3 . + CDS 8886 - 9143 338 ## CD1107A hypothetical protein 12 3 Op 4 . 
+ CDS 9133 - 9840 791 ## CD1107 hypothetical protein + Term 9890 - 9927 9.2 + Prom 10010 - 10069 4.2 13 4 Op 1 . + CDS 10095 - 10772 234 ## Mhun_2786 hypothetical protein + Term 10787 - 10819 2.4 14 4 Op 2 1/0.000 + CDS 10839 - 11246 271 ## COG0550 Topoisomerase IA 15 4 Op 3 1/0.000 + CDS 11210 - 11386 120 ## COG0550 Topoisomerase IA 16 4 Op 4 . + CDS 12315 - 13850 1294 ## COG0550 Topoisomerase IA 17 4 Op 5 . + CDS 13910 - 14104 56 ## CD1105 putative DNA primase + Prom 15006 - 15065 15.7 18 5 Op 1 . + CDS 15110 - 18010 2022 ## CD1105 putative DNA primase 19 5 Op 2 . + CDS 18015 - 18965 606 ## EF2322 hypothetical protein - Term 20178 - 20231 17.0 20 6 Op 1 . - CDS 20234 - 20935 302 ## FN0918 hypothetical protein 21 6 Op 2 . - CDS 20940 - 21380 105 ## gi|288870686|ref|ZP_06114957.2| conserved domain protein 22 6 Op 3 . - CDS 21422 - 22771 701 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) 23 6 Op 4 . - CDS 22732 - 23061 263 ## CD1101 putative mobilization protein - Prom 23118 - 23177 1.8 - Term 23120 - 23151 1.0 24 7 Tu 1 . - CDS 23235 - 23594 143 ## CD1100 putative conjugative transposon protein 25 8 Tu 1 . - CDS 23697 - 24644 153 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 24736 - 24795 6.5 + Prom 24695 - 24754 5.1 26 9 Op 1 34/0.000 + CDS 24801 - 25523 327 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 27 9 Op 2 . + CDS 25539 - 27005 216 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 28 9 Op 3 . + CDS 27017 - 27601 462 ## TDE0907 hypothetical protein 29 9 Op 4 35/0.000 + CDS 27632 - 29401 191 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 + Prom 29405 - 29464 3.8 30 9 Op 5 . + CDS 29680 - 31143 214 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 31 9 Op 6 . + CDS 31174 - 31602 366 ## COG1321 Mn-dependent transcriptional regulator + Term 31687 - 31738 9.2 + Prom 31927 - 31986 2.4 32 10 Op 1 . 
+ CDS 32098 - 32544 234 ## CD1094 conjugative transposon protein 33 10 Op 2 . + CDS 32525 - 32779 289 ## CD1092A conjugative transposon protein 34 11 Tu 1 . + CDS 33183 - 33455 234 ## Rumal_1103 excisionase family DNA-binding domain-containing protein + Prom 33467 - 33526 8.2 35 12 Op 1 . + CDS 33566 - 34786 85 ## COG0582 Integrase 36 12 Op 2 . + CDS 34840 - 35937 1284 ## COG0012 Predicted GTPase, probable translation factor + Term 35967 - 36027 15.2 - Term 35959 - 36011 16.0 37 13 Op 1 21/0.000 - CDS 36061 - 37548 1421 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 38 13 Op 2 . - CDS 37560 - 42119 4955 ## COG0069 Glutamate synthase domain 2 39 14 Tu 1 . - CDS 42565 - 43104 110 ## COG0563 Adenylate kinase and related kinases - Prom 43192 - 43251 1.9 - Term 43174 - 43217 10.1 40 15 Op 1 . - CDS 43254 - 43520 380 ## COG1925 Phosphotransferase system, HPr-related proteins - Prom 43540 - 43599 3.8 41 15 Op 2 . - CDS 43606 - 43941 289 ## COG1636 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|229784075|gb|GG667660.1| GENE 1 2 - 559 437 185 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 2 140 420 562 591 90 39.0 2e-18 ANIGQIPKFEKLIATIRSREISASIILQSQSQLKAIYKDNADTIVGNCDTTLFLGGKEKT TLKEMSELLGKETIDSFNTSETRSNQKSYGLNYQKLGKELMSQDEIAVMDGGKCILQLRG VRPFFSDKFDITKHPKYKYLSDADPKNAFDMEKHIKRRPVIVKPDEAFDYYEIDADDLTG DTNHE >gi|229784075|gb|GG667660.1| GENE 2 552 - 734 116 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622004|ref|ZP_06114939.1| ## NR: gi|266622004|ref|ZP_06114939.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 60 1 60 60 85 100.0 1e-15 MSNAGNRPRLRLIITEQPAKLHREEMQQQAERKQFIARCEALYEKNRAKSNIIYLQEDRP 
>gi|229784075|gb|GG667660.1| GENE 3 731 - 895 233 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNKRIKKKVAKRQMIAAMRELVELAQEMERWEEERKRRYVEAFEQYYLTHANAR >gi|229784075|gb|GG667660.1| GENE 4 1137 - 1352 293 71 aa, chain + ## HITS:1 COG:no KEGG:Closa_3718 NR:ns ## KEGG: Closa_3718 # Name: not_defined # Def: conjugative transfer protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 71 1 71 71 93 95.0 3e-18 MAFFNSAVGVLQTLVIALGAGLGIWGVINLLEGYGNDNPGAKSQGMKQLMAGGGVALIGA TLVPLLSGLFG >gi|229784075|gb|GG667660.1| GENE 5 1372 - 2241 693 289 aa, chain + ## HITS:1 COG:no KEGG:CD1112 NR:ns ## KEGG: CD1112 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 289 1 289 289 454 87.0 1e-126 MQSILDAINEWIKEILIGAINGNLSTMFGDVNEKVGTIATEVGKTPQGWNASIFSMIQTL SENVIIPIAGLVITYILCVELISMVTEKNNMHDVDTFMFFKWFFKAWVAVYLVTHTFDIT MAVFDVAQRVVSGAAGVIGGNTNIDVTAALSSMQDGLDAMEIPELLLLVMETSLVSLCMK IMSVLITVILYGRMIEIYLYCSVSPIPFATMTNREWGQIGNNYLKSLFALGFQGFLIMIC VGIYAVLVNTLTVADNLHSAIFSIAAYTVILCFSLFKTGALAKSIFNAH >gi|229784075|gb|GG667660.1| GENE 6 2259 - 2666 322 135 aa, chain + ## HITS:1 COG:no KEGG:CD1111 NR:ns ## KEGG: CD1111 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 134 1 132 134 194 79.0 7e-49 MAYVPVPKDLTKVKTKVAFNLTKRQLIFFAAALALGLPLFFLLKGGVGTSAAAMAMIIVM LPCFLLAMYEKHGQPLEVVIRNIINTKFTRPKERPYQTQNFYAVIERQAKLEKEVSAIAK SNRKKGAGKTPRRKD >gi|229784075|gb|GG667660.1| GENE 7 2611 - 5049 1662 812 aa, chain + ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 359 789 393 840 853 93 24.0 2e-18 MQKATEKKAQAKRPAEKTKRKLTRAEKKQIAEVIQKAKGDGKPHTAQQTIPYVQMYPDGI CHVSGKRYSKTVAFEDINYQLAQADDKTAIFENWCDFLNYFDASVQVQLSFINQGGRGSD AENAIHIPVQNDAFNSIRTEYSDMLKNQLAKGNNGLVKAKYITFSIEADGMSAAKSRLAR 
IETDILNNFKVLGVSARPMTGYERLEVLHGIFHPQGEPFRFSWDWLIPSGLTTKDFIAPS SFRFGDGRYFRMGQKIGAVSFFEILAPELNDRILSDVLDLENGVMVNLHIRSIDQTEAIK TIKRKITDLDKMKIEEQKKAVRSGYDMDIIPSDLATFGSEAKNLLQDLQSRNERMFLLTF LVVNMADTKRKLDNDIFATAGIAQKNNCALTRLDYMQEAGFMSSIPLGENLIPIQRGLTT SSTAIFIPFITQELFQTGAALYYGLNALSNNMILCDRKQLKNPNGLILGTPGSGKSFAAK REMTNAFLITDDDIIICDPEAEYFSLVQRLNGQVIRLSPTGRGIDGKPQYVNPMDINLNY SEDDNPLALKSDFILSLCELVIGGKEGLQPVEKTVIDRAVRNVYRPFLADPNPAKMPILG DLYDELLKQPEPEAARIAAALELYVSGSLNVFNHRTNVELTNRLVCFDIKQLGKQLKKLG MLIVQDQVWNRVTVNRAEKKSTRYYMDEFHLLLKEEQTAAYSVEIWKRFRKWGGIPTAIT QNVKDLLSSREVENIFENSDFVLMLNQAAGDRAILAKQLNISPQQMKYVTHSEAGEGLIF YGNVVLPFIDRFPKDTELYKVMTTKPEEVAGA >gi|229784075|gb|GG667660.1| GENE 8 5046 - 5942 339 298 aa, chain + ## HITS:1 COG:all7369 KEGG:ns NR:ns ## COG: all7369 COG0270 # Protein_GI_number: 17233385 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Nostoc sp. PCC 7120 # 7 164 5 161 253 83 35.0 6e-16 MKPLTHFSLFTGIGGIDLAAEAAGFTTVCQCEWADYPTAVLEKHWPLVPRFRDITTVTKE AFIEKTGRKEITLLSGGFPCQPFSSVGPRRGFEDTRYLWPEMCRVIKELRPNWVLGENVA NFVNMGLHKTLFDLESAGYAVWTFVLPACAVGAWHERKRTFIVGADVSHAPCFRHRDKTP CIKGDDLSKRSVPANEQQRELLGGVPFGSSLPPIPERGGKPAVQSRMGGMAHGVPAEMDG RYLWAIEPLDIPRLAEDKKDRVKRLKTLGNAVVPPQVYPILRFIADIELGNCKDWCVF >gi|229784075|gb|GG667660.1| GENE 9 6089 - 6802 557 237 aa, chain + ## HITS:1 COG:no KEGG:smi_1322 NR:ns ## KEGG: smi_1322 # Name: not_defined # Def: hypothetical protein # Organism: S.mitis_B6 # Pathway: not_defined # 1 222 37 257 257 275 60.0 2e-72 MGFYRDKTLCGVVTLGWGTQPLQTIRKLFSDHALQTKDYLEIGKMCFLPSENGNGHFGSL ALSALIKWVRQETDCLFLYTLADGIMGKCGYVYQAANFRYMGHFLTSVYRDAETREKIHP RSAGLLLKENAELDGKKARHWLTHDFCTMKGIEKIDGRMFRYIYPLNKKAKNIMRQYPQY QSLSYPKDRDLFFRVRIGDRQYQTIAPPQFDRDVCRYNPQKFDRMEVTDHERPLEAP >gi|229784075|gb|GG667660.1| GENE 10 6777 - 8858 1339 693 aa, chain + ## HITS:1 COG:no KEGG:CD1108 NR:ns ## KEGG: CD1108 # Name: not_defined # Def: putative DNA-repair protein # Organism: C.difficile # Pathway: not_defined # 15 692 1 646 646 901 
71.0 0 MKDPLKPRDKVTQKMSRDGLIEVNETKESIERISRREQETDYTKRPEQPQDMTQQLHNQT PDAVPVPSPGIAPKHDTATAERVMEHIGAAQTRKASKKAVRKAQEEATTSGRSSRLQFTD EELSTPELEKCIKNSDKAADRLDAARAAIPKEKALVRNRTFDEATGKGKTRLHFEKREKP MPDLKHRKNPLSRPMQEAGIFVHNKIHEVEKDNSGVEGAHKTEEVTEGGARYGARKIREG YHRHKLKPYRAAAKAEKTAARANVEFQYQKLLHDNPQIAASNPLSRYWQKQQIKKQYAKA ARTGSIKTAAENTRKATKKAAETTQKTAEFAARHWKGILLIIAALLLFIMVSAGLSSCGA MFSGLLNGVIGTSYTSEDSDLVAVENNYAAMEAELQQRIDNIERDYPGYDEYRYDLDNIG HNPHALASYLTALLQSYTPQSAQAELDRIFNLQYKLTITEEAEIRYRTETSTDPETGETT TEKVPYEYYILNVSLTNRDIATIAPEVLNEEQLAMFRVYLETSGNKPLLFGGGSSDTSAS EDLSGVQFVNGTRPGNTEIVDIAKRQVGNVGGRPYWSWYGFNSRVEWCACFVSWCYGQMG LSEPRFAACQSQGIPWFTSRGQWGARGYENIAPGDAIFFDWDLDGSADHVGIVIGTDGSR VYTVEGNSGDACKIKSYPLDYACIKGYGLMNWN >gi|229784075|gb|GG667660.1| GENE 11 8886 - 9143 338 85 aa, chain + ## HITS:1 COG:no KEGG:CD1107A NR:ns ## KEGG: CD1107A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 68 1 68 85 84 79.0 1e-15 MANNKLDRINAEIEKTRERITEQQNRLKELLAQKTELENLQIVQLVRAMRLTPAELTAML SGGGIPGMNAVPAETYEQEDTAHEE >gi|229784075|gb|GG667660.1| GENE 12 9133 - 9840 791 235 aa, chain + ## HITS:1 COG:no KEGG:CD1107 NR:ns ## KEGG: CD1107 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 220 3 227 244 199 54.0 7e-50 MKNKRLFRSFLVLFAALVCVGGFSMTAYAQGNDIPTDDSGVITKTEPQPLTPEGNMNLVD DISGEASEDKQFIVVQSKGGNYFYIIIDHAAEGENTVHFLNQVDEADLMAIIGEEETTTP AVCTCTDKCVTGAINTACPVCSVNMAACTGVTATTEPEQPEEPQEEDNGMGGLVVFLIVA LLGGGGALYYFKIFKPKQDVKGGTDLDDFDFDEYDEDEQEETDDTGDGETEDEEV >gi|229784075|gb|GG667660.1| GENE 13 10095 - 10772 234 225 aa, chain + ## HITS:1 COG:no KEGG:Mhun_2786 NR:ns ## KEGG: Mhun_2786 # Name: not_defined # Def: hypothetical protein # Organism: M.hungatei # Pathway: not_defined # 6 222 16 230 237 71 27.0 3e-11 MENKITDVLIATAKGAASLFPGGGFLAEYISLAQGYVADKRMNEWKLKVEEVLEKIPRSI DELAQDEAFYSCVQVATMGAMRAYQKEKQELFANALYSSANNVDISTDKKLFYLSLLDNY TLSHIMLLKYFAQNNYNENNNGIRQNGMVSIREIGGTEYPIKGIVEKLPCFEDDIMFVKH 
ITEQLCSDSLISIVDFNMPVSKERARAKRTTKYGDEFLEFIQKYQ >gi|229784075|gb|GG667660.1| GENE 14 10839 - 11246 271 135 aa, chain + ## HITS:1 COG:CAC2947_1 KEGG:ns NR:ns ## COG: CAC2947_1 COG0550 # Protein_GI_number: 15896200 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 124 4 126 618 122 46.0 1e-28 MILVIAEKPSVAQSIAAVLGVKEKKDGYIEGGGYLISWCVGHLVQLAEAAAYGEQYRKWS YDSLPILPQNWQYTVAADKGKQFKILKELMHRADVSEVVNACDAGREGELISVLSMKLPG VKSPCAACGFPQWRT >gi|229784075|gb|GG667660.1| GENE 15 11210 - 11386 120 58 aa, chain + ## HITS:1 COG:CAC2947_1 KEGG:ns NR:ns ## COG: CAC2947_1 COG0550 # Protein_GI_number: 15896200 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 54 127 180 618 78 55.0 4e-15 MRRLWISSMEDVAIRSGFDNLKDGRDYDALFASAFCRAKADWIIGINATRLFSCLLAS >gi|229784075|gb|GG667660.1| GENE 16 12315 - 13850 1294 511 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 448 186 655 709 246 34.0 9e-65 MNVGRVQTPTLKMLADRDAAITTFKKEKYYHVRLALPGTEAVSEKLSDPAAAENLKAACN TAQAICKSVTREKKTVSPPKLFDLTGLQREANRIFGYTAKQTLDLAQSLYEKQLLTYPRT DSSYLTDDMGGTATQISGLLCKKLPFMQDVTFSPDVSRTLNSKKVSDHHAIIPTQEFAKT DLATLPESERNILILAGARLLFAAAEPHIYEAVTAVFSCSDTDFTAKGKTVLCAGWKDLE RRYRLTLKTKSDHEDSDNLSNVPDFTEGQTFSNISATVTAHDTQPPKPLTEASLLAAMEH AGSEDTTEDAERKGLGTPATRAAVIEKLVKSGFAERRGKQLVPTKNGINLVCVLPDSLTS PKLTAEWENTLTLIAKGKSDPEDFMQGIEEMAGELVKAYPFLSDGDRERFKEEKPVIGNC PRCGSSVYEGRKNYYCENRDCTFAMWKNDRFFEERKTAFTPKLAAALLKSGKANIKRLYS PKTGKTYDGTVVLADTGGKYVNYRIEITKKK >gi|229784075|gb|GG667660.1| GENE 17 13910 - 14104 56 64 aa, chain + ## HITS:1 COG:no KEGG:CD1105 NR:ns ## KEGG: CD1105 # Name: not_defined # Def: putative DNA primase # Organism: C.difficile # Pathway: not_defined # 27 64 1 38 1343 70 76.0 3e-11 
MGTVGIFACWDNPKPLFTPPRKEVYRMASLRDTVKQYREELRDGIAWIAFWREGRSWNGD LFLL >gi|229784075|gb|GG667660.1| GENE 18 15110 - 18010 2022 966 aa, chain + ## HITS:1 COG:no KEGG:CD1105 NR:ns ## KEGG: CD1105 # Name: not_defined # Def: putative DNA primase # Organism: C.difficile # Pathway: not_defined # 317 683 361 716 1343 390 59.0 1e-106 MLNSYYCGYLVEDMTVDELTAGVRRHYENGYNSVADFIEAHDDTLPPEVLEEARQAAHAA GLPFHEKPYDGEDIDPYVYDGHMSIEDYDLMHKMIAQERSENRLNYSSFENDRLEVADRL KIERSIYFESGKADISALTALPLAELIRQREESAAAEQAIFETLKQQAAAWEAQAGNTLT FDKAIEYARTPAVTHTENQWQADENNNHTISNNVYQMRYHIYENTRYDRAQGKSIPYSYT LTWSVHTNSPDRYGQTKIAGQDRKVFADKAAMEKYLNGRIKAYQHLFTEISPTIPQEYAD HFKVNGQLLPGYTVEGEEQQTKQEQTAPEKIEPDAAISPQAEALTAEPTQPRSVIPIVLT SEKPTEKLKEITDRLEQGIKELFDSERYKEYLRVMSKFHNYSFNNTLLIAMQKPDASLIA GFNAWKNNFKRNVMKGEKGIKILAPSPFKIKQEMEKIDPATQKPVIGADGKPVTEEKEIT IPAFKVVSVFDVSQTEGKEIPNLAVDMLTGDVERFKDVFVALEKTSPVPIGFEKIEGGAH GYYHLEEKRIAIDEGMSELQTLKTAIHEIAHAKLHDIDLNASVTEQTDRPDRCTREVQAE SVAYAVCQHYGLDTSDYSFGYVAGWSSGRELDELKSSLETIRSAAAEIINSIDGYLKDLQ QEQETEQTATLPDPTIQITDMQEYGYTWDGMLPLQQEAAERLFHEDLEIFCIYEDGTEGA VTSLSELREHAENGGLFGVEKAAWKAFYERTNGKAQEETKAAPELPQEKDTFSIYQLKRD DKTVDLRFEPYDRLTAAGHTVDMANYDRIYIADLAPGTSLEDIYTRFNVDHPKDFKGHSL SVSDVVLLHQNGQDTAHYVDSIGYKEVPEFWKQPENPLKHVEDTIEQNDNNFDGIINNTP TVDELEKKAKSGEQISLTDLADAIKTDKQRGKGKEEKRSIRAQLKAEKQRTAQKKTRAKS QDLERS >gi|229784075|gb|GG667660.1| GENE 19 18015 - 18965 606 316 aa, chain + ## HITS:1 COG:no KEGG:EF2322 NR:ns ## KEGG: EF2322 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 3 314 117 433 434 130 31.0 7e-29 MGTLSKFENADVLSSLEAIMRTNTGFFQNDFDIDKRIIIKAAGQPAAEDKTLLWLSRPSG THCFRERDVFLKDTAAHNTWRFHKEQTRDRILAYAVELIGIEDGKIKGNLYELDYAKHYE RVKEQALAADTYTLIYENGSREQPAAYVIHGTPDPQLGKFERFEAKPNDPEALRDLLQQE KRNRSTLTPGNFAEHIAALHDGRIKAEARRIVERMKQLDAPNSPNKTHFMVELSPTFMQL TSSRNTERLAAMLPYKSLVISSLKDRHGVYAQINKDENRDREIKKPRQSIRAQLKADKAK TAPKKAAKTKNHELEV >gi|229784075|gb|GG667660.1| GENE 20 20234 - 20935 302 233 aa, chain - ## HITS:1 COG:no KEGG:FN0918 NR:ns ## KEGG: 
FN0918 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 36 229 1 186 187 121 38.0 2e-26 MNISPNNDKIEKYLDELADEYKVLLYKALISHTKSLDDLSVSELLRLDNEIKKPLFEDYQ RKQRRRRTLLIAGLTYMVMGVVLYLFTEIVFGDYQYSRENMMSLMSGIIALVGFIIAVYS FALQTLNIGTPRHKVQTEDSSQILEYEVVTKWRELEGIVNDISINSQVKTPRSIIQFLTE NQFIDDKESNILKEFLKIRNNIVHSANNSYTTTELKAMLDEIDKIINKIKKIV >gi|229784075|gb|GG667660.1| GENE 21 20940 - 21380 105 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870686|ref|ZP_06114957.2| ## NR: gi|288870686|ref|ZP_06114957.2| conserved domain protein [Clostridium hathewayi DSM 13479] conserved domain protein [Clostridium hathewayi DSM 13479] # 13 146 1 134 134 233 99.0 5e-60 MCYIFLFYNVGGVDVAVQVNIDDNKIDNFSDGAQTTLKKQMEKYADDIIKEANLIEEAIR EDGANTEITSNIVLQAVRKNKSNPSKKPSRFLLITKIISSFSLLITGFLFDSSGYQNNVL KLIAFVVCLIVASVSTVLQFVFEERE >gi|229784075|gb|GG667660.1| GENE 22 21422 - 22771 701 449 aa, chain - ## HITS:1 COG:SP1056_1 KEGG:ns NR:ns ## COG: SP1056_1 COG3843 # Protein_GI_number: 15900926 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD2 components (relaxase) # Organism: Streptococcus pneumoniae TIGR4 # 1 194 1 210 402 70 29.0 1e-11 MAVTKIKPIKSTLSKALDYIQNPDKTDGKMLVSSFGCSYETADIEFEFTIAQALDKGNNL AHHLIQSFEPGEVDYEKAHEIGRQLADAVTKGQHEYVLTTHIDKGHVHNHIIFCAVNFVD HHKYNSNKRSYYGIRNISDRLCRENGLSVVVPGKGNKGKSYAEYQAEKTGTSWKGKLKIA VDALISQVSDFEELLSRLQAMGYEIKPGKYVSCRAPGQERFTRLKTLGADYIEEALRERI KGTCTRTVKAPKPQRSGINLLIDIQNCIKAQESKGYEQWAKIHNLKQAAKTLNFLTEHKI EQYADLTAKISEIAAASEQAADSLKATEKRLADMAVLIKNVTTYQKTKPAYDAYRKAKNK DRYRTDHERAIILHEAAAKALQTAGIKKLPNLTSLQAEYEQLQAQKETLYADYGKLKKQV KEYDVIKLNIDSILRKEKELEEELIKVER >gi|229784075|gb|GG667660.1| GENE 23 22732 - 23061 263 109 aa, chain - ## HITS:1 COG:no KEGG:CD1101 NR:ns ## KEGG: CD1101 # Name: not_defined # Def: putative mobilization protein # Organism: C.difficile # Pathway: not_defined # 1 109 1 109 109 166 87.0 4e-40 
MSGRKRTVQIKFYVTEEEKNLIEEKMKLIPTRNMTAYLRKIAIDGYVIQVDHTDIKAMTA EIQKIGVNVNQIAKRVNSAGSVYQEDIEEIKGVLAEIWRLQRLSLLKAR >gi|229784075|gb|GG667660.1| GENE 24 23235 - 23594 143 119 aa, chain - ## HITS:1 COG:no KEGG:CD1100 NR:ns ## KEGG: CD1100 # Name: not_defined # Def: putative conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 6 116 7 116 117 112 50.0 7e-24 MNCNHQRYDFKPFGQAIKSTREKKGLTRKELAAQIRISPRYIASIENSGQTPSLQVFYRF VTCLDMSVDQFFFPEKVIAKTTQRRQLDTLLDNISENGLRIITAVAKEIADTEQKNSLK >gi|229784075|gb|GG667660.1| GENE 25 23697 - 24644 153 315 aa, chain - ## HITS:1 COG:mlr4907 KEGG:ns NR:ns ## COG: mlr4907 COG2207 # Protein_GI_number: 13474103 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Mesorhizobium loti # 219 310 203 294 305 67 39.0 3e-11 MNRSHQSFLEQNGSTLLEKESEICGFKTYQFRSSNGYGKTSIFELGSRATISFSDHLYYN DFEYAVLSEDSVVIHQYDSILTDNRFRIGQVHTGMQYIERTAENEVQRYIIKKGTPAKTI GIQLMPEYYNYYLKKEFNIGYSAFEEMLHCNTREMCIPSLSAVFHQILGFRGDDISSTIF YRAKIDEAVSLLFQNAKRAKSGAAISKTDYRSIINIVEYMAQNVDIALSLGEMAESVYMS PAKFKYTFQSVMGHSFSEHFFLLRMAKACDLLLNSNEYITAIAQSVGYKNVGSFSTQFKR YAGILPSEYRATKVR >gi|229784075|gb|GG667660.1| GENE 26 24801 - 25523 327 240 aa, chain + ## HITS:1 COG:SP1437 KEGG:ns NR:ns ## COG: SP1437 COG0619 # Protein_GI_number: 15901289 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Streptococcus pneumoniae TIGR4 # 98 223 13 131 147 62 32.0 9e-10 MTEKKVIVDLDFRTILFLDILIMLFMLLSGKATVTLIAFVIAGTVLMVFGLYSTVIRCTI VFAVLFLYYFAITHSNGSVFQGTLLSVIGIIAFIIQRIIPFLMLAIAIKERKNISEITTA LGRCRLPKGIIISMTVMLRYFPSMKNDFLMIIEAMKLKGIDTSWRGILFHPLRMLEFVIV PMLFRSLRTSEELSCAALVKGIENQGQRSSYFDVRIKGIDVVFSSSAFVMLLMSVKLHLF >gi|229784075|gb|GG667660.1| GENE 27 25539 - 27005 216 488 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 264 476 5 222 305 87 28 1e-16 
MVRLKDLSYQYKDNTAPTIENISFFIRKGEVVVLAGQSGCGKSTIFRCINGLCPRFFEGE KKGGIYLNGKESTPLRICDISRLVASVFQNPESQFFTTDVLSDLVYACENYGVPKGEIET RLNRIVSMLSLSPLIGKKLSELSGGEKQKVAIASTLMLNADVLLLDEPSSNLDYQSIALL KAILAELKSKGYTIIVIEHRLYYLADVCDRLIVFERGKISRIYEGVILKAVSNEELHLQG LRGLHPFQNTVCDLAPKRGSGPLLILDNLIFGYEKFTNILKGVSFSVYPGDKIALIGRNG CGKTTLGKILCGLIKEQQGCVRFMESPLFSRDRNRVASYVMQNVDFQLFGCNVYDDLLLG NERSPNVENRIHSILSLLGLSALESQHPTTLSMGQKQRLVIAASYLLNKRINVFDEPTSG LDYGSMKNVCDLIDSITGNENASIVITHDYEFILSVCNRVILLENGKITRDFPLKDTTEL EHIFKEKL >gi|229784075|gb|GG667660.1| GENE 28 27017 - 27601 462 194 aa, chain + ## HITS:1 COG:no KEGG:TDE0907 NR:ns ## KEGG: TDE0907 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 2 194 6 198 198 228 68.0 1e-58 MNKWKISDFVLIGLLAAIDAAVIYGVGMLTAAMVPVMHVFTSSITALIMGTIVLFVVKKI QRFGAMTLLVSLGVAIFTLTGMGSITCLPFVIVASLIADGVIAKTNFKTISIGIGYGFTQ AAYFFGGCFPFIFFLDREINKWVEMGVTEEEMYGYIRYLTGGFAVIGVVSAILCGIVGVY VGKLILRRHFKNID >gi|229784075|gb|GG667660.1| GENE 29 27632 - 29401 191 589 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 353 560 21 225 305 78 27 8e-14 MKTKNVIQFRDFLKPYKKAYISSIALAVIGVFFGLVPYYIAYHLLVNIQTPLSYREIFLA AVFMLMAFILQLAMHNASTAISHKTAFRILEQMRIAITDKMLRMPMGYMQVQGAGHFHHI LIDGTERLEYPLAHAIPEISSNVLIPAGIIGLLFVVDWRMALSVLVPATLTLLFYLPMYI GIMNEFANTYYAALENMNGRIIEYIRGNKEIKIFGQEKEAYTKYEESIQKYKNSTLKLYR RMYAAAAPSFVLLSSIIVSVLCVGGYLYAQAHLSGNVFLFSVLLSLGTGTSLLKFTDFMD NFYHIRNGKRLINEILSAPELPELNENIKGQVPSSNQIELRNVSFAYEKETVLQNISLVF QEKEKVAIVGPSGAGKTTIANLLARFWDVSDGSILIGGVDYRNLPLSSLMTRISYVTQDT FLFNLSILENIRLGNPSATDEQVIAAATQAQCAEFIDALQEGYHTIVGNDGIKLSGGQRQ RIIIARAILKNAPILILDEATAYTDMENQHKIQVSLQELCREKTLIVIAHRLTTIKNCDK IVVLNGGRVAAVGSHEELLKTCAVYQKMWSIYFEEDSTQLAGKAEVEVC >gi|229784075|gb|GG667660.1| GENE 30 29680 - 31143 214 487 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 215 460 243 502 563 87 28 2e-16 MCEERIRLGEHLKRVSLGFFTGRNLGDLVSTITSDAAFLEIEGLGVIEKAAMGIPSLLIG LLFLFQVDYRIFVLTVILLIPAWFSYRYFATTQDRLNLNRQRLIGEVTEETVEFIRGIHI LKSCCLENEQASRIRLIFTRLRKASILNELAHLFPMASFQFWFRLITSGTVLLSGILYLR GEADFLRLFLLVVASFTLFQGVEAMGIFSIFSKMTQQSIDRMKAIKGISVMENISGKEQP RQFDISYQDVTFAYDRTPVLQQVSFCVPEKTTTALVGLSGSGKTTIINLLARFWDVQQGE IKVGGKSIRTVSYEYLLHNLSFVFQDVMLFQDTVFNNIRLGNPCASMDEVVAAAKRARCH EFIMQLENGYDTVLGEGGTKLSGGEKQRLSIARAIIKDAPIVLLDEVTANMDVENELMIQ MALQELLQDKTVIMIAHKLSTIKNVDQILVLENGKITQRGTHDQLVSQDGLYQKLWNIQY EASSWSI >gi|229784075|gb|GG667660.1| GENE 31 31174 - 31602 366 142 aa, chain + ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 3 116 1 115 122 99 46.0 2e-21 MVLHSSGEDYLKAILILQKNNGKVRSVDVAAYMGVSKPSVSHAVKLLREGGFLTKDSDHF LHLTALGQETAEKLYERYQFFAKHLAGAGVDTAIAEEEACRMEHTISHESFHVLKEQKQN TCPFADSCQLVTESKNSTKEDN >gi|229784075|gb|GG667660.1| GENE 32 32098 - 32544 234 148 aa, chain + ## HITS:1 COG:no KEGG:CD1094 NR:ns ## KEGG: CD1094 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 148 1 146 146 194 71.0 1e-48 MQPNRLEFQKQCAFQSFCKRALRNEAANAHKEIKRRRAKEVSFSDLTPQEQDQLCTYDSY FYGGHEDTEPGYYAAGKKITAKLIAEALRTLPEEKRKAVLLYYFAGMTDTEIGKLFDAPR STIQYRRTSSFELLKKYLEGHADEWDEW >gi|229784075|gb|GG667660.1| GENE 33 32525 - 32779 289 84 aa, chain + ## HITS:1 COG:no KEGG:CD1092A NR:ns ## KEGG: CD1092A # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 84 1 85 85 130 76.0 2e-29 MNGMNGKYESPEVALVPYPIILAATKGDPEAMKLAVQHFSGYIASLSMRKLYDERGNAYY GVDEDIRERLQAKLMRTVLRFKAD >gi|229784075|gb|GG667660.1| GENE 34 33183 - 33455 234 90 aa, chain + ## HITS:1 COG:no KEGG:Rumal_1103 NR:ns ## KEGG: Rumal_1103 # Name: not_defined # Def: excisionase family DNA-binding domain-containing protein # Organism: R.albus # Pathway: 
not_defined # 28 84 20 76 82 64 50.0 1e-09 MDRRNNYPLPPMLHDSKGLYAAGRDGEEKVPIHLKMTLTIREAAEYSNIGINKIDAMLKQ PNCPFVLYVGTRKLVKRREFEDYIRRETVI >gi|229784075|gb|GG667660.1| GENE 35 33566 - 34786 85 406 aa, chain + ## HITS:1 COG:BH3551 KEGG:ns NR:ns ## COG: BH3551 COG0582 # Protein_GI_number: 15616113 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus halodurans # 112 386 107 366 378 111 32.0 3e-24 MGKNLKGKECGKGIYQRKDGLYSARYYSKNGKRKERYFETLPEAKNWLADARYEERHNLI VTSPDMTVDKWFTYWIENIVCDLAPNTKRNYQERYKWNIQPVIGAMCIADVKPMHCKAVL NRMETTYAGSTIRQAYITMGTMLKSAVMNDIIAKHPMNGVRYTKPVRAVDDIKYLTVEEQ EKFLEAAARSHNYRQYALLLETGLRTAEMIGLTWDAIDWKNRTLTVNKSLEYRHSQGYWR AGPPKTQKSYRTIPLTDKAYSILKSCYDEKDSRKSSETLSQVLEYIDSRTGENKCLIMRD LVFVNWRTGEPAKNSSYDTHLYKLCDEAGIKRFCMHALRHTYATRAIERGVQPKVLQQLL GHASIKTTMDRYVHVTDESLVNAIRQFQQATPPVSKKGRKKGVQEI >gi|229784075|gb|GG667660.1| GENE 36 34840 - 35937 1284 365 aa, chain + ## HITS:1 COG:CAC2134 KEGG:ns NR:ns ## COG: CAC2134 COG0012 # Protein_GI_number: 15895403 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Clostridium acetobutylicum # 1 365 1 365 365 449 64.0 1e-126 MKLGIIGLPNVGKSTLFNSLTKAGAESANYPFCTIDPNVGVVPVPDERLGKLAALYDSAK ITPAVIEFVDIAGLVKGASKGEGLGNQFLSNIREVDAIVHVVRCFEDGNVIHVDGSVDPL RDIDTIDLELIFSDLEILERRISKTAKSAFNDKSLAKELEILKAVKAHLEDGKLARSMEF DDDDAQEFVASLNLLTYKPVIFAANVSEDDLADDGASNEYVKQVRGFAAGNGSEVFVICA QIEQEIAELEEDERKMFLEDLGLTESGLEKLIAASYHLLGLISYLTAGPTETRAWTIKVG TKAPQAAGKIHSDFERGFIKAEVVNYQDLLDCGTYTAAKEKGLVRMEGKEYVVKDGDVVL FRFNV >gi|229784075|gb|GG667660.1| GENE 37 36061 - 37548 1421 495 aa, chain - ## HITS:1 COG:sll1027 KEGG:ns NR:ns ## COG: sll1027 COG0493 # Protein_GI_number: 16329369 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Synechocystis # 1 494 1 493 494 552 54.0 1e-157 
MGKPTGFLEYDRMTGRAADPKSRIKNYDEFHLPLSEKEQRCQGARCMDCGVPFCQSGVMI KGMVSGCPLNNLIPEWNDLVYTGTWDQAYNRLKKTNSFPEFTSRVCPAPCEAACTCGLNG TPVAIKENEGAIIRRAYETGIAGPKPPKIRTGKTVAVIGSGPAGLAAADQLNKRGHSVTV FEREDRIGGLLMYGIPNMKLEKWVIDRKVEVMKEEGISFVTGADVGKNHKAKDILKDFDR VILACGASNPRDIKVPGRDSEGIYFAVDFLKSTTKSLLNSNMQDKSYLPAKGKHVIVIGG GDTGNDCVGTVIRHGCASVLQLEMMPKLPEKRTPDNSWPEWPRVTKTDYGQEEAIAVFGS DPRLYQTTVKEFVKDKNGKVAKAVLLSLEPKKDEKTGRITMEPVAGSEKEAAADLVLIAA GFSGCQSYVADAFGVKLNAGMNVETEAGRYKTNVERVFTAGDMRRGQSLVVWAIREGREA AKEVDENLMGYTLLE >gi|229784075|gb|GG667660.1| GENE 38 37560 - 42119 4955 1519 aa, chain - ## HITS:1 COG:sll1502_2 KEGG:ns NR:ns ## COG: sll1502_2 COG0069 # Protein_GI_number: 16329610 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Synechocystis # 393 1191 1 801 801 956 58.0 0 MVKNSKSPTEKKVPDLYDSRFEHDNCGIGAVANMKGIKSHKTVEQALHIVENLEHRAGKD AEGKTGDGVGIMLQISHKFFKKVTKSLGIEIGGEREYGVGMFFFPQDELARNQAKKMFEI IVSKEGLEFSGWRDVETHPELIGKRAVDCMPYIAQGFVKKPADTEKGLDFDRKLYVSRRV FEQSNDNTYVVSLSSRTIVYKGMFLVKELRRFFPDLQDKDYESAIAVVHSRFSTNTNPSW MRAHPNRLIVHNGEINTIRGNVDKMMAREETMESKYFKYDMHKVLPIINQEGSDSAMLDN ALEFMVMSGMELPLAVMVTIPAPWAHDKSMPQEIKDFYSYYATMMEPWDGPASIVFSDGD LVGAVLDRNGLRPSRYYITDDDQMILASEVGAIDIEQNHVVRKERLRPGKMLLIDTVKGC LVSDSELKETYASRQPYGEWLDSNLIELKDLPIPNKGVPDMSAERRSRLQKAYGYTYEEY KTMILPMALNGAEAVSAMGADSPLAVLSRKHQPLFNYFKQMFAQVTNPPIDSIREEIVTS TAVYVGEEGNILEETAENCKILKVSNPILTCTDLMKIKYMKRPGFQTEVIPITYYKNTSL EKAIDRLFLEVDRAHRDGANIIILSDRDIDENHVPIPSLLAVSALQQHLVKTKKRTSVAL ILESGEPREVHHFAALLGYGACAVNPYLAQESIRALIESGMLEKDYYAAVEDYNNAVLHG IVKIASKMGISTIQSYQGAQIFEAIGISKSVVEKYFTDTVSRVGGITLDDIANDVDTLHS MAFDPLGLDVDLTLESVGSHKARAGQEEHLYNPLTIHLLQEAARTGSYETFKQYTHALNE EGQTVNLRSLIDFDYTKQKAVPIDEVESVESIVKRFKTGAMSYGSISGEAHETLAIAMNR IHGKSNSGEGGESLERLTVGADGLNRCSAIKQVASGRFGVTSRYLVSAKEIQIKMAQGAK PGEGGQLPGGKVYPWVAKTRHSTTGVGLISPPPHHDIYSIEDLAQLIYDLKNSNTEARIS VKLVSEAGVGTVAAGVAKAGAQVILISSFDGGTGAAPRNSIYNAGLPWELGVAEAHQTLI MNGLRDKVILETDGKLMTGRDVAIACMLGAEEFGFATAPLVTMGCVMMRVCNLDTCPVGI 
ATQNPELRKRFKGKPEYVVNFMYFIAQELREYMAKLGIRTVDELVGRTDLLKKKEHISSK RADKVDLSAILGNPYAGQKLAGYNHKSVYSFELEKTIDEKILLKKLRPALINGQKKSIQI DVSNTDRALGTMFGSEITRLYPEGFPEDTYTVNCTGSGGQSFGAFIPKGLTLELVGDSND YFGKGLSGGKLIVYPPKGVTFRAEENIIIGNVALYGATSGKAFICGVAGERFCVRNSGAT AVVEGVGDHGCEYMTGGRVVVLGSTGKNFAAGMSGGIAYVLDEKRDLYKRLNKEMVSFEE VTNKYDVLELKEMIKEHVAYTNSARGKEILDSFGEYLPKFKKIMPHDYRRMMNTIVQMEE KGLSSEQAQIEAFYANTKK >gi|229784075|gb|GG667660.1| GENE 39 42565 - 43104 110 179 aa, chain - ## HITS:1 COG:BS_yqaC KEGG:ns NR:ns ## COG: BS_yqaC COG0563 # Protein_GI_number: 16079690 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Bacillus subtilis # 1 169 1 172 178 100 29.0 2e-21 MGTGIMICGLNGTGKSTLGKALAEKLHFYFIDNENLYFSGTELSCVYAAPRTREEAETLL FREMNAHENFVFASVKGDYGEAVCPFFQYAVLISVPKDIRLQRVKNRSFQKFGSRMLPGG DLYGQEERFFDFVKSRSEDTVEKWVQGLSCPVIRIDGTKPVEENINLILKQINFHTLSV >gi|229784075|gb|GG667660.1| GENE 40 43254 - 43520 380 88 aa, chain - ## HITS:1 COG:STM3779 KEGG:ns NR:ns ## COG: STM3779 COG1925 # Protein_GI_number: 16767063 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Salmonella typhimurium LT2 # 1 87 1 87 89 65 35.0 3e-11 MLTKTLTVVNPSGLHLRPAGVLSQTAMKFKCDIIIECGEKRIVAKSVLNVMAAGIKCGTE INLICDGEDEEEAMKSMSEAIESGLGEM >gi|229784075|gb|GG667660.1| GENE 41 43606 - 43941 289 111 aa, chain - ## HITS:1 COG:CAC1577 KEGG:ns NR:ns ## COG: CAC1577 COG1636 # Protein_GI_number: 15894855 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 2 107 96 201 208 140 68.0 6e-34 EFFRIAKGLEKVPEGGERCFKCFDLRLREAAKVASAGRYDYFTTTLTISPLKNADKLNEI GEKLAKEYRVAFLPSDFKKKDGYKRSVQLSEEYGLYRQDYCGCIYSREAGK Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:10:08 2011 Seq name: gi|229784074|gb|GG667661.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld54, whole genome shotgun sequence Length of sequence - 28725 bp 
Number of predicted genes - 26, with homology - 25 Number of transcription units - 12, operones - 6 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 + CDS 3 - 722 545 ## COG4209 ABC-type polysaccharide transport system, permease component 2 1 Op 2 14/0.000 + CDS 724 - 1602 755 ## COG0395 ABC-type sugar transport system, permease component 3 1 Op 3 . + CDS 1687 - 3303 1499 ## COG1653 ABC-type sugar transport system, periplasmic component 4 1 Op 4 . + CDS 3320 - 5245 1233 ## Trad_1839 hypothetical protein + Term 5263 - 5305 9.6 5 2 Tu 1 . + CDS 5328 - 6110 517 ## Dfer_0110 lipolytic protein G-D-S-L family + Term 6201 - 6228 -0.8 6 3 Tu 1 . - CDS 6133 - 6978 659 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding - Prom 7031 - 7090 5.2 7 4 Tu 1 . - CDS 7127 - 7717 186 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 - Prom 7838 - 7897 8.0 + Prom 7794 - 7853 6.9 8 5 Op 1 . + CDS 7900 - 8037 124 ## gi|288870697|ref|ZP_06409863.1| conserved hypothetical protein 9 5 Op 2 . + CDS 8067 - 9401 1747 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Term 9457 - 9507 16.1 - Term 9445 - 9495 15.1 10 6 Tu 1 . - CDS 9536 - 10657 210 ## COG1609 Transcriptional regulators - Prom 10853 - 10912 7.2 + Prom 10700 - 10759 8.0 11 7 Op 1 . + CDS 10882 - 11670 411 ## COG3959 Transketolase, N-terminal subunit 12 7 Op 2 . + CDS 11682 - 11819 131 ## + Prom 11929 - 11988 2.6 13 8 Op 1 . + CDS 12023 - 12631 418 ## COG3958 Transketolase, C-terminal subunit 14 8 Op 2 . + CDS 12663 - 13313 618 ## COG0698 Ribose 5-phosphate isomerase RpiB 15 8 Op 3 38/0.000 + CDS 13368 - 14249 338 ## COG1175 ABC-type sugar transport systems, permease components 16 8 Op 4 14/0.000 + CDS 14263 - 15084 275 ## COG0395 ABC-type sugar transport system, permease component 17 8 Op 5 . + CDS 15118 - 16626 1343 ## COG1653 ABC-type sugar transport system, periplasmic component 18 8 Op 6 . 
+ CDS 16677 - 18029 599 ## PPSC2_c4454 beta-glucosidase + Term 18066 - 18111 8.2 19 9 Tu 1 . - CDS 18171 - 19406 815 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 19436 - 19495 6.0 20 10 Op 1 . + CDS 19401 - 19595 172 ## gi|266622063|ref|ZP_06114998.1| putative clarin-3 21 10 Op 2 . + CDS 19588 - 20979 1277 ## COG2211 Na+/melibiose symporter and related transporters 22 10 Op 3 . + CDS 20992 - 21837 776 ## CPE0145 hypothetical protein 23 10 Op 4 1/0.000 + CDS 21891 - 22529 640 ## COG1472 Beta-glucosidase-related glycosidases + Prom 23372 - 23431 80.4 24 11 Op 1 . + CDS 23526 - 24617 1088 ## COG1472 Beta-glucosidase-related glycosidases + Term 24730 - 24767 6.2 + Prom 24664 - 24723 5.7 25 11 Op 2 . + CDS 24842 - 25729 958 ## Closa_4282 hypothetical protein + Term 25814 - 25856 10.0 + TRNA 27174 - 27250 81.1 # Arg CCT 0 0 26 12 Tu 1 . - CDS 27506 - 28723 625 ## Ethha_1352 integrase family protein Predicted protein(s) >gi|229784074|gb|GG667661.1| GENE 1 3 - 722 545 239 aa, chain + ## HITS:1 COG:BS_lplB KEGG:ns NR:ns ## COG: BS_lplB COG4209 # Protein_GI_number: 16077778 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Bacillus subtilis # 6 238 85 317 318 186 43.0 5e-47 NRKGLNFWSAVRNTLLLNLTTLCVTFPITIIVSLMLNEVLSIRFKKISQSLLYLPHFISW VVVAGIATNLFSQRNGTINNLIVNAGGQAIPFLSSNGWWIFVYVMLNLWKEVGWGTIIYL AALTNVDESMYEAAYLDGATKFQRMIYVTLPAIKSVIVTMLILQVSKMMTIGLDAPLLLG NDKVMGVSEVISTYVYRLGIQNAKYSDSTAIGLFQSVVNILILFLADKFAKLIGEEGII >gi|229784074|gb|GG667661.1| GENE 2 724 - 1602 755 292 aa, chain + ## HITS:1 COG:BS_ytcP KEGG:ns NR:ns ## COG: BS_ytcP COG0395 # Protein_GI_number: 16080069 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 17 292 9 286 286 233 45.0 3e-61 MEEKTAQKRKFTVGKFFLYLVLTLLVFACVYPFLNVLAYSFSGYNAVLSGKVTFYPIDFT 
ISAYQQIIGKQQMWKAMSTTVFVTFMGTFAGLLLTTFAAYALSRDTLPGKKILLGLIIFT MYFNGGIIPTFLVVKQLGMYDSLTSLYIPQAMSVFNFIVMRTFFKQLPESLEEAARIDGA SDIAVFFKIVIPLSVPIIATIALFYAVQYWNNYFDALLYIQDPDKYTLQLRLRSLLFADE LNSGAASEADVQVMSQSLKMACVAVSTIPILVVYPWLQKYFVKGVMLGAVKG >gi|229784074|gb|GG667661.1| GENE 3 1687 - 3303 1499 538 aa, chain + ## HITS:1 COG:BH0482 KEGG:ns NR:ns ## COG: BH0482 COG1653 # Protein_GI_number: 15613045 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 81 532 81 511 512 79 21.0 3e-14 MLAAVITAGSLAGCGGKQESETKVAAAKDANYVTTYGEKMFDNVTIKVELFDRSNAPEGS TLTDNRWVKYVNEEMNKVGINVEFVAVPRADEVTKMQTMMSSGTAPDIVTTYTYAYAEDY WNQGGIWDLTEFVEGEDQAKNLKAYLGQDVIDVGKTADNHLYGIVAKRPTLAASNLFIRQ DWLDKLGLEVPETPDELFEVLTRFVNDNPDQRKDVVGASFWTLFNPSNSVCYRNTMAAAF TKLGDNEKERHIATGIEYYYDEGIRDYFRFINKCYNAGLMDKEYYTETRDSLTSDIVNGA LGFCEYDISFNVRVTSNLLKSLKENNKDAEFVSIPSLKNVNNGKQYSATYSPGGMIAFCP KTADAETVEACMTYLDWLCTKQGGYVLINGFEDEHYTLDSNGLPKVIDSEYNAKDKDWLG NDLFILGNSAYASTADEFHEMTASNNVGYENYVLNNYKNSLVGELLVPDSYSAPSASERK TDLDLVCSEWLVKCITCAPEEFDGMYDTFMKECENAGIEKIVEERTEHFNKIYGDSQS >gi|229784074|gb|GG667661.1| GENE 4 3320 - 5245 1233 641 aa, chain + ## HITS:1 COG:no KEGG:Trad_1839 NR:ns ## KEGG: Trad_1839 # Name: not_defined # Def: hypothetical protein # Organism: T.radiovictrix # Pathway: not_defined # 62 609 66 613 641 365 37.0 3e-99 MDRILYLKKMVEASKGKAGDTRYGFRSFHGVYLELAKMELQNCGETTEAIDYQNMDLGLD HVASRRDCADFIIPAYIRILMEHRDKSYMDPDYCARIESTLLGFRYWLDEPGETTGCYFT ENHQILFHSAEYLTGKIFPDRIFLNNGQTGQWHQEHAAGQIRQWLSWRIRFGFSEWLSQG YYAEDMLALLGLAYYADDQDIRIKSKMLIDMLLFDLAANSFRGHLGGTHGRAYTNAITNP QKEAVLPVMGLLWDTEYSFDDLSTCGVLMAAYDYTCPKAIAEAAWQPGEWINRQRVSIDF KDAKTYGIDPKNFENIMFFWGMQTYSDREVIQNSLKVFPTYNWMYNRLNAYKERYELMDK LGLPVTETPDQTAMTQADIYTYRTDDYMLGCVQDYRKGRRGFQQHVWTASLGGKAVVFTN SPGSEEYFYRPNQFAGNLFLPKAVINKNVVMCMYRIDADYIDYQYTHAYFPTKEFDEVIE KYGWVLGRKDKAYIALKSLKPASWQPVKKEFYQAVYPETWNEELDMDSPFEYTAQGHANV WIAEMGSEKQNGSFETFVNNFQNASVTGDKFDCTYHSPTQGEMRCGWLLPLTVNGEEISI 
HNYRRYDNPASFTEFAESRIEIASGVHKTVLDFENAAYENV >gi|229784074|gb|GG667661.1| GENE 5 5328 - 6110 517 260 aa, chain + ## HITS:1 COG:no KEGG:Dfer_0110 NR:ns ## KEGG: Dfer_0110 # Name: not_defined # Def: lipolytic protein G-D-S-L family # Organism: D.fermentans # Pathway: not_defined # 9 251 21 270 275 212 43.0 1e-53 MSGGTGMNEIRWCAIGDSFTYLNDHLSETGYRVKSGYLDRTCEKVGGLRLMNHGINGSET KDWLTAKLPEADIYTVLLGTNDWRRGVPMGTDADFMEKIPGTILGNLGCLTARIRETAPK GRIVIMTPVERGDYVCIGDPGCQTMGSYRPQAGQRLCDIAEAIARSAEKAGLCVVDLHKN AGFTAESVVRFKRLRKEAGYHDYVYPGYVDVPYDWKRDPYPYPVEAVGMTFDGLHPSDEG NEAIAKLLAEKLQEILGQLS >gi|229784074|gb|GG667661.1| GENE 6 6133 - 6978 659 281 aa, chain - ## HITS:1 COG:CAC3396 KEGG:ns NR:ns ## COG: CAC3396 COG2816 # Protein_GI_number: 15896637 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Clostridium acetobutylicum # 85 259 77 250 271 155 41.0 1e-37 MIQEIAPHRFRNEFYIKTPSPDSYVLFFQDGSILLNHTDSLKIPKFSELNIADADMISGH CDYLFSIDTKEYYLLDETSVSLTEDETLAFYPTSVIRDLEPLWISFAAANGAELHRFYRN NHFCGRCGTANIKSKKERSMVCPSCGNTVYPKIAPAVIVAVTDGDKLLLTKYAGREYTRY ALVAGYTEFGETLEETVRREVMEEVGLKVKNIRYYKNQPWAFSDSMLVGFFAELDGSPQI RLDETELSTAVWMKREDIPGDYTNVSLTHEMILLFKHGQLS >gi|229784074|gb|GG667661.1| GENE 7 7127 - 7717 186 196 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 18 188 5 176 190 76 28 2e-13 MNSSHTTRTAVPSRRISTRDLAFAGMIAAILAIISQISVPMPTGVPITIQVFGITLIGVA FGWRLGLYGSLVYILLGAVGLPIFANFRGGFSVLVDLTGGYIWSWPVMAVLCGLQIKSEN KKLETASIFLFSFIGLAINETAGGLQWAALAGDMSVWGVFVYSMTAFIPKDIVLTILAVI FGIQIRKTLNRAGITR >gi|229784074|gb|GG667661.1| GENE 8 7900 - 8037 124 45 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870697|ref|ZP_06409863.1| ## NR: gi|288870697|ref|ZP_06409863.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 45 1 45 45 73 100.0 4e-12 
MVIIDFTIFTCYNFYQLQKAKGPGRMFWPLFALLRAKFTFFADVW >gi|229784074|gb|GG667661.1| GENE 9 8067 - 9401 1747 444 aa, chain + ## HITS:1 COG:lin0569 KEGG:ns NR:ns ## COG: lin0569 COG0334 # Protein_GI_number: 16799644 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Listeria innocua # 3 444 17 458 458 578 66.0 1e-165 MSYVDEVYERVVNQNPGEAEFHQAVKEVLDSLKLVIDANEEKYRKMALLERLVEPERIIS FRVPWVDDKGQVQVNKAYRVQFNSAIGPYKGGLRLHPSVNQGILKFLGFEQVFKNSLTGL PIGGGKGGCNFDPKGKSDREVMAFCQSFMTELFKYIGADQDVPAGDIGVGAREIGYMYGQ YKRLTGLYEGVLTGKGLTYGGSLARTQATGYGLVYIVDEMLKHNDKELAGKTVVVSGSGN VAIYATEKAQQLGAKVVALSDSNGYIYDKDGIKLDIVKDIKEVRRGRIKEYVDAVPTAVY TEGKGIWTIPCDIALPCATQNELNLDDAKALKANGCFAVAEGANMPSTREATDFFQENGM MFMPGKAANAGGVATSALEMSQNSLRMSWSFEEVDEKLHDIMVGIYAKAADAAERYGVKG NYVAGANIAGFEKVVEAMMAQGVV >gi|229784074|gb|GG667661.1| GENE 10 9536 - 10657 210 373 aa, chain - ## HITS:1 COG:BH1875 KEGG:ns NR:ns ## COG: BH1875 COG1609 # Protein_GI_number: 15614438 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 8 373 3 364 375 94 26.0 4e-19 MNKNDLLTHEKLILQQIYLDELETGDTIPSVRELQSRLCLSRNSVLNAIQSLSEKGVIKK GTSPRQGYRLCEKPLRLDGNHNPKKELSVQYLLPFTTWNYTINKYLSSFENIFSRHGINL IFSNTHNSAEEEGRLLETIYSKPAALRPDFLFLTTCDSTMNPNLELLQKISREIPCILID RYFHGKNFHYIGVSNNFIGSRTADYLLENGHRHIAYIKAYGKISTIQDRFSSFENALERS AVFLDQQCIFALSNANVGSISEIQEEIDRIGARILSLSPRPTAVVCSFDRIAVALMNFFL KNSVRVPKDICIIGCDNDVDVGAMSPLSLTTFRHPYRECAREAIARIENIKNNQKASPHS VEFFPEFIIGEST >gi|229784074|gb|GG667661.1| GENE 11 10882 - 11670 411 262 aa, chain + ## HITS:1 COG:TM0954 KEGG:ns NR:ns ## COG: TM0954 COG3959 # Protein_GI_number: 15644626 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Thermotoga maritima # 6 237 18 248 286 223 45.0 3e-58 MELTQQIRDGIKLAICSSGGGHAGGSLSVADIMASLYGTVMKYDPLNPDWDERDRFILSK GHSGLALYAALAASRFIPYDELAEFGTAGSRLMNHPDAHQIPGVEMSTGSLGHGLSLAVG 
IALAGKMKNKEYFTYCILGDGECCEGSVWEAAMYACQQKLKRLVVIIDQNRVGCDGPLEE MVQIEPFGRKWSSFGFSVEEVDGHNVTEMDRLFLRLRNECSGPYAVIAHTVKGYGLTDEI AGTGASHYISGSYKELINKFTF >gi|229784074|gb|GG667661.1| GENE 12 11682 - 11819 131 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQEKAERAVHSQLEASAAMLEFLMETDDNIVVMTADLAGSSKLKK >gi|229784074|gb|GG667661.1| GENE 13 12023 - 12631 418 202 aa, chain + ## HITS:1 COG:TM0953 KEGG:ns NR:ns ## COG: TM0953 COG3958 # Protein_GI_number: 15644625 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Thermotoga maritima # 1 202 108 307 311 130 37.0 1e-30 MSAGQAGSTHFSLEDIGVIRSMPESVMIAPADAVSAAKYIELLSSIQNPAYIRLDRNPVP DLYDENYEAVIGEGKVVAAGHDIAVIAIGEAVAEAVSAQKELLEAGGPAVTVVDMPTIKP IDKGLLSQLSENHSIFLTVEEHNIYGGLGSAVGQAAAELGLGIKLQCIGIPDCYPQGNPI SYNRELFGLDCKSIRKAVMDFF >gi|229784074|gb|GG667661.1| GENE 14 12663 - 13313 618 216 aa, chain + ## HITS:1 COG:YPO1991 KEGG:ns NR:ns ## COG: YPO1991 COG0698 # Protein_GI_number: 16122233 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Yersinia pestis # 1 211 1 211 212 210 50.0 1e-54 MRAALMNEFSQAGKNALIVRALEKSAAKYHVEWFNTGMRTDNDRPVLNYIHLGIQAFVLL NTGAVDFVITGCGSGQGACMSLNSYPGVNCGYITDASDAYLFTQINNGNAVALPFAKDFG WAAELKLEDIFDKLFSCEGGGGYPVEMRDIQRKNELFFNNVKRTLCNEPVDILKALDSEL VKTALEGEWFRRCFLQYAVPGELTTYVNELIEEKSE >gi|229784074|gb|GG667661.1| GENE 15 13368 - 14249 338 293 aa, chain + ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 8 293 6 292 292 139 30.0 8e-33 MPGSILRKVQPYLYLLPLFVLLGVFKLYSSVYAVFRSFFEWDGGRISVFIGIQNYISLLS DATFGKSLLNLLILLVTSVIKVVTLPLLVAELVMSVKREFSSNLYKYLFVIPMVVPGVVM ILLWRWIYDYNGLLNGFLSALGLENFMHSWLGEQATALGALVFYQFPWVAGIQFLIFLAG LQGIDTSIYEATKIDGIGPVSRIFYIDIPLLTGQFRYLIITTIISTFQTFEHVLIMTNGG 
PGDSTIVPALHMYNETFSYYNMGYACAMGTVLFLITLGITAFNMKYIKGAENQ >gi|229784074|gb|GG667661.1| GENE 16 14263 - 15084 275 273 aa, chain + ## HITS:1 COG:lin0761 KEGG:ns NR:ns ## COG: lin0761 COG0395 # Protein_GI_number: 16799835 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 15 266 19 285 291 136 34.0 4e-32 MQKKRMIGLAVHGTLLFILFLTIYPLFVMVISSLKTNIQIYQNLYFFSLPLHFEHYKIAF GQIYRSILNSLIITAGNVLLTLIISTLAGYAFARFQFKGKNLLFMTMMMMMMVPYFLILI PQFIISQKMHLLNTYAVQILPPSSAFAILSTFLITTHMESLPKSLFEMATIEGASDFKIF YYIALPLSKPILATVCITTLISSWNNYIWPLVSANAEKVRPVILQVSKVVASTYQMPGVN FAAYVIASVPLLLIFTVATKPFVEGLTAGAVKA >gi|229784074|gb|GG667661.1| GENE 17 15118 - 16626 1343 502 aa, chain + ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 4 398 1 329 422 73 22.0 7e-13 MMKMGKKLLTFTLAAFTLTALSGCGKENRSSSETANTATEVSSDTTSEKSSDGNWHPDKP VTLKWATWEYNNTEGQEYLKKSLDLYKKYQPNITVEYEYIDNDQYQTWLQTQLMGGTAPD LFLVRHAWGQQYLRDGSVIDLTDYLMNETNPYNENQNWMDTLQESVMAQVMDPTNGRYAS VPTNGVLCKLFYNKNLMEKAGVDKMPSTFAELLKTCETLQGAGITPMVVGMKVSEGGGYH WFERMMMDPLATQFIEELDINKTGLIEVNEIARGVDNGVIDVTKEPWKDIYPMIKELSQY WYPGFTAMDNNEALEQFMRGEVAMTFSVGSNAKIMSSNPDLSFEIGITEFPTLTTESHPQ SSGIAYEIGGAPQGNICIPTTTKGDQLLAAVDFLKFKTGLEDSSLYVNELWGVTPVKNVQ ISNENPLLQALQFKNDVSPLKLYETYFDKQFFDDSVMYGQLYLQDKLELDEYTKILQEDL VKAKENYVKKEGWSDENNWGEN >gi|229784074|gb|GG667661.1| GENE 18 16677 - 18029 599 450 aa, chain + ## HITS:1 COG:no KEGG:PPSC2_c4454 NR:ns ## KEGG: PPSC2_c4454 # Name: not_defined # Def: beta-glucosidase # Organism: P.polymyxa_SC2 # Pathway: not_defined # 22 450 1 404 404 248 34.0 4e-64 MKLTISPDSHYFMNEGKPYFLMADTCWSAFTHPDIEEWQEYLDFRNEQNFNAIQMNLLPQ RHSCMPRQWLHPFTVTQEGSYDYSNIKAEYFDHVCQYLDEMKNRSMLPMIVLLWGNFVPG TWQGIRDAGRTNVEDKPYYQPMTILEMERFLQYAVPRLRGYNPVFLVSGDVNLAEDDVKY 
NTVEDQRKANPETVAFYHRALQITKRLAPDCLTTLHIAPKVELPQCLQEAEELDFYMYQP GHTYSDPELNRTLAQDLRLYPITRPVYNSETSYEANVYFDSPQGRFDERQIRKAFWQGVL NGANAGFTYGAGGVWLWRQNQNFCSERADIGLVNDWRDDLRLPGAYDMGFAKMMADTIGL YDLKPAAGMIQDNPCGEIGCAANDDQSKLVIYLPFSWPVNLNIDLTGYQVYAVDLETRWA LNPIFYLKNHVTSFRQLRYNHDYLIIACKE >gi|229784074|gb|GG667661.1| GENE 19 18171 - 19406 815 411 aa, chain - ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 304 405 153 254 257 83 37.0 8e-16 MHYDSVLMYVRKVLENRGIPSFLFREPFEDIESLDMGLRESLMEDFDKNKSFKKLTECCK EHTLYYFTDNYYCTYVCMWLPASDTAEFFITGPFTYVEINKMNYYKLLSNLEVPAALLPI LKTYYYNIAFVSSESQFNNLMETLAEIIWQGVDQYTVAEIKGNIWNAPDDNHYLTSLPET IYLSPVNTISLEKRYEAENQLMQAVSQDNQELLNQALASFENRRLIPRLADSLRDYKNYM VIYNTILRKAAEQGYVHPVYLDEISSKYAKKIENLTTPQDTSLMKEMARKYCLLVRNYSV RGYSPVIQKVINQINLNLTSDLGLKDLSEMFHISANYLSSLFKKEMGITLTDYVNKRRVE QAVFYLNTTDMQIQTIASYCGIQDINYFTRIFKKFIGMTPSKFREQISRHH >gi|229784074|gb|GG667661.1| GENE 20 19401 - 19595 172 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622063|ref|ZP_06114998.1| ## NR: gi|266622063|ref|ZP_06114998.1| putative clarin-3 [Clostridium hathewayi DSM 13479] putative clarin-3 [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 91 100.0 2e-17 MHDIVLLFYSDDFTILSYNFTFISSVSKKYAKKDKIVLTVRAVLMLSYYKKRREKTNIKE VSYE >gi|229784074|gb|GG667661.1| GENE 21 19588 - 20979 1277 463 aa, chain + ## HITS:1 COG:BS_yjmB KEGG:ns NR:ns ## COG: BS_yjmB COG2211 # Protein_GI_number: 16078296 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 5 447 13 447 459 194 28.0 3e-49 MSKEKTVRPFGMSDKIGYAFGDFGNDFTFILSSMFMLKFYTDVMGIAAGVVGTLMMAARF VDAVTDVTMGQIVDRSKPTKDGKFKPWIKRMCGPVAIASFLIYQSGLSGMSYGFKIVWMV VTYLLWGSIFYTSINIPYGSMASAISAEPKDRAQLSVWRTIGATLASLVIGVGTPLLAYE 
VVDGKTILSGSKMTMIAGVFAVFAIICYLLCFKLVTERVPVEANSEKLDIGKLLKSLLSN RALMGIIAAAILLLLAQLGMQGMSSYVFPNYYGSAGAQSASSMAGSLGVLIICAPFAVKL AQKFGKKELSIVSCLAGAASFVVCLIVKPQNPYVYVAFYTLAMVGLGFFNTVIWAMITDV IDDAEVKKGVREDGTIYSLYSFARKLGQAFASGLTGGLLSLVGYTQATAFDAEVTAGIFR MSCIVPIIGLIAVALVLIFIYPLNKKKVEENTAELALRRSKNA >gi|229784074|gb|GG667661.1| GENE 22 20992 - 21837 776 281 aa, chain + ## HITS:1 COG:no KEGG:CPE0145 NR:ns ## KEGG: CPE0145 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens # Pathway: not_defined # 1 281 1 281 281 337 57.0 4e-91 MESIFDGFKKSRDYLICVDSDGCAMDTMNIKHIRCFGPCLVEEWGLQNWEKTILDRWNDI NLYSMTRGINRFCGLLKALQEIHQCYTPIEGLEQLAQWVEHSPELSQAALEAEIEKNHSA VLKKALHWSAAVNLKIGLLEEQEKQPFQGVGEGLAFAHRFADIAIVSSANMEAVLEEWDV HGLLEHVDLVLAQEAGSKAYCIGQLLLKGYDRDNVLMTGDAPGDQKAALENGVLYYPILV RHETESWKEFKEQAVPRLIEGTYEGTYQKNKTEEFVRNLQR >gi|229784074|gb|GG667661.1| GENE 23 21891 - 22529 640 212 aa, chain + ## HITS:1 COG:BH0675 KEGG:ns NR:ns ## COG: BH0675 COG1472 # Protein_GI_number: 15613238 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus halodurans # 19 204 131 326 686 95 32.0 7e-20 MTDLKKNPYYLSDEDCKWVQDTIAGMTEEEKVGQLFFQLTQSRDEAYLKGLMETYHLGGC RYNSAPGKDIQEQNRILQKYARIPVLIACNTEGGGDGACADGTPIASEIKIGATGNEDYA RALGKMSSEEAAAIGCNMAFAPICDIPYNWENTEIITRSFGSDPELVAKMSAAYMEGAHS TPGFACTAKHFPGNGLDFRDAHIANNVNTFAS >gi|229784074|gb|GG667661.1| GENE 24 23526 - 24617 1088 363 aa, chain + ## HITS:1 COG:BH0675 KEGG:ns NR:ns ## COG: BH0675 COG1472 # Protein_GI_number: 15613238 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus halodurans # 5 180 352 529 686 102 32.0 1e-21 MFDNGLEAVMGGHIMLPEYAKLINPDLKDEDIMPATLSPEIMTGLLRDRMGFNGVVVTDA SHMVAMTNRMKRSEMLPRAINAGCDMFLFFNDPDEDFSVMLEAYRNGTIREERMTEALTR ILGLKAHLGLNKRKPEELVPTPEFLEKVLGKEEYKETAKSISRDCLTLVKYKDREVLPIT PERYQRIMIVHVKGSGGAMGELMKLMGGSKENPAELLKQKLCDRGFDAFIYESPMDKMKK 
QMEQGEKPSLDLYFAGKNAVKDFIKDMDLVITLCDVPSGRPVFGMSKGGGEIPWYVFELP VVVVGCGQPTMLADIPQARTYINTYDARNDTLGALVDALMKGPEAFPGTDPIDSFCGLFD TRL >gi|229784074|gb|GG667661.1| GENE 25 24842 - 25729 958 295 aa, chain + ## HITS:1 COG:no KEGG:Closa_4282 NR:ns ## KEGG: Closa_4282 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 295 1 295 295 478 79.0 1e-133 MHKFLRSIGFSLYQKDRDIDKLLDQMTEDVSGMKRIQLDEETNLCEIRKEVAPGMGIAVF GQMDRDGVFRRNYYYPYLRSEDMTSDVSCSIQRHTERETYAGLLDEYKVGISLIFYIDNS FECRERIMDQRSLETKRANLTGLAVSGKILLPIQKTEKQRENAKVTAKERNTLLEAAKNG DEDAMETLTIEDIDMYSQISRRVMKEDLYSIIDSCFMPCGVECDQYSVIGEITEIVEKKN QITGEEVYDLKLECSDIVFHVGIAKQDLTGEPAVGRRFKGQIWMQGNVVFDEAAD >gi|229784074|gb|GG667661.1| GENE 26 27506 - 28723 625 405 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1352 NR:ns ## KEGG: Ethha_1352 # Name: not_defined # Def: integrase family protein # Organism: E.harbinense # Pathway: not_defined # 2 401 63 456 464 217 34.0 8e-55 FEFLYEFIEKYGYKKWAASTYDGNVGLLENYVHPHIGDKKLLSLTTKMIDDYYDFLEKEA EPATNMGKPTREHITASTIHDIHKILRCAFNLAVRWDYRKKNPFLNATLPEHKEQERVVL EPNQILKVLKYTCRPDNYDYYLIHCAVLIAIGCTIRGGEIGGLQWDRVHYEKMIFHIDRA IDRISKKNLKLPKVRILFKFPNLIPGAKTCIVLKQPKTDNSARDVDVPQMVLNSLQILRQ MQEKLKAELGSDGYIDYNLVICQANGRPMMTEHLNKRFKEILVEMNDPEIKAEEIVFHSL RHTSATAKLFVSQGDFNSVMQAGGWANLEMLTRRYGKHSFQDNREKLAHKMDDFLGNGLE EASGNDGGTVIAQPGAIEQALQTLFQANPDLLIQVIQSVQSANKE Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:11:11 2011 Seq name: gi|229784073|gb|GG667662.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld55, whole genome shotgun sequence Length of sequence - 53965 bp Number of predicted genes - 45, with homology - 43 Number of transcription units - 25, operones - 10 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1524 1108 ## gi|266622070|ref|ZP_06115005.1| putative listeria/Bacterioides repeat-containing domain protein 2 2 Op 1 1/0.333 - CDS 2475 - 8669 5653 ## COG5263 FOG: Glucan-binding domain (YG repeat) 3 2 Op 2 . - CDS 8684 - 10324 1682 ## COG5263 FOG: Glucan-binding domain (YG repeat) 4 2 Op 3 . - CDS 10377 - 10946 509 ## Closa_0878 3D domain protein 5 3 Tu 1 . - CDS 11054 - 11845 627 ## Pjdr2_5011 transglutaminase domain protein - Prom 11917 - 11976 6.3 - Term 11971 - 12012 3.2 6 4 Tu 1 . - CDS 12116 - 12970 641 ## COG0582 Integrase - Prom 13033 - 13092 10.7 - Term 13044 - 13107 11.2 7 5 Op 1 . - CDS 13193 - 14008 792 ## COG3947 Response regulator containing CheY-like receiver and SARP domains 8 5 Op 2 . - CDS 14005 - 16005 1694 ## COG3275 Putative regulator of cell autolysis - Prom 16219 - 16278 6.6 9 6 Tu 1 . + CDS 16506 - 16646 63 ## 10 7 Tu 1 . - CDS 16923 - 17243 375 ## COG3689 Predicted membrane protein 11 8 Op 1 . - CDS 18186 - 18380 186 ## Sgly_0154 protein of unknown function DUF1980 12 8 Op 2 . - CDS 18377 - 18748 250 ## COG0701 Predicted permeases 13 8 Op 3 . - CDS 18679 - 19941 706 ## COG0701 Predicted permeases 14 8 Op 4 . - CDS 19938 - 20912 671 ## COG0523 Putative GTPases (G3E family) 15 8 Op 5 . - CDS 21027 - 21569 620 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components - Prom 21599 - 21658 80.4 16 9 Op 1 42/0.000 - CDS 22502 - 22756 318 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 17 9 Op 2 25/0.000 - CDS 22743 - 23450 247 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 18 9 Op 3 . - CDS 23498 - 24469 929 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin - Term 24545 - 24606 -0.4 19 10 Op 1 . - CDS 24650 - 24961 147 ## gi|266622087|ref|ZP_06115022.1| transcriptional regulator, AraC family 20 10 Op 2 . - CDS 24998 - 25432 286 ## COG0735 Fe2+/Zn2+ uptake regulation proteins - Prom 25531 - 25590 20.6 21 11 Tu 1 . 
- CDS 26517 - 27641 950 ## COG3533 Uncharacterized protein conserved in bacteria 22 12 Op 1 . - CDS 28582 - 29367 823 ## COG3533 Uncharacterized protein conserved in bacteria 23 12 Op 2 . - CDS 29420 - 31138 1472 ## gi|266622091|ref|ZP_06115026.1| conserved hypothetical protein 24 12 Op 3 38/0.000 - CDS 31142 - 31999 805 ## COG0395 ABC-type sugar transport system, permease component 25 12 Op 4 . - CDS 32002 - 32652 604 ## COG1175 ABC-type sugar transport systems, permease components 26 13 Op 1 . - CDS 33614 - 33838 340 ## SpiBuddy_0358 ABC transporter inner membrane protein 27 13 Op 2 . - CDS 33838 - 34650 586 ## gi|266622095|ref|ZP_06115030.1| amidohydrolase family protein 28 13 Op 3 . - CDS 34757 - 35776 959 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 36017 - 36076 19.7 29 14 Tu 1 . + CDS 37302 - 38165 482 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 38217 - 38276 4.8 30 15 Tu 1 . + CDS 38299 - 38499 115 ## gi|266622098|ref|ZP_06115033.1| transcriptional regulator GadW + Term 38556 - 38612 2.1 - Term 38631 - 38671 -0.3 31 16 Tu 1 . - CDS 38673 - 39320 317 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases - Prom 39428 - 39487 2.7 + Prom 39155 - 39214 4.2 32 17 Tu 1 . + CDS 39281 - 39805 238 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 39873 - 39921 6.2 - Term 39861 - 39909 8.6 33 18 Op 1 . - CDS 39935 - 40033 58 ## 34 18 Op 2 36/0.000 - CDS 40048 - 42258 1080 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 35 18 Op 3 4/0.333 - CDS 42264 - 42947 187 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 36 18 Op 4 40/0.000 - CDS 43017 - 43568 576 ## COG0642 Signal transduction histidine kinase - Prom 43723 - 43782 25.7 37 19 Op 1 . 
- CDS 44696 - 45367 379 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 38 19 Op 2 . - CDS 45400 - 45987 398 ## COG4430 Uncharacterized protein conserved in bacteria - Prom 46198 - 46257 16.2 39 20 Tu 1 . - CDS 47159 - 47707 432 ## COG4422 Bacteriophage protein gp37 - Prom 47798 - 47857 4.8 + Prom 49141 - 49200 80.4 40 21 Tu 1 . + CDS 49261 - 49413 66 ## gi|266622110|ref|ZP_06115045.1| conserved hypothetical protein + Term 49551 - 49596 0.3 41 22 Op 1 5/0.000 - CDS 49520 - 49975 215 ## COG0534 Na+-driven multidrug efflux pump 42 22 Op 2 . - CDS 49962 - 50603 265 ## COG0534 Na+-driven multidrug efflux pump 43 23 Tu 1 . - CDS 51555 - 51791 359 ## ELI_1443 Na+ driven multidrug efflux pump - Prom 51828 - 51887 7.5 + Prom 51833 - 51892 2.9 44 24 Tu 1 . + CDS 51916 - 52452 510 ## COG1695 Predicted transcriptional regulators + Term 52580 - 52618 5.1 - Term 52568 - 52606 8.4 45 25 Tu 1 . - CDS 52627 - 53901 1651 ## COG0151 Phosphoribosylamine-glycine ligase Predicted protein(s) >gi|229784073|gb|GG667662.1| GENE 1 3 - 1524 1108 507 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622070|ref|ZP_06115005.1| ## NR: gi|266622070|ref|ZP_06115005.1| putative listeria/Bacterioides repeat-containing domain protein [Clostridium hathewayi DSM 13479] putative listeria/Bacterioides repeat-containing domain protein [Clostridium hathewayi DSM 13479] # 1 507 12 518 519 978 100.0 0 MTAPGIPVYAVNHGDQSVIEFDPQSGPDLSHSSYSTMPRRDTGVSYAAGQAGFPLISSND FSGILTENFGSGERPKIPEFQVDWPGYTFDGWYNESGNKIETLPYAFPYSAVSSYKAVWK GNATTPFRFTVMHYRDLNTDRDGNNDGTDAGAWPPEGASDLWKFYESGDWTREVMANTPI SATYRRDIPGYRLKSVLIKNNRVRRYDEPSGQGRLDGGADINTTTKAVRGNMPNDDLTVA YRYEPDPTKKFAVYVEYVDSNGAQIRTPDTYTFSAETEIDIVPAQINAYSLQSGALKEGF EGTDDLEGQGIYSAASAGCSFDEAMHYRGKMPNQPVTVQYKYNIDPNFHTVLQVKRIDNH GNPMEDDEIREITPADPVIIDVEQKTGYDYPPNIHWDEHFTGTQSSFNPDDGTLTLSSDM QGGTVTITYNENLSDESYWSKIEYYNGTNGTISGDTSPKFKKHGTYTIDELTAGVNLSPS SHYLFDGWYRANANGSDKTGDRLEGSL >gi|229784073|gb|GG667662.1| GENE 2 2475 - 
8669 5653 2064 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1949 2053 503 599 621 76 39.0 5e-13 MDGSRNGVSFMESMRAGRIIKKGIACLLAFVLAFLPPVPGLTMSAMAAGTELQPAQGMKD NGGGLTTAYKWNKGTTGKFDIEGIEDASLIGAENPYTGDRLKAVITTYAHQGYETVVQKQ GGAGGRKNVAISVNGGIQTFDDLGIEVQMKVYPSLDQKWILVDYIVYNKKNENNPIQIAS GTDVQVGGRSGAGAGDPADASTLTANARGFHMVNEQTQQTFDCIYNDDSLGLTEPTTKWV GRYNERQNYYFTDNPHSVANVDSGLTYSWTIPLRPYETAVRRVAFGAKGPSYYVSSAHGN DGNVGAYDTPLKTLGAAITKIGNNVGYIYIQDYNELTSTITVPSGANITFMSSDYSREGT PIDDIITIRRNSGLVSGPMFSTNGGTMRFSNLVLDGNRTGAAGNTSPILSADRGTVGIMS GATLQNNQVNAAGMGSAIQISGSADLEMNYGTITGNTAVDRGAVNFDSTGKFMIQNRISI EGNTLSSGEKANVYLSEGKYMTVVGDLDTSAIGVTTARLPEASAGGASTQAAQEVNVAVP LGRSAVDISPSPFVDNFFADRAGSEGTGLYVTSGTAGLSEAGENHDRNTVLKKNGYQITF FIKDADTGGSITGAPAIQAKTYGSGEAVSLAAPPEVTGYEAAGVTIEQGTSNLLHAQLNP GNDYGTVTGIMPNQDVTIDYEYRKVNSSIRFEANGGTPQPPSLTGTAGSNVNAMMPRVSR YGYVFRGWSSVNDWNNPSIITGLPSVFPETPVTYYAVFEPDPNVKFNYTVEYSNRNGDIV FQSDTEENAYSVETPITAQKKEIHGYAWSLEDSGTTPAEYNYSGTAVPVGQFNGTTGAFA GTMPGQDATVRYRYQVDYGNTAAQSQLTVRHETAGGTTVSEAQTSSHYPEETVTARPVTR YGYECVDVRIDTGDTADDSDGHLVSALTGSFDADRIFRGMMPNQPVTLTYLYEPTEEGYP LTVEYVDGVTTDSSLKNIREPLVHFHQAEEGVGLACSQSEVVYREPYGYRLASKSIAPST TLITWSGNDFTGTMPNDSETVTYRHDRNPDLWAYITYRAGAHGSLTAEGVSSDVEDRGDG SYRTSVLIDDNSEQGAEFGYTLGTMTEKRLVPGTRADQYYRFGGWFVDANSNGIMDDGET LLEEGSRFTGETVLTAYFEEDPEKWVDIHFEAGAHGSINDNEARTLHTMYDRKWSEITES LPQYTAEVNYLTDRWYAGDVPVTGDMNLVNGQTYTIRFRPDPGVFGTEVTKPETLAGLNS QGKGRITVFNTTPGYQYLLTDPSGKILEIIRGNELTGRVVFDNLYPGADYQVYEATGDVH VEIGSQTDSLAGQISDPAQVLTPVLETNYQILYDEEETGKTILVIKPADPDSEYAVLDSS GHVVDIPAAGNGGFKKPSGSPASLTFAGLDYNREYTVVARPLGKTDVTAESLAEKGSVIT TDPGAELELPNYIVETQNGEITMVGETSVGAQRYEEAHKGDRVKITADEVNGAGEPFSHW ETTIGTAGELGDKTARREASFIMPDNNVVLTAVYERPKASPSNAAVVDEVRGGSRKETAL DPNEIPRLEAELTTEADRELLDVNHADVTYKVVYRKNAVKASESNAIKASGYYETEHEAA 
YKGAWGLDVLIERYVNGRKVAGAAEPEAAFRTYIQMGKDDVDMLDYQLYEITVNPGTGET EAALVTLEYDPEETGGLYAFTAQAGSRYVMIYNRAYRVYFVNQVALPVEYRYRYYFKVRR DEAPSDEYYADEYGQVEEQLDYFVNPDGAEFSYEGWSRSQDSFKEFVPSEPVRRKTYIYA FYKDNVREVDDARQKLEEAIREAIRISDDHFLKLEESGILKEAVEEALEVLDRENPKATA GQLTAALDRLKDKTDPYEKILDDRYNHYDDIQDNGNKGGNKGGGGSGSGTRQNPYNTKLP ESYVVGTNGNWVEVSAPEGQERQMAFVLTGGMRLSGMWAWIKYPADGSTSISDRTGWYHF NDKGIMETGWIRDEGGNWYYCRTEKEGNFGKMQTGWHHDAADGNWYYLIPSSGVMATGWR KIGEKWYYFETAGAGVYVYDFVAS >gi|229784073|gb|GG667662.1| GENE 3 8684 - 10324 1682 546 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 21 186 504 641 693 64 29.0 4e-10 MRKRKFTVTAAMFALSAILLSEIQMQPAYAVEGWTNDSGEWMYLDKSDQPVTDGWKQSKD SWFYLDSNGKMVRDSFIELGNGLYYADKDGKRAESAWIWSDGREEDGYEDGWYYFGANGK AYRRTGGFKKVIHGNTYVFDEKGKMLTGWIDEEGRALTEDDNPLVDGLYYAGENGALLSG VWMDYGSVTLGGADDLESSLTGKNYSEYKELWLYFDDNFKKLKSSGDKIKEKTIGDNTYG FDEYGIMLPWWSTVASVSNADRSNPTSSESAKYFAGYNGGKLLKNTWFWMCPSENLDEQD YNDGEYSWWYSDQNGEVYRNRIRKVHDKYYAFDGLGRMRTGFVLFDGKDTFVAQYDIDDW SSDDFKNGDLYGLEKADLYLFDPDELNDGSMQTGGEIKVELEDGVHTFGFGSNGIAYGNR NKLRKTKNSFYINGLKLEADQEYGYGVVKEDEDTYRVVDTRGKILTGDRKVIKDKEGGWL IFLNSRFAARVEDADKPRWYNGEEGPGFYHYDRSGGEDKYSGGLIVNYDSEPLLDYLADE QRLNFE >gi|229784073|gb|GG667662.1| GENE 4 10377 - 10946 509 189 aa, chain - ## HITS:1 COG:no KEGG:Closa_0878 NR:ns ## KEGG: Closa_0878 # Name: not_defined # Def: 3D domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 97 188 148 238 240 78 42.0 1e-13 MTAACFATVVVTGVYSFVSEGASGQDYIQTPPLTGILESAEAPGERNETERGPGVSVKEE PEESAAGGNTGPAAGREENGEEKNDAEAAEPEESEYLGEFEVTGYCGCDICCGTKEVKLT KMETVPKPDYTVAADPEVIPLGSSIVIEGTVYHVEDVGQAIKGKSVDIFFASHEEAKNFG RQKLSVYLK >gi|229784073|gb|GG667662.1| GENE 5 11054 - 11845 627 263 aa, chain - ## HITS:1 COG:no KEGG:Pjdr2_5011 NR:ns ## KEGG: Pjdr2_5011 # Name: not_defined # Def: transglutaminase domain protein # Organism: 
Paenibacillus # Pathway: not_defined # 152 237 140 229 380 69 34.0 1e-10 MKEIYESCLTVLLSTCLAFTWNLPAFGAGADTQIREPCMDRDEIGKYAEEMTGRIPEYEI CGKELIRLLADDVRPGEMRNTGIIFDTADEAVSFGRYFYRYVYLGKERIQLAAGKTGESS CVYVQCDNPVKAAKQHRQVEARLWETAAAAAEMEEQEKALFFYEWVYDHVEYDTEFRNKT VYDAVMEGSSVCWGYVSAYLTLCRMSGLNCEPVYSGDHAWNRVWIEGRWNYCDITWDKGQ GERRRKLMSEEEMEADPAHRRTG >gi|229784073|gb|GG667662.1| GENE 6 12116 - 12970 641 284 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 18 270 14 262 265 112 27.0 6e-25 MEETIDNYCEWLAGREKSEGTIRKYRYYLQKFMEHMNGRSVDKGSLLIWKQYLRNHLAPI TVNGALTALNGFFRFYRWEDCEVRFLKITTSPFSQESRELSKEEYLRLVDAAFRKGNERL SLLLQTVCSSGIRISELKYVTVEAVQKGKAEIECKGRIRTVLLTRQLCNMLAEYAKRKGI MQGMIFITRRGNALDRSNIWREMKALSMAAGVESDKIFPHNLRHLFARTYYEIEKDLSKL ADILGHSDVNTTRIYTKESGTRHRQQLEKLGLLVTSYNRISLLL >gi|229784073|gb|GG667662.1| GENE 7 13193 - 14008 792 271 aa, chain - ## HITS:1 COG:CAC3390 KEGG:ns NR:ns ## COG: CAC3390 COG3947 # Protein_GI_number: 15896631 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver and SARP domains # Organism: Clostridium acetobutylicum # 3 239 1 235 370 94 25.0 2e-19 MKMKTIIVEDEANILSETKRNLERMADIEVLGGFLSPLEALKFAHGHDIDLAILDIEMAG INGMDLAELLKKIYPAVQILFSTGYSQYAMKAYQLQAMAYLMKPYSYSELENAVHRCKAL IEGLRFDKKERKIYVRTFGNFNVYVDGEALFFPYGKAKELFAFLVNARGGAVTMEQIVTS LWENRPYDNRAKQLYRKAVSLMRSTFRNAGIENICFYYRSCLAGNTKMYSCDYEEFLDGS IEQRRTYNGNYMAEYSWAEDTNAVLSNMVDY >gi|229784073|gb|GG667662.1| GENE 8 14005 - 16005 1694 666 aa, chain - ## HITS:1 COG:FN0220 KEGG:ns NR:ns ## COG: FN0220 COG3275 # Protein_GI_number: 19703565 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Fusobacterium nucleatum # 459 660 317 533 541 123 34.0 1e-27 MRIVRRKLKQLKEQYGFCGIVFLDVKSETFLFTEQKLREVFEDPNSRILQSSSFESLNEF 
VRNLYQTEETDRHACPYDFTNIKVEDEQGSYWVQLYACKWTISRRRKIAGIIWIKLEEKE RGQKEQEKKEPGRENGIAGNYWIQKSSITDKYLKDTSDNYLFEWNIADDRLLLSENWNSK FSARGSGKRLNGQMIEWYIVKEDIPRFHKLQNDIMNGESPEDILLRFYIKDDKHDYAWCN VSFLSILCDDRTPMYVVGRVRDIDKKLRGLRHGNDSFYTSNRREARQKIDKLLAVKESEK IHALLVIRLNMNEAAKNEETFWGESAISYSLVKNLTYMIYPTDILLVHDNYMVVFLGDMG NEENVLRKAARIHKVFESMIPENVFAEIGIALWPRDGKTYEELLMYAEKPMIIKAKWFET EFAGEETDGPGEAAHETETFKNQEQWPISVVSDIMDDWYHMLYMNSRLKRQMELAESQIM LGQIKPHFIFNALANIKALIYTDPDLAEKLIIAFTKYLRTHLNALGKEEMARFPDILDFV DNYIQVEQSRFPGKIKVVYDIQYDEFQMPHFIIQPLVENAIKHGLCKKNGTGRLRIASYQ AGDEICIEIEDDGIGFDRMPERTVDSGIGLENVRKRLSYLLDGSLALHSEPGKGTRAVIR FKERQT >gi|229784073|gb|GG667662.1| GENE 9 16506 - 16646 63 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKQVNPIISIASIMAAPANGSMHAINGSTDGTAANPPTARTSSFM >gi|229784073|gb|GG667662.1| GENE 10 16923 - 17243 375 106 aa, chain - ## HITS:1 COG:CAC1591 KEGG:ns NR:ns ## COG: CAC1591 COG3689 # Protein_GI_number: 15894869 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 3 103 131 238 245 61 26.0 5e-10 MYELSSSYERYEGYTVIIKGFVYRDSEIRETCDFALVRLSMWCCAADLSPVGFLVDNGGT LDFKDDDWVVVKGTLGISEDGQSLLLKAQSIEAAEKPEEEYVYPYF >gi|229784073|gb|GG667662.1| GENE 11 18186 - 18380 186 64 aa, chain - ## HITS:1 COG:no KEGG:Sgly_0154 NR:ns ## KEGG: Sgly_0154 # Name: not_defined # Def: protein of unknown function DUF1980 # Organism: S.glycolicus # Pathway: not_defined # 3 59 1 57 274 66 50.0 3e-10 MTLRGKGINVQVLTECICYLLFGFLLFHLTYSGRYLNYVTPRMKPYLYGMAAFMLLWAVM NGRY >gi|229784073|gb|GG667662.1| GENE 12 18377 - 18748 250 123 aa, chain - ## HITS:1 COG:BH1518 KEGG:ns NR:ns ## COG: BH1518 COG0701 # Protein_GI_number: 15614081 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Bacillus halodurans # 1 113 221 333 337 92 42.0 2e-19 MGKFLLTGIFVSTLFQDFIPQAVTAGSGAAPWKALLLMMGLAFVLSLCSSSDAVVARSMA ASLPAGAVLGFLVFGPMMDIKNAAMLLSGFKSAFVVRLFVTAFLVCFLVMMLFMLWGSGG IRI >gi|229784073|gb|GG667662.1| GENE 13 18679 - 19941 706 
420 aa, chain - ## HITS:1 COG:BH1518 KEGG:ns NR:ns ## COG: BH1518 COG0701 # Protein_GI_number: 15614081 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Bacillus halodurans # 224 419 46 241 337 122 33.0 2e-27 MTVPAWVITGLLDSGKTTLINRLIEEKLDELDILVLQFESGETPLTEQERVKKLVFSKSQ LERAPLDIADTVIEYLNSHSTELILIEWNGMEHFHKLEEMFLQFMAKTVMSIEKVVYVAA DTGIQAGIPDAGAAAFSQIAGSDCAYIRSEGREHGRVKPDLLYNCSPDIRVYTDRKWERF MRDIFHVRMQPQHFLLMAAAAAVLFLLTVSSLDEIGFSFGRYISVYLGVFLQAAPFLLIG VLLSSLIQVCLKPDWIQRKFPEKILAGQLFAVIAGFCLPVCDCASIPVFKSLVKKGVPMP AAVTFMLVSPVINPVVILSTWYAFNGNYQMIAARCGLGVLCAVLCGFTYLFKPPEDYLLE NAVPLQNACENYSLPAAADSRLSRFFLIMRHAQNEFFPWGNFCSPGFLYRRCFRILFLRQ >gi|229784073|gb|GG667662.1| GENE 14 19938 - 20912 671 324 aa, chain - ## HITS:1 COG:FN0779 KEGG:ns NR:ns ## COG: FN0779 COG0523 # Protein_GI_number: 19704114 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Fusobacterium nucleatum # 4 316 3 292 294 136 28.0 7e-32 MTDIYIISGFLGAGKTTLIKTMVRSAFRNKEIVVIENDFGEAGIDAGLLRECNLAVTSLS DGCICCSLTGDFEKAIERVVKEYAPEAVLVEPSGVGKLSDIIKICFKQEDKGCLHLKKAI TVVDVRSFDKYRKNYGEFFEDQIAYADLIMLSHQEEAAEEIRTVGAKLRELNPEARIEAD FWESIPASVFRQGLRNSSIFRLEMETAVAMKPVRIRKSREQSKKAGFIRRHFAGEVFASV TIEWKEPLTEEQLQKKVMHVAAYAEGEILRGKGIVASGGRGLVFHYLPDYLSIEPVNTAG NQVCFTGTGLNEKQIRMLFSEETV >gi|229784073|gb|GG667662.1| GENE 15 21027 - 21569 620 180 aa, chain - ## HITS:1 COG:lin1484 KEGG:ns NR:ns ## COG: lin1484 COG1108 # Protein_GI_number: 16800552 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Listeria innocua # 1 176 98 274 292 103 36.0 2e-22 MSIAVVMSAGIGLAGVLSGFVKNAADFNSFLFGSIVAVSDTELALVTAVSITVLLLFLLM YKELFYISLDEQSARMAGVPVKAVNFLFTVLIAVTVSVAARTAGALIVSSMMVVPAACAM QFGTNFRQTLLCAVVLDVLFMITGLFAAYYLGLKPGGTIVLVGVTVLILVFAGKRVLLRK >gi|229784073|gb|GG667662.1| GENE 16 22502 - 22756 318 84 aa, chain - ## HITS:1 COG:CAC2878 KEGG:ns NR:ns ## COG: CAC2878 COG1108 # 
Protein_GI_number: 15896132 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Clostridium acetobutylicum # 1 79 2 80 268 66 45.0 9e-12 MEIFKYAFMQRAFLVGILLAAVIPCIGMVMVCRRLSMIGDALSHTSLAGVAAGLILGLNP VLTAAAACVIAAIGIEAIRRKLPS >gi|229784073|gb|GG667662.1| GENE 17 22743 - 23450 247 235 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 2 222 3 216 311 99 31 3e-20 MDAIAVKGLDFSYGNTRILKNINFTVPEGRFAVLLGPNGAGKSTLLKLMLGELPLNGQNG RIELLGREIRQFKNWQKVSCVSQNGMASCQNFPASVEEIVQANLYAQIGRFRFAGKKEKE QVRCALEKVGMEAFAKRLIGRLSGGQQQRVLLAKALVNEPGLLLLDEPASGMDENSTMLF YQLLCRINREQGVTIWIITHDRKRLEAYADDIWFLEEGTMKQVRAGQKEACYGDL >gi|229784073|gb|GG667662.1| GENE 18 23498 - 24469 929 323 aa, chain - ## HITS:1 COG:lin0191 KEGG:ns NR:ns ## COG: lin0191 COG0803 # Protein_GI_number: 16799268 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Listeria innocua # 12 323 7 312 312 248 41.0 1e-65 MKKKLNFTFCRLVVPALVSAALLSGCSGKTAAPGGADGEPVQTTAVSEREGKDTDCGKLK VMASFYPMFDFAVKIGGDKAEVTNMVTAGTEPHDWEPAVSDIRNLEEADLFVYSGAGLEH WADDVLASLETRNLIVAEASSGLTLREGHGQFDPHVWLSPENAKKEMENIKDAYVKADPD NKDYYEENYKTYAAAFDKLDRKYKDTLSALPNKSIVVSHEAFGYLCDAYGLTQMGIEGLS PDSEPDPARMAEIIDYVKENHVRVIFFEELASPKVAETIARETGAGIRVLNPLEGLSDEE IKGGADYFSVMEDNLKQLEAALQ >gi|229784073|gb|GG667662.1| GENE 19 24650 - 24961 147 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622087|ref|ZP_06115022.1| ## NR: gi|266622087|ref|ZP_06115022.1| transcriptional regulator, AraC family [Clostridium hathewayi DSM 13479] transcriptional regulator, AraC family [Clostridium hathewayi DSM 13479] # 1 103 1 103 103 181 100.0 2e-44 MNCSIKQKRICAFLAAFLVTAVLLLSGFFIVTHTEHDCTGEDCPVCAELHACAATIRLIT EAAGTGAVVIFTYMITQKLLLSYQSGRSLCPVSLVSLKIRLDN >gi|229784073|gb|GG667662.1| GENE 20 24998 - 25432 286 144 aa, chain - ## 
HITS:1 COG:CAC1682 KEGG:ns NR:ns ## COG: CAC1682 COG0735 # Protein_GI_number: 15894959 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Clostridium acetobutylicum # 1 136 16 150 151 57 24.0 1e-08 MAERGRYKTKQQEIILDCLKKQKSRFLTVDQIMDCLKEDGVAVGQTTVYRVLERMVDDGK MIRLPTEDGAKVRYCFVDKEDLNKQGKLVCLGCGRFIPLECSKMKAFLDHIYEEHGFELD QQHTILYGYCESCKDRHSGIKSDL >gi|229784073|gb|GG667662.1| GENE 21 26517 - 27641 950 374 aa, chain - ## HITS:1 COG:ECs4459 KEGG:ns NR:ns ## COG: ECs4459 COG3533 # Protein_GI_number: 15833713 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 369 277 648 656 335 43.0 1e-91 MADVARETLDDTLYEACKAMWDSIVNRRMYITGAIGSSAYGEAFTIDYDLPGDLMYGETC ASVGMVFFACRMLQMEQRGEYADVMERLLYNGTISGMALDGKSFFYVNPLEVYPKKDEEV QAFRHVKVERQKWFACACCPPNLARLVESVGAYIYTQSEDRLYVNLFIGSETEVFSGEEA AHIHMETGYPWTGNVKLVTECESGREWEIALHLPGWCRQFDILVDGEPVGYRTEAGYLYL KRCWKGRHEIVFVMKLEARLVRSNPRVRDNMGKTAVMRGPVVYCLEEADNGPDLHCIEIP EETVFKEEYRKDLLGGVVVLESVGRKIREEEGIPLYFDEKNEKYDEKKLLWIPYYSWANR SAGEMMVWVRRVTL >gi|229784073|gb|GG667662.1| GENE 22 28582 - 29367 823 261 aa, chain - ## HITS:1 COG:ECs4459 KEGG:ns NR:ns ## COG: ECs4459 COG3533 # Protein_GI_number: 15833713 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 13 238 11 236 656 253 50.0 2e-67 MNQKHFTKAVASVVVEDSFWKPIMEVTRTQMIPYQWKALNDELPDIEPSHCIRNFKVAAG LMDAEFDGRVFQDSDVAKWIEAASYSLEWHPDEELEKKIDDTVDLIIMSQQPDGYIGTYY SINGLEKRWTNVMRHHELYCAGHMLEAAVAYCRVTGKDRFLNAMIRFVDYIDTVFGPEEG KLHSYPGHEVIEMALMRLYEITKNEKHLRLAKYFIDERGKQPCCFEKEIETYHNEYPWKD SFFQFHYYQAGAPVREMKDAS >gi|229784073|gb|GG667662.1| GENE 23 29420 - 31138 1472 572 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622091|ref|ZP_06115026.1| ## NR: gi|266622091|ref|ZP_06115026.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 
13479] # 1 572 1 572 572 1207 100.0 0 MNDTITNNMFPRWNSFESRNMEKMLLLFADTGKAHAAAGDIGELEKYFDFAKRAGSRFDW VQLHSGYFFNHAELSKYPDNGIEIDTKTNAFIDEAVRICKKNGKKVSYMMGDYAPLECLL ERYPQVRNLNNGLFWELLYDAACGIFKRFPELDELAMYFFESKNLLHYNNFFGCMNYGLD FNEKTLREHPELVMKQEGQSYPYLSFGDHLRMMLQAVSTACRDCGKSFTLLTHVWFPYQE EILYEALKDFPADLPILLEHNYTTGDFNPALPAPALIKRLPHMNHGLCFCCGMEYHGLSL VPCCFPEAMEATVHDAMEATANLKRITVRPIWDGQSLLGSPNEINLYYLLQLSDQADRDP EVAWKEWIAGKYGITDEESELILASALRKSYEVVRKVFFEFGVRTNDHSHIPNFSHLESR LFNYGKALIHWCPTPENKQNIYDLLIRPGSKILRMHRELHEESLHLIEQALGNVKLLENR MRPEDYEDIIRRYGDMKVWVLLHQEEYEAYIRLLIQRKTPSERNQAMAEKSLVRLGKMAE RIRSGEIKDCYLFSVDHVESFIEDCRREFDQI >gi|229784073|gb|GG667662.1| GENE 24 31142 - 31999 805 285 aa, chain - ## HITS:1 COG:YPO1721 KEGG:ns NR:ns ## COG: YPO1721 COG0395 # Protein_GI_number: 16121981 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Yersinia pestis # 18 285 39 306 306 264 46.0 2e-70 MNRKIAKIRKIVLKTISYLVLTALGIVFIYPLLFMLSASFKTNQELMTSMSLIPKSISFD SYINGWKGVGTYTFGHFMKNSGILVVPVVLFTVISSSLVAYGFARFKFKGHGLLFGAMLS TMMLPNAVVIVPRYIMFRQMGWLNTYVPFYALSLFACYPFFIFSMVQFIRGIPKDIDESA FMDGCSTFGIFVYMILPLAKPCLFSMAIFQFIWTWNDYFNPLIFINSVNKYTVMQGLRMC MDSSSGISWGPIMALSLIAILPCVIIFFAAQKYFVEGVATTGLKG >gi|229784073|gb|GG667662.1| GENE 25 32002 - 32652 604 216 aa, chain - ## HITS:1 COG:YPO1722 KEGG:ns NR:ns ## COG: YPO1722 COG1175 # Protein_GI_number: 16121982 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Yersinia pestis # 1 216 77 292 296 214 50.0 8e-56 MPMKLISALIIAMLLNQALKGIGLFRTIYYLPSILGGSVAIAILWRHLFDLNGVVNVFLN RLGIKSIGFLTDPKIALTTISMLSVWQFGSSMLFFLAALKQVPQSLYEAAKIDGAGPVRS FFKITIPMISPMTLFNIIMQMINAFQEYTAPAVITGGGPVKKTQVLAMTLYQNAFTYRKM GYASAISWIMFIIIIIFTVFIFSTSAKWTFYGDEEG >gi|229784073|gb|GG667662.1| GENE 26 33614 - 33838 340 74 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_0358 NR:ns ## KEGG: SpiBuddy_0358 # Name: not_defined # Def: ABC transporter inner 
membrane protein # Organism: Spirochaeta_Buddy # Pathway: not_defined # 15 70 12 67 295 71 62.0 1e-11 MKNKKQGTMLERRNNQGYLYLLPWIIGFLIFQLYPIAMSLYYSFTDYSFGSQYDFVGLAN YLQIFTKDREVKNS >gi|229784073|gb|GG667662.1| GENE 27 33838 - 34650 586 270 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622095|ref|ZP_06115030.1| ## NR: gi|266622095|ref|ZP_06115030.1| amidohydrolase family protein [Clostridium hathewayi DSM 13479] amidohydrolase family protein [Clostridium hathewayi DSM 13479] # 1 270 1 270 270 542 100.0 1e-153 MNLIDFHMHPYLTGSQYTGMYLEPGPESGKNVMDKIRCDMERAGISHVCGSVIGGGNTVW KGSFHDLKELNRSALGLKNMSDGFYTPGFHIHPEFVRESIEEIEYMAEHGVRLIGELVPY AHGGYSYSSGAMDKILQAAEAYGMVVSYHSTDDDDADRMVERHPKLTFVAAHPGEKPRVE QRLERMRRYDNLYLDLSGTGLARLGILAYSVRRLGAERFLFGTDYPINNPAMYVHGVLYE ALSDRERRMICYENAERILGIHIFEGEGDH >gi|229784073|gb|GG667662.1| GENE 28 34757 - 35776 959 339 aa, chain - ## HITS:1 COG:YPO1719 KEGG:ns NR:ns ## COG: YPO1719 COG1653 # Protein_GI_number: 16121979 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 46 324 130 415 430 63 23.0 7e-10 MQTIATAAAEYQAAFPDSFVKLDQQDTFKLDVFDQGFLDSFCKAHDGSTVAVPSGVSSYN LVVNKTVTDAAGVELPDNMTWDEFLEYGKQIHAADPDYYMITLSDDDCNHFMRSYVRQLS GYWTISEDYEVVSDRDALVKAFTLLQSLYKEGIAEPMDTAFAYNGDKDANKKLLNNEIAC SYRGSSSIINLDTSGGMVLDVINIPIDPDAKETGVITQPSQLFMVSEGPNREEALKFMQW LYTDEDAIRILSDCRGVPPTEIGRLQLEKENLLNPVVNKAITLALEKTDGAVPPVNENSE VYGYLFPLMQELCYDTISPEEAADELMANLTEIVEKLKP >gi|229784073|gb|GG667662.1| GENE 29 37302 - 38165 482 287 aa, chain + ## HITS:1 COG:BS_yisR KEGG:ns NR:ns ## COG: BS_yisR COG2207 # Protein_GI_number: 16078146 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 14 282 14 285 287 105 28.0 1e-22 MHHLMLPCSRPLTFLSCGQLVKNKNFLHPKRCLDSFVLIYILKGTLNISQYGVQYRLEPN QFILLHAETEHYGWQESSEYLHYYWVHFHVPEPPVYINQTEYPSEISQALPEKYIVPETG TLTHGKRVPLLFSQLLDISMRRYPASEYLLHYSLTALLLELAGEFSSRLLAPKEDPLHIY 
RIQDWIRRNFQQAITVAQIAEIFHYNPDYLSSSYKKLTGTTLTQYINSVRIESAKNLLTS TDFTLKEIAFYCGFFDEKYFFKVFKQYEDMTPSEFKGAFFKKKEVSQ >gi|229784073|gb|GG667662.1| GENE 30 38299 - 38499 115 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622098|ref|ZP_06115033.1| ## NR: gi|266622098|ref|ZP_06115033.1| transcriptional regulator GadW [Clostridium hathewayi DSM 13479] transcriptional regulator GadW [Clostridium hathewayi DSM 13479] # 1 66 42 107 107 102 100.0 6e-21 MERAVILLKGTSLAINEVALMLGYRNSSHFYMAFREYYGVSPREYIQYLSKQYLLAYLYY RVAMTA >gi|229784073|gb|GG667662.1| GENE 31 38673 - 39320 317 215 aa, chain - ## HITS:1 COG:CAC3408_2 KEGG:ns NR:ns ## COG: CAC3408_2 COG0446 # Protein_GI_number: 15896649 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Clostridium acetobutylicum # 41 215 107 285 285 87 32.0 2e-17 MYIPLRFHSAENAYVKQVPRRCSGFPPLFFFFLVRSLELYEAVVCATGSEPVIPQVPGID KTIAIDALTAIDHEEKLGKRIVIIGGGLIGSETALDLAEKGHDVTLVEMLPQIMNGVATT DFLAYSERIEKTNMQIMTNTRLIAVEDHGVTVQGPHDFEKLDADTVVLAIGLRAKQELYH ELKERGKEVYLVGDAVKAGKIFDAFHTAYRTALKI >gi|229784073|gb|GG667662.1| GENE 32 39281 - 39805 238 174 aa, chain + ## HITS:1 COG:CAP0082 KEGG:ns NR:ns ## COG: CAP0082 COG0664 # Protein_GI_number: 15004786 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 2 172 19 185 188 65 21.0 5e-11 MHFLLNGIVKVYTNSSDGFVRLLGYHRQNTLFVLDGFCEGKPAIVTTQAVTDITTVTVTR QSLLRMCEANPAFGLDLASYVGDVLRLMCFDAENQSVSDVTTRLINFIFLYMKSEDYCQT QCIPMSYENLASAVNASRIQIARICSKMKRLGVIDVGRRCITIKNPNALYDFTK >gi|229784073|gb|GG667662.1| GENE 33 39935 - 40033 58 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLDALCNDPYIVVYLFFEKQITRGIAFEGLKG >gi|229784073|gb|GG667662.1| GENE 34 40048 - 42258 1080 736 aa, chain - ## HITS:1 COG:CAC2731 KEGG:ns NR:ns ## COG: CAC2731 COG0577 # Protein_GI_number: 15895988 # Func_class: V Defense mechanisms # Function: 
ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 60 735 188 875 875 177 23.0 5e-44 MWKDYSAGYIKHNRASSVTVMAAAFISALLLSLLCSLFYNFWIYEIERLGAEEGIWQGRI VGEIDTKQLESIRNFANVKTASVNEKLSDGQEAVIDICLHNRRTILTDMPKIAELAGISQ EAVSYHHSLLSMYLIRDPMDPAPRLIFPFLLGTTVLACFSLIMIINHSFAVSMYARIHQI GIFSSIGATPGQIRICLLQEAAMLCVIPVIAGNLLGILISVWIIGQTNVLAGDTAGRQEA VWQYHPWVLVFTFLITILTVWISAWMPAKKMSRLTPLEAIRNTGELQLKKKKKIRILPLL FGVEGELAGNALKAQRKALRTTTLSLVLSFLAFTLMQCFFTLTGVSQRMTYFEGYQNVWD IMVTVQDAGIDSFEKAEEIQALSGVADSAVYQKAAAKRIVTEDEISSEMAALGGFMHAPE NYVTKTEEGWLVNAPLVILDDSSFLEYCGMIGAAPRLDGAVVRNQIDDVTDPDFRNRDSH DYLTGQQKTSVLRQAAREELTAEVPVLAYTREVPELREEYGSLDLYELVHFLPVSLWKQI RGQIGGEEEDFYIRILADDTTSLEALNRIKEELQEILNPEYEVLIENRIQKKIDNDHMIH GMMTVLGGFCVLLALIGIGNLFSNTLGFVRQRNREFARYMSIGMTPGGLRKLFCIEALVI AGRPVLITIPLTVLMTGYMIKISYLEPVIFIKEAPVIPVLAFLFMIAGFVALAYYLSWKK VSIVNLRAALSDDTMI >gi|229784073|gb|GG667662.1| GENE 35 42264 - 42947 187 227 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 20 222 25 240 563 76 28 3e-13 MELLKCEGIGKIYGKGLNQVTALDGIDLSVERGEFLAIVGASGSGKSTLLHILGGVDKPT SGNVMIDGMDLSALNQKQAAIFRRRKVGLVYQFYNLIPTLTVRKNILMPLLLDKKSVNQE FFNQIVSSLGIEDKLESLPGELSGGQQQRAAIARSLVYRPALLLADEPTGNLDQKNSREI IDMLKLSNRNLKQTIILITHDEKVAMEANRIVTLEDGHIVSDRKQRG >gi|229784073|gb|GG667662.1| GENE 36 43017 - 43568 576 183 aa, chain - ## HITS:1 COG:CAC0525 KEGG:ns NR:ns ## COG: CAC0525 COG0642 # Protein_GI_number: 15893815 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 6 179 153 326 329 126 39.0 2e-29 MEYAGQIRKQVARLTHLEEALLLLSRIDTGTLTLEKVKLDVFTMLTLAADNLQELSVLSD VELDISDHGEAVITADMEWTMEAVMNLLKNCMEHSPAGARVHCSYEQNPLYVQIRIWDEG EGFEKKDIPHLFERFYRGENAVKGSIGIGLSISKAIIEMQNGVVHAGNLPEGGAFFEIRM YSH >gi|229784073|gb|GG667662.1| GENE 37 44696 - 45367 379 223 aa, chain - ## HITS:1 COG:CAC0524 KEGG:ns NR:ns ## COG: CAC0524 COG0745 # Protein_GI_number: 15893814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 219 1 221 228 158 39.0 9e-39 MENIYYVEDDRNIASLVAEYLEQRDCIVSIIPSIAEARQMLSRQKPDLCLIDWNMPDGQG DVLCRSIRFRWKDMPIIFLTARGDVKDIVAGFGCGADDYVVKPFELEVLYSRIQALLRRA GTVSDQVLSCDFISLNHSKKTVFCGDEEITLSQMEYQLLEVLLKNKGQTISREQILHEVW DSNGSYVNDNTLTVTMKRLREKLHQPLCIKTIRSFGYRMEDTV >gi|229784073|gb|GG667662.1| GENE 38 45400 - 45987 398 195 aa, chain - ## HITS:1 COG:alr0739 KEGG:ns NR:ns ## COG: alr0739 COG4430 # Protein_GI_number: 17228234 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 12 194 9 191 193 107 36.0 1e-23 MVMMFTGGDMSEVMKFSTRKDFRSWLQENCMSTAGIWVLFGKAGGPKTIKAGEALEEALC FGWIDGQMQSIDDKIYKKYFSLRRENSKWSEKNKALAEKLEQQGLMTDYGLAKIAEAKQN GQWDAPKSPAVTEEQIAFLSGLLKAYEPAYANFQAMSLSVKKTYTRAYFDAKTDAGREKR ILWMVDRLNKNLKPM >gi|229784073|gb|GG667662.1| GENE 39 47159 - 47707 432 182 aa, chain - ## HITS:1 COG:SMa2239 KEGG:ns NR:ns ## COG: SMa2239 COG4422 # Protein_GI_number: 16263660 # Func_class: S Function unknown # Function: Bacteriophage protein gp37 # Organism: Sinorhizobium meliloti # 3 178 13 199 259 67 30.0 1e-11 MNWEPWTGCYQISDGCANCYFYGPYAKRCGQNTILKTDKFNWPIRRNKKGEYNIKGDKIL ATCFATDFFLPEADEWRKEVWGMIRERTDIDFLILTKRIDRFPVSLPDDWGEGYDNVNIG CTVENQAIADYRLPLFLSYPMKRRFIACAPLLEAIDLTAYLHGVDHVTVGGETGREAREC DY >gi|229784073|gb|GG667662.1| GENE 40 49261 - 49413 66 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622110|ref|ZP_06115045.1| ## NR: gi|266622110|ref|ZP_06115045.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 50 1 50 50 97 100.0 4e-19 MALVYTEEYTAEDFYKHQDISMNLIHKIKDCQTRGPARFMPHHLLFSGQK >gi|229784073|gb|GG667662.1| GENE 41 49520 - 49975 215 151 aa, chain - ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 8 120 323 434 457 63 33.0 1e-10 MKPFKTSILWSTIFCAMFGLAAAIFSSQIMALFTKGDMEMIRIGSFALRANGLSFILFGF YTVYSFLFLVMGKAVEGCFLGACRQGICFVPVILLLPGIFGLDGILYAQPIADVLSAIVT AWLAIHFHKELSAAKAHFLATQAESSESADV >gi|229784073|gb|GG667662.1| GENE 42 49962 - 50603 265 213 aa, chain - ## HITS:1 COG:MA2050 KEGG:ns NR:ns ## COG: MA2050 COG0534 # Protein_GI_number: 20090897 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 11 213 117 320 468 129 40.0 3e-30 MTGAVLILCSIIFLKPILKQLGAIESVMPYATAYTSIYITFSIFNVFNVTMNNIVSSEGA 
AKTAMCALMAGALLNVILDPIFIYRLHLGVAGAAIATAISQLVSTIVYMGYILRKKSIFS FHIKECCCSKEIMSEILKIGIPTLLFQILTSLSISMINNAAKEYGGSALAAMGPLTKIMS MGSLIVFGFLKGFQPIAGFSYGAKKFDRLHEAI >gi|229784073|gb|GG667662.1| GENE 43 51555 - 51791 359 78 aa, chain - ## HITS:1 COG:no KEGG:ELI_1443 NR:ns ## KEGG: ELI_1443 # Name: not_defined # Def: Na+ driven multidrug efflux pump # Organism: E.limosum # Pathway: not_defined # 1 78 1 78 450 119 84.0 4e-26 MNESDKKIALLGSMPIPKALLALGLPTMIGMLINALYNLADTYFVGGLGTDQMGAVTVAF PLGQVIVGLGLLFGNGAA >gi|229784073|gb|GG667662.1| GENE 44 51916 - 52452 510 178 aa, chain + ## HITS:1 COG:MA2093 KEGG:ns NR:ns ## COG: MA2093 COG1695 # Protein_GI_number: 20090938 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 1 170 1 167 189 108 36.0 4e-24 MATIDLIVLGILKRESLSAYDIQKLVEYRNISKWVKISTPSIYKKVIQLEEKGLIKSNIV KEGKMPEKAIYSLTDTGAKEFENLMNEIASQPIHFFLDFNAVIVNLPSLPPEGQKLCLEK IEENVKALKTCLEENLCEKERVPEIPETGMAVLQQQYILAQTIETWITSLKDNSFSAI >gi|229784073|gb|GG667662.1| GENE 45 52627 - 53901 1651 424 aa, chain - ## HITS:1 COG:BH0634 KEGG:ns NR:ns ## COG: BH0634 COG0151 # Protein_GI_number: 15613197 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Bacillus halodurans # 1 423 1 419 428 399 50.0 1e-111 MKVLIVGSGGREHAIAWKVAGSPKADKIYCAPGNAGIAEYAECVPIGAMEFDKLAAFAKE NQIDLTVIGMDDPLVGGIVDVFEKEGLRVFGPRKNAAILEGSKAFSKDLMKKYHIPTAGY ETFDTPEAALQYLETAEMPIVLKADGLALGKGVLICNTREEAREGVRTLMLDKQFGSAGD RIVVEEFMTGREVSVLSFVDGDTIKIMTSAQDHKRAKNGDQGLNTGGMGTFSPSPFYTEE VDAFCRKYVYQATVDAMKAEGRPFKGIIFFGLMLTEKGPKVLEYNARFGDPETQVVLPRM KNDILEVFEACVDGTLDQIQLEFEDNAAVCVVLASDGYPEHYEKGYPISGFENFKGRDGY YVFHAGTKFDEEGRVVTNGGRVLGVTAKGENLKQARENAYRATEWITFGNKYMRDDIGKA IDEV Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:13:02 2011 Seq name: gi|229784072|gb|GG667663.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld56, whole genome shotgun sequence Length of sequence - 44401 bp Number of 
predicted genes - 32, with homology - 31 Number of transcription units - 18, operones - 7 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 40 - 1101 1127 ## COG2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits + Prom 1280 - 1339 5.3 2 2 Op 1 1/0.000 + CDS 1442 - 2092 809 ## COG1648 Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) 3 2 Op 2 . + CDS 2085 - 2417 526 ## COG0181 Porphobilinogen deaminase + Prom 3319 - 3378 9.1 4 3 Op 1 . + CDS 3426 - 3698 350 ## Closa_3081 porphobilinogen deaminase 5 3 Op 2 . + CDS 3701 - 5215 1651 ## COG0007 Uroporphyrinogen-III methylase + Prom 5227 - 5286 6.2 6 4 Tu 1 . + CDS 5328 - 5540 280 ## Closa_3078 flavodoxin family protein + Prom 6442 - 6501 6.0 7 5 Op 1 . + CDS 6531 - 6764 292 ## Closa_3078 flavodoxin family protein 8 5 Op 2 . + CDS 6777 - 8645 1242 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 9 6 Tu 1 . - CDS 8653 - 9183 273 ## COG2207 AraC-type DNA-binding domain-containing proteins 10 7 Tu 1 . - CDS 10123 - 10452 381 ## bpr_I0234 AraC family transcriptional regulator - Prom 10472 - 10531 5.2 + Prom 10422 - 10481 8.2 11 8 Op 1 . + CDS 10591 - 12741 1857 ## PPSC2_c0917 protein + Prom 12743 - 12802 4.0 12 8 Op 2 . + CDS 12826 - 14160 1484 ## COG0534 Na+-driven multidrug efflux pump 13 9 Tu 1 . - CDS 14222 - 14632 327 ## COG5485 Predicted ester cyclase - Prom 14807 - 14866 10.6 + Prom 16453 - 16512 3.9 14 10 Tu 1 . + CDS 16612 - 16761 142 ## gi|266622127|ref|ZP_06115062.1| glycoside hydrolase family 2 + Term 16785 - 16843 13.1 - Term 16773 - 16831 5.5 15 11 Tu 1 . - CDS 16875 - 21422 3472 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 21473 - 21532 4.4 + Prom 20956 - 21015 4.3 16 12 Tu 1 . + CDS 21262 - 21576 84 ## 17 13 Tu 1 . 
- CDS 21694 - 23487 991 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 23596 - 23655 80.4 + Prom 24478 - 24537 23.1 18 14 Op 1 . + CDS 24591 - 25676 677 ## COG3395 Uncharacterized protein conserved in bacteria 19 14 Op 2 . + CDS 25655 - 26812 949 ## COG1454 Alcohol dehydrogenase, class IV 20 14 Op 3 . + CDS 26842 - 27744 652 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 21 14 Op 4 . + CDS 27769 - 28830 438 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 22 14 Op 5 . + CDS 28833 - 29810 534 ## COG0657 Esterase/lipase 23 14 Op 6 . + CDS 29826 - 31139 1124 ## COG3493 Na+/citrate symporter 24 14 Op 7 . + CDS 31166 - 32335 518 ## COG4692 Predicted neuraminidase (sialidase) + Term 32357 - 32415 5.4 - Term 32343 - 32403 4.2 25 15 Tu 1 . - CDS 32412 - 33368 638 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 33391 - 33450 9.4 + Prom 33311 - 33370 6.2 26 16 Op 1 . + CDS 33534 - 34760 995 ## CLJ_B0047 major facilitator superfamily protein 27 16 Op 2 . + CDS 34757 - 36328 1335 ## bpr_III022 hypothetical protein + Prom 37167 - 37226 80.4 28 17 Op 1 . + CDS 37420 - 38565 1290 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 29 17 Op 2 38/0.000 + CDS 38565 - 39476 930 ## COG1175 ABC-type sugar transport systems, permease components 30 17 Op 3 14/0.000 + CDS 39491 - 40309 752 ## COG0395 ABC-type sugar transport system, permease component + Prom 40311 - 40370 3.3 31 17 Op 4 . + CDS 40399 - 41661 1323 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 41699 - 41747 12.8 32 18 Tu 1 . 
+ CDS 41756 - 44209 2124 ## Sterm_4037 hypothetical protein + Term 44241 - 44280 9.2 Predicted protein(s) >gi|229784072|gb|GG667663.1| GENE 1 40 - 1101 1127 353 aa, chain + ## HITS:1 COG:STM2550 KEGG:ns NR:ns ## COG: STM2550 COG2221 # Protein_GI_number: 16765870 # Func_class: C Energy production and conversion # Function: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits # Organism: Salmonella typhimurium LT2 # 1 333 1 334 337 369 50.0 1e-102 MNHDIDIKKVRINCFRQSKVAGEFMLQLRVPGALIEAKHLTMVQEICQRWGNGTFHMGTR QTLNIPGIKYENIEEVNKYIKNYIKDVDIEMCNVDMESNDAGYPTIGARNIMSCIGNAHC IKGNVNTYKLARKIEKLIFPSHYHIKVAVAGCPNDCVKANFNDFGVMGIHKQIYDIDRCI GCGSCVDACEHHATGVLSLNANGKIDKDTCCCVGCGECSLICPTGAWSRGKQLYRVTLGG RTGKQNPRAGKLFLNWVTEDVVLGMFANWQKFSAWALDYKPEYLHGGHLIDRVGYKEFVK HIFEGVEFNPEAKMADDIYWAENEQRGNMHVMPLSQHKHAGPAETANYTFNQK >gi|229784072|gb|GG667663.1| GENE 2 1442 - 2092 809 216 aa, chain + ## HITS:1 COG:PA2611_1 KEGG:ns NR:ns ## COG: PA2611_1 COG1648 # Protein_GI_number: 15597807 # Func_class: H Coenzyme transport and metabolism # Function: Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) # Organism: Pseudomonas aeruginosa # 1 208 1 207 213 135 36.0 7e-32 MAYFPFFMDIEGQRCLIVGGGVVACRKVEVLLDYGPEIVVVSPEMTERMEAYEETGAGKV TLLRREYQEADLDCVDFVVAATADEELNRRISLACRERRIPVNVVDVREECSFIFPALIK DEDITVGISTGGSSPTIAQFLKAKFRAAIPEGFGKLAAELGTYRELVKERVPVLSVRTEI FKEMVEEGIRQGGMFTREQAEQLIDRKLEELEKDHE >gi|229784072|gb|GG667663.1| GENE 3 2085 - 2417 526 110 aa, chain + ## HITS:1 COG:RP466 KEGG:ns NR:ns ## COG: RP466 COG0181 # Protein_GI_number: 15604330 # Func_class: H Coenzyme transport and metabolism # Function: Porphobilinogen deaminase # Organism: Rickettsia prowazekii # 1 108 1 108 299 111 53.0 3e-25 MNKTIRIGTRKSALAMVQTELVAEALKQVQPGLETRIVTKVTEGDRILNKPLLEFGGKGV FVTEFEQALQNGEIDFAVHSAKDLPMDLEDGLGIVAVPPREDPRDVLLAS >gi|229784072|gb|GG667663.1| GENE 4 3426 - 3698 350 90 aa, chain + ## HITS:1 COG:no KEGG:Closa_3081 NR:ns ## KEGG: Closa_3081 # Name: not_defined # Def: porphobilinogen 
deaminase # Organism: C.saccharolyticum # Pathway: Porphyrin and chlorophyll metabolism [PATH:csh00860]; Metabolic pathways [PATH:csh01100]; Biosynthesis of secondary metabolites [PATH:csh01110] # 1 89 211 299 300 98 57.0 1e-19 MEGRIEDRICEMAVRIQDEDAKTCLMLERKILKLLDAGCHEPVGVYSRIRGEEIEVFGIS RRDRETKRVYLTGGIGELDLLAERAAKGLR >gi|229784072|gb|GG667663.1| GENE 5 3701 - 5215 1651 504 aa, chain + ## HITS:1 COG:CAC0098_1 KEGG:ns NR:ns ## COG: CAC0098_1 COG0007 # Protein_GI_number: 15893394 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: Clostridium acetobutylicum # 1 253 1 252 252 230 45.0 4e-60 MKETGKVFLVGAGPGDPGLMTEKGLKRLRECDAVVYDHLASERFLDEVPEESQKIYVGKK AGCHYMKQEEINRLLVDLAREGKNVVRLKGGDPFVFGRGGEEVMAVSREGIPYEVVSGVT SAIAALASAGIPVTHRAVSRSFHVMTGHTMTEAGKLPLDFAEFAGLSGTLIFLMGLTHLP LIVKGLLENGKSGSTPAAVIEKGTLPDQRVVRGTLDTIEQRVAEEGIGTPAIIVVGEVAA LNFSSTYHEPLKGARVAVTGTGLLAAKLRKSLEEAGASTECVLSLELESCRSGEAMKAAY ERLADYGWIVFTSANAVREFFLGLLESGRDNRSVGHVKFAAVGKGTAKELLRYGFTTDYI PEEFCAAALARGLKDVVKDGERLLIPRSLKGSPELNGILDGAGIRYDDVVLYNVAERKQG EKELSEALKKADYLTFASGSGVDAFFENAGEDVTALLAGIRVVCIGDITAKKLEEHGRKA DVTASRFTVEGMTEAVSEDWSRRI >gi|229784072|gb|GG667663.1| GENE 6 5328 - 5540 280 70 aa, chain + ## HITS:1 COG:no KEGG:Closa_3078 NR:ns ## KEGG: Closa_3078 # Name: not_defined # Def: flavodoxin family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 68 1 68 167 112 73.0 6e-24 MDYLVVYTSNTGNTEKVAMKIFDSIPGRSKDIQRLEELRDGEADTYFVGFWNDRGTCGSR VMEFLSGLAS >gi|229784072|gb|GG667663.1| GENE 7 6531 - 6764 292 77 aa, chain + ## HITS:1 COG:no KEGG:Closa_3078 NR:ns ## KEGG: Closa_3078 # Name: not_defined # Def: flavodoxin family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 73 95 167 167 120 67.0 2e-26 MSVFVPEDNDYLGCFLCGGKMMPQVLDRYKQMQAINDSPQIRTMIAAYEEGMLHPDEKDF QDAEAFVEEVLNKVTGR >gi|229784072|gb|GG667663.1| GENE 8 6777 - 8645 1242 622 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 
[Rickettsia canadensis str. McKiel] # 21 617 8 596 636 483 45 1e-135 MGIKNEDDNKSPRKPFIYYYLIVMIVMILFNALMVPWIQSRTIIEVPYSTFLEKVEAGQV TDVAKTDSEIQFIADTGEKDKNGKEIYATYKTGPWPDEKLTERLMAHKEINFQEAIVEPM NPLLSFFLTWILPILIFVFLGNMMAKQMQKRMGGGNAMTFGKSNAKIYAESETGKTFADV AGQEEAKDALKEIVDFLHNPQKYADIGASLPKGALLVGPPGTGKTLLARAVAGEAHVPFF SISGSEFVEMFVGMGASKVRDLFKQANEKAPCIVFIDEIDTIGKKRDGGGFSGNDEREQT LNQLLAEMDGFDGKKGVVILAATNRPDSLDKALLRPGRFDRRIPVELPDLGGREAILKVH GKNVKLSDDVNFHDVALATAGASGAELANIVNEAALRAVRMGRKLVSQEDLEESVEVVIA GYQKKDSGVSVNERKIIAYHEVGHALVAACQSHSAPVHKITIIPRTSGALGYTMQVDEEQ RYLLTKDEALNKIATFTGGRAAEELIFHSVTSGASNDIEQATRIARAMVTRYGMSETFDM VALETVTNQYLGGDTSLACAPDTAKLIDEEVIKIVREQHQKALQILKENEGKLHKIAEYL LEKETITGEEFMDIFSGEDESL >gi|229784072|gb|GG667663.1| GENE 9 8653 - 9183 273 176 aa, chain - ## HITS:1 COG:BH1906 KEGG:ns NR:ns ## COG: BH1906 COG2207 # Protein_GI_number: 15614469 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 1 149 143 287 299 84 30.0 7e-17 MRELEQREEYYEVSVRGLFLALLADLVRAVLQIAGPDEKKDDQPPENALVIAPALEYIRC HYMEDFSMEYLAGLCSLSPAHFRRLFSAVMETSPLKYLNVTRIRQAAVLLRTTEASVLSI SEEVGYRSVSSFNRQFRDTMGQTPHEWRRRMSILKDQSVMKYRGFLVPETLTPDEP >gi|229784072|gb|GG667663.1| GENE 10 10123 - 10452 381 109 aa, chain - ## HITS:1 COG:no KEGG:bpr_I0234 NR:ns ## KEGG: bpr_I0234 # Name: not_defined # Def: AraC family transcriptional regulator # Organism: B.proteoclasticus # Pathway: not_defined # 1 109 1 109 320 137 56.0 2e-31 MPRHKKPYIEFRDYNLPARFPVLLLSGEEWRISDIPSPRLHIHNCLEIGLCRSDCGTMVF EDTKRPFKAGDVTVISCDVPHTTYSAPGTASRWSYLFVDLTGLLTPLAS >gi|229784072|gb|GG667663.1| GENE 11 10591 - 12741 1857 716 aa, chain + ## HITS:1 COG:no KEGG:PPSC2_c0917 NR:ns ## KEGG: PPSC2_c0917 # Name: not_defined # Def: protein # Organism: P.polymyxa_SC2 # Pathway: not_defined # 2 710 3 711 724 977 63.0 0 MEKKKGSFTLPGEAGYEELTLKLAERWGADVIRDSDGTELSEEITSAGYGIYSTICLIRD HNDWARAHRDQLQQTFLMTEPRTALGDELKIHLMEDFFAEQFEVNDSPEAFARWQVYDRT 
EDREVPRGFWHYEKESGTVVLKGARAFHDYTVSFLAYRIWEEISMYNHVTNNWEKEHLMP VDPIHEETREYLYGWLKEWCDTHPVTTVVRFTSLFYNFVWMWGSDEKKRNLFTDWGSYDF TVSPAMLAAFEAEYGYSLSAEDFINQGKFHVTHMPPSGRQLDYMAFVNRFVVDFGKRLVE LVHSYGKKAYVFYDDSWVGLEPYHDHFKEFGFDGLIKCVFSGYEARLCAGVEVETHELRL HPYLFPVGLGGAPTFCEGGDPVRDAKEYWIRVRRALLREKIDRIGLGGYLHLVEEFPDFC SYMETVADDFRRIKSFHEEGGVYTLPIRAAVLHSWGKLRSWTLSGHFHETYMHDLIHINE ALSGLPVSVEFINFEDVKAGCLEGVDVLINAGAAGSAWSGGMAWADSKVTGEISRWVYEG GVFLGVGEPSAAPGAAAYFRMAAVLGVDEDTGARVCHGRRTYELTESCGLIPAGASVRGR KGIFLTSADTEVLREEDGLPVITSHRFGSGTAFYLSSFQVTPENTRMLLNLMISGMGLEL NQNYLTDSSDTECAWYPASGKLVVINNSGKEQETVICTPSGRRTVQTEPFGMTVLE >gi|229784072|gb|GG667663.1| GENE 12 12826 - 14160 1484 444 aa, chain + ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 439 1 439 447 239 35.0 7e-63 MVKDLTVGKPSSVLWRFSIPMFVSVIFQQMYNIADSVVAGKFAGEAALAAVGASYPITMI FMAVAVGSNIGCAVVISQLFGAKRWKEMKTAIYTTLIACIALSLGLMALGFIFCHPVMNL IQTPEDIFSDGALYLRIYIGGFLFLYLYNVCTGIFTSLGDSRTPLYFLIGSSLGNILLDM IFVIVFHWGVAGVAWATFIAQGAACVLALLTVHKRIAEVKTEGSFPLFSWNMLKKISVVA VPSILQQSFISVGNMFIQGLVNSFGSPVIAGYSAAVKLNTFTITSFTTLANGLSSFTAQN IGAGQEERVKSGFKAGWFMAFCVAVPFTVLFFFFGAPAVTIFMEQTGGEAMNTGITFLKI VSPFYVVVATKLMADGVLRGSGAMGCFMIATFADLILRVLISFALAGPLGAKGIWMSWPI GWTIATVLSLGFYLKGKWKISRDL >gi|229784072|gb|GG667663.1| GENE 13 14222 - 14632 327 136 aa, chain - ## HITS:1 COG:all4540 KEGG:ns NR:ns ## COG: all4540 COG5485 # Protein_GI_number: 17232032 # Func_class: R General function prediction only # Function: Predicted ester cyclase # Organism: Nostoc sp. 
PCC 7120 # 54 119 56 121 137 59 42.0 2e-09 MEKKDIVKYFYETVVSENLLDEVPHYVSEDCVVKSGETLLPVGVSGMKQHLAEVKKTYPD YTMKIIKQYTDGDYVISEFIMEGTHEGEWIGIKPTHKRLSFTGVDIDKVIDGKIVEHGGA VNTFDTLYEHHLIKPV >gi|229784072|gb|GG667663.1| GENE 14 16612 - 16761 142 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622127|ref|ZP_06115062.1| ## NR: gi|266622127|ref|ZP_06115062.1| glycoside hydrolase family 2 [Clostridium hathewayi DSM 13479] glycoside hydrolase family 2 [Clostridium hathewayi DSM 13479] # 1 49 30 78 78 103 97.0 5e-21 MSWGTNDAGMMIWPYSLQETNGLLYADRTPKLDVERIAACMAGKEKQNG >gi|229784072|gb|GG667663.1| GENE 15 16875 - 21422 3472 1515 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1382 1513 492 625 744 85 36.0 8e-16 MKKQRKVNRCSAKAAAVILAAAMVVQTTTPALALAQLGEGADARSAVSTASQSNAAKGDK HSDTAPSRASGSNAEKLEKATFSNAMRKEVKNLHPNGSFEFTGKTTNTNYQKVWVNEIMP IGWSMWPTPTGSEKMSFEIVTDPEEAADGDNYLHIRSENTASRLGISYPIKNIADLADQY LCTFQLKAEDVKGTGFNGRLTWLKKSGSGNYIETAKITGTSDGWVQYGLISPDQPADATG IQVEFYANNLSGEISLDDVQILPTYKLSLNKTETTMLTGDTLELTATCSDGYDEEVTWSS KDPAIADVDENGVVTAYAMGMVEIVAKTDTFHEAICKISIEDGELRPYYEAIRDSWRDRL TGNSMEDTDDEDYKAMMADLTETAKKNWDTMEKGGNDRAFLWEDIDFTYQKQNTASNVTE GLGTGFSRIEQMATAYSAKGSSLYHDEDLKADIIGAMEWVYATMYNDTMDVTKDIYGNWW HWFIGMPQSLCNTVILMYDDLDPEFIEREAKTLENFNEDPNYRYHTTGGNRLPLDSANLI DTSLVSALRASIGETALPLNMAKDALEKNLGFTSEGNGYYEDGSYIDHGNLAYTGGYGAT LLGGIEKLLFITSDSPWEADSDKLTSIFQWIWNGIRPLYADGAMMDMTTGRGIARPSTNE HTVGKGLLKPISHLAAIAPEDERDAFRTFARTEVLAGLEYEEDFFIGMTVADMTAMKNLI TDNSLGKDDEIYHKNFGAMDKAVYHGENFDLGISMYSKRTGSFEYGNKENRKGWHMSDGA LYLYNKDQAHYADVYWPTVDPHRLAGITTDHTEGFIPSDDSWGPHVSTKNWVGGSSVLDQ YGSVGMDFEGELPNGGISSLKAKKSWFTFPNGAAALGAGITSNENKATETIVDNRKAMED AANTIWVNGTEAELTVGNPAVMPAKWALLEGNNGEQQNIGYYFPEETEITVLKETRTGNW KDINGAIVAGSANDKEIIRSYISLAVDHGTNPQDETYSYVLLPGKNADEMAEFAENPEIK VLANTSSVQAVENTAYGVSGYNFWTAYDGEELEVSAKTPASVTIAKEDGTVTAGISNPTQ 
NGKDTVILLNGFYEPAEMDKGVTVTTEDGRTVITIKGASDLGKTYTVVLESMDKKLADWQ NKLQNITREDAADVEVMIKELQNLDTENLNEEQKSLIDSLLETAGSKLVLINSVNEVMAP ISKVDEVTFENADAVEEYLKKYGELTEAEKAVASEADKAKIAALTAGFQTENVHTAAEGV TVKAVYGTVDVRTMLLLHDESAAADFYQETISEGETLLHVFKPALLLKDSILSLNDYPVN ITMKAPALASGEQGILKLYSFTSAGKARAAISASELSFTDNRNGTITFEAKYDGIYGFAL AKDTEPAPAPEPVPDKKPSGSSGNSSESVSDGVWVQDANGWWFNHYNGTWPAAQWKWIKG SWYHFNEAGYMQTGWILDKGFWYYLRPDGSMVADDWVFYKDHWYYLTRNGEMAANTSVTW KGIVYHLGADGVMEE >gi|229784072|gb|GG667663.1| GENE 16 21262 - 21576 84 104 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDWLAVLTALLASAPSPNCANAKAGVVVCTTMAAARITAAALALHRLTFLCFFINNLLLY FVLSNSGWPKLKGQYLCFISIRENKRRVHIRFFAFVVNIYLFVV >gi|229784072|gb|GG667663.1| GENE 17 21694 - 23487 991 597 aa, chain - ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 51 591 46 625 627 111 21.0 5e-24 MENRKIRILGIAPYEGMKTIMQNIAKTREDLELDVYVGDLNTGVEIARRNFHSDYDVIIS RGGTAQLIGQATSIPVIEVTLSVYDILRAIKLAENYADRYAIVGFPSITSCAHLLCDLLQ YQIDIFTIHNAEEVEEKLHTLKKNGYRMILCDMIGSTIARRLGLNAILITSGSESISSAF DQAVKLSSSYASLKDENQFLSDMIHGSGYETVVLHADGTVFFSTLEGQNKDEFSVLLKRE VPAILAEGSHKFFKTLDGSLYSFEAKKILFHQEPFAVFYFSKSSIPIATSKYGIQYTNRQ EAEDHFFNSFFSITSDSTDLRTVVDHLSQSTVPVMIFGEEGSGKEQIMRMIYSQSTRKNN PLITVNCALLNDKSWNFLTNHYNSPFNDNGNTIFLSNLPSLPESRRKHLLSVLVDMDVCK RNRIIFSCVCGHGHLMTAEALEYVNTLACVTLYVPPLRDRADEIPTLASLYLNALNTESA NQIIGFEQEALKLLQSYDWPCNYMQFKRILNELSLITTTPYIRTEHVASLLEKEKHNIAP VSSSPASRGLNFNRPLEEINRDIAALILKECSGNQSEAAKRLGISRTTLWRMLKNGI >gi|229784072|gb|GG667663.1| GENE 18 24591 - 25676 677 361 aa, chain + ## HITS:1 COG:STM0162 KEGG:ns NR:ns ## COG: STM0162 COG3395 # Protein_GI_number: 16763552 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 11 341 76 411 423 106 27.0 6e-23 
MVERAGRADIPYIYKKTDSALRGNVGSELTAALKASKAGSLHFFPAFPKMGRITQNGRHF VDEIPVEKSVFGRDPFEPVRCSYIPDIIRSQSDVRVVTAGDQNIWTEDDEPVIAIYDAAA DEDLYRLAQRLKESGRMSVMAGCAGMAAVLPELLGLTGEVPGRPVFCPSMLVACGSVNPI TKRQLDYAENHEFIRIRLKPEEKLNYDYYEHPEGKRRLSELLNICRRNPFVLLDTNDADD ENSTMELAEKRGFTLEEVRIRIADNLGSLVKRLLEAGLEKTLMVTGGDTLLGLMKQMEVS EMQPVCEIKNGIVLAQFTWNNECREIITKSGGFGSESLLAELAETMKNEKRENVVCRKNI S >gi|229784072|gb|GG667663.1| GENE 19 25655 - 26812 949 385 aa, chain + ## HITS:1 COG:AGpA199 KEGG:ns NR:ns ## COG: AGpA199 COG1454 # Protein_GI_number: 16119364 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 32 382 48 396 399 220 38.0 3e-57 MQEEYQLKLPGEIYAGSNALNNIKKVLGKRIKRVALFTDKGVENSGLLNKVKTILSESGV EMVIFHELPSEPTCDQVQETAERFRESGAEYIIAVGGGSVMDSAKLASVLSTGDYGVRSL LEEPGLARKGVPTLMIPTTAGTGAEATPNSIVAVPEKHLKIGIVSDEMIADAVILDGEMI RGLPRTIAASTGVDALCHAIECYTSQKANPFSNLFALEALDLILNNIEEACDNKNALEEK NRMLLGAFYGGVAITASGTTAVHALSYPLGGRYHIAHGVSNAVLLLPVMRFNESVCRKQF AEIYDRCVHGKKILEGEEEKSAYILAWLEEIVSHLEIPLHLSEFGVPASDLDTLTESGME VTRLLNNNLRTLTYEDARELYRTIL >gi|229784072|gb|GG667663.1| GENE 20 26842 - 27744 652 300 aa, chain + ## HITS:1 COG:MK1607 KEGG:ns NR:ns ## COG: MK1607 COG0329 # Protein_GI_number: 20095043 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Methanopyrus kandleri AV19 # 5 293 4 292 300 199 41.0 8e-51 MKQVKIEGIITPVITPMNADESINVAELRRQVNRQIESGVHGLFCFGTNGEGYILNGEEK KLVLETVIEECGGRVPVYAGTGCISTRETIEQSKMAEAAGADVLSIITPGFAAASQKELY DHYAAVAGAVNLPVVLYNIPARTGNALTPETVGKLGQISNIVGAKDSSGSFTNILGYIAA SENCDQFSVLSGSDQLILWTLKAGGTGGIAGCANVYPRTMASIYNCYQAGDWEGAKRHQE SIASFRACFRFGNPNTVVKTAVALLGYDVGRCRAPFNQLPEEGKEALKNVLQENEEKGMR >gi|229784072|gb|GG667663.1| GENE 21 27769 - 28830 438 353 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 6 
340 8 323 346 173 32 2e-42 MERKPVVGITMGDPAGNGPEITVKALLHEDVYTWCRPVVIGDGRIMEEAVKVAGHPKVRI HRITDLSEARFEYGEIDVYHMDLIKLEQFHYGTVSPMCGKAAFECVKKVIELAMTGKVDA TCTNALNKEAMNLGIASEGLHFDGHTEIYAAYTNTEKYTMMLAYHDLRVVHVSTHCSLRE ACDRVKKARVLDVIRIADRACKMLGIEKPRVAVAGLNPHAGENGLFGREEADEIYPAVEA AKLEGILAEGPFPSDSLFSKALGGWYDIVVAMYHDQGHIPLKTVGFVYDRAASSWKAVEG VNVTLGLPIIRTSVDHGTGFDQAGKGTSNELSLLNALEYAALMADTKMRQGAE >gi|229784072|gb|GG667663.1| GENE 22 28833 - 29810 534 325 aa, chain + ## HITS:1 COG:DR0133 KEGG:ns NR:ns ## COG: DR0133 COG0657 # Protein_GI_number: 15805172 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Deinococcus radiodurans # 30 287 39 247 296 98 30.0 2e-20 MKKEKPLCPPEKRPKYEDIVFAVVPAADGSSMELKLDIYQAPDQQEPGPCIIYYFGGGWM WGEYKQVTQKAVYCRDLVRLTEAGYTIVSPSYRLASQAVFPACIHDCKGVVRFLKANAEQ YHIDPDRIGVLGNSAGGHLAGMVALSSGNPEMEGDVGGNLEYSSSVRAAAIFYAPSDLVQ LIREASEKEEIKDLSGTEIENLASGDPSFGLEARIVGYSGPGRNLLRLSEVARKMDETDP DWKYIELAKKCSPVTYASSDCPPVIILHGGQDPVVPVSQSECLYKALTKAGAEAVYLSCS QAGHGPTVSAEADQFAYDFLKNHLN >gi|229784072|gb|GG667663.1| GENE 23 29826 - 31139 1124 437 aa, chain + ## HITS:1 COG:STM0057 KEGG:ns NR:ns ## COG: STM0057 COG3493 # Protein_GI_number: 16763447 # Func_class: C Energy production and conversion # Function: Na+/citrate symporter # Organism: Salmonella typhimurium LT2 # 4 437 18 446 446 220 31.0 5e-57 MEHLKKEKILGFPLPVVLLFCVIVFAAMYTGALGGDMMSTITFTLVLGLVLNRIGDAIPY ANKYLGAGLLLCMFVPSYMVYRGVITEQVVTNTANFFSFSGVQYMTFFIVLLMTNSMLNI EKKVLLQCIARFIPLLFITVIGAILGTALVGKFFGVSLGESISNFVLPIMGSGTSGAAAL SQIYNKASGLDAESYYGTVVVMVVLATTLCIFFAALLNLIGSVFPKLTGDKNTLLRAGKK GEEISNSPAEKMPEASLSDVISGLVLAGVVYAAAGICSKIIPEIKGVELHQYAYMILILI AVNYMGIVPDHIKSGLKRLTAACVNLVMPMIGVCIGMTLMSWESFTSAFTAQTLIMTIAC IVSACLTAAVFGHLLGMYAVDSAVTAALCQADMGGGADIGILVPSDRMGLIAYATVASRI GYAIILIMASILFPLIL >gi|229784072|gb|GG667663.1| GENE 24 31166 - 32335 518 389 aa, chain + ## HITS:1 COG:AGpA102 KEGG:ns NR:ns ## COG: AGpA102 COG4692 # Protein_GI_number: 16119301 # Func_class: G Carbohydrate transport and metabolism 
# Function: Predicted neuraminidase (sialidase) # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 36 359 35 384 397 201 36.0 2e-51 MDYPVLPGLRPDGKIYYNEEMGILEAELFPGGKTAHAPALLELSDGSMLCVWFTGTYEGS ADVSIVCSKLPAGAKSWKRPEPVSKDPERSEQNPSLFTGPEGDVWAVYTAQLDRMPGKDN MQYTSVIRCQKSVDGGETWGEAETLFSRQGSFCRQPIQILENGRWIFANWLCSDSLTGLA GDPTVFQLSDDQGKTWRMVRMPDSNGRVHANVVELGGGRLAAFMRSRSADFIYRSESQDW GENWTVPMATSLPNNNSSISAIRLKSGRIGIAYNPTHAANPIAGTAAWPGLRCPVAVALS EDEGLTWPVIRYMERGEGYAGPENRTNNRQYEYPYLMQGTDGRLHLAFAGKDRSCIKYMC FTEDDVMGKKREKEGLYNPTAGAFTDIKK >gi|229784072|gb|GG667663.1| GENE 25 32412 - 33368 638 318 aa, chain - ## HITS:1 COG:CAC1333 KEGG:ns NR:ns ## COG: CAC1333 COG2207 # Protein_GI_number: 15894612 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 30 296 21 280 286 77 26.0 3e-14 MKPYPDSVLNYEKGFSIIINELKTPVENSICYYFDYDARSYSVNMEFPHMHMFYEIMILL SPKAYHFIAGKPYSIINNDIVLLAPAILHQTEYLPGPPSDRIIINFMVPKHLLLHPGGYD TILSVFNDPVPIFRFDLERQKTLFQIINQLTDLSRTTEDPAVRNMMIHNKFTEFLFTLYQ MRKENLYEEYNSERSIKEKIYTITNYIQTHYYEDLSLAYLADKFFISSFYLSHQFKLVTG FTVVQYIQQTRVKNAQYLLLNTSLKITDIAEKIGFSSFSQFNRVFHRFCSMSPSEYKTFY NQNAMRVNAPMMSAKSNL >gi|229784072|gb|GG667663.1| GENE 26 33534 - 34760 995 408 aa, chain + ## HITS:1 COG:no KEGG:CLJ_B0047 NR:ns ## KEGG: CLJ_B0047 # Name: not_defined # Def: major facilitator superfamily protein # Organism: C.botulinum_Ba4 # Pathway: not_defined # 21 399 1 358 369 119 26.0 2e-25 MDKRMKMWKKAVPFLCFAVLMFGLGSSDSLRGIFAPVFQNRYALSDPQLSMIVTISYVGN LLFLSLGGKLLDTFGRKTVALSVTGIWVLSMLLNVVSDSYPCILISMFFALGASTLLNTT VNILTPVVFSGYAGLMVNIFFFIQGIGTSGSQFVLGRYGLSYSGFKGVSLLLLLIGVAAG ALLLLTPLDGEKKEKGNPSKAVPFKGQQSEMTQSKEGKMKQNPGKAVFILLCIMIGFYFI GEHGIMNWLFSYCIQAFSVPGGKASVYLSLFWGGMTAGRLIFAPVVQKLGTVKSIRIFGM IGTLLFCAGILTGEKGILLLSISGLAISVLYPTMVLLIQQIYPTEVAATRTGAIISVATV ADIVFNLLFGVITGVIGYRLSFMILPVCMAGFYISYMVLQNNIWRNAR >gi|229784072|gb|GG667663.1| GENE 27 34757 - 36328 1335 523 aa, chain + ## HITS:1 COG:no KEGG:bpr_III022 NR:ns ## KEGG: 
bpr_III022 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 523 1 527 556 321 35.0 7e-86 MKRYINHLMKGAEIRTESFLRSQDKNPESIQYGGMHGDVVEAKTTIYHMATALSVYFKEE SRFYQSEVLYDAINMACGFVRRMQRENGSFDYPTCNFQSAADTSFCFKRLILGYRLIDKY GNGDEKAETLKKKYLVIMHDALKAVRDGGFHTPNHRWGITAALLQGADLFAEETDFADSL RKRAAQYLAEGVDEDEDGEYAERSTGNYNAVVNNAMFAIGQETGEELYYDYARKNLKMML LYIDPDDTIFTQNSSRQDKGRAEYADKYFYQYLYAACLGDDGVFDAAAHKIIRDNMERGD IAPDCFPVLMTHDEMMNHEFQGYGFVESYRKYFKHSGVLRVKKEKYTYTVLNGKSSFLFF KAGSTAVCVKIGESCCEKRNFIPKTMEVNDHGCVLSEAFDSWYYLPFDEPQGTSDWWEMD QSKRKKLLPERLITRLEIRERDEGLELHVKADGLKGLPLRLEIGIPAGTTIENEHFYMKA EAGSGMILRNGYLQVKETDRELVMGPGFGSHEFKGHYSGELAS >gi|229784072|gb|GG667663.1| GENE 28 37420 - 38565 1290 381 aa, chain + ## HITS:1 COG:STM1911 KEGG:ns NR:ns ## COG: STM1911 COG4225 # Protein_GI_number: 16765253 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Salmonella typhimurium LT2 # 53 371 56 373 379 301 44.0 1e-81 MMMKNSLEEKKDMLLEEIAEKLECLINGFEKVLYDDDDIFLQNMKTDNLAGDDVQKYRFW EWTQGVGLFGLWKQFEENRDEKALAMLKRYYDERIADGLPSCNINTTAPMLTLTYLYDYT KNETYWETCRDWLKKVLTELPRTEEGGFQHLTSDTLNEQELWDDTLFMTILFIARMGKLM NDRSCIEEAQYQFELHKKYLEDRKTGLWFHGWTFKGRHNFAEGLWGRGNSWVTIAIPELI EILGADCEPVIKRLLTEALMNQVEALKTYQADNGMWHTLIDDSDSYLETSATTGFAYGML KAAHMGLIDESFAECGMKPLEAVMDYISGDGVVGQVSYGTPMGRDGKQFYKDIEIRPMPY GQAMAILYLIEARKELTREVK >gi|229784072|gb|GG667663.1| GENE 29 38565 - 39476 930 303 aa, chain + ## HITS:1 COG:lin0760 KEGG:ns NR:ns ## COG: lin0760 COG1175 # Protein_GI_number: 16799834 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 20 300 13 292 296 191 35.0 2e-48 MANTNTAVPAPARRKKRDTRQLWAFGFCLPNILFFLVFFVAPAIVGVWYSLTNYNGFKQM DFVGLSNYMKLFRDPEFYKTLWQTILYSIVSVPLGYAVALGLGLLLSSEKIKGITILRIL 
IYWPILLSTIMVGLTWRWIFGESFGLINYLLQCIGLDGIKWATNPTAAFITTIIAGIWSG CGTNMLIFIGALKQVPGELLEAANLDGANKWQTFKNIILPHLKPVSFMVIILSVISSFKV FAMVQTLTNGGPGTATTYMIQYIYTTGFTKNKVGYASAVSMVLFVILLILSFVQTKVSDK SND >gi|229784072|gb|GG667663.1| GENE 30 39491 - 40309 752 272 aa, chain + ## HITS:1 COG:TM0430 KEGG:ns NR:ns ## COG: TM0430 COG0395 # Protein_GI_number: 15643196 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Thermotoga maritima # 4 271 2 270 271 209 44.0 5e-54 MKTVNKVLKYVFAFGFAILWIFPVIWLIISSFKDGSELFNYPLTILPEHFTLENYTTALQ QFDLIRYTSNTAFITVAATVITVLMSCMCGYALAKYKYKWLNVLFMCLLATTMLPTEVIM SPSFTVLNTLGLYNSLWGCIIPTVGTMTGVFLMRQFFITVPTELMESARIDGANEGSIFV RIMLPICKPQIAILAIFSFQWRWNDYIWPLLALSDPKKYTLQLALRTLSGAEAVNWTVLL STAVLSMIPILIVFIVFNKQIMNANMSSGVKG >gi|229784072|gb|GG667663.1| GENE 31 40399 - 41661 1323 420 aa, chain + ## HITS:1 COG:TM0432 KEGG:ns NR:ns ## COG: TM0432 COG1653 # Protein_GI_number: 15643198 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 127 349 128 351 423 73 26.0 9e-13 MKKRLLTGAVCLSLAGMTMLTACSGKESAKKGEVELEILLADDTLEGGAMKTIVEKFNEE HKDQGVTAFVNEIAYADMETQIKNRAKAGNLPAICKMSNFDTYIDYVLPLDDTSLNPDNF NRNGVRNGKLYGTGVNDTAVGMIINKTAFEAAGVSYPTASEDRWTWDEFEAAVKEVAEKN EKITTPFVIDHSQQRSGTMLYQFGMQYFDPADSTRVAFRSDDTKKGIEFFLDMFKDGGIS KASIGTGAENAQDVFKTGTVAAHMAGNWVISDYAQNIKDFEWTPVLMPYEKEKATCLGGN WLYAFDGSGVEDQAKMFLEWFYQPENYALYCQTGNYLPGTKDVEVSYDIEGINVFNTEIA ESIGQTQYDQSITDQEHSGESVGNALRDALDRAIAGELDADGVMDYVVTQYLENLTGVHE >gi|229784072|gb|GG667663.1| GENE 32 41756 - 44209 2124 817 aa, chain + ## HITS:1 COG:no KEGG:Sterm_4037 NR:ns ## KEGG: Sterm_4037 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 10 787 11 786 806 493 35.0 1e-137 MELTYQIGYDETISRWNVTAMGKEPFHSPKKTFEAGVNMEESYVQVVYPVREAFLKEERI RKAVRYDGSYESLYFPFENNRIDLSTFFHTPQYIYVHAKAGVRVQEEGCYPFEIYTCGGV 
RVWVNGEEQACYTPYTRNIAGHTRVNLSLRKGLNEIKVYADELAERDVFFYFELRYKGVT PVEGSVEVTEQPEKIQKAEQILKSCYFEQDMYTEGEVRLCYDRTLLDGDTTVYLTSTPCG TGMQLSEIEHTVMTLKKDKDYLVVAETEKSSIRLSRISVCLTVGGYVIPRNLFVGVIPKK RIVLEPAGTIGERKQQALEFLVKNGEIGFQSVITSLELLKGWNDNAEKGFDMACKKIERH DDCADFSLAPFSLLMTRYKHLLTPEKLERIRDMVLNFRYWIDEPGNDVMWYFSENHAFLF HVSQYLWGSVFEKELFTVSGRTGAEQYEIGRKRVLEWFDSFFSYGYAEWNSATYIPIDLI GFFSLYLSAPDGEIRQKAEKALDFTMQIIGYNSFEGVMNTTYGRIYEETIKTRLQVEPNF VSWVSAGRGFCTYYGNATCLYAISDYEPEDYEMECRPQPGQGVVMEMDQGIAGVKIHTYR TGEYLTAGVRRFKPFRHGHQQHLMNVVFGKERPAIFYVNHPGERVFSGENRPSYWAGNGT MPWIERYCNVTVMLFAIDPDELVHYIHAYTPVYEYEAYVCGGNWFFAKSGDGYLGCWFSN GYEMAAYGANTKKELISEGLNHAVIVKCGSKEEFGSFEVFQENLKEMDIEWDGDRSIAFA DVQYGEMRVRDAEEFLVNGTAVGVEPVEGVRFLRREL Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:14:06 2011 Seq name: gi|229784071|gb|GG667664.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld57, whole genome shotgun sequence Length of sequence - 31740 bp Number of predicted genes - 30, with homology - 29 Number of transcription units - 12, operones - 6 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 334 275 ## gi|266622145|ref|ZP_06115080.1| conserved hypothetical protein 2 1 Op 2 38/0.000 + CDS 372 - 1238 686 ## COG0395 ABC-type sugar transport system, permease component 3 1 Op 3 . + CDS 1243 - 2121 617 ## COG1175 ABC-type sugar transport systems, permease components + Term 2179 - 2222 7.7 - Term 2255 - 2315 3.2 4 2 Op 1 . - CDS 2357 - 2785 344 ## Athe_0204 hypothetical protein - Prom 2867 - 2926 4.2 - Term 2800 - 2846 3.0 5 2 Op 2 . - CDS 2949 - 3206 246 ## Mahau_1303 LacI family transcriptional regulator - Prom 3251 - 3310 80.4 6 3 Tu 1 . - CDS 4154 - 4873 597 ## COG1609 Transcriptional regulators - Prom 4980 - 5039 5.4 + Prom 4977 - 5036 5.6 7 4 Tu 1 . + CDS 5083 - 5778 701 ## COG0684 Demethylmenaquinone methyltransferase + Term 5829 - 5871 6.3 + Prom 5811 - 5870 1.9 8 5 Op 1 . 
+ CDS 5908 - 6687 565 ## Rcas_2201 xylose isomerase domain-containing protein 9 5 Op 2 . + CDS 6718 - 7653 928 ## COG3246 Uncharacterized conserved protein 10 5 Op 3 . + CDS 7678 - 8868 1334 ## COG0183 Acetyl-CoA acetyltransferase 11 5 Op 4 . + CDS 8906 - 10240 1446 ## COG3395 Uncharacterized protein conserved in bacteria 12 5 Op 5 . + CDS 10256 - 10993 233 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 13 5 Op 6 . + CDS 10956 - 11303 172 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 11320 - 11376 10.1 - Term 11302 - 11368 19.5 14 6 Tu 1 . - CDS 11373 - 13202 1754 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase - Prom 13354 - 13413 6.1 + Prom 13359 - 13418 7.0 15 7 Tu 1 . + CDS 13487 - 14881 1278 ## Closa_3823 ErfK/YbiS/YcfS/YnhG family protein + Term 14897 - 14949 8.3 - Term 14888 - 14934 5.3 16 8 Tu 1 . - CDS 14956 - 15834 796 ## COG0583 Transcriptional regulator - Prom 15920 - 15979 6.2 + Prom 15884 - 15943 6.5 17 9 Op 1 9/0.000 + CDS 15998 - 16666 537 ## COG1760 L-serine deaminase 18 9 Op 2 . + CDS 16734 - 17582 908 ## COG1760 L-serine deaminase 19 10 Op 1 1/0.000 + CDS 18805 - 19701 739 ## COG3475 LPS biosynthesis protein 20 10 Op 2 8/0.000 + CDS 19698 - 21002 1353 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 21 10 Op 3 . + CDS 21005 - 21700 781 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 22 10 Op 4 . + CDS 21719 - 22111 422 ## Closa_3819 hypothetical protein 23 10 Op 5 1/0.000 + CDS 22092 - 23744 1580 ## COG4713 Predicted membrane protein 24 10 Op 6 . + CDS 23795 - 24754 833 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 25 11 Tu 1 . - CDS 24950 - 25057 62 ## - Prom 25283 - 25342 7.6 + Prom 25477 - 25536 6.0 26 12 Op 1 . 
+ CDS 25659 - 26213 776 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 27 12 Op 2 26/0.000 + CDS 26231 - 27040 950 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 28 12 Op 3 . + CDS 27069 - 28409 1648 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component 29 12 Op 4 . + CDS 28411 - 29193 921 ## Closa_3802 Methyltransferase type 11 30 12 Op 5 . + CDS 29180 - 31739 2678 ## COG1216 Predicted glycosyltransferases Predicted protein(s) >gi|229784071|gb|GG667664.1| GENE 1 2 - 334 275 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622145|ref|ZP_06115080.1| ## NR: gi|266622145|ref|ZP_06115080.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 110 1 110 110 197 100.0 1e-49 EKQACCEFLESIVSPDFYKGIIESGSMLYTGKVDYDREKAPEATNMLYDRFQSGVQMAPS MDCVMDQAMLDAVVNSVPSALINDTMTVEEAVNVVAQVGQESYEDYMASK >gi|229784071|gb|GG667664.1| GENE 2 372 - 1238 686 288 aa, chain + ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 24 287 37 299 300 165 34.0 9e-41 MSKKQEGQTSIEFRYLSRFSKGVVYFVLILCAVCVLYPLFWLLSASLKSPMDLAVNVWGL PKKIVFQNYIDAWQVGKMGHYILNSVKVALVSIATTLICSSMLAFILARLRFKGRSIIYY GILLGLMIPIHAIIIPLYITTRDLGLHNNLYCIGLVYAAFQIPFSVFVLRGFMAGIPASL EEAAILDGCGVIRVFLYIILPLTKEGLITIAILTLMSAWNELLVAMLLISNVSLKTLPLG LIGFITEYASKYAELCAGLIIACLPNILFYAVCQEKMIKGMTMGAVKE >gi|229784071|gb|GG667664.1| GENE 3 1243 - 2121 617 292 aa, chain + ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 14 292 11 291 292 142 30.0 7e-34 
MRLKGKKLYLTVLLFAAPGLCLYLFLVLFQIIQGIEMSFFSWPTLTEKVFVGIQNYKRIF TNVYFWRSFKNTLLYMLLTSALQLSIGFFVGYLLYLQMRGYKFFRLVVFLPAVLAGTAVA FIWQYMYSPAFGILKPLFELLGISEYYIPPLSSEKLALVSIIFAQVWGGVGIQIMMFNSS FMRIPQDVVEAAIVDGVKGFGMIKHIMFPLSLDVGKVIIILQVVGSLKAFDLVYVMTKGG PNHATELLGMTVYKTAFERFKFGEGNAVAVVIFALCLLMTLVLQKVFKTDND >gi|229784071|gb|GG667664.1| GENE 4 2357 - 2785 344 142 aa, chain - ## HITS:1 COG:no KEGG:Athe_0204 NR:ns ## KEGG: Athe_0204 # Name: not_defined # Def: hypothetical protein # Organism: A.thermophilum # Pathway: not_defined # 14 142 16 144 146 110 44.0 2e-23 MRTYTTNQYGRTVIIHLGKGEKLLESLTEEIKRLHLKNGILLSAIGSLRKASLHVITSTD DWPVNQFITVEKPIELGAAQGLIINGEPHFHLVISEENSLYAGHMEPGCEVQYLAEFAIL ELLDVDLTRKTDTFGISYIDTL >gi|229784071|gb|GG667664.1| GENE 5 2949 - 3206 246 85 aa, chain - ## HITS:1 COG:no KEGG:Mahau_1303 NR:ns ## KEGG: Mahau_1303 # Name: not_defined # Def: LacI family transcriptional regulator # Organism: M.australiensis # Pathway: not_defined # 1 85 253 337 347 79 43.0 3e-14 MTIGALKALKQLNLRIPEEISIAGFNGIDHLELMETRPTVADYDPYKIGKAAGKAILERI KDNSIDNREYIFSPSLIRGNAVASV >gi|229784071|gb|GG667664.1| GENE 6 4154 - 4873 597 239 aa, chain - ## HITS:1 COG:RSc1014 KEGG:ns NR:ns ## COG: RSc1014 COG1609 # Protein_GI_number: 17545733 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 7 238 3 231 347 148 36.0 1e-35 MGKQTLTIKDVAEACGVSIATVSRVLNDNYYVTPDIKNRVLDTVNRMGYIPNSAARGLKL NTSGIIGYITSDISNGYHIMIAKAVEDMIRPANYNLVVCSTGNNAEAERQYLKLLVGKNV DALVLNTCGENDKFILKLNQTLPMVLVNRRLNTPGFHGDFADCNNVLGMYLLTRELIDHG HRRIYLIEGPERLSNTAERLEGFSKAMEEIGMTEINSYPYRYHGDYSLESGIDAIRKLS >gi|229784071|gb|GG667664.1| GENE 7 5083 - 5778 701 231 aa, chain + ## HITS:1 COG:SMc00502 KEGG:ns NR:ns ## COG: SMc00502 COG0684 # Protein_GI_number: 15965518 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Sinorhizobium meliloti # 33 214 35 196 224 76 29.0 4e-14 MNSENQELLKLFEPLRVADVRDGMDWMGYHHYGTMDYKIRPLYRTKAVGIARTARYLPYE 
GPDPEVKGDAYTEWSNWYYGTICTYPWFNEIEEGDFIAIDISGVDVGLIGSENGLGAIIK GARGFVTNGGGIRDTDEVILQKIPVWSYFVSQKMDQVRVQFDAKDVPVAIGGVTVHPGDV IVADGDGVIVVPRKIAKDVAKYVHKELYKDKNARREKYETLGWELDDTVIN >gi|229784071|gb|GG667664.1| GENE 8 5908 - 6687 565 259 aa, chain + ## HITS:1 COG:no KEGG:Rcas_2201 NR:ns ## KEGG: Rcas_2201 # Name: not_defined # Def: xylose isomerase domain-containing protein # Organism: R.castenholzii # Pathway: not_defined # 52 199 52 199 258 96 35.0 8e-19 MQVGCGLVSVRDIGPASEAGFDYIEFMGKYLVSASGREFLKILNTITKYDIRCRGLNGYC PEEIIIAGPGFDLGKARRYAKTVADRAKELDVVVVGIGSPSSRKLPEDFDRKIARKQLKD FLKVTAEELGRYGITVCLEALAPCYCNFVNTVEEAVEITREIGWSSIRTVLDFYNMEYSG EADSDVKPWLNEIAHVHISDDAGSPFQRDFLREEKSGIHKRRLYRLFQAGYRGLVTVEVD LPVNGPKSFNSLHVIKNAY >gi|229784071|gb|GG667664.1| GENE 9 6718 - 7653 928 311 aa, chain + ## HITS:1 COG:mll0106 KEGG:ns NR:ns ## COG: mll0106 COG3246 # Protein_GI_number: 13470407 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 6 311 4 309 309 260 45.0 3e-69 MSKQQRKVIISAAITGGVHTPTMSPYLPKTPDDIIKNSIDAYKAGAAVVHIHARKENGEP TADHKTFEYILSSIKKECDVIIGLTTGGAQGMTVEERLSVIDRFRPEMASANGGSINFCF SRLADTMENPEYDWEKPFITRTYDNIFKNTFKDMEYCITTMNKCGTMPEYEIFDYGQLSN LAYFKKKGVITQPIYLQFVPGIMGGMPISQEGMMFMIDQSKKILGNDIQFCSVGAGRRMF RTETLCALNGGNVRVGLEDGLYIKPNGELAVDNAAQVTKIRRILEDLDFEIADTSDAREM LHLKGIDQVGF >gi|229784071|gb|GG667664.1| GENE 10 7678 - 8868 1334 396 aa, chain + ## HITS:1 COG:CAC2873 KEGG:ns NR:ns ## COG: CAC2873 COG0183 # Protein_GI_number: 15896127 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Clostridium acetobutylicum # 3 394 2 390 392 354 49.0 1e-97 MFQDVVVASAVRTPVGSFGGSLKDVSAIDMGAMTIKEAANRAGIPASMVEEVVMGCVGQY GYNPFLARLAGLKAGCSVESSGQTVNRLCASGLQAIVTGAMTVDHHDALICAAGGTENMS SYPYSSFTNRWGNRMGNTELKDNLTMALGEPVAGMNGKDIHIAITAENIAEKYGITRQEA DAYALEGQKRAKRAIENGTFREEILPVEIRVKKETKVFDTDEHPRETSLEKLGKLRPIVK SDGVVTAGNASGINDAAAAVILMSGSQAKEMGITPLARVIDYAVAGVDPDYMGIGPVAAT 
RKLLEKQKMELSEIGLFELNEAFATQAIACIRELGLDPEIVNVNGSGIALGHPIGATGAV ISVKLLYEMKRRKVRYGIASLCIGGGQGLSVLFESM >gi|229784071|gb|GG667664.1| GENE 11 8906 - 10240 1446 444 aa, chain + ## HITS:1 COG:alr3454 KEGG:ns NR:ns ## COG: alr3454 COG3395 # Protein_GI_number: 17230946 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 5 442 6 437 441 256 37.0 7e-68 MEQFKIIVLDDDPTGIQTVHGVCVYTNWSEESIRAGFEEENPMFFILTNSRGFTAEETRS VHQLIGRRIVKISKEQKVPFVLISRGDSTLRGHYPLETETLRETLEAEGITVDGEILCPF FPEGGRYTVGDVHYVAEGTRLIPAGETEFAKDRTFGYKSSDLKEWIEEKSGGRWKKDEVL SVSVAELRGKTLKGTEEKLLSLTGFQKLIVNAVEYRDVERFAEVCKACISRGKTYIFRTA AAFPKVLGGITDQKLLGKEELVEQDNPHGGLIIAGSHVTKTTRQLEELKKDERIRFIEFN VALVTDEEKFRREQERVIRETDEALAAGVTAAVYTSRKRLDLDTGNREDDLRLSVKISDA VTGIVPALSVRPSFLIAKGGITSSDIGTKGLRVKKAMVAGQILPGIPVWKTGEESLFPHM SYVIFPGNVGSDTALYDAVQKLIG >gi|229784071|gb|GG667664.1| GENE 12 10256 - 10993 233 245 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 6 245 4 242 242 94 29 8e-19 MRFEGKTAAVTGAAQGIGRAIAKRLYDEGAKVALLDISDEKVKEAAKQIDKEGVRAVGFG CNVADQASVNSVMEEVEKQLGPVDILINNAGITRDGMFHKMSTEQWNQVIDVNLNGIFNC TRAVITGMRQRKYGKIVNLASVSAFGNMGQTNYGASKAAVIGFTKCLAKESARAGITVNA IAPSYINTEMLQAVPEDVMQKFIAAIPSKRLGTPEELAAVAAFLSSDDSSFVNGECIVVS GGSYM >gi|229784071|gb|GG667664.1| GENE 13 10956 - 11303 172 115 aa, chain + ## HITS:1 COG:lin0583 KEGG:ns NR:ns ## COG: lin0583 COG2723 # Protein_GI_number: 16799658 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 20 115 369 464 464 117 52.0 5e-27 MSASSYQAAPTCNSLLWRKGKQEEERFLDENGVVEGNYRIAFIRDHLKWLHKGITEGSNC FGYHLWCPFDCWSWTNAYKNRYGLIRVDTEDGCSLQPKRSAAWFKQVSSSHKVSD >gi|229784071|gb|GG667664.1| GENE 14 11373 - 13202 1754 609 aa, chain - ## HITS:1 COG:mlr5890 KEGG:ns NR:ns ## COG: mlr5890 COG0079 # Protein_GI_number: 13474906 # Func_class: E Amino acid 
transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Mesorhizobium loti # 236 606 65 434 449 162 28.0 1e-39 MQAIILAAGMGKRLKEYTAENTKCMVKVNGTTLIERALRILDKKQLSRIIIVVGYEGRKL MDYITTLGIQTPVCFINNKVYDKTNNIYSLALAKESLISEDTLLFESDIIFEEALVDILI EDPRETLALVDKFASWMDGTCMVLDDDDSIKDFIPGKYLKFEDKGGYYKTVNIYKFSQHF SAKTYVPFLEAYAKAMGNNEYYESVIKLIALLETKEIKAKRLSGQVWYEIDDIQDLDIAE SLFADTIEKRYDAIASRYGGHWRYPKLLDFCYLVNPYFPPQKMLDEMKSNFEILISEYPS GMRINSLLASRNFGVRQEHIVIGNGAAELIKVLMEHTEGPTGFIRPTFEEYPNRYDSALS VVFTPDNEDFSYTAEDLIEYFGSPDHAVRTLVLINPDNPSGNYIDSEGLKKLIDWGRKNS IRLIIDESFSDFAELPGDTEALDLSLINEEVLHSYSGLFVVKSISKSYGVPGIRLGVLAS SDEAMISLLKREAPIWNINSFGEFFMQNSEKYKKDYRESLKKLAADRRQLVKELDSVPYL RPIPSQANYVMCEVTEGRTSREIACELLKHDIFIKDLTPKLHNGRQYVRIAVRNGEDNRR LTTALKDLI >gi|229784071|gb|GG667664.1| GENE 15 13487 - 14881 1278 464 aa, chain + ## HITS:1 COG:no KEGG:Closa_3823 NR:ns ## KEGG: Closa_3823 # Name: not_defined # Def: ErfK/YbiS/YcfS/YnhG family protein # Organism: C.saccharolyticum # Pathway: not_defined # 46 412 117 482 483 281 42.0 5e-74 MLQALKKAAAVAALAAFMVLGCMTAFADDEIQNPPSPHVVYQVFTNGQWKNEAMDRQPAS DSASPVEGLKVWLTGVDGTISYRVYLHDGGWQGWVSDGAVAGGENTGNQIEAVQMKLGGY AGQVLDVWYRATVSGREKLGLAKTGSPAGTVGEGAPLTSLEAYLTEVNHNESDGTPAVRT NFPFGFYDDNGVTRYTDVPGGIYTGWVDHDGVRYYVRDNSFLTGWQYIDGLKFYFDEQGK LVQDLDPIIGIQSSYVIKINKQLNTLVPYAKDGDNGYIIPVKSMLCSVGDDTPLGTFHTP EKYRWRLMVNDTYTQYATRITAGQGFLIHSICYDKPDIYTMQSVGYNGLGVVRSLGCIRL TAQNAKWVYDNCPIGTTIEIFEDPLNVGPYYKPTITPIPVEQTWDPTDPLVSEDVKNQAA AEQQEEAARRQAEEQAAAEQRAAEEAARAAEEAAKQEIGPGYGL >gi|229784071|gb|GG667664.1| GENE 16 14956 - 15834 796 292 aa, chain - ## HITS:1 COG:CAC0023 KEGG:ns NR:ns ## COG: CAC0023 COG0583 # Protein_GI_number: 15893321 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 289 1 289 299 170 34.0 2e-42 MELRQLSTLIRAAQLQSFSKAAESLGYSQSAVTVQIRLLEEELGTRLFDRIGRKVTLTPQ GSRFLTSAYDILYEVNRAKLSVSGGAELENPLHVGTIESLCFSKFPSILRYFRENHPKVA 
VQITTAPPEELIEMMERNQLDLIYILDEPRFNSNWYKVMEQKEEIVFLSSPSFPLADKKE IRLEELLGEPFFLTEKNANYRKAFDRYLASKNILLTPFLEISNTEFILKMIEENKGLSFL PRFAVEESAESGHIRVLDVIDFHASMYRQIFYHKDKWKTREMDEFIRLASLE >gi|229784071|gb|GG667664.1| GENE 17 15998 - 16666 537 222 aa, chain + ## HITS:1 COG:lin1927 KEGG:ns NR:ns ## COG: lin1927 COG1760 # Protein_GI_number: 16800993 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Listeria innocua # 1 220 1 218 220 187 43.0 1e-47 MSFISVFDVLGPNMIGPSSSHTAGTAVIAYLAQKMINGPLKKVEFTLYGSFAKTYRGHGT DRALLGGIMGFSTDDTRIRDSFRIAEERGLEYHFIVNETDTDIYPNTVDIRMENEAGQVM TVRGESLGGGKVRIVRINQVKVDFTGEYSAVIVVHQDKPGVAAHITKVLSDCSVNIAFMR IFREAKGHTAYTIVESDNRLPGNITETLRENIHVHDVMIVQS >gi|229784071|gb|GG667664.1| GENE 18 16734 - 17582 908 282 aa, chain + ## HITS:1 COG:CAC0674 KEGG:ns NR:ns ## COG: CAC0674 COG1760 # Protein_GI_number: 15893962 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Clostridium acetobutylicum # 1 276 1 276 290 229 48.0 4e-60 MDFKNARELLELCSENQCPISAVMRERECDQGETTAEEADDRMRRVLEIMRSSAYEPIEK PGKSMGGLIGGEARKLSARRENGKVVCGTVMSRAITYAMAVLEVNSSMGLIVAAPTAGSS GIVPGLLLALQEEYGLADDRLIDALYNAGAIGYLAMRNATVAGAVGGCQAEVGIASAMAA SAAVELMEGAPEQCLYAASTVLMNMLGLVCDPVGGLVEYPCQNRNASGAANALIAAEISL SGIRQLIPFDEMLATMYSVGKKLPSELRETALGGCASAPSAC >gi|229784071|gb|GG667664.1| GENE 19 18805 - 19701 739 298 aa, chain + ## HITS:1 COG:SP1273 KEGG:ns NR:ns ## COG: SP1273 COG3475 # Protein_GI_number: 15901133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 14 269 10 254 267 125 32.0 1e-28 MAGFYEPEILSRVQQMELEILKDVMELCDSHGLLCFGMAGTAIGAIRHKGFIPWDDDIDV AMPRKDFECLLKLVEEKLGDKYYVLNWRTSENYPLMTTRICRRGTRFVDSPMRDVDSPLG VFLDMYVYDDIPDSGVLMKLQAWEAWFFSKLLILRSIPRPYLAQTGWKAKLITAVCIAVH HVLKGLCISKRGLASRCEKICRRYEGKNTKRMAFYPDTNPFWNVVEKAAFSPGLKLDFEG FKLVFPRTEHDILTYMYGDYMQMPPIEKRKTHYPYILDLGDGTVLSEETVTDRGMKKT >gi|229784071|gb|GG667664.1| GENE 20 19698 - 21002 1353 
434 aa, chain + ## HITS:1 COG:SPy0797 KEGG:ns NR:ns ## COG: SPy0797 COG2244 # Protein_GI_number: 15674839 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Streptococcus pyogenes M1 GAS # 26 429 14 419 428 157 30.0 3e-38 MKWLRQRMFGSLSTGTAAIDRQNIAWNMIGSLIYAGSSMILTALVNHLIGPEQGGIFGFA FSTFGQQMFLVAYFGMRPLQSTDVNQAYTFGEYRRVRLYTCMAAVLFGISYIFVNTYFVP NGYSRQKTLVVFLMVVYKVLDGFADVYESEFQRAGRLYLTGKAMAYRTILSVILFLGTLF LTRELVFSCVIAVLAQAAGILLFDWNLMKLVPGVDISRAAGKSLALVRESFLLFLSVFLD GLIFAMAKYAVDAQMTATDTAIFVAVFMPTSVINLAANFVIRPFLTKMSYQWEGKKFDEF TRDLKKLMAIILILTIIALAGAWAIGVPVLGAISNVKLAPYKSGLLYIILGGGFFAVMNL FYYVLVIMKCQRRIFFGYVPVCILSAILSFLLVKTGGINGGALSYMLEMLVLMMCFMGQA FYVFHRTKKQQGGR >gi|229784071|gb|GG667664.1| GENE 21 21005 - 21700 781 231 aa, chain + ## HITS:1 COG:SPy0794 KEGG:ns NR:ns ## COG: SPy0794 COG0463 # Protein_GI_number: 15674837 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pyogenes M1 GAS # 3 231 2 231 231 174 41.0 1e-43 MDKVLVIIPAYNEALNIERVVTRLQEEYPIYDYVIVNDGSADETAAICRSRGFQLIDLPV NLGLAGAFQTGLKYAYRRGYRYAIQFDGDGQHRPEYIGAMREKMDEGYDIVIGSRFVTEK KPNSMRMLGSRLLSGAIRLTTGVSVSDPTSGMRMFNERMIREFAEGLNYGPEPDTISYLL KQGARVAEIQVTMDERILGESYLKPVSAARYMGKMMFSILLVQNFRKRGRR >gi|229784071|gb|GG667664.1| GENE 22 21719 - 22111 422 130 aa, chain + ## HITS:1 COG:no KEGG:Closa_3819 NR:ns ## KEGG: Closa_3819 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 116 1 116 123 139 72.0 3e-32 MTVILRCVLIVVSILLTFFVLKKIRQSKVKIEDSIFWVMFALMMVVFSIFPGLADILSDF VGTYSTSNFIFMFVIFILLVKVFFLSLKISQLESRVTELIQQLALDRKEEADRKRAADRE EEARDESRYC >gi|229784071|gb|GG667664.1| GENE 23 22092 - 23744 1580 550 aa, chain + ## HITS:1 COG:CAC0024 KEGG:ns NR:ns ## COG: CAC0024 COG4713 # Protein_GI_number: 15893322 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: 
Clostridium acetobutylicum # 90 529 10 426 456 141 27.0 3e-33 MKADTVKNKKKISFQTAKEIAGQSGLWMAAATILFLIVVFVKPYHSVIDAIPSPDYESVR LFYIWILRLGILFLAVLAGILWVYGKRPGWLFPVTVICLGFFYMMILPPFSAPDEPRHFV SAYRLSNQIMGKEAVADESFLLSASDQEKEYIARGGNVLVREQDGREEPYSSVGRDSYAL MLRELFTKNTGGGTTVRFEAPVNTTPVVYLPQALGISLARVLHLGYIPLIYMGRLFNLLA FAGMAWLAVRIMPFKKELFMAVSLLPMTLHLAASLSYDAALIGLSLLFFAWCFYLAYEKK QIGIRDTVVLGLLLVLLEPCKIVYLPLAGICLLIPASKFGTKKQYWISVAAVIAVMALAI FLINNVVLSAWIQDTENIISWEGGAAGYTIQDILKQPYQAFLIYYETLVTQFDYYQATML GGYLGNLDPNLTVPYFCLYLLWGILVVSSVRKAGEHTPMKSGHKVWIAVLAFLSIVLVLT SMLLGWTPKGFTYIAGIQGRYFLPILPLVLLMLYGDNLTVKRDYSRGTYYLECFVSIYAL VRICSMTCIR >gi|229784071|gb|GG667664.1| GENE 24 23795 - 24754 833 319 aa, chain + ## HITS:1 COG:STM2085 KEGG:ns NR:ns ## COG: STM2085 COG0463 # Protein_GI_number: 16765415 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 12 316 3 302 314 171 34.0 2e-42 MTAETMEIKRTVDVIIPVYKPDAVFETLLERLGKQSYPIRRIIIMNTEQSFWKNDAYEAV PNLEVHHVKKSEFDHGGTRNKGASFSEADIMVFMTDDAVPADRELIGELVKGLSLSGPAG EPAAVAYARQLPNSDCALAEKYTRSFNYPDQSRIKTKGDLKELGIKTFFASNVCCAYDRK LFLEAGGFISRTIFNEDMIYAGNAVLDRGQAIVYAAKARVIHSHNYGCIAQLKRNFDLAV SQADHPEVFEGIRSESEGIRLVKKTCRWLMEQKKPWLIPGVIVKSGFKYLGYLLGKRYRK LPIWMIMKITMNREYWRRG >gi|229784071|gb|GG667664.1| GENE 25 24950 - 25057 62 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSIYPVFFVSMPDTYNRHYDTNNRYNDTDNSNNNA >gi|229784071|gb|GG667664.1| GENE 26 25659 - 26213 776 184 aa, chain + ## HITS:1 COG:CAC2331 KEGG:ns NR:ns ## COG: CAC2331 COG1898 # Protein_GI_number: 15895598 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Clostridium acetobutylicum # 1 175 1 175 185 263 74.0 1e-70 MGKIKVTPCDIKGLYVIEPTVFPDSRGYFMETYNQNDFKEAGLNMVFVQDNQSMSVKGVL RGLHFQKQFPQGKLVRVVRGSVFDVAVDLRSDSETYGKWFGVLLSAENKKQFYIPEGFAH GFLVLSDEAEFAYKCTDFYHPGDEGGILWSDPEIGIDWPIEEGMELIISDKDQKWSGIRD TFKF 
>gi|229784071|gb|GG667664.1| GENE 27 26231 - 27040 950 269 aa, chain + ## HITS:1 COG:lin1062 KEGG:ns NR:ns ## COG: lin1062 COG1682 # Protein_GI_number: 16800131 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Listeria innocua # 1 269 1 267 267 139 30.0 4e-33 MNYLVSLCKEIVHKRKLILDLAKADFKKRFVGSYFGIVWMFLQPLATVLVYFCVFQLGFK SVPPVDYPYVLWLLPGIMPWFYFSEVMNAGTNCLQEYSYLVKKVVFRVEILPVIKMLSCM MVHGIFAIIMIVVFLAYRIPVKISWIQIIYYTFACSMLSLALTYFTSAINVFFKDMAQIV GIILQFGMWMVPIMWAPEMLPGAPEWLDKVLKFNPFYYIVAGYRDSMLTGDGFWMRPTLG VYFWAVTIVLMLIGLKVFKKLRPHFSDVL >gi|229784071|gb|GG667664.1| GENE 28 27069 - 28409 1648 446 aa, chain + ## HITS:1 COG:PA1386 KEGG:ns NR:ns ## COG: PA1386 COG1134 # Protein_GI_number: 15596583 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate transport system, ATPase component # Organism: Pseudomonas aeruginosa # 1 396 1 366 422 265 38.0 1e-70 MESKNSAISVKDVTKIYRLYDKPIDRLKEAMSITHKNYHRDFYALNGISFDVNKGETVGI IGTNGSGKSTILKIITGVLTPTTGEVEVDGVISALLELGAGFNMDYTGIENIFMNGTMMG FSKKEMEAKLQDILDFADIGDFVYQPVKTYSSGMFVRLAFALAINVEPEILIVDEALSVG DVFFQAKCYRRMEEIRKNGTTILMVTHDMGAIIKYCDRVVVLNKGNFIAQGEPGKMVDLY KKILANQMDALEEELVELRSDELSDFSGDSAAAGQRHKTESHGGLMKEKLTINPAKTEYG DGRAEIVDFGLFDERGNLTNLLLKGEMFTIKERIHFNTSLDTPIFTYTIKDKRGADLTGT NTMYESADVKPVKNGDEYEVEFSQKMTLQGGEYLLSMSCTGFENGEHVVYHRLYDIANIT VISNKNTVGVYDMESEVKVALKRGGK >gi|229784071|gb|GG667664.1| GENE 29 28411 - 29193 921 260 aa, chain + ## HITS:1 COG:no KEGG:Closa_3802 NR:ns ## KEGG: Closa_3802 # Name: not_defined # Def: Methyltransferase type 11 # Organism: C.saccharolyticum # Pathway: not_defined # 1 260 1 259 259 321 60.0 2e-86 MGNISSEVKDLLIKYHGDTKLAMKEEPELKYFLALSVTRENVLEWFEFKEGASLLEVGSG CGALTGLYSRRVKEVTVLDEDSEDLEVNRLRHEACGNICYVKGSLDTYEGGLFDYVVMAG SLKMPYEAQINRAKSLLKPGGTLIVAVDNRLGLKYKSGVKPDEACLLKNELTELLCGSSD 
GKNGTLTYYYPMPDYRLPVTIYSDEHLPVKGELAHAVLAYDYPEYIRFDPGKFYDEICEA GVFETFANSFLAIWSSYEED >gi|229784071|gb|GG667664.1| GENE 30 29180 - 31739 2678 853 aa, chain + ## HITS:1 COG:alr4493 KEGG:ns NR:ns ## COG: alr4493 COG1216 # Protein_GI_number: 17231985 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Nostoc sp. PCC 7120 # 405 628 8 240 295 129 34.0 3e-29 MKKTKYIKYNRTRREEFQIRTCIVEEDGKQYVEKTALNEPAVAHIRSMKEKYDRLTSQSP LGVAEMALSGDGKTARFAFLKGETMAEILGRKIMDGRIPVAELKEAMEQVLEVREGYLKP FKRTDEFIAVFGQCADESGELEADEAFDASNVDCLFENIMVTEYGPCCLDYEWVFFFPIP VRFIRYRMLYYFYEQYRSLLGETTLPEFLVNFGIRPEMIPVFGEMEHHFQEFVHGENQKI YLTNYMHDVHIIPEDMNEVKRAVEKQRQWVKQLQLEIEEKDLSIRKQIELKRLTDNHVTN LEVMIGNLRQDNENMAQTIAVLNRHEAIIFKVKRKLGEKFNQKFPKGTRKRKILGYVKNT FCHPVASFRLYTTPEGRNLIEGDMKIGDGYRIHGKLKFREEKEPMVSIVIPVYNQIDYTY VCLASILEHTTDVAYEVIIADDVSTDATMELAKFVEHVTICRNETNQGFLRNCNQAAKAA RGKYVMFLNNDTQVTPGWLKSLVDLIERDPSIGMVGSKLVYPDGRLQEAGGIIWSDGSGW NYGRLDDPDKPEYNYVKDVDYISGAAILLSNALWKQIGGFDERFAPAYCEDSDLAFEVRK AGYRVVYQPLSKVIHFEGISNGTDVNGSGLKRYQVENSEKLKEKWAEEFKKQCVNNGNPD PFRARERSQGKDIILVIDHYVPTYDRDAGSKTTFQYLKMFLEKGYVVKFLGDNFLHEEPY STTLMQMGIEILYGRDYQVKIWDWLKVHGDDITAAYLNRPHIAEKYVDFIEEHTNIKMIY YGHDLHFLREGREYQLTGDEKKRDASEYWKAIEFRIMRKAAVSYYPSYVERGAIHMIDPT IRVKAIVAYVFEK Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:14:52 2011 Seq name: gi|229784070|gb|GG667665.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld58, whole genome shotgun sequence Length of sequence - 32649 bp Number of predicted genes - 34, with homology - 34 Number of transcription units - 14, operones - 10 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 377 488 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 2 1 Op 2 . + CDS 393 - 1793 1507 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 3 1 Op 3 . + CDS 1813 - 2166 476 ## Closa_3386 hypothetical protein 4 1 Op 4 . 
+ CDS 2240 - 2872 631 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family)
    + Term 2924 - 2978 11.5
    - Term 2916 - 2962 3.4
5 2 Op 1 . - CDS 2978 - 3424 457 ## COG3871 Uncharacterized stress protein (general stress protein 26)
6 2 Op 2 . - CDS 3494 - 3850 521 ## COG1733 Predicted transcriptional regulators
7 2 Op 3 . - CDS 3903 - 5132 1166 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
    - Prom 5293 - 5352 7.7
    + Prom 5134 - 5193 6.3
8 3 Op 1 . + CDS 5307 - 6698 1514 ## COG2211 Na+/melibiose symporter and related transporters
9 3 Op 2 . + CDS 6739 - 7599 971 ## CPE0145 hypothetical protein
10 3 Op 3 . + CDS 7592 - 9289 1749 ## COG1472 Beta-glucosidase-related glycosidases
    + Term 9350 - 9401 11.5
11 4 Op 1 . - CDS 9399 - 10079 598 ## CDR20291_3070 hypothetical protein
12 4 Op 2 . - CDS 10076 - 10606 527 ## COG0655 Multimeric flavodoxin WrbA
13 4 Op 3 . - CDS 10620 - 11144 468 ## COG1695 Predicted transcriptional regulators
14 4 Op 4 . - CDS 11166 - 11318 93 ## gi|288870737|ref|ZP_06409888.1| conserved hypothetical protein
    - Prom 11530 - 11589 5.3
    + Prom 11252 - 11311 4.1
15 5 Op 1 5/0.000 + CDS 11493 - 13532 1111 ## COG0642 Signal transduction histidine kinase
    + Prom 13575 - 13634 5.6
16 5 Op 2 . + CDS 13684 - 15918 2192 ## COG2200 FOG: EAL domain
17 6 Op 1 . - CDS 15956 - 16384 511 ## gi|288870740|ref|ZP_06409890.1| hypothetical protein CLOSTHATH_03403
18 6 Op 2 . - CDS 16399 - 16836 389 ## COG3279 Response regulator of the LytR/AlgR family
    - Prom 16875 - 16934 4.2
19 7 Tu 1 . + CDS 17276 - 17827 547 ## PROTEIN SUPPORTED gi|227412959|ref|ZP_03896151.1| acetyltransferase, ribosomal protein N-acetylase
    + Term 18003 - 18042 0.9
20 8 Op 1 . - CDS 17880 - 19085 725 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative
21 8 Op 2 . - CDS 19106 - 19600 378 ## Closa_3378 hypothetical protein
22 8 Op 3 .
- CDS 19652 - 20089 638 ## Closa_3377 hypothetical protein
    - Prom 20127 - 20186 6.3
    + Prom 20537 - 20596 6.2
23 9 Tu 1 . + CDS 20668 - 21285 618 ## COG0846 NAD-dependent protein deacetylases, SIR2 family
    + Prom 21330 - 21389 4.5
24 10 Op 1 2/0.000 + CDS 21413 - 22021 505 ## COG0118 Glutamine amidotransferase
25 10 Op 2 . + CDS 22033 - 22794 849 ## COG0107 Imidazoleglycerol-phosphate synthase
26 10 Op 3 . + CDS 22848 - 23522 669 ## COG0692 Uracil DNA glycosylase
27 10 Op 4 . + CDS 23545 - 24378 816 ## COG1968 Uncharacterized bacitracin resistance protein
    + Term 24411 - 24463 10.1
    - Term 24399 - 24449 9.6
28 11 Op 1 . - CDS 24462 - 26561 1637 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases
29 11 Op 2 . - CDS 26583 - 26828 243 ## gi|266622203|ref|ZP_06115138.1| phosphocarrier protein HPr
    - Prom 26940 - 26999 6.8
    - Term 27178 - 27231 19.3
30 12 Tu 1 . - CDS 27240 - 28268 1049 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase
    - Prom 28346 - 28405 6.1
    + Prom 28366 - 28425 6.7
31 13 Tu 1 . + CDS 28457 - 28747 323 ## Closa_3370 hypothetical protein
    + Prom 28787 - 28846 7.3
32 14 Op 1 . + CDS 28876 - 29448 690 ## Cphy_0228 hypothetical protein
33 14 Op 2 . + CDS 29451 - 30854 1226 ## COG1696 Predicted membrane protein involved in D-alanine export
34 14 Op 3 .
+ CDS 30887 - 32074 1068 ## Cphy_0232 hypothetical protein Predicted protein(s) >gi|229784070|gb|GG667665.1| GENE 1 3 - 377 488 124 aa, chain + ## HITS:1 COG:TM1354_2 KEGG:ns NR:ns ## COG: TM1354_2 COG2172 # Protein_GI_number: 15644106 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Thermotoga maritima # 3 120 45 163 181 105 49.0 2e-23 RAGEASSDVKGKLKQMGVSPDAVRKVAIAMYEGEINMVIHAKGGEITVNITPEEIEMILA DVGPGIPDVELAMQAGYSTAPDEVRSLGFGAGMGLPNMKKYSDVLEVDTTLGVGTTVRMV VKIV >gi|229784070|gb|GG667665.1| GENE 2 393 - 1793 1507 466 aa, chain + ## HITS:1 COG:TM1421 KEGG:ns NR:ns ## COG: TM1421 COG4624 # Protein_GI_number: 15644172 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 15 350 12 300 301 103 25.0 7e-22 MDKFYHSVRLDADLCMGCINCIKRCPTQAIRVRNGKAQINSKFCIDCGECIRVCPHHAKH ATYDKLDVLKQYEYTVALPAPSLYSQFNNLDDVNIVLNALLMMGFHDVFEVSAAAELVSE ATREYLSENPDRLPAISTACPSVVRLIRVRFPNLIPNLLPLNPPVEVAAILAAEKAMKET GLPREKIGIIFISPCPSKVTYVKSPLGTDHSEVDRVLAIKDVYPQLLSCMKAVGDDPPEI GTSGKIGISWGRSGGEASGLFTEEYLAADGIENVIRVLEDMEDQKFTNLKFVELNACNGG CVGGVLTVENPYVAEVKLKRLRKYMPVARSHIEGNAEELVKWTKEVQYEPVFNLGNTMME SFARLNQVERLCKKLPGLDCGSCGAPTCKSLAEDIVRGEAVESDCVYYLRENLHKLSEEV SILADDIAAGNAEGYEMLKVMTEYIQRISDEMSLLDSRDEKEKDQE >gi|229784070|gb|GG667665.1| GENE 3 1813 - 2166 476 117 aa, chain + ## HITS:1 COG:no KEGG:Closa_3386 NR:ns ## KEGG: Closa_3386 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 109 1 109 116 163 80.0 2e-39 MTVQELIDSGIFGVVNAGEGLEREITKPFCCDLLSIAMGRAPEGCAWVTVMGNMNTLAVA SLADAACVIMAEGAALDEAARKKAADREITVLETEEPIFEAALAVWKKLQAGTHEEP >gi|229784070|gb|GG667665.1| GENE 4 2240 - 2872 631 210 aa, chain + ## HITS:1 COG:TM1352 KEGG:ns NR:ns ## COG: TM1352 COG0613 # Protein_GI_number: 15644104 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP 
family) # Organism: Thermotoga maritima # 7 185 31 198 232 84 31.0 1e-16 MAAVKGLDVIAVTDHNSCKNCPAVLHFAGEYGVLAIPGMEINTSEEVHAVCLFSNLEAAM AFDSYVYDRLMPFPNNEEIFGKQQLYNKEDEVCGTVPNLLINSVDISFDGLWELVRSYDG VMFPAHIDKTANSLIANLGFVPPDSRFTTAEVKDLKKLHELKRSNPYLDGCRIISNSDAH YLEDIHEPELTIEVEEMSMRGVVDCLLRSL >gi|229784070|gb|GG667665.1| GENE 5 2978 - 3424 457 148 aa, chain - ## HITS:1 COG:CAC3491 KEGG:ns NR:ns ## COG: CAC3491 COG3871 # Protein_GI_number: 15896728 # Func_class: R General function prediction only # Function: Uncharacterized stress protein (general stress protein 26) # Organism: Clostridium acetobutylicum # 11 140 13 141 145 117 40.0 6e-27 MKDAEKTVGTMLDKQSVAWIGSVSPDGFPNIKAMLRPRKRDGIRTIYFTTNTSSMRVGQF RENPKACVYVCDSRFFRGAMLTGTMEVLEDSESREMIWREGDTMYYPGGVTDPDYCVLRF TALSGRFYSNFHSENFEIPEACTAEGIH >gi|229784070|gb|GG667665.1| GENE 6 3494 - 3850 521 118 aa, chain - ## HITS:1 COG:CAC3399 KEGG:ns NR:ns ## COG: CAC3399 COG1733 # Protein_GI_number: 15896640 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 115 1 116 116 125 56.0 2e-29 MKQRTEYTCPLELTHDAIRGKWKPIILWQLSKGGSSLSALKKEIVGISQKMLIQQLSELA QYGFVGKHTFDGYPLKVEYYLTERGRKIFEAVTIMQSVGIDMMAEDGRADFLREKGLL >gi|229784070|gb|GG667665.1| GENE 7 3903 - 5132 1166 409 aa, chain - ## HITS:1 COG:BH0793 KEGG:ns NR:ns ## COG: BH0793 COG4753 # Protein_GI_number: 15613356 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 268 404 363 503 508 87 36.0 4e-17 MSHAPQTAVIDLMTTLLSTTLGVQTSVLFPDSAELADIDYSLRKSIFTGLDAKELFCSMM ERAEEKYIYSITDHFFSQYILMPLPEKKDCGCLLVGPYLTEEANEFLYNRAINRNRLTLE TLPVLKKYYQNLSIVDPARITAALNHIAAFLYEDHHDLTIRYMTEDWNEQYSGWNYVPDP EHAFSIHVTEKRYAAENEFLDAVMRGDQVLAVRRFEAFLDFRIAPRYKDPIRDSKNLIIV LNTLLRKAAEKAVVHPVYLDEISHHYAVQIEASTSCLQLENLRIEMVRKYCILVQNQSLK NYPPLIRDLLNHINLNLSSDLSLKSLSQMFSVNASYLSSLFRREMTMTLTDYVNGRRIDT 
ALKLLNTTDIQIQDIAYYVGIEDVNYFTRIFKKKIGVSPTEYKKSIRPA >gi|229784070|gb|GG667665.1| GENE 8 5307 - 6698 1514 463 aa, chain + ## HITS:1 COG:BS_yjmB KEGG:ns NR:ns ## COG: BS_yjmB COG2211 # Protein_GI_number: 16078296 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 2 444 15 444 459 198 30.0 2e-50 MERKFGFRDKIGYMFGDFGNDFFFMLASTFLMVFYTKVLGISAAAVGTLFLVARCVDAFT DITMGHIVDTARIGKDGKFRPWIRRMCIPVVVAGVLMFNPFIADKSMTFKMVYIYLTYLF WGSFCYTGINIPYGSMASGITSDPIERGQLSTFRSVGAALAGVCVNVGVPLFVYAYEGGN QVVIADRFFYIACLFAVLALICYILCYSLSTERVKPQVKESNRGSFAKAVKGMVHNRSLL AIIGAAIVLLLSMLLSGSMNTYLYMDYFKSKEAMSLAGFFSTGATLILAPFSTAILKKFG KKEASAASVLFASVIYFILFFVRIQNPWAFCVMIAFGNLGTGLFNLLIWAFITDVIDYQE VQTGSRDDGTIYAVYSFARKIGQALAGGLGGWVLGAIGYQSSKAGEVIVQTDSVTRNIYT VATLAPAVCYLIVGLILLFAYPLSKSVVEENNRILAERRTAAK >gi|229784070|gb|GG667665.1| GENE 9 6739 - 7599 971 286 aa, chain + ## HITS:1 COG:no KEGG:CPE0145 NR:ns ## KEGG: CPE0145 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens # Pathway: not_defined # 1 281 1 281 281 330 55.0 4e-89 MGDVFENFQKKQEYLICIDSDGCAMDTMDIKHFKCFGPCMVKEWGLEPWEEALLKRWNEV NLYSMTRGINRFKALAIVLREASEAYRPIDGIDAFVRWCETAPELSGEALSKMEKNESSV CFQKALSWSKAVNEAIERLPEDVKKPYDGVAQGIRAAHEKADIAIVSSANKQAVVEEWER CGLLSLTDVVLTQTEGSKAYCIGRLLEAGYEKDHVLMIGDAPGDHQAALQNGVCFYPILV KKEALSWKQFRKEALERFQNDRYRGAYEERCIADFEKNLGKGENNG >gi|229784070|gb|GG667665.1| GENE 10 7592 - 9289 1749 565 aa, chain + ## HITS:1 COG:BH0675 KEGG:ns NR:ns ## COG: BH0675 COG1472 # Protein_GI_number: 15613238 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus halodurans # 19 403 131 529 686 159 28.0 1e-38 MVNLREKPYYLKDEEIRWVEDTIASMTDEEKIGQLFVNMVTSRKPEDLENVVKNYHVGAI RYHNTSPDELYEQNRILQEHSRIPLLIASNCEAGGNGGVGGGTAVACGAATAAADDEETA YEVARIGAAESAAIGCNWNFAPVVDLLYNWRNTIVQCRAFNNSPEDTIRYAKSFFRGTRT QNMATCMKHFPGDGTEENDQHLLMGVNDMTCEEWDNTFGKVYKELIDDGVMTIMAGHIAL 
PAYTRKLRPDTADEDILPATLSPELITDLLKGRLGFNGMVVTDASHMIGMFAAMPRRDQV PRAIAAGCDMFLFFNDMDEDFGYMMEGYRNGVITEERLNDALHRILGIKAALKLHKLQAE GRLMPPREQLSVIGCREFRESAAKQADKFITLVKDTKHYLPLTLEKYQRVKLVFIGGEGT VVAGKLQSDDSAPVKEMVIRKLTEAGFTVDGETPAVKGKMEDLRKTYDVCLVVLNVSGFA QYNTMRVKWAQPADQPWYVSELPTVFLSLNFTNHLIDVPMAKTYINAYLNSEEAVSAAID KMTGKSPFKGRYNETVFCGRWDTRI >gi|229784070|gb|GG667665.1| GENE 11 9399 - 10079 598 226 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3070 NR:ns ## KEGG: CDR20291_3070 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 224 1 213 216 152 36.0 1e-35 MKGILINGSPKRNGSASGVLLDDLRSLLAPDTETAVMAVTAAHPAASPAYTGMHWEPSPE EQKKLGGLSFLVLSFPLYVDGIPSHLLSFLTELERWQNQSRLLEGDKPLAVYAAVNCGFY EGKQAAIALRIVENWCVRCGFEYKQGIGCGAGGMLAHLKGVPLAKGPKKPLGAALTDMAA NIERLEKGNDRFFSLSIPRPLYIAAAHIGWRKQGRENGLTVKELKS >gi|229784070|gb|GG667665.1| GENE 12 10076 - 10606 527 176 aa, chain - ## HITS:1 COG:FN1035 KEGG:ns NR:ns ## COG: FN1035 COG0655 # Protein_GI_number: 19704370 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Fusobacterium nucleatum # 1 167 1 157 159 166 47.0 3e-41 MKTIIHDLDPAAFEELFPETALLTRLGTSEPPAVILSPMYPARPCIGCFGCWIKTPGSCV LPDSYQEMGKLLSRTSELILISRCCYGSYSPHVKNVLDRSIGYLQPFFRIIGGEMHHVMR YGNQMAVTAHFYGDSITKQEEETANRMVKANAVNLGASGFTVHFYQTPEQIRGNMT >gi|229784070|gb|GG667665.1| GENE 13 10620 - 11144 468 174 aa, chain - ## HITS:1 COG:L9255 KEGG:ns NR:ns ## COG: L9255 COG1695 # Protein_GI_number: 15673931 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 2 155 7 162 179 69 32.0 3e-12 MLEYIILGFLMRRKATGYDLKQYMAESTSYFFDASYGSIYPALKRLEEKKFLCSEEQVTG GKFKKLYSVTGEGRKHFLEWLKQPVHFSKTRLDHLVPFFFYDCLDAETAGRNLTQFIEEA SSGLGELRAQQRELDSSCPECRHTYHYSVLVYGIRYYEMLIGWCMELASQTPKP >gi|229784070|gb|GG667665.1| GENE 14 11166 - 11318 93 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870737|ref|ZP_06409888.1| ## NR: gi|288870737|ref|ZP_06409888.1| conserved hypothetical 
protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 50 1 50 50 85 100.0 1e-15 MGIFQIYRLHCYNFFKKLPVLFRLMNKPKTNLGLTLHNVPAIVSLSYISF >gi|229784070|gb|GG667665.1| GENE 15 11493 - 13532 1111 679 aa, chain + ## HITS:1 COG:VC1084 KEGG:ns NR:ns ## COG: VC1084 COG0642 # Protein_GI_number: 15641097 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 250 471 204 430 439 84 27.0 1e-15 MKELLLRVSQEGIHGLYLLVFTLLIWLVFFLILAANPGNKINQWCFATGFLCGIGTFKEF LYYELAGLIHSPAAHAIPDWLYSVMSGVFYFFSLPCGLICALYFSHEDVKNPSAFRKKQA ASFALAVVMILIFPCTQTLYYQSKVSFCLSTAAYNWLYGIILTSILLKTLREERLSANYN QRMLVTASVLLPLWCWLIAGFPYHALGLGGFSKAWQINLIIIVVILLFIVYHAFHSGIWG LRIRGEQYNWSSDEKLLQKNAHYVGHALKNDLAKISWCTGLLRQEQVRMKELDIIEQSVA HLESFVARTQMYSDRIVLKQQFCDIAQIFEDLKQELMLPEGKQIQICTCDMDPLFCDPVH LAEVLRNLIWNAAESIKEEGLIKLSYHCQKNRRQAVITVRDNGCGIEKDDLKRISEPFYT TKNNNHNMGLGLYYCWNVMCAHSGRLQADSAAGEGSCFSLYFPYVNPQKGASREKKKIMV VDDDKDFLYLIQQTLGKEADFDVCSLCSSKKGALPAALTDSPDLVLMDLNLESTWMDGVE TARTIRIETNARIIILTSFDQPEIILDACRRAFASGYVLKNQFSMLIPTIRTTYSGITPQ ALLICSLILEVLSPAERAVLDQMLGREITLHSKNKTIANQQSNILRKLELSGKLELCHIF SAYYTSLPRENENSGTLPH >gi|229784070|gb|GG667665.1| GENE 16 13684 - 15918 2192 744 aa, chain + ## HITS:1 COG:PA4601_3 KEGG:ns NR:ns ## COG: PA4601_3 COG2200 # Protein_GI_number: 15599797 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Pseudomonas aeruginosa # 503 739 1 238 247 209 43.0 1e-53 MRNIFAKYREIILGLLILILLAVSWLLYARTLKNDFQNEIISSLEEVSTQGENILEKEIN AKLELLTEVSRRVSFYPAEDYEEAASMLAATAMENDFKRMGIISADGKTYTTDHVQMDLS DRPYYKKAMEGEKCVSDPLRDREGNEKIHVYAVPVYHEDQVTGVLFGTYNLNEMRKQLEV SSFEGRGYTYIVNTDGNCIVDSVNPYSYPDLQNIFDAIEEVDSQNSEAAAAMKADFDLRQ EGCIRIENHGMKYMYYRPLPINNWYLLTVAPASVLDSKMNAVLGRTYLLGAFMVLIFAGI LIYILKDQKRRKAELMHSLYTDEITGGYSFAKFQVEAEKKIRRAEVGSWSLISLDIDDFK FINELLGYEEGNHLIRYIHGVLGEWCGSEEIFAHQSADMFVALADSRDHDVLCRRLEELC SRLQSYSIWPQNKLAIVPALGIFQIRDKGLTLDFCLDCAGIARKSRKGQFAVHYAFYDER 
VKEQIYRDKRIEAEMKGALENGEFQAYYQPQYDAESRRIVGAEALVRWARPDGGLVPPGE FIPLFEKNGFISELDRYMFRQVCEQQKEWEERGYEVVPVSVNVSRKLLYDLNFVEKYNLI LTETAVDVRNVEIEITESVFFDNQPRLLDAIRHLHQSGYRILLDDFGTGYSSMAMLNDMS FDTLKIDKSFVDNIGDERGNKIVDGIIRLADSLELSTIAEGVETREQYEYLKARGCDVIQ GYYFGRPMTAESFGMLLSSQTAGR >gi|229784070|gb|GG667665.1| GENE 17 15956 - 16384 511 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870740|ref|ZP_06409890.1| ## NR: gi|288870740|ref|ZP_06409890.1| hypothetical protein CLOSTHATH_03403 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03403 [Clostridium hathewayi DSM 13479] # 1 142 38 179 179 254 100.0 2e-66 MKRLLRTYLPSSAIAFTIVILFSVIFNLILGKHDALSSIFVLELAGLILLIQFVSAVCDR IPFQSKKAYQITFYALEYAIVLIAGFLLNWAAFTISSFLYTTLLWLFIAFLIDRYFTAVH RHEAEEINRLILSQNQKEEQIP >gi|229784070|gb|GG667665.1| GENE 18 16399 - 16836 389 145 aa, chain - ## HITS:1 COG:lin0983 KEGG:ns NR:ns ## COG: lin0983 COG3279 # Protein_GI_number: 16800052 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Listeria innocua # 1 144 1 145 151 79 28.0 2e-15 MKITLHEGLDLSEPEIEIRCHIADPRLKRLIDYIHQYSFSLEGRIGEASFYLHSEEIIYV DSADGRTFLYSGKNVYESRETLASLEEKLANTTFVRISKNCIVNTAWLKSVSPLWNHRLE ALLKNGEKLIVSRNYIPALKEKLSQ >gi|229784070|gb|GG667665.1| GENE 19 17276 - 17827 547 183 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227412959|ref|ZP_03896151.1| acetyltransferase, ribosomal protein N-acetylase [Eggerthella lenta DSM 2243] # 14 181 11 178 185 215 61 3e-55 MGNVTEENTDDEIILETKRLILRRLRRSDYGALCLMLKDEEVMYAYEHAFEDWEAADWLD RQLMRYETYGFGLWAVVLKETGEVIGQCGLTMQEVGDSEVLEVGYLFRMEFWHHGYALEA AAACRDYAFEKLGAEAVYSIIRENNGPSRAVAERNGMTVCGSIVKHYYGLDMPHLLYRIS RVE >gi|229784070|gb|GG667665.1| GENE 20 17880 - 19085 725 401 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 4 395 2 391 396 283 38 7e-76 MESAVVRIKKGEGRSLKAGGMWIYDNEIDTIKGDFTNGDMVFVEDFDGYPLGHGFINTNS 
RLTVRMMSRKKDAVIDDAFIEMRVRAAWEYRKTTTDTSSCRIIFGEADFLPGIVIDKFAD VLVVESLALGIDRFKTMILDQVKKVLAEDGIHIRGVYERSDAKVRLQEGMERFKGFIGDP FDTKVEIVENNVHYMVDVKDGQKTGFFLDQKYNRLAIQRLCPGKRVLDCFTHTGSFALNA GLAGASSVLGVDASELGVAQAEENAALNGLSDTVHFRCADVFELLPELERQGEKFDVVIL DPPAFTKSRSSVKNAVKGYREINLRGMKLVTDGGYLATCSCSHFMTPELFEKTIREAASN VHKRLRQVEYRTQAPDHPILWSGDETSYYLKFFIFQVCDEK >gi|229784070|gb|GG667665.1| GENE 21 19106 - 19600 378 164 aa, chain - ## HITS:1 COG:no KEGG:Closa_3378 NR:ns ## KEGG: Closa_3378 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 161 7 167 173 176 65.0 2e-43 MKTKELVRDAFLAALLLVLQVSLSWLPNIELVSLLMILYTLVFRRHVWVILYVFVILEGL IYGFGLWWFSYLYVWAFLCTAVFLMRWTGKPRTISAALLAGIFGMLFGLLCAVSYFFTGG PGAALAWWTAGIPFDIIHGISNFTVTLLLFEPLYHLLTRLKNQF >gi|229784070|gb|GG667665.1| GENE 22 19652 - 20089 638 145 aa, chain - ## HITS:1 COG:no KEGG:Closa_3377 NR:ns ## KEGG: Closa_3377 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 10 141 3 134 143 159 62.0 3e-38 MNTDKQRHKRLKRPSPFLGACVAVAVLALLLVIYNISKPVPMVGSKTITIDVVYKDGKED SYHVTTEAQYLKGAADAIPELTLDGTVTEEYGLMITTVNGVRADYTQDGAYWALLLNDEP CNYGISMQPIKDGEHYKLVYTPADQ >gi|229784070|gb|GG667665.1| GENE 23 20668 - 21285 618 205 aa, chain + ## HITS:1 COG:MYPU_4420 KEGG:ns NR:ns ## COG: MYPU_4420 COG0846 # Protein_GI_number: 15828913 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Mycoplasma pulmonis # 33 201 92 278 282 71 26.0 1e-12 MKENQNIRQEIDECEKLLIGLGEEWRAELVPDLDEGYGKLAELTEGKDFFLVTTVTDARI FESGIEAARIVAPCGNETWRQCSRSCTKDIWEPGEIPEDICPHCGAPLTGNTIQAETYIE EGYLPQWNRYTAWLAGTLNKKLLILELGVGFGTPTVIRWPFEKTVSLNKKAHMYRIHGKF AQLSAEMGERAVSVPMNSVEFIRTL >gi|229784070|gb|GG667665.1| GENE 24 21413 - 22021 505 202 aa, chain + ## HITS:1 COG:MJ0506 KEGG:ns NR:ns ## COG: MJ0506 COG0118 # Protein_GI_number: 15668683 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Methanococcus 
jannaschii # 1 201 3 197 198 202 49.0 2e-52 MIAIIDYDAGNLRSVEKALDALGEQPVISRDEKTILSADKVILPGVGSFGDAMGKLRQYG LVEVIHHVADQGTPFLGICLGLQLLFERSGESEGVSGLGILPGEIVRFPDTPGLKVPHMG WNCLDIRPDAKLFKGLASGEYVYFVHSYYLKAEKEEDVAASSVYGVRFHASVEHGNVFAC QFHPEKSGTTGLKILKNFIELP >gi|229784070|gb|GG667665.1| GENE 25 22033 - 22794 849 253 aa, chain + ## HITS:1 COG:aq_181 KEGG:ns NR:ns ## COG: aq_181 COG0107 # Protein_GI_number: 15605750 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Aquifex aeolicus # 1 252 1 250 253 327 63.0 1e-89 MLTKRIIPCLDVHNGRVVKGVNFVNLKDAGDPVEIAAAYDKAGADELVFLDITASSDNRS TVVDMVRRVAETVFIPFTVGGGIRTVDDFKLLLREGADKISINSSAINTPELISEAADKF GRQCVVVAIDARRREDGSGWNVYKNGGRIDTGLDAVEWAMKADHMGAGEILLTSMDCDGT KAGYDDELNAVIADSVSVPVIASGGAGTTEHFYNALTKGKADAALAASLFHYKELEIMEL KRFLADRGLPMRL >gi|229784070|gb|GG667665.1| GENE 26 22848 - 23522 669 224 aa, chain + ## HITS:1 COG:lin1190 KEGG:ns NR:ns ## COG: lin1190 COG0692 # Protein_GI_number: 16800259 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Listeria innocua # 1 224 1 224 224 293 61.0 2e-79 MGAIDNDWLEPLSVEFKKPYYRELYKKVKHEYETRRVFPEADDIFNAFQFTPLSRVKVVI LGQDPYHNYGQAHGLCFSVKPDVEIPPSLVNIYQELHDDLGCDIPNNGYLKKWADQGVML LNTVLTVRAHAANSHQGIGWETFTDAAIKILNEQDRPMVFLLWGRPAQNKKSMLTNPKHL ILEAPHPSPLSAFRGFFGCRHFSKTNEYLKANGLEPIDWQIENI >gi|229784070|gb|GG667665.1| GENE 27 23545 - 24378 816 277 aa, chain + ## HITS:1 COG:CAC0501 KEGG:ns NR:ns ## COG: CAC0501 COG1968 # Protein_GI_number: 15893792 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Clostridium acetobutylicum # 1 272 1 273 274 266 56.0 3e-71 MSIIETLKVIVLGIVEGFTEWLPISSTGHMILVDEIIHLKQPEDFKNMFMVVIQLGAILA VVVMYFHKLNPFSPTKKPAQKRATLELWSKIILACIPAAVIGLLIDDIMEKYLMNGYVVA ATLIIYGVLFIVIENANRNRNFEMQRVGDISYQTALYIGLFQLLSLVPGTSRSGSTILGA MILGCSRGAAAEFSFFLGIPVMFGASFLKIVKYGLHFTGPQVFYLLLGMVVAFIVSVYAI RFLMGYIRKHDFKFFGYYRIVLGVIVLVYFGITALIG 
>gi|229784070|gb|GG667665.1| GENE 28 24462 - 26561 1637 699 aa, chain - ## HITS:1 COG:slr1857 KEGG:ns NR:ns ## COG: slr1857 COG1523 # Protein_GI_number: 16330244 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Synechocystis # 7 660 5 672 707 716 50.0 0 MPLYTDNFRPLDTISGFSVRPGLYEEFGARLIPGGVSFTLHSQEATSCELLLFHHNEREP YARIPIPDRYRIGNVYSIIVFDLDTRDLEYAFSIDGPYDPKKGLIFDKTKYLLDIYAKAV TGQGTWGSKPESGFQYKARVVSDNFDWDDCCHPPIPMEDLVIYELHVRGFTRDASSGVSA PGTFQGIIEKLPYLEDLGINAIELMPVFEFDEMRNERSVNGNMLLDYWGYNPVSFFAPNT SYASKSEHNHEGRELKTLIRTIKERGMEVYLDVVFNHTAEGNEKGGFFSFKGFDNQIYYM LTPDGFYYNFSGCGNTLNCNHPIVQQMILDCLRYWTIHYHVDGFRFDLASILGRSEDGTP LHKPPLLESLSYDPVLSSAKLIAEAWDAGGLYQVGSFSSWNRWAEWNGKYRDDMRRFLKG DDNMAAAAVSRITGSPDLYPPATRGFNSSVNFLTCHDGFTLYDLYSYNEKHTEANGWNNT DGDNNNNSWNCGAEGETDDPGILELRFRMIKNACAVLMCSRGTPMFFAGDEFGNTQFGNN NAYCQDNSISWLDWTLLQKNHELYEFFRRMIAIRRSHPVIRRETLSSSTGFPPVSVHGTE AWKGETTSYTHYVGIMYAGKHDDGTEDIIYLAVNTYWEPLPITLPHLPEGFFWNLLADTG RGSGSIPETVDGWDEVIRYSALAHPRSVMILRAVQKELS >gi|229784070|gb|GG667665.1| GENE 29 26583 - 26828 243 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622203|ref|ZP_06115138.1| ## NR: gi|266622203|ref|ZP_06115138.1| phosphocarrier protein HPr [Clostridium hathewayi DSM 13479] phosphocarrier protein HPr [Clostridium hathewayi DSM 13479] # 1 81 1 81 81 160 100.0 3e-38 MRTKQVDIYQRYTPIPIAVLITAANCFDCDIYIDNGKSKVNVKNYDEMKKGLVTRNRNLL FYFDGKDEKAAEGRIEMLFQP >gi|229784070|gb|GG667665.1| GENE 30 27240 - 28268 1049 342 aa, chain - ## HITS:1 COG:SP1700 KEGG:ns NR:ns ## COG: SP1700 COG0722 # Protein_GI_number: 15901534 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Streptococcus pneumoniae TIGR4 # 15 338 15 337 343 369 53.0 1e-102 MGFQYVNNLPTPDEIKERFPLPAELKERKAKRDQEVSDVLTGKSDKFLVIIGPCSADNED SVCDYAERLIKVQEKTADRLIIIPRVYTNKPRTTGEGYKGMLHQPDPEKKPDMHEGLIAI 
RKMHMRVFAETGFPTADEMLYPENMTYLDDIMSYVAIGARSVENQQHRLTASGCDVAVGM KNPTSGDLSVMLNSVVAAQQGHDFTYRGWEVKSGGNPLAHTILRGAVNKHGQCIPNYHFE DLVLLQELYAKRNLDNPACIVDTNHSNSNKKYMEQIRIAKEVLHSRRHAAPIKDLVKGLM IESYIEPGSQKVGEHCYGKSITDPCLGWEDSERLLYEIAENA >gi|229784070|gb|GG667665.1| GENE 31 28457 - 28747 323 96 aa, chain + ## HITS:1 COG:no KEGG:Closa_3370 NR:ns ## KEGG: Closa_3370 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 94 1 94 96 130 59.0 2e-29 MERFWTWGGRYVGVRQKEYLVACDGTVLGKFFGRDIYDQDGYYVGELGRNDRLIRNRTKA ATRRPAFSRSVKGTIQAPLRDCAPYPLIYGFDDFEW >gi|229784070|gb|GG667665.1| GENE 32 28876 - 29448 690 190 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0228 NR:ns ## KEGG: Cphy_0228 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 28 190 56 216 217 117 41.0 3e-25 MRKSVYVTAMVLTAISLTACGGGKGPDNGSSGTVAESTQAVTVDLEKVHQAVKEAYGENY IPSAPYDAQALNDIFGVPEDLYEEFIAEGPMISVHVDTYAAIKAKEGKGEEVAKLLEAYR TRLVEDSVQYPMNISKVEASEVVRHGDYVFFVMLGTASDEAQAEGEEAALQSAQEENKKA VDAINAFFEG >gi|229784070|gb|GG667665.1| GENE 33 29451 - 30854 1226 467 aa, chain + ## HITS:1 COG:CAC1564 KEGG:ns NR:ns ## COG: CAC1564 COG1696 # Protein_GI_number: 15894842 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Clostridium acetobutylicum # 1 467 1 473 473 343 39.0 6e-94 MVFSSLLFLFRFLPVLLIVYFVAPRRYRNGVLFLGSLIFYGWGEPVYISLLLFSTLVDYI HGRLVWRFMEEGKQMRARLCVASSVIINLGLLGVFKYADFLMETVGRLLGIPLPLPGLAL PIGISFYTFQTMSYTIDIYRKEAAPQKNIIDFGAYVSMFPQLIAGPIVRYHTIAEELRER TENFAGFSRGVFLFTVGLGKKVMIANTTGALWTQISGVPDAERTVLMMWLGILAFGMQIY FDFSGYSDMAMGLGAMMGFHFPENFRYPYTAKSITDFWRRWHITLSTWFKEYVYIPLGGN RKGAAVQIRNILIVWLLTGIWHGAAWNYLLWGLYFGVLLLLEKLLLKPYLEKLPSPVRSA YAMIFVFLGWVLFAHEDMGGGLRYLMQMAGAGGIAFTSRRTWYLLVTSLPLIAASVLGST PLPGTLCRKMKGRVRETAELLFVAVVLVLSVAFLVDASYNPFLYFRF >gi|229784070|gb|GG667665.1| GENE 34 30887 - 32074 1068 395 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0232 NR:ns ## KEGG: 
Cphy_0232 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 4 392 18 398 399 322 43.0 3e-86 MEKEKHSGRGRGVTFLFLGFLGVLSLLSIVTPQKAFSDSENRYLQKKPEFSVKSLLNGSY GEKYEQYLSDQFPGRNVWIGMKVTTERLALQEDVNGVYFGKDGYLIEKFDTEDLEGEQLN KNIGKLAAFTGAAEKSLGKDHVRVMLVPSASQIMTERLPFLAAPYDQGRVTGMLCRSLKE AGGSRETVLPAEEYLKRYREEALYYRTDHHWTARGAYLGYRLWVESVGLTPWTEEMFDIQ TVNSEFHGTVYSKLNVPWRYDTIEVWQPKEEKDYRVSFDGEPKEYDSLYFPGALEGKDKY AVYLDGNHAITKIENRSITGDQKEKKLLMIKDSYAHSFAVFAANHFGTVYMADLRYLNLN LKEWMEEQEITDVLVLYQIPGFAKEKSVSKLVYDR Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:15:48 2011 Seq name: gi|229784069|gb|GG667666.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld59, whole genome shotgun sequence Length of sequence - 22305 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 5, operones - 3 average op.length - 6.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 581 -183 ## Ajs_3023 group 1 glycosyl transferase - Prom 677 - 736 2.9 2 2 Tu 1 . - CDS 782 - 1180 226 ## gi|266622210|ref|ZP_06115145.1| hypothetical protein CLOSTHATH_03424 - Prom 1345 - 1404 6.1 3 3 Op 1 10/0.000 - CDS 1869 - 2990 291 ## COG0381 UDP-N-acetylglucosamine 2-epimerase 4 3 Op 2 . - CDS 3016 - 4338 701 ## COG0677 UDP-N-acetyl-D-mannosaminuronate dehydrogenase 5 3 Op 3 . - CDS 4363 - 4911 264 ## Dret_0386 hypothetical protein - Prom 5020 - 5079 3.2 6 4 Op 1 . - CDS 5744 - 7141 173 ## COG1541 Coenzyme F390 synthetase 7 4 Op 2 . - CDS 7165 - 7485 178 ## BT_0602 UDP-N-acetyl-D-mannosaminuronic acid dehydrogenase 8 4 Op 3 . - CDS 7582 - 7890 327 ## gi|266622216|ref|ZP_06115151.1| putative sensory transduction protein kinase 9 4 Op 4 . - CDS 7905 - 8492 472 ## gi|266622217|ref|ZP_06115152.1| conserved hypothetical protein 10 4 Op 5 . - CDS 8520 - 8735 298 ## gi|323485149|ref|ZP_08090500.1| hypothetical protein HMPREF9474_02251 11 4 Op 6 . 
- CDS 8722 - 9057 347 ## gi|266622219|ref|ZP_06115154.1| conserved hypothetical protein 12 4 Op 7 . - CDS 9083 - 10279 285 ## COG3344 Retron-type reverse transcriptase - Prom 10325 - 10384 2.8 13 5 Op 1 . - CDS 10491 - 10979 163 ## gi|266622221|ref|ZP_06115156.1| conserved hypothetical protein 14 5 Op 2 . - CDS 10976 - 11119 173 ## gi|266622222|ref|ZP_06115157.1| hypothetical phage-related protein 15 5 Op 3 . - CDS 11134 - 11445 325 ## gi|266622223|ref|ZP_06115158.1| conserved hypothetical protein - Term 11462 - 11490 -0.3 16 5 Op 4 . - CDS 11500 - 12474 739 ## gi|266622224|ref|ZP_06115159.1| hypothetical protein CLOSTHATH_03438 17 5 Op 5 . - CDS 12490 - 13620 512 ## CKR_P04 hypothetical protein 18 5 Op 6 . - CDS 13628 - 14701 520 ## Aflv_0677 phage related protein 19 5 Op 7 . - CDS 14707 - 15621 466 ## CLB_2961 hypothetical protein 20 5 Op 8 . - CDS 15618 - 22031 3538 ## EF1288 hypothetical protein 21 5 Op 9 . - CDS 22028 - 22303 110 ## gi|295115638|emb|CBL36485.1| hypothetical protein Predicted protein(s) >gi|229784069|gb|GG667666.1| GENE 1 2 - 581 -183 193 aa, chain - ## HITS:1 COG:no KEGG:Ajs_3023 NR:ns ## KEGG: Ajs_3023 # Name: not_defined # Def: group 1 glycosyl transferase # Organism: Acidovorax_JS42 # Pathway: not_defined # 6 179 79 250 385 86 28.0 6e-16 MQPVKIPYFIENKLHIHRNVFNTLKKVKPDLIFIHEVNSLSLLQIIKYAKLTPAVRIMVD NHNDYFNSGRNWLSYHVLHKGIYKYLCSRINPYVEIFWGTLPIRCEFLNHVYGVPEEKIQ FLPMGIDDTNIPYNEREKIRNEIRDKLNINEDDFVIITGGKIDSDKKIVELIDALKDMPI HVKLIVFGKPAAN >gi|229784069|gb|GG667666.1| GENE 2 782 - 1180 226 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622210|ref|ZP_06115145.1| ## NR: gi|266622210|ref|ZP_06115145.1| hypothetical protein CLOSTHATH_03424 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03424 [Clostridium hathewayi DSM 13479] # 1 132 236 367 367 272 100.0 8e-72 MVCKRIADKTVSVAKVAELSGSCLSGMSDGVRGFFSTTVENDERIMGWRSWIDYHRSPGI KSEYAKLISLEFSDLSIKKVIEMKKDFWPCRLMQYGYFVICNVAKLESVIVYPIAVKKYD AKLLLLEKGIDY >gi|229784069|gb|GG667666.1| GENE 3 1869 
- 2990 291 373 aa, chain - ## HITS:1 COG:SP0357 KEGG:ns NR:ns ## COG: SP0357 COG0381 # Protein_GI_number: 15900286 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Streptococcus pneumoniae TIGR4 # 1 364 1 364 365 577 73.0 1e-164 MKKVMLVFGTRPEAIKMCPLVNELKTRKNIQTVVCVTGQHRQMLDMVLSTFDVIPDYDLS IMKDRQTLFDVTTSILIRIKEVLENESPEVVLVHGDTSTTFATALACFYLQIPVGHIEAG LRTFDIHSPYPEEFNRQAVGIISQFHFAPTEFAKNNLLKEGKKAETIYVTGNTAIDALKT TVRDDYTHPVLEWASDSRLIMITAHRRENLGEPMKNMFKAIKRVMDTHTDVKAIYPIHMN PVVREIADRILGNGDRVRIIEPLDVIDFHNFLNRSYLILTDSGGIQEEAPSLGKPVLVMR DTTERPEGIVAGTLKLVGTNEENIYQNFKFLLESREEYNKMAKASNPYGDGFACKRIADV LEKEYWGGHENNC >gi|229784069|gb|GG667666.1| GENE 4 3016 - 4338 701 440 aa, chain - ## HITS:1 COG:PM1003 KEGG:ns NR:ns ## COG: PM1003 COG0677 # Protein_GI_number: 15602868 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetyl-D-mannosaminuronate dehydrogenase # Organism: Pasteurella multocida # 13 426 7 416 424 382 46.0 1e-106 MSLYKELINGNKKLALVGLGYVGMPMAVAFANKGIKVIGFDLNEAKIDLYKKGIDPTKEV GDDVIKNTAVEFTSDEKKIQEAKFIIVAVPTPVNTDHTPDLSPVINASKIIGRNLTKDSI IVYESTVYPGCTEDICATILEKESGMKCGEDFKIGYSPERINPGDKVHRLENIRKIVSGM DSESLYEIKSIYDFVIEAGTFPVSNLRTAEAIKVVENSQRDINIAFMNELAMVFDRMDID TNEVVDGMNTKWNALGFRPGLVGGHCIGVDPYYFTYEAEKMGYHSQIILNGRIVNDSMGG YIADAAVRLMIEAGQAPKKSKVVILGLTFKENCPDIRNSKVEDIVKKLNSYGISPLIVDP WANEQDAKKEYNLELTKLSEVSDVDCIIVAVAHHEFKLLSLDTIKNMYKNVPDEERVFID VKGLYDLNKLKESGIKFWRL >gi|229784069|gb|GG667666.1| GENE 5 4363 - 4911 264 182 aa, chain - ## HITS:1 COG:no KEGG:Dret_0386 NR:ns ## KEGG: Dret_0386 # Name: not_defined # Def: hypothetical protein # Organism: D.retbaense # Pathway: not_defined # 8 182 259 433 434 135 38.0 7e-31 MDILRKDIPVVGYLIAFAKKIVTPFVGKPLEVVRNGHWYGNDTLWRAILDLNKILVYSDE NGIMHNKPKRKCISIIDAIVAGEKNGPMSPEPKVCGMVGIAFSPLLGDVVMSRLMGFDYK KIPSINNGFGLTNYTLNNEVIPDNISVGSNNPNWNKKLGNINKNDCFAFNAAFGWENHIE LD >gi|229784069|gb|GG667666.1| GENE 6 5744 - 7141 173 465 aa, chain - ## HITS:1 COG:VC0924 KEGG:ns 
NR:ns ## COG: VC0924 COG1541 # Protein_GI_number: 15640940 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Vibrio cholerae # 9 454 5 434 446 167 27.0 4e-41 MSLFNIAKTCYYQLPDWLLKAGGIVYYMIPERVRYGNVFSGTLKKIQEVEYLAQADLDKL VDEHFVFTVRHAYEQVPFYREYYDTYGVNISSIHGVKDINKLPFIDRDIVKKYSDKLIAN DSKKKKLIYVTTSGSTGTPLGFFQPDNITMQEWAYTIQIWERVGYRPNSSRLVLRGKKLH PAGEKQNYYYDPLRRELSCNIFDMREETMDVYCQAVEKYRPEFIHGYMSAILMLAKYIDQ NNINIKHHFKAILATSENVIDEQKEYVEKIFNTKVFSFYGHSERLVIAGECEYSTEYHVE PLYGYCEIIDEQGAASQFGEIVATGFLNNDMPLIRYKTGDMASWSQTPTCKCGRCYKRLQ KITGRWHQDMLVNIDKAYVSLTALNIHSDAFDKIIRYKLVQYEIGKVEMLIQVSNLYLEE DQIKIKELLESKTNHKIEFTLKIVDHLSVPQNGKYQIVEQHIKLE >gi|229784069|gb|GG667666.1| GENE 7 7165 - 7485 178 106 aa, chain - ## HITS:1 COG:no KEGG:BT_0602 NR:ns ## KEGG: BT_0602 # Name: not_defined # Def: UDP-N-acetyl-D-mannosaminuronic acid dehydrogenase # Organism: B.thetaiotaomicron # Pathway: Fructose and mannose metabolism [PATH:bth00051]; Butanoate metabolism [PATH:bth00650]; Metabolic pathways [PATH:bth01100]; Microbial metabolism in diverse environments [PATH:bth01120] # 1 106 301 407 407 136 61.0 3e-31 MIDTNRVGLYGLTYKENVDDMRESPTLQLLESQERHLATGLKVYDPFITKEVVSNQYYDL DSFLENIDMVVIMVKHDEIKENLNKLSNKVVLDCHNIIKLPGVYHI >gi|229784069|gb|GG667666.1| GENE 8 7582 - 7890 327 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622216|ref|ZP_06115151.1| ## NR: gi|266622216|ref|ZP_06115151.1| putative sensory transduction protein kinase [Clostridium hathewayi DSM 13479] putative sensory transduction protein kinase [Clostridium hathewayi DSM 13479] # 1 102 1 102 102 184 100.0 3e-45 MAANKVVFGNKVLIDLTGDTVTEEALLKGYTAHKADGTIITGTAFAGYPNEFVFLDNIED SSGNPIKDSSGKTIQGQTIYRKARNSVLLDSTGDIIEDGFEQ >gi|229784069|gb|GG667666.1| GENE 9 7905 - 8492 472 195 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622217|ref|ZP_06115152.1| ## NR: gi|266622217|ref|ZP_06115152.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi 
DSM 13479] # 1 195 1 195 195 317 100.0 4e-85 MANMNVNKVIYGGGDVLIDLTGDSVSADKVLKGITAHDKSGAKITGTCTFDSDTSEDTAA VAEILVGKTAHARGSKLTGTMKNNGAVKGIISTVAGEYTVPQGYHDGSGKVSIDATEQAK LIATNIREGVTILGVEGAMSGSEDMKPQSKEVTPSKEAQTIMPDEEYNCLSQVTVKAIPY VETDNSAGGKTVTIG >gi|229784069|gb|GG667666.1| GENE 10 8520 - 8735 298 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|323485149|ref|ZP_08090500.1| ## NR: gi|323485149|ref|ZP_08090500.1| hypothetical protein HMPREF9474_02251 [Clostridium symbiosum WAL-14163] hypothetical protein HMPREF9474_02251 [Clostridium symbiosum WAL-14163] # 1 71 1 71 71 125 97.0 8e-28 MSMSNKTYDILKWIAMYLLPAAGTLYFALAGIWGLPYGEQVVGTITAVDTFLGVILGIST SQYNKTADKEK >gi|229784069|gb|GG667666.1| GENE 11 8722 - 9057 347 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622219|ref|ZP_06115154.1| ## NR: gi|266622219|ref|ZP_06115154.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02252 [Clostridium symbiosum WAL-14163] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein [butyrate-producing bacterium SM4/1] hypothetical protein HMPREF9474_02252 [Clostridium symbiosum WAL-14163] # 1 101 1 101 111 202 100.0 8e-51 MEPWFQVVLTIFSSVLASSGLWAYLQKKSEQKDVKTEMLIGLAHDRIMYLGMSYIDRGCV TQDEYENLRVYLYEPYERMGGNGSAKRIMQEVDKLPIHKFIEKEEEHNEHE >gi|229784069|gb|GG667666.1| GENE 12 9083 - 10279 285 398 aa, chain - ## HITS:1 COG:alr3497 KEGG:ns NR:ns ## COG: alr3497 COG3344 # Protein_GI_number: 17230989 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Nostoc sp. 
PCC 7120 # 3 358 8 329 352 69 23.0 1e-11 MNYEEIVCDANNLYRAYKVSVKSSKWKESTQKFMMNFLRYIFEIQDDLINRTLQNGPTQE FELHERGRIRPITSIQIRDRIVRHSLCDEVLLPEVRKHIIYDNCASIKGRGISQQRKRFE IHLHKYYQLYGNDGYILFGDFSKFYDNIIHEIAKRELLKLFNDDEFIDWLLTLIFKGFQI DVSYMSDEEYETCMIDTFNKLEYRNIPKEKLTGEKWMEKSVNIGDQLSQVIGIYYPYPID NYVKYVRQQKFYGRYMDDWYIMNPSKEELENLLENVCKIAAELGIHINRKKTRIVKISSK YKFLQIKYTLTDTGKVIKRINPDRVTAMRRKLKKLAVKVENEEADYDNVENMFRGWMGGH YKLLSREQRKNLIQLYEDLFSKEITIVNKKLIVSDRSA >gi|229784069|gb|GG667666.1| GENE 13 10491 - 10979 163 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622221|ref|ZP_06115156.1| ## NR: gi|266622221|ref|ZP_06115156.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02254 [Clostridium symbiosum WAL-14163] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02254 [Clostridium symbiosum WAL-14163] # 1 162 1 162 162 296 100.0 5e-79 MSVLLGDRKESKFEAITYSIELHDMLILLMQRGFGVKDVDSFVRKKYAYGEISEENFAKY RELMRSFKSKVNQCASLITSNVRAANTIYPRSMHEYETRRDYQNAAIVNCEQLINELQRV VEIFDVDLNLYNRYVKAIDREIGLIKRWRQRDMAIKSRLEKG >gi|229784069|gb|GG667666.1| GENE 14 10976 - 11119 173 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622222|ref|ZP_06115157.1| ## NR: gi|266622222|ref|ZP_06115157.1| hypothetical phage-related protein [Clostridium hathewayi DSM 13479] hypothetical phage-related protein [Clostridium hathewayi DSM 13479] # 1 47 1 47 47 71 100.0 2e-11 MEHSKNYSKVKLWHSMKMWNETRVRNAVKMGWITKEEFAEITGKDYE >gi|229784069|gb|GG667666.1| GENE 15 11134 - 11445 325 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622223|ref|ZP_06115158.1| ## NR: gi|266622223|ref|ZP_06115158.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02256 [Clostridium symbiosum WAL-14163] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02256 [Clostridium symbiosum WAL-14163] # 1 103 1 103 103 185 100.0 1e-45 
MEEKIYKITLGDGTEISNLKLNGNNFISTEKIEESVFADNCSPVTISDGTTETVHPNMEL VQIVEQVPGEYWFVLRDISEEEFARTKMQSDIAYIAMMSNVEL >gi|229784069|gb|GG667666.1| GENE 16 11500 - 12474 739 324 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622224|ref|ZP_06115159.1| ## NR: gi|266622224|ref|ZP_06115159.1| hypothetical protein CLOSTHATH_03438 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03438 [Clostridium hathewayi DSM 13479] # 1 324 1 324 324 661 100.0 0 MKITDYEKVQELAASNIFLLDGPNGTKTIAADALAKALIGLLSSKDFIGGVNLSELTQIN KLVTGNKLLVGTTDGNKAIAAEDALFAMLDSFAPVELRRVIFRGKNLGTALTAVQKAAIK DASFKGMFLGDYWNIGGRIWRIVDMDYWYNCGDTAFTSHHLVIMPDEALYNAQMNTTNVT TGGYVGSEMYKKNLENAKTIVNAAFQGSVLTHREYLCNAVANGRPSGGAWFDSSVELPNE PMMYGHLHFSPTSDGSTVPSIYTISKTQLALFMVCPKFIVNRSYNQWLRDVVSSAYFATV YGHGITDYHGASDSHGVRPVFPVG >gi|229784069|gb|GG667666.1| GENE 17 12490 - 13620 512 376 aa, chain - ## HITS:1 COG:no KEGG:CKR_P04 NR:ns ## KEGG: CKR_P04 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 8 345 7 305 916 101 25.0 4e-20 MSVTFGFYNSKEGDRRYDAIQMSSIFDGIIQDGILQHVGTAMVVKESEAMIINVGIGRAW FNHTWTLNDALLPLVVPQSEILLNRYDAVVLEVDSREAVRANDIKIIKGTPASNPMKPTM VKTNDRWQYPLAYIYVGAGVTSIRQANITNCVGTSECPFVTAPLDKVEIDDLIAQWQDQW KEFYEKQTTDMEETNKFWKEQWSTWFLAQTEEIQSAYLTWEAQWNIWYSEHTADMEATST YWKEKWEAWFNEYTSINTAEMADWKQKSETEFREWFDQLQALLDSNTAASLAKKLLELQE QVDVLNQFSSNLENEYTVYQKLYDNGYRTYGDVLDSSDAPITDSNLDTVIGRTYSSDLLR DSNGDVIEGRAIFVIK >gi|229784069|gb|GG667666.1| GENE 18 13628 - 14701 520 357 aa, chain - ## HITS:1 COG:no KEGG:Aflv_0677 NR:ns ## KEGG: Aflv_0677 # Name: not_defined # Def: phage related protein # Organism: A.flavithermus # Pathway: not_defined # 3 349 4 357 381 170 33.0 8e-41 MDVTILNTDLDAVAIVDTYESFIWTDRYYAYGDFELYEAMRDGLLDYIKQDYYLQSKESE HVMIVEKIQITSDTEDGNHVTVTGRSLESILDRRIVWGQKLLSGNLQNGIKTLLNENVIS PSDSNRKISNFIFKESTDSAITKLKLEAQYTGDNLYDVIQKICEEQGIGFKITLNDEKQF VFELYAGSDRSYDQTENPYVIFSPKFENIINSNYIESKASLKTVTLVGGEGEGAGRRYTT VGGGSGLNRRELFTDARDISSNVGSDDALTDAEYMAQLQQRGKEKLAENVSITSFEGETE 
TTIMFQYGKDFFNGDIVQIANEYGHETKARILEIVRSEDKDGYSVYPTFKTIEQEGA >gi|229784069|gb|GG667666.1| GENE 19 14707 - 15621 466 304 aa, chain - ## HITS:1 COG:no KEGG:CLB_2961 NR:ns ## KEGG: CLB_2961 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 12 304 11 289 289 90 27.0 1e-16 MIRAVTFTNYLGDSIRLDLARPEESGFIIKSVTGLGPGKANINTTEIATNDGSLFNSSRM PSRNIVISLAYMWKDSIEDVRQLSYKYFPIKKKLTMLIETDNRQAEIEGYVESNDPTIFS KDEGSDISIVCPNPFFYSAGKDGLNTTIFYGVEALFEFPFSNESLQDPLLEMGEIKNETE QVVAYNGDAEIGVTITIHAIGEVSNITIYNTGTREVMRIDTDKLEKFTGSGIIAGDEIII CTVKGNKSITLLRNGKTTNILNCLDKNADWFQLAKGDNIFAYTAEYGSTNLQFKIENRIV YEGV >gi|229784069|gb|GG667666.1| GENE 20 15618 - 22031 3538 2137 aa, chain - ## HITS:1 COG:no KEGG:EF1288 NR:ns ## KEGG: EF1288 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 961 1327 387 739 977 74 24.0 5e-11 MSTTVDERVVEMRFDNKQFEQNIQTSLSSLDKLKKSLNLEGAAKGLETVNDAANKCSGNM SPLSNAVETVRVRFSALEVMAITALQNITNSALAAGKNLVSAFTIDPIKTGFEEYETQIN AVQTILANTSSKGTTLDQVNNALDELNHYADMTIYNFTEMTRNIGTFTAAGVDLDTSVAA IKGIANLAAVSGSNSQQASTAMYQLSQALAAGTVKLQDWNSVVNAGMGGQVFQDALKETA KVHGIAIDEMIKDEGSFRETLSKGWLTSDILTETLAKFTGDLNEDQLRTMGYTDDQIKSI MEMGKTANDAATKVKTFTQLFDTLKEAAQSGWTQSWEIIVGDFEEAKELLTEVSDTFSAV INASADARNKMLQDWKDLGGRTMMIEAVKNVFEGLVSVAKPVREAFNEIFPPMTGKQLAE ITERIRDLTAKFKMGEESSKNLKNTFKGVFAVLDIVGQAFKAVAGGVGELIGLFLPAGNG VLSLTGSFGEYLVKLDETVKKTDVFGKAVSTVVDIVKIAITFVKTAGEKVKEFGKTAGEK FDFPGFELFHSFLERVHDRMAQIGDGAGKMKSGVIVAFEMMGEALEKCKFLKVMEALWTA VKVIAGGIADAVGTMMGTLAEKLGNADFSGVLDILNSIAVGGIAVSISKFLKSVTEPLEG LNGVLEGVTGILDGVRGCFEAYQTNLKAGTLLKIGAAIALLAGSIVAISLIDSDKLSASL GAITVLFANLLGAMAIFNKISSDTGKVSKACTAMIAMSVAVSILAGALKKVSDLDWGELA RGLIGIAGLTTIVVASSKAMASGQKQVMKGATSLIIFGAAIKILASACKDLSKLQWDELG RGLTGVGVLFGEIAVFLRVAKFNGKMISTATGIVILAAAMKVLASACKDFGQMEWSEIGK GLAGIGGLLAELAVFTNLAGNAKHVMSTGVALTAIGAAMKIFASAVKDFAQLQWDELGRG LTAMGGALAEVAIAVNLMPKNMIGIGTGLVIVGGALEIIANCMSKFGGMQWEEIGRGLTV MGGALAELAISLNFMKGTLGGSAALLVASAALAVLAPVLSILGALSWEAIAKGLISIAGA 
FTIIGVAGAVLTPLVPTILALSGAFALIGVGVLTIGAGLLAAGTGLSALAIGFTALATAG AAGATAIVAALTVIVTGIASLIPAVLTKVGEGLIAICKVIAAGAPAIGEAVKSVILTLID VFVSCVPQLADGALRLVVGVLEALVTYTPQIVDLAFKFLIGILEGIASNLPSLIKAGIDV LMAFFAGIVDALSGIDTGALLKGIAGIGLLSAIMLALSATAALVPGAMVGILGMGAVVAE MALVLAAVGLLSKLPGLSWLIGEGGKLLQGIGTAIGQFVGGIVGGFMSGVSSQFPQIGAD LSAFMNNVQPFLQGASQIQPSMMDGVKALAETVLILTAADILQGLTSWLTGGSSLSKFGE ELVPFGEAMRDFSLAIGNMDGEIVANAATAGKALAEMAAIIPNTGGLVSFFAGENDMTAF GKQLVPFGEAMKQFGDAITGLDSNAVTEAAIAGKAMAEMATTIPNSGGVVGFFAGENDMG EFGKQLVPFGEAMKAFGDAVRGLEADAIVNSATAGKALVELADTVPNTGGVVAFFTGNND VNTFGEKLVPFGEAMKAYSEAIMGMDSAAVTNSATAGKALVELANTIPNTGGLVSWFTGD NDLGSFGDSLVQFGSGIKSYSDSISGIDTGIMSSVITQVNRLVEMAKGMAELDTSGMSGF STALTQLGNNGIDSFINAFTDASGRVTSTATSMLTTFINAANAQKSNMTSTFTTMMQAVL TTLTNYQTQFNTAGSTLMTKFITGIKSQDGNTKTAITNIISGCVTAINNKQIQFNTAGSN LMIKLIAGIKSKDYETRNAFVNILSSCLTAIANKYPEFQNAGMQCMIKFIAGIKEKAEEV KTAFTGNLNASVTAIRDYHEQFKQAGAYLVEGFADGISENTYRAEAKARAMARAAAEAAE DELDEHSPSRVGYHIGDFFGLGFVNAIGTYAVKAYNASADMAKSAKTGLGNAIAKVKDMI DNGVDTQPTIRPILDLSDVEEKSHRLNTLFSRSQALTVSTGIAASRGQDLQNEDTNPNTG NSYNFTQNNYSPKALSRTEIYRQTKNQFSAMERMVET >gi|229784069|gb|GG667666.1| GENE 21 22028 - 22303 110 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295115638|emb|CBL36485.1| ## NR: gi|295115638|emb|CBL36485.1| hypothetical protein [butyrate-producing bacterium SM4/1] # 1 91 35 125 125 183 100.0 4e-45 EGVAALASATPVDTGRTANSWHYKIEQKQGSVSISFYNTNIQNGVPIAVILQYGHATRNG GWVQGRDYINPAIQPIFDKIADAAWKEVTKL Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:18:06 2011 Seq name: gi|229784068|gb|GG667667.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld60, whole genome shotgun sequence Length of sequence - 23752 bp Number of predicted genes - 27, with homology - 23 Number of transcription units - 16, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.500 - CDS 3 - 750 764 ## COG1089 GDP-D-mannose dehydratase 2 1 Op 2 . 
- CDS 832 - 1902 880 ## COG0836 Mannose-1-phosphate guanylyltransferase + Prom 1875 - 1934 4.4 3 2 Tu 1 . + CDS 1968 - 2081 114 ## 4 3 Op 1 . - CDS 2427 - 2996 204 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 5 3 Op 2 . - CDS 3032 - 4150 224 ## gi|288870751|ref|ZP_06115168.2| hypothetical protein CLOSTHATH_03448 - Prom 4319 - 4378 5.7 - Term 4316 - 4366 1.6 6 4 Op 1 25/0.000 - CDS 4550 - 6229 382 ## COG0438 Glycosyltransferase 7 4 Op 2 . - CDS 6239 - 6721 249 ## COG0438 Glycosyltransferase - Prom 6866 - 6925 4.0 8 5 Op 1 . - CDS 7443 - 7808 136 ## gi|288870752|ref|ZP_06115171.2| conserved hypothetical protein 9 5 Op 2 . - CDS 7813 - 8970 322 ## COG0438 Glycosyltransferase - Prom 8995 - 9054 2.7 10 6 Op 1 . - CDS 9058 - 9321 160 ## gi|266622238|ref|ZP_06115173.1| conserved hypothetical protein 11 6 Op 2 . - CDS 9311 - 11371 658 ## gi|266622239|ref|ZP_06115174.1| methyltransferase, FkbM family 12 6 Op 3 . - CDS 11382 - 11753 203 ## gi|266622240|ref|ZP_06115175.1| hypothetical protein CLOSTHATH_03455 13 6 Op 4 26/0.000 - CDS 11759 - 12616 380 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component 14 6 Op 5 3/0.500 - CDS 12616 - 13392 577 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 15 6 Op 6 25/0.000 - CDS 13379 - 14341 436 ## COG0438 Glycosyltransferase - Prom 14421 - 14480 6.8 16 6 Op 7 . - CDS 14532 - 15653 348 ## COG0438 Glycosyltransferase - Prom 15673 - 15732 4.6 + Prom 15773 - 15832 7.8 17 7 Tu 1 . + CDS 15857 - 16606 263 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 18 8 Tu 1 . + CDS 16731 - 16862 107 ## gi|288870754|ref|ZP_06115181.2| conserved hypothetical protein 19 9 Op 1 . - CDS 17406 - 17885 182 ## Dhaf_2435 RNA polymerase, sigma-24 subunit, ECF subfamily 20 9 Op 2 . - CDS 17935 - 18099 131 ## - Prom 18132 - 18191 7.6 + Prom 18043 - 18102 3.3 21 10 Tu 1 . 
+ CDS 18154 - 18354 164 ## gi|266622248|ref|ZP_06115183.1| putative phage tail sheath protein + Prom 18525 - 18584 5.0 22 11 Tu 1 . + CDS 18663 - 18812 57 ## 23 12 Tu 1 . - CDS 20250 - 20408 163 ## gi|266622250|ref|ZP_06115185.1| hypothetical protein CLOSTHATH_03467 - Prom 20563 - 20622 2.1 24 13 Tu 1 . + CDS 20599 - 20748 192 ## + Term 20817 - 20849 -0.1 25 14 Tu 1 . - CDS 20799 - 21272 274 ## Closa_0569 hypothetical protein - Prom 21308 - 21367 7.5 - Term 21481 - 21518 6.1 26 15 Tu 1 . - CDS 21584 - 22942 1539 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 23006 - 23065 3.7 + Prom 22897 - 22956 5.4 27 16 Tu 1 . + CDS 23082 - 23751 552 ## COG0582 Integrase Predicted protein(s) >gi|229784068|gb|GG667667.1| GENE 1 3 - 750 764 249 aa, chain - ## HITS:1 COG:aq_1082 KEGG:ns NR:ns ## COG: aq_1082 COG1089 # Protein_GI_number: 15606359 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: GDP-D-mannose dehydratase # Organism: Aquifex aeolicus # 2 249 4 253 345 335 64.0 6e-92 MKNVLITGITGQDGSYLAELLLEKGYNVYGLMRRKSVVDYGNVEHIKDRIHFIYADMTDV VSLTNAMRISQADEVYNLAAQSFVATSWEQPIATADIDAIGVTNMLEAIRTVKPEAHFYQ ASTSEMFGKVQEMPQTEKTPFYPRSPYGVAKVYGHWITKNYRESYDMFACSGILFNHESE RRGLEFVTRKITDAVARIKLGILDHVELGSLDAKRDWGHSKDYVKAMWLMLQQDTPDDYV IATNETRTV >gi|229784068|gb|GG667667.1| GENE 2 832 - 1902 880 356 aa, chain - ## HITS:1 COG:CAC3072 KEGG:ns NR:ns ## COG: CAC3072 COG0836 # Protein_GI_number: 15896323 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 5 335 4 336 350 301 46.0 1e-81 MTTTALIMAGGRGERFWPKSRKNLPKQFLSLTDDGKTMIQLTVERILPLVKLKDIFIATN KAYRELVLEQIPGLPEENILCEPIGRNTAPCIGLGAIHIAQKYDDSIMFVLPSDHLIKFT NMFLKTLETGADVAENNTNLVTIGITPDYPETGYGYIKFDPRRTEGQAYAVERFVEKPSL EVAKEYLSTEEYLWNSGMFIWKVSSILKNMQKLMPDTYESLIKIKEAIGTPQQDFILEKE FHNMQSQSIDYGIMEKADNIYILPGTFGWDDVGSWLAVERIKKTNEFGNAVAGNIITVNT HNCIIQGDKKLIAIVGMEDTIVVDTKDATLICAKDSAGDIKKVTENLKICNRNEYI >gi|229784068|gb|GG667667.1| GENE 3 1968 
- 2081 114 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no METMIINPNLSKYKFANGFNKYIINVENAIPQTQTLP >gi|229784068|gb|GG667667.1| GENE 4 2427 - 2996 204 189 aa, chain - ## HITS:1 COG:SPAC31G5.16c KEGG:ns NR:ns ## COG: SPAC31G5.16c COG0463 # Protein_GI_number: 19114929 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Schizosaccharomyces pombe # 9 189 5 195 236 84 31.0 1e-16 MEIIDKGISVVLLAYMEEDNLKFLLPQIINNLNKIGQDFEILVIDTAEPLDHTREVCTEF NARYINQEWPGFGGAFKTGIKYANKEKFMILDSDGSHNPNNIPEINKKFDEGNYDIVIGS RYIKGGETNDKKTSVVMSKMLNLAFRICLGIKAHDISTDFRMYDTQQLKNVKLDNKNYDV LQEVLLKTK >gi|229784068|gb|GG667667.1| GENE 5 3032 - 4150 224 372 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870751|ref|ZP_06115168.2| ## NR: gi|288870751|ref|ZP_06115168.2| hypothetical protein CLOSTHATH_03448 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03448 [Clostridium hathewayi DSM 13479] # 1 372 95 466 466 628 100.0 1e-178 MLFSLGYVFWEIFYFMLTIYYSYKLKEYKFMVYTIVIFTLTHIFTGFFIMLESLAAVGIF WFLLLFFIHYSQLYNRVPKWFVCFTVILSVKVYESFAFFGIVLLLTIFNKKLWKRKEKLF ILGISVLLAYASFQGFRYVIWPRDPVNAKGALQSILSLDYRLIATYVLLFIIFIFSILIN YAKDTKKRKTFNNCLCFLNLVCGCWFAYLMFFNTDYIATESYNSRITNLAFPLALSLVVL FIYIKKIDFSLSRLTCVLGILLISNLWFNFKSAYDYNQHLQKTYEITSMNEGIIPANKVD LSNIKKYYWPWTMMFESVNAQMVNGVYNIKCIIIHDEWYNSWEPFDTKDILSYPDLTYYN VTYLNDTLKRHD >gi|229784068|gb|GG667667.1| GENE 6 4550 - 6229 382 559 aa, chain - ## HITS:1 COG:AF0045 KEGG:ns NR:ns ## COG: AF0045 COG0438 # Protein_GI_number: 11497665 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Archaeoglobus fulgidus # 135 508 406 784 1213 132 27.0 2e-30 MEKYIKRKKALVQRHLLHEISKIDSSFIDYSNLLKQFDGDEIKKKEVEKLEYAQYSLTLL NSSQSSHQCILEDNLLQTIKYQKARIDYLEMMFNKYLLNTSEAISDKTKINHALDVFMNE QRANNVNISSKLFDKKYKIAMFAPLSPLKNGIAHYTEDILVQLVQYFDIDVYIDTNYEPT NHIIIDSCKIYSHYSFEKSYRNYDAVIYQIGNNPDHVYMIPYLLKFPGIIVLHDFILSYL RSFLSEELKQFCDTNCNSIYYNGNNDGKNPFNKYIFQNARGIIVHSEYSKLGIWEQDFSK 
LITKIPLYSKCDDVLDDDTLKTQKYHYTQDTIILASFGFVTASKRIVPILQAFNKLKLAL PLIKLKYLIVGQIEPEIKSEIDLVVARHKLFSDVVMPGYVSLDELENYINISDICINLRS PYGGETSAALSRIMGKGKACIVSDIGSFSELPDNVCIKIPTAPENEINNIYTAMLELVRN QKLRHGIAKAAHEYVKTYLNIEKSAEMYRNFIVEIIAGGKNLNYTMLKKIATFIACNSFY DENYILDTVTRRITYELLD >gi|229784068|gb|GG667667.1| GENE 7 6239 - 6721 249 160 aa, chain - ## HITS:1 COG:MA1179 KEGG:ns NR:ns ## COG: MA1179 COG0438 # Protein_GI_number: 20090045 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 1 158 645 808 812 90 34.0 8e-19 MGSAHPPNFDACDSIINFAYRCPETYFFLIGSQCTKYERISLPNNVGLLGMVNESVKNSI LGIADFSVNPICSGSGTNVKMFDYMACGIPIISTLFATRGIENKDSFIICDIKEMPVYIN QFDLSNQTNRVLEAYTYVKEYFNWDKIVKETLYPKLNQLL >gi|229784068|gb|GG667667.1| GENE 8 7443 - 7808 136 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870752|ref|ZP_06115171.2| ## NR: gi|288870752|ref|ZP_06115171.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 32 121 1 90 90 164 100.0 2e-39 MIETNLSYVKYLLLSSDKILNDTNNFPLHYYMLEEETKCMQKKSHINPEFYIGSSKGRII GFLAISYKKLIRKMLGWYFREIVYQQNQFNESVLITLDTSKMLMQEILDKEQLTDHKRSE D >gi|229784068|gb|GG667667.1| GENE 9 7813 - 8970 322 385 aa, chain - ## HITS:1 COG:MA1061 KEGG:ns NR:ns ## COG: MA1061 COG0438 # Protein_GI_number: 20089931 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 80 384 106 413 419 114 28.0 3e-25 MLAPAQNMDSNSNIEYSYIYMSYTLEDIESNNKYLMDIPVVKFADGVLTSEAFQWDWTNI PTNVAPQVWYTLNKDYLDFIKQHLETNDYDIIQIVHSQFAWLVPTIRKLRPEIKIIVDYQ NIEWLVYKRWIDYIDDSEKIKALEKEYTTLKCWEEKVLDWFDGIFCISPIEKEQLKSLTK TPLYYVPTGAGINDDEYYPKAENSTKRYDLVFVGSMNWFPNTQALEWFLEKVMPLIQKKR PATRLEIIGSGKPDQHMIKVIKQNRNVTFWGEIENEKPFLHGAKVFISPIWIGAGVRLKN PTAWAAKLPVVATSISVEGLEYLPGKDLLIGDTPEDFAKQVLSVLNDPEYGKSIADRAYQ TYKEIYSAERLTNIWVESYHKVLGR >gi|229784068|gb|GG667667.1| GENE 10 9058 - 9321 160 87 aa, chain - ## 
HITS:1 COG:no KEGG:no NR:gi|266622238|ref|ZP_06115173.1| ## NR: gi|266622238|ref|ZP_06115173.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 87 1 87 87 129 100.0 1e-28 MKDDVTWDAQIEFIDLLLYVKHNTVSKYYLLDNNELKGQPEVIYKDLGVKAALKNWWIRK LNNEKLRSLKIVKFIIKIKRIMHIKFY >gi|229784068|gb|GG667667.1| GENE 11 9311 - 11371 658 686 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622239|ref|ZP_06115174.1| ## NR: gi|266622239|ref|ZP_06115174.1| methyltransferase, FkbM family [Clostridium hathewayi DSM 13479] methyltransferase, FkbM family [Clostridium hathewayi DSM 13479] # 1 686 1 686 686 1319 100.0 0 MLDNWKRKYYGLCEKVANRHEIQELLSFLFMDIPNEELISFYLSQKFTLTQVFIDVWKKR TGFIENGIKPADPKDIINLYSSLLSRKPSKEELNFYKTESCSVYYIFNTILSSDEYKEAT SKTYNPSIIEKLSLLLSGEKKQNLCCQSSELTCSDMKLYIGHLLEEHINEIDIKISDYQD IKNIYYDLLGRFPTNEELGYFFEKSTINYVLNHIYNSEEYMEKCKIYNVETVIPKLILLL QGSDSISEEIKSALAHNYTIGHVTEKLICTDFVRNRIDNVRKIEKKDIINGFVGILGRFP TESEVDYYLHQGYELGSFLQTLLYSDEGIQNIEKKNDDLNVHYGITQFNCFQTNFDIVFD NSMSDVISMGIKEKKFLWDDEYNYIKLLPDTGFVLDIGANVGAISMIFGAKGWSGFCIEA SKKNTECLKRSIYLNRYNFGVGCFAVSDATKKIAFMENGPWGVINNTLIQDDTNNFLQTF KSGSVLKEIQAYCLDDWEKTELRYIKKIDFIKMDIEGSEYSALQGMGEFLKVFQYPSFYS EVNGYNLFTYNKTPRQLFDKFREYGYMPYKLIEDQLKEFDYNNFQTELYDNYLFIHKTNH YFDKMINGRIVFDTLKVKLEILNILNNGWSQHIKYLLFALKDYPQFYNDIDIKEKLMLYS NLQDDKVINYALDWLNKTNEVNSDER >gi|229784068|gb|GG667667.1| GENE 12 11382 - 11753 203 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622240|ref|ZP_06115175.1| ## NR: gi|266622240|ref|ZP_06115175.1| hypothetical protein CLOSTHATH_03455 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03455 [Clostridium hathewayi DSM 13479] # 1 123 1 123 123 235 100.0 8e-61 MVGTEITHVKAGESFWIELEIAVNEEIKNGVIGIGINDDMGKMVFGLNTDMDAIHVPPIV KRRKISLLIKNVNLISGEYTIHASISDGREEIYDWYPVINRFFIDNESRFVGIFYLDHDW NFD >gi|229784068|gb|GG667667.1| GENE 13 11759 - 12616 380 285 aa, chain - ## HITS:1 COG:CAC2328 KEGG:ns NR:ns ## COG: CAC2328 COG1134 # Protein_GI_number: 
15895595 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate transport system, ATPase component # Organism: Clostridium acetobutylicum # 7 256 4 257 419 248 48.0 9e-66 MKPENAIEVNHISKKFKVYMDKGHTLKEKMLFSKRRKYEERIVLDDISFEVKKGEAIGLI GQNGCGKSTTLKLLTRIMYPDSGNIEMCGRVSSLIELGAGFHPDMSGRQNIYTNASIFGL TRKEIEARFDDIVAFSELEQFIDNPVRTYSSGMYMRLAFSVAINVDADILLIDEILSVGD SSFQTKCFKKILDLKKRGVTIIIVSHDLQSIQKICDRCIWIQNGKVEDNGYPFHVINRYL KYMEYKGAEKESEELKINLPINQSETISSSLENWGNKKVFLKMFF >gi|229784068|gb|GG667667.1| GENE 14 12616 - 13392 577 258 aa, chain - ## HITS:1 COG:CAC2329 KEGG:ns NR:ns ## COG: CAC2329 COG1682 # Protein_GI_number: 15895596 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Clostridium acetobutylicum # 5 258 6 258 258 139 32.0 4e-33 MQIFRELYAYREMIFSLVRRDLKGRYKGSVLGFLWTFINPLLQLAVYTLVFSVILRNNIE DFYLFLFVALVPWIFFSTSVAGGSSCIWQQQDMVKKIYFPREVLPIAFVTSQFINMLLSL LVVLATLIVSGFGINFKALLYLPIIMIIEYSLSLGMALITSAVTVYVRDLEYLLGIITMA WQFLTPIMYSVELVPAGLMPVFNLNPMTPVIIAYRDILYYKQLPKMETLLQATILGVLLI VIGWWTFSKLKRHFAEEL >gi|229784068|gb|GG667667.1| GENE 15 13379 - 14341 436 320 aa, chain - ## HITS:1 COG:MA3757 KEGG:ns NR:ns ## COG: MA3757 COG0438 # Protein_GI_number: 20092555 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 76 311 117 351 351 182 42.0 1e-45 MPASVYNHLERLIPFPYRFVFGEKTDITQFFNYTVPLGVRGKRITFVHDMSYKSCPDTMA LKTKRWLEDHLEQYCRRADLVITISEFSKCEIIKYLAIPSKKIHVVHCGVDLKQYREIDN IKQIEDVKQKYGIVGNYILYLGTLEPRKNLDSLLTAYMYLKKEQKFVPNLVLAGKKGWLY DSIFEKVKEYGLQENVVFPGYIAASDAPILLSGALLFVFPSLYEGFGIPPLEAMACGTPV ITSNSSSLPEVVGDAAILTDPLDIQGLKNAMQKLINSPELRRKLKEKGKLRAEQFSWRAS AKRLLNIYNEVKMETGNADI >gi|229784068|gb|GG667667.1| GENE 16 14532 - 15653 348 373 aa, chain - ## HITS:1 COG:CAC3070 KEGG:ns NR:ns ## 
COG: CAC3070 COG0438 # Protein_GI_number: 15896321 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Clostridium acetobutylicum # 2 303 5 307 370 71 25.0 3e-12 MIEIVVDARTMGSQPSGIGFYLYDFLKGLVHAEKLHIKLITDVAESEQIHYLQQQKVPVI CYGKRVYRSAGVYAYFKFIKNYLIKEQPDIFWEPNNLIPIKLHGFSGKIVLTVHDLFPIT NPEYFSLLYCLYFRRGIRQSIHRADALLFDSDETRSFVHHYFSEAVCKRSFISYLIIDRH INVEPHDGGYFLYVGNIEKRKGSNLLIKAYKKYCSEGGTMPLYLGGRIREKKLQSLMNEA ADKYPMLHHLGYLSQEEKEKWIAGCSCFLFPSKAEGFGLPPLEAIKYGKRVITSNLPIFR EILCGTTECFDINGSEAQQIDRLCKCMMSYPEPLESDKNEIYANVLKKYDPQILTESLKN FMLSLEDNREDSI >gi|229784068|gb|GG667667.1| GENE 17 15857 - 16606 263 249 aa, chain + ## HITS:1 COG:BH3661 KEGG:ns NR:ns ## COG: BH3661 COG0463 # Protein_GI_number: 15616223 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus halodurans # 1 249 6 257 257 155 36.0 5e-38 MEKPVVSVIMPAYNCEKFIAKAIESVLAQNVSLELIILNDCATDGTERVIQQYLSDQRIR YVKNDRNMGVARTRNEGVRMAMGEYVAFLDSDDWWEREKLSKQLALIKREKKVLCSTGRE LVDINGNLTGREIPVRETITYGMMLKQNWINCSSVLLKREVAEEFPMEHEDSHEDYITWL KILQKYQYACAINEPLLKYRLSSQGKSGSKFKSAKMTFKVYQYMGFSWWKSIGCFICYAW NGVRKYVRK >gi|229784068|gb|GG667667.1| GENE 18 16731 - 16862 107 43 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870754|ref|ZP_06115181.2| ## NR: gi|288870754|ref|ZP_06115181.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 43 30 72 72 82 100.0 1e-14 MNLLTDMYGCAGLMYAVIDRVVALIMKLEPEEFWRLESEAYDI >gi|229784068|gb|GG667667.1| GENE 19 17406 - 17885 182 159 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2435 NR:ns ## KEGG: Dhaf_2435 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 22 149 7 134 139 86 38.0 3e-16 MPDSFSPVLFTQIRKKGIIINYYIRIEQQNIPVSEEIYKIWQHGERKERYFREGDIRNGV FSYDALDGEGLNGSELFADEDRNPVEHQAERDLLINVLKQALNTLNNEDKELLALIYCQE 
ESLRSIAQSTGVPFTTLQYRHKKVIKKLRLFFQHKFLLL >gi|229784068|gb|GG667667.1| GENE 20 17935 - 18099 131 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNHIIYCPVCHQRLFDIKPPIAGTIEIKCPRCKKVQHIHLIEPATLKSTTEKTT >gi|229784068|gb|GG667667.1| GENE 21 18154 - 18354 164 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622248|ref|ZP_06115183.1| ## NR: gi|266622248|ref|ZP_06115183.1| putative phage tail sheath protein [Clostridium hathewayi DSM 13479] putative phage tail sheath protein [Clostridium hathewayi DSM 13479] # 1 66 58 123 123 120 98.0 4e-26 MDYVNNRIKKLNQNENNIGRQMKTFMIYSNNVERDTMMQEEIDNEHLICISGVVPVKPAK YNVLEI >gi|229784068|gb|GG667667.1| GENE 22 18663 - 18812 57 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAAALENLYEYFSIMLWKGCMLRAGDGDSNEYNNIFCVLLPDRPYKIRK >gi|229784068|gb|GG667667.1| GENE 23 20250 - 20408 163 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622250|ref|ZP_06115185.1| ## NR: gi|266622250|ref|ZP_06115185.1| hypothetical protein CLOSTHATH_03467 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03467 [Clostridium hathewayi DSM 13479] # 1 50 1 50 50 91 100.0 2e-17 MKNDIIYDFDKIIRIDPNPPAKNIQQACYGMIPRASYRPGALKNILENRFDS >gi|229784068|gb|GG667667.1| GENE 24 20599 - 20748 192 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTDTEKMRFMEMENMRLMEENVKLRRVISKLNQTLNRLIEHYISSDRIA >gi|229784068|gb|GG667667.1| GENE 25 20799 - 21272 274 157 aa, chain - ## HITS:1 COG:no KEGG:Closa_0569 NR:ns ## KEGG: Closa_0569 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 11 138 1 128 135 207 75.0 8e-53 MTGKAKKEDAMRELSVYYCSKCGYYGYYQLPKNAVCPKCSVDMVPLSISFQDFMDLSCEE RDDLLSKQIISASSPYVKRLMAPHKAYNNREFIARMSDRIVELEAENKKLNETVEWMHQT IWDLVRKNKGIEPAGKSSLPSVDENSTDGTGKSENPE >gi|229784068|gb|GG667667.1| GENE 26 21584 - 22942 1539 452 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: 
Streptococcus pneumoniae TIGR4 # 39 226 527 691 744 86 32.0 1e-16 MKKNLKMMAVLSAAAVMTVAAPEMGLTAGSSTAYAKTIGWVEENGSFKYYEEDDYFLTDT WKKRGEDWYYLNEEGEIATNAQIDEYYVDETGKRVMDQWISIENEDAWDSPDAAEYHWYY YGKNGKAVISRWQKIGESWYYFNEDGQMMTGKVEIDDATYYLGEANDGVMKTGWIQLENE SDDPEMTHSWYYFDTNGKMVENQVDKKISGDYYTFVDGVMQTGWFKLPVSEQEENATPSD ATENTAATVAGYQYYEPENGKRVTGWRTIEGVEGISQEGEFYNFYFKNGKPYHAESGLQL FTVESKKYAFNTKGEMQTGLKVVNLEGGEVANFYFAEDGVMRTGKQVIYNEDLDENQTWF FYTDGSRKGQGFHGIRDNTLYEYGLRKDADSDLRLAPMAYDGNQYLVNTSGSIQKASSSS KSAAKPELGNGFKDYKDSNGKTWVVDVNGIIQ >gi|229784068|gb|GG667667.1| GENE 27 23082 - 23751 552 223 aa, chain + ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 6 223 3 219 265 230 52.0 2e-60 MEMDVEFEGFLRRSNLADNTIKAYLYAVKQYCSRYEKVTKSNLQKYKLFLLENYKPKTVN LRIRAVNCYLEAIGKEKLALSSVKVQQKTYLENVISDADYMYFKECLRRDGEWFWYFVIR FMAATGARVSELVQIKAEHVQTGYLDLYSKGGKIRRIYIPSALREEAMQWIKYRGIESGF LFLNRSGDRITTRGIAGQLKKFAVRYGIPVEVVYPHSFRHRFA Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:19:54 2011 Seq name: gi|229784067|gb|GG667668.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld61, whole genome shotgun sequence Length of sequence - 30042 bp Number of predicted genes - 37, with homology - 36 Number of transcription units - 9, operones - 8 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 260 247 ## gi|266623296|ref|ZP_06116231.1| putative bacteriocin ABC transporter 2 1 Op 2 . + CDS 266 - 691 307 ## CLJ_B2545 hypothetical protein + Prom 1050 - 1109 5.5 3 2 Op 1 . + CDS 1135 - 2010 319 ## COG0616 Periplasmic serine proteases (ClpP class) 4 2 Op 2 . + CDS 2072 - 2251 105 ## + Term 2264 - 2296 1.0 5 3 Op 1 . + CDS 2321 - 2764 311 ## Cphy_2972 hypothetical protein 6 3 Op 2 . + CDS 2730 - 4022 756 ## COG1783 Phage terminase large subunit 7 3 Op 3 . 
+ CDS 4035 - 5441 1557 ## SPJ_1869 phage portal protein, SPP1 family 8 3 Op 4 . + CDS 5434 - 5838 367 ## SPJ_1868 phage putative head morphogenesis protein, SPP1 gp7 family + Prom 6677 - 6736 80.4 9 4 Op 1 . + CDS 6822 - 7415 297 ## COG5585 NAD+--asparagine ADP-ribosyltransferase 10 4 Op 2 . + CDS 7481 - 8116 580 ## SP670_2151 phage scaffold protein 11 4 Op 3 . + CDS 8136 - 9068 843 ## CLL_A1913 prophage protein 12 4 Op 4 . + CDS 9075 - 9314 255 ## gi|266622266|ref|ZP_06115201.1| transcription termination factor Rho 13 4 Op 5 . + CDS 9320 - 9679 291 ## gi|266622267|ref|ZP_06115202.1| prophage protein 14 4 Op 6 . + CDS 9615 - 9986 223 ## TepRe1_1056 hypothetical protein 15 4 Op 7 . + CDS 9983 - 10513 424 ## TepRe1_1057 phage protein, HK97 gp10 family 16 4 Op 8 . + CDS 10510 - 10878 295 ## SPJ_1856 conserved structural protein, putative 17 4 Op 9 . + CDS 10883 - 11296 513 ## SPJ_1855 major tail protein, putative 18 4 Op 10 . + CDS 11315 - 11758 601 ## SP670_2143 prophage protein 19 4 Op 11 . + CDS 11773 - 12141 281 ## gi|266622273|ref|ZP_06115208.1| hypothetical protein CLOSTHATH_03489 20 4 Op 12 . + CDS 12141 - 12353 354 ## gi|266622274|ref|ZP_06115209.1| conserved hypothetical protein + Prom 13200 - 13259 80.4 21 5 Tu 1 . + CDS 13361 - 14899 1265 ## Ccel_3035 Phage-related protein-like protein 22 6 Op 1 . + CDS 16147 - 16890 333 ## gi|266622276|ref|ZP_06115211.1| hypothetical protein CLOSTHATH_03492 23 6 Op 2 . + CDS 16955 - 17674 456 ## Shel_11450 hypothetical protein + Term 17681 - 17714 -0.4 24 6 Op 3 . + CDS 17734 - 18126 283 ## gi|288870763|ref|ZP_06115213.2| hypothetical protein CLOSTHATH_03494 25 6 Op 4 . + CDS 18126 - 18437 326 ## gi|266622283|ref|ZP_06115218.1| conserved hypothetical protein + Prom 19276 - 19335 80.4 26 7 Op 1 . + CDS 19358 - 21724 1354 ## BATR1942_04005 hypothetical protein 27 7 Op 2 . + CDS 21791 - 22846 129 ## gi|288870764|ref|ZP_06115216.2| head morphogeneis protein, SPP1 gp7 family + Term 22847 - 22879 2.0 28 8 Op 1 . 
+ CDS 22899 - 23291 282 ## gi|288870765|ref|ZP_06115217.2| hypothetical protein CLOSTHATH_03498 29 8 Op 2 . + CDS 23291 - 25546 1999 ## Sgly_0356 hypothetical protein 30 8 Op 3 . + CDS 25509 - 25907 328 ## gi|266622284|ref|ZP_06115219.1| conserved hypothetical protein 31 8 Op 4 . + CDS 25907 - 26356 471 ## gi|266622285|ref|ZP_06115220.1| conserved hypothetical protein 32 8 Op 5 . + CDS 26361 - 27545 279 ## Closa_0859 hypothetical protein + Term 27552 - 27579 -0.9 33 8 Op 6 . + CDS 27602 - 27760 184 ## gi|266622353|ref|ZP_06115288.1| conserved hypothetical protein 34 9 Op 1 . + CDS 28715 - 29074 117 ## gi|266622286|ref|ZP_06115221.1| conserved hypothetical protein + Term 29081 - 29108 -0.9 35 9 Op 2 . + CDS 29131 - 29325 240 ## gi|266622353|ref|ZP_06115288.1| conserved hypothetical protein 36 9 Op 3 . + CDS 29342 - 29611 156 ## gi|266622290|ref|ZP_06115225.1| putative branched-chain amino acid ABC transporter permease protein 37 9 Op 4 . + CDS 29632 - 29889 372 ## gi|266622291|ref|ZP_06115226.1| enolase + Term 29891 - 29923 1.2 Predicted protein(s) >gi|229784067|gb|GG667668.1| GENE 1 3 - 260 247 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623296|ref|ZP_06116231.1| ## NR: gi|266623296|ref|ZP_06116231.1| putative bacteriocin ABC transporter [Clostridium hathewayi DSM 13479] putative bacteriocin ABC transporter [Clostridium hathewayi DSM 13479] # 1 84 39 122 123 107 100.0 2e-22 DLVHAIEFAVDKSERNRVATKFQQSRKYRRQNKDIVKRNERIVKFFEEQKNRDTLNRMRQ LLGQQRKEEEYLDGERVYKPRVGKG >gi|229784067|gb|GG667668.1| GENE 2 266 - 691 307 141 aa, chain + ## HITS:1 COG:no KEGG:CLJ_B2545 NR:ns ## KEGG: CLJ_B2545 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_Ba4 # Pathway: not_defined # 5 123 7 127 145 66 33.0 4e-10 MNKEVLEQYIDACELIKETEADIRRVKKQRKTIVQDRVRGSMSEFPYAAQSFNIHGMVYA AAREPGELAVYERLLEERKAKAEEIKVQVEAWLNTVPQRMQRIIKYKIFEGNTWAETASR IGRKATPDGIRKEYENFMKTA >gi|229784067|gb|GG667668.1| GENE 3 1135 - 2010 319 291 aa, chain + ## HITS:1 COG:Cgl1774 KEGG:ns NR:ns ## COG: 
Cgl1774 COG0616 # Protein_GI_number: 19553024 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Corynebacterium glutamicum # 8 291 22 303 314 227 41.0 2e-59 MAGWDDILREFNETMSQSDFVRRKYLRKLSEYTGRNTIAYYSAFLNKPNASNLDINDLDM TGFMNALKGMECSKGLDIILQTPGGSPASAEAIISYLRKKFSNDIRVIVPQIAMSAGTMM ACAAKVIIMGNQSSLGPVDPQFNGIPAYNIQMEFEEAKKDLASNPQNAQYWAIKLQQYPA AFLKTAMDAIALSGTLLREWLSTCMFEGEDATKVEAIVAKLNEHDDSKTHGRHFNIDFCR NIGLKIVALEDDNILQDAVLSVHHAYMLTLGGTDVVKIIESQNGKAIINHE >gi|229784067|gb|GG667668.1| GENE 4 2072 - 2251 105 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNYAVVDRLYRTLGIKNDDKIPKRILNTGTMDRKLNTNYSAVIGNKPNTTWSNSAEMTK >gi|229784067|gb|GG667668.1| GENE 5 2321 - 2764 311 147 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2972 NR:ns ## KEGG: Cphy_2972 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 132 1 171 190 168 60.0 8e-41 MAKGKYEYWLSPDGLLRLEAYARDGLTDEQIAKNLDIVPSTLYEWKRQYSEISEALKKGK EVVDIEVENALLKRALGYSYEEKKVEVSEEGTKVTKTIKEVVPDTTAQIFWLKNRRPDRW RDKQDIEHSGQIGGVTIVNDIPKPDTS >gi|229784067|gb|GG667668.1| GENE 6 2730 - 4022 756 430 aa, chain + ## HITS:1 COG:SPy0972 KEGG:ns NR:ns ## COG: SPy0972 COG1783 # Protein_GI_number: 15674984 # Func_class: R General function prediction only # Function: Phage terminase large subunit # Organism: Streptococcus pyogenes M1 GAS # 32 411 26 408 429 154 30.0 4e-37 MSTISPNQTQVRLSSLIAPSFYDLHWDIAEHRHTHYKLAGGRGSTKSSFVSLEIILGMMQ DPQANAIAMRKVGRFLDESVFQQLIWAINVLEVTDKWKIRYSPLGLTYTPFGNKIIFRGA DDPQKIKSVKLANGYFKYIWFEERAEFDGDSEERTILQSLMRGGPQYFVFYSWNPPKSMN SWVNQDILQSKENAIVHHSDYRTVPKEWLGDDFFIEAEDLKETKPKAYEHEYLGIATGTG GQVFENVTVRPITEEEMARFDRIYQGLDFGFGADPAAYIKMHYDRTRKRLFLFGEIYAPR LGNTKLADQIKKYNPLNRVVTADSEDPRAIDALNELGLRVVGARKGPGSVDFGMEFLADE VNEIIIDQQRCPNAAREFTGYELEQDKNGNFKGSYPDKDNHSIDAVRYALEDVMTSRKAK VRKKSDYGLH >gi|229784067|gb|GG667668.1| GENE 7 4035 - 5441 1557 468 
aa, chain + ## HITS:1 COG:no KEGG:SPJ_1869 NR:ns ## KEGG: SPJ_1869 # Name: not_defined # Def: phage portal protein, SPP1 family # Organism: S.pneumoniae_JJA # Pathway: not_defined # 56 457 7 405 433 356 49.0 1e-96 MYIYTLPRENWDETNPDKQAIRTLIVKHRREANQLRKSMKYYEGEHKILTESRKTKLVCN HAKDISDTASAYFIGNPISYNSKEDITPLTDAFEIAGADEADGDNGLDLSVYGRSYEYIY PMEGETDLTIKSLSPENTFMVYDDTIEQKELFAVYYYAKKDDSDKKRAIYVATVLTEHYK WVLNIENIDCPQALLEEPKPHYFDEVPVIEYLNNKLAIGDFELQIPLIDAYNALMSDRIT DKEQFIDAILALYGAMLGDNEAKDADGRTAAQKLKEDRLMELPKDAKAEYITRTFDESGV EILKKAIEQDIHKFSHIPCMTDESFGGNVSGVAMEFKLLGMENITKIKTRYYKRGLRKRL RLFSAWLAKSKSVQVDISGITPTFSRAMPKNLLEISQIVTNLWGKISRKTLLSQVPFVEN VEEELAAVEKEEDEAAKKQMEMFGLGSNTPPPDDEDEPKKKKPGEIDE >gi|229784067|gb|GG667668.1| GENE 8 5434 - 5838 367 134 aa, chain + ## HITS:1 COG:no KEGG:SPJ_1868 NR:ns ## KEGG: SPJ_1868 # Name: not_defined # Def: phage putative head morphogenesis protein, SPP1 gp7 family # Organism: S.pneumoniae_JJA # Pathway: not_defined # 19 127 1 111 527 78 38.0 9e-14 MSSASYWERRKAREMFHYMEKAEDTADEIAKLYLKSSGYLSAELDKIFERYKRKHHLTDA EAYRLLNSLHDKTSIDELKEALRTGDGAQKDILAELESPAYCARLERLEQLQNQLDATMK NVYRQEKKINTLAS >gi|229784067|gb|GG667668.1| GENE 9 6822 - 7415 297 197 aa, chain + ## HITS:1 COG:BH3531 KEGG:ns NR:ns ## COG: BH3531 COG5585 # Protein_GI_number: 15616093 # Func_class: T Signal transduction mechanisms # Function: NAD+--asparagine ADP-ribosyltransferase # Organism: Bacillus halodurans # 6 139 166 300 490 95 37.0 5e-20 MIDRVINSKWSGANYSTRIWNNTQALAQDLKEELLVNLITGRTDREVAEIIANKYAQGAS NARRLVRTESCNLANQMEMQSYEECGIEKYRFVATLDLKTSAVCRELDGKVFPVSEQQPG KNCPPMHPWCRSTTICVIDEIDMSNMKRRARDPVTGKTNTVPADMTYKQWYDKNVKGNVD AEAKEKELRKRKKSTQE >gi|229784067|gb|GG667668.1| GENE 10 7481 - 8116 580 211 aa, chain + ## HITS:1 COG:no KEGG:SP670_2151 NR:ns ## KEGG: SP670_2151 # Name: not_defined # Def: phage scaffold protein # Organism: S.pneumoniae_670-6B # Pathway: not_defined # 59 193 24 159 194 90 44.0 4e-17 MRKTGFTGISPKMNLQFFAESGDGAGADQGEGGGTGNKPDEGSSGGADTGNKEPKSFDDL 
LQNKDYQAEFDRRVQKALGTAKEKWTALMDDKLSEADKLAKMNKEEKAEYLRQKQEKELK DREAAITRRELMAEAKNTLAEKKLPVGLAEVLNYADADSCNKSMAAVEKAFQEAVQAAVE EKLKGGEPLKKAPSEDGKDLAKQVEDLMMGI >gi|229784067|gb|GG667668.1| GENE 11 8136 - 9068 843 310 aa, chain + ## HITS:1 COG:no KEGG:CLL_A1913 NR:ns ## KEGG: CLL_A1913 # Name: not_defined # Def: prophage protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 4 309 3 307 311 313 52.0 8e-84 MAINTLATATLFMNTLDKVAIREAVTGWMDANAGQVIYNGGAEVKIPKMSVQGLGDYDRD NGYQQGGVTLEYETRKMTQDRGRKFQLDPIDINENNFVTTAAAVMGEFQRMFVVPEIDAY RISKIATETITANKAGMIEYNYTPGATGTSALRKIKEGIKAIRELYNGPLVIHATPDMIM ELEMELSGKITNTTFSKGGIDTAVPSVDGVPIISTPSNRMYTAIKINDGKTPGQERGGYE KGESAKNINFFICPRTTPIAVTKQDIMRIFDPTINQKLNAWQMDYRRFHDIWVLDNKLDS IFLNIKDTKE >gi|229784067|gb|GG667668.1| GENE 12 9075 - 9314 255 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622266|ref|ZP_06115201.1| ## NR: gi|266622266|ref|ZP_06115201.1| transcription termination factor Rho [Clostridium hathewayi DSM 13479] transcription termination factor Rho [Clostridium hathewayi DSM 13479] # 1 79 1 79 79 124 100.0 3e-27 MRFIKDNVERVTESEAMADKLKALGFKPLGLETLTREPGEGNTNIGKMTVPELKILAKEK GIEGAASLNKEELLAVLKG >gi|229784067|gb|GG667668.1| GENE 13 9320 - 9679 291 119 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622267|ref|ZP_06115202.1| ## NR: gi|266622267|ref|ZP_06115202.1| prophage protein [Clostridium hathewayi DSM 13479] prophage protein [Clostridium hathewayi DSM 13479] # 1 119 4 122 122 184 99.0 2e-45 MTEIEKLKLLTGESDEDLLSLLLADAIEYVLGYTNRTELPAALNKPVRDLAVIAYNRLGT EGETGRSEGGESYSFDAAPKQIYDVLDRYRLVGVGGRRYEAKTKQAETVQPQAGDTEKG >gi|229784067|gb|GG667668.1| GENE 14 9615 - 9986 223 123 aa, chain + ## HITS:1 COG:no KEGG:TepRe1_1056 NR:ns ## KEGG: TepRe1_1056 # Name: not_defined # Def: hypothetical protein # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 1 123 1 108 110 65 32.0 9e-10 MRLKQNRLKLYSHRQAIPKKDNEGNSYMEYGLPSSFKAEVWPAGGKLQAEMYGQRVNHIQ NCRINAGYETVVDEKGHVSYRISSMTLQEGDGICLNVPGDHEPDYRIIAIRPYRYLTLEV ERI >gi|229784067|gb|GG667668.1| 
GENE 15 9983 - 10513 424 176 aa, chain + ## HITS:1 COG:no KEGG:TepRe1_1057 NR:ns ## KEGG: TepRe1_1057 # Name: not_defined # Def: phage protein, HK97 gp10 family # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 4 176 3 175 175 122 35.0 4e-27 MNETIKGLDKLLQKYGSLEAAAEHGVKKAIGQGTKIAQAGAVLMCPVNDEELRQSIKTRV MVEEGRVIGTVYTNKKYAAYVELGTGPRGQVDHAGISPEITPAYSPSPWWIHESQIDAET AEKYHWFYIDAPDGRFYQCSGQPAQPFLYPGLKDNEELICRKINDVLAAEIRKASQ >gi|229784067|gb|GG667668.1| GENE 16 10510 - 10878 295 122 aa, chain + ## HITS:1 COG:no KEGG:SPJ_1856 NR:ns ## KEGG: SPJ_1856 # Name: not_defined # Def: conserved structural protein, putative # Organism: S.pneumoniae_JJA # Pathway: not_defined # 1 117 1 117 122 100 42.0 2e-20 MINVKDEVYGALCTVTENVTDYYPRSWEQDISIQYMEEDNKVAEESGRGEVKSYVRYRID IWSRKSTSVAAVAVDAAISPLGLKRAQCMDVEDPSGLKHKQMRYEGIIDVRNKQVYHAIG KE >gi|229784067|gb|GG667668.1| GENE 17 10883 - 11296 513 137 aa, chain + ## HITS:1 COG:no KEGG:SPJ_1855 NR:ns ## KEGG: SPJ_1855 # Name: not_defined # Def: major tail protein, putative # Organism: S.pneumoniae_JJA # Pathway: not_defined # 1 134 1 133 137 136 55.0 2e-31 MLANGITLEVKKNGSESYTALQDLKEVPELGVDAEKVENTRLKDKFKHSELGIGDPGDMA YKFVYDNSSASSDYRVLREIADSNKVASYRQTFPDGTKYEFDAYSSIKVGGGGVNAAIEF TLTLGLQSDIKVTDPTE >gi|229784067|gb|GG667668.1| GENE 18 11315 - 11758 601 147 aa, chain + ## HITS:1 COG:no KEGG:SP670_2143 NR:ns ## KEGG: SP670_2143 # Name: not_defined # Def: prophage protein # Organism: S.pneumoniae_670-6B # Pathway: not_defined # 30 127 5 107 126 65 34.0 5e-10 MTQGFDEELESKEEEDKVVSLEEKKKRRPFAYWEVGGQTYKMKLTTQNICRLEDKFKTSL LNVLFGAGSVPPLSVMLTITQAAMLPYNHKIKYEGVQALFDKYCEEGGTQMTFMTDVFME IYKVSGFFTEDQAEEMDKRLEEAKDQM >gi|229784067|gb|GG667668.1| GENE 19 11773 - 12141 281 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622273|ref|ZP_06115208.1| ## NR: gi|266622273|ref|ZP_06115208.1| hypothetical protein CLOSTHATH_03489 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03489 [Clostridium hathewayi DSM 13479] # 1 122 1 122 122 240 100.0 3e-62 
MIDDLFPLALDCGISPERFWELSIPDIIDIMECSRRQEERKVKRELMNLHFLARDIGQFT AVAIQGSDKVEIMELWDFFPDLFGREHEETEKKIQEKQLAEYKARFNDFAIRHNHARAGG GN >gi|229784067|gb|GG667668.1| GENE 20 12141 - 12353 354 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622274|ref|ZP_06115209.1| ## NR: gi|266622274|ref|ZP_06115209.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 70 1 70 70 102 100.0 7e-21 MGKGMTLEKLQVIIEAQTKPYRDEVEKLKKQTTTAASHVERQTAKMKKSFGGLGKVVASV LGVGAIVAFA >gi|229784067|gb|GG667668.1| GENE 21 13361 - 14899 1265 512 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3035 NR:ns ## KEGG: Ccel_3035 # Name: not_defined # Def: Phage-related protein-like protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 304 83 390 841 208 40.0 6e-52 MSGAVDAFSKNAITQFGLSELTAKKYMGTYGAMAKSFGIVEEAGYQMSASITGLTGDVAS FYNLSTDEAYTKLKSIFTGETESLKELGVVMTQTALDQYALNNGFGKTTAKMTEQEKVML RYRFVMSSLADASGDFARTSTSWANQVRVLSLQFQSLKATIGQGLINAFTPVIRVINTIL AKLQTLAAYFKAFTVALFGDAGGSSDIADSMDSAAGSSGAVADNMDKAAGAAKKMKEYTL GIDELNVLNPDEGSGGSGGGAGGGGSLDFGDMSGELFGEVTVNQEIEAAVERLRKAIDAI KELAKPTTAALKRLWNEGLSLLGKFTWTALEDFWNYFLVPLGSWALGEGFPRFIDITNDF LKTINWEAINEALRNFWKALEPFAESVGEGLLDFYEDLSKVGAAFINTVVPGGLNALADA LNKIDSDQAKAIGYGLGVIATSILTFKITKSILDFFDKLKGVILAAGGASTFGITLTLSL IASTGISWASALDLLDRYENGTDEEKKEIEAS >gi|229784067|gb|GG667668.1| GENE 22 16147 - 16890 333 247 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622276|ref|ZP_06115211.1| ## NR: gi|266622276|ref|ZP_06115211.1| hypothetical protein CLOSTHATH_03492 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03492 [Clostridium hathewayi DSM 13479] # 1 247 109 355 355 443 99.0 1e-123 MYQNVKVKLKETWDNTVGEWAINIQTWWDKHVATWFTAKKWLDVYKSVKESLGTTWTNTV NDWKKNIGDWWAKDVEKWFKLETWIDMMKRVPDAFKETFKGAVNAAVDQLNRLIDWLNDK LNFSFDGLEILGEEIIPAFSVQLFTIPHIPQFATGGFPEDGLFMANHGELVGKFANGRTA VANNDQITQGIKEAVIEGMSVVMASYSGNSEPITIETHVDMDGRTIVKQTDKVQSRKGFN FRNPQTV >gi|229784067|gb|GG667668.1| GENE 23 16955 - 17674 456 239 aa, chain + ## 
HITS:1 COG:no KEGG:Shel_11450 NR:ns ## KEGG: Shel_11450 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 67 236 154 321 326 66 30.0 8e-10 MDNFFFALIVLGILSFCVMIIVDFIFVLMRKEVRVKYLLIPAVAFIVGFFGFAASVKPST NSDSVKESQIVEKTTAEATTEEETSTVEETTAAEETTTQPETEETTTEEETTVAETESED EFKSSCQEIGYKKLLRTPDDYVGQRIVITAEVQQVIDGGLFDDGKYYRVQTDNNGSGYYF DDEYFMYDNRVDDNMKILDGDVLKIYGEFTGLETMKRVITGSKDEVPAIKAYYIELISE >gi|229784067|gb|GG667668.1| GENE 24 17734 - 18126 283 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870763|ref|ZP_06115213.2| ## NR: gi|288870763|ref|ZP_06115213.2| hypothetical protein CLOSTHATH_03494 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03494 [Clostridium hathewayi DSM 13479] # 1 130 4 133 133 263 99.0 3e-69 MLDERQATIYVNGKPFPSPKRGLNFIASTIVTSSRNANGEVVGQKVGRDQNKLDSLVWPV LDAETWSEMLREFSNFYVTVKFPDMVTNKWKTLKMYPSDRSAEPYEVDENGFPTKYINCK VNLIDCGVID >gi|229784067|gb|GG667668.1| GENE 25 18126 - 18437 326 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622283|ref|ZP_06115218.1| ## NR: gi|266622283|ref|ZP_06115218.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 100 1 100 751 201 97.0 1e-50 MQTVSREYKRSMKEKLRNRSYIRVTIGVINQQAQASACVPHPENYTYYSNLKWPLDNYQV QELYATCDQDYTAVDGSMYFLPRAREDVVLNQGIVSEDLPLAS >gi|229784067|gb|GG667668.1| GENE 26 19358 - 21724 1354 788 aa, chain + ## HITS:1 COG:no KEGG:BATR1942_04005 NR:ns ## KEGG: BATR1942_04005 # Name: not_defined # Def: hypothetical protein # Organism: B.atrophaeus # Pathway: not_defined # 370 662 470 755 1391 72 24.0 8e-11 MDKAAGAAKKMKEYTLGIDELNVLNPDEGNGGSGGGAGGGGSLDFGDMSGELFGEVTVNQ EIEAAVERLRKAIDAIKELAKPTTDALKRLWNEGLSLLGNFTWTALEDFWNYFLVPLGSW TLGEGFPRFIDITNDFLKAVDWDAINEALRNFWKALEPFAEKVGEGLLDFYKDLSNIGAN FINAVVPGGLNALADALKKIDSEQAKAIGYGIGVIATSLLTLKVTKSILDFFDKLKGVIL AAGGSSTFGITLTLSLIAATGVSWTTALELLDRYENGTDEEKKEIEVGWDKNKEESKDKN KWGMGNRYGQMAASADAYQNDNAYTKQYDYVKSLPGKIKECFDQNLKDTEDWLNELDKQR 
EKESSDFQTWIEDKKSNAKQNWEEIQTWWDNTIGSWWDEHVAPWFTTEKWLGLYESVKTS LKTKWDETVAVWISDIQNWWNTHVSPWFTAEKWSELYENIKTSLKSKWDETAGQWGTDLG DWWNTHVSPWFTAERWSQLYNDIKEQLRIKWDSTVGEWGNNIKTWWNQHVAPWFTAKKWS DLYQTVKTKLKETWDNTVGEWITNIQTWWDEHVAPWFTTEKWLDIYRSVKESLGTTWTNT VTDWKKNIEDWWKEDVEKWFKLETWTDMMKRVPDAFKETFRGAVNAAVDQLNRLIDWLND KLNFSFDGLEILGEEIIPAFSVQLFTIPHIPQFATGGFPEDGLFMANHGELVGKFSNGRT AVANNEQITQGIKEAVIEGMSIVMASYNGNGEPITIETHVEMDGRTIVKQTDKVQSRKGF NFKNPQTI >gi|229784067|gb|GG667668.1| GENE 27 21791 - 22846 129 351 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870764|ref|ZP_06115216.2| ## NR: gi|288870764|ref|ZP_06115216.2| head morphogeneis protein, SPP1 gp7 family [Clostridium hathewayi DSM 13479] head morphogeneis protein, SPP1 gp7 family [Clostridium hathewayi DSM 13479] # 1 351 9 359 359 670 100.0 0 MGVFDFLKQRNRNVKECTGFLNNSETKTPLPNENQKNESTNKSTKNKIPLKLSEFKEEHY KALRRLNNKPLGERLSGIVTKEVNIEKFTPVLLELDLLRIGTCEECLNLLSVDKLKLILK NQSKKSTGNKSELIKRIIAEIPEEDIRSDKAYTDFYVHTEKAKEIIHESYDIINGLKIDF IVQCINKIKEKKIREVYKDICKRNMELPIPPGMGVNWEEQFVNGISDFEEKLYLDILKDN NTDVVALGIYTDVSGDSVSECINLLNKTGQYTDLRKEDVLYVCSVISTLTEMHSIRASGF NSYTFVASKNDDACPTCKSLNMKSFDVAKAKIGKNCPPMHIGCKCCIICKP >gi|229784067|gb|GG667668.1| GENE 28 22899 - 23291 282 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870765|ref|ZP_06115217.2| ## NR: gi|288870765|ref|ZP_06115217.2| hypothetical protein CLOSTHATH_03498 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03498 [Clostridium hathewayi DSM 13479] # 1 130 17 146 146 263 99.0 3e-69 MLDERQATIYVNGKPFPSPKRGLNFITSTIVTSSRNANGEVVGQKVGRDQNKIDSLVWPV LDAETWSEMLREFSNFYVTVKFPDMVTNKWKTLKMYPSDRSAEPYEVDENGFPTKYINCK VNLIDCGVID >gi|229784067|gb|GG667668.1| GENE 29 23291 - 25546 1999 751 aa, chain + ## HITS:1 COG:no KEGG:Sgly_0356 NR:ns ## KEGG: Sgly_0356 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 1 386 1 394 542 106 24.0 4e-21 MQTVSREYKRSMKEKLRNRSYIRVTIGVINQEAQSSAYVPHPENYTYYSNLKWPLDNYQV 
QELYATCDQDYTAVDGSMYFLPRAREDVVLNQGIVSEDLPGSIEIQFPIRYDIKGLTVEF GRAYPVDFRIESDNKTVEIAGNATEHFVTEEIFEGATFLRFVPASMANGQSRFRIHQLTT GIGIYFDNRKILSATKKEHISPVMEELPALDFDMTIDNKDRAYDVENEESTVNFLETGQE VKVLYGQELDNGTVEWLPGATVYLREWSADDEEMSFTATDRFESMDGTYYKGEYRSEGIS LYDLAVDVLKDAGVDSRTYWLDNYLKDVSVCNPMPVVSHKEALQLIANAGRCILYQDRSG NIFMKSSFVPDMEAASDNEVYFSHAGAVLDKKAKQSYAMTARDYTDVQPTQYFLPRQETG KAYLNTGYISEAVAGADGTFAVNPAVEISLEARYKCFGLTLKFGRNNPDTVIFHASLAGE PREDYMVSGLTAMTVISHEFPEFDRLVLEFTKGCPYNRVILNNIIFGDSTDYMLEYGHEL TKTPKGTQLPKVRELHVVRTFYSQGVEVKELAKETVTVSAADNRYTIYFSAPSYDLSCTI TEPQAGQTAVIVDSSNYYATVELTGVTGACEVLITGREYIVTQARVSRRLNPTGRLEQWE NPLVSDVVHAADLADWVGDYMRADREYDLQYRGEPRIDANDIAFLENKYVPGLLLRVYDH TLKFNGALSGTIKARREIGYVADTQDRLAGK >gi|229784067|gb|GG667668.1| GENE 30 25509 - 25907 328 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622284|ref|ZP_06115219.1| ## NR: gi|266622284|ref|ZP_06115219.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 132 1 132 132 264 100.0 2e-69 MWQTPKTDWQESDFFNVEDYNRIKGNLNEIRAQAVILWPEFSLEDMGADKTYEDYSFYAD EINRFETNVGRICAGTYPFAVGNQKTFYDNQPFIDWQELNRIEEACRLIYSNIQSRLNGR KILAFTLNGGVI >gi|229784067|gb|GG667668.1| GENE 31 25907 - 26356 471 149 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622285|ref|ZP_06115220.1| ## NR: gi|266622285|ref|ZP_06115220.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 149 1 149 149 272 100.0 6e-72 MKLKTNYKDALFDGARKYRITANADGTSGIADETVYTQEGDLFGANDINATNEAVNKLGN RRILKLTAAGWSGSYPFTQTVDAAGITVADDIKVIGVYVPANATLEQVKAWNKAAGYLMC NPDGVADGKITFKAYKKPAVDFQILTEGA >gi|229784067|gb|GG667668.1| GENE 32 26361 - 27545 279 394 aa, chain + ## HITS:1 COG:no KEGG:Closa_0859 NR:ns ## KEGG: Closa_0859 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 230 3 209 355 119 39.0 2e-25 MGKVIPMLGGGGNADSDELTACRQHVLENYTAITHDTDGEPGTGTMPNLTAGSTVNHATG 
NATKVIVGDEAFISTNTDGVTRAEIRYNNKPGYIEGNTLLGVEQSKMANAIGLNSLKISA GNTVLGVQGDAWTVWTGDASAESGHVLPGKTYWRNGVKMTGNMAIQNADVSGSDRAYATN VSAWGGVICLGVRNGHYLNGVNWIQADIAGLQAANIREGINIGGLTGTLKDMSVNHVPFD GASFSGVLAKGAEICNWYNNNVLRNPVEITGDGLRMMYSRANLQFNKFCPKESITFSPFK TVRVSIKFNGSSRGKGHATLMVYRSDTSRHNVLVNQSGGLLKSTTIGMTNSGTIYTGDLD VSAVNDEGFLAIHFTNAEDVNYTSYFITRIEFLI >gi|229784067|gb|GG667668.1| GENE 33 27602 - 27760 184 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622353|ref|ZP_06115288.1| ## NR: gi|266622353|ref|ZP_06115288.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 50 12 61 75 77 100.0 3e-13 MNKNKPDMNYKTTVPYGPATGKEDPGRQPVIDSTPYEGDRGPDHRQFKPGPS >gi|229784067|gb|GG667668.1| GENE 34 28715 - 29074 117 119 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622286|ref|ZP_06115221.1| ## NR: gi|266622286|ref|ZP_06115221.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 119 276 394 394 225 94.0 7e-58 MMYSRANLQFNKFCPKESITFSPFKTVRVSIKFNGSSRGKGHATLMVYRSDTSRSDILLN RSTGLLKSTTIGMTNSGTIYTGDLDVSAVNDEGFLAIHFTNAEDVNYTSYFITRIEFLI >gi|229784067|gb|GG667668.1| GENE 35 29131 - 29325 240 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622353|ref|ZP_06115288.1| ## NR: gi|266622353|ref|ZP_06115288.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 64 12 75 75 94 92.0 4e-18 MNKNKPDMNYKTTVPYGPATGKEDPGRQPVIDETPYKKDYSPDHRQFKPGHVPGGPGHKD CEHE >gi|229784067|gb|GG667668.1| GENE 36 29342 - 29611 156 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622290|ref|ZP_06115225.1| ## NR: gi|266622290|ref|ZP_06115225.1| putative branched-chain amino acid ABC transporter permease protein [Clostridium hathewayi DSM 13479] putative branched-chain amino acid ABC transporter permease protein [Clostridium hathewayi DSM 13479] # 1 89 1 89 89 170 100.0 3e-41 
MYITTNTIITAASVITALVVIFSALFAVYRWYLRQGQQDREIKNIKDEQCLLVYGVLACL KGMKEQGYNGPVTEAINKIEKHINQQAHE >gi|229784067|gb|GG667668.1| GENE 37 29632 - 29889 372 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622291|ref|ZP_06115226.1| ## NR: gi|266622291|ref|ZP_06115226.1| enolase [Clostridium hathewayi DSM 13479] enolase [Clostridium hathewayi DSM 13479] # 1 85 9 93 93 125 100.0 1e-27 MDLGIASVAGITALCYLAAMAVKATAVDNKWLPVICGVIGAILGVAGMYTMPDFPAADII NAAAVGTVSGLAATGINQAYKQLTK Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:23:34 2011 Seq name: gi|229784066|gb|GG667669.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld62, whole genome shotgun sequence Length of sequence - 33704 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 15, operones - 7 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 648 385 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) - Term 495 - 533 -0.4 2 2 Op 1 9/0.000 - CDS 602 - 1099 390 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 1128 - 1187 80.4 3 2 Op 2 . - CDS 2035 - 2598 562 ## COG3279 Response regulator of the LytR/AlgR family 4 2 Op 3 . - CDS 2603 - 3667 1156 ## COG1316 Transcriptional regulator - Prom 3809 - 3868 6.2 + Prom 3938 - 3997 2.7 5 3 Tu 1 . + CDS 4049 - 4219 188 ## gi|266622297|ref|ZP_06115232.1| transcriptional regulator, AraC family 6 4 Op 1 2/0.000 - CDS 5596 - 6021 494 ## COG0778 Nitroreductase 7 4 Op 2 . - CDS 6056 - 6772 671 ## COG0778 Nitroreductase - Prom 6856 - 6915 7.0 + Prom 6807 - 6866 7.2 8 5 Tu 1 . + CDS 6937 - 7515 415 ## GYMC10_3122 putative transcriptional regulator, TetR family + Term 7563 - 7609 11.1 - Term 7551 - 7595 10.7 9 6 Tu 1 . - CDS 7674 - 10298 2582 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Prom 10448 - 10507 80.4 10 7 Tu 1 . - CDS 11517 - 12197 657 ## Rumal_1162 hypothetical protein - Prom 12231 - 12290 4.7 11 8 Tu 1 . 
+ CDS 12304 - 13740 1086 ## COG1032 Fe-S oxidoreductase
12 9 Op 1 5/0.000 - CDS 13718 - 14053 217 ## COG0534 Na+-driven multidrug efflux pump
13 9 Op 2 . - CDS 14987 - 15985 1084 ## COG0534 Na+-driven multidrug efflux pump
- Prom 16010 - 16069 4.7
+ Prom 16061 - 16120 7.1
14 10 Op 1 15/0.000 + CDS 16164 - 17348 1414 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein
+ Prom 17360 - 17419 3.9
15 10 Op 2 24/0.000 + CDS 17450 - 18988 1745 ## COG3845 ABC-type uncharacterized transport systems, ATPase components
16 10 Op 3 26/0.000 + CDS 18985 - 20094 1411 ## COG4603 ABC-type uncharacterized transport system, permease component
17 10 Op 4 . + CDS 20091 - 21044 1147 ## COG1079 Uncharacterized ABC-type transport system, permease component
+ Term 21067 - 21125 5.2
18 11 Tu 1 . + CDS 21197 - 21706 435 ## COG3177 Uncharacterized conserved protein
19 12 Op 1 . - CDS 21758 - 23821 1998 ## Closa_3207 glycoside hydrolase family 38
20 12 Op 2 . - CDS 23845 - 25140 1229 ## COG1482 Phosphomannose isomerase
- Prom 25186 - 25245 80.4
21 13 Op 1 . - CDS 26093 - 26410 485 ## LCAZH_0395 mannose-6-phosphate isomerase, class I
22 13 Op 2 . - CDS 26434 - 27351 765 ## COG1940 Transcriptional regulator/sugar kinase
23 13 Op 3 . - CDS 27379 - 27540 167 ## gi|266622317|ref|ZP_06115252.1| D-tyrosyl-tRNA(Tyr) deacylase
24 13 Op 4 3/0.000 - CDS 27556 - 28275 614 ## COG2188 Transcriptional regulators
- Prom 28352 - 28411 9.3
- Term 28559 - 28608 9.3
25 14 Op 1 8/0.000 - CDS 28625 - 30028 1504 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase
26 14 Op 2 3/0.000 - CDS 30012 - 31946 2066 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific
- Prom 31972 - 32031 2.7
- Term 32040 - 32079 7.7
27 14 Op 3 . - CDS 32100 - 32999 1017 ## COG1737 Transcriptional regulators
- Term 33020 - 33068 -0.7
28 15 Tu 1 .
- CDS 33102 - 33704 651 ## COG0423 Glycyl-tRNA synthetase (class II) Predicted protein(s) >gi|229784066|gb|GG667669.1| GENE 1 1 - 648 385 215 aa, chain + ## HITS:1 COG:RSc1140 KEGG:ns NR:ns ## COG: RSc1140 COG0613 # Protein_GI_number: 17545859 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Ralstonia solanacearum # 89 190 165 268 286 73 40.0 2e-13 FEFAQIEQEILDQEKTAAREKIRLFQSATGIPVDEGEVMAAAADGVVPGELIAEILLSKE EAGQYELLRPYLAGGAKSDMPNVRFYWDFFSEGKPAYVPVRYISLADAVSLIHDAGGIAV LAHPGQNLDRSETELLNSILFREVDGIEAFSSYHNEETSAHYVNVAVKNRLLITCGSDFH GKHKPSIKLGGHGASWEDSRIMEVLENHLKSLLIQ >gi|229784066|gb|GG667669.1| GENE 2 602 - 1099 390 165 aa, chain - ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 19 159 294 433 433 68 26.0 5e-12 MYLEQMGEPLRQLEMMVWTGDPVLDVLLNNARSSAIKKKIRFTIQADALTLGDMEDREKC SLFSNLLDNAVEAADQVEAGERWIAVKIRRTGEMVFIELSNSMAGSPSVKDNRLVTTKRQ GTGHGLGLKSAERVVEKYGGSLVCEYGDDVFTVLVSFLGGFPELP >gi|229784066|gb|GG667669.1| GENE 3 2035 - 2598 562 187 aa, chain - ## HITS:1 COG:BS_yccH KEGG:ns NR:ns ## COG: BS_yccH COG3279 # Protein_GI_number: 16077343 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Bacillus subtilis # 4 184 1 177 233 63 27.0 2e-10 MIQLVRIAVCDDEPAFAERVKILIEAECRGLGVSAEVSLYTDSGMLAYDIGEKRYFDMLF LDIEMPGLSGMELADLIKASLPQALVSFITSHTKYAVKSFELSIFRYIPKAEVESCLPMA VSDGIRLLSWKVKECYVVETPRQIIKVAVSDILYIYKKQKYSVMVLEDEEIPVRKALGQL KEELDRE >gi|229784066|gb|GG667669.1| GENE 4 2603 - 3667 1156 354 aa, chain - ## HITS:1 COG:CAC3063 KEGG:ns NR:ns ## COG: CAC3063 COG1316 # Protein_GI_number: 15896314 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 44 299 42 287 339 122 32.0 7e-28 
MYELKDKKKEGTRKPIWKKAAACVMALILFFICSAMLYVTKLASTIQKIPDADIKEIKND NLKEETEEKMGGYWTVAIFGVDSRDGNLGRGTNADTQMVLTVKRDSGEIRLASVYRDTLL MSNVAEGKCRKINSSYAEGGPEQAVTALNENLDLDIMDYVSFSWSAAADAVNLLGGIDMD ITKAEYAYINSFITETVERTGIPSVHLTGPGLTHLDGVQAVAYCRLRLMDTDMQRTERQR KVIEAVLEKLKTADLATINRLLETVLPQVSTSITMADGLSLGKNAGKYHMVKTEGFPQEY GTATIGKLGHCVIPNTLETNVAGLHKFLYEDEDYQISDRVREIGREIVKQAVGE >gi|229784066|gb|GG667669.1| GENE 5 4049 - 4219 188 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622297|ref|ZP_06115232.1| ## NR: gi|266622297|ref|ZP_06115232.1| transcriptional regulator, AraC family [Clostridium hathewayi DSM 13479] transcriptional regulator, AraC family [Clostridium hathewayi DSM 13479] # 1 56 1 56 56 105 100.0 9e-22 MKRARELLTHFPEMKMTDIAAEIGLGDNPQYFSQLFKKYEGITPSQFSSAPGQEDI >gi|229784066|gb|GG667669.1| GENE 6 5596 - 6021 494 141 aa, chain - ## HITS:1 COG:MA3441 KEGG:ns NR:ns ## COG: MA3441 COG0778 # Protein_GI_number: 20092253 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 1 121 6 121 174 67 35.0 6e-12 METMKAIAKRKSTRAFMPDQAVAEPELDAILAAGCAAPVGAGDYSSLHLTVVQNPETLSK ISEAVKEAFHMDRDVLYGAPVLVLVSSSEPKFPNIQYANVGCVMENMLLAATDLGVDNIY LWGAVNVIANMPELTKSTWNS >gi|229784066|gb|GG667669.1| GENE 7 6056 - 6772 671 238 aa, chain - ## HITS:1 COG:MTH933 KEGG:ns NR:ns ## COG: MTH933 COG0778 # Protein_GI_number: 15678953 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanothermobacter thermautotrophicus # 2 236 3 238 243 114 32.0 1e-25 MDTYSSIFTRASTRKFDSAPLSAQALEDFETFITGVKPLVPEVKLTHKIVGADGVKGLAL PKAPHYLLISGREHPLRNTAAGFCFQQAELYLYSRGYATRWLNSVSPRQPDPNFIIGIAL GKPSEPAVREPGDFDRRPLEDISSGTDSRLEAARLAPSGMNGQPWYFVASGGVIHVYYRP SLGGIKGRLYHLTDLDVGIALAHLKAAGDHEGKSFRFTVNQSSAPTPPKGFAYVGTVE >gi|229784066|gb|GG667669.1| GENE 8 6937 - 7515 415 192 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_3122 NR:ns ## KEGG: GYMC10_3122 # Name: not_defined # Def: putative transcriptional regulator, TetR family # Organism: 
Geobacillus_Y412MC10 # Pathway: not_defined # 1 180 1 177 189 118 35.0 1e-25 MKKQPEITSATRKTFVDAFCSLYLHTPIEKITVQEIVRRAGYNRSTFYQYFRDVNDLLTY LEDEIIENIREKVDMNFDQIDFADSFICSFSSLHEETEKYVAMLLTNPDSTKFTQRVKTA MMPVLLRHFNISENDEKAVYVLEFYLAGVIAIASHWLAEGRKVPIIEVGRLIHTLLTEGV LSALGREHADCD >gi|229784066|gb|GG667669.1| GENE 9 7674 - 10298 2582 874 aa, chain - ## HITS:1 COG:TM0272 KEGG:ns NR:ns ## COG: TM0272 COG0574 # Protein_GI_number: 15643042 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Thermotoga maritima # 1 874 1 880 881 1052 60.0 0 MATKWVYLFKEGRADMKNLLGGKGANLAEMTNLGLPIPQGFTVTTEACTDYYNSGKKITD EIQGQIFDALKTLEEIQGKKFGDTEDPLLVSVRSGARASMPGMMDTILNLGLNDVAVEGF AKKTGNPRFAYDSYRRFIQMFSDVVMEMSKTFFEGILDEVKESKGAKYDTDLTADDLKEV IAKYKSIYKEKMGEEFPQDPKVQLMEAVKAVFRSWDNPRAIVYRRMNDIPGDWGTAVNVQ AMVFGNMGNTSGTGVAFTRNPSTGENGIYGEYLINAQGEDVVAGIRTPQPITKLAEDLPE CYKEFMAIAHKLEDHYRDMQDMEFTIQEGKLYFLQTRNGKRTAPAAIKIACDLVDEGKIT PQEAVLRIEAKSLDQLLHPTFDTAALKAGEVIGSALPASPGAAAGKVYFTAEDAKAHHEA GERVILVRLETSPEDIEGMHAAEGILTVRGGMTSHAAVVARGMGTACVSGCGEIKIDEAA KYFELGGHTIKEGDYISLDGSTGKIYLGDIATVEATVSGDFGRIMGWADEFRTLHVRTNA DTPADTLNAVKLGAEGIGLCRTEHMFFEEDRIPKIRKMILSDTVEAREEALNELIPFQKG DFKAMYKALEGRPMTVRYLDPPLHEFLPTKEEDIEALAADMGLTVEAIKEKCADLHEFNP MMGHRGCRLAVTYPEIAKMQTRAVMEAAIEVKEEDGYDIIPEIMIPLVGEKKELKFVKDI VVETAEAVKAEKGSDIQYHIGTMIEIPRAALTADAVAEEAEFFSFGTNDLTQMTFGFSRD DAGKFLDSYYKAKIYESDPFARLDQTGVGQLVQMAAEKGRKTRPDIKLGICGEHGGDPSS IEFCHRVGLNYVSCSPFRVPIARLAAAQAALNNK >gi|229784066|gb|GG667669.1| GENE 10 11517 - 12197 657 226 aa, chain - ## HITS:1 COG:no KEGG:Rumal_1162 NR:ns ## KEGG: Rumal_1162 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 1 168 1 168 208 100 34.0 4e-20 MEYLYVLLLYLPSGFGRLTQKATGYRYTHVAVSLDDTYTHFYAFSRLRARTPPISGYIEE KRIYYTLGEDVPIYTKIFRVPVSRAGYENAIRFMEDVKHDPEIMYNLIHMLLIPLFGGHP LYKAYNCGEFVARVLEEAGITLHKPYYRYTPKLFSELLEPYVLFEGTLDNSSREGLEDDF FRETPKKEYLAKTLYIIRELLFRQVFHRASKRFQPEKVRFSDSGKL 
>gi|229784066|gb|GG667669.1| GENE 11 12304 - 13740 1086 478 aa, chain + ## HITS:1 COG:CAC2010 KEGG:ns NR:ns ## COG: CAC2010 COG1032 # Protein_GI_number: 15895280 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 58 467 78 467 472 119 23.0 2e-26 MILLVKPEAAGSGFSLDMALKTEPLELEYIKAMLKEFDVESIIYEASFNRRSFDEIFREY NPDAVAVTGYITQENIMLSYVRRAKELCPGCITMVGGSHAQLNPERFFDPAVSYICRSDN IYAIADVLMAEGIIPLKNGSCPPELSSIDGLCYQEQSCLPENGTAQNGAVSRNKSLWRKN PMKPFDINKLPIPDRTTFSRYQSHYRYLDVSPVALLKTSSSCPFHCAFCYGRQLNCGTYC QRDLEAVIEELETIPCDHIQIADDDFLFDVPRLKTFIRLLRERNIKKTFICYGRSDFIAA HESLIRELAEAGFRYIMVGLEAVSDSFLDSYQKHSSQALNIQTIRILHKYGINLVGLFII DSRFKKEDFRNMRRFIHAHRITYTGVSIFTPIPGTPLFNEYRDRLTTENYEKWDFMHLVV KPECMSRFHFYFEYYLLIMDLFRIAQKAGIYSFLRLKDYKNIFLKLLFTDSFKLRPDR >gi|229784066|gb|GG667669.1| GENE 12 13718 - 14053 217 111 aa, chain - ## HITS:1 COG:FN1469 KEGG:ns NR:ns ## COG: FN1469 COG0534 # Protein_GI_number: 19704801 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 20 102 354 436 440 60 40.0 1e-09 MTAAFVKPTAEILAMSVPAVRIYFLSFFAMGFNILYSTYFQSVMKPGYSLFICMMRGMVL NAVLVFLLPIAFGVTGIWSVMPVTEFLTLFVCRFLLTAQNKKLPSNGPAAV >gi|229784066|gb|GG667669.1| GENE 13 14987 - 15985 1084 332 aa, chain - ## HITS:1 COG:lin2192 KEGG:ns NR:ns ## COG: lin2192 COG0534 # Protein_GI_number: 16801257 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 6 332 8 335 443 155 31.0 9e-38 MQKLNLLEEPVPKLFIKYLMPSISATLVTSIYILADTLMIGRGVGAVGIAALNLLLPLFS LFFGTGMLFGVGGGVLLSISKGRRDERAAREYFTVALWLAVAFAVFYIAGGHLLFRPVTE FLGRNDTMGNYVDEYGRVLVTGAPVFLFSAFLQAFVRNDRAPKVSMAAVVAGGVTNVILD YIFIFPMKMGMAGAAVASVIGSLLTVGILLTHFFSPESTLKTVARPNWRKAWEVVTAGFA SFLLEMCNGVVTFLFNRQLLVYVGDLGVVVYGIISNSALIVASISNGISQAVQPILATNY GAGRKERLKETRHLGEMAVTVSGVLFTGIGLL >gi|229784066|gb|GG667669.1| GENE 14 16164 - 17348 1414 394 aa, chain + ## HITS:1 COG:TM0102 KEGG:ns NR:ns ## COG: TM0102 COG1744 # 
Protein_GI_number: 15642877 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Thermotoga maritima # 64 346 18 315 359 138 32.0 2e-32 MKKRALAVTLAVLMAASLSACGSSNSGSGSTTAAPAETTTAAETKAAEAETTAGTAVKSE SSDGYELALVTDLGTIDDKSFNQGAWEGMKKYAEENGISYKYYQPQEGTTDSYLETIGLA IEGGAKLVVCPGFLFEEPVYLAQDQYKDVHFILLDGEPHSGDYSEYRTEANTMPILFQED EPGFLAGYAAVKDGYTKLGFLGGMAVPAVIRYGYGFAEGADYAAKEMGVDGIEIMYNYTG AFAATPEAQSMAASWYQNGTEVIFGCGGAVGNSVMAAAEEKSAKVIGVDVDQSYESDTVI TSAMKQLSVSVYDGVRDFYAGSFPGGKTSIFSAKNDGIGLPMETSKFTNFTQADYDAVYS QLKDGKITLVQPSEDNNDPTKDLTLEKTTINFVE >gi|229784066|gb|GG667669.1| GENE 15 17450 - 18988 1745 512 aa, chain + ## HITS:1 COG:lin1426 KEGG:ns NR:ns ## COG: lin1426 COG3845 # Protein_GI_number: 16800494 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Listeria innocua # 3 508 2 503 513 580 59.0 1e-165 MDDYIIEMLNITKEFPGIIANDNITLQLKKGEIHALLGENGAGKSTLMSVLFGLYQPEAG VIKVRGKEEKITGPLDANALGIGMVHQHFKLVHNFTVLQNIVLGMETTRHGFLKMDDARK KIMDLSEKYQLTIDPDALISDITVGMQQRVEILKMLYRDNEIMIFDEPTAVLTPQEIDEL MQIMRDLVKEGKSILFITHKLNEIKAVADRCTVLRRGKYIGTVDVKDTTPEEMSEMMVGR KVNLTVEKGPASPKDVVLEVKNLSVRANKEGHTKNLVSNVTFQVRKGEIVCIAGIDGNGQ TELIHAITGLAEMSEGTVLLNGQDVTKKSIRYKNTHGLSHIPEDRHKHGLVLDYNLGYNL VLQQYFEKDFQKYSFLKNDSIYGYSEKLIESYDIRSSEGPYTSARSMSGGNQQKAIVARE VTRDHDILLAVQPTRGLDVGAIEYIHKELIKQRDAGAAILLVSLELDEVMNLSDRILVMF EGSVVADLDPKKVTVQELGLYMAGSKKEGGQS >gi|229784066|gb|GG667669.1| GENE 16 18985 - 20094 1411 369 aa, chain + ## HITS:1 COG:lin1427 KEGG:ns NR:ns ## COG: lin1427 COG4603 # Protein_GI_number: 16800495 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Listeria innocua # 16 354 12 346 350 197 38.0 2e-50 MKQKQPRDYTGVLSSIFAIVLGLIVGLIILFLCNPAQALPGFATILTGAFTHGAKGVGQV FYYATPIILTGLSVGFAFKTGLFNIGTPGQFIMGGFGAVYVGILWTSLGPAHWFVALLAS 
VVMGAVWGLVPGLLKAYYNVNEVIASIMMNYIGMYLVNWIVKSSKPLFNNLRNESRNVAA TANIPKMGLDKIFPGSSVNGGIIIAIIVVILIYLLLNKTTFGYELKAVGFNRDASKYAGI NEKKSIIMSMMIAGAIAGLAGGLLYLAGTGKHIEIKDVLASEGFTGISVALLGLSNPIGV LFSGIFIAYLTAGGFYLQLFEFSTEIIDIIVAVIIYFSAFALMVKMLIGRVQKARHEKEE AQAKGGGDA >gi|229784066|gb|GG667669.1| GENE 17 20091 - 21044 1147 317 aa, chain + ## HITS:1 COG:SPy1225 KEGG:ns NR:ns ## COG: SPy1225 COG1079 # Protein_GI_number: 15675189 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 1 316 4 318 318 222 42.0 1e-57 MSVIYFIFQQTMYFMIPLMIVALGAMFSERSGIINIALEGIMTMGAFTGILFIHFTGGTM SGQMQLILAVLISMATGMFFSLFHGYASINMKANQVISGTALNMFAPAFAIFVARVIQGV QQVQFTNTFRITSVPVLGKIPFIGPLLFQNAYITTYIGIGIFVISTIVLYKTRFGLRLRS CGEHPQAADAAGINVYHMQYAGVLISGALGGLGGLVFVVPTSTNFNADVAGYGFLALAVL IFGQWKPVKIMFASLFFGLMKAVAAAYSGIPFLSSLGIPSYFYKMVPYIITLIVLVFTSK NSQAPKAEGVPYDKGQR >gi|229784066|gb|GG667669.1| GENE 18 21197 - 21706 435 169 aa, chain + ## HITS:1 COG:mlr2757 KEGG:ns NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 3 159 68 230 263 72 33.0 4e-13 MTPLKELLQKADQLKTGRQQDIAITEETIRELHRTTAAESGSTPSDSYRTDMPADADSGY QPPQPEDVPRLTGHLADQIRSSKSSLHPIELAAMAGKRLIDIQPFDRGNEETAFLLMNLI LASEGYCPVTVGIERREAYRQVLSASGKEYDMEPFTRLIAGWILERANA >gi|229784066|gb|GG667669.1| GENE 19 21758 - 23821 1998 687 aa, chain - ## HITS:1 COG:no KEGG:Closa_3207 NR:ns ## KEGG: Closa_3207 # Name: not_defined # Def: glycoside hydrolase family 38 # Organism: C.saccharolyticum # Pathway: not_defined # 1 686 7 691 691 733 52.0 0 MDYQKIKKVHVVYKTHLDIGFTDMGRNVLDRYVKEYIPHSVRLAEEMNGGKEKRFIWTVG SYLIDYYLKHADAKALSELEAAIERGDICWHGIASTTHTELLDMDLFQYGLELGSRLDSR YKRHTIAAKMTDVPGHTIAVVDALADKGIRFLHIGVNASSMVPEIPQTFVWRHGTKQIIV QYSSQYGAPGYVEGMDEVLEFAHTGDNLGPQSAEEVEAEYERIRKLYPNAVVEASTLDAY AESLLAVEETLPVIEEEIGDTWIHGIGSDPWKVSRYRELIRLKDQWQREGRFSRDCEGYD 
AFMMNLMLIAEHTWGLDYKKYLADFKNWTKEDFQKARAADETTLEFLTNRNAQMREVLDI DFKRYRGGVFTGSYRYYESSHEEQREYLKKALEELPADLKAEAERALTGCIPVKEESEGR RVCVGEILEIGGWKAAVDGSGAIRKLERGGRDWAKDGAIGRLGYETFNALNCVQNYYRYN RAFYENGCWSEGDFSKPGLEFVEDLENRYYEFSVRDLRVCGEELLIDLAGDAEAAEKYGC PQAAQIKYVFGPQITCSVSWFGKDANRMPEALWFDFIPDTENPNRWLMEKMGVPVSPLNV LRGGNRLQHCVERLVYDGSDGKMAIETVHAPLVSAGGRNLYDDYRRLPDLRKGFSFNLFN NKWGTNFCMWCEDDCRFEFKLDISEKR >gi|229784066|gb|GG667669.1| GENE 20 23845 - 25140 1229 431 aa, chain - ## HITS:1 COG:CAC2918 KEGG:ns NR:ns ## COG: CAC2918 COG1482 # Protein_GI_number: 15896171 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Clostridium acetobutylicum # 83 413 3 300 326 78 22.0 3e-14 MARWEIQLRYRAGMPNWNCSNTGDPVLSKYKRGFFIEWRLADRYKKERFQRFDYVVDTNE RNNPKMITGEAFREALCRVSSEPFRLQPYFDPGVWGGQWMKTRFGLDPSEDNFAWSFDGV PEENSLNLKFGEVTVEIPAMDLVLYQPVKLLGDRVHARFGAEFPIRFDLLDTMGGGNLSL QVHPLTEYIQDQFGMHYTQDESYYILDAGDDACVYLGVKKDVDRDAMFHDLEEAREGKIL FPAEHYVNRIPVKKHDHVLIPAGTIHCSGKNAMVLEISATPYIFTFKLWDWGRVGLDGLP RPIHLEHGAANIQWDRDTDWVYDNLVHQERTIREEEGLKLERTGLHSREFIETHRYTLTK PVECSMEDSVHVLNLVEGEKAFIESPQGAFEPFEVHYGETFIIPAAVKKYRIRPAGADKK EPVAVIAASVK >gi|229784066|gb|GG667669.1| GENE 21 26093 - 26410 485 105 aa, chain - ## HITS:1 COG:no KEGG:LCAZH_0395 NR:ns ## KEGG: LCAZH_0395 # Name: not_defined # Def: mannose-6-phosphate isomerase, class I # Organism: L.casei_Zhang # Pathway: not_defined # 10 104 3 99 575 67 38.0 2e-10 MINREGYGNYDLHPYKEIKGYEGHALEGGCAVLAQLKAEEKAGKRVIVCDFYPGVDREEA AGLLKQLEPALLIDADACAVPEEELTAQWKDYLTDDRVFGVMCHK >gi|229784066|gb|GG667669.1| GENE 22 26434 - 27351 765 305 aa, chain - ## HITS:1 COG:all5002 KEGG:ns NR:ns ## COG: all5002 COG1940 # Protein_GI_number: 17232494 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Nostoc sp. 
PCC 7120 # 8 265 22 264 311 68 25.0 1e-11 MESCYLMMDVGGTGIKAGLFSRSGEMVERVRSFPARAKEGREEIFRNFACIIQEMAKDRM VEGIGMAFPGPFDYEKGISLMRGLDKYDSVYGIPVREGVMEQLERIGKTESVARGCRFLF LHDVEAFAIGACRQGRMAGCKKVMCVCIGTGAGSAFVENGRVVKKAGGGVPENGWIYNTP YKESVIDDYVSVRGLKRLTESVFARPEAVDGAKLFELAEAGSDEAAKVFAAFGENVSEAL TPFLKSFRPQGLVLGGQIAKSFPHFGMQLREVCRHTGTAVFTEPDTSIRAMEGLYAAMER VTGAK >gi|229784066|gb|GG667669.1| GENE 23 27379 - 27540 167 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622317|ref|ZP_06115252.1| ## NR: gi|266622317|ref|ZP_06115252.1| D-tyrosyl-tRNA(Tyr) deacylase [Clostridium hathewayi DSM 13479] D-tyrosyl-tRNA(Tyr) deacylase [Clostridium hathewayi DSM 13479] # 1 53 1 53 53 93 100.0 5e-18 MNNFWETNFKADLGGFYEFAYTVRTMENVSAPEALESCETDNEGLLAFYIEKA >gi|229784066|gb|GG667669.1| GENE 24 27556 - 28275 614 239 aa, chain - ## HITS:1 COG:BH0419 KEGG:ns NR:ns ## COG: BH0419 COG2188 # Protein_GI_number: 15612982 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 6 239 4 238 240 125 32.0 1e-28 MLNREKGAPPLYHQLKEILKQEIEEEKYKKGDSFLTERQLQEKYDVSRVTVRQAVSELVN EGYLQSSRGVGTTVVFSKIDENLKQVISFSEEMGRHGITMMTSHCVMSRERAGKIVAKNL RMKKDCYCYRLERVRCAQNIPVVYSITWLTNKYHLPIDNDLYKDSLYKLLQEEYSIVIAG GRDTFEAVLATEVIGKFLNITPGQPVFKRIRSTYDQDNDVLEYTICYYPGERYKYSVEL >gi|229784066|gb|GG667669.1| GENE 25 28625 - 30028 1504 467 aa, chain - ## HITS:1 COG:CAC0743 KEGG:ns NR:ns ## COG: CAC0743 COG2723 # Protein_GI_number: 15894030 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 4 467 5 469 471 693 68.0 0 MRTFPDDFLWGGAVAANQCEGAYNEDGKGLSVQDVLPHGLRGERAEGPSADNLKQEAVDF YHHYKEDIRMLAEMGFKVFRTSIAWTRIFPQGDEEQPNEKGLAFYDAVFDECLKYGIEPL VTISHYETPLGLAEKYDGWRSRKLVEFFERYCRTIFTRYGGKVKYWLTFNEINSLLHAPF MSGGIKTPPDQLTEQDLYQAMHHELVASAMAVKLGHELMPGAKIGCMILGITVYPLTPDP DDMIAVMEKDRETLLFADIHARGRYPKYLLNYFEDHGISIRMEPGDEEMLKHTVDFISFS YYSSICATTHPELSEPTGGNLSRGYKNPYLKASAWGWQIDPKGLRYTLNRLYDRYQLPLF 
IVENGLGAEDELVTVNGEKTVLDDYRIDYLHHHLLQVREAIRDGVEVMGYTSWGCIDLVS ASTSEMRKRYGYIYVDRNDDGTGSMKRYRKKSFWWYRDVIKSNGAIL >gi|229784066|gb|GG667669.1| GENE 26 30012 - 31946 2066 644 aa, chain - ## HITS:1 COG:SPy0572_2 KEGG:ns NR:ns ## COG: SPy0572_2 COG1263 # Protein_GI_number: 15674662 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Streptococcus pyogenes M1 GAS # 100 455 13 364 364 270 40.0 7e-72 MDYKKLAEDILAKVGGEENISGLTHCATRLRFNLKDEGKAQTEALKKVNGVMGVISKGGQ YQVVIGSDVASVYRPLTELCEVTAGEGGQPSQEKKDDRKVVEKLIDTLSGIFTPILPAIT AAGMIKAILALLTAFSLVDKTGTTYQTLNFMADAAFYFLPVLLANSAAKKFKCNPYLAMM LGGMLLHPDFVNMVAASKESGEAIRIFLLPITNAGYSSSVLPIILIVWFMSYVEPIADRI SPKPVKFFTKPLITAIVTGIVGLVVIGPIGTIISNLIASGVHGLNNYCSWLVPALVGAFT PLLVMTGTHYGLIPIGINNRMTMGYDTLIYPGMLGSNVAQGGAALAVALKSRKSEIRQLA SSAGITAVCGITEPALYGINIRFKTPLYAAMIGGGAGGFLMGILNVKNYSGGSPGFLTLP SYLGGDTLTSFYFACAGAAVSVVIAFITSYVLYKDPSDENGERAGGETKSAKQDSEWDSA ERSSGKAVTIAAPLTGEVVPLGQVNDPTFSEEILGKGTAIRPAVGRVVSPVNGTIATIFE TGHAMGLVSEDGAEILIHVGLDTVNLKGKYFSVKKKSGDRVKTGDCILEFDLEAIKAEGY DVITPVIISNHFNYSAVETVAGGQVEAGADLLRLSDGGNNENIS >gi|229784066|gb|GG667669.1| GENE 27 32100 - 32999 1017 299 aa, chain - ## HITS:1 COG:CAC0191 KEGG:ns NR:ns ## COG: CAC0191 COG1737 # Protein_GI_number: 15893484 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 16 271 19 271 283 84 27.0 3e-16 MILEKLKRMEGFTHQEQAVARYILEHVEEIQQMSTEELARVSYTSKATVVRLCKKLDTAG YQELKLKLVSELVQNIRVNQLLNREPITETSTLQDILHTLPKLYDKAITDTELCMDKNVI KRICNRIRTSDRIDIYGNGISYILAQSAAFKFATLGMECAAYDSVNAHFLSAGKTKKTVS FVITFTGANKSMIDTARYLKETTETYVVGIVGRHNEEIRKWCHEIVEIPSRDSLLSLKVI TSFTATTYVLDIFFSMLLAQTYESHVKSSLEMVCHETLRIGVDYWNLKDSPEEEDKSMI >gi|229784066|gb|GG667669.1| GENE 28 33102 - 33704 651 200 aa, chain - ## HITS:1 COG:CAC3195 KEGG:ns NR:ns ## COG: CAC3195 COG0423 # Protein_GI_number: 15896443 # Func_class: J Translation, ribosomal structure and biogenesis # 
Function: Glycyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 2 187 265 450 462 277 69.0 1e-74 GDHAPEELCFYSKATTDIEFLFPFGWGELWGVADRTDYDLTQHQNTSGQDMSYFDDEKKE KYIPYVIEPSLGADRVTLAFLCSAYDEEEIGEGDVRTVLHFHPAIAPVKIGVLPLSKKLN EGAEKIYTELSRYYNCEFDDRGNIGKRYRRQDEIGTPFCVTYDFDSEEDHAVTVRDRDTM EQVRVPIAELKNYFEDKFRF Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:24:11 2011 Seq name: gi|229784065|gb|GG667670.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld63, whole genome shotgun sequence Length of sequence - 24954 bp Number of predicted genes - 31, with homology - 30 Number of transcription units - 21, operones - 8 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 437 240 ## gi|288870773|ref|ZP_06409900.1| hypothetical membrane protein 2 1 Op 2 . - CDS 467 - 1258 758 ## DICTH_0463 hypothetical protein - Prom 1324 - 1383 6.1 - Term 1410 - 1447 4.5 3 2 Op 1 . - CDS 1494 - 2318 663 ## COG2207 AraC-type DNA-binding domain-containing proteins 4 2 Op 2 . - CDS 2372 - 3475 1254 ## COG0562 UDP-galactopyranose mutase - Prom 3507 - 3566 6.7 - Term 3535 - 3588 8.0 5 3 Tu 1 . - CDS 3597 - 4985 1300 ## COG0069 Glutamate synthase domain 2 - Prom 5035 - 5094 5.2 + Prom 5096 - 5155 12.7 6 4 Op 1 . + CDS 5233 - 6006 751 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) + Prom 6010 - 6069 2.7 7 4 Op 2 . + CDS 6166 - 6504 372 ## EUBELI_20572 hypothetical protein + Term 6524 - 6574 15.5 - Term 6695 - 6753 7.7 8 5 Tu 1 . - CDS 6822 - 8486 1977 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases - Prom 8526 - 8585 3.6 - Term 8631 - 8669 -0.8 9 6 Op 1 . - CDS 8727 - 9893 1492 ## COG3853 Uncharacterized protein involved in tellurite resistance 10 6 Op 2 . - CDS 9886 - 11376 1372 ## Closa_3842 hypothetical protein - Prom 11453 - 11512 7.5 - Term 11486 - 11554 6.3 11 7 Tu 1 . 
- CDS 11587 - 12939 1743 ## COG1109 Phosphomannomutase - Prom 12974 - 13033 2.4 12 8 Tu 1 . - CDS 13064 - 13834 864 ## COG4509 Uncharacterized protein conserved in bacteria 13 9 Tu 1 . + CDS 15048 - 16223 1096 ## COG3629 DNA-binding transcriptional activator of the SARP family - Term 15977 - 16009 1.3 14 10 Tu 1 . - CDS 16240 - 17532 1367 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 17588 - 17647 5.6 + Prom 17527 - 17586 5.0 15 11 Tu 1 . + CDS 17711 - 18184 548 ## COG1803 Methylglyoxal synthase - Term 18193 - 18259 26.5 16 12 Tu 1 . - CDS 18284 - 18820 541 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases + Prom 18821 - 18880 6.2 17 13 Tu 1 . + CDS 19109 - 19279 102 ## gi|266622340|ref|ZP_06115275.1| hypothetical protein CLOSTHATH_03558 - Term 19286 - 19326 4.5 18 14 Tu 1 . - CDS 19405 - 19815 365 ## gi|266622341|ref|ZP_06115276.1| thermophilus bacteriophage protein - Prom 19960 - 20019 8.7 + Prom 19805 - 19864 13.1 19 15 Op 1 . + CDS 19989 - 20186 226 ## gi|266622342|ref|ZP_06115277.1| cag pathogenicity island protein 20 15 Op 2 . + CDS 20272 - 20382 66 ## + Term 20394 - 20435 -0.3 + Prom 20388 - 20447 12.3 21 16 Tu 1 . + CDS 20497 - 20646 163 ## gi|266622343|ref|ZP_06115278.1| conserved hypothetical protein + Term 20665 - 20705 2.7 + Prom 20705 - 20764 5.0 22 17 Op 1 . + CDS 20807 - 21058 305 ## gi|266622344|ref|ZP_06115279.1| hypothetical protein CLOSTHATH_03562 23 17 Op 2 . + CDS 21079 - 21492 255 ## gi|266622345|ref|ZP_06115280.1| excisionase/Xis, DNA-binding 24 17 Op 3 . + CDS 21485 - 21805 240 ## Dtox_2202 hypothetical protein + Term 21806 - 21836 3.4 25 18 Tu 1 . - CDS 21953 - 22099 247 ## gi|288870778|ref|ZP_06115282.2| LysR-family transcriptional regulator - Prom 22192 - 22251 4.9 26 19 Tu 1 . + CDS 22224 - 22388 176 ## gi|266622348|ref|ZP_06115283.1| putative toxin-antitoxin system protein - Term 22374 - 22408 4.9 27 20 Op 1 . 
- CDS 22457 - 23155 579 ## Ppro_1118 BRO domain-containing protein - Prom 23178 - 23237 1.7 - Term 23178 - 23219 3.8 28 20 Op 2 . - CDS 23248 - 24117 530 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 24145 - 24204 1.5 - Term 24130 - 24162 4.1 29 21 Op 1 . - CDS 24206 - 24463 399 ## gi|266622351|ref|ZP_06115286.1| putative holin 30 21 Op 2 . - CDS 24466 - 24705 395 ## gi|266622352|ref|ZP_06115287.1| conserved hypothetical protein 31 21 Op 3 . - CDS 24725 - 24919 227 ## gi|266622353|ref|ZP_06115288.1| conserved hypothetical protein Predicted protein(s) >gi|229784065|gb|GG667670.1| GENE 1 2 - 437 240 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870773|ref|ZP_06409900.1| ## NR: gi|288870773|ref|ZP_06409900.1| hypothetical membrane protein [Clostridium hathewayi DSM 13479] hypothetical membrane protein [Clostridium hathewayi DSM 13479] # 38 145 1 108 109 186 99.0 5e-46 MKRLISLCITAAVAASLCACGSKQDTAATNGTNGEAPVASAETSTEGQESKSGKKTQLRW MNIVSEGSDSYQLYQDAIKKFNENNDKYEIVSDTPSDLGTKLLVEYSAGTGPAISWSLTA QAKKLTDSGLIVNWSDIYGKDYGKN >gi|229784065|gb|GG667670.1| GENE 2 467 - 1258 758 263 aa, chain - ## HITS:1 COG:no KEGG:DICTH_0463 NR:ns ## KEGG: DICTH_0463 # Name: not_defined # Def: hypothetical protein # Organism: D.thermophilum # Pathway: not_defined # 1 138 1 130 241 75 31.0 2e-12 MREIRFEYLRPRQILEAMKEKSIVYLPVSPIEWHGLHNPVGVDGFHAYESALEAARRIGG VVMPPIYAGQGGRMREETLRNLGFEEPYPKVNGIDVPANLVPSMYWDEDIYRLMIKEQIR LVAAYGFKMIVIVVNHGALSGVAEEMDEKLENCKVIFTSCVGAKFDEYDDRGHATMGESS IVMYIRNECVDLTELPSKEEEPYLLCRDHGVIDARFFNPELNTDYRLLKDPRDATCEMGK EFFESGVKQVIDKVSEEYEKLHI >gi|229784065|gb|GG667670.1| GENE 3 1494 - 2318 663 274 aa, chain - ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 23 268 30 289 292 103 27.0 4e-22 MEKGIIGDSWEIFYKDEFSSKKHWHKEIEFTYLVTGKLIICVDEVMHELHEGELFVVSGG VVHYYIETEEKYKVGIAKMSLQAFDGFSKKVKDFFRGFYSSVLYVAKDDQIKTIFRDIIN 
LISKQNIVIQESVLISRIIDLTVYLSSHEELIKNREEYRKNSDAELLERMRTYFAEHYNE KIGLADMASHLGFSESYCSKYIKKKTNMNFLEYLNSCRIMRAETLLRTTENSVTEIAYAI GFTSIQSFNRVFKSFCDESPTEYRKRLHDKKNDP >gi|229784065|gb|GG667670.1| GENE 4 2372 - 3475 1254 367 aa, chain - ## HITS:1 COG:Cj1439c KEGG:ns NR:ns ## COG: Cj1439c COG0562 # Protein_GI_number: 15792757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-galactopyranose mutase # Organism: Campylobacter jejuni # 4 358 2 356 368 456 60.0 1e-128 MKKYDYILVGSGLYSGVFAYLAGKEGKRCLVVEKRDHIGGNIYCEEVEGIHVHRYGAHIF HTSNREVWQFVNELAEFNRFTNSPVANYKGEMYNMPFNMNTFSRMWGVSTPEEAKAKIEE QRASVQGEPKNLEEQAIRLVGTDLYRKLIKGYTEKQWGRDCKDLPSFIIKRIPVRFTYDN NYFNDLYQGIPVGGYNVIIDKLFEGCDIETGVDYLEDKERYDSLGETVIYTGTIDAYFGY CFGKLEYRSLRFETETVDTDNYQGVAVVNYTDCETPYTRVIEHKHFEFGTQPKSVITREY PVAWSEGMEPYYPVNDEKNQELYRRYETLAEGEKNVVFGGRLGEYKYYDMDKVIESAMRK AREVLGK >gi|229784065|gb|GG667670.1| GENE 5 3597 - 4985 1300 462 aa, chain - ## HITS:1 COG:MK0550 KEGG:ns NR:ns ## COG: MK0550 COG0069 # Protein_GI_number: 20093988 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Methanopyrus kandleri AV19 # 72 462 27 422 429 325 44.0 9e-89 MARFQCGVCGFIYDEEKEGKQFGELKECPVCRQPASVFFPAGGAEKAEQSEQAADGLGYD PAYVREDRSCRYMKEIHEMAVSGNTIIDAMGTKMAMPGWDDILILGAQLNPPPLMEHDPV TITTVIGKHAKKPMVLDGPVYISHMSFGAMSKEMKVALAKGSAMAGTAMCSGEGGILPEE KSAAYKYIFEYVPNRYSVTPDNLRESDAIEIKIGQGTKPGMGGHLPGAKVTPEIAAIRNK PLGEDVISPSKFEDIRSKEDLRDLVAQLRMASEGRPIGIKIAAGKIEKDLEYCVFAEPDF ITIDGRGGATGASPKLVRDSTSVPTVYALSRARKYLDEAGADIDLVITGGLRVSSDFAKA IAMGADAVAIASAGLIAAACQQYRICGSGNCPVGAATQDPELRKRLNESAAAARVGNFLK CSFKELKTFARITGHEDIHDMTVDDLCTLNREISEFTSIRHA >gi|229784065|gb|GG667670.1| GENE 6 5233 - 6006 751 257 aa, chain + ## HITS:1 COG:XF2269 KEGG:ns NR:ns ## COG: XF2269 COG1028 # Protein_GI_number: 15838860 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different 
specificities (related to short-chain alcohol dehydrogenases) # Organism: Xylella fastidiosa 9a5c # 3 250 7 252 255 81 31.0 1e-15 MSTYVITGGTTGIGAAVRKTLLDQGHEVFNIDFKGGDYLADLSTKEGRQGAIDAVFEKYP DGIDVLICNAGVGPTAPPKTIFALNFFASVQLAEGLRPLLKKKHGNCVITSSNSITNMTV RMDWVDMLSNVMDEDRILELAEDIPRNQTPSCYSSSKHALARWVRRISPSWAVDGLRINS VAPGNTTTPMTQGMTDAQMEAALLIPIPTRYGRKEFLDAQEIANGIVFLASPLASGINGV ILFVDGGIDALLRSEQI >gi|229784065|gb|GG667670.1| GENE 7 6166 - 6504 372 112 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20572 NR:ns ## KEGG: EUBELI_20572 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 111 1 109 110 63 31.0 2e-09 MTILEQYIECMKKGDEVALADLFNEYGVLHDSSVIKAGMDTIHLEGKMAVEMMFHHKFGF NGGPFPIHSVKYVGDNTVWYFITYNDHVIPVTAFLSETDENGKIKRLNIYPL >gi|229784065|gb|GG667670.1| GENE 8 6822 - 8486 1977 554 aa, chain - ## HITS:1 COG:CAC0273 KEGG:ns NR:ns ## COG: CAC0273 COG0119 # Protein_GI_number: 15893565 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Clostridium acetobutylicum # 4 551 3 554 558 629 55.0 1e-180 MLNYQRYKRVPVINYPERQWPNKEIEKAPVWCSVDLRDGNQALIEPMVVEEKIEMFNLLV KLGFKEIEIGFPAASQIEFDFLRQLVERKLIPDDVVVQVLVQCREHLIKRTFEAIQGIKK AIVHIYNSTSTLQRDVVFHKSKEEIKEIAIEGTEMVKKYMDGFDGEVILEYSPESFTGTE LDFALDICTAVQDTWGATPEKKIIINLPSTVEMTTPNVYADQIEWMDRHFKNRDSIILSV HPHNDRGTGIAATELALLAGADRVEGTLLGNGERTGNVDILTVAYNMFSQGINPELNLEN VREIVDVCERCTKMEVEPRHPYAGKLVFTAFSGSHQDAINKGMQAMHERNQEYWEVPYLP IDPADIGRQYEPIVRINSQSGKGGVAFVMDTFYGFRLPKGMHKEFADVIQAISEKQGEVA PEQIMDEFKKNYSERKEPIHFRKCQITEAEDDTEFSTLAKVRYTDHGVEKVFEGVGNGPI DAVQKGLEKALGIEIRVLDYNEHALTSGSGAQAASYIHLLDVKTGRATYGVGISSNITRA SIRGIFSAVNRLFY >gi|229784065|gb|GG667670.1| GENE 9 8727 - 9893 1492 388 aa, chain - ## HITS:1 COG:BS_yaaN KEGG:ns NR:ns ## COG: BS_yaaN COG3853 # Protein_GI_number: 16077094 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in tellurite resistance # Organism: Bacillus subtilis # 15 386 13 
383 386 244 42.0 1e-64 MSKDIETMINEAPVLTLDPFSTTELPVEGAHAVPAEVEEENLAESIPEAELTPEERKMID DFAARIDLTDSNLVLQYGAGAQKKIADFSEAALDNVKTKDLGEIGEMLSSVVTELRSFEA EEEDKGFFGFFKKSSNKLESMKAKYNKAEVNVNQICKILENHQIQLLKDIALLDKMYDLN TTYFKELSMYILAGKKKLNEVRGTMLPALMDKAHKTGLPEDAQAVNDMASLCNRFEKKIH DLELTRMISIQMAPQIRLVQNNDTLMSEKIQSTIVNTIPLWKSQMVLAIGINHSEQAARA QREVTDMTNELLRKNAEVLKTATIETAKESERGIVDMETLKITNESLITTLDEVVRIQTE GRQKRREAETELARMEGELKQKLLQLSN >gi|229784065|gb|GG667670.1| GENE 10 9886 - 11376 1372 496 aa, chain - ## HITS:1 COG:no KEGG:Closa_3842 NR:ns ## KEGG: Closa_3842 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 496 1 439 439 435 53.0 1e-120 MDHKDFSNLGDQIRDSVQDAIDSMDFKQLNRTISDTVNFALDEARNQMMKGMDAARRMDS GNAGWKGPDTGAGDKTDDARNTFQTEAGAGRRPGAANDSGSTGERGRTGCENRRYTTENA SRRTDRPGYRAFQTSETAGENGTAYRVRKMAPSPAAIRMNQPGRISGILFTVFGGIGLGV VGILFLVTWILALVVKPLWGAVLGSSGLLLVLGCVFGTMLGTGISTRGRLKRAGLYLKQA GSRLYCDIEELARSIGKTRRYVVRDLQKMIEKGILPEAHIDEQKTCLMLNEETYKQYLSA QKSLKQREKEMALLAEAEKKQKKGASAPEERKKAEEDEKQLPEGLKEMMAQGGEYLRILR EANDAIPGEAISGKISRLETVIRRIFETVERQPDQMEEMERFMDYYLPTTVKLVTAYRDF DRIEIEGDNIVSAKHEIEETLDTINLAFERLLDDLYQDAAIDITTDASVLQTMLKKDGFA ESDFSKSDLTGGKNHE >gi|229784065|gb|GG667670.1| GENE 11 11587 - 12939 1743 450 aa, chain - ## HITS:1 COG:SP1559 KEGG:ns NR:ns ## COG: SP1559 COG1109 # Protein_GI_number: 15901402 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Streptococcus pneumoniae TIGR4 # 1 446 1 446 450 418 50.0 1e-116 MGKYFGTDGFRGEANVTLTVEDAFKVGRFLGWYYGQRKKGDVCRIVIGKDTRRSSYMFEY SLVAGLTASGADAYLLHVTTTPSVSYVVRTEDFDCGIMISASHNPFYDNGIKVINGNGEK LEESVIVEIEKYLDGETHEVPLAKRDKIGRTVDFAAGRNRYIGYLISIATRSFKNKKVAL DCANGSASAIAKNVFDALGAETHVISNSPDGLNINTKCGSTHIEMLQEYVKEIGADVGFA YDGDADRCIAVDENGNVVDGDVIMFICGKYMKEQGALKNNKVVTTIMSNFGLYKALDREG IGYEKTAVGDKYVYENMVTYGNCLGGEQSGHIIFSKHATTGDGILTSLKVMEVMLEKKES LAKLASEIEIYPQVLKNVRVKDKAAAQDDAVVKAEVSRVTEDLGDSGRILLRQSGTEPVV RVMVEAADLETCEKYVDQVIEVMKQQGHVL >gi|229784065|gb|GG667670.1| 
GENE 12 13064 - 13834 864 256 aa, chain - ## HITS:1 COG:SPy0129 KEGG:ns NR:ns ## COG: SPy0129 COG4509 # Protein_GI_number: 15674344 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 74 179 55 160 237 101 37.0 2e-21 MKKKRIASILLLLAVVLAAAAGIKTYVDYQVRAAAEREYESLAELARQTEAPVETETPIE SEPEETEAPYVSPIQFDELKEINPDIVGWLKVEGTEIDYPIVQTGNNETYLNTDFEGKKS VAGAIYLDYESEPDFSGRHNIIYGHNMKNGSMFKDIVKYKDEEYFMEHQDITVYTPEREY HLRPMSVLYTEPTGMRRKTKFATEESFQAYVEEMTKGCSFAQIPEKPLEQLWSFITCSYE FNDARTILYAYEVTEE >gi|229784065|gb|GG667670.1| GENE 13 15048 - 16223 1096 391 aa, chain + ## HITS:1 COG:TM1119 KEGG:ns NR:ns ## COG: TM1119 COG3629 # Protein_GI_number: 15643876 # Func_class: T Signal transduction mechanisms # Function: DNA-binding transcriptional activator of the SARP family # Organism: Thermotoga maritima # 13 340 10 308 349 74 23.0 3e-13 MEPKGFEFSITIGSQVIMDRNNQSKKPWSLLEYLITFRDRNIPVEELIDLFWKDEASNNP AGALKTLMFRVRKLLEPLGIPPQDLITQNRKAYGWTKELTTITDTDRFEDLCLRSESTDI NSEECLSLCLEAVSLYKGNFLPKSEWESWVVPIHTYYHTLYQNLIHRTLLLLDEQKDYPT IIDVCQQAIAIDQYEEDAHYYLIYALYQSGNQYMAMEHYQHVNDMFYNEFAITPSTRFKE LYKLISDKKHGITMDLSTIQETLMEGAKEKGAFCCEPSVFRDVYQLETRAIERTGDSIFL CLITISNLKGELLKPSVQTRAMDELGESIRTSLRRGDIYSRYSVSQYLMLLPTATYENGE MVLKRIIQNFKKAYSRKDLSITYSLQSIIPS >gi|229784065|gb|GG667670.1| GENE 14 16240 - 17532 1367 430 aa, chain - ## HITS:1 COG:BS_yjeA_2 KEGG:ns NR:ns ## COG: BS_yjeA_2 COG0726 # Protein_GI_number: 16078275 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus subtilis # 221 416 19 211 217 167 45.0 3e-41 MPSARNLCYTYSRQMLAEVGNMKSIKWKILRRIGAFVLAAALVAGGQPFCSCADEAAAEE EIGPGIIPKVKPEDLTPLVVSYSTYQDSVGWSEVKADNDPLKVQTGSFASAIRMTVSGQP EVLKGTLAYQVNITGKGWLDWGENDVEIGAAPEQLPIEAVAMKFTDRLAEYYDIYYSVWE NGAWTEWAMNGTAAGAEGSGIRVEGLRAAAVLKGAEPPEMKPEEIDPTKPMVALTFDDGP STPVTTRILDSLEANGGRATFFMVGNRVPGTQAVVQRMNALGCEVANHTYEHKYLTKAGD 
AGIRSQVGLTNQKITEACGVTPTLVRPPGGFYNQASLDTLGSMGMAAVMWDIDTLDWKTR NAQNTINVVLNQVKDGDIVLMHDIYSTSADAAEVIIPELVNRGYQLVTVSEMAQYRGGIQ AGHVYNRFRP >gi|229784065|gb|GG667670.1| GENE 15 17711 - 18184 548 157 aa, chain + ## HITS:1 COG:CAC1604 KEGG:ns NR:ns ## COG: CAC1604 COG1803 # Protein_GI_number: 15894882 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Clostridium acetobutylicum # 7 150 1 144 149 183 57.0 1e-46 MIQEDFITLTLEKQKNIALIAHDNEKHALIDWCKEHKDILSQHTLCGTGTTARMITDQTG LSVKGYNSGPLGGDQQIGAKIVEGRIDLVIFFSDPLTAQPHDPDVKALLRIAQVYDIPIA NTRATADFIITSPLMSTTYEHQVINFNKNVADRANTL >gi|229784065|gb|GG667670.1| GENE 16 18284 - 18820 541 178 aa, chain - ## HITS:1 COG:lin0808 KEGG:ns NR:ns ## COG: lin0808 COG0317 # Protein_GI_number: 16799882 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Listeria innocua # 1 151 1 150 180 100 39.0 1e-21 MVDEAVAFAMKAHEGTFRKGTKVPYIVHPLETAVIVSMMSEDEELVCAALLHDVVEDAGV SEQQLETLFGQRVTAFVMEETEDKTKSWKERKAATLKHLETASRESKILVLGDKLSNLRC TARDHMVIGEAIWDRFNEKRRSEHAWYYNGVAERIRELSEYPACQEYFGLCRKVFGDS >gi|229784065|gb|GG667670.1| GENE 17 19109 - 19279 102 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622340|ref|ZP_06115275.1| ## NR: gi|266622340|ref|ZP_06115275.1| hypothetical protein CLOSTHATH_03558 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03558 [Clostridium hathewayi DSM 13479] # 1 56 1 56 56 94 100.0 2e-18 MNRTDELFFEIIETYQRHAQAVKNAKYQEIREMAEMIIDVDISSMWFLTQRIPDTK >gi|229784065|gb|GG667670.1| GENE 18 19405 - 19815 365 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622341|ref|ZP_06115276.1| ## NR: gi|266622341|ref|ZP_06115276.1| thermophilus bacteriophage protein [Clostridium hathewayi DSM 13479] thermophilus bacteriophage protein [Clostridium hathewayi DSM 13479] # 1 136 1 136 136 230 100.0 3e-59 MKKYNLSSIMKRAWELVKKAGTAMSEALKQAWREAKETMKELKGTPKQIAWAEDIRNTAI 
KYVKEGKEVWGKYPELLAGFEFVENRFSQLFEMHDEAVFYIEKRNFFSKDNIKEKVNDIA TKNVKKNNMAEGHILG >gi|229784065|gb|GG667670.1| GENE 19 19989 - 20186 226 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622342|ref|ZP_06115277.1| ## NR: gi|266622342|ref|ZP_06115277.1| cag pathogenicity island protein [Clostridium hathewayi DSM 13479] cag pathogenicity island protein [Clostridium hathewayi DSM 13479] # 1 65 1 65 65 78 100.0 1e-13 MAYSEKQKEYTMKYLEKLKEIRFRVRPEEYQKYEEAAEKAGYPSMRQFYMDAIDEKVNKI LNLRE >gi|229784065|gb|GG667670.1| GENE 20 20272 - 20382 66 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTFGECMEEFGFNREVEWNKPPLNGQKITTYFRKHK >gi|229784065|gb|GG667670.1| GENE 21 20497 - 20646 163 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622343|ref|ZP_06115278.1| ## NR: gi|266622343|ref|ZP_06115278.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 88 100.0 2e-16 MNVNMSEAARLILGLRDAGWNEKDINDFILFIETGEEKYKPKQKNKPTE >gi|229784065|gb|GG667670.1| GENE 22 20807 - 21058 305 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622344|ref|ZP_06115279.1| ## NR: gi|266622344|ref|ZP_06115279.1| hypothetical protein CLOSTHATH_03562 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03562 [Clostridium hathewayi DSM 13479] # 1 83 1 83 83 153 100.0 5e-36 MIKLVGEGHVLIKEFQTEEEGLDYMRKNKDDLMWYYSYYELVVPGEKTIWAVSDTLISDA EFRKQVDAKIAEKRDNNTLPEEG >gi|229784065|gb|GG667670.1| GENE 23 21079 - 21492 255 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622345|ref|ZP_06115280.1| ## NR: gi|266622345|ref|ZP_06115280.1| excisionase/Xis, DNA-binding [Clostridium hathewayi DSM 13479] excisionase/Xis, DNA-binding [Clostridium hathewayi DSM 13479] # 1 137 1 137 137 252 100.0 8e-66 MYTDMFADSYKQASYDIGSNGYKILKTPKWQGIIKATAIMQYNNGLVRMRWLDAEGKIVA EHRGLPEYDYYPILGEELHEEKEAEYVSLAEYAHMQKVSPDTVRQKILRGNLPAKKLGRN WCIRKNTPYTDNRRKNV >gi|229784065|gb|GG667670.1| GENE 24 21485 - 21805 240 106 aa, chain + ## HITS:1 COG:no KEGG:Dtox_2202 NR:ns 
## KEGG: Dtox_2202 # Name: not_defined # Def: hypothetical protein # Organism: D.acetoxidans # Pathway: not_defined # 8 103 28 123 502 71 41.0 1e-11 MYDMNYAEIVKAYESLQSLNEVSYKFGISKGKVKKILITAGAYENEISRRVMELYATGKS TQEIAKELKISKSCVNMCLPYTKGAYRSDTPTINAMRIRKYREENS >gi|229784065|gb|GG667670.1| GENE 25 21953 - 22099 247 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870778|ref|ZP_06115282.2| ## NR: gi|288870778|ref|ZP_06115282.2| LysR-family transcriptional regulator [Clostridium hathewayi DSM 13479] LysR-family transcriptional regulator [Clostridium hathewayi DSM 13479] # 1 48 3 50 50 62 100.0 1e-08 MEEFEQILAKMRERGIIPRIHYELMAIRVLGAIVAVLIGVGIVWVMVK >gi|229784065|gb|GG667670.1| GENE 26 22224 - 22388 176 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622348|ref|ZP_06115283.1| ## NR: gi|266622348|ref|ZP_06115283.1| putative toxin-antitoxin system protein [Clostridium hathewayi DSM 13479] putative toxin-antitoxin system protein [Clostridium hathewayi DSM 13479] # 1 54 1 54 54 72 100.0 1e-11 MKPLKSKVSITLDADIIEKIKEMAENDDRSFSQYVNMVLKEYISKENRTKRNDN >gi|229784065|gb|GG667670.1| GENE 27 22457 - 23155 579 232 aa, chain - ## HITS:1 COG:no KEGG:Ppro_1118 NR:ns ## KEGG: Ppro_1118 # Name: not_defined # Def: BRO domain-containing protein # Organism: P.propionicus # Pathway: not_defined # 86 226 70 217 247 89 34.0 9e-17 MRLQLVKQGDFLGTKCDFYVNETGDIFMSRTQIGYALKYKNPSKGIEDIHNRHHDRLDTM SVKVDPLSLQGSNPHYRNGERAYMYPEKGIYEICRYSRQKVAGDFYDWVYDVIQSIKKNG YYIASEKDEKWLGIRQETKEVRKAETDQIKLFVEYARAQGSQHADRYYVSLTKLINRRLG IESGGRDKADQRTLMHLKSLETVVELHLVTLMAEGLPYREIYQGVKKFIEAL >gi|229784065|gb|GG667670.1| GENE 28 23248 - 24117 530 289 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 177 267 509 625 744 77 32.0 2e-14 MNIVKDFIDSATKWNGYLEKKSNKDLDSFTANAGKNNYTCFSRDYQRDTGLNLQGQPWCA 
MYVSEVFVQAFGLNTAKKLLCGALYHYCPTGVNQFKAAGRWHKVPEPGDVIFFTNGTRAY HTGIVTEVTSSRVKTIEGNTSVASGVIENGGGVCRKSYALVESKIMGYGRPDWNSAKQPA EQPKKSGWVQEDGGWRYYNGDTCQYVRNDWVQDGQDWYWFDGAGMMVRNTWYRYKDAWYY LGDDGAMCTGQVTVDGKWYIMDNAGRMIVEPVTLTPDRDGALRWPGLAE >gi|229784065|gb|GG667670.1| GENE 29 24206 - 24463 399 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622351|ref|ZP_06115286.1| ## NR: gi|266622351|ref|ZP_06115286.1| putative holin [Clostridium hathewayi DSM 13479] putative holin [Clostridium hathewayi DSM 13479] # 1 85 1 85 85 137 100.0 2e-31 MNFGIASVAGITVICYLAAMAVKATEVDNKWLPVICGLIGGILGVVGMFYMPDFPAADII NAVAIGIVSGLAATGINQAYKQLTK >gi|229784065|gb|GG667670.1| GENE 30 24466 - 24705 395 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622352|ref|ZP_06115287.1| ## NR: gi|266622352|ref|ZP_06115287.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 79 1 79 79 137 100.0 3e-31 MPTDVMVALIGLGGSAIGTFAGVFASAKLTAYRLEQLEKKVDKHNTVIERTYKLEETQAV IQEQIKVVNHRISDLEREE >gi|229784065|gb|GG667670.1| GENE 31 24725 - 24919 227 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622353|ref|ZP_06115288.1| ## NR: gi|266622353|ref|ZP_06115288.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 64 12 75 75 101 100.0 2e-20 MNKNKPDMNYKTTVPYGPATGKEDPGRQPVIDSTPYEGDRGPDHRQFKPGHVPGGPGHKD CEHE Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:26:03 2011 Seq name: gi|229784064|gb|GG667671.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld64, whole genome shotgun sequence Length of sequence - 46764 bp Number of predicted genes - 42, with homology - 39 Number of transcription units - 22, operones - 14 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 594 468 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) + Prom 620 - 679 3.0 2 2 Tu 1 . 
+ CDS 707 - 1144 292 ## Clos_1272 hypothetical protein + Prom 1204 - 1263 6.0 3 3 Op 1 35/0.000 + CDS 1362 - 2756 1691 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 2767 - 2805 9.3 4 3 Op 2 38/0.000 + CDS 2819 - 3700 1047 ## COG1175 ABC-type sugar transport systems, permease components 5 3 Op 3 . + CDS 3697 - 4515 684 ## COG0395 ABC-type sugar transport system, permease component 6 3 Op 4 . + CDS 4541 - 5893 1290 ## CbC4_2175 hypothetical protein + Prom 6740 - 6799 80.4 7 4 Op 1 . + CDS 6851 - 7417 448 ## CbC4_2175 hypothetical protein 8 4 Op 2 7/0.000 + CDS 7437 - 9233 1959 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 9 4 Op 3 . + CDS 10196 - 10819 630 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 10823 - 10852 -0.9 + Prom 10825 - 10884 6.2 10 4 Op 4 . + CDS 10911 - 12005 1467 ## COG0180 Tryptophanyl-tRNA synthetase + Term 12017 - 12075 15.5 + Prom 12007 - 12066 3.7 11 5 Op 1 . + CDS 12093 - 12344 325 ## Closa_2450 hypothetical protein 12 5 Op 2 . + CDS 12357 - 13223 827 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 13284 - 13341 -0.9 + Prom 14306 - 14365 80.4 13 6 Op 1 2/0.000 + CDS 14451 - 14879 428 ## COG0450 Peroxiredoxin 14 6 Op 2 . + CDS 14881 - 15909 481 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 15 7 Op 1 . + CDS 16845 - 16937 97 ## 16 7 Op 2 . + CDS 16968 - 18263 1422 ## COG0148 Enolase + Term 18332 - 18380 12.0 17 8 Tu 1 . - CDS 18396 - 18935 611 ## gi|266622371|ref|ZP_06115306.1| conserved hypothetical protein - Prom 19166 - 19225 6.3 18 9 Op 1 1/0.333 + CDS 19314 - 19910 552 ## COG2091 Phosphopantetheinyl transferase 19 9 Op 2 . + CDS 19907 - 20476 659 ## COG1309 Transcriptional regulator + Term 20537 - 20585 15.6 - Term 20519 - 20577 15.6 20 10 Op 1 . 
- CDS 20617 - 20847 383 ## gi|288870786|ref|ZP_06115310.2| conserved domain protein - Prom 20899 - 20958 2.5 - Term 20919 - 20962 1.2 21 10 Op 2 . - CDS 20989 - 21444 377 ## gi|266622376|ref|ZP_06115311.1| iron compound ABC transporter, ATP-binding protein 22 11 Tu 1 . - CDS 22393 - 22491 85 ## - Term 22508 - 22540 1.6 23 12 Op 1 3/0.000 - CDS 22553 - 23488 786 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 24 12 Op 2 . - CDS 23492 - 24346 808 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 24355 - 24414 5.2 25 13 Tu 1 . + CDS 24594 - 25763 1330 ## COG2230 Cyclopropane fatty acid synthase and related methyltransferases + Term 25927 - 25973 5.6 26 14 Op 1 . + CDS 26112 - 26354 399 ## Ethha_1384 glutaredoxin 27 14 Op 2 . + CDS 26383 - 27933 1555 ## COG1409 Predicted phosphohydrolases 28 14 Op 3 . + CDS 27988 - 28875 954 ## COG1737 Transcriptional regulators + Term 28989 - 29033 9.1 + Prom 29146 - 29205 4.7 29 15 Op 1 19/0.000 + CDS 29384 - 30289 970 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) 30 15 Op 2 . + CDS 30303 - 32258 2393 ## COG1299 Phosphotransferase system, fructose-specific IIC component + Term 32283 - 32330 8.5 + Prom 32304 - 32363 5.0 31 16 Op 1 . + CDS 32482 - 33624 838 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases 32 16 Op 2 . + CDS 33638 - 34519 926 ## BAD_0201 hypothetical protein 33 17 Tu 1 . - CDS 35556 - 35696 134 ## - Prom 35802 - 35861 5.7 + Prom 35536 - 35595 20.7 34 18 Op 1 . + CDS 35832 - 36701 914 ## Clocel_3251 adenylyltransferase 35 18 Op 2 . + CDS 36802 - 37812 1039 ## COG0673 Predicted dehydrogenases and related proteins 36 18 Op 3 . + CDS 37790 - 38443 710 ## COG0546 Predicted phosphatases + Term 38491 - 38547 0.8 37 19 Tu 1 . + CDS 39658 - 40689 1053 ## COG1316 Transcriptional regulator + Prom 41528 - 41587 80.4 38 20 Op 1 . + CDS 41670 - 42269 704 ## EUBREC_1586 hypothetical protein 39 20 Op 2 . 
+ CDS 42273 - 42704 605 ## COG0716 Flavodoxins + Term 42740 - 42788 10.4 - Term 42728 - 42776 11.2 40 21 Tu 1 . - CDS 42795 - 43751 737 ## COG0366 Glycosidases - Prom 43886 - 43945 19.8 41 22 Op 1 . + CDS 44849 - 45676 685 ## COG2207 AraC-type DNA-binding domain-containing proteins 42 22 Op 2 . + CDS 45715 - 46762 1080 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin Predicted protein(s) >gi|229784064|gb|GG667671.1| GENE 1 1 - 594 468 197 aa, chain + ## HITS:1 COG:SP0666 KEGG:ns NR:ns ## COG: SP0666 COG0596 # Protein_GI_number: 15900567 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Streptococcus pneumoniae TIGR4 # 19 177 47 202 205 64 26.0 1e-10 IQCLNVFALKDWEEISYSRVYEDFSAYCESVKTEFGLCGLSLGAVIALNYVVEHPGKVTS LVLIGGQYVMPKGLLRLQNMIFRVMPNGIFKKMGLGKRELIQLTKSMMALDFKEDLEKVD CRTLIVCGEKDRANRRAAEMMAEKIPGANIKILEGCGHEVNVEAPEKLAESLEAFYEEKQ DGNLTEAVVYDQHAGGD >gi|229784064|gb|GG667671.1| GENE 2 707 - 1144 292 145 aa, chain + ## HITS:1 COG:no KEGG:Clos_1272 NR:ns ## KEGG: Clos_1272 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 1 144 1 118 121 97 43.0 2e-19 MININKETLEEAIFVIASIVERCENAQTKFREGTSQHTLLKNRIRALYLSKGLVEAELLR MGGRLSAACCEEAGKCETPDSCVILGEGSFMEELKSAVPPIVSIISKCEKAQSKYEKGTA QFRRYEKIILAMKVSKTLLEDQLEM >gi|229784064|gb|GG667671.1| GENE 3 1362 - 2756 1691 464 aa, chain + ## HITS:1 COG:lin1841 KEGG:ns NR:ns ## COG: lin1841 COG1653 # Protein_GI_number: 16800908 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Listeria innocua # 83 426 50 385 414 79 24.0 1e-14 MKKRWITLTLTAAVMAGLCGCGGNSGTSGAGDTTGTKAETETTTAKAETVAQTAKEEQKG EITLQFWEMNYGSDDAYLETCEKLIAQYESENPGVNIEVTVQPWDNYYQLFLTAVSSNAA PDVASVALDLPDMYARMGEALELNDVVAEWKQDGTIDDYGEATMKSFTYEGDQIAVPYMV GYRGIFYRSDYLKECGIEKVPETWDELLAACEKIEEVRPDLAPLDFAGADMGLWHYFMCF 
AMSNDAGIVKEDGTPGSTDDRYKAAIDLFKTFKEKGYVADGVLGHKSADAEKLYYSGKSV FWFTSFPASIFDYPEILENTEVYSAFKGPDGTAVRTSSFSDALMAFSQTKYPEEAKKFVK WWAENSSALYTEGGCWAYPAKKSFFSGETFNNKVAQGVLNKIVPTAVPCTWPIENIFPGF SQVNGEQTMNYSAQEALNGKVDTGEIARIQAEKIQEAIDIATEE >gi|229784064|gb|GG667671.1| GENE 4 2819 - 3700 1047 293 aa, chain + ## HITS:1 COG:PM1761 KEGG:ns NR:ns ## COG: PM1761 COG1175 # Protein_GI_number: 15603626 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Pasteurella multocida # 12 286 12 285 294 171 36.0 2e-42 MEKNRRRFGLKMLVMPTIILLLVSFYPLFNSIYLSVTNTNILKKGQTVRFTGAANFIKLF SDRTMGPSLVYSLEYAFICVAASYVLGLFLAVLLNQKIKGRALFRGLILIPWAIPTVVAT ANWLWILNDQSGIVNVVLQRLHLTDGPILFFADQRMARITCIVVGTWKSYPFMCMSLLAG LQGVPEDVYESARMDGATGIKTFFYITLPMIRNVSMVVVTLMFIWGFNNFDIIYLLTQGG PLEATFTLPIYTYNTAFYRGQMGYASAISMMTLVILLFFCAVYMKMQKGKDDV >gi|229784064|gb|GG667671.1| GENE 5 3697 - 4515 684 272 aa, chain + ## HITS:1 COG:mll7152 KEGG:ns NR:ns ## COG: mll7152 COG0395 # Protein_GI_number: 13475955 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 21 272 6 263 263 164 33.0 2e-40 MKNKKGIVYALLIFFCAAANLPVLSMIGTALKPRSETLSTVSLFTLHPTLDNFIHVLTRT SFAQSLVNTIFVALIVTVLCVAFSTMAGFGLSRCRGILFDGYTMFLIILQMLPAMLIMIP LYVTLSRFHLVDTHFSLILIYTAINLPFGIWTLKGFFDDIPTELDEAAMIDGCSRFQAYW RIILPFSLPGVASVGVFTLMNVWNEYTIASIFIQDKNLRTLSVGIRQFMMQNTTDWASMS AAATIAVLPAFVMIIFAQKYLVGGLTSGSVKG >gi|229784064|gb|GG667671.1| GENE 6 4541 - 5893 1290 450 aa, chain + ## HITS:1 COG:no KEGG:CbC4_2175 NR:ns ## KEGG: CbC4_2175 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 71 440 76 444 645 281 41.0 3e-74 MCRNDRETILYAAEELKRYLSLAGVSAWTSVGEAHSQYMKTLELAVAPEEFRVKEGQCLE RPSPSIWQNADAYAITVTDLGGTIRGSNNRSVLLGAYRYLKELGFAFLRPDREGERIPAA VGPHTVKIREEASSRYRGICIEGAVSFENIMRFVTWLPKAGFNTYFTQFKIPYEFFKTWY LHTNNPYHGKEPVPSEEEVDAFVKEILVPHMKKLGLIWQAGGHGFTTGVAGLPGTGWDPL 
AEEKVPEERREYLAMIDGKRGLFHGVPLNTSLCCSNPLVRELLVREILDYIHANPQLDMV HVWLADDGNNSCECGACAAKRPSDWYIEILNQVDEALSKEGSPVKVVFLAYFDLLWPPVS AKLLNSERFVFMFAPITRSYRTPLPVEATALIPPYKRNQCRFPVNAGENMHYCSAWKQFF PGDSFLYDYHYMWNQFRDWGDYGSAEILWN >gi|229784064|gb|GG667671.1| GENE 7 6851 - 7417 448 188 aa, chain + ## HITS:1 COG:no KEGG:CbC4_2175 NR:ns ## KEGG: CbC4_2175 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 3 177 459 632 645 79 28.0 6e-14 MNLEEAGFDGYVSCQQTRVFAPAGFGMYVMAETLWNRSRTFETLEREYFQMVYGDQAETV LSYCKELSALSYMEQPENDDPGVCAGAAKKLKAAADLIRTYRPLFEKNFGDEKIQDHTAW KYLLYSGRAAEMYISMLKYRRQGSEDRVSEEYRKLKEYLGRTEEEWQEGFDVYWFIKDRD KKFLPSDT >gi|229784064|gb|GG667671.1| GENE 8 7437 - 9233 1959 598 aa, chain + ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 5 582 2 571 577 189 26.0 2e-47 MGDRRKRIPALLRRIGLKRRLFIGFVIVSLLPMAVLGVIMGVRLYHTIMDNYEEQSIQAL GTIQYRIGIVTGAKEERINAFASDADVVKSVRKLNMDISLLDRTALTRDLQGKLLAELSQ IKGGVRAEILDRNGAYVTGIFDDRYTALSGEARARAAESPDRLHLWHGTDQEGEEIIVFS RQIINYFNGRTTGWVLLYVGTDELKEEMKTVQTDAVIAVYDEEGGLICSGTGNGNIADIA DEDLNEAGFRVSGYRDSEDMWKADWKLESGRYQVLFQNMAGTGWEIAIFKDYHELVRSFA GIGILAAVICLVSIGIILYLSLLITRTVAEPVQNIAESMKIFSQGNLSVRVADDAADEIG MLGDEFNSMSNNIERLTHEVYEARIKEKEAGLRALEAQINPHFLYNTLDMINWMSYRSSN QDICKIVKSLSDFFKLSLNHGKELYTVGDEVRHIQSYVTIEQYKKAEIDFEIHVEEELKP CVCPKLIVQPLVENAILHGIEPKRGPGTILISFIRRENDIVITVEDDGVGLCAKQREGAR YHHGGYGIQNINERLTMLYGKKYHVTVEERKEGGVISSVVIPLIGLEGLDDVSDDCRG >gi|229784064|gb|GG667671.1| GENE 9 10196 - 10819 630 207 aa, chain + ## HITS:1 COG:SP0661 KEGG:ns NR:ns ## COG: SP0661 COG4753 # Protein_GI_number: 15900562 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 6 203 47 241 245 
120 34.0 2e-27 MITDSRRFPDIVLTDIRMPGMDGLELSGKIREHSAASKIIFISGYEDFAYAKKAISLGAS GYVTKPVAQDELLELINRVMVQIRKEEQFDRQQEISCFHENQTDALLGDILSQMRDNPGG VSLKALSESWGVSPSYISILFKDKTGHNFKDYLLDCRMKRAKELLAEGSPAAEICENLGY SDYDYFSKSYKKYYGESPAEYRKRINL >gi|229784064|gb|GG667671.1| GENE 10 10911 - 12005 1467 364 aa, chain + ## HITS:1 COG:SPy2207 KEGG:ns NR:ns ## COG: SPy2207 COG0180 # Protein_GI_number: 15675940 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Streptococcus pyogenes M1 GAS # 2 346 3 340 340 427 63.0 1e-119 MKKIILTGDRPTGKLHIGHYVGSLKRRVELQNSGEFDEIFIMIADAQALTDNADNPEKVR QNIIEVALDYLSCGLDPAKSTLFIQSQIPELTELSFYYMNLVTVARLQRNPTVKSEIAMR NFEASIPVGFFTYPISQAADITAFKATTVPVGEDQEPMIEQTREIVRKFNSVYGEALVEP DILLPDNKACLRLPGTDGKAKMSKSLGNCIYLSDSEADVKKKVMSMYTDPNHIQVSDPGQ IEGNTVFTYLDAFCREDAFEKYLPDYKNLDELKDHYRRGGLGDVKVKKFLNSVLQEELAP IRARRKEYEANIPYVYQILKEGSEKAERVAADTLAGVKAAMKINYFDDMELIAAQAEKYK NQES >gi|229784064|gb|GG667671.1| GENE 11 12093 - 12344 325 83 aa, chain + ## HITS:1 COG:no KEGG:Closa_2450 NR:ns ## KEGG: Closa_2450 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 80 3 79 80 92 55.0 4e-18 MKDVISLREIVDCDAFLKERGLNFRIHLRDACGKQSCWIEPLGECGCDGQYEELYAALEE FFGKLRYKLEYSDDKLNFWLSGK >gi|229784064|gb|GG667671.1| GENE 12 12357 - 13223 827 288 aa, chain + ## HITS:1 COG:SA0684 KEGG:ns NR:ns ## COG: SA0684 COG0697 # Protein_GI_number: 15926406 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Staphylococcus aureus N315 # 1 278 1 276 288 197 43.0 2e-50 MKSRYKGIVFIVCSAFCFALMNLFVRLSGDLPSVQKSFFRNLVAFFFALILMKRDHVGFS GKKENLWPLFLRSAFGTVGILCNFYAVDHLVLADASMLNKMSPFFAILFSFLILKEKVKP VQAVIVAGAFAGSLFIIKPSPLNMELVPAVIGLLGGMSAGAAYTFVRSLGIRGEKGPFIV CFFSGFSCLATLPYLLVRFHGMTLEQLGYLLLAGLAAAGGQFAITAAYCHAPAREISVYD 
YSQIIFSAVLGFVIFGQVPDRFSVLGYVIICAMAVVMFLYNNRGTQNE >gi|229784064|gb|GG667671.1| GENE 13 14451 - 14879 428 142 aa, chain + ## HITS:1 COG:FN1983 KEGG:ns NR:ns ## COG: FN1983 COG0450 # Protein_GI_number: 19705279 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Fusobacterium nucleatum # 1 142 48 188 188 208 69.0 3e-54 MCPTELEDLADKYEDFKKIGCEIYSVSCDTHFVHKAWHDVSKTIQKIQYPMLADPTGALA RDFDVMIEADGLAERGSFIVNPEGKIVAYEVIAGNVGRNAEELFRRVQASQFVAEHGDEV CPAKWKPGEETLKPSLNLVGLL >gi|229784064|gb|GG667671.1| GENE 14 14881 - 15909 481 343 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 5 307 2 303 306 189 36 2e-47 MRGQYDVIVVGAGPAGLSAALYAARAKYRVLVLEKEKMGGQITITSEIVNYPGVEKTDGT ELTEQMRRQAEAFGAEFAMAEVTGIDLDGDIRTVKTDKGEYRAPGIVLAVGANPRKLGFV GEREYQGRGIAYCATCDGEFFTGMDVFVIGGGFAAAEEAVFLTKYAKKVTIIVREEDFTC AKAVADKAKNHEKIEVHYETEIVEAAGDGLLRRARFKDNASGGEWTYEAPEGTSFGIFVF AGYVPDTRWLEGFVELDGQGYIITDRNQKTSVDGIYAAGDVCVKNLRQVVTAVSDGAVAA TSLERYVSETYEKPELAQWKAQMEEERTKAVENSGASKHTGAA >gi|229784064|gb|GG667671.1| GENE 15 16845 - 16937 97 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSVPCTIINDRTVVFGRKTVDDLLAVLEES >gi|229784064|gb|GG667671.1| GENE 16 16968 - 18263 1422 431 aa, chain + ## HITS:1 COG:CAC0713 KEGG:ns NR:ns ## COG: CAC0713 COG0148 # Protein_GI_number: 15894001 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Clostridium acetobutylicum # 2 427 3 426 431 537 63.0 1e-152 MNYLEIEKVIGREIIDSRGNPTVEAEVWLADGTIGRGAAPSGASTGEFEALELRDNDKKR FGGKGVRKAVENINTVISGVLTGMDASDIYAVDHAMIQADGTKDKSKLGANSILAVSIAC ARAASISMGIPLYRFLGGISGNRLPVPMMNILNGGAHAANTVDVQEFMIMPVGAPTFREG LRWCTEVFHCLASILKEKGLATSVGDEGGFAPDLGSDEEAIELILEAVKRAGYEPGHDFV LAMDAASSEWKGGKKGEYVLPKCKKHFTSEELIGHWKQLCEKYPIWSIEDALDEEDWEGW KKLTASLGSKVQLVGDDLFVTNTERLKKGIDSGCGNSILIKLNQIGSVSETLEAIKMAHK AGYTAIASHRSGETEDTTIADLAVALNTCQIKTGAPSRSERVAKYNQLLRIEEELGETAV YPGRNAFCVKR 
>gi|229784064|gb|GG667671.1| GENE 17 18396 - 18935 611 179 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622371|ref|ZP_06115306.1| ## NR: gi|266622371|ref|ZP_06115306.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 179 1 179 179 337 100.0 2e-91 MSNTNNAGGSKDIPDNFPAEDLVLKSSMQFFGEQLDRSDLVPLLLTTLMSGTMDQKDRIL TANRLITESGTLPRPDLVKMQAVLYTFANKFLSNDDLDKVKEVMFMTRLGQMLVNDGIEI GIEKGIEKGIEKGALALISVCREIGLSYDDTRRKLIEKLELDAPAAKRYMEEFWTRSSL >gi|229784064|gb|GG667671.1| GENE 18 19314 - 19910 552 198 aa, chain + ## HITS:1 COG:BS_sfpm KEGG:ns NR:ns ## COG: BS_sfpm COG2091 # Protein_GI_number: 16081163 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Bacillus subtilis # 54 162 73 181 224 75 31.0 8e-14 MVWIYHAFFEPGTGGNKRETEHEQGLWLLRRALRERYGIGCGEGRTPELVEGAHGKPYLK EYPLIHFNISHCMGLAVLAIGDCTVGIDAECVRPYREPLLKRVLSDAELRQMKEAGESER EELFFRFWTLKESYVKAVGCGITVPLQDISFQIGKNGEIACEKTGWSYRQWKLAEKYIVS ACVAGEGEIRLAEQEEQP >gi|229784064|gb|GG667671.1| GENE 19 19907 - 20476 659 189 aa, chain + ## HITS:1 COG:BH0719 KEGG:ns NR:ns ## COG: BH0719 COG1309 # Protein_GI_number: 15613282 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 5 177 4 176 188 87 30.0 2e-17 MSGEKKDLRYSRTDRLLQEAFLELLKVKSVDRITIRDLTERAGINRCTFYHHYQDIYDLL EQIEDGVMEHVLEMMRGFHPNREEDVSRRYFECFCQYIYENRTVYCVLTNGQESRFIKKL LKMISEYMAGLSGKENDGDKLSREYAVAYSIGGVVGVLHKWMKEGFESPPEMVAGFISNV FLNGMKDIL >gi|229784064|gb|GG667671.1| GENE 20 20617 - 20847 383 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870786|ref|ZP_06115310.2| ## NR: gi|288870786|ref|ZP_06115310.2| conserved domain protein [Clostridium hathewayi DSM 13479] conserved domain protein [Clostridium hathewayi DSM 13479] # 1 76 30 105 105 132 100.0 9e-30 MYKQLKELVTQYVDVSPESITRDSRFVEDLGFDSLDFIAMIGDIEEHFQIETDISDLLEI KTVDDMFRYIERVAVS >gi|229784064|gb|GG667671.1| GENE 21 20989 - 21444 377 151 aa, chain - ## 
HITS:1 COG:no KEGG:no NR:gi|266622376|ref|ZP_06115311.1| ## NR: gi|266622376|ref|ZP_06115311.1| iron compound ABC transporter, ATP-binding protein [Clostridium hathewayi DSM 13479] iron compound ABC transporter, ATP-binding protein [Clostridium hathewayi DSM 13479] # 1 151 9 159 159 256 99.0 4e-67 MLETDEAQNGFNGTEADLSLQAEAVPEVTAGRAASLEESDYKGFAGQIQKVFSERDIEAL SRLCAYPVYVTTEANTEGQDVADAAALISQKDDIFTDAMLQAIAGVDPEELTPSQAGIFI GSESKSPGLFFSLTEDGQLNIMGINSEVIDQ >gi|229784064|gb|GG667671.1| GENE 22 22393 - 22491 85 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIKKCLLAAAVTAAALLCLTGCQPKKDPLQFG >gi|229784064|gb|GG667671.1| GENE 23 22553 - 23488 786 311 aa, chain - ## HITS:1 COG:BS_ybfH KEGG:ns NR:ns ## COG: BS_ybfH COG0697 # Protein_GI_number: 16077290 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 1 273 1 273 306 260 51.0 3e-69 MSEKPNMAGHLAAAVTIFIWGTTFISTKVLLVSFTPLEILFFRFFIGYAALWIAAPRLLH TKDIKKELLFAAAGLCGVTLYFLMENFALTLTQAANVSIIISVAPFFTSLFDWLFMKGEK PGRRFLTGFTAAMLGISLLSFQKGTGVQLSPKGDLLALAAAITWAAYSILTKKIGSYGYG TIESTRRTFFYGLIFMVPALFFLPFQIAPARFMVMQNALNILFLGFGASALCFVTWNFAV RALGSVKTSVYIYAVPVITVITSLIFLHEVITWQSACGIVLTLAGLILSESRITLKQTAR KPSVDIGMDRP >gi|229784064|gb|GG667671.1| GENE 24 23492 - 24346 808 284 aa, chain - ## HITS:1 COG:BS_ybfI KEGG:ns NR:ns ## COG: BS_ybfI COG2207 # Protein_GI_number: 16077291 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 6 271 4 269 275 231 45.0 1e-60 MTTSQEQRHIFYDSCLGIEAYQLSGIVQKFPNHFHECYVVGFIEGGCRHLWCRGSEYDLS KGDLILFNPKDNHFCSPMNGELLDYRALNIPVPVMKRAAADITGREFIPQFTVNVVYHSD ITTSLADVYNAIVTGASPFEKEEAFFFLLEQILKEHSAPYEEHNGVFLDSAIADTCDYLK EHYSENITLEELTDQIHLSKSWLLHTFTKQTGVSPYRYLQSIRLEEAKKLLEQAVPPIDA AARAGFTDQSHFTRFFKDFTGLTPKQYQKIFTESGRDTAAGTEE >gi|229784064|gb|GG667671.1| GENE 25 24594 - 25763 1330 389 aa, 
chain + ## HITS:1 COG:CAC0877 KEGG:ns NR:ns ## COG: CAC0877 COG2230 # Protein_GI_number: 15894164 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cyclopropane fatty acid synthase and related methyltransferases # Organism: Clostridium acetobutylicum # 27 389 27 388 391 355 48.0 1e-97 MKNMTDNFMISFLQRFDEHPFDVKIHGKTHHIGSGNPAFVVDLKEDIEGKELLTSTSLAL GEAYMKGNLEVEGDLFEALDSFLGQMDRFSTDRSKLKKLMFTSQSKKNQEKEVTSHYDIG NDFYSLWLDETMSYSCGYFKHETDTLYEAQCNKVERILEKLCLKKDMTLLDIGCGWGYLL IRAAKEYGVKGLGITLSHEQQKEFEERIEKEGLTGQIEVARMDYRELEKSGRQFDRVVSV GMLEHVGRTNYELFVRNVDSVLKPGGLLLLHYISALKEYPGDPWIRKYIFPGGMVPSLRE MIHLFGEYRFYTLDVESLRRHYYKTLRCWERNFSGHRQEVEANMGTEFARMWELYLSSCA ATFHNGIIDLHQILASKGVNNDLPMTRWY >gi|229784064|gb|GG667671.1| GENE 26 26112 - 26354 399 80 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1384 NR:ns ## KEGG: Ethha_1384 # Name: not_defined # Def: glutaredoxin # Organism: E.harbinense # Pathway: not_defined # 4 79 10 85 87 82 46.0 4e-15 MIQESCPYCRQALRMMDELKEERPEYKAVEVKIVDENREKALADSLDYWYVPTYFVDGVK VHEGVPTMEKVRKVYEKALN >gi|229784064|gb|GG667671.1| GENE 27 26383 - 27933 1555 516 aa, chain + ## HITS:1 COG:lin2791 KEGG:ns NR:ns ## COG: lin2791 COG1409 # Protein_GI_number: 16801852 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Listeria innocua # 104 466 33 402 443 216 33.0 9e-56 MSNGKKNWLIILIYAGLIAMCLGIVWFVGKLSSPTEPGRESGTWTTGPQTEETQEERKEG ETESVKATPSEAETQEEGRQQGESGNWYSGRTETGRMETEEELPVEEYRPPTIVTVSDIH YFPSSMTDYGEAFDELVKRDDGKLIPYISQLMDVFTEEMTELMPDAVVLSGDLTLNGEKE AHEALAGKLKVLADHGVRVLVIPGNHDINNPHAASYFGEERTAAETVNAEGFLDIYHEFG YDQAISRDETSLSYICRLDEKNWLMMLDSAQYDPVNLVGGRIRSTTVKWMEEQLEAAKEL GITVIPIAHHNLLKESILYPVDCTLENSGEVIRLLERYRVPVFISGHLHLQRTKKYKPEP GEPEDVYHISEIVADSFAIPPCQYGILSWGDGGELRYATKEADMERYAAEHGLTDENLLN FKEYGTDFLVGVISQQIYQKMSSLPEEHMRKMAELYGQLNRDYCAGTPIDVKEIKSEEAF RLWERNLPDSRMFEEIDDILRDTGKDHNTWELPGEW >gi|229784064|gb|GG667671.1| GENE 28 27988 - 28875 954 295 aa, chain + ## HITS:1 COG:BH2675 KEGG:ns NR:ns ## COG: BH2675 COG1737 # 
Protein_GI_number: 15615238 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 23 277 17 269 287 122 28.0 9e-28 MRDRNGGRGAMQDHINVSGVLCSSYDTFFEAEKKIADYILEHKSEIIDMTVAELALASGT SDATVSRFCRRCGFNGFHHLKITLAKEVAEERGNSVEVSNDISRENIAQSLQNILANKVA ELTQTVSMMDPVQLGKILDLIEHARTVQLVAVGNTIPVALDGAFKLNQLGINAVSGTIWE AQMAFTCNLGPEDVVIIISNSGFSKRLMTLTAVARENQCRMIAITNNAASPLGQACDYHI TTATREKLLLGEFCFSRVSATMAVEILYLFLAVSKKDSYEHIRRHEIAISEDKNN >gi|229784064|gb|GG667671.1| GENE 29 29384 - 30289 970 301 aa, chain + ## HITS:1 COG:SPy0854 KEGG:ns NR:ns ## COG: SPy0854 COG1105 # Protein_GI_number: 15674887 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Streptococcus pyogenes M1 GAS # 1 300 1 298 303 261 44.0 9e-70 MIYTVTFNPALDYVVKVNHFTLGMVNRTVQEDIYYGGKGINVSAVLKTLGFQSTALGFIA GFTGDEIERGVKNLGFQSDFIRVENGLSRINVKLKSEEESEINGMGPEITEGDVEKLFKR LEKLKADDVLVLSGSIPSSIDDGIYERIMERLDGKGIRIVVDATKDLLMNVLPYHPFMIK PNNHELGEMFGTVLKDEEEIIYYARQLQEKGASNVLVSMAGDGAVLIAEDGAVCSMGVPK GTVKNSVGAGDSMVAGFIAGYLENGNYEHALRLGSACGSASAFSEGLACREDILKLYEEL R >gi|229784064|gb|GG667671.1| GENE 30 30303 - 32258 2393 651 aa, chain + ## HITS:1 COG:L185031_3 KEGG:ns NR:ns ## COG: L185031_3 COG1299 # Protein_GI_number: 15672941 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Lactococcus lactis # 308 648 2 351 351 327 55.0 3e-89 MRITELLKKESIELGVKVSSKEEAIDTLIGLMAAGGRLNDRAGYKEGILAREALGSTAVG EGIAIPHAKVAAVKEPGLAAMVVPDGVDYEAFDGSLANLIFMIAAPEGGADVHLEALSRL STLLMDPDFKNDLIHAESKEEFLQLIDDKESERYEKKARKEETKASDAAPAVHETAKSGA AGYRVLAVTACPTGIAHTFMAAENLEQLGKKMGIPVKSETNGAEGAANVLTKEEIAAADG IIIAADKNVDMARFDGKHVVKASVSDGIQKGEELIKRAVSGEAPVYHHTGAASSEEGGEN EGLGHTIYKHLMNGVSHMLPFVIGGGLLIALAFLFDDYSINPANFGKNTPVAAYLKTIGE QAFGMMLPVLAGYIAMSIADRPGLAVGFVGGMVAKMGATFMNPAGGDVNSGFLGALLAGF IGGYIVVLLKKVFKKLPKSLEGIKPVLLYPLLGIFLVAVVTTFINPFVGAINDGLTHLLN 
GMGGTSKVILGAVVGGMMSVDMGGPVNKAAYVFGTAQLAEGNFDIMAAVMAGGMVPPIAI ALCTTFFKKKFTEKERQSGLVNYIMGLSFISEGAIPFAASDPLRVIPSCIIGSAVAGGLS MALNCTLRAPHGGIFVLPTIGNPFGYLAAIVIGSVVGCVILAALKKNKTEA >gi|229784064|gb|GG667671.1| GENE 31 32482 - 33624 838 380 aa, chain + ## HITS:1 COG:lin0658 KEGG:ns NR:ns ## COG: lin0658 COG0639 # Protein_GI_number: 16799733 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Listeria innocua # 17 236 4 210 235 59 27.0 9e-09 MLTIRRESLDIPQGRRILAVSDIHGHVHFLKKVLERAGFCEADELIIDGDIIEKGPYSLE TLRYVMELGKLPNVHVISGNVDCWQLELIGSREPEMTRALAEFIQTAKERWGGSLFLDFC DEMGIEHPRTEEETAAAKKRVQAYGRAELEFLSGLPSIVEAGNFIFVHGGLPNGWEEMEG QEATPVLKFDDFRNRGGSFDRCVVAGHWPVCLYRHEVAEFNPLIDREKNIISIDGGCGLK RDGQLNCLMIGNPRGTIEEVSWTSYGGFPEMTAVDGQKEQKTRISIQYTDNRVELLAGQS AGPGLVRLRHLTTDTVFAVPKDYLYERDGTLRCDDFCDYHMEVKPGERVELVEVNALGSL IKKNGVSSWYHGRLELAGIE >gi|229784064|gb|GG667671.1| GENE 32 33638 - 34519 926 293 aa, chain + ## HITS:1 COG:no KEGG:BAD_0201 NR:ns ## KEGG: BAD_0201 # Name: not_defined # Def: hypothetical protein # Organism: B.adolescentis # Pathway: not_defined # 20 272 7 259 263 287 52.0 4e-76 MSKKKEETGLPMIQKEAALLLEELSLCPETSAVAVGGSRATGRADEKSDYDIYVYVEKEI PEEKRRRILEKYCSTMEIGNHYWEAEDNCTLKDGVDIDLIYRNRKEFEREIAAVVEQFRA SCGYTTCMWHNLATCRIVYDRDGELEAMKKRFDVPCPEELRDEIIRKNRALLSGVLPSYD AQIRKAWARRDLVSVNHRTTEFLASYFDIIFAMNGRTHPGEKRLVTLCKESCENLPDRFE ENLERLFTVMYNPDGEVNDVIARMVAELDRALLKVRPGEFTAGQPPMGAAEKQ >gi|229784064|gb|GG667671.1| GENE 33 35556 - 35696 134 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNKNMKYWAAGIIAAVVIIAVAVFAAAGSRKKAPLSPAPAETESSS >gi|229784064|gb|GG667671.1| GENE 34 35832 - 36701 914 289 aa, chain + ## HITS:1 COG:no KEGG:Clocel_3251 NR:ns ## KEGG: Clocel_3251 # Name: not_defined # Def: adenylyltransferase # Organism: C.cellulovorans # Pathway: not_defined # 1 289 1 289 290 370 59.0 1e-101 MRTEQEMFSMILDAAKADGRIRAVYMNGSRVNPEAKHDIFQDYDIAYVVTETESFREDRT 
WIDRFGERLYMQYPEENREYPSDVANCYGWLIQFADGNRLDLHVQTLSWSLRAMEEDRMF RILYDKDGCLAKAPEPAAEFYRVKRPEESRFLSVCNEFWWCLNNVAKGLWRKEIPYAQDM VNFHVRPQLIRLLSWKAGIDTDFTCSPGKSGKYLYRYLSQEEWSRLMKTYAGGSADEMWE AADTMCVLFHETALAVAEQLGFQYDKREAEASLHFFRHVRGLPEDAKEL >gi|229784064|gb|GG667671.1| GENE 35 36802 - 37812 1039 336 aa, chain + ## HITS:1 COG:lin0303 KEGG:ns NR:ns ## COG: lin0303 COG0673 # Protein_GI_number: 16799380 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 1 316 1 315 323 335 54.0 7e-92 MKTYNWGIIGTGSIAREMAAALTACRGTVYGVCGTSREKAERFAAENGVLHAFADGAALL ADENVDIVYIATPHNMHFDYIRQAVLAGKHVLCEKAITVSSAELEEAAALAEEKGVVVME AMTIFHMPLYKKLKEIVDSGAIGPVKMISVNFGSCKPYDVTSRFFRKELAGGALLDIGTY AVSFARYFLDERPDTVLSAVKRFETGVDEESGIVLMGENGQMAVIALSMRAKQPKRGVVA GELGYIEVSEYPRADRAAIVYTADGRREEITCGDSKKALEYEVEDMEKAVSEHLVSESMA LSREVMWLLSSVQKRWEEQELRTEQGGQKDVPVLHI >gi|229784064|gb|GG667671.1| GENE 36 37790 - 38443 710 217 aa, chain + ## HITS:1 COG:TP0554 KEGG:ns NR:ns ## COG: TP0554 COG0546 # Protein_GI_number: 15639543 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Treponema pallidum # 1 212 4 216 222 142 34.0 5e-34 MYQCCIFDLDGTIINTLHSLRKTVNETIKQFGYGPMDEYHVKYFVGDGYKKLMERTLKYC GDTELTHYEAALEVYDEMFKKYCLYQVEPYAGMPEFLTMLKEKGIRIAVVTNKAHDRAVE CVETVYGPGFFDKITGEGNGIKCKPDPEGALKTAEEFHVKPSECLYFGDTNTDMRTGINA GMDTAGVLWGFRDRAELEAFKPKYIISHPDEIKSLFR >gi|229784064|gb|GG667671.1| GENE 37 39658 - 40689 1053 343 aa, chain + ## HITS:1 COG:CAC3063 KEGG:ns NR:ns ## COG: CAC3063 COG1316 # Protein_GI_number: 15896314 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 64 286 70 292 339 125 34.0 8e-29 MAKRWKWGFVILAAVWAALFGTMLGYAWNVTHQAEPMLKVPMEQVQNTNISEVAQETMDR FWTVAVFGVDSREGKLGKGTRSDMEMVLNINLKTGEIRMVSVYRDTYVKIDEKNKYDKIN QAYFEGGPAQAIWALSENMDLNIDDYVSFSWKAVADAINLLGGIDVEISPEEFKVINGFI TETVESTGVGSHHLKKAGVNHLDGVQAVAYARLRKMDTDFMRTKRQRLVAELVLEKLKEA 
DLGTVRQLAFSILPQVSSSVGMKDILSLAGTVSKFHLTETEGFPFELKDALVGKKDCVIP VTLRDNVIRLHQFLFDEEEYVPSKQVDEISRQIELRTAKSLAS >gi|229784064|gb|GG667671.1| GENE 38 41670 - 42269 704 199 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1586 NR:ns ## KEGG: EUBREC_1586 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 19 182 17 181 205 130 38.0 4e-29 MSTDAIMSYISECSRAGRLEFQVSIQCAPVLKGIKASNLVTMAKGSLQTVRNALAGTEIT AVMLAAGDRTEVLLLYRFPMLKAVLANKEIREFLKSCGYVQFDLASILMGLKRKYTAYLA GSGEFPHELGVLLEYPVEDVRDFIRYNGDKFLFTGYWKVYHNPAQARAKFTRYDRAREAA MQQIINGCRLHEVAVRKGE >gi|229784064|gb|GG667671.1| GENE 39 42273 - 42704 605 143 aa, chain + ## HITS:1 COG:CAC0587 KEGG:ns NR:ns ## COG: CAC0587 COG0716 # Protein_GI_number: 15893876 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Clostridium acetobutylicum # 1 139 1 139 142 97 41.0 1e-20 MSDIYVVYFSSTGNTEEMAKHVADGIERKGGSPVLTEIGSVDPDRLKEASAFALGCPACG AEELDESMETLIEEISGSLEGKHVGLFGSYGWGGGDWMREWEKRMAERGARIVGGEGVIA LESPDGDAEKQLKSLGEELAELG >gi|229784064|gb|GG667671.1| GENE 40 42795 - 43751 737 318 aa, chain - ## HITS:1 COG:DR2036 KEGG:ns NR:ns ## COG: DR2036 COG0366 # Protein_GI_number: 15807030 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Deinococcus radiodurans # 2 244 196 464 552 80 30.0 4e-15 MRYWLDQGCDGFRVDMAASLVKNDDEEKTGTCAIWSNVREMLDSDYPEAAMVSEWGNPQL ALKAGFHMDFFLNGRGNGYSTLMRDYENQEDHSFFKKDGHGDITRFLEDYLPRYEASRDN GYFCLLTCNHDTLRPRYNLDETELKLAFAFIFTMPGVPYLYYGDEIGMRYLELPTKEGGY FRTGSRTPMQWNHEKNAGFSEAEKDSLYLPVDEADDAPTVAAQESDPGSLLHTVKSLLTL RHANADLQADAAFEPVYAEKGAAAFVYRRGALTVAVNPSGGTVTVPSVGQKSVIYSIGNG RWEEDRIILEPQSFLVLE >gi|229784064|gb|GG667671.1| GENE 41 44849 - 45676 685 275 aa, chain + ## HITS:1 COG:BH1906 KEGG:ns NR:ns ## COG: BH1906 COG2207 # Protein_GI_number: 15614469 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 11 263 36 287 299 122 31.0 8e-28 
MPNSYRNLSLHWHEEMEITLIKSGTIYYDIDFETFSVEKGDLLLISPHILHSAHEMPGKS MVSDSLVFHLDFLGYQVPDACTIKFLNPVQSGKLRFLPVVKPGAAGYEEMKECFISLLEC FNGKQYGYELYTKELLMRLFRLLYQYGCVAKHEGTHGEFGAEEKLKEVLSYIQSNYRETL TIEELSSVCHFSQTHFMNFFKRYAGMTCMEYINHYRITRAAADLVETERQVMDIAMENGF RNISYFNKVFKERFGVTPGSYRKNSRENLSLPCKM >gi|229784064|gb|GG667671.1| GENE 42 45715 - 46762 1080 349 aa, chain + ## HITS:1 COG:lin0191 KEGG:ns NR:ns ## COG: lin0191 COG0803 # Protein_GI_number: 16799268 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Listeria innocua # 13 349 9 312 312 169 34.0 7e-42 MNKKKRIPRLCCIGTVILAMILTGCVRRQSPPAETKTEGGKLNVVTTLFPYYDFTRQIAG DHIELTMVVPAGMDSHSFEPTPADMITIQNADVLICNGGAMEHWLERVLSAVESPHMVIM TGMDYVDTVQEEIVEGMEEETHDHDHGHGHGAEPGHEDMYSHDLADDNGHGEEIEYDEHI WTSPVNAMKLVKGIEETLSAADPGSRAEYEANAESYLLKLKQLDQEFRDVTAHEKRNMII VGDKFPYRYLADEYGLRYRAAFSGCSTDTEPSARTIAYLIDKMKSEQIPVIYYPELTSHR VAEIIAEETGARPLLLHSCHNVTRREFDSGVTYLELMEQNVTNLKKGLE Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:27:17 2011 Seq name: gi|229784063|gb|GG667672.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld65, whole genome shotgun sequence Length of sequence - 18939 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 4, operones - 4 average op.length - 6.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 297 - 366 25.6 # Pseudo CGC 0 0 + Prom 296 - 355 74.1 1 1 Op 1 . + CDS 534 - 776 139 ## gi|266622401|ref|ZP_06115336.1| conserved hypothetical protein 2 1 Op 2 . + CDS 793 - 1188 226 ## CKR_0421 hypothetical protein + Term 1199 - 1231 1.1 3 2 Op 1 . + CDS 1261 - 2010 672 ## COG5484 Uncharacterized conserved protein 4 2 Op 2 . + CDS 2000 - 3439 1014 ## Clole_2765 phage terminase, large subunit, PBSX family 5 2 Op 3 . + CDS 3451 - 4872 1106 ## Closa_1391 phage portal protein, SPP1 family 6 2 Op 4 . 
+ CDS 4872 - 5882 689 ## COG5585 NAD+--asparagine ADP-ribosyltransferase 7 2 Op 5 . + CDS 5869 - 6054 207 ## gi|266622407|ref|ZP_06115342.1| chemotaxis protein 8 3 Op 1 . + CDS 6170 - 6763 768 ## Closa_1394 minor structural GP20 protein 9 3 Op 2 . + CDS 6784 - 7788 1069 ## Closa_1395 phage coat protein 10 3 Op 3 . + CDS 7802 - 8716 736 ## Sfum_4062 hypothetical protein 11 3 Op 4 . + CDS 8729 - 9052 299 ## gi|266622411|ref|ZP_06115346.1| putative glycosyl hydrolase 12 3 Op 5 . + CDS 9049 - 9324 280 ## gi|266622412|ref|ZP_06115347.1| putative pantothenate synthetase 13 3 Op 6 . + CDS 9324 - 9575 187 ## gi|266622413|ref|ZP_06115348.1| conserved hypothetical protein 14 3 Op 7 . + CDS 9577 - 9951 285 ## Closa_1396 hypothetical protein 15 3 Op 8 . + CDS 9923 - 10258 206 ## Closa_1397 hypothetical protein 16 3 Op 9 . + CDS 10255 - 10758 381 ## Closa_1398 hypothetical protein 17 3 Op 10 . + CDS 10751 - 11158 145 ## Closa_1399 hypothetical protein 18 3 Op 11 . + CDS 11162 - 12205 1049 ## Closa_1400 hypothetical protein 19 3 Op 12 . + CDS 12218 - 12691 537 ## Closa_1401 XkdM protein, phage-like element PBSX 20 3 Op 13 . + CDS 12705 - 13106 438 ## Cthe_2487 hypothetical protein + Prom 13184 - 13243 6.8 21 4 Op 1 . + CDS 13284 - 15266 1107 ## COG5281 Phage-related minor tail protein 22 4 Op 2 . + CDS 15266 - 15922 605 ## COG1652 Uncharacterized protein containing LysM domain 23 4 Op 3 . + CDS 15932 - 16885 586 ## Closa_1406 hypothetical protein 24 4 Op 4 . + CDS 16878 - 17153 224 ## Closa_1407 hypothetical protein 25 4 Op 5 . + CDS 17159 - 17542 350 ## Closa_1408 hypothetical protein 26 4 Op 6 . + CDS 17539 - 18564 854 ## COG3299 Uncharacterized homolog of phage Mu protein gp47 27 4 Op 7 . 
+ CDS 18551 - 18938 125 ## Closa_1410 hypothetical protein Predicted protein(s) >gi|229784063|gb|GG667672.1| GENE 1 534 - 776 139 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622401|ref|ZP_06115336.1| ## NR: gi|266622401|ref|ZP_06115336.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 80 1 80 80 112 100.0 9e-24 MKMQNNGNEVEKENLTQKKQGDDAMNSSVLSYTKLDLHSLGKKNVKIVSSEEALKDVIPM KWEEDTIQGRTKVTVTKRNS >gi|229784063|gb|GG667672.1| GENE 2 793 - 1188 226 131 aa, chain + ## HITS:1 COG:no KEGG:CKR_0421 NR:ns ## KEGG: CKR_0421 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 131 1 133 133 144 49.0 2e-33 MCKVGDIIVVRNYLSQGQTIKRHSFVVLSTEHGEIQGMDFDLVCNVMSSFRSEEQRKRKM GYPGNFEYPADAENIRNGHGRSGYIKAEQLYYFNREKTDFYVLGNVEPELFNVLIEFINQ LKDIEVITDNL >gi|229784063|gb|GG667672.1| GENE 3 1261 - 2010 672 249 aa, chain + ## HITS:1 COG:lin1733 KEGG:ns NR:ns ## COG: lin1733 COG5484 # Protein_GI_number: 16800801 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 192 1 234 294 102 35.0 7e-22 MARPRSPNRDKACELWLKSGKKRPLKDIAAELKVSEEQIRKWKNQDKWAKVTLPNAKSNV TNRKGGQPGNKNAVGHGGTGPPGNKNAVTTGEFETLFFDTLEDDERLLIGMIQPDKEKLL LQEIQLLTVRERRMLKRIEDLRDCDFTTVKKKKGTEKDKWTDLKEDQAVLGQIQSIEDAL TRVQGRKQRAIESLHKFGFDDARLEIELMKVELATLKIGGQEAEQEDDGFLAALNAEAES LWEAGVDDN >gi|229784063|gb|GG667672.1| GENE 4 2000 - 3439 1014 479 aa, chain + ## HITS:1 COG:no KEGG:Clole_2765 NR:ns ## KEGG: Clole_2765 # Name: not_defined # Def: phage terminase, large subunit, PBSX family # Organism: C.lentocellum # Pathway: not_defined # 26 479 5 429 434 631 66.0 1e-179 MTIKEHIASMKEKLARLKEKRGILTKVQTFKFQPFSQRQKQILTWWLPDSPVKDYDGIIA DGAIRSGKTICMSLSFVFWAMSAYNGQNFAMCGKTIGSFRRNVLFWLKLMLRSRGYRVSD HRADNLVEISRGQITNYFYIFGGKDERSQDLIQGITLAGLFCDEVALMPESFVNQATGRC SVTGSKYWFNCNPDGPYHWFKVNWIDKAIGYLGKEKVAKIREDAAKTGADPALKKLLYVH 
FTMDDNLSLSEEIKARYRSMYTGVFFERYIMGLWAMAEGIIYDMFDLARNATDTEALAVS YKTKTGHDFWTDERYVSCDYGTQNPTAFLLWNKAADKKWYCRREYYYSGRDKGRQKTDKE FSDDLTAWLDGIAIKAVILDPAAASFKAQLEKDGYKVKKAKNDVLDGIRFVATLLLQGSI FIDSSCDNLIKEFASYIWDAKAGERGEDKPVKEHDHALDALRYFCMTIIRNRSGMKIMK >gi|229784063|gb|GG667672.1| GENE 5 3451 - 4872 1106 473 aa, chain + ## HITS:1 COG:no KEGG:Closa_1391 NR:ns ## KEGG: Closa_1391 # Name: not_defined # Def: phage portal protein, SPP1 family # Organism: C.saccharolyticum # Pathway: not_defined # 1 470 1 466 468 642 67.0 0 MELEVVKKLIKKYTAGHGAFLAEADTADRYYRNQTDILLEPPKKRETEQGENPLRNADNR IPLNFHGLLVNQKASYMFTAPPLFDLGDKASNKALTAFLGDKYAKTCKDLCVEASNASVA WLHLWKDKASGQYKYAIVPSGQVIPVWSNNLERELKGALRCYHDITDEGKELDVYEYWND TTCQAYAIEAGGVIDTGLMPYNSFTLIDTAGNSNLVNEFKHDVGEVPFFPFFNNNIDTGD LDNIKPLIDVYCKVFSGFVNDLEDIQEVIFVLTNYGGDDLGQFLRELKDYKAIQIENEGG EDKSGVSTLTIELPVEARKELLATTRKCIFEQGQGIDPDPQNFGNSSGVALGFLYSLLEL KSGLMETEFKLGFGRFIRCICRLLNIKIKDDTIVQTWTRTSVKNDLELSQIAQQSKGVIS DETIVSKHPWVEDPEKEMDILNKQKEAEAEAQREISDMFPKKKEDPDEGEGDE >gi|229784063|gb|GG667672.1| GENE 6 4872 - 5882 689 336 aa, chain + ## HITS:1 COG:BH3531 KEGG:ns NR:ns ## COG: BH3531 COG5585 # Protein_GI_number: 15616093 # Func_class: T Signal transduction mechanisms # Function: NAD+--asparagine ADP-ribosyltransferase # Organism: Bacillus halodurans # 3 292 9 294 490 102 25.0 1e-21 MGYWEKRQEAMYKAGEMQINQYYTQLEKAFNQTRRELQKTIEAFYFEYAEENGLSYAAAQ RQLSKAEIGNLRDFIDLAMENIGKHNQTVNNMSIKARITRYQALEAQVDAMLRQLYAVDY QAAAEQTMKAVYEDTYYHTWYNIDQFHGFHAAFAQVDPHSVEKLLEYPFNGANFSSRLWK QKDHLQAQLMESLTTMMVQGKSPQALTNDFAKKLNVKKFDAYRLLHTESSFLMSEATHAG YKEDGVEKYQILATLDSKTCDICGDKDGEVYEVGKEITGENMPPFHCFCRCTDVPYYVDD DRSGEMRVGRDLETGENVEVPTGMTYKEWRKQYVRD >gi|229784063|gb|GG667672.1| GENE 7 5869 - 6054 207 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622407|ref|ZP_06115342.1| ## NR: gi|266622407|ref|ZP_06115342.1| chemotaxis protein [Clostridium hathewayi DSM 13479] chemotaxis protein [Clostridium hathewayi DSM 13479] # 1 61 1 61 61 104 100.0 2e-21 
MYEIRLTGGEEIKMHCEKSADEMLEWINMICEEGFEFMSCDQLDGKTVMIRVSEIRTITK L >gi|229784063|gb|GG667672.1| GENE 8 6170 - 6763 768 197 aa, chain + ## HITS:1 COG:no KEGG:Closa_1394 NR:ns ## KEGG: Closa_1394 # Name: not_defined # Def: minor structural GP20 protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 196 1 196 197 245 78.0 6e-64 MKKEDLVAKGLSEEHAQIAVDAWNEAVKGFVPKDRFDEVSGKLKEANSTIETLKKDNSDN EELQKQVNEYKEKVTALEIASANTVKEYALKDKLKEAGVVDADYIIYKQGGLDKFTFDKD GKPVGIDDIVKPLKESSPHLFKTEPGADYRPAGGGTPPAKNPFAKDSFNLTEQGKLLKEN PAQAQVLAAAAGVTINV >gi|229784063|gb|GG667672.1| GENE 9 6784 - 7788 1069 334 aa, chain + ## HITS:1 COG:no KEGG:Closa_1395 NR:ns ## KEGG: Closa_1395 # Name: not_defined # Def: phage coat protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 334 1 332 332 580 89.0 1e-164 MPVTKLSDVIVPELFTPYVVNRTMELSALFQSGIITNNAEFDRLASEAAPIHQMPFFEDL SGDSEDIIEDQDLTAKKITSNKDVSTTVRRANMWAATDLSAALAGSDPMAAIGDLVAGYW AREYQKILIQVLSGVFGSYQTATEPAETKTPLADHILDISTAGSAAAQKISASAFIDALQ LLGDAQGQLTAVAMHSATKAFLKKNNLIDTERDSTDVEFDTYQGRRVIVDDGCPVADGVY TTYLFGQGAIAFGNGSPVGFVATEVDRDKKKGSGVDYLINRKTFIMHARGIKWTDLAREH VETPTKAELMNAINYERVYEPKQIRIVAFKHKIG >gi|229784063|gb|GG667672.1| GENE 10 7802 - 8716 736 304 aa, chain + ## HITS:1 COG:no KEGG:Sfum_4062 NR:ns ## KEGG: Sfum_4062 # Name: not_defined # Def: hypothetical protein # Organism: S.fumaroxidans # Pathway: not_defined # 57 295 301 528 711 100 32.0 9e-20 MAVKTVQAVINGVTTTLTYNSTSKKYEATITAPATSSYNNNDGHYFPVTIKATDEAGNVT TKNDTDATLGSSLQLRVKEKTAPTITITYPTASALIINNKPAIRWKVTDNDSGVNPDTIG ITIDSGSKITGSAITKTAITGGYDCTYTPTTALADGSHTIKIDASDYDGNAAAQKSVTFK IDTVPPTLSVTAPVNGLITNKAACTVAGTTNDITSSPVTVTVKLNSGSAEAVTVGADGSF SKALTLVAGSNTITVVATDSAGKSTTVVRTVTLDTVAPTIRAVTLTPNPVDAGKTYVISV EVTD >gi|229784063|gb|GG667672.1| GENE 11 8729 - 9052 299 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622411|ref|ZP_06115346.1| ## NR: gi|266622411|ref|ZP_06115346.1| putative glycosyl hydrolase [Clostridium hathewayi DSM 13479] putative glycosyl hydrolase [Clostridium hathewayi DSM 13479] # 
1 107 1 107 107 206 100.0 4e-52 MAVARVFGLVDGIEVILQKVDEDRWSVPVPFDADGEYVVEVVAEDEAGNQTYLSKMLYTV DAGNICIHALPLPKYTFELLQAPYQMEPQFTEYLFTRLIPECQEVAL >gi|229784063|gb|GG667672.1| GENE 12 9049 - 9324 280 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622412|ref|ZP_06115347.1| ## NR: gi|266622412|ref|ZP_06115347.1| putative pantothenate synthetase [Clostridium hathewayi DSM 13479] putative pantothenate synthetase [Clostridium hathewayi DSM 13479] # 1 91 1 91 91 160 100.0 2e-38 MIRFILGEDRHVKYFVHSVKSEYFVVKDATYELIYNGEVEASGGCEVTQEEDGSFVDVKI QPTYRSNLYILEITLMIADEVIKNREQMEVV >gi|229784063|gb|GG667672.1| GENE 13 9324 - 9575 187 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622413|ref|ZP_06115348.1| ## NR: gi|266622413|ref|ZP_06115348.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 83 1 83 83 129 100.0 5e-29 MAVRIEAVELSRNPVSVKERLIVSVSVVTHGYLGRSTNAELTSYTNGQLRLRGESVPTYA QLQEYRHSALHSMTHQQIESMEV >gi|229784063|gb|GG667672.1| GENE 14 9577 - 9951 285 124 aa, chain + ## HITS:1 COG:no KEGG:Closa_1396 NR:ns ## KEGG: Closa_1396 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 116 1 116 117 130 53.0 2e-29 MNSNEMLQTVKQNLRLGTEDHDLIISDLILTVCDYCNLDPDCVPDILEPFVRKKARGIIE YEASEGSGYNPEIASIKEGDGSITWAQTEGNTKASIYGLSESDKAGLRRHRRLRGYAKPV CKNV >gi|229784063|gb|GG667672.1| GENE 15 9923 - 10258 206 111 aa, chain + ## HITS:1 COG:no KEGG:Closa_1397 NR:ns ## KEGG: Closa_1397 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 108 1 109 112 113 48.0 3e-24 MRNPYARMYDAKMDVYRWTDVEIDGITKQVRAAVATGRPCRYSSSGQVSTGAPNPAIVNS HKLFCSLEEDIREGDQLLITLRTGKNIEADLGECHPYSYQWQCEIKRDDNV >gi|229784063|gb|GG667672.1| GENE 16 10255 - 10758 381 167 aa, chain + ## HITS:1 COG:no KEGG:Closa_1398 NR:ns ## KEGG: Closa_1398 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 167 1 167 168 187 
59.0 2e-46 MSSSNYRRNKAAIDQFRKELMAMVEDIQQIDKKVLNKAVNAGAAYAKRRTPVGDHPNPVT FIVNNGPGVRKVVSFKVKNPGVGGFLRKSWHKLPTKKTKAGVETELVNTAEYSTYWNYGH RIVTKKGGPTKGFVKGTFVLEKTRGYIEKQLVKEFEKEVKAVQSKHD >gi|229784063|gb|GG667672.1| GENE 17 10751 - 11158 145 135 aa, chain + ## HITS:1 COG:no KEGG:Closa_1399 NR:ns ## KEGG: Closa_1399 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 135 1 139 139 91 37.0 9e-18 MIEKLYKNIAAGLKAVRPCKVYVEDVPQNFAQPSFLVTFYEQEPSKGINGRLKNSVRVDV SYFPATDREPYEECWLVGQDLSREFIVADFKIRNRNLKIVDSVLHFLFDVDYREYQENNS TAMQTISQNTDIKEE >gi|229784063|gb|GG667672.1| GENE 18 11162 - 12205 1049 347 aa, chain + ## HITS:1 COG:no KEGG:Closa_1400 NR:ns ## KEGG: Closa_1400 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 347 1 346 346 434 72.0 1e-120 MAGTWESQNKVLPGAYINIRTNEPLSITPGDRGIVVILQEMSVGTDGAVYTITATEAAWP DGATAADKKLAAEALKKAKTVLVYKLKASHKAADVTSALTALKTVQFNTLCYPYDGEGEE VNKTAIATWIKAMRDDEGVKCQAVLANHVADSEGIINVVQGIIMQGNQELTAAEVTAWVA GATAGASITTSNTGMVYAGAIDVKPRMTKSEMEAAVTAGKFIFKVDTAQNVSVVYDINSL TAVTVDKGKMFTKNRVIRTVDNIANDITKIFEANYVGKVNNNDEGRSLLKASLVDYFATL QTMGAVQNFETDDITVTKGNDSDAVVIEAAVQPVDSVEKIYITVNLS >gi|229784063|gb|GG667672.1| GENE 19 12218 - 12691 537 157 aa, chain + ## HITS:1 COG:no KEGG:Closa_1401 NR:ns ## KEGG: Closa_1401 # Name: not_defined # Def: XkdM protein, phage-like element PBSX # Organism: C.saccharolyticum # Pathway: not_defined # 1 157 1 157 157 235 79.0 5e-61 MGNYTKLTDLVTGSEGSAFITIDGQNRYFFEISKVDASIEFKVIAKRLLGHRMTQHKVVG AEGKGTLTMYNVSPAALAVYQQYIKEGKVPQISIQTTNEDPASTVGRRVVVMRNCILAKA PVAYLDDSSEDLNTVDTDFTFDDVDELESYAFPENMR >gi|229784063|gb|GG667672.1| GENE 20 12705 - 13106 438 133 aa, chain + ## HITS:1 COG:no KEGG:Cthe_2487 NR:ns ## KEGG: Cthe_2487 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 1 133 3 136 137 162 67.0 5e-39 MGSLNAFLHPEQTENREVFVSNRFKENGKPVPFVIRPVTQQENEGLIKKYTKRDKKGNES 
FDRVAYNQELTAAAVVEPELENTELQNAYGVLGASKLLTKMLYVGEYGALLEAVYDISGL DKDINEDIDEAKN >gi|229784063|gb|GG667672.1| GENE 21 13284 - 15266 1107 660 aa, chain + ## HITS:1 COG:RSc0873 KEGG:ns NR:ns ## COG: RSc0873 COG5281 # Protein_GI_number: 17545592 # Func_class: S Function unknown # Function: Phage-related minor tail protein # Organism: Ralstonia solanacearum # 16 258 23 266 1366 113 29.0 1e-24 MPTLSAMFRLMDGYSTTLNKFIGKVDAASTKTLGASKNTDKLNDSMDRTGRVAEAASSGV AKFVGTVASLAAVKKVMDFTDTYTNTNARLAMITKSLEEQKALQEDIFAAANRARGQYDD MANAVAKMKMLAGDAFGSNQEAIGFTELLQKSLKVSGAGTSEQQSAFLQLTQAMASGKLQ GDEFRSIMENAPMVANAIAKYLDVSKGELKELSSDGAITAEIIKNAMFDSADDINEKFKT MPQTFGDVWNRIKNAGTQAFGGVFQKINNILNSDAGQRSINNLIGAIYLAGEAMEGFIDF CVTAWPMVSPFIWAAAAAVGAYALALGVSNGLALISAIRTGAQAVGVGLYALALWATTGA TWGAVTAELGLTEAQLGANAAMYACPIVWIVGLILVLIAVFYAAVAAVNHFSGSTISATG IIGAVIGGWAAAVINYFIMMYNMIAEVVNFFANVWNDPIAAVKILFYDLASTVIGYIAEM ARSIETIINRIPGVNVQISAGLDNFKAGLEKAATEAKDAAGWEEVVKKKDFLNGADFANK GYDLGAGLADRASNLFSGFAPKDPKGLDYSQFATAGNPATVKGTGKGGAVKVENEEDIEW MRRLAERDYIARIAQNTLAPNIKVEFNGPITKEADTDNIMSHVSEQLKEMIATAPEGVPA >gi|229784063|gb|GG667672.1| GENE 22 15266 - 15922 605 218 aa, chain + ## HITS:1 COG:lin1283 KEGG:ns NR:ns ## COG: lin1283 COG1652 # Protein_GI_number: 16800351 # Func_class: S Function unknown # Function: Uncharacterized protein containing LysM domain # Organism: Listeria innocua # 3 217 7 227 227 85 29.0 9e-17 MSYSVYFKYGSKKYKLPVNPEEIKRSRELNIETYQVLEEGQVSIPSYCALEEFSFEAEFP GHNVSYMESGTEADADYYEKMFRKAQKNKKPIRFIASNDISDDISVKVLVKSVEVVEKAG EEGDKYISLTLMEYKGAGKRYVAVQTPDATVKQEETPLAENPAVTANKTHTVQSGDTLWG IAKKYYGNGSQYPKIMSANPAIKNANLIYPGQVFTIPA >gi|229784063|gb|GG667672.1| GENE 23 15932 - 16885 586 317 aa, chain + ## HITS:1 COG:no KEGG:Closa_1406 NR:ns ## KEGG: Closa_1406 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 317 1 319 321 517 79.0 1e-145 MEVLVETGGYIYDISGMCTEISWSDVLNDGSGSMDISYINDDLILQNGDVVRLTDNDQAD GIFFGTVFKVSGDESGIIKVKVYDQLRYAKAKEIIVLENGTLKNLVQNMCTFLSLNPGTL 
EEPGYVLPTIADSDKTWLDEVTQAISDTLIATQEMYCVRDEYGSVCLWNMRNLQLPLVLG DRSLCTGYSWEKSIDDDFYNRIKVGWKDEESKKMDVGAAADQESINRYGLLQYFETSASG IDNAAKAQERANSLLKLYNHEKETLKLECLGDLRIRAGNSIYGSIEDIELNRRLIVKKVT HDFLPVHTMKVEVMTGE >gi|229784063|gb|GG667672.1| GENE 24 16878 - 17153 224 91 aa, chain + ## HITS:1 COG:no KEGG:Closa_1407 NR:ns ## KEGG: Closa_1407 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 89 1 90 94 108 64.0 1e-22 MNDRNGANELFNVIKTIVNNYLNNRKVAAVVIGEYKGNAVMVGNLPIPMSMITGNMVSKI AAGDKVRLLRNDGGREYYILEIIGKPYQTGG >gi|229784063|gb|GG667672.1| GENE 25 17159 - 17542 350 127 aa, chain + ## HITS:1 COG:no KEGG:Closa_1408 NR:ns ## KEGG: Closa_1408 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 127 3 128 128 164 68.0 1e-39 MKLTTDLTLREPIYEGKTYKVLPGKIEGYVNDLESLKQAIYKVLATEQFEYPIYSFSYGI AWKELIGEEQPYVRAEMKRMIQEALLRDDRIREVDGFSFSFTGDTCQCSFNVFSIYGDIE IEMEVPV >gi|229784063|gb|GG667672.1| GENE 26 17539 - 18564 854 341 aa, chain + ## HITS:1 COG:lin1287 KEGG:ns NR:ns ## COG: lin1287 COG3299 # Protein_GI_number: 16800355 # Func_class: S Function unknown # Function: Uncharacterized homolog of phage Mu protein gp47 # Organism: Listeria innocua # 1 332 7 358 361 197 35.0 3e-50 MTYEELLQAMLDRVPSNVDKREGSIIYDALAPCAYFLAQQNFQLENYLDLVFPDTAVGEY LDRAVAAFGVTRKPASAAVRKMITTGAVPIGSRWGINSLAYVVTRELASGTEYEAECETS GDIGNQYSGSVQPISNITGVTAELTDIITAGADEETDGALRERFLQKVRLPATSGNAYHY KLWALDVPGVGDARVFPLDSGPGTVTVLIVDSDKNIDTSLERAVSAYMETIRPIGASVTI DSPSSMAVNVVADVTLDGSKAKDDVLQAFRTSLITYLKGLVFTDYRVSYARIGSLLLATE GVQDYDNLTLNGSMANVIITDKAIPVIGTVDFSEVRMYGAN >gi|229784063|gb|GG667672.1| GENE 27 18551 - 18938 125 129 aa, chain + ## HITS:1 COG:no KEGG:Closa_1410 NR:ns ## KEGG: Closa_1410 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 111 1 111 177 112 52.0 5e-24 MELIKLLPDYYSENETMKTLQSILSEQTDGLDMEMYKTIDNCFVGSASDALTRYEHLLGL VPDAAKSDRYRRERIKAKISGAGTTTTSLIQNIAESFTNAAVNIVENSDPSVPTGYERLM 
DSAFLLLGT Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:28:57 2011 Seq name: gi|229784062|gb|GG667673.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld66, whole genome shotgun sequence Length of sequence - 24476 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 7, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 112 - 2718 2340 ## COG0474 Cation transport ATPase + Term 2752 - 2796 10.3 + Prom 2835 - 2894 4.4 2 2 Op 1 49/0.000 + CDS 3037 - 3993 274 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 3 2 Op 2 5/0.000 + CDS 4006 - 4869 1016 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 4 2 Op 3 5/0.000 + CDS 4915 - 6597 1984 ## COG0747 ABC-type dipeptide transport system, periplasmic component 5 2 Op 4 44/0.000 + CDS 6632 - 7609 1128 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 6 2 Op 5 . + CDS 7599 - 8585 304 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 + Term 8697 - 8726 -0.9 7 3 Op 1 3/0.000 + CDS 8922 - 10418 1670 ## COG0554 Glycerol kinase + Term 10508 - 10565 13.0 + Prom 10426 - 10485 4.6 8 3 Op 2 6/0.000 + CDS 10635 - 12071 1328 ## COG0579 Predicted dehydrogenase 9 3 Op 3 4/0.000 + CDS 12085 - 13350 1503 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 10 3 Op 4 . + CDS 13352 - 13714 455 ## COG3862 Uncharacterized protein with conserved CXXC pairs 11 3 Op 5 . 
+ CDS 13748 - 14377 462 ## Closa_0515 hypothetical protein + Term 14552 - 14612 1.1 + Prom 14420 - 14479 9.2 12 4 Op 1 35/0.000 + CDS 14692 - 16044 1589 ## COG1653 ABC-type sugar transport system, periplasmic component 13 4 Op 2 38/0.000 + CDS 16067 - 16975 871 ## COG1175 ABC-type sugar transport systems, permease components 14 4 Op 3 1/0.000 + CDS 17002 - 17868 1111 ## COG0395 ABC-type sugar transport system, permease component 15 4 Op 4 . + CDS 17886 - 18614 784 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Term 18763 - 18796 -0.3 + Prom 18703 - 18762 5.6 16 5 Op 1 . + CDS 18857 - 19753 981 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) + Term 19784 - 19840 3.6 + Prom 19813 - 19872 2.6 17 5 Op 2 . + CDS 19892 - 21460 1820 ## COG0475 Kef-type K+ transport systems, membrane components + Term 21536 - 21605 24.7 18 6 Tu 1 . - CDS 21623 - 23878 1436 ## Hoch_1572 hypothetical protein - Prom 24054 - 24113 4.7 - Term 24036 - 24095 15.6 19 7 Tu 1 . 
- CDS 24166 - 24474 346 ## COG1592 Rubrerythrin Predicted protein(s) >gi|229784062|gb|GG667673.1| GENE 1 112 - 2718 2340 868 aa, chain + ## HITS:1 COG:FN1022 KEGG:ns NR:ns ## COG: FN1022 COG0474 # Protein_GI_number: 19704357 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 5 864 4 860 862 818 51.0 0 MEEWFHLSEEEALGRLEAARSGLSSREAAVRKEEFGPNALRAGEKRSALQVFLDQFKDLL VIILIIAAVISMISGDVESTAVIFAVLLLNAVLGTVEHQKAEKSLDSLKSLSAPMAGVLR DGRRQEILSSDVVPGDILLLEAGDMVAADGRLLETHSLLVNESSLTGESINVEKKTGKIR AEKVPLAEQHNMVFSGSLVAAGRGTVLVTGTGMNTEIGRIAALMNSTGEKKTPLQISLDQ FGSHLAAGILIICALVFGLSIYRRMPVLDSLMFAVALAVAAIPEALSSIVTIVQAMGTQK MAREHAIIKDLKAVESLGCVSVICSDKTGTLTQNRMTVEEVYVNNRLYPPENLNLENLVQ RYLLYDAVLTNDASCTDGKILGDPTEAALVVMGRKLGISESELRHSMPRIGELPFDSDRK LMTTVHRVDGVRIMFTKGALDILLPRCESILTGEGVRPMRENDRQRINEQNLGFSRQGLR VLTFVYRKMGEKEILTEKSESGYIFLGMTAMMDPPRPESKNAVLNARRAGIRPVMITGDH KITAAAIAEKIGILNEGGIAVSGPELDDMADEELDRNLDRISVYARVSPEHKIRIVSAWQ KKGHIVAMTGDGVNDAPALKKADIGVAMGKMGTEVSKDAASMILTDDNFATIIKAVANGR NVYRNIRNAIKFLLSGNMAGILSVLYTSLMALPTPFAPVHLLFINLVTDSLPAIAIGMEK AEPGLLTKKPRNPKEGILTKRFILQILLQGALIAVCTMTAYHTGLATGSEAAASTMAFST LTLARLFHGFNCRSSHSVFKIGLMSNLYSIMAFEAGVVLLAGVLFVPGLQTLFSVADLSV RQLLTIVIFAVVPTLVIQAFKTVRESMR >gi|229784062|gb|GG667673.1| GENE 2 3037 - 3993 274 318 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 6 313 5 310 320 110 26 1e-23 MKWKYIAKRLLIAIPTFLGITVLAFFILNLAPGSPLDALLADPRISPAELERKRVALGLD KPVIIQYFSWLNLLLHGDLGFSYSTQRPVAAMIGERLPATACLAGASILLSLIVAIPLGI YAASKPNSKRDYVSGGISLLMMATPNFFVGLVFIYLFAIVFKILPSGGMYDSAGVKSGAA LMRHMIMPCLVLSFQQIGGWVRHMRSSMLEVMQEDYIRTARSKGLKKWGVIYKHGLKNAL IPVITVVGMSIPSLVGGAVITEQVFGWPGVGSLMVTAINGRDYPVIMGITVMIAAAVLVA NILTDVAYGLLDPRISYK >gi|229784062|gb|GG667673.1| GENE 3 4006 - 4869 1016 287 aa, chain + ## HITS:1 COG:BS_appC KEGG:ns NR:ns ## COG: BS_appC COG1173 # Protein_GI_number: 16078205 # Func_class: E Amino acid transport 
and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 1 284 12 300 303 230 43.0 3e-60 MKKRENESYLSEVLSKLTEHKMAMVGLFIITLEIILVAVLPFIMHLNPYDSDYTAFSAAP SSLHILGTDAIGRDIFARLVYGGRTSLLVGLLSTIISCAIGVPLGLIAGYFRGKAEIAIM RVADIFMSFPSIVLILVLVAVIGPSIWSVTIVIGVLGWTQFARLIYANVLSVSEKEYVES ARAIGTSNYKIITRYILPNSFAPILIAITFQMASAILMESSLSFLGMGVQPPGASWGNML YDAQSITVLSKRLWIWMPPGIALLITVLSINFLGDGIRDALDPKIKI >gi|229784062|gb|GG667673.1| GENE 4 4915 - 6597 1984 560 aa, chain + ## HITS:1 COG:CAC3179 KEGG:ns NR:ns ## COG: CAC3179 COG0747 # Protein_GI_number: 15896427 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Clostridium acetobutylicum # 24 547 19 552 567 150 24.0 5e-36 MRKSKSKMTRLTAIVMAGAMVLGLSACGGKDAGSTTTAAGDAGTTAQAAGETTAAQGSAG AAAGEKVITAAVSTAWDTMMPMNTTSNYTRMICDQIYDRLTQSKADGTYEGRLAKSWTVN DDSNAVTFELYDNAVWSDGEPVTAADVVFSYQMYSDPSVDAKSRYHLQYIAGVDDSGAEL SEDSIEVTANGDHEVTFNLKSSMFVDTFLQDIDTVFIIPKHIFEGKTAQEINSPDLWAKP VGSGPFIYDSEINGERMEFVKNPNYHLGAPKIDRLVIRVTDSASMLAGLINGDLDLIGYG SILIDDWDLANEQENLVAESIPTTSYTTLIFNTQKPYLTQEVRQALSMAINRQVLVDALL QGQGQQIITPIAPMSPYYNKDVTVWYDPDKAKTMLEEANFPFDQTLTFYIAAGNSITERT AALIVQDLQKVGVKVQIEQVDFPTLMSNMLDGKHDMGTIGSGGTLDPSESREMIHPDSSV NFCQLRDTEMTDLIDKGNAELTFDARKPYFDEYQVMVKERSAMAYLYTKNTLTVHNKRVT GIDAENFDSLNWSTWNWDVQ >gi|229784062|gb|GG667673.1| GENE 5 6632 - 7609 1128 325 aa, chain + ## HITS:1 COG:TM0501 KEGG:ns NR:ns ## COG: TM0501 COG0444 # Protein_GI_number: 15643267 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Thermotoga maritima # 7 319 1 315 327 378 60.0 1e-105 MDSLLTVDGLSVQFNTKKGINTAVDGISFSVGKGEILGIVGESGCGKSVTSLSILRLLGT NARISQGSVKLEGRELLSLSEDEMCKIRGNEIAMIFQDPMTALNPTLTIGDQLIEPLVIH 
QGFNKKDARKEAVEVLKKVGISAPEKRLKEYPHQLSGGMRQRVMIAMAVSCAPKLLIADE PTTALDVTIQAQILELMLDLRQKMDTAVILITHDMGVVAETADNILVLYAGKVVEYGSVK EIFNTPKHPYTKGLLSSIPPLEEDVEELNTIEGTVPGPGQMPAGCRFSPRCPYADERCKK EQPGIYEAGGAKVSCFRYEGDNYEK >gi|229784062|gb|GG667673.1| GENE 6 7599 - 8585 304 328 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 291 1 293 563 121 29 4e-27 MRSENQVLLETKDLRKYFSGKKGLFNLNPAVVKAVDDINLTIHKGETLGLVGESGCGKST LGRTILKLIPMTSGQVIYNGEDIAAYDKKQMWEMRKRMQIIFQDPYSSLNPRMTVYDLVS APLEVYGIGSAAERKEMVISILNDVGLDKQYLNRFPHEFSGGQRQRIGIARALILNPEFV VCDEAVSALDVSVRAQVLNLMKKMQQKRNLTYLFISHDLSVVRHVSDRVAVMYLGSVVEI ADKQDLYGAPLHPYTRALLSAIPIPDANRKRNRIILEGDVPSAYNPPSGCKFHTRCPYAT DRCREEVPALRDMGDGHGVACHFPLNQD >gi|229784062|gb|GG667673.1| GENE 7 8922 - 10418 1670 498 aa, chain + ## HITS:1 COG:TM1430 KEGG:ns NR:ns ## COG: TM1430 COG0554 # Protein_GI_number: 15644181 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Thermotoga maritima # 3 480 2 479 482 670 65.0 0 MGKYVIALDQGTTSSRCILFDEQGTICSVAQKEFTQIFPQPGWVEHNPMEIWSSQLSVTM EAMGKIGAHYSDIAAIGITNQRETTIVWDKETGEPVYNAIVWQCRRTADRIEELKKEELA DRIREKTGLIPDAYFSGSKIEWILNHVEGARERAERGELLFGTVDTWLIWNLTKGCIHVT DYTNASRTMLFDIHRRCWDDEILEYFGIPKCMLPDVKPSSCVYGYTSSDVMGGKIPIAGA AGDQQAALFGQCCFDEGEVKNTYGTGCFLLMNTGEKAVLSRNGLLTTIAASEGNEVRYAL EGSVFVAGAAVQWLRDEMRMVRSAPQTEEYCAAVPDTGGVYVVPAFAGLGAPYWDQYARG TIVGVTRGTSKEQFIRATVESMAYQVYDLIEAMQQDSGICLRQLRVDGGACANNFLMQFQ ADLLNTAIIRPDCIETTALGAAYLAGLAVGYWKDKEEIRKNWKVSRAFESSMEEDQRSRL VKGWKRAVKCALCWAEEE >gi|229784062|gb|GG667673.1| GENE 8 10635 - 12071 1328 478 aa, chain + ## HITS:1 COG:CAC1322 KEGG:ns NR:ns ## COG: CAC1322 COG0579 # Protein_GI_number: 15894602 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Clostridium acetobutylicum # 6 477 2 473 475 473 52.0 1e-133 MESLNYDAVIIGGGVTGCAVARELSRYELSVCVLEKEEDVCSGTSKANSAIAHAGHDATP GSLKAKFNVQGSQMMEPLSKELDFDYVRNGSLVLCFSEDDLPALEELLEKGKRNGVQGLE 
IISGDEVRKMEPNVTDTVVAALHAPTGGIVCPFGLTIALAENAVDNGVEFKFLTEVNEIK KDGEGYILETGKGAITAKYIINAAGVYADRFHNMVSSSKIHITPRKGDYCLLDKEAGGHV SHTIFQLPGKLGKGVLVTPTVHGNLLTGPTAVDIEDKEGNATTAEELAFVTEKSSIGVKN VPFRQVITSFSGLRAHEDGDDFIIGEAEDAPGFFDAAGIESPGLTSAPAIGVYLAELVAG KAGAAQKKDWKRERRGIVRPEKMSVEERAELIRQRPEYGTIICRCEGVSEGEIIDSIRRT LGAVSLDGVKRRVRQGMGRCQAGFCTPRTMEILSRELGITMEEVCKNAPGSNMLTGQK >gi|229784062|gb|GG667673.1| GENE 9 12085 - 13350 1503 421 aa, chain + ## HITS:1 COG:CAC1323 KEGG:ns NR:ns ## COG: CAC1323 COG0446 # Protein_GI_number: 15894603 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Clostridium acetobutylicum # 22 402 21 399 417 445 58.0 1e-125 MTQHDIVIIGGGPAGLAAAVAAKRAGAEDILILERDSCLGGILNQCIHNGFGLHTFKEEL TGPEYAARYIDMVEEEKIPYKLNTMVIDINQEKEITVINREDGLQVIRAKAIILAMGCRE RPRGALNIPGYRPAGIYSAGTAQRLMNIEGFSVGREVVILGSGDIGLIMARRMTLEGAKV KVVAELMPYSGGLKRNIVQCLDDYGIPLKLSHTVIDIQGKERVTGITLARVDENRKPIPG TEEHYSCDTLLLSVGLIPENELTKGIGASMNRVTQGPNVNDQLETDTSGVFACGNVLHVH DLVDYVSEEATLAGANAAAYVGAESEKEAGRIVALKAENGVRYTVPQAIDIDHMKDKVTV RFRVADVYKDRFISVYYDGVRVSKKKKKVLAPGEMEQVVLKKESFSSCPDLKNIVICTEV E >gi|229784062|gb|GG667673.1| GENE 10 13352 - 13714 455 120 aa, chain + ## HITS:1 COG:CAC1324 KEGG:ns NR:ns ## COG: CAC1324 COG3862 # Protein_GI_number: 15894604 # Func_class: S Function unknown # Function: Uncharacterized protein with conserved CXXC pairs # Organism: Clostridium acetobutylicum # 5 117 4 114 117 96 52.0 9e-21 MEVRELICIGCPLGCPLTVRLEGGEVEVSGNTCARGADYAKKEVLSPTRIVTSSVHVSGG TIEMVSVKTEHDIPKGKIFECMDEIRKVTIPAPVKIGDVVIPDCAGTGVSIVATKNVELA >gi|229784062|gb|GG667673.1| GENE 11 13748 - 14377 462 209 aa, chain + ## HITS:1 COG:no KEGG:Closa_0515 NR:ns ## KEGG: Closa_0515 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 8 209 5 213 260 98 28.0 2e-19 MRWNINRFFNPENRFWAFMERVTDICVLSFFWFLFSIPAVTVGAATTALFQYTLRLTADE EGYVWKTFREGFKKNFVQATILWIGAVLAGAFLVFDLWCCQFLAPLGAIRFIAFFLLTGA 
IFVYLLTIIYLFPLLAFFHVGTLKVIRDSFVMSVGNLHVSITILVIYGITAVITYFFPML FMFWFALGSYVASFFYRSIFYKYMKDEEE >gi|229784062|gb|GG667673.1| GENE 12 14692 - 16044 1589 450 aa, chain + ## HITS:1 COG:CAC0429 KEGG:ns NR:ns ## COG: CAC0429 COG1653 # Protein_GI_number: 15893720 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 1 399 1 388 447 122 25.0 1e-27 MKKRGIALLMVAAMMAGTLSGCSGGGSTGKAGGGAQTEAQTVAEVDPSLYEVTEPITIKW WHALEEQYSETVQKVVDDFNKSQDMITVEAEYIGSYKKLNEALVAAHAAGTDLPAITVAN TPYVAEYGAGGLTEDLTPYIQASGYDLKDFGDGMISASSYNGKQVSLPFLISTQIVYYNK DMADELGVEIPEKWEDMDAFLEKTSKINADGTTAVYGTIMPGWDQWYYETFYLNQGVKII NDDQVTTDLGDEKAVEIAAKIKEWCDKGYTYWTGTADDASSIMRQHFIDGEAFSVVHTSS LYTNYLSQCNFEIGMAWLPGGETKNQEIGGCVLLIPSKNDQATKNAAWQFMQYLCSKEVN LTWARGTGYIPTRKSVLTTDEGKAFLEEKPAFQCIFDNLDLIQPRIQHKAWSQMANIWKN DMAQLMLEGGDTQKSMEKMAEEIDDVLGDS >gi|229784062|gb|GG667673.1| GENE 13 16067 - 16975 871 302 aa, chain + ## HITS:1 COG:AGpA78 KEGG:ns NR:ns ## COG: AGpA78 COG1175 # Protein_GI_number: 16119287 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 21 302 45 323 323 177 33.0 2e-44 MKEKKRMNPRKVSAMKDFACVLPVLIFLAVFTYYPIAELFRISFTDWNMLNDTWHYVGFK NWKWLFGGSGTKYLWNSLKVTILYSAGEIFITLVGGMIFALIFNRMTKGFSAMRAIVFMP KYVAMSSAAVVFLWILNTDGGILNYLLSLVGISKVDWLGNRHTALISVLMLTGWRCVGYG MMVYLSAMMGISTDYYEAASLDGANAVQRFFKITIPMLSPTTLFLFVTTFLSSMKVFQSV DILTQGGPYRSTEVFVYNIYRYAMEDFRMDRASTVAIFFFLLLLVVTVATMKVSNGNVTY DS >gi|229784062|gb|GG667673.1| GENE 14 17002 - 17868 1111 288 aa, chain + ## HITS:1 COG:lin1843 KEGG:ns NR:ns ## COG: lin1843 COG0395 # Protein_GI_number: 16800910 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 7 288 1 276 276 182 37.0 9e-46 MKDGRDINRSARRKKATITAVQWVLAIIVLIIIVFPIYWMLVSSVKSQEEILLPVPTLWP 
KEFHFENYINVLRKGNFARYYFNTAVMTAGILIGQVVTGIFAGYGFSKGRFKGQNFCFII VLGALMIPLQVTFIPIYIMMANWGMSDTFLGLILPEAVSPYYIFMLRQTFMAIDDSYIEA ARIDGLGRVGIITKILAPMCRSTLFTVTLVTFTNGWNAYFWPKIIAKGEKRRVLTVGLAY LKNTFAGAESMNNHEIMAGAVMAVVPVIILFFIFQKYMLTGYSKAAMK >gi|229784062|gb|GG667673.1| GENE 15 17886 - 18614 784 242 aa, chain + ## HITS:1 COG:BS_yqiK KEGG:ns NR:ns ## COG: BS_yqiK COG0584 # Protein_GI_number: 16079474 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Bacillus subtilis # 2 235 3 237 239 196 43.0 4e-50 MKIYAHRGYSGRYPENTMLAFQKAAETGCDGMELDVQLTKDGTVVVIHDEAVDRTTDGTG LVKDYTFEELRKLDAGTIKGGRYGFCPIPSFEEYCEWAGQYGFVTNVEIKTGVYYYEDIE EKTLELIRKYHLEDRVMFSSFNHLSLERMKELAPDIPRGALLEHAGLGNAGYYCSRFGYD FYHPGIKGLTENTVKNCRDHGIGVNVWTINDMESLENTYEWGCSGIFTNYPEICRSWVEK KG >gi|229784062|gb|GG667673.1| GENE 16 18857 - 19753 981 298 aa, chain + ## HITS:1 COG:Cj0021c KEGG:ns NR:ns ## COG: Cj0021c COG0179 # Protein_GI_number: 15791420 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Campylobacter jejuni # 1 296 1 287 292 265 45.0 8e-71 MKLLTYEVDRRKDIGVMSRDEMWVYPLRAFGMEYTSMMEVVKGFSQSEKDLLEHVSGLEP YGVTGAAMRSEVKILAPIEAPEQDIICLGLNYMEHAEESARFKKEAFDGVRKEAVYFSKR VNRTVDPDGEISAHSDIVDSLDYEAELGVIIGKEAYRVPKEQVKDHIFGYTIINDVSARN VQNAHKQWYFGKSLDGFTPMGPCILTADSVEYPPKFPIRSLVNGEVRQESNTGQMIFDID HIVSELSMGMTLKAGTIISTGTPSGVGMGFTPPKFLKAGDVVECVIEGIGSIKNTVSE >gi|229784062|gb|GG667673.1| GENE 17 19892 - 21460 1820 522 aa, chain + ## HITS:1 COG:CAC0444 KEGG:ns NR:ns ## COG: CAC0444 COG0475 # Protein_GI_number: 15893735 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Clostridium acetobutylicum # 13 377 3 370 393 69 23.0 1e-11 MQGYLAGVSEGTRVLLILSVILFAGFIMTRLTNTLNLPKVSGYIMAGILIGPCGLNLIPV DIIGHMGFVSDLALAFIAFGVGKFFKKEVLLKTGSRIIIITLFEALLAGVLVTLFCVGVF 
RMEWNFALILGAIATATAPASTMMTINQYKAKGEFVNTLLQIVALDDVVCLLAFSIVAGI AGRTGEEEVTMSTLLMPVVYNILALGLGFFCGYFLSRLLIPARSKDNRLILAIAMLLGIS GICASLDISPLLSCMIFGAAYINLTSDKKLYRQINNFTPPVMSIFFIVSGMNLDLTALTT VGAVGLAYFIVRIIGKYLGTYISCLITGTSREIRNYMGLALIPQAGVAIGLAFLGQRLLP EEMGKLLLTIILSSSVLYEMVGPVCAKMSLFLSGSITTEKMKEVAKEREAEIHEENLIAR ENEEHGEDDGAWENEAPCESDGSCGEEENPAAEAEESADLICDNNSGKNHRGGKGRKNEL LEQIKELEAGVEDQEEDYDIEREIQPEKEKNARKGKARKKKK >gi|229784062|gb|GG667673.1| GENE 18 21623 - 23878 1436 751 aa, chain - ## HITS:1 COG:no KEGG:Hoch_1572 NR:ns ## KEGG: Hoch_1572 # Name: not_defined # Def: hypothetical protein # Organism: H.ochraceum # Pathway: not_defined # 82 154 65 136 206 68 47.0 1e-09 MQKQNNKRNPKSPGIHAFFGKKRYILILLPLFLLLLFLFLLLLYCTAENHKGPGRPDDFD SYSNSRVPPDNTRGSLPLPDYTTDTKSTVFEDRDGKLDFLDRYLDMPSEIIDTAYHIIYQ DNSKGRVPGPSDYNIRAAFTVAAGDLPLWTKDMKKILPEQVNPDWWEDLKTSDFTWDIPS DAEYYKRPGSESYIVVCPGLNLILKMVSTQTLPVPLENSAFQKELPEELPGYDQFKFLAA DALGYDHSAVPYIRAALIEKTYTADGAELTLVCYRAMFCNSPLYGIPVLVIIGEDNTSCT VLCGGSYDDEWYLADIDGDGRDELLMQHLTGITGGAGAYESAIYRLTDGGPSKLFGSPDP DGNGYFDTYFSLHLSEGYTHTVQNGSAGFYTVFTRKGPEGNPYFDAEGNLTDEGRENNET DWLNTDPCFYLFKPVDTDGDGIYEIMTAQYTYLYGRADGLGTAYTLLKWDNSMERMYIQK AGYWPYEDHEDDSQDYYQRWEEYENTWYQKEEPSPGDSLNSKELIKAASDQIRAIDFHVR QYPIDTSLFDLEMDRKYREAFYQAVTNRKPVLCRAYRSLDYEENTYQYILRSDSLSDQEF LRQELKQDSGYYYVDYDGDGLPELIIRNTGLYGLKYDPKQDNVYLFLSTYSSYSYFMGAG QIFWHNPCLASKDIYSYESEDRNGNRIYAGFETVYHWDEDSSSESYYISTNDYRDVEVSR NLWEELISLLFDAYEETPAPESYEEFFKGCY >gi|229784062|gb|GG667673.1| GENE 19 24166 - 24474 346 102 aa, chain - ## HITS:1 COG:FN0455 KEGG:ns NR:ns ## COG: FN0455 COG1592 # Protein_GI_number: 19703790 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Fusobacterium nucleatum # 1 102 78 179 179 105 51.0 2e-23 AAEGENYEWTDMYKRMAEEAREEGFEELAIKFEYVGKVEAAHEKRYLKLLDSLKNDKTFK GDAPLGWKCRNCGYVHEGPEAPEICPTCAHPKAYFERKVENY Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:29:19 2011 Seq name: gi|229784061|gb|GG667674.1| Clostridium hathewayi DSM 13479 genomic 
scaffold Scfld67, whole genome shotgun sequence Length of sequence - 23456 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 12, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 - CDS 3 - 768 688 ## COG1175 ABC-type sugar transport systems, permease components 2 1 Op 2 . - CDS 781 - 2121 1415 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 2333 - 2392 4.2 + Prom 2160 - 2219 4.5 3 2 Op 1 7/0.000 + CDS 2240 - 4078 1465 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 4 2 Op 2 . + CDS 4062 - 4850 1015 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 4869 - 4928 3.1 5 2 Op 3 . + CDS 4962 - 6368 1561 ## COG0534 Na+-driven multidrug efflux pump + Term 6454 - 6512 19.1 + Prom 6492 - 6551 4.7 6 3 Tu 1 . + CDS 6583 - 7245 608 ## COG4186 Predicted phosphoesterase or phosphohydrolase + Term 7303 - 7334 -0.2 - Term 7261 - 7323 -0.1 7 4 Tu 1 . - CDS 7536 - 8333 568 ## Cphy_2216 hypothetical protein - Prom 8398 - 8457 4.6 + Prom 8446 - 8505 3.7 8 5 Tu 1 . + CDS 8551 - 8829 346 ## Closa_0373 sporulation transcriptional regulator SpoIIID + Term 8862 - 8925 20.1 - Term 8774 - 8800 -1.0 9 6 Tu 1 . - CDS 8834 - 9286 232 ## COG2337 Growth inhibitor - Prom 9431 - 9490 2.8 + Prom 9390 - 9449 3.5 10 7 Tu 1 . + CDS 9532 - 9963 402 ## Closa_0380 hypothetical protein + Term 10003 - 10052 9.8 + Prom 10101 - 10160 7.5 11 8 Op 1 . + CDS 10209 - 10862 833 ## COG1045 Serine acetyltransferase 12 8 Op 2 . + CDS 10890 - 11813 876 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 13 8 Op 3 . + CDS 11891 - 13501 1760 ## COG1388 FOG: LysM repeat + Term 13561 - 13604 2.0 + Prom 13802 - 13861 6.1 14 9 Tu 1 . + CDS 14078 - 14347 330 ## COG2827 Predicted endonuclease containing a URI domain - Term 14177 - 14217 2.2 15 10 Tu 1 . 
- CDS 14332 - 14979 660 ## COG1272 Predicted membrane protein, hemolysin III homolog - Prom 15049 - 15108 9.4 + Prom 15003 - 15062 8.5 16 11 Op 1 . + CDS 15195 - 15929 859 ## COG0778 Nitroreductase 17 11 Op 2 . + CDS 15941 - 16648 194 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family 18 11 Op 3 . + CDS 16702 - 18303 1608 ## COG2978 Putative p-aminobenzoyl-glutamate transporter + Term 18308 - 18362 9.2 + Prom 18411 - 18470 5.7 19 12 Op 1 . + CDS 18497 - 19549 748 ## COG1609 Transcriptional regulators 20 12 Op 2 3/0.000 + CDS 19573 - 20637 1240 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 21 12 Op 3 16/0.000 + CDS 20637 - 21428 952 ## COG1082 Sugar phosphate isomerases/epimerases 22 12 Op 4 4/0.000 + CDS 21458 - 22510 630 ## COG0673 Predicted dehydrogenases and related proteins 23 12 Op 5 . + CDS 22507 - 23455 614 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases) Predicted protein(s) >gi|229784061|gb|GG667674.1| GENE 1 3 - 768 688 255 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 5 252 6 252 292 141 35.0 1e-33 MKTKKLYPTWFTLGALIFYLGLFFLPGIMGIGYSFTDWNAYSSEVNFVGWKNYLKVFEGG TNYGMFIKNTFLFTAVSSITKTTLGIALAILLTSRMVKMKSMHRMLVFLPQVMSFLIIGL VFKNILHPSRGFINITLKAIGLDFLAQNWLGDLDFAFKSIFAVDAWKGTGYVMMVVIAGL NSISATYYEAAELDGAGFWSKFRYVTLPLLKNVIISVTVLNITYGFRVFDLIYSLTNGGP GNSTGVINTAVYSEF >gi|229784061|gb|GG667674.1| GENE 2 781 - 2121 1415 446 aa, chain - ## HITS:1 COG:BH2226 KEGG:ns NR:ns ## COG: BH2226 COG1653 # Protein_GI_number: 15614789 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 66 436 55 424 424 77 25.0 4e-14 MKRKIALLLTMALGVSCLAGCNAGTAEKSGGQSSAAAGESTAAGKKENVTISFGTHQSGI 
PSCGVLQDLAGRFEEETGIKIDFQVSPDAQWRDLLKVKLDSGEAPDIFCVDADPLSLYSR VRPDETCVDLSGEEWVDRMADYAKECVSYDDKIYAIRFAEPKLNFYNYNKKIFADLGLEI PVTYEEFKTVCQKIQDSGTIPIYEATQSAWHQCVPLYGSGPYFEKLEPGLYDALNNNEKD IKDVAMMKTILDQMKEFADLGYYGEDYLSNTVDAAKDEIGTAKAAMYLAAPGFALQVEEA YPEMKDQIGIFLMPFGDNDVLANNASSAAYFINKKSENVEYALEFFRFLARPENLQYRQD NNPEEMGLCWPEITPSYPDEFNKYIDSLERGTIMQVGVKYIDPQFMDTGKDIEAMYTGAM TSEQVMESISRRRDEQAELQKDPAWN >gi|229784061|gb|GG667674.1| GENE 3 2240 - 4078 1465 612 aa, chain + ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 17 597 15 550 563 111 22.0 3e-24 MKRKAPSLQKSLPRYYFLVIFVIFTAMLTAFYIYYTRNLRASEYLNLQNVCDSAYSNMEQ EIEKMSTVSLNTIYSKDILPGIRKIPVDQLGTQQTYTQVQAIYNSISAMIGPQQTVAQIN LYHKNGFSAGTGFYTYNKKIDLKEKSWYNDVLTENGYKVMSTPQPISHFIPTAAIPYEPY YISLTRLYFDDVHEMEGAVEVLQKSDTFFNYIDRLTAANPALSLLIMNRNGELVYPVNGR GEELIHEITALAHKQDIPTGRTVHVRDESGRKNALYHQTVKSAGWQIYVFEPDSVITKSL AAPTFLFLAVLAAGIALTMAVCFSISRRILQPVGELTETFETLVLDDVFSHTIDRVSLPR SHYEEIQTLITSFRDMYNKMNASIMEMLKSQENEIRAKEIATQSMMKPHFLYNNLANISV MAEENMNQEIITLTENLCDYLRYTSTGSEIRVPVSDEFLYTQKYLACMNVRYRGRLTCDF TLDEQMGEIKIPKLSLQPIIENSLKYAFMKKPPWHISVRGYLDRDAGKWFITITDNGVGF KEEELQELAHTLESIRDSKDITRLKIGGMGLQNVYLRLLLQYGPSASLTAENHPEGGACI TMGGSITNDRNS >gi|229784061|gb|GG667674.1| GENE 4 4062 - 4850 1015 262 aa, chain + ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 8 253 2 253 257 118 30.0 1e-26 MTETADRKQYDVVVAEDELLILENLVKKINAMPLPFCVKGAFEDGDSALEFIKTNHVDLV ITDIKMPFMNGLELSRVLHSSFPSIKILIISGYNEFTLAQEAIEYQVKGFLLKPIKQEQL EQALTKLLIELDSDNDTLQASVPEHLKTMGKEEIAGCVEKYLKENFQKQISMGDISDKLG FAPDYLSHLFKKYHGESPVRYLTGLRINYAKQLLIRQPDLEVAVIGEMSGYPDPVYFSKV 
FKKNTGMWPSQYRAEGGCQRPE >gi|229784061|gb|GG667674.1| GENE 5 4962 - 6368 1561 468 aa, chain + ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 33 461 93 527 547 187 31.0 6e-47 MTGSMARKEPPAKGRGCASETVGQKQYLFTNSDLKRLIAPLIVEQILAVTVGMVDTMMVS SAGEAATSGVSLVDMINNLLINVFAAVSTGGAVVSSQFLGQKNRKRACEAADQLMMITGL ISIVIMAGAIAFRHGLLRLLYGGIADDVMKNALIYLVLSALSYPFLAIYNSCAALFRSMG NSRISMQASIVMNIINVAGDILFIFGFHWGVAGAALASLISRITACVILVIRLHNTGLEI HVSFGRLHWNSGMIGKILHIGIPGGIENSIFQLGRVLVVSIIALFGTTQIAANAVANNLD GMGVLPGQAMNLAMITVIGRCVGAGDFEQAEYYTKKMMKLTYLISGLCCLGVIITMPLTM KLYGLSEETLKLAAVLVLIHDGCAILLWPASFTLTNVLRAANDVKFPMCISILSMVIMRL GVGYLLAVGLGLGAVGVWCAMILDWVVRVICFVGRYRSGKWKNFYHGV >gi|229784061|gb|GG667674.1| GENE 6 6583 - 7245 608 220 aa, chain + ## HITS:1 COG:SMb20398 KEGG:ns NR:ns ## COG: SMb20398 COG4186 # Protein_GI_number: 16264132 # Func_class: R General function prediction only # Function: Predicted phosphoesterase or phosphohydrolase # Organism: Sinorhizobium meliloti # 1 142 1 131 164 69 35.0 4e-12 MRYYISDLHFYHENLNYQMDHRGFSNAAEMNAHMIRQWNSKIRDRDEVVVLGDLSLGSGE ETNRILSQLKGRISLIEGNHDSYLTDRQFDRSRLEWVKPYAELRDNKRKVILSHYPMICY NGQYRRDEEGRPRSYMLYGHVHDTFDEALVNRYQEETRRAERPACHGGPPLSVPCQMINC FCMFSDYIPLTLDEWIETDGKRRSGRTDEISFRTEGTEKK >gi|229784061|gb|GG667674.1| GENE 7 7536 - 8333 568 265 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2216 NR:ns ## KEGG: Cphy_2216 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 9 244 7 248 283 125 30.0 2e-27 MNQPYNIFIEGIPGSGKSTLLDALARHFTNYRVFREGDISPVELAWCAYMTETEYRQALS DLADLRDEIEKSTVKEGTHFITPYTKIKTVHYDFYEYMEKFEVYGGRKKPSEFRDIILRR FRAFHSGGNLFECSFFQNILEELMLYAMFSDQQIMDFYQELTALLDLSNFKLIRLVSPDI RQCIQTIKEERVNPEGEEVWYQLMIGYLSRSPYGQAHQIEDFDGMISHFRRRIRLESEVM TLLPAGCCCDIESKHYDLTDLLRIF >gi|229784061|gb|GG667674.1| GENE 8 8551 - 8829 346 92 aa, chain + ## HITS:1 COG:no KEGG:Closa_0373 NR:ns ## 
KEGG: Closa_0373 # Name: not_defined # Def: sporulation transcriptional regulator SpoIIID # Organism: C.saccharolyticum # Pathway: not_defined # 1 90 1 90 90 149 96.0 4e-35 MKEYIEERAVAIANYIIDYHATVRQTAKKFGISKSTVHKDVTDRLEHINPSLAARARVIL DINKSERHIRGGLATKEKYQHRLAMCSKESDQ >gi|229784061|gb|GG667674.1| GENE 9 8834 - 9286 232 150 aa, chain - ## HITS:1 COG:CAC0494 KEGG:ns NR:ns ## COG: CAC0494 COG2337 # Protein_GI_number: 15893785 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Clostridium acetobutylicum # 13 124 5 116 122 144 62.0 4e-35 MNHLSDKYKEKPVLRGDLYYADLSPVIGSEQGGLRPVLIVQNNTGNRHSPTVIVAAITSR TTKAKLPTHVPVTREESGLKCDSTILLEQVRTIDRCRLKEYIGQLSPGQMELADTALLTS FGLKRETAVRDRTPVTVRTAVPSRMYSMQS >gi|229784061|gb|GG667674.1| GENE 10 9532 - 9963 402 143 aa, chain + ## HITS:1 COG:no KEGG:Closa_0380 NR:ns ## KEGG: Closa_0380 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 143 1 143 143 238 78.0 8e-62 MKSLLNRKWMLFMCLCMWVLVVRTLSGCSMSKDDGNKVRDMEFSVTAEADIPEELKQIIA EKQKTPFKLTYSDDQNLYIVVGYGKQDTGGYSIAVNELYLTENSIVLDTDLLGPEKGESG GTEPSFPFIVIRTELSELPVVFQ >gi|229784061|gb|GG667674.1| GENE 11 10209 - 10862 833 217 aa, chain + ## HITS:1 COG:CAC0687 KEGG:ns NR:ns ## COG: CAC0687 COG1045 # Protein_GI_number: 15893975 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Clostridium acetobutylicum # 2 190 1 184 186 227 58.0 1e-59 MILKDIWLDVKAVQERDPAARNALEVLLLYQGVHALIWHRFAHWFYKHKMFFAARLISQI ARFFTLIEIHPGAQLGHGILIDHGCGVVIGETTVVGDNCTIYQGVTLGGVGLNKGKRHPT LGSNVTVGAGAKILGSFEVGDNCTIAANAVLLKPLQDNVTAVGVPARAVKIDGVPIPKKE KNLVTMDHYCKMEERIRQMEETIASLQEELADARRNK >gi|229784061|gb|GG667674.1| GENE 12 10890 - 11813 876 307 aa, chain + ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle 
control # Organism: Fusobacterium nucleatum # 9 284 20 274 277 214 38.0 2e-55 MKEPKTRCQQIERSIIKNYRKEIWRPFVDALNEYKMIEEGDNIAVCISGGKDSMLLAKCM QEILRHGKMKFGLHFLVMDPGYHPANRKLIEENAALMEIPIEIFESDIFDVVVDVDESPC YLCARMRRGFLYAHAKELGCNKIALGHHFDDVIETVLMSMLYGGQVNTMMPKLHSTNFEG MELLRPLYFVKERAIIDWKEENGLQFLQCACRFTEAIAKERAMRAESAADGMAGGDIVHS SKRQEMKELIEEFRKINPNIENNIFKSVSNVNLDACIGYVKNGKRHHFLEEYESDGDLAV LCHTEQE >gi|229784061|gb|GG667674.1| GENE 13 11891 - 13501 1760 536 aa, chain + ## HITS:1 COG:CAC2903 KEGG:ns NR:ns ## COG: CAC2903 COG1388 # Protein_GI_number: 15896156 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: FOG: LysM repeat # Organism: Clostridium acetobutylicum # 3 531 4 514 520 147 22.0 8e-35 MLELVKKNIHMNRWKGNATSQITLDDDFIVPDTMDDVDQVILDSGEIVIESVKNQGERVL VRGKMDFNILYRKSEGGLQALGGSISFEEPINVPGLEEKDYVQLSWDLEDLNAGMINSRK LSIKAIVTLEVKVETLYDVEAAVELDTGTAGTGTPATMGGGSDLNGTPQVEVLRRNVDVA AIAVRRKDTYRIKENLQLSGNKPNIDHILWSEMKLRGITTRPLDGKMVLDGEVMVFVIYQ GEGEDAPVQWLEESIPFSGELDLSEASEEMVPSVSVHLIHREIEAKPDYDGEMRELDVDA VLELDMKLYREESMELLSDLYSTNRELSLETGEACFDKVLTKNMSKCKIAEKLNMDHADR ILQICHSEGMVKIDETEVKDDGLHVEGVLEVSLLYLTADDTQPIQSSVEVIPFHYLIEAP GINEKTICQLVPGLEQMSAVMMGGGSVEVKATIALDLLALQPVCEQVIKNVSEAPMDLKK LQQMPGIVGYIVQPGDSLWKIAKKFHTTVETIMTTNGLTDSLIKPGDRLLLVKEMS >gi|229784061|gb|GG667674.1| GENE 14 14078 - 14347 330 89 aa, chain + ## HITS:1 COG:BS_yazA KEGG:ns NR:ns ## COG: BS_yazA COG2827 # Protein_GI_number: 16077103 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease containing a URI domain # Organism: Bacillus subtilis # 2 73 6 77 99 77 55.0 6e-15 MNYTYLLKCADDTLYCGWTNQLDKRLKAHNEGKGAKYTRGRRPVSLAYYEIFETKEEAMQ REAAIKRLSRKDKLDLVAAFTSDASYATN >gi|229784061|gb|GG667674.1| GENE 15 14332 - 14979 660 215 aa, chain - ## HITS:1 COG:CAC0882 KEGG:ns NR:ns ## COG: CAC0882 COG1272 # Protein_GI_number: 15894169 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Clostridium acetobutylicum # 4 214 4 
213 214 177 49.0 2e-44 MKFKIKDPGSAITHFIGMLMALFAATPLLLKAAADDSLHLAALSIFIISMILLYAASTIY HTLDISPNVNRILKKIDHMMIFILIAGTYTPICLIVLGDRTGWGLLALVWGIAIAGIIIK ACWIYCPKWFSSVLYIAMGWICVLAFTKITSALPRAAFGWLLAGGIIYTVGGIIYALKLP IFNSKHRNFGSHEIFHLFVMGGSLCHYIMMYQFVA >gi|229784061|gb|GG667674.1| GENE 16 15195 - 15929 859 244 aa, chain + ## HITS:1 COG:BS_ywcG KEGG:ns NR:ns ## COG: BS_ywcG COG0778 # Protein_GI_number: 16080862 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Bacillus subtilis # 1 222 1 232 249 115 31.0 5e-26 MNQTIQELYERKSVRAYEERPIEPEKKEMVIQAAIQAPSAGNMTLYSIIDVTDQKLKETL AVTCDNQPFIAKAPMVLVFCADYQKWYDLFLQREEEVRKPGMGDFMLASCDALIAAQNAV TAAQALGIGSCYIGDIMERYETHRDLLHLPKYVVPTAMVVFGYPTKQQEEREKPVRLPAQ AVVSENTYRRQEASEYAEYLSDRQGKTEEEFDRWLHAFCKRKWNSDFSAEMSRSVEVMMR QWAE >gi|229784061|gb|GG667674.1| GENE 17 15941 - 16648 194 235 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 3 215 83 278 287 79 27 2e-14 MNILYEDDQIIVVEKPAGIESQSSRTFEPDMVSEVKKHINSLSPKKGDPYVGVIHRLDKP VGGILVFGKTKEAAGALSKQISEHQTEKRYLAVVCGKPVENFGTYVDYLWKDGKNNISKI VDKGITDAKRAELSYRVLAELEGEGGPLCLMEIDLKTGRHHQIRVQLSGHGTPLLGDLKY NPDGPKVTGQRNVALCAWKLSFTHPGNKKRMKFTVKPQGAAFYQFDFDYLADTHT >gi|229784061|gb|GG667674.1| GENE 18 16702 - 18303 1608 533 aa, chain + ## HITS:1 COG:FN0470 KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 7 525 2 502 512 477 50.0 1e-134 MKQNEGKKERKSLIFRFLNTVEQVGNRLPHPITMFAALALFIVLLSGILAAAGISAEGEV IDSKTMEVTTQTVSAVSLLDRDGLVYMLTHMVSNFTEFAPLGVVLVTMLGVGCAEGSGYL SALLKKTVSVTPAKLVTPMLVFLGVMSNVASDVGYVVLIPLGAMVFLACGRHPIAGIAAA FAGVSGGFSANLLIGTLDPMLAGISTEAAQLVSPGYIVEPSANWYFMIVSTFLIVALGSW VTDRIVEPRLTGGIKAKPGEGGQASAAGEPDDRASVSLTDRERKALCLANTSFLLLAAAI TLLAWPQDSFFRNPETKSLISSSPFMTGLIVLIALLFFVPSVVYGRVSGAYKGEKDVCAQ 
LGSNMEAMGGYIALSFAAAQFISYFNYTKLGTILALKGAAVLGRIGTGGPVLMVLFILLA SLINLFMGSASAKWTILAPVFVPMFMLLGYSPELVQVAYRIGDSCTNLITPLMAYFAMVV VFAKKYDDNSGIGTLVSTMLPYCICFLIGWSLLLILWMAAGLPLGPGAGLMIS >gi|229784061|gb|GG667674.1| GENE 19 18497 - 19549 748 350 aa, chain + ## HITS:1 COG:BH2219 KEGG:ns NR:ns ## COG: BH2219 COG1609 # Protein_GI_number: 15614782 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 6 330 2 324 335 109 25.0 6e-24 MDEIQKANMMQIAEHCGVSLKTVSRVLNHPEQVSPKTREAVRRSMEQCGFQVNLLAKGLK QNRTNIIIVFLDKHNGDYMNLWRSRMLKYLFRYSSGIGLKIIVSPSDSHGFLEDETDGFY LLSSGIADGAILLEYMAEDQRVGYLKRTGTPYVVLGQPEEQDIPAVSLDNFDVGLKGGRY LKEKGFKKICYLTNDVGFYSTKLRVGGFLEAVPEGRVIYGVRDAKTAYERTRELLQAGEA DCIFTNGDDRFLGIYRAAAECGYRIPQDLGVLSADDLAVNEEAYPSLSSLGQDFEGLAEN CIAVLQSLLQKRQSHESVAHNQIFLPSTIIERESTGITGTEEKTDKTIKE >gi|229784061|gb|GG667674.1| GENE 20 19573 - 20637 1240 354 aa, chain + ## HITS:1 COG:lin2969 KEGG:ns NR:ns ## COG: lin2969 COG1063 # Protein_GI_number: 16802027 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Listeria innocua # 1 348 1 348 350 491 65.0 1e-138 MKKLVATAPRTAELKEYEDRPIQSSEVRVKVEFAAPKHGTELADFRGTTPFIDGKFDNDW KVFVERDADEPRGIEFGDLPIGNMFVGTIIEAGADVTEYAVGDRVCSYGPIRETEIVNAV DNYKLRKMSKEASAKNAVCYDPAQFALGGIRDANVRSGDNVVVIGLGAIGQLAVQMAKKS GASIVIAVDPIEKRREIAGKYGADYCLDPTACDVGYEIKRITGKMGADAILETSGSVQAL QSALKGLAYGGTIAYLAFAKPFPAGLWLGQEAHYNYGKIVFSRACSEPNPDYPRWNRKRI EDVVWDMLMSGYLDCSEIIDPVVKFEDSAEGICTYMDKEQEKSIKMGVIFGGEE >gi|229784061|gb|GG667674.1| GENE 21 20637 - 21428 952 263 aa, chain + ## HITS:1 COG:lin2968 KEGG:ns NR:ns ## COG: lin2968 COG1082 # Protein_GI_number: 16802026 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Listeria innocua # 1 260 1 260 264 341 60.0 6e-94 MLLGTQDRDFFPQDLTEKFKFVKSIGFECFEIDGKVLMEKPDEIKRAIRESGLPVSSACG 
GYRGWIGDFIEEKRLNGIADLKVIIRNLKEVGGTGVVVPAAWGMFTYRLPPMVSPRSREG DRKAILDSLSQLEETAKECGIYIYLEPLNRYQDHMLNTTADAVSIIDEGKFEMVKITGDF YHMSIEEDDISETLEKYRDYIGHIHIAENHRYQPGTGSIDFKRHMETLKRIGYDGAVVNE GRIRGEDPLTVYVDSIAYMKQFM >gi|229784061|gb|GG667674.1| GENE 22 21458 - 22510 630 350 aa, chain + ## HITS:1 COG:BH2165 KEGG:ns NR:ns ## COG: BH2165 COG0673 # Protein_GI_number: 15614728 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 4 343 6 346 348 338 46.0 9e-93 MSILRVAVIGAGQVARTSHINHYKSLPDVEICAVCDVNQNAAKMAADEFHIPSWYTDAET MLKEIRPDAVSICVPNKFHCDMTCLCLEYGCHVLCEKPPAVTLKEAERMRDTARRCSRIL TFGFHFRHAETIAFLKKKVEAGEMGTLYAGDVTWLRRRGIPGWGSFTNLEIQGGGPLIDI GAHMLDLAVYLLDYPEVSYVCASSSDRIGKQGGTGFLGSWDGARFGVEDSLFGFIHFSNG GSLTVRTSFTINIAEKEERNVRLYGDKRGISVFPFEIYGDEDGHQVNSVYPFEETKDWHY DCIRNFVESCMGRADILVTPDQAVYVQKLITGLYQSAGTGRPVIYGNGNL >gi|229784061|gb|GG667674.1| GENE 23 22507 - 23455 614 316 aa, chain + ## HITS:1 COG:ycjT KEGG:ns NR:ns ## COG: ycjT COG1554 # Protein_GI_number: 16129277 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Escherichia coli K12 # 19 314 9 299 755 121 27.0 2e-27 MIYYGTEEKEKEKGFRFYEPERNPDYNKKTEAVFAQCNGFLGVRASFETKQLDESRGTFI CGLYHKAGIHEVTELVNCPDVTEFRIRINGENLRLDSCCLTEYERSLYTRYGELESRIGC EMKSAGGVHLTARRFASADNRRLFCHQLEISCEEGGVLELSTGINGQITNSGVAHFDRME ARVFEHCRMFLECGCDDGQILSVMTVCSTGESRESAAFHLERRAVYQDFKKELQPGEVWT ISKYTLFDVENRKESSGKTDGQVLVEMTDVLAEALKAGYGELNRRHRDAFERFWDMAAIE ITGAAKEERAAVEFKP Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:29:38 2011 Seq name: gi|229784060|gb|GG667675.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld68, whole genome shotgun sequence Length of sequence - 27911 bp Number of predicted genes - 22, with homology - 21 Number of transcription units - 7, operones - 7 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 2 - 1967 1469 ## COG4485 Predicted membrane protein 2 1 Op 2 . - CDS 2058 - 2483 443 ## COG2246 Predicted membrane protein - Prom 2510 - 2569 5.8 3 2 Op 1 . - CDS 2571 - 3623 1267 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 4 2 Op 2 . - CDS 3636 - 4436 928 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase - Prom 4467 - 4526 6.2 - Term 4495 - 4560 10.0 5 3 Op 1 1/0.000 - CDS 4588 - 5778 1307 ## COG1454 Alcohol dehydrogenase, class IV - Prom 5809 - 5868 7.1 - Term 5836 - 5893 4.5 6 3 Op 2 . - CDS 5908 - 6852 777 ## COG0583 Transcriptional regulator 7 3 Op 3 . - CDS 6849 - 7523 814 ## COG2344 AT-rich DNA-binding protein 8 3 Op 4 . - CDS 7561 - 9270 1596 ## COG3044 Predicted ATPase of the ABC class 9 3 Op 5 . - CDS 9267 - 10364 1155 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) - Term 10380 - 10424 11.0 10 3 Op 6 . - CDS 10481 - 11371 886 ## Closa_0811 hypothetical protein - Prom 11504 - 11563 74.7 + TRNA 11487 - 11558 61.7 # Arg CCG 0 0 - Term 11619 - 11660 6.0 11 4 Op 1 . - CDS 11687 - 13906 1680 ## SGGBAA2069_c16140 hypothetical protein 12 4 Op 2 . - CDS 13929 - 15449 967 ## lp_2956 hypothetical protein 13 4 Op 3 . - CDS 15449 - 16918 1404 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 16988 - 17047 11.6 14 5 Op 1 . + CDS 16859 - 17110 92 ## gi|288870820|ref|ZP_06409920.1| putative response regulator 15 5 Op 2 . + CDS 17188 - 17919 434 ## SGGBAA2069_c16380 hypothetical protein + Term 17931 - 17977 9.4 - Term 17912 - 17971 16.5 16 6 Op 1 1/0.000 - CDS 17989 - 20115 2444 ## COG1511 Predicted membrane protein 17 6 Op 2 . - CDS 20105 - 22351 2262 ## COG1033 Predicted exporters of the RND superfamily - Prom 22381 - 22440 7.7 - Term 22480 - 22518 7.5 18 7 Op 1 2/0.000 - CDS 22540 - 23445 1031 ## COG0370 Fe2+ transport system protein B 19 7 Op 2 . 
- CDS 24423 - 25439 1288 ## COG0370 Fe2+ transport system protein B 20 7 Op 3 . - CDS 25432 - 25665 318 ## Closa_0801 FeoA family protein 21 7 Op 4 . - CDS 25662 - 25775 65 ## 22 7 Op 5 . - CDS 25860 - 27845 2059 ## COG0556 Helicase subunit of the DNA excision repair complex Predicted protein(s) >gi|229784060|gb|GG667675.1| GENE 1 2 - 1967 1469 655 aa, chain - ## HITS:1 COG:L48341 KEGG:ns NR:ns ## COG: L48341 COG4485 # Protein_GI_number: 15672817 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 164 653 10 508 883 201 29.0 5e-51 MTSEKKRSSEETETEKKKLIPSENGTDKKTPSASLEDFSLILDLGDGDVVSSDSAEVSPA GGDREEEPGTVQSPAEGGPVSSEGGQVPDAAEPLSGLDSVQEDPPDLFLEPVESEEYDYE EDLEFPDQEALEFSDQEEPAYDERDGKRRRDGRESICLIRPSDGLVAAFFVPVIIMIIIF AQRGIFPFGEESFLRTDMYHQYAPFFSEFQHKLTHGGSFLYSWDVGMGVNFAALYAYYLA SPLNWLLALCPKAYIIEFMTYMIVFKIGLSGLSFAYYLRKHCRTSDFGIAFFGIFYALSG YMAAYSWNIMWLDCILLFPLIMLGLERLVREKKCFLYCITLGLSILSNYYISIMICIFMV LYFIALLVMDGKKSWSDYMVNISSFAVFSLLAGGLAAVVLLPEIYALQMTASGDFNFPKT VTSYFPIFDMVARHLPDVETEIGLDHWPNIYCGVAVLMLFLLYLVNRNIKRREKVVYCGL LLFFFASFSFNVLNFIWHGFHYPNSLPCRQSFIYIFLMLLMCYQAYSHLKETPWKHVVLA FFAAIAFVLMAQKLITDEAYHFSVFYVAILFLSAYAGLICLYQKGASRNVVILLALGVVS VEAAVNTTVTSVTTTSRTSYVKDNDASVSLTEGITDPAFYRVEKVSRKTKNDGAW >gi|229784060|gb|GG667675.1| GENE 2 2058 - 2483 443 141 aa, chain - ## HITS:1 COG:BS_ywcD KEGG:ns NR:ns ## COG: BS_ywcD COG2246 # Protein_GI_number: 16080872 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 17 141 2 127 127 79 38.0 3e-15 MIKKIWDKVMNREVISYLIFGVLTTLVNWVVYAAMVKVHIDYRIATAAAWAVSVLFAFIV NKIFVFQSYNLRPAYVMKEITSFVACRAVSGVMEMVFMIVMVSWIHMDEYISKIAVSVIV VIVNYVFSKLFIFRKSEEKSL >gi|229784060|gb|GG667675.1| GENE 3 2571 - 3623 1267 350 aa, chain - ## HITS:1 COG:CAC2761 KEGG:ns NR:ns ## COG: CAC2761 COG1477 # Protein_GI_number: 15896017 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: 
Clostridium acetobutylicum # 50 336 17 304 327 207 37.0 3e-53 MTLFVYEEKDRNYAKKVKTAAAVVCLTAAASIWLLSGCARKIEPITKSSFMLNTFVTVTL YDTDDPDILNGSLELCRSYENIFSRTIEASEIYKLNHRPADEQTVTVSDDVAALLEKGIY YSKVSDGDFDITVEPLSSLWNFSSEDHAVPSDSEIREARDKVNWKNLELNGNTLTFGSPD TAIDLGAIAKGYIADRMKDYLVKEGVKSAVINLGGNVLCVGKRPDGTPFKVGLQKPYADR NETIATLDIDGMSVVSSGVYERHFVKDGVNYHHILNPRDGYPYQNGLVSVTIISDLSVDG DGLSTTCFSMGLEKGKALIDSIDGAYGVFITDDYEIVYSEGAEKFLDKES >gi|229784060|gb|GG667675.1| GENE 4 3636 - 4436 928 266 aa, chain - ## HITS:1 COG:CAC1373 KEGG:ns NR:ns ## COG: CAC1373 COG4822 # Protein_GI_number: 15894652 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Clostridium acetobutylicum # 1 257 1 258 278 185 38.0 7e-47 MKKEDAVLVVSFGTSYNDSRKKTIEAIEKTIAGHFHDYEFHRAFTSRVIIQILEKRDSIA VDGVSEAMERLFNQGIKRVLIQPTHVMSGEEFDAMMEEMRPYEAAFEAVAVGAPLLTSEA DYERLVRILAEDTKEFDAEGTEIILMGHGTEHSANEVYVRLAEEFKKQGYSRYHVGTVEA VPSLEDMKREVDRTGAHRVVLQPLMIVAGDHANNDMAGDEEGSWKSVFQSAGYDVVCRLK GLGELEGIRESFLEHAKMAVRSLTLK >gi|229784060|gb|GG667675.1| GENE 5 4588 - 5778 1307 396 aa, chain - ## HITS:1 COG:YPO2180_2 KEGG:ns NR:ns ## COG: YPO2180_2 COG1454 # Protein_GI_number: 16122410 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Yersinia pestis # 6 393 6 416 441 317 43.0 3e-86 MARFTLPRDLYHGKGALEELKNLKGNKAIVVVGGGSMKRFGFLDRVTDYLKEAGLEVKLF EGVEPDPSVDTVMKGAEMMREFGPDWIVAIGGGSPIDAAKAMWAFYEYPDTTFEELCIPF NFPTLRTKARFCAIPSTSGTATEVTAFSVITDYNKGIKYPLADFNITPDVAIVDPDLAET MPAKLTAFTGMDAMTHAIEAYVSTLHCDYTDPLALHAIKMIHNDLADSYHGDMAARDRMH NAQCLAGMAFSNALLGIVHSMAHKTGAAYSGGHIVHGCANAMYLPKVIKFNAKEAEAAKR YAEIARFIGLKGSTDGELVDALIAELRAMNGKLNIPACIREYEGGIIDEKEFNEKLPSVA ELAVGDACTGSNPRPITPAEMEKLLTCCYYDKEVDF >gi|229784060|gb|GG667675.1| GENE 6 5908 - 6852 777 314 aa, chain - ## HITS:1 COG:SPy0898 KEGG:ns NR:ns ## COG: SPy0898 COG0583 # Protein_GI_number: 15674920 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: 
Streptococcus pyogenes M1 GAS # 6 204 6 205 301 107 31.0 3e-23 MNTVLLEYAVEVEKTGSITQAAGNLYMDQPNLSKAIKTLEESLGAPIFRRTPKGVVPTAR GRIFLEYARNVLAQLEEMEALYKPAKTGSVEFTLSMPRASYISSAFSRFVKRIDRKEGMN LWLRETNSADTLRDVERGEYQLGIIRYKPSAEAYYTKQAAECGLKTELLLEYDERLLMSQ AHPLASARHIESGDLLPYIEITHGDAKASGRRELKAADGTTASTPANRIYIFERGTQMGL LHENPDTYMWVSPMPPSVLTQHGLTEKPCIDGNHRYRDVLVYRNGYKMGKWERLFLEELE QVKADLVKDDCGRQ >gi|229784060|gb|GG667675.1| GENE 7 6849 - 7523 814 224 aa, chain - ## HITS:1 COG:CAC2713 KEGG:ns NR:ns ## COG: CAC2713 COG2344 # Protein_GI_number: 15895970 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Clostridium acetobutylicum # 1 216 1 211 214 126 32.0 4e-29 MEEKKHLVGGISRKTLERLPVYHHYLERRDCEGLVNISAPVIALDLQLNEVQVRKDIAMV ARSAGKPKTGYVVKDLIDDIEEFLGYHNTNQAVLVGAGSLGTALLSYRGFEQYGVEIVMA FDEDAEKAGQKIGEKMILPMEKLENLCRRMNIHIGIITVPADSAQEVCDRLVNCGVKAVW NFAPIHLTVPDHVLVQNENMAVSLAALSKYLYEADKDAVRGNGV >gi|229784060|gb|GG667675.1| GENE 8 7561 - 9270 1596 569 aa, chain - ## HITS:1 COG:VCA0786 KEGG:ns NR:ns ## COG: VCA0786 COG3044 # Protein_GI_number: 15601541 # Func_class: R General function prediction only # Function: Predicted ATPase of the ABC class # Organism: Vibrio cholerae # 11 505 8 491 549 383 43.0 1e-106 MKSAEDLRQFLRSVDRKGYPAYKGAKGVYRFPDFTLSIDHVQGDPFASPSRVSIRVEGKK ACFPGESYRPGYKRTALQDFLLRKFGAQVERYNFKARGSGKSGLIAVSRCGQEILERTAC EVDPGDGSILIRMEVGFPANGRTINSGELIKILFEFLPECVAGTCFYKNYPAGEKAKVKA VSELAEDQEEARRLLSEMGLAAFVANGSVLPRESGVSDRPMAKAVPFSSPASMEVVMKLP NAGEIKGMGIPKGVTLIVGGGYHGKSTLLSALESGVYNHIAGDGREYVVTEASAVKLRAE DGRSVKNVDISLFIQNLPDLRDTKRFCTEDASGSTSQAAGVVEAMESGAKTFLVDEDTSA TNFMIRDELMQMVVHRDKEPITPFVERVRALYDEQGISTILVAGSSGAYFHVADRVIQMD CYVPKEVTGEAKEAAAGFGEGVRSLPLTPVSFGRIPKKLKTGGRDERFKMKVLGRDSLQF DRDVIELRFVEQIADTEQIAALGYLLKYAGTHFIDGRRNLIQIVDLLEKKVAAEGLAALA DDGYLPVNLCMPRRQEIFACIDRCRSMIY >gi|229784060|gb|GG667675.1| GENE 9 9267 - 10364 1155 365 aa, chain - ## HITS:1 COG:BH1227 KEGG:ns NR:ns ## COG: BH1227 COG0809 # Protein_GI_number: 15613790 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Bacillus halodurans # 1 342 1 342 347 459 63.0 1e-129 MNVKDFNFELPQELIAQDPLEDRSSSRLLVLDKKTGEIEHRTFRDILSYLRKGDCLVIND TKVIPARLFGVKEGTEAKIEILLLKRRENDIWETLVKPGKKAKIGTVITFGEGLLTGTVV DIVDEGNRLIQFSYEGIFEEILDRLGQMPLPPYITHQLKDQTRYQTVYAKHEGSAAAPTA GLHFTKELLAEIEAMGVVIAHVTLHVGLGTFRPVKVDEIEDHHMHSEFYIVDEAEAKKVN DTKKNGGRIVCVGTTSCRTVESASSDDGVLKAGSGWTEIFIYPGYRFKTLDCLITNFHLP ESTLVMLVSALAGREHVLNAYEEAIRERYRFFSFGDAMFITDLHERVSSDDGDQTGKETG ERMER >gi|229784060|gb|GG667675.1| GENE 10 10481 - 11371 886 296 aa, chain - ## HITS:1 COG:no KEGG:Closa_0811 NR:ns ## KEGG: Closa_0811 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 242 1 236 250 197 48.0 5e-49 MKDLLENLKKRLADEETVRYLKFMVIPLVVIILIIVIVAADRPQGEEGESTADSSQMEIR TAETTEEEKTGDTEDTVYHLEEAGDEIRSLMEAYFKARRTCDINGLADVYGDSASPEELR EESLRMEEEVKYYQSYSDIACYSVNGPVEHSLVVYTRFNIKFRQSDTVAPSLFACFVKRG ADGSWHMMAEVTPEEGAYMARVNESEAVQRMAEEVNVSLRAALETDSNLLAVYNTLMNGE EQTGGGESEENGLSEGTDGEGSVTVPGNEVTVDGNGVTDSGDEGADSGNEGMDIGN >gi|229784060|gb|GG667675.1| GENE 11 11687 - 13906 1680 739 aa, chain - ## HITS:1 COG:no KEGG:SGGBAA2069_c16140 NR:ns ## KEGG: SGGBAA2069_c16140 # Name: not_defined # Def: hypothetical protein # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 1 602 1 538 588 193 28.0 2e-47 MEKNIFGVKIMAVIAGTLILAMTACSTSSDQGKKEAASADSAAETVDVENRELDTSELVF ERSGDSVKLTSGDESLTITLKATKGAFDAASGEDVTDWFVTQSGNAVFTNSGVAMVESVN ENGTSMEITLDASAIQGFTAGGTVAMYVQPEKQAAKVNVGSYTIPELDITLDASEGVVAA NTSGAAAVTGAVAYFNLTMPEETVIDDVNTEAVTLAMTAGDGYCVDEFTDPVVRLDEAVW SEGRYEFHFDETEVEKLFTRTKVRGSESEVYSEGGPDWTAFGGNGNGQYYFNLTVSGLTY QGLPIPDSTFRVCYYVYGRDASDKARITLADYINVYDYEDTGAEKRTLAETGRNAGTGNG IAWTWIGADYEGKPILCDTEQDDFYITWPEGVDASALTASDVTVTLYSQYGDEQRLVPDS EYYVYSDSRETQIAVPYVQMAFTPVYTNMVISVDRAAVSGADDAAEEAFTQSYDIASVYV YQVQHGGLNPNGAVLVYQFYGFDEDTLTDWSQLMHGFEYALYIENEDGTKAYYSEENGGC 
LVDSADAAALYDAAGAQDLNIRLWDQAVIYDTTSAIENEDGSKTYRTETKTVNGENITFN KIILTGWDGNDLRGVQMTPAEARANGLKPAAGYAFPQTDDFYEKFKNYNFGHDYWVAHSM WPWVEGIERGWLTNADDMHASWNGLDKGYHFEYGSDGRYPDWSAPEAEGSDNWWPWLRDG FPDGMWEAVDAANADTYRQ >gi|229784060|gb|GG667675.1| GENE 12 13929 - 15449 967 506 aa, chain - ## HITS:1 COG:no KEGG:lp_2956 NR:ns ## KEGG: lp_2956 # Name: not_defined # Def: hypothetical protein # Organism: L.plantarum # Pathway: not_defined # 14 506 1 469 469 436 45.0 1e-120 MKQDYKEYEEKIKMNDNLKFDAERGETRTYEAAGKKITCRAYLGIVYCERPADPVQKLNI FVPEVYDRGGMIGGYTKLTAPIFVPNTVGGYLPGPAEEPGLNIRTGMPNSIFAALEHGCV VVSGGVRGRTSGMKSTEFFEGGKVDDTGQDNGKMVGKAPALIVDMKAVIRYIRHNRGLIP GNTERIITSGTSAGGALSALAGATGNSRDYVPYLKEIGAADERDDVFAANCYCPIHNLEN SDSAYEWQFGGIRDYCRTRHKKTEHGIVRIQEDGTMTDRQMELSGKLQKLFPAYLNSLKL KDENGNSLNLDENGEGSFLEYIKGEILKSAGRELAFHETANHLSGLSAPGSAVNTFPALK ILGERVVGLDWKAYMKEITRMKPAPAFDATDLKSPENEEFGSMDMEARHFSEFSMEYTET EAEMAPEQLIRMLNPTKYIGKADTAKHWRIRHGSFDRDTSFAIPVILALLLKNKGYDVDF LLPWGLPHSGDYDLQELFTWIDTICR >gi|229784060|gb|GG667675.1| GENE 13 15449 - 16918 1404 489 aa, chain - ## HITS:1 COG:BH0595_2 KEGG:ns NR:ns ## COG: BH0595_2 COG1263 # Protein_GI_number: 15613158 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus halodurans # 103 454 9 351 399 231 40.0 2e-60 MAVNYETLPGEIVRLVGGKQNIKGLTHCMTRLRFDLADYGKAKTEVLSKTEGILKVINSG SQYQVVIGNKVESVYESIIKQTGINGGGVVEEEASSDEGVGPKKKSGVINRLMDTISGVL APTLHVLAAAGIIKGLIALTLTLGWVSDTSGLYMLLYSIGDGFYYFLPIILGFTAARKFK MNEILGAAIGATLVYPSMVNIASNLAVSGTIFAGTAFEMSYYNTFLGIPIVMPGSGYTSS VIPIIIAVYVAAKLEKALKKSLPAAIRGILTPVIVLLVTVVFTYLVIGPVSMGICGVIAM FVTFLYNIPVVGGIIAGALIGGGFGILVMFGLHWVIISLGLSTIAVQGFDYMLACGSIGP MIGMAQGIAICLAAARHKNQKVMDLALPGTLSQICGVGEPLMYSVLIPLKKELYLNILSG CVGGAVLGLLGTRIYMFGGSGLFSFPNFVSAASGTTDLIKYCIGVAAGCIFAFAAELVIY NNDKAKKLL >gi|229784060|gb|GG667675.1| GENE 14 16859 - 17110 92 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870820|ref|ZP_06409920.1| ## NR: 
gi|288870820|ref|ZP_06409920.1| putative response regulator [Clostridium hathewayi DSM 13479] putative response regulator [Clostridium hathewayi DSM 13479] # 1 83 8 90 90 96 98.0 6e-19 MFSPNEPDNFSGQSFIVNCHSIFLLVLNNLFIIYLIYKYIIHYYPYFKILVLHKYFIYFC ALCPNILFDAIIKLWFIEVIFIL >gi|229784060|gb|GG667675.1| GENE 15 17188 - 17919 434 243 aa, chain + ## HITS:1 COG:no KEGG:SGGBAA2069_c16380 NR:ns ## KEGG: SGGBAA2069_c16380 # Name: not_defined # Def: hypothetical protein # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 1 237 1 236 246 159 37.0 9e-38 MNVIQIMEQKKTEFTPNDQLIYQKISKDPGKIVQMTTSALSDACGVSQPALTRFVKGLGY TRYQDFRADLIAWLAEQRTQQMPDCEHLEYFSVMNQLLREAEKILTDSYMRSLACYINQF NRVFTTGVSKSFQPASLLETLMMKTGRFFHAVPNDNISELSDSMMENDLLIVFSVSAKHS LLKSLSSSTGKIMLVTVTSSHPYQELIDRQVLLPYIPPTPEESAVSPVLFDMFVEILCRY LTH >gi|229784060|gb|GG667675.1| GENE 16 17989 - 20115 2444 708 aa, chain - ## HITS:1 COG:BH0721 KEGG:ns NR:ns ## COG: BH0721 COG1511 # Protein_GI_number: 15613284 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 1 695 1 598 599 93 20.0 1e-18 MRNRRKIRYRITAAALISSMLGMSMAGSALAGPPAVAADEALYVNLDHYGNRVDSSVVKG VSLNGLRTFTDYGNYKDVTNMSNYAEPVINGDSVTWQLPENSKERFYYECQLDNDEVVLP WDFDVSYKLNGIPKEAKDLVHANGLVEMEVHCIPNKNAREYYRNNMLLQVATMVNMKDVN AVEAPGSQTQALGTYKVVIFAAVPGEEKTFHIGISTTDFESMGLIMMMIPGTLDQMSEIT DIKEVKDTVGDSTDKLLDGMNEILDTLDRISGGMNVAQAGLEDLQKARAGFDASKDEITA NADNSLDSLEAVNEKISQLAPDINTNKQSLDEINTRINAIVKTLRSSGNDFFDLAGKLSD LEDSLGDLRNDLNSSNRDEILDQLQVVDDKLEEINEILQNILQTAGGMPAELDEEDLAEQ QETLNDLLEETYGILDDLESVAGSDAVDRLRSQLDAMNGNGLGDAASLSKLAAAAKALLP VVSGLQGLVSDLERTLSGVRLNDGLESSRDAVGEIGSIMGRIETLIGDINDLNRTVNEDK AGFDLMLDDMAASLDQMSSGTSQLITLLRSVQNTAKANRSAVENGTKQTLDGLIDILEKA ADTSGTSGKLKDANEDLRTSVKEELDKIEDDTNLLEMDPSRSMISFTSDKNPSPASIQVI LRTEEISEEDVNTNAVDIEPAAQNVGLWQRIVNVFVKIWDTVTGFFKK >gi|229784060|gb|GG667675.1| GENE 17 20105 - 22351 2262 748 aa, chain - ## HITS:1 COG:BH0720 KEGG:ns NR:ns ## COG: BH0720 COG1033 # Protein_GI_number: 15613283 # Func_class: R 
General function prediction only # Function: Predicted exporters of the RND superfamily # Organism: Bacillus halodurans # 21 697 8 678 687 427 38.0 1e-119 MKTRKAREKRQKKRFSMVDVIVQEGRWIEKIFAVAVILSVISACFVHVNYDLSEYLPDWA PAKEGLNRMEEEFGYPGTARVMVGPVTLYEAKAYKDRIAEIDGVDMVMWADTTTDVYNAN IFINYDDIQDYYKDDYAVMDIIFEEGDSSSLTKSAINEMQTLTGDKGYFMGSAVQNKSLN ETLMREISIAMALGVVMIAVILCLTTTSWFEPVMFLMVMGIAIIINMGTNLILGTISFLT FSVASILQLAVAMDYSVFLLDMFTKERAKGNDPVTAVTNAIHKSLPSILASGATTIIGFI VLMLMRFSIGKDLGFVLAKGIVVSLITVLFLMPSLLLRWGDRIKRTEHKSFMPSFHGLGQ TVYKVRYVVLGVVLLFVIPAFTAQNMNSFTFGNSALGSSPGTKVYEDEQQINGRFGKSNL ILAVVPNTSMVKEKDLTDALDDLYYVKSVTSLAGTIPEGIPEKIIPKSATSLLHTENYSR IMIYVKTSDESDFAFQCANEITSIVKSYYPEDSYITGVIPSTQDIKSIITADYSFVNLLS ILGVGLVVLISFKSLIIPVVVLIPIEVAVFFNMAVPYFTGESMLYMGYIIVSCLQLGATV DYSILMTNNYMDARRTYPEKARAIKHTVSRSALSILTSGTILTIAGYGLYFVSSVAAIGG LGHLIGRGALISVLMVLFLLPVLLTMADPLLMKEEAKKVRLEEVRLAARQKKQDLKEQMK QKMEQMKTEPQKEPPSKNLENEVKEHEK >gi|229784060|gb|GG667675.1| GENE 18 22540 - 23445 1031 301 aa, chain - ## HITS:1 COG:CAC0448 KEGG:ns NR:ns ## COG: CAC0448 COG0370 # Protein_GI_number: 15893739 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Clostridium acetobutylicum # 1 300 300 586 587 260 45.0 2e-69 MARVAYVMNETMGRVGLSGKAFLPMLLGFGCTVPAVMAARALPSEKDRRRTILITPFMSC SARLPIYVLFAEMFFPQHALIVAYSLYLIGLVVAIAVAHVTHRMSKAGEEDTLLIELPEY KTPNMRTVAIYVWDKVKDYLSKAGTTIFLASIVLWFVLNSGPSGFVDDVSGSFAAMIGRV LVPVLNPAGLGDWRIAVALISGLSAKEVVVSSFSVLFGIGNINSAGGMNQLSGILSSIGF NGVNAYALMIFCLLYSPCIAAVATIRKETGSTRWTLGMVLFQLLTAWGAAVLVFQIGSLL F >gi|229784060|gb|GG667675.1| GENE 19 24423 - 25439 1288 338 aa, chain - ## HITS:1 COG:MA3477 KEGG:ns NR:ns ## COG: MA3477 COG0370 # Protein_GI_number: 20092288 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanosarcina acetivorans str.C2A # 4 338 8 348 670 218 37.0 2e-56 MSDENKVINVGFVGNPNCGKTTLFNAFTGAKLKVANWPGVTVERVEGETSYKGRPIRVID 
TPGIYSLTSYTIEEKVTRKCIEDGEVDVIINVVDASSLERNLYLTLQLLELKKPVILALN MMDIVEDRGMEIDLHRLPEMLGSIPVVPVSARKRTGLDVLMHAVVHHYEEGPQGTVVKYT PRIENKISKVETVLKAHYGDMDNLRWHAIKFMEYDETVWNDHPVDLNNIIDHNYEKEIIN QKYDYIEEIIKECLVNKESKAASTDKIDEYLTHPVWGIPIFLGIMAMVFFLTFTVGDFLK GYFEVGLDWLSGSTLYLLQSIHASEWITSLVVDGIVAG >gi|229784060|gb|GG667675.1| GENE 20 25432 - 25665 318 77 aa, chain - ## HITS:1 COG:no KEGG:Closa_0801 NR:ns ## KEGG: Closa_0801 # Name: not_defined # Def: FeoA family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 77 24 100 100 114 89.0 1e-24 MKLHEGEIGKTYIVYSVMVDDTITRRLEALGVNEMTPVTLMNKKGSGTVIIKVRGTRLAL GRRISEGIEIREAADHE >gi|229784060|gb|GG667675.1| GENE 21 25662 - 25775 65 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDAGIQKVPGPTIVRPGFSIALKQGETNETAVRRRTE >gi|229784060|gb|GG667675.1| GENE 22 25860 - 27845 2059 661 aa, chain - ## HITS:1 COG:SP1238 KEGG:ns NR:ns ## COG: SP1238 COG0556 # Protein_GI_number: 15901100 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Streptococcus pneumoniae TIGR4 # 2 655 9 661 662 849 65.0 0 MEFKLHSEYQPTGDQPQAIEALVKGFQEGNQFQTLLGVTGSGKTFTMANVIQRIQKPTLI IAHNKTLAAQLYSEFKEYFPENAVEYFVSYYDYYQPEAYVPSTDTYIEKDSAINDEIDKL RHSATAALSEREDVIIVASVSCIYGLGSPIDYKEMVISLRPGMEKDRDEVIHKLIDIQYT RNDMDFKRGSFRVRGDVLEIYPAYSGGDAYRVEFFGDEVERITEIDSLTGEGKAELGHIA IFPASHYVVSKEKMELATQTILAELKEQVAYFKSEDKLLEAQRISERTNFDVEMMRETGF CSGIENYSRHLTGSLPGEPPCTLIDYFPDDFLIIVDESHITLPQVRGMFAGDRSRKTTLV NYGFRLPSALDNRPLNFEEFESKIDQMMFVSATPSVYEANHEMLRVEQIIRPTGLLDPEI EVRPVEGQIDDLVSEVNKEVANKHKILITTLTKRMAEDLTDYMREIGIRVKYLHSDIDTL ERSEIIRDMRLDVFDVLVGINLLREGLDIPEITLVAILDADKEGFLRSETSLIQTVGRAA RNSEGHVIMYADSITDSMRVAIEETNRRRAIQQQYNEEHGITPTTIKKSVRDLIAISKAA IEDKKSFEKDPESMDAKELEKLAKELTKKMHQAAAELNFEEAARLRDRMVEVKKMLQELD E Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:30:25 2011 Seq name: gi|229784059|gb|GG667676.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld69, whole genome shotgun sequence Length 
of sequence - 28200 bp Number of predicted genes - 26, with homology - 26 Number of transcription units - 17, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 82 - 118 -0.8 1 1 Tu 1 . - CDS 166 - 423 141 ## gi|288870823|ref|ZP_06115432.2| conserved hypothetical protein 2 2 Tu 1 . - CDS 525 - 734 159 ## gi|266622498|ref|ZP_06115433.1| toxin-antitoxin system, antitoxin component, Xre family - Prom 764 - 823 9.3 + Prom 761 - 820 4.5 3 3 Tu 1 . + CDS 892 - 1215 393 ## CTC01070 DNA-binding protein + Prom 1479 - 1538 5.0 4 4 Op 1 . + CDS 1671 - 1910 110 ## gi|288870824|ref|ZP_06115435.2| putative toxin-antitoxin system, toxin component 5 4 Op 2 . + CDS 1969 - 2484 245 ## gi|288870825|ref|ZP_06115436.2| putative pterin-4-alpha-carbinolamine dehydratase + Term 2540 - 2575 1.3 + Prom 2555 - 2614 7.4 6 5 Tu 1 . + CDS 2666 - 4057 760 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 7 6 Op 1 . - CDS 4069 - 4242 228 ## Closa_3331 GTP-binding protein HSR1-related protein 8 6 Op 2 . - CDS 4233 - 4604 392 ## Closa_3332 FeoA family protein - Prom 4712 - 4771 2.8 + Prom 4558 - 4617 2.7 9 7 Tu 1 . + CDS 4807 - 5226 339 ## COG1321 Mn-dependent transcriptional regulator + Prom 5254 - 5313 2.5 10 8 Tu 1 . + CDS 5340 - 6536 1604 ## COG0538 Isocitrate dehydrogenases + Prom 6569 - 6628 6.7 11 9 Tu 1 4/0.286 + CDS 6700 - 7248 250 ## PROTEIN SUPPORTED gi|229236145|ref|ZP_04360568.1| acetyltransferase, ribosomal protein N-acetylase + Prom 7280 - 7339 7.2 12 10 Tu 1 . + CDS 7377 - 8255 665 ## COG0583 Transcriptional regulator + Term 8330 - 8376 -0.3 - Term 8338 - 8386 12.1 13 11 Op 1 . 
- CDS 8417 - 9730 1464 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs 14 11 Op 2 21/0.000 - CDS 9812 - 11242 1344 ## COG0064 Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) 15 11 Op 3 31/0.000 - CDS 11243 - 12826 469 ## PROTEIN SUPPORTED gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 16 11 Op 4 1/0.571 - CDS 12838 - 13167 463 ## COG0721 Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit 17 11 Op 5 . - CDS 13221 - 14633 1362 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases - Prom 14703 - 14762 1.8 + Prom 15124 - 15183 5.9 18 12 Tu 1 . + CDS 15215 - 16585 1387 ## COG0534 Na+-driven multidrug efflux pump 19 13 Tu 1 . - CDS 16616 - 17449 909 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 17586 - 17645 5.8 + Prom 17525 - 17584 4.7 20 14 Tu 1 . + CDS 17752 - 19026 900 ## COG0477 Permeases of the major facilitator superfamily + Term 19229 - 19271 -0.5 - Term 19034 - 19075 8.6 21 15 Op 1 1/0.571 - CDS 19081 - 20277 1086 ## COG1609 Transcriptional regulators 22 15 Op 2 3/0.286 - CDS 20246 - 20935 766 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 23 15 Op 3 . - CDS 20977 - 22629 1714 ## COG1070 Sugar (pentulose and hexulose) kinases 24 15 Op 4 . - CDS 22662 - 24164 1689 ## COG2160 L-arabinose isomerase - Prom 24190 - 24249 4.3 + Prom 24236 - 24295 6.0 25 16 Tu 1 . + CDS 24330 - 27395 1356 ## COG4870 Cysteine protease - Term 27429 - 27488 21.1 26 17 Tu 1 . 
- CDS 27504 - 28199 718 ## COG3192 Ethanolamine utilization protein Predicted protein(s) >gi|229784059|gb|GG667676.1| GENE 1 166 - 423 141 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870823|ref|ZP_06115432.2| ## NR: gi|288870823|ref|ZP_06115432.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 8 85 19 96 96 150 100.0 4e-35 MKKDKKKPLPAQRKNVFGRKWYSTNISATEYFDNNLNDCWQAIYKLSLLVIVMIVILFRK GLFNHFNNFIVNISSKLKERFSPGN >gi|229784059|gb|GG667676.1| GENE 2 525 - 734 159 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622498|ref|ZP_06115433.1| ## NR: gi|266622498|ref|ZP_06115433.1| toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] # 1 69 1 69 69 115 100.0 1e-24 MYYRLKIEISRRGYTIEKFASMLSISEKSLRNKINGTTEFTWSEVLAIRDLIDPDMLLEE LFKKESKIA >gi|229784059|gb|GG667676.1| GENE 3 892 - 1215 393 107 aa, chain + ## HITS:1 COG:no KEGG:CTC01070 NR:ns ## KEGG: CTC01070 # Name: not_defined # Def: DNA-binding protein # Organism: C.tetani # Pathway: not_defined # 1 66 4 69 150 75 66.0 8e-13 MGLEKIAEYKKKLGMTTEELSEKSGVPLGTLNKILSGATKDPKLETLKAIAHVLGLSLDD FDDKKKVIHQEPTYADVEKLVARNGKKMSVEQKMRLIQLLSEIEFED >gi|229784059|gb|GG667676.1| GENE 4 1671 - 1910 110 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870824|ref|ZP_06115435.2| ## NR: gi|288870824|ref|ZP_06115435.2| putative toxin-antitoxin system, toxin component [Clostridium hathewayi DSM 13479] putative toxin-antitoxin system, toxin component [Clostridium hathewayi DSM 13479] # 1 79 102 180 180 156 100.0 5e-37 MAANRIYEDYKMCHLSECKEINQEIHNWFFPVIIPEMTVSKPKRIVEHKESKEKHTAWAE YHDMLERYFPERLQNYVLR >gi|229784059|gb|GG667676.1| GENE 5 1969 - 2484 245 171 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870825|ref|ZP_06115436.2| ## NR: gi|288870825|ref|ZP_06115436.2| putative pterin-4-alpha-carbinolamine dehydratase [Clostridium hathewayi DSM 13479] 
putative pterin-4-alpha-carbinolamine dehydratase [Clostridium hathewayi DSM 13479] # 1 171 17 187 187 322 100.0 5e-87 MKKVILSVLIIVGLVCGCSSKESEPEKNVEIILDVQQFCGMSEAELIEKMGEPESREEWN YEVGNLYSPIISCFYNNNEFEFMLNSDKVQRISIHADSYNHTDGDPFTFESKEDILPMFG ITIEDVKYAKKIDTNSALRYQDFANIKSFWIPEYENNSFDEVKVDFSDLFE >gi|229784059|gb|GG667676.1| GENE 6 2666 - 4057 760 463 aa, chain + ## HITS:1 COG:lin1231 KEGG:ns NR:ns ## COG: lin1231 COG1961 # Protein_GI_number: 16800300 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 9 407 1 407 471 112 25.0 2e-24 MTKPQTTLLRVALYIRVSSDEQAERGDSIRDQKERGTKYIDDHQNMILQDTYIDDGVSGQ KLDRDDFTRLIRNVKAGLVDLIIFTKLDRWFRSLRHYLNTQVILDKYNVAWTAIDQPYFD TSTPYGRAFVAQSMTWAELEAQNGGLRVADVFRSKVEHGEVITGKVPRGYMIQNKHLVFS DEAPAMLDSIQYFHREQGLAKTVNYMRETHGINMSIQNLKNSILRNEKYTGRYRGNENYC PRLISDEMYQDIQRVLDNNSTIRSSQKYPYIFSGILICDECGHKMSGCHINVVSRRTSGK VYRYKYPAYECLQYRAYKKCSNGGEIREMRIEEYLLEHIREELSGYLVDFETGEAKRIDN RAKKNKIRRKLDRLKDLYLNEVINLEEYKKDRAEYEEQLAALPDMEQPIKNLKPLKQVLD CNFEIIYNNLNNEEKRAFWRSIIKEIRVSKSIGRNRKYQIIFL >gi|229784059|gb|GG667676.1| GENE 7 4069 - 4242 228 57 aa, chain - ## HITS:1 COG:no KEGG:Closa_3331 NR:ns ## KEGG: Closa_3331 # Name: not_defined # Def: GTP-binding protein HSR1-related protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 55 1 55 435 109 87.0 4e-23 MGLSRDSVGIKSVDCGLEIKKRREDDKVIALAGNPNVGKSTVFNEMTGMNQHTGNLT >gi|229784059|gb|GG667676.1| GENE 8 4233 - 4604 392 123 aa, chain - ## HITS:1 COG:no KEGG:Closa_3332 NR:ns ## KEGG: Closa_3332 # Name: not_defined # Def: FeoA family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 85 1 89 104 119 70.0 5e-26 MTEVIQLTDLKKGQKAVISELAVFDDMRRRLQDIGLIEGTVVECVGKSPLGDPCAYIIRG AVIALRSEDSGRVMVYRKESEHQGAREGKEQRAESPDMALAADTASICEGKAGQEETEDV LWG >gi|229784059|gb|GG667676.1| GENE 9 4807 - 5226 339 139 aa, chain + ## HITS:1 COG:CAC2616 KEGG:ns NR:ns ## COG: CAC2616 COG1321 # Protein_GI_number: 15895874 # Func_class: K 
Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 4 136 3 135 157 89 34.0 1e-18 MDQEDNFYTLKGYSLLEHIQITSSMEDYLEMICRLSQDGTPVRIGDLAKCLHVRPSSASK MAGNLRSHGLVRFEKYGTVTLTEAGLQLGRYLLFRHTVLQDFFCFINQTTNELEQAEKVE HFIDPKTVYNLKKWMDKIT >gi|229784059|gb|GG667676.1| GENE 10 5340 - 6536 1604 398 aa, chain + ## HITS:1 COG:TM1148 KEGG:ns NR:ns ## COG: TM1148 COG0538 # Protein_GI_number: 15643905 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Thermotoga maritima # 1 398 1 395 399 502 60.0 1e-142 MDKIKMTTPIVEMDGDEMTRILWKMIKDHLLLPYIDLKTEYYDLGLEYRDETDDQVTVDS ANATKKYKVAVKCATITPNAARVKEYHLKEMWKSPNGTIRAILDGTVFRAPIVVKGIEPC VKNWKKPITIARHAYGDVYKGSEMKIPGAGKVELVYTAEDGSETRELVHNFTGAGIVQGM HNLNNSIESFARSCFNYALDTKQDLWFATKDTISKKYDHTFKDIFQDIFDAEYAGKFEEA GITYFYTLIDDAVARVMKSEGGYIWACKNYDGDVMSDMVSSAFGSLAMMTSVLVSPDGDY EYEAAHGTVQRHYYKHLKGEETSTNSVATIFAWTGALRKRGELDGNKDLMDFADRLEKAT IDTIENGEMTKDLALITTIPNPTVLNSEDFIKAIAARL >gi|229784059|gb|GG667676.1| GENE 11 6700 - 7248 250 182 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229236145|ref|ZP_04360568.1| acetyltransferase, ribosomal protein N-acetylase [Chitinophaga pinensis DSM 2588] # 8 178 3 174 181 100 35 8e-21 MRYKPKDHILKNGAVCRLRSPLPADAAAILHLMKVTSGETDFMARYSDEITMPPQREEKF LETTLNSPTDLMICAVVDGTIVANAGLSPIAQRERYRHRAAFGISIRKEYWGLGIGSLLL SAIITCAREMDLEMIELEVVCENERAIALYKKYGFRIYGSRPHSFKYRDGSYADEYLMLL TL >gi|229784059|gb|GG667676.1| GENE 12 7377 - 8255 665 292 aa, chain + ## HITS:1 COG:PA3398 KEGG:ns NR:ns ## COG: PA3398 COG0583 # Protein_GI_number: 15598594 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 3 288 5 288 308 157 30.0 2e-38 MNLRHLTIFKTVCEEGSITGAAKTLSMTQPAISHAINELEESVGASLFDRVSRKIVINEN GKFYLSKVIPLLELYHDLETCSGALNRQTPLRVGSCVTLASFLLPGVISRFMKSNPESVI KVTVNNSKEIAQLLLRNELDIGLIEGVVPDEQLEKIPFSSYPLAVICAPDHPFARQTSQP VSLERLLEEPLILREEGCTIRDAFDCALTLHNLTAAPLWSSTNSDVVVQAVRENLGIGVV 
PRIFADPYIQRGEIVEIDVKDFDVSCMNHVVFHKDKFQSESFQAFINLVLFD >gi|229784059|gb|GG667676.1| GENE 13 8417 - 9730 1464 437 aa, chain - ## HITS:1 COG:ML2336 KEGG:ns NR:ns ## COG: ML2336 COG1167 # Protein_GI_number: 15828259 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Mycobacterium leprae # 4 425 40 459 463 387 45.0 1e-107 MKPYAEMTKEELQELRKQLSTKYREYQGKDLRLDMSRGKPSVDQLDLSMGMMDVLSSDDD LTCEDGTDCRNYGVLDGIKEAKELLGDMMEVHPDQIIIYGNSSLNVMYDTVSRSMTHGVM GNTPWCKLDKVKFLCPVPGYDRHFAITEYFGIEMINVPMTPTGPDMDKVEELVANDESIK GIWCVPKYSNPQGISYSDETVRRFARLKPAAPDFRIYWDNAYTIHHLYDHDQDHLIEILA ECKRAGNPDLVYKFASTSKVSFPGSGIAAIACSQNNLVDIKKQLKIQTIGHDKVNQLRHV RFFGDIHGMVEHMRKHADIMRPKFEAVIEILERELGGLGIGSWTSPKGGYFISFDSLDGC AKSIVARCKKAGLIMTGAGATYPYGKDPHDSNIRIAPSYPPLSDLILAMELFALCVKIVS IDKLLAEMNEKTETVTL >gi|229784059|gb|GG667676.1| GENE 14 9812 - 11242 1344 476 aa, chain - ## HITS:1 COG:CAC2669 KEGG:ns NR:ns ## COG: CAC2669 COG0064 # Protein_GI_number: 15895927 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) # Organism: Clostridium acetobutylicum # 4 475 2 473 476 489 50.0 1e-138 MSRQYETVIGLEVHVELATKMKIFCSCSTEFGGAPNTHTCPVCTGMPGALPVLNRQVVEY ALAVGLAVNCEINQYCRFDRKNYFYPDNPQNYQISQLYLPICHDGFVEIETAAGKKKVGI HEIHMEEDAGKLIHDEWEDCSLVDYNRSGVPLIEIVSEPDMRSAEEVIAYLEKLRLIIQY LGASDCKLQEGSMRADVNLSVREAGAPEFGTRTEMKNLNSFKAIARAIEGERKRQIELLE EGKKVIQETRRWDDNKESSRAMRSKEDAKDYRYFPDPDLPPVFISDTWIQEIKDRQPEFK TGKMARYQAELGLSEYDADIITGSKRMADIFEATVSLCKKPKEAANWLMVEGMRLLKEEA LEPEAIRFSPENLAKLIVLVDNGTITRTVAKEVFEKIFAEDIDPEAYVEENGLKVVNDEG ALRAVIEEVIAANPQSVTDYRNGKERARGFLVGQTMRAMKGKADPAMVNKMIEELL >gi|229784059|gb|GG667676.1| GENE 15 11243 - 12826 469 527 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 [Phaeobacter gallaeciensis BS107] # 4 
478 2 435 468 185 29 3e-46 MTEKEILSLTAVQLGKKIKEKEVTCEEALTAVFAQIERQESSLHCYVTLEKEKALKQARE IQKKIDEGQLAGPLAGVPAAVKDNMCIEGMRTTCSSKILENFVPTYTAEAVENLRKAGAV IIGKTNMDEFAMGSTTETSAFGVTRNPWNTDHVPGGSSGGSCAAVAANECFYALGSDTGG SIRQPSSFCGVTGLKPTYGTISRYGLIAYGSSLDQIGPVAKDVTDCAVILEALASHDAKD STSVDREAVMRQKMEQGAAGWTSNGGCDFTSALVDDVTGMRIGIPGDYLGEGLDPEVKEA VLKAAEVLREKGASVEEFDLGMAEYAIPAYYVIASAEASSNLSRFDGVKYGFRAKEYDGL HDMYKKSRSQGFGPEVKRRIMLGSFVLSSGYYDAYYLKALRTKALIKKAFDRAFEKYDVI LGPASPATAPKLGESLSDPLKMYLGDIYTISVNLAGLPGISVPCGKDSGGLPIGLQLIGD CFQEKNIIRAGYAFERSRSYEMPPCAVPETAESVTVQSGMAAGKEAE >gi|229784059|gb|GG667676.1| GENE 16 12838 - 13167 463 109 aa, chain - ## HITS:1 COG:lin1868 KEGG:ns NR:ns ## COG: lin1868 COG0721 # Protein_GI_number: 16800934 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit # Organism: Listeria innocua # 5 109 4 96 97 70 40.0 9e-13 MAYRIDDATIEYVGILAKLELSESEKEDAKKDMGRMLDYIDKLNELDTDGVEPMTHIFPV TNVFREDTVREAADSYDTTGEHSSDGRENTLKNAPEQKNGAFKVPKTVE >gi|229784059|gb|GG667676.1| GENE 17 13221 - 14633 1362 470 aa, chain - ## HITS:1 COG:DR1055 KEGG:ns NR:ns ## COG: DR1055 COG0017 # Protein_GI_number: 15806075 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Deinococcus radiodurans # 23 470 16 435 435 372 45.0 1e-102 MEFMKGIQRKETVELADLLSGDYTGKTVSLNGAVHTIRDMGEVSFVILRKAEGLVQCVRE RSVPVYRAGERGAGAAQEAADGGLEFADNGPEIPSAELKEESAVTVTGIVSPEPRAPHGV EIRLTEIKVLSTPAEVLPIAVSKWKLNTSLETKLALRPITLRNPMERAKFRIQEGIVRGF RDFLHSQGFTEIRTPKIVARGAEGGSNVFRLDYFNKKAELGQSPQFYKQTMVGVYDRVFE AAPVFRAEKHNTTRHLNEYTSLDFEMGYIDGFEDVMAMETSMLQYVMKLLSEEYEKELRM LKVTLPDVARIPAVTFKRAKELVSEKYSRPIRNPYDLEPEEEYLIGRYFKEEYDSDFVFV THYPSKKRPFYAMDDPHDPKYTLSFDLLFKGLEITTGGQRIHDYDTIIKKMEKRGMDPEE LASYLMIFKYGMPPHGGLGIGLERLTMRLLDEQNVRETTMFPRDVTRLEP >gi|229784059|gb|GG667676.1| GENE 18 15215 - 16585 1387 456 aa, chain + ## HITS:1 COG:lin2873 KEGG:ns NR:ns ## COG: lin2873 COG0534 # Protein_GI_number: 
16801933 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 9 450 2 445 450 220 32.0 5e-57 MSNMTISGKPAENKMGVMPISRLLFTMSLPMILSMLVQALYNIIDSIFVARLGETALAAV SLAFPVQNLIIAVSTGTGVGVNALLSRSLGEKNQKNANLAAVNGLFVFFLSYLLFAVFGI FFARIYFTVQTSNPEIIRQGTTYLSICSIFSFGIFLEIALERIMQSTGRTIYNMITQGLG AIINIILDPILIFGLLGFPRMGIMGAAAATVIGQIAAMLLLLYFNITKNTDVNLNMRSFR PNAEIIAEIYRVGLPSIIMQSISSVMTFGVNKILLLFSETAVSVFGIYFKLQSFIFMPVF GLNNAMVPIVAYNYGAAKKDRIMKTIRASVTAAVAIMLVGLAIFQIFPEQLLYLFDASEN MMGIGVPALRIISLSFLFAGYCIVIGSVFQALGNGVYSLITSAARQLVCILPAAWFFAAE FGLHAVWYAFPLAEIISVVLTTVLFRRIYQKKIQLL >gi|229784059|gb|GG667676.1| GENE 19 16616 - 17449 909 277 aa, chain - ## HITS:1 COG:STM3678 KEGG:ns NR:ns ## COG: STM3678 COG2207 # Protein_GI_number: 16766963 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 5 266 2 264 271 184 38.0 2e-46 MEHFLHLKALLPVQALNAGYLVTKGWGRHVERTMDTFEIIYVNSGTLGIFEEDVEYDVEK GEALILFPGRHHGGTRDFDADLTFYWLHFTMEEEALRMGSDIMALPQLIRLAKPEAFEEL FRRFIRRQNVCRDDKTFLNLVLLELLCELSDSLSPLEVSGQKVYLANQAARYIREHWKEP LSASFLADKLECNPDHLGRVFHAVYGRNLTDAIHEARINRACRLLIETNLTGSEIAYECG YTDVDYFRRLFKKYMGITAKEYRQAYCLVLPEDRVGK >gi|229784059|gb|GG667676.1| GENE 20 17752 - 19026 900 424 aa, chain + ## HITS:1 COG:TM0563 KEGG:ns NR:ns ## COG: TM0563 COG0477 # Protein_GI_number: 15643329 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Thermotoga maritima # 37 423 21 397 459 110 23.0 7e-24 MVSQAYFGDTVMTETDYSRGRKLFIAEGCCANGVVTLTTGAFLSGYAGYLGADDSLNGII GSVPVLLCTLQMFSSVVLENLRQKKFLIACFALIHRLLLSSLFFVPLLVENQAGRLAAVV AIYVVAHFFGAFIGTGTGSWLLSLVPENIRGTYLGRKDAFAFAFTTVLSLVCGRVLDWMR SSHQDLTGYYIIGLIVLSTAFTDFWCLSSIREPVSEPHRRSLKAAVIDPLFDSEYRKIIG LYMFWNLALQVAGPFFSVYMVTGLSLDYTYITFLGLIASSVRVVAACFWGRLADATSWLH 
AARCSMLLLGFVHTSWLFMTPSTCLILQPVLQALSGAAWGGIAIAVFSLQYQYAPADKRV SYVSANSSYAGLCGFFATLLGASLLKILPEFQIGGFPVSGMQMIFALSGVLIILCVRYMR RLQT >gi|229784059|gb|GG667676.1| GENE 21 19081 - 20277 1086 398 aa, chain - ## HITS:1 COG:BH1875 KEGG:ns NR:ns ## COG: BH1875 COG1609 # Protein_GI_number: 15614438 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 24 389 2 363 375 270 40.0 3e-72 MGPMRIMGRDSGRPLSREEGEGGKTKYYILMEELKRDMISGRFQPGDRLPSENELSSSHH VSRHTVRKALSILEQEGFIEAEHGRGTFVSRKPGRGKPLSPALGGTYRKGSGNIAVITTY LSDYIFPRLIQGIDSVLTAEGYSIILKNTGNSRVKESRCLEEILEKDIDGLIVEPSKSEI LCANRPLYEKLDFYQIPYVFIQGYYAQMKNKPHILMDDCLGGYLVTNYLIESGHRRILGI FKADDSQGKERHKGYVKALQEAGCAYDPDMVIWFHTEDRAVKPAKVLELMLESGVTMDGI VCYNDQIALEVIKALQKMKIRIPEDISVTGYDNSFIAENGAVKLTTIAHPQEKLGAMAAE LLLEKIKGVPDEESRVERILKPELIVRESCMKREAGGI >gi|229784059|gb|GG667676.1| GENE 22 20246 - 20935 766 229 aa, chain - ## HITS:1 COG:SPy0179 KEGG:ns NR:ns ## COG: SPy0179 COG0235 # Protein_GI_number: 15674384 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Streptococcus pyogenes M1 GAS # 2 229 5 234 234 304 63.0 9e-83 MLEELKKEVYEANMELPRRGLITYTWGNVSGIDRERGLFVIKPSGVDYGQLRPEDMVVMD LDGNKVEGDLNPSSDTATHVELYKAFGEIGGIVHTHSPMATAWAQAGRGLPCYGTTHADY FYGEIPCARNLTPEEIEEGYEKNTGLVIIETFRGKDPVHVPAVLCKNHGPFTWGKDAAGA VHNAVVLEEIAKMNFVTEMLNRDAEPAPQSMQDKHFMRKHGPNAYYGQR >gi|229784059|gb|GG667676.1| GENE 23 20977 - 22629 1714 550 aa, chain - ## HITS:1 COG:CAC1344 KEGG:ns NR:ns ## COG: CAC1344 COG1070 # Protein_GI_number: 15894623 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 12 544 3 533 534 677 61.0 0 MTSVFKEKEEAMQEEVCRNMINGRTALGIELGSTRIKAVLVDEHQKPVASGSHDWENRLV EGVWTYTLEDIWEGVQDSFAKLKEDVKARYGVTLTTVGSIGISAMMHGYMAFNRAGEILV PFRTWRNTMTEEASEKLTELFQYHIPQRWSIAHLYQAILKKEPHVPEIDYLITLEGYVHW 
KLTGQKVLGIGDCAGMFPIDTGKKDFHERMIAQFDELTAGEQLPWKLRDILPKVLTAGEE AGVLTEEGAKLLDPSGALKPGIPLCPPEGDAGTGMVATNSVAKRTGNVSAGTSVFAMVVL EHELKNVHEELDLVTTPAGDLVAMVHCNNCTSDLNAWVNLFEEFSLAMGMKVDRNQLFST LYNLALEGDPDCGGLLAYNYFSGEHITHFEEGRPAFVRLPDSRFNLANFMRVHLYTSLGA LRVGMDILFKEEGVKLDSLLGHGGLFKTKGVGQRIMAAAAGVPVTVMETAGEGGAWGAAI LASYMRYRRDGGSAALDQYLTDCVFADEEGVVMEPQEEDERGFKAFMERYINGLPIERAA VDCLGGTAGN >gi|229784059|gb|GG667676.1| GENE 24 22662 - 24164 1689 500 aa, chain - ## HITS:1 COG:BH1873 KEGG:ns NR:ns ## COG: BH1873 COG2160 # Protein_GI_number: 15614436 # Func_class: G Carbohydrate transport and metabolism # Function: L-arabinose isomerase # Organism: Bacillus halodurans # 1 500 2 496 497 559 53.0 1e-159 MEMKAYKFWFATGSQDLYGDECLRKVAEHSRIIVEGLNQSGLLPYEVVWKPTLIDNTSIR TLFNEANRDGECAGVITWMHTFSPAKSWILGLQEYRKPLLHLHTQFNQEIPYDTIDMDFM NENQSAHGDREFGHMVTRMGIERKVIVGFWDDKTVQTEIGSWMRTAVGIMESSHIRVVRV ADNMRNVAVTEGDKVEAQMKFGWEIDAYPVNEIADYVKEIPAGDVSALVEEYYDRYDILL EGRDAAEFKRHVAVQAQIELGFEKFLEEKNYQAIVTHFGDLGSLKQLPGLAIQRLMEKGY GFGGEGDWKTAAMVRLMKIMTAGMEGAKGTSFMEDYTYNLVPGKEGILQAHMLEVCPSIA GSRIAVKVCPLSMGDREDPARLVFTAKEGRAVATSLVDLGSRFRLIINTVNCKKVEKPMP KLPVATAFWTPEPNLKTGAEAWILAGGAHHTAFSYDLTAEQMGDWAASMGIEAVFIDEDT TIRGLKNELRWNSVSYRQML >gi|229784059|gb|GG667676.1| GENE 25 24330 - 27395 1356 1021 aa, chain + ## HITS:1 COG:MA1513_1 KEGG:ns NR:ns ## COG: MA1513_1 COG4870 # Protein_GI_number: 20090372 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cysteine protease # Organism: Methanosarcina acetivorans str.C2A # 196 502 178 460 511 101 28.0 1e-20 MKKKFRGFSFKSRRLTAVILLSVFFSSEVLPLTPVLEAMAIPTESRILSSFYDSEDEDLW EKIEAHGDVLLTFPSQIHSTISSPEDTDVFNFTLEDPEDVILTLESSEPCQVVLSDHGVV VEESSLPHDQSIERAGLTAGTYTVTVSPQPGTDYAEYELKIRRQTDKTVMPDYSEAHIAG VTFNKRSPYRWMDLEDDELDRGGSSIEVLNYLAHWQGPVKNSAAPYPNDGSYMETPSNYI KYVDASRTSREYHIPNAYMLPTSDTEGYKEHWKNAIMTYSAIHSGLYFLPPYYTSAEPGY LYLPPELSDELMGGHATTIVGWDDSVPKEKFTVSSGSNAWTPEEDGAWICKDSYGTDPIL TDGTGYYYISYEDAFLGYPGMLAVVYPAGERIDNYSHMYSNSANGSIDLPLEDYTELILS 
QVFHTGHAPEMLRAAGFILSNPDFSYRIYVRIGEDKPFLAKSGYEKYGGFHTERLDTGIL LPAETEFEVIVKILIPEDRQNILRFFFAHNTPGWIDGIKSQKGISYQYELLEDGTWERHD ISENGLYPSIFAYTNAPSMKEKITLTNIREPLSSPANANQTEHMISSGSNAAKRPSLSED GMRTLPNPLSIAAAPLHVNLPSSYDLRDENQVTVPKHQGISSLCWTFAAAAAMESGYLKG GSNMVNYPGGLNLYSEDGAAKNGKIKLALAPGEELPIDFTATLFSDSEYFNPGTDQIYWE IAGDMSSIEAGDILSESGESVHVLTARSAGTVRVTAVSLADHSLRTSCTVEITEKIPAKV HLDRESLTLTQGETCKLQVTVESDEEVTVFFSSDHPSIADVSPEGEILALRPGTAVITAR AGEGYASCTVTVTRRSSSSDSSSGNSYSPAGPGSYSGFIPSTSITSDTTSGTWILNGTGW WLQREDGTYPSSSWERINGKWYYFKESGYIAAGWINQNGFWYYLSEDAPSCGQMATGWIF DPAYGGWFYLKPDGTMVTGWNTIDGKQYYFNPVSDGTKGRMAEDTVLEDGTRVGKDGAKT E >gi|229784059|gb|GG667676.1| GENE 26 27504 - 28199 718 231 aa, chain - ## HITS:1 COG:FN0089 KEGG:ns NR:ns ## COG: FN0089 COG3192 # Protein_GI_number: 19703441 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Fusobacterium nucleatum # 1 213 130 342 360 138 39.0 7e-33 LIEERDRTVFARGIMLGLAAMPAALIAGGAVCGLGFWEILHQNLPVLVLALLLGIGLYRV PDGMVKGFEVFAVLIRAVITAGLVLAAVTYMTGFVVIPGMAPVEEAMAVVSSIGVVLLGS LPVTEFLQRILKRPCTVLGAKIGLDSISVLGLLVSIVSPIPALAMMKDMNEKGKLVNVAY MVSAASMLAAHLGFTVSTEPDMLPVLLISKAAGCTAAVLLGLVLPEADGVG Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:31:09 2011 Seq name: gi|229784058|gb|GG667677.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld70, whole genome shotgun sequence Length of sequence - 29101 bp Number of predicted genes - 30, with homology - 28 Number of transcription units - 11, operones - 7 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 261 114 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) 2 1 Op 2 2/0.000 - CDS 248 - 1531 1028 ## COG0477 Permeases of the major facilitator superfamily 3 1 Op 3 . - CDS 1528 - 2574 807 ## COG1940 Transcriptional regulator/sugar kinase - Prom 2620 - 2679 8.2 + Prom 2563 - 2622 6.5 4 2 Tu 1 . 
+ CDS 2672 - 2848 125 ## gi|266622528|ref|ZP_06115463.1| conserved hypothetical protein + Term 2871 - 2915 5.1 - Term 2865 - 2894 -0.3 5 3 Op 1 . - CDS 3064 - 3321 172 ## gi|266622529|ref|ZP_06115464.1| conserved hypothetical protein 6 3 Op 2 . - CDS 3363 - 3962 546 ## Tthe_0908 hypothetical protein 7 4 Op 1 . - CDS 4355 - 5620 1229 ## COG0673 Predicted dehydrogenases and related proteins 8 4 Op 2 . - CDS 5640 - 7280 1436 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 7451 - 7510 6.1 + Prom 7455 - 7514 7.8 9 5 Op 1 35/0.000 + CDS 7682 - 9043 1716 ## COG1653 ABC-type sugar transport system, periplasmic component 10 5 Op 2 38/0.000 + CDS 9134 - 10012 989 ## COG1175 ABC-type sugar transport systems, permease components 11 5 Op 3 1/0.000 + CDS 10035 - 10865 765 ## COG0395 ABC-type sugar transport system, permease component 12 5 Op 4 . + CDS 10890 - 12479 1071 ## COG3507 Beta-xylosidase 13 5 Op 5 . + CDS 12479 - 13288 977 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 13339 - 13398 5.1 14 6 Tu 1 . + CDS 13490 - 15094 1603 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) + Prom 15960 - 16019 80.4 15 7 Op 1 . + CDS 16058 - 16783 588 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 16803 - 16844 8.1 16 7 Op 2 . + CDS 16869 - 17015 64 ## gi|266622541|ref|ZP_06115476.1| conserved hypothetical protein + Prom 17027 - 17086 2.5 17 8 Op 1 12/0.000 + CDS 17106 - 17918 1117 ## COG3959 Transketolase, N-terminal subunit 18 8 Op 2 . + CDS 17920 - 18852 1029 ## COG3958 Transketolase, C-terminal subunit 19 8 Op 3 1/0.000 + CDS 18947 - 19930 1119 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 20 8 Op 4 . 
+ CDS 19923 - 20885 1175 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 21 8 Op 5 15/0.000 + CDS 20965 - 22020 1447 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 22 8 Op 6 24/0.000 + CDS 22067 - 23599 177 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 23 8 Op 7 26/0.000 + CDS 23596 - 24681 1155 ## COG4603 ABC-type uncharacterized transport system, permease component 24 8 Op 8 . + CDS 24678 - 25607 1013 ## COG1079 Uncharacterized ABC-type transport system, permease component + Term 25706 - 25737 1.1 25 9 Op 1 . + CDS 25755 - 26453 891 ## COG0036 Pentose-5-phosphate-3-epimerase 26 9 Op 2 . + CDS 26482 - 27510 1203 ## COG1609 Transcriptional regulators 27 9 Op 3 . + CDS 27538 - 27975 684 ## COG0698 Ribose 5-phosphate isomerase RpiB 28 9 Op 4 . + CDS 28041 - 28136 65 ## + Term 28161 - 28200 11.1 29 10 Tu 1 . + CDS 28254 - 28358 68 ## + Prom 28533 - 28592 8.5 30 11 Tu 1 . + CDS 28640 - 29099 442 ## Selsp_1788 hypothetical protein Predicted protein(s) >gi|229784058|gb|GG667677.1| GENE 1 3 - 261 114 86 aa, chain - ## HITS:1 COG:PA3200 KEGG:ns NR:ns ## COG: PA3200 COG0613 # Protein_GI_number: 15598396 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Pseudomonas aeruginosa # 3 83 2 81 295 72 43.0 2e-13 MSKIDLHIHSAASDDGEYTPQELVAMCTAQGMELIAVTDHNSVRSVTAAQSTASSCGLRV LSGVELDCTHKGRNFHLLGYGFDHTR >gi|229784058|gb|GG667677.1| GENE 2 248 - 1531 1028 427 aa, chain - ## HITS:1 COG:TM1060 KEGG:ns NR:ns ## COG: TM1060 COG0477 # Protein_GI_number: 15643818 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Thermotoga maritima # 21 321 6 302 436 75 21.0 2e-13 MSCGPAGPGLPGRSGRTCFFIQAGFGTAFTYLTSGVFLSGLAILMGAGDILVSYLSVIVN 
ICGVLILAFPAFLERFTSRKKLTIALTILSRLATLFIVTIPVLFPAGIRLYVFVPTVVAA FALQAQTTVVLNQWMLGFIEEKKSGRYISLRQTLTLTVTVVLSLAGGRFMDLMEGKYAGF ALLFAAAALMGILEVILLAATPDGEPYQSSGRSCRFLDIARLPLKNRCFTGFVAYIFIFY LLLTISDSYTMVFMMKYLALPYQIVTGLYMMISLPQIILLSFWGRLSDRYGHEFTLKMSI WFFAGETLFLSFASAQNWFIFIPAAFLISSAANAGFVISVFNRRYELMPEDNRIVYDNFY TAAIGLGSILGPLTGGAVKGMINARTAAAGAAGFTGIRILYLISTAGILLLQFIYFYSRK ETNHVKD >gi|229784058|gb|GG667677.1| GENE 3 1528 - 2574 807 348 aa, chain - ## HITS:1 COG:lin0520 KEGG:ns NR:ns ## COG: lin0520 COG1940 # Protein_GI_number: 16799595 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 6 246 2 237 334 85 27.0 1e-16 MDTLTARPRLLKQANLSQIRRVIKRRGTATRAEIAGETQISSTTVRSLLAEMMENGEIES IGHDKSSGGRKAERYRFKPDKYYGAAICISGSDMHGLLVGICGEILETVRLEFTGSDYEP AICAFLDELMARKDIKSIGIGVPGIVEGGAFWQGTGGSDELCCYDLGDRLAERYHIPVVM ENDLNATAIGFGRCYAKEFPCESPERTNMAYLHLEKTCVSAGFIVGGRIVRGFRNFAGEL GLIPMEDGRILDEWLMSALDDKDYTDLMIRIISWICGILNPQYVVLGGPDLRTDCIGPIS DGFSALLPKHMTAEILYSADMWHDYHDGMASLTAGKIFDDVQFIKELP >gi|229784058|gb|GG667677.1| GENE 4 2672 - 2848 125 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622528|ref|ZP_06115463.1| ## NR: gi|266622528|ref|ZP_06115463.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 58 1 58 58 95 100.0 2e-18 MTNPFRITVTIHGPAESFRRRSGAWEGQQKVAFLVSSVAGTKAERREKGRIVQTTVNS >gi|229784058|gb|GG667677.1| GENE 5 3064 - 3321 172 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622529|ref|ZP_06115464.1| ## NR: gi|266622529|ref|ZP_06115464.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 85 18 102 102 149 98.0 5e-35 MCNALEELFADKLEQREQLGIERGIERGIERSIIELLSELGPVPDALRERIILQKDVNVL TAWLKLAARSKSIADFQKDTGSSTS >gi|229784058|gb|GG667677.1| GENE 6 3363 - 3962 546 199 aa, chain - ## HITS:1 COG:no KEGG:Tthe_0908 NR:ns ## KEGG: Tthe_0908 # 
Name: not_defined # Def: hypothetical protein # Organism: T.thermosaccharolyticum # Pathway: not_defined # 8 181 4 183 316 87 38.0 3e-16 MEHKPLQWHPAFQAVLQIELEAEKEYLQFHEEFNLTKKPLQIDTLIIKESGRKIEKSFGR IFSRYNIVEYKNPEVSFGIHDFFKVNGYACLYQAAAEREVKILPSEITITLVVNKYPDKL IKFLRETYGSEITEEFPGIYYISKLLFKVQLLIIHQLSPEETIWLSRLRSDLEIQKDIEP LAKAYKGKEQDPVYEAPWI >gi|229784058|gb|GG667677.1| GENE 7 4355 - 5620 1229 421 aa, chain - ## HITS:1 COG:DR1362 KEGG:ns NR:ns ## COG: DR1362 COG0673 # Protein_GI_number: 15806379 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Deinococcus radiodurans # 5 413 7 396 403 248 38.0 1e-65 MKTTVAIAGLGSRGRITYAPFASLHPDKMEITAAADIDPGALRETAKTYHIPEERCFTSA EELLAQPKMADAVFICTQDRDHVSQALTALERGYHILMEKPISPDPSECRRLLEASKKYD RKIVVCHVLRYTPFFSRMKELIASGIIGDVVTIQAIENVGYWHQAHSFVRGNWRNSETTS PMCLQKTCHDFDLYLWLADKTPLRVSSMGNTWLFKESCAPEGSALRCLDGCAAKDSCPFD AEKIYVTNEKTGIACGHTEWPVDVLALHPTPETIYDAIKTGPYGRCVYHCDNNVVDHQVT NIENTDGSTINFTMSGFTSDGASRCCKIMGTKGDLTGDMYKNTIEVGLFGKPRQIIDISQ LAEDFSGHAGGDNRMVEEFLDMLQTGKEPEGITSLEQSINSHLVAFAAEESRLSDGKVIE I >gi|229784058|gb|GG667677.1| GENE 8 5640 - 7280 1436 546 aa, chain - ## HITS:1 COG:BH3678 KEGG:ns NR:ns ## COG: BH3678 COG2972 # Protein_GI_number: 15616240 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 3 541 34 588 605 222 27.0 1e-57 MLVMVIAFLCIGGYLYKNMSGYILQQSSKYIETTVKQAQERFNSSLTEMDSGIQRFCNDL AVQTILNDYNHNASFSSENFSTLRLKTMNTINYADNLESIELYSKKAQIYPYTKEPLTTA IPESALAMADEYNGKAVWLLSDSESEDGHTIMAVKRILLADYAFEHGGYVVVHLKPDLMD SIAQDLNALGDCTLELTDCAGTRVSYGKTLKDMNPDSFDRNYQLADSTSPYSGWTFSIYI PKELTSNDISWMSGIVICGFLIGTGVFLICSLFIARMISKPLSDMRKSMVIADGKLQAND RHYSNSDINELNIHYNALVEKNNRLIEQVFEKELLRTKAEITALQSQVNPHFIINALESV YWSLIQKNDMDNAGILMAMAKLFQYILKESDRITIEQELTFVEQYLQVEKFRFGNRLTWE YQVEPEVRSMQIPKLLIHPLVENSMKYAVETTSSPVHIVVCIAESGDGCVISVTDNGPGI 
DRATMERIQESFKNTNPVGSVSKSYGLANLYKRIKLYFEKSSELTIESEPENHRTTVTMR IHTINS >gi|229784058|gb|GG667677.1| GENE 9 7682 - 9043 1716 453 aa, chain + ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 53 399 29 379 438 195 33.0 1e-49 MKKRAAALFCAAVLGITSLTGCAAGKQETTAAGTETTAAVSGEKPAAEESKSETAGETAA GQEAAEITYWDTVGSVSDREAKLAMIERFEAQNPGITVKYEGMEGEAYKTKIKTVVSANN LPDIFGYWVGERFRTLVNSGNVEDLSSVFEEDTAFRDTFVPGSLDAVSYDGKVYGAPTSI TCMAVWYNKAIFEENGVEVPKTYEELKGVIDTLSDKGVTPIIVGAKDRWPLLGWYSYLAQ RIGGVDLYNEVCTGDKNFTEDAYVQAGEEIKTLAKKGFINGCMSVDATTAEALFAAGKGA ILITGSWSIPTFTEDPERAKDFSYFPFPMVEGGKEDEAGYLYGGVANTLAVSKSAKNKDA VVKFLKYMMSEEEQTVNVERTGTFSTVKVSPKEENMDPLAYQFSQYVSDGVEGFIPYTDQ ALPPEQAENLLNALTAIVAEDNADVKAELAKIK >gi|229784058|gb|GG667677.1| GENE 10 9134 - 10012 989 292 aa, chain + ## HITS:1 COG:BH3681 KEGG:ns NR:ns ## COG: BH3681 COG1175 # Protein_GI_number: 15616243 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 33 267 35 266 293 172 40.0 1e-42 MRKIHPAVRCLFILPAVLWFFAAVIVPVLMAVYYSFFSWDGVTARVFIRFDNYRELFRDP VVWKSLTNSLKMAFWCVLLQLPIGMAFSVILFNRKLKGRGFFRTVYFLPVVLSSSMIGIL WGQIYDPNFGLLNGLLEAVGLKQLQQIWLGDMKLALGCIIAVVVWQFIGNYILIYYTALH NISEDVLESARLDGAGVWKLFRYIELPLMWPVIRLTLILATVNSLKYFDLVYIMSNGGPN HASEVLASYVVKNAFNSMRIGYANTVAVLLLVLGIFFIILYNRILAPSSAGD >gi|229784058|gb|GG667677.1| GENE 11 10035 - 10865 765 276 aa, chain + ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 3 275 27 299 300 201 38.0 1e-51 MNKRSISPGKIILYLILAVQAVVTMYPLIWMIVSSLKDNVSFFADPWSIPKNPQFVNYVT AWKEGIQDYMVNSIVITIATLFIVIILSCAFSFMVARFPFKGAGLLVGMFFAGMMIPVHC 
TLVPLYSMMNGLGWLDSLWALLFPYVASGLPLAIFLTYGHYQQIPMELEEAARMDGCGVI RMFLYIFLPLAKPVISTIGILTAISVWNEFIFANIIISDPVKKTLPVGLLALKGTYNTNY AAVSAALTISAIPIIIVYILMSGKIQAGMVSGAVKS >gi|229784058|gb|GG667677.1| GENE 12 10890 - 12479 1071 529 aa, chain + ## HITS:1 COG:BH3683 KEGG:ns NR:ns ## COG: BH3683 COG3507 # Protein_GI_number: 15616245 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus halodurans # 1 520 1 524 528 499 47.0 1e-141 MKNCLENPVLPGFHPDPSVIRVGEDYYMAVSTFEWFPGVCIYHSTDLQNWELAARPLNRV SQLDMKGNDASGGVFAPCLSYADGVFYLVYTNVRTHWYDFMDCHNYLVTAADIRGEWSEP VYLHSLGFDPSLFHDEDGRKWVVSIVRHYRKNRVQRIALQEYDPVLKKMTGPLREIFEGS GLTAPEGPHLYRRNGCYYIMTAEGGTEYGHAVTMGRSEKIGGPYELSPYNPVLTARHNMT LPLQKAGHADLVEAPDGEWYMVHLCGRPLPPFRRCILGRETAVQHVKWTEDGWLCLADGG NEPRLRVPVSWPVERKREQEIRYSFQETELSPDFFSLRVPLGESVSLTERPGYLRLTGKD SPNSRFEQSLIARRQEDFDFSFETELEFEPSSERQMAGILYFYDECDFYYLRVTYSQDGQ KIVDVMRMNRGKAEYTPDCGIPVPEGRSIRLRLEVHRTEACFSYSLEEGVWQQAGDTLDA SLLSDEYPEEGAYTGAMVGLGCHDMAYRSAKAYFKYAVYRREGNGMERG >gi|229784058|gb|GG667677.1| GENE 13 12479 - 13288 977 269 aa, chain + ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 15 264 5 254 257 159 35.0 6e-39 METDGEGETMERTYTLMLVEDERNILYGMKNAILCHNALVSEILTAENGREALELLEERV PDIIITDIRMPEIDGTELVRRIREQGYEMPVIILTALTDFAIARDLIHYQIQNYIVKPFS IEEILKETEVAIGELKKRSQMKMAQKIVKEFPELVEIPPSSENLLISQAVEYIGSHLAGA ASLNDISGALHVSKAYLSTLFKREMNTTVTDFVTKQRMKEAKKLLLETDMLVSEIYLKVG YQSDKYFIKVFKELEGITPLAYRKKWKAE >gi|229784058|gb|GG667677.1| GENE 14 13490 - 15094 1603 534 aa, chain + ## HITS:1 COG:VC2738 KEGG:ns NR:ns ## COG: VC2738 COG1866 # Protein_GI_number: 15642731 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Vibrio cholerae # 2 534 10 541 542 806 72.0 0 
MAEIDLSQYGINGTTEIVYNPSFETLFEEETKPELEGYEKGQVSELGAVNVMTGIYTGRS PKDKYIVMDENSKDTVWWTSDEYKNDNHPASQEAWDTVKKIALEELSGKRLFVVDAFCGA NKDTRMAIRFIVEVAWQAHFVKNMFIQPTAEELENFTPDFVVYNASKAKVENYKELGLNS ETTSMFNITSREQVIVNTWYGGEMKKGMFSMMNYYLPLKGIASMHCSANTDLNGENTAIF FGLSGTGKTTLSTDPKRLLIGDDEHGWDDNGVFNFEGGCYAKVINLDKESEPDIYNAIKR NALLENVTLDKDGKIDFDDKSVTENTRVSYPIDHIEKIVRPVSAAPAAKDVIFLSADAFG VLPPVSILTPEQTQYYFLSGFTAKLAGTERGITEPTPTFSACFGQAFLELHPTKYAEELV KKMEKSGAKAYLVNTGWNGTGKRISIKDTRGIIDAILDGSIASAPTKQLPYFNFEIPTEL PGVDPKILDPRDTYAEVSEWETKAKDLSQRFIKNFAKYEGNEAGKALVSAGPQL >gi|229784058|gb|GG667677.1| GENE 15 16058 - 16783 588 241 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 4 232 18 247 256 118 31.0 1e-26 MKRKQISGKTMEMLKRLKEKGVMICIATGRAPSALPAFGDIAFDVYLTFNGSYCYNRDQT IFSNPLLTDDVDTIIKNAETIHRPLSLATKDRQAANGKDDDLVEYYGFANREVEVADDFE EVAKEEVYQIMLGCRERDHLSLLNGVHRAKIAAWWDRAVDIIPADGGKGIGIEKVLEYYH LNQSEAMAFGDGNNDIEMFQAVGKGIAMANASDRLKEAAYGLCGDAAEDGIYHYCMEHGL I >gi|229784058|gb|GG667677.1| GENE 16 16869 - 17015 64 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622541|ref|ZP_06115476.1| ## NR: gi|266622541|ref|ZP_06115476.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 48 1 48 48 67 100.0 3e-10 MFKEKLHFIGWDKNNIKMKLYRMHKNTLEYVLRNIILLSKFYKKNIAY >gi|229784058|gb|GG667677.1| GENE 17 17106 - 17918 1117 270 aa, chain + ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 1 266 1 269 270 270 51.0 2e-72 MKSYEAMEEEALAVRRDIIEMLYQAGSGHPGGSLSVTDILVCLYNRIRVDEKQPDMEDRD RVVLSKGHAAPALYAVLAEHGFFSKDAFKRLRKLGGMLQGHPDMNKTPGVDVCTGSLGLG 
VSTACGIALAGKVRKKDYHVYAILGDGELQEGIVWEAMTAAVHHKLDNMTVIVDLNGLQI DGPVEKVMGLGNLKGKFLDYGFKVVEADGHDFAALDAAIAVRENGVPVCILAKTVKGKGV SFMENQAGWHGQKVSGEDYKNAMEELKGGR >gi|229784058|gb|GG667677.1| GENE 18 17920 - 18852 1029 310 aa, chain + ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 1 300 1 301 309 290 50.0 2e-78 MAEPAMREAYGDALLEMGEDERVVVLDADLAKATSSGKFAAKYPERFFDMGIAEQNLMGT AAGMAISGLKPFASTFALFAAGRAYEPIRNAVCYAKAPVKIVATHAGLSPNSDGGSHETI EDIALMRVLPGMTVLSPCDYRQAFDMVLQMKDMDHPAYIRMSRHPVETVTAEGSHTEIGK IDVLREGGDVCFAATGVMVAEALHAAEALEEKGIHAAVLNVHTIKPLDRETLIRYGKSCK RMVTAEEHSVIGGLGSAVAEVLAECGGCRMKRVGIQDKFGQSGDLGELMESYGLTAENLM ETALLLCHST >gi|229784058|gb|GG667677.1| GENE 19 18947 - 19930 1119 327 aa, chain + ## HITS:1 COG:ECs0690 KEGG:ns NR:ns ## COG: ECs0690 COG1957 # Protein_GI_number: 15829944 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli O157:H7 # 4 319 3 311 311 225 41.0 7e-59 MKKIPVIIDCDTGIDDMISFAVTLTSDKLNILGITTVAGNQTVDITTQNTLNGVAIMGRT DIPVAKGAEKPLSRPLQDAGYIHGETGLGTYQFAEKTVKQPETEDAVELMRRLLAASEEK VVILAIAPLTNVARLLMDHPEAGEKIDKIVFMGGSLRTGNPTPVSTFNVLADPEAARYVM KSGVPFHMCPLDTTREVYLTADDIEAIGAIGNPVARMNYEMCRFYNDAVNKNNNADKRFK GLCIHDLCAAVYVTDPQLFTTVKYQGDVETEGELTLGFTVLDYENIRQVPEEERNVTFIS SVDRDGVLKIYFDALKCFGEQGEEIHG >gi|229784058|gb|GG667677.1| GENE 20 19923 - 20885 1175 320 aa, chain + ## HITS:1 COG:ybeK KEGG:ns NR:ns ## COG: ybeK COG1957 # Protein_GI_number: 16128634 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli K12 # 4 319 3 311 311 214 37.0 1e-55 MDKIPVIIDCDTGIDDMISFVLTLASKQLDIRGVTTVAGNVSLPYTTYNTVNGLAFMGRG EIPVAAGEAAPLERPYRDASEIHGDSGLGAFTFEHPVDYGPVETGGVAFLYEKLMESEEK ITILALAPLTNIAKLFLEHPDCKEKIEKIVFMGGSIYSGNPTPVATFNVWVDPEAARIVM 
KGGVPFYMCPLDTTREAYLTEEELEIVGTIDNPVADMVYTMLSFYWNQEKKNVNGHKRFK GLCIHDLCTAAYVTNPELFHSAKYYGDVETKGPLTEGFTMIDCEDILRKSEDEKNIDYID SVDRDGVIKIFFEALNSYGA >gi|229784058|gb|GG667677.1| GENE 21 20965 - 22020 1447 351 aa, chain + ## HITS:1 COG:CAC0702 KEGG:ns NR:ns ## COG: CAC0702 COG1744 # Protein_GI_number: 15893990 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Clostridium acetobutylicum # 40 349 37 351 357 179 34.0 8e-45 MKKIMAAVLAGVMAFSLTACGGGAAKDAGGGEKAEGTGGKTYKVALMADGAGFGSQSFND VALEGLEKAKADFGIELTTLEVKEVADFANSLRSLAQQGCDLIITPSSTIKDAVTEVAAE YPDTFFGLLDVQVEGVPNVVSSSYREHEAAFLLGALGGYLTKTNQIGFIGGVSGVIQDRF QYGYMAGAWYSNPEVKVTSSYTGSFSDVGKGKEIATMMFSDGCDYVAPSAGACNLGVFQA SKEAGADKWSFGAANGQFNQMPDKIVASQVKRVDNVAYSMVEKMVNGTLKGEQSEYGLKD GGVDLMYSTEDAMKDAVPQDIKDKIEEIRKQIVDGTIKVPASQDEYNAYVK >gi|229784058|gb|GG667677.1| GENE 22 22067 - 23599 177 510 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 264 481 13 214 311 72 25 2e-12 MEYAIELKELTKRFNQFTANDKISFQVEKGEIHALAGENGAGKTTLMNMIYGLYTPTSGE LWINGEQVKFTSSRDSIARGIGMVHQHFMLVPKLTVAENIIAGQETGTALKLDRKTAEKQ IAELSDQYGLKIDPSVKVGDLSVTMQQRVEILKVLYRRAEILIFDEPTAVLTPQEIDEFC DILLKLKQQGKTILFISHKLNEVMKISDHVTVIRLGKVIGTVKTSETTPEELTRMMVGRD ISLGGGAREEIKSPKEILKIDHVSYKNTKNIKKVDDLSLSVKAGEILGIAGVDGNGQEEL VEMICGLRTMDEGDILLDGESIKHHPTGMVQDMGIGYIPEDRHKDGLVLDFSIAENVILG QHRKPLFAKHGLIMKKKEIGETAEHLREQYDIRCSGIDSAASTLSGGNQQKIVIARAIFR GPKAILAVQPTRGLDVGAIEYIHKALVEQRNQGKAVLLLSLELDEILSLSDRIAVIHKGK IVGIVDAHKTTREELGLMMLGHVTGKEEEA >gi|229784058|gb|GG667677.1| GENE 23 23596 - 24681 1155 361 aa, chain + ## HITS:1 COG:TM0104 KEGG:ns NR:ns ## COG: TM0104 COG4603 # Protein_GI_number: 15642879 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Thermotoga maritima # 14 357 1 344 344 206 40.0 5e-53 
MTKEKKNEKKRDGLNGVLMQVAISIISIVFAFLVGGIIIAAMGENPFKAYSVLIKGAVGT PVALTVSLTKTVPLILTGLAVAVAQKSRVFNIGAEGQLLVGAFAAGYVGFTFHLPAPLHI LLCLAAAVLAGMFWAFIAAFLKYTRNVHVVIATIMLNYIATSLVQYFVCGPFKEPGATFN ATSAIAGTAKLPQILPRPLSLNLGFIIALVMIAACWFLLNKTTTGYEMRAVGSNPDASRV GGINIKKNMFMALVISGALAGLAGGIEVSGSLYRVMEGFSPGYGFNGIPVAMMANGNPIA IFFSAFLLGAMRNGALAMQIQLGISQDIVDIIQGLVIIFLGCDYMIRYYINHVRLERRKL A >gi|229784058|gb|GG667677.1| GENE 24 24678 - 25607 1013 309 aa, chain + ## HITS:1 COG:TM0105 KEGG:ns NR:ns ## COG: TM0105 COG1079 # Protein_GI_number: 15642880 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Thermotoga maritima # 12 304 20 317 319 228 43.0 1e-59 MTGMISILLLSTLRMGTPLALTALGGTMSERSGVTNIGLEGILTAGAFAAVVGSYFTGNP WIGVLCAIVMGIAISAIHSFIAVTAGGAQNISAMALVMLANGFSAVGLNAIFKSAGNSAQ VNALPTTPILAHIPVVGEFLAKLSPFVYIAFIMLFVVWYLLKHTPLGLRITMVGEQPKAA ETAGISVAKIRYFSVLASGFLGGLGGAYLSLGQMNLYQDGMVSGRGYLALGAVIMGKWNP VGAFFSAMFFGLFEAIQIYVQMIPNFPVPHEFIEMIPYVASLIVLAGFIGKAEGPAANGK PYSRFVNLR >gi|229784058|gb|GG667677.1| GENE 25 25755 - 26453 891 232 aa, chain + ## HITS:1 COG:lin2808 KEGG:ns NR:ns ## COG: lin2808 COG0036 # Protein_GI_number: 16801869 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Listeria innocua # 8 202 4 198 214 165 43.0 7e-41 MRQRDLIVNPTLACGNPLNFKEDLDILEKTGIKILHIDIMDGHYVPNLCFDTGMVGLIRK NYSFLLDVHLMVTNPEDYIQPLAAAGADYVSFHLDATHFPIRMLNQIRENGMKGGVVLNP AQSPDLLAETLPYLDFILVMGVEPGFSGQKFLEGTVDKVGKLDSMRKELGLSYLIEADGG VDRENTVTLLEHGADTLVSGAFGVFEGRNGLEQDCREYRELALKAAGERLEA >gi|229784058|gb|GG667677.1| GENE 26 26482 - 27510 1203 342 aa, chain + ## HITS:1 COG:VC2677 KEGG:ns NR:ns ## COG: VC2677 COG1609 # Protein_GI_number: 15642672 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 1 342 1 335 335 183 33.0 4e-46 MATIKDIARLAGVSTATVSMVLNGKDCITKKTKEKVMAAAREVDYIPSFAAKSLKTKRSY 
TIALFVGDIANPFFPEIIKGVEAAAKDRNYSVIIYDLSGQDQDFVEQIDKAASQKVDGFF ITGSTQISEEAKKRLLMIADSGIKMISCNRFMEWGEFPLIYTSEGEQVDGLLSKLAALGH KHIGCISGYPEYWVAERREGFYRKTLKQYGLYHPEYIVNGGFFIEDGIRSTRMLLKEQPQ ITAVMCVNDTLAIGCNAAARELGLKVPEDLSIFGVDGVECLKYFAPQITTVDTHRYEYGY KGTVRLIELIEDDGSRRQELLETMVCPCSIRQGDTIAPPRKE >gi|229784058|gb|GG667677.1| GENE 27 27538 - 27975 684 145 aa, chain + ## HITS:1 COG:lin2811 KEGG:ns NR:ns ## COG: lin2811 COG0698 # Protein_GI_number: 16801872 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Listeria innocua # 1 145 1 145 146 144 52.0 4e-35 MRIGIGCDHGGYEMKEELKKRLTGGGHAVTDKGIYSVSEVDYPDVAAEVAEGVVAGEYDC AILVCGTGIGMSIAANKVDGIRAALVTDTFSARMAKEHNNANIITLGARTVGIELAWELV NAYLGAEFLGGKHLVRVEKIMGMEK >gi|229784058|gb|GG667677.1| GENE 28 28041 - 28136 65 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLRIQYEIDIVTCEGSNIEQVLLKEQAVDN >gi|229784058|gb|GG667677.1| GENE 29 28254 - 28358 68 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLKKYDYMLLFHDDFPGWLYEVHMGVTTIWERWN >gi|229784058|gb|GG667677.1| GENE 30 28640 - 29099 442 153 aa, chain + ## HITS:1 COG:no KEGG:Selsp_1788 NR:ns ## KEGG: Selsp_1788 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 1 153 1 146 325 121 43.0 7e-27 MKRITLDKLTGKLEKASYQTQYDHIMSLIREGRIKPVAASGKNGKKPALYLEYWITEPEE EQKERLERLRLLKEELEYGLSPAIGIDYYVSHPEVYEAERSWVLMMDRYLKNCREKLRTE ESMNERSFEIWGREKFLKEEQGKTVCRHMGISM Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:31:51 2011 Seq name: gi|229784057|gb|GG667678.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld71, whole genome shotgun sequence Length of sequence - 35675 bp Number of predicted genes - 39, with homology - 37 Number of transcription units - 20, operones - 9 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 625 487 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 711 - 770 4.4 - Term 1124 - 1172 1.3 2 2 Tu 1 . 
- CDS 1253 - 1984 559 ## COG0730 Predicted permeases 3 3 Op 1 . - CDS 2100 - 2225 74 ## gi|288870848|ref|ZP_06409929.1| conserved hypothetical protein 4 3 Op 2 . - CDS 2265 - 3062 733 ## COG0730 Predicted permeases 5 3 Op 3 . - CDS 3118 - 4278 1131 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 6 4 Op 1 . - CDS 4461 - 5954 1458 ## COG3875 Uncharacterized conserved protein 7 4 Op 2 . - CDS 5985 - 6359 394 ## COG0149 Triosephosphate isomerase 8 4 Op 3 . - CDS 6302 - 6820 542 ## SpiBuddy_0574 triose-phosphate isomerase (EC:5.3.1.1) 9 4 Op 4 . - CDS 6870 - 8057 1302 ## COG0205 6-phosphofructokinase - Prom 8212 - 8271 5.5 + Prom 8230 - 8289 9.0 10 5 Tu 1 . + CDS 8397 - 8483 106 ## - Term 8541 - 8576 -0.1 11 6 Op 1 2/0.000 - CDS 8639 - 9670 936 ## COG0673 Predicted dehydrogenases and related proteins 12 6 Op 2 . - CDS 9675 - 10892 1129 ## COG1940 Transcriptional regulator/sugar kinase 13 6 Op 3 1/0.333 - CDS 10959 - 11231 305 ## COG0395 ABC-type sugar transport system, permease component - Prom 11280 - 11339 80.4 14 7 Op 1 38/0.000 - CDS 12183 - 12707 409 ## COG0395 ABC-type sugar transport system, permease component 15 7 Op 2 35/0.000 - CDS 12721 - 13650 633 ## COG1175 ABC-type sugar transport systems, permease components 16 7 Op 3 2/0.000 - CDS 13667 - 15049 1253 ## COG1653 ABC-type sugar transport system, periplasmic component 17 7 Op 4 16/0.000 - CDS 15076 - 15927 697 ## COG1082 Sugar phosphate isomerases/epimerases 18 7 Op 5 . - CDS 15942 - 17009 680 ## COG0673 Predicted dehydrogenases and related proteins 19 7 Op 6 . - CDS 16996 - 18282 938 ## COG0166 Glucose-6-phosphate isomerase - Prom 18372 - 18431 11.3 20 8 Tu 1 . - CDS 18539 - 19690 1487 ## COG0191 Fructose/tagatose bisphosphate aldolase 21 9 Tu 1 . 
- CDS 19815 - 20636 956 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Term 20644 - 20698 5.3 22 10 Op 1 17/0.000 - CDS 20738 - 21826 1297 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 23 10 Op 2 24/0.000 - CDS 21898 - 22689 970 ## COG1116 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component 24 10 Op 3 . - CDS 22665 - 23468 800 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component - Prom 23520 - 23579 9.1 + Prom 23612 - 23671 6.5 25 11 Op 1 1/0.333 + CDS 23711 - 24775 1091 ## COG1609 Transcriptional regulators 26 11 Op 2 . + CDS 24809 - 25288 505 ## COG2017 Galactose mutarotase and related enzymes 27 12 Tu 1 . + CDS 26209 - 26580 235 ## COG2017 Galactose mutarotase and related enzymes - Term 26575 - 26626 4.0 28 13 Tu 1 . - CDS 26694 - 26984 302 ## Closa_3067 response regulator receiver protein - Prom 27129 - 27188 80.4 29 14 Tu 1 . - CDS 28032 - 28436 460 ## gi|266622580|ref|ZP_06115515.1| conserved hypothetical protein - Prom 28525 - 28584 5.0 + Prom 28491 - 28550 6.6 30 15 Tu 1 . + CDS 28606 - 28788 187 ## Closa_3068 hypothetical protein + Term 28807 - 28853 8.1 31 16 Op 1 . - CDS 28749 - 28907 152 ## 32 16 Op 2 10/0.000 - CDS 28873 - 29268 497 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 29301 - 29360 80.4 33 17 Op 1 3/0.000 - CDS 30208 - 30645 717 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 34 17 Op 2 . - CDS 30679 - 31734 1179 ## COG1312 D-mannonate dehydratase - Prom 31799 - 31858 4.4 + Prom 31831 - 31890 6.3 35 18 Tu 1 . + CDS 31918 - 32796 772 ## COG2207 AraC-type DNA-binding domain-containing proteins 36 19 Op 1 . 
- CDS 33864 - 34094 317 ## gi|266622587|ref|ZP_06115522.1| hypothetical protein CLOSTHATH_03819 37 19 Op 2 8/0.000 - CDS 34091 - 34426 220 ## COG1687 Predicted branched-chain amino acid permeases (azaleucine resistance) 38 19 Op 3 . - CDS 34423 - 35127 831 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) - Prom 35166 - 35225 5.3 + Prom 35143 - 35202 6.2 39 20 Tu 1 . + CDS 35265 - 35673 221 ## COG1313 Uncharacterized Fe-S protein PflX, homolog of pyruvate formate lyase activating proteins Predicted protein(s) >gi|229784057|gb|GG667678.1| GENE 1 1 - 625 487 208 aa, chain - ## HITS:1 COG:PH0753 KEGG:ns NR:ns ## COG: PH0753 COG1653 # Protein_GI_number: 14590623 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pyrococcus horikoshii # 59 208 65 210 464 80 34.0 2e-15 MKKRRWMALALACTMIFTAGCQKGGKETAEVPGSQETKVSETEMAKAEDSKAENSVSQEE PVTIKFANYAVLEEGNSAFWEKIKTDFETENPNIKIEWITAPFGEMVQQVINMAGGGEYV DLIFGELDWTPTLADSGIACPVSDIMPQEYLEDFYPNVLEAHSIDHVLYSLPLYVGPYIL YINKDLFEQAGLDASNPPTTYEEMLECA >gi|229784057|gb|GG667678.1| GENE 2 1253 - 1984 559 243 aa, chain - ## HITS:1 COG:ECs4801 KEGG:ns NR:ns ## COG: ECs4801 COG0730 # Protein_GI_number: 15834055 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 4 237 19 251 291 58 25.0 8e-09 MEQVLIVVFSSFCGGLVQAVTGFGGAVIIMIFLPLILNMTAAPALSDVITMTLSFSMFWR YRKSVRFKSIVIPAAVYLISSTLAIHGSAFIDAGKLKGVFGVFLIILSVYFFGFSGKLSV KPTLFMKFGCGALAGICGGLFGISGPPVSLFYLAATDTKEEYLGTLNAFFSFTVIFNLIS RIYNGFLTVSLVPLMGVGIAAILTGCVLGSRIVKKINIDVMRKCVYGFMAFAGMVIVVQS ILR >gi|229784057|gb|GG667678.1| GENE 3 2100 - 2225 74 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870848|ref|ZP_06409929.1| ## NR: gi|288870848|ref|ZP_06409929.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 41 2 42 42 74 97.0 2e-12 
MCSHISKFTFGFFHRAKPVTEVKRSGTEVHCGLAGSGAELN >gi|229784057|gb|GG667678.1| GENE 4 2265 - 3062 733 265 aa, chain - ## HITS:1 COG:lin0630 KEGG:ns NR:ns ## COG: lin0630 COG0730 # Protein_GI_number: 16799705 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Listeria innocua # 7 260 4 246 246 70 28.0 3e-12 MSGYLYILFWIVSFGASIAGAICGIGGGVIIKPTLDAFGVLSVSTISFLSGCTVLAMTCY SVIKGKMSGESLIDMKTGTPLAIGAAIGGVVGKSMFQALSNMFANKDMVGAVQAGCLLII TLGTLIYTLKKDKIKTLHVTNAVVCIIIGVVLGIFSSFLGIGGGPINLVVLFYFFSMETK TAAQNSLYIILFSQITSLINSLVTRTVPEFTIWLLALMVIGGILGGMSGRSINKKIDSKV VDKLFIFLMIVIIFINIYNIYQFAR >gi|229784057|gb|GG667678.1| GENE 5 3118 - 4278 1131 386 aa, chain - ## HITS:1 COG:FN1133 KEGG:ns NR:ns ## COG: FN1133 COG1820 # Protein_GI_number: 19704468 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Fusobacterium nucleatum # 40 378 45 380 386 254 40.0 1e-67 MKTLIKNASVYHNHRFEALDLLLEDGRISRMGGCEGERCDEVIDAGGKRVVPGFIDIHTH GAVGVDVNGADADGFEKICRFFASQGTTSWLCSVLTDTKEQTMHAIQMYKAWKNTEHHGA NLMGIHLEGPFLCPAYKGAMPEHLLKKPDMELLKAYQEAAEGEVRYITVSPEVEGIVDFI PYIKSLGIQVAIGHSGADYDTARRAIQNGALGATHTGNAMKLLHQHFPAIWGAVLEDDEV YCEMICDGRHLHPGTVRFIIKIKGLDRVIAVTDSIMAAGLPDGNYKLGVNDVVVVDGDAK LVSDGTRAGSTLTTGKALKNLLEFTGRSLTDILPMLTENPARLIGVYDRVGSIEPGKDAD LVFLDEDCSVVRTFVKGKECQFSIAE >gi|229784057|gb|GG667678.1| GENE 6 4461 - 5954 1458 497 aa, chain - ## HITS:1 COG:MA1313 KEGG:ns NR:ns ## COG: MA1313 COG3875 # Protein_GI_number: 20090175 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 2 259 56 290 468 98 29.0 3e-20 MKLSFEYGAGLMAAELPDTTDIFIPGETVPDPACIPEDRLIEETLKSIRNPMGMEPLSEL AHKGSKVTIIFPDRVKGGEQPTSHRKISIKLILDELYAAGVEKKDILLICSNGLHRKNTE QEIHNILGDELFHEFWHTHQIINHDSEDYDHLVDLGTTDRGDPVLMNKYVYDSDVAILIG HTQGNPYGGYSGGYKHCATGITHWRSIASHHVPEVMHRKDFTPVSGKSLMRTKFDEIGQH MEKCMGKKFFCCDAVLDTKSRQIEINSGYAKVMQPHSWITADKRTYVPWAEKKYDVMVFG 
MPQFFHYGEGMGTNPIMLLQAISAQVIRHKRIMSDNCVIICSSLCNGYFHDELWPYTREM YEMFQHDFMNTLPDMNRYGEYFATNEEYIRKYRYCNAFHPFHGFSMISCGHIAEMNTSAI YLCGAQDPGYARGMGLKTRATIEEALEDAKKKFVGPNPNILALPQTFKLGAVHLMMKDDV YEGKGAEDCACAAHMHH >gi|229784057|gb|GG667678.1| GENE 7 5985 - 6359 394 124 aa, chain - ## HITS:1 COG:NMB1887 KEGG:ns NR:ns ## COG: NMB1887 COG0149 # Protein_GI_number: 15677722 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Neisseria meningitidis MC58 # 2 99 134 238 257 75 41.0 2e-14 MLYCIGESSEEQERWKEVLKEQIDIGLEGVEKDLVTIAYEPVWAIGPGKIPPDKDYITMI GTYIKEITEGCDVVYGGGLKVDNAEMLASVPVMDGGLIALTRFQGEIGFYPEEYLEIIKT YMGH >gi|229784057|gb|GG667678.1| GENE 8 6302 - 6820 542 172 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_0574 NR:ns ## KEGG: SpiBuddy_0574 # Name: not_defined # Def: triose-phosphate isomerase (EC:5.3.1.1) # Organism: Spirochaeta_Buddy # Pathway: not_defined # 2 155 3 163 286 130 42.0 2e-29 MKHIFLNLKRFDIPKEYGGVNSIAPIGEWGSFIVSNTQEQLKKYTGEDVEFAMYFPEAHL IPAVGALCENSPVKVGCQGVYRDDTAVGGGFGAFTTNRTANAAKAIGCTTTIIGHCEERK DKAGILAEAGVTDTKAVNRLLNQEIKRLFLPVFLCYTVSESLLKSRRDGKRY >gi|229784057|gb|GG667678.1| GENE 9 6870 - 8057 1302 395 aa, chain - ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 3 382 14 398 427 200 35.0 3e-51 MAENVLIVHGGGPTAVINSSLYGVIEEAKKIGKIDKVYAAIGGSEGILKARFLNLLEFPE EKLKLLLETPATAIGSSRYALEQEDYEAMVGIFKKHEIKYVLLNGGNGTMDTCGRIYEVC KDEDIRVVGIPKTIDNDIAITDHTPGFGSAARYIAATTAEVGVDVKSLPIHVCIIEAMGR NAGWITAASALARKKPGDAPHLIYLPERPFNEAEFLEDVKRLHEEKGGVVVVVSEGLKNE AGEPIVPPIFKTERATYYGDVSAYLAELVIKKLGIKARSEKPGICGRASIAWQSPVDRDE AVLAGSEALKAAVAGQSGVMVGFIRDEEAEKIDGVYRMHTEMIPIKEVMMYERTIPESYL NERGNDVTEEFIRWCRPLIGPELRDFFDFNITSEN >gi|229784057|gb|GG667678.1| GENE 10 8397 - 8483 106 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKHFDFKDLMAFGMFILALLTFMYLICH >gi|229784057|gb|GG667678.1| GENE 11 8639 
- 9670 936 343 aa, chain - ## HITS:1 COG:SSO3049 KEGG:ns NR:ns ## COG: SSO3049 COG0673 # Protein_GI_number: 15899754 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Sulfolobus solfataricus # 2 330 18 355 371 157 32.0 2e-38 MEKIKAGIIGCGRISSVYKAAFENLRDEVELVLAVDKELSRAEKLAAAFPGCGYSDKLED LLKQPLSVVHVLTPHFLHKEHGIACLEAGFHVLTEKPIAITAEDADAMIRAAKKNKRQLG VIFQNRYIEGVQEVKRLIAEGKFGRITGAFSTLNWWRPPSYYECDWKGSWEKEGGGVVID QAIHSLDLVRYMMGCEPVKVKGQIDRRILTNIEVEDVADAAITFENGAVYSFFACNYYTS NSPIRVEISGENGTALLTQDEVVIRLKGQEPRIVSPSVRSNVNGEAYWGNYHEIQIRDFY RCLRAGKPVPVDPEDAKRTLELVLDIYRSSKEGREIEKYSDKA >gi|229784057|gb|GG667678.1| GENE 12 9675 - 10892 1129 405 aa, chain - ## HITS:1 COG:BH2758 KEGG:ns NR:ns ## COG: BH2758 COG1940 # Protein_GI_number: 15615321 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Bacillus halodurans # 11 403 10 377 386 152 28.0 1e-36 MTARGTNLPRVKKQNEALIKEIIYKYGPISRSKIAEMLSLTPPTITTNVTSLIERGLVYE FAADDSDVEERSLGRKPVKINFVQEAKYAVGVEMNPYHTAICMLDLRGNEKISMRYPPMN RTYVEEMDVLARNINKLIQDSGTEPEKILGVGVGLPGFVEPDVGVLRESFKTEWNGKNIA RDLSRRLDLPVLIENNARVRAIGEEMFGKTLRPDSFAYYLISYGIACPLFVKNRMITGDN ASAGEAGHMVVDINGPKCETCGHNGCLEELASERAIIKKCKAAARAGVSTMLTELCPDIE GLTMKDVLMAEECNDAYVVKVMEEAVMYLGVTLANIINLISPPLVIVDGYMMKVKRNRRQ LLEETKKHIFGLNDQEIEIEFIDFNPLTGARGGAALAIKQLFIKE >gi|229784057|gb|GG667678.1| GENE 13 10959 - 11231 305 90 aa, chain - ## HITS:1 COG:SMb20969 KEGG:ns NR:ns ## COG: SMb20969 COG0395 # Protein_GI_number: 16264842 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 2 90 182 270 270 76 39.0 1e-14 MLPLVVPGVLTVALFRFMNSWNDFLGPLLYLSDEKRYTLSIGLQMFTTQYKTEWSLLMAT ALMITLPVIVLYFIVQKRFIEGITFSGIKG >gi|229784057|gb|GG667678.1| GENE 14 12183 - 12707 409 174 aa, chain - ## HITS:1 COG:SMb20969 KEGG:ns NR:ns ## COG: SMb20969 COG0395 # Protein_GI_number: 
16264842 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 4 174 1 171 270 123 36.0 2e-28 MKHLKKLPIHLILIGIGFFFLLPFIWMLSTALKTDQQVLINPPIWIPRPVMWENFVKAIN YIPFFRYMGNTSLIAVLDVLGTVIACPLVAYGLSRLEWRGRDLLFFITIAVMMIPQEVTM VPTFILFNKIGLTGTYVPLFIQSFFGRPFMIFLMRQFFMNLPHDLEDAARVDGS >gi|229784057|gb|GG667678.1| GENE 15 12721 - 13650 633 309 aa, chain - ## HITS:1 COG:SMc04136 KEGG:ns NR:ns ## COG: SMc04136 COG1175 # Protein_GI_number: 15963869 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 27 309 32 315 315 209 41.0 5e-54 MSSGSNVRPGHKVVTKGSLEARKARMGVLFTAPWIIGLLLFYAYPLLSSIYYSFTSYSVL NSGNFVGLENYRELIKDNLFWKSIWNTLYFTVLSVPVNIILGIIIALLLNSKIRFIGLYR TVFFIPTLVPVVATATVWRWLLDSNYGIINSLLNKVGLASIPWLASENWSKLALVMIASW GIGQAIIINLAGLQDISPEYYEAAQIDGAGIFQRVRHITLPLLTPVIFYNVVMGVINALQ TFTLPYTLTNGEGTPVNSLTFYVMHLYNNAFGYMRMGYASAMAWILFLIILILTIVIFGT SKKWVHYQE >gi|229784057|gb|GG667678.1| GENE 16 13667 - 15049 1253 460 aa, chain - ## HITS:1 COG:lin0762 KEGG:ns NR:ns ## COG: lin0762 COG1653 # Protein_GI_number: 16799836 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Listeria innocua # 45 417 18 371 417 128 27.0 2e-29 MKNKRLLAASMAFILSAASLTGCSGGNNTTNGSSSGETASKETNAESAGTTEAEAPKGEK TVLTVWHTWGAGPGTDAMEKVVEMYNQSNDKNVEINLQFVANKASGNTQTMDKLMASIAA GNPPEIALLDNFQIPTWAQQDALVKLDDLMAASGLSFDGIYDWASNGSHYKDAVYSIPYN GDTRALFYNKDLFAAAGLDPEDPPSTIEELNEAAEKLTIMDGNTYKQVGFIPWLNAGKPI YTWGWNFGGDFYDADTNTLTVAKAENIEALQWEVDFAEKFGGKSFIEFANGLGSGAQDAF VTGQVAMAVRGNFDIANMAQYNPDLNYGICPIPSKVEGEHATWAGGWAFVIPRGAKNQEL SMDFMKYMLSDEAQTVMAQDSASFSPRKSVSEAVFGSDDKYKAFLDYIETAKIRPPVPVG QLLWDNLNQVLDSALNGKDTPENLMKALDVSINEELKKLN >gi|229784057|gb|GG667678.1| GENE 17 15076 - 15927 697 283 aa, chain - ## HITS:1 COG:mlr1887 KEGG:ns NR:ns ## COG: mlr1887 COG1082 # 
Protein_GI_number: 13471795 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Mesorhizobium loti # 29 276 47 297 304 153 34.0 3e-37 MRMSLLTNSLTGIGISDLKTIADWAAENGICELDVGPAVPLDKKQYGAVLDEGKVAIKTL IYCRNFLSENEEEAKMHCDALRERIQFAGEMGIEKIVCSTGVTAQSFEGMGFTPEKSLPA VVELAKQFVDLAEKNGVRLCYENCPMMGNIATSPDMWEAIFDAVGSERLGLCYDPSHLVW QFIDPYRNIYKFADKIFHVHAKDTAVDREMLASAGVLQNGKWWRHRLPGLGDLDWGRIVD ALYEIGYDQAICIEHEDPVWEGTEDKVKRGILKTRDHVSSFFM >gi|229784057|gb|GG667678.1| GENE 18 15942 - 17009 680 355 aa, chain - ## HITS:1 COG:BH0710 KEGG:ns NR:ns ## COG: BH0710 COG0673 # Protein_GI_number: 15613273 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 1 346 1 376 388 155 28.0 1e-37 MEKIKVGIVGTGYTVGIAANHVNGYLQNPGCILTALYDIIPGRAAKWAEEKKLNVEICSS YDELLDKVDAVSICTPNYSHVDLAVRAMEKGKHLLCEKPLSLDSESSVPAVSLARCCSTV AMIGFSYRGIPAVRYMRKILKSGSMGKIFTYRETLGGCRITNPNVKLEWRMQKQLSGTGA LADFGCHMLDLCDWLFDGSEGPVSEVNGFVSCSIPKREEIFDQGQGTVTNDDSTAFNVRF QGGAIASFVASRLGVARHTIEIYGEGGMMLFRDDRPNELEVWFKEKDDGYKGKPEIVQAD EDMIKEPWFYAEIDDFIHCIQSGERPDRNFDRALYIQRLLDKILLACETGKTINL >gi|229784057|gb|GG667678.1| GENE 19 16996 - 18282 938 428 aa, chain - ## HITS:1 COG:BS_pgi KEGG:ns NR:ns ## COG: BS_pgi COG0166 # Protein_GI_number: 16080187 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus subtilis # 42 417 46 435 451 328 44.0 1e-89 MQLVLYDKCKQLETEELKEYMNQKMEVLIRAQNGEERYRESLGWLHVDEWAGEENIREIT RHAERVRENGDILVVIGIGGSNQAARAVIEALHGDGHVKILYAGNNISSAYMKNIINQLE GKSVYINVIAKNFETLEPGISFRILRQYLKKRYGAEYGSRIFATGTRESKLHELCRTHGY TFLDFPDNIGGRYSGLCNVGLFPAAVSGIDSGSIAAGARSMEAELHNTYGEENTAFQYAA IRNLLYQKGFRLEMLASFEPRYQYFYGWWTQLFAESEGKEDRGLYPVTAKYSEDLHSIGQ FVQEGTPILFETFLDIREQNSAIWLDADEVKDEFDYLNGLSMDEINYSAFCATLKAHSER FPCISLQVEELSEYSFGQLFYFFEFACYLSGTMLGVNPFNQPGVENYKGYMFEALGKYKY RSECNGKN >gi|229784057|gb|GG667678.1| GENE 20 18539 - 19690 1487 383 aa, 
chain - ## HITS:1 COG:all4563 KEGG:ns NR:ns ## COG: all4563 COG0191 # Protein_GI_number: 17232055 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Nostoc sp. PCC 7120 # 1 382 1 346 359 197 36.0 4e-50 MPLIPLRPLMEATIKYGFGQGAFNVNAVAQAKAAIEVHEMFRSPAILQGADLANGFMGGR TDFMNATLEDKKIGAKNIADAVKKYGADSEIPIVLHLDHGRDFDSCVAAISGGYTSVMID GSSLPFDENVELTREVVKYAHARGVSVEGELGVLAGVEDHVFSASSTYTNPLKAVEFFRK TGVDALAISYGTMHGASKGKDVKLRKEIATAIRECLSHEGIFGALVSHGSSTVPKYIVDE INALGGELTNTYGIAVEELQAAIPCGINKINVDTDIRLAVTRNMKELFVKYPELRTSASI GEIYELLEAKKSQFDPRVFLPPIMDTVMKGTIPDEDVAAIVNCVERGVKEVVGTLIVQFG SFGKAPLVEQVTLEEMIERYKKI >gi|229784057|gb|GG667678.1| GENE 21 19815 - 20636 956 273 aa, chain - ## HITS:1 COG:STM0684 KEGG:ns NR:ns ## COG: STM0684 COG0363 # Protein_GI_number: 16764054 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Salmonella typhimurium LT2 # 48 258 31 233 266 86 29.0 6e-17 MNHAYYQYNKEQLLKNPRLPLICMPDNATVFQAMAKEMLQAIEENNQAGKKTVFICPVGP VGQYPYFVDMVNEKNLSLKDTWFINMDEYLDDDKKWIPVDHPLSFHGFMERTVYTKIRPE LIMPEEQRVFPDPDNVSYIPELIEKLGGVDIAFGGIGINGHVAFNEADPSLTPVEFLAQK TRVLAITPETRTANAIGDFGGALEDMPHYCVTIGIYEIAHARKIRLGVFRNWHRAVVRRT AYGEPSTDFPVSLLQNHPDITLTLTDYVAALTD >gi|229784057|gb|GG667678.1| GENE 22 20738 - 21826 1297 362 aa, chain - ## HITS:1 COG:BMEII0109 KEGG:ns NR:ns ## COG: BMEII0109 COG0715 # Protein_GI_number: 17988453 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Brucella melitensis # 57 355 17 314 319 132 29.0 1e-30 MKKALPLIMAALVAASLAGCGSSKTEATTAAPAETTTVAQAETTKEETKAEETTEAPTEA KAEPITMNVAYMPNYGSLWSVMTAKEKGYFDEEGITVNLVEFADGPTIIAAMESGSIDVG YIGQGAHKLCVNGQAKIFALSHISNGDALIAGPGITKVEDLKGKKVAYSSGTSSEDILVN SLTKAGMTMDDITAVDMDASGIVTAMISGGVDACATWSPNSLKILEEVKDSTKLCDNLTF SDTTVSLASWIATPKYAEANQDKILRFTKALFKAMDYAADDHYDEVAKYVSTQTATDYDS 
VYAQRGDADWLTGKEVSEGAADGTVENYYKIQQDNFINSGVVEKEVPVADYVMIDNMIEA GK >gi|229784057|gb|GG667678.1| GENE 23 21898 - 22689 970 263 aa, chain - ## HITS:1 COG:MJ0412 KEGG:ns NR:ns ## COG: MJ0412 COG1116 # Protein_GI_number: 15668588 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component # Organism: Methanococcus jannaschii # 8 263 15 267 267 276 56.0 2e-74 MAGNERNIKVKIDHVEKIYEGRKGRMVALNGIDLDIMENEFICVVGPSGCGKSTLLNIIA GLLPATSGAVYVDGKKVEGTGTERGVVFQQYALFPWLTVLKNVMFGLKLKGMNDAQAKEI AMKYIKMVELEEFVDAYPKELSGGMKQRVAIARAYAVQPEVLLMDEPFGALDAQTRTQLQ SELTKTWQEEKKTCFFITHDVEEAVILATKVIIMSARPGRIKKIVDINLPYPRTQEMKME LPFLDLKTYIWGEVYQEYLEVRK >gi|229784057|gb|GG667678.1| GENE 24 22665 - 23468 800 267 aa, chain - ## HITS:1 COG:BMEII0107 KEGG:ns NR:ns ## COG: BMEII0107 COG0600 # Protein_GI_number: 17988451 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Brucella melitensis # 16 261 3 246 246 184 40.0 2e-46 MSFGNSENKKKKDIVAAIPNWGWLLISFMVAIALWFWLSVNPSTARSFPFLPEVVKSLKT MIERGVLWNDFSSSMISVILGFILGFVTAVPIAFLMAWYRPVRYIVEPWIQFIRNIPPLA YVPLVVIAAGVGRKPQVIVIWMATFLIMCVTIYQGVRNVDETLIKAARVLGAKDRDIFVK VIFPATTPFIITAVRLGVSVALTTLIAAESTGAVAGLGMRIRSLNNSFDVPPMLLYIIII GIIGITSEKIIKYLERKFTGWQETREI >gi|229784057|gb|GG667678.1| GENE 25 23711 - 24775 1091 354 aa, chain + ## HITS:1 COG:BS_msmR KEGG:ns NR:ns ## COG: BS_msmR COG1609 # Protein_GI_number: 16080078 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 3 335 4 336 344 157 30.0 2e-38 MTLKEIAREAGVSISTVSRVINKNSTNAASKEVQDRIWEIVRRTGYTPNSTARDLKMGGS ESPEPLSRSIACLFARTEDSKSDLFFSTLARSIEQEAFKHNYVLKYSFTAIDIHHPNTFR LITDNHVDGVVVLGRCDKQTLSFLKKYFNCVAYTGLNILDAKYDQVISDGYQASLTAVNH LMGLGHTRIGYIGETQDEDRYTGYCTALNTGGIALKKEYVANVPLSSEGGYRGARQLIDR GVDVSALFCSNDVTAIGAMRAIQETGCRIPEDVSIISIDDIDTAQYLSPMLTTIHMPVEE 
MGQMTAKILIDRIEGGHRLPIRMNLPYYLASRESCCQYHPSRNLTASRDTTNKK >gi|229784057|gb|GG667678.1| GENE 26 24809 - 25288 505 159 aa, chain + ## HITS:1 COG:lin1322 KEGG:ns NR:ns ## COG: lin1322 COG2017 # Protein_GI_number: 16800390 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Listeria innocua # 18 157 17 157 290 133 42.0 1e-31 MLVTIFDSNGSAVIDSIGAQLISYKDSTGKETIWQRDPKFWNRCSPLLFPIVGNCRNNKT IIEGKTYCIEKHGFCRDVDFQVTRQSDTAVTFEIHDTEETKAVYPYSFSLSLTYVMENGM LSMEYRVANTDDRTIYYCIGAHPGFNCPMEEGASFNDSS >gi|229784057|gb|GG667678.1| GENE 27 26209 - 26580 235 123 aa, chain + ## HITS:1 COG:lin1322 KEGG:ns NR:ns ## COG: lin1322 COG2017 # Protein_GI_number: 16800390 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Listeria innocua # 4 120 167 289 290 67 33.0 6e-12 MVYDLENLQFDPSHRIPRLDHSRRLPLTREMFKDDAVYFDDLKSRKVSIIHKDSGHGIEV GFPGFDTVAFWTPYPAEAPFLCVEPWNGSGVYATEDDEFVHKNHVQKLEAGKSKTYGLTI RMI >gi|229784057|gb|GG667678.1| GENE 28 26694 - 26984 302 96 aa, chain - ## HITS:1 COG:no KEGG:Closa_3067 NR:ns ## KEGG: Closa_3067 # Name: not_defined # Def: response regulator receiver protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 92 80 169 173 95 52.0 8e-19 MDDICFLESYDRKTSLVLGKKKIRVKARLDVEEQRLSSYHFVRINQYNLINMLQIRSMEG DEIHMANGEKLYISSSRRKLFMEKYRKFAQENFSVV >gi|229784057|gb|GG667678.1| GENE 29 28032 - 28436 460 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622580|ref|ZP_06115515.1| ## NR: gi|266622580|ref|ZP_06115515.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 132 1 132 132 247 100.0 2e-64 MKIAVYADNTSQLDNARAYFEKWKRRGVLINEDYFLSPADLLSAVEEKNYHAIIVYADED WTAKECKFKQLKQKEKDSLIIYICGRKDQWKKLARAKYTAVFNGQPMVYNMDDICFLESY DRKTSLVLGKKKIS >gi|229784057|gb|GG667678.1| GENE 30 28606 - 28788 187 60 aa, chain + ## HITS:1 COG:no KEGG:Closa_3068 NR:ns ## KEGG: Closa_3068 # Name: 
not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 59 11 68 70 62 55.0 4e-09 MEKDKNTVSQPDDFDVDIQACSTMDCTGLIPALPETEAEKEAYEDLYPYITHAKNCADDK >gi|229784057|gb|GG667678.1| GENE 31 28749 - 28907 152 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTADSPLIQEYKNHFVQDRRIKSPEKCAERLSGDFVIPDLYLSSAQFFACVM >gi|229784057|gb|GG667678.1| GENE 32 28873 - 29268 497 131 aa, chain - ## HITS:1 COG:CAC1331 KEGG:ns NR:ns ## COG: CAC1331 COG1028 # Protein_GI_number: 15894610 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Clostridium acetobutylicum # 1 131 155 285 285 202 76.0 2e-52 MIGREGCNILNVSSMNAFTPLTKIPAYSGAKAAVSNFTQWLAVHFSRVGIRVNAMAPGFF VTKQNEKLLFNDDGTPTARTAKILAATPMGRFGEAEELNGTLLFLLNNEAAGFITGVVIP VDGGFSAYSGV >gi|229784057|gb|GG667678.1| GENE 33 30208 - 30645 717 145 aa, chain - ## HITS:1 COG:HI0048 KEGG:ns NR:ns ## COG: HI0048 COG1028 # Protein_GI_number: 16272023 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Haemophilus influenzae # 1 145 5 148 285 151 54.0 3e-37 MGLSFGTDLTGKVAVVTGAGGVLCGMFAKTLALAGAKVAVLDLNEEAAGKVADDIVAAGH TAKAYKANVLERESLEEVHARVLAELGPCDILVNGAGGNNPKAQTTKEYFEMGDIEADTI SFFDLDPKGVEFVFNLNFLGTLLPT >gi|229784057|gb|GG667678.1| GENE 34 30679 - 31734 1179 351 aa, chain - ## HITS:1 COG:CAC1332 KEGG:ns NR:ns ## COG: CAC1332 COG1312 # Protein_GI_number: 15894611 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Clostridium acetobutylicum # 1 351 1 348 351 499 65.0 1e-141 MKLSFRWYGDDDKVTLKQIRQIPGMHSIVTAVYDVPVGEVWSRESIAKLKKETEDAGLAF 
EVIESIPVHEDIKLGKPSRDQYIANYCENIRRVAEAGIKCICYNFMPVFDWTRTQLDHEL ADGSTSLVYYQDQVDKVNPLESDSDLTLPGWDSSYTRDGLKQVVSEYNAMSEDDLWNHLK YFLEAIIPVAAECDVNMAIHEDDPCWSIFGLPRIITCEENLDRFLKLVDDRHNGITLCTG SLGCSNKNDVVKMAAKYAAMGRIHFVHARNVAVLEDNKGFEERAHLSSCGSLDMFAILKA LYDNGFDGYMRPDHGRMIWGETGRAGYGLYDRALGATYLNGLWEAIEKMSK >gi|229784057|gb|GG667678.1| GENE 35 31918 - 32796 772 292 aa, chain + ## HITS:1 COG:CAC1333 KEGG:ns NR:ns ## COG: CAC1333 COG2207 # Protein_GI_number: 15894612 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 12 284 11 282 286 150 33.0 3e-36 MKKELSTGFNARQYMNSGEFEVFFYNDIDLDHVVDHTHPYYEIYFFLNGDVTYEVDGNRY ELQYGDYLMIPPETRHHPIFHSTGQDYRRFVLWISRPYYDALTARSEDFAYSFRHVAEKK QYHFRTDFITFQELQGRLLDMIEEMRGSKPFRGTSAHLMVCSFLLHLNRITYDSLHQVPA VYENVLYLNVCDYINNHLDEDLSLDHLASFFYASKYHISHVFKDNMGISLHQYILKKRLQ ASKNGILSGIPFNELYHQYGFSDYTSFYRAFKKEFGLSPKEFREQRSLPEGY >gi|229784057|gb|GG667678.1| GENE 36 33864 - 34094 317 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622587|ref|ZP_06115522.1| ## NR: gi|266622587|ref|ZP_06115522.1| hypothetical protein CLOSTHATH_03819 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03819 [Clostridium hathewayi DSM 13479] # 1 75 1 75 75 126 100.0 7e-28 MNKKSKIVKTYGAVHKRRLPVILAVMAVLAVAGVIIGMNIYDEYHTYRSSEATEFETTAE EQNGLLLTSETKTLLS >gi|229784057|gb|GG667678.1| GENE 37 34091 - 34426 220 111 aa, chain - ## HITS:1 COG:BS_azlD KEGG:ns NR:ns ## COG: BS_azlD COG1687 # Protein_GI_number: 16079723 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permeases (azaleucine resistance) # Organism: Bacillus subtilis # 1 110 1 110 110 107 57.0 4e-24 MSGHTIQGIVIIVVMAAATLLTRFLPFILFPAGKKTPKYISFLGTTLPYATIGLLVVYCL KGVNLVSWPHALPEILSVAAIVLLHIWKGNSLLSIGAGTVIYMVLVQGVFR >gi|229784057|gb|GG667678.1| GENE 38 34423 - 35127 831 234 aa, chain - ## HITS:1 COG:BS_azlC KEGG:ns NR:ns ## COG: BS_azlC COG1296 # Protein_GI_number: 16079724 # Func_class: E Amino acid 
transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Bacillus subtilis # 9 222 28 241 254 207 47.0 1e-53 MENQERIRALKYAFPRTVPVMAGYLVLGAAYGILMNISGFGIWWALAISVFVYAGSLQYL GITFLTAMVNPVYAFFMSLMLNARHLFYGLSMLDKYRDAGRLKPYLIFALTDETFSIVCH EEPPKSIPRYWVCFWISILDQCYWVAGTVVGVLLGSLITFNTTGLDFALTALFVVIFVDQ WKSGHGHKAALTGVAASALCVVIFGQSAFIIPAMILILAAITWDYKKNGKETGA >gi|229784057|gb|GG667678.1| GENE 39 35265 - 35673 221 136 aa, chain + ## HITS:1 COG:CAC3242 KEGG:ns NR:ns ## COG: CAC3242 COG1313 # Protein_GI_number: 15896487 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S protein PflX, homolog of pyruvate formate lyase activating proteins # Organism: Clostridium acetobutylicum # 9 136 6 133 298 162 58.0 1e-40 MDFTSQYDSCSLCPRSCMVNRRVPGVKGFCGCGDTVMAARAALHQWEEPWISGIHGSGTV FFTGCTLRCCFCQNYPISQEGLGKEISIRQLGDIFLRLQDEGAHNINLVTATQYLPSIVS ALDFVKHRLAIPVVYN Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:32:34 2011 Seq name: gi|229784056|gb|GG667679.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld72, whole genome shotgun sequence Length of sequence - 28299 bp Number of predicted genes - 27, with homology - 26 Number of transcription units - 9, operones - 6 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 666 310 ## COG1228 Imidazolonepropionase and related amidohydrolases - Prom 860 - 919 7.3 + Prom 803 - 862 8.0 2 2 Op 1 35/0.000 + CDS 919 - 2274 1431 ## COG1653 ABC-type sugar transport system, periplasmic component 3 2 Op 2 38/0.000 + CDS 2303 - 3232 810 ## COG1175 ABC-type sugar transport systems, permease components 4 2 Op 3 . + CDS 3238 - 4089 576 ## COG0395 ABC-type sugar transport system, permease component 5 2 Op 4 . + CDS 4104 - 5387 1102 ## COG3681 Uncharacterized conserved protein 6 2 Op 5 . + CDS 5397 - 6575 887 ## COG1228 Imidazolonepropionase and related amidohydrolases 7 2 Op 6 . 
+ CDS 6639 - 7295 641 ## COG1802 Transcriptional regulators
8 2 Op 7 . + CDS 7321 - 7860 403 ## Closa_1289 Uroporphyrinogen-III decarboxylase-like protein
9 3 Op 1 . + CDS 8792 - 9127 347 ## Closa_1289 Uroporphyrinogen-III decarboxylase-like protein
10 3 Op 2 . + CDS 9117 - 9257 166 ##
11 3 Op 3 . + CDS 9205 - 10533 1209 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases
+ Term 10548 - 10598 13.0
+ Prom 10664 - 10723 8.2
12 4 Op 1 . + CDS 10850 - 11740 715 ## COG1082 Sugar phosphate isomerases/epimerases
13 4 Op 2 . + CDS 11745 - 12515 661 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases)
14 4 Op 3 35/0.000 + CDS 12527 - 13927 1257 ## COG1653 ABC-type sugar transport system, periplasmic component
+ Term 13937 - 13982 8.8
+ Prom 14010 - 14069 2.7
15 4 Op 4 38/0.000 + CDS 14102 - 14821 522 ## COG1175 ABC-type sugar transport systems, permease components
+ Prom 15668 - 15727 80.4
16 5 Op 1 . + CDS 15785 - 16555 531 ## COG0395 ABC-type sugar transport system, permease component
17 5 Op 2 . + CDS 16596 - 17648 980 ## BF3305 putative lipoprotein
18 5 Op 3 7/0.000 + CDS 17671 - 19416 1548 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
19 5 Op 4 . + CDS 19403 - 20224 942 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
20 5 Op 5 . + CDS 20225 - 22159 1406 ## COG3533 Uncharacterized protein conserved in bacteria
+ Term 22164 - 22217 3.7
21 6 Tu 1 . - CDS 22223 - 22894 822 ## COG3859 Predicted membrane protein
- Prom 23064 - 23123 5.1
+ Prom 23078 - 23137 3.3
22 7 Tu 1 . + CDS 23158 - 23610 495 ## COG3238 Uncharacterized protein conserved in bacteria
+ Term 23629 - 23668 -0.3
+ Prom 23760 - 23819 7.9
23 8 Op 1 1/0.000 + CDS 23866 - 24633 717 ## COG1555 DNA uptake protein and related DNA-binding proteins
24 8 Op 2 .
+ CDS 24654 - 25358 869 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 25378 - 25421 5.3 25 9 Op 1 . + CDS 25521 - 26012 616 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 26 9 Op 2 . + CDS 26079 - 27935 1735 ## Closa_3262 hypothetical protein 27 9 Op 3 . + CDS 28018 - 28297 228 ## Closa_3261 acetyl-hydrolase Predicted protein(s) >gi|229784056|gb|GG667679.1| GENE 1 3 - 666 310 221 aa, chain - ## HITS:1 COG:SMa1677 KEGG:ns NR:ns ## COG: SMa1677 COG1228 # Protein_GI_number: 16263373 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Sinorhizobium meliloti # 5 216 66 276 480 129 36.0 5e-30 MKRQIITNCNVFDGNHEQLKLHSHIIIEDNLVTEISSSDVSAENFDEIIDAGGRTVIPGL VDCHVHVALTGGGSEMECMRADETAVRGAKNAEEMLYRGFTTVRDAGGLVWGIKHCIDTG YTIGPRIFPSHSGIGQTSGHCDNRPGAASMRTLTGYMSPVMDNKIWILADGANEVLKAAR DQLFLGASQIKLFLGGGIASVFDPLYTVQYTEEEIQTLFGS >gi|229784056|gb|GG667679.1| GENE 2 919 - 2274 1431 451 aa, chain + ## HITS:1 COG:BS_yesO KEGG:ns NR:ns ## COG: BS_yesO COG1653 # Protein_GI_number: 16077764 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 52 447 11 409 412 204 32.0 2e-52 MKRKGLAFTMAAVMALSVAGCSSKPDTAETEKQSVSSTQEAQADASTEAVSSAAEGEPVT LKLFWWGNQVRNDLTQQVIDLYMKDNPNVTIMPEFTDWNGYWDKLATSTAGGNMPDIIQM DYRYLEKYVSSNSLANLSEFIDNGTIKTDKIPESVIESGSINGTCYAISLGSNAPAIYYD KEIVDKAGVTIPDQMTIEELYEIGQTIYEKTGALTYYDGGYVLMGEIARSYGSHLWDELE AGDETAVKKDFEYIKKFSDAEFSISQELLVEKNPKVVETKPIIDGTVWNDFSFSNQYSAM VTAAGRDFGMSMYPTTEGATQQPIYLKPSQFFSVAETSQYKEEAAKFVDWITNSVEANEI LQGERGVPVNTDVQEAIKGTLDSANAQVFDFVAKVGEVATPVDAPDPEGSSDVTTKLDTY VENIRDGSMDVETAAKEFTEYAKKTLEEAKQ >gi|229784056|gb|GG667679.1| GENE 3 2303 - 3232 810 309 aa, chain + ## HITS:1 COG:BS_yesP KEGG:ns NR:ns ## COG: BS_yesP COG1175 # Protein_GI_number: 16077765 # Func_class: G 
Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 18 308 15 307 309 295 54.0 5e-80 MKKTQKTPGTKKTLKQILNQEKVAGYVFILPFIIGFMVFLFYPMVMSLAYSFTRYNILKS PVFIGLDNYITMFTEDDLFWKVLMVTVKYVVFGVPLRLLMALIVAMLLVKNTKLSGFYRA VYYLPSIIGSSVAVAILWKRIWASDGVINALLALIGLPSKTVWLGKEETAIWVLIILAVW QFGSSMLIFLSGLKQVPVSLYEAATVDGASGITKFFKITLPMLTPTIFFNLINQLISGFM AFTQSYIITEGKPKNSTLFYAVYMYQNAFTNEKLGYSCAMAWFMVAFIGLLTFILFKSQK YWVYYESEG >gi|229784056|gb|GG667679.1| GENE 4 3238 - 4089 576 283 aa, chain + ## HITS:1 COG:BS_yesQ KEGG:ns NR:ns ## COG: BS_yesQ COG0395 # Protein_GI_number: 16077766 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 12 282 26 296 296 299 52.0 3e-81 MKNYTLKRNAMRVIYHLFTFSFACVMIYPLVWMIMGSFKDTTEIFRHAEHLLPQSFHLGN YIKGWKGFANVSFTVFFKNSLIISGLATVGAVCSSAIIAYGFARCNFKGKGIWFSAMLVS MMLPFQIIMIPQYILFNQMGWAGTYLPLIVPHFFGQGFFIFLNIQFIKGIPRDLDEVAKI DGCSIYGIFTRIILPLIKPSIVTSAIFSFMWRWDDFMSSLIFLNNPLQYPVAYALKLFAD PTAGSDWGAMFAMATLSLVPIFIIFLTMQKYLVEGIASTGIKG >gi|229784056|gb|GG667679.1| GENE 5 4104 - 5387 1102 427 aa, chain + ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 18 422 3 409 411 223 35.0 4e-58 MKRYEELYDVLNKELQPALGCTEPAAVTYACAVAAKALGVPAEKVEIAASKNILKNVMGV GIPNTDSYGITAAAAMGVQLTDIDKKLMILNDATPDVCKKAGLMMRQNRIEVCEAKHSSR IHIEVICRGGGHESKVIVDGLHTNIVSVMRDGAELKVPGQEKGSADESVKTDLRGYTIQD FCEFASVADARRLASTRKAVELSKKLSDEGLKGKYGLGVGKALKEMAVQGEHGLVNEVKY ITAAAADARMAGCELPMMSLCGSGNQCIEATLPVAAAAVRSGASEEKMICAAALSQLITV YTKQYTGRLSPICGCGLAAAIGAAAGITWMNHGTVPMVEAAIKNVAADLSGMICDGAKPG CSLKLATAAGTAVQSSMLALKGISATDRDGIVEQSAEKTIQNLGILDKDGMNCTDQVILD LMLGQYA >gi|229784056|gb|GG667679.1| GENE 6 5397 - 6575 887 392 aa, chain + ## HITS:1 COG:BH2935 KEGG:ns NR:ns ## COG: BH2935 COG1228 # 
Protein_GI_number: 15615497 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Bacillus halodurans # 33 384 40 385 394 164 33.0 2e-40 MITLFKNGMVWEEDHFCRKDIAVEAERITMTGTAPEGFVCDRTFDVEGKYLVPGVIDCHA HVTMVGGDHHMAVFFQENTAELTMEGVINCERMVSHGITTARDCGGKYMETLAIRDFIRK GKIKGPRLLCSGTPLKVVGGHEPGFDITGPIEARKAVRQFLFEGVDFIKAMVTGGLGKAG EDPGAVELELNELAAMVSEAAKHGKKVACHCHSRAGMEILLKAGAASIEHATYLDPEINE QIIRQGVYVVPTFTPYEIAAASEVGCGIVPDAIFASRAMIGDKRKRFKEAYDQGVLIAFG RDSGGFFMDQGEFADEMLYMEHAGMARKDIIKSATEHAARLLGIWGETGSLEEGKCADIL FLNSNPLENLAAYRNDLQEVCANGTLILNKLG >gi|229784056|gb|GG667679.1| GENE 7 6639 - 7295 641 218 aa, chain + ## HITS:1 COG:mlr4749 KEGG:ns NR:ns ## COG: mlr4749 COG1802 # Protein_GI_number: 13473980 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 10 109 30 129 248 76 40.0 4e-14 MAAKKMSLKEQVYTKIFNDLVVERFPPDEFLTEKQFIELYGVSKAPVREALIELCNEGIL RSIPRLGYQIIPVTQKDITNSTELRLLIEMNAFRKMSESFNEEMLEKIRTLNKNWMEDVI TGNIDVQKRWMHNTLFHTTMCSFGGNDLAVDVIAKLIKLEWRAYAHMLSTPEKQDKFFNN SEKKYHIEIEHALMERNFEKAEQLLHDDIATIGKELFL >gi|229784056|gb|GG667679.1| GENE 8 7321 - 7860 403 179 aa, chain + ## HITS:1 COG:no KEGG:Closa_1289 NR:ns ## KEGG: Closa_1289 # Name: not_defined # Def: Uroporphyrinogen-III decarboxylase-like protein # Organism: C.saccharolyticum # Pathway: Porphyrin and chlorophyll metabolism [PATH:csh00860]; Metabolic pathways [PATH:csh01100]; Biosynthesis of secondary metabolites [PATH:csh01110] # 1 179 1 178 319 141 41.0 1e-32 MNRRERITAVFKGEKPDRTPMGFWMHFPTEQHHGEEALAAHLKYFEETKTDICKVMNENL YPVQHPIMEAADWADVKACGRNHPFIRSQVELVKRIVDSTADDAPVIATVHGIVASASHA LMQCSRYDKVGRYAQLYHLRTNPDSVYSAYQAIAESLTILAEECIAAGADGIYYAALGG >gi|229784056|gb|GG667679.1| GENE 9 8792 - 9127 347 111 aa, chain + ## HITS:1 COG:no KEGG:Closa_1289 NR:ns ## KEGG: Closa_1289 # Name: not_defined # Def: Uroporphyrinogen-III decarboxylase-like protein # Organism: C.saccharolyticum # Pathway: 
Porphyrin and chlorophyll metabolism [PATH:csh00860]; Metabolic pathways [PATH:csh01100]; Biosynthesis of secondary metabolites [PATH:csh01110] # 1 100 214 315 319 92 41.0 7e-18 MCKPLVDLERYVDYPCEAVNWGILESGVKLFEGRKLFPGKVIMGGMDDGSQILISGSPEE LEREVHNILAENDYPGFILASDCTLPGNLPYERIRMMEKACETYRGGSYEL >gi|229784056|gb|GG667679.1| GENE 10 9117 - 9257 166 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNYDEKQNAVSDLQIAYIGGGLQRLGMDIYDRPGYGKKYVRYNPAV >gi|229784056|gb|GG667679.1| GENE 11 9205 - 10533 1209 442 aa, chain + ## HITS:1 COG:BS_lplD KEGG:ns NR:ns ## COG: BS_lplD COG1486 # Protein_GI_number: 16077780 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Bacillus subtilis # 1 428 27 441 446 461 51.0 1e-129 MTDLAMEKNMSGTIRLYDIDMEAAGHNERIGNDLSTREDASGKWNYETYSSLESALTGAD FVVISILPGTFDEMAVDVHTPERLGIWQSVGDTAGPGGIMRSLRTLPMFVEIGEAIKKYS PDAWVINYTNPMTLCVKVLYCVFPEIKAFGCCHEVFGTQRVLKGITEQMLGLKDIERSDI HVNVLGINHFTWFDYASYRGIDLFPVYRDYIDSHFEEGFQEKDRSWMNSAFHCAHRVKFD LFKKYGMIAAAGDRHLAEFMPGDLYLKDPETVKSWKFGLTTVDWRKNDLKERLEKSRKLF EREEQVELKPTGEEGILLIKALCGLGRMVSNVNIPNTDGQISNLPVDEVVETNAIFDRNS IRPIMAGSLPQPVLDLIMPHVRVHDKTMRAALSPDLELVVEAFLEDPNVKAKKPSEKDVR SLVIDMLKGTKAYLPKEWEHWI >gi|229784056|gb|GG667679.1| GENE 12 10850 - 11740 715 296 aa, chain + ## HITS:1 COG:AGl260 KEGG:ns NR:ns ## COG: AGl260 COG1082 # Protein_GI_number: 15890243 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 296 1 289 289 175 36.0 8e-44 MKYGIHFGHLGSWYDELGVRECLKQAKEAGSDVFEFFPTKEMFDMEKDKIRELKMYMEEI GIEPAFTFGYPAGWDMAGEDEANREKAVEHLKRIIESMGVLGATGIGGIVYSNWPADYSL QVIEPDDKKRRKDNCIAGLRKVMKTAEDNNVTVNLEIVNRFEHYLMNTAAEGIEVCKAVG SPNCKLLLDCFHMNIEEDSLPEAIRSARGYLGHFHVSEPNRKVPYHTDRIPWNEVGRALR DIGYDKAVIVESFYKFGGVQGHNMRMWRDLDPDLSLESRLKLARQGIEYIRGQFGG >gi|229784056|gb|GG667679.1| GENE 13 11745 - 12515 661 256 aa, chain + ## HITS:1 
COG:Rv1928c KEGG:ns NR:ns ## COG: Rv1928c COG1028 # Protein_GI_number: 15609065 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Mycobacterium tuberculosis H37Rv # 2 254 3 253 255 207 46.0 1e-53 MILEEMRLCGKTALVTGAAHGIGKACALALAEAGADVAVVDLKEEVFQTAEEIRKTGKHA LAFQADITAEEEILAVVENVAAEWGKLDICFCNAGTFQDIPAEDMPLKEFKRVIEVNLTG AFITARAAGRSMIENHISGSIILTASMSGHIANIPQCQCAYNSSKAGVIMLGKSLAVEWA RHGIRVNTVSPGYIRTWSDTPLKPEEVGQSNFEQWTPLRRFGQADELKALIVYLAGDASG FVTGSDYVIDGGYTAF >gi|229784056|gb|GG667679.1| GENE 14 12527 - 13927 1257 466 aa, chain + ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 2 402 3 375 438 76 22.0 9e-14 MRKTGKKVLGIVLCAAMAGSLLAGCGKKQGAEAVGGGQETTAAAREQSTEQEKTAQKGNV PTIRLVSWQTPSDPKTKPVYDAVDQFAKEHASEFKLEHENILGDELKAKIKTDVASNTVP DVFLYWGSGGNSAMLLDADVIIPFDEYLDASQNIKRELFPQDSFDRTSANGTLATITLGL EYGVWFCNQELFDQYGLELPKTLDDMIAVGKVFNENGIVPFAMGSKGGNPAHEFVAEILG QMPDCQEDFRKINEEYTVDTPNIRKTLEIIETMRENKLFPSDTISIGDWNQHFAMYNEGK AAMIYAWTWQLSNMSKEMADKTVIIPAPGMPGGTRDTSNFARSGGDMGYVISRSAWEDPD KKDAVIELMDYLYSREIQELTLYSGGGIPSRLDIEVDADKVEKQKLAEVIEFARGKESCP NLPQACPKTECWTDFADGIDELMAGISTPDQVIENINKSLTAAKEN >gi|229784056|gb|GG667679.1| GENE 15 14102 - 14821 522 239 aa, chain + ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 1 239 27 267 292 154 36.0 1e-37 MITSLYFSLLKWDGVTQSKFIGMANFVDLFTNDRYFKKVIFNTLKSMAAGVCIQLPLALI LAYLVYRTVRGFHFFQSVYFVPVVISATVIGLMFSLFFNPAFGPVNAFFEAMGWKGMIKN 
WLTDPGVVLFSVIMPGIWQYIGYHFIILLAGMQSVPKELIESAYIDGANTVDIFFRIMIP NIKAMIQVCVVICLTGSIKTFDIPYLMTQGGPGAESTFLSIYMYKEAFVDSSIGRGTAV >gi|229784056|gb|GG667679.1| GENE 16 15785 - 16555 531 256 aa, chain + ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 16 256 50 293 293 147 33.0 2e-35 MTFITLYPLFWIGLCSLKSSEEIYGSPFGIPRQPEWENYAEAWQVADIPVHLLNSILYAA AAVTVVVILSAMSAYIIAKVWKNRGLYHYYSLGIMIPINAIIIPFILIFRRVGILNTRVG VILAFIVTNLAFSIFILVPFMMGLPDELEDAARIDGCNRLQTFLKIMLPLSKAGLATVGT FVLINCWNDLFLSLLIISRQNLITLNQVCYNMRSQYAADYGLITATVMIMVIPVIICYAM FQKQIVKGMTAGSIKG >gi|229784056|gb|GG667679.1| GENE 17 16596 - 17648 980 350 aa, chain + ## HITS:1 COG:no KEGG:BF3305 NR:ns ## KEGG: BF3305 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 41 338 40 326 328 286 50.0 1e-75 MLKYSEELAKSGFYVTRYEDRFFWDTPENLAVYPYGTSDGWRKYEKNPVLGGEYGTCFDL SVLHEDGIYRMWFSWRPHECIAYTESEDGLNWKEPVEVLRPNPESLWDADELNRPSVIKR DGIYEMWYSGQMKPYTNEGRSCIGYASSREGIHWERFNTPVLVPDQPWEQKAVMCPHVIY EEETGVYKMWYSGGGNHEPDAVGYAWSRDGRNWEKEVSNPVLRKEEKNPWEREKVAACQV LKWKNYYYMFYIGFIHVDRAAIGLARSKDGISGWERYPANPIIAPDVDGFDEKAVYKPYV LKVQNGWMMWYNGAKYIPDKENLVIEQIGAAFLDREEFWVEQEESYTAAE >gi|229784056|gb|GG667679.1| GENE 18 17671 - 19416 1548 581 aa, chain + ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 168 567 176 587 602 141 26.0 4e-33 MQKTIRSDMIRSNVILILLVTVCLSGLHLYQIYRYTILQNQEEIIKMSEQARNSLNNMLN QLDVLQYQIVDSLTQSEEYDRIRPVWTKEGIDFFKNTENQFQTFRRAVPFVTNIYWLDNY DRLYTTDSNVSREHFYGNSRISLLEESGGDDCYVPPSVPEYQLVPDSDEVFSYLKNVYRL RQGKERRGVLQIDMRASYMEEILKTINSDRYCLAYVTAGDETVFWFPDRIVYKNFGFGDM 
DSGYHLVYPLNNGWNLQVRYSNAFALKSLKDNLTSSIILLLIIMPVSVWSVYKYAGYLTG PLQKLTGRMKRLESGELVEIHTISQFEEIRILEQGYNSMISKMDDMMTKMTEIRTENINA KLLALQAQINPHFLANSFELIRGLAIREHNGDIESIAEALGMMYRYILSDGKEKVTLEEE LTYVKNYIRIQEYRFQRSIQVLYNIDPRSRECRMARLLIQPLVENAFIHGLELKNGLKWI MITCSVNGETLYVEVKDNGLGIEAERLKELSGDIMDFATKEIGTEKSGMGSGKSIGLKNV NKRLFLSYGASGCLQIISEPGKGTSVSFRIEIEKEEGDGGF >gi|229784056|gb|GG667679.1| GENE 19 19403 - 20224 942 273 aa, chain + ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 7 261 7 252 257 145 32.0 9e-35 MGDFKAVLVDDEENVRYLLADLLESFHLGIEVTGQAGDGEEALKLCRSLKPDLVISDIKM PGLDGIHMLEAVKKISPEIICVFVSAYTDFEYAKAAIANGARGYILKPVEPEEMYQLLKT VKAEWMQSQKHRKKVQLMETEIRKLKAEHLTDDMPVDSNGCSRVIRKALNYMEDHYHEDI SLESISEVVFLNKNYFSELFKKETGKSFVQYLTEYRLEKAKLLLAISGFKGSEVADMTGF QNNSYFISVFKKYVGCTPKEYSESLHQAEKEDK >gi|229784056|gb|GG667679.1| GENE 20 20225 - 22159 1406 644 aa, chain + ## HITS:1 COG:ECs4459 KEGG:ns NR:ns ## COG: ECs4459 COG3533 # Protein_GI_number: 15833713 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 5 642 13 649 656 610 47.0 1e-174 MDCMIKSPFWDRYLKLVREQLIPYQWGILTDRIPCETESHAVRNFQIAAGVKEGEFSGFV FQDSDLYKWIEAVGDILQYGRDEDLEQKADEIIGYIVSAQQEDGYLNTYYQLKEPDRKWT NLLECHELYCAGHLIEAAVAYHKGTGKDGLLKAACRLADCVDRTFGPEEGKLHGYPGHQE IELALLKLYDVTGEERYLNLSSYFLEARGNNRFFEEEFERRGRISHWTRGQVDEPNREYN QYPYAYYNQFHIPVRQQKKAVGHAVRAVYMYAAMARMAGCCRDEKLLDTCRELWKNITGT QMYITGSIGSTPSGEAFTRAYDLPNDTNYSETCASVGLIRFAMEMLLNEADSSYADVIEK ALYNTVLSGLSLDGRHFFYVNPLEAVPEVCEANPERHHVKTVRQAWYACACCPPNIARTI ADLQSLIYSVDGERFYVHQYIGSEREVTLPSGTFRLIQESELPWKGRSVFRFESENPVSG ELALRIPQWCVRYSLTMNQHSVTDFSVERGYLVVKREWKDGDVLEWVLEMEPSFCQADQR IFYDAGKAALIRGPLVYCLEEADNGPYLHECRVDVQGKPELYWEETLGGYYGITVPGVRY SGRRQECLYEKAFVEKQAVKLKAVPYFLWNNRGEGEMITWIDIM 
>gi|229784056|gb|GG667679.1| GENE 21 22223 - 22894 822 223 aa, chain - ## HITS:1 COG:CAC2928 KEGG:ns NR:ns ## COG: CAC2928 COG3859 # Protein_GI_number: 15896181 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 25 193 20 183 210 114 40.0 1e-25 MSFFLTYAADSEEYILKPAGYGLLIVLLAALLFVIYKLGKASLKTQQMQTKQLVYCSAAM ALAVVTSFIKVASLPFGGSITLFSMMFICLIGYIYGARTGLITGLAYGILQFIVEPYIYA PLQVLLDYPLAFAGLGLAGFFSSKKHGMIIGYLVGIFGRYICHVLSGYIFFAAYAPEGMN PFFYTLGYNATYIVPEAIATVIVFSLPPVAKAIAQVKKQAISS >gi|229784056|gb|GG667679.1| GENE 22 23158 - 23610 495 150 aa, chain + ## HITS:1 COG:CAC3547 KEGG:ns NR:ns ## COG: CAC3547 COG3238 # Protein_GI_number: 15896783 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 23 149 14 139 143 79 39.0 2e-15 MHDGFGGYSMTGFIIALISGALMSIQGVLNTGVTKQTSIWVSAGWVQLTAFITCVVIWYF SGRENISGILQVEPKYMLLGGIIGAFITYTVVRSMGSLGPAKAALIIVISQIIVAYGIEL FGLFGVEKAGFEWKKLIGAVVAVIGIIIFN >gi|229784056|gb|GG667679.1| GENE 23 23866 - 24633 717 255 aa, chain + ## HITS:1 COG:BS_comEA KEGG:ns NR:ns ## COG: BS_comEA COG1555 # Protein_GI_number: 16079613 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Bacillus subtilis # 92 255 64 204 205 103 43.0 2e-22 MKYRYQAVKTAVILICVLAACICCSCDGRQGGWESGELSVLSGPGDTEMSRGEEMPEEKN GPDDKGSGEASETSEVLQTPVSESTTAAVCYVHICGEVLLPGVYEMEEGSRIFQVVERAG GFTERAAQQYLNMAQIVSDGMKIVVPDGESMEGAERYGIDSASGSGSMGGGVMVPGMAAA GSGSEPGGSAAKEKVNLNTATREELMTLKGIGEAKADDIIAYRESHGGFQKIEDIKKISG IKNAAFEKIKDDITV >gi|229784056|gb|GG667679.1| GENE 24 24654 - 25358 869 234 aa, chain + ## HITS:1 COG:BS_yycF KEGG:ns NR:ns ## COG: BS_yycF COG0745 # Protein_GI_number: 16081093 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 3 231 4 230 
235 207 49.0 2e-53 MARILVVDDEKLIVKGIRFSLEQDNMEVDCAYDGEEAINLAKQKEYDVVLLDVMLPKFDG YEVCQAIREFSEMPIIMLTAKGNDMDKILGLEYGADDYITKPFNILEVKARIKAIMRRNT KKNRRESRAGSNVVTSGDLKLDRDSRRVFIGEKEINLTAKEFDLLELLACNPNKVYSREQ LLTYVWGNKAMDSGDVRTVDVHVRRLREKIEPSPSDPKYVHTKWGVGYYFRAQS >gi|229784056|gb|GG667679.1| GENE 25 25521 - 26012 616 163 aa, chain + ## HITS:1 COG:CAP0111 KEGG:ns NR:ns ## COG: CAP0111 COG0454 # Protein_GI_number: 15004814 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 163 1 162 162 213 60.0 2e-55 MDYTIRNVKIEDLDQVTEVEALCFPAAEAAVEASFRQRIETFPDSFFVAEDENGRIIGFI NGCVTDERTIRDEMFEDSGLHHTEGLYQSVFGLDVIPEFRRQGVAADLMNRLMQEAKARG KKGMILTCKDRLIHYYEKFGYRNLGLSASVHGGAVWYDMLLEF >gi|229784056|gb|GG667679.1| GENE 26 26079 - 27935 1735 618 aa, chain + ## HITS:1 COG:no KEGG:Closa_3262 NR:ns ## KEGG: Closa_3262 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 618 1 618 618 973 75.0 0 MKKSPQGERWRRLDNTAKIFPVIANEHLSNVYRISATLKEEVVPGTLQRALEEILPQFEG FAVRLRRGFFWYYFESNKRMPVIERETAYPCKYIDPRTSPYYLFRVSYYEKRINLEVFHA VTDGLGAVNFLKALVYRYLDLTRDGRTGHPAPQKISSDVSMNVEDSYVRNYKEVARRKYS TRKAFHISGEPLPLDEESVLHGYVNLAELKKVGKSYGVSITKFLAAALIWAIYQEYEEAG TGDRSIGISLPINLRAFFDSETMANFFAVTLLEFLADGKEHEFSEAVEAVSRQMDENITR EKLEETISYNVSNEKKWYLRIAPLFMKWGALNVIFRRNDRAYTMTLSNIGPIKVEEEYEA EIERFHLMIGVSKRQPMKCAVCAYGEHVVVTFTSVFQDTRLQDRFFGFLKELGVSVETES NGVPDSRDDKAMYPRIQYDPKRWKEVVNTFYGILFAVAAILGVVNFATYSGSMWSVIAIG CIIYVALTLRYSIMRHANLASKVVIQTIAAEVLLVMIDHVTGYTGWSVNFGIPSTLLFAD LAVVFLIIVNRLNWQSYFMYQIAITIFSFIPVILWAAGLVTRPLMTLVTVVVTVLILFVT IKWGDRSVKTELKRRFHL >gi|229784056|gb|GG667679.1| GENE 27 28018 - 28297 228 93 aa, chain + ## HITS:1 COG:no KEGG:Closa_3261 NR:ns ## KEGG: Closa_3261 # Name: not_defined # Def: acetyl-hydrolase # Organism: C.saccharolyticum # Pathway: not_defined # 1 93 1 93 347 163 81.0 2e-39 
MRDKRVSVRGNLVRDMMQNVMETSLKQPIQSGELRKNPVEPAWICPAGYEYEIVETEQFP MEYLRPEGIFTGRVILQLHGGGYIGPMKNIYRK Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:33:09 2011 Seq name: gi|229784055|gb|GG667680.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld73, whole genome shotgun sequence Length of sequence - 22063 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 480 - 2408 1123 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 2 1 Op 2 . - CDS 2405 - 4594 320 ## gi|266622620|ref|ZP_06115555.1| conserved hypothetical protein 3 1 Op 3 . - CDS 4609 - 4791 130 ## gi|288870865|ref|ZP_06115556.2| hypothetical protein CLOSTHATH_03855 4 1 Op 4 . - CDS 4772 - 5530 793 ## COG3279 Response regulator of the LytR/AlgR family + Prom 5857 - 5916 3.9 5 2 Op 1 . + CDS 6001 - 6414 360 ## gi|266622624|ref|ZP_06115559.1| hypothetical protein CLOSTHATH_03858 6 2 Op 2 . + CDS 6417 - 6551 76 ## gi|266622625|ref|ZP_06115560.1| hypothetical protein CLOSTHATH_03859 + Prom 6584 - 6643 2.5 7 2 Op 3 . + CDS 6667 - 14619 5493 ## COG3291 FOG: PKD repeat + Term 14664 - 14705 10.1 + Prom 14741 - 14800 5.2 8 3 Tu 1 . + CDS 14858 - 16759 1458 ## COG0178 Excinuclease ATPase subunit + Prom 17663 - 17722 18.2 9 4 Tu 1 . + CDS 17756 - 18019 229 ## COG0178 Excinuclease ATPase subunit + Term 18093 - 18139 7.4 + Prom 18112 - 18171 7.6 10 5 Op 1 . + CDS 18256 - 19209 956 ## COG5263 FOG: Glucan-binding domain (YG repeat) 11 5 Op 2 . 
+ CDS 19261 - 22063 1922 ## COG2199 FOG: GGDEF domain Predicted protein(s) >gi|229784055|gb|GG667680.1| GENE 1 480 - 2408 1123 642 aa, chain - ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 396 618 194 416 433 83 27.0 2e-15 MNNRLTQLGKRLLLLTITFFLTMGMFTAAYRFDNKYTTQSVQPINGILFYTPEKTDPVYL INGWEYYQGRLLTPEDFISDLPLPDNYVYIGQYSGMESGNPEASPHGSATYRLTLSLPET PAFYTLELPEIYSAYRLYIGSRLMASQGNPDPEHYEPSLYSGSVTFEASGDLQLLLAVSD WSHLYSGMVYPPAFGTPNAVHALLNQRFAISLTVTALAAVLGFFQLALAAILKNRRSLLS GLICMAFAASASTPALHQLTATGIVPWYSLEIFCRYAIYGLSLILVLDLCGRQYRALKAA SYAAALFPFLALAVSLLAPRLSCRQMVLFSHTAGTYKVLCALWLLGTAFFSKTKQEPERD RTILLIGICIFASSLAADRLYPLFEPIRFGWFSEIAGFLFVLLLSFLLLRDSARFYKRQF ILAQEKEHMETQIQMQKKHYSELASQIEKIRTMRHDIRHHLTQLSVLLKDRNTEAAVQYL EKVTHSTSTSAPLTFCEAYYVDVLLRYYYSCAEDLHIPMTVHANVPTKPGVPEEDLCVIL GNILENAFNASTPVPQNQRRITVAMICQEGTVCIEVKNTFKGELVPDGKSFYSSKEKGRH GIGLSSVRSMAEKYSGDVWLNTEPGENGVNVFSIQVLLFSQS >gi|229784055|gb|GG667680.1| GENE 2 2405 - 4594 320 729 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622620|ref|ZP_06115555.1| ## NR: gi|266622620|ref|ZP_06115555.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 729 1 729 729 1272 100.0 0 MKKRKQFLYFLAMLSLILTLPSVDIPSTAQAGTSIGTPSDAEEELPEKTAPAETRQAETE EQKETDAVPSAPLGPPSETESSASVSMPYDKIQAPALASSRIATGSNAEPREIVKVITPY SDILILAERQNARELLNSKLENRDYSTAVALFSDRTREFYPIQYDVEALDVTQTGLISLN GTVEVPDDVSIDPELTAVTIPVFLYDPELPCELQAHSSNPVRDSQLLLAQDSSSEDLQKE LKYNSHTYLYCDDDIVLEVPLTWDVSSVNFGTPGTYRIHGIPDLPEGILLSDEFSSFPCD VIIQETGTFSLTPPIFDGISFFTRWTKPTPQLDKLHRYYSIGDDGEWQEDTTGNFMQIAT YNTQLMSVLYFEDFLYEVPYYFQLEYDGECSNILKVSITEGDLHYDLIEGDRDGGDRGEQ QPPTVTLPPDQPDQTPPLGTTRPEENTPPETPPGETLPAPAETQPTDQENASHTGSSSGT SGKTSRHSAVSDFVSGPGYQEALTSSASLSPLPESGPSVEASETAAAISAPASTGQEADT 
ADYTILSGRRIHKELELNPGRPLIITKHRIRLEIPADSGLFTNMPVQSLFRAEVISLDDH HISVMLSMDGVPLTDLPSMTITMPWKATDERPVLEVINESGEVLGTATYLDSFTITFQLS ETGTFTIETQAPAAVIPSVPVDPADGQAGDSVSSSSSSIDVVACLLLAGILVSGFIILRF CRRGRRDPA >gi|229784055|gb|GG667680.1| GENE 3 4609 - 4791 130 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870865|ref|ZP_06115556.2| ## NR: gi|288870865|ref|ZP_06115556.2| hypothetical protein CLOSTHATH_03855 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03855 [Clostridium hathewayi DSM 13479] # 16 60 1 45 45 79 100.0 1e-13 MYTTVFKNFVSCYQYMETCRQNNPLTMQVLALRKYTDTLVFHSVLDGMFRLRLLMQDEDQ >gi|229784055|gb|GG667680.1| GENE 4 4772 - 5530 793 252 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 225 1 216 234 85 27.0 1e-16 MLKIAVCDDEPLDLNRIQDLVSEYIKDHPNPGVTITRFDSSESLLERLKNETYQIYLLDI LMNGKNGIEVGEVIRENDSHAAIIYLTVSPDYALASYSVDAQYYLLKPVEKEKFFRILGR EIENLTKKTDTYITVRTKDGLKSIRTDLIMYVEQNYHVFSYHLTNDMVLESVTSRASFDQ ELQDLLSDSRFVKISSSILVSMRCIKTINKKGLILQNGKELLVARAYSEAKHRYMEYMIQ GVKQEDRVYHGI >gi|229784055|gb|GG667680.1| GENE 5 6001 - 6414 360 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622624|ref|ZP_06115559.1| ## NR: gi|266622624|ref|ZP_06115559.1| hypothetical protein CLOSTHATH_03858 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03858 [Clostridium hathewayi DSM 13479] # 1 137 1 137 137 267 100.0 2e-70 MVIAVCSKQEVEYQAIADAIRSYPKREQYVMISRWFPSVAEYRDTHLKHPFAVTIFAFDD IDNQELSILVHRTSEESQIVWVGEDKRFGVASYRVRAANFLMKPITEKEIWISLDRCIKR LQEEHAIDGYDCVNIRE >gi|229784055|gb|GG667680.1| GENE 6 6417 - 6551 76 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622625|ref|ZP_06115560.1| ## NR: gi|266622625|ref|ZP_06115560.1| hypothetical protein CLOSTHATH_03859 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03859 [Clostridium hathewayi DSM 13479] # 1 44 2 45 
45 67 100.0 3e-10 MIIFFISLVRIARCMNGERQAEGILWFRMPDMIAYPVKIAVMIL >gi|229784055|gb|GG667680.1| GENE 7 6667 - 14619 5493 2650 aa, chain + ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 378 862 872 1325 1734 85 24.0 2e-15 MRKQFMPKTKGRRIMAMFLSICMVFTLSPAMVFADEKWEKTEAGSMYTYEQGSLADETKE CSELDESDGTDETDEFDEIDGANKIDETNETDELDENDGTDETGAFDETNGKKETSEFEE TGGTDETGGTDETSGTDETGGIDENGGADETDGIDKNGGDRENADLTDCFGEILETDTKP ELECLCDPAIEGEEEHTNKDCPFYKEKATVCNCSVLCTADQTDTECPVCREDYSLCEAEA ASAVPSSDDMAVKAIIEQMEALPTVDSIYENMPGDGDEEYDAWLKKMQQIIADIRAAKEA YDGLTEEERQLVGEEYREKLLDLWNFAERLGEMKTLDVIASGTSGGCTWVLDGDGSLTIS PTNGVSGTLNGLSSVDKAKVTKIVVDNEVMMSNDYNRIFADFNNCTEMDLSGLNTAGHKD FGGMFENCKSLQVLNLSGFDTSAATSLSSMFSGCKKLTTLDLSDFNTSKVTNMFEMFKGC SELTALDVSSFTTANVRNMSEMFYNCSKLTELDLSNFNTTNVTNMQRMFGGCSGITVLDF SNFSNFKTTNVTDMSSMFVDCSELTTLNVGTFDTASVTDMSNMFNGCKKLDQLDVSAFHT SNVKKMFNMFQNCSGLQTLDIENFETSNVTTMSSMFMGCSGLVELKWDNGKFNTSAVTNM SAMFSGCKSLQTLDVSNFNTSQVKDMGSMFSGCSNLTQLDVSHFDTSTVSSTYIGIVTGG MSHMFSDCSSLTELDVSHFDTSNVVLMSYMFNNCSGLKSLDISNFKTGNTIKMDGMFSGC SGLSELDVSKFDTAQVKSMAVMFYGCSGLLELNVSNFNTSNVTDMGSMFSGCESLQTLDV SNFNTSSVTVMSFMFSGCSGLLELDVSRFDTSQVMYMISMFSGCKGLQTLDISSFSFNEK CYFPNVFEGCSLGLKTEFYHAQMGGSSISIYNKHRFENMTDSDISWYKNLICTPYVITPN VTEGGTVTYSGKYGNRAACDTVYQTYFVKTAPFYNEETIAPEFDAAKFYAQDGEGFKISY TPLDGYAVTKVVVDDMNINLKQYPKEYTMKVTKDMEVSVTFTKLVNVTLMTGNVSLRTIS SAAGEKLEKPADLVKTGYIFAGWYKEPELLNQWDFDNDILNNEDINLYAKWIGITYNVTL YPNGGTIEEGEITSYTCGEGAVLPTKVTKTGYTFGGWYDNPECTGYSPTKAIFDNATGDK TFYAKWNVNSYWITLYTLGGSIQEGNVTKYTYGQGAVLPTNVTRAGYKFAGWYDNFDCTG EPVVKISASDIGDKGFYAKWSANSYTITLNTNEGTLEGEAVSEYTYGIGAVLPANVTRAG YTFAGWYDNSGLTGNPTAAISTTDTGDKTFYAKWNDTQKPVLTVSQPPKDWKAENVTLTL SYSDNEGVTALYSRLDDGSYAEITDFTAGSYQYTVSVEGEHTYTFKAVDAAGNAKETAPV TVKLDKSAPQIGTITFDSGHQSFWQWIFGRTSLEVTVPVSDPYSGADHISWVLTPAGKMP ASPKTAAVTNGTAVFTVDSGFKGEILITAYDKVGNSETADVIRAALEDHQPVISVTDGTK 
PFTGDWYDAPQMIHVEVTDGEDGSDTESGLHTVVYTVDDNLTNTPLFTRTEGELAYYAER SFSAKEGDHTYHITAADNAGNSVTVHVTVKQDTTPPVVTELTADKAGEKSIGITFTSDDT GTYHYLVRTARDAKPDAETVIASNSADIAEAGAKVSFTADGLLPNTEYVLYLVTVDKAGN KGTVQSISFATEKEDIEGTVVLDDTAPRYGDVITVEAHLTGSNPGDLAFQWYRGDEAIDG AVKHSYTVTKDDVGESLSVKVSAENYKDVLCSAPMDPVGKRQLTVTAEAEDKIYDGTADA VISLEITEGKIDGDDVTAVGAGAFSDVDAGPEKTVVLSDIALEGADKSYYELKSQPDNPK ADIGKMAGSAAPEVSSVDETYPGASDGKITGLTAGLVYEISSDNGVTWEDAVLTGTEIHN LRAGAYQVRIKESTNSLAGEMTAVTIGTTPPVHAPVPQTVFDAFLMTLAGTREGMQYSLD GGITWIDITHGTEVKLEEGITEENGIYVRQKGNGTTLLDSLPQMITITKSPVPEGIISVG ETGMGLGDGKLKHVTPDMEYRKDGENAWRSCEGTEITVNPGVYYVRMKGKGTELASEEAG PYVIQKYETKAVTGIKMDVSSLILAQGASFGLTFTLEPADATDRTVVWSSSNPSVAAVAE DGSVSALHIGTAVITVKAADGGWSDSCRITVKAPGGVNGIVKDDYGPVEGATVEAREGGT DGTLAGEPAVTGADGRYIFNNLPRGIYSLVAKRTSAEGKLQVITHIIKITDEMEKCDIFM PAGDKNTIVEVKADTPPVAVDYLDELFLHENLVEVPEEGVTPEDVETVAAGGSLEIKLLA EKKERKDAAVQEDVGKIEESAGPGTGLLVLDLAVEKTVREAGAQTGSTTRLKRLPELIKI VVPIDNLSDKERIRVLRVHGGETQELPLGDEYFEVDENNLILYVNQFSIYAVAYDEEIKV SDGEKERTSDSKWIEGTWILDEVGWWYDYGNRLWPSNGWYYLPWRKEYYWYHFNKTGYID TGWFTDTDGKIYYLHPYHDGSQGYMYTGWNFIDGNWYYFETKQGKEKGMLYVNRKTPDGY LVDASGAWVQ >gi|229784055|gb|GG667680.1| GENE 8 14858 - 16759 1458 633 aa, chain + ## HITS:1 COG:lin2156 KEGG:ns NR:ns ## COG: lin2156 COG0178 # Protein_GI_number: 16801222 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Listeria innocua # 6 633 8 638 746 630 49.0 1e-180 MEDMIVQGLTQNNLKHVSFRIPKEKITVFTGVSGSGKSSIVFDTIAAESQRQMNATYPSY VRSRLPKYPKPAVERIDNLTASVVVDQSPLGGNARSTVGTISGLYASLRLLYSRIGSPYV GTASYFSFNDPNGMCRTCSGLGKITQVDIEAVIDRDKSWNEGCVKDSLYSPGTWYWKQYA RSGLFDLDKPVKEYTAEEYNLLLYGSRSGTGKPENPKVTGIFHKYTKTLLNRDISSKSRH TQEKSQNLIAETVCTECHGRRLNKAALRCKINGYSIADLCEMELTQLREVLSRISDQTVE VLVQTLIEGLDRMIEIGLPYLHLNRETPSLSGGEAQRLKLVRYMGSSLTGMTYIFDEPSA GMHPRDVCRMNSLLKQLRDKGNTVLVVEHDKDVISIADYVIDVGPGAGQEGGEIVFAGSY QELLESGSLTGNAMLASLPVKIKTRKAAGSLPVRGASLHNLKNVNVDIPLGIMTVVTGVA GSGKSTLISQVFARQYEKDIVMVDQGPITATNRSTPASYLGFFDEIRKVMARESGKADSL 
FSFNSAGACPVCGGKGIIVTELAFMDPIVTECEACCGKRYNEEALACTYKGKNIVELLEM TASQAMEVFEDAKIRKHLNVMKQVGLSYLTLGH >gi|229784055|gb|GG667680.1| GENE 9 17756 - 18019 229 87 aa, chain + ## HITS:1 COG:lin2156 KEGG:ns NR:ns ## COG: lin2156 COG0178 # Protein_GI_number: 16801222 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Listeria innocua # 1 81 666 746 746 109 51.0 1e-24 MDEPTTGLHASDIRNLLNLFDLIVSRGNSLVVIEHNLDVMKQADWIIDIGPDGGKNGGEV VFEGVPMEMIQSSETLTAQCLRSSCTL >gi|229784055|gb|GG667680.1| GENE 10 18256 - 19209 956 317 aa, chain + ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 231 313 537 617 621 68 42.0 1e-11 MRVKKIAAALICAALVSGMTGMTALAVVESKISSVTLVVEADLEVGQNIQDQEIEFTPKS DKYTAGEYEFLNTGFTWSEEDIPTVKVSIYAEDGYKFAVSKDKITLKGATLVDYGREDES HTLVLTMQLPPLAEQTSAIEQAGWTSTTGVSWSQSAGAGSYEVKLYRDGKTVGTTKTTES TSYDFSAAMTKAGTYFYRVRPVNRLKPENKGEWVESPSRYIDSDTAAAIYQKGSSAGEWK QDEAGWWYRNGDGTYTVSNWQQIDGAWYFFNEQGYMATGWIDWNGTRYYCDPASGKMLTN AVTPDGHTVGADGALMQ >gi|229784055|gb|GG667680.1| GENE 11 19261 - 22063 1922 934 aa, chain + ## HITS:1 COG:STM1703_2 KEGG:ns NR:ns ## COG: STM1703_2 COG2199 # Protein_GI_number: 16765047 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Salmonella typhimurium LT2 # 389 553 10 173 182 110 34.0 1e-23 MPQNSLQPDLPLIVIVADSAQKVRYVHEEMQKFLDYPVSGCSGRDLSEVIWDIAGQRLNG TPDKGYGELLFSRGSEEYRLTVQLLDSGREQAAPEITVILQKNGDTERRNLEQELSLNLQ RYKIALSRCSNIIWEYHLDEDCAYLQGDRRLPFAPSGKITDFSKKMVKDKCVDPESLPAL RQMFLDLKAGNPMACALLHIRRNDGGQRWFRVTHTVLRSGDGADHIAIGVAEDISREQME QERYQQEEKYRQALVVDALASYEIDIDNDRIIEKIIENNKDMLASVGLSVNCRYTQFLKR WAEKNIHTEDRFIFLQEMNPDVLRRRFENGHQEAECEYRSLNANNEMIWCNTTIYLIESG GHLMGFVNVKNIEEKKRKELELLRQSRIDPLTGLLNRAACLQEINYLLKENGPEESALLI IDVDNFKMVNDTFGHMYGDKVLAGIAYKLRDIFDEDAIVGRLGGDEFLIFMFHIPGEDAV 
CHKAQIVARELQTIQQEGENQVVVTNSIGIAFSPRHGSRFQELYVKADTALRYAKQSGKS CYSVFGDKISRVVPMQYVNREWLLDELEEIVYISDIQDYTLLYVNRVGREQIGMKLEDFE KKKCYQMLQGRNTPCPFCNNAKLVKESFITWEYENPYLKKYFIVKDKLVEWNGRPVRMEI AVDVGNHLIGDYTTTDKYHMETVMLESLRTLNSAEDLESGITRILELITRFYDGCRSYII EIDRERGYAHNTYEWCREGIPSQKDSLQNIALDAIPYIFETFNRKQHLIISRVEELKYTY PSEYQFLKRRRAHSLFAVPFEDETAFSGYIGVDNPNINQDTIRLLDTIAYNIANEIKKRR LYERLEFEAGHDTLSGLLNRSSFVRYQSEIEKKCGAPCGIITADINGLKQLNQDFGHSRG DETIIEVTRIMRSSFPEGDIFRLSGDEFKPFRFQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:34:04 2011 Seq name: gi|229784054|gb|GG667681.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld74, whole genome shotgun sequence Length of sequence - 35302 bp Number of predicted genes - 36, with homology - 33 Number of transcription units - 24, operones - 10 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 685 496 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase + TRNA 853 - 925 83.8 # Val TAC 0 0 + 5S_RRNA 859 - 910 92.0 # AF302131 [D:490..741] # 5S ribosomal RNA # Streptococcus agalactiae # Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Streptococcus. + TRNA 963 - 1036 76.1 # Met CAT 0 0 - Term 1125 - 1183 19.0 2 2 Op 1 36/0.000 - CDS 1195 - 3498 1549 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 3 2 Op 2 . - CDS 3488 - 4159 272 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 3 Tu 1 . - CDS 5078 - 5203 156 ## gi|266622634|ref|ZP_06115569.1| 30S ribosomal protein S8 - Prom 5364 - 5423 9.0 5 4 Op 1 . - CDS 5712 - 5855 76 ## gi|288870870|ref|ZP_06409940.1| hypothetical protein CLOSTHATH_03873 6 4 Op 2 . - CDS 5913 - 6023 77 ## 7 5 Op 1 . - CDS 6215 - 6460 128 ## gi|266622635|ref|ZP_06115570.1| conserved hypothetical protein 8 5 Op 2 . - CDS 6482 - 6646 184 ## gi|266622636|ref|ZP_06115571.1| integrase-recombinase protein - Prom 6681 - 6740 8.3 9 6 Op 1 . 
- CDS 7044 - 7121 68 ##
10 6 Op 2 . - CDS 7159 - 7338 64 ## gi|288870872|ref|ZP_06115572.2| transcriptional regulator
    - Prom 7477 - 7536 2.1
    - Term 7556 - 7596 2.0
11 7 Tu 1 . - CDS 7605 - 7928 423 ## gi|288870873|ref|ZP_06115573.2| transcriptional regulator
    - Prom 7978 - 8037 5.9
    + Prom 7911 - 7970 5.7
12 8 Tu 1 . + CDS 8136 - 8597 276 ## gi|266622639|ref|ZP_06115574.1| toxin-antitoxin system, antitoxin component, Xre family
    - Term 8675 - 8720 8.5
13 9 Op 1 . - CDS 8767 - 9537 738 ## COG5523 Predicted integral membrane protein
    - Prom 9578 - 9637 4.8
14 9 Op 2 . - CDS 9690 - 10280 611 ## COG2135 Uncharacterized conserved protein
    - Prom 10309 - 10368 4.2
    - Term 10386 - 10438 6.0
15 10 Op 1 1/0.667 - CDS 10461 - 12704 2325 ## COG1511 Predicted membrane protein
16 10 Op 2 . - CDS 12691 - 15114 2505 ## COG1033 Predicted exporters of the RND superfamily
    - Prom 15160 - 15219 6.0
    + Prom 15122 - 15181 10.6
17 11 Tu 1 . + CDS 15306 - 15887 482 ## COG1309 Transcriptional regulator
    + Term 15889 - 15961 22.1
    - Term 15884 - 15943 13.3
18 12 Op 1 . - CDS 15944 - 16423 463 ## Cfla_0433 aminoglycoside-2''-adenylyltransferase
    - Prom 16484 - 16543 1.9
19 12 Op 2 5/0.000 - CDS 16551 - 17171 418 ## COG1186 Protein chain release factor B
20 12 Op 3 . - CDS 17182 - 18234 770 ## COG1690 Uncharacterized conserved protein
    - Prom 18477 - 18536 5.6
21 13 Op 1 . - CDS 18754 - 19014 254 ## Dhaf_3117 MerR family transcriptional regulator
22 13 Op 2 . - CDS 19023 - 19922 1020 ## gi|266622649|ref|ZP_06115584.1| hypothetical protein CLOSTHATH_03891
    - Prom 20107 - 20166 80.4
23 14 Op 1 12/0.000 - CDS 21010 - 21795 831 ## COG3958 Transketolase, C-terminal subunit
24 14 Op 2 . - CDS 21783 - 22622 926 ## COG3959 Transketolase, N-terminal subunit
25 14 Op 3 . - CDS 22640 - 23293 952 ## COG0176 Transaldolase
    - Prom 23373 - 23432 8.2
26 15 Op 1 . - CDS 23528 - 24115 545 ## DSY3594 hypothetical protein
27 15 Op 2 .
- CDS 24128 - 24472 377 ## DSY3591 hypothetical protein
    - Prom 24509 - 24568 13.3
28 16 Tu 1 . - CDS 25470 - 26015 600 ## COG1940 Transcriptional regulator/sugar kinase
    - Prom 26109 - 26168 4.4
    + Prom 26158 - 26217 9.0
29 17 Tu 1 . + CDS 26246 - 26629 317 ## COG3682 Predicted transcriptional regulator
    + Term 26777 - 26814 1.5
30 18 Tu 1 . - CDS 26915 - 27055 83 ##
    - Prom 27253 - 27312 3.6
    + Prom 27274 - 27333 2.7
31 19 Tu 1 . + CDS 27353 - 28429 607 ## COG2602 Beta-lactamase class D
    + Term 28530 - 28593 13.6
    - Term 28526 - 28575 3.5
32 20 Tu 1 . - CDS 28584 - 29477 908 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2
    - Prom 29518 - 29577 80.4
33 21 Tu 1 . - CDS 30421 - 31713 1280 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2
    - Prom 31846 - 31905 6.2
    + Prom 31980 - 32039 8.1
34 22 Tu 1 . + CDS 32062 - 32496 442 ## Cphy_1855 hypothetical protein
    + Term 32745 - 32778 1.0
    - Term 32526 - 32585 15.5
35 23 Tu 1 . - CDS 32626 - 34068 1555 ## COG1686 D-alanyl-D-alanine carboxypeptidase
    - Prom 34124 - 34183 8.1
    - Term 34173 - 34229 11.1
36 24 Tu 1 .
- CDS 34243 - 35277 515 ## COG3119 Arylsulfatase A and related enzymes Predicted protein(s) >gi|229784054|gb|GG667681.1| GENE 1 1 - 685 496 228 aa, chain - ## HITS:1 COG:L39484 KEGG:ns NR:ns ## COG: L39484 COG1597 # Protein_GI_number: 15673780 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Lactococcus lactis # 2 224 3 218 302 110 30.0 3e-24 MYYFIINPNAKCGRGKNVWKKLEKILERENAQYQAYITEKPGDAKVFARKLTEGCREPHV IVAVGGDGTVNEILDGLSFCSPVTLGYIPAGSGNDLARSLKLPRKPGRCLKKILYPKYHK QMDYGVLTYGGDEISCRRFIVSAGIGMDAAVCHNILYSRTKGILNKLHIGKLTYLLIGVK QLLLAKPAKGYLLLDGVQKVEFNHAYFISAHIHPYEGGGFKFKPFRFQ >gi|229784054|gb|GG667681.1| GENE 2 1195 - 3498 1549 767 aa, chain - ## HITS:1 COG:lin2220 KEGG:ns NR:ns ## COG: lin2220 COG0577 # Protein_GI_number: 16801285 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Listeria innocua # 23 330 3 309 646 110 30.0 1e-23 MSFKKKVPQDNRTYGKSKLSPVFFDMAFKNVRKSARDYLIYFFTIALGVALFYSFNSIGK QFSVLKIPDTLSYISFTVSSMAVISVFVCLIIGFLIVYANRFLLKRRKKEIGIYMTLGMS ERNISDLLMLETLLIGIFAVAVGIPIGILVSQGLMIFSAQMLNIPMKQAKLFISQEAIAG SILFFFVLFAVVHLFNKKEIRKLKLIELLKAEKQNEAIKTNGFVSILSFFLSVFLTLAGY YFLFYDPLQNMMNALVFGILLISVGTILFFFSVSAILLSVFGNIKSYYYRGLNMFVLRQL SSRIKSSGLSMAVICILLFLSAASMAVGPILGKSSVKGTDQAMPYDAELVRNYPGSYQDY GKENDPWVNLDSSDQNDGWDQSDGWNQNDGWDQSDGWDQNDGWDQSDGWNQNENLDQQWG TADVSGSNEMVENSALWEGSLTENLQRLGFPLSDIFKECGELTIYFSDDLKEQAFLVEGY PRKESEKGFGNFTIGIVGVEEYNRMMELQEKEGISLGENEFAITYNMSEYGGIYQYYANN HKKPLRINGYDLNLKEDGVYVRTMGIENVLMNNGTVIVPEKVLSGLAANRTMLNGNFKDE NTGYTKFVEAVRALNTNLIWTTRQDTFVEVTMANILLSYVGLYLGICFIITAGAVLALQQ LSQTADNQQRFLLLQKLGTKRSMMLKAMRVQIVTYFALPFLLALMHAWVMITYTMNTITN LSLAEQLNYVYLSGGIVLVLYGTYFVGTYLGSRRIVLEVISEKTNRK >gi|229784054|gb|GG667681.1| GENE 3 3488 - 4159 272 223 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii 
AND4] # 1 193 26 220 245 109 34 3e-23 RGEYVGIMGASGSGKSTLLNCIATIDEVTSGNIYIAGKDITKMTKNQLSAFRREQLGFIF QDFNLLDTLNCKDNIALALSILGVRPGQIEKKVKDVAQELGITQILKKYPYEISGGQKQR VACARAIVTEPSLILADEPTGALDSKSASMLLDNFEYLNEQKGATIVMVTHDAFTASHCS RILFIQDGLIFHQLHRGNSDRKAFFQKIMKVVSDLGGDSDHVL >gi|229784054|gb|GG667681.1| GENE 4 5078 - 5203 156 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622634|ref|ZP_06115569.1| ## NR: gi|266622634|ref|ZP_06115569.1| 30S ribosomal protein S8 [Clostridium hathewayi DSM 13479] 30S ribosomal protein S8 [Clostridium hathewayi DSM 13479] # 1 38 19 56 56 72 100.0 1e-11 MEKILVAEGLIKQYITKGHTTNALNGIGFEVCRGEYVGIAS >gi|229784054|gb|GG667681.1| GENE 5 5712 - 5855 76 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870870|ref|ZP_06409940.1| ## NR: gi|288870870|ref|ZP_06409940.1| hypothetical protein CLOSTHATH_03873 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03873 [Clostridium hathewayi DSM 13479] # 1 47 1 47 47 87 100.0 3e-16 MQNIIKYCTVALLKHRYNRNGKTAGVQHFSDFGELGSLSVSLSESGG >gi|229784054|gb|GG667681.1| GENE 6 5913 - 6023 77 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIAFDVEPFLQAHPQLDYLKEFLKKRPFHPRTAFSV >gi|229784054|gb|GG667681.1| GENE 7 6215 - 6460 128 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622635|ref|ZP_06115570.1| ## NR: gi|266622635|ref|ZP_06115570.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 81 1 81 81 157 100.0 3e-37 MTICSKALMDGIAAHVEHIDCDKVTSKELMEIYRLPATVNAAKVQLHIGNSVPVTAFIER PFSRMVIRTRIYQRKLLRWTS >gi|229784054|gb|GG667681.1| GENE 8 6482 - 6646 184 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622636|ref|ZP_06115571.1| ## NR: gi|266622636|ref|ZP_06115571.1| integrase-recombinase protein [Clostridium hathewayi DSM 13479] integrase-recombinase protein [Clostridium hathewayi DSM 13479] # 1 54 1 54 54 95 100.0 1e-18 MRELGNEMQLVEEFKQYLVLEEKSKATIDKYMRDIVKFLDWEWEPRGSGNVTIS >gi|229784054|gb|GG667681.1| GENE 9 7044 - 7121 68 25 aa, chain - ## 
HITS:0 COG:no KEGG:no NR:no MRKTMGFEMEVAGEYCMVGLKFVKA >gi|229784054|gb|GG667681.1| GENE 10 7159 - 7338 64 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870872|ref|ZP_06115572.2| ## NR: gi|288870872|ref|ZP_06115572.2| transcriptional regulator [Clostridium hathewayi DSM 13479] transcriptional regulator [Clostridium hathewayi DSM 13479] # 1 59 33 91 91 110 98.0 4e-23 MNNNRDRVLSYTEEFSKKYGYTPTTFQISNELYIPHDKVSSIIEALRAEGAIQISRPLS >gi|229784054|gb|GG667681.1| GENE 11 7605 - 7928 423 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870873|ref|ZP_06115573.2| ## NR: gi|288870873|ref|ZP_06115573.2| transcriptional regulator [Clostridium hathewayi DSM 13479] transcriptional regulator [Clostridium hathewayi DSM 13479] # 16 107 1 92 92 173 100.0 4e-42 MSSVRGILERASLDSMIACLLDEPGPVNGTIEVSESAIENSYREFILELTHLYPGVNKDD DELFDAVIRFASVLEEIYLKIGFIVGRKLDGDLREGEEKLKNIVTAK >gi|229784054|gb|GG667681.1| GENE 12 8136 - 8597 276 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622639|ref|ZP_06115574.1| ## NR: gi|266622639|ref|ZP_06115574.1| toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] # 1 153 1 153 153 261 100.0 1e-68 MVNNKILMYMKLKDINKTQLSALTGIPATTLNRIINGTVINVKTAYMENIAKALGTTIYN LFDLPELNQPLITALRPEEAGTVVEKLLSEEESLLLYWFQNSSQDSRASILMKARNEYYV TQREINEKLSIGKAFVSDPADYNQIEFNFKHPN >gi|229784054|gb|GG667681.1| GENE 13 8767 - 9537 738 256 aa, chain - ## HITS:1 COG:lin0656 KEGG:ns NR:ns ## COG: lin0656 COG5523 # Protein_GI_number: 16799731 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 70 215 20 176 345 70 27.0 3e-12 MWTRAELKDRAKNDLRPYYWYGVLVCFIAGILGAGSSGGGVSISTGSRASENSYSIGSIR DGRHAAVFLLILGIFLVVFIIAVAVGVFVSNVVAVGKLRYFMDSAVMQQSAGIGRIFFAF GGGNYLNVVKTMFLKGLFEGLWSLLLVIPGIYKHYEYYMIPYLLADNPEMDRHEAFRLSK EMMDGNKFNTFVLELSFIGWYLLGMLLCCIGGIFVNPYYEMTFVELYWVLRDQVMGPRPT EAMDQNVCADAYYREI >gi|229784054|gb|GG667681.1| GENE 14 
9690 - 10280 611 196 aa, chain - ## HITS:1 COG:Cgl0741 KEGG:ns NR:ns ## COG: Cgl0741 COG2135 # Protein_GI_number: 19551991 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 34 165 32 164 207 78 33.0 7e-15 MCGRYYVDDETAREIEKLVETVNDKLRLERLGRDIHPTDEAPVLAAGNHEMKLEWQHWGF PGFQGKGVIFNARSETVLEKKMFRESVLHRRIIIPCTWFYEWNRKKERVTFYREDAPVIF LAGFYNVFGDEKRFTVLTTEANQSMVSTHDRMPLIIEHDQIKDWVMDDTKFSGMLKQTPV LLETKQEYSQQELKFL >gi|229784054|gb|GG667681.1| GENE 15 10461 - 12704 2325 747 aa, chain - ## HITS:1 COG:BH0721 KEGG:ns NR:ns ## COG: BH0721 COG1511 # Protein_GI_number: 15613284 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 34 384 39 361 599 86 23.0 2e-16 MRKSKRISAFVLAAVLACGPSMEAGAAADTPVCDETLYITMDPYGEIKESSVVKSYTMNG SSRIVDYGTYNRITNMTDHSEPVTEQDGKLTFNLENPEDRFYFEGGTETTKEELPWNILV SYRLNGLEKKAEELCGAAGLVEVNVDLTPNKGVSDYYRNNMVLMAGTVVDMDKNLSLEAE GAQVQAMGNLNAVIFFALPGEEQHYSIRIGTDDFSFSGVVFTMVPLTASQLDKVGDLRKA KTTLEDSADAISDSLDTLFDTFDGMQKSVEDSADGLRGLDHSRQLFADSKGKVYADADEA LAGLNELTRQFEPFSGHMREAEKYLDTMNSHVNELTGHLDDLSPDLEDMKATMRDLRDDL RGISSMLNSPQVDLGAQAFLQLMEKTKTDLEIFKQSQAKLDGGVSALAPALAGLLAQVGG LGGKKAEYFSEDSLEDFIDEMESAGVDLGDEDEVASYLRNELGMAADEIDALMRYASQAL YETATASQPDTGNIAAAARAIVEGFHGTAGNTGLITDIQTMIGQTEVLLKAAASQKGNAV GMLQNTADMAHLAASMCNTLDDIISDADSLTKTVNYYHEGTLETLRDLGSMTDAAAGSLK SLTVFCTSFENQLKTVGDSLNSSTQKTLNGLAGTLDGLGNGLDQTDVLKNAKDTIKTVID DKWDEYTTKDTTILNIDMDAKPVSLTSDKNPSPRTVQIILRTSEIREKDKSNAAEVDESY HPEGNVFHRIASIFTRIWKMITSLFQK >gi|229784054|gb|GG667681.1| GENE 16 12691 - 15114 2505 807 aa, chain - ## HITS:1 COG:BH0720 KEGG:ns NR:ns ## COG: BH0720 COG1033 # Protein_GI_number: 15613283 # Func_class: R General function prediction only # Function: Predicted exporters of the RND superfamily # Organism: Bacillus halodurans # 14 707 7 684 687 411 36.0 1e-114 MKRNSDIFNQVIRFIVYHGKIIEKIFLVTTILCAVCFPFVGVNYDLSKYLPDFAPTKQAL DVMEAEFGYPGMARVMVKDVSLQEAKRIREEISAVDGVDLVLGSDVSTDVYMGTPFLSES 
MTEFIGVDLLAIDDYYKDGNALMDIVFEDKDGEPRTNAAIEEIYRIVGKDRGCFSGSAIS SKEREASITREIAMAIAMSIVIIWLILTLTTTSWMEPFLFIFVMIVAIVLNMGSNIIFGT ISFFTFSTAAILQLAVSMDYSIFLLHTFTALKNRGMEIHEAMVEAIRESCSSILASGVTT IVGFIVIAFMRFTIGKDVGFVLAKGIICSLLTVLLLMPTLILRFDDKIVKTAHKPLIASF DGFARAMYRIRIPVFLLAALLAVPCYFGQGMNHFLYGDDAIGAGPGTRVYEDTQEIDRLF KKSNMTICIVPNGSGVTEKELSKELENLDFVNYVISMSGTMPDGIPESFLPDDLTSQLRG DVYARMLISMNTLQESDYAFECSARLEEIVHKYYPDNSYVIGMTPTTIDIRDILTEDYNH VSLLSLAGVALVVFLTFRSVLVPILVIVPIEVAIYLNMTLPYVMGDTMIYIGYIIVSCLQ LGATIDYSILMTNNYLAFRKEKGRREAAVAAVNKSTLSIMTSGGILMVVGYLLYFTSSIQ AISQVGRLVGRGAFLSVTLVLSLLPALLSAFDKQIKRQQERADARKERRRNRLAAAKNLT TGKSLPGAKRVTEEKELTEEKELSEEKELSEEKELSEEKELPEEAGLTTDVSPTFADGKC GQKEGSAQAEAEEPEQKEKGDDSHEKE >gi|229784054|gb|GG667681.1| GENE 17 15306 - 15887 482 193 aa, chain + ## HITS:1 COG:CAP0046 KEGG:ns NR:ns ## COG: CAP0046 COG1309 # Protein_GI_number: 15004750 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 181 1 178 188 95 29.0 8e-20 MESQKEDRRIRRTQKMLKDSLIELMTERDFKNISVKDITERADLNRGTFYNHYTDTYDLL QKMESDVLADFQDMIDNFLCSSKNGSLMPILLPVIQYIEENVKIARILFENSASSDFAYR FHQLISDNGTAIFKKQYPGIDDVSLSYFIEFITFGLTGVLRCWINNGMKESKEKIAEIAD KSAMQVAQNLWGH >gi|229784054|gb|GG667681.1| GENE 18 15944 - 16423 463 159 aa, chain - ## HITS:1 COG:no KEGG:Cfla_0433 NR:ns ## KEGG: Cfla_0433 # Name: not_defined # Def: aminoglycoside-2''-adenylyltransferase # Organism: C.flavigena # Pathway: not_defined # 2 134 13 143 157 133 46.0 2e-30 MLEQFGIRYWLDGGWGVDALVGRQTREHRDVDIDFDAGYTEELRHKLEEKGYEVVTDWSP VRIELYHPQLSYIDIHPFILKEDGTARQAGLEGGWYEFEADYFGSAVLDGRKIPCISVKG QKIFHTGYEPREVDKHDMKNIEKLMSGVSQSGFTHEYKG >gi|229784054|gb|GG667681.1| GENE 19 16551 - 17171 418 206 aa, chain - ## HITS:1 COG:STM0315 KEGG:ns NR:ns ## COG: STM0315 COG1186 # Protein_GI_number: 16763697 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Salmonella typhimurium LT2 # 3 192 4 201 204 123 38.0 2e-28 
MRIQISSGQGPAECELAVGLLYEELKREAPDIRLLSFTQGKKREGFSSIMFETEHDFTSL EGSVLWICKSPYRPEHKRKNWYVDVSILETVPRVTEEKLVRFETFKSGGKGGQNVNKVET GVRAIHIPTGTAVVSTEARSQHMNKQAALNRLCEILAEMNLESGRREKNLAWMEHTRLER GNPVRVYEGRAFHLKRGNGGDKSSGA >gi|229784054|gb|GG667681.1| GENE 20 17182 - 18234 770 350 aa, chain - ## HITS:1 COG:RSp0700 KEGG:ns NR:ns ## COG: RSp0700 COG1690 # Protein_GI_number: 17548921 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Ralstonia solanacearum # 1 349 14 372 379 297 45.0 2e-80 MISNSKNWIEQAAVDQLRGVAKLPGVVKAVGLPDLHPGKTPVGMAVLSRERFYPHLIGND IGCGMSLFSTGVKQKKFKMEKWETRLNSIRELGDIPCDNPYEGFCPVKDLGTIGTGNHFA EFQCLEHVYREEEAERLGLLEGRIMLLVHSGSRGYGQEILNRFYSPEGLPDDGEEGSRYL AEHDSAILWAGRNRMVTAKKLLAYLGVESEVNPLLESCHNYLERTEEGWLHRKGSVSTKR GAVVIPGSRGSLSYVCVPAADTAMSLDSVSHGAGRKWARSICKSRIDRRYDRDSIRSTKL KSRVVCHDTNLLFAEAPEAYKNVEQVIASLEDYGLIDVVATLRPLLTYKG >gi|229784054|gb|GG667681.1| GENE 21 18754 - 19014 254 86 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_3117 NR:ns ## KEGG: Dhaf_3117 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 13 83 12 82 86 90 69.0 2e-17 MSRQRSMEKISTSEYVSIGELARLTGCRYSTLKYYTEEGMLPFEQEEENLTRRYRRVETV ERIERIKSLKESGKTIPEIKELVETG >gi|229784054|gb|GG667681.1| GENE 22 19023 - 19922 1020 299 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622649|ref|ZP_06115584.1| ## NR: gi|266622649|ref|ZP_06115584.1| hypothetical protein CLOSTHATH_03891 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_03891 [Clostridium hathewayi DSM 13479] # 1 299 1 299 299 583 100.0 1e-165 MKEYTLSQQYALIGLDGQDSIHATAAKNAVCRCIAAAKLLERLVLTEDPGSEAVKEELEA GFKEIRSIGKKEARALETEVAETLKAEGVLDEVPDIMACDINYNSMNMELKVYRSDEVEY QRTAEGVRAEMLEGGAVTLECACLVWLFRESGCIHEIFSVNEQEELAARMIELAEQEPVI RVIWDLEIYRWTEKLSQQLLGAKKNLFKNPYLEGVNLLFPFLDRRQAIFIDFVIFGTDVA GRRMAILNYLTEKGHYVEEVKNGSETLLKVDNYYYRVFPMTKVYYKVPVQGANLVPVYK >gi|229784054|gb|GG667681.1| GENE 23 21010 - 21795 831 261 aa, chain - ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## 
COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 9 260 1 251 309 298 61.0 9e-81 MPEMKEQSVKKIATRESYGSALAELGSVNDKIVVLDADLAAATKTGVFKKVFPERHIDCG IAECNMMGIAAGLAATGMVPFASTFAMFAAGRAFEQVRNSIGYPHLNVKIGATHAGISVG EDGATHQCNEDIALMRTIPGMVVICPSDDVEARAAVKAAAEYEGPVYLRFGRLAVPVIND NPEYEFQIGKGVVLKEGKDLTVVATGLEVGHALEAAKLLAEDGIDAEVINIHTIKPLDEE LIIKSAKKTGRVVTVEEHSVS >gi|229784054|gb|GG667681.1| GENE 24 21783 - 22622 926 279 aa, chain - ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 1 269 1 269 270 321 58.0 9e-88 MSDYLELEKIAVEIRKGIVTAVHSAKSGHPGGSLSAADLFTWLYYKELNIDPEQPLKEDR DRFVLSKGHVAPGYYSTLAHRGFFPVEDLKTLRHTGSYLQGHPDRKHIPGVDMSSGSLGQ GISAAVGMALSAKLQKKDYRVYTLLGDGEIEEGQVWEAAMFAGYRKLDNLVVIVDNNGLQ IDGPVDEVCSPYPIDKKFEAFNFHVICIDGHDFSQIEQAFAEARNTKGMPTAIIAKTIKG KGVSFMEGAVAWHGAAPNDEQYQTAMADLEKAGEALCQR >gi|229784054|gb|GG667681.1| GENE 25 22640 - 23293 952 217 aa, chain - ## HITS:1 COG:lin2886 KEGG:ns NR:ns ## COG: lin2886 COG0176 # Protein_GI_number: 16801946 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Listeria innocua # 1 213 1 211 214 296 76.0 2e-80 MKFFVDTAKVEDIRKANDMGVICGVTTNPSLIAKEGRDFNEVIREIASIVDGPISGEVKA TTTDAEGMIKEGREIAAIHPNMVVKIPMTVEGLKAVKVLTAEGIKTNVTLVFSANQALLA ARAGATYVSPFLGRLDDISMPGIDLIETIAEMFRVAGITTEIIAASVRNPIHVIDCALAG ADIATVPYSVIEQMTKHPLTDQGIAKFQADYKAVFGE >gi|229784054|gb|GG667681.1| GENE 26 23528 - 24115 545 195 aa, chain - ## HITS:1 COG:no KEGG:DSY3594 NR:ns ## KEGG: DSY3594 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 193 4 196 201 306 68.0 3e-82 MDCAKVGRLILQFRKEKGLTQQQVADLLGISNKTVSKWERGLGCPDVSLWEGLSTVLGAD IIKLLQGELTPNRPDVGKLEKARFYVCPVCGNILVSTGNVSISCCGRKLKPLIPVPCSEE 
HKPVIREMDLEYFVTLDHDMSKEHYIAFAACVNHDRIYLNRLYPEQNPEFRMPMIRRGGY LYVYCTKHGLQKYPF >gi|229784054|gb|GG667681.1| GENE 27 24128 - 24472 377 114 aa, chain - ## HITS:1 COG:no KEGG:DSY3591 NR:ns ## KEGG: DSY3591 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 114 2 115 115 147 62.0 9e-35 MNTERKKQQIMEYKNRHPEMGVISLKCKATGEAFLGISGDTRTAFNSVCARLSGSSHPNR HLQELWNEYGEEGFEQTVMRVLKYEDPGEDHTEELELLREECLEEDERAVKIWK >gi|229784054|gb|GG667681.1| GENE 28 25470 - 26015 600 181 aa, chain - ## HITS:1 COG:PM0683 KEGG:ns NR:ns ## COG: PM0683 COG1940 # Protein_GI_number: 15602548 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Pasteurella multocida # 1 177 1 178 407 73 30.0 1e-13 MVQNRGINQDVTQEMNRSLLLKNLRREGMCSRAYLASLTGLKQATVTNIMKDFLNWGIVT EVGFLNGSKGRRSIGVSISQDRYRVIGVRLARKHYSVGLFDLTGTQITKKRVDFNPEEQP DAEDILNQIAAEMHALMKKYGKDTILAAGMAVPGPFIAKKNRIALITGADIWKDVDLKKF S >gi|229784054|gb|GG667681.1| GENE 29 26246 - 26629 317 127 aa, chain + ## HITS:1 COG:CC1640 KEGG:ns NR:ns ## COG: CC1640 COG3682 # Protein_GI_number: 16125886 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Caulobacter vibrioides # 6 126 24 144 144 104 43.0 4e-23 MSRLPQISEAEYEVMKVIWSKAPINTNEVTDILLQATDWSPKTIQTLLKRLVQKGAITYE KKSRVFVYTPLVVKEDYLNQESEHFLKRFFNGNLTSLVANYMKDERISQEELEELRSLLL HDGEGEP >gi|229784054|gb|GG667681.1| GENE 30 26915 - 27055 83 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIYTDCGNQSKNCEDDCSSTPQADKSGCPKTRYLPAYGNGKIVQYS >gi|229784054|gb|GG667681.1| GENE 31 27353 - 28429 607 358 aa, chain + ## HITS:1 COG:SA0039_2 KEGG:ns NR:ns ## COG: SA0039_2 COG2602 # Protein_GI_number: 15925746 # Func_class: V Defense mechanisms # Function: Beta-lactamase class D # Organism: Staphylococcus aureus N315 # 109 354 7 251 252 215 43.0 1e-55 MVWLALKEMVSDREIACDTAVLSLLGENASVDYGTALLHFAAHLSHLTFGQTAGIGGSRK EIRKRISFIASYRRESGDKRIKSRMLYLLTAALVLCTTPSVSALACASDRYTEALPRTDQ 
EDLSEYFGQLDGSFVLYDASENRYLISNEEDARTRVSPNSTYKIYSALMALDAGIITGTD SAMDWNGETYPFESWNRDQNLNSAMANSVNWYFMDLDRRRGWDSTRQVLTSLSYGNMDFS GGPSRFWLESSLKISPLEQVKLLAGLSGHTLPFSDNGMKTVQNALFLSSDGASSLYGKTG TGTVDGKNTNGWFIGFIESVHGPYSFAVHISGEDGATGVKAGEIALDILKAKKLYGGN >gi|229784054|gb|GG667681.1| GENE 32 28584 - 29477 908 297 aa, chain - ## HITS:1 COG:CAC3683 KEGG:ns NR:ns ## COG: CAC3683 COG0768 # Protein_GI_number: 15896915 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 1 292 382 669 671 266 48.0 3e-71 MSEEKWASLNEDERKPMFNRFRQRFAPGSSFKPITGVIGLSTGALTPDESFGTDHAGLSW QKDASWGKYQVTTLHDYSPVNLENAYIYSDNIYFAKAALKIGYDDFMAGLDRLGFNQDLP FEISVAQSQYSNTERIETEIQLADSGYGQGQVLMNPIHLAALYTMFPNGGDVIKPYLTYQ KEREPEVWLKGACTQKTAETIETALKKVVSSEHGTGHAAMRNDITLAGKTGTAEIKASKG DTSGTELGWFAVYTADGPDHTPILMVSMVEDVKNAGGSGLVVRKAKAVLDSYIPPVQ >gi|229784054|gb|GG667681.1| GENE 33 30421 - 31713 1280 430 aa, chain - ## HITS:1 COG:CAC3683 KEGG:ns NR:ns ## COG: CAC3683 COG0768 # Protein_GI_number: 15896915 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 43 430 26 374 671 207 32.0 4e-53 MRHRRSKRGLIVGGIAAGILIAAAIVIGILWFQKREETEKEKTPEELLMEYTEYIRTGNY EAMYGMLNEQSRKNINLEDFTARNKNIYEGIGTSGLQVEVQNVEERTEGLKTVSYQTSLN SSAGEIHFLNQVDFIREQPSENQETKRKKDAGEYRLIWNDQVIFPNLGRTDKVRVSTDKA VRGEVLDRNGGLLAGKGSASLVGLVPGKMAGYSAGAAGAAGEEAGANPDLERLSELLGVS VESIEKKLAAKWVKEDSLVPVKTLKKVDEFDLQSANPRQDNLENKALQDELLSIPGVMIT DTPVRSYPLGEAAAHLVGYVQNVTAEDLEKHPGEGYLSDSVIGRSGIETLYEKELKGQNG EKISIVTSEGEEKLVLAAIPRVDGSNITLTIDSSLQRTIYEKFGDVKSCTVAMNPYTGEV LALVNTPSYS >gi|229784054|gb|GG667681.1| GENE 34 32062 - 32496 442 144 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1855 NR:ns ## KEGG: Cphy_1855 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 134 8 140 144 148 66.0 6e-35 
MKIKKNYLLLLAGIVWLAAGFNILRIGIISYKSYVTLFHLALSAAVFFLFWFMVFGKLVR RHTDRIVRYEEEMQFILKFFDRKSFIIMAFMMTFGIGLRTSGLCPDIFIAVFYSGLGTAL VLAGLAFTGNYIIQAAGNCQHNPE >gi|229784054|gb|GG667681.1| GENE 35 32626 - 34068 1555 480 aa, chain - ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 29 288 27 288 425 177 37.0 5e-44 MKHFLKKILCWTLISNCFLMTAYAAPAWPAGPDIQAEAGVVIDMDSGAILGGKAYDTPYP PASITKLLTALLVLEHSKLDEMVTFSNSAVYNVESDSGNKLNVAEGDQLSVEDCLYSLLV HSCNQAANALAEHVAGSQEAFVKMMNDKAAALGCKDSHFDNPSGLNGDTQYVTARDMAAI SRAAFAEPDIIRISAALKHTFPPTINNPEGLTIYAENKLILNTNDSSSQYYYPPVKGGKT GYLLKAGNTLVTYAEKDGKRLISVILKGSPGQYYLDGKALLEFGFEHFTNYNVAENESAY TEGEAAVDVGGKSYLPADLEINPAGAVTLPEGASFTDLEHTLVTDYPDEHPERSVAFIQY TYDGRKAGGAYLTQKEVQEPVTEPASQEETQPETDDKKENGKKGFSINPFLIIALVVIAA AAGGGAYIFYSRKKEEQQRALRREQRRRRLREEGVTEEEFDRLLKERLGDRSKGKNQKKK >gi|229784054|gb|GG667681.1| GENE 36 34243 - 35277 515 344 aa, chain - ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 7 339 134 452 467 129 29.0 9e-30 MRVEMNLCEKGWIDDVKEGDIYDFPRELTNEALSGILAGPKECHEAYYVADMACRQLEEL KQEELNREKEKGSGQVPFMMRVDFWGPHQPYCPTEEFAALYPPESIEEYPSFADDLAGKP ESYLFDTGRETSRERQLIRPNPMEWSRWQKILSRCYGQITMVDEAAGRVIEKLRELHLDE NTLIIWTADHGDALACHGGHFDKAFYLPEEVLRIPLAMAYPGVLPKNRVCRKLITNCDLA PTIVSAAGGSFHLPVDGDDILRLFTEQKPCWRTAVLAETYGHLARWRAEAVVWQQYKYVD NHEDMEELYDLEADPYELHNLALDEEYQVLLMKMRMKRLELKPE Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:35:33 2011 Seq name: gi|229784053|gb|GG667682.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld75, whole genome shotgun sequence Length of sequence - 23545 bp Number of predicted genes - 24, with homology - 23 Number of transcription units - 8, operones - 7 average 
op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 329 204 ## COG0583 Transcriptional regulator - Prom 394 - 453 10.2 + Prom 367 - 426 8.7 2 2 Op 1 . + CDS 473 - 1342 872 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 3 2 Op 2 11/0.000 + CDS 1383 - 1889 242 ## COG1838 Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain 4 2 Op 3 . + CDS 1902 - 2765 203 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase + Prom 2794 - 2853 1.7 5 3 Op 1 . + CDS 2881 - 3777 525 ## GC56T3_2842 hypothetical protein 6 3 Op 2 8/0.000 + CDS 3825 - 4853 779 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component 7 3 Op 3 . + CDS 4881 - 6755 1372 ## COG4666 TRAP-type uncharacterized transport system, fused permease components 8 3 Op 4 . + CDS 6758 - 7465 642 ## COG0684 Demethylmenaquinone methyltransferase 9 3 Op 5 . + CDS 7468 - 7671 200 ## gi|266622671|ref|ZP_06115606.1| conserved domain protein + Term 7734 - 7770 3.0 + Prom 7760 - 7819 5.0 10 4 Op 1 1/0.000 + CDS 7960 - 9375 396 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase 11 4 Op 2 . + CDS 9372 - 10388 992 ## COG1609 Transcriptional regulators 12 5 Op 1 . + CDS 10508 - 13708 2746 ## COG0383 Alpha-mannosidase 13 5 Op 2 . + CDS 13722 - 14612 783 ## COG0583 Transcriptional regulator + Term 14619 - 14657 6.1 + Prom 14624 - 14683 5.6 14 6 Op 1 . + CDS 14800 - 16218 1461 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 15 6 Op 2 . + CDS 16260 - 16979 797 ## TDE0003 hypothetical protein 16 6 Op 3 . + CDS 17007 - 17120 155 ## 17 6 Op 4 . + CDS 17148 - 17729 640 ## TDE0004 hypothetical protein + Term 17770 - 17820 10.2 + Prom 17848 - 17907 6.7 18 7 Op 1 . + CDS 18025 - 18507 568 ## COG0590 Cytosine/adenosine deaminases 19 7 Op 2 . + CDS 18522 - 19814 380 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 20 7 Op 3 . 
+ CDS 19866 - 20633 716 ## COG2188 Transcriptional regulators + Term 20690 - 20739 4.4 - Term 20678 - 20727 8.2 21 8 Op 1 . - CDS 20742 - 21062 481 ## Cphy_2462 hypothetical protein 22 8 Op 2 25/0.000 - CDS 21114 - 22187 1267 ## COG0687 Spermidine/putrescine-binding periplasmic protein 23 8 Op 3 36/0.000 - CDS 22236 - 23030 874 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 24 8 Op 4 . - CDS 23027 - 23545 635 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I Predicted protein(s) >gi|229784053|gb|GG667682.1| GENE 1 2 - 329 204 109 aa, chain - ## HITS:1 COG:RSc0615 KEGG:ns NR:ns ## COG: RSc0615 COG0583 # Protein_GI_number: 17545334 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Ralstonia solanacearum # 6 105 11 109 289 63 34.0 1e-10 MNMTDIQCFIKVTELMSFTKAAEALYTSQQAVSLHIKHLESTYKVQLFERKPSLKLTPSG QMLLEAARDIIDRENQLLNQFITMQDEFVGEITIGLPPNRSTAFVCEFK >gi|229784053|gb|GG667682.1| GENE 2 473 - 1342 872 289 aa, chain + ## HITS:1 COG:SMc00501 KEGG:ns NR:ns ## COG: SMc00501 COG2084 # Protein_GI_number: 15965519 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Sinorhizobium meliloti # 3 255 4 256 294 116 33.0 4e-26 MIVGFIGFGEAASSIALGLRGEGIDRIVCYDAMQDDERMSEIMKERVAACQGEKVESAAE VCQKADVVISAVPSNYAVAAAEDAAAGIKEGQLFLDVSTATPIEKKKISEMVKARGGLLI DGAMMGALLKSKHQVPMLLSGDGAKAFRDKMEPYHMCLEVVEGEAGTATSIKFIRSITAK GLSCLLIESLQAAQRFGVEQTIVDSFVDTFGPDFIGIINGYVSGAIIHAERREHELKNVV DFLQSEMLPYTMAEATRTKLQWLADHHVKDNFESGVARDWKKVLEGWKL >gi|229784053|gb|GG667682.1| GENE 3 1383 - 1889 242 168 aa, chain + ## HITS:1 COG:MTH1115 KEGG:ns NR:ns ## COG: MTH1115 COG1838 # Protein_GI_number: 15679126 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain # Organism: Methanothermobacter thermautotrophicus # 1 136 1 134 134 110 46.0 9e-25 
MIKINVPFRDKECRLKAGDLIEMSGKIYCGRDAVLPYVVELAKKGELSEHGIELDGAVIF HTAVSCAGIAPTTTSKPEIEGSMIPLSEKGAMFHLGKGRLLPETIEGLKKNGAYFLITPP VSALLTASMSECRAVLHPEFGMEAFYEITVKDFPAIVAVADGESVFSN >gi|229784053|gb|GG667682.1| GENE 4 1902 - 2765 203 287 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 8 272 8 277 508 82 28 2e-15 MFLNKVTEDVAQTVVDASSHWRDDQVACWKQACAEEEIPHAKWAMENFIKNAAVADKKHC ALCDDTGTPHPFLEIGDEAVLPAGFLAAMEDGIRKGLNDLPARPMGVRGDELERISQCKG LYDEPGMLKPAPTQIRRMAGSRIRLTILMLGGGPEIRSRTQALFHMHSLDHVVEEMIEWA VPQVAALGCQPCTLSFGLGRTAVEASSLSLEAMKDGCYDVQSEMEKRITDAVNGKGIGPI GLGGASSVLATFIKIGDMRASGFRVVSMRPNCCMEIRKASCEWETLL >gi|229784053|gb|GG667682.1| GENE 5 2881 - 3777 525 298 aa, chain + ## HITS:1 COG:no KEGG:GC56T3_2842 NR:ns ## KEGG: GC56T3_2842 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_C56-T3 # Pathway: not_defined # 3 295 4 294 301 294 50.0 4e-78 MRELIQGAYDLHVHTAPDVLPRRLDDFEMAQRVMDSGMAGFAMKSHYFCTGERAAMTRKL YPECDAVGTITLNGSVGGINPAAVEMAARAGTKLVWFPTSDSAHEQNLTLGGDRNPDKKL PYWAQVVLQMREEGIDNPVIRVVGEDGRLTEATHQVLEVIAGHQMILATSHLSHEETFAL VKAAKEHRVEHIIITHVDFPTTFYTVEEQLELLRCGAYMEHCFVTYNSGKVDFEVVLEQI RAIGASHIILSTDLGQKTGIYPDEGLEQFVTSLYDRGISEGDIKKMTSHNQRILLGKA >gi|229784053|gb|GG667682.1| GENE 6 3825 - 4853 779 342 aa, chain + ## HITS:1 COG:AF0635 KEGG:ns NR:ns ## COG: AF0635 COG2358 # Protein_GI_number: 11498243 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, periplasmic component # Organism: Archaeoglobus fulgidus # 8 341 5 329 330 148 31.0 2e-35 MKRKNTGLAMLLLAAVFMTAACSPTDVNSTTGNSGSTAAEGASSGGTHYDLTLGTAASTG TYYVVGAAIANEVNTKNPDLTVVAQTTNGGVENVNLCMSGEMDLGMTNADAAYWSSHGGG SGFYEIYAEPGNVKGVMRLYESQGHMITTANSGIETYEDLRGKTVCLGPPSSTIPAMSIA ILEAYGINPDSDIKAVYYSIDEGLEKLTDGVIDATFFVASAPASGLTSASATSDLKFVDA GEEILKQVVEKYPYYTAAATKEGAYPNMNSVNTLKIYTEIIASGDVPEEAVYQFVKGALE GQPDYVDAHVACQEINAETAASAAAKLHPGAARYYRELGLID >gi|229784053|gb|GG667682.1| GENE 7 4881 - 6755 1372 624 aa, 
chain + ## HITS:1 COG:BH2945 KEGG:ns NR:ns ## COG: BH2945 COG4666 # Protein_GI_number: 15615507 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, fused permease components # Organism: Bacillus halodurans # 10 608 40 638 656 355 39.0 1e-97 MTEKLKYQKMVDMIALGMSVILCLFHMFSASYGTWTSYVLAAVHWGFIGSYIVLRKPTKL KYIGPVYDAVIIILTIYACYHQVAMQARLVTQAGLYTTEDIIVAVIAVLLVLEVGRRAVG KILPLIGVAFLLYCYFGNMVPGLMKTTRFTVNRIAIFIYTSSDGIFGQTLYVSAKYIFLF ILFGSILELTGAGAFFVDLAFAFVGKARGGPAQAAIYSSMLMGTINGSGAANVVTTGTFT IPLMKKIGIRPSTAGAIEAVASSGGQIMPPVMGAVAFLMAEMTGINYAKIALAALVPAIL YYMTLSVSVYLIARKDNIAAGNPEDIKKPREVLKEGWLFMVPLLVLVALIISGMSVQLSA FYSIIVTLIVGFMKNRRNMTVGNILRACENSVKSIAPVAAACILAGIIMGSMSLTGFGLK ISSLIEMISGGNLLFALLLSMVASLLLGMGLPTSAAYMVLAVLVAPVLLNLGVSKMAAHL FLLYFGAISTITPPVALSVFAASGISGAGVWETGWDALKLALTGFIIPFIFAFDNSLLLM GTPVMIVIAVVTAVIGCMILSVAVSGWLIHNLNVGTRLLLLVAGVCMIISNPLWINLAGL ALAVIVIVTVLKTNQKNIKTEGVY >gi|229784053|gb|GG667682.1| GENE 8 6758 - 7465 642 235 aa, chain + ## HITS:1 COG:BH1938 KEGG:ns NR:ns ## COG: BH1938 COG0684 # Protein_GI_number: 15614501 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Bacillus halodurans # 14 206 7 194 210 134 40.0 1e-31 MENYYELPELISEDLIERASKLSSANICDGLSKLGLLAGRCMSGQIKPVSDKMSVVGTAC TVETKDGDNLPLHVAIYNCRPGYVIVVSGKAFEERAYMGDLMGGAAEAIGVSGIIVDGYI RDKTGLDEIGMPMFSRGYKPASPAKKGPGGLNIPVECGSVLVNPGDLIVGDYDGVVVVPR EKIKEVLEAAEVKVAYEEKRVAAISEYRKYRTEGSVLPDLTPAWVKEMMEQKQEA >gi|229784053|gb|GG667682.1| GENE 9 7468 - 7671 200 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622671|ref|ZP_06115606.1| ## NR: gi|266622671|ref|ZP_06115606.1| conserved domain protein [Clostridium hathewayi DSM 13479] conserved domain protein [Clostridium hathewayi DSM 13479] # 1 67 1 67 67 113 100.0 6e-24 MPRIQIQMLKGRTAEQKRRLAKGVTEAASEALGISQERISLIMLEVEEDQLAHGGLLWCD REVKPPI >gi|229784053|gb|GG667682.1| GENE 10 7960 - 9375 396 471 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| 
aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 112 457 8 340 345 157 31 8e-38 MIGVLNNLRDYLVKSNRESVTGRLDILVCSSNVSKTPVIMELKVLPTYKNMESTCDMAIR QIEEKAYDSSLPEEGYTEVLFYGIAFYLTHRIKKAILTLTLMERKSIMNRTVFGVVDGEK EVSLFTLKNRNGMEITVSDLGAVLTRVIVPDQEGQPRDVVLGYGSANEYRKNTSTYFGST IGRNGNRLEGAAVTLEGKVFCMTPNEGKNNLHSGPDGYQIRLWNVKEPAEGRNEVTFVLE SPDGDQGFPGNLELSVTYALTEDNEICITYKGVSDAETVFNPTNHSYFNLNGHDSGTILN HVLTLMADFYTPVRDSASIPTGETADVTGTPMDFRQGKAIGAEIDADYEQLQFTGGYDHN FVISQSAGSGSRMGLSRAAVVTSPESGISMEVETDRPGVQLYAGNFLNEEPGKDGVKYGK RCGFCLETQYFPNAANEPAFESPIINANEPCVTKTVYRFKSQGLKQGGEDR >gi|229784053|gb|GG667682.1| GENE 11 9372 - 10388 992 338 aa, chain + ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 1 309 1 309 333 199 38.0 8e-51 MKKRRPAVTLKDIAAKTGYSVNTVSRALRDKDDIAVETREMIQRTAREMGHVSNTLAASL RLGYTNTIAVILGDISNPHFSIMMKEIEGYAWKAGYTSILFNTNENEELELAAIQSALNK NVDGIIICPAQKTPDNLIYLKESGVPFVLIGRRDEAYSYVVCNDELGGFQATKELLAAGH RDILLLHGATYISSARERLEGYCRAFREAGLPVNERLICQVPVVSDGCEAVLTALEKEGQ EYTAIFAFSDMLAWEAWCCLLKRGKMVPKDCSIIGFDHIESRLKLPLGLTSVSSYKGRMS VCAAEILVEQIRGNTGIRQVVIDTAIAAGETVRKPERI >gi|229784053|gb|GG667682.1| GENE 12 10508 - 13708 2746 1066 aa, chain + ## HITS:1 COG:lin2123 KEGG:ns NR:ns ## COG: lin2123 COG0383 # Protein_GI_number: 16801189 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Listeria innocua # 57 1066 55 1032 1032 448 30.0 1e-125 MGIMHSEYRDRINHWIRTLKEDFYTPVGTLNWEVFKTMEQLSYEEAGAGTYERVEPGYTW GKTYEYAWFRTTLSLPEETAGKRVVLDLCPGGESTVFVNGAAFGTYRADWVKQPHHFLVD NTLTREAKPGEVFDICMETYAGHFYAEAQDSGCATGPVLPGLYRDELTEGARRTLREGTY GIWDEDAYQLFMDVETLSSLLKILDEDSLRAARVAEALEQFTLIVDFEQEKEARSRDYRK ARAALEPVLSAVNGSTTPVFSAIGNAHLDLAWLWPMAETYRKTARTFAAQLRLMEEYQDY KFIQSQPACYEMCRQYYPELFEEIRKAVKKGQWIAEGAMWVEPDTNMAGGEALIRQLLYG KAYYKKMFAVDSRMLWLPDTFGYTAALPQILKKCGVDYLVTQKIFWSYNEGEQFPYHYFT 
WRGMDGSEIDAFLPTNYTYYTSPEEMNTVWKNRVQKRSLDTFLIPFGYGDGGGGPTRDHI EFIERQKNLEGGVRIEMENPVDYFERMESRGGPSNTYVGELYFTAHRGTYTSQAAVKKWN RRAEFAMRDLELWASAAASMPAGSGMVPDGENAAGSGYRYPAEETERLWKETLLNQFHDI LPGSSIHRVYEEAESSLGRVCEEAGKFCRDAVTSLIQEGEGLTVFNSLSFPRVCLIDVPE NYGTGARTLDGETVPVYTDGKTVTARIMVPPLGAVTVVPCGDGETAPEAVKETGLTLPPA VVKAEGDGFVMENDRLRVTVASNAEITSYVLKENGREFAETPMNHFRLFKDVPRKFDAWD IDSNYREQEIEGAFDVKVEIICAAGARAALRATGKIGSSSFCQDITLAVGESRVEFRTEI DWHELHRLLKVSFPTILYGEYGINEMQFGYVERPMHRSRAYEKERFEVCNHRYSAVCDGG NGFAVLNDCKYGISMEDGALELTLLRAGACPDMQADNRIHTFTYGAAAWNGSFQESDVVK QGYEINVPPLVINGITDTFSLADTGSDHIVIEAVKLAEDGSGDLIFRLYECKKRIGSTVV KINLPVKAAWLCDMLENKEEEVCVTDGAVILDFRAFEIKTVRIELK >gi|229784053|gb|GG667682.1| GENE 13 13722 - 14612 783 296 aa, chain + ## HITS:1 COG:SP0676 KEGG:ns NR:ns ## COG: SP0676 COG0583 # Protein_GI_number: 15900577 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 1 295 21 320 322 96 26.0 4e-20 MTIQQLQYVMEISRTGSVSKAAKTLFLSQPNLSNAIKNLENELGITIFERTPMGMQLTAS GMKLAKKAASIMADIQEITSGISEEEDCLFRLVYPRYVPAFEAFWDLCTEYQEMAHLHFS CYIGDGEKQVEALYRNLCDLVIYLDTGSSGFARLCSDLHVKFVKLKDACFYVQLAEDHPL LQKPEFKIEELKNYPYVAFSDLYERDANWMPWQDIVNPNRLICVQSTSSRVSLVSNSQAF SIVLPHSEEYNRLHHVVSIPFDSRQLELGYLYSTDRGLSNFAKEYLEYFKERISFL >gi|229784053|gb|GG667682.1| GENE 14 14800 - 16218 1461 472 aa, chain + ## HITS:1 COG:abgB KEGG:ns NR:ns ## COG: abgB COG1473 # Protein_GI_number: 16129298 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Escherichia coli K12 # 14 468 20 477 481 338 41.0 1e-92 MIHEIVMEKADTISEINRAVWDYAEFGYQEKQSAEKMKAVLRQEGFEVEEGQAGISTAFV GRFGTGKPVIAILAEYDALPDLSQKAGCAKPCPIEGKKYGHGCGHSALGAGAAGAAIAVK EYLQRTGVTGTVELYGCPAEETGFGKAFMVKEHCFDGIDAAFCWHPMDRNMSMSVRTVAY YKVRFDFKGRTAHAGAAPELGRSALDACELMNVGVNYLREHIISDARVHYAYLDCGGEAP NIVPDHASLLYFIRAPKLTQSGEILERIKKIAEGAALMTETSVTIKVLGGLSDTIPNPTL SSLLSDAYLEAGAPDFGEEEFAIAREFLNAMPEEQRERVVKKGARQNGISEAEFAERPLN 
TFIVPYTPAMRNRVMTGSSDVGDVSYQVPTAQITAAVGIPETGVHTWQMTAQVGTSIGDK ASQAVARAIALACAKIYGKPEILDTAKKELEEETGGVYTSLIPEGILPGDIS >gi|229784053|gb|GG667682.1| GENE 15 16260 - 16979 797 239 aa, chain + ## HITS:1 COG:no KEGG:TDE0003 NR:ns ## KEGG: TDE0003 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 7 232 4 232 239 106 34.0 1e-21 MTFNEIMNSPLLYALVGAGIVYILIFCVITLRKAYRHGLEIGLTKDKLKTAITSSAIYSI VPSISIVIGLFSLAAVLGVPWSWFRLSVVGSVTYELMAADMVATGAGYESIAALNAAGDA SIVGTVMFVMSICILGGIVGVLIFGKKVQTSLQKAKNTHGALGALLTGVLSLAIIEAFLP IQIMKGPVYLAVVVTSCVIVVIHMAVIKKFKILWLRNFVMADTLLLGMASSLLWVRILG >gi|229784053|gb|GG667682.1| GENE 16 17007 - 17120 155 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDKKYETSIHRTGRIGVIIAIVFMLGIPTILSAVYDV >gi|229784053|gb|GG667682.1| GENE 17 17148 - 17729 640 193 aa, chain + ## HITS:1 COG:no KEGG:TDE0004 NR:ns ## KEGG: TDE0004 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 4 179 56 224 224 89 38.0 7e-17 MGGGLLAVFLPTNIAEVLSYTPILGSSAYLTFLTGNVMNLKIPVVINAQVLTDTFQGTDE GDTIATIGVAVSSIVTTLIIALGVVLLVPLRPVLTSPAVQTATQYLLPALFGGILLSFVN DDCGEYEAKGKSLTMILPLILVFVINAFYPLTGKEGFVVLLCMGVTVICALVMYKAGIIK MTRKSELKARKKN >gi|229784053|gb|GG667682.1| GENE 18 18025 - 18507 568 160 aa, chain + ## HITS:1 COG:mll2512 KEGG:ns NR:ns ## COG: mll2512 COG0590 # Protein_GI_number: 13472273 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Mesorhizobium loti # 9 159 9 154 156 152 47.0 2e-37 MSYQSHEYYLRRAIEISKEARAAGNTPFGALLVNKDGDIVMEQGNIEITDKICTGHAEAT LAARASHEFTKDYLWDCTLYTTAEPCAMCAGAIYWANIGRVVYGMTERRLLELTGSNEQN PTFDLPCREVFSRGQKAIEVVGPVTEVEVDAAKVHEGYWD >gi|229784053|gb|GG667682.1| GENE 19 18522 - 19814 380 430 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 5 422 8 418 447 150 26 6e-36 
MDTFSEKISPGKNLLVGFQHVLAMCPGTIAVPLILSGAMGLDARTTAFLVSANLFTSGIA ILIQVFGLSKYIGSRYPIILGSSFAPLAPMIMIGNTYGLPTLFGAIMVSGVLIFILSFFL DKILMLFPPVVVGTFVTLIGVSLAPTAIQDLAGGTGSSTYGSVQNLLLGFGVLIFIILVE KFGKGIWRAMSLLLGIIGGTLVGAFMGMVDITPIQEASAFRLVTPFYFGAPQFEAGSILI MTIFCIINMIQCIGVFSVLDEIAGTHTDNATKIKGIKAQAVSQVITGAFNSVPSTMFNEN VSLIDLTKIKSRSVIATAGVMMLLLGIFPKVSAVITVVPKAVLGGATLALFGVITSSGIS ILSKLNFFENNNFTIVGTSIAIGVGATFASDIFAGLPSTLSMICSNGLFMVSASAILLNL LLNGSKGLKR >gi|229784053|gb|GG667682.1| GENE 20 19866 - 20633 716 255 aa, chain + ## HITS:1 COG:BS_ymfC KEGG:ns NR:ns ## COG: BS_ymfC COG2188 # Protein_GI_number: 16078744 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 19 251 18 240 241 84 27.0 2e-16 MEYQKLNTLTLSEQAKREIHKYIKNMDLSRNNKLPREEVLAEMIGVSRITIRQALNDLAS EGIIFRRQGKGTFVNVDSLNIKVKFNPCMEFTQMIENSGYTPSVRLLNIQKIDRDEAICQ QLQMESSEQLVVAEKMFLADDLVCAFCRDYFSLSLIGGEEAFCEFSKYENSIFQYIYNLS GEKIEWDKVEIDTILTTEIEGLEQYVDINQLGCKPFLYLKGINYSSSDRPLIHANEYINT AIIKFNMIRQKNIRY >gi|229784053|gb|GG667682.1| GENE 21 20742 - 21062 481 106 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2462 NR:ns ## KEGG: Cphy_2462 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 5 102 4 101 115 111 64.0 1e-23 MVNALETLLQQVVDITILLFEFMGVLVIIAAGLRGIYDYVKRNPSIRLNLAQGMALGLEF KLGSEILRTVVVRQLSEVAVVAAIIALRAALTFLIHWEIKVERESE >gi|229784053|gb|GG667682.1| GENE 22 21114 - 22187 1267 357 aa, chain - ## HITS:1 COG:CAC0837 KEGG:ns NR:ns ## COG: CAC0837 COG0687 # Protein_GI_number: 15894124 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Clostridium acetobutylicum # 3 356 7 352 354 300 45.0 3e-81 MAVMAVAALSAVTVSGCGKSADKAGSAGEVYVYNWGEYIDEDVKAEFEEETGIKVIYDMF ETNEEMYPVIEAGGVKYDAVCPSDYMIQKMIENQMLAEINFDNIPNIKEIDPKYMDMSKS FDPDNKYSVPYCWGTVGILYNTSMVAPEDAPTKWSDLWDEKFKDNILMQDSVRDAFMVAL KSLGYSMNTTNEAEIAEARDLLIKQKPLVQAYVIDQVRDKMIGGEAAMGVIYSGEMLYIQ 
QEVADLGLDYSLEYVIPEEGTNLWLDSWVIPANAPNKENAEKWINFLCRPDIAKKNFEYI TYPTPNKGAFDLLDADLQNNKAVFPDTDSLKNCEVFQYLGTDVDSIYNEYWKEVKSN >gi|229784053|gb|GG667682.1| GENE 23 22236 - 23030 874 264 aa, chain - ## HITS:1 COG:CAC0838 KEGG:ns NR:ns ## COG: CAC0838 COG1177 # Protein_GI_number: 15894125 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Clostridium acetobutylicum # 8 256 1 249 260 219 49.0 5e-57 MTGKKTGLLKRFVQDFYLVLILVFLYAPIATMAVLSFNSSKSRTQWGGFTTRWYTELFSS STIMTALYNTLLIAFLSSIIAVIIGTAAAIAINNMKRVPRTIVMGITNIPMLNADIVTGI SLMLAFIAFGISLGFKTILIAHITFNIPYVILSVMPKLKQTDKSTYEAAMDLGATPVYAF FKVVFPDILPGVLSGFLLAFTMSLDDFIITHFTRGAGINTLSTLIYSEVRRGIKPSMYAL STIIFLTVLILLLITNFAPKRKKH >gi|229784053|gb|GG667682.1| GENE 24 23027 - 23545 635 172 aa, chain - ## HITS:1 COG:CAC0839 KEGG:ns NR:ns ## COG: CAC0839 COG1176 # Protein_GI_number: 15894126 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Clostridium acetobutylicum # 2 168 103 269 277 177 56.0 1e-44 GLNFLLRTLAWQTLLEKNGVINGVLSALHLPPQGLINTPGAIILGMVYNFLPFMVLPIYN VLSKIDDNVINAAQDLGANFFQILFRILLPLSIPGIVSGITMVFVPALTTFVISNLLGGS KILLIGNVIEQEFTKGSNWNLGSGLSLVMMIFILISMALIAKYDKNGEGTAF Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:36:05 2011 Seq name: gi|229784052|gb|GG667683.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld76, whole genome shotgun sequence Length of sequence - 22702 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 5, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 2427 1899 ## COG0383 Alpha-mannosidase + Prom 2429 - 2488 4.2 2 2 Op 1 1/0.000 + CDS 2650 - 3930 1147 ## COG1653 ABC-type sugar transport system, periplasmic component 3 2 Op 2 . 
+ CDS 3949 - 5475 1232 ## COG3534 Alpha-L-arabinofuranosidase + Term 5480 - 5535 6.5 + Prom 5592 - 5651 8.1 4 3 Op 1 . + CDS 5675 - 6475 642 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 5 3 Op 2 . + CDS 6544 - 8502 2198 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 6 3 Op 3 . + CDS 8506 - 9438 1059 ## COG1897 Homoserine trans-succinylase 7 3 Op 4 . + CDS 9514 - 10623 1091 ## ROP_24240 LuxR family transcriptional regulator + Term 10632 - 10692 14.2 + Prom 10745 - 10804 5.8 8 4 Op 1 11/0.000 + CDS 10841 - 11791 685 ## COG1180 Pyruvate-formate lyase-activating enzyme 9 4 Op 2 . + CDS 11830 - 14349 2598 ## COG1882 Pyruvate-formate lyase 10 4 Op 3 . + CDS 14379 - 15164 916 ## Ilyop_0061 protein of unknown function DUF81 11 4 Op 4 . + CDS 15180 - 16328 1360 ## COG1454 Alcohol dehydrogenase, class IV 12 4 Op 5 . + CDS 16334 - 16549 68 ## gi|266622699|ref|ZP_06115634.1| conserved hypothetical protein 13 4 Op 6 . + CDS 16567 - 17952 1557 ## DSY4285 hypothetical protein 14 5 Op 1 36/0.000 - CDS 17915 - 18652 356 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 15 5 Op 2 2/0.000 - CDS 18676 - 21240 1699 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 16 5 Op 3 . 
- CDS 21288 - 22700 818 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|229784052|gb|GG667683.1| GENE 1 1 - 2427 1899 808 aa, chain + ## HITS:1 COG:lin2123 KEGG:ns NR:ns ## COG: lin2123 COG0383 # Protein_GI_number: 16801189 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Listeria innocua # 1 806 246 1029 1032 635 42.0 0 THIDVAWMWQKSHTRNKTVRSFQTALKLLEEYPEFIFMSSQPQLYEYFREAQPEEFEKIK KYTTLGRWEAEGGMWVEADCNLTSGESLIRQFLEGKQYFKEQFGTESRILWLPDVFGYSA ALPQICRKCGIDYFMTTKISWNEQDQIPYDTFLWKGIDGSELLTHFSPSRQFDHVEYNSF GFARSPHITTYNAELSPDYVMGGWKRYAQKSLNTEFLMPCGFGDGGGGTTREMVECGKRM SYGLPGAPEVQFSTAGEFFEKLEVDTEGNPALPVWYGELYLEFHRGTYTSVGEIKRLNRI AETLLQRCEFLYSLLAQAEGPEIYPYRKLKQWIRIILTNQFHDILPGSAIDAVYDDCRKE YEEILPEMESTAVQAMERIAGKWSGRSLIAFNTFGFARNSRILFACDSDAEVLVLTGEDG MGIPAQKTPDGQYCAELKAVPGAGAFRYRIRETCEHGGFTITYRDNIIESPYYKIRLNEN GTMASLYDREADRELLGGAANVLEVYEDRPYQYDAWELSPYYREKRYELNHCTEAELVEQ GPVRIGLRFVYRYLSSTVRQTLYVYSNSRRLDFVTDIDWQEEHLLLKAAFPVNIVSQKAV YEIQFGTVERPTHTNTSWDAEKFETCAQRFADLSEGNYGVSLLSDCKYGYDIHDGVMRLS LLRSAVFPNRQDRGFHHFTYSLYPHPGDYREAGVFQEAYDLLSPVPALFSKVEGQQCGAF PRRLAICNTPGIFIETVKRAEDGDGWIIRAYEGFGNRVKAGIEIITEAPCTLKECDMLER GGTELSAVGNCFTVVFNPFEIKTFRMQK >gi|229784052|gb|GG667683.1| GENE 2 2650 - 3930 1147 426 aa, chain + ## HITS:1 COG:AGl3270 KEGG:ns NR:ns ## COG: AGl3270 COG1653 # Protein_GI_number: 15891755 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 31 422 30 417 419 124 26.0 4e-28 MGESAEKAQESSEKSGETKGETVKLTWWMEMSTPELNEAYEAAAAAYMEKNPNIEIEFLG IPGNSADAKAKLDMAFATDSAPDIFECAMPEFIARGNVLPLDGYYEKSELYGKVNEEAIR ADANLDPKKEHLYYFPAFMDFRNFWIRPDIYAEKGVEIPETWDEFFESVKKVADPDNNLY GTTIRGGSGGAFNVMCMAYAYSGIPHAFDENGKSTMNDPLHVEFVEKYFDLYKKYTSEDD LNKGWTEIAASFTSGSTASIFHNLGSAHMISETFGGDETKFQMMPFLKSVNGEQCHPEVG VMGFGISASTEHPDEAWDFIQFLSSEGTEYWVESRGGIPTEQSAQKADWVMEKDYYQKAI 
SIMNDENTVFFEVPTYLTNYVNIENNIAQPKLQECMAGICTAQEFLDAWADAMTEEKARY DAENGN >gi|229784052|gb|GG667683.1| GENE 3 3949 - 5475 1232 508 aa, chain + ## HITS:1 COG:BH1861 KEGG:ns NR:ns ## COG: BH1861 COG3534 # Protein_GI_number: 15614424 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Bacillus halodurans # 5 503 4 499 500 653 59.0 0 MAERTARLVIDRAAVTDAIDRRLFGSFIEHLGRAVYGGLYQPGHPEADSRGYRQDVIKLV KELGVSVIRYPGGNFVSGYRWEDGVGPMAQRPRRLELAWRSMESNEVGLNEFAEWAKLVN SEVMMAVNLGTRGISDACNLLEYCNHTSGTKYSDLRISHGVKEPHKIKLWCLGNEMDGPW QIGHKTMDEYGRLAAETAKAMRLIDPEIELVSCGSSQMSMPTFPEWEAATLEHTYDFVDY VSMHQYYGNRQDDSMDYLAQSDDMEQFIRTVTAVCDYIKAKKRGRKNINISFDEWNVWFH SKAQEKDTVQNHPWQQAPHLLEDIYNMEDALMVGLMLITLLKHADRVKIACMAQLINVIA PVMAEENGTAWRQTIFYPFYHAALYGKGVVLQPIVNSPLHETSAHGLVTDVECVSVYEEE AGLLTVFAVNRNIEENVILDIDLRSFEGYELLEHLVLEGDLKQCNGPEGEKIRPIRAERS IKETGKLSSMLGKASWNVIRMKKNKETV >gi|229784052|gb|GG667683.1| GENE 4 5675 - 6475 642 266 aa, chain + ## HITS:1 COG:VC2223 KEGG:ns NR:ns ## COG: VC2223 COG1187 # Protein_GI_number: 15642221 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Vibrio cholerae # 7 265 20 267 340 212 47.0 5e-55 MGEDGIRLNKFLSGAGICSRREADRLIEEGKVTVGGRTAVPGQKIKKGETVVCNGRTIAA GSRGAEKKPERVLLAVHKPCGIVCTTSDKDRAPNIVDMVDCPVRIYPAGRLDKESEGLIL MTNQGDLVNRMMRGSSGHEKEYLVWIDRPVKPEFIQKMRKGVFLPELDVTTKPCFAEKTG EKSFRIVLTQGLNRQIRRMCEQLGCKVRKLKRVRIMNIELGDLKAGESRPVSKKEYAELM RRLEHTSGLSLKDREEKQKKDSGKRK >gi|229784052|gb|GG667683.1| GENE 5 6544 - 8502 2198 652 aa, chain + ## HITS:1 COG:CAC2673 KEGG:ns NR:ns ## COG: CAC2673 COG0272 # Protein_GI_number: 15895931 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Clostridium acetobutylicum # 1 650 1 666 669 340 34.0 6e-93 MDEKISRMKELAATLTAAAKTYYQESREIMSNYEYDKLYDELAELEKETGVVLSKSPTQS 
VGYEVLSELPKERHESPMLSLDKTKSTADLQEWLGDQTGILSWKLDGLTVVLTYSGGTLK KAVTRGNGEIGEVITSNARVFANVPVTISFQGELVLRGEAVIRYSEFNRINEAIDDVDAR YKNPRNLCSGSVRQLNSQITAERNVNFEAFALVTAEGVDFNNSRKEQFEWLKRQGFDVVE YRTVTADTLPEAVEAFAEAVSGYDIPSDGLVLLLDDIAYGESLGRTAKFPRNAIAFKWAD EIRETKLQYIEWSPSRTGLINPVAVFDPVELEGTTVSRASVHNLSIMESLALGEGDAITV YKANMIIPQIADNLTRSGKIQIPEKCPVCGGRTEVRQAGDVRSLYCTNPDCQAKRIKSFS LFVSRDALNIDGLSEATLEKFIGAGFIKEFADIFHLDRYEETITQMEGFGQKSYDNLIRA TEAASHTTLARMVYGLGISGIGLANAKMLCRKFKFDFDRMRHATAEELVEVDGIGAVLAD AWIRYFADEKNQETVDHLLSELTFEQEEESAEEAVFEGMTFVITGSVEHFANRNELKEAI EARGGKATGSVTSKTTYLINNDVTSNSSKNKKAKDLGVPIISEEEFIKMLGE >gi|229784052|gb|GG667683.1| GENE 6 8506 - 9438 1059 310 aa, chain + ## HITS:1 COG:CAC1825 KEGG:ns NR:ns ## COG: CAC1825 COG1897 # Protein_GI_number: 15895101 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Clostridium acetobutylicum # 1 301 1 301 301 373 58.0 1e-103 MPIKVQKDLPAKAILEKENIFIMAEDRALSQDIRPLEILILNLMPVKEDTETQLLRALSN TPLQVDCTFLMLATHTSKNTSASHLNKFYVYFDEIKNKKFDGMIITGAPVENLEYEEVNY WDELQTIMEWGKTHVTSTLHICWGAQAGLYYHYGIPKYKREKKLSGIYRHRVLDRKVPLV RSMDDYIMAPHSRYTEVRRKDIEKNPELVVLAESDEAGVFLVMKRDGSQVFVQGHPEYDR MTLNNEYHRDLKKGLNPELPCNYYEDNDPFAVPVLSWRNAANTLYGNWLNFYVYQITPYD LEGNPCKTLY >gi|229784052|gb|GG667683.1| GENE 7 9514 - 10623 1091 369 aa, chain + ## HITS:1 COG:no KEGG:ROP_24240 NR:ns ## KEGG: ROP_24240 # Name: not_defined # Def: LuxR family transcriptional regulator # Organism: R.opacus # Pathway: not_defined # 294 369 685 757 759 68 47.0 7e-10 MKYENAGQMNFFSEMAEHQISAREFFSNVLLDSLDRNFGLKHVLISYFDTHGRFLSWINR NGILLDCEEHPYRKFVANDVIRHVVYQDAVRDHLTYFNVLPRLYKSTDIISPVDYAHSAY VRFLEENFHAHYSVTMAFGINAYIQVAFFKSEESGDFTDKEIEALNEIYVYVANSYKNFK KYEQAKIVSNIQSEIISSGEKAYFVTDDFMHIMSYNKTAQNYLKEILGPSVADQISTTTP CSWLPFLLGGEENNNTADRVQTRVIKNYIFTIYTYDQSYSNGIIDRYHWITISRKEDGKL LDGTGTMLPLTQAEQRVAELMYNGLTYKAIADELVISYHTVKKHVQNIYTKCGVKSRFQL YKWLENREQ >gi|229784052|gb|GG667683.1| GENE 8 10841 - 11791 685 316 aa, chain + ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## 
COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 4 299 5 290 302 210 38.0 2e-54 MPQQGIIFNIQRFTIHDGPGMRTELFLKGCPLRCRWCSNPESWTAYIQPGIYKTKCISRG KCGACEEVCPEKGILHFTRGKLTAIDRSKCTNCLACYDACPSDAVKQWGKSMSVEECMKE IRRDKGYYERSGGGVTVSGGEPLIQSDFVAELFQACRDEGIQTCCESTFHADWKEIEKIL PVTDIMISDIKHMDPEIHKEYTGVTNDRILANLKRLTEEERELILRIPVIPHVNDDKKNM EATADFIRNELGGRVRTLQLLSFMRLGEEKYASLGLPYQMEDVKINRKSFQKHVEEIAEY FNSRGIHCMVGTKEKQ >gi|229784052|gb|GG667683.1| GENE 9 11830 - 14349 2598 839 aa, chain + ## HITS:1 COG:STM4114 KEGG:ns NR:ns ## COG: STM4114 COG1882 # Protein_GI_number: 16767379 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Salmonella typhimurium LT2 # 34 835 4 761 765 501 36.0 1e-141 MNMNHTTACGTEPFDHVYSLGYQVHHEDWSPFPRVNHLRQTFLDRPYEVDVERLRLVTES YKEHEVCPRKLQCAYAFENILLNATLYIDDEDLILGEIAAPVKASPIYPEFSVNWIIDEI LHSPFEEREHDQFYIRNDEERKEIIELCKYWQGKTVDDMINVRLEDEQIKGSEAGKKIFQ TNLYHFAGTGHLAIDYAKLMKVGYNGLIEQAKECLGRLDKRDPEYGDKRDFYQAMIIMHE AAKKYLERYAKLAEEYAEKETGSVRKKELEAIAENCRQIAGGAPKTFWQALQLFNIATTL IQVESNGHSISYGRMDQWLYPFYEADMKNGTITKEFALELLEVVYVKVNNPTKLKDKGSM AVRNGRGFGGESLTVGGVNRDGSDATNDLTMMLLEASAHTRMMNPWVCVRMHENTPYELK VKTVECIRAGYGHPKLFNDAPAIRGMMRKGMTLEEARDYCVVGCVEIDLPGKEYGWHDAA YVNTAKMMEMVLNGGRCLDCGPHCMRWETCGALGDHLGPDTGSLATYQSFDEVLESVDKQ FAYWTDQMCSSLNIIDNTHRALKPVPYVSAFYEDCMESGKDLTEGGAKYNGTGPQASGMA TCADSLAAIKQLVFDEKRCSGAELLQAVKDNWKGHEKLYALVNSSKVHHYGNDDDYADEM FKYMFECYCRHIRGRKNPRGGEFSPGVYSVNANVAMGLNTNASVDGRRAGEAISDNMGPV HTDGGSHDFSGPTAVVNSLTKVDHSLASNGTLMNLRFPEESVAGIEGRDNLISFIDEYIA KQAMHVQFNIMSSATMRAAQKKPEDYRDMLVRVAGYSAYFVELGKPLQKDLIQRTELHF >gi|229784052|gb|GG667683.1| GENE 10 14379 - 15164 916 261 aa, chain + ## HITS:1 COG:no KEGG:Ilyop_0061 NR:ns ## KEGG: Ilyop_0061 # Name: not_defined # Def: protein of unknown function DUF81 # Organism: I.polytropus # Pathway: not_defined # 1 257 1 262 299 
346 67.0 5e-94 MTFEFDFILVGLMQIFGFFVQGCTGFGCTVIAASVTNGLLGTAEGVPYGTLLTIPFLYYL GVKAWREVSWKDLLKIVALCAPGILIGNYLFYTISPTTAKICIGAMVTIIALMNIYKHII RPLVLKKVDDEDAPDTTGKKIFRYGCLILGGIVHGAFTIGGPLITVYTIEAVKDKEKFRN TMNMVWVVLNTWNIYTQWRNGAFTPRMWSALAIGLPCAAIGFFLGMGFLKRINREQFLRI VYCVLLFIGANMFITSLMAVL >gi|229784052|gb|GG667683.1| GENE 11 15180 - 16328 1360 382 aa, chain + ## HITS:1 COG:ECs3659 KEGG:ns NR:ns ## COG: ECs3659 COG1454 # Protein_GI_number: 15832913 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 1 381 1 382 383 252 38.0 1e-66 MKEFRFKIPQNIEFGVGSLKKLPEILKENQSEHVFLISDRGLEKIGVVKKIQDIIEAAGI SCTAYLDVIPNPTTDIVNEATGLYKGCGATSIVALGGGSSMDVAKAVGVLVNYGGSITDY EGNHKVPGPIVPMIAIPTTAGTGSEVTASAVITDEARNYKLSVFSYEILPKYAVLDPELI MTAPASIAAACGVDALIHAIEAYLSTNASPFSDAMAEKAMELIGGNLRQFVANRKNEEAA CAMMAGSNFAGIAFAWARLGNVHAMSHPVSAYFNVPHGVANSILLPAVLEYNALADDGRY EVIYNYVRQGSEELNGFKPEMLVEEIRRLNEDLGIPKSLSQVGVKEELIPAMAEDAMKSG NIPANPRQTNVKDIMELYKKAL >gi|229784052|gb|GG667683.1| GENE 12 16334 - 16549 68 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622699|ref|ZP_06115634.1| ## NR: gi|266622699|ref|ZP_06115634.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 71 1 71 71 141 100.0 2e-32 MIKQVFQNEHPRCSDGDPLRSAVFGPHPAAVIHAFGICFLPEIDKKKTIHTGGFCTSTAG QEQNFEIHPGL >gi|229784052|gb|GG667683.1| GENE 13 16567 - 17952 1557 461 aa, chain + ## HITS:1 COG:no KEGG:DSY4285 NR:ns ## KEGG: DSY4285 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 461 47 508 509 378 45.0 1e-103 MFTTLDKKKRIGWTISILIPLIIAFIPVSENFTVSLKLFFMITLFMILIIAFELFPLLVS AILLPSLYMVSGIVPASAAFGSWSNTTVWMVLGGLIFSTVLDECGLLKRIAYFVIRKCGG TYGGVVYGCFMIGIVLNLVTFCYGWVIAGALIFGVCKAMGLKPSRESSLVCFAGTIGCTG STVFLYYPSYYSMIETSLQEFVPGYSMGMFTTFIYNGCFILWCVLTLFILMKVYKTKELG AKLNRKIFDEKYEQLGTMSSKEKKAVVMIILLFVYLFSSNFTGLPAAYGFMTIPYLMFLP 
GIEIGNTEVIGRINFSMIFLVAACLGIGVVGAAVGFGDFLTNIAVPLLSGRSVLTVCICF MLFGMIANFFMTPLAMLGGLSIPFAQIAVSLGVNPVVACMILVYACEMLFLPYESNGNLI MYEYGMMPMKDFIKQEGLKTVIMFAGFIAVMFPLWNLFGLM >gi|229784052|gb|GG667683.1| GENE 14 17915 - 18652 356 245 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 28 227 20 219 223 141 39 3e-33 MKSMGKTAAVEGRHIVKNFRTGDAVTEVLKDLSLQVMQGEFVSIMGPSGSGKSTLLYILG GLDSPTSGRVLLNGTDISGFGDKEMSMLRRRSIGFVFQFYNLIPNLNAEENIMLPLLLDG QKMEHLKDRLDQILEVVGLTDRRRHTPRELSGGQQQRVAIARALIGNPGILFADEPTGNL DSRAGADIMSLLQKINRTTGQTILMVTHSPEAAKSSSRIITVQDGTILQPPFCPTSTRTD STAET >gi|229784052|gb|GG667683.1| GENE 15 18676 - 21240 1699 854 aa, chain - ## HITS:1 COG:CAC3561 KEGG:ns NR:ns ## COG: CAC3561 COG0577 # Protein_GI_number: 15896796 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 31 854 1 842 842 265 26.0 3e-70 MIYMIAVSFFYLSGTRFRSKTIFLLKAGVEMKIILKYILTNIKERKLRTAVMFLSVLLSS MLLFVSFSIGASYESAQQKMARGMAGSAVLSVRAADRAVGTDEIPPLPSIDAGVGVLKGS ALYHEDGYYETVDLLAADLTQLDKINPPHLANGKKITDFTGSQVILPDRFTSKYGIKKGD TIHLQINGTPVPLRVADIANYDTVFLRHTRGVTALLPLSTLAGLLDQTDGYSEILIKPVT GAGTAALKRELSAALPAPAYHVSETVNQAQIEADARQKSMPFFLISFFSLTMSVFIIYSS YKVITLDRLPVIGTFRSIGAEKKTVTCIFLLESLLYGSAGGLLGIPAGTALLKLILHGMG QSLTQGIEIPAVVSPLYAVFSLAAAVISAFLSAWLPIRRAGRLPVKDIVLGTVEETRTSV RFLSGVCVLLFLVSVLLPRLVSGPMRYPAGGFSLLGMIAVAVLLVPTLTNLCSAAFTPLF CRFLGNEGWLAARNMRDNKNTAQSITLLFISISAVTAITVVGNFVTSYVSDVFAGAELDG FADGHMEPEFISQLKEMEGIEKVLPLYVFNSRMTADGIPLNRLEATDRLDWYGPMMALHY SEKDMEASAISAFASGRAVILSTDCMERTGSTVGGLLSLSDGTVQQDYLIAGSFKSRATD VEAVIPAACAVSDFGASSYGFAAYTAADPDAIMVQLRSLFGETSNWSRTVEEFNTDALTT FGTFLKPMHSMTWFILLLAAVGVVNNLLIGHIQKRRSTAMYQSIGLSNRQMVKITVIESV SAGLISAVLAVFISYMEIQTIFLVAGPKIQTVPELDASVFLTSGLLGIAVTLLGSVVPVV KSRNMKLVEEIKVD >gi|229784052|gb|GG667683.1| GENE 16 21288 - 22700 818 470 aa, chain - ## HITS:1 COG:CAC0290 KEGG:ns NR:ns ## COG: 
CAC0290 COG0642 # Protein_GI_number: 15893582 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 22 465 20 461 467 203 27.0 6e-52 KSGYRTTFHIYFIFLLTLLGTILAAVSLFFLLITVQKPDGSPIRSDWPKAFTDRFAEQII FQNGVPCVTQAGTALLQDNGAGLQILNPDGREAYSYGKPDSARESYTAAELLKLTRDGFP AAGQRTISVFAGTAHNAGEDYVYLLCFPMRISAVTMYLNGERFSGGRTVILTLLCVLLPV ILITGALYGFWTARQLKRLTASIHDISLRCYLPDHRRGTFEDLYAGLNALNSEIRASDRL REETEKMREEWITNITHDLKTPLSPIKGYAEIIRDGNDGQENNYGRYAAVMLKNVSCLEA LIEDLKLTYQLESGLLPVNRECQDMVRFLRELAIDLLNSPEYENRIIHFDSAKDTLFYSF DSKLLARAFRNLIINAFVHGGAQAEVSLRITVSGGSPCIQVADNGYGMRPEEIKHLFDRY YRGTSTGEKPEGTGLGLAIAKSIIELHGGTISAAGSPGAGTVFSIEFPVS Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:36:37 2011 Seq name: gi|229784051|gb|GG667684.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld77, whole genome shotgun sequence Length of sequence - 24014 bp Number of predicted genes - 24, with homology - 23 Number of transcription units - 11, operones - 6 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 2 - 538 171 ## PROTEIN SUPPORTED gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase 2 1 Op 2 . - CDS 535 - 963 650 ## COG0802 Predicted ATPase or kinase 3 1 Op 3 . - CDS 1005 - 1616 610 ## Closa_0622 hypothetical protein 4 1 Op 4 . - CDS 1626 - 3767 196 ## PROTEIN SUPPORTED gi|227488118|ref|ZP_03918434.1| 30S ribosomal protein S1 - Prom 3802 - 3861 9.9 + Prom 3844 - 3903 9.8 5 2 Op 1 . + CDS 4018 - 4968 729 ## bpr_I0824 hypothetical protein + Term 4983 - 5018 0.6 + Prom 5023 - 5082 5.7 6 2 Op 2 . + CDS 5109 - 5495 405 ## Closa_0620 hypothetical protein + Term 5528 - 5586 17.2 - Term 5516 - 5573 9.4 7 3 Op 1 . - CDS 5596 - 5958 339 ## Closa_0619 hypothetical protein 8 3 Op 2 . - CDS 5975 - 7183 1465 ## COG0527 Aspartokinases 9 3 Op 3 . - CDS 7252 - 8466 1448 ## COG0460 Homoserine dehydrogenase 10 3 Op 4 . 
- CDS 8472 - 8915 552 ## COG4492 ACT domain-containing protein - Prom 8967 - 9026 7.3 11 4 Tu 1 . - CDS 9044 - 9184 146 ## Closa_0615 hypothetical protein 12 5 Op 1 . - CDS 10118 - 10954 949 ## COG1186 Protein chain release factor B 13 5 Op 2 . - CDS 10998 - 11075 97 ## - Prom 11255 - 11314 21.3 14 6 Tu 1 . - CDS 12318 - 14888 3046 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) - Prom 14932 - 14991 7.1 + Prom 15456 - 15515 9.7 15 7 Tu 1 . + CDS 15542 - 16069 493 ## PROTEIN SUPPORTED gi|28210085|ref|NP_781029.1| SSU ribosomal protein S30P + Term 16097 - 16162 21.1 - Term 16083 - 16150 31.0 16 8 Tu 1 . - CDS 16155 - 17075 496 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 - Prom 17172 - 17231 6.5 + Prom 17116 - 17175 6.3 17 9 Tu 1 . + CDS 17344 - 18075 218 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Term 18091 - 18136 9.4 - Term 18079 - 18124 8.6 18 10 Op 1 . - CDS 18129 - 19193 1354 ## COG2515 1-aminocyclopropane-1-carboxylate deaminase 19 10 Op 2 . - CDS 19193 - 20572 1398 ## COG0534 Na+-driven multidrug efflux pump 20 10 Op 3 . - CDS 20573 - 21019 549 ## Cphy_3318 MarR family transcriptional regulator - Prom 21060 - 21119 7.0 - Term 21091 - 21125 5.0 21 11 Op 1 . - CDS 21184 - 21867 478 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 22 11 Op 2 . - CDS 21870 - 22466 614 ## COG2364 Predicted membrane protein 23 11 Op 3 . - CDS 22503 - 22985 298 ## COG3542 Uncharacterized conserved protein 24 11 Op 4 . 
- CDS 22982 - 24013 830 ## COG0673 Predicted dehydrogenases and related proteins Predicted protein(s) >gi|229784051|gb|GG667684.1| GENE 1 2 - 538 171 179 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase [Atopobium parvulum DSM 20469] # 39 170 71 208 832 70 33 1e-11 MRILGIESSSLVASVAVVTDDVVTAEYTVNFKKTHSQTLLPMLDEIVRMVELDLETADAI AVSGGPGSFTGLRIGSAIGKGLGLALKKPLIHIPTVDAMAYGMYGYAGLVCPIMDARRNQ VYTGIYDNREGFSAVREQCAMDIDELVDELNKMENRAVFLGDGVPVYRERIAERITVPC >gi|229784051|gb|GG667684.1| GENE 2 535 - 963 650 142 aa, chain - ## HITS:1 COG:BS_ydiB KEGG:ns NR:ns ## COG: BS_ydiB COG0802 # Protein_GI_number: 16077658 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Bacillus subtilis # 5 137 8 135 158 126 50.0 1e-29 MVTETWTPEETYAFGRRLGEAAEPSSVYCLNGDLGVGKTVFTQGFADGLGVEGPVDSPTF TIVKQYDDGRMPFYHFDVYRIGDISEMDEIGYEDCFYGDGVSLVEWGGLIEEILPENVIT VKIEKDLEKGFDYRRITVEGLE >gi|229784051|gb|GG667684.1| GENE 3 1005 - 1616 610 203 aa, chain - ## HITS:1 COG:no KEGG:Closa_0622 NR:ns ## KEGG: Closa_0622 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 36 203 1 168 168 223 62.0 3e-57 MEQERWEKEQAKTEDGKEAPVEQNETGQAEPEAELMVQTSPLVFTVKLTAKDLWKFSLYH SNKGMLGIFNVIFSLAAIFLLVTTWRSNTVMYRALLIVCALMFTVWQPFLLYLKAAKQSK RPVIQNPMDLSFSREGIVVTQGTERLELIWENIGRVERIQGMIIVYMGKVRAYLLPDSIT GEKKEELLDLFRESLPAERRKKI >gi|229784051|gb|GG667684.1| GENE 4 1626 - 3767 196 713 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227488118|ref|ZP_03918434.1| 30S ribosomal protein S1 [Corynebacterium glucuronalyticum ATCC 51867] # 637 712 201 275 482 80 51 1e-14 MDIIRALQEELGIGRGQVEAAVKLIDEGNTIPFIARYRKEATGSLNDEVLRNLDERLRYL RNLEEKKEQVISSIREQEKLTPELEQQILAAATLVAVEDLYRPYRPKRRTRATIAKEKGL EPLADLIMLQQMKGPVLEAARDYISEEKGVATAEEAVEGASDILAERISDNAEYRTYIRN ATMSQGSIQSAAKDEKAESVYEMYYQYEESIKKTAGHRILALNRGEKEKFLTVKVLAPVE QIIRYLEKHVIVKDNPNTTPVLREVITDSYDRLIAPAIEREIRNDLTEKAEDGAIRVFGK 
NLEQLLMQPPIVGRVVLGWDPAFRTGCKLAVVDETGKVLDTVVIYPTAPQNKVEEAKVIL KKLIKKYNVSLISVGNGTASRESEQIIVELLKELQQPVQYIIVNEAGASVYSASKLATEE FPQFDVGQRSAVSIARRLQDPLSELVKIDPKSIGVGQYQHDMNQKHLSEALTGVVETAVN RVGVDLNTASASLLEYISGITKTIAKNIVTYREENGAFTSRNQLLKVAKLGPKAYEQCAG FMRIQGGKNPLDGTSVHPESYEAAEKLLDRLGYDVAELAHGGIKGIGRKVTGYKKLAEEL GIGEITLQDIVKELEKPARDPRDEMPKPILRTDVLDMKDLQPGMVLKGTVRNVIDFGAFV DIGVHQDGLVHISQMSEKFIKHPLEAVSVGDIVEVKVISVDVAKKRIALTMKL >gi|229784051|gb|GG667684.1| GENE 5 4018 - 4968 729 316 aa, chain + ## HITS:1 COG:no KEGG:bpr_I0824 NR:ns ## KEGG: bpr_I0824 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 14 283 8 269 294 113 30.0 8e-24 MTVKIKQHYGGISMNENAIRKLADSDERLNIRLLNDFAFKTTFHNKAALTGLLSALLDID PADIRKLEFMDSFLPGEYADDKEGILDVKLLLNGDRKINIEIQVLPFANWEERSLFYLSK YFVEGFEKGTSYKMLEATVHISILAFPLYENGTWYSIIEFRDRNTHRLYSDKMSLRVLQL SQLSKATPEERKSEIYAWAQMISADDWEVLKNMAERNEYMKAAVEELEKINAEKEKRYHY LMREKWEHDEATIRDYERGQGKAESILELLEEFGEIPEQLRTVVLGQHDTDILRIWLKYA AKAESLDDFISFIQTR >gi|229784051|gb|GG667684.1| GENE 6 5109 - 5495 405 128 aa, chain + ## HITS:1 COG:no KEGG:Closa_0620 NR:ns ## KEGG: Closa_0620 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 116 1 117 131 142 63.0 3e-33 MEDNNNKNNGLLPNMPSPGTLWASWSVFLGGLGLIGNCMCGPIWGLPAGLILGLLGIACA VLSKKGKPFTQQAQLGLILSILAAVCGLLMTFFIIIVYDVMGTDTMAGKYFRDMMEALQN PSALFPTQ >gi|229784051|gb|GG667684.1| GENE 7 5596 - 5958 339 120 aa, chain - ## HITS:1 COG:no KEGG:Closa_0619 NR:ns ## KEGG: Closa_0619 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 8 120 3 115 115 133 61.0 2e-30 MINEIFGFIKKLLGIGMPCAFHSLTGLYCPGCGGTRAVRELLYGDLRMSFQYHPLVLYGA AVVLLELVSWGASRMFHRPWLHIRRYNLFILIGAGIVVANWFYKNYMLVFRGVDLLPLLK >gi|229784051|gb|GG667684.1| GENE 8 5975 - 7183 1465 402 aa, chain - ## HITS:1 COG:Cj0582 KEGG:ns NR:ns ## COG: Cj0582 COG0527 # Protein_GI_number: 15791942 # Func_class: E Amino acid transport and metabolism # Function: 
Aspartokinases # Organism: Campylobacter jejuni # 1 400 1 398 400 346 47.0 4e-95 MLIVKKFGGSSVADKDRIYHVARRCIEDYEKGNEVVVVLSAMGKTTDGLLAKAHEINPNP PKRELDMLLATGEQTSVALMSMAMSALGGPAVSLNAAQVAMHTTSTYGMAKLKRIDTERI RHELDARKIVIITGFQGINKYDDITTLGRGGSDTTAVALAAALHADACEIYTDVDGVYTA DPRIVPNARKLTEVSYDEMLEFASLGAKVLHNRSVEMAKRYGVQLVVLSSLTRAEGTIIK EETKVERMLVSGVAADKNVARISVIGVKNEPGIAFKIFNLLARHHINVDIIIQSIGREER KDISFTVAKTDLQEAMTLLNENREVLTAQTITCEEGVAKISIIGAGMTSNPGVAAKMFES LYSANVNIKMIATSEIRITVLIGEEDVNRAMRTVHDAFDLAD >gi|229784051|gb|GG667684.1| GENE 9 7252 - 8466 1448 404 aa, chain - ## HITS:1 COG:MTH1232 KEGG:ns NR:ns ## COG: MTH1232 COG0460 # Protein_GI_number: 15679243 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Methanothermobacter thermautotrophicus # 2 348 1 352 423 271 44.0 2e-72 MIHVAVMGYGTIGSGVVEILEKNRETIAKRTGEPVDVKYVLDLREFPGTPVEHKIVHDFA VIENDPEVTMVIETMGGLNPAYPFVKACLEAGKHVATSNKALVAAYGTELLRIAGEKNVN FFFEASVGGGIPIIRPLYTSLAGEEIEEITGILNGTTNYILTKMDKEGETFESALKEAQD LGYAERNPEADVEGHDTCRKIAILTAMATGHEVHYEDIHTEGITSITDVDFKYAEKLGTS VKLFGSSRMKDGEIHAWVAPVMIGKEHPLYSVSDVFNGILVKGNMLGTSMFYGSGAGMLP TASAVIADIIEAVQNRDRHVEMGWDDSRLGIAPIETSSCRFFIRIKGIAETRLAEVQKVF GDVSVIELDHMDEFAVLTDVMTEAEYEKKAKELSGIRQRIRAEV >gi|229784051|gb|GG667684.1| GENE 10 8472 - 8915 552 147 aa, chain - ## HITS:1 COG:BH1214 KEGG:ns NR:ns ## COG: BH1214 COG4492 # Protein_GI_number: 15613777 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Bacillus halodurans # 4 145 3 144 147 104 40.0 6e-23 MEEKSKYFVVKQKAVPEVLLKVVEAKKLLESERVITVQEATDRVGISRSSFYKYKDDIFP FYDNAKGKTITLVMQMDDEPGLLSDLLHIVAVYHANILTIHQSIPVNGVATLTLSVEVLE STGNVSRMVEDMEEKHGVHYVKILARE >gi|229784051|gb|GG667684.1| GENE 11 9044 - 9184 146 46 aa, chain - ## HITS:1 COG:no KEGG:Closa_0615 NR:ns ## KEGG: Closa_0615 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 44 333 376 377 81 84.0 1e-14 
MVKDHRTNAETGNVNSVLDGALDLFIYAYLRWISTGAKANDSTGPE >gi|229784051|gb|GG667684.1| GENE 12 10118 - 10954 949 278 aa, chain - ## HITS:1 COG:BS_prfB KEGG:ns NR:ns ## COG: BS_prfB COG1186 # Protein_GI_number: 16080582 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Bacillus subtilis # 1 273 40 312 366 308 54.0 9e-84 MEEPGFWEDPEKSTKIVQLAKNLKDTVNGYKDLEQQYEDIGVMIEMGNEENDPSVVPEVE EMLKEFKDKLEEIRINTLLSGEYDSDNAILRLNAGAGGTESCDWCSMLYRMYCRWAEKKG YSVEVLDYLDGEEAGIKSVTIQINGENAFGYLKSEKGVHRLVRISPFNAAGKRQTSFVSC DVMPDIEEDLDVDINEEDLRIDTYRSSGAGGQHINKTSSAIRITHLPTGIVVQCQNERSQ FQNKDKAMQMLKAKLYMLKQQENAEKLSDIRGDVTVAS >gi|229784051|gb|GG667684.1| GENE 13 10998 - 11075 97 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVELDQYKYELSTFEKPLVEVRDSL >gi|229784051|gb|GG667684.1| GENE 14 12318 - 14888 3046 856 aa, chain - ## HITS:1 COG:CAC2846 KEGG:ns NR:ns ## COG: CAC2846 COG0653 # Protein_GI_number: 15896101 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Clostridium acetobutylicum # 1 854 1 838 839 929 58.0 0 MNLVQKVFGTHSERELKLINPIVDKIEALRPKMIALTDEELRDNTRLFKERLAAGETLDD ILPEAYATVREAARRVLDMEHFRVQLIGGIVLHQGRIAEMRTGEGKTLVSTCPAYLNALA GKGVQIVTVNDYLAKRDAEWMGRVHEFLGLKVGVVLNSMTSDERREAYNCDITYVTNNEL GFDYLRDNMAIYKEQMVLRELDYAIIDEVDSVLIDEARTPLIISGQSGKSTKLYEVCDIL ARQLERGEASGEFSKMAAIMGEEIEETGDFIVDEKDKVVNLTEQGVDKVEQFFHIENLSD PENLEIQHNIILALRANYLMFRDKDYVVKDDEVLIVDEFTGRIMPGRRYSDGLHQAIEAK EHVNVRRESKTLATITFQNFFNKYKKKSGMTGTALTEEKEFRNTYGMDAIEIPTNKPIAR IDMEDAVYKTKKEKFKAVCDEVEEAHKKGQPVLVGTITIETSEMLSGMLKRRGIPHKVLN AKYHELEAEIVADAGVHKAVTIATNMAGRGTDIKLDDEAKAAGGLKIIGTERHESRRIDN QLRGRSGRQGDPGVSRFYISLEDDLMRLFGSERLMNMFTALGVPEGEQIEHKMLSNAIEK AQMKIETNNYGIRENLLKYDEVMNEQREVIYAERRKVLDGDNMRDVILKMVTDIVENAVD LSITDEQMPEEWDLGELNTLLLPIVPIKPVVLPEDKKIKKNELKHMLKEEAIKLYETKEA EFPDAEQIREIERVILLKVIDNKWMSHIDDMDQLRQGIGLQAYGQRDPLVEYKMSGYQMF 
DEMTAAIREDTVRILMHIRVEQKVEREPAAKVTGTNKDDSAAKAPVKRNETKIYPNDPCP CGSGKKYKQCCGRKVV >gi|229784051|gb|GG667684.1| GENE 15 15542 - 16069 493 175 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|28210085|ref|NP_781029.1| SSU ribosomal protein S30P [Clostridium tetani E88] # 1 175 1 176 176 194 58 5e-49 MRFTITGRNIEVTQGLREAVEDKLGKLDRFFAPATEAVVRLSVQKDIQKIEVTIPVKGHI IRAEESSSDMYVSIDLVEEILERQLKKYKNKLIDKKQSAPSFSEAFLQEDASAEEEIQIV KSKKFAVKPMDPEEACVQMELLGHNFYVFLNADTEEVNVVYKRKGGTYGLIEPEF >gi|229784051|gb|GG667684.1| GENE 16 16155 - 17075 496 306 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 4 302 1 303 306 195 38 2e-49 MEKIYDVVIIGSGPAGLTAAIYAKRAELETVVIEKEIASGGQVLNTYEVDNYPGLPGING YDLGMKFREHADKLGAEFVTDDVVRVEKADGLFTVTCEEESYTAKTVIISTGASHRKLAV PGEEELTGMGVSYCATCDGAFFRNRVTAVTGGGDVAIEDAIFLSRMCKKVYLIHRRDELR GARTLQTQLFSLDNVEVLWDTVVEKINGKDQVESVTLKNVKTEESKELPVDGVFIAVGIN PQSSAFDGLVEMDHGYIQAGEDCETSVPGIFAAGDVRTKQLRQVSTAVADGANAITSVER YLTANF >gi|229784051|gb|GG667684.1| GENE 17 17344 - 18075 218 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 1 240 1 238 242 88 29 4e-17 MAKKTVLVTGASRGIGKAIAVKFAKKGYNVVINCLHSEERLLQTKKELETYQISCLAYMG DMGDMEQCRILFDQIRKQFGTLDVLVNNAGISYIGLLQDMSTESWDRIIRTNLTSVFNCC KLAVPGMVEKKQGKIINISSVWGVSGASCEVAYSATKGGINAFTKALAKELAPSNVQVNA VACGAIDTEMNRFLEEDELIALIEEIPAGRLGQAEEVADLVYHLGYKNSYLTGQIIGLDG GWI >gi|229784051|gb|GG667684.1| GENE 18 18129 - 19193 1354 354 aa, chain - ## HITS:1 COG:PAB2303 KEGG:ns NR:ns ## COG: PAB2303 COG2515 # Protein_GI_number: 14520280 # Func_class: E Amino acid transport and metabolism # Function: 1-aminocyclopropane-1-carboxylate deaminase # Organism: Pyrococcus abyssi # 7 353 4 329 330 206 38.0 5e-53 MSMSVKEAEKLFEKLPKAELGFFPTPFYRLDRLSKELGVNLYIKRDDFTGMNLFGGNKVR KLQYLMGAAMSCGCEYVFTFGATQSNHAMQTAAACRRCGLKPVLYLVAIVKPDEDDLRAN LLLDRILDAEVHIVEILEGETEEEAEERAVILAREHMARLNKEAGRCICYEVPMGGASPV 
GSVGFIEGYVELEKQLSAMGLRADYVFHATGTGGTMAGLAAGRNLVGSGTEIISINVSAK DPEYPNRTAALANESLKLIGAGITVEAERDIHTDLNYYLPGYEIPNEAASEAIKLLAEKE GLLTDPVYTGKAFAGMLDYIRTGKVPAGSSVVFWHTGGATALFAEKEILGDLFE >gi|229784051|gb|GG667684.1| GENE 19 19193 - 20572 1398 459 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 9 439 5 432 448 295 40.0 1e-79 MEPNKGKENELGTESVGKLLFRLSMPAIAAQVINVMYNMVDRMYIGHIPGVGAAALTGVG VTMPVIMAITAFAYFISMGGAPRSSIMLGRNEKDKAEKILGNCTTMLVVLALILTAVFLV FGRSILLLFGASENTIGYAWDYMKIYALGTIFVQLALGLNAFITAQGYAKTSMYTVLIGA VCNIILDPIFIFGLNMGVKGAALATILSQAVSSVWVLRFLTSEKSSLRIKRKGLKLEAGV ILPCMALGTSPFIMAFTESILSVCFNSSLLRYGGDVAVGAMTILSSVMQFSMLPLQGLTQ GSQPIVSFNYGAGNAERVKKTFKILFLCCIGYSTLLWALSMFCPQIFIMIFTSDAALTEY THWAIRIYMAASLMFGAQVACQQTFIALGNAKVSVFLALLRKVFLLIPLIFILPRFVENQ TMGVFLAEPVADFCAVCVTVTMFSVSFKKILARMEGGKK >gi|229784051|gb|GG667684.1| GENE 20 20573 - 21019 549 148 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3318 NR:ns ## KEGG: Cphy_3318 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: C.phytofermentans # Pathway: not_defined # 6 144 2 140 141 109 41.0 5e-23 MEQKDEGYFAYYAKIDKVYRKLCALEVAEYCFTPNEIVVLMFLANNPGYDSASDIAHFRN ISKGLVAKSVETLCERGYLRAGKDLKDRRLIHLSLTDKSDDIVARLRKCRQEFAGKLYAG VPQEDMEAMARTTKVINENLDNILKGMK >gi|229784051|gb|GG667684.1| GENE 21 21184 - 21867 478 227 aa, chain - ## HITS:1 COG:PM1364 KEGG:ns NR:ns ## COG: PM1364 COG0235 # Protein_GI_number: 15603229 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Pasteurella multocida # 4 188 1 189 210 109 34.0 4e-24 MRYMTEKEARRVLTDLGRRSYERAYVAANDGNISCRISEDTILITPSGVSKGYMTDDMMV KAALDGSVLSEGKPSSEMKMHLRIYREDPEICGVTHLHPPFATAFSAAGIPLNRAILTEA VMAVGVVPVAPYALPGTEEVPESIAPFIHDYNSVLLANHGALSWGKDVYQSFYFMESVEQ CARIQLYLNMLGSSATLHEEQVERLLEMRKVWNIDRGGIPVTGKMLI 
>gi|229784051|gb|GG667684.1| GENE 22 21870 - 22466 614 198 aa, chain - ## HITS:1 COG:BS_yczE KEGG:ns NR:ns ## COG: BS_yczE COG2364 # Protein_GI_number: 16077426 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 21 180 32 190 215 68 28.0 7e-12 MVIVGSMMMGASIDLFVQADFGLDPLSLFQAGLGRVLHLSLGTTSQVLMISIIILLFFID RKRIGIGSILNSVLVGTFIKWFSPMICTGGGAAAERLLCLLAGLILMGVGIGTYVAAGLG EAGMDAMMMYLSGKLKKNVNVTRISIDILLSITGFLLGGKLGGATVVSMVVNGYMIQFTI ELIGRIRKNRAVIEVGGQ >gi|229784051|gb|GG667684.1| GENE 23 22503 - 22985 298 160 aa, chain - ## HITS:1 COG:sll1188 KEGG:ns NR:ns ## COG: sll1188 COG3542 # Protein_GI_number: 16329452 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Synechocystis # 6 144 9 152 164 105 40.0 4e-23 MIESIINELKLVPLAGEGGMYRYLYAGEKDSAGRESYSSIYYLLTERSFSHMHRLKTDEI YHYYMGDGLELLLLYPDGTAEIKVLGTRLSEGEVPQLRVPAGVWQGSVVAHGGKYCLVGT TMAPAYLPEDYEHGDAETLCAGYPDAEAAVRRRCGEAVAR >gi|229784051|gb|GG667684.1| GENE 24 22982 - 24013 830 343 aa, chain - ## HITS:1 COG:BMEI1388 KEGG:ns NR:ns ## COG: BMEI1388 COG0673 # Protein_GI_number: 17987671 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Brucella melitensis # 1 340 54 388 390 108 24.0 2e-23 EALKASGIKIGAIVDVSEEKAKAFAEEFSIEKWGTDSSILLDDEITTVHVCTPPNLHYDM VMKLLDHKKNVLCEKPLCFDDSQAEELARRAKECGVVCAINFNVRFHMACQKARKLVESE EFGRVNMIHGNYLQEFNAFPAPVDWRYNPVLAGRMRAVTEIGTHWFDIAQYISGRKVKAL SANFGRFYPKRYVEDRTMYPDSGDGKRKETIEVCSEDAAAINLKFDNGAIGTVLLSEVSQ GRINRITLEVTGEKKNLWWNSEDNNILNTASKGSGVNSEIFGFGNGFMDTFRSLVDSFYE DVKRGCVSEAPVYPTFFDGRDIVKLCNAMLESADSEGKWVVVE Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:37:07 2011 Seq name: gi|229784050|gb|GG667685.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld78, whole genome shotgun sequence Length of sequence - 29246 bp Number of predicted genes - 27, with homology - 26 Number of transcription units - 12, operones - 7 average 
op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 4 - 171 129 ## EUBREC_2457 putative recombinase 2 2 Op 1 . + CDS 308 - 2329 2135 ## COG3711 Transcriptional antiterminator 3 2 Op 2 . + CDS 2353 - 2586 300 ## Mahau_1979 major facilitator superfamily protein 4 3 Tu 1 . + CDS 3560 - 4486 919 ## Mahau_1979 major facilitator superfamily protein + Term 4498 - 4561 23.0 - Term 4486 - 4547 19.0 5 4 Op 1 . - CDS 4594 - 5100 470 ## gi|266622733|ref|ZP_06115668.1| BclA protein 6 4 Op 2 . - CDS 5221 - 6903 1117 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 6929 - 6988 7.3 7 5 Op 1 . - CDS 7520 - 7735 70 ## 8 5 Op 2 . - CDS 7747 - 7989 266 ## CD0510A conjugative transposon protein 9 5 Op 3 . - CDS 7973 - 8410 333 ## CDR20291_1773 hypothetical protein - Term 8801 - 8857 14.8 10 6 Tu 1 . - CDS 8880 - 9236 72 ## COG0820 Predicted Fe-S-cluster redox enzyme - Prom 9291 - 9350 4.7 11 7 Tu 1 . - CDS 9883 - 10068 101 ## gi|225388238|ref|ZP_03757962.1| hypothetical protein CLOSTASPAR_01973 + Prom 10388 - 10447 4.7 12 8 Tu 1 . + CDS 10467 - 10823 303 ## CD1100 putative conjugative transposon protein + Term 10844 - 10887 8.2 + Prom 10940 - 10999 2.2 13 9 Op 1 . + CDS 11052 - 11354 247 ## CD1101 putative mobilization protein 14 9 Op 2 . + CDS 11315 - 12649 756 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) + Term 12801 - 12851 18.1 - Term 12784 - 12843 22.7 15 10 Op 1 . - CDS 12850 - 13059 346 ## gi|266622744|ref|ZP_06115679.1| conserved hypothetical protein 16 10 Op 2 . - CDS 13064 - 16462 3136 ## CD1105 putative DNA primase 17 11 Op 1 . - CDS 16953 - 19028 1417 ## COG0550 Topoisomerase IA 18 11 Op 2 . - CDS 19025 - 19750 879 ## CD1107 hypothetical protein 19 11 Op 3 . - CDS 19740 - 20006 263 ## gi|266622749|ref|ZP_06115684.1| conserved hypothetical protein 20 11 Op 4 . - CDS 20028 - 21995 1294 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 21 11 Op 5 . 
- CDS 21992 - 22840 522 ## COG0863 DNA modification methylase 22 12 Op 1 . - CDS 22945 - 25341 1438 ## COG3451 Type IV secretory pathway, VirB4 components 23 12 Op 2 . - CDS 25301 - 25693 267 ## Ethha_1896 hypothetical protein 24 12 Op 3 . - CDS 25695 - 26591 567 ## COG1032 Fe-S oxidoreductase 25 12 Op 4 . - CDS 26608 - 27471 682 ## Ethha_1894 hypothetical protein 26 12 Op 5 . - CDS 27542 - 27790 251 ## Ethha_1893 putative conjugative transfer protein 27 12 Op 6 . - CDS 27821 - 29119 725 ## COG3505 Type IV secretory pathway, VirD4 components Predicted protein(s) >gi|229784050|gb|GG667685.1| GENE 1 4 - 171 129 55 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2457 NR:ns ## KEGG: EUBREC_2457 # Name: not_defined # Def: putative recombinase # Organism: E.rectale # Pathway: not_defined # 1 54 410 463 464 99 85.0 5e-20 MGVPKELAWKAGNSRRGYWFTTQTVAVNMAMTKERLINSGYYDLATAYQSVHVNR >gi|229784050|gb|GG667685.1| GENE 2 308 - 2329 2135 673 aa, chain + ## HITS:1 COG:SA0321_1 KEGG:ns NR:ns ## COG: SA0321_1 COG3711 # Protein_GI_number: 15926034 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Staphylococcus aureus N315 # 19 475 25 483 504 95 25.0 4e-19 MLNIIRYLKSQGESTYREMAKELGVSERSVRYDVDRINDILSLEHLPEIEKHSKGLLTYP ASLDLKGLEEGNGVVYTGKERMSILLLILMIRNEDLKINKLSEQFGVSRSTVKNDMAALD EELKKNEMGIGYSGHFYLAGPKLKRTTLMNQEFRKYIDYLINPFTNYNSYEFYCIHIIHK AFEGVSIANVVMAVDGLLEELGCTLTSSSYLWYMSNIVVLVWFILHDKEYPLDITVMPEF DREIYGRFAGKLEEIIGRPVSEYHVYMMAKMYDFTNKMTGSGAEVDTVHAQSVTFSLIAA MSKRLDMPFDKDRTLAEGLLNHMIPLIQRIRNHVEINDHMISLLGPEERKLYSVMTAVCR EIDTLKDTANEDEIVYLTVCFMASIRRMKSAPYKRVLLVCGHGYGTTTMLKESLLSEYQI HIMDTIPIYKLSAYPDWAGIDYVLSTIRLNNSLPKPCIVVNPILRPEDKAAIEQLDIPRK TILSGFYSIEEKLGFLDEATRARVMEVIERELGYQTVKEVHNPKSFSSFLKFDCIRLTDE EYEWKAAVRESAALLEKRGFIDSTYTENMIEFIEEQGFYAVSDDSFALLHGKGVEGIHQT SLSLLVSRQPVHFGEKKAKVILCLASRDSKEHIPAVVTLMRMVKTTPLIHDLELCSSEEE IYQTILNCEFQVL >gi|229784050|gb|GG667685.1| GENE 3 2353 - 2586 300 77 aa, chain + ## HITS:1 COG:no KEGG:Mahau_1979 NR:ns ## KEGG: Mahau_1979 # Name: 
not_defined # Def: major facilitator superfamily protein # Organism: M.australiensis # Pathway: not_defined # 4 76 2 74 394 106 68.0 3e-22 MTSQIRQNYKHTINACYIGYITQAVVNNFAPLLFLTFQRSYGISLGKISFLVTVNFGVQL LVDFLAAHFVDRIGYRI >gi|229784050|gb|GG667685.1| GENE 4 3560 - 4486 919 308 aa, chain + ## HITS:1 COG:no KEGG:Mahau_1979 NR:ns ## KEGG: Mahau_1979 # Name: not_defined # Def: major facilitator superfamily protein # Organism: M.australiensis # Pathway: not_defined # 1 294 97 390 394 356 61.0 9e-97 MMEDSYSGLMAAIILYAVGGGLLEVLVSPIVEACPTERKAAAMSLLHSFYCWGHMGVVIL STAFFYLFGIDQWQVLAVIWAAVPLANALYFFLVPIRTLAEEEEGLSIRRLSGKMVFWIL MVLMICAGASEQAVSQWASAFAESGLHVSKSVGDLAGPCAFAFFMGVSRVVSAKYSDRIS LERIMLGSGGLCVVSYLLIALSPWPVLSLAGCALCGFAVGALWPSTFSLAMKKIPGGGTA MFALFALAGDIGCAGGPLVVGRISGVFGDDLRIGILAALVFPVILTVGVLASGRSKGGAR NKTVSSGC >gi|229784050|gb|GG667685.1| GENE 5 4594 - 5100 470 168 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622733|ref|ZP_06115668.1| ## NR: gi|266622733|ref|ZP_06115668.1| BclA protein [Clostridium hathewayi DSM 13479] BclA protein [Clostridium hathewayi DSM 13479] # 1 168 1 168 168 280 100.0 2e-74 MGPAGPTGPTGETGNTGPVGPTGVTGPTGATGPVPDDIYASFINFAALLTDASLIPMGIG VSDPTGNIVLSDPTRITLAPGIYAIFYEVSALLADSGFMQVTPYYNNAPHIEYGIYFMTN SGRSSAFGSVSFMIEVPVQTVFNLTFNSPVTATEGTLTMVIHKLRREV >gi|229784050|gb|GG667685.1| GENE 6 5221 - 6903 1117 560 aa, chain - ## HITS:1 COG:CAC1228 KEGG:ns NR:ns ## COG: CAC1228 COG1961 # Protein_GI_number: 15894511 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Clostridium acetobutylicum # 15 484 9 470 544 137 26.0 5e-32 MPRIRKDKTARSYAEPFWKIGLYIRLSKEDDNEDESESVINQEKILRDFVDSYFEPGTFV IVDVFADDGLTGTDTARPNFKRLEGCIVRKEVNCMIIKSLARGFRNLADQQKFLEEFIPI NGARFICTGTPFIDTYANPHSASGLEVPIRGMFNEQFAATTSEEIRKTFKMKRERGEFIG AFATYGYKKDPNDKNSLLVDEEAAEVVKSIYHWFVNEGYSKMGIAKRLNQMGEPNPEAYK KKKGLKYNNPNSGKNDGLWSASTIARILQNEIYTGVMVQGRHRVISYKVHKQISVPEDEW FVVPNTHEAIIDRETFEKAQALHKRDTRTAPGKQEVYLMSGFVRCADCQKAMRRKTARNI 
AYYSCRSYTDKKICSKHSIRQDKLENAVLAALQMQIALVDQLAEEIERINNAPVINRENK RLSYSLKQAEKQLKQYNDASDSLYLDWKSGEITKEEYRRLKGKIAEQIQQLEANISYLKE EMQIMADGIGADDPYLTAFLKYKNIQSLNRGIVVELIKAVWVHENGEITVDFNFADEYQR IIDYIENNHNIITVIESKAI >gi|229784050|gb|GG667685.1| GENE 7 7520 - 7735 70 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRGACPLSTLPVMGYLSFRKSISAVDVLLQADKAYCSLTKKAAFGAQHNRQTGTFRLYRA SRRVRHDSFSP >gi|229784050|gb|GG667685.1| GENE 8 7747 - 7989 266 80 aa, chain - ## HITS:1 COG:no KEGG:CD0510A NR:ns ## KEGG: CD0510A # Name: orf8 # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 5 79 1 75 76 73 46.0 2e-12 MMNTVRKSENLLPFPVISAAANGDTTAMCAILKHYEGYIAKLCTRTLKDDAGNTYSYVDE EMRNRLQVRLITRTLAFHVG >gi|229784050|gb|GG667685.1| GENE 9 7973 - 8410 333 145 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1773 NR:ns ## KEGG: CDR20291_1773 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 145 1 145 145 238 100.0 5e-62 MELSSSDKERIQHQYDALAKKTLVGEAKSHRRTLAKRAAREVTFSDLSESELAQLFTTDE YESDYFRFQVSGFDVLVKNELLAEALNALPERKRDIILLSYFLDMSDAEIGELLNVVRTT VFRHRKSALAKIKQYLEGKADDEYR >gi|229784050|gb|GG667685.1| GENE 10 8880 - 9236 72 118 aa, chain - ## HITS:1 COG:sll0098 KEGG:ns NR:ns ## COG: sll0098 COG0820 # Protein_GI_number: 16331844 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Synechocystis # 2 107 230 335 350 77 34.0 5e-15 MRIYAIEDVVKQALSYSERHNRKIVFAYLLLPGINDRPSDVRQLAKWFRGKKVMINVLQY NPTSNSRIKAPQKREIVAFKHQLEQAGLEVTMRVSHGREINAACGQLANTYNKFKKND >gi|229784050|gb|GG667685.1| GENE 11 9883 - 10068 101 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225388238|ref|ZP_03757962.1| ## NR: gi|225388238|ref|ZP_03757962.1| hypothetical protein CLOSTASPAR_01973 [Clostridium asparagiforme DSM 15981] hypothetical protein CdifQC_16838 [Clostridium difficile QCD-66c26] hypothetical protein CdifQCD_17118 [Clostridium difficile QCD-37x79] conserved hypothetical protein [Clostridium hathewayi DSM 
13479] conserved hypothetical protein [Erysipelotrichaceae bacterium 5_2_54FAA] conserved hypothetical protein [Clostridium difficile NAP08] hypothetical protein CdifQ_19480 [Clostridium difficile QCD-32g58] hypothetical protein HMPREF9474_03073 [Clostridium symbiosum WAL-14163] hypothetical protein HMPREF9475_01932 [Clostridium symbiosum WAL-14673] hypothetical protein CLOSTASPAR_01973 [Clostridium asparagiforme DSM 15981] conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Erysipelotrichaceae bacterium 5_2_54FAA] hypothetical protein RBR_01080 [Ruminococcus bromii L2-63] hypothetical protein ES1_15110 [Eubacterium siraeum V10Sc8a] conserved hypothetical protein [Clostridium difficile NAP08] hypothetical protein HMPREF9474_03073 [Clostridium symbiosum WAL-14163] hypothetical protein HMPREF9475_01932 [Clostridium symbiosum WAL-14673] # 1 61 1 61 61 106 100.0 5e-22 MLKKYWIKCPICNGKTRVQVFHNTVLKDFPLFCPKCKLTHIIDVEKLEIVIKNTEKQTFI I >gi|229784050|gb|GG667685.1| GENE 12 10467 - 10823 303 118 aa, chain + ## HITS:1 COG:no KEGG:CD1100 NR:ns ## KEGG: CD1100 # Name: not_defined # Def: putative conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 117 1 116 117 155 73.0 5e-37 MAKRPVPLYDFKAFGAAIKAARNEYGESRKKVSDELYISPRYLANIENKGQQPSLQVFYD LVTRYHISVDQFFFPNSNAEKSTGRRQLDALLDGMSDKGIRIVTATAREITEVEKAED >gi|229784050|gb|GG667685.1| GENE 13 11052 - 11354 247 100 aa, chain + ## HITS:1 COG:no KEGG:CD1101 NR:ns ## KEGG: CD1101 # Name: not_defined # Def: putative mobilization protein # Organism: C.difficile # Pathway: not_defined # 1 97 10 106 109 105 60.0 4e-22 MKFYVTEEEKRLIDEKMKQLPIKQYGAYFRKMAIDGYILVVDRSDTKAYIRELQAVSRNI NQIAKRANATGTVYRQDIEDIKKAVDEIWRLQRRTLLNQP >gi|229784050|gb|GG667685.1| GENE 14 11315 - 12649 756 444 aa, chain + ## HITS:1 COG:SP1056_1 KEGG:ns NR:ns ## COG: SP1056_1 COG3843 # Protein_GI_number: 15900926 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV 
secretory pathway, VirD2 components (relaxase) # Organism: Streptococcus pneumoniae TIGR4 # 1 236 1 258 402 81 30.0 3e-15 MAVTKTHPIKSTLKAAIDYILNPEKTDGKLLASSFGCGLETADIEFAWTREAAGDRGTHL GRHLIQSFAVGETTPEEAHKIGMELAGAVLGGKYEFVLTTHVDKDHLHNHLIFNAVSFVD YKKYHSNKQSYHFIRRTSDRICKEHGLSVVVPGQDKGKSYAEYTAEKQGTSYKAKLKTAI DTLIPQVKDFDELLRRLQEMGYEIKQGKYISFRAAGQERFTRTKTLGAAYTEEAIKERIK GVYVAKTKTLREDKKIRLVVDLENSIKAQQSAGYERWAKIHNLKQAAKSMNFLTENKIEY YSELESKIADIMTAHDAAAKAVKEVEQRMSDLSLLIKHTTTYRQLKPIYDEYRKSPDKEK YLRGHESEIILFEAAARALKEMQIKKLPDLAALRKEYRSLNDRKTKLYEDYRQAKKQMQE YGVVKKNVDSILYPSQSRAREQER >gi|229784050|gb|GG667685.1| GENE 15 12850 - 13059 346 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622744|ref|ZP_06115679.1| ## NR: gi|266622744|ref|ZP_06115679.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 69 1 69 69 99 100.0 1e-19 MNIRFTIEEENIVAIYAEDSRETTLENILAAMPYMDEDIRQLAAAAVNKLQAMTDSEFSE RKFILTDEE >gi|229784050|gb|GG667685.1| GENE 16 13064 - 16462 3136 1132 aa, chain - ## HITS:1 COG:no KEGG:CD1105 NR:ns ## KEGG: CD1105 # Name: not_defined # Def: putative DNA primase # Organism: C.difficile # Pathway: not_defined # 232 795 160 711 1343 469 49.0 1e-130 MPYSSYDHDKLETAETMRIERRIYFEAKDGDIAPYVSLPPEQLHAMREESAAAEQAIFND LSSRAAAWEEQAGKTLLLDKAIEYTRTTVVQHTSNEWQKGEYDRYTRSNRVYQMNYHIYE NTRYDREKQQSVPYSYSLTWGVYTNSPNRNGQAKIAGQDRKVFSDKAAMEKYLNGRIKAY DRLFTEISPPIPQEYAEHFKVNGMLLPGYTIEGEEPPQQQPQTAAIPETTGQQKEREHMS EQFSIMIGNRSRFEAGDPDGYWLDMPATKEQLHEAMQSVGITADNPQDFSIRGFSDDPEK HIALPYDMVCAASVDELNFLAARIEQLDPAEIGKLNAALQQKNGFENIGQVIDFTYNVDF YVHIPEVHTYRDLGDYYLNQSGMVQMPEEWKGGIDLAAFGRNAAEHEKGAFTEYGYIVES GDEWERQFEGREVPEEYRIMSYPQQERGEQDKVFMDAAEAQQIETPTAEPPQPRPVVPII LSSEKPGDKVKEITARLEQGIQAVFDSDRYKEFLTAMSKFHDYSLNNTILIAMQGGNLVK GYSQWQKEFDRHVKSGEKGIKIFAPAPYKVKKLVDKIDPDTRKPILDSEGKTVKEEKEVT VPAFKVVTVFDISQTEGKEFPDLSVKPLLADVEQYEDFFAALEKASPVPIAFEQIGGSAN GYFSLTDKRIAIKEGVSELQAVKTAIHEIAHAKLHDVDLNAPLEEQNRVDRRSREVQAES 
VAYTVCQHFGLDTSDYSFGYVAGWSSGKEMTELKASLETIQATAKELITEIEGHFTELQQ QRQTEQEQEAKTKETLLLNGKEDSYGIYQLARREETMDLRFEPFDRLQATGHTVDRANYE LVYTAPLTKDMTLGGIWEKFNIDHPADFKGHSLSVSDIVVLHQNGENTAHYVDSIGFQQV PEFLQEQAQQTPVFDKLSPEQQQALSDTVKDTLQLLVDADNRIYGDVTGKTLEAIAAQGY SYKDGQIEKQQPEATPDSLLTGETVRTPRGNFHITDMSREQIEAAGFGFHHASEDGKYLV MGNGTQAFAIAAEQPQRDNPLKHVEDTIEQNDNNFDGLINNTPQTPTVADFEQRAKAGEA ISVTDLAKAVKAEKREQPQKKPSILKKLDEYKKQAAQQTKDKQKEHKKDLEV >gi|229784050|gb|GG667685.1| GENE 17 16953 - 19028 1417 691 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 3 652 5 676 709 482 40.0 1e-135 MKLVIAEKPSVGAAIAAVLGANEKRSGYFEGSGYLVSWCIGHLISLADAATYNEQYRKWK YDDLPIVPQDWQFTVASGKEQQFSVLKDLMHRSDVSEIVNACDSGREGELIFRFVYEQAN CKKPFSRLWISSMEESAIREGFSNLKDGRSYDDLYQSALCRAKADWLVGINATRLFSILY HKTLNVGRVQTPTLTMLVNRDYAISSFKKEKYHVVRLNAGGVSALSERLNDEAAAQQMKA ACEKSQAVCTSLKKEKKTVSPPKLFDLTALQREANRLYGFTAKQTLDYAQALYEKRLLTY PRTDSKYITSDMQDSTKELITGLCSLLPFMRDVKLQADLTRVCDNSKVTDHHAILPTAEF LKAGFSSLADSETKLMTLVCAKLLCAAAAPYEYEAVTAAISCGGYTFTAKGKTTLCEGWC EIEKLSRAASEEQDEDAEPETVLPPLAEGQTFDNPAAEISERYTQPPKAYTEDTLLSAME NAGKEETPEDAERKGLGTTATRAGIIEKLISAGFAERKGKKLIPTKDGYNLAAILPEVLT SPQLTAEWETRLTGIAKGSDSPDDFMRSIEEMTAGLVKTYSAISEDKAKLFTPQREAIGT CPRCGAAVYEGKKNYYCSDRACGFVMWKNDLFFQQRKKTFTKAIAAALLKDRKVKIKGMY STKTGKTFDGVVLLADTGGKYVNFRVEQNRK >gi|229784050|gb|GG667685.1| GENE 18 19025 - 19750 879 241 aa, chain - ## HITS:1 COG:no KEGG:CD1107 NR:ns ## KEGG: CD1107 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 5 235 6 229 244 184 51.0 3e-45 MKHKKLFRSVTALLAALVLMGGFSVTAFAGGGDVSDEPPTPVTETEPEETTGGYEPQPLT PEGNMSLVDDITGEASGDKQFITVVTKNGNYFYIIIDRAEDGENTVHFLNQVDEKDLLQL MEDEQTETPACSCTEKCVAGSVNTDCPVCSVNMSECAGKEAEPEPEETEQPEPEEDKGGD MGGAILLVVLLLAAAGGGAFYYFKIMKPKKDAAKGSSNLDDLDFDEDDEEELETEQEDEE V >gi|229784050|gb|GG667685.1| GENE 19 19740 - 20006 263 88 aa, 
chain - ## HITS:1 COG:no KEGG:no NR:gi|266622749|ref|ZP_06115684.1| ## NR: gi|266622749|ref|ZP_06115684.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 77 1 77 88 112 100.0 1e-23 MANTKLDRIERDIEKTRAKILEYQKRLKDLEAQKTEEENAQIVQMVKAVHLDGAQLAAFL SAYASGEIALPQPDADYTDEQEETEDEA >gi|229784050|gb|GG667685.1| GENE 20 20028 - 21995 1294 655 aa, chain - ## HITS:1 COG:BS_yddH_2 KEGG:ns NR:ns ## COG: BS_yddH_2 COG0791 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus subtilis # 529 652 4 123 124 95 40.0 2e-19 MSKELKAPDKVTQKMTRDGAIAENLTTGETTSISERPPEENYSAPAAGAAEKTTQRISTE VSRHFEKGAEKKTLDAAQLKTHTSRLQFSEAERATPELQKAIKKSDTAADRLDKARAAIP KQTKITKERVFDEAKGKAKTRLAFEKTDKPPNGKLRHNPLSRPAQELDAAVHGKIYEVEK ENVGVESGHRIELVGEKAGGMTTRAVKRGIRSHKLKPYRAAAAAEKKAAKANADFLYQKA LHDNPALAHSNPISRFWQKQRIKKQYAKALKTGGKVKKTAEATAKAAKKTAQETKRATFF VVRHWKGCLIVGGIAFIVLLFFGGLSSCSLFGGNSGSGLIASSYLSEDADITGAESAYVA MEAELQDMLDNIESEYPGYDEYRVTADEIEHDPYVLISILSAWHEGVFTLDEAQSTLEML FDKQYILTVTEEVEVRYRTETRTDSEGNEYDVEVPYNYYILNVDLENFNLSHVPVYIMGE EQLSMYAMYMSSLGNRPDLFPQSGYISKYYENPPADYEIPPAYLEDEQFARLIEEAEKYV GFPYVWGGSSPETSFDCSGFVSYVLTNSGLYNTGRLGAQGLYNISTRVSNPQPGDLVFFT GTYDTPGVSHVGIYVGEDGDGSPVMLHCGDPIQYAKLDTSYWQSHFYAYGRLHYN >gi|229784050|gb|GG667685.1| GENE 21 21992 - 22840 522 282 aa, chain - ## HITS:1 COG:XF0641 KEGG:ns NR:ns ## COG: XF0641 COG0863 # Protein_GI_number: 15837243 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Xylella fastidiosa 9a5c # 2 279 81 378 380 124 29.0 2e-28 MRDYGTDGQIGREATPEEYVSRITAVFHEVKRVLTPEGTCWLNIADTYCGTGSKADHQDP KYPKGRNGQQVAVNHRAPGCKPKDLIGIPWLVALALRGDGWYLRSSIIWHKGNAMPESTR DRPTRCYEYVFLLTKSKKYYYDWQAVAEPIAPTTAVRLKSGVGKGNKYAATVPGQNQPQK INRPRRKGAYTDEMISPVRSRRNVWQINTASYRGGHFAAFPPKLAETCILAGCPVGGIVL DPFLGSGTTAAAAKSLSRRYVGIEINPEYCTLAKQRIGGDEH 
>gi|229784050|gb|GG667685.1| GENE 22 22945 - 25341 1438 798 aa, chain - ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 380 776 426 840 853 99 25.0 4e-20 MSKEKKNTQNEKRKLSRQERKQIDTLIRQAKGDGKPHTAQDSIPFERMYKDGICRLANGR YSKCIEFEDINYQLAQPDDKTAIFEALCDMYNSFDASISVQLSLISRHANKDDFKNSITI APQNDDFDSIRAEYTEMLRTQLERGNNGLIKTKFLTFTVEAKDIRSARARLARIETDTLN HFKVIGAAARVLDGKQRLEVLHGIFHPDGERFNFAWEWLPASGLSVKDFIAPSSFRFGDG RMFQMGGKFGAVSFLQIVAPELSDRMLADFMDAENGIIVNLHIQSIDHNEAIKTIKRKIT DLDRMKIEEQKKAIRSGYDMDIIPSDLATYGGEAKNLLRDLQSRNERMFLLTLLVLNLAD TKQKLDNEVFGAAAIAQKYNCILTRLDYRQEQGLMSSIPLGENLIHIQRGLTTSSTAVFV PFTVQELFQSGEALYYGLNALSNNMILCDRKKLKNPNGLILGTPGSGKSFSAKREITNAF LITSDDIIICDPEAEYYPLVGRLKGQVIKVSQNSTQYINPMDINLNYSEGDTPIALKSDF ILSFCELIMGGKNGLEAVEKTLIDRAVISVYRNYLADPKPKNMPILGDLYDEIKKQPEKE AQRIAAALELYVNGSLNIFNHRTNVDIHNRLVCFDIKELGTQLKKVGMLIVQDQVWNRVT VNRSEGKATRYYIDEFHLLLKEEQTAAYSVEIWKRFRKWGGIPTGITQNVKDLLTSREVS NVFENSDFVYMLNQAQGDREILAKQLNISQQQMTYVTHSEAGEGLIFYGNVILPFIDRFP KDTELYRVMTTKPGEVSA >gi|229784050|gb|GG667685.1| GENE 23 25301 - 25693 267 130 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1896 NR:ns ## KEGG: Ethha_1896 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 124 1 126 152 164 68.0 1e-39 MAYVTVPKDLTKIKSKVMFNLTKRQLICFSAAVAIGLPLFFLVKGSVGTSAAALCMVLAM LPMFLLAMYEKNGQPLEVIIRQFVEVKFLRPKERPYQTNNFYAALARQEQLDKEVRAIVK GKEKHPKRKA >gi|229784050|gb|GG667685.1| GENE 24 25695 - 26591 567 298 aa, chain - ## HITS:1 COG:Ta1390 KEGG:ns NR:ns ## COG: Ta1390 COG1032 # Protein_GI_number: 16082367 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Thermoplasma acidophilum # 104 258 73 235 425 77 32.0 2e-14 MKIGLIDVDSHNFPNLCLMKLSAYHKAKGHTVEWWNTKERYDLVYKSRVFTDTYSKDTIT 
VTNAEQVIFGGTGYDTKNRLPPEVEHSCPDYSIYPQFFGIAYGFLSRGCPRNCGFCIVSG KEGRKSVKVADLSEFWKWQPEIKIMDANLLACPDHENLIEQLIRSRAWVDFSQGLDIRLV NRDNVSLLNRVRIKAVHFAWDNPDEDLTGYFQRFLDLTAIKSSRQRRVYVLTNYGSTHEQ DLYRVNTLRAMGFDPYVMIYERPTAPPVTRHLQRWVNNKRLFYAVPRFEDYIPGRKEV >gi|229784050|gb|GG667685.1| GENE 25 26608 - 27471 682 287 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1894 NR:ns ## KEGG: Ethha_1894 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 2 287 4 289 289 372 68.0 1e-101 MIWQMIEDWFRGILTDGILSNLSGLFDSVNTEVGEIATQVGTTPAGWNAGIFNMIRSLSE NVIVPIAGVIITFVMCYELIQLVIEKNNLHDLDTWIFFKWIFKTFVAVLLVTNTWNIVMG VFDVTQSVVNQSAGVIISDTSIDVTTVITDIEAKLDAMSVGGLLGLWFQSLFVGLTMKAL SICIMLVVYGRMIEIYLVTSVAPIPMATMVNHEWGSMGQNYLKSLLALGFQAFLILVCVG IYAVLIQTIAATDDISGAIWACMGYTVLLCFTLFKTGSLAKSVFSAH >gi|229784050|gb|GG667685.1| GENE 26 27542 - 27790 251 82 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1893 NR:ns ## KEGG: Ethha_1893 # Name: not_defined # Def: putative conjugative transfer protein # Organism: E.harbinense # Pathway: not_defined # 12 82 1 71 71 99 88.0 3e-20 MSRKAAESEVYMDFFNSAVGVLQTLVIALGAGLGIWGAINLMEGYGNDNPGAKSQGMKQL MAGGGVALIGGTLVPLLSKLFG >gi|229784050|gb|GG667685.1| GENE 27 27821 - 29119 725 432 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 2 388 184 562 591 179 34.0 8e-45 MQLHSSYVLTDPKGTVLIECGKLLQRAGYRIKVLNTINFRKSMHYNPFVYIRSEKDVLKV VNTLIVNTKGEGEKSAEDFWVKAERLLYCALIGYIWYEAPAEEMNFTTLLELINASEARE DDEEYQSPVDLLFADLEERSPDHFAVKQYKKYKLAAGKTAKSILISCGARLAPFDIEELR ELMSYDELELDTLGDRKTALFLIMSDTDDTFNFVIAILQSQLFNLLCDKADDEYNGKLPV HVRFLLDEFANIGQIPRFDKLIATIRSREMSASIILQSQSQLKAIYRDAAEIILDNADST LFLGGRGKNAKEISDNLGRETIDSFNTSENRGTQVSHGLTYQKLGKELMTQDEIAVMDGG KCILQLRGVRPFLSDKYDITKHPNYKYLSDFDKKNAFDVERYMSTRPAIVKPDEAFDIYE IDLSDEDAAAEE Prediction of potential genes in 
microbial genomes Time: Fri Jul 1 01:38:34 2011 Seq name: gi|229784049|gb|GG667686.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld79, whole genome shotgun sequence Length of sequence - 24526 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 6, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1474 1143 ## Rahaq_2079 alpha-L-rhamnosidase 2 1 Op 2 2/0.000 - CDS 1459 - 2976 1197 ## COG0554 Glycerol kinase 3 1 Op 3 12/0.000 - CDS 2973 - 3905 1092 ## COG3958 Transketolase, C-terminal subunit 4 1 Op 4 . - CDS 3898 - 4764 1003 ## COG3959 Transketolase, N-terminal subunit 5 1 Op 5 . - CDS 4856 - 6265 1678 ## COG2407 L-fucose isomerase and related proteins - Prom 6320 - 6379 7.8 + Prom 6267 - 6326 10.1 6 2 Tu 1 . + CDS 6492 - 7502 962 ## COG1609 Transcriptional regulators + Term 7554 - 7594 10.1 + Prom 7582 - 7641 4.7 7 3 Tu 1 . + CDS 7704 - 8621 504 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) + Term 8668 - 8720 10.8 - Term 8664 - 8700 5.1 8 4 Op 1 . - CDS 8759 - 9685 1076 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 9 4 Op 2 38/0.000 - CDS 9768 - 10613 1132 ## COG0395 ABC-type sugar transport system, permease component 10 4 Op 3 35/0.000 - CDS 10603 - 11511 844 ## COG1175 ABC-type sugar transport systems, permease components - Term 11523 - 11564 8.7 11 4 Op 4 . - CDS 11588 - 12922 1540 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 13106 - 13165 10.7 - Term 13131 - 13183 12.6 12 5 Op 1 . - CDS 13191 - 13418 368 ## Closa_3318 hypothetical protein - Prom 13452 - 13511 2.7 13 5 Op 2 1/0.000 - CDS 13515 - 16673 3413 ## COG0642 Signal transduction histidine kinase - Prom 16859 - 16918 20.5 14 6 Op 1 1/0.000 - CDS 17829 - 19526 1612 ## COG0840 Methyl-accepting chemotaxis protein 15 6 Op 2 1/0.000 - CDS 19555 - 22863 3181 ## COG0642 Signal transduction histidine kinase 16 6 Op 3 . 
- CDS 22902 - 24173 1130 ## COG1653 ABC-type sugar transport system, periplasmic component 17 6 Op 4 . - CDS 24227 - 24367 57 ## gi|288870903|ref|ZP_06409954.1| conserved hypothetical protein 18 6 Op 5 . - CDS 24387 - 24524 79 ## Predicted protein(s) >gi|229784049|gb|GG667686.1| GENE 1 1 - 1474 1143 491 aa, chain - ## HITS:1 COG:no KEGG:Rahaq_2079 NR:ns ## KEGG: Rahaq_2079 # Name: not_defined # Def: alpha-L-rhamnosidase # Organism: Rahnella_Y9602 # Pathway: not_defined # 1 491 1 475 885 451 46.0 1e-125 MLEIVKVLMDYEKDPVGVEEMPQFSWKLKSDKRNVVQSAYRLQIAENRDFQTPVYDSGRV ESSESAHVRPAGTKGDSAAILKSAVRYYVRVRVWTEEEESGWCCGEFVTALLDNREWKAP FVSAESAPACREESRGTLVRGDFSVGKGLTEAYAFTTALGLYQFYLNGSKVGTDEMTPGW TSYRRHLLYQTYDVTGCLKEGINTAGALVGAGWYKGVMGLTRSRNNYGDQTAFAMQMVLR YDDGREEVICTGEDWKGADAPVVFSEIYHGETYDAALEIPGWCGGEEAGEDRACETGRGE EPEIWHKTEIVPFDSSVLHAQGGARVTVIDRIPAKEVFVTPNGETVIDFGQNMAGRICVT GCGRKGDVIELRCFEILDAEGNVYLDNLRKAKTTMKYVFGKDGEITYHPYFTYMGFRYAL VISYPGTPTAEAFTAETLHSEMAPAGEITCSNPLLNQLHHNYLWGLKGNFLDVPTDCPQR DERLGWTGDAQ >gi|229784049|gb|GG667686.1| GENE 2 1459 - 2976 1197 505 aa, chain - ## HITS:1 COG:TM0952 KEGG:ns NR:ns ## COG: TM0952 COG0554 # Protein_GI_number: 15643714 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Thermotoga maritima # 5 493 2 490 492 461 47.0 1e-129 MKQNYIIGIDQSTQGTKAMLFDGNGILLCRADKSHAQLVDERGWVEHNPEEIYQNTLLVV KELMKKSGVDTSAVACVGISNQRETAMAWNRSTGKPVYNAIVWQCARGEEICKRITRLNP GAGEEVRRRTGLRLSPYFSAAKLAWILEHVKEAASLAARGELCCGTMDSYLVYRLTKGRE FRTDYSNASRTQLFHIRELAWDETLCGIFGIPSACLPDVTDSDGFFGMTDFEGILPGEVP IHGVMGDSHGALFGQGCLSKGMMKTTYGTGSSIMMNIGEMPIFSDLGIVTSLAWKTGGRV QYVLEGNINYTGAVITWLKDQLGLLASAGESEELAKRANPADHTYLVPAFSGLGAPYWRS DVSAAFISMSRTTGRAELVKAGLCSIAYQIADIVELMKEASGIHEVELRVDGGPTKNSFL MQFQSDILNSRIRVPEVEELSGLGAAYAAGLGAGIYDESLFGNLRSREYLPAMGPEEREE KLKGWTAAVKMVLASGEGDQSCWRS >gi|229784049|gb|GG667686.1| GENE 3 2973 - 3905 1092 310 aa, chain - ## HITS:1 COG:TM0953 KEGG:ns NR:ns ## COG: TM0953 COG3958 # Protein_GI_number: 15644625 # Func_class: G 
Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Thermotoga maritima # 3 307 2 305 311 283 48.0 4e-76 MNKIPNRQVICDVLMERAGSDRDIVVLCSDSRGSASMTPFAETYPEQFVETGIAEQNLVS ISAGMAKAGKKPYAASPACFLSTRSYEQAKVDVAYSNTNVKLIGISGGISYGALGMSHHS AQDFAAMCAIPNMRVYVPSDRHQTRKLMEALVQDEKPAYIRVGRNPVEDIYEEGRVPFEM DRATAVCEGTDIAMIACGEMVRPAKQAAELLKKHGISASVLDMYCLKPLDEKAVIAAAEG VRAVITVEEHAPFGGLGSMVSQVLGRNCPRKMGILALPDAPVITGTSKEVFDYYGLNAEG IAAKAMELLA >gi|229784049|gb|GG667686.1| GENE 4 3898 - 4764 1003 288 aa, chain - ## HITS:1 COG:TM0954 KEGG:ns NR:ns ## COG: TM0954 COG3959 # Protein_GI_number: 15644626 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Thermotoga maritima # 4 270 11 276 286 296 51.0 3e-80 MNEELKALAYALRKNVLDMVYRAKTGHIGGDFSVMEILTSLYMKEMNISPENQDDPDRDY FVMSKGHSVEAYYSVLAAKGFLDIGDVIQNFSKFGSGYIGHPNNKLPGIEMNSGSLGHGL PVCVGIAKACRMDARPSRVYTVMGDGELAEGSVWEGVMAASQYKLDNLCAVVDRNRLQIS GSTEDVMHHDDLAARFAAFGWNALSVPGNDIDALLEAFEQAKKTKGCPTVVIANTTKGCG VSFMENRAEWHHKVPDEAQYQAAVRELSQREAGEHSQSAHNKEVLRDE >gi|229784049|gb|GG667686.1| GENE 5 4856 - 6265 1678 469 aa, chain - ## HITS:1 COG:TM0951 KEGG:ns NR:ns ## COG: TM0951 COG2407 # Protein_GI_number: 15643713 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Thermotoga maritima # 1 447 1 462 471 155 30.0 1e-37 MVSIKLGFAPTRRSIFSAPDALKFAGLTKKRLEELGVEFVSLDGINEEGLLYDDNDLEAV VKRFEEAGVDGLFLPHCNFGTEYVCARLAKRLNVPVLLWGPRDERPDENGIRLRDSQCGL FATGKVLRRFQVPFTYLTNCRLEDKEFENGIRNFLAVCNVVKVFRHTRILQIGPRPFDFW STMCNEGELLEKFNIQLSPVPIPELTSEMKRAKAEGAEVQAVMEYCKSSMCVKINEEQLE NVAALKVAMKRLAGKYGCNAIAIQCWNALQGEIGIMPCAANSLLNEEGIPVVCETDIHGA VTAVMAEAAGMDEARTFFADWTVRHPDNENGELLQHCGPWPVSVAKEKPAIGYPLAFDYP GAVEAEAKHGDMTLCRFDGDNGEYSLLLGHAKGVEGPYTKGTYVWVEVEDLKRLENKLVC GPYIHHCVGIHKDVVPVLYEACKYIGVAPDLYDPIEEEVKAAIYSPSGK >gi|229784049|gb|GG667686.1| GENE 6 6492 - 7502 962 336 aa, chain + ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: 
TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 3 333 6 332 333 156 32.0 5e-38 MSISAKELAQKLNISAATVSMVLNRKPGISEKTRNLVLEAAREYGYDFSKKWDAGEEKGS ILYVIYKKHGTVVADTPFFSLLTEGIEQACKGKGYELQITYFYEHKDISSQIQELSEKNC QGIILLGTEMGVEYFQPFTRLKVPMVVLDTYFEELACDSVLINNVQGAYLAANYLIEKGL KDVGYLRSSYPIGNFEERADGYYKALRHHNLPTGHPYVHRLTPSMEGAYTDMAELLREKT PVAPAYFADNDLIAAGAMRAFKEFGYRIPEDISIIGFDDTPICDFLEPPLTTMEVPKKRL GELAVLRLLQKIGGETKVMIKTEVSVKLHERKSVRE >gi|229784049|gb|GG667686.1| GENE 7 7704 - 8621 504 305 aa, chain + ## HITS:1 COG:BS_ybaC KEGG:ns NR:ns ## COG: BS_ybaC COG0596 # Protein_GI_number: 16077182 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Bacillus subtilis # 8 300 13 311 318 131 30.0 1e-30 MQINEQLKLPINGVLQYVSIRAEEAGLPLLLYLHGGPGDAALPLVLKYNRELESHFTVAV WEQRGAGKSYYDFGDSIITIDTFLQDLHSLIMLLTARIHQKKLFLLGHSWGSVLGLTYAA DHPEMLHAYIGCGQVVNMKKSCRLSYDFAAAHAGPRALERLKHTDCTYTGENWLDDLLFV TGQVVKHGGSLYGRSNYNDLIIPFLFSRYYSIPDLIRRQKGSLQSIKRLWPELMGINFEE KTIYKVPVVLIEGRHDMHVSSELAKEYFDTIETDKQFFWFEESCHFPQWSENRRFHKLMT ELKNR >gi|229784049|gb|GG667686.1| GENE 8 8759 - 9685 1076 308 aa, chain - ## HITS:1 COG:BMEII0088 KEGG:ns NR:ns ## COG: BMEII0088 COG1957 # Protein_GI_number: 17988432 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Brucella melitensis # 17 186 24 213 331 72 32.0 1e-12 MRYTDYKFHVPEEKIVRLIVDTDAKNEADDQFAIVQALLSPKFENVGMVAAHYGTRHADS MERSYRELEVIFDKMHFPKEGMIYHGAPHAIPDKTTPVVSEGAELIVREAMKEEDRPLFA IFLGPLTDLASALLMEPRIAARMTAIWIGGGRYPAGGPEFNLGNDIHAANVVFQSKMEIW QVPKNVYEMMPLSLAELEYKVAPYGEIGEYLLQQLDEHAQEEGPRNSAFRTGETWVVGDS PAVGLLLYEHRFEFDWLEAPLVTEDMAYVHTKLNRPIRIYHSIDSRLILDDFFAKLALFH EKHPKGLI >gi|229784049|gb|GG667686.1| GENE 9 9768 - 10613 1132 281 aa, chain - ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 
16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 17 281 38 300 300 155 33.0 7e-38 MSDRMFSMNKGMKVLAYAVLGAYTIVVAYPILFMFFTSFKSNKEFFVNLFGLPQNVEIMN YAKAWSVGKLGTYFGNSVFVTVVSVAVTTIVSLLGGYALAKLYIPKANMIMDAFMVFNFI PGIAIYISLYSMMSKMHLTGGRASLILPYIAWQIPFSMYIIKQYFETIPEELIESGRIDG CTEFEAFFHIMLPLVKPAIATIIVFTFIGNWGELMWANITTASNALIKTLPVGLLNFKTE MGVEWGQYTAGICMVTLPLMLVFGYFQKYFVSGLTNGAVKG >gi|229784049|gb|GG667686.1| GENE 10 10603 - 11511 844 302 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 20 301 10 292 292 177 35.0 3e-44 MQKSKRKMSSIEKKYMVAGFAFLMPAIVIYLIFAVIPFFDSFILSFQEWSGFGPRSFLGF KNYISAFRDKTFLMAIRNSVYLGVTSAFFSVIIGVLMAWLLLYVGKKWGGFFRTILFSPS MIPAVITALIFAFVYEPEIGILNNILGAVGLENLQHAWLTNKGTALNCILFVSAWKQVGL TMVLCFAGMQGIDNSLIESARLDGATDFQIFKKIILPLIMAFVQLSTIFAVMSGLKIYDT VVALTNGGPAKATTVMPLWILQNSFSFNNYGYGSAMSMIFVLIVLIGMILVKKLIKGGND ER >gi|229784049|gb|GG667686.1| GENE 11 11588 - 12922 1540 444 aa, chain - ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 3 364 2 339 422 114 24.0 4e-25 MKRKLTTVLTVALAAGMMMTGCGSSNSGTETKESAATAAADSTAADTTAAAEAAKSGEVT TIRFYGSDSEYNQNIVAGFQAENPDIKVEIVPVDFDNAEQVIKTGIASGDPVDVSFFWGS QINAFTEDNMAYDLTPYLTENNNEWKDIFVPAYIDAGMVDGKYYAVSYQPVIETIFYNKD LFTQYNVELPTTWDEMVEACKVFKDNGVYGIGNWNGQNHQLLQFAYQYMANDGTLEEFVS GKGDFTQCEGLKKCLENWKSVYDAGYWYPGEGALTSTKDQTQAAWYQGKVAMMFDAGSNA GEYTKNSEFEVGILPFPYVAEGGKYALNTVTNALFIPANAKHPEEAVRFMKYYTSDAGIE EIIKSGRLPATISMQDKVDSQILKDLLATTVGDNVVGYKQMQGLSSEINSFLQNDMIGMV CSGTSVEDTLQQLEDLRQGTVKGE >gi|229784049|gb|GG667686.1| GENE 12 13191 - 13418 368 75 aa, chain - 
## HITS:1 COG:no KEGG:Closa_3318 NR:ns ## KEGG: Closa_3318 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 74 1 74 75 81 62.0 9e-15 MSRHYLVYSLSDPSKAQRIVSEIGKVAGVKMVNISEDLRELKIDAEAAEFSTVMDRVVNI YSKVSYGCEVKYKFT >gi|229784049|gb|GG667686.1| GENE 13 13515 - 16673 3413 1052 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 519 907 265 663 676 181 32.0 8e-45 MEVNLSYDQLNDEYSMLMSVLGASVSKHLIDEHYTCIWANDYYYELIHYTKPEYEALFQN HCDKYFYNNPEGWKLLTEKINSALEKKEKRYTIFLPMIYPDGEKFWVKLQSVFTDECIDG YPVAYTTMTDVTEMVLARQEQEYTQQVVEKVTQEQEMLMGALNVSVSKHLIDEHFTCVWA NEHYYKLIGYPKAEYEALFHNHADEYYRNNPEGWEMLANKVQSVLERGEDQYELIIPMKY KDGTGYWVKVFSYFTDEYINGIRTSYTVMTDVTELVHMKDEQEMLMKTMKVSVSRHLVDK HFTVLWANHFYYELIGYSKQEYEALFHNHCDEYFQNNQDSWNAIHRKINQMYENGKKSYE VFLPLKIPDGSTCWVKMVGFFTDARQDGKQLAYTTMVDVTDMIQIQQEKTIAYDNIPGFI VKYRVLPERLEMIEASDRITDIFDVDLSRLSALDCYSILQPESRAVLEANHPSFLRREPF EGSICVKDKYGRDRWFQIHCACIDFVADDPVYLTVFIDITDITELRELQSKLEERTVMLN TALEEAKQANEAKSDFLSRMSHDIRTPMNVIAGMTEIAGSHVYEPERVKDCLQKIRLSSH HLLGLINDVLDMSRIENGNFHINMVPVSLPAVLREVIMITLAGIREKAQIFDVHLVNLED EQFYSDALRLRQILLNLLSNACKFTPVSGRIVFEVEQKDRTALIFTVSDTGAGMSPEFLE HIFEAFTREQDSRTDQIEGSGLGMCITKRLVDMMEGTITVSSCPGKGTTFQVTLPAAVVQ EEEETAFQGVRVLLADHDPAVLQYGCQALSSLGAEAEGAGSGEEAARLVLSRHREGREFS AVILDLDLSEPDWLETVRKIREAGDGSRPRLILSAYDWNDRKEEAMRAGICGFVEKPLFR SALKECLLEHLNGKKQSEHQQMSFDFSGKTFLLAEDNELNREIALELLGNMGAGMETAVN GEEAVKRFKESPPNYYSLILMDIQMPVMNGYEAARAIRLMDRPDAKAVPIIAMTADAFVE DIQNAKAAGMNGHMAKPLDYEGLSQEINKYLK >gi|229784049|gb|GG667686.1| GENE 14 17829 - 19526 1612 565 aa, chain - ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 255 557 214 514 
555 203 45.0 6e-52 MKDFKVSKKLFISYAAIMILFVVSCFVSITDFVSLGKQIEAFYTGPFTVNESASIINSNF ERMQKAVYRSISNTDPEITSEAIANARECAVAIQEQVPVVKSHFKGDQQIIDRLEAALEK LEPMRETVLSLAAENKNAEAADYMEHNNILVIHEAQEQLNQLIESGNRMGASLVSGLKEK QARAVVSLILLGSASMAVSIVFGIYITRGITRPVRELEQAALAMARGEFSTVRVEYESKD ELGNLAFSIRSMVDTLANVLQSETQMLNEMSEGNFSVRSGKDEYYIGEFERLLCSMRKIN HGLSALLLQISQSADSVAAGSEQVSSSSQNLAQGATEQAASVEELTGMMNEISDQAYRNS RDAQEASEKTQTVRDNAAESSRCMQEMLHAMAEISDKSDEIRKIVKTIEDFSFQTNILSL NAAVEAARAGDRGKGFSVVANEVRSLASQSSAASRNTAALIQSSLQAVENGRKIANETSE ALTAVVQGIEMVSELLHQITEASLKQSDANRQVTENINLISDVVQTNSSTAEECAAASEE LASQAQLLKELAGKFRLTDINDRKV >gi|229784049|gb|GG667686.1| GENE 15 19555 - 22863 3181 1102 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 558 948 266 663 676 187 30.0 8e-47 MVALLVSALIAFISVFFVSAVREQLWQQSVGTIRESTQQGMNTLRVQLQEEYGEMESMAG YLNQYDAGQQEELNVLISSYDRVDGRISLYLEDGRIFPEGNPGDTAAAEFLASASGEQGI IDPHICSANGVNEFNLYRKAVLKDGTAGYLLKSYDVGEIVDSFTVSFYHDAGFSYVVDTA GNVLIRPPHPGSNKTVQNLFDMLPDSQNDSDKLAQFAQSLIDKRTGWATFSYQGEDTVFC YIPLGLNSDWYLISIIPQDVVEAQTNQILIRTFILIAFILIGLAVLVTVYLRYVNQMNRK LRSQAGYIAHLYNAIPEGIALMSVEAPYRLLQLNREGMRLLHVQEENSNGKQLIDFVHPD DYPMVADIFRAAAGDDRKHSFENRVVREDGSFFWSGGLVEKTLNEDGTPILIATFHDITM EKLAEEEAEREKLQERRMLISAISNVFPVIISLNLTSDTLNFIYLQPGLLAEMGGQESFS RLYQDFMEAVHPDFQEDFRNRLSPENLKKTLGSSRQEVFLEARILLTDGQYHWTSTQIIY VENPYSGDQLAILLSRRIDEQKHEEEQHRQALQSALESAKAASVAKSRFLSNMSHDIRTP MNAIMGMTAIAASHMDDQERVRECLRKINLSSSHLLSLINDILDMSKIESGKISLRDEPF NFAELVAEVAELIRPQADAGHLDMNLSLMDLKDELVAGDPLRVRQICFNVLSNAVKYTPE GGHISVRLYQRDSGRREYQNFIFECSDTGLGMSPEFLDKLFLPFERVQDSTHSRVTGTGL GMAITKNIVDMMGGDIQVESELGEGSAFTVTLPLRLQNVPPEEVSSKWLGIHCLITDDDV QTCKSVTELLEDIGLRAEFTTEGVTAVSLAAQAMEREDPFELMIIDWKMPDMDGVEITRR IRKTAGDEVPIIILTAYDWSEIEEEARAAGVTAFLAKPFYRSKICYLLRELEEKKSPADD KETAVSHDFRGRRVLLAEDNLLNREIAHTLIEEMGASVEDACDGAEAVRKVAASEEGYYS 
LILMDVQMPVMSGYEATKAIRKLERRDAGRIPIVAMTANAFEDDRQEALRAGMNAHFSKP IDVKELEMLLSRYLERTEENEE >gi|229784049|gb|GG667686.1| GENE 16 22902 - 24173 1130 423 aa, chain - ## HITS:1 COG:SMc02873 KEGG:ns NR:ns ## COG: SMc02873 COG1653 # Protein_GI_number: 15963951 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 58 354 44 342 419 66 24.0 9e-11 MPKKLLALFLVLTCAASAITGCSSKNRVVNEDNQIDQEIVTITFFGNKYEPENVIVIEQI ISDFMRENPSVRVSYESLKGNDYFEALEKRMEHGRGDDIFMVNHDVLLKLEADGQVADLS GLSTLPDFTDEMRSQMGEGKITWVPTTVSIFGLYCNLDLLKEHKQEVPGTLSEWEAVCEY FVNCGITPVIANNDISLKTLAIGRGFWQVYQDKRQTEVFDQLNHGRETLSEYLTDGFSIV ETFIRRGYLDAEKTLNTKKTSDDLNDFIRGESPFLLTGAWAAGRVAGMEPDFRFEVAPLP ILDEGSMLVINPDTRLSINAQSDHFEAAMKFVEYFTKAENIQKFADQQSSFSPLKGGSPS SVLEIQPLIACYESGQTVIGTDDLLKLPIWELTAEVSRKLLAGEELESAMSWMDQQAQER IVP >gi|229784049|gb|GG667686.1| GENE 17 24227 - 24367 57 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870903|ref|ZP_06409954.1| ## NR: gi|288870903|ref|ZP_06409954.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 46 1 46 46 80 100.0 3e-14 MAACFLMRDLSVTALPFPYYNMILPPAATGSGGIDCTKSSVVVKSK >gi|229784049|gb|GG667686.1| GENE 18 24387 - 24524 79 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no LEPKGFEFAAERYRFTASVSSSDFPEAVLTQNFLHEIYNIFFVVY Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:39:01 2011 Seq name: gi|229784048|gb|GG667687.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld80, whole genome shotgun sequence Length of sequence - 30807 bp Number of predicted genes - 26, with homology - 26 Number of transcription units - 9, operones - 6 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 1 - 997 461 ## Cpin_5748 glycoside hydrolase family 9 2 1 Op 2 38/0.000 - CDS 1042 - 1905 573 ## COG0395 ABC-type sugar transport system, permease component 3 1 Op 3 35/0.000 - CDS 1918 - 2769 728 ## COG1175 ABC-type sugar transport systems, permease components 4 1 Op 4 1/0.000 - CDS 2822 - 4078 959 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 4104 - 4163 3.6 5 1 Op 5 . - CDS 4176 - 5042 442 ## COG0477 Permeases of the major facilitator superfamily - Prom 5073 - 5132 5.5 + Prom 5812 - 5871 5.1 6 2 Tu 1 . + CDS 5898 - 6596 591 ## gi|288870908|ref|ZP_06115715.2| conserved hypothetical protein + Term 6642 - 6688 10.1 - Term 6627 - 6679 11.5 7 3 Tu 1 . - CDS 6685 - 8268 1726 ## COG0178 Excinuclease ATPase subunit 8 4 Tu 1 . - CDS 9208 - 10431 1385 ## COG0178 Excinuclease ATPase subunit - Prom 10454 - 10513 4.8 - Term 10666 - 10725 7.1 9 5 Op 1 . - CDS 10792 - 11370 639 ## COG1954 Glycerol-3-phosphate responsive antiterminator (mRNA-binding) 10 5 Op 2 . - CDS 11363 - 14734 4126 ## Closa_2967 SMC domain protein 11 5 Op 3 . - CDS 14715 - 15365 794 ## Closa_2968 hypothetical protein 12 5 Op 4 . - CDS 15365 - 16741 1463 ## Closa_2969 hypothetical protein 13 5 Op 5 . - CDS 16728 - 18458 1423 ## COG2199 FOG: GGDEF domain - Prom 18489 - 18548 7.0 14 6 Op 1 . - CDS 18654 - 19265 246 ## COG0655 Multimeric flavodoxin WrbA 15 6 Op 2 . - CDS 19286 - 20149 444 ## COG1284 Uncharacterized conserved protein 16 6 Op 3 . - CDS 20149 - 20688 265 ## Nmag_1021 NAD-dependent epimerase/dehydratase 17 7 Op 1 . - CDS 21628 - 22035 293 ## COG0451 Nucleoside-diphosphate-sugar epimerases 18 7 Op 2 2/0.000 - CDS 22057 - 22815 764 ## COG1402 Uncharacterized protein, putative amidase 19 7 Op 3 . - CDS 22833 - 23126 292 ## COG1735 Predicted metal-dependent hydrolase with the TIM-barrel fold 20 8 Op 1 . - CDS 24095 - 24718 435 ## COG1735 Predicted metal-dependent hydrolase with the TIM-barrel fold 21 8 Op 2 . 
- CDS 24732 - 24992 230 ## COG1925 Phosphotransferase system, HPr-related proteins 22 8 Op 3 3/0.000 - CDS 24989 - 26011 847 ## COG1735 Predicted metal-dependent hydrolase with the TIM-barrel fold 23 8 Op 4 . - CDS 26031 - 27092 848 ## COG3037 Uncharacterized protein conserved in bacteria 24 9 Op 1 11/0.000 - CDS 28023 - 28262 227 ## COG3037 Uncharacterized protein conserved in bacteria 25 9 Op 2 13/0.000 - CDS 28295 - 28582 384 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 26 9 Op 3 . - CDS 28603 - 30762 1477 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) Predicted protein(s) >gi|229784048|gb|GG667687.1| GENE 1 1 - 997 461 332 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5748 NR:ns ## KEGG: Cpin_5748 # Name: not_defined # Def: glycoside hydrolase family 9 # Organism: C.pinensis # Pathway: not_defined # 3 332 25 351 613 208 35.0 3e-52 MKRILMNQIGYDSDFSKKAVFQSDEEIQVLGFKILEEKSGIEVYCGEAEKSSPVARWNKG DYWVMDFTEFAPNAHKNYSSFYLLDVETSAGEVVSSPFEIYGSVVEYSTLSSIVYYFKSQ RATGEWEVKDHNIGFKGPREGTFDVHGGWFDATGDVGVHMTHQSHTTYFNPQQASFSAAV FYRLLDLLQKNENPCYSVFCRRLLDEASYGANWLMRRRAPSGSFFKVGPCRNYAYDPTDI TREIGFEYKHSSPQFGTAETAQTENVGDEYYETSFRSGGGYAIAALAAASRYPYPSCEFT GIEYLNAARESYFYLEKNNEHYTNDGAWNLLD >gi|229784048|gb|GG667687.1| GENE 2 1042 - 1905 573 287 aa, chain - ## HITS:1 COG:BH1119 KEGG:ns NR:ns ## COG: BH1119 COG0395 # Protein_GI_number: 15613682 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 19 286 11 281 281 137 30.0 2e-32 MKYHYETPQDRFMAAVRAVLLYGILIMWACAAIIPMVWVFINSFKESNEILKNSIMLPSH WTLENYITMANYPDMNLLRAFRNSVIISGGVVIGVVLCAGFAAYSITRFANRFTAVISIV FNAVLLVPSFSTLIPNFVTISASPFRGTYMAVILPLIASNMSFSTLLLTGYMNSVPKELD EAAIIDGAGPMQIYWRIICPLSVPMYSTIATMVFIWSYNDLLTSLTYLSNRKMQPVSVIL SMVSSMIGTDYGAMMAAIFFTMLPLLVLYVVLQEQVVKGLTAGSVKG >gi|229784048|gb|GG667687.1| GENE 3 1918 - 2769 728 283 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN 
COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 5 269 11 279 292 174 41.0 2e-43 MYSVLLLVPALIFVLIFLIIPLFKTVYYSFTSWFNFSQTKTFIGLANYKQFFTDSVVGIA LRNTMLLIVFALIFQVGVALILAICVDSAHHGFRFFRTVYFFPIVISATAIGLMFSLIYK YDYGLLNYIIQAFGGEPQVWLTKERSIYMLLIPVCWQFVGFYFVVYLTAIANIPIDIYES AVLDGVKPWQKAIYITVPMLKSSLVANVVLVISGCFRVFDMVYIITGGGPLNSSQLLSSY MYYKAFADYNTGYASAIAIIMMVLGLSITGVVKVFTNKIQGDT >gi|229784048|gb|GG667687.1| GENE 4 2822 - 4078 959 418 aa, chain - ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 1 418 1 420 422 80 21.0 7e-15 MKKKCMAWITMAVMLGAASISGCSSKTETPASENQSGAAGEKSVKKENLRVITFFAGSDQ WAPTWQEVIKEYMEANPNITISDESVPTAGNNDMFRPKMNADIAAGTPVDVALYFNGSDA EPLYESGLYVSWDQYLEEDSEWASSFRSTVMPSGQINNEQFNIPYIGSFEGLCYNQKIFD EYGLEYPTTWDNIIKACEVLADTDIIPISNSLLSPTCQLETMLLAQVGPENQKKPLDASW AGAIDQFKTLYDLGAFPADAATISDADSRIIFQDGRAAMTFTGSWVLNTLKENQDMRMIA VPVPEGAQGQEGAIISVFGSGWYMSKAAAERSDAALNFVKYMCSPEILARFIEVGGTPAM NIEMPANATAMNPPIDTMIPREVFNNLAKKLIYVCEGEMTADELLEECRTALSNIKAN >gi|229784048|gb|GG667687.1| GENE 5 4176 - 5042 442 288 aa, chain - ## HITS:1 COG:BS_ywbF KEGG:ns NR:ns ## COG: BS_ywbF COG0477 # Protein_GI_number: 16080885 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 1 270 102 374 399 77 23.0 4e-14 MVVNCLFCVFYNPVPPLLDNLSLETLESSKSIFDFGHIRLGGTVGYAIGVLTAGQLMQNQ YRRMFYMISLLLLCAFICLQFVPPVMGYRQKAKKSSFKELLSDKKIFCFIFMNFIFSLGT VIYYSYYPLYFTTIGGDSQKVGILMFVTAVSEVPFWLIAGRLTERFGYKKMMVVAAVVTG LRWTTLSIAMNPLAAICINLTHGFCFVTLNYCIVTYINVHVPRDLRATGQSMNNMVTTIL 
SRVIGGVLIGYLSDIFGVPTMLRLAGAAAFIGAGIFSVCYHLIEKKEG >gi|229784048|gb|GG667687.1| GENE 6 5898 - 6596 591 232 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870908|ref|ZP_06115715.2| ## NR: gi|288870908|ref|ZP_06115715.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 232 3 234 234 435 100.0 1e-120 MKKTILVLSAAFLASVWLAACNSGNKPFPEPPEATEIKETETTAEGKKTTEFSTGSWDGY TFTSPWLNLTFTFPEDCTISTEEDIKKALGTGQEILVNNGVSNEVQYKVAEFTTAYDFMV ILPDQSSNFQMVHENISVATLGKGMSAEQYLDSVKEQLGTLEDYGYEFNDYEDVNIGGQT FTKLSLSALGGALLQDYYCISKGKYISSIIASYIPDSAETVEAVMAGVTAAK >gi|229784048|gb|GG667687.1| GENE 7 6685 - 8268 1726 527 aa, chain - ## HITS:1 COG:BH3594 KEGG:ns NR:ns ## COG: BH3594 COG0178 # Protein_GI_number: 15616156 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Bacillus halodurans # 2 527 413 939 957 744 68.0 0 MRLKKESLAVTVGDKNIYEATNMLIIKFREFLDQLVLTPMQHSIGDLILKEIRARVTFLI DVGLDYLTLTRATGSLSGGEAQRIRLATQIGSGLVGVAYILDEPSIGLHQRDNDKLLKTL LHLRDLGNSVLVVEHDEDTMRAADCIIDIGPGAGEHGGNVVAVGTANQIMKCKDSITGAY LSGRIRIPVPEERRKPTGYLTVRGAKENNLKNIDVTFPLGVMTCVTGVSGSGKSSLVNEI LYKSLAKKLNRARTIPGKHKCIEGVEQLDKIIAIDQSPIGRTPRSNPATYTGVFDLIRDL FASTSDAKAKGYSKGRFSFNVKGGRCEACSGDGIIKIEMHFLPDVYVPCEVCGGKRYNRE TLDVKYKGKSIYDVLNMTVEEALKFFENVPSISRKIATLYDVGLSYIRLGQPSTELSGGE AQRIKLATELSKRGTGKTIYLLDEPTTGLHFADVHKLIEILRKLSEGGNTVVVIEHNLDV IKTADYIIDIGPEGGDKGGTVIAQGTPEEVAANPASYTGQYVKKYLE >gi|229784048|gb|GG667687.1| GENE 8 9208 - 10431 1385 407 aa, chain - ## HITS:1 COG:CAC0503 KEGG:ns NR:ns ## COG: CAC0503 COG0178 # Protein_GI_number: 15893794 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Clostridium acetobutylicum # 3 404 2 402 939 489 59.0 1e-138 MAKQYIKIRGANEHNLKNISLDIPRNELVVLTGLSGSGKSSLAFDTIYAEGQRRYMESLS SYARQFLGQMEKPDVESIEGLSPAISIDQKSTNRNPRSTVGTVTEIYDYMRLLYARIGIP HCPKCGREIKKQTVDQMVDHIMGLPEKTKIQLLAPVVRGRKGEHAKVLDHARRSGYVRIR 
IDGNMYELSEEIKLEKNIKHNIDIVVDRLVVKAGIEKRLADSIETVMELSDGLMTVDVID GQPITFSQSFSCPDCGISIDEVEPRSFSFNNPFGACPDCFGLGYKMEFDEDLMIPDKSLS INQGAITVLGWQSCTDKGSFTRAILEALSSEYKFSLDTPFEEYPKPVHDVLINGTNGKEI KVHYKGQRGEGVYDVAFEGLVQNVARRYRETYSEASKAEYETFMRIT >gi|229784048|gb|GG667687.1| GENE 9 10792 - 11370 639 192 aa, chain - ## HITS:1 COG:ZygcP KEGG:ns NR:ns ## COG: ZygcP COG1954 # Protein_GI_number: 15803286 # Func_class: K Transcription # Function: Glycerol-3-phosphate responsive antiterminator (mRNA-binding) # Organism: Escherichia coli O157:H7 EDL933 # 12 190 6 184 191 176 48.0 2e-44 MIDDKKRKFRIILEDCPVIAAVKDETGLKECLYSESQIIFLLFGDICSVGRYVEIAKSAG KMVFVHMDLINGLGNKEVAVDFIREHTGVDGIISTKPQLVKRAKELGLFGILRIFVIDSM AFGNIEKQCASLVPDAVEILPGLMPKIIKKLCSTVNVPIIAGGLISDKEDVMNALNAGAV AISVTNQRVWFM >gi|229784048|gb|GG667687.1| GENE 10 11363 - 14734 4126 1123 aa, chain - ## HITS:1 COG:no KEGG:Closa_2967 NR:ns ## KEGG: Closa_2967 # Name: not_defined # Def: SMC domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 1120 1 1120 1127 1535 73.0 0 MTNQRFEALSRICLNNWHYINRKTLSFNREINFFTGHSGSGKSTVIDAMQIVLYANTDGR GFFNKAAADDSDRNLIEYLRGMVNIGENNEFAYLRNQNFSSTIVLELQRTDTEEYQCVGI VFDVDTSTNEISRRFFWHRGPLLENGYRTARRTMSTLELEGWLQSHYGKEEYFATSHNER FRRQLYDVYLGGLDSEKFPLLFKRAIPFRMNIRLEDFVKEYICMEQDIHIDDMQESVMQY GRMRKKIEDTYTEISILKEMQEKYGLVLEKEETIQKNGYFTGKLEILELMERAGELTDKA ALSRDDLQNQERLKDGLDEQIADLTTQGEELLKRISSTGYEDLKTRLESLNELMEHLSKS EVKWHQTAERLNAWIDQETTSNQTIWDIEEFEKNTIDGEKLERLKKSLAEMRKEAEDLKS EASAVLRDLKRQERQLKEELAQLKAGSKAYPKVLEHARSYIQRRLLEETGKAVEVHVMAD LLDIRNEEWRNAVEGYLGGNKLALVVPPAYAEDALRIYSELDKNEYFRVAVLDTEKAGKD TPTVLEGALCEELDVRESYLKPYLERLLGQVVKCRSIDELRGCRIGVTPDCMVYHTFRLQ QMNPEQYTKSAYIGKDSVRQRIRLLEKSLAEMENKRAPEEKTVRECEAVLSLEGLNEETG LYLEWQKDMGDLKERQKEKKKLEQKLLKLKEENVDQWEKERSALLALADGKRRERDAVSR RIYELHTKISQMTQAVTAVEQELLAKEREFKQDDILEAEFTEYMAGKEQPRYEKLRDYFT GRLNAVAEARDQAFQTLMDARGEYARKYPNRNFSITAKDNREYEKLMEILECDHLEEYRQ IAAEQARSAVEHFKDDFMFKIRSAIREALQRKDELNRIISRLDFGKDKYQFVIGKNKGAD 
GRFYDMFMDDSLEVNPNDLNDTVDNQMNLFTMEHESQYGELIEELIHVFIPPDNASPEQM EEAKRNMDKYADYRTYLSFDMQQLIQNEDEVIKIRLSKMIKKNSGGEGQNPLYVALLASF AQAYRISMPESIRRNPTIRLVVLDEAFSKMDAEKVASCIELIRGLGFQAIISATNDKIQN YVENVDKTFVFANPNKKSISIQEFERESFPQLMKELDEGEDDD >gi|229784048|gb|GG667687.1| GENE 11 14715 - 15365 794 216 aa, chain - ## HITS:1 COG:no KEGG:Closa_2968 NR:ns ## KEGG: Closa_2968 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 204 1 204 216 290 76.0 3e-77 MGYEEQATGTMGIPYYDSLLPEEQQEVTEAVKLLLRQTFVLEWKYEKRSGRFTNNREFRI CNKHLEFLRKYFAVSGVSVVENSQMGVIYIQGETLVGDKLPKLATLYLLILKLIYDEQMA TVSTSVNIYTSLGEIHEKLGNYRLFKKQPSPTDIRRAVTLLKKYQVVEPLDLLEDLNGDS RLIIYPCINVVLFGDDVRVLLESFGEGEDNDDESEI >gi|229784048|gb|GG667687.1| GENE 12 15365 - 16741 1463 458 aa, chain - ## HITS:1 COG:no KEGG:Closa_2969 NR:ns ## KEGG: Closa_2969 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 457 3 451 453 635 76.0 1e-180 MKQNKKLMEEIPDRFWGLFRSVNRSIYIEALLRINEEYEYSNYFLSREVCIQILSEYFAK KRYVIWQDELEDELDTLEPPATRVLSWLLKTGWLRKVDDYASMTVYIVIPDYAAVMTEAF LRLDSEDEDETQIYIQNVYAILFSLKNDPRASISLLNTALVNTKRLNKTLQDMLHNMDKF FASLLEQKTYEDLLKDHLEGYVEEIVKKKYHILKTSDNFYLYKTDIKRWIAAMRDDTEWI EQMSRRSGGKVSAADITEKLDQIERGFDSIEHRIANMDREHSRYIRATVTRLNYLLNQDD NMKGMVIQLLNHLSEGDKQEERIRQVGERMNLSQMTILSEKSLYKRRRPKADFTGNLEPE EEAEELSREEILNLNKLKNRYSKKEIESFIEENMEDGRMEVTGDMIRSGEDFEKLILAYD YSTRRKSLYRVEDADGEPEMIESGGYRYPKLTFTRRKS >gi|229784048|gb|GG667687.1| GENE 13 16728 - 18458 1423 576 aa, chain - ## HITS:1 COG:AGl2664_1 KEGG:ns NR:ns ## COG: AGl2664_1 COG2199 # Protein_GI_number: 15891442 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 400 547 222 375 382 91 36.0 4e-18 MKKKRAGSDRSCRSAVMYFVMAFSFCLFLWIFITGKAPDVRFDSEQSAAVTGEWKIEYQG QEVQSELPGPFSPQKEELIRYSTELGDYGIKGNSVMLRAVHQYVRIYLEGELIEEFGYNQ ETPFGNAPYNAWVIARLPENWQGKTLSIETVKYYNGLSGQLGRVYIGTKNALVFQVIHEC 
MPTLIFNSAIILAALLLLIYSFSFQKKYITYQLRCLCIFSLVTCMWLILESGGYQILWGK APLVSNALFLLFYLIPPACIQFILTYESFSEDRWMNLLFWLANLNFIVVSILQITGISDF IESLIGAHVILLLVMAELLFRFIMRILRRDPVKDIQLMAACLSFALFACLDIVRFYVGGT DISSAYFSQIGVVIFFSVLIYYAVRQLVLERDEGMKRELLEHMAYTDMLTELPNRNAFEK QMALLRGESGPGTMVVVADLNELKRINDQWGHQKGDAALIFVAAALKSSLEDCSGLYRIG GDEFCILSDRISGKEAKAFEERLEEKLGEAENELGFPVSVAVGWAEEDECGIDLAFRRAD FTMYEKKQQMKRKRKQEELRSEPPECDCGGIKNETE >gi|229784048|gb|GG667687.1| GENE 14 18654 - 19265 246 203 aa, chain - ## HITS:1 COG:CAC3341 KEGG:ns NR:ns ## COG: CAC3341 COG0655 # Protein_GI_number: 15896584 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 201 1 206 208 233 55.0 1e-61 MRVLIVNGSPHENGCTRGALEEVASMLINDGIETEWFWIGMEPVAGCIACKHCQSTGRCC KQDLVNEFLDMADTADGFLFGSPVYFAGITGQLKCFMDRVFYCGDFSGKPAAAVVSCRRG GASTAYDQINKYFQIRNMPVVTSQYWNQIHGNGNRRSDLLLDQEGLQTMRTLGKNMVWLL RLIESGTILRPVYEKKIRTDFHK >gi|229784048|gb|GG667687.1| GENE 15 19286 - 20149 444 287 aa, chain - ## HITS:1 COG:SP2113 KEGG:ns NR:ns ## COG: SP2113 COG1284 # Protein_GI_number: 15901928 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 17 283 47 308 313 125 29.0 7e-29 MKRDDGLHLLKAMAGTFLFAFGMNCFIVPSGLYSGGLLGICQIMRTILEEVFHMPVGERD IAGIIFFFLNLPLFCLAWLDIGSRFFCKTIVCVAVQTFFLTVIPVSFPLLPDAPFASAAV GGVICGWGVGLTLREGGSGGGQEVLGLVIMKYHNRMSVGKIAFLINLAVYAWCAFHYSLE TVIYSLVYVCVSSAITDMVHLQNRSMGAILVSENEESIQWIITNCNRSATVLEGRGGYSG KPYQVIYAALSGQEAKMLERYMKDRGSEAFVTFFYIDRIYGRFPKHL >gi|229784048|gb|GG667687.1| GENE 16 20149 - 20688 265 179 aa, chain - ## HITS:1 COG:no KEGG:Nmag_1021 NR:ns ## KEGG: Nmag_1021 # Name: not_defined # Def: NAD-dependent epimerase/dehydratase # Organism: N.magadii # Pathway: not_defined # 1 179 143 324 324 125 36.0 6e-28 MDERTPLCPGSLYSYTKVFNEGLSEFYRKEYGLDSIALRVNLSYGPYIMSSFQPFFSSII DQPAVGAASRVPYSDTMFDFQYVGDMADIFRDALLAPPTQSGIFNTRGEYRPVKDAIEFI RSVLPETDIQPLGGDMDIPFHFDTSALEQEIGYKPSHTMEQGLLASINMVRRSRGLEEL 
>gi|229784048|gb|GG667687.1| GENE 17 21628 - 22035 293 135 aa, chain - ## HITS:1 COG:mlr7549 KEGG:ns NR:ns ## COG: mlr7549 COG0451 # Protein_GI_number: 13476270 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Mesorhizobium loti # 1 132 1 141 342 77 32.0 5e-15 MRCFVTGGTGFIGSYVVKSLVEAGIEVVALYHSKSTSRWLDEMLTEEERELLVLEQGDMT DAERIRKLIVDYGCDSVAHLASRLNQKTRANPTDGVQTNIVAFQNLLEICKMEKLRRLVW ASSSAVYGVQSDRKR >gi|229784048|gb|GG667687.1| GENE 18 22057 - 22815 764 252 aa, chain - ## HITS:1 COG:BH0226 KEGG:ns NR:ns ## COG: BH0226 COG1402 # Protein_GI_number: 15612789 # Func_class: R General function prediction only # Function: Uncharacterized protein, putative amidase # Organism: Bacillus halodurans # 11 237 10 235 249 187 39.0 1e-47 MKEWRNLTWPEAKEAVEEMPVCLLPLGAIEAHGPHMKLDVDNHICEEQCRRVCAETGIIL MPTIPVGQVWSLNRFPGTLTLSYHTLIAVIKDLLYSFRKRGFELVFIHSHHYGNVPAVKQ AVRECYDEFRDMKMVILEERQSMRDKAKEVCTSPFAHPVFWHACEVETSQALECCPDRVY MERAICDYPKFPEDFDSTPTYWDEVTSTGVMGDAKAGTKEKGKVLTEAEVNAMVRLVTYE MQKMNLSGGKGK >gi|229784048|gb|GG667687.1| GENE 19 22833 - 23126 292 97 aa, chain - ## HITS:1 COG:lin2797 KEGG:ns NR:ns ## COG: lin2797 COG1735 # Protein_GI_number: 16801858 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Listeria innocua # 2 96 238 329 330 70 35.0 9e-13 MDQISKIKYCTEQTRIDLICELIRLGYRKKILLCGDMARQSYLTSFGGGPGFGYILKVFL PRLVRQLTEQGMQEEQAMDIRDDLICNNPRQYLSFEA >gi|229784048|gb|GG667687.1| GENE 20 24095 - 24718 435 207 aa, chain - ## HITS:1 COG:BH0225 KEGG:ns NR:ns ## COG: BH0225 COG1735 # Protein_GI_number: 15612788 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Bacillus halodurans # 1 202 1 222 331 154 41.0 1e-37 MEIIRTLTGDMIPERAGAVYYHEHVIARSPVSREKEPDLELCDIDRMVSEVKLFKEAGGT 
LLVDASIADFGENSHLRLEISEKSGIPIVGTVGFGQKEHHSEQVRNSTTEQLYERVMDAA ENGYGRNHLKPGQLKFGTSYGFISESENRCARAVARAQKSLSMPLFTHTGIGTMGLEQIA LLKEEGANLEQVCIGHMDRNPDLWLAS >gi|229784048|gb|GG667687.1| GENE 21 24732 - 24992 230 86 aa, chain - ## HITS:1 COG:BS_ptsH KEGG:ns NR:ns ## COG: BS_ptsH COG1925 # Protein_GI_number: 16078454 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Bacillus subtilis # 1 84 1 84 88 61 36.0 3e-10 MKQIIYALEDDYGMHARPASRLAEVSQNYRSSMEIKYKEKTGDLKSMISVLKLGIKKGEA FHIEITGEDEDEACRTLEKMLKEKPI >gi|229784048|gb|GG667687.1| GENE 22 24989 - 26011 847 340 aa, chain - ## HITS:1 COG:BH0225 KEGG:ns NR:ns ## COG: BH0225 COG1735 # Protein_GI_number: 15612788 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Bacillus halodurans # 1 329 1 330 331 412 61.0 1e-115 MGFIRTLLGDIRPEEMGFTLSHEHLVCRPPYWVWKNDDDLLLDDPEKTLADMMDYKRMGG AAIVDATAIDYGREVEAVADLSKRSGLHVIGTAGFNKGFLWEAPLDERRRGLIGGYSTFK EWIEKSSVSELADFVCREVEVGLEGTNCKAGQVKFGTGYNSISPLEKKTMEVAAAVHFAT GAPIHSHTEAGTMGLEQIELLKAVGVSLENVSFGHMDRNMDLWYHRRIADTGAYLCFDGI GKTKYHTESSLITHILTLVKEGYEDQILISGDMARKSYYRHYQFGLGLGYVFDWFQIQFK EMAEKSGLEGEKLFYKFFYENPQRCMTFKQKNKTGYGEIL >gi|229784048|gb|GG667687.1| GENE 23 26031 - 27092 848 353 aa, chain - ## HITS:1 COG:lin2798 KEGG:ns NR:ns ## COG: lin2798 COG3037 # Protein_GI_number: 16801859 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 2 352 80 431 432 346 53.0 5e-95 MLTNCMGYDAFVGKFGSVIAIVMATSFLINVILARITPFKYIFLTGHELWWVTVVSSGVL INFAPEMATWKMVCILAPVLGLYFTLQPAICQKYVRNITGGNDFALAHPSGVGSFIAANA GKVFAKNKKDAETIVFSKRLSFLRDNNVITALVMMIFYLIGAVIVMVKNTENGQALMANA GTQNFIVYAVVQSLTFAAGIAVILVGVRLFVGEIIPSFKGIADKLVPNAIPALDCPVLFT YSPNSVILGFVGMFAGIIFWMLTLGLSIGFVVVPTMIVLFFHGATAGIFGNSTGGIKGAL LGGFIISTIVALGQLVFIKFLVPATIPDTIMWAGETDEFLLFWIFALFGRLFG >gi|229784048|gb|GG667687.1| GENE 24 28023 - 28262 227 79 aa, 
chain - ## HITS:1 COG:lin2798 KEGG:ns NR:ns ## COG: lin2798 COG3037 # Protein_GI_number: 16801859 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 7 78 3 74 432 81 56.0 3e-16 MGTLGTIATWLSNNFFNTPAFLLMLVVLIGHLLQKSPFEKTVSGTLKAGIGFLVISSGSN IVTGALKVFEPLWSEVFGS >gi|229784048|gb|GG667687.1| GENE 25 28295 - 28582 384 95 aa, chain - ## HITS:1 COG:BH0222 KEGG:ns NR:ns ## COG: BH0222 COG3414 # Protein_GI_number: 15612785 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Bacillus halodurans # 1 89 1 89 89 81 49.0 3e-16 MKILCVCGTGQGSSLILKMSVQDILNRKGLQAELEHTDASCACSERCDFIITSQEIADSI TNPRAKIVCITNYINKGIIEKEIAEVLQALENEQK >gi|229784048|gb|GG667687.1| GENE 26 28603 - 30762 1477 719 aa, chain - ## HITS:1 COG:YPO2569 KEGG:ns NR:ns ## COG: YPO2569 COG1762 # Protein_GI_number: 16122787 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Yersinia pestis # 572 716 1 147 147 123 39.0 1e-27 MKIARDILIAFQLLSDNMGCLTVHQLSQVMRVSRTQAKQIIEDCIYWLEKRGVVLKDERE KRTVSVPPEGLEQLRELLKNISEKEYYYSFFERRKYIVFKLLTNHFITNYQICEMFEISR NTCIADMTAISRFLKNNGFQLRITSNKKGYMLAGKESEIRRLIPYYISLIPIFSSEEDAP LYHIDEEILFHLYGFDDVQIRKRADRLETASKELGHKLSLQSAVNISLAVELIARRIRQG HYVERGAVSKESREAYTRNCIEYLILKSGIWQEIKTASKTASAEVDDWKEILTMEKELLS QTIVCMGSVQEDNLEQFVMHETSLEQFAEEIITEFEERIGLGCTQKEVVIEKLSQSLSGV LLRTSCGFQIYDPFVETIRSNYKYIFDVTRKMNTIRGLQEVEISENGAAYITAQLIGWFI QGEEAKQKQKIPLVGIVCANNIGIGALVEKQIQSTFSGVRTILIPSYEINYWTFYNKLDL LVAPMLLKIDENIPCVRVNPFLTAQDKGNISRMLAEKGDRAADSTKILLKETMYILKDFL PSGQRDEAGIRLEQLYNSGDYEIIFKGAKNYMLKDLLTADRIQFTDDKLDWREAIRLASA PLIEDGSITQNYVRAMIDSIETTGPYIVLTPGVALPHARPSKGAKKLAMAMLCSRERIYF NEEKYANLLIILSSVDGHSHINALVHLSNMFGEEETLEKFMEAQTVEEVVELIHRYETD Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:39:54 2011 Seq name: 
gi|229784047|gb|GG667688.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld81, whole genome shotgun sequence Length of sequence - 20913 bp Number of predicted genes - 23, with homology - 21 Number of transcription units - 10, operones - 5 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 61 - 120 3.8 1 1 Op 1 . + CDS 158 - 2287 1980 ## COG1653 ABC-type sugar transport system, periplasmic component 2 1 Op 2 . + CDS 2320 - 2919 340 ## gi|288870912|ref|ZP_06115738.2| conserved hypothetical protein 3 1 Op 3 . + CDS 2984 - 4888 1512 ## Dhaf_1669 RND family efflux transporter MFP subunit 4 1 Op 4 . + CDS 4878 - 6119 428 ## gi|266622805|ref|ZP_06115740.1| hypothetical protein CLOSTHATH_04057 5 1 Op 5 38/0.000 + CDS 6103 - 6975 592 ## COG1175 ABC-type sugar transport systems, permease components 6 1 Op 6 21/0.000 + CDS 6972 - 7811 522 ## COG0395 ABC-type sugar transport system, permease component 7 1 Op 7 . + CDS 7828 - 8946 463 ## COG3839 ABC-type sugar transport systems, ATPase components + Prom 8985 - 9044 11.6 8 2 Op 1 . + CDS 9266 - 9532 205 ## gi|266622809|ref|ZP_06115744.1| teicoplanin resistance protein VanZ + Prom 9642 - 9701 3.5 9 2 Op 2 . + CDS 9757 - 10110 244 ## CDR20291_3456 hypothetical protein + Prom 10115 - 10174 2.2 10 3 Op 1 8/0.000 + CDS 10276 - 11607 871 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase 11 3 Op 2 . + CDS 11597 - 12130 375 ## COG1475 Predicted transcriptional regulators 12 3 Op 3 1/0.000 + CDS 12230 - 12991 343 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Prom 12996 - 13055 3.3 13 3 Op 4 . + CDS 13084 - 13548 455 ## COG0346 Lactoylglutathione lyase and related lyases 14 3 Op 5 . + CDS 13599 - 13727 78 ## + Prom 13733 - 13792 1.6 15 4 Tu 1 . + CDS 13836 - 14606 303 ## Athe_2346 TraX family protein + Term 14681 - 14721 2.1 + Prom 14620 - 14679 5.8 16 5 Tu 1 . 
+ CDS 14738 - 15490 583 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family + Prom 16346 - 16405 7.3 17 6 Op 1 . + CDS 16504 - 16938 276 ## gi|266622817|ref|ZP_06115752.1| hypothetical protein CLOSTHATH_04069 18 6 Op 2 . + CDS 16928 - 17134 283 ## gi|266622818|ref|ZP_06115753.1| conserved hypothetical protein + Term 17206 - 17262 -0.8 19 7 Tu 1 . - CDS 17161 - 17352 73 ## - Prom 17412 - 17471 2.0 - Term 17486 - 17513 1.5 20 8 Op 1 . - CDS 17523 - 17879 254 ## gi|266622819|ref|ZP_06115754.1| conserved hypothetical protein 21 8 Op 2 . - CDS 17925 - 18884 370 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase + Prom 19178 - 19237 7.7 22 9 Tu 1 . + CDS 19389 - 19535 138 ## gi|266622821|ref|ZP_06115756.1| conserved hypothetical protein + Term 19617 - 19648 3.2 + Prom 19538 - 19597 5.0 23 10 Tu 1 . + CDS 19674 - 20636 299 ## PROTEIN SUPPORTED gi|20808441|ref|NP_623612.1| ribosomal protein S1 + Term 20650 - 20684 1.3 Predicted protein(s) >gi|229784047|gb|GG667688.1| GENE 1 158 - 2287 1980 709 aa, chain + ## HITS:1 COG:CAC0662 KEGG:ns NR:ns ## COG: CAC0662 COG1653 # Protein_GI_number: 15893950 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 366 707 78 451 451 65 22.0 3e-10 MEVAVRKKLRRIFTIMGTTALIFTITACGNRMDAMQPDSSYVADFFDIPDSVTGIERLLL KDDMAYMCCLEEDGTSYLAAMNIDDGKFLKQNLELDASVSLLDFGFAPDNSIWAVCLEQA GVYSLRKFDNSGSLVQSVDLAGVMDTSVISAVGRDLFLSIDREGNICVAAKGRNTLVYLF DNDGSFLFSLDYEGNLMTTTTTAGGQIGICATSPDRMNYDLLTVDMKSLDWSKDKIYLGA TAGLYGGVNKDFYRFDSSVLYGYAAGAQEGSPVFNWSDMGLGTSEVHLGECTDGRFVVLA ASSNQTEVLSYEMAVLTKGADERENLSMVSLSATPGLVQAVSDFNKTNDRYKVELTEYFP YAQNVTDEEWDNAIINLNTRIISGDMPDILDMSHLSVQIYQSKGLLEDLYPYMEHDPEIR KEDYFENVFEAISLDGKLPYLTDGAAISTMLAGEDIVAGNDGWTIQELEKVLDTYGANSI NNLSGASFLKIMLQTDGSFIDWNLGKCSFDSPEFVKMLEFAGKIQESSQNSFGGTEMSES YAAAYETVLSVYHITRYRNYYNGNLRFLGLPGGGGEYHVIKPEVKVGISSSSSRKEGAWE 
FVRTLLTEEHQMSCMMLPIHKRAFDKVTQAAVDGKSVWTWIYEDGKATKEDVELTMQLLN SAAYVENDNQTLESLILEEAREYFSGARSALEAAERIQSRASLYINEQM >gi|229784047|gb|GG667688.1| GENE 2 2320 - 2919 340 199 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870912|ref|ZP_06115738.2| ## NR: gi|288870912|ref|ZP_06115738.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 199 13 211 211 328 100.0 1e-88 MKRQLMKRGMMGIYAVICAGMLLVTGCAGIETNTKTETERIEATERQESNPVITASKEEK AEVPEKQEEAVQNDSSLQIQDDAGNGDSDTMDTFADANDMMKSADLSGTAGECSDTGCVI NTSPFADGTSSSGDGGSMTVTYTEDTVFQKGTVKSDGSSYFLEDSEKDRIGGSDFVLCFG KQQADGSYLADRIIMIVFN >gi|229784047|gb|GG667688.1| GENE 3 2984 - 4888 1512 634 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_1669 NR:ns ## KEGG: Dhaf_1669 # Name: not_defined # Def: RND family efflux transporter MFP subunit # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 392 633 155 385 388 68 28.0 7e-10 MLISGKKYRPLIEDRVQRMAVYLMIGFFAFMLVCTIISRVSYSFTTAHVSVASMAGKTLM NRVELEGIIHASAEKGISLPDGLKAVSVYAGKGKSVEKGEALIEFDVPAVEERIKKLEEE IRILNLKLDISSNNGGNDVIEAQRTLEDAKKAYERIAAKYARAEMRLKEDYTKLEETLAD AEKILINKAEALVKEARENLQNVREDAEDAIRAAEQALDEESENLDDSEDSYQQALEAYE QAKDELNMANQRVSEIKEAIAGKQPDDATDYTEELKEAEEELSIAQKAFNQAKKELSKAD YNSSSYDRADDNLDSIKRRWKRKIDKAKDALYKAETVLEAVREGKDFSTESAVSEAQAAI DTAREALNRARRESEDNQYSEEEELYAAGRAVETAQAALEDARRQAEVLQKEGEIDRITC ESERSDKEKSRAALQEILDNGGRLLAPAPGTVVRTLERGDKTKEGEDAVILSSADRGFVF EGKLDKDSARRFSAGDKGELHYKLDGSTQKADVEIHSISTPDESDQVLVTAVLPEGGYTS GMPAQLFLSRKSETYQNCLPLTALRSDSGGDHVFVLRRQSSVLGTEWVIARVDITVKERD SQMMYVDGALTYSDQVVISSNKIISEGDRVRIEN >gi|229784047|gb|GG667688.1| GENE 4 4878 - 6119 428 413 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622805|ref|ZP_06115740.1| ## NR: gi|266622805|ref|ZP_06115740.1| hypothetical protein CLOSTHATH_04057 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04057 [Clostridium hathewayi DSM 13479] # 1 413 1 413 413 817 100.0 0 MKINRFAHYRALTISAILAAVLWSFACSFYSTAISHCTSVGIKWEKGGVSPSELIRQQTY 
ARQDGAGNQPEVTLWQIYPEQEIMASDKKEMKADIVAAFGDCGDVVSSMPADGAYPARSD ASGCAVSTGLAFSLWGSINVIGLPVKAEGKRFYVRGIFEEEEPRLFLQAQAESKEPLSNM QLTFTGPGAGERAGQYLSSAGFPEGRILDLTLFAWVLGMILRLPAIILAIGILGRVTRHG KRLFRYPLLLAVYFLIAFGILAVLWFCMDLPEFPSEFVPTMWSDFEFWGNLFQEQGKNMV SWMAAGPAFRDMELWTSSFMAVLLSVCASAFTAAAAMLVSIRSYKRMMFYCAAYTLTLCL ISFKIASAHNMTFCKAMYLLPCLWICVDFLFYCLKETLAADVREGRIMDEEKT >gi|229784047|gb|GG667688.1| GENE 5 6103 - 6975 592 290 aa, chain + ## HITS:1 COG:CAC0665 KEGG:ns NR:ns ## COG: CAC0665 COG1175 # Protein_GI_number: 15893953 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Clostridium acetobutylicum # 8 286 7 286 289 194 39.0 1e-49 MKKKREYSVPIREQMVPWLFLLPSLLFVSIMVLIPMLDALRRSFFLAAGDRFVGFQNYLS VLRNHAFQLAAANTARFILVCIPLLLLFSLAVSLMLASFREKRGIFKTSFLVPMAVPVAS IVLLWRVIFHKNGLLNAFTALWGLVPVDWIGSKAAFGVLVFAYLWKNFGYDMVLWLSGLS AINPALYEAAQIDGAGSLQRFVRITLPNLMPTLFTITVLSLLNSFKVFREAYLIAGGYPD DSIYMLQHLFNNWYGNLDVDKMCAGAVMMAILVFILILLLQKIMNRGDRI >gi|229784047|gb|GG667688.1| GENE 6 6972 - 7811 522 279 aa, chain + ## HITS:1 COG:CAC0666 KEGG:ns NR:ns ## COG: CAC0666 COG0395 # Protein_GI_number: 15893954 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Clostridium acetobutylicum # 10 278 16 275 275 157 34.0 3e-38 MKKWLLGLPLVLISVFIWLPLWMLISGTFMGGAELSENLAPVLEGRGGTAVWPVMARYPT LAAYVELLLDTPQFFTVFWNSCRQVFPIILGHVLLGAPAAWAFAKFHFRGKKILYTLYIV LMLMPFQVTMVSSYLVLNKLGLIDTVWAVILPGAASTLPVFIMTRFFMDIPEAVMEAAAV DGASSFQTFLRFGLPLGAPGILSAVVLGFLEYWNAIEAPQAFLKNQALWPLSLYMSNITA DNAGVSLIASLITLMPPLLIFLSGQKYLEQGIISSGMKD >gi|229784047|gb|GG667688.1| GENE 7 7828 - 8946 463 372 aa, chain + ## HITS:1 COG:ECs1897 KEGG:ns NR:ns ## COG: ECs1897 COG3839 # Protein_GI_number: 15831151 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Escherichia coli O157:H7 # 10 366 4 358 360 329 50.0 5e-90 
MENRNFAGDLQISKVNKWYPGDVHAVKDMDMEIKRGEFIVFVGPSGCGKTTLLRMIAGLE SISSGEVIMDGRTVNQVPPKNRDIAMVFQNYALYPNMKVKDNIAFPLKMQHIGKTEREKR VAEVAQKLELDALLERKPGALSGGQRQRVALARAMIRKPGLFLMDEPLANLDAKLRTEMR REIITLQRELGVTTIYVTHDQTEAMTMGTRIAVISGGVLQQFGTPQEIYRNPANLFVAGF IGSPNMNVWDTELISNNGQHYLTLGNARIPMGKSGLDARGLQGPIKAGIRPEHIVPAAPD DPHAVRMKISILENTGREAAVFLTGADIPDMTMVTGADFSGRAGQEVYIRLKPEKILLFN TETGCAYGDVYV >gi|229784047|gb|GG667688.1| GENE 8 9266 - 9532 205 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622809|ref|ZP_06115744.1| ## NR: gi|266622809|ref|ZP_06115744.1| teicoplanin resistance protein VanZ [Clostridium hathewayi DSM 13479] teicoplanin resistance protein VanZ [Clostridium hathewayi DSM 13479] # 1 88 114 201 201 155 98.0 1e-36 MGLGLMTSLGIEILQIFTFRATDINDVITNVAGTMIGYLIGKLIINRFPQLNWLGCKERE LYLLYVTVGVVMFFSQPFIQSVLGNFSL >gi|229784047|gb|GG667688.1| GENE 9 9757 - 10110 244 117 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_3456 NR:ns ## KEGG: CDR20291_3456 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 115 46 160 180 127 53.0 2e-28 MEHNGIEYCFQCNEYPCEKYEKIDKFDSFISHRNQKSDLEKARQIGIEAYNAEQEEKVEI LGTLLAGYNDGRKKTLFCVAVNLLKIHELRAVLGKIENRSDLENLTLKEKALLLLGC >gi|229784047|gb|GG667688.1| GENE 10 10276 - 11607 871 443 aa, chain + ## HITS:1 COG:SPy0189 KEGG:ns NR:ns ## COG: SPy0189 COG3969 # Protein_GI_number: 15674394 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Streptococcus pyogenes M1 GAS # 2 442 1 444 444 653 68.0 0 MVREYLHMNVYDAFLARMHFLFEEFDHIYVSFSGGKDSGLLLNLVLDYRNQHYPERAIGV FHQDFEAQYTVTTEYVERTFERIQDQVELYWVCLPMATRTALSSYEMYWYPWDDTKKELW VREMPQKEYVINLENNRMTTYRYRMHQEDLAKQFGRWYRDAHDGGKTVCLLGNRADESLQ RYSGFLNKKYGYKDTCWISKQFKDVWCASPLYDWSTDDVWHATYLFGYDYNRLYDLYYMA GLKPSQMRVASPFNDYAKDALNLYRVIDPEIWCKLVGRVQGANFASIYGKSKAMGYRNIT LPEGHTWKSYTKFLLDTLPKRLRNNYAKKFNTSIQFWHGTGGGLDESVIQELQEKGYQIR RNGVSNYTLNQKSRIVFVGPIPDHTDDIKSTKDIPSWKRMCYCILKNDHICRFMGFGMTR 
QQQKRLDMIRRKYKSIEEFDYEV >gi|229784047|gb|GG667688.1| GENE 11 11597 - 12130 375 177 aa, chain + ## HITS:1 COG:SPy0190 KEGG:ns NR:ns ## COG: SPy0190 COG1475 # Protein_GI_number: 15674395 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 3 171 4 172 176 258 71.0 5e-69 MKYKSPVYNVIAVPIEKVRPNNYNPNKVAPPEMRLLYESIKEDGYTMPVVCYYVKSQDIY LIVDGFHRYQVMMDHPDIYEREQGLLPVSVIEKPIDKRMASTVRHNRARGTHDVDLMSNI VKELHEFGRSDAWISKHLGMDRDEILRLKQITGLAALFRDIKFGEAWRPVEEETEKE >gi|229784047|gb|GG667688.1| GENE 12 12230 - 12991 343 253 aa, chain + ## HITS:1 COG:PA1415 KEGG:ns NR:ns ## COG: PA1415 COG0491 # Protein_GI_number: 15596612 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Pseudomonas aeruginosa # 2 211 7 220 242 116 33.0 3e-26 MDKWFTIDQIDEDTYIISEYRHWEETHCYLLNGSSRSLLIDTGLGICNIYDEVMKLTNKP VTAVATHIHWDHIGGHKFFPDFYAHAEELSWLNGEFPLTIETIREMVADRCDLPDGYDTG FYEFFQGTPTKVLEDGENIELGGRNIQVLHTPGHAPGHMCFWESDRGYLFTGDLVYKDTL FAYYPSTDPDAYLASLEKISELPVKKVFPAHHSLDIQPEILIRMRNAFRQLREDGKLHHG SGTFDYGDWAVWL >gi|229784047|gb|GG667688.1| GENE 13 13084 - 13548 455 154 aa, chain + ## HITS:1 COG:MA0108 KEGG:ns NR:ns ## COG: MA0108 COG0346 # Protein_GI_number: 20089007 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Methanosarcina acetivorans str.C2A # 1 152 10 163 163 127 42.0 7e-30 MKYVCTVISVADISAARKFYEELFGVEVYQDYGRNIAFTCGLALQQDFDWLVSIPKEKVL KKSNNAEIVFEEQDFDGFLNKLKAYSDIEYLGEVIEHSWGQRVIRFYDLDEHLIEVGEDM QMVVKRFLASGMTMEEVSAKMDVSIEDLTKLLNS >gi|229784047|gb|GG667688.1| GENE 14 13599 - 13727 78 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLDRTIPFYNAILRCDRYLTTTPVLHIQMNFLKKGHARRLMM >gi|229784047|gb|GG667688.1| GENE 15 13836 - 14606 303 256 aa, chain + ## HITS:1 COG:no KEGG:Athe_2346 NR:ns ## KEGG: Athe_2346 # Name: not_defined # Def: TraX family protein # Organism: A.thermophilum # Pathway: not_defined # 29 250 13 
218 225 91 32.0 4e-17 MKTTQNLMRVKELLLTPPTFSDNKARLQTNLDTDFLKLIAIVSMLIDHIGSVFFPEVRVL RWIGRLAFPIFCYCMTVGLLYTHDIKKYLFRLGIFALISQPCYIFAFHPYDFWSQFTNWN IFFTLFLSLLAMYGWKERKWWLFALTFFVISWWNFDYSSNGIFLMLVFYLCRNNPVVGAM FYLLFTVLPALFARPGDFRNLTLGGLTFDWTFAMAFAALFIFPCTHTNLKVPRWLFYAFY PIHLLIIGLVRLVLKV >gi|229784047|gb|GG667688.1| GENE 16 14738 - 15490 583 250 aa, chain + ## HITS:1 COG:SMc03994 KEGG:ns NR:ns ## COG: SMc03994 COG0483 # Protein_GI_number: 15966531 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Sinorhizobium meliloti # 25 237 32 247 266 166 43.0 4e-41 MDIQKIISLVTKTQGLIKNREMAAHVKEKGLADYVTQVDIAVQNFLKKELFALAPDIQFL GEETGLQEIKADSFWILDPVDGTTNLMHDYQHSVVSLALCRQGDIIMGIIYDPFRDEVFS AIKGKGSFLNGQPIHVSNAEKLSDTMIGLGTAKREVADENFARFRRVFGQCQDVRRIGSA ALELAYTACGRQGGYFEIYLNPWDYAAGMLMIQEAGGKVTDFTGKPLEPKKGSSVVGTNG FIHTELLKLL >gi|229784047|gb|GG667688.1| GENE 17 16504 - 16938 276 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622817|ref|ZP_06115752.1| ## NR: gi|266622817|ref|ZP_06115752.1| hypothetical protein CLOSTHATH_04069 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04069 [Clostridium hathewayi DSM 13479] # 1 144 3 146 146 279 99.0 6e-74 MKKKDNIRYRFTGYVLAALKRAKRDYICREIKISNHESPLEKLLPEPLCDITDSKPFLWE ETEQIPLIPDEVRQYMESQIGESGETALDALTDMEILVVFMKVFRQLTYQEIGRHLGMDY KKAASVFNYAKKKMRKGWKIRNEF >gi|229784047|gb|GG667688.1| GENE 18 16928 - 17134 283 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622818|ref|ZP_06115753.1| ## NR: gi|266622818|ref|ZP_06115753.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 68 1 68 68 114 100.0 3e-24 MNFKELLMKAKQGDKGAREELFLMYRPMVLSRSRIKGIFSEDLYQELSKTFVSCIEQFSI EGTENEQK >gi|229784047|gb|GG667688.1| GENE 19 17161 - 17352 73 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVLTPPIAIKLLKGQVSDTLEINVAAYIKIIAIKGKGNTPASACCIPLLSVSLKTHSYWS SCR 
>gi|229784047|gb|GG667688.1| GENE 20 17523 - 17879 254 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622819|ref|ZP_06115754.1| ## NR: gi|266622819|ref|ZP_06115754.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 118 1 118 118 235 100.0 8e-61 MRTKRYTVVIAGLIPEIATQNILVKIWGKAAKKVYASTGIYVNAWLSESYFLCGDKRGPD LDGLTANFIIIWNPVEVESYEKFHEAFTQVVNGVRDILGNPYVWITIDDIEFYYFVKC >gi|229784047|gb|GG667688.1| GENE 21 17925 - 18884 370 319 aa, chain - ## HITS:1 COG:PH1043 KEGG:ns NR:ns ## COG: PH1043 COG1473 # Protein_GI_number: 14590880 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Pyrococcus horikoshii # 1 307 78 378 387 213 39.0 5e-55 MDALPIKEAAPVPFASENGYMHACGHDMHTTMLLGAAKLLKQYQQDLVGTVKLIFQPDEE GFTGAKALLKAGVLENPHVDAGIAFHVVSGIPSGTVMCGSGTCMAGCTLFQIHIKGTGCH GAMPETGVDPINIAAHVYLSLQEIIAREIAPTQPAALTVGRFSAGEAPNIIPEDVILEGT IRAMDCKISKYIFDRIEEISIQTASLFRGQACVKEIASAPPLQNDSEMVKELASYMKECY EPYKIELFTQGGMGSEDFASYTYERPCCYMLIGAGTPEENPLFGKPMHNDHVVFNEEILS LGSALYASNAINWLRNHSQ >gi|229784047|gb|GG667688.1| GENE 22 19389 - 19535 138 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622821|ref|ZP_06115756.1| ## NR: gi|266622821|ref|ZP_06115756.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 48 1 48 48 92 100.0 8e-18 MRMITRNVNIVHLTLEGQEDLYQELSKPFLNCIDKFLIDVTEKGPKRD >gi|229784047|gb|GG667688.1| GENE 23 19674 - 20636 299 320 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|20808441|ref|NP_623612.1| ribosomal protein S1 [Thermoanaerobacter tengcongensis MB4] # 46 319 5 255 257 119 28 1e-26 MREDSEGRLAEEVQEREKRVHVQPMGILTVDASGAAARQESGDITWHEMVNAYRTKKILS GGLSGIERLDGGWVVAVVYYKDYRVLIPMEEMMINLEGDGRENSNTLNRQTRLANNMLGA ELDFIIREMEEKSRSVVASRKEAMLRKRQQFYLGQQEGPPMIVPGRVVEARVIAVAQKAV RLEVFGVECSLKARDMSWEWMADANEKFSIGDVVSVMMKKISGDSIENLKVEVSAKEAKT 
NVNKENLMRLRRQGKYVGKITDVFKGTYFIQLDCRVNAVAHSCNTASLPGQGDEVGFLVT RINDEREVAEGIITRIIKRK Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:41:28 2011 Seq name: gi|229784046|gb|GG667689.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld82, whole genome shotgun sequence Length of sequence - 30290 bp Number of predicted genes - 26, with homology - 25 Number of transcription units - 11, operones - 7 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 + CDS 3 - 551 408 ## COG0477 Permeases of the major facilitator superfamily + Prom 564 - 623 7.3 2 1 Op 2 . + CDS 655 - 1641 804 ## COG1609 Transcriptional regulators + Term 1698 - 1753 11.1 + Prom 1826 - 1885 6.2 3 2 Op 1 . + CDS 1933 - 2403 493 ## gi|266622825|ref|ZP_06115760.1| hypothetical protein CLOSTHATH_04077 4 2 Op 2 . + CDS 2418 - 2840 467 ## gi|266622826|ref|ZP_06115761.1| putative cupin region 5 2 Op 3 . + CDS 2856 - 3293 379 ## KP1_4207 hypothetical protein + Prom 3323 - 3382 4.8 6 3 Op 1 . + CDS 3402 - 4337 844 ## Achl_3528 hypothetical protein 7 3 Op 2 . + CDS 4359 - 5618 1233 ## COG0673 Predicted dehydrogenases and related proteins 8 3 Op 3 . + CDS 5650 - 5958 481 ## ANT_12460 hypothetical protein 9 4 Op 1 16/0.000 + CDS 6884 - 7375 617 ## COG1082 Sugar phosphate isomerases/epimerases 10 4 Op 2 . + CDS 7461 - 8597 1093 ## COG0673 Predicted dehydrogenases and related proteins 11 4 Op 3 . + CDS 8641 - 9852 815 ## COG2706 3-carboxymuconate cyclase 12 4 Op 4 . + CDS 9867 - 10331 441 ## Achl_3527 hypothetical protein 13 4 Op 5 . + CDS 10348 - 11310 1026 ## Achl_3528 hypothetical protein + Term 11314 - 11370 17.2 - Term 11530 - 11582 4.6 14 5 Tu 1 . - CDS 11641 - 13374 1315 ## Swol_1745 transposase - Prom 13411 - 13470 7.0 15 6 Tu 1 . 
+ CDS 13829 - 14206 466 ## COG0514 Superfamily II DNA helicase + Prom 15045 - 15104 80.4 16 7 Op 1 7/0.000 + CDS 15188 - 16480 1244 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 17 7 Op 2 2/0.000 + CDS 16482 - 18038 1589 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 18054 - 18113 3.6 18 7 Op 3 35/0.000 + CDS 18140 - 19528 1859 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 19543 - 19585 4.2 + Prom 19550 - 19609 3.5 19 7 Op 4 38/0.000 + CDS 19679 - 20518 1093 ## COG1175 ABC-type sugar transport systems, permease components 20 7 Op 5 . + CDS 20515 - 21204 805 ## COG0395 ABC-type sugar transport system, permease component 21 8 Op 1 . + CDS 22159 - 22239 83 ## + Prom 22244 - 22303 2.9 22 8 Op 2 . + CDS 22323 - 23795 1591 ## COG3250 Beta-galactosidase/beta-glucuronidase 23 9 Op 1 . + CDS 24735 - 25883 955 ## COG3250 Beta-galactosidase/beta-glucuronidase 24 9 Op 2 . + CDS 25968 - 26291 217 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 27229 - 27288 25.1 25 10 Tu 1 . + CDS 27510 - 28457 1116 ## bpr_I1133 hypothetical protein + Term 28689 - 28723 5.1 + Prom 28766 - 28825 1.8 26 11 Tu 1 . 
+ CDS 28865 - 30290 1632 ## COG1132 ABC-type multidrug transport system, ATPase and permease components Predicted protein(s) >gi|229784046|gb|GG667689.1| GENE 1 3 - 551 408 182 aa, chain + ## HITS:1 COG:L157472 KEGG:ns NR:ns ## COG: L157472 COG0477 # Protein_GI_number: 15672136 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Lactococcus lactis # 1 169 213 383 393 64 29.0 9e-11 ALEPAILNFASQIAYASISAFIVVYGGVKGWEKIALFYTVYSFSLFLFKPVNGKLYDKMG LAPLVVFGNLSFIFGLFLLGTTDSFAVCLLSAVFCAYGYGGAISTFQAEALKSTTFERHG IASGTYFMLNDFGGFLGASLAGIVVSAVGYSRMYLLFTLPLIAAIAFYWCMRLFWRRSAD FC >gi|229784046|gb|GG667689.1| GENE 2 655 - 1641 804 328 aa, chain + ## HITS:1 COG:VC2677 KEGG:ns NR:ns ## COG: VC2677 COG1609 # Protein_GI_number: 15642672 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 1 326 1 328 335 131 32.0 2e-30 MATVVDVAELAGVSVATVSRVIRDSNKVSDDKREKVMNAIEELGYKMDPNSTGRLRKKIL YVVCGGSQDEFVDNILEIANESGYEIAVEYTAGLPLEMNTFMKKLLKNKIISGVLTCGLP PESGKALREIDQIVPVVQCCDEIMTENSFVVSSDDVMMGHDAVLHLSKKGCRRIAFLGLG KMKNPFKYSHAREMGYRVAHAELGIPVDENLIKQCDLTADSVYRALDELRKLPEMPDAIF CARDSTAVFAVNKLTRDGIRIPDNISIMGCGSAESAERSWLPLSNVTQSYYEIGLEAINL MNARIMGKTVLGRRLNIKHTIVDRETTK >gi|229784046|gb|GG667689.1| GENE 3 1933 - 2403 493 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622825|ref|ZP_06115760.1| ## NR: gi|266622825|ref|ZP_06115760.1| hypothetical protein CLOSTHATH_04077 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04077 [Clostridium hathewayi DSM 13479] # 1 156 1 156 156 310 100.0 2e-83 MGKSKIIHRDALIRDNGFEKGFGVYSLVCDETCGSTNIMLGDTIHNPASRNQEHVHLGCE VQWFVLAGHSLHYSCTVEHQEYKETECFPGTVGYVHPNEIHVGMNLNADNTGEVIFCYAG CNNTEGANTVFYQEADIVEEYMRKHGKKLEDVLHIQ >gi|229784046|gb|GG667689.1| GENE 4 2418 - 2840 467 140 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|266622826|ref|ZP_06115761.1| ## NR: gi|266622826|ref|ZP_06115761.1| putative cupin region [Clostridium hathewayi DSM 13479] putative cupin region [Clostridium hathewayi DSM 13479] # 1 140 1 140 140 272 100.0 6e-72 MKKTQTIDRTMTPKSGSVQPGMITELMTSADTCATERLLGTYCTLLPAGESQMRRHLNAE CAWYLVKGRIREIIIEADGSRIEEECKAGDAGYINPGDAHQEINLSIEENAEMIMCFANP ENKKCNCFDDLGTEAAVKPE >gi|229784046|gb|GG667689.1| GENE 5 2856 - 3293 379 145 aa, chain + ## HITS:1 COG:no KEGG:KP1_4207 NR:ns ## KEGG: KP1_4207 # Name: not_defined # Def: hypothetical protein # Organism: K.pneumoniae_NTUH-K2044 # Pathway: not_defined # 5 141 1 127 130 75 31.0 7e-13 MTNSLFEKYLLCDEGFGNIYDRNHPDDPPIGFQLKIRIPYYRSARLSLIEKITVRVDGRE YDPDKMLFTTHDGTFTMDQMRTMPNHYWLFGEKATVTVMEPDGLGRAFSTAYRTVELGIY LRISYSHLGFIGIANKELKAESFLS >gi|229784046|gb|GG667689.1| GENE 6 3402 - 4337 844 311 aa, chain + ## HITS:1 COG:no KEGG:Achl_3528 NR:ns ## KEGG: Achl_3528 # Name: not_defined # Def: hypothetical protein # Organism: A.chlorophenolicus # Pathway: not_defined # 1 311 23 329 346 179 31.0 2e-43 MGKLDLEGCIAEAAKTGARGIELIPEQNCADEYLDPSDEFAAKWKDWMQKYGTEPCAMDI FYDYKLFGNRILIWKEQVKMYTDAFRFAKKLGFPVVRGMLTTPIKLVEMMIPVAQELGLR FGVEIHSPYSLESEYCLQLYELADKYHTDCIGVIPDTGIYVKQQSEVIIQKFIRAGAHKE IADYTCRAYIDQVPQQEAAEAIARMNPNSVDHALFKRVYFATYDDPNLLVKYADRILHVH SKCWHMTEDCEEPSIDNENVIRMLRMGGYDGWIATEFEGQRYFHDEGCKENADEIEEVRR NQEMLKRLLGE >gi|229784046|gb|GG667689.1| GENE 7 4359 - 5618 1233 419 aa, chain + ## HITS:1 COG:SA0210 KEGG:ns NR:ns ## COG: SA0210 COG0673 # Protein_GI_number: 15925921 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Staphylococcus aureus N315 # 1 281 1 275 359 122 28.0 1e-27 MQTVRVGIIGGGMISHRHMQIYKNINDRAQLLGFRAEVVAVAEIIPERLKAWGEQYGLDE SSLYADYHDMLKRDDIDTIDVCVHNNLHVPVAIEVMKAGFDCYCEKPAAVTYTDAKLMVD AAQKLGRKFHVQISSLMTPQTRLAKQYIREGRLGTPYLVNLEQCTSRRRPGYDLPEFTED FLSSRIAGHGPSIDLGIYVIGQILHLFDCPKVQSVNGFAGHYVDYDKSIIKSPDGYGVED TVDGFVKFETGLGFHFLSTSANNYKDYSMTYILGSKGGLEVINTDTVGGKFERENPDAPT 
PPGTEPELRFFGSMEGRDVGIELNCDKNGRMEARKDPKIMLYNDNQCMWLAYKLGILDDN TRYNTPEIALNQLIITDGFFLSAELGRSVARDEIIANSPSVYLREQEIAGKLYRFDTEF >gi|229784046|gb|GG667689.1| GENE 8 5650 - 5958 481 102 aa, chain + ## HITS:1 COG:no KEGG:ANT_12460 NR:ns ## KEGG: ANT_12460 # Name: not_defined # Def: hypothetical protein # Organism: A.thermophila # Pathway: not_defined # 1 99 1 97 250 70 35.0 2e-11 MKRLPVGIQVYGLRDLLENTPENFRSVMLQIKEMDYDGVELAGLYGLEPEFVKKTLDEIG LIPISAHVPLTEMMEDAKKVAETYQKIGVSYIAVPYLPELAS >gi|229784046|gb|GG667689.1| GENE 9 6884 - 7375 617 163 aa, chain + ## HITS:1 COG:CC1629 KEGG:ns NR:ns ## COG: CC1629 COG1082 # Protein_GI_number: 16125875 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Caulobacter vibrioides # 7 158 183 323 328 71 32.0 8e-13 MARIGQIMKEYGIKLEYHNHDFEFVRLPDGTFGFDDIYRRIPEDLLKVEPDLCWIKVAGE SPVDYLKKYGERSEVVHLKDYIKEGNPKNMYKLIGIESGEEEGDSGIFEFRPVGFGMQIW EPILEAVAEAGAKWVVVEQDEHYDLPVLEAARRSREYLKILGW >gi|229784046|gb|GG667689.1| GENE 10 7461 - 8597 1093 378 aa, chain + ## HITS:1 COG:lin2262 KEGG:ns NR:ns ## COG: lin2262 COG0673 # Protein_GI_number: 16801326 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 3 374 2 346 349 334 42.0 2e-91 MDEKKRIAIIGCGGMGGGHAIAIGSGTGNAVYNADSSRERGMKIFDTSANTDISTRLELA GIYDINPDRMRWASEKGFALYDSYESMLQDQSVDIILIATPNHLHKDMAIAAMKAGKHVL CEKPVMMNSRELEEVIEVSRETGKLFYPRQNRRWDRDYRIVKKIYEEKLVGNVFRIESRV QGSRGIPGDWRKEREFGGGMMLDWGVHLLDRLLFMIPDKVETVNCRLSYVLKEEVDDGFT MFLTFAGGLSALVEVGTCNFISLPLWYVGGDRGSAVIDNWKCEGKMVQLSSWEDKDTLPI MAGAGLTKTMAPRDQKSLITLPLPDVVYDDNELYSNVVDAVNGVAEPCVTAEQALRVMRL MEAAIRSSEEGKTVRVDG >gi|229784046|gb|GG667689.1| GENE 11 8641 - 9852 815 403 aa, chain + ## HITS:1 COG:RSp0183 KEGG:ns NR:ns ## COG: RSp0183 COG2706 # Protein_GI_number: 17548404 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Ralstonia solanacearum # 6 398 59 403 410 189 32.0 
1e-47 MGRKRYAFVGTWKDFGGHGEGIYVYELDPETGDMKETDHLRGFRNNSILSISPDKRRLYA TIEDKNYQDQWGAGGGVLALAFDAEKGKLSVLNERSALGACTSYISADPAGQYVLITNHG SGLDVITRVVPDGTGGYKIVNEYDDSTLVLFRLCGDGSIGEVCDLHKHEGKGGLKVRGQM CSHLHSVNFDPSGQWVLVCDKGCDRIYVYRMNREKGILEPSPRTVVHTRPGTAPRHLEFH PYAPYCFVNHETASAVTSYYFDAGTGTLREADYQPMIPMDLSVKAELIGDTYGSPADIHV HPGGQFLYSSQRSSYHTGQGTDCIAVFRIDPENGRLERTSIAVMEGPCPRGFAIEPDGRF LYVGNQYLDTICRYRIDPVTGGLTDRQTVAHAATPTCIRIVDI >gi|229784046|gb|GG667689.1| GENE 12 9867 - 10331 441 154 aa, chain + ## HITS:1 COG:no KEGG:Achl_3527 NR:ns ## KEGG: Achl_3527 # Name: not_defined # Def: hypothetical protein # Organism: A.chlorophenolicus # Pathway: not_defined # 24 151 2 126 130 72 28.0 8e-12 MEYKAVLEDPKELERLVYSARSCWEQYLIVDGSLKNDVQGDAVTGCTLAIRTCWYRSLAV TLIRRLEVRIDGKLIPQEDIFFQVEDNPHMYRLDEIGPEQSEEYWYLNKYGYLHLRLPGG LSKGPHITEVLLEVFVTYHNYPDATYQKKILLVE >gi|229784046|gb|GG667689.1| GENE 13 10348 - 11310 1026 320 aa, chain + ## HITS:1 COG:no KEGG:Achl_3528 NR:ns ## KEGG: Achl_3528 # Name: not_defined # Def: hypothetical protein # Organism: A.chlorophenolicus # Pathway: not_defined # 1 294 8 300 346 168 35.0 3e-40 MKLSISLYSFAREYFDRKYTLEQCIAKAAEIGYQGIEIIGPQMIDNYPNPTEEFYENWHQ WMKKYNVEPVCLDVFNEYMVLKNRKMTDEEQCEQLRRDIRIASRMGIHVVRVMNLYTPEQ LEACIPEAERYDVTLSLEVHGPKLLNDPSVLKYLEMIDRNKTTRVSICPDISVFQYSVPP RMLRYYVAKGASVELAEYCRTAYEEGVDIDTAMETCRKLGGTPLDVLMVKNCCQGSKSDY KYLEEFAPYITHVHAKCFDELTPEGKDPTFDCERIAGILKKNGYEGYISTECEAFQYEPW QLVTETLLEQHCAMWHKLIS >gi|229784046|gb|GG667689.1| GENE 14 11641 - 13374 1315 577 aa, chain - ## HITS:1 COG:no KEGG:Swol_1745 NR:ns ## KEGG: Swol_1745 # Name: not_defined # Def: transposase # Organism: S.wolfei # Pathway: not_defined # 1 573 1 569 570 521 48.0 1e-146 MKVSISKSKNAESFYIKQSYIDGNGKSTSRTIRKLGTLKDLLAEHGPTREDVMKWAKEQA RIETERYELEKESHCIPIVFHPNRKIPHGQRRNFQGGYLFLQSIYYELGLNRICRKIRDK HHYTYDLNAILSDLIYTRVLDPGSKRSSYEAAKGFLEAPAYELHDVYRALTVLSEECDLI QAEVYKNSKSILERNDRVLYYDCTNFYFEIEQEDGNKKYGKSKEHRPNPIVQMGLFTDRD GIPLAFRLFPGNENEQKSLKPLEKKVIEEFGCERFIFCSDAGLGSENNRLYNHTRNRSYI 
ITQSIKKLPVEYRELALNRKRFRRLSDNRLTDLDTITEKDSDELFYKEEPYSTRKMEHRL LITYSPKYAAYQREVREKQVERAKAMLADGNHKKTGKNPNDPARFIGKSAVTKDGESADI RYVLDEQKIVEEARYDGLYAVCTDLFEDDPGEILKVSERRWQIETCFRIMKTDFSARPVY VQREDRIKAHFLICFLALLVYRCLENKLNKRYTCEEIVSALKGFAFADVQGQGFIPIYES TKLTDELHKISGFETDYEFLTKSRMREIQKLSKKKTS >gi|229784046|gb|GG667689.1| GENE 15 13829 - 14206 466 125 aa, chain + ## HITS:1 COG:FN0578 KEGG:ns NR:ns ## COG: FN0578 COG0514 # Protein_GI_number: 19703913 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Fusobacterium nucleatum # 5 121 15 131 614 148 57.0 2e-36 MDKYQVLKQYFGYDEFREGQELLIDSILSGRDTLGIMPTGAGKSLCFQIPALMMEGITLV ISPLISLMKDQVEALNQAGIHAAYLNSSLTASQYYRALAYAREGRYPIIYVAPERLVTEE FKLAS >gi|229784046|gb|GG667689.1| GENE 16 15188 - 16480 1244 430 aa, chain + ## HITS:1 COG:BH2727 KEGG:ns NR:ns ## COG: BH2727 COG2972 # Protein_GI_number: 15615290 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 104 425 267 584 597 136 27.0 9e-32 MVKVGFYDRNVTASRKAHMLTIVAGLSPGIDVDRDNVVEMAALFVSSQTGNLIKEYNKEK LLGVTMLLNPDGTPVFDVNDGMRFFIPEIISGGETELHHKLDGKNYVYVISEEPVTGLLA VSVVDSELLTRDFTRIAAVIVAVTVVLFGLFYLFSSYFLKNIIDPIHNTVEGMRLVEEGR LDVHVVPVGQAELRVMIHSFNRMTRRLKQLMQENEEEQQKKHEAEIRALQSQINPHFLVN SLNSIRFIAQVSRYEAIARMAEALIKILSCSFRSNVGFYTLDEELEVLDGFIYLMKIRYS DGFEIEYEIGEGCGSCMVPRLILQPIVENSIVHGFSGQEEELGKITVKARLEDQYLYITV RDNGKGMTEEEIRRLLSNETAENEDYVSIGVTNVNTRLVLNYGDECTLQIRSEVGRYTET VLRIPAAGRD >gi|229784046|gb|GG667689.1| GENE 17 16482 - 18038 1589 518 aa, chain + ## HITS:1 COG:BH1910 KEGG:ns NR:ns ## COG: BH1910 COG4753 # Protein_GI_number: 15614473 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 518 1 496 506 126 24.0 1e-28 MRKVLIVDDDTIIRITLRSLINWEELGFQIAADAIHGEAALQYLKNHTVDLIISDMKMPV 
LDGIGLLEAVSRFDVMPKVLVLSGYDDFKLVRDAFRLGACDYLLKADLNGELLKNMLEGL NGEWESEEETGRQNSAEESREKISDSERLAAMAMGKCELDEEFLAGEYLVVQFEIDEFQK HAGRFAGDFEGKLIRPLLEFAAQIPRVVSRCVLGTISPSRYVMLYRILDAGQYQDNAVST CRQLCSVWNNFLNLPVSAGISTRGSGAGDFLSRFEEAGGWLSLHWLKGRAAVCSPWEKDS VDYLEARNSFSRYKRLAAGLLTADEMTVSEEKQKLIAVFYGSGLSSAKKEILCLICNLAW LLNESHDDIGALFAEKVNYYEKIGRLEEMGSLELWLNNYFRWVMDYAAHQNDRRQADMMS RAKRFIMDNYSNPELTLGSVAGYVGVNEKYFSSRFTKEEGMTFSNYLTEVRIRKARELME QTDLKIYEISQSVGYNSVEHFTRVFKKICQVSPGAYRK >gi|229784046|gb|GG667689.1| GENE 18 18140 - 19528 1859 462 aa, chain + ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 43 369 24 353 438 84 26.0 4e-16 MKKQIAALGLCAAMVVGAATGCSATASKETEKTPAADAATTAGGQTDTTAAPDTEAAGAD LKGNLVFAIWDNNLMDYIDQNDMVGKFQEKYPDAEIEVEKIKDDSEYWNAMKMRTSANQL PDVMFNKPFTLSRFKDYLVDLSDLEATKNNELAAGYALDGKILGVPLTAGYEYVFYWKDM FKEAGVTVPTTWSEFEQTAQTLQDYYGKDNPDFMAIALGAKDEWPDYPYMEFMPALEGGN GQNWNDMAKSDAPFAEGTDINKAYTKVNELFNMDVFGKDPLGMGNDQVTALFAQKKAAIL ALGDWGLQNIQSGTDDYSELGTFYLPVRDTSSDPFRVIVQGDSFMGVTTHSKNPELAKAF VEWFYSEDWYPGYINYVTSASSMTNFPKEKDPILAEADALQPDMELVMYDGGGDDFQAIQ NEIAFDYKKLGAQMFTEGFDLKAELDTLNAKWAEARTKLGIK >gi|229784046|gb|GG667689.1| GENE 19 19679 - 20518 1093 279 aa, chain + ## HITS:1 COG:BH2225 KEGG:ns NR:ns ## COG: BH2225 COG1175 # Protein_GI_number: 15614788 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 4 263 34 292 304 152 32.0 7e-37 MALLIGFVVVPAFDLIRMSFTNWDGYSKTYDYIGIGNYFDMLKNKDLWQSLRNNAVYFFG HLLFIPIELMFAVLLTSKLRAAKFYKTMVFMPYIINGVAISYAFSYFFSPINGAFDEILT LFHLEPLIRNWLSDPKIVNFVLAGVSLWRFSGYHVILFMAALQSLPQDVQEAARVDGASA WQMFKYIQIPAIMLMVDFVLFDNIRGALQVFDIPYVMTSGGPGYASSTFTLYTIDTAFKY SNFGLASTMAVAIMLMIVIIYVVQNKIIHGIVLKGGKKG >gi|229784046|gb|GG667689.1| GENE 20 20515 - 21204 805 229 aa, chain + ## HITS:1 
COG:mlr4774 KEGG:ns NR:ns ## COG: mlr4774 COG0395 # Protein_GI_number: 13474001 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 7 227 10 230 279 100 30.0 3e-21 MKQRTLQQIVKQVICLCMVSIVLAPILLTLFAAFKTKADMVNTSPLLLPPLSRVTTDNFQ KVLGDKYLLIGFKNTGIILVFSIFFNVLFGTITAFILERFQFRFKKLIMSLFFLGMLIPS FVTEIARFKIINGLGLYNTLGAPIVIYVASDLMQLYIYRQFISTLPVSLDESALLDGCSY FGLFTRIIFPLLAPATATVVIIKAITIINDMYIPYLYMPKNKLRTLTTF >gi|229784046|gb|GG667689.1| GENE 21 22159 - 22239 83 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLPTIVIYLFFQKYILAGIVAGAVKE >gi|229784046|gb|GG667689.1| GENE 22 22323 - 23795 1591 490 aa, chain + ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 5 490 4 492 1014 605 59.0 1e-173 MDQYHADLAWLENPEVFAVNRLEAHSDHRFYESREEADSGIMKLRQSLNGTWKFSYAPKP KQRKKDFYLPERSLGDFGEITVPGHIELQGYGKCQYINTMYPWEGHSELKPPHVDWEDNP VGSYVKEFETDAALKNKRLFLSFQGVESAFYVWVNGRFVGYSEDTFTPSEFEITDFVKDG VNRLAVEVYKRSSASWIEDQDFFRFSGIFREVYLYAVPECHVRDMFVHPGVLEDLKTGEL TVDLTLEGSRKGSVSALLTDREGNTAAVWEHMPARENISFAGSVPEVHLWSGEDAFLYTL TVILYDEKGRIVEVVPQKLGFRRFEMKDRLMCLNGKRIVFRGINRHEFDVRRGRAVTEED MLWDIRFMKRHNINAVRTCHYPNQSRWYELCDEYGIYLIDEANLESHGSWQKMGACEPSW NVPGSLPEWKECVVDRAKSMLERDKNHASVLIWSCGNESYAGEDILAMSKYFKERDPSRL VHYEGVFWNR >gi|229784046|gb|GG667689.1| GENE 23 24735 - 25883 955 382 aa, chain + ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 1 372 500 868 1014 347 48.0 3e-95 MESRMYAKPAEVEAYLEGEPEKPFILCEYMHAMGNSLGGMEKYTSLEDRYPMYQGGFIWD YVDQALMKTDENGVEHMAYGGDFNDRPTDYNFCGNGIVYADRTISPKAQEVKALYQDLKL VPDAGGAEIENRRLFTDTSDLEFVWLALRDGVPVHSERFCAQVKPGEREYVSVSAPELTE 
PGEYVYQVSAVLKREERWAAAGYETAFGESCRVIGSDDQCAGENRADAGSVPFTVIHGDV NIGVKGDGFHVIFSKQEGGIVSLVYDGREWIGKLPMPVYWRATTDNDRGNKFSVSSAVWY GAGSFPLYDSKTCVVEEGKDCVRVSYTYRLATVPETVTEVVYEVDGEGRITTTARYFGRE GLPELPLFGMRFCISGTGGGFE >gi|229784046|gb|GG667689.1| GENE 24 25968 - 26291 217 107 aa, chain + ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 1 106 908 1013 1014 114 50.0 3e-26 MSRYLVPQECGNRTGIRWLKVTDEAGHSIRFTAEGRPFEGSVLPYTAEELEHASHIEELP QPRYTVVNILAAMRGVGGDDSWGAPVYPEYCVSGEKDITFSFVIGRE >gi|229784046|gb|GG667689.1| GENE 25 27510 - 28457 1116 315 aa, chain + ## HITS:1 COG:no KEGG:bpr_I1133 NR:ns ## KEGG: bpr_I1133 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 104 303 99 299 308 100 29.0 7e-20 MKKERETKEYLEFVEQFQAFLILMTEEWGAKVTLQKGGEAEEAEDLLVVELNGDGSGTHI QRFHMAEIYHDFQLGKGMEELLSEASDCLERCREIEKSSPLGHMEDYEAIREYLVIRPLN YERNAKKLEEGVYEVTGDIALTLYVSIGNFGGLYTSSMVPKVVCEGWDKSREQVMNDAME NTYRLFPPRLFNWLEVERFRDKDYGIFMERDTEVRLDGGACATFVTTKSQINGAVAIFLP GVAKRLGELMGNDLYIAFTSMHEAAIHNCEKVYPETIQESLKNLNREMPADEDFLSEKVY YYSRAKDRIEVVLEY >gi|229784046|gb|GG667689.1| GENE 26 28865 - 30290 1632 475 aa, chain + ## HITS:1 COG:lin0156 KEGG:ns NR:ns ## COG: lin0156 COG1132 # Protein_GI_number: 16799233 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Listeria innocua # 1 475 4 479 580 278 35.0 2e-74 MRNLGKYVKKYGILYVVAIAAMVVSILLDAASPQITRHIIDDVIVGGRLELLMRLLLGLL GIGFGRAIFQYVKEFIFDYSASGIGSRLRKDLFDHIQTLSMGYFDSHNTGELMARVKDDV DRIWNAMGFVGMLIIECTIHTVIIMTCMFRLSPALTVIPLLVMPVIAWYGLKMENGLGQV YDKISEQTAELNTVAQENLAGVRTVKAFAREDYEIDRFKRHNKRYYDLNMEQAKLIVKYQ PNVSFLSKVMLMAIIVVGGIMVIQKRITIGELGAFSEYANNIIWPMEMVGWLSNDFAAAM ASNKKIKRIMAQKPDITEPDAPVVPEKIEGELQFKHVDFDLYGSRILTDIDFTLKKGKTL GIMGVTGSGKTSIVNLTQRFYDVTNGEILLDGIDIRKLPLSLLRSQTTVVMQDVFLFSDT 
ITDNIKTGNRDATEWESVEEASVCAEAHDFIMKLGEKYDTVIGERGVGLSGGQKQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:42:26 2011 Seq name: gi|229784045|gb|GG667690.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld83, whole genome shotgun sequence Length of sequence - 25869 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 15, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 19 - 78 3.7 1 1 Tu 1 . + CDS 106 - 882 340 ## Calkr_0213 two component transcriptional regulator, AraC family 2 2 Tu 1 . - CDS 1067 - 1189 58 ## - Prom 1271 - 1330 5.8 + Prom 1112 - 1171 7.8 3 3 Tu 1 . + CDS 1192 - 2982 975 ## EF2276 hypothetical protein 4 4 Tu 1 . - CDS 3899 - 5446 1720 ## COG0519 GMP synthase, PP-ATPase domain/subunit - Prom 5553 - 5612 4.2 - Term 5646 - 5694 2.5 5 5 Tu 1 . - CDS 5813 - 6961 852 ## COG1363 Cellulase M and related proteins - Prom 7162 - 7221 7.5 - Term 7254 - 7305 8.1 6 6 Tu 1 . - CDS 7345 - 8211 810 ## COG0789 Predicted transcriptional regulators - Prom 8419 - 8478 6.7 + TRNA 8734 - 8805 62.4 # Glu CTC 0 0 - Term 8916 - 8959 3.0 7 7 Tu 1 . - CDS 9198 - 9626 244 ## Dred_1461 hypothetical protein - Prom 9684 - 9743 6.8 - Term 9841 - 9884 10.8 8 8 Op 1 . - CDS 9909 - 10919 1090 ## Closa_2929 hypothetical protein 9 8 Op 2 . - CDS 11004 - 11696 668 ## COG0822 NifU homolog involved in Fe-S cluster formation - Prom 11813 - 11872 6.1 10 9 Tu 1 . - CDS 11874 - 12290 276 ## gi|288870930|ref|ZP_06115790.2| conserved hypothetical protein - Prom 12330 - 12389 7.2 - Term 12360 - 12396 10.3 11 10 Op 1 . - CDS 12450 - 12686 254 ## gi|266622856|ref|ZP_06115791.1| conserved hypothetical protein 12 10 Op 2 . - CDS 12756 - 14696 2121 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 13 10 Op 3 . - CDS 14760 - 16691 1832 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Prom 16723 - 16782 6.8 14 11 Op 1 . 
- CDS 16807 - 18015 925 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 15 11 Op 2 . - CDS 18080 - 19330 216 ## PROTEIN SUPPORTED gi|145223395|ref|YP_001134073.1| NLP/P60 protein - Prom 19373 - 19432 4.2 16 12 Op 1 5/0.000 - CDS 19436 - 20830 1519 ## COG0534 Na+-driven multidrug efflux pump 17 12 Op 2 . - CDS 20853 - 22217 1517 ## COG0534 Na+-driven multidrug efflux pump - Term 22240 - 22272 5.4 18 13 Op 1 5/0.000 - CDS 22310 - 23707 1433 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 19 13 Op 2 . - CDS 23707 - 24054 331 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 20 14 Tu 1 . - CDS 24975 - 25523 577 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases - Prom 25606 - 25665 6.3 21 15 Tu 1 . - CDS 25675 - 25869 185 ## Closa_2935 dihydrofolate reductase region Predicted protein(s) >gi|229784045|gb|GG667690.1| GENE 1 106 - 882 340 258 aa, chain + ## HITS:1 COG:no KEGG:Calkr_0213 NR:ns ## KEGG: Calkr_0213 # Name: not_defined # Def: two component transcriptional regulator, AraC family # Organism: C.kristjanssonii # Pathway: not_defined # 86 255 300 469 508 69 27.0 1e-10 MSFPDLTSEAAILQTAAQDISLLKNNLRRYLSIHTFAETMPVFDTVEHLRPYYLKIHHRL QVKPFNSDSDLSEEKEENSPLLLSLQEENELSESLLHLNEKRVTEIVTHIFESVQQTHVS LLELQRLVLRLTEIMQLALRTATPDHEAIAVQPPAIHNILDIDALKKQFLLYYHQGLEQL ISTSVFQYPPLIQKALIYIRQNYHLDISLSDISRHCGVSEVYFSRVFKDAMGLPFTKYLN TYRVRIASHLLLQSNDSL >gi|229784045|gb|GG667690.1| GENE 2 1067 - 1189 58 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKSSPLKKYYNSNGDGCHLFCEEIEFAYCKEIQSVSGTDV >gi|229784045|gb|GG667690.1| GENE 3 1192 - 2982 975 596 aa, chain + ## HITS:1 COG:no KEGG:EF2276 NR:ns ## KEGG: EF2276 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 72 589 7 513 583 263 34.0 2e-68 MKTKTKSILAALRSKFMAALPIILFFLFLFYSTILLFGISYSILVSVVTILFKVNYRKSF 
TVRQLMVLVGTQFLMALLSFFASLSLPLCILLNLIVPFLLVFLQTSQFNQLGYFANAMCF VFLQLRPVGREGLALQMAALAYGLAVLTIVLFFCSLRNKKNDHFSPAKRGLLLLADALRA QIEPKAHATEQGKPSDSPASGQAEPIFPILQGLYGEAYKSRGLTYVVSPRGKIQYMFALL FQRAVYFLTNPYQASMLSDPNRHDLLERLAQYMETAGTDGFEQEELVRTGSELLAEVEEC DEIPCIFIQNFLRLFLLILDNIRQDGGKQSSHGWKLPSYRRPLKKLFGQIKADTFETRFA LRLSAVLTIGFAYSMISQANHGYWLVLNAFLLLRPMYEESATRIKTRFIGTLAGCLILQL LLPLFHGTGWHFLLATFMAVGLYMETAGTWQQALFSTCFALTLTTLALPQMLAAELRILY VVSAMVLVLLVNRFVFPTHQKGQFRYNLYQLFHIHHVYLRLLESSLAAPLDYGVICDVQI HYHLIHDQIIQYLKKAGSEDSAFIKKLLWISWHMISEAEQMLFLINNRKTRTVNSA >gi|229784045|gb|GG667690.1| GENE 4 3899 - 5446 1720 515 aa, chain - ## HITS:1 COG:CAC2700_2 KEGG:ns NR:ns ## COG: CAC2700_2 COG0519 # Protein_GI_number: 15895957 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Clostridium acetobutylicum # 195 513 1 316 316 442 67.0 1e-124 MEREMIIVLDFGGQYNQLIARRVRECNVYCEVHPYTLGLDKIREMNPKGIIFTGGPNSVY KEDSPHCEKEIFEMGIPILGICYGSQLMAYMLGGSVATAPVSEYGKTEVDAEADSRILDS VSPKTICWMSHTDYIEKAPEGFRVTAHTPVCPVAGMECEERGLYAVQFHPEVMHTQEGMK MLSNFVYNVCKCSGDWKMDSFVESSIKSIREKVGSGKVLCALSGGVDSSVAAVMLAKAVG KQLTCVFVDHGLLRKDEGDEVEAVFGPNGPYDLNFIRVNAQERFYEKLKGAEEPEQKRKI IGEEFIRVFEDEAKKIGAVDYFVQGTIYPDVIESGLGKSAVIKSHHNVGGLPEHVDFKEI IEPLRLLFKDEVRKAGLELGIPEYLVYRQPFPGPGLGVRIIGEITPDKVRIVQEADAIYR EEIAKAGVDKGLGQYFAALTNMRSVGCMGDERTYDYAVALRAVLTSDFMTAESAELPWEV LGTVTRRIVNEVKGVNRVLYDCTGKPPATIEFEYS >gi|229784045|gb|GG667690.1| GENE 5 5813 - 6961 852 382 aa, chain - ## HITS:1 COG:BS_ysdC KEGG:ns NR:ns ## COG: BS_ysdC COG1363 # Protein_GI_number: 16079934 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Bacillus subtilis # 4 379 8 361 361 194 34.0 2e-49 MVDLRILEELSNAFGPSGFEEDVVRVVKKYCGGLAVRKDSMHNVYTEMTPSPLTAVSRGN CGIYRNDEIPSDRRPVLMLDAHLDECGFMIQSIMENGLMSIVTLGGFHLTSLPAHSVIVR TEEGKLHRGITTSKPVHFLTPEQKADNSLKIEDILIDVGACSREEVLTVYGIRPGDPVVP DVTYEYHEENGICFGKAFDNRLGCFCIIETMKALMAEEERLAARVVGAFAAQEEVGTRGA 
TVTAQQVKPDLAIVFEGSPADDPYVQLGTAQGVMRGGVQIRHLDNSYISNPEFIRYAHET GKKFGIPYQDAVRRGGSTNAGKISLTGQAVPVLVLGIPSRYVHSHYNFCAKSDIEAAVSM AVEVIRGLNAETVKRIFRQDLV >gi|229784045|gb|GG667690.1| GENE 6 7345 - 8211 810 288 aa, chain - ## HITS:1 COG:CAC3475 KEGG:ns NR:ns ## COG: CAC3475 COG0789 # Protein_GI_number: 15896713 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 2 287 1 243 243 132 34.0 7e-31 MMTVKQVSVLTGVSVRTLQFYDEIGLLKPTQMTESGYRLYDESALEVLQQILFFKELDFT LKEIKVIMENPQFDKKAAFEKQKELIRLKRDRLNGLLELLDKLIHGETCMDFKKFDMSEY FQALGEFKATHTAEIIRRLGSMEQFDELFGDLQSQEVEIAEAAVRQYGSIENFTKAMKEN FQNYLSEGPVISENEVDGIMEKTELLTKRLTADLTRDPSSAEVQAAVSELIAFCEESNRG IDMGENYWLLTAEAYQSNPVYLEVNDRKYGEGASKFIGSAIKEYLDRK >gi|229784045|gb|GG667690.1| GENE 7 9198 - 9626 244 142 aa, chain - ## HITS:1 COG:no KEGG:Dred_1461 NR:ns ## KEGG: Dred_1461 # Name: not_defined # Def: hypothetical protein # Organism: D.reducens # Pathway: not_defined # 1 141 25 165 167 126 41.0 3e-28 MEEVILEACKDLTILSRPNLNKILDELFVAKGWQRQAPVFSSEEAAYAKMDFLKDRIGVE VQFGHASFIGIDLLKMQIASYSNLDNIDFGVYIVTTNKMQKYLKSNYRLKWDGSLSYEKV IKYLPHIKSAIQVPIYVIGIDI >gi|229784045|gb|GG667690.1| GENE 8 9909 - 10919 1090 336 aa, chain - ## HITS:1 COG:no KEGG:Closa_2929 NR:ns ## KEGG: Closa_2929 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 336 1 336 336 603 94.0 1e-171 MALFESYERRIKQINEVLNSYGIASIEEAEKITKDAGLDVYKQVESIQPICFENAKWAYT VGAAIAIKKGCRKAADAAAAIGEGLQSFCIPGSVADQRKVGLGHGNLGKMLLEEATDCFC FLAGHESFAAAEGAIGIAEKANKVRQKPLRVILNGLGKDAAQVISRINGFTYVETEMDYY TGEVKELWRKSYSEGLRAKVNCYGSNDVTEGVAIMWKEGVDVSITGNSTNPTRFQHPVAG TYKKECNEKGKKYFSVASGGGTGRTLHPDNMAAGPASYGMTDTMGRMHSDAQFAGSSSVP AHVEMMGLIGAGNNPMVGMTVAVAVSIEEAAKAGKF >gi|229784045|gb|GG667690.1| GENE 9 11004 - 11696 668 230 aa, chain - ## HITS:1 COG:CAC2565 KEGG:ns NR:ns ## COG: CAC2565 COG0822 # Protein_GI_number: 15895825 # Func_class: C Energy production and conversion # Function: NifU homolog involved in 
Fe-S cluster formation # Organism: Clostridium acetobutylicum # 1 230 1 230 230 372 80.0 1e-103 MIYSQEVEEMCTVAQGVHHGAAPIPEEAKWVKSKEVKDISGLTHGVGWCAPQQGACKLTL NVKEGIIQEALVETIGCSGMTHSAAMAAEILPGLTVLEALNTDLVCDAINTAMRELFLQI AYGRTQSAFSEEGLPVGAGLEDLGKGLRSQVGTMYGTLKKGPRYLEMAEGYVTGIALDAD DQIIGYQFVSLGKMTDFIKKGDDPNTAWEKAKGQYGRVADAVKIIDPRHE >gi|229784045|gb|GG667690.1| GENE 10 11874 - 12290 276 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870930|ref|ZP_06115790.2| ## NR: gi|288870930|ref|ZP_06115790.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 138 8 145 145 264 100.0 2e-69 MKWLGTISLLFFAVIAISIGYMNYSRMSEKWQNCTADTEGIIVDFDSKISMSSSDKILYP VVRYFVNGKGYQITSGSGTNRQKLKVGDTVAVRYNPDKPKQMLIVGYNTKSTIYLCMAVM AMGVVVPFFSMILIWKKQ >gi|229784045|gb|GG667690.1| GENE 11 12450 - 12686 254 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622856|ref|ZP_06115791.1| ## NR: gi|266622856|ref|ZP_06115791.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 78 1 78 78 149 100.0 5e-35 MKRWLDTADEYITTWKIRDVALLKLCVCAAGFLLGMEIPRRHRNKAAIAASVVFSVTYVM IMLPFLKRLERNGKAEEV >gi|229784045|gb|GG667690.1| GENE 12 12756 - 14696 2121 646 aa, chain - ## HITS:1 COG:CAC1044_1 KEGG:ns NR:ns ## COG: CAC1044_1 COG1902 # Protein_GI_number: 15894331 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Clostridium acetobutylicum # 1 379 1 393 413 221 36.0 3e-57 MKSRLLEELKIGSMTVKNAVFMAPMSLGYESPDGTVNETMQEYWLARARGGVGCIITDAL SVDPDVPYLGNTLCFRNEESIASYRRFTDRIHEYGTKIIPQITHPGPESVSAFRGIPPVA SSVYLNSMAQRTRAVTLEEIPGIIQKYAKGAYDARRAGFDGIELHCAHAYMLLGSFLSPM RNKRTDRYGGSLDNRARLLCEVLDAIREACGPDFPVILRMSGSERDEQGNTLEDMKYLVP VLERHGVAAFEISGGTQYERCNKIIPCHGEVQGVNVPEAEALHEAASVPVIVVGKINEPR FAGYLVDSGKVDGVVLGRALIADPDFVNKMEAGDYEGIAPCAACAIGCVGEQTKRHPASC VINPAAGREKELELTAAEQPKRVVIVGGGIGGMACARAFAVRGHEVVLLEREMELGGQMR 
LACIPPHKQELSKWIVYLEHELERLSVDVRLNTAADRAVIDGLAPEVLILATGAKEMLPP VPGIDPEQAVTAWQVLSGETVIPGGNVLVVGGGMVGCEICEYLMHQKRGSLHITMIEMAD EIGAGMVVNNRVPTMIRLNRPEITMMTGTKLMSVNGSDVTVERHGVQETFGGFTHVIYAC GARPARALFDELKEAYPEAVLIGDASQPAQALEAVRQAVETAVRLG >gi|229784045|gb|GG667690.1| GENE 13 14760 - 16691 1832 643 aa, chain - ## HITS:1 COG:AF1262_1 KEGG:ns NR:ns ## COG: AF1262_1 COG1902 # Protein_GI_number: 11498861 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Archaeoglobus fulgidus # 4 348 2 344 354 205 36.0 2e-52 MYDRLFSPVTIRGLELKNRVVLPAMGTRMAGEKGEITPRLVAYHAARAAGGCGLNIVEVA AVHTPSAPAHFVSISDDSLIEGHKKLTDAIHQNGGKAAIQLWQGSIATTFDPRAEMLVVN DMEFAGKKLTAITPEGIREIVECYKHAAARCVEAGYDCIEFHCAHNYLPHSFLSGGLNHR TDEYGGSFENRSRFPLECIRAIRSAIPGDMPLFMRIDAHDDYLPGGLTIEEVIEFCCIAG ENGVDVLDVSRGNIITAGSVYEVPPIDIPRGFNVENAARIRRETKMVTIAVGRINTAEQA EEILEADKADLIVMGRAQLADPEFCNKAKEGRTEDIIHCVGCNQGCYDGFCDLRNRPFIT CLRNPCLGHEEEWKLTKAETSKRVYIAGGGMAGLEAARVLHERGHQPVIFEAGDRLGGQF ITAGQAPGKAEMKQAAVSFGRLVERSGVEVRCHTPLTAELIQEEKPDAVIAAIGSAPVAL KLPGSEKKMVCQANAVLEHEIEPEGDVCVIGGGLVGLETADALAAAGHAVKVLEMKEKAG EDLGMLRKIAVMQKLGGEGVEIITSALCLEINETGVVVDTPEGKRTVACDSVVLAVGSVS RDSSALREACESAGIEFTVIGDAKQARRAIDAVAEGFEAARSL >gi|229784045|gb|GG667690.1| GENE 14 16807 - 18015 925 402 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 301 392 429 520 525 67 38.0 4e-11 MKQQNDWYTGINHYDNYVEKEIGIERRSIDGPTEPYLHERVELFYVLSGKGTISVNGYAY DAVPGSLFCLYSHHFHHVEAVEEKLDVAVIRFHIGLFMYMSWERHPKHANARLVYDTRPV VQLSGEEREEVEALVKSLLREEKEARFERLNMIEYKTLELHAYYCRYAYEEIGKRPRKER TVWSMIEKVILATGEDITLEDMTEEEGISPETLNRRMKEACGFTFFQLKQFGKVINACAL LHFPELSMEYISDQLNFSSAAAFYKVFKQYCGMTPREYQAKSIGIMPPGSGAGPAMQFLQ YIHIHFSEDITLLDLCEKFCMKPYTVKQIFKQVFGNPFEELLAQVRVSYACSFLRTGDRT 
VLEIASLCGFNSHSTFLRHFKRWMNQTPEEYREIMRERGAGT >gi|229784045|gb|GG667690.1| GENE 15 18080 - 19330 216 416 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145223395|ref|YP_001134073.1| NLP/P60 protein [Mycobacterium gilvum PYR-GCK] # 226 414 151 347 348 87 34 6e-17 MNIKIQFRSNHFRKWLRRIGCSLLIGSMVCTQTVPAWATSKAEKEKKEAQQKLDEAKKKA EAAEANKNAVSAQVSKLTSDLTALLTDIQFLETDIADKEADIKQAEIDYDAAKKEEEKQY AAMKKRIQYMYEKGDTEYLDIFLQVKSMSDLLNKAEYVESIYSYDRKMLDTYQETKQQVA DYKAQLENDKAEMEVMELELQGQRSQLETTIANKKKEVSNFDAQLAQAKKDAADYAKTVE KKNQEIQKAKEEEARKQKAAEEARKKAEAEAAKKKAAATSTSKKTSANDKYTGPAAVKSS GGSAEGRAVADYGLQFVGNPYVYGGTSLTNGADCSGFVQQVYKHFGYSLPRSSSEQRSAG REVSYSEAQPGDLICYAGHVAIYIGNGQIVHASTPSTGIKVGTATYRTILSVRRIV >gi|229784045|gb|GG667690.1| GENE 16 19436 - 20830 1519 464 aa, chain - ## HITS:1 COG:MA2050 KEGG:ns NR:ns ## COG: MA2050 COG0534 # Protein_GI_number: 20090897 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 8 449 2 447 468 238 34.0 2e-62 MKKPMTQEERFVQMIETPVSTLIPRLAVPTIISMLVTSIYNMADTFFVSQIGTSASGAVG VMFSAMAMIQAIGFTLGMGSGNYISRSLGDRNGGEADKSAATAFFTALIIGILITIVGLT FLRPLVFLLGATETIAPYAMDYGIYILIAAPFMMSSFVMNNILRAQGNAMYSMIGITAGG ILNMILDPVLIFGFHMGIAGAAVATMVSQMISFGILFYQCNFREGSIRLRLANFTPTWKI YGEILHAGLPSFCRQGLASVATVILNFAAGPFGDSAIAAMSIVSRFMMFINSALIGFGQG FQPVCGFNFGAKRYDRVLEAFWFCVKVAITMLTCFGIIAFAVSRPIITAFRREDMEVIRI GTLALRLQILTLPLQAWVIMVNMLTQSIGYGFRASLVAMGRQGLFLIPSLLILPRICGIL GVQLAQPVADVFTFVLATFIVLRVLEELKAMREKENSNRKIKKI >gi|229784045|gb|GG667690.1| GENE 17 20853 - 22217 1517 454 aa, chain - ## HITS:1 COG:TM0815 KEGG:ns NR:ns ## COG: TM0815 COG0534 # Protein_GI_number: 15643578 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Thermotoga maritima # 12 451 18 455 464 147 27.0 6e-35 MSKKIDLLHGHIFTSLTGLALPIMATALVQTAYSLTDMAWIGLVGSPAVAAVGAASMYTW LSQGVVALAKMGGQVKVAHSLGSGKPEEAAKFAAGAIQMGVLFALMFAAVTVFGARPLIG FFGLSDGAIIHNAQVYLKITCGLILFSFLNAIITGILTAMGDSKNPFIANVIGLVMNMIL 
DPVLIFGIGPFPALGVTGAAVATVTAQMIVTIVFLVVVKKDRLVFDKVRFLERIPRDYFK VMVRIGLPAAVQNLIYTSISMVLTRFVAGFGDAAVAVQRVGSQIESISWMTADGFAAAIN SFIGQNYGGRQYRRVKKGYMTAVGVMFVWGLFCSGVLIGFPKQIFGLFIHEADIIPMGIS YLVILGFSQMFMCVELTTVGALSGLGKTLLCSVISVVFTSARIPLAMILSSSPLALDGIW WAFTISSVMKGILFFICFLYVSGKLPEDGALASW >gi|229784045|gb|GG667690.1| GENE 18 22310 - 23707 1433 465 aa, chain - ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 5 461 4 462 468 515 58.0 1e-146 MDMMKKVPVREQDPKVRATNFEEVCLGYNEEEARAEASRCLKCKNPKCVGGCPVSIDIPG FIKEVQEGNYEEAAKVIAKSSALPAVCGRVCPQESQCEGQCIRGIKGEPISIGKLERFVA DWSRENGFVPAAPEKTNGKKVAVIGSGPSGVTCAGDLAKMGYEVTIFEALHEPGGVLTYG IPEFRLPKEGVVQPEIDNVRKLGVKIETDVIIGKSVTIDELLDEEGFQAVFIGSGAGLPM FMGIPGENANGVFSANEYLTRSNLMKAFRDDYDTPIVAGKKVAVVGGGNVAMDAARTALR LGADVHIVYRRSEAELPARAEEVHHAKEEGIIFNLLTNPVEILTDDNGWVKGMKCIRMEL GEPDASGRRRPVAIEGSEFVIEVDTVIMSLGTSPNPLISATTKGLETNRRKCIVAEESNG RTTKEGVYAGGDAVTGAATVILAMEAGRAGARGIHEYLSGQEAAE >gi|229784045|gb|GG667690.1| GENE 19 23707 - 24054 331 115 aa, chain - ## HITS:1 COG:PAB1737 KEGG:ns NR:ns ## COG: PAB1737 COG0543 # Protein_GI_number: 14521153 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 1 91 174 264 278 131 63.0 2e-31 MVNQGKKYDVCVAIGPMIMMKFTCLMTKELGIPTIVSLNPIMVDGTGMCGACRVSVGGEV KFACVDGPEFDGHLVNFDEAMKRQQMYKTEEGRALLRLHEGATHHGGCGHCGGDE >gi|229784045|gb|GG667690.1| GENE 20 24975 - 25523 577 182 aa, chain - ## HITS:1 COG:MA3786 KEGG:ns NR:ns ## COG: MA3786 COG0543 # Protein_GI_number: 20092582 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # 
Organism: Methanosarcina acetivorans str.C2A # 2 179 21 191 299 157 47.0 1e-38 MYRILKAEKLADKIYLMDVEAPRIAKACQPGEFVIVRMDERGERIPLTICDYDREKGTIT IVFQIVGASTERMAGLKTGDSFIDFVGPLGNPSEFVHDDFEEVKNRKYLFVAGGVGTAPV YPQVKWLKEHGVSVDVIMGSKTKDLLILEDEMRAVAGNLYVTTDDGSYERKGMVTEVIKV AS >gi|229784045|gb|GG667690.1| GENE 21 25675 - 25869 185 64 aa, chain - ## HITS:1 COG:no KEGG:Closa_2935 NR:ns ## KEGG: Closa_2935 # Name: not_defined # Def: dihydrofolate reductase region # Organism: C.saccharolyticum # Pathway: One carbon pool by folate [PATH:csh00670]; Folate biosynthesis [PATH:csh00790]; Metabolic pathways [PATH:csh01100] # 1 64 99 162 163 105 68.0 6e-22 GGSSIYDQFLPYCDTVHVTFIDYEYSADTHFPNLDISEDWSLAAESDEHTYFNLCYSFRM YRKK Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:43:14 2011 Seq name: gi|229784044|gb|GG667691.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld84, whole genome shotgun sequence Length of sequence - 25679 bp Number of predicted genes - 26, with homology - 26 Number of transcription units - 12, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 303 282 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 2 1 Op 2 . - CDS 332 - 1813 1073 ## COG3119 Arylsulfatase A and related enzymes 3 1 Op 3 1/0.000 - CDS 1831 - 3324 1053 ## COG3119 Arylsulfatase A and related enzymes 4 1 Op 4 2/0.000 - CDS 3315 - 4058 251 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 5 1 Op 5 24/0.000 - CDS 4076 - 4852 266 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 6 1 Op 6 2/0.000 - CDS 4831 - 5616 607 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 7 1 Op 7 . - CDS 5613 - 6401 713 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component - Prom 6442 - 6501 3.6 8 2 Tu 1 . - CDS 6641 - 7810 1249 ## Bmur_0152 extracellular solute-binding protein family 3 - Prom 7871 - 7930 10.8 9 3 Tu 1 . 
- CDS 7975 - 8982 775 ## COG1609 Transcriptional regulators - Prom 9066 - 9125 7.2 10 4 Tu 1 . - CDS 9621 - 9815 155 ## gi|266622876|ref|ZP_06115811.1| conserved hypothetical protein - Prom 9991 - 10050 7.9 - Term 10085 - 10140 11.6 11 5 Op 1 . - CDS 10206 - 11552 1072 ## COG0477 Permeases of the major facilitator superfamily 12 5 Op 2 . - CDS 11590 - 12411 798 ## Clole_3366 glycoside hydrolase clan GH-D - Prom 12431 - 12490 80.4 13 6 Op 1 . - CDS 13334 - 14524 1107 ## Bcell_1103 alpha-galactosidase (EC:3.2.1.22) - Prom 14554 - 14613 1.6 14 6 Op 2 38/0.000 - CDS 14615 - 15445 813 ## COG0395 ABC-type sugar transport system, permease component 15 6 Op 3 35/0.000 - CDS 15458 - 16345 906 ## COG1175 ABC-type sugar transport systems, permease components - Term 16356 - 16395 4.1 16 6 Op 4 . - CDS 16413 - 17729 1679 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 17756 - 17815 5.8 + Prom 17743 - 17802 9.2 17 7 Tu 1 . + CDS 17950 - 18798 840 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 18 8 Tu 1 . - CDS 18806 - 19276 349 ## EUBREC_3503 hypothetical protein - Prom 19317 - 19376 13.4 19 9 Op 1 . - CDS 20278 - 20553 155 ## Apre_0680 hypothetical protein - Prom 20575 - 20634 1.8 20 9 Op 2 . - CDS 20649 - 21242 362 ## COG5340 Predicted transcriptional regulator - Prom 21273 - 21332 7.7 + Prom 21235 - 21294 6.5 21 10 Op 1 . + CDS 21498 - 22097 546 ## CDR20291_1887 hypothetical protein 22 10 Op 2 2/0.000 + CDS 22094 - 22885 557 ## COG0500 SAM-dependent methyltransferases 23 10 Op 3 40/0.000 + CDS 22925 - 23575 485 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 24 10 Op 4 . + CDS 23568 - 24956 1085 ## COG0642 Signal transduction histidine kinase - Term 24683 - 24720 -1.0 25 11 Tu 1 . - CDS 24967 - 25302 254 ## SpiBuddy_2052 phosphoesterase PA-phosphatase related protein + Prom 25116 - 25175 2.8 26 12 Tu 1 . 
+ CDS 25352 - 25495 87 ## gi|266622892|ref|ZP_06115827.1| conserved hypothetical protein + Term 25529 - 25558 2.1 Predicted protein(s) >gi|229784044|gb|GG667691.1| GENE 1 3 - 303 282 100 aa, chain - ## HITS:1 COG:TM0068 KEGG:ns NR:ns ## COG: TM0068 COG0246 # Protein_GI_number: 15642843 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Thermotoga maritima # 1 100 1 100 539 110 47.0 6e-25 MKLTLEGIKDKQVWNRAGIGLPSYDLETLAQRTTEAPVWVHFGIGNIFRIFIGSIADRLI TDGLMDRGIVCVETFDYETIDRIYKPYDNLCLSVILHNDG >gi|229784044|gb|GG667691.1| GENE 2 332 - 1813 1073 493 aa, chain - ## HITS:1 COG:PA2333 KEGG:ns NR:ns ## COG: PA2333 COG3119 # Protein_GI_number: 15597529 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 1 464 1 487 538 165 28.0 2e-40 MSQKPVNIVWFCTDQQRYDTIHALGNPYIHTPNIDKLAEEGVAFTRAYTQAPICTPSRSC FLTGRYPRTTKAFYNGNDNYSKDEKLVTKLLADSGYVCGLSGKLHLTSAQGRMEKRTDDG YSYIQWSEHPNDDWPGGENDYQNWLKEKGLRWEEIYGGKYTSMATWPPKPNPNFTGKTVG VPAEYHQTTWCVEKAIEFIEQNRGSGKPWLISLNPYDPHPPLDPPQEYKDRLKIEEMPLP LWKEGELDTKPPHQQNDYFQGGQDGQAEPYPTLTDEDRRERFRDYYAEIELIDDQFGRLM EYLEETGQREDTLVIFMSDHGEMNGDHGLYWKGAYFYEALVHIPLILSCPSLFRRGLKSD ALVELVDLAPTLMELTGQEVPYYMQGKSLLPILTGQADPHHHKDSVYSEFYHCLLGSHDN IYATMYYDGRYKIVNYHGMEYGELYDHETDPDEFCNLWDKPEYAMKKAELIKKNFDSAVM KNMDRSMHMINEY >gi|229784044|gb|GG667691.1| GENE 3 1831 - 3324 1053 497 aa, chain - ## HITS:1 COG:PA2333 KEGG:ns NR:ns ## COG: PA2333 COG3119 # Protein_GI_number: 15597529 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 7 471 5 490 538 174 27.0 5e-43 MVVNMEKKKKNIVWFCTDQQRWDTIHSLGNPYIHTPNIDRLVKEGVAFTRAYTQAPICTP SRACFLTGRYPRTTKTIFNGNEKFSKDEKLVTKLLSDEGYTCGLVGKLHLTSAEGRVEKR CDDGYSYFQYSHHPHNDWKDGGNDYQNWLNEKGVHWEEIYGGKFMTMATWPPQANPSFSG KQVGVPAQYHQTTWCVEKTIDFIETRRNSGEPWLISVNPFDPHPPLDPPQEYKDRLNVEE 
MPLPLWEDGEMEGKPPHQQKDVIQGGQDGQAEPIGSLTEEEKRERFRDYYAEIELIDDQF GRLLSYLDQTGLREDTIIIFMSDHGEMSGDHGLYWKGAYFYEGLVHVPLIISCPSIFKQG FLCDALVELVDIAPTLMEAAGLEVPYFMQGRSFYDILTGEADPHHFKDAVYSEFYHCLRG THEDIDATMYYNGRYKLIVYHGKEFGELYDHETDQNEFHNLWDKPEYEALKTELIRKSFD HTVLCNMDNSMHRVYGF >gi|229784044|gb|GG667691.1| GENE 4 3315 - 4058 251 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 240 2 242 245 101 27 6e-21 MSDQIKVQVRHLTKKFGDLLVLDDVSFDVMDGDLLCVVGPTGCGKTTFLNSLVNIYDITS GEILLDGQKVNTKKQNIAYIFQGNSTMPWLTVEENVGFGLLYKPLSDAEKKERTEKYLEI VGLSQYRKYYPKQLSASMLQRVSIARAFATEPELLLMDEPYGQLDIELRFRLEDELVKLW QMTKTTVLFITHNIEEAVYLGNKIMILTNKPTTVKTTLINELPRPRDIADPQFIELRNKV TELIKWW >gi|229784044|gb|GG667691.1| GENE 5 4076 - 4852 266 258 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 215 1 216 318 107 32 1e-22 MEQERKINTVIECKGVSKIFDTPEGKHEVVRGLNFAARENEFLVLFGPGQCGKTTIINMI AGFEQTTGGNITASGKKVDRPDVSRGVVFQSISLFPWMTAMGNVEYGPKIAGVSKEERKK RAQKYINLVGLQGFENSYPVQLSGGMKQRVGIARAYCNEPELLLMDEPFGALDAQTRYLM QEELQRIWSAEKRTVIFVTNNIEEALYLADRIIVLTNCPATVKKEYVIDLPRPRNLVSEK FLALRKEITGILDNSQGN >gi|229784044|gb|GG667691.1| GENE 6 4831 - 5616 607 261 aa, chain - ## HITS:1 COG:BMEII0107 KEGG:ns NR:ns ## COG: BMEII0107 COG0600 # Protein_GI_number: 17988451 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Brucella melitensis # 26 256 22 246 246 158 39.0 8e-39 MNENAKEKILSVICAVISIGAFLLLWWFATHHTALSQLLPDPVVVLEATLEYCTGVVGKY SLIIHMLSSLRRVLIGFCLGSLAGVILGLLMGRYLLARAIFNPIFRFIRPIPPIAWIPIS IIWFGLGEEAKIFLIFLAAFANTTLNAMTGAMNVDPEVVNAARMLGAKERQIFTTVIIPA SVPAIFAGLQVGISSAWASVVAAEMIKAENGLGWIIQAGMDNNNMTQILAGIIMIGVVGF LLAYIMRKAEEVMCRWNKSGR >gi|229784044|gb|GG667691.1| GENE 7 5613 - 6401 713 262 aa, chain - ## HITS:1 COG:Cgl1193 KEGG:ns NR:ns ## COG: Cgl1193 COG0600 # Protein_GI_number: 19552443 # 
Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Corynebacterium glutamicum # 37 261 37 256 256 124 33.0 1e-28 MKTNKKLNKYTIISCCTVLIVLAAWYVCINVLHLKQETVFPGPITVAKTFVQKMTSKVPD GATLPTHLMGSLKVALFGYSLGVLFGVPMGILMAWFKWVDRFVRPLFDLIRPVPGLAWIP MFILLFGIGILPKAAVIFLSTSVACIVNSYTGIKQTKQEHLWVGDVFGFSNLQKLYKIAI PTALPMIFTGLRVAMGAAWMALVAAELLASSVGLGYMIQQGRYASKPALVFVGMIMIGIV GLLFDTILHWLEVKVAKGMNAE >gi|229784044|gb|GG667691.1| GENE 8 6641 - 7810 1249 389 aa, chain - ## HITS:1 COG:no KEGG:Bmur_0152 NR:ns ## KEGG: Bmur_0152 # Name: not_defined # Def: extracellular solute-binding protein family 3 # Organism: B.murdochii # Pathway: ABC transporters [PATH:brm02010] # 85 379 49 342 347 80 22.0 1e-13 MVRFGRKVLAATLSASMVLAMTACGSGSLSKTTEKAAETTTAAETAKETTAGDTEEPAEE TAGAVMPKASGEKLVIGTLANWVGLPAWYAYAEGYYDEVGLDVEIVNFGSQGTLVNEAMA ADECDIAVSGMASVYALSTGMYKYIGDGCLTISGETLYGRPDDAIYKAGADANGVYGSPD VLKDAVILGPMNTTAQMNAIAYMQYFGIDANSFTFTNLDYSSALAAFESGQGDLIAVNPT YAAILAGDGYVQVSDFTQVSPQDIIDAVYCQNEVAEDRPEDVELFLYCYYKACQDMLNNQ DNRVAVAYKWYVDEGLDYKEQDVIDECSLKSYFTLDTVDSGDYLLGSFMNYAAEFLLANE KVTEDDVANVKASIDNSFVTSIKSWAAAE >gi|229784044|gb|GG667691.1| GENE 9 7975 - 8982 775 335 aa, chain - ## HITS:1 COG:CAC0360 KEGG:ns NR:ns ## COG: CAC0360 COG1609 # Protein_GI_number: 15893651 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 4 330 5 323 328 119 28.0 1e-26 MTSKEFAEIIGVSQSTISRALNNSSQISAERREFIQKKAEEYGFVLNSQARSLKTNRTGT VGILFPKHFIGMSSNLGLAHLYDLVQAELSKYDYDVMVVYDGEKQNGLSTFERMIKIGKV DGFIIFRMESSKKDLELIKKYKVPCVYLLHAEKAGSYGSCLSDCEYSGYLIGKLFGKYKE YKSIYINVKTSAESKTRLAGFKRGLKEHGVVMDKNAVWQSELSVKSAYECVISNQETVKS RKCAILAFNDTVALGALNACRDLGVKVPEDVQISGVGGLPIIAEIAPDLTTVNVFHERIA SMGCEMLRSAIIHGAQEPIHTVIRPELILRGTTLR >gi|229784044|gb|GG667691.1| GENE 10 9621 - 9815 155 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622876|ref|ZP_06115811.1| ## NR: gi|266622876|ref|ZP_06115811.1| conserved 
hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 106 100.0 4e-22 MVKVNKKYKDRLFRMVFNRKEELLSLYNAVSHSEYTNPEDLEIHTLRQQSCRVILNDIRV QNVD >gi|229784044|gb|GG667691.1| GENE 11 10206 - 11552 1072 448 aa, chain - ## HITS:1 COG:BS_yqgE KEGG:ns NR:ns ## COG: BS_yqgE COG0477 # Protein_GI_number: 16079556 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 39 407 3 369 430 95 21.0 2e-19 MTGNVVENASGDTFEAEPGDPREDVTGRPMRSKSSINWKIKCILGGTMLSRNYYLFLLVH TLFLIFTRIPAIFINTLLMNQTGDVNSTLFYNGAIFAACALTMSFSAQIMHKTGCRVTAT VGIIGYNILYLCYILFQPYVGTYYMLFGLFNGMADGFYYISYGQMILDYTDVTNRDSGMG IISTLSAGVNLTVPFLAGSILSAIGGVKGYVAIFILAFAVAFLAMAAVFRLPGAPEKEKK NVRYGAFLRLMGENRQIFYGLLGETLKGIREGTFMFILNIILYQLIKSEFLIGCNSLLTG LVSIVSFWFMSHTFRPDNRRGFMLLAVIGLTGITIVCYWFLNPAMVILFAVVNAFLAGMV EISCYTTFFDVSQSVKSIEDFSPELLAFHEIFVVAGRCIGLGAFAFINTVSGATLKAQIF SLLLLTLVQFGTVFMCRTAVREAVRTAG >gi|229784044|gb|GG667691.1| GENE 12 11590 - 12411 798 273 aa, chain - ## HITS:1 COG:no KEGG:Clole_3366 NR:ns ## KEGG: Clole_3366 # Name: not_defined # Def: glycoside hydrolase clan GH-D # Organism: C.lentocellum # Pathway: Galactose metabolism [PATH:cle00052]; Glycerolipid metabolism [PATH:cle00561]; Sphingolipid metabolism [PATH:cle00600] # 1 267 436 703 703 305 54.0 2e-81 MRDYATGVIKRLVEEYGVGYIKMDYNIDTGVGTDYHADSAGEGLLSHTRAYLNWLDKIFE MYPDLIIENCSSGGMRMEYSMLSRQSIQSVTDQTNYIKMAAIAANCATACTPEQAAIWSY PLVEGTGEETIFNMVNAMLFRIHQSGYLGQIGEQRMQYVREGIACYKKIREDIREGIPCW PTGLASMSDEYISYGLVNGGKMYLAVWRTGGEGEKSFEIPLKGSFSAAAEVSCIYPEKKE TAFTVNGNCLTVSLAPKSARLFLIVTETREQSE >gi|229784044|gb|GG667691.1| GENE 13 13334 - 14524 1107 396 aa, chain - ## HITS:1 COG:no KEGG:Bcell_1103 NR:ns ## KEGG: Bcell_1103 # Name: not_defined # Def: alpha-galactosidase (EC:3.2.1.22) # Organism: 
B.cellulosilyticus # Pathway: Galactose metabolism [PATH:bco00052]; Glycerolipid metabolism [PATH:bco00561]; Sphingolipid metabolism [PATH:bco00600] # 21 395 10 388 689 441 56.0 1e-122 MNIAENGIYLNLGLKKGEAARLLHFSSHPQEMEIGDNDASVYTLAELQASGFNQNDHHGL KHTGSSPSLLMTYESHRDYRNAYGRKLEVIQKYNGLELITHIQFYDGIQTVKFVNEVLNH SGEEYALEYVSSFALTGITEQTKGPRDKTGLLYIPHNTWFGEAQWKKYTLNELGYDAVNS FSMKRIAVTNTGSWACVEHLPMGCYYNTELDQSLMWQIETSGSWHWEISDIRGTLYVQAG GPTYQENGFLKILKPDEHFLSVPCAVSAAKGEFQEAVAELTKYRRRIRRKNEDNEKLPVI FNDYMNCLMGDPTTEVLKPLIDAAAESGCEYFCVDCGWYSDGHWWDGVGEWLPSDRRFPG GIGEVIRYIRDKGMVPGLWLELEVMGINCPMASQVS >gi|229784044|gb|GG667691.1| GENE 14 14615 - 15445 813 276 aa, chain - ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 8 275 15 281 282 197 39.0 2e-50 MEANNSKKHVVTHLLLILGSIVMLFPFIWMILSSFKTVGEQMSIPMKILPGSFGYLENYR QALEAVPFLRLYLNTILMILGRVICAVIFSSMAGYGFARMEFPFKRLLFTLVLMQMMVPG QIFIVPQYMMLSKIRLINTVFALIFPALVSAFGTFLLRQQYMALPKELEEAAVIDGCNPW QVFTKIMVPVTRSSLASLAIFTALFAWKDLMWPLIANNDLNRMPLAPGLTVLQGICNNNK GALMAGSVIATVPMLLLYIACQKQFIEGIAQTGIKG >gi|229784044|gb|GG667691.1| GENE 15 15458 - 16345 906 295 aa, chain - ## HITS:1 COG:BH1865 KEGG:ns NR:ns ## COG: BH1865 COG1175 # Protein_GI_number: 15614428 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 3 294 21 309 309 171 34.0 2e-42 MNKRKYKNYLWGYFFVAPTIIGLLILNIYPIFKTLLLSFSTSTGFEEYVFSGLNNFARVL KDREVWIGLRNTLIFSAVSVPGGIFISLLLAWLMSREIRGRSVYRVIYFTPMVAAPAAVT MVWAWLYNSEFGFLNYVIKSLGGATHSWLHDPKYALMSVIIIAIWGGLGQQIIILIVAIT NVPRVYYEAAEIDGAGGFVKLFKITVPLISPSVFFLTITGFIGSLTQFDLIYMLYNANTS KAMDSVRTIMYQYYRQAFVVQDKPYASAISIVALCLILIFTGLQFVAQKKIVHYE >gi|229784044|gb|GG667691.1| GENE 16 16413 - 17729 1679 438 aa, chain - ## HITS:1 COG:lin0220 KEGG:ns NR:ns ## COG: lin0220 COG1653 # 
Protein_GI_number: 16799297 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Listeria innocua # 47 434 29 416 418 135 27.0 2e-31 MKKNLAGLFVAAAVAVSLCACGSGGSGSTAAEKTSEAAAGTAAGGGDSVTLTYVTWNENQ RGQIQDTIDGFQKLYPNIKVDLQITPWGEYWTKLEAAATSGNMADIVTMHTNVVAKYVNG GKLAQLDDLTQYDDTFSYDNYPEGVTKLYTFGDAHYGVPKDKDCVVLVYNKELFDKAGVE YPNADWNWNDVEEAAKKLTDKENGIYGFAAYNHDQEGWGNFLYENGGSIIDETTHQSGLD KPESIEAMEWYMNMNANYSPSNEMMAEVNYIELFATGTVAMQTFGNWELSYFTENELVKD KFAITELPAGPSGIRATQMNGLALSVPSDCKNMEAAKKFVAYAGSKQGMSDSVNGPAIPA FTGVDADWAKAHEDLYDTGAILKSMDYGVQFVGTESKTRWSEVMNTYVAKIFNGEASVQE AFTQAAKEINEILATEKK >gi|229784044|gb|GG667691.1| GENE 17 17950 - 18798 840 282 aa, chain + ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 156 276 134 254 257 82 33.0 1e-15 MAYENAYQITNPDCVLLPEEICYMNWQKYYMDYHQHPAGSIEFNYIVEGTCYYNIAGEIY NLKKRNLMIVNGNTPHKLISPDSCINMSINYYQSELTPSIGTLSALVQAYPCLGTFFQNL NTGLVIQNARNIFHLLEEICNETNNKKDPYYLNILVNKALIEVVRLLSAEKSPTDTYVEQ TKNYINYHFFSIENIDEIASYVALNKVYLQKIFREKTGLTIWNYLTKYRMEKAVYFLLYS DTPISDIDELVGINSRQNFYLLFKKQYGMSPSEYRKRYKLTL >gi|229784044|gb|GG667691.1| GENE 18 18806 - 19276 349 156 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3503 NR:ns ## KEGG: EUBREC_3503 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 4 153 133 284 287 133 47.0 2e-30 MSVPLKLDITTGDKITPKEIEYKFKLLLEDRSISVLAYNLETVLAEKLETVISRGDQNTR PRDYYDVYIIKKLQYGNVKPEVLKEALKTTVEKRGSANIVKNYKRIMKNVKASEVMQRQW RNYRKDFEYAKDIMFEETCDTVITIMNEIDSEMIER >gi|229784044|gb|GG667691.1| GENE 19 20278 - 20553 155 91 aa, chain - ## HITS:1 COG:no KEGG:Apre_0680 NR:ns ## KEGG: Apre_0680 # Name: not_defined # Def: hypothetical protein # Organism: A.prevotii # Pathway: not_defined # 2 91 35 124 278 94 47.0 1e-18 
MLERISESAYQKNFILKGGFLIASIVGLDTRTTMDMDATIRGLPVNEQSVREMFEEICRI KLNDDVSFTFRYIEEIREGDEYTGYRVALTS >gi|229784044|gb|GG667691.1| GENE 20 20649 - 21242 362 197 aa, chain - ## HITS:1 COG:MT1074 KEGG:ns NR:ns ## COG: MT1074 COG5340 # Protein_GI_number: 15840474 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Mycobacterium tuberculosis CDC1551 # 1 98 14 115 207 59 35.0 3e-09 MNATEAIIQMAEKNNGIVTNMMVVDAGISRGNLKYLVDRGRLEKVSRGVYTLPKVWEDEL FNLQMRFKRGIFSYETALFLCDLTDRTPNYYNMTFPSNYNITNPKVENVRCTQSKPSLYD IGITSLFTPGGNEVNAYCRERTLCDIISPRSRVDIQIIADAYKRYTSKKDKNIPLLSEYA KTFKVEKRLRSYLEVLL >gi|229784044|gb|GG667691.1| GENE 21 21498 - 22097 546 199 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1887 NR:ns ## KEGG: CDR20291_1887 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 190 1 190 195 268 74.0 9e-71 MPPKAKITKDMILHTVLHITRETGFETVNARSIAGKLQCSTRPIFTCYENMEELKKDFLD FAYEYYLQYVSSFSSSEQVIPSLLLPLSYIEFAAEETHLFKLLFMNDMDLNMAEAKDFYN EADNAKKAAAFSETIGVELERAKVIFLDLFLYTHGIAVLTAAGKIAFDRGSAEKMLMNLL SAFINQEPSDAPNSRRSLI >gi|229784044|gb|GG667691.1| GENE 22 22094 - 22885 557 263 aa, chain + ## HITS:1 COG:CAC0728 KEGG:ns NR:ns ## COG: CAC0728 COG0500 # Protein_GI_number: 15894015 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 7 263 19 271 272 238 47.0 8e-63 MTANQYLTDFYNQYDEDNRLAYRHGSVEFLTTMRYIEKYRKPGNRILEIGAGTGRYSHAL ARQGYAVDAVELIGHNIDIFRKNTQAGETITITQGNAMDLSAFPDNMYDITLLLGPLYHL YTNEDKQQALREAIRVTKPGGVIFAAYVISDGCLLDEGFHRGSKNVAEYMEKGLLDSRTF AARSEPKDLFELVRKEDIDDLISALPVTRLHYVAADGCALFMREAIDAMDRDTFELYLKY HFAVCEREDLAGITSHALDILQK >gi|229784044|gb|GG667691.1| GENE 23 22925 - 23575 485 216 aa, chain + ## HITS:1 COG:CAC3220 KEGG:ns NR:ns ## COG: CAC3220 COG0745 # Protein_GI_number: 15896467 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response 
regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 209 7 223 228 159 42.0 5e-39 MKLLIAEDDPEMQKILKLYLQREGYLVNVVSDGRAAVDFLTEHPVDLVLLDWMMPVQNGI QTLKEIRLLNIPVKVLMLTAKGENESEVTGLSCGADDYLRKPFDIQVLLLRIKKLCNAAD VLQFHEIRLNPVTMEVTKDHKKMVLTKTEFELLKYFLSNQNIVLSREQLLNHVWGMDYGG DPRTVDTNIRRLRKKIGEDLIQTRIGMGYMMGGSHD >gi|229784044|gb|GG667691.1| GENE 24 23568 - 24956 1085 462 aa, chain + ## HITS:1 COG:CAC2434 KEGG:ns NR:ns ## COG: CAC2434 COG0642 # Protein_GI_number: 15895699 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 11 441 7 468 492 125 21.0 1e-28 MTKENRRFHRLSTKFICGTASILVLVLAATLFVNSKVAARYYLRQQTEYVSQAGSRIREY LDAGMSPDEAVASLEASDNVLITYASNTSDYDALSSSLREKFREKGLGFQKFWLWDQDYL SAVQKGFQFRLYQQDKLNYGILVEYIPVGPHLYAVAAIVPNTADFIGIINRFSILLHAIS LIITVILLYLLVKHITNPLRRMERFSQEIARQEYGTLIINTRDELEHVADSMNQMSISIQ QYQSMLQAKNQQMEQLLSDVAHDLKTPISLIGMYSSGIRDGLDDGTFLETIIQQNSRMSQ MTERLLNLSGIERKEYPLTTIRLDLLLFECIAEQKLFLSKRGLSLNTAVTPNLEIMGNAE LVTELFSNLLSNAVKYAASGTVNAELVKNEGQCCFRITNDLGNDDLDTERIWEPFYVGEP SRNKALSGTGLGLPIVKKIADKFGYQVRCIREDQTITFEVIF >gi|229784044|gb|GG667691.1| GENE 25 24967 - 25302 254 111 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_2052 NR:ns ## KEGG: SpiBuddy_2052 # Name: not_defined # Def: phosphoesterase PA-phosphatase related protein # Organism: Spirochaeta_Buddy # Pathway: not_defined # 4 92 59 147 214 78 46.0 9e-14 MVNVSEWLGAVAMATAFGFIMAGLFQLVTRRSIWKVDVPILVLGAFYGILAACYAFFEVV VINYRPVILTQGLKASYPSSHTMASICIMVTALTALFCLALQYIGERSAKA >gi|229784044|gb|GG667691.1| GENE 26 25352 - 25495 87 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622892|ref|ZP_06115827.1| ## NR: gi|266622892|ref|ZP_06115827.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 47 1 47 47 85 100.0 2e-15 MLENMKLLGKGNTAEVFDYGNKRVCKLFYEGYPDKYVALEFRNSKEM Prediction of potential genes 
in microbial genomes Time: Fri Jul 1 01:44:00 2011 Seq name: gi|229784043|gb|GG667692.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld85, whole genome shotgun sequence Length of sequence - 32471 bp Number of predicted genes - 31, with homology - 30 Number of transcription units - 15, operones - 6 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 2700 2515 ## COG5263 FOG: Glucan-binding domain (YG repeat) 2 2 Op 1 . + CDS 2853 - 3422 447 ## gi|266622895|ref|ZP_06115830.1| putative tungsten formylmethanofuran dehydrogenase subunit B 3 2 Op 2 . + CDS 3419 - 3586 161 ## gi|266622896|ref|ZP_06115831.1| methionyl-tRNA synthetase 4 2 Op 3 . + CDS 3669 - 4322 547 ## ELI_0596 hypothetical protein 5 2 Op 4 . + CDS 4327 - 4659 258 ## gi|266622898|ref|ZP_06115833.1| DNA-binding protein 6 2 Op 5 . + CDS 4750 - 4866 115 ## 7 3 Tu 1 . - CDS 4851 - 5435 414 ## COG3544 Uncharacterized protein conserved in bacteria - Prom 5657 - 5716 7.9 + Prom 5516 - 5575 8.1 8 4 Tu 1 . + CDS 5657 - 5968 302 ## gi|266622901|ref|ZP_06115836.1| conserved hypothetical protein + Term 6136 - 6178 -0.7 9 5 Op 1 . - CDS 6057 - 6800 716 ## Closa_1908 ABC transporter 10 5 Op 2 . - CDS 6793 - 7521 312 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 7548 - 7607 4.5 + Prom 7566 - 7625 2.2 11 6 Tu 1 . + CDS 7667 - 8119 493 ## COG1671 Uncharacterized protein conserved in bacteria 12 7 Tu 1 . - CDS 9055 - 9945 871 ## Closa_2639 hypothetical protein - Prom 10034 - 10093 4.1 + Prom 9916 - 9975 2.9 13 8 Tu 1 . + CDS 10070 - 10747 464 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 10749 - 10805 12.5 - Term 10737 - 10793 16.3 14 9 Tu 1 . - CDS 10814 - 12145 680 ## ELI_1104 hypothetical protein + Prom 12368 - 12427 7.5 15 10 Op 1 . 
+ CDS 12447 - 12602 62 ## gi|288870940|ref|ZP_06115844.2| conserved hypothetical protein 16 10 Op 2 16/0.000 + CDS 12571 - 14415 1900 ## COG0642 Signal transduction histidine kinase 17 10 Op 3 6/0.000 + CDS 15370 - 15879 532 ## COG0784 FOG: CheY-like receiver + Prom 15907 - 15966 5.3 18 10 Op 4 . + CDS 16005 - 17729 1711 ## COG2200 FOG: EAL domain 19 10 Op 5 . + CDS 17774 - 18052 350 ## gi|266622913|ref|ZP_06115848.1| toxin-antitoxin system, antitoxin component, HicB family + Term 18100 - 18152 7.0 + Prom 18114 - 18173 1.8 20 11 Tu 1 . + CDS 18221 - 19357 980 ## COG1470 Predicted membrane protein + Prom 20259 - 20318 18.4 21 12 Op 1 24/0.000 + CDS 20344 - 21381 329 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 22 12 Op 2 2/0.000 + CDS 21308 - 21913 560 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 23 12 Op 3 . + CDS 22908 - 23231 322 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component + Term 23294 - 23343 1.7 + Prom 23272 - 23331 2.4 24 13 Op 1 40/0.000 + CDS 23361 - 24047 974 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 25 13 Op 2 1/0.000 + CDS 24044 - 25303 1112 ## COG0642 Signal transduction histidine kinase + Term 25351 - 25406 -0.3 + Prom 25392 - 25451 6.2 26 13 Op 3 38/0.000 + CDS 25493 - 27151 1526 ## COG0747 ABC-type dipeptide transport system, periplasmic component 27 13 Op 4 . + CDS 27159 - 27674 470 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 28 14 Op 1 49/0.000 + CDS 28601 - 28996 339 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 29 14 Op 2 13/0.000 + CDS 28993 - 29796 729 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 30 14 Op 3 . 
+ CDS 29844 - 30341 178 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 31 15 Tu 1 . + CDS 31281 - 32429 279 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 Predicted protein(s) >gi|229784043|gb|GG667692.1| GENE 1 1 - 2700 2515 899 aa, chain + ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 733 895 480 620 621 75 34.0 5e-13 SIHTGKGTPWTDLNLPQTVPDPYYIFAGWFDESGNRVQESQVILASQHYTARFVPASLED DGILAMPDARGAVAGDGSGKVTVSGANEERRYALTDGEHRILQIKTGAQLQNSDFTGLYP CTLYYVYELSMSAEPAVGTILPEHTDASDYGPPSQVTVPALGSNYQVSRDTAEGRLRLTI NPAAYDTYYAIMDIDGNTLFVPGSDADGWVLAESTRNEAVLGDLEPNQLYIVVAKHAGEM GTPADRQTMGTQVSVIYAGQEEKYYTISLTNGGYIDEIIRNGSVIETEEHAVSAIVKKGD TVKINAEITGYNGQAFKQWNPLIGSPQIFVTRRNQTITMPGQNVVLQAVYEMPSTATPGN ATVDYAPRDGKNALDLSENYFETLIQTLTDNAEDREAMTAGIDVEYTVKFNRRAPKASES NAVKAEMWEDSESVKIPWALSSALTRKVGGTNKGLPEDGNPAPGIRVYSEIDESMLGYLD YRLWKVENPEGTPVCTEIPMTPDPNGDGADFTGTIAFDTHVGETYLLTYLKAHEITVVDT KRAAVHRLKVKSQTSLEETEAFGNLNIFEDYTDPVTGIVYEYGGLGKTESASSEYDITTP VTKAATLYVLYRAADDSEWQAARQKLMEQIAIAQTLMNNSSVSEENRQLLSSAVVSAQEV VNRTYRPAIEELLQEYETLKALVDSISSGGPVDPDDPNNPNNPDDPENPNNPENPNHPGG SGGGSGSGSRGRSLGSGYTFNRYRTYYVGTDGYWEQTGMNGGQWAFVLYSGQRLKNQWAN VKYNDLQQLYTYHFNAEGIMDTGWYQDAEGEWYYLNSQQGRDEGCLRIGWYFDEKDTKWY YLNQFTGGMLVGWQNVGDNWYYLSPVKAAGHPLGSLYTGELTPDGYPVDENGSWIRETP >gi|229784043|gb|GG667692.1| GENE 2 2853 - 3422 447 189 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622895|ref|ZP_06115830.1| ## NR: gi|266622895|ref|ZP_06115830.1| putative tungsten formylmethanofuran dehydrogenase subunit B [Clostridium hathewayi DSM 13479] putative tungsten formylmethanofuran dehydrogenase subunit B [Clostridium hathewayi DSM 13479] # 1 189 1 189 189 335 100.0 1e-90 MIGENGTEEYSDEIMTKKREMEQQMQAMAVTDWAGIMMVCIHGQVENRIYGAIVNRCLEE PVSFCGLDQLVLKMDEICELAGSPMLSMDPRFLQEKRKEDYLGIGKKRKKGRQKTAGWQN 
MFDILKQQAALSREVFEVQILFRSNATMQGRLRCSLSGKKYVSFRSALELLRMLKEEELE FCGKRGKTI >gi|229784043|gb|GG667692.1| GENE 3 3419 - 3586 161 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622896|ref|ZP_06115831.1| ## NR: gi|266622896|ref|ZP_06115831.1| methionyl-tRNA synthetase [Clostridium hathewayi DSM 13479] methionyl-tRNA synthetase [Clostridium hathewayi DSM 13479] # 1 55 1 55 55 91 100.0 2e-17 MMICTGCKLKNKRGDSICEICGALKKVRELEQFQEEWRMRLQAEDGQKTAVRGLV >gi|229784043|gb|GG667692.1| GENE 4 3669 - 4322 547 217 aa, chain + ## HITS:1 COG:no KEGG:ELI_0596 NR:ns ## KEGG: ELI_0596 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 3 123 15 135 795 97 38.0 4e-19 MNEYETIIEGMRFGVSKHMMDRHFTVLWANKYAYELTGNTRDGFRSRYHNRVDEYYSEDS GTFAYVTKRIEDAYRNRKECYSFECPMRPNRGKVAWIRMNGRFTGEVYQGSPVMWNVFYD ITDSHREQEELKVKSELLDSEIEKAEQMIQWTTGFLAELNSEIRNMANIIVGMADVAEAC IDDPAKCEECLEKIVQATRRLRTKVGNSRHMFRQEAG >gi|229784043|gb|GG667692.1| GENE 5 4327 - 4659 258 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622898|ref|ZP_06115833.1| ## NR: gi|266622898|ref|ZP_06115833.1| DNA-binding protein [Clostridium hathewayi DSM 13479] DNA-binding protein [Clostridium hathewayi DSM 13479] # 1 110 1 110 110 185 100.0 1e-45 MKYNRTEAALRIRNRRKALGLTSAEVAEKIGRADHYYGDIERGTCGMSIDTLVELTRTLE LSSDYILFGAENENGNSNAAKAYRILRKYDEQRQEHAVELMKYYLNLEER >gi|229784043|gb|GG667692.1| GENE 6 4750 - 4866 115 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIMARKKKQVQEIVDGNPPVRILRTGGLLRLQNCQHGL >gi|229784043|gb|GG667692.1| GENE 7 4851 - 5435 414 194 aa, chain - ## HITS:1 COG:all4988 KEGG:ns NR:ns ## COG: all4988 COG3544 # Protein_GI_number: 17232480 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 42 181 72 216 225 64 28.0 1e-10 MTKQEPMSIVTDKYLTCFHNILDQMVQQMNGAKLTNSISCNYIVQMIPHHKAAIEMSCNI LQYTTLIPLQEIALSIIKEQTKSIETMEAIRSRCQMFCNQQQKIELFLCHCRQITQTMFD DMRNACSTNDINADFIREMIPHHRGAIQMSRNALQFPVCQELKPVIQSILVSQQNGIREM ECLLKTIENCHRPC >gi|229784043|gb|GG667692.1| GENE 8 5657 - 5968 302 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622901|ref|ZP_06115836.1| ## NR: gi|266622901|ref|ZP_06115836.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 103 1 103 103 194 100.0 2e-48 MWNYTKCTAKDRIEKLNRLTEVFHEESDCLDYLSEVQSCINSIQTSLVLLQDVDANIMIE MDYVPVNNYSFHVVCIEENGIKTAGRSCKNIEQAARQFLANIK >gi|229784043|gb|GG667692.1| GENE 9 6057 - 6800 716 247 aa, chain - ## HITS:1 COG:no KEGG:Closa_1908 NR:ns ## KEGG: Closa_1908 # Name: not_defined # Def: ABC transporter # Organism: C.saccharolyticum # Pathway: ABC transporters [PATH:csh02010] # 1 247 1 247 247 288 68.0 1e-76 MNKFFTMLKTELKLSFRGMDMMIFAVCMPVVIVILLGIIYGGKPAFDGADYTFLEQSFGA VSTIAVCAGGVMGLPLVISDYRQKKILKRFKITPSSPLLLLSIQVAIYMIYSLIALVLIY LVSALLFGMSLKGAFLPFIGAFFLVMISMFSIGMLVGGVSPNIKIASVLASLLYFPMLIF SGATLPYEVMPAALQKVADVLPLTQGIKLLKATSLGLPAENGMASLIVMAVIAVVCGGLS IRFFQWE >gi|229784043|gb|GG667692.1| GENE 10 6793 - 7521 312 242 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 207 1 213 305 124 34 6e-28 MQTTIQVSGLTKSYSTLHVIRNLDLTVSRGEVFGLLGANGAGKSTAIECMLGTKKADGGS ISILGMSPQKDRRKLFQQVGVQFQEANYPDKIRVDELCEETACLYAAPADYRKLLQQFCL REKETNFVSELSGGQRQRLFIVLALIPDPQVVFLDELTTGLDTKARRSVWKSLQELKSRG LTIFLTSHFMDEVEELCDRIGILKNGEFAFCGTAAEAVTLSPGKTLEEAYLWFTGEEADG DE >gi|229784043|gb|GG667692.1| GENE 11 7667 - 8119 493 150 aa, chain + ## HITS:1 COG:CAC2825 KEGG:ns NR:ns ## COG: CAC2825 COG1671 # Protein_GI_number: 15896080 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 143 1 143 148 212 69.0 2e-55 
MKILVDADACPVVKLVEQTAEQYQIPVVLLCDTSHVLQSSYSEIKIIGAGTDAVDFALVN QCAKGDIVVTQDYGVAAMILGKGAYGIHQSGKWYTNENIDQMLMERHMAKKIRMGKGRHH LKGPAKRTEEDDERFKESLCRLIEEVTHGG >gi|229784043|gb|GG667692.1| GENE 12 9055 - 9945 871 296 aa, chain - ## HITS:1 COG:no KEGG:Closa_2639 NR:ns ## KEGG: Closa_2639 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 290 1 287 380 259 50.0 8e-68 MYQLLALINGIILSVMISVNGNLSVQYSAFTAAAIIHAVGSLFALLLCAFQKNRKPVWPH RPRWIYLGGAIGVFTTVFQNLAFTCISITSVIALGLLGQTVTSLVIDSSGLFGMEKRPLP RYALIGLSLSVFGIFVMLDTTVTTAALGAAISFASGISMVLSRTANARLAEKTGALQGSL VNHLTGLPITVGLALAAAAISGTPVTVSAGGSFRLWIYLGGALGVIVVMLLNITVPKIPG FQLTLLVFVGQVFTGILLDSAAGSYHTGTSLWGGIIIAAGIAVNMILEQVTAVLAS >gi|229784043|gb|GG667692.1| GENE 13 10070 - 10747 464 225 aa, chain + ## HITS:1 COG:ECs3055 KEGG:ns NR:ns ## COG: ECs3055 COG0664 # Protein_GI_number: 15832309 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Escherichia coli O157:H7 # 30 212 21 202 219 75 25.0 7e-14 MKRLPLKKEHTACLAEYGLQNVPISACCCLRFAAGEDVLREGHPIGWIAIVVEGNAKVCR TAPNGKSLILCYNISHGLIGEIELMTRQVNATSSVKAVTELECVAVSYQNCLAELKTNLV FLNKLGTELAEKLTASSDNFTSTALCTGEQRLCAYVIQNSHRGLFCDTLTDVSCSVGMSY RHMFRLLGKLCEEGMLEKRESGYCIVDQEELARRSRTAAELYQRV >gi|229784043|gb|GG667692.1| GENE 14 10814 - 12145 680 443 aa, chain - ## HITS:1 COG:no KEGG:ELI_1104 NR:ns ## KEGG: ELI_1104 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 4 352 65 420 500 123 26.0 2e-26 MHNGMATTVIFRQTRQEAADYILNYFSERWDTYFDLYASSELIFIPLLVEGLKRLDEEDL DVLSRLEERAEADDLIHFYCRVLQKMENPLALNLFWETTLFQGFPFARMEPHPIHFDTAL VRQRLKNLRSLLSEGNWELIRNTLLEYQRSDVRVIMQNLEPILRSRPRPPQIPFVWRVYR SRPQICYHLASRILNDIYMGEYREMQFLPSYEKMAQLYGASVSTLRRTIRILNQAGVCRT LNGKGTRIFPLGTPCEPPDFESPAIRRNLSFFVQSFEILIYSCDGVTRSFLSSLSQDRIE ELTGLLEECLNTRQGELSIWYYLIFVSQYSHLTGVLEIYKTIYSLFLWGYPLKASSGNLT 
DLDCVIMHFTEAMVRHLKEHDYDQCAADVKELLFKQFPAARQYLIGHGIPPEELRISPSI RLFITDTEALECRPENRGGSPES >gi|229784043|gb|GG667692.1| GENE 15 12447 - 12602 62 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870940|ref|ZP_06115844.2| ## NR: gi|288870940|ref|ZP_06115844.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 51 39 89 89 96 100.0 6e-19 MNKVAGRMIARKTIWMKEESQWDYFRRRTEGQSERVREESGDDRTGKRVCR >gi|229784043|gb|GG667692.1| GENE 16 12571 - 14415 1900 614 aa, chain + ## HITS:1 COG:VC1349_3 KEGG:ns NR:ns ## COG: VC1349_3 COG0642 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 277 537 1 259 260 189 42.0 1e-47 MTEQEREYAAECSSRETVTADYEYHTLMNLLKVSVSKHLLDESFTLVWANEFYYDLIGWP KEEYEANFHNRPDLYYQNDQQLWAELAGTVMAALEAHESGYQLVTQMRKKNGDFVWVHLT TQFADEYINGYQVAYSVMTNIDKLVKMQKEQSVTYESLPGFVAKYRIDADMNLTLLFANS RFVEYFGEGDEDNPLYLRNIEDNREIISRYREEILKGAPVHFVISARSRDGQNMWLQVNA TCIEWQNGCPVYLAIFIDITDVTQLREMQRKLTEQTEALKDALAVAEHANQAKTDFLSRM SHEIRTPMNAILGMTTIAAAYIDDQRRVEDCLEKIGYSSKHLMALINDVLDMSKISEGKV RIAHEVFDLETVVESISSIIYPQAAARGLTFQVPLIDLSNTVMIGDSLRLNQILLNLLSN ALKFTPAGGTVRLEIRQLQRKEDRIRLRFTVLDTGIGMSDAFLERLFVPFEQEDNRVAQN SGGTGLGMPITKNLVTLMGGTITVKSEVGKGTVFTIELDFDTPLDKDRVIPQKTHELESL KVLITDDDRDSCIHASLLLKKMGILSDWVLTGQECVDRIRQSHQNGTDYDVCLVDLKMPD MDGVEVARKVREEV >gi|229784043|gb|GG667692.1| GENE 17 15370 - 15879 532 169 aa, chain + ## HITS:1 COG:VC1349_4 KEGG:ns NR:ns ## COG: VC1349_4 COG0784 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Vibrio cholerae # 1 154 98 249 250 118 46.0 5e-27 MDMFLTKPVFASTLYNALLSVTGIDRVVRAPAEKKIRPELAGRHVLLAEDNELNREIAVE LLQMAGVTVDWAANGRIALDKFLSGGDSYDLILMDVQMPVMDGYQATAAIRKSAHGRAQT IPIIAMTADAFHEDIVKAEAAGMNGHLSKPIDPDLLYEKVAEYISSDHA >gi|229784043|gb|GG667692.1| GENE 18 16005 - 17729 1711 574 aa, chain + ## HITS:1 COG:all4225_2 KEGG:ns 
NR:ns ## COG: all4225_2 COG2200 # Protein_GI_number: 17231717 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Nostoc sp. PCC 7120 # 301 556 4 260 260 200 38.0 8e-51 MPAKRTILIVEDNEINRMMLSELLSSEYMVLEAENGQQALDVLERKKDEISVILLDITMP VMDGYTFLSIAKEDPSLASIPVIVTTQSDSEADEVAALSHGATDYVAKPYKPQIILHRVA SIIHLRETAAMINQFQYDRLTGLYSKEFFYQRVRETLRQNSGQNYYLICSDIVNFKLIND VFGVAAGDRLLCGIADMYRRYVGEGGICGHLDADRFACLLEFDDVFTEQMFISANEEINR LQNAKNVAVKWGIYSVGNQNVTVEQMCDRALLAARSIKEQYGVYYAAYDDKLRDELLRQQ AIIDSMETALKEGQFLVYLQPKYRIWDETLAGAEALVRWKHPEWGFQSPAEFIPLFEKNG FITKLDQFVWDKTCFYLRKWDEEGYPPIPVSVNVSRADIYHADIADIMMRTVTKYGLTPS RLHLEITESAYTEDPGQIIETVRHLRELGFVIEMDDFGSGYSSLNMLNQMPIDILKLDMQ FVQSETAKPMDQGILQLIMELARRMHLSVVAEGVETKTQLDRLSETGCEYVQGYYFARPM PEGEFETLMRDQAVKVPVGISGRASDKNLYKGAE >gi|229784043|gb|GG667692.1| GENE 19 17774 - 18052 350 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622913|ref|ZP_06115848.1| ## NR: gi|266622913|ref|ZP_06115848.1| toxin-antitoxin system, antitoxin component, HicB family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, HicB family [Clostridium hathewayi DSM 13479] # 1 92 1 92 92 175 100.0 1e-42 MTFIYPAVFTPHKDDKGYHVEFPDLQCCEADGPDLEDAVEHAREAAYNWLYLEVEEGTYE FPNQTHPEDIKLEEGAFLKYIMVTIKLLPDND >gi|229784043|gb|GG667692.1| GENE 20 18221 - 19357 980 378 aa, chain + ## HITS:1 COG:BH3215 KEGG:ns NR:ns ## COG: BH3215 COG1470 # Protein_GI_number: 15615777 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 31 378 8 351 385 172 34.0 9e-43 MPMKHQKEDIMEEAIVTRPIEKRRIWAFLAAFMAAVLLLFLNVPAAYAAEGLKLSTDYPG ISVKPGDNLNIPVKLDNDTGASLDANVKVSAMPDGWEGYLQGGNYQVSRIHVKNGGDGAS MTLHVTVPKELTEGTYTVEVGASTESGLSDTLSISFLVNEMKAGKGSFTSEYPQQEGTAG TNFSFSTTLINNGLKSQSYSLSSNAPAGWNVSFTPTGETTKVAAIEVESSTSKGMTVSVT PPEGVPAGEYDISCSAVSAEETLSTDLKVVITGSYGLKVGTPDGRLSFDAYAGKESDVTL NVTNTGNVDLQDVSLNSSLPSGWTVTYNLENNVIPSIPAGSTTEVIAHVKPGSEAITGDY VNTFTAKCEQTQSSADFR >gi|229784043|gb|GG667692.1| GENE 21 20344 - 21381 
329 345 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 13 316 12 309 312 131 28 6e-30 MDIAEHYEDEMIIKTDHLTKIYGEKKAVSDLSLEIRKGEIFGLLGPNGAGKTTTTLMLLG LTEPTAGSAWIDGKDCTRQAIEVKRMVGYLPDNVGFYSDMTGRENLRFTGQLNGLPGKVI EERIDDLLQKVGMTEAADQKTGTYSRGMKQRLGIADVLMKDPKVVIMDEPTLGIDPEGMR ELMFLIRSLAEEDGRTILISSHQLYQIQQVCDRVGLFVEGQLIACGRIEDLASQMRREGR YVLEAGAVPDDKGFKELLEHLGNVEEIERSGDFYVIYSSSDLSRELNRKALEGGYSLCHL RQRGNDLDEIYRTYFEKGEKSNGSLNQRKNKGFLTFGREQRQGEA >gi|229784043|gb|GG667692.1| GENE 22 21308 - 21913 560 201 aa, chain + ## HITS:1 COG:BH3213 KEGG:ns NR:ns ## COG: BH3213 COG1277 # Protein_GI_number: 15615775 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Bacillus halodurans # 40 201 33 202 345 132 45.0 3e-31 MAVSIKGKTKASLLSAENSARERHDGTGGSLRTGMSGMGALYHKEMSDHIHSKRFLIVLL LILLTSCASIYGALSSLTPEQAADSGYIFLKLFTLSGNSIPSFTSFIALLGPFVGLALGF DAINSERSEGTLNRLVAQPVYRDAVINGKFLAGAAIIFLMVFSMGLIAGAAGLLATGIPP SGEELGRVIVLLFFTSVYICF >gi|229784043|gb|GG667692.1| GENE 23 22908 - 23231 322 107 aa, chain + ## HITS:1 COG:BH3213 KEGG:ns NR:ns ## COG: BH3213 COG1277 # Protein_GI_number: 15615775 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Bacillus halodurans # 11 105 249 343 345 85 42.0 2e-17 MAANVLYPVGQTATASRILDNYNCQMSLNRLSPYYLYSEAVSTIMNPQVRSTNIILPQQL SGAINGYLSLGQSMLLVWPHLTGLLALTAVVFAISYISFMRKEIRGR >gi|229784043|gb|GG667692.1| GENE 24 23361 - 24047 974 228 aa, chain + ## HITS:1 COG:CAC0653 KEGG:ns NR:ns ## COG: CAC0653 COG0745 # Protein_GI_number: 15893941 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 219 1 220 221 154 39.0 1e-37 
MRVLIVEDNMRLAESIRDILKHWYDCDICGDGNTGYFLLTDGGYDAAVLDLMLPGMDGIT LLKKARKSGCFVPVLILTARSETEDRILGLESGADYYLTKPFDMQELLAVMKTILRRRGE MIPECLNLGNISLNQADYTLKGQEKSIRLGKKEYDILRILLVNRDMVVSKETLLSKVWGS DGEAVDNNVEIYISFLRKKLEFLKADVVIVTIRRLGYRITEKNGGGLE >gi|229784043|gb|GG667692.1| GENE 25 24044 - 25303 1112 419 aa, chain + ## HITS:1 COG:BH1945 KEGG:ns NR:ns ## COG: BH1945 COG0642 # Protein_GI_number: 15614508 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 159 387 223 443 462 114 36.0 4e-25 MKEISRLRLKFICYNMLIVTAVIGITFCAAAFIMEKRVGAQGRQALAKVVSQEEHPLIFT TISPAQIPYFSVIVGEDGMVTPWEGGYESFPGQDFLEQMAWLSMAGEEDMGILEGYHLRY LRVSRPTGYMIAFADTSYEDSLRSGVMKYGGLACAGIWLGFLVLSYFFSRWAVKPVEESI RMQKQFVADASHELKTPLTVITANTELLSEQYTGISAEADKWLEHIMQECREMRSLVESL LMLAKNDALTQKKGTFSVFSLSDLVMEKVLTFEPVFYQEEKQLEYQLDEEVQMLGDPEQM GQLIKALLDNAVKYSLPKGKTVVKLETAGRRRIRLWVNSQGEAIPEDKRSLIFRRFYRGD SARSSCSGYGLGLAIAAETAGKHRGRIGMEYRDGMNCFYVRLNCHKSCKRLGKNIGKFD >gi|229784043|gb|GG667692.1| GENE 26 25493 - 27151 1526 552 aa, chain + ## HITS:1 COG:FN0192 KEGG:ns NR:ns ## COG: FN0192 COG0747 # Protein_GI_number: 19703537 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 71 552 39 512 515 362 39.0 1e-99 MEEAMRLRKSAYFMFGVMAAVFLATGCGSQEKSAAEAVQTTESAKYAEQTGEEGNGAAKD RTEQGHLVETLKLAGGTDWGVPSPYLNASRGPGSAKMNLVFASLIDEDETGDIPWLAQSW EINGSDYTFTLFPETKFHDGTPLTTEDVAFTIDYFREHPPVSNPLGAGDSFLIDHYTVVD DRTITISVKESAADTLSSIGSFVIIPKHVWEKVEDPNTYTGDGYLTGSGAYRCTAYDGAS GSYEFTTFDDFRGGKPAADRVLFVPVSDALLAFENHEIDITGMPADLKDKYLSDSGIGVV EKANDMGYKLLINFEKCPGFLDLDQRKAVYAALDRQAVVDKVFRGAGSVGSAGYVPQGSL YFNDRCVTYPYDPEKAKAALESKQYEVTLLAADSGSDVDIAELLKQDLEAAGMKVTVTAF DSATRDEKVNAGNYEFALVGNGGWGNNPPKYMRTIFSDLSKNKGGNPHSMGPIGYSNSEI TKLAEEQMKEVDFEARKQMFKDLEYLVSEEIPLIVIANQSSYSMYRRDYYDGWMKTYAYQ QTEQNRLSFMER >gi|229784043|gb|GG667692.1| GENE 27 27159 - 27674 470 171 aa, chain + ## HITS:1 COG:MA0842 KEGG:ns 
NR:ns ## COG: MA0842 COG0601 # Protein_GI_number: 20089726 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 8 169 12 175 327 97 38.0 9e-21 MKRRVLCMIGTFLVIYLFNFFLPRMMPGDPLLYSSTVSGEDMDLEYSKEQLDQMRAYYGL DQPVFTQLVQTVKKNLKGDLGLSIHYKKPVAAVLRERLPWSLSVMGAALILSLAAGTFLA LLGMQKKWIDRTVFKILSALTEIPAYLIGLLLLFLAAAKISWLPLSGGLAS >gi|229784043|gb|GG667692.1| GENE 28 28601 - 28996 339 131 aa, chain + ## HITS:1 COG:FN0193 KEGG:ns NR:ns ## COG: FN0193 COG0601 # Protein_GI_number: 19703538 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 129 188 316 318 97 39.0 7e-21 MPVLSLVIALFPGFYFTARASFLEVMAKPYLLNARAKGLKAGRIRWSYIFRNGMTPIVVR FFLNVGTAVGGTLLVENVFAYPGLGLVMREAVRYRDYMMIQGFFLMSTVLVLTSLFLADV LNAEIDREELK >gi|229784043|gb|GG667692.1| GENE 29 28993 - 29796 729 267 aa, chain + ## HITS:1 COG:MA0843 KEGG:ns NR:ns ## COG: MA0843 COG1173 # Protein_GI_number: 20089727 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 3 263 74 332 342 199 40.0 4e-51 MKRIKRYLFLLFLLAAVCLYGLTLYPHSPVKSSGTSLEAPSAVHFLGTDNLGIDIFAQLA KGFYLSMMTGVTAAVCSFAAGGVLGVCAGYCGRRTDAVISFFTNVFLSVPQLPVMIVIGA FWGQSTWNIILIVSAFSRASIAKQVRAKVMSVRNRNYVVLARSYGAGPWYIVRKHMLGEI FPLLLVNAVAVTGRTIIQESSLAYLGLSDPLAKSWGLMIQNASSFTGIYFTDYWKWWLIP PVAMLVFTIFCLRLLAQALEEWLLRDR >gi|229784043|gb|GG667692.1| GENE 30 29844 - 30341 178 166 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 166 3 178 329 73 28 2e-12 
KKEETMKEILKVCDLTVKYPGFTLEPCSFSMEQGEILAVAGESGSGKTTLARALSRLLDE EAAVRGEVVMAGSSLYEMKEKDVRLVRMETFSIVFQNSGEWLNPAMKMKEQLRETLIRKY PVREHRDRMKALMEMTGLEEEDLERYPDQLSGGMAQKFLLANAMAL >gi|229784043|gb|GG667692.1| GENE 31 31281 - 32429 279 382 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 145 362 1 217 245 112 31 4e-24 MILDEPTSSLDADSRKSFIELIRRIRDEYGTAFLLITHDLRLAKDLSSRILILYGGMIVE AGTTEQILGHPKHPYTQGLLRASMDLNLLKDIWGIRPSEEDGQGGCPFFGRCSQKLDICR TKRPLLTPHGGGRQVACNRGGIVTLMEARHIGKWYGKQEVIRYVSLEIEHGEFISLVGKS GVGKTTLTRILGGYTDEFMGTVLFNGETADYQSLHKSRHGVQMIFQDSNQSVNPSMTVRE AVSEPLLLSGGENADIGKDRIRQALQDVGLPFTECFVDTRVKKLSGGQKQRVCIARALTM EPALMIADEPTSMLDPSSKANVLRFLKGLQNSRGFSILMVTHDLTCAAKVSDRIYLLKNG RLQPFTPFVNRIECSIYQMEEF Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:45:10 2011 Seq name: gi|229784042|gb|GG667693.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld86, whole genome shotgun sequence Length of sequence - 19891 bp Number of predicted genes - 19, with homology - 17 Number of transcription units - 7, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 2 - 697 321 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase 2 1 Op 2 . - CDS 699 - 1760 821 ## COG0477 Permeases of the major facilitator superfamily - Prom 1924 - 1983 4.7 - Term 1799 - 1835 -1.0 3 2 Op 1 . - CDS 1996 - 2697 631 ## CDR20291_0534 two-component response regulator 4 2 Op 2 . - CDS 2675 - 3070 446 ## gi|266622929|ref|ZP_06115864.1| 30S ribosomal protein S16 - Prom 3181 - 3240 28.6 5 3 Op 1 . - CDS 4233 - 4346 63 ## 6 3 Op 2 . - CDS 4336 - 4431 91 ## 7 3 Op 3 . - CDS 4479 - 5789 1156 ## COG0477 Permeases of the major facilitator superfamily 8 3 Op 4 . - CDS 5878 - 6777 520 ## COG0280 Phosphotransacetylase 9 3 Op 5 . - CDS 6790 - 8232 796 ## COG1757 Na+/H+ antiporter 10 3 Op 6 . 
- CDS 8263 - 9336 771 ## COG3426 Butyrate kinase - Prom 9478 - 9537 5.6 - Term 9402 - 9445 2.9 11 4 Tu 1 . - CDS 9575 - 10339 766 ## COG1414 Transcriptional regulator + Prom 10508 - 10567 4.6 12 5 Op 1 . + CDS 10643 - 10861 229 ## gi|266622937|ref|ZP_06115872.1| putative keto/oxoacid ferredoxin oxidoreductase, delta subunit 13 5 Op 2 23/0.000 + CDS 10863 - 11924 908 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 14 5 Op 3 22/0.000 + CDS 11926 - 12678 400 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 15 5 Op 4 . + CDS 12681 - 13244 457 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 16 6 Op 1 35/0.000 - CDS 14306 - 16108 197 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 17 6 Op 2 . - CDS 16105 - 17835 212 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 18 6 Op 3 . - CDS 17881 - 18309 515 ## gi|266622943|ref|ZP_06115878.1| transcriptional regulator, MarR family - Prom 18391 - 18450 7.6 19 7 Tu 1 . 
- CDS 18497 - 19681 1352 ## COG0025 NhaP-type Na+/H+ and K+/H+ antiporters - Prom 19711 - 19770 3.3 Predicted protein(s) >gi|229784042|gb|GG667693.1| GENE 1 2 - 697 321 232 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 14 208 16 211 345 128 35 3e-29 MKMKRLGTLPDCSDVLEVMIANGSGMTASIMTYGAVIKNLYVPGENGKADDVVLGQNTLE EYRRNPSCSAAVIGRVANRIRGGEFHVNGRGYWLERNDRGNCLHSGSAGYATKNFHIAAG GDDWVTLYWKDSAADGFPDSVSLEVTYRVTEDDALDIRYRLVPEEDTPVNLTNHAYFNLS GGRDADVLNHELRLMADFYTPAAPDMIPTGEIRKVEGTNLDFTGRRKLSEVL >gi|229784042|gb|GG667693.1| GENE 2 699 - 1760 821 353 aa, chain - ## HITS:1 COG:TM1032 KEGG:ns NR:ns ## COG: TM1032 COG0477 # Protein_GI_number: 15643790 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Thermotoga maritima # 2 341 47 367 381 79 26.0 7e-15 MQVGILSAVLPVASLTVQPLWAALADRTGKRRCVLILLNLCCSAAVLLFPGTESFREILG AVICYALFNSAVLPVCDALVVNQAEEKRINFAWIRMCGTISYAVIVLGMGFYLKKRINLM FYGNSACFLVFALLCLTLPGDRKTWKVQEPDTKNPDTGKFKAKADKGPIFKSKEIYPVLV FAFAMQFGINYYSAFLGVYLLELGYSQSLLGVLNCISALSEIPVLLLIHRLSRRFKETAL LTAAVGCMALRLILLSAGNVPLMAAAQLLQGPSYMVCYFVCVTYINRMVMPGKISQGQST LAVVQMGAGSLSGSLLGGVLASAYGIREGFLIIAFLLLAVTAAAGITAGWRRS >gi|229784042|gb|GG667693.1| GENE 3 1996 - 2697 631 233 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_0534 NR:ns ## KEGG: CDR20291_0534 # Name: not_defined # Def: two-component response regulator # Organism: C.difficile_R20291 # Pathway: not_defined # 34 224 43 231 242 77 24.0 5e-13 MGKILLYDPEKRFAGLEWALSAGDAADYRALHTFTDKDQLMFYLEEVNEDAEILIVDLVH QKEKGTGVARTCKKKIHHLKLVLVVDRMDWPEHLFDLQPDSLLFSPLDKDEFQRTILRLE QMAQQEYKRCLTLVTKGKIYRIPYQKILYVESMGHQLCVKTWEEEIRSYGKLDDLLDRLP PYFLHCHKSYLINLNYVRQLEMYRIRLEGEEEWIPVSQKYYKHIKNSLLAEAE >gi|229784042|gb|GG667693.1| GENE 4 2675 - 3070 446 131 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|266622929|ref|ZP_06115864.1| ## NR: gi|266622929|ref|ZP_06115864.1| 30S ribosomal protein S16 [Clostridium hathewayi DSM 13479] 30S ribosomal protein S16 [Clostridium hathewayi DSM 13479] # 1 131 1 131 131 235 100.0 9e-61 MVRILGVFIPGKEERRKRDEEVLRHYFVYGAKHRDKVGELLEELIPGEKREHLILYYMQI KDRLEMNGGQKFEEAVRQIKRKYIIISANDTVNRYYKAVMEADAAVQEDLCFPCADEIRK MVEQDGKDSTV >gi|229784042|gb|GG667693.1| GENE 5 4233 - 4346 63 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDFSAKPNLPLYLSECMDICDSVLDRIMVENGIETYL >gi|229784042|gb|GG667693.1| GENE 6 4336 - 4431 91 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKQERFQVSKMEAFFFFGVKSYFYSIVKDGF >gi|229784042|gb|GG667693.1| GENE 7 4479 - 5789 1156 436 aa, chain - ## HITS:1 COG:YPO1668 KEGG:ns NR:ns ## COG: YPO1668 COG0477 # Protein_GI_number: 16121932 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 18 406 9 396 411 140 28.0 4e-33 MSNSKVSITGRQLLIVFFMGLAFAVVYATPFVQYVFYDDLAGALHASNTQLGFLIAIFGI GNLLAPLGGALSDKFNTKKVYLLGMFITCALDFLLAMKMSYAFAVFIWAGLAVAGLILYF PAHTKLVRLVGDEESQGTIFGFTESACGLASVIINFVALGLYAKVAVGAMGGVAGLKAVI ISYGAVGLLATVALIFLLPNPEKAGSSADGGEETKLSVKEWIGVFKDPRTWFAGIAVFAT YSTYQTLSYFTPYFTNVLGATVIGSGAIAIIRTYGIRIIGAPLGGYMGDRIHSVSTVIST VLAFAAVITLIFMFMPAGVPSVLLTAMTLVIGFMVHIARGGMFAVPSEVKIPCRYAASTA GFVCAIGFCPDLFQAAMYGHWMDTYGNAGYTRIFIYIIIIMLIGVVNGIATAFYKKKYVV TGRVPVEGGSEGDRAS >gi|229784042|gb|GG667693.1| GENE 8 5878 - 6777 520 299 aa, chain - ## HITS:1 COG:CAC3076 KEGG:ns NR:ns ## COG: CAC3076 COG0280 # Protein_GI_number: 15896327 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 1 297 1 298 301 212 40.0 6e-55 MIKNFKELRERVKSSRPLVVSVAAAADTELLLAVKAAVDDGFIQPVLTGDKEKIEDICRQ INLVPTEILGAETETEAVERAVREVHNGNAQVLMKGLVNTSIYMRGILNREWGLRTGRLL 
SMMAAYEAPGYHKLLFCSDSGINVAPGLEQKKDILKNLLYAVENMGIHTPKVAVLTANEM VDPKVPSTMDAAGLVEAVKTEDGFLPCIIEGPIAFDVAFDPQAAAHKKIDSKITGDVDLV IFPNIEAGNIMGKSWIHLCHSSWAGIVLGASNPVILGSRSDTAEIKMNSIALACLAAQT >gi|229784042|gb|GG667693.1| GENE 9 6790 - 8232 796 480 aa, chain - ## HITS:1 COG:FN1422 KEGG:ns NR:ns ## COG: FN1422 COG1757 # Protein_GI_number: 19704754 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 11 466 3 450 473 293 36.0 6e-79 MNENHKEKLQPGIGMSMVVFGAAIAVLLIGVVLLDYDIHIMLLCALAIVCLASVPLGYSF MDLVGCMKKSLGQALAAMIIFIFIGVIISSLIFSGTVPALIYYGLQYIIPDYFLPIGLLL CSLTSISIGTSWGTVGTMGLAMMGIGTGLGIPAPLTAGMVISGAFFGDKMSSISDSTNLA PAAVGTTLYAHIGAMWKTTLPSYIITLIAFTVLGLSYRGGNFSMDTVELFLAAIGGRFHI SPLVLVPMAVLLILNLRKYPAVPSMAIGTVLAVLVAVFYQGYPLKEVIAGLNYGYTEPTG VELVDKLLLRGGIQSMMYTFSLSCIAIAFGGVMEHVGYLPCIIEVVIHKVKRDKMMVPVV IASTTLGTLTMGEVFLSIVINGSLYKDAFKKRGLKPEMLSRLLEEGGTLTQVFIPWSTSG VFIFSTLGMSVSQYWKYAIFNYVNPLLSIVLAMFGIYIMRIGDEERKSIDHRLSKRHKLG >gi|229784042|gb|GG667693.1| GENE 10 8263 - 9336 771 357 aa, chain - ## HITS:1 COG:CAC3075 KEGG:ns NR:ns ## COG: CAC3075 COG3426 # Protein_GI_number: 15896326 # Func_class: C Energy production and conversion # Function: Butyrate kinase # Organism: Clostridium acetobutylicum # 4 357 1 355 355 369 52.0 1e-102 MKLLMKLLIINPGGTSTKIAVFEDEKQILKLNITHTQEELSEYKRVFDQFDYRRKLILDA LSQEGIEISDLSCVVGRGGLMNPIPGGTYRVSNKMVEDLRNAVHGEHPSNLGSALAKSIG DEIGIPSFVVDPVSVDELMPVARISGIEELERPSWFHALNQKAVARWAAERIGKKYEESN LIVAHLGSGNSIVAHKEGKMVDGSGGRTNGPFSPERSGGLPTYPLVKLCYSGKYTHEEMV DKISSTGGIYDYLGTKDAEEVERRISQGDEKAKLVYEAFVYNVAKEICSYGAVFDGKVDC IVLTGGIAHSEYVVNEVRRMTGFMAPIEVVAGEFEMTALALGGLRVLRGEEVPKEYE >gi|229784042|gb|GG667693.1| GENE 11 9575 - 10339 766 254 aa, chain - ## HITS:1 COG:STM4187 KEGG:ns NR:ns ## COG: STM4187 COG1414 # Protein_GI_number: 16767437 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Salmonella typhimurium LT2 # 5 251 24 267 274 139 31.0 4e-33 
MADTIQALDRALELLIYLNKKGKEVGITQIASDMGVYKSTIYRTLSTLEAKGFVKRNPET EKYWLGSQLFILGKSVENKMGIPEVIRPYAHRLHQKFHEVVNVSILEYNPGDVYHSVIIL KEMNDRQMLTVNPPVGSLSECHCSSVGKCLLAFGDKVDLSIYEGKVLTRYTPNTITSLDE LKEELVKVKRRGYAMDHEEQELGLTCIGAPILDQNMKAVAAISLSGPTSRITSSDIEDRI EAVCNIAKEISNNF >gi|229784042|gb|GG667693.1| GENE 12 10643 - 10861 229 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622937|ref|ZP_06115872.1| ## NR: gi|266622937|ref|ZP_06115872.1| putative keto/oxoacid ferredoxin oxidoreductase, delta subunit [Clostridium hathewayi DSM 13479] putative keto/oxoacid ferredoxin oxidoreductase, delta subunit [Clostridium hathewayi DSM 13479] # 1 72 1 72 72 125 100.0 1e-27 MKVIVNKDRCKECGLCIHHCPKQAISKCNTLNAGGYYPVQVDDEACIACGMCYITCPDGV FYISGDIEEEVQ >gi|229784042|gb|GG667693.1| GENE 13 10863 - 11924 908 353 aa, chain + ## HITS:1 COG:TM1759 KEGG:ns NR:ns ## COG: TM1759 COG0674 # Protein_GI_number: 15644505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 4 309 8 321 356 270 45.0 2e-72 MSDMMKGNEAIAEAAVRAGVKFYAGYPITPSSEIMEYLSWRLPETGGSFVQAESELAGIN MVIGAAASGVRALTASSGPGISLKQEGVSTLSDEGLSAVVITQVRYGNGLGTLLSAQCDY NRETRGGGHGDYRCLVFAPSSVQEFVDLMRPAYELAEKYRVVSVMMGEATLGQMMEPVTL PEFVEPERTPWALNGHYTFKKIGIFQRDSMQEAVELVEKYNTIRENEQRWADGGIEDADY VFVSYGVPGRSTLGAVRELRAQGEKVGIIRPITVWPFPEKAFHKINPEVKGIITVEANAT GQLVDDAALYTKKALKERNIPAYALTYVFGTPTIRNIKKDYYEIKSGNKKEIY >gi|229784042|gb|GG667693.1| GENE 14 11926 - 12678 400 250 aa, chain + ## HITS:1 COG:MA2909_1 KEGG:ns NR:ns ## COG: MA2909_1 COG1013 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 20 245 37 261 296 168 37.0 7e-42 MSASKKEKKVPALVDKPMDFCPGCGHGIISRLIMECLDELNQNDNIIFPIGVGCSSNLGA GLECDRLHCSHGRAGAVATGMKRVNPDVLIVSYQGDGDAYNIGIAETFNAAYRNENITVV 
VVNNTNFGMTGGQMSLTTMEGQKTSTSIYGRDCSVTGYPLRFPEMVAREFPNAAYAARGT VTSPAKINQLKGYIKNGLMAQMNHEGYSVIEVLSPCPVNWGLTPIKAMERIENELVPYYP LGEFKKREVK >gi|229784042|gb|GG667693.1| GENE 15 12681 - 13244 457 187 aa, chain + ## HITS:1 COG:MA2909_2 KEGG:ns NR:ns ## COG: MA2909_2 COG1014 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanosarcina acetivorans str.C2A # 7 173 13 177 186 102 37.0 3e-22 MKEIVFAGSGGQGVLTAGLIISDIAAKEGLNVTWVPSYGSAMRGGTANCTVKYCENTIYN PSQEEPDVLLAMNNPSFKKFLPLVTPGGIVVIGDLVDIPEDARKDVIYVRVPATEISTDQ GNPKSANIVMTGAIVKLMGDFSKEAAITAMNHMFEKKGKSQFREANEKAFDAGYDAVEVM VQDSCVR >gi|229784042|gb|GG667693.1| GENE 16 14306 - 16108 197 600 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 362 579 8 226 245 80 28 8e-15 MKQGTVSKTDKKINTRGTLGRLFRFMKPYRIRIILMVVCLVMGAVFTTQGPYTLGRAMDA LVAVVVNDAGVLQGFQAFLTVLLQLGCVYLLAFLFNYSGQYIVAGVAERTMHDLRMAVDK KIRRLPLAYFDSNTFGDVLSRVTNDVDTIDTSLQQSISQVITAVCTMVFIFGMMLVVSPI LTLIGVCVIPVCGLVSMKVVKHSQRYFRGQQSALGDLTGYVEEMYNGQNVIAAFGKEEDI ITNFEDINNRLYDNGWKAQFSSSIIMPLTQALTNIGYVGVAVVSGWLCINGRLSIGMIQS FIQYLRQFSQPINQISNIANIMQATMAAAQRVFEFLDEQEEVPEAEDLKFPQNPEGCVDF SHVRFGYTEGQTLIHDLDLHVQAGDKIAIVGPTGAGKTTLVNLILRFYDVKGGSITIDGV DVRDMKREALRSMIGMVLQDTWLFAGTIKDNIRYGRLEATDEEVIAAARAAHAHGFIMSM PGGYEMQLHEGASNIAQGQRQLLTIARAFLSDPEILILDEATSSVDTRTEVAIQKAMNKL MEGRTSFVIAHRLSTIKDAELIVYMEHGDIKEVGNHKELLAKGGYYAALYNSQFAAENAG >gi|229784042|gb|GG667693.1| GENE 17 16105 - 17835 212 576 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 333 557 131 354 398 86 25 2e-16 MKKIFAELKAFRLELLCVLVSVAAGVGATLGLPTYLSDIINRGIADKDMNYILHTGIIML GIAILGMVCNITTGFFASRIALGLGRNVRSRIFSKVEYFSQAEIDTFSTASLITRTNNDI TQVQNFMVMFLRVILTAPIMCVMGIILAYSKNPKMSSILVVSMPVMVLIISLIGRRAMPL 
SRKMQTRIDRINLIMREKLTGIRVIRAFGTEDYEEKRFDGANKDLMNNAMKMMHAMSLMG PSLILILNLTVVGLLWRAGQGAGTEPVMPGDILAIIQYVMQIMMSVTMLSMIFVMYPRCA ASADRICEVLDTENSIGDPEAPKTSEGQRGYLTFRDVSFYFPGAKEPAVSHVSFEAKPGE TTAIIGSTGSGKTALVGLIPRFYDVQEGEVLVDGVNVKDYDRKTLRKKIGYVPQKALLFK GTIMENIRFGDDSASDERVKEAAAIAQSTDFIEDKPEGFDSFIAQGGSNVSGGQRQRLAI ARAIVRKPEIYVFDDSFSALDFKTDSALRQALARETENATVVIVAQRVSTIMNADRIIVM DEGKVMGIGTHRELLNTCETYQEIVRSQLSEEEMSA >gi|229784042|gb|GG667693.1| GENE 18 17881 - 18309 515 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266622943|ref|ZP_06115878.1| ## NR: gi|266622943|ref|ZP_06115878.1| transcriptional regulator, MarR family [Clostridium hathewayi DSM 13479] transcriptional regulator, MarR family [Clostridium hathewayi DSM 13479] # 1 142 26 167 167 283 100.0 4e-75 MPFINFKDQWFQELRKVNWLVREGVEEKCRPFGITPEQGRVLNHLFLADGQVNLTQLSRS LHVTKGNCSMFCRRLERAGHITMTRNEKDARFIDVVLTEKGRALVQSMMEQMGSHSEKDT MSREDLETILKGLKCLEKYLQS >gi|229784042|gb|GG667693.1| GENE 19 18497 - 19681 1352 394 aa, chain - ## HITS:1 COG:MTH760 KEGG:ns NR:ns ## COG: MTH760 COG0025 # Protein_GI_number: 15678785 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters # Organism: Methanothermobacter thermautotrophicus # 4 392 7 396 399 271 43.0 2e-72 MLTSLSLVFLLGMLLSQAFQKIRLPGLLGMLLTGMILGPYALNMLDASILGISADLRQIA LIIILTRAGLSLDIQDLKKVGRPAILMCFVPACFEIVGMVLLAPRLLGISMLDAAVMGAV VGAVSPAVIVPKMLNLMEKGYGVRKSIPQMILAGASVDDVFVIVMFTAFTGLAQGGRFSA GSVLSIPVSIGTGIAVGGVTGVLLARFFQKIHIRDSAKVVILLSISFLMIDLENRLKGCL PFSGLLAVMSIGIALQRKRHEAAARLSVKYSKLWVAAEVLLFVLVGATVDIRYAWEAGAA AVILIFGVLIFRMAGVFCCLLKTELNRKERFFCMIAYMPKATVQAAIGGVPLAMGLGCGK IVLTVAVLAILITAPLGAFGMERTYKRLLKADCP Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:45:50 2011 Seq name: gi|229784041|gb|GG667694.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld87, whole genome shotgun sequence Length of sequence - 16699 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 6, operones - 3 average 
op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 28 - 82 11.2 1 1 Op 1 . - CDS 108 - 758 857 ## Closa_4129 hypothetical protein - Prom 786 - 845 2.8 2 1 Op 2 34/0.000 - CDS 850 - 1629 249 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 3 1 Op 3 31/0.000 - CDS 1644 - 2294 640 ## COG0765 ABC-type amino acid transport system, permease component 4 1 Op 4 . - CDS 2372 - 3274 1232 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 3310 - 3369 3.1 - Term 3383 - 3444 11.2 5 2 Op 1 . - CDS 3462 - 3932 656 ## COG1225 Peroxiredoxin - Prom 3962 - 4021 2.0 6 2 Op 2 . - CDS 4024 - 4716 697 ## COG1011 Predicted hydrolase (HAD superfamily) 7 2 Op 3 . - CDS 4737 - 6014 1368 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 8 2 Op 4 12/0.000 - CDS 6061 - 6516 523 ## COG3610 Uncharacterized conserved protein 9 2 Op 5 . - CDS 6513 - 7337 839 ## COG2966 Uncharacterized conserved protein 10 2 Op 6 . - CDS 7297 - 8025 664 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold - Term 8129 - 8164 8.1 11 3 Tu 1 . - CDS 8208 - 8528 378 ## - Prom 8707 - 8766 5.5 12 4 Tu 1 . - CDS 8920 - 9663 865 ## COG0789 Predicted transcriptional regulators - Prom 9729 - 9788 8.2 13 5 Op 1 7/0.000 - CDS 9870 - 10598 820 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 14 5 Op 2 . - CDS 10716 - 12524 1770 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 15 5 Op 3 38/0.000 - CDS 12599 - 13444 846 ## COG0395 ABC-type sugar transport system, permease component 16 5 Op 4 35/0.000 - CDS 13441 - 14358 916 ## COG1175 ABC-type sugar transport systems, permease components 17 5 Op 5 . - CDS 14438 - 15736 1459 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 15795 - 15854 6.3 18 6 Tu 1 . 
- CDS 15956 - 16297 307 ## COG2315 Uncharacterized protein conserved in bacteria - Prom 16335 - 16394 7.7 Predicted protein(s) >gi|229784041|gb|GG667694.1| GENE 1 108 - 758 857 216 aa, chain - ## HITS:1 COG:no KEGG:Closa_4129 NR:ns ## KEGG: Closa_4129 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 216 1 215 215 216 59.0 6e-55 MKRTLIVIMAVCAALAFTACGKSASNTETTPAPAVTEPETKEPATTPAATLEAPGDTAEF KILTCDVVSVSDTLEDMTVKNGDAEVKFNLNGVVIETSYELKAGIPVSIIYKGEISGDDA ANAAIVLVLDAQEDMEVKTMTGSVTDQAMSSFTIRNEKGEDTGFLKDNCEGLNTGVLGQA TDDSNGSGAMVKVTYVTVSYDAGSEANFPLKVESAK >gi|229784041|gb|GG667694.1| GENE 2 850 - 1629 249 259 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 18 241 27 252 563 100 29 7e-21 MYLLEMNHVKKSFDGQGVLKDISLSVEEGEIVSIIGPSGSGKSTLLRCATMLEKMDSGEL LYLGEKAAWNGPDGTAVYAGKEKLKEIQGNYGLVFQNFNLFPHYNVLKNIIDAPIHVQKR KADEAIREAYDLLRKMGLEDKAEAYPCQLSGGQCQRVAIARALALNPKILFFDEPTSALD PELTGEVLKVIRSLADLQITMVIVTHEMAFARDISDRVIFMADGLIVEEGTPEEVFTSSH ERTQSFLGRYGDGYGSLKN >gi|229784041|gb|GG667694.1| GENE 3 1644 - 2294 640 216 aa, chain - ## HITS:1 COG:SP1502 KEGG:ns NR:ns ## COG: SP1502 COG0765 # Protein_GI_number: 15901349 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 3 216 4 213 213 144 43.0 1e-34 MTLTKIFSQLTGGMWISIQIFFVTLIFSLPLGLFISFGRMSKNPVIRTIVKFYISVMRGT PLMLQLMVVYFGPYYLFKIKVGNNYRLWAAFIGFVINYAAYFAEIYRSGIQSMPVGQYEA AKLLGYSKTQTFFRIIFPQVMKRILPSVTNEVITLVKDTSLAFTISVLEMFAVAKALASS QTSLIPFVAAGLFYYIFNLLVAVIMEYAEKRLDYYQ >gi|229784041|gb|GG667694.1| GENE 4 2372 - 3274 1232 300 aa, chain - ## HITS:1 COG:SPy1274 KEGG:ns NR:ns ## COG: SPy1274 COG0834 # Protein_GI_number: 15675229 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 
# Organism: Streptococcus pyogenes M1 GAS # 61 297 25 273 278 149 34.0 9e-36 MKKRAILMAAALGMMAVVSGCSGQAAGNTAAETAGTTAKEAETTAAETAADTDTDTKAPE TEASEAGAAGGTFTVGFDQEFPPMGYVGDDGEYTGFDLEVAEEVAKRLGMEFVPQPVDWA AKDMELESGNIDCIWNGFTMTGREDDYTWTEAYMANQQVFVVTAESGIKTLADLAGKVVE VQAESSAEAALKDDPNLTGTFGTLQSTPDYNTAFMDLQMGAVDAIAMDEVVARFQIEQRQ VDFIVLDETLAAENYAVGFKKGNDALKDQVQAQLEALAADGTLAKISEKWFDKDITTIGK >gi|229784041|gb|GG667694.1| GENE 5 3462 - 3932 656 156 aa, chain - ## HITS:1 COG:CAC0327 KEGG:ns NR:ns ## COG: CAC0327 COG1225 # Protein_GI_number: 15893619 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Clostridium acetobutylicum # 2 150 3 151 151 155 50.0 3e-38 MLQAGTKAPDFTLNDKDVNPVTLSSFLGKKVVLYFYPKDNTPGCTREACAFAGAYAGYQQ RGVEVIGVSKDSVKSHSNFAQKHELPFILVSDPELEAIKAYDVWQEKKLYGKVSMGVVRS TFVIDENGVIEKVFEKVKPDTNAQEILEYLDERANA >gi|229784041|gb|GG667694.1| GENE 6 4024 - 4716 697 230 aa, chain - ## HITS:1 COG:BS_yfnB KEGG:ns NR:ns ## COG: BS_yfnB COG1011 # Protein_GI_number: 16077800 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Bacillus subtilis # 1 230 1 227 235 203 44.0 2e-52 MKQYTTILFDVDGTLLDFDSAEERGLASVFKEYEENGVCRTADLIGTYRRVNRGLWDAYE KGLITKDHITDTRFGAVFEAHGISADGIQTEHRYREILNHTAIVMPEAVEVLTYLQDRYD LYVVTNGFTETQKMRMADSGLDQYFKKSFISEEVGYQKPQKEYFDRCFEAMPGAERKGTL IVGDSLNSDIKGGNTAGIDTCWFNPQGALNTAGVTVSWEIRSLKELKEFL >gi|229784041|gb|GG667694.1| GENE 7 4737 - 6014 1368 425 aa, chain - ## HITS:1 COG:CAC0519 KEGG:ns NR:ns ## COG: CAC0519 COG0044 # Protein_GI_number: 15893810 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Clostridium acetobutylicum # 3 423 2 423 424 438 52.0 1e-122 MRMLIKNVRIVSPGNGLDRIGDIAVSDGTILRVGECAEPEDKFDTVIDGRGLVAAPGLVD VHVHFRDPGLTYKEDIHTGAAAAAKGGFTTVVCMANTKPPVDNVETLRYVLEEGKKTGIH VLQAAAVTVGLAGKELTDMEALKAHGAAGFTDDGIPIMDEKLLKTAMEEAARLELPLSFH 
EEDPAFIVNNGINHGAVSDELGIYGSPALAEDSLVARDCMIALHTGAEVDIQHISSEGAV RMVKLAKELGARVYAEVTPHHFTLTEKAVLEHGTLAKMNPPLRTERDRQAIIEGLKNGSI DMIATDHAPHSEEEKARPLQQAPSGIIGLETALGLAITKLVRPGHLSLLQLMEKMSLNPA KLYHLDSGSVEEGKAADLVLFDPEEEWTVGEFASKSSNSPFTGSRLYGKVKYTICDGKIV YRDGE >gi|229784041|gb|GG667694.1| GENE 8 6061 - 6516 523 151 aa, chain - ## HITS:1 COG:lin0586 KEGG:ns NR:ns ## COG: lin0586 COG3610 # Protein_GI_number: 16799661 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 2 129 6 134 152 65 34.0 3e-11 MTLLISVVAAIGGTVSFSLLFGVPTRYYPYCGLIGGAGWGVYSILIQSCTAPTAALLATI VVILLSRLFAVRERCPVTIFLISGIFPLVPGAGVYWTAYYIVTNELDKASQTGFMALKVA VAIVLGIVFVFELPQGIFKVFSKKEEKQGSL >gi|229784041|gb|GG667694.1| GENE 9 6513 - 7337 839 274 aa, chain - ## HITS:1 COG:CAC2265 KEGG:ns NR:ns ## COG: CAC2265 COG2966 # Protein_GI_number: 15895533 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 22 274 5 257 257 143 35.0 3e-34 MGRSLRRKMSVSCELNGEEQREILDAAMQAGHILLENGAEIFRVEETVDRICRYYGIESE NAFVLSNGIFITSGSTREEIFAKVQHIPVSGVHLNRVAAVNQLSREIVEGRYTIGEVRER LDEIRVMPGKTKTMQILASGVGSACFCMLFGGNLTDSAAAFVSGVLLYFYVLYISGPHLS KIVGNIGGGALVTFLCSVMYLFSIGQHLNFMIIGSIMPLIPGVAFTNAIRDIADGDYISG SVRMLDALLVFFCIAMGVGLVFSVFHQLTGGVML >gi|229784041|gb|GG667694.1| GENE 10 7297 - 8025 664 242 aa, chain - ## HITS:1 COG:FN1387 KEGG:ns NR:ns ## COG: FN1387 COG2220 # Protein_GI_number: 19704722 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Fusobacterium nucleatum # 3 206 2 208 237 103 30.0 2e-22 MKITYIHHSSFLVELESVYLLFDYTEGVIPGLKREKPLLVLASHRHGDHYSPVIFELIKK HENVRFVLSGDIWRKRLPEEALPVTDSMKPGEMAEYRLENGGMLTIHTYKSTDEGVAFLI EAEGRTIYHAGDLNNWRWEGEPDNWNDSMAKKYSAQIDKMAGMHIDVAFLPLDPRQEDDF YLGMDEFLNKVDVKHVFPMHCWGDYSVIGKMRALDCSAGYRDRIQEITGDGEVFTEEDER FV >gi|229784041|gb|GG667694.1| GENE 11 8208 - 8528 378 106 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MKLTAKKLVLTGIAVLTMASISACGSGTTAETTAATEAEVTTEATTEAVTTEEAADETTD AESADAEATEAVESEDASAEESSEEVEEASLEDAETVTEEATTAAN >gi|229784041|gb|GG667694.1| GENE 12 8920 - 9663 865 247 aa, chain - ## HITS:1 COG:CAC3475 KEGG:ns NR:ns ## COG: CAC3475 COG0789 # Protein_GI_number: 15896713 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 3 247 1 243 243 244 54.0 1e-64 MRLTVSETAKLTNISVRTLHYYDEIGLLKPTDTTEAGYRYYDEEALAVLQQILLYRELDF PLKEIKDMLSKPDYDRQEALAAHKSLLLLKIERLRRLTGLVDSMMRGENTMSFQEFDMHE IEEAKKKYANEAKKRWGDTQAYRESEKKTSGYSEEKWKDLNGEMSDIFREFAECVQAGEA PDGSRAAGLVERWRNHITENYYECTLEILEGLGKMYVEDGRFKENIDQFGEHTAEFMSRA IESYCQR >gi|229784041|gb|GG667694.1| GENE 13 9870 - 10598 820 242 aa, chain - ## HITS:1 COG:FN0189 KEGG:ns NR:ns ## COG: FN0189 COG4753 # Protein_GI_number: 19703534 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Fusobacterium nucleatum # 4 242 3 254 261 140 35.0 2e-33 MSVKLLVADDEEVIRRGVAKYIRLHTDRFDKIYEAENGQEAIDLLLKYQPDILLLDVQMP LKNGLDVMKEAQRAGLHPIVVILSGYDEFKYAQQALRFGAKEYLLKPARAADILKCLNRL VDDGTGLEEPCVQEEENGTNQLVKRAKEYIAEHYMENITLSGTAEILGITGGYLSTLFQK SLQCGFIDYLNRVRIERACCYLEQNYFKTYEIAYKVGFRDEKYFSKVFKKVMGMSPKEYR TI >gi|229784041|gb|GG667694.1| GENE 14 10716 - 12524 1770 602 aa, chain - ## HITS:1 COG:BH3678 KEGG:ns NR:ns ## COG: BH3678 COG2972 # Protein_GI_number: 15616240 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 161 596 162 589 605 167 29.0 5e-41 MKKIKQWVNQLSVKMKLIFYGYLIITPVLILICAVLLFYNYNKDLESRLENDRSSVNALA ESIFVLQSDIKDFSTYFCINDEVHALLTAKNAEELNKNAKLWLEAAPMQIVQDMVALKGH IKTIGIYPENGVRPYLRCMDASAYVPDLETVKKTAVYRDTLASDNGILWRSVSKYDKETY NTNRSDKLVLYRGIFDLTQKKTLGYIVIGVSQEVFQKLGEGILHSDREGVLILDKFGGEL SRSGSIPTQVAEYLKSDDFIKEDYRERKHNFTFGDYQVVCSQKEKNSSIIVKVVPRYSRQ 
MQLTDIAYMPITLMIGVLLGLLPLLLIISNIVTRPLKRVSEAIVKFSTGDFDQRVEVETR DEVGEVAECFNKMVEDIKSLIDENYVITLQEKESELAALQAQINPHFLYNTLDSLYWQAT EADNDEIAESILALSQLFRLVLSQGKKEVTVAQEIELVSRYLQIQKMRFSRRLHYEINVA DEVKSAYIPKLILQPFVENAIVHGFENVSIPCYLTVTGSLDKGKIRFEIKDTGIGMRQDQ IDDIWEEEPAQYARQRIGRYAIKNIKERLELRYHGDFELEIQSDVGKGTTVILCIPFEKP LH >gi|229784041|gb|GG667694.1| GENE 15 12599 - 13444 846 281 aa, chain - ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 1 281 13 293 293 199 40.0 6e-51 MNQTDIRTFSWKKPLLWAVLILVAVVQIFPLIWLADFSLASSNEMFTSGLLVIPKKIQWG NYVKAFIDGHFLLYLKNSILINGLAVILVLLVSIMAGFACRRMNWKLRGLVKTLLLMGMM IPIHATLLPNYKIFNQLGLTDTIWALLIPYVAFSLPQGLFLMSSFMESIPVELEEAAVMD GCGIYRIVFRIITPMMKSSIATVSIMTFLNNWNEFMMASTYLSSPKWKTLPFSVLEFTGQ YSSNYAVQFAAMALTAAPAVIVYILLNKHITKGVAMGAVKG >gi|229784041|gb|GG667694.1| GENE 16 13441 - 14358 916 305 aa, chain - ## HITS:1 COG:lin0760 KEGG:ns NR:ns ## COG: lin0760 COG1175 # Protein_GI_number: 16799834 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 3 280 10 285 296 168 33.0 1e-41 MKKLYSNKLVILSLVLPGLLLFVFAILAPICLSVYYGFTDYSGMGSYNYIGLENYHRLMK DQVFWTSLKNSLLLAIGFICIQHPLALIVAAVLDKLGGRGEGFFRCVYFVPNVISVAVIA YLWKFIYNPDFGLLNNIIKLFGGKGGINWLSQGRAIWSVLIVLIWHGFGWGMLIYHTGIK NIDPTLYEAAAIDGANQKTTFLRITLPLMKPVIQVNVTMAVISALKQMETVYLLTNGGPG NSTQFAANYLYQQAFKAFKYGYGNAIGVIFIIICLVVTVLLNTVFADRDEEAGRKHGLSR RRAKS >gi|229784041|gb|GG667694.1| GENE 17 14438 - 15736 1459 432 aa, chain - ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 6 356 10 361 438 149 29.0 8e-36 MKHKLVSVVLAAAMLASSLTGCGSKSETTAKNDNGTITLKVFSNLPDRKNGQGLVEQMII 
DDYMKENTNVKIEVEALDEEAYKTKFKAYSMEGMPDVVSIWGQPSFLDEVLDAGVLAELN EADYADYGFISGSLEGFKKDGKLYGLPRNTDVAGFYYNEKMFKDNGWEVPAAYDELLDLA QKISGAGITPLAMDGGDGWPMAIYLSDILFKLTGDYTGIVSNAVSTGDFTDPAFKQATEI LKETADAGLFQNGYDSQDYGTAMNLFTNGQAAMFYMGSWEASMAMNEDIPEEVRTNIRVF TMPTIAGGKGAATDIAAWNGGGYAVSAKSEVKEEAIKFLNYMYQPDKLSKYGWENGVGMS AQDQTSYMSGKETKLQMQFVEAVNGASRVSGTPVNDCGPSAFKTSIESEIQSVSNGTITV DEFLSKIGEACK >gi|229784041|gb|GG667694.1| GENE 18 15956 - 16297 307 113 aa, chain - ## HITS:1 COG:L35832 KEGG:ns NR:ns ## COG: L35832 COG2315 # Protein_GI_number: 15674158 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 6 113 3 114 114 91 40.0 4e-19 MAMKTKQDLIKYCLTYSDVYEDYPFHDTEWAVIRHRKNKKVFAWIYTREGNTCVNVKCDP EWTEFWRNAFPGVLPGYHLNKKYWNTILLDGTVPDQDIKRMIGESYDLTMEKR Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:46:07 2011 Seq name: gi|229784040|gb|GG667695.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld88, whole genome shotgun sequence Length of sequence - 24391 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 11, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 595 238 ## Closa_2982 translation initiation factor IF-2 2 1 Op 2 . + CDS 357 - 1352 855 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 3 2 Op 1 32/0.000 + CDS 2285 - 3706 1712 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 4 2 Op 2 4/0.333 + CDS 3804 - 4130 519 ## COG0858 Ribosome-binding factor A 5 2 Op 3 1/0.333 + CDS 4142 - 5101 979 ## COG0618 Exopolyphosphatase-related proteins 6 2 Op 4 12/0.000 + CDS 5104 - 6009 993 ## COG0130 Pseudouridine synthase 7 2 Op 5 . + CDS 6009 - 6965 365 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 8 2 Op 6 . + CDS 7012 - 8064 176 ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 + Term 8129 - 8179 12.2 - Term 8117 - 8165 14.8 9 3 Op 1 . 
- CDS 8180 - 8359 88 ## 10 3 Op 2 . - CDS 8426 - 10108 1333 ## COG0504 CTP synthase (UTP-ammonia lyase) - Prom 10156 - 10215 9.4 + Prom 10228 - 10287 8.4 11 4 Tu 1 . + CDS 10362 - 12464 1256 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 + Term 12509 - 12569 11.1 - Term 12495 - 12555 4.3 12 5 Tu 1 . - CDS 12592 - 12912 364 ## Desal_3692 hypothetical protein - Prom 13008 - 13067 7.5 + Prom 13003 - 13062 6.5 13 6 Tu 1 . + CDS 13097 - 13936 573 ## COG0500 SAM-dependent methyltransferases 14 7 Tu 1 . - CDS 13959 - 14840 622 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 14901 - 14960 3.5 + Prom 14745 - 14804 5.3 15 8 Op 1 16/0.000 + CDS 15025 - 15783 843 ## COG1082 Sugar phosphate isomerases/epimerases 16 8 Op 2 9/0.000 + CDS 15841 - 16998 1420 ## COG0673 Predicted dehydrogenases and related proteins 17 8 Op 3 . + CDS 17038 - 18042 937 ## COG0673 Predicted dehydrogenases and related proteins + Prom 18907 - 18966 80.4 18 9 Op 1 . + CDS 19016 - 19291 406 ## trd_A0554 xylose isomerase domain protein TIM barrel + Prom 19388 - 19447 4.7 19 9 Op 2 . + CDS 19476 - 20993 1658 ## Cphy_1302 putative RNA methylase + Term 21023 - 21074 13.3 + Prom 21123 - 21182 4.2 20 10 Tu 1 . + CDS 21296 - 22393 975 ## Trebr_2048 hypothetical protein + Term 22437 - 22495 11.2 + Prom 22466 - 22525 6.5 21 11 Tu 1 . 
+ CDS 22588 - 23787 1191 ## COG0282 Acetate kinase + Term 23814 - 23859 4.1 Predicted protein(s) >gi|229784040|gb|GG667695.1| GENE 1 2 - 595 238 197 aa, chain + ## HITS:1 COG:no KEGG:Closa_2982 NR:ns ## KEGG: Closa_2982 # Name: not_defined # Def: translation initiation factor IF-2 # Organism: C.saccharolyticum # Pathway: not_defined # 1 179 172 349 1038 114 56.0 2e-24 SRDGRPQGQGSRDGRPQSQGNRDGRPQNQSGINEQRTAGNTTVSGQSVQPREAADNAVKP QQTQAAQSQTTPVLPGQGQTPQAPASASPSPQAQPPQTDARTGQDSSARPQSQGYQGNRD GRPQGQGYQGNRDGRPQGQGYQGNRDGRPQGQGYQGNRDGRPQGQGYQGNRDGRPQGQGL PGKPRRKTAGPGLPGKP >gi|229784040|gb|GG667695.1| GENE 2 357 - 1352 855 331 aa, chain + ## HITS:1 COG:BB0801 KEGG:ns NR:ns ## COG: BB0801 COG0532 # Protein_GI_number: 15595146 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Borrelia burgdorferi # 41 330 67 396 882 108 31.0 1e-23 MTEDRRARATRGTATEDRRVRATRGTVTEDRRARATRETATEDRRARAIRETVTEDRRAR VYQGNRDGRPQGQGYQGNRDGRPQGQGYQGNRDGRPQGQGYQGNRDGRPQGQGGYQGNRD GRPQGQGGYQGNRDGRPQGQGGYQGNRDNRGGRPQGPGGRPGAGRDGGAKSFDTPIQAKP TSNRGNKNAYKNDRFDKRDKDDDAKKQLGKGGKFKAPIQPPVHEEPKVEVVKTIVLPEIL TIKELAEKMKLQPSAIVKKLFLQGKIVTLNQEIDFDQAEEIAMEYEVLCEKEVKIDVIEE LLKEEEENEADMVARPPVVCVMGHVDHGKTS >gi|229784040|gb|GG667695.1| GENE 3 2285 - 3706 1712 473 aa, chain + ## HITS:1 COG:BH2413 KEGG:ns NR:ns ## COG: BH2413 COG0532 # Protein_GI_number: 15614976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Bacillus halodurans # 1 473 258 729 730 575 65.0 1e-164 MTSREAGGITQHIGAYMVEINGQKITFLDTPGHEAFTAMRMRGAQSTDIAILVVAADDGV MPQTVEAINHAKAAGVEIIVAINKIDKPSANIDRVKQELSEYELIPEDWGGSTVFVPVSA HTKEGIPELLEMILLTAEVKELKANPNRNARGLIIEAELDKGKGPVATVLVQKGVLHVGD TVAAGSCYGKVRAMMDDKGRRVKEAGPSTPVEILGLNDVPNAGEVLVGTVNEKEARNFAE TFISEGKNKLLEDTKAKLSLDDLFSQIKAGNIKELPIIVKADVQGSVEAVKQSLVKLSNE EVMVKVIHGGVGAINESDVSLASASNAIIIGFNVRPDVTAKSIAEREKVDMRLYKVIYQA 
IEDVEAAMKGMLDPVFEEQIIGHAIVRQTFKASGVGTIAGSYVMDGKFQRGCSVRITRAG EQIYEGPLASLKRFKDDVKEVATGYECGLVFEKFNDIQEDDMIEAYAMVEVPR >gi|229784040|gb|GG667695.1| GENE 4 3804 - 4130 519 108 aa, chain + ## HITS:1 COG:BH2411 KEGG:ns NR:ns ## COG: BH2411 COG0858 # Protein_GI_number: 15614974 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Bacillus halodurans # 2 105 12 114 116 92 46.0 1e-19 MEVQRELSEIIRLEIKDPRIHPMTTVVAVSVTPDLKFCKAYISILGDEEAGKATIEGLKS AEGYIRRELARRVNLRNTPEIKFILDQSIEYGVNMSKLIDEVTKDIKE >gi|229784040|gb|GG667695.1| GENE 5 4142 - 5101 979 319 aa, chain + ## HITS:1 COG:CAC1804 KEGG:ns NR:ns ## COG: CAC1804 COG0618 # Protein_GI_number: 15895080 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Clostridium acetobutylicum # 14 315 16 321 321 129 31.0 6e-30 MSFFDTMLEGVETMAIMGHIRPDGDCIGSCLAAYNYLEEHYPDVRVTVFLEEPSIKFGYL KNIGRICSDFSVEAEFDLCLCLDCSDVLRFGEAVRYLNSAKKSVCVDHHVTNTGFAGQNI VEAGASSTCEVLFGLLDEEKISKAVAECIYTGIIHDTGVFKYDCTSAKTMEIAGKMMAKG IDFPHIIDNSFYRKTYIQNQILGRALLESITFLDGRCIFSVIHKRDMEFYGVTSADMDGI IDQLRITEGVECAIFMYETACREYKVSLRSTNDLDVSKIAVYFGGGGHVKAAGCTMAGSV HDVVNNLSARISAQINGDT >gi|229784040|gb|GG667695.1| GENE 6 5104 - 6009 993 301 aa, chain + ## HITS:1 COG:CAC1805 KEGG:ns NR:ns ## COG: CAC1805 COG0130 # Protein_GI_number: 15895081 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Clostridium acetobutylicum # 3 286 2 280 289 222 42.0 8e-58 MADGIINVYKEKGYTSHDVVAKMRGILKQKKIGHTGTLDPNAEGVLPVCLGNGTKLCDML ADRTKTYRAVLLLGLETDTQDTTGVTLAEYPVSVTEEEVRAAVLSFQGDYEQVPPMYSAL KVDGKKLYELARAGKEVERKARPVKILDICVEETALPRVTMTVSCTKGTYIRTLCYDIGR KLGCGGCMESLLRTRVGQFELKDSLTLSEIERQRDSGGLADHILSVDAVFQDYPALTMKE EFDRLVHNGNAFRRSQAEGDFPVQGSPVRVYDSQGRFIGIYAMDREKHVLKPQKVFLGGN K >gi|229784040|gb|GG667695.1| GENE 7 6009 - 6965 365 318 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus 
selenitireducens MLS10] # 18 308 20 314 317 145 32 3e-34 MEYITETTDFQIQEPSVVTIGKFDGRHRGHQKLLKQMLEVKAERGLKTAVFTFDMTPGSL VAGKQQTVITTNLERRNNLEKIGIDYLVEYPFTEETAHMAPEDFVRDILVSRMHARVIVV GTDCSFGYKGAGNADCLEAWKAKYGYELIVIRKERDDNRDISSTYVREELDAGNIVKANE LLGEPYAIHGTVVHGNHIGGSVLGFPTANILPPPEKHLPAFGVYVSKVYVEGTYYGGVTN IGRKPTVSGESPVGAETFLYGIDQDIYGKNIEVQLLSFIRPEKKFDGLEQLKAQIGRDRD YALAYLKNLSDQQITHLS >gi|229784040|gb|GG667695.1| GENE 8 7012 - 8064 176 350 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 225 340 52 162 175 72 34 3e-12 MTDKAWKMLGFCLACGSAVVLGTADVQAAKPEPSYTSGTSVAGFGTAMDRHLKSVEEKGS QAAPAALTQEKVVFGTFRDIFKNLGVAMVEDSLNIRKEPKNDAEIVGKMESHAGCSVLGM EHGWYKIRSGQVTGYVSGKYLAVGQAARASAYCDMKLMLRVNTDTLRVRSAPDTDSEILG RIHEGETYDYLSRAGGGWIKIRYGEQEGYACVPGNAVVAYTIPEAEKTENSLRRQVVNYA VRFVGNPYRWGGTNPNTGADCSGFVQYVMEHAAGIHLDRTSREQSAEGMAVSSKAMQPGD LLFYASGSTIDHVAMYIGGGKIVHAANRRSGIKISAWNYRQPVTIRRVIP >gi|229784040|gb|GG667695.1| GENE 9 8180 - 8359 88 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGADGIVTTDQAVILLRAYAACMLSENCSDCPYQGKCSGWTEEQLLQALHHLNTESTEE >gi|229784040|gb|GG667695.1| GENE 10 8426 - 10108 1333 560 aa, chain - ## HITS:1 COG:CAC2892 KEGG:ns NR:ns ## COG: CAC2892 COG0504 # Protein_GI_number: 15896145 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Clostridium acetobutylicum # 1 560 1 535 535 719 60.0 0 MPVKYVFVTGGVVSGLGKGITAASLGRLLKARGYHVTMQKFDPYINIDPGTMNPVQHGEV FVTEDGVETDLDLGHYERFIDENLTKNSNVTTGKVYWTVLTKERRGDFGGGTVQVIPHIT DEIKSRFHRSGERTLPDGEPASWGVGQNSSGKAGSPESDDMEIAIIEVGGTVGDIESQPF LEAIRQFQHDVGRENAILIHVTLVPYLKVSEELKTKPTQASVKDLQGMGIQPDIIVCRSE LPLDDGIRSKIAQFCNVPKNHVLQNLDVEVLYELPLVMEQEHLAEVACESLRIPCPEPDL TEWSSMIESWKHPQKQVTVALVGKYISLHDAYISVVEALKHGGVANHASVSIRWVDSETV NDRNAAEIFDGVNGILVPGGFGSRGIEGKISAIRWARTHGIPFLGLCLGMQMAIVEFARD VVGYRDAHTAELDPDTSHPVIHLMPDQNDVEDIGGTLRLGSYPCILDKTSKAYGLYGQET IHERHRHRYEVNNDYRQALTEHGMKLSGISPDGRIVEMIEIPDHPWFVATQAHPEFKSRP 
NRPHPLFSGFIEAALKNQEK >gi|229784040|gb|GG667695.1| GENE 11 10362 - 12464 1256 700 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. McKiel] # 13 623 1 606 636 488 42 1e-137 MDNNNNQNNNSKMPKNAQVIIAWVITFLIAVAGLSYLHNALSARTTEELSYDTFMNMVDE KQVEAVSVEDSRITIEPKKTWEGYMPGKTYYTVRMDNDLTLAERLRAAGVDTHRIRKDGN FIIQTIVSYLLLFGLMGILLNFMMRRVGGGGGIMGVGKSNAKVYVQKETGVTFRDVAGED EAKESLTEIVDFLHNPGKYTKIGAKLPKGALLVGPPGTGKTLLAKAVAGEAKVPFYSLSG SDFVEMFVGVGASRVRDLFKQAQESAPCIIFIDEVDAIGKSRDSRFGGGNDEREQTLNQL LSEMDGFDSSKGLLVLAATNRPEILDPALLRPGRFDRRIIVDKPDLKGRVDILKVHSRDV LMDESVDLDAIALATSGAVGSDLANMINEAAILAVKSGRQAVAQKDLFEAVEVVLVGKEK KDRIMNKEERRIVSYHEVGHALVSALQKDAEPVQKITIVPRTMGALGYVMHVPEEEKYLN TKKELHAMLVGALAGRAAEELVFDTVTTGAANDIEQATRVARAMITQYGMSDKFGLMGLA TQEDQYLTGRTVMNCGDATAAEVDAEVMKMLKEAYEEAKSLLSENRDVMDKIAEFLIEKE TITGKEFMKIFREMKGIPEPEEKEKSDADKMDDVVSHKDTFGDKNGGDAGSPGNDGGTGN SGTLENPAAGEVSAVSEEKKDAGDGFGDMFGNPDFGEVKR >gi|229784040|gb|GG667695.1| GENE 12 12592 - 12912 364 106 aa, chain - ## HITS:1 COG:no KEGG:Desal_3692 NR:ns ## KEGG: Desal_3692 # Name: not_defined # Def: hypothetical protein # Organism: D.salexigens # Pathway: not_defined # 1 103 1 103 106 75 42.0 5e-13 MKELLLTLLIGIAAGVIDVLPMIKMKLDKYSEISAFVHYLIAPFIIFNTELFGMAWWLKG GVINLALAVPVVILAAKDDKTSAPPMIVMSVVLGTVIGIVGHFTFS >gi|229784040|gb|GG667695.1| GENE 13 13097 - 13936 573 279 aa, chain + ## HITS:1 COG:MA4656 KEGG:ns NR:ns ## COG: MA4656 COG0500 # Protein_GI_number: 20093435 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 24 259 2 237 257 194 40.0 2e-49 MYSQFINYLKEKPALYAPSSAPFWDDEHISKHMLKAHLDPELDAASRRHGFIEQSAEWIA QYCNVRPGMRLLDLGCGPGLYAEKFSRLGFSVTGLDFSRRSIKYAMEHAEAEKLPVEYEY RNYLDMDYQDVFDAAVLIYCDFGVLPPENRKVLLQKTYRALKKNGIFIVDGDSMKYGTKL KEMKSVEYLDQGFWSGEPHVCIQRNYQYPETNNYVEQYVIVTDDACQCYNNWNQLFTAES 
LERELREAGFETVTFFDDVCGRPFTGESDTICAVSAKRL >gi|229784040|gb|GG667695.1| GENE 14 13959 - 14840 622 293 aa, chain - ## HITS:1 COG:CAC2608 KEGG:ns NR:ns ## COG: CAC2608 COG2207 # Protein_GI_number: 15895866 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 16 269 20 275 284 121 29.0 2e-27 MIYESAPAPVSGKVNFTTSHIKSDTFEVHYHSCYEIYYFVKGDADYVVEGREYHLTPHSL ILLSPYVFHGVRVNSTTDYVRCSVHFSPESVIAERRVLLLSSFPGNKKSGPGEVFYEKTE DYGLYAYFEHLIWSQKQPDALSSSYYAMFLEALLAEISLMCQTLRPSTPDSSASGTITDM IRYLNEHLTEPVTLDDLSSRFFISKYYMNRAFKKATGTTVMDYLIYKRVVMARQLMLNGF TASDAANETGFGDYSTFYRAYKKVMGCRPGADRPGKTSPYTDNKMEKAQGTKA >gi|229784040|gb|GG667695.1| GENE 15 15025 - 15783 843 252 aa, chain + ## HITS:1 COG:BH3346 KEGG:ns NR:ns ## COG: BH3346 COG1082 # Protein_GI_number: 15615908 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Bacillus halodurans # 6 248 8 239 241 118 32.0 1e-26 MKTGAQLYTVRLFTQTVPDFQETMRRVAEIGYRTVQLSAIGGAVTPQIARETADETGLEI VLTHSDLNRILTDTEQLIRDHETMGCKYIGIGSMPEKYRNPDWIKRFAADFKEPARMIKD AGMRLMYHNHAFEFEKTDGRYLLDYLLDDFAPDELGITLDTYWVQTAGGDVCQWIERLAD RIPCVHLKDRGIVNGQEVMAPVMEGNMNFSAILKALEKTDCEYALVEQDTCQENPFVCLE KSWRNLTELGYR >gi|229784040|gb|GG667695.1| GENE 16 15841 - 16998 1420 385 aa, chain + ## HITS:1 COG:SSO3049 KEGG:ns NR:ns ## COG: SSO3049 COG0673 # Protein_GI_number: 15899754 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Sulfolobus solfataricus # 1 353 16 366 371 129 27.0 1e-29 MSKVRLGIIGIGNMGSGHVESILDGKVPEIELAAVADQREVRRQWARETVPETVKIFETA EELLDDSDLDAVLVAVPHYDHPRYVIQALRRGFHVMCEKPAGVYTKQVREMNEEANKHDL VFGMMFNQRTNHIYRKMHELVTGGQYGAVKRVNWIITDWYRTQAYYNSGGWRATWEGEGG GVLLNQCPHNLDLIQWICGMPKTVQAFCHEGKWHDIEVEDDVTAYLEFPNGATGVFVTTT GDAPGTNRFEVDLEKAKIVCENEEIKVYELDVNEREYCYTAVEGFKKPEGRWIDVETDGE NLQHVGVLRAFAGAILRGEPLIAGGEEGINGLTLSNAMHLSSWLKRPVEIPFDEELFLEE LNKKRAGSRKKEKVEEVTFNTKGTH 
>gi|229784040|gb|GG667695.1| GENE 17 17038 - 18042 937 334 aa, chain + ## HITS:1 COG:Cgl2503 KEGG:ns NR:ns ## COG: Cgl2503 COG0673 # Protein_GI_number: 19553753 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Corynebacterium glutamicum # 36 224 32 214 224 137 36.0 3e-32 MKSAIIGCGSIAAVHAKSITGLGGCELTAVADVIPERAEAMAREYGAVPYTDWGTMLEKE EIDVVHICTPHYLHTPMAEASLRMGIHTFMEKPPVISREQWEKLKQAEKAAKDGAKLGVC FQNRFNPSVAAVKERITAGEFGKILGARGLVTWCRGEKYYTESGWRGTKEKEGGGALINQ SVHTLDLIQYLIGKKPVFMEANVSNHHLKGIIEVEDTMEAYISYGEQTACFYATTGYVAD VPPMIELECDKARVRIEDLRVTVMTADGQTQTVDYSGSEHLGKSYWGSGHRDCIRSFYES IGQDRPFSVGLKQIEDTVELMLTAYETAGRAQTC >gi|229784040|gb|GG667695.1| GENE 18 19016 - 19291 406 91 aa, chain + ## HITS:1 COG:no KEGG:trd_A0554 NR:ns ## KEGG: trd_A0554 # Name: not_defined # Def: xylose isomerase domain protein TIM barrel # Organism: T.roseum # Pathway: not_defined # 1 54 190 243 275 73 61.0 2e-12 MLKPYIAYIHIKDAVKGSGEVVLPGDGDGCVKEILTMLDGAGYQGYLSLEPHLVSFDGLE KLEKGAKERREQDGIAAYKKAYGRLAELLDE >gi|229784040|gb|GG667695.1| GENE 19 19476 - 20993 1658 505 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1302 NR:ns ## KEGG: Cphy_1302 # Name: not_defined # Def: putative RNA methylase # Organism: C.phytofermentans # Pathway: not_defined # 8 501 1 506 510 434 48.0 1e-120 MAYKRRTMIKETWNEIVQNQDVRQNLSKLRQEMKEGRGREAACSCISGEEEKLILLLESE DAKIRKNAALLMGDLGNQEYLEPVLKAYRSEEQLFVKSSYLNAVKNFDYGEHLAFFKKRL DELAQIKLTPETEKHITEEIRELSSMIISREGVLLHPFCGWEETFDIVLLTNRNFQEITR EELRKLEPHAKTKVFGAGVMARVENLNWLEEIRTYQELLFAVKGLPSCPMDAEKAAEMIV KSDLFKFLARSHEGNPPYYFRVELKCKMDHSKKSAFTRRLSSKIEKLSGRKLINTTSNYE FEIRLIENKEGSCNVLIKLFTLKDERFSYRKEVIPTSIKAVNAALTVKLAQPYMKEGAQV LDPFCGVGTMLIERHKAVKAGTMYGLDILEEAIEKARENTAAAGQIIHYINRDFFDFKHE YLFDEIITNMPFKIGRKTEEEIENLYEAFFSSTKKVLKDDGIMILYSHNAGHVKKMAPAN GFTVSETFEISLKEGTYVFVLNRRQ >gi|229784040|gb|GG667695.1| GENE 20 21296 - 22393 975 365 aa, chain + ## HITS:1 COG:no KEGG:Trebr_2048 NR:ns ## KEGG: Trebr_2048 # Name: not_defined # Def: hypothetical 
protein # Organism: T.brennaborense # Pathway: not_defined # 222 358 143 283 287 96 38.0 1e-18 MPQDSEKYMSEDQSGTDIEKNADRKIFIPGAMGERITYLRKRKKMTQAQLAEQLGISAQA VSKWESGLSCPDIMTLVPLSQVLGVSTDELLGVQGGMGAGGFEASGTMEAGEHAESNTEP ITESNAEAKGTADGSPIREPNRFKSLLVVPAAEDGTRAIREFTVSAGACAAVIQYGSDFG LETEGYQDGEIISEVSDGTWTVRDIADKNILRIGRNTLSKRKMILTIPKGYHFDSVNLNI GAGTLTGAGIITESCVLSVGAGQVTMKDFFSNGSKITCGMGEIVVRGGLYGKCAVDCGMG AVRLYLREPEHYGYRTSVGMGEVRIGDNRLSGIGGSQTMNAGDSNFYKVSCSMGMVSVVF TEGNE >gi|229784040|gb|GG667695.1| GENE 21 22588 - 23787 1191 399 aa, chain + ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 1 395 1 397 403 429 53.0 1e-120 MTILVINSGSSSLKFQVIDPETEEVLAKGVCERIGIDGIFTYRTHTGISLKESVPMENHT QAVHTLLDMLADAEKGVLKDFQEISVIGHRLVHGGEKFTGSVVINDEVIEAMTECNDLAP LHNPANLVGVEACRRLMPDTLMVGVFDTAFNQTMEPKAFLYGLPYRYYEKYKIRRYGFHG ISHRYVSRRAAEFMRQDPKRVKTIVCHLGNGASICAVKYGQVVDNSMGFTPLEGLVMGTR CGDVDPGALEFLTKKEHLDFLQVMAILNQESGVLGISKNFSSDFRELKDAEVKYNDHAAL AREIFSYKVAKYIGSYAAAMNGVDNIVFTAGVGENDSDIRMEICSYLEYLGITIDEVSNR DQAEAFRISEWTSKVNVLVIKTNEELEIAREAAEVARNC Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:46:45 2011 Seq name: gi|229784039|gb|GG667696.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld89, whole genome shotgun sequence Length of sequence - 32324 bp Number of predicted genes - 29, with homology - 28 Number of transcription units - 11, operones - 7 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1785 2063 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) - Prom 1881 - 1940 19.4 2 2 Op 1 17/0.000 - CDS 2842 - 3360 614 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 3 2 Op 2 15/0.000 - CDS 3362 - 4504 1233 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase 4 2 Op 3 32/0.000 - CDS 4558 - 5358 957 ## COG0575 CDP-diglyceride synthetase 5 2 Op 4 19/0.000 - CDS 5378 - 6100 859 ## COG0020 Undecaprenyl pyrophosphate synthase 6 2 Op 5 33/0.000 - CDS 6257 - 6811 748 ## COG0233 Ribosome recycling factor 7 2 Op 6 . - CDS 6826 - 7527 799 ## COG0528 Uridylate kinase - Prom 7575 - 7634 6.8 - Term 7646 - 7692 9.2 8 3 Tu 1 . - CDS 7718 - 8494 840 ## COG4166 ABC-type oligopeptide transport system, periplasmic component - Prom 8517 - 8576 18.7 9 4 Op 1 2/0.000 - CDS 9478 - 10323 769 ## COG4166 ABC-type oligopeptide transport system, periplasmic component - Prom 10345 - 10404 1.9 - Term 10345 - 10387 2.5 10 4 Op 2 44/0.000 - CDS 10437 - 11414 871 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 11 4 Op 3 . - CDS 11414 - 12061 441 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Prom 12088 - 12147 80.4 12 5 Op 1 44/0.000 - CDS 13000 - 13323 178 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 13 5 Op 2 49/0.000 - CDS 13339 - 14379 954 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 14 5 Op 3 . 
- CDS 14394 - 15311 735 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Term 15481 - 15518 6.4 15 6 Op 1 38/0.000 - CDS 15608 - 16531 545 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts - Prom 16558 - 16617 4.7 - Term 16649 - 16706 8.6 16 6 Op 2 2/0.000 - CDS 16718 - 17458 1138 ## PROTEIN SUPPORTED gi|240145469|ref|ZP_04744070.1| ribosomal protein S2 - Prom 17564 - 17623 5.7 - Term 17638 - 17673 -0.7 17 6 Op 3 1/0.000 - CDS 17693 - 18508 1040 ## COG4465 Pleiotropic transcriptional repressor - Prom 18560 - 18619 5.1 - Term 18559 - 18598 -0.9 18 6 Op 4 1/0.000 - CDS 18634 - 20340 1594 ## COG0550 Topoisomerase IA - Prom 20389 - 20448 9.8 19 7 Op 1 13/0.000 - CDS 21351 - 21614 317 ## COG0550 Topoisomerase IA 20 7 Op 2 2/0.000 - CDS 21629 - 22729 928 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake - Prom 22750 - 22809 80.4 21 8 Tu 1 . - CDS 23649 - 25124 1213 ## COG0606 Predicted ATPase with chaperone activity - Prom 25210 - 25269 5.2 22 9 Op 1 . - CDS 25280 - 26212 632 ## COG2267 Lysophospholipase - Prom 26234 - 26293 6.1 23 9 Op 2 . - CDS 26325 - 26984 460 ## Closa_1841 hypothetical protein - Prom 27039 - 27098 25.5 24 10 Tu 1 . - CDS 28041 - 29405 1149 ## COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif - Prom 29444 - 29503 3.2 - Term 29468 - 29523 16.6 25 11 Op 1 . - CDS 29557 - 30576 1215 ## COG1087 UDP-glucose 4-epimerase 26 11 Op 2 . - CDS 30611 - 30775 103 ## 27 11 Op 3 . - CDS 30738 - 31505 553 ## gi|266623016|ref|ZP_06115951.1| putative lipoprotein 28 11 Op 4 . - CDS 31520 - 31861 402 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 29 11 Op 5 . 
- CDS 31892 - 32323 473 ## gi|288870970|ref|ZP_06115953.2| conserved hypothetical protein Predicted protein(s) >gi|229784039|gb|GG667696.1| GENE 1 3 - 1785 2063 594 aa, chain - ## HITS:1 COG:CAC3442 KEGG:ns NR:ns ## COG: CAC3442 COG2176 # Protein_GI_number: 15896683 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Clostridium acetobutylicum # 309 594 211 499 1452 221 40.0 4e-57 MTKAFLDVFPNLNITDELRELLNLVEVEKVTSTRDRSSIRVYIVSPRLIHKQNIYGLEKG IKDQLFPSKQVSIKIIEKFRLSGQYTPEKLLAVYRDSILMELKNYSIIEYNIFRKTEYEF PEPDLLKMTVEDTIVTHEKEAELKRVLEKIFFERCGLPIEVRYEYVEPKENALAKQKELQ MQREVEEIVSRTSLAAVLNEEQALPKTGTEDGYGGPSMADAASLEALAAGYSAEVPFDEG PARQGAADNGEIPPRRLAENGGAGAPVLRMGGKKDENKRDKNGDNGKKEQKKDYKDYKKN GGYGGYGGKRSDNPDVLYGRDFEGETIEIEKIDGEIGEVTIRGKILMVDSRELRSGKTIM IFDISDFTDTITVKMFAREEQLDDLKAAVQAGTFIKIKGVTTIDKFDGELTLGSVVGIKK SEDFTGKRMDGSLEKRVELHCHTKMSDMDGVSEVKDIVKRAKKWGMPAIAVTDHGCVQSF PDANHALDKGDTFKILYGVEGYLVDDLKLLVDNPKDQTFADTYVVFDIETTGFSPEKNRI IEIGAVKVVNGVITDRFSTFVNPDVPIPFEIEQLTSINDNMVLSYPKIDVILPQ >gi|229784039|gb|GG667696.1| GENE 2 2842 - 3360 614 172 aa, chain - ## HITS:1 COG:CAC1796 KEGG:ns NR:ns ## COG: CAC1796 COG0750 # Protein_GI_number: 15895072 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Clostridium acetobutylicum # 3 154 5 156 339 125 42.0 3e-29 MSSIIVAVLVFGLIVLIHELGHFLFAKLNGISVVEFSIGMGPRLFHVKKGETTYSLKLLP IGGSCMMLGEDEENPAEGAFQNASIPGRMAVIAAGPVFNFILAFFLALILVGMGGYNVTQ IKEVTEGSPAYEAGLKPGDVITGVNEEKMTVYGDYILYRMLKPDEKMSRVSY >gi|229784039|gb|GG667696.1| GENE 3 3362 - 4504 1233 380 aa, chain - ## HITS:1 COG:lin1354 KEGG:ns NR:ns ## COG: lin1354 COG0743 # Protein_GI_number: 16800422 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Listeria innocua # 1 378 1 378 380 395 52.0 1e-110 MKKIAVLGSTGSIGTQTLEVVRAQKDIEVTALAAGSNITRLEEQIREFHPEIVCVWEEER 
AKELALAVKDMPVKVVYGMEGLIEAAVDTEAEIVVTAIVGMIGIRPTIAAMEAGKDIALA NKETLVTAGHIIMPLAKEKGVKILPVDSEHSAIFQSLNGEDKKAVHKILLTASGGPFRGK TREELKMVRLEDALKHPNWSMGRKITIDSSTMVNKGLEVMEAHWLFDVSMDQVQVVVQPQ SVIHSMVQFDDGAVIAQLGTPDMKLPIQYALYYPERRFLAGERLDFWSIGQITFEKPDME NFPGLALAYEAGRIGGTMPTVFNAANERAVAKFLNREISYLTITDMIEGAMKGHTVKENP SVDEILEAEASAYDYIESRW >gi|229784039|gb|GG667696.1| GENE 4 4558 - 5358 957 266 aa, chain - ## HITS:1 COG:CAC1792 KEGG:ns NR:ns ## COG: CAC1792 COG0575 # Protein_GI_number: 15895068 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Clostridium acetobutylicum # 9 241 9 240 245 127 37.0 2e-29 MFTTRLISGIVLVLIALVVVGSGGMILFAATVAISMIGLFELYRVMKIQKNVIGVMGYLT VVSYYGLLWFHGEQYMMLMSIAFLMILMTIYVFTFPKYKTEEVTVAFFGVFYVAVMLSYL YQVRVMADGKYLVWLIFLSSWGCDTCAYCVGMLIGRHKLAPVLSPKKSIEGAVGGVAGAA LLGFLYATVFGASMAEVQNPQVACALACAIAAVISQIGDLAASAIKRNHQVKDYGHLIPG HGGILDRFDSMLFTAPAIFFAVTFLR >gi|229784039|gb|GG667696.1| GENE 5 5378 - 6100 859 240 aa, chain - ## HITS:1 COG:mll0640 KEGG:ns NR:ns ## COG: mll0640 COG0020 # Protein_GI_number: 13470839 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Mesorhizobium loti # 8 239 1 232 244 244 51.0 1e-64 MEVNIDNMVIPDHVAIILDGNGRWAKKRGLPRSMGHKEGCVVVERTVEDAARIGIKYLTV YGFSTENWKRSGEEVGALMQLFRYYMVRLLKIAKTNNVRVRMIGERSRFDQDIIDGINRL ERETENNTGLTFVIAVNYGGRDEIVRAARKIMKDTADGKITPDQMDEAVFSSYLDTAEIP DPDLLIRTSGELRLSNYLLWQLAYTELYVTDCYWPDFNKEELKKAIAAYNSRDRRFGGVK >gi|229784039|gb|GG667696.1| GENE 6 6257 - 6811 748 184 aa, chain - ## HITS:1 COG:CAC1790 KEGG:ns NR:ns ## COG: CAC1790 COG0233 # Protein_GI_number: 15895066 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Clostridium acetobutylicum # 9 184 10 185 185 183 57.0 1e-46 MQESLQVYEDKMNKTLEVLESDYMTIRAGRANPHVLDKIKVDYYGTPTPLQQVGNINVPE ARMIIIQPWEKSLLKAIEKAILTSDLGINPTNDGSVIRLVFPELTEERRKELVKDVKKKG EAAKVAVRNIRRDGNDYFKKLQKADEISEDDLAEAEEDIQKLTDKMIGKIDKAIEAKSKE LLTV 
>gi|229784039|gb|GG667696.1| GENE 7 6826 - 7527 799 233 aa, chain - ## HITS:1 COG:CAC1789 KEGG:ns NR:ns ## COG: CAC1789 COG0528 # Protein_GI_number: 15895065 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Clostridium acetobutylicum # 4 232 7 235 236 217 48.0 1e-56 MNPRRVLLKLSGEALAGPNKTGFDEETVKEVARQVKISVDAGVEVGIVIGGGNFWRGRTS NAIDRTKADQIGMLATVMNCIYVSEIFRRAGMTTQILTPFECGSMTKLFSKDRANKYFEK GMVVFFAGGTGHPYFSTDTGIALRAIEMEADCILLAKAIDGVYDSDPKINPEAKKYDEIS IQEVIDKQLGVIDLTASIMCMENRMPLAVFSLNEKDSIADAMQGKINGTIVTA >gi|229784039|gb|GG667696.1| GENE 8 7718 - 8494 840 258 aa, chain - ## HITS:1 COG:CAC3634 KEGG:ns NR:ns ## COG: CAC3634 COG4166 # Protein_GI_number: 15896868 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Clostridium acetobutylicum # 15 252 317 543 550 144 34.0 1e-34 MGTYYISFQTEKEPFNNPDVRMALSLAIDRDYVANTVMQGTYSPATNFVGPGLSDAEAGS SFEEVTRKNNGGDFFNVTDHDADVAKAKELLANAGYPDGAGFPTIEYMTNDAGYHKPLAE YLQSCWKESLGINMDIKVVEWSTFTPTRRAGDFQVARNGWVYDYDDPSNMLNLMITGGGN NDGKYSNPEVDKMLNDANSTADVAEHYAKLHEAENMILADAAMAPIAYYNDFYLQDPKLK GTWHSPYGYWFFQYATMD >gi|229784039|gb|GG667696.1| GENE 9 9478 - 10323 769 281 aa, chain - ## HITS:1 COG:CAC3634 KEGG:ns NR:ns ## COG: CAC3634 COG4166 # Protein_GI_number: 15896868 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Clostridium acetobutylicum # 61 281 41 262 550 176 38.0 5e-44 MKKTALVLAAVMATGMILTACGGNKTADAPTTKAADTTTTAAAAEGETAAPAENGGAGLD LAVQIGPDPETVDPALNSAVDGGNMILHAFETLMSVDSENKIVPGQAESMEVSDDGLTYT FHLRKGLKWSDGTPLTANDFVYSWKRLADPNTAAPYAMDMLGYVKGYKEASEGNLDALGV SAPDDDTFVVEMASPCVYFSKLITHGSMVPVQQATIEANGDQWTLTPETYISNGPLKMIE WVPGSHITFAKNENYWNADKVTVNTLKFVLMEDSNAAYSAY >gi|229784039|gb|GG667696.1| GENE 10 10437 - 11414 871 325 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens 
MLS10] # 4 318 9 324 329 340 54 9e-93 MEDKYLVEVKHLQQYFPVAGSKLFEKKVVKAVDNVSFGIRKGETLGLVGESGCGKTTTGR TLLRLYEPTDGTILYDGKDITHVNMLPYRRKMQIVFQDPYASLDPRMTVEDIVGEAIDIH RLAANKKDRHERILSVLSTVGLNSEHANRYPHEFSGGQRQRVGIARALAVNPEFIVCDEP VSALDVSIQAQVVNMFEQLQEELGLTYLFIAHDLSIVKHISNRIGVMYLGKLVELADSYE LTFHSVHPYTKSLISAIPIADPETSRKSKRIVLEGDVPSPVNPPSGCRFRTRCPYADELC AAEEPQWKEVSTGHYAACHHLDKVN >gi|229784039|gb|GG667696.1| GENE 11 11414 - 12061 441 215 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 199 133 329 329 174 42 6e-43 NKQQANERAKELLTLVGINEPEKRLKQYPHELSGGMRQRVMIAIALACEPKLLIADEPTT ALDVTIQAQILELMMDLKEKLGMAIIMITHDLGVVANMCDRIAVMYAGKVVEYGTTDDIF YRPSHEYTKGLIRSIPKLTEKEHNKLVPIEGSPVDMLNPPAGCPFAPRCRACMKICLREM PPYTDLSDVHYSACWLMQKKEFDKQQDSKTQKGDA >gi|229784039|gb|GG667696.1| GENE 12 13000 - 13323 178 108 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 3 108 6 111 563 73 34 2e-12 MSEYLVNIKNERLSFFTPAGEVKALNDVSLHLKEGEVLGIVGESGSGKSVTAYSLMGITA YPGRLIGGSLEFNGHQIEKMSEKELRKVRGNEISIIFQDPMTSLNPVY >gi|229784039|gb|GG667696.1| GENE 13 13339 - 14379 954 346 aa, chain - ## HITS:1 COG:CAC3643 KEGG:ns NR:ns ## COG: CAC3643 COG1173 # Protein_GI_number: 15896876 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Clostridium acetobutylicum # 10 345 7 307 308 249 44.0 6e-66 MSNNKLSLQLNVDDFLPASDEEKESLTVLRKSVGFWQDGIRRLKKNKIAMVSLVVIIFVF IFSFLVPQFYPYSYEQQIRGSENLAPMQYSEKELEAKAAGESVFPHVLGTDNLGRDYAVR VMMGSRVSLTVGLVASAIILLIGSLYGSIAGFFGGWVDMIMMRIVDMIYTVPDILIIVLL SVAFDQPLKALAQKPGFEWIQVIGVNLISIFVVFALLYWVGMARIVRSQILILKEQEYVT AARALGASNGRIIRKHLLTNCIGTLIVTTTLQIPSSIFTESFLSFLGLGVAAPMPSLGSL ASAALNGLQSYPHRLFAPALAISIIILSFNLLGDGLRDAFDPKLKD >gi|229784039|gb|GG667696.1| GENE 14 14394 - 15311 735 305 aa, chain - ## HITS:1 COG:CAC3644 
KEGG:ns NR:ns ## COG: CAC3644 COG0601 # Protein_GI_number: 15896877 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Clostridium acetobutylicum # 1 305 1 305 306 302 50.0 5e-82 MAKYIFKRIIMAIVTIFAVASVTFFVMNMVPGGPFMSEKAISPQAQAALNEKYGLDKPLG EQYVTYMKDLLHGNLGLSVKQRGRTVNMIIASKFPVSARVGGIAILTAVLVGVPLGSIAA FKRGTIVDDIIIVFSTCGIAIPSFVICTILMYVLSLKLGLLPTYGLASWKNYIMPVIALA LYPSSYIARLMRSSMLDVMGQDYMRTARAKGLSQVISIFKHALRNAILPVVTYLGPLLAY TVTGSFIVEKIFTIPGLGSEFIGSITGRDYPLIMGTTIFLASLMVVMNVIVDVAYKLIDP RINLK >gi|229784039|gb|GG667696.1| GENE 15 15608 - 16531 545 307 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 1 304 1 277 283 214 44 6e-55 MAAVTAGMVKELREMTGAGMMDCKKALAATDGDMEKAVEFLREKGLAAASKKAGRIAAEG IVKTLVSEDGKSAAIVEVNSETDFVAKNAQFQGYVADVASQVLASSAADMEAFMAEPWAK DTSLTVAQELSSQIAIIGENMNIRRFEKIATDGVVIDYIHGGGRIGVLVEASAEVVNDEV KEALKNVAMQIAALTPKYVKRDEIPQEFVEHETEILKVQAKNENPDKPDAIIEKMIIGRL NKELKEFCLLDQAYVKDGDLTVAKYLEQVSKEVGGKIELKKFVRFETGEGLEKKEEDFAA EVAKQMQ >gi|229784039|gb|GG667696.1| GENE 16 16718 - 17458 1138 246 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145469|ref|ZP_04744070.1| ribosomal protein S2 [Roseburia intestinalis L1-82] # 1 245 1 245 247 442 91 1e-123 MSVISMKQLLEAGVHFGHQTRRWNPKMAPYIYTERNGIYIIDLQKSVVKVDEAYKAVADI AAEGGKILFVGTKKQAQDAIKTEAERCGMYYVNERWLGGMLTNFKTIQSRIARLKQIETM SEDGTFDVLPKKEVIALKKEWEKLEKNLGGIKEMKKIPDAIFVVDPKKERICVQEAHTLG IPLIGIADTNCDPEELDFVIPGNDDAIRAVKLIVSKMADAVIEANQGEATAEEAEAAQDF EEAVEA >gi|229784039|gb|GG667696.1| GENE 17 17693 - 18508 1040 271 aa, chain - ## HITS:1 COG:CAC1786 KEGG:ns NR:ns ## COG: CAC1786 COG4465 # Protein_GI_number: 15895062 # Func_class: K Transcription # Function: Pleiotropic transcriptional repressor # Organism: Clostridium acetobutylicum # 5 255 4 258 258 214 51.0 2e-55 
MSVQLLDKTRKINKLLHNNNSHKVVFNDICEVLSEILNSNILVISKKGKVLGVAISPDVE EIQELIADTVGGYVDGMLNERLLGILSTKENVNLETLGFTGDNIRKYQAIITPIDIAGER LGTLFIYKSDAQYDIDDIILSEYGTTVVGLEMMRSVNEENAEETRKVQIVKSAISTLSFS ELEAITHIFDELDGNEGILVASKIADRVGITRSVIVNALRKFESAGVIESRSSGMKGTYI KVLNDVVFDELKAIKAGNGKGIQAHREIKES >gi|229784039|gb|GG667696.1| GENE 18 18634 - 20340 1594 568 aa, chain - ## HITS:1 COG:CAC1785_1 KEGG:ns NR:ns ## COG: CAC1785_1 COG0550 # Protein_GI_number: 15895061 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 455 122 576 578 472 52.0 1e-133 MKTPREIDMDLVDAQQTRRILDRMVGYMISPILWAKVKRGLSAGRVQSVALRMICDREEE INAFIPEEYWNLEADLKQPGSKKALAAKFYGNKDGKIAIRNKEELDGILKELEGCDYVVD EVKNGERIKKAPIPFTTSTLQQEASKVLNFSTQKTMRLAQQLYEGVEVKGHGTVGLITYL RTDSTRIAEEADTAARAFIEAQYGSKYVASGEQAKKSGAKIQDAHEAIRPTDISLTPVIV KESLQRDLFRLYQLIWKRFAASRMQPAVYETTSVKVAAGDYCFTLAASKLAFDGFMSVYI QEDDKEESSALLGRLEKGSVLTLEKLDSSQHFTQPPAHFTEASLVKAMEEQGIGRPSTYA PTITTIIARRYVAKENKNLYVTELGEVVNSMMKQSFPSIVDITFTANLEYLLDKVGDGSV PWKTVVRNFYPDLKEAVDKAEVELESVTIADEVTEEICELCGRNMVIKYGPHGKFLACPG FPECKNTKPYLEKIGVPCPKCGKDVVIKKTKKGRIYYGCEANPECDFMSWQRPSTEKCPK CGSYMVEKGNKLLCSNEQCGYSTSKNTK >gi|229784039|gb|GG667696.1| GENE 19 21351 - 21614 317 87 aa, chain - ## HITS:1 COG:SA1093_1 KEGG:ns NR:ns ## COG: SA1093_1 COG0550 # Protein_GI_number: 15926833 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Staphylococcus aureus N315 # 1 84 1 84 571 108 63.0 3e-24 MANCLVIVESPAKVKTIKKFLGTSYEVDASNGHVRDLPKSQLGISPEQDFEPKYITIRGK GDILAKLRKEVKKADKIYLATDPDLAS >gi|229784039|gb|GG667696.1| GENE 20 21629 - 22729 928 366 aa, chain - ## HITS:1 COG:CAC1784 KEGG:ns NR:ns ## COG: CAC1784 COG0758 # Protein_GI_number: 15895060 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Clostridium acetobutylicum # 12 362 7 
347 354 187 33.0 4e-47 MISHKERQYLCWFLSVPGIGAVTGMRIAKQFKSFEEIYNIEGKNLEQAGLLRPKVAAAFD AEKCSLDKRKRELEQLEKEGIRLVAFLDENYPKRLLQIPDYPVGLFVRGGLPDEERPAAA VIGARDCTNYGRQEAESVAAELARAGVQIISGLAYGIDSAGHRGALKAGGRTFGVLGCGI KICYPRENYGLYERMCRQGGVLSEWMPWEPPQQKNFPIRNRIISGLSDVLLVMEARQKSG SLITAGIGLEQGKEIFALPGRVSDPLSEGCNRLISEGAGVLLGPEEVLEYLALKAGKILR VHEKNENGLAKKEKMVYSCLDLQPKGLEEIVILSGLSVSECIGTLLDLELGGYVVQTSHQ YYGKRL >gi|229784039|gb|GG667696.1| GENE 21 23649 - 25124 1213 491 aa, chain - ## HITS:1 COG:alr4088 KEGG:ns NR:ns ## COG: alr4088 COG0606 # Protein_GI_number: 17231580 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Nostoc sp. PCC 7120 # 1 489 1 491 509 437 45.0 1e-122 MFSKVHSIGILGVEGYPVVVEADVSEGLPGFTMVGYLSGAVREAQDRVRTALKNSGFRLP ARKITVNLSPADVRKEGTGFDLPIAVAVLAATGLVDQSLLGTSVLVGELGLDGRIKPVSG ALPIAAAAGSAGMKRCFLPISNVKEGQIIENIDIIGVENIKELAAVLNDPDRLSPAPRRD SHCLSETESYDVDFSEVNGQFLLKRATEIAVAGMHNILYIGSAGTGKTMMARRIPTIMPS LSHEENIEISKIYSICGLLPQEQPLLSRRPFRSPHHTISPQALTGGGRIPKPGEISLASR GVLFLDELPEFQKNSIEVLRQPLEERKITISRLHGACEFPADFMLVAAMNPCKCGYYPDR SRCTCMEPQIRQYLSRISKPLLDRIDICAEAVPMKYEELKGRGEGEPSSAIRERVEAARI IQRQRLKDSGIYFNSAMNGAQIRAFCSLDQKEQAFLKRIFEHMGLSARGYGKILKVARTI ADLAGSSEIAS >gi|229784039|gb|GG667696.1| GENE 22 25280 - 26212 632 310 aa, chain - ## HITS:1 COG:PA3301 KEGG:ns NR:ns ## COG: PA3301 COG2267 # Protein_GI_number: 15598497 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Pseudomonas aeruginosa # 7 304 6 300 316 171 33.0 1e-42 MIRRREFRIPSSDGKSELHTVLWEPDGEVKQLLLISHGMTEHILRYDAFAVFLAEQGIAV IGHDHLGHGGTVQDGRYGYFSDHGGHICVIRDLHRTASRIRVMYPGRPLYMLGHSMGSFF LRRYLTLYGSGLAGAILIGTGDQALPIVIAGKLIASTIGLIRGNDYQSKWMHQLVLGNYN RSFMPARTDNDWLSRNCESVDQYCSDPLCNFKFTASAYCDFFNIMLDLKLHRQEDRIPRS LPLFILSGAQDPVGEFGRGVRRVFRRYKKIGMTDAEMILYPDDRHEILNETDKENVWHDI LSWITYHELS >gi|229784039|gb|GG667696.1| GENE 23 26325 - 26984 460 219 aa, chain - ## HITS:1 COG:no KEGG:Closa_1841 NR:ns ## KEGG: 
Closa_1841 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 217 42 258 259 299 64.0 6e-80 MELFCEYIDVLPDDEKVEKVIRSVRTKISEAAYEAIYKASLSKNGDKADHIYRFLIRGFQ MGAGVTQSLQLPEVFQIFEICRNLDNEAHLETEFLRFVQMDEDILVSRIGPKNDVLVLIA PHFADRMPAENWIIYDENRKKAVLHQAGQPWFLMELDSREWEERLHKAAEEDGYRELWKA LRRSITIKERWNPVCQRNHMPLRYRAYMPEFDHGLKDRR >gi|229784039|gb|GG667696.1| GENE 24 28041 - 29405 1149 454 aa, chain - ## HITS:1 COG:CAC3343 KEGG:ns NR:ns ## COG: CAC3343 COG4277 # Protein_GI_number: 15896586 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with the Helix-hairpin-helix motif # Organism: Clostridium acetobutylicum # 13 437 1 427 440 516 56.0 1e-146 MCSRWLMLIQEDLSIQEKLEILSDAAKYDVACTSSGVDRKGKAGMLGNTVAAGLCHSFAG DGRCISLLKILFTNQCIYDCKYCINRCSNDVVRTSFTPDEVCTLTMEFYRRNYIEGLFLS SGILYSADYTMDLIYQTLKRLREVCRFNGYIHVKAIPGADPVLIEKVGFLADRMSINLEL PTADGLKKLAPGKSRSKILAPMRQIQRGITANKYELMEYKRTPSFVPAGQSTQMIVGATP ENDYQMMMVAEALYQNYDLKRVFYSAFVPVNQDTSLPALPGGPPLLREHRLYQADWLLRF YGFKASELLTEDRPNFNVFLDPKCDWALRHLEQFPVEINRADYYTLLRVPGMGVKSVKRI IAARRQGVLDFDALKKMGVVLKRAMYFITCSGRMMAPIRMDEDYITSHLVGDERRAVWDI SHQGAYRQLSLFDDMKLNLSPAPADRYSVVTGSM >gi|229784039|gb|GG667696.1| GENE 25 29557 - 30576 1215 339 aa, chain - ## HITS:1 COG:BS_galE KEGG:ns NR:ns ## COG: BS_galE COG1087 # Protein_GI_number: 16080937 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Bacillus subtilis # 1 337 1 335 339 487 69.0 1e-137 MAILVTGGAGYIGSHTCVELLKAGYEVVVVDNLCNSCEESMERVQEITGKTLAFYEADLL DKAALSRIFKEQKIDAVIHFAGLKAVGESVYKPLEYYHNNITGTLVLCDVMREHGVKSIV FSSSATVYGDPAFVPITEECPKGEITNPYGRTKGMLEQILADLHTADPEWNVMLLRYFNP IGAHESGRIGENPKGIPNNLLPYITQVAVGKLESLGVFGNDYDTPDGTCVRDYIHVADLA DGHVKALKKLEGEKGGVLIYNLGTGCGYSVLDVIHAFEEANGLKVPYEFKPRRAGDVPQC YADPAKAERELGWKAQRDLKDMCRDSWNWQKNNPEGYAK >gi|229784039|gb|GG667696.1| GENE 26 30611 - 30775 103 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MIFWKETAFSSHKTGSIGRKVSSGTGLHTFANIKLKKYIGFSKKSALLFPFRLL >gi|229784039|gb|GG667696.1| GENE 27 30738 - 31505 553 255 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623016|ref|ZP_06115951.1| ## NR: gi|266623016|ref|ZP_06115951.1| putative lipoprotein [Clostridium hathewayi DSM 13479] putative lipoprotein [Clostridium hathewayi DSM 13479] # 1 255 12 266 266 484 100.0 1e-135 MRRWLLTAALVCALASGGCASKYTGQENGKAEEQSNQAFAERSQNETAAGQTEAGLPEKD RQEESAETAGETADKQNGTPQAALKLQETVLRGMVWTASEKRMVYNTADNSFAWWCLYGI AAAGAEEGETEISIKTEKLKTYLSVLFEGNTSLPAIPESMSGFITPDEKNDTVVFHLSEA MEGSFTFSEPERIDETHVKIAGVWENGRPELTSPPTVLFYLVNRTGDDYPYTVQYMETEG YICNDILEGDRVFEP >gi|229784039|gb|GG667696.1| GENE 28 31520 - 31861 402 113 aa, chain - ## HITS:1 COG:CAC0417 KEGG:ns NR:ns ## COG: CAC0417 COG1393 # Protein_GI_number: 15893708 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Clostridium acetobutylicum # 1 112 1 111 112 121 58.0 3e-28 MNIQIFGKKKCFDTKKAERYFKERNIRVQMIDMAEKGMSKGEFTNVKQAVGGVEALIDEN CRDKDLLALLKYLSPEDREEKILLNPQVIKTPVVRNGKKATVGYHPEIWKDWE >gi|229784039|gb|GG667696.1| GENE 29 31892 - 32323 473 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288870970|ref|ZP_06115953.2| ## NR: gi|288870970|ref|ZP_06115953.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 143 1 143 143 272 100.0 5e-72 ILPGICGSVCGFIDRYLVSAAVTVFLAVSSVVCRAMDSLVDGLILLARKTTHRQLPGTAK IRENDRLACVLGSMADEAAAWKRRWGRKEKADRTSAIPRFIEAEEELKRTGKLVEESFSF GLMLFCIGLCLTLGYLLAVFLIG Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:47:25 2011 Seq name: gi|229784038|gb|GG667697.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld90, whole genome shotgun sequence Length of sequence - 36924 bp Number of predicted genes - 37, with homology - 37 Number of transcription units - 12, operones - 10 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + 
Prom 69 - 128 7.9 1 1 Tu 1 . + CDS 198 - 374 285 ## PROTEIN SUPPORTED gi|227872333|ref|ZP_03990687.1| ribosomal protein S21 + Term 391 - 444 12.9 + Prom 496 - 555 2.7 2 2 Op 1 . + CDS 575 - 841 356 ## Closa_2071 hypothetical protein 3 2 Op 2 . + CDS 871 - 2118 1131 ## Closa_2070 sporulation protein YqfD 4 2 Op 3 . + CDS 2105 - 2488 450 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 5 2 Op 4 17/0.000 + CDS 2407 - 3135 780 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 6 2 Op 5 . + CDS 3119 - 3616 680 ## COG0319 Predicted metal-dependent hydrolase 7 2 Op 6 . + CDS 3613 - 3879 366 ## Closa_2067 hypothetical protein 8 3 Op 1 . + CDS 4811 - 5662 978 ## Closa_2067 hypothetical protein 9 3 Op 2 2/0.000 + CDS 5652 - 6923 493 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 10 3 Op 3 . + CDS 6709 - 7338 811 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 11 3 Op 4 . + CDS 7390 - 7776 503 ## Closa_2065 hypothetical protein + Prom 7825 - 7884 4.9 12 4 Op 1 . + CDS 7910 - 9130 966 ## COG2081 Predicted flavoproteins 13 4 Op 2 . + CDS 9127 - 10731 1386 ## COG2509 Uncharacterized FAD-dependent dehydrogenases + Prom 10755 - 10814 2.8 14 5 Tu 1 . + CDS 10835 - 11047 249 ## COG0263 Glutamate 5-kinase + Prom 11890 - 11949 80.4 15 6 Op 1 . + CDS 12003 - 12557 597 ## COG0212 5-formyltetrahydrofolate cyclo-ligase 16 6 Op 2 . + CDS 12592 - 13845 1378 ## COG0014 Gamma-glutamyl phosphate reductase + Term 13912 - 13956 10.2 + Prom 13960 - 14019 6.7 17 7 Op 1 . + CDS 14079 - 14936 187 ## PROTEIN SUPPORTED gi|183179613|ref|ZP_02957824.1| conserved hypothetical protein + Prom 13960 - 14019 6.7 18 7 Op 2 . 
+ CDS 14079 - 14936 166 ## PROTEIN SUPPORTED gi|229245919|ref|ZP_04369978.1| SSU ribosomal protein S12P methylthiotransferase 19 8 Op 1 1/0.000 + CDS 15868 - 16569 453 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 20 8 Op 2 1/0.000 + CDS 16607 - 17395 1035 ## COG0249 Mismatch repair ATPase (MutS family) 21 8 Op 3 6/0.000 + CDS 18320 - 20176 2174 ## COG0249 Mismatch repair ATPase (MutS family) 22 8 Op 4 12/0.000 + CDS 20240 - 22228 1957 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 23 8 Op 5 . + CDS 22225 - 23199 1076 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 24 8 Op 6 . + CDS 23177 - 24457 1523 ## COG4100 Cystathionine beta-lyase family protein involved in aluminum resistance + Prom 24480 - 24539 2.7 25 9 Op 1 . + CDS 24590 - 24841 329 ## Closa_2047 hypothetical protein 26 9 Op 2 . + CDS 24891 - 25586 654 ## COG2003 DNA repair proteins 27 10 Op 1 . + CDS 26580 - 27413 966 ## COG1792 Cell shape-determining protein 28 10 Op 2 . + CDS 27462 - 27980 479 ## Closa_2043 rod shape-determining protein MreD 29 10 Op 3 3/0.000 + CDS 27973 - 28728 890 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 + Prom 29576 - 29635 80.4 30 11 Op 1 1/0.000 + CDS 29788 - 31740 2206 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 31 11 Op 2 22/0.000 + CDS 31772 - 32437 726 ## COG0850 Septum formation inhibitor 32 11 Op 3 . + CDS 32471 - 32854 446 ## COG2894 Septum formation inhibitor-activating ATPase 33 12 Op 1 22/0.000 + CDS 33782 - 34153 367 ## COG2894 Septum formation inhibitor-activating ATPase 34 12 Op 2 . + CDS 34164 - 34442 269 ## COG0851 Septum formation topological specificity factor 35 12 Op 3 . + CDS 34445 - 35575 1035 ## COG0772 Bacterial cell division membrane protein 36 12 Op 4 . + CDS 35590 - 35985 621 ## COG1803 Methylglyoxal synthase 37 12 Op 5 . 
+ CDS 35996 - 36923 670 ## COG1686 D-alanyl-D-alanine carboxypeptidase Predicted protein(s) >gi|229784038|gb|GG667697.1| GENE 1 198 - 374 285 58 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227872333|ref|ZP_03990687.1| ribosomal protein S21 [Oribacterium sinus F0268] # 1 58 1 58 58 114 96 9e-25 MSNVIVKDNESLDSALRRFKRNCAKAGIQQEIRKREHYEKPSVRRKKKSEAARKRKYN >gi|229784038|gb|GG667697.1| GENE 2 575 - 841 356 88 aa, chain + ## HITS:1 COG:no KEGG:Closa_2071 NR:ns ## KEGG: Closa_2071 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 88 1 88 88 134 82.0 1e-30 MFEEMKFRAAENLKLPKDVVLGEVLVSFLGRHSVVIENYRGIILYTDTLIKLQAKTCRVA VSGSRLVIEYYTNEEMKINGLIKSMEFE >gi|229784038|gb|GG667697.1| GENE 3 871 - 2118 1131 415 aa, chain + ## HITS:1 COG:no KEGG:Closa_2070 NR:ns ## KEGG: Closa_2070 # Name: not_defined # Def: sporulation protein YqfD # Organism: C.saccharolyticum # Pathway: not_defined # 2 415 1 414 414 650 73.0 0 MIVTWLNHRISGYLFVEMTGFSPERFFNMCSVHEIEIWGVSDTGRSCRFYMTVKGFKKIK PIVRKSKVRLKVLGKFGLPFFLHRNRKRKLYAAGLASFFVLLYILSIFIWDIEFEGNHMY TYDTLLKYCETEEIRYGMVKSRIDCDALEESLRSAFPEITWVSARVSGTRLLVKIKENEV LSEIPAKDESPCDIVASKDGVITSMIVRQGVPAVSVGDEVKKGDVLVSGTLHVIGDSEEI MNTHYVHSDADITARTEYHITRQIPLFKRVDVETGNVRRGKYAKVFQYSLMLMRPKQEGT TWKVTMEEDQLHLFQNFYLPVYLGDITAKEYISYERPYTEEEKKQAADEINENLKKNLLE KGVQILENHVKILDNESLCQIAIDLVAEEAVGQSQGVEIPITQEQEETEESNERN >gi|229784038|gb|GG667697.1| GENE 4 2105 - 2488 450 127 aa, chain + ## HITS:1 COG:BH1361 KEGG:ns NR:ns ## COG: BH1361 COG1702 # Protein_GI_number: 15613924 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Bacillus halodurans # 20 127 22 129 320 65 31.0 3e-11 MSVIETIIDIPAEHEKNVCGQFDSYLKKIERTLHVTMIERDGALKIIGPEQMVQKAKSVF NNLIELSKRGNAITEQNVDYALSLSFSESDGQILEIDKDIICRTITGKPVKPKTMGQKQY VDQIRKR >gi|229784038|gb|GG667697.1| GENE 5 2407 - 3135 780 242 aa, chain + ## HITS:1 COG:lin1504 KEGG:ns NR:ns ## COG: lin1504 COG1702 # 
Protein_GI_number: 16800572 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Listeria innocua # 28 212 131 315 319 267 68.0 2e-71 MQNDYGKAGEAEDHGPEAVCGSDQEKMIVFGIGPAGTGKTYLAMAMAIQAFKNGEVSRII LTRPAIEAGEKLGFLPGDLQSKIDPYLRPLYDALYQIMGAESYLHNAEKGLIEVAPLAYM RGRTLDNAYIILDEAQNTTPAQMKMFLTRIGFGSKVIVTGDQTQKDLPAGAVSGLDTAVR VLKRIDDIGFCYLSSSDVVRHPLVQKIVQAYDDYETKKKTGEKERKTAKAPGGRKGRYDD SY >gi|229784038|gb|GG667697.1| GENE 6 3119 - 3616 680 165 aa, chain + ## HITS:1 COG:CAC1293 KEGG:ns NR:ns ## COG: CAC1293 COG0319 # Protein_GI_number: 15894575 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Clostridium acetobutylicum # 18 158 18 159 166 110 46.0 2e-24 MTIHIEYETEEELKLPYEEIITTVVNEALDYENCPYEAEVNVLLTDNEDIRQINKEYRDI DNATDVLSFPMADYDSPSDFDRLEDMASDYFNPETGELLLGDIVISVDKVKEQAEKYGHS ETRELAFLVAHSMLHLCGYDHMVDEERLVMEKKQEEILKRGGYER >gi|229784038|gb|GG667697.1| GENE 7 3613 - 3879 366 88 aa, chain + ## HITS:1 COG:no KEGG:Closa_2067 NR:ns ## KEGG: Closa_2067 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 85 1 83 383 79 48.0 3e-14 MKKRAWFCAAGLILLMGAVTGCQGKYQEVAVTDASVETDTTPAPTKEIKIETTTEAPDGE EVESGDWYTYKERTEQDGKIRSYLTLAS >gi|229784038|gb|GG667697.1| GENE 8 4811 - 5662 978 283 aa, chain + ## HITS:1 COG:no KEGG:Closa_2067 NR:ns ## KEGG: Closa_2067 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 283 100 382 383 473 81.0 1e-132 MMSNDKAALPQYGINRADIVYEAPVEGGMNRYMAIIEDYDDLDRIGSVRSCRTYYTYFAR EFDAIYAHYGQSTFAKPYLANVDNINGLDGIGTVAYYRTKDRKSPHNAYSSGERLNKAIA QLGYSESYSADYKGHYRFAKTGHEVTLDGAPSVMDAARVYPGYVMNKPWFEYNESDGLYY RYQYGAPHKGDEGPIAVKNILFQYCPSGHYATTDYLDINVHDDSYGLYATEGKAIPVKWT KDGEFGLTHYYDMENNEVILNQGKTWICVISAQDSSNVEVYGK >gi|229784038|gb|GG667697.1| GENE 9 5652 - 6923 493 423 aa, chain + ## HITS:1 COG:CAC3213 KEGG:ns NR:ns ## COG: CAC3213 COG2244 # 
Protein_GI_number: 15896460 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Clostridium acetobutylicum # 22 338 6 302 512 133 29.0 6e-31 MENNRTSRSRKKPEKKGASNFIIQGTILAAAGIIVRLIGMLYRIPLANILGDEGNGYYSS AFTIYSILLIVSSYSLPTAVSKMIATKNARKEYKNSVRILKAALFYGTVVGGIGAAVLWF GADFFATRFLNMPYTRYALKTLAPTIWVIAYLGVFRGYFQGLGTMIPTALSQIFEQIINA IISIVAAERLFHAGLKANLVHETTEFSFAFGAAGGTIGTGAGAVAGLLFLLFLMASYRPV MMRQAKRDRTRRRDSYGEISAVLLMTVLPIVMSSVAYNISTVIDNSIYGKGMAFLGMGAS EAVSTWGVYAGKYHLLFNIPVAIANSLASSLIPSLSRAVARKKPRADREQGIHGDSFFHG DSNPGYGWAYRACRAGVQSAVQPLRQYQSDPYDDVRFSGCRIFLPVHGDKCRASGNQPHA DTA >gi|229784038|gb|GG667697.1| GENE 10 6709 - 7338 811 209 aa, chain + ## HITS:1 COG:CAC1016 KEGG:ns NR:ns ## COG: CAC1016 COG2244 # Protein_GI_number: 15894303 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Clostridium acetobutylicum # 4 195 326 523 539 62 28.0 4e-10 MVIRFSMVIAIPATVGLTVLAGPVCNLLFSRYDNTNLIRMMMYGSLAVVFFSLSTVTNAV LQGINHMQTPLKNAVISLVLHIIVLCVMLFGFKMGIYSVVYSNILFALFMCILNGMAISR YLNYRQEMKKTFILPTIASGIMGGAAYGVYQLVHMTLKSNTIGVLLAIAVAVIVYGILLL KLRCVDEVELSGMPGGTKLVRIAQKCHLM >gi|229784038|gb|GG667697.1| GENE 11 7390 - 7776 503 128 aa, chain + ## HITS:1 COG:no KEGG:Closa_2065 NR:ns ## KEGG: Closa_2065 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 128 1 130 130 207 83.0 9e-53 MKKYMFCFLLFAATSLLCLGIGFVITKDSVQPEEAIPGATLETETVTEDQIVINQEKVSP APEKEKEKYYLVSEDGYLLVLYQDQTTICLYTHMPVTDFPEEERGRLMDGIWFSSMIEVF NYLESYTS >gi|229784038|gb|GG667697.1| GENE 12 7910 - 9130 966 406 aa, chain + ## HITS:1 COG:CAC3590 KEGG:ns NR:ns ## COG: CAC3590 COG2081 # Protein_GI_number: 15896824 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 5 399 3 403 405 318 43.0 2e-86 
MKRNVMIIGGGASGMTAAIAAAREGARVVILEHTKRPGKKILSTGNGKCNLTNLDMRKDA FRGEHPGFAWKVIEAVPVKETVSFFEDLGLVLTDRNGYVYPATGQASSVLDALRFELERL KIPVLCECEVSGISKKLELRTTQGVMKADAVILAAGSKAAPGTGSDGSGYKLAGSLGHRV IEPLPALVQLKCRETWYKQFAGVRTDAFVTLFCDGHKTASDRGELQFTDYGISGIPVFQV SRFASRALNEGKKVTVRLDLMPVMDKQELFHFLTERVRRLGYRPGEEFLNGVLNKKLHQV LLKKSGIPAGRRTSEITEKQLRVLTDRIKALETTVTGVNSFDQAQICCGGVDTREINPCT MESKLVSQLFLAGEILDVDGICGGYNLQWAWSSGILAGRNAGRGPI >gi|229784038|gb|GG667697.1| GENE 13 9127 - 10731 1386 534 aa, chain + ## HITS:1 COG:L195271 KEGG:ns NR:ns ## COG: L195271 COG2509 # Protein_GI_number: 15673161 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Lactococcus lactis # 1 528 1 529 535 507 49.0 1e-143 MIRISQLKLSVNHSEEDLKKKAAKMLKLQPADIRRITVIKQSVDARKKSSEVLFIYTVDV ETAREDKVLHRLNSPNIAKAEKKEYQFPKPGTEPLKHRPVIIGSGPAGLFCALMLARAGY APLIFERGERVDERKRSVLRFWQGGELNGSSNVQFGEGGAGTFSDGKLNTLVKDPMLRNR KVLEVFVEFGADPSILYKNKPHIGTDVLSVIVKNMREEIIRLGGTFSFQSCVTDFCIEGG KLKAITVNGTDKIETEVTVLAVGHSARDTFEKLLERSVPMEKKAFAVGLRIQHPQRDINI AQYGAAEIEALGAADYKVTHQCANGRGVYSFCMCPGGYVVDASSEPGRLAVNGMSYHARD GVNANSALIVTVTPQDFPEDSPLAGIAYQRRLEEAAYRCGGGKIPVQLYGDFKKKRITTA FGEVTPAFEGKYAMANLREFFPDTLSESLIEGVEAFNGRIAGFSRGDAILAGVESRTSSP VRIPRDERFESSVKGLYPCGEGAGYAGGITSAAMDGIKTAEAVVSRYAPHHGMA >gi|229784038|gb|GG667697.1| GENE 14 10835 - 11047 249 70 aa, chain + ## HITS:1 COG:lin1228 KEGG:ns NR:ns ## COG: lin1228 COG0263 # Protein_GI_number: 16800297 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Listeria innocua # 10 68 1 58 276 60 55.0 7e-10 MIGDTMKTEVREKLKDKQRIVIKIGSSSLTHTYTGDLNLMKIEKLVRVISDLKGEGKSVV LVSSGAIAAS >gi|229784038|gb|GG667697.1| GENE 15 12003 - 12557 597 184 aa, chain + ## HITS:1 COG:BH1417 KEGG:ns NR:ns ## COG: BH1417 COG0212 # Protein_GI_number: 15613980 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Bacillus halodurans # 3 179 1 180 186 96 33.0 3e-20 
MEMSGKSEVREWVKENRLALTQKQELEWNDAICKKILSLRTIRQAFCVYCYVSFRHEAGT EKLITSLLEMGKYVAVPKVEGKRLVFYAIQGKRDLESGVMGIMEPKLNCLRIRDEFAPVI VPGIAFDREGHRVGYGGGYYDRFFEEEPDHPRFGIAYPFQLFDSLPAESHDKGMDLVITP YDPG >gi|229784038|gb|GG667697.1| GENE 16 12592 - 13845 1378 417 aa, chain + ## HITS:1 COG:CAC3254 KEGG:ns NR:ns ## COG: CAC3254 COG0014 # Protein_GI_number: 15896499 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Clostridium acetobutylicum # 3 413 7 417 418 472 58.0 1e-133 MTLIETGKRAKETAGVLGILDSDKKNEGLRAAAAALLEGEEDILTANQDDVIKATINGMS AGMIDRLELTPSRIEAMADGLISVAGLDDPVGEVISMKVRPNGLTIGQKRVPMGVVGIIY EARPNVTADAFGLCFKSGNSVILKGGSDALESNKAIVRCLRAGLKNAGLPEDCVQLITDT DREVTKEFMRLHDYVDVLIPRGGAGLIRSVVENSTVPVIETGTGNCHIFVDETADFKMAL DIIFNAKTQRIGVCNACESLVIHKKIAEEFLPLLKARLDKKQVEVRADEEACRIVPQFVP ASEEDWGTEYLDYILSLKVVDSIDEAIAHINRYNTKHSEAIITSDYGNSQRFLNEIDAAA VYVNASTRFTDGFEFGFGAEIGISTQKLHARGPMGLKELTTTKYIIYGNGQCRPPAL >gi|229784038|gb|GG667697.1| GENE 17 14079 - 14936 187 286 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|183179613|ref|ZP_02957824.1| conserved hypothetical protein [Vibrio cholerae MZO-3] # 27 259 14 235 470 76 25 2e-13 MDHLHMTHEEALKAAPAEEPGRQQFFMERCRQMIGERKKNGTYRGTFHIETFGCQMNARD SEKLTGILEASGFTETDTEEADFVLYNTCTVRDNANQRVYGRLGYLNRIKQKNPAMMIAL CGCMMQEETVVEKIKKSYRFVDIIFGTHNIFKLAELISERMDEKKMVVDIWKETDRIVEE LPTDRKYPFKSGVNIMFGCNNFCSYCIVPYVRGRERSRNPQDIVGEIRRLVADGVVEVML LGQNVNSYGKNLEEPMTFAKLLQEVEAIDGLRRIRFMTYPSEGLIG >gi|229784038|gb|GG667697.1| GENE 18 14079 - 14936 166 286 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229245919|ref|ZP_04369978.1| SSU ribosomal protein S12P methylthiotransferase [Catenulispora acidiphila DSM 44928] # 198 275 229 306 529 68 43 5e-11 MDHLHMTHEEALKAAPAEEPGRQQFFMERCRQMIGERKKNGTYRGTFHIETFGCQMNARD SEKLTGILEASGFTETDTEEADFVLYNTCTVRDNANQRVYGRLGYLNRIKQKNPAMMIAL CGCMMQEETVVEKIKKSYRFVDIIFGTHNIFKLAELISERMDEKKMVVDIWKETDRIVEE LPTDRKYPFKSGVNIMFGCNNFCSYCIVPYVRGRERSRNPQDIVGEIRRLVADGVVEVML LGQNVNSYGKNLEEPMTFAKLLQEVEAIDGLRRIRFMTYPSEGLIG 
>gi|229784038|gb|GG667697.1| GENE 19 15868 - 16569 453 233 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 168] # 1 218 202 421 451 179 41 3e-44 YGKNLEEPMTFAKLLQEVEAIDGLRRIRFMTSHPKDLSDELIEVMKNSDKICHHLHLPLQ SGSSRILKAMNRRYTKEQYLELVEKIRAAVPDISLTTDIIVGFPGETEEDFSETMDVVRR VRFDSAFTFIYSRRTGTPAAAMEDQIPEDVVKNRFDRLLSEVQAVSAEVCGRDVHTVKSV LVEELDDHVEGFVTGRLDNNTIVHFKGDRSLIGKIVDVYLDESKGFYYMGTMK >gi|229784038|gb|GG667697.1| GENE 20 16607 - 17395 1035 262 aa, chain + ## HITS:1 COG:CAC1837 KEGG:ns NR:ns ## COG: CAC1837 COG0249 # Protein_GI_number: 15895112 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 4 255 3 248 869 221 44.0 1e-57 MAGMSPMMVHYLETKKQYPDCILFYRLGDFYEMFFEDALTVSKELEITLTGKECGLEERA PMCGVPYHALDNYLYRLVQKGYKVAIAEQMEDPKQAKGLVKREVIRVVTPGTITSAQALD ETKNNYLMGIVYIDERFGIAVSDISTGDFLVTEVASERELADEINKFCPSEIICNDAFFV SGVDTEEVKNRYQTVISALDSHFFSDEGCRKILKEHFKVGSLDGLGLQDYDTGVIAAGAV MEYMYETQKSTLSHITAITLAS >gi|229784038|gb|GG667697.1| GENE 21 18320 - 20176 2174 618 aa, chain + ## HITS:1 COG:CAC1837 KEGG:ns NR:ns ## COG: CAC1837 COG0249 # Protein_GI_number: 15895112 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 3 614 262 867 869 628 52.0 1e-179 MIIDTSTRRNLELLETMREKQKRGTLLWVLDKTKTAMGARLLRTYIEQPLIHKDDIIARQ NAIEELNMNYISREEICEYLNPIYDLERLIGRISYKTANPRDLISFKNSLEMLPYIKDLM GEFTTPLLKELWEELDPLEDVHDLVSRAIVDDPPVSLRDGGIIKEGYHEETDKLRHAKTE GKTWLAQLESRERDKTGIKNLKVKYNKVFGYYFEVTNSFKGMVPDYFVRKQTLANAERYT TDELKELEDMILGAEDKLYTLEYGLFCEVRDTIAAEVLRIQQTARAIAGIDVMTSLSAVA TKNNYVKPRINEKGVIDIKNGRHPVVEKMMRDDLFVANDTYLDNTKNRLSIITGPNMAGK STYMRQTALIVLMAQLGSFVPADEANIGICDRIFTRVGASDDLASGQSTFMVEMTEVANI LRNATKNSLIVLDEIGRGTSTFDGLSIAWAVVEHISNPKLLGAKTLFATHYHELTELEGT INGVNNYCIAVKEQGDDIVFLRKIVKGGADKSYGVQVAKLAGVPDSVIVRAKELLVELSD 
ADITARAKEIAEAGAGTPKHAAVPRPDEVDLQQMSLFDTVKDDDIVRELGELELGNMTPI DALNTLYRLQTKLKNRWQ >gi|229784038|gb|GG667697.1| GENE 22 20240 - 22228 1957 662 aa, chain + ## HITS:1 COG:CAC1836 KEGG:ns NR:ns ## COG: CAC1836 COG0323 # Protein_GI_number: 15895111 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Clostridium acetobutylicum # 4 660 3 621 622 407 36.0 1e-113 MPNITLLDQNTINKIAAGEVIERPASVVKELLENAIDARATAVTVEIKEGGTTFIRVTDN GCGIPREEVPLAFLRHSTSKIKSVEDLFTISSLGFRGEALASIAAVCQVELITKTSEALT GSRYQIEGGMERPLEEIGAPEGTTFIARNLFYNTPARRKFLKTPMTEGSHVAELVEKIAL SHPEISIRFIQNNQNKLHTSGNHNLKDIIYTVFGREIAANLLAVEAKKQDISISGFIGKP VIARGNRNYENYFINGRYIRSSIISKAIEEAYKPFMMQHKYPFTMLHFTIEPELLDVNVH PTKMELRFRDGEMVYRMVYDAVSGALAHKELIPEVELNKDRTDQEAKEARKREPSPEPFE LRRLEAMSRQQAACAPGSKRLKPAEPSLMKDPDFLAENWLKKPAAPPETNPLRGPSQEPV PSSREEAAAVTKPAAPEGETITPEHDNTAGAGKPEQLDLFDGKLLEPKSRQMHKLIGQVF DTYWLVEFNEQLYIIDQHAAHEKVLYEKTMATLKNREYSSQMLDPPIILTLNMNEEVLLK EHMKYFSDMGFEIEPFGGREYAVRGVPANLLSIAKKDLLIEMIDGLSDDVSTHNPDIIYD RVATMSCKAAVKGGNRLSAAEANELIDQLLNLENPYACPHGRPTIISMSKYELEKKFKRI VS >gi|229784038|gb|GG667697.1| GENE 23 22225 - 23199 1076 324 aa, chain + ## HITS:1 COG:CAC1835 KEGG:ns NR:ns ## COG: CAC1835 COG0324 # Protein_GI_number: 15895110 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Clostridium acetobutylicum # 1 309 1 308 309 304 49.0 1e-82 MKPLIILTGPTAVGKTALSVRLAKAVGGEIISADSMQVYRHMDIGSAKVTRDEMEGIPHY LIDILEPTGEFNVAVFKKMAAEAVQEIYSHGSIPIVVGGTGFYIQALLYDIDFTEQGEER EIREELERLGEERGSVFLHDMLAEIDPEAAMEIHANNRKRVIRAIEFYRQTGERISEHNR RERQKQSPYDFLYYVLTDDRKTLYERIDRRVDGMMEAGLVHEVEQLRDMGCTKNMVSMQG LGYKEILDYLMGNTTLDEAAAILKRDTRHFAKRQITWFKRERDVRWLNLEDFGRDREKVL EKILQDIRETYGLETENHGNGCDV >gi|229784038|gb|GG667697.1| GENE 24 23177 - 24457 1523 426 aa, chain + ## HITS:1 COG:BS_ynbB KEGG:ns NR:ns ## COG: BS_ynbB COG4100 # Protein_GI_number: 16078807 # Func_class: P Inorganic ion transport and 
metabolism # Function: Cystathionine beta-lyase family protein involved in aluminum resistance # Organism: Bacillus subtilis # 6 424 1 421 421 458 53.0 1e-128 MEMDAMYEQLGIEKRVLEFGKEIEEHLSERFAAIDETAEYNQLKVIKAMQDNRVSDIHFA ATTGYGYNDLGRDTLEDVYASVFHTESALVRPGLISGTHALHVALSGNLRPGDELLSPVG KPYDTLEEVIGIRESTGSLKEYGVVYRQVDLLPDGSFDFDGIAKAINEKTKLVTIQRSKG YATRPTLSVERIGQLIAFIRTVKPDVVCMVDNCYGEFVERIEPADVGADMMVGSLIKNPG GGLAPIGGYIAGRKDCIDRASYRLTAPGLGREVGASLGLNQSFYQGLFLAPTVVAGALKG AVFAANVYEKLGFAVVPNSTESRHDIIQAVTFGKPEGVIAFCQGIQAAAPVDSFVTPEPW AMPGYDAPVIMAAGAFVQGASIELSADGPIKPPYAVYFQGGLTWYHAKLGILMSMQKLKD AGLITL >gi|229784038|gb|GG667697.1| GENE 25 24590 - 24841 329 83 aa, chain + ## HITS:1 COG:no KEGG:Closa_2047 NR:ns ## KEGG: Closa_2047 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 83 1 83 83 102 75.0 6e-21 MKRKNCWVLLLLVLTGIVLGGFIGTMAQDVSWLSWLNFGQSFGFDSPLIINFGILVITFG LSIRITMASIIGVVLALITYRFI >gi|229784038|gb|GG667697.1| GENE 26 24891 - 25586 654 231 aa, chain + ## HITS:1 COG:CAC1241 KEGG:ns NR:ns ## COG: CAC1241 COG2003 # Protein_GI_number: 15894524 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Clostridium acetobutylicum # 4 231 6 229 229 192 42.0 3e-49 MSRITMKDIPADDRPYEKCLQTGPERLSDAELLSIIIRTGSREDSSLALAQKILALNYPS EGILGLLHLSLPELTQIKGIGRVKGAQLLCIGELSKRIWKKAVLEDVASFTNPEDIVNYY VEDMRHMEQEQIRIMLLNTKGVLLKDVLVSQGTVNASVVSPREIFIEALKYHAVNLIVVH NHPSGDPAPSREDITLTRRIMEAGDLIGIRMLDHIIIGDNSYTSFKERGII >gi|229784038|gb|GG667697.1| GENE 27 26580 - 27413 966 277 aa, chain + ## HITS:1 COG:CAC1243 KEGG:ns NR:ns ## COG: CAC1243 COG1792 # Protein_GI_number: 15894526 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Clostridium acetobutylicum # 1 236 35 273 283 92 27.0 7e-19 MRTGVGYFLIPVQSGVNRVGSAIYDEIMDYTKLKDALEDNKEMKDIITQLTEENTRLQAE EFELKRLRQLYELDQEYGQYHKVGARVIANDSSNWFQVFRIDKGSKDGIAVDMNVVAGGG 
LVGIVTDVGANYATVRSIIDDSSRVSAMAIQSGDGCIVAGDLTLFKEGRLRITNVLKDSD LKDGDKIVTSNTSSVFVPGVLVGYAAEITNDTNNVTKSGYLIPAAEFDSLQEVLVITDLK AKMMEEAGPEDGSTEETQAGGSEADEASSEEETTAAQ >gi|229784038|gb|GG667697.1| GENE 28 27462 - 27980 479 172 aa, chain + ## HITS:1 COG:no KEGG:Closa_2043 NR:ns ## KEGG: Closa_2043 # Name: not_defined # Def: rod shape-determining protein MreD # Organism: C.saccharolyticum # Pathway: not_defined # 1 172 31 202 202 259 87.0 4e-68 MKRIIFNILMIILAFTVQNCIFPLLPFLSASPNLLLILTFSFGFIHGKEAGMYYGLLSGL LMDLFYSGPFGFYTLIFVYVGYVNGICTRYYYEDYITLPLILSVVNDLAYNLYIYVFRFL IRKRLGIGYYFMNLMLPEIIFTVVTTLLIYRIFLMFNRHLEEIEKRGDSSIV >gi|229784038|gb|GG667697.1| GENE 29 27973 - 28728 890 251 aa, chain + ## HITS:1 COG:CAC1246 KEGG:ns NR:ns ## COG: CAC1246 COG0768 # Protein_GI_number: 15894529 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 15 251 8 267 916 60 25.0 2e-09 MFRDLLDILKDFLKKIISSRLLVLGVICIAMYAGLIHKLFNLQIVNGEQALNDYMQLTEQ TLTTAGTRGNIYDRNGKVLAYNKLAYSVTVQDTGAYKTTADQNAMYLRLVRILEKHGETV QGKFEVALDSNGDMIYTSSSEAARKRFLRDYYGLKSVEELDDEDNKYPSAISARELFEKA FTTAKLNEMKDADGNPVTMTDQEALDIINIKYALRLMSYRKYEATTVATQVSDETVADVL EHTADLAGVNV >gi|229784038|gb|GG667697.1| GENE 30 29788 - 31740 2206 650 aa, chain + ## HITS:1 COG:HI0032 KEGG:ns NR:ns ## COG: HI0032 COG0768 # Protein_GI_number: 16272007 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Haemophilus influenzae # 303 629 274 623 651 137 31.0 5e-32 MEQELQGTKGSKTIYKDSRGRVRETLNETESKAGNDIYLTQIAGLQTGIYHLVEQSLAGV LLKTILPGDVEITSTTDGSNMKIPVKDAYFQLINNNILSLKHFESEEASDVEKNINSKFL SSRDQIFSELQNELMSSHPATLKDLDQDLAAYMYYIYTYLASPTVGIIKEDLIDQTSQEY LAWKADEISLRDYLYYGIANGWVDTTKLETGSKYSNADDTFRTLVDYVLAELKEDRNFTK RIYRYLINDDVITGQELCLALFDQGILEYNEEDINQLKATGDAYAFLMKKISTLEITPAQ LALAPCNAGVVVTDIRTGEVLALVSYPGYDNNRIGDSAYFSQLQNDLSLPLYNNATQAKK 
APGSTFKPLTAVAALEEHVIGIDDTIECTGRYEEIDTPIKCWIYPGRHGPLNMVGGLLNS CNFFFADLGHRLSMDAEGNYVPSLGLDKLHKYAAMFGLDHKSGVEIDELEPQISDIDPER SAMGQGNHAYTNVQLSRYVAAIASRGNVFELSLLDKITDSEGNLIKDYTPEISSHIDAAD STWDTVQQGMREVIASGTAKKLFADLEVPIAGKTGTAQEAKNKPNHAFFISYAPYNNPEI CVTVNIPYGYSASNAANIAKSVYQYYYGEIDLESILNAGARDAANVTIRD >gi|229784038|gb|GG667697.1| GENE 31 31772 - 32437 726 221 aa, chain + ## HITS:1 COG:FN0175 KEGG:ns NR:ns ## COG: FN0175 COG0850 # Protein_GI_number: 19703520 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor # Organism: Fusobacterium nucleatum # 1 178 1 177 216 102 32.0 6e-22 MHNTVVIKSNKAGMTVYLEPEVPFQTLLADIGRKFGDNTKFWGSAQMTLTLEGRKLTAEE EFAVVNQIMENSNLEIVCLVDTDANRIERCEKALNEKLMELSCQTGQFFKGNLQDGETLE SEASIIIIGDVERGSRVIAKGNVIVLGALKGTVCAGVAGNEAAVIVALTMAPTQLRIADC TSRLDGRGKKLGRGPMTASLDENKVSIKPMKKSVEIFRKLM >gi|229784038|gb|GG667697.1| GENE 32 32471 - 32854 446 127 aa, chain + ## HITS:1 COG:BH3027 KEGG:ns NR:ns ## COG: BH3027 COG2894 # Protein_GI_number: 15615589 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Bacillus halodurans # 1 125 1 125 264 189 75.0 1e-48 MSEIIVITSGKGGVGKTTTSANVGTGLAILGKKVVLIDTDIGLRNLDVVMGLENRIVYNL VDVVEGNCRMKQALIKDKRYPNLFLLPSAQTRDKTSVNPGQMVKLVDDLREEFDYVLLDC PAGIEPS >gi|229784038|gb|GG667697.1| GENE 33 33782 - 34153 367 123 aa, chain + ## HITS:1 COG:FN0176 KEGG:ns NR:ns ## COG: FN0176 COG2894 # Protein_GI_number: 19703521 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Fusobacterium nucleatum # 1 122 142 264 264 119 53.0 2e-27 MTTPEVSAIRDADRIIGLLEASGMKTIDLVVNRIRMDMVRRGDMMSLDDVMDILAIDIIG AVPDDEDIVISTNQGEPLVGIGTPAGQAYMDICKRITGETVPLQNVAARGGFFFKLSSLL KRA >gi|229784038|gb|GG667697.1| GENE 34 34164 - 34442 269 92 aa, chain + ## HITS:1 COG:ssl0546 KEGG:ns NR:ns ## COG: ssl0546 COG0851 # Protein_GI_number: 16331863 # 
Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation topological specificity factor # Organism: Synechocystis # 10 86 15 85 97 58 37.0 2e-09 MSLFPVFRKKNSGEIARNRLKLLLVADKADCSPEVMQMIKDDMIHVISRYMDVDSDRIEI QMTKMKQPDCDSYMPVLYANIPIRDIPDKGIY >gi|229784038|gb|GG667697.1| GENE 35 34445 - 35575 1035 376 aa, chain + ## HITS:1 COG:BH3275 KEGG:ns NR:ns ## COG: BH3275 COG0772 # Protein_GI_number: 15615837 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus halodurans # 7 372 8 388 398 153 31.0 6e-37 MLFDYNFKYYNYRLVLYMVALSVIGILVVASASGQDSSTVTKQIIGIMVGFALAIGLSII DYHRIINLYALDYAVCILLLGAVLVMGHTAGGATRWIDVPGIGRIQPSEFVKIGLIVFFS WYWNKYQEKMNMPVMVGIAAILAVIPIALIFAEPNLSTSLVVIVIILCMVYTAGISYRWI GGVLAVAIPAGALFIYLLTQGLIPFIHDYQARRILAWIYPHAEQYAENMYQQKNSIMAIS SGQLQGKGLFNTTIASVKDGNWLGTTGETDFIFAIIGEELGFIGGVTVIVLFGLIVFECL RMAYKSRDMAGRLICTGMAALIGFQAFANIAVATQIFPNTGLPLPFISSGVSSLISIFIG MGLVLNVGLQRKIGNY >gi|229784038|gb|GG667697.1| GENE 36 35590 - 35985 621 131 aa, chain + ## HITS:1 COG:lin2020 KEGG:ns NR:ns ## COG: lin2020 COG1803 # Protein_GI_number: 16801086 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Listeria innocua # 1 131 1 131 134 153 53.0 7e-38 MNIGLVAHDSKKKLMQNFCIAYRGILSKHTLYATGTTGRLIEEVTNLNIHKYLAGHLGGE QQLGAQIEHNEIDLVIFLRDPLSTKAHEPDLVNIMKTCDMHNIPLATNLATAELLIKSLD RGDLEWREMYK >gi|229784038|gb|GG667697.1| GENE 37 35996 - 36923 670 309 aa, chain + ## HITS:1 COG:BS_dacB KEGG:ns NR:ns ## COG: BS_dacB COG1686 # Protein_GI_number: 16079376 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus subtilis # 57 305 17 251 382 157 39.0 2e-38 MKKKIMMKAGCAFLCVTMLSGCSSFKKLDDPYDFAQRTALYEGHTLSDRAESFAHDLCVV TADTPSSDQAVTAEAAAVFDLSDKTVIFGKNPFEKLYPASITKTMTALLAIKYGNLSDEV TVPEEAVITESGATLCGIKPGDKLTMEQLLYGLMLPSGNDAGAAIAVHMDGSIEKFADRM 
NEEARKLGATGTHFVNPHGLNDENHYTTAYDLYLIFNEAMKYPDFRKVIGTSSYDAVYAL ASGETTTKTWKQSNWYMTGERNMPEGLTALGGKTGTTRAAGSCLIMASSGSSGMEYISVV LKAPDRTGL Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:47:58 2011 Seq name: gi|229784037|gb|GG667698.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld91, whole genome shotgun sequence Length of sequence - 30569 bp Number of predicted genes - 26, with homology - 25 Number of transcription units - 17, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 74 - 2227 2064 ## COG0474 Cation transport ATPase + Term 2324 - 2363 -0.0 2 2 Tu 1 . - CDS 2209 - 2892 431 ## COG4845 Chloramphenicol O-acetyltransferase + Prom 3827 - 3886 21.6 3 3 Tu 1 . + CDS 4117 - 5265 1385 ## COG1454 Alcohol dehydrogenase, class IV + Term 5325 - 5379 14.1 - Term 5313 - 5366 6.3 4 4 Tu 1 . - CDS 5384 - 6274 954 ## DSY1244 hypothetical protein - Prom 6456 - 6515 4.8 - Term 6414 - 6454 1.2 5 5 Tu 1 . - CDS 6652 - 7488 513 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 7549 - 7608 10.2 + Prom 7559 - 7618 8.8 6 6 Op 1 . + CDS 7714 - 9096 1532 ## COG2211 Na+/melibiose symporter and related transporters 7 6 Op 2 . + CDS 9108 - 10085 841 ## Htur_2842 amidohydrolase 2 8 6 Op 3 . + CDS 10073 - 10918 946 ## gi|266623060|ref|ZP_06115995.1| creatininase 9 6 Op 4 . + CDS 10930 - 12330 1098 ## COG2272 Carboxylesterase type B + Term 12358 - 12407 9.1 + Prom 12353 - 12412 5.8 10 7 Tu 1 . + CDS 12440 - 14257 2002 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 14266 - 14311 12.6 + Prom 14362 - 14421 3.6 11 8 Tu 1 . + CDS 14459 - 15256 863 ## COG0406 Fructose-2,6-bisphosphatase + Term 15278 - 15347 14.1 12 9 Op 1 . - CDS 15269 - 15424 102 ## 13 9 Op 2 . - CDS 15375 - 16664 1332 ## COG2015 Alkyl sulfatase and related hydrolases - Prom 16685 - 16744 4.5 + Prom 16578 - 16637 3.5 14 10 Tu 1 . 
+ CDS 16721 - 17584 900 ## gi|266623065|ref|ZP_06116000.1| conserved hypothetical protein 15 11 Op 1 . + CDS 18529 - 18921 492 ## gi|266623066|ref|ZP_06116001.1| conserved hypothetical protein 16 11 Op 2 1/0.333 + CDS 18981 - 19400 524 ## COG2764 Uncharacterized protein conserved in bacteria + Prom 19410 - 19469 5.2 17 12 Op 1 . + CDS 19499 - 20875 1481 ## COG0534 Na+-driven multidrug efflux pump 18 12 Op 2 . + CDS 20917 - 21828 760 ## COG2421 Predicted acetamidase/formamidase 19 12 Op 3 . + CDS 21914 - 22939 1227 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase + Term 22952 - 22992 10.2 + Prom 23058 - 23117 10.3 20 13 Tu 1 . + CDS 23344 - 24246 818 ## EUBELI_00788 hypothetical protein + Term 24265 - 24313 9.1 + Prom 24361 - 24420 3.8 21 14 Op 1 . + CDS 24451 - 25422 991 ## COG0679 Predicted permeases 22 14 Op 2 . + CDS 25460 - 26845 1303 ## COG0733 Na+-dependent transporters of the SNF family + Prom 26896 - 26955 4.3 23 15 Op 1 . + CDS 27073 - 29022 2348 ## COG0441 Threonyl-tRNA synthetase 24 15 Op 2 . + CDS 29096 - 29410 295 ## Closa_2298 hypothetical protein + Term 29431 - 29491 8.1 - Term 29447 - 29473 0.3 25 16 Tu 1 . - CDS 29557 - 29970 355 ## COG2510 Predicted membrane protein - Prom 30106 - 30165 5.9 + Prom 30161 - 30220 7.2 26 17 Tu 1 . 
+ CDS 30382 - 30568 223 ## gi|288870988|ref|ZP_06116012.2| Fe-hydrogenase, gamma subunit Predicted protein(s) >gi|229784037|gb|GG667698.1| GENE 1 74 - 2227 2064 717 aa, chain + ## HITS:1 COG:SP1551 KEGG:ns NR:ns ## COG: SP1551 COG0474 # Protein_GI_number: 15901394 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 1 715 183 906 914 614 46.0 1e-175 MKVEEAALTGESVPVLKTERVLSLEEGTKDIPLGDRKNMVYMGSTVVYGRGAAVVTGTGM ETEMGKIADALTQAKDHETPLQGKLSQLSRILSVLVLGICVIIFAVSLLRAYPDISGYTV IDTFMIAVSLAVAAIPEGLAAVVTIVLSIGVTNMSRKNAIIRKLTAVETLGCTQIICSDK TGTLTQNRMTVVEHTGTDEKALAAAMALCNDAHLNDKNEAEGEPTQAALVAYAWSAGLDQ NSLKREYVRIGEAPFDSMRKMMSTVHEKADGTVIQFTKGAPDEVLKRCTHILSEGRGVPL CEEDREKILLANKSLAGRALRVLAAAAREYAARPSGWEAEDLEQKLCFLGLTGMIDPVRP EVLPAMKECRRAGIRPIMITGDHKDTAVAIAMQLGIITSPEQAITGAGLDEISDEDFGDE IGKYSVYARVQPEHKVRIVNGWRAKGCITAMTGDGVNDAPSIKNADIGVGMGITGTDVTK NAADMVLADDNFATIVSAVGEGRRIYDNIRKSIQFLLSSNLSEVLSIFFATLLGFVILKP VHLLWINLITDCFPALALGMEEPEKDVMNRPPRSPKEGIFAGGVGFHVAVQGIAVTVITL ASYLIGHYLEAGVWEITNSADGMTMAFLTLSMTEIFHSFNMRSLNQSVFTLKHHNPWLWL SMAASFLCTTVVIYVPFLSDAFEFEHISLTEYAVAIGLAFCIVPFVEGMKWLQRRRK >gi|229784037|gb|GG667698.1| GENE 2 2209 - 2892 431 227 aa, chain - ## HITS:1 COG:CAP0060 KEGG:ns NR:ns ## COG: CAP0060 COG4845 # Protein_GI_number: 15004764 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Clostridium acetobutylicum # 7 211 5 209 220 199 47.0 3e-51 MDQTVKFTPLDMQTWPRAQMFYYFSKMAPTGYSLTVDLDITAMRTALKHAGIKFFPAYLW LVTKTLQKQTEFKTAEVDGQVGCFDTLTPLYAAFHEDDNTFSLMWTEYADNFMVFYQSCL DNQRRYGGSHGVLAQAGQTPPANAYTISCIPWVSFKHFSVHSYENKPYYLPSVEAGRFYE KDGRIMMPLSITCHHATTDGYHINRFLENLQCEAEAFAENVPHFRLR >gi|229784037|gb|GG667698.1| GENE 3 4117 - 5265 1385 382 aa, chain + ## HITS:1 COG:STM2973 KEGG:ns NR:ns ## COG: STM2973 COG1454 # Protein_GI_number: 16766278 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Salmonella typhimurium LT2 
# 1 380 1 380 382 452 60.0 1e-127 MANRIVLNETSYHGAGAINEIPGEVTKRGFHKAFIASDPDLVKFGVTAKVTKILEENGMD YEVYSNIKPNPTIENVQTGVEAFKKSGADYIIAIGGGSSMDTSKAVGIIITNPEFADVRS LEGVAPTKNPCVPILAVPTTAGTAAEVTINYVITDVEKKRKFVCVDTHDIPVVAFVDPEM MSSMPKGLTASTGMDALTHAIEGYTTKGANEITDMFNLKAIEKIASSLRDAVENKPEGRE GMALGQYLTGMGFSNCGLGIVHSMAHALGAVYDTPHGVANAILLPTIMEYNADATGEKFR DIAKAMGVEGTETMTVEEYRKAAVDAVKKLSQDVGIPADLTAIVKEEDVQFLAESAVADA CAPGNPKEAGLEDIIALYKSLM >gi|229784037|gb|GG667698.1| GENE 4 5384 - 6274 954 296 aa, chain - ## HITS:1 COG:no KEGG:DSY1244 NR:ns ## KEGG: DSY1244 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 3 246 6 240 283 151 36.0 4e-35 MEEIAYQNKDITSKLMAETLKGKSLAAFGLPELKIVDILPTNLPVIESNELRLDNLFLLS DGSLAIIDYESSFSRENFVKYLNYIARVIRRFAVRRELKDLKQLKMVVIYTADVERAEER YDLGGLILIVESAYLIHLDGSQIYDRLKHKIDAGEKLAEEELMELMILPLTVKGKKRKQE TIEKAVTLGKRLPDREGQLKVIAGILTFTDKVIDRAYAKKLEEEMQMTLVGQMLMDEGYQ RGMEKGMEKGIQVFIQDNVSENVPEQRIIQKLQANFSLMEKDAINYYTRFSKQTPN >gi|229784037|gb|GG667698.1| GENE 5 6652 - 7488 513 278 aa, chain - ## HITS:1 COG:CAC1333 KEGG:ns NR:ns ## COG: CAC1333 COG2207 # Protein_GI_number: 15894612 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 20 270 32 280 286 127 29.0 3e-29 MQDSCTEIRIQTNTDNTGHIFLHSHTFYEILLCKSGNIHYLLGDKRFRVQQGDILLIPPG VSHQPLFPEPLREPYSRYVIWINAAYWHQQCEEFPDLNFAFEQCQKRGSYLLRSTRATWS GLFSAAAAALKETEEQKPGWGYCARTAVISLMAHISRTYYYQDLQMPAAEKSVLVDDLFR YIDEHLMEKITLEGTAKHFLVSPSTISHTFQKNMQVSFHHCVIQRRLIAAKNGILSGVPL QEIWESCGFPDYSSFYRSFKKEYGISPREFKNFHRHDA >gi|229784037|gb|GG667698.1| GENE 6 7714 - 9096 1532 460 aa, chain + ## HITS:1 COG:BS_yjmB KEGG:ns NR:ns ## COG: BS_yjmB COG2211 # Protein_GI_number: 16078296 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 5 452 16 458 459 191 29.0 4e-48 MKETRKFGIRDQIGYVCGDMAGSFVNLYVDAFFLTFCTYVLGIKAAWMGSMFLFARLWDA 
FNDPMIGSFPDRWQIGKSGDKFKPYIKIAMIPLAVSGLLCFADVSSWGSLAKHVWVVFAY IVYGMSYTGTSMPFGSMASVVTNDPVERTKLSRARSIGGTIVGIGAISLVPQFVFNKSGD VVASGFFKVAVVFGILSIICYCLLLHFTTERVRQPKTEGQKFHYGKVLKSVFKNRPMLGV MLATVGSLLFITGNSQLGSYLYKEFYHAPQVLTLVSLISIPIMLVFFPLIPKLSQKYGKR NVILVCSGYNLVISLILFMVPIENVYLFLVVNTLATSGQTAFTMLIWAFVTDCIDYHEYQ TGERSDGSLYSIYTFSRKIGSTLASTIASFSLGAIGYISGIEAQTAVVAGHIRSLCTFIP VATCVLELLGIGLVYNLTNQKTTEMYAVLKKRREEEQAQE >gi|229784037|gb|GG667698.1| GENE 7 9108 - 10085 841 325 aa, chain + ## HITS:1 COG:no KEGG:Htur_2842 NR:ns ## KEGG: Htur_2842 # Name: not_defined # Def: amidohydrolase 2 # Organism: H.turkmenica # Pathway: not_defined # 6 245 62 299 338 122 34.0 2e-26 MDKKIDIHLHLSERAIPEGLGMKISSGTEMLPHLDALGIGMGIVMSSGEKQAVFGSNASC RKIAEQYPDRYAWMCMLDETEPETVYERLKKYRSEGAVGVGELTVNRRMDHPFLEAVFQA AEALGLPVLFHMSPEEGFQYGVVDEPGLPLLERTLQKYPDLILIGHSQPFWHEISGDAGA SREDRYEWGKGPVTPGGRLGQLFEAYPNLYGDLSANSGGCAVMRDETFGLAFLETWKDRL MFGTDMVNVDMEFPLGSWLDQKAAEGALSLAAYQAVVSGNAERIFRLCGNRDRESEGRSA AAEAEEPSGKIIEEKTMKGDQGWRK >gi|229784037|gb|GG667698.1| GENE 8 10073 - 10918 946 281 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623060|ref|ZP_06115995.1| ## NR: gi|266623060|ref|ZP_06115995.1| creatininase [Clostridium hathewayi DSM 13479] creatininase [Clostridium hathewayi DSM 13479] # 1 281 1 281 281 574 100.0 1e-162 MEEMRTRILPRMLNDEVEAYLTRGDVIIVPVGTVELHGGLPLDSETVISEAFALRMAEAA DGLVLTGLPYFYAGATAIGRGTTQVSVREGIDYLGAIARSLLRQGFKRQIYVSFHGPAHM TCCPMVRDFFDETGVPILYMDLTMQMIQSAKDLFKSLDSFHSITVGAYQMMGRLEDVPFT TGYCHQEPQTTKPFDHLFSLAYQSGSIGYCFGDAKDHMSTPAIPDAARRKELADEGEVLI NEIVKRLDMKTITEEMRSLEAFEQDVMKRYPWIPASGGQRR >gi|229784037|gb|GG667698.1| GENE 9 10930 - 12330 1098 466 aa, chain + ## HITS:1 COG:STM1623 KEGG:ns NR:ns ## COG: STM1623 COG2272 # Protein_GI_number: 16764967 # Func_class: I Lipid transport and metabolism # Function: Carboxylesterase type B # Organism: Salmonella typhimurium LT2 # 12 333 5 308 502 157 34.0 4e-38 MKTESGPVTGGSARLIKTPCGTVRGAVSAKGDCVVFKGIRYAEAERFCYPTEVTHWDGVY 
EADRFGNCSYQPRAFYDESKVPEKAFYYHEFREGESYTYSEDCLFLNIWTPLNAERAPVL FYIHGGGFKGGCGHEKHFDGAAYCRQGILTVTINYRLGPLGFACMEEWKDEKGCVGNYAM YDQMAALSWIRRNIAAFGGDPDRITLMGQSAGAMSVQQHCLSPLTRGSFCGAVMMSGGGV MKGFSPTAALNDAEPFWRELDCRVVKACGVGMRKAEPSVIFKEFHTLTKERPEAAMACGP IIDGVFLPEPPMEIVSDGGAHEIPYLMGSTSEDIVPPIVHKMAKGWARLQSDMGRCPSYA FFFDRQLPGDDCGAWHSADLWYAFGTLENSWRPFGAWDRELSDIMVRYFSDFIRTGNPNG GGLPRWLPMEKGQNRVMRFGDHGVAMGGVSVARLNHTMRTKTSVGE >gi|229784037|gb|GG667698.1| GENE 10 12440 - 14257 2002 605 aa, chain + ## HITS:1 COG:uidA KEGG:ns NR:ns ## COG: uidA COG3250 # Protein_GI_number: 16129575 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 1 604 1 596 603 488 42.0 1e-137 MLYPILTQSRGLMDLSGVWAFKLDDGRGFEEKWYERPLEAAMTMPVPASYNDLKEGEAFR DHYGWCFYQRTLAVPSYMREQRLVLRFAAVTHIARVYLNGVLIAEHKGGFLPFEVQINEY LKEGENLLTVAVDNRIDHTTLPVGSEAGGGMGDLMGMGGGKTSRKRNNPNFDFFNYCGIT RPVKLYTTPASYIQDIFLNAEVENTTARITYHLETVGEATAKIAVYTKDGRLAAEAEGTD GTLEISDVILWQPLCAYLYDVRVTYGEDVYTLPYGVRTVKTEGGKFLINGRPFYFKGYGK HEDTFPAGRGLNLPMNTKDISLMKWQGANSFRTSHYPYSEEMMRLCDEEGIVVIDETPAV GVHLNFGGGANFKDGKRVNTFDPPEEGGIRTHSHHMDVVRDMILRDKNHACVVMWSIANE SDSGGPGAYAYFKPLYDLARETDPQKRPCTIVSVQMPNYKEDCTIELSDVFCLNRYYGWY ACGGDLDMAKELCREELTFWNSLGKPFMYTEYGADTVSGFHDTTPVMYTEEYQVEYYKVN HEITDSLENFVGEQVWNFADFATSQSLLRVQGNKKGIFTRDRKPKLVAHYLRDRWTKIPE FGYKE >gi|229784037|gb|GG667698.1| GENE 11 14459 - 15256 863 265 aa, chain + ## HITS:1 COG:lin0566 KEGG:ns NR:ns ## COG: lin0566 COG0406 # Protein_GI_number: 16799641 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Listeria innocua # 41 264 6 224 231 164 43.0 1e-40 MRQKRRMVMMAAGLLCISAILAGCAKAQVQAQETAKEQMPVTLYLTRHGKTMLNTTGRMQ GWCDSPLTEEGAAVAEKLGAGLKKAGITFDAVYTSDSGRAIETAELVLKNNGQQDLAVQK NPRIREVCYGIYEGAMPAEAYAPAAKALGYDSVDTMMGAVMSGAMTINEAVSAMAKTDES KTAETWETAQKRVMEGLREIADEAARKGQNQVLVVSHGMAISAAVAAIDPEAEAGELGNA SITKLICDGGTFTVESVNDMSYIEE >gi|229784037|gb|GG667698.1| GENE 12 15269 - 15424 102 
51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDATTISAAPKNWIPNNNPRGNNFHRNHTEGRTGQERRISSVSSYPIPPSD >gi|229784037|gb|GG667698.1| GENE 13 15375 - 16664 1332 429 aa, chain - ## HITS:1 COG:YOL164w KEGG:ns NR:ns ## COG: YOL164w COG2015 # Protein_GI_number: 6324409 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Alkyl sulfatase and related hydrolases # Organism: Saccharomyces cerevisiae # 22 426 100 509 646 189 31.0 1e-47 MLSNDAETRLREFNDTAFPKELITITDRVRIAVGWGHSNCIIVEGDTSLILIDTLDSDAR AARLRDELSRLTDKPVKTIIYTHGHPDHRGGGGAFRDTVEEIIAFAPKKAVLKGTDRINP ILNKRTFRQFGYGLTDEELITQGLGIREGHAIGDGVYSFLPPATLYTEDSVIRTIDGVTF EMVSAVGETDDQIFLWLPGERIMCCGDNYYGCWPNLYAIRGGQYRDIAAWVDSLERIRSY PADALLPGHMLPVLGRERISEVLGNYQEAIRYVLDETLSCISQGLSQDETAARVVLPERY RNLPYLQEYYGTVQWSVRAIYQGYVGWFDGNPSNLNRTAPEEYSRRLVELAGGEQAVSEA VKKALTESEFQWAAELCDLLLGAGDAVSKDAGDNARRWKAGSLRELARLETSANGRHYYL SCAKELDTE >gi|229784037|gb|GG667698.1| GENE 14 16721 - 17584 900 287 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623065|ref|ZP_06116000.1| ## NR: gi|266623065|ref|ZP_06116000.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 284 1 284 284 572 100.0 1e-161 MNGSLSIMKEAQAGQETAAPGERYVGKEKSMSVGFTCQAVVKDKRFVKQMIRMLGEEKRY EVRQEEDYMRVGFCRLGDLFFQFGSGLDGEIPTQMVYGECTSSLAGAGFHAAAVRFVEEL ARETEMEIILSDETGYGDDHDFDRMREEHFYGWLKNLVSVCREREEQWPEAVSFGLCWDL DQYTPEEVPGTVFTPFGRFSIQKIVGWVEQEGIEPFAKEFFIWNEPGRDAGYYRNTALSL MWEECYFMPGSRSGRDRRINDRIIDDLERSLLLDRSLPFPAEEYLAS >gi|229784037|gb|GG667698.1| GENE 15 18529 - 18921 492 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623066|ref|ZP_06116001.1| ## NR: gi|266623066|ref|ZP_06116001.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 130 10 139 139 239 100.0 7e-62 MTFVIPGSYLYEYDEDGNSHSWYDDLEEGWHAVRITALKSREESPEITERIFSGADGEPV DGENGCLKYRFAFAGTVEPETEEAYSQYVGEVAGGYQIALITASCEHREDEWAEAFFRSV AHSPEAKLEK 
>gi|229784037|gb|GG667698.1| GENE 16 18981 - 19400 524 139 aa, chain + ## HITS:1 COG:CAC3689 KEGG:ns NR:ns ## COG: CAC3689 COG2764 # Protein_GI_number: 15896921 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 134 1 134 137 127 44.0 4e-30 MEFTPYIRFNGSCREAVEFYAKAFGAEQPKVLTFGESGENPAWPMPEEAKKLIMHAEIVV EGRKLMFSDTFPGMPYEQGNHITIAVAPEDEETARRIFGHLKEGGKVEMELSETAWSSCY GMVTDRYGIGWQMNVLRGQ >gi|229784037|gb|GG667698.1| GENE 17 19499 - 20875 1481 458 aa, chain + ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 17 434 16 430 448 276 37.0 5e-74 MSDTANDFSKGSIVANIMKLAVPMTLAQLINVLYNIIDRIYIGRIPENATLALTGLGLCL PIISMVIAFANLFGMGGAPLCSIERGRGNKKEAELIMGNSFVLMVLFGIGLTVVGLALKK PMLYLFGASDNTYPYADQYISIYLLGSIFVMIGLGMNSFINSQGFGTIGMMTVLLGAIAN IILDPIFIFTLHMGVQGAALATILSQLLSAVWIVKFLTGSKAILRLRTSAFRLKKQRVKD IVALGMSGFTMAITNSSVQIMYNASLQHYGGDLYVGIMTVINSIREIISMPVNGLTNSAQ PVMGFNYGAGEYKRVKRAIVFISVVSIAYTTVMWLFVHGFPEFFIRIFNQDADLVREGIP AMRIYYFGFFMMSLQFAGQSVFVALGKAKNAVFFSIFRKVIIVIPLMLLLPLMFGLGTDG VLMAEPVSNFIGGAACFITMLITVWPELSGKKREKSKT >gi|229784037|gb|GG667698.1| GENE 18 20917 - 21828 760 303 aa, chain + ## HITS:1 COG:BH0025 KEGG:ns NR:ns ## COG: BH0025 COG2421 # Protein_GI_number: 15612588 # Func_class: C Energy production and conversion # Function: Predicted acetamidase/formamidase # Organism: Bacillus halodurans # 13 287 13 286 300 238 44.0 1e-62 MKVITKDFITNVLSKDNAACASIRSGETVVFETYDCFTNQFLPEEATFENVVRKPGNPAT GPLYIEGAMPGDMLRIEILDIELGPVGIVMLGPGSGSERTEFPEKILKRVPVKDGKAYYD GKVEIPVEPMIGVIGVAPAAEGVSTITPMDHGGNMDCTQIKKGAVLYLPVFVEGGLLSMG DFHAIMGDGEVEDCGLEIEGRATVHVDVVRNQYCVPYPMIETEDRLITIASAEDVEGAWR AATRQMFDFMREKIGMSVSDAGMLLTMTGSLIICQTVNPMKTVRMELPRYITKGYGFYSI FGS >gi|229784037|gb|GG667698.1| GENE 19 21914 - 22939 1227 341 aa, chain + ## HITS:1 COG:CAC1479 KEGG:ns NR:ns ## COG: 
CAC1479 COG0115 # Protein_GI_number: 15894758 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Clostridium acetobutylicum # 1 341 1 341 341 580 79.0 1e-165 MEKKNIDWGNLGFGYVQTDKRYVSNFKGGAWDEGTLTGDPNIVLNECAGVLQYAQTIFEG LKAYTTEDGRIVTFRPDLNAERMENSARRLEMPVFPKERFVEAVTKTVAANAAYVPPFGS GATLYVRPYMFGSNPVIGVKPADEYQFRIFTTPVGPYFKGGVKPLTICVSDFDRAAPHGT GHIKAGLNYAMSLHAIMEAHRAGFDENMYLDPATRTKVEETGGANFLFVTKDKKVVIPKS ESILPSITRRSLMVVAKEYLGLEVEEREVLLEEVKDFAECGLCGTAAVISPVGCIVDHGK AICMPSGMSEMGEVTKKLYDTLTGIQMGRIEAPEGWIHVIE >gi|229784037|gb|GG667698.1| GENE 20 23344 - 24246 818 300 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00788 NR:ns ## KEGG: EUBELI_00788 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 234 3 251 315 194 41.0 5e-48 MVKVNKKYKDRLFRMVFNRKEELLSLYNAVSHSEYTNPDDLEINTLDDVIYMKMKNDLAF LIDDVLNLWEHQSTWNPNMPVRGTFYIVEEYRKYIDQNGLNLYGSSRITLPVPQFYVFYN GLREEPDYIELKLSDAFSRVHSEVEPCMEFKAVMLNINRGHNEELMRQCTTLREYAEFVA RIRDETEDGTALEEAAMNVMDSCIRDGILAEFLSVHRAEVFEVLLTEYDEQRHIASEKEI SRREGHMEGRTEGILEKAKEVAVNLIKKGFTVEDAASICGEDICRVKEWHREWKAAKECR >gi|229784037|gb|GG667698.1| GENE 21 24451 - 25422 991 323 aa, chain + ## HITS:1 COG:SA2054 KEGG:ns NR:ns ## COG: SA2054 COG0679 # Protein_GI_number: 15927838 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Staphylococcus aureus N315 # 32 283 9 261 302 95 25.0 1e-19 MCANFKRQKTAGEMAGRGRDMTAVLTKAFAFVAIIAMGYILKRVGFFHAKDFYLISNIVI KITLPAAIVSNFSKITMDYSLLAMCVIGLLCNFVTIAAGYGINIRKSKETRAFAMLNMSG YNIGNFTMPFVQSFLSPVGFAATSLFDAGNSIMCTGMTFTLAGMVVGEDEKPSFSGMTKK LFSSIPFDAYIVMTILSILKLQLPAVMISFADTAGGANAFLALLMLGIGFEVHMDREKIA QIVQILVMRYGIAVLMAVGFYFLAPFGLEVRQTLAIVSLGPVASIAPAFTGALNGDVEMA SAVNSLSIIISIAAITAALILLL >gi|229784037|gb|GG667698.1| GENE 22 25460 - 26845 1303 461 aa, chain + ## HITS:1 COG:BH1128 KEGG:ns NR:ns ## COG: BH1128 COG0733 # Protein_GI_number: 
15613691 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus halodurans # 5 450 9 444 453 275 41.0 1e-73 MHQKRSQWASNVGFILAAAGSAVGLGNIWKFPGKVGAYGGGAFILCYMIIVALIGFPVML AELSIGRKTQKNVIGAFRQLDKRFSFVGGIGVLTLFVIMSYYSIVGGWVLKYIWVYVSGA HFGTGLNPYQTYFTEFIAKPAEPLLWGAAFLLLCIYVVVKGVSSGIERVSKVLMPALFLL LLACVIRSVTLPGAKEGLSFMLTIRPETLNGNTLVGALGQAFFSLSVGMGIMVTYGSYVP KDDHLVKSAVSICALDTMVAILAAFAIIPVVFATLGAEGLGMGGGFAFMALPEVFAGFPG GRLFGLVFFILLFFAALTSAISILESCVAFISEEFHFSRLKSTIVLSVPMTVLSAGYSLS QSDARGIRIPWFDFSQGLTMLPMNAVMEKFTDNLMIPLGALAFCLFVGWVWGTKHAVEEI ESEGRYPMVMRRSWSFIIRFLAPLVIAVILYFTLGRGEGLS >gi|229784037|gb|GG667698.1| GENE 23 27073 - 29022 2348 649 aa, chain + ## HITS:1 COG:BS_thrS KEGG:ns NR:ns ## COG: BS_thrS COG0441 # Protein_GI_number: 16079947 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Bacillus subtilis # 1 631 5 634 643 581 47.0 1e-165 MKITLKDGSVKEYDQAVSVYDIAADISEGLARNACAGEVDGRVVDLRTMVDRDCCLNILT VHDPAGLAAYRHTTSHVLAQAVKHLYPEAKLAIGPSIENGFYYDFETPSFSREDLDKIEA EMKKIIKAGEKLERFTLPRAEAVKFMEEKGEPYKVELIRDLPEDAEISFYRQGDFTDLCA GPHLMSTKPIKAFKLTSSSGAYWRGSEKNAMLTRVYGTSFAKKEELNEYLEYLANIKNRD HNKLGRELELFATVDVIGQGLPLLMPKGAKIIQTLQRWIEDEEEKRGYMRTKTPLMAKKD LYVISDHWDHYKEGMFVLGDEEKDDEVFALRPMTCPFQYYVYKQSPKSYRDLPCRYGETS TLFRNEDSGEMHGLTRVRQFTISEGHLVVRPDQLEEEFKGCVDLAKYCLTTLGLEEDVTY RMSKWDPEKADHYLGNAEMWDEVEAAMRKILDDIGIEYTEEVGEAAFYGPKLDIQAKNVY GKEDTMITIQLDMFLAERFDMSFVDKDGTKKRPYIIHRTSMGCYERTLAWLIEKYEGAFP TWLCPEQVRVLPISEKYHDYAEKVESELKSNGILATVDERAEKIGYKIREARLAKLPYML VVGQKEEEDGLVSVRSRFNGDEGQKSLSDFIDAISREIRTKEIKTINVE >gi|229784037|gb|GG667698.1| GENE 24 29096 - 29410 295 104 aa, chain + ## HITS:1 COG:no KEGG:Closa_2298 NR:ns ## KEGG: Closa_2298 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 104 1 104 104 141 78.0 7e-33 MTRLDCNVTNCLHNAENCCCKAAIIVEGEQAKDTCDTCCGSFDENKDGAYHNLFKTPENR 
LEVDCEAVNCVYNEGRHCSADHIGIAGDGAREASHTECASFRER >gi|229784037|gb|GG667698.1| GENE 25 29557 - 29970 355 137 aa, chain - ## HITS:1 COG:NMA0616 KEGG:ns NR:ns ## COG: NMA0616 COG2510 # Protein_GI_number: 15793606 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Neisseria meningitidis Z2491 # 2 137 6 141 143 107 46.0 7e-24 MWFLFALLSSVFAALTSILAKAGIEGVNSNLATAVRTVVVVIMAWAVVFITNTQSGLPDI GRKSWIFLILSGLATGASWICYYKALQMGEASRVVPVDKFSIVLTVILAFLFLHEQITIK KLVGILLITAGTFFMIL >gi|229784037|gb|GG667698.1| GENE 26 30382 - 30568 223 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288870988|ref|ZP_06116012.2| ## NR: gi|288870988|ref|ZP_06116012.2| Fe-hydrogenase, gamma subunit [Clostridium hathewayi DSM 13479] Fe-hydrogenase, gamma subunit [Clostridium hathewayi DSM 13479] # 1 62 41 102 102 124 100.0 3e-27 MMKENTEHCCCEGCDSSPETLLGRIGELASEYRGKEGSLISVLHMAQGIYGYLPLEVQKT VA Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:49:08 2011 Seq name: gi|229784036|gb|GG667699.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld92, whole genome shotgun sequence Length of sequence - 27681 bp Number of predicted genes - 25, with homology - 24 Number of transcription units - 14, operones - 7 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 150 - 245 122 ## + Term 324 - 355 0.4 - Term 788 - 818 0.2 2 2 Op 1 . - CDS 867 - 2216 1156 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 3 2 Op 2 . - CDS 2221 - 2454 226 ## Clole_2594 beta-glucosidase (EC:3.2.1.21) 4 3 Tu 1 . - CDS 3381 - 4553 1158 ## COG1472 Beta-glucosidase-related glycosidases 5 4 Tu 1 . - CDS 5482 - 6114 682 ## COG1472 Beta-glucosidase-related glycosidases + Prom 6169 - 6228 6.6 6 5 Tu 1 . + CDS 6278 - 7201 781 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 6916 - 6966 -0.9 7 6 Op 1 38/0.000 - CDS 7210 - 8055 811 ## COG0395 ABC-type sugar transport system, permease component 8 6 Op 2 . 
- CDS 8055 - 9029 1111 ## COG1175 ABC-type sugar transport systems, permease components - Term 9039 - 9081 7.1 9 7 Tu 1 . - CDS 9132 - 10544 1618 ## Spico_0723 carbohydrate ABC transporter substrate-binding protein, CUT1 family - Prom 10704 - 10763 7.1 - Term 10881 - 10938 10.1 10 8 Op 1 . - CDS 10949 - 11875 459 ## COG1893 Ketopantoate reductase 11 8 Op 2 . - CDS 11872 - 12696 681 ## COG0345 Pyrroline-5-carboxylate reductase 12 8 Op 3 . - CDS 12730 - 13371 821 ## CLOST_1006 conserved exported protein of unknown function 13 9 Op 1 . - CDS 14305 - 14841 579 ## CLOST_1006 conserved exported protein of unknown function 14 9 Op 2 11/0.000 - CDS 14879 - 16147 969 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components - Prom 16279 - 16338 19.2 15 10 Op 1 21/0.000 - CDS 17240 - 18100 997 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 16 10 Op 2 . - CDS 18087 - 19712 205 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 17 10 Op 3 . - CDS 19743 - 20987 778 ## COG1228 Imidazolonepropionase and related amidohydrolases - Prom 21118 - 21177 9.2 + Prom 21157 - 21216 6.7 18 11 Tu 1 . + CDS 21301 - 22290 696 ## COG1609 Transcriptional regulators + Prom 22297 - 22356 6.4 19 12 Tu 1 . + CDS 22511 - 22891 345 ## COG2731 Beta-galactosidase, beta subunit - Term 22905 - 22952 5.4 20 13 Op 1 . - CDS 23006 - 23362 310 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase 21 13 Op 2 . - CDS 23387 - 23698 364 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 23729 - 23788 4.5 22 14 Op 1 . - CDS 23831 - 24499 741 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 23 14 Op 2 . - CDS 24531 - 26267 1652 ## Closa_2523 Type II site-specific deoxyribonuclease (EC:3.1.21.4) 24 14 Op 3 . - CDS 26298 - 27005 911 ## COG1794 Aspartate racemase 25 14 Op 4 . 
- CDS 27002 - 27679 566 ## COG0266 Formamidopyrimidine-DNA glycosylase Predicted protein(s) >gi|229784036|gb|GG667699.1| GENE 1 150 - 245 122 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEGYADVSGAVEEKGARAAGRAGLRDLAGWD >gi|229784036|gb|GG667699.1| GENE 2 867 - 2216 1156 449 aa, chain - ## HITS:1 COG:BH1923 KEGG:ns NR:ns ## COG: BH1923 COG2723 # Protein_GI_number: 15614486 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Bacillus halodurans # 3 447 8 443 447 369 43.0 1e-102 MEEKMIWGAATSAYQIEGAAEEDGKGLNIWDIYCRQPGKIFGDHTGDMACDHYHRYREDI GIMKQIGISAYRFSISWARILPQGQGTVNEEGIAFYNRLIDELITNGIEPYITLFHWDYP YALYQRGGWMNPDSVNWFAEYARLVTERFSDRVRYFITFNEPQCFIGMGHVTGEHAPGLK NPLHDGFLMAHHVMMAHGKAVQAMRAAAKRPIMIGYAPTSTVSAPASRSQEDIEAARARY FACPEDAGGWSWNVSWWSDPVMLGRYPEDGVKRYEEYLPDIRPGDLELMHQPLDFYGQNI YNGYRVKAGADGRPMVLERPAGYETTAMGWPVTPESLYWGPKFLYERYGKPIYITENGMA CHDMVSRDGRVHDASRIDFMDRYIESLLKAREDGTDIRGYFYWSLLDNFEWAEGYKDRFG LVYVDYATQRRVLKDSAEWYRKWIAGENG >gi|229784036|gb|GG667699.1| GENE 3 2221 - 2454 226 77 aa, chain - ## HITS:1 COG:no KEGG:Clole_2594 NR:ns ## KEGG: Clole_2594 # Name: not_defined # Def: beta-glucosidase (EC:3.2.1.21) # Organism: C.lentocellum # Pathway: Cyanoamino acid metabolism [PATH:cle00460]; Starch and sucrose metabolism [PATH:cle00500]; Biosynthesis of secondary metabolites [PATH:cle01110] # 5 72 623 690 696 63 35.0 2e-09 MDHWSLCGFKRIELEPGEVKQVELELGSHTYECVNSEGQYVENATVYEFSIGTSQPDKRS RQLGAPEPVAVVCRRGV >gi|229784036|gb|GG667699.1| GENE 4 3381 - 4553 1158 390 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 1 389 264 683 778 200 34.0 5e-51 MEEILRGKWGFEGHFVSDCWAIRDFHQSHMVTDNAEESAALAVSKGCDLNCGNTYLYVLK AYEKGMISEEEIKQAVVRLFTTRYLLGLGETTEYDAIPYEAVECEEHRRRAVEAAEKSMV 
LLKNDGILPLNPDSVKKIAVIGPNANSRIALIGNYHGTSSHYTTILEGIQHEAGEGTTVL YAQGCHLYKDRVEHLGLPGDRLSEARIAARHSDVVILCVGLDETLEGEEGDTGNSYSSGD KEGLELPASQRRLMEEILALDKPVIVCNMTGSAVDLSLAQEKAGAVIQAWYPGAEGGTAF ARLIFGRSTPGGKLPVTFYKDLSALPEFEDYSMAGRTYRYIKEEPLYPFGFGLTYGRVEL SDGAVEGKKASVTVRNMGDYCAREVIEVYS >gi|229784036|gb|GG667699.1| GENE 5 5482 - 6114 682 210 aa, chain - ## HITS:1 COG:XF0845 KEGG:ns NR:ns ## COG: XF0845 COG1472 # Protein_GI_number: 15837447 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Xylella fastidiosa 9a5c # 3 208 37 244 882 240 57.0 2e-63 MELVEQMTVEEMASQMRYDAPAIPRLQIPAYNWWNEGLHGVARGGTATVFPQAIGLAAMF DEALIGSVADTVSTEGRAKYNEFQKEGDGDIYKGLTYWTPNVNIFRDPRWGRGHETYGED PYLTSRLGVAFVKGLQGEGEHLKIAACAKHFAVHSGPEAVRHEFNATSTKRDLWETYLPA FEACVKEGGVESVMGAYNCYEGEPCCASTS >gi|229784036|gb|GG667699.1| GENE 6 6278 - 7201 781 307 aa, chain + ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 40 293 21 285 292 103 28.0 5e-22 MSRLNYTLPDEYHGREMEGCVEKVDYYLNSNIRIWHNIQNEGYALHQHNAIEILVPIENS YTVLANNHTYQLEPGNILFIPGLTLHEIMKPDWGERFIFMFNMDPLTSFHDYSSLEPVLL DPCFLTKRTTPGIYEHIYSRLMEISTLYFSSSITFREYSIYANLIDILTTIGREYYDALF AGTENTAEKSYDYYQKFNSLLSYINDHFAEPLTLEKMADYTGFSKYHFTRLFKQHTNSTF YDYLSRKRIQAAQTLLTTDASITSIAFQTGFNNSTSFTRCFKKYTNYSPSEYRTKFIEDV PGNVTVF >gi|229784036|gb|GG667699.1| GENE 7 7210 - 8055 811 281 aa, chain - ## HITS:1 COG:BH0903 KEGG:ns NR:ns ## COG: BH0903 COG0395 # Protein_GI_number: 15613466 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 4 281 2 279 279 152 33.0 1e-36 MKYKKRTSTALKIVIYVVCICLAILSILPFWIMIVNATRSTTQIQQHAISMIPSGYMMRN LSILLGKSFNPMVGFMNSLIISACSTGIAVYFSTMTAYGLVVYDWRLRRPFFSFIMAVMM IPAQVTMIGFYQMVYRIHMTNQLSMLILPALASPAMVFFMRQYMMPALSLEIVQAARIDG 
AGEFFIFNKIALPIMKPAIATQAIFSFVSSWNNLFTPLVLLTNKEKYTMPIMVSLLRGDI YKTEYGSVYLGLALTVLPLIIVYLLLSKYIIAGVALGGVKG >gi|229784036|gb|GG667699.1| GENE 8 8055 - 9029 1111 324 aa, chain - ## HITS:1 COG:BH2725 KEGG:ns NR:ns ## COG: BH2725 COG1175 # Protein_GI_number: 15615288 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 12 298 11 289 290 129 31.0 7e-30 MKRKRLRYAKWGYIFCAPFVLGFLIFSLYPIIFTGVIGFTDFKGMGAVDFNFLEDPFQNF KTVLTNPSFKKAFGNTIHIWLFNFVPQIGLALLLTAWFTDKRYHIKGEGLFKVLFYMPNI ITAATIAILFNALLGYPMGPVNDIFMRLGISKEPINFLVDKSVARGTVEFIQFWMWYGYT MIILVSGVMGINPELFESAEVDGANRIQTFFYITIPNLKTILLFTLVTSLIGGLQMFDIP KLFLMGGPDNATLTTSVFIYNQAFSGSYMYNRAAAASMIMFIIICAASAVLFFLMRDKDE ALLNKQKRRQEKEYRKALKEGRKS >gi|229784036|gb|GG667699.1| GENE 9 9132 - 10544 1618 470 aa, chain - ## HITS:1 COG:no KEGG:Spico_0723 NR:ns ## KEGG: Spico_0723 # Name: not_defined # Def: carbohydrate ABC transporter substrate-binding protein, CUT1 family # Organism: S.coccoides # Pathway: not_defined # 1 469 1 449 450 613 68.0 1e-174 MKKMIAGLLCGCMIAASLTGCGKAPASAGGAAEKPAAAETEKTADKAGASSDSKVINVWS FTDEVPKMIEKYKELHPDFDYEIKTTIIATTDGAYQPALDQALAAGGSDAPDIYCAEAAF VLKYTQGDASRYAAPYEDLGLDADGKIKSSEIAQYAVDIGTNTDGKVVALGYQATGGAFI YRRSIARDTWGTDDPKEVGAKLGAGTNDWTQFFNAAEDLKGKGYGIVSGDGDIWHAVENS SDKGWIVDGKLNIDPKREAFLDLSKKLKDNGYHNDTQDWQDAWFADMKGEGAKPILGFFG PAWLINYTMAPNCGGTAVGEGTFGDWAVCDSPIGFFWGGTWVLANKDTKVKDAVADIIDW ITLDDSEDSLQYMWANGTMNDAGTKDAVASGTVMSKSDGSVDFLGGQNMFDVFVPANKFA NGTNLTQYDETINNYWRGQVREYTAGNKSREQAITDFKQQVADNLDIIVE >gi|229784036|gb|GG667699.1| GENE 10 10949 - 11875 459 308 aa, chain - ## HITS:1 COG:CAC2937 KEGG:ns NR:ns ## COG: CAC2937 COG1893 # Protein_GI_number: 15896190 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Clostridium acetobutylicum # 1 302 1 303 307 191 33.0 2e-48 MRIAVLGAGAMGLLFGGLLSRKNDVCIYGHRAEKVEEIKKNGVIIRENDGTELHLFPRAA VSGEDLREIPDVVMLFTKSLVSREVLEQNRGLFGPDTLLLTLQNGSGHEELLREFVEESR 
IVIGTTRHNSVLTGNTSVLHGGSGVTSIGSAHARNTLVYILAEEMSACGIETSISEDIRR FIWDKLFVNTSVSVVSGILQVPQGYLLDNPHAWSLVERLAREAVETARSEGFVFKPEKVL EGLHSLLEGARNGYPSIYADLRDGRKTEVDAISGSIVRTGHKNGIPVPSHEMVVELIHAM EGKTGQTV >gi|229784036|gb|GG667699.1| GENE 11 11872 - 12696 681 274 aa, chain - ## HITS:1 COG:STM0386 KEGG:ns NR:ns ## COG: STM0386 COG0345 # Protein_GI_number: 16763766 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Salmonella typhimurium LT2 # 1 268 1 268 269 261 54.0 9e-70 MERKFGFIGGGNMCRAIIGGMLKAGITKPGDILVSDKNPSGLEALKAEYGIEITDENCNT ASFADILILAVKPNVCPAVIAEIREAVRSESVVVSIAAGLSLSVLMRYFKRDIPVIRVMP NTPAMAGEGMAAVCSNDLVSRIQMEDVLTVFRSFGRAEEVTESLFDVVTAVSGSSPAYVY LFIEAMADAAVAGGMMREQAYTFCAQAVAGSARMVLETGMHPGELKDMVCSPGGTTIDAV MELEERGMRSAVQNAVTVCIRKSKSMGKQQEDGL >gi|229784036|gb|GG667699.1| GENE 12 12730 - 13371 821 213 aa, chain - ## HITS:1 COG:no KEGG:CLOST_1006 NR:ns ## KEGG: CLOST_1006 # Name: not_defined # Def: conserved exported protein of unknown function # Organism: C.sticklandii # Pathway: not_defined # 2 213 170 387 387 171 42.0 1e-41 MKTVLHYSFPRHLAIATKLAYHDNLKAKCEELGITFVDVTTPDPTSDAGTTGTQQFVLED VPKKLDEYGEMTAFYGTNLSQTEPIIKTLADRQEGYYLIIKDPSPTFYANPLGIEIPEEY AGDYDYLNKQIVEKAAEKNMTGHFTGWPLSVTNMSLRGGVQYCMDYLDGKTDGKVDIGTL QTILTNLAGEGASVTLYENYDNYFAVTAPNMTY >gi|229784036|gb|GG667699.1| GENE 13 14305 - 14841 579 178 aa, chain - ## HITS:1 COG:no KEGG:CLOST_1006 NR:ns ## KEGG: CLOST_1006 # Name: not_defined # Def: conserved exported protein of unknown function # Organism: C.sticklandii # Pathway: not_defined # 1 169 1 152 387 137 46.0 2e-31 MFKKVIATALTLTMVLGLAACGSTVGSGTPETTTTAETAKTSADETMEKTAEAAENAGYK IGIMTTTVSQAEEEYRVAEELAAEYPDTVVHVTFPDNFATEMETTISTALSLASDPDMKA IIFTQSVSGTAAAIDKIREVRDDIFIWAGLPMDDNDVIAQAADVVFNTDFATAGVLAS >gi|229784036|gb|GG667699.1| GENE 14 14879 - 16147 969 422 aa, chain - ## HITS:1 COG:FN1896 KEGG:ns NR:ns ## COG: FN1896 COG1172 # Protein_GI_number: 19705201 # Func_class: G Carbohydrate transport and metabolism # Function: 
Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Fusobacterium nucleatum # 255 407 170 322 340 151 47.0 2e-36 MKARIRNVLSGNAVTLLFVILCVGSILCSGQPLSYIVKETVARISRNAVLILSLILPVLC GMGLNFSIVVGAMAGEISLVMITHWNIGGAPGMLLAALIASVLGGLFGLAVGKLMNRAVG QEMITGIILGYFSIGLFDLTFLILLGRVIPFQDAELMLSNGVGLKNTIAFHDDVKNYLDH IWRITLDWSVVYAALAAILVLVILVVRKKKRYGRTLSEAVKDCRSAVAAAVALTASAALV WLIPPLHLACAATQIPMVVGIIIALLCLLTAFISRTKIGQDIRTTGQNMDVAIVSGINVG RCRLFSITFSTIIAALGQVIYLQNMGNMNTFSSHEQVGQYAIAALLVGGASIKKASIGHV FLGAALFHILFFTSPLAGNRLFGDAQIGEYFRVFVGYAVITISLVLYGLKQAADKRSKEE NK >gi|229784036|gb|GG667699.1| GENE 15 17240 - 18100 997 286 aa, chain - ## HITS:1 COG:FN1897 KEGG:ns NR:ns ## COG: FN1897 COG1172 # Protein_GI_number: 19705202 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Fusobacterium nucleatum # 7 286 1 278 339 204 43.0 2e-52 MQQIKRLYQAIGFSRFIIMAFLAGLVLLTFTLHLDSGTIVSDVLVRIGLNSFFVLAMLPM VECGVGLNFALPLGILCGQLAGVLSVEWGIQGMKGIFAACMLGSVFATVVGYLYGLLLNR LRGSEMMVGTYINFSIVSLMCMGWMLFPFFSDPRIKWAMGNGVRSTITLDGFYDKVLNNL LAFKIGNVTIPTGLFLVFGLGCLAFWVFQKSKTGVMMKVSGMNPMFGRSIGIDNDRMRVM GTILSTICGAIGIVVYAQGIGMYQLYTAPRNMAFPAIAAILVGGAS >gi|229784036|gb|GG667699.1| GENE 16 18087 - 19712 205 541 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 279 513 1 225 245 83 26 1e-15 MEYEYLLEMNHIGKEYYGVRVLKDVSIRLKKGQILALLGENGAGKSTLMNILFGMDVIHS TGGFTGDVLFEGKQVKIMKPSDASDLGIGMVHQEFMLIDGYEIAENIRLNREITKVNLFS RIISRKLETVDKVSMRKEARKTLDSLGLEGLSETVAVENVSVGYKQFVEIARELNKKNIK LVVLDEPTAVLTEGEAKQFLDCVRAVADQGISFIFISHRLDEAIQYADTAVVLRNGELVS QCDMEGVTAVQLSEMMIGRKVELVTKEPKPESELHSPVILTMKNFHVDMPGELVKGIDLE VREGEILGIGGLAGHGKVGIANGIMGLYPATGDVLYRGETLTLADTLSTLKKKIMFVSED RRGVGLNLDASIEMNTVIAALRVNQEYLRRLGFMRFYNKKAAAEYAKKMIEEIDIRCTNS SQPCKRLSGGNQQKVSIAKALAMNPEVLFISEPTRGIDIGAKKLILNYLVQLNRERHMTI 
IMTSSELAELRSVCDRIAIVTEGKLAGILKPNDEEYKFGLLMSSAAGVTEEQKEVGNHAA D >gi|229784036|gb|GG667699.1| GENE 17 19743 - 20987 778 414 aa, chain - ## HITS:1 COG:CC2672 KEGG:ns NR:ns ## COG: CC2672 COG1228 # Protein_GI_number: 16126907 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Caulobacter vibrioides # 1 407 29 427 429 189 32.0 6e-48 MKAFTHARVIDGKGGFIPEGTVLVEGSSIVKVGTSSEVELPEGVMTVDCGGRTLMPGLMD AHTHLETTASNNVINEVAQYNLPTRTMRSYWNSLETIRAGITTVRDVADIGDCVIAVRNV INQKLLPGPRIIASGKAINQTGGHASMSTEYQWSRPDAGGSGHCVDGEAECRRAIRTEVA AGADLIKIYVTSGLYDPFLGKPRDEFSKKEIEALCDEAANSGKKVAAHCHTAKSAKYCIE NGVASIEHGMMINEETLELMKAHGTFWVPTTIVYQLMADGEKYGLGSSTIENSRKALVNQ EKMFRKALEMGVKIGIGTDCGGVLINHGETARELECLNRWGMSAMDCIKAATSANAELLG IDSETGSIEAGKEADLLIVDGDPLEDIRILQNQEKILVVMRSGEYYKDMFSVQR >gi|229784036|gb|GG667699.1| GENE 18 21301 - 22290 696 329 aa, chain + ## HITS:1 COG:BH3727 KEGG:ns NR:ns ## COG: BH3727 COG1609 # Protein_GI_number: 15616289 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 2 328 3 322 331 150 30.0 4e-36 MTIYDIAKEAGVSASSVSRVINNKTGVNPEVRKKVQKLLKKYHYTPDAAARGLVSQSSRI IGILIADIRNIHHTDGAYYIERKLAALNYCCVILNTGDDDESRAQAISILEQRRVEGAVL IGSSFQTESVAQAIKRHLSNVPIVMINAFLDAPNIYGVLSDELHGVEACVNLLAEKGRKR LAFIRDCYTPSCELKVEGFRRGIRLLGQTEPWIYDSHGTTADDGKRTTLEVLKDHPDVEG IIYSVDLIATGGIRALYDLGIAIPDQVSVIGVDNSIYGEICIPTLTSLDNKTFDSSIAAC RILIDCLENRTTTRQMILPSAIVVRESTP >gi|229784036|gb|GG667699.1| GENE 19 22511 - 22891 345 126 aa, chain + ## HITS:1 COG:STM3669 KEGG:ns NR:ns ## COG: STM3669 COG2731 # Protein_GI_number: 16766954 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Salmonella typhimurium LT2 # 1 122 33 154 154 83 35.0 8e-17 MEPGDYPLEGNLMYVKVFDQTSIPFEDTRPEIHHHYADVQYWPEGEDLAGITPNPGTYTV TEAHESDDLYFLEGIQNESFIHCLPGCIAVYFPWDVHRPSVAYEGKPITYRKVVVKVHMD LIKESV >gi|229784036|gb|GG667699.1| GENE 20 23006 - 23362 310 118 
aa, chain - ## HITS:1 COG:YPO3139 KEGG:ns NR:ns ## COG: YPO3139 COG3695 # Protein_GI_number: 16123301 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Yersinia pestis # 3 102 84 181 181 77 43.0 7e-15 MDFYERVRFVMNHIPYGKAATYGQIALLCGKPGNSRQVGYALNRRLGGADLPAHRVVNAS GYLSGAASFRTPDTQKKLLEAEGVAVNDENRVDLNRFGWHHTLEDALEFRKWFEKNGI >gi|229784036|gb|GG667699.1| GENE 21 23387 - 23698 364 103 aa, chain - ## HITS:1 COG:PM0994 KEGG:ns NR:ns ## COG: PM0994 COG0526 # Protein_GI_number: 15602859 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Pasteurella multocida # 1 102 1 103 106 96 45.0 9e-21 MSVFSITNNNYHEEVVASGKPVLLDFWAPWCGPCKKLSPVVDKIAEENPHIKVGKVNVEE QRELAEKFHIMSIPALVLMKEGKAVASSVGVKPQAAIEKMLNS >gi|229784036|gb|GG667699.1| GENE 22 23831 - 24499 741 222 aa, chain - ## HITS:1 COG:FN0217 KEGG:ns NR:ns ## COG: FN0217 COG0664 # Protein_GI_number: 19703562 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 3 209 10 212 217 80 27.0 3e-15 MELGEYLPFWKKLTEAEQRTLAGSVQRRRFEKGQVLHHGSDECTGLILTLNGQLRVYTIS DEGKELTLYRLLERDLCIFSAPCMVKSILFDVIVEAEQDTEILHIPVDVYRDLMDNSAAV ANYTNELMAERFSDVMWLMDQVLNKKIDSRLAAFLLEESGLDDSRKLSITHDQIARHMGT AREVITRMLKAFQADGLVKLNRGGIELLNEKRLTELAADSLR >gi|229784036|gb|GG667699.1| GENE 23 24531 - 26267 1652 578 aa, chain - ## HITS:1 COG:no KEGG:Closa_2523 NR:ns ## KEGG: Closa_2523 # Name: not_defined # Def: Type II site-specific deoxyribonuclease (EC:3.1.21.4) # Organism: C.saccharolyticum # Pathway: not_defined # 1 574 1 575 577 885 72.0 0 MAVRTFGWVQEAYKLDNLKNVVSVFVPGSPVNRRLIEDKLVRLISDEDGKAEFIEELEAD PVIVPYAHLKGKGTPKGYTRSNAPCSGIIQAVLPGQRKEYQSDWPADSFLRWAVSIGFLN 
YDRNADTCSLSGLGRSYAEAEAGSKEEEDALTTAFLSYPPVCRVLSLLASGAHLTKFEIG AKLGFIGEAGFTSIPQHMILQGLSETAEEDRAKLLQDTEGTSDKYVRTICSWLKQMGWVE QAAKTVAEDAGTRRYEGTIPQSYRLTLKGRTVFKHTTGVSKFARIPKRVMWDMLATKAAD RDYLRNRRTHIIQYLLKDYRSPEQVKAYLEMVGLEETVETIKDDICGFENIGLNVKRTGK TYKIMDDIVGLEIPADDAGRISVKSDMAVVKDSVRERLAHVSHEYLILIDLGFDGTSDRD YEIQTAELFTRELDFLGGRLGDTRKPDVCIYYGKDGMIIDNKAYGKGYSLPIKQADEMYR YLEENKERNEKINPNRWWKVFDEGVTDYRFAFVSGSFTGGFKDRLENIHMRSGLCGGAID SVTLLLLAEELKAGRMEYSEFFRLFDCNDEVTFQSIVI >gi|229784036|gb|GG667699.1| GENE 24 26298 - 27005 911 235 aa, chain - ## HITS:1 COG:ECs3697 KEGG:ns NR:ns ## COG: ECs3697 COG1794 # Protein_GI_number: 15832951 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Aspartate racemase # Organism: Escherichia coli O157:H7 # 1 223 1 224 230 268 59.0 6e-72 MKTIGLIGGMSWESTVTYYQVINEAVKERLGGLHSARCILYSVDFQEIEECQAAGEWEKS GKILGEAAESLQKAGADFIVICTNTMHKVVPQIQSKIHIPILHIAEAAALRLKAERMDRV GLLGTKYTMTQDFYKGKLAEYGVEVLIPEASEIETVNRVIYGELCRGILSETSKEAYLTI IAHLKERGAQGVILGCTEIGLLVKSGDTDVPLFDTALIHAEKAAELSVVGDTDER >gi|229784036|gb|GG667699.1| GENE 25 27002 - 27679 566 225 aa, chain - ## HITS:1 COG:STM3726 KEGG:ns NR:ns ## COG: STM3726 COG0266 # Protein_GI_number: 16767011 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Salmonella typhimurium LT2 # 84 219 133 269 269 74 35.0 1e-13 TGAAGFGIFSELEFDGGVRLGFNDGVNPRFIRKEEVRPKKYQLLLEFTDGSAIGFTVAMY GSISCHSGDYDNEYYRKSIESISPLTEEFDERYFKELLASVKSAMSAKAFLAAEQRIPGL GNGCLQDILFQAGIHPQRKVLTLSDCEQNVLWKQVRMVLQAMTEQGGRDTEKDLFGNPGG YQTLMSKNTVASGCPACGGGIIKETYLGGAVYYCPSCQKMEEEKR Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:49:46 2011 Seq name: gi|229784035|gb|GG667700.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld93, whole genome shotgun sequence Length of sequence - 23655 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 12, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 
1 Tu 1 . + CDS 3 - 1304 1299 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases + Term 1362 - 1410 8.1 + Prom 1357 - 1416 2.0 2 2 Tu 1 . + CDS 1447 - 2658 1075 ## COG1151 6Fe-6S prismane cluster-containing protein 3 3 Tu 1 . + CDS 3594 - 4046 567 ## COG1151 6Fe-6S prismane cluster-containing protein 4 4 Tu 1 . - CDS 4050 - 4919 703 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 4967 - 5026 8.0 + Prom 4926 - 4985 5.9 5 5 Tu 1 . + CDS 5098 - 7680 2467 ## COG1472 Beta-glucosidase-related glycosidases + Term 7839 - 7876 7.1 + Prom 7771 - 7830 6.0 6 6 Op 1 40/0.000 + CDS 7921 - 8601 640 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 7 6 Op 2 5/0.000 + CDS 8598 - 10568 1750 ## COG0642 Signal transduction histidine kinase 8 6 Op 3 . + CDS 10645 - 11193 429 ## COG2199 FOG: GGDEF domain 9 6 Op 4 . + CDS 11197 - 11649 332 ## Closa_3846 SARP family transcriptional regulator + Term 11653 - 11693 0.4 10 7 Tu 1 . - CDS 11688 - 12575 794 ## COG0583 Transcriptional regulator - Prom 12651 - 12710 5.2 + Prom 12652 - 12711 7.4 11 8 Op 1 . + CDS 12741 - 13244 583 ## COG0716 Flavodoxins 12 8 Op 2 1/0.000 + CDS 13258 - 13668 415 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain 13 8 Op 3 1/0.000 + CDS 13760 - 14077 394 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit 14 8 Op 4 2/0.000 + CDS 14112 - 15074 946 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 15 8 Op 5 . + CDS 15132 - 15287 215 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase + Prom 16189 - 16248 21.2 16 9 Tu 1 . + CDS 16270 - 16944 411 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase + Term 17034 - 17081 6.5 - Term 17022 - 17069 6.5 17 10 Tu 1 2/0.000 - CDS 17121 - 18137 920 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 18170 - 18229 80.4 18 11 Tu 1 . 
- CDS 19074 - 19628 522 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 19680 - 19739 10.8 + Prom 19619 - 19678 7.8 19 12 Op 1 . + CDS 19922 - 20704 853 ## gi|266623126|ref|ZP_06116061.1| putative transcription regulator 20 12 Op 2 7/0.000 + CDS 20745 - 22094 1174 ## COG4209 ABC-type polysaccharide transport system, permease component 21 12 Op 3 2/0.000 + CDS 22109 - 23005 929 ## COG0395 ABC-type sugar transport system, permease component 22 12 Op 4 . + CDS 23005 - 23653 519 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|229784035|gb|GG667700.1| GENE 1 3 - 1304 1299 433 aa, chain + ## HITS:1 COG:TM0068 KEGG:ns NR:ns ## COG: TM0068 COG0246 # Protein_GI_number: 15642843 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Thermotoga maritima # 1 430 104 532 539 441 51.0 1e-123 KRVFGCFTEALKGLPDETNDRRRLKEIFANPSLQMVSFTITEKGYALRGADGQYYSFIQA DIDHGPERAVGAIAVIAAMLLERYQAGAKPLALVSMDNCSQNGARLRSAVLEIGREWRDR GFVEEEFVNYISDEERIAFPWTMIDKITPRPGETVKTMLQEAGVEDMDIVVTQKQTYIAP FVNAEEPQYLVVEDHFPGGRPPLEKAGVYMTSRETVNKAERMKVTVCLNPIHTALAPYGC VLGFTLFSDQMRDPEMVKLAEKVGLREGMEFVEDPGIISPKAFIDECINVRFSNPYLGDT PQRIAVDTSQMVGIRFGENIKACVRKYGNAKKLTGIPLAIAGWLRYLLAVDDEGKPFELS PDPMIPELQAQMAGIVFGDPDSLQDQLRPILSNRNIFGIDLYDAGIGETITEMLREEIAG KGAVRAALKKYLA >gi|229784035|gb|GG667700.1| GENE 2 1447 - 2658 1075 403 aa, chain + ## HITS:1 COG:CAC3428 KEGG:ns NR:ns ## COG: CAC3428 COG1151 # Protein_GI_number: 15896669 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 5 403 3 415 567 487 58.0 1e-137 MENRMFCYQCQETAGCSGCTKSGVCGKGPRTAALQDMLVWVTKGISEAAVRLREENRNVP DEVNSRVMENLFITITNGNFDDEAIIRRIRRTLAMKRDLIDCLSGTEGLSEAALWDDAEA MEEKAKGIGVLSTEDEDIRSLRELITYGLKGMAAYLKHAAALERKAENGAVYQADGMRDA EEFLEKALAATLDQNMTVPELTALALETGEYGVKAMAILDEANTSAFGHPEITKVSIGVR KNPGILVSGHDLCDLEQLLEQTEGTGIDVYTHSEMLPAHYYPAFKKYAHFAGNYGNAWWK 
QKEEFEAFGGPVLLTTNCLVPPKDSYKGRLYTTGAVGYPGCKHIEADADGTKDFTEMINQ AKQCPAPVELETGSIVGGFAHNQVLALADPIVEAVKSGAIKKF >gi|229784035|gb|GG667700.1| GENE 3 3594 - 4046 567 150 aa, chain + ## HITS:1 COG:CAC2750 KEGG:ns NR:ns ## COG: CAC2750 COG1151 # Protein_GI_number: 15896007 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 1 147 381 527 530 266 85.0 1e-71 MAGCDGRAPSRNYYTDFARALPKDTVILTAGCAKYKYNKLNLGEIGGIPRVLDAGQCNDS YSLAVIALKLKEVFGLDDINDLPIIYNIAWYEQKAVIVLLALLSLGVKNIHLGPTLPAFL SPNVANVLVEQFGIAGIRTAEEDMKIFFEI >gi|229784035|gb|GG667700.1| GENE 4 4050 - 4919 703 289 aa, chain - ## HITS:1 COG:CAC1451 KEGG:ns NR:ns ## COG: CAC1451 COG2207 # Protein_GI_number: 15894730 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 32 274 36 285 295 98 27.0 1e-20 MAHIIHEEVSENVNLPAKLELVHCMPMTIPAHWHEYLEILCLIDGRLTAVIQAESYQPEP GSLLVINSNELHMTQTAGLTTYVLLQISAEQLRRFFPNFEALHFQTLISKENRSPDQKEM FFHLETMVSEYEKQADGYQLLVTAHLYEFLYYLYRCCSHWNEDGAAGGSSRDMRRIAGIM DYVRRNFRKPLTLDDAAASQGLSREYFCRLFKKYTGQTFLAYVNSVRTMNFYEDLLKSDE SITQLMVQNGLTNYKVFMRIFKEMYGTTPQKIRKSSIHQADSGFCLPPP >gi|229784035|gb|GG667700.1| GENE 5 5098 - 7680 2467 860 aa, chain + ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 503 849 6 318 721 249 40.0 1e-65 MKLKNRTFTGSTSDAVTKREQENREIARRAASEGFVLLKNDGHLLPLAAKGKIGLYGAGA VKTIKGGTGSGDVNERDCVSICQGLKAAGYEVTSDAWLSSYETIYADARQAWKEEVLRKL KQYDGNFFQAYSTTPFVVPCGDSIDEEAAKTDGADVAVFVLSRIAGENADRHDTEGDYFI TKEEKSLLAQISASYDNVILVINTGGLIDLAFTEEFTNIKSIVQFMQAGQEGGSAFADIM SGAVNPSGKMTDSWAYTYLDYPNARTFSHKNGNTDTEKYEEGIFVGYRYFDTFDVPVRYG FGFGLSYTEFSVVGTGVSASGLGTDQPKISVTASVKNTGNTYAGKEVAEVYVSCPQNGMP KEYRRLAGFAKTRLLSPGENQDLTITFPLYQLASYHEDRSAWILEAGTYGIWVGNSLESA 
SLSATISLDADVVMVQCESICERKEDLKELIPDAGKMKEKEAVWKSLAESLKLPDLSVCA DQIVTETVAYPEHLGVTEGKAGEIVSTLSEEQLIALATGDPGKGQENALGSAGLTVPGAA AETSSAAIESPWNVASMVLADGPAGLRLHKTYQVVDGKINKGSFLQAFEGGFFADPEEPE GTTYYQYCTALPVGTLLAQTWDVNLLKEAGEMIGREMELFNVTLWLAPGMNIHRNPLCGR NFEYYSEDPLLSGMMASAITLGVQKVPGCGTTIKHFACNNQEDNRMGSDSVLSERALREI YLKGFEIAVKNSQPMSIMTSYNQINGVHAANSYDICTKAARDEWGFAGAIMTDWTTTTAS TAGVCTAAGCMRAGNDMVMPGVKEDHDNIRRELKEGTLEISALQRCICNTVNLVLQSNQY EDAVSYSSRFEKLDTYMTAE >gi|229784035|gb|GG667700.1| GENE 6 7921 - 8601 640 226 aa, chain + ## HITS:1 COG:CAC0321 KEGG:ns NR:ns ## COG: CAC0321 COG0745 # Protein_GI_number: 15893613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 6 216 3 228 230 150 34.0 3e-36 MDYTYRLLAVDDETDILRTNRKYLEARGYQVDSAVCASEALELLRKQKYDCILLDVLLPD MNGFELCEAVRAITSAPILFFSCMDDEEDKIRGLMAGGEDYITKPYSLKELAARVYAQVR RSTIKRVAVDRENQLLRIDDRIIPLAQKEFELFLFLLGNPGRILSASELYQEVWRTGKPD NANTVAVHMTRLRHKLEEAEPVIGRIETVRGEGYRFLPKVEAGTIL >gi|229784035|gb|GG667700.1| GENE 7 8598 - 10568 1750 656 aa, chain + ## HITS:1 COG:BS_resE_4 KEGG:ns NR:ns ## COG: BS_resE_4 COG0642 # Protein_GI_number: 16079368 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 385 654 11 266 269 148 33.0 4e-35 MKRFLAIFLLIFSLCLALFFASSRPFVSEGPKAGEGILDFREADFTSSVYELSGQWEFYH SQLYTPEDFRTGSPAGMEYIDLPGPWTRMGYPGLGYATYRLRILTEPGQIYLLYIPEIMS SSAVWLNGKEFYHAGKVGMSDLDTVTGVRNELLAAAAPDGVLELVIQAANYHMNGSGIFY PILIGRDTVLTHCMFWQRTVAAAALGGILLIGVYHLFLYFFRRRERLYLTFSLTCLAAVL RLGMESNALVQYFLPGGIGFLLSRVFLLLFTIHSLCICLFMVQAFSIQLGRCLRIFYILC FGLPMAGICVLPYAQAVTGMFLVMIPYGISIFLAFRSGNIGRDPYRLLYLFSMVTFVIYG PLTKTLFEGDMFLPGIVPNMFLILSQCVMLSRSYAQAHEEVERVNANLEDLVKQRTAQLN SANEQLAASQSALREMISNISHDLKTPLTVLNNYLELLGDETVASSEQERTEYLGIAYHK NLDLQRLIHNLFEVTRMEGGTVSYRLEWVLALPLMEEAERKYGDLAGGMGLFFTAEADDA 
LKLKIDRNKIWSVLDNLVYNALRHTPEGGSIALGIKRRGDRAVLTVADTGEGIDAGHLPH IFERFYKVSPERGEKDGSSGLGLYIVKTTAEAMGGTVEVESVPGKGTVFTLTFPAN >gi|229784035|gb|GG667700.1| GENE 8 10645 - 11193 429 182 aa, chain + ## HITS:1 COG:BMEI0929 KEGG:ns NR:ns ## COG: BMEI0929 COG2199 # Protein_GI_number: 17987212 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Brucella melitensis # 1 165 55 220 255 93 32.0 2e-19 MKNVLALLKNENRRLSQQAETDWLTGLYNRMAVEEKVGRRLKKYETGVLFVIDIDNFKSI NDRYGHLAGDQVLRGVARILQERVFHSDILGRIGGDEFVIFMSVKQDQNFIEERCRQIRQ QFLNLPPDQFMVTRLSVTVCGSIYQKGDDYQRLFDRADQRLITEKSLRKKRPADADNRPE NT >gi|229784035|gb|GG667700.1| GENE 9 11197 - 11649 332 150 aa, chain + ## HITS:1 COG:no KEGG:Closa_3846 NR:ns ## KEGG: Closa_3846 # Name: not_defined # Def: SARP family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 4 130 260 386 399 66 31.0 4e-10 MKSLKTDMEQVSLELSEPNPYAGAFCQDYDSFVGIYRFVERRLQRVQSSVYSLLFTLADE HGDFPDLPERILLMEELHTIIQMSLRAGDVFTRYSSCQYLVMVSDATDLQTDGIAERIRE KFINSSSFSDKYILVYKRYPLKPSEARGRK >gi|229784035|gb|GG667700.1| GENE 10 11688 - 12575 794 295 aa, chain - ## HITS:1 COG:BS_ywfK KEGG:ns NR:ns ## COG: BS_ywfK COG0583 # Protein_GI_number: 16080817 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 1 211 1 204 299 73 29.0 5e-13 MYNHMLDTLIAAADCGSFTRAAERLYISPTAVMKQINALEDHLGMKLLERTPSGVSLTAA GAVIYQDAKFMIDYSKKSIAGAKAAIHARDTVFCIGTSLLNPAKPFMDLWYRVNHDFPDY KLHLVTFQDNHEGILSEIEQLGGKFDFLIGVCDSKAWLSRCRFLPLGRYKKMIAVSREHR LADRKSLEIEDLYGETMMMVERGDSGVNDFLRNDLEKHHPQITIEDTPRFYDLSVFNRCA ETGNVLLTIECWQEVHPGLVTIPVNWDYSIPYGLLYSLDAPEDVERFAEVVKEIL >gi|229784035|gb|GG667700.1| GENE 11 12741 - 13244 583 167 aa, chain + ## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 4 166 10 172 179 199 53.0 2e-51 
MAELIAFFSRGEENYVNGEIRTLKKGNTEAAAGMIQKFTGADLFKIEPQQAYSADYNECT AQAQADQRRNARPELAAYPESLEPYDVIYLGYPNYWGTMPMAVFTFLEHFDFTGKTIRPF CTHEGSGMGSSAADIKKLCPGAIIGKALAIRGGRVNQSEKEIESWLK >gi|229784035|gb|GG667700.1| GENE 12 13258 - 13668 415 136 aa, chain + ## HITS:1 COG:MA0416 KEGG:ns NR:ns ## COG: MA0416 COG1917 # Protein_GI_number: 20089309 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Methanosarcina acetivorans str.C2A # 8 135 12 140 141 137 56.0 5e-33 MNAENLSVFPAGKPNDAYAEYFTGQSYLAMLSTEQVVIGNVTFEPGCRNHWHIHHADKGG GQILLVTAGRGYYQEWGKEARELKPGDAVAIPAGVKHWHGAAPDSWFQHLAVEVPGEYCR NEWCEPVTGEDYDKLG >gi|229784035|gb|GG667700.1| GENE 13 13760 - 14077 394 105 aa, chain + ## HITS:1 COG:Cgl1022 KEGG:ns NR:ns ## COG: Cgl1022 COG0599 # Protein_GI_number: 19552272 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Corynebacterium glutamicum # 2 104 3 105 107 144 67.0 3e-35 MEKVTAGKEQLGDFAPKFAELNDDVLFGEVWAREGQLSARDRCIVTVTALMASGILDSSL KFHLFNAKKHGVTKEEIAEILTHAAFYAGWPKAWAVFHMAKEVWQ >gi|229784035|gb|GG667700.1| GENE 14 14112 - 15074 946 320 aa, chain + ## HITS:1 COG:lin2113 KEGG:ns NR:ns ## COG: lin2113 COG0667 # Protein_GI_number: 16801179 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Listeria innocua # 1 320 1 327 331 359 54.0 4e-99 MEYTTLGNSGIEVSRLCAGCMSFGDPASKMHAWTLNPAESEAIIRHALDLGINFFDTANC YSAGTSEEYLGRAIRNNTARDRVVLATKVYFNEGRLSKNAILREVEGSLRRLGTDYIDLL IIHRFDYDTPVGETMEALHKVVQSGKVRAIGASAMYGYQFMEMQYTAEKNGWTKFVSMQN HYNLLYREDERELIPICDTQNVARTPYSPLAAGRLSRPTWSADTLRSRTDVTAQGKYDKT RESDMVIVKRVSELAEKKDATMTQIALAWQFAKGAASPVIGATKAKYLDDAAGSFSVKLS KEEIDYLEEAYVPHHVVGAL >gi|229784035|gb|GG667700.1| GENE 15 15132 - 15287 215 51 aa, chain + ## HITS:1 COG:ECs0335 KEGG:ns NR:ns ## COG: ECs0335 COG0656 # Protein_GI_number: 15829589 # Func_class: R General 
function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Escherichia coli O157:H7 # 1 51 1 52 191 68 57.0 2e-12 MEYVKLSNGMEMPILGYGVYQVSKEECERCVLDALRAGYRSIDTAQAYFND >gi|229784035|gb|GG667700.1| GENE 16 16270 - 16944 411 224 aa, chain + ## HITS:1 COG:YPO2805 KEGG:ns NR:ns ## COG: YPO2805 COG0656 # Protein_GI_number: 16123003 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Yersinia pestis # 1 202 89 290 297 245 53.0 5e-65 MWVEHYGYEECRKSVEASLRKLQTDYLDLMLLHQPFADYYGAWRALEELYEEGKLRAVGI SNFYPDRMVDLASFARIRPMVNQVETHPYNQQILAKEYMDQYGVQIEAWAPFGEGRGGLF EDPVLMETGAKYGKTTAQIMLRWNIQRGVVVIPKSVHYERMVENFDVFDFSLSDGDMEKI ASLDKKASSFFSHQDPATVEWFVQMVEERKKNQAGAAQKTGNNN >gi|229784035|gb|GG667700.1| GENE 17 17121 - 18137 920 338 aa, chain - ## HITS:1 COG:BH0796 KEGG:ns NR:ns ## COG: BH0796 COG1653 # Protein_GI_number: 15613359 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 76 337 240 497 500 69 26.0 8e-12 MRGDILDKLGLREQAKNMKTWADYENIMAAFKESGLPGYATGGGAGFGMISNNGCIYQGE NISDSYIFDPLGDVYFSLATDQNGKAMNQVAMEETQNQFKMFRKWMDAGYMYPDSAYDST GAKELFNQGVLFSVFAASEYGVETSWRASSGYDVECYLVGESPLNTSNAQKFGLVLPTTC KEPEAVMKWINLLYTNPDIMNLITWGIEGKSYEVVDGVAQYIGDADALTSGFHNNDYKIG NAFLCLPWSGAAPTFREESLEFFKGAPASNYLGLTVNTSDHESLIAAISAVTDEFHGQMC GGFYTDDLYQEYLQKLKDAGIDEYIALYQDSIDKFMKK >gi|229784035|gb|GG667700.1| GENE 18 19074 - 19628 522 184 aa, chain - ## HITS:1 COG:L127223 KEGG:ns NR:ns ## COG: L127223 COG1653 # Protein_GI_number: 15673477 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Lactococcus lactis # 80 179 52 146 483 58 32.0 8e-09 MKKRLLLSLLSFQMAATLLVGCGTTSEEAVPKSSGSENSESTESTEAGETENSENEITEI NYWTWEAGDGTAGSNSLGDEVEAAINQITEKEIGVHVNFNWVNGADYATQLSLAILNGEQ 
VDVADYYYSGAGTFSQLYSSGSLMEITDLAAEYAPEAMETIGSEVLSGVTLNGKLYGVPT YRVS >gi|229784035|gb|GG667700.1| GENE 19 19922 - 20704 853 260 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623126|ref|ZP_06116061.1| ## NR: gi|266623126|ref|ZP_06116061.1| putative transcription regulator [Clostridium hathewayi DSM 13479] putative transcription regulator [Clostridium hathewayi DSM 13479] # 1 260 1 260 260 501 100.0 1e-140 MYIDYLTDSITRLIQIVNDNMANETEWDIAVQMIKHMDEIPGASIQQVADICHCSIATIS RFVKKLNFENFSTFRYKLAMDLFNSPKLNLKMPGEYQIDYAELKENYLETVKTQIDELEK GLSLDEIERAVELLHAAETILIYSHGDLDFHAFQVDLAMHGKASWHVKGYSEMKKALDTL HDNTVVIAPLFSTQNGRQNVEMLHNLGIKMIIVSRGSVNVYEKYADAFFRIRMSETKMDD YLIHLLFDILSMHYRKRYME >gi|229784035|gb|GG667700.1| GENE 20 20745 - 22094 1174 449 aa, chain + ## HITS:1 COG:lin2117 KEGG:ns NR:ns ## COG: lin2117 COG4209 # Protein_GI_number: 16801183 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Listeria innocua # 146 449 4 309 309 262 46.0 1e-69 MNSFKEQKRVIPALQVLLAVFMMVSYAFPYGKYVYKKGAYHMSALDYLFGKNIAGGSVDI EPQRLFTLFIIAAILFTAAAGIHRFKSEKAAGIMTMTGGAVAAAVPIIFMYNFNTVMGGA KKTGVLYGPYITMFFGVLALVLGFYILYRCKTLNVLDIMVIPGLIYLIINNYIPMSGIVI AFKSLNYSQGAFNSPWCGLENFEFLFQNKDIWNIIRNTIGYNVVFIILNNLFGIAVGIML SEAVSKVFQRVSQTLILLPQIISWVIVAYMVYGLFSTNTGWINNTLLPLFGYDGPNIAFY GTRKYWPFILTFVAVWKGLGYNSIIYLSSIIGIDRNLYEAAYIDGCSKLKQIRHITLPLL KPTVIILVIMALGHIMNSNFGLFYQTTQNSGALYPVTQTLDVYVYRSIMETQNLSMGAAA AALQSIVGFCLIMIANTAIRKFDSDNALF >gi|229784035|gb|GG667700.1| GENE 21 22109 - 23005 929 298 aa, chain + ## HITS:1 COG:L128777 KEGG:ns NR:ns ## COG: L128777 COG0395 # Protein_GI_number: 15673478 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Lactococcus lactis # 4 298 10 306 306 209 38.0 7e-54 MDSSVKSFSTRIFQIVVTVFLTIFMLLALLPFIMLVMSSFSTESNLYLEGYGFWPKGFTT AAYEFLFKSNLGNMMRAFGMSIAVTGIGTGLHLIIAPMFAYPLSRRDYKHGKFLMVVLLV 
TMMLNGGMVGQYLMWTNLFHVRNTIWALVFPNLLFSAFQIVLYKNSFANNIHPVLIEAAK IDGANELYIYFKIVLPLSTPILATVGLMVGIGYWNDWINGQYYITETNLYSLQVMLTKIM QNLQMLVNMGDTGITSSEMPGVSVRMAIAVMGTVPILVLYPFFQKAFVAGISLGGVKE >gi|229784035|gb|GG667700.1| GENE 22 23005 - 23653 519 216 aa, chain + ## HITS:1 COG:Cgl1466 KEGG:ns NR:ns ## COG: Cgl1466 COG0477 # Protein_GI_number: 19552716 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Corynebacterium glutamicum # 8 187 19 197 403 71 27.0 9e-13 MKKSNETIWSKPFLMLFLLTIFDQMAFLCTRAVISKYALDLGHTEAMAGVVAGALSVAAL FSRPVAGRLLGSSRISKKKILMCSVAASLLISISYMLVRNFVPLVLVRVFNGIAYGISGT VELTLVSDSLDEKVMGRGIAVFGLGNIIGLAVAPSIAVFLYDNYSPNTLFSFCIATSIIA VFIVAAIPGGKKTEAAFKPGKIRKGRVTLKEFLGSF Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:50:09 2011 Seq name: gi|229784034|gb|GG667701.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld94, whole genome shotgun sequence Length of sequence - 22171 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 7, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 118 - 177 6.3 1 1 Tu 1 . + CDS 255 - 851 783 ## Closa_0782 hypothetical protein + Prom 970 - 1029 6.4 2 2 Op 1 . + CDS 1089 - 2255 1256 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 3 2 Op 2 . + CDS 2252 - 3466 1154 ## COG0101 Pseudouridylate synthase 4 2 Op 3 . + CDS 3468 - 4646 1271 ## COG0053 Predicted Co/Zn/Cd cation transporters 5 2 Op 4 . + CDS 4724 - 5296 647 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 6 2 Op 5 . + CDS 5368 - 6903 1610 ## Closa_0788 hypothetical protein + TRNA 7043 - 7114 64.6 # Gln CTG 0 0 7 3 Tu 1 . 
+ CDS 8394 - 10670 1948 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Term 10842 - 10894 12.1 + TRNA 10768 - 10839 64.6 # Gln CTG 0 0 + Prom 10765 - 10824 75.8 8 4 Op 1 . + CDS 11039 - 11770 733 ## COG3884 Acyl-ACP thioesterase 9 4 Op 2 1/0.000 + CDS 11767 - 12603 307 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains 10 4 Op 3 1/0.000 + CDS 12635 - 13219 664 ## COG0500 SAM-dependent methyltransferases + Term 13241 - 13279 4.3 + Prom 13354 - 13413 7.5 11 4 Op 4 . + CDS 13552 - 14664 1148 ## COG3839 ABC-type sugar transport systems, ATPase components + Term 14682 - 14718 6.4 + Prom 14679 - 14738 6.7 12 5 Tu 1 . + CDS 14768 - 14938 173 ## Closa_0794 CdaR family transcriptional regulator + Prom 15777 - 15836 80.4 13 6 Op 1 . + CDS 15856 - 16683 1160 ## COG2508 Regulator of polyketide synthase expression + Term 16735 - 16794 5.5 14 6 Op 2 28/0.000 + CDS 16872 - 17558 319 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 15 6 Op 3 2/0.000 + CDS 17548 - 18459 965 ## COG2177 Cell division protein 16 6 Op 4 1/0.000 + CDS 18472 - 19674 956 ## COG0739 Membrane proteins related to metalloendopeptidases 17 6 Op 5 . + CDS 19698 - 20999 1633 ## COG0793 Periplasmic protease + Term 21000 - 21051 14.1 - Term 20989 - 21032 7.6 18 7 Tu 1 . 
- CDS 21042 - 22169 944 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes Predicted protein(s) >gi|229784034|gb|GG667701.1| GENE 1 255 - 851 783 198 aa, chain + ## HITS:1 COG:no KEGG:Closa_0782 NR:ns ## KEGG: Closa_0782 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 195 2 196 199 305 72.0 8e-82 MERAITISREFGSGGRELGVRLAESLGIPFYDKELISMAAEESDMPEDAFLHYDENMPLI KDPLDRHFYSPFSSIYQVPMSDQIFLAQSRVIRKLAEQGPCVIVGRCADMVLENSVNLFI YAKMKDRIRRMNSLETGVDPVKMESRIREVDRKRKDYYQYYTGNEWGKAQNYHLSLESGL VGVDGCLKAVLAYLESVQ >gi|229784034|gb|GG667701.1| GENE 2 1089 - 2255 1256 388 aa, chain + ## HITS:1 COG:BH0936 KEGG:ns NR:ns ## COG: BH0936 COG0436 # Protein_GI_number: 15613499 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus halodurans # 4 378 5 378 385 356 44.0 3e-98 MRFAERMDRFGEGIFSKLLEIKRQKEERGETVIDLGVGAPNIPPARHITDALCRAAADPK NYVYAISDQKELLKAVAQWYERRYGVNLNPDTEICSLLGSQEGLAHIALSIADEGDIVLV PDPCYPVFADGPLLAGATLFYMPQKKENGYVIDVKEIPEEVAEKARLMVISYPNNPTTVM APDSFYEEVIAFAKKHDVIVLHDNAYSELVFDGRTCGSFLRFPGAKEVGVEFNSLSKTYG LAGARIGFCVGNEEVVSHLKMLKSNMDYGMFLPVQMAAIAAVTGDQACVEETRKAYERRR DALCDGMTAIGWRMEKPEATMFVWAKIPDNFTDSSSFAMELAEKSGLLVTPGSAFGPSGE GYVRLALVQDEDVLKKAAELVDKSGILR >gi|229784034|gb|GG667701.1| GENE 3 2252 - 3466 1154 404 aa, chain + ## HITS:1 COG:CAC3099 KEGG:ns NR:ns ## COG: CAC3099 COG0101 # Protein_GI_number: 15896350 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Clostridium acetobutylicum # 157 400 2 244 244 201 43.0 2e-51 MKMKITKRQCFWSAAAVLYVLFIFSNSMKNADLSSADSGAVLRLVQQVLAAGGVDGTIIT EHVIRKTAHFTEYAALGILLCLCFQTYVLPLHSRVLSQVLAGFLVPFADETIQLFVAGRS GQISDVWLDCAGVAFGTILFAAAAWIVSRTGAKKRDKNYKIVLQYDGSRYDGWQKQGNTG NTIQGKLEGILLKLTGRPVEVHGSGRTDAGVHALAQTANFHADTGMTEDQIRAYFNQYLP EDIAVLSVEKVQDKFHSRLNAVQKTYLYRIEMGPKKDVFERKYCYGLGKQLSADRMKEAA 
ACVLGEHDFKSFCGNKKMKKSTVRTISDITIEKEGTKLLIRYTGNGFLYHMVRILTGTLI EIGLEERSPQEMKEILLKLDRQAAGFTAPPEGLFLERVMYEDRE >gi|229784034|gb|GG667701.1| GENE 4 3468 - 4646 1271 392 aa, chain + ## HITS:1 COG:CAC0606 KEGG:ns NR:ns ## COG: CAC0606 COG0053 # Protein_GI_number: 15893895 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Clostridium acetobutylicum # 1 361 11 371 403 328 44.0 8e-90 MTDFLVKTFVKDFENTGDTGVRTEYGLMASTVGICCNVLLFAAKLFIGLLINSISVMADA FNNLSDAASSIIGFIGVKMAGKPADEDHPFGHGRVEYIAAFIVAFIVIQVGFSLFKTSAG KILHPEEMTFKYISIVILLMSVCVKLWMGMFNRKLGKRINSSVMMATAADSMGDVGTTSA TILSILVYGIWGINIDGIVGLIVSLIVMWAGVGIARDTLAPLIGEPIDPKLYREITEFVE SFDGIVGSHDLIVHNYGPSRSMASIHAEVPNDVSIEKSHEIIDYVEREAQRRFGIFLVIH MDPIETKDSRVTEFKKMVESIVMSIDSKCSFHDFRMVEGEKQINLIFDLVVPRDYDRAAR EAIRKEITEKVAERDSRCCLVMTTESGFGVEE >gi|229784034|gb|GG667701.1| GENE 5 4724 - 5296 647 190 aa, chain + ## HITS:1 COG:AGc3605 KEGG:ns NR:ns ## COG: AGc3605 COG1595 # Protein_GI_number: 15889274 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 23 185 52 217 223 87 35.0 2e-17 MLFMGMMSLDGPEDYGRLSDEVLLNKVGTGDQEAFRQLYQNTDRSIYGFILSIVKNPQDA EEIMQETYIKVWTSAAGYKSQGKPLAWMFTIARNLCYMKFRNQKHESDIAFEDLEGEETG EVCPEIEQAADKLVLAAALKILKEEEREIVLLHATAGMKHREIAASLGMPLATVLSKYNR AMKKLENYLT >gi|229784034|gb|GG667701.1| GENE 6 5368 - 6903 1610 511 aa, chain + ## HITS:1 COG:no KEGG:Closa_0788 NR:ns ## KEGG: Closa_0788 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 499 7 485 485 470 60.0 1e-131 MFHKTDRQLNRKQLKAQLKSAVMDLTPNVLDKIDLTMPQMTAEELHEEAGRAVRSIAVMR RRMRMMAALAAACLCMVAAAGGTYTYRNGKVDSVIGIDVNPSVELSVNRKNRVLEAEPLN EDAKAIMEDMNLKGVDLNVAVNAVIGSMVTHGYLDDLDNAILVTVSNDSISKASALRSSV VNDIQSSLEENQVKAVVYDQQVIEEDDVKALADEYRISYGKAYFLKELIDQNPALTMEDM EKLAPLTMEEIAKEITERSYAVGGHAGVSEDMTTAAQTTTQPVTTVPVTTEAATEETTTA 
EESESETGNQTTAAPVPQTTAPVPSGLPEMTTEAPTTEAQVPVSDAKITIDYVDYENGVV TVNFKTKVKWKNPTVSVKDEGGTTYSAMVGDTDSSTCEIYVSGLEGGTSYTFVLGGISPK NGKQTTVKGTFETPVIGDGGANGEPVTLPETEETKETETQPQTEESSEASATSPAETESV PQTESTPPTEPPTEAPTETRTEAPADTPPAV >gi|229784034|gb|GG667701.1| GENE 7 8394 - 10670 1948 758 aa, chain + ## HITS:1 COG:STM3749 KEGG:ns NR:ns ## COG: STM3749 COG1501 # Protein_GI_number: 16767034 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Salmonella typhimurium LT2 # 1 729 1 728 772 862 53.0 0 MKFTEGYWCTKKEITPLYAAEFADSRREGDELILYAPGKHIAGRGDCLNLGMLTVRLSSP MEDVIKVSVRHFEGTAYNGPFARIHQTSPHVTVEETEDEIRYQSGRTTAVIDKRPNCWGI RFLDGERELTSTGYRNMAHMTNRDTGKTYMVEQLAIDVDECIYGLGERYTPFVKNGQVVE MWNEDGGTASEITYKNIPFYITNKGYGVLLDNEGDAAYEIASEKVERLQFSVEGERLDYY VINGSTPKGTISKYTELTGKPALPPAWSFGLWLTTSFTTNYDEGTTSGFIQGMADRDIPL HVFHFDCYWMEAYEWCNFTWDPKTFPDPAGMLKRYHDRGLKICVWINPYIGQKSPLFQEG MEHGYLIRKTNGDVWQTDMWQAGMGLVDFTNPDAAAWYQGKLKTLLDMGVDCFKTDFGER VPVKDIAYYDGSDPVKMHNYYTYLYNQAVFQLLERERGTGEAVLFARSATVGGQQFPAHW GGDCSATYPSMAETIRSGLSLACAGFGFWSHDISGFENTAPADIYKRWCQFGLLSSHSRL HGSSSYRVPWLFDDEACDVLRKFVKLKCALMPYLYRQAVKAHEEGIPMMRPMYVEFPEDR ACEPLDKQYMLGDSLLVAPVFKESGEVEYYLPEGVWVNLLTGTTVKGGRWQKETHDYFSL PLMVRPGSIVAVGQESSRPDYDFADGVRFLLWLPEDGMTAETGVTDLNGNVVMTVSAGRT DGKITLRAERTSAGVSADVSTNFTFEVLGEEKLEVVVK >gi|229784034|gb|GG667701.1| GENE 8 11039 - 11770 733 243 aa, chain + ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # Protein_GI_number: 15896825 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Clostridium acetobutylicum # 12 206 16 211 248 133 35.0 3e-31 MYTFGSRIRYSETDEYGKLTLTGIMNYLQDCSTFQSEDIGLGISYLTEHHQAWWLSSWQI VIDRYPALGEEVVIGTWPYDFKGFYGYRNFTICDQAGNYLVRANSVWFLFDTELGRPVKV KEENIRGYGRGDEKRLEMEYAPRKITVPADYVETECVTIAKHHIDTNHHVNNAQYVEIAR EVLPAEIEVGELRVEYKKAAVCGDVVYPHVSRTDEGYTVSLCDEQGAAYAVIWLRGKTRE DII >gi|229784034|gb|GG667701.1| GENE 9 11767 - 12603 307 278 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 4 268 2 270 285 122 32 2e-27 MIELGKVQTLTVQRVKEFGVYLGEGAASEASVLLPKKQVPEGAKPGDEISVFIYKDSEDR LIATTGIPKLSVGETAMLPVKEVAKIGAFLDMGLEKDLLLPFREQTHKVRQGESVLVALY VDKSQRLAATMRVYAYMSSESPYQKDDQVTGTIYEINENLGAFVAVDHKYYGLIPKKELF DDYHEGDTVSARVTKVREDGKLDLSPRQKAYLQMDDDSARVLAVIGDQFGGVLPFTDKAA PEVIKREFQMSKNAFKRAVGHLLKDGKVRITEKTIEIL >gi|229784034|gb|GG667701.1| GENE 10 12635 - 13219 664 194 aa, chain + ## HITS:1 COG:VC0813 KEGG:ns NR:ns ## COG: VC0813 COG0500 # Protein_GI_number: 15640831 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Vibrio cholerae # 3 194 2 192 192 137 38.0 2e-32 MDSIDYYNKYAAKVFEETVDQDMSEIRTEFLNLLEEGDTILDLGCGSGRDSLIFYELGYD VTAMDASEEMCKLAEIHTGLEVLHMTFEEMDFDSVFDGIWACASLLHVPEKELSDILTKI ARALKDSGILYMSFKLGDFEGFRGERYFCDYTEDAMEEVLKDNDRFDVVKIWETEDVRSG HSDTRWLNVLVRKQ >gi|229784034|gb|GG667701.1| GENE 11 13552 - 14664 1148 370 aa, chain + ## HITS:1 COG:CAC3237 KEGG:ns NR:ns ## COG: CAC3237 COG3839 # Protein_GI_number: 15896483 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Clostridium acetobutylicum # 1 369 1 369 369 487 65.0 1e-137 MASLSLKNITKKYPNGFEAVKDFNLEIADREFVIFVGPSGCGKSTTLRMIAGLEDISSGE LYIDGKLMNDVEPKDRDIAMVFQNYALYPHMTVYDNMAFGLKLRKTPKDEIDKLVHEAAR ILDLEHLLDRKPKALSGGQRQRVAMGRAIVRNPKVFLMDEPLSNLDAKLRGQMRIEISKL HQRLQATIIYVTHDQTEAMTLGTRIVVMKDGVVQQVDSPQNLYDKPGNKFVAGFIGAPQM NLIEATVAKSGSDVTLTFGGHTIALPEGKAKKLEGAGYVGKTVVLGIRPEDLHDEEAFLA SSPNSIIDATIRVYELLGAEVYLYFDVDDANFTARVNPRTTARPGDTVKLALDLTKIHVF DKDTELVILN >gi|229784034|gb|GG667701.1| GENE 12 14768 - 14938 173 56 aa, chain + ## HITS:1 COG:no KEGG:Closa_0794 NR:ns ## KEGG: Closa_0794 # Name: not_defined # Def: CdaR family transcriptional regulator # Organism: 
C.saccharolyticum # Pathway: not_defined # 1 56 1 56 362 109 96.0 4e-23 MISNQILQTTIEGLKGITRIDLCICDTEGKVLATTFPDAEDYESSILAFVDSPLAS >gi|229784034|gb|GG667701.1| GENE 13 15856 - 16683 1160 275 aa, chain + ## HITS:1 COG:CAC3236 KEGG:ns NR:ns ## COG: CAC3236 COG2508 # Protein_GI_number: 15896482 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 82 268 134 312 312 86 30.0 4e-17 MVGKLAAFQIQSLLVAYKERFDKDNFIKNLLLDNLLLVDIYNRAKKLHIETNIKRVVFII ETQHEKDVNALETVRNLFAAKTKDFVTAVDEKNIILVKEVRDGETYEELEKTANTILDML NTEAMTQVHVAFGTIVGEIKEVSRSYKEAKMAMDVGKIFYSNKNVVAYSKLGIGRLIYQL PLPLCRMFIKEIFDGKSPDEFDEETLTTINKFFENSLNVSETSRQLYIHRNTLVYRLDKL QKSTGLDLRVFEDAITFKIALMVVKYMKYMENQDY >gi|229784034|gb|GG667701.1| GENE 14 16872 - 17558 319 228 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 216 1 218 245 127 33 6e-29 MIELWNVNKTYETGNKALRNINITVEDGEFVFIIGRSGSGKSTLLKLLLKEVEPTSGKII VNDMNLGKMPRRYIPKYRRRLGMIFQDFRLLKDRNVYENVAFAQRVIGASARSIRESVPT MLTMVGLSSKYKFYPQQLSGGEQQRVAIARALINRPEILLADEPTGNLDGHNSMEIMKLL DEINRRGTTVIVVTHSHEIVDRMKKRVITMDRGTIVSDEKKGGYTHED >gi|229784034|gb|GG667701.1| GENE 15 17548 - 18459 965 303 aa, chain + ## HITS:1 COG:BS_ftsX KEGG:ns NR:ns ## COG: BS_ftsX COG2177 # Protein_GI_number: 16080578 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Bacillus subtilis # 10 303 9 296 296 129 30.0 8e-30 MRISTFWYCLKQGVNNICRNILFSLASIATISACIFLFCLFFSIVVNVQYVVKNTESTVG ITVFFNEGTTDETIAKFGETLKARPEVKEIKFTSSEEAWEEMKVDYFGEDKMDLAEGFRD DNPLANSSSYDIFLNDLENQEEFVTWLKSQNIVRRVNYSSTAAEGITSINKVIGVLSVMI IGVLLAVAIFLISNTISVAAAFRKHENEIMKLIGATNYMIRAPFVVEGVLLGIFGAVIPL AAIFYLYRNAVGYVVEKFSILSGLFQFLPVEQIFPYMAGTAMALGVGIGFFVSFFTIRKH LKV >gi|229784034|gb|GG667701.1| GENE 16 18472 - 19674 956 400 aa, chain + ## HITS:1 COG:TP0864 KEGG:ns 
NR:ns ## COG: TP0864 COG0739 # Protein_GI_number: 15639850 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Treponema pallidum # 278 399 424 545 546 113 44.0 8e-25 MRYRKRMGAAIVTAVLTVSCIFPSYGTKTDMDAARESKSNLENEKKEIEKTLEKLEALKS DTVSYVNQLDAELNSLNKQVTALNSRVAAKEAEIAQAEAELAASRETERRQYESMKLRIQ YMYEKGETSYLDLLFKSADLTQLFNRAEYISQIAAYDRKMLDQYEAASDEVAAREADLKA EHEELVILKTNTEEKQKDAERLMNEKTAELNSYNSKIAASAGSLSELEKDIAAQEDKIKA IEAEIRRKEEEAKKAALAAGQKYNTVSIGNINFIWPCPSSSRITSSFGDRESPTEGASSS HQGVDIGASTGSSILAAASGTVTISTYSYSAGNYIMINHGGGVSTVYMHCSELLVSAGQE VTQGQVIAKVGSTGYSTGPHLHFGIRVNGSYVNPINYVSP >gi|229784034|gb|GG667701.1| GENE 17 19698 - 20999 1633 433 aa, chain + ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 45 409 22 360 408 226 37.0 6e-59 MENKNKFWKGALVGALLTAFAGLIIVGMSLGIFLIGRTAIDGPQQAAESNQAENQNGSLD VNRITKKITTLQQIIDKYYLFDEDTTQVEDWIYKGMMFGLNDPYTTYYTAEEYQKLNEDT EGEYHGIGVMISQNRGTGLITVIKVFKDTPAAEAGMRPGDILYKVGDMEVTGMDMDILVK DYIKGKDGSEVELTVFRQDEGEYVDLKMERRNVTVQTVEYQMMEDSVGYIAVSQFDVVTA DQFKAAVNDLEKQGMKKLVVDLRNNPGGVLGTVVDMLDYILPDDLTIEGDKDLVRTNPEA TLLVYMADKNGKGEQEYASDGHSLDIPMAVLVNGESASASEVFTGAMKDYGRATIVGTKT FGKGIVQNLIPLDNGTAIKMTTAHYYTPSGFDLHGKGIEPDVEVELKEELRNQITVDVKE DNQIEAALKALNQ >gi|229784034|gb|GG667701.1| GENE 18 21042 - 22169 944 375 aa, chain - ## HITS:1 COG:mlr9675 KEGG:ns NR:ns ## COG: mlr9675 COG1502 # Protein_GI_number: 13488516 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Mesorhizobium loti # 2 323 70 405 466 85 23.0 2e-16 DRGVKVQILVDGLYGTLHMEGNPIFYAAGTNPNIEIKFYNIPNPLKPWTINGRMHDKYLL VDDKLLLLGGRNTFDYFLGEYNLRNLSYDRDVLIYNTKHGQEEAWPSSVLSEADAYFEAM WQSKYCKTVFDSPSASMKKKLPAAKTELAEYYHSLKEAMPELTRSGHDYIPETVAINKAT 
LISNPTHILAKEPWVWYQVQYLMAHAEKQAYIHTPYAVFSKDMYEGMKEAADHVPEVTML LNATSVGDNFMASSDYTRNRGKILDTGVSLREYFGDHSSHGKSILIDDRLSIVGSYNLDM RSTYVDTETMLVIDGEEFNRQLKACIDTLEDDSLAVQKDGTYVTDPEVPVIPVPSKKKVI FAVTSRLFQLFRYLI Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:50:30 2011 Seq name: gi|229784033|gb|GG667702.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld95, whole genome shotgun sequence Length of sequence - 18351 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 8, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 13 - 72 5.7 1 1 Tu 1 . + CDS 209 - 1849 2082 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 1898 - 1931 5.4 - Term 1886 - 1919 5.4 2 2 Tu 1 . - CDS 1951 - 3306 1221 ## COG4905 Predicted membrane protein - Prom 3344 - 3403 6.1 - Term 3325 - 3367 11.9 3 3 Tu 1 . - CDS 3413 - 4537 1139 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins - Prom 4620 - 4679 8.3 + Prom 4561 - 4620 11.1 4 4 Op 1 35/0.000 + CDS 4870 - 6276 1582 ## COG1653 ABC-type sugar transport system, periplasmic component 5 4 Op 2 38/0.000 + CDS 6339 - 7268 1006 ## COG1175 ABC-type sugar transport systems, permease components 6 4 Op 3 . + CDS 7274 - 8131 1076 ## COG0395 ABC-type sugar transport system, permease component + Term 8138 - 8174 6.1 + Prom 8181 - 8240 7.3 7 5 Op 1 2/0.000 + CDS 8270 - 9262 1222 ## COG1609 Transcriptional regulators + Term 9280 - 9329 11.1 + Prom 9305 - 9364 5.1 8 5 Op 2 7/0.000 + CDS 9417 - 9836 440 ## COG1846 Transcriptional regulators 9 5 Op 3 . 
+ CDS 9850 - 11202 1467 ## COG0534 Na+-driven multidrug efflux pump 10 5 Op 4 34/0.000 + CDS 11260 - 12084 688 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 11 5 Op 5 1/0.000 + CDS 12069 - 13007 373 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 12 5 Op 6 4/0.000 + CDS 14370 - 15119 953 ## COG0310 ABC-type Co2+ transport system, permease component 13 5 Op 7 . + CDS 15216 - 15539 385 ## COG1930 ABC-type cobalt transport system, periplasmic component + Term 15554 - 15602 7.5 + Prom 15599 - 15658 4.8 14 6 Tu 1 . + CDS 15678 - 16601 689 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 16649 - 16688 1.3 15 7 Tu 1 . - CDS 16709 - 17287 560 ## gi|266623164|ref|ZP_06116099.1| conserved hypothetical protein - Prom 17315 - 17374 6.2 + Prom 17442 - 17501 3.9 16 8 Tu 1 . + CDS 17528 - 18283 863 ## COG0708 Exonuclease III Predicted protein(s) >gi|229784033|gb|GG667702.1| GENE 1 209 - 1849 2082 546 aa, chain + ## HITS:1 COG:BS_ykpA KEGG:ns NR:ns ## COG: BS_ykpA COG0488 # Protein_GI_number: 16078507 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 531 1 533 540 764 69.0 0 MISANNVTLRIGKKALFEDVNIKFTEGNCYGMIGANGAGKSTFLKILSGQLEPTSGDIVM TAGQRLSFLQQDHFKYDEFQVLDTVIMGNARLYEIMKEKDAIYMKEDFTDEDGIKAAELE GEFASMNGWEAESDAANLLNGLGIETEYHYKYMKELNGAQKVKVLLAQALFGNPDILLLD EPTNHLDLDAIAWLEEFLINFENTVIVVSHDRYFLNKVCTQIADIDYGKIQLYAGNYDFW YESSQLMVKQMKEANRKKEEKIKELQEFIQRFSANASKSKQATSRKRALEKIELDDIKPS SRKYPYIDFRPAREIGNEVLTVQNLSKTIDGVKVLDNISFILNREDKVALVGPNEQAKTV LFKILAGEMEPDEGDYKWGLTTSQCYFPKDNSAEFNNDDTIVDWLTQYSPEKEATYVRGF LGRMLFAGEDGVKKVRVLSGGEKVRCMLSKLMISGANVLMLDEPTDHLDMESITALNNGL VKFQGVLIFSSRDHQIVETTANRIMEIVNGQLIDKITTYDEYLASDEMARKRQVFTLTDE QMQENE >gi|229784033|gb|GG667702.1| GENE 2 1951 - 3306 1221 451 aa, chain - ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: lin2818 COG4905 # Protein_GI_number: 
16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 8 201 6 196 270 124 32.0 5e-28 MNEAFFQLISYFFIYSFLGWCMEIAYAALKKKHFYNRGVLNGPLCPVYGIGMILILTFFS SLKESLVFLAIGSAVIATLLEFFTGALMEKLFKKRWWDYSDYKYNLGGYVCIPFSALWGV GAALTLRFIHPLIHMMISRVPALAGQIILIAGLVILAMDFISVLGVVFQVQKYNTRVEEL AEGMQQISNRLARAIFTRIEKRMAKAHPGAAEEHPEADTGSRVKTRSCEAVPAVFADGCS FHKLVWLFFIGAFLGDITETIFCYATSGILMSRSSVVYGPFSVVWGLGVVVLTMMLHKYK DRDDRYIFLFGTVVGGAYEYICSVFTEIAFGTVFWDYSKLPFNLGGRINLLYCFFWGLAA VVWIKILYPRISALIERVPVKTGKILSWLMIVFMTVNIIISAMALARYSDRQTGKEAANS IDYFLDDHFPDKRMEWIYPNAIVKDNSAMQP >gi|229784033|gb|GG667702.1| GENE 3 3413 - 4537 1139 374 aa, chain - ## HITS:1 COG:CAC0359 KEGG:ns NR:ns ## COG: CAC0359 COG4225 # Protein_GI_number: 15893650 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Clostridium acetobutylicum # 35 369 23 357 361 297 44.0 3e-80 MEILDRYIHQLLDQSTPESPIWYKEESDASAPAKWNYMDGCMIKAILELYHITQNSRYLE FADHFIDHFVREDGSIVSYDPEEYNLDNVNAGKTLFDLYKLTGKEKYRKGIETVYSQILR QPRTSTGNFWHKKIYPNQIWLDGLYMALPFYMQYEVEYNNCKNCTDIIDQFRNVRRLMRE PLNGLYYHAFDDSRKMFWCDKVTGLSDNFWLRALGWFSMALIDTMEIMPDSLKSQRSELN TIYRELIDAMLPYQDEETGMWYQVVNRGGICPNYLETSGSAIFAYAIMKSVRLGYLPDTY LPFGEKAFSGICDKYLSEENGELQLDGICLVAGLGNTENREGTFEYYMREPIVKNEAKGI APLILAYIEILMRK >gi|229784033|gb|GG667702.1| GENE 4 4870 - 6276 1582 468 aa, chain + ## HITS:1 COG:YPO1719 KEGG:ns NR:ns ## COG: YPO1719 COG1653 # Protein_GI_number: 16121979 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 69 468 26 427 430 264 36.0 3e-70 MRKNWKKALTLVMAAAMAAGLAGCGNGAKQTEETTAAQTEAAKTEAAATEAETTAEPAQA EGLTVNTTDPIEISFSWWGGDSRHEATLAAVDAFMKKYPNITVKTSYSAWDGWEDKMSSQ FATGTAPDVNQINWNWITSFSSDGSAFYDLNKLSDILDLSQFSEDYLAQCTVADKLQGVP ISMTGRIFYWNKTTYDQAGIETPKSLADLRAAGKVFQEKLGDDYYPLAMNEYDRTIFMVY 
YLESKYGKAWVENNELQYSEDEIKDGLEFIQSLEADHVIPSIATIAGDGAASFDKNPKWM EGKYAGIFEWDSAATKQQGALNEGQEFVVGEEFADMGDYKGGFAKVSMCFGISENTKYPA ECAALVNFLLNEEEGVKILGSERGIPCSAEGLKICEDNGLLNELVAEANSKVLSYVSFPL DPKFEASALKATNEGVYWDVMAGLSYGDYDIDEAAEVLIDGVNKVLSK >gi|229784033|gb|GG667702.1| GENE 5 6339 - 7268 1006 309 aa, chain + ## HITS:1 COG:AGl3351 KEGG:ns NR:ns ## COG: AGl3351 COG1175 # Protein_GI_number: 15891796 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 25 306 8 290 293 295 55.0 7e-80 MKQPAAQSATAERRSFKRIVKENKGFFFILPWLIGFLIFKVYPFGSSLVYSFTDYHLFDG VSKVGLMNYDEIIHTKKIVKAFVVTMKYAFMTVPLKLVFALFIAYILNFKLKGVNLFRTA YYIPSILGGSIAIAVLWKAVFKDDGIINMLLSYMGIEGPNWLASPSHALFVICLLRVWQF GSAMVIFLAALKGVPEDLYEAASIDGAGKWRQFFSITVPLITPVIFYNLVTQLCQAFQEF NGPFIITQGGPRNSTMLISMLIYNNAFKSYEMGLASAQAWLLFLIVMTFTAVAFFSQKHW VYYSDEDGR >gi|229784033|gb|GG667702.1| GENE 6 7274 - 8131 1076 285 aa, chain + ## HITS:1 COG:YPO1721 KEGG:ns NR:ns ## COG: YPO1721 COG0395 # Protein_GI_number: 16121981 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Yersinia pestis # 1 285 24 306 306 292 52.0 6e-79 MTRETRKTVNTIVRYVVLIVVGFIMIYPLIWMVGATFKTNNEIFSGIGFIPKNPTLDGYK NALKSYGGDIDIFKAMINTYSIVLPKVVFTIVSATVTAYGFARFEFKGKKLLFAVLMSTL FLPQVVLNVPQFVLFNKFGWIDSPIYLALIVPTLFATDTYFVFMLIQFLRNIPKELEEAA TIDGCGSIKTLWYVIVPMLKPSIVSCALFQFMWSSNDFMGPLLYVNTPARYPASIFVKMS MDADTGFEWNRVLAMSLISIIPSLIVFLLAQDQFIDGIAAGGVKG >gi|229784033|gb|GG667702.1| GENE 7 8270 - 9262 1222 330 aa, chain + ## HITS:1 COG:CAC0693 KEGG:ns NR:ns ## COG: CAC0693 COG1609 # Protein_GI_number: 15893981 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 329 1 333 334 228 39.0 1e-59 MAVTIKDVAKKAGVSYSTVSRALNDPNADKSDKKKRILKIAEEMGYVPNQAAIQLKMSRS YVIGLYFSTISKMSSPFVLHDVLTGVYSVVGSKYNVIVKGIDMHEPGTLNPSFLDGIIVL 
TQKDDDLEFMDEVLNKEIPMVAICRTVFLDVPNVTTDEAGAMEKAMDYLLENGHRRIGII EGNSNLDSTRMRHRGWRASMIKYGLDPDSLPVEHGTYRYASGYQAAKRLLEQDLTAILSF NDEMAFGAREAINEVGLKVPDDISLVGFDNWDLSGYANMKLTTVERNMGEIAREGTKILL RRLEDGVIDNRRIYLENKLIIRETVKNLNI >gi|229784033|gb|GG667702.1| GENE 8 9417 - 9836 440 139 aa, chain + ## HITS:1 COG:CAC3413 KEGG:ns NR:ns ## COG: CAC3413 COG1846 # Protein_GI_number: 15896654 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 4 127 8 131 143 67 33.0 8e-12 MDTGKLINKISIRLRRRSEKMGKNFGITEIQGRILDFILVDGRDRPLYQKDIEKEFGLRP STATELLKTLESRKMIQRVSSEEDGRYKKIQFTEAADEIRTALQQEIQKTEELLIEGIPQ EDLNTFMRVAETMLANLER >gi|229784033|gb|GG667702.1| GENE 9 9850 - 11202 1467 450 aa, chain + ## HITS:1 COG:BH0886 KEGG:ns NR:ns ## COG: BH0886 COG0534 # Protein_GI_number: 15613449 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 10 410 7 406 454 157 29.0 3e-38 MSKNNSLLDDFTQGSIVGKLVKFMIPVLGALILQAMYGAVDLLVVGQFGTDAGISAVATG SNVINLVTFVITALVMGVTVLISRYLGEKRNERIGGVIGGTVCFFVIFTVVIMALLLLMA PVFASLLNAPEQAYELTVQYVRICGVGIVFVVAYNVISGIFRGLGNSKLPLIFVLIACIV NVIGDLLFVAVFHMNVAGAALATILAQAVSVVLSLLIIRRQELPFTVKRSDIRFNCEIPI FLKLGIPLALQELLTNISFLILCAIVNGIGLEASSGYGIAQKVISFVMLIPSALMQSMSA FVAQNVGAGKEERARKAMLTGMGLGVCVGVLIFGVTFFRGDLPASLFTANEAYIMRAAEY LKGFSFEAVLTCIVFSYIGYFNGHGMSLPVMLQGITASFLVRVPLSYVFSLKEGASLFDI GLAVPLASVYGIVFFTVWYVWYRKRAARKE >gi|229784033|gb|GG667702.1| GENE 10 11260 - 12084 688 274 aa, chain + ## HITS:1 COG:MJ1089 KEGG:ns NR:ns ## COG: MJ1089 COG0619 # Protein_GI_number: 15669277 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Methanococcus jannaschii # 24 247 9 230 268 109 28.0 6e-24 MPVNKLKEVNTIHRGHKHDAAYSIDFYACESRLRSWNAGLKMAVGMAALCLCIAADKLAV SVFVILTMACITVIAGGLAPVRYLKLLTIPLAFLFMGTVAIAVGISRMPYGDWYISAPWF SLYVTKNGLIQAVGLFTRALGAVSAMYMMTLATPASEMIGVMKRIHVPKMIVELMYLIYR 
FIFVLSDTYSQMKQAAGSRLGTCDFRTSCRTFGSTAGNLLVISLKKADTYYDSLLSRCYD GDLCFLEEEKRIKAGQAAAAVGYLLMTAAVWLVF >gi|229784033|gb|GG667702.1| GENE 11 12069 - 13007 373 312 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 51 294 137 381 398 148 36 3e-35 MAGVLTAAAGNRPVAMNCVFRQEYRENERKGGEYVEVNGGDVILETEHLCYSYDGKKIAL QDVNVSVRQRERIAVLGSNGAGKSTCFLNLNGVLTPDSGRILYRGTEITKKNRKELRKHV GIVFQEADNQIIASTVAAEVSFGPMNLKLPVMEVKRRVDEALECMNLTEMKDRPPHYLSG GEKKRVCIADIVAMESELFLFDEPTASLDPLNAAIFEAVLDKLWNMGKTLLISTHDVDFT WRWADRALVFSGGRLIADDAPCRIFQDESVIVQANLRKPVLMDVCSRLAAHGIVPDGTYV KDTEELEQLLCR >gi|229784033|gb|GG667702.1| GENE 12 14370 - 15119 953 249 aa, chain + ## HITS:1 COG:MTH130 KEGG:ns NR:ns ## COG: MTH130 COG0310 # Protein_GI_number: 15678158 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Methanothermobacter thermautotrophicus # 28 227 1 197 222 197 58.0 1e-50 MTKKERRIMTLAVALAFVFGVMPVSYAMHIMEGYLPFGYCLAWGAVCVPFLAAGFFSIRR TLGQNRKNITVLAMAGAFIFVISSLKIPSVTGSCSHMTGTGLGAILFGPASVSILGIIVL IFQAVLLAHGGLTTLGANTFSMAIAGPFVAFGIYKLCGKLKVNRYVSVFLAAALGDLFTY CVTSFQLALAYPAEPGGVALSAVKFLGVFAPTQLPLALIEGILTTVIIIALESYAGPELT SIGFAKPHH >gi|229784033|gb|GG667702.1| GENE 13 15216 - 15539 385 107 aa, chain + ## HITS:1 COG:alr3944 KEGG:ns NR:ns ## COG: alr3944 COG1930 # Protein_GI_number: 17231436 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, periplasmic component # Organism: Nostoc sp. 
PCC 7120 # 1 99 1 94 100 62 34.0 2e-10 MDKNKKIIVTVLLVIACLIAIVPLFALKGAEFGGSDDAGSQMVAEIRGEEYEPWFTPVME TLIQGELPGEVESLLFCVQTGIGVGIMAFVMGRFVERKKWMTKAENS >gi|229784033|gb|GG667702.1| GENE 14 15678 - 16601 689 307 aa, chain + ## HITS:1 COG:BH3634_1 KEGG:ns NR:ns ## COG: BH3634_1 COG2207 # Protein_GI_number: 15616196 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 23 125 4 105 132 89 41.0 1e-17 MGNQPEAVDQQEAGAANHKRNNIEEIAAVIDYIEEHLMEEKMDLESIASIFHYSRYHLHR MFASVAGFPLHTYLTRRRLTEAARLLVWTDRPVLEIALASGYETQRSFSRAFQKYFQHSP SRYRKKGDFFPLQLKYDVFRREKLRGDRIITVTMREAKELSVVGYCSSTKGGFAGIGRCW RRFHAKKDRISGCTNKEFLIGINDYSYYPCKGLDQVFRYIAGAEVMQEYTAPKGMQKFTL PAGRCVVFSFRGRNEDSLQPVVEYIYGEWFPDSTCRFDEDRRYDFARYGEQLDEHGESEI QLWVPVM >gi|229784033|gb|GG667702.1| GENE 15 16709 - 17287 560 192 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623164|ref|ZP_06116099.1| ## NR: gi|266623164|ref|ZP_06116099.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 192 1 192 192 268 100.0 2e-70 MKHLNSLAALALSVSMGISGCALAPASSASSETVSSTYEKNTENTEIDTIEETAETVPPD SQEHAPETPEESTSETETEAEAKTERERAAAFAQKIKKAVAAADMEELADLCGYPVCITL EYGDSRELETREDFTALGADTLFTQKVKDTIAAVQESELEFCGAGVMMGENGTIVFHDVN GTMMITGINLSD >gi|229784033|gb|GG667702.1| GENE 16 17528 - 18283 863 251 aa, chain + ## HITS:1 COG:CAC0222 KEGG:ns NR:ns ## COG: CAC0222 COG0708 # Protein_GI_number: 15893514 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Clostridium acetobutylicum # 3 251 2 250 250 410 77.0 1e-115 MKKLISWNVNGLRACVGKGFLDYFKEVDADIFCIQESKLSEGQIELELSGYHQYWNYAEK KGYSGTAMFTKEEPLSVAYGIGIEEHDHEGRVITAEFSDYYVVTCYTPNSKDGLARLPYR MVWEDAFLRYLKGLEEKKPVIFCGDLNVAHKEIDLKNPKTNRKNAGFTDEERGKFTDLLD AGFIDTFRWFYPDREEIYSWWSYRFSARSKNAGWRIDYFCVSEALKDRLVSADIHTEVLG SDHCPVELVIE Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:50:47 2011 Seq name: 
gi|229784032|gb|GG667703.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld96, whole genome shotgun sequence Length of sequence - 19641 bp Number of predicted genes - 34, with homology - 33 Number of transcription units - 19, operones - 9 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 - CDS 99 - 1244 630 ## COG0642 Signal transduction histidine kinase 2 1 Op 2 . - CDS 1234 - 1941 532 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 1968 - 2027 7.0 - Term 2286 - 2335 8.0 3 2 Tu 1 . - CDS 2449 - 2727 126 ## gi|288871012|ref|ZP_06409988.1| transcriptional regulator, MerR family - Prom 2753 - 2812 2.0 4 3 Op 1 . - CDS 2850 - 3266 383 ## Clole_2808 pyridoxamine 5'-phosphate oxidase-related FMN-binding protein 5 3 Op 2 . - CDS 3337 - 3741 122 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 6 3 Op 3 . - CDS 3778 - 4062 85 ## Fphi_1280 hypothetical protein - Prom 4266 - 4325 4.7 - Term 4160 - 4207 0.8 7 4 Op 1 8/0.000 - CDS 4374 - 4523 138 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 8 4 Op 2 . - CDS 4544 - 4846 318 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 9 4 Op 3 . - CDS 4890 - 5222 151 ## mru_0642 hypothetical protein - Prom 5245 - 5304 3.2 10 5 Op 1 . - CDS 5323 - 6609 813 ## Clole_2898 hypothetical protein 11 5 Op 2 . - CDS 6638 - 7810 463 ## Clole_2898 hypothetical protein 12 5 Op 3 . - CDS 7758 - 7922 138 ## - Prom 8007 - 8066 4.2 13 6 Tu 1 . - CDS 8338 - 8508 157 ## gi|266623175|ref|ZP_06116110.1| inositol-1-monophosphatase - Prom 8685 - 8744 2.6 - Term 8945 - 8986 -1.0 14 7 Tu 1 . - CDS 9008 - 10255 560 ## Clole_2897 hypothetical protein - Prom 10359 - 10418 5.9 - Term 10335 - 10389 3.1 15 8 Tu 1 . - CDS 10611 - 10907 184 ## gi|266623177|ref|ZP_06116112.1| conserved hypothetical protein - Prom 10994 - 11053 7.7 16 9 Tu 1 . 
- CDS 11073 - 11369 169 ## CDR20291_3456 hypothetical protein - Prom 11547 - 11606 5.7 + Prom 11232 - 11291 4.1 17 10 Tu 1 . + CDS 11388 - 11552 69 ## gi|266623179|ref|ZP_06116114.1| conserved hypothetical protein 18 11 Op 1 . - CDS 11709 - 12125 405 ## Clole_2808 pyridoxamine 5'-phosphate oxidase-related FMN-binding protein 19 11 Op 2 . - CDS 12146 - 12799 282 ## SGGBAA2069_c10210 type 11 methyltransferase 20 11 Op 3 4/0.000 - CDS 12854 - 13084 209 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 21 11 Op 4 . - CDS 12978 - 13427 325 ## COG0346 Lactoylglutathione lyase and related lyases - Prom 13470 - 13529 2.2 22 12 Tu 1 . - CDS 13556 - 13690 70 ## gi|288871018|ref|ZP_06409993.1| conserved hypothetical protein - Prom 13907 - 13966 4.3 23 13 Op 1 4/0.000 - CDS 14098 - 14334 116 ## COG1475 Predicted transcriptional regulators 24 13 Op 2 . - CDS 14327 - 14674 314 ## COG1475 Predicted transcriptional regulators - Prom 14853 - 14912 7.4 + Prom 14966 - 15025 7.4 25 14 Op 1 . + CDS 15045 - 15386 193 ## COG1737 Transcriptional regulators 26 14 Op 2 . + CDS 15449 - 15757 134 ## COG1737 Transcriptional regulators + Term 15766 - 15800 2.1 + Prom 15768 - 15827 6.5 27 15 Tu 1 . + CDS 15878 - 16183 181 ## CKR_1771 hypothetical protein + Term 16337 - 16375 6.3 + Prom 16433 - 16492 3.5 28 16 Tu 1 . + CDS 16663 - 16803 183 ## EUBREC_3592 hypothetical protein + Prom 16839 - 16898 6.6 29 17 Op 1 . + CDS 16936 - 17175 248 ## COG2002 Regulators of stationary/sporulation gene expression 30 17 Op 2 . + CDS 17165 - 17782 124 ## CLK_0622 hypothetical protein 31 18 Op 1 . + CDS 17931 - 18374 194 ## gi|266623190|ref|ZP_06116125.1| sigma-70, region 4 subfamily 32 18 Op 2 . + CDS 18346 - 18558 311 ## LAC30SC_09595 hypothetical protein 33 18 Op 3 . + CDS 18634 - 19164 273 ## PROTEIN SUPPORTED gi|50365462|ref|YP_053887.1| acetyltransferase of 30S ribosomal protein L7 + Term 19244 - 19284 4.2 - Term 19234 - 19270 1.5 34 19 Tu 1 . 
- CDS 19403 - 19639 119 ## gi|266623194|ref|ZP_06116129.1| conserved hypothetical protein Predicted protein(s) >gi|229784032|gb|GG667703.1| GENE 1 99 - 1244 630 381 aa, chain - ## HITS:1 COG:BH1809 KEGG:ns NR:ns ## COG: BH1809 COG0642 # Protein_GI_number: 15614372 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 73 370 46 346 351 175 36.0 2e-43 MKFKKAKYNYSKLGRITFIRGLRITVGAVAAVFGLRMLTRGHLADAIVRWFARSLSISEM EADLLYFKIVTVNMQFILALVIIIFVFLLFHMLLDTYKRYFDEVVTGIDKLMEERAAISL SPELESVEHKLADVKRTLAERADKAKRAEQQKNDLVVYLAHDIKTPLTSVIGYLSLLDET PDMPDEEKAKYIHTAWEKANRLRILVNEFFEITRSHSESLPLHKAKIDLFYMIAQISDEL YPQLNACGKIIENHVDEDISVYGDSDKLARVFNNILKNAISYSEENSVIKVSARELSEWT VIRFENDGEIPEDKLNVIFDKFCRLSNARLSETGGSGLGLAIAKNIIVLHGGQIRAESSN GRTAFIIEIPSASSKISVTAS >gi|229784032|gb|GG667703.1| GENE 2 1234 - 1941 532 235 aa, chain - ## HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 5 232 6 229 233 205 49.0 5e-53 MLEKILIVDDEKEIADLIEVYLQNENMDVYKFYSGEEALVCIGSTDFDLAILDIMLPDIS GLSICQSIRSKEYTYPVIMLTAKDGETDKITGLTLGADDYITKPFLPLELVARVKAQLRR YKKYNVGAVVPAANEIFKYAGLTMNIKTYECRLNGEALSLTPTEFSILRILLERRCSVVS AEELFHEVWQEEYYTKSNNTITVHIRHLREKMKDTGENPRYIKTVWGVGYKIGEI >gi|229784032|gb|GG667703.1| GENE 3 2449 - 2727 126 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871012|ref|ZP_06409988.1| ## NR: gi|288871012|ref|ZP_06409988.1| transcriptional regulator, MerR family [Clostridium hathewayi DSM 13479] transcriptional regulator, MerR family [Clostridium hathewayi DSM 13479] # 26 92 1 67 67 123 100.0 6e-27 MFLNKNCPGCGGGEGNQSCKIARCSMEHNGAQYCFQCSEYSCEKYDHIDDMLMYLGELYQ TDERFVEYINRFATWNVAEFFCEAIKVYCCNR >gi|229784032|gb|GG667703.1| GENE 4 2850 - 3266 383 138 aa, chain - ## HITS:1 COG:no KEGG:Clole_2808 
NR:ns ## KEGG: Clole_2808 # Name: not_defined # Def: pyridoxamine 5'-phosphate oxidase-related FMN-binding protein # Organism: C.lentocellum # Pathway: not_defined # 6 129 5 138 163 97 45.0 2e-19 MKEFSQEAEQVMIERFGRDTIISLATTENATPYVRYVNAYYENGAFYIITHALSNKMKHI KDNPVVAIAGEWFTAHGNGVSLGYFGKKENCEIADKLKKVFAEWIDNGHTDFNDENTIIL QVELSDGLLLSHGTRYEF >gi|229784032|gb|GG667703.1| GENE 5 3337 - 3741 122 134 aa, chain - ## HITS:1 COG:CAP0110 KEGG:ns NR:ns ## COG: CAP0110 COG0454 # Protein_GI_number: 15004813 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 133 1 133 137 145 52.0 2e-35 MKIEYKDTHIFTEKELQRLFLSVGWESGNYPDKLVGAMLNSSHVISAWDDNKLIGLVRAL DDGETVAFLHYLLVDPAYQGHHIGDELMKKIMSCFENLLYVKIMPSDPKMIAFYERYGFQ QYDNYSAMVLKHFR >gi|229784032|gb|GG667703.1| GENE 6 3778 - 4062 85 94 aa, chain - ## HITS:1 COG:no KEGG:Fphi_1280 NR:ns ## KEGG: Fphi_1280 # Name: not_defined # Def: hypothetical protein # Organism: F.philomiragia # Pathway: not_defined # 8 88 61 143 144 68 42.0 9e-11 MLMAQVDYVMKQITEYNAYGFEIIGIIGANRSPNCGVETTSDFNKEISGKGIFMSELSNR IKAENLNIPMLGIKDTDNVFEQVKSFIKSNLVKE >gi|229784032|gb|GG667703.1| GENE 7 4374 - 4523 138 49 aa, chain - ## HITS:1 COG:RSc2565 KEGG:ns NR:ns ## COG: RSc2565 COG0454 # Protein_GI_number: 17547284 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Ralstonia solanacearum # 2 48 119 165 177 61 53.0 4e-10 MERGCGRLEWWCLDWNKPSIDFYLSLGAEAMSDWAVYRIAGDTLTQLAE >gi|229784032|gb|GG667703.1| GENE 8 4544 - 4846 318 100 aa, chain - ## HITS:1 COG:lin2807 KEGG:ns NR:ns ## COG: lin2807 COG0454 # Protein_GI_number: 16801868 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Listeria innocua # 3 100 4 101 159 105 56.0 2e-23 
MNVTFRNAERKDTLLILQFIKELADNEKMLNEVVADETTLETWIFDKQKAEVIFALEDGK EVGFALFFHNFSTFLGRAGIYLEDLYVKPECRGKGYGKLF >gi|229784032|gb|GG667703.1| GENE 9 4890 - 5222 151 110 aa, chain - ## HITS:1 COG:no KEGG:mru_0642 NR:ns ## KEGG: mru_0642 # Name: not_defined # Def: hypothetical protein # Organism: M.ruminantium # Pathway: not_defined # 1 109 1 109 112 118 55.0 1e-25 MRNMIAYCGLDCEKCDAYLATINDDQALRERTAKRWAELNNAPILPEHINCEGCRVDGIK TIFCDSLCKIRQCALKKDVTICGNCPDMERCQTVGEIISNGPEALDNLKG >gi|229784032|gb|GG667703.1| GENE 10 5323 - 6609 813 428 aa, chain - ## HITS:1 COG:no KEGG:Clole_2898 NR:ns ## KEGG: Clole_2898 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 23 420 20 407 409 224 32.0 8e-57 MVLKKKSTLLIMAFLIGLFVYACGKEKSITGESLLKNIEEVYSTEETKSAEEELAEQWVK GYGLPVDEQEEKEAENDCESMMKLLFEIYNGADKETSSNVILNDETILEMQKKLMETECP IATVMTYSNMENYERVDSFLEDCMAGNGGSVVMYKIHDDGGLGRMKFIFEGTDMYVVSAR SVWNEEGKSVMSYISYTRIKEWKYTDKGWFCYELCVPEPPEVSEIMDGSFLIRIKPMTEE QHEMSERCVRRLGYQGNNILCSNWDSNHMEELDYNGMYEYLYVMKYQKPFEPEDYQEGIP KAEFESLIMEYLPVTAEQIREYAVFDEKKQNYAWERIGCFSYAPTFFGTSLPEVMDINEN EDGTITLTVDAVCDMVICDDAVITHELTVRFADDGSFQYLRNNILNDGIQDIPSYQYRIR KEQLRVKK >gi|229784032|gb|GG667703.1| GENE 11 6638 - 7810 463 390 aa, chain - ## HITS:1 COG:no KEGG:Clole_2898 NR:ns ## KEGG: Clole_2898 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 48 390 57 405 409 149 28.0 3e-34 MEEMKVAEEVWEPPVEIENAETVPEEQNVAKKSTESVYWFIPQASDELLSEEEKEQLQSK VLSAAESVKEIYKDMIIADAADYSSGVGGFSNEQRKAVVEQLGKVGLVSVEENSNMQNPE GIKTFYADYLNGRDSMVTVFEVQRDGLIGAVTFIYRKEQLQSYYIGVRWKEGGTPEIQGT SVSNVAEIKLTDKGYFIYAYDYVIAHASLREYWRIEPLPEACRELTKKYISGLSYVDYNM LVINWDSSNVEDILMPCMYEDIYRIYTGENLRTKKWKIPAEEYERIMTTYFPVSIDQLRV HCGYDEGSNSYEYEMIYARPYPPFGEVVDYMENADGTITLIVDGVWPDYNSDLAFRNTVI VRPFEDGTFRYLSNSIEQIELELPPIARSQ >gi|229784032|gb|GG667703.1| GENE 12 7758 - 7922 138 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGVNNGRCKMAIEEMGIPDFAITIFSNSRSNLLGKNAYGRDEGGRRSMGASGGN 
>gi|229784032|gb|GG667703.1| GENE 13 8338 - 8508 157 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623175|ref|ZP_06116110.1| ## NR: gi|266623175|ref|ZP_06116110.1| inositol-1-monophosphatase [Clostridium hathewayi DSM 13479] inositol-1-monophosphatase [Clostridium hathewayi DSM 13479] # 1 56 40 95 95 111 100.0 1e-23 MIGLETAKRELADENFARFRRGGSQCQDVRRIVSAALELDTACGRQGSYFEIHLNP >gi|229784032|gb|GG667703.1| GENE 14 9008 - 10255 560 415 aa, chain - ## HITS:1 COG:no KEGG:Clole_2897 NR:ns ## KEGG: Clole_2897 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 15 386 9 375 401 172 31.0 2e-41 MKKVEKQRLFIAAGILILTVTGCGRKTQENHDILQQTEVREEHLEREQPKENLDYDAEYI AALYRDIYNQAAEMGTLSNLEVVRSIVERIGEHGYAVVDEGNQIDMVGPEQVLRFCRSVD GKQEAEQILIVVTSFGGFIKYDLHTVDGTVDVIRGYYKYVNGRLENMDTVCYPVDSWVYT KEGYLLFEGSHYSEIYYALSLSDEPEYTALRVLPLDETCRKLNRDYLLPVGYGGNNLFLV NWSEQDFENLDFYDLFDRFYPDLYELPVPFETDDDSGVGAVYRISADIFEHVIDVHFSID HKELREKATYIPENHTYEYRPRGFYEVEYPDIPYPEVVSYEEKNDGTITLTVNAIYPEKN TSRAFTHKTVVRPLDDGGFQYVSNQIIFPEIGFEPWWHSERLSEDQWKKVYEENN >gi|229784032|gb|GG667703.1| GENE 15 10611 - 10907 184 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623177|ref|ZP_06116112.1| ## NR: gi|266623177|ref|ZP_06116112.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 98 1 98 98 200 100.0 3e-50 MSGNQKTIQPLSAILEGVPDSEYIQSELYLTESVQMFLRAGYVNDYGENAGVCRNLEDYF TVGNRGGVKLVIYIARVMALTFENPQQRTWEVIFTPCS >gi|229784032|gb|GG667703.1| GENE 16 11073 - 11369 169 98 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3456 NR:ns ## KEGG: CDR20291_3456 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 97 84 180 180 95 52.0 7e-19 MEKAKRLGIDAYNAEQAEKANILEVLLSNYNDGRKKTLFCVAVNLLELQDLQTVLKEIDC KPDMETLTFKEKSAFVAGLLQDAAVMKNIDLNLRKKKG >gi|229784032|gb|GG667703.1| GENE 17 11388 - 11552 69 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623179|ref|ZP_06116114.1| ## NR: 
gi|266623179|ref|ZP_06116114.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 54 1 54 54 70 100.0 4e-11 MCYKGIKVINMIVFFARVFAALETVLGAIALHAAPCDFAGLVPLTAAAAGAVFI >gi|229784032|gb|GG667703.1| GENE 18 11709 - 12125 405 138 aa, chain - ## HITS:1 COG:no KEGG:Clole_2808 NR:ns ## KEGG: Clole_2808 # Name: not_defined # Def: pyridoxamine 5'-phosphate oxidase-related FMN-binding protein # Organism: C.lentocellum # Pathway: not_defined # 17 129 18 138 163 92 45.0 4e-18 MKGFSGEAELVMIDRFGKDTVIALATTKNEMPHVRYVNAYYENGAFYIITHALSNKMKHI KDNPVVAVAGVWFTAHGKGVNLGYFGKKENCEIADKLKKVFAEWIDNGHTDFNDENTIIL QVELTDGLLLSRGTRYEF >gi|229784032|gb|GG667703.1| GENE 19 12146 - 12799 282 217 aa, chain - ## HITS:1 COG:no KEGG:SGGBAA2069_c10210 NR:ns ## KEGG: SGGBAA2069_c10210 # Name: not_defined # Def: type 11 methyltransferase # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 1 205 2 206 207 233 58.0 5e-60 MKNPWEEIPLIDYENHMSLDSIMQLQAMSEMMKTQVAAYSVSRIMVLGVAGGNGLEHIQK DKFERIYGVDINSSYLQTVIQRYPELDGSLKCLCINLIDETDKLPKADMVIANLLIEYIG YMCFQKAIQQVNPKYVSCIIQINIGDNWVSDSPYIHVFDDLEQVHHQVEEHTLEDAMLEI GYHSIKTLEHMLPNEKKLVQIDFERQQPICLNRLEKV >gi|229784032|gb|GG667703.1| GENE 20 12854 - 13084 209 76 aa, chain - ## HITS:1 COG:SMc00906 KEGG:ns NR:ns ## COG: SMc00906 COG0454 # Protein_GI_number: 15964552 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Sinorhizobium meliloti # 1 75 83 157 159 93 60.0 9e-20 MEDLYVKPECRGKGYGKAILKKLASIAVERGCGRLEWWCLDWNKPSIDFYLSLGAEAMSD WTVYRIAGDTLTQLAE >gi|229784032|gb|GG667703.1| GENE 21 12978 - 13427 325 149 aa, chain - ## HITS:1 COG:MA0108 KEGG:ns NR:ns ## COG: MA0108 COG0346 # Protein_GI_number: 20089007 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Methanosarcina acetivorans str.C2A # 2 101 16 117 163 86 43.0 2e-17 
MLISVANINAARKFYEDLFGLEVFQDYGRNIAFTCGLALQQDFDWFMNLPKEKVLKKSNN TEIVFEEQDFDTFLNKLKQYPDIEYLGEVIEHSWGQRVIRFPQFFYVLRTSRYLLGGSVC QAGMPRERLRKSYSEKTGIYRCGTWLRTA >gi|229784032|gb|GG667703.1| GENE 22 13556 - 13690 70 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871018|ref|ZP_06409993.1| ## NR: gi|288871018|ref|ZP_06409993.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 44 1 44 44 80 100.0 4e-14 MLLSGYNAGRKKTLFCIAVNILDLQKLRAVLREIEGRPDLEGLN >gi|229784032|gb|GG667703.1| GENE 23 14098 - 14334 116 78 aa, chain - ## HITS:1 COG:SPy0190 KEGG:ns NR:ns ## COG: SPy0190 COG1475 # Protein_GI_number: 15674395 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 55 118 172 176 87 69.0 5e-18 MSNIIKELYELGRSDAWISKHLGMDRDEILRLRQITGLAALFRDVKFGQAWRPAEDVVPD EESSDSGKTHTICTLGND >gi|229784032|gb|GG667703.1| GENE 24 14327 - 14674 314 115 aa, chain - ## HITS:1 COG:SPy0190 KEGG:ns NR:ns ## COG: SPy0190 COG1475 # Protein_GI_number: 15674395 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 6 114 4 112 176 166 73.0 8e-42 MANKEFRSPVYNIIAVPFEKIVPNTYNPNNVAPPEMRLLYDSIKEDGYTMPIVCYHVKSN DKYVIVDGFHRWRVMRDYPDIYEREKGMMPVSIIDKPLSGRMASTIRHNRARGNE >gi|229784032|gb|GG667703.1| GENE 25 15045 - 15386 193 113 aa, chain + ## HITS:1 COG:YPO1253 KEGG:ns NR:ns ## COG: YPO1253 COG1737 # Protein_GI_number: 16121540 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 1 63 1 63 246 72 53.0 1e-13 MFSYDEIQKYNETEVMIYKYVISNIEKIPYMTIRELANEMHISTSTLLRFCSKNGYDGYS ELKKAVKAETYVLKMQPPLEDLQELSLFLKEQIQVLLKEKFHFLLKRLKILTI >gi|229784032|gb|GG667703.1| GENE 26 15449 - 15757 134 102 aa, chain + ## HITS:1 COG:YPO1253 KEGG:ns NR:ns ## COG: YPO1253 COG1737 # Protein_GI_number: 16121540 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: 
Yersinia pestis # 1 102 136 237 246 91 40.0 4e-19 MGKFSVGLEDTFYPIATYSYKNTVVIALSESGETKELIDMVRQFQQKKCVVLSITNAATS TLAKISDWSFTYNMKEHRVNGNYNATTQVPTLFVLEALARRI >gi|229784032|gb|GG667703.1| GENE 27 15878 - 16183 181 101 aa, chain + ## HITS:1 COG:no KEGG:CKR_1771 NR:ns ## KEGG: CKR_1771 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 101 23 123 123 162 81.0 4e-39 MKVNAAFIFIAPEVDYKNHRSVIDTSTVKLTVIGVKNYSEAETVAQELVEQGVKAIELCA GFGNEGIARICKATKGMASVGAVKFDYHPGFDFKSGDELFQ >gi|229784032|gb|GG667703.1| GENE 28 16663 - 16803 183 46 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3592 NR:ns ## KEGG: EUBREC_3592 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 46 3 48 48 70 76.0 3e-11 MEFFNSAVDVLQTLVIALGAGLGIWGVINLMEGYGNDNPGANAHVP >gi|229784032|gb|GG667703.1| GENE 29 16936 - 17175 248 79 aa, chain + ## HITS:1 COG:CAP0091 KEGG:ns NR:ns ## COG: CAP0091 COG2002 # Protein_GI_number: 15004795 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Clostridium acetobutylicum # 1 55 3 56 76 65 63.0 2e-11 MSMPKGKHIFGTVKVGERGQIVIPKEAREIFDINPGDTLLVLGDEEQGIAVVKADVMKEV AVKILKGLGHFKGGYDDEK >gi|229784032|gb|GG667703.1| GENE 30 17165 - 17782 124 205 aa, chain + ## HITS:1 COG:no KEGG:CLK_0622 NR:ns ## KEGG: CLK_0622 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 1 204 1 209 212 152 43.0 6e-36 MKNSFSNILPFLLIADLIIPFVLAPTYKGYSHLTQVMSVLGNSKAPLHLIYNIWLVAFGV SILISTLQLYPTVAQVSNSISIMLFSVIVIYALGGCILSGIFSVGETKSLETLSAKIHGY GSVIGFLLLTLAPLFVGLYFFKVSNGLLGILSLICFIFAIGFFALFVMADKPNYKGTIIA FEGLWQRLSLLSMYLPIAALCIFNK >gi|229784032|gb|GG667703.1| GENE 31 17931 - 18374 194 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623190|ref|ZP_06116125.1| ## NR: gi|266623190|ref|ZP_06116125.1| sigma-70, region 4 subfamily [Clostridium hathewayi DSM 13479] sigma-70, region 4 subfamily [Clostridium hathewayi DSM 13479] # 1 147 
1 147 147 256 100.0 3e-67 MTEDEAYKVHIQYTFNAFCKIVIRHAAIDKILKLRRRWEREVSLDYLMNEKFVQLAEPEQ LEEYLFTACGQTAVLYHAELAAALLSLPQKNREIIFLYFFGEYTQQEIGKMYGRCRSTTG YQIRRTLKLLQRKMEVLSHGESEPFAL >gi|229784032|gb|GG667703.1| GENE 32 18346 - 18558 311 70 aa, chain + ## HITS:1 COG:no KEGG:LAC30SC_09595 NR:ns ## KEGG: LAC30SC_09595 # Name: not_defined # Def: hypothetical protein # Organism: L.acidophilus_30SC # Pathway: not_defined # 1 66 1 66 67 72 50.0 5e-12 MGNPNLLPYETIVRATSGEPEAVDEVLRHYSKRIRIAAIENGHIDRDTEDSIKQRLVASL FKFRFDEQEV >gi|229784032|gb|GG667703.1| GENE 33 18634 - 19164 273 176 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|50365462|ref|YP_053887.1| acetyltransferase of 30S ribosomal protein L7 [Mesoplasma florum L1] # 3 174 2 169 170 109 35 1e-23 MCKLLLVKPSKQYAEQVMSYKEEMLENGDSFDGCAGLEDVHSFGEWIDFETRLKAKYKDG YVPSEVFLAIRQRDNHIVGMIDFRHPLSDFLFNFGGNIGYSVRPSERQKGYATEMLRLIL PICRKFGENRVLLTCDKNNKASQKTIIKNGGVLEKEIIDTVGLSESGVIQRYWIST >gi|229784032|gb|GG667703.1| GENE 34 19403 - 19639 119 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623194|ref|ZP_06116129.1| ## NR: gi|266623194|ref|ZP_06116129.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 78 1 78 78 147 100.0 2e-34 KFVHYNGQIQTLSVPTLEPKGFEFSDTLDGRGGDSAKAAETFLPLITYKPTKSKMDVDLR KFPLITYKTTQVGIGGNL Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:52:21 2011 Seq name: gi|229784031|gb|GG667704.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld97, whole genome shotgun sequence Length of sequence - 20660 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 8, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 185 - 237 17.1 1 1 Op 1 . - CDS 261 - 5702 3477 ## SGGBAA2069_c19980 LPXTG-motif cell wall anchor domain-containing protein - Prom 5744 - 5803 5.1 2 1 Op 2 . 
- CDS 5850 - 7535 1142 ## Closa_3324 LPXTG-motif cell wall anchor domain protein - Prom 7591 - 7650 15.8 3 2 Op 1 . - CDS 8552 - 8731 221 ## Closa_3324 LPXTG-motif cell wall anchor domain protein 4 2 Op 2 . - CDS 8747 - 9637 481 ## Closa_0110 hypothetical protein - Prom 9737 - 9796 8.9 + Prom 10286 - 10345 6.7 5 3 Op 1 . + CDS 10417 - 11415 660 ## COG2207 AraC-type DNA-binding domain-containing proteins 6 3 Op 2 . + CDS 11447 - 11890 239 ## Fisuc_0607 hypothetical protein 7 3 Op 3 . + CDS 11924 - 12463 333 ## COG3797 Uncharacterized protein conserved in bacteria + Term 12651 - 12701 -0.9 - Term 12517 - 12566 13.9 8 4 Tu 1 . - CDS 12605 - 14059 1663 ## COG0516 IMP dehydrogenase/GMP reductase - Prom 14119 - 14178 8.0 - Term 14210 - 14258 9.1 9 5 Tu 1 . - CDS 14326 - 14835 665 ## Closa_1008 hypothetical protein - Term 14911 - 14954 9.2 10 6 Op 1 41/0.000 - CDS 14990 - 16609 1566 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 11 6 Op 2 . - CDS 16702 - 16992 368 ## COG0234 Co-chaperonin GroES (HSP10) - Prom 17065 - 17124 9.1 + Prom 17178 - 17237 7.4 12 7 Tu 1 . + CDS 17258 - 17866 634 ## Closa_1005 hypothetical protein - Term 17892 - 17935 5.8 13 8 Op 1 . - CDS 17961 - 19490 1563 ## Sterm_1722 hypothetical protein 14 8 Op 2 . 
- CDS 19504 - 20658 1137 ## COG3333 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|229784031|gb|GG667704.1| GENE 1 261 - 5702 3477 1813 aa, chain - ## HITS:1 COG:no KEGG:SGGBAA2069_c19980 NR:ns ## KEGG: SGGBAA2069_c19980 # Name: FN1 # Def: LPXTG-motif cell wall anchor domain-containing protein # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 13 557 538 1083 1666 132 27.0 2e-28 MYGESLTVAKSTTKAGYLFDGWYTDKNFSQGTRVSGSIVLEDNTILYAKWNPGTVGYKVV YLIENADDNNYSLLCTDAKTGKTGDNVQITAKTANPTNWGELEKNAFTFKDSTSEVLKAD GTSIVTVRYSRNTYTIRFQDKSPTLCRMESHTHTLSNCYKRTCNKLHKHNESCYDMSYTI CGLTEHQHDSTCYTRDISAKYQANIREQWLENYGGHNWLWSGSSYTSLQDIMPYVSGNGI KVLNKHNDGPTPRPLEYYVEDPNGTIKDPKGGSTKFSLYTKITIYLEKGSTPSWDEEYYE IDGYERYYTNIDFSRPSFKNPSTFYYTRAKYDLTLINGTNETTKKDIPYKSDISKYLGAP TVKPTSDATFDAWYVDSSFTEKYTSGTMPKGNLVLYAHWNNAAYDVTFKDGQEILDTVEV NSGELVTPISTPEKAGYSFTGWYIDEACTQRYDFQRKVESDFDLYAGWSMHSTTSYQILY VTSEGDPVADSVTGNGVIGDTIQANAVVPTIAGYEDYIVDKATASISLDADPNKNVIRFI YSSPESLRYTVQYWYEDKIVAEEANIASKLSSFRCYPNTDLEQLTGYKVKEAYKNVKLTT GENIIKFTLTLNEYTITYENVDYNDIVWENGNTNPTKYNIHTNSFTLKNPGRVGYTFMGW KLVKGSVESGEYDSKNVVIETGSRGNLTFKAEWAVDVNQKLNYSIKYYKSEQPDKEIGNL NKNVLVASPTVKYADVDKTYNKPSGYILDKVEFGGQTVSDGEMTITDTDNIIKVYYKVDE TQKLKYSVEYYIDGKDIPFDKLDNQEVLVADPVVSEVADSSKVPAGYTRDSVLPELPATI SETANVIKVYYRVDETQKLKYSVEYYINGNDTAFDKLENQEVLAADPVVKTVADSSKVPA GYRRDSVFPELPVTITESSNIIKVYYRIDTTQKLQFDVEYYLVGEMEAFDAVRNQTVLVA NPVINAVADSSKVPAGYERASVMPQLPTIISKTNNVIRVYYKVDETQKLKYSIEYYVTGN ETAFDKLENQEVLVANPVVSTVADSSKVPAGYTRDRVLPELPTTISGSNNVIRVYYKVDE TQKLKYSVEYYLNGNDTAFDKLENQEVLVADPVVKAVADSSKVPVGYTRDSVLPELPATI SETANVIKVYYRVDETQKLKYSIEYYVNGNDAAFDKLENQEVLVADPVVASVADSNKVPA GYTRDHVMPELPTTITESNHVIRVYYRVDETQKLRYDVEYYVDGNDTAFDKLENQEVLVA NPVVNAVVDSSKVPAGYTRNSVLPELPTMISESNNIIKVYYSVDEMQKLKYNVEYYINGS ATAFDKLENQEVLVADPYVLFVKDSALVSAGYVRDRVEPLLPAKIVHDGDIINVYYRART DLYSEINYYYDDTKTDTQKDNNAVFDTGILDGVKVEMEVTHTDGKHYVLDRIEGKDKKVT VNPTENIVNIYYALDEKGTGTDPNKPDNIPDKYQITFTYVAGENGNVDGQKEEVITRELD 
ADGSFSSINPAYPGALVKANANNGYSFQNWTSDEVGMMAGKPVSTFADEAAIREAGFITD TVFTAHFNAKEDTSYRIEYYYEVQGQYPATTNNYEVRTGKTDTTVSVTDNDKIKAGYKYD TEAANVVSGVVKGDNSLVLRLYFKQQFTVTYQSGNHDFFEEQVIGENSYGDKTPVFTGEM NIRGNYVFNGWQPVIAEKVTENAVYVAQWRYTGSSGGDTGGGGNTPSDNKPYVPNGPGDT VTIEPGDVPLANVPENGPVDNLMLIDDGNVPLAGLPKTGDRFGTNTGLAALLTGFLLAAF TALNNKRREEENK >gi|229784031|gb|GG667704.1| GENE 2 5850 - 7535 1142 561 aa, chain - ## HITS:1 COG:no KEGG:Closa_3324 NR:ns ## KEGG: Closa_3324 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 336 101 398 2050 116 33.0 2e-24 MSDDEDADGADLRVFVRIKDDYEGYRLTGKEEIIFLYINDSEGKISFRTIVDGYATERIT VKGNSKLAEAEVPGNVNGTLEGEVTTPEENGTPEGEVTTPEESGTPESEVTTPEENGTPE SEVTAPEESGTPESEVTTPEENGTPEGEMATPEESGTPEGEVTTPEESGTSEGEVTTPEE NRTPEGEVTTPEESGTPEGEVTTPEENRTPEEEVTTPEENGTLEGESDNEDAAMSEDNTV VSISRNSSAVLMSADSTVTVLESANEKNDKITLATASNSENSNSGKSDPKESSSVANLVN GKVYGNVILDESSYARAYVITLDALNIEQPLNEYQIKVTHRLHGDNWIDYEKIETVTITD NDFDNGEYDTAKLEFKREGIVLTSQSLVLTKDAFKSGTEAEITLEYAVADGYRIYDNRAN LRSIFVGELDDDDPVIKPVGQAVVNIKYVYENGTIAKDTESVAALYDENTKKYKLTYELG DLDGYKPVLDTMDSASGITLDGNILTANLSEDKEYLAVIVYMADTVEYVVRHNYQKLDGS YEVKESETKSGKAGELSAAVA >gi|229784031|gb|GG667704.1| GENE 3 8552 - 8731 221 59 aa, chain - ## HITS:1 COG:no KEGG:Closa_3324 NR:ns ## KEGG: Closa_3324 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 58 1 58 2050 87 68.0 2e-16 MKGLRRKFRRGLAGFLAFVLTMTSFNMVSWADVASAFETEKATFVMSGEEIRESAQAAS >gi|229784031|gb|GG667704.1| GENE 4 8747 - 9637 481 296 aa, chain - ## HITS:1 COG:no KEGG:Closa_0110 NR:ns ## KEGG: Closa_0110 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 271 1 273 294 192 40.0 2e-47 MSIAKMSMLDNSYRYFTVTVDRYERQCMEGIISHAGETPGIRYGNFLELVLQMDQIFNRM SCPKQTMEFRRFTGTDYPAPKVKECRKVCNGKLATFQIYVKFRYDASWQGDITWLEEERT EHFESILQMIQLVDEILTGRYRIKDLGKGARTCQVAVNSYDSGLLEGSVQNAFINHLEEF 
KGTIGLADAMVHLFEVGIGRSASGTAPGECKIISEETWDTYRKGGRKATFVIKILYREHS TWQGVICWRETGEKQAFRSFLEMIILMASALEIGSGDNECEDRYYTDNRQKTIKEG >gi|229784031|gb|GG667704.1| GENE 5 10417 - 11415 660 332 aa, chain + ## HITS:1 COG:ECs5044 KEGG:ns NR:ns ## COG: ECs5044 COG2207 # Protein_GI_number: 15834298 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 9 113 4 104 107 78 37.0 2e-14 MNKRMYEWQKQIQVIVDEIDKCIKSYNDETLTLRFLSRKLGYSEFYTTRKFKEISGMQFR DYLRQRKLAFALKEVRDYDKSLLDIAFDYGFSSHEAFTRAFKGAYGVTPSEYRKNPTPVA LRTKITPFDRYFLGLGEIGMIQSSDDIKIYSVTIPAHKFLHIRNNESNGYWDFWRKQNLI PGQDYETICGLLDSIKGKLDDEGGSEANCAGGQIMAYMNDPNGRLCDWGILRTECWGVRL PSDYHGNVPPQMVMADIPEAEYIVFEHGPFDYEQENCSVEEKIEMAMADFDYTKSGYCLD TSPGRIMYFYYDPEQYFKYIRPVLRESSASRY >gi|229784031|gb|GG667704.1| GENE 6 11447 - 11890 239 147 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_0607 NR:ns ## KEGG: Fisuc_0607 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 51 143 102 194 195 77 44.0 2e-13 MAASICGIDCTKCGLSSTCNGCMETNGRPFGAECMAALCLEKGEKELCESKRNLIAAFRA LNIEDMEEITELHALKGSLINLEYTLPNEQIIRFWNDDKIYLGNQLSKKGSDRYYGMIAD ENYLMVSEYSGSGSDAELVAFMRWRQS >gi|229784031|gb|GG667704.1| GENE 7 11924 - 12463 333 179 aa, chain + ## HITS:1 COG:SP0830 KEGG:ns NR:ns ## COG: SP0830 COG3797 # Protein_GI_number: 15900717 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 1 179 1 177 180 74 28.0 1e-13 MNRYIALLRGINRSGKNKISMSELKKGFAELGFTEITTYLNSGNVIFSSTVQDIKILTNK IQFMITDIFDMAVPVFIILQDELKELLNHAPDWWGTDHKERYDNLIFLMPPLSYEAFYDE IGDPETGYENVYHYNNVIFWSFSRNDYQKTNWWSKTAISGVSGRITIRTANTVRKIVEL >gi|229784031|gb|GG667704.1| GENE 8 12605 - 14059 1663 484 aa, chain - ## HITS:1 COG:CAC2701_3 KEGG:ns NR:ns ## COG: CAC2701_3 COG0516 # Protein_GI_number: 15895958 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Clostridium acetobutylicum # 205 
484 1 280 280 388 68.0 1e-107 MGTIIGDGITFDDVLLVPAYSEVIPNQVDLTTNLTKTIKLNIPLMSAGMDTVTEHRMAIA MARQGGIGIIHKNMSIEAQAEEVDKVKRSENGVITDPFYLSPEHTLKDADELMAKFRISG VPITEGRKLVGIITNRDLKFEEDFSRKIKECMTSRNLVTAREGITMTEAKKILAKARVEK LPIVDDDFNLKGLITIKDIEKQIKYPLSAKDSQGRLLCGAAIGITANVLERVAALVASKV DVIVLDSAHGHSANVLNCVRMIKEAYPDLPVIAGNIATGDAAKALIEAGADAIKVGIGPG SICTTRVVAGIGVPQITAVMDCYAVAKEYGIPVIADGGIKYSGDLTKAIAAGGNVCMMGS MFAGCDESPGTFELYQGRKYKVYRGMGSIAAMENGSKDRYFQSDAKKLVPEGVEGRVAYK GLVEDTVFQMLGGLRAGMGYCGASDIKTLQETAKFVKITAASLKESHPHDIHITKEAPNY SIDE >gi|229784031|gb|GG667704.1| GENE 9 14326 - 14835 665 169 aa, chain - ## HITS:1 COG:no KEGG:Closa_1008 NR:ns ## KEGG: Closa_1008 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 169 1 169 169 275 76.0 5e-73 MNEMYAEWLVKRKAPAYMMLVKVLMVLVCVVAFFIAVSPLLGIFGILVLFAAAAAAYFVF RNSEVEFEYLYVTNQLTIDRIYGKTKRKKAWEGSMDEIQIVAPTGSTEAKDHETNNMKVL DFSSHMPDAKTFTLISQSGSNSTKIIFEPNDKLLHCMKMTAPRKVVDRI >gi|229784031|gb|GG667704.1| GENE 10 14990 - 16609 1566 539 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 539 3 543 547 607 58 1e-173 MAKQIKYGVDARKALEAGVNQLADTVRVTLGPKGRNVVLDKSFGAPLITNDGVTIAKEIE LEDPYENMGAQLVKEVATKTNDVAGDGTTTATVLAQAMINEGMKNVAAGANPIVLRKGMR KATEAAVEAIAKMSEPIKGKASIARVAAISAADDAVGEMVADAMEKVSNNGVITIEESKT MKTELDLVEGMQFDRGYVSAYMCTDMEKMEANLDDPYILITDKKISNIQEILPLLEQVVQ SGARLLIIAEDIEGEALTTLIVNKLRGTFNVVAVKAPGYGDRRKEMLQDIAILTGGKVIS EELGYELKSTTMDMLGRAKSVKVQKENTIIVDGEGSKDEIEARIGQIKAQIEETTSDFDK EKLQERLAKLSGGVAVIRVGAATETEMKEAKLRMEDALSAARAAVEEGVIAGGGSAYIHA MKDVQKVIDGLEGEEKTGAKIILKALEAPLYHIVANAGLEGAVIISKVKEAAVGTGFDAY KEEYVDMIEAGILDPTKVTRSALQNANSVASTLLTTESVVANIKEKTPAAPAPGGMDMM >gi|229784031|gb|GG667704.1| GENE 11 16702 - 16992 368 96 aa, chain - ## HITS:1 COG:CAC2704 KEGG:ns NR:ns ## COG: CAC2704 COG0234 # Protein_GI_number: 15895961 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES 
(HSP10) # Organism: Clostridium acetobutylicum # 1 96 1 94 95 86 68.0 1e-17 MKLVPLFDRVVLKPLVAEETTKSGIVLPGQAKEKPQQAEVIAVGPGGLVDGKEVTMQVKV GDKVIFSKYSGTEVEGESEKDKLVIVKQNDILAVIE >gi|229784031|gb|GG667704.1| GENE 12 17258 - 17866 634 202 aa, chain + ## HITS:1 COG:no KEGG:Closa_1005 NR:ns ## KEGG: Closa_1005 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 200 1 214 214 154 63.0 3e-36 MAKKGNGFGKLVAFATIAGAVAAGISYFTKYKSFHKELEEDFHDFEDDGNDDISKIDSTM NRNYVSLHADKDEFKVAAADMADAAKDAASAAKNLVKDAANIVNSTAREAASAVADTAKD MLDTAMDRSEDDEEEDITEDGGLDYAAGAADSFKEEPEQAADTEDKEDSDEVEEIYEGTG DALKKFAAEESTTIVEDTVDEP >gi|229784031|gb|GG667704.1| GENE 13 17961 - 19490 1563 509 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1722 NR:ns ## KEGG: Sterm_1722 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 2 507 3 516 517 468 44.0 1e-130 MHYKKLWTDLHSNIHHNQMGELDQWIEHAKNTVDFWPVAYYPFYMKATKTGAGLEDLYDG KLIEEDWEQLREAVKRENAAGYPMFMGYEWQGSGKDGDHNVFFLDNEQAMHHPMRYEELY QAYKDTEAIAIPHHVAYQLGSRGKNWETHREEFSPFAEIYSSHGCSENDTGPFDMERHLH MGPRTGETCYERGLEKGYLVGAIASGDNHNVPAVHDHGTMCVLAADRTKEAIWEGLKNRR VYGVSRSRMDIDFTINGSPMGAVVKPGEGTLAFTVRAADAIDRVEILKDNVLDTMAVHAG TWEKAEYTGVVKVKFAVEFGWGPNPKFYKDRLVKTWDGSLEVPGRLCSVQKCFNSFGQEL LNVTEKSCEFHMTTYMSTTTGHWMGPSTVVKEGFVFEVEGNTEDEIVLKVDSYVYRLKIG ELMKSSRILAQYRESMELAAEVFGNVEHYRDDFYWHNAYKTRIRQAVPEEAYTMSCEKKV ALEAGSCYRLRVHLRNGDTAWVSPIFVRQ >gi|229784031|gb|GG667704.1| GENE 14 19504 - 20658 1137 384 aa, chain - ## HITS:1 COG:PM1681 KEGG:ns NR:ns ## COG: PM1681 COG3333 # Protein_GI_number: 15603546 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 1 369 126 495 498 285 44.0 1e-76 LLFFAPQISKIAIYFGAPEYFALAVFGLSVIASVSGSSILKGVISGAFGVLIATVGIDAV SGVTRFTFGSLQMYNGIKMLAVLLGVYAIAQMISRVNGTEEEDSGTELKKADPSDRLTFK EIKSTIPTMLKSSAIGAFIGAIPGTGGAIASYIAYNEAKRCAKPGEKFGEGELKGIAAPE SANNGATAATLIPLLTLGIPGDAVAATLLGAFTMHGLVVGPKLFSSSGPVVYAILIGCVI 
SQIVMFLQGKYLLPVFVKITHIPQELLTALLVVICCTGAFSLANSSFDVRVMLIFGGIAY FMQKVELPPVPIILGIVLGPIAEANLRNSLVLSGGSWTIFFKRPICLAFIILTFVLIWLL KKNDKKQRKMEAEFFEKYGENTER Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:53:21 2011 Seq name: gi|229784030|gb|GG667705.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld98, whole genome shotgun sequence Length of sequence - 25958 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 18, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 430 427 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 2 1 Op 2 . - CDS 420 - 854 505 ## COG1959 Predicted transcriptional regulator - Prom 897 - 956 7.8 + Prom 913 - 972 8.2 3 2 Tu 1 . + CDS 1024 - 1578 614 ## Amet_3741 hypothetical protein + Prom 1599 - 1658 4.9 4 3 Tu 1 . + CDS 1689 - 3143 1761 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 5 4 Op 1 . + CDS 3266 - 5110 2074 ## COG0210 Superfamily I DNA and RNA helicases 6 4 Op 2 . + CDS 5133 - 5360 207 ## Closa_1311 ATP:corrinoid adenosyltransferase BtuR/CobO/CobP 7 5 Op 1 . + CDS 6323 - 6739 385 ## Closa_1314 selenium metabolism protein YedF 8 5 Op 2 . + CDS 6814 - 7245 237 ## COG3663 G:T/U mismatch-specific DNA glycosylase + Prom 7298 - 7357 9.2 9 6 Tu 1 . + CDS 7444 - 7911 699 ## gi|288871040|ref|ZP_06116154.2| membrane spanning protein 10 7 Tu 1 . - CDS 8025 - 8597 395 ## Closa_1315 hypothetical protein - Prom 8667 - 8726 9.1 + Prom 8644 - 8703 10.6 11 8 Tu 1 . + CDS 8848 - 10272 1463 ## COG0144 tRNA and rRNA cytosine-C5-methylases + Term 10322 - 10361 4.1 12 9 Tu 1 . - CDS 10245 - 11168 309 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit - Prom 11213 - 11272 5.5 + Prom 11185 - 11244 8.5 13 10 Op 1 . + CDS 11312 - 11581 522 ## Closa_1318 hypothetical protein 14 10 Op 2 . 
+ CDS 11676 - 12551 1086 ## COG1284 Uncharacterized conserved protein + Term 12569 - 12609 4.6 15 11 Tu 1 . - CDS 12582 - 13877 1194 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair - Prom 13988 - 14047 7.3 + Prom 13950 - 14009 9.2 16 12 Tu 1 . + CDS 14029 - 14388 325 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Prom 15231 - 15290 80.4 17 13 Op 1 . + CDS 15326 - 16141 824 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Prom 16196 - 16255 5.2 18 13 Op 2 . + CDS 16309 - 16929 628 ## COG2365 Protein tyrosine/serine phosphatase 19 14 Tu 1 . + CDS 17857 - 18078 330 ## gi|266623229|ref|ZP_06116164.1| putative protein-tyrosine phosphatase + Prom 18115 - 18174 4.3 20 15 Tu 1 . + CDS 18209 - 19054 1052 ## COG1284 Uncharacterized conserved protein + Prom 19114 - 19173 6.0 21 16 Tu 1 . + CDS 19199 - 19795 817 ## Closa_1323 hypothetical protein + Prom 19925 - 19984 9.8 22 17 Op 1 . + CDS 20068 - 21522 1484 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 23 17 Op 2 . + CDS 21540 - 22310 751 ## Closa_1325 sodium pump decarboxylase gamma subunit 24 17 Op 3 9/0.000 + CDS 22331 - 22708 429 ## COG0511 Biotin carboxyl carrier protein + Prom 22715 - 22774 2.5 25 17 Op 4 4/0.000 + CDS 22832 - 23902 1144 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 26 17 Op 5 . + CDS 23973 - 25385 1671 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase + Term 25400 - 25451 14.1 + Prom 25396 - 25455 2.5 27 18 Tu 1 . 
+ CDS 25581 - 25956 314 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|229784030|gb|GG667705.1| GENE 1 1 - 430 427 143 aa, chain - ## HITS:1 COG:CAC0908 KEGG:ns NR:ns ## COG: CAC0908 COG1179 # Protein_GI_number: 15894195 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Clostridium acetobutylicum # 3 142 4 140 251 132 50.0 1e-31 MIHEFSRSELLLGKDSMAKLAGSCVAVFGVGGVGSHCIEALARGGVGRLVLIDNDTVSIT NINRQSIAYHSTVGMYKTQVMKTRIQDINPKAEVFPFETFVLPDNLEALFEQIPYKVDYI ADAIDTVTAKLALTQYAIEHDIP >gi|229784030|gb|GG667705.1| GENE 2 420 - 854 505 144 aa, chain - ## HITS:1 COG:DR2094 KEGG:ns NR:ns ## COG: DR2094 COG1959 # Protein_GI_number: 15807088 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Deinococcus radiodurans # 1 138 46 182 197 114 42.0 5e-26 MKISTKGRYALRMMLEFALHPGECTKINQVAERQQISDKYLEQIVTILSRAGYVKSMRGA QGGYYLTKEPSDYTVGMILRLTEGSLAPVACMEDELNQCIRSGHCATLTVWKQLDQAING VVDNITLADLVEEQKKLDGDCHDS >gi|229784030|gb|GG667705.1| GENE 3 1024 - 1578 614 184 aa, chain + ## HITS:1 COG:no KEGG:Amet_3741 NR:ns ## KEGG: Amet_3741 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 5 110 4 106 172 70 36.0 3e-11 MTMKCPECGAEFVDGNEVCTDCKTLLIPKEERSAGEEELITEGVDVEILTTVNDHLEAEL LQGILANQGIPSYSQDEESGGYMRIYMGYSIFGEKIYVRSSDFPAAQQCLKEWEDGKKNG GVEVLEGEEEAAGDDTGENTGDFEDDSEGGGGNPLILKDRRTAAIIILAAAVILGFIAVF PPIP >gi|229784030|gb|GG667705.1| GENE 4 1689 - 3143 1761 484 aa, chain + ## HITS:1 COG:CAC0990 KEGG:ns NR:ns ## COG: CAC0990 COG0008 # Protein_GI_number: 15894277 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 2 483 4 485 485 648 63.0 0 MSKVRTRYAPSPTGRMHVGNLRTALYAYLIAKHEDGEFLLRIEDTDQERYVEGAVEIIYR 
TLEETGLVHDEGPDKDGGVGPYVQSERQASGIYLEYAKLLVERGEAYYCFCTQERLDSLK QTVEGQEIMTYDKHCLSLSREEVEANLAAGMPYVIRQNNPTEGTTTFHDEIYGDITVDNS ELDDMVLIKSDGYPTYNFANVVDDHLMGITHVVRGNEYLSSSPKYNRLYEAFGWEVPVYV HCPLITNEEHKKLSKRSGHASYEDLIEQGFISEAVVNYVALLGWSPVDNREIFSLEELIK EFDYHNISKSPAVFDMTKLKWMNSEYMKAMEFDRFFEMAEPYIREVIRKDLDLRKIAAMV KTRIEVFPDIADHIDFFETLPEYDPEMYVHKKMKTNKETSLTVLTEVLPILEAQNDFSND ALYEALSKYVAEKGCKTGFVMWPIRTAVSGKQMTPAGATEIMEVLGREESIARIRNGIEM LSRS >gi|229784030|gb|GG667705.1| GENE 5 3266 - 5110 2074 614 aa, chain + ## HITS:1 COG:BS_yjcD KEGG:ns NR:ns ## COG: BS_yjcD COG0210 # Protein_GI_number: 16078247 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 11 613 140 752 759 337 34.0 5e-92 MSSEVLFEGAQKAAVTHDKGPMLVLAGPGSGKTFTITHRIKYLTEECSVNPSGILVITFT KAAATEMKERYESLCGGTPSSVSFGTFHAIFFRILKFAYRYDARNIVRDEQRMQYIRELM ETNRVEVEDEAEFVSAVLSEISSVKGEMIHLNHYYAKNCSEEVFKKLYSGYEERLRRANL IDFDDMLVMCYQLFQERKDILSAWQRKYQYILVDEFQDINRVQYEIVRMLAAPENNLFIV GDDDQSIYRFRGAKPEIMLGFEKDYPNAERVLLNINFRSTKDIVDTAGRLIAHNKTRFPK EIITKQGRGKPVVTRVWKDPQEETLEIVHEIADYAAAGYAWSDIAVLYRTNLGPRLLIEK LMEYNIPFCMKDAVPNLYDHWIAKNLIAYLNAAAGDLSRSCILSVINRPNRYISRDALEL PVVNWEAVKSFYQDKGWMVERIEQLEYDLKFLRQLAPAAAVNYIRKAVGYDDYIREYAEY RRMKPEELFEVLDQLAESAAGFKTVEAWFAHMEEYARELKLQARSREKRTEGVSLMTMHS SKGLEYRIVYILDANEGVTPHHKAVLDADMEEERRMFYVAMTRAKERLHVNYVSERYSKK QEVSRFVKEYLGRE >gi|229784030|gb|GG667705.1| GENE 6 5133 - 5360 207 75 aa, chain + ## HITS:1 COG:no KEGG:Closa_1311 NR:ns ## KEGG: Closa_1311 # Name: not_defined # Def: ATP:corrinoid adenosyltransferase BtuR/CobO/CobP # Organism: C.saccharolyticum # Pathway: Porphyrin and chlorophyll metabolism [PATH:csh00860]; Metabolic pathways [PATH:csh01100] # 1 75 1 75 174 117 72.0 2e-25 MDSCVHIYCGDGKGKTTAAVGLAVRAAGCGRKVLITRFLKTDHSGEVAALGLIPGITVTP CEKSFGFTFRMTEEH >gi|229784030|gb|GG667705.1| GENE 7 6323 - 6739 385 138 aa, chain + ## HITS:1 COG:no KEGG:Closa_1314 NR:ns ## KEGG: Closa_1314 # Name: not_defined # Def: selenium metabolism 
protein YedF # Organism: C.saccharolyticum # Pathway: not_defined # 13 138 80 204 204 176 71.0 4e-43 MLPGDGRQPADAEAVTEAEACIPDARRKGQVVVISSDCMGTGDDELGRQLMKGFLYAQTQ LDVLPDTILLYNGGAKLSAEGSQSVEDLRSLEAQGVEILTCGTCLNYYGLSGKLAVGNVT NMYDIAEKLSGAASIIRP >gi|229784030|gb|GG667705.1| GENE 8 6814 - 7245 237 143 aa, chain + ## HITS:1 COG:Cj1254 KEGG:ns NR:ns ## COG: Cj1254 COG3663 # Protein_GI_number: 15792578 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Campylobacter jejuni # 1 142 21 158 160 128 42.0 4e-30 MLGTMPSPKSREQGFYYGHPRNRFWPVLADLLGEPAPETIEEKKSLALRHHIAVWDVLAG CDICGADDSSIKNPVSNDMNVIFSRADIKAVFTTGTKADELYGRYCFSQTGIASVRLPST SPANCRTGYEELRAAYSQILKYL >gi|229784030|gb|GG667705.1| GENE 9 7444 - 7911 699 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871040|ref|ZP_06116154.2| ## NR: gi|288871040|ref|ZP_06116154.2| membrane spanning protein [Clostridium hathewayi DSM 13479] membrane spanning protein [Clostridium hathewayi DSM 13479] # 1 155 2 156 156 97 100.0 2e-19 MNTKSETKAATKAAETKAAETKTAEKTPVKATPSEAKKAPAAKTAAEKKAAAPKKAAPAK KEAEPKAVPEKKEAAPKKAAAKKPAAKKEVKSSVVIEYAGRQIVARDVLEAAKKAYESMN KGAAIETIEIYVKPEEGAAYYVVNGEGSDQYKVQL >gi|229784030|gb|GG667705.1| GENE 10 8025 - 8597 395 190 aa, chain - ## HITS:1 COG:no KEGG:Closa_1315 NR:ns ## KEGG: Closa_1315 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 177 3 178 187 208 68.0 1e-52 MILNITAYFIIPVYTFLFAWGTDWFTLNFSVLGSLANRKNAFLLWGIIVGTYFYYVLRKI IHHLPRNRKETVTSVSALILLAFAVTTPYLPENRPFRAFLHVIFAFSASVLLLGCLYLIV WKLYCMNRDGYRSYFICLNVITVLSAMLLLLAGIVSSALEIFFTVSCTLMLIRLYRRVSS AGDTYYSLKH >gi|229784030|gb|GG667705.1| GENE 11 8848 - 10272 1463 474 aa, chain + ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 3 301 2 280 280 243 45.0 
7e-64 MTELPEAFKEKMKNLLGTEYEAFLESYEKERVQGLRLNPGKTDEKEFLAKVPFHLTKIPW AREGYYYDSSDRPGKHPYHEAGLYYIQEPSAMAVVELLDPKPGDCVLDLCAAPGGKTTQI AGRLMGEGFLLSNEIHPARAKILSQNVERMGIRNAVVANETPERLAERFPEFFDGMVVDA PCSGEGMFRKDEEACRQWSPDHVVMCAARQRQILDSAARMLKAGGRMVYSTCTFSPEEDE QTIEMFLSEHPEFEIEDMGVREGLSPGKPEWGISAAETLRGTYRIWPHLSEGEGHYLAVL RKTGEDCGTWKRKAPAYLKDKAVHKEYEGFCRNLFTDPERYLDREEYILFGDQLYLLPPQ MIDLAGLKIVRPGLHMGTMKKNRFEPSHALALSMKKEEAVRRFPMKAEGQEAGRYLKGET LRIDDWLRPEESENCRLNGQKGWVLMTVDGWPLGFSKLAGGILKNHYPRGLRWL >gi|229784030|gb|GG667705.1| GENE 12 10245 - 11168 309 307 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 29 282 32 279 285 123 35 1e-27 MTYLIQQHDTQKTAEQFLLSNGYSAALIRRLRRTEHSILKNGSPIYTTHRLNEGDTLSVT LPEDHGSENILPIPMDLDICYEDRDLLIINKAAGVPIHPSQGNHDNTLANGIAWYLGEKG ETAAYRAINRLDRDTTGLLILARHALSACILSEMVRTHAIRRCYLAAATGLVPPEGVIDA PIARAGTSTIERCVDFERGDSARTHYRTLCYNPDTNCSLVELRLETGRTHQIRVHMKHIG HPLPGDFLYNPDYRLIGRQALHSWQLDFIHPIKKEPLHFEAPLPDDMRPLFGMDLPLALY SQRSPLG >gi|229784030|gb|GG667705.1| GENE 13 11312 - 11581 522 89 aa, chain + ## HITS:1 COG:no KEGG:Closa_1318 NR:ns ## KEGG: Closa_1318 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 89 1 89 89 100 61.0 2e-20 MMNEELYEALEQEFEKNHVDEDVEDVLLDLAEHMADQGIMDKEVIFKESYGKTSVEGCGV CAEEDGEISVLIKWIRVGKKEFEIDDYFL >gi|229784030|gb|GG667705.1| GENE 14 11676 - 12551 1086 291 aa, chain + ## HITS:1 COG:CAC0431 KEGG:ns NR:ns ## COG: CAC0431 COG1284 # Protein_GI_number: 15893722 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 11 286 11 286 291 139 32.0 6e-33 MTGIRQQYTAKDLGIDLLCDIAGAFLFNIGIYNFAVNAEFAPVGVSGIALILRYLFGLPM GIVSILLNIPILLVSYKLLGRRFLLRSMKSMVIFALVLDYVMPIIPVYDGNPFYASICTG IFSGLGLTIVYLRGGSTGGTDFLVMAIRKLFPHLTIGQISLVVDAIVIIAGGFVFRNLDA VILGMISSIVTTMVIDKIMYGLGAGKLMLIVTSCEAAVAEAIEEATGRGCTFLEGQGSYS LDEKKVVMCACNNSQVIPIRREVHEIDKGAFIIITDSNEVFGEGFKPIDMI 
>gi|229784030|gb|GG667705.1| GENE 15 12582 - 13877 1194 431 aa, chain - ## HITS:1 COG:CAC0285 KEGG:ns NR:ns ## COG: CAC0285 COG0389 # Protein_GI_number: 15893577 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Clostridium acetobutylicum # 4 399 2 394 396 334 44.0 2e-91 MTQSPLVFHIDVNSAFLSWESVCRLAEDPKALDLRTIPSAVGGDAASRHGIVLAKSTPAK KYGITTGEPLVQALRKCPELCVVPSRFDVYIHYSGLLMNLLSDYTPDIEKFSIDEAFLDM SSTIHLFGDPWTTANQIRERVFRELGFTVNIGIAPNKLLAKMASDFKKPDMCHTLFLDEI PKKMWPLPVRELFFVGRSAQKKLEGIGIHTIGQLAACDVELLKPLLGEKYAIQIHGYACG IDPSPVAEREPTGKGYGNSTTLSRDVDDFDTACTVLLSLCETVGARLRADHVICDCVCVE IKDWEFHVQSHQATLNSPTDSTTVLYENACRLLKEFWDLTPLRLIGVRATKISDDEFTQI SLFESEQSRKMKEMEKAVDQIRSKFGTDIIKRASFLKKDTIVDHAASKQKHLSSRSKERK SGSESKKERND >gi|229784030|gb|GG667705.1| GENE 16 14029 - 14388 325 119 aa, chain + ## HITS:1 COG:CAC2057 KEGG:ns NR:ns ## COG: CAC2057 COG1686 # Protein_GI_number: 15895327 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 35 117 23 105 351 67 42.0 8e-12 MKQVILFFLCLLLCTGICGRTAAQAETDGQEQQNQKDELGLYAQSAVLMDAKTGRILYGK NEGVARPMASTTKIMTCIIALEYGNLSDTVTASQNAAAQPKVHLGVYKGQTFRLEDLPS >gi|229784030|gb|GG667705.1| GENE 17 15326 - 16141 824 271 aa, chain + ## HITS:1 COG:VC0947 KEGG:ns NR:ns ## COG: VC0947 COG1686 # Protein_GI_number: 15640963 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Vibrio cholerae # 1 261 159 391 391 72 26.0 1e-12 MGCSDTYFITPNGLDASREENGQMKEHSTTAADLARIMNYCIGTSPKKEEFLKITGTQSY YFTDQEGKRSYNCQNHNALLTMMSGAFSGKTGFTGGAGYSYVGALEDGGREFTIALLGCG WPPHKTYKWSDARKLFGYGMEHYQYREVYQEKTFPEIPVENGVPRSGNPGDPVVVHAGLN LKEEEKSLKLLMAEDEEVTVSEELPDTLEAPLKKGQTVGTVTYRLQGETVKTFPVYLTED VEKITFRWCLDHVIQLFRAPFAFVNECLPAL >gi|229784030|gb|GG667705.1| GENE 18 16309 - 16929 628 206 aa, chain + ## HITS:1 COG:lin1914 KEGG:ns NR:ns ## COG: lin1914 COG2365 # 
Protein_GI_number: 16800980 # Func_class: T Signal transduction mechanisms # Function: Protein tyrosine/serine phosphatase # Organism: Listeria innocua # 15 206 45 222 298 139 39.0 3e-33 MEQNVNRTANHCTGRIPLEGLPNTRDLGGIRTMEGKKILPARLIRSGALYEATPADLERL VGEWRLGTVVDFRTAVERSQKPDPDMDGVTNIFNPILNEETVGITFEDEEEGKPKQDAIL GMLEHASSLGGDPELYVDKLYENLVVDEHASSYYGRFFDILLEADDRAVLWHCTAGKDRV GVGTALLLSALSVDRDTIIRDFVRTN >gi|229784030|gb|GG667705.1| GENE 19 17857 - 18078 330 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623229|ref|ZP_06116164.1| ## NR: gi|266623229|ref|ZP_06116164.1| putative protein-tyrosine phosphatase [Clostridium hathewayi DSM 13479] putative protein-tyrosine phosphatase [Clostridium hathewayi DSM 13479] # 1 73 2 74 74 110 100.0 2e-23 MENAEKTAALVREKTKDARLAECVKVLLTVSEDYIRHAFSAMEAACGSVEAYLYERIGLN ERKRAELKRKFLL >gi|229784030|gb|GG667705.1| GENE 20 18209 - 19054 1052 281 aa, chain + ## HITS:1 COG:lin2652 KEGG:ns NR:ns ## COG: lin2652 COG1284 # Protein_GI_number: 16801713 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 9 275 23 286 287 129 34.0 6e-30 MNMKKIITVLAGNTLYALAVSLFILPNGLITGGTTGLALVAQYKLGIPIAAFVGVFNVVM FVAGAAVLGKAFAFTTIISTFYYPFILGVFEGLFGTAPITGDTMLATVFAGLFIGAGIGM VIRAGASTGGMDIPPLILNKKFRLPVSMTMSVFDCLILLTQMVFAGKEKILYGILLVLLY TMVLDKVLMIGQNQMQVKIISEQCEIISEAIQNRMDRGTTLLLIEGGHLRKASYAVLTVI SGRELSKLNELVMGLDSNAFMIINQVSEVRGRGFTLHKEYK >gi|229784030|gb|GG667705.1| GENE 21 19199 - 19795 817 198 aa, chain + ## HITS:1 COG:no KEGG:Closa_1323 NR:ns ## KEGG: Closa_1323 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 22 198 22 192 192 122 42.0 8e-27 MAGRGKLMAVLLAVITVLALGGCFKKSMSPGAGAAETTAANATEAVREEAAPLHTETEKE DETAPAAEENEERAADGPGADAEDAAESGAGVQDIEAFAERIQEAVADRDMEALADLSAF PLNLETADGEKMTFEDRDEFLKQNPDLIFGDDLMVAIAGVDTATLEANADGVTMGEENPH IRYKKTEDGTFGITDIRE >gi|229784030|gb|GG667705.1| GENE 22 20068 - 21522 1484 484 aa, chain + ## HITS:1 COG:PAB1769 KEGG:ns NR:ns ## COG: PAB1769 
COG4799 # Protein_GI_number: 14521095 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Pyrococcus abyssi # 3 484 31 520 522 360 40.0 4e-99 MSNSAQTSASNRINALLDENSFVEVGAYITARNTDFNMTSRETPADGVVTGYGTIEGCLV YVYSQDASVLGGSIGEMHAKKIVNLYAAAMKTGAPVIGLIDCAGLRLQEATDALDGFGQL YLSQTMASGVIPQISAVFGTCGGGMAVSAAMADFTFVESRNGKVFVNSPNALAGNYTGKC DTSSADFQSKETGLADFTGDEAEILNGIRTLVTILPANNEDDMSYDECNDDLNRVCSDLA NYVGDTAIALTQISDDNFFMEVKKDYAPSMVTGFIRLNGTTVGCVANRTESYDESGLKTA EYDAVLTSQGCTKAAEFVEFCDAFNIPVLTLVNVKGYKASKCTERKIAKAAAGLTYAFAN ATVPKVTVVIGQAFGTAYLTMNSKSIGADMVYAWPDASIGMMDAKEAAKIMYADEIGKAD DSVSFINEKAESYQALQSSALAAAKRGYVDDIIDASETRARVIAAFEMLFTKREDRPDKK HGTV >gi|229784030|gb|GG667705.1| GENE 23 21540 - 22310 751 256 aa, chain + ## HITS:1 COG:no KEGG:Closa_1325 NR:ns ## KEGG: Closa_1325 # Name: not_defined # Def: sodium pump decarboxylase gamma subunit # Organism: C.saccharolyticum # Pathway: not_defined # 1 256 1 256 256 281 63.0 1e-74 MKLNMKRVALGLCMAACLFSLSACSKADSASSEVDVNIQAQLSQMASGYLNSYVNIPDDQ LDEVRNQTAKQSEALAEGVDSWKNVKHDLGAMISVEDTAEVEEIDSGYSATVHAVFEKRE MDFGITTDREGAITSVSFVPEYTLGENMEQAAMNMLMGMGTVFLVLIFISFLISCFKYIN RFEEKMKNRGKKEEPAPLPAAEPEPAVVQAEEENLADDLELVAVITAAIAASTGTSPSGL VVRSIRRAPAGKWKKA >gi|229784030|gb|GG667705.1| GENE 24 22331 - 22708 429 125 aa, chain + ## HITS:1 COG:SPy1176 KEGG:ns NR:ns ## COG: SPy1176 COG0511 # Protein_GI_number: 15675148 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Streptococcus pyogenes M1 GAS # 1 124 1 116 116 78 42.0 3e-15 MKNYTITVNGNVYDVTVEEGTGSAAPAAAPKAAPAAAPKAAPAAPKASAPAGAAGSVKVA APMPGKILGVKANPGQAVKKGDVIVILEAMKMENEIVAPQDGTIASINVATGDSVEAGQT LATLN >gi|229784030|gb|GG667705.1| GENE 25 22832 - 23902 1144 356 aa, chain + ## HITS:1 COG:SPy1177 KEGG:ns NR:ns ## COG: SPy1177 COG1883 # Protein_GI_number: 15675149 # Func_class: C Energy production and conversion # Function: Na+-transporting 
methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Streptococcus pyogenes M1 GAS # 1 354 24 376 376 345 57.0 1e-94 MILVAFVFLYLAIRHGFEPLLLIPISFGMLLVNIYPDIMKEIGADGSGGGLLQYFFKLDE WSILPPLIFMGVGAMTDFAPLIANPKSFLLGAAAQFGIYSAYFLAILLGFNGKAAAAISI IGGADGPTSIFLASKLGQTALMGPIAVAAYSYMALVPIIQPPVMKLLTTEKERKIKMEQL RPVSKLERILFPIIVTVVVCMILPTTAPLVGMLMLGNLFRESGVVKQLAETASNALMYIV VILLGTSVGATTSAEAFLNWNTLKIVFLGLAAFIVGTAAGVLFGKLMCVLTHGKVNPLIG SAGVSAVPMAARVSQKVGAEADPTNFLLMHAMGPNVAGVIGTAVAAGTFMAIFGVK >gi|229784030|gb|GG667705.1| GENE 26 23973 - 25385 1671 470 aa, chain + ## HITS:1 COG:FN1376 KEGG:ns NR:ns ## COG: FN1376 COG5016 # Protein_GI_number: 19704711 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Fusobacterium nucleatum # 9 453 4 448 448 531 57.0 1e-151 MAEIEKKPVKIVETVLRDAHQSLLATRMTTEQMLPIVDKMDKVGYYAVECWGGATFDACL RFLKEDPWIRLRKLRDGFKNTKLQMLFRGQNILGYNHYADDVVEYFVQKSLSNGIDIIRI FDCLNDIRNLQTAVKACNKEKGHAQVALCYTLGDAYTLDYWKDIAKRIEDMGADSICIKD MAGLLTPYKADELVRALKESTALPIDLHTHYTSGVASMTYLKAVEAGCDIIDTAMSPLAL GTSQPATEVMVETFRGTPYDTGYDQNLLAEIADHFQPIREEALKSGRLNPKVLGVNIKTL QYQVPGGMLSNLVSQLKEAGQEDKYRQVLEEVPRVRKDFGEPPLVTPSSQIVGTQAVLNV ISGERYKMVTKESKKILMGEFGQTVKPFNAEVQKKIIGDETPITCRPADLIAPQLPQFEK ECAQWKQQDEDVLSYALFPAVAKEFFEYRAAQQTGVDSAVADKGSKAYPV >gi|229784030|gb|GG667705.1| GENE 27 25581 - 25956 314 125 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 125 1 126 525 97 37.0 5e-21 MYRLIIVDDEWEALKGMRDTLPWEQWGFEVTGAASSGEEAWTLVEADIPDVLLTDIRMDE MSGIQLIDRVHQKYPEIRSVIISGYSDIEYYRKALEYKVFDYILKPSREEDFSKIFSNLR ALLDQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:54:05 2011 Seq name: gi|229784029|gb|GG667706.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld99, whole genome 
shotgun sequence Length of sequence - 17254 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 418 513 ## COG5438 Predicted multitransmembrane protein - Prom 608 - 667 80.4 - Term 1700 - 1743 10.6 2 2 Tu 1 . - CDS 1759 - 3657 1775 ## COG1409 Predicted phosphohydrolases 3 3 Tu 1 . - CDS 4586 - 6619 1741 ## COG1409 Predicted phosphohydrolases - Prom 6644 - 6703 4.8 - Term 6761 - 6798 6.7 4 4 Op 1 . - CDS 6822 - 7988 1167 ## COG1929 Glycerate kinase 5 4 Op 2 . - CDS 8018 - 8737 486 ## Ccur_13510 hypothetical protein - Term 8749 - 8801 4.3 6 5 Op 1 . - CDS 8828 - 10258 1318 ## COG0786 Na+/glutamate symporter 7 5 Op 2 . - CDS 10279 - 11601 949 ## COG4690 Dipeptidase - Prom 11627 - 11686 7.4 - Term 11671 - 11711 3.5 8 6 Op 1 . - CDS 11753 - 12892 697 ## COG3835 Sugar diacid utilization regulator 9 6 Op 2 . - CDS 12928 - 13452 751 ## COG0703 Shikimate kinase 10 6 Op 3 . - CDS 13452 - 14141 586 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 11 6 Op 4 . - CDS 14183 - 15853 1871 ## COG1227 Inorganic pyrophosphatase/exopolyphosphatase - Prom 15891 - 15950 4.5 - Term 16236 - 16297 6.3 12 7 Op 1 . - CDS 16321 - 17106 836 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 13 7 Op 2 . 
- CDS 17122 - 17253 181 ## gi|266623251|ref|ZP_06116186.1| hypothetical protein CLOSTHATH_04532 Predicted protein(s) >gi|229784029|gb|GG667706.1| GENE 1 1 - 418 513 139 aa, chain - ## HITS:1 COG:CAC0206 KEGG:ns NR:ns ## COG: CAC0206 COG5438 # Protein_GI_number: 15893499 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Clostridium acetobutylicum # 21 129 21 130 397 71 37.0 5e-13 MDWIKKKRMEIGLAAAVVLMIAAICIYNKANPITYTMYENGTINYVKARVLEVTDQQLEP VEEAEGRWLGIQELKVELLNKGHSGEIITVTNYLSTTHNVYAKKGQSLIIKADCPEGVEP FYSVYNYDRTTGLMMTGIV >gi|229784029|gb|GG667706.1| GENE 2 1759 - 3657 1775 632 aa, chain - ## HITS:1 COG:CAC0205 KEGG:ns NR:ns ## COG: CAC0205 COG1409 # Protein_GI_number: 15893498 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 8 272 187 457 652 223 46.0 8e-58 METAMKAFPETSFLISAGDQVNTASNETQYAGYLSPKELLSLPAAVNVGNHDAGSSAYSQ HFQVPNVSSLGMTEKTGKFGGDYWYTYNNVLFMSLNSNNMSTAEHRAFMKQVLDENGADA DWTVVTFHHSIYSTASHESDNDIIQRRAELSPVFTELGIDVVLMGHDHVYTRSYMMNGND PVVPADGTVPESVTDPAEGEVLYVTANSASGSKYYSIHNKDFPYAAVMNQESTPNITNVE VTDNSFAITTYRTTDMSEVDHFTIYRTEAPKPQPDVTGDTVAEILESLDQALEQAETEGE RQEILKKAAEAAEALSYDPNTMDESEMDEIKKLEDMVLAGYSELAAETDVQSKKVTGVTV EGAVLSVPLKAGVKAVSVLKVSDMELPEALGFETEDVIALDIRLDIISDDPEVSGGNIQP KVPVRITIDAPEDIDLNRLVLMHYTNGAYEDVKFTVKDGAISFVVNALSPFVLAEKAEAD KPDDGGDGSDDGSSDNGPSDNGPSDSGSSDNGSSDNGSSGNGSSSSVQGSWIKDQTGWWY QYQNKTYPVNTWVSIQGIWYHFDQAGYMQTGWIQVNGVWYYLQPSGAMVASDWVLYQDKW YYFNQDGAMATAPVHYKGTEYRFDESGACINP >gi|229784029|gb|GG667706.1| GENE 3 4586 - 6619 1741 677 aa, chain - ## HITS:1 COG:CAC0205 KEGG:ns NR:ns ## COG: CAC0205 COG1409 # Protein_GI_number: 15893498 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 470 665 52 254 652 126 37.0 2e-28 MSVKLKRLTALLLSACLAAGNITTAFGAPVQQDPEPGARVRTSRLISTEETVWRYLDDNT DPAAADDTKEETPGSPSNASPSDASPSNASAKYGFYRFAAGKNKASSKTGRLVWTSADYD 
DSGWKEARGSFGAKKGEIADLGDGLIPYTLLAQYIEGPDVAEDTDIPAYFFRTTIELEDI DSISALNGEIFYDDAAIIYINGEEVHRAGVPEGGYTQNLEYGGTNAGTPINNKFQITDLS MLKEGSNTVSVELHQGRANSSDIYFDFCSLEIGTPASESDEFITTEGTVWNYLENNTDPA EGSEDRTVWTYPEFADEQWKSGKGSFGAKKGAIADLGGGYIPDTLLKHYIDDTKTAVPAY FFRTDIYLNDVSDLEALEAEIAYDDAVIVYINGKNVYEGQVPEGGYDKNLAYGSANGLSA PSHDTITINDVSMLRDGVNTIAVELHQDRAGSSDIYFDFISLKAAQPKPFKAISLGVWND ETQTAVTWYSSIGEDGKVRIAEKADMDGDSFPEQAFQFAAERAEANDAGFYSFQAVMTGL EPDTDYVYQVGAGETWSDIYEFTTRDYENGFNFLLAGDPQIGAGSTDTDTEGWQNTLKTA MKAFPETSFLISAGDQVNTASNEAQYAGYLSPKELLSLPTAVNVGNHDAGSSAYSQHFQV PNVSSMGMTEKTGKFGG >gi|229784029|gb|GG667706.1| GENE 4 6822 - 7988 1167 388 aa, chain - ## HITS:1 COG:STM0525 KEGG:ns NR:ns ## COG: STM0525 COG1929 # Protein_GI_number: 16763905 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Salmonella typhimurium LT2 # 7 387 1 382 382 287 44.0 2e-77 MTGEVVMKFLFASDSFKGTLTSEQIIALLTEAAKKVFPDCETVGIPVADGGEGTIDAVIS VLHGSVHEVNAHGPLMEEVVSYYGESGDGAAVIEMAVSSGLPMVPVNKRDPRVTTSYGTG ELIKTALDRGCRDITIAIGGSATNDGGMGAMRALGIRFLDENGEELSGCGNDLARVADID IRGLHGAVKDAKFTIMCDVNNPLTGPDGATYTFGMQKGGTPKILDELEQGMIHYAALIRE KMGIDVDQIPGSGAAGGLGAAFSVFLKAEMKSGIETVLDLIHFDELLEGVDLVITGEGRI DWQSAFGKVPSGIGNRCRKKGIPAIAIVGGMGDKAEMIFDHGIDSIITTINGAMGLDEAL ERAEELYAGAAERAFRMIKAGMKLQDRS >gi|229784029|gb|GG667706.1| GENE 5 8018 - 8737 486 239 aa, chain - ## HITS:1 COG:no KEGG:Ccur_13510 NR:ns ## KEGG: Ccur_13510 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 8 239 2 233 237 130 34.0 4e-29 MSGNIREYLFITIVLFANIIEGITGFAGTMLAMPASMMLIGAEEAKVILNMVALMVSSTI AVKTYKKINRKELVKITSLMMLGMVIGLYLFAVLPVKLLSVCYGVLIIAVAVNGLASKKR MELPAGILIIIVLAAGIIHGLFLSGGALLVVYVTAVIKEKGVIRATLAPVWLMLNTIILL QDIWFGRITFHTLRLTGMCMIPVILALVLGNYLHNKIKQERFVHLTYLLLIISGVSLVI >gi|229784029|gb|GG667706.1| GENE 6 8828 - 10258 1318 476 aa, chain - ## HITS:1 COG:Cgl2722 KEGG:ns NR:ns ## COG: Cgl2722 COG0786 # Protein_GI_number: 19553972 # Func_class: E Amino acid transport and metabolism # Function: 
Na+/glutamate symporter # Organism: Corynebacterium glutamicum # 7 440 12 431 449 98 24.0 3e-20 MISMAALAWTGIMIIAGVVLRAKVPFLRNNLVPASVIGGVIGFIVMNLGFITDCTVGEFG DIANLLWIFTFANMGVTLAAKKETVKTEDGQSFRQKLANSQFSGICGMGFIWVIPYAFSG LIGFLVLQVIGKYFGMPAIHGLQLPFAFAQGPGQSVNYGGMMEANGVADAVQVGVTFAAM GFLVAFLIGVPYAKKGIKKGLAPYSGKIGESMLKGLYEPDEQEFYGKQTTHPGNVDTLAF HVALVGIAWVGGMYIGKLWALVPGYIGTLFSGFLYLNSMLCGYVIRWIIGKLGLSKYLDR GTQIHLSGFCIDMMVTSAFMAISLEVVGKWLVPILILVLITTVFTYIITRYFGERMGGKY GFERSLGVWGAMTGTNATGQALVRMVDPDRKTTVLEEMGPMNVINVPACYVVMPAVIAYS AGEISTTALFVSLIGVGVAFMAAMLVTGVWGKKTYDYRKGELYYSLDDETEQKAAQ >gi|229784029|gb|GG667706.1| GENE 7 10279 - 11601 949 440 aa, chain - ## HITS:1 COG:PA4677 KEGG:ns NR:ns ## COG: PA4677 COG4690 # Protein_GI_number: 15599872 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidase # Organism: Pseudomonas aeruginosa # 6 352 2 341 412 193 37.0 7e-49 MEVFSCDTMVALGNSTKSGNLIFAKNSDRPLTEAQPLVVYPAGDHKPDEMVQCTYIKIPQ AAHTYRVLGSKPFWIWGFEHGMNEHHVVIGNEAVWAREPEEKEDGLLGMDLLRLGLERGR TAYEALHVITGLLKQYGQGGNAALGMEHRYNNAFLIADDHEAWILDTCGRRWVARRVKDV AGISNCYSTEEQWDEDSGDVKEYAYENGWVSRDVPFNFAKVYSAMGLKHRAAHPRYARLN KLLNQHKGSIDLNVIRMIQRDHFEGEIIEPRWSPADGLHVSICMHAINEKASRTAAAAHI ELAKGETPVWWNCFSVPCMSIFAPYTVESELPEIISHAEQCYTENSAWWQFERMQYAMEQ DYPKYIGQWRTVQRQLEEKYLEDVREHGAPANEKIKANTEEMLKAADNMYRMITCHPSAS GEPQKAEANEVAKQRAKIVF >gi|229784029|gb|GG667706.1| GENE 8 11753 - 12892 697 379 aa, chain - ## HITS:1 COG:BH2731 KEGG:ns NR:ns ## COG: BH2731 COG3835 # Protein_GI_number: 15615294 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sugar diacid utilization regulator # Organism: Bacillus halodurans # 3 363 2 352 371 121 27.0 2e-27 MRITQDVAKTIVNEISNTINKNVNLMDETGRIIASTASDRVGTFHRGAQKIVEEKLDCLT IYTDQEYSGSRMGINMPIYFQGGIVGVIGITGEWDQISQYVQLIRKATEILLMESYLKNR ELHDEASRREYVYKLLFEEFDRIPEQFYQYGFSIGIDAVLARACLTVSFKDGSRGERPKQ IYEYLDMAEKKVHEYQAEAKQILVYRELTQLSIFIPGKDSEEVKRTAEDIRKRITLPEGI EIKAGLDESLYSGADIRSGYTKACKSLLVAMASNDRWLVCYNQLTLGLFVDEVSRATREE 
YVKRIFAFISDEEIAKWVSLLEVFYRCEGSITETASELFIHKNTLQYQLKKLASITGYDP RSLSCAGLYQTAIKFFYSF >gi|229784029|gb|GG667706.1| GENE 9 12928 - 13452 751 174 aa, chain - ## HITS:1 COG:MA3237 KEGG:ns NR:ns ## COG: MA3237 COG0703 # Protein_GI_number: 20092053 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Methanosarcina acetivorans str.C2A # 6 159 8 162 175 116 40.0 3e-26 MQKLDNITLVGMPASGKSTVGVLLAKRLGYSFVDVDIVIQEKEGRLLKEIIAEEGLNGFM EVENRVNAGLSVHKSIIAPGGSVIYGKEAMEHLKEISTVVYLKLSYEAVEARLGNLVDRG VVLKDGMTLRDLYEERVPYYERYADITVDETGLDAGQTVDKLREMLTDMYGLRT >gi|229784029|gb|GG667706.1| GENE 10 13452 - 14141 586 229 aa, chain - ## HITS:1 COG:FN1468 KEGG:ns NR:ns ## COG: FN1468 COG1853 # Protein_GI_number: 19704800 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 20 184 20 185 197 169 48.0 5e-42 MARQYFKPGNMLYPVPAVMVSCGREGETPNIITVAWAGTVCSSPAMLSISVRPERYSHAI LMETKEFVVNLVTKDLVFAADYCGVKSGRDVDKYKEMKLTPSPSKFVKTPGIGESPVNLE CRVTEVKKLGTHDLFLAEVVGVSVNENFMDENGKFLLNSTGLVAYSHGEYFELGKKLGSF GYSVKKTGNGKKSGSAEKTNIGSKRDNQKHASVKGTGAKRSGGNRRKSK >gi|229784029|gb|GG667706.1| GENE 11 14183 - 15853 1871 556 aa, chain - ## HITS:1 COG:FN1824 KEGG:ns NR:ns ## COG: FN1824 COG1227 # Protein_GI_number: 19705129 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase/exopolyphosphatase # Organism: Fusobacterium nucleatum # 8 555 3 536 538 293 32.0 5e-79 MGEKISRKTIVIGHKNPDTDSICSAICYANLKRCITGENYQPGRAGAVNEETQFVLKHFN VEAPEFVENVKTQVRDIEIRETAGVKKNLSLKKAWNLMQEANVVTIPAVTEEGLLEGLIT VGDIAKSYMNVYDSSILSKANTQYANIVETLEGAMVVGDETSYFSEGKVLIAAANPDMME YYISRGDLVILGNRYESQLCAIEMDAACIIVCEGAAVSMTIKKLAQEHGCTVMTTPYDTF TAARLVNQSMPISYFMKTEALITFEMDDYIDDIKDVMASKRHRDFPILDKNGRYRGMISR RNLLGAKGKRIILVDHNERSQAVEGMESAEILEIIDHHRLGTVETIAPVFFRNQPVGCTA TIVYQMYQENRVEIEPWVAGLLCSAIISDTLLFRSPTCTDVDKRAALHLAEIAGVEVETY 
ASAMFAAGSNLKGKTDVEIFYQDFKKFSVGKVSFGVGQISSLNAGELEELKGRMLPYMAK AREEHGMDMMFFMLTNILTESTELLCEGQGAEQMIAGAFRTYSEEGMGVKDHVVSLPGVV SRKKQLIPGIMLAVQA >gi|229784029|gb|GG667706.1| GENE 12 16321 - 17106 836 261 aa, chain - ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 4 255 5 259 269 130 33.0 2e-30 MLADYHTHTYYSDDSECPMEEMVRKAVAIGLDEIAFTEHVDYGVKTDLNCDYEAYFEETD RLKERYRDRITIKAGIEFGVQSHTIPLFRRDFKWYPFEFVILSNHQVGDREFWNYAFQEG KSQEEYQTAYYEAIYDVIRQYKEYSVLGHLDMIKRYDRFGDYPDDRIMGMVDRILSQVIA DGKGIEVNTSSFKYGLKDLMPSRAILRRYRELGGTILTIGSDTHETVHLGDHIAEVKQVL KEMGYQTFNTFSNMEPVYHSL >gi|229784029|gb|GG667706.1| GENE 13 17122 - 17253 181 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623251|ref|ZP_06116186.1| ## NR: gi|266623251|ref|ZP_06116186.1| hypothetical protein CLOSTHATH_04532 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04532 [Clostridium hathewayi DSM 13479] # 1 43 1 43 43 74 100.0 2e-12 KGFEFVKLPAVYRHYKKYQWVRNLTREETVLSGYAETEIPLQK Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:54:19 2011 Seq name: gi|229784028|gb|GG667707.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld100, whole genome shotgun sequence Length of sequence - 16360 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 568 632 ## COG3119 Arylsulfatase A and related enzymes - Term 440 - 482 -0.9 2 2 Op 1 2/0.000 - CDS 571 - 1896 1436 ## COG1653 ABC-type sugar transport system, periplasmic component 3 2 Op 2 7/0.000 - CDS 1954 - 3555 1609 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 4 2 Op 3 . 
- CDS 3581 - 5305 2051 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 5 2 Op 4 . - CDS 5377 - 6411 1224 ## gi|288871055|ref|ZP_06116191.2| hypothetical protein CLOSTHATH_04537 6 2 Op 5 . - CDS 6425 - 12049 4379 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 12211 - 12270 5.5 - Term 12171 - 12216 -0.7 7 3 Tu 1 . - CDS 12279 - 14336 1727 ## COG4289 Uncharacterized protein conserved in bacteria - Prom 14380 - 14439 4.7 + Prom 14357 - 14416 9.4 8 4 Tu 1 . + CDS 14514 - 15431 578 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 15454 - 15513 1.3 9 5 Tu 1 . - CDS 15693 - 16358 714 ## Teth514_2260 hypothetical protein Predicted protein(s) >gi|229784028|gb|GG667707.1| GENE 1 2 - 568 632 188 aa, chain + ## HITS:1 COG:Rv0296c KEGG:ns NR:ns ## COG: Rv0296c COG3119 # Protein_GI_number: 15607437 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis H37Rv # 7 157 265 421 465 85 35.0 4e-17 FAGNQAAGKVSDSLISQIDVYPTLCDLLEIEKPDYLQGRSFAGLFDNPELEINEEVFAEV TYHAAYEPMRCIRTSRYKLIRYFDYHNGYVPANIDASPSKDSMIEAGLLKQTREREMLFD LALDPLERKNLVNSSDYKEVYDDLRLRLENWMERTGDPLLRVGYRVPKPEGAWVNRLCDL DPGCTLSE >gi|229784028|gb|GG667707.1| GENE 2 571 - 1896 1436 441 aa, chain - ## HITS:1 COG:BH1864 KEGG:ns NR:ns ## COG: BH1864 COG1653 # Protein_GI_number: 15614427 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 9 355 9 351 461 102 27.0 1e-21 MKISRIASAIFLLAIVAAGAAGCSLGRDMTSDSGKAAEGRSAADGVKEGRVKITVSVWDN ANSPQFQAMADAFMEKNPDIDVELVDIRSDEYNSKLTVLLAGNESDPDVIMIKDADTQIS MREKGQILDLTSYIARDGVDLSIYNGAAEQLQMDGKQYTLPFRQDWYVLFYNKDLFDAAG VAYPADDMTWEEYEKLAEKMTFVKDGTAVYGTHHHTWMALVSNWAVQDGQHTLMSEDYAF LKPYYEQALRMQDEGIIQNYASLKISNIHYTSVFEQQQCAMVPMGTWFIATLVQDRKAGI FDFNWGVTRIPHPAGIEAGSTVGSVTPIAVNAKSDVPDQAWEFVRFATSEAGAEILADNS VFPAITTAGVVSRLASIEGFPEDGKAALSVKNFVFDRPVDAKMAAVRKVIEEEHDSIMIG 
EETVDAGIRNMNERAAEARSR >gi|229784028|gb|GG667707.1| GENE 3 1954 - 3555 1609 533 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 531 1 525 525 145 23.0 3e-34 MYRIMIVDDEPLILAGIASLLNWEEHQCRIVGKASNGQQALALLDTLRPDIVITDIKMPA MDGITFMKTAMENGSAASFILLTNLEEFSLAKEALHLGAIDYLVKLELSEESLVQALESA IERCNMNYHMTHLAEEPLTVDERIQNYFRRILIYDTDVEADAGLSGLIRERYPRPALMLI NFNYGFEGFSEAFTRADQKKVMGFAEDIISGMVKGFFDNSCVLRREQSGFVLVLSTEGMD DFKREASVMSVKFRQVMKDYFEVPVSIAVSLRGGGVDEIQDLLYQAMSAMNLTFYDNTGG AVFYSELCEDNRSHSSNFNINFLKKDVTEAIRQNDCEEFSSIMNQMIQLFSECRPSRRQA VNACNNLYYFITSLIEDRGDQDFPYAVDIVGQLNRMDNLSAVLTWLEGFRDQMIKVLENH QEARKDKIVELALEYVKAHYQEKITLSQAAAVLNVSQGHLSSTFKKQTGKNFSDYVTEMK IEKARELIESYQYRMYEISDMLGFDTPYYFSSVFKKITGYTPKEYENLVVKKL >gi|229784028|gb|GG667707.1| GENE 4 3581 - 5305 2051 574 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 234 572 247 592 602 164 30.0 6e-40 MVYNICIICVIGIIFSVTNYVTANRKAIQLAQNSVEYHVESISRYYQEVYDQMVNLVLNC AEREMFDLSSLGQLDTVEEKRKGLDYAKLAANFCAVTGYGDYITRLSIFDDSDVHISAGT ALSSVDDGRRIRSAPWFDRELKKDMYDYRLDLVDSPFYKEVGMAIPIICKNSGYKDSDWS ALFLSPRLYLDRLKEDDNGNTMLVTTYRGERIATLREPEGQKEESDRLIDSILRREENRG IERYRISGEPYLVSFNKESQSGVVVIEIMDMAVLKNDSLMLMQTVGLIFLSCVLLGISMS FLFTNAVRKPIDRLVNHVGRIAKGDFGQEESLESEDEIGLIGKAVNSMSSQIEQLMDKRL EDEKEKSSLEIRMLQAQINPHFLYNTLDSIKWIAVIQKNSGIVKAVTALSKLLKNMAKGV DERVTLREELDFVRDYVTIEKLKYVEMFDLEINIEEETLYDARIVKLTLQPLVENAIFSG IEPGGKNGTILIHAYEQDKCLCIEVRDDGIGIPEEKIPELLNHSEKLKGDQMSSIGMPNV DRRLKLIYGEEYGLSIESRAGEYTQITVRIPLEY >gi|229784028|gb|GG667707.1| GENE 5 5377 - 6411 1224 344 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|288871055|ref|ZP_06116191.2| ## NR: gi|288871055|ref|ZP_06116191.2| hypothetical protein CLOSTHATH_04537 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04537 [Clostridium hathewayi DSM 13479] # 31 344 1 314 314 548 100.0 1e-154 MKKKTAFLVLTWFLILECLAGCARQGYSCFMEVGSYEISKEEYALLAYDNIAVTAREYGT KDGIDVNVPGVWERKTDGVSPMDRVRELTDEEAVQQKGIQILAKKNGIIDDIGYDSFLKQ YQEENEQRRQMKEEGTVLYGPVELSVSQYYSYRQSQLEQLLEEFVEQNIITIGEEELKEY YPVVRQREEKKNFQAKLSLAVFEDAGDGQAAAVEQWVADHGITDSLPELLAEDTGIVADV EKMSVDTLELGKENQLLDRVVLAIQGLEAGAVTPAIEYTDGMNAVIAVESKEYVPFGSYD TERDYVEYLLREKKAREYIADEISRMEVKRGEAWETLTYEDVVK >gi|229784028|gb|GG667707.1| GENE 6 6425 - 12049 4379 1874 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1708 1852 481 648 744 72 30.0 1e-11 MKKSWKRGLALYLSFVMATGNFGSGISYSFAAVNGGGGNGTGKATSSNAAGSGSGHTHTS KATSSNADSGSSDREPDKKATESNATMSNALKLLLEKTEFSLTDDFSDTQGPIWYYKYGD RDSGEVSELQEYDADTPRWKDNKDKWLRIFNDGMMPGNKGDDAILEWKAPADGLIQISED DGSIFMETDRNDRDGVLLSIQQNGKLIWPDTGTISTEHNKTDVKFQPFYVQVREGDAIHF RANGNKEYNNDRVKWCPVITYLEEEDVAMISDITMTGRDSLEFTAASKELSGDTRDRISL SAIVNGNKEEAVTVRSAEAKDGKVVVAFDEVEAADKDQIQLLVGFDEDHRFVTYSPQDFE VISGLTYYVDAVNGNDSNDGLTEENAWKSLDKVNSTTFLPGDSILLKAGCIWNGYLYPKG EGSESAPVTLGKYGGDGYPIINGDGSVPYDSKYKVRKDHVYSPAVLLSDQSNWVIDGIEV TNSADDNFNHVGILVFTTGQTGVISNVTVQNCYVHDVTAEEANSKMTGGILVIGADAWID GQKPEGEMASDIGFDEILIENNYVKNVAKEGIRSSGQSTGAEGVGFRNTKTFRNITFRGN YIEEIFGDGIVLAEVGENGLVENNVVKNHCNYDTANYAGAWLWQCDDSVFRYNEVFGGEY GYNDGEAFDYDIGCKDIIYEYNYSHHNKGGLLLTMYDNASYHFRYNISANDGKPSQELFH CQTGETKIYNNTMYIGKGITTSIFQTGSVGYFKNNIIMAHGSVPKFGTSAISVSKGAMTN NIIYPASITSVNGPSEAVLEDNLFVNPRLSSPQVSTKYSSKEYDLTYENMGQGEEFIAGL RKRMAIFKLQENSPAVDAGVVIPESNAKEDIFGNAIVGLPDIGAHEFNNDPGPVEEDTEK AEKVILDKTELYMTVEDSVQLKASVQPEDILDDSIAWSVQDPDVVSVSAAGKVTARMKGE TVITAVSLSNPEAKAECIVHVYEKDSASVIADKDARVRDDGTHSGTEESMTVKLDGSGYI 
RNVLMSFDLSSVPISYKRILLKFHIDSTEADNVPFTISRIGDEWEEESVTWSSMPDKGEK ILDFSSSISDGGTWKELDITDYVETLLNSGEKRLSIVFEGVKVGSKNYTNIHSRETDYKP QLVFDNETVKSAEDIFVKVPVGTAPALPETVKVTYSDNTTADAAVVWDEIPAVKYEKEGQ FTVCGYIEERKLPVIAYVTVSDSIITAVSAGDVTTRAGVAPILPETVNATYINGSEEQIP VVWEPVNESQYAKPGTFTVKGKVNGFDAGCIIKIIVTEAGVVEAEAVDITGYVGFMPVLP ETVTVRYTDGSKGTARVAWEETEEEKYQKAGAFTVYGTCEGLEQKAAANVTMIDAAVKPA EMKSPAGKYPALPYTIKLENRSRVANASEVPVVWETVSRELYQGSNPYVVNGYIAGTTIP ASIKITPADPLEYMDGVIDTNGDTYIQVSPSDGNHGNKDVIRIKNYEGEYKRIGVVQFDF KSMNISDLTEMTLRLYFNKMDEKETVTETYLNIYTNTPGWGENTASYDSITQASGYKAEL IKEHVVIKRNQVGSYITIDVSDAIDYIEDGKISFILDIEENPENDANGSNSGLEFSSKES GAEQAPKLFVSNQYVTELEAVNLEAEVGNTLDLPDTITIYYADGTEKLEPVAWEAVSPTL LGRKGTFLVRGYVEGLYCSVTAAVTMKAEDPDTPDTPDTPDTPDTPDTPDTPDTPDTPDT PDTPNHPEKPNKESSSSSGKAVTTNSWVQGIGYKKCILGNGEYAKNGWYYLECNGIMSWY RFDENGNQITGFFTDQDGNIYYLDKDGVMMTGWQLIDGKWYYFNPVSDGTRGSLYRNTVT PDGYHVGPDGVWKN >gi|229784028|gb|GG667707.1| GENE 7 12279 - 14336 1727 685 aa, chain - ## HITS:1 COG:SMb20536 KEGG:ns NR:ns ## COG: SMb20536 COG4289 # Protein_GI_number: 16264263 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 100 382 84 354 617 102 28.0 3e-21 MKFELKNPDYGKSPYTGMTKEHWLEVCEFLLEGIFSGIERMDGHPVSRRVEFEVSYPNKN SSPTKASAEQFEGLARSFLIAAPLLYNRPEAVVCGVSIRDYYKTRILEAVTPGNPNYLYG LQELLAMAKEGEDTFQHTCECASLVIGLDQCRKAVWDTYTAEEKDRIASYIREFAYGKTG AHNWRLFNMLILGFLHQEGYEIDKGVMRDHAQTILSYYAGDGWYRDGHRFDYYTPWAFQV YGPIWNRWYGYAEEPWIAGKIEQYANALTDTFLSMFDKEGHVTLWARSGLYRNAAASPFA SAYLLNHTNVNPGLARRINSGALLQFITREDVFVNGVPSLGFYGQFLPMVQDYSCAESPF WIANPFVALTYPEEHPFWSAVEENGCWERLGRQEAESSLPCSARGYEETVMDGPGIVSAQ LGGNGACEFRTAKGLFKPGDEYIRYYIRLAFHSHYPWEAFDGQGAEAMQYSLFYDGAAEA LVPNIIMYGGVKNGVLYRKQYFDFHHNFQGGASIDLADFPVANGHIRVDKMRIPNKPFTL TMGAYGFPEFQDETKTQVIRRSCEGAQAVILKQAGRQLALVAYTAWESVETKERRGVNPA AENSILAYGKLRREKMYGYEPYVMISAVLSRNDDGEWEEDELFPVREITFMDEEQCGGYG PVTLLMTDGRTVVVDFEGLEGHLMI >gi|229784028|gb|GG667707.1| GENE 8 14514 - 15431 578 305 aa, chain + ## HITS:1 COG:CAC2608 KEGG:ns NR:ns ## 
COG: CAC2608 COG2207 # Protein_GI_number: 15895866 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 54 298 35 278 284 139 30.0 7e-33 MTFSTEQRINNGVSAALPPEHVTEENGSCRRIHFPETIRSHHMLSTRPVNDVYHMHDSYE LLLFLGGCTDCYIEHHRYRLKRGDLLIINSQEIHRAVASEGTPYERIATHFYPELVRNYC TEQTDLLSCFEKRKNADRNLLSLTPKSLEHYLYLTDHLHTAVASASFGHDILASTYLLQL LVLANTCFYESTWLPENLMPQLSSQIMRYIDEHLTEDLSIQALADHLYMNRSHMSRKFRE QAGCSIQEYMILKRISLAKTLLKQGKSVTDTCELSGFHDYSNFIRTFKKYAQVSPGHYRV TNMEA >gi|229784028|gb|GG667707.1| GENE 9 15693 - 16358 714 221 aa, chain - ## HITS:1 COG:no KEGG:Teth514_2260 NR:ns ## KEGG: Teth514_2260 # Name: not_defined # Def: hypothetical protein # Organism: Thermoanaerobacter_X514 # Pathway: not_defined # 2 215 118 332 337 145 37.0 2e-33 ELGFGGKGQWQYFYSFAKGYMPFTPDAREVSVLTEAFKGLFMATRAVKEKRISVDFEHGE ILWRVYNAETEEWNMFAGPLPPYERNYPEIELEDQDLKLELKAQPRNSQELAVDIAYMKT GIRDEEYDRPVCPRLLVVLDWKADMILRMDMMKPDDDEIGMVLDFFVTYVMTAGLVKKVR ARNPWVFAALSEICDYCGIELKKDRLGKVDRILEEMAGMMG Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:54:46 2011 Seq name: gi|229784027|gb|GG667708.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld101, whole genome shotgun sequence Length of sequence - 27437 bp Number of predicted genes - 32, with homology - 30 Number of transcription units - 10, operones - 6 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 647 519 ## Sfum_4062 hypothetical protein - Prom 667 - 726 2.7 + Prom 673 - 732 4.3 2 2 Tu 1 . + CDS 770 - 943 81 ## gi|266623262|ref|ZP_06116197.1| conserved hypothetical protein 3 3 Op 1 . - CDS 1169 - 1312 59 ## 4 3 Op 2 . - CDS 1348 - 2334 367 ## DSY2173 hypothetical protein 5 3 Op 3 . - CDS 2343 - 3026 541 ## lin1709 hypothetical protein 6 3 Op 4 . - CDS 3028 - 4221 622 ## COG3299 Uncharacterized homolog of phage Mu protein gp47 7 3 Op 5 . 
- CDS 4214 - 4582 361 ## gi|266623267|ref|ZP_06116202.1| hypothetical protein CLOSTHATH_04551 8 3 Op 6 . - CDS 4588 - 4986 265 ## gi|266623268|ref|ZP_06116203.1| conserved hypothetical protein 9 3 Op 7 . - CDS 4996 - 5847 639 ## Amet_0433 hypothetical protein 10 3 Op 8 . - CDS 5847 - 6209 267 ## gi|288871060|ref|ZP_06116205.2| conserved hypothetical protein 11 3 Op 9 . - CDS 6225 - 6416 70 ## gi|266623271|ref|ZP_06116206.1| cell wall protein TIR3 - Prom 6467 - 6526 12.3 12 4 Tu 1 . - CDS 7428 - 9338 973 ## COG5412 Phage-related protein 13 5 Tu 1 . - CDS 10299 - 10544 214 ## gi|288871061|ref|ZP_06116208.2| putative methyl-accepting chemotaxis protein McpR - Prom 10622 - 10681 2.5 - Term 10552 - 10602 12.2 14 6 Op 1 . - CDS 10771 - 11037 233 ## gi|288871062|ref|ZP_06116210.2| conserved hypothetical protein 15 6 Op 2 . - CDS 11094 - 11498 351 ## Apre_0847 hypothetical protein 16 6 Op 3 . - CDS 11509 - 11985 420 ## BSn5_14805 hypothetical protein 17 7 Op 1 . - CDS 12902 - 13450 627 ## gi|288871063|ref|ZP_06116213.2| hypothetical protein CLOSTHATH_04562 18 7 Op 2 . - CDS 13443 - 13988 232 ## gi|266623279|ref|ZP_06116214.1| hypothetical protein CLOSTHATH_04563 19 7 Op 3 . - CDS 13985 - 14359 364 ## gi|266623280|ref|ZP_06116215.1| hypothetical protein CLOSTHATH_04564 20 7 Op 4 . - CDS 14361 - 15110 353 ## EF1466 hypothetical protein 21 7 Op 5 . - CDS 15110 - 15769 564 ## gi|288871064|ref|ZP_06116217.2| hypothetical protein CLOSTHATH_04566 22 7 Op 6 . - CDS 15738 - 15863 61 ## 23 7 Op 7 . - CDS 15873 - 16283 448 ## gi|288871065|ref|ZP_06116219.2| conserved hypothetical protein 24 7 Op 8 . - CDS 16296 - 16919 532 ## RSKD131_3041 hypothetical protein 25 8 Op 1 . - CDS 17845 - 18855 983 ## TM1040_1622 hypothetical protein 26 8 Op 2 . - CDS 18868 - 19575 513 ## COG3740 Phage head maturation protease - Term 19600 - 19634 0.2 27 8 Op 3 . - CDS 19652 - 20131 500 ## Clocel_3545 hypothetical protein - Prom 20208 - 20267 2.4 28 9 Op 1 . 
- CDS 20300 - 22471 1340 ## MXAN_1213 hypothetical protein 29 9 Op 2 . - CDS 22477 - 23931 947 ## Mahau_0567 hypothetical protein 30 9 Op 3 . - CDS 23921 - 24769 698 ## COG3728 Phage terminase, small subunit - Prom 24984 - 25043 80.4 - Term 26398 - 26441 8.0 31 10 Op 1 . - CDS 26561 - 27058 569 ## Cphy_2975 hypothetical protein 32 10 Op 2 . - CDS 27065 - 27436 298 ## gi|266623296|ref|ZP_06116231.1| putative bacteriocin ABC transporter Predicted protein(s) >gi|229784027|gb|GG667708.1| GENE 1 2 - 647 519 215 aa, chain - ## HITS:1 COG:no KEGG:Sfum_4062 NR:ns ## KEGG: Sfum_4062 # Name: not_defined # Def: hypothetical protein # Organism: S.fumaroxidans # Pathway: not_defined # 32 213 273 455 711 79 28.0 9e-14 MAVKTVQAIINGVTTTLTLNTATGKYEATVTAPSTSSYNVNDGHYYPVTIKATDQAGNVT TKNDQDATLGASLRLKVKEKGAPVITITSPTASARLTNNKPQVAFTVTDADSGVNPDTIK ITIGSKVITTGITKTPSGKGYTCTYTPTEALADGSNTIKVDASDYDGNAAAQKSVTFIID TVPPTLSITAPANNLVTNQAACTVTGTTNDATSSP >gi|229784027|gb|GG667708.1| GENE 2 770 - 943 81 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623262|ref|ZP_06116197.1| ## NR: gi|266623262|ref|ZP_06116197.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 57 1 57 57 92 100.0 7e-18 MDSYIIFQSPKAADDFYTACQLAPMNTEGITYISDETAAANLEMAMLYHEAYDTLFS >gi|229784027|gb|GG667708.1| GENE 3 1169 - 1312 59 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAIRISFYTFETDIYCVGLDINNIDNTMYCAFYDNKRERVICFGIWQ >gi|229784027|gb|GG667708.1| GENE 4 1348 - 2334 367 328 aa, chain - ## HITS:1 COG:no KEGG:DSY2173 NR:ns ## KEGG: DSY2173 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 3 158 6 160 497 67 31.0 6e-10 MQGTVITNKGLQLIAKLVASGTALSFTRAAVGIGSVPSGYDPTNMTNLNKYKMDGSIASC SSSGDTASIIFQISSIGVSAGFTITEAGLFATDPDEGEILYCYLDMSSDPQYIYAEGSAI SKFVEMTLTVVVGSVQSITAYINPGSLLSNDGDISNTAIGTLDNITESFPVPAAKEKSSK FFGKIKKSLEDWKNLKASLLLVGSIVNNCVTNNAKLPLSAAQGKALMDLYTTLNSDLAVK 
YEPMNFKNGWSHPDERLPYIARTGNIITIRFPFCTGTAGAVVCTIPENYKLSRSSWIMLQ YAGIEIHGMDIIASNSQIEEINVTFIDF >gi|229784027|gb|GG667708.1| GENE 5 2343 - 3026 541 227 aa, chain - ## HITS:1 COG:no KEGG:lin1709 NR:ns ## KEGG: lin1709 # Name: not_defined # Def: hypothetical protein # Organism: L.innocua # Pathway: not_defined # 27 171 16 152 213 63 28.0 5e-09 MSFAVKMLEMLTSAYNRTDLQNFRENKPPETNIGKLFALSGWGFDILKEQTEKVRLWDRM DKMQGTSLDNFGRNYGVVRGEASDEMYRVMIKVKILAMMAAGNMDTIILSAASLFGVSAS DVTCEEVFPAKVYLYIDEDKLDQEHKDVANIIANLMCRIKSAGIGIRIFYQTYHGARLNV YLATNNMEIIRLQVQPESGDKYLKYEINFNAGVPMLERIAVHFTPAE >gi|229784027|gb|GG667708.1| GENE 6 3028 - 4221 622 397 aa, chain - ## HITS:1 COG:lin1710 KEGG:ns NR:ns ## COG: lin1710 COG3299 # Protein_GI_number: 16800778 # Func_class: S Function unknown # Function: Uncharacterized homolog of phage Mu protein gp47 # Organism: Listeria innocua # 8 393 2 379 383 157 31.0 4e-38 MSNDKWGLTENGFYRPTYTVLLNALEYKARELMGDDINLTVRSPLGLFLRILAWMWNILW SCLEDVYNSHFVDTAAGNSLYNLGRSIGMQLLSEGKASGYITVTGVEGTKIPAGYLVATN AGLQYTVTEEAVISSNGTALALIKAVKTGPEYNTAAGTVVVIANPSAVTGVDSIMNEAEI AGGRVKETDAEFRARYYESVDYSGGVNADAIQAALMNDVEGVSSAYVYENDTDVMEATYN LPAHSIEAVVYGGLDEQIAKAIYDRKAGGVQTIGSTAVNVVTASGQRLPIRFSRPASKKV YIKITRLSTSADYPGDAALKQALIDYIGDNTSGGLSIGVDVTYIKLPGILTALPGVEDFE LQIGTNGTSYAKNNIVIGYREKAVIDSAAITISKGDG >gi|229784027|gb|GG667708.1| GENE 7 4214 - 4582 361 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623267|ref|ZP_06116202.1| ## NR: gi|266623267|ref|ZP_06116202.1| hypothetical protein CLOSTHATH_04551 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04551 [Clostridium hathewayi DSM 13479] # 1 122 4 125 125 216 100.0 7e-55 MANKVWKINPDTKDLSFTDSGMLETLEDDAASAQGVAMTLGAWKGDFDLVAGHGTDYEQI LGVPADEETIDEVFREAIFQEDNVSTVDELMVVQAADRSLAVTWSGRLSNGEPVSMEVNV GE >gi|229784027|gb|GG667708.1| GENE 8 4588 - 4986 265 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623268|ref|ZP_06116203.1| ## NR: gi|266623268|ref|ZP_06116203.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved 
hypothetical protein [Clostridium hathewayi DSM 13479] # 1 132 1 132 132 239 100.0 4e-62 MGRKSNEFANAEANERRNLEKIKCSAIVSVTDFDAGKMTVNVKPLVQREISGTYVSPPPI LGVKVAYIPLEIEVDGKKATVKVDIKPGDIGTVVFLDVDSDNSMKTGAESKPNSSRIHSG DDAVFIGVMQKG >gi|229784027|gb|GG667708.1| GENE 9 4996 - 5847 639 283 aa, chain - ## HITS:1 COG:no KEGG:Amet_0433 NR:ns ## KEGG: Amet_0433 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 3 278 2 254 258 145 32.0 1e-33 MAFFVRSATLQIGPLKYSMNDGFYFEFEVPFYDSDQLVTASFTVNNLSATSRAGIQKNQV VILNAGYEDDMGVLFVGQVASCSHRQNGVEWQTKITATAALDQWLNTQINKTYAENTKAE DIVRDLLNIFGLEVGVFQLVENVVYPRGRVCSGKLKDILTEIVANECKSRLLIRANQIII NNPADGVTKGYLLTPDTGLLFQSDDSNVTTVETAQTKGADAEAKAAEEKTWKRQCLLNYR MGPGDQIQIQSRDLNGKFLIVSGKHKGTPTGTWLTEIEFKVAG >gi|229784027|gb|GG667708.1| GENE 10 5847 - 6209 267 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871060|ref|ZP_06116205.2| ## NR: gi|288871060|ref|ZP_06116205.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 8 120 1 113 113 222 100.0 6e-57 MSDELQSMGLTAEVEYIPIDVLKVPYAFSIKLDDRTYTMTVKYNDQGGFFTIDLSIMATG EVLCYGDPVRYGRPMFSAIEDGRYPIPVIVPYCLTGGVDDVTYDNFGKEVQLYLHERRAD >gi|229784027|gb|GG667708.1| GENE 11 6225 - 6416 70 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623271|ref|ZP_06116206.1| ## NR: gi|266623271|ref|ZP_06116206.1| cell wall protein TIR3 [Clostridium hathewayi DSM 13479] cell wall protein TIR3 [Clostridium hathewayi DSM 13479] # 1 63 32 94 94 68 100.0 2e-10 MSQQDSGEKTAAPAQTSATKAEGLKTTVSKEISQSSYAAYVNSYNGGSSSGPNQRKSSSY NGV >gi|229784027|gb|GG667708.1| GENE 12 7428 - 9338 973 636 aa, chain - ## HITS:1 COG:BH3518 KEGG:ns NR:ns ## COG: BH3518 COG5412 # Protein_GI_number: 15616080 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Bacillus halodurans # 367 621 602 870 940 66 23.0 1e-10 MSESMDRVTDAVEDTTDSLDDLGDAGDDVGNTYRNIGAEANSFKEAIANSSNAAIKQTNS 
LTKTIKAGFQGAYGYAGKQVSTFGTKIKSGAEDVKQAFTHPIQTIKGKLSDALDRASAGI DDTGDEADKTGDDLKKMGHDGESAGTKIKDAVGGAVKSFFAISAAIEIVKAGIEVAKQFG TAVMNAGIEAEQTGAKFEASFAADSGVQEWAENFSSAIHRSKTEVQGFLVSNKAMYNDLG ITGEAANQLSEITTSLAYDLGSAFKMDDAEALNVMQDYINGNTSALSEYGIQINDTVLKQ TAMEMGLGSNIDSLNDAAMAQVRMNALLQNSTEIQQAAAKKQEGYTNSIKSLKGVWNDFL SGAAERFAPVFTELTNTIMTSWPQIEPALMGMIDMLSNGFAAGAPVIMDLATGALPGLIS TLGELMTAAAPIGGVLLDMATTALPPLAAAVTPLISTFGTLAQTILPPVSRIISSIATTV VPPLVSILKSLSENVIAPLMPHVESIANAILPALSAGLKVIPPILSAISPVLSGIAGVLS TVVGFLSKIMEWAANGLANLLDKVAGIFGGGSKAASSAGADIPHNADGDNNFAGGWTHIN ERGGEIAYLPSGSTIIPADKSEQIIKGSQQQNVLAS >gi|229784027|gb|GG667708.1| GENE 13 10299 - 10544 214 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871061|ref|ZP_06116208.2| ## NR: gi|288871061|ref|ZP_06116208.2| putative methyl-accepting chemotaxis protein McpR [Clostridium hathewayi DSM 13479] putative methyl-accepting chemotaxis protein McpR [Clostridium hathewayi DSM 13479] # 1 78 6 83 83 112 100.0 9e-24 MDDQRNLTMGVQFGLTDSISQLSNVLDMIQDIKAGFLGAERGASNFGSEAASSAGMMADE LESARRQSERVSDAMMEILAS >gi|229784027|gb|GG667708.1| GENE 14 10771 - 11037 233 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871062|ref|ZP_06116210.2| ## NR: gi|288871062|ref|ZP_06116210.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 88 21 108 108 159 100.0 5e-38 MAKTKKVTINGTEYELQSVSPTWYLDLNDECGMTGGRRKTAKYMDSLFKNTVIAPKEVNT EGMQYFEEQEDISSAEKLLTEIESFLRK >gi|229784027|gb|GG667708.1| GENE 15 11094 - 11498 351 134 aa, chain - ## HITS:1 COG:no KEGG:Apre_0847 NR:ns ## KEGG: Apre_0847 # Name: not_defined # Def: hypothetical protein # Organism: A.prevotii # Pathway: not_defined # 6 128 5 127 135 83 40.0 3e-15 MDITGYDPKKVNVNIDGTILTGFASDGIITVSKNEDAVTPNVGCQGDVVYEENANESGTI AITLQATSSSLPKLRNLAANRKQFPVSVSDANDDDSISISAQRCRITKIPDLSRGKNTGT VTVNIYVPNLVIRS >gi|229784027|gb|GG667708.1| GENE 16 11509 - 11985 420 158 aa, chain - ## HITS:1 COG:no KEGG:BSn5_14805 NR:ns ## KEGG: BSn5_14805 # Name: 
not_defined # Def: hypothetical protein # Organism: B.subtilis_BSn5 # Pathway: not_defined # 3 155 192 343 343 75 28.0 4e-13 MPVGLSVPELKSSEITALENGHINFVTNEYKKNYIKNGCCADGEWIDAILGGDWIAITMR EKLYDIFMSNPVVPYTDAGFATVAAGVFETLDEATEYGIIAANAESGAGIYNVTVPKRSS ATDQQAALRQMPDIPWEAQLAGAVHGTKVKGTLKVSLS >gi|229784027|gb|GG667708.1| GENE 17 12902 - 13450 627 182 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871063|ref|ZP_06116213.2| ## NR: gi|288871063|ref|ZP_06116213.2| hypothetical protein CLOSTHATH_04562 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04562 [Clostridium hathewayi DSM 13479] # 1 179 9 187 187 286 100.0 5e-76 MSKDVVVVVTLESVDTSVDSLDILLISTAGVKEAKIYTALEDIETDWKKESVIYKQAAAM MGQGKAKPTPASLIQKVKVVGLAKPDTPEALVTAIKEYQEKDNDWYMFLTDQTEDEYIVA LGKFAEDSEPSEAELTAGVEDHRKFYFAETSNKALAIESRRTAVIYSEKQEYAEAAWIGV AS >gi|229784027|gb|GG667708.1| GENE 18 13443 - 13988 232 181 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623279|ref|ZP_06116214.1| ## NR: gi|266623279|ref|ZP_06116214.1| hypothetical protein CLOSTHATH_04563 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04563 [Clostridium hathewayi DSM 13479] # 1 181 1 181 181 358 100.0 1e-97 MKFLELRNKLIAGLSSYLKIPIVLADQVQPEQLFPYVLYSVTSSYIPEAGLGEFHQKTDG EKNIETRSEQPSCSFSFTVCSMNREIIDRNGNPVRIFGEDEALELSEKTQGWFLHIGYDY ISNMGITVIDVTNVQNRSFLQVDEEARRYGFDVLIRYRRTDERQTNVIETVNARMEGDSN E >gi|229784027|gb|GG667708.1| GENE 19 13985 - 14359 364 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623280|ref|ZP_06116215.1| ## NR: gi|266623280|ref|ZP_06116215.1| hypothetical protein CLOSTHATH_04564 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04564 [Clostridium hathewayi DSM 13479] # 1 124 1 124 124 241 100.0 1e-62 MYGFAYAQPQIPEGLLHDMYEIKMVGGYVDEEHGGQWVSGKEERVVFKGVVLPVNDKDLI RDSGGTYTQCSEKVYTNGKALKTGGQIYDPADQSYYTVSQELGHNSLHPMKRYLIERREG TAVK >gi|229784027|gb|GG667708.1| GENE 20 14361 - 15110 353 249 aa, chain - ## HITS:1 COG:no KEGG:EF1466 NR:ns ## KEGG: EF1466 # Name: not_defined # Def: hypothetical protein # 
Organism: E.faecalis # Pathway: not_defined # 17 248 16 201 201 93 30.0 1e-17 MTIEDNISSELERIRTEMETLSQMKIHIGIQGASGYGAGGEGKPGAPADIMTIANVNEFG ATIKAKNVKNLAIPIAKKAKGKSPLDFPGLFFLRSDGGYLFGCISKSRKGGPPKEKSSPS DMQPKGNKPGKKPIPKKKDDIEFLFILMEAVNIPERSFIRAGYDNNRAAIEDIASNAMKK IVFAGWDAETAANNIGMSIVGMIQEYMNQPFNFQGKSSITKATSNWPSNPLVETGRLRNS ITYRIEGGE >gi|229784027|gb|GG667708.1| GENE 21 15110 - 15769 564 219 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871064|ref|ZP_06116217.2| ## NR: gi|288871064|ref|ZP_06116217.2| hypothetical protein CLOSTHATH_04566 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04566 [Clostridium hathewayi DSM 13479] # 13 219 1 207 207 414 100.0 1e-114 MAKRMPLLSSNAMTTLEAMMEFIGMDPQDKEIPDAVKNNLERLINAASSYIETMTNRNFG LKKYVENHHGSGWQELCLNQYPIKAVESVMDVENKQIIDPDSYSFDDTGDIGVLYRDAGW ADRSYLGGLALDKVAPKRYLKVTYSAGYILPKDGTEEQPSDLPYDLQYIVWQMVQQQWNL AKNGANGLTAFSISDVSWTFDKELSTQVQSVIDQYRRWA >gi|229784027|gb|GG667708.1| GENE 22 15738 - 15863 61 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKEYTRENKMITGPNSNKAEGKPSADKKGDQHGKKNASAVK >gi|229784027|gb|GG667708.1| GENE 23 15873 - 16283 448 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871065|ref|ZP_06116219.2| ## NR: gi|288871065|ref|ZP_06116219.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 136 2 137 137 213 100.0 5e-54 MKRELFDNVSVVVGASGAVVDREGFLSAVFAASIGAITGSPTSAKLIVKVEHCDTADGTF ELASDTMLDPEHVSTAGVLQEVAVESKDVLQMNLDLIGCKRYIKLTPTISFTGGTSPAAG AAAYALVLGDPVDSPV >gi|229784027|gb|GG667708.1| GENE 24 16296 - 16919 532 207 aa, chain - ## HITS:1 COG:no KEGG:RSKD131_3041 NR:ns ## KEGG: RSKD131_3041 # Name: not_defined # Def: hypothetical protein # Organism: R.sphaeroides_KD131 # Pathway: not_defined # 82 197 312 431 441 66 29.0 6e-10 MSTFSADAMFANDLTRRMQLGLDFGGLYGAGAEFQPLGIANNKEVEQIDGSKADYVDAKG FISADFPVFVSSKAFAKNIDDVNAGWAMNSLLEGVFKNMKTEMGTYIYRDEMATGKLCGI PYKVSNQIPTIGGKTDLFFGNWSDLLIGDQMGLETYTTLDGTWTDDDGTQHNAFEENLAA 
TRALMYDDIGVRHPESFMYCKNIKVMN >gi|229784027|gb|GG667708.1| GENE 25 17845 - 18855 983 336 aa, chain - ## HITS:1 COG:no KEGG:TM1040_1622 NR:ns ## KEGG: TM1040_1622 # Name: not_defined # Def: hypothetical protein # Organism: Silicibacter_TM1040 # Pathway: not_defined # 212 335 94 212 439 74 33.0 7e-12 MSRYKMSRKAARTRKSMKMGADDLQEMVKAAVKEALDEQKADDGAEGDPAEAGGDLGEIL DAAIEAVNEKRKSAKADELNPDDTSELIDAILEEAGATEDEKADDEPTDLEEVIKAACEA VNEKRKSAKADGIGDDVVEDILDAVAEVMSDDTADEESKGEKGAPVKRQTKAAVSRPAAQ KKKVERKYSSFYLSRDGGNGTMGNTKKKEVPPMIQLARAIKCIDVFGRQDPERAAYYARK KYDDADMEREFKALSVTSPVDGGYLIPEVYSDQIIELLYPKTVIVELGAQQVPLTSGNLN LPRMTAGARAQWGGEQRKIKTSTPKFGNIKLSAKRS >gi|229784027|gb|GG667708.1| GENE 26 18868 - 19575 513 235 aa, chain - ## HITS:1 COG:AGc1747 KEGG:ns NR:ns ## COG: AGc1747 COG3740 # Protein_GI_number: 15888296 # Func_class: R General function prediction only # Function: Phage head maturation protease # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 19 152 25 160 190 85 38.0 1e-16 MKHEVKTMQFKVDAYNEEEGIFSGYGAVFDNVDSGGDIIEPGAFTKTLAEGWERVKILAL HNDCWLPIGRPIELREDANGLFLSAKISDTSMGRDIKVLLKDGVLNELSIGYDPIVFDYD GDGIRHLREVKLWEVSVVTWAMNPEAVITGYKSMQETAEQAQAIKDDLLLDVKEGRKISN TRLKSLKDVSKSMKDAARTIDAVIREASGADQKSRTRAPVSRKSNQDQIEIEISY >gi|229784027|gb|GG667708.1| GENE 27 19652 - 20131 500 159 aa, chain - ## HITS:1 COG:no KEGG:Clocel_3545 NR:ns ## KEGG: Clocel_3545 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulovorans # Pathway: not_defined # 1 159 1 154 154 170 56.0 1e-41 MKKYIGTKLVEAAPAYKCISTGKTFAIDKMPEEEARRKGLELGYLVQYPDGYFSWSPKEV FEKAYLQVDDNKDLPSGVSIGTKMVEDFISEVETMTMGDSTTVVRCVLKNGFAIVESSSC VDPKNYSEEVGREICMNKIEDKVWELLGFLLQTAWHGIQ >gi|229784027|gb|GG667708.1| GENE 28 20300 - 22471 1340 723 aa, chain - ## HITS:1 COG:no KEGG:MXAN_1213 NR:ns ## KEGG: MXAN_1213 # Name: not_defined # Def: hypothetical protein # Organism: M.xanthus # Pathway: not_defined # 66 414 99 447 452 129 29.0 4e-28 MGIFNFRQRNRSDYFDRRSGNNNMMIPRYTSPPTRNTAEWMDTFGKNPRMAVINKISSDL 
AYAKGKLYRIDKDGEKKEVTVHPFLDFWARPNPLFEFTASSLWKLQSDYLLLKGEGYFII ERFDNGYPAELWPVPTHWVQMTPYQGFPFYRIRGADGSVMDVSVDDVFVMKDLNPLDPYK RGLGQAEPLADEIEIDEYAAKFQKKFFFNDATPGSIVIMPGADDKQQERFLAKWKERFRG HQNSHGIATIGGPKDVPTSVVKLGDNMKDLDMINGRTFTRDAVLEHFGVPREIMGITQNS NRATAETARYIYATNVLMPRLMNRQDAINLQLLPSYGDDLLWEFDDIIPKDKEFEKGVAF DGWNNGLLTKDEAREKLDMEPAKKGKIFKMSFADLYIGDDEDPVELSSAAANLQYSDIPE PIEADRNDVEIVGRDNPDTILLEDEEKILRRAAEIKAQRIKAAGRGLDAARRTQTRKFEM ATMKYLRNQSNQIQKVLLGTTKAEGTVWDALDMTQEEFLQLSEEQQKELTLRFVNGLIDW KKEETMLESILTPLWAETYDKGVENVISTYRLRNIQQPALTSAARIRGGQRVTYVSQTTK QSISKIITDGLAEGKSHQTLTEEIMDTMNTSASRARTIAAQECNTSLLAGNYDMAQRGGF STKTWHVTNPEKARDTHRELNGKSVPITEPFVTSKGNKLMMPCDPNCGVAEETVNCHCFL TYS >gi|229784027|gb|GG667708.1| GENE 29 22477 - 23931 947 484 aa, chain - ## HITS:1 COG:no KEGG:Mahau_0567 NR:ns ## KEGG: Mahau_0567 # Name: not_defined # Def: hypothetical protein # Organism: M.australiensis # Pathway: not_defined # 10 476 11 448 461 367 41.0 1e-100 MLSDDAVLFYADEPIYFVEDIIRVTPDQKQRDILRSLRDYPMTSVRSGHGVGKSAVESWS VIWFLCTRPFPKIPCTAPTQHQLYDILWAEISKWLRNNPELKNDIIWTQQRVYMNGYPEE WFAVPRTATNPEALQGFHAEHVLYIIDEASGVSDKVFEPVLGAMTGEDAKLLMMGNPTRL SGFFFDSHHKSRSEYSAMHIDGRDSQHVNQKFVQKIINMFGMDSDVFRVRVAGQFPKSTP DSLIMMDWCEAATQLKPETVRNRVDIGVDVARYGDDSSALYPVIDKVQSDGYELYHHNRT TEISGYVVQMIKRYAVECLDAVIRVKVDCDGLGVGVYDNLYDLTDQIIDEVWRDRCRREG LDPDNGNQWNECQRIPQLDLEIVECHFGAAGGKIDEDDPVEYSNSTGLMWGKIRKLLQTG ALQIPDDDALISQLSNRRYIVNKDGKLELERKEAMKKRGLPSPDIADALALALYDPNNEW TLNW >gi|229784027|gb|GG667708.1| GENE 30 23921 - 24769 698 282 aa, chain - ## HITS:1 COG:BS_xtmA KEGG:ns NR:ns ## COG: BS_xtmA COG3728 # Protein_GI_number: 16078322 # Func_class: L Replication, recombination and repair # Function: Phage terminase, small subunit # Organism: Bacillus subtilis # 7 273 4 246 265 129 35.0 9e-30 MAQGSNEQTLKAKALFEQGMSLVDIARELGVSDGTVRSWKSRNGWGSDKPATLQKKKTQR CKQKRSVANTDKRTVDAVLENSELNSEQQLFCVLYAKTLNATQSYKKAYGCSYETAMVNG GRLLRKTKVKEEINRLKKERFENQLFDEHDIFQFYLDIATACITDYVTFGREEVPVMNMF GPVMVDKPDGNPGSGGKIPLMKEVNYVKFQESLEIDGRAIKKVKMGKDGASIELYDSMKA 
MEWLAEHMSMGTSGQQGLAQNIMSAYEKRKADKEKEGTPDAE >gi|229784027|gb|GG667708.1| GENE 31 26561 - 27058 569 165 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2975 NR:ns ## KEGG: Cphy_2975 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 165 1 165 165 124 41.0 1e-27 MNQETAERIAKEAALEAVKEFKKSERKEKRVKIFQNTKKLMENYNRICRSVQEGISDLSD MDDSDELEGFTEEDIFINSILKSKLRSVVMIAHIDKCLSLLEEEEYSKNTPEKYLAFKYY YLDGMTYENIAEVYGYVDRTARRWVTELTNILSVYLFGSDAIMLD >gi|229784027|gb|GG667708.1| GENE 32 27065 - 27436 298 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623296|ref|ZP_06116231.1| ## NR: gi|266623296|ref|ZP_06116231.1| putative bacteriocin ABC transporter [Clostridium hathewayi DSM 13479] putative bacteriocin ABC transporter [Clostridium hathewayi DSM 13479] # 1 123 1 123 123 154 100.0 3e-36 RPVEQNSPAKELENFIDQCVQEYKAAYENVNEEDRRLQDLVHAIEFAVDKSERNRVATKF QQSRKYRRQNKDIVKRNERIVKFFEEQKNRDTLNRMRQLLGQQRKEEEYLDGERVYKPRV GKE Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:57:38 2011 Seq name: gi|229784026|gb|GG667709.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld102, whole genome shotgun sequence Length of sequence - 30396 bp Number of predicted genes - 36, with homology - 35 Number of transcription units - 13, operones - 8 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 781 776 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP 2 1 Op 2 . - CDS 818 - 1228 360 ## EUBELI_00568 hypothetical protein - Prom 1263 - 1322 5.4 + Prom 1297 - 1356 6.5 3 2 Op 1 8/0.000 + CDS 1384 - 2262 1042 ## COG1561 Uncharacterized stress-induced protein 4 2 Op 2 . + CDS 2282 - 2902 733 ## COG0194 Guanylate kinase + Prom 2908 - 2967 4.3 5 3 Op 1 . 
+ CDS 3019 - 3177 123 ## Closa_1948 DNA-directed RNA polymerase, omega subunit + Term 3181 - 3234 5.1 6 3 Op 2 2/0.000 + CDS 3275 - 4615 778 ## PROTEIN SUPPORTED gi|227419447|ref|ZP_03902591.1| SSU ribosomal protein S12P methylthiotransferase 7 3 Op 3 4/0.000 + CDS 4599 - 5141 365 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 8 3 Op 4 . + CDS 5148 - 5429 351 ## COG1058 Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA + Prom 6276 - 6335 80.4 9 4 Op 1 . + CDS 6440 - 7270 1016 ## COG1546 Uncharacterized protein (competence- and mitomycin-induced) 10 4 Op 2 . + CDS 7292 - 8632 1387 ## Closa_1952 hypothetical protein 11 4 Op 3 . + CDS 8613 - 9002 319 ## Closa_1953 hypothetical protein 12 4 Op 4 . + CDS 9010 - 9927 829 ## Closa_1954 hypothetical protein 13 4 Op 5 . + CDS 10002 - 10406 538 ## Closa_1955 Sporulation protein YtfJ 14 5 Tu 1 . + CDS 11401 - 11517 117 ## + Term 11628 - 11663 7.4 - Term 11616 - 11651 7.4 15 6 Tu 1 . - CDS 11669 - 11854 252 ## PROTEIN SUPPORTED gi|160881022|ref|YP_001559990.1| ribosomal protein L28 - Prom 11906 - 11965 8.1 + Prom 11936 - 11995 7.3 16 7 Op 1 9/0.000 + CDS 12021 - 12377 543 ## COG1302 Uncharacterized protein conserved in bacteria 17 7 Op 2 2/0.000 + CDS 12392 - 13237 1045 ## COG1461 Predicted kinase related to dihydroxyacetone kinase + Prom 14139 - 14198 6.3 18 7 Op 3 4/0.000 + CDS 14259 - 14891 685 ## COG1461 Predicted kinase related to dihydroxyacetone kinase + Term 14913 - 14949 6.3 + Prom 14901 - 14960 2.2 19 7 Op 4 . + CDS 14981 - 17032 2052 ## COG1200 RecG-like helicase + Term 17072 - 17105 -0.9 20 7 Op 5 . + CDS 17121 - 17813 646 ## COG5523 Predicted integral membrane protein + Term 17840 - 17881 8.0 + Prom 17850 - 17909 6.2 21 8 Op 1 . + CDS 17948 - 18886 1049 ## COG0714 MoxR-like ATPases 22 8 Op 2 . + CDS 18887 - 20020 845 ## Cphy_3811 hypothetical protein 23 8 Op 3 . 
+ CDS 20007 - 20942 519 ## bpr_I2315 hypothetical protein 24 8 Op 4 . + CDS 20947 - 21561 652 ## COG0572 Uridine kinase 25 8 Op 5 . + CDS 21583 - 21840 303 ## gi|266623321|ref|ZP_06116256.1| conserved hypothetical protein 26 8 Op 6 . + CDS 21837 - 22334 268 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 27 8 Op 7 . + CDS 22393 - 23778 833 ## Sgly_1299 transcriptional regulator, MerR family + Prom 23780 - 23839 1.6 28 8 Op 8 . + CDS 23907 - 24113 233 ## Closa_1963 small acid-soluble spore protein alpha/beta type + Term 24130 - 24174 11.6 + Prom 24154 - 24213 6.3 29 9 Op 1 14/0.000 + CDS 24243 - 24806 269 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 30 9 Op 2 . + CDS 24803 - 25288 416 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 31 9 Op 3 . + CDS 25325 - 25906 748 ## Closa_1966 hypothetical protein + Term 25954 - 25986 3.2 - Term 25666 - 25709 1.3 32 10 Tu 1 . - CDS 25863 - 26210 241 ## Closa_1967 nucleoside recognition domain-containing protein - Prom 26254 - 26313 7.6 33 11 Tu 1 . - CDS 27215 - 27412 125 ## Closa_1967 nucleoside recognition domain-containing protein - Prom 27433 - 27492 3.2 + Prom 27201 - 27260 21.1 34 12 Op 1 . + CDS 27464 - 28273 845 ## COG0739 Membrane proteins related to metalloendopeptidases 35 12 Op 2 . + CDS 28348 - 28716 445 ## COG0144 tRNA and rRNA cytosine-C5-methylases + Prom 29618 - 29677 7.2 36 13 Tu 1 . 
+ CDS 29783 - 30395 415 ## COG0144 tRNA and rRNA cytosine-C5-methylases Predicted protein(s) >gi|229784026|gb|GG667709.1| GENE 1 1 - 781 776 260 aa, chain - ## HITS:1 COG:BH2516 KEGG:ns NR:ns ## COG: BH2516 COG1293 # Protein_GI_number: 15615079 # Func_class: K Transcription # Function: Predicted RNA-binding protein homologous to eukaryotic snRNP # Organism: Bacillus halodurans # 1 254 1 244 570 168 38.0 8e-42 MALDGIVMASLKAELKTRLEGGKIAKIAQPEKDELLFTVKNQKNTWRLLISASASLPLLY FTERNKQSPLTAPNFCMLLRKHIGNGRIIRVSQPGLERIICLEVEHLDELGDKRIKRLYI EIMGKHSNIIFCNEEDVILDSIKHISAQVSSVREVLPGRPYFIPQTVEKKDPLTISETEF MELIGHTAAPVQKALYMKLTGISPIIGTELCQLASIDGDVSANELSEAELIHLCHMFSLM MDDVRTHSFTPNIIYHQNEP >gi|229784026|gb|GG667709.1| GENE 2 818 - 1228 360 136 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00568 NR:ns ## KEGG: EUBELI_00568 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 6 121 19 135 157 104 47.0 1e-21 MKKESMIWYDRKRLWCGLPWTFTKYGMDEGRLFVETGFFNTKEEEVRLYRILNISLSKNI IQRIFGLGTIHIDSTDLDLKCLKITNIKDSDHVKEMLSEKVEEERMRNRVSAREFMNHGE DGDADEDFHPFDDHEN >gi|229784026|gb|GG667709.1| GENE 3 1384 - 2262 1042 292 aa, chain + ## HITS:1 COG:CAC1716 KEGG:ns NR:ns ## COG: CAC1716 COG1561 # Protein_GI_number: 15894993 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Clostridium acetobutylicum # 1 292 1 292 292 234 45.0 1e-61 MIKSMTGFGRFETVTDEYKISVEMKAVNHRYLDLSIKMPKKFNYFEAGIRNLLKNYIQRG KVDVFITYEDYTENKLCLKYNSALAAEYMDYFARMETQFGIRNDVTVSTLSRCPEVLSME DVPEDEEHMWKLLSETVEEAARRFVETRVTEGDHLKADLQGKLNYMTSLVDFIEERSPRI IEEYRAKLEDKVKELLANTTIDEGRIATEVTIFADKICVDEETVRLRSHIENTKVELEAG GSVGRKLDFIAQEMNREANTILSKANDLEISDKAIALKTEIEKVREQIQNIE >gi|229784026|gb|GG667709.1| GENE 4 2282 - 2902 733 206 aa, chain + ## HITS:1 COG:L149828 KEGG:ns NR:ns ## COG: L149828 COG0194 # Protein_GI_number: 15673881 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Lactococcus lactis # 1 189 1 189 205 189 49.0 4e-48 
MNEHGVLAVISGFSGAGKGTLMKALLEKYDNYALSISATTRNPREGEADGREYFFVTEDR FKEMIGQDALIEYAQYVNHYYGTPKEYVLNQMEQGKDVLLEIEIQGALKVKERFPEAILI FVMPPSAEELKRRLVGRGTESMDVIHARLRRAVEEAEGMDSYDYILINDDIASCTERLHQ MIRVQHSRVSNNLAFLSQIREELKNI >gi|229784026|gb|GG667709.1| GENE 5 3019 - 3177 123 52 aa, chain + ## HITS:1 COG:no KEGG:Closa_1948 NR:ns ## KEGG: Closa_1948 # Name: not_defined # Def: DNA-directed RNA polymerase, omega subunit # Organism: C.saccharolyticum # Pathway: Purine metabolism [PATH:csh00230]; Pyrimidine metabolism [PATH:csh00240]; Metabolic pathways [PATH:csh01100]; RNA polymerase [PATH:csh03020] # 1 52 32 83 84 71 82.0 1e-11 MIAAAKRARQIIGGAESDLPTAGKKPLSVAVEELYDGDVKILSEADATEEDD >gi|229784026|gb|GG667709.1| GENE 6 3275 - 4615 778 446 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227419447|ref|ZP_03902591.1| SSU ribosomal protein S12P methylthiotransferase [Desulfohalobium retbaense DSM 5692] # 1 443 7 437 438 304 41 5e-82 MKLLCISLGCDKNLVDTEMMLGLLNRDGYTFTDDEYEADVIVINTCCFIGDAKEESVNTI LEMAQMKEVGKCKALIVTGCLAQRYKQEIVDEIPEVDGILGTTTYDEISHVLAETLDGRE HVQCFHDLDIIPEVKTDRVITTGGHYAFLKIAEGCDKHCTYCIIPSLRGNFRSVPMERLV SEAQSLADQGVKELILVAQETTLYGVDLYGKKSLPELLKKLCRISGIQWIRIQYCYPEEI TDELIQTMKTEDKICHYIDMPIQHASDTILRRMGRRTSRAQLKEMIGKLRSEIPDIAIRT TLISGFPGETEEDHEILMEFVDEMEFERLGVFAYSAEEDTPAAGFPDQVLQEVKEERRDA IMELQQEISFDHSQSMVGRSLEVMIEGKVADENAYVGRTYMDGPGVDGMIFIQTGEELMS GTFVRARVTGALEYDLIGEIEDEFTE >gi|229784026|gb|GG667709.1| GENE 7 4599 - 5141 365 180 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 5 175 484 669 904 145 43 1e-114 MNLPNKLTILRIIMIPFFVLFMLLDGGANQTYRYIAAVIFIVASFTDLLDGKIARKYNLV TNFGKFMDPLADKLLVCSGLICFVGLGALPAWFVIIIISREFIISGFRLVASDNGVVIAA SYWGKFKTVSQMIMSVLLIVNIPALSILTAIFSAIALVLTVVSLIDYIAKNYRVLTEGSL >gi|229784026|gb|GG667709.1| GENE 8 5148 - 5429 351 93 aa, chain + ## HITS:1 COG:CAC3586_1 KEGG:ns NR:ns ## COG: CAC3586_1 COG1058 # Protein_GI_number: 15896820 # Func_class: R General function 
prediction only # Function: Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA # Organism: Clostridium acetobutylicum # 1 90 1 90 245 113 58.0 1e-25 MVVELISVGTELLLGNIVNTNAQFLAEKCASLGLSMYHQVTVGDNRERLAEDISTALKRA DIIILTGGLGPTEDDITKEVCADVMGFKLVEDP >gi|229784026|gb|GG667709.1| GENE 9 6440 - 7270 1016 276 aa, chain + ## HITS:1 COG:TM0703_2 KEGG:ns NR:ns ## COG: TM0703_2 COG1546 # Protein_GI_number: 15643466 # Func_class: R General function prediction only # Function: Uncharacterized protein (competence- and mitomycin-induced) # Organism: Thermotoga maritima # 99 275 2 177 178 133 43.0 3e-31 MEKDGKTAILLPGPFNELKSLFLDEVSPCLQKLQPEVIRSQMIKICGHGESQVEDQIKDL IAAQTNPTIAPYAKTAEVHLRVTARAADDETAKSLIKPVVKEIKNRFGDDIYTTKEEETL EMAVVRLLKKYELTVTTAESCTGGLIAGRLVNVSGASEVFREGFVTYSNKAKRKILDVSK GTLKKYGAVSEQTAKEMATGGVFATDADVCIAVTGIAGPSTEGDKPVGLVYIACYMKDKV KVEEYHFKGNRDKIREQAVVKSLDLLRRSILANYRG >gi|229784026|gb|GG667709.1| GENE 10 7292 - 8632 1387 446 aa, chain + ## HITS:1 COG:no KEGG:Closa_1952 NR:ns ## KEGG: Closa_1952 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 446 1 440 440 478 59.0 1e-133 METRFIKKQMTAAYQYSYTRLALNYYTLYETITEEPELSGKEGAFLATLHELIGAFLEGG DVLSRLDQLREEVTREVEILTAYTDCFQIYEYVLNRMERSYRSLPKTGLDDESFLAMLMA FITNTRESAVMNGRIRQIIGQLPVRLTKQKFYGLVLEGLSVYIGSPVESLHDMMYTLRTE SMASLPEGMAEGRRALYETLMQFGHMDYRSLSPEAFTEAQAKLMGVSAELNEEAECFVSL QEIINDLYVLSLAKSDAVIDVNEENLYRSIIAGVQDNLMKKDGSEPADSFAALLTQMEGR QEAYYERYLRSEIPPETPELLADPDYVRARNIDLLLSGSSFMSLKKTAAENDPKESELKK LVDRAYLEKTAEAYFSELDTVFSDTSRPVMRAIMAKVLSDLPVYFHSIDEIKSYIRGSLE SCGDEAEKETCKELLEELMEYENKLV >gi|229784026|gb|GG667709.1| GENE 11 8613 - 9002 319 129 aa, chain + ## HITS:1 COG:no KEGG:Closa_1953 NR:ns ## KEGG: Closa_1953 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 102 1 102 109 142 63.0 5e-33 MKISWYDHLYVGDKAKKKRYQIIQSIRNSRLISGAYVITPSLSGNNVLDIYPAMELSAPW 
YREDEFFIIGIAADYWEALEVTRQIVEELYQNTGGFDLTGYLNGRPLRRSHRDDPAQGMS EHVPEAGRM >gi|229784026|gb|GG667709.1| GENE 12 9010 - 9927 829 305 aa, chain + ## HITS:1 COG:no KEGG:Closa_1954 NR:ns ## KEGG: Closa_1954 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 36 305 36 329 329 245 46.0 1e-63 MLPIILFILKLLGILVLILLGLVLGVLLLILLVPVRYRIEGSCYERLKGKARITWLLHIV SVTAAFEDEFSVGIRLLGFRLFKPVEESREDAEDMMVHAMETTDRETEEFTGDLVSDAEK SGKKQLAEEGKQVPLPEEPHPGREREASQASDEKNGKAGKTADRLSSIYAKLKFSFKGMC DKLKAIKEKKEEIQAWITDENNRKTIRLLLKQLKKLVRHLLPRKGKGEVVFGFDDPYTTG QVLTAVSVVYPFCHKLIDIYPEFDRQVFTAEGHFSGRIRAGTLLLIAGRMLLDKNFRGLL RKWLR >gi|229784026|gb|GG667709.1| GENE 13 10002 - 10406 538 134 aa, chain + ## HITS:1 COG:no KEGG:Closa_1955 NR:ns ## KEGG: Closa_1955 # Name: not_defined # Def: Sporulation protein YtfJ # Organism: C.saccharolyticum # Pathway: not_defined # 1 121 1 121 136 181 83.0 7e-45 MADNYFTSTVEALFKGMDTFVTTKTVVGEAVKVDDTIILPLVDVTCGMAAGSFAENSKQK GAGGMNAKMSPSAILIIQNGVTKLVNIKQQDAMTKILDMVPDFVNKFTGGSHDISPDALK VAEEMAAAEETDQE >gi|229784026|gb|GG667709.1| GENE 14 11401 - 11517 117 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLFCFGAGMAILLFIPATIMTILFIIACMVLGYNLFCW >gi|229784026|gb|GG667709.1| GENE 15 11669 - 11854 252 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881022|ref|YP_001559990.1| ribosomal protein L28 [Clostridium phytofermentans ISDg] # 1 61 1 65 65 101 73 5e-21 MAKCAICEKAAHFGNAVSHSHRRSNKMWKANVKSVKVKVNGGAKKMYVCTSCLRSGLVER A >gi|229784026|gb|GG667709.1| GENE 16 12021 - 12377 543 118 aa, chain + ## HITS:1 COG:BS_yloU KEGG:ns NR:ns ## COG: BS_yloU COG1302 # Protein_GI_number: 16078646 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 5 118 5 119 120 96 46.0 1e-20 MKGYINNKLGEIQIGPDVIAMYAGTTAVECFGIVGMAAVSMKDGLVKLLKRESLTHGINV NIQDNHITIDFHVIVAYGISISAVADNLIENVKYKVEEFTGMTVDKINIFVEGVRVID >gi|229784026|gb|GG667709.1| GENE 17 12392 - 13237 1045 281 aa, chain + ## HITS:1 COG:BH2498 
KEGG:ns NR:ns ## COG: BH2498 COG1461 # Protein_GI_number: 15615061 # Func_class: R General function prediction only # Function: Predicted kinase related to dihydroxyacetone kinase # Organism: Bacillus halodurans # 1 279 1 288 557 215 43.0 6e-56 MGMKYIDGEMLKKAFLAGAKGLEAKKEWINELNVFPVPDGDTGTNMTMTIMSAAREVAAI EHPDMESLAKAISSGSLRGARGNSGVILSQLFRGFTKEIKTVDKITTTTLANSFVRATET AYKAVMKPKEGTILTVAKGMADKAVELVTETEDIIEFCQKVIEHGDYVLSQTPEMLPVLK QAGVVDSGGQGLMQVVKSALDCLMGKELDLSVEGVKSAAGVSGGAAVTDDFDIRFGYCTE FIVNLEKPYDEKSEHDFKGYLESIGDSIVVVSDDDIVKLAS >gi|229784026|gb|GG667709.1| GENE 18 14259 - 14891 685 210 aa, chain + ## HITS:1 COG:CAC1735 KEGG:ns NR:ns ## COG: CAC1735 COG1461 # Protein_GI_number: 15895012 # Func_class: R General function prediction only # Function: Predicted kinase related to dihydroxyacetone kinase # Organism: Clostridium acetobutylicum # 2 210 340 547 547 196 48.0 2e-50 MGEGLDEIFKGIGADYLIQGGQTMNPSTEDMLNAIDRVNADTVYILPNNKNIILAAEQAK KLTKDKHVIVIPSRTIPQGITALINFAPDLSAEENEKVMTEEMQRVKTGQITYAVRNTTI DDIEITEGDIMALGDHGILAAGKAIEQTALESMKSMVDEESELITIYYGADVSEEDAETL REKAEDSFGDCEVELHPGGQPIYYYLISVE >gi|229784026|gb|GG667709.1| GENE 19 14981 - 17032 2052 683 aa, chain + ## HITS:1 COG:CAC1736 KEGG:ns NR:ns ## COG: CAC1736 COG1200 # Protein_GI_number: 15895013 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Clostridium acetobutylicum # 8 652 9 650 678 488 41.0 1e-137 MNRQPLDSLKGIGEKTGRLFAKLGIETVDELLEYYPRAYDACEEPVSVGELKPDTVMAVS GVLFKSAEVKRYSHIQVITTMIRDITGSLTLTWYNMPYLHATLKAGMRAVFRGRVVKKNG RLTMEQPEIFLGDAYGEVIHSMQPIYGQTKGLPNKTIVRAVKQALSARQMVRDYMPLDLR IRHELAEYNFAIEHIHFPTDRTELLFARKRLVFDEFFLFLMAVRRLKDRRVDRESHFPMK PSDEVMRLVKNLPYALTNAQNKVLTEVFRDMTGKKVMNRLIQGDVGSGKTIIAVLALLET ACNGFQGALMVPTEVLARQHYESMLSLFEENGIEKQVVLVTGSMTAKEKRTAYEKIASHE ADIIVGTHALIQEKVVYDKLALVITDEQHRFGVSQREMLGDKGDEPHILVMSATPIPRTL AIIIYGDLDISIIDELPANRLPIKNCVVDTSYRSTAYSFIKGEVAAGRQAYVICPMVEAS EMIEAENVLDYSKKLQSALPGISVEYLHGKMKAREKNAIMERFAAGEIKVLVSTTVVEVG 
VNVPNATVMMIENAERFGLAQLHQLRGRVGRGKHQSYCILVDGSGQEGTRERLTILSKSN DGFYIASEDLKLRGPGDIFGLRQSGDLEFKLADIFTDANILKTVSEEVNRLLDEDPELTS EEHIELGKKLSLYLEKSYDKLNL >gi|229784026|gb|GG667709.1| GENE 20 17121 - 17813 646 230 aa, chain + ## HITS:1 COG:lin0656 KEGG:ns NR:ns ## COG: lin0656 COG5523 # Protein_GI_number: 16799731 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 68 189 44 164 345 85 37.0 9e-17 MNRQELKFDAKEKMRQAAVNPYVVTIIMGVIIVIMSVVQVVLNVWGELISSGTAVSGGEF STYIVSSIVFFVVYLLISTILQFGYSSYCLKVANRDSSMSYGDLFSSLKYLLKAIGLTLM ISIFVMLWCLLLIIPGIIAAYRYSQAIFIMVENPDKGVMQCIRESKEMMAGHKMEYFILE ISFFFWMLLGSVTCGLAYIYVYPYMTVTFANYYNRLKPVSVVYEDAGTTY >gi|229784026|gb|GG667709.1| GENE 21 17948 - 18886 1049 312 aa, chain + ## HITS:1 COG:PH0776 KEGG:ns NR:ns ## COG: PH0776 COG0714 # Protein_GI_number: 14590644 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Pyrococcus horikoshii # 1 309 3 314 314 309 52.0 5e-84 MKEAIEAIKTNIGRVLVGKEEVTDCLLSAILAGGHVLLEDVPGTGKTILAKSLAKSMELS FSRVQFTPDLLPSDITGLNYYNQKEGEFVFRRGPVFTNLLLADEINRATPRTQSSLLECM EEHQVTIDGETKKLEEPFFVIATQNPIETAGTYPLPEAQLDRFLMQIPMGLPDAAEERQI LERFGRENPYDSLTAVCDRETLLALMKEAEAVYVHPQLMDYLVELVLATRKHGEVEGGVS PRGSLAFYRAVRAYALVKGRDYVVPEDIKALAVPVLAHRLVLARTLTGRSTGRQVIEQIL NTVFVPTEEWNV >gi|229784026|gb|GG667709.1| GENE 22 18887 - 20020 845 377 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3811 NR:ns ## KEGG: Cphy_3811 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 374 1 371 380 238 37.0 3e-61 MELILMALGACILYVLQRALYERFWKKGLTADAAFQDNPAVCGSEAYLLETVINAKFLPL SILRVKFRIGRELEFVTGENSAVSDYTYRNDVFSVFPYQKITRKLTFHCSKRGYYEIKSL DLVSHDLFFSGHLIETLSLDSHLYVYPAAADGSRLNVPFQKMMGTCLSRNRLVADPFELK GIREYQTYDSMHSVNWKATARTGALKVNVHESTATQNIFLFLNMDSDYPWASEALREEAV SICGSFIGAFVSEGIPTGFFSNGMDALIGEPSFVPAGAGVSHVTRCMEALSRLRYDIPMT PFETLMRECSLEKKAEDCLYIMISSCRRPELQTAYQEFCRYSPGSLWIAPLRPEDEVTLS LCPAALAMKWEVPYDKI >gi|229784026|gb|GG667709.1| GENE 23 20007 - 
20942 519 311 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2315 NR:ns ## KEGG: bpr_I2315 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 76 304 186 420 424 69 28.0 2e-10 MTKFKWINRWITADLSGIHIMAAVLCVFLYGAGYLIPSSRLMTAGYAGLFLTILFFLLHL QLHACDNFLSMHPSTDHIPKSQMKLVNGVYMAVFLAAAGAFMAVFSLLRADWLIHGIQSV LTALLRFLAHFLPEAVNEPVESMPAGSFIPPELGEAAAGPSLFAKILDAAVTVISVIVLV ILAAALLRQLFFLLLRFLKPREDGDEKEFIKPAAVFSRAGAQRTKETPLWRDFTIEGRIR RAYKREILRRLAGRNRISRTETPEELEQKSGLPEEPSLCAVYHQVYEKARYGKGCTKEDL ECLKYQKTLEE >gi|229784026|gb|GG667709.1| GENE 24 20947 - 21561 652 204 aa, chain + ## HITS:1 COG:BH1275 KEGG:ns NR:ns ## COG: BH1275 COG0572 # Protein_GI_number: 15613838 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Bacillus halodurans # 2 203 4 207 211 196 48.0 2e-50 MKSVLIGIAGGTGSGKSTFTNRLKDAFHDDIAVLYHDNYYKKQDGIPFDERKKMNYDHPE AFETELLLDQLAGLRNGKTVQCPVYDYSRHNRSDQFLTVHPKKVILVEGILVFADQRLRD MFDIKIFVEADADERILRRVIRDVKERGRDIEGVVEQYLTTVKPMHYLYVEPTKPLADII INSGMNEVAFQLVRTSIEKILGEP >gi|229784026|gb|GG667709.1| GENE 25 21583 - 21840 303 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623321|ref|ZP_06116256.1| ## NR: gi|266623321|ref|ZP_06116256.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 85 1 85 85 165 100.0 8e-40 MFFWQKKEVLVTFSLNDFNRVKDILAANHIPYDWRTVPSCGRRGRRTLGTAFMDESVLRQ YYLYVSKKEYERAQYLVQNVFGRQS >gi|229784026|gb|GG667709.1| GENE 26 21837 - 22334 268 165 aa, chain + ## HITS:1 COG:FN1587 KEGG:ns NR:ns ## COG: FN1587 COG0454 # Protein_GI_number: 19704908 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Fusobacterium nucleatum # 35 144 14 122 137 59 29.0 3e-09 MNPIPFRIEKAVPKDYSSICCLIAEEHAMHVLARPDVFQPCDELLTVEEYEAMQKNDSFF LFTAKTIGDETAGLCFAKLTETDSPLLRKKKILYIEDFIVSESFRRQGIGRMLYAKVKET 
AALSGASSVELTVWNFNESAVSFYRALGMQVQFAHMEENIKENLI >gi|229784026|gb|GG667709.1| GENE 27 22393 - 23778 833 461 aa, chain + ## HITS:1 COG:no KEGG:Sgly_1299 NR:ns ## KEGG: Sgly_1299 # Name: not_defined # Def: transcriptional regulator, MerR family # Organism: S.glycolicus # Pathway: not_defined # 1 143 1 142 259 112 39.0 4e-23 MTIKEVEQLTNITRQNIRFYEREGLITPRRNPENQYREYSADDVKTLHRIRLYRKLNVSI EDIRRLEDRTLSLESCMEQCISNTEREMVRMTKIKEVCEELRKQDLDGKSPDVEKTLEQI DGYEKSGYQFTDISKDYWSMDVVCEVRKLNRQLIKALVFSFIMVFICNMLYFALSKPQWM LMLPASVIGYSALWLYWAGKNRAVVRKHIHFTPAPASVYLAAAVIGLLFYVIVDGSITVT TTIFPTPAALINNVTRAGSIWLSLVLFFRNMLYSLIACILFYDTYKQKSIISALALSSLL FALTCLNPYTAAGYVLFGICLGLLYEFTDSLLVSMTPVLFESALTVVLCLVELYVPGTLP EFTGTGSMASASQLLPWLLLAILLILILLYTVSRISGKKIDWSQEWEKARSISAPSSGKT TILDSAAFHEIEKERKRNYRLADGSFIAAVLIALWYMTSIL >gi|229784026|gb|GG667709.1| GENE 28 23907 - 24113 233 68 aa, chain + ## HITS:1 COG:no KEGG:Closa_1963 NR:ns ## KEGG: Closa_1963 # Name: not_defined # Def: small acid-soluble spore protein alpha/beta type # Organism: C.saccharolyticum # Pathway: not_defined # 1 68 1 67 67 98 95.0 1e-19 MSGKSSNTTNVPEAREAMDRFKMEVANEIGVPLTNGYNGNLTSAQNGSVGGYMVKKMIEA QERQMAGK >gi|229784026|gb|GG667709.1| GENE 29 24243 - 24806 269 187 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 152 13 164 199 108 41 5e-23 MRVIAGKARRLLLKTIDGLDTRPTTDRIKETLFNILNPDLPGSTFLDLFSGSGGIGIEAL SRGADRAVFIEMNPKAAECIRENLQTTKFTDESIVMNCDVLTGLKRLEGKDYVFDFVFMD PPYGKELERQVLTFLSASPLISEDTLMIVEADLKTGFDYLESLGLECRREKIYKTNKHVF IAKGEVL >gi|229784026|gb|GG667709.1| GENE 30 24803 - 25288 416 161 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 2 158 4 160 164 164 50 5e-40 MKTAVYPGSFDPVTLGHLDIIERSARMSDQLIIGVLNNNSKTPLFSVQERVNMLKEITKD LGNVEVKAFAGLLIDFVRDNQADVVIRGLRAVTDFEYELQLAQTNRVIAPEVDTIFLTTN LKYSYLSSSIVKEIAFYDGDISAFVPASVAECIRRKMADRK >gi|229784026|gb|GG667709.1| GENE 31 25325 - 25906 748 193 
aa, chain + ## HITS:1 COG:no KEGG:Closa_1966 NR:ns ## KEGG: Closa_1966 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 184 1 184 193 247 91.0 2e-64 MMSRIEQLIGEIEEYIDGCKYQPLSNSKIVVNRDELEELLVELRLRIPDEIKQYQKIISN RDAILNEAHQQADSILAQASAHTDELVNEHEIMQKAYAQANEIINQANLQAQDIVEKAVS DANDIRQSAVQYTDDMLKSLQTIISHSMEGAQGRFDAFMTSMQSSYDIVSSNRQELTGAV MPPTEGDGSMPTE >gi|229784026|gb|GG667709.1| GENE 32 25863 - 26210 241 115 aa, chain - ## HITS:1 COG:no KEGG:Closa_1967 NR:ns ## KEGG: Closa_1967 # Name: not_defined # Def: nucleoside recognition domain-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 95 194 288 305 134 72.0 1e-30 MVKIGGYIMLFSILVFYITGLPLHNPAGKAALLGFVEITTGIRAISGAIPSMAGGMWIAA AAAFGGLSGIFQTKSVIKNAGLSIRHYVAWKAVHSLLSAAIFILLAYCRRLLWAA >gi|229784026|gb|GG667709.1| GENE 33 27215 - 27412 125 65 aa, chain - ## HITS:1 COG:no KEGG:Closa_1967 NR:ns ## KEGG: Closa_1967 # Name: not_defined # Def: nucleoside recognition domain-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 63 1 63 305 72 71.0 7e-12 MKKSFFRFCSVAVLCILLLSPAAASQGAKNGLVLWATVVLPALLPFMICSNVIVALDAIH ILIAS >gi|229784026|gb|GG667709.1| GENE 34 27464 - 28273 845 269 aa, chain + ## HITS:1 COG:BS_yunA KEGG:ns NR:ns ## COG: BS_yunA COG0739 # Protein_GI_number: 16080287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Bacillus subtilis # 91 266 182 346 349 106 37.0 5e-23 MMMILLLILLLSFFHVTMIDYLYNNPTYYQLDAYTYESDDFRSMKLGNLIAESGPIDYDL LTSLMIEHNYDLTDSKDTICHTDRLLAVKPAEYRKLKQAYETVLGDLKFFPIPENTNAAS PPVTFENGWMDKRTYGGERGHEGCDIMGTERPRGYYPVVSMSDGVVEKVGWLEKGGWRIG IRAPGGAYLYYAHLYGYSREWQEGDAVKAGELLGFMGDTGYSTVEGTTGNFEVHLHVGIY MKTDHEDEMSINPYWVLKYLQKYRLKYSY >gi|229784026|gb|GG667709.1| GENE 35 28348 - 28716 445 122 aa, chain + ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure 
and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 4 119 1 105 280 63 36.0 9e-11 MSELKLSEEYLNRMRDLLGEEEFSAYLKSFDEERLYGLRVNTAKTSPEAFPELVPWDLKQ IPWIPNGFYYEGTKRPAKDPYYYAGLYYLQEPSAMTPAMLLPVEPGDRVLDLCAAPGGKL AS >gi|229784026|gb|GG667709.1| GENE 36 29783 - 30395 415 204 aa, chain + ## HITS:1 COG:SPy1246_1 KEGG:ns NR:ns ## COG: SPy1246_1 COG0144 # Protein_GI_number: 15675206 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pyogenes M1 GAS # 1 108 177 284 297 92 40.0 6e-19 MFRKDEDMVKSYEEHGPEYYSRIQKEITDQAVRMLAPGGLLLYSTCTFSRCEDEEIICHI LNRHEEMELVKLPLFEGASDGIGLNGCLRLFPHKIKGEGHFMALLHKKGGSERQRTAGLL NGLQSRKAAPLPAELTEFLSLIDREFDPSRIMIKNDSVYYLPEDFVPAKELRYLRTGLLL GELKNRRFEPGQALAMTLKADEFK Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:58:46 2011 Seq name: gi|229784025|gb|GG667710.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld103, whole genome shotgun sequence Length of sequence - 17618 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 2, operones - 2 average op.length - 9.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 153 - 518 468 ## COG5263 FOG: Glucan-binding domain (YG repeat) 2 1 Op 2 . + CDS 537 - 1739 1380 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Prom 1850 - 1909 8.4 3 2 Op 1 . + CDS 1961 - 2170 116 ## gi|266623335|ref|ZP_06116270.1| putative flagellar hook-basal body protein 4 2 Op 2 . + CDS 2167 - 2268 74 ## 5 2 Op 3 19/0.000 + CDS 2298 - 3917 1774 ## COG1766 Flagellar biosynthesis/type III secretory pathway lipoprotein 6 2 Op 4 . + CDS 3919 - 4926 1236 ## COG1536 Flagellar motor switch protein 7 2 Op 5 . + CDS 4919 - 5698 762 ## gi|288871076|ref|ZP_06410011.1| putative flagellar assembly protein 8 2 Op 6 . + CDS 5719 - 7020 1530 ## COG1157 Flagellar biosynthesis/type III secretory pathway ATPase 9 2 Op 7 . 
+ CDS 7031 - 7483 523 ## gi|266623340|ref|ZP_06116275.1| putative flagellar export protein FliJ 10 2 Op 8 . + CDS 7517 - 8917 1423 ## gi|266623341|ref|ZP_06116276.1| putative flagellar hook-length control protein 11 2 Op 9 . + CDS 8957 - 9574 689 ## TTE1435 flagellar hook capping protein 12 2 Op 10 . + CDS 9602 - 10000 566 ## Cphy_2709 flagellar operon protein 13 2 Op 11 1/0.000 + CDS 10073 - 12067 2133 ## COG4786 Flagellar basal body rod protein 14 2 Op 12 3/0.000 + CDS 12083 - 12289 303 ## COG1582 Uncharacterized protein, possibly involved in motility 15 2 Op 13 . + CDS 12335 - 13186 1102 ## COG1291 Flagellar motor component 16 2 Op 14 . + CDS 13202 - 14266 1143 ## COG1886 Flagellar motor switch/type III secretory pathway protein 17 2 Op 15 1/0.000 + CDS 14269 - 15543 1313 ## COG1298 Flagellar biosynthesis pathway, component FlhA + Prom 16448 - 16507 7.5 18 2 Op 16 2/0.000 + CDS 16579 - 17217 617 ## COG1298 Flagellar biosynthesis pathway, component FlhA 19 2 Op 17 . + CDS 17224 - 17617 454 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit Predicted protein(s) >gi|229784025|gb|GG667710.1| GENE 1 153 - 518 468 121 aa, chain + ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 4 117 517 631 693 120 49.0 8e-28 MNSWQLINDKWYLFNMDGYMLTGWQTKNGHEYYLTSNGDMVTGWLQDNRVWYYLDPQVGK VSGGWLQQGSDWYYLNPDGSMAVGWIRYKDNWYYLDPDNGRMVKDRMVETYYINSDGIWI P >gi|229784025|gb|GG667710.1| GENE 2 537 - 1739 1380 400 aa, chain + ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 262 396 516 651 693 127 45.0 6e-29 MLRRLRQAAAAVLACGLLTAALPNASGTVWAAVKNETRTPITSVSITVRSDVKADYDLDA 
ATVYVTTDSSLYTIGAYTWVNGNKEYWEPGDVPKVQIEIHARSGCYFEKTTGAGKFQITG AAYASVKRQNNNETLLLTVKLTPASGTLDIPNSAEWVGYPLGKGTWEAVPYAGAYELKLY RDGQMIQGVAKVNATTYDFYPFMTQAGRYQFRVRAIPKDTEEQGYITSGDWVYSDEQDID DDQTYSQGGGRQNSNLTPANIGWVKNSDGWWYRNADGSYPANTWQNIDGAWYLFDYDGYI LTGWQLKNGKYYYLDSNGAMQTGWFQDNRKWYYLSDDGSMQTGWLTLGGSTYYFDGDGSM HTGWLLDSGKWYYFAPDTGAMVRGTSVGGYYLNSDGVWVR >gi|229784025|gb|GG667710.1| GENE 3 1961 - 2170 116 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623335|ref|ZP_06116270.1| ## NR: gi|266623335|ref|ZP_06116270.1| putative flagellar hook-basal body protein [Clostridium hathewayi DSM 13479] putative flagellar hook-basal body protein [Clostridium hathewayi DSM 13479] # 1 69 1 69 69 116 100.0 5e-25 MFIVPITRLEGPASLEEMKKLDKAPQKQGKSSFQNILTDAVRKADQAVKDTREMDVKLAA VRSIIYIQH >gi|229784025|gb|GG667710.1| GENE 4 2167 - 2268 74 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIQAEQSAAAIEFTTQLTSKAVSAYTQIMGMQI >gi|229784025|gb|GG667710.1| GENE 5 2298 - 3917 1774 539 aa, chain + ## HITS:1 COG:BS_fliF KEGG:ns NR:ns ## COG: BS_fliF COG1766 # Protein_GI_number: 16078684 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway lipoprotein # Organism: Bacillus subtilis # 23 537 20 536 536 105 24.0 3e-22 MGVKDKVKTAVDGSGTVGRLLGSKKRKAAAAGVLAAVIVALAAAILLNTNKAGEYVVLFP GISREENNEILAVLNGRGVGAKRNAEGEVTVPENQVGDIMLDMSELGYPKTALPFDIFSD NMGFTTTEFEKRQYLLLNLQDRIERTLKDMNGIKNAIVTLNVSDDSNYVWDDQGADSTGA VSLTLMPDCELSPEKVGAIKNLIANSVPRLLPENVTVVNAETMQEMVSDDVGGLTSYGLG RLDFETKVEKRLQDKIMNVLTLAYPPGQIRVSATVVIDYDKMITEDLQYEPQDNGQGVVE HYKTGQSVNGQGGGAGGVAGEENNTDIPTYGAAGTGGANGANGDYYRDVDYLVGYIKKQI EKDNVKLQKATVAITVNDSNLTESKKQQLIDAASKAANIAPEDIVVSSFQQIASDKTPEP EKPVIPVTAPGVGGIDPRILIAGAAAGIFLFLVLLFLILRHRKKKQRKEDQELFGPFEGD AVELDRDIAERAEESAVENRLTNIPVKPQDPVNQVRSFAEDNPEIIASMISNWLKEDKK >gi|229784025|gb|GG667710.1| GENE 6 3919 - 4926 1236 335 aa, chain + ## HITS:1 COG:BH2457 KEGG:ns NR:ns ## COG: BH2457 COG1536 # Protein_GI_number: 15615020 # 
Func_class: N Cell motility # Function: Flagellar motor switch protein # Organism: Bacillus halodurans # 10 335 10 335 335 271 45.0 1e-72 MPAEERRPAEKAAAVISMLGTERASEVFRYLTENEVEQLSMEITRLPHLAPDEMEDIAKD FYNCCVTEKVITEGGKDYAKEVLEKAFGQQQARNLMDRVSKALKTKAFSFIRKVDYKTLM TAIQNEHPQTLALILSYATPEQASKIIANLPQETRVDVVERIAAMDRALPMAIKIAEEVL EKKIGSSSSEETMEVGGLNYIADIMNHVDRSTERDIFDELNLKNPQLAEDVRKLMFVFED IAYLDPLSIQRFLRETDSKDLAVALKVSNKDVMNAIFANMSNRMRESIQSDMEYLHNIRM SDVEEAQQRIVAVIRRLEEEGEIVISKDGKDEIIV >gi|229784025|gb|GG667710.1| GENE 7 4919 - 5698 762 259 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871076|ref|ZP_06410011.1| ## NR: gi|288871076|ref|ZP_06410011.1| putative flagellar assembly protein [Clostridium hathewayi DSM 13479] putative flagellar assembly protein [Clostridium hathewayi DSM 13479] # 1 259 7 265 265 386 99.0 1e-106 MYRVIKSGQTVLGEGRVPAYQMPGPSADQAETANGEPETAVQEAADGSIPRYALISEEKR KILEHARSQAEQSAARILEEAYAQRDKIVNTALAEAERLKKQAEEEGYEDGINRAEADIS TGLVRIGQAVEQAGRRLEEHSEEIKAGITEIALMIAEKILQREVDAGRAGLADMVEQAVL SERDKNEITVHLSDDSIALVEELERRLEPLRDRNGGYVRIRPEQQQPGYVQIETEEGIVD ASVFVQLENLKRQLAELLK >gi|229784025|gb|GG667710.1| GENE 8 5719 - 7020 1530 433 aa, chain + ## HITS:1 COG:TM0218 KEGG:ns NR:ns ## COG: TM0218 COG1157 # Protein_GI_number: 15642991 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway ATPase # Organism: Thermotoga maritima # 1 431 4 432 438 466 54.0 1e-131 MIESVRQAVQNANLYTYLGKIDKIVGTTVESMGPACRIGDVCTIDVPGGSPPVMAEVVGF RENRVLLMPYEETEGIGYGSTVRNTGERLLIPVSNELIGRTVDAMGMPIDGQGPLGEVAY YPITGKRSNPMERPRIDTVIQMGVKAIDGLLTIGKGQRMGIFAGSGVGKSTLMGMIAKNV KADVNVIALVGERSREVVEFINRDLGEEGLKRSVLVVATSNQSAMMRSKCTMTATTIAEY FKDQGMDVLLMMDSLTRFAMAQREVGLSIGEPPVARGYTPSIYSELPKLLERSGNFETGS ITGIYTVLVEGDDANEPVSDTVRGIIDGHIMLSRKVATQNRYPAIDILSSVSRLMSDIVM PEHKEAAGKIRKLLSVYESNIDLVSIGAYKKGTNKELDDALMRIDKIYEFLQQKTDESFT LDEAVIRMIQLTE >gi|229784025|gb|GG667710.1| GENE 9 7031 - 7483 523 150 aa, chain + ## HITS:1 COG:no 
KEGG:no NR:gi|266623340|ref|ZP_06116275.1| ## NR: gi|266623340|ref|ZP_06116275.1| putative flagellar export protein FliJ [Clostridium hathewayi DSM 13479] putative flagellar export protein FliJ [Clostridium hathewayi DSM 13479] # 1 150 1 150 150 171 100.0 2e-41 MKKFAFSLERMLNFQSQNLEKEMGILGRMTAERDALEARKRDMAEKAAGIQAEIARREAE GTTIFMLKACYSILESARNQLEELEKERKLLQAGLERQRQVVTEASREVKKLEKLKEKQL EEYHRGEAKEQQETIAEHVAGNFVRRGASQ >gi|229784025|gb|GG667710.1| GENE 10 7517 - 8917 1423 466 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623341|ref|ZP_06116276.1| ## NR: gi|266623341|ref|ZP_06116276.1| putative flagellar hook-length control protein [Clostridium hathewayi DSM 13479] putative flagellar hook-length control protein [Clostridium hathewayi DSM 13479] # 1 466 1 466 466 699 100.0 0 MNMETAMKASVTGWKQQVTADHGKGRAAGQANFADTFSEMLKTGKKAAEDRNQSQSARKT PEIQTGSKTDDTGNDTVRAEVSGPADTRAEKAPEKKPDTKKTEDSAQEEAGKTADLEIMA EQAAVMAEAPAVPLINGANAEEDGLQAVTEAGSVSDQTEAAAVPGSGTGAGYEAVQAAAA ENAGASDAVNTREMRFQPAAQKVEETENGQKTEAVKSAEAEHTALAGHLPVSDGNGAGRE EGSFSDLLKGHSEQMKAMGTGTQVQTEEETVSDDAMDNRMLEELKKNSDGKGMSFQDRLS VSQQGALKGVPTVDNTASDLHTPVAEQLKAGVEQGMKRELSEFTIKLKPEGLGEIVVHMA SAGGRTSVSIGVANPETEKLVNSQMMSLKEMLEPLHAEVEEVYHNSQGGMDFAGFGQELY QNQGQQARGHYRRGVNQGLMADDGVFIQETERMAAESLVRRLYAYV >gi|229784025|gb|GG667710.1| GENE 11 8957 - 9574 689 205 aa, chain + ## HITS:1 COG:no KEGG:TTE1435 NR:ns ## KEGG: TTE1435 # Name: flgD # Def: flagellar hook capping protein # Organism: T.tengcongensis # Pathway: Flagellar assembly [PATH:tte02040] # 70 172 27 129 131 70 36.0 4e-11 MSVQIPEYSNPYLRTASSGSSGPGVPGDAAPAKASEDGEKETEGTNGASGNGDVIDAVFG DASDKAVSMEDFLTLMVAQLKNQDFMNPVDDTQYVTQLAQISTMQQMEEMAYNAKSTYVT SLVGKTVTAAKFTVAGDLKKTEGVVNKVSLLDGKFVIYVEGESYGMDEIMEIRTEAPEKD GTEPPDSEGSGDGDGENPGDKDSKI >gi|229784025|gb|GG667710.1| GENE 12 9602 - 10000 566 132 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2709 NR:ns ## KEGG: Cphy_2709 # Name: not_defined # Def: flagellar operon protein # Organism: C.phytofermentans # Pathway: not_defined # 34 132 34 132 132 89 47.0 
4e-17 MDSMIYNKMLHTPICTGKPWEAARNKPAEGTEGGSAFKELLEQKLKEESQVTFSKHAVER VMERSVDVSSDKLERLNEGVKLAGEKGLKAPLILMDSTAFVVNVKNNKVVTVVNEESLKG TVFTNIDGTVMV >gi|229784025|gb|GG667710.1| GENE 13 10073 - 12067 2133 664 aa, chain + ## HITS:1 COG:BH2449 KEGG:ns NR:ns ## COG: BH2449 COG4786 # Protein_GI_number: 15615012 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Bacillus halodurans # 1 144 1 142 263 130 50.0 1e-29 MLRSLYSAVSGMKAHQTKLDVIGNNIANVNTYGFKSSRARFQDVFYQTLQSAAGGDNNKG GTNASQVGYGSQLGGIDLDMSRSALQSTGRPLDVAIAGEGFFQVMDADGNIFYTRAGNLM LDSNSGNLVDSNGYTVLGVSGDPLGKAPSSDKIHLNIPTKSSGLASKTLTINGTQFTITS EKATTDANVSITFKAEDNLPSGSDIVVNPSDINDSSITVHVNKNAIFTSLSDFNSKMNSA ITRGCGGKAHPGGNFTITATPAENVFQKPLTGAELLGDHFGINEGTFEFVTSTDAADGIF GGLKFSGMSTSPAFDAAGPVDFKSELRPGTGGKMAWVVTATVDGREFVGEINENTTTAGK MWLKEAGTGQYIEMSHPGYTAITSSFNAGTGNTAQSFHNVTMPAAAGTGFFNKSMSLKDY GTSGINPGDDIEYSAVKIPASAGNKAKWHVTAVSGGKTYSGDISEGDAAGDLVLTEEGGG TLTIDHKGYSDINTAFLNADEADKTYASFVPKDHAGTITAATQSNDAGLGSGTFTLENGT EGGAISLSGASIAILGNGIIEATHPDKGKIQIGRIDLVTFENPYGLEESGNSYFRATSNS GEAKACQAGEDGTGALKTSSLEMSNVDISTEFSDMIVTQRGFQANSRIITVSDSILEELV NLKR >gi|229784025|gb|GG667710.1| GENE 14 12083 - 12289 303 68 aa, chain + ## HITS:1 COG:TM0675 KEGG:ns NR:ns ## COG: TM0675 COG1582 # Protein_GI_number: 15643439 # Func_class: N Cell motility # Function: Uncharacterized protein, possibly involved in motility # Organism: Thermotoga maritima # 1 67 1 67 67 66 55.0 9e-12 MITLSKLNGEEFVLNCDLIEMIMENPDTTILLTNGKHLIVRESKEEVVERVAAFRRKTFS DLIEQIRG >gi|229784025|gb|GG667710.1| GENE 15 12335 - 13186 1102 283 aa, chain + ## HITS:1 COG:BH3240 KEGG:ns NR:ns ## COG: BH3240 COG1291 # Protein_GI_number: 15615802 # Func_class: N Cell motility # Function: Flagellar motor component # Organism: Bacillus halodurans # 2 266 5 265 269 186 37.0 4e-47 MDIMSILGFILAVVLVLFGMTFDQEAMKLVFHNLRAFIDLPSMAITIGGTIGVMMLSFPS GAFKKIGKHLKIIIKPYKFDPEQSIEQIVELATEARMKGLLSLEDKLNEIDEPFMHNSLL LVVDSVDSEKVRQVMETELDQLDERHALDRGFYEKAASFAPAFGMIGTLVGLILMLGNMQ 
DVNALAKGMATALITTLYGSLLANIVCLPVASKLKARHDEEFLCKQLVLEGVLAIQDGEN PKFIEEKLYKLLPASRKKAEEDTEKEDGKQKKKSRRKSRKNEE >gi|229784025|gb|GG667710.1| GENE 16 13202 - 14266 1143 354 aa, chain + ## HITS:1 COG:BH2445_2 KEGG:ns NR:ns ## COG: BH2445_2 COG1886 # Protein_GI_number: 15615008 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar motor switch/type III secretory pathway protein # Organism: Bacillus halodurans # 270 350 28 108 112 100 58.0 3e-21 MAKEVENQGIMNRTDRQVFEELIHLHAKAAGESLSSILKADIEISSPEITEITVKEVEYG ILEPAVFVKSCLTSGVAGNTVMILRQRDMQAFLNELMGIDDLPDPDFEFDEVAMSAAAEL MNQMVHASVEAMAEYLGNTMESSDCQLILSDGRQNLSPAIGEAPESKTIVIHYRVKIKDM VESEFMECISITAADSIYQEIGARREAEKRAEAEAKAMEESLFVKPVQQGAAPAAAGRES LQSGMESRTANSVTADTGHAVYQASPSINGNLGLIMDVPLNVSVEIGKTRRRLKDVLNFN NGTVVELDKQADAPVDIIVNGQLIARGEVVVIDDNFGVRISEIVNARSIIGNGE >gi|229784025|gb|GG667710.1| GENE 17 14269 - 15543 1313 424 aa, chain + ## HITS:1 COG:TM0908 KEGG:ns NR:ns ## COG: TM0908 COG1298 # Protein_GI_number: 15643670 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhA # Organism: Thermotoga maritima # 1 419 1 415 678 376 48.0 1e-104 MKRFNILVTAGIIGIIFLILIPLPTPILDFLFILNITLSLLILVMTMYIRETLEFSVFPS LLLITTLFRLGLNISSTRKILFDNGYAGQVVKTFGQFVIRGNAVIGFIIFLIIVLVQFIV ITKGSERVAEVAARFTLDAMPGKQMAIDADLNSGLIDEQQARERRLKIQREADFFGSMDG ATKFVKGDAIISIIITLINFIGGIVVGMMNGQGSISEIMQIYTTSTIGDGLVSQIPALLI SVATGMVVTRSASENSLSEDLTSQFLAQPRVLMTAGGAAACLCLIPGFPVPQILMISAMM IGGGYALMRRQKTIVADAAEGYVEAEVTSEASFYKNIENVYGLLTVEQIEMEFGYSLIPL ADESNGGSFIDRVVMFRKQMALDMGFVIPSVRIKDSGQLNPNQYSILLKGEEVARGDILM DLAS >gi|229784025|gb|GG667710.1| GENE 18 16579 - 17217 617 212 aa, chain + ## HITS:1 COG:BH2438 KEGG:ns NR:ns ## COG: BH2438 COG1298 # Protein_GI_number: 15615001 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhA # Organism: 
Bacillus halodurans # 1 205 468 672 677 174 39.0 2e-43 MVITHLSEVVKEHLHELLNRQEVNNLLETLKKTNASIIEDTIPAVISVGELQKVLSNLLR EGIPIRDMETIVETLGDYGTQIKDTDMLTEYVRQALKRTISHRFSEAGQMKVLSLDDKIE NMIMSSVKKVDTGSYLALEPSAIQNIVTSATGEINKIKDLVNVPIVLTSPVVRIYFKKLI DQFYPNVTVVSFSEIDNNIQIQALGNIALGQR >gi|229784025|gb|GG667710.1| GENE 19 17224 - 17617 454 131 aa, chain + ## HITS:1 COG:PA1455 KEGG:ns NR:ns ## COG: PA1455 COG1191 # Protein_GI_number: 15596652 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Pseudomonas aeruginosa # 15 117 7 109 247 75 38.0 2e-14 MQMEQMSREEGKTVLSVYLKNGSQELRNKLIVHFLPIVRSAAAQLRGMAGSFTEEEDLID QGVLALMECLDRYDASKGAQFETYAFIRVRGAMIDYIRSQDWVPHRARSFQKKVDEAYSM LAHEKMREPEA Prediction of potential genes in microbial genomes Time: Fri Jul 1 01:59:52 2011 Seq name: gi|229784024|gb|GG667711.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld104, whole genome shotgun sequence Length of sequence - 27669 bp Number of predicted genes - 24, with homology - 22 Number of transcription units - 13, operones - 7 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1578 1143 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 2 1 Op 2 . - CDS 1587 - 2354 634 ## COG1414 Transcriptional regulator - Prom 2524 - 2583 8.7 - Term 2625 - 2667 -1.0 3 2 Op 1 . - CDS 2840 - 3445 708 ## COG5263 FOG: Glucan-binding domain (YG repeat) 4 2 Op 2 . - CDS 3525 - 4985 1531 ## Dhaf_0170 sodium/sulfate symporter - Prom 5011 - 5070 5.0 + Prom 4966 - 5025 6.3 5 3 Tu 1 . + CDS 5147 - 5818 686 ## SGGBAA2069_c08960 global nitrogen-responsive regulatory protein + Term 5843 - 5890 14.0 - Term 5831 - 5878 13.1 6 4 Op 1 1/0.000 - CDS 5948 - 7423 1706 ## COG1109 Phosphomannomutase 7 4 Op 2 . - CDS 7420 - 8235 754 ## COG3459 Cellobiose phosphorylase - Prom 8287 - 8346 9.1 8 5 Tu 1 . - CDS 9249 - 10697 1309 ## COG3459 Cellobiose phosphorylase - Prom 10828 - 10887 5.8 + Prom 10796 - 10855 5.2 9 6 Tu 1 . 
+ CDS 10876 - 12108 1207 ## COG0205 6-phosphofructokinase + Term 12116 - 12166 15.4 - Term 12104 - 12154 11.6 10 7 Tu 1 . - CDS 12175 - 12315 120 ## gi|288871082|ref|ZP_06116296.2| hypothetical protein CLOSTHATH_04646 - Prom 12494 - 12553 4.7 - Term 12459 - 12502 -0.9 11 8 Op 1 41/0.000 - CDS 12585 - 13511 1153 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 12 8 Op 2 . - CDS 13513 - 14256 870 ## COG0396 ABC-type transport system involved in Fe-S cluster assembly, ATPase component - Prom 14365 - 14424 8.0 + Prom 14288 - 14347 10.1 13 9 Tu 1 . + CDS 14431 - 15645 923 ## Closa_2275 diguanylate cyclase 14 10 Op 1 . - CDS 16558 - 16716 73 ## - TRNA 16585 - 16671 64.5 # Leu CAG 0 0 15 10 Op 2 11/0.000 - CDS 16789 - 18099 1133 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 16 10 Op 3 11/0.000 - CDS 18096 - 18599 255 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 17 10 Op 4 . - CDS 18602 - 19000 188 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 18 11 Tu 1 . - CDS 19928 - 20356 180 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 - Prom 20567 - 20626 6.0 - Term 20432 - 20463 1.6 19 12 Op 1 7/0.000 - CDS 20666 - 21520 719 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 20 12 Op 2 . - CDS 21585 - 23162 1272 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 21 12 Op 3 . - CDS 23159 - 23350 63 ## - Prom 23419 - 23478 3.1 - TRNA 23214 - 23297 56.4 # Leu AAG 0 0 - TRNA 23470 - 23556 64.5 # Leu CAG 0 0 + TRNA 23715 - 23798 56.4 # Leu AAG 0 0 22 13 Op 1 . - CDS 23879 - 25807 1317 ## Corgl_0562 protein of unknown function DUF303 acetylesterase 23 13 Op 2 38/0.000 - CDS 25813 - 26697 853 ## COG0395 ABC-type sugar transport system, permease component 24 13 Op 3 . 
- CDS 26701 - 27573 1057 ## COG1175 ABC-type sugar transport systems, permease components Predicted protein(s) >gi|229784024|gb|GG667711.1| GENE 1 3 - 1578 1143 525 aa, chain - ## HITS:1 COG:MA1802 KEGG:ns NR:ns ## COG: MA1802 COG0129 # Protein_GI_number: 20090653 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Methanosarcina acetivorans str.C2A # 12 525 2 519 553 528 53.0 1e-150 MEDFTYKNTKYRSAVVVNGMSMAGPRAHFRGMGLSNEELRKPFIGVINTHNEMHPGHVHL DKLAEQVKAGVSEAGGVPFEVNTISICDGFTQGHAGMCSVLPSREVIADSIEVYVNAHML DGLVLLGGCDKIVPAMLMAMLRLNIPAILVTGGPMMPAVYQSKSYATYELKEMAGKLLKG EISKEEYEEMEGIMSPGPGSCAMMGTANSMSVAAEAMGLTLPGCAGAHAVAGKKKRIARQ SGVQIVRLVEEHICPRDIVTQAMLDLSARVCVSVGGSTNIMLHYPAIAREGRLHMTMKEL GEISKTTPYLAKIKPSGIHTMLDFDEAGGVGAVLKALSGMVNLDLMTVNGKTHRQNTDQL TIWNQEVIRTVEQAYAPEGSIAVLHGNLAPKGCVVKQSAVAPQMRKHTGPARCFECEEEA VKAIYGGSIQDGDVIVIRNEGPKGGPGMREMLTATAALVGMGYSEKVALITDGRFSGATR GPCIGHISPETSQRGPIAALMDGDRITIDIDGGSLSVELSEEDLK >gi|229784024|gb|GG667711.1| GENE 2 1587 - 2354 634 255 aa, chain - ## HITS:1 COG:STM1842 KEGG:ns NR:ns ## COG: STM1842 COG1414 # Protein_GI_number: 16765183 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Salmonella typhimurium LT2 # 2 253 5 256 263 137 33.0 1e-32 MDLQEQMKEYSALKALQIILALAEQKGWVNLSDIAEATGLSPSTVHRILQELQACGFVSK NKEARQYKLGMGMMNIALKVNMSDYLLEAAQEEMTRLNELSLETIHLIAPDNDKAVYIGK MDAKNQIQLRSRIGWKIPLQCTSGGKLILAYHSREWVDSYLNYNPLKQYTNNTIIHKETL HRELELIRQQGYSLDNREHNPDIVCIAAPIFGIDGNLAGTIGISAPDYRFSPEKACSFAE EVKRSAAAITEKLKA >gi|229784024|gb|GG667711.1| GENE 3 2840 - 3445 708 201 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 82 195 486 593 693 78 35.0 1e-14 MKKHCKKLAVYLISGSLLVSTFTVIPGTPFSIEAEAHSGRTDANGGHKDNQNKSGLGSYH 
YHCGGYPAHLHPNGVCPYQSGAASEDRQTTQQTEQKPAVQPETKAETQPETQATAIGWHQ DASGWYYCKGDNQYYKDGWQELDSKWYYFDVNGYMKTGWLILGDDWYYLASDGHMVTGDM EIDGTLYYFETNGVLVDDYYD >gi|229784024|gb|GG667711.1| GENE 4 3525 - 4985 1531 486 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_0170 NR:ns ## KEGG: Dhaf_0170 # Name: not_defined # Def: sodium/sulfate symporter # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 7 484 24 504 507 228 33.0 6e-58 MKQKKNMYPFHCMIGIAIMLLFRLLPIHMPNITPVGKQIFGIFIGTLYLWTTVDPVWSSI LCIFMIGTSGYAPMGAVLNTAFGNPTVVQLFFLMILMNSLQTNRISAYIGRFFLTRKFII GRPWAFTAILFIGSMLIAAFVGCFAPIFLFWPILYDIFEDIGMKPFEAYPTIMLILIVVA CAIGFPVPPYMSNGLALISNYTTVSQNLLGTTVVINNGQWLLVALIYGFLCSGAVILFTK YVLHPDVSRLKNLTLEQLNKNPLPPLSKRQKAVALICVCFLLSMLLPSLFPTLPGMAWLS ANSFGMALLFLVILFAVKIDGTPAITIEENIGVFAWGTFFLVTSAILLGGVLTNESTGVT AFLNTILSPIFTGMSPLVFGAMLLIIGCVLTNVCNSLVIGMILQPVVATYCLQAGINSAP LVSIMGIFVLSCAIATPAASPFAAMLFSNKNWIKSGDIYRYNIMYVVLELVLALLVGLPL ANALIH >gi|229784024|gb|GG667711.1| GENE 5 5147 - 5818 686 223 aa, chain + ## HITS:1 COG:no KEGG:SGGBAA2069_c08960 NR:ns ## KEGG: SGGBAA2069_c08960 # Name: not_defined # Def: global nitrogen-responsive regulatory protein # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 35 201 36 198 223 80 30.0 6e-14 MEPYQIREDFIRCGIRREAKKNQYLFDPSREGCDNICFLDEGIAALTRINDDGEEHVYLY FGEKRLVGFANALVKHFPYDWQRYITPSPFWITAKTNCVYYSMREKQFESMMDSSPYFTQ CVLGATTLNYLELVNKIQWMMDGDKTAQFCKWLLTYSIKKRGKSMIPKAFSFVEVANYLG MHPVTVSRIARKLRETGIIAREDGCLVIQDEEKLRAMTLLQNG >gi|229784024|gb|GG667711.1| GENE 6 5948 - 7423 1706 491 aa, chain - ## HITS:1 COG:VC0611 KEGG:ns NR:ns ## COG: VC0611 COG1109 # Protein_GI_number: 15640631 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Vibrio cholerae # 1 484 1 468 470 521 53.0 1e-148 MIKFGTGGWRAVIGDEFTRENIQKLSRALVNKMKDEGVAEQGIAIGYDRRFLSKEAAIWA SEVFAAEGVPVKFILGSAPTPLIMFFVMRHEMPYGMMITASHNPAVYNGIKVFTKGGRDA DELQTQEIERYIHGIEEEVLKAACVTRGSDSEEPAVRVMEYEKGIETGIIEEINPLNEYL DNIISAVNMQAIRDRGLRIAIDPMYGVSNTSLRTILATARCDVQVIHEEHDTLFGGKLPA 
PSAFTLRSLQNYVVDRKCDIGIATDGDADRVGVIDDTGKFLHPNDILVMLYYYLVKYKGW KGAVVRNVATTHMLDKVAEGFGEKCYEVPVGFKYISAKMNETNAIIGGESSGGLTVRGHI NGKDGIYASTLLVEMIAVTGKKLSQIARDIEDEYGAIHMEERDYKFSQEKKDQIYDTLLI QKQLPEMDLEIDKISYLDGCKVYFKNGGWIIARFSGTEPLLRIFCEMGDPKDAIRMCECF EEFLNIKGNRN >gi|229784024|gb|GG667711.1| GENE 7 7420 - 8235 754 271 aa, chain - ## HITS:1 COG:VC0612 KEGG:ns NR:ns ## COG: VC0612 COG3459 # Protein_GI_number: 15640632 # Func_class: G Carbohydrate transport and metabolism # Function: Cellobiose phosphorylase # Organism: Vibrio cholerae # 1 265 534 801 801 360 62.0 2e-99 MLRTVKETCDRELWDDGWYIRGITKNGRKIGTHQDREGRLHLESNSWAVLSGAADYEKGI KAMDAVDEYLYTPYGIMLNGPSYTVPDDDIGFVTRVYPGVKENGSIFSHPNPWAWAAECR LGRGDRAMKFYHALCPYYQNDIIEIREAEPYSYCQFIMGKDHTAYGRARHPFMTGSGGWA YFSATRYMLGIRPQFDSLEIDPCIPADWKEFSVTRVWRGAEYRISVKNPEGVMKGVKELI LDGRKVSKIPVSEAGSSHEVTVVMGDKGGES >gi|229784024|gb|GG667711.1| GENE 8 9249 - 10697 1309 482 aa, chain - ## HITS:1 COG:VC0612 KEGG:ns NR:ns ## COG: VC0612 COG3459 # Protein_GI_number: 15640632 # Func_class: G Carbohydrate transport and metabolism # Function: Cellobiose phosphorylase # Organism: Vibrio cholerae # 1 480 1 482 801 611 57.0 1e-175 MQYGHFDNEKREYVIDRVDLPTSWTNYLGVKDMCAVVNHTAGGYLFYKSPEYHRITRFRG NAVPMDRPGHYVYVRDDETGEFWSISWQPVGKPLDQAKYTCRHGLSYTTYSCDYQGIEAE QTLFIPIDDPVELWDVKLKNESGRKRKLSVYSYCELSFHHIEMDNKNFQMSLYAAGSSFE DGIIEHDLFYEEFGYQYFTSDFKPDGYDCLRDKFIGLYHTEDNPVAVERGEMSGSSEKGG NHCGALMRRLELEPEEETRLIFLLGEGKREAGRAMRAKYSDHGAVDRAYSDLRAFWDDKC SRLQIQTPDEGMNTLINTWTLYQAEINVMFSRFASFIEVGGRTGLGYRDTAQDAMTVPHS NPEKCRQRLVELLRGLVSAGYGLHLFQPEWFDPDTEVKPFKSPTVVPTPKVSDMIHGLED TCSDDALWLIASIVEYVKETGEYGFFDEIITYADGGSGTVYEHMKKILDFSAKQIGAHGV AS >gi|229784024|gb|GG667711.1| GENE 9 10876 - 12108 1207 410 aa, chain + ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 2 406 13 416 427 206 35.0 6e-53 
MKKNAIVGQSGGPTAVINASLYGVIKEGINQEAIDHVYGMINGIEGYMSGTCMDLTADLT SDELELLKLTPAAYLGSCRYKLPEDLSSPFYPALFEKFRKMNIGYFFYIGGNDSMDTVSK LSRYASHHHSEIRFIGVPKTIDNDLVLTDHTPGYGSAAKYVADTVREIVLDSSVYQQKSV TIVELMGRHAGWVTAASALARKYEGDNPVLIYLPETHFDFEQFTEDVTAALSKNSTVVIC ISEGISDARGKFICEYADAARLDTFGHKMLTGSGKMLENFIRDKFGVKVRSIELNVNQRC SGMMASATDIEESVLAGREGVKAAVSGLTGRMISFLRNENGPYHLECGTVDVNEVCNREK LFPSEWMNSRGTDVTQAFLDYALPLIQGEVQRKTESGRPVYLYRKSLVQK >gi|229784024|gb|GG667711.1| GENE 10 12175 - 12315 120 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871082|ref|ZP_06116296.2| ## NR: gi|288871082|ref|ZP_06116296.2| hypothetical protein CLOSTHATH_04646 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04646 [Clostridium hathewayi DSM 13479] # 1 46 89 134 134 93 100.0 4e-18 MQTKQVYNFFSGCGTAVSYRIFGKTLTRHFYNGVSIPLSNLSEMHE >gi|229784024|gb|GG667711.1| GENE 11 12585 - 13511 1153 308 aa, chain - ## HITS:1 COG:MA4407 KEGG:ns NR:ns ## COG: MA4407 COG0719 # Protein_GI_number: 20093194 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Methanosarcina acetivorans str.C2A # 46 307 122 381 407 92 30.0 1e-18 MAKMDEIQKRLLEEVADLHDVPEGAYNLRANGQTAGRNTTANIDITSKDDGTGIDIHIKP GTKHESVHIPVVLSESGLTEVVYNDFYIGDDADVVIVAGCGISNCGQQDSRHDGIHRFYL GKNSKVKYVEKHYGEGDGNGKRIMNPGTEVYLEENSYMEMEMVQIKGVDSTNRNTKAELS AGAKLVIQERLMTHGAQEAVSGFVVNLNGDDSSANVISRAVARDDSRQTFLAEINGNSRC AGHSECDAIIMDHANISAIPKITANNTDAALIHEAAIGKIAGDQIIKLMTLGLSEAEAET QIVNGFLK >gi|229784024|gb|GG667711.1| GENE 12 13513 - 14256 870 247 aa, chain - ## HITS:1 COG:MJ0035 KEGG:ns NR:ns ## COG: MJ0035 COG0396 # Protein_GI_number: 15668205 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, ATPase component # Organism: Methanococcus jannaschii # 15 222 16 230 250 158 41.0 1e-38 MLTLENLSFGVQDEKSQKEIIRNLNLTIESNKFVVVTGPNGGGKSTLAKLIAGIEKPTGG 
RIQLDGVDITEKNITERAQSGISYAFQQPVRFKGIEVFDLLRLASGKQLTITDACQYLSE VGLCAKDYIKREVNDSLSGGELKRIEIATVLARGTKLSIFDEPEAGIDLWSFQNLIAVFE RMRHNIKNGSILIISHQERILDIADEIVVIANGQISMQGPKDDILPHLLGTASAVDACTK FYNKEEA >gi|229784024|gb|GG667711.1| GENE 13 14431 - 15645 923 404 aa, chain + ## HITS:1 COG:no KEGG:Closa_2275 NR:ns ## KEGG: Closa_2275 # Name: not_defined # Def: diguanylate cyclase # Organism: C.saccharolyticum # Pathway: not_defined # 1 403 1 400 547 510 62.0 1e-143 MKSDMILDNIPCGIVKLKAESGFPVLYANDEFKQLHGAAKRLEDIVEPPDYQELEQDILG HLGDAPARFELEFRSPTEGGDFCWYLIRINYVPGGGEEVLYGILIDINDRKKRQEELLIR EEQYRLASRHSGCNISIYDIPSKTLCQPHDSAFSFPSPISCPPTPDFIIDNGIIHRESVQ DFLDFYDSMERGVPEGKSIIRMKMKSGEYHWFSACYSLIRGQNREPLRSIISYQDITGQY EKELAYQKWMEYMNEQKKDCIGYYEYNLKYDLFEEIMGELTSTLPEYARNTFSDIINYIA EHFIYEEDRENYLKVFNRNQLLYHYYHGNRSLQTEHRRLRPDGTTYWCLGLIQIVPDPYT DTIKAFILIKDIDEEKKEALTLQEMSEQDSLTGLLNRGTSVRAS >gi|229784024|gb|GG667711.1| GENE 14 16558 - 16716 73 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNWTTNSYNYMLYGNAEVAELADAHGSGPCGSNTVRVQVPSSARKKQRTGAS >gi|229784024|gb|GG667711.1| GENE 15 16789 - 18099 1133 436 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. 
PR1] # 10 434 7 430 431 441 52 1e-123 MNTIAIACGLIILVLLVIMLLAGVPIAVALAVSSICAILPILNTGAAVLTGAQRIFSGIS VFSLLAIPFFILAGNIMNKGGIAIRLINFAKLFTGRIPGALAHTNAVANMLFGAISGSGT AAASAMGSIIGPIEEEEGYDRDFSAAANIATAPTGLLIPPSNVMITFSLVSGGTSVAALF MAGYIPGILWGLACMLVIFFFAKKRGYRSTKRYSASEKVKVFFQAIPCLLMIIIVIGGII SGIFTATEGSVVAVVYSLILSLFFYRSIKLTELPKIFLDSAEMTGIIIFLIGVSSIMSWV MAFTGIPSAVSEAMLSISNNRYVILFIINILLLVIGTFMDMTPACLIFTPIFLPICQKLG MNTVHFGIMMIFNLCIGTITPPVGTTLFVGVKVGKTKIENVIKPLLLYFTAIFIVLLLVS YVPILSLWLPGLLGYV >gi|229784024|gb|GG667711.1| GENE 16 18096 - 18599 255 167 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 4 147 1 146 164 102 34 2e-21 METLHKIREVLMKILGLFIIFLFAMMTVIGTYQIVTRYFFNRPSTISEELLTYTFTWMAL LASAYVFGKRDHMRMGFLADKITGTARKILEVAIDFLTFAFAAVVMVYGGISIVKLTMIQ TTASLQIPMGYVYLIVPVTGILIMYFSFTNGMDMLHGDFSEKGEARV >gi|229784024|gb|GG667711.1| GENE 17 18602 - 19000 188 132 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. PR1] # 1 124 205 327 328 77 34 1e-13 MIDGAENNELALTNNKHGEVAKYYSYNKHQMVPDMLVANLKFLEGLSPEEYQVFKDAAAL STEVEMKEWDKSIEEAKKIASEDMGVEFIDVDVDAFKEKVLPLHEKMLNDNPKIRSFYEY IQTVNEQAKGEE >gi|229784024|gb|GG667711.1| GENE 18 19928 - 20356 180 143 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. 
PR1] # 2 124 1 123 328 73 34 1e-12 MMKRAAAVLLAAAMAVSLTACGSLTDGKRIIRISHAQSETHPEHLGLLAFKEYVEERLGD KYEVQIFPNEILGSAQKAIELTQTGAIDFVVAGTANLETFADVYEIFSMPYLFDSEDVYK TVMEDSDYMEQVYESTDEAGFRV >gi|229784024|gb|GG667711.1| GENE 19 20666 - 21520 719 284 aa, chain - ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 4 279 6 255 257 119 28.0 6e-27 MYRVLIADDEEIERLSFSRKLKKYFGESCEVFQAENGMQAILSVDKNDTPIVIMDIGMPG MNGVKAAEWIRKSHPDCVIIFLTAYDDFSYAKRAVSLHALEYLLKPCDDEELIPVMEEAM RQVDLCRGRRASQAAGRSGGRSEDGTEGAPGQAGERGAGTGRERDTDPEEAAENGYSRVA VAIRAYVRENYSRDISMQDAARAMNYSEVYFCRLFKQCFDQNFTAYLTRLRVKEAKKLLE DPRANIKDVSRSVGYSDSKYFSKIFKRITGQLPSEYRDGLPPAK >gi|229784024|gb|GG667711.1| GENE 20 21585 - 23162 1272 525 aa, chain - ## HITS:1 COG:BH3841 KEGG:ns NR:ns ## COG: BH3841 COG2972 # Protein_GI_number: 15616403 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 195 524 157 478 481 154 29.0 3e-37 MKSFFRFTGIADGKAVCLLEAVNSSVCLPQRKDSNMDKGKKNGSFRDRIRRTWISLPLRK KAGTITVLAAFAVSCSILMNLVTAGFGMNGFRQILEDNSRSASFHAAMETESRTFKTMIR NRSEENRAAYEAACNAARKALEALPYDYDETGPERFAKTWSVRNSYDTYEKRKTVLLSMD EDDPEFVHKLYEIYQIQNYIKSYAGSLQEMNVAAGVRRYEKQIPLFLLIPALTILWGALA LATVVWLNRSVGRYIVKPVQALAEDSRRIAENDYSGPSVLPEGEDEIADLVRAFCRMKLS TQGYIEALKEKHEAEKQLEAVRLQMLKNQINPHFLFNTLNTIAGAAEMEEAETTEKMIQA MSRLFRYNLKSTASVMPFEREKKVVEDYIYLQKMRFGSRIRYTSDCSQDTMDVMIPVFAL QPLVENAIVHGLAVKPEGGRLHIRSWIKEHRIFISVSDTGAGIAPERLEEIRAELKRGES QKLGIGVSNISRRLKTMYPDGELVIDSREGRGTVVRMAFTASEIF >gi|229784024|gb|GG667711.1| GENE 21 23159 - 23350 63 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIKHIDIFSIYWYYNLAVRECWNWQTGKTKDLVGVCSCGFKSHLPHEWAKAPDEDFVKGE DLL >gi|229784024|gb|GG667711.1| GENE 22 23879 - 25807 1317 642 aa, chain - ## HITS:1 COG:no 
KEGG:Corgl_0562 NR:ns ## KEGG: Corgl_0562 # Name: not_defined # Def: protein of unknown function DUF303 acetylesterase # Organism: C.glomerans # Pathway: not_defined # 4 634 9 634 644 702 54.0 0 MEGEMQLSVTPFLSNHAVLQRDEEIVVRGRGHAGCEIFAELKGPAKKRLERRGVAGDGGT WRLVFPPLPAGGPYELRVKSGGEALEFTDIFIGDVWLLAGQSNMQLPMERVKYRYPEEYR EGASPLIRQFAVPICWNFDSAQAELSGGEWKTAAAEYTPVFSAVGFFFAKKLYERYNVPV GLVLTAVGGTPVQAWMSREALREFPDELEKAEALRAPGLVKRIQRADEERIAAWWRHLDK KDPGILENWAAGGDEGSGNALWKKTNLTDSWEGNDELSGTGTVWLRKHVNIPHSRAGKPV RISLGTVTDADVFYVNGIMSGSVSYRYPPREYELPGLNAGDNLLVLRVAAVHGQGGFTRG KKHRIIWEDGMESDISGGWEYRRGAVMEPLEEQTFFERMPLGMYQAMTAPLHDFPVRGIC WYQGEMNADEPGAYPGYFSRLTADWREKWNNPRLPVLFVQLPCYDLYDAGHWVEFREMQR GLTAIPDTAMVVTTDCGEVNDLHPVSKKPVGERLAMAAFQLAYGECGTWLSPVFSFGETE ESVIVLHFEHADDGLETVDGEMLCGFEYGFCGESPECSGVICRRIPAEAALEGNRVRAAL PESGGRKPDVLLYAWSNQPEGNLCNSQKLPASPFRYLIKEKG >gi|229784024|gb|GG667711.1| GENE 23 25813 - 26697 853 294 aa, chain - ## HITS:1 COG:YPO1721 KEGG:ns NR:ns ## COG: YPO1721 COG0395 # Protein_GI_number: 16121981 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Yersinia pestis # 13 294 26 306 306 318 57.0 7e-87 MENTRGYSEVEIRRRRRLMVSAAVRYIVLIVVALVMIYPIIWLVGATFKSNNEIFTSISF IPKRIDFTPYIEGWKTRTKYTFTTFFLNTFKFVIPKIVFCLVSSTLVAYGFARFEFPFKK ICFSLLMATMFLPAVVTRIPLYILWKQMGLLNTYVPLIAPTIFANEPFFVFMLIQFMRSI PTYLDEAATIDGCNSFQILMKVLLPSLKPALISCTIFQFVWSFNDFLGPLIYVTSLEKYP VALALKMSIDQSSGIVEWNQILAMSFLALLPALILFFSAQKYFVEGVTSTGVKG >gi|229784024|gb|GG667711.1| GENE 24 26701 - 27573 1057 290 aa, chain - ## HITS:1 COG:AGl3351 KEGG:ns NR:ns ## COG: AGl3351 COG1175 # Protein_GI_number: 15891796 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 286 7 288 293 307 56.0 2e-83 MKKSKMGLVYILPWLIGMVFLTLYPFINALIISFTDYNLVRDPNFIGLANYTKLFKDEDF LGTLTATLKYTVITVPLQLAFALFIAYILNFKLKGINFFRTAYYIPSLLGGNVAVAVLWR 
FLFQQDGLINRIIGVVGIQPVAWLSSPGGAMSVIIILKVWQFGSAMLIFLAALKDVPQDL YEAASVDGSTKLHSFIHITMPLITPTIFFNLVMQLVNAFQEFNGPYLVTGKGPLNATYLT SMYIYDNAFKYFNMGYASAASWILFLIIVAVTLILFATQDKWVYYSDGGN Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:00:41 2011 Seq name: gi|229784023|gb|GG667712.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld105, whole genome shotgun sequence Length of sequence - 24302 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 9, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 4.0 1 1 Tu 1 . + CDS 87 - 1871 1745 ## COG2720 Uncharacterized vancomycin resistance protein + Term 1884 - 1925 7.0 + Prom 1989 - 2048 9.5 2 2 Op 1 23/0.000 + CDS 2112 - 4829 2194 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit + Prom 5731 - 5790 10.6 3 2 Op 2 . + CDS 5827 - 6099 317 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit + Term 6114 - 6148 7.3 + Prom 6195 - 6254 5.9 4 3 Op 1 . + CDS 6320 - 7705 1616 ## COG0534 Na+-driven multidrug efflux pump + Prom 7708 - 7767 1.9 5 3 Op 2 . + CDS 7807 - 8706 806 ## Closa_1701 hypothetical protein + Term 8733 - 8777 9.6 + Prom 8721 - 8780 8.2 6 4 Op 1 8/0.000 + CDS 8814 - 9239 490 ## COG1725 Predicted transcriptional regulators 7 4 Op 2 . + CDS 9205 - 10104 939 ## COG1131 ABC-type multidrug transport system, ATPase component 8 4 Op 3 . + CDS 10079 - 12268 1921 ## Closa_1704 hypothetical protein 9 4 Op 4 . + CDS 12327 - 12806 800 ## Closa_1705 hypothetical protein + Term 12828 - 12883 12.1 + Prom 12896 - 12955 4.6 10 5 Tu 1 . + CDS 13021 - 14571 1666 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 11 6 Tu 1 . - CDS 15488 - 16147 501 ## COG2768 Uncharacterized Fe-S center protein - Prom 16230 - 16289 4.4 + Prom 16189 - 16248 5.1 12 7 Op 1 . 
+ CDS 16281 - 17561 1506 ## COG2873 O-acetylhomoserine sulfhydrylase 13 7 Op 2 . + CDS 17606 - 19156 1816 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes + Term 19261 - 19303 7.6 - Term 19147 - 19188 3.2 14 8 Tu 1 . - CDS 19199 - 20161 511 ## COG0471 Di- and tricarboxylate transporters 15 9 Op 1 . - CDS 21136 - 21231 74 ## 16 9 Op 2 . - CDS 21257 - 23125 1339 ## Closa_1714 Heparinase II/III family protein 17 9 Op 3 . - CDS 23122 - 24300 972 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|229784023|gb|GG667712.1| GENE 1 87 - 1871 1745 594 aa, chain + ## HITS:1 COG:BS_yoaR KEGG:ns NR:ns ## COG: BS_yoaR COG2720 # Protein_GI_number: 16078932 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Bacillus subtilis # 271 402 132 264 303 121 48.0 4e-27 MSQTGNRQSGTSRSYSSQGTRKTGTGTSRKTGQGSRGQSSRYKSSYQPRSGSGRGSKRKK RPQYDITKILLGILIAVVAVIVIMAAVKFVGSKPAEAETGETTSEPETELQKEVSVDGIS ITGMSREQAKKAILDKYEWGMKVTYKDDTYEVADLLETKVDSLLGDIYTGEPKESYTLDM SGLEDAAAEEAKKAAAKWDVKAKNGSVSGFDKEAGKFIYSGEQNGFVIDQEKLVSDILAQ LKNKNFSAVIEATGRETAPEITVAQAKELYQVIGTYTTTTTANKDRNKNIELAAEALNGL ILQPGEEFSFNKATGERSEAKGYRPAGAYVNGELVEEPGGGVCQVSSTLYNAVIFSGLTT TERHAHSYEPSYVTPGEDAMVSFGGPDMKFVNTSSTAIAIRASFADRKLKISIVGIPILE KGVTLSMTSKKTAELDAPAPVYEEDQTLQPGEEKIVKAETKGSRWVTNMVTKKDGVVVSD EFFHNSTYRGKPATIRRNTSGVVVPATDESASGESTISSSEGETLPSGGADTPTTTAPTQ PTTAQPDTVPGQGPGEQPTSAAQPTTTQAVQPEGPGGGSGNSPGQNVVPPSPLS >gi|229784023|gb|GG667712.1| GENE 2 2112 - 4829 2194 905 aa, chain + ## HITS:1 COG:CAC2229_1 KEGG:ns NR:ns ## COG: CAC2229_1 COG0674 # Protein_GI_number: 15895497 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Clostridium acetobutylicum # 3 400 2 402 413 563 67.0 1e-160 MARKMKTMDGNHAAAHASYAFTDVAAIYPITPSSPMAEASDEWATDGRTNIFGHTVQITE 
MQSEAGAAGAVHGSLSAGALTTTYTASQGLLLMIPNLYKIAGEQLPGVFNVSARAVASHA LSIFGDHSDVYACRQTGCAMLCESSVQEVMDLTPVAHMAALEGKVPFINFFDGFRTSHEI QKVETWDYEDLKEMVNMEAVDEFRRHALNPNHPCQRGSAQNPDIFFQAREACNPYYEALP AIVQKYMDKINEKIGTDYKLFNYYGAPDAEHVIVSMGSVNDTIEETIDYMVAQGQKVGVV KVRLYRPFCADALVEAIPDTVKTISVLDRTKEPGSLGEPLYLDVVAALRGTKFDQVKILT GRYGLGSKDTTPAQIVAVFNNETKTPFTIGINDDVTHLSLETGAPLVTTPEGTTNCKFWG LGADGTVGANKNSIKIIGDNTDMYAQAYFDYDSKKSGGVTMSHLRFGHKPIKSTYLIHKA NFVACHNPSYVRKYNMVQELVDGGTFLLNCPWDAEGLEKHLPGQVKKFIADHNINFYTID GVKIGIETGMGPTRINTILQSAFFELTGIIPAEKANELMKAAAKATYGRKGEDVVMKNWA AIDAGAKGVVKVEVPESWKSCTDEGLDMTHATSGRKDVVDFVNNIQAKVSAQEGDSLPVS AFTEYVDGSTPSGSSAYEKRGIAVNIPAWNPDNCIQCNFCSYVCPHAVIRPVAMTAEEAA NAPEGMKMLDMTGMPGYKFAITVSALDCTGCGSCANVCPGKKGEKALTMVNMEANYADAQ KVYDFGQTVSIKQEVIDKFKPTTVKGSQFKQPLLEFSGACAGCGETPYAKLITQLFGDRM YIANATGCSSIWGNSSPSTPYTVNEKGQGPAWSNSLFEDNAEFGYGMLLAQNAIRDGLKA KVESS >gi|229784023|gb|GG667712.1| GENE 3 5827 - 6099 317 90 aa, chain + ## HITS:1 COG:CAC2229_3 KEGG:ns NR:ns ## COG: CAC2229_3 COG1013 # Protein_GI_number: 15895497 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Clostridium acetobutylicum # 1 83 287 373 376 86 51.0 1e-17 MSKAQTEEKLAVECGYWNNFRFNPAAEKKFTLDSKAPTGDYQEFLKGEVRYASLAMKNPE RAADLFAKNEADAKERYEYLTKLVTLYGDK >gi|229784023|gb|GG667712.1| GENE 4 6320 - 7705 1616 461 aa, chain + ## HITS:1 COG:L170983 KEGG:ns NR:ns ## COG: L170983 COG0534 # Protein_GI_number: 15672149 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Lactococcus lactis # 7 446 2 441 446 280 36.0 3e-75 MAKKQTKDMTTGSPVKLILSFLIPMLFGLLFQQLYNMVDTIIVGKYLGVNALAAVGSTGS INFMIIGFCIGVCSGFAIPVAQKFGEKNEVALRRFVANGGWLSLIFSVTMTAVVCVLCRN ILIWMNTPEDIIDGAYSYIIIIFMGIPATYLYNILSGIIRSLGDSTTPLFFLLVSSVMNI ILDFFTILVLHMGVAGAAWATVISQGVSGILCLLYMRKKFTILKMQGDEWKPDKNAMVTL CGMGIPMGLQYSITAIGSVILQTAVNSLGSVAVASVTAGSKISMFFCCPFDAMGATMATY 
SGQNVGARRLDRIDKGIKSCIIIGAVYSVIALAGLWLFSDVVALLFVDAGETEILANTRQ FLIANSLFYFPLALVNIVRFTIQGLGYSKFAIIAGVCEMAARSFVGFCLVPFFGYAAVCF ANPIAWVAADLFLIPAYRYVMGRLRRLMGRNNPGMAAETAV >gi|229784023|gb|GG667712.1| GENE 5 7807 - 8706 806 299 aa, chain + ## HITS:1 COG:no KEGG:Closa_1701 NR:ns ## KEGG: Closa_1701 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 71 298 61 290 291 292 66.0 1e-77 MDTLSYMQNDADTPVVPLPNPGEGGPVPDGEAEGNFPVVPLPNPGEGGPIPDFDNGQTPV IPLPNPGEGGPLPTFPNRPNNGIWGSIITVFPRPIIPCYFCSTTQYGVVRFLNAAAGYNP FIIYINNQMVVNGLDNAEVSQYGRVSSGMQTVTVSGQNGYVYIQKQINVPLNGAVTVAII NTNSGLDLMEITDSNCNGGLNTGCFRVCNLSNTNRSVNVTLNGGAVTFQNVNYREVTSFR YLPAGYYTVSVSNSTSFSGSPLLTSNIYIRGNVSYTLYVFNWNNSQDAIRILIVEDRRN >gi|229784023|gb|GG667712.1| GENE 6 8814 - 9239 490 141 aa, chain + ## HITS:1 COG:BS_ytrA KEGG:ns NR:ns ## COG: BS_ytrA COG1725 # Protein_GI_number: 16080098 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 122 1 127 130 94 37.0 8e-20 MILLDYKDRRPIYEQVVEKLQDLMICGVLEQDSQLPSVRSLATELSINPNTIQRAYAELE RRGFIYSVKGRGSFVSDTSSIRALKVSELRKELGSWLLEARNAGVTEDKVHCWVAEDWKG QKSMVREEEQRDKSGKSDETV >gi|229784023|gb|GG667712.1| GENE 7 9205 - 10104 939 299 aa, chain + ## HITS:1 COG:lin2912 KEGG:ns NR:ns ## COG: lin2912 COG1131 # Protein_GI_number: 16801971 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Listeria innocua # 2 283 1 281 295 165 33.0 1e-40 MIKAENLTKRFDDMTAVDHINAEIKDASVFGLIGTNGAGKSTFLRMMSGVLKPDEGSVTI DGENVFEREEVKQRFFYISDEQYFFSNATPLEMMEFYSRVYPKFDKTRFHKLLAGFELNG RRKINTFSKGMKKQLSVICGICTNTDYLFCDETFDGLDPVMRQAVKSLFAGDMAERNLTP IIASHNLRELEDICDHVGLLHKGGILLSKDLDDMKLNIHKIQCVLREGMDADDLVAFERL KTERRGRLITITVRGSREEAERVIESYAPVFYEIIPLSLEEIFISETEVAGYDIKKLIL >gi|229784023|gb|GG667712.1| GENE 8 10079 - 12268 1921 729 aa, chain + ## HITS:1 COG:no KEGG:Closa_1704 NR:ns ## KEGG: Closa_1704 # Name: not_defined # Def: hypothetical protein # Organism: 
C.saccharolyticum # Pathway: not_defined # 1 713 1 714 725 650 47.0 0 MTSKNLFFNLMREDLKRRLWAVALTFLTFFFSLPVAVALTLSESRNMEESYQGMLHSARS VLGFNNGFIAVIIVLLSLILGVTSFSWLHSRKKVDFYHSLPVRREKLFFVNYLDGALILF FAYAVNLLLGLLVAMVNGIAPGDVIPAAFGASGFLFLHFVMLYSVTVLAMVLTGNILVGI LGTGVLHLYFPALLLLLDALYQEFFKTSYHGGSRIAYRLLDKSSAFTLYVSNYSAFMNGT NGRSQILRIGAVILVTAAVTAVSVLLYRMRGSEAAGKAMAFKVSQHIIRIPVVILSALCG GLFFWYLHDSIGWAVFGLICGLLLSHCTIEIIYHFDFRKLFSSRISMAVCGLCAALVFFG FQFDLFGYDKYIPKPESLESVAVSLNNMEYWVDYGSAKLDDGSYYWDYESPSDYLFARMR ITDTDTVLALVSDAVSQVEREKRHDNEWNSEYGNRYISFSVKYNLKNGREVYRSYSVYGN KEYDLVKKIYDSRDYRMAAYPVMEQTAEYTGKIRVKQSSWTREVTRGGGSGETDYVDLML STYQEEMAALTSAVMERENPIAQIQFMTNDQVMAEKLKDQDTYAWRYNSGIDRGYYPVYP SFRKTIGLLKECGIELNDGIGTGEVVRADIDLYQLADDDSGRYYDKENVSQLTVTDQEEL AALMETARLEDYANFNPFNEQENRIVFTAVIKDTSGQTSHEYSIRKADQPEFLKRESERF NKEVETVFD >gi|229784023|gb|GG667712.1| GENE 9 12327 - 12806 800 159 aa, chain + ## HITS:1 COG:no KEGG:Closa_1705 NR:ns ## KEGG: Closa_1705 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 158 1 160 162 147 70.0 2e-34 MAFLEELGKTLTDTGKEVATKAKALTETIQLKTQISTEKTKLEEAYAAIGRIYYESNREP EEAYTKVYDAVKTCRERIAALEIELTQSEGSRICAVCGAKVPKDSLYCGKCGAPIKEEVA ESEDVNAAAEEEAAAEEESERPAGVPAAEEDAFVPTEEY >gi|229784023|gb|GG667712.1| GENE 10 13021 - 14571 1666 516 aa, chain + ## HITS:1 COG:SMc00195 KEGG:ns NR:ns ## COG: SMc00195 COG1368 # Protein_GI_number: 15965601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Sinorhizobium meliloti # 183 490 220 543 639 134 28.0 6e-31 MKAYGKRLGKILVLAAVLNFLIEFISRKSLTVLLAYVLESPLVFLLNTCIIALPFTVLFF TKRRVFGAIVLSILWLGLGVVNGILLIFRTTPFTAADFRLIKYAANIATTYFTWMQLVMI GAALVVVVVFCVLVWRAAPVSREKVSFVRSAVVVGISAAVVLGLTTAAMNTGLVAVRFGN IGEAFQMYGFPYCFANSMFNTGISKPEDYGTETIEGIKAEELVPENTYAVSANTKPNVIM IQLESFFDPLLWKKNPVTGNLKSGTEEDYDPIPFFHRLQKSYPHGYLNVPSVGAGTANTE FEAITGMNLDFFGPGEYPYKTVLKKTSCESVAFDLKNLGYSAHAIHNNEATFYDRNNVFA 
QIGFDTFTPIEYMNNIERNPAGWCRDKILVKEIIKTLDSTEGPDFIYTISVQGHGKYPDF EYYCEQIHEMDDFIRSLVNTLRTREEPIVLVMYGDHLPSFEWTADEMINKSLYQTEYVVW NNMNLPKKRMDVEAYQLAAHVLDMLGIHEGTMMRSS >gi|229784023|gb|GG667712.1| GENE 11 15488 - 16147 501 219 aa, chain - ## HITS:1 COG:TM0034 KEGG:ns NR:ns ## COG: TM0034 COG2768 # Protein_GI_number: 15642809 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S center protein # Organism: Thermotoga maritima # 16 218 3 205 357 234 57.0 7e-62 MENFINHAQEVHMDKSRVYYTNFHTTFSENLPQKLKRLLKTAGMEQIDFQNKYAAVKIHF GELGNLSFLRPNYAKVVADLIRENGGKPFLTDCNTLYVGSRKNALDHLDTAYENGFSPFS TGCHVLIADGLKGTDEALVPIDGDYIKEAKIGRAIMDADIFISLTHFKGHESTGFGGTLK NIGMGCGSRAGKMEMHNAGKPHVAQETCVGCHACERNCA >gi|229784023|gb|GG667712.1| GENE 12 16281 - 17561 1506 426 aa, chain + ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 10 425 5 418 422 504 57.0 1e-142 MSQKKRSRETVCIQGGWQPKSGEPRVLPIYQSTTFKYDDSDKMGRLFDLEDEGYFYTRLA NPTNDAVASKICELEGGTAAMLTSSGQAASFFAILNICEAGDHVISSAAVYGGTTNLFTV TLKKLGIDSTLVDPDASEEELEKAFQPNTKAVFGETIANPALVVLDIEKFAAAAHRHGVP LIIDNTFATPINCRPFEWGADIVTHSTTKYMDGHATSVGGCIVDSGNFDWEAHAEKFPGL TTPDESYHGIVYTQKFGKMAYIMKATAQLMRDLGSIQSPMNAFLLNLGLETLHLRVPRHC ENALRVAEYLKSRDDVEWVNYPGLQGDKYYELAQKYMPDGTCGVISFGLKGGRDAAVAFM DRLKLAAIVTHVADARTSVLHPASHTHRQLTDEQLVEAGVDPSMIRFSVGIENVDDIIED IRQALE >gi|229784023|gb|GG667712.1| GENE 13 17606 - 19156 1816 516 aa, chain + ## HITS:1 COG:SPy1212 KEGG:ns NR:ns ## COG: SPy1212 COG1502 # Protein_GI_number: 15675176 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Streptococcus pyogenes M1 GAS # 2 516 15 525 525 485 48.0 1e-137 MKKARKLLRIIFGRTAFVVMSLLLQISILLAGFRFLSHYMVYIYGGFTLLSAFVILYVIN KDENPSFKLAWIIPITVIPVFGTLLYLFLELQWEGKIINRRLRENIAATQPYLKQNPRYM 
EQLARISRSNANLAAYIENSGSYPVYGNTNVKYYPLGEEMFEDMKKELGKAKRFIFMEYF IVERGEMWDSILEILERKVSEGVEVRFMYDGMCCLVLLPYSYPKELRAKGLKAKMFAPIR PALSTYQNNRDHRKILVIDGHTAFTGGINLADEYINRKVRFGHWKDTGIMLKGDAVTSFT MMFLQMWNITEREPEDYGRYLRDPEFFYPPELSMEGFVIPYGDSPLDQETVGELVYLDII NTARSYVHIMTPYLILNYELVQALQFAAKRGVETIIIMPHIPDKVYAFLLAKAHYEELIK AGVQIYEYTPGFVHAKVFTSDDEKAVVGTINMDYRSLYLHFECAAYIYRNEVIQDVERDF RETLAQSQVITLEECRRYPWYKKLAGRALRLFAPLM >gi|229784023|gb|GG667712.1| GENE 14 19199 - 20161 511 320 aa, chain - ## HITS:1 COG:L20481 KEGG:ns NR:ns ## COG: L20481 COG0471 # Protein_GI_number: 15673760 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Lactococcus lactis # 1 313 60 380 392 127 32.0 2e-29 MVVVAGFQECGLFVLMARKLLSGENNFRQLSWMLVMLPFFCSMLITNDVALLTFVPFTIL ILDMIGEKKHMAWIVVLQTVAANLGSMATPVGNPQNLFLYGRFSIPAGRFFAVLLPLAAL SFVLLTAAVFRIPRISMQIRFDAPEVTQEKHASPAVFLFLFLLCLLSVFRILDYRIALAV TVVCLLLFDRKLFRTADYLLLLTFVCFFLFAGNMGQLQGLKSFMAEVLSRNPVLFSALTS QLISNVPAAVLLSGFTSDWKSLLLGTNLGGLGTPIASLASLISLKFYIKELPDKVSHYLA VFTAINIAFFLILYTAACFL >gi|229784023|gb|GG667712.1| GENE 15 21136 - 21231 74 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNITLKTIGGRILAAAKKDVILTVSLVCAVS >gi|229784023|gb|GG667712.1| GENE 16 21257 - 23125 1339 622 aa, chain - ## HITS:1 COG:no KEGG:Closa_1714 NR:ns ## KEGG: Closa_1714 # Name: not_defined # Def: Heparinase II/III family protein # Organism: C.saccharolyticum # Pathway: not_defined # 21 621 1 600 601 910 70.0 0 MIADTAITETSLTGTPMAAAMITEAAQKLHKPLTLFTPCPPASDRESWEKLSEDLKARLI KNGEQYLHYEYPSLTAVDFMEFTRTGNRSRYQDRLFARRTALDTLVLAECVEHRGRFLDD IINGLYLICEENAWQLPAHNSYIRDTPQFPLPDVTRPMIDLFAAETASILAVAEYLLRSE LNEVSPFLSVMIGHELETRIFTPYLEEHFWWMGDGKSHMNNWTVWCTQNVLLSAFTRPLP SETQTRILCRAARSIDYFLDEYGEDGCCDEGAQYYRHAGLCLFGCLEILNGITDHAFSFA WKQEKIRNIASYILNVHVDSIYYVNFADCSPVAGRCNAREYLFGKRTDNRALMEFAASDY QNSEDPLTLDEHNLFYRLQTVFSHEEMMESDRNPVIAHKDLYYESVGLFIARGSRLCLAV KGGDNDDSHNHNDVGSFTIYKDGKPLFIDAGVETYTKKTFSPDRYDIWTMQSRYHNLPSF 
TDHGRETVQQDGGRYRAEDVRYELGEDVCRISMDIAKAYPDDRIRSYTRSAVLDKEKSIT IRDHYEGDGGTAVLSLMFCEKPSIHQNTIRIGDLGICTVEGASSITSEEIPVTDERLKTA WKHSLYRCLVTMEARDLTLTIA >gi|229784023|gb|GG667712.1| GENE 17 23122 - 24300 972 392 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 267 376 410 519 525 84 38.0 3e-16 LRLLHNLFDSEEQFLLQAKDLKLAFTDACYCAALFEIREEAHAAMDHTKLMNLCSSTFSM ARDILNKHLPCYAISLDMKHFAVIFHAAKQEDGNAAQLQSAVKNAVEMVHNYFNVTVLAG IGTPVEQPLKICESYQDARQAFGRATTTAPIIQFSQDDTESVKNAFNIAVFKNDITKAFD EFDTDVLYRTLSEIIELFKAHPLRFLQAVDGACNILYLALSLLPDGEENMMEIFSSYSDG YRSIYRFTSVDQIIEWMMILRDGLCEVLKSKRKTYKAHVITNVQKYIGSHVNERLSLNEV AGVFGLSPNYLSILFKKNCQVGFSEYISQAKIAKAKSMLLEQDMKIYEVADRLGFESAFY FSKVFKKVEGISPREYIQQNTIGPDGEQGDTL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:01:18 2011 Seq name: gi|229784022|gb|GG667713.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld106, whole genome shotgun sequence Length of sequence - 27830 bp Number of predicted genes - 31, with homology - 29 Number of transcription units - 14, operones - 10 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 762 517 ## gi|288871092|ref|ZP_06116326.2| hypothetical protein CLOSTHATH_04682 2 1 Op 2 . - CDS 775 - 1551 481 ## gi|266623392|ref|ZP_06116327.1| hypothetical protein CLOSTHATH_04683 3 1 Op 3 . - CDS 1548 - 1637 65 ## 4 1 Op 4 . - CDS 1630 - 3537 1196 ## COG3451 Type IV secretory pathway, VirB4 components 5 2 Tu 1 . - CDS 4463 - 5056 591 ## ELI_0234 hypothetical protein - Prom 5094 - 5153 2.4 6 3 Op 1 . - CDS 5170 - 5664 404 ## gi|266623395|ref|ZP_06116330.1| conserved hypothetical protein 7 3 Op 2 . - CDS 5665 - 6633 854 ## COG0863 DNA modification methylase 8 3 Op 3 . 
- CDS 6652 - 7494 538 ## ELI_0231 hypothetical protein - Prom 7546 - 7605 80.4 9 4 Op 1 . - CDS 8488 - 9168 285 ## gi|266623398|ref|ZP_06116333.1| conserved hypothetical protein 10 4 Op 2 . - CDS 9182 - 9520 358 ## gi|266623399|ref|ZP_06116334.1| trwL6 protein - Prom 9566 - 9625 5.5 - Term 9595 - 9645 1.0 11 4 Op 3 . - CDS 9648 - 11159 654 ## COG3505 Type IV secretory pathway, VirD4 components - Prom 11241 - 11300 21.2 12 5 Op 1 . - CDS 12202 - 12708 340 ## ELI_0208 hypothetical protein 13 5 Op 2 . - CDS 12708 - 13565 561 ## gi|266623402|ref|ZP_06116337.1| hypothetical protein CLOSTHATH_04693 - Prom 13597 - 13656 80.4 14 6 Op 1 . - CDS 14504 - 14773 209 ## gi|167747640|ref|ZP_02419767.1| hypothetical protein ANACAC_02361 15 6 Op 2 . - CDS 14779 - 15210 443 ## gi|266623404|ref|ZP_06116339.1| conserved hypothetical protein 16 6 Op 3 . - CDS 15203 - 16180 883 ## gi|266623405|ref|ZP_06116340.1| conserved hypothetical protein - Term 16193 - 16227 2.9 17 6 Op 4 . - CDS 16241 - 16846 91 ## MGAS2096_Spy1154 hypothetical protein - Prom 16867 - 16926 3.9 18 7 Op 1 . - CDS 16970 - 17452 304 ## gi|266623407|ref|ZP_06116342.1| conserved hypothetical protein 19 7 Op 2 . - CDS 17476 - 18048 370 ## gi|266623408|ref|ZP_06116343.1| hypothetical protein CLOSTHATH_04699 20 7 Op 3 . - CDS 18038 - 19201 600 ## EUBELI_01797 hypothetical protein - Prom 19255 - 19314 20.2 + Prom 19002 - 19061 5.0 21 8 Tu 1 . + CDS 19127 - 19303 105 ## 22 9 Op 1 . - CDS 20216 - 20383 176 ## gi|167747645|ref|ZP_02419772.1| hypothetical protein ANACAC_02366 23 9 Op 2 . - CDS 20386 - 20616 199 ## gi|266623411|ref|ZP_06116346.1| relaxosome component - Term 20746 - 20791 -0.9 24 10 Tu 1 . - CDS 20813 - 21253 246 ## gi|266623412|ref|ZP_06116347.1| conserved hypothetical protein - Prom 21325 - 21384 3.4 - Term 21324 - 21361 -0.9 25 11 Tu 1 . - CDS 21561 - 22076 167 ## gi|288871095|ref|ZP_06116349.2| hypothetical protein CLOSTHATH_04705 26 12 Op 1 . 
- CDS 22280 - 22561 267 ## gi|266623415|ref|ZP_06116350.1| conserved hypothetical protein - Prom 22619 - 22678 1.8 - Term 22829 - 22872 1.0 27 12 Op 2 . - CDS 23038 - 23766 574 ## gi|288871096|ref|ZP_06116351.2| conserved hypothetical protein - Prom 23864 - 23923 80.4 28 13 Op 1 . - CDS 24767 - 25525 529 ## gi|266623417|ref|ZP_06116352.1| hypothetical protein CLOSTHATH_04708 29 13 Op 2 . - CDS 25588 - 25914 167 ## gi|288871097|ref|ZP_06116353.2| conserved hypothetical protein - Prom 26088 - 26147 3.3 30 14 Op 1 . - CDS 26502 - 27434 491 ## EUBELI_01204 hypothetical protein 31 14 Op 2 . - CDS 27424 - 27744 111 ## gi|266623420|ref|ZP_06116355.1| hypothetical protein CLOSTHATH_04711 - Prom 27770 - 27829 1.9 Predicted protein(s) >gi|229784022|gb|GG667713.1| GENE 1 1 - 762 517 253 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871092|ref|ZP_06116326.2| ## NR: gi|288871092|ref|ZP_06116326.2| hypothetical protein CLOSTHATH_04682 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04682 [Clostridium hathewayi DSM 13479] # 1 253 2 254 254 457 100.0 1e-127 MDVTLQVPLGEQMHLKEMFSVLEENHMEIESQNLENFLQYFDQMERYFGDMQEELIYLRD QLDRLNEKTLKAKMEHFIQNAGDRVGAAKMWLAALKNDLSAGVRKAVDSVKYHGGQALYA AIDKTKAARALSVIHEHLESAAASLEHSAETMGDIGVELNAAKGHKKNVRKLLKGKETSD TFEYDYDKGIIAKIRKAIEFCGKIVKNMAGHTENMLKSLDGLSQKALDNRAIRKKHVFEH EGGKDVSRCRKRI >gi|229784022|gb|GG667713.1| GENE 2 775 - 1551 481 258 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623392|ref|ZP_06116327.1| ## NR: gi|266623392|ref|ZP_06116327.1| hypothetical protein CLOSTHATH_04683 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04683 [Clostridium hathewayi DSM 13479] # 1 258 1 258 258 498 100.0 1e-139 MNREIFVLAVKKRLLLQTEENIKKWVAWAEECVDMGQYDGFREMPRNQAVEKWLDSYYFA FNRIQTDFDRDTAIKVLNLSEEKLCLYPNEMHVAARVLKDGGTGKDILQMIHEGTLEEDM ANMEKCRAERMDISSYSEENLYRILQAAKTRQETPEIPLPYKPVDSLVVQLINACMELLE YRYGITTIQKSDYDLISEHAFFLCDSAQVLEPEDVDLDAVYECFNDAEVNERMIEVEEQR SPEKERGSNVSGKTGKQR >gi|229784022|gb|GG667713.1| GENE 3 1548 - 1637 65 29 aa, 
chain - ## HITS:0 COG:no KEGG:no NR:no MFNGRKKSECQRNGITEKARLAGCEEEYS >gi|229784022|gb|GG667713.1| GENE 4 1630 - 3537 1196 635 aa, chain - ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 74 608 47 608 617 84 21.0 1e-15 MEPKGFEFQEIGSKLEPLDAAERLRILYNFYRIGEEETFSFDFDEYVKSKKDYKNDICGS MIKFFPNEFETDRKVCRAIFIKRYPTSLPDTFVNELTGVPVHSITSIDIVPVPKDLTTKT LQKKYLGVESDILRQQRVRNKNNDFASEISYGKKVEKKNLEALMDDVRENDQNLFFSGVN MVLIADTREELDSITETIQTIGNKYLCNMEIHHLQQREALNTALPVGVRQVETMRTMLTR DIAALMPFNVQEIYEPSGLCYGMNRISKNLCVADRKKLTNGNAMVFGVPGSGKSFFCKSE MLGVFLKTDDDILVIDPTLEYFDIARNLGGECLNLSNYTQNYINPLWIDVDQLDLADSKG LIREKGEFMLGLCEQAMLDLLNSRHKSIIDRCIRKLYLDIAMSKEKYIPVMSDFYDLLMK QPEEEARDIALALELFVNGSLNIFNHQTNIDVDNRFVVYGIRDLGKELSAIAMLVMLENV KTRIARNAERGRATWLYIDECHVLLGTGEHSGHSEYSGKFLYSLWKKVRKQGGLCTALTQ NITDMLQSYTAVTMLANSEFIALLKQANVDSQELSKVAGIPEAQLQYVSNCASGMGILKH GETIIPFDARVRKNNIIYDLFNTNIHEIREKNNHV >gi|229784022|gb|GG667713.1| GENE 5 4463 - 5056 591 197 aa, chain - ## HITS:1 COG:no KEGG:ELI_0234 NR:ns ## KEGG: ELI_0234 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 15 197 12 195 774 139 40.0 6e-32 MGIFTDRFTLLKKADKPLYTAPQSVQQQIEILRIAKDGIFENTKNRYSRVYRFADINYVT VSEGDQYELLKQYCKLVNAIDVAFKITIYNKNKDMSQFRRNVLIPDKNDGFDHLRDVYND IVEERIVAGKQGIEQERYLTVTVERKDYEQAKAFFNTFEGGMQQEFQEIGSKLEPLDAAE RLRILYNFYRIGEEETF >gi|229784022|gb|GG667713.1| GENE 6 5170 - 5664 404 164 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623395|ref|ZP_06116330.1| ## NR: gi|266623395|ref|ZP_06116330.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 164 1 164 164 324 100.0 2e-87 MDIEISRDLQHYKESFVMGLTAKQFIFSALSLGAGAGVVLGLYNKIGMTLSCYLATPIVV PIALTGFYNHNGLSFWQTVRKMIKMSFFNKPCLYRSTEDVELIKEAFAEEQKLIAQQQKK 
EAKAHPDGKEDFHAVKKRAKRLIILTVGILTVTVAGVLAYKFMR >gi|229784022|gb|GG667713.1| GENE 7 5665 - 6633 854 322 aa, chain - ## HITS:1 COG:all4411 KEGG:ns NR:ns ## COG: all4411 COG0863 # Protein_GI_number: 17231903 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Nostoc sp. PCC 7120 # 77 313 6 253 295 65 26.0 2e-10 MEAAAQDKGIQMTLFEYFADADQFSLDDAKECVYQYAEKQVKEPSIRARIYEGINKGLFK RLGKGVYTVTKKNAEDEDITCMLIQGDGRDISFIADNSIDAIITDHPYLLKNSLKGGNRD FASYDLFQYTQEDLDEKFRVLKKGHFLVEFLPEENGDNYEYLYQVKAMAKESGFSYYAKV AWKKGNFVANTGRKAKNTEDIFFFSKGRARDMRPDAKKDKAEPGTCHYMSGVKGMLPTAF DIQPPPKGERVHQAEKPVKLLKQIIEFVTNEKELILDQYAGSFSLAEAALDLDRDSISIE ISQDYFEEGKKRIENVKKGKMR >gi|229784022|gb|GG667713.1| GENE 8 6652 - 7494 538 280 aa, chain - ## HITS:1 COG:no KEGG:ELI_0231 NR:ns ## KEGG: ELI_0231 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 10 280 7 279 279 115 29.0 3e-24 MDIFNLGEAILSLLEVVFGFWNSQVNMVFELLGQSPTSFKGGAPWGIITGIEPIFVAVGS SLVVLFFVIGFCSESIDVREEMRFEVILRMLIRIGIAQWLVVYNVDIMRAIFTSVGNLVG LIGGSEATQVTIDPAQADIIKNLGFVESLVFLILAVFLCLIILICGFFLIYNVYFRFLKI LIVVPFGSIAYATVAGNRGVSATSVQFTKYFLSVVFEAVTMALAIILCNAFISSGLPQFM GNYADWTKTLIYLGEMTFTVALTVGSVKGAQSLTSKALGL >gi|229784022|gb|GG667713.1| GENE 9 8488 - 9168 285 226 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623398|ref|ZP_06116333.1| ## NR: gi|266623398|ref|ZP_06116333.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 226 1 226 226 464 100.0 1e-129 MLKLLHVSFLIEIEGRRGETVQEGGNPKFRKECKDLFTDMGFHVKEACKVREAFRNLHTS LGLETDDDLVQRGDECLRIMEQSIDGNLFSSSLSEIEEKLQSGKTFHYKALKSYGTVYIP STIEARDYYNKHLDAYEKTVIQFMEISHERLNAAWISNPVPYYKGDIRICLGHDVIYFFS FKIDEVLKDMVNKGLIISDGAQVAPRFALPESYQENKRGYPYRKGR >gi|229784022|gb|GG667713.1| GENE 10 9182 - 9520 358 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623399|ref|ZP_06116334.1| ## NR: gi|266623399|ref|ZP_06116334.1| trwL6 protein [Clostridium hathewayi DSM 
13479] trwL6 protein [Clostridium hathewayi DSM 13479] # 1 112 1 112 112 126 100.0 5e-28 MKKLTEKKNSIKKMIDKAGKVAAGAAFATVLWTTNVYASGADVVTAPLNNLKNLVIAIIG AVGVIILAKNVMEFAQSYQQQDSSSMNSAIKGIVAGLIMAGISTVLTFLGIS >gi|229784022|gb|GG667713.1| GENE 11 9648 - 11159 654 503 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 16 343 246 558 591 164 34.0 3e-40 MYIRRESDVVKLITNLIANTTPKNSQSNDPFWEKAESLYLQALFYYVWLECPRSEQNFST VLKLLGEAEVRDDGSPSDLDRRIKRLERKKGSNHPAVKQYYKVIRGAADTVRSIVISANS RLAFLENPQILRILDHDDMNIPEIGVGRNGDGKTKTALFCVIPDSDKTNNSIVGLLYSQI FQELYFQADFIYEGRLPIHVTFMLDEFANVALPDDFCSLLSTMRSREISSIIIIQNLAQI KALFEKTWEVIPGNCDTLVYLGGNEQSTHEYISKMLGKATIDKRSSGETRGKNGSSSRNF DVLGRELLTPDEARKLDNKKCLIFIRGFDPIIDSKYNTPGHKLYGETFDGGAPPYHHTPN RKDNVIPTCELLNRKSFEYYERQRDQGLNVHIDEITLDELLALQEPDMPAKVFSEDELKK NRKISEIDNLEKPDSGQDSGQITEEQFNEALLNENYSLEQLEEIRLGLVGGLSYQDILTH FKAESSAGEMREIRGQLTGLPES >gi|229784022|gb|GG667713.1| GENE 12 12202 - 12708 340 168 aa, chain - ## HITS:1 COG:no KEGG:ELI_0208 NR:ns ## KEGG: ELI_0208 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: Bacterial secretion system [PATH:elm03070] # 39 168 37 165 502 104 38.0 1e-21 MESKKKTPVWLLVLGGILAAYGGYLLNGIWEKGIDINTFMERLNLVMAHPIGNYFNGTTL KGILLAEFVYVIAIAMYLTSRRNYMPGKEYGTAVFANINQVNQALSEKDETENRILSQNV RMRMDTRKTKLNLNTLVIGGSGAGKSFYFVKPNLLQLNRSSYIITDPK >gi|229784022|gb|GG667713.1| GENE 13 12708 - 13565 561 285 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623402|ref|ZP_06116337.1| ## NR: gi|266623402|ref|ZP_06116337.1| hypothetical protein CLOSTHATH_04693 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04693 [Clostridium hathewayi DSM 13479] # 1 285 5 289 289 508 99.0 1e-142 MNQIIGEVSGGRIESIQDYIDNGEENELKETYGIEKGGKENPNGEIAHKEPSISPFGFRA 
VGTFLAETPGASVGEISNAIDMPWAEVEPILQHMEDLGILEINEEGSTAFKMDTGEFDEY VRLDKWQKTEAMEPIIQGMGQPQKTEQMSEMNDRQKKVLQHAKEASKGEPNIHQIFIARS LAVEETDNRILTRIPGKRNEYLWIEKNAISHISEEKRTIFASLQKDKDYSIVNNEGRVIG KRTGEQLYSQSYDPVLKETRDMLYRKKEETRKRQMASQMKKRGGR >gi|229784022|gb|GG667713.1| GENE 14 14504 - 14773 209 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167747640|ref|ZP_02419767.1| ## NR: gi|167747640|ref|ZP_02419767.1| hypothetical protein ANACAC_02361 [Anaerostipes caccae DSM 14662] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] hypothetical protein ANACAC_02361 [Anaerostipes caccae DSM 14662] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] # 1 89 4 92 404 153 88.0 3e-36 MGEIQEATQILRVSFEGIEIFMKIIGGGIHTAKDIGKIFGKLVEMERLYGRTSVKDLLKT GGDLQVFRFDTVDIQKVKKLADKYKIRYA >gi|229784022|gb|GG667713.1| GENE 15 14779 - 15210 443 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623404|ref|ZP_06116339.1| ## NR: gi|266623404|ref|ZP_06116339.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 143 1 143 143 226 100.0 4e-58 MSEAFNIKMVRECYYMMQLMEQQDFTFTQDDKRLLLGYAFHQRDLDCVHDAVIHIAAVRE KSQEDLDSGIIEQYSIRGKSELQGKIVEYIIQLEVANINQERANKLLMEILRDKNVDYEL DRMITELQKQDEEKKRENEVSRR >gi|229784022|gb|GG667713.1| GENE 16 15203 - 16180 883 325 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623405|ref|ZP_06116340.1| ## NR: gi|266623405|ref|ZP_06116340.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 325 1 325 325 533 100.0 1e-150 MTVEEQKYVEECRKKLGPEQVQVIQMAFKYGLGIKDVRRLAIPKLNAEQMRQTVYAVLEN VDSELIELCCQGTFDQYQIPEIVSGSVSGLTKEEILSYATPDLPASRMKKMRSQLIEAKK NASDTAEGTAFREYTENLIKMMETSLQQFKENNEKFEVLSSLVKEHVLDEKNQEIKDLYD NLKEKDNLIKKLQEEAAGREAQIKDLEAKLSDVKVQTVQGAEKSIEIERAGKKPVPTSYW RPEPPQTAKKTLLDKLFPQKTPDILDKIAAEGLSSEQLEEIRSAFDSGLSDMEVMRIIKK DLTADKMRKMREIMMLVRERRLADE >gi|229784022|gb|GG667713.1| GENE 17 16241 - 16846 91 201 aa, 
chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1154 NR:ns ## KEGG: MGAS2096_Spy1154 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 4 179 42 217 231 169 47.0 8e-41 MKTFLIVWPDHREHMRTPEFKEEFDQVIEALRSDRYEGILEDRFGAVKYGGNHPGSAFEG KYNTEYGIRVDTDEHIYLLRCDAFRMNFYCSCYEGKRLDEHIRRAERGIRFVDYDGEDLF RIPDGEKIVVTTKLNEAIEHKCRYIDDEHVEVGSNLYHINEFEEKMHRNGAVYRPLREKA VERQEVQAGGKKNRVDSQKVR >gi|229784022|gb|GG667713.1| GENE 18 16970 - 17452 304 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623407|ref|ZP_06116342.1| ## NR: gi|266623407|ref|ZP_06116342.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 160 1 160 160 300 100.0 2e-80 MKYQFLDNAGNVTEAEQESYADAQSYAESHNLMCLSMPYDLFVSSFRKKIKLLESHAQYS WLSEYAEKIIRMNQGFGYQAIAANDFMEKIIIMPLADLQGWLEENAISEGKKGTVKPYQQ KMINGLTGSEKGITIEIPVGMGKAKSNLQNNATKKARKGR >gi|229784022|gb|GG667713.1| GENE 19 17476 - 18048 370 190 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623408|ref|ZP_06116343.1| ## NR: gi|266623408|ref|ZP_06116343.1| hypothetical protein CLOSTHATH_04699 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04699 [Clostridium hathewayi DSM 13479] # 1 190 1 190 190 370 100.0 1e-101 MIYENKKIEALASEVTGYLQKYSLGEAYIESFRERVPAGMRIPGIKEVLGNHQLRNNLTC FLLYRSSLLPESARKQGISLTDKIDAVGTVREADKHGEKQELAWEINEVMKNVAFPFYIE AGYAKAWDDQSDPVRRIYRDIGIKEKRQSMIDCLESALDKGYGAGKRISDMIASLEKIAD KVPDRLSKGR >gi|229784022|gb|GG667713.1| GENE 20 18038 - 19201 600 387 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01797 NR:ns ## KEGG: EUBELI_01797 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 320 96 395 608 108 26.0 3e-22 MKEYLGKEFEAVYALHDDTDHIHGHIVFNSVRCVEQGLKYDYRNGDWDNIIQPLVNKICE EYHMATLDMDKVRENRKHKRDKTWDQSKDGIFVWNDMIRRDIDQVVAQAENWDEFLSGME ERGYEIKHGAHTLIKPPGMDKGRRLDTLKGDYTEEKIRERLCNPMREEDLPKPVLKLPPK IKKISGYIPRRKSELTGYQKIYFARLYRLGVIKKQPYSNAWKYKEDIRRLHELQEEYNFL 
SAYQIYTDHDLDNVLKVLAGQAKALRQEKRQLKDLEDANPEAFALWEQIQELSIEVSLYE EGYQEFKGEYQRSEQLKKQMSDLGISFSETVQLFQDSQGKIRRIDEVLAGVKRQQRIGKK ILQEQKERLQSKKQHIDKGRGEPHYDL >gi|229784022|gb|GG667713.1| GENE 21 19127 - 19303 105 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPVDMVRIIMQGINSLKLFTQVFLHKFTYDLHGNIFINGICRLFKRYDNMVSLTALAS >gi|229784022|gb|GG667713.1| GENE 22 20216 - 20383 176 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167747645|ref|ZP_02419772.1| ## NR: gi|167747645|ref|ZP_02419772.1| hypothetical protein ANACAC_02366 [Anaerostipes caccae DSM 14662] hypothetical protein ANACAC_02366 [Anaerostipes caccae DSM 14662] # 1 55 1 55 484 103 83.0 4e-21 MAVSKILHMGCAKSGFKAKYLANAISYITKDFKSENGLYVSGYNCLPETALNQML >gi|229784022|gb|GG667713.1| GENE 23 20386 - 20616 199 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623411|ref|ZP_06116346.1| ## NR: gi|266623411|ref|ZP_06116346.1| relaxosome component [Clostridium hathewayi DSM 13479] relaxosome component [Clostridium hathewayi DSM 13479] # 1 76 56 131 131 140 100.0 3e-32 MRMLITNRPRDYPDLLEAMQSLTNEVNHIGININQITKNNNSGLYHESDKKRLYVYMKQI KEAVKQVVSLLESAGT >gi|229784022|gb|GG667713.1| GENE 24 20813 - 21253 246 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623412|ref|ZP_06116347.1| ## NR: gi|266623412|ref|ZP_06116347.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 146 1 146 146 278 100.0 1e-73 MGRIDTPDELREYLDEFDILLPLTAEEAEKVLEYIKNSGYTLETDGYGQLYRTDLENGEC LETDIDHMIDDACESNYEMISDIRDYFVFCGGKERDNLFQVLQGLLSDEKILNTAFSRTY FQKELQVRLHGVLPAVEITAGRRVIR >gi|229784022|gb|GG667713.1| GENE 25 21561 - 22076 167 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871095|ref|ZP_06116349.2| ## NR: gi|288871095|ref|ZP_06116349.2| hypothetical protein CLOSTHATH_04705 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04705 [Clostridium hathewayi DSM 13479] # 1 171 70 240 240 327 100.0 2e-88 MKEVIEALEQWPVISVKRLNMAIEKRREDSVCDVYELYYILMKQDHEGNPAYLACIKCDE 
GEMEFIDPFSYKTETFSEEECMTNQKLFQLLEEGYQIVEITKEASLVIWDNLKETDTNEL KYMQGFGRYAAFCLENDLFPVRESGGLQNTLQNRDEDSMKIVRSQHKIKSR >gi|229784022|gb|GG667713.1| GENE 26 22280 - 22561 267 93 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623415|ref|ZP_06116350.1| ## NR: gi|266623415|ref|ZP_06116350.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 93 27 119 119 165 98.0 1e-39 MQYEQEDFEQLKKKVTELFEKVPVEKAFLDFTYVPGGDVLDSLKLADTRTELLIFLLDKS KEAADRETIDLIYDALNGLAEFDNLKHGMVLDH >gi|229784022|gb|GG667713.1| GENE 27 23038 - 23766 574 242 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871096|ref|ZP_06116351.2| ## NR: gi|288871096|ref|ZP_06116351.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 242 1 242 242 489 100.0 1e-137 MPNQVDRICDYFDRVRLIKKIEQLDNVTIWAANLEKSAKKLEENETPYVTVGDIAVYFKF FDEVYEVSKEDGEVWPNDYIRPVCNKHLEQWGFTAQELFDKAKYFDVGERLQTVLDASKD AFKRVLGSYEQLLPDTYGIYSITGENGIGMMFYPVVLSQVAQRLGGDLFVIPILNDRILV CSKNEYNLEKLQEVVKMKSKEDKLYGLIPLHVSDMVFQFDAHRLTLQPATDQREKKMKHT AR >gi|229784022|gb|GG667713.1| GENE 28 24767 - 25525 529 252 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623417|ref|ZP_06116352.1| ## NR: gi|266623417|ref|ZP_06116352.1| hypothetical protein CLOSTHATH_04708 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04708 [Clostridium hathewayi DSM 13479] # 1 250 1 250 250 514 100.0 1e-144 MNLKLYKQFCGAYQQCVQAALEDSHIQGDVKIVAKGLPTEGITMLFEKKERYCPVISMRE AFLRMEMLNMWEEELVDRAIEVFSDSQTADLKIAVSQLEQPENVAIYAANHRLDLDALKN NDMPYLVMGNVVFYYMFYAPLGDGRYYQAEVTNRHLKNWNMRKGELHSFVREHSLVSMES RIYPLHNMIENMLDGIPCVLNDGEAEDTLGYTVTGPGGAATILYWDVLRELQQLMHDSFY ILPYTEREAYIS >gi|229784022|gb|GG667713.1| GENE 29 25588 - 25914 167 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871097|ref|ZP_06116353.2| ## NR: gi|288871097|ref|ZP_06116353.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein 
[Clostridium hathewayi DSM 13479] # 1 108 160 267 267 215 100.0 9e-55 MTTESFYWGESAMLYKDKVHELADKLGGDLILFPNSINQCLALPKENREGKKWIQETLYD SKKNGYQQRDKDLSDFIYTYSQKMDEIQKTSQRVRQKPKPQMNKRQGR >gi|229784022|gb|GG667713.1| GENE 30 26502 - 27434 491 310 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01204 NR:ns ## KEGG: EUBELI_01204 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 282 1 295 298 96 26.0 1e-18 MDYEEFIKYFTHEISYNLKMLGVTWRASPVIGEGKEGAMAVIRGGGYADLPPISIYEYYQ RFLQGTLWGVAAREATQEYFDDKITRSALEDMGVFAPDMVIPTFANVKENAERLEQIPHI RFEDLAVVLQVAYQSLGGINLNMTVTDDILEEWELSFDELYQRSLDNPRNIESVQMVELY KMLDEMASYEGAAKNLPLFYVITNTGQYLGAASILYKEKMRELADSLGEDILLVPMSIDQ FIAMPLLDSSGVRYIANVLRESNGTMPLSENLYHYNCRTEQFRMISDVPKETKKRGRYGA IIWQVDMQQY >gi|229784022|gb|GG667713.1| GENE 31 27424 - 27744 111 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623420|ref|ZP_06116355.1| ## NR: gi|266623420|ref|ZP_06116355.1| hypothetical protein CLOSTHATH_04711 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04711 [Clostridium hathewayi DSM 13479] # 1 106 1 106 106 207 100.0 2e-52 MEPTDWKVFDQIINAPYSGRKAALFPAGLVQETWVQTKMRKSIEHTATRLSLSKHVYMYD NGKLNRFKDIGDMLSLVQGTSEYKKKRPVRNREGEITTIRRQTDGL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:05:01 2011 Seq name: gi|229784021|gb|GG667714.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld107, whole genome shotgun sequence Length of sequence - 22895 bp Number of predicted genes - 28, with homology - 25 Number of transcription units - 13, operones - 7 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1118 503 ## Closa_3672 integrase family protein - Prom 1161 - 1220 4.3 - Term 1207 - 1244 6.2 2 1 Op 2 . - CDS 1266 - 2165 472 ## Closa_3673 integrase family protein - Prom 2303 - 2362 21.1 + Prom 3238 - 3297 25.1 3 2 Tu 1 . + CDS 3342 - 3443 68 ## + Term 3520 - 3558 0.7 4 3 Tu 1 . 
- CDS 3550 - 3687 131 ## - Prom 3766 - 3825 5.6 + Prom 3780 - 3839 6.8 5 4 Op 1 . + CDS 3885 - 4640 191 ## ELI_0221 putative ribonuclease III 6 4 Op 2 . + CDS 4637 - 5224 366 ## ELI_0222 hypothetical protein 7 4 Op 3 . + CDS 5299 - 6411 901 ## ELI_0223 hypothetical protein + Term 6422 - 6468 9.0 + Prom 6413 - 6472 1.7 8 5 Op 1 . + CDS 6492 - 6950 480 ## ELI_0224 hypothetical protein 9 5 Op 2 . + CDS 6981 - 7568 348 ## ELI_0225 hypothetical protein 10 5 Op 3 . + CDS 7609 - 8814 584 ## ELI_0226 hypothetical protein 11 5 Op 4 . + CDS 8884 - 9429 408 ## ELI_0227 hypothetical protein 12 5 Op 5 . + CDS 9489 - 9902 136 ## ELI_0228 hypothetical protein + Term 9970 - 10007 5.4 13 6 Tu 1 . - CDS 10087 - 10500 536 ## Ethha_1930 sigma-70 region 4 type 2 - Prom 10720 - 10779 80.4 14 7 Op 1 . - CDS 11619 - 12746 590 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 15 7 Op 2 . - CDS 12761 - 13000 149 ## gi|210614635|ref|ZP_03290246.1| hypothetical protein CLONEX_02460 - Prom 13040 - 13099 7.2 - Term 13069 - 13108 6.3 16 8 Tu 1 . - CDS 13110 - 13757 577 ## COG1309 Transcriptional regulator - Prom 13874 - 13933 5.5 - Term 13913 - 13969 1.6 17 9 Tu 1 . - CDS 13984 - 14082 72 ## - Term 15227 - 15269 10.0 18 10 Op 1 . - CDS 15348 - 15677 147 ## gi|266623438|ref|ZP_06116373.1| sigma-54 dependent response regulator 19 10 Op 2 . - CDS 15709 - 16200 252 ## gi|266623439|ref|ZP_06116374.1| hypothetical protein CLOSTHATH_04732 20 10 Op 3 . - CDS 16211 - 16507 208 ## gi|266623440|ref|ZP_06116375.1| conserved hypothetical protein 21 10 Op 4 . - CDS 16512 - 17276 415 ## COG4200 Uncharacterized protein conserved in bacteria 22 10 Op 5 . - CDS 17292 - 18017 686 ## HMPREF0421_21362 MutG family lantibiotic protection ABC transporter permease 23 10 Op 6 . 
- CDS 18014 - 18778 259 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 18798 - 18857 5.7 + Prom 18849 - 18908 6.9 24 11 Op 1 40/0.000 + CDS 19001 - 19636 419 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 19702 - 19736 -0.4 + Prom 19683 - 19742 2.1 25 11 Op 2 . + CDS 19788 - 21029 285 ## COG0642 Signal transduction histidine kinase + Term 21067 - 21110 1.6 - Term 21141 - 21196 17.7 26 12 Tu 1 . - CDS 21212 - 21625 469 ## Ethha_1930 sigma-70 region 4 type 2 - Term 22086 - 22142 8.1 27 13 Op 1 . - CDS 22320 - 22574 184 ## Closa_3192 hypothetical protein 28 13 Op 2 . - CDS 22633 - 22815 118 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains Predicted protein(s) >gi|229784021|gb|GG667714.1| GENE 1 2 - 1118 503 372 aa, chain - ## HITS:1 COG:no KEGG:Closa_3672 NR:ns ## KEGG: Closa_3672 # Name: not_defined # Def: integrase family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 372 1 371 402 501 65.0 1e-140 MAKKKTQIPKYGTITLKGIQYYRTRITDADGKELSLYAATCEELYEKQLEARKQVEEIIF HRQHPTVAEYGEKWLLMQSAKVSASTLRGYTRDMTNYIIKPLGEMYMEEVTADDIRLALV PLSKKSEGLYNKVNMLLKCIFYTAERNQILEHNPCAGISGKGGKPTKKKEALTDQQVAVL LDTVKGLPPYLFIMLGLYSGLRREEILALQWDCVFLDEDTPYLSVRRAWRTEHNRPVIST VLKTPAAKRDIPIPKCLVDCLRETKENSISDYVIADSKGEPLAASQFQRVWQYVVVRSTK PRNYYKYVNGESIKYTVTPTLGMTQKNQPKIKYTLDFDVTPHQLRHTYITNLLYAGVGPK TVQYLAGHENSK >gi|229784021|gb|GG667714.1| GENE 2 1266 - 2165 472 299 aa, chain - ## HITS:1 COG:no KEGG:Closa_3673 NR:ns ## KEGG: Closa_3673 # Name: not_defined # Def: integrase family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 297 107 403 404 478 80.0 1e-133 MGEVSLDDIQLALVPVSKKSASVYKSVVILYKSIFRAAMESRIIDHNPTIYLTTKGGGVP QEDRQALTDEQVERLLDAIRDLPPYVFVMIGLYAGLRREEILALQWDSVYLDTDTPYLTV RRAWHTEHNRPVISDELKTKAAERNIPLPVCLAECLKAAKETSTSEYVVSNRDGEPLSYT QFKRLWQYIVTRTVKERSYYRYEDGKRVKHTVTPVLGEKAAHNGKVVYSLDFEVTPHQLR 
HTYITNLIHASVDPKTVQYLAGHESSKITMDIYAKVKYNRPDELVRSMNCAFASWDAAQ >gi|229784021|gb|GG667714.1| GENE 3 3342 - 3443 68 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRRRVPPKSLGGREEVSLSPRAVLAPDLPPSVL >gi|229784021|gb|GG667714.1| GENE 4 3550 - 3687 131 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSYFASYNKNILIKFRLKRLFFALFVRVNKKDKFSPFNSTLTIT >gi|229784021|gb|GG667714.1| GENE 5 3885 - 4640 191 251 aa, chain + ## HITS:1 COG:no KEGG:ELI_0221 NR:ns ## KEGG: ELI_0221 # Name: not_defined # Def: putative ribonuclease III # Organism: E.limosum # Pathway: not_defined # 1 251 28 278 278 476 93.0 1e-133 MIGDIAIMRNPWPIFSIAALSGAACLIYDHFFSYETMWIAVVVFLALLALCFFAYCKYEH WLAKKLNPYVVAYQSDHNLENLKQGLDRWRPWALSKHSKNIITANWFSALLEQQRWTEAE ETLEQFLHRAKNTQDKLGYHLLREDYARAIGDTELEKQERLLSEQLKTQLENKKGESRIP ASAKESKYAFFCWLSFSVGLALFGIISNIATGDSILGSFGAGVFILGLFSFPVCLIWLIL WLIRRRKEAAK >gi|229784021|gb|GG667714.1| GENE 6 4637 - 5224 366 195 aa, chain + ## HITS:1 COG:no KEGG:ELI_0222 NR:ns ## KEGG: ELI_0222 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 195 1 195 195 253 95.0 3e-66 MTYILPVLYVIVSYTFFLLAVHFGNAITLQIVSILLPFIMGIVNLIVVLTVGRKWSRKTL LNCTLIIKYGLIPFYLIGGSITVYVTLMAFFPLPLMALFGLVTIVFLILGYGILLGAAPY AIAYLIKSCKDGIHPKWLAVLAGICQFFFSFDVLAMMVLTLKERHRVKTTIAVFCAMCLV LLLIVLYVVMTLIGA >gi|229784021|gb|GG667714.1| GENE 7 5299 - 6411 901 370 aa, chain + ## HITS:1 COG:no KEGG:ELI_0223 NR:ns ## KEGG: ELI_0223 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 370 1 370 370 694 100.0 0 MKKQWIALLTGCVLVCSMLAGCGSKPQTSAPAAAGSDKGCTDEAILSQLKNDGTPEFVLD GTYYTLPLETSQLIQAGWSLETEEYDAAEVSLQPGERIYGELSKDDQEIDVAIVNAGTEP CKPAEGGTVVELEYSADKDQPNPDFFVTLNGINCAMSNAALQKALEGVDGYKLNSAGNID INRTVDDDEIAVYKVILDDDYTAITLASDNVFEYKDYQPQEVKEQASNEKIAAYKDTTKA EMEPYAQDFNAIIEGYEENMVVGFYSEGTIWGEEVGEYKSIAGTPLASDVSLYIAEDTTG QLYCIANQILNEDGTVSTAIELEEGDQVKVWGYASQYLELQEGLKIVVVQPGIIERNGEL VILDKNLKAD >gi|229784021|gb|GG667714.1| GENE 8 6492 - 6950 480 152 aa, chain + ## HITS:1 
COG:no KEGG:ELI_0224 NR:ns ## KEGG: ELI_0224 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 152 1 152 152 247 98.0 1e-64 MLTMNEKDKNEMLEAILNENEAYQCKLWAVIMAGADTYALIGGLSTLTGGAAAALGALSN AYCYLGVTEQHLNMVIVNSVNVSKIENRLSLPLSSITKAEVKGGLLPGRKVVMLHFGKEK MKISLMNNAIGSDIQGQKENVEMFCQIVSKLG >gi|229784021|gb|GG667714.1| GENE 9 6981 - 7568 348 195 aa, chain + ## HITS:1 COG:no KEGG:ELI_0225 NR:ns ## KEGG: ELI_0225 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 195 1 195 195 339 96.0 3e-92 MKKILIVLLAMVLLLAGCGKNGTSSPQSPSDESVTDQQEPSSSETEISSEDHAIPGTYTV PEGWEKSEKHSTDSQIFYIEEGHENDEKPDNISIHVGKNKYSLEEHEQFRDAIVQQILMQ LDGIEAQLNGDGTYTEQGDLLYIFTIDEGDIVTTQYYIVKDYGFCLIQLTNFSGSETTEQ AARDMVDSFVLDTEG >gi|229784021|gb|GG667714.1| GENE 10 7609 - 8814 584 401 aa, chain + ## HITS:1 COG:no KEGG:ELI_0226 NR:ns ## KEGG: ELI_0226 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 389 1 389 401 738 98.0 0 MEKQKLSGKTILKLVGALLALAYLGFCLYAGITGKFPNMHKAEHVMRFGDLKNWVSGGIS AVMIFALCITSAFNDLNLQKGRKKSFLPLFLTMILPIILLLLLESYLDIGWIGIALYAGI AVFLFGVAWTPQLLNQWIARRLLRDEPMSGKQQVQIIEWIHFHPLKKLLIFGLWGFGVVL VLRGIWVTATGKIEMDGELFLFLLGTAVLVVALFQKMRRYICAPYRNIPVLNQILSKNQL EQLLDGEHFEPIEFEDEGMKKYLEIYQSQNWMLIGGKLLSKKLALWFTMNRFRNHTSLKV LYLNGMTAKAKVDLDIRGDRYTEFTAVLKELIGYEGTLKLYEKEEQLAQKFASFFPEQTS EQDRVAAFLSQDVTLIRQDYIQAFSPPPDSRKKKRSRRERE >gi|229784021|gb|GG667714.1| GENE 11 8884 - 9429 408 181 aa, chain + ## HITS:1 COG:no KEGG:ELI_0227 NR:ns ## KEGG: ELI_0227 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 181 1 181 181 343 100.0 1e-93 MNYDMNMIKYRKNGFFRIAAAILIICFTLLGLTACAGTTDSKDNNDDNALIQGTWEIDTG SGAGYKFVDDKFMWLKSIENVNDNYWYGDVEYYKGAEAMDIAGLTEEELKSSLPGLKPEN IFVTKLDPEKIITDGEDKTATNMNDQTLWTRLWLIEENEDSVAAVVFDLETFSMESYTKV K >gi|229784021|gb|GG667714.1| GENE 12 9489 - 9902 136 137 aa, chain + ## HITS:1 COG:no KEGG:ELI_0228 NR:ns ## KEGG: ELI_0228 # Name: 
not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 137 1 137 137 266 100.0 2e-70 MVGFKVKYYWFQIGSEDFLFSFFSTVAYNLENKKWGTKFPVIMNELYLGELNSSRIPEAI NELEQMKQGLAKYSPNKVIWNIKNLSAQPPWGTNISDDITDLSNYFVTSDGEDFITIFNH ALLKAEELNCALKITSI >gi|229784021|gb|GG667714.1| GENE 13 10087 - 10500 536 137 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1930 NR:ns ## KEGG: Ethha_1930 # Name: not_defined # Def: sigma-70 region 4 type 2 # Organism: E.harbinense # Pathway: not_defined # 1 137 1 137 141 116 49.0 3e-25 MKKVNLRDLYPDVYKNDHFVEVTEDVLETIRAAERAEAAHDRRMYRYKAHYSLDCDNGIE NAILMKPQTPEMLLEEKQLREQLYAAVMALPEKQAKRIYARYYLGMRVGEIAAAEGVDPS RVRDSIRRGLKQLAKYF >gi|229784021|gb|GG667714.1| GENE 14 11619 - 12746 590 375 aa, chain - ## HITS:1 COG:BS_ygaD KEGG:ns NR:ns ## COG: BS_ygaD COG1132 # Protein_GI_number: 16077935 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Bacillus subtilis # 2 371 17 397 589 112 24.0 1e-24 MIRYLKQYKFLLFLTVLFTAISSLSYVFIAILLQQVMDIVDMGDMDGFIRILLFSLAYFA LLGVFMYLQSLFSKKFVCKIMKSVRSKTFMGIEKHTIEDYSKNNTADYLSAITNDVKMIE DNFLLPLLQVIQYTIIFIASLAVMIYFDIIVTVCVIIAISIMLIVPSLFGGLLSKRQDKY SDMLSSFTNHVKDLLSGFEIIKSYRMKEYVLSRFETSNEDTIKAKYSVDKAIAANEAVSM VLALLVQVVVVFLSAYFIIIGRISAGALLGMVQVSSNLANPLLIIFSNIPKIKSIKPIAQ KLNEFSDYRKQDTDTKKSPTFNEMICVEDLHFSYDKENEVISGISLSIKKGKKYAFVGKS GCGKSTLIELIAVAS >gi|229784021|gb|GG667714.1| GENE 15 12761 - 13000 149 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210614635|ref|ZP_03290246.1| ## NR: gi|210614635|ref|ZP_03290246.1| hypothetical protein CLONEX_02460 [Clostridium nexile DSM 1787] hypothetical protein CLONEX_02460 [Clostridium nexile DSM 1787] # 1 79 1 79 79 133 98.0 4e-30 MSNLDQKIENAANKIEVAANKAWKKRSFRIVSKSVSAAAEIGLMIGAAHLSEKGYKSTAT CLFALGATGLVCDLIFNRK >gi|229784021|gb|GG667714.1| GENE 16 13110 - 13757 577 215 aa, chain - ## HITS:1 COG:CAC3606 KEGG:ns NR:ns ## COG: CAC3606 COG1309 # Protein_GI_number: 15896840 # Func_class: K Transcription # Function: 
Transcriptional regulator # Organism: Clostridium acetobutylicum # 21 188 22 187 202 73 29.0 3e-13 MARNKYPEVTVEKILEVSQRLFMEKGYDNTTIQDIVNELGGLTKGAIYHHFKSKEEIIDA LGEKLFFDNNPFVTVQEQKHLNGLEKMREVIKLNHIDIDRTELGKQCISLLKNPRLLAEL ADTNRKLIAPLWLQLIEEGIEDGSIQTEYAKELSELLPLLTNFWLIPSVYPATPEELLSK FAFIKYLLDSMNLPIIDDEIFGMAQNYFEHLNVGE >gi|229784021|gb|GG667714.1| GENE 17 13984 - 14082 72 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRVSEIAAAEGVDPSRVRDSIRRGLKQLAKYF >gi|229784021|gb|GG667714.1| GENE 18 15348 - 15677 147 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623438|ref|ZP_06116373.1| ## NR: gi|266623438|ref|ZP_06116373.1| sigma-54 dependent response regulator [Clostridium hathewayi DSM 13479] sigma-54 dependent response regulator [Clostridium hathewayi DSM 13479] # 1 109 7 115 115 197 99.0 2e-49 MDDNTEYLQILSSVLSGEFETIKATGVEDALDTLQTITVDAICSDFNMKDGTGLDLLEKI RQQGLKTPFLLMSGDNSRALEQKAKLYYGDFCCKTDFDLVEKIKALVNN >gi|229784021|gb|GG667714.1| GENE 19 15709 - 16200 252 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623439|ref|ZP_06116374.1| ## NR: gi|266623439|ref|ZP_06116374.1| hypothetical protein CLOSTHATH_04732 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04732 [Clostridium hathewayi DSM 13479] # 1 163 1 163 163 301 100.0 1e-80 MRKIGKIIGCCLIAVSLLACTACSGLDRLGENLKDGAEQLSNDVVIGKSNALITPYQSFE GSRTSDNDVFRATYDASVTGFNGQDILVADTDLKENECREIVIRYQLDSLDGNCQLIYIS PDLEEQVLSESASGSVAVQLQAGANYIGIIGTDFSGTIQITVE >gi|229784021|gb|GG667714.1| GENE 20 16211 - 16507 208 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623440|ref|ZP_06116375.1| ## NR: gi|266623440|ref|ZP_06116375.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 98 1 98 98 122 100.0 8e-27 MEKKKLQSVSVAALIVSILPLAALAPSLLHFSLSAGIRTAWAGANTVFVLLGLTLSVVCV RSKESRSIINIASTAISVFWALLMVGIIALALFLTFVQ >gi|229784021|gb|GG667714.1| GENE 21 16512 - 17276 415 254 aa, chain - ## HITS:1 COG:BH0447 KEGG:ns NR:ns ## COG: BH0447 COG4200 # 
Protein_GI_number: 15613010 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 3 252 2 245 247 65 27.0 1e-10 MNFLDLLKIEFMKVRRSKIIPLIFIAPLLVVISGVASLSNYFTPEYTNAWPAMFIQSALV YAYYLLPFSMIVVCVMIAGRETGNNGILKMLALPVSRYTLSIAKFCVLTFYLFMEMVVFL VVFVIAGVVATQTMGITETLPILYLLKWCLGLFLTMLPCIAAMWAITVLFEKPLLSVGLN LLFVIPGVLVANTPLWIVYPYCYSGYLVSCSLHDFTAETSDAAFSMFPFLPCAIVVFGLF FALAVTQFGKKEMR >gi|229784021|gb|GG667714.1| GENE 22 17292 - 18017 686 241 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0421_21362 NR:ns ## KEGG: HMPREF0421_21362 # Name: not_defined # Def: MutG family lantibiotic protection ABC transporter permease # Organism: G.vaginalis_ATCC14019 # Pathway: not_defined # 1 241 1 241 241 297 72.0 2e-79 MKTLAIELRKEKRTGVIPVLLAVGILGAAYAFVNFIVRKDTLLNLPLAPMDVLLTQLYGM IMVLNLFGIVVAACMIYNIEFKGSAVKKMYMLPVSVPAMYLCKFLILTVMFLVAIMLQNL ALAQIGMTDLPNGAFEMGTLVRFAGYSFITSMPVLSFMLLISSRFENMWVPLGIGVAGFL SGMALANSEVALLLVHPFVIILKPAVAMSAQPDLIVTLVALAETLLFLAVGLWLAKYRRY E >gi|229784021|gb|GG667714.1| GENE 23 18014 - 18778 259 254 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 219 1 223 305 104 31 6e-22 MDYIITTEQLTKKYKNFISVNNVSLHIRKGSIYGFLGPNGAGKSTTMKMLLGLTAPTKGS FTIDGKQFLGDRIAILKEIGSFIEAPSFYANLTGRENLDIIRRILGLPKADVEDALELVG LTEFGDRLAKKYSLGMKQRLGLAGALLGRPPILILDEPTNGLDPSGIHEIRNLVKSLPSL YDCTVLISSHMLSEIELIADDIGILNHGRLLFEGSMNDLRQYALQSGFAADNLEDVFLSM VEKDNMDRKQRAKL >gi|229784021|gb|GG667714.1| GENE 24 19001 - 19636 419 211 aa, chain + ## HITS:1 COG:CAC0289 KEGG:ns NR:ns ## COG: CAC0289 COG0745 # Protein_GI_number: 15893581 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 207 21 233 235 183 43.0 2e-46 MVSDILKDAGFETVLTAMSVKEAVLTAKEETPDLIVLDVMLPDGDGFSLMQQLRTFSSVP 
IIFLTAKDEASDKLSGLGLGADDYISKPFMPQELLLRIYAVLRRTYKEDSPLLVLDGCTI DFSRAEVHKGSEVISLTAKEHTLLETLARNAGKIVTVDTLCEALWGDNPFGYENSLNAHV RRVREKIETDPSKPVSLITIKGLGYKLISRK >gi|229784021|gb|GG667714.1| GENE 25 19788 - 21029 285 413 aa, chain + ## HITS:1 COG:BS_yvrG KEGG:ns NR:ns ## COG: BS_yvrG COG0642 # Protein_GI_number: 16080374 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 147 401 303 570 573 125 28.0 2e-28 MLEMTAAAATPERLLDNAVQKLRQNHIWAIYLNADGQCYWSVDLPDEVPESYTIQDVALF SKGYIEDYPVFVWNTDDGLLVLGYPKDSYTKLTSNYYSISALRRLPVFVLGMLGLDVLCL FCAYYFSKRKIIRNTEPIVSAVETLADGKPVSLHISGELSEIANSVNKASSILNRQNEAR ANWISGVSHDIRTPLSMIMGYAGRIAENESASKTIQEQAEIVRKQSVKIKELVQDLNLVS QLEYEMQPLHKEMVRLSKLLRSYVADLLNMGISDSYNIGIEIATDAENTMLECDARLISR AVNNLVQNSMKHNPLGCRILLSLRRTENAIQLIVQDDGIGLSEEKLQELKEKPHYLESTD ERLDLRHGLGLVLVRQIVAAHEGTMTIDSKINCGCKIILTFDSIDTIPKAHLS >gi|229784021|gb|GG667714.1| GENE 26 21212 - 21625 469 137 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1930 NR:ns ## KEGG: Ethha_1930 # Name: not_defined # Def: sigma-70 region 4 type 2 # Organism: E.harbinense # Pathway: not_defined # 1 137 1 137 141 119 51.0 6e-26 MKKVNLRDLYPDVYKTDYFVEVTEDVLETIRAAERAEAAYDRRMYRYKAHYSLDCDNGIE NAILMKPQTPEILLEEKQLREQLYAAVMALPEKQAKRIYARYYLGMRVSEIAAAEGVDSS RVRDSIRRGLKQLAKYF >gi|229784021|gb|GG667714.1| GENE 27 22320 - 22574 184 84 aa, chain - ## HITS:1 COG:no KEGG:Closa_3192 NR:ns ## KEGG: Closa_3192 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 83 56 138 139 73 60.0 3e-12 MKFSPPIQTEVAIERKINPKRMQREIQSRLQDKGIGTKAQQALKLQHEQCKLERKTKSRE QKEAEKDRQFAIRQEKKKAKHRGR >gi|229784021|gb|GG667714.1| GENE 28 22633 - 22815 118 60 aa, chain - ## HITS:1 COG:lin2179 KEGG:ns NR:ns ## COG: lin2179 COG0488 # Protein_GI_number: 16801244 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Listeria innocua # 2 58 462 518 650 68 52.0 
4e-12 MLTLIPCNFLIMDEPTNHLDVQAKEALKSALMDFAGTVLLVSHEEAFYREWAQRVISVEK Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:06:42 2011 Seq name: gi|229784020|gb|GG667715.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld108, whole genome shotgun sequence Length of sequence - 17526 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 6, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 3 - 678 206 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 2 1 Op 2 . - CDS 653 - 2422 958 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 2462 - 2521 8.0 + Prom 2426 - 2485 5.0 3 2 Op 1 7/0.000 + CDS 2623 - 3570 587 ## COG4209 ABC-type polysaccharide transport system, permease component 4 2 Op 2 14/0.000 + CDS 3587 - 4465 856 ## COG0395 ABC-type sugar transport system, permease component 5 2 Op 3 . + CDS 4506 - 6128 1755 ## COG1653 ABC-type sugar transport system, periplasmic component 6 2 Op 4 . + CDS 6170 - 7624 1051 ## COG3119 Arylsulfatase A and related enzymes + Term 7638 - 7684 10.4 + Prom 7721 - 7780 4.0 7 3 Op 1 . + CDS 7822 - 8562 548 ## Clole_3115 VTC domain-containing protein 8 3 Op 2 . + CDS 8555 - 9232 749 ## Sgly_0479 hypothetical protein 9 3 Op 3 . + CDS 9248 - 11002 1435 ## Dalk_4101 hypothetical protein 10 3 Op 4 . + CDS 11080 - 11607 372 ## ELI_1588 acetyltransferase 11 3 Op 5 . + CDS 11675 - 12907 1191 ## BBR47_51500 hypothetical protein + Term 12924 - 12973 15.2 - Term 12911 - 12961 13.0 12 4 Tu 1 . - CDS 12980 - 13591 387 ## COG5015 Uncharacterized conserved protein - Prom 13629 - 13688 5.2 + Prom 13831 - 13890 6.7 13 5 Op 1 . + CDS 13923 - 14321 437 ## BBIF_0169 polyketide cyclase / dehydrase and lipid transferase 14 5 Op 2 3/0.000 + CDS 14363 - 15268 780 ## COG2378 Predicted transcriptional regulator 15 5 Op 3 . 
+ CDS 15356 - 15814 518 ## COG3708 Uncharacterized protein conserved in bacteria 16 5 Op 4 . + CDS 15828 - 16589 762 ## DSY0090 hypothetical protein + Prom 16617 - 16676 7.2 17 6 Op 1 . + CDS 16700 - 17239 642 ## Cphy_0280 hypothetical protein 18 6 Op 2 . + CDS 17249 - 17525 267 ## Cphy_0279 hypothetical protein Predicted protein(s) >gi|229784020|gb|GG667715.1| GENE 1 3 - 678 206 225 aa, chain - ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 133 1 138 530 68 33.0 7e-12 MLQTLLVDDDILSLNKLQTFLADLGFIHVCGQLLDGTSAIEFLKLHGNSVDIIILDMEMP GASGLEVAGYIRKHQLPITILVISNYDNFEYVKPILQAGAYDYLLKHELSRSLLETKLSE IQHHIDKQQLHQRHWNQMCQLSKQQYLRNLVLNYEIPEESRHFFASDPLLNGRRHVAACL QITNFSSIYQSIPDERHQKVVNTVLHFCDTFFTSLQRGIITHINY >gi|229784020|gb|GG667715.1| GENE 2 653 - 2422 958 589 aa, chain - ## HITS:1 COG:FN0190 KEGG:ns NR:ns ## COG: FN0190 COG2972 # Protein_GI_number: 19703535 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Fusobacterium nucleatum # 15 578 8 552 552 127 23.0 4e-29 MEKYPKLVHYFKSHFNARLVLLLLINGIVPVLLISLTTYPFWFRTMRSKELTFTYSRIST INQALDSLTTDIEQNITHIFSNENIRSYLKDNYDEVTVEGHKKHMLIENFLKSISNYGEV ACSYTLIDRYDRIYTNGSNVNRLQNFDGPLCENIKQASPGTYCTERNLYSTDSQKALTYG RVLKEDDEILAVILVDISPELLHSILGDYDGENYQVFIANPNNELIYTNAIHSLSKPEIM EHLLSAQGTTVTLEREPYEMAHASSLLSNYHTYVLVPHTYIYKDSGNIRILFVIILILVF LQTITCSRIISKSLSRKILLIKNELLRFITTREKSVFHHKNNDELKEISDGIVYLENEID KMLEEIQSNAEKQRRLELQTLQQQLNPHMIYNALNTITQLASLQGVKNIEEVSLAFTRML KLVSKNTENFVTLRQEIGFIKDYISIKKYNNFQDITLQCDVADPLLELPVLKLLLQPFIE NCIKHGFSNFEKDGIIFLYAYLEEEQLHIIIEDNGSGIRKDQIDQILSLSCQTEETYSNI GIRTCIERLRLQYGSRFTLSIASDGQTFTRILLSYPVKEDSHASDTSCR >gi|229784020|gb|GG667715.1| GENE 3 2623 - 3570 587 315 aa, chain + ## HITS:1 COG:SP1798 KEGG:ns NR:ns ## COG: SP1798 
COG4209 # Protein_GI_number: 15901627 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 9 311 1 301 305 254 44.0 1e-67 MKNRISNKINRAEKKRRFMKNFRATKTLYLFVVPGLVYYVLFKYWPIYGSQIAFRDYSPF LGFTKSPWVGMQNFERFFRSPDFFLILKNTLLINVTNLILGFPFPIILAVLLNQVRNKGF KKLVQTATYAPYFISVVVLVGMMYTMLSPYSGVVNAVIRFFGGDPIHFMGKPGMYVMIYV LSGIWQTCGWSAIVYIAALASVSPDLHESAIVDGAGRLQRIWHIDLPGIMPTIITMLILT AGQLLTIGFEKSYLMQNSMNLGVSEVISTYTYKRGLVDGDFSYASAVEMMQSVVNFVILI SVNRFSKRFSDTSLW >gi|229784020|gb|GG667715.1| GENE 4 3587 - 4465 856 292 aa, chain + ## HITS:1 COG:SP1797 KEGG:ns NR:ns ## COG: SP1797 COG0395 # Protein_GI_number: 15901626 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 5 292 18 305 305 220 42.0 2e-57 MKAGKKDRVFELVTTVLLAVILIIALYPLYFSLIASISSPRAVYQGRVLFLPADVTLDGY RKIFADQSIWTGFGNGLLYTFVGTLLNVVFTVPLAYALSRKEFPLGGVFMKIMVFTMYFY GGIIPLYFVVKQMGMLDSIWSIVLPTLISTYNMIIARSYFMGSIPEELKEAAFLDGCGYV GFFFRVALPLSKSLIAVMVLFYAAKQWNSYFEPMIYLSSESRFPLQLVLRTILIDSQNSL ASADASTVADMQQLVDLLKYGCVVVSSIPMLVFYPFVQKHFVKGVMLGAVKG >gi|229784020|gb|GG667715.1| GENE 5 4506 - 6128 1755 540 aa, chain + ## HITS:1 COG:SP1796 KEGG:ns NR:ns ## COG: SP1796 COG1653 # Protein_GI_number: 15901625 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 7 526 10 522 538 216 30.0 1e-55 MKKRSLAVLCTAAAVAAGCVACGGAVQPKGQDSSQAVAEEKGKEKIKVTFIKNEWHGDPN DMEIYKILEEKTGVEVEWQVYSAATWEDKKNLILSSGDLPDVFYMNAVNVNDVTKYAPQG MFLDLTDYIEEYCPNLKKVFEEMPEYKNICINPDDGKIYSIGRAVEREVMYTSGLLYINK KWLDQLGLPVPETVDEYYEALKAFKENDMNGNGDKGDEIPFVFHYNSSVPDTAYSYHQLF GSFGYVDPVLTVGSHFIEDENGEIVYTAEQEEYKEAIQYYHKFVEEGLWDSEGFTTPDTS VMNAKGNNDPEIVGSFMAYDSTFVVPEKYKEDYVIVEPLEGPTGKRIWQNSANSNGNVNG TQFVLTASAKGKEEAVMKWLDAHFEPEISAQLFLGPIGTTLEKTDSGMLDYIPTPEGMSY 
SEFRYKNAPVHVPCKIASEDWGKLVQVMDEDINKLQIAKEHYAPYAGQSSLFLLPNKEES KYFTTTAKDIDDYVNKMQVKWLTEGGIETEWDTYLEQLNKLGIDQYKETVKGIRERMGKQ >gi|229784020|gb|GG667715.1| GENE 6 6170 - 7624 1051 484 aa, chain + ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 19 464 13 455 467 145 28.0 2e-34 MTDQQLGTTQQSGGSAYMPNLERLKRHGVTFQEAYCPSPHCCPSRASFFSGLYPSEHGIW NNIDLADAFSHGLNEGIRLFSEDLKEAGYQMYFSGKWHVSAEEGPGDRGFGPWIYPPQEG RYKQWKKTPFMGDWDWLLEDDYITEVRGRGDGEISRVGFPPYTQYGVKENPFGDGDVVSC AEDWLEKVSEEEPFCMYVGTLGPHDPYFVQEEYLDLYPDESMELPVSWTDRMLDKPNLYR RTRERFDQLDETEQKRSLKHYLAFCSYEDALFGRLLDVLERRNLLENTVVLYVSDHGDYA GAHGLWTKGLPCFKEAYHICSVMGYGGIKAEGAVIGHRVSLLDYAPTFLDLAGILKADVI AGRQRFSGYSLKPFLEGRIPENWREETYTQSNGNECYGIQRSIFTDEYHLVFNGFDYDEL YDLRKDPDCMKNVIEEQQYETVIYDLYRKLWRFAYLHRDALGDPYITTALARYGPGIIRG VKRE >gi|229784020|gb|GG667715.1| GENE 7 7822 - 8562 548 246 aa, chain + ## HITS:1 COG:no KEGG:Clole_3115 NR:ns ## KEGG: Clole_3115 # Name: not_defined # Def: VTC domain-containing protein # Organism: C.lentocellum # Pathway: not_defined # 5 235 9 239 243 240 54.0 4e-62 MAQEIFKRYEKKYLLTEEQYLGLKRVLEDRMEADAYGVHTISNIYFDTKDFELIRTSLDG PAYKEKLRLRAYGQVNEKSMVFLEMKKKYSGVVYKRRVPMELASAVRYLNDGILPELENQ IFREIQYMKQRYRLKPAAYVAYDRVACSCKADQELRVTFDRNIRCRTEGLDLQAGTYGTS LLEEGRVLMEVKIPGAMPVWMSRGFSELGIYPVSFSKYGRYYREYLVNAERLGRCGRDRE GGRDCA >gi|229784020|gb|GG667715.1| GENE 8 8555 - 9232 749 225 aa, chain + ## HITS:1 COG:no KEGG:Sgly_0479 NR:ns ## KEGG: Sgly_0479 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 1 225 1 225 225 267 61.0 2e-70 MLNTILTETVSSGITLPQLILCTLTSLGIGIVLAFLHTYKNTSSKNFVMTLAILPAVVQA VIMLVNGNLGTGVAVMGAFGLVRFRSAPGNAREIGSVFLAMAIGLATGMGYLGIAVLLLL IIGAAILILTEIPFGSQSFSQKELKVTIPENLDYAGIFDDLFEKYTKSAELMKVRTTNMG SLYELCYRIELKKEEEEKAMLDDIRCRNGNLTIVCGRIAENREEL >gi|229784020|gb|GG667715.1| GENE 9 9248 - 11002 1435 
584 aa, chain + ## HITS:1 COG:no KEGG:Dalk_4101 NR:ns ## KEGG: Dalk_4101 # Name: not_defined # Def: hypothetical protein # Organism: D.alkenivorans # Pathway: not_defined # 47 523 239 729 755 286 38.0 3e-75 MKRILRHGLFLAVAGTIIMTGCQSQNEGQTSGSIVAEENGSTGSTETVIIGNGDSVTVTG TGAAVDQRVVTITAAGTYRLSGELTDGQIAVNAGKEDEVSLILDGFTISNTSSSAVYGIK SGSITIITEEGTDNVISDGTEYQYEEEGEDEPDAAVFSKDDIILEGSGMLTVTGNYGCGI RSKDHLTVKSGTYVIDTVQDAVKGKDSVKIEGGTFTLKAGEDGIQASNDEEEDKGYVEIS DGTFSITAGAKGIKAVTRLEISGGTIDIDSEDDAIHSNGDVNITGGTFTIATGDDGIHAD NRVVIDDGTIMIIESYEGIEGLMVDINGGVIDLTSEDDGLNAAGGNDGSGNLEFGREGRP GGMRGEKTDGAGGAFGAAEGAYIKITGGEIKISASGDGIDSNGDFYMEGGTVIAEGPTDN GNGTLDYDGTGTITGGIFMGTGSSGMVQSFSEDSAQNVIVVYFTSTKTAGTEWKLTSGDG TVELTGNPAKEYSSLIVSSPDLKEGETYQLTAGEESMEITVSGRITQSGTRSGTGDAMPG KRGADGGGGERKPGEGQRPGGQKPEGEPKPEGEPQRETAPDQKE >gi|229784020|gb|GG667715.1| GENE 10 11080 - 11607 372 175 aa, chain + ## HITS:1 COG:no KEGG:ELI_1588 NR:ns ## KEGG: ELI_1588 # Name: not_defined # Def: acetyltransferase # Organism: E.limosum # Pathway: not_defined # 4 144 13 159 187 70 28.0 2e-11 MKNKIWLDQLRAEDRDTVFSLTSDPEIVKYMRFGTHKVPEEAEELIRNYTEKGNYGFLIR LTDSGEPIGVAAMKHDEENCLEYSVSLFSFQRFWNQGYNTAAVELLKTFAQEQGIRSLNA YVVEENHGSRKVMEKCGFHRTDILHFDDFPSGLYVYSLQIGSGDEKQEQAAAETP >gi|229784020|gb|GG667715.1| GENE 11 11675 - 12907 1191 410 aa, chain + ## HITS:1 COG:no KEGG:BBR47_51500 NR:ns ## KEGG: BBR47_51500 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 47 410 26 392 392 172 32.0 2e-41 MKLFGKKAQPKKEEPKMQEAVLTLYACRKDVKTVPDLACRLFSEVMDKILYTDEDEFHIQ FKDGTGVQFHLVTDAEETEVQSKGMASFFSRAPLENEAVKDAALCQIRLFNCIIGIRFEI DGDSNRTNYIVNTIYGMAGELSAFVLYPNMYLYHWDGRLLLSIDGKSDFEEFYPQASTTI LDREEKEKEADRERKMRSIAILKEKGIPYLEHLKASVFEAECIIPDKEVVVHRLACTFAA CVQSEIYTSGQFENCAETAAEEIAQLEKNYQISEWLSTEERDYLENPDKDPALHNRFGWR YECCSVLLWALSMIELKEPTEICNASELGAIMWNHTFDSLMEASVLRSRDEILDMQDLVL RYDWACVDARIHHRELPMLSGDIIYEWHYALNWLVQADGIADWDRVITTT >gi|229784020|gb|GG667715.1| GENE 12 12980 - 13591 387 203 aa, chain - ## HITS:1 COG:MA0739_1 
KEGG:ns NR:ns ## COG: MA0739_1 COG5015 # Protein_GI_number: 20089624 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 12 143 28 158 163 102 35.0 3e-22 MNEKAKKYVTMFRQVKIASAATVDAGGHPQSRIINIMIANDDGMYIVTSKGKPFYKQLTA TGEIALSAMCPDCQSLKFTGRLRLADKKWVDEVFLQNPGMNEVYPGNSRYILDAFLIYEG SGEWFDLLNYPISRESFSYGGAKEEVSGYFITDDCIGCGQCTESCPQKCIAPGVPCRIDG SHCLRCGLCQEVCPVGAVAERNR >gi|229784020|gb|GG667715.1| GENE 13 13923 - 14321 437 132 aa, chain + ## HITS:1 COG:no KEGG:BBIF_0169 NR:ns ## KEGG: BBIF_0169 # Name: not_defined # Def: polyketide cyclase / dehydrase and lipid transferase # Organism: B.bifidum # Pathway: not_defined # 1 130 1 129 132 127 47.0 2e-28 MTESNKKVILKSRIEQIWDVVTDLNDYSWRSDLQKIDILEPGRKFVEYTKEGYPTTFVIT RFEPPHRYEFTMENGNMTGTWAGVFRQTEKGCEANFTERVDVKKWFMKPFVPGFLKKQQE QYFQDLKAALGE >gi|229784020|gb|GG667715.1| GENE 14 14363 - 15268 780 301 aa, chain + ## HITS:1 COG:lin0464 KEGG:ns NR:ns ## COG: lin0464 COG2378 # Protein_GI_number: 16799540 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Listeria innocua # 1 300 1 308 310 164 34.0 2e-40 MKIDRLIGILSILLQQEKVTAPYLAEKFEVSRRTINRDIEDICKAGIPLVTTQGPNGGIS IMEGYRIDRTVLTSSEMQSILTGLRSLDSVSGTNKYQQLMDKLAMGNSSILTSNSHILID LSSWYKSCLAPKIELIQAAIDAREYISFAYFGPSGESRRKLEPYLLVFQWSSWYVWGYCT EREDYRLFKLNRMLELKALAEHFAERKVPAYDMSPDKAFPPRISVTAVCEADMKWRLIEE SGLESFQEREDGKLVFHSCFSDRGSVFSWLLSMGDRVELVEPEELREEFAAMAENICRIY R >gi|229784020|gb|GG667715.1| GENE 15 15356 - 15814 518 152 aa, chain + ## HITS:1 COG:CAC3493 KEGG:ns NR:ns ## COG: CAC3493 COG3708 # Protein_GI_number: 15896730 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 151 1 146 146 124 41.0 8e-29 MDYEIVTLPEKKAAGITARTNNHAPDMGSVIGGLWERFYARGIYASIPADRRGWVLGIYS NYESDENGDYDMTAACEVTDTAGIPEHLSLCTIPAGRYAKFVVKGDVQQALVQFWTELWS LDLKRAFRADFEEYHGIEGRDAEIHVYIGLKE >gi|229784020|gb|GG667715.1| GENE 16 15828 - 16589 762 
253 aa, chain + ## HITS:1 COG:no KEGG:DSY0090 NR:ns ## KEGG: DSY0090 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 3 247 1 245 271 348 65.0 2e-94 MSVDFKRVTLDNIEEEHICCAISDKKGETCVASKKAWLKERMKEGLVFEKLDVRGKVFIE YLPAEYAWAPIDADGYYYIDCLWVSGQFKGQGYSNRLLEHCIAEARAKGKKGLVALSSKK KMPFLSDGAYLKHKGFVEADTAAPYYELLALPFEEGTALPRFLECAKTGEIDEKGWVLYY TAQCPHTAKYVPILTEYAGNQGIQIKAVKLETREDARQSPSPFTTYAVFYDGKFVTNEIL SEKSFEKYRKKVE >gi|229784020|gb|GG667715.1| GENE 17 16700 - 17239 642 179 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0280 NR:ns ## KEGG: Cphy_0280 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 177 1 177 180 191 56.0 1e-47 MQEKWYELMETVQRKQELERVLACNEKTEAFGLMLTEEDAGKLMVCRRDSLRTQMRVEFG KGILPEIIDTFCDSPYLQQENYADVIAELQEVFYLYKNESEDNLTDEELLAFMKEQFDGV CCGSVAYLEETCLERFARAVRAGYRDYVEDGGSGEYGKFSEEARWDRDLFLETLFDLLS >gi|229784020|gb|GG667715.1| GENE 18 17249 - 17525 267 92 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0279 NR:ns ## KEGG: Cphy_0279 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 92 1 88 280 94 55.0 9e-19 MDYTMEELLPVVGKLTEKYTGFSSTSVTYETARQLMEAVLYCLREAEAEALKTGKDNVAA ASDTDLWLLYQQGYEVVLEKTARAKKVYEQII Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:07:28 2011 Seq name: gi|229784019|gb|GG667716.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld109, whole genome shotgun sequence Length of sequence - 20237 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 7, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 3 - 708 629 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 2 1 Op 2 . - CDS 727 - 1761 1077 ## COG1145 Ferredoxin 3 1 Op 3 . - CDS 1821 - 2375 655 ## Despr_1642 arylsulfotransferase - Prom 2435 - 2494 6.5 4 2 Op 1 . 
- CDS 3396 - 4679 1387 ## Dhaf_3734 arylsulfotransferase 5 2 Op 2 . - CDS 4743 - 6182 1463 ## Despr_1642 arylsulfotransferase 6 3 Op 1 . - CDS 7102 - 7539 597 ## HMPREF9243_1609 hypothetical protein 7 3 Op 2 11/0.000 - CDS 7591 - 7902 485 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 7945 - 8004 7.3 - Term 7992 - 8032 -0.1 8 3 Op 3 . - CDS 8099 - 9025 433 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 9 3 Op 4 . - CDS 9048 - 10523 1424 ## DSY1455 hypothetical protein + Prom 10506 - 10565 6.5 10 4 Tu 1 . + CDS 10759 - 11427 502 ## SGGBAA2069_c08960 global nitrogen-responsive regulatory protein + Term 11661 - 11700 -0.2 - Term 11391 - 11443 7.0 11 5 Op 1 . - CDS 11456 - 12820 849 ## Lebu_2011 hypothetical protein 12 5 Op 2 7/0.000 - CDS 12833 - 14452 1311 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 14488 - 14547 80.4 13 6 Op 1 44/0.000 - CDS 15515 - 16309 198 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 14 6 Op 2 44/0.000 - CDS 16311 - 17300 581 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 15 6 Op 3 49/0.000 - CDS 17316 - 18176 727 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 16 6 Op 4 . - CDS 18187 - 19101 946 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Prom 19335 - 19394 6.8 + Prom 19359 - 19418 9.9 17 7 Tu 1 . 
+ CDS 19508 - 20071 268 ## gi|266623486|ref|ZP_06116421.1| conserved hypothetical protein + Term 20094 - 20148 15.1 Predicted protein(s) >gi|229784019|gb|GG667716.1| GENE 1 3 - 708 629 235 aa, chain - ## HITS:1 COG:STM2549 KEGG:ns NR:ns ## COG: STM2549 COG0543 # Protein_GI_number: 16765869 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Salmonella typhimurium LT2 # 7 235 16 244 272 209 41.0 3e-54 MNNIVKPVACKILEVKRESLHEYTFKVATDIRPEHGQFLQLSIPKVGEAPISVSSFGDGW MDFTIRSVGKVTDEIFEKQPGDTLFLRGAYGKGWPVEKFQGKHMVVITGGTGLAPVKSML NMFYDNPDYVKSVTLISGFKNEEGIIFKNDLEKWNEKFTTFYTLDKDHKDGWNTGFVTEF VSKVPFSEFDGDYEVVIVGPPPMMKFTGLEAVKCGVPEEKIWVSFERKMSCAVGK >gi|229784019|gb|GG667716.1| GENE 2 727 - 1761 1077 344 aa, chain - ## HITS:1 COG:CAC1513 KEGG:ns NR:ns ## COG: CAC1513 COG1145 # Protein_GI_number: 15894791 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 342 1 338 338 339 48.0 5e-93 MGYKVSFEEANKIFAALQDEYEIWAPKRFVGRGRYSDTDMIKYDRINTVEEIEYMDKSDF PAKEVLSPITESLFYFTEDEFIESKATSKKLLIFMRPCDIHAQHHQEKIYLTNGGFEDMY YKRMNEKVKIVMMECTTGWDTCFCVSMGTNKSEDYSMAVRFGDGELSLDVKDEAFAPYFE GKNETGFKPEYIEKNQMTVEVPEIPNKEVLTKLKSHPMWNEYNKRCVSCGACTVACSTCT CFTTTDIIYNENGNVGERKRTTASCQVKGFTDMAGGMSFRNTAGDRMRFKVLHKFHDYKE RFKDYHMCVGCGRCIDRCPEYISIAATVKKMANAIDEIAAEENK >gi|229784019|gb|GG667716.1| GENE 3 1821 - 2375 655 184 aa, chain - ## HITS:1 COG:no KEGG:Despr_1642 NR:ns ## KEGG: Despr_1642 # Name: not_defined # Def: arylsulfotransferase # Organism: D.propionicus # Pathway: not_defined # 13 182 456 624 627 149 48.0 4e-35 MNMPGAGMTAAIDDGSAELSSITVEIKDDVVLYEMHLPANYYRAEKLHLYDEGDEITFGR GEILGSIGVTEEFLTEAPAEDGGMVPDSYKTKVAKEADRLVFKASFEKGTLVMLNLEGEN GSPTRHYFVPTTKRPFLAMCVGTFLEHDERAVEFPVSGEGIEGTYKLTVTIDEKKYDTGV TVNF >gi|229784019|gb|GG667716.1| GENE 4 3396 - 4679 1387 427 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_3734 NR:ns ## KEGG: Dhaf_3734 # 
Name: not_defined # Def: arylsulfotransferase # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 19 426 7 414 628 507 59.0 1e-142 METSPLRRLSGGFMGVLYKEVKHIIDQQYEAESRFLEEYKKESHTIANPFVVLNPYLIAP LTALVMFENEKPAFAKVTVKGKEAAGDYMYRPKSDARKMVLPIYGLYEDYENTVVIELST GETATVKIVTEKASDQLKKPTSIETTPEYMEDNVMMVSPTSPAYTAAYDYAGDARWYNTL NLAFDLKRVRNGRLFVGTDRLVAPPYHTTGIYEMGMIGKIYKEFRIPGGYHHDEWEMENG DILILTQHLTRGTVEDVCVMVDRNTGEILKEWDHQDVLPVYPAGGSGSQDAHDWFHNNAV WYDKKTNSLTFSGRHQDVIINRDFETGRLNWIIGDPTGWPEDMQKYFFKPIGDEFDWQYE QHACMVLPNGDIMCFDNGHWRSKEKEHYLEAKDNFSRGVIYHIDTDNMTIEQVWQYGKER GAEFLAS >gi|229784019|gb|GG667716.1| GENE 5 4743 - 6182 1463 479 aa, chain - ## HITS:1 COG:no KEGG:Despr_1642 NR:ns ## KEGG: Despr_1642 # Name: not_defined # Def: arylsulfotransferase # Organism: D.propionicus # Pathway: not_defined # 2 479 146 626 627 599 59.0 1e-170 VVSPAGEDLAVGFDYAGDARWHMTVPCVFDVKRLKNGNLIMGSHRVIQMPYYMSGLYEIS PCGKIYKEFRLPGGYHHDEFEMEDGNLLSLTDDLTSETVEDMCVLIDRNTGEILKTWDYK KFLDPKTVSRSGSWSDHDWFHNNAVWYDKNTNSLTFSGRHIDSIVNIDYETGDLNWIIGD PEGWPEEMQKYFFKPVGNNFGWQYEQHACVITPDGDVMCFDNHHYGSKNKENYLAAKDNY SRGVRYKINTDDMTIEQVWQYGKDRGAEFFSPYICNVEYYNEGHYMVHSGGIAYDSEGNP SEALGAFAKDQGGRLESITVEICDNKKMLDLHVPGNYYRGEKLKLYSDGINLELGKGQIL GEMGVTKEFDTEIPLDPSGEMLPESCNARIEDEIDRFTFFSRFEKGQLVMLLLEQGEEVH RYFISTTAVPFLAMCCGTFLDSDDRNTRTNINKAGLKGTYDVRVIIDDKKYETGVTISC >gi|229784019|gb|GG667716.1| GENE 6 7102 - 7539 597 145 aa, chain - ## HITS:1 COG:no KEGG:HMPREF9243_1609 NR:ns ## KEGG: HMPREF9243_1609 # Name: not_defined # Def: hypothetical protein # Organism: A.urinae # Pathway: not_defined # 2 143 4 146 629 153 51.0 2e-36 MSVKYSFEDHIINRQYEAEQAMLAKFEAGNYTIANPLVTYNAYLVNPLSAVVCFNTEKET AVTVTVLGKTPQGNISHTFPKAKKHVLPIVGLYSDYQNRVEIRAYRGESNIITIDVPDVF DGKEVIYSMDTTPEYLQDNIILVSS >gi|229784019|gb|GG667716.1| GENE 7 7591 - 7902 485 103 aa, chain - ## HITS:1 COG:CT539 KEGG:ns NR:ns ## COG: CT539 COG0526 # Protein_GI_number: 15605268 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # 
Function: Thiol-disulfide isomerase and thioredoxins # Organism: Chlamydia trachomatis # 1 103 1 102 102 86 44.0 1e-17 MLKEVNSEEFKELAGKGLVLADFFSTTCGPCKMLGFVLKDVEKEFGDDLTILKVDFDQNK DLTAEYGVTGYPTLVLLKEGEEVGRLQGLQQKPVIIKLVEEHR >gi|229784019|gb|GG667716.1| GENE 8 8099 - 9025 433 308 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 4 306 2 306 306 171 34 4e-42 MAQYDVIIVGAGPAGMTAAIYAARADMNVLLLDKLAPGGQIINTNEIQNYPGVGTINGAE LAYQMFEHTQQFENIAFDYGTVKEILDEGKVKKVLCEEDDKEFTAKAIILATGTRPRCLD IPGEDKFRGNGISWCAICDGAQYRDKDVVVIGGGNSAVEESIFLAGIVKSLTIVTMFDLT ADPMACDKLRAMDHVTVYPYQDILDFTGDTKLTGVHFKSTKTGEENTVSCDGVFEYIGLT PTTEFLKDLGVLNQFGYVEVDEKMHTKVEGVYGAGDCVTKNLRQVITACADGAIAAQEAS HYVQNLED >gi|229784019|gb|GG667716.1| GENE 9 9048 - 10523 1424 491 aa, chain - ## HITS:1 COG:no KEGG:DSY1455 NR:ns ## KEGG: DSY1455 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 15 490 27 501 503 246 33.0 1e-63 MANQNERKREKMFLIHSLIGVGIMVFFRFLPLNLPEVTSIGMEVLGIFIGTLYLWTFVDP LWGSLMSIAMVGLSNYMPMAGLLKEAFGAPVVAQVFFLLIMAGGLVYYKITLYIGRFFLT RKFTNGKPWLLMFVICIGCYFLGGFISPFTAIFLFWPVLYDVFEEVGYKKGDALPRVMLT LVVVASLIGFPMAPFAQNGLALLTNFTNITANLPGGPIVVNNAAYMGVAIILGLIMIVAN ILFAKFVIRPDVTKLKNYDVEMLNKNPLPPMEPRQKIIGASFVILILGMLLPSLFPSLPG MAFLSSNSLGLALFMVAALAAIRLNGKSVLEIPQVMATNFNWGAYFIIVAAILLGSVLTS ENTGVSAFLSVVLSPIFQGMSPMVFTVLLMAMAVLLTNLCNSLVIGMILQPVIVTYCIQT GANPAPIVTLLIIFVLLSAAITPAASPFAAILHSNKEWVPTKYVYQYTIPFVILELIIVL VVGVPLANLLM >gi|229784019|gb|GG667716.1| GENE 10 10759 - 11427 502 222 aa, chain + ## HITS:1 COG:no KEGG:SGGBAA2069_c08960 NR:ns ## KEGG: SGGBAA2069_c08960 # Name: not_defined # Def: global nitrogen-responsive regulatory protein # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 13 222 14 223 223 110 31.0 3e-23 MNESLLKIITDSGTRITMPKDIVLNKKYDKYISNYVYLLEEGICALNSITKQGEEKVYLY FRPNRLIGFNQFVVSAEKLNASTPEFSVVTKTTCTLCQIPYSAFQELIVKNPEFNAFFIE 
TLADNYNDALIHFHLMQEESVVVRLCHLLFEVSQLRGGKRVVPRFFTYAEMAKYLGCHSV TVSRIMAKLKQKGYVTKTPSGIAIEQEEALQELINSESRFQY >gi|229784019|gb|GG667716.1| GENE 11 11456 - 12820 849 454 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2011 NR:ns ## KEGG: Lebu_2011 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 10 451 82 532 537 470 52.0 1e-131 MSISLKFHEENYTIREISILDKHIKYRAFENIVYAGKPVDREHQSLSIFVPEWYYEEADG KEQLKKVPVFFPNTVGGYMPGLPEPPGPNRSGEPNASFYAILNGYVVVSPGARGRGLKDE KGAFIGNAPACIVDLKAAIRYLKHNRDQIPGDVDKIISNGTSAGGALSALLGTTGNHPDY EPYLEEIGAAEETDDVYAASCYCPITNLEHADMAYEWEFCGFDDYYWGEKSGVMSEEQKK LSAMEKLLFPSYLNSLKLVGDSGRQLTLKEDGNGSFKEYIAEYVLKSADRAAECGTDLEA LTWLTMENGRPVGLDWDSFIKYRTRMKTAPAFDDVHLATPENELFGSSDTERRHFTSFSL KHSAVSGKMADEVQIRQMNPMNYIQDPIAKKAKYFRIRHGAADRDTSLAISAILTAGLRN EGLLVDYHLPWGVPHAGDYDLEELFNWIKTVCKK >gi|229784019|gb|GG667716.1| GENE 12 12833 - 14452 1311 539 aa, chain - ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 4 515 2 495 511 174 27.0 5e-43 MKRRKIVSLLLAAAMTAGVLSGCGGNGETAGTAGTADTKNAETAQNGAKEEGQSTIVMEI TNDPSSLQPSTINQQAFMEQRMNYYERLFEWDYNGEVSPLLVKDFEWIDELHLKITLYDN ITTHSGKKLTTEDVKFSLEYAKDSAEFARHTTNIDYDNIEITDDYTMVIPILQQNVLFWN DITRIDIVSKADFEESKDQMITTPVGTGPYKVTGYTPGVEVVLEKNENYWNAGQTEIQQR NVDRIIYKFIPEEAQRTISLESGDVDFVYDPPLTELEGLMNNSDYDVLEVKTSSTPSLYF NSSEGSVCSNKSLRQAISCAIDNSAIVAAVYKGFAHPACSVVTPNNPEWSDDLENSNPYA FDQEKAKTCLAEAGYQPGELKLRLATDDVSTHKAAAEIIQAYLQQIGIEVEISIFDAGSY NTLLTQKDTWDMQINEYTTKGSILFFFGNQVNRNKNIRGFWDDDEFQNVLDEALKDGDNE KVSKLVSIFMENIPIYPLMNKTNYYVFRKGIENVRTKDDYIVFPGDVTYTEEASSWLYN >gi|229784019|gb|GG667716.1| GENE 13 15515 - 16309 198 264 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 235 1 222 223 80 26 6e-15 
MEKIIEVHHLKKYFKTPRGLLHAVDDVSFSIEKGKTVGIVGESGCGKSTLGRTMIHLLES TEGKIIFKGEDITRVSTRKLRELRENMQIIFQDPFSSLNPRMTAIETIREPLQLAEHYSA AEIETETAKIMETVGIDKRLGNSYPHELDGGRRQRIGIARALALNPDFVVCDEPVSALDV SIQAQILNLMQDLQEERKLTYVFITHDLSVVRYISDEICVMYLGQMVERAATKELFRNPV HPLLRHCYPLYRYPISMQREILSN >gi|229784019|gb|GG667716.1| GENE 14 16311 - 17300 581 329 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 321 3 324 329 228 38 3e-59 MEANHQNRALLSVEDLEVRYVLSEETVCAVNGVSFTVNKGETLGLVGETGAGKTTIAKAI MSILPDPPAKIKNGKIMYGQKDLLAMKEKEVRKIRGKKISMIFQDPMTALNPVFTVGEQI AEVVKIHENLKGKESIKRAVEMLEMVGIPGERYGEYPHQFSGGMKQRVVIAMALACNPEL LIADEPTTALDVTIQAQVLEMMTKLKEQLNTAMILITHDLGIVAQMCDKVAIVYAGEIVE TGSAEDIFDRPAHPYTTGLFNALPSLADDQTRLTPIHGLVPDPTALPKGCKFHTRCPYAS EACEKAVPPLKETEPGHYTRCSRGRKEAT >gi|229784019|gb|GG667716.1| GENE 15 17316 - 18176 727 286 aa, chain - ## HITS:1 COG:BH0030 KEGG:ns NR:ns ## COG: BH0030 COG1173 # Protein_GI_number: 15612593 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 15 286 30 301 301 281 53.0 1e-75 MDTMKKKESMFFMTWKRLAKNKLAVAGLIVLIITALLAVFAPIAAPYGYEEQDLFNTLAG PSREHWLGTDNLGRDMLSRLIYGGRNSLTLGLISVALAAALGVILGAVSGYYGGKVDMVI MRLLDVLQAVPAILLAIAISATLGPGYMNCILALTISQIPGFTRMTRASCLNVQGMEYVE AARSINARERRIIFKHVLPNAISPIIVQATMSVATAILTSASLSFIGLGVQPPQAEWGAM LSAGRSYIRSNPHVIIYPGITIMIVVLSLNLLGDGLRDALDPRLKN >gi|229784019|gb|GG667716.1| GENE 16 18187 - 19101 946 304 aa, chain - ## HITS:1 COG:FN0397 KEGG:ns NR:ns ## COG: FN0397 COG0601 # Protein_GI_number: 19703739 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 298 12 308 308 262 49.0 7e-70 MIPVMFGVTLLIFTLLYFANGDPARNLLGAEATEEEVQELRDEWGLDDPYLVRYGNYLKQ 
LLVDRSLGVSYVNKKEVSTEIVARFKVSFAIAVESTIIAVLFGTVMGVLAATHQNTWVDN ASMVAALVGASMPGFWIGLVLSIVFALKLGWLPSSGWGTFKESLLPCISLAIGAAGGLAR QTRSSMLEVIRQDYIVTAKAKGVSKFKVTYVHALRNALIPVVTAAGGTLSWLMGGVVVTE TVFSIPGLGMYMVTAINQRDYPVIQGSVLYIALTFSLVMLAVDILYAYIDPRIKARYKTA KTVK >gi|229784019|gb|GG667716.1| GENE 17 19508 - 20071 268 187 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623486|ref|ZP_06116421.1| ## NR: gi|266623486|ref|ZP_06116421.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 187 1 187 187 374 100.0 1e-102 MPFQHKKALQNYVPKQIPAENKKGYKTEYIYQGDWYVWDCEPKKLRFFRICSLTTAGLIA GFFLAGALQRTFCNTISFVAVPSLLSIVALMLGTYGLFSPLLRVPRLPEYDFRSMHLMVQ AGFAAYTILIFIAAAACFSSIASGRWTSVSRELFTALCYLTCGMLSLTISLRFRRLPCHL ADGGRSN Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:08:21 2011 Seq name: gi|229784018|gb|GG667717.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld110, whole genome shotgun sequence Length of sequence - 21788 bp Number of predicted genes - 28, with homology - 27 Number of transcription units - 13, operones - 9 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 3 - 723 214 ## COG0395 ABC-type sugar transport system, permease component 2 1 Op 2 1/0.000 - CDS 737 - 979 249 ## COG4209 ABC-type polysaccharide transport system, permease component - Prom 1100 - 1159 20.8 3 2 Op 1 2/0.000 - CDS 2061 - 2582 398 ## COG4209 ABC-type polysaccharide transport system, permease component - Term 2600 - 2632 4.6 4 2 Op 2 . - CDS 2641 - 4335 1293 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 4574 - 4633 8.4 + Prom 4490 - 4549 6.0 5 3 Op 1 1/0.000 + CDS 4667 - 6325 611 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 6 3 Op 2 . + CDS 6303 - 7694 553 ## COG0784 FOG: CheY-like receiver 7 4 Op 1 . - CDS 7695 - 7757 57 ## 8 4 Op 2 . 
- CDS 7772 - 7948 67 ## gi|266625000|ref|ZP_06117935.1| conserved hypothetical protein 9 4 Op 3 . - CDS 7887 - 8114 85 ## gi|266623493|ref|ZP_06116428.1| modification of 30S ribosomal subunit protein S18 acetylation of N- alanine - Prom 8206 - 8265 3.2 10 5 Op 1 . - CDS 8373 - 8939 444 ## gi|266623494|ref|ZP_06116429.1| conserved hypothetical protein 11 5 Op 2 . - CDS 8966 - 9112 84 ## gi|288871116|ref|ZP_06410024.1| putative lipoprotein 12 5 Op 3 . - CDS 9127 - 9381 188 ## gi|266623495|ref|ZP_06116430.1| elongation factor P 1 13 5 Op 4 . - CDS 9344 - 9499 110 ## gi|266623496|ref|ZP_06116431.1| conserved hypothetical protein - Prom 9746 - 9805 3.5 14 6 Op 1 . - CDS 9861 - 10118 174 ## gi|288871117|ref|ZP_06116432.2| 2-amino-3-ketobutyrate CoA ligase 15 6 Op 2 . - CDS 10186 - 10881 595 ## bpr_III182 hypothetical protein 16 6 Op 3 . - CDS 10964 - 11821 315 ## gi|266623499|ref|ZP_06116434.1| LysM domain protein 17 6 Op 4 . - CDS 11864 - 12421 452 ## gi|266623500|ref|ZP_06116435.1| hypothetical protein CLOSTHATH_04795 - Prom 12441 - 12500 4.8 - Term 12443 - 12497 4.2 18 7 Tu 1 . - CDS 12511 - 12915 225 ## gi|266623501|ref|ZP_06116436.1| conserved hypothetical protein - Prom 13112 - 13171 3.4 19 8 Op 1 . - CDS 13175 - 13879 523 ## Csac_0468 hypothetical protein - Prom 13903 - 13962 4.8 20 8 Op 2 . - CDS 13968 - 14150 139 ## gi|266623503|ref|ZP_06116438.1| putative lipoprotein - Prom 14231 - 14290 80.4 21 9 Op 1 . - CDS 15132 - 15599 321 ## gi|266623504|ref|ZP_06116439.1| hypothetical protein CLOSTHATH_04799 22 9 Op 2 . - CDS 15602 - 16132 494 ## Csac_0467 ECF subfamily RNA polymerase sigma-24 factor 23 9 Op 3 . - CDS 16210 - 16968 535 ## gi|266623506|ref|ZP_06116441.1| hypothetical protein CLOSTHATH_04801 - Term 17356 - 17403 7.7 24 10 Op 1 . - CDS 17608 - 17916 302 ## COG1605 Chorismate mutase 25 10 Op 2 . - CDS 17930 - 18967 905 ## COG1396 Predicted transcriptional regulators - Prom 19003 - 19062 5.5 26 11 Tu 1 . 
- CDS 19149 - 19676 37 ## Cbei_3242 hypothetical protein - Prom 19702 - 19761 6.2 27 12 Tu 1 . - CDS 19836 - 20426 469 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) - Prom 20454 - 20513 3.3 - Term 20673 - 20707 -0.9 28 13 Tu 1 . - CDS 20866 - 21669 715 ## COG0300 Short-chain dehydrogenases of various substrate specificities - Prom 21728 - 21787 4.7 Predicted protein(s) >gi|229784018|gb|GG667717.1| GENE 1 3 - 723 214 240 aa, chain - ## HITS:1 COG:BH1066 KEGG:ns NR:ns ## COG: BH1066 COG0395 # Protein_GI_number: 15613629 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 5 240 5 241 293 181 41.0 1e-45 MVEKKTLGKSAVNALFAVILTALALSCLLPLIYNLAVSFSSKAAAEAGRVAFWPVDFTAN AYRDIMKEKNFFLSFLVALKRVVLSLITTLPILTMAAYPLAKSKKEFPPRNSILWVFVFC MMFSGGTIPWYMVMKSYGLINNVFGLAICGGIPVFNLILMVNFFQGIPTELEEAAMVDGA GSWYILLRIVVPLSKPVLATIALFTIVAQWNEFFQGLVLSTSATHYPLQTYIKQLVFRLD >gi|229784018|gb|GG667717.1| GENE 2 737 - 979 249 80 aa, chain - ## HITS:1 COG:BH1065 KEGG:ns NR:ns ## COG: BH1065 COG4209 # Protein_GI_number: 15613628 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Bacillus halodurans # 4 80 241 317 317 87 63.0 6e-18 MTILALGNILNAGFDQIYNLYNQAVYSTGDIIDTWVYRVGLNDMQFSLATAVGLLKSVAG FLLISSSYYIANKKANYTIF >gi|229784018|gb|GG667717.1| GENE 3 2061 - 2582 398 173 aa, chain - ## HITS:1 COG:BH1065 KEGG:ns NR:ns ## COG: BH1065 COG4209 # Protein_GI_number: 15613628 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Bacillus halodurans # 3 173 18 188 317 129 44.0 3e-30 MKRKQTAFFKGKTGIYWLMVCPALIWMFFINIVPMFGIIMAFQDFNPGKGILRSEFVGLE NFRYMFEMKDVLQIFVNTIVIAVFKLILNVAVPVAFALLLNEIKNVYFKRTVQTIVYLPH FISWVVMGGILLDVFGLYGPVNSIISSMGLEKIAFFRRADLFRGLAVGTDVWK >gi|229784018|gb|GG667717.1| GENE 4 2641 - 4335 1293 564 aa, chain 
- ## HITS:1 COG:BH1064 KEGG:ns NR:ns ## COG: BH1064 COG1653 # Protein_GI_number: 15613627 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 57 560 41 543 550 168 27.0 2e-41 MKIKSYISIVMACMAVTTGIMGCGTKDESALKNVVKENIPIDENGNPSPFGKYKEPITVE IVQSINPTMTIPEGQSTTDNQYTRYIKDNINIDIKVKWQASTADYSQKLNLAIASNDLPD IMVVGENEFKKLIKSDQIEDLTPYYDQFISDTIRQNIESTDGKSMAAATVDGKMYGLPSV NTEADGYSLMWIRKDWLDKLGLEVPKTVDELETVAKAFVDNKMGGDRTVGILGPTVNNRL YGDFLQASNNLCTLDGIFQAYQSYPGFWIKGEDGKAVYGSTTAETKEALGKLQEMYKNGA LDQELGVRKDADEAWKSGQAGIFFYPWWLGYNLAEAIANDPDATWVAYAAPEAKDGQWYP KLGSPTNQYVVVRKGYEHPEAAFLLNNFLRRDEAKMDVKSLDASYYPGRIVITPMDENSF TMQVLRKYLKGEEIPEYDNIAYKLLDNDIATVKDVKQVPYDDFDVHTWNMSDVNSGRIYS LLVGSGAVADAEEAGVVNKVYSLIYTQTPTMEKKWTNLKKKEDEIFMKIIIGEEPVDSFD TFVKEWKAEGGDEITAEVQEAANQ >gi|229784018|gb|GG667717.1| GENE 5 4667 - 6325 611 552 aa, chain + ## HITS:1 COG:BH3841 KEGG:ns NR:ns ## COG: BH3841 COG2972 # Protein_GI_number: 15616403 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 257 538 181 474 481 113 30.0 8e-25 MVCAYIFVITRFITQFTQKQLDFDYNTILSEASDTLENILWNLTLTSQQLLDSKELLKNI ESYLSTEDLLEKKDCYYEMLNTVSSLTMSNTDIALIYLYDIDENEIIYNTLPNIKSSDSP VLYQNSDYTFYGPGKSQSNFVGSPVVCLKRQFEDENGHRFLLSIESGYYSLDKPFHSAEQ KSSYFIFTDEKKSVIYNTLPKSWNVPDIMKSLSLNARHDFHSFCKTGPQGWSVYIVIPND IYAHDYRMVLRDFGICTLIISIIVSFLAIYFWKSVYHPLQLFDNQLNRLLSDDDSLEEIH SSIPEYDHLLQKILVLQQQIQQMIQQFVNQEKLHSRMQLEKLRAQINPHFLMNTLNTLHW MALMKHETEIDRITQSLSHLLSYNLDKQSYSTNLERELMALKEYVTLQKVRYAFQFDISF YPENTTLNYPCPKFILQPLIENSLSHGYQENMDIYLTVHVNDSIEIIIRDTGAGIQENTL LKLQQLTPIPESHLSDNMQFGIGLQYVVQSLNDFYGGNYNFSITSSLGNGTEITLNFPKL KGGGYHAENINY >gi|229784018|gb|GG667717.1| GENE 6 6303 - 7694 553 463 aa, chain + ## HITS:1 COG:BH3446 KEGG:ns NR:ns ## COG: BH3446 COG0784 # Protein_GI_number: 15616008 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like 
receiver # Organism: Bacillus halodurans # 4 132 6 133 200 100 34.0 8e-21 MLKILIIDDDALTRKGIQTLMPWNAHNMEIIGEASNGKNALEFLADHDVDLVLVDLDMPI MDGMTFIEQAKSLYPNLNYVVLTIHTEFEYIQNALRMGAIDYIAKTQFDQENFDQILNRI NSSISKKLSQKQVLNPKWKESKILYPFIYVLITMETESDEHIYQFWDMNGLSGRSDIFEI ISGGWVFTDERSTFLFPECFPNTTLLCISDVSDLTYSQLGALLRNYKKEQFFYDYQPLLL INHKRAFELQESTYITDEASLESLKNEWISLNWVYRNELFDKIKFDLKNSKLTFSKLYHL LLSLENVWNASYSQMTGQSLELPTTFYCWEEVEAWLISVYEKTNMISASSKYSQDITNHI LAVKQYIDSHFSANIVTSEIARKAHMSYGYFSRCFHDIIGISFSDYCIQLRINKAKDYLL STDEPIQKISFCVGYNDEKYFSRLFKKITGYSPSDYRKLHTED >gi|229784018|gb|GG667717.1| GENE 7 7695 - 7757 57 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLEINKISIAQSFNGSDALL >gi|229784018|gb|GG667717.1| GENE 8 7772 - 7948 67 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625000|ref|ZP_06117935.1| ## NR: gi|266625000|ref|ZP_06117935.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 11 58 5 52 52 70 85.0 5e-11 MAQNIQAKVMENFYSTRAGNSAFKALAISCVFSYQYSEQKIDLRNGNKYELEIHSKKL >gi|229784018|gb|GG667717.1| GENE 9 7887 - 8114 85 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623493|ref|ZP_06116428.1| ## NR: gi|266623493|ref|ZP_06116428.1| modification of 30S ribosomal subunit protein S18 acetylation of N- alanine [Clostridium hathewayi DSM 13479] modification of 30S ribosomal subunit protein S18 acetylation of N- alanine [Clostridium hathewayi DSM 13479] # 1 75 5 79 79 140 98.0 4e-32 MTDSEAADRERIRNTIKYQKNHDTYFVYEKRTGQAIGFAGVEQITPDIYQEASIALGPEY TGQGYGKFLLNTGWE >gi|229784018|gb|GG667717.1| GENE 10 8373 - 8939 444 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623494|ref|ZP_06116429.1| ## NR: gi|266623494|ref|ZP_06116429.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 188 1 188 188 360 100.0 3e-98 MAPYTFVIDGEDIDDFWDSYLPEVTKQNEWFHATDARYREQGSVNIIDVGPDDTIISSRQ ILAPSKSGSLSVIDDEAAKEFIIEAEDEHGTVYLFNYYNGFTAFKEGKDGDWECRIKDSG 
SYEKLIAAAREYIVKHNDKMRMDYDFRSRYYFCKNGMLAHEGSSGAGNSSYSYFTYKESK RNLVEAVI >gi|229784018|gb|GG667717.1| GENE 11 8966 - 9112 84 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871116|ref|ZP_06410024.1| ## NR: gi|288871116|ref|ZP_06410024.1| putative lipoprotein [Clostridium hathewayi DSM 13479] putative lipoprotein [Clostridium hathewayi DSM 13479] # 1 48 1 48 48 70 100.0 3e-11 MYTKNKTAGLLLTAMAAAILLSGCTSEKPVETSQNAPPRQRAGRSQRN >gi|229784018|gb|GG667717.1| GENE 12 9127 - 9381 188 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623495|ref|ZP_06116430.1| ## NR: gi|266623495|ref|ZP_06116430.1| elongation factor P 1 [Clostridium hathewayi DSM 13479] elongation factor P 1 [Clostridium hathewayi DSM 13479] # 1 84 1 84 84 161 100.0 1e-38 MVITSLPFTACSKEDYPRDSLFLEDPGSPFQAQDYDAIYLYQFGRIPEEAVKENSIITGG RISITALVKDGIVISVFTNIPVAG >gi|229784018|gb|GG667717.1| GENE 13 9344 - 9499 110 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623496|ref|ZP_06116431.1| ## NR: gi|266623496|ref|ZP_06116431.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 51 98 148 148 105 98.0 1e-21 MEALGDEGADGRWSEACGFIQDQYPDYFSSDELLQKSIKYGHYLASIYRLQ >gi|229784018|gb|GG667717.1| GENE 14 9861 - 10118 174 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871117|ref|ZP_06116432.2| ## NR: gi|288871117|ref|ZP_06116432.2| 2-amino-3-ketobutyrate CoA ligase [Clostridium hathewayi DSM 13479] 2-amino-3-ketobutyrate CoA ligase [Clostridium hathewayi DSM 13479] # 20 85 1 66 66 114 100.0 3e-24 MFSLVTIFLTLSFTSYANNMLDDSYEKQIISVYSFDGSTIPMDIYYSEYNEKLEGEYAGN LTLKKGEMKEGFYLHIVYEGTLTRK >gi|229784018|gb|GG667717.1| GENE 15 10186 - 10881 595 231 aa, chain - ## HITS:1 COG:no KEGG:bpr_III182 NR:ns ## KEGG: bpr_III182 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 12 211 30 213 216 105 36.0 1e-21 MDKRTKVKLRIVLIVILLVGIIVTPVVCIITHWVEKNIVTDIEDYETYFGAQGIHRTSNT 
SKLKKTSESYLVINNIFPEKLPESAEVEDFYYEYYNPWDPCYLSYLVYRCDVNDYKAETE RLKQIPRPTDYLIYGATDFPYPLLAVNASYYGYIYALADEAQQEIVYVELTFCNHFTDID YEKVIPQKYLPLGFDAKEGNPTWEENRPKIDVTSGEAEKEDSFNDTGIERN >gi|229784018|gb|GG667717.1| GENE 16 10964 - 11821 315 285 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623499|ref|ZP_06116434.1| ## NR: gi|266623499|ref|ZP_06116434.1| LysM domain protein [Clostridium hathewayi DSM 13479] LysM domain protein [Clostridium hathewayi DSM 13479] # 1 285 1 285 285 577 100.0 1e-163 MKHHIKIQLLMCSLFIALLSAVPFSSYAESSGLNEQEFARVCEDNHAVQTRSLHPPLKNP AADTFNYEFHYREILLSPDGKGFIDNTHDMSRYRNLAGGQILMDMADGYEISYEYHIKDF TSDDADVTVIEGGKITPYYRAVPLSTASGSGKQCTYISNEPEEPPIIPDFPKDYDWKSHG GTTLEVYPLVTVRSLKDGSTYTYTSRVSYVTNLYKSKPDCYFLTPTDLEAYTIRKGDSLQ KIAGHYYADSDDWIYILERNQDRIKNADMICPGTLIVIPNAEAKK >gi|229784018|gb|GG667717.1| GENE 17 11864 - 12421 452 185 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623500|ref|ZP_06116435.1| ## NR: gi|266623500|ref|ZP_06116435.1| hypothetical protein CLOSTHATH_04795 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04795 [Clostridium hathewayi DSM 13479] # 1 185 3 187 187 305 100.0 1e-81 MKNIQMNTLLLFITLISLAGCANKKNIGESSSTVLETVETSSQYAIENTTEEESQTDELL QGMERTETADTEITDPALSEDAKEIQDIVETFSAAYFSGDVEEIKKYLTDPYDWDIEVYS NPEKSADISVIAINGISDIGNKNVDDICVIAMQYKQSASDDTVQSLTIELIKKSEGWKIQ FYGIE >gi|229784018|gb|GG667717.1| GENE 18 12511 - 12915 225 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623501|ref|ZP_06116436.1| ## NR: gi|266623501|ref|ZP_06116436.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 134 1 134 134 256 100.0 4e-67 MIYEVTKLNITSQLGILYLEDKAKWIMDQDMVWAEKTTLPEIKKTELPYSQNWNGIRDFL FPNTYYVEGGFSTSADGSHCVKIYTADGMYAATIDAVQNDGKFVTDISVGGNDYYAVIFN KIDGAALYATYTVK >gi|229784018|gb|GG667717.1| GENE 19 13175 - 13879 523 234 aa, chain - ## HITS:1 COG:no KEGG:Csac_0468 NR:ns ## KEGG: Csac_0468 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticus # Pathway: 
not_defined # 11 225 7 222 231 68 25.0 3e-10 MKNGLNDEWFEVLLKAAVIQNSMNEIEPYPPQEEIDNLQISDACDYKIRKMIKRFWRRQW FSKVQRITKKIVAVIFITVGVSFIALLQFNEVRAACYDILVRFTSRYIEIDYNAPDEELE PFNIGYVPEGFYKVEESNSASMYHIAYENENGERLSLENYKSVSVNVDNENHIITDITIN GSLGQYFSATDERFENVLIWNNDKGFFIITTYLGKDEILKVAESINFLEERDKK >gi|229784018|gb|GG667717.1| GENE 20 13968 - 14150 139 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623503|ref|ZP_06116438.1| ## NR: gi|266623503|ref|ZP_06116438.1| putative lipoprotein [Clostridium hathewayi DSM 13479] putative lipoprotein [Clostridium hathewayi DSM 13479] # 1 60 22 81 81 111 98.0 2e-23 MDNEHHAITDVTINGSPGQYFSATDEKSQNMLIWYNDKGFFIIATYLDKDEILKIAENIK >gi|229784018|gb|GG667717.1| GENE 21 15132 - 15599 321 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623504|ref|ZP_06116439.1| ## NR: gi|266623504|ref|ZP_06116439.1| hypothetical protein CLOSTHATH_04799 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04799 [Clostridium hathewayi DSM 13479] # 1 152 1 152 152 302 100.0 4e-81 MKSEWNDTWFEVLLKAAVIQNSLNEIEQYPPQEEMHKLQISDACDHKIRKMIKRFCRRQR FSKARRIMKKTVAIIFMTIGISFIALLQFKEVRAACYNILVQITSKYIQFDYNAPDEELE PINIGYVPEGFYKIEDSNYAGMHSIRYENENGVAS >gi|229784018|gb|GG667717.1| GENE 22 15602 - 16132 494 176 aa, chain - ## HITS:1 COG:no KEGG:Csac_0467 NR:ns ## KEGG: Csac_0467 # Name: not_defined # Def: ECF subfamily RNA polymerase sigma-24 factor # Organism: C.saccharolyticus # Pathway: not_defined # 3 166 6 164 180 77 33.0 2e-13 MILFVFDTKEECDKFTILYEKYRKIVLYTISIYEKDKFQAEDLLQDIYIIIGKNLQKIDM SDEQRSRNYVITIARNYCRSYLRRKSRIEEDSFEDIETVETEQNDILEDIINREYFDYLV EEINKLEEHYKEVLELKYIVELSDDEIADFLHIKKKTVQMRLYRAKILLRNKKGGS >gi|229784018|gb|GG667717.1| GENE 23 16210 - 16968 535 252 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623506|ref|ZP_06116441.1| ## NR: gi|266623506|ref|ZP_06116441.1| hypothetical protein CLOSTHATH_04801 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04801 [Clostridium hathewayi DSM 13479] # 1 252 11 262 262 450 99.0 1e-125 
MIAAVALSVTLLIAGCGPGTEQAQAGYDSTKEEIQIMDSENKNDVTDNQETSSGDTDVYN HVLAEYKDMVQNDFYTDLLDHGDLEAYDSSFGEDIGAEIRNYKQDVYYAFYDIDGNGIEE LIIAGGEDSASGSGLPLRNYDLYTFDGSNVVHIFPEMDFGYRTNVSLYANGIIKVCYSGS AGESGVDFYRIDADGVSHKLVDSFASAGSQEGDTAMFTYSQNGTEITEEEYNTKIQSYET ALTTALDWRQIQ >gi|229784018|gb|GG667717.1| GENE 24 17608 - 17916 302 102 aa, chain - ## HITS:1 COG:PA4230 KEGG:ns NR:ns ## COG: PA4230 COG1605 # Protein_GI_number: 15599426 # Func_class: E Amino acid transport and metabolism # Function: Chorismate mutase # Organism: Pseudomonas aeruginosa # 12 97 7 92 101 67 38.0 9e-12 MDMAKGDTIMQCSSLEEVRSNIDRIDNEIIKLIAERGNYVIQASKFKKNEAGVKAPDRVE AVIAKVRKKAEEYGADADMVEALYREMISRFVNMELQEFSKK >gi|229784018|gb|GG667717.1| GENE 25 17930 - 18967 905 345 aa, chain - ## HITS:1 COG:CAC3472_1 KEGG:ns NR:ns ## COG: CAC3472_1 COG1396 # Protein_GI_number: 15896711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 117 1 117 125 108 43.0 1e-23 MQLGQVIRKYRKNKNMTQEEMGGRLGVTAPAVNKWENGASMPDITLLAPIARLLGITTDI LLSFREELTAEEIKALVYEADSMLKEKPYEEAFLWVKKKLEQYPNCEQLILQLAVILDAQ RMVQELPDANQYDDYICSLYARGLESDDETIRSRAADALFSFYLRKKRYDKAEEYLKYFS VQDPIKKMKQAQIYGRTNCISEAYKAYEELLYSHYQLASGALHGMYMLAMDQGEMDRAHM LVRKQEELARCFEMGKYCEVSSRLDIATVEQDADTVIATMKEMLSSIGQIGSFYRSPLYG HMEFKEMREEFLAELKNNLLKCFRDEESYGFLKDDERWKEIVLQK >gi|229784018|gb|GG667717.1| GENE 26 19149 - 19676 37 175 aa, chain - ## HITS:1 COG:no KEGG:Cbei_3242 NR:ns ## KEGG: Cbei_3242 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 3 172 34 205 205 134 47.0 2e-30 MGIAILYSLPISIILMTVNGIWLFLLSPVFKSFINSDGFIVEFQISIKSLGYILAVILFL LIHELIHALFVPGVFHSPCTYWGFNGGFGFVYTEEIMTKKRFMLISIMPLLLLSFILPAV LLILGISNWFIIILCIINAGGSCVDILNMFLIAGQVSQNGTIVNNGYSTYFKNAE >gi|229784018|gb|GG667717.1| GENE 27 19836 - 20426 469 196 aa, chain - ## HITS:1 COG:BH2748 KEGG:ns NR:ns ## COG: BH2748 COG2249 # Protein_GI_number: 15615311 # Func_class: R General function prediction only # Function: 
Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Bacillus halodurans # 5 196 5 195 198 103 34.0 3e-22 MKVSVILGHPYEGSFNAAIAKTVVEALRENGHQVMFHDLYQENFNPVIPKEELVSDQTKD SLVALHQQEIREADGIIIVHPNWWGQPPAILKGWVDRVLRENIAYTFPKGDNGGGLPIGL LNAKAGLVFNTSNTPERREVETFGDPLNRLWKDCIFDFCGVSVFDRIMFRVVAYSSCTER EEWLNQARSMVNHYFP >gi|229784018|gb|GG667717.1| GENE 28 20866 - 21669 715 267 aa, chain - ## HITS:1 COG:CAP0001 KEGG:ns NR:ns ## COG: CAP0001 COG0300 # Protein_GI_number: 15004706 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Clostridium acetobutylicum # 4 253 2 251 251 317 64.0 1e-86 MVQKKGAYTVITGASSGIGDETAKAFAARGRNLVVAARRQDNLEALKKDILSNNPALDVV VRATDLSVMENVYQFYESLKEYPLDTWINNAGFGNYDSVAHQDLRKIESMLHLNVEALTI LSSLFVRDYKDTEGTQLINISSCGGYTIVPTAVTYCAAKFYVSTFTEGLARELEEVGAKM RAKVLAPAATKTEFGRVANNAGEYDYDRLFGTYHTSWQMAAFLMELYDSDQVVGAVDRET FEFRLCGPRFDYAGNSRHNQKAVKSQG Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:10:24 2011 Seq name: gi|229784017|gb|GG667718.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld111, whole genome shotgun sequence Length of sequence - 16823 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 5, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 2 - 1079 881 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 2 1 Op 2 . - CDS 1057 - 2883 1288 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 3 1 Op 3 . - CDS 2907 - 3599 664 ## COG1802 Transcriptional regulators - Prom 3692 - 3751 10.8 + Prom 3726 - 3785 12.0 4 2 Tu 1 . + CDS 3888 - 4907 439 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Term 4902 - 4959 10.1 5 3 Tu 1 . - CDS 4981 - 11664 6959 ## Closa_2607 Cna B domain protein 6 4 Tu 1 . 
- CDS 12621 - 12782 204 ## gi|266623517|ref|ZP_06116452.1| conserved hypothetical protein - Prom 12806 - 12865 3.4 - Term 12810 - 12857 11.2 7 5 Tu 1 . - CDS 12890 - 16801 3667 ## COG5263 FOG: Glucan-binding domain (YG repeat) Predicted protein(s) >gi|229784017|gb|GG667718.1| GENE 1 2 - 1079 881 359 aa, chain - ## HITS:1 COG:SP0661 KEGG:ns NR:ns ## COG: SP0661 COG4753 # Protein_GI_number: 15900562 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 4 165 5 162 245 106 37.0 7e-23 MTKLLIVDDERIIREDLALNLPWRQHGIELVTPAGNGREAVEVIEREKPDIVLADINMPV MDGLELAAWLAENHKEIRIMFLTGYDDFKYAKRALELHVQDYILKYESREKIIEAVKRCE LEYHEARKGETLIQKSKSLLQSRFLTQLIMGIAEEQALGEFEKKNRFDICDPMFYLVLLH MDCQDSYLKSEGPNTNEIYAFSAINIFRELAEAELPEKDVSVYFGTYDGQVFAVFAFRKA GYREEFSKLLESVVENLRKFLNMKVKIAVSQEYHRLSQTDDAYNESYRLLELMKLGDSGA VVHCSDGNCRLVKKNAIMEEIEDYIMEQYSVPGLSLNDLANHIHLSPAHLCRIIKKNKN >gi|229784017|gb|GG667718.1| GENE 2 1057 - 2883 1288 608 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 198 596 190 575 577 156 29.0 1e-37 MKNIFYKLSVRNKLMVVLLPLVSLFIIITGTVCYFLSSAQLKINSEKLLEDAVHQASVRL EERFKIAYSAAISLTNSESLKYFTFVYDGETMDKEFLINNQELKQQITRICVDYPDVMDS VVFKTKHDYSLSYSSYASPYVITESFGEMLKKVRIEQRGELFRYAWRSLHEEDLLPTSRK RNVISLYCVAGNENSDAGMLLVINMLPESVAGILRDITTSRGSVSGILAEDSVLYDGAEK KEVLREEDREKLLHAELEGSYISQDLNGEKILISYERLQLNGWVIFSAVPLKSLMAGNEF IGVMIAVCVVIMLLVSSVPIIVFSRYMSDDIKRIACQLEQYEAGDKTIIFETKDEKEIGE LVDGLNAFVATINCQISTIEDISEKKRKAELLLLQSQINPHFLYNTLISIQALINAKDYD RVSKMYESVVRFYVLSLSKGMERVKISGELDLIRHYLTILQMRYDNSFEWCMNVDDEILN CEIPKLTLQPIVENAIDHGLRRQSKKGILDISGCRFEDKVILTVWDNGGGISENKLAELN QYIKMDAPGDKEVEHFGLWNCNQRIELYYGKEYGIELESVESIYTSTLITLPYTGENGGE WNDETFDC >gi|229784017|gb|GG667718.1| GENE 3 
2907 - 3599 664 230 aa, chain - ## HITS:1 COG:BH1062 KEGG:ns NR:ns ## COG: BH1062 COG1802 # Protein_GI_number: 15613625 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 11 229 11 225 226 123 32.0 3e-28 MKVFEKKQMENNREYAYRLLRDNIMTFRLVPGQSLNENELSELMHVSRTPVHEAILMLKE ESLVDVFPQSGSRVSGIDLETLNEGLFLRLQVEPVIISQTAGNLTKEQIVRLKENLEAQE TAALQETYLDTFFKIDDEFHRMIYEFAKKARVWRAVRSVCSHYDRVRYLDAIDTKANIQI FVEEHKTIYSSLLSGEIVDFDFPKFYEGHLATYRKHFRQMLEKYPEYFNL >gi|229784017|gb|GG667718.1| GENE 4 3888 - 4907 439 339 aa, chain + ## HITS:1 COG:BS_yjmD KEGG:ns NR:ns ## COG: BS_yjmD COG1063 # Protein_GI_number: 16078298 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Bacillus subtilis # 1 339 1 339 339 298 43.0 8e-81 MKAIKIPQPFQLNIENIPTPTIKTPNDVLIKITSCGICGSDMGIWNGTNSLATYPRLIGH EFGGIVTETGSSVHTVKKGDLVSVDPVLSCGHCYACRIGRHNVCSSLEVLGVHRDGGFCE YVCVPESNVHKFHTAFDQSLLSMVEPFTIGVEINRRGGITAGDKVLIMGSGPIGIAAMQV AKRNGASVMITDLVASRLEKARKMGADRTILIPLQDLECEIMEFTDSEGIPVIVDTVCTP LSLEQAVQLACPAGRIVVIGLKNQPSAISMADITKKELTIVGSRLNCNCFDEVIEGFEDG SLHPETFVTHTFHYSQTEEAFQLIRQHPEEVLKAVIRFE >gi|229784017|gb|GG667718.1| GENE 5 4981 - 11664 6959 2227 aa, chain - ## HITS:1 COG:no KEGG:Closa_2607 NR:ns ## KEGG: Closa_2607 # Name: not_defined # Def: Cna B domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 847 1294 2128 2845 515 37.0 1e-143 MGLLQTVTDQKNGRDATGYTWGYIENDSSIIYGKRDTNPSSLEAGLRYMSDKIEYDPDAS GITYQFEMPEGERTYQVTLGCKNPWDTRPADITLEGQTVKSSWSFTKDSLAEETFSVTVS DGVLDVRVHNTDRTSSSQDAMLSYIIIKAPYDQEYLKNAMKQYTLTSLERELYSVQSLEK YDEAYEYARRAAESEEPVSQEKIASNLKKLAAKYNGLQPKQDAVSYTSFSGTDGEVWLDQ NGTPIQAHGGQVQKFTHNGVTKWYWYGEDKTDGYRTVDGGVRVYSSTDLYNWADEGIALR NLTDEYDFEEDYFHALYGDYTEEQRARVLLAINDTTSVIERPKVIYNEKNDNYVMWFHAD GPTETSNSNYAAASAGVAVSDSPVGPFKFIDRYRLNYVAGKYDQSKGMARDMNLFVDDDG TGYIIYSSEENATLFISKLNEDYTYLSASPETAVQGVDFLRSETFAGQKREAPAMFKYDG 
KYYLMTSGCTGWGANQASFAVSENGILGDWKIIGDPCVTDTSVCSYTSALTFGTQSTNII PIDAANGKFIYMGDRWNNKNPGQNELIDPQYVWLPVEFDGEGNMILRPYADWKLEMLDYM APITVNTELPKSIKKGTALSELPQKLSITVNGETKETAVEWTLVSGDLNSVGPVTISGKM TGTGGRMVTIDGFVYSRELTYLVDCGVMEASGSPKVSGLYEDVKKHVTLMNETPDQKAEG SNTWGHGSGCDSLTWRAGLGLHSTGWYGKENKTGNSFSYEFTLDKGTYTITTGHTEWWSA GRTTNVTASYEGIAAPAELGSASWSGSNGGITQNVTGSITVPADQTKVTLTFTAGSAQAP VVSYIAIEKSSIANVFDDNLYMAVAAGKTPELPETVEAEMADGQITAKKVIWEMPEEGFY EVWTTKKVTGVLEGSDVLVTASVEVVPENLVYFIDSGFADGKISAAWQGIRRSMDLKNEA ADQEFDGTWGYVEEGISPRNTSSGDKYENGFYAQKDQEIVYKLALPAGTYRFASGYQEWW AQWNSGRSMAVSLRYTDAEGSSQTEDLGTITLGNDTAGHVLESSREFELPADGIVEFHVK KNAALDVVLSWIAVQEVKKVSPSEAVDLDYEALNVYNINEVRGNLTLPETGANGTVITWE SSDESVIAVTGLAEGTPIGVVSRPVTDTDVTLTATVTAGDVSREKVFTASVLAKPEEKQM ESYVFAYFKGEGLSNGEQIYMASSRDGMNWEDLNASKPVLTSRLGEKGLRDPFIMRSAEG DKFYLIATDLKINGNNDWTAAQTSGSQSIMIWESADLVNWSDQRMVKVAAEGAGCTWAPE AFYDETTGEYIVFWSSKIPASQNVENNDYTHRVYYAKTRDFYTFTEPEVWIELHSESGET LSVIDATVIKVGNTYYRFTKNEAAKPHREGLPSGGKYTMLETSDSLLGQWTEIESGINQI SGVEGATAFRLNGEQKWCLLLDDFGGGGYFPLLTDDIGSGAFTRLDSSEYSFPGTMRHGS VLPVTADEYGAIMDEYGPAVLNSPDLTGTELMEEIIGSAAPVYEDTFDTYETDAAADKSE VLENEDGTKQYVMLNDSFLVKEDAKGNKYLQAAATTNIKLAQAGIAAGEVTGLTDALIMM DVMIPGTEENSVRSYAQIGTEGDRSYVKFDLKNSAVNIRENGRDDSASYTFEPGETYHVA VLYKDGSAYAYVNGKLIHKKAASSYEEAGMAVIAVATGGKIDNVRIYDASADVSEYEAVL ESMKQVDAEDYDEEGWEEFAAALKKAEKLMKSAVPSKADLESVFQALTDAYEKLTAETEY YTDRIKITRKPYQTEYEQYDDFNPEGMEVTEYRIASGSNAVRKTVLPEEAYELEYDFSEL GDREVAVSYTAAGENGEEVIFTDSITVKVLEEIEEAEYYTSRIAVAKKPDKREYSIGETF DASGMEVTEYLKASASNASPRKRVLSEEEYETVYDFSGEGNRKVTVAYYGTDKKGEEKKF TDYIVVSVMADTSKGGYTTGIRVTARPFKTEYETGEELDTGGMIVKRMLVKENGEKAEEI LSEGDYTIFYDFSKAGTKKVVLSYIAEGKDGAEEVFETDFKVTVKKAGSDKPSGGNTGSG GSSGSSSGSGSSSGIKAASGKRETYYTEGSWSNTETGWQFKKADGTIPKSEWIYTGWGDQ DDWYFFDENGNMVTGWHLADGKWFYLNPVSDGTRGRMVTGWQLIDDKWYYFNPVSDGTKG VLTTVSE >gi|229784017|gb|GG667718.1| GENE 6 12621 - 12782 204 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623517|ref|ZP_06116452.1| ## NR: gi|266623517|ref|ZP_06116452.1| conserved hypothetical protein [Clostridium hathewayi 
DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 53 2 54 54 80 100.0 3e-14 MNRKPLKRLTAIALSMAMTLPGAMPVYAVPEGAERSSNTMPGYEELLAADYTL >gi|229784017|gb|GG667718.1| GENE 7 12890 - 16801 3667 1303 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1175 1300 495 617 621 102 40.0 6e-21 MPDRVSIIRLDGKASEADVVWNLSEEQFDTVYKTVEVTGEAGEDKYPVKARVMTVPQELR YYIDCAADVPVNPNGGGALEIRSSVYEAVEKYFEAGGKALLNDTADQKYGSAADQWGFEI TDSKNPLKPSPDGAEGEYDNSYTAGWRTNGSEIVYHLTLDAGTYKITNGFHDWYGSRSRD MRPELIYKDENGKEQKITFDTILSRGTDMAVTNKFTLPVSGDVTYKLVHAANEKPIISWL AVEESANLTSIEVSKMPDKTMYSIGEKLDTTGMEVTGYYDNGTERKLSDSQYSVSGFDSS VPGEKTVTITAKEDNVTATVKVTVGGDSVEDIPVLPVVAAEDQLPETLTLKVNGEIRDTA VTWNLKEGDWSLPGSTVKLEGVLENLDAGTISWSPVVYSSTQVYFIDCGIGVKRGTVDNT GRTSEIFNAVHDAVKYLENDVPDQAGDGNSWGSGSDYDGIKGYEENHGLHATGYFGKNSI GDSFTYKVTLDAGEYNITTGHTEWWSGNSRSTKVTASYIGTDGTVVKTELGSSGSTGGTQ WKELYVQGHLSVPENDTEVTLTFTAAEAKGAAVSYIAIERGEAPVTKQAINKIKVDTMPN KTVYELGEEFEPVGMVINGYHGSELVRELKEEEYTMDGPDMNEAGLSKVTITYEDENGKE YSAIFRVTVYDPEDLRADSIKVVREPEKTVYLPGENFDADGMEVRLLLKATASNAVPARA EKLEDGEYDVVSDLSDPGRRSVEICYSYFDDNGEEKVLRDTIKVTVLDEEDEFYQKGISI AKPPKKTVYPVDGQFDPEGMEVETVMKASTSNATYAEKTLDYEIGKYDFSEPGNKKIKVT QTGTGKDGEEKDFTAVIEVTVTENEALVMEDRLDSVIRDASEIFDNDTAAYRTIKEKQDA ASALRHSAAESFAEYEGELSKTMVRQIGELEDYFMEAYPNITVRVSGDRNLTEGAAVTGA VLSADVEQENQVITLMMEEAEIPDGLPEVKAVKAMDIRLMVSDEEIQPVVPLFLTARIPE GISQKNLVIYHVTESGEIETIIPETADNMMSFGVSSLSTFAAANPGTLTVTGIRITAKPY KTVYQIGESFEADGMVVTAVYENGTEAVVTEYELSGFDSSSAGIKHVVITYGGRKAILDV TVKSETGDEPDDGDKPGSSGGSGSSGSSSGSRPRVNTTSTVTGTWMQDQTGWWFAKTAGG YVKAEWAQINGTWYYFGEDGYMRTGWILSGNQWYYLQPDGSMTDNNWICYKDHWYFLQPG GAMAVSGWVTWNGRSYYLNADGTMAVNTTTPDGYVTGADGARI Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:11:02 2011 Seq name: gi|229784016|gb|GG667719.1| Clostridium hathewayi DSM 13479 genomic scaffold 
Scfld112, whole genome shotgun sequence Length of sequence - 30292 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 14, operones - 8 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 + CDS 3 - 626 668 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 2 1 Op 2 . + CDS 628 - 2016 1479 ## COG0642 Signal transduction histidine kinase 3 1 Op 3 . + CDS 2070 - 2462 454 ## COG1307 Uncharacterized protein conserved in bacteria 4 2 Op 1 . + CDS 3413 - 3841 340 ## COG1307 Uncharacterized protein conserved in bacteria 5 2 Op 2 . + CDS 3780 - 3962 103 ## gi|288871125|ref|ZP_06410028.1| hypothetical protein CLOSTHATH_04820 6 2 Op 3 7/0.000 + CDS 4029 - 4475 504 ## COG1846 Transcriptional regulators 7 2 Op 4 5/0.000 + CDS 4477 - 5079 717 ## COG0534 Na+-driven multidrug efflux pump 8 2 Op 5 . + CDS 6030 - 6770 532 ## COG0534 Na+-driven multidrug efflux pump 9 2 Op 6 . + CDS 6788 - 7300 487 ## COG4732 Predicted membrane protein 10 3 Tu 1 . - CDS 7309 - 7677 312 ## COG5341 Uncharacterized protein conserved in bacteria - Prom 7842 - 7901 9.4 + Prom 7839 - 7898 7.4 11 4 Tu 1 . + CDS 7931 - 8734 1082 ## COG3976 Uncharacterized protein conserved in bacteria + Prom 9581 - 9640 80.4 12 5 Op 1 . + CDS 9718 - 10674 1253 ## COG4939 Major membrane immunogen, membrane-anchored lipoprotein + Term 10701 - 10745 11.2 + Prom 10707 - 10766 4.4 13 5 Op 2 . + CDS 10838 - 11035 323 ## Closa_2147 Heptaprenyl diphosphate synthase component I + Prom 11884 - 11943 80.4 14 6 Op 1 . + CDS 12029 - 13564 1965 ## COG4468 Galactose-1-phosphate uridyltransferase + Term 13588 - 13641 16.7 + Prom 13616 - 13675 5.6 15 6 Op 2 . 
+ CDS 13744 - 14622 776 ## COG1284 Uncharacterized conserved protein + Prom 14687 - 14746 5.3 16 7 Op 1 9/0.000 + CDS 14793 - 15845 1225 ## COG2984 ABC-type uncharacterized transport system, periplasmic component + Term 15860 - 15907 10.1 + Prom 15886 - 15945 5.9 17 7 Op 2 13/0.000 + CDS 16021 - 16935 1118 ## COG4120 ABC-type uncharacterized transport system, permease component 18 7 Op 3 . + CDS 16938 - 17735 274 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 17764 - 17817 8.2 + Prom 17865 - 17924 8.4 19 8 Tu 1 . + CDS 18026 - 19651 1516 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Prom 19664 - 19723 6.5 20 9 Tu 1 . + CDS 19756 - 20325 506 ## COG5493 Uncharacterized conserved protein containing a coiled-coil domain 21 10 Tu 1 . + CDS 21498 - 21668 101 ## Acfer_0238 transposase IS4 family protein + Term 21720 - 21762 3.2 - Term 21635 - 21701 17.5 22 11 Tu 1 . - CDS 21710 - 23389 1540 ## COG1048 Aconitase A 23 12 Op 1 3/0.000 - CDS 24324 - 24758 456 ## COG1048 Aconitase A 24 12 Op 2 . - CDS 24762 - 26093 1219 ## COG0372 Citrate synthase + Prom 26359 - 26418 7.2 25 13 Op 1 . + CDS 26458 - 26853 406 ## Clos_1149 hypothetical protein + Term 26867 - 26893 0.1 26 13 Op 2 . + CDS 26938 - 27516 317 ## gi|288871135|ref|ZP_06116478.2| hypothetical protein CLOSTHATH_04843 + Prom 28359 - 28418 80.4 27 14 Op 1 . + CDS 28476 - 28961 245 ## gi|266623544|ref|ZP_06116479.1| hypothetical protein CLOSTHATH_04844 + Term 29170 - 29203 -0.3 28 14 Op 2 . 
+ CDS 29278 - 30292 754 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|229784016|gb|GG667719.1| GENE 1 3 - 626 668 207 aa, chain + ## HITS:1 COG:CAC1700 KEGG:ns NR:ns ## COG: CAC1700 COG0745 # Protein_GI_number: 15894977 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 201 27 227 232 179 48.0 4e-45 FGYVVSAFEMAEDALKNIEADQPDLAVFDLMLPGMDGLSAIKKIRKNDSLADMPIIILTA KDREFDKVAGLDGGADDYMTKPFGVMELAARIRSLLRRRARNQADEEILEKYGISLNKKT REVTSRGKKVELTLKEYELLLYLMENHSRVATRDELLNHIWGYDYDGETRTLDMHIRTLR QKLGEEGAFCIRTVRGVGYRFVKTEEG >gi|229784016|gb|GG667719.1| GENE 2 628 - 2016 1479 462 aa, chain + ## HITS:1 COG:TM1654_2 KEGG:ns NR:ns ## COG: TM1654_2 COG0642 # Protein_GI_number: 15644402 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Thermotoga maritima # 240 461 44 269 272 173 42.0 7e-43 MRKAIFVKFIQVILAALVLSSAIFYIAASSALLKNSRTDMTYTLRAIDSVLDYSGDLEGE VKQLEETVSDNSSRFTVIRKDGSVAADTGMVDTSSMDNHLDREEIKEALESGSGSSRRYS DTLKKNMLYVACVSDHSDYILRMAVPFTGFKEYLVLLLPAIWLSFLVAIMYSACSADSFA ASVTKPLKEISQEMLKVNGDYTELTFETYQYPEINIIAETTTEMSKNVKEYLNQIELERQ IRQEFFSNASHELKTPITSVQGYAELLESGIIQDEDQKMDFVRRIKKEAVHMTSLINDIL MISRLETKEAEVVCQDVRMAIVLDDVLESLKPLAAFHEVLVHAECKPLCIYANARQMTEL LGNLLSNAIKYNKPGGEVWVTITEENREMIIRVRDNGMGIPKESLGRIFERFYRVDKGRS RKQGGTGLGLSIVKHIVSFYHGTIAVQSELEAGTEFTVKIPI >gi|229784016|gb|GG667719.1| GENE 3 2070 - 2462 454 130 aa, chain + ## HITS:1 COG:SPy1936 KEGG:ns NR:ns ## COG: SPy1936 COG1307 # Protein_GI_number: 15675739 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 4 128 3 127 286 79 34.0 2e-15 MNQFVLTCCSTADMKKEYFDERQIPYVCFHYLMDGVSYPDDLGQSMPFDEFYNRISKGAM PTTSQVNVAEFHDFFGAILEEGKDILHISLSSGISGTYNSACSAVEELREEYPDRKILVV DSLGASSGSS >gi|229784016|gb|GG667719.1| 
GENE 4 3413 - 3841 340 142 aa, chain + ## HITS:1 COG:BS_yitS KEGG:ns NR:ns ## COG: BS_yitS COG1307 # Protein_GI_number: 16078175 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 6 136 156 281 283 84 38.0 7e-17 MERRLHVHHWFFSTDLTSFKRGGRISATSAMLGTVLNICPLMNMDHEGHLIPRQKIRTKK KAMQEAVRMMELHAEGKTGYKGKCFISCSACGGDARAVADLVEARFPELDGSVLINSIGT VIGAHTGPGTVALFFFGDERKD >gi|229784016|gb|GG667719.1| GENE 5 3780 - 3962 103 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871125|ref|ZP_06410028.1| ## NR: gi|288871125|ref|ZP_06410028.1| hypothetical protein CLOSTHATH_04820 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04820 [Clostridium hathewayi DSM 13479] # 15 60 1 46 46 84 97.0 3e-15 MPIRGREPWLCSFSVMRERIEKEQTVTEALSGSIFTAHTGYLVWALFLIITVTRFWDNQK >gi|229784016|gb|GG667719.1| GENE 6 4029 - 4475 504 148 aa, chain + ## HITS:1 COG:MA0180 KEGG:ns NR:ns ## COG: MA0180 COG1846 # Protein_GI_number: 20089078 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 4 142 18 156 156 86 34.0 2e-17 MGTKNYISRGVSMIYRQGNRFCDRKLAEYHIGSGQNYFLTCIYEHQGINMYELAQLGHFD KGTVTKGIQKLEEQGYIRMQADEADKRIRHLYTTEKAAPILERLYEIRREWNQILTEGLT AEESAQAEQLISKMAENAWKCMNERESD >gi|229784016|gb|GG667719.1| GENE 7 4477 - 5079 717 200 aa, chain + ## HITS:1 COG:FN0944 KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 8 198 6 196 455 159 44.0 3e-39 MNEKTGLKKENPLGTEPVGKLLLAFSVPSIISCLVNSIYNIVDQIFIGQGVGYLGNAATT VSFPMMTIVMAFATLLGSGGSAYTAIKLGERDEKEANLTLNNLFSLSVGVGIIVALLGLI FLDPMLRMFGATDTVLPYARDYASIVLMGVPFSVLGITMSNMARTDGNPRLSMYGILIGA VLNTILDPIYIFVFHWGVTS >gi|229784016|gb|GG667719.1| GENE 8 6030 - 6770 532 246 aa, chain + ## HITS:1 COG:FN0944 KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: 
Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 4 229 225 452 455 144 37.0 1e-34 MRFKFKLMKPVPRVCGRAMTLGASSGITQLVACVMQVVMNNSLVHYGNLSEVTGDVALSA MGIVMKIVMILGSFCIGIGVGSQPILGFNYGAGKYDRIRSAYGKAVVSATVSISIGWLIC QTMPHMILRLFGSGNAQFTQFAVKCMRIYLGGVFCSGFQIVSTNYFQATGQPLKASILSM MRQLILLIPLLLILPLFFGLDGILYAGPVADVGSAVIVACFIVPEMKKLNGLVREETRTT ATGTLT >gi|229784016|gb|GG667719.1| GENE 9 6788 - 7300 487 170 aa, chain + ## HITS:1 COG:SA1136 KEGG:ns NR:ns ## COG: SA1136 COG4732 # Protein_GI_number: 15926877 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Staphylococcus aureus N315 # 6 157 4 156 163 89 36.0 3e-18 MKFDVKKLVVSGMLTAVAVALSTFSIPIGASRCFPVQHMVNVMAGVFLGPVYSVAIAFCT SLIRNLLGTGSLLAFPGSMVGALLCGLAYRCTGKLTAACAAEIIGTGFLGAALCYPVAAV LMGKEVALFFYVVPFLMSTLCGTAIAAVLIGVLCRSGAFGYLKRQIAAKK >gi|229784016|gb|GG667719.1| GENE 10 7309 - 7677 312 122 aa, chain - ## HITS:1 COG:L178600 KEGG:ns NR:ns ## COG: L178600 COG5341 # Protein_GI_number: 15673322 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 2 122 6 126 135 71 30.0 3e-13 MSTENKNRKMKKKELVMILVILAVSAVLYAGSRILFSKPPAKVEISVDGTVIETLDLNKD TELTVEGWNGGTNHIVIKDGTVHVTEASCPDKVCVNQGTIRRNGEAIICLPNRMIARITG GE >gi|229784016|gb|GG667719.1| GENE 11 7931 - 8734 1082 267 aa, chain + ## HITS:1 COG:CAC2762_2 KEGG:ns NR:ns ## COG: CAC2762_2 COG3976 # Protein_GI_number: 15896018 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 41 130 40 132 132 62 40.0 9e-10 MNKGSLDTKKKGLIGLGALVVLAGATILGSDPLYKAIDDMARPAIYTAGTYTASARGYGG PVTVTVKVSTKAIESIDVVGEKETLLDQVVPSLPDAIIGKQTLDMDAVSGATLSSNAIIK AVEKALAEARGEAVEEETTKAEESAATGWNDGTYSYEAPEFDDNGYKDLVNMTIKDGKIT ALSWDCVDKDGKLKSQLSMDGQYVMTEDGPKWYEQADAVVKYVLENQSLDGLINEEGYTD TVSSVSINLYGFVNGVKDCLKQAAGEG >gi|229784016|gb|GG667719.1| GENE 12 9718 - 10674 1253 318 aa, chain + ## HITS:1 COG:lin2786 KEGG:ns NR:ns ## COG: lin2786 COG4939 # 
Protein_GI_number: 16801847 # Func_class: S Function unknown # Function: Major membrane immunogen, membrane-anchored lipoprotein # Organism: Listeria innocua # 76 294 54 267 299 62 25.0 9e-10 MNGEYVMTEDGPKWHEQAEAVVKYVLENQSLDHLIDADGYTDTVASVSINLYGFVNGVKD CLRQASEGTAAKSSWSDGTFSYEAPEFDSNGYKDQVSMTVKDGKMTALTWDCVDKEGKKK SNLSMDGEYVMTEDGPKWHEQADAVVKYVLENQSLDGLINEDGYTDTVASVSINLYGFVG GVEDCLRQSSEGSAAAASASAVKDGSYSYESPEFDGNGYKDQVSMTIKDGKITALTWDCV DKDGKKKSNLSMNGEYVMTEDGPKWHEQAESVVAYVLENQSLDGLINEDGYTDAIASVSI NLYGFTGGVEDCLKQASR >gi|229784016|gb|GG667719.1| GENE 13 10838 - 11035 323 65 aa, chain + ## HITS:1 COG:no KEGG:Closa_2147 NR:ns ## KEGG: Closa_2147 # Name: not_defined # Def: Heptaprenyl diphosphate synthase component I # Organism: C.saccharolyticum # Pathway: Terpenoid backbone biosynthesis [PATH:csh00900]; Biosynthesis of secondary metabolites [PATH:csh01110] # 7 64 4 61 171 75 87.0 5e-13 MTGKTEQSPAHKTALYGMLIALAFVLSFVETLIPISLGVPGVKLGLANLVTVVGLYTVGI YGTVA >gi|229784016|gb|GG667719.1| GENE 14 12029 - 13564 1965 511 aa, chain + ## HITS:1 COG:CAC2961 KEGG:ns NR:ns ## COG: CAC2961 COG4468 # Protein_GI_number: 15896214 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose-1-phosphate uridyltransferase # Organism: Clostridium acetobutylicum # 9 504 1 495 497 565 55.0 1e-161 MKKTGGIGMIYEAIKKLVQYGMDTGLITEVDRIYATNQILDVMRMDEYEEPEGELSETDL ESVLKELLDYAHETGVLPEDSITYRDLFDTKLMNCLMPRPGEIEKKFWDIYNGESPEAAT DYYYNLSQDSDYIRRYRIKKDMRWVTATKYGDLDITVNLSKPEKDPKAIAAAKLAKQSGY PKCQLCMENEGYAGRTNHPARNNHRIIRLKINDSRWGFQYSPYVYYNEHCIVFNGQHIPM KIEKDTFVKLFDFVKLFPHYFLGSNADLPIVGGSILSHDHFQGGNYTFAMAKAPIEKYYE MEDFPGVEAGIVKWPMAVLRTRSKNPDDLIRLGDRVLQAWRGYTDEEAFIYAETGGEPHN TITPIARKKGDMYELDLVLRNNITTEEYPLGVYHPHQELHHIKKENIGLIEVMGLAVLPS RLKDELSLLAGYILEKKDIRSNEMIEKHADWVEEFLPNYPEITKETIDEILQKEVGLVFE RVLEDAGVYKCDEKGRAAFGRFLHSTGFVEA >gi|229784016|gb|GG667719.1| GENE 15 13744 - 14622 776 292 aa, chain + ## HITS:1 COG:CAC0496 KEGG:ns NR:ns ## COG: CAC0496 COG1284 # Protein_GI_number: 15893787 # Func_class: S 
Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 14 289 1 276 279 172 35.0 1e-42 MTDKVMKQIENKTLRTLAEYGIITGSIWIMVIGIYFFKFPNNFAFGGVTGFATVVSAVTH WSASQFTTIVNMALLVAGFIFLGRGFGVKTVYASILMSFTLSFLERTCPMVRPLTREPLL ELIFAILLPGVGSALLFHIGASSGGTDIVAMILKKYTSCNIGTVLLVVDMAAVIMAFFVF GPETGLFSSLGLMAKSLVIDDVIENINLCKCFSIICDDPGPICDYIINGLNRSATVYEAQ GAFTHHRKTVVLTTMKRSQALKLRNYIRRVEPSAFILISNSSEIIGNGFLAG >gi|229784016|gb|GG667719.1| GENE 16 14793 - 15845 1225 350 aa, chain + ## HITS:1 COG:Cgl2198 KEGG:ns NR:ns ## COG: Cgl2198 COG2984 # Protein_GI_number: 19553448 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Corynebacterium glutamicum # 63 346 39 319 330 201 46.0 2e-51 MKKTAKAASLILVGAMMMSLTACGSKPAETTKAPETTTAEAASSAADTEAEETTAEAAKT DDGKQFKIGVLQLVQHSALDAANKGFIQALDDAGINYTVDQQNASGDQPTCQTIASKLVN DGDDLILAIATPAAQAVAGATTDIPVLITAVTDPAESDLVASNDVPGGNVSGTSDLTPVK EQIDLLKQILPDAQTVGVLYCSAESNSEIQAKMAREAIEAKGMKAVDYTVSNSNELQTVV TSMVGNVDAIYAPTDNTIAAGMATVGMVAIDNGIPVICGEEGMVDAGGLATYGIDYFQLG YLTGQQAVRVLTEGADISQMPIEYLPADKCNLTVNEDTAAALNIDVSGLK >gi|229784016|gb|GG667719.1| GENE 17 16021 - 16935 1118 304 aa, chain + ## HITS:1 COG:Cgl2197 KEGG:ns NR:ns ## COG: Cgl2197 COG4120 # Protein_GI_number: 19553447 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Corynebacterium glutamicum # 15 296 8 281 296 187 42.0 2e-47 MNGLLVSLQDAVVQGVLWGIMVLGVYITYKLLDIADLTVDGSFALGGCVCAVMVLNKGVD PWIALIIAAAAGMAAGAVTGFLHTVFEIPAILAGILTQIGLWSINLRIMGGKSNVPLLKT DTIFSKIIGWTGINKQLASLLLGIGVAVLMIAFLYWFFGTEIGSAMRATGNNEAMIRAQG VNTNWTKLLALTISNGLVGLSGALVCQSQKYADIGMGTGAIVIGLAAIVIGDVLMGRLRS FGSKLTSAVVGSVIYFVIRAVVLRMGMNANDMKLLSAVIVAVALCVPVVVRKWRLKKAYT EGGE >gi|229784016|gb|GG667719.1| GENE 18 16938 - 17735 274 265 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii 
AND4] # 1 255 1 245 245 110 31 1e-23 MLEIKNVRKTFNKNTINEKKALNGINLHLDEGDFVTVIGGNGAGKSTMLNMIAGVYPIDS GKIEIDGVNISREPEYKRARYIGRVFQDPMMGTAAGMEIQENMALAFRRGQRRGLGWGIK ANEKDFYHDALMKLGLGLQTRMTNKVGLLSGGQRQALTLLMATLQKPKLLLLDEHTAALD PKTAKKVLEITQEIVEEQNLTTLMITHNMKDAISIGNRLVMMHEGRIIYDVSGDEKKKLE VDDLLKKFEEASGEEFANDRMILAK >gi|229784016|gb|GG667719.1| GENE 19 18026 - 19651 1516 541 aa, chain + ## HITS:1 COG:CAC2663 KEGG:ns NR:ns ## COG: CAC2663 COG0791 # Protein_GI_number: 15895921 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Clostridium acetobutylicum # 424 538 146 250 255 95 41.0 2e-19 MENKSDKSEKKDYVVRLNKARKKRNKNDYERYLILAAIAIVVVVIIIFAGKAIAKRVSAG AEKTAATEETGSQGEESDQLTIDAKEAEESEAEAQAKKTEEEKKAVVDSYQNLGIVQVSG YLNVRKEPNTSADIVGKLMGDSACEILDSTQEGWYKISSGGIEGYIDSQYVLTGDEAKTK AYDLVSLRAIVQVDNLNIRKEANTTSDVVGQGLLNERYEVIDQLDGWVQIPSGYMSADYV KLEYALNEARKLDLKAMIFNMYKNIGISDVDNYLNVREEPSENGKIIAKMPSKAAGNILE TTDNGWYKIQSGKITGYVKSDYILTGQPAKDEALKVAELMAIVNTDMLNARSEPSTDSKI WTQISNNEKYPVLKQIDGWVEIELEEDSNAYVASDYVDVRYALPEAIKFSPLEEKANAAA SLRTQIVNYALQFLGNPYVWGGTSLTKGADCSGFTLSVFGHFGISLPHYSGSQANMGKAV KSSEMRPGDLVFYANSKGTINHVGIYIGNGQIVNAASRRSGIKISTWNYRTPVRIRNILG D >gi|229784016|gb|GG667719.1| GENE 20 19756 - 20325 506 189 aa, chain + ## HITS:1 COG:APE1221 KEGG:ns NR:ns ## COG: APE1221 COG5493 # Protein_GI_number: 14601265 # Func_class: S Function unknown # Function: Uncharacterized conserved protein containing a coiled-coil domain # Organism: Aeropyrum pernix # 25 160 94 232 356 80 43.0 2e-15 MSDEMMLILEKLDTMNYSLAGIEGKVDNTHEHLLRLEESTNERFMKMEDRFAGLEDRFAG LEGRFVGLEGRFVGLEDRFAGLEDRFVGLEDRFAGLEGRFVGLEDRFAGLEGRFVGLEDR FVGLEGRFVGLEDRFTQSEAATDLRFNRVEQKLESMGLMLENEISKKIDAIGEGHDYLKR NLDDALRAS >gi|229784016|gb|GG667719.1| GENE 21 21498 - 21668 101 56 aa, chain + ## HITS:1 COG:no KEGG:Acfer_0238 NR:ns ## KEGG: Acfer_0238 # Name: not_defined # Def: transposase IS4 family protein # Organism: A.fermentans # Pathway: not_defined # 1 45 54 
98 525 64 64.0 1e-09 MGRPGYNRVSIRKTILFGFMDTGYVSLRELEDRCKVNLRYMYLMDHETQNGRAVKK >gi|229784016|gb|GG667719.1| GENE 22 21710 - 23389 1540 559 aa, chain - ## HITS:1 COG:ECs0799 KEGG:ns NR:ns ## COG: ECs0799 COG1048 # Protein_GI_number: 15830053 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli O157:H7 # 1 558 211 760 761 758 65.0 0 MALAIVGAVFKNGYVKNKVMEFVGPGIASMDTDYRNGVDVMTTETTCLSSIWKTDEDTRD YLKLHGREDAYKELNPADIAYYDGVVSIDLSTVKPMIALPFHPSNTYEIDELNANLDDIL RTVEKEADHILGNPNIHLSLTDKIVDGKLLVQQGVIAGCAGGNYSNVMTAAHILSGRNCG NDIFNLSVYPSSQPVYMDLVKKGAVTELMAAGATIRTAFCGPCFGAGDTPANNTLSIRHT TRNFPNREGSKPGNGQISSVALMDARSIAATAANGGILTSAEAIVDDYEVPAYEFDGTSY ERRIYQGYGKADPDAELIYGPNIKDWPSMSALTDNIILRVCSKIMDPVTTTDELIPSGET SSYRSNPLGLAEFTLSRRDPEYVGRSKKVNELEEIRKTGACPLKPDPEMEAAFKALRIYL DQPELRARETEIGSMIYAVKPGDGSAREQAASCQRVLGGLANIANEYATKRYRSNVINWG MLPFLLDEEPSFEIGDFIYVPGVRKALEGDMQHIPAYVIKNEEGNPIHEINLHIAEMTDE EKEIVKAGCLINYNRNRHQ >gi|229784016|gb|GG667719.1| GENE 23 24324 - 24758 456 144 aa, chain - ## HITS:1 COG:ybhJ KEGG:ns NR:ns ## COG: ybhJ COG1048 # Protein_GI_number: 16128739 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli K12 # 1 144 9 145 761 213 68.0 1e-55 MVKLYDGGAFLIHGTELVPEQEAEKLVALTGKNITKEEAKKGTIAYSILKEHNTADNMEH LKLRFDSMASHDITFVGIIQTARASGLTEFPIPYVMTNCHNSLCAVGGTINEDDHIFGLS AAKKYGGIFVPPHIAVIHQYMREM >gi|229784016|gb|GG667719.1| GENE 24 24762 - 26093 1219 443 aa, chain - ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 17 443 14 441 441 380 44.0 1e-105 MNNFFIAQTFGKSSNYTDIPNYLFKEHNVKKGLRNEDGTGVRVGLTRVSDVVGYEIQDGK KVNVPGKLFYRGIEIGDLVKGKGNARYGFEETAFLLLFGYLPSKKELNEFAGILRHFYPL PDEFLEKNLLRSPSRNLMNSLQQSILSLYNYDEDPDNVDPYQTLLKGVSILAKLPSMAAY AYQNKIHYYDRESLIIHYPKEEYSIAENILYMLRRDGVFTEQEADLLDVMLMIHADHGGG 
NNSTFTNVVISSTGTDIYSSISASIGSLKGPKHGGANIRCSEMISAIEKEIGLKASDAQI KQVIKRILDKDFYDHTGLVYGLGHAVYTVSDPRADLLKSCCEKLAKQKKREDEYEFLTRF EKVAKETLAGNGKTLSNNVDFYSGFAYNMLQIPEDMYTPLFVCARMAGWLAHNIENKMYD GKIMRPAFKYVGEATPYKKREER >gi|229784016|gb|GG667719.1| GENE 25 26458 - 26853 406 131 aa, chain + ## HITS:1 COG:no KEGG:Clos_1149 NR:ns ## KEGG: Clos_1149 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 5 126 3 120 122 122 52.0 5e-27 MYSDFGFYGGFTQLFPILFGVIFLIAAVMIVVTLIRGISQWNSNNHSPVLTVEAMVTSKR QEVSHRHSNQDTMDTYTNFTSYYATFQVESGDRMELGVSGTEYGMLAEGDMGRLTFQGTR YLSFERGPLES >gi|229784016|gb|GG667719.1| GENE 26 26938 - 27516 317 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871135|ref|ZP_06116478.2| ## NR: gi|288871135|ref|ZP_06116478.2| hypothetical protein CLOSTHATH_04843 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04843 [Clostridium hathewayi DSM 13479] # 1 190 1 190 190 386 100.0 1e-106 MEKRKRTIRKLMPVSPYDSAGLEVWFSELAGQGLFVRKTGYYFANFQRKEARNLVYKLEP CDQYRDEHEKELTGQYRLAGWEAAGDVWRKFLIFSAVRDGAVSPEISVEIRRAGESSLKK TGHGCLLTCAADAVLTALWCWMILGRFGFWYTTTMATPFLLCALMVFIFGINIFWRMSDY IQIRRYLKGTTS >gi|229784016|gb|GG667719.1| GENE 27 28476 - 28961 245 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623544|ref|ZP_06116479.1| ## NR: gi|266623544|ref|ZP_06116479.1| hypothetical protein CLOSTHATH_04844 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04844 [Clostridium hathewayi DSM 13479] # 1 161 14 174 174 319 99.0 5e-86 MFIWAGEQKSWHKELSEVDTSLPVPPITAFSPESSPIRGTVTYHSSIVAPVQFQIVQDGP DAQMLRMDYMELRFPRLVKPVFKSMMKDQLKWSSAEPEAVSSGLFDTACVAVINDSMHYL FVSDGQKVASVCYIGPGYTDEFLETLHDAMETWTQPELFWQ >gi|229784016|gb|GG667719.1| GENE 28 29278 - 30292 754 338 aa, chain + ## HITS:1 COG:alr2215 KEGG:ns NR:ns ## COG: alr2215 COG0477 # Protein_GI_number: 17229707 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: 
Permeases of the major facilitator superfamily # Organism: Nostoc sp. PCC 7120 # 7 331 2 319 435 158 30.0 1e-38 MKDRQYRNFIIFWFSQSVSQLGSSMTSFALIIWSYSRTGSAMSVSLMTFCTWLPYIAASI VAGPVIDRYHKKTIMLLADLLAALCSLTVAALAVSGNLAVWHIFIVNGITGFMNAFQFPA ETVAVGMLVPKEKYSQASGLNSFTSSLLTVVTPVMAASLSSFAGIKGVIAFDLATFLFAF SVLLFRIKIPENAGGTETGKKKEKEKISVQESWREGLQFLRGHKQLWYLMISMALMNFFS RLTYENILSPMILARSGQNQIALGIVSGILGIGGMAGGLIVSFVKLPKDSVRVIFLSGGI SFLLGDFLMGAGQNVLVWSIAGIAASLPIPFIMAGQNV Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:11:38 2011 Seq name: gi|229784015|gb|GG667720.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld113, whole genome shotgun sequence Length of sequence - 15555 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 4, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 203 - 262 8.8 1 1 Op 1 . + CDS 308 - 2647 2715 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 2 1 Op 2 1/0.000 + CDS 2694 - 3338 628 ## COG0491 Zn-dependent hydrolases, including glyoxylases 3 1 Op 3 . + CDS 3352 - 4809 1575 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 4 1 Op 4 . + CDS 4837 - 5658 857 ## COG1052 Lactate dehydrogenase and related dehydrogenases 5 1 Op 5 . + CDS 5667 - 5885 104 ## gi|266623551|ref|ZP_06116486.1| conserved hypothetical protein 6 1 Op 6 1/0.000 + CDS 5915 - 7192 1185 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 7 1 Op 7 . + CDS 7200 - 8663 1642 ## COG3711 Transcriptional antiterminator 8 1 Op 8 . + CDS 8681 - 9193 458 ## BALH_0718 BigG family transcription antiterminator + Prom 9196 - 9255 2.3 9 2 Tu 1 . + CDS 9297 - 10634 1869 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases 10 3 Op 1 . + CDS 10766 - 11017 389 ## gi|288871137|ref|ZP_06116491.2| PTS system, cellobiose-specific IIB component 11 3 Op 2 . 
+ CDS 11032 - 11355 483 ## COG1447 Phosphotransferase system cellobiose-specific component IIA 12 3 Op 3 . + CDS 11358 - 11963 684 ## CDR20291_0789 hypothetical protein 13 3 Op 4 . + CDS 11968 - 12060 85 ## + Term 12071 - 12123 -0.0 + Prom 12062 - 12121 3.5 14 3 Op 5 . + CDS 12152 - 13648 220 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen + Prom 13659 - 13718 5.0 15 4 Op 1 13/0.000 + CDS 13789 - 15048 1535 ## COG0124 Histidyl-tRNA synthetase 16 4 Op 2 . + CDS 15083 - 15554 450 ## COG0173 Aspartyl-tRNA synthetase Predicted protein(s) >gi|229784015|gb|GG667720.1| GENE 1 308 - 2647 2715 779 aa, chain + ## HITS:1 COG:CAC2274 KEGG:ns NR:ns ## COG: CAC2274 COG0317 # Protein_GI_number: 15895542 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Clostridium acetobutylicum # 62 777 19 738 740 711 49.0 0 MADESTMEKLNKTPLGVGRVDGAMEIKSPNDEELEAPADFTSPEVLYQELIASIRRYHPS DDISMIEKAYHVADNFHKEQKRKSGEPYIIHPLCVAIILADLEMDKETIAAGILHDVVED TVMTLDELGKEFGEEVALLVDGVTKLTQLSWSMDKMELQAENLRKMFLAMAKDIRVIIIK LADRLHNMRTLQYMKPEKQKEKARETMEIYAPIAHRLGISKIKIELDDLSLKYLQPDVYY DLAEKISLKKDARESFVKSIVDEVQEHLKEASIVAKVDGRVKHFFSIYKKMVNQNKTLDQ IYDLFAVRIIVDSVKDCYAALGVIHELYKPIPGRFKDYIAMPKPNMYQSLHTTLIGSNGQ PFEIQIRTYEMHRTAEYGIAAHWKYKESGSGQVAAGKEEEKLSWLRQILEWQKDMSDNKE FLNMVKGDLDLFSDSVYCFTPSGDVKNLPSGSTPIDFAYSIHSAVGNKMVGARVNGKLVN IDYVIQNGDQVEVITSQNCKGPSRDWLNIVKSTQAKNKINQWFKTELKEDNILRGKEMVD RYCKAKGIVYSDINKPEYQDKVMKRYAFRDWDSVLASIGHGGLKEGQVINKMVDERAKKM KKEVTDTTILNDIEDMSNKLPVNKHSKSGIVVKGIHDLAVRFSKCCSPVPGDEIVGFVTR GRGISIHRTDCVNVINLPEDERSRLIDAEWQTPEGDDTKERYSTEIKIYANNRIGMFVDI SKVFTERQIDITSMNSRTNKQGKATITMTFDIHGVEELNRLTDKLRQIEGVIDIERTAG >gi|229784015|gb|GG667720.1| GENE 2 2694 - 3338 628 214 aa, chain + ## HITS:1 COG:lin2270 KEGG:ns NR:ns ## COG: lin2270 COG0491 # Protein_GI_number: 16801334 # Func_class: R General function prediction only # 
Function: Zn-dependent hydrolases, including glyoxylases # Organism: Listeria innocua # 5 208 3 202 205 159 43.0 3e-39 MSNFRIRTCTVGMVGTNCYLVYREDLKRAVIVDPGDNGAHILNTCRECGIIPEAVVLTHG HFDHILAVEEIRRAFKEITVYAAEKEAKLLGDPRLNLTASYGTGVTLSPDHLVKDGDILE LAGFKWQVIETPGHTEGSVCLWIKEEEVLISGDTLFAESLGRTDFPTGSSADIIRSIKER LFVLPDDTMVYPGHGEPTTIGHEKTHNPVAFYNR >gi|229784015|gb|GG667720.1| GENE 3 3352 - 4809 1575 485 aa, chain + ## HITS:1 COG:CAC2271 KEGG:ns NR:ns ## COG: CAC2271 COG0635 # Protein_GI_number: 15895539 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 76 479 59 467 476 335 45.0 1e-91 MIGIILDDNAYEQDIRELLMAFYPGETYIHEERDGDEMPRFSVYGTTGGNRYSLTVREYG EDGEMLENKASVEADFSNRFETKNRIKRMLYGLLSEKTKKKLPWGTLTGIRPTKIAMTKL LEGKNEDEIRTYMKETYLASDAKIDLSIEIAERERELLSAIDYEHGYSLYVGIPFCPTTC LYCSFTSFPIKSWEKRMDEYLTALFKEMDYVAETMAGRTLDTVYFGGGTPTSLSAEHLER VMEKLKATFDFSEVKEFTVEAGRPDSITEDKLKVLLDYGVTRISINPQTMKQETLDLIGR RHTVDLVKEKYRLAREMGFDNINMDLIIGLPEEDLEDVRRTMEEIKALDPDSITVHSLAI KRAARLNMFKEKYGDLKITNTQEMIDLTAAYAREMGQEPYYLYRQKNMAGNFENVGYSRP GKACIYNILIMEEKQTIAACGAGTTTKVVFPRENRLERVENVKDVEQYISRIDEMLERKG KLLGK >gi|229784015|gb|GG667720.1| GENE 4 4837 - 5658 857 273 aa, chain + ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 272 1 323 324 290 47.0 1e-78 MKIVILDGYTENPGDLSWEEMEKLGELTVYDRTPVSDQMEIIRRIGDAEIVITNKTPVSR AAMDACPNIKYIGMLATGYNVVDYEAAKEKGIPLCNIPSYGTEAVGQFAIALLLEICHHI GYHDKAVHDGRWENNPELESDTCRYADLDEVLGSSDVIALHCPLFADTEGIINKNTIAKM KDGVIILNNSRGPLIVEQDLADALASGKVYAAGLDVVSSEPIKGDNPLLNAPNCIITPHI SWAPKESRKRLMDIAVDNVKAFLNGTLQNVVNL >gi|229784015|gb|GG667720.1| GENE 5 5667 - 5885 104 72 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|266623551|ref|ZP_06116486.1| ## NR: gi|266623551|ref|ZP_06116486.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 72 1 72 72 141 100.0 2e-32 MSHPGAESELLADRIVHGTVSFGIVNFMINMTEGDSHNLRKVYGCLFSYHLGKTLFALKI KIRYTLVTILRP >gi|229784015|gb|GG667720.1| GENE 6 5915 - 7192 1185 425 aa, chain + ## HITS:1 COG:lin0874 KEGG:ns NR:ns ## COG: lin0874 COG1455 # Protein_GI_number: 16799948 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 14 419 19 424 441 234 38.0 2e-61 MNEKTTGFQDKLMKYMMPIANWVEKNNWLQAIKDGMIADIPVIIIGSIFLLPIALNNLIA SGPIHDFISAHMGVLTYAAGFTNDLLSIFAAFFIATALAKRYGVHNTQTGITAIIVHLIL SAVSIEGGISTEFLGASGLFTSIISGIMSVEITRFLIKHNVIIKLPSSVPPMVGESFASL IPLVVNVIVAVAIANISIAVTGLAFPAAIMKLLAPAINTMDTLPALLIVIFLTQFLWFFG LHGPSITSAVWAAFAIAYQAENMANYASGAPVTHIFTYGLYYNFLQVAGSGLTLGLVLFM MKSKAKSFKSIGVASLVPSIFGINEPVIFGLPILLNPYMFIPFVFGPLLITVLTYFSMKV GLVGYPVAAPPGFLPPGVGAFLTTYDWRSVVLVFVSLIIMALIYYPFFKMMEKDELKREA QSEAE >gi|229784015|gb|GG667720.1| GENE 7 7200 - 8663 1642 487 aa, chain + ## HITS:1 COG:lin0325_1 KEGG:ns NR:ns ## COG: lin0325_1 COG3711 # Protein_GI_number: 16799402 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Listeria innocua # 1 487 1 463 480 105 20.0 2e-22 MVFTKRQTEIIEYIMNNVSGITGDKLAKYFGVSSRTIRNDILQINSALSGSLSLCQTEGR TVYPSIKASKRIGYYITQEDMEDFRAVLEGGDDRAWVIAPHNRGYAILGHALENGSCNLY DLEEELFLSQTALREEVQKLRKMLEDKMDWTVLFLEGSQVRIVENEEKIRLSIFNIIKYE VQTHTDFNSDYLGLIFRGEYDRVEYETFVNAVKNYFGSHEILVSDASLYMVANAIYATIM RKRQGHELTGLKAEEFVVEVLTNLLYHLKAQKVELTESDEALLKIFLYSVRLVQSEENTR ASALSGVILNEFCQEVLNKYSFDLHQSDELFNNMALHLEYLIRRIDTGYRIDNPILQDIK TQYPYAYEIAILLVPIMFKYKSIPIRDEEIAYMALYVAYFLEKVNRRLKTVIINAPRQSV CAMVCSWLNDHFQNQIEIVKVLPKHQLDQYVKDHPVDLVISTEGQRVNTEIPVFRINGIA NQQDYSR >gi|229784015|gb|GG667720.1| GENE 8 8681 - 9193 458 170 aa, chain + ## HITS:1 COG:no KEGG:BALH_0718 NR:ns ## KEGG: 
BALH_0718 # Name: licT # Def: BigG family transcription antiterminator # Organism: B.thuringiensis_AlHakam # Pathway: not_defined # 39 163 529 651 651 67 31.0 3e-10 MRVSRRFSVIVNKYFDTRCVKIFAKDGGTAKWNGSVSFETVIGTLSEKFKEQGKIESMEE YVEDVLSREVNYPTVIGESVMIPHPLMTFAKETAVAAAILKPPITVCRKAVKVIFLLAIE GRPNDDVSILFEFFKQVAMNENLVHTLYEAADEKDFLNRLISISTSIELF >gi|229784015|gb|GG667720.1| GENE 9 9297 - 10634 1869 445 aa, chain + ## HITS:1 COG:CAC0533 KEGG:ns NR:ns ## COG: CAC0533 COG1486 # Protein_GI_number: 15893823 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Clostridium acetobutylicum # 4 444 2 440 441 633 66.0 0 MSEKKFSVVIAGGGSTYTPEIILMLLDNLDRLPLRAIKLYDNDEERQGRLAKACEIIVKE KDPSIEFVATTDPETAYTDVDFCLAHIRVGQLPMREKDEKIPLKYGVVGQETCGPGGIAY GMRSIPGVLENIEYMEKYSPNCWMLNYSNPAAIVAEACRRFKPNARVINICDMPIGMENN IAKILGFNDRKEMTVRYFGLNHFGWWTSIKDKEGNEYIDKLIAHQLKYGNTLPEEGNNGD YTDNSWYETAKKVKDLTAIDPTMAPNSYFQYYLFGDDMVAHTNPEYTRANQVMDGREKRV FGECARIAEAGTAEGTSLEIGIHASFIVDLATALAFNTHERMLLIVRNNGAIENFDKDAM VEIPCIVGKDGYEPITIGTIPTFQKGLMEEQVAVEKLTVDAYEQGSYHKLWQALTLSKTV PSANVAKKILDDYIEANKEYWPELK >gi|229784015|gb|GG667720.1| GENE 10 10766 - 11017 389 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871137|ref|ZP_06116491.2| ## NR: gi|288871137|ref|ZP_06116491.2| PTS system, cellobiose-specific IIB component [Clostridium hathewayi DSM 13479] PTS system, cellobiose-specific IIB component [Clostridium hathewayi DSM 13479] # 1 83 1 83 83 155 100.0 7e-37 MQKMRDAAAKRNMELTCEAVPNAGISDEIGKWDVCLVGPQLVYAVNNIKSILNIPVASIE PRVYALADGDKALDFAIELAKKS >gi|229784015|gb|GG667720.1| GENE 11 11032 - 11355 483 107 aa, chain + ## HITS:1 COG:SP2024 KEGG:ns NR:ns ## COG: SP2024 COG1447 # Protein_GI_number: 15901845 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Streptococcus pneumoniae TIGR4 # 13 104 17 108 108 62 35.0 3e-10 
MADVTKLSECAMQIVTYAGCAKSSYMQALQEAKKGNWDQVEPRIKEGEQFYRQAHEQHGE ILLKEVNTLEPQITLLMSHAEDQIMSAELVRVLVEELIEIYTNQKGE >gi|229784015|gb|GG667720.1| GENE 12 11358 - 11963 684 201 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_0789 NR:ns ## KEGG: CDR20291_0789 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 7 197 15 205 209 256 64.0 3e-67 MANYTIWANATTKNGFAADEIISCLQKAIRRGMVEEACQYAYELYTSGPALLEKVWHRLL TISVEDIGFGDLDAAGKINTLDEMRKKFAYDDGDQPMYFIHAIRILCACTKDRSSDYLKN IIIKEAAVGKVPEILDVALDKHTRRGQEMGRGSIHFFEEGAKVIPQLDVDNGYKERYRKI LENYDPSKAVPGAFVYSSGRD >gi|229784015|gb|GG667720.1| GENE 13 11968 - 12060 85 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIGKGCVCMVKTSDMALISSAVFFVSVMTD >gi|229784015|gb|GG667720.1| GENE 14 12152 - 13648 220 498 aa, chain + ## HITS:1 COG:FN0191 KEGG:ns NR:ns ## COG: FN0191 COG2865 # Protein_GI_number: 19703536 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Fusobacterium nucleatum # 1 467 1 453 477 504 55.0 1e-142 MIPDEILQLISNGENNIVEFKKSTTDITKDVYESVCAFSNRDGGHIFLGVKDNGTILGIQ PDRIDQLKKDFVTCINNENKIFPPLYIVPVEFELDGRHILYIHVPVGTSVCRCNGRIYDR NYESDIDITNNEGLVYRLYARKQDSYYVNKVFPHFSYSELRTDLIDRARRMTRSRSQKHP WNQMTDEEILRSSGMILRDPETNKEGITLAAILLFGPDDLILSALSHYKTDVIFRVFNTD RYDDRDVIITNLLDSYDRMISFGEKHLNDLFVMDGLQSVSARDKILREIVSNSLAHRDFS NAYVAKMVIERNQIFTENCNRSHGYGSLNLSNFAPFPKNPPISKVFREVGLADELGSGMR NTYKYTRLYSGGEPQFIEGDIFRTIIPLNEAATGTVGPFSIVKKGEVSGEVGGEVSGEVS GEVERAVLSEVKLDMEQLKSLLDFCNTGRSRKEMQDYCGIKAQSYFRDHILSPLLRYGLL RMSIPDKPKSSKQKYIRN >gi|229784015|gb|GG667720.1| GENE 15 13789 - 15048 1535 419 aa, chain + ## HITS:1 COG:APE0662 KEGG:ns NR:ns ## COG: APE0662 COG0124 # Protein_GI_number: 14600873 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Aeropyrum pernix # 4 346 8 334 438 202 36.0 9e-52 
MALKKKPVTGMKDILPAEMAVRDYVMNQIRETYRGFGFTSIDTPCVEHIENLTSKQGGDN EKLIFKIMKRGEKLRLDTAETENDLTDSGLRYDLTVPLCRYYANNAASLPAPFKALQMGS VWRADKPQRGRFRQFVQCDIDILGDPSRLAEIELILATTTLLGKVGFKGYHVRINDRNIL KGMAAYCGFPEESFDQVFIILDKMDKIGMEGVASELSEAGYEKNKIDQYLALFGSVTPDA AGVRSLGETLKDVMEPEKAENLAAIMDSVNSIATSQFDIIFDPTLVRGMSYYTGTIYEIQ VDGFPGSVGGGGRYDKMIGKFTGMETPACGFSIGFERIITILMEEGFSVPGSTEKKAFLI EKGVSAEVMTAAIKEAMDERARGVSVLVSQMNKNKKFQKENLQKEGYTEFKEFYREGLK >gi|229784015|gb|GG667720.1| GENE 16 15083 - 15554 450 157 aa, chain + ## HITS:1 COG:CAC2269 KEGG:ns NR:ns ## COG: CAC2269 COG0173 # Protein_GI_number: 15895537 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 147 1 145 595 154 51.0 7e-38 MAESMLGLKRSHRCTEVTSSNVGETVTVMGWVQKSRNKGGIIFVDLRDRSGILQIIFEES DCGAESFAKAEKLRSEFVIAVTGHVEKRAGGVNENLATGDIEVRAESLRILSESETPPFP IEENSKTKEELRLKYRYLDLRRPDIQRNIMVGSQAAI Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:12:10 2011 Seq name: gi|229784014|gb|GG667721.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld114, whole genome shotgun sequence Length of sequence - 27290 bp Number of predicted genes - 30, with homology - 29 Number of transcription units - 17, operones - 7 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 15 - 1268 991 ## EUBELI_20298 hypothetical protein - Prom 1492 - 1551 26.5 2 2 Op 1 4/0.000 - CDS 2503 - 3606 1232 ## PROTEIN SUPPORTED gi|227872165|ref|ZP_03990534.1| possible ribosomal protein S1 3 2 Op 2 2/0.000 - CDS 3587 - 4438 604 ## PROTEIN SUPPORTED gi|227371337|ref|ZP_03854821.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; SSU ribosomal protein S1P 4 2 Op 3 . - CDS 4453 - 4761 431 ## COG0283 Cytidylate kinase 5 3 Op 1 1/0.000 - CDS 5718 - 5960 397 ## COG0283 Cytidylate kinase 6 3 Op 2 . - CDS 5997 - 6842 848 ## COG2081 Predicted flavoproteins - Prom 6877 - 6936 80.4 7 4 Tu 1 . 
- CDS 7776 - 8135 397 ## COG2081 Predicted flavoproteins - Prom 8168 - 8227 6.4 8 5 Tu 1 . - CDS 8253 - 9749 1844 ## COG2326 Uncharacterized conserved protein 9 6 Tu 1 . - CDS 9851 - 10363 551 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family 10 7 Tu 1 . - CDS 11299 - 11760 579 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family - Prom 11884 - 11943 6.0 + Prom 11838 - 11897 5.8 11 8 Tu 1 . + CDS 11922 - 12083 209 ## COG1592 Rubrerythrin + Term 12109 - 12158 11.4 - Term 12088 - 12156 21.3 12 9 Tu 1 . - CDS 12181 - 12786 627 ## COG0527 Aspartokinases 13 10 Op 1 . - CDS 13765 - 14427 800 ## COG0527 Aspartokinases 14 10 Op 2 . - CDS 14505 - 15890 1318 ## COG0534 Na+-driven multidrug efflux pump 15 10 Op 3 . - CDS 15826 - 16671 694 ## BVU_3509 putative arginase 16 10 Op 4 . - CDS 16708 - 17301 782 ## COG2316 Predicted hydrolase (HD superfamily) 17 10 Op 5 . - CDS 17309 - 17839 558 ## COG1683 Uncharacterized conserved protein 18 10 Op 6 . - CDS 17863 - 19161 1115 ## COG1409 Predicted phosphohydrolases + Prom 19215 - 19274 6.7 19 11 Tu 1 . + CDS 19313 - 19471 69 ## + Prom 20311 - 20370 80.4 20 12 Op 1 . + CDS 20513 - 20884 100 ## gi|153854927|ref|ZP_01996150.1| hypothetical protein DORLON_02156 + Prom 20901 - 20960 7.1 21 12 Op 2 . + CDS 20993 - 21253 382 ## ELI_3279 hypothetical protein + Term 21281 - 21338 11.1 - Term 21268 - 21326 15.1 22 13 Op 1 . - CDS 21351 - 21938 621 ## COG0740 Protease subunit of ATP-dependent Clp proteases 23 13 Op 2 . - CDS 21935 - 22150 183 ## gi|266623583|ref|ZP_06116518.1| transposon Tn10 TetD protein 24 14 Tu 1 . - CDS 23074 - 23262 210 ## gi|288871146|ref|ZP_06116519.2| multiple antibiotic resistance protein MarA - Prom 23329 - 23388 8.0 + Prom 23492 - 23551 6.6 25 15 Tu 1 . + CDS 23593 - 24795 1051 ## COG0426 Uncharacterized flavoproteins + Term 24836 - 24893 10.1 - Term 24817 - 24888 14.4 26 16 Op 1 . - CDS 25005 - 25367 419 ## Clole_3274 hypothetical protein 27 16 Op 2 . 
- CDS 25354 - 26040 541 ## Exig_1303 hypothetical protein - Prom 26264 - 26323 6.0 - Term 26268 - 26321 4.7 28 17 Op 1 . - CDS 26365 - 26697 135 ## COG4185 Uncharacterized protein conserved in bacteria 29 17 Op 2 . - CDS 26646 - 26924 183 ## COG4185 Uncharacterized protein conserved in bacteria 30 17 Op 3 . - CDS 26925 - 27104 240 ## gi|266623590|ref|ZP_06116525.1| conserved hypothetical protein - Prom 27124 - 27183 2.1 Predicted protein(s) >gi|229784014|gb|GG667721.1| GENE 1 15 - 1268 991 417 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20298 NR:ns ## KEGG: EUBELI_20298 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 74 417 43 396 398 166 29.0 2e-39 MAQVKSNKMLTSQKKQERKGGKERETLNIPSGIMPFPLEERSREQVLSMLKSNPEIWGEF QLLTREFREELIGFSMGSRGLKISYDPFFKLIFDPDVHSDRLSAMLSCIMKQRVTVKQVL SPESNRISAEGSLLIMDVLAELESGELVNIEIQRIGYAFPGERAACYSSDLMLRQYSRIK SRKERGSTYKKLRNVYTIVLIENSGAEFHKIPGQYIHRSSQVFDTGLELNLLQKYIFVAL DVFREITYNLGEEIEAWLLFLSSDRPEDIMRIIERYPDFREIYEEMAQFQVKPEELVGMY SKALEILDRNTVQYMIEEQKQEIEGQKQEIEKRKQEIEGQKQEIEKRKQEIEDRKQEIED RKQEIEKRKQEIEGQKQEIEAQKQEIKEQKQEIEERKQEIEEQKQEIERLKKLLEQR >gi|229784014|gb|GG667721.1| GENE 2 2503 - 3606 1232 367 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227872165|ref|ZP_03990534.1| possible ribosomal protein S1 [Oribacterium sinus F0268] # 1 360 1 363 367 479 65 1e-134 MSELSFEQMLEESLKTIRTGEIVTGKVIDVKEDEIVLNIGYKSDGIIPRNEYTDDQNLDL TTVVSVGDEMEAKVIKVNDGEGQVALSYKRLAADRGNKRLEEAFENHEVLTAKVAAVLDG GLSVVVEGARVFIPASLVSDTYEKDLSKYADQEIEFVITEFNPKRRRIIGDRKQLMVARR AEMQKELFERIHPGDVVEGTIKNVTDFGAFIDLGGADGLLHISEMSWGRVESPKKVFKAG EKMRVLIKDINGDKIALSLKFPETNPWKDAAVKYAAGNVVVGRVARMTDFGAFVELEPGV DALLHVSQISRDHVDKPSDVLSIGQEIEAKVVDFNEDDRKISLSMKALQMQDMAQVDEDA PVYSDEE >gi|229784014|gb|GG667721.1| GENE 3 3587 - 4438 604 283 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227371337|ref|ZP_03854821.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; SSU ribosomal protein S1P [Veillonella parvula DSM 2008] # 1 275 1 269 632 237 45 1e-115 
MEVIVAKTAGFCFGVKRAVDKVYEQIRHAGKPIYTYGPIIHNEEVVKDLEEKGVRVIESE EELKELHEGIVIIRSHGVGKHVYDILEANGIEIVDATCPFVKKIHKIAQEQNQAGRRLII IGNESHPEVQGIKGWGNDHTLVVETPDQVDNLPLSEGEKLCIVSQTTFNYKKFQDLVEKF SKTRYDILVLNTICNATQERQVEARRIASEVDAMIVVGGKTSSNTQKLYEICQKECKNTY YIQTLGDLDPECVSSVRSVGITAGASTPNHIIEEVHTNVRVKF >gi|229784014|gb|GG667721.1| GENE 4 4453 - 4761 431 102 aa, chain - ## HITS:1 COG:SP1603 KEGG:ns NR:ns ## COG: SP1603 COG0283 # Protein_GI_number: 15901443 # Func_class: F Nucleotide transport and metabolism # Function: Cytidylate kinase # Organism: Streptococcus pneumoniae TIGR4 # 1 96 127 221 223 101 61.0 3e-22 MDGRDIGTCVLPEADLKIYLTASTAVRAKRRYDELTEKGEHCSLEEIEKEIEERDYRDMH RENSPLKQAEDAVLVDTSDLTIDEVIGEILKLAAEKGIGQKN >gi|229784014|gb|GG667721.1| GENE 5 5718 - 5960 397 80 aa, chain - ## HITS:1 COG:alr2936_2 KEGG:ns NR:ns ## COG: alr2936_2 COG0283 # Protein_GI_number: 17230428 # Func_class: F Nucleotide transport and metabolism # Function: Cytidylate kinase # Organism: Nostoc sp. PCC 7120 # 7 71 6 70 229 69 50.0 1e-12 MSKSYNIAIDGPAGAGKSTIARAVSAKLGFVYVDTGAMYRAMALYFIRRGIRPDEEETVS EAVKEVGVTISYEDGAQLAS >gi|229784014|gb|GG667721.1| GENE 6 5997 - 6842 848 281 aa, chain - ## HITS:1 COG:CAC1849 KEGG:ns NR:ns ## COG: CAC1849 COG2081 # Protein_GI_number: 15895124 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 22 272 141 388 393 286 56.0 4e-77 MEEGRCRGILLKKGGAETKVYGDAVIVACGGLSYPSTGSTGDGYRFAREAGHTVTELSPA LVPFVVKEPVVKELQGLSLKNIEAAVMRGKKVIYREFGEMLFTHYGVSGPVLLSASSHAV KELKKGPLSLVIDLKPALSEEQLDARLLREFEGMINKQFKNSLVNLFPSRLVPVMVERSG ILPEKKVNEITREERKQIISATKSFTLTITGLRDYNEAIITQGGISVKEVNPSTMESKLV PGLYFAGEVLDLDAVTGGFNLQIAWSTAWTAGSEAAAGEKR >gi|229784014|gb|GG667721.1| GENE 7 7776 - 8135 397 119 aa, chain - ## HITS:1 COG:CAC1849 KEGG:ns NR:ns ## COG: CAC1849 COG2081 # Protein_GI_number: 15895124 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 14 115 1 101 393 
115 55.0 3e-26 MKTVIIVGGGAAGMLAGIAAARNGGTVHIFEKNEKLGKKVFITGKGRCNVTNACDTEDLF RNVVTNSKFLYSSFYGFNNFDMMALLEEAGCPLKTERGNRVFPVSDKSSDVIKALSIAS >gi|229784014|gb|GG667721.1| GENE 8 8253 - 9749 1844 498 aa, chain - ## HITS:1 COG:MA2391 KEGG:ns NR:ns ## COG: MA2391 COG2326 # Protein_GI_number: 20091222 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 492 67 567 569 452 47.0 1e-127 MLEKIDLSKKMDKDTYKQVIGEQSERLGLLQRELRTAGIPVMIVFEGMGAAGKGTQINRL IQALDPRGFEVFANSKATEEEQLRPFLWRFWTKTPADGRIAIFDRSWYRQVTIERFDGQI REKNLPEAFQDILSFERQLDDGGMVIIKLFLYISQEEQKKRFKKLEDSKETSWRVTKDDW RRNKEYGRYLEINEEMLQKTDTERAPWSIIEATDKDYAAAKIMTLVADRFEAALREKEAP KPETPKNPPVKEDKYLNGVLSGVDLTKTLTKEEYKKEIDRLQKRLEFLHSEIYRLRIPVV LGFEGWDAGGKGGAIKRLTSHLDPRGYRVNPTASPNDIEKVHHYLWRFWNNVPKAGHIAI FDRTWYGRVMVERIEGFCSEADWKRAYQEINEMEAHMANAGAVVIKFWLHIDKDEQERRF KERQENPAKQWKITDEDWRNREKWDEYETAVNEMLVRTSTTYAPWVVVEGNSKYYARVKV LRTVVDALEERVKQCGRK >gi|229784014|gb|GG667721.1| GENE 9 9851 - 10363 551 170 aa, chain - ## HITS:1 COG:MA0422 KEGG:ns NR:ns ## COG: MA0422 COG1453 # Protein_GI_number: 20089314 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Methanosarcina acetivorans str.C2A # 1 169 202 384 400 128 37.0 5e-30 MEPLLGGRLANPAPHLKKVFPEDKSPVEYALDFLWDQPEVSLLLSGMSDEKQVEENLEYA KRSSVGMAAERDKEVVKEAKRVFDSMALVGCTGCRYCMPCPFGLDIPQIFSLYNMTAAHR EEEAKEAYAALEKKAGDCRSCHHCEKECPQMIKVSEVMKDIAKVFEKTEA >gi|229784014|gb|GG667721.1| GENE 10 11299 - 11760 579 153 aa, chain - ## HITS:1 COG:TM1183 KEGG:ns NR:ns ## COG: TM1183 COG1453 # Protein_GI_number: 15643939 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Thermotoga maritima # 1 152 1 153 379 207 64.0 8e-54 MQYRNFGKTGIQVSALGFGTMRLPIFHDETVDEERAIAMIRHAIDEGVNYIDTAYPYHQG ESEKIVGKALADGYREKTYLATKCPVWKLEKAEDFETLLDEQLEKLNTDHIDFYLLHALS GERMEDKVKPFELVKRMEKAREAGKIRYLGFSS 
>gi|229784014|gb|GG667721.1| GENE 11 11922 - 12083 209 53 aa, chain + ## HITS:1 COG:alr1174 KEGG:ns NR:ns ## COG: alr1174 COG1592 # Protein_GI_number: 17228669 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Nostoc sp. PCC 7120 # 2 49 181 228 237 78 60.0 3e-15 MKKYVCEPCGYIYDPAAGDPDNGIDPGTAFEDLPEDWVCPLCGMGKDVFVPED >gi|229784014|gb|GG667721.1| GENE 12 12181 - 12786 627 201 aa, chain - ## HITS:1 COG:CAC0278 KEGG:ns NR:ns ## COG: CAC0278 COG0527 # Protein_GI_number: 15893570 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Clostridium acetobutylicum # 1 196 240 436 437 217 51.0 1e-56 MGASVLHEDAIFPVRKEGIPINIRNTNRPEEKGTLIVESTCRKPRYTITGVAGKKGFCSI NIEKAMMNAEVGFGRKVLEVFEKYGISFEHMPSGIDTMTVFVHQSEFEEYEQSVIAGIHR AVEPDYVELESDLALIAVVGRGMKATRGTAGRIFSALAHARVNVKMIDQGSSELNIIIGV KNADFEAAIRAIYDIFISTEI >gi|229784014|gb|GG667721.1| GENE 13 13765 - 14427 800 220 aa, chain - ## HITS:1 COG:CAC0278 KEGG:ns NR:ns ## COG: CAC0278 COG0527 # Protein_GI_number: 15893570 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Clostridium acetobutylicum # 4 219 5 220 437 248 53.0 8e-66 MKKVVKFGGSSLASARQFKKVGDIIRSDKSRRYVVPSAPGKRNDKDEKVTDMLYQCYDAA AEGKSYKKILEKIKSRYVEIIDGLDLNLNLDHEFETIEENFLKKAGRDYAASRGEYLNGI VMASYLGYEFIDAAEVIFFDEDGSFEAETTNRELGERLEHVERAVIPGFYGATHDGSIRT FSRGGSDVTGSIVAKAIHADMYENWTDVSGFLVADPRIIS >gi|229784014|gb|GG667721.1| GENE 14 14505 - 15890 1318 461 aa, chain - ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 12 458 94 545 547 213 31.0 7e-55 MEEAPIFSKTAIRRLILPLIIEQFLAVLVGMADTIMVAGVGETAVSGISLVDTLNILLIN VFTAMATGGAVIAAQELGDENRDEALKASNQLIFVVLIFSSVLMVLSLLFNYQILRTIYG TIEPSVMNYARTYYYLTALSFPFLAVYNGSAAICRSMGNSGISMKVSMLMNAINIAGNAI LIYGFHLEVAGAGISTLLSRITAAVILVWVIRNPKLPLHIDKNFHLGFHPAVIRRIMGIG 
VPMGVENGLFQGGKLLVAGLVSSFGTTSIAANAVAGTICSFQVIPGSAISLAMITVVGQA VGAKRYREAKRYAKNFIFLIMGAHFLICFPMLVFLPHIIGLYHVSEQTAELARIICTLHG LCCVLIWAPSFALPNALRAADDVTFAMCITIMSMLLCRIAMSFVLGRYLGMGLIGVWIAM VLDWAVRGVIYAGRMASGGWLKQHKRLAARQEAEKAVLQNT >gi|229784014|gb|GG667721.1| GENE 15 15826 - 16671 694 281 aa, chain - ## HITS:1 COG:no KEGG:BVU_3509 NR:ns ## KEGG: BVU_3509 # Name: not_defined # Def: putative arginase # Organism: B.vulgatus # Pathway: not_defined # 4 244 21 262 269 171 35.0 4e-41 MKSVSVLNFSGVYERQSFYKDRDCEWIDCSDLSGVNGFCDETSMEEIDARISGLEGWIHF IDGGNFHYLSYLLMKHIREPFTLVVFDHHTDMKPSMFAGLLSCGCWIKEALDALPFLKNV VLIGVADSLADTAEPDYAGRVRIISESMAEAGDTWLGILEAEASEAVYLSIDKDAFGREE VVTDWDQGTMTLELLERAYRILDSRRILGVDICGEADRDEFFSGQMSASDEQNDAANRRI LRMLLTDQKKPAGRIQKGERDGRSTDIFEDGHQASDPAADH >gi|229784014|gb|GG667721.1| GENE 16 16708 - 17301 782 197 aa, chain - ## HITS:1 COG:DR2421 KEGG:ns NR:ns ## COG: DR2421 COG2316 # Protein_GI_number: 15807410 # Func_class: R General function prediction only # Function: Predicted hydrolase (HD superfamily) # Organism: Deinococcus radiodurans # 5 182 30 204 205 135 43.0 4e-32 MKTNVTRDQALSLLMKYNKESFHILHGLTVEGTMRWYARELGYGEDEDFWGIAGLLHDVD FENYPEEHCRKAPELLAEVQAEPELVHAVVSHGYGLVSDVEPEHEMEKVLFASDELTGLI GAAARMRPSKSVMDMEVSSLKKKYKDKRFAAGCSRDVIAQGAERLGWTLEELFEKTIGAM RSCESSVNDEMEKLGAQ >gi|229784014|gb|GG667721.1| GENE 17 17309 - 17839 558 176 aa, chain - ## HITS:1 COG:CAC2339 KEGG:ns NR:ns ## COG: CAC2339 COG1683 # Protein_GI_number: 15895606 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 8 144 6 148 150 138 51.0 6e-33 MDKKNIRILVSACLLGVNCRYDGKNGRRDEVLELLKEYELIPVCPEQMGGLGTPREPAEC RRDGGQVRVMNRKGCDVTEYFEKGGEEALQIAKLYGCRYAILKERSPSCGSGVIYDGTFS GVKTAGDGVTARLLKKHGIKVAGESGISSLISEISSDMMDGSINITQLNRNQMTEE >gi|229784014|gb|GG667721.1| GENE 18 17863 - 19161 1115 432 aa, chain - ## HITS:1 COG:lin2791 KEGG:ns NR:ns ## COG: lin2791 COG1409 # Protein_GI_number: 16801852 # Func_class: R General function prediction only # 
Function: Predicted phosphohydrolases # Organism: Listeria innocua # 1 394 1 414 443 219 31.0 6e-57 MIKKTQIVVLSFLCAALAGCGSSRPSAEKETSSFTIVTATDLHYLSPELTDYGTGFMRMA ANGDGKLVQYAPEIVDAFVTDMAKLRPDAVILSGDLSFNGEKASHEELAEKLSCLQEQGI PVLVLPGNHDISYPFAASFEAETVTRTERVNGVEFRDLYKQLGYRDAAETDGDSLSFLYQ LSERVWILLLDANTEDAPGRIKDSTLQWAERQMERAEEAGAAVITVTHQNVLAQNKLLSR GFLLDNHEELERLLKGHGVVLNVSGHVHIQHIAENGGLYDIATGSLTVAPNRYGVLTVGS DGSTSYETRKVHVGRWDREKTGIAGDFENFEELSQAYFDECTRKKMKKELEALSVSEEER GQMTELAVEMNRNYFAGMTKPEKEIADTPGWKLWKEKGAELFFGSYMDSMMEDAGHDENR LLLPPKSDAGNR >gi|229784014|gb|GG667721.1| GENE 19 19313 - 19471 69 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKKHSPFFTKSKMKHLISDAGHSLPGIFLCALFLCIMTAVTGKVCPARLAS >gi|229784014|gb|GG667721.1| GENE 20 20513 - 20884 100 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153854927|ref|ZP_01996150.1| ## NR: gi|153854927|ref|ZP_01996150.1| hypothetical protein DORLON_02156 [Dorea longicatena DSM 13814] hypothetical protein DORLON_02156 [Dorea longicatena DSM 13814] # 8 115 40 152 152 87 41.0 3e-16 MTAVTGKVCPARLLLGIPCPGCGLTRAAVLLLKGDFRGSLLSNPFLLPLLAGGLLFLYEH YILERKARLFSACIAVCIFLMVVFYLIRMKYWFPNRPPMIYEPQNLLHLLFTALKSRICP RSA >gi|229784014|gb|GG667721.1| GENE 21 20993 - 21253 382 86 aa, chain + ## HITS:1 COG:no KEGG:ELI_3279 NR:ns ## KEGG: ELI_3279 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 10 80 131 202 210 68 43.0 1e-10 MENNTMMTDPNKAVMTMGEWLITLIVLAIPCVNVIMYFVWAFGNGNENRKNFCRAGLIVM AIGIVLSLVLYAVVGAGLAAALSAGY >gi|229784014|gb|GG667721.1| GENE 22 21351 - 21938 621 195 aa, chain - ## HITS:1 COG:CAC2640 KEGG:ns NR:ns ## COG: CAC2640 COG0740 # Protein_GI_number: 15895898 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Clostridium acetobutylicum # 1 193 1 192 193 265 67.0 6e-71 MSLVPNVIETTSKGERSYDLFSRLLKDRIIFLGEEVNDETARLAVAQMLFLEAEDPSKDI 
SFYINSPGGSITAGLAIYDTMRYIRCDVSTICIGLAASFGAFLLAGGTKGKRFALPNAEI MIHQPAVEKIGGKATDIQIYSEKLQRDKRRLNRILAENTGRTEEEIWRDTERDHFMSAEE AKAYGIIDTVMQKRG >gi|229784014|gb|GG667721.1| GENE 23 21935 - 22150 183 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623583|ref|ZP_06116518.1| ## NR: gi|266623583|ref|ZP_06116518.1| transposon Tn10 TetD protein [Clostridium hathewayi DSM 13479] transposon Tn10 TetD protein [Clostridium hathewayi DSM 13479] # 1 71 2 72 72 120 98.0 3e-26 METDKTIVEISCEADYQSQQAFTLAFQRVYGCTPMAYRERKHFTPLREQTKQKKAGNTAC RTRTDLWRCAA >gi|229784014|gb|GG667721.1| GENE 24 23074 - 23262 210 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871146|ref|ZP_06116519.2| ## NR: gi|288871146|ref|ZP_06116519.2| multiple antibiotic resistance protein MarA [Clostridium hathewayi DSM 13479] multiple antibiotic resistance protein MarA [Clostridium hathewayi DSM 13479] # 1 60 47 106 106 115 100.0 1e-24 MDGKNRIVKEILDSVEKKMESGESLSLEAMEAHTGYSRFYLNRIFSEVTGCTIHRYIMER SS >gi|229784014|gb|GG667721.1| GENE 25 23593 - 24795 1051 400 aa, chain + ## HITS:1 COG:FN1423 KEGG:ns NR:ns ## COG: FN1423 COG0426 # Protein_GI_number: 19704755 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 1 400 1 404 405 452 51.0 1e-127 MYCVKNIKDDLFWVGGSDRRLALFENAYPIPNGISYNSYLLLDEKTVLFDTVDRAITGQF LENVESVLAGRTLDYVIVNHMEPDHCATLGELIRRYPGISVVCNAKTIPIIKQFYEFDID TRAVIVKENDTFCSGRHTFTFLMAPMVHWPEVMVTYDTTDKILFSADAFGTFGAMNGNLF ADELYFERDWLADARRYYTNIVGKYGASVQTLLKKAAALDIEVLCPLHGPVWRSDIAWYV DKYLTWSSYEPEEKAVMIAYGSIYGNTENAANILACRLADRGVKNIVMYDVSSTHPSVII SEAFRCSHLIFASATYNGGIFSCMEHLLLDLKAHNLQNRTVALLENGSWGVTAGRKMGEI ISEMKNMTILDETITIKSSLKEDQMAELGALANIIVDSMN >gi|229784014|gb|GG667721.1| GENE 26 25005 - 25367 419 120 aa, chain - ## HITS:1 COG:no KEGG:Clole_3274 NR:ns ## KEGG: Clole_3274 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 110 1 110 113 100 47.0 1e-20 
MAKFEADLTGNLDHFKKYLKNEILNGSVSASFEDEHEEEINGVRCCTMVYERYSYMGGNR VSLNVTILEYEGELRLTAITSGGSQAMFFKLNTFGEEAFLEKLEEAVKAYKCRHTRMISG >gi|229784014|gb|GG667721.1| GENE 27 25354 - 26040 541 228 aa, chain - ## HITS:1 COG:no KEGG:Exig_1303 NR:ns ## KEGG: Exig_1303 # Name: not_defined # Def: hypothetical protein # Organism: E.sibiricum # Pathway: not_defined # 23 213 15 219 228 74 26.0 3e-12 MYVDYGYYIIRPCRCPEFLKDFSQWILTASSCICDVEPQPFSCMTNDERQKEEYRKRLGL EKREFIDFSEETLRLLGEDRLDTDSRFVSKQDADEIYRRYFSKQKGVDRGYRLIGIALEE AHLPSMEDRIIPKNVASRTAERHFLGFDILGWDISGFHTYLCNSLQKELMKRFKLKPGEF GLLKNSKEEVEAFADAIQNRGEPVEWIPFAVYDDTPAAAEGSEIHGKI >gi|229784014|gb|GG667721.1| GENE 28 26365 - 26697 135 110 aa, chain - ## HITS:1 COG:CAC1491 KEGG:ns NR:ns ## COG: CAC1491 COG4185 # Protein_GI_number: 15894770 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 17 101 97 181 187 64 37.0 5e-11 MRTYYTANNTQGKAIGIELHYVGVDSPEIAKQRIAERVKMGGHGIPDRDVEKRFGESFRN LHEVIGLCDLAALYDNINEFRRFAVYKCGEIVRLSKNTPEWYLKWRKGYY >gi|229784014|gb|GG667721.1| GENE 29 26646 - 26924 183 92 aa, chain - ## HITS:1 COG:CAC1491 KEGG:ns NR:ns ## COG: CAC1491 COG4185 # Protein_GI_number: 15894770 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 92 1 94 187 87 42.0 5e-18 MKKYILYAGVNGAGKSTLYRTTHYQDTMPRINTDEILREFGDWRNTADLMKAGRIAIERL NSYLQEGITFNQETTLCGLTILRTIHRAKQLG >gi|229784014|gb|GG667721.1| GENE 30 26925 - 27104 240 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623590|ref|ZP_06116525.1| ## NR: gi|266623590|ref|ZP_06116525.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 59 1 59 59 75 100.0 1e-12 MSKAEAARKLYEECKDMDIDETMELVLNAETEEEQDFFSMLSDFILQRKQKKVIAQKRF Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:13:07 2011 Seq name: gi|229784013|gb|GG667722.1| Clostridium hathewayi DSM 
13479 genomic scaffold Scfld115, whole genome shotgun sequence Length of sequence - 22389 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 13, operones - 6 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 96 - 155 4.2 1 1 Op 1 . + CDS 229 - 432 105 ## gi|266623591|ref|ZP_06116526.1| conserved hypothetical protein 2 1 Op 2 . + CDS 461 - 652 136 ## gi|266623592|ref|ZP_06116527.1| putative ABC transporter, permease protein - Term 708 - 747 -0.9 3 2 Tu 1 . - CDS 759 - 1631 853 ## COG1533 DNA repair photolyase - Prom 1806 - 1865 4.2 - Term 1887 - 1932 3.8 4 3 Tu 1 . - CDS 1996 - 3990 2016 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Prom 4076 - 4135 20.3 5 4 Op 1 . - CDS 5037 - 5873 809 ## COG0583 Transcriptional regulator 6 4 Op 2 . - CDS 5899 - 6432 188 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 7 5 Op 1 . - CDS 7470 - 7859 575 ## EUBREC_1411 lactoylglutathione lyase related lyase 8 5 Op 2 . - CDS 7913 - 8944 1293 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 9011 - 9070 7.2 9 6 Tu 1 . - CDS 10369 - 10992 697 ## COG2364 Predicted membrane protein - Prom 11072 - 11131 4.9 10 7 Tu 1 . - CDS 11168 - 11458 410 ## Cbei_3087 hypothetical protein - Prom 11566 - 11625 7.2 - Term 11639 - 11679 3.2 11 8 Tu 1 . - CDS 11694 - 13781 2376 ## COG2217 Cation transport ATPase - Term 13815 - 13850 6.1 12 9 Op 1 . - CDS 13891 - 14181 449 ## Rumal_2099 hypothetical protein - Prom 14205 - 14264 2.1 13 9 Op 2 . - CDS 14349 - 14735 495 ## Cbei_3026 heavy metal transport/detoxification protein - Prom 14779 - 14838 9.8 + Prom 14777 - 14836 8.3 14 10 Tu 1 . + CDS 14914 - 15297 476 ## COG1321 Mn-dependent transcriptional regulator + Term 15298 - 15360 18.3 - Term 15289 - 15343 16.2 15 11 Op 1 . 
- CDS 15409 - 15621 319 ## gi|266623607|ref|ZP_06116542.1| FeoA domain protein - Prom 15691 - 15750 2.9 16 11 Op 2 35/0.000 - CDS 15754 - 17523 1588 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 17 11 Op 3 3/0.000 - CDS 17523 - 19268 175 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 - Prom 19360 - 19419 4.0 - Term 19319 - 19362 10.2 18 12 Op 1 . - CDS 19425 - 20384 865 ## COG2207 AraC-type DNA-binding domain-containing proteins 19 12 Op 2 . - CDS 20435 - 20857 573 ## COG0716 Flavodoxins - Prom 20987 - 21046 5.5 + Prom 21120 - 21179 4.5 20 13 Tu 1 . + CDS 21252 - 22388 790 ## PROTEIN SUPPORTED gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 Predicted protein(s) >gi|229784013|gb|GG667722.1| GENE 1 229 - 432 105 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623591|ref|ZP_06116526.1| ## NR: gi|266623591|ref|ZP_06116526.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 67 54 120 120 97 100.0 3e-19 MDKNNLAVITCMCSGMTIGLAIGGAAGLSHGRAGITMCYGLLFGTIIGVIIGTVIKKVKE KNNITRK >gi|229784013|gb|GG667722.1| GENE 2 461 - 652 136 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623592|ref|ZP_06116527.1| ## NR: gi|266623592|ref|ZP_06116527.1| putative ABC transporter, permease protein [Clostridium hathewayi DSM 13479] putative ABC transporter, permease protein [Clostridium hathewayi DSM 13479] # 1 63 1 63 63 84 100.0 3e-15 MKQYVLPALLKALVMSVCFTLLCVIYGFISNNPYEITFIPEVIFFLLFFFVNFIEYLWKN RKK >gi|229784013|gb|GG667722.1| GENE 3 759 - 1631 853 290 aa, chain - ## HITS:1 COG:CAC3492 KEGG:ns NR:ns ## COG: CAC3492 COG1533 # Protein_GI_number: 15896729 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Clostridium acetobutylicum # 1 288 1 288 290 409 67.0 1e-114 MHEKQVKSILSAQNGMNLYRGCTHGCIYCDARSTCYQMDHAFEDIEVKTNAAELLEKALK SKRKKCMIGTGAMSDPYLHLEKRLCLTRRSLELIDYYGFGLSIQTKSDLILRDLDLLKSI 
NRKTKCVVSMTLTTYDEELCRIIEPNVCTTGRRAEVLRILKEEGIPTIVWLSPILPFIND TEENIRGILSYCRDASVYGVMLFGIGVTLRDGDRQYFYAQLDKHFPGMKERYIRTYGNSY EIPSPDTRHLLSLVKEECRKTGMVCSSNRLFEYMHAFEDKQAGVQMSLFE >gi|229784013|gb|GG667722.1| GENE 4 1996 - 3990 2016 664 aa, chain - ## HITS:1 COG:CAC3371_1 KEGG:ns NR:ns ## COG: CAC3371_1 COG1902 # Protein_GI_number: 15896613 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Clostridium acetobutylicum # 2 400 3 401 401 690 76.0 0 MKYKNLFTPVKIGSVTLKNRFALAPMGPLGLADAEGGFNQRGIDYYTERAKGGTGLIITG VTFSDCEVETQSMPNCPNSTYNPVHFIRTSKEMTERVHAYGSKIFFMMSAGFGRVTIPTN LGEFPPVAPSAIPHRWLDKICRPLTKEEIHSIVKSFGDGAYHAKRGGFDGVEIHAVHEGY LLDQFAISMFNQRTDEYGGTLENRLRFAREVVEEIKSRCGQDFPVVLRFSLKSMIKDWRE GALPGETFEEKGRDIEEGLETAKLLVQYGYDALDTDVGTYDAWWWNHPPMYQKKGLYRPY CRLVKEAVDVPVLCAGRMDNPEMASEAVEAGECDIVSLGRPLLADPDYVNKLRAGTCEQI RPCISCQEGCMGRVQEYSMINCAVNPQAARERVTAYEPILKSRRVMVVGGGAAGCEAARV LAVRGHKPELFEKGGRLGGNLIPGGVPDFKEDDLALAHWYEVQLQTLKVPVHFNTCVDRE MVLAGNYDAVILATGSRPKVFSLGNDERVFPAAEVLTGEKDCGDTTVVIGGGLVGCETAL WLAEQGKRVTIVEALDRLMAVNGPLCHANKDMLERLIPYRGVKTVTSASVTGYRNGILSL VCGDEEREIACDSVILAVGYQEEDSLYRQLEFDVPELYLLGDARKVSNIMYAIWDAFEVA NHIG >gi|229784013|gb|GG667722.1| GENE 5 5037 - 5873 809 278 aa, chain - ## HITS:1 COG:BS_alsR KEGG:ns NR:ns ## COG: BS_alsR COG0583 # Protein_GI_number: 16080655 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 1 250 1 253 302 89 26.0 5e-18 MNVKDLKSFEAVYEERGINQAAQKLFITPQGLSRNIKMLESELDTVLFVRTKQGVRPTES AELLHERAGILIREFEEIRNGIQQLKNEKRLLRIGCACGVFNVLSFPVIQRFIDENPEIS VEWCEYSNQEVKERLKDSTIEYGFVVGAWKENGVKNRRLAGCRVCLLVYEGHPFYDLEGV TIDMVRGEKLILMNEYFHMYHDFLEACEVRGFKPHIAAKTADANFQYQLCRQKTGLAVVP DFAAEHFRMDHMRAIPFRERLKWEVYGVYKESNGCYSV >gi|229784013|gb|GG667722.1| GENE 6 5899 - 6432 188 177 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 25 172 64 213 378 77 32 1e-13 
MSVRLAAQCTDARVNVVVQDLYAEYPDVEALAGAEVEDIEKIVKPCGLGHSKARDISACM KILKEQYDGRVPDDFDALLKLPGVGRKSANLIMGDVFGKPAIVTDTHCIRLVNRMGLVED LKDPKKVEMALWKLIPPEEGSDFCHRLVFHGRDVCTARTKPFCEKCCLKDICARNGV >gi|229784013|gb|GG667722.1| GENE 7 7470 - 7859 575 129 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1411 NR:ns ## KEGG: EUBREC_1411 # Name: not_defined # Def: lactoylglutathione lyase related lyase # Organism: E.rectale # Pathway: not_defined # 6 129 7 130 130 134 51.0 2e-30 MAAQYITGLQHIGIPTNDIEKTTAFYHELGFETAFETVNEAAGERVVFLRCGSLVIETYE NHQAAERNGAVDHVALDVTDIEAVFAEIKEAGYPMLDEEIQFLPFWANGVRFFTILGPNR EKVEFSQML >gi|229784013|gb|GG667722.1| GENE 8 7913 - 8944 1293 343 aa, chain - ## HITS:1 COG:PA4153 KEGG:ns NR:ns ## COG: PA4153 COG1063 # Protein_GI_number: 15599348 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Pseudomonas aeruginosa # 12 339 21 360 363 192 34.0 1e-48 MMLQQVMTAPGEIEFREVPVPEPEENQVLVKIRKIGVCGSDIHVYHGEHPFTSYPVTQGH EVSGEIEKLGSGVKGWKVGQKVTIQPQVVCGTCYPCRHGKYNLCEKLKVMGFQTTGVASD YFAVDAAKVTPLPEEMSFDEGAMIEPLAVAVHAVKRAGNVEGAKIAVLGAGPIGILVAQT AKGLGAESVMITDVSDLRLEKAKECGADFCINTKTKNFGEAMVENFGPDKADIIYDCAGN NITMGQAIQYARKGSTIILVAVFAGTAQIDLAVLNDHELDLNTSMMYRNEDYLDAIRLVK EKKVVLAPLISKHFAFRDYLEAYRYIDENRESTMKVIINVAES >gi|229784013|gb|GG667722.1| GENE 9 10369 - 10992 697 207 aa, chain - ## HITS:1 COG:HI0522 KEGG:ns NR:ns ## COG: HI0522 COG2364 # Protein_GI_number: 16272466 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Haemophilus influenzae # 16 198 28 204 218 70 29.0 2e-12 MIRDYFRKLDRRRAVIMIIGNVFTGMGISIFKLSGLGNDPFSGMVMALSDCTGIQYAFFL ILVNLVIFIAEITMGRELIGLGTAVNALLLGYIVTFFFGIWEVTLGMPSVLWQKLLTVFV GVVVCSFGLSLYQTSDVGVAPYDSLALIMCKKQKKISYFWCRMACDVFCAVVCFAAGGIV GLGTLVSALGLGPFVQFFNKTSKNLAS >gi|229784013|gb|GG667722.1| GENE 10 11168 - 11458 410 96 aa, chain - ## HITS:1 COG:no KEGG:Cbei_3087 NR:ns ## KEGG: Cbei_3087 # Name: not_defined # Def: 
hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 89 5 93 103 124 68.0 9e-28 MSVVIIGGHDRMVSQYKKICREFKCKAKVFTQMSTNMDKKIGCPDLLVLFTNTVSHKMVK CALDGIDDGATRVVRCHTSSGTALSKILEDETPQMV >gi|229784013|gb|GG667722.1| GENE 11 11694 - 13781 2376 695 aa, chain - ## HITS:1 COG:SP2101 KEGG:ns NR:ns ## COG: SP2101 COG2217 # Protein_GI_number: 15901916 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 103 688 100 683 687 388 35.0 1e-107 MRFIVKHEIKGRLRIHLDRRELTLREADMFCCAMSLIPGVKEVKVYERTADAAILYEGGL RPVIIEEIRRFSFQDESLCELVPKDSGRALNREYKEKLIQRVVVRAFTKVFLPVPFRAVY TAVKSAGYIIKGLRCLLKGKLTVEVLDATAITVSIARLDFNTAGSVMFLLGIGELLEEWT HKKSVGDLARSMSLNIEKVWQNVDGTEVMVPVSRIREGDLVTVHVGSVIPLDGVVVTGEA MVNQASMTGESIPVKKEPDTYVYAGTAVEEGSITICVKKAAGSTRFERIVKMIEDSEKLK SNVEGKAAAMADALVPWSLGGTVLTWLLTRNVTKAISILMVDFSCALKLAMPLSVLSAMR EAGNYRVTVKGGVYLESVAEADTIVFDKTGTLTKARPVVAGVVTFEGRDEDEMLRIAACL EEHFPHSMANAVIQESVRRGLIHEEMHSKVDYIVAHGIASYIGGERVIIGSHHFVFEDEG SVIPEDGREQYESLPPEYSHLYMAVGGRLAAVICIIDPVREEAASVVNQLRALGLSKIVM MTGDSDRTAKAIAQKVGVDEYYSEVLPEDKASFVEAERAAGRKVIMVGDGINDSPALSAA DVGIAISEGAEIAREIADITISEDTLDQLVTLKRISNALMVRVNRNYRFVIGFNLCLILL GVNGILAPATSALLHNTSTLMVSLKSMTNLLEQKQ >gi|229784013|gb|GG667722.1| GENE 12 13891 - 14181 449 96 aa, chain - ## HITS:1 COG:no KEGG:Rumal_2099 NR:ns ## KEGG: Rumal_2099 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 4 72 3 71 91 94 68.0 1e-18 MLDCFKMKKIGIFAGGVLFGTAGVKILASKDAKKFYVNCLAAGLRAKDCVMTTATSIQEN ADDILAEAKEINRQRADGAFEDESEEMVSETAEEVE >gi|229784013|gb|GG667722.1| GENE 13 14349 - 14735 495 128 aa, chain - ## HITS:1 COG:no KEGG:Cbei_3026 NR:ns ## KEGG: Cbei_3026 # Name: not_defined # Def: heavy metal transport/detoxification protein # Organism: C.beijerinckii # Pathway: not_defined # 1 119 1 119 122 95 41.0 5e-19 MATAIICLALIMAGVLAIRSYRKRLTTGCCGGSGEAAVKKIKVSDKDLTHYPHRRVLKVD GMSCGNCAIRVENALNSLEGVYARVNLMEAEADVRMKQEFSDTVLKDTVKDAGYTVYRIR PVNDSISQ 
>gi|229784013|gb|GG667722.1| GENE 14 14914 - 15297 476 127 aa, chain + ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 3 116 1 115 122 110 53.0 4e-25 MKMNESSENYLETILILSNRKPHVRSIDIANELDFSKPSVSVAMRNLRENGYILVDQDGY ITLTDSGKQVAETMYERHTMLSNWLMYLGVDEKTAAEDACRIEHVVSAKSFQAIKDHVTK GNELLEK >gi|229784013|gb|GG667722.1| GENE 15 15409 - 15621 319 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623607|ref|ZP_06116542.1| ## NR: gi|266623607|ref|ZP_06116542.1| FeoA domain protein [Clostridium hathewayi DSM 13479] FeoA domain protein [Clostridium hathewayi DSM 13479] # 1 70 1 70 70 128 100.0 2e-28 MQSLAQAKAGDDYTIKWMFGAPQILDFLNEYGIREGSDIRVFQQGKDGIIIGRDNVRLAI GEEVARRIQV >gi|229784013|gb|GG667722.1| GENE 16 15754 - 17523 1588 589 aa, chain - ## HITS:1 COG:SA2216 KEGG:ns NR:ns ## COG: SA2216 COG1132 # Protein_GI_number: 15928006 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Staphylococcus aureus N315 # 1 579 1 577 577 366 34.0 1e-101 MVQLFQRILNLSGKYRSRIEIAFLFSVLKAILSKMPLCVAFFMLSRFYTGAITAEDCVTA AVVLCVSVLLEAFFQLMSDRLQSGAGYMMFAEQRRDLGAHLRKLPMGYFTAGNIGKISSV LTTDMIFVEEIAMSSIANMMSYAFSAALMAVFLFFLDVRLGIIGLLVTVTAMITAGKMNR ISMKEAVGRQEQSEKLTDAVLSFTEGIAIIKSYNLLGEKSAELSNNFKLSRDKSLGFEEA MTPWQRGLNLIYACGMALLFGVSVYLQQTGSLTIPYLMGLLLFVFDLFGPLKALYGEATR LTVMNSCLDRIEGVCAEKELPDQGKKHIPERNHKAEIEFDHVTFAYHEKEVLKDVTFKIQ PNTMTALVGPSGGGKSTAANLLARFWDVKSGCVRMRGEDIRDVPLAELMDHISMVFQRVY LFQDTIYQNISMGKTEALKEEVYEAARKARCYDFITALPEGFDTVIGEGGASLSGGEKQR ISIARCILKDAPVVILDEATASVDADNERFIQEAISELVKGKTLLVIAHRLNTIRAAGQI LVIDDGRVVQRGNHEALIKEKGIYRDYVNIREKAAGWSIDGLSRQIQKD >gi|229784013|gb|GG667722.1| GENE 17 17523 - 19268 175 581 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 346 552 294 507 563 72 25 3e-12 MFQKILEYAGPYKKTTFLASAVILLGVFMSVLPFIFACQIIAPLIEGEPMDLSYAAVRIA GVLVCLVLHACLYVKGLSLSHEAAYNILMRIHISLQNRMEKLPLGVIQEKGTGSIKRLFV DDVESMELLLAHALPEGFANTVIPVVVYTALFFVDWKLALMSLASLPAGLLAMMVMYRIG QNGMVSYYEAGKVMNNTIIEYINGMEVVKVFNRDGESYERFQRDVLAYRDLTLDWYKACW PWMAAYNSILPCTLIVTLPLGSWFVLKGYSSLPDFILVMCLSLSLGIPLLRALSFLPSLP QINYKLNALEQMLGAEPLKQTADGWHGKDNTVVFDRVTFGYEEKSVVSGVSLMVREGQKT ALVGESGSGKSTLAKLLIHYYDVEEGNITIGGQDICGMSLEALNARISYVAQEQYLFNTS LLENIRIGRLTATDEEVLEAAAKAQCMEFIEKLPDGIRTMAGDGGKQLSGGQRQRIALAR AILKNAPVVVLDEATAFTDPENEEKMEAAISEVVKGKTLLVIAHRLSSVKNADQICVMAE GRIAARGTHDELLKNSEEYRKLWYASMESAEWRVSGGKEER >gi|229784013|gb|GG667722.1| GENE 18 19425 - 20384 865 319 aa, chain - ## HITS:1 COG:alr2189 KEGG:ns NR:ns ## COG: alr2189 COG2207 # Protein_GI_number: 17229681 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Nostoc sp. PCC 7120 # 176 315 191 331 341 100 39.0 4e-21 MQDASRYFNDLFEQMPSEQVYEVLRNAPEEIGRFSICRKNTRQGIDLLDWRMNYQRDVHV YRRGENNRDEVQFVFCMNQGMVWEIEGIPRPVSMEKGNAYIYRNGDRTTWGYYPGKCDFV FKSVQIPVQVYRRLQNAYFEGREEISAERFSSAVESIEITPGICRILAELEQSGRYEDGI GCLYLEGKLLELLAASFKELLTCRKAQKRLCGITRTDQEAVREAKRILDRHLTDAPDCEQ LARAVGVSVSRLARLFSEVTGNTIHSYLIERRLEHAALLLTEQKNVSQAAALSGYSNMSH FSASFKKKYGVLPKEYAGK >gi|229784013|gb|GG667722.1| GENE 19 20435 - 20857 573 140 aa, chain - ## HITS:1 COG:CAC0587 KEGG:ns NR:ns ## COG: CAC0587 COG0716 # Protein_GI_number: 15893876 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Clostridium acetobutylicum # 1 139 1 139 142 114 48.0 5e-26 MSKTAVVYWSGTGNTEAMAAKVVEGAQAAGAEVSLFTAAEFGADKMDEFDSVAFGCPSMG SEQLEEDEFEPMFLSCKAKLRGKKIALFGSCGWGDGEWMRTWEETCREDGAVLACDCVIC NDAPDAEAESACEEMGKALA >gi|229784013|gb|GG667722.1| GENE 20 21252 - 22388 790 379 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 [Haemophilus influenzae PittAA] # 4 379 2 362 456 308 44 2e-83 
MLNTIESINNAVNNFIWGVPAMICIIGVGLYLSIRTNFLQIRKFSYAIRVTLGRMFRKRD ASDGTMTPFQAVCTALAATVGTGNIAGVAGAIAIGGPGSVFWMWISAILGMCTKFSEVTL AVHFREVNEKGDLVGGPMYYIKNGLGKNWRWLATLFLVFGVLTVFGTGNATQVNTITTAI NSALMNYHLIDARSARTTSLIIGIILAALVALVLLGGIKRIGSVTEKLVPFMALFYIVLA VGVVLLNINRVPAVFSSIVYGAFNPAAVTGGVVGTFFMSMKKGVSRGIFSNEAGLGTGSI AHACADTKKPVKQGFFGIFEVFADTIVICTLTAFVILCSGVPVPYGEAAGAELTILGFTS TYGSWVSIFTAAAMCCFAF Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:13:39 2011 Seq name: gi|229784012|gb|GG667723.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld116, whole genome shotgun sequence Length of sequence - 16161 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 6, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 24/0.000 + CDS 2 - 1924 2129 ## COG0209 Ribonucleotide reductase, alpha subunit 2 1 Op 2 . + CDS 1959 - 3005 1258 ## COG0208 Ribonucleotide reductase, beta subunit 3 2 Tu 1 . - CDS 3950 - 5245 1128 ## COG2206 HD-GYP domain - Prom 5280 - 5339 4.1 + Prom 5300 - 5359 6.9 4 3 Op 1 . + CDS 5433 - 5888 571 ## COG2731 Beta-galactosidase, beta subunit 5 3 Op 2 . + CDS 5963 - 6079 97 ## 6 3 Op 3 . + CDS 6024 - 7139 761 ## gi|266623617|ref|ZP_06116552.1| conserved hypothetical protein 7 3 Op 4 . + CDS 7219 - 7932 867 ## COG2188 Transcriptional regulators + Prom 8057 - 8116 9.9 8 4 Op 1 9/0.000 + CDS 8174 - 9583 1678 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 9 4 Op 2 7/0.000 + CDS 9585 - 10100 571 ## COG0366 Glycosidases 10 4 Op 3 . + CDS 10043 - 11266 911 ## COG0366 Glycosidases - Term 11056 - 11116 1.1 11 5 Op 1 . - CDS 11263 - 11823 705 ## COG1971 Predicted membrane protein - Prom 11847 - 11906 2.8 - Term 11997 - 12027 0.3 12 5 Op 2 . 
- CDS 12122 - 13036 918 ## COG1284 Uncharacterized conserved protein - Prom 13169 - 13228 10.2 + Prom 13113 - 13172 5.0 13 6 Op 1 7/0.000 + CDS 13200 - 14996 1318 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 14 6 Op 2 . + CDS 14993 - 16160 779 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|229784012|gb|GG667723.1| GENE 1 2 - 1924 2129 640 aa, chain + ## HITS:1 COG:TP1008 KEGG:ns NR:ns ## COG: TP1008 COG0209 # Protein_GI_number: 15639992 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Treponema pallidum # 5 640 210 845 845 982 69.0 0 TSGQLYGDYILEHYTKEEIETCEGFLDGTRDQLFTYSGLELLLSRYVIRSRENVPLESPQ EMFLGIALHLAMLEKNDRLEWVRRFYDMLSTLKVTMATPTLSNARKPYHQLSSCFIDTVP DSLDGIYRSIDSFAKVSKFGGGMGLYFGKVRAAGSAIRGFKGAAGGVIRWIKLANDTAVA VDQLGVRQGAVAVYLDAWHKDLPEFLALRTNNGDDRMKAHDVFPAVCYPNLFWKLAKEDI DADWYLMCPHEILTVKGYSLEDSYGREWEERYWECVADHRIHKRVIPVKELVRLILKSAV ETGTPFTFNRDLVNEANPNSHKGIIYCSNLCTEIAQNMSPVEQVEQRIETVDGETVVVTV TKPGDFVVCNLASLSLGKIDVTNREEVTRLTQYAVRALDNVIDLNFFPVPYARINNMKYR PIGLGVSGYHHMLAKNGISWESEEHLAFTDRVFEWINYAAIEASCDNAREKGCYELFEGS DWQTGAYFEKRKYHSEEWLRLKEKTAANGMRNGWLIAIAPTSSTSMLAGTTAGLDPIMNR YYLEEKKNGLVPRVAPDLSPETFWQYKNAHYIDQQWSVRAGGIRQRHVDQAQSMNLYITN EYTLRQVLNLYILAWECGVKTIYYIRSRSLEVEECEVCSS >gi|229784012|gb|GG667723.1| GENE 2 1959 - 3005 1258 348 aa, chain + ## HITS:1 COG:TP0053 KEGG:ns NR:ns ## COG: TP0053 COG0208 # Protein_GI_number: 15639047 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Treponema pallidum # 4 346 8 350 351 553 74.0 1e-157 MEQLIRKPLFNPDGDTDVRNRRMIGGNTTNLNDFNNMKYTWVSDWYRQAMNNFWVPEEIN LGPDVKDYPMLDAGERRAYDKILSFLVFLDSIQTANLPAIGEYITANEINLCLSIQTFQE AVHSQSYSYMLDTICEPQTRNEVLYQWKNDEHLLARNTFIGNLYNEFQKDKSPFTFMKTV VANFILEGVYFYSGFMFFYNLGRNHKMPGSAQEIRYINRDENTHLWLFRNMILELKNEEP 
ELFTPDRVDVYREMIKEGCRQEIAWGHYAIGDKVPGLTKDMITDYIQYLGNLRCMSLGFE PIYEGHEKEPESMAWVSQYSNANMIKTDFFEARSTAYAKSSALVDDPS >gi|229784012|gb|GG667723.1| GENE 3 3950 - 5245 1128 431 aa, chain - ## HITS:1 COG:VCA0681 KEGG:ns NR:ns ## COG: VCA0681 COG2206 # Protein_GI_number: 15601438 # Func_class: T Signal transduction mechanisms # Function: HD-GYP domain # Organism: Vibrio cholerae # 16 386 28 414 431 138 27.0 2e-32 MQYTDPIDNRGILEIIRRSCSYVDARLMNHGYRVAYIVSEMLKPYFHDDARKIRDVCFLA LLHDIGAYKTDEISKMLQFDTDDVLAHSVYGYLFIKYFSPLKKLAGAILLHHAAWQVLEN LEAFPEEIRLLAQVLHAADRIDVAMEVQHLTWEETLALLKKDSGTNFAPVITDLAAACDF HTPFEGNNDAEALFFQMLSEIPLTQDEITDYLKMLIFTIDFRSRHTVTHTITTTTISSEL AKRMGLSEKECNQVVCGSMLHDLGKIGIPVEILEYPGKLSPQAMNIMRTHVDITEKIFGD VIEETVKCLALRHHEKLDGSGYPRGLTAADLSTGERIVAIADIISALSGTRSYKTAFSKD RICSIITEMKDDGLIDADIVELMISDYDEIMEATRIQCQPILDIYARIQQDFDRISRAMT MDDIAGMKAVF >gi|229784012|gb|GG667723.1| GENE 4 5433 - 5888 571 151 aa, chain + ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 1 151 1 151 152 95 29.0 4e-20 MLTSNLNVVEKYDYLAEKFRKGYEFLRTADLKALPVGRVDIEGDDIYASVQEYTSLKADT CRFEAHNRYFDIQYVVEGEEQFGYAKREDLTEDAPYDETNDIVFFREPEEAGTVLLKAGD CIVVAPEDAHKPRCQAKEACRVKKVVIKVRV >gi|229784012|gb|GG667723.1| GENE 5 5963 - 6079 97 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGTVWPVCLAGGDWQDGEQECGKGTAVSWGNPQYGSSG >gi|229784012|gb|GG667723.1| GENE 6 6024 - 7139 761 371 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623617|ref|ZP_06116552.1| ## NR: gi|266623617|ref|ZP_06116552.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 371 26 396 396 622 99.0 1e-176 MEKGPPYHGETRSTAAVGKTEHAAGAGDTAAEGLRPGALDTVEFGSAGSGAVSEEAVMDC LRILERDYPGMNVVIWEREGAADLAQLAAALGKGRFLVVTRDFLDRMGRDAGEYERCRTV LKETACRLAQGGSASVSEGVCLDASGKRTWVVPAAGKQDSGPKKTTPGETGLPGSPGKQS 
WESEWNLRKTSAASYAVSGHYGSLARAGTRGQVQKVLGDVHRSIGNLRLAACFGDDEERM KAGRAIRSLQKLLARGNRKIRRLDRESSLKRQKKHAEKARKEKKVLQIRLEMKKSRSARY GADYRLVREGLADTCWILGAARHRTREEIRAEQALPGEAGIGDGLGFTGLDGGGEGSFHA SDVVISDGGSD >gi|229784012|gb|GG667723.1| GENE 7 7219 - 7932 867 237 aa, chain + ## HITS:1 COG:BH0873 KEGG:ns NR:ns ## COG: BH0873 COG2188 # Protein_GI_number: 15613436 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 235 1 235 237 204 46.0 1e-52 MESKYKHLYNQLQDQIEKREYKAGDKLPSEGDLMEIYGASRDTVRKALDLLVQDGFIRKA KGKPAVVLDKNKFNFPVSEIASFKEIYRFSDSKPRTYVENLEIVKNDPKLMEALQIGPED EAFVLERVREIEGEKIIIDKDFFSRKVVENLPLRAAQDSVYEYLEHEVGLKIGFAMKEIT VQMAGDEDRCLLDMKNYDMIVVVKSYTYLEDSTLFQFTESRHRPDKFKFVDFARRKL >gi|229784012|gb|GG667723.1| GENE 8 8174 - 9583 1678 469 aa, chain + ## HITS:1 COG:lin1223_2 KEGG:ns NR:ns ## COG: lin1223_2 COG1263 # Protein_GI_number: 16800292 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Listeria innocua # 92 464 1 375 403 511 70.0 1e-144 MGKYSDDAKKLLDYVGGKQNITAVTHCMTRMRFVLADPEKADVAKIEALKSVKGTFTQAG QFQVIIGNDVSVFYNEFTSVSGIEGVSKEEVKQEAKGNMNPVQKAVANIAEIFAPLIPAI IVGGLILGFRNIIGEIRFMNGGTETLIEVSQFWAGVHSFLWLIGEAIFHFLPVGITWSVT KKMGTTQILGIVLGLTLVSPQLLNAYAMAPGVEVPVWDFGFFTMQMVGYQAQVIPAILAG FVLVYLEKFFKKITPQAISMIVVPFLSLVLAVIAAHAVLGPVGWAVGSWVSRVVYAGLTS SFRWLFATVFGFIYAPLVITGLHHMTNAIDLQLMAEFGGTMLWPMIALSNIAQGSAVLGM IYLQKHNEEAKQISIPACISCYLGVTEPAIFGVNLKRGFPFISAMIGSAIAATVSVGSNV MANSIGVGGIPGILSIQPQYMGRFAVCMLITIVVPFMLTVVIGKKKDVR >gi|229784012|gb|GG667723.1| GENE 9 9585 - 10100 571 171 aa, chain + ## HITS:1 COG:STM4453 KEGG:ns NR:ns ## COG: STM4453 COG0366 # Protein_GI_number: 16767699 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Salmonella typhimurium LT2 # 9 164 7 158 550 174 53.0 8e-44 MLEENMKDFKKSCVYQIYPKSFRDTTGNGVGDLNGVTEKLDYLKKLGVDYLWLTPFFVSP 
QNDNGYDIADYYHIDPLFGTMEDLDRLIKEAADRGMGLMFDMVFNHTSTHHQWFARALKG EKEYQDYYIFRKGKKGVPLPTGSPSSAAVPGSMWRSWDSITFTYLTGRRRI >gi|229784012|gb|GG667723.1| GENE 10 10043 - 11266 911 407 aa, chain + ## HITS:1 COG:BS_treA KEGG:ns NR:ns ## COG: BS_treA COG0366 # Protein_GI_number: 16077848 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus subtilis # 2 402 157 554 561 456 56.0 1e-128 MEELGQYYLHLFDRTQADLNWENPKVREEIRRVLLYWKKKGIRGFRFDVINLISKPAYFE DDDEGDGRRFYTDGPRVHEYLKEMVKETGLYEDDIITVGEMSSTTIHHCIQYTNPQERQL KMCFNFHHLKVDYKNGDKWSLMPCDFMGLKKIFHTWQSGMAEGGGWNAVFWCNHDQPRAV SRFGDDTKYRKQSAKMLATAIHGMRGTPYIYQGEELGMTNAGFTSASQYRDVESLHYFDI LKESGIDEAEVCEILRQRSRDNSRTPMQWSDGVNAGFTSAVPWIEVNRNHTTINAEECLE DSDSIFYYYQKLIRLRKEYDVISEGGYEPILEDHEAVFAYKRIYGGETLIVLTNFTDTCQ FIEAADIKLSEEELTEYKTLIANDGLLRSAAGIRLAPYGAVMLIREP >gi|229784012|gb|GG667723.1| GENE 11 11263 - 11823 705 186 aa, chain - ## HITS:1 COG:Cj0167c KEGG:ns NR:ns ## COG: Cj0167c COG1971 # Protein_GI_number: 15791554 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Campylobacter jejuni # 1 182 1 183 187 154 57.0 7e-38 MSLIELFIIAVGLSMDAFAVSICKGLSMQKMSWKNAVIVGLYFGGFQAGMPLIGYILGSQ FKNAITSIDHWIAFILLGIIGFNMIRESLNQEEECVDCSVSPGNMILLAIATSIDALAVG VTFAFLQVRIVPAVSFIGSTTFILSIIGVKVGNVFGTKYKSKAEFAGGLILVIMGVKILL EHTLGI >gi|229784012|gb|GG667723.1| GENE 12 12122 - 13036 918 304 aa, chain - ## HITS:1 COG:SP2113 KEGG:ns NR:ns ## COG: SP2113 COG1284 # Protein_GI_number: 15901928 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 41 298 50 308 313 126 28.0 5e-29 MAEKQGVSDLYKKNAEKAGKDKRLRDAVSLFFILFGSVIVAVNLNTFVEQGKLVPGGFSG LAKLIQRVGLAFFGVKISFTFLNVVFNAIPAVFAYRIVGKKFTILSCVSLLTVSILVDQL PVIPITGDLLLISVFGGIINGLGMSLILNNRASGGGTDFIAMSLSAKYKISTFNYMLLFS AVIILISGAIFGMDIALYSIIYQFCNTQVINTLYKKYKKKTLLIVTDNPAAVSADLMELT NHSSTILKGFGSYSANKKYLLYTVLSDSDVKKMKKRIREQYPDTFVNVINSSDVVGNFYI QPLE >gi|229784012|gb|GG667723.1| GENE 13 13200 - 14996 
1318 598 aa, chain + ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 94 592 75 594 602 161 26.0 3e-39 MKQLPHFNSIRKKVIFLDLIFVFPLFVLFCILMSYVFEDSNYKLNNSKLSLLEEKCSNIA NRNTEIIKISVGMYLDQNINRLISKKNKLTGYEYISAQDSIQSKMLELTELFPDRQYQVM LLCENGMNFFQSSLNFTQDMITLKMLDQESWYQEAMEKDDSVYFLPKYRSPALQKLFRED TLFAVQKMRNLNSGRNVGIMIVAISRDIWGNVILSDEDSDVNTMVIDQYRKIIFSSDPQM YGYEVENNSYYDQIANYSKGFFLGNVKKQYCHIRFASIEGVGWKLISYEPYQHGWSSFYV ILFLVLGAAMFSVLVIIVFYNCSFISRRMEKLNKNILEVSNGDLKTRIQEDYEVEFHEIC HNFNYMLDHIENLMKQLEKEEEEKHALEIQALQAQINPHFFHNTLVTIRFMIQMEEYNEA DRALLAFSKLLRKSFVNSQKIIPIREELAMVEDYLELMKLRYRDKFQWKILAAEDVMELG ILKNVIQPLVENSISHGFNMKDDMGHIVICAYRIKESVIIEVEDDGVGVDLEKINNSIHN QQTVKAGDQLNGIGLSNIQMRILRNFGSQYGLSVEVNSSGGVTFRMNIPVIQMEGVNS >gi|229784012|gb|GG667723.1| GENE 14 14993 - 16160 779 389 aa, chain + ## HITS:1 COG:lin2118 KEGG:ns NR:ns ## COG: lin2118 COG4753 # Protein_GI_number: 16801184 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Listeria innocua # 1 126 2 127 494 78 34.0 2e-14 MNIVIADDEEIILKWMKKNIEALSPENHVIASCINGMQVLNCCLNGSVDVLFTDIRMPVM DGMELLKKLNQNHVLPYTVVLSAYDDFSYARDCLKLGVSEFLLKSEITKDELETCLQTAK ERIKNNGKAEQKQSSPEEELEKVLLQYLKEPDYISRDAVRQVWNNCCGIRGKYAVFLIRD TDRISNPDQLKEILPIAFPDEKRVQYCLSVNERETVVLAEMIDGDLQKFTENLYETIASF GGINMYIGSSDAGKTVDELEGLYDHSRDTLAYQLFYQHIGGLNYETMKKRIGRLEALLEE MFEKLELLINSRKWNLILEEMNQIFDYARKSEPEPAGMKRLFLNSLLNLYWSCLEEADRR SFSIDKIIEVTNCTDIDQLENMVLIQAEE Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:14:07 2011 Seq name: gi|229784011|gb|GG667724.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld117, whole genome shotgun sequence Length of sequence - 23220 bp Number of predicted genes - 22, with homology - 20 Number of transcription units - 
15, operones - 5
average op.length - 2.4
N Tu/Op Conserved pairs(N/Pv) S Start End Score
+ Prom 161 - 220 4.6
1 1 Tu 1 . + CDS 470 - 1483 689 ## COG3344 Retron-type reverse transcriptase
+ Term 1587 - 1620 5.2
2 2 Tu 1 . - CDS 1681 - 2136 331 ## COG1943 Transposase and inactivated derivatives
- Prom 2168 - 2227 2.7
- Term 2278 - 2329 8.2
3 3 Op 1 6/0.000 - CDS 2354 - 3631 957 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase
4 3 Op 2 5/0.000 - CDS 3648 - 4748 1272 ## COG0287 Prephenate dehydrogenase
- Prom 4774 - 4833 10.9
5 3 Op 3 . - CDS 5735 - 6685 1077 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase
- Prom 6724 - 6783 5.0
+ Prom 6795 - 6854 7.9
6 4 Op 1 . + CDS 6914 - 7168 306 ## Closa_2254 hypothetical protein
+ Term 7177 - 7224 -0.9
+ Prom 7196 - 7255 5.7
7 4 Op 2 . + CDS 7287 - 9389 2420 ## COG0480 Translation elongation factors (GTPases)
+ Term 9415 - 9458 6.7
+ Prom 9464 - 9523 4.6
8 5 Tu 1 . + CDS 9558 - 10463 921 ## Closa_1684 Abortive infection protein
+ Prom 10860 - 10919 4.1
9 6 Tu 1 . + CDS 10958 - 11206 249 ## COG0495 Leucyl-tRNA synthetase
10 7 Tu 1 . + CDS 12144 - 14249 2233 ## COG0495 Leucyl-tRNA synthetase
+ Prom 14290 - 14349 8.0
11 8 Tu 1 . + CDS 14445 - 14816 478 ## Cphy_1011 hypothetical protein
+ Prom 15659 - 15718 80.4
12 9 Tu 1 . + CDS 15903 - 16754 810 ## COG3786 Uncharacterized protein conserved in bacteria
+ Term 16992 - 17030 -0.4
13 10 Op 1 . - CDS 16757 - 17038 296 ## Closa_1686 hypothetical protein
14 10 Op 2 . - CDS 17093 - 17638 530 ## Closa_1687 hypothetical protein
- Prom 17721 - 17780 6.0
+ Prom 17599 - 17658 5.3
15 11 Tu 1 . + CDS 17713 - 17862 183 ##
+ Term 17907 - 17946 7.1
- Term 17888 - 17939 14.8
16 12 Tu 1 . - CDS 17970 - 18221 94 ##
- Prom 18292 - 18351 8.2
+ Prom 18292 - 18351 7.1
17 13 Tu 1 . + CDS 18374 - 18931 613 ## COG2872 Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain
18 14 Op 1 .
+ CDS 19887 - 20381 586 ## Closa_1690 Alanyl-tRNA synthetase, class IIc-like protein 19 14 Op 2 . + CDS 20386 - 21546 1385 ## COG0628 Predicted permease 20 14 Op 3 . + CDS 21566 - 22645 1134 ## COG3191 L-aminopeptidase/D-esterase + Term 22854 - 22890 3.3 - Term 22556 - 22614 7.2 21 15 Op 1 . - CDS 22656 - 22910 297 ## Closa_1695 hypothetical protein 22 15 Op 2 . - CDS 22926 - 23219 299 ## Closa_1697 PHP domain protein Predicted protein(s) >gi|229784011|gb|GG667724.1| GENE 1 470 - 1483 689 337 aa, chain + ## HITS:1 COG:FN0161 KEGG:ns NR:ns ## COG: FN0161 COG3344 # Protein_GI_number: 19703506 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Fusobacterium nucleatum # 39 302 59 316 349 197 40.0 3e-50 MHEPDLWKYCTAEHCREMLLSFHLLEGEEWTDEKVISCLYAVSNHTEKHYSRLRIPKKGG GVRQIQAPDPLLKAIQRNILHHILDGLELSSCAAAYRRGASIWENAACHTGKMVVLKLDI KDFFGSITFPVVHRSAFNSRYFPTSVGTLLTSLCCLNDCLPQGSPASAAISNLVMKPFDE YMDGWCRERSITYSRYCDDMTFSGDFSAAEVIRKVKGFLGAMGFELNRSKTKILTRRTRQ SVTGIVVNDKVSVPASYRKEVRQDVYYCMKFGVQAHLERRGEEKYLKMGPAGEVRYLTSL LGKIGFILSAAPGDIWFTEAKEAVKSLLREAAEPVPE >gi|229784011|gb|GG667724.1| GENE 2 1681 - 2136 331 151 aa, chain - ## HITS:1 COG:BH0176 KEGG:ns NR:ns ## COG: BH0176 COG1943 # Protein_GI_number: 15612739 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 3 151 2 150 156 191 60.0 3e-49 MAKKANELAHTKWMCKYHIVFTPKYRRKVIYNQYKEDLRDIIKQLCNYKGVQILEGKLMP DHVHILVSIPPKISVSSFMGYLKGKSALMMFDRHANLKYKFGNRHFWSEGYYVSTVGLNE DTIKKYIQDQEKADIMLDKLSVKEYEDPFRG >gi|229784011|gb|GG667724.1| GENE 3 2354 - 3631 957 425 aa, chain - ## HITS:1 COG:BH1667 KEGG:ns NR:ns ## COG: BH1667 COG0128 # Protein_GI_number: 15614230 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Bacillus halodurans # 1 423 1 428 431 409 53.0 1e-114 MKFKKVTPLKGELQIPGDKSISHRSVMFGSIARGTTEISHFLQGADCLSTISCFQKMGVA 
IENNGDSVIVRGNGLRGLKKPDSILDCGNSGTTTRLISGILSAQNFDVTVTGDESIQKRP MKRIIEPLSMMGAQIESLRQNGCAPLHITGKPLHSVHYHSKIPSAQVKSAILLAGLYADG ETRVTEDVVSRNHSELMLRSFGADVRWEGTTAVIRPAEELYGRQITVPGDISSAAPFLAA GLLVPGSEVLIRQVGINPTRDGILRVCRAMGADITLMNEQDASGEPTADILVKSSSLHGT VIEGAIIPTLIDELPMIAVMACFAEGTTIIKDAAELKVKESNRLEVMVKNLSAMGADVTE TADGMIIHGGKPLHGAVIDSLLDHRIAMTFAVAGLCAEGETEILGAECVRISYPDFYEDL NRLMR >gi|229784011|gb|GG667724.1| GENE 4 3648 - 4748 1272 366 aa, chain - ## HITS:1 COG:SA1197 KEGG:ns NR:ns ## COG: SA1197 COG0287 # Protein_GI_number: 15926945 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Staphylococcus aureus N315 # 4 342 2 342 363 226 38.0 5e-59 MTDSTIAFIGLGLIGGSIARGIKRENPDTKIMAYMRTRSRLLSAMDEGVIDVILDGADEQ LSQCDLIFLCTPVEYNARYLAEIKPFLKPGAIITDVGSTKTDIHEEVIRLDMEDCFVGGH PMAGSEKTGFENSTDHLLENAYYIITPTRKSSPAAVERLISIASLIGSIPIVLDYHEHDS ITAAISHLPHVIASSLVNLVRDADNQEGLMKRLAAGGFKDITRIASSSPEMWEQICMTNR APLAAVLEQYIASLNETLVKLRSGDNQYLSRLFETSRDYRNSISERNQGSVTPEYSFTVD MADEVGAISTLSVILAAKGISIKNIGINHNREHGEGTLKIIFYDKEAMDMAWKQLERYHY DLISNN >gi|229784011|gb|GG667724.1| GENE 5 5735 - 6685 1077 316 aa, chain - ## HITS:1 COG:TM0343 KEGG:ns NR:ns ## COG: TM0343 COG2876 # Protein_GI_number: 15643111 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Thermotoga maritima # 1 314 1 314 338 350 54.0 2e-96 MIIIMKPNASEEAVGKIKNLIESKGLQAHLSKGTEVTIVGVVGDKTKLQGANIEMAEGVD KIVPVTETYKLVNKKFHPEDSVIRVGNTHIGPGRLTMMAGPCAIENHDMVIETAIAVKKA GAEFLRGGAYKPRTSPYAFQGLEEEGLQYMKEARDETGLNIVCEVTSLRALETAVKYVDM IQIGARNMQNFELLKEAGRAGIPVLLKRGLSATIDEWLNAAEYIASEGNPHIVLCERGIR TYETATRNTLDISAVPVIRAKSHLPIIIDPSHATGVRAYIEPISKAAVAVGADGLIVEVH PCPSCALSDGPQSLAS >gi|229784011|gb|GG667724.1| GENE 6 6914 - 7168 306 84 aa, chain + ## HITS:1 COG:no KEGG:Closa_2254 NR:ns ## KEGG: Closa_2254 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 73 1 73 84 95 65.0 6e-19 
MKLTRNNLNEVLFENHDARLLIEDVVSHTQAGIYYYHDVEITLQAALKIWNRANEAEQNE TPYHASFLEFERECRPQAAWCPAL >gi|229784011|gb|GG667724.1| GENE 7 7287 - 9389 2420 700 aa, chain + ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 691 3 686 690 634 47.0 0 MNVYGTKQIRNVVLLGHGGAGKTTLAEAMALITGAVSRMGKITDGNTISDYDKEEIKRQF SISTTLIPLEYQGEEGPIKINLLDTPGYFDFVGEVEEAISVADAAVIVVNCKAGIEVGTY KAWELCEKYKLPRIIFVTNMDDDHASYRELLLKLEKEFGRKIAPFHLPIRENEKFVGFVN VVKMAGRRFTVNSDYEECEIPEYSQKNLGIAREALMEAVAETSEEYMERYFSGDEFTQEE ISSALREYVIEGNIVPVMLGSGINAQGAKMVLQGIDKYFPTPDRYECIGVDVSTGERFTA KYNDNVSLSARVFKTIVDPFIGKYSLMKICTGTLKPDSNIYNVNKDTEEKIGKIYLLRGK DAIEVPELKAGDIGAVSKLSVTQTGDTIALRSAPIVYHKPEISTPYTYMRYKPKTKGDDD KVSSALSKLTEEDWTLKVVTDTENHQSLIYGIGDQQLDVVVSKLLNRYKVEIELSKPKFA FRETIRKKVEVQGRYKKQSGGHGQFGDVKMTFEPSGDLETPYVFEESVFGGAVPKNYFPA VEKGVAECVLRGPLAAYPVVGLKATLTDGSYHPVDSSEMAFKMAATMAFKKGFMEANPVL LEPIASLKVTVPDKFTGDVMGDLNRRRGRVLGMNSNHNGKQVIEADVPMSELFGYNTDLR SMTGGMGTYEYEFSRYEQAPGDVQKREVEERASKVDRTES >gi|229784011|gb|GG667724.1| GENE 8 9558 - 10463 921 301 aa, chain + ## HITS:1 COG:no KEGG:Closa_1684 NR:ns ## KEGG: Closa_1684 # Name: not_defined # Def: Abortive infection protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 301 1 301 303 245 49.0 2e-63 MDYLREIKKTCSRLGVGYAAFLVVGLLLQMEAGLILGVLGSMGWMGDWNTWIVILGSLPT YLLGTLICWLIVKDMGTPCRPRKQPFGAGRLVVAFFACISIMYIGNIIGTVLMSIVNGIQ GKPLVNPVSELIENLDTRVIFLLTVVAAPIFEEILYRKLLIDRVAHYGDKVAVILSGVLF GLSHGNFYQFFYAFGLGVVFAYVYINTGKLRYTIIFHAIIDFLGSIVALHAADNIWFAMA YSIFMLAAVVLGIVFLIIYRREISWKPAMVVIPKGRRFETLFLNLGMILFFFVSVGLFLL S >gi|229784011|gb|GG667724.1| GENE 9 10958 - 11206 249 82 aa, chain + ## HITS:1 COG:MPN384 KEGG:ns NR:ns ## COG: MPN384 COG0495 # Protein_GI_number: 13508123 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Mycoplasma 
pneumoniae # 5 82 2 81 793 105 58.0 3e-23 MSASYNHSAIEKKWRENWKKNPINVNDGKKPKYYCLDMFPYPSGSGLHVGHWRGYVISDV WSRYQILKGHYVIHPMGWDAFG >gi|229784011|gb|GG667724.1| GENE 10 12144 - 14249 2233 701 aa, chain + ## HITS:1 COG:BS_leuS KEGG:ns NR:ns ## COG: BS_leuS COG0495 # Protein_GI_number: 16080084 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Bacillus subtilis # 3 699 106 802 804 742 51.0 0 MANIKRQINEIAAVYDWDMEVNTTDPDFYKWTQWIFVQMFKKGLAYEKEFPINWCPSCKT GLANEEVVNGCCERCQTPVTKKNLRQWMLKITAYADRLLDDLDKLDWPEKVKKMQTEWIG KSYGAEVDFQVDGREEKIKVYTTRPDTLYGATFMVLAPEHELALGLAVPETKDAVEKYIC EASMKSNVDRLQDKEKTGVFTGSYAVNPLNGAKVPIWLSDYVLADYGTGAIMCVPAHDDR DFEFAKKFDLPIIQVIARDGVEIMDMTQAYTEASGIMINSGEWNGMESSVLKREAPAMIE ERGIGKKTVNYKLRDWVFSRQRYWGEPIPIIHCPHCGSVPVPEDQLPLRLPEVESYQPTG TGESPLAAIDEWVNTTCPVCGAPAKRETNTMPQWAGSSWYFLRYVDSHNEKELVSREKAD QYLPVDMYIGGVEHAVLHLLYSRFYTKFLCDIGVIDFDEPFTKLFNQGMITGKNGIKMSK SKGNVVSPDELVRDYGCDSLRMYELFVGPPELDAEWDDRGIEGISRFLNRFWNLVMDSKD RETEPTREMVRLRNKLIFDIEQRFNQFNLNTVISGFMEYNNKFIELAKKTGGIDRETLRT FVILLAPFAPHIGEELWQQLGGSDTVFHAQWPVCDEEAMKDDEISVAVQVNGKTKAVITL SAGTAREEAIEAGRALVTDKITGTIVKEIYVPGRIINIVCK >gi|229784011|gb|GG667724.1| GENE 11 14445 - 14816 478 123 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1011 NR:ns ## KEGG: Cphy_1011 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 9 122 7 123 126 121 48.0 1e-26 MEQSSVGYPLLLLDNETLAHVLGDGDFRFANLTFEEAKAILDMNAEEDIARCFEDKDLEN AIYEYLGIERRTFEYRPVRQMKVGQDAIAFKLYITPSGTQPIILGEEGSRLKRLRMYTYI VTS >gi|229784011|gb|GG667724.1| GENE 12 15903 - 16754 810 283 aa, chain + ## HITS:1 COG:lin2861 KEGG:ns NR:ns ## COG: lin2861 COG3786 # Protein_GI_number: 16801921 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 75 281 110 309 312 82 26.0 7e-16 MRRYQRYLLVMTGAVLFACSGTAGTTACARMNEDRAVRPQERTDGPEEKGPGIGLKDKNT EKAEAKQENAGEQKNAGEQELMGNIAAASETDSLILVVGQGGADISLSYHTKNEEGIWTE 
EFETMGRYGKNGATDDKKEGDKKTPLGTYRFTMAFGVKENPGSILPYHRLTDTDYWVDDP ESSHYNRLVDSRSTKKDWKSAEHMAAMVPFYHYGLALDYNSDCVPGNGSAIFLHCMNGTA DTGTSGCVKVPEEYMKKLVTSVDEDTRIVIVSDISRLEQENGR >gi|229784011|gb|GG667724.1| GENE 13 16757 - 17038 296 93 aa, chain - ## HITS:1 COG:no KEGG:Closa_1686 NR:ns ## KEGG: Closa_1686 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 92 1 92 95 101 64.0 8e-21 MNTNWKQDPRLKGMDPQKLELLTTFAKEVESAPKDQLMSTILTLNLKANEKGIHFSDQET ELLVSILSGGMSPAERKRMDTLRMISKNLAKRK >gi|229784011|gb|GG667724.1| GENE 14 17093 - 17638 530 181 aa, chain - ## HITS:1 COG:no KEGG:Closa_1687 NR:ns ## KEGG: Closa_1687 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 181 1 189 189 205 64.0 7e-52 MNNEKNSQPVDLDSLIGDNHLQMMKAALPYMSVTEQRFISLFVKFNELRRTVQLFEDEEV AAMGICSTGESTEKPASPIDMLGAIKPFGTPTEQDLIDLIMNFFQGFKLASAVSDAVPGS MVPDPEPTENTARQAQTPPPHRSGGPFGRMSFDQIKNFIPQEQQSRLETMQLMMSAMQQM S >gi|229784011|gb|GG667724.1| GENE 15 17713 - 17862 183 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRNRLIIDGNAVYEIDDECVKRKQGRNSGRGGNGENPLAAPGQDKNRGR >gi|229784011|gb|GG667724.1| GENE 16 17970 - 18221 94 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDLAATNCGCGCGCECNNGGSGCNFIWLILILCCCCGGGGGFGGGCGCGGGSNCGCGGS NGCDFIWIILLLCCCGGGGGFCC >gi|229784011|gb|GG667724.1| GENE 17 18374 - 18931 613 185 aa, chain + ## HITS:1 COG:PH1969 KEGG:ns NR:ns ## COG: PH1969 COG2872 # Protein_GI_number: 14591706 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain # Organism: Pyrococcus horikoshii # 1 155 2 162 404 119 40.0 3e-27 MEKLYYRQPYVKAFEAVVTDCREGKNGRYQVMLDRTGFYPEGGGQPGDTGRLGPVKVLDV HEKAGEIIHETDGPLEAGTAVAGTIDWERRFGYMQNHSGEHIFSGLVHKKYGFDNVGFHM GSEEVTVDFNGVITAEELEEIETEVNRKIAENLPIRELYPAAGELEHMEYRSKKALTGQV RILAS >gi|229784011|gb|GG667724.1| GENE 18 19887 - 20381 586 164 aa, chain + ## HITS:1 COG:no KEGG:Closa_1690 NR:ns ## KEGG: Closa_1690 # 
Name: not_defined # Def: Alanyl-tRNA synthetase, class IIc-like protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 158 223 380 384 184 57.0 1e-45 MLCGMDALKDYRKKQKTVTELSVMLSAKPERVVETVEKLKNENGIKDGRINRLYQELFAA RMEQFQCSQEPLAVFEEGLAPIQLRQYCTMLYEGGRGSVVLVCSGVDGSYQYAIGSSSAD MRALSKTLNKELNGRGGGSALMAQGTFQASKQAILETFTGLVKE >gi|229784011|gb|GG667724.1| GENE 19 20386 - 21546 1385 386 aa, chain + ## HITS:1 COG:CAC0730 KEGG:ns NR:ns ## COG: CAC0730 COG0628 # Protein_GI_number: 15894017 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Clostridium acetobutylicum # 5 371 4 378 383 209 37.0 5e-54 MELNDSNIKKLRWLIVFTVVVVVIGVNYRRVFEVFMMLFGFISPFLLGGVMAFILNVPMR RIEKMLPMKENSRLRRPIGLVFTLLFVVGLLLLVIFVVMPQLFETFMSLQNSIPVFLAGV KKGAEDLFAQNPEILDYINGIQIDWKSFSEDVIGFLSSGASTVLNSTITAAVSIVSGVTT FGIALVFAIYILLQKETLSAQMKKLMRAFLPDRVTEHVLSVAGLAYNTFSHFLTGQCLEA VILGTMFFLTLSVLRLPYALLIGVLIAFTALIPIFGAFIGCVIGAFLMLMVKPLDALLFV AVFFILQQIEGNLIYPHVVGNSVGLPSIWVLVAVTIGGSAMGIVGMLIFIPLCSVLYSLL RGTVYARLEKKNQEQKKNRGKVKAQS >gi|229784011|gb|GG667724.1| GENE 20 21566 - 22645 1134 359 aa, chain + ## HITS:1 COG:PH0078 KEGG:ns NR:ns ## COG: PH0078 COG3191 # Protein_GI_number: 14590032 # Func_class: E Amino acid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: L-aminopeptidase/D-esterase # Organism: Pyrococcus horikoshii # 8 337 7 344 361 278 48.0 2e-74 MKRIRAYGIIPGDMTPGPLNKISDVAGVTVGHAAVNTAEHKTGVTVILPCEDNPYRKKLT AAAYIHNGFGKSQGLVQIEELGTLETFIALTNTLNVGKVHDAVVDLMSEQCEREGIQVTT INPVVGECNDSGLNMIRERIIGLNEVRSAVRDAKRDFEEGAVGAGTGTVCFGLKGGIGSA SRQFTAGGSTYTVGALVQSNFGSTKNLVICGEPVGRALEKKIKPADLDRGSIMTVIATDL PLSDRQLKRVIRRAGVGLSRTGSFMGHGSGDIMIGFTTANRIPESTDQEVLTIRVLKESS LDTAFTAAAEAVEEAVLNSLAMAETTVGYQGEIRYSLTDLWLADRGTREQPAAVFSMEE >gi|229784011|gb|GG667724.1| GENE 21 22656 - 22910 297 84 aa, chain - ## HITS:1 COG:no KEGG:Closa_1695 NR:ns ## KEGG: Closa_1695 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: 
not_defined # 1 84 1 84 84 107 70.0 2e-22 MEELYRTIEETIRNSGCPFPADGEEIYNEICDEIENREPGAYVFFSKKTDDVFFEYRLQI FEDNFNLSSIDVHIKDQNYHVDFD >gi|229784011|gb|GG667724.1| GENE 22 22926 - 23219 299 97 aa, chain - ## HITS:1 COG:no KEGG:Closa_1697 NR:ns ## KEGG: Closa_1697 # Name: not_defined # Def: PHP domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 92 188 279 284 148 79.0 8e-35 LAHPYQYHLGDAKLEELIVSLKENGLAGLEVYHSSNNQYESGKLKNLAVKYGLFPTGGSD FHGANKPDIAIGCGRGSLRISALLLDDIKKARTKNLS Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:14:50 2011 Seq name: gi|229784010|gb|GG667725.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld118, whole genome shotgun sequence Length of sequence - 15304 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 6, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 876 500 ## CA_C3386 hypothetical protein 2 1 Op 2 . + CDS 921 - 1076 72 ## gi|288871163|ref|ZP_06410044.1| conserved hypothetical protein - Term 1046 - 1093 14.0 3 2 Op 1 40/0.000 - CDS 1105 - 2637 1862 ## COG0642 Signal transduction histidine kinase 4 2 Op 2 . - CDS 2634 - 3335 827 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 3372 - 3431 4.1 5 3 Op 1 . - CDS 3473 - 3829 328 ## ELI_2978 hypothetical protein 6 3 Op 2 11/0.000 - CDS 3829 - 4704 857 ## COG0477 Permeases of the major facilitator superfamily 7 3 Op 3 . - CDS 4771 - 5424 660 ## COG1309 Transcriptional regulator 8 3 Op 4 . - CDS 5482 - 6324 851 ## COG0647 Predicted sugar phosphatases of the HAD superfamily 9 3 Op 5 . 
- CDS 6371 - 7087 794 ## COG0584 Glycerophosphoryl diester phosphodiesterase 10 3 Op 6 2/0.000 - CDS 7090 - 8811 1622 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 8987 - 9046 1.6 11 3 Op 7 38/0.000 - CDS 9105 - 9935 953 ## COG0395 ABC-type sugar transport system, permease component 12 3 Op 8 35/0.000 - CDS 9948 - 10841 860 ## COG1175 ABC-type sugar transport systems, permease components - Term 10869 - 10918 11.2 13 3 Op 9 . - CDS 10934 - 12283 1695 ## COG1653 ABC-type sugar transport system, periplasmic component + Prom 12495 - 12554 13.4 14 4 Tu 1 . + CDS 12608 - 13579 971 ## COG4189 Predicted transcriptional regulator + Term 13771 - 13807 5.1 - Term 13582 - 13635 9.1 15 5 Tu 1 . - CDS 13657 - 14499 912 ## COG3717 5-keto 4-deoxyuronate isomerase - Prom 14593 - 14652 5.9 16 6 Tu 1 . - CDS 14659 - 15243 466 ## COG0655 Multimeric flavodoxin WrbA Predicted protein(s) >gi|229784010|gb|GG667725.1| GENE 1 1 - 876 500 291 aa, chain + ## HITS:1 COG:no KEGG:CA_C3386 NR:ns ## KEGG: CA_C3386 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 2 290 43 324 326 224 40.0 5e-57 EESALSIVPKLVSGEWNLLERSGEGNPGRPLFRSRVKAPAPLPLTKLQRSWLKAISADPR FRLFFTDEECRELDQDLKDIQPLFYPEDFSYFDRYSDGDGYTLDSYRQHFKTILEGIREN RILVIDYFSAKNHLTASRYLPCRLEYSAKDDKFRLMAVRVRNDNSLSRLTILNLSRIMRI QDTGLSLEIRPDIDAYLEHSLCQEPVLLEISNERNALERTMLHFACYQKRTEKLGDSGKY LCRIYYSRDVETELLIQILSFGPVVKVLGPDNFLKQVRERVEMQQRLLKMC >gi|229784010|gb|GG667725.1| GENE 2 921 - 1076 72 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871163|ref|ZP_06410044.1| ## NR: gi|288871163|ref|ZP_06410044.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 51 1 51 51 74 100.0 3e-12 MYSSYISPFAYSREGNKPNVLNSIASQQLSAEFSVPDFYPFYSYKRRQAIR >gi|229784010|gb|GG667725.1| GENE 3 1105 - 2637 1862 510 aa, chain - ## HITS:1 COG:CAC2434 KEGG:ns NR:ns ## COG: CAC2434 COG0642 # Protein_GI_number: 15895699 # Func_class: T Signal 
transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 505 3 489 492 246 30.0 6e-65 MRHSIKARFAMVFVGLMAFVLIATWCVNNWFLESFYTMDKVATLEKAYTGIDRIVTETEQ KGTDIFEYFKSTYDPNRENEGPIQQLFRVLGEKYNMNIVLMDSATDQVLISNTGERDYLS SRVQGYIFGKNMPKVSILKTHDNYFIQKTYDKRSDSYYLESWGYFSDNKTIFLMSTPLAS IRESVELANRFLAYVGIAALLVGSLFIYFATKKITSPILQLANLSEKMSNLDFDAKYTGQ AEDEIGVLGNSMNRLSDTLKETIGQLRTANTELQQDIEEKIKVDEMRKDFIANVSHELKT PIALIQGYAEGLTEGMAEDPESRDYYCEVIMDEANKMNKMVRQLLTLSALEAGNDAPVMD RFDLTDLIRGVINSVGILIQQKEADVHLEDCGPVYVCADEFKIEEVVTNYLNNALNHLAG EKKIVITVEDNGEEALVTVFNTGNNIPEEDIPNLWTKFFKVDKSHTREYGGSGIGLSIVK AIMDSHRKACGVRNVEGGVEFWFTLDCYKG >gi|229784010|gb|GG667725.1| GENE 4 2634 - 3335 827 233 aa, chain - ## HITS:1 COG:CAC2435 KEGG:ns NR:ns ## COG: CAC2435 COG0745 # Protein_GI_number: 15895700 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 14 233 5 224 224 228 51.0 7e-60 MISVQEVIMEQLKILVVDDEARMRKLVKDFLSVKGFSVLEASNGEEAVDLFFEQKDIALI ILDVMMPKMDGWETCRTIRKYSQVPIIMLTARSEERDELQGFNLGVDEYISKPFSPKILV ARVEAILRRSNAVSSDLIEIGGIRIDKAAHQVTIDGQNIDLSYKEFELLTYFVENQGIAL SREKILNNVWNYDYFGDARTIDTHVKKLRSKMGEKGDYIRTIWGMGYKFEVSE >gi|229784010|gb|GG667725.1| GENE 5 3473 - 3829 328 118 aa, chain - ## HITS:1 COG:no KEGG:ELI_2978 NR:ns ## KEGG: ELI_2978 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 2 117 296 411 418 82 37.0 4e-15 MGTPLLLIPVILILLTGAPDTAVYAVLLFSVMTGMACAAMFTIAAQTFIQKSTPVHLLGK AGAFVSTICVCALPVGQALYGGLFELFSDRPWVVVLAGSVISMALADAARRTLRNAEE >gi|229784010|gb|GG667725.1| GENE 6 3829 - 4704 857 291 aa, chain - ## HITS:1 COG:PAB0724 KEGG:ns NR:ns ## COG: PAB0724 COG0477 # Protein_GI_number: 14521293 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function 
prediction only # Function: Permeases of the major facilitator superfamily # Organism: Pyrococcus abyssi # 8 277 3 267 418 119 31.0 8e-27 MNSRLFQRDFSIMILGQIISLFGNSILRFSLSLYVLDVTGSAAVFGTILAVSMIPTVILS PFGGVLADRLPKQKIMTILDFVTAGMIVSYDVIFGSTGSLAAVAVFMILLTLIQAFYQPA VQASIPLLAAREHLMAANGIVMQVQALAGLLGPILAGMLYGIGGVKPILAASAVCFFLSA VMELFLRIPHDKRAPEASPVRTALLDLKEALLYLIKENTCLFKLLIIVAGINLFLSALFI IGLPYLVKIYLGMSAQAYGFAEAALGLGSILGGLLSGLAGKKIKFKQSHFF >gi|229784010|gb|GG667725.1| GENE 7 4771 - 5424 660 217 aa, chain - ## HITS:1 COG:MA0364 KEGG:ns NR:ns ## COG: MA0364 COG1309 # Protein_GI_number: 20089261 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 1 183 1 177 192 72 31.0 7e-13 MARNKYPEETVQKILDVSRTLFREKGYDHTTIQDIVNALGMSKGAVYHHFKSKEEIMDRL TDVYYDEAEWFMDIRLDPSLNGLEKLKEILRFLFTDKKKFELDKLMPYSNKDNPRLRTLI LDSTIRDSAPFIAELIEEGIRDGSIHVTRPKELSETMMILMNVWIGMFAENREDFLIKID YFQEFCEKMGLPVIDERIKEAIIGYYDEVMKEQPGRA >gi|229784010|gb|GG667725.1| GENE 8 5482 - 6324 851 280 aa, chain - ## HITS:1 COG:TM1742 KEGG:ns NR:ns ## COG: TM1742 COG0647 # Protein_GI_number: 15644488 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Thermotoga maritima # 8 265 1 259 259 247 47.0 2e-65 MKEQRTELLEKAELFVLDMDGTFYLGDRILDGALDFLKTVEESGRKYLFFTNNSSRSPKV YLEKLKRMNCEIGREQIMTSGDVMIRYLKTCYAGRSVYLVGTPALEESFREAKICLTQEM PDVVVIGFDLTLTYEKLERACTYIRSGAEFLATHLDINCPTEDGFIPDCGAFCAAISLST GKQPRYVGKPFPETVDMILKKTGVDRERIAFVGDRLYTDVAAGVNNGAMGMLVLTGETKR EDLEGAEVVPDGVYLSLKEMGELLRGAGDSERTEMTGVDI >gi|229784010|gb|GG667725.1| GENE 9 6371 - 7087 794 238 aa, chain - ## HITS:1 COG:BS_yqiK KEGG:ns NR:ns ## COG: BS_yqiK COG0584 # Protein_GI_number: 16079474 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Bacillus subtilis # 2 230 3 233 239 185 40.0 6e-47 MKVTAHRGYSGRYPENTMLAFKKAVEAGCDEIELDVQLTKDDVVVVLHDERIDRTTDGTG 
YVRDYTYEELKQFNAAKCWNGRYETMAIPTFEEYCLWVKDEPVTTNIEFKTSIYYYEGLE EKTIALLRKYHLEKKVMFSSFNHVSLLKAKELEPDIECGILVDDGLGNAGYFCSRYGFEC YHPNGAKLTDDVVESCTRHGIKINAWTINDMALLEKLADLNCSGVITNFPGICRAWRG >gi|229784010|gb|GG667725.1| GENE 10 7090 - 8811 1622 573 aa, chain - ## HITS:1 COG:ECs3958 KEGG:ns NR:ns ## COG: ECs3958 COG3250 # Protein_GI_number: 15833212 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli O157:H7 # 33 426 111 484 1042 95 25.0 2e-19 MREEYPRPDFVRSEWVNLNGAWDFYLGGEKRTIEVPYVCQCRKSGIGERIREDSVVYERT FQVPGTWKGRNVLLNFGAVDYFCRVWVNGQCVGSHIGGQTPFSFDVTRCLNWGDECIRVE VEDWLKDEQIARGKQYWEDESSFIWYTPSTGIWQTVWLEPVSESRFSWIRFTPDIDEGTV KIDYRLEQGTALPCETEITICRENETVFKGTVWCRELCNTITVDVFQKKALNGAFHFVGA YWSPENPNLYEAAMALRCGAEVTDRVESYFGMRKVRIEDGKIYLNNQPYYQKLLLDQGYW AESLITAPSDEDYKNDILMSKAMGFNGCRKHEKVEDPRFLYWADHLGFLVWEAMASFWVY TPEGACAFMKEWMDVIRRDYNHPSIIVWGMLNESWGVPGIYENQEQQSFSKSLYYMAHSL DSTRLVISNDGWEMTDTDICAVHSYQHGREDDEKQQELYKKCLKDIRDLPGIVEKRLFAK GHSYQKQPVVLTEFGGIAVSEDESGWGYTAIGRSEFLKEYSRIIDAVYDSDLLCGYCYTQ LTDVEQETNGLLTDTHQYKFDPEKIKEINDRRG >gi|229784010|gb|GG667725.1| GENE 11 9105 - 9935 953 276 aa, chain - ## HITS:1 COG:lin0761 KEGG:ns NR:ns ## COG: lin0761 COG0395 # Protein_GI_number: 16799835 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 3 276 7 291 291 151 31.0 1e-36 MKKKYFNRILLVMCACVIAYAFLLPLLFMLFTSFKGLAEAMTSSTLLPHTWTLQNYRDLF ASTSTAPIFKWLMNTAVVTVSGTILRITTSVLAAYALARLPVPGKRFIVVGLVWAMAIPE IVTFFPLFYIFKQINMLNSLLPLILPSGSGVMCIYLIYNFLLSFPRELEEAGLVEGANVF QVLWHIVLPSVKPVVITQAFITFLGLYNSYLWPSLVVNKQENRTVTLGIAALVLGENYTN PGLMMASTLVSVVPVLIIFMFANKYVVKGFTQSGIK >gi|229784010|gb|GG667725.1| GENE 12 9948 - 10841 860 297 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease 
components # Organism: Bacillus subtilis # 10 286 11 282 292 131 28.0 1e-30 MKKRKGTAVLFIIPFLLFFLLFWLIPFVYGIVMSVSKYSLIQGNRGFVGLANYIKILFSS SMYHKSFVQGLGNTVIFVVLTTPFLVFGSLFLALLLDRLPERLKGIFRTIYFASYSVSVT AVAAVFVWLMKGNGGYFNNMLISMGIIKTPIPWMEQQPFVWVALTVATVWWTIGYDMMLF VNALNEIDTSLYEASAMDGAGFWVQFRYIILPNIKNVFFFVLMTTIIASFNVYGQPRLMT KGGPGETTKPLIMMITSTIMDRNDLGVGSAMTILMGVVIILCSVGQYYLTKEKDNLK >gi|229784010|gb|GG667725.1| GENE 13 10934 - 12283 1695 449 aa, chain - ## HITS:1 COG:BH1244 KEGG:ns NR:ns ## COG: BH1244 COG1653 # Protein_GI_number: 15613807 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 1 202 1 200 440 61 27.0 3e-09 MRKMKKMVSLTMAAVMTLSLAGCGGGGSTPETKAAEPAESKAETTAEAKAEETTAAQPAG EKTTITFINGFTGGDGPFMTKIVDAFNESQDQYFIEQLQDADHYTKFKSDDFDMLIIHAD WISTYHALGLLREVSDIYEKAGLSFENDWHPITQTYAKYDDGIFAFPLDLYAETFYYNKE YVKEVPKTYDDMKALRDQLDSENSGIYPMSIPLTGDHQWAWMTALGQSGCNWVEGEHIKM DTDEIADAFMKIHDLIYKDHISAAGLGDNDHFNTFIKESKDNTNVSAVTSLTGPWNYTAA REVLGDNLGIGTLPQLYGDTKVVPAGGHTFGVSAKVTDEKKIDGIAAFMKFAYQPEIMLN WADSGQAPIHLKTMELVKQSPDKYPVANVNYQIFDYAMILPAIYNIREQVKYVNTTVWSM VIQTPDLKKEDLMEELKNATEIATELSEF >gi|229784010|gb|GG667725.1| GENE 14 12608 - 13579 971 323 aa, chain + ## HITS:1 COG:AGl3407 KEGG:ns NR:ns ## COG: AGl3407 COG4189 # Protein_GI_number: 15891821 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 35 321 18 305 309 225 39.0 1e-58 MSRKQKAPVKIIGERDIQLHLSNPAHHDQIVRIGKALSSPIRLNILNLLKNTPLSLQEIA DILSIPVSSTALHIKILEEAKLIVTETQPGLHGSMRVCICSMQTFHLETFDADADLADNT ITLDMPVGNYYQCEVEPTCGLADENGVMDAYDSARTFYSPLRGNAQLIWFQQGFVEYRFP NLTNPLLELHEISFCMELCSEAPGFLEDWPSDITVSINGHEVATYCSPGDYGARRGRLTP PAWPNGRTQYGLLKTFSVRENGSYLDGSLIDPRLTIKDLKLQDHPYISLLIQIKKDARHI GGINLFGEKYGDFPQGIVMNLIY >gi|229784010|gb|GG667725.1| GENE 15 13657 - 14499 912 280 aa, chain - ## HITS:1 COG:kduI KEGG:ns NR:ns ## COG: kduI COG3717 # Protein_GI_number: 16130747 # 
Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Escherichia coli K12 # 1 280 1 278 278 333 56.0 2e-91 MDIRYSANQKDFKRYTTEETRNEFLIENLYVANEVVAVYSHVDRMVTLGCMPTTETVSID KGIDIWKNFGTGYFLERREIGIFNIGGAGKITADGEVFDMNYKDCLYITKGTKEVLFSSN DAANPAKFYMVSAPAHKACETTFIPIEKAAKKPLGAMETSNKRVINQFIHPDVLETCQLS MGMTVLDPGSVWNTMPAHTHERRMEVYMYFEIPENNVVFHMMGEGNETRHIVMQNEQAVI SPSWSIHAGAGTSNYTFIWAMGGENQAFDDMDVIPTTELR >gi|229784010|gb|GG667725.1| GENE 16 14659 - 15243 466 194 aa, chain - ## HITS:1 COG:MA1189 KEGG:ns NR:ns ## COG: MA1189 COG0655 # Protein_GI_number: 20090055 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 189 1 199 209 72 28.0 7e-13 MAKFTIVFHSVCGNTYQIADCFYRVLQERGQEVALYRTADASFESGERRSDEFKHLQAAI DAVPVIEEAAQVLGSDVLLLGSPTYYGCVSAQMKTFMDSFEKMWADAPTAGMYFGCFTSA GDGCGGPELCLQVMNTFAQHMGMVPISVPCNIGGTAQPAYGIKHVSGEKADRAVTEETQT AVRNYADHILRVIS Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:15:09 2011 Seq name: gi|229784009|gb|GG667726.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld119, whole genome shotgun sequence Length of sequence - 26470 bp Number of predicted genes - 23, with homology - 22 Number of transcription units - 12, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 373 311 ## COG1307 Uncharacterized protein conserved in bacteria - Prom 484 - 543 6.0 + Prom 443 - 502 4.5 2 2 Tu 1 . + CDS 561 - 1814 715 ## gi|288871168|ref|ZP_06410047.1| conserved hypothetical protein + Term 1874 - 1918 5.2 3 3 Tu 1 . - CDS 1848 - 2645 402 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 2815 - 2874 5.8 4 4 Tu 1 . + CDS 2833 - 4341 1704 ## COG3534 Alpha-L-arabinofuranosidase + Term 4353 - 4394 4.3 + Prom 4448 - 4507 7.3 5 5 Tu 1 . + CDS 4583 - 5614 835 ## COG1609 Transcriptional regulators + Prom 5616 - 5675 6.7 6 6 Op 1 . 
+ CDS 5729 - 7243 1216 ## COG3119 Arylsulfatase A and related enzymes
7 6 Op 2 . + CDS 7240 - 7506 259 ## COG3119 Arylsulfatase A and related enzymes
8 7 Tu 1 . + CDS 8444 - 8980 323 ## Mahau_1824 sulfatase
9 8 Op 1 . + CDS 9108 - 9635 435 ## Mahau_1824 sulfatase
10 8 Op 2 3/0.000 + CDS 9724 - 10158 318 ## PROTEIN SUPPORTED gi|90020673|ref|YP_526500.1| ribosomal protein L9
11 8 Op 3 . + CDS 11088 - 11975 939 ## COG2211 Na+/melibiose symporter and related transporters
12 8 Op 4 . + CDS 11981 - 12100 254 ## gi|266626053|ref|ZP_06118988.1| beta-glucosidase
13 9 Tu 1 . + CDS 13076 - 14614 1189 ## COG1472 Beta-glucosidase-related glycosidases
+ Prom 15516 - 15575 8.3
14 10 Tu 1 . + CDS 15761 - 16783 958 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase)
15 11 Op 1 . + CDS 17733 - 18833 1039 ## gi|266623676|ref|ZP_06116611.1| hypothetical protein CLOSTHATH_04984
16 11 Op 2 40/0.000 + CDS 18830 - 19501 603 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
17 11 Op 3 10/0.000 + CDS 19498 - 20538 1031 ## COG0642 Signal transduction histidine kinase
18 11 Op 4 . + CDS 21476 - 21841 252 ## COG0642 Signal transduction histidine kinase
19 11 Op 5 . + CDS 21838 - 22668 862 ## gi|266623680|ref|ZP_06116615.1| conserved hypothetical protein
20 11 Op 6 . + CDS 22744 - 22854 91 ##
21 11 Op 7 . + CDS 22870 - 24360 1229 ## COG0860 N-acetylmuramoyl-L-alanine amidase
+ Prom 24380 - 24439 6.7
22 12 Op 1 . + CDS 24491 - 25870 1452 ## COG0534 Na+-driven multidrug efflux pump
23 12 Op 2 .
+ CDS 25845 - 26469 453 ## COG0042 tRNA-dihydrouridine synthase Predicted protein(s) >gi|229784009|gb|GG667726.1| GENE 1 1 - 373 311 124 aa, chain - ## HITS:1 COG:CAC1624 KEGG:ns NR:ns ## COG: CAC1624 COG1307 # Protein_GI_number: 15894902 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 2 123 3 124 280 116 45.0 1e-26 MKTAILIDSGCDVSDELIQKYHMKVMPLHIIYPEKDYIDGVDIFPETIYQRFPREIPTTS TPSPQDVKNMAEEIKSEGYTHVLAFTISSGLSGTYNTVNHVLAEEHELTSFVLDTKSVSF GAGI >gi|229784009|gb|GG667726.1| GENE 2 561 - 1814 715 417 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871168|ref|ZP_06410047.1| ## NR: gi|288871168|ref|ZP_06410047.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 417 1 417 417 660 100.0 0 MEEYERIDGMICPKCGHETSGNYCSSCGAPLKNGYMAESAEEYEMRKRLAEELAELEDEA DRQSGDASSGFISRLKTQTRRTPADTEAADSDSCRPQSRMAERPKPKEKAAKPKKSREGT ERRKAEAKAQAKAEIKAETKEQKKREARMKKLEAEVERLRSSGGSNSLERGSQKSGSQKS GSQKSGSQKNGNGGQKHGIQENIRPSSRGMAAGRMPNVGEWEWDSDGGPGIPDVESRITE KADPDGAGLGEMMVKSVVTAAVMMARVMQLVSFFLMAGMVFIMAKSFWEHGQALGDIRLM AAEANYGMALYVGFAGITLFMGLIWCLWIPSKKGAGGGVRMKKYDTGRGFLPFLLCMAAV AAASVILPRIPAGEEAWKGLAAGAAAALEAVNAHRGTLFLSSTAGAVLSLVRKMLRV >gi|229784009|gb|GG667726.1| GENE 3 1848 - 2645 402 265 aa, chain - ## HITS:1 COG:STM0563 KEGG:ns NR:ns ## COG: STM0563 COG2207 # Protein_GI_number: 16763940 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 164 258 184 278 284 71 33.0 2e-12 MLNFEIHFCEYNRQNPPKDRIYRPGGSGDYLFLLLKTPMKFYLNQELNLTRENACILYTP QTMQHYEAVQKFRNSYIHFSCSAPLAEHFRIPLNTIFYPGNPDEIDEYIRKIQEERIASR PFAEEKEYFLLSEMLIAIARAMAGANTDMTENRELYEIFLSLRLDMLRSYEKPWTTEDLC RRANLEKSQFFACYQRFFQSTPHADLLSVRLEKAKNLLTNEALPINLAARQCGFSDITHF SRYFKKECGCSPKEYARRWKEHCSS >gi|229784009|gb|GG667726.1| GENE 4 2833 - 4341 1704 502 aa, chain + ## HITS:1 COG:BH1861 KEGG:ns NR:ns ## COG: BH1861 COG3534 
# Protein_GI_number: 15614424 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Bacillus halodurans # 4 499 5 499 500 648 61.0 0 MKKAEIIIDKYFLTGSVDKRIFGSFIEHLGRAVYGGIYQEGLACSDEQGFRKDTLELVKE LQVPIVRYPGGNFVSGFHWEDSVGPKAQRPARTDLAWRVIETNQFGLNEFADWSGKAGSE MMMAVNLGTRGPEDAKNLLEYCNFEGGTYYSDLRKSHGYAKPHDIKLWCLGNEMDGTWQM GHKTAGEYGRIASETSRMMKWLDPEIETVACGSSSLTMPTFGEWEYQVLNECYDEVDYLS LHQYYGNRSGDTADFLASSKGMDDFISGVVSICDAVKAKKHAGKNINLSFDEWNVWYHSN EQDKEIQAWTQAPHQLEDVYNFEDALLVGSMLITMLRHADRVKIACLAQLVNVIAPIMTS DTGAWRQTIFYPYMHASVYGKGTVLTTQIKAPVYDSRTYGTVSTLDCVCIWDEEAEAVTM FAVNKDLEDDAEVSCDFRQFEGYQVKEHIVLTHDDMKAVNTEEAPDTVSPVRSEDSRMEN GVFTTVFGRHSWNVVRLVKNQL >gi|229784009|gb|GG667726.1| GENE 5 4583 - 5614 835 343 aa, chain + ## HITS:1 COG:L0143 KEGG:ns NR:ns ## COG: L0143 COG1609 # Protein_GI_number: 15673630 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 2 340 5 332 332 93 26.0 6e-19 MTASIYDVAKRAGVSISTVSRILNNSANVSEKKVRAVKEAMEYYQYEPNQFGRGLVKQRS NIIGVYFPFHSGSVFESSYNLELLKGIERVLTRHNYSMVLISESEEYSNRVKGTPRFLEY VRQRRIDGLLLSGLNSKSMEEEAFQQILEEEYPVVYIGKRFHKKGINVYAQYESYRHQMI KTLYAYGHRKIILYFYSSHQNYLPAIVNRAKDEMEDLELAAIILPHMTDIRQELTADMRK YVTGGGYTAICSPGIETTQIILSVCGELRLSVPETVSILSVEHKNGEGRQMFPQISAFYV PAKDMGSGAAELLLDMIEDRSPDEMSREYETRFFDRESIRQLS >gi|229784009|gb|GG667726.1| GENE 6 5729 - 7243 1216 504 aa, chain + ## HITS:1 COG:STM0886 KEGG:ns NR:ns ## COG: STM0886 COG3119 # Protein_GI_number: 16764247 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 1 499 1 493 495 529 52.0 1e-150 MKAIMVMYDSLRRDLLESYGCDWTKTPNFKRLANLSVQFENCYVGSMPCMPARRELHTGR INFFHRSWGPMEPFDDSMPELLKNGGVYTHLVSDHQHYWEDGGCTYHNRYSSWEISRGQE GDLWKADLAYQYDRSTVFKNKEVMMSYPPYRRMMAQDAVNRSYMEAEEKTSQAVTFQQGL EFIEKNHEADHWFLQIETFDPHEPFYSLKEDKALYPHTFLGDAAAEADWPPYAPTSEDEN TIQHVRYEYAALLSKCDRYLGKVLDMMDQYHLWDDTMLIVNTDHGFLLGEHGWWGKTSMP 
IYEEIAHTPLYIYDPRRRECAGARRRAIVQTIDLAPTLLDFFGMEIPKDMEGKPLTGVMD SDEAIRKYAVFGYHGSQVDVTDGRYVYMHAAEHPEIPAYEYTLMPTHMRCMFQPEELAGM ELAEPFSFTKGCKVMKIQSKDHMGDTTSFGSLLFDLKEDPGQEHSLEDETVREQMKAYIR EYMEKNDAPEELFRRLGFDEKRPV >gi|229784009|gb|GG667726.1| GENE 7 7240 - 7506 259 88 aa, chain + ## HITS:1 COG:PM1682 KEGG:ns NR:ns ## COG: PM1682 COG3119 # Protein_GI_number: 15603547 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 4 70 3 69 453 70 46.0 8e-13 MKKRPNIVIINPDQMRADSMSHLGNPAAVTPNLDELAKDGVSFAHAFCQNPVCTPSRCSF MSGWYPHVAGHRTMNHMMHEHEPVLLKR >gi|229784009|gb|GG667726.1| GENE 8 8444 - 8980 323 178 aa, chain + ## HITS:1 COG:no KEGG:Mahau_1824 NR:ns ## KEGG: Mahau_1824 # Name: not_defined # Def: sulfatase # Organism: M.australiensis # Pathway: not_defined # 4 175 105 279 492 144 42.0 1e-33 MNQRNDLLPAQDPEYYKSYCDTYFTIGKDRRPDELPKDSFRGERGGKNFYSFYRGKLPEH RDLDQIWTEGAAEFIRGYEEERPFCIFLPLMLPHPTYQVSEPYFSRIDRRKLPERILPPE GYSGKSRMQRELAALQNMRDWSEAEYDELRAVYLGMCSRVDDLTGQVVRALKEKGNLR >gi|229784009|gb|GG667726.1| GENE 9 9108 - 9635 435 175 aa, chain + ## HITS:1 COG:no KEGG:Mahau_1824 NR:ns ## KEGG: Mahau_1824 # Name: not_defined # Def: sulfatase # Organism: M.australiensis # Pathway: not_defined # 7 166 332 492 492 153 50.0 3e-36 MQKRHGISQEMTELVDFYATAEAVAELPAEHTQFGRSLIPYMEQREEALRQAVFCEGGFR KGEDHCKETGGQAVPDPDGQYYPRQFLQASDEMYNGKATMCRTRDYKYVRRLYEEDEFYD LILDPKETENRIRCPEYASVIMEMKEKLLEFYQDTCDVVPFQKDARMEKEFTRWL >gi|229784009|gb|GG667726.1| GENE 10 9724 - 10158 318 145 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020673|ref|YP_526500.1| ribosomal protein L9 [Saccharophagus degradans 2-40] # 7 140 8 141 522 127 46 1e-28 METKYAKAKFLDVFKYSFGGLGSNLAFFLVMSYLTFFYTDIFGISSMVVSGLMLISRFID AFTDPVMGMLGDHTRSKMGKYRPWIIWGAPVLGFLIFLLFTAPDLSPNMKVVYVYVVYIA YSLASTVVNIPYHSLTPVMSQDPYQ >gi|229784009|gb|GG667726.1| GENE 11 11088 - 11975 939 295 aa, chain + ## HITS:1 COG:BS_ynaJ KEGG:ns NR:ns ## COG: BS_ynaJ COG2211 # Protein_GI_number: 16078820 
# Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 2 294 162 452 463 129 28.0 5e-30 MGTVSQFIITIMALPLVEAFGGGKQGWAVYGALIGILTTAAFWICAWGGKKYDVCDMDKE RVPMNMAKDLKLLIKNKPMLMLMIAFGTDVLANATYSAVNMYYFKYVLNRVDLVPKVASA ILFAGIILLPLLPALSRLFGKKRLYWFGSLFSIIPLVVMWLKPTAPVTILMSMMAVFGFI SRIPSNLGWAMLPECSDYGEWKFGQRADGLMSSSLTFINKFGMAIGGFIASFLLGLVGFV ANQEQTPQVLNMIVFLRFGMPVLGYIASLISMAFYEITNEKYAEIRTELDERGKG >gi|229784009|gb|GG667726.1| GENE 12 11981 - 12100 254 39 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626053|ref|ZP_06118988.1| ## NR: gi|266626053|ref|ZP_06118988.1| beta-glucosidase [Clostridium hathewayi DSM 13479] beta-glucosidase [Clostridium hathewayi DSM 13479] # 1 39 1 39 191 79 94.0 1e-13 MTDKIKDLVTAMTLEEKALLCSGKNFWQMEGIERLGLAS >gi|229784009|gb|GG667726.1| GENE 13 13076 - 14614 1189 512 aa, chain + ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 1 432 261 713 793 238 36.0 3e-62 MTDWGAMNDRVKALKAGLELEMPGPDPYNDKKIVDAVRNGELDEAVLDRAAERLLTVIMR AGEVHKKEYDAAAHHILARRIAAESAVLLKNDGMLPLKKENSYAVIGAFAETPRYQGAGS SRIHPHQIDCVLDSLRESGIAFEYAPGYELEQDTINPVLLEEAAACAKGRDGILVFAGLP DSYESEGFDRTHLNLPKSHTALIEAVTAVNPHVTVVLLCGSAILMPWRDRVESILLTYLG GEAAGSACADVLTGTVNPSGRLAETFPLSLEDTPCCENFAGEGKDVEYRESILVGYRYYD WAEKPVAFPFGYGLSYTSFSYDSMEIVWDEKEEKGEARITVTNTGETSGSEVVQLYIGKA QSGVMRAARELKGFQKVFLEPGELREVVIGLDRRSFSCYDTNRSCWTVEAGTYQIYAASS SRDLKLKKDLMLPGKELSHIPGYDAAETVKDGHFAADRKQFKRLFPDELPLTPDDGRITL NTTVKEIMASEKGKALLGGLVEGYSGRYSGAS >gi|229784009|gb|GG667726.1| GENE 14 15761 - 16783 958 340 aa, chain + ## HITS:1 COG:STM1287 KEGG:ns NR:ns ## COG: STM1287 COG0641 # Protein_GI_number: 16764638 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Salmonella typhimurium LT2 # 5 334 6 342 398 232 
34.0 9e-61 MPPISVLIKPSSGNCNLRCEYCFYYDTMSKREQGAYGFMSVQTLEDVIRQVLAYSEGSCT IAYQGGEPTLSGLPFFEKSIEFQEKYNVNHVKIFNAIQTNGSLLDKQWAEFFVRNHFLVG VSLDGGPKQHDYYRKTPAGKGSFTRIMENIEMLKKTGVDFNILSVVHAKTAPHIKKTYEF YRKNQLDYLQFIACLDPLMEEPGGRDYSLTPEAYGTFLIDLFELWFRDLKEGKQPYIRQF ENYIAILLGQMPESCDMRGICGMQYVVEADGSVYPCDFYVLDEYRLGNFRTDTVEMIDRK REEIQFIESSLQKEEACAECRYASLCRGGCRRTRQEELAS >gi|229784009|gb|GG667726.1| GENE 15 17733 - 18833 1039 366 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623676|ref|ZP_06116611.1| ## NR: gi|266623676|ref|ZP_06116611.1| hypothetical protein CLOSTHATH_04984 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04984 [Clostridium hathewayi DSM 13479] # 1 366 4 369 369 728 99.0 0 MKYRSFYARAAAGLLCLSVLFSPWYGRFTVHAAEEQDILSACHAYQKRLSSVKKEADIAA YGFDIIENQVFSMTVKAFGDVSMIPALDRTYHRLVLFFTDEDGNTVYSTDQLETNNQVRG ELRQLNQGISAVSFQDLNGDDKMDILLITSCEKNDSAAGKAYKVGDVLFQNEHGFYRDWR LSDKINRFGMNKSIRFIESFLVDGYSTEFLYTASTLDELKEHGFAVAEELSSWRTFEKLG SLLVVPGTYRMAEYTVFMVYLVNEQGYIVWSFQPMGDYENLYALKGVACRDIDGDGMKDL AVLARYSYEGKDGEMILENDYSIYYQKTGGFYPDTELRKQYQSKDDSTMEELVETARAYW GWRQES >gi|229784009|gb|GG667726.1| GENE 16 18830 - 19501 603 223 aa, chain + ## HITS:1 COG:CAC1506 KEGG:ns NR:ns ## COG: CAC1506 COG0745 # Protein_GI_number: 15894784 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 220 1 216 217 205 48.0 5e-53 MIKILIVDDEKPICDLIDLNLSEAGYHCRAVQDGAEAVEIIEHERFDLILLDIMLPGIDG FDIMEYIRPMEIPVIFITAKYEVRDRVKGLKLGADDYLVKPFDVVELVARVEAVLRRYNK TERRLSAGDVVVDVEARTVKKAGRPVELTNKEFLLLVLFLRNRNVALFRETLYEKVWEGE YSGDSRTLDLHVQRLRKKLNWENGEYPRLVAVYKVGYRLEVGR >gi|229784009|gb|GG667726.1| GENE 17 19498 - 20538 1031 346 aa, chain + ## HITS:1 COG:CAC1507 KEGG:ns NR:ns ## COG: CAC1507 COG0642 # Protein_GI_number: 15894785 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium 
acetobutylicum # 1 329 1 336 473 127 26.0 2e-29 MKFSLKLLLWTMIVMAAAFGFSGFYFVNYVFERALDREAGQALDESSILKFAFETAALNV PAKYDVLLEDTTRQIASNLEAGGQGTGRMLRISDDQKQVLYAAEGFAADDGLLLALHENT KTYRVIRLDDRYYIQTAAAVEALDKKLFLETMRDVTEVFRDRATGFVVYRRVTVVILLLG TVLMHLISSWLTRPIRLLTRATKRMAAGDYDYRARLVSGDELGQLTEDFNRMANALEDTI IKLEEEIRAREDFVGAFAHELKTPLTAIIGYADMLRSHRLDEEKSFMCANYIYTEGRRLE TMSLRLLDIIVTKRQEIDFQMVSAESLFFYLYDMYAAKKNMDFLAS >gi|229784009|gb|GG667726.1| GENE 18 21476 - 21841 252 121 aa, chain + ## HITS:1 COG:CAC1507 KEGG:ns NR:ns ## COG: CAC1507 COG0642 # Protein_GI_number: 15894785 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 101 372 473 473 101 47.0 4e-22 MLINLLDNACKASEPGSPVDVDGYALSEGYRFSVKDHGVGIPEEELHKITKAFYMVDKSR SRSRNGAGLGLALCEEILLLHHSRLEIESEPGKGSCFSFVIPWEDGGSAARHSKEEGGGE M >gi|229784009|gb|GG667726.1| GENE 19 21838 - 22668 862 276 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623680|ref|ZP_06116615.1| ## NR: gi|266623680|ref|ZP_06116615.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 276 1 276 276 516 100.0 1e-145 MKKTTLNFLLLLFAVTVMVLAIWGPEAISGYRDKSVLNEIHVKEAEGEGEGYRYSLSRSD KLYILAEALNSQTLPESGQYTAERQEAPGETYGGTAGTYAFIVNHRGPSDREIKEEELFE AVNEGLDTLKKLGILPESVRDANAAAYDAVLYSAIDVLEPRNNVAVWKVSLSNIQKNNDR GNRLIDAYIDADDGKLYEFYVRTELTWEDMDPDGIAENWSGYLGLLAPLPYEGNNPLMEM TPYFKKYKLTGAGKGDTTATVGYYDGIQELFVKISR >gi|229784009|gb|GG667726.1| GENE 20 22744 - 22854 91 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTQELLCSSREGSGRNGNTGAKRYWLHKNPPDMFFE >gi|229784009|gb|GG667726.1| GENE 21 22870 - 24360 1229 496 aa, chain + ## HITS:1 COG:BS_lytC_2 KEGG:ns NR:ns ## COG: BS_lytC_2 COG0860 # Protein_GI_number: 16080615 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 120 294 8 181 182 131 37.0 2e-30 
MRYERKNRYKKNGKKTGRRWLEDFAAAAAGGIAAVFLFTAMETMGSFSGTDAAENPIVVE NPAVMENSTAAENQLPSLILVMAKNQLAAGNQTVSGKKTVSASQDLSFQTAETDPDHPLI MIDPGHGGTDEGSYAGGILEKDINMEIAKKLRECLEEAGYRVLMTRCSDTYLTKEQRAEA ANVSGAEAFVSIHQNTYEDGEPSGIETWYDGTDQSRDSRRLAQLIHGEAVKRTGARERQL RGDAGFVVTGWTKMPACLIETGFLSSPEEAEKLSQPEYQEKLAEGIAGGIDAFFHPKTMY LTFDDGPSAENTAAVLDVLKDRNIKATFFVVGENVKKNPEVARRIVAEGHTIGIHCYSHD YKQLYESTASYLEDFNEAYKTVLEVTGVRVKLFRFPGGSINAYNKAVCQDIIKEMTDRGF IYFDWNASLEDALKRSEPEELIANARETALGRKHVVMLAHDIVRSTALCLGDLIDQFPEY QMEPLTPEVKPIQFQP >gi|229784009|gb|GG667726.1| GENE 22 24491 - 25870 1452 459 aa, chain + ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 4 437 2 433 447 236 33.0 8e-62 MAKIRTMTEGHPLKLIFAFSLPLMFGNVFQQFYTVVDTMVVGQALGVGALAALGAADWLN WMVLGMIQGISQGFAILMAFEFGAGDYKKLRQVIAHSAILACIGSLAMLLLMQAAARPVM LLLHTPDEIIGQSLLYLRIMFWGIPVVMAYNVLASILRAMGDGKTPLHAMVVASVVNIGL DLLFVLVFHWGIGGAAAATLIAQACSSLYCLKSLLKVQILSFKKTDFHLEPYLCRKLMVL GLPMAFQNGIIAVGGMIVQSVVNGFGVLFIAGFTATNKLYGLLEIAATSYGYAMTTYVGQ NLGAGKIKRIRQGMKSGLLLAMLTSAVIAAVMLVFGQAILGCFITGTPEEAAKTMEIAYH YLAIMSVCLPILYLLHVVRSSIQGMGDTVLPMLSGVVEFLMRTGAALLLPVAMGQEGIFY AEILAWLGADVVLIVSYFVKIKGLKGGGESGDGSEERIS >gi|229784009|gb|GG667726.1| GENE 23 25845 - 26469 453 208 aa, chain + ## HITS:1 COG:CAC3454 KEGG:ns NR:ns ## COG: CAC3454 COG0042 # Protein_GI_number: 15896694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Clostridium acetobutylicum # 15 207 5 197 311 239 58.0 3e-63 MEAKREYHKEPCCCFAPMEGITGYVYRNAHHALFPHVDLYVTPFLQPNQNHRFSSRERND ILPEHNKGITLIPQILTNRAEDFIWMAGELEALGYDEVNLNLGCPSGTVVSKYKGAGFLA KKEELDAFLDEVCSGIRIRLSVKTRLGMASPDEFEELLEIYNKYPLKELIIHPRVRSDFY KNTPDRAAFCRAVKASKNEVWYNGDIFT Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:16:22 2011 Seq name: gi|229784008|gb|GG667727.1| Clostridium hathewayi DSM 13479 genomic scaffold 
Scfld120, whole genome shotgun sequence
Length of sequence - 24829 bp
Number of predicted genes - 20, with homology - 19
Number of transcription units - 11, operones - 5 average op.length - 2.8

N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Tu 1 . + CDS 3 - 788 899 ## Closa_2404 hypothetical protein
2 2 Tu 1 . + CDS 902 - 1996 880 ## Cphy_3029 glycosy hydrolase family protein
    + Prom 2898 - 2957 13.0
3 3 Op 1 . + CDS 2997 - 4415 979 ## COG3669 Alpha-L-fucosidase
    + Term 4505 - 4553 1.2
    + Prom 4424 - 4483 6.4
4 3 Op 2 . + CDS 4636 - 5553 681 ## COG2040 Homocysteine/selenocysteine methylase (S-methylmethionine-dependent)
    + Term 5582 - 5625 3.0
    + Prom 5599 - 5658 10.5
5 4 Op 1 16/0.000 + CDS 5750 - 8995 2894 ## COG0642 Signal transduction histidine kinase
6 4 Op 2 . + CDS 9914 - 10378 498 ## COG0784 FOG: CheY-like receiver
7 4 Op 3 . + CDS 10400 - 11623 952 ## Cthe_2967 major facilitator transporter
    + Term 11647 - 11699 10.2
8 5 Tu 1 . - CDS 11669 - 12529 665 ## COG2207 AraC-type DNA-binding domain-containing proteins
    - Prom 12572 - 12631 6.5
    + Prom 12473 - 12532 3.5
9 6 Op 1 4/0.000 + CDS 12715 - 14508 1815 ## COG2407 L-fucose isomerase and related proteins
10 6 Op 2 3/0.000 + CDS 14521 - 15987 1232 ## COG1070 Sugar (pentulose and hexulose) kinases
11 6 Op 3 . + CDS 16146 - 16706 522 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases
    + Term 16721 - 16775 17.7
    - Term 16709 - 16763 14.7
12 7 Op 1 . - CDS 16803 - 17387 482 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase
13 7 Op 2 . - CDS 17403 - 18233 660 ## Closa_0221 dipicolinate synthase subunit A
    - Prom 18327 - 18386 11.0
    + Prom 18318 - 18377 3.7
14 8 Tu 1 . + CDS 18405 - 19289 894 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase
    + Term 19302 - 19354 16.1
    - Term 19285 - 19345 18.2
15 9 Tu 1 . - CDS 19397 - 19513 95 ##
    - Prom 19541 - 19600 6.3
    + Prom 19485 - 19544 10.9
16 10 Tu 1 .
+ CDS 19742 - 20554 777 ## COG1653 ABC-type sugar transport system, periplasmic component + Prom 21456 - 21515 10.7 17 11 Op 1 . + CDS 21566 - 21670 65 ## gi|266623701|ref|ZP_06116636.1| extracellular solute-binding protein, family 1 18 11 Op 2 38/0.000 + CDS 21752 - 22609 677 ## COG1175 ABC-type sugar transport systems, permease components 19 11 Op 3 . + CDS 22636 - 23478 905 ## COG0395 ABC-type sugar transport system, permease component 20 11 Op 4 . + CDS 23500 - 24829 1185 ## COG3119 Arylsulfatase A and related enzymes Predicted protein(s) >gi|229784008|gb|GG667727.1| GENE 1 3 - 788 899 261 aa, chain + ## HITS:1 COG:no KEGG:Closa_2404 NR:ns ## KEGG: Closa_2404 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 259 461 719 722 449 81.0 1e-125 VAHAIDYKQTYSYAGVLEGLSGMPFDVKFISFDDVIENPSVLKECSVVINVGDAYTAPSG GSYWLNPAVSCAVKEFVAEGGGLIGVGEPSACQHEGRYFALASVLGVDKEVGFSLSTDKY NWETHSHFITEDSGEEIQFGEGMKNIYALPDAKILRSIGQDVQMAVHEFGKGRSVYLSGI PYSFENSRMLYRAIFWAAGREQEMKKWYSDNFNVEVNYYPATGKYCVVNNTYEPQETVIY NGEGKAEAMSLKANDILWFEA >gi|229784008|gb|GG667727.1| GENE 2 902 - 1996 880 364 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3029 NR:ns ## KEGG: Cphy_3029 # Name: not_defined # Def: glycosy hydrolase family protein # Organism: C.phytofermentans # Pathway: not_defined # 22 363 17 350 391 457 64.0 1e-127 MDKELYVRKNWLSAAPELNDREAEEAFAVCVETVKRNMAYFGRSFPAAASVDGIYPKTAN DDWTDGFWTGELWLSYEETKDPAFKREALASVADFERRMKERVVVDHHDMGFLFSPSCVA AWRLLKGEEAEEAAVSQARRTALAAADNLMGRFQEKGGFFQAWGTVGDPKEYRLIIDCLL NMPLLFWASEVTGNPVYCEKAKRHIRTAMACVVREDYSTYHTYYFDPETGAPVKGITQQG NRDGSIWARGQAWGIYGTALSYEYFHNPEYERLFDGVTACFMEHLPSDLIPYWDFDFHDG SGEPRDSSAAAIAACGLLEMARCVDQEKAKLLTSMAAKLMKALTDRCLLREGAPGAGILL HGTS >gi|229784008|gb|GG667727.1| GENE 3 2997 - 4415 979 472 aa, chain + ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae 
TIGR4 # 14 472 10 448 559 266 35.0 5e-71 MVIDEQEKRLIEVVPSERQRNHQKLEFYGFLHFTVNTFTDREWGDGTESPSIFCPERFDA DQWANALASSGMKGAILTCKHHDGFCLWPSKYTDHTVAASPFRGGKGDVVREVSEALRAH GLKFGIYLSPWDRHEPCYGKGNEYDDYFVGQLTELLTGYGDIFCVWFDGACGEGADGRVQ RYDWDRYYEVIRRLQPDACISVCGPDVRWCGNEAGDTREQEWSVVPARLQDKEKIKENSQ QSDETEFRQRTLSSSDQDLGSREVLEGETDLVWYPTEVNFSIRPGWFYHASEDDQVRSLE NLVSIYDRSVGGNSTMLLNIPPTKEGLIHERDAERLRELGAYLRRREAHNLAEDAEMTAT GSQIGYVAANIRRDDDGCYKTPDGVRSCEIRLSWKEPVNVSMVVIKEEIRFSQRIETFRI LSVDDGVSRILYEGKTVGYRKNAVFEPVMTKEIRLEILESRVAPVIRFFGVY >gi|229784008|gb|GG667727.1| GENE 4 4636 - 5553 681 305 aa, chain + ## HITS:1 COG:slr1189 KEGG:ns NR:ns ## COG: slr1189 COG2040 # Protein_GI_number: 16332297 # Func_class: E Amino acid transport and metabolism # Function: Homocysteine/selenocysteine methylase (S-methylmethionine-dependent) # Organism: Synechocystis # 11 305 48 345 351 123 30.0 4e-28 MDFKTSFYHHQAFLMEGALGERLKREYGLMPDKSLALAKHVYSLKGRNALKELWLEYGAI ADRYQLPFLAVTPTRRVNQERVRSRSDEGVIGDNVSFLQRIKRECRAEMYAGGMIGSRGD AYTGRGALGENESRMFHRWELDQFSEAGVDFLYAALIPTLPEAAGMAWAMAETGIPYIVS FTIQRSGTLIDGTPICDAITYIDNRVSDPPVCYMTNCVHPRIAYEALSQPFQDKELLKRR FLGIQANTAQLSYEELDNAAELVTSEPEALAEDMMALKDICCLKIFGGCCGTDGRHMREI ARRLD >gi|229784008|gb|GG667727.1| GENE 5 5750 - 8995 2894 1081 aa, chain + ## HITS:1 COG:slr2104_3 KEGG:ns NR:ns ## COG: slr2104_3 COG0642 # Protein_GI_number: 16330590 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 898 1077 11 190 270 150 42.0 2e-35 MIQPGEAGRTVRSFFDAYLIRRNAEDTLNCVTDTVHWVGTGKSELTVGRNQAKQALATEF SQMPEPFGIEYDSFDEMVVSDRCAVVLLTANVYPPGDEADAIWFRVSAACVEGEDGVCRI ASIHASTPESRQEEGEYFPESTLDRYELERQLGTKALDIMGKSIPGGIMGGYLEPGLPLY YINDFMLSYLGYSYDEFVNATEGMIINCVHPDDREAAVHIFEEALDESSDYTDYCRILKK DGSYIWVSALGKRGESVDGRAVCICVVRDVSGEVEAREHLKCQAEEQARQAKQYDHLFQS MLCGIVQYRLTDMGVVFKNANREAIRIFGYEPEEFWQKKDWNMPLLIAKEDRESILKEIG TLRENGDKSDFEYRLLRKDGLPCWIIGTAEVLLDSDGEQIIQSVFIDINERKRAELNEQR LTRQVEASNEIIHRALEHTMTCEFYYYPQTGECQVPERTCKVYHCRNRYDNMPFGFAEEE 
VDEAYYPAFYAMYEQIHNGERTATCEFRGKSRSYWSRQTMSVILSDEEGKPQFVIGIVED ITREKEMEVALEEARSRDSLTGLYNKESGIRMIQEYLACERGPGEHGVMMLLDMDDFEDI NQKEGNVFADAVLQEVADVLRQITEKESIPIRLGGDEFMLFLKNCNKREAGIIGPRIAAM IRNIILNTEKDIQVAASIGMCSTEVTDEYNALYCCAESTLKYVKDHGRGNAACYLDTSRE VGVFLTQMYPEEFQVNTVDREILHPEKDLVSFALELLGRSKNINDAVFLLLSRIGKKYHF DRVSIIEADKAFLSYHFSYQWARHHADLQLGQDFYASEEDFEICANMYDEDGLADHNVRD GISHIASCLHAAIWNYGEYVGSMSFEIDQENYQWTGEQRKLLKELVKIVPSFIMKSKADA VSQAKTDFLSRMSHEIRTPMNAISGMTMIAKSVLDDREKTLDCLEKIESANTYLLELIND ILDMSRIESGKMELNYEAMDLEQQLSNLESLFRAQAEAKNLVLTFDNDYRKKRLLRADSL RLNQILVNIIGNAIKFTETGRVAVSANVLEEEPRALIRFSVTDTGIGIEPAALRRIFNAS S >gi|229784008|gb|GG667727.1| GENE 6 9914 - 10378 498 154 aa, chain + ## HITS:1 COG:VC1349_4 KEGG:ns NR:ns ## COG: VC1349_4 COG0784 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Vibrio cholerae # 5 141 119 250 250 112 43.0 2e-25 MSLEYACENTEESPSAEQEASIPDFKGRRVLLAEDNELNREIAQTILEMNGFTVTCAVDG QEALDLYCGKEPGWFDVVLMDIRMPVMDGLESARRIRTSGHPDARTIPIIALTANAFDED TKKSLASGMNGHLSKPIQVDQMLEVLGRCMGLIS >gi|229784008|gb|GG667727.1| GENE 7 10400 - 11623 952 407 aa, chain + ## HITS:1 COG:no KEGG:Cthe_2967 NR:ns ## KEGG: Cthe_2967 # Name: not_defined # Def: major facilitator transporter # Organism: C.thermocellum # Pathway: not_defined # 1 407 1 392 392 457 58.0 1e-127 MYSFLLVIIYMAFISLGLPDSLVGSAWPVIHTELNVPVSYAGMITMVIAAGTVISSLLSD RLTKKLGAGLVTAVSVLMTAVALFGFSVSGSFLLLCLWAVPYGLGAGAVDAALNNYVALH YASRHMSWLHCFWGLGAAASPYIMRYCLTGGYGWHSGYRAVSIIQFVLTAGLFISLPLWK QANREKGDGARNAAAVKEEKRISENPSSGKTLSLLQTIRLKGVPLVMAAFFSYCAVEQTM MLWASSYMVQVRGMDVNLAAGYSSLFFFGITFGRFLGGFAADWLGDRGLIRIGIATIVVG LILVAAPVKTNMLVLAGLVVIGFGCAPVYPAIIHATPYNFGRENSQAVIGVQMASAYIGT TFMPPLFGVLASITNISSLPFYILILTVLLLVMSEKLNRTVDRRRNR >gi|229784008|gb|GG667727.1| GENE 8 11669 - 12529 665 286 aa, chain - ## HITS:1 COG:BH1906 KEGG:ns NR:ns ## COG: BH1906 COG2207 # Protein_GI_number: 15614469 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing 
proteins # Organism: Bacillus halodurans # 5 284 5 290 299 121 28.0 2e-27 MHYIKYKEIKHHGTADFPAGYYHVTESHPQYAMPYHWHEELELIHIISGTFHVTINETTR NAVPGDILLVNSGGLHGGTPYDCVYECLVFPVSLLSSQTYASEFLNRLDRQDLILDEYFP AGDDTAIHSLLERLFALMQDSDEGGKLQCLGLLYQFMGLLLQDRRYTASDKSDLSTHRNI YLLKNVLNYIEAHYTEKITLHELARTAGMSSKYFCHFFSEMTGRTPIDYVNYYRVERACC LLAGIGHSITDIAMLCGFNDVSYFTRAFKKYKGISPGQYLKIPLTQ >gi|229784008|gb|GG667727.1| GENE 9 12715 - 14508 1815 597 aa, chain + ## HITS:1 COG:SP2158 KEGG:ns NR:ns ## COG: SP2158 COG2407 # Protein_GI_number: 15901968 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 9 597 3 588 588 764 62.0 0 MAESRLVGRYPVIGIRPTIDGRRGYLKVRESLEEQTMNMAKAAAELFEKNLRYSNGEPVK VVIADTTIGRVGETAACADKFRREGVDITLTVTPCWCYGAETMDMDPLTIKGVWGFNGTE RPGAVYLASVLATHAQKGLPAFGIYGHDVQDADDTEIPEDVKEKLLRFGRASVAVATMRG KSYLQIGSICMGIGGSIIDPAFIEEYLGMRVESVDEVEIIRRMSEEIYDHTEYEKALAWT KKNCTEGFDKNPEEVRKSCEEKDKDWEFLVKMMCIIKDLMNGNQNLPEGCEEEKVGHNAI AAGFQGQRQWTDFYPNCDFPEAMLNTSFDWNGAREPYILATENDVLNGLGMTFMKLLTGR AQIFADVRTCWSGDAVRRAAGYELEGRAKEADGFLHLINSGAACLDANGGAKDAEGNSVM KPWYEVTEEDQEGIMAAVTWNAADFGYFRGGGFSSRFLTDCEMPVTMIRLNLVKGLGPVM QIAEGWTVKLPEAVSDTLWKRTDYTWPCTWFVPRVTGTGAFSTAYDVMNNWGANHGAISY GHIGADLITMCSMLRIPVSMHNVPEKKIFRPAAWNAFGMDKEGQDYRACAAFGPVYR >gi|229784008|gb|GG667727.1| GENE 10 14521 - 15987 1232 488 aa, chain + ## HITS:1 COG:SP2167 KEGG:ns NR:ns ## COG: SP2167 COG1070 # Protein_GI_number: 15901977 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Streptococcus pneumoniae TIGR4 # 6 469 4 462 467 347 39.0 3e-95 MSRKQVLAVDYGASGGRVMLGGFDGTSFTISELHRFSNDPVILGDTMYWDVLRLFFEMKQ GIVKSKTHGEISGIGVDTWGVDFGLLDRKGNLLDNPVHYRDGRTAGRMEETFKIVGRDRL YEETGNQFMEINTLFQLMAVKAERPELLQRTDSLLLMPDLFQYFLSGTKISEKSIASTTQ MMDMRQGRWACGLLESLGLRAGMMRDILPGAVKTGSLRPELRKELGVPAVDVIAVAGHDT ESAMAAVPARGKDFLFISCGTWGLLGTELKEPVIDERSRRFNITNEAGLEGKYAFLKNII 
GLWLIQESRRQWIRKGKEYGFGELEAMAVGQEPLKCLINPDAPEFVPAGDIPERIREFCR RTGQPVPQTEGEIVCCIDQSLALAYRKAMDEIKTCTGKQYPVIHLIGGGAQSGLLCQMTA NACGLPVVAGPVEATVYGNAMAQFLALGELGSLAQARNALADTLDLIVYEPRDKALWEEA YERYKEIL >gi|229784008|gb|GG667727.1| GENE 11 16146 - 16706 522 186 aa, chain + ## HITS:1 COG:PAB0117 KEGG:ns NR:ns ## COG: PAB0117 COG0235 # Protein_GI_number: 14520392 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Pyrococcus abyssi # 6 139 55 184 194 82 32.0 4e-16 MSKSFITPEKLVHIDQGGQVLDAAAGLHPSSEIKMHLRCYEERTDVGAVLHAHPPVATGY AVANKPLDEYSMIETVIALGSIPVTPYGTPSTFEVPDAIAPYLGEHDAVLLQNHGALTVG ADLITAYYRMETLELFAKISLNAHLLGGAKEISRDNIDRLISMREGYGVTGRHPGYKKYS SPGERS >gi|229784008|gb|GG667727.1| GENE 12 16803 - 17387 482 194 aa, chain - ## HITS:1 COG:BS_spoVFB KEGG:ns NR:ns ## COG: BS_spoVFB COG0452 # Protein_GI_number: 16078737 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus subtilis # 3 191 4 192 200 206 49.0 3e-53 MDLAGKHIGLAITGSFCTFSKIEQEIKRLVETGAILHPIFSDHVQTTDSRFGDTGEFIRR VSEMTGNAPILKIEEAEPIGPKGYLDVLLIAPCTGNTLAKLASGITDTPVLMAAKAHLRN GKPLVISVSTNDALGMNFKNIGELFHTKNIFFVPFGQDDPVKKPSSMIARTELIEDTLRE ALENQQIQPVIRGV >gi|229784008|gb|GG667727.1| GENE 13 17403 - 18233 660 276 aa, chain - ## HITS:1 COG:no KEGG:Closa_0221 NR:ns ## KEGG: Closa_0221 # Name: not_defined # Def: dipicolinate synthase subunit A # Organism: C.saccharolyticum # Pathway: not_defined # 1 276 1 276 277 310 54.0 5e-83 MHQFLILGGDPRQLYLTRRLKKAGQKTLLYYDNSTPAFSLRDAMETSDIILCPVPFSKDG HTIFSDNHLEGLEIDTFLNHLTKGHTLFGGNIPPSVKAHCESRRIPCHDFMQMETVACKN SAATAEGALAEAISLSPINLYKSRCLVLGWGRCARTLADRLKGLGTAVTVAARDDRQLAH ADCLGYDTVLLEDLTGDIDRFDFIFNTIPAMVLDSVLLEAAKPEAAIVDIASAPGGVDFE TCRRLGIPAKLCPGLPGVYAPMSSAEILYEAVMDHL >gi|229784008|gb|GG667727.1| GENE 14 18405 - 19289 894 294 aa, chain + ## HITS:1 COG:CAC2378 KEGG:ns NR:ns ## COG: CAC2378 COG0329 # Protein_GI_number: 15895644 
# Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Clostridium acetobutylicum # 4 290 2 287 293 268 48.0 7e-72 MRQSIFKGSAAALVTPMKPGGGLDFTALRRLTDRVLAGKTDAVVVNGTTGESATLSDREK LAVIEAVIETVDGRVPVIAGTGSNCTRHAVELSKEAQRLGADALLQVTPYYNKATQEGLV RHFTAVGDSVDIPILLYNVPSRTGVSIQPETYQILSGHPNIRGTKEAGGDLSALAKTAAL CGDGFDIYSGSDDLTVPVMALGGRGVISVLANLMPWEMHEMCRLFLEGRIEESRNLQLKL MEVMEAMFWEVNPIPVKAALSILGICDEFCRLPLTPMETEEKERLAEVLKSVCN >gi|229784008|gb|GG667727.1| GENE 15 19397 - 19513 95 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHFYFITVLSFVKVLYMRTIAPMYHYVFIQFIIYLSSV >gi|229784008|gb|GG667727.1| GENE 16 19742 - 20554 777 270 aa, chain + ## HITS:1 COG:BH3845 KEGG:ns NR:ns ## COG: BH3845 COG1653 # Protein_GI_number: 15616407 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 1 261 2 255 436 59 21.0 8e-09 MKKCLSVLLSAVMAASLLGGCGSADQKPTSPAAAESTAEGSTTAEAAPAPEKAETAPGEK KQISLWFWSAPADQQVLLKQIIEDDINAVQDDYVLNIEFHAGNDQNIATALAANSGPDIV FTSGPSYVTNYASAGKLENLDRYAEKYGWNDRLLEPIYNLGTYEGSLYGLSNSLMVMGVF YNKKVLSDNGWEVPKTIEEMETIMKAAQEKGLYGSVTGNKGWKPVNANYSSLFLNHIAGP EKVYECLKGEAKWNNPDMVEALNKSAELAS >gi|229784008|gb|GG667727.1| GENE 17 21566 - 21670 65 34 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623701|ref|ZP_06116636.1| ## NR: gi|266623701|ref|ZP_06116636.1| extracellular solute-binding protein, family 1 [Clostridium hathewayi DSM 13479] extracellular solute-binding protein, family 1 [Clostridium hathewayi DSM 13479] # 1 34 33 66 66 67 97.0 4e-10 MWLGTMSAEEVLDRTDAEFEKELANGLVVPLPKR >gi|229784008|gb|GG667727.1| GENE 18 21752 - 22609 677 285 aa, chain + ## HITS:1 COG:mlr7001 KEGG:ns NR:ns ## COG: mlr7001 COG1175 # Protein_GI_number: 13475831 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mesorhizobium loti # 
8 264 34 294 317 147 36.0 3e-35 MKRRRDYLLLLPAFVLSMAVVFIPGVMTIFASFTDWNGVSMNFHFIGLRNFSELFGDKIF WRAIYDNVRWVALFLTIPVVIGLLTSVLLLNRRHKGLYQMLFLFPYVLAPVTNVAVWQNI IYNPIFGLVGYLNRQGFHIPSLLSDMKTALYAVAATDIWHYWGFLMVVYLASLRQTPEDQ VEAARVEGCSGWQLFRYVYFPNMLPTFKLMMIMSVINSFLTYDYVYLMTSGGPAHSTEML STYAYTFAFASLQVGKAAAIALFMSFFGLIASFVYIHLSQNEIMS >gi|229784008|gb|GG667727.1| GENE 19 22636 - 23478 905 280 aa, chain + ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 6 280 27 300 300 139 32.0 4e-33 MAVMGKKKRIVTLVTVHAILILLVITALFPIALVVLNSFKPHVAIVANPLSWPESFTPVN YTSAWSYAKFGIGFKNSIILAACTIVIVLFTSTLGGYVLATRRGKVVNLALVYFMIVMTI PIQLFLFPLYYVMSYLKLVGTVVPISFILAARNIPLALFLMRTFFLNVPKELEEAARIDG ANTGHVLWYVMAPVVSPGLLTVSVIVGLNAWNEFLITSTFLQGEKNFTATLMLRSMSGQF GSDIGIMMAGAMILIAPVLIFFMITQRYFIDGLVSGSVKG >gi|229784008|gb|GG667727.1| GENE 20 23500 - 24829 1185 443 aa, chain + ## HITS:1 COG:STM0886 KEGG:ns NR:ns ## COG: STM0886 COG3119 # Protein_GI_number: 16764247 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 1 401 1 408 495 319 40.0 7e-87 MKAFLILIDSLNRHFLPCYNPDTKVTAEHLTRFSQENITFDSHYVASAPCMPARRDIFTG RINFLERNWGGMEPFDVTLISKLREHGIFTHITTDHPHYFEIGGENYCQMFHTWNFERGQ EGDVWVSRVKGSDIPQPHYGKVNRQNQLNRLAYTKEEEYPTPRTFRSACNWIDSNRDADD YFMMVEAFDPHEPFDCTENYRELYPDTYDGPYFDWPSYKALNEETPEAMEHLRNEYSATL TMMDHWFGTFIQKLKDTGIYDESLIVVTSDHGHMLGEHGFAAKNFMPAYQELVHIPLFVH LPGSAYAGKRISSLTQNIDLMPTFLEYFKTETPKTVQGKSLLPMIPHDTPVRDKAVYGWF GKAVNVTDGTHTYFRAPVSEDNQPCFLYGAMPTTFLKYWDIKPGDYEMGTFLPYTDYPVY KVRYRVEGHYPECPEGLDYVREN Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:17:00 2011 Seq name: gi|229784007|gb|GG667728.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld121, whole genome shotgun sequence Length of sequence - 23056 bp Number of predicted 
genes - 21, with homology - 19
Number of transcription units - 8, operones - 3 average op.length - 5.3

N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Op 1 9/0.000 + CDS 3 - 491 377 ## COG4177 ABC-type branched-chain amino acid transport system, permease component
2 1 Op 2 9/0.000 + CDS 511 - 1293 241 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
3 1 Op 3 . + CDS 1310 - 2026 238 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
4 1 Op 4 3/0.000 + CDS 2089 - 2730 708 ## COG0831 Urea amidohydrolase (urease) gamma subunit
5 1 Op 5 10/0.000 + CDS 2731 - 4449 1601 ## COG0804 Urea amidohydrolase (urease) alpha subunit
6 1 Op 6 16/0.000 + CDS 4480 - 4962 599 ## COG2371 Urease accessory protein UreE
7 1 Op 7 17/0.000 + CDS 5008 - 5697 447 ## COG0830 Urease accessory protein UreF
8 1 Op 8 9/0.000 + CDS 5782 - 6396 511 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase
9 1 Op 9 . + CDS 6347 - 7273 452 ## COG0829 Urease accessory protein UreH
    + Term 7388 - 7446 14.2
    - Term 7376 - 7434 15.0
10 2 Tu 1 . - CDS 7495 - 7725 302 ## gi|266623714|ref|ZP_06116649.1| phosphocarrier protein HPr
    - Prom 7795 - 7854 7.1
    + Prom 7926 - 7985 8.2
11 3 Op 1 . + CDS 8010 - 10076 993 ## Cphy_1618 hypothetical protein
12 3 Op 2 . + CDS 10149 - 10262 65 ##
13 3 Op 3 7/0.000 + CDS 10273 - 12051 1476 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
14 3 Op 4 . + CDS 12044 - 13543 1208 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
    + Term 13577 - 13631 5.0
    + Prom 13548 - 13607 6.6
15 4 Tu 1 . + CDS 13748 - 13915 160 ##
16 5 Op 1 35/0.000 + CDS 14887 - 15552 590 ## COG1653 ABC-type sugar transport system, periplasmic component
17 5 Op 2 38/0.000 + CDS 15620 - 16531 1023 ## COG1175 ABC-type sugar transport systems, permease components
18 5 Op 3 .
+ CDS 16536 - 17387 819 ## COG0395 ABC-type sugar transport system, permease component + Term 17427 - 17477 13.6 + Prom 17519 - 17578 8.3 19 6 Tu 1 . + CDS 17755 - 20607 2036 ## Pjdr2_5534 S-layer domain protein 20 7 Tu 1 . + CDS 21582 - 22559 1000 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 22593 - 22646 15.9 - Term 22581 - 22633 15.1 21 8 Tu 1 . - CDS 22641 - 22814 100 ## gi|288871186|ref|ZP_06410060.1| lipopolysaccharide biosynthesis protein - Prom 22952 - 23011 5.8 Predicted protein(s) >gi|229784007|gb|GG667728.1| GENE 1 3 - 491 377 162 aa, chain + ## HITS:1 COG:BH0249 KEGG:ns NR:ns ## COG: BH0249 COG4177 # Protein_GI_number: 15612812 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Bacillus halodurans # 1 122 216 337 367 157 60.0 6e-39 GKVFIAIRDGENRAGFAGYNTAKYKVFVYALSAMIAGLAGMLYVGQVGIISPAEVGIVPS VEMIIWVAIGGKGTLAGPILGALAVNGFKTLASENFPELWSYFIGILFVVVILWLPGGFV SLKEVPGRIRNAVKRRREERPELMEDREAYPASAKKNGRTAS >gi|229784007|gb|GG667728.1| GENE 2 511 - 1293 241 260 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 15 249 1 229 245 97 26 7e-20 MTQAEVLMTPEEAPVLTIEDVSIQFGDFKAVKHVSTEVGKNEVRFFIGPNGAGKTTILDA ICGKNRVSGGHIIFHGENGTKDVSKMKEYAISDLGIGRKFQAPSVFQGITIEENMELAVA RRHSLYSTTFRKLSKEKQDYMNHVLEFIGLADKRFKYPGSLSHGEKQWLEIGMLLAARPK LMLLDEPVAGMGRKETEKTAEMLLDIKKECSIIVVEHDMQFARDVSDTVTVFHEGSILDE GTMESVQQNQKVIDVYLGRS >gi|229784007|gb|GG667728.1| GENE 3 1310 - 2026 238 238 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 13 227 3 226 245 96 30 2e-19 MDRTELENKAAFELAGLYAAYGDGMVLRHVSLEAPVGRVTCLVGRNGAGKTTTLKAVMGL VKTPEGEILLDGKPVIHLPSYERAKRGIGYVPQGREIFPQLTVEENILLGLEAKNGKGVI PDFIFETFPILKEFLNRKGGDLSGGQQQQLAIARAVVAEPKILILDEPTEGIQPSIIQEI GRVITGLKTRMAVLIVEQYLEFVMEIADYCYVMENGRIIMDGEPDSLDQERLQATMSL >gi|229784007|gb|GG667728.1| GENE 4 2089 - 
2730 708 213 aa, chain + ## HITS:1 COG:HI0541 KEGG:ns NR:ns ## COG: HI0541 COG0831 # Protein_GI_number: 16272485 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) gamma subunit # Organism: Haemophilus influenzae # 1 99 1 99 100 130 64.0 2e-30 MRLTPKEEEKLMLHLAGNLAKERRSRGLKLNYAEAVAYLSSELLELARDGRSVTELMEKG RGILNADDVMDGVADMVEAVQVEATFPDGTKLVTVHNPIGPGISKRTDRKSGEVITQDRD IILNEGRESKSVTVANIGDRPIQVGSHFHFFEVNKLLKFDRDEAFGFRLDIPSGTSVRFE PGEEKKVQLVELGGKKLVYGCNDLTRGKAERRK >gi|229784007|gb|GG667728.1| GENE 5 2731 - 4449 1601 572 aa, chain + ## HITS:1 COG:HP0072 KEGG:ns NR:ns ## COG: HP0072 COG0804 # Protein_GI_number: 15644702 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) alpha subunit # Organism: Helicobacter pylori 26695 # 4 572 3 569 569 805 71.0 0 MSYRISGKQYAMMYGPTAGDRIRLADTDLYAEVEKDYTRYGDEIKFGGGKTIRDGMGQSV LTTSAGGDLDLVLTNALIIDYTGIYKADIGIKDGKIAGIGKAGNPDIMDGVTPEMTVGAS TEALAAEGLIVTAGGIDSHIHFICPQQIDCALFSGVTTMIGGGTGPADGTNATTCTPGPW NIHRMLEAAEEYPMNLGFLGKGNCSDEAPLREQVEAGAMGLKIHEDWGATPAVINHCLSV ADEYDVQVAIHTDTLNEGGCVEDTLAAIGGRTIHTYHTEGAGGGHAPDIIRAAGALNVLP SSTNPTMPFTVNTLDEHLDMLMVCHHLDKNIPEDVAFADSRIRPETIAAEDVLHDMGVFS MMSSDSQAMGRIGEVITRTWQTAHKMKGERGPLPGDKEGSDNLRIRRYIAKYTINPAITH GISDYTGSVETGKMADLVLWNPAFFGVRPDIIIKGGMIIASKMGDANASIPTTEPVMYQH MFGAHGKAKYRTCATFVSQAAYDSGIKEKLGLEKQVLPVHCTRTITKKDMKLNDLTPEIF VDPETYEVRADGVKLTSEPAERLPLTQMYSLF >gi|229784007|gb|GG667728.1| GENE 6 4480 - 4962 599 160 aa, chain + ## HITS:1 COG:HP0070 KEGG:ns NR:ns ## COG: HP0070 COG2371 # Protein_GI_number: 15644700 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreE # Organism: Helicobacter pylori 26695 # 2 157 1 142 170 95 36.0 4e-20 MIVCEEILGNVKDKEFSGCFRDEVEIEWHEAYKRINQKTTKGDREIGIRLGTEILTRGLR DGDVLYREGDTVITVCVPPCEVIVIDVREDHARMACKVCYEIGNKHAALMWGDTKLQFIT PYNEPTLVLLSKLHGVTVWSDVRKLDFDRSISSKVSSHTH >gi|229784007|gb|GG667728.1| GENE 7 5008 - 5697 447 229 aa, chain + ## 
HITS:1 COG:HP0069 KEGG:ns NR:ns ## COG: HP0069 COG0830 # Protein_GI_number: 15644699 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreF # Organism: Helicobacter pylori 26695 # 5 229 30 254 254 187 40.0 1e-47 MMKQEKRYLLLQLNDALFPIGGYSHSYGLETYIQKNMVNSEETAAQYIENMLLYSFCQNE LLAVRFAWAAGDMENLEALLLLDQCITAQRTPMEVRQAGEKMGSRFVKTVQALAAPCEKS IFARYVREGGLKNHVIAYGTYCAAHGIGQEETLEHYLYGQTSAMVTNCVKAIPLSQTSGQ KLLYGCQDIFRKVMERLDTLTEEDFGASAPGFEIRCMQHEGLYSRLYMS >gi|229784007|gb|GG667728.1| GENE 8 5782 - 6396 511 204 aa, chain + ## HITS:1 COG:HP0068 KEGG:ns NR:ns ## COG: HP0068 COG0378 # Protein_GI_number: 15644698 # Func_class: O Posttranslational modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase # Organism: Helicobacter pylori 26695 # 4 201 2 199 199 327 79.0 1e-89 MKYVKIGVAGPVGSGKTALIEILTRKMASDYSIGVITNDIYTKEDAQFLCKNSILPQERI IGVETGGCPHTAIREDASMNLEAVDEMMGKFPDIQIIFIESGGDNLSATFSPELADATIF VIDVAEGDKIPRKGGPGITRSDLLVINKIDLASLVGASLQVMERDSKKMRGTRPFVFTNI RGDEGVDQIVKWIKRSVLFEDGFQ >gi|229784007|gb|GG667728.1| GENE 9 6347 - 7273 452 308 aa, chain + ## HITS:1 COG:HP0067 KEGG:ns NR:ns ## COG: HP0067 COG0829 # Protein_GI_number: 15644697 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreH # Organism: Helicobacter pylori 26695 # 26 267 2 240 265 165 36.0 1e-40 MLNGSSEVFCLKTDFSKENTGDSGKNAFGRTSVLNISASAKEGKTYLKELGFTAPFKVMT PFEQENGGMKIMMLMASAGIMEGDVQEISICTERGASLECTSQSFEKIHKMGDGYAVRNI RLTVGAGSFLDYSPMPVIPFADSAFKASTDIVLEDDSSSLIWQEIMACGRAARGERFSYR FYHSGIRIWQQEKLLYREKCRFNPPETMMEGIGLFEGYSHMASLLLFHMGISDELLTRFR EYVEGLPYVCGGVTRTASGDVAVRIFSNQAWEVEAVCSTLKDWAKQDRIEYFATGRVKGT KSYRSLLI >gi|229784007|gb|GG667728.1| GENE 10 7495 - 7725 302 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623714|ref|ZP_06116649.1| ## NR: gi|266623714|ref|ZP_06116649.1| phosphocarrier protein 
HPr [Clostridium hathewayi DSM 13479] phosphocarrier protein HPr [Clostridium hathewayi DSM 13479] # 1 76 1 76 76 129 100.0 6e-29 MEHKFITFQNTKQILQFVNLVSKLDFDVDVKYGSRIVDGKSILGVLSLASFKTVEVIFHS VTDCIAQLDELFELAS >gi|229784007|gb|GG667728.1| GENE 11 8010 - 10076 993 688 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1618 NR:ns ## KEGG: Cphy_1618 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 8 680 4 713 714 701 49.0 0 MDMQNDFFILNRDVTIDAELKSEPVRRAVSRFYRDLSMVLEPADEGTERAESGKICLKKG DLAAEEYDIYVENNRKVTITASEELGFIYALIYISEHCLQILPFWFWNDQRFVKQKEIPI LFKKYHSEKKPVAYRGWFINDEVLISHWDAEKDAAYPWEMAFEALLRCGGNTVIPGTDSN SKKYAGLAGDMGLWLTQHHAEPLGAEMFLRAYPDKTPSFKEYPDLFRGLWEEGIKRQMNH KIIWNLGFRGQGDTPFWENDPQYDTPEKRGQLISSIMQEQYDLVKKYVANPVFGTNLYGE TMELYQQGLIELPDNVIMIWADNGYGKMVSRRQGNHNPRVPALPGEGLGDRRHGVYYHVS FYDLQAANVLTMLPNPMEFVKKELKHAYDCGISTLWLVNCSNIKPHVYPLDFAATLWNSL ETEPEKHLEQYISRYYGDRSLKEMRECFMEYFEAALPYGEEEDEHAGEQFYNYVTRVLIH QWMKDGGNAACEELRWCGPAESFAAQLQWYVSKCKNACPRFQGLLDRCVSLADELPEDCR RLWKDSLLLQVKIHTFCLEGVIHFGKGYSAYESSDYLNAFYEIGQAADHFYLAEEAMKEC GHDKWKGFYDNDCQTDIKETAYLLRLLMGYIRNIGDGPYFYQWQRLIIYPERDRKIMLLL NYENHLTDGELYQAMKEKENNKFDKISL >gi|229784007|gb|GG667728.1| GENE 12 10149 - 10262 65 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGFAVDFYYVFAVFSVFGYNYLEKSLFSNRKQENEE >gi|229784007|gb|GG667728.1| GENE 13 10273 - 12051 1476 592 aa, chain + ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 6 583 1 571 577 302 31.0 1e-81 MKGQSMKRYLEYWYQKYRDLKLAEKMIIVYVLMLGICFLISVAALQVSFNIYDGKLYEKS LQELDFFTQEVNRSLDEVEGISYNIAIDTQIQEQLTKMKNLKHFTADYSYEVYQLRMLFN KELANSNMISNMVYTDGDKTQFVVGLATGTISKELYDSVMNRIHEAKGAYVVQPPTTEYP YLLTGRDVRKHIDASLEYLGSLLITCDVAGMIEKNIDDLEADSSTLCVYSNEGVIYEKQK GMMERLPKLEGDKGYQIIKEHGQRYFACYLQSSKTGWTYVNIFPYSEIYGQNQNLRYMMI 
GGFLLLFLITALILKKLAHMIVKPLEHLTESMQIVETGDFDQAREFLGTSTQKDEIGLMT QEFKVTLDKIDYLIHENYEKQFLLKDTKYKMLQAQINPHFLYNTLNSINWMIRARKNDEA AEMTVALGTILRSALNKQQYVAVDEELDSLRKYMTIQEYRYRKRVAFSVECEVSGQYLIP HMTLQPLVENSIYHGGEKMLTPCKISVTIREEGGRITITVSDNGPGMTEEELEAVRTFTA EPKGHGIGLKNIYERLKMAFDQEAYFDISSVPGGGTTVTIKIPKAEVMISDE >gi|229784007|gb|GG667728.1| GENE 14 12044 - 13543 1208 499 aa, chain + ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 1 156 1 156 368 129 42.0 9e-30 MNKVLLVDDEDIEREAMAEIIPWKKLDMELVDMAWNGIEGLEKIRMHIPDIVITDIKMPV MDGIQLIRNTKELYPDILFVVLSGYGEYEYTSRAMELGIRHYILKPCDEEKIVEVLLGVK KELEELENRRKSEQEYRSTFDRMLPRVKEQILGILITKEEVSRADEILLEKFMEDITNSF VLLGIQSSLELDQLDQFALTNILTELLGVENVYMSTATKNEMVYLIPAEMAENLRPLIIK VHKEYRKYKQTKLRSAISRTGGIEDVSHLYSQLQELFFLGDSEGKMEFISFDFLGEADDN SVMVLDYDRIREAKDYAEVMFEIYTSFVKMEIRGFSRKRIADSFRFFLKIFCGKAECELN DGDIWGILEMVIEECTAYLKLSLPDTKDGSRMKHILYSIYQNIKNPDLSLQYLAKEVLFM NEDYFGRFFYRYTNEKYSSFLVRIRVDLAKRLMAFDPDIRISELAEQIGYPADGQYFAKV FKKTTGLVPSEYRKMLTVR >gi|229784007|gb|GG667728.1| GENE 15 13748 - 13915 160 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRSLKKLTALSMAAAMAFSLAGCSSGNGGGTTAAEAAKTDTQAAGTESTGEPVT >gi|229784007|gb|GG667728.1| GENE 16 14887 - 15552 590 221 aa, chain + ## HITS:1 COG:BS_yesO KEGG:ns NR:ns ## COG: BS_yesO COG1653 # Protein_GI_number: 16077764 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 2 221 198 412 412 129 36.0 4e-30 MFSEDKKAIGYADDKLCSDFIQMWADLMAKGAAPNPDEYAAIQTLGEEGRPVVTGEGAML TEWNNYATKVSGTNDKLKMVTPPMVAGSDAKGLWMKPGMFFSIAETSKVKKEAAEFIDWF VNSEEANDIMMAERGTPVSSEIRDYMVASGKLSDQQATMFKGVEDALALCGETPDPDPVG MSEVNESFKNAAYSAFYGQMTPEEAAAKFRKDADAILSRNN >gi|229784007|gb|GG667728.1| GENE 17 15620 - 16531 1023 303 aa, chain + ## HITS:1 COG:BH1118 
KEGG:ns NR:ns ## COG: BH1118 COG1175 # Protein_GI_number: 15613681 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 3 303 17 316 318 304 53.0 1e-82 MKSKKKKAVSRQADNAVAYLLMAPWLVGFIGMWLIPAIISIYYSFTDFNLLNDPKIIGFA NYARAFTQDETFVQALKVTFLYVLVLVPLRLAFALFVAMLLNKKHKGLGLYRTLYYVPSI IGGSIAVSVVWKQIFGNKGVVMTLLSVFGIQQTTSFLGNPKTALSVIILMGVWQFGSSML IFLSALKQIPYTLYESAMVDGAKPSYTFFRITLPMLTPTIFFNLILQIINGFRVFTESYV ITDGGPLDSTLSYVLYLYRRAFTYFDMGYSCALAWILVAIIAVFTVIIFKTQKNWVYYES EEG >gi|229784007|gb|GG667728.1| GENE 18 16536 - 17387 819 283 aa, chain + ## HITS:1 COG:BS_yesQ KEGG:ns NR:ns ## COG: BS_yesQ COG0395 # Protein_GI_number: 16077766 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 12 282 27 296 296 258 46.0 1e-68 MGMHTKKKINSVIFHIGACALGFLMIYPLLWLLASSFKSNDTMFLDTYSLIPKVWDAGTN YNSGFSGIGGVKFTTFFTNSLIVTVIGTVGCVLTSLLAAYALSRLKFKFSGFWFGCVMMT MMIPPQVMVVPQYIILKQLHLINTKTALVLPWIFGGAFFIFLMVQFFRGIPRELDEAAEI DGCGKITTLFRILVPVVKPAIITSSIFAFYWIWQDFFQPLIFMNSVERFTIPLALNMYLD PNSYNNYGGLFAMSVISLLPVIIFFIIFQRYLVDGIAMDGIKG >gi|229784007|gb|GG667728.1| GENE 19 17755 - 20607 2036 950 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_5534 NR:ns ## KEGG: Pjdr2_5534 # Name: not_defined # Def: S-layer domain protein # Organism: Paenibacillus # Pathway: not_defined # 248 925 42 677 1682 362 37.0 6e-98 MRLKQRTWFRRTIAAVLVAGMTVATGMFTLAAPGNGGKTAGASDWKEIAAVDFSGLESLD NLKDGWVVSGGSGTAELVEAGESAEGKALKLTRSSDGAVTQLVRSALGIDKSNCSYVSIE TRLKLGTEKGNKQWSIPYIKDKNGKVAYTVFVSDDANQWSQYKTHVNGTNNQAAGTAVPG TWQTVRMDINLEKNTFRVSVDGSYVLYDENARSAADSLDTIQFYADSWNRGTVYIQSVKV MGQQEHKAGTTYYVSSEGNDEAAGTSENTAWKTLGRVNREHFIPGDRILFRCGDEWENET LFPQGSGSADEKIVLGSYGEGAMPKISTNGKAADAVCLYNQEYWEICGLDISNTVEGFSQ LSGDGNGDGTPPETNNADRNAADGKLLGEYRGIHIAGRDVASLKGYHLHDLLIHDVTGVV SWIGDTGLKDAGIVNNAGLDGSKRTGGILIECLKPSGNQPTQFSDIVIEDSEFINNSFCG 
ITVKQWNGEGSQYSNNPGWANRNRAGGAPDYYDANWYPHSNIIIQDNYINQGASAYACNG IYLTSSKDSVIQRNVLEHIGTCGIELYFADNVAVQYNEISDVVKKGGGADDNAIDPDWRV TNSLIQYNYIHDAGEGFLLCGVHFNSGVIRYNLVQDCGRSYVHYSMGSGYFQIYNNIFYR SKDGNGTNNFDPWGGGSAAYFNNVFYDGKGQGFNFSGGSNFAYDNNAYYGTAPTSKDKNP ILLTDDPFEGDAPSLNRKGNFKTGVLLEANGLRPRVDSPLIAAGVDRDPNGYSIDDGLKS KGSKFNFTSLDKADTNYLGDCINIGRMDYPTFEKTDAEATFDSPKTQTVADQNAPTIGMF ELSLAPDAVILRGTVTDGLNPVTGAVVEVTSGDKTVEAVTNKSGVYSIVQGLAPGVATIT VKCEKEDITDTITLEGGKVNQHDVTVPMADMPGSFENVLIEENFEEQLAS >gi|229784007|gb|GG667728.1| GENE 20 21582 - 22559 1000 325 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 210 324 633 744 744 88 39.0 2e-17 MYETGDDFDPAGMKVVAYEVASASNASRNERTLNPDEYQIRHESFDTDGTKKVTVYYEAE NKEGDNEVFTDSFTVTVTEVWEEYYTTGIEVRKKPDKTVYKTGEDFDPEGMKVVAYERRA TASNAERRERVLTEDEYDLDIPSFGIQGSKTIKVVYEGVDKNGEDKTFKDTFTVKVLGRT VSSNDNDSDSSGSVYIPDDSITGTWQGGQNEPWKFKKSTGTYATNEWAKINGKWYHFDQD SNMQTGWLSDQNKWYLLNPDGAMCADTWVLVNGKWYYFNADGSMKCNEWYFYKEDWYYLG RDGDMLVSDITPDGYQVDSEGKWIK >gi|229784007|gb|GG667728.1| GENE 21 22641 - 22814 100 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871186|ref|ZP_06410060.1| ## NR: gi|288871186|ref|ZP_06410060.1| lipopolysaccharide biosynthesis protein [Clostridium hathewayi DSM 13479] lipopolysaccharide biosynthesis protein [Clostridium hathewayi DSM 13479] # 1 57 1 57 57 92 100.0 1e-17 MLAAISISSIKPALVDHAPHSGHALEIGKYLSYYLLLCECCCKFYAVSDETVVVLIL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:17:48 2011 Seq name: gi|229784006|gb|GG667729.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld122, whole genome shotgun sequence Length of sequence - 17946 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 7, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 3 - 729 642 ## COG3119 Arylsulfatase A and related enzymes
2 1 Op 2 14/0.000 - CDS 730 - 2355 1731 ## COG1653 ABC-type sugar transport system, periplasmic component
3 1 Op 3 7/0.000 - CDS 2385 - 3218 1027 ## COG0395 ABC-type sugar transport system, permease component
4 1 Op 4 . - CDS 3297 - 4232 1029 ## COG4209 ABC-type polysaccharide transport system, permease component
- Prom 4322 - 4381 3.1
+ Prom 4349 - 4408 6.3
5 2 Op 1 . + CDS 4535 - 6010 1171 ## COG3119 Arylsulfatase A and related enzymes
6 2 Op 2 . + CDS 6024 - 6392 256 ## gi|266623730|ref|ZP_06116665.1| conserved hypothetical protein
7 2 Op 3 . + CDS 6358 - 8271 1704 ## COG2207 AraC-type DNA-binding domain-containing proteins
+ Prom 8289 - 8348 5.4
8 3 Tu 1 . + CDS 8377 - 8697 261 ## Selsp_0314 hypothetical protein
- Term 8778 - 8812 -0.2
9 4 Tu 1 . - CDS 8874 - 9014 108 ##
10 5 Op 1 . - CDS 9951 - 11093 1215 ## COG0860 N-acetylmuramoyl-L-alanine amidase
11 5 Op 2 1/0.000 - CDS 11090 - 12169 1139 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain
12 5 Op 3 20/0.000 - CDS 12217 - 12654 597 ## COG0822 NifU homolog involved in Fe-S cluster formation
- Prom 12741 - 12800 4.2
13 5 Op 4 . - CDS 12833 - 14017 1263 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes
- Prom 14067 - 14126 6.1
+ Prom 14044 - 14103 8.8
14 6 Tu 1 . + CDS 14341 - 15825 1018 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid
+ Term 15843 - 15896 11.3
- Term 15828 - 15887 12.1
15 7 Op 1 . - CDS 15912 - 16670 581 ## gi|266623739|ref|ZP_06116674.1| bacterial SH3 domain protein
- Prom 16693 - 16752 1.7
16 7 Op 2 .
- CDS 16766 - 17884 1193 ## Toce_1815 major facilitator superfamily MFS_1 Predicted protein(s) >gi|229784006|gb|GG667729.1| GENE 1 3 - 729 642 242 aa, chain - ## HITS:1 COG:YPO0829 KEGG:ns NR:ns ## COG: YPO0829 COG3119 # Protein_GI_number: 16121138 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Yersinia pestis # 2 218 42 274 517 205 45.0 7e-53 MKKKNLLYIFADQWRAQAIGAAGEDHVSTPHMDRFAGESMAFDHAVSTYPLCSPHRASLL TGKYPYCCGMWTNCKTGLDEAVMLRPQEVTISDVLHENGYETAYVGKYHLDASEMNFHEK PESGAVNWDAFTPPGERRHHFDFWYSYGAMDQHMSPHYWRDDPQKMEGRTWSPEHETDVV LDYLDNRNKEKPFCLFLSWNPPHPPYDLLPEKYADHYRGKELVFRGNVPEELRSDEAFRK KF >gi|229784006|gb|GG667729.1| GENE 2 730 - 2355 1731 541 aa, chain - ## HITS:1 COG:AGl3560 KEGG:ns NR:ns ## COG: AGl3560 COG1653 # Protein_GI_number: 15891902 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 55 534 31 515 522 315 37.0 2e-85 MRKRRLMAAAMAAVMLAGTLAGCSAKSGSDGTEAAKEESGTGQAAAEETLEGSLVSKEPK EFTVFLNFNNMPFDSSWQVWQEAAKRTNVSLKGTISLSNSNEEEAFNLMMSSGNLADIIG YVDASSLEKLGRDGGMIPLNDLIKEHAPNIQKVLDEDARFRQTAYSLDGNIYQIPKNQEL KAAEFWWIRQDWLDKLNLKAPTTVDELHDVLYAFRNEDPNGNGLKDEIPLFDRAGWKQPD EYLYLWDTSLEFYPRDGQMKYEPLEENFKTGVTNMIKWYQEGLIDPEIFTRGASSRDTLL GGDVGGCTHDWVSTANYNTTLQETIPGFQMVAIAPPADQNGVVKERVSRYPGVGWGISSQ CKDPVTVIKFMDYFFTEEGSDLMNWGIEGDTFTRDADGSKHFTDKVLQSELTPIGYLRSI GAQYRIGMCQDGDYEYATMKEDGIEANKLYNGHDEWFDDSLPPYLDGKMALKYTSDDETE YKNIMASIKPYVDEKFQSWILGVNDFDSEYDSFIKELKARGIDRALEINQKAYETFLGED R >gi|229784006|gb|GG667729.1| GENE 3 2385 - 3218 1027 277 aa, chain - ## HITS:1 COG:AGl3561 KEGG:ns NR:ns ## COG: AGl3561 COG0395 # Protein_GI_number: 15891903 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 276 19 292 293 223 45.0 4e-58 
MAILGILISLVVLYPIYYVLIASLSQPYYVESGDVMFAVKHFTLASYREAVKKSGIWLAY GNTIFYTVAGVAVNMLFTTTMAYALSKKKLVFRKFFTLFTIFTMWFNAGIIPMYLTFKDF SLLNTRTAIVLGFAINTYNLIIMKSFFEQLPEALEEAAFIDGANHIITFFKIYLPLSKPA LATVGLFYAINRWNSYFWAMQLLNDDRKAPLQVLLKKMIVDRVANASEAAIVTQNSLSSP TTVIYAMIIIAIVPMLVVFPFVQKYFKTGVTVGGVKG >gi|229784006|gb|GG667729.1| GENE 4 3297 - 4232 1029 311 aa, chain - ## HITS:1 COG:AGl3564 KEGG:ns NR:ns ## COG: AGl3564 COG4209 # Protein_GI_number: 15891904 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 18 311 16 309 309 313 52.0 4e-85 MLKQPGGNNKKQSCLHQVRKNWDLYLLLLPGVLWFLAFAYKPMAGLAIAFYDYNIFKGIS GSTFVGLDNFITFLTGRDFLRVLKNTLMISFWQLVVCFPIPIILAIAVTEMKNKFVSKMT QMTTLLPHFVSVVVVCGIVVNFLSPSTGIVNMILEKLGMEPVYFMVQPQYFRGIYTIMTL WQNAGFNALVFIAAIMGIDPQLYEAATVDGAGKWQKIKNITVPAILPTIVTMLVLNIGKM VKVGYEAILLLYQPSTFAKADIIATYSYRLGFENGNYGLATAVGMFEAVVALILVTMANR LSRKLSDTALW >gi|229784006|gb|GG667729.1| GENE 5 4535 - 6010 1171 491 aa, chain + ## HITS:1 COG:ECs4619 KEGG:ns NR:ns ## COG: ECs4619 COG3119 # Protein_GI_number: 15833873 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli O157:H7 # 6 454 5 442 497 131 27.0 3e-30 MQQHKNILIIICDQLSATALSAYGNTYSDTPNLDRLAAGSAVMEYAYTSCPLCQPARASF WTSRYPHQTGVLSNLPDQGFPAVSDGIPTLGELFSRAGYDCVHFGKTHDYGALRGFQVIE SEEIHVPRTNPAIKFDYETFLDIDTTEKSVQYLSSRPEGPFLMVSDLQNPHNICAYIGEH SEGYGDFPLERELPPLPENYDFDDIANRPEFIRYLCCAHRRQRQASGWKEDDFRHYLYAY YYYLAMVDKQIGQILEALAESGATDQTMVVFLADHGEGMASHHLVTKYGAFYEETNRVPF FFSLPAGDNKSGTPGSCNKNAVPLYKKQNRIGGITSLLDLVPTLLDYAGIPCPAGVEGIS LMPQITGAKTRSDRTAAVAEWYDEFRDYTVPGRMICDEEYKYICYLEPDSEELYDMKRDR YEKTNLAGKQEYAPVLERYRSLLKQHLKKSDDPFFTLKTAGTAGYRRHPLGFEHHQGLSA VEKYALEIKRK >gi|229784006|gb|GG667729.1| GENE 6 6024 - 6392 256 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623730|ref|ZP_06116665.1| ## NR: gi|266623730|ref|ZP_06116665.1| conserved hypothetical 
protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 122 1 122 122 194 100.0 1e-48 MFSSGIYKNRLYKILITTGTIVITLLAVLSGSYLHLQQKSSYIHNLSNSTAALEANSNIA MNLISRAVNDVSRDKSITKWVNSSSANDFYFNSITALKQLRIITTDSSMLNYEAGRYYGR PA >gi|229784006|gb|GG667729.1| GENE 7 6358 - 8271 1704 637 aa, chain + ## HITS:1 COG:BS_ytdP KEGG:ns NR:ns ## COG: BS_ytdP COG2207 # Protein_GI_number: 16080067 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 206 633 327 771 772 89 21.0 1e-17 MKLAVTTVGPLEFDGNTYGMVLDQSGTVDRDQFLRRIKGLSEEEMESMDHHFSASNRPFC LPHYEEDRLKSVYYIVKDYRYSSSLLCFVTIPIETLTGSARPGRFLLCSDDRLLAVSHRD DEMDELFSQLINLGMKAAGSGIGQQEYFKLGKEYVFPTTLPAIGWEIAYIYDSYMLDSGQ IVFFLLILVTAACIFTFFIALLVEALYKPIREVVEDSMESPEAGKPIDEFKILRQNSEKI KSLSKTLMEAMDENERLASQQQYRRLLFAPQPESHAECGGEPEEDYSVAVVEFQPVNEGS SPSCIVILKQYVHEFTMNIPDMTFVDLDSARCAMIVKNGGTDEILLQLYELLRYLTQKPE DETINQWIALSNPRSGLNRIWLAYQETLRILEYKHVYGHANILTFEQIRSVDAVTYSYPL SMENRLVHCIVEGKDEALQIFDQLIRTNLADKTLSIESIQSFVYVLIGTLGRVFQELKTS PEALLKEDLNFKYLYEHWNDSVTITTLRHAIQDILTAVNCRGETNDEKLLNEMIHYIHTN YTDDIMLNDMADQFNISPKYCGILFKQLSGQNFKDYLNRYRIEKAKELLQQKPGIKIAEV SLMVGFNSANSFIRVFGKYTGVTPKAYMESLKSSSFS >gi|229784006|gb|GG667729.1| GENE 8 8377 - 8697 261 106 aa, chain + ## HITS:1 COG:no KEGG:Selsp_0314 NR:ns ## KEGG: Selsp_0314 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 7 101 7 105 107 77 48.0 1e-13 MNIRSALENTIPISQFNRGLAGKIFEEVKRYGAKVVMKNNIPECVLLSPEEYIRLLDEAN DARLLATAAQRMSQFNPSTAISQEQVDQEFGFTPLDYEDTDGIEFE >gi|229784006|gb|GG667729.1| GENE 9 8874 - 9014 108 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFKDTFLCDFIYFVIFPDADYNNQHMITASDKLVYDTQTSSTEFYF >gi|229784006|gb|GG667729.1| GENE 10 9951 - 11093 1215 380 aa, chain - ## HITS:1 COG:CT268 KEGG:ns NR:ns ## COG: CT268 COG0860 # Protein_GI_number: 15604989 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # 
Organism: Chlamydia trachomatis # 167 379 53 238 259 89 30.0 1e-17 MSRKKRTGVYIMAALVCLLTLAGCGRNYEPVEVPALEEASGDITPENGEETAAVSIEAET GNGKKEAAKPGRQAGTSAHEAAVTTETAPAFSPVDETVYVTGSQVNIRKSAGTNGAVITT VSKGTALKRTGYSDSWSRVIYQDQECYISSRFVSKEQPAPEPETAAPAVTGNGSGKIIAI DPGHQAKGNSEKEPIGPGASEKKAKVASGTQGNATGIPEYKLTLAVSLKLKEELLNRGYQ VYMIRETDDVNISNAERAEMANRSGADIFVRVHANSLSDTSVHGALTMCQTSKNPYNGNL YSKSSALSKAVVKGICDTTGFKDRGVQETDTMSGINWCKIPVTIVEMGFMSNAEEDKKMA TDEYRAKIAKGIADGIDAYS >gi|229784006|gb|GG667729.1| GENE 11 11090 - 12169 1139 359 aa, chain - ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 1 358 1 354 355 403 57.0 1e-112 MNKPKVVVGMSGGVDSSVAAWLLKEQGYDVIGVTMQIWQDEETEVQEESGGCCGLSAVDD ARRVAWDLGIPYYVMNFKEEFRENVIDYFVDEYRNGRTPNPCIACNRYVKWESLLKRSLD IGADYIATGHYARIEQLENGRYALKMSVTAAKDQTYALYNLTQYQLAHTLMPVGAYSKDQ IREIADRINLKVAHKPDSQEICFIPDHDYARFIEENTGKEPAKGNFVDLDGNVLGRHEGI THYTVGQRKGLNLSMGHPVFVTEIRPETNEVVIGNSADVFSDTLRCSRINWMAIDGLHGG EMRVTAKIRYSHKGAPCTIREVGEDLVECRFDEPQRAITPGQAVVFYLENYVAGGGTIL >gi|229784006|gb|GG667729.1| GENE 12 12217 - 12654 597 145 aa, chain - ## HITS:1 COG:MA2717 KEGG:ns NR:ns ## COG: MA2717 COG0822 # Protein_GI_number: 20091541 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Methanosarcina acetivorans str.C2A # 2 125 3 125 128 171 65.0 4e-43 MYTEKVMDHFEHPRNVGEIENASGMGTVGNAKCGDIMRIYLDIDDNGIIRDVKFKTFGCG AAVATSSMATEMVKGKSIQEAMEVTNKAVCEALDGLPPVKVHCSLLAEEAIHAALWDYAQ KNGIEIQGLKKPKSDIGEEEVEEEY >gi|229784006|gb|GG667729.1| GENE 13 12833 - 14017 1263 394 aa, chain - ## HITS:1 COG:MA2718 KEGG:ns NR:ns ## COG: MA2718 COG1104 # Protein_GI_number: 20091542 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine 
desulfurase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 3 384 4 384 392 468 60.0 1e-132 MKRFIYLDNAATTKTRPEVVEAMLPYFTEYYGNPSSVYEFSNESKKAINRSRETIAEAIG AKTNEIYFTAGGTESDNWALAATAEAYQAKGNHIITSKIEHHAVLHTCEYLEKRGFEVTY LDVDENGVIKLDELKKAIRPTTILISIMFANNEIGTIEPVKEIGEIAKEHGIIFHTDAVQ AFGHVPINVDEYHIDMMSASGHKLNGPKGIGFLYIRTGIKTRSFMHGGAQERKRRGGTEN VPGIVGFGKAVEIAVNTMEERTKRESELRDYLMNRVMAEVPYVRINGHRTSRLSNNVNFA FQFIEGESLLIMLDMDGICGSSGSACTSGSLDPSHVLLAIGLPHEIAHGSLRLTLNADNT MEEMDYVAESIKKIVEKLRGMSPLYEDFVKRQNR >gi|229784006|gb|GG667729.1| GENE 14 14341 - 15825 1018 494 aa, chain + ## HITS:1 COG:BH1233 KEGG:ns NR:ns ## COG: BH1233 COG2244 # Protein_GI_number: 15613796 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Bacillus halodurans # 2 490 23 504 522 143 26.0 6e-34 MLGFFYRIFLSRTIGAEGLGLYNMVHPVFGICFALCAGSIQTAISQSVAANVRKGRSIFR TGLVISVSTSFILAWLIIRFQDFLAGSILMEPRCAPLLTYIAVSVPCAAIHACINGYYYG MQRTRVPAFAQVVEQSVRIGAVFLIADIMMEAGVEITVQLAVAGHLIGEIASTLYTVAAS QFFPPKLPDGTDAAEPDSKGRRLTAAADSFRQTAPALMALALPLMGNRLVLNILSSAEAI WIPGSLNRFGLSNSEAFSIYGVLTGMALPFILFPSTIFNSLAVLLLPTVAEAQSEGNERR IAGTISMSLRYCLYVGILCIGIFTLFGNDLGVSVFKDENAGSFIMTLAWLCPFLYLVTTM GSILNGLGKTSTTFIQNVTALFVRLAFVLFGIPKFGIMAYLIGMLASELLLALMHIFSLK KQVAFSWNAWDMIVKPALLMVMAIGIYYAAGSVCDPFAALPLFIRTGFHILILSLCYLLL LAGAHFFKREIQAD >gi|229784006|gb|GG667729.1| GENE 15 15912 - 16670 581 252 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623739|ref|ZP_06116674.1| ## NR: gi|266623739|ref|ZP_06116674.1| bacterial SH3 domain protein [Clostridium hathewayi DSM 13479] bacterial SH3 domain protein [Clostridium hathewayi DSM 13479] # 1 252 1 252 252 325 100.0 1e-87 MKKYPGIILAMTTAVMITSCSSVFNGGNTKSTGEHAASESGTTEAQTDMIKIENETDPEP SSEPETSPEAETEAPVQESKPRETEALPVYAVTAVQKTMYVTQPVHVRASYTTKSDVLTS FGTGQKVAVTGQSANGWMRITYKGGEAYIYKKYLSSSKPESGTRETAAAPSGDNAGSGQK DSNSSGSQKPSAPSGPSVPGGNDDIPIVEPSPLTPSVGTQNTNSTAGPGSSTAPGSGNGS GTGTITAIGPGM >gi|229784006|gb|GG667729.1| GENE 16 
16766 - 17884 1193 372 aa, chain - ## HITS:1 COG:no KEGG:Toce_1815 NR:ns ## KEGG: Toce_1815 # Name: not_defined # Def: major facilitator superfamily MFS_1 # Organism: T.oceani # Pathway: not_defined # 1 366 9 369 393 245 42.0 3e-63 MYGISLLQGMVFYGPIATLYRQAAGVSVFEITLIESISLALCIALEMPWGVVADRIGYKN TLVISSLIYVASKVVFWKAGGFGGFLLERVMLSVVMAGMSGCDVSLLYLSCREGESHRVF GVYNLLNTVGLLTAALLFSVVVGDDYRLAGFLTVLTYAAAAVLSFGLKEVKEIERSKPVK GRFLTCLKEVARDKRFLLLLVGMAFLGETNQTITVFLNQLKYEACGIAPASMGYIYILVT VSGMAGVWSARCTEQIGMKRFALLLFGAAGISCVVLALTGSRVLAVGGIVLLRLSASLFV PLQTEIQNRHVKAADRVTVLSIYAVLAEGTGIATNLLFGAAAQAALPAAMLCGAVMCALG AAFFLIWHSSAK Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:18:28 2011 Seq name: gi|229784005|gb|GG667730.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld123, whole genome shotgun sequence Length of sequence - 11799 bp Number of predicted genes - 13, with homology - 11 Number of transcription units - 7, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 2 - 287 301 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 2 1 Op 2 . - CDS 284 - 1615 1309 ## COG1823 Predicted Na+/dicarboxylate symporter 3 1 Op 3 . - CDS 1632 - 2909 1507 ## COG3681 Uncharacterized conserved protein - Prom 2937 - 2996 9.5 - Term 3026 - 3074 7.3 4 2 Tu 1 . - CDS 3108 - 3530 430 ## gi|266623745|ref|ZP_06116680.1| putative penicillin-binding protein - Prom 3550 - 3609 1.8 5 3 Tu 1 . - CDS 3638 - 4324 880 ## COG2964 Uncharacterized protein conserved in bacteria + Prom 4510 - 4569 7.9 6 4 Tu 1 . + CDS 4661 - 6082 384 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Term 6114 - 6166 12.1 + Prom 6226 - 6285 8.1 7 5 Tu 1 . + CDS 6454 - 6540 107 ## - Term 6536 - 6592 2.4 8 6 Tu 1 . 
- CDS 6814 - 6987 101 ## + Prom 7176 - 7235 5.7 9 7 Op 1 39/0.000 + CDS 7325 - 8257 1092 ## COG0226 ABC-type phosphate transport system, periplasmic component + Prom 8267 - 8326 2.2 10 7 Op 2 38/0.000 + CDS 8353 - 9288 903 ## COG0573 ABC-type phosphate transport system, permease component 11 7 Op 3 41/0.000 + CDS 9281 - 10183 852 ## COG0581 ABC-type phosphate transport system, permease component 12 7 Op 4 32/0.000 + CDS 10281 - 11045 198 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 13 7 Op 5 . + CDS 11094 - 11756 672 ## COG0704 Phosphate uptake regulator Predicted protein(s) >gi|229784005|gb|GG667730.1| GENE 1 2 - 287 301 95 aa, chain - ## HITS:1 COG:DR0992 KEGG:ns NR:ns ## COG: DR0992 COG0446 # Protein_GI_number: 15806015 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Deinococcus radiodurans # 1 92 1 91 449 91 53.0 3e-19 MKTVIIGGVAAGMSAASKLKRLLGDAVEIVVYEKGGEVSFGACGIPFYISDHIKQGKELI ARTAEEFAKSGIPVKTYHEVTAVDADAKTVTVKNS >gi|229784005|gb|GG667730.1| GENE 2 284 - 1615 1309 443 aa, chain - ## HITS:1 COG:VC1235 KEGG:ns NR:ns ## COG: VC1235 COG1823 # Protein_GI_number: 15641248 # Func_class: R General function prediction only # Function: Predicted Na+/dicarboxylate symporter # Organism: Vibrio cholerae # 14 443 19 459 471 136 25.0 8e-32 MSTIGFLGLTRAESLAGVIVFFALLFGISRLPKKKFSFASRVLIGTGTGTVYGLIVWGVS GRTGVYAADLGRWFSLVGNGYVSLFQLLIAPVVFTAGVRLVIHTPLSKTETPLTKWKKWV NTLMLAASALIGACFGLWLKVGAAPGADTVVFSWDVGEGQKITDSIAHLIPSGIGFDFLT GNVAGIFVFAVFIGIAARRMSGKYMDMVKPFLDFTDAAFSVITSVCKAVIAYKPMGAAAV MAAFTAAYGPAFLLMLVKMLLVLCAASLAMLAVQMILCAVSGVSPAAFFRAGKAAMKKAL VTRSGSACLPDAQEALAAGMGLNKEITDEVSAYAIASGMQGCGALFPAMAAVFAAGLSGL TMTPALVFGLVVVIVLVSYGITGVPGTATMAEFGAVMGSGITGAVNGLGPMIAVDPIGDV PRTLINVTGCMTNAIIVERRVRE >gi|229784005|gb|GG667730.1| GENE 3 1632 - 2909 1507 425 aa, chain - ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # 
Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 22 424 3 411 411 339 47.0 6e-93 MERTGIQYSTYVQILKEELVPAMGCTEPIALAYCAAKAREVLGCLPERCVVEASGNIVKN VKSVIVPNTGGLKGIEAAAAAGIAAGDASRILEVIAGVTKEQKAQIREYLDHAEICVKPL DTDEILDMVVTVYGGGSCVKVRIAGYHTNIVLIEKDGEVLYKVGAVAAREAQMADRSLLN VEDIYEFAKTVDLDDVKEVIGRQIAYNRAISEEGLKNDWGANIGSVLLKAYGDDIRVRAR AAAAAGSDARMSGCELPVIINSGSGNQGMTASLPVIEYAKELGSSEEELFRALVLSNLIT IHQKTGIGRLSAYCGAVSAGCGAGCGIAFLLGGDYKTIAHTLVNGLAIVSGIICDGAKPS CAGKIAAAVDAGIMGYYMYLNGQQFKGGDGIVSKGVENTITNVGRLGKDGMRETDKEIIR IMTEC >gi|229784005|gb|GG667730.1| GENE 4 3108 - 3530 430 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623745|ref|ZP_06116680.1| ## NR: gi|266623745|ref|ZP_06116680.1| putative penicillin-binding protein [Clostridium hathewayi DSM 13479] putative penicillin-binding protein [Clostridium hathewayi DSM 13479] # 1 140 1 140 140 246 100.0 4e-64 MSKKSMILMIVTLLCIITTVVLRGVVDKADAEYTEVEVRVVSSDTVYRKILGKRQMQYDV FVSYLGQEYELKNAHSSAPYIPGRSVTAYLSKGNLYANIEGVKTSTPAAMVYFGFLFASF AMLITTLSSFGSGRKKLTAV >gi|229784005|gb|GG667730.1| GENE 5 3638 - 4324 880 228 aa, chain - ## HITS:1 COG:SPy0144 KEGG:ns NR:ns ## COG: SPy0144 COG2964 # Protein_GI_number: 15674354 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 9 225 4 223 226 134 35.0 2e-31 MRKGTVELDILDYYKKLVDFLGEVLGENAEVVLRDCRKPDHDIIAIANGHVSGRTIGAPI TDFTLSILANEEWKEKDYVVNYEGKAAPDKRLRSSTYFIREEGKLVGQLCVNIDMTPYEQ VMDRIRQLSGMGLMSDGGQSGIICSGPVENFSEDVIGDMMKKAVITVVGSSEAKVRERLT QKEKIEIIGELNRAGLFQLKGAVGAVAEYLYCSEASVYRYQSKIQKKT >gi|229784005|gb|GG667730.1| GENE 6 4661 - 6082 384 473 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 12 469 6 426 447 152 26 1e-36 MKQETNYSSPYELDGKLPLKTAFPLGLQHVLAMFAGNLTPILLIAGACGITAGSAEQVAI LQNAMLVAGIVTLVQLFSIGPVGGKLPIVMGTSSGFIGVCQSVAGVMGGGVAAYGAILGA SLLGGLFETVLGFFLKPLRRFFPSVVTGTVVLSIGLSLISVGIGSFGGGSSSKDYGSLEN 
LFLGFVVLLSIVLLKHFCKGFASVSSILIGIIIGYVVAAVMGFLLPNTFTFTDPATGAAA ELTKSWVLNWNKVAEASWFSIPKLMPVKLVFDARAIIPIGIMFIVTAVETVGDISGITEG GMGREATDKELAGGVMCDGLGSSFAALFGVLPNTSFSQNVGLVGMTKIVNRFAIATGGVF LIACGLFPKLAAIVSIMPQSVLGGAAVMMFASIVVSGIQLITKEEVNTRSITIVSVALGL GYGLGSNAAALTYLPEPVRLIFGGSGIVPAALVAIILNVCLPKESAAEKTAKK >gi|229784005|gb|GG667730.1| GENE 7 6454 - 6540 107 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKHFDFKDLMAFGMFILALLTFIYLICR >gi|229784005|gb|GG667730.1| GENE 8 6814 - 6987 101 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKMPHAETRSVLSAAFIVFMILLYTLPRSVRPESMNYGMPFGYDVMICYQQPILVYV >gi|229784005|gb|GG667730.1| GENE 9 7325 - 8257 1092 310 aa, chain + ## HITS:1 COG:MTH1727 KEGG:ns NR:ns ## COG: MTH1727 COG0226 # Protein_GI_number: 15679719 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Methanothermobacter thermautotrophicus # 70 309 32 269 271 170 42.0 3e-42 MKKNMVKTLALAGMAVLTMGTLAGCGSNTTESTAAPTTAQSEAETTTAGTASAETTTAAP AASADLKGSISMVGSTSMEKFATALSEIFMEKYPNITVQAEFVGSGAGVEAVNNGSADIG NSSRELKDEEKANGIAENIVAIDGIAVVVDPANTVEDLTKDQLTSIYNGTITNWKEAGGT DQPIVVIGRESGSGTRSAFEEILDVADACKYSNELDSTGAVMAKVASTPGSIGYVSLDVL DDTVKTLKLDGTEATPENIKAGSYFLSRPFVMATKGEISEQNDLVKALFDYIYSDEGTEL VKSVGLIPAN >gi|229784005|gb|GG667730.1| GENE 10 8353 - 9288 903 311 aa, chain + ## HITS:1 COG:VCA0071 KEGG:ns NR:ns ## COG: VCA0071 COG0573 # Protein_GI_number: 15600842 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Vibrio cholerae # 25 302 56 321 327 195 41.0 9e-50 MHPKSYSIFAGRGSKSAVEHAAQIIFTVCGFFAVLAVASITAYMIISGTPALFKVGILDI LFGTVWKPEAAVPAFGILYVILTSIAGTALAILIGVPIGVMTAVFLAEVAPPRLAGIVRP AVELLAGIPSVIYGLLGILILNPLMYKLELKIFEGSTTHQFTGGANLISAVLVLALMILP TVINISESALRSVPKELRSASLALGATKIQTIFQVVLPAAKSGIITAIVLGTGRAIGEAM AISLVSGSSVNFPLPFNSVRFLTTAIVSEMSYSAGLHKQVLFTIGLVLFAFIMIINISLT RLLKKGDVQHD >gi|229784005|gb|GG667730.1| GENE 11 9281 - 10183 852 300 aa, chain + 
## HITS:1 COG:VCA0072 KEGG:ns NR:ns ## COG: VCA0072 COG0581 # Protein_GI_number: 15600843 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Vibrio cholerae # 22 300 9 289 289 221 46.0 9e-58 MTNEIAVPFHKDSVINNSICRRQTRVSDSVLTFFIYLCASISILILIGIMGYVFFRGIPQ INWEFLSTVPSTIRGTFGILGNIVNTLYIVVITLLIATPLGVGAAIYLNEYAKGGRAVCL IEFTTETLSGIPSIIFGLFGYVFFGTTLGLGYSILTGALTLSIMVLPLITRTTQEALKTV PDSYRSGALGIGATKWYMIRTILLPSAMPGIVTGVILAIGRIVGESAALLFTAGSGYLLA KDFFGKIFESGGTLTIQLYLSMQKAKYDQAFGIAVVLLLIVLAINGLAKYLSHKFNKSLY >gi|229784005|gb|GG667730.1| GENE 12 10281 - 11045 198 254 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 236 7 230 318 80 30 4e-15 MNTSTVKIATNGLDLYYGTNHALKNISMELYTNKITAFIGPSGCGKSTFLKTLNRMNDLV PNVKIEGEVLLDNENIYDQRVDTTLLRKKVGMVFQQPNPFPMSIYDNVAYGPRIHGIHSK SELDAIVEKSLKGAALWDEVSDRLKKSALGLSGGQQQRLCIARALAVEPEVLLMDEPTSA LDPISTLKIEDLMDELKSKYTVAIVTHNMQQATRIADYTAFFLVGEVVEYAATTELFSSP KDSRTEDYITGRFG >gi|229784005|gb|GG667730.1| GENE 13 11094 - 11756 672 220 aa, chain + ## HITS:1 COG:CAC1709 KEGG:ns NR:ns ## COG: CAC1709 COG0704 # Protein_GI_number: 15894986 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Clostridium acetobutylicum # 3 212 2 212 219 137 38.0 2e-32 MTTRINYDHELTLLNDDIKQMGFMVESSIEQCFIAFEDQDFEKAEGIIRGDRTINDLERS IEARCLSLILRQQPVAGDLRVVSSALKVVTDLERIGDHASDIAELVLRIRGEHVYHVVRH IPGMAMAAREMVRNAIDAFITQDIDAAKSIEKQDDIVDELFTKVKNDVIDLLKQSADHAD QCIDLLMIAKYLERIGDHAVNVCEWTEFAKTGALKNVRIL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:18:48 2011 Seq name: gi|229784004|gb|GG667731.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld124, whole genome shotgun sequence Length of sequence - 15818 bp Number of predicted genes - 19, with homology - 17 Number of transcription units - 7, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score 
pairs(N/Pv)
1 1 Op 1 . + CDS 26 - 217 61 ## gi|266623754|ref|ZP_06116689.1| helicase, RecD/TraA family
2 1 Op 2 . + CDS 236 - 2038 1275 ## BBR47_51310 hypothetical protein
3 1 Op 3 . + CDS 2057 - 4138 1074 ## COG4548 Nitric oxide reductase activation protein
4 1 Op 4 . + CDS 4151 - 4882 357 ## gi|266623757|ref|ZP_06116692.1| conserved hypothetical protein
+ Prom 4905 - 4964 7.3
5 2 Op 1 . + CDS 5004 - 5258 311 ## EUBREC_1377 hypothetical protein
6 2 Op 2 . + CDS 5251 - 5748 596 ## COG2405 Predicted nucleic acid-binding protein, contains PIN domain
+ Term 5834 - 5870 -0.6
7 3 Op 1 . + CDS 7465 - 7572 107 ##
8 3 Op 2 . + CDS 7625 - 7855 94 ## bpr_I1231 RNA-dependent DNA polymerase
9 3 Op 3 . + CDS 7833 - 8012 120 ## COG3344 Retron-type reverse transcriptase
10 4 Tu 1 . + CDS 8941 - 9018 73 ##
11 5 Op 1 . + CDS 9135 - 9434 177 ## gi|288871198|ref|ZP_06116698.2| conserved hypothetical protein
12 5 Op 2 . + CDS 9431 - 10282 554 ## Closa_0805 hypothetical protein
13 5 Op 3 . + CDS 10312 - 10680 326 ## gi|266623765|ref|ZP_06116700.1| conserved hypothetical protein
+ Prom 10733 - 10792 1.8
14 6 Tu 1 . + CDS 10847 - 11728 709 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes
+ Prom 11756 - 11815 1.7
15 7 Op 1 . + CDS 11835 - 12620 126 ## gi|266623767|ref|ZP_06116702.1| conserved hypothetical protein
16 7 Op 2 . + CDS 12632 - 13051 325 ## gi|266623768|ref|ZP_06116703.1| hypothetical protein CLOSTHATH_05085
17 7 Op 3 . + CDS 13108 - 13851 451 ## gi|288871199|ref|ZP_06116704.2| conserved hypothetical protein
18 7 Op 4 . + CDS 13865 - 14425 361 ## Closa_1107 hypothetical protein
19 7 Op 5 .
+ CDS 14431 - 15817 316 ## Closa_1108 hypothetical protein Predicted protein(s) >gi|229784004|gb|GG667731.1| GENE 1 26 - 217 61 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623754|ref|ZP_06116689.1| ## NR: gi|266623754|ref|ZP_06116689.1| helicase, RecD/TraA family [Clostridium hathewayi DSM 13479] helicase, RecD/TraA family [Clostridium hathewayi DSM 13479] # 1 63 9 71 71 119 100.0 9e-26 MLKRNIYYTGVTRAEFRVHLVGSKKALYMAISNSDIGKRNTLLAVRLRAEAQRQGLIPGQ EAA >gi|229784004|gb|GG667731.1| GENE 2 236 - 2038 1275 600 aa, chain + ## HITS:1 COG:no KEGG:BBR47_51310 NR:ns ## KEGG: BBR47_51310 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 262 553 14 312 373 127 31.0 1e-27 MMKIFENWTFTRILPEPFAGVAANRGKKDTTEASCLSVHNHMSRMKRATLHASTLKALLA VSAIIRGDQEGAFGVQTVSGTVKWYCMEYAKPVAAGSEIRIGVYHPQDGKWKGIVKNGTT LAAYPQIGERGASGREIIMMFIFASLAEGSLHDDEFSKAFLLLNDEIKDGYPDADKATDL AFLCCDNLYRRIVNSASLGESGILFDGDSIASGNVPLILTSQLADGDYNPTETIYGKFQV LECKDCERTIAELKEQYDRKFELTDEEKNLVPKLPEKYNVSAEAQEIMDAVINSPMRVFL SVGEAGTGKTTNAKIIAQLLGLPYYAFTCSEGTDEVDLVSSMVPNTGERPGKIQIPMPTY QDLIMDPASALAQICGLYDNEISEEEAFRKILTMFYQNGFENGKDDKKFVMVESPIVTGC RRPSLVEIQEPSVITKPGTLVKLNGLLDEGAAITLTSGEVVKRHPDTVILITTNMGYKGC RGFNESVLSRMRMVHYLEPLNAEAMVARVKKKVKFDDETFLKKMADIVCEVQKHCNTEMI TGGVCGYREYEDWVWAYLIQKDILKAAKCTLVAKAAPEPEEREEIYKKLVIPAFTEAHAA >gi|229784004|gb|GG667731.1| GENE 3 2057 - 4138 1074 693 aa, chain + ## HITS:1 COG:SA1240 KEGG:ns NR:ns ## COG: SA1240 COG4548 # Protein_GI_number: 15926988 # Func_class: P Inorganic ion transport and metabolism # Function: Nitric oxide reductase activation protein # Organism: Staphylococcus aureus N315 # 318 659 252 585 628 70 23.0 2e-11 MNKKVQQKHNQIKNRMGRLTDKDFFLSQSYHGTMLKSVRLLGQKNDITLFMDYEESDDAR IAFTNGRLLYLNTANPITNLMTSRVSKIKSHEGFIAHECGHLRCSDFKRRGRYVSGFSRW RVYPKPPQVQLVYERKAWEEMKGYLNAHNVVAASVIQKTASYINNVLEDVYIESFMCQKY PGSIQNGIQRNAALIIHHIPTQEARKAEKSDGLTIMLDMIFRYARAGRTEAEKEYDKQYC SRLNSCRKIIDEAVVSADPDIRFYATNRLMLKLWKYIRQAIKTAAKSLKNEINRLSEEEL 
SKKIQEYLNRKMLWVALSETIGASEEQNEVEEEIEGWDGELEGEPESQNQNGKNEELENA LEKMRDGQRQEGGEEETEEGSEIDETLSNLLQKIAEEKFLLDEESDLKRNLEEEAADFKL DGIHKDSVIEMHRMTTVPQGLEKEYKMIAPEINRITKRLEASLEDVLERLEGGTQSGLYM GKRLSRGNLYRLDDKIFEKTIKPEDGFSIAFAVLLDLSGSMSSGGRIESAQKAGLVMYTF CRNLGIPVMLYGHTTHDTGYTEVVDIYSFADFDSVDNQDYLRIMSVSTYDCNRDGVALRF VGQKLLNRPEDIKILLMISDGQPYAQGYKGEIAKADLQEAKYSLEKRGVKLFSAAIGDDR KMIEEIYKDGFLNIADFNTMPVKLAGLIARFIR >gi|229784004|gb|GG667731.1| GENE 4 4151 - 4882 357 243 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623757|ref|ZP_06116692.1| ## NR: gi|266623757|ref|ZP_06116692.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 243 1 243 243 494 100.0 1e-138 MAEREERDWSNVVSGQIAVYKTDKALLELMDNLKPASYLFPAHIHASGDVSEVGERSLIR LNMLDYSKGTGDNTVSVYANISPEEAKYIYSALFNHLWEFDYPQEKIFGNPDKEGYSIVT KLNIVRYDTDSQGNKRNYPWFVEIQNGRGIAMKNSNGGRYCKRNSYICDKKASLYINDKD LFILFCKTDAYIRAFEMEYAFRENRIGNFTSLYGLLKNEIQGAADRLLTCIHREDEEPMK KAG >gi|229784004|gb|GG667731.1| GENE 5 5004 - 5258 311 84 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1377 NR:ns ## KEGG: EUBREC_1377 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 84 1 84 84 106 60.0 3e-22 MCQISIRIPDAVMYDTHMSEEEAAAFARCMVAVGYYTQNNVSIGYCAQIAGMTEEEFIKY LGKRKVSIFQFDNKAEFLEELENA >gi|229784004|gb|GG667731.1| GENE 6 5251 - 5748 596 165 aa, chain + ## HITS:1 COG:AF0598 KEGG:ns NR:ns ## COG: AF0598 COG2405 # Protein_GI_number: 11498206 # Func_class: R General function prediction only # Function: Predicted nucleic acid-binding protein, contains PIN domain # Organism: Archaeoglobus fulgidus # 4 161 3 149 150 108 42.0 3e-24 MRRVIVNSTPIIVLCNIGLLDILKALYAEICIPEAVYREVTEKDDSACQVLKTALNWIHV EKIANQSDKKMYKAKLHDGEVEVMILAQEGTADDLVVIDDNAAKKTAKYLGLKVTGTIGV LLKAKREGIIPEIASVLEKMKKNGFYISESLEQLVLEQAGERHNI >gi|229784004|gb|GG667731.1| GENE 7 7465 - 7572 107 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVPGEVCAECPERDNHYREIVLNVQKSAEVVVVER >gi|229784004|gb|GG667731.1| GENE 8 
7625 - 7855 94 76 aa, chain + ## HITS:1 COG:no KEGG:bpr_I1231 NR:ns ## KEGG: bpr_I1231 # Name: not_defined # Def: RNA-dependent DNA polymerase # Organism: B.proteoclasticus # Pathway: not_defined # 5 66 6 67 464 95 77.0 4e-19 MGTENKDSCSQRDSAEREGYVRAHRSFRRIWKERDSAQPELLEKILNKDNLNRAFKRVKA NKGAPGNRRNDRRGNR >gi|229784004|gb|GG667731.1| GENE 9 7833 - 8012 120 59 aa, chain + ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 1 54 83 136 470 65 61.0 3e-11 MTVEETGDCLRENQKELIERIKRGKYTPDPVRRVEIPKPEGGVRKLGIPTVKDRIFLAS >gi|229784004|gb|GG667731.1| GENE 10 8941 - 9018 73 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTKERLINSGYYDLATAYQSVHVNC >gi|229784004|gb|GG667731.1| GENE 11 9135 - 9434 177 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871198|ref|ZP_06116698.2| ## NR: gi|288871198|ref|ZP_06116698.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 99 16 114 114 178 100.0 1e-43 MRDSYNPEGYHCLIIAILMGVNAREARFLYEHGLNNPISQKILKKKHPKIVRVSTRKERK EVIQQLRSEGYSIEAIADILNCDHSTVKRNSKLKRRFTS >gi|229784004|gb|GG667731.1| GENE 12 9431 - 10282 554 283 aa, chain + ## HITS:1 COG:no KEGG:Closa_0805 NR:ns ## KEGG: Closa_0805 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 280 1 306 321 149 32.0 2e-34 MMTLEKFAEAVSSEVAKRLGAGFHVHVLKPVINNGIVQTQLCIEKEGQQFRPSIHLKDFF ECYKKGTGLQAIIDEILECYASAPDIPAGMQEIVNDMSFEKIKDKIMCKLVNTHDNASML ENIPNIPYLDLSIIFLLLCEETEEIAATLMITNKHLNYWSVDKEELYQRALVNMVNKYPP VLMKITDMMYVLTNRIHFCGAEVLLYPGLMRYCELHVGKDFLIIPYSVDEVILFPTLGVE DAVVSQELADLIQSINAEEVPADERLSNHAYRYCSAEDKVICI >gi|229784004|gb|GG667731.1| GENE 13 10312 - 10680 326 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623765|ref|ZP_06116700.1| ## NR: gi|266623765|ref|ZP_06116700.1| conserved hypothetical protein 
[Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 122 1 122 122 239 100.0 5e-62 MHIHKFADIASFAEIGVGGNLPATEEYREFIKKLHPTQFLTGRLTAPLYEVEYSYVTVRG NYRKAYKYILLRLEHDDLDLEIEMIFSDWVEELNRKCPYRRILNAQILKIKPIAYATIPF EI >gi|229784004|gb|GG667731.1| GENE 14 10847 - 11728 709 293 aa, chain + ## HITS:1 COG:MA2894_2 KEGG:ns NR:ns ## COG: MA2894_2 COG0175 # Protein_GI_number: 20091715 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 31 218 28 223 379 78 30.0 1e-14 MEQYRLFDDFSDKVKRAIQLFQMFEDAAVRYHKDGYYLAFSGGKDSVVLYVLAKMAHVKF QAHYHLTTVDPPELVWFIREAFPDVKVEYPELTMWDLIVKKQMPPLRTARYCCEVLKEGG GEGAFTVTGVRWEESRKRANRDFLEIHKKKGAIYLNCDNEEDRRKLEICSLKGKRILNPI IDWTEEDIWYFIRKYQVPYCCLYDQGFCRIGCIGCPLASIRSREREFERYPKYRDAYIRA FDRMVKARTASGKSKGNWKDGESVFRWWMYGNEKTDKQVAGQMELKDFIDMAA >gi|229784004|gb|GG667731.1| GENE 15 11835 - 12620 126 261 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623767|ref|ZP_06116702.1| ## NR: gi|266623767|ref|ZP_06116702.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 261 1 261 261 516 100.0 1e-145 MKKQSCRNCHNIDLNKKEETEGRLSGRYRYGCTVQRSGFICGFIISDEKLEALVCPNWKG GKMEEADYKRLADEFGKRLQILYDRWNMWKIRGCPEADVPDGEYLNRLRSGIEAMMRQIE NTFVEADYPECYYAPLPPVMDVDYMANCQQIKETAIRALEEYRNNKDYLWLADHIQHLDN EDKENSEAYRLLCHVQSLEEAICEDAYLQMKRVSFQESLYDDMANCKRRILKRKCRPSNR KSKKNSPQIVGQLRIGDLKAS >gi|229784004|gb|GG667731.1| GENE 16 12632 - 13051 325 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623768|ref|ZP_06116703.1| ## NR: gi|266623768|ref|ZP_06116703.1| hypothetical protein CLOSTHATH_05085 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05085 [Clostridium hathewayi DSM 13479] # 1 139 1 139 139 235 100.0 7e-61 
MDHLMLVKNLFNDMFVFLNGTQPLPLESLDDVYQGEPLFLALIGGLDQALLVDYNSAMQE SYGFYKKYCGRELTEEEWEQVVEEIQMFIDKWNNSWCKGMILALLALMEQEEDERKGEGK MEQAESSGDEELDSMDNAA >gi|229784004|gb|GG667731.1| GENE 17 13108 - 13851 451 247 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871199|ref|ZP_06116704.2| ## NR: gi|288871199|ref|ZP_06116704.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 247 3 249 249 402 100.0 1e-110 MIEKLYKKVKNIQDFNETASKLRQMGLKDELKNLAGKYKVPEVDLKEFLSGKRYFLIDGG ETQKDYMSARSKLLDEMFNLNDPQFTDVIGNYLIRCCDKSDFADLVLQKHKTLQRCVESI MSRAYEMVSDEVKKSGRYANMAVLEDKVFEWVRDYYLTDDREKEAERRTEAEEAFHKRLS AQMNTPKGNKNKTAKKRNASKKNKAGGANSNASEKEGKHSSAKTEAADQNKDQIEGQVSL FENGLAE >gi|229784004|gb|GG667731.1| GENE 18 13865 - 14425 361 186 aa, chain + ## HITS:1 COG:no KEGG:Closa_1107 NR:ns ## KEGG: Closa_1107 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 184 1 178 182 177 49.0 2e-43 MLIAFKGFDKDLTCTNHGNTYQYRLGVWNEEDKANCGINGMHCAENPLDCLSYYPDWDRA VYYMVVADGDIDEDAYDSKISCTRLRLIKKLDKGEFVAHSLKYLFDHPFRKNNHLVCLNQ GVAKNGIVIVRGENPVAKGAIGDVLGFAKEVRSQSEITEIGIHAVDGKEILPDTWYSING KIRKAG >gi|229784004|gb|GG667731.1| GENE 19 14431 - 15817 316 462 aa, chain + ## HITS:1 COG:no KEGG:Closa_1108 NR:ns ## KEGG: Closa_1108 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 462 1 470 734 325 39.0 4e-87 MKKIRLKKLGHLYATDEMLCMAEQDIPENKKIGWQRVEPVFQREVYLQSKICDGILMVAI YLARDLRLGSKKPLYEIFIDKSKREYLTWDTVKGKWRTACVEALEFPHYYSYSCAYITPE EDIRLAEYLGVTQKGMKGIYQYQQSILEERLENRYKKETSLWEAAMKLVPDVPKDWLRWV NRHGLNENFIFYDYSKNVKEGFCTWCEKIVPVKKARHNTYGTCICCGHRIQYKAKGKAGR LCTKEEQVYLPQKYGDGLIIRQFTAQRFYQKGEYKTPKIMCNETGRVIYDKNLTDTQYYY GRYKQRGYRWIKGYPSYSFFYGYNDYKLNHAGAVYKRTVPALSRHILNRTGLPQLISTGY KISPNDYLSGLAEAPYLERFIKAGLKHLTLDALKGRIEVSESHSLAKSLGIDGNRLGRLR NNDGGELFLIWMRYEKKKRKNIVDSVICYFEEQDIRPENIKF Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:20:30 2011 
Seq name: gi|229784003|gb|GG667732.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld125, whole genome shotgun sequence Length of sequence - 17235 bp Number of predicted genes - 26, with homology - 24 Number of transcription units - 7, operones - 7 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 514 231 ## COG0338 Site-specific DNA methylase 2 1 Op 2 . + CDS 511 - 1881 263 ## COG0270 Site-specific DNA methylase 3 1 Op 3 . + CDS 1915 - 2055 184 ## + Term 2255 - 2292 1.2 + Prom 2238 - 2297 6.1 4 2 Op 1 . + CDS 2391 - 2558 126 ## gi|266623775|ref|ZP_06116710.1| conserved hypothetical protein 5 2 Op 2 . + CDS 2611 - 2814 154 ## gi|288871203|ref|ZP_06116711.2| conserved hypothetical protein 6 3 Op 1 . + CDS 3003 - 3194 125 ## gi|266623778|ref|ZP_06116713.1| conserved hypothetical protein 7 3 Op 2 . + CDS 3184 - 3465 190 ## gi|266623779|ref|ZP_06116714.1| conserved hypothetical protein 8 3 Op 3 . + CDS 3428 - 3871 261 ## Closa_1384 hypothetical protein + Term 3951 - 3994 7.1 + Prom 4013 - 4072 4.4 9 4 Op 1 . + CDS 4092 - 4616 273 ## gi|266623781|ref|ZP_06116716.1| hypothetical protein CLOSTHATH_05099 10 4 Op 2 . + CDS 4661 - 5986 821 ## COG1783 Phage terminase large subunit 11 5 Op 1 . + CDS 6923 - 7192 149 ## CHY_1680 prophage LambdaCh01, PBSX family terminase large subunit 12 5 Op 2 . + CDS 7253 - 8581 726 ## Clole_0815 bacteriophage portal protein, SPP1 GP6-like protein 13 5 Op 3 . + CDS 8587 - 10359 1027 ## ELI_3213 hypothetical protein 14 5 Op 4 . + CDS 10356 - 10499 193 ## + Term 10566 - 10600 -0.3 + Prom 10548 - 10607 2.8 15 6 Op 1 . + CDS 10680 - 11267 664 ## LKI_09680 putative scaffolding protein 16 6 Op 2 . + CDS 11270 - 12307 757 ## ELI_3208 coat protein 17 6 Op 3 . + CDS 12325 - 12693 342 ## gi|266623789|ref|ZP_06116724.1| conserved hypothetical protein 18 6 Op 4 . + CDS 12695 - 13075 310 ## Clole_0823 hypothetical protein 19 6 Op 5 . + CDS 13086 - 13511 194 ## Clole_0824 minor capsid 20 6 Op 6 . 
+ CDS 13508 - 13963 375 ## gi|266623792|ref|ZP_06116727.1| hypothetical protein CLOSTHATH_05111 21 6 Op 7 . + CDS 13964 - 14440 417 ## gi|266623793|ref|ZP_06116728.1| hypothetical protein CLOSTHATH_05112 + Term 14614 - 14667 9.1 22 7 Op 1 . + CDS 14690 - 15670 352 ## BLJ_1240 hypothetical protein 23 7 Op 2 . + CDS 15495 - 15818 195 ## BLJ_1240 hypothetical protein 24 7 Op 3 . + CDS 15836 - 16048 299 ## BLJ_1241 hypothetical protein 25 7 Op 4 . + CDS 16094 - 16570 436 ## BLJ_1242 hypothetical protein 26 7 Op 5 . + CDS 16567 - 17067 455 ## BLJ_1243 hypothetical protein Predicted protein(s) >gi|229784003|gb|GG667732.1| GENE 1 2 - 514 231 170 aa, chain + ## HITS:1 COG:lin0088 KEGG:ns NR:ns ## COG: lin0088 COG0338 # Protein_GI_number: 16799166 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Listeria innocua # 4 162 104 264 270 108 37.0 6e-24 NFYIRLNMGHGFRTNGEKVGWKNDVQGRERAYAAKDWCNLPEKIMAAAERLRGVQIENMP AVELIKRFNHSNVLIYADPPYVLSARHGKQYRYEMDNGAQTELLEVLHAHKGPVLISGYD SELYNDSLHDWYRVETDCYSQIASKKREVLWMNFAPAGQMSIKDFLEVRP >gi|229784003|gb|GG667732.1| GENE 2 511 - 1881 263 456 aa, chain + ## HITS:1 COG:mlr8517 KEGG:ns NR:ns ## COG: mlr8517 COG0270 # Protein_GI_number: 13477024 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Mesorhizobium loti # 4 359 27 408 667 202 35.0 1e-51 MSGLIIDCFAGGGGASVGIEMALGRPVDIAINHDPQAIRMHKVNHPDTLHLTEDIFKVDL KKYVAGRHVALMWASPDCTSHSKAKGGQPRNKGLRILPWAVYKHAKAILPDVILMENVEE IQQWGPLDEAGHPIKERAGEDYKRFIAAMKRLGYDFDSRELVAADYGAPTTRKRWYAIFR RDGNVITWPEPTHSKSGADGRLKWLECGDYIDWSDLGRSIFDRPRPLADATMKRIANGYV KYVVNNPQPYIVNNQSAVSFMIQYHGETREGDSRGQLLTEPIKTIDTSNRYGLVTAFVTK FYKSGTGQMCEEPLHTITTSPGHFGLISAFLIKYYGTGCGQEAGRPLGTITTKDRFGLVN VITDIDGEQYILKDIFLRMLKPEELKRMQGFPEDYILNHDIEGKPYPVGEQVARIGNSVV PIMAQALVSANCPYLKVGERMPNMRIDDSHEQLRFA >gi|229784003|gb|GG667732.1| GENE 3 1915 - 2055 184 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MKKVILSGVKSACARIVKIAIAIIINVKIVTEPVKSIAPLTFSERR >gi|229784003|gb|GG667732.1| GENE 4 2391 - 2558 126 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623775|ref|ZP_06116710.1| ## NR: gi|266623775|ref|ZP_06116710.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 55 1 55 55 98 100.0 2e-19 MARPRKAEGEKYIRQDISMEPGQFRRLMAYCQREDRSISWVIRKALEMFLVCSDT >gi|229784003|gb|GG667732.1| GENE 5 2611 - 2814 154 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871203|ref|ZP_06116711.2| ## NR: gi|288871203|ref|ZP_06116711.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 67 22 88 88 126 100.0 6e-28 MGLIDAFGKEDRVEVTFSAFYELMKGCAERDFLAKGIMCNVPHRYMREMVTGKSEDEGND NDRVEDN >gi|229784003|gb|GG667732.1| GENE 6 3003 - 3194 125 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623778|ref|ZP_06116713.1| ## NR: gi|266623778|ref|ZP_06116713.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 63 1 63 63 104 100.0 2e-21 MDRKEEHAIALQSAQARAAKQEYILKGPRPETHSATMPAYCYTPACPDPKLRAPIWRRNK HGI >gi|229784003|gb|GG667732.1| GENE 7 3184 - 3465 190 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623779|ref|ZP_06116714.1| ## NR: gi|266623779|ref|ZP_06116714.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 93 1 93 93 166 100.0 7e-40 MAFSKKYEVYDRGALLGQYYADEVSELIGIPLRRVSAYAASGARFLRRYTIEAVDDTEPG WAEEWDRVRIEKLAWLRGGGAVGQRGTGAVQQP >gi|229784003|gb|GG667732.1| GENE 8 3428 - 3871 261 147 aa, chain + ## HITS:1 COG:no KEGG:Closa_1384 NR:ns ## KEGG: Closa_1384 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 146 1 149 150 107 43.0 2e-22 MDKEVLEQYSSLKAEYLDLQDEIRKLEKQIRKMETSRCQVSDSVKGTRPDGTYGSITITG 
FPVPDYYRRKKLLEKRKANLSKFELQLLELTNDVDDYINSLADSRMRRMIRYKFFDELSW VQVAHRMGGKYTADSCRKQIERFLEEK >gi|229784003|gb|GG667732.1| GENE 9 4092 - 4616 273 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623781|ref|ZP_06116716.1| ## NR: gi|266623781|ref|ZP_06116716.1| hypothetical protein CLOSTHATH_05099 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05099 [Clostridium hathewayi DSM 13479] # 1 174 3 176 176 340 100.0 3e-92 MLGLIINSTATPPNIDGIKTDASITILVVIAVIALISPIFVATINNHFQRKNHLLDLEND FLKFQYNSYYLKATEAFDNMLKSVGLFLGDATDLDKYAKAISCINVAYAYCDEELAGRLD TLRMELGQFTGDVTDGSGDSKGSFADSSEALKAVAIYISRYNREFFKIEKNERW >gi|229784003|gb|GG667732.1| GENE 10 4661 - 5986 821 441 aa, chain + ## HITS:1 COG:lin0105 KEGG:ns NR:ns ## COG: lin0105 COG1783 # Protein_GI_number: 16799183 # Func_class: R General function prediction only # Function: Phage terminase large subunit # Organism: Listeria innocua # 151 373 23 251 443 151 37.0 3e-36 MANNENLNGHGFHERTTEEQRAIAIAGGKASGEARRRKANFRRTLNMLLTAEIDSPEWTP VLEALGLESTLESAVNAAVIKKALAGNVKAYEAIRDTLGQTLKSDLDIEEQLAKIVHLKA QADALQPNRDNDQPVKYTGIPSNMIAPVFAPVVFDIQEHGHTEYVFPGGRGSTKSSFISL EVIDLIMNNDQMHAVVMRQVADTLRGSVYQQILWAIEALELSEEFHATVSPMEITRISTG QKIYFRGADDPGKVKSIKVPFGYIGILWLEELDQFVGPESVRKIEQSVIRGGEVAYIFKS FNPPKTASNWANKYIKVPKASRLVTESTYLDVPPKWLGKPFLDEAEFLKEVNPDAYDNEY MGVANGSGGSVFDNVTIRKITDEEIAQFDHVLNGVDWGWYPDLYAFVRVHYDPARLRLYI WQEYTCNKQSNRKTADKLIEI >gi|229784003|gb|GG667732.1| GENE 11 6923 - 7192 149 89 aa, chain + ## HITS:1 COG:no KEGG:CHY_1680 NR:ns ## KEGG: CHY_1680 # Name: not_defined # Def: prophage LambdaCh01, PBSX family terminase large subunit # Organism: C.hydrogenoformans # Pathway: not_defined # 1 79 323 401 420 90 54.0 3e-17 MGDYKSYGLLAREADKGPGSREYSFKWLQSLREIIIDNVRCPVAAQEFLDYEYERDKEGN VISGYPDGNDHCIDATRYATNRIWKKKGQ >gi|229784003|gb|GG667732.1| GENE 12 7253 - 8581 726 442 aa, chain + ## HITS:1 COG:no KEGG:Clole_0815 NR:ns ## KEGG: Clole_0815 # Name: not_defined # Def: bacteriophage portal protein, SPP1 GP6-like protein # Organism: C.lentocellum # 
Pathway: not_defined # 1 436 16 453 469 253 35.0 1e-65 MFSIKTLKELFGQDIAISQDMISAIEKWGSMYRGQAPWVDEQVDSLQIEQGICREFANVC LNEMESSISVEPLDKIYQSAVRDLNENLQSGLALGSFCIKPLGEDKVEYITQERFIPLKF DARGRLTRVAFVDVKKVADYDYFIRFEVHTWEETRVLNIQNLAYRSSDMASIGRSVPLSM VPEWAELPENINYAGVERPDFGYYRNPIKNEVDGSPCGVSVYQSATNLIKKTDIQFGRLD WEFESGERVVHVDVTALQEAPTLGQDGRTRYEMPKLNKRLYRGLNLSGGSNGEELYKEYS PELRDQNIINGLNAYLRRIEFNVCLSYGDLSDVNDVDKTATEAKIAKKRKYNMVKAIQSN LKDCLEDLAYALAFYNGMTRSGYEFLCTFKDSILVDEEEERRQDRQDLAAGIMRPEEYRA KWYGETLEEAAKNLPEPAMVEE >gi|229784003|gb|GG667732.1| GENE 13 8587 - 10359 1027 590 aa, chain + ## HITS:1 COG:no KEGG:ELI_3213 NR:ns ## KEGG: ELI_3213 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 388 2 388 526 211 36.0 6e-53 MTPDELEKLPKPLERTMTALELSIMAEIVERIKEAAQITPVTDWLITRLIAIGNSKVMIK KIIGEAIKTAEMQIDEIYRQAARSDYVRNREIYEAVGRDYLPYEDNNWLQQAVEAARRQT KDSLRPMENITQTTGFNVQMGHGKKVFTPLSEYLERSLDKAMLGITTGAKTYSQAIGEVI DEMTASGIRTVDYASGKSDRIEVAARRAVMTGVAQMTKQVSDKNAEELGTDHWEVDWHMG ARNTGTGYLNHQSWQGKVYSTEEIRTICGEGEMLGFAGINCYHIKSPFLPGISKRKYTDE WLAEQNRKENEKKPYRGREYDTYGALQYQRRLERTIRKQKQDVELLKTAGADKKDIMEAR SRLRLTDKHYVDFSKQFDLPQQRERLRIANKGTEGEHGITSKKGASGSESRVDLEYINSS EYKAKFSQITDNAKVNESIYQRAKAMLTHRSGTDKEDMYLLDRNDGKIIGLQTGGKTDYE VEYNKSLTDAVNNSKPYSLISIHNHSTNRPPTGKDFTSNGLHRYALGVVACHDGTVYTYK VGTRIFTSGLFDGRVEKYMKTPYNMSEMDAHLRVLEEFVNEYGIEWKVIT >gi|229784003|gb|GG667732.1| GENE 14 10356 - 10499 193 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKGNSFSYYDGPVKDSGRTLEEIEEAIEKEKERCDGMTSWSEAFEE >gi|229784003|gb|GG667732.1| GENE 15 10680 - 11267 664 195 aa, chain + ## HITS:1 COG:no KEGG:LKI_09680 NR:ns ## KEGG: LKI_09680 # Name: not_defined # Def: putative scaffolding protein # Organism: L.kimchii # Pathway: not_defined # 1 195 1 206 209 84 32.0 2e-15 MKRKFLEDMGLTKEQVDSIMAENGNDIEAAKGDLEQVKTELEQTKTQLQEANTTIDGFKD YDQVKKQVEEYKTKYEQSKAEYETKIADMQFGTSLEAAITAAGGRNAKAVKALLDVEALK VSKDQTADIKAAIEACQKDNSYLFGANEPINNPVGPTNGPAIGITKEQFKTMGYKERLEL KQSNPEKYAEMKGDN >gi|229784003|gb|GG667732.1| GENE 
16 11270 - 12307 757 345 aa, chain + ## HITS:1 COG:no KEGG:ELI_3208 NR:ns ## KEGG: ELI_3208 # Name: not_defined # Def: coat protein # Organism: E.limosum # Pathway: not_defined # 1 345 1 342 342 445 63.0 1e-123 MPGIIFGIPFDDELFLDMWNEAPDPYYTAMIQSGAVVEDSTIAGMIQNHGNIYTIPFYNT LEGEDLNYDGQTDITVEEIGGGSQTGVVYGRAKGFFARNFTAELSGSDPMGHIVASVARY WQKRRQMRLIGISDAIFGITGASGHAKKWADSHTLELTSTTAEARKIRETDLNDLATEAC GDHKDQFGLVIMHSNVAKTLENLQLLEFWKQTDANGIQRPMALGSANGYTVIVDDGVPVE KVGGDGANKDLSKYTTYLFGNGVIRTARGRVDVPVETVREAKKNGGQDELITRMRETIHP NGFSFKIPTSGWTESPTDAQLFAKANWDIKFDPKALPIARLITNG >gi|229784003|gb|GG667732.1| GENE 17 12325 - 12693 342 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623789|ref|ZP_06116724.1| ## NR: gi|266623789|ref|ZP_06116724.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 122 1 122 122 232 100.0 7e-60 MPYAAEQDYQNEYLMGRKPVIRTGFDYYARQASQVIDVYTFGRLKTLLEVPEPVRLCCCE LAEVICQQEKRNREAAGKTSEKVGTYSVSFSSAAEARQAADREQRGIVMKWLADTGLCYQ GV >gi|229784003|gb|GG667732.1| GENE 18 12695 - 13075 310 126 aa, chain + ## HITS:1 COG:no KEGG:Clole_0823 NR:ns ## KEGG: Clole_0823 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 123 1 127 130 66 34.0 4e-10 MYTNADVTLYLYSKCGKQDSYRRIFVEDVFWDNVKQSNVLKTGQRDSDSVLLVIPLESLF EPIRFTAGKDLAVKGQCNHMIDCSSQKSMSESLQELKQCHGCVTVMTVDEKLYGSESAQH YELSCK >gi|229784003|gb|GG667732.1| GENE 19 13086 - 13511 194 141 aa, chain + ## HITS:1 COG:no KEGG:Clole_0824 NR:ns ## KEGG: Clole_0824 # Name: not_defined # Def: minor capsid # Organism: C.lentocellum # Pathway: not_defined # 1 123 3 125 141 88 40.0 8e-17 MKVELEIQPIEVLLEKHGLQPGGPVQKVIDSEAMRYMSPYMPRRQAGNLEHMMIMATVTG SGEINTPGPYAHYLHEGILYVSPTTRSSWAKENEIKVPTEKELKYTGAPMRGKKWFDRMK ADHRDDILEAAQARVDRGGKI >gi|229784003|gb|GG667732.1| GENE 20 13508 - 13963 375 151 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623792|ref|ZP_06116727.1| ## NR: gi|266623792|ref|ZP_06116727.1| hypothetical protein 
CLOSTHATH_05111 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05111 [Clostridium hathewayi DSM 13479] # 1 151 1 151 151 290 100.0 3e-77 MTIIDFLRQKLTEYPKISEFMAGADIHIDFTDPDPTNYGLSSTGDSLIREDVLGNQIRQH NFVMYAVAQSFTDYNRLANSNFLLELAYWLERLPEEDGISVEVDNQAMIGIFLKATSANA MAMQPMTEDINDGVLYQIQIYAQYKIESEEF >gi|229784003|gb|GG667732.1| GENE 21 13964 - 14440 417 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623793|ref|ZP_06116728.1| ## NR: gi|266623793|ref|ZP_06116728.1| hypothetical protein CLOSTHATH_05112 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05112 [Clostridium hathewayi DSM 13479] # 1 158 1 158 158 300 100.0 3e-80 MPEATGKIKRKFMAHYINAALPSASQAAYERLGKDLEEYNVEMNANIETKNNIMGETSVN LDSYQPQASVEPYYAEAGSQLFTRLQGIIDERQILDDLKTDVVEVHLWESAVSGAYTAYK EEAIIEVSSYGGDYTGYQIPFNLHYTGVRTKGTFNISF >gi|229784003|gb|GG667732.1| GENE 22 14690 - 15670 352 326 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 293 1 293 381 569 96.0 1e-161 MYFTSGSDRAFEQIMQTRPGADHFDNGEAGAPEDCGTCRFYRPQWKYQFCVYAECPYQPG KLTALDGAVKFQVKGVDDEMAVFRVEKNRGYTVMSNHHLRNKDLSLKAKGLLSQMLSLPE DWDFTLKGLSLINREKIDAIRAAVRELEQAGYIVRSRERDSQGRLRGADYIIYEQPQPVP DSPTLENPTLDNPTQEKPTQEKPTQLNKDRSSKEKSITDGSNTDSIPILSPPSPLGEEAA APPERKGTGAKSQSAVEIYREIIKDNIEYEHLYQYAKGIDRDMLDEIVDLLVENRMQRPK DYPHCRGRLPRRAGEIQADETGQLSH >gi|229784003|gb|GG667732.1| GENE 23 15495 - 15818 195 107 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 25 105 293 373 381 167 96.0 1e-40 MNTFTSTPRGLTGICWTRSWTCWWKTVCSARKTIRIAGDDYPAELVKSKLMKLDSSHIEF VFDCISKNTSEIRNIKKYLLAVLFNAPSTINGYYTALVAHDMNTGKI >gi|229784003|gb|GG667732.1| GENE 24 15836 - 16048 299 70 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1241 NR:ns ## KEGG: BLJ_1241 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 
# Pathway: not_defined # 1 70 13 82 82 130 90.0 2e-29 MKQGTLIFDEQADRYDIRFDLADYYGGLHCGETFDVLAGGRWRPTRIEMADNWYLVGIRT DDLSGLRVRI >gi|229784003|gb|GG667732.1| GENE 25 16094 - 16570 436 158 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1242 NR:ns ## KEGG: BLJ_1242 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 157 1 157 158 193 81.0 2e-48 MQEEIEQKTIALAVKTGKLTGQVLQAAMKKFLAARQSAKSNPHQGRQSLRQLKKDGCALS NIEITDSNIGLFRPCAKKYGIDFTLRKDRTTHPPRYIVIFKSKQADNLEQAFKEFTAKKL KQQERPSIRKALSAMKQKAAAKDKQRAKEKVKERGLSL >gi|229784003|gb|GG667732.1| GENE 26 16567 - 17067 455 166 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1243 NR:ns ## KEGG: BLJ_1243 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: Bacterial secretion system [PATH:bll03070] # 5 160 1 156 597 307 92.0 8e-83 MKPEVKKQILSNAPYLLFVYLFGKLGQTYRLAAGADLSEKLLHLADGFSLAFASAAPSFH LFDLAVGVAGALLLRLMVYCKSKNAKKYRKGVEYGSARWGGPKDIAPYIAPVFDNNILLT QTERLTMNNRPKDPKTARNKNVLVIGGSGSGKTRFFVKPSAPVRAV Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:22:27 2011 Seq name: gi|229784002|gb|GG667733.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld126, whole genome shotgun sequence Length of sequence - 15154 bp Number of predicted genes - 20, with homology - 19 Number of transcription units - 6, operones - 3 average op.length - 5.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 161 114 ## gi|266623797|ref|ZP_06116732.1| conserved domain protein - Term 238 - 273 0.1 2 2 Op 1 . - CDS 448 - 624 134 ## gi|266623799|ref|ZP_06116734.1| conserved hypothetical protein 3 2 Op 2 . - CDS 709 - 1491 315 ## COG0338 Site-specific DNA methylase 4 2 Op 3 . - CDS 1488 - 3401 859 ## Cphy_2982 hypothetical protein 5 2 Op 4 . - CDS 3415 - 3852 329 ## gi|266623802|ref|ZP_06116737.1| conserved hypothetical protein 6 2 Op 5 . - CDS 3849 - 5222 1248 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 7 2 Op 6 . 
- CDS 5219 - 5590 215 ## gi|266623804|ref|ZP_06116739.1| hypothetical protein CLOSTHATH_05124 8 2 Op 7 . - CDS 5597 - 5932 336 ## Sterm_1414 VRR-NUC domain protein - Prom 6005 - 6064 4.7 9 3 Op 1 . - CDS 6172 - 8577 1503 ## COG5545 Predicted P-loop ATPase and inactivated derivatives 10 3 Op 2 . - CDS 8589 - 10559 1656 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 11 3 Op 3 . - CDS 10607 - 11323 343 ## Sterm_1410 hypothetical protein 12 3 Op 4 . - CDS 11316 - 12515 910 ## Sterm_1409 hypothetical protein 13 3 Op 5 . - CDS 12515 - 12997 190 ## gi|266623810|ref|ZP_06116745.1| conserved hypothetical protein 14 3 Op 6 . - CDS 12994 - 13215 349 ## COG4443 Uncharacterized protein conserved in bacteria - Prom 13380 - 13439 3.7 - Term 13385 - 13415 1.3 15 4 Op 1 . - CDS 13575 - 13901 253 ## gi|266623813|ref|ZP_06116748.1| nitrogenase molybdenum-iron protein beta chain 16 4 Op 2 . - CDS 13907 - 14116 242 ## gi|288871210|ref|ZP_06116749.2| conserved hypothetical protein 17 4 Op 3 . - CDS 14139 - 14447 221 ## gi|266623815|ref|ZP_06116750.1| hypothetical protein CLOSTHATH_05135 18 4 Op 4 . - CDS 14463 - 14615 195 ## gi|266623816|ref|ZP_06116751.1| conserved hypothetical protein 19 5 Tu 1 . - CDS 14766 - 14915 184 ## gi|266623817|ref|ZP_06116752.1| hypothetical protein CLOSTHATH_05137 - Prom 15033 - 15092 4.0 + Prom 14948 - 15007 5.9 20 6 Tu 1 . 
+ CDS 15049 - 15154 138 ## Predicted protein(s) >gi|229784002|gb|GG667733.1| GENE 1 2 - 161 114 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623797|ref|ZP_06116732.1| ## NR: gi|266623797|ref|ZP_06116732.1| conserved domain protein [Clostridium hathewayi DSM 13479] conserved domain protein [Clostridium hathewayi DSM 13479] # 1 53 1 53 53 101 100.0 2e-20 MYLNIEYRDGKKEQKSVDDCSVKDGCLKYYIRTGVSAGTHYIPLDTIKEFKTP >gi|229784002|gb|GG667733.1| GENE 2 448 - 624 134 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623799|ref|ZP_06116734.1| ## NR: gi|266623799|ref|ZP_06116734.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 58 23 80 80 110 100.0 3e-23 MDERLARKYHKQYGVNSDGGTYQKYPKKKYRPIALATVVDILTAGGDLENEILIEQEN >gi|229784002|gb|GG667733.1| GENE 3 709 - 1491 315 260 aa, chain - ## HITS:1 COG:lin0088 KEGG:ns NR:ns ## COG: lin0088 COG0338 # Protein_GI_number: 16799166 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Listeria innocua # 1 256 1 256 270 166 38.0 5e-41 MKSLLHYPGSKKRLAPWIIGYMPPHHSYLEPYFGGGAVLFEKDPSKIETVNDLDDDVVNF FRVIRDPESRERLQEWLTYTPYARQVYDESFRQEPESPVEKAGYFAIRSMQSHGFRLTEK CGWKKDVYGREAAYAVRYWNELPEVLAVMAERLKRVQIEHKPAIELIKAFNYENVLIYAD PPYVLSTRTRKQYRHEMSDQDHRELLETLCGSKAKIMLSGYDCELYEDYLTGWHKAQIPA RAQNSLPRTETLWMNFGQEN >gi|229784002|gb|GG667733.1| GENE 4 1488 - 3401 859 637 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2982 NR:ns ## KEGG: Cphy_2982 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 506 636 292 421 421 140 55.0 3e-31 MREVKKSDLTFDEANAMLDEYNRSLRNNFVGIGYTLKRVREDKLFLEAGYKNFDEFMNAK YGKDKSWASKCISIAVQFGQGEETPVIQDKYADYEFNKLVELVTMTEEQRSQVTPKTSVR ELREMKPGRKKKVATVETPKPEPEKLSAYGTVIKVYPEDSLIAVAGCNGPEGDHKGSHDC FSCHLQCRIRQQYCYCVEAPMGNPFPCAQIDIIDTLEEEIGERCQFVDLDQAYHRAGDHE PVPCCKECKEPCQYACERSIKKREAEEQTCNTDSSSCPHRDGYSCTLTPDQKNTPGDGNQ 
CERSCCWECKYHGMCETECDSSAGRDGKRPEFNAAWFVRQWAEKKPSDLKKVMQICREQE NNSDRAKAVQKYIAPYSCSGSHYGTYCHDFHSFAGGIDFEIGRVKLHLKYGRFVTELLAL YDPSSPEYDEKPELEPVQADPDPVIDADFEEITEPAEEPKESMTDLQMVRQMLEKEEKLL TTCLLTVPDQDNIHIRRMKLKVAALAAYATNLDNIENSAPKPEQLPLPMLKNNDQRKSWI EDYKAWGMWYRDENIDVNYYKYDFSDGSHLIAAEYPQREFYWSNEIRDEVYYHLVEKGKQ KYNGLGAYDEKYQNSTTSVGEIVEYLKEIQKKGRTRN >gi|229784002|gb|GG667733.1| GENE 5 3415 - 3852 329 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623802|ref|ZP_06116737.1| ## NR: gi|266623802|ref|ZP_06116737.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 145 1 145 145 236 100.0 4e-61 MIIEFSIPNGNMKVCAEEFFAEAGMAQIRRMFKMLRESGLDDNRRKEILVWLRDQSTEMY QRMEEWSKRYMDCSTRCRELEEQYEQMKSPCYAVYTQDKEALKAARDKVTSAKRRVSASK REYQTAEKMRNRYQKIIDILVEVTT >gi|229784002|gb|GG667733.1| GENE 6 3849 - 5222 1248 457 aa, chain - ## HITS:1 COG:XF0680 KEGG:ns NR:ns ## COG: XF0680 COG0553 # Protein_GI_number: 15837282 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Xylella fastidiosa 9a5c # 8 449 6 456 472 305 39.0 9e-83 MIFKPHAYQQHCIEQILRVKKLGLFLDMGLGKTVTTLTAVKELKYNRFQVRKVLIIAPKK VAEGTWTKEAAKWDHTKMLRVSPVLGSQTKRIRALNTPADLYIINRENVVWLVDYYRNAW PFDMVVVDESSSFKSHSAKRFKALASVGGYIDRMVELTGTPSPNGLDDVWAQVYLLDGGE RLGKRYTQFRERYFQPDKRGADGMIYSYEVKPGSEGSILERISDICISMKAEDYLQLPDI TYHEIPVELDSKASKAYYEMEREMVLALPEEEISVTSAAALSNKLLQLANGAVYDEDHSV HEVHNCKIEAFMELIESLQGKPALVFYNFQHDRARILKALEKTGLRVRELKTTQDEDDWN ARKIDILLTHPASSAYGLNLQQGGNHVIWFGLTWNYELYTQANKRLHRQGQEEKVIIHHL ICSGTRDEDVMEALKRKDDVQSWVMESLKARIRRYRN >gi|229784002|gb|GG667733.1| GENE 7 5219 - 5590 215 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623804|ref|ZP_06116739.1| ## NR: gi|266623804|ref|ZP_06116739.1| hypothetical protein CLOSTHATH_05124 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05124 [Clostridium hathewayi DSM 13479] # 1 123 1 123 123 245 100.0 1e-63 
MTKIDLEQKIKDAVDARGGRMHEISVGNTPGFPGYMVILPGGHVGFVEIGKPNRNQLIAL RNQISELRTLGCAAMAIDNESQINSMMMYILSDAGKDPSWNAYQDYKAGMTGYGIEHKGD DAL >gi|229784002|gb|GG667733.1| GENE 8 5597 - 5932 336 111 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1414 NR:ns ## KEGG: Sterm_1414 # Name: not_defined # Def: VRR-NUC domain protein # Organism: S.termitidis # Pathway: not_defined # 3 96 8 101 110 87 48.0 1e-16 MLEKDIEKILVNEVKKLGGRAYKWVSPGNDGVPDRIVILPGLRPVFVELKTEKGRLSAIQ RVQIERLKKMKQDVSVLYGEPQVRDFLEECKHRLGLLAYLEDCHFYDDEEE >gi|229784002|gb|GG667733.1| GENE 9 6172 - 8577 1503 801 aa, chain - ## HITS:1 COG:XF0506 KEGG:ns NR:ns ## COG: XF0506 COG5545 # Protein_GI_number: 15837108 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Xylella fastidiosa 9a5c # 371 669 32 343 488 175 33.0 3e-43 MQNNRKLQISTAGSRKSTHWPRCEIMWSEFTEKLKTPIRGAETLEQYLALPKARQDDLKD VGGFVGGTFTGDRRKPECAEGRDLLTLDLDNIPAGQADDILRRVSGLGCAAAVYSTRKHA GYAPRLRVIVPLDRTATADEYEPAARKLASLIGIEFCDPTTFEVNRLMYWPSCCIDSQYL YEVYDNPFCSADGLLGMYGDWRDISQWPQVPGTEAIERRRLAKQEDPTTKRGVIGAFCRT YSITQAMEMFIPGMYEETASPGRYTYTGGETTGGAIIYDGDLFLYSHHSHDPCCNQLVNA YDLVRLHMYGDRDVEAKEGTPVNKLPSFVAMSRLAMDDKAVADLIAKEKHEAAVAAFAEP TEGVRSEDYSWLSGLEVDGNGNYKKTVNNMIIVLQNDPLLKGRIVTDEFANRGLVLGELP WSKEVGRRQWSDPDDAGFFWYMENFYHIAQQDKLDRALTIVGEQNKINEVRDYLKGLKWD GVKRVDTLLSEYLGAEDTPYTRAVMRKSLCAAVARAVEGGVKYDYMPIFTGPQGLGKSSF LNILGKSWFSDSLTSFEGKEAAELIQGTWINEVGELTAMTRQETSAVKQFLSKREDIYRA AYGRRTERYPRRCVFFGTSNDSEFLKDNTGNRRFWPVDVGVHPAKRSVWNDLPAEVDQIW AEAYMYWALGEPLYLSKEIEALAIEQQERHREASGKEGMILDFLEKMIPSNWDQMDPLKR KMYWQGTLQLPEGALLVPREKVCAVEIWVECFNGDPRYLKRMDSTEINNVLQNISGWKRN KTTRRYGPYGQQKGFERVTTN >gi|229784002|gb|GG667733.1| GENE 10 8589 - 10559 1656 656 aa, chain - ## HITS:1 COG:SMc02850_2 KEGG:ns NR:ns ## COG: SMc02850_2 COG0749 # Protein_GI_number: 15963928 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Sinorhizobium meliloti # 371 490 420 530 
663 68 37.0 5e-11 MHHLSIDIETRSSIDIGKAGAYKYAQSPDFEILLFAYQWNNDPVKVIDLKNGEELPCWLM QALADPNVIKHAYNAAFEWYCLNCAGYETPIEQWRCTMAHGLYCGYTAGLDATGKAIGLP QDKQKLTTGKALIRYFCVPCKPTKTNGSRTWNQPWHDTDKWELFKEYCLQDVVTEREILK RLDLFPMPEEEEHLWQMDVLMNAYGVRVDTDLIEGALYIDQISTQRLTDEAISLTGLQNP NSAAQLLQWLRDNGTEADNLQKATVAELLGGINPNKVRRMLEIRQQLGKTSIKKYVAMDT ARGEGDRVRGLTQYYGANRTGRWAGRLVQMQNLPRNYLKTLDYARNLVKAKNYDGVRILY GNVPDTLSQLIRTAFIPSAGHKFVVADFSAIEARVIAWLAGEQWVNEVFATHGKIYEATA SQMFGVPVERIAKGNPEYSLRQKGKVATLALGYQGGTSALIAMGALQMGLTEEELPDIVQ RWRQANPRIKGLWYAIENAALAVMETAQPQGINGLIFALEGDIIYGQSFLTVRLPSGRKL FYPKPFLKENRFEKMAVHYYTVGQQTRKWEVTSTYGGKMVENIVQAIARDCLAVTLERIA SKGLQVVFHVHDEVIIDAPMETTVEEICDLMAEPIPWAPGLVLKGAGFESSYYMKD >gi|229784002|gb|GG667733.1| GENE 11 10607 - 11323 343 238 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1410 NR:ns ## KEGG: Sterm_1410 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 5 178 6 181 195 166 49.0 9e-40 MNDLTNVTTGEVRLSYVHLFKPYAAMTGQEEKYSVTVLVPKTDVDTMGRINAALDAAKQK GISEKWNGQCPPIVPVPVYDGDGVRPSDGMAFGPECKGHWVFTASAKVDYRPEVVDKMGN PIINQSEVYSGMYGRVNVSFYPYSFGGKKGIGCGLGPVMKTRDGESLGGSAPSAAQAFNV QQGQATAGYAAPQYGAAMPATPGAAGYAPQTTAPWNGAGQPAAQAAAPRLHPITGQPY >gi|229784002|gb|GG667733.1| GENE 12 11316 - 12515 910 399 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1409 NR:ns ## KEGG: Sterm_1409 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 8 385 3 374 389 329 49.0 2e-88 MGHAERDHALLSASGAHRWLYCTPSARLEEGFPDTTSEAAAEGTLAHELAELKVRNYFHG VEFGKKKLTAAINKLKKKELWQEEMMGYTDEYLDYIKGVALAMKNQPYVATEKRVDFSAY VPDGFGTADCILICGNVLHIIDFKYGKNPNGRVEAEGNPQLALYALGAYEMYKILYPIEI IRMSIVQPRLSDGISEWECPLEELLSWGQYVKGRAELAIKGEGDYFPSPETCKYCRARGQ CRARADENVKLAFSEDLGKLPPLISNAEAGDYLRKGVDVAKWLEALKDYALKECLAGKEV PGWKAVSGRGGRDWTDMDKAFETLVKSGVAEEAVLWERKPLSLAQVETTVGKKDFADAVG EYVVWKPGKPALVEASDKRPAITNKLTAAEAFKEEKVNE >gi|229784002|gb|GG667733.1| GENE 13 12515 - 12997 190 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623810|ref|ZP_06116745.1| ## NR: 
gi|266623810|ref|ZP_06116745.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 160 1 160 160 194 100.0 2e-48 MMITATFESIQEMREFANAVSTVETDALMELANGPDHYRQANPTIQFTGTPPVQQAASVT PPVQQAPVTPPPVQQTPVTPPVQQTPPVTPPPVQQAPVQTTAPSYTPDDLARAAMTLMDS GRQGDLIDLLAQFGADTLTHLQPEQYGAFATALRGLGAPI >gi|229784002|gb|GG667733.1| GENE 14 12994 - 13215 349 73 aa, chain - ## HITS:1 COG:L114363 KEGG:ns NR:ns ## COG: L114363 COG4443 # Protein_GI_number: 15672295 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 1 69 1 70 72 60 51.0 5e-10 MAKELKCEILETLIEFPGEGKYHKELNLVKWGDHEPKYDLRGWNADRSEMTKGVTLTKDE LMILKKELGGIEL >gi|229784002|gb|GG667733.1| GENE 15 13575 - 13901 253 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623813|ref|ZP_06116748.1| ## NR: gi|266623813|ref|ZP_06116748.1| nitrogenase molybdenum-iron protein beta chain [Clostridium hathewayi DSM 13479] nitrogenase molybdenum-iron protein beta chain [Clostridium hathewayi DSM 13479] # 16 108 16 108 108 192 100.0 5e-48 MKKRLEKKREKRKREQIHRLLDLALDINGLEPRKQGITGDLPTAFFKFSGQVGWIDIHMY STGWNSDNYPDVEMLPSTNSLGELSYAVRRMEALKAETPGAATPRDSR >gi|229784002|gb|GG667733.1| GENE 16 13907 - 14116 242 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871210|ref|ZP_06116749.2| ## NR: gi|288871210|ref|ZP_06116749.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 69 11 79 79 103 100.0 6e-21 MQRETIMKQLEKQARYAQRSLSRDLLYQTYGQACMARQLEAITKDEFMQINHMTVYFMNT HARELEEGR >gi|229784002|gb|GG667733.1| GENE 17 14139 - 14447 221 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623815|ref|ZP_06116750.1| ## NR: gi|266623815|ref|ZP_06116750.1| hypothetical protein CLOSTHATH_05135 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05135 [Clostridium hathewayi DSM 13479] # 1 102 1 102 102 202 100.0 6e-51 
MKRVNAEETYFDSMNDMVIVLCCKEDIAEEGPEHIVLEPVWKKYESDDSPVEYLTLADIS EQFQGRCCLMVIAETPLSGRIYRYDNYGEKEWLEVGTTCGYA >gi|229784002|gb|GG667733.1| GENE 18 14463 - 14615 195 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623816|ref|ZP_06116751.1| ## NR: gi|266623816|ref|ZP_06116751.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 50 1 50 50 86 100.0 5e-16 MQKYYTDLDDFEDDSRSRLLEVTERWLMPAVIFVAGVIIILAVCVSLEAL >gi|229784002|gb|GG667733.1| GENE 19 14766 - 14915 184 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623817|ref|ZP_06116752.1| ## NR: gi|266623817|ref|ZP_06116752.1| hypothetical protein CLOSTHATH_05137 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05137 [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 68 100.0 1e-10 MKDTVVALNSVNVERAFEALARIFTARGECTVTVKSIRKKDEVQKDETA >gi|229784002|gb|GG667733.1| GENE 20 15049 - 15154 138 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTYEELKTFTWEELSALTYYQLKMDKYELLVKAQN Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:24:01 2011 Seq name: gi|229784001|gb|GG667734.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld127, whole genome shotgun sequence Length of sequence - 13960 bp Number of predicted genes - 14, with homology - 12 Number of transcription units - 8, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 266 - 1549 767 ## COG0582 Integrase 2 1 Op 2 . - CDS 1546 - 2739 485 ## COG0582 Integrase 3 1 Op 3 . - CDS 2754 - 3095 84 ## gi|266623821|ref|ZP_06116756.1| putative prophage LambdaBa04, DNA binding protein - Prom 3121 - 3180 1.9 + Prom 3638 - 3697 7.2 4 2 Op 1 . + CDS 3731 - 4393 436 ## Spirs_4185 TetR family transcriptional regulator + Term 4397 - 4433 4.5 + Prom 4396 - 4455 2.8 5 2 Op 2 . + CDS 4479 - 5393 437 ## COG3595 Uncharacterized conserved protein 6 2 Op 3 . 
+ CDS 5414 - 6493 500 ## gi|266623824|ref|ZP_06116759.1| conserved hypothetical protein + Term 6695 - 6730 2.0 + Prom 7769 - 7828 5.1 7 3 Tu 1 . + CDS 7914 - 8366 -18 ## COG4908 Uncharacterized protein containing a NRPS condensation (elongation) domain + Term 8408 - 8447 8.4 8 4 Tu 1 . - CDS 8569 - 8679 61 ## - Prom 8800 - 8859 4.8 + Prom 8425 - 8484 7.6 9 5 Tu 1 . + CDS 8664 - 8801 59 ## 10 6 Tu 1 . - CDS 9010 - 10788 432 ## Closa_1131 hypothetical protein - Prom 10832 - 10891 7.0 - Term 10850 - 10901 12.1 11 7 Op 1 . - CDS 10920 - 11849 433 ## bpr_I1133 hypothetical protein 12 7 Op 2 . - CDS 11920 - 12246 123 ## gi|266623828|ref|ZP_06116763.1| conserved hypothetical protein - Term 12261 - 12318 6.1 13 8 Op 1 . - CDS 12327 - 13121 390 ## gi|266623829|ref|ZP_06116764.1| conserved hypothetical protein 14 8 Op 2 . - CDS 13143 - 13850 342 ## Closa_1108 hypothetical protein - Prom 13900 - 13959 1.8 Predicted protein(s) >gi|229784001|gb|GG667734.1| GENE 1 266 - 1549 767 427 aa, chain - ## HITS:1 COG:mlr0475 KEGG:ns NR:ns ## COG: mlr0475 COG0582 # Protein_GI_number: 13470699 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 36 381 27 369 399 72 23.0 2e-12 MIIDLSLQQYVDWEVKPVIPIKGKFGYRVVLKYIDGTERTQQKSGFKTDKEANAARDNTI GKLHAGTYIVYENVKVSDFLEFWLKEDICKRVRSEETYAAYSNIVYNHIIPILGKKKMSA VNRGDVQKLYNDRAEYSVSVSRLVKTVMNVSMNYAVAKKIIAENPAVGINLPKTVKKKEY HARSIDTQKTLTMDQILILLEASRDTPIHMQILFNVLMGLRRREINGVKYSDIDYINRTL KLRRQLGKKINTKKEDFPPKTFTKQELGLKTPSSYRDIPIPDYVFEAILQQREVYERNKS RRRSQFQDSGYVCCSNYGKPRSKDFHWKYYKKLLADNNLPDIKWHHLRSTFCTLLLKNNF NPKAVSKLMGHATELITLDVYGDNREIIADCVDEIQPFIDEVLPIKEVDKQFEEELLEIE VPAEDYV >gi|229784001|gb|GG667734.1| GENE 2 1546 - 2739 485 397 aa, chain - ## HITS:1 COG:mlr0475 KEGG:ns NR:ns ## COG: mlr0475 COG0582 # Protein_GI_number: 13470699 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 143 396 152 398 399 88 26.0 3e-17 
MKRKPQNSAAFFENNSWYHRTKILQDDGSIKYSKKGGFATSEEADKSYRRHEDEFKKAYR AYHLAHQINTEVTLKDYLIYWFEEIFSQRIETTTRMVGAYAVYDLILPNIEYDIKVRFAN AEYLDALLERVAKVSASAGNKGREILNMAMKEAVIGGYIKTNPVTGTKPYKRQKPKITIL SKEKIKVLLKASSENNWYLEILLGLFCGLRKGEILGLKFSDLDMEKRTVTIRRQIASDPL IKKGSGSKIDNYGLIERDPKTPNSFRVLRVPAIIIEELEKRRRLVESNKVVYAGQYDDND YISCQRNGKLHSLSALNSALTGICSKNGLPHITVHGLRHMYATILIEMGVPLIKISALLG HGSVHTTFEYYCDVMDENENILAFMNQAFVPVERRAV >gi|229784001|gb|GG667734.1| GENE 3 2754 - 3095 84 113 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623821|ref|ZP_06116756.1| ## NR: gi|266623821|ref|ZP_06116756.1| putative prophage LambdaBa04, DNA binding protein [Clostridium hathewayi DSM 13479] putative prophage LambdaBa04, DNA binding protein [Clostridium hathewayi DSM 13479] # 1 113 1 113 113 221 100.0 1e-56 MIVEKQTVLKTERFCEINIVLAISLRLNHVRWHQSCSWLVAFVKGVTAIEVLQKDKKVYN AEEIQEILGIGRSKTYTFLDEAYRNQKPFRVLKVGKLFRVPKQSFDDWLDGIS >gi|229784001|gb|GG667734.1| GENE 4 3731 - 4393 436 220 aa, chain + ## HITS:1 COG:no KEGG:Spirs_4185 NR:ns ## KEGG: Spirs_4185 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: S.smaragdinae # Pathway: not_defined # 3 219 2 218 219 264 57.0 2e-69 MSDRIKSITDAATYLFLQQGYSKTQISHIAKAVGVSVGTIYLDFAGKKEIMHFVLKCTID PAFINQNFERPITDDLFVGLENDIIAVFEKIGSDFAKHLVNKAADYDLETLVSDVFDILA QYAVGCLFIEKNQFDFKFLAEHYRAYRKKFLETMTQYLTAFVESGKVRPLEQLELTTTLI IEILSWWAMDIRYTSFETQDIPPELAKKICIDNIISAYKS >gi|229784001|gb|GG667734.1| GENE 5 4479 - 5393 437 304 aa, chain + ## HITS:1 COG:CAC0573 KEGG:ns NR:ns ## COG: CAC0573 COG3595 # Protein_GI_number: 15893863 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 78 302 104 304 306 63 25.0 4e-10 MKMKRKIINGMLMAACLFTLSGCSHSTPAQVANELQFNLNGISDVTISYDEENVTFYQSE NDALIIKEYMTENKSRYYADVKQDSGSIHVSEGGKPFLKSNFTRYIEVYLPQEYQANLTV TTTNGNIDMSDLALDLNSLRIDSTAGTVQLDDANASIVHLSSTSGELDLGNITGEQIRLD TTSGKVTCVELNGAVTYTSTSGDIVVKSAVGSGSYRSNNSGNLKVTYTEVTGDLSFFNKN 
GGIELTLPQDLEFEFEATTKNGSVSTSFQGSISIDGRTTRGTVGNTPTATVKTETNNGNI TVTQ >gi|229784001|gb|GG667734.1| GENE 6 5414 - 6493 500 359 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623824|ref|ZP_06116759.1| ## NR: gi|266623824|ref|ZP_06116759.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 359 9 367 367 659 99.0 0 MILGVVILACIAFYFGGGYTAYNLAYIFSFVFLLFYAIFSASTILEKIAQIIIYALVMVV QILFNTFILRNLFEDNGIIGCLCRLLGILFIFIPIIIKQIFFYRRNGFPFCILGEYPALT YSELLSNKNEIASKIEKLKSTGQVLSKEHLQEILHDLPRHSSFSYINNGSLTAEYFQKAA ESFNDGFIYLVITQTKSASSEVIGLFTNRIFNHVSLSFDRDLQTIISYNGGEKIKPPGLN PELLEHLTEKTGASIMLYQLPATRQQKKTMLDKVQEINNEGSAYNLIGLVFKFSRKPNIM FCSQFTYTMLEIAGLNYFEQKAAHVRPTDFIELDYYRKLQFVDRIVLDNGDNAYGEENP >gi|229784001|gb|GG667734.1| GENE 7 7914 - 8366 -18 150 aa, chain + ## HITS:1 COG:CAC3554 KEGG:ns NR:ns ## COG: CAC3554 COG4908 # Protein_GI_number: 15896790 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a NRPS condensation (elongation) domain # Organism: Clostridium acetobutylicum # 9 143 281 415 417 59 25.0 2e-09 MTGIYRKIVVEIKPQHSFHETLLQVHIEMQLQKSRYRCFVGIRPLDYLFHKIPPPVMAQI IKWAYRLFPVSYTNFGRINHTKLFFEGCKIKSCYTTGAYRLPPDFQLSISTFQNVCTLNC TLIGQNEDSMTGQNILDEVKNEILEWVNMN >gi|229784001|gb|GG667734.1| GENE 8 8569 - 8679 61 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLATLIRSAPIVKRRLVFWDGVVGKVGSSFFGMLL >gi|229784001|gb|GG667734.1| GENE 9 8664 - 8801 59 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWPEAICFLRPLMETDSGAYIPQRYHLQNIGWALYLSIKFHYIFI >gi|229784001|gb|GG667734.1| GENE 10 9010 - 10788 432 592 aa, chain - ## HITS:1 COG:no KEGG:Closa_1131 NR:ns ## KEGG: Closa_1131 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 591 1 591 674 960 82.0 0 MDIKELYNIYQGNPEKFKKIVLYRNLLGSPISYNYKDSDLESLVQMIMSMDEKYETDFGF MMMELGLAYEAIYSNMKNNNAAREKTEFMNYPLDYQLRALISFFEDQHKLGIELNNIDGK KPITGIKHLVANVESEIYPGMRYSTSQNSENLIELIDYSIPYLYYYDSNFEKPCDNLTYE 
QCLQCPDIMKFVHHSNFCELINGLWSLYIYKDYSVKLGKTDTDDDITVFVPNSNEASLID FVAGIRREARRFQNSIDLLISNQNRIINGNKFIIRLAKKIKLDLWKSIFELELEVYLRCN LSANCTMKIIKQTDIFPAYLRQTLPYGELNDFLEVHEFLVTMSEIYSNILNHNWEEIEHD RFNYLCPVVDIDLLIKSFSRLYGKSLNTAAKLVEWFIYYPQRGKEGDLFSKPLVQISGKR ILFAPNLIRQINITRMLEQIMLDYKIKRAAIGDEYESYLRNKLSQSSLWNVYADKIEFKS SIGNTDFDVIALFDNHVVIIEIKHLVTPYDPKRYYEDRQEIKKAIKQLKLRKQVLLRDWA LIRDITNGFLPPEPYPEERIIQLVCTNIDSFTSLEIDGIRIVDESVLIRFFF >gi|229784001|gb|GG667734.1| GENE 11 10920 - 11849 433 309 aa, chain - ## HITS:1 COG:no KEGG:bpr_I1133 NR:ns ## KEGG: bpr_I1133 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 107 304 101 308 308 76 31.0 2e-12 MEAYMKYEEFIAALCQKIREVTKGWNGTSIEFRTTPQNNGRTLDLLVVSKQGKKMNGTYE LSFHTQELYEALAFDKMDLKHILRIVESKLRQAGEVGKISPLEVADDYEQIRSHLIIRPI NYVKHMEKLKDGMYDLIGDIALTVYIHIGFVNGAYVSSAVSSEYLEIWNKGKEEVIKEAF TNTYNLFPPIMLTGLEAYSVMDTDIFQKDAGYDMVGVTNEVIFNGAVSIFLPGVAKKLSD ILGEDLYIVFLNYDYAVIHPVSIIRPEEIRSLIASVNSDSPDEEGFLSDMIFTYRRDSDK MEVVKEVYS >gi|229784001|gb|GG667734.1| GENE 12 11920 - 12246 123 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623828|ref|ZP_06116763.1| ## NR: gi|266623828|ref|ZP_06116763.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 108 1 108 108 190 100.0 2e-47 MRDYINRNIRIDGRLIPYPVYTSWEYFELHDGIEDVEDFVDSNPAIEELVTQILALKQSC FLLRHTTHSCQSLSDSLFSLKLKLIKELKEKYNYNFDDVWMENLIGRI >gi|229784001|gb|GG667734.1| GENE 13 12327 - 13121 390 264 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623829|ref|ZP_06116764.1| ## NR: gi|266623829|ref|ZP_06116764.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 264 1 264 264 506 100.0 1e-142 MKQKLYYIKNETAEKAIYKNFENQRFSVIEKDSEPPMVLSFFDDKNVTDEYEYVYIPREM SKKEVEGRFEVEKQCPMCGTVTSIYLTDEDMDKYSWYSNSRFNRRGSGLPAPHIQNLFPD RTEEERELMMSGYCYWCQELIFGEDEFLLMTEAEFSMEPDHLIAYNRADHDGYQWWNKWF 
PQKGTDGNQYMVQEMNKFSAYILEEQLGKGISDIPKLGAYYGAEPSEDKEYHLYCEGTYC NFHVKLINRKRSYNVYIYVYETEE >gi|229784001|gb|GG667734.1| GENE 14 13143 - 13850 342 235 aa, chain - ## HITS:1 COG:no KEGG:Closa_1108 NR:ns ## KEGG: Closa_1108 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 214 508 722 734 187 44.0 3e-46 MSKRLKRDITLEIFYKPKNLVESHDEVLQLCEDKNITLQAVDIAKVYPDVDAICESIKEK YEYRDQKYVLLIPNRIEDIIQEGRALRHCIRFSDDYYERIQRRESYIAFLRKADQPEKAF YTLEFEPDGTVRQKRTSGDRQNVDFNQAVSFIKKWQKAIQPRLTEEDYRLAKVSASLRVE EFKELRKKNAKVWHGHLAGMPLADVLEADLMEVSLCMEDAGVTEADGQNGELVAA Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:25:17 2011 Seq name: gi|229784000|gb|GG667735.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld128, whole genome shotgun sequence Length of sequence - 12184 bp Number of predicted genes - 13, with homology - 10 Number of transcription units - 5, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 25 - 462 192 ## gi|266623831|ref|ZP_06116766.1| conserved hypothetical protein 2 1 Op 2 . + CDS 452 - 850 326 ## gi|266623832|ref|ZP_06116767.1| conserved hypothetical protein + Term 942 - 995 1.6 + Prom 948 - 1007 12.0 3 2 Op 1 3/0.000 + CDS 1091 - 1672 287 ## COG1309 Transcriptional regulator + Term 1679 - 1723 8.5 + Prom 1689 - 1748 3.6 4 2 Op 2 . + CDS 1776 - 2621 421 ## COG1131 ABC-type multidrug transport system, ATPase component 5 2 Op 3 . + CDS 2618 - 3301 320 ## CDR20291_3471 putative ABC transporter permease 6 2 Op 4 . + CDS 3298 - 3996 312 ## CDR20291_3472 putative ABC transporter permease 7 2 Op 5 . + CDS 4007 - 4936 401 ## COG0053 Predicted Co/Zn/Cd cation transporters 8 2 Op 6 . + CDS 4969 - 5097 147 ## + Term 5140 - 5186 1.9 + Prom 5154 - 5213 7.3 9 3 Op 1 . + CDS 5406 - 5984 252 ## COG1309 Transcriptional regulator 10 3 Op 2 . + CDS 6029 - 9163 1027 ## COG0060 Isoleucyl-tRNA synthetase + Term 9186 - 9226 -0.9 + Prom 9485 - 9544 2.0 11 4 Tu 1 . 
+ CDS 9688 - 9807 69 ## + Prom 10123 - 10182 4.3 12 5 Op 1 . + CDS 10246 - 11631 298 ## COG1404 Subtilisin-like serine proteases 13 5 Op 2 . + CDS 11628 - 11699 56 ## + Term 11768 - 11802 -0.0 Predicted protein(s) >gi|229784000|gb|GG667735.1| GENE 1 25 - 462 192 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623831|ref|ZP_06116766.1| ## NR: gi|266623831|ref|ZP_06116766.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 145 9 153 153 282 99.0 7e-75 MGLFFLAARFPAIIGGPFIGLGVCGMFGSAIRYWKLKQLIMPLSGCCLEIQTSCFVARQP WKKEHYEQCRIYFKEVQALIREAKAGGFYLQIPEGGESEIRGSGENRRCLFYVSPFGYPE EKMQAVYQTIKERLPETAEIYEYET >gi|229784000|gb|GG667735.1| GENE 2 452 - 850 326 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623832|ref|ZP_06116767.1| ## NR: gi|266623832|ref|ZP_06116767.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 132 3 134 134 238 100.0 1e-61 MKHEWMGVPEMLMKKLIINAGVILGILAAGVFMEIHQEGEGVLKLTILCVTGLGLYFLYL CQVIFQKKYETLVGEVTLIQNCKGRKRYWEIEVVDGEGKAKRLLVPVQSGVRKGIVYRFF LKNDILLGVEEM >gi|229784000|gb|GG667735.1| GENE 3 1091 - 1672 287 193 aa, chain + ## HITS:1 COG:MA1487 KEGG:ns NR:ns ## COG: MA1487 COG1309 # Protein_GI_number: 20090346 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 1 69 1 68 204 64 44.0 1e-10 MPRFTEQEKEIINRKLLIEGEKLFAAHGLKKVTVDDLVAAVNISKGSFYAFYPSKEHLYV EINFRLQKELFTNIETTIKGRKYENHRDLTKDVITLGVTGLINSPILSQIDLSLMDYLQR KLSPDIFDSHMHNDIHILEMLEDLGVIFTIPHTVIIKSLYSVLSCLEQFKEDKELNMIQN LLVNGIVQQVVAE >gi|229784000|gb|GG667735.1| GENE 4 1776 - 2621 421 281 aa, chain + ## HITS:1 COG:Rv2688c KEGG:ns NR:ns ## COG: Rv2688c COG1131 # Protein_GI_number: 15609825 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Mycobacterium tuberculosis H37Rv # 1 280 17 300 301 247 46.0 2e-65 
MIDVKKLYFSYTDKPFVENVSFHVGKGEIFGFLGPSGAGKSTIQKVLTGLNTKYKGSVKV AGTEIRERTNRFYENIGVDFEFSTCYEKFTARQNLAYFASLYEKQPRSIDELLHMVGLEN DGDKKVGDFSKGMRSRLNFIKALIHDPDILFLDEPTSGLDPTNNRLMKDIILAEKKRGKT VIITTHNMFDATELCDQVAFIVAGKVSALDSPHNLIMSRGAAKIQYTYYDKGEKTGECLL DRTTEDKLLKNLISENRLLSIHSSEPTLNDIFVDITGRTLQ >gi|229784000|gb|GG667735.1| GENE 5 2618 - 3301 320 227 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_3471 NR:ns ## KEGG: CDR20291_3471 # Name: not_defined # Def: putative ABC transporter permease # Organism: C.difficile_R20291 # Pathway: ABC transporters [PATH:cdl02010] # 1 227 1 227 227 343 91.0 3e-93 MRLGRLICGDIHFQWKYGFYFIYFILTVLYVCGIAALPEHWKTDIASIMIYSDPAAMGLF FMGAIVLLEKSQKVLNAMVVSPVKISEYILSKTVALIAISTVIALILGLVSGSNHLLGIA VGTALTSAIFTMLGIIAATKISNLNQFLIVIMPIEIVCFVPPIAGLFVTLPNVFCFFPFM ACMNLITGKSMLLSFDMVIVIATLIILYIIARDTVEHMWKSLGGVKL >gi|229784000|gb|GG667735.1| GENE 6 3298 - 3996 312 232 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_3472 NR:ns ## KEGG: CDR20291_3472 # Name: not_defined # Def: putative ABC transporter permease # Organism: C.difficile_R20291 # Pathway: not_defined # 1 232 1 232 232 333 90.0 5e-90 MKRILSSAIQVFKQIKSDPMMFAACFTPFIMGALIKFGIPFFERITKFSLQGYYPIFDLL LSIMAPVLLCFAFAMITLEEIDDKVSRYFSITPLGKAGYLFTRLVVPAIISAVIAFIVLL LFSLEKLPTRMMIGLALLGSVQAIIVSLMIITLSGNKLEGMAVTKLSALTLLGIPVPFFI DSYYQFAVGFLPSFWVAKAVQNEAVLYFPIALVVALIWYYFLIKRLFRKLAG >gi|229784000|gb|GG667735.1| GENE 7 4007 - 4936 401 309 aa, chain + ## HITS:1 COG:MA0617 KEGG:ns NR:ns ## COG: MA0617 COG0053 # Protein_GI_number: 20089506 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Methanosarcina acetivorans str.C2A # 9 295 20 310 331 207 40.0 2e-53 MKKNKILPNENVNTAIRVSTITLIINLILSLLKFVAGYIGKSSAMLSDAVHSASDVLSTI VVMVGIKISEKRPDKEHPYGHERMECVASIILAAALAVTGAGIGYSGINKIYSGQYNTLS VPGRIALTAAVLSIVIKEWMYWFTRGAAKHTNSDALMADAWHHRSDALSSVGSLIGILGA RLGYAILDPIASVVICGCILKAALDIFRESINKMVDHSCDNITETKIRELVLQQQGVEGI DDLKTRMFGAKMYVDIEILADGSLVLYDAHRIAEGVHKVIEHNFPQCKHCMVHVKTNELL PPFCIGRGR 
>gi|229784000|gb|GG667735.1| GENE 8 4969 - 5097 147 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEQYWWIIIVAVLIVVGLLKMVIMKKYNDKKNPKKRSHNDED >gi|229784000|gb|GG667735.1| GENE 9 5406 - 5984 252 192 aa, chain + ## HITS:1 COG:alr4567 KEGG:ns NR:ns ## COG: alr4567 COG1309 # Protein_GI_number: 17232059 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Nostoc sp. PCC 7120 # 3 186 2 187 201 67 22.0 1e-11 MGNKGEETKKLIREKACSLFARKGFKNVTMKDICVATGLSRGGLYRHYDSTQQVFSEIID DLMNIQDNEFSEKMKNGYPAVQILDEILERYRTEMLDSTGSLGLAIYEFYSEIQNDNNDN ILLKQYEHSVTVWKDFLSYGIQRGEFKNVESDELVDIIIFSYQGVRMLSEVFPVDEKVSA RIINHIKRVLLD >gi|229784000|gb|GG667735.1| GENE 10 6029 - 9163 1027 1044 aa, chain + ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 1042 1 1033 1035 1219 54.0 0 MFNKVETDLNFIERENKVKSFWKENQIFEKSIEARAQGTPYVFYDGPPTANGKPHIGHVL ARVFKDMVPRYRAMKGNMVPRKAGWDTHGLPVELEVEKNLGLDGKEQIETYGLEPFIAHC KESVWKYKGMWEDFSDIVGFWADMENPYVTYHNGFIESEWWALKQIWDKDLLYKGFKIVP YCPRCGTPLSSHEVAQGYKDLKERSAIVRFKVKQENAYFLAWTTTPWTLPSNVALCVNPA ERYVKVESGDGFVYYMAEALLEKVIGSLGEYKILENYKGSDLKYKEYEPLFPEAKDTIAK QQKRGFFVTCDNYVTLTDGTGVVHIAPAFGEDDSRVGKNYNLPFVQFVNSKGEMTEETAW PGVFCKNADPLVLNDLEKQGQLFSAPYFEHSYPHCWRCDTPLIYYARESWFIKMSAVKEN LIKNNNTINWVPENIGKGRFGDWLENVQDWGISRNRYWGTPLNIWECKCGHQQAIGSIDE LKSLSSNCPENIELHRPFLDSVTIICPKCGKEMYRVPEVIDCWFDSGAMPFAQHHYPFEN KELFETQFPADFICEGVDQTRGWFYSLLAISTLIFNKAPFKNVIVIGHVLDENGQKMSKS KGNSVDPFDVLKIYGADAVRWYFYINSAPWLPNRFHDKAVTEYQRKFMGTLWNTYAFFVL YGNIDNFNPAEYCLEYDNLSVMDKWLLSKLNTVVKAVDTYLETYKVTEAARTLQDFVDDM SNWYVRRSRNRFWAKGMEQDKINAYMTLYSALITICKVIAPMLPFMAEDIYQNLVRNVDK TMPESVHLCDFPKAHKEMIDKALERKMDEVLKIVVLGRAARNTANIKNRQPIKRMYIKAE SLPEFYQMIIRDELNVKEIEFVTDVQAFTSYTFKPQLKTVGPKYGKQLGNIKTALSELDG NRAMNRLNEQGFLSFEFDGANVELTKDDLLIDMTQKQGFVSEMDNTVIVVLDTDLTPDLI EEGFVREIISKVQTMRKEAGFEVTDHITVYVQNNKSISSVLKANETFIKSEVLAEQICYE 
QMEGYQKEWNINGEKVVLGVKKYV >gi|229784000|gb|GG667735.1| GENE 11 9688 - 9807 69 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMTQATGKQIKFHMVTIESLVPEKCALAAYRSIHRLFAI >gi|229784000|gb|GG667735.1| GENE 12 10246 - 11631 298 461 aa, chain + ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 30 461 677 1101 1118 255 36.0 1e-67 MSIWDQSLPEDKSLLPAGVPNRYNASGASYGTEYTQEQINEALESDNPFAVVPSTDTNGH GTFLAGIAAGGILPNQDFTGAAPECELVIVKLKPAKQYLRDFYLISNDADAYQENDIMMG IKYLRVEAYSQRKPLVILLGIGSNLGSHEGTSPLNVMIQDISRYLGMATVIAAGNETGRG HHYMGSVPIGEESVEVEIRVGNAESQRGFVVELWADTADTYSVGFVSPSGESIGRIPIIG RNETSIPFLLEPTVITVNYQLIETGAGKQLVFIRFAAPTNGIWRIRVYNTQYLTGQFNMW LPAHTLISDETVFLTPSPYTTITLPGDSPSPITVGAYNHLNNSIYIHSGRGYTIGGLIKP DLAAPGVNVSGPSIGQRNQDSIPMTTRTGTSVAAAHVAGAVANLLSWGIIEGHNIAMSEA TVKAFLIRGAKRNPALSYPNREWGYGALNLYETFLRLREIR >gi|229784000|gb|GG667735.1| GENE 13 11628 - 11699 56 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNYLKHQILMHITDTGELDGGFG Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:26:00 2011 Seq name: gi|229783999|gb|GG667736.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld129, whole genome shotgun sequence Length of sequence - 20902 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 10, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 14 - 73 6.7 1 1 Tu 1 . + CDS 164 - 847 436 ## Closa_2883 hypothetical protein + Term 1067 - 1124 10.4 + TRNA 993 - 1066 76.1 # Pro CGG 0 0 + TRNA 1259 - 1331 71.5 # Ala GGC 0 0 - Term 1239 - 1309 12.1 2 2 Op 1 . - CDS 1434 - 1982 436 ## CPR_0873 hypothetical protein 3 2 Op 2 2/0.000 - CDS 1998 - 2795 587 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 4 2 Op 3 . 
- CDS 2797 - 3474 632 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 5 3 Tu 1 . - CDS 4409 - 7564 2634 ## COG1201 Lhr-like helicases 6 4 Op 1 . - CDS 8497 - 9063 269 ## PROTEIN SUPPORTED gi|62317343|ref|YP_223196.1| DEAD-box ATP dependent DNA helicase 7 4 Op 2 . - CDS 9150 - 9431 306 ## Amet_3815 hypothetical protein 8 5 Tu 1 . - CDS 9538 - 10389 903 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 10420 - 10479 2.1 9 6 Tu 1 . + CDS 10852 - 11484 626 ## COG3601 Predicted membrane protein + Term 11532 - 11579 2.2 10 7 Op 1 . - CDS 11435 - 11551 56 ## 11 7 Op 2 . - CDS 11569 - 12969 1540 ## gi|266623854|ref|ZP_06116789.1| hypothetical protein CLOSTHATH_05177 - Prom 13002 - 13061 2.0 12 8 Op 1 . - CDS 13063 - 14028 758 ## Calkro_1115 hypothetical protein - Prom 14085 - 14144 4.3 13 8 Op 2 . - CDS 14181 - 15035 868 ## COG2017 Galactose mutarotase and related enzymes - Prom 15110 - 15169 80.4 - Term 16044 - 16107 14.6 14 9 Op 1 . - CDS 16136 - 17614 1605 ## COG2407 L-fucose isomerase and related proteins 15 9 Op 2 . - CDS 17675 - 18298 833 ## Closa_1357 hypothetical protein 16 9 Op 3 . - CDS 18318 - 19841 1769 ## COG1070 Sugar (pentulose and hexulose) kinases - Prom 19877 - 19936 3.7 17 10 Tu 1 . 
- CDS 19964 - 20818 837 ## COG2207 AraC-type DNA-binding domain-containing proteins Predicted protein(s) >gi|229783999|gb|GG667736.1| GENE 1 164 - 847 436 227 aa, chain + ## HITS:1 COG:no KEGG:Closa_2883 NR:ns ## KEGG: Closa_2883 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 12 224 1 213 219 230 54.0 3e-59 MNIEILFYGAIMQIITQRVLSDTVYSKGTAVLTYTIRYPYFTTTCSESAALSINAFYSAA AKKTEEYCHTVLASQASVQAEYAQKDNYPFFGYEFISAYTVTWNSGCMTSLYTDQYTYTG GAHGSTIRTSDTWDFSSGRYLTLADFYPHNPSYREGIYRNLEQQVKEREKESSSTYFDDY SSLIRSTFKPENFFVTPKGIVIYFQQYDIAPYSTGIPEFLIPFKRNR >gi|229783999|gb|GG667736.1| GENE 2 1434 - 1982 436 182 aa, chain - ## HITS:1 COG:no KEGG:CPR_0873 NR:ns ## KEGG: CPR_0873 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 1 157 1 155 162 144 44.0 2e-33 MKGVILRKGETAYSCLSGAFSAIQPAVTEHNWLITDYENVPELLEPYSGNQGREAFFWVP GERLDEIVHQQNFPWYWGVLSGFKREISLQEVLSHGLPYADGYDGFWKNPVSIQHPLAEL EIVAWDSTLTLLISRDEAIVQVFRNAYPLSEDLETYNLGEPESEEWRKIWEEAERYYGDG GR >gi|229783999|gb|GG667736.1| GENE 3 1998 - 2795 587 265 aa, chain - ## HITS:1 COG:SA0314 KEGG:ns NR:ns ## COG: SA0314 COG2110 # Protein_GI_number: 15926027 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Staphylococcus aureus N315 # 8 264 6 261 266 251 50.0 9e-67 MKDKNDMTQEERCIYLIRTMQREMPQYRGIVIPEEGAEQKRLLRSLMNVRPPMPASSDFL AVQDMYLSEENAARGILDGADLAPVKGNSRICLWQGDITRLKTGAIVNAANRALLGCFRP CHSCIDNIIHTCAGIQLRLTCNEIMEAQGCEEPAGSAKLTPGFNLPCDFILHTVGPVITG PLQRTDCRMLADCYRSCLELAAENHITSVAFCCISTGVFRFPQERAAEIAVETVAGFLEQ NESVRQVIFDVYTDKDMAIYRRLLK >gi|229783999|gb|GG667736.1| GENE 4 2797 - 3474 632 225 aa, chain - ## HITS:1 COG:SPy1215 KEGG:ns NR:ns ## COG: SPy1215 COG0846 # Protein_GI_number: 15675179 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Streptococcus pyogenes M1 GAS # 1 222 65 287 293 173 34.0 
2e-43 MYSAGFYPFSTQEEKWAYWSRHIYYNRYDVRIGKPYLDLLQLVKDRNYFILTTNVDHQFQ MAGFPEERIFAVQGDYGLFQCKKACHRQLYDNEKDVRAMVARQENLRIPTELVPKCPVCG GEMEVNLRCDRFFVEDEAWERACAGYSSFIRNNRNKKIVFLELGVGMNTPGIIKYPFWQM TDREKNGSYICINQGEAWGPEEIREKSIYVDMDLAKALEAVRGGL >gi|229783999|gb|GG667736.1| GENE 5 4409 - 7564 2634 1051 aa, chain - ## HITS:1 COG:Cgl0823 KEGG:ns NR:ns ## COG: Cgl0823 COG1201 # Protein_GI_number: 19552073 # Func_class: R General function prediction only # Function: Lhr-like helicases # Organism: Corynebacterium glutamicum # 1 995 170 1274 1478 378 28.0 1e-104 MSATIEPLELSAEYLSPEPAVICAPAMEKKVFIEVVGTTPANGRRKDPVWEELGMAVYQK CLDCKSVIAFSEGRRYAEKLAYYVNLLGGDGFARVHHGSLSKEQRAEVEQELREGRLRLL CATSSMELGIDVGDIDQVLQIGCPRTVSSTMQRLGRAGHNPGSVSVMYMYPRTAPESVYC GMTAETARLGGVEHTKPPRMCLDVLAQHLVSMAAGDGYRVDEVMDILSRAYSFRDVTRED VKSVLRMLAGDYEHGREIPVRPRVLYDRIHERVNGDAYSRMLAVAAGGTIPDKGLYAAKT EDGVKLGELDEEFVYESQIGDKFVLGSFAWKIVSQDRDTVVVTQVPAEGARLPFWKGETR GRTLRTSLAFGRLFRELGKAAENGPDALTDALNRLGLDDAAACSAASFLRRQIDATGGLP DDRTIVVERFTDHTGSNQVMVHALFGRRVNAPLSLLLQHTARQLCGIDIGSVDEEEGVLL YPYGNDVLPEGLLFQIDEKQARSILEAALPLTPLFNMTFRYNASRALMMGMKRNGRQPLW MQRLKSTEMLSSLIHEPGHPLIRETKRECLEDQWDIDGVLRILGDIRAGLITVREVCVDV PSPMSLPFQWQAEAAEMYEYSPVTQGIREAVYEELKTADMIKPQAEELARLQERKKLPED EEQLHSLLMMEGDLIAGELPVPVEWLENLAERELVAYMEPGIWIAAEHREEYEHALLDGN EEAAMHIVRRMLYYRGGQNAEAVRERYFFPEGLAEHILEKLAGESLVVEDEGIFYHAKIY DRARRASIKGMRLEAVTQPASHYAALMAGRAFLSAPGEEQLRRAVEQLCGRPYPVKFWES VFFARRVKNYTEAMLDRLLAQGDYFWQMKSDGTLAFYRYEDIDWEAPLTGLSEELEGDEK LMYQELMRRGASFLKFLTEVPKQDSARDVLLKLVEKGLVCADSFVPVRQWQNREKVKKAT ARQRVNVRVMALSAGRWDIVRPVRKRSLEEWLDLFLKESAVLCKETFRRSCDSCAGSAYM ISAGASAGDREEAGELTWRKALDVLRIWVAS >gi|229783999|gb|GG667736.1| GENE 6 8497 - 9063 269 189 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|62317343|ref|YP_223196.1| DEAD-box ATP dependent DNA helicase [Brucella abortus bv. 1 str. 
9-941] # 20 189 23 202 851 108 36 4e-23 MEVTMTEDVLKIFSEPTAGWFAETFGKATKVQEEAWPAIAAGKPVLVSAPTGTGKTLSAF LVFIDRLSTLAREGRLEEKLYLIYVSPLKSLAGDIRENLRRPLDGIGREEGQQEITVGIR TGDTPQKERQRMVKHPPHILIITPESLYLMLTSRSGQSVVKTAEAVIIDELHALIDTKRG AHLMLSVAR >gi|229783999|gb|GG667736.1| GENE 7 9150 - 9431 306 93 aa, chain - ## HITS:1 COG:no KEGG:Amet_3815 NR:ns ## KEGG: Amet_3815 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 1 86 1 86 88 119 67.0 3e-26 MEEKYCQSCGMPMGETDELYGTNEDGSRNGDYCNYCFQDGAFTTDCTMEEMIEGCVPMMA GANSGMSEEEARNMMRQFFPELKRWKKAPVNPE >gi|229783999|gb|GG667736.1| GENE 8 9538 - 10389 903 283 aa, chain - ## HITS:1 COG:PA0762 KEGG:ns NR:ns ## COG: PA0762 COG1595 # Protein_GI_number: 15595959 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 6 99 4 97 193 68 37.0 1e-11 MDEKKREDTEMLVSRAVEGDQEALEQLLTEVQDLVFNLSLRMLGTIQDAEDASQEILIRV MTNLASFRGESAFSTWVFRIAVNHLKNYRKSMFARQPLSFEYYGEDIVSGKEKDIPDLTM GVDRNLLEQELKLSCTNVMLQCLDADSRCIYILGTMFRVDSRLAAEILDLSPEAYRQRLS RIRKKMAGFLEEYCGLSGKGVCSCGKRVSYAIASHRVNPEKPEYTSLKENTEFLLECKTA MEEIDDLSMVFSSLPAYRTTKAAREYLDEFLKSDCYSTISNAL >gi|229783999|gb|GG667736.1| GENE 9 10852 - 11484 626 210 aa, chain + ## HITS:1 COG:CAC2841 KEGG:ns NR:ns ## COG: CAC2841 COG3601 # Protein_GI_number: 15896096 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 197 1 193 209 134 41.0 1e-31 MKENKLMSTKNVVLMGMFGALAGVLMIFEIPLPFIAPSFYGLDLSEVPILVGTFALGPVA GAVMEIVKILVKLVLKPTSTGFVGEFANLAVGCSLILPAGILYRFNKTKKGAVLGMAAGT VMMAVVGVVINALIMIPFYSNLMPLETIIQAGAAINPAISNVWTFAIICVGPFNIIKGVT VSLITALVYKRISVIIHSAGRSERRTQART >gi|229783999|gb|GG667736.1| GENE 10 11435 - 11551 56 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYIGCGPFDSLRVFGLTVKGFRVTSVPESCAQTFRHYG >gi|229783999|gb|GG667736.1| GENE 11 11569 - 12969 1540 466 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623854|ref|ZP_06116789.1| ## 
NR: gi|266623854|ref|ZP_06116789.1| hypothetical protein CLOSTHATH_05177 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05177 [Clostridium hathewayi DSM 13479] # 1 466 17 482 482 930 100.0 0 MTGYFRKIFTILVLAVMLAGCTKETAEKLEEQPLEEREVTVVSNGISYVPLENWTNDLIT ERDKDGNEQETAACGLWLTPEKIVKESVGEYHMPEVFYADDFKIKVKGNGDNPIDYIIYS MGFDTVDSGKVMNPSDILIPVNGKDYYVQLEFSQGDEKNYSHYQYFFKVVKGDPLPQIRV SHERKSREIPSASGVILWNEPGDDGDTVSTAACGPEPFTVLAKDEREIPYVPLGRWITIE LTGDKQPDEAILSDYILKEDGSPRYDEKTVITTELEYDNHEMKFCLPTNVNAMLSSDMSA YMTGGILRGFKLELVWDDGSRAEYGFILSTDASFVGEDPAAVEADPVFKPESDLFLDLKR SEDEPFEKKLILEINNQSGSPVLFGDEMKLMRIDGVEMTEVPVNPEVIWNSITYQVSSGS YKALPVDLERLYDELESGRYRIIKGVRPLGGSRRELYADFIVEPRE >gi|229783999|gb|GG667736.1| GENE 12 13063 - 14028 758 321 aa, chain - ## HITS:1 COG:no KEGG:Calkro_1115 NR:ns ## KEGG: Calkro_1115 # Name: not_defined # Def: hypothetical protein # Organism: C.kronotskyensis # Pathway: not_defined # 13 316 14 311 313 248 45.0 4e-64 MRGTIRYTVLCGSLILAVLSAGCRTGGGNITYAGEMVPAASAPQAEEEIIYLTGDDEVPD KPSEKRDSLINPEGKTLEERFYTPEGYTRIPREMGSFQGYLRRYVMKPDQSPVLLYDGSG KRNQSAHAAVFSMPLIDGDLQQCADSVIRIYSEYFWKMGDYDKIAFHLTNGFLMDYPAWK AGKRLMVDGNETSWVAKAAADDSYETFLKYLRTVMIYAGTLSLDGETVPVNVSDIQAGDM FIKGGSPGHCVMVADVAQNRDGSRCFLLAQGYMPAQEFQIIKNPLHESDPWYYASELEYP LVTPEYVFDEGSLKRWYGFVY >gi|229783999|gb|GG667736.1| GENE 13 14181 - 15035 868 284 aa, chain - ## HITS:1 COG:DR0747 KEGG:ns NR:ns ## COG: DR0747 COG2017 # Protein_GI_number: 15805773 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Deinococcus radiodurans # 66 281 88 295 298 65 29.0 1e-10 MEVFVNPADGMNIYQIDWEGRHMIDWEEARYQRKATYGVPVLYPTPNRSENLKIQAYGKQ YDARMHGLVKNLPFQVKTAETDGQAALVTGVLEWNEEQPDFAMFPFPSTLSITVKALPDE VVWSYQVDNRGEGELSYGIAIHPYFSKREQEVKISVPAASVMEMTEEKIPTGKLIPAGET VPDLRTPVLVDGLSLDHVYTDCPAGAYAEIFYKDCKVKLEATEEFGHIVVFTPDAPFFCV ENQSCSTDCFNLFAKGYARESGLQAVHAGQKKSGEVRFVFEKQE >gi|229783999|gb|GG667736.1| GENE 14 16136 - 17614 1605 492 aa, chain - ## HITS:1 COG:CAC2610 
KEGG:ns NR:ns ## COG: CAC2610 COG2407 # Protein_GI_number: 15895868 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Clostridium acetobutylicum # 1 491 1 489 490 771 74.0 0 MNNIPEVKVGIVAVSRDCFPESLSVNRRKALVEAYAAKYDVKNIYECPVCIVESEIHMVQ ALEDVKAAGCNALVVYLGNFGPEISETLLAKHFDGPSMFIAAAEESGDNLTQGRGDAYCG MLNASYNLRLRNIKAYIPEYPVGTAEECADMIEEFLPIARTLVGLSSLKIISFGPRPLNF LACNAPIKQLYNLGVEIEENSELDLFEAFNKHADDPRIADVMKDMEKELGEGNLKPEILP KLAQYELTLLDWVEEHRGYRKYVAIAGKCWPAFQTQFGFVPCYVNSRLTGMGIPVSCEVD IYGALSEFIGTCVSQDAVTLLDINNTVPDDMYDGDIKGKFNYTHKDTFMGFHCGNTCSRK LSACSMKYQMIMARALPEEVTQGTLEGDILPGDITFFRLQSTADCQLRAYVAQGEVLPVA TRSFGSIGVFAIPEMGRFYRHVLIEKNYPHHGAVAFGHFGKALFEVFKYLGVEDIGFNQP KGMLYPSENPFA >gi|229783999|gb|GG667736.1| GENE 15 17675 - 18298 833 207 aa, chain - ## HITS:1 COG:no KEGG:Closa_1357 NR:ns ## KEGG: Closa_1357 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 207 1 208 208 302 69.0 5e-81 MGMQIKDISDKAFSAYGKVITGVDTEGILKEMERTPLPDDVIYVPSVPELEALPEAEIIK NNVYGELPIQIGYCNGHNKKLNAVEYHRNSEINIAVTDLILIIGKQQDIEADYTYDTSLM EAFLVPAGTVIEVYGTTLHYAPCHVEDGGFRCVVVLPKGTNTDMEPLEIKNKEDELLFVR NKWLIGHEEGGLPENAYIGLKGENLSI >gi|229783999|gb|GG667736.1| GENE 16 18318 - 19841 1769 507 aa, chain - ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 1 499 1 499 500 535 51.0 1e-152 MEYLIGIDVGTSATKTVLFDKKGNIIASASREYPLYQPHNGWAEQKPEDWRDAVIATVEE VVKRAGAEKDEVKGVGISGQMHGLVMLDEDGEVIRPSIIWCDQRTGEEVEEMLKIMPRER WIEITANPPLTGWTAAKILWVRNHEPENYERCKHILLPKDYIRYILTGVFATEVSDASGM QLLDVPNRCWSDEVLEKLNIDKSLLGKVYESCEVTGTVLPDIAEKTGLSTGTKVVGGAGD NAAAAVGTGVVKDGTAFTTIGTSGVVFAHSSQVTIDPLGRVHTCCAAVPGTWHVMGVTQA AGLSLKWFKDNFCQDYVEEAEREGVDVYDLINRDIAQIPVGSDKLIYLPYLMGERTPHLD PDCRGVFFGLSAIHTKAHMLRAVMEGVSYSLSDCNDILKEMGIDVDSMMACGGGGKSPVW 
RQMLADLYDCSVNTVRQTEGPALGVAILAGVGCGIYESVEAACDELITEAQSTDADQDAA GKYKKYHELYKELYGDLKDSYKKLAAL >gi|229783999|gb|GG667736.1| GENE 17 19964 - 20818 837 284 aa, chain - ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 11 272 11 288 292 100 25.0 2e-21 MDSSHEMIIPNDNLPFKLFLFEGGEGKYKRASHWHRSIEIFLVMKGEIDFYLGERHFSIG EKNFIIVNSNEVHSIDAPLPNRTLVLQIPVQTFEVYLQEQPYLSFSRRSGEQNEKLMGLV MEMYRTYENKAYAWELKVSGLFEFVKYLLLTEFKDQAQAPDIIRQKIHLEKLSEITEYMK NHYDEPLSLEKVADRFGFSPTYLSRIFKRYANISYRDYLQDLRVEYAVKEMVHTDHELGD IAVNHGFADSRAFAKAFAKRYGCLPSEYRKKLKEERGESSEGKR Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:26:50 2011 Seq name: gi|229783998|gb|GG667737.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld130, whole genome shotgun sequence Length of sequence - 18739 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 6, operones - 5 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 563 440 ## COG2942 N-acyl-D-glucosamine 2-epimerase 2 1 Op 2 . + CDS 550 - 1968 638 ## COG3119 Arylsulfatase A and related enzymes 3 1 Op 3 . + CDS 2058 - 3176 821 ## COG1940 Transcriptional regulator/sugar kinase 4 1 Op 4 . + CDS 3208 - 4524 1106 ## COG5434 Endopolygalacturonase + Term 4668 - 4711 11.6 5 2 Op 1 35/0.000 + CDS 5163 - 6488 1002 ## COG1653 ABC-type sugar transport system, periplasmic component 6 2 Op 2 . + CDS 6514 - 7470 793 ## COG1175 ABC-type sugar transport systems, permease components 7 2 Op 3 . + CDS 7481 - 7687 188 ## gi|266623867|ref|ZP_06116802.1| ABC transporter, permease protein 8 3 Op 1 1/0.000 + CDS 8625 - 9257 584 ## COG0395 ABC-type sugar transport system, permease component 9 3 Op 2 1/0.000 + CDS 9272 - 10774 1236 ## COG3534 Alpha-L-arabinofuranosidase 10 3 Op 3 . 
+ CDS 10843 - 11826 914 ## COG1609 Transcriptional regulators 11 3 Op 4 . + CDS 11842 - 13068 485 ## COG3940 Predicted beta-xylosidase + Term 13167 - 13211 -1.0 + Prom 13338 - 13397 4.8 12 4 Tu 1 25/0.000 + CDS 13452 - 13592 199 ## COG1192 ATPases involved in chromosome partitioning + Prom 14494 - 14553 18.5 13 5 Op 1 . + CDS 14649 - 15575 687 ## COG1475 Predicted transcriptional regulators 14 5 Op 2 . + CDS 15588 - 15863 86 ## gi|295115243|emb|CBL36090.1| hypothetical protein 15 5 Op 3 . + CDS 15893 - 16078 164 ## gi|288871227|ref|ZP_06116811.2| conserved hypothetical protein + Term 16082 - 16121 5.1 16 6 Op 1 . + CDS 16146 - 17015 511 ## BLJ_1240 hypothetical protein 17 6 Op 2 . + CDS 17050 - 17796 504 ## COG3617 Prophage antirepressor 18 6 Op 3 . + CDS 17819 - 18130 328 ## gi|309776218|ref|ZP_07671209.1| hypothetical protein HMPREF0983_01773 19 6 Op 4 . + CDS 18143 - 18562 469 ## gi|266623880|ref|ZP_06116815.1| conserved hypothetical protein Predicted protein(s) >gi|229783998|gb|GG667737.1| GENE 1 3 - 563 440 186 aa, chain + ## HITS:1 COG:slr1975 KEGG:ns NR:ns ## COG: slr1975 COG2942 # Protein_GI_number: 16330802 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Synechocystis # 1 173 223 387 391 130 38.0 1e-30 TLQGGYIDSEEGHRINPGHIFESMWFVLEQAERVGRLEYEERALHTIMATFQRSRDEIYG GILHMLSDNGTDEQYKDWNAARNLKWDEKVWWTQAEALCALLTFADRTGDDEVWKEFKTL FLWCREHFFDPEYREWYAVLNRNGSPRIMLKGGIQKAAFHIPRALFQCSCILKKYVVETG EYDAQT >gi|229783998|gb|GG667737.1| GENE 2 550 - 1968 638 472 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 2 409 29 459 497 140 28.0 4e-33 MRRPNIVFILADDMGFWTLGSAGNRDAVTPNLDEMAREGCIAENFFCSSPVCSPARATLL TGRMPSMHGILDWILRGNIKNEGEIPIEYLNDFKGYTDYLSEAGYICGLSGKWHLGDSQK QQKGFSHWYVHQSGGGSYYDAPMIREGKRVCEQGYITELITRDAVRFLNEHAGKNAPFYL 
GVNFTAPHTPWIHNHPQKYLDLYRDCAFDSCPVEQRHPWQIDFAEFNYDRTEMLKGYFAA TSALDVGVGEIREELKRLKLDQDTLILFSSDNGFNCGHHGIWGKGNGTAPFNMYDTSVKV PFLACMPGKIQPGTRLRGLYSAYDFFPTIMEIAGVQYKEKGLPGKSFAKAVFSGEERDIN DCVVVYSEYGAVRMIRQKEWKYIRRYPEGPDELYNLKTDPDEMRNMIDKAAPELIELLNK RLDSWFSEHTRPETDGRQANVTGAGQNRKYTNHGFEPGSFEKGYETLPIRQA >gi|229783998|gb|GG667737.1| GENE 3 2058 - 3176 821 372 aa, chain + ## HITS:1 COG:BS_xylR KEGG:ns NR:ns ## COG: BS_xylR COG1940 # Protein_GI_number: 16078822 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Bacillus subtilis # 43 327 31 298 350 97 25.0 4e-20 MKHVSRKSLLELSHYKAPTLYRVIEDLVQNGYAEVSRCEDYGSKGRPNDLLSMNSRHAYV FSICILRETFCCAVVDFSNEILAMEEHGMVPGMTPEQLADIAFRDFREACEHLGLAEDSF LEIGLSAVGPLDYANGLMLRPLYFLAGDWKNIPISKILQTRFNRNVYLDCNARAALMGNY LPDYFEKHQNVVYVTIGNGIGSGLIINRKLMLNHNIILDGFAHMTIDMDGRKCTCGEYGC VEAYVSVPAILAQCVEEMRLGIDSSMKSCMDHLSLQDLQSAVRTKDALAVIQINKAANIF AKCILNYLRMVELEAVVLGGGLIEAIPEFYDRVEQYAKERNSSVTFYRSTEKEKNILRGI ASEIVMRYIFGE >gi|229783998|gb|GG667737.1| GENE 4 3208 - 4524 1106 438 aa, chain + ## HITS:1 COG:CAC0355 KEGG:ns NR:ns ## COG: CAC0355 COG5434 # Protein_GI_number: 15893646 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Clostridium acetobutylicum # 16 405 95 463 513 144 28.0 3e-34 MRKVWKTSEYHIDNTGERLVTEALQALIDETSQAGGILVLEKGTYLTAPLFLKNGMEFHF EEGAVLKGTTDETKIPVIPTRAAGIGMDWYPGVLNCNGQKDVIISGNGTIDGQGEYWWRK YWGDDGKSGMRGEYDKKGLRWACDYDCMRVRNVVIMESSRITLKDITSMRSGFWNIHICY SDHIHVDGIKIASCGGESPSTDGIDIDSCHDVLVENCVTDCNDDSICIKSGRDADGIRVN RPCHDITVQNCEIRAGFGVTIGSEVSGGVYQITLKNLRYHGTDCGFRIKSSVARHGYIRD VRVEGLSMVNVKYPFHFFLNWNPAYSYCELPGDYEGEIPEHWKKLLEAIPDSVPKTKVSN ITIENVTARNEADYNGISRAFHIEGFEDQPVEHVIFKNVSLACREFGVINHTKEIGFQNV TVSVSGAHDEKNDSYDNR >gi|229783998|gb|GG667737.1| GENE 5 5163 - 6488 1002 441 aa, chain + ## HITS:1 COG:BH1079 KEGG:ns NR:ns ## COG: BH1079 COG1653 # Protein_GI_number: 15613642 # Func_class: G Carbohydrate transport and metabolism 
# Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 5 439 10 450 459 130 27.0 8e-30 MLAGVMACSLAGCNTKSPMAGKEEAATAADSQSDSNTQAAQGENEAEKQEKAAGPITIQF WNAFTGTDGDVLREIVDRYNKENGKDITIEMDIMPASSLEEKLPAAIASKTAPALIIRGN FDTATYTDNGILTPLDDFFEKTGVDSSNFSEASLQALKYNGTQMMIPMQVHSTFLYWNKD LFEAAGLDPETPPATWEEAGEYALKISDPSRKLYGIGFPISGAPCYFDAMFKANGGDVFS SDGKSSVLDSPENLKTLQYIQSLVEQGAAPVGSTGADTDNLMLAGQMGIYCGGPWLISGL KEAEINFGVTGMPAGDKEAAGVIEVQGLGVTSTASEEEKAAAYDFIAYWNSDATCKEWSL RNGYPPYLKSLAEDADIKADATVNALSSIADFGFSFAPGIKPVKQINNDILFPMIENVVA GNDPQEELTKASESIDKLLSE >gi|229783998|gb|GG667737.1| GENE 6 6514 - 7470 793 318 aa, chain + ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 40 295 15 268 292 174 39.0 2e-43 MKNIKTGKSPANPSVDKQGESPARLFSRKYWNKPERVAYLFMLPSILVLFVFAIIPLIAC LVISLFDMNIFFTDTSFAGVHNFTMACQDTRFWNALLNTVVFVLFEVPLQILIGLLTANA LVRTSLFNKAARSIFFLPVVCSMAAVGIIWSILLDTNIGFIPYVLEQIGIHNVTFFRNAK TAMPTVIAMTVWKNFGYTMSILVVGIQGISQSYYEASEIDGAGKTAQFFKITLPLLKPTL GFCLITNTIGSLQVFDQVYVTTQGGPQFKTETLVQYIYKTGFNQPYNLGYACALSVLLLV VILAISLPMYKNMFINQN >gi|229783998|gb|GG667737.1| GENE 7 7481 - 7687 188 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623867|ref|ZP_06116802.1| ## NR: gi|266623867|ref|ZP_06116802.1| ABC transporter, permease protein [Clostridium hathewayi DSM 13479] ABC transporter, permease protein [Clostridium hathewayi DSM 13479] # 1 65 1 65 65 104 100.0 2e-21 MNEKKKRRVTLLGVVKILFTVSLIVIVLFPLVWMAVGSFKMEKEILGYPPTVFGTKYTLK SFQRILAS >gi|229783998|gb|GG667737.1| GENE 8 8625 - 9257 584 210 aa, chain + ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 1 210 74 282 282 172 49.0 3e-43 
MAAYIKNTVIFAGGSTFLAVLFDSLTGYAFARLNFKGKDILFMLVLLTMMVPFQVMMIPL FLESNLLGLLDTYGGLILPKATSAFGIFMMRSYFAALPKDLEEAARVDGMSEFGIFAKIM FPLVIPGVLTLAIFHLMQNWNDLLYPLMMTSSTRMRTLSAGLALFVGEHATTYYGPQLAG ALLSVLPLLVIYIFFQKYFIASVATSGMKD >gi|229783998|gb|GG667737.1| GENE 9 9272 - 10774 1236 500 aa, chain + ## HITS:1 COG:BS_abfA KEGG:ns NR:ns ## COG: BS_abfA COG3534 # Protein_GI_number: 16079924 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Bacillus subtilis # 1 498 1 497 500 510 48.0 1e-144 MQKARITINREFKKGSVDDRIYGSFVEHMGRVVYSGIYEPDHPLADEDGFRKDVLEAVRN AGITTIRYPGGNFVSCYNWEDGIGPKEERPRRPELAWKAIETNEFGTDEFMKWCEKARVS PMLAVNLGTGSIREALNYLEYCNFKDGTTYSDKRRRNGREEPYRVRTWCLGNEMDGEWQL GHKAASEYGRLVRETAKAMKILDPGVELVSCGSSLNIMTTFPEWEARSLDETYEYVDYIS LHQYFDGHEKPVESFLAQADEMSDYIKTVTSVCDYIKAKKRSSRQLYISFDEWGVWTRAS KETVRECDESPWQQAMPISEMVYSFQDALLFGGMLLAILKHADRVKIACQSLLTNVSAMI MTEKGGAMWLQPIYYPFADTAAYGHGYVMDCRVTTAALSHEKGEIPLLDTAAVENGDEIA VFMINRSSSEEMEVSLDLQGYETETVLEHRVLSSDDPEAVNSPDHQPVRPELKDGAVIDG GKVTVVLDRLSWNVLRIKVK >gi|229783998|gb|GG667737.1| GENE 10 10843 - 11826 914 327 aa, chain + ## HITS:1 COG:BH0901 KEGG:ns NR:ns ## COG: BH0901 COG1609 # Protein_GI_number: 15613464 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 327 1 329 329 272 43.0 5e-73 MGVTTNDIARICQVSRTTVIRALNNQGRISKETKDRIVKTAEELGYRPDLLARGLVKGKT MYIGVVVFDVKNQYFAQMLSAIEAEAQTRGYCVNITLHGKNREKEIDLIRKLVDYHVDGL ILSPVNKGEHFNNFLKSLDTPLIVIGNKVSDEIPFVGIDERRAAKEATDKILDSGYERVV FVCPPLGESREGNNYTHEQRLDGFREAMAGRSVEAVIISSEYYEKEAASILETNDCNTAF FCSGDIYAMDIMKHLKSCGKQAPRDYGIMGFDDIELLEYLSPRLSTIYNSVEEVAKRAVE LLFDEMEGRKIAMETYERYRLVQGETL >gi|229783998|gb|GG667737.1| GENE 11 11842 - 13068 485 408 aa, chain + ## HITS:1 COG:CAC1529 KEGG:ns NR:ns ## COG: CAC1529 COG3940 # Protein_GI_number: 15894807 # Func_class: R General function prediction only # Function: Predicted beta-xylosidase # Organism: Clostridium acetobutylicum # 14 320 9 311 327 327 53.0 2e-89 
MDKDHYVSTEYNIPLIENRADPYICKHEDGTYYFTASVPEYDRIILRKADTIDGLKSAAE KTLWVRHDSGPMSCHIWAPEIHYISGAWYIYFAAGDRDDIWKIRPYVLRNKGNPMEDEWE ELGPMKAVEEQEPDKFSFQDFSLDMTVFQYQEKWYCVWAEKVNIGKKISNLYIAEMETPN RLKTAQVLLSAPDYEWERRGFWVNEGAAVLKKNGKLFLTYSASSTGADYCMGMLSLRRGG DPLDPQDWTKSRKPVVKTDVEKGIFGPGHNCFVKSEDGLTDIMVFHARQYDKIQGDPLYD ENRHTYTLPVEWDENEEPVFRFRKNRRPNILMMVVDHQAFYGHSRVQTPYFDRLVEEGVS FERSQGCTSRSAVQQEIICCSNRIKMVQKPRKIKVFRGFTVSKRADIV >gi|229783998|gb|GG667737.1| GENE 12 13452 - 13592 199 46 aa, chain + ## HITS:1 COG:Cgl1387 KEGG:ns NR:ns ## COG: Cgl1387 COG1192 # Protein_GI_number: 19552637 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Corynebacterium glutamicum # 5 46 38 79 290 64 73.0 4e-11 MNTQIIAIANQKGGVGKTTTCANLGIGLAQAGKKVLLIDGDPQGSL >gi|229783998|gb|GG667737.1| GENE 13 14649 - 15575 687 308 aa, chain + ## HITS:1 COG:BH4057 KEGG:ns NR:ns ## COG: BH4057 COG1475 # Protein_GI_number: 15616619 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 25 217 18 193 288 85 33.0 1e-16 MKSSAKKIELASVDDLFSTEEGRQDAKLEKIQEIPLSELHPFKNHPFKVRDDEAMMETAD SIKQYGVLVPAIARPDPEGGYELVAGHRRHRASELAEKETMPVIVRDLDDDAATIIMVDS NLQRESLLPSERAFAYKMKLDAIKHQGARTDLTSAQVGPKLTAAEKIAENSPDSKSQIKR FIRLTELIPELMDMVDEKKIALNPAYELSFLKKEEQIDLLDAMDSEQATPSLSQAQRLKK YSQEGHLTLDMMRVIMGEEKKSDLDRVTFTSDTLRKYFPKSYTPQRMQETIIKLLEAWQK KRQRDQER >gi|229783998|gb|GG667737.1| GENE 14 15588 - 15863 86 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295115243|emb|CBL36090.1| ## NR: gi|295115243|emb|CBL36090.1| hypothetical protein [butyrate-producing bacterium SM4/1] # 1 91 1 91 91 169 98.0 4e-41 MRDISARELKGHNILAVERFWDNTRWMIEFSVLRPSTAYGSPGDEMRLFLTEDGYQAALQ SQQRREIKIKRYARVIEGHILDFKPGKRRRS >gi|229783998|gb|GG667737.1| GENE 15 15893 - 16078 164 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871227|ref|ZP_06116811.2| ## NR: gi|288871227|ref|ZP_06116811.2| conserved hypothetical protein [Clostridium 
hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 61 3 63 63 104 100.0 2e-21 MCTVKEIQEAIREDCKSQHDMDSTDIDRVMQTLPFAQEEPFTEARPQKPLFWFYARKFEK R >gi|229783998|gb|GG667737.1| GENE 16 16146 - 17015 511 289 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 289 80 375 381 343 58.0 7e-93 MAVFRIERTRDYTVMSNHHLRNANLSLKAKGLLSMMLSLPEDWNYTTRGLAKICKEGVDA IGAALRELEGAGYIVRHQRRDKSGRITDTEYVIYEQPQPDMSQPDTASPDTENPDMVRPD MEKPAELNIEKSNTEKSITYGSSTDSIPFRETAAARPPERKGRDAMSVTEIENYRELILE NIEYDCLKQRYPLYLDDLNEIVELLVETVCAKRKTTRISGADFPHEIVRSRFLKLDSSHI EFVMDCLQKNTTQVRNMKQYLLAVLFNAPTTMNNHYTSLVNHDMHAGGW >gi|229783998|gb|GG667737.1| GENE 17 17050 - 17796 504 248 aa, chain + ## HITS:1 COG:lin2418_1 KEGG:ns NR:ns ## COG: lin2418_1 COG3617 # Protein_GI_number: 16801480 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Listeria innocua # 1 131 1 128 128 100 39.0 4e-21 MNQMEIFKNPEFGSIRVIEENGKYLFSGTDVAAALGYSNPRDAIIRHCRYVVKRDAPHPQ SPDRKISMTFIPEGDLYRLIVHSKLPSAERFERWVFDEVLPTIRKHGAYLTKEKLWEVAT SPEALMKLCSDLLAEREANISLRKENAQLEGKAAFYDLFIDLKHSTNLRTTAKELDVPER RFVRFLIERRFVYRTASGNVLPYANVKNAGLFCVKDYCNHGHTGSYTLVTPQGKLYFAQL REMILLVL >gi|229783998|gb|GG667737.1| GENE 18 17819 - 18130 328 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|309776218|ref|ZP_07671209.1| ## NR: gi|309776218|ref|ZP_07671209.1| hypothetical protein HMPREF0983_01773 [Erysipelotrichaceae bacterium 3_1_53] hypothetical protein HMPREF0983_01773 [Erysipelotrichaceae bacterium 3_1_53] # 1 103 4 106 106 194 100.0 2e-48 MQEGKTIGQLMEEMRQKAGAQNYHGHDYMDLQRFAENTRHMIIFDVLTNDSPVGWKGERT RLFLSDIGYEKALDSQAKGQIKILSHAKVRNGDLFYDHKEQIR >gi|229783998|gb|GG667737.1| GENE 19 18143 - 18562 469 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623880|ref|ZP_06116815.1| ## NR: gi|266623880|ref|ZP_06116815.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved 
hypothetical protein [Erysipelotrichaceae bacterium 3_1_53] conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Erysipelotrichaceae bacterium 3_1_53] # 1 139 1 139 139 269 100.0 6e-71 MNEEKKVPFKWEYGEETISLQLGMYANNQRLYIGMITHTEDGAEPFADMTVNLPGYSLDP GEAFISGDISKDLLRFIKENKLGKVLPYQVQSGYGKYSAVAFDLEKLKVFDPKGVAEFRE EWNLPDKKPVKKKSRGMER Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:27:34 2011 Seq name: gi|229783997|gb|GG667738.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld131, whole genome shotgun sequence Length of sequence - 24093 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 10, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 429 405 ## COG1686 D-alanyl-D-alanine carboxypeptidase 2 1 Op 2 . - CDS 432 - 1181 839 ## COG3022 Uncharacterized protein conserved in bacteria 3 1 Op 3 . - CDS 1178 - 2533 1006 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase - Prom 2577 - 2636 7.5 + Prom 2533 - 2592 4.7 4 2 Tu 1 . + CDS 2613 - 3626 826 ## COG0524 Sugar kinases, ribokinase family 5 3 Op 1 . - CDS 4560 - 5789 1596 ## COG1171 Threonine dehydratase 6 3 Op 2 10/0.000 - CDS 5793 - 6734 864 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 7 3 Op 3 6/0.000 - CDS 6748 - 10308 4037 ## COG1196 Chromosome segregation ATPases - Prom 10347 - 10406 80.4 8 4 Tu 1 7/0.000 - CDS 11247 - 11945 790 ## COG0571 dsRNA-specific ribonuclease - Term 11963 - 12014 11.5 9 5 Op 1 3/0.000 - CDS 12023 - 12253 392 ## COG0236 Acyl carrier protein 10 5 Op 2 . - CDS 12272 - 13288 872 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme 11 5 Op 3 . - CDS 13377 - 14222 884 ## COG2200 FOG: EAL domain - Prom 14265 - 14324 16.3 12 6 Tu 1 . - CDS 15226 - 15597 362 ## Elen_2577 diguanylate cyclase/phosphodiesterase - Prom 15634 - 15693 6.7 + Prom 15727 - 15786 4.8 13 7 Op 1 . 
+ CDS 15868 - 16344 403 ## Closa_2314 response regulator receiver protein 14 7 Op 2 . + CDS 16316 - 16936 435 ## Closa_2313 hypothetical protein - Term 16818 - 16861 -0.3 15 8 Op 1 . - CDS 16955 - 17389 441 ## Ethha_0934 transcriptional regulator, MarR family 16 8 Op 2 . - CDS 17386 - 17787 355 ## COG2015 Alkyl sulfatase and related hydrolases 17 9 Tu 1 . - CDS 18714 - 19646 1051 ## COG2015 Alkyl sulfatase and related hydrolases - Prom 19690 - 19749 6.7 18 10 Op 1 . - CDS 19758 - 21086 1095 ## COG0534 Na+-driven multidrug efflux pump 19 10 Op 2 . - CDS 21122 - 21958 823 ## COG1082 Sugar phosphate isomerases/epimerases 20 10 Op 3 . - CDS 21955 - 23568 1266 ## COG5434 Endopolygalacturonase 21 10 Op 4 . - CDS 23573 - 24091 458 ## COG0395 ABC-type sugar transport system, permease component Predicted protein(s) >gi|229783997|gb|GG667738.1| GENE 1 3 - 429 405 142 aa, chain - ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 8 141 4 136 425 95 38.0 4e-20 MRHKKNILFYFLLCMAVAFTGLTPMTVYGAADWPSEVAMEADGGILMDADTGTILYGKNI NEAYYPASITKLLTAQVVLDRCRLDEVVEFSQNAVYNVEQGSKSLSMDTGDRMSVKDCLY GLILHSANEVANALAEHAAGST >gi|229783997|gb|GG667738.1| GENE 2 432 - 1181 839 249 aa, chain - ## HITS:1 COG:STM0005 KEGG:ns NR:ns ## COG: STM0005 COG3022 # Protein_GI_number: 16763395 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 1 245 1 252 257 146 36.0 4e-35 MKIIISPAKKMRVQTDIMEETGMPVFLEDAKQLHGLLKEYSMEQLGKLFGANEAITRQNY ERYQTMDLERGLTPAVLAYVGLQYQSMAPDIFTGSQWDYVKDHLRILSGFYGVLKACDGV VPYRLEMQARLPVDGNRDLYGFWGSRIYEELTADGDSVIINLASKEYSRAVEPWLKASDT FITCVFGTEQGGKVRVKATAAKMARGEMVRFMAANGIECVGKLKEFTALGYRYREEYSCE KEYVYVQET >gi|229783997|gb|GG667738.1| GENE 3 1178 - 2533 1006 451 aa, chain - ## HITS:1 COG:ydaJ KEGG:ns NR:ns ## COG: ydaJ COG1473 # 
Protein_GI_number: 16129299 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Escherichia coli K12 # 20 421 24 421 441 291 45.0 2e-78 MYDQITAMAKEIFERSVARRRDLHRHPETGWLEMRTSAVIAGILTELGYEVLTGSNVLDG EARLGLPSEAALREHANLIRENWNTPLTYLTEEMEAGYTGVIGILRCGPGPVCALRFDID ALGVLEDESSGHRPAREGFASLCPGVMHACGHDGHTAIGLGVAEVLMSVKEQLHGTVKLI FQPAEEGVRGARAIVSKGHLDDVDYFIGTHLAPLDGPDDGAVTPATYGSLATTKYDVIYH GAPAHAGGFPEKGRNALLAASCAAMGLAAIPRHSEGASRVNTGTLHAGTGRNVIPDYAKM EIELRGETTAVNAYMEDSARRICEAAAAMYGCTCEMICMGAADSHHSDLALADRAAALIE RELPDIRVSSVRNVRNWGSEDISLMMNRVQSHGGQAICMRNMTRMEAPQHTADFDFDETV LEEGIRIFSAVTCDLMQDETYKDTRQRGENK >gi|229783997|gb|GG667738.1| GENE 4 2613 - 3626 826 337 aa, chain + ## HITS:1 COG:STM2144 KEGG:ns NR:ns ## COG: STM2144 COG0524 # Protein_GI_number: 16765473 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Salmonella typhimurium LT2 # 27 312 17 305 321 172 35.0 1e-42 MKELPVPRIIILRFYNEKIRKPEAIMKKILVIGSTCVDIILKLDHLPVTGEDLHPKSQSM ALGGCACNAAHVLLYSGSEFTFLSPVGGGIYGDLVKEALTAHGFSIPVYLPEKENGCCYC LVEASGERTFLSLHGVEYTFRKEWMEPYRMEDYSMAYLCGLEVEEPTGEALVEYFEEERG PKLFFAPGPRGTRLPGERLNRILALSPVLHINEQEALELGGRDCVADAAAALYQITGSPV IVTLGERGAYCRESAEHAYVVPGIPSRVVDTIGAGDSHIGAILSGLQKGSSLRDAIAAAN QVAAAVVSQEGAILPKDSVLFRNLRLTKEGLRENTNP >gi|229783997|gb|GG667738.1| GENE 5 4560 - 5789 1596 409 aa, chain - ## HITS:1 COG:FN1411 KEGG:ns NR:ns ## COG: FN1411 COG1171 # Protein_GI_number: 19704743 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Fusobacterium nucleatum # 1 402 1 402 404 402 56.0 1e-112 MLTLEKFEEATEVVSKVTQETKLVYSEYFSSQTGNRVWFKPENMQYTGAYKVRGAYYKIS TLSEEEKAKGLITASAGNHAQGVAYAAKLAGISATIVMPTTTPLMKVNRTKGYGAEVVLE GDVFDEACSYAYKLAEEKGLTFIHPFNDLDVATGQGTIAMEIIKELPTVDYILVPIGGGG LCTGVATLAKMLNPKIKVIGVEPAGANCMQESLRQGKVVALNGVNTIADGTAVQCPGDKL FPYIQENVDDIITIEDSELIVSFLDMVENHKMIVENSGLLTVAALKHMKAENKKIVSILS 
GGNMDVITMASIVQHGLIQRDRVFTVSVLLPDKPGELAKVSTLLAKEQGNVIRLEHNQFI SINRNSAVELRITIEAYGTEHKNTIMAALTEAGYRPKLVKSKGTYAVAS >gi|229783997|gb|GG667738.1| GENE 6 5793 - 6734 864 313 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 14 313 25 330 336 337 53 4e-92 MSEKKGFFGKLVAGLQKTRDNIISGMDSIFSGFSAIDDEFYEEIEETLIMGDLGIQTTMS IVEDLRKKVKEQGIKDPSECKELLMESIRDQMDLGENAYEYENRQSVLLIIGVNGVGKTT SVGKLAGQLKDDGKKVILAAADTFRAAAIEQLTEWANRAGVDIIAQQEGSDPAAVVYDAV AAAKSRRADILICDTAGRLHNKKNLMEELKKINRIIDKEFPEAYRETLVVLDGTTGQNAL SQARQFMEVADITGIILTKLDGTAKGGIAVAIQSELGIPVKYVGVGEKIDDLQKFNAEEF VNALFHVEEKEEA >gi|229783997|gb|GG667738.1| GENE 7 6748 - 10308 4037 1186 aa, chain - ## HITS:1 COG:CAC1751 KEGG:ns NR:ns ## COG: CAC1751 COG1196 # Protein_GI_number: 15895028 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Clostridium acetobutylicum # 1 1179 1 1183 1191 611 35.0 1e-174 MYLKSIEIQGFKSFANKIVFEFHNGITGIVGPNGSGKSNVADAVRWVLGEQKVKQLRSSS MQDVIFSGTETRKPQGFASVAITLDNSDHQLAIDYDQVTVTRRVYRSGESEYMINGSTCR LKDINELFYDTGIGKEGYSIIGQGQIDKILSGKPEERRELFDEAAGIVKFKRRKAIAQKK LEDEKQNLVRVTDILSELEKQVGPLAKQSEAAKEYLRLKEDLKKYDVNQFLLETAGTRGQ LKETEENAAIVSKDLEETRQASEHIRVEYETLDAILSGLEAAAGEARTALSEANMEKGTL EGRVGVLNEQINTEKMNAEHIGKRMTAIHGEIADKKTKVSAYEEERSGIAAQVKESLERL AAAEEALKKKDEEIRALEEEIESGKGNIIDTLNEKASINARQQRYETMLEQVNVRRSEVC QKLLKYKSDESEQDGRLDELNRQLEEIEAEIASLGDAQAAAETRTEELDHEVRRLNRNLN DKQQEYHTSYTKLESLKNIAERYEGYGGSIRRVMEVRDRIHGIHGVVADLITVPKKYEIA VETALGGSIQNIVTDSEQTAKQLIEYLKKNRYGRATFLPLTSIGSKNTFNQDKALKEPGV LGLANSLVETDGQYEGLIRYLLGRVVVVDTIDNAIALARKFQYSLRIVTLEGELLSAGGS MTGGAFKNTSNLLGRKREIEELEGACTKALTDVDRIEQDLVMNEALLAESREELESLKSR KQESYLRQNTVRMSISRTEDKKEEIRESYGDLERENSQLEEQIREIGRSRSELADEVIRL EEQNQDINSRLDSCHERLDAAKAEREEASRALSTVQLEASGLKQKDDFELENIRRVKEDV RRLEEELSGLSGGTDDSNLIIEEKLAEIAGLKAQIENTVIRAAGLESVIAEKTAEREEKS RQQKELFQKREELSDRMSRLDKDLFRLQSQKEKLEERLENSANYMWDEYELTYSAALELK 
GGEEESLPEIRKHIASLKEEIKKLGNVNVNAIEDYKEVSERYVFMKTQHDDLVTAEETLL KIIDELDTGMRRQFEEKFKEIRTEFDKVFKELFGGGRGTLELVEDEDILEAGIQIISQPP GKKLQNMMQLSGGEKALTAIALLFAIQNLKPSPFCLLDEIEAALDDSNVDRYAKYLHKLT KYTQFIVITHRHGTMVAADRLYGITMQEKGVSTLVSVNLIEEDLDK >gi|229783997|gb|GG667738.1| GENE 8 11247 - 11945 790 232 aa, chain - ## HITS:1 COG:lin1919 KEGG:ns NR:ns ## COG: lin1919 COG0571 # Protein_GI_number: 16800985 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Listeria innocua # 7 225 6 226 229 202 47.0 5e-52 MSKKLHELEGRIGYRFTDRQLLTQAMTHSSYANEHRLNKLECNERLEFLGDSVLEVVSSD FLYHKYPERPEGDLTKIRASIVCEPTLAYCAEDIDLGAYLLLGKGEEATGGRGRASVVSD AMEALIGAIYLDGGFANAKEFILRFIMNDIEHKQLFFDSKTILQEIVQSQTDQPLAYELL REEGPDHNKLFESRALIGEEEIGRGTGRTKKAAEAVAAYHGILKLKKWKIAS >gi|229783997|gb|GG667738.1| GENE 9 12023 - 12253 392 76 aa, chain - ## HITS:1 COG:ssl2084 KEGG:ns NR:ns ## COG: ssl2084 COG0236 # Protein_GI_number: 16329904 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Synechocystis # 3 73 6 76 77 63 63.0 7e-11 MEFEKLQKIVAEVLNVEPEEVTMASTFVDDLGADSLDVFQIVMGIEEEFDIEIPAEVAEK IATVGDAVEQIKNALN >gi|229783997|gb|GG667738.1| GENE 10 12272 - 13288 872 338 aa, chain - ## HITS:1 COG:CAC1746 KEGG:ns NR:ns ## COG: CAC1746 COG0416 # Protein_GI_number: 15895023 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Clostridium acetobutylicum # 4 333 3 331 331 301 46.0 8e-82 MIKVAVDAMGGDYAPAEMVAGAVQAVNANREISVLLVGQEPAVAEELKKHSFPAEQIQIV PATEIIETEEPPVNAIRKKKDSSIVVGMNLVKKKEADAFVSAGSSGAILVGGQVIVGRIK GVERPPLAPLIPTEKGASLLIDCGANVDARASHLVQFARMGSIYMEHVMGVKNPRVAIVN IGAEEEKGNALVKETFPLLKECRDINFIGSIEAREIPHGGADVIVCEAFVGNVILKLYEG VGATLISKVKSGMLVNLRSKIGALLVKPALKETLKSFDSSQYGGAPLLGLNGLVVKTHGN SKAKEVCNSILQCMTFKEQGINDKIRESLTAENQENKE >gi|229783997|gb|GG667738.1| GENE 11 13377 - 14222 884 281 aa, chain - ## HITS:1 COG:PA4601_3 KEGG:ns NR:ns ## COG: PA4601_3 
COG2200 # Protein_GI_number: 15599797 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Pseudomonas aeruginosa # 41 277 1 238 247 160 34.0 3e-39 MCSHVHLRNTSYDFYSSSISKKYLQDCEIEERAVEAVEQEKFIPYLQPKVRLSDEKIVGA EVLLRWFDENGRLIPIKDYLPVLDKNGYIRNVDLYIFETMCRKLQECRDAGIPVVPLSFN ISKSYYNDDDLFSDYMEVFEQYEIPESWIQFELMESISFDDSARMMRLIPGFKKRGFVCM LDDFGSGYSSFQIISSLLLDGLKIDRQFFTKAFNPQDCAIVETIIQIAKTLHMETIAEGV EQKEYIDFLKHAGCDMVQGFYYYKPMPVDDFFHLLASQNEF >gi|229783997|gb|GG667738.1| GENE 12 15226 - 15597 362 123 aa, chain - ## HITS:1 COG:no KEGG:Elen_2577 NR:ns ## KEGG: Elen_2577 # Name: not_defined # Def: diguanylate cyclase/phosphodiesterase # Organism: E.lenta # Pathway: not_defined # 9 113 322 430 758 67 33.0 1e-10 MKIQLEEKYTDPVTGGGNERWIKEKYQENCLHRMNHYAVMTINIKKFRGYHLRYGKDASD EILKEFYDVTDELLHENEWISRIYSDHFVILPQRESKEALVQFIYDTDWAGYQYPDPRTH MNS >gi|229783997|gb|GG667738.1| GENE 13 15868 - 16344 403 158 aa, chain + ## HITS:1 COG:no KEGG:Closa_2314 NR:ns ## KEGG: Closa_2314 # Name: not_defined # Def: response regulator receiver protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 150 1 150 152 142 49.0 5e-33 MQTLVRVVPGQEPEHAEFYVHQKDRAIERLVEYVEQERYHSILLTCYRGNEIVQLPESSI CFIETVREKQLVHTDAEILEVKKRLYELERLLPPGFIRISKSVIMNMDRVTAYRPMVNGL MAASFPNNDIVYISRKYLKDVRSHIMEVRNERKSYGNV >gi|229783997|gb|GG667738.1| GENE 14 16316 - 16936 435 206 aa, chain + ## HITS:1 COG:no KEGG:Closa_2313 NR:ns ## KEGG: Closa_2313 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 205 8 211 211 134 41.0 3e-30 MKENLMETFKQGLFLGMLLFICSLTALRLAAPEGVGPVWSLFPAVLSGYLAAGCFLQLIR SLAARLDSVFFRGAAAAAWCYIVLFCLFFLAQNLPLPVRTMLLTSAAVLLLMWEAEYRYL KSVARDLNSGAAGRILLIDLDEKPGSSEEFFAALETYCKKSSITLEYVKRDIPATVKLNG RLHTVTLTESYMYGGPGWSMEIQEIL >gi|229783997|gb|GG667738.1| GENE 15 16955 - 17389 441 144 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0934 NR:ns ## KEGG: Ethha_0934 # Name: not_defined # Def: transcriptional regulator, MarR family # Organism: E.harbinense 
# Pathway: not_defined # 3 139 4 140 140 106 40.0 3e-22 MIDNTFTLISRIDGEIRRFITEELAAEGIADIVPSHGAIIMELMKHGQMPMNELARRIDR TPQTVTCLIKKLVNTGYVTTGKSSGDSRVTMAALTEQGRHLASVLQRISEKIYERQYFGI PEDDIPILRGALMKMHSNFAEKEL >gi|229783997|gb|GG667738.1| GENE 16 17386 - 17787 355 133 aa, chain - ## HITS:1 COG:ECs5066 KEGG:ns NR:ns ## COG: ECs5066 COG2015 # Protein_GI_number: 15834320 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Alkyl sulfatase and related hydrolases # Organism: Escherichia coli O157:H7 # 4 110 426 532 665 76 36.0 1e-14 MEWSVRSIYTGYLGWFDGNPTKLGHMATKERAEKTVSMMGGPARVLAEGAAALKRGEAQW AAELCDLLLACRKETEEASVLKADALTELGRAQTSANARHYYLTCARQLRGEGGTLKLNG AMMDVGKKQEDNG >gi|229783997|gb|GG667738.1| GENE 17 18714 - 19646 1051 310 aa, chain - ## HITS:1 COG:YOL164w KEGG:ns NR:ns ## COG: YOL164w COG2015 # Protein_GI_number: 6324409 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Alkyl sulfatase and related hydrolases # Organism: Saccharomyces cerevisiae # 9 310 88 401 646 124 30.0 2e-28 MIYAGEEKLKRQAENGFPKGITKICDNVYFALGYGGSTCTLVIGENSCILVDTLNGTAPA EEVKEAFKKLTDKPIRTIIYTHYFHFDHTSGASVFAEPGTRIIGRKPTYPQYGRTGMIRD ICGVRGARQFGVGLTPEENICVGIGPRNEINGEKGNLPCSELFEEELLELESDGIKAVLA AAPGETDDQIFVWFPQHRVLCCGDNYYESWPNLYAIRGGQYRDISGWIDSLDKMREYRAE YLLPGHTKAVIGSRQVETTLKNYRDALEYVLTETLQGMNEGLTPDELVERVRLPEPLASL PYLQEYYGTV >gi|229783997|gb|GG667738.1| GENE 18 19758 - 21086 1095 442 aa, chain - ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 6 436 92 527 547 184 29.0 3e-46 MLFDRKQLVKLIVPLVCEQVIMGTIGIADMLMVASVGETAISGISLIDSINLLFNSMFAA LVTGGTIVCTQYLGKREADMANLAAKHLLGISLLFSLAVAGMCFLFRRMVVYGLYGSAEP EVLRQAMIYFCFSMASYPFLMMLDSCGAIFRSMGNAALPFSVSFFMSIINIVLNAVFIFG LHWGVFGAGFATFLARAVSSAAIFLFLYHHKGAVRIQSFFGSGWHGFMMKNILKLAVPTG LEDSIFHIGKLLVQGIVTSYGTAAIAANAVALTISEFTHMPGFAIGLALITVVGRCVGAG 
EIQQARYYVKKITGVSFLVTGAAAVLSWIFAGPVVGFYQMNRETWELASEIVRWQAVACL FFWVPAFNISDALRAASDVRFTMAVSMISMWVFRVGCSALFREILDIGVICVWFAMFLDW MFRAVVFLIRFHGSKWEGRMFL >gi|229783997|gb|GG667738.1| GENE 19 21122 - 21958 823 278 aa, chain - ## HITS:1 COG:TM0416 KEGG:ns NR:ns ## COG: TM0416 COG1082 # Protein_GI_number: 15643182 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Thermotoga maritima # 3 271 1 266 270 145 32.0 1e-34 MSMTIAATASTAAGGDDPILFRGPFEETIPQMAALGYRAVELHIYDSDQIDRERLYRLLA ENHVVLTSIGTGSAYERDGICLGSSDPEKRAAAIRRLTGHMVTAAPFHGIVIIGLIAGRF SDCGQDGEAFKRNLTESLKLCAKEAKAHDVTLGFEVMNRFESDYGTTIREASWLLRAVGS DRLKLHLDTVHLNIEEDDIGKAIESAGSAVCHVHVADNNRWYPGHAHYNFTETLTALHRV GYRGALALEITNAPDTETAAKKSLSYLSSILETLPHLS >gi|229783997|gb|GG667738.1| GENE 20 21955 - 23568 1266 537 aa, chain - ## HITS:1 COG:CAC0355 KEGG:ns NR:ns ## COG: CAC0355 COG5434 # Protein_GI_number: 15893646 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Clostridium acetobutylicum # 1 514 1 513 513 462 44.0 1e-130 MKANVIFKTSRKFLLEWEDFGVFYTEKEYEVWLNGRYVLTSSRVIQTVSGLAPDTEYRVT LRCGGDICSEFSVRTDYEFVTLNVRRFGAKGDGIHDDTLAIQTAIASCPKDGRVYIPEGK YLVTSLFLKSDFTLDIGKNAVLLGHAEREKFGVLPGMIQSYDETGEYNLGSWEGNPLDIF TSMITGIHVSNVVITGEGTLDGCATFDDWWEDDRAKIIAFRPRMVFLNHCDHVVLHGVTI QNSPSWNLHPYFSDDLRFLDLTILNPWDSPNTDGMDPESVNGLEIAGIYFSLGDDCIALK SGKYYMGHKYKVPSQNIEVRQCCMNNGHGAVTIGSEMAAGVKHVHVKDCLFLHTDRGLRI KTRRGRGKDAVVEDICFENIRMDHVLTPFVLNSFYNCCDPDCHSDYVKCKSPLPVDERTP SVKQMVFRNIEAHNCHVAGAFLYGLPEKKVDYVEFDHVTIDYDENPEPDEPAMMDDIGLT AKMGLYINNVQKLVLNDVSISGCEGEPMIFEHIGQITGNQTMTGSTVINSGTEGERT >gi|229783997|gb|GG667738.1| GENE 21 23573 - 24091 458 172 aa, chain - ## HITS:1 COG:YPO1721 KEGG:ns NR:ns ## COG: YPO1721 COG0395 # Protein_GI_number: 16121981 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Yersinia pestis # 1 172 135 306 306 201 56.0 8e-52 
WFSILLATLFLPQVVLNIPQFLLFRTWGWTNSYLPLVVPAFFGAEPYFIFMLVQFMRSVS RELEEAAEIDGCNSFQRLVYIVVPMVKPTIVSVALFQFMWAFNDFQGPLIYIDSVKKYPT SLGLRLLTDAETGFEWNKVLALSVITLLPSLIVFFLAQDQFVDGIAAGGVKG Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:27:52 2011 Seq name: gi|229783996|gb|GG667739.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld132, whole genome shotgun sequence Length of sequence - 16708 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 8, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 136 91 ## + Term 188 - 235 -0.9 + Prom 179 - 238 10.0 2 2 Tu 1 . + CDS 360 - 1691 1036 ## COG0534 Na+-driven multidrug efflux pump + Term 1753 - 1788 7.1 - Term 1733 - 1782 10.5 3 3 Op 1 . - CDS 1810 - 2811 967 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 4 3 Op 2 . - CDS 2860 - 3633 938 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 5 3 Op 3 38/0.000 - CDS 3680 - 4516 926 ## COG0395 ABC-type sugar transport system, permease component 6 3 Op 4 35/0.000 - CDS 4513 - 5382 808 ## COG1175 ABC-type sugar transport systems, permease components - Prom 5402 - 5461 2.2 - Term 5411 - 5452 8.1 7 3 Op 5 . - CDS 5490 - 6671 1395 ## COG1653 ABC-type sugar transport system, periplasmic component 8 4 Tu 1 . - CDS 7641 - 7799 212 ## - Prom 7860 - 7919 4.8 9 5 Op 1 7/0.000 + CDS 7990 - 9489 1136 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 10 5 Op 2 . + CDS 9533 - 11314 1337 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 11 6 Tu 1 . - CDS 12507 - 13112 550 ## COG5632 N-acetylmuramoyl-L-alanine amidase - Prom 13146 - 13205 5.8 + Prom 13234 - 13293 9.8 12 7 Tu 1 . + CDS 13518 - 14351 563 ## HS_1632 large adhesin + Term 14364 - 14410 7.2 + Prom 14364 - 14423 3.5 13 8 Op 1 . + CDS 14613 - 15557 535 ## BLJ_1240 hypothetical protein 14 8 Op 2 . 
+ CDS 15594 - 16067 506 ## CD1116 hypothetical protein 15 8 Op 3 . + CDS 16064 - 16567 540 ## CD1115 putative conjugal transfer protein Predicted protein(s) >gi|229783996|gb|GG667739.1| GENE 1 2 - 136 91 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no RQDLLGGTPVKEICQKYSFGNYSSFYRAFRTEFGQSPRAVRKNV >gi|229783996|gb|GG667739.1| GENE 2 360 - 1691 1036 443 aa, chain + ## HITS:1 COG:MA2050 KEGG:ns NR:ns ## COG: MA2050 COG0534 # Protein_GI_number: 20090897 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 1 438 1 439 468 199 31.0 1e-50 MQNKQDFMKTKPVFPLLLSMAAPMVLSMLIQSLYNIVDSIFVARLGTEALTAVSLVYPLQ NIVLSLSVGVGVGINSVIARRLGEQKAEEANKAASIGMGLTFLHCILIILFGLLATRPFL SIFTSDEQTLQWACRYSYIVLCFSAGSLIHICFEKIFQSVGAMFLTMISLLVGCITNIIL DPILIFGLFGLPALGVTGAAVATVIGQFCSLFVYLVIYARKDIGIKISPKYFSLDKNTIL QIYSVGIPSSLMLAAPSLMSGVLNFMLTGFSDIYVAILGIYLKLQTFLYTPASGIVQALR PIAGYNYGAGERERLHKIIRCSMLLVAAIMAAGTAAALIFPEQIFSIFIDDAALIGPGST ALRIICLGFLVSSVSVVYAGVFEALGCGKKSLLISLIRQIIIIIPLGFILSRIWGPVGIW VCFPIAEFLAAGTARFMVKKQNI >gi|229783996|gb|GG667739.1| GENE 3 1810 - 2811 967 333 aa, chain - ## HITS:1 COG:YPO2980 KEGG:ns NR:ns ## COG: YPO2980 COG0667 # Protein_GI_number: 16123161 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Yersinia pestis # 1 327 1 328 329 387 56.0 1e-107 MEYIAKAERYETMKYNRCGKSGLLLPALSLGLWHNFGNTSDFTGCREMLLESFDRGITHF DLANNYGPEAGSAEETFGRVMTEELKPYRDEMIISTKAGYYMWPGPYGEWGSKKNLVASL DQSLKRMKLDYVDIFYHHRPDLDTPLEETMDALAGIVKSGKALYVGISNYNAEQTEDAVR MLDTMGIHCLIHQMQMSMIHRENEAVLDVLNRHGVGAIAFSPLAQGILTGKYINGIPADS RAAGASVFLNANHVTGSVEVITKRLMEISDGRGQSLAQMALAWVLQKKGMTSVIIGASRT EQIVENLKTMENMAFTEEELRRIDEILYDETEV >gi|229783996|gb|GG667739.1| GENE 4 2860 - 3633 938 257 aa, chain - ## HITS:1 COG:yfaU KEGG:ns NR:ns ## COG: yfaU COG3836 # Protein_GI_number: 16130180 # Func_class: G Carbohydrate transport and metabolism # Function: 
2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Escherichia coli K12 # 4 249 6 249 267 115 28.0 9e-26 MNHADRIREKIAAGKLVKGFFLNMADPMVSEMAGYIGYDYVWIDAEHGPLDRQEIFHHIV AAQGAGCAAFVRVPGVDAKAMKAILDMGPDGIIFPFTNSAEVAKEAVRSCSYPDYGGVRG QGPIRAIRYGLINEDEYIQTAYGRVLKIMQIETLEGYENLDEIMEVQGIDSLFIGAADLG RSINGQTGEKQYELSAVYDDICKRVRAKGLTLGAAIGATREEAAKAVSKGIQWVVFGQDA RVLADGLKRNFDELADF >gi|229783996|gb|GG667739.1| GENE 5 3680 - 4516 926 278 aa, chain - ## HITS:1 COG:BH1926 KEGG:ns NR:ns ## COG: BH1926 COG0395 # Protein_GI_number: 15614489 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 15 277 25 284 285 172 38.0 6e-43 MTKRQGKAVLLNAVVIFACFIAVFPVFIMITGSLKTSQELYTNSAGFPVHPTLANFKRLL SFNSGLILRTFGNSIFVAASYTALSCLLASMAGYAFAKFKFRGRNVLFLLLLITMMIPAE LNITPLYLIFSKMKWLNTYRVQIIPGAANVFAMFLMRQYMMSIPDSLIEAARIDGAGEWR IFSKIVLPISSPSIGALAILQFLSKWNELLYPKMMLTKQKLMPIMVILPTLNETDSARSV PWELVLSGCTLVTIPLIIVFLMFQDKFLSSVTLGAVKG >gi|229783996|gb|GG667739.1| GENE 6 4513 - 5382 808 289 aa, chain - ## HITS:1 COG:BH2725 KEGG:ns NR:ns ## COG: BH2725 COG1175 # Protein_GI_number: 15615288 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 13 285 12 287 290 131 34.0 2e-30 MNRYKLKKTAVPYLFLLPFMVIYLLFFLWPALYSLGLSFFKYSGYGAAKWVGLQNYKNLF HYDTMWKCLGNTVFYFIFSFIPTMAFSFLLALAVKSKSAGKMRSIYKPLIFLPQICAVVA SSLCFKIIFGERVGVINQILGKTIPFLTDPAWMRWTVVVLLSWRGIGWYFIIYLSGLTTI GEDITEAASIDGASGLQNLWHIILPMMKPTFMVAFITNAIGSFKIYTEPNLLLAQNYDPS MKVAPYINLIINSMSGGQFGMASAAGWILVAVILVLTLVQLKVLGGDEE >gi|229783996|gb|GG667739.1| GENE 7 5490 - 6671 1395 393 aa, chain - ## HITS:1 COG:BH0905 KEGG:ns NR:ns ## COG: BH0905 COG1653 # Protein_GI_number: 15613468 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 44 384 90 439 445 110 26.0 4e-24 
MMVFPSTENYETINAKFLEQNPDLAEKVDFEVQLGGSGDGDVADKLRLALTSGENVPDIV RLNYTQLPEFAKAGVLYPLDDYIAEYEDRIIDAAKSIMKYDDTYYALPREVKPKVWFYRA DIFEECGVDVREVKTIDEFIAAGEKIREKYPNACMENYNVPTQGYDFMMMLSGTGAGFCD ENGNYNLASNEDVKLTFERMKKLYDSDVMSSAAEWSADWTPAFDNGEIVSQLIGGWFKTD FKNFNLESQKGKWAMAPWPEEIREGSDAGGAIWVIPAAARNKELAAEIMTKMCFDEEAAR IIFDVTGVLPALKSAAEDEYYNGDNEYYACSLGKVNSEAMEYLKIFPYNPASSQEQKIVL QYLDEYLQENMTLDEALKAAQDDMINQIGNPFE >gi|229783996|gb|GG667739.1| GENE 8 7641 - 7799 212 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRRRITALALSAMMVGTMLAGCGTKSEAPAGGESAVQTEAGEKKETTGGAA >gi|229783996|gb|GG667739.1| GENE 9 7990 - 9489 1136 499 aa, chain + ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 491 1 519 526 126 23.0 8e-29 MFHLLIVEDEKRTREGLCRHIDWNGIGVDPVTAVAGGMEALLAVDEIHPDILLSDIRMPH MTGIELASIIRKKFPRCKIIFLSGYADKEYLMSAIELKAVNYIEKPIDLTEIRNAVAKAV VQLESELKDELIHSGFQKTLPIIHEEIVSALISPDLNWEKFSQDFIPLYFTWSKIGNYSV SCIRPGRKLTNDSEAKKFLACILKALEKVPGLTSDDYLCTMMSPKEAVLICRNLLSPAIS SVFKELNAAFTENSLPEASFGISTPCLSLTGISELYHAVSRSTEYESFYSEKYGIFDLTV PLHTKEAPKNLFSSKELTLSGAETLFDTLSREKYTDIPFIRGELYKIYMMAIEKSWDEKV VSWDEFEMFSFEEYRELICYEVNTIQLLGSDIYDIKIKNAVHYILWNYSDADLSIKVIAD HVNLSQNYLSTLFRKQTGTTINDFILNIRTEKACRLLSGTDLKLYEIAEKIGLSDANYLS TVFKKRYGVTPTLYRKTVQ >gi|229783996|gb|GG667739.1| GENE 10 9533 - 11314 1337 593 aa, chain + ## HITS:1 COG:BH1122 KEGG:ns NR:ns ## COG: BH1122 COG2972 # Protein_GI_number: 15613685 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 17 593 9 571 586 169 23.0 2e-41 MKIIRNIKRCFSSMNYMKKLFFSYFFIIFLILMIFSMTSSYYSSKSETKQALYTARNMLH QTTEFLSYKVTSIKNIINVISADETIQTLIAASDEFTRESQNNWIILSTTAKSIMYNANT SKDIHNIRLYALEGSISFENSKEFSQLTDEEKERWNERSAEKKNISSLWLPASFFSSSDK 
LGAITYVKKIPDSKNLNQYLGTLTATINNSAFRDVIAQAAATLHTSVLLYNSYGETIMYH GEEGFFTPEDIEALQLQTVSDTDSELQRIKYHGENYFIGIRPISGCDWTLAILTPSSDIL SATRRYQQQTLILVILLIVLLIPVLYRTSKQLSSRIYVLKEYISRAVNANFEIEPLCNGS DEIGVLTDSFNDMVRQISGLLKEQYKLGYEIKDLEFQVLQSQINPHFLYNTLDMIYWLGI DSQTPDVADAARELGRFYMLSLGHGETIVTVKNELDHVAAYVNIQNMRFENHFKLITDVP EELYSYKIIKIILQPLVENAILHGLREKSSESGTITIKARLDDGVITISIEDDGVGIPPE KLDSLLTKNEKSGYGVWNIHERIQLSYGSQYGLRYTSRPYIGTTVYITFPALL >gi|229783996|gb|GG667739.1| GENE 11 12507 - 13112 550 201 aa, chain - ## HITS:1 COG:lin2374 KEGG:ns NR:ns ## COG: lin2374 COG5632 # Protein_GI_number: 16801437 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Listeria innocua # 81 197 7 133 316 75 37.0 8e-14 MKQDRERYEDLEMRRRSRIRRQRRLKRQRRKAGILFGIFSAFLALLTGAVLGALSLYVES RTARREIDLSALVAPDWVTQDFFEVNPYSRPGIKMKQVNEIIIHYVANPGTSAKQNWNYF NNLKDQKGDNATSASSHFVIGLDGEILQGIPLDEIAYSTSKEKNLDSVSIENCHPDETGK FTDATYNSLVRLTAWLCLVAS >gi|229783996|gb|GG667739.1| GENE 12 13518 - 14351 563 277 aa, chain + ## HITS:1 COG:no KEGG:HS_1632 NR:ns ## KEGG: HS_1632 # Name: not_defined # Def: large adhesin # Organism: H.somnus # Pathway: not_defined # 2 252 1489 1757 2914 211 54.0 2e-53 MGPTGPQGLRGAMGPTGPQGPRGVIGATGPQGPRGVIGATGPQGPRGAMGPTGSQGPRGT AGATGPQGPRGIMGPAGPQGITGPPGPIGSRGTAGLQGPTGAQGDRGPIGPTGAQGCRGA TGPAGPQGDPGPTGPQGVRGLKGDTGSQGPTGPAGSTGPAGPQGIAGPTGPEGPMGLQGP RGATGIQGLQGLPGPEGPRGIQGSTGPAGPTGAAGATGEAGPTGPRGDQGPAGVPGPAGI TGATGPTGPTAPSIYAQQGKCRNHAAFSTLKTIKQPL >gi|229783996|gb|GG667739.1| GENE 13 14613 - 15557 535 314 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 309 80 369 381 416 72.0 1e-115 MAVFRVERNTGYTVMSNHHLRNKELTLKAKGLLSQMLSLPEDWDYTLAGLSHINREKIDA IREAVKELEKAGYIVRSRERDEKGRLRGADYVIYEQPQPKEPEAATSSEKPPTSDLPTLE NPTLENPTLEKPTQEKPTLENPTQLNKDISSKEQSITDLSSTHSIPFHSLNPLPYEQDEA ATPPERKRTEAKTNSAIEIYREIIKENIDYDILIQDPKMDKDRLDEIVEIMLETVCTARK 
TIRIAGDDYPAELVKSKFMKLNSSHVEFVLDCMRENTTKIRNIKQYLKAVLFNAPSTIDS YYTALVNHDFYGSK >gi|229783996|gb|GG667739.1| GENE 14 15594 - 16067 506 157 aa, chain + ## HITS:1 COG:no KEGG:CD1116 NR:ns ## KEGG: CD1116 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 153 1 154 158 127 55.0 1e-28 MQEEVTQKTIALSVNVGKGAARLTEQALQKAIKKFLEQKSKAPHGKQTMRQLMKQNAGVS NIEITDSNIKAFESTAKKYNIDFSLKKVKGEQTRYLVFFKGRDADVMTAAFQEFSAKKLN RDKKPSIRKALAAAKDKAKQLNAARDKVKKIDRGREI >gi|229783996|gb|GG667739.1| GENE 15 16064 - 16567 540 167 aa, chain + ## HITS:1 COG:no KEGG:CD1115 NR:ns ## KEGG: CD1115 # Name: not_defined # Def: putative conjugal transfer protein # Organism: C.difficile # Pathway: not_defined # 7 161 6 160 595 267 80.0 1e-70 MKQINYKKLIISNIPYVFFVYLFDKVGQAVRLAPGADISEKILNITQGFSEAFSNALPSV HPLDLLIGIVGAVVIRLIVYFKGKNARKYRKGAEYGSARWGNAEDIKPYIDPDFQNNIIL TQTERLTMNSRPKQPKYARNKNVVVIGGSGSGKTRFFVKPSAPVRAV Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:28:22 2011 Seq name: gi|229783995|gb|GG667740.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld133, whole genome shotgun sequence Length of sequence - 12588 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 924 375 ## Closa_0416 cell wall binding repeat-containing protein + Prom 944 - 1003 3.1 2 2 Op 1 . + CDS 1026 - 5114 2523 ## COG5263 FOG: Glucan-binding domain (YG repeat) 3 2 Op 2 . 
+ CDS 5188 - 12570 4606 ## Closa_0424 LPXTG-motif cell wall anchor domain protein Predicted protein(s) >gi|229783995|gb|GG667740.1| GENE 1 1 - 924 375 307 aa, chain + ## HITS:1 COG:no KEGG:Closa_0416 NR:ns ## KEGG: Closa_0416 # Name: not_defined # Def: cell wall binding repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 306 465 762 763 279 47.0 1e-73 FEFVSSDNRYSVLRIGFQRIWRPESCVHKMEALEPVVPDCISGGHERRRCALCGYEETVQ FPASGHCDEDLDGICDICYQEIDDAAVPQTGIYREGDIQIRNIGGRQYRFRCIDEDYSDD RENLGYGALFLCETVIRSDIVSDQQSVRKWKFGMDNNYKTSDVRTWLQKNTDASMREVSF VYTGVNTAYQGRTRVGEYEQFTFDDLEGYEKPFQMMEDQMFIFSLEEAVKYRDVLWKFDG SSQNNPQSQYSPYSKGYYLRTPQYEGEDNFQYGKGIYVVDLANGSLHPVEVSDTSIGIRP AYVIARR >gi|229783995|gb|GG667740.1| GENE 2 1026 - 5114 2523 1362 aa, chain + ## HITS:1 COG:SP0377 KEGG:ns NR:ns ## COG: SP0377 COG5263 # Protein_GI_number: 15900300 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1237 1361 213 340 340 65 31.0 1e-09 MKERKDMIAQDRRRWWKKLLACHLCGILVFPAFCSYPVSALDLEGGRKQAASSSALEAFG EAEENGESQTATASDASRQETKASLSNAVRAINSLGNIWDSWSGKTSFEFLNGTQGDGSE SRPFLIKNREQLMGLSELTAMGMTVPDAEGADYAGDYTGCFFALGGNIDMQGVDWIPIGF YRDSSEMPGEVEYAFQGDFDGNGYTIRNLRLNKLGNYDHIGLFGAIANSKIHDFCMIPDT EIKGNDRIGTVAGYAVDSEIRNVTVKNAALNTSGIVGGIAGEISGTIIENAVCDNVLIDA RAGKEVIYIGGIAGIASNSLIADCTVATGTGNTARIQGTGYTGGIAGLQNSADIYNVHVN GTIGGYHSTAIGGVTGKYVSGKLKVARFEGTIGNSQLGAMAREGAFIGTRQGAATNFNYI DDVAYLFTDRESKISANVCGSEIADDNDYTYAAHIGYWHSGDLHYTLVQGGTSKNITDRY FYEELEDGILSVMDETDDNRYTIDHFAPDSTGRPVRGYLINVNQIDTTANGRNFYDVAVL EARGVSQYSGVLNKNHRGAVAAGSVVYVNTSPNNTETEKFQMNGTPYYIDGDGIKKNIAY SADSHCYTFRMPEEDIVVDAVYKKVAAAVDVQPDVCHFTVTQTRTGNRKNPVKTTEIRNK AGKLIARYINGLLEQGTEVQPVTIQAVIDTNNDVSDERVRWSIDDADLIRLAKNDDEDSD GYTAKSASITVNLNAGFFNEIIREQERKQQEENYRYRIPNTIYGAGHQNGAVAILTAETR PSASFEGKPCTANCRINVTFQILDNTLVAAEGAVLDKQTLEYTVTRRLTGNRIEPEESIL VTAPQSLTAEFEPDYFSRDEVSWKASDPAVIRINEEEASYKEVSVSAYKDAKWIRDIIAA 
DNGKKENDSYTLLTGNGERSARVTVEGTDKLGNRATAECMVTVRFVTEDRTRIVPDAVKL NQTAMEYRLTYDKAGDIHSETVGKGGFERRTLTAAVSPDIEDSESHRPFDRSIVWSSSDS EALSVDADGTLTVNDGAPWILEALSRQPYQAVKTVDITAAANDGLRTDRCRVTLNFQANC IEADQEKEVFPITLTLTGRRSAPVLTFSGQEARELSAVIYSSHPDVSQIVWSSEDPDLVT VTRNGTVAPVLLDENNEQKALWIRDLLRSGAYSGTKTVLVNASTADGKMKDSVLIELNLK VIDRTYPSGGSGGGGTGGMSTTTGVTPSGKTTGQAGANGAVTGTWTQTANGKWIFASDRT YANEWAYINNPYAAGNQPEAAWFRFDEEGFMVTGWFTDQNGNLYYLNPAIDGTQGQMLTG WQQIDGSWYYFNSSSKEVMGKLLTNTVIDGSFLVNEKGQWIP >gi|229783995|gb|GG667740.1| GENE 3 5188 - 12570 4606 2460 aa, chain + ## HITS:1 COG:no KEGG:Closa_0424 NR:ns ## KEGG: Closa_0424 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1445 2440 1489 2481 4700 1003 53.0 0 MRKGMWKETAKRLASIFLAGMILAATVVTPVSAVHTYYWPDAVTADAMAIKNWTWTAEGE EAARNPSVGLVPTGLSQQETALMGGYTSKFWFQAEKAKTITIPGTSAELDFSKFHHRYFA CADSHNTGFFAPVGAQIAFQKMYYPTNSLGDYPDFLKNAPMDMKTKKFALLLIAMISAAY ETPAMEDLRDSDRMSMYYYLLWASIWSNDKYVDQGMFKGESPEGDWDFVQYFVRTMLNSK LNPDPYGSTAIYDAFKDGGPAQKYFFHCWKAAKFLSTFDYTVDMGSSLPVSAPVLGEDGM YHITFTYGDLSDYEKEVYRRMTAENLASGWEYTNDGSSIDFKSADGGSDGNAIATLKLQE NSEEDRFYNCGFGVGGLAGFRGCAKRGRNGEFGWGNTQIYFSAVSEPLEILVGGRIPLNP GAEWNMEVHRYEHTETWEAHYNIQLRKYDSETGQPLAGSKWDILEAFDDTQLNDTELEDF DNWANRSGSQFLRWEGWDYGEGNPEGDAANDPCTWDINVTNEEGILMLGDNKENASERRA HTDTKTYTYTKGYCGGHPEPEIEESGDPEIDTENETAAKEAWQEEVDYCERLAEQGGFFH SIDAGEAKEQLEADRDKLYGQFISLKYDYSAVELTPRPGYTNHGSHTDDIPIEVKTVTSS EYKDRPEFKAGKTEKYYLNKRAAGKNRDYMDEDRAGNTAVPSEATASNTIASKATISNAS ASKASASNASASNAPVSHATDSNAKKKNGLLARLKALFELPRSLVERNKSDKSLLNNKSE SVESGYFEPSEADPVVPGNQAITDHTFIVYNHRTEGEIHINKRDLYLQAGENNTYHAYGD SMGDGTMEGAVYGLFALQDINHPDGCTGTVYQKDDLVAVASTDRNGDASFMAFTEAPGMT WNYKTGKIEKRPGGFSGPENLYRNRTDADTVMDIENYIGFDSEGNPVHLKDSLAGGEAGY WKYSSNQSGIEGLKGSHAVYPVSDNDKNNGNGWIGRPLIVEEHGTDYYVKELARSEGYEL SVNGRTNVITNGRDNYEGEYHTADVAIGKITLDTVGNGNYFDITAHQVDHDLTLQGIQFP EGAVFELSASQKVPEKITVPVYRTVIKPVMAAAGTFVYRAGEKVAATLGEAVFFPGGQSY SVNAVSDEKEQTIGVKPMNYHTMGIPAVTELHSEGEAEAFQNLYNRELGGLGYMEPGNDA 
PWVRVKLNGTTDTEWILSIAAGMKAHDLQYFNSLRISDIEQSGRDIYAIIRYEWKLYEDS RDDAVYIPDKNRLYLKKDSGNGYFVYIAYEDPESNPAVLSWRMKNGFLERATLKQQKIEG LKVSYPAQLPDSFSAVTVRTPSYWIYAAGEQQIDDAGDLKYAEETVIDYVEQDGFKTVEN NIRLTSVYDADHKMYEVTLPAEAFKDTDTVRLKVSDNGSGKYSIKQAYINQSYFVSRPVQ READSYIQNITLTPPSYDQPWQDGNMRREPANVMERPVMQKIKIVKDILVEEDGKYGDNT FADSGHEDYFTQNGGGIEDTASYRPNFRFKIYLKSNLEQLYRTEDGEVEWLDRNGNTVDI AAYRAAYPNLVQKIYTKVLHRTDFRSRRSNRDAIANQELYSSTDGRINEYPETGYTAVLE TILSHEVDADGDDKEISRYNYQKFFDAVGVANQDKWDKAKPEYTSFKPLAFIRQLLFGTA GSERIYPAQHNNTEIQNEANTSETAKENRNWSDAVRQFAISWYLDREVEKLVKENGTGET EIAAGSENYQDEVYDTALNAALIKAENYLKPFFHYDLDEIYAIEWDSESDGGKDKDRTTL SADKEAAAGYCYAISEYLPYGTYVAVEQQPMDKELEDFPNKHYAVDVPKEIELPAVYEDG KAGASETPERLSRYYHYNANLPAADLAAKYFIRFNEEWAGKADESVREHVIKAHGYLGDY EIYKYGLDLAKLAGNALGDPSGTPHFEITQSEYDPMKDYYNTIVCPEEEGGNPESHYLAD DVNHGITAPGGKEYESDAIERIYRYGSVSEHRQTLSSMDGMQTAYDGKYASMLVPWSVTE PVNEERDTIQNPDGSSGGMGYGYRKFRNTFYRSRLRIEKLDSETGENILHDGAVFTIYAA DREDGENTDGLARFYETDTQIKGSREFLEAMGASEITRAARGLPEPGGLWTGIVAAGTPI CSEREQIVMVDAKGRRTGEFEAYTTTRDGVQAEEENPAELSWQDQNTGYLMTPQPLGAGT YVLCEMRPPSGYARTKPIAIEIYSDKISYYLNGSRDSRVTAAVYEDGIGQGPEGAADTGR IYVGNTPIRLEVSKIKEKGRTVTYCTRTRVEGSETELTAKYGKDNLEFAYKGGTYLNYAW YKGTAEYLESRKQAGDEVVPVCEDGVFAGYGRITRPLDTAGDKNRYVSGAKMILYDGIEI QSNGDSGDYGFDGVEVVRDRNNNVKSIKVLEGYAGNTVEFKPFRFQRWNRKGLNSEACGL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:29:02 2011 Seq name: gi|229783994|gb|GG667741.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld134, whole genome shotgun sequence Length of sequence - 18544 bp Number of predicted genes - 28, with homology - 25 Number of transcription units - 11, operones - 5 average op.length - 4.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 8/0.000 + CDS 14 - 649 694 ## COG0215 Cysteinyl-tRNA synthetase 2 1 Op 2 7/0.000 + CDS 634 - 1086 251 ## PROTEIN SUPPORTED gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 3 1 Op 3 1/0.000 + CDS 1083 - 1835 671 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 4 1 Op 4 
. + CDS 1934 - 3013 1140 ## COG1305 Transglutaminase-like enzymes, putative cysteine proteases 5 1 Op 5 . + CDS 3041 - 4093 1268 ## COG1459 Type II secretory pathway, component PulF 6 1 Op 6 . + CDS 4105 - 4428 339 ## Closa_1301 hypothetical protein 7 1 Op 7 . + CDS 4444 - 4923 600 ## Closa_1302 hypothetical protein 8 1 Op 8 . + CDS 4928 - 5392 388 ## Closa_1303 hypothetical protein 9 1 Op 9 . + CDS 5418 - 5897 582 ## Closa_1304 hypothetical protein 10 1 Op 10 . + CDS 5946 - 7013 1105 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT 11 1 Op 11 . + CDS 7082 - 7711 623 ## COG0039 Malate/lactate dehydrogenases + Prom 8554 - 8613 80.4 12 2 Tu 1 . + CDS 8650 - 8946 255 ## COG0039 Malate/lactate dehydrogenases + Term 8994 - 9049 11.2 - Term 8745 - 8801 9.3 13 3 Tu 1 . - CDS 9023 - 9295 379 ## Closa_0281 XRE family transcriptional regulator - Prom 9362 - 9421 8.7 + Prom 9230 - 9289 3.3 14 4 Op 1 . + CDS 9336 - 9503 125 ## 15 4 Op 2 . + CDS 9430 - 9777 316 ## gi|266623935|ref|ZP_06116870.1| hypothetical protein CLOSTHATH_05259 16 4 Op 3 . + CDS 9774 - 9971 68 ## gi|288871233|ref|ZP_06116871.2| putative mercuric transport protein MerT + Prom 10873 - 10932 20.2 17 5 Tu 1 . + CDS 11100 - 11213 62 ## + Term 11245 - 11273 -0.0 - Term 11053 - 11090 5.7 18 6 Tu 1 . - CDS 11302 - 12168 147 ## COG0582 Integrase - Prom 12256 - 12315 80.4 19 7 Tu 1 . - CDS 13163 - 13531 215 ## Rumal_0678 integrase family protein - Prom 13615 - 13674 2.8 20 8 Tu 1 . - CDS 13757 - 14200 334 ## Vpar_0382 transcriptional regulator, XRE family - Prom 14320 - 14379 3.6 + Prom 14236 - 14295 6.7 21 9 Op 1 . + CDS 14333 - 14653 124 ## Cbei_2216 hypothetical protein 22 9 Op 2 . + CDS 14565 - 15404 533 ## CKL_4029 hypothetical protein 23 10 Op 1 . + CDS 15523 - 16683 544 ## BL03487 hypothetical protein 24 10 Op 2 . + CDS 16676 - 17515 812 ## Bcer98_2967 hypothetical protein 25 10 Op 3 . + CDS 17508 - 17861 412 ## gi|266623944|ref|ZP_06116879.1| conserved hypothetical protein 26 10 Op 4 . 
+ CDS 17867 - 18112 202 ## Rumal_2679 hypothetical protein + Term 18162 - 18206 4.1 27 11 Op 1 . + CDS 18223 - 18432 297 ## gi|288871235|ref|ZP_06410075.1| conserved hypothetical protein 28 11 Op 2 . + CDS 18429 - 18543 62 ## Predicted protein(s) >gi|229783994|gb|GG667741.1| GENE 1 14 - 649 694 211 aa, chain + ## HITS:1 COG:CAC3177 KEGG:ns NR:ns ## COG: CAC3177 COG0215 # Protein_GI_number: 15896425 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 209 253 460 464 181 47.0 9e-46 MHNAFLNIDNRKMSKSLGNFFTVREISEKYDLSVLRFFMLSAHYRSPLNFSAELMESSKN GLERILTAVDKLKDLESRAVGDSFTEGESREAVDELVAKYEAAMDDDFNTADGISAVFEL VKLSNSTAGADSSKAYVVYLKETIEKLCDVLGIVTEKKEEMLDSEIEEMIAARQQARKDK NFALADEIRGKLLEKGIVLEDTREGVKWKRA >gi|229783994|gb|GG667741.1| GENE 2 634 - 1086 251 150 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 [Bacillus selenitireducens MLS10] # 25 146 9 132 141 101 43 4e-21 MEEGIRDVSGFQTLCREALRLKEMDPGTYSPLALAYIGDAVYELIVRTMVISHGSIQVNK MHKKSASLVNAGTQAEMIRLMMEELTEEETAVYKRGRNAKSVTTAKHATVVDYRTATGFE ALCGYLYLMGRLERLVTLISHGFEKIGELE >gi|229783994|gb|GG667741.1| GENE 3 1083 - 1835 671 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 12 248 9 246 255 263 53 8e-70 MTEEFTTEEFIIEGRNAVLEAFRAGKTIDKLYVQEGIKDGPVQSIIREAKKQDTIVNFVS RERLNQMSQEGRHQGVIAHAAAYEYAEVEDILKAAEEKGEPPFIFLLDGIEDPHNLGAII RTANLAGAHGVIIPKRRASGLTAVVAKTSAGALNYTPVAKVTNLAATIEELKEKGLWFVC ADMNGELMYRLNLKGPIGLVIGSEGDGVGRLVREKCDMVASIPMKGDIDSLNASVAAGVL AYEIVRQRLG >gi|229783994|gb|GG667741.1| GENE 4 1934 - 3013 1140 359 aa, chain + ## HITS:1 COG:CAC2409 KEGG:ns NR:ns ## COG: CAC2409 COG1305 # Protein_GI_number: 15895675 # Func_class: E Amino acid transport and metabolism # Function: Transglutaminase-like enzymes, putative cysteine proteases # Organism: Clostridium acetobutylicum # 221 337 264 376 397 60 31.0 5e-09 
MVLTACAALMVSVNGCQSQKASVADSQGIAAESTAALTEESLTEEPELTGEAETAREPEV TGESETAGGQESQALVTVEAEAVPLYSKPAGSNVRTPQASGTTTYSNNSVTLDASNLSQG YVMVQYTGGSSKIKVQVIKSGGETYTYDLNARSSYEVFPMSEGNGSYTVRVLEQVQGNQY AVKFSQDLSVSLSDQFAPFLYPNQYVNFTADSSVVQKGAEVAASAADALDVVSSVYNYVV SNITYDTAKANSVQSGYLPNCDSVLAQKKGICFDYAAVMTAMLRSQNIPTKLVVGYTGSL YHAWINVYIEGQGWVDNIIYFDGHDWKLMDPTFASSGKGDPAIEEYIKNSSNYKAKYSY >gi|229783994|gb|GG667741.1| GENE 5 3041 - 4093 1268 350 aa, chain + ## HITS:1 COG:aq_747 KEGG:ns NR:ns ## COG: aq_747 COG1459 # Protein_GI_number: 15606135 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Aquifex aeolicus # 10 349 65 406 408 150 28.0 5e-36 MAKELKKHYSARELSAFCLQISLLLRAAIPLDEGLSVMAEDAADEEEKKILLHMSEEVEL GSPFSAVLQEVGSYPSYVIKMAKLGEETGTLDQMMASLSEYYEKEYILMKSIKNAVTYPI IMIFMLLVVLFVLFTKVMPIFESVYEQLGARLSPVSMAATRLGGMFSGVALAAGVLLAVV AGIIWLLGRGGKKIGAVEHVADRFKRKSRIALAVANRRFTSVLALTLHSGLELEKGMELA AELVENPAVEEKIKACGEELETGTDYYSAMKNTGLFGGFHVQMIKVGVRSGRLDQVMEEI SRSCEEEADTALDNMVNRLEPTMVAVLAVAVGLILLSVMLPLVGVLSAIG >gi|229783994|gb|GG667741.1| GENE 6 4105 - 4428 339 107 aa, chain + ## HITS:1 COG:no KEGG:Closa_1301 NR:ns ## KEGG: Closa_1301 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 105 1 105 107 171 74.0 8e-42 MKRWKQTVKELYGYVLSVVVFVCVLALFVTGILFFSKKADARGADTLRDAIRRASVQCYA IEGRYPPSVDYLEENYGILIDRDKYDVFYSGFASNFMPDITVNLHEQ >gi|229783994|gb|GG667741.1| GENE 7 4444 - 4923 600 159 aa, chain + ## HITS:1 COG:no KEGG:Closa_1302 NR:ns ## KEGG: Closa_1302 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 159 1 159 163 215 69.0 4e-55 MGRRNDHKGQLVNTLFTMLLFLVFVLCALFAVLIGGKVYENINARTESTYRGDVALSYIA NKVRQGDADGMVGLTEMEGITVLQLKQEINGSEYVTYIYYRDGKLWELFTDTESGLGVND GYEILECDEVVMSMDGGLFSASTSGDGGGRIWLSLRSGG >gi|229783994|gb|GG667741.1| GENE 8 4928 - 5392 388 154 aa, chain + ## HITS:1 COG:no KEGG:Closa_1303 NR:ns ## KEGG: 
Closa_1303 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 154 1 133 139 125 46.0 7e-28 MRRKQHSGIFMMEMIMVVFFFILCASICILVFVKADRMSRDAADLNQSVLIAQSVAEVWK SEGTEGLKQRFDADGPAADRNSATDEEGARNRYVMEFDKNGSSAKNTPAGSGDERTSEKV YTAVLSADEAEGSAEITVSREENSVYTLSVSRHE >gi|229783994|gb|GG667741.1| GENE 9 5418 - 5897 582 159 aa, chain + ## HITS:1 COG:no KEGG:Closa_1304 NR:ns ## KEGG: Closa_1304 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 158 1 161 164 159 56.0 3e-38 MAKQESGFKANIGSSSLILIFIVMCLVTFGMLSLTSARSDLSLANRNADAVAEYYRADTE GEAFYHMVAGEVKAACAGAAGQEERLALLKTSLGEYLRDGVAVTEVSMARAQALHIELEP DLDGDGSVRVAKWNVIQTEDYEIDDSMPVWGGTVNSEEK >gi|229783994|gb|GG667741.1| GENE 10 5946 - 7013 1105 355 aa, chain + ## HITS:1 COG:aq_745 KEGG:ns NR:ns ## COG: aq_745 COG2805 # Protein_GI_number: 15606134 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: Aquifex aeolicus # 1 351 13 362 366 317 45.0 2e-86 MKVIDLLSSAVEQNASDIFLVPGMPFSYKLGSRIVCQNEEKLFPAQMETLITELYELAGN RSMEKVREHGDDDFSFALPGVSRFRSSVFRQRGSLAAVIRVVTFDLPDYETLHIPQNIID IAQMTKGFVLVTGPAGSGKSTTLACIIDKINHTRNAHVITLEDPIEYLHRHKKSVVTQRE IVTDTDSYVSGLRAALRQAPDVILVGEMRDYETIQIAMTAAETGHLVISTLHTVGAANTI DRIIDNFPPNQQQQIRMQISMVMQAVVSQQLIPTVDGGVYPAFEIMFFNNAIRNMIRESK THQIDSVIATGQAEGMISMDTSLLNLYKKGMITKESAVIYSSNSELMAKKIERQG >gi|229783994|gb|GG667741.1| GENE 11 7082 - 7711 623 209 aa, chain + ## HITS:1 COG:BH3937 KEGG:ns NR:ns ## COG: BH3937 COG0039 # Protein_GI_number: 15616499 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Bacillus halodurans # 3 196 4 197 310 234 55.0 1e-61 MRTDKRKVALIGTGMVGMSYAYSMLNQNICDELVLIDINKKRAEGEAMDLNHGLAFSASN MKIYAGEYKDTSDADIAVICAGVAQKPGESRLNLLKRNAAVFKSIVDPVTESGFNGIFLV ATNPVDIMTRITYTLSGFNPRRVLGSGTALDTARLRYLLGESLSVDPRNVHAYVMGEHGD SEFVPWSQAMIATKPILSLCEEQGEGGTS 
>gi|229783994|gb|GG667741.1| GENE 12 8650 - 8946 255 98 aa, chain + ## HITS:1 COG:SA0232 KEGG:ns NR:ns ## COG: SA0232 COG0039 # Protein_GI_number: 15925944 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Staphylococcus aureus N315 # 2 98 219 315 317 105 52.0 1e-23 MRGAAYQIIEAKSATYYGIGMALSRITRAILGDENSVLTVSAMLRGEYGQNDVYAGVPCI INKNGIFRVLPLDLSEEEKRRLGESCDTLRASFDELEQ >gi|229783994|gb|GG667741.1| GENE 13 9023 - 9295 379 90 aa, chain - ## HITS:1 COG:no KEGG:Closa_0281 NR:ns ## KEGG: Closa_0281 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 2 73 4 75 75 72 48.0 7e-12 MGHINIKLADLICEKGESKNKVCYHCEIQRTQLNNYCNNKVSRVDLALMARVCKYLDCDI SAILEYVKDEDEENAEERKESKATGEKKKD >gi|229783994|gb|GG667741.1| GENE 14 9336 - 9503 125 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCENKMYLYRYILTDVLKYRYSAIRKGEEVPYEEIVLFCSSGYCDGDTIQNDTTL >gi|229783994|gb|GG667741.1| GENE 15 9430 - 9777 316 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623935|ref|ZP_06116870.1| ## NR: gi|266623935|ref|ZP_06116870.1| hypothetical protein CLOSTHATH_05259 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05259 [Clostridium hathewayi DSM 13479] # 1 115 1 115 115 221 100.0 1e-56 MKRLSCFAAAVIVMVTLYRMIPHYEILAQYSFSSPGSRETTILAVSHTVGPALYASGETI LRLHRKLNYDVPSSRVILMIYGKKSDVEKGKCSAIMIYEGEDGSREEDAGIVFLK >gi|229783994|gb|GG667741.1| GENE 16 9774 - 9971 68 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871233|ref|ZP_06116871.2| ## NR: gi|288871233|ref|ZP_06116871.2| putative mercuric transport protein MerT [Clostridium hathewayi DSM 13479] putative mercuric transport protein MerT [Clostridium hathewayi DSM 13479] # 1 65 1 65 65 116 100.0 6e-25 MTGDMQGGWRSEGIPLTYLVYKVAATRLTGGSRLGYKPLLLGQRLKALAFTSFKPLCAFT SVAAP >gi|229783994|gb|GG667741.1| GENE 17 11100 - 11213 62 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTGLGPVTSSLPILKIALCLMRCDDVMLKKLGFMRIL >gi|229783994|gb|GG667741.1| GENE 18 11302 - 
12168 147 288 aa, chain - ## HITS:1 COG:BS_ydcL KEGG:ns NR:ns ## COG: BS_ydcL COG0582 # Protein_GI_number: 16077547 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus subtilis # 82 282 186 368 368 71 27.0 2e-12 MSYAVGEGLLSLNPIIYAGRQSKGRKAKKEYKVDYLTIEQVKRLLWALDNPIPIKHKAHD QVDDTGKPYHVPEYTQVWRLPLKWRAYFYLALFVGDRRGENISFTWEDIDLETGTVNIDK STAYVDRQIIHKSTKTYKSRCPIVPPVVTGILKQWKAEQQQQSLERGTAWIGRHGKDFDK NYIFTQENGKQMHPSSPYHQFKRIIRLYNENIAEDEDHMIPPNATQHDLRHTAASILISN NMDPRSVAGVLGHSNASTTLNIYAYFFQTKNQEAADIMADTLLQSSTS >gi|229783994|gb|GG667741.1| GENE 19 13163 - 13531 215 122 aa, chain - ## HITS:1 COG:no KEGG:Rumal_0678 NR:ns ## KEGG: Rumal_0678 # Name: not_defined # Def: integrase family protein # Organism: R.albus # Pathway: not_defined # 1 121 1 112 393 74 38.0 1e-12 MASIRQRGDSYQVTVSNGRRSDGTQIIETDTYTPEPGMTKRQIEKALNEFVVDFERDVKS GQNVKGKRMTFEQLTKQFLKDTKPTGNEERDTLAITTWSSYKSDLEHRINPRIGHLKIID II >gi|229783994|gb|GG667741.1| GENE 20 13757 - 14200 334 147 aa, chain - ## HITS:1 COG:no KEGG:Vpar_0382 NR:ns ## KEGG: Vpar_0382 # Name: not_defined # Def: transcriptional regulator, XRE family # Organism: V.parvula # Pathway: not_defined # 2 66 4 68 123 67 49.0 1e-10 MNIGERIRGLREKQEMTQTELAEKIGSTKQTVYKYENGVVTNIPYDKLILLAKALGTTPS SLMGWDKIEEAINEEFTRLDNLTKQYILLFELNGYKIDVGDEQVKITDKQKTACTVTKKD FMSMIQYCYIDIENNMNKLLRSYETTS >gi|229783994|gb|GG667741.1| GENE 21 14333 - 14653 124 106 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2216 NR:ns ## KEGG: Cbei_2216 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 63 1 63 65 67 57.0 1e-10 MIRTNELRGIIAKNGFSQVKVAKMLGISPKTFYNKMKQGRFGSDEIQVMIDRLNISNPCE IFFAEEVTLKDTRKENKDAETNSSSRKQRGGRKRVQRPEGSHIQGH >gi|229783994|gb|GG667741.1| GENE 22 14565 - 15404 533 279 aa, chain + ## HITS:1 COG:no KEGG:CKL_4029 NR:ns ## KEGG: CKL_4029 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri # Pathway: not_defined # 8 148 9 154 263 124 46.0 3e-27 
MQKQIVAVGNSEVAVKEFRGQRVVTFKDIDTVHERPEGTARKRFNDNKKRFIDGEDYFKI SASEFRTHFNPAYSKQATEDITLITEQGYLMVTKSLTDDLSWKVQRQLVTSYFKKEAAPQ KQKDTKRPPLSSVNMMVKNIREAYKDAKIEPIYIAAEMQHIYKDSGYSLTAPLLTDKETM PKLYDCTEMAKELGIYSSKDKPHNQAVSVIIKKLHISDSEIVTTAFSRNGHEDVTVQYKP SVMEDARLWLAENNYPNKIPYTDSKGNPKTCTVVYKEVP >gi|229783994|gb|GG667741.1| GENE 23 15523 - 16683 544 386 aa, chain + ## HITS:1 COG:no KEGG:BL03487 NR:ns ## KEGG: BL03487 # Name: not_defined # Def: hypothetical protein # Organism: B.licheniformis # Pathway: not_defined # 208 366 20 171 184 99 35.0 4e-19 MNTERKKVLIEHIKNRRNITPPMASMERMLENAGVKLFAPSCFNRIVDADLIDNEHVILD YLPESDDIEMDSEGLESCSGILEGFIEPERLREIINGFKVVEDGAVLDNNLLYSLRKWGK KELRRVRKEALEKLDKDESDMAELISYELKARNIISHFRPVMDRLCVMKMVITGLDIPFY SLPGILKYSWNILKDSDPLSIALSPGKYRKAFKKFCEAHRRAIKHDKLCGDELAKYAFLG VKAMGKEYFKKRPKSRVKSCERIEAVYRECSQIMAALGKLTPEELVQMFPIKKEYDGERW GVKDYFYVREAIERLEPDKPIGTEMDAADLLWEYVNDDLTFFFFNWMDAIDDLHLHCFES SPHNEFYKERKDRKQYKIKMEVNTDE >gi|229783994|gb|GG667741.1| GENE 24 16676 - 17515 812 279 aa, chain + ## HITS:1 COG:no KEGG:Bcer98_2967 NR:ns ## KEGG: Bcer98_2967 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_NVH # Pathway: not_defined # 2 209 9 200 278 114 35.0 6e-24 MNEIRVAGRQNFMGWNIPVVLGGFGEGKKCISDKTIAEIHRMEVRNVRARITDNIKRFTE GTDFIDLKSCLPDKQQFLETLGYTQMQISKAEHIYILSERGYAKLIKIMDTDLAWEVHDK LVDEYFLLREKKEKQQLSSTEAQEIELKAKAMRAEAMRLNARTRAFKEIKTSIPRDQLSE VALKVFDLKGAETLMGEELGNYLPPVEKTYSATEVGNRLGITSNKVGKLANLHGLKTEEC GITVMDKSPYSSKNVPSFRYYENAVQRLKEILDREETHE >gi|229783994|gb|GG667741.1| GENE 25 17508 - 17861 412 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623944|ref|ZP_06116879.1| ## NR: gi|266623944|ref|ZP_06116879.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 117 1 117 117 209 100.0 5e-53 MNKEFDLAVTAIWNKYGKYGMQKQEIAGLMKSGMENNSLSLRAVYNGMRMSLAMEFNEQE NFTVEDIMDITGESREEVVKRIEEMREEITATGGNPDDYARKVEAPKGSAFFFPEGV >gi|229783994|gb|GG667741.1| GENE 26 17867 - 18112 202 81 aa, 
chain + ## HITS:1 COG:no KEGG:Rumal_2679 NR:ns ## KEGG: Rumal_2679 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 4 73 6 68 68 62 45.0 9e-09 MAGIPRMRTIQQCAAYFKEQDPGTSVGEWCIRQMVNQGEIPVVRSGRKILVNLDTLIEYL SGEETEVGKHDNECTAEMGNN >gi|229783994|gb|GG667741.1| GENE 27 18223 - 18432 297 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871235|ref|ZP_06410075.1| ## NR: gi|288871235|ref|ZP_06410075.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 69 6 74 74 88 100.0 2e-16 MSIDKKIDYLTAELIQELGTMEIEEIQEFRIEVMRELDTFKRPELVKVYINRLIDLVIQK KQEEERAAI >gi|229783994|gb|GG667741.1| GENE 28 18429 - 18543 62 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIKKNAVQKAKEIRVPARFSISYDQVAEVVNHYESGFE Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:30:21 2011 Seq name: gi|229783993|gb|GG667742.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld135, whole genome shotgun sequence Length of sequence - 13511 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 2043 1383 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases - Term 2052 - 2087 4.2 2 2 Op 1 . - CDS 2126 - 3910 1544 ## Pjdr2_1672 hypothetical protein 3 2 Op 2 14/0.000 - CDS 3935 - 5662 1956 ## COG1653 ABC-type sugar transport system, periplasmic component 4 2 Op 3 7/0.000 - CDS 5757 - 6650 839 ## COG0395 ABC-type sugar transport system, permease component 5 2 Op 4 . - CDS 6666 - 7622 865 ## COG4209 ABC-type polysaccharide transport system, permease component - Prom 7815 - 7874 6.5 + Prom 7697 - 7756 5.7 6 3 Tu 1 . + CDS 7818 - 8924 790 ## Pjdr2_1668 putative sensor with HAMP domain 7 4 Op 1 1/0.000 + CDS 9849 - 10580 571 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 8 4 Op 2 . 
+ CDS 10577 - 12184 1241 ## COG0784 FOG: CheY-like receiver - Term 12157 - 12194 1.4 9 5 Tu 1 . - CDS 12266 - 13045 860 ## COG4636 Uncharacterized protein conserved in cyanobacteria - Prom 13071 - 13130 6.9 10 6 Tu 1 . - CDS 13149 - 13499 292 ## gi|266623957|ref|ZP_06116892.1| conserved hypothetical protein Predicted protein(s) >gi|229783993|gb|GG667742.1| GENE 1 3 - 2043 1383 680 aa, chain - ## HITS:1 COG:SP0312 KEGG:ns NR:ns ## COG: SP0312 COG1501 # Protein_GI_number: 15900245 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Streptococcus pneumoniae TIGR4 # 75 679 2 593 679 625 50.0 1e-178 MVRDIKRYGNPAGRESAFIQGPDYRFTVLTSSLIRMEYNEKGNFEDRATQIVMNRDFQVP EFDVIKENGILTITTEFLQVKYRGGKFAEEGLSIRTILNQETGENRLWRWGDEIKDLGGT ARTLDEVDGAISLEHGMISKDGCTLIDDSASMVITEDGWVTPRGGDGLDVYFFGYGRRYQ ECIRDFYRLCGNTPLLPRWALGNWWSRYHKYTEEEYKRLMVRFQRERIPFSVAVIDMDWH LVDIPAKYGSGWTGYTWNKELFPDPPEFLAWLHEQGLKVTLNVHPADGVRAHEEAYPRMA EALGIDPDSEKTVEFDVTDRRFLEAYFDVLHHPMEGEGVDFWWVDWQQGKKTRIPGLDPL WMLNHYHYLDSTRKGGAGLTFSRYAGIGSHRYPVGFSGDTIISWESLRFQPYFTASASNV GYGWWSHDIGGHMMGARDDELAVRWVQLGVFSPINRLHSTSNPFSGKEPWNFNKRAEMIM KDFLRLRHELIPYLHTANRRASLEGCPLIRPLYWEEPEQKEAYEFPNEYYFGSELLVIPI TERLDREADLAGAKAWIPEGLWFDYQNSRIYQGGKQMTLYRTLEQMPVLAKAGAIIPRSL DGKGSTENPDCMAVDIFPGTDGCFTIWEDDGRTDGDTVERWVSTRLVLEWGSSVRFAVSG AEGNRSAIPQRRSWKLNFRN >gi|229783993|gb|GG667742.1| GENE 2 2126 - 3910 1544 594 aa, chain - ## HITS:1 COG:no KEGG:Pjdr2_1672 NR:ns ## KEGG: Pjdr2_1672 # Name: not_defined # Def: hypothetical protein # Organism: Paenibacillus # Pathway: not_defined # 94 586 114 600 627 561 55.0 1e-158 MTEKKISYGGIRPIVYCVSREEGEPYLIHNYDGAVEFPPVARCFILPLNGQGEYRSYEAE ASEAFFIRISADETEVTVSAGEGGAFSEPVWKEFDGPPRCDVMGNTMELDFENGGIFLDT YYHFYWETLLPSVVERTRAKSYSDSDGYVVSTLQKGAYAGTYPDVDHEFQIKGRLAMADE LDLAVVKRMMELQLKMMAEDPEQAFRDPCAIQPNGAREYHVRRNSLDNSENAEMFLVTGN IEIIESAFFYYSAVRDRAWLGDHMEGLEQSLSLVESCIDRQGRLWSDVYYEDQVIKDGRE 
CMAQALAARSFELMAQLEEVLKRDRQAEHYRTVSKILGKALVESLPAGFWDASKNRFVDW VDRNSVPHDHIHLLANELPSLFHYTKPEQEDGVDRLMEECFLEFQRFPSFVSAQIADYTE AEIGSGGPYDLCAAGRYWCWDFEYWREKGRSDILEQQLLAVCRQARTDEYRMGERYDMNY VYYLSEKNWHGAAHYYEYPCVFIWNLIAGYAGVRFGLQADLEIEPMLAGDGRVRLENPQY AIEYEVGENTIFVKNLLAQERLLLVKWRGEERLVKIKENGTWSYTCGDGNNPES >gi|229783993|gb|GG667742.1| GENE 3 3935 - 5662 1956 575 aa, chain - ## HITS:1 COG:BH1913 KEGG:ns NR:ns ## COG: BH1913 COG1653 # Protein_GI_number: 15614476 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 96 455 40 400 551 86 26.0 1e-16 MRKFRRVLSCMVCSAMIAGMLGGCGGGQSTVKETTGETAAETEKAESEPQTQAPEAAVSD FQSEMDWERIGWQADPEDLEWKEDTSPIELGYYANFSWFSLDWNDATAERVTAKTGVDLN FMKPVVDDGQKLNMMIAGNQLPDILTLDKNDAALKKMAEAGMLWSLDELIDQYAPKMREI LPKEILSNYQMADGKTYQFTTWIQGEAWQKAAREYDQIVGTNQPVMTVRKDYYDEIGRPE IKNMADFIAAVKQMKENHPDKIGFYPADGSMSADEFGKSAKLSHYGIQMGLSTDFAEKDG GIQWVVRDDKFKEPMKYLNEMYLEGILTKDPFIDTKDVGKAKIEQGDVISYCWTISDGEK VPGDNPETTYEILPPFETYGQIRTGAGWLATVIPKTCKNPERAIRFLEYLASVEGHSDVS WGIEGDTYQDPVAGAQWHMVDGKPVLLEEYVKDKNADWGGVASRNGLGEYWIACNELLWN LPWWNDQDERMNKFNEQFGKYVEFRPELDIQDPSPESEEGIIRQKAFSLLQQYSVKMIFT EDFESVYQEFIQKIDELGMNKVEAYWTEEYKNKTK >gi|229783993|gb|GG667742.1| GENE 4 5757 - 6650 839 297 aa, chain - ## HITS:1 COG:BH1912 KEGG:ns NR:ns ## COG: BH1912 COG0395 # Protein_GI_number: 15614475 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 12 296 7 296 297 226 41.0 3e-59 MLRMKLGRSREDRIINVVTAVLCIFVLIVTVYPVYYCLIYSFNDGRDAARQALYFWPRKF SLENYKIVFQNKAIYPAFLMTMLRSIVGTVLAVFCMAMAAYGLSKDHLKGRKVYMIMGVI TLYFSAGVVQSYLLYREMHLLDSFWVYILPNIFQFYYIILFISFFKELPAALEESAQMDG AGYFTILFKIIIPLSTPVIATVSLFVGVWHWNDWFHPAFFIQNESLMTLPAVLMRAMSLA EAQQTLQKMIAVPSESSTTMESVRYAMLIVSILPVTIIYPFVQKFFLKGMMLGAVKE >gi|229783993|gb|GG667742.1| GENE 5 6666 - 7622 865 318 aa, chain - ## HITS:1 COG:BH1911 KEGG:ns NR:ns ## COG: BH1911 COG4209 # 
Protein_GI_number: 15614474 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Bacillus halodurans # 20 317 25 318 319 198 38.0 1e-50 MEEVWYVSMKKLPKKKRWGKRQFQLQLMVIPAMIFFFIFSYIPVMGLANAFLDYSLTDGF FGHAFAGMKYFKELFSSESFYLAMKNTLGMSFLKFFFTFTSPLLLALIINEVPFKWLKKA TQTGSYLPHFLSYVIVATLWIVFLDSKGLINDILMRLHITDMPIEFLADSKYFWWIGVWI DCWQESGWNAIIYLAAIAGINQDIYEAAAVDGAGRLKRITQITLPSIAWTAGVLFVMNCG NLLSGGPVASNFNQSYLLGNAFNYSKSYVIQKFIVDEGLNQLRFSFAAAGNLILSVLSVT LLLTANKISKKVFGKSVF >gi|229783993|gb|GG667742.1| GENE 6 7818 - 8924 790 368 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_1668 NR:ns ## KEGG: Pjdr2_1668 # Name: not_defined # Def: putative sensor with HAMP domain # Organism: Paenibacillus # Pathway: not_defined # 1 368 12 345 580 150 27.0 1e-34 MHYKSKLILCYLVLVSFPLILCLILLYQSIVKPVRENAVSAIDQRLEQELSTINSKIEKI RNVSYLVSTNTVINKFFVPKFYGDLELIEILNNDISPLLSWLEASSPEVSYYHFFTNNPS IPDTQYLHHYDDYRSEPWMQEMEQSVFDKGYYLEPYHQNRSYSFGYGVEDQMVYSLFYPL LTANNFLEVCIKPSVFFEDMKAIPAMESGFVAAVNRDGKVISDLPSTVTAAFREALEQSL TEGGYLSGPADSPVLVSLEQTEYLLSLRKIDALDSWILCAVPNAEIVGPVLKAQQSFALV ILAAAIGIVLLSYLLSTLLIRKINTIIDAVHKIQEGDFHVSIPVNGEDEIDQLATDINYM SYKINELI >gi|229783993|gb|GG667742.1| GENE 7 9849 - 10580 571 243 aa, chain + ## HITS:1 COG:SPy1588 KEGG:ns NR:ns ## COG: SPy1588 COG2972 # Protein_GI_number: 15675475 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pyogenes M1 GAS # 3 216 368 563 574 114 32.0 2e-25 MLQKETELSALQAQINPHFLFNILDTFKMIAIIHDLDDFSDSIAALGNLMRYNISPASYR CTLEHELQIIHDYINIQNLLLNNRVTLLLTVPDDLLRFEIPNFIIQPIIENSFVHGFENK LNELIIKVSIYRDGSNIIIQIEDNGMGIGGEQQQALTDAMKKSTETLEIGPAQDPFREGE TSRSGKSIGLLNVNLRLFLLYRDRYSFTFSASDLGGACNTISLPVSIPTKEQEENFLKER SML >gi|229783993|gb|GG667742.1| GENE 8 10577 - 12184 1241 535 aa, chain + ## HITS:1 COG:BH3446 KEGG:ns NR:ns ## COG: BH3446 COG0784 # Protein_GI_number: 15616008 # Func_class: T Signal transduction 
mechanisms # Function: FOG: CheY-like receiver # Organism: Bacillus halodurans # 3 169 6 172 200 107 34.0 9e-23 MTLLIVEDEDIIRKGLIVTLRKLEMDFEHIYEAGDGEEGLRLCREYTPDIIMTDIKMPLM DGLAFIRESQKLLPEGQFIILSGYSDFEYARTALKYGVKDYLLKPSTKNEIKEVLTRVVE QLKEQQSLRLELTERIHTYEKKLDRFQELLLGNILSGRYPAGEIGRYLSHYSIVFPEEYM EVLCIRITSAVSSGEPSIDYKNHFMWIASLFTPYYTVYQADILSVYKCLLINFSRMEPGY SQKIWKKAETEIREYGQKHGLRICLSISRAEVSEEALPDLYQEACMLLHCRLFQPEAVIF DPKCQRAAETKIPAVPLTMIETLYHYYIGNSQFDLRQNFYSFIHHITHIEDSSPGYVCEC LDRLEEFFAVQSAKDGLEPDTAYRIAFSVSESMTVCSSLEDLTDMLYARLLDYWKLKKEG TAPAAHMASSPVDQAIAYMEQNYYLDLDLSMISDLISMNSSYFSSLFKKKTGMNLINYLQ NVRIEKSKQLLLNSNQKLYEISEAVGIPNVKYFCKLFKDYTGVTPSEFRKHEHSF >gi|229783993|gb|GG667742.1| GENE 9 12266 - 13045 860 259 aa, chain - ## HITS:1 COG:aq_1194 KEGG:ns NR:ns ## COG: aq_1194 COG4636 # Protein_GI_number: 15606437 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in cyanobacteria # Organism: Aquifex aeolicus # 83 250 7 164 180 94 33.0 2e-19 MMDLEKMKKRKKELGITNQELSERSGVPLGTVNKIFSGATKSPQYDTILALETVLGMTFY RDEDGPYVSSVHEEALEYSVHGRYTLEDYYALPDHVRAELIDGTFYYMSSPGLVHQKLIG ELYFQFKEYIRRKAGPCEVFLSPFDVFLDSDDKTVVQPDLMIICDQTRVESKGVAGAPDF VMEIISQSTGKKDYSIKLNKYWSAGVREYWIVDPLKGRILTYCFNDGDMDMKIYGMHDTV PVGIYEDLSIDFGEFSFLV >gi|229783993|gb|GG667742.1| GENE 10 13149 - 13499 292 116 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623957|ref|ZP_06116892.1| ## NR: gi|266623957|ref|ZP_06116892.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 116 5 120 120 223 99.0 5e-57 MIPEGGYRTEVEADGNADENSEVVKNGFRLIELVVPFPVTLRLDGLIADRLGVSRSMVKR WCSQGIMVKLDEYEMVWKKEEKLCREKVRNGMRIGIGTDLPVSETGAAPQAVKDSI Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:30:46 2011 Seq name: gi|229783992|gb|GG667743.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld136, whole genome shotgun sequence Length of sequence - 13867 bp Number of predicted genes - 11, with homology - 11 Number of 
transcription units - 3, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 651 473 ## COG0395 ABC-type sugar transport system, permease component 2 1 Op 2 . + CDS 673 - 2019 931 ## COG3119 Arylsulfatase A and related enzymes 3 1 Op 3 . + CDS 2034 - 3509 896 ## COG3119 Arylsulfatase A and related enzymes + Term 3557 - 3597 6.3 + Prom 3634 - 3693 5.3 4 2 Op 1 35/0.000 + CDS 3755 - 5164 1542 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 5177 - 5222 8.1 5 2 Op 2 38/0.000 + CDS 5225 - 6112 839 ## COG1175 ABC-type sugar transport systems, permease components 6 2 Op 3 . + CDS 6118 - 6960 698 ## COG0395 ABC-type sugar transport system, permease component 7 2 Op 4 . + CDS 6998 - 8092 1075 ## COG1874 Beta-galactosidase 8 3 Op 1 . + CDS 9057 - 10055 561 ## Haur_2392 beta-galactosidase (EC:3.2.1.23) 9 3 Op 2 . + CDS 10052 - 10759 325 ## gi|266623966|ref|ZP_06116901.1| hypothetical protein CLOSTHATH_05291 10 3 Op 3 1/0.000 + CDS 10808 - 12586 1575 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 11 3 Op 4 . 
+ CDS 12564 - 13820 1046 ## COG0784 FOG: CheY-like receiver Predicted protein(s) >gi|229783992|gb|GG667743.1| GENE 1 1 - 651 473 216 aa, chain + ## HITS:1 COG:lin0761 KEGG:ns NR:ns ## COG: lin0761 COG0395 # Protein_GI_number: 16799835 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 1 216 76 291 291 186 43.0 2e-47 NSLIITLVSIVFSLLFNSIAGYAFARLHFKGSKVLFTLLLVGLMMPPQVTMLPTFLIMAR FPMIGGNDILGFGGSGLMNTFPGVIIPLVSGSFGVFLSKQFYENFPKSLDEAAMLDGAGK WKTYFYIYLPNSKVILATLALLKGSAVWNDYLWPLVMTNSVNMRTVQLALTMFRDESTVR WGPMMAAATLISLPMILLFLCAQKYFVQGVVSTGLK >gi|229783992|gb|GG667743.1| GENE 2 673 - 2019 931 448 aa, chain + ## HITS:1 COG:PM1682 KEGG:ns NR:ns ## COG: PM1682 COG3119 # Protein_GI_number: 15603547 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 1 444 1 451 453 498 53.0 1e-140 MERPNIIVYFSDQQRWDTLGCYGQHLNVTPNLDSLAKEGVIFENAFTCQPVCGPARACLQ SGKYATQVNCHRNALALPQNIKTLADYFNESGYETAYVGKWHLASNRSSYTAPFDEENYE TRAIPPERRGGYRDYWMAADVLEATSHGYDGYVFDDEGKKCEFIGYRADAINNYAIHYLH QYDGEKPFFLFISQIEPHHQNDHGRFEAPDGAREKFTGYELPADIPVYRGDWKEHYPDYL GQCHSLDENVGRLVETLKARGLWENTILFYTSDHGCHFCTRNSEYKRSCHEASIHIPLLA VGGPFKGGKTISELVSLIDLPVTLLDCAGIEKPEDFQGNSLKSLADGTSEGWEDCVFIQI SESGVGRAIRTDRWKYSIAADADGWNCPGADVYYEQYLFDLEHDPAEQCNLAGDPDCETT RRMLREKLLFCMERAGEKRPVIRPFRGD >gi|229783992|gb|GG667743.1| GENE 3 2034 - 3509 896 491 aa, chain + ## HITS:1 COG:STM0886 KEGG:ns NR:ns ## COG: STM0886 COG3119 # Protein_GI_number: 16764247 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 1 490 1 492 495 477 48.0 1e-134 MKAIMIMFDSLNRRFLEPYGCRETITPNFLRLAQKTVRFENSYAGSLPCIPARRELHTGR YNFLHRSWGPLEPFDDSMPGILREHGIHSHLVSDHGHYWECGGATYHTQYTTWENIRGQE GDPWAPVVGGAEDTDPNFAVFTEGLRSELYRQNLANRTRVTEKDQYPLVKTFDQGLGFLR ENEGKDSFFLQIESFSPHEPFMAGREFKEMYPGKVKGKKYEWPDYGPVTEAPDEIEEVRY 
SYFADLTMCDEQLGRLLDYMDEKNLWEDTMVIVNTDHGYMLGEHGYWAKNYMLCYDEIVH TPLFIWDPRFPESAGTSRTALVQTIDLPVTILKFFGIDPAEDMQGQDIAQIIGEDKGSRS CGLFGIFGAHICCTDGRYVYMRAPRNLEIPLHEYTLMPTHMMDFFSPRELESMERHEAFS FTKNCPVMQIDADKAIRCIEEGDYLFDLWNDPGQKHPIVSEEIVGKMEQSMVRLMRLNDA PEELYLRFGFA >gi|229783992|gb|GG667743.1| GENE 4 3755 - 5164 1542 469 aa, chain + ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 2 464 3 434 438 84 22.0 4e-16 MRKSICKAAALILAGTMCAGLTACGGKTQETKVQETAVQDTKEAQTGETGKEQASEAGKE SAAPKKKVSLVLATNWGEGDSKYHYFYPKFQEFQEANADTMDITLETYSTEDYKTKIKTQ TASGDLPDVFTYWGGAMMTDMVEAGLLLDVEEYFNASDQVKRDGFEPTSFGYYTASDGKT YGIPIESTRGIFLANKQLFEENGLDIPKTYEELLKTAKTFNEKGIIPIAIGSNGGTPSEF FFSELYGQYDGAAKELEELAATRTFNTGNALKVANNIKDMIDQKMFPADTVANGGWAASL QLYTDRKAAMTYTYPWMFESIPEEIQEASEIVPLPMLPDADIDPAGIMTGFTVYGFEINK ASFEDAAKHDAVVKLCDFLASDELSAELTKSGMIPAKNIEIDMDSQKLIFKKMMEYAQDK KLISVHYTTMPDQDAVNMMDSSLDEFFIKALTPQELIDKVQAVLDKNNK >gi|229783992|gb|GG667743.1| GENE 5 5225 - 6112 839 295 aa, chain + ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 10 285 4 281 292 181 35.0 1e-45 MKRKYLKKYHNLPVTVMFLAPAFLIYTVFLFLPLILTFWYSLTTWNGVSAKVFAGLANYR ELAADSDYWLTFSNTVKLTVISLIIQIGAGILLAYLLYSWTKGMKIFRTIYFLPVVIAPV AIGLMFSLFYNSEIGIFNKILTAVGLGRFATNWLSNPRTLLYAVMAPQVWQYIGLYVTIF LGALQSVPEDLIESAQIDGAGKLQSFFHVILPQITGFMNICIILCLTGSLKAFDHSWIMT GGGPGVRSAYLGVFMYKSAFVNSDFGLGSAVTITIVTVSLTVTILFNWFADRKKG >gi|229783992|gb|GG667743.1| GENE 6 6118 - 6960 698 280 aa, chain + ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # 
Organism: Bacillus halodurans # 7 280 17 293 293 159 35.0 4e-39 MERFLKRSLKYRVTAVLCYAIMSISAVAALYPFIWVVFSSLKDNNEIYSNSFGLPKTAVW SNYAEAWKGARVGRSFVNSILVCLITLAVLTVITSMASYILTRVTKSRLLSVYFSLGIMI PVHALLIPSVLIFKRLHLIDHLSGLMIVYTAVNISFAMFIMNGFMEGIPRELDEAATIDG CGRAGIFFHIVFPVAKPGIATVATITFLNCWNDLLLGLVLISTPARKTLSMTISALKGSY VTQYGLLCAGFVISIVPVVFMYLLFQKQVIAGMTAGAVKG >gi|229783992|gb|GG667743.1| GENE 7 6998 - 8092 1075 364 aa, chain + ## HITS:1 COG:PAB1349 KEGG:ns NR:ns ## COG: PAB1349 COG1874 # Protein_GI_number: 14521734 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Pyrococcus abyssi # 2 277 6 280 787 119 29.0 9e-27 MVKFTHDGIRINGVERPLYSGSIHYWRTKKKDWRLVLSQIRGMGFEIVETYIPWSVHEYE EGKYDFGEIEEEKDLDLFLTLCEEMNLAVIVRPGPHINAEMTGFGYPEWILMDEEIQAKN PWGTTVLYPYATGPFPIPSYASDKLYEKTEIYFQKLKPILQRHCGGSGCICLIQADNETC NFFRDRPYIMDYSDCSKAQYHKFLVERYGTVAAMNERYHRNYSGFQETEPPCGFEEGDFS SLCYYLDWVEFKEYQILAALGRLVDIINRLELPVPIFHNCAYQNYTPVSVQRDEEIPGLM VAGIDAYPEPGDAAMLKERIRYLAGSSRFPFVPEFGSGSWFDREMLLTAEEERFGYLYSV MNGF >gi|229783992|gb|GG667743.1| GENE 8 9057 - 10055 561 332 aa, chain + ## HITS:1 COG:no KEGG:Haur_2392 NR:ns ## KEGG: Haur_2392 # Name: not_defined # Def: beta-galactosidase (EC:3.2.1.23) # Organism: H.aurantiacus # Pathway: Galactose metabolism [PATH:hau00052] # 1 248 377 621 672 103 30.0 1e-20 MLAERDRWTGCPVTNDGRIRKPYYEMFSDLLRLLKDHEIYHFNRSPRVLILKNYDMGRLK ALHSVMDCNTFSSNCFIKGPDIPAALFLPEGEPEYYLDRELNYWDDAWVKGLAKRFDEEH EIYDYSDEYLDYGRWLQYDVIAAASGDIMDAKTQKRLADFAGMPGKKVILGPVIPRYDRT LKACEILKHMVEEGEDSGILFAKEPGEIGAGFWEHLRRSREYGCGDPEIELAVHRRENSR DHLLYTANLSGKEKTAFISVSDEPNEHISESIWRLPRGSKSPLQTAGALNASAERRRFSV LQGPVKINMEKNGITVQIPAYTVAILYVEENV >gi|229783992|gb|GG667743.1| GENE 9 10052 - 10759 325 235 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623966|ref|ZP_06116901.1| ## NR: gi|266623966|ref|ZP_06116901.1| hypothetical protein CLOSTHATH_05291 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05291 [Clostridium hathewayi DSM 13479] # 1 235 1 235 235 468 100.0 
1e-130 MIDSGLKVEYLDPELWTHLGEILQPLSGQGQVLHVRRLSSGLYRAVKCSRGTAAVCESEA GCPFLDLSQYMEGEELNTERIFRDFYDIMEIRIYTLEGLRQCYAESQKAELYRLDIDEAI SKIYGIFGHTAGMTAVRRKETGERWYFESLRRLLGDKDGDHVYLLWITDNRTLYFNCILE MKRGRLVGISTSDRYGGERDYDKIQGLIEKEYRMKPEVFCMELSEYRTNERIFWF >gi|229783992|gb|GG667743.1| GENE 10 10808 - 12586 1575 592 aa, chain + ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 39 584 49 598 602 166 26.0 1e-40 MRFTLAMVLMSLIPGLILAVVYYRNVKDFYQDKTWTYQENTLRLMEAKMQDIVKQSEVVL NQVLGLSVTSDLFSGYTEMNAYERLNLMRNINGTLSNIRIANNSIDHIYLLAFDGGAYSS NAGWNKREYAENDWTKIREDQPGESITVPTHPAAYKYLNPGAADPLVVSIVTYLNRYTGN SVIGLVQIDLSYEKIREAMDCMEMDDSDMAFVVDEEGRVIFAPKESMAGLPAADVSLGAY NLAQILSAVDGGVEEDGNFSIHLDYGRDSGTIRKRRVGNSGFSIIQINSGRMLKQELDRF RKVWAAVVGACMVSAALLASSLSAGIVKPVTSLIRSMSRVSRGDFSTKVETPKDKDLSEL ALSFNTMVAEVDKLMHENIVKERERLTMEQTALNSQINSHFLYNTLNTVKWMAVRAGNEE IARMIVALVNMLEYSCKKIDMPVFIPEEIKFIKNYVCIQEGRCCSSVHIKFDIDPKLENC LILKMLLQPVVENAMQHGFGEDNIDNRILITGRLLGDRVQFQIRDNGQGFLYEGFDKLTG VGLHNIQDRIRLNYGDNYGVELESKPGEGTAVTVEIPVCQKAEEADGEDFNC >gi|229783992|gb|GG667743.1| GENE 11 12564 - 13820 1046 418 aa, chain + ## HITS:1 COG:BH3787 KEGG:ns NR:ns ## COG: BH3787 COG0784 # Protein_GI_number: 15616349 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Bacillus halodurans # 1 105 1 103 120 83 37.0 8e-16 MEKILIVDDEMFSRTAIEDILRAEWSSIEVFSARGALEALRILEREEIDLLMTDIKMPGM NGLELLAEVRKRGLSMEVIILSSYNEFDLVRQAMKLGTFDYLFKPAMLPPDIIEVVKKAL QKQREKSSSRIAESDRKRLSRDREQFFRDLISAAGIGRTWFEERAAAFGLPEPVGKNAVV VFRLIRYRKSMAEIFENDVSLLRASVCNVMSEALDSEKDCQFLCNKFDEYVWIVWEARED GRQEFYERLEALIRTAADFLSRYYGLEFSIGISRSSASLFDCCQGYMEAADNGRESETGI FYAGGPDTGSLKKEMRAALDYIRDHLGDKDLSLQTVADQVGISRNYFSRIFKEVMGVNFI DYITRLRVEKARSLYINTDMKIYEIAELVGYSDWHYLYSVYKKQLGHSMSKEKRGKTT Prediction of potential genes in microbial genomes Time: Fri 
Jul 1 02:31:10 2011 Seq name: gi|229783991|gb|GG667744.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld137, whole genome shotgun sequence Length of sequence - 22881 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 6, operones - 5 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 3 - 555 456 ## COG2384 Predicted SAM-dependent methyltransferase 2 1 Op 2 . - CDS 572 - 1609 1251 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 3 1 Op 3 . - CDS 1602 - 1781 193 ## 4 1 Op 4 3/0.000 - CDS 1821 - 3584 2071 ## COG0358 DNA primase (bacterial type) 5 1 Op 5 . - CDS 3617 - 4627 1061 ## COG0232 dGTP triphosphohydrolase - Prom 4719 - 4778 8.8 6 2 Op 1 3/0.000 - CDS 4905 - 6296 707 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 7 2 Op 2 . - CDS 6325 - 7326 754 ## COG0095 Lipoate-protein ligase A 8 2 Op 3 12/0.000 - CDS 7323 - 8777 1437 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain 9 2 Op 4 . - CDS 8774 - 9592 736 ## COG0403 Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain 10 3 Op 1 16/0.000 - CDS 10536 - 10982 543 ## COG0403 Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain 11 3 Op 2 18/0.000 - CDS 11012 - 11386 562 ## COG0509 Glycine cleavage system H protein (lipoate-binding) 12 3 Op 3 2/0.000 - CDS 11423 - 11875 452 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) 13 3 Op 4 . - CDS 12823 - 13437 712 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) 14 4 Tu 1 . - CDS 13850 - 15382 1556 ## COG1404 Subtilisin-like serine proteases 15 5 Op 1 . - CDS 16339 - 18186 1855 ## COG2385 Sporulation protein and related proteins 16 5 Op 2 . - CDS 18196 - 19428 1383 ## COG0628 Predicted permease 17 6 Op 1 . - CDS 20759 - 21025 363 ## Closa_2197 hypothetical protein 18 6 Op 2 . 
- CDS 21051 - 21851 724 ## Closa_4186 GCN5-related N-acetyltransferase 19 6 Op 3 . - CDS 21936 - 22676 866 ## COG0546 Predicted phosphatases - Prom 22759 - 22818 5.7 Predicted protein(s) >gi|229783991|gb|GG667744.1| GENE 1 3 - 555 456 184 aa, chain - ## HITS:1 COG:CAC1302 KEGG:ns NR:ns ## COG: CAC1302 COG2384 # Protein_GI_number: 15894584 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Clostridium acetobutylicum # 1 184 1 183 229 129 36.0 3e-30 MKLSKRLETVASFVPKGSNIADIGTDHGYVPIYLVREGLAEHAVAMDVRKGPLERAKAHV AEAGLENRIDVRLSDGLTGLKPGEADCVVIAGMGGELVIHILEAGRSLWETIGYWVLSPQ SELDKVRRFLEKESFSIVRETMMKEDGKYYTVMGVTRGGLSGENDSEPHYLYGSRLIRQK NPVL >gi|229783991|gb|GG667744.1| GENE 2 572 - 1609 1251 345 aa, chain - ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 11 344 48 372 372 411 71.0 1e-114 MLDFFAGTVLDADRLDRIYDFLETNKVDVLQINEDEDLELDPDLFIEEELEEEEEIDMEH IDLSVPEGISVEDPVRMYLKEIGKVPLLSSDDEIELAKKIELGDELAKEKLTEANLRLVV SIAKRYVGRGMQFLDLIQEGNLGLIKAVEKFDYRKGYKFSTYATWWIRQAITRAIADQAR TIRIPVHMVETINRLVRVSRQLLQELGREPSPEEVASRVDMPVERVREIMKVSQEPVSLE TPIGEEEDSHLGDFIQDDQVAVPADAATFTMLHEQLMEVLDTLTEREQKVLRLRFGLDDG RPRTLEEVGREFNVTRERIRQIEAKALRKLRHPSRSKKLKDYLDD >gi|229783991|gb|GG667744.1| GENE 3 1602 - 1781 193 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGENVKRTAEKLSSSKKVDEAREAELRAAEEQAAFEEQLNQLLLLAEKEKECSGKPGSA >gi|229783991|gb|GG667744.1| GENE 4 1821 - 3584 2071 587 aa, chain - ## HITS:1 COG:CAC1299 KEGG:ns NR:ns ## COG: CAC1299 COG0358 # Protein_GI_number: 15894581 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Clostridium acetobutylicum # 5 497 8 496 596 347 39.0 5e-95 MRYPEEVVEEVRMKNDIVDVISGYVKLQKKGSNYFGLCPFHNEKSPSFSVSPSKQMYYCF GCGAGGNVITFVMEYENYSFMEALQMLADRAGVALPKQEYSKEAKEAADLRTALLEINRM 
AANYYYYQLTNPQGEAGYRYLRDRQLSDDTIRHFGLGFANKTSDDLYRYLRAKGYDDRIL KETGLVTIEERGAHDKFWNRVMFPIMDVNNRVIGFGGRVMGAGEPKYLNSPETKLFDKSR NLYGLNYARLSREKYILICEGYMDVIAMHQAGFTNAVASLGTAFTTQHAALLKRYTDKVV LTYDSDGAGTKAALRAIPILRDVGISIRVLNMQPYKDPDEFIKNMGAEAFRERIEQARNS FLFEIDVLKRNYEMEDPEQKTEFYNQVAKKLCEFPEALERDNYLEAVSREFFINYEDLKR LVNRMGARLGPVSPREEEESTSVKKKKDREDGKNQSQRLLLTWLIENPFLFDKIEGIITP DDFIEDLYHQVARMVFDGHAAGNLNPAEILNHFINDEEQYRVVAGLFNASLKESLDNEEQ KKAFSETIMKVKKNSLDYASRNAAGIEELQRIIREQAALKDLHISLD >gi|229783991|gb|GG667744.1| GENE 5 3617 - 4627 1061 336 aa, chain - ## HITS:1 COG:RSc2968 KEGG:ns NR:ns ## COG: RSc2968 COG0232 # Protein_GI_number: 17547687 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Ralstonia solanacearum # 15 328 18 382 387 231 39.0 1e-60 MNIRESTESWEAETLSPYASLSRNSKGRDREEEPCDIRPVYQRDRDRIIHCKAFRRLKHK TQVFLAPEGDHYRTRLTHTLEVAQIARTIARALRMNEDLTEAIALGHDLGHTPFGHSGEA VLGKICSEGFAHYRQSVRVVEVLEKNGRGLNLTWEVRDGILNHRTSGRPSTLEGKIVRLS DKIAYINHDIDDAIRARMFVEEDLPACFTDVLGHSVRERLNNLIHDIILGSIGKPEIIMS PHMEEAMQGLRTWMFENVYRSDVPKAEEGKAQHLIVMLFNYYMEHTDKLPAEYHYLMDVR HENRERVVCDYIAGMSDSYAIDKFEELFVPKAWKDI >gi|229783991|gb|GG667744.1| GENE 6 4905 - 6296 707 463 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 5 452 4 447 458 276 36 7e-74 MEKHYDLIVIGAGPGGYMAALYASKAGKRTAVIERSHVGGTCLNCGCIPTKTYLHTTGLF REVKKTEAAGLHTEGMHVSMAELKKRKDAVVEELRAGIEKQLAAAKIDLYRGNGVVCGDH LVKVMPEGEELTAEYILLAAGSEPSSLPIPGMDLPGVKNSTGILEEETVPDRLIIIGGGV IGMEFATVYNDLGCQVTVLEAMDKILPGMDKEISQNLKMIMKKRGVEIHTGAMVSSIEKM GDGTYLCRYTEKEKACECSAPLILSAVGRRPCDSGLFSGEMQERIKMERGRILVDERGET SVPGIYAAGDVIGGVCLAHVASAEGYRAVSAMFPELKGKDMAVIPSCVYTDPEIACAGLT ADEAKAAGIEAVTGKYIMSVNGKSVLSMQDRGFLKIVAEKDTGRILGAGLMCARATDMIG ELELAVSKGLTAEDLAAVIMPHPTFCEGIGEAAESLCMEKKNK >gi|229783991|gb|GG667744.1| GENE 7 6325 - 7326 754 333 aa, chain - ## HITS:1 COG:VC1388 KEGG:ns NR:ns ## COG: VC1388 COG0095 # Protein_GI_number: 15641400 # Func_class: H 
Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Vibrio cholerae # 13 301 12 307 338 228 40.0 9e-60 MIRLLSLAHGPGTYPYPNFGMEEYLLYHVEEEECILYLWQNEKTVVIGRNQNAWKECRRE ELEAAGGHLVRRLSGGGAVFHDLGNLNFTFITRKENYDVTKQTEVILRAVKKLGIHAERT GRNDITAEGRKFSGNAYYETGDFCCHHGTILLSVDKEEMAHYLNVSREKLKSKSVDSVRS RVTNLCEFVPDLTVRRMGQCLEESFGEVYGLPVRPFPPERMDESEAAERTARFASWDWKY GRKIPFSDESAARFSWGEVQVLVLVEQGRVGEAKIWSDALDTDFIKAAEEALSGMVWSRN AAAARFSALSCVTLLQETMRRDLQDVICNMIER >gi|229783991|gb|GG667744.1| GENE 8 7323 - 8777 1437 484 aa, chain - ## HITS:1 COG:lin1387 KEGG:ns NR:ns ## COG: lin1387 COG1003 # Protein_GI_number: 16800455 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Listeria innocua # 1 472 7 484 488 525 59.0 1e-148 MKLIFERSVHGRHCQILPSSDVPYVMPEHGRELPLRLPEVSENDISRHYTELAKATHGVN DGCYPLGSCTMKYNPKINEEIAGLPGFTGIHPLQPESTVQGALEVLKTAEEIFSEITGMD SMTFQPAAGAHGEFTGLLLIRAWHESRGDEKRTKIIVPDSAHGTNPASAAMAGYKVINIP STADGLVDLEKLREAVGEDTAGLMLTNPNTVGLFDPNILEITKIVHDAGGLCYYDGANLN AVMGITRPGDMGFDVVHLNLHKTFSTPHGGGGPGSGPVGCKKFLEPFLPGRKVVERDGVL CTAPYGQSIGMVKEFCGNFLVVVKALSYVLTLGPDGIADACRNAVLNANYVRVKLSDDYH MPFKQTCMHEFVMSLEQEKEEMEVTAMDIAKALLDHGIHPPTMYFPLIVHEALMVEPTET ESKENLDELISVFHEIYETAKRDPEYIRNSPHVTLVKRLDEVNAARNPKLHYAFGEEASE REEA >gi|229783991|gb|GG667744.1| GENE 9 8774 - 9592 736 272 aa, chain - ## HITS:1 COG:lin1386 KEGG:ns NR:ns ## COG: lin1386 COG0403 # Protein_GI_number: 16800454 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain # Organism: Listeria innocua # 1 260 173 438 448 253 46.0 2e-67 METYCHGRGAFVELVPSMDGSVDFEALKGMLDETVACFYVQQPNYYGILEDVTAISELVH EAGGKVIAGCNPISLAVLKTPAECGADIAVGEGQPLGLPMAFGGPYLGFMATTRDMMRKL PGRIVGETVDSDGKRGFVLTLQAREQHIRREKAGSNICSNQALCALTASVYLSAMGPQGL REAALQCTSKAHYLREKLKEAGLKPVYDKPFFHEFVTECPVDTEQLSARLEEEGFLGGLP LPGGRILWCATEMNTKEEMDRLACLVKEVCCA >gi|229783991|gb|GG667744.1| GENE 10 
10536 - 10982 543 148 aa, chain - ## HITS:1 COG:BS_yqhJ KEGG:ns NR:ns ## COG: BS_yqhJ COG0403 # Protein_GI_number: 16079512 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain # Organism: Bacillus subtilis # 4 145 5 148 448 153 52.0 9e-38 MGSYLSSTDDQQKAMLDEIGYKDYDDLFGCVPDSVKLKRRLSVPEGRSELELYRHMASMA GRNRVFDHVFRGAGAYRHYIPAIVKEVVSKEEFVTAYTPYQAEISQGNLQAIFEYQTMIC ELTGMDAANASVYDGASAAAESVFMCRE >gi|229783991|gb|GG667744.1| GENE 11 11012 - 11386 562 124 aa, chain - ## HITS:1 COG:PAB0559 KEGG:ns NR:ns ## COG: PAB0559 COG0509 # Protein_GI_number: 14521036 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Pyrococcus abyssi # 5 123 21 142 144 118 54.0 2e-27 MTYPEHLKYAKSHEWVEFLEDGSARIGMTDYAQDQMGDLVFVNLPEPEDEVTAGEAFGDV ESVKAVSDVYSPVTGVVAEINEDLLDSPESINSDPYGAWMIRISDIEDKEELMSAQEYEE FVKE >gi|229783991|gb|GG667744.1| GENE 12 11423 - 11875 452 150 aa, chain - ## HITS:1 COG:BH2816 KEGG:ns NR:ns ## COG: BH2816 COG0404 # Protein_GI_number: 15615379 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Bacillus halodurans # 1 148 214 363 365 147 51.0 1e-35 MEAGKEEGLIPCGLGARDTLRLEAAMPLYGHEMDDTITPIETGLSFAVKMKKEDFIGKAA IEAKGEPARARVGLKVTGRGIIREHETVYADGGEIGVTTSGTHVPYLKCPVAMALVKKEY AVPGTKVEADVRGRRIEAEVVPLPFYKRVK >gi|229783991|gb|GG667744.1| GENE 13 12823 - 13437 712 204 aa, chain - ## HITS:1 COG:BH2816 KEGG:ns NR:ns ## COG: BH2816 COG0404 # Protein_GI_number: 15615379 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Bacillus halodurans # 4 199 6 200 365 207 48.0 7e-54 MECKTTLYDTHVKYGGKMVPFAGYLLPVQYGTGVIAEHMAVRTACGLFDVSHMGEILCEG EGALDNLNHLLTNDFTGMSDGQARYSPMCNEEGGVVDDLIVYKIRDNHYFIVVNAANKDK DFAWMTAHSLPGAELKDISAEIGQLALQGPKAKEILLKLTEETNLPVKYYTCLADRPVAG IRCIISKTGYTGEDGYELYMAVAS 
>gi|229783991|gb|GG667744.1| GENE 14 13850 - 15382 1556 510 aa, chain - ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 8 508 25 518 1118 342 37.0 9e-94 MEEMINEYAILTVPESEIDYISSLPQIEYVEKPKRLFFAINQAKAASCINIAQQGDRSLS GRGVLTAIIDSGIDYYHDDFRKENGDTRIVMLWDQTLNRVFTEEEINEALAAGSRAAARA LVPSVDTSGHGTAVAGIAAGNGRENNGQYRGIAYESELLVVKLGVPKSDGFPRTTELMRA VNFVVREAVRRRQPVAVNLSFGNTYGSHDGTSLLETFLNDISNYGKTVFVVGTGNEGAGD GHSAGQLVMNQPQEVELTVGAYQTSFSVQLWKSYTDIFDISLITPSGETIGPVSSRLGPQ TIPFVNQTILLYYGKPGPYSVAQEIYLDFIPKENYLEEGVWRIRLTPREIVEGNFDFWLP SASVLSRSTRFLRPSPETTLTIPSTAAGVISVGAYDDAYQSFADFSGRGFTRMTRQVKPD LVAPGVGIITARSGGGYETVTGTSFATPFVTGSAALLMQWGIVDGRDPFLYGEKIKAYLR RGARHLPGYSQWPNPELGFGTLCLSDSIPI >gi|229783991|gb|GG667744.1| GENE 15 16339 - 18186 1855 615 aa, chain - ## HITS:1 COG:FN0806 KEGG:ns NR:ns ## COG: FN0806 COG2385 # Protein_GI_number: 19704141 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Fusobacterium nucleatum # 419 560 190 333 333 117 43.0 5e-26 MNKRKFIAFAAGIPLLVLILIIIILVSEPKPGGISRAAAYKSAALLLTDAESCEQLLKEQ GQNYFTEKDRNHWYAKYLNYLYVNGYVSPDTTPSDPEYVQGYLTYREAEELAEALSPGQG EPARVGKKKQGKPFPSDNWWFIYDSLRKELDHDGSIKERNILLYGTPMNIRSAPAWTAYT SEGKFRFEGISLDSYIDWELKVLVKDGEMIAVREPVSDSVTYKNVWLTHGEGDTFSVRLG TVDRSFPMEASLGQPEEFADNLADLSLKNGKLQKVTLKKKRITGKVLAVKDDSIEIEGYG KIKLDKDFKVYKLYGQFEEQSVSDILVGYDIQEFVVAHGKLCAALTMREFDAKTIRVMIM NTNFQSVFHPSVTLSAESGLNLASGEESVQIPAKAEVIIDLSDERLKEGRIVVTPVEAGD TITVNSIRRSLGTPTYSGSIEIRKENEGITLINELYLEDYLTRVVPSEMPDSYEMEALKA QAVCARTYAYRQIQSNAYSQYGAHVDDSTRFQVYNNLKTSDKTEQAVRETYGKLLFYQDV PIEAFYFSTSCGHTTDGSIWGSDPAKYPYLDGCLLEGGRSVLNLSTNAAFEAFIKDKEYP SYDSSFPMYRWETTV >gi|229783991|gb|GG667744.1| GENE 16 18196 - 19428 1383 410 aa, chain - ## HITS:1 COG:CAC0730 KEGG:ns NR:ns ## COG: CAC0730 COG0628 # 
Protein_GI_number: 15894017 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Clostridium acetobutylicum # 19 390 14 382 383 126 26.0 7e-29 MKETKFRNYICWGVTALAVIALSIAFAFFLTRFQVMKAGLKAFIGILMPVIYGAVLAYLL LPIYNRTRSLLEGWIAKGIKKEKTIRGLSKAGATAVSLIVLFVIVAGLFWMVIPQIYTSI MTLQEGLGENINNLALWLQKLLEDNPSLEQKVIPMYDEVTNQLENWLTTSLVPNVSTVIS GLSSGLLSVVLALKNILIGVIVMVYLLNIKETLSAQGKKIIYSVLPLRMANQFIEELRFV HRVFGGFITGKLLDSLIIGIICFVCLNWMKMPYVLLVSVIVGVTNVIPFFGPFIGAVPSA FLILLVSPMKCLYFLIFIVLLQQFDGNILGPKILGQSTGLPSFWVLFSILLFGGLFGFVG MIIAVPTFAVGYSMLTGLVNRALRKKELSLNTNDYMDLKHIDEEKKTYMR >gi|229783991|gb|GG667744.1| GENE 17 20759 - 21025 363 88 aa, chain - ## HITS:1 COG:no KEGG:Closa_2197 NR:ns ## KEGG: Closa_2197 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 84 1 84 87 90 58.0 2e-17 MKRVKVGFWGIDSILEFVKITGSSRHKMELVCGSTVVDAKSLLSVLSVKAAQAVELVIHE ESCDRLLERLDPYIEHGSLLDRAHHLSA >gi|229783991|gb|GG667744.1| GENE 18 21051 - 21851 724 266 aa, chain - ## HITS:1 COG:no KEGG:Closa_4186 NR:ns ## KEGG: Closa_4186 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 20 256 16 244 247 171 36.0 3e-41 MIELGKEKRQIIRTLCGQADNVLVHGAVQGYMGRVWVPELTNPSYCLIHLGDFAYLFGIC PKGEQAMELRAQLYRECGQDYITPADERWAEWLEESFPGEYRVLSRYSMRRDRNHFSEEA LESYCRNLPAGIRLKRIDKRLYQTALKEEWSRDFCSNFETPEEYEQNGLGFVAMDGRKIV SGCSAYGISEGMFEIQVETRREYQRKGLALACSARFILTCLEQGIYPSWDAISLQSVGLA EKLGYIFDREYKVYQLHDMESGLLMA >gi|229783991|gb|GG667744.1| GENE 19 21936 - 22676 866 246 aa, chain - ## HITS:1 COG:CAC0418 KEGG:ns NR:ns ## COG: CAC0418 COG0546 # Protein_GI_number: 15893709 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Clostridium acetobutylicum # 37 243 4 209 216 202 52.0 6e-52 MELIQNVDGNKNAYGDGGLPSVSVAFLSMEERMEKEYLLFDLDGTLTDPKEGITKSVRHA LKAYDIEVEDLDSLCCFIGPPLKDSFIEYYGFSEENASNAIGVYREYFSDRGIFENEVYE GIEEVLKALKASGKKLFVASSKPEVFVRKIMEYFKLDPYFTFMGGADLGETRVKKADVIR 
YVLEENGITDLEKVIMIGDRKHDILGAKEVGVDSVGVLYGYGDREELEAAGADFLAETVF DLQNLL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:31:24 2011 Seq name: gi|229783990|gb|GG667745.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld138, whole genome shotgun sequence Length of sequence - 13400 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 6, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 464 5 ## EUBREC_1182 hypothetical protein - Prom 526 - 585 10.3 + Prom 523 - 582 6.8 2 2 Op 1 . + CDS 625 - 1509 823 ## COG0583 Transcriptional regulator + Term 1709 - 1740 3.2 + Prom 1610 - 1669 7.4 3 2 Op 2 . + CDS 1755 - 2696 306 ## COG4823 Abortive infection bacteriophage resistance protein + Term 2723 - 2774 14.4 - Term 2714 - 2759 11.3 4 3 Op 1 . - CDS 2760 - 4037 1165 ## Varpa_5825 hypothetical protein 5 3 Op 2 . - CDS 4062 - 4181 69 ## gi|266623993|ref|ZP_06116928.1| conserved hypothetical protein - Prom 4321 - 4380 8.9 6 4 Op 1 . - CDS 5282 - 5737 339 ## gi|266623994|ref|ZP_06116929.1| hypothetical protein CLOSTHATH_05318 7 4 Op 2 1/0.000 - CDS 5750 - 6547 196 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 8 4 Op 3 . - CDS 6568 - 7956 1408 ## COG2211 Na+/melibiose symporter and related transporters 9 4 Op 4 . - CDS 7979 - 8860 920 ## Dalk_0511 hypothetical protein - Prom 8883 - 8942 6.3 10 4 Op 5 . - CDS 8958 - 10328 1242 ## COG0534 Na+-driven multidrug efflux pump - Prom 10362 - 10421 6.0 - Term 10481 - 10528 1.0 11 5 Tu 1 . - CDS 10553 - 11968 1499 ## COG2252 Permeases - Term 12082 - 12125 9.1 12 6 Op 1 . - CDS 12337 - 12741 238 ## CDR20291_0527 hypothetical protein 13 6 Op 2 . 
- CDS 12827 - 13399 432 ## Closa_0666 hypothetical protein Predicted protein(s) >gi|229783990|gb|GG667745.1| GENE 1 2 - 464 5 154 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1182 NR:ns ## KEGG: EUBREC_1182 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 154 29 179 513 193 62.0 2e-48 MDIILYLLQVIQDLYQQNCWLILFICKYIPLKQWAFDDSHSPKYQKFRIDELPKIISFQQ DWNWLDLISYYQQRYHKTIRPVFRRVECDIPVDCTCPACSAPVDYLSWNNGRQKNQIRCK VCQTLFSPTQDSRFSKNSRLHCPHCSHALVHKKD >gi|229783990|gb|GG667745.1| GENE 2 625 - 1509 823 294 aa, chain + ## HITS:1 COG:CAC1481 KEGG:ns NR:ns ## COG: CAC1481 COG0583 # Protein_GI_number: 15894760 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 286 1 289 292 101 26.0 1e-21 MEISVLEEFVSLVETCSFQETAAIMNLSQSSLTKHIHKLEEELNLSLFDRSTRNVKLNEY SHAFYPYAKQIVQLQHEAIATLGDLTDKDKDHFSIAYAPVLGQYGLVDTIADFSRKYPEH NLNIIESYQPMSLLKSKKCDFAFVSENDVQDESFNKMIYKTDHLAAVFPCGHPLADKESV TLEQLYGEKFILHSNSVGSTHEETQKFLNLCDSKNFIPNVIAHSEFTSTMVRYVSNGRGV AILNRLHIPAEIPNISIVDIYPTIRSYIYLLYQRKVRSASAADFLHYMIEQISQ >gi|229783990|gb|GG667745.1| GENE 3 1755 - 2696 306 313 aa, chain + ## HITS:1 COG:PM1540 KEGG:ns NR:ns ## COG: PM1540 COG4823 # Protein_GI_number: 15603405 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Pasteurella multocida # 3 228 6 238 309 85 26.0 1e-16 MPKPFMTYEQQIQKLRDKNLTITDVEEAKSILRHDGYYALITGYKDLFKNPSTKNYRDGT TLQDILALYQFDEQLRELTLRHLLHIERHIRSTLSYAFCDQFGEQQSAYLNPANYDNTKK KSNAINRLITKYLSPLLTKQTSYPFLEHNKKVHGNVPLWVMVNALSFGTLSKMYELSVSK IQYAVSRNFPAVNEKQLGQVLDVLTDYRNLCAHNERMFSHRCAKKDIPDFPLHKKLSIPV QGNYYLCGKRDYFAVVLAFRYLLPNSEFLRYKSQLSKLIEEAVTSNQQISRDVLLNIMGF PANWKKVTSYRKI >gi|229783990|gb|GG667745.1| GENE 4 2760 - 4037 1165 425 aa, chain - ## HITS:1 COG:no KEGG:Varpa_5825 NR:ns ## KEGG: Varpa_5825 # Name: not_defined # Def: hypothetical protein # Organism: V.paradoxus_EPS # Pathway: not_defined # 3 228 18 234 373 84 29.0 1e-14 
MGEIKTTFVYLAKRVPGVLYKPKAGSPKQGISVLVMHSDEDYLTFPTGAELAERGYTVLC ANPANKEGIIFSQVEKMQCVKAAVLYLRSLPDVEKVVLMGHSGGGTLMTAYQNIAENGAE VFQGQEKIVPYPDNSELPPADGLMLLDSNWGNAAMQLFSLDPAVESENSGRLINEELNLF NPQNGFQPDGAHYDREFIDQFQRAQSARNSCILDYALNRLLLLQNGQGNYCDDEPLIIPG AAQGFFNNKLYAQDIRLMSHTREPHLLLHGDGTATKEIVYSVRRPENPRSYTDSFWEGAR FLSVKTYLTSYAVRTEKEFGYDEDHVWGIEWDSTYNCTPGNVVGIRVPLLVMGMTGGWEY LASETIYQMATSKDKTIAFVEGATHRFTPAKDCEAYEGQFGDTMKTLHDFADAWLSAPGR FWNGE >gi|229783990|gb|GG667745.1| GENE 5 4062 - 4181 69 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623993|ref|ZP_06116928.1| ## NR: gi|266623993|ref|ZP_06116928.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 39 62 100 100 78 97.0 2e-13 MDETYLKNVLTHNTIERAHLYQVLPDLYEAYKDLPMDAD >gi|229783990|gb|GG667745.1| GENE 6 5282 - 5737 339 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623994|ref|ZP_06116929.1| ## NR: gi|266623994|ref|ZP_06116929.1| hypothetical protein CLOSTHATH_05318 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05318 [Clostridium hathewayi DSM 13479] # 1 148 1 148 148 292 100.0 6e-78 MSDYRLPELSPAEKRMPAARFITGYPLCPPNPLLQQILDAGPMEVKDAIPAENWLDLLQI HGYRDIVYGYTMMPDGSGFYIEYSVSPVTWQGKWRRWYGTWYNRYSKSMVPGEGNLRYKI WNPLDHWDHKFVNGENDRDGVWSVETLDLAS >gi|229783990|gb|GG667745.1| GENE 7 5750 - 6547 196 265 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 98 260 75 238 242 80 31 8e-15 MLTLKNRTMIITGGAGNNGLAIVRMALEYGMNVAFMSSFHGKAQGAVAKMDPRYRDQVIG FAQNPQARLAENIACDPELYKEDTTQEDVLRWIYERFGSIDVVVNGSGGHDRHNMEETDK AIWRHSMEVLEAAFFNTKLALPYLLKSRAPRVINMTTCEKNGGWYPNPSFAASRSGMIGL TQEMAKELGPKGITVNCVLTGHIEQDVPEEDILPEELKAELLARTPLGRLGVPEDIAGAV NFLASEEASFITGAVIDVNGGAIMG >gi|229783990|gb|GG667745.1| GENE 8 6568 - 7956 1408 462 aa, chain - ## HITS:1 COG:BS_yjmB KEGG:ns NR:ns ## COG: BS_yjmB COG2211 # Protein_GI_number: 16078296 # Func_class: G Carbohydrate transport and metabolism 
# Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 11 444 18 443 459 194 31.0 4e-49 MGKQGAQQTGFGMRDKIGYLFGDFGNDFTFVLSSAFLLKFYTDVMGVSGYIVGIMMMIAR LLDAVTDITMGQLVDRSPVRRDGKFRPWIRRMSGPVAFSSLLMYASWFQNMGMAFKVVWM FVTYLLWCSICYTGIVIPYGSLASAITSDPVERTQLSNFRSVGGTLAMTVSSVVLPLAVY YTDEAGHRALSGAKMTTAALFCSVGALICYILCYSLTTERVKLGQTTEKFDAGKLLRSLA HNRCLPGIVLVVLIREVANTGLQAMASYIYPNYFANTVAQSQSGVVGTVITLIMAAFIVK LAVRFGKKEITAAGCFLASGILAVAYFVHTENVYVWIFFYAVATVGLAVFGLIGWAMVSD IIDDTEVRTGERGDGTIYGIYSFARKAGQACSSGVTGILLSLIGYTPDTAFDAAVTDRIY DITCLAPMIGFFAMALAVIFLYPLDKKTVLGNAVKLEAERGK >gi|229783990|gb|GG667745.1| GENE 9 7979 - 8860 920 293 aa, chain - ## HITS:1 COG:no KEGG:Dalk_0511 NR:ns ## KEGG: Dalk_0511 # Name: not_defined # Def: hypothetical protein # Organism: D.alkenivorans # Pathway: not_defined # 22 143 6 114 239 65 36.0 3e-09 MFIEVGKEVFETEDESVINIRQKLFPEEEKMPLAKYFYNYPLHAPTPVEMQIINQLNPMN PEDAILPENFMDLLKPYGYDKIELGYCMFPDGSGYVATYRVRPPHISGEMERWYRNWRNL KSKSMVPGHGNLRYKIWNYADHFDHYYVNWQDGSDGIHTTESLDLGGGDRMYDTIRHQFD LKDFGLTDEKMKELKDAGCQLTGKGSYETFDEPGTHLCLSYSRPCPLGGIETRSREWIGW RPVNGKLVRDPSTKCSEEYLKKVVIHTLVEWEHLYTFLPDLYAEYHDQPADAD >gi|229783990|gb|GG667745.1| GENE 10 8958 - 10328 1242 456 aa, chain - ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 9 453 4 450 457 185 27.0 2e-46 MAADAKTQKKQMMLTENIRTLIPRMAVPTIMAQLITTIYNLVDTYFVSTLGTNATAAVGV NSSLERMITLIGSLIGAGACSYIARLLGAKDDEHADKVLSTSFFSGLGLGIIFMFIGKMV IGDLVFWLGATKECAQYSIQYANYVLYAAPFMIGSFILNMCLRSEGSATFSMIGIGFGGI LNCFLDPLFIYHFGLGVAGASMATALSKFISFCILLYPYIRKTSVVSISVKRIKFLMNDV KEVLAIGSSSFFRSALSVVAAVSINRVAGSYSTAALAALSVANRVMDFPFAIILGFGQGY QPVVGFNWGAKAWKRVRESYVFSCAVAVLGAVGIGAVIFVFANPIVHVFNSQADSEVLRL GMLCIRLQCCVLVFHSLGSLVNMFYAGIGHAKYALIMNLARQGYCFFPVLFIAPVFLGIN GVAATQAIADLLSVIVIVPLGLKALKLIREQEEKQP >gi|229783990|gb|GG667745.1| GENE 11 10553 - 11968 1499 471 aa, chain - ## 
HITS:1 COG:RSp1678 KEGG:ns NR:ns ## COG: RSp1678 COG2252 # Protein_GI_number: 17549897 # Func_class: R General function prediction only # Function: Permeases # Organism: Ralstonia solanacearum # 10 471 6 433 434 307 41.0 3e-83 MERKTGVADRFFGISQSHSSVKTEILAGITTFITIAYILILNPQILSDPYVIMGDPAMAG KIANGVFIGTCIGAFIGTALCALYARVPFAQAPGMGLNAFFAYTVVLGMGYTYGQALVIV FISGIFFIVITAIGLREAIIRSIPDAVKTAITPGIGLFITIIGLKNAGIIISNPATLVSL VDFSKWRSGADMVLINGALVALIGLIIMGILHARRVKGSILLGIVAATLIGIPLGITQLS NLDMNLGVKFRDFAEVSFLKMDFAGLFAGTNFVETLFTVTMLVISFSLVNMFDSIGTLLG AARQSGMIDKNGEVIRMKQALMSDAVSTLAGAMVGTSTVTTVVESSAGIAEGGRTGLTSL VTAVMFLGAILFAPIVSIVPAAATAPALIFVGILMLGNIRDVDFSDMSNALPAFCTIVFM PFTYSIANGVAFGLITYCLMNIMTGKRREVRALTFMISVVFVVRYAFMTLG >gi|229783990|gb|GG667745.1| GENE 12 12337 - 12741 238 134 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_0527 NR:ns ## KEGG: CDR20291_0527 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 134 1 135 137 139 53.0 4e-32 MKMPEVISSRMVAPCGMNCMVCFTHCSSKKPCGGCMGTDESKPGHCRTCLRKNCAAERGI SYCYECCDFPCRRIRDLDRSYRKRYGASLIGQSLYMKENGIDRFLEEEQKRYTCAVCGGI ISLHDRTCSECGQR >gi|229783990|gb|GG667745.1| GENE 13 12827 - 13399 432 190 aa, chain - ## HITS:1 COG:no KEGG:Closa_0666 NR:ns ## KEGG: Closa_0666 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 188 103 290 292 157 40.0 1e-37 LNFHSFGNRCYEDTIIRGMPEFFVRYDARFRPQDHLLTLDYPILRPVGKRSGIDAVYFYL SCVLLEQRFLGRLPEPYGKAVLEHFHGDYEELILNVASVILRNLVVHMMMGKKLSENAVT ADDMERFCICVKNCDRQKLEEAISHQLEQLTGGPEGDRALYSYLSCDRKDFAAELKNAAE CGYMDRMIVY Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:32:05 2011 Seq name: gi|229783989|gb|GG667746.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld139, whole genome shotgun sequence Length of sequence - 20246 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 9, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 3 - 185 92 ## gi|266625228|ref|ZP_06118163.1| RNA polymerase sigma factor SigW 2 1 Op 2 . + CDS 182 - 835 228 ## gi|266624003|ref|ZP_06116938.1| conserved hypothetical protein + Prom 990 - 1049 3.2 3 2 Op 1 2/0.000 + CDS 1069 - 1434 520 ## COG1733 Predicted transcriptional regulators + Prom 1436 - 1495 2.8 4 2 Op 2 . + CDS 1516 - 1908 203 ## COG1881 Phospholipid-binding protein 5 3 Tu 1 . + CDS 3170 - 3364 245 ## gi|266624006|ref|ZP_06116941.1| high-affinity nickel transport protein Nic1 + Prom 3377 - 3436 6.5 6 4 Op 1 . + CDS 3456 - 3545 116 ## 7 4 Op 2 . + CDS 3545 - 3694 134 ## gi|266624007|ref|ZP_06116942.1| conserved hypothetical protein 8 5 Op 1 20/0.000 + CDS 4635 - 6437 1036 ## COG2060 K+-transporting ATPase, A chain 9 5 Op 2 18/0.000 + CDS 6452 - 8518 1600 ## COG2216 High-affinity K+ transport system, ATPase chain B 10 5 Op 3 15/0.000 + CDS 8544 - 9146 363 ## COG2156 K+-transporting ATPase, c chain + Term 9152 - 9194 9.1 + Prom 9301 - 9360 6.9 11 6 Op 1 . + CDS 9396 - 9920 481 ## COG2205 Osmosensitive K+ channel histidine kinase 12 6 Op 2 16/0.000 + CDS 9939 - 12092 1055 ## COG2205 Osmosensitive K+ channel histidine kinase 13 6 Op 3 2/0.000 + CDS 12085 - 12783 567 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 12794 - 12845 1.1 + Prom 12921 - 12980 5.3 14 6 Op 4 2/0.000 + CDS 13167 - 13475 223 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 14377 - 14436 11.3 15 7 Op 1 40/0.000 + CDS 14584 - 14745 86 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 14964 - 15023 2.7 16 7 Op 2 4/0.000 + CDS 15066 - 15971 449 ## COG0642 Signal transduction histidine kinase 17 7 Op 3 36/0.000 + CDS 16034 - 16471 335 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component 18 7 Op 4 . 
+ CDS 16521 - 17381 275 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Prom 18326 - 18385 80.4 19 8 Op 1 . + CDS 18487 - 19098 316 ## COG3973 Superfamily I DNA and RNA helicases 20 8 Op 2 . + CDS 19103 - 19348 167 ## Amet_1004 superfamily I DNA helicase + Term 19376 - 19440 11.4 + Prom 19363 - 19422 8.5 21 9 Tu 1 . + CDS 19467 - 20244 585 ## COG0534 Na+-driven multidrug efflux pump Predicted protein(s) >gi|229783989|gb|GG667746.1| GENE 1 3 - 185 92 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625228|ref|ZP_06118163.1| ## NR: gi|266625228|ref|ZP_06118163.1| RNA polymerase sigma factor SigW [Clostridium hathewayi DSM 13479] RNA polymerase sigma factor SigW [Clostridium hathewayi DSM 13479] # 1 60 93 152 152 96 100.0 6e-19 ELLHEISRLRGKSKEVLILFAVEGYSIKEISAMLKISESAIKKRLQRGREELSRQLGVRQ >gi|229783989|gb|GG667746.1| GENE 2 182 - 835 228 217 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624003|ref|ZP_06116938.1| ## NR: gi|266624003|ref|ZP_06116938.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 217 1 217 217 432 100.0 1e-120 MRKLSDLPEEMVLDDQVKQRILRNGRTIAQKRRKQARLRNGILVGAVVIASIVFPLSNDS HISNERGLLPLTVYAKDVSGKEVLDIGSGRVFSLTRTETPLGEGYTLQMVAKEGYHYELN IEDMGTGLETIFIRDNDVYWIPDYWKDSSGKLYDTDGSELMISNALSSPILNYCVYNQEN RLCTELSIQLHEEETGSAELIRLVCYPEENGIELQNR >gi|229783989|gb|GG667746.1| GENE 3 1069 - 1434 520 121 aa, chain + ## HITS:1 COG:CAC3399 KEGG:ns NR:ns ## COG: CAC3399 COG1733 # Protein_GI_number: 15896640 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 116 1 116 116 138 62.0 3e-33 MKMREEFTCPLELVHDMIKGKWKPIIIWRLRLGPTSLARLEKDIDGITQKMLLEQLKELI DYGFVEKKSFEGYPLHVEYSLTDGQGRELLQALKIMQHIGIEYLKAQGKEALLIEKGVVP E >gi|229783989|gb|GG667746.1| GENE 4 1516 - 1908 203 130 aa, chain + ## HITS:1 COG:CAC3398 KEGG:ns NR:ns ## COG: CAC3398 COG1881 # Protein_GI_number: 15896639 # Func_class: R General 
function prediction only # Function: Phospholipid-binding protein # Organism: Clostridium acetobutylicum # 1 129 1 127 165 164 60.0 5e-41 MIVTSTGIIGGKIQDQYGGRGTQFNENGVPTYSLPFKIEEAPEKTVSFAVILEDKDSYPV TGGFVWVHWLAANITRNELKNNESQTASDFIQGCNSWTSIQGGQQSREWSCFYGGMTPPD QAHTYELHVC >gi|229783989|gb|GG667746.1| GENE 5 3170 - 3364 245 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624006|ref|ZP_06116941.1| ## NR: gi|266624006|ref|ZP_06116941.1| high-affinity nickel transport protein Nic1 [Clostridium hathewayi DSM 13479] high-affinity nickel transport protein Nic1 [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 96 100.0 7e-19 MTYMIQTIKTWEITFDQKAEHFIFKHPFLSFVGIVVGVPIFILAAVGVSTTIVILPIAWM FGWL >gi|229783989|gb|GG667746.1| GENE 6 3456 - 3545 116 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQILIGIIGMCMAALLIYYIYILMGGDEK >gi|229783989|gb|GG667746.1| GENE 7 3545 - 3694 134 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624007|ref|ZP_06116942.1| ## NR: gi|266624007|ref|ZP_06116942.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 84 100.0 2e-15 MGIYFLWRLGMVIVILASFALFVFIYNKVTKHIHPMQKYDVQVQKRRDS >gi|229783989|gb|GG667746.1| GENE 8 4635 - 6437 1036 600 aa, chain + ## HITS:1 COG:lin2830 KEGG:ns NR:ns ## COG: lin2830 COG2060 # Protein_GI_number: 16801890 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Listeria innocua # 1 597 6 558 561 538 54.0 1e-152 MQYVLYLAVLVVLAIPLGTYMKKVMYGEKTIFSSIFTPCERIVCKVLHVQESEDMTWKQY LSSVFLFSGIGFVFLFLLQLLQGALPGNPQGLAGTSWHLAFNTAASFVTNTNWQSYSGES TLSYLTQAVGLTVQNFVSAATGIAVLFALIRGFTRVKAKGIGSFWVDMTRIVLHILIPLN LVLALLLVAGGVVQNLKPAEEVMLLEPVAVSASSEILEDADIDLETKKVTVDGKLVPDAR IITEQFVPMGPAASQIAIKQTGTNGGGFFGVNSAHPFENPNALTNLLEMLSLLLIPMALC FTFGKAVKNKKQGTAIFLAMFLMLIASLSLIAVNEQSAATMLSPNNNVNMGMIDQAGGNM EGKETRFGITASSTWAAFTTAASNGSVNSMHDSYTPLGGLAQMLQMQIGEVIFGGVGCGL 
YGMLAFAILTVFIAGLMVGRTPEFLGKKIEPREMKWAVLACLATPIAILAGSGTAAMVPE VMDSLNNGGAHGFSEMLYAYTSAGGNNGSAFAGFNANTVFLNVSIGLVMLFARFLPIVGT LAIAGSLAEKKKIAVTAGTLSTTNGMFIFLLIFVVLLVGALSFFPALALGPLAEFFSGIA >gi|229783989|gb|GG667746.1| GENE 9 6452 - 8518 1600 688 aa, chain + ## HITS:1 COG:lin2829 KEGG:ns NR:ns ## COG: lin2829 COG2216 # Protein_GI_number: 16801889 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Listeria innocua # 17 687 10 680 681 899 72.0 0 MTNATSVFANKKMLGRAIKDSFIKLDPKTQAQNPVMLLVYLSAVLTSVLWVVSLFGIMDA PSGYTLTIAVILWFTCLFANFAEAIAEGRGKAQADSLRAAKKDVEAHRIPSISRKNDAIA ISSSLLKKGDIVIVKAGKQIPADGEVIEGAASVDESAITGESAPVIREAGGDRSAVTGGT TVLSDWIIVQVTNEAGESFLDKMIAMVEGAARKKTPNEIALQIFLIALSVIFILVTMSLY AYSAFSAKLADISNPTSVTTLVALLVCLAPTTIGALLSAIGIAGMSRLNQANVLAMSGRA IEAAGDVDILMLDKTGTITLGNRQACDFIPVDGADPRELADAAQLSSLADETPEGRSVVI LAKEKFGLRGRTLQDRGMVFSPFTAATRMSGVDYDGLEIRKGAADSVRDYVEAAGGTFSN DCDRVVKEIANQGGTPLVVAKDHKILGVIHLKDIVKQGVKEKFADLRKMGIKTIMITGDN PLTAAAIAAEAGVDDFLAEATPEGKLMMIRDFQSKGHLVAMTGDGTNDAPALAQADVAVA MNTGTQAAKEAGNMVDLDSSPTKLIEIVRIGKQLLMTRGSLTTFSIANDVAKYFAIIPAL FMGLYPGLAALNVMGLHSPQSAVLSAIIYNALIIIALIPLALRGVKYREVPAGKLLSRNL QVYGLGGLATPFIFVKLIDMLLAACGLA >gi|229783989|gb|GG667746.1| GENE 10 8544 - 9146 363 200 aa, chain + ## HITS:1 COG:CAC3680 KEGG:ns NR:ns ## COG: CAC3680 COG2156 # Protein_GI_number: 15896912 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Clostridium acetobutylicum # 11 200 8 198 205 150 41.0 2e-36 MKTMKDILLRAAGLFLIFTLLCGVFYTGAVTGFAQVLFPQKANGSMIEIGGVKYGSELLG QYYTDDAHFWGRIMNLDVTTYRDGEGNILMYSAPSNLSPTSAEYAALVAERVEKIQKSNP IMKDTAIPVDLVTCSGSGLDPHISPAAAEYQVTRVATASGKSEQEVREIIAKCTDGQFLG IFGEKIVNVLKVNLMLDGIL >gi|229783989|gb|GG667746.1| GENE 11 9396 - 9920 481 174 aa, chain + ## HITS:1 COG:Cj0679 KEGG:ns NR:ns ## COG: Cj0679 COG2205 # Protein_GI_number: 15792028 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel 
histidine kinase # Organism: Campylobacter jejuni # 4 174 2 175 606 214 58.0 9e-56 MEEKKNKPDEILRVIQEDEQSSKNGHLKIFFGYAAGVGKTYAMLKDANAAKRRGVDVVIG YVEPHARPQTAALTQNLECLSLMEDVYNGIALKEFDLDAALARRPQLILVDELAHTNAKG CRNRKRYQDVRELLRAGIDVYSTVNVQHIESLNDMVASITGVTVRERIPDEVFD >gi|229783989|gb|GG667746.1| GENE 12 9939 - 12092 1055 717 aa, chain + ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 1 717 174 888 888 697 50.0 0 MDIEPQELIERLNEGKIYREAQARQALNHFFSVENLTALREIALRRCADRVNRMAESVKV QNNSDYYTDEHILVCLSSSPTNPKIIRAAARMAQAFRGEFTALFVETPEFQNMSADDKTR LRENIRLAEQLGASVETTYGDDIALQIAEYARLSGVSKIVLGRDNAVKRRLFAKQTLTEK LTIYAPKLEVYILPDRPVPGYRPQKAGNPVPPFSVADLLRSAAILAVATGIGFVFYHFGF SEANIITVYILGVLLTAVCTSARIYCLVTSLASVLMFNFFFTSPRFTFKAYDPSYFVTFP IMFLAALITGNLAIRIKGQARKSAQAAYRTKVMFDTNQALQQVNGNEDIISVTAHQIVRL LERSVVFYAKQDGNTLAEPLFFSFSENQNKQDYCSENERAVAAWVYQNNKRAGATTDTLS SAKCLYLAVRIENAVYGVVGIVMEQGALEPFESSILLSILGECAMALKNEQTTREKEQAA VLAKNEQLRANLLRAISHDLRTPLTSISGNAGVLLANDTVIPQEKRRQLYGDIYDDSLWL INLVENLLAVTRIEDGSMNLRLKPELMDDVVAEALHHINRKSVEHHITVAQSNELALARM DGRLIVQVVINIVDNAIKYTPQGSTITVRTFTSGDWVVTEISDNGPGISDEVKLRIFDMF YTSSTKIADSRRSLGLGLALCKSIITAHGGTIEVRDNEPEGTVFCFTLPIEEVTLHE >gi|229783989|gb|GG667746.1| GENE 13 12085 - 12783 567 232 aa, chain + ## HITS:1 COG:pli0051 KEGG:ns NR:ns ## COG: pli0051 COG0745 # Protein_GI_number: 18450333 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 1 232 1 231 240 281 61.0 9e-76 MNKSLILVVEDDAAVRNLITTTLETQDYRFLTASTGEAAILEAVSHNPDIVLLDMGLPDI DGVDIIKKIRTWSLMPIIVISARSEDTDKIDALDAGADDYLTKPFSVEELLARLRATQRR LIATQGVMPAQQTFFENGRLRIDYAAGCVFLEDTELHLTPIEYKLLCLLAKNIGKVLTHT YITQEIWGSSWDNDVASLRVFMATLRKKIENTQDAPQYIQTHIGVGYRMLKV >gi|229783989|gb|GG667746.1| GENE 14 13167 - 13475 
223 102 aa, chain + ## HITS:1 COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 102 1 100 227 90 46.0 5e-19 MSQILLLEDDVSLVDGLRYSLKRNGFDITIARTIEEAIKCLQQIGSYDLLLLDVTLPDGT GFDVCNKVRKQGNQIPIIFLTASDEEVNIIRGLDSGGDDYTS >gi|229783989|gb|GG667746.1| GENE 15 14584 - 14745 86 53 aa, chain + ## HITS:1 COG:CAC0524 KEGG:ns NR:ns ## COG: CAC0524 COG0745 # Protein_GI_number: 15893814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 47 174 220 228 68 59.0 2e-12 MNELWDNTSDFVDDNTLSVYVRRLREKIETIPSQPEHLITVRGFGYQWKEVSL >gi|229783989|gb|GG667746.1| GENE 16 15066 - 15971 449 301 aa, chain + ## HITS:1 COG:CAC0451 KEGG:ns NR:ns ## COG: CAC0451 COG0642 # Protein_GI_number: 15893742 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 5 300 109 413 416 187 34.0 2e-47 MIAWVVLTGMLFTGTFLFFRKREQLYRQATQVILHFIQGDFSHHLTQANEGAIFQFFASV DQLATILQSKNDTQEKAKEFLKNTISDISHQLKTPLTALSMYHEIISDEPENVETVREYS EKTGLAINRMDILIQSMLKITRLDAGSIDFEKERCYVADLIARSIRELIIRAEREGKKIL TDGSTEEMIFCDVGWTSEAIGNIVKNALDHTEAGGEIHISWEHSPIMLRVYISDNGSGIA PEDIHHIFKRFYRSKKSLDTQGIGLGLPLAKSIIGGQDGIISVQSSLNKGTMFTLSFLTE L >gi|229783989|gb|GG667746.1| GENE 17 16034 - 16471 335 145 aa, chain + ## HITS:1 COG:CAC0453 KEGG:ns NR:ns ## COG: CAC0453 COG1136 # Protein_GI_number: 15893744 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Clostridium acetobutylicum # 1 141 1 141 227 152 51.0 1e-37 MEILKVQHLCKTYGKGEAKVEALKDVSFTLEQGEFAAVVGESGSGKSTLLNCIGALDMPT 
SGNVLMDGQDLFSMKEEQRTVFRRRNIGFVFQSFQLVADLTVEQNIMFPILLDYRKPEPG TIEEILGVLGLKDRRHHLPNKRILG >gi|229783989|gb|GG667746.1| GENE 18 16521 - 17381 275 286 aa, chain + ## HITS:1 COG:CAC0454 KEGG:ns NR:ns ## COG: CAC0454 COG0577 # Protein_GI_number: 15893745 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 5 274 127 393 832 80 25.0 4e-15 MFLSEISEGRFPETVYEVVLSEYAKSLLKVDIGDTIQIDGINEPLVISGFAENSAKQMRQ DTFSALLSMNGFRTHIPAENYTEQLVVQLSPFCNMQKVIADIADQLHLSEKQVIQNGNLL AVLGQSDNNFIIQLYGCAVILFLLVLIASVLMITSSLNSNVMQRTEFFGMLRCVGATRKQ IMKFVRREGLRWCKFAIPIGLICGIVVVWGLCAALKFISPEYFAELPVFRVSWISILCGI IIGFLTVLLAAQAPAKKAASVSPLTAVSGNAFSRQSVKTRASEAIR >gi|229783989|gb|GG667746.1| GENE 19 18487 - 19098 316 203 aa, chain + ## HITS:1 COG:CAC1026 KEGG:ns NR:ns ## COG: CAC1026 COG3973 # Protein_GI_number: 15894313 # Func_class: R General function prediction only # Function: Superfamily I DNA and RNA helicases # Organism: Clostridium acetobutylicum # 40 199 505 662 763 131 41.0 8e-31 MLKDYRRLLSDPEKLCHLCGSALIPEVAEYVAGQCQLMSKKNMVEIEDLAPLMFLQMKLK GIDDTIPARFVAIDEAQDFSELQFAVLKQVLRTDKFSILGDLSQGIHGYRGIDTWDTICK DVFHGICEYRVLEQSYRTTIEIMDFANQVLSLIQDESLVRAKPVVRHGDRPQCITVSSEK QLFEAIAEKVNRYENRNRRQENI >gi|229783989|gb|GG667746.1| GENE 20 19103 - 19348 167 81 aa, chain + ## HITS:1 COG:no KEGG:Amet_1004 NR:ns ## KEGG: Amet_1004 # Name: not_defined # Def: superfamily I DNA helicase # Organism: A.metalliredigens # Pathway: Nucleotide excision repair [PATH:amt03420]; Mismatch repair [PATH:amt03430] # 9 80 688 759 769 73 48.0 3e-12 MLPVFPEIQLLKDCDTAYGGGIMALPAYLSKGLEFDAVIVTCIEDDYACSNLDIKLLYVA ITRALHKMDIIRLPEKMKLIP >gi|229783989|gb|GG667746.1| GENE 21 19467 - 20244 585 259 aa, chain + ## HITS:1 COG:MA1121 KEGG:ns NR:ns ## COG: MA1121 COG0534 # Protein_GI_number: 20089987 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 6 257 10 265 475 147 35.0 
2e-35 MNQPNKKMELLGSAPIPKALLALGLPTMIGMLINALYNLVDTYFVGGLGTDQMGAITVAF PLGQVVVGLGLLFGNGAAAYLSRLLGRGDKDTANKVASTAVYSSVLIGAIVIIGSVIFLK PILKQLGAIESIMPYAMTYTAIYITFSLFNVFNVTMNNIVSSEGAAKTAMCALMSGALLN VILDPVFIYVLNFGVAGAAIATAISQIISTVVYLFYIFRKKSVFSFRIKDCCFSKEIMSE ILKIGIPTLLFQILTSLSI Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:32:42 2011 Seq name: gi|229783988|gb|GG667747.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld140, whole genome shotgun sequence Length of sequence - 16125 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 6, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 8 - 1645 790 ## Spico_1734 Heparinase II/III family protein 2 1 Op 2 . + CDS 1629 - 3563 929 ## Cpin_3454 hypothetical protein + Term 3590 - 3634 6.2 - Term 3575 - 3623 10.1 3 2 Op 1 4/0.000 - CDS 3628 - 5127 1592 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 4 2 Op 2 . - CDS 5209 - 6690 1514 ## COG2721 Altronate dehydratase - Prom 6773 - 6832 7.2 + Prom 6760 - 6819 7.7 5 3 Tu 1 . + CDS 6966 - 7922 1018 ## COG1609 Transcriptional regulators + Prom 8767 - 8826 80.4 6 4 Op 1 . + CDS 8921 - 9277 622 ## Closa_1095 hypothetical protein 7 4 Op 2 . + CDS 9274 - 10260 622 ## PROTEIN SUPPORTED gi|145634155|ref|ZP_01789866.1| ribosomal protein L11 methyltransferase - Term 10271 - 10312 13.4 8 5 Tu 1 . - CDS 10339 - 15297 4252 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 15328 - 15387 6.7 + Prom 15370 - 15429 8.0 9 6 Tu 1 . 
+ CDS 15517 - 16110 634 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases Predicted protein(s) >gi|229783988|gb|GG667747.1| GENE 1 8 - 1645 790 545 aa, chain + ## HITS:1 COG:no KEGG:Spico_1734 NR:ns ## KEGG: Spico_1734 # Name: not_defined # Def: Heparinase II/III family protein # Organism: S.coccoides # Pathway: not_defined # 21 359 124 440 669 94 26.0 2e-17 MEHSGVPRTADLIFKFDAGEQKTAPWRSLEAGFRMFLSWYLIEECLRGTPYMTEETRKEL WKSVKEHAEVLEFVSPRLYPEADHNHFLMEMCGLFFTACKFQNLPGADEWKSKAVTELER CLMRQYTAGGGHLEGCPHYHNECLVLLLRSLLVVKEEDSDGGFSEDFENRIREVLIYSLH TCRPTGVGVPWGDSDAADAPIKSVLYGWFVYGGTELLEMAVQLAGTETVRRVFLSCIWDC PVPEEFDELMRRFENGMNNGKMGISLAAWFKPLSQVCVRNSWNSEAVSVFFACRLPVQNA HAHMDPGGFDFTAFGRNMIADPGRFTYNEISERRLFKSAFSHTTLTVNGKEPFEYISSWN YGRQGNGMVTELSLENILHPEIYRISSEQTFEKYCHRRVLITGFSEKQPFLAMIDQVNGL GKGDQVELHFQINYTKVRLDGQTMTFWDSGAPSLIVVGDGKLKPSFRHGWISDMMDQKRP SLILDFCEEGGEATRYFATVLVPVKAGKHANLRGLSVDKSGFNMELNGQTYHFDIKREKR YECRK >gi|229783988|gb|GG667747.1| GENE 2 1629 - 3563 929 644 aa, chain + ## HITS:1 COG:no KEGG:Cpin_3454 NR:ns ## KEGG: Cpin_3454 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 9 549 43 564 632 350 38.0 2e-94 MNVENEYLELLKSWCDGLIRLQFQKEDNPRLDGGIFCPSCMMIHGRCHDAIYPMMYLADR LKEKKYLEAAKLLFQWSGNLLCDDGSYYNDAQNDWMGITVFSVIALTHALEFHASLLTAQ EKEKWEARLNRTAEWVFQTITPQFDVNINYHAACAEAMALTGMYFHREKYLKRSDEMADF CMQYISKEGLFFGEGKPRDDVSPKGCRPVDIGYNVEESLASLLSYGTIRKNKIVRDKVKH MLWVHLEFMNPDGGWDNTMGTRNYKWSYWGSRTSDGCQLAYGLLGDEEPVFSDASYRNFK LYKRCTHDGLLYGGIDYYEHGELPCVHHTFCKAKALALLLDFGAVSVTPSELPSEHLDSV KYWPVIDTYRIARGGFIATVTAYDIEYRPGGHASGGTMTMLWHRSVGMMMASSTVEYTLI EPTNGQLSLKKNSHRPMTMRLETCGKPYLSSAYDFRAVFKHAALMSKRTIVNKSEDRVSA EADGNRSRDGETVVCLETFGHLTDRTQKRTVSGQEIEYRFRYQIEESRVMISAETGVCEE TVRWVIPVVGKHEAGVERLKDSENGRETGVKSCSIITHGNDVDYGLKKIRYRIQGERGSV LLELDRKPAEINPIFFLNGGFEGWEVILSMEQGKESKFWISAEE >gi|229783988|gb|GG667747.1| GENE 3 3628 - 5127 1592 499 aa, chain - ## HITS:1 COG:CAC0695 KEGG:ns NR:ns ## 
COG: CAC0695 COG0246 # Protein_GI_number: 15893983 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Clostridium acetobutylicum # 19 495 11 482 482 451 46.0 1e-126 MEKLCYETLEKQGYDGYLLKDAPERVLQFGEGNFLRAFVDYFIDMMNEKADFNSKVVLCQ PIAPGLADMINEQEGLYTLFLRGFENGKKVNDKRVISCVSRCLNPYKDYDAVLACAKNPD LRFIACNTTEAGITYDPACSFTDVPADSYPGKLTQFLYKRFETFGKEPGKGFVILSCELI DNNGKELEKCVLQYAEKWKLGEEFTSWIKQENIFCSTLVDRIVTGYPRNEAASICEELGY QDNIIDTGEVFGFWVIEGPESLKKELPFEKAGLPVLITDDHKPYKQRKVRILNGAHTSFV LGAYLAGQDIVRDCMEDEVICGFMNKTIYDEIIPTLTLPREELMSFAASVTERFKNPFID HALLAISLNSTSKWKARVMPSLKGYIANTGRLPECITASFAFYIAFYRGTELTEEGLTAA RPAGNEYTVKDDRPILQFYYDHRNDDVKTLVHAVCVNEEFWGEDLSAIAGFEDAVAGYVA AIEEKGAYEVMKSCLDKEA >gi|229783988|gb|GG667747.1| GENE 4 5209 - 6690 1514 493 aa, chain - ## HITS:1 COG:BH0490 KEGG:ns NR:ns ## COG: BH0490 COG2721 # Protein_GI_number: 15613053 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Bacillus halodurans # 2 493 3 497 497 561 55.0 1e-160 MQDFIKIHSGDNVAVALKPLASGTTVTTGDQTITLISDIPQGHKFALTDIAEGASVIKYG CAIGLAKEPILKGSWIHTHNLKTGLGDLLTYSYKKQDTTVAPAEERFFQGYRRPNGKVGV RNEIWIIPTVGCVNNVAQAIERQAKSLISGTVEDVAAFPHPYGCSQMGDDQEHTRTILAD LINHPNAGGVLVLGLGCENSNITELKRYIGEYDERRVKFLVAQESDDEIADSLELIRELN DYAGSFEREPVSCSELIIGMKCGGSDGLSGITANPAVGAFSDMLVGAGGTTILTEVPEMF GAETLLMNRCIDEKTFEKTVHLINDFKNYFTSHNQTIYENPSPGNKKGGISTLEDKSLGC VQKSGSAPVMDVLMYGEPVHTKGLNLLSAPGNDLVASTALAASGAHIVLFTTGRGTPFAC PVPTMKISTNTALSTKKSNWIDFNCGRLAEDTDMDTLKNDFFDFVLEVASGKKVKSEEAG FHDMAIFKQGVTL >gi|229783988|gb|GG667747.1| GENE 5 6966 - 7922 1018 318 aa, chain + ## HITS:1 COG:BH2219 KEGG:ns NR:ns ## COG: BH2219 COG1609 # Protein_GI_number: 15614782 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 3 310 5 306 335 152 30.0 7e-37 MNIYDIAELAGVSIATVSRVVNDSPRVSEKTKQRVRAIMEENGYTPNVFARGLGLDSMKT IGIICPDVSDSYMASAAFHLEGRLHEYGYDCILCCSGFEQANKEKYVSLLLNKRIDALIL 
VGSTYAGEGGDDRKADYVRNAAKTVPVFMINGYIKGENVYCAFSDDYGATYDVTSQMIEA GRKRILFLCDSHSYSANQKLAGYEAALKDHGLPVLGDLKLYVKNRIHSVRDILLERRDLK FDSVLATDDGLAVGAIKYANAKLLKIPGELCITGFNNSALSICCDPELSSIDNKVEELCA ITTDQMMAVLRGEEKQAS >gi|229783988|gb|GG667747.1| GENE 6 8921 - 9277 622 118 aa, chain + ## HITS:1 COG:no KEGG:Closa_1095 NR:ns ## KEGG: Closa_1095 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 116 1 115 118 157 81.0 1e-37 MIEEKQDRIVLCGANSYEQKYYFNEQFSNLPENIRQELQIMCVLYTEDVGGILTLEFEED GTLEFKVTAEESDYLFDEIGSVLKIKQYQMEKRELLEALEMYYRVFYLGEEIDEEEES >gi|229783988|gb|GG667747.1| GENE 7 9274 - 10260 622 328 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145634155|ref|ZP_01789866.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae PittAA] # 2 325 27 349 353 244 40 4e-64 MKLKIGNVTLENNLILAPMAGVTDLPFRVLCKEQGCGMVVTEMVSAKAILYKNRNTKALL EVEERERPVSVQLFGSDPEIMADIAKLIEPGPFDIIDVNMGCPVPKIVNNGEGSAMMKNP RLAEAVLGAMVKAVKKPVTVKFRKGFTDSDVNAVEFAKMAESCGVAAVAVHGRTREQYYS GTADWDIIRRVKEAVKIPVIGNGDVFAPEDAKALLDETGCDGIMIARGAKGNPWIFSRTI HYLETGELLPPPSMEELQNMVLRHAEMSVEYKGEYLGIREMRKHMAWYTAGLPHSAKLRN DINAVETLADLKTLVKERFSDIICESGS >gi|229783988|gb|GG667747.1| GENE 8 10339 - 15297 4252 1652 aa, chain - ## HITS:1 COG:CAC1079_2 KEGG:ns NR:ns ## COG: CAC1079_2 COG5263 # Protein_GI_number: 15894364 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Clostridium acetobutylicum # 440 1196 144 818 2566 84 23.0 2e-15 MKKTKLIRGNAKRVIAVASAFMMTCSTVAGNPVNHAAAAYAKEADTAKEEENSSWKENAA SDTEKSDTEGTMSAEDADESRADDESGKGDDGIASEESKETDDGINSDESKETDDGIASD ESKETDDGITSDESKEADDGIVSDESEETNNSEEADESQAADGSGATDESEENGETGAAE ESEGTDESNASDEPETPKGLVEANRLAAVMAVDPDSWYVELDSCICRDDRYYVVDETTQE ETMLKKQYIEVEKDGESFIIYVDGNGVMVKSKWLETEDGHFRFVNNKGKVIRNKEGKEAG KFYGNFDDMGYWTPVPNTFFESQLDDGTSVIKYSGEDGGVSGKKKDDIQKYYCFQTTDDG ILCSLIPAEGISGAEAVTDTWVVDLWIDADGYLTVDKENQLIGDKYYNFTGEGHSSRSVN TFIRTEDGTILCYLDEAGEKVREQFESDSEHTYYFDENGDLAVRQWIEKSGNFRYVDGKG 
RMIVNATKLVAGFYGHFDGEGYWTAIEDEFFEDKLDDGTPVTKYSGKEGEVAGIGKKPDR RYYCFVTGTDNRISCRLLSADETSMSETVKNVWIDDMWVNEDSILTIDVREQQIDGKYYN FDAAGHGSLITSAFIHDGDGNNLYYVDDAGEKVTGKAEYPIEDNYYRFSEDGRVTLITNS FIHDTNGNNLYYVDENGDKLTGLREHPLGEKYYNSDADGRIELIVNSFIHDENGNILCYV NADGIKVTDTKEQLAGDKYYNFDEDGNGTLIINAFIHDTNGNNLYYVDENGDKLTGLREY QIGDVYYNVSTDGLLELIVNAFIHDADQNILCYVGADGMKVTGVREQLVDGRYYSLDKDG KVTLITSAFIHNAAGENLYYVDEAGDKVTGKDEYLIEGNYYSFDAEGLCRMITNSFIHDT AGNLLYYVDDTGSVITGKQDYPIETKYYNIADDGKVTLIKNAMIHDQNGKILCYVNESGE KAANRFVTVDGRTIYFGADGAQVFRQWIESDGKFRYVDGKGYMIVNAVKAAGGFYGRYDG EGYWTAISNTFFEDKLDNGEPVTKYAGKEGAIAGIGEKPNTRHYCFETGQDNTIHCRLLS ADGGSVTDDPVANVWIGDLWVNEDGNLSMNVSAQQIDGRYYSFDENGHGTLITNAVIQSP DGEGLCYVNESGEKAVNRFVTFEGHTLYFGSDGAQVFRQWIEINGKFRYVNNKGYMLVNT KKAVAGFYGSFDGEGYWTAIENTFFEGDVNGELVTKYSGNEGKIAGIGDKENRQEYCFLK ETDGKLKCCLSAADGDYLTEEAVKNTWINDMWVNESGYLVINADAPVHGVYYHFDANGHG TLITYKVSYILDGGINSGNNPLHYTGAMDTVILKMPSRPGYSFDGWYTDSSYTDRITEIT SKSHGNLTLYAKWRRIYEDNSSDNNEESATSTTTNNNNNPTVVTPPVTVTVGEDKKVTLQ TGGTTVAASATQTAEGSVSGNTIVSTTETASVTVAEGQVADVATVAVSADGSARALLASP ALGATVQQQTVTVQGVPVTQNVVVYADGTQVTQKSGAEELEGFAEAVTSTEKAIQSGTQT LAAAYNDKVSIDLQQYKQVGAAVTYAVTSGINGAVPQVQMEQTSFAPGQEVAALITDANG NVTAVTIIVGANGIIQYQIPGVNCIVRFMCKI >gi|229783988|gb|GG667747.1| GENE 9 15517 - 16110 634 197 aa, chain + ## HITS:1 COG:CAC3336 KEGG:ns NR:ns ## COG: CAC3336 COG0664 # Protein_GI_number: 15896579 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 56 196 56 198 199 81 37.0 1e-15 MTEAEKRYTIEKVFKTSGCAPKPETIDQMVKLAREVHFPAGAAIQEIGVEQKYLYLITEG IARSYYIDSEGNDITKMFIREEEFAVGESLFLPESLEVFEALENLKSLRFEAKKMKEIIY SDKAALVSYAGMLEQTVIYKMRREYSLQNMQAMERYLQFREDYRDLLERVPQNVIASYLG IKKESLSRLRRNLRDNG Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:33:07 2011 Seq name: gi|229783987|gb|GG667748.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld141, whole genome shotgun sequence 
Length of sequence - 19209 bp
Number of predicted genes - 16, with homology - 16
Number of transcription units - 6, operones - 2 average op.length - 6.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 . - CDS 3 - 759 822 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily
2 1 Op 2 . - CDS 756 - 3062 2542 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases)
3 1 Op 3 2/0.000 - CDS 3124 - 6123 2641 ## COG3250 Beta-galactosidase/beta-glucuronidase
4 1 Op 4 38/0.000 - CDS 6136 - 6957 950 ## COG0395 ABC-type sugar transport system, permease component
5 1 Op 5 35/0.000 - CDS 6960 - 7880 880 ## COG1175 ABC-type sugar transport systems, permease components
6 1 Op 6 . - CDS 7940 - 9181 1388 ## COG1653 ABC-type sugar transport system, periplasmic component
- Prom 9236 - 9295 7.1
- Term 9463 - 9515 13.7
7 2 Tu 1 . - CDS 9600 - 10553 874 ## COG3439 Uncharacterized conserved protein
8 3 Tu 1 . - CDS 11475 - 12050 607 ## COG1680 Beta-lactamase class C and other penicillin binding proteins
- Prom 12180 - 12239 6.6
- Term 12231 - 12283 11.9
9 4 Op 1 12/0.000 - CDS 12381 - 13169 995 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB
10 4 Op 2 3/0.000 - CDS 13185 - 13760 789 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA
11 4 Op 3 13/0.000 - CDS 13773 - 14564 903 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE
12 4 Op 4 12/0.000 - CDS 14576 - 15196 853 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG
13 4 Op 5 12/0.000 - CDS 15198 - 16121 1049 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD
14 4 Op 6 . - CDS 16133 - 16480 230 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC
- Prom 16542 - 16601 80.4
15 5 Tu 1 . - CDS 17445 - 18086 735 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC
- Prom 18155 - 18214 8.8
16 6 Tu 1 .
- CDS 18513 - 19208 889 ## COG3142 Uncharacterized protein involved in copper resistance Predicted protein(s) >gi|229783987|gb|GG667748.1| GENE 1 3 - 759 822 252 aa, chain - ## HITS:1 COG:AF0788 KEGG:ns NR:ns ## COG: AF0788 COG0697 # Protein_GI_number: 11498394 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Archaeoglobus fulgidus # 1 252 38 279 308 111 34.0 1e-24 MKKYLAIFGLILVTVIWGGGFVASDMALDSLTPFQIMTFRFFLGTVILGGISLPGMKQIK KEEIRAGFFMGCALFAGFALQIVGLQYTTPSKNAFLTATNVVMVPFIAFLVYKKTVGFNG ILGAVMAIVGVGILSLSGNFSLGAGDALTLAGAVGFALQIFLTGEFAPKYRVRVLNTVQM ATAFLLSAVFLAVQGQLTVHATSTGWLSVLYLGVVSTAFTYLLQTNCQKYVDETKAAIIL SMESVFGTLFSV >gi|229783987|gb|GG667748.1| GENE 2 756 - 3062 2542 768 aa, chain - ## HITS:1 COG:all4989 KEGG:ns NR:ns ## COG: all4989 COG1554 # Protein_GI_number: 17232481 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Nostoc sp. 
PCC 7120 # 1 743 17 744 759 437 34.0 1e-122 MSDDSWMILQDHYDPEENLKYESLFCLSNGYLGTRGAYEERTAKSIPCTYINGVFDRSET FMRELANLPDWLGIKLYVEKELIGIETCRVLEYSRALDMKHSLLAKRFVLEDKKGRRTEV EGIRMVSRSNVHRMGIRLFVTPLNYSGIIEVENIIDGSVINFCDAPRFKVKHTRLLENRT LTDTGAYVEVTTRDRKLCVGTGSFMEAYSEGRAVMKNRSFGAFGETAVEFQDFDAEEGKT VEVVKYASVYTEREVPDYELHCRVRDEIADFLADGFETEMNRHMEVYARMWDDANIEIQG DFELDRAVRFTIFHLMSTGSEHDDRVNVGAKLLSGEEYGGHAFWDTELFMLPFFACVFPD TARNLENYRYHLLDAARANAAKNGYRGAQYPWESADDGTEQCPDWTIEPDGTCYRCYVAV YEHHVTAAVAYGIYNYVKITGDRNFLREKGAEILIETARFWASRCEYNRELDRYEIRRVT GPDEWHEPVDNNLYTNYLARWNLGYVLELLKEWEKDEAEVCDELKSRLKIGDKELEYWSL VREKMYLPKGTENGLLEQFEGYFNLKDVTIEACDENDWPIRPEILKTVPKDQTQIIKQAD VVMLLHLLGEEFDEETKRKNYAYYEKRTLHGSSLSPSIYSVMGLHVGDESKAYRYLKRAA FIDLTDLQKNTREGIHAANAGGVWQTVVFGFAGVSLEEDGVIDIRPHMPKEWKGLSFKLH HRGSRLEIAVSPDNRVTVTVLEGGPVTVRMNGKVVILEGQKTVAGGLS >gi|229783987|gb|GG667748.1| GENE 3 3124 - 6123 2641 999 aa, chain - ## HITS:1 COG:ebgA KEGG:ns NR:ns ## COG: ebgA COG3250 # Protein_GI_number: 16130971 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 8 997 15 1025 1042 680 37.0 0 MERRKKKRWEDERMTGIGRRRARTSFYRDSARHVSLNGDWKFLYLEAPELSPAGFFEPGS GEGWDTIDVPSVWQMRGYDHMHYTDVLYLFPVNPPYVPSKNPTGIYKRSFFLDDAWLQED TILKFHGVDSACDIWVNGVHAGYSKVSRLPAEFDITSLVKQGENDVTVRVYKWSDGTYLE DQDMWWFSGIYRDVELINEPKAAVLDCRVTAGLDAACGCGHAVIDLEVKGDGKGTWSLLD GERLMAEGEFASCKDQREIRITAEAGTVQPWTAETPYLYQVKITFEGHEITVRTGFRTIE IRDGNFTVNGTAILLNGVNHHDYSPTEGRVVTYDQMKADVVLMKQHNINALRCSHYPAAD CLYDLCDEYGLYVIDEADLECHGFEWVENYTWISDDPAWETAYVDCAVRMVKRDFNHPSI IMWSLGNESAFGCNFVKSAEAVRSLDPSRLIHYEGDFEAEAADIYSTMYTWLEKLEEIGR GEKGNHKPHVMCEYGHSMGNGAGCLKAYQDVFRKYKRLQGGFIWEWYDHGFFTKDEEGHE YYRYGGSYGDFPNNGNFCIDGLLMPDRTPSPALAEYRQVIAPVEILKKTGSGREIALKNW YDFKDLSHLCVKWWISCDETVQEEGMIEELKAGPGEVCCLEIPYQPFIPERNTDYYLNIS VCLKECAAYAPKGHEISRGQFLLKEHLETAVLHETGRRLAAEDDGIFYTVKGESSEVVFH KIYGRLERFTVDGKTYVTAGPEVTVYRATIDNDMYKKDDWLNKYFIQLPDEQTEYFRTEE 
RGDRIDVMIGTYFGCLNQSWGFLCDYHYTVYGDGSVTCRLEGKQIQRGKAEPGFLPRIGI QMKAAKELQNAAWYGLGFQENYADSREAAWMGVYRSDVDGMSVNYVSPQENGHREEVLWY SLGDGEHSLLVTAERPLGLNIHNYTEESLEEAKYPWQIKKADDVIIHMDYLHSGLGSNSC GQEQEEAYKVRRQDFALSFTMRVIGAGEDVAQAKIKYLD >gi|229783987|gb|GG667748.1| GENE 4 6136 - 6957 950 273 aa, chain - ## HITS:1 COG:SMb20969 KEGG:ns NR:ns ## COG: SMb20969 COG0395 # Protein_GI_number: 16264842 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 8 271 3 266 270 285 53.0 5e-77 MSMSKKSKIIIYVLLTLMALYFLSPFIYMLFTTFKTEAEAIAYPPRLFPKKWLVDNYLDA WKAQPFGRYLWNSILVTVGTTAGQILSCSLVAYGFARYQFKGKNVLFMILLSTMMIPWDV TMIPQYMEFKIFGWINTLKPLIIPAWFGSAYYIFLMRQFLMGVPKDFEEAARIDGANAFQ IYWRIFMPIMKPQLILVGVLNMITVWNDYLGPLIFLQDRSKFTLALGLASFKGVHSTQII PMLCITVIMILPPIIVFLFAQKHIVEGTSGAIK >gi|229783987|gb|GG667748.1| GENE 5 6960 - 7880 880 306 aa, chain - ## HITS:1 COG:SMb20970 KEGG:ns NR:ns ## COG: SMb20970 COG1175 # Protein_GI_number: 16264843 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 10 302 25 317 319 295 53.0 9e-80 MEKKKFPGKVTPYLFILPWIIGFIVFTAGPLILSLVMSFFDWPITSTPTFRGLGNYVEMF TKDKQFYKSLVITLKYAAIFVPLNMIIALFLAMLITQPVKGVKIYRTIFYIPAVVSGVAI SIIWGWILNGNYGVLNYLLSLLGIEGPNWLVDPSWALLAVILASAFGVGTMMLIFYTDIK SIPIDLYEAASLDGASPARQFFSITLPIITPTILFNLITSLISAFQQLTLVMLLTNGGPL KSTYFYGMFTYNNAFKHHKLGYASANAWVMFVIILFLTALVFKSSSAWVFYETEAKNSKK TRKGRK >gi|229783987|gb|GG667748.1| GENE 6 7940 - 9181 1388 413 aa, chain - ## HITS:1 COG:SMb20971 KEGG:ns NR:ns ## COG: SMb20971 COG1653 # Protein_GI_number: 16264844 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 4 390 5 385 410 262 37.0 9e-70 MKKKIAVIAALSLVAVTVLSGCGSKADSQANGGKVKIRFASWDNAEDLDKQQALVDQFNA SHDDIEVALEAYGSEYDTKISAGMGSGDTPDVLYMWDYPSYYEGLEPLDSYIEKEGADYK 
NNFYDALWPYNSKGDSVYGIPVGFTTHALFYNKDIFAQAGVAEPTNDWTWDDLMAAAKTI TEKVDGVKGFSFQMKPDPYDYEMYLWSNGTAFVDQDGNLDGNLNSEKAIEAVSMFQNMEK DGYAIATEKNGTDEFRSGQTAMYIYGAWSIASFDEDGLNYGIVDIPAFAGAGHDSVSILS SSGVSISKDSKHKDAAWEFVKYWTGEEMNKARIGYELPALKSVVESEKILEDPANAPFYS MLEQSSGYTPASFIVDNWSELKDTLDLTFERVYNPSTMEDPAVVLNEAVSEMQ >gi|229783987|gb|GG667748.1| GENE 7 9600 - 10553 874 317 aa, chain - ## HITS:1 COG:BMEI0634_2 KEGG:ns NR:ns ## COG: BMEI0634_2 COG3439 # Protein_GI_number: 17986917 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Brucella melitensis # 194 279 2 87 126 80 43.0 3e-15 MKKNQFELLGLKHTSFSDGLPAFRQEDLSLSEYVHQIFKTDGRYIDPAETAVSYRSDGSV YPQLHSSALRGFGDVWASAQDISFWDIGLAGGVLIKKPENRAVLYAPWTLPDGRTVWGSA GWQFYHHRGLMDIKGSVPGFSSFLSRFTHPEELVCVTLLANKEGVDFTNLGRKIAGAFGD LLSTNYDDNRLFLMEGQFSADETAERLEKQLKALDIPVFAKFDHAKNAAEAGLELRPTTV LVFGAPKVGTGLMQADQSIALELPLKIAVWEDEAGSTWLAFPKMKQVAGEYGLENHPVVG NMQKLLEKLVKQAANLY >gi|229783987|gb|GG667748.1| GENE 8 11475 - 12050 607 191 aa, chain - ## HITS:1 COG:alr0153 KEGG:ns NR:ns ## COG: alr0153 COG1680 # Protein_GI_number: 17227649 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Nostoc sp. 
PCC 7120 # 24 191 9 179 537 78 32.0 6e-15 MNTREHTERKMRGSADVSYLGETVDEMVWEFMKEQEIPGLTLAIVQAPYIPRVVGYGFSD AGQKRLASVNTLWPAGPVSQGYAAVAIMQLYEAGSVELDVPASDYVEGLPEHWKNITVRQ LIHHASGLADYRNQEGFSLLREWTFEELKALMADLPLRFEPGTEVEQSAANFLLLTEVVE KVSGMTYQEFS >gi|229783987|gb|GG667748.1| GENE 9 12381 - 13169 995 262 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 1 261 1 261 264 183 44.0 2e-46 MITGIIIAAAVVGILGILIGVFLGVASEKFKVETDEKEILVRNELPGNNCGGCGYAGCDA LAKAIAAGEAAVDACPVGGAAAAGRIGAIMGVDGGNAVKKVAFVKCKGTCDKTKVQYNYY GMDDCRKITVVPGAGEKACVYGCMGYGSCVKACAFDAIHVVDGVAVVDKEKCVACGKCVA SCPNHLIELVPYTAEHLVQCSSHDKGKDVKAKCESGCIACTLCTKQCEFDAIHMDNNVAV IDYEKCTNCGKCAAKCPVKVIV >gi|229783987|gb|GG667748.1| GENE 10 13185 - 13760 789 191 aa, chain - ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium nucleatum # 19 190 21 192 194 185 60.0 4e-47 MKELFLIAVGSALVNNVVLSQFLGLCPFLGVSKNVKTSAGMGAAVIFVITIASFVTSVIY KGILVSLNLEFLQTIVFILVIAALVQFVEMFLKKSMPPLYEALGVYLPLITTNCAVLGVA LTNVQKSYSILESVINGVGTSVGFTIAIVMLAGVREKIEHNDVPYSFQGSPIVLITAGLM AIAFFGFSGLI >gi|229783987|gb|GG667748.1| GENE 11 13773 - 14564 903 263 aa, chain - ## HITS:1 COG:TM0247 KEGG:ns NR:ns ## COG: TM0247 COG4660 # Protein_GI_number: 15643019 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Thermotoga maritima # 8 188 7 185 200 201 55.0 1e-51 MNKCTERLYNGIVKENPTFVLMLGMCPTLAVTTSAVNGLGMGLSTTVVLMFSNMLIAALR KIIPDRVRIPAYIVIVATLVTIVQLLLQAYVPSLNAALGIYIPLIVVNCIILGRAESYAS KNPVMASTFDGIGMGLGFTVALTCIGLVREILGAGAFFGHVIIPEDFHISIFILAPGAFF VLAVLTALQNKFKAPSATNGSVPQSRLACGGDCMNCSGSSCMSNHEILETRKRQAEEDAL 
ASKKADLAKKEAELKAATSKKEE >gi|229783987|gb|GG667748.1| GENE 12 14576 - 15196 853 206 aa, chain - ## HITS:1 COG:MA0661 KEGG:ns NR:ns ## COG: MA0661 COG4659 # Protein_GI_number: 20089548 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Methanosarcina acetivorans str.C2A # 28 201 29 184 188 67 29.0 1e-11 MKKGGFMKDAWILFAITLISGLLLGAVYQITKVPIQMAEAKESLHKYQMAYPDATDFLFD QAIQDQVVLSKETLKEQKPEYGNVEVTVALKAVDASGNVIGHIITASSDDSYGGTVKVSV GITNEGKITGVELLEISDTPGLGMKATEPAFKDQYKDKTVEEFTVTKTGSTSDSEINAIS GATITSNAVTHAVDAALYFAANCIPQ >gi|229783987|gb|GG667748.1| GENE 13 15198 - 16121 1049 307 aa, chain - ## HITS:1 COG:TM0245 KEGG:ns NR:ns ## COG: TM0245 COG4658 # Protein_GI_number: 15643017 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Thermotoga maritima # 10 307 8 318 318 202 44.0 7e-52 MSKLLNVSSSPHVRDQVTTSNIMLDVAIAMIPATAYGVYQFGIKAALVILATVLSCVLSE YVYETLMGKPVTISDGSALVTGMILALNMPPGIPLWIPVLGGIFAIIVVKQLYGGLGQNF MNPALAARCFLLIAFAGKMTTFTYNDVVVSPTPLAALKAGQAVDIPAMFIGKIPGTIGEV SVIALLIGAAYLLIKKVITIRIPAAYLITFAVFVFIFGQQDFTYVLAHLCGGGIIFGAFF MATDYVTSPITPKGQILFGILLGVLTGLFRLFGGSAEGVSYAIIISNILVPLIEKITLPK AFGKEGK >gi|229783987|gb|GG667748.1| GENE 14 16133 - 16480 230 115 aa, chain - ## HITS:1 COG:FN1596 KEGG:ns NR:ns ## COG: FN1596 COG4656 # Protein_GI_number: 19704917 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Fusobacterium nucleatum # 1 113 329 441 441 95 39.0 2e-20 MISGGPMMGQALFNLEIPVTKTSSALTAMTKDEVAVHAPSACIRCGRCVSVCPSHVVPQM MMDAAERSDIERFTALNGMECCECGCCTYVCPAKRPLTQAFKEMRKVVAASRKKA >gi|229783987|gb|GG667748.1| GENE 15 17445 - 18086 735 213 aa, chain - ## HITS:1 COG:FN1596 KEGG:ns NR:ns ## COG: FN1596 COG4656 # Protein_GI_number: 19704917 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: 
Fusobacterium nucleatum # 1 212 7 216 441 173 45.0 2e-43 MGLATFKGGIHPYEGKELSESKPVSVLQPKGEMVFPLSQHIGAPAKPLVAKGDHVLVGQK IGEPGGFISACVISSVSGTVKALEPRMVANGAMVPSIIVENDGKYETVEGFGRDRDPKTL SKEEIRNIVKEAGIVGLGGAGFPTHVKLTPKDESAIDTIIVNGAECEPYLTSDYRMMLEE PESIVKGLNVILSLFDNAKGVIGIENNKPEAIS >gi|229783987|gb|GG667748.1| GENE 16 18513 - 19208 889 231 aa, chain - ## HITS:1 COG:VC0730 KEGG:ns NR:ns ## COG: VC0730 COG3142 # Protein_GI_number: 15640749 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Vibrio cholerae # 1 229 22 250 254 135 37.0 9e-32 GAERIELNSALALGGLTPSMAELLLVKRETPLKVIAMARPRGAGFCYGVSDFKQMFFDCQ MMMGHGADGIAFGCLKADGTIDLSQTELMVKAIKDKGGEAVFHRAFDCVKDPFQAAEQLI SLGVDRILTSGLKSKAPDGWRLLKELQERFGERIEILAGSGIHAGNAGELAEKTGVRQLH SSCKDWIADPTTVGEEVSYSYADSDHASCFEVVARDKVAELLAAVRKLDIY Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:33:14 2011 Seq name: gi|229783986|gb|GG667749.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld142, whole genome shotgun sequence Length of sequence - 20405 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 6, operones - 6 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 8/0.000 - CDS 1 - 981 421 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 2 1 Op 2 . - CDS 1007 - 1357 433 ## COG2739 Uncharacterized protein conserved in bacteria - Prom 1377 - 1436 3.6 3 2 Op 1 . - CDS 1467 - 2063 597 ## Sterm_3839 cyclase family protein 4 2 Op 2 3/0.000 - CDS 2053 - 2754 740 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 5 2 Op 3 . 
- CDS 2805 - 3173 219 ## PROTEIN SUPPORTED gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 6 3 Op 1 12/0.000 - CDS 4441 - 4890 264 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 7 3 Op 2 2/0.000 - CDS 4859 - 6376 1178 ## COG0642 Signal transduction histidine kinase 8 3 Op 3 17/0.000 - CDS 6391 - 7506 1129 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 9 3 Op 4 24/0.000 - CDS 7557 - 8327 316 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 8352 - 8411 80.4 10 4 Op 1 . - CDS 9255 - 9986 719 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 11 4 Op 2 . - CDS 10000 - 10821 619 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Prom 10919 - 10978 2.5 - Term 10940 - 10983 7.8 12 5 Op 1 . - CDS 11012 - 12307 1430 ## COG0172 Seryl-tRNA synthetase - Prom 12345 - 12404 2.2 13 5 Op 2 . - CDS 12407 - 13249 784 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase 14 5 Op 3 . - CDS 13292 - 13879 683 ## Closa_2116 hypothetical protein - Prom 13984 - 14043 80.4 15 6 Op 1 6/0.000 - CDS 14887 - 15234 357 ## COG1564 Thiamine pyrophosphokinase 16 6 Op 2 10/0.000 - CDS 15231 - 15890 815 ## COG0036 Pentose-5-phosphate-3-epimerase 17 6 Op 3 7/0.000 - CDS 15893 - 16768 1020 ## COG1162 Predicted GTPases 18 6 Op 4 17/0.000 - CDS 16778 - 18961 2400 ## COG0515 Serine/threonine protein kinase 19 6 Op 5 5/0.000 - CDS 18958 - 19701 941 ## COG0631 Serine/threonine protein phosphatase 20 6 Op 6 . 
- CDS 19698 - 20405 717 ## COG0820 Predicted Fe-S-cluster redox enzyme Predicted protein(s) >gi|229783986|gb|GG667749.1| GENE 1 1 - 981 421 327 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 33 314 53 333 336 166 30 9e-41 MAFESLSDKLQNVFKNLRGKGRLTEADVKAALKEVKMALLEADVSFRVVKQFISAVQERA VGQDVMNSLTPGQMVIKIVNEELVNLMGSETTEIHLKPANEITIIMMAGLQGAGKTTTTA KIAGKLKAKGRKPLLAACDVYRPAAIKQLQINGEKQGVPVFSMGENHKPADIAKAAVEHA SKNDQNVVILDTAGRLHIDEDMMNELVEIKEAVEVHQTILVVDAMTGQDAVNVAGMFNDK IGIDGVILTKLDGDTRGGAALSIRSVTGKPILYIGMGEKLSDLEQFYPDRMASRILGMGD ILSLIEKAETQVDEEKAKELSQKLRKA >gi|229783986|gb|GG667749.1| GENE 2 1007 - 1357 433 116 aa, chain - ## HITS:1 COG:BS_ylxM KEGG:ns NR:ns ## COG: BS_ylxM COG2739 # Protein_GI_number: 16078660 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 88 3 90 110 64 45.0 6e-11 MEKIARQGLLYDFYGALLTEHQRQVYEDVVLYDLSLSEIAEEQGISRQGVHDLVKRCDRI LEDYEEKLHLMEKFGNTKRLAREIQSLAGEFSRTGDRNLLRRIEEISGEIIDLEAH >gi|229783986|gb|GG667749.1| GENE 3 1467 - 2063 597 198 aa, chain - ## HITS:1 COG:no KEGG:Sterm_3839 NR:ns ## KEGG: Sterm_3839 # Name: not_defined # Def: cyclase family protein # Organism: S.termitidis # Pathway: not_defined # 1 186 1 182 183 242 60.0 7e-63 MMIDLTVEVTPKAAADAQGNEKKALTGHLGTHFDVMDREFPLEYVIRKGLVFDVSQVSGR DIESGDIELERVEPGMFVAFSTGFLEQTGYGTPVYFKEHPQLSDELIDRLLDRGVSIIGI DCAGIRRGAEHTPKDQYCADRNTFVVENLCGLSGILEGEMAAEATIYTFPVRFKGMTGLP CRVVGERSAGSSAGCHRQ >gi|229783986|gb|GG667749.1| GENE 4 2053 - 2754 740 233 aa, chain - ## HITS:1 COG:mlr0493 KEGG:ns NR:ns ## COG: mlr0493 COG0596 # Protein_GI_number: 13470715 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Mesorhizobium loti # 7 199 32 245 268 107 33.0 3e-23 MELHVNGVKLYYRVSGHGPAVILVHGNGEDHSVFDETAALLSCDYTVYALDSRGHGKSTA 
VETLGYQNMADDVASFIKKLDIRKPAFCGFSDGAIIGMLVAEQNPGLLSKLILCGGNAYP QGIKEKWFQLFRLISHFDRDPKIRMMLVEPQITEEELKRIGEPVLILAGENDMIKEEHTR YLASKIPNSRLVIVPGENHGSYVVHSRKLYHLIKKFLAEPVKRQEQGGNAYDD >gi|229783986|gb|GG667749.1| GENE 5 2805 - 3173 219 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 [Streptococcus pneumoniae SP3-BS71] # 6 121 7 126 126 89 38 2e-17 MISFNHFNFNILNLEKSLAFYKEALGLTPVREKEASDGSFKLVYLGDGKSDFTLELTYLA DRKEPYDLGECEFHLAFTTDEFDALYEKHKEMGVVCFENPGMGIYFISDPDGYWIEIVPE RG >gi|229783986|gb|GG667749.1| GENE 6 4441 - 4890 264 149 aa, chain - ## HITS:1 COG:BS_yxjL KEGG:ns NR:ns ## COG: BS_yxjL COG2197 # Protein_GI_number: 16080942 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Bacillus subtilis # 12 140 1 128 218 85 37.0 3e-17 MAQKIWRRGGEITERKIRILVAEDFDLLREDLCELLNSQKDMETAGSADGKEEIVKLAQM MEFDLILMDIEMEYLTSGIEAAEEIRKEKPEAQIIFLTAHDTNQMIYTAMGAGAVDYIVK GCPDEELLFHIRSAVSGKPVMEARIQIAS >gi|229783986|gb|GG667749.1| GENE 7 4859 - 6376 1178 505 aa, chain - ## HITS:1 COG:PA4725_2 KEGG:ns NR:ns ## COG: PA4725_2 COG0642 # Protein_GI_number: 15599919 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 311 494 176 369 378 73 29.0 7e-13 MLIIMMALIGVVALIVVGLRRDKISLLILALCGSFIAMLAGIVIYTAKIGGFSRNQQILL FLLPNIQQYLQYLPITLDVQGYAAAVGRCCFPCILLLISLNYSMSTLVRRHMRKLKILVY TVVIWLVYYYPAVFRLWVRGRGKLQNFMGVMAMAWIMVCLLLALFLLIHEYQSISIPYFK KNFRYILLSYISMAFLYLIYGLQDPAQIYQYYGLEFLWIWRGSYTAGIMYSFSWAALVLC SIFFVILGSVNLLRYTQITYNQDMETVYLQSKFDQTSLGASVFVHSMKNQLLANRILNKK LGRALNEDSPDLEKIRGVAAQLNQLNESMVTRMEDLYRAIHSNSILLTPVSVREIAEAAE GRFYQKYPDRKVTAEICDDGMILADREHLAEAVYNLLCNGQEAIAEAGREAGAELRLTIK RQRLYMVLEVHDNGKGISRTERLKICEPFYTSKNTSYNWGMGLFYVRNIVKSHYGTLRIE SHPETGTSFFIMLPGYGTKDMEKRG >gi|229783986|gb|GG667749.1| GENE 8 6391 - 7506 1129 371 aa, chain - ## HITS:1 COG:BMEII0109 
KEGG:ns NR:ns ## COG: BMEII0109 COG0715 # Protein_GI_number: 17988453 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Brucella melitensis # 65 361 23 315 319 149 27.0 8e-36 MKKLPVGLLTVMLCGILAGCSGEGVPKQNTDTKAPEITTEARQQTEAAAAEMTDGAETTE SSKPETVKLRVAYHPNFGGNNLYITAEKNGFFEEEGLVVEPVKFTAGPQEIAAMVAGDVD IGYIGHGATFLAVQRQVNVFLTEFLGNGDELLTYKESGITSLEDLKGKTIATAPGTSGET VLKLALEKAGLSRDDVKIVNMDAAGMVAAIVAKKVDACTLWPPQTSEVRKTMGEENVVKL ASIGDFRDQFAFPGHWVVTPKFMDAHPDVMVRFSRAMLKAMDYRKEHPEETAENVAEFID APLDGVAAEMNNIDLFDADYCYQVMTDGTAAKWYDALQNLYVQDGKVEKTVPVAEWFCPQ FAKEAYESYHK >gi|229783986|gb|GG667749.1| GENE 9 7557 - 8327 316 256 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 8 208 4 213 245 126 35 1e-28 MIVTIDHVEKIYQGRSGEVVALNGVDMEIGENEFVCVVGPSGCGKSTLLNIIAGLEKPTS GRVCVKGKEVVNPGSERGVIFQQYALFPWLTVKKNVKFGLKLRGVKEPELSAIADKYIRL VGLEEFGDSYPKELSGGMKQRVAIARAYAVNPEILLMDEPFGALDAQTRTQLQTELLETW EKEKKTCFFITHDVEEAIILAQRVVIMSARPGRVKEIVPVNIPYPRTQETKMTKEFLDLK VHVWGQVYREFLEVRK >gi|229783986|gb|GG667749.1| GENE 10 9255 - 9986 719 243 aa, chain - ## HITS:1 COG:BMEII0107 KEGG:ns NR:ns ## COG: BMEII0107 COG0600 # Protein_GI_number: 17988451 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Brucella melitensis # 15 240 13 237 246 177 45.0 2e-44 MKWIKIKNKRVWSWISLVLCLAVWQMISVSEAGAVFASPKKTLMAFITMAQSGAIWVHIG YSLFRVITGFLLGLAAAIPIAFLMGWYKTVQYLLEPVVQFIRTIPPIAYIPLVIVALGVD ESSKIMVIFIATFLVMVITIYQGVKNVDSTLVKAARVLGANDRDIFFHVVVPASFPFVLT AIRLGLSAGLTTLIASELTGASKGLGTMIQEASMYFRMDVVIMGIMVIGIIGLALNKIVH ILS >gi|229783986|gb|GG667749.1| GENE 11 10000 - 10821 619 273 aa, chain - ## HITS:1 COG:BH0420 KEGG:ns NR:ns ## COG: BH0420 COG0363 # Protein_GI_number: 15612983 # Func_class: G Carbohydrate transport and metabolism # Function: 
6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Bacillus halodurans # 32 271 9 240 246 78 26.0 1e-14 MKQWEEYRIPAAEVRRISRIPITCLEDQTSVFSRMALEMTDKIEENNRQGKRTVFIIPFG PTGQYPVFVDLVNTRRISLTQVWLITMDDWLDSDGDLVGLDNIHSLRRKMNQVLYSKIDK SLLMPEEQRIYPDPKWLVRIPDLIEALGGVDICFAGFGINGHLGFNEPSDLSIEEFAKLP TRVVKISETTRTIKASTGVNGAMEDIPLYGVTVGMKEMLCAEKVSVYCFRDWHSGAIRRA CCGTVSASFPASLLQNHPNASIVCSPNAIAQCY >gi|229783986|gb|GG667749.1| GENE 12 11012 - 12307 1430 431 aa, chain - ## HITS:1 COG:PH0710 KEGG:ns NR:ns ## COG: PH0710 COG0172 # Protein_GI_number: 14590588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Pyrococcus horikoshii # 1 427 6 450 460 322 39.0 1e-87 MLDLKFVRENPEAVKQNIRNKFQDSKLPLVDEVIELDELNRAAKQEGDALRAERNKLSKQ IGGLMGQGKKEEAEELKKQVTAASDHLAELEEKERELDEKIKKIMMTIPNIIDPSVPIGK DDSENVEVQRYGEPVVPDFEVPYHTDIMERFDGIDLDSARKVAGNGFYYLMGDIARLHSS VISYARDFMINRGFTYCVPPFMIRSEVVTGVMSFAEMDAMMYKIEGEDLYLIGTSEHSMI GKFIDTIIPEESLPKTLTSYSPCFRKEKGAHGLEERGVYRIHQFEKQEMIVVCKPEESME WYDKLWQNTVDLFRSLDIPVRTLECCSGDLADLKVKSVDVEAWSPRQKKYFEVGSCSNLG DAQARRLKIRVNGENGRKYFAHTLNNTVVAPPRMLIAFLENNLQADGSVRIPEVLRPYMG GMEAMVPKGHK >gi|229783986|gb|GG667749.1| GENE 13 12407 - 13249 784 280 aa, chain - ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 5 276 3 278 290 197 37.0 2e-50 MTSERHQKKIAVINDLSGYGRCSLTVAIPILSALKVQCCPVPTSILSNHTGFPVYFFDDY TEKMEPYLEKWTELGLEFDGIFSGFLGSEAQIGIVMSAIRRFKGKDTLVVIDPIMGDHGK AYQTYTPAMCSRMKELVGLGDLVTPNLTEACILTGREYREEGWHRADLLAMAEEILKMGP SGVVITGVKEGAYLTNVIAERDREAAFCRSLRVGRERPGTGDVFSAIVAAEAVKGSSLTD AAKKAARFVKRCILKSEELDVPVQNGVCFEEVLAELIKWK >gi|229783986|gb|GG667749.1| GENE 14 13292 - 13879 683 195 aa, chain - ## HITS:1 COG:no KEGG:Closa_2116 NR:ns ## KEGG: Closa_2116 # Name: not_defined # Def: hypothetical protein # Organism: 
C.saccharolyticum # Pathway: not_defined # 1 190 1 190 195 267 82.0 2e-70 MNRTDSKISKIAMTGLFAAMSYVVFTFLQFKITLPGGDATSIHLGNAVCVLGALLLGGVY GGLGGAIGMTIGDLLDPIYVVYAPKTFILKLCIGLITGFLAHRLGKINESHDQKHILKWV LIASAGGLLFNVIFDPLVGYYYKLLILGKPAAELALVWNVTSTTINAVTSTIASVVIYMP LRSALIRSGMGLMRS >gi|229783986|gb|GG667749.1| GENE 15 14887 - 15234 357 115 aa, chain - ## HITS:1 COG:CAC1731 KEGG:ns NR:ns ## COG: CAC1731 COG1564 # Protein_GI_number: 15895008 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Clostridium acetobutylicum # 1 114 1 112 211 73 37.0 1e-13 MTGLIITGGSINCAFAGAFLKIREFDMVIAVDGGLAAVKSLNIRPDAIVGDFDTVKGELL EEYRRMGGIAFEQHQPEKDETDTELALRTAMEAGVNDIVLLGATGGRLDHSIGNS >gi|229783986|gb|GG667749.1| GENE 16 15231 - 15890 815 219 aa, chain - ## HITS:1 COG:lin1932 KEGG:ns NR:ns ## COG: lin1932 COG0036 # Protein_GI_number: 16800998 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Listeria innocua # 4 200 4 200 218 229 56.0 4e-60 MFLLAPSILGADFKHLEQQVCEAADAGAQYIHLDVMDGDFVPSISFGMPVISSLRSCTDK VFDVHMMVREPGRYIDAMKEAGADIISVHVEACTHLDRTVNQIKEAGIKAGVVLNPATPI DVLDCILDQVDMVLLMTVNPGFGGQKFIPYTLEKIRALRKKCDMRGLSTDIQVDGGVTCD NVRDLMEAGANIFVAGSAIFKGDVTANTKAFLKIFEEQE >gi|229783986|gb|GG667749.1| GENE 17 15893 - 16768 1020 291 aa, chain - ## HITS:1 COG:CAC1729 KEGG:ns NR:ns ## COG: CAC1729 COG1162 # Protein_GI_number: 15895006 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 1 291 1 287 288 245 44.0 8e-65 MTGKIIKGIAGFYYVDNGTDVYECKAKGVFRNKKIKPLVGDNVEFTVLSEAEREGNIDKI LPRKNELVRPAVANIDQALVIFSITHPEPNLNLLDRFLVMMEVQEVPVKICFNKTDLTGD GERDRLGNIYSSAGYPVYFTSTYENSGLDELRVLTEGKTTVLAGPSGVGKSSITNLLFPE AEMETGKISEKIQRGKHTTRHSELFAIGKDTYMMDTPGFSSMYLEELECGNLKDYFPEFA AYEEDCKFLGCVHVGEPVCGVKEAVRQGKISESRYANYKLLYQELKDKRRY >gi|229783986|gb|GG667749.1| GENE 18 16778 - 18961 2400 727 aa, chain - ## HITS:1 COG:BH2504_1 KEGG:ns NR:ns ## COG: BH2504_1 COG0515 # 
Protein_GI_number: 15615067 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Bacillus halodurans # 6 271 3 270 343 268 53.0 3e-71 MILRPGTFLQDRYEILDKIGSGGMSDVYKALCHKLKRQVAIKVLKEEFSSDSGFVSKFKM EAQAAARLSHPNIVNVYDVVDEGTLHYIVMELIEGITLKNYILKKGCLDSKEAIGIAIQV AQGIAAAHEQGIVHRDIKPQNIIIARDGKVKVADFGIARAASTQTLSATAMGSVHYISPE QARGGYSDARSDIYSLGITMYEMAAGRVPFEGENTVTVALAHLEDPITPPSYFNQEIPVS FENIILKCTEKKPEYRYGSMQEVIADLRRALVTPDENFVQASAPSMDTSQTVIIGAPELE QIRSGSGRRNQPEPQPGRNRSYGGEPVRERQPEHRSSRNTSGGSPSGRTKRKEPDEEINP GIEKLLTAAGIVVAILIVIVLIFLVTRLGGLFRSGTPKETETTVETTQEETTLGEKQTTM PEVEGLAEDIARQKLRENGGLDMKIGDYEFSDTVKKGDVISQTPKAGEIVDRYSAVSVVI SSGPEHPEIDLTTLGLDSLTAEAAKTLLEGKGFIVNLQNQSSDTVESGKIISYSPSKAKE GAVISVIASTGPALTNPVVVPDITGKEEEVGEEMLADLGLVRGTVTKETSEEVPAGTIIR QTIAPNTQVEGGTAIDYVVSSGSAEQAKYKYLASIEKSYPLQNLIGPGSASTQLNIKIQL KQTVNGKDEFRDLMGPVTITGDQQLPVVFKNIEGAYGVTSGEVQIVNVDTGDVINSYPVT FVPIPQS >gi|229783986|gb|GG667749.1| GENE 19 18958 - 19701 941 247 aa, chain - ## HITS:1 COG:BH2505 KEGG:ns NR:ns ## COG: BH2505 COG0631 # Protein_GI_number: 15615068 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Bacillus halodurans # 6 237 6 239 249 167 42.0 2e-41 MKACALTDTGRVRTANQDYVYASVEPVGSLPNLFVVADGMGGHQAGDYASRYIVENLVSY LQYTENSQIVPLLREAILKVNTMLYHEAKEKPEFSGMGTTLVAAVADENTLYVANVGDSR LYLVRDRIRQVTRDHSYVEELVSLGRLERGSKDYKDKKNIITRAVGTEDKLLVDFFEVGL EPGDYILMCSDGLSNMLEDAEMEEIIGSDLELQEKAEKLITVANDNGGKDNIAVVLVDPQ IGREGSL >gi|229783986|gb|GG667749.1| GENE 20 19698 - 20405 717 235 aa, chain - ## HITS:1 COG:CAC1726 KEGG:ns NR:ns ## COG: CAC1726 COG0820 # Protein_GI_number: 15895003 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Clostridium acetobutylicum # 2 228 117 343 345 246 51.0 3e-65 VCASTLDGLERNLRPAEMLDQIYRIQYLTGERVSNVVIMGSGEPMDNYDHVVKFIRLLTD 
EHGLNVSQRNVTLSTCGIVPGILKLADEGLAITLALSLHAPNDEVRKTLMPVAKSYKLND VLEACHTYFEKTGRRLTFEYSLVAGVNDNLEEAAALAALIKDQQGHVNLIPVNPIKERDY VQSDRKAIEAFKNLLEKNGINVTIRREMGRDIHGACGQLRKSFLTEEQKGSEKTL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:33:28 2011 Seq name: gi|229783985|gb|GG667750.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld143, whole genome shotgun sequence Length of sequence - 20822 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 5, operones - 5 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 523 - 575 14.3 1 1 Op 1 . - CDS 643 - 1152 384 ## Closa_2255 hypothetical protein 2 1 Op 2 . - CDS 1231 - 2616 1492 ## COG0534 Na+-driven multidrug efflux pump - Term 2912 - 2960 2.3 3 2 Op 1 . - CDS 3011 - 4057 999 ## Bphy_3219 hypothetical protein 4 2 Op 2 . - CDS 4100 - 4552 476 ## COG2731 Beta-galactosidase, beta subunit 5 2 Op 3 44/0.000 - CDS 4590 - 5552 794 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 6 2 Op 4 . - CDS 5545 - 5928 207 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 7 3 Op 1 44/0.000 - CDS 6870 - 7454 335 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 8 3 Op 2 49/0.000 - CDS 7469 - 8368 817 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 9 3 Op 3 . - CDS 8368 - 9339 870 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 10 3 Op 4 . - CDS 9365 - 9853 476 ## gi|266624073|ref|ZP_06117008.1| putative oligopeptide ABC transporter oligopeptide-binding protein 11 4 Op 1 . - CDS 10789 - 11844 1097 ## COG0747 ABC-type dipeptide transport system, periplasmic component 12 4 Op 2 . 
- CDS 11913 - 12641 768 ## Asphe3_01730 hypothetical protein 13 4 Op 3 1/0.000 - CDS 12654 - 13583 509 ## COG2706 3-carboxymuconate cyclase - Term 14810 - 14863 8.5 14 4 Op 4 3/0.000 - CDS 14907 - 15860 922 ## COG0583 Transcriptional regulator - Prom 15931 - 15990 3.7 - Term 15964 - 16014 11.3 15 4 Op 5 . - CDS 16060 - 16347 304 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 16417 - 16476 7.5 16 5 Op 1 38/0.000 - CDS 17378 - 18112 705 ## COG0395 ABC-type sugar transport system, permease component 17 5 Op 2 35/0.000 - CDS 18126 - 19013 988 ## COG1175 ABC-type sugar transport systems, permease components - Term 19028 - 19070 6.1 18 5 Op 3 . - CDS 19088 - 20494 1591 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 20672 - 20731 6.0 Predicted protein(s) >gi|229783985|gb|GG667750.1| GENE 1 643 - 1152 384 169 aa, chain - ## HITS:1 COG:no KEGG:Closa_2255 NR:ns ## KEGG: Closa_2255 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 150 21 171 177 102 37.0 7e-21 MDELRWYLYDLVREVMEKHGTGESTYSLETVREGAVCLIPSEHGFLVTGGGERESEQEDF YRGSREFFRRIFQDDEMAETVMQEFLTRTLDLPAIMKGPSITGLEARIFKCREEMAALEQ KALKPDGQKWKIKRKLDRIYLEGLLKQLEETDKKRYEKIKMEINDSGSV >gi|229783985|gb|GG667750.1| GENE 2 1231 - 2616 1492 461 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 13 459 4 443 447 226 32.0 1e-58 MGGKLKALFAQTDMTVGKPWKQIVIFTIPMLIGNIAQQLYNTVDSIVVGKYVGDNALAAV GSASPILNLLLVLFIGISMGTSIMVSQFFGAKDREHLSKTIGSTITLTAAASLIIMVFGS VTIRPLLKMLNTPSSIIDWCTSYLLILIVGIMGVAYYNMFCGILRGLGDSLSALIYLLIA TVINIILDLWFVAGLQMGVAGVALATVIAQMVSAVLCFFKLKRMTDVFDLKFQYLKPDME NIMTILRLGLPSGITQAIFSMAMVVVQSLTNSFGEMVIAANVIIMRVDGFAMMPNFSFGT AMTTYSGQNVGAKEIERVHKGAKQGTMIAAGTSAVITALILIFGKYLMGIFTNTSELVAL SANMMKILAVGYIAMAVTQSLSGVMRGAGDTMTPMWISIVTTVFIRVPLAYGIAYFTKTP 
ELPGGRYECIWVSLLISWVLGAVLTVIFYRIGKWKNKGLES >gi|229783985|gb|GG667750.1| GENE 3 3011 - 4057 999 348 aa, chain - ## HITS:1 COG:no KEGG:Bphy_3219 NR:ns ## KEGG: Bphy_3219 # Name: not_defined # Def: hypothetical protein # Organism: B.phymatum # Pathway: not_defined # 44 281 18 222 318 92 31.0 3e-17 MKILHCDSRRDEIQKTVYSCGEFAYEYISDFARNLPDNVANASIAGGCTDPEGNVYLGMR GNPSQIVKLAPDGTYMKGFGGELLGDYLHFIKYTPQNTILCTDTHNHIVREFDTDGTLIR DFGEAGKPSDTGMDMGTLARMRRDGKIFPTEPYLGIVGMWAFYEAQRRVKRVAGPFNMPT DVDMTSSGEYIFADGYGNRAVHIFNADGTYRKTFGGVGDWEKPYEDTPGDMLIVHALCVD ARDHIWICDREKDAVHVFDTDGNVVGYCSNNMGMPSGVDTDGEYIYVVGRAGYLTVFDLD MNMVAQLGTFNSDLRAHDLAANKQGDLFLFPTHANEDHQVIKLRRIRK >gi|229783985|gb|GG667750.1| GENE 4 4100 - 4552 476 150 aa, chain - ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 1 150 1 151 152 86 30.0 2e-17 MVFEKISNVDRYDLKNEKFTEAFAFLKREGLEKLPIGNFELCPGVIVQVQSYETEPAAGI DYETHDCHFDIHYVIHGLEGIGVTGRDGLTEKGDYDPEKDMGYWEGPEYEGMVILHPGEY VILAPEDAHKPHCAVGEPCSMRKIVIKVEV >gi|229783985|gb|GG667750.1| GENE 5 4590 - 5552 794 320 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 4 311 11 324 329 310 49 5e-84 MADLLEVKNLKRYFKTPTGDLHAVDGVNFTLKAGKTLGVVGESGCGKSTLGRVVLGLIEA TEGTVLFEGKDITHLKTRKRREMCKEMQMIFQDPLASLNPRMCVEELIMDPLKIVHASRD KETMRKRVYELMDQVGLARRFACSYPHELDGGRRQRIGIARALVLGPKFLVCDEPVSALD VSIQAQIINLLQDIQEESRLSYLFITHDMSVVKHISDDILVMYVGCMVEKSSSDDLFDHP MHPYTKGLLSAIPVPDIDVKKKRVLLQGELSSPINPKPGCRFAPRCPYAAEECRIRQPLF EEVMPDHFVACHKVREINQL >gi|229783985|gb|GG667750.1| GENE 6 5545 - 5928 207 127 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 115 214 329 329 84 35 6e-16 SVLLITHDLGIVAECCQKVAVMYAGEIVEYGSLGDIFKETSHPYTMGLFRSIPSLTRKEK 
RLSPIDGLMPDPAKLPEGCCFHPRCPYADDICKKCHPELSDTGNGHKVRCFHLEEAKMAH GEVRTNG >gi|229783985|gb|GG667750.1| GENE 7 6870 - 7454 335 195 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 3 195 11 204 329 133 39 9e-31 MSLLEINDLNIHYITEQATFRAVNGIDLSLEEGETLGLVGETGAGKTTTALGILNLIPNP PGKIVGGEILYKGENVVTMDQAALRRLRGNEISMIFQDPMTALNPVMRVGDQIVETISLH MECSRAEAVERSLDMLKMVGIGPERSTDYPHQLSGGMKQRVVIAMALACSPALLIADEPT TALDVTIQAQVLEMI >gi|229783985|gb|GG667750.1| GENE 8 7469 - 8368 817 299 aa, chain - ## HITS:1 COG:FN0398 KEGG:ns NR:ns ## COG: FN0398 COG1173 # Protein_GI_number: 19703740 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 19 299 7 289 289 275 49.0 6e-74 MSGKKKRETYIQSLEGLTRKHRQWRDVIHRLFKNRLAIVGLILVAVVLFATIFANFLTPY DYSAQSFSDRFVYPCLAHPFGTDQYGRDLLARILYGGRVSLIVALIAVVISLVIGTVLGA VAGYFGGIYDTVIMRVIDIIMSIPGLLLAVAISAALGTGLLNTGIAVGLASIPGGARILR STVMTIRDEQYIEAARAGGAGHAYIIVHHVLPNTIAPLIVDASLKVGMAILTISSLSFVG LGVQEPTPEWGAILSSGRAYIRDFWPLVIFPGLAIMTTLLGFNLLGDGLRDALDPKLKQ >gi|229783985|gb|GG667750.1| GENE 9 8368 - 9339 870 323 aa, chain - ## HITS:1 COG:FN0397 KEGG:ns NR:ns ## COG: FN0397 COG0601 # Protein_GI_number: 19703739 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 309 1 307 308 265 49.0 6e-71 MAKYILKRILWIVPVCLGVLLIVFSISYFAPGDPVMSMLGASGYTPETYASLKHELGLDQ PFFVQYIRYIVNIVTKFDFGNSYTYGHAVSGEIISRIGITVKLGVLSVIVTTLIGVPFGV ISATKQYSAMDYSVTAGSLFFAAMPNFWLALICIIVFALKLKWLPATGMGNWRYMVLPVL ANSMGTVANVARMSRSSMLEVIRSDYIRTARAKGCRENTVIWRHALHNAVIPILTVVGMQ LGTVMAGSVVIETIFNMPGLGSYLLTGISGRDYPVINACVLILAFFICIMNLIVDLCYAF ADPRIRSQYVSGKKRKAAVKEGS >gi|229783985|gb|GG667750.1| GENE 10 9365 - 9853 476 162 aa, 
chain - ## HITS:1 COG:no KEGG:no NR:gi|266624073|ref|ZP_06117008.1| ## NR: gi|266624073|ref|ZP_06117008.1| putative oligopeptide ABC transporter oligopeptide-binding protein [Clostridium hathewayi DSM 13479] putative oligopeptide ABC transporter oligopeptide-binding protein [Clostridium hathewayi DSM 13479] # 1 162 6 167 167 302 99.0 7e-81 MNFTLNMTTVDQAVKTNAATVIQFYLQQLGINLQMNFTDFPTAFSAWLTEGGTDLNFQDS DTGSVCGEPYISLRFFPAELATFPFAAISDETFVTLYKEANYTTDDDVRKEKYAELQQYV HDHAMIIPLYESVDAVAYNPEKIESVSLHSAVSANLRYVTLK >gi|229783985|gb|GG667750.1| GENE 11 10789 - 11844 1097 351 aa, chain - ## HITS:1 COG:MA1250 KEGG:ns NR:ns ## COG: MA1250 COG0747 # Protein_GI_number: 20090114 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 42 346 30 327 527 92 23.0 1e-18 MKKLGVLLLAAALVMSMTACGGGNRNTGEGTTASGGNTRTEAGAKSGEGGGEKTLNVGTI NVLGTFMPGSENEVCHWGCYLVYDYLFYYDEENNPFSDILKNWYYEEDGTTFVMECRDDV YFANGNQMTGEDVLFSIKTLVDRGTNQAAYYATVDWDNSYVSEDGFTVYLANTQEFGPGI INMGVVYLLDQSWCEETGFDNIDPWLNAPNGSGPYEVAEYVTDSYVTLKRRDNYWGDFSG ADTVKINHYAEDSAMYMALETGEIDLALNIAESDYSRALADDKIAVMTTHEGENILLALD NNNQYMKDENVRLAIAYGVNWQDVAESARGELAEVPSSIITSVSPYYKDTS >gi|229783985|gb|GG667750.1| GENE 12 11913 - 12641 768 242 aa, chain - ## HITS:1 COG:no KEGG:Asphe3_01730 NR:ns ## KEGG: Asphe3_01730 # Name: not_defined # Def: hypothetical protein # Organism: A.phenanthrenivorans # Pathway: not_defined # 10 239 27 254 256 85 29.0 1e-15 MDDSAEIRLRRLEARAEIENLIGTYCHLLFAGEGGQIMDELWSQSEDISIEIGASGRYCT REKVATYYQKDHIAGKFTLLLPVTPVIEAAADGKTVRGMWFVLGLDSDAGELGTAVTEER ELLTSRTKEGKAYQAECAIWRLGADCIKEGKLWKILHLHQYDMIRFPCGSDWVRFAEERF ATDGIRLDAMFCSNLPFAEDRAPENLANAPTSYHWQYRVDGMTESEPKLPKPYETYKETD RY >gi|229783985|gb|GG667750.1| GENE 13 12654 - 13583 509 309 aa, chain - ## HITS:1 COG:PA4204 KEGG:ns NR:ns ## COG: PA4204 COG2706 # Protein_GI_number: 15599399 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # 
Organism: Pseudomonas aeruginosa # 6 277 69 350 388 113 28.0 5e-25 MYHAGYLTISKDRRNLYVLSEGMTFKGRASGGITAYDISDGKFEELNWAVTGGQRPCFVY CDDNSAEIYVSNFYQGTMAAFRRQDDGRIGERLACVKAHDTTPVGPFLHCVMKSPAGRYL AALELIGGSIYIYDCENQYRLVWKEELEAHSGPRHMVFSEDGRFLYVNRQDDEKVSVYYF APDSKRILTHVQTISVRTPDMKGKTEPAAIRLCPGGRLLAVSNRGMGRENREDSISIYGV DTETGRITLISVIKTEGEMPRDINFTPDGKFLVVAYQFQGYPDLFRVEGETLIYTGAGCK IPSPVCIAF >gi|229783985|gb|GG667750.1| GENE 14 14907 - 15860 922 317 aa, chain - ## HITS:1 COG:sll0998 KEGG:ns NR:ns ## COG: sll0998 COG0583 # Protein_GI_number: 16329741 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Synechocystis # 3 306 8 314 345 87 24.0 2e-17 MELLHLNYFREVAAVENITKAAKKLYVSQPALSTAIARLEKELDFLLFERKGNRIELTEA GRCFLEYVNSAFSLLEEGTLKSRQIANRNNRRMVIASGFGVLRNMLDDYKKIHPECEVEL KFCDTDRIHAMLASGDADAGLNLGEILDSRLTNRDLMEGRYYIAVNNTHPFWGKNSVDLK DLNGQLLFCSNIARTFETGMEIFQNAGISCNLLKLDEREVLFSAARKGLGGVFCMPMFEE NKSVEMKQENSDPADRICFIPIRDCEVKASVTLVTRKDHYYTAENQDFIQFVTQRFLKNQ IALNSDLEQRKIGHGID >gi|229783985|gb|GG667750.1| GENE 15 16060 - 16347 304 95 aa, chain - ## HITS:1 COG:yijO KEGG:ns NR:ns ## COG: yijO COG2207 # Protein_GI_number: 16131792 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 91 178 268 283 66 36.0 1e-11 MEENYTQKITLAEVAEKTYVSQWHLSKLLNGHTGKSFSELLNHIRIEEAKKLLKDPSLRI GDISEEVGFMDVAHFSKVFKKLEGISANEYRNSRI >gi|229783985|gb|GG667750.1| GENE 16 17378 - 18112 705 244 aa, chain - ## HITS:1 COG:BH1119 KEGG:ns NR:ns ## COG: BH1119 COG0395 # Protein_GI_number: 15613682 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 3 227 2 222 281 120 31.0 2e-27 MNRGKKISLGSVVIYIVLGFWALTTIYPIFWVIINSFKAKGEILSNSFALPTGAMFTLAN YRTAFERLSIFSAYRNSMIISTSVAVIVIVLAGMASFGLVRYDFKLRKPLNSLVVAGMMF PVFSTIIPVYRMLSSIHMVNTESLALSLTSVILPQVAGNMSFAIVVLTGYVRSLPIELEE SAYLEGCNPFQIFYKIVVPLAKPSFATVAIFSFLWSYNDLFTQMFLLRLPQHRAITRLLN ELAS 
>gi|229783985|gb|GG667750.1| GENE 17 18126 - 19013 988 295 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 5 291 2 287 292 139 31.0 5e-33 MGANIYKNKKPLFLFLVPAFAFMAIFLYYPFIQNILNSFKQITELGAASKGWNDPWYTNY VTMFQDENMRVAVRNTILMMVVTIVGQVGIAIVLALLVDNISHGAKFFRTVYFFPIVISA TALGLMFNLIFLYDKGMVNQLLESFGVTELIDWKGEKLALITMMLPVMWQYVGFYFVILI TGLNNISDELYEAAAIDGATGFQKVRYVSIPLLHNVMCTCVVLGVTGALKVFDLPWTMFP KGIPLGETWLTGTYMYYQTFNTKNVDYSSAIAVLIVVLGIVTAKVVNTIFKEKDY >gi|229783985|gb|GG667750.1| GENE 18 19088 - 20494 1591 468 aa, chain - ## HITS:1 COG:SMc02471 KEGG:ns NR:ns ## COG: SMc02471 COG1653 # Protein_GI_number: 15966817 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 82 459 51 425 434 65 22.0 2e-10 MKKRTMALLLAGAMTTTVFTGCSKQSTADTTAAPKTEAGTEAKTEGEAKTEAPTEAKKAV ALNVMTTFAGEDGNAPNYQKAVKDWEASTGNTVNDSSAVSDETVKARVASDFQMGSEPDV LFWFNGADANSFIEAGKVMSIDDIRKEYPEYASNMNDDLIAASLVDGKKYAVPVNGFWEA MFVNTTVLDAAGVEVPGPDYTWDQFLADCEKIKAAGYSPIAAALGNIPHYWWEYAIFNHN TPANHCDIPETVDDAQGQSWVAGINDIKDLYEKEYFPKNTLSATDDDTFLAFTEDKAAFL IDGSWKVGGIAGACQSDPNDPATLDKEKLDKFTVTYVPGQGDRKATDLIGGLSMGYYITK KAWDDPEKRDAAVNFVEYMTSNEIVPIFAQHTASALKEAPEVDQSQFNSLQVKAMDMMSG VTSLTPAVQDIFQGECRTSTFDGMPQIVTGKVSAEDAVKEGLEIYHAN Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:33:53 2011 Seq name: gi|229783984|gb|GG667751.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld144, whole genome shotgun sequence Length of sequence - 10983 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 3, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 3 - 1302 864 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 2 1 Op 2 . - CDS 1321 - 1851 204 ## gi|266624084|ref|ZP_06117019.1| conserved hypothetical protein - Term 1861 - 1907 9.1 3 1 Op 3 . - CDS 1926 - 2576 412 ## COG0629 Single-stranded DNA-binding protein - Prom 2617 - 2676 2.7 - Term 2635 - 2676 2.1 4 2 Op 1 . - CDS 2680 - 3147 302 ## gi|266624086|ref|ZP_06117021.1| conserved hypothetical protein 5 2 Op 2 . - CDS 3080 - 4372 626 ## Closa_1104 hypothetical protein 6 2 Op 3 . - CDS 4411 - 5523 670 ## gi|266624088|ref|ZP_06117023.1| conserved hypothetical protein 7 2 Op 4 . - CDS 5438 - 5899 293 ## gi|266624089|ref|ZP_06117024.1| conserved hypothetical protein 8 2 Op 5 . - CDS 5913 - 6491 276 ## gi|266624090|ref|ZP_06117025.1| hypothetical protein CLOSTHATH_05426 9 2 Op 6 . - CDS 6507 - 7538 584 ## COG5377 Phage-related protein, predicted endonuclease 10 2 Op 7 . - CDS 7594 - 8721 1010 ## gi|266624092|ref|ZP_06117027.1| putative endonuclease III 11 2 Op 8 . - CDS 8735 - 8869 240 ## gi|266624093|ref|ZP_06117028.1| hypothetical protein CLOSTHATH_05429 12 2 Op 9 . - CDS 8906 - 9841 514 ## Closa_0805 hypothetical protein - Prom 10041 - 10100 7.9 - Term 9860 - 9918 13.2 13 3 Tu 1 . 
- CDS 10160 - 10753 226 ## CLL_A2764 putative aminotransferase, class V - Prom 10803 - 10862 2.3 Predicted protein(s) >gi|229783984|gb|GG667751.1| GENE 1 3 - 1302 864 433 aa, chain - ## HITS:1 COG:CAC2854 KEGG:ns NR:ns ## COG: CAC2854 COG0507 # Protein_GI_number: 15896108 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Clostridium acetobutylicum # 3 433 6 420 739 185 33.0 1e-46 MVCKFKSRLYINPGNGYTVAEYSSMELDCIPSEAVISNYGTEVTFTAFGKELPCMDNVEI EMEGDWKRSEKYGLQFGVDWSRVLLPKSREGIIGYLSSELITGIGPVMAREIVNRFGTDT FTVMENHPNELLAIKGITEQKLEFIIESYQKSAELRELMAYLAPYHVTPKKAEKIKQHFG LEAVTLLKENPYRLCEIKGFGFITVDPIARASKDLAPDEPERIKAAIQYVLRKGAEEGNL YLDSTIIVDMAYKVLNAGFPTDTVRRGQIKLAGNELVMKDKLLEADGTAIYLKAYREAEK EAAYHLVRLLRSPGNTYNIERELEAVLAKSKMTLAQKQEEAVRMVFRSQVSIITGGPGKG KTTILKIILQIFERLEKNKSVLLCAPTGRARKRLSESTGYPAFTIHKALYVTDDEMDNME PEILEEDLIIADE >gi|229783984|gb|GG667751.1| GENE 2 1321 - 1851 204 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624084|ref|ZP_06117019.1| ## NR: gi|266624084|ref|ZP_06117019.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 176 1 176 176 363 100.0 4e-99 MNEKKVQKFIMVFLMCTAGAVMLQLAAPMILGLIVNAVSIGIPFAVYYLLIRKKWRIRII KSSEEKEQSGSGSGAAEEGLDDGNNGQDERDNPVYEWYTRCGRERIKAISTNIFAHGKNE FWIRTDGVCNIRTPKGYRRAGNLPGYPGAQANTITEYLKQDGFYAVNDGRYIYVSR >gi|229783984|gb|GG667751.1| GENE 3 1926 - 2576 412 216 aa, chain - ## HITS:1 COG:SA1792 KEGG:ns NR:ns ## COG: SA1792 COG0629 # Protein_GI_number: 15927558 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Staphylococcus aureus N315 # 1 109 2 110 156 83 42.0 3e-16 MNRCIFVGRLTRDPEVSFSEKAQMSIARFTVAVDRIPKREGEPEADFFNCVAFDRKAEFV EQYWYSGMRVVVSGRMQNDNYTNKKGEKVYGVSLKADDVEFADGKREADNGQAQSTAPSR AANPANALAAQMSKEAPVTRNATQAPRAATAASRAAAPATNQGAAKPAASRNTPQRQAPS RSAVGRSASQRGAARAAAGDDFMNVASGMEEGLPFN 
>gi|229783984|gb|GG667751.1| GENE 4 2680 - 3147 302 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624086|ref|ZP_06117021.1| ## NR: gi|266624086|ref|ZP_06117021.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 31 155 1 125 125 253 100.0 4e-66 MMGWTEQFIIEAICRHTEQELRSDETKTERMKLSACSTLESIRRRVNGYGAECFAFVLLV GIMLKKSQMILNADTKGELEAILEPSVPRWNYGGFLTGPYHVPEEEAIFWSKASLEAPLN DEGARRYRDVAAVAFPEEMAEIWGTDDTGEGGIAV >gi|229783984|gb|GG667751.1| GENE 5 3080 - 4372 626 430 aa, chain - ## HITS:1 COG:no KEGG:Closa_1104 NR:ns ## KEGG: Closa_1104 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 406 10 400 595 221 34.0 4e-56 MFNISDVANLLGIKRLTEGNSFNVVCPFCGDTRGKMNFRIVKDGELANTYHCFKCGAHGN MLTLYADLMGICGVDRYKIAYREIKTALHEGAGCRNSYDSSFTGTMTQIEEEETASLEKR SQIYQRLIEMLSLSETHRRKLMDRGLTSNQIEAFHFRSTPVHGTEGLARRLIKEGYSLAG IPGFYMNQNQNWDIALYRNNQGILCPAYSVSGKIEGFQIRLDAPHDDKKYLWLSSTNKKR GAGSKSPVSFIGNPFDKTVRVTEGILKPIVAHALSGYSFLGTPGVNQYKSLEKALAVLKE NGLEMVMEYFDMDKFMDIGCYGDYKSSTCGKCQNKERYYGRKICEEKRKKRDQIRAGCCR LYESCDRLKLKCIRKTWDQREDGIWAGNDKGIDDYWWSGLKKKEEQIGYDGMDRTIHHRS NLPSYGTRVA >gi|229783984|gb|GG667751.1| GENE 6 4411 - 5523 670 370 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624088|ref|ZP_06117023.1| ## NR: gi|266624088|ref|ZP_06117023.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 22 370 1 349 349 625 99.0 1e-177 MEQKILTLQEKMVKIRSEIPALVKRAYSEDVNYDFIKIDDIFRYLTPAMNRWNVNFEIVS ECASKRNENGDSVYVEFLAYCQMWLYEADLELKWINAEKPEDEVTVTIHAIGTHEMPEKA KGSALTYAMKYYLMDKYCIDQGGVDPDMRDFPPDTYDGSIPEYDGLSSDAYDFSDRDSEG IAGDASGITDMELDAEMEGGQEVPDELPEEEPMEEAEESEVDASEERSMMNDSEMNPESG HEETDAEETAKQQDGGMESDAKKTRRKSAKDKQKADKGDTNRPVLSRHIMPENGTMVNLP NEVSPQPAVNMTIEEAGNIVCTFGIYNNKCLSELLNQGEEGVKALEWIAYNYRGRNETIR QGARLLLSTL >gi|229783984|gb|GG667751.1| GENE 7 5438 - 5899 293 153 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|266624089|ref|ZP_06117024.1| ## NR: gi|266624089|ref|ZP_06117024.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 153 1 153 153 303 100.0 3e-81 MTDNERHFLELKLRVWLRKLKREKAGFAGHRTPNDWSCELTDTAKKYLCDVVYSGQGGYV LASEESRLFTELQQAVTQAKVKEKLHQALFVDMDFEMVRDLAYGLRGQVETIMVEYKSHV KKGNEHGTKDFNTTGKNGKDPLGDTCIGKTRIQ >gi|229783984|gb|GG667751.1| GENE 8 5913 - 6491 276 192 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624090|ref|ZP_06117025.1| ## NR: gi|266624090|ref|ZP_06117025.1| hypothetical protein CLOSTHATH_05426 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05426 [Clostridium hathewayi DSM 13479] # 1 192 1 192 192 371 100.0 1e-101 MDEEMRQPEEMVSVIDGKKVQEILVGRYLNVVIVDHTDFMKLMLKEDYRIRHNFMMLIGQ WFICLSSSSEWDERNEGSVKLSCKLKPELSKTNLWPWVTYGDSAPNEVSVSGHQTESVER LFAAYLSKTEWEICNPFRQFLVAMQKEDRTLQQILTRFAVEWCDWVVKEEKEIQGESYRI ASLIASVPSVLL >gi|229783984|gb|GG667751.1| GENE 9 6507 - 7538 584 343 aa, chain - ## HITS:1 COG:lin2414 KEGG:ns NR:ns ## COG: lin2414 COG5377 # Protein_GI_number: 16801476 # Func_class: L Replication, recombination and repair # Function: Phage-related protein, predicted endonuclease # Organism: Listeria innocua # 11 341 11 319 319 115 27.0 1e-25 MLNLNYEPDVLVEDITTLSHDEWKLYRTLGIGGSDAAAVCGISPWKTARDLYEEKVHKKI KEQNDDGWVAKEIGKRLEELVVQIFMKRTDTRPYAVRKMFRHPLYPFMISDFDFFVKLDE KIYLLECKTSFSYHHMNGWEGGNIPPHYVLQGIHYMSVANVDGIIFLCLHGNHEGSLIIR RLERDLAQEEELIRQESYFWNQCVALRKPPKYTEKADLMLKSIQGRYGVQENKQIVLPPS YAANMTSYLKLKKRKSELKRQMDQIDEQMKLAYAPIKEAMEGAEEAVLETGNIRYLAGYN RKVTTSINKDSLEVLQLMHPDIYREYAQTNTSQSFYIKEDKAS >gi|229783984|gb|GG667751.1| GENE 10 7594 - 8721 1010 375 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624092|ref|ZP_06117027.1| ## NR: gi|266624092|ref|ZP_06117027.1| putative endonuclease III [Clostridium hathewayi DSM 13479] putative endonuclease III [Clostridium hathewayi DSM 13479] # 1 375 1 375 375 773 100.0 0 MNFADDFSKTFCDKEEFLEFLDGIECGAEWQKHPTNTVLVIAGEEEPEICEKIKDMDEKE 
EIIKDTMKNSGLFLCLGSTYYPVGQTTMKSIESRARIAGSALLDLPKGKLARVLNDCLNV TKGNALVRIHEGKVRAVHGGDPSDYSVLPMPELFEVASIYVAENYDKATFSNGLFSHSLA MASWELEEKGLLDTYRELLLQYGLQADNVLSALIRVQTSDVGVSGANIYYSLLLGAEKKP LILGKPLKLEHTNNASIEDFSQNMGQIFARYQEAIGDLSKLFHVYASYPANVMASVMIKA GISKALTAQTVEQFKAGHGPGICNGYDIYCGICEAIFLAQSNGMGAKALTDLEEMVSRCL TYRFHEYDIPGTILY >gi|229783984|gb|GG667751.1| GENE 11 8735 - 8869 240 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624093|ref|ZP_06117028.1| ## NR: gi|266624093|ref|ZP_06117028.1| hypothetical protein CLOSTHATH_05429 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05429 [Clostridium hathewayi DSM 13479] # 1 44 1 44 44 81 100.0 2e-14 MKLEARWDNAGRDQLTIGEMVEMGKDGYEFVISGGHIVSVLINL >gi|229783984|gb|GG667751.1| GENE 12 8906 - 9841 514 311 aa, chain - ## HITS:1 COG:no KEGG:Closa_0805 NR:ns ## KEGG: Closa_0805 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 304 1 309 321 84 25.0 8e-15 MNMETRKTSFEEKLKENLGAEYKATFPQRMEKQDEEFLRVGVQKFGESAGIWIYLNGSEF HWLDNEEDIQRAVTKVIEEYKKKRGFLNGPYLTGDSFSNVKNNLVCALVNREDNEEVLKY IPHIRYYDMVAYFLIFITVDGESYVRTVVNEDLDRWEVELQDVLETAKSNTDLNPPVIEM VGIKDWKVISAPIGNTFGEMLQSIKSHQEQKNPMYVLSNQSGLYGASNMIDNHLLGQISE SCNDSLLILPVDIHEIILIPSRKNRTSIADWKAVIHALNIENKGVKLSDLVYLFDRADKK LHVAKNDDFQA >gi|229783984|gb|GG667751.1| GENE 13 10160 - 10753 226 197 aa, chain - ## HITS:1 COG:no KEGG:CLL_A2764 NR:ns ## KEGG: CLL_A2764 # Name: not_defined # Def: putative aminotransferase, class V # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 19 194 154 323 340 118 34.0 1e-25 MHGLDRNGKKSEYRQGYTKWLPLYESDILISHYCCVKQNEEPIALYEKQTGRHPILALMA EESARRKEAYLRTGCNGFESERPFSKPMGFWRAQDVLRYTVEKQLEIAEPYGEVAEVGQV PGQIGFFPSCGPFKCTGEQRTGCLFCPVGCHLTSFEKFVRLKAYNPKLYDFCMEELGEKN LLSWIEKNYRRGYKQIA Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:35:26 2011 Seq name: gi|229783983|gb|GG667752.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld145, whole genome shotgun sequence Length of sequence - 
19200 bp Number of predicted genes - 20, with homology - 19 Number of transcription units - 9, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 96 - 155 6.3 1 1 Tu 1 . + CDS 328 - 441 73 ## + Prom 446 - 505 11.8 2 2 Op 1 . + CDS 630 - 1508 705 ## COG0682 Prolipoprotein diacylglyceryltransferase 3 2 Op 2 . + CDS 1510 - 1662 218 ## gi|266624098|ref|ZP_06117033.1| conserved hypothetical protein + Term 1679 - 1722 7.6 + Prom 1716 - 1775 6.7 4 3 Op 1 29/0.000 + CDS 1862 - 2287 437 ## COG2001 Uncharacterized protein conserved in bacteria 5 3 Op 2 . + CDS 2284 - 2790 566 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 6 4 Op 1 . + CDS 3737 - 4168 296 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 7 4 Op 2 . + CDS 4186 - 4671 526 ## Closa_2465 hypothetical protein 8 4 Op 3 3/0.000 + CDS 4709 - 6904 2525 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 9 4 Op 4 4/0.000 + CDS 6933 - 8711 1877 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 10 4 Op 5 1/0.000 + CDS 8823 - 9083 218 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 11 4 Op 6 28/0.000 + CDS 10010 - 10654 747 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 12 4 Op 7 . + CDS 10671 - 11741 1296 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 13 5 Op 1 25/0.000 + CDS 12711 - 12938 284 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 14 5 Op 2 . + CDS 12985 - 14136 1470 ## COG0772 Bacterial cell division membrane protein + Prom 14145 - 14204 2.2 15 6 Op 1 . + CDS 14228 - 15478 852 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 16 6 Op 2 . + CDS 15475 - 15699 137 ## Closa_2458 hypothetical protein 17 7 Tu 1 . 
+ CDS 16659 - 17066 452 ## Closa_2458 hypothetical protein + Term 17086 - 17139 0.6 + Prom 17072 - 17131 7.0 18 8 Tu 1 . + CDS 17186 - 18475 1643 ## COG0206 Cell division GTPase + Term 18492 - 18553 15.1 19 9 Op 1 . + CDS 18564 - 18902 495 ## Closa_2456 hypothetical protein 20 9 Op 2 . + CDS 18950 - 19199 85 ## Closa_2455 hypothetical protein Predicted protein(s) >gi|229783983|gb|GG667752.1| GENE 1 328 - 441 73 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKGQFITIPNFIIHYLPVILLSIDSIHISEIGLKMT >gi|229783983|gb|GG667752.1| GENE 2 630 - 1508 705 292 aa, chain + ## HITS:1 COG:BS_lgt KEGG:ns NR:ns ## COG: BS_lgt COG0682 # Protein_GI_number: 16080552 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Bacillus subtilis # 30 280 20 257 269 170 39.0 3e-42 MANTADISFVHLGITIDHLQNSISIFGFRIAFYGIIIGIGILAGLWVATSDAKRRGQDPD IYLDFALYGIIFSIIGARIYYVIFDWENYRDNLIQIFNLRAGGLAIYGGVIAAVITLTVY TRKKKLSFFSMADSGCLGLITGQIIGRWGNFFNCEAFGGYTDSLFAMRIRRALVNESMIS QDLLNHLIVKDGVEYIQVHPTFLYESVWNLGVLAFMLWYRKRKKFNGEMLLIYLLGYGLG RVWIEGLRTDQLIFFNTGIAVSQALSLVLVVISALVLIWKHRQIRLKSKGED >gi|229783983|gb|GG667752.1| GENE 3 1510 - 1662 218 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624098|ref|ZP_06117033.1| ## NR: gi|266624098|ref|ZP_06117033.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 50 1 50 50 77 100.0 2e-13 MISKEELICKIEEARDKLNRSIDTEQDSGTVYKRSVELDQLIEQYIVAGY >gi|229783983|gb|GG667752.1| GENE 4 1862 - 2287 437 141 aa, chain + ## HITS:1 COG:lin2148 KEGG:ns NR:ns ## COG: lin2148 COG2001 # Protein_GI_number: 16801214 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 1 141 1 143 143 179 60.0 2e-45 MFMGEYNHTVDAKGRLIVPSKFREQLGEEFVVTKGLDGCLFVYDNEEWKALEEKLKSLPL TNTNARKFNRFFLAGASSCEVDKQGRILLPAVLREFAGIEKDAVLVGVGSRIEIWSKDAW TAANTYDDMEEIAENMEGLGI >gi|229783983|gb|GG667752.1| GENE 5 2284 - 
2790 566 168 aa, chain + ## HITS:1 COG:CAC2132 KEGG:ns NR:ns ## COG: CAC2132 COG0275 # Protein_GI_number: 15895401 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Clostridium acetobutylicum # 1 167 1 165 312 190 58.0 1e-48 MMFEHKSVLLYEAIDSLNVKPDGIYVDGTLGGGGHALEVCRRLGEYGRLIGIDQDADAIA AASERLRDYEDRVTIVRSNYEEIQSVLKDLGIEKADGIYLDLGVSSYQLDTPERGFTYRE EDAPLDMRMDQRNTRTAADIVNTYSEFDLYRIIRDYGEDKFAKNIAKH >gi|229783983|gb|GG667752.1| GENE 6 3737 - 4168 296 143 aa, chain + ## HITS:1 COG:CAC2132 KEGG:ns NR:ns ## COG: CAC2132 COG0275 # Protein_GI_number: 15895401 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Clostridium acetobutylicum # 4 143 171 310 312 161 57.0 4e-40 MRARETKRIETTGELTEIIKEAVPAKVRAVGGHPSKKTFQAIRIELNQELEVLNNSIDTM IDLLNPGGRLAVITFHSLEDRIVKIRFRNNENPCTCPPDFPVCVCGKVSKGRVITRKPVV PSEEEINGNKRSKSSKLRVFERI >gi|229783983|gb|GG667752.1| GENE 7 4186 - 4671 526 161 aa, chain + ## HITS:1 COG:no KEGG:Closa_2465 NR:ns ## KEGG: Closa_2465 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 161 1 160 160 234 80.0 1e-60 MAARKNVHSYRSAAYVQGNAVRKLEPERVPVRRSPDEVRRPAVNHKVRRNREKALQMDLP YVIMLTIAAVCTLYLCVNYLHLQSSITTRIHTIENLEAELEMLKTENDALQTNINTSVDL DHVYKVATEELGMVYANKNQILQYDKTESEYVRQNEDIPKH >gi|229783983|gb|GG667752.1| GENE 8 4709 - 6904 2525 731 aa, chain + ## HITS:1 COG:CAC2130 KEGG:ns NR:ns ## COG: CAC2130 COG0768 # Protein_GI_number: 15895399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 1 630 12 592 729 268 31.0 2e-71 MQEKLAITVLVVTLALFALVMVLYDLMTNKKDDFNQIVLSQQEYDSRTIPFRRGDILDRN GTYLATSEKVYNLILDPKQIMSDPDAYLEASVSALVDCFGYDRTEMMNLINEKKDKYYIR 
YAKQLTYDQKENFEKYKDQKRDEFKKAGDGKGIRGIWFEDEYKRIYPYNSVGCNVIGFAR DDGMDGSGGIEQYYNSTLMGTNGREYGYLNDDSNLERVIKPASNGNTVVSTIDTNVQKIV EKYITEWEQEVGSKRVGVIVMDPNNGEVLAMANDKSFDLNNPRQLRPEYTDDVVKQLGIQ EAIDDYRRKNKDAEPLTEDNVYAHYSNDEIMSLGKQVAWNQTWRNFCISDGFEPGSTAKI FTVAAALEEGAITGNETYTCNGKLEIGGWPISCVNRYGHGDLTVEQGLMKSCNVVMMQIA QKVGKQKFYKYQQQFGFGSKTGIDLPGEADNKALMYTAETADPASLATNAFGQNFNCSMI QMAAAYCSVINGGSYYEPHVVKQILNEQGSIVKKMDPVLVRETVSESTTKFINEALYKTV NAQGGTGSAARVEGYKIAGKTGTAEKVGRDKKNYVVSFCGYAPADNPQVLVYVVVDEPHV EEQAHSTYASGIFQKIMTEILPYLNIFPDTDITPSLPEGGQAELPSEEGITSSTEGSSEE STEESTEAGTLENGETIPPEALDPSEEAVIGPSEEAGYGLPGALPGQTNPVRLPQTSSSS AEASETETKGE >gi|229783983|gb|GG667752.1| GENE 9 6933 - 8711 1877 592 aa, chain + ## HITS:1 COG:CAC2130 KEGG:ns NR:ns ## COG: CAC2130 COG0768 # Protein_GI_number: 15895399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 6 578 3 589 729 369 35.0 1e-101 MMTVEQRMNSNKTHHREKIAILFFLLFLAMTGLMGRLFFLMIFQSAHYSAMAEDLHERER TIKAARGNIIDANGTVIATNRTVCTISVIHNQIEEPDQVVSVLSKELGIEEDAVRKKVEK YSSREIIKTNVDKALGDQIRTYHLAGVKVDEDYKRYYPYDSLASKVLGFTGGDNQGIIGL EVKYEDYLKGMNGKILTMSDAAGIEIENAAEDRIEPIAGQDLHVSLDVNIQKYCEQAAYQ VMEKKGAKRVSVIVMNPQNGEIMAMVSAPEFNLNDPFTLPEDPGNVSAKEKQELLNQMWR NPCINDTYEPGSTFKIVTAAAGLEAGVVNLEDHFSCPGFRIVEDRKIRCHKVGGHGSETF LQGMMNSCNPVLIDVGQRLGVDNYYKYFEQFGLKGKTGIDLPGEAATIMHKKENMGLVEL ATVSFGQSFQITPMQLITTASAIVNGGTRVTPHFGVQAVSADGEVIHNFTYPVKEGIVSA ETSETMRYILEKVVSEGSGKKAYLEGYRIGGKTATSEKLPRSLKKYISSFIGFAPADNPS VIALITIDEPQGIYYGGTIAAPVIADIFKNILPYLGIEATEEKTVSSFRFYS >gi|229783983|gb|GG667752.1| GENE 10 8823 - 9083 218 86 aa, chain + ## HITS:1 COG:BS_mraY KEGG:ns NR:ns ## COG: BS_mraY COG0472 # Protein_GI_number: 16078583 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Bacillus subtilis # 1 86 1 91 324 87 50.0 8e-18 
MINETILAIIIAFAISAILCPIVIPFLHRLKFGQQVREEGPESHLKKQGTPTMGGLIILT SIIITSLFYVKDYPKIIPILFMTVGF >gi|229783983|gb|GG667752.1| GENE 11 10010 - 10654 747 214 aa, chain + ## HITS:1 COG:CAC2127 KEGG:ns NR:ns ## COG: CAC2127 COG0472 # Protein_GI_number: 15895396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Clostridium acetobutylicum # 2 214 110 316 317 175 51.0 4e-44 MQKIIGQFIITGVFAYYLLNSKAVGTSMLIPFTGGFEHGLYLDLGIFFVPVLFFIVLGTD NGVNFTDGLDGLCTSVTILVATFLTVVALGEKAGISPITGAVVGSLLGFLLFNVYPARVF MGDTGSLALGGFVASSAYMMQMPIFIAIIGFIYLVEVLSDIIQITYFRKTGGKRFFKMAP IHHHFEMCGWSETRVVAVFSIVTAILCLLAYLGL >gi|229783983|gb|GG667752.1| GENE 12 10671 - 11741 1296 356 aa, chain + ## HITS:1 COG:BS_murD KEGG:ns NR:ns ## COG: BS_murD COG0771 # Protein_GI_number: 16078584 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Bacillus subtilis # 3 355 10 358 451 207 35.0 2e-53 MSQKILVAGSGKSGIAASKMALDTEHEVILYDSNEALDAEAVRGQFKEDEAIEVLLGELK KESLKDVGLCVISPGIPLDAPFVEVLKACEIPVWSEIQLAYHDAKGKLAAITGTNGKTTT TSLVGEILSAYYKEVFVVGNIGTPYTSTALRTTDQSYTVAEISSFQLETITDFRPDVSAI LNITPDHLNRHKTMDCYINVKERIAENQTKEDACILNYDDPVLRRFGETIAPRVIYFSSR ELLPEGICLDGDMIIRNHDGVRMEIINIHDMNLLGRHNHENVMAAAAIAMELGVPLETIQ SVIRDFKAVEHRIEFVAEKAGVKYYNDSKGTNPDAAIQAVNAMPGPTILIAGGYDN >gi|229783983|gb|GG667752.1| GENE 13 12711 - 12938 284 75 aa, chain + ## HITS:1 COG:BH2567 KEGG:ns NR:ns ## COG: BH2567 COG0771 # Protein_GI_number: 15615130 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Bacillus halodurans # 1 75 374 448 450 67 38.0 6e-12 MVLLGQTREKIAECAARHGFTNVMYAEDMQEAVRVCASYANAGEYVLLSPACASWGMFKD YEERGDIFKECVRNL >gi|229783983|gb|GG667752.1| GENE 14 12985 - 14136 1470 383 aa, chain + ## HITS:1 COG:BS_spoVE KEGG:ns NR:ns ## COG: BS_spoVE COG0772 # Protein_GI_number: 16078585 # 
Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus subtilis # 35 380 26 365 366 215 39.0 1e-55 MAENRPNKKKNKPRRFYDYSLLFTVIFLTVFGLVMIYSSSSYAAQIKYDDAAYFMMRQAK IALAGFVIMLIISKMDYHWYARFAVLAYVLSYVLMIATALFGIERNGKKRWLGVGGASIQ PTEFVKIALIVMLASMIVQMGKNINEKRGVVLVIVTTLPIAGIVAANNLSSGIIIAGIAF VMLFVACKKKWPFFACGFAGVGVLAFAGPIATALEKIGLLKEYQLSRIFVWLEPEKYPST GGYQVLQGLYAIGSGGLVGRGLGESIQKMGFVPEAQNDMIFSIICEELGLFGAVSVILIF LFMIYRFMLIADNAPDLFGALLVVGVMGHIAIQVILNIAVVTNTIPNTGITLPFISYGGT SVLFLMMEMGMVLSVSNQIRLER >gi|229783983|gb|GG667752.1| GENE 15 14228 - 15478 852 416 aa, chain + ## HITS:1 COG:RSc2953 KEGG:ns NR:ns ## COG: RSc2953 COG0766 # Protein_GI_number: 17547672 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Ralstonia solanacearum # 1 416 1 420 421 323 45.0 3e-88 MSVIEVQGLTPLEGEIKIQGSKNAVLPMMAAAILHKGTTVIENVPRIQDVFCMLGILDSI GCKCRLEGNRLKIDASVVTETEIPKKFVKSMRSSIIVSGPLLGRMKRAVTSFPGGCSIGE RPIDLHLAALDAMGAIIEKDGEELKAVTGQLTGADIHFPFPSVGATENALMAAVLAEGVT VIHGAAREPEIEELCLFLIGMGARIHGVGSSNLVIEGVAGLHDSIFTVTGDRIVAGTYLL AVMAAEGNVTLTGIRPAHMEAVLRCAEEMGAEVRRYDDRISAAMVGRPSGYHVVTSPYPG FPTDLQSPMMAVMSVAEGEGRLEETVFEGRFGTVKELQKLGADIIIEDNRADIHGLYPLT GAEVAARDLRGGAALVVAGLASQGVTRITDCSHIERGYEDICRDLRSLGAVIRGME >gi|229783983|gb|GG667752.1| GENE 16 15475 - 15699 137 74 aa, chain + ## HITS:1 COG:no KEGG:Closa_2458 NR:ns ## KEGG: Closa_2458 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 6 71 6 71 242 77 50.0 2e-13 MIRPEKSKRKIVIGIAAAVIVLLAVILMSIRITEITVTGNKQVTADELRGLLFKDKWDRN TIYCFYKDHFKEPS >gi|229783983|gb|GG667752.1| GENE 17 16659 - 17066 452 135 aa, chain + ## HITS:1 COG:no KEGG:Closa_2458 NR:ns ## KEGG: Closa_2458 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 135 108 242 242 227 80.0 1e-58 
MGSYMYFDKDGIIVESSSNKLPGIPWITGLKYGHIALHQPLPVENNKIFDEILNLTQLLS THEITVDRIQYDTHGNASLILGDITVYLGSNDQMSGKISELKDQLPVLTGLSGTLYLDTY DEAETATSYRFVKNK >gi|229783983|gb|GG667752.1| GENE 18 17186 - 18475 1643 429 aa, chain + ## HITS:1 COG:CAC1693 KEGG:ns NR:ns ## COG: CAC1693 COG0206 # Protein_GI_number: 15894970 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Clostridium acetobutylicum # 13 372 12 371 373 340 58.0 3e-93 MLEIKINEADNAARILVIGVGGAGNNAVNRMIDESIAGVEFIGINTDKQALQFCKAPTAM QIGEKLTKGLGAGAKPEIGEKAAEESSEELAQAMKGADMVFVTCGMGGGTGTGAAPVVAK IAKDMGILTVGVVTKPFRFEAKTRMSNAIAGIERLKESVDTLIVIPNDRLLEIVDRRTTM PDALKKADEVLQQAVQGITDLINVPGLINLDFADVQTVMTDKGIAHIGIGKAKGDEKALD AVKQAVSSPLLETTIEGASHVIINISGDISLIEANEAASYVQEMAGDDANIIFGAMYDET AQDEASITVIATGLDMGSETPVGKVMTSFGGSASGYTRPQKPAAQPAQNQNPNPNQEAAA TAPAYNPNYNPNYGSPNYGNQGYNPNYNKPNYGGQAGAPGASQGAGQPYRPTVNREVQIN IPDFLKNKR >gi|229783983|gb|GG667752.1| GENE 19 18564 - 18902 495 112 aa, chain + ## HITS:1 COG:no KEGG:Closa_2456 NR:ns ## KEGG: Closa_2456 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 8 112 16 120 120 101 58.0 9e-21 MSEEKKTREDDWELSEAYYKREEGEKELSLEDFITEEDLQADGAMLQPNHDELLKKTAED VHRVMELSLQGYNIERITETTGLDSQYVYNILVCAQGFREDDEIAVAHLVLG >gi|229783983|gb|GG667752.1| GENE 20 18950 - 19199 85 83 aa, chain + ## HITS:1 COG:no KEGG:Closa_2455 NR:ns ## KEGG: Closa_2455 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 83 1 84 390 81 44.0 1e-14 MIKKMCPVCDLPVNAGNYCPRCRRVIRKPYTQNVDYYLNERHPVHEAECDFHNPYLEADH HGESSRPAGAVPGKDTPGRNTPG Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:35:51 2011 Seq name: gi|229783982|gb|GG667753.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld146, whole genome shotgun sequence Length of sequence - 14105 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 10, operones - 3 average op.length - 
2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 715 333 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 2 1 Op 2 . + CDS 712 - 1236 507 ## COG0006 Xaa-Pro aminopeptidase + Prom 2083 - 2142 80.4 3 2 Tu 1 . + CDS 2208 - 3410 1414 ## COG0006 Xaa-Pro aminopeptidase + Term 3531 - 3597 16.9 - Term 3519 - 3583 10.3 4 3 Tu 1 . - CDS 3622 - 3957 197 ## Closa_2859 XRE family transcriptional regulator - Prom 4058 - 4117 5.8 + Prom 4017 - 4076 6.1 5 4 Tu 1 . + CDS 4248 - 5312 822 ## COG2199 FOG: GGDEF domain + Term 5428 - 5466 7.7 6 5 Tu 1 . - CDS 5490 - 6092 471 ## COG3760 Uncharacterized conserved protein - Prom 6143 - 6202 6.2 + Prom 6021 - 6080 2.7 7 6 Tu 1 . + CDS 6141 - 6320 59 ## gi|266624123|ref|ZP_06117058.1| conserved hypothetical protein + Prom 6382 - 6441 3.1 8 7 Tu 1 . + CDS 6479 - 7948 1031 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase + Term 8002 - 8040 1.2 + Prom 8322 - 8381 2.7 9 8 Tu 1 . + CDS 8416 - 9306 622 ## COG0583 Transcriptional regulator + Term 9334 - 9382 8.3 + Prom 9312 - 9371 1.8 10 9 Op 1 . + CDS 9456 - 9806 369 ## gi|288871275|ref|ZP_06117061.2| transcriptional regulator, MerR family 11 9 Op 2 . + CDS 9872 - 10594 395 ## COG0716 Flavodoxins 12 9 Op 3 . + CDS 10609 - 11487 289 ## Bmur_2506 hypothetical protein + Prom 11675 - 11734 6.5 13 10 Op 1 1/0.000 + CDS 11754 - 12947 972 ## COG1073 Hydrolases of the alpha/beta superfamily 14 10 Op 2 . + CDS 13034 - 13618 385 ## COG0655 Multimeric flavodoxin WrbA + Prom 13622 - 13681 6.6 15 10 Op 3 . 
+ CDS 13709 - 14026 263 ## COG0640 Predicted transcriptional regulators + Term 14029 - 14059 0.3 Predicted protein(s) >gi|229783982|gb|GG667753.1| GENE 1 2 - 715 333 237 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 26 230 259 453 458 132 35 1e-30 IRGEEQGLVCVYEAGGEVRTIRSGQILLAAGRQPNLDGLFGQDVSLAMEGGCLKTDSEFQ TSEPGIYAIGDVTAGKMLAHVAAAQGTYVVEKIAGVHHTIRLSVVPNGMFVELPVVPGCI YTEPEIASVGITVEEAKAHKMKVRCGRYSMSGNGKSIITREQSGFIHLIFEEYSGTIVGA QIVCPRATDMISEMATAIANGLTADQLMLAMRAHPTYSEGITAAIENYYEAKGDERL >gi|229783982|gb|GG667753.1| GENE 2 712 - 1236 507 174 aa, chain + ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 2 173 3 170 584 144 41.0 1e-34 MIQERIAALRSLMAERHIDAYMIPTSDFHESEYVGDYFKCRKFITGFTGSAGTAVITQTE ARLWTDGRYFVQAAKQLEGTGVILMKSGQEGVPTEEEYLTEMMPDNGTLGFDGRVVNSQM GQKLKELLEDKHVKFSWQEDLVDFIWEDRPELSAEPVWILKENYAGKSAVDKIA >gi|229783982|gb|GG667753.1| GENE 3 2208 - 3410 1414 400 aa, chain + ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 2 399 195 584 584 381 47.0 1e-105 MWLLNIRGNDVVCNPVVLSYALVTLERFYLFINEAVLDSEVKSYLKELGVTIRPYNDIYD AVGQLKGQKVLLETAKTNYAIISNLDETNTIVDCMNPTAPAKAVKNPVEIENMKKAHIKD GVAMAKFIFWLKKNIGKQTITELDAEHYLDQLRAEQEGNLGLSFHTISAYGANAAMCHYS ATPESNAVLEPKGLYLVDSGGQYYEGTTDVTRTISLGETTQAEREHFTLSVISMLRLAAV KFLYGCRGLTLDYVAREPLWSRGLNFDHGTGHGVGYLLNVHERPNGIRWRMVPERQDNCV LEEGMVTSDEPGVYIEGSHGVRTENLIVCKKAEKNEYGQFMEFEFLTFVPIDLDALDQSL MNERDVELLNNYHRQVYEKISPYLTEEEAEWLKENTRAIS >gi|229783982|gb|GG667753.1| GENE 4 3622 - 3957 197 111 aa, chain - ## HITS:1 COG:no KEGG:Closa_2859 NR:ns ## KEGG: Closa_2859 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: 
C.saccharolyticum # Pathway: not_defined # 1 111 2 113 113 149 75.0 3e-35 MNEQFVRNRITELRLKKDVSEYQMSLDLGKNKSYIQGISSGRSMPSMKQFFEICDYLEIT PLEFFDTEMAEPPQFRRAVELLKELDNEDWEAIIPLLMRLRKEITQKQAEP >gi|229783982|gb|GG667753.1| GENE 5 4248 - 5312 822 354 aa, chain + ## HITS:1 COG:aq_035_2 KEGG:ns NR:ns ## COG: aq_035_2 COG2199 # Protein_GI_number: 15605636 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 183 354 70 247 251 109 39.0 7e-24 MQSKGMADHIAWAVFIVSLVAMGVISYPLIWGKDRGEGERLEEGKMEKQKEQKEQNKQNE QDKQNKQDNQNEQKEDRREALYRRLGEFTDTVLFEYRYRDGSILFTPNASNQMETSVLKS ALEKYVKCPPLQGESRCFEFRMKADTEDYRWCICHIRAEYDGGKESVDKYAEAERPVCLI GKLDDISKQKEREEQLLFQSTRDGLTGIYNKTAFEYMMEETLKRGSRGSLYMIDIDNFKD VNDQYGHPAGDRILVKTGELLREIFRDSDLIGRVGGDEFVVYSESGDTKMKALRLLNGTA DFSKEGELRISVSIGIASSTGNPDEEYQELFSRADQAMYRAKQEGKNRIAWYEE >gi|229783982|gb|GG667753.1| GENE 6 5490 - 6092 471 200 aa, chain - ## HITS:1 COG:PA1841 KEGG:ns NR:ns ## COG: PA1841 COG3760 # Protein_GI_number: 15597038 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pseudomonas aeruginosa # 48 193 19 163 165 94 36.0 2e-19 MANTAAKNITPAHVFYIDRTLYTGRPADLTGRPEKEKVTYDLLDKLGIAYQRLDHDPTAT IEACHDIDALLETEICKNLFLRNAQKNSFYLLMIPGCKKFRTAVLSKQIGSARLSFAEPE FMEQYLNITPGSVSVLGLMNDKENRVRLLIDRDILEQEFLGCHPCINTSSLKIRTSDIVN IFLPYIGHSYTLVDLPCDSD >gi|229783982|gb|GG667753.1| GENE 7 6141 - 6320 59 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624123|ref|ZP_06117058.1| ## NR: gi|266624123|ref|ZP_06117058.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 59 17 75 75 114 100.0 3e-24 MSVIIQENGIIARGKQKWLTEEGKMGPAVCWKMGDAVINTGKPGMNNSFFIGFMRCLVL >gi|229783982|gb|GG667753.1| GENE 8 6479 - 7948 1031 489 aa, chain + ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent 
methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 108 482 77 455 458 345 43.0 8e-95 MIENRLKNERSRVLTEKTMSSKRDDKSARYGEGKRREDKAVNTAGGVSGKRGNNTAGKAG HKPEKRPLQKGSNQKTGNNRDLRSGTSAFKRGVSAKEEGHPVRKNTTPCPVFLRCGGCQL LDMPYDKQLSLKQKQLEELLKPYCRVQPIIGMKDPFHYRNKVHAVFDHDKKGNPVSGVYE ANTHRVVPIESCLIEDQKADEIIGTIRGMLKSFKIKTFDEDTGYGLLRHVLIRRGFATGD IMVVLVTASPVFPSKNNFVRALREKHPEITTIVQNINGRDTSMVLGDRENVLYGKGYIED ILCGFRFRISSKSFYQVNPVQTEVLYRKAIELAGLTGKETVIDAYCGIGTIGIVASQEAD RVIGVELNRDAVRDAAQNAKINGIKNAQFYCNDAGAFMSKMADNGEHVDVVFMDPPRSGS TEEFIQSVAKIKPEKVVYVSCGPETLARDLGVFRKMGYEAKVAWGVDMFPQAAHVESVIM MQNCGFKKK >gi|229783982|gb|GG667753.1| GENE 9 8416 - 9306 622 296 aa, chain + ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 286 1 287 291 234 39.0 1e-61 MDLRVLNYFLAVAREENFTRAAQMLHVTQPTLSRQIAQLEEDLGVKLFTRSNHNIILTED GMILKRRAQELLALAEKTRQDLRHRDEELTGKIAIGSGEFQSTRFLSQMIASFSEKHPNV QYEIYSGDADNIHDYVERGLLDIGLMAEPIDIRKYDFISMPVKERWGVWVNTDSALAQKE WVQPEDLVGKQLITATGEFPKSSIGKWLDDYNTQVKIIATGNLLYNEAMLAENHLEIVLG IKLNYTYDNLCFVPLYPILEMSTVLVWKKEQIFSSATSSFIEFARQYEKVISDNKL >gi|229783982|gb|GG667753.1| GENE 10 9456 - 9806 369 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871275|ref|ZP_06117061.2| ## NR: gi|288871275|ref|ZP_06117061.2| transcriptional regulator, MerR family [Clostridium hathewayi DSM 13479] transcriptional regulator, MerR family [Clostridium hathewayi DSM 13479] # 1 116 9 124 124 205 100.0 9e-52 MTINEASERYNIPIEVLKEYESWGLCDEVRRVMGVWQYDDRDIQRLSMIMTLHDIGFDNN EVEAYMRLLLEGDSTEEERLDMLNKKRGVTLDELHFKQKQLDRVDYLRYKIQKTHK >gi|229783982|gb|GG667753.1| GENE 11 9872 - 10594 395 240 aa, chain + ## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 80 204 10 144 179 112 43.0 8e-25 
MKKILSFICVCSTVFVTTACSSQGSASDTGMSVPASSQETGKAAQQTDLLQNDSENASVT KKAGEETDDTGTEEAGSDILVAYFSATGTTRTLAGYISEVTGGDLYEIVPEIPYSSEDLN YSDNNSRSTREQNDESVRPVISGSLENMDQYDTIFLGYPIWWGEAPRIIDTFMEAYDFSG KTIVPFCTSGGSGIGSSARNLHNLAGSDVIWLDGKRLSSDISHGEMVSWIDGLSLDVMAQ >gi|229783982|gb|GG667753.1| GENE 12 10609 - 11487 289 292 aa, chain + ## HITS:1 COG:no KEGG:Bmur_2506 NR:ns ## KEGG: Bmur_2506 # Name: not_defined # Def: hypothetical protein # Organism: B.murdochii # Pathway: not_defined # 3 206 10 207 219 137 38.0 7e-31 MMVDMSLLVMLPVLMAEILTGQEFHEWLGTAMAVVLILHHLLNINWLKHIGKGNYTPLRS LITIINLLLFADMSILMISGIVMSGFVFEWLPISGGMVLSRRLHLFASHWGLILMALHTG VHWSRVVRLGTKFLEKSESEDELAWLARGLAAAISIFGVYAFIQQNMSDYLFLKTAFVFW DETKTAASFLTETLSIMGLFITIGYYGQKQMKGLKGMKHKTAKYLAFLIPILVCIGTICW MMTGNRASPADSWETTPSVATEVNDGYVLIPGGTFLMGSPEREAWRDNHTVL >gi|229783982|gb|GG667753.1| GENE 13 11754 - 12947 972 397 aa, chain + ## HITS:1 COG:PA2218 KEGG:ns NR:ns ## COG: PA2218 COG1073 # Protein_GI_number: 15597414 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Pseudomonas aeruginosa # 78 392 50 364 367 330 53.0 3e-90 MRKLLKKTLAITGCVAMMASMTACGAKPETQPEATRQTEATMQTEATAETEGQAVKEGDA MAVNYGNMKDVSIKIEPIELTEEWDKVFQLSDEVNHRKVTFANHFGITLAADLYEPMEYT GKLPAIAVCGPFGAVKEQSSGLYAQELAKRGFLAIAFDPSFTGESAGYPRDFNSPDINIE DYQAAIDFLSIQENVDPEKIGIIGICGWGGMAVQTAALDTRIKATAAMTMYDMSRNTALG YFDSVDEEGRYEARIAYNRQRTEDYKNGSYTMGGGLPEEAPEEAPQYVKDYVAYYKTERG YHPRSVNANNGWAATAGGSLMNMRLFEYAPEIRSAVLLVHGEEAHSLMYSQEAYKLLQGD NKELMLIPGASHTDLYDQMEIIPFDKLETFFNNAFTQ >gi|229783982|gb|GG667753.1| GENE 14 13034 - 13618 385 194 aa, chain + ## HITS:1 COG:CAC3505 KEGG:ns NR:ns ## COG: CAC3505 COG0655 # Protein_GI_number: 15896742 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 188 1 188 190 232 57.0 3e-61 MKVLAINSSARKDGNTAILIDHVFQELNKADIETELVQLAGQIIEPCKACWACQGKKNCV HKKDTFRNIFEKMVQADGILLGSPVYTANVSANMQAFLERAAVVCDLNPGLLTHKTGAAV 
TAARRGGALQAIDSMNHFFLNHEMFVVGSVYWNMAYGQMPGDVLKDQEGLDNAGNLGRNM AYLMKTLQKGQSLI >gi|229783982|gb|GG667753.1| GENE 15 13709 - 14026 263 105 aa, chain + ## HITS:1 COG:pli0036 KEGG:ns NR:ns ## COG: pli0036 COG0640 # Protein_GI_number: 18450318 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Listeria innocua # 5 103 3 97 121 80 42.0 5e-16 METTYQKNAKVFKALCDEKRLAILEQLRSGEKCACILLEQLDLTQSGLSYHMKILCESGI VSSRQEGKWTHYRLSAAGRDYAVDLLKKLTTPDTEQKSECICQGN Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:36:14 2011 Seq name: gi|229783981|gb|GG667754.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld147, whole genome shotgun sequence Length of sequence - 9940 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 2701 793 ## COG0827 Adenine-specific DNA methylase - Term 2725 - 2764 9.2 2 2 Op 1 . - CDS 2780 - 2995 361 ## Ethha_1906 hypothetical protein 3 2 Op 2 . - CDS 2997 - 6365 2784 ## COG0358 DNA primase (bacterial type) 4 2 Op 3 . - CDS 6368 - 6649 168 ## Ethha_1904 hypothetical protein 5 2 Op 4 . - CDS 6646 - 8751 1581 ## COG0550 Topoisomerase IA 6 2 Op 5 . - CDS 8741 - 9439 507 ## gi|266624136|ref|ZP_06117071.1| conserved hypothetical protein - Term 9488 - 9549 6.2 7 3 Tu 1 . 
- CDS 9587 - 9940 362 ## CD1107 hypothetical protein Predicted protein(s) >gi|229783981|gb|GG667754.1| GENE 1 1 - 2701 793 900 aa, chain - ## HITS:1 COG:pli0004 KEGG:ns NR:ns ## COG: pli0004 COG0827 # Protein_GI_number: 18450290 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Listeria innocua # 18 100 24 103 756 68 43.0 5e-11 MPTKAELYAQMADKVATQLTGSWQEWAGFLTTASRLYKYPFHEQLMIYAQRPDATACAEY DLWNEKMGRYVRRGSKGIALVDDSGDRPRLRYVFDISDTGTREHSRTPWLWQLEERHLDS VQAMLERTYDVSGEDLAGQLTEVAGKLAEEYWTEHQQDLFYIVDGSFLEEYDEYNIGVQF KAAATVSITYALMSRCGLEPERYFDHEDFMAIFDFNTPSTIGALGTAVSQINQQVLRQIG VTVRNAEREANQERSKQDEQSHDLHPERRLSDSRPEAEPAAGEIPGQVRQDEENLPEGTP SHPLQPDVAEREAVSAPHRDRRDRPEQTGTDDAPAGEGSGSHRGTESQRSHEVGGPDEHL QSPGRGDPDGGAYQQLTLNLFLSEAEQIQSIDEAENVAHTSSAFSFAQNDIDHVLRLGGN TDRQRERVVVAFEKQKTTAEIAEILKTLYHGGNGLGSVSAWYAEDGIHLSHGKSVRYDRS AQVISWESAAERIGELLESGQFASNVELAEAAGYERSLLAEKLWHLYHDFSDKARDSGYL SCLSCIQRTGFPEETAWLAEQLSDPAFRQTLKEEYAAFWTTYQQDRDLLRFHYHRPREIW ENLKDLDLPRRTFSSDLSQVPTVQHFITEDEIDAAMTGGSSFAGGKGRIYAFFMENHTDK EKVRFLKDEYGIGGRSHALSGATHSGEDHDGKGLHYKKQDCPDVHLNWEKVAKRITSLVQ KGRYLTEQEQAQYDKIQAEKDLAEEDAIHAQQPEIEEETPKPTLREQFEQYKPVVTAAIS EDVAYRNACGHSDRENAVIEGNAAVRRAVLGSKDMELIRLYSDVPEFRQRLHREVIDETY PKLHELLRPLSQEDIDRALCVWNGNIESKHAVVRYMKDHAREKDTAAWLAQEYGGSNSPF >gi|229783981|gb|GG667754.1| GENE 2 2780 - 2995 361 71 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1906 NR:ns ## KEGG: Ethha_1906 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 4 71 2 69 72 71 61.0 8e-12 MGNFTFEEMNLMCIYNTGSRTGLIDSLREMRGELAPEETELRELTDSALGKLQAMSDAEF AELELYPDFDQ >gi|229783981|gb|GG667754.1| GENE 3 2997 - 6365 2784 1122 aa, chain - ## HITS:1 COG:RC1330 KEGG:ns NR:ns ## COG: RC1330 COG0358 # Protein_GI_number: 15893253 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Rickettsia conorii # 4 141 16 147 595 69 30.0 3e-11 
MGANVFETVKQSVTVREAAERYGIEVKRGGMACCPFHDDKNPSMKLNEEYFYCFGCGATG DVIDLTARLYNLSPKEAAEKLAQDFGLIYDSQAPPRRNYVRQKTEAQKFRENRDHAFRVL ADYFHLLRKWEAGYTPKTPEEPMHPRFLEAVQQKDYIGYLLDSFLEDSPEEQKLWIAEHQ STITNLERRVNIMADKPTNRERLREITDGIEQGIKELFESEKYMRYLSVMSRFHRYSVNN TMLIYMQKPDATLVAGYNKWKDQFERHVKKGEHGITIIAPTPYKKKIEEQKLDPDTKAPI LDKDGKIITEEKEIEIPMFRPVKVFDVSQTDGKPLPELASSLSGNVPNYEAFMEALRRSA LVPITFEAMAADTDGYFSADHQKIAIRQGMSEVQTVSATVHEIAHSKLHNQKKIQIANDE QYQEIELFDKPGLFSNGRIVRDNLPEGVYCYDLRGSDYDPGEQVCVEERVVVNHAGSVLL TEPLELAEDGRLMLTEEEGLNFVGGFSTLAQFLQEQRKDRHTEEVEAESISYAVCKYFGI ETGENSFGYIASWSQGKELKELRASLETINKTSGTLISDIERHYKEICKERGIDPNAKKE PETAPIEQPTGNLTYYVAECMEFPNLGEYHDNLSLEEAVRIYQEIPAERMNGIKGIGFEL KDGSDYEGPFPILTGQTIDLDTIQAIDYYRDNPLVQKAVKELAATMPEMEVLGADANQQE ALFLINDTTYLHIQPCDSGWDYTLYDAASMKELDGGQLDMSELSRKNAVLQICDDNNLDS TSLRHAPMSMIETLQEAAYQQMQAETSQMIASSQLPEAQEQALDEYPMPDEQVSTPDMQE YGYSYDGMLPVTRERALELDAAGLTVYVLHEDNTESMVFDPQEIMDHGGLFGVDHEEWEK SPQFHEKVMERQEHQQEREQAFLSQNRDCFAIYQVSRDDPQNVRFMNLDWLKSHDISIDR SNYDLIYTAPLRESGTVPEQLEKLYQQFNLEKPVDFHSPSMSVSDIVAIRQDGKVSCHYC DSVGFTQILGFLPENPLKNAEMMLEDDYGMIDGIINNGAKEPTVAELEQQARSGQPISLM DLADAVHRKEREKKKSVMEQLKGQPKTEHKKTAPKKSAEREI >gi|229783981|gb|GG667754.1| GENE 4 6368 - 6649 168 93 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1904 NR:ns ## KEGG: Ethha_1904 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 86 1 91 100 72 56.0 8e-12 MKLSLYEQETILLYNQAEDTAEVYTHDRKLMEKLTRLSKKHPEQVCKKDKHNFTVPKRCV SVREPYSAERRKAASERAKVAGYRPPIPKPKAD >gi|229783981|gb|GG667754.1| GENE 5 6646 - 8751 1581 701 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 636 1 656 709 456 39.0 1e-128 MRYKLVIAEKPSVAQSLAAVIGATARKDGYLEGNGWRVSWCVGHLAGLADADSYDPKYAK WRYDDLPILPEHWQMVVGKDKKKQFDILKKLMNAPDVTEVVNACDAGREGELIFRSVYEL AGCQKPMKRLWISSMEDSAIREGFANLRPGVDYDGLRKAALCRAKADWLVGINATRLFSV 
LYHRTLNIGRVMSPTLALIVQREAEIDTFKPVPFYTVALELPGLTVSGERISDRASAEQL KEACQDAAVTIKKVECKEKSEKPPALYDLTTLQRDANRLLGFTAQQTLDYLQSLYEKKLC TYPRTDSRYLTSDMADSLPVLVNLVANAMPFRKGIAITCDPQTVINDKKVTDHHAVIPTR NLKDADLSALPAGEKAVLELVALRLLCAVAQPHIYSETVVIAACAGGEFTTKEKTVRHPG WKALEDAYRAKTKDAEPKKEGAEKAVPELTEGQTLSVSAAIVKEGKSSPPQHFTEDTLLS AMETAGKEDMPEDAERKGLGTPATRAGILEKLVSAGFLERKKNRKTVQLLPSHDAVSLIT VLPEQLQSPLLTAEWEYRLGEIERGQLAPEEFLDGISTMLKDLVGTYQVIKGTEYLFTPP REVVGKCPRCGGDVAELQKGFFCQTESCKFAIWKNSKWWAMKKKQPTKALVTALLKDGRA RLTGCYSEKTGKTYDATVVLEDDGRYANFKLEFDRQKGGAK >gi|229783981|gb|GG667754.1| GENE 6 8741 - 9439 507 232 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624136|ref|ZP_06117071.1| ## NR: gi|266624136|ref|ZP_06117071.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 232 1 232 232 420 100.0 1e-116 MNTWNWLLHFLIIFDLVAFGIYFGGDILFTAISKVQKKSKKDDDLDLYDFEIETAPQKRS LCVDSIRMLYSGEIDEFVEKELLPDAAETMNALSEAEKTAVSLLLKAYIGFLSEEVTVDE QSFPMVKELLSYTQGSKEDGEKDAIDCLMEDTVSRTHRHREYYNNYQRYQLMQVDKERVI MACNIIINDLIGRLYRYDYRFGYDITLASEHSIAKKLSDNWQNEWEAEDDEI >gi|229783981|gb|GG667754.1| GENE 7 9587 - 9940 362 117 aa, chain - ## HITS:1 COG:no KEGG:CD1107 NR:ns ## KEGG: CD1107 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 95 134 229 244 68 46.0 9e-11 ECEAGAVNTACPVCVTDKSKCTGKAPEPPAETSEPEKEKPAGLNPAALILLLALLGGGGV FAYLKLVKNKPKTKGNDSLDDYDYGEEDSEEWETEDEEVLEDGFEGSDPNEERDRAD Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:36:37 2011 Seq name: gi|229783980|gb|GG667755.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld148, whole genome shotgun sequence Length of sequence - 18997 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 11, operones - 7 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 327 67 ## COG1032 Fe-S oxidoreductase - Prom 371 - 430 5.7 + Prom 444 - 503 8.6 2 2 Op 1 . 
+ CDS 524 - 907 215 ## COG1733 Predicted transcriptional regulators 3 2 Op 2 . + CDS 957 - 1649 315 ## COG3279 Response regulator of the LytR/AlgR family 4 3 Tu 1 . - CDS 1544 - 2449 873 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 2549 - 2608 9.3 + Prom 2493 - 2552 5.2 5 4 Op 1 7/0.000 + CDS 2594 - 3673 1016 ## COG1312 D-mannonate dehydratase 6 4 Op 2 3/0.000 + CDS 3664 - 4824 795 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases + Prom 5726 - 5785 18.3 7 5 Op 1 . + CDS 5938 - 6903 557 ## COG1904 Glucuronate isomerase + Prom 6917 - 6976 1.9 8 5 Op 2 . + CDS 7001 - 7246 157 ## gi|266624145|ref|ZP_06117080.1| putative cell division initiation protein 9 5 Op 3 . + CDS 7262 - 7399 203 ## gi|266624146|ref|ZP_06117081.1| conserved hypothetical protein + Term 7569 - 7612 4.2 - Term 7559 - 7598 2.1 10 6 Op 1 . - CDS 7645 - 9003 1199 ## COG0642 Signal transduction histidine kinase 11 6 Op 2 . - CDS 9000 - 9101 67 ## gi|266624148|ref|ZP_06117083.1| DNA-binding response regulator RegX3 - Prom 9188 - 9247 80.4 12 7 Tu 1 . - CDS 10095 - 10571 402 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 10757 - 10816 5.7 + Prom 10584 - 10643 4.3 13 8 Tu 1 . + CDS 10775 - 11128 372 ## COG1131 ABC-type multidrug transport system, ATPase component + Prom 12030 - 12089 9.8 14 9 Op 1 . + CDS 12315 - 13076 440 ## Dhaf_3871 hypothetical protein 15 9 Op 2 . + CDS 13090 - 13812 498 ## COG4200 Uncharacterized protein conserved in bacteria + Term 13843 - 13881 4.4 - Term 13820 - 13877 12.4 16 10 Op 1 . - CDS 13929 - 14075 88 ## gi|288871286|ref|ZP_06410095.1| putative transcriptional regulator - Term 14091 - 14127 -0.6 17 10 Op 2 . - CDS 14182 - 15711 950 ## COG5434 Endopolygalacturonase 18 10 Op 3 . - CDS 15708 - 16130 302 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 16165 - 16224 80.4 19 11 Op 1 . 
- CDS 17068 - 17229 214 ## gi|288871287|ref|ZP_06117091.2| C4-dicarboxylate transport transcriptional regulatory protein 20 11 Op 2 . - CDS 17207 - 18988 1566 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain Predicted protein(s) >gi|229783980|gb|GG667755.1| GENE 1 3 - 327 67 108 aa, chain - ## HITS:1 COG:CAC2422 KEGG:ns NR:ns ## COG: CAC2422 COG1032 # Protein_GI_number: 15895688 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 1 108 1 102 290 82 39.0 2e-16 MHYTGTIWRPPYEAGSLLIEVTAGCTHHKCKFCTLYDDLPFQFRMSPLTDIEADLQEAQY QLHERSSRVKRVYLVGANPFVLQFKRLKEISELIHQYFTECETIGCFA >gi|229783980|gb|GG667755.1| GENE 2 524 - 907 215 127 aa, chain + ## HITS:1 COG:BH0737 KEGG:ns NR:ns ## COG: BH0737 COG1733 # Protein_GI_number: 15613300 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 17 108 15 105 118 100 50.0 5e-22 MADNKFQTKDAMARCVPMSSLQAVLSGKWKILILWYVAFYKVQRFGELMRRLDGITQSTL TKQLRELEQDGFLHREIYKEIPPKVEYTLTELGKSFIPVLNQMMTWSEIHLCPPDYTSPY SGKKLPD >gi|229783980|gb|GG667755.1| GENE 3 957 - 1649 315 230 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 227 2 227 234 96 26.0 5e-20 MKIGICDDEPHVCKQMQAEVNHYYHSLDILPVTFSHAEQLIEALRKDPDNFLCIFMDIEL EGLDGITASRLIQGMNLNIPIILLTSHSEFAPEGYEISAFRFLTKPLDRKKMVEALKAVE TMQIRNRKIAICQDGQEIYVSYQDICYIKSENVYLRIRSTRQSYLVRGTLQKYLEQMPSL LFCRVHRSYAVNLSHVQSFDGTDVIMENALLFLVSLYSCGEIPYSFLNAR >gi|229783980|gb|GG667755.1| GENE 4 1544 - 2449 873 301 aa, chain - ## HITS:1 COG:CAC1333 KEGG:ns NR:ns ## COG: CAC1333 COG2207 # Protein_GI_number: 15894612 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 8 287 7 285 286 156 
31.0 5e-38 MKKNLQTKFSTRQYMLSEDFEIYYYKEHYLSKVELHTHDYYEFYFFLEGNISMKIEEKQY PLQYGDLVLIPPEIRHRAVVHDEETPYRRFVFWISREYCNQLVELCRDYGYLMQYVTVRK EYIFHNDVIAFNTIQSKVFHLIEEIHSQRFGKTAKIPLCVNDLILHLNRVVYEQNHPKSP KEERDLYEKLIDYIEEHLDEELTLERLAREFYVNKYHISHAFKDNMGLSVHQYITKKRVA ACREALLNHVKISEVYPMFGFQDYSGFYRAFKKEYGISPQEYRETKKSKAFSIITSVPSN D >gi|229783980|gb|GG667755.1| GENE 5 2594 - 3673 1016 359 aa, chain + ## HITS:1 COG:TM0069 KEGG:ns NR:ns ## COG: TM0069 COG1312 # Protein_GI_number: 15642844 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Thermotoga maritima # 1 358 1 360 360 442 55.0 1e-124 MEMTLRWYGSKFDTVTLKQIRQIPGVTGVITTLYDTAPGEVWSRERIRAMKDEVAASGLH VSGIESVNIHDAIKTGSADRDTYIDNYIETLENLGKEDIHLVCYNFMPVFDWTRTELARV RPDGSTVLAYTQEAVDALDPENMFESIAGDTNGTIMPGWEPERMAKIKDLFAMYEDINDE KLFANLKYFLKRIMPVCNKYDINMAIHPDDPAWSVFGLPRIIINKENILRMMEMVDNPHN GVTFCSGSYGTNLENDLPDMIRSLKGRIHFAHVRNLKFNSPTDFEESAHLSSDGTFDMYE IMKALYDIGFEGPIRPDHGRMIWDEVAMPGYGLYDRALGATYLNGLWEAIVKGADKSCR >gi|229783980|gb|GG667755.1| GENE 6 3664 - 4824 795 386 aa, chain + ## HITS:1 COG:TM0068 KEGG:ns NR:ns ## COG: TM0068 COG0246 # Protein_GI_number: 15642843 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Thermotoga maritima # 1 386 1 384 539 403 52.0 1e-112 MQVNEQGLANKKQWEDKGYSLPGFDRAAVSEKTRQNPFWVHFGAGNIFRAFQAKVVQDLL NEGVLDRGLVVAEGFDYEIIEKMNHPHDDYNILVTLKADGSVEKTIVGSVVESLTVDSEN DADFTRLREIFQKDSLQIASFTITEKGYSLVNGKGELLPSVADDMKNGPEKPQSYIGKVV SLLYTRFQSGEKPIAMVSMDNCSHNGSKLFDAVHTFAAKWSETGLTAASFVEYIENPAKV SFPWSMIDKITPRPDPSVEELLKKDGVEELDPVVTSKNTYVAPFVNAEECQYLVIEDVFP NGRPQLERGGLIFTTRETVDKVEKMKVCTCLNPLHTALAIFGCLLGYDLISKEMQDVTLK KLVKEIGYTEGLPVVINPGILDPKEF >gi|229783980|gb|GG667755.1| GENE 7 5938 - 6903 557 321 aa, chain + ## HITS:1 COG:STM3137 KEGG:ns NR:ns ## COG: STM3137 COG1904 # Protein_GI_number: 16766437 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Salmonella 
typhimurium LT2 # 3 321 151 469 470 330 50.0 2e-90 MQNVKVLCTTDDPVDDLIWHKKLAQEQSDFQVLPTFRPGNALDIEKDNFPSYVACLGSVS QSDVSSLDGLTEALKKRLDFFISAGCRVTDHSLENSFYLPASYETANGLYQKRLTGVSLS SEEAAMYRGFLLTELGREYARRGLVMQLHIGALRNNSSRMFEKLVADTGFDSLNDFNYAP QLSALLNAMDSTDELPKTILYYLNSKDVDMLAAMAGNYQGNSTGIRGKIQLGSAWWFCDH KRGMERQMDALSDVGLISRFVGMLTDSRSFLSFPRHEYFRRILCSKLGTWVENGEYPKDM AYLGKLTEDICYYNAKNYFNL >gi|229783980|gb|GG667755.1| GENE 8 7001 - 7246 157 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624145|ref|ZP_06117080.1| ## NR: gi|266624145|ref|ZP_06117080.1| putative cell division initiation protein [Clostridium hathewayi DSM 13479] putative cell division initiation protein [Clostridium hathewayi DSM 13479] # 1 81 1 81 81 148 100.0 2e-34 MYGKSDTDFPGTRLLNVIACINAGFDISGMILILALMFIGGELTIRSVSRNLSKEDKLEE MDERNQLIELKTKSQSFRLTQ >gi|229783980|gb|GG667755.1| GENE 9 7262 - 7399 203 45 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624146|ref|ZP_06117081.1| ## NR: gi|266624146|ref|ZP_06117081.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 45 1 45 45 65 100.0 1e-09 MLTFLIMGKISGEKAFIGMGVGLAFAYTISMFAEIFTYCYYEKHT >gi|229783980|gb|GG667755.1| GENE 10 7645 - 9003 1199 452 aa, chain - ## HITS:1 COG:CAC0290 KEGG:ns NR:ns ## COG: CAC0290 COG0642 # Protein_GI_number: 15893582 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 10 442 7 461 467 134 25.0 3e-31 MKSLMRIYFNYIATALAIIFMFVLMQFVLLAVITSKLYDNDSGGKRDSIARIYEAVDEDG QEAAACLQEMGASFAMLLDDAGTPVWSYQLPEQLNHTYTTSQVASFTRWYLEDYPVYVWG GEKGLLVVGYPKGSIWNYAVHQDMDKLKAVFTFLSLSFTSTLAAAVLILLVSGFRYYRRM RMIADSIGKLASGGSVHLAETGNVKEIAYTINRTSDRLTTQREQLEKRDEARTEWISGVS HDIRTPLSLVMGYADMIERQLDTDPKIREKAGLIRAQSVRIRNLIEDLNLASKLEYNVQP LRKKRVFLAAILRRVAADVLNSLEDAEYYPLSICIEPEFEAFSFEADEQLLFRAFQNILG NSIRHNEDGCSLGVRAWLEDGCPHVVFYDSGRGIPPSICRFLNEGQMPDGDVHLMGLRIV RQIIMAHEGTISIKEGGQEIEVRFQPVMKERD >gi|229783980|gb|GG667755.1| 
GENE 11 9000 - 9101 67 33 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624148|ref|ZP_06117083.1| ## NR: gi|266624148|ref|ZP_06117083.1| DNA-binding response regulator RegX3 [Clostridium hathewayi DSM 13479] DNA-binding response regulator RegX3 [Clostridium hathewayi DSM 13479] # 1 33 24 56 56 65 100.0 2e-09 MVHIRKLREKIEEKPSEPAFLITVKGLGYRMNR >gi|229783980|gb|GG667755.1| GENE 12 10095 - 10571 402 158 aa, chain - ## HITS:1 COG:BS_yvrH KEGG:ns NR:ns ## COG: BS_yvrH COG0745 # Protein_GI_number: 16080375 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 1 132 131 259 369 101 40.0 7e-22 MKLTEARILVVDDNQELCSLVENICRQEGFSHVETAFSCREAESRIKSGDVDFLILDVNL PFEDGFSFLQRIKTVLENQRIPVLFLSARDQDEDRLQGLDLGADDYMTKPFLPRELVLRI SAILKRTYRLEEQPECYRLGNREVDLSAGVVRVLDGSG >gi|229783980|gb|GG667755.1| GENE 13 10775 - 11128 372 117 aa, chain + ## HITS:1 COG:CAC0288 KEGG:ns NR:ns ## COG: CAC0288 COG1131 # Protein_GI_number: 15893580 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Clostridium acetobutylicum # 1 115 1 114 242 125 55.0 2e-29 MAQYSIETSQLTKTYRGTPAVDHLDLRIPEGAIYGFLGPNGAGKSTTMKMLLGLISRDGG SIRIGGTELNDRSRISILKQVGSLIENPSYYGNLTAYENLQISCMLRGLPYTEIDPS >gi|229783980|gb|GG667755.1| GENE 14 12315 - 13076 440 253 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_3871 NR:ns ## KEGG: Dhaf_3871 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 4 245 2 245 250 100 29.0 5e-20 MSKKTEFALEMRKLRRRHIPILFLMVFVLITAWTYWCIDDLDVSRLNDVSAMVFTNLLLM NTILCPIVMAALASRMCDMEQMGNTYKWLCTMQKPEHIYRGKVMAGSFYILLFTLMQTVL FWIISQPYGAGVVSRLPGYFTTIFLTSLCIFILQLNLSLKFTNQLTPIFISIGGTFTGLF SWFLNRWPLRYLIPWGYYASLCNAGYNYDETTRYTTYYWDMYPFLWMAVLTAAIIILYRH GQKHFLETVRETI >gi|229783980|gb|GG667755.1| GENE 15 13090 - 13812 498 240 aa, chain + ## HITS:1 COG:BH0447 KEGG:ns NR:ns ## COG: 
BH0447 COG4200 # Protein_GI_number: 15613010 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 226 1 232 247 62 24.0 5e-10 MIVKAIICERMKCKGTLIWPAFLLIPVIPILLGAGNYLSNLEILKSGWYSYWTQVTLFYA TFFFAPLIGAYCAFLWRYENFNSCRNTLFARPIPYHTIYLSKFILVCAISAMTQIWFALL FIASGKLIGLPGLPPADLCFWMIRGLAGAFVIAALQFLAATEIRNFATPIAIGLVGGVTG MIAANTRAGLLWPYSQMLLGMNSNKSEDVLGQNAAIFFVICVVYLLGVSLIGIKRIKRNN >gi|229783980|gb|GG667755.1| GENE 16 13929 - 14075 88 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871286|ref|ZP_06410095.1| ## NR: gi|288871286|ref|ZP_06410095.1| putative transcriptional regulator [Clostridium hathewayi DSM 13479] putative transcriptional regulator [Clostridium hathewayi DSM 13479] # 1 48 1 48 48 84 100.0 3e-15 MDYDIHTVVDFTVEETAPVMRSVWTLVYHEEWATPSVRHMNQRIKHCQ >gi|229783980|gb|GG667755.1| GENE 17 14182 - 15711 950 509 aa, chain - ## HITS:1 COG:CAC3684 KEGG:ns NR:ns ## COG: CAC3684 COG5434 # Protein_GI_number: 15896916 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Clostridium acetobutylicum # 12 436 50 457 534 169 29.0 1e-41 MNQIKELHSFTTENSITLYWDKPEGLEGGVQYQVFLNRKQADTVDRTHCELEDLEPDTEY RLKVKTDSEKGRAVSDEIMVKTGKKKVLLDITRGPYHAVGDGKTLNTAVIQRAIDDCMDE EAVYLPKGVFLTGALRLHSDMELYLEEGAVLQGTDQVEDYLPRIWSRFEGIEQECYSSVL NLGSLEHQGDYNCRNVVIRGKGTIASGGRSLAEKVIASETEHLKDYLLSLGEQIRECEKP ETIPGRVRPRLINMSNCQNITMSGLKLKDGASWNVHMIYSDGIVTNNCTFYSEGIWNGDG WDPDSSTNCTIFGCTFNTGDDSIAIKSGKNPEGNEISRPSEHIRIFDCKCAMGRGITIGS EMSGGINDVQIWDCDISSSRHGIEIKGTKKRGGYVKNVKVRDSRTARILFHSVGYNDDGA GAPKPPVFEKCIFENIDVTGMSLSKQREWIPCDAIELCGFDEEGYELREIEFKDIRLGRP GENRKHIITLQMCEGISFSNVSAAGFLKR >gi|229783980|gb|GG667755.1| GENE 18 15708 - 16130 302 140 aa, chain - ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus 
halodurans # 5 117 401 513 530 84 35.0 7e-17 MKKIVSRLLDGVRNKESADVIISIREYIEKNYDDSLNVADLARKYGLNVSYLSTLFKERT GINITSYIEGIRMEKAMNLLRETNWKVTEVAFHVGYSNSGYFSRVFKKYTGVTPGEYGEM EKKRRKTPGISLEERGGKTV >gi|229783980|gb|GG667755.1| GENE 19 17068 - 17229 214 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871287|ref|ZP_06117091.2| ## NR: gi|288871287|ref|ZP_06117091.2| C4-dicarboxylate transport transcriptional regulatory protein [Clostridium hathewayi DSM 13479] C4-dicarboxylate transport transcriptional regulatory protein [Clostridium hathewayi DSM 13479] # 5 52 1 48 48 84 100.0 3e-15 MYKVMIVDDEKFIRKSIRNRIDWERFGITEIEEAANGQEALALQESFRPTIVS >gi|229783980|gb|GG667755.1| GENE 20 17207 - 18988 1566 593 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 207 566 218 571 577 165 31.0 3e-40 MKQTITHRFLRYLILIIISSNLLVAIFLYGIMRENAMRQARDMRQNLMESNLTMMQQYFE DVDRIADAIIYNRDMIRFMRNDKDTVSDLALLRGIESQYYYSNPDLALSFYKAGHWNGVY SIENEERAVYVPDYRYTDWYQEIIWTRDKKVLMTVNAENGASVQSCVYKIEDLYTPGVVG YLKIDIDMNYLKKRFLHSFSKIAGATILDGDGNVLFYDKIEIKIPAELMTENGSGTYETE DYIITYGTSESTGWHLCLASSKEEILKGQNQIIPVLILILLVIVGFTLLISKRCFSIITV NFKRLAEGMEQVKRGDLTTRVKPDTEDEINILIREFNDMMRRINELMRRVESEQLLVKEA EIKALQQQINPHFIYNILETIMGLASEGMDDAVIEVSTCLSEMLRYNTRFKNVTVVEKEL EQIKNYVTVIKIRFEDRFEVYYDVDEECLNCRILKFTLQPLLENAISHGLAETDSGGMLR IRIKKEENMVSIMIFDNGIGIPEEKLKELNERLKVTGERPLEFIEQYKSLGILNVHLRSK LFYGDTYSIEIFSREEKGTCIVMKIPFVCINTRQKENSIILEGGESYVQGDDC Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:37:10 2011 Seq name: gi|229783979|gb|GG667756.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld149, whole genome shotgun sequence Length of sequence - 16540 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 7, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1445 510 ## Closa_4081 hypothetical protein - Prom 1569 - 1628 4.0 - Term 1593 - 1640 4.0 2 2 Op 1 3/0.000 - CDS 1658 - 2929 417 ## COG0673 Predicted dehydrogenases and related proteins 3 2 Op 2 21/0.000 - CDS 2983 - 3957 437 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 4 2 Op 3 . - CDS 3968 - 5362 232 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 + Prom 5646 - 5705 5.1 5 3 Tu 1 . + CDS 5746 - 5889 56 ## gi|288871289|ref|ZP_06410096.1| ATP-dependent zinc metallopeptidase + Prom 6057 - 6116 11.7 6 4 Tu 1 . + CDS 6143 - 6802 293 ## gi|288871290|ref|ZP_06117097.2| conserved hypothetical protein + Term 6940 - 6981 5.1 - Term 6928 - 6969 1.7 7 5 Tu 1 . - CDS 7085 - 7237 82 ## gi|266624163|ref|ZP_06117098.1| transcriptional regulator - Prom 7316 - 7375 80.4 8 6 Op 1 2/0.000 - CDS 8216 - 9529 314 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 9 6 Op 2 2/0.000 - CDS 9532 - 11190 561 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 10 6 Op 3 . - CDS 11183 - 12667 440 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Term 12830 - 12878 -0.9 11 7 Op 1 . - CDS 13071 - 13328 247 ## gi|266624167|ref|ZP_06117102.1| conserved hypothetical protein - Prom 13377 - 13436 4.7 12 7 Op 2 . 
- CDS 13498 - 13710 146 ## gi|266624169|ref|ZP_06117104.1| toxin-antitoxin system, antitoxin component, Xre family - Prom 13786 - 13845 80.4 Predicted protein(s) >gi|229783979|gb|GG667756.1| GENE 1 2 - 1445 510 481 aa, chain - ## HITS:1 COG:no KEGG:Closa_4081 NR:ns ## KEGG: Closa_4081 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 481 52 527 552 536 52.0 1e-150 MAFRGSGFLKEIVRVECISPIADSIRIRSVENVPVGRACNDQVDDNYLRTEAGMYPDLLR ELDQGIVSISPNQWRSLWIDVEATVDMKAGNYPIKLRVTKENEVIGTVKTEIIIYDAVLP KQKIMHTEWMHADCLADYYKVEPFSEEHWRILDHFFELYVKRGGNMMLTPMFTPPLDIAL GGERTTVQLVGVKKNGNTYEFDFSKMKRWIDLCIQHGIEYFEMSHLFSQWGARYAPKIIA EVNGEQRQIFGWHTPAVGEYTEFLNIYLPQLRSHLEEWGISEKTYFHLSDEPGANDVTSY KAARDSVDSLLKGFHTFDALSDFEFYRQGLVDKPVPGNNEIAEFLEHGVDNMWVYYCTMQ YLDVSNRFMAMPSARNRIYGLQIYKYDIIGILHWGYNFYNSQWSTRHINPYEVTDADYGF PAGDSFLVYPGEDGYPEESIRGMVLNEALNDLRACQLLEALTSKEYVLGILEEFLAEPLT F >gi|229783979|gb|GG667756.1| GENE 2 1658 - 2929 417 423 aa, chain - ## HITS:1 COG:DR1362 KEGG:ns NR:ns ## COG: DR1362 COG0673 # Protein_GI_number: 15806379 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Deinococcus radiodurans # 1 409 4 396 403 264 37.0 3e-70 MVNAILIGAGQRGYEVYGECALEHPSDLKFIAVAEPDEARRNRFAQAHKIPNELCFESWE KILDQPKLADAAFICTQDRMHTEPTLMALEKGYHVLLEKPMSPVAGECIAMGKYAEKYNR IFSICHVLRYSEFFSAIKEVIDCGKLGEIINIKHAENVAFWHQAHSFVRGNWRNSNETSP MILQKCCHDMDIILWLMNRSCTKVSSFGSLEYFKPENAPAGATERCTDACPVKGTCPFNA ERFYLGENTHWPVSVISEDMSLKARRKALEDGPYGRCVYHCDNNVVDHQVTALEFEGGAT ATLTMCAFTPDCTRFIEVMGTKGYLRAEMKKFGENARHNEIVVENFLTGEVEIIPVSDFG LMTGHEGADVLLTMGFVKQVAEHDSEGRTSAARSVESHLMSLAAEKSRVENRQIPMAEMW AMS >gi|229783979|gb|GG667756.1| GENE 3 2983 - 3957 437 324 aa, chain - ## HITS:1 COG:VCA0129 KEGG:ns NR:ns ## COG: VCA0129 COG1172 # Protein_GI_number: 15600900 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: 
Vibrio cholerae # 27 317 36 330 332 165 35.0 1e-40 MEAKNKQSAVMQKIKPYMMDMNFSLTVAIVIVGIILAFKSPYFFSFKNVLNMGLAGSVTG VMAAGLTISMVMGAFDLSQYAVATLASVVSTIMLISGVPVWICLPAAVLVGMACGAFNGL MVTVFKINPIITTLGTQMIFRALCYILTSSKILSIDQPVMNYLGRGYLLGIPTSFWILII VYIILALVVKYTSFGRSVYAVGANANASYLAGINIRRVRIGGLMVSSTCAAVAGVLTACQ VGASIPANGVGAEMEITTAVLLGGLSLAGGKGKLSGTFLGLVLLTIINNGLTLLSVQSYY QMLVRGIVLVLAVLIDSIRGGGYK >gi|229783979|gb|GG667756.1| GENE 4 3968 - 5362 232 464 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 112 443 158 498 563 94 25 6e-19 MIFFGSYLTQAGKSTLIKILSGAYTQDLGEVYLNGVKQEHVTPITAIKNGISVIYQELNN VDQITVAENIFLGNLPRKGALKILDYDKLRKDAQSLLDEVGLKVDPFTPLEQLSIAQKQL VEIAKALSKEIKILVMDEPTAALNNQEIEIMFRLIRNLTSRGIGILYISHRMEEIFQMSD RVMVMRDGKGIATMPTTEVDSQQIISLMVGREIKDLYPQKCKPGSKICFSVEGLETERLH NISFEVKEGEILGLFGLMGCGRTDIARAIFGADPRKSGKMCLDGKDIAPTSPFLAKKEGI AYIPADRKMDGLMLIHSVSRNMTISVLEKLKRSHILNKKVEMQLVDGWIDRLAIKTPSAD TIINSLSGGNQQKVVLAKWLSIEPKLLILNEPTRGIDVGSKSEIYRLMNELCQQGIGIIM ISSDLPEVMGMSDRMVIVCDGKVGGIMEKEEFTQDAIMAKAIGL >gi|229783979|gb|GG667756.1| GENE 5 5746 - 5889 56 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871289|ref|ZP_06410096.1| ## NR: gi|288871289|ref|ZP_06410096.1| ATP-dependent zinc metallopeptidase [Clostridium hathewayi DSM 13479] ATP-dependent zinc metallopeptidase [Clostridium hathewayi DSM 13479] # 1 47 38 84 84 75 100.0 2e-12 MLKSMSKTFSISKDDFQYAIYDTNKTKNIYHADILLSKYANLNHSIQ >gi|229783979|gb|GG667756.1| GENE 6 6143 - 6802 293 219 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871290|ref|ZP_06117097.2| ## NR: gi|288871290|ref|ZP_06117097.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 219 15 233 233 362 100.0 8e-99 MKILLRIKFICYVLYYYFYTNLHLMINLTLTIIIITVTPNILLKIVAPIPILLIFNFVIF PIKKYGTHLQLIKSGKTYELTMCGKIKDLSKEEIQHHFLNSLEEVIDNPSIFQLKSKIRV HTHGLFATKLFEALYKAHNIDYNSKLWREGIANGFFELEQFKIEIYKKKGKNIMIAGNYG 
YSVILKDEKLIQALKHIERSIDYYDIILPPDVFRVTKNT >gi|229783979|gb|GG667756.1| GENE 7 7085 - 7237 82 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624163|ref|ZP_06117098.1| ## NR: gi|266624163|ref|ZP_06117098.1| transcriptional regulator [Clostridium hathewayi DSM 13479] transcriptional regulator [Clostridium hathewayi DSM 13479] # 1 50 21 70 70 73 98.0 6e-12 MVEKYKEEKILTSDIADAFIEKVLVDRNHDIEIIFKFQDEIDKIYERIAC >gi|229783979|gb|GG667756.1| GENE 8 8216 - 9529 314 437 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 8 298 11 297 301 108 30.0 3e-23 MIKKYLFFYERLSKEDGDDESCSIGNQRRLLQRFQKEKDEFSDYEVKEFIDDGYTGANFE RPDFQRMMEKIRSLYGNKIVVVTKDFSRLGRDTIDTVNYLEKIFPFLEVRYIAVNDDYDS NDYPYGLDFENKFKNLINGIYPMQLSYAHKKIKMEEAKQGKHNGPIPTYGYRYTKEREYE VDWEAATVVQMIMDFLEQRKSYKYIICFLAEKKIEPPSIYLTSRYGVKLSKVSKKSVWNR ATILKIARNKTYLGYSVRHTVESCSLGTKKTRVIKRDQQVLVPNRQLAIIRTEQFDSVND WLDDRAEKAGHKKTKRQRVSALYKKVYCTECGCGMTRRLEYNYDVYVCLDKRENPFSVCT TKPIQTSVLEAAVFKALQFNIYLFLDNERNIKKSLKSNITDRHEEMACLHKSVDLISCQK MELYEKYKSEQLTKIAS >gi|229783979|gb|GG667756.1| GENE 9 9532 - 11190 561 552 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 24 322 10 300 301 150 32.0 5e-36 MARTSRKSSKAERSAPVFRENRWASYARLSVKSNKVGIDDSIIHQLQRNRDFIRKNIQDG LLAGEYYDDGISGTTFDRPGFNELLQDIKRGKINAVIVKDFSRFGRNSLEALELLEVVFP SLDVRFISIDDAYDSADPRCSKDRMIYIIKHLANEYYARVASKKLSQAHELNRGSGEFWG ARPPYGFARSQKNSKILVRDDEAAVVLIRIFNMFVLEGYTYFRLAKQLNEERIPCPQAYY LIKHGKQDKVEKNPEKYLWTGNTISKIIQNPAYIGCLVTHKTEQSYYKNQKIRRVPRQEW IIDENVLPRIIGQSVYDEAQELAKQAWVNNFNRENDKKLGLFGGKIICGICGRNMSQHRY KTKKGLGFEYHCPGHDIAPSICKAQFVNEKYILKAVTVLLDRWVKAAMKERKLYKKGIFL 
TKLKREFQQNMAIADDELQRITVKQQQIYEDYVGGLLDKREYLQIKNQYSEQREELIEKK GKLDMELNDAVRKYSMKNKWISSLLDIHGNICITAEVIDELIAEVRVVDKTSIEVIFKFK DIFNIDVECEVV >gi|229783979|gb|GG667756.1| GENE 10 11183 - 12667 440 494 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 1 227 68 289 301 112 33.0 1e-24 MLELIKKGMIDCIIVKDLSRLGRDFTEVLRYVQRRFPEWGIRFVAIDDNYDSDDESCKQD FLTLPIKSLLNESYPANTSISIRNTLKAMREQGLFVGAYAYYGYQKDPEDRHRLILDPIA SGVVRDIFAWKICGLSQDAIARRLDSLGFLPPADYKVSQGIPYKTTFKLYERSHWTAVAV GRILCNIAYVGILVQGKTTTPNFKVHKTIYKTEEEWDIVEGAIPPIVSWIDFMIVNHLLE KDTRTAPGQDTVYLFSGILECADCHQSLVRKPAKYNGKEYGYYVCSTNRDHKEQCSSPHR VSEAKLKKSMLLLIRHQISMMVNLKSILQYVETIPFSDRKVEKESNRMEMLKKEYQWNLK LSTSLYESFQEGILTKEEYLQMKKKYSERCGELEQLMENQEEESRNIFRNICGQNNWIEH FLKFGQIKELDRKLVVSFIKDIQVTAQGNLEVTFWFEDEYRQVIEKIKQIHYELPNAKMQ GFLNQIGEGGVLCG >gi|229783979|gb|GG667756.1| GENE 11 13071 - 13328 247 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624167|ref|ZP_06117102.1| ## NR: gi|266624167|ref|ZP_06117102.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 85 1 85 85 141 100.0 1e-32 MLTVEELEELQNISIDQMDPNTLVDLNSIRVDTSQPKEKRIKMLLESGINPYFFRVGNMK IKVAYSNTGKTLSDIIENLVEKAEI >gi|229783979|gb|GG667756.1| GENE 12 13498 - 13710 146 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624169|ref|ZP_06117104.1| ## NR: gi|266624169|ref|ZP_06117104.1| toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] # 1 70 20 89 89 131 100.0 1e-29 MESYLNVLQALDVYPVFVAGADDGKETKKYISDFMEIIKDCSGREVAFLLNMAVSMKENM IKTGLTASGK Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:37:54 2011 Seq name: gi|229783978|gb|GG667757.1| Clostridium hathewayi DSM 
13479 genomic scaffold Scfld150, whole genome shotgun sequence Length of sequence - 18417 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 7, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 255 174 ## Tresu_1954 Methyltransferase type 11 2 1 Op 2 . - CDS 342 - 2027 1191 ## COG0840 Methyl-accepting chemotaxis protein 3 1 Op 3 6/0.000 - CDS 2123 - 3229 855 ## COG2200 FOG: EAL domain 4 1 Op 4 5/0.000 - CDS 4186 - 5280 918 ## COG2199 FOG: GGDEF domain 5 1 Op 5 6/0.000 - CDS 5346 - 6566 1113 ## COG2199 FOG: GGDEF domain - Prom 6664 - 6723 18.3 6 2 Op 1 . - CDS 7625 - 9607 1410 ## COG2200 FOG: EAL domain 7 2 Op 2 . - CDS 9638 - 9724 60 ## - Prom 9827 - 9886 6.7 - Term 10021 - 10061 5.5 8 3 Tu 1 . - CDS 10092 - 10307 148 ## gi|266624179|ref|ZP_06117114.1| conserved hypothetical protein - Prom 10379 - 10438 6.0 9 4 Tu 1 . - CDS 10595 - 10858 170 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 10 5 Op 1 . - CDS 11817 - 13100 1032 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 11 5 Op 2 . - CDS 13136 - 14131 823 ## COG1397 ADP-ribosylglycohydrolase 12 5 Op 3 . - CDS 14163 - 15626 905 ## COG3275 Putative regulator of cell autolysis 13 6 Tu 1 . - CDS 16547 - 16738 255 ## gi|266624184|ref|ZP_06117119.1| conserved hypothetical protein - Prom 16831 - 16890 5.6 - Term 16868 - 16907 4.3 14 7 Tu 1 . 
- CDS 16921 - 18417 1467 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783978|gb|GG667757.1| GENE 1 3 - 255 174 84 aa, chain - ## HITS:1 COG:no KEGG:Tresu_1954 NR:ns ## KEGG: Tresu_1954 # Name: not_defined # Def: Methyltransferase type 11 # Organism: T.succinifaciens # Pathway: not_defined # 1 84 1 84 261 141 72.0 1e-32 MSITNQNIDGGKAFDWSKTSEDYAKYRDIYPQQFYDKIIDRKLCISGQSILDMGTGTGVL PRNMYRYGAKWTGTDISESQIKQA >gi|229783978|gb|GG667757.1| GENE 2 342 - 2027 1191 561 aa, chain - ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 252 554 212 514 555 209 45.0 2e-53 MKNLKVSRKLTISYGFILSLLVISIVVSIGNLISIRSKVEVFYNGPFRVLNAANTVNSSF EAMQKSVFRAISSTSQANTEQALENARAWERKIQEQIPVIQKDFLGDQAIVERLQAALDE LAPQREHVLELAVQNQKAEASAYMEANNIITIHKAQAELDELIEIADRKADELIVNLHAM QTKFITILSVLGIASVLVSVFFGIYITRGITRPIKELEAAARQMEQGHLKIDVNYASKDE LGSLSNSMRQMSDKISYYMDAISRVMRQLADSNLEIPHYDDFQGDFLPVQESLLIVLNSL NETISEINMFSDQVANGADQVSNGAQILSQGVIAQASSVEELAATMSEISQQVKENAETS QVVKTAAGEMGANILACNQQMQEMKNAMEKINQNSTQIRSIIKTIDDIAFQTNILALNAA VEAARAGESGKGFSVVAQEVRSLATRSSDASKSTEALIEQSLAAVVYGTKVAEETAASLR NIAGGTDEMISKINQIAEASKRQAAATEMVSTGIDQISDIVQTNSATAEESAAASEELYG QSQVLKSRVSRFKLHVSRPEY >gi|229783978|gb|GG667757.1| GENE 3 2123 - 3229 855 368 aa, chain - ## HITS:1 COG:slr1692 KEGG:ns NR:ns ## COG: slr1692 COG2200 # Protein_GI_number: 16330979 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 98 353 71 326 332 171 34.0 2e-42 MADILKSRVENSAYRVGGDEFTVICADIGKQDFQTLTEKLIRDFDRNKEYDVSIGHTWKE GNVSADEEILKADDLMYAEKQRYYHAVLQGDRMARAGIATELLRDIADHRFGVYYQPQIC LKTGRIMGAEALVRKWSKRGTLILPDRFIPHYERERVICHLDLFVFETVCADWQNWKRQG IDTHISSNFSRVTLMASDIVTQIKQICRKYEVPAEKITIEVTESISKLDPQQLIDLMNQI VAEGFSVSLDDFGSKYSNLSILTTLDFNEIKFDKSLVDKLSSDAKSRIVMKNTMQICREL 
PMTRSLAEGIETVEQLDLLHQYHCDYGQGYYFAKPMSAESFLALLKKEQMLGTSVWIQSG HDGTDKME >gi|229783978|gb|GG667757.1| GENE 4 4186 - 5280 918 364 aa, chain - ## HITS:1 COG:aq_563_2 KEGG:ns NR:ns ## COG: aq_563_2 COG2199 # Protein_GI_number: 15606018 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 282 358 32 111 218 62 37.0 2e-09 MRESFFERLIINQQNLVAYATDTDTNEIIYMTQAAASLYGFEHVQDTYGKKCYELIQGLD SVCPFCTNSKLVPGQPYHWEHYNEKLQMWFDITDILVMYEGKYCRLEIARDVTAQKEKFD RVSNRLTVEETLVDCIQTLSSEADVNAAVNRFLEIIGRFYAADRAYIFEYRKNQIKNTFE WCAPNIKPQLDLLQEIPIEYIRDWNHKFELDGEFFITSLDRDLETDSPDYRILKDQGIES LAAAPLKKNGQIIGFLGVDNPMESTGDLSLLRSVCSFVLEEMERRRLIQNLELFSYTDLL TGLQNRNSYIKMLDRLSHQTLRSLGVIYIDINGMKKINDSNGHKYGDRVIKKVADILKYT GGRI >gi|229783978|gb|GG667757.1| GENE 5 5346 - 6566 1113 406 aa, chain - ## HITS:1 COG:RSp1155_4 KEGG:ns NR:ns ## COG: RSp1155_4 COG2199 # Protein_GI_number: 17549376 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Ralstonia solanacearum # 236 399 6 170 182 101 36.0 3e-21 MCNREAPKLFGLDSEQEYLDHFNQLSPERQPDGSLSSEKAKERIREAMQVGRVQFHWLHQ KTDGELIPAEITLVRIDGLEEDGSAMVAGYTRDLRDQLRAEKAERLMARRVRAMVDSSPL ACVLWNMRQEALDCNQVAVEMLGATDKEEVLHDFDSFMPLYQPDGSVSVEKRKRTMREIQ NNRRYVFEWVYVNRKGESIPCEVTLVKAVIGNNDEDIIIAYSRDLRELKRTMEIIERLKK QACYDALTGCLTREYFIEILSGQFQEGTGEDSVVLGLLDLDGFKNVNDTYGHEAGDMTLK CVMKRVRELLPAGTVVGRFGGDEFLFLLYETEHGLVSQILNRLVEQIPHMQMEYQGNIFT TSISIGAVFQTTEDTCPDDLIRRADDALYQAKRNGRNRSVLIRVNN >gi|229783978|gb|GG667757.1| GENE 6 7625 - 9607 1410 660 aa, chain - ## HITS:1 COG:slr1692 KEGG:ns NR:ns ## COG: slr1692 COG2200 # Protein_GI_number: 16330979 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 299 553 74 323 332 159 33.0 2e-38 MSLFDVNPDPEEAFLYNTIASICSDYLYVWDLEGDVLLASPNMVSDLGLAGNRLANCSQM DIKWVHPHDRERVQKQYRAFIHSKEESLSLEYQALTANGSYIWLSDRAKMKRDTAGKPLL IVGTLRNMEQYEGVDQVTGLLRHSSGRDRFERTLEGRAGFHEELLLLGIDGFKNINILNS 
HSFGDLVLRTTAQEILKMLPEQVKLYRYDGDQFLLSGNHVRREEMLRIYEDIRQYASGPH RVNGCAYRFTVSGGIIAYPEEAHSWVELEKGACAALRQAKENGKNQCVEFTAGLLDELQL SHCLAQSADNGFQGFRLVFQPICGADTLEVRGAEALLRYRMPDGKEISPVEFIPLLESSR LILPVGLWVLEQAIIACKVWTEQNSDFVMDVNVSFVQLRDSGFCDKVEALLQKYSLPARN LALELTESYFITNDANVNDSLERLNEMRIELSMDDFGTGYSSLGRLAEIGMDVVKIDRYF VKALHISQYNHDFVESVIRLCHNLGKAVCMEGVETKEEWETISLLNADMVQGFYISRPME PEAFFQKYLTSSYDNKVLEISSDRMKFQKKLANSRELLFAMMNATPLCLMLWNRQSEIMA CNEKAVQLFRAEGKEEIIEHFYRFSPLHQPDGRTSIELVHEKIAEAFAAGYSEFKWLHCD >gi|229783978|gb|GG667757.1| GENE 7 9638 - 9724 60 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAAAGIYIFPPGPPEGFEGGDFWKNALL >gi|229783978|gb|GG667757.1| GENE 8 10092 - 10307 148 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624179|ref|ZP_06117114.1| ## NR: gi|266624179|ref|ZP_06117114.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 71 2 72 72 125 100.0 1e-27 MFEKVFLVPYEDSANYPALSKYCEENGYSCIFTGDDEAEISGKKYEIYRDYEPGSRGNYG IIKKKEKMIFS >gi|229783978|gb|GG667757.1| GENE 9 10595 - 10858 170 87 aa, chain - ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 75 180 254 257 73 44.0 1e-13 MYLNPNYLSRLFKREMNTSINSYLIDKRIDKAKSLLEHSDMPVHAVSMEVGYNNFSYFTK LFREKTGQTPNEYRKSMGIRGEEYFLR >gi|229783978|gb|GG667757.1| GENE 10 11817 - 13100 1032 427 aa, chain - ## HITS:1 COG:SPy1587 KEGG:ns NR:ns ## COG: SPy1587 COG4753 # Protein_GI_number: 15675474 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pyogenes M1 GAS # 3 111 4 112 494 84 36.0 5e-16 MNILIVDDDSYVLELICKKIDWSSLGINEIYTASSAKQAKEIILEQNIHILLTDIEMPGE SGLELLEWVRSRNIRLEAMFLTSYAEFEYAKKAIELKSLEYYLKPVDEHSLSKGIASSVK 
RCLEYEKIGAYRKQSEYLAAHSHVIEEHFWLDLTKGFFEERKEIVWERICSDKLSFTQDH HFAVILFRIMPSLGRDTDEYLRIYRNRIRKMLREFYKDTYYESVYDFTENEGQFGLCLKL EPDIFNLIDIQNTAWRLIEYLEKQEHLELYGAVSEITTIDGIYGQLMELKNLLTGLIESD EKVITCKSVKTREKIGYNMPDLEIWETVFHSGGQKILSGMITDYLEQLASEKQINSFVLR MFRMDVEQLVYQYLRKNHVEMHKLFDSKEAETLSDLALQSIEYMKQYADYLIEKSMEYTE MTEKTDS >gi|229783978|gb|GG667757.1| GENE 11 13136 - 14131 823 331 aa, chain - ## HITS:1 COG:yegU KEGG:ns NR:ns ## COG: yegU COG1397 # Protein_GI_number: 16130037 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ADP-ribosylglycohydrolase # Organism: Escherichia coli K12 # 8 320 4 317 334 94 26.0 4e-19 MTDKVIEDRLTAVLEAFVTGDAMGMATEFMTRKEIIRKFGGWVTSLIDTRYSKLHKDLPT GSITDDSEENLILLEEYKKAGEVTVEATVEGLLRWINESNAMEKKYIGPSAAKALTDIKG GVSPYESGREGTTCGGIMRTPCVFLFSPGQDEDQLAVHIWRCLVPTHNTSEAMEAAGAYG FALRAAFLGESFEEIIDAAMRGGERLLEFVEEIRSSPSSSARIKEVVTKSREMDECQLMD WLHEVLGNGLSSSDVCGAVFGAFAMAGGDVFKAICMGASMGGDTDTIAALAGALSTAFVG SHNIPGDILETIKHVNHLDLAGLAGRGWEKM >gi|229783978|gb|GG667757.1| GENE 12 14163 - 15626 905 487 aa, chain - ## HITS:1 COG:SA0250 KEGG:ns NR:ns ## COG: SA0250 COG3275 # Protein_GI_number: 15925963 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Staphylococcus aureus N315 # 246 481 343 582 584 96 29.0 1e-19 MALNAADKSLEQLQGQEQLNNVFSIYFFDYAAQFHFFLYFPDRNRYIACEASSGEFGIED TIVTLIREGRINEYSKMDGWSIAEEDGEIYLIKTLKYQNIRFGYWLTAKEMSDLMSSLSD AEEDSLILTSLNGDSLTEIPGKIEPYLEKTGFFSKYIEIQRTFDKLPFAIRYFIPVSRIF QNILWLQLILAAITIGTAFGLFFYFYYTVRHIIKPIQTFCEGLSGYGGEELPELDSARIL ELEQADLQFRNLAEQIKNLKISLYEKQMEQQKEVIEFMKLQIRPHFYLNCLNLAYNMVEL EKYDECRQMLRLTSEYFRYLLKSDMRKELIRDELSYVKNYLKIQSIRYRRTFNYYIEQDP RTIDCYIPPLIIQTFVENAVKNTVALERPVDISITVMEENGDREPVIHIFITDTGDGFSE EILSRLQKGESLQRDNGTGIGITNSIKRLQYHFGEKCRVRFYNSPQGGAVVELTIPVSRK DGKVWHT >gi|229783978|gb|GG667757.1| GENE 13 16547 - 16738 255 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624184|ref|ZP_06117119.1| ## NR: gi|266624184|ref|ZP_06117119.1| conserved 
hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 61 14 74 74 105 100.0 8e-22 MRRISPVLGILTVGLVFSILLNCAAIIYTRYEQKQYAMMAVKANMDSVDREIAAITRNQA KIS >gi|229783978|gb|GG667757.1| GENE 14 16921 - 18417 1467 498 aa, chain - ## HITS:1 COG:lin2115 KEGG:ns NR:ns ## COG: lin2115 COG1653 # Protein_GI_number: 16801181 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Listeria innocua # 48 491 45 480 485 95 22.0 3e-19 VTGCGAKKALVGEETTQTAAGETGKKKSGKAEGETIHLKIATCVAAPKNMDKIVEQANAV LKEKLNITIEVVPLAFDGSKYNLLLTTGNDVDLIYTATWAKFQTNALDGAYLPLKEMIQK EVPELWDSIPEEDWKKCEINGEIYNIPSTSDDLIEIGFTYREDLRKKYNLPEITDIASME AYFAGIKEYEPTMVPCSDRGGRVNEMFQLLNGWYAMSNNLAYDLKGDQGVVVYSKTDSFK KGVETAREWVENGWIDSDIISANVDSAGQFLSGKYASTISYATGEAFNLIYSPALTSNPD WEIKMVNFGAMGQRVLKKHPMHTAFGLPKASPHPVEALKFMLELRTNQELYQLLDCGILG EDYEVTDNGNYKSLNSAEHPAFTKEGMGLTGFYYSVPELSLKPEHYEFITSVYDTMRPYV VDNTIMGFPADDDSVKAEVTAVNQLISQYLNPLINGMVDDTDGGITDLNERMDQAGIEKI HETISSQAKAYTKAIKGE Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:38:14 2011 Seq name: gi|229783977|gb|GG667758.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld151, whole genome shotgun sequence Length of sequence - 8666 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 2, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 2 - 248 97 ## COG4209 ABC-type polysaccharide transport system, permease component 2 1 Op 2 . - CDS 280 - 1368 774 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 3 1 Op 3 . 
- CDS 1402 - 1647 188 ## gi|266624188|ref|ZP_06117123.1| putative hemerythrin HHE cation binding region 4 1 Op 4 2/0.000 - CDS 1654 - 2496 479 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 2590 - 2649 4.8 5 1 Op 5 7/0.000 - CDS 2761 - 4203 776 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 6 1 Op 6 . - CDS 4196 - 5929 1162 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 5967 - 6026 7.0 - Term 6072 - 6119 12.1 7 2 Tu 1 . - CDS 6134 - 8665 2504 ## COG1012 NAD-dependent aldehyde dehydrogenases Predicted protein(s) >gi|229783977|gb|GG667758.1| GENE 1 2 - 248 97 82 aa, chain - ## HITS:1 COG:BH0484 KEGG:ns NR:ns ## COG: BH0484 COG4209 # Protein_GI_number: 15613047 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Bacillus halodurans # 17 75 30 88 325 70 47.0 9e-13 MKMIKGNGLLYTKNLGRRVWTHKWLYVLLAPAILYYVIFKYWPMYGVVIAFKDYNVFKGI NGSEWAGLEVFRRIFKNANSNP >gi|229783977|gb|GG667758.1| GENE 2 280 - 1368 774 362 aa, chain - ## HITS:1 COG:BS_yteR KEGG:ns NR:ns ## COG: BS_yteR COG4225 # Protein_GI_number: 16080064 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Bacillus subtilis # 6 356 11 363 373 307 42.0 2e-83 MGESEKNALWYGEKACNTIMRKYEAAKLPPVDRFHYHQGVFLSGMYKIYSLTKNEDYFDY IKGWVDSQIDEEGNIKNYDDGQLDDMQSGILLYPLMDRTGDPRYKTALDRIVPKIYDFPK TEEGGFWHMRTLPHQMWLDGLYMGGPICAEYGYRYHKPEYTELVIKQILLMISKTRDSRT GLLYHAWDESRSEVWADKETGCSPEFWGRSVGWVPVAILDDLDFISRDHPKYGTICQSLC SLLDAVCKVQSSEGRWYQVLDKGDTEGNWLETSCSCLFTAALCKAIRKGILDKKYFQYAK KGYEAVIHSLEWAGDDILIGNVCIGTGVGDYRHYCDRPVVVNDLHGMGAFLIMCAEYCLV NQ >gi|229783977|gb|GG667758.1| GENE 3 1402 - 1647 188 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624188|ref|ZP_06117123.1| ## NR: gi|266624188|ref|ZP_06117123.1| 
putative hemerythrin HHE cation binding region [Clostridium hathewayi DSM 13479] putative hemerythrin HHE cation binding region [Clostridium hathewayi DSM 13479] # 1 81 1 81 81 155 100.0 7e-37 MGFLPQYEKVYTDEGMFRQWVSSYPDIDALEACETFSDILTSDRTVNIKKIEGEGSDLLD QIMENIWLGSFDANSVLEGAD >gi|229783977|gb|GG667758.1| GENE 4 1654 - 2496 479 280 aa, chain - ## HITS:1 COG:BH1864 KEGG:ns NR:ns ## COG: BH1864 COG1653 # Protein_GI_number: 15614427 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 1 234 62 308 461 66 26.0 6e-11 MIRQYEKENKGIKIEANFVLKPDYLSKISILNAEGKMPDVFLLPESNVYQWGKKGVLHEF GEGIYGARLNYVTMCLYYNKELLKAEGITPPSVNGEVWTWDEYVENAKRLTVDMSGKNAF QDGFNEKRVKTYGTRMPTSWEAITALAVSAGDSEPLSGGYGKLHFSDEKIHVIQSIADLS LSHHCAPDHVVANSTYTDAAAMLMSGRLALFIGTSGAFADYANRGFDVGVAPLPYFKTAS DIKWSEYVCVSEQSEHKGDAVDFTTYFSDYLNALKVSEQK >gi|229783977|gb|GG667758.1| GENE 5 2761 - 4203 776 480 aa, chain - ## HITS:1 COG:lin2118 KEGG:ns NR:ns ## COG: lin2118 COG4753 # Protein_GI_number: 16801184 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Listeria innocua # 1 478 1 494 494 131 25.0 2e-30 MYRVCIADDELYVRKSITQRIEKSKLPLLVVGTAENGEKGWELFNKEKPDIFLVDIKMPM QDGLGMIEKIKKEYRNVKTKFIVISGYDDFEYMQKAIQVGVVDYIRKPFESEQFIRVLEK ICSQLEEEEIEESKKMTKGRMIFWRDFYKVMKHKEVSGTFLLLYQKDMLSQANVWKLEKI YPDSDWKYICFHSTNQILLLYSDRPGTPVFASHKSDALMDIARYAVLYSGRESLDQILQC LEQNLDRRFYPGTPFLIKADTIPEKHKTVWDLHELEAALKNTRENQYQVLIRKIVGELEA DNENFIYYTEFFHVFNAVIASIYTSYGLNLPDDISRSFSPMALADFNRSEDLLKSVNASA EALVTKVRELTGRSEIVDNVIRYIKNHYKEDINLNVLANEFFLSPAYLSRKFSQTTGVSI MSYLEDYRINVATDLLKGSERSISEIADQVGYYDANYFTKIFKKVKGITPKEFRKMSKSC >gi|229783977|gb|GG667758.1| GENE 6 4196 - 5929 1162 577 aa, chain - ## HITS:1 COG:BH2110 KEGG:ns NR:ns ## COG: BH2110 COG2972 # Protein_GI_number: 15614673 # Func_class: T Signal transduction mechanisms # Function: Predicted signal 
transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 291 564 299 574 585 129 31.0 1e-29 MKWFMKFGIWKKTLTILVAIYCVTFLISGAVSLMIYSNLQRQVVMDSVNQTLQEKNDVLV NYFEQIDEIAYNLSYSNWMQQIFFFESDDRRRQECINTLKKNMGRSSFLYNNIQFALETV NGTQIAGNEMTYFDYNFHVEHQFWYGELMENEKYMLIGDEQLCYYSGYRDSITMVYQVKN YETLELAAYFLVRIPLKNIEALIQENYIEGESVTLSFPQHNLFLGGSEGNEERDHAKSSN YSYEQQKNLRINHKNMVIQAGRAWSSVHADNTGIWSFAALIMIVVAALMAVIAFMISRYL TRPILKCKEAMSEIQANHIGITLDNPYQDEIGELITGFNEMSVSIYDLIKTNQNISLIQK ETEIKMLERQINPHFLFNTLELISGLILDEKEEEAISLCGNLGQLYRYNLRQEKWIQLRE EIEYTKRYLEIMQYKVENLETFYDIEDEVLEEKVIKAILQPLVENSTRHGFRDKAGECCI AIQAKRRGCGLILEVMDNGNGMSEEQLCQLNRDIEKIKENPDNRLAESKHIGVKNVFQRL YLEYRENLEFKISSRIGFGVKIEIVIKDPDRGEDAGV >gi|229783977|gb|GG667758.1| GENE 7 6134 - 8665 2504 843 aa, chain - ## HITS:1 COG:CAP0035_1 KEGG:ns NR:ns ## COG: CAP0035_1 COG1012 # Protein_GI_number: 15004739 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Clostridium acetobutylicum # 5 430 22 448 448 576 67.0 1e-164 RKGFEFATFTQEQVDKIFYEAASAANKMRLPLAKMAVEETGMGVVEDKVIKNNYAAEYIY NAYKETKTCGIIEEDKAYGIKKLAEPIGIVAAVIPTTNPTSTAIFKTLISLKTRNAIIIS PHPRAKNCTIAAAKVVLEAAVKAGAPEGIISWIDVPSLELTNEVMRSADIILATGGPGMV KAAYSSGKPALGVGAGNTPVIIDDTANIKLAVNSIIHSKTFDNGMICASEQSVTVLESVY DEARNEFAARGCYFLKDDEIEKVRKTIIINGALNAKIVGQKAHTIAKLAGVDVPEDTKIL IGEVESVNIEEEFAHEKLSPVLAMYKAKDFDEALAKAAQLVADGGYGHTASLYVNVNEHE KIMKHAEAMKTCRILINTPSSHGGIGDIYNFKMTPSLTLGCGSWGGNSVSENVGVKHLLN IKTVAERRENMLWFRNPEKVYFKKGCMPVALSELKDVYGKKRAFVVTDSFLYMNGYTKPI TDKLDEMGIVYTVFSDVQPDPTLANAQAGAKAMRAFQPDTIIALGGGSAMDAGKIMWVMY EHPEVDFQDMAMRFMDIRKRVYTFPKMGEKAYFVAIPTSSGTGSEVTSFAVITDQDTGVK YPLADYELMPKMAIVDADNMMSQPKGLTSASGVDVLTHALEAYASVMASDYTDGLALKAM KNVLEYLPTAYNDGSNVEARCKMADASCMAGMAFNNAFLGVCHSMAHKLGAFHHLPHGVA NALLITLVMEFNASEVPTKMGTFPQYEYPHTLARYAECGRFCGIQGKTDAEVLKKFIDKI EELKAAVGIKKTIKEYGVDEKYFLDTLDSMVEQAFDDQCTGANPRYPLMSELKEMYLKAY YGK Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:38:26 2011 Seq name: 
gi|229783976|gb|GG667759.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld152, whole genome shotgun sequence Length of sequence - 15053 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 7, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 + CDS 3 - 554 486 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 2 1 Op 2 . + CDS 551 - 1585 889 ## COG0642 Signal transduction histidine kinase 3 1 Op 3 . + CDS 1630 - 1767 96 ## 4 2 Tu 1 . - CDS 1721 - 1936 197 ## gi|266624195|ref|ZP_06117130.1| conserved hypothetical protein - Prom 1976 - 2035 6.7 + Prom 2042 - 2101 5.9 5 3 Tu 1 . + CDS 2125 - 2778 268 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Prom 3631 - 3690 80.4 6 4 Op 1 . + CDS 3808 - 5235 1364 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 7 4 Op 2 . + CDS 5243 - 5362 107 ## 8 4 Op 3 31/0.000 + CDS 5447 - 6526 1132 ## COG0342 Preprotein translocase subunit SecD 9 4 Op 4 . + CDS 7466 - 8461 979 ## COG0341 Preprotein translocase subunit SecF + Term 8478 - 8529 16.5 + Prom 8524 - 8583 4.9 10 5 Tu 1 . + CDS 8625 - 10001 1275 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Term 10035 - 10080 9.4 + Prom 10147 - 10206 4.9 11 6 Op 1 . + CDS 10302 - 11153 933 ## COG1408 Predicted phosphohydrolases 12 6 Op 2 21/0.000 + CDS 11236 - 12108 1097 ## COG1354 Uncharacterized conserved protein 13 6 Op 3 . + CDS 12044 - 12715 925 ## COG1386 Predicted transcriptional regulator containing the HTH domain 14 7 Op 1 . + CDS 13688 - 14491 885 ## Closa_2506 TPR repeat-containing protein 15 7 Op 2 . 
+ CDS 14571 - 15047 204 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 Predicted protein(s) >gi|229783976|gb|GG667759.1| GENE 1 3 - 554 486 183 aa, chain + ## HITS:1 COG:SP1633 KEGG:ns NR:ns ## COG: SP1633 COG0745 # Protein_GI_number: 15901469 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 1 183 41 223 225 216 54.0 2e-56 FAAFEPHLVLMDILLPYYNGCYWCGEIRKLSNVPVIFLSSASDNMNIIMAVSMGGDDFIA KPFDLNVMTAKVQAMLRRTYDFAGDHHLISHGGAVLNVNDATLTYQEEKVELTRNEFRIL KLLMEQAGRTVTREEVMTRLWETDSYVDDNTLTVNVNRLRRKLDGAGLTDFIVTKKGTGY MVK >gi|229783976|gb|GG667759.1| GENE 2 551 - 1585 889 344 aa, chain + ## HITS:1 COG:SP1632 KEGG:ns NR:ns ## COG: SP1632 COG0642 # Protein_GI_number: 15901468 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Streptococcus pneumoniae TIGR4 # 103 336 88 316 324 184 42.0 2e-46 MKGREKLPLVRTYLKRCRAVIAACVCGGILCASVFWLYRLPVEAVLYACILSAVPGLGLA VRSFYSYCRKCRDLERMLAWKTGSPEFAAPASDQAELLYRKLLVSSYEERERLLQEEEEK RTEMKDYYTMWVHQVKTPIAAMNLILQTGEGPDHGRLSTELFKIEQYVEMVLQYLRADSP SADLVIREYQLDSIVKKAIHKYAPVFIRKKTALDYRPAVCLVRTDEKWLLFVIEQLLSNA LKYTERGRITIRLEEERETENGGELRTLIVEDTGRGIAPEDIPRVFEKGYTGCNGRVNQG SSGIGLYLCKKVMTRLAHEMVIESQPGVGTKVKLRLDMTKFTPE >gi|229783976|gb|GG667759.1| GENE 3 1630 - 1767 96 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRSGMEECIWCGRTAGNWHVKERILYNVVKPTRENTLSAAETGIF >gi|229783976|gb|GG667759.1| GENE 4 1721 - 1936 197 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624195|ref|ZP_06117130.1| ## NR: gi|266624195|ref|ZP_06117130.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 71 1 71 71 119 100.0 5e-26 MYSSFEQRARQRGIDTIHEFETTHGRPGKSRTPISGYNPKKAMKQAEKAYLESIGIPKKS LFQRLKEYFHA >gi|229783976|gb|GG667759.1| GENE 5 2125 - 2778 268 218 aa, 
chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 3 210 11 227 329 107 32 4e-23 MPLLEVNQLKKVYTTRFGGNQINALNGVSFTVEPEEYVAIMGESGSGKTTLLNILAALDR PSGGTVLLNGRDMKTIPEGELAAFRRDHLGFVFQDFNLLDTFSLEDNIYLPMVLAKKSYG EMKKKADAVVGKLGIRELMKKYPYEVSGGQKQRAAVARALVMDPAIILADEPTGALDSRS SDQLLDLFEGINREGQTILMVTHSVKAASHAKRVLFIK >gi|229783976|gb|GG667759.1| GENE 6 3808 - 5235 1364 475 aa, chain + ## HITS:1 COG:SP0913 KEGG:ns NR:ns ## COG: SP0913 COG0577 # Protein_GI_number: 15900794 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 5 474 189 661 662 215 30.0 2e-55 MLAGKQTGEREPKAPWLFAVCGMVSLAAGYWIAISTERSAMLIIAFFIAVMFVIAGTCLL FMAGSITLLKLLKKNKGYYYQTRHFISVSGMIYRMKQNAAGLSVICLLSTAVLVMISSTI SLYAGLDDVMRNRFKRDFVVTVRDYSDETAAKTNRIVENALKEAGETAKELISYRYVSFA AVETGDGYEVREENLADYSSLDYLFFMTLDDYNRITEQNVGLDENQVMVLSDGMHYDRET FSIFGKQYEVVGKKGQAVEDGMDAMGSVGSLHIVLPDFSQLKAIEKQQAEAYGDAKSSPI YYVAFDAGDGREGSERLYQILEEQLSGLGAAYRLESAAMERGSFFSLYGGLFFVGLFLGT LFIMATVLIIYYKQISEGYEDRKRYEIMQNVGLSAREIKAAVRSQVLTMFFLPLVTAAVH ICFAFPMIARLLALLNLRNVGLFAASTAAAVLVFAVFYGVIYGVTSKKYYEIVKR >gi|229783976|gb|GG667759.1| GENE 7 5243 - 5362 107 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKILMGIMGSISVIILILFIQEIRNELEEERWWRNRGKS >gi|229783976|gb|GG667759.1| GENE 8 5447 - 6526 1132 359 aa, chain + ## HITS:1 COG:CAC2278 KEGG:ns NR:ns ## COG: CAC2278 COG0342 # Protein_GI_number: 15895546 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Clostridium acetobutylicum # 2 357 3 352 417 177 34.0 4e-44 MKPGKKTLGILTVIVILAVYTACFGLGKDIKGAGDMRFGIDIRGGVEAVFEPVGLDRNPA PNEIEAARNIIETRLDDKNITDREVTTDNEKGYVIVRFPWKSDEKDFKPEAAIAELGDMA NLTFRDENGTVMLEGRDVSTSRPVEYSDQTGMKKYEVELQFTKEGADKFADATGKLVGRR MGIYMDEELISNPVVQDKITGGNAVINGMETYAAAKNLSDKINAGALPYAMETKNFSTIS 
PSLGSNALNIMIFSGVLAFICICIFMIAMYRLPGLVACFGLVLQTSIQLLAISIPQYTLT LPGIAGIILTIGMAVDANIIISERIGEELGSGASLTEAVKRGYKRAFSSVLDGNLTSAS >gi|229783976|gb|GG667759.1| GENE 9 7466 - 8461 979 331 aa, chain + ## HITS:1 COG:CAC2277 KEGG:ns NR:ns ## COG: CAC2277 COG0341 # Protein_GI_number: 15895545 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecF # Organism: Clostridium acetobutylicum # 28 331 4 291 300 149 34.0 1e-35 MLLSIIREGKFHDLKWYKVKKNQKIYPIMKRKYIWFAISICMLLAGAVGFVKNGVSLDTQ FTGGTTLKYEIEGAADVNSDAVAAKISGETGRNVNVRLTKEDGLGKQFMVLTFAGQEGIS PESQEYVTQILNSYQDGFTAALSEAYAVQPYMGQKTLHNAVIAMVLAAVLITVYVWIRFS ILSGLSAGVSAVIALIHDVLIVLCAFLVFAIPLNDAFVAVVLTIIGYSINDTIVIYDRIR ENMNLHKCSSIDELVDTSVSQSLGRSLHTSLTTCISVLILVIASWHFGIQSIFEFSLPLL FGLVSGCFSSICVASVLWAMWKNSRRAESKK >gi|229783976|gb|GG667759.1| GENE 10 8625 - 10001 1275 458 aa, chain + ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 101 376 30 328 425 208 41.0 2e-53 MIKRFNRIWSIAISLSLLASVPAFGDIPIVPPIGEGANTQTLGPETAGPGGGNSQNSQGT SAGPGAAAGSGTSAGPGTSAGPGASQGSGTGGSVVNTVKQPEIQAQGAVLMDAATGNLLY SKNAETKYYPASITKLMTALLVAEKCSLDDTVTFSKTAVTNLESGAVTLGLVEGDKLTVR QCLYGLLLKSANEVANGLAEHTAGSISAFADMMNARAKELGCTNTNFVNPNGLNDSNHYT TPHDMALIARAAFQNATVCKVASTLSYQIPATQKAAARTISLGHKMLYPNDSRYYQGVIG GKTGYTSLAGNTLVTCAERDGVRLIAVIMKSKSTQYTDTSALLDYGFALKSGGSQTAKWQ QEGEKWYFVKSDGNRAANEWMTIDGADYWFDSDGVMATGWRHYSNDAWYYFKSSGAMASS CWVETDGQWFYLGADGVMMKNTVTPDGYTLDASGVWVQ >gi|229783976|gb|GG667759.1| GENE 11 10302 - 11153 933 283 aa, chain + ## HITS:1 COG:CAC2775 KEGG:ns NR:ns ## COG: CAC2775 COG1408 # Protein_GI_number: 15896030 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 44 283 45 287 287 148 34.0 1e-35 
MWLAAGVLSAAAGAGCLLRSQYEREQLSVERTEILTEKITRDCRMVFLSDLHDHEFGDHN ERLIEAIDGVKPDVVLIGGDMMVSRESADLRVSLALVGELTKRYPVYYGNGNHETRMNWE RKNFGLQYDIYVRKLKDMGVVLLSDRTEDFRDDIAVTGIDLAVRYYKKFHPGHLRPEEIE RLAGPAKREKFQILLCHSPLFFDACRKWGADLTLSGHFHGGTIRLPYLGGVMTPQFQFFL PWCAGTFEENGKTMIVSRGLGTHSINIRLNNKPQLVVVDLKKK >gi|229783976|gb|GG667759.1| GENE 12 11236 - 12108 1097 290 aa, chain + ## HITS:1 COG:CAC2061 KEGG:ns NR:ns ## COG: CAC2061 COG1354 # Protein_GI_number: 15895331 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 244 1 243 249 149 40.0 7e-36 MGISYKLENFEGPLDLLLHLIEKNKVNIYDIPIVTITEQYLDYVSHMETEDLNVVSEFLV MAATLIDIKSRMLLPREVNEDGEEEDPRAELVVRLLEYKYYKYMAFELKEKEVGADRLLF KGPTVPPEVAKYEPPVDLDGLLDGLTLAKLQEIFKSVMKRTEDRIDPVRSRFGNITKEPV SLEDKITSVLQYARIHRRFSFRGMLEKQADKTEVVVTFLALLELMKIGKIHLTQEHLFDD MLIETLEPEGEEEDLDLSELEEDIEGYREDEEEVGGAGVAADGGTERDYE >gi|229783976|gb|GG667759.1| GENE 13 12044 - 12715 925 223 aa, chain + ## HITS:1 COG:CAC2060 KEGG:ns NR:ns ## COG: CAC2060 COG1386 # Protein_GI_number: 15895330 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing the HTH domain # Organism: Clostridium acetobutylicum # 15 208 10 202 202 126 37.0 2e-29 MKKKWEAQESLPMEELREITNEEREYEAVIEAVLFTMGNSVEVRQLAAAIGQDEETARSA VKRLQKQYEAGRKGMQIIELENSYQMCTRPEYYENLIRVAATPKKQVLTDVVLETLSIIA YKQPVTKMEIEKIRGVKSDHAVNRLVEYNLVYEVGRLDAPGRPALFATTEEFLRRFGIGS TQNLPTPDPETAAEIRMEVEEELKGSGELLPEDEVLLDEILAS >gi|229783976|gb|GG667759.1| GENE 14 13688 - 14491 885 267 aa, chain + ## HITS:1 COG:no KEGG:Closa_2506 NR:ns ## KEGG: Closa_2506 # Name: not_defined # Def: TPR repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 6 267 2 259 259 275 58.0 2e-72 MNSYSRVKTGYNKKKIKVLILTAGLCMGLTVGCGKSENKYTYRDAGIEALNAGDYESAIA SFDEAINASNALVGAFDIDVLKYRAEAEYRAADYQAASDTYGILIQVDQERPEYLNMRCV SRAQNGDLNGALEDYKRASELDTEKNAPGRAEALLAAGAALENQGAAAEAMALYESAKAE GIESAELYNRMGLCKFGEKDYETAITCFAQGLLKPDAASVPDLIYNQAVAYEYKGDFDKA 
RELMEQYVAGNSSDEEAARELEFLKSR >gi|229783976|gb|GG667759.1| GENE 15 14571 - 15047 204 158 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. AzwK-3b] # 27 141 21 142 425 83 41 1e-15 MADLVELKEIEERVILVAVSTEEGSLAASSLDELEELVETAGAVTVDKMIQNRERIHPGT YLGKGKIEEVKERVWELNATGIVCDDELSPAQLRNLEDALDTKVMDRTMVILDIFASRAT TREGKIQVELAQLKYRAARLVGLRNSNPFGSNVGTERV Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:38:50 2011 Seq name: gi|229783975|gb|GG667760.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld153, whole genome shotgun sequence Length of sequence - 16097 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 9, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 53 - 541 530 ## COG2606 Uncharacterized conserved protein - Prom 586 - 645 7.2 + Prom 578 - 637 9.2 2 2 Op 1 . + CDS 666 - 2726 2019 ## COG0550 Topoisomerase IA 3 2 Op 2 . + CDS 2740 - 3420 786 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 4 2 Op 3 . + CDS 3443 - 4774 1444 ## COG0733 Na+-dependent transporters of the SNF family + Term 4811 - 4861 7.1 - Term 4793 - 4855 17.8 5 3 Op 1 5/0.000 - CDS 4871 - 5512 729 ## COG0035 Uracil phosphoribosyltransferase 6 3 Op 2 . - CDS 5531 - 6856 851 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 7 3 Op 3 . - CDS 6934 - 7806 751 ## Closa_1881 hypothetical protein - Prom 7836 - 7895 5.3 8 4 Tu 1 . - CDS 8026 - 8310 233 ## COG1396 Predicted transcriptional regulators - Prom 8451 - 8510 8.0 + Prom 8356 - 8415 8.8 9 5 Tu 1 . + CDS 8466 - 8699 240 ## gi|266624215|ref|ZP_06117150.1| conserved hypothetical protein + Term 8728 - 8764 4.2 - Term 8709 - 8759 9.3 10 6 Tu 1 . 
- CDS 8816 - 9028 242 ## CD0586 hypothetical protein - Prom 9149 - 9208 80.4 11 7 Tu 1 . + CDS 10086 - 10385 197 ## PROTEIN SUPPORTED gi|50365462|ref|YP_053887.1| acetyltransferase of 30S ribosomal protein L7 - TRNA 10472 - 10554 63.1 # Leu CAA 0 0 + Prom 10602 - 10661 7.8 12 8 Op 1 . + CDS 10687 - 11034 452 ## Closa_1882 transmembrane anti-sigma factor 13 8 Op 2 . + CDS 11088 - 11888 1086 ## COG0258 5'-3' exonuclease (including N-terminal domain of PolI) 14 9 Op 1 4/0.000 + CDS 12867 - 14600 2013 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 15 9 Op 2 . + CDS 14637 - 15227 440 ## COG0237 Dephospho-CoA kinase 16 9 Op 3 . + CDS 15224 - 16021 794 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I Predicted protein(s) >gi|229783975|gb|GG667760.1| GENE 1 53 - 541 530 162 aa, chain - ## HITS:1 COG:BS_ywhH KEGG:ns NR:ns ## COG: BS_ywhH COG2606 # Protein_GI_number: 16080800 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 158 1 157 157 163 54.0 1e-40 MSIAKVKEYLRDYGRDCDVQEFPVSSATVELAAKALSVEPARIAKTLSFHGDETNGCILI VTAGDMKIDNAKFKKQFGFKARMLGAEEVAMLTGHEVGGVCPFANPEHVTTYLDISMQRF PSVFPAAGSASSAIELTCEELFKYSQAREWIDVCKAKEVQAV >gi|229783975|gb|GG667760.1| GENE 2 666 - 2726 2019 686 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 655 1 587 709 365 36.0 1e-100 MSKSLYIAEKPSVAQEFANALKINGRRSDGYIESEQAIVTWCVGHLVTMSYPECYDPKYK RWSLETLPFLPKEFKYEVIPSVEKQFKIVSGLLNRPDVDTIYVCTDSGREGEYIYRLVAQ MADVKDKAQKRVWIDSQTEEEILRGIREAKDEKEYDNLSASAYLRAKEDYLMGINFSRLL TLKYGQNISNYLRTGKNTVISVGRVMTCVMGMAVRREREIRDFVKTPFYRVIGTFSPQDM DFDGEWRSAPGSKYFESPYLYKENGFKERKTAEALIESLSAVQPMNAVLQAIEKKKETKN PPLLYNLAELQNDCSKMFKISPDETLKVVQELYEKKLVTYPRTDARVLSTAVAKEIYKNI GGLKNFEPLAGFAAEALESGSFKGIAKTRYVNDKQITDHYAVIPTGQGMGALRSLSPLSA 
KVYEVIARRFLSIFYPPAIYQKYALELVIDKEHFFTNFRVMLEAGYLKVADVPGSRKKPD TKGKDGQAENASGDSDEGNTDSAQKNMDNPEFIAMLGTLKKGAVIPLKSLAIKEGETSPP KRYTSGSMILAMENAGQLIEDDELRAQIKGSGIGTSATRAEILKKLFNIKYLALNQKTQV ITPTLLGEMIFDVVNASIRQLLNPELTASWEKGLTYVAEGSITSDEYMQKLENFVAGRTA NVLRLNNQYGLRGAFDAAAANYKKTK >gi|229783975|gb|GG667760.1| GENE 3 2740 - 3420 786 226 aa, chain + ## HITS:1 COG:APE1202 KEGG:ns NR:ns ## COG: APE1202 COG1208 # Protein_GI_number: 14601247 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Aeropyrum pernix # 57 209 202 349 360 102 33.0 8e-22 MKKEDMIIRDLFDTSHTIAKLLFERNTYPWEILPKIGEYITSLGELLPEDEYNRMGKSIW IHKSVKMPPTACLADNVIICKGASIRHCAFIRGNVIVGEGAVVGNSCELKNVILFDGVQV PHYNYVGDSILGYKAHMGAGSITSNVKSDKTLVKVHAEDGDIETGLKKFGAILGDFVEVG CGSVLNPGTVIGPKSNIYPLSSVRGCVDGGCIYKKQGEVVEKREAE >gi|229783975|gb|GG667760.1| GENE 4 3443 - 4774 1444 443 aa, chain + ## HITS:1 COG:FN1989 KEGG:ns NR:ns ## COG: FN1989 COG0733 # Protein_GI_number: 19705285 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Fusobacterium nucleatum # 13 443 7 438 438 397 53.0 1e-110 MSEKTHTAGKNGKFGSRLGFILASVGSAVGMGNIWMFPYRVGQYGGAAFLIVYFGFILLF GLVGLSGEFAFGRLTGTGPIGSYGYALKTRGKKGGALLGAIPLIGSLGIAIGYAIIVGWV LRYLFGSLSGVMMGMEAEVYFSQATGHFGSVPWHFMVVAITMAVLVGGVLKGIEKVNRIM MPAFFILFIIIAVRVFFLDGALAGYEYLLIPRWELLLKPETWVMAMGQAFFSLSITGSGM IIYGSYLDKKEDIPKATMTTALLDTVAALLAGFAIIPAVFAFQMDPASGPPLMFITLPKV FAQIPAGRWVAVLFFLSVLFAGITSLVNMFEVCAEAVQTHLHFSRKKAVFMVGSLVFLVG LFIEYEPYMGSWMDMITIYVVPFGAVLCAIMIYWVLGINRIGEELNTGRKKPVGKLFCFA AKYVYVGLAVLVFIMSIVYKGIG >gi|229783975|gb|GG667760.1| GENE 5 4871 - 5512 729 213 aa, chain - ## HITS:1 COG:SP0745 KEGG:ns NR:ns ## COG: SP0745 COG0035 # Protein_GI_number: 15900640 # Func_class: F Nucleotide transport and 
metabolism # Function: Uracil phosphoribosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 8 213 4 209 209 271 63.0 6e-73 MEFPIENVTVFSHPLIQHKISILRDKRTGTNEFRALIEEIAMLMGYEALRDLPLKEVEVE TPIETCMTPMIAGKKLAVVPILRAGLGMVNGILALVPSAKVGHIGLYRDEVTHEPHEYYC KLPNPIEQRTIVVTDPMLATGGSAVAAVDFIKEHGGKNIKFMAIIAAPEGLERLHKAHPD IQIYVGHLDRQLNENAYICPGLGDAGDRIFGTK >gi|229783975|gb|GG667760.1| GENE 6 5531 - 6856 851 441 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 14 438 19 424 447 332 42 1e-90 MSEKTQLTTRDYLLSIQHLFAMFGATVLVPLLTGLNPSLALFSAGVGTLIFHCCTKFKVP VFLGSSFAFLAALSAIIRPEGAVVAENIPLAQGGIIFAGLIYLLFALIAFLIGADKIKKV FPPVVTGPVIVVIGINLASTAIGDATGGLSLADGLTSEIALNLGIALFTLFVVILCSIFA HGFFKLVPILIGIAAGYILCIILGAVGVFHMDYSAITSAAWINIPFVTKDMNGVPFMSLP KFELGAILSIAPIACVTFMEHIGDVTTNGTVVGKDFLKDPGLHRTLMGDGLATLFAGFVG GPANTTYSENTGVLATTKNYNPRLLRVTAIFAIILGLFGKVGAILQTIPGPVKGGVEVML FGMIAAVGIRSLAESDLDFTHSRNLTIVGLILVFGLGFAQLGGLNIHAGSITLNISGLFI AVVIGVLMNIILPQTADTVKN >gi|229783975|gb|GG667760.1| GENE 7 6934 - 7806 751 290 aa, chain - ## HITS:1 COG:no KEGG:Closa_1881 NR:ns ## KEGG: Closa_1881 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 290 1 280 280 241 46.0 3e-62 MKRYTVILTLSLLCALSLAGCGKKNQDKTDLTSTHTTAATETMSPTTTAAETKETEGSDK EAADGESEKDDSKKTDNSASVSTSIHTYTTGNISIEYPVVSNLSKAAVQETVNKLLNEHA LEFIKAYGVDESKDSLTVKCRVVSADRRRLTAVYTGSYMPDGGAHPVSLFYTNTIDTALG EDMGLTDYADPYTLAGYVLSADCQFADADATLEKELMKIKNDTDIETYTEMFRRADFPWK EPKTTSKGSDEAVKFPEVFSYEDQGTIYVSIPVPHALGDFALIKYTPDTK >gi|229783975|gb|GG667760.1| GENE 8 8026 - 8310 233 94 aa, chain - ## HITS:1 COG:L80045 KEGG:ns NR:ns ## COG: L80045 COG1396 # Protein_GI_number: 15674195 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 4 88 6 90 102 63 38.0 6e-11 MVDFGNTLKTLRLQENMTQAQLSQKLGLTKSVISAYETGLRLPSYDVLIHIAQIFKVSTD YLLGVEHKDSLDLSGLTQPEKSALLMLIKAMKNK >gi|229783975|gb|GG667760.1| GENE 9 8466 
- 8699 240 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624215|ref|ZP_06117150.1| ## NR: gi|266624215|ref|ZP_06117150.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 2 77 1 76 76 144 100.0 2e-33 MMKKHIYNLEFTEEELQMIYLVFSRLETKCFAKWRALWPDDGERASSDCTKQWLILGKVE RIKDRVAKPLGLSFRKE >gi|229783975|gb|GG667760.1| GENE 10 8816 - 9028 242 70 aa, chain - ## HITS:1 COG:no KEGG:CD0586 NR:ns ## KEGG: CD0586 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 12 69 1 58 59 102 82.0 6e-21 MTAQNLVEDWNMGKDRFEKVYAQGTLNVTEIWVDKETGVNYVFHCGGNAGGLTPLLDREG KIVISPIVSH >gi|229783975|gb|GG667760.1| GENE 11 10086 - 10385 197 99 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|50365462|ref|YP_053887.1| acetyltransferase of 30S ribosomal protein L7 [Mesoplasma florum L1] # 1 98 73 170 170 80 39 7e-15 DRNIFVGAVSIRHYLNDKLLKTGGHIGDGIRPGERRKGYATAMIALALEECKKLGITRVL MCCDKNNIGSAKSIIKNGGVLENEIEEDGHIEQRYWIQL >gi|229783975|gb|GG667760.1| GENE 12 10687 - 11034 452 115 aa, chain + ## HITS:1 COG:no KEGG:Closa_1882 NR:ns ## KEGG: Closa_1882 # Name: not_defined # Def: transmembrane anti-sigma factor # Organism: C.saccharolyticum # Pathway: not_defined # 1 115 1 115 115 173 80.0 2e-42 MTCKEAESLVMPYIKNELTDEELFEFLEHIEHCPECREELEIYFTVDVGIRQLDSETGNY NIKGALETAIEQSRERLEAVRMVKIMRYAVSTLSVMALIITVLLQCRIWLQAGLI >gi|229783975|gb|GG667760.1| GENE 13 11088 - 11888 1086 266 aa, chain + ## HITS:1 COG:CAC1098_1 KEGG:ns NR:ns ## COG: CAC1098_1 COG0258 # Protein_GI_number: 15894383 # Func_class: L Replication, recombination and repair # Function: 5'-3' exonuclease (including N-terminal domain of PolI) # Organism: Clostridium acetobutylicum # 1 250 13 260 340 233 48.0 2e-61 MSRAFYGVPELTNSEGLHTNAVYGFLNIMFKILEEEQADHVAVAFDLKEPTFRHQMFEQY KGTRKPMPEELHEQVDLMKEVLGAMEVPILTMAGFEADDILGTVAKESQAKGVEVVVVSG DRDLLQLADEHIKIRIPKTSRGGTEIKDYYPEDVKNEYHVTPKEFIDMKALMGDSSDNIP 
GVPSIGEKTAATIIEAYGSIENAYAHIEEIKPPRAKKSLEENYSLAQLSKELAAINTNCG IEFSYDDAKTDSLYTPAAYQYMKLAS >gi|229783975|gb|GG667760.1| GENE 14 12867 - 14600 2013 577 aa, chain + ## HITS:1 COG:CAC1098_2 KEGG:ns NR:ns ## COG: CAC1098_2 COG0749 # Protein_GI_number: 15894383 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Clostridium acetobutylicum # 93 577 65 531 531 459 48.0 1e-129 MVTDFGEAEAVFASCRKGAKIGLELVIEDHELTAMALCTGEEATYCFVPQGFMRAEYLVE KARDLCRTCERVSVLKLKPLLPFLKAESDSPLFDAGVAGYLLNPLKDTYDYDDLARDYLG LTVPSRAGLIGKQSVKMALETDEKKAFTCVCYMGYIAFMSADRLTEELKRTEMYSLFTDI EMPLIYSLFHMEQVGIKAERVRLKEYGDRLKVQIAVLEQKIYEETGETFNINSPKQLGEV LFDHMKLPNGKKTKSGYSTAADVLDKLAPDYPVVQMILDYRQLTKLNSTYAEGLAVYIGP DERIHGTFNQTITATGRISSTEPNLQNIPVRMELGREIRKIFVPEDGYVFIDADYSQIEL RVLAHMSGDERLIGAYRHAEDIHAITASEVFHTPLDEVTPLQRRNAKAVNFGIVYGISSF GLSEGLSISRKEATEYINKYFETYPGVKEFLDRLVADAKETGYAVSMFGRRRPVPELKSA NFMQRSFGERVAMNSPIQGTAADIMKIAMIRVDRALKAKGLKSRIVLQVHDELLIETRKD EVEAVKALLVDEMKHAADLSVSLEVEANVGDSWFDAK >gi|229783975|gb|GG667760.1| GENE 15 14637 - 15227 440 196 aa, chain + ## HITS:1 COG:Rv1631_1 KEGG:ns NR:ns ## COG: Rv1631_1 COG0237 # Protein_GI_number: 15608769 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Mycobacterium tuberculosis H37Rv # 1 181 1 182 230 101 34.0 8e-22 MRVIGLTGGVGAGKSRILDILKTEYGAEIIVADQVAHELMEPGQGGYREVVRALGTSFLN PDGTIDRPLLSALIFHDRNALETMNGIIHPMVWKTIKDKISSSQADLIVVESAIMGREQD DIYDEMWYVYTSEENRIRRLNENRGYSRERSLSIMKNQLSDEEFRELADRVIDNNGTVED VKAGLEAILKDKGRET >gi|229783975|gb|GG667760.1| GENE 16 15224 - 16021 794 265 aa, chain + ## HITS:1 COG:CAC3538 KEGG:ns NR:ns ## COG: CAC3538 COG1235 # Protein_GI_number: 15896774 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Clostridium acetobutylicum # 1 263 1 259 261 209 46.0 3e-54 MRLVSIASGSSGNCIYVGSDHSHILVDAGISNKRIEQGLNEIGLKGSELDGIVITHEHSD 
HTKGLGVLARKYGVPLYGTKETLDEVSKQKYLGEYPRELYHPISPDVDFFIGDLEVKPFS IDHDAANPVAYRIQHDHKSVAVATDMGHYDQYIIDHLQGLDALLLESNHDVNMLQAGPYP YYLKRRILGDHGHLSNENAGRLLCCILHDNLKRILLGHLSKENNYEELAYETVKLEITDG DNPFKASDFSIAVAGRDRMSEIITI Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:39:09 2011 Seq name: gi|229783974|gb|GG667761.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld154, whole genome shotgun sequence Length of sequence - 16333 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 6, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 38/0.000 + CDS 1 - 591 556 ## COG1175 ABC-type sugar transport systems, permease components 2 1 Op 2 . + CDS 572 - 1447 869 ## COG0395 ABC-type sugar transport system, permease component 3 1 Op 3 . + CDS 1458 - 2813 1112 ## COG0534 Na+-driven multidrug efflux pump 4 1 Op 4 . + CDS 2859 - 3977 1011 ## Calhy_2197 alpha-L-rhamnosidase + Prom 4820 - 4879 80.4 5 2 Tu 1 . + CDS 4900 - 6393 1148 ## Calhy_2197 alpha-L-rhamnosidase + Term 6394 - 6449 9.3 - Term 6377 - 6440 15.0 6 3 Op 1 . - CDS 6446 - 6814 207 ## gi|266624229|ref|ZP_06117164.1| peptidoglycan lytic transglycosylase-related protein, glycoside hydrolase family 23 protein 7 3 Op 2 . - CDS 6892 - 7410 551 ## Closa_2357 hypothetical protein 8 3 Op 3 . - CDS 7438 - 7905 598 ## COG0281 Malic enzyme 9 4 Tu 1 . - CDS 8871 - 9434 722 ## COG0281 Malic enzyme - Prom 9461 - 9520 6.9 + Prom 9606 - 9665 6.0 10 5 Tu 1 . + CDS 9791 - 12886 3334 ## Closa_2355 hypothetical protein + Prom 13729 - 13788 80.4 11 6 Op 1 . + CDS 13849 - 14289 453 ## Closa_2355 hypothetical protein 12 6 Op 2 . + CDS 14305 - 15528 1370 ## Closa_2354 hypothetical protein 13 6 Op 3 . 
+ CDS 15525 - 16332 746 ## Closa_2353 hypothetical protein Predicted protein(s) >gi|229783974|gb|GG667761.1| GENE 1 1 - 591 556 196 aa, chain + ## HITS:1 COG:AGc4553 KEGG:ns NR:ns ## COG: AGc4553 COG1175 # Protein_GI_number: 15889771 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 196 120 312 312 106 29.0 3e-23 REGSIFRFIFYSPSMMPMTVAGLLFVFVLSPDDGLLNNLFAVVGLKSLQHAWLSEPGLVL ATLAVVSGFKGSGTIMMMVYTGILGIPESLFESAKLDGANYWKEVKLIILPLIKPTICMV FSMEVMWSFKTYDMVWTMTQGGPGSMSKTAPITMIQNAFTYNKFGYASAIGLVFTVIVLV CITLVRRAMKSEIYEY >gi|229783974|gb|GG667761.1| GENE 2 572 - 1447 869 291 aa, chain + ## HITS:1 COG:SMc02473 KEGG:ns NR:ns ## COG: SMc02473 COG0395 # Protein_GI_number: 15966815 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 22 290 15 281 282 139 30.0 7e-33 MKSMSTDSAAVKRKWNGEDWVRITIRAVLVVFAVIYVYPIIFAGITSFKTLPEFYTNIWA MPQKFLYRNYVDAFYTGKIGEYFFNSVLIAVLSIVLIQLFSTMAAYALTRLKIPHTNLIL LLFFALQILPTETIIIPLYVEMTKLGFMRLQYVPIILAYVGWSIPGSTIILKNFFDTVPQ ELLEAARIDGCGEIRSLFKITLPLMKSAICTVATMNLNFVWGELMWAQISTLLTDKGLPL TVGLLNFKGQISTNWPQLCAAIIIVVLPLYTLFLFTQKYFIAGLTAGGVKG >gi|229783974|gb|GG667761.1| GENE 3 1458 - 2813 1112 451 aa, chain + ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 3 424 2 424 452 180 27.0 5e-45 MKRRLQYDLTKGPVVNGILGFAVPLMISNLIQTTYNAVDMYFVGAYAGTAGVSAVSVCGP VMNVMIVTLSGMSAGVTVVIGSLAGMKGSAEIADAANTVFTLYFLMAAVVTAAGFLFAPQ VLMLVQTPREAYDHALIYLRVMFLGIVFLFGYNLIGALQRGLGDSLSSLRYVVTAAVVNI ALDYLLVGRLSMGTGGAAAATVAAQMVSFFMGLWEFQTGGHVIRLRIRNMGFSRRHLKSI LWFGIPTAFNEVMVNAAMLTVSGVANSFGLAQSAAYGIGSRINSFAIFSDGAMNQTMSSF ASQNIGAGEEKRAMQGLKECLKISAAIAAVTSVIVFAVAPQLSRLFDRDPEVVNWTVRFL 
HVTVWSYVLFALVGPLIGFIRGTGNVTASVAVGFVAQCVFRIPFSFLFGHLYGFPGVGAA VLIGPLSSVVMYSFWVLSGRWRRGIVSVERM >gi|229783974|gb|GG667761.1| GENE 4 2859 - 3977 1011 372 aa, chain + ## HITS:1 COG:no KEGG:Calhy_2197 NR:ns ## KEGG: Calhy_2197 # Name: not_defined # Def: alpha-L-rhamnosidase # Organism: C.hydrothermalis # Pathway: not_defined # 2 371 3 374 888 288 39.0 3e-76 MIKAYHLLCEYEENPVGVATGKPAFSWRMEGDCDPVFQSAFQIVAAIDGAFARIVWDTGQ RMGGQSVHIVYDGSVPLEPAVKYYWKVRLWDQNGEAGPFSDVHCFVTSLISGGEAWAGRW ITAESEADLFTSSGRYMKKEFELSVAEVDAAYLFATAHGIYEVSVNGIRAGDGLLTPGWT EYAKRLLFQMYDVKDALTEGKNTICAHVGPGWYKGDLAGWIHLRGVYGHTTGFNAMLMIR YRDGRKRWIVTDRSWQWCYSPAVYAEIYHGEIWDARLAEETGQKWAPVTETDQPVDTLVP MDGVFVRRKETVAPKRLFRTPNGDLILDFGQNMVGWVAVRVSGEAGDYVELSHAEILDQE GNLYTGNLREAS >gi|229783974|gb|GG667761.1| GENE 5 4900 - 6393 1148 497 aa, chain + ## HITS:1 COG:no KEGG:Calhy_2197 NR:ns ## KEGG: Calhy_2197 # Name: not_defined # Def: alpha-L-rhamnosidase # Organism: C.hydrothermalis # Pathway: not_defined # 1 492 371 882 888 546 50.0 1e-154 MREARQRVKYILKGQETEVFEPHFTWQGFQYVRIDHYPGTPELKDFTGVVIYSDLRQSGF FQCSNPLINRLEENIRWGMKGNFIDIPTDCPQRDERMGWTGDIQVFARTASYLMETAPFF RKWLRDLQAAQLPDGGIPYVVPDILTYQDRDVRDAEANHSASGWGDAATVCPFMAYLYSG DRRLLEESYPMMKRWVDYIKSQAEDGVIWNSGYQYGDWLALDAEPGGRFGKTPNELTATA YYAHSAELVRKAAVILGEDEDAETYGRLSDRIKEAFRNRFYNEAGHLWARTQTAHVLALA FDLTGEEYRQRTIDDLETLIGEWNGLSTGFLGTPHICHVLRDNGRVDAAYRLLTKTEYPS WLYPVTKGATTVWEHWDSIRTDGSFWSDSMNSFNHYAYGCIGDWIYSTVLGIDTSENGPG YRESIVAPVPGGGITWAEGGCHTPYGLLEVKWSVEAGNFLLHVTVPANTACRITLPGSGE SHSVGSGKYEFQCRIES >gi|229783974|gb|GG667761.1| GENE 6 6446 - 6814 207 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624229|ref|ZP_06117164.1| ## NR: gi|266624229|ref|ZP_06117164.1| peptidoglycan lytic transglycosylase-related protein, glycoside hydrolase family 23 protein [Clostridium hathewayi DSM 13479] peptidoglycan lytic transglycosylase-related protein, glycoside hydrolase family 23 protein [Clostridium hathewayi DSM 13479] # 1 122 1 122 122 142 100.0 1e-32 
MQIKKQRSIIQEIRDEYQAELKRLKRRKFLLKARIILTVILPVFIILLSVKVIKTFIRIK VRSLFAKPQQTQQEKKTPPKPAVPVPAEMKPVEKEIVKPVPVETAAVAETGSVELVPIES VG >gi|229783974|gb|GG667761.1| GENE 7 6892 - 7410 551 172 aa, chain - ## HITS:1 COG:no KEGG:Closa_2357 NR:ns ## KEGG: Closa_2357 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 171 2 172 174 308 92.0 8e-83 MSEPMTLYKLMILYMLKQVNFPLTSAQLSNFFLDHEYTTYFTLNQALNELLEAKLLHVET NHNSSRYEITREGEETLSFFGNKISSAIIEDIDEYLKENKFRLRNEVGVTADFYKSTSQD YIVHCEVREGKSCLIDLQLAVPDKDQAEIMCDHWKAKSQEIYAFAMKALMSD >gi|229783974|gb|GG667761.1| GENE 8 7438 - 7905 598 155 aa, chain - ## HITS:1 COG:BH3168 KEGG:ns NR:ns ## COG: BH3168 COG0281 # Protein_GI_number: 15615730 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Bacillus halodurans # 1 140 232 371 410 168 62.0 4e-42 MQEKMVEVTNLEGASGSLADAMKGADIFVGVSAPGIVTKEMAASMNKDAILFAMANPVPE IMPDLAKEAGAKIIGTGRSDFPNQVNNVVVFPGIFKGALEGRATAITEEMKLAAAAAIAG LVDESILDENHILPEAFDPRVADVVSRAVKDHIQK >gi|229783974|gb|GG667761.1| GENE 9 8871 - 9434 722 187 aa, chain - ## HITS:1 COG:BH3168 KEGG:ns NR:ns ## COG: BH3168 COG0281 # Protein_GI_number: 15615730 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Bacillus halodurans # 2 186 3 187 410 245 65.0 4e-65 MTTNEKALLLHEEWNGKLETTSKAEVKSREDLALAYTPGVAEPCKVIAKDPEAAYKYTIK SNTIAVVSDGSAVLGLGNIGPLAAMPVMEGKAVLFKEFGGVNAFPICLDTQDTEEIIKTV VNIAPAFGGINLEDISAPRCFEIEERLKELLPIPVFHDDQHGTAIVVLAGIINALKVTGK KKEECRS >gi|229783974|gb|GG667761.1| GENE 10 9791 - 12886 3334 1031 aa, chain + ## HITS:1 COG:no KEGG:Closa_2355 NR:ns ## KEGG: Closa_2355 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 1029 1 1015 1190 1345 67.0 0 MRERINRLAKGIIDSEVPVIKVTPKETTDVVRSGGATRRELVVTSENNLHIKGLAYSSNF RVRLLNGAFGGTYNHLVYEVNTSFLEDGDVIKGAFYLVTNGGEREVPYSFRVELSTSGKA LSDLKTAEDFAATAKNDYETALRLFEYRDFYEAPFMKDLHVRALYGGLKGRPDRKKELEE 
FLTGLGVKEPVKLSADESERRFDEPAGVIEERITVTRSCWGYVSLEVTADCDFIEIPVRN IGDQDFTDDICEIVYRIHPGRMHQGLNQGRIIITGIEERFVVPVTAMVNPGGGISAEDAY TKEAVAAYLHLRLQKEAGEYEEAVLLVPPDTDEGDVPHNRMLRALDEIEEMKGQLEAIKL LRAEIYLSDGRESQAALLMDECREKILAERQEKLELYCYYQYLRLILQPDEYQKESLVRL LRKYLEENPDLHNLFLILLKLDRRMSENPGGALSMMRRQFEAGMRSPFLYLHACRLLSEE PDLIRSIGPFELQVLSFGARYGAAGKEFALAAARLTLTSRHFNRLHYRALGRLYEKYPEK EILEAVCSILIRGESRSAESFKWYEKGIKEGISLTRLYEYYLYALPKDYRYLLPKEVLLY FSYGGNELDRHSRALLYKNVLVYLKESDPLYEAYERDIEKFATEQLFESRIGSNLAVIYE HMIYKDVIDLPMAKVLPSILRSYRVECKNVKMKYVIVCYEELQEEDAFLLDDGVAYVPLF SEHSILLFQDAFGNRYSTVPYEKTPVMDKPDLEARCFELYPDHPMLRLRACDEVLKRGIA DEKEAEMLEDAMEDGRLQPLYQRLLLSRIIEYYQHIPLQAADEDSSGGDTYLVRLDKRLL TKEERQGICETLISRNYIEEAFDMIREFGCEDIQSKRLLKLCTRMILKNLFDEDPLLLHL SYRVFLEGMSDSVILDYLCEHYNGLTNQMYRVLLQGISEHVETYDLEERLVAQMMFSGCT ETIDRVFSLYMNKKKTSDNIVRAYFTIKSSEYFLEDKTPDDKVFAYLEGAVNSSADKERI PTIYLLALTTS >gi|229783974|gb|GG667761.1| GENE 11 13849 - 14289 453 146 aa, chain + ## HITS:1 COG:no KEGG:Closa_2355 NR:ns ## KEGG: Closa_2355 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 145 1045 1189 1190 214 72.0 9e-55 MFPYFKTLAKYVPMPEDIMDKAMIQYHSDRNARIDFEVRILPDEEEYHCEDIGRTYQGIF VKKKVLFEGEIMEYRISELEDGQWVLKKEGSVSCDAVSAAGDTESRFACLNEMSLCLSLK DEEGLKKRMREYLTKNAAAEELFPLM >gi|229783974|gb|GG667761.1| GENE 12 14305 - 15528 1370 407 aa, chain + ## HITS:1 COG:no KEGG:Closa_2354 NR:ns ## KEGG: Closa_2354 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 407 1 407 407 700 81.0 0 MDGLIIGLDLCDAYTRLCCHEREKTWCLPTVICRKKDDDEWYVGEEAYAYTLVGEGIIVD KLIKMVMKDGTATLGGVKYEGTTLLEIYLKKILALAKEEFFSDEVKELVITLEKVEVKLM DALMYCADYLEIPRERVHIISHTEGFVYYVLSQKKEVWNNQVAVFDLSEDSLHYHELKVQ RGMKQMTVVAEDEALEESFNLDILNTPSGGKLADKILSSCAERLLSKKLFSSIFLTGKGF ERQDWAGDFMKLLCSRRKVYMDQELFARGAAFKGMDYLHEKTSYPFVCVCEGRLHSTISM KVLHKNRESQLIVAAAGDNWYESRSSVDLIVDNQDYVEFLVTPMDPKQKKLVKIPLEGFP KRPARTTRVGISVGFLDEKTMAVVLKDKGFGELFPASDAVVRQEVMI >gi|229783974|gb|GG667761.1| GENE 13 
15525 - 16332 746 269 aa, chain + ## HITS:1 COG:no KEGG:Closa_2353 NR:ns ## KEGG: Closa_2353 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 269 1 269 271 354 65.0 2e-96 MSLILCRQEPVKHPFYFEGLGVHLYSSQELCYVIYQNPLLVLDHFVDEHLIEFIRDELEM GFMAAKLEKWQQSGEDADELLFLILTECDYYNAAEIKHFRQKIETYRKMSPHEFAKAKAD YLFTRRQYGKAVAEYEGILEMPKESSADDAFYAKIYNNLGAAYARLFSMEKAYQAYQKSF DLAKSGDVLKRIYYLSKWNPNLVLKDRFRTLITEDVKTGWDEEMKNAEEAAEKAESLEKL EELFLKDPIKRMKGAADMVKSWKGEYRNM Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:40:04 2011 Seq name: gi|229783973|gb|GG667762.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld155, whole genome shotgun sequence Length of sequence - 15300 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 4, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 38 - 97 7.2 1 1 Tu 1 . + CDS 265 - 1158 767 ## ELI_1361 hypothetical protein + Prom 2005 - 2064 80.4 2 2 Op 1 . + CDS 2146 - 4302 2276 ## COG0642 Signal transduction histidine kinase 3 2 Op 2 . + CDS 4336 - 4698 517 ## COG0346 Lactoylglutathione lyase and related lyases + Term 4733 - 4780 3.1 + Prom 4776 - 4835 6.8 4 3 Op 1 3/0.000 + CDS 4906 - 5277 611 ## COG3682 Predicted transcriptional regulator 5 3 Op 2 . + CDS 5274 - 6815 1152 ## COG4219 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component + Term 6861 - 6913 15.2 + Prom 6876 - 6935 6.6 6 4 Op 1 . 
+ CDS 6956 - 7033 66 ## 7 4 Op 2 7/0.000 + CDS 7018 - 8253 1385 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 8 4 Op 3 1/0.000 + CDS 8271 - 10133 2035 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Prom 10139 - 10198 2.0 9 4 Op 4 35/0.000 + CDS 10218 - 11648 1607 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 11665 - 11710 4.1 10 4 Op 5 38/0.000 + CDS 11746 - 12666 1033 ## COG1175 ABC-type sugar transport systems, permease components 11 4 Op 6 . + CDS 12679 - 13512 904 ## COG0395 ABC-type sugar transport system, permease component 12 4 Op 7 . + CDS 13542 - 14837 1203 ## COG2271 Sugar phosphate permease 13 4 Op 8 . + CDS 14866 - 15298 408 ## Cphy_3024 phosphotransferase domain-containing protein Predicted protein(s) >gi|229783973|gb|GG667762.1| GENE 1 265 - 1158 767 297 aa, chain + ## HITS:1 COG:no KEGG:ELI_1361 NR:ns ## KEGG: ELI_1361 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 297 10 296 933 338 53.0 2e-91 MAEHFKDIQKIQDILHEAQTGLWAIELEEGKEPRMYADKAMLELLGFAGEPSPEECYRGW YERIDPDYYPMVQNGVQRICADERAEVEYLWHHPLWGQIYVRCGGIRDKNFQRGTCLLGY HQNITNTVMLKQEYDTILEKLNENYKGIILCNLQTGEYRIIKATEKLEAFSEGAADFETF FRNYAESEIVWQNCELFMNAVNPENIRRQFQADCSGTGRADSDHTGAEAIYRTKEKRWRR IKVVPLNQYSEEYPWVIAAMEEQDGEIEKWIDEASAQVAVSQIYTLLISVDCEKTEY >gi|229783973|gb|GG667762.1| GENE 2 2146 - 4302 2276 718 aa, chain + ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 300 703 259 654 676 176 30.0 1e-43 MYDREGGFHAYSYYSALISQNFEDRILLTVRNVDDKYEIQRREAILSNLCECYYSIYLFD YVNNTEEPIWQEDFIQKNKAFPKGALDVYYEKFVKYYVYPEDQEKMRRAASIDFLRQTLS AEQPVYEVDFRRIYPDCIRWVRSRFSIAEIQDGVITKVIFANMDINEQKLTELKEEQQKK LYVEAQNIIKGLSAFYHSVFYVDLAEETFQTFVVREDVAERLNHSDQYCVLFDIYHEHLI 
HEEDRKRFQQEMSLGAIKERIQEGETIYDLEYRRDYSGYYGWMRVHVILAESRNGVPIKI ILASHNIEKEKEQEAQSQKALFAAYEEARNANEAKSNFMAQMSHDIRTPMNAIIGMTAIA MGQVDDRERVKDCLQKISLSSSHLLHLINEVLDMSKIERGQLTLTEAPFSLKNLFHEIGS IVYNESMEKEQTLNFLTDGTTHDNLLGDAGKIRQVLINLINNAIKYTPHGGTITVASREM TSGIPGVGCFVFTVEDNGIGISRDFMDHIFVPFAREDNCHVRKVQGTGLGMSISYGIVRA MQGNIRVESREGEGSRFIVTLNIRILEEEEKEEAVQRAVTGNLGRKLDACLQGLRLLLVE DNELNMEIAETLLRDAGFIVDKAENGYEAFQMFVSSEPGTYDAILMDLQMPVMDGYTAAG EIRGSGHPEAKTVPIIALTANAFAEDISKVLASGMNDHVTKPIDFDRLLAALRKCVCR >gi|229783973|gb|GG667762.1| GENE 3 4336 - 4698 517 120 aa, chain + ## HITS:1 COG:NMA2147 KEGG:ns NR:ns ## COG: NMA2147 COG0346 # Protein_GI_number: 15795018 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Neisseria meningitidis Z2491 # 1 120 1 127 138 66 36.0 1e-11 MKFIFATLHVKNLEESIRFYETVTGMKLARRFPAGPRTEIAFMADGPAEIELICSADAEP FLYGECPSLGLSVENLDEALEHMKELGVEIVRGPVQPNPGTRFFFIHDPDGVNLEIIEKK >gi|229783973|gb|GG667762.1| GENE 4 4906 - 5277 611 123 aa, chain + ## HITS:1 COG:CC1640 KEGG:ns NR:ns ## COG: CC1640 COG3682 # Protein_GI_number: 16125886 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Caulobacter vibrioides # 4 118 24 140 144 62 29.0 2e-10 MDIKLVDSEIKVMDVLWKEGDTMAKHVADVMKKRYNWNVNTTYTLLKRCIKKGAVERIDP NFVCHALVKQEEVQEQETDELLHKVFDGSVDKLFASLLGRKNLSAEQVEKLKQMVEELGG DGK >gi|229783973|gb|GG667762.1| GENE 5 5274 - 6815 1152 513 aa, chain + ## HITS:1 COG:CAC3437 KEGG:ns NR:ns ## COG: CAC3437 COG4219 # Protein_GI_number: 15896678 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component # Organism: Clostridium acetobutylicum # 1 319 9 341 541 104 22.0 5e-22 MSLIQMSVSGAVMILVITAVRAISINRLPKKTFAVLWGLVLVRLLVPFSLPSPFSVYSLA GHSGAGQIGTSPIANVLPVDPVSAGTAAAHNTAVSAGISVWGWTWGIGLVLCILYFTAAC IRCRRKFMVSEPVENDFTIQWLTEYQLRRPVSIRQSGKITAPLTYGILHPVILMPAETDW 
TETMRMQYVLAHEYVHIRRFDAVTKLLLTAALCLHWCNPLVWVMYLLANRDIELSCDERV VRMFGETVKSAYALMLIAMEEKKSGLTPLCSNFSKNAVEERIEAIMKIKKTTVFSIAAAC AVVAVIGSAFATSAAAESIQDLIYVKEDIAVRFLPDPGIYEQYSPYEIGISGDGQSLLYK NQKVKLFTDSNEKEAFYYDEAGTVNLKVERDSSGAVTGVSVMSEKEAQKCRAKFFEEDTD GSNQEPLENGKKYEQYASFGISFRAEEDAMYFKGQKVRVFVDESDGWYPAFWTDETGTAD LAVTRNSSGQITGIETLPEEKAERYLAAAGKDV >gi|229783973|gb|GG667762.1| GENE 6 6956 - 7033 66 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTEAAAEQLKAAAKYDKEEQPCTSY >gi|229783973|gb|GG667762.1| GENE 7 7018 - 8253 1385 411 aa, chain + ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 1 407 1 359 368 139 27.0 8e-33 MYQLLIVDDEEDIRRGMARGIPWKEWGFEVAGQAENGEEAVRMIGETHPDVVLSDIRMPK MDGIELMQYLHENNPEIKIIILSGYNDVEYLNMAIRNQVTEYLLKPTDIDEFETLFHRLK TKLDEEGMKRRELESLKVSAMQNKELSYARVLSNLLDGYVGGEEEYGWKEEMEDAGMKFN CCMMAVLDAQPVGSGERDDHYQLKQRIIRYCNSREMAWEKHFFLNRDRKVVGIITLKEGK AEELEAIGRCIREIQGEIGDIYGLGLLAGLSGLCDDEEHLPAVYGQTVKHVSSIAPDLSA GQKKTGLLVTSIREYLDREYCSNLVSLEATAEHFRKNPAYISKVFKKETGFNFSDYITQK RMAHSKLLLRDMSLKIYEIAEMTGYADASNFIKVFKKHCGMSPNEYRGLHN >gi|229783973|gb|GG667762.1| GENE 8 8271 - 10133 2035 620 aa, chain + ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 229 613 214 574 577 168 31.0 2e-41 MRNIWKRLSSYMNRGLRQQLFFYLVVGVLLPFTVVASVLFIKTRTEMKNQAVGNIRQRAD AIAAQVDELLYNVQLVSDKFAYDVEVEEFLGKDYGSRTIEKQRDIYELNNYFLKTDPLGK SQRISAIYGNNKEVYNFLDPYFQGNDLKRIMVGMGATDRTKLSMFHWQPLQNNFLSRTKK GDVRTDQVITCMRRILHPFTGTWLYTQFFVLEENQIYQLYQASAEEMKGTVYIVDSGGRL VSSSDQDAVEMCRMPERIMELAKEAEEGSHQIQYDDESYITDLSPLNNADWQIFTVVPLS AATEPIDRLFKEIIAAMLVCVTACIAIINWISRRFLQPVEVLDASMKEVYDGNLEAYVEP 
EAYQGELRSMMMYYNAMLVQINHYIKEQVESEKKKKELELEVLMGQINPHFLYNTLENIV WKSNEVGRPDIGRIAAALGRLYRLSIGNGETIVPIRQEVEHVMAYVNIQKNRYKERIEFD STVDYDQLYNYAMIKLTLQPVVENCFMYGMEGIDRVLKIRLDIREESEVIRFRVADNGCG MTKNQLKEVRRQVEAGTVRTETEPGKRRRKGTGIGLYSVKERIAIYTGYQNSVRILSKQG AGTIVIITIPKRSVKDKNNQ >gi|229783973|gb|GG667762.1| GENE 9 10218 - 11648 1607 476 aa, chain + ## HITS:1 COG:PH0753 KEGG:ns NR:ns ## COG: PH0753 COG1653 # Protein_GI_number: 14590623 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pyrococcus horikoshii # 71 332 68 339 464 65 25.0 2e-10 MKRNIKRVVSLSLCAAMAVGSLAGCSSKKAEPADTATTAAAETTTKAEGETAAEGETTEG AAASDKPFEGVTLKYATTQTASTGEENQKLLDLVKEKTGINIEFYVIPNTKAGEVDKTLV SLMAGDEIDLIYNTKPGLKAFYNAGVLEPMNALAEKAGYDIAGTFGDYVPEFDGDVYGLP AFSDIWITLYNKQIFDDAGVDYPTADGWTWEKYIETAKKVTDTSKGIYGSLMLDYDCYNY MYALQKGWEPYKADGTANFDDPLFKESLEFFYGLGNTEKIQPSILEFKASNTPWNAFAST GKYGMFMCGGWVTSILNQFDKYPRDWKVGLLPMPYPEGTDPSTLTVPGCYAVPSTSKNKD AAFAAAACMAENQYTLGMGRVPARIDLTDEEINEYVENSLAKPFEFDGLTVEDFKTAWFD SGRKALSEKIIGYGDAAISQAIIEEGQLYGQGAESIDDAVTKIQDKANRAIEEDKQ >gi|229783973|gb|GG667762.1| GENE 10 11746 - 12666 1033 306 aa, chain + ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 15 306 4 292 292 168 36.0 1e-41 MKSNGNAKVKMKSSIAQKKENLTGALFIAPYLIGYIIFQGLPFLIAVILGFTNVRYISKL EDASFVGLSNFVRMFSDVDTMKALGRTGLYSLMYVPLIMVCGFFIAYMVNKGIHCKNLVR SMFFLPYVSNMVAVAVVFKLLLGPDGPVVYILSALGSENPPYLLYGLNTALPTVVCIAVW KGIGLNFITYLAALQNVSADLLEAAQIDGASKWQRIKNIVIPMISPTTFFLMISSVITSL QNFTTIQALTGGGPGQATTVMSINIIRTAFTNYQTGYASAQALVMFIIVLIITLIQWRGQ QKWVNY >gi|229783973|gb|GG667762.1| GENE 11 12679 - 13512 904 277 aa, chain + ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar 
transport system, permease component # Organism: Listeria innocua # 12 276 20 281 282 190 37.0 3e-48 MLKKDKKMSWGWTIVLIAIGLVCIYPFLFMVSSSFKISGDVLTKPLQLIPDQIILSNYSD LFGDPYYDFGQWYINTIVMTVTTIVIKVFVVSITAYAFSKINFKGRDGIFLVLLTSLMIP SDIMLIPRYVIFKQIHILDTMWSLVLPATFDIYFVFMMRQAFIGIPEEISEAARIDGCNH FKIYSRIILPLAKPSIVTLILFTFVWSWNDYMGPYIFISSLEKQMLSVGIKLFTEGTVTD PALQMSAATLVLMPVLVLFLFSQKYFVEGVNSSAVKG >gi|229783973|gb|GG667762.1| GENE 12 13542 - 14837 1203 431 aa, chain + ## HITS:1 COG:CPn0665 KEGG:ns NR:ns ## COG: CPn0665 COG2271 # Protein_GI_number: 15618575 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Chlamydophila pneumoniae CWL029 # 17 416 34 441 455 92 23.0 2e-18 MGLRFYKITDKKMVRLLFWLCWIVYFSTYIGRLNYSASLTEIITSEGFSKGQAGMIGTAF FFAYGAGQFVSGFLGDRLAPKKMVFTGLLVSGLCNLAMAGVKAPGLMAAVWCVNGLVQAF IWSPMIRLMYEYYETETRMKACVSLNSSVPVGTMAAYGLTALVIWLSGWRAMFVLAGSAL LAVSAFWLLGMNRVERYAVESGEAGTSDRLDGAAEKKPFAAASVNASVSWKSLLIQSGFL FLMMALFVQGALKDGVTTWVPTYISETYGVSAILAITSTMVIPVFNLLGVYLASFANIHW FRDEVRTAGAFFVVSAAAILMLRLTSGQSMAVSFLMLAVATTAMMAVNTMLIAVLPSYFG VIGRASSVSGLLNSSVYAGGAVSTYGIGALSVALGWNATIVIWFLMAAASAVICFLIVRR WIAYRKKILQI >gi|229783973|gb|GG667762.1| GENE 13 14866 - 15298 408 144 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3024 NR:ns ## KEGG: Cphy_3024 # Name: not_defined # Def: phosphotransferase domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 1 144 1 143 331 163 60.0 2e-39 MSNKIYLIPETGTFYKANLHCHTIHSDGRLTPEEVKKAYMEAGYSVVAYTDHRKYQWHRE LMDDHFVALAACEVDINEHFQVPGDFSRVKTYHINLYDAKPEEFSEEKKQIPLPERRYGD YHYINEYIDQMKAYGFFACYNHPY Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:40:20 2011 Seq name: gi|229783972|gb|GG667763.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld156, whole genome shotgun sequence Length of sequence - 17029 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 13, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 
Tu 1 . + CDS 31 - 780 533 ## COG5632 N-acetylmuramoyl-L-alanine amidase + Term 790 - 822 4.1 2 2 Tu 1 . - CDS 1113 - 1370 166 ## gi|266624250|ref|ZP_06117185.1| conserved hypothetical protein - Prom 1391 - 1450 3.0 + Prom 1330 - 1389 7.5 3 3 Tu 1 . + CDS 1429 - 1578 70 ## + Prom 1626 - 1685 7.8 4 4 Tu 1 . + CDS 1892 - 2095 272 ## gi|288871314|ref|ZP_06117186.2| conserved hypothetical protein 5 5 Tu 1 . + CDS 2536 - 3003 222 ## gi|266624252|ref|ZP_06117187.1| conserved hypothetical protein + Prom 3043 - 3102 3.7 6 6 Op 1 . + CDS 3140 - 3979 579 ## COG3646 Uncharacterized phage-encoded protein 7 6 Op 2 . + CDS 3979 - 4194 184 ## gi|266624254|ref|ZP_06117189.1| conserved hypothetical protein + Term 4199 - 4237 5.4 - TRNA 4455 - 4527 63.7 # Thr GGT 0 0 + Prom 4616 - 4675 7.3 8 7 Tu 1 . + CDS 4697 - 6025 1546 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 9 8 Tu 1 . + CDS 7336 - 8211 1157 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) 10 9 Tu 1 . + CDS 9176 - 10297 1191 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 11 10 Tu 1 . + CDS 11338 - 11760 490 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) 12 11 Tu 1 . - CDS 11831 - 11968 85 ## - Prom 12022 - 12081 6.9 + Prom 11953 - 12012 6.7 13 12 Tu 1 . + CDS 12058 - 12927 948 ## Cbei_1987 regulatory protein, LysR + Prom 13829 - 13888 16.7 14 13 Op 1 . + CDS 13927 - 14952 1155 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family 15 13 Op 2 . + CDS 14949 - 15590 434 ## Closa_2139 hypothetical protein 16 13 Op 3 35/0.000 + CDS 15590 - 16603 1136 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 17 13 Op 4 . 
+ CDS 16617 - 17028 407 ## COG1120 ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components Predicted protein(s) >gi|229783972|gb|GG667763.1| GENE 1 31 - 780 533 249 aa, chain + ## HITS:1 COG:BS_yqeE_1 KEGG:ns NR:ns ## COG: BS_yqeE_1 COG5632 # Protein_GI_number: 16079624 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 20 155 23 150 182 94 40.0 1e-19 MLPITKQIKQINCYASQNHPKYIVIHETDNFNKGAGAESHARAHVNGNIATSVHYYVDDV AIYQTLNHTDGAWAVGKQYGTPLVAGVNNNNTINIEICVNPDSNYDKARLNCVDLVKHLI QETGIPADRVIRHYDAKRKWCPRKMMDNPELWTDFCLRIRGQEDEVKSFEDGAGNWHFTV NGELQKARWVKYKNKWFYVDDAGNMVTGYAVIGGMAYMLNPSKADMATYGALLVTKNLGQ GNLEVQHVE >gi|229783972|gb|GG667763.1| GENE 2 1113 - 1370 166 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624250|ref|ZP_06117185.1| ## NR: gi|266624250|ref|ZP_06117185.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 85 1 85 85 146 100.0 5e-34 MKSGKCSVFKQIKNFLTRQREFDARLFDFMQEHPEITPDDEPIEAPPGDFQKIMEEMNKR GIEPAVRKQLKWKILFDRIFRRKKY >gi|229783972|gb|GG667763.1| GENE 3 1429 - 1578 70 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MENNRARPLVPWYEWVYTSARKVPENYRVSLYGQFNAGGQNLVPLYQGY >gi|229783972|gb|GG667763.1| GENE 4 1892 - 2095 272 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871314|ref|ZP_06117186.2| ## NR: gi|288871314|ref|ZP_06117186.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 67 18 84 84 119 100.0 9e-26 MRTRATNTIAGEGRLLDTDELRAYTNLGRNNAMKLGEEIGAKVKIGKRVLWDKVKIDQYL DSLTGAG >gi|229783972|gb|GG667763.1| GENE 5 2536 - 3003 222 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624252|ref|ZP_06117187.1| ## NR: gi|266624252|ref|ZP_06117187.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 155 1 155 155 247 100.0 2e-64 
MGMTKSFRMSDRIENMFNSLKKYDPVGKSDTEMLSKGIELQFELATQTHNLFYRKCIMEY LPTEKLNGLFNFICDMLESLSFSDGYYLEDEMKYFMSTVEADRFFESDESYEETNHQQYY KVLEITLKREEYTEEDVQLLSETMQKYYEEKNKHH >gi|229783972|gb|GG667763.1| GENE 6 3140 - 3979 579 279 aa, chain + ## HITS:1 COG:BS_yoqD_1 KEGG:ns NR:ns ## COG: BS_yoqD_1 COG3646 # Protein_GI_number: 16079126 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Bacillus subtilis # 5 113 8 108 124 84 38.0 2e-16 MKKQIEQSITTLEIAEMMEVPHYEILKKLDGTFNPDGSVKQIGIIPVLTDGKIPVSDYFI ESTYKDASGKENKCYKATKLGCDMLANKFTGEKGITFTARYVKRFHDMEDIIQKGEKQWK MIQTEPKRPPLTAVTSAAKFLTELAGEAGCDRKIQLLAAKSFYNTNGYSYQLEIGADKQY WDTVHIARQVGICVKSSGKPADKAVNEIIRRIGVTEEDYTDTWESKGNWQGTVRKYSDSV IERVKCWIADNDYPEDIDYRQNNGEMKAWHVTYQSREVA >gi|229783972|gb|GG667763.1| GENE 7 3979 - 4194 184 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624254|ref|ZP_06117189.1| ## NR: gi|266624254|ref|ZP_06117189.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 71 1 71 71 139 100.0 9e-32 MDIHEELRINGRQPIQGEAVTELGAFFQAIVTQNVENGFAIEDSIPMLLSQAYTYGVICG KRKERARRKYT >gi|229783972|gb|GG667763.1| GENE 8 4697 - 6025 1546 442 aa, chain + ## HITS:1 COG:BS_pbpD KEGG:ns NR:ns ## COG: BS_pbpD COG0744 # Protein_GI_number: 16080201 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Bacillus subtilis # 19 442 4 414 624 211 32.0 3e-54 MKKKRRKRKSAWAVFFKPIKVIFSIFLILMILGLGAGGIVYLKFGDEISACKERADQIVA QSKREDFSQPSDTRIYDSSGQVIGIINTGHYEYVPITEISENLQNAYIAQEDKRFYQHHG VDIIGLARAGVALVKNKGEITQGGSTITQQVIKNTYLTQEKSFTRKITEIFAAPQMEEKY SKADILEFYCNTNFYANRCYGVAKASLFYFDKEAKDLTVWEAATLAGISNNPSRYNPVAH PEAALEKRNKIIDNMQKNGMLTEEECETAKAQPLTVAKVTEEGTKETYQSSYAVHCAAQE LMKNDGFEFEYTFADEAAYNDYKERYQEAYQEKSDLIRNGGFEIHTSLDSTLQTQLQAGI DETLAKFTDVSENGKYALQGAAVCIDNRTGYVVAISGGRGTDDEYNRGFLSRRQPGSTIK PLIDYAPAFETGYYYPSRMLAS >gi|229783972|gb|GG667763.1| GENE 9 7336 - 8211 1157 
291 aa, chain + ## HITS:1 COG:FN1480 KEGG:ns NR:ns ## COG: FN1480 COG2239 # Protein_GI_number: 19704812 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Fusobacterium nucleatum # 1 291 7 297 449 276 48.0 5e-74 MEELMDLLFTRQFRKLKDILTEMNEVDIATFIEELDSEKTVVVFRMLPKELASDVFACLE VDKQEHIINSITDYELGTIVDDLFVDDAVDMLEELPASVVKRVLKNARPDTRKLINQFLN YPDNSAGSIMTAEYVGLKQSMTVEQAFAYIRKNGVDKETIYTCYVMDEKRRLEGVVTVKD LLMNPYEEVIGNIMDTHVIKAFTTEDQEEVADSFQKYDLLSLPVVDHEERLVGIVTVDDV VDVMEQEATEDFEKMAAMLPSEKPYLKTGVFQLAKNRIAWLLILMVSSMIT >gi|229783972|gb|GG667763.1| GENE 10 9176 - 10297 1191 373 aa, chain + ## HITS:1 COG:alr5324 KEGG:ns NR:ns ## COG: alr5324 COG0744 # Protein_GI_number: 17232816 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Nostoc sp. PCC 7120 # 30 188 478 633 643 95 39.0 2e-19 MLADIGVDYGMSYLDKMHFVGLSYLDNWNTSMSIGGFTEGARVVDMARGYSVLANGGMYS NNTCITGIDSQYEGDLYQENQPEERIFTEDTAYMVTDVLKGTLNQPYGTGYGLGLPIPAA GKTGTTNSSKDTWFCGYTKYYTTAVWVGYDTPKAMPGVYGKTYAGKIWYDFMKQCNEGLP AEDWVMPSTVAYYNVDSRGEKTSQETGKQDLFSSVAEQNARSNSKKWAEEQKMKEENARA AAEQASRDAAAQGQEQAAAGLTALIGELNALAYQDDTTQEKISQGRTMLEMLDGTEQYEA LKQQFDEAAARAGSLPYEEQTVPPSQEIGPGITHEARPGEAPSASQDITSGQAPGVQPGA APGGIEAVGPGGL >gi|229783972|gb|GG667763.1| GENE 11 11338 - 11760 490 140 aa, chain + ## HITS:1 COG:FN1480 KEGG:ns NR:ns ## COG: FN1480 COG2239 # Protein_GI_number: 19704812 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Fusobacterium nucleatum # 5 138 314 447 449 147 56.0 5e-36 MIPLLVTFIPMLTDTGGNAGSQSSTMIIRGMAVGEIEPGDLFKVLWKELRVGVIVGVILG FVNYVRLVILYPGREMLCLTVVLSLMATVIIAKTIGCMLPIAAKVFHMDPAIMAAPLITT IVDAVSLIIYFQLACTLLGL >gi|229783972|gb|GG667763.1| GENE 12 11831 - 11968 85 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEIRGKGVRPSDGGEAEGGGEGTAGSFSPSLRLSSITGAGYWLQK >gi|229783972|gb|GG667763.1| GENE 13 12058 - 12927 
948 289 aa, chain + ## HITS:1 COG:no KEGG:Cbei_1987 NR:ns ## KEGG: Cbei_1987 # Name: not_defined # Def: regulatory protein, LysR # Organism: C.beijerinckii # Pathway: not_defined # 2 287 8 290 309 213 38.0 1e-53 MTTGAVILAAGHKSSTTSFQPMLPVGDTTAVRRIIITLQRAGVEPVVVITGDRGEELEKH ISKMRVVCLRNTEYAGTQLFDSICMGLNYIEDLCDRVLVMPTKAPLLLSETIEKIMESRA PLCCPVYDGRRGHPVMIAKDWIPKILNYQGEYGLRSFLREPEADAVLEEIPVDDRGIIQA VETDEDCAMAMNGRHGKLPLHSRMSLYLECDEVFFGPGIAQFLSLIDHTGSMQTACKQMH MSYSKGWKIMKTAERELGYPLLITQSGGAEGGYSQLTPKTKDFLNRFSS >gi|229783972|gb|GG667763.1| GENE 14 13927 - 14952 1155 341 aa, chain + ## HITS:1 COG:BH1974 KEGG:ns NR:ns ## COG: BH1974 COG1975 # Protein_GI_number: 15614537 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Bacillus halodurans # 19 340 18 330 343 107 30.0 5e-23 MKQLFEEMKQLLEQGDDLVLVTVIAGSGSTPRGAGARMIVRKDGTTMGTIGGGAVEYEAG RMALDVLKSKNSFSKGFKLVKGQVADLGMICGGDVVVYFQYVNPDNREFLAICEEGIQAC ERDEDSWIITDITDETAWAVGIYSESHGFAGITVEGDIEPLTQTKAVQTTVNGRRYYSEP LMRAGTVYVFGGGHVAQELVPVIAHVGFKCVVYDDRPEFACSSLFPDAVDTIVGDFTSIF DKVTIREQDYVVIMTRGHQCDYDVQKQVLRTPAHYIGVMGSRQKLMTLAGKLKAEGFTDQ DIERFYSPIGLAILAETPAEIAISVAGELIMVRARREGRTK >gi|229783972|gb|GG667763.1| GENE 15 14949 - 15590 434 213 aa, chain + ## HITS:1 COG:no KEGG:Closa_2139 NR:ns ## KEGG: Closa_2139 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: Purine metabolism [PATH:csh00230]; Thiamine metabolism [PATH:csh00730]; Metabolic pathways [PATH:csh01100] # 4 210 3 200 212 132 42.0 1e-29 MTERHLFLKGRSGTGKTTWLLSCLKPYWKEAGGFLTQRLIDGTGETVGFCLLPVENMTEP PPVTRLYEPGIEGVFIEKTPEGWRKYPEVFETMGAEILERLLDRPERYRFFYIDEAGGVE LKVKPFMERLERLLDGPVFCIGVLKHSDNLAAMGTRVSLSANAAPDRNRLESKMEQRFHC RILSFAGAEDEKMEIRRMIQQRLAGTAEKECEG >gi|229783972|gb|GG667763.1| GENE 16 15590 - 16603 1136 337 aa, chain + ## HITS:1 COG:MA1199 KEGG:ns NR:ns ## COG: MA1199 COG0609 # Protein_GI_number: 20090065 # Func_class: P Inorganic 
ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 11 336 16 343 347 294 49.0 2e-79 MVEKRRVRMRLTAMTVLFVICFFGSFMLGRYPIEPWTLIRVLASRVIPVTPDWPSQVETV LFNVRLPRVLMAALIGAGLSAAGAAYQGIFKNPMVSPDVLGASSGAGFGAALGLFLSFSY QGVSFLAFISGLSAVAVVCLISGRVKYNQTLGLVLAGMMVSSLFTAAVSFLKLVADPNNT LPVITYWLMGSLASIRPKDLAFAAPWIIGGIIPICLMRWKINVLTLGEEEARCIGVNTGA VRLAVVLCATLITSAAVSVSGLIGWVGLVIPHFARMLVGSDYRKMLPASLLLGASFLVVV DNFARLLATSEIPIGILTAFVGAPFFLWLILREGNKL >gi|229783972|gb|GG667763.1| GENE 17 16617 - 17028 407 137 aa, chain + ## HITS:1 COG:ECs1697 KEGG:ns NR:ns ## COG: ECs1697 COG1120 # Protein_GI_number: 15830951 # Func_class: P Inorganic ion transport and metabolism; H Coenzyme transport and metabolism # Function: ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components # Organism: Escherichia coli O157:H7 # 3 136 4 137 252 116 40.0 1e-26 MGLVVENLTFSYRCFPVLKGVDFSVEAGNLVCMLGKNGAGKSTLFRCILGLLKGYQGKIT VDGVDLRTLKEKDMARKIAYIPQNHDQAFAFSVLDMVMMGTTSSLQGFQNPGKEQREAAF DALKTMRIEAMAERSYA Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:41:08 2011 Seq name: gi|229783971|gb|GG667764.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld157, whole genome shotgun sequence Length of sequence - 13291 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 7, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 992 411 ## COG3344 Retron-type reverse transcriptase - Prom 1171 - 1230 3.5 2 2 Tu 1 . - CDS 1545 - 6677 2143 ## COG4646 DNA methylase - Term 6739 - 6776 8.0 3 3 Tu 1 . - CDS 6803 - 6925 76 ## 4 4 Op 1 . - CDS 7842 - 7997 122 ## gi|167747619|ref|ZP_02419746.1| hypothetical protein ANACAC_02340 5 4 Op 2 . - CDS 7978 - 8499 332 ## gi|266624268|ref|ZP_06117203.1| peptidase M14, carboxypeptidase A 6 4 Op 3 . 
- CDS 8496 - 9056 411 ## gi|266624269|ref|ZP_06117204.1| hypothetical genomic island protein 7 4 Op 4 . - CDS 9138 - 11114 1029 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 11221 - 11280 2.8 8 5 Tu 1 . - CDS 11320 - 11817 396 ## gi|266624271|ref|ZP_06117206.1| hypothetical protein CLOSTHATH_05621 - Prom 11846 - 11905 4.1 - Term 11872 - 11904 0.6 9 6 Op 1 . - CDS 11932 - 12330 242 ## COG1895 Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN 10 6 Op 2 . - CDS 12318 - 12647 379 ## Athe_1643 DNA polymerase beta domain-containing protein region - Prom 12676 - 12735 7.8 11 7 Op 1 . - CDS 12772 - 12915 148 ## gi|288871319|ref|ZP_06410109.1| NUDIX hydrolase 12 7 Op 2 . - CDS 12955 - 13158 178 ## gi|288871320|ref|ZP_06117209.2| hypothetical protein CLOSTHATH_05625 13 7 Op 3 . - CDS 13130 - 13291 69 ## gi|288871092|ref|ZP_06116326.2| hypothetical protein CLOSTHATH_04682 Predicted protein(s) >gi|229783971|gb|GG667764.1| GENE 1 2 - 992 411 330 aa, chain - ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 28 330 255 544 834 222 40.0 6e-58 MRNPENVLISLTKHSNQKDYKYERLYRLLYNEELYLTAYQSIYSNDGSMTKGTDNQTVDG MSVERIRKIIVSLKDESYQPKPARRTYIPKKNGKMRPLGIPSFEDKLLQEVIRMILEAIY EGHFENTSHGFRPNRSCHTALNEIQRTFTGVKWFIEGDIKGFFDNINHDTLIGILRERIN DERFLRLIRKFLNAGYIEKWTFHNTYSGTPQGGIISPILANIYLDKFDKYVNEYVQKFNK GKKRMRTKEYRKNEVELSKARTALKNANGDYERESAIALIRQLEKERVTIPQSDPMDNKY ARLVYVRYADDWLCGVIGSKEDCKKIKEDF >gi|229783971|gb|GG667764.1| GENE 2 1545 - 6677 2143 1710 aa, chain - ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 707 1704 1 1020 1315 535 33.0 1e-151 
MARQKIYHFFELNEGLRDRARYLCQMYEAGDTISTEAGHVFRYETDEAGIQLYWHENGQY LSGYLPFEVAAEEILHLIDRGKYYQPIKTDMEIPQPLVDTVENFVEVADGKVIFDMIYDV MTVNLLPQQETEFVRQSFLALLGSDAEWSFSATLVFHEQDVSCSIDGKETLAYSWSIIRD ILARSIEQDNYPESKNDLPVEKLIEFPWFQRLAVQYQDLLEQAGEEPAVEIPYLSSEAAI EKENTDGEKEEVPDNEKSDILETVFYDVTKAFSLHSMTLIPSKPLIQNFFKQAGDKDKQE FLLRLLELQSKRDYGKSVLNTDQGLVHYTIEKGTVDFQLMHGEKEGHGRFTMQEAARVIQ EAIDAGEYLSPGEYEAAVKEGFSLCPEESYQLYKEMQNRNQSVSQASDFFYPDAWEIPSG DKTKYKSNVAAIQLLKELEQSSRPATPEEQIILARYVGWGGLANAFNPKSPNWTNEYQEL QELLTEEEYEQARASVNSAFYTPPVVAKVIYKALQQFGFEGGSILEPSMGIGNFYSVLPE DMRNSQLYGVELDSISGRIAKQLHPHATIQVKGFEKTKFEKNKFDVVVGNVPFGAYKVFD PEYKKYGFRIHDYFLAKSIDLVRPGGMIAVVTTKFTMDKANSIIRKYIAERADFVGAVRL PGIAFKKDAGAEVTSDIIFLQKKGRLLTANTSEDWMNITYTDDGVPVNEYFSRHPEMMLG KMAFDTHTYGPNSNYTELIVEDEEHFDFEEQLNRAISFLSATYLPEEKKTEGVENNKEEL PDTIPALLEVANNTYTVIEGEIYYRDNDSMVRWKGNETKRKRILGMHAIRQSVRYLIDIQ TRGCTEEQLTEGQAQLNKIYDAYVKEYGYLSSRGNKLAFREDNDYYLLCSLETEDENKNV KKSDIFYKQTIAPQTYIEKADTAMDALQISLAEYGKVHIPYMQSLYPVDREQLLAELKGQ IFLNPVKADETNPNQGWETASEYLSGPVRKKLKTAEMYAQRESQYSANVEALKAVQPEDL SAADISVKLGTAWVELEDYEAFIYELLNTPESYRTGPYAIKVQLNRYTMSYKVTNKSADY SSVVAKQTYGTKRIDAYSIIEALLNQNIITVKDAIESGDSVKYVVNQKETTLAREKATMI KEAFKEWIWKDPDRRKKYVDFYNQNFNDNRLRVYDGSYLTFPGMNPEKKMRPHQRNVIER AIHGSTLLAHAVGAGKTYEMAAICMELKRLKLMNKAMIVVPNHLVGQMASEFLTLYPGAN LLVTRKEDFTKENRRKLTCKIAANNYDAVILGHSQFERISLSADRRAAMLEEQVDRISNA VSELKEEQGAKWGIKDMERQRANLEAQILELRNDAKKDDVLDFEQLGIDALFVDEAHVFK NLSIFTKIRNVAGIATNGSQRAMDMFQKIQYIQEHTGGRNVFLATGTPISNTMCEMYVMQ LYLQSQKLREKGIEHFDSWAANFGEVTTSLEMSPEGGYRMRSRFNKFCNLPELMNMFREV ADIQLPSMLDLNVPKLKGGKYKIVESVASDSVDIMMQELVRRAEQIRNGSVDPSEDNMLK ITNEARLLGTDPRLIDPDAEVDEDGKLYQAAENIYQEYVASQDFKGTQVVFSDIGTPTGK KGFNVYDFLKGELIKKGIPRDEIAFIHDAKNDKQKEELFADMRSGRKRILIGSTSMMGTG TNIQKRLCAAHHIDCPWKPSDIERAPVKAS >gi|229783971|gb|GG667764.1| GENE 3 6803 - 6925 76 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKAADRKINPSDTMEMTRLRNEAAMIAQETVIQEVICKPR >gi|229783971|gb|GG667764.1| GENE 4 7842 - 7997 122 51 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|167747619|ref|ZP_02419746.1| ## NR: gi|167747619|ref|ZP_02419746.1| hypothetical protein ANACAC_02340 [Anaerostipes caccae DSM 14662] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] hypothetical protein ANACAC_02340 [Anaerostipes caccae DSM 14662] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] # 9 49 1 41 2143 84 90.0 2e-15 MLGKAGEYMDQHTMKVSDLYDIAIFDISRDGEQWKKFLEFAARGNLYRLAS >gi|229783971|gb|GG667764.1| GENE 5 7978 - 8499 332 173 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624268|ref|ZP_06117203.1| ## NR: gi|266624268|ref|ZP_06117203.1| peptidase M14, carboxypeptidase A [Clostridium hathewayi DSM 13479] peptidase M14, carboxypeptidase A [Clostridium hathewayi DSM 13479] # 1 173 1 173 173 319 100.0 4e-86 MKKTLKEELEESFQRWDNELYSGGSDPYYSDGVDMNLLRKHIIAYKTQILETGELPEIYH RKTPEELPESFMVKAEKIYQTAIDIFRQCRDDADYQFLCGLELNPKMDRMAEVINALKNV KELEGAIKKQDFVVMRRYYEKPDFKKCRLIVERSSERIEPKIEQMSLFAGESR >gi|229783971|gb|GG667764.1| GENE 6 8496 - 9056 411 186 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624269|ref|ZP_06117204.1| ## NR: gi|266624269|ref|ZP_06117204.1| hypothetical genomic island protein [Clostridium hathewayi DSM 13479] hypothetical genomic island protein [Clostridium hathewayi DSM 13479] # 1 186 1 186 186 334 100.0 2e-90 MDDYAEDNYYYIEKVKTGEEYHVALYKALMGIWEARETFINSLQEGITYRKLELGIEKDG EKIPFFKCDDGKFYIPEIKARKKLEGVDTLLSEFRRRFGIREGENQGSHMEEAKSVQDYD IRITERLSKIVTVRAESMESALSDAHDNYSDAKKGYVMDYEDMQEVTFSMAGIHLDLEKQ KSIGGR >gi|229783971|gb|GG667764.1| GENE 7 9138 - 11114 1029 658 aa, chain - ## HITS:1 COG:CAC0309_2 KEGG:ns NR:ns ## COG: CAC0309_2 COG0791 # Protein_GI_number: 15893601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Clostridium acetobutylicum # 556 645 80 160 171 65 43.0 3e-10 MKTTEFSEPQEQKDNNRVFLAVNSASGLGERTKRREDDTPGYENTAGKAGRTKMQSEQTV HSNPKKESNKKGPEGAPQGEDGQKKRQKEKYIKEQQKQKARNDEKFHFSRKSGPEVRRKN 
HEYGFKDNVIFSREKSPLGTPETQSEIGGDRQETMGGFGKEEDPLKTRKKDEKISSKGEG NKKRVKKDTENTGMKKAVRRRMFAYMVNKLTDEQQKDSFAKTVKDIATIKVGLVVKQLFL YLLGLLGPLLGGLVTVALPFILIIVILYNTPLALFLPPLLGGDTVQSVLSGYYQEFYDDI REAEEENDEEVTYKNMQNGVARSNFTDVMMVYMVRYGTGQGNFSTVMNDKNKEKLKEIFD EMNHFASETTTDTVRVGESLGVMTFSGYCNCSICCGQWSGGPTASGAMPKADHTLAVDAY NPTLPMGTHIIANGKEYVVEDTGDFDRYGVDFDMYFGDHQTAQNFGHQDMEVYLADSNGS NSVEVTHTSLYVYNLSYEDYIELGKLTPSQEDLLREVMSEEFLHSMPASGIGEQVALLAQ EKVGCQYSQDLRYEEGYYDCSSLVQRCYAQFGVNLPSVASTQGQYIVDNGLQVSENELRP GDLIFHARDDNADEFMSIGHVAIYVGNGMQVDARGTAYGVVYRPLVPSNIGLYGRPCR >gi|229783971|gb|GG667764.1| GENE 8 11320 - 11817 396 165 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624271|ref|ZP_06117206.1| ## NR: gi|266624271|ref|ZP_06117206.1| hypothetical protein CLOSTHATH_05621 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05621 [Clostridium hathewayi DSM 13479] # 1 165 1 165 165 306 100.0 2e-82 MRFVFTFGFQLNEWEVEIEIPKDYKYPSEGEAREACRNMNVEAALFPDGIAVLFEREYMI LDDRESYHFRKLRDEVYEQFGALLCPEDISELIDKGTFAEILLAENYMFDETDYLELDLP VDEAYQLLAEKYGPQQELPFGLGCAELKGEKGFHFKSLYYQVIQI >gi|229783971|gb|GG667764.1| GENE 9 11932 - 12330 242 132 aa, chain - ## HITS:1 COG:TM1000 KEGG:ns NR:ns ## COG: TM1000 COG1895 # Protein_GI_number: 15643760 # Func_class: S Function unknown # Function: Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN # Organism: Thermotoga maritima # 1 130 1 130 132 71 32.0 4e-13 MGSVETLAKYRFQRVLEDLAAAKEMLSGGMYKPSLNRSYYSIFHAMRAITVLEGFDCSKH SGVIAYFNQNFIKTGIFAKETSKIIKSASIMKEQSDYLDFFIASKQDAEEQIERAKQFIE AVEKYLKDKKVL >gi|229783971|gb|GG667764.1| GENE 10 12318 - 12647 379 109 aa, chain - ## HITS:1 COG:no KEGG:Athe_1643 NR:ns ## KEGG: Athe_1643 # Name: not_defined # Def: DNA polymerase beta domain-containing protein region # Organism: A.thermophilum # Pathway: not_defined # 5 107 10 114 115 77 37.0 1e-13 MNENQEMFEKLVKGLRGIYGDMLVSILLYGSFARGTNTRESDIDIAILLKGRETKEMHDQ MVDLAVDMDLEYDQVFSIINIDYENFLEWEDTLPFYKNVREDGVVLWAA >gi|229783971|gb|GG667764.1| GENE 11 
12772 - 12915 148 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871319|ref|ZP_06410109.1| ## NR: gi|288871319|ref|ZP_06410109.1| NUDIX hydrolase [Clostridium hathewayi DSM 13479] NUDIX hydrolase [Clostridium hathewayi DSM 13479] # 1 47 1 47 47 85 100.0 9e-16 MYSGDFELMTWDWVDLDILNSIPYYRDSSLVQKAVEDKTTHARIGDN >gi|229783971|gb|GG667764.1| GENE 12 12955 - 13158 178 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871320|ref|ZP_06117209.2| ## NR: gi|288871320|ref|ZP_06117209.2| hypothetical protein CLOSTHATH_05625 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05625 [Clostridium hathewayi DSM 13479] # 1 67 6 72 72 117 100.0 2e-25 MFPDAEKEYSEFVAESGIEYGKKPVPKISFFVAECMEFPSLGQVYECRELKQAVDIYHKL PEHVKCM >gi|229783971|gb|GG667764.1| GENE 13 13130 - 13291 69 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871092|ref|ZP_06116326.2| ## NR: gi|288871092|ref|ZP_06116326.2| hypothetical protein CLOSTHATH_04682 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_04682 [Clostridium hathewayi DSM 13479] # 2 53 203 254 254 97 100.0 3e-19 GGKIVKNMAGHTENMLKSLDGLSQKALDNRAIRKKHVFEHEGGKDVSRCRKRI Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:42:06 2011 Seq name: gi|229783970|gb|GG667765.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld158, whole genome shotgun sequence Length of sequence - 15828 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 7, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 2 - 889 988 ## COG1625 Fe-S oxidoreductase, related to NifB/MoaA family 2 1 Op 2 . + CDS 882 - 1451 818 ## COG1160 Predicted GTPases 3 2 Op 1 2/0.000 + CDS 2390 - 3112 893 ## COG1160 Predicted GTPases 4 2 Op 2 1/0.000 + CDS 3116 - 3751 703 ## COG0344 Predicted membrane protein 5 2 Op 3 . 
+ CDS 3769 - 4773 1014 ## COG0240 Glycerol-3-phosphate dehydrogenase + Term 4856 - 4898 2.5 + Prom 4795 - 4854 6.6 6 3 Op 1 20/0.000 + CDS 4973 - 6193 1338 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component + Term 6232 - 6268 9.6 + Prom 6216 - 6275 1.7 7 3 Op 2 24/0.000 + CDS 6383 - 7267 1014 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 8 3 Op 3 19/0.000 + CDS 7277 - 8371 1243 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 9 3 Op 4 18/0.000 + CDS 8374 - 9132 256 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 10 3 Op 5 . + CDS 9145 - 9855 299 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 11 3 Op 6 . + CDS 9899 - 10837 644 ## COG1266 Predicted metal-dependent membrane protease + Term 10852 - 10889 3.6 12 4 Tu 1 . + CDS 10986 - 11198 293 ## Closa_2078 stage IV sporulation protein A + Prom 12100 - 12159 19.0 13 5 Op 1 . + CDS 12363 - 13964 1704 ## COG1236 Predicted exonuclease of the beta-lactamase fold involved in RNA processing 14 5 Op 2 . + CDS 14078 - 14554 471 ## COG0328 Ribonuclease HI + Prom 14580 - 14639 7.5 15 6 Tu 1 . + CDS 14659 - 15543 845 ## Closa_2075 hypothetical protein + Term 15553 - 15610 9.3 + Prom 15643 - 15702 5.3 16 7 Tu 1 . 
+ CDS 15734 - 15827 91 ## Predicted protein(s) >gi|229783970|gb|GG667765.1| GENE 1 2 - 889 988 295 aa, chain + ## HITS:1 COG:CAC1710 KEGG:ns NR:ns ## COG: CAC1710 COG1625 # Protein_GI_number: 15894987 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase, related to NifB/MoaA family # Organism: Clostridium acetobutylicum # 1 273 159 430 437 214 39.0 2e-55 KMLHNRFAGQALKRIDELYKAGVPMNGQIVLCKGLNDGVELERTIRDLSKYLPYMESVSV VPVGLSKFRDGLYPLQPIGKEEAVKTIDLIERWQKKLYEEHGTHFIHASDEFYILAGRPL PEEERYDGYIQLENGVGMLRLLEEEVESSLGALEGDGREEEISIATGLLAAPFIERHVNT IRRKFPNCTIHVYPIRNYFFGEQITVAGLITGQDLIAQLEGKPLGSRLLLPECMFRSGEE VFLDDITRGEVQKALQVKVDIVKSSGQDLVQAVLHPVEEGSPSYEGYELKEISYE >gi|229783970|gb|GG667765.1| GENE 2 882 - 1451 818 189 aa, chain + ## HITS:1 COG:CAC1711 KEGG:ns NR:ns ## COG: CAC1711 COG1160 # Protein_GI_number: 15894988 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 1 189 1 188 438 234 63.0 5e-62 MSKPVVAIVGRPNVGKSTLFNVLAGDTISIVKDTPGVTRDRIYADCTWLDKNFTLIDTGG IEPDSSDIILSQMREQAEIAIATADVIVFIVDVRQGLVDADSKVADMLRKSKKPVVLAVN KVDSFEKFGNDVYEFYNLGIGDPVPVSAASRLGLGELLDEVIKYFGEGTGEETEDDRPRI AIVGKPNVG >gi|229783970|gb|GG667765.1| GENE 3 2390 - 3112 893 240 aa, chain + ## HITS:1 COG:lin2051 KEGG:ns NR:ns ## COG: lin2051 COG1160 # Protein_GI_number: 16801117 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Listeria innocua # 1 235 201 435 436 281 57.0 6e-76 MIVSDIAGTTRDAVDTEIVHNGTEYVFIDTAGLRRKSKIKEELERYSIIRTVTAVERADV VVVVIDATEGVTEQDAKIAGIAHERGKGIIVAVNKWDAIEKNDKTIYEYTNKIKDTLAFM PYAEYVFISAKTGQRTNRLFELIDMVRQNQTLRVATGVLNEIMTEAVALQQPPSDKGKRL RLYYITQVAVKPPTFVIFVNDKNLMHFSYTRYIENKIREAFGFKGTSLKFIARERSGKEQ >gi|229783970|gb|GG667765.1| GENE 4 3116 - 3751 703 211 aa, chain + ## HITS:1 COG:MG247 KEGG:ns NR:ns ## COG: MG247 COG0344 # Protein_GI_number: 12045101 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycoplasma genitalium # 4 207 
12 236 239 108 32.0 6e-24 MERIICLAAGYLFGLLQTGYMYGRMNKIDIRNYGSGNSGTTNALRVLGMKAGAIVFFGDF LKAFIPCVLVRILFKDQPETAFLLVLYTGLGVILGHNYPFYLNFKGGKGIAATAGLIVAL DLRLMVVTLVAFVLIVAVTRFVSLGSLVVATIFLAWMIVMGSNGSYGVSAGHLPEFYIVA AVISAQAFWRHRANIVRLVHGNENKISFGKK >gi|229783970|gb|GG667765.1| GENE 5 3769 - 4773 1014 334 aa, chain + ## HITS:1 COG:FN0906 KEGG:ns NR:ns ## COG: FN0906 COG0240 # Protein_GI_number: 19704241 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Fusobacterium nucleatum # 1 328 1 331 335 404 60.0 1e-112 MAKIGIVGSGSWGIALAVLLYNNGHSVSVWSALPEEITEMKTTGRHHTLPDLVLPDGMMF TESLEEVMEDKEVLVTAVPSVYVRSTAQKMKPYCREGQIIVNVAKGIEESTLKTLSEILE DELPMADVAVLSGPSHAEEVSRGLPTTCVAGAHSRKTAEYIQSVFMSEVFRVYTSPDMQG IELGGSLKNVIALAAGIADGLGYGDNTKAALITRGIAEIARLGTAMGGKFQTFCGLSGIG DLIVTCASMHSRNRRAGILIGKGYSMEEAMKEVKMVVEGVYSAKAALQLSEKYGIDMPIV EQVNAILFDHKPASEAVKGLMLRDKTIESSDLPW >gi|229783970|gb|GG667765.1| GENE 6 4973 - 6193 1338 406 aa, chain + ## HITS:1 COG:FN1432 KEGG:ns NR:ns ## COG: FN1432 COG0683 # Protein_GI_number: 19704764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Fusobacterium nucleatum # 12 403 6 375 383 235 37.0 1e-61 MKTTKKLVSLGLVAVMAASLTACGGKSDNGTTTAGTAETTAAADAGTAEQTAGGSFKLGG IGPITGATAIYGQAVMNGAELAINEINAAGGIGGVQIEYNFQDDENDTEKAVNAYNTLKD WGMQMLVGTVTSAPCIAVGSESSNDNMFQLTPSGSAVDCAKFDNQFRICFSDPNQGAASA QYIGENKLATKVAIIYDSSDVYSSGIHDKFIAEAANQGFEVVADEAFTADNNTNFSVQIQ KAQDAGAELVFLPIYYQQASLILVQAKQMGYEPEFFGCDGLDGLLTVENFDTSLAEGIML LTPFAADAQDDLTKNFVTTYKEKYGDTPNQFAADAYDAVYAIKAAAEKAGITPDMDASTI CDKLKTAMTEVSVDGLTGSGMKWTADGEPEKAPKAVKIVDGVYNAM >gi|229783970|gb|GG667765.1| GENE 7 6383 - 7267 1014 294 aa, chain + ## HITS:1 COG:FN1431 KEGG:ns NR:ns ## COG: FN1431 COG0559 # Protein_GI_number: 19704763 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: 
Fusobacterium nucleatum # 2 294 14 308 308 258 52.0 7e-69 MMSFISYFINGLSLGSVYAIIALGYTMVYGIAKMLNFAHGDVIMIGGYVVFTVVSTMGLP PLAGILLAVIICTVLGVTIEKVAYRPLRMASPLAVLITAIGVSYLLQNIALLVFGSNPKS FTSVVTIPALKLAGGRLTISGETLVTIVSCIVIMVVLTTFIKKTKAGQAMLAVSEDKGAA QLMGINVDGTIALTFAIGSALAAIAGTLLCSAYPTLTPYTGSMPGIKAFVAAVFGGIGSI PGAFIGGLLLGVIENLSKAYISSQLSDAIVFAVLIIVLLVKPTGILGKKINEKV >gi|229783970|gb|GG667765.1| GENE 8 7277 - 8371 1243 364 aa, chain + ## HITS:1 COG:FN1430 KEGG:ns NR:ns ## COG: FN1430 COG4177 # Protein_GI_number: 19704762 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 53 364 2 283 285 181 39.0 2e-45 MGKSGKTKSLQINKTTRSNLITYGIVIAAYIIVEILISGGHMSSLMKGLLVPLCIYTIMA VSLNLTVGILGELSLGHAGFMCVGAFASSLFSICMLDTIPIAGIRFTLAILVGAAAAAAA GIVVGIPVLRLRGDYLAIVTLAFGEIIKNVVNVIYLGKDSNGIHVSIKNAMALQMQEDGQ IIMNGAQGINGTPKNATFTVGIILILITLFIVLNLIHSRSGRAIMSVRDNRIAAESIGIN ITKYKLMAFTISASLAGVAGVLYSHNLSALTATQKNFGYNQSIMILVFVVLGGIGNIRGS MIAAVILTLLPEMLRGLNTYRMLIYAIILIVMMIFNWSPKFIDFRKRHSLKRYFHKNTAD KEVS >gi|229783970|gb|GG667765.1| GENE 9 8374 - 9132 256 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 2 242 3 226 305 103 27 1e-21 MALLEIKNLGISFGGLRAVDDFSITIEKGQLYGLIGPNGAGKTTIFNLLTGVYKPNMGTI HLDGMNITGKKTMEINRAGVARTFQNIRLFKELTVLDNVKAGLHNQYSYSAIAGILRLPK YFKVEKTMDEKAVELLKVFDLDREKDTLASNLPYGKQRKLEIARALATNPKLLLLDEPAA GMNPNETKELMDTIRFVRDNFDMTILLIEHDMKLVSGICEKLTVLNFGQVLAQGNTADVL NDPEVIKAYLGE >gi|229783970|gb|GG667765.1| GENE 10 9145 - 9855 299 236 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 2 224 10 231 318 119 32 1e-26 MAILEVKNLEVYYGVIKALKGISFEVNEGEIVALIGANGAGKTTTLHTITGLLKAAGGSI GFDGCDITKVRGDRIVGMGMAHVPEGRRVFADLTVFENLKLGAYTRKDKTEIEETLQMIY KRFPRLLERRNQPAGTLSGGEQQMLAMGRALMSHPKLIVMDEPSMGLSPIYVNEIFDIIQ 
EINKSGTTVLLVEQNAKKALSIANRAYVLETGDIVLSGDAKELMNDDSVKKAYLSE >gi|229783970|gb|GG667765.1| GENE 11 9899 - 10837 644 312 aa, chain + ## HITS:1 COG:CAC0420 KEGG:ns NR:ns ## COG: CAC0420 COG1266 # Protein_GI_number: 15893711 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Clostridium acetobutylicum # 70 296 48 273 276 118 32.0 1e-26 MKNRFGQVRSGWIILITMAVYYILALVSGNLIIELLRKILIATGDINPVTGEFSSVAEWF DNIFLPVAFQILTELLMLAVPITIWRLGMKHSVRVMGLDSLKGKKRRKDGAAGMLLGILN CSVIFLLVLMIGKGRVVSNGLTVSGLTFWWLLTFVLVGIAEETLNRGFLMAVLRRCRNVP VIMFVPSVIFGLIHLSNPGVTFFSVFNIILVGILFSYMFLKSGSIWMCIGYHITWNIFQG VVYGMPVSGLQIPGIITTQYAESNLLNGGAFGIEGGILTTIVTLLSFIFVWYYYRNSNYD FLSDSEKPEVLQ >gi|229783970|gb|GG667765.1| GENE 12 10986 - 11198 293 70 aa, chain + ## HITS:1 COG:no KEGG:Closa_2078 NR:ns ## KEGG: Closa_2078 # Name: not_defined # Def: stage IV sporulation protein A # Organism: C.saccharolyticum # Pathway: not_defined # 1 70 1 70 491 131 92.0 1e-29 MDNFDVYSDIKARTNGEIYIGVVGPVRTGKSTFIKRFMELMVLPEMVDEHQKTQTRDELP QSAAGKTIMT >gi|229783970|gb|GG667765.1| GENE 13 12363 - 13964 1704 533 aa, chain + ## HITS:1 COG:PA3614 KEGG:ns NR:ns ## COG: PA3614 COG1236 # Protein_GI_number: 15598810 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted exonuclease of the beta-lactamase fold involved in RNA processing # Organism: Pseudomonas aeruginosa # 3 459 4 462 467 355 40.0 1e-97 MKLMFIGADHEVTGSCHYLEAAGLKILVDCGMKQGVDYFENAELPVSYSEIDYVLLTHAH IDHAGMIPFIYARGFRGEIITTEATADLCNIMLKDSAHIQESEAEWKNRKARRAGRQEEV PLYEMNDAMGVLKQFNPCSYDEIIKLEEGLTIRFTDVGHLLGSASIEVWIKENDISKKIV FSGDIGNKNKPLIRDPQYIKNADYVVMECTYGDRLHGQIPDHVSILADIVQRTLDRGGNV VIPAFAVGRTQELLYMFRIIKAEHMVKGHDGFEVYVDSPLAVEATSVFKDNLVDCYDEET KALVTQGINPISFPGLKLSITSAESRDINFDTKPKVIISAAGMCDAGRIRHHLKHNLWRP ECTIVFTGYQSIGTLGRSLLEGGREVRLFGETIQVEAHIEKMDGMSAHADMEGLIQWLNA FEEKPREVFLVHGNDEVCDSFGNLLKEEYGYQADAPYSGSVYDLAAGRWLEQRTGVVVTK TVETAKRAVGVYARLLAAGERLLSVIRHNEGGANKDLAKFADQINSLCDKWDR 
>gi|229783970|gb|GG667765.1| GENE 14 14078 - 14554 471 158 aa, chain + ## HITS:1 COG:YPO1081 KEGG:ns NR:ns ## COG: YPO1081 COG0328 # Protein_GI_number: 16121382 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HI # Organism: Yersinia pestis # 3 152 4 146 154 171 57.0 6e-43 MGKVLLFTDGAARGNPDGPGGYGAVLQFTDSKGQLHEKTLSAGYVRTTNNRMELMAAIAG LEALNRPCEVELYSDSKYVTDAFNQHWIDNWVKNNWKRGKSGPVKNIDLWKRLLKAMEPH RVTFCWVKGHAGHPENERCDQLATTAADGDSLLVDEGL >gi|229783970|gb|GG667765.1| GENE 15 14659 - 15543 845 294 aa, chain + ## HITS:1 COG:no KEGG:Closa_2075 NR:ns ## KEGG: Closa_2075 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 290 1 283 287 157 42.0 4e-37 MNKNKGIWLVIAAILLIGIAVTFATSRFIRQNGGSLDTVAVTQMPVTAGGGIPQAYSEGA GAETPAAAKKRMAAAPIEEEDAAPGNTGEAADQAAVPEPASSRIMAAEGAAAPAPESFLM DSDEAAGEAAVPETAAAGPGSREDAAAVPSISPLGPGGSASLQNDEEEKLVYYERRLAEL DAQVQKMRSESSDSTTYSMKTLAEKELRLWNIEMNTIYADIIDGLDEEARSKLEGEQQTW TKSRDTKAEEAARKYSGGSLEGVEYTASQAESTRTRTYDLVETYIQALPAENED >gi|229783970|gb|GG667765.1| GENE 16 15734 - 15827 91 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTAILASFLQTAIKFLFFLAVAWGGIVCGKK Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:42:23 2011 Seq name: gi|229783969|gb|GG667766.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld159, whole genome shotgun sequence Length of sequence - 16222 bp Number of predicted genes - 16, with homology - 14 Number of transcription units - 7, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 338 211 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 2 1 Op 2 . - CDS 340 - 1971 1259 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 3 1 Op 3 5/0.000 - CDS 1952 - 2932 612 ## COG0534 Na+-driven multidrug efflux pump 4 1 Op 4 . - CDS 3864 - 4205 264 ## COG0534 Na+-driven multidrug efflux pump - Prom 4266 - 4325 3.6 - Term 4351 - 4392 10.2 5 1 Op 5 . 
- CDS 4442 - 4561 103 ## - Prom 4729 - 4788 24.6 6 2 Op 1 5/0.000 + CDS 5731 - 6414 690 ## COG0534 Na+-driven multidrug efflux pump 7 2 Op 2 . + CDS 6411 - 7049 556 ## COG0534 Na+-driven multidrug efflux pump 8 2 Op 3 . + CDS 7055 - 8968 1974 ## COG0171 NAD synthase 9 2 Op 4 . + CDS 8965 - 9744 278 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 + Prom 10786 - 10845 80.4 10 3 Tu 1 . + CDS 10899 - 11405 579 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs + Prom 12307 - 12366 20.6 11 4 Tu 1 . + CDS 12394 - 12561 233 ## Closa_2495 band 7 protein + Term 12586 - 12630 9.4 12 5 Tu 1 . - CDS 12504 - 12611 61 ## 13 6 Tu 1 . + CDS 12653 - 13807 1478 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily + Prom 13843 - 13902 6.6 14 7 Op 1 . + CDS 13935 - 14870 1169 ## COG0078 Ornithine carbamoyltransferase 15 7 Op 2 . + CDS 14874 - 15182 398 ## Closa_2492 hypothetical protein 16 7 Op 3 . + CDS 15188 - 16165 937 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase Predicted protein(s) >gi|229783969|gb|GG667766.1| GENE 1 2 - 338 211 112 aa, chain - ## HITS:1 COG:CAC0777 KEGG:ns NR:ns ## COG: CAC0777 COG0110 # Protein_GI_number: 15894064 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Clostridium acetobutylicum # 1 112 3 114 210 169 68.0 9e-43 MPDLKKIYPRTGDTQTVYLKNTITDPSIQVGDYTIYNDFVNDPADFEKNNVLYHYPVNQD RLIIGKFCSIACGAKFLFNSANHTLSSLSSYPFPLFFEEWGLDKQNVTASWD >gi|229783969|gb|GG667766.1| GENE 2 340 - 1971 1259 543 aa, chain - ## HITS:1 COG:BS_ydiF KEGG:ns NR:ns ## COG: BS_ydiF COG0488 # Protein_GI_number: 16077662 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 526 2 522 642 269 32.0 1e-71 MTMKGSKMNLSFGSEIIYEDAEFLINDHDKAGIVGVNGAGKTTLFHVLMHELELDSGTIS TGNSRIACLPQEIVIHDETCTVLDYLQEGRPIRRLEADLSRIYQRLEHAAVTEHASLLKQ 
MDKLQTRLEFFDCYKADSILLDIIDRMQIDIDLLDMPVNRLSGGQKSKIAFARVLYSKAD ILLLDEPTNHLDAAAKEFVTEFLKNYRGMVLIISHDTAFLNQIIHKILFINKAAHKISVY DGNYDTYKKKYAEEQRLRQRILIQEEKEIKELEAFVQRANQASRTNHALKRMGQERALRL EKKRSGLHPCDRICKRVKMDIRPRREGADLPLEVCNLSFHYPDQPWLYKDLSFHVNGKER FLVVGENGVGKSTLLKLIMGILLPDAGKIRFHSKTDVAYYAQELEQLDLQKTVLENVQTD GYTIKQLRSVLSNFLFYEDDICKKTEVLSPGEKARVCLCRVLLQKANFLVLDEPTNHLDP ETQSIIGRNFQLFEGTIMVVSHNPWFVEQIGINRVLLLPGGRVVDYSRELLEYYYGLNNK EDL >gi|229783969|gb|GG667766.1| GENE 3 1952 - 2932 612 326 aa, chain - ## HITS:1 COG:BS_ypnP KEGG:ns NR:ns ## COG: BS_ypnP COG0534 # Protein_GI_number: 16079230 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus subtilis # 1 315 136 444 445 130 29.0 5e-30 MKIICCGTVFVFGYNAVCSILKGLGDSKSPLAFVAIAAAVNIVLDIILVGPFAMGTEGAA YATIASQGISFMISVIHLKRRNSVFDFKFRHFAVKPDKMAAILKVGLPAAAQMAVVNISY LLITGMLNRFGVPVAAASGVGLKINTFAGMPCWAIGQAVTAMAGQNFGANDIDRVRKTTK AGLYLNIIITLLTVLLVQLFAKPIMMIFNPVNAEVISDGILYLRICCGANSLVYAVMYTF DSFAIGVGSANIAMFNALLDAVIVRLPVSWLLAFTFGFGFPGIYIGQALSPIIPAIAGFL FFRGRRWENNRLIQQNREVSYDYERQ >gi|229783969|gb|GG667766.1| GENE 4 3864 - 4205 264 113 aa, chain - ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 6 111 7 112 452 64 34.0 4e-11 MNEEQNLTTGSVPGKLIVFALPLLFANLLQSFYSIVDMLVVGRIVGNTGLAAISNASMAG FIINSVCIGLTMGGTVAAAQYKGAGDEKGQRETVGTLFSIAFLAAILVTAIGS >gi|229783969|gb|GG667766.1| GENE 5 4442 - 4561 103 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLSGKLCIDYSSGQFDITKHCAEPENTKTIVSGYLIPKQ >gi|229783969|gb|GG667766.1| GENE 6 5731 - 6414 690 227 aa, chain + ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 212 10 221 445 171 45.0 1e-42 MTEGVIWKQLLAFSMPLLVGNLFQQLYNTVDSIVVGNFIGSEALAAVGSSNSLINLIIGM 
FMGIATGAGVIISQYYGAKDEQKLHWAVHTCMALSLIGGALLIVLGVLLSPLILRWMGTP EEVMPNSVAYLRIFFCGSLFNLVYNMGAGVLRAVGDSRRPLYYLCVSSVVNIILDMVFVV VFRMGTAGVGYATVIAQAVSSVLTVRALMKTEAVTGSSHPGSESTGA >gi|229783969|gb|GG667766.1| GENE 7 6411 - 7049 556 212 aa, chain + ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 2 201 238 437 445 122 32.0 5e-28 MMKRILKLGIPSGIQQSIISLSNVIVQANINGYGAAAMAGYGAYSKVDGFAMLPLQSFCM ASTTFTGQNIGAKKARRVKDGMFQGIAISLIYTAFISAVLYLNADKILRIFSPDDTVIAF GHTTMAILLPFYWSIAIHQILMGSIRGSGRTMVTMLIGVGNMCILRMIYINLLVPFFPSF EAVMWCYPITWMTTMLMDCVYSTKAKWIPKGE >gi|229783969|gb|GG667766.1| GENE 8 7055 - 8968 1974 637 aa, chain + ## HITS:1 COG:CAC1782_2 KEGG:ns NR:ns ## COG: CAC1782_2 COG0171 # Protein_GI_number: 15895058 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 323 633 2 313 313 418 61.0 1e-116 MKHGFIRVAAATPDVKVADPQFNRTEICALIRAGIERKAKLMVFPELSLTAYTCGDLFGQ DALLYGARRELKEILKETEGSDLLAFIGMPWERSGKLYNTAVAVQNGRILGIVPKSNIPN YSEFYERRYFEPGNEIPVMVSWEGQNFPMGTNILFACEEMPGFVVAAEICEDAWVPCPPS IRHTAAGATVIVNCSASDETTGKDIYRRSLITGHSASLVCGYVYANAGDGESTQDLVFGG QNLITENGTCLAESKRFANETIFADMDMERLNNERRRLSTYPVLNTGSYVTVGFHITEEE YDLERPIDPMPFVPTDEGQRNRRCEEILSIQAMGLKKRLAHTGCSHAVIGISGGLDSTLA LLVTVRAFDMLNIPRHQIHAVTMPCFGTTDRTYQNACLMTKKVGAELTEIDIREAVTVHF RDIGHDISHHDVTYENSQARERTQVLMDLANRWGGMVVGTGDMSELALGWATYNGDHMSM YGVNASVPKTLVRHLVRYYADTCGEEELKAVLLDVLDTPVSPELLPPKEGEIAQKTEDLV GPYELHDFFLYQVLRYGYRPAKVFRLAKAAFDGQYDGETVLKWLKVFYRRFFSQQFKRSC LPDGPKVGSVAVSPRGDLRMPSDASSRLWLEELEGLS >gi|229783969|gb|GG667766.1| GENE 9 8965 - 9744 278 259 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 5 258 34 247 255 111 29 3e-24 MITSSANGKVKQVMNLAKKAKARKETGLFVAEGLRMFKEIPRDRIDSLFVSESFLKNPEG RKLIGPLAFEAVADDVFKTMSDTTTPQGVLALVKQYAYTIDDILDRPGPALLMILDTIQD 
PGNLGTILRAGEGAGITGVIMNDTTADIYNPKVIRSTMGSVCRVPFVYTAALPQTLANIK KRGVRLYAAHLEGRNNYEKEDYTVDTGFLIGNEANGLSSETASMADAYVKIPMMGKVESL NAAVAASVLMFEAARQRRN >gi|229783969|gb|GG667766.1| GENE 10 10899 - 11405 579 168 aa, chain + ## HITS:1 COG:FN1549 KEGG:ns NR:ns ## COG: FN1549 COG0330 # Protein_GI_number: 19704881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Fusobacterium nucleatum # 1 168 36 203 294 214 67.0 5e-56 MGAYLGTWSVGVHIKMPILDRVAKRVNLKEQVADFPPQPVITKDNVTMRIDTVVFFQITD PKLYAYGVENPLMAIENLTATTLRNIIGDLELDQTLTSRETINAKMRESLDIATDPWGIK VNRVELKNIMPPAAIQDAMEKQMKAERERRESILRAEGEKKSTILVAD >gi|229783969|gb|GG667766.1| GENE 11 12394 - 12561 233 55 aa, chain + ## HITS:1 COG:no KEGG:Closa_2495 NR:ns ## KEGG: Closa_2495 # Name: not_defined # Def: band 7 protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 54 257 310 312 80 87.0 2e-14 MIKEAGADQAVLTIKSLEALKTVADGKATKIIIPSEIQGLAGLVSSITEIPKDPK >gi|229783969|gb|GG667766.1| GENE 12 12504 - 12611 61 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRAGAPCCEFSNYLSRSYLGSFGISVMEDTSPASP >gi|229783969|gb|GG667766.1| GENE 13 12653 - 13807 1478 384 aa, chain + ## HITS:1 COG:ECs2798 KEGG:ns NR:ns ## COG: ECs2798 COG1752 # Protein_GI_number: 15832052 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Escherichia coli O157:H7 # 12 99 8 100 356 70 41.0 4e-12 MELKLDLTREYGIVLEGGGAKGAYQIGAWKALREAGIRIKGIAGASVGSLNGALICMDDL EKAEEIWKNIEYSRVMDVSDETIKALKRKDFKALNMQEILNSGVQFIKDGGFDVTPLKNL IAEVVGDESRIRESDRELFAVTYSVSEHRELAVDVRKGEEGSVKDMLLASAYFMAFKNEK LGGKRYRDGGGLNNVPLGVLLDKGYEDIIVIRIYGWGFDSEKVTKIPEGVNVYHIAPRQD LGGILEFDKKQSRKNMTLGYYDAKRFLYGLAGRIYYIDAPGSEPYYFDKMMSELELLKIY VEEDLDQETRESLNGYRMFTETLFPRMAEEFKLKEDWNYRDLYIAILEDLAKRLKINRFQ IYTVDRLIGKIMMKLHSLDSRIPL >gi|229783969|gb|GG667766.1| GENE 14 13935 - 14870 1169 311 aa, chain + ## HITS:1 COG:SPy1544 KEGG:ns NR:ns ## COG: 
SPy1544 COG0078 # Protein_GI_number: 15675442 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Streptococcus pyogenes M1 GAS # 3 310 5 330 337 392 60.0 1e-109 MNLKGRSFITLKDYTPEEINYLLDLAADLKAKKKSGVTGSSLKGKNIALIFEKPSTRTRC SFVVAAVDEGAHPEYLGKDDIQLGHKESVEDTARVLGRIFDGIEFRGFKHKTVEDLAKYA GVPVWNGLTDDYHPTQILADLLTMKEHFGYLKGLNFVYAGDGRNNMANSLMIGMSKMGVN FTILAPKCLWPKDELVTLCRGYAGESGSSVTLTENVDDVKGADTIYTDVWCSMGEEDKAA ERVALLKPYQVNAALMEKTEKDTTIFMHCLPAVKGNEVTEDVFESKASVVFDEAENRMHT IKAVMVATLGE >gi|229783969|gb|GG667766.1| GENE 15 14874 - 15182 398 102 aa, chain + ## HITS:1 COG:no KEGG:Closa_2492 NR:ns ## KEGG: Closa_2492 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 102 1 102 102 124 70.0 1e-27 MYIQERSRNPRNLSMMLDAAHIIIGILIVVLAVISFLNPEDHMLLFPVIFLLAASLNIMN AVFKIRMSGREKKKKASGILVLLFGIVLLALAVISAVSIWWG >gi|229783969|gb|GG667766.1| GENE 16 15188 - 16165 937 325 aa, chain + ## HITS:1 COG:BH1685_2 KEGG:ns NR:ns ## COG: BH1685_2 COG0340 # Protein_GI_number: 15614248 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Bacillus halodurans # 78 318 11 252 254 192 41.0 6e-49 MKTEILKLLKEESGFVSGQELCDRFGVSRTAVWKAIGQLKEEGFQIEAVRNKGYRLLSSD DVITEAELMSCMEGGLVRKIVYYEETDSTNIRARKLAEEGAPDGTLVVTDFQNAGRGRRG RMWVSPSGTGVFMSLILRPDILPSSASMLTLVAALAVHDGIKETTGLDTVIKWPNDIVGD GRKLCGILTEMSAELEGIHYVVTGIGINANMTEFPEEAGEVASSLRILTGSPVRRSRLIA SVMKAYEGYYKKFRECGSLASLMDVYNEHMANLGREVKVLDPAGTYTGKALGIDEKGELI VEKADGKVIRVVSGEVSVRGIYGYV Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:42:40 2011 Seq name: gi|229783968|gb|GG667767.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld160, whole genome shotgun sequence Length of sequence - 12745 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 8, operones - 5 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 4.7 1 1 Tu 1 
. + CDS 85 - 975 628 ## COG2207 AraC-type DNA-binding domain-containing proteins
    + Term 1182 - 1212 -0.5
2 2 Op 1 . - CDS 1028 - 2488 930 ## ELI_2097 regulatory protein GntR
3 2 Op 2 . - CDS 2475 - 5009 1815 ## COG0642 Signal transduction histidine kinase
    - Prom 5037 - 5096 2.8
4 3 Op 1 . - CDS 5107 - 6903 1304 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases
5 3 Op 2 . - CDS 6929 - 7057 127 ## gi|266624311|ref|ZP_06117246.1| conserved hypothetical protein
    - Prom 7077 - 7136 3.6
    - Term 7541 - 7595 5.3
6 4 Op 1 . - CDS 7706 - 8434 766 ## COG1985 Pyrimidine reductase, riboflavin biosynthesis
7 4 Op 2 . - CDS 8438 - 8815 352 ## COG0716 Flavodoxins
    - Prom 8848 - 8907 1.8
8 5 Tu 1 . - CDS 8932 - 9096 208 ## gi|288871338|ref|ZP_06410118.1| iron-containing alcohol dehydrogenase
    - Prom 9156 - 9215 4.9
    + Prom 9089 - 9148 6.0
9 6 Tu 1 . + CDS 9243 - 9689 366 ## COG0789 Predicted transcriptional regulators
    - Term 9522 - 9565 -1.0
10 7 Op 1 . - CDS 9677 - 9988 262 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases)
11 7 Op 2 . - CDS 10066 - 10209 98 ## gi|288871340|ref|ZP_06117253.2| putative short-chain dehydrogenase
    - Prom 10417 - 10476 80.4
12 8 Op 1 . - CDS 11933 - 12571 193 ## FN1195 hypothetical protein
13 8 Op 2 .
- CDS 12578 - 12736 56 ## gi|288871541|ref|ZP_06117992.2| auxin-binding protein Predicted protein(s) >gi|229783968|gb|GG667767.1| GENE 1 85 - 975 628 296 aa, chain + ## HITS:1 COG:BH0483 KEGG:ns NR:ns ## COG: BH0483 COG2207 # Protein_GI_number: 15613046 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 187 287 666 766 769 82 37.0 8e-16 MKIHENGYIINISQPLSCRDTGKFEAPSSDWRHDALPLDDTYELCVVTRGILYLKYRGVS YSIHENQFLLIDPHEAPNNVRRGYNNNKCDFYWLHFYPPSEIRKTELGNVDFPFLEAVNE IFIPSQGTLIYPEKIAIFMRQLQDTIRSNYSFLEVNYLTTLILCEISNQISILKNDEEAQ SQSNAGIYNSIIDFIDENLDKELKLKNIANKFNYNEQYIARLFKSLGSVSLHQYILNQRI SQANYYLSETSLSIKEIAQKLGYHDSHNFMKIYKRETGLTPSEYRNSFPNRLRYDT >gi|229783968|gb|GG667767.1| GENE 2 1028 - 2488 930 486 aa, chain - ## HITS:1 COG:no KEGG:ELI_2097 NR:ns ## KEGG: ELI_2097 # Name: not_defined # Def: regulatory protein GntR # Organism: E.limosum # Pathway: not_defined # 1 484 5 485 485 435 46.0 1e-120 MLMYDRIYEILKNRIVSRLLPGGSSLPSRAKLCVEFGTSEKTVRRALAMLEKEGLIETTQ RKRPTVSFMQNTGHRTTVLALEKIDKDITSDVLKAGVLLCYPVIKKGIALCSQEDLQIPR RIVDHMDSGNAEEFWRLSKLFYRFFVARNENSLILQAVDSLGLSDLSPLRDDIQIRTRYY EQVQEFMRTLETGGVPEHVHFDDMSDIYGMTEGNTPAFQAAPDSAVIHGKKQLEKLLEDS EVRYSAVYMDIIGLIAAGSYQREDKLPTHAELQKIYGVSVDTTTRAIQILREWGVVKTVR GVGIFVTADTADIQKIHVPSHLIACHVRRYLDTLEFLELTIEGAAACAAARVTKPELLAV KEKMERFWNEEYLYGRTPAILLDFITEHIGIKALSTIYELLQRNFRIGRSIPGLLTTEKT PVNCEIHEQCIAVIDALLAGSHEQFTEKNVQLFGNIYCLVKEECRRLGYFEPAVEVYDGT ALWKSS >gi|229783968|gb|GG667767.1| GENE 3 2475 - 5009 1815 844 aa, chain - ## HITS:1 COG:SMb20356_1 KEGG:ns NR:ns ## COG: SMb20356_1 COG0642 # Protein_GI_number: 16264090 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Sinorhizobium meliloti # 302 567 401 667 667 202 47.0 2e-51 MIAILCTLAVINYQKVSERTVTNISGVYLREMTTQLSSHFQTNLNSQFSQLRTITNALSE SDLKDEESLKLFLARAQSDNDFTHIAFISDKGIAYSSEGVVPAMSKISGLDQLLSGSGEL ISVNETIWESGTILLGAAMSPVEFEDCRLVAVIIGIHTSDIGSRLGMDSKTATDSYTNIV 
TRNGDFVIKSASSEEVIYGSNLLTVYEQHAVFDKGYDFPSFHAAIDAGESGLTLLTVGTH HVYLYYMPIQGTDWYMVTSMSYETVNNQIVYLSQFMAGVGAGIFFVVFATVITFFLLFRR SEKRSNELLRLEKERAEAASRAKSDFLSQMSHEIRTPLNGIMGMVELGKNHIDEPGRMRN CLDKITFSSTHLLSLINDILDMSKIESGKIELHPELFDLGKLLRALTTVFHVQAINRQID FQIFLRGEVTEYLVGDALRLNQILTNLLSNAVKFTPAQGSVNLNIETLRRDDHRIWLRFE VKDTGRGIAPENIHRIFEPFTQENSGIARNYGGTGLGLPITRSFTEMMGGSITVSSEVGT GSVFTVDLPFDCAPDDVYKDAQPCGSGQSVLVVNQIEELKNNLTDVLRKENFRVDSVSDE ETALCRIHAAVQNGTPYELCFVKWDVCQEMKLFASKIRSESDNPDLKLILTGQDQDDLDD MAALCGADATLCRPVFHSDLAVLMTVLTGQNQDQTKPEQSAVLAGKQVLLAEDNEINLEI AAALLQDAGAIVTSTQNGQEAVERFSEAREGFYDLILMDIQMPVMDGCSAAQAIRALPRS DAKCTIIIAMTANSFREDIQKCLDSGMNAHIAKPFILNDITSAYLEVLGSWKLRKADIDR HVDV >gi|229783968|gb|GG667767.1| GENE 4 5107 - 6903 1304 598 aa, chain - ## HITS:1 COG:BH0026_2 KEGG:ns NR:ns ## COG: BH0026_2 COG0737 # Protein_GI_number: 15612589 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Bacillus halodurans # 330 596 231 471 471 105 32.0 2e-22 MVESACGDIELQTELSLYTSEQLRRLDRGVGPDLVITAQPDSNMVRKHLLDISDTKASAA YDGTIMNAWKTDGKTYMIPLPGVYSGYVVNETLFQQAGIALPTNNDELVTALTELKERGL GTGEDGVSFSIMSDYNTAVGLFYVGNMVPDFLGTVEGVKWLAGFRDKQATFTGVWEQSFT LSDALVAAGVMDPADIAHQRNIVLCKKRLSQGTLAAAFGDSALYYECVAENQKAVKEGTS EAYVYRMLPLFSDEGNEPWFLFSPSAFMGVNNSISKEKQDACKRILELLSTPEGQAALIE DMGGGMSCLTDYQQQEELIPLGVETYVESGYVYNVLFPGKTVEYLGGYVRNVMSGQCSLE EALQSIDQFYYEGTDENSYDFSVIGMVGHDLLLDNFNVRRRETELGNFIADCVAEASGAP IAVVNGGGIRASFYQGVVYGGDLAVVCPFNNQIIVLEMDGQTLWDMLENGLSVCTEEFPG GRFLQVSGLRYTFDSSKPAGSRLVSVTLPDGTALDLEADCQVAVTDYMAGSRTYAEGNGD GFTMLNYYDDATPKGRVALVKETGLLYRDALAQYFERHRDSVVETMLEGRISDLAQDK >gi|229783968|gb|GG667767.1| GENE 5 6929 - 7057 127 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624311|ref|ZP_06117246.1| ## NR: gi|266624311|ref|ZP_06117246.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 42 1 42 42 70 100.0 4e-11 
MKRNGNRGMAGMLGTVLILLVLTGCGRQISVPEEKISDQNSL >gi|229783968|gb|GG667767.1| GENE 6 7706 - 8434 766 242 aa, chain - ## HITS:1 COG:XF1748 KEGG:ns NR:ns ## COG: XF1748 COG1985 # Protein_GI_number: 15838349 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine reductase, riboflavin biosynthesis # Organism: Xylella fastidiosa 9a5c # 1 237 1 231 237 231 51.0 7e-61 MERPYVFCHMMTSLDGKIMGNYMEAPESGPAGDAFYQIAFGKKPKYLHQGWLSGRVTTDD NFTMYRRPALDEKAPEVPEGDFVAEKAPMYYVSIDPSGRLGWEENTLTYIDTTAHVLEVL TEQASNAYKAFLRKLGISYIIAGKKTLDYSEALTKLKRLFGIETLMLGGGGVLNWSFIQA GMCDEISVVIAPCADGSAETQTLFQARAGLSTEEPVGFTLKSADVLEGDAVWLRYLVKER IK >gi|229783968|gb|GG667767.1| GENE 7 8438 - 8815 352 125 aa, chain - ## HITS:1 COG:FN0119 KEGG:ns NR:ns ## COG: FN0119 COG0716 # Protein_GI_number: 19703467 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 22 101 64 142 164 69 36.0 2e-12 MTPYSGDYNSVVSQGKREVEEGYEPPVRPFDKDLDHYDTVILGTPVWWYTFAPAVKTVLN AADWNGKTVYPFATNGGWIGHTFRDMKTACAGADVKQGLNIRFDGDQMQTSIAEIDRWTA EIKEE >gi|229783968|gb|GG667767.1| GENE 8 8932 - 9096 208 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871338|ref|ZP_06410118.1| ## NR: gi|288871338|ref|ZP_06410118.1| iron-containing alcohol dehydrogenase [Clostridium hathewayi DSM 13479] iron-containing alcohol dehydrogenase [Clostridium hathewayi DSM 13479] # 1 54 6 59 59 113 100.0 6e-24 MNFKGYIPTRVLFGTGCLNDLHTQIMPGKKAMLECDRIPLTDEECTAIYEKSYR >gi|229783968|gb|GG667767.1| GENE 9 9243 - 9689 366 148 aa, chain + ## HITS:1 COG:CAP0107 KEGG:ns NR:ns ## COG: CAP0107 COG0789 # Protein_GI_number: 15004810 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 143 1 143 152 180 58.0 7e-46 MFYTVGEMAKKMGVAPSTLRYYDKEGLLPFVERSGGGIRMFKDADFQWLSIIECLKKAGL PIKEIKTFMDWCMEGDNTIDKRLALMQEQRDIVKKQIAQMQDTLAILDYKCWYYETAREA GTCKIHDSLPDEAVPSEFAAAKEKTQTL >gi|229783968|gb|GG667767.1| GENE 10 9677 - 9988 262 103 aa, chain - ## HITS:1 COG:RSc0215 
KEGG:ns NR:ns ## COG: RSc0215 COG1028 # Protein_GI_number: 17544934 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Ralstonia solanacearum # 1 90 181 270 275 100 55.0 8e-22 MAEVVKWGKREARINDIAPGIIVTPLAIDEFNGLRGKFYKSMFAKCPLARPGTADEAANA AELLMSYKGASITGATFLIDDGATSNYYYGPPRPQEESSNQSV >gi|229783968|gb|GG667767.1| GENE 11 10066 - 10209 98 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871340|ref|ZP_06117253.2| ## NR: gi|288871340|ref|ZP_06117253.2| putative short-chain dehydrogenase [Clostridium hathewayi DSM 13479] putative short-chain dehydrogenase [Clostridium hathewayi DSM 13479] # 1 47 1 47 47 92 100.0 9e-18 MNEARFDALAYEMDLFSRSSICSLIVEAQTYGKIVMLAMPPQKNFCH >gi|229783968|gb|GG667767.1| GENE 12 11933 - 12571 193 212 aa, chain - ## HITS:1 COG:no KEGG:FN1195 NR:ns ## KEGG: FN1195 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 210 1 209 211 147 43.0 2e-34 MFDNDKLLFSANNEIDHVKGIVKVYDSYVAQNEVGKEKQQFQQIAKIMKERTEERYIKPI KMFLHNCKEHPEQSGLIVMGINCILIELYFEMIHGLDKSDEGGHKVQDAYTAILPLLDPS ITEEDAKRFYKGIRCKIIHQGQTGLYTAITYAELIDKIIILCGEYYLCNPKVLFEALTLL YHQYWKELSEGKDPIQKQNFIKKFKYILYHDD >gi|229783968|gb|GG667767.1| GENE 13 12578 - 12736 56 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871541|ref|ZP_06117992.2| ## NR: gi|288871541|ref|ZP_06117992.2| auxin-binding protein [Clostridium hathewayi DSM 13479] auxin-binding protein [Clostridium hathewayi DSM 13479] # 1 52 29 80 80 95 100.0 9e-19 MTMINYRKAEVDSSSGVNYGVYAKALLRLLEIRITHSRHCEVCTKLCREKEG Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:43:14 2011 Seq name: gi|229783967|gb|GG667768.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld161, whole genome shotgun sequence Length of sequence - 13824 bp Number of predicted genes - 15, with homology - 15 
Number of transcription units - 8, operones - 4 average op.length - 2.8
N  Tu/Op  Conserved pairs(N/Pv)  S  Start  End  Score
1 1 Tu 1 . + CDS 1 - 321 248 ## gi|266624321|ref|ZP_06117256.1| conserved hypothetical protein
    + Term 372 - 425 2.0
    - Term 729 - 789 8.1
2 2 Op 1 1/0.000 - CDS 796 - 1596 647 ## COG0778 Nitroreductase
3 2 Op 2 . - CDS 1571 - 2509 225 ## PROTEIN SUPPORTED gi|42631297|ref|ZP_00156835.1| COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily
4 3 Op 1 . - CDS 2624 - 3685 1083 ## COG2017 Galactose mutarotase and related enzymes
5 3 Op 2 . - CDS 3717 - 4805 1247 ## COG0136 Aspartate-semialdehyde dehydrogenase
    - Prom 4833 - 4892 8.9
6 4 Op 1 . - CDS 5097 - 5948 821 ## COG1307 Uncharacterized protein conserved in bacteria
7 4 Op 2 24/0.000 - CDS 5986 - 8223 2742 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit
8 4 Op 3 . - CDS 8235 - 8792 637 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit
9 4 Op 4 . - CDS 8749 - 10155 1428 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit
    - Prom 10322 - 10381 73.9
    + TRNA 10303 - 10373 64.1 # Gly CCC 0 0
    - Term 10408 - 10438 4.3
10 5 Tu 1 . - CDS 10472 - 11101 650 ## gi|167769952|ref|ZP_02442005.1| hypothetical protein ANACOL_01293
    - Prom 11298 - 11357 9.5
    - TRNA 11196 - 11268 82.5 # Thr CGT 0 0
    + Prom 11351 - 11410 7.1
11 6 Tu 1 . + CDS 11453 - 11734 310 ## COG1254 Acylphosphatases
    + Prom 11769 - 11828 3.6
12 7 Tu 1 . + CDS 11852 - 12025 221 ## gi|288871348|ref|ZP_06117265.2| conserved hypothetical protein
    + Term 12026 - 12056 -1.0
    - Term 12001 - 12056 12.1
13 8 Op 1 . - CDS 12062 - 12475 605 ## EUBREC_1216 hemerythrin
14 8 Op 2 . - CDS 12540 - 12857 329 ## Closa_2289 hypothetical protein
15 8 Op 3 .
- CDS 12870 - 13823 742 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair Predicted protein(s) >gi|229783967|gb|GG667768.1| GENE 1 1 - 321 248 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624321|ref|ZP_06117256.1| ## NR: gi|266624321|ref|ZP_06117256.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 106 1 106 106 132 100.0 7e-30 RDTIKAEIGRSENLLRDNIKTEIERSENLLRDTIKAEIGRSENLVLSEVDRVQENLETKM EQLKRNMDELTQYYRTVKLDHENNALFLQMIQELKKEMEQLKIKIA >gi|229783967|gb|GG667768.1| GENE 2 796 - 1596 647 266 aa, chain - ## HITS:1 COG:CAC3359_2 KEGG:ns NR:ns ## COG: CAC3359_2 COG0778 # Protein_GI_number: 15896602 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 85 265 1 190 191 125 37.0 1e-28 MNSAIVINEERCIGCGLCVQDCPGNAIVLRDHKAAAERPCITCGHCVAVCPSGAVSIPDY DMAGVEAYQEDSFTVAPGNFLHAVKFRKSIRSYKDMPVSRDVLERIMEAGRYTATAKNAQ ACTFVAVQTRFQEFKELFWSEFPYILEALKETKPEYVRAFTRFYEKWKKDPKADTFFFNA PCLLITAADNPLDGGLAAANMENMAVAEGAGVLYSGYLQRTISVSPKLMEWLGLDGRPLA CCMLLGYPAVKYQRTAPRKKGDIRFL >gi|229783967|gb|GG667768.1| GENE 3 1571 - 2509 225 312 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631297|ref|ZP_00156835.1| COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily [Haemophilus influenzae R2866] # 18 284 5 277 290 91 26 3e-18 MSTKDGKGIKLLAAIGLIITTIIWGSAFVVMKNSVDIISPTYLLALRFTIAAIALILVFW RKVMKINKSDLLCGGLLGVFLFVSYFFQTYGLKYTTASKNAFITTLYVILVPFLHWLFNH KKPKRNNIIAACIAVVGLALLSLEGDLSINIGDLLTLICGFFYAIHIVFIDRYTEDHDPV KMTVLQMVVAAVLSWIIAPVLEGPFDGRVIDNSMIISLLYLGIFSTMIGFLLQNVGQKYL TPNTSSILLSFESVFGLIFSVIFLGDPLTIRLMAGCALMFAAVILSEYRKKTTPSIKPGK GDVSNEQRNCNQ >gi|229783967|gb|GG667768.1| GENE 4 2624 - 3685 1083 353 aa, chain - ## HITS:1 COG:TM0282 KEGG:ns NR:ns ## COG: TM0282 COG2017 # Protein_GI_number: 15643051 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Thermotoga 
maritima # 5 350 10 353 356 259 42.0 7e-69 MAYRKEQWGVMPSGEPVSLYTLSNDNGVSASFTDLGGTWVTMYVPDRDGKLVDVVLGYDS VEAYLENPPHFGAPIGRNANRIAGARFTLNGKEYRLAANDGTNNLHSAPDLYHSRLWEAE TEEGDLGTTVRFSLFSPDGDQGYPGNADITVSYTLTPDDSLVIAYHMVSDSDTIANFTNH TYFNLDGQDGGNAMGQSVWIDADAYTAADAGSIPTGEIVPVAGTPMDFTVMKEIGRDIDA AYEALILGMGYDHNWILNHAPGETALSAATESGKTGIRMEVCTDLPGMQFYTGNFLGTGP AGKNGMIYAPRYGYCFETQYYPNAVNMPEFPSPILKAGEEYRTTTVYRFTQQK >gi|229783967|gb|GG667768.1| GENE 5 3717 - 4805 1247 362 aa, chain - ## HITS:1 COG:CAC0022 KEGG:ns NR:ns ## COG: CAC0022 COG0136 # Protein_GI_number: 15893320 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Clostridium acetobutylicum # 4 361 3 360 360 464 60.0 1e-131 MEKKLKVGVLGATGMVGQRFISLLENHPWYEVVTMAASPRSAGKTYEEAVGDRWKMTTPM PEAVKKLTVMNVNEVEKVAASVDFVFSAVDMTKEEIKAIEEAYAKTETPVVSNNSAHRWT PDVPMVVPEINPEHFEVIKDQRKRLGTERGFIAVKPNCSIQSYAPVLTAWKEFEPYEVVA TTYQAISGAGKTFKDWPEMEGNIIPYIGGEEEKSEQEPLRLWGKVENGEIVKANEPVITC QCIRVPVLNGHTAAVFVKFRKKPTKEELIERLVSFKGLPQELELPSAPKQFIQYLNEDNR PQVTMDVDFEHGMGVSIGRLREDTVYDYKFVGLSHNTVRGAAGGAVLCAELLTAQGYIQA KS >gi|229783967|gb|GG667768.1| GENE 6 5097 - 5948 821 283 aa, chain - ## HITS:1 COG:lin2658 KEGG:ns NR:ns ## COG: lin2658 COG1307 # Protein_GI_number: 16801719 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 2 282 3 283 283 122 29.0 9e-28 MKRIGISADSSCDLSAELAEQYGIAVIPQAIIMDGKTYRDGIELKPADIFRHVSEGGEIC STTAINTAEYCGVFSELRKECGALIHFTIGSGLSSCYQNAVLAANEVPGVTVVDSMNLTT GIGYLAVAAAKMAAAGMELEEITAKLEELKAHMDVSFVIDKLDYLKKGGRCSAVTAFGAN LLNIKPSICMKNGTLAVEKKYRGAFRQCVRRYIEERLDAVKDRCVPDIIFLTHAACSEAV LEEAADLIRSRGIFSEILVTEAGCTVSSHCGPNTLGIIFAVKS >gi|229783967|gb|GG667768.1| GENE 7 5986 - 8223 2742 745 aa, chain - ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A 
subunit # Organism: Clostridium acetobutylicum # 4 739 6 734 830 510 41.0 1e-144 MAEKIVKTEYSEEMQKSYMNYSMSVITARAIPDARDGLKPVQRRVLYDMSELRLNHDKPH RKSARIVGDTMGKYHPHGDSSIYETLVVMSQVFKKGMPLVNGHGNFGSIEGDGAAAMRYT EARLEKFAEEVYLKDLDKTVEFIPNYDETEKEPEVLPVRVPNLLINGAEGIAVGMSTSIP PHNLGEVIDAVMAYIDNPDISTRELMEYMPGPDFPTGGIIANKSGLAEIYETGSGKIKLR GKIDIELGKRKADKDKMVITEIPYTMIGAGINKFLVDVAELVESKKLTDVVDISNQSNKD GIRIVLELKKDADVEKIRNILYKKTKLEDTYGVNMLAIAGGRPETLNLKGILKNYLDFQY HNATRKYQALLEKELEKKEIREGLIKACDIIDLIIAVLRGSKNLKDAKACLMTGDTSNIK FRVSGFAEDAKNLHFTEKQASAILEMRLYKLIGLEILALEKEYKETMAKIAEYRHILESR SNMDVVIKKDLAAIKEEFATPRRTLIEDGKEAVYDETAVAVSEVVFVMDRFGYCKLLDKS TYDRNRETIETENVHVVSCFNTDKICMFTNTGNLHQVKAADIPYGKLRDKGTPIDNLSKY DGTKEEIIYLCSAESMKGSRFLFATRQALVKRVPGEEFETNNRTVAATKLQDGDGVVSIL VLRAETEVVLQTSNGVFLRFALEEIPEMKKNSRGVRGIKLAQGEELETVYLLGEDPVVEY KKKEVHLNRLKIGKRDGKGSKVRLS >gi|229783967|gb|GG667768.1| GENE 8 8235 - 8792 637 185 aa, chain - ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 8 178 461 629 637 194 58.0 6e-50 MWRRLPWNKVLANAEIKTMINAFGCGFSEGYGNDFDITKLRYHKIILMTDADVDGSHIDT LLLTFLYRFMPELIYDGHVFIAMPPLFKVIPKRGEEQYLYDEKELERYRKTHTGDFTLQR YKGLGEMDAQQLWETTLDPEHRVLKQVEIEDARMASEITEMLMGSDVPPRRQFIYEHADE AEIDA >gi|229783967|gb|GG667768.1| GENE 9 8749 - 10155 1428 468 aa, chain - ## HITS:1 COG:BS_gyrB KEGG:ns NR:ns ## COG: BS_gyrB COG0187 # Protein_GI_number: 16077074 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Bacillus subtilis # 3 463 5 464 638 423 49.0 1e-118 MAKTQYNADSITVLEGLEAVRKRPGMYIGSVGTKGLNHLIYEIVDNAVDEHLAGFCNQIW VTLEADGSCTVRDAGRGIPVEMHKKGMSAERIVLSTLHAGGKFDNDAYKTSGGLHGVGSS VVNALSERMSVKVYKNGLIHYDAYERGIPTVELVDGLLPTRGKTKETGTEINFLPDGEIF 
EKTRFKAEWLKSRLHETAYLNPNLQITYQNKRGGEEETILFHEPEGIISYVKELNSGKAP VHDPIYYKGTIDKVEVEVAFQFVDTFEENILGFCNNIFTQEGGTHLAGFKTRFTQMINSY ARELGILKEKDANFTGADTRNGMTAVVAVKHPDPIFEGQTKTKLASADATKAVFTVTGDE LQRYFDRNLEILKAVISCAEKSAKIRKAEEKAKTNMLSKSKFSFDSNGKLANCESRDAEK CEIFIVEGDSAGGSAKTARNRMYQAILPIRGKILNVEKASMEQGFGQC >gi|229783967|gb|GG667768.1| GENE 10 10472 - 11101 650 209 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167769952|ref|ZP_02442005.1| ## NR: gi|167769952|ref|ZP_02442005.1| hypothetical protein ANACOL_01293 [Anaerotruncus colihominis DSM 17241] hypothetical protein ANACOL_01293 [Anaerotruncus colihominis DSM 17241] # 40 207 32 194 194 68 30.0 2e-10 MRKKFLKTGVILFMAAMLLAGCSGKSSKTETEAEPEQALYPVTIDGKEILVGQTTVQTLL DEGMNVTVSEMTEDKQINKYEIDPDATLEPNSYYTGGSIWVTDSIFAHISLVTDENAVRM GDAVIARLEFSLTSGEKSELEKITFNGVPVSAISREKAGEMFPDFKGDEYMWFQSGDDYE YFMSFSTTDSMMNKFSVEKEYDVDWNSKE >gi|229783967|gb|GG667768.1| GENE 11 11453 - 11734 310 93 aa, chain + ## HITS:1 COG:alr0877 KEGG:ns NR:ns ## COG: alr0877 COG1254 # Protein_GI_number: 17228372 # Func_class: C Energy production and conversion # Function: Acylphosphatases # Organism: Nostoc sp. 
PCC 7120 # 3 63 7 67 101 72 54.0 2e-13 MEEREMIRKHIFVSGRVQGVGFRYRTYYLAQSLGLTGWVTNLDDGRVEMELQGKEEDMNQ LFRRLEQNSFIDITDCTVKQIPLKKENSFHVRG >gi|229783967|gb|GG667768.1| GENE 12 11852 - 12025 221 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871348|ref|ZP_06117265.2| ## NR: gi|288871348|ref|ZP_06117265.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 57 3 59 59 65 100.0 1e-09 MRKKGEKELADLFDRAAESDEPVPPAPEHEFQAILAEMKRRGIEPRIRRELKEKEKK >gi|229783967|gb|GG667768.1| GENE 13 12062 - 12475 605 137 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1216 NR:ns ## KEGG: EUBREC_1216 # Name: not_defined # Def: hemerythrin # Organism: E.rectale # Pathway: not_defined # 1 130 5 134 144 126 47.0 3e-28 MSITFDRSLVTGNEMIDTQHRELIGRVNRLAEECVPGTEKRTAVGTLDFLLDYTDYHFTE EEALQKKSGYPKLSAHHLEHEKFKKAVEDLRVMLEEEEGPSEAFVDAVRKNVEEWLWNHI MTWDKEVAGYAETQAEI >gi|229783967|gb|GG667768.1| GENE 14 12540 - 12857 329 105 aa, chain - ## HITS:1 COG:no KEGG:Closa_2289 NR:ns ## KEGG: Closa_2289 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 103 1 103 103 128 56.0 7e-29 MSAFGIGTNTEKADSGTLTGRQIKIACKAWFTSSGVIKPLSFKFEGDDGMMQTVNHVTVR FGEEKCYSGIPSMEYACRAVIGGLTQEFKLIFYPEEGRWVMVLPE >gi|229783967|gb|GG667768.1| GENE 15 12870 - 13823 742 317 aa, chain - ## HITS:1 COG:CAC0285 KEGG:ns NR:ns ## COG: CAC0285 COG0389 # Protein_GI_number: 15893577 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Clostridium acetobutylicum # 1 285 108 391 396 227 41.0 2e-59 YSIDEAYMDMTESRALFGPPKEAAEKIRERILCELGFTVNIGVANNKLLAKMASDFKKPN RVHTLWKEEIPMKMWPLPVSELFFIGRATYRKLKGIGITTIGELARSDPDILRRHLGKQG EVIWGFANGIDLSAVEAEAPPNKGYGNSTTVAFDVCDAETAKLVLLSLAETLGARLRKDG VKVGVVAVGIRDYLFEYHSHQISLETSTSITRELYEAACRAFDEAWDKTPIRHLGIHTSR VSSEKSRQLGLFDRMDYEKLERMDKAVDEIRRRFGNDAVKRAAFLPEHPGKTAVHDHMGG GIAREKRTVDYSRQEVL Prediction of potential genes in 
microbial genomes
Time: Fri Jul 1 02:43:44 2011
Seq name: gi|229783966|gb|GG667769.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld162, whole genome shotgun sequence
Length of sequence - 9484 bp
Number of predicted genes - 9, with homology - 9
Number of transcription units - 7, operones - 2 average op.length - 2.0
N  Tu/Op  Conserved pairs(N/Pv)  S  Start  End  Score
1 1 Op 1 . + CDS 2 - 775 705 ## COG0657 Esterase/lipase
2 1 Op 2 . + CDS 787 - 1266 614 ## COG1576 Uncharacterized conserved protein
    - Term 1301 - 1353 11.1
3 2 Op 1 . - CDS 1409 - 2584 1128 ## COG2866 Predicted carboxypeptidase
    - Prom 2609 - 2668 2.8
4 2 Op 2 . - CDS 2697 - 4811 1365 ## COG1455 Phosphotransferase system cellobiose-specific component IIC
    - Prom 4865 - 4924 5.1
    + Prom 4918 - 4977 3.9
5 3 Tu 1 . + CDS 5030 - 6484 1336 ## COG2006 Uncharacterized conserved protein
6 4 Tu 1 . - CDS 6515 - 6739 280 ## Closa_1058 hypothetical protein
    - Prom 6792 - 6851 6.8
    + Prom 6828 - 6887 4.2
7 5 Tu 1 . + CDS 6932 - 7954 902 ## COG4129 Predicted membrane protein
    + Prom 8015 - 8074 2.6
8 6 Tu 1 . + CDS 8281 - 8559 340 ## CPE0738 hypothetical protein
    + Term 8561 - 8618 9.0
    + Prom 8628 - 8687 2.2
9 7 Tu 1 .
+ CDS 8882 - 9482 390 ## COG4332 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|229783966|gb|GG667769.1| GENE 1 2 - 775 705 257 aa, chain + ## HITS:1 COG:RSc0439 KEGG:ns NR:ns ## COG: RSc0439 COG0657 # Protein_GI_number: 17545158 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Ralstonia solanacearum # 21 214 109 302 317 134 38.0 1e-31 LEPKRFEFAVRYSKLSFGGDVLTIDYRVAPEYPYPAALEDAVYAYKWLIQEKEYEPDHII IAGDSAGGGLALALVMYLRDHRLAMPGGIITMSPWTDLTNSGASYTDNYETDPLFGNSTD NMLFHSSYIGEADPKEPYLSPLFGDYHGFPPVLMQVGSYEVLLSDTLEVAEKLKRAGVRR RLSVYEGMFHVFQMGMDLIPESREAWEEVEEFLRITCGISRKPDGKVVKRVKTRRKKTAA QVAGRVLESMKKNLPVR >gi|229783966|gb|GG667769.1| GENE 2 787 - 1266 614 159 aa, chain + ## HITS:1 COG:BS_yydA KEGG:ns NR:ns ## COG: BS_yydA COG1576 # Protein_GI_number: 16081075 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 159 1 159 159 170 55.0 8e-43 MKITLITVGKIKEKFYTAAIEEYSKRLSRYCKLEIIQVADEKTPDHAGEALETQIREKEG ERILAGIREGAYVIALAIEGKMLDSVELSEKIAALGVSGTSQIIFIIGGSLGLSKQVLDR ADYLLSFSRMTFPHQLMRVILLEQIYRGYRIMNGQAYHK >gi|229783966|gb|GG667769.1| GENE 3 1409 - 2584 1128 391 aa, chain - ## HITS:1 COG:BH1603 KEGG:ns NR:ns ## COG: BH1603 COG2866 # Protein_GI_number: 15614166 # Func_class: E Amino acid transport and metabolism # Function: Predicted carboxypeptidase # Organism: Bacillus halodurans # 84 376 66 341 355 144 33.0 2e-34 MDLKAGQSTVEQEAERGTNMKRMVCAVLSGILILGNAGYGGMVSYGAPTYAEANAVGPGI KAAVGLNDFLDQEVPNPIVKPADKYSYEQMEADIQSLKNTYGDKITVNVIGTSLDGRNIY DIILGNPEAKSQILMQGAIHAREYMTPLVMMSQLEYALAFYDTGHYNYKSVKDMLKKVAI HFVPMTNPDGVTLSQFGIDAIRSDSLKQGIRDCYTRDVTLKRTADSFETYLTKWKANAAG VDLNYNFPYGWDELATKLTEPSSASYKGPAPFSEPESQALRTLVDQYPWAVTVSYHSQGQ VIYWTTSSNGAELASNTLAEAVSVMTGYRMDNSDGKGGFKDWMQSRSGAVPGVTLEVGKT PCPVPFAEFPEVWRQNKGVWVQVLDYVVRRI >gi|229783966|gb|GG667769.1| GENE 4 2697 - 4811 1365 704 aa, chain - ## HITS:1 COG:lin2832 KEGG:ns NR:ns ## COG: lin2832 COG1455 # Protein_GI_number: 16801892 # 
Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 18 415 27 427 435 162 28.0 2e-39 MSFRKILSRFQISIFARSIRHGLTLAIPFLIMGSFSLLLNNFPIDSYQVFLASWLHGTAA QLLNTIYTISLGSLALVLTVTISISYGQLAETDQFLLYPVVSVCSYLAFCGGIKDQTEYI FSAEWVFTAMCITMLACVLFQRLLRFGRRFERLHTTGADYLFNISIQSLVPVIVITLFFT ITGYLVRITWSDSNITNFGSYLFLKLFNGISGNLFGILLYVVITHVLWFLGIHGTNTLEA VSRQLFEQNVEINQALMTAGSLPTEIFSKTFLDTFVFLGGCGTALSFVIALCIASGQSHN RKIAHVALPSVLFNISEIAVFGFPIIFNFTMAVPFIITPVVLTVTSTLAIHLGLVPAVTQ SVDWTVPVLLSGYKSTGSIAGSLLQLFNLIIGTMIYIPFIRRSEKLEAGKFTQAIRQMEA DMASGEKQGNPPRFLTPEYPCYYYAKTLAMDLQNAIKRGQVQLYYQAQLSHDGTLHGMEA LMRWKHPVTGFIAPPVLAGLAYEHGFLDELGSYLIRRACLDAQTMERLLKKDAYLSVNVS SMQIKSDGFFDMALSLIKQYPAEHIHPVFEITERAALEISDHLKIEMEHLRSQGVEFSLD DFGMGHNSLLQLQEDAFDEVKLDGNLVSQLLTNKRSREIVSSIIQMSRNLNCRIVAEFVE TAELRNVLSELGCSIYQGYYYCRPLPLEEFMDYISRLPEKGHGS >gi|229783966|gb|GG667769.1| GENE 5 5030 - 6484 1336 484 aa, chain + ## HITS:1 COG:MA1031_1 KEGG:ns NR:ns ## COG: MA1031_1 COG2006 # Protein_GI_number: 20089906 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 87 356 14 253 295 60 26.0 6e-09 MKKLWIYMAMPMTLLNGCQIHPAEAVPAVNTVAERKSEQGQMVPLEQTVPGYQPDTAVQT VKEAVFHGSTVTIAKSQKAKAADLTESEIEAMVRTAASDLKDIVKNGQTVVLKPNLVQMI VDSTGELLDQEVNGVTADWRVTKAVLKMVREFNPDGKVYIMEGSATGPTREVMNYFHYTP DYMEGVDSFICLEEDCGAWQDFDAPEVVKVELPEGLLHKSYYFNRILYEADVVISIPALK TSSGVVVTGGIKNVSIGTPPGNLYGMAPDNPSKAAMVSHKITDGELDRWIYDYYMARPVD YVIVDGLQGFQSGPVPMSHERKETDKMNMGVIMGGSDAVAVDTICSLVTGWDPESVGYLN LLREHTEAGELENIRVKGAYVDELRKKFTIRKPELGGIQLEAGTGPSLEARAEKNGDQLE LHYKTGENACKAELFVDGLFQYSEEAVSDGRIQLNIPGLSAGTHEVQIVVYDRFLNKTSK AMEL >gi|229783966|gb|GG667769.1| GENE 6 6515 - 6739 280 74 aa, chain - ## HITS:1 COG:no KEGG:Closa_1058 NR:ns ## KEGG: Closa_1058 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 67 1 67 75 78 55.0 7e-14 
MNFAALMQLKEEWTSFKIRHPKFPAFLKAVRGRALAEGTVIDIRVTAPDGTVLNSNLKLK KEDLELIRRIHELI >gi|229783966|gb|GG667769.1| GENE 7 6932 - 7954 902 340 aa, chain + ## HITS:1 COG:CAC2484 KEGG:ns NR:ns ## COG: CAC2484 COG4129 # Protein_GI_number: 15895749 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 8 324 7 315 320 218 39.0 1e-56 MKKPDYGKVMKIAAGAGLAMILAERLGLNYSASAGVITLLSIQDTKKETIRVVVRRLLSF VLAMVISAACFSAFGYGPAAVSVFLLMFSTVCMAFGMQEGISVNTVLMTHFLAERSMNAV NIGNELGLLVTGAGIGVLLNLYIPGKERVIRMSQRQIEGRMKEILWKMAKQLTAADQERS NMPSDQPPLTSRLTELDETLKQGEKSAYEDMENKLLSDTRYYLRYMNMRRTQAFVLGRIE AEMSHLKELPSQARPIAELMERISVSFHEYNNAVDLLASLDRVTCSMREQPLPESREEFE SRAVLFQILLELEQFLEIKKKFVEELSAEEIGKFWKENAC >gi|229783966|gb|GG667769.1| GENE 8 8281 - 8559 340 92 aa, chain + ## HITS:1 COG:no KEGG:CPE0738 NR:ns ## KEGG: CPE0738 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens # Pathway: not_defined # 1 88 1 88 92 95 57.0 6e-19 MRYRKATEILPEELVEAIQKYMDGGYVYIPRKEENKKKWGEETSFRRELKTRNEEIYRRY LDGFGVKTLAEEYFLSEKSIQRIILSEKNKIK >gi|229783966|gb|GG667769.1| GENE 9 8882 - 9482 390 200 aa, chain + ## HITS:1 COG:CAC0055 KEGG:ns NR:ns ## COG: CAC0055 COG4332 # Protein_GI_number: 15893352 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 18 189 1 161 196 96 35.0 3e-20 MRDNIKNINNSNTGLDTIRKERWRITPEAVPTVRRNCSKCKEKREFINSMKFRVNANGKN VDVWLIYRCEKCSTTWNMSIYERTPVSSVLQEEYAGFLENDRTLAEAYGRRMDLFVKNRA EAVIPEGGYRTEVEADGNADENSEVVKNGFRLIELVVPFPVTLRLDGLIADRLGVSRSMV KRWCSQGIMVKLDEYEMVWK Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:43:51 2011 Seq name: gi|229783965|gb|GG667770.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld163, whole genome shotgun sequence Length of sequence - 6981 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 4, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 
1 Op 1 . - CDS 3 - 2205 1227 ## gi|266624344|ref|ZP_06117279.1| conserved hypothetical protein 2 1 Op 2 . - CDS 2209 - 2766 450 ## gi|266624345|ref|ZP_06117280.1| conserved hypothetical protein 3 1 Op 3 . - CDS 2738 - 3028 224 ## gi|266624346|ref|ZP_06117281.1| conserved hypothetical protein - Prom 3137 - 3196 4.5 - Term 3144 - 3183 5.1 4 2 Op 1 . - CDS 3198 - 3323 63 ## gi|331089502|ref|ZP_08338401.1| hypothetical protein HMPREF1025_01984 5 2 Op 2 . - CDS 3316 - 3900 382 ## gi|266624348|ref|ZP_06117283.1| conserved hypothetical protein - Prom 4092 - 4151 4.0 - Term 4105 - 4146 -1.0 6 3 Op 1 . - CDS 4153 - 4341 209 ## gi|266624349|ref|ZP_06117284.1| conserved hypothetical protein 7 3 Op 2 . - CDS 4338 - 4613 104 ## gi|266624350|ref|ZP_06117285.1| conserved hypothetical protein - Prom 4681 - 4740 4.9 - Term 4720 - 4765 1.4 8 4 Op 1 . - CDS 4969 - 5226 169 ## gi|288871353|ref|ZP_06410128.1| conserved hypothetical protein 9 4 Op 2 . - CDS 5226 - 5459 120 ## gi|288871354|ref|ZP_06410129.1| conserved hypothetical protein 10 4 Op 3 . - CDS 5465 - 5677 122 ## gi|266624353|ref|ZP_06117288.1| conserved hypothetical protein 11 4 Op 4 . - CDS 5667 - 5801 160 ## gi|266624354|ref|ZP_06117289.1| conserved hypothetical protein 12 4 Op 5 . - CDS 5802 - 6170 273 ## EUBELI_10051 hypothetical protein 13 4 Op 6 . - CDS 6183 - 6344 206 ## gi|288871355|ref|ZP_06410130.1| hypothetical protein CLOSTHATH_05718 14 4 Op 7 . - CDS 6322 - 6543 304 ## gi|288871356|ref|ZP_06410131.1| conserved hypothetical protein 15 4 Op 8 . 
- CDS 6548 - 6757 223 ## gi|288871357|ref|ZP_06410132.1| conserved hypothetical protein - Prom 6809 - 6868 5.2 Predicted protein(s) >gi|229783965|gb|GG667770.1| GENE 1 3 - 2205 1227 734 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624344|ref|ZP_06117279.1| ## NR: gi|266624344|ref|ZP_06117279.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 734 1 734 735 1360 100.0 0 MNSIVEDILMHYGMPRRSGRYPYGSGENPYQHSGDFLSRVQELKKSGMSETDIAKNMGLT TTQLRTQMSLAKDERRALQVATAKGLREKGYSLNEIADKMGFANDSSVRSLLNETSENRM NQAKATADVLRKLIEEKGMIDVGTGVERELGVSKEKLNQALYMLELEGYPIYGGGVPQVT NPGKQTNIKVICPPGTEHKDIYDFENVHSVRDYISYDNGESFRKSFEYPASMDSKRLQIR YADQGGVDKDGVIELRRGVKDLSLGDSHYAQVRIMVDGTHYLKGMAVYSDNMPDGVDVIF NTNKKSGTPTKDVLKKIKDDPDNPFGSLIKEHGGQSYYDDPKGKYTDPVTGKKQSLSLIN KRAEEGDWGEWSKTLPSQFLSKQSLTLIKKQLGLAKADKQAEYDEICSLTNPTVKKALLK SFADDCDAAAVHLQAAALPRQKYQVILPLTTIKDNEVYAPNYKDGETVALIRYPHGGTFE IPILKVNNKLAEGKSVLGNTPADAIGINKKNADRLSGADFDGDTVMVIPCNSTKSKVKIT STSPLKGLEGFDTKDAYGGTVKKDADGVDHYYRNGKEYKIMRNTQTEMGKVSNLITDMTL KGATQDELARAVRHSMVVIDAEKHKLDYKQSEIDNGIASLKKKYQGNVDSEGRYHEGAST LISRAKSETQVLKRKGSPTINEDGSLSYKSVKEEYVDKNGKIQVRTQKSTKMAETKDART LSSGTPQEEAYADY >gi|229783965|gb|GG667770.1| GENE 2 2209 - 2766 450 185 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624345|ref|ZP_06117280.1| ## NR: gi|266624345|ref|ZP_06117280.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 185 1 185 185 375 100.0 1e-102 MNFNNHSNLEGQHAFLGASKYHWINYGEDKVAEAYRNFLATQKGTVLHAFAAQCIMLNQK LPKSKQTLNMYVNDAIGFKMTPEQILYYSDNCFGTADAILFRNNFLRIHDLKTGKIPAHM EQLEIYAALFCLEYKVKPGDIEMELRIYQNNEILYHNPTAEDIVPIMDRIITFDKVIKKI REQEG >gi|229783965|gb|GG667770.1| GENE 3 2738 - 3028 224 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624346|ref|ZP_06117281.1| ## NR: gi|266624346|ref|ZP_06117281.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein 
[Clostridium hathewayi DSM 13479] # 1 96 1 96 96 180 100.0 3e-44 MLENKFQANLIKELKERFPGCIVMKNDPTYIQGIPDLLVLHKDKWASLECKKSAGAKKQP NQEYYVDRMNQMSFSRFICPENKEEVLDELQQSFEP >gi|229783965|gb|GG667770.1| GENE 4 3198 - 3323 63 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|331089502|ref|ZP_08338401.1| ## NR: gi|331089502|ref|ZP_08338401.1| hypothetical protein HMPREF1025_01984 [Lachnospiraceae bacterium 3_1_46FAA] hypothetical protein HMPREF1025_01984 [Lachnospiraceae bacterium 3_1_46FAA] # 1 41 1 41 41 64 97.0 3e-09 MSKGGKKKRSTAGLILDVILTLCTGGLWLIWILIRYLRNNS >gi|229783965|gb|GG667770.1| GENE 5 3316 - 3900 382 194 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624348|ref|ZP_06117283.1| ## NR: gi|266624348|ref|ZP_06117283.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 194 25 218 218 310 100.0 4e-83 MGTKSNKNISGVIGAIGAVGGLITAVTPLVEKAIDNAQNKPTEKIDTKVIIPELYRKGFP IDLEQAEELLTERGLKVSKSKLRMKEADPKYRDYEDTQVIDSNPKQGAKVKVGTTVCLRY ITAEVIEESQKIFDDDVRIKQEAKEQKAAEKQEKKERLKESVSETMDSAKSGLGKIFKKD RKAIEAEKGETIDE >gi|229783965|gb|GG667770.1| GENE 6 4153 - 4341 209 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624349|ref|ZP_06117284.1| ## NR: gi|266624349|ref|ZP_06117284.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 62 1 62 62 105 100.0 1e-21 MNVKRKVTWKDIFNNFKSVYPRLSKEAQDYRPYNYMSIIVYLADGTKVVYDDMVKRAKML AA >gi|229783965|gb|GG667770.1| GENE 7 4338 - 4613 104 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624350|ref|ZP_06117285.1| ## NR: gi|266624350|ref|ZP_06117285.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 91 1 91 91 187 100.0 3e-46 MTYDKKLVEDWLCEHFPYHLRVNKDIPNGAHVTMKNEIAISQEWLWVDNPPYQSFEDVMF GYTIPRDFYSGAGASYCGYPFGGLYPIGGLP >gi|229783965|gb|GG667770.1| GENE 8 4969 - 5226 169 85 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|288871353|ref|ZP_06410128.1| ## NR: gi|288871353|ref|ZP_06410128.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 85 2 86 86 136 100.0 4e-31 MTMEELQKACETLAEAWNKVLEPMEKLAKDLSEAFGRMYASEEENRKIRTGRKLKSVRRV PDSKMSTYNYKPVVKRNLPYQRRNF >gi|229783965|gb|GG667770.1| GENE 9 5226 - 5459 120 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871354|ref|ZP_06410129.1| ## NR: gi|288871354|ref|ZP_06410129.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 22 77 1 56 56 108 100.0 9e-23 MKICKVRPNYSTCSACIATQEMFNVVDDCSRCKSNTDTYELLQVGTGFWSGDYAMVQKDG KITKVSLNRVYDVKESL >gi|229783965|gb|GG667770.1| GENE 10 5465 - 5677 122 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624353|ref|ZP_06117288.1| ## NR: gi|266624353|ref|ZP_06117288.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 70 1 70 70 130 100.0 3e-29 MTAKDDRKNAEGYNDPTAYNAIKNVEQEQDKDDMRFHQLLNTLFSLCELADFHIEGRVVL KDKRTGKVWR >gi|229783965|gb|GG667770.1| GENE 11 5667 - 5801 160 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624354|ref|ZP_06117289.1| ## NR: gi|266624354|ref|ZP_06117289.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 44 1 44 44 63 100.0 3e-09 MGYLILGIIVLTAILIFGGYIVLSVMNAAMWMDDSMRWGGRDDS >gi|229783965|gb|GG667770.1| GENE 12 5802 - 6170 273 122 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_10051 NR:ns ## KEGG: EUBELI_10051 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 122 1 121 121 137 63.0 2e-31 MGRAERRRAQKCEQKAKTATYNLTRAQLDALVREKISGELDRVKQEATNDAINQAMILLL TLPLEVLMDHYWPKSYAKRIPEFTEYVLEYYEKWQNDELDIDKLKEDLWVYGGVRLEEVE GK >gi|229783965|gb|GG667770.1| GENE 13 6183 - 6344 206 53 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|288871355|ref|ZP_06410130.1| ## NR: gi|288871355|ref|ZP_06410130.1| hypothetical protein CLOSTHATH_05718 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05718 [Clostridium hathewayi DSM 13479] # 6 53 1 48 48 76 97.0 6e-13 MLRRDLIKNKIYGIIFIVLGALTIPIEWDATFFLFALMVGILLFASRENCIMD >gi|229783965|gb|GG667770.1| GENE 14 6322 - 6543 304 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871356|ref|ZP_06410131.1| ## NR: gi|288871356|ref|ZP_06410131.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 73 7 79 79 132 100.0 1e-29 MHFTVIQIIIMFLIGYVCLYALIDRVMKCIEHCATARAYGRFREAGATMKMDDVAAGIAK SKEEKRNVEKRLD >gi|229783965|gb|GG667770.1| GENE 15 6548 - 6757 223 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871357|ref|ZP_06410132.1| ## NR: gi|288871357|ref|ZP_06410132.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 69 24 92 92 107 100.0 2e-22 MVKVRDILPLIQWNDAQIIKDQDEEICLLRNDFMVGSLSEEILNMTVTGIENDENIENTV VVYVTDEED Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:45:43 2011 Seq name: gi|229783964|gb|GG667771.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld164, whole genome shotgun sequence Length of sequence - 14335 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 7, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 401 399 ## COG0524 Sugar kinases, ribokinase family + Term 415 - 471 -0.3 + Prom 1348 - 1407 80.4 2 2 Op 1 1/0.000 + CDS 1482 - 2108 810 ## COG1414 Transcriptional regulator + Term 2134 - 2160 -1.0 + Prom 2154 - 2213 5.3 3 2 Op 2 8/0.000 + CDS 2271 - 2921 794 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase 4 2 Op 3 2/0.000 + CDS 2970 - 3599 720 ## COG0524 Sugar kinases, ribokinase family + Prom 4446 - 4505 80.4 5 3 Op 1 . 
+ CDS 4574 - 5131 586 ## COG0524 Sugar kinases, ribokinase family 6 3 Op 2 . + CDS 5185 - 5283 112 ## + Prom 5322 - 5381 2.9 7 4 Op 1 . + CDS 5426 - 5710 360 ## gi|266624365|ref|ZP_06117300.1| conserved hypothetical protein 8 4 Op 2 . + CDS 5774 - 7735 1942 ## COG2199 FOG: GGDEF domain + Term 7746 - 7818 25.8 - Term 7738 - 7800 18.0 9 5 Op 1 13/0.000 - CDS 7827 - 8600 216 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 10 5 Op 2 9/0.000 - CDS 8593 - 9477 1093 ## COG4120 ABC-type uncharacterized transport system, permease component 11 5 Op 3 . - CDS 9544 - 10647 1250 ## COG2984 ABC-type uncharacterized transport system, periplasmic component 12 6 Tu 1 . - CDS 11032 - 11469 653 ## gi|266624370|ref|ZP_06117305.1| hypothetical protein CLOSTHATH_05732 - Prom 11564 - 11623 4.4 + Prom 11537 - 11596 8.2 13 7 Op 1 . + CDS 11623 - 12132 682 ## COG1335 Amidases related to nicotinamidase 14 7 Op 2 40/0.000 + CDS 12162 - 12854 775 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 15 7 Op 3 . 
+ CDS 12868 - 14199 1620 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|229783964|gb|GG667771.1| GENE 1 3 - 401 399 132 aa, chain + ## HITS:1 COG:TM0067 KEGG:ns NR:ns ## COG: TM0067 COG0524 # Protein_GI_number: 15642842 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 1 132 211 339 339 104 44.0 3e-23 VTSGELELQGYVDIFNQMIDKFGFKYVISSLRESHSASDNGWSACIMDGATREFYHSRKY HITPIVDRVGGGDSFAAGVICGMVDGKDYKSALEYAVAASALKHTIPGDFNLVTREDVDS LAGGDGSGRVQR >gi|229783964|gb|GG667771.1| GENE 2 1482 - 2108 810 208 aa, chain + ## HITS:1 COG:kdgR KEGG:ns NR:ns ## COG: kdgR COG1414 # Protein_GI_number: 16129781 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 208 52 260 263 119 30.0 6e-27 MTTLAEMGMVNVARGQKQSYTIGLMAYRIGINYTNNLDFIGTIEPVLKAFSKEVGKTVFF GIPSDHHVVYICKFEPENPIITTATVGSKNPMYCTSLGKVILAYSDEETREQVINRIKFV KHTERTILGREQLLKELEKVRERGYAFDAREMEEHMQCVGAPVFDRDGSCLGAISVSSLY KPTEDYEALGALVAAKGREVSRLLGYFE >gi|229783964|gb|GG667771.1| GENE 3 2271 - 2921 794 216 aa, chain + ## HITS:1 COG:CAC2973 KEGG:ns NR:ns ## COG: CAC2973 COG0800 # Protein_GI_number: 15896226 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Clostridium acetobutylicum # 1 216 1 213 213 243 61.0 1e-64 MKKEQVLAKMKEDCLVAVVRAKNLEQGEKVVDAIIAGGINFIEITMTMDEGNPVEFIAKM SEKYRDNKDVVIGAGTVLDPETARAVILAGANYVVSPGLNVETIKMCNRYRVPMLPGVMT PTEAITALEAGCDIIKIFPGNIMGPAAISSFKGPLPQGEFMPSGGVDVDNVDKWIKAGAY AVGTGSSLTKGAKTGDFAAVTAKAQEFVKAVAAARA >gi|229783964|gb|GG667771.1| GENE 4 2970 - 3599 720 209 aa, chain + ## HITS:1 COG:TM0067 KEGG:ns NR:ns ## COG: TM0067 COG0524 # Protein_GI_number: 15642842 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 3 204 2 203 339 242 58.0 3e-64 MAKVVTMGEIMLRLSTPNNEKFIQADEFDVNYGGGEANVAVSLANYGHDAEFVTAVPKNP 
IGECAVAALRKYNVCTKHIARCGERLGIYFLETGSAMRASTVVYDRAHSSISTATEADFN FDEIFEGADWFHFTGITPAVSDSAAELTEAACKAAKAHGVTVSVDLNFRKKLWSSEKAQK VMTNLMQYVDVCIGNEEDAEKVLGFKPGN >gi|229783964|gb|GG667771.1| GENE 5 4574 - 5131 586 185 aa, chain + ## HITS:1 COG:CC1496 KEGG:ns NR:ns ## COG: CC1496 COG0524 # Protein_GI_number: 16125743 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Caulobacter vibrioides # 9 172 182 360 368 80 27.0 1e-15 MIRRFKSNGTIISFDVNFRGNLWTGEEARECIEQILPYVDIFFCSEDTARLTFQKTGTAR EMMKSFTEEYPISIVASTQRVVHSPKSHTFGSVIYDAARDEYYEEEPYRNIDVVDRIGSG DAYIAGALYGLLSSGGDCMKAVRYGNAASALKNTIPGDLPTSDLEEMNMIIREHNQKGPQ SEMSR >gi|229783964|gb|GG667771.1| GENE 6 5185 - 5283 112 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKLSKGSDAKLKGRMGSQLLKPYEMLGLQPD >gi|229783964|gb|GG667771.1| GENE 7 5426 - 5710 360 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624365|ref|ZP_06117300.1| ## NR: gi|266624365|ref|ZP_06117300.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 94 1 94 94 137 100.0 4e-31 MKKENSFSRNHVRAVLLLVVLVLAFIYLFLYDFSRRVDHLMTDMTTEHLADVNDQSQTAV KNELTRVFEEVRYTADFLSETPELTEESRQRVYE >gi|229783964|gb|GG667771.1| GENE 8 5774 - 7735 1942 653 aa, chain + ## HITS:1 COG:PA4601_2 KEGG:ns NR:ns ## COG: PA4601_2 COG2199 # Protein_GI_number: 15599797 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 468 653 2 195 207 124 39.0 7e-28 MEHEELPHSTGGYVEGICRGESGMTDVFPSSVTGQEVFAFYAPVRRGETVEGGIAGIMIV EQMVDMVTTNGFNSGSYSYVLKSDGTVIMQTSHADSLYSGRDYFSFLRSDTDMGEESVEE LQEKMERKERGVFLFTYGKEKRIVYYAPAQINDWYVLTVVPYTVSNWYREHIGKTALYLT VKMLLLFLAELAVIVLWFRKTRNIILDSKTALELEKKKLELALLHSTGTTFEYVPKDDSL SFISPPVVKTAEFPPVLYQVSAKGAACGLVAPEDLEKWRMILKRAAEGKEPDPLEFAGGT SFSEDTYFRLSVTPVIVGKGEVANAIGTLEDITEERRIRRQFAQEEQYRAALLSEAVAMW SVDLVKQQVLACTIHGKDWMEDGSGFLYSSRFVEKMCRYIHPADHDRIAGQMQAKRMLAA 
YYTGEREQKELFRAAYPGSDDYKWMTCTISLLTEPVSGNPVAFAYVRDVDEETKRKMELL YSSERDPLTGLYNRRNIEEKVEQALKHEDSLSCLMMLDLDGFKEINDRYGHQEGDSVLKK MARILSGAFRSDDLIARFGGDEFVVFLERIPNRECASFRAEEVRRQVYALTMDGEDSIVS ISIGIAFSPQDGNTFELLYRHADEALYSAKGRGKNQIAFYGEEEEIMVLERLD >gi|229783964|gb|GG667771.1| GENE 9 7827 - 8600 216 257 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 254 1 245 245 87 27 4e-17 MLELKKINKYYNQGTVNEMCLFEDFSLTIKDRQFVSVVGSNGSGKTSMLNIICGSIPLDS GRIIIGGKDITSMPEYRRQRRIGRVYQNPAMGTCPHMTILENMALADNKGKPFNLLPGTN RQRINYYKEQLHSLGLGLEDKLHVKVGVLSGGQRQAMALLMSTMTPIEFLILDEHTAALD PKTAEVIMELTDQIVREKNLTTIMVTHNLRYAVEYGDRLIMMHQGNAIIDCADEDKAALK VDDILDRFNEISIECGN >gi|229783964|gb|GG667771.1| GENE 10 8593 - 9477 1093 294 aa, chain - ## HITS:1 COG:FN2080 KEGG:ns NR:ns ## COG: FN2080 COG4120 # Protein_GI_number: 19705370 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Fusobacterium nucleatum # 3 289 1 275 278 236 48.0 4e-62 MSIILGVFEEGLVYAIMALGVYITYKILDFPDLSVDGTFPLGGAVTVTLILAGVNPVLTL FAAFAAGVLAGCVTGFIHVKLKVRDLLSGIIVMTALYSVNLRVAGKSNVPIFNSDSIFEN GFVSRMFPDSISNFIVVIILAVIVILVKLLLDLYLKTRSGYLLRAVGDNDTLVTSLAKDK GMVKIVGLAIANGLAALAGSVYCQQKGFFEISMGTGTIVIGLANVIIGTKLFKRVGFVKS TTAVVIGSIIYKGCVSLAIYLGLEASDLKLITAALFLVILVLSNGREKKVKSHA >gi|229783964|gb|GG667771.1| GENE 11 9544 - 10647 1250 367 aa, chain - ## HITS:1 COG:PA3836 KEGG:ns NR:ns ## COG: PA3836 COG2984 # Protein_GI_number: 15599031 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Pseudomonas aeruginosa # 58 361 25 323 325 184 37.0 2e-46 MKKTTKALLLAAMTVTALAGCSSKPADPAATATEAVTTTAAADTTAAEAADTTAAKDASA EDGKSYTIGIGQFAEHGSLDNCRDGFLQGLADEGIVEGQNLTILYENAQADGGTASQIIN NFISKKADLICAIATPIAQAAYGGAKKDNVPVIYTAVTDPVAAALANEDGTPVGEVTGTS DKLPVEKQLEMIRQILPDAKKIGILYSTSEVNSETAIAEYKAAAPDYGFEIVEGAVTSTA 
DIPLAADSILEKADCLNNLTDNTVVSSLPMILDKAAKKNIPVFGSEVEQVKIGCLASMGL DYVDLGIQTGKMAAKVLKGEQKASEMNFEVIEEAAFYGNSKVAENLGITLPQDLTDSAAA IFTEITQ >gi|229783964|gb|GG667771.1| GENE 12 11032 - 11469 653 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624370|ref|ZP_06117305.1| ## NR: gi|266624370|ref|ZP_06117305.1| hypothetical protein CLOSTHATH_05732 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05732 [Clostridium hathewayi DSM 13479] # 1 145 1 145 145 233 100.0 5e-60 MEEKKNPFASLSDKELEAMLKRLEDSNERQASYAKKQYRMSQITALASILILAIVIYTSA TIIPRVNRLFDDVETVMDSVNVVAKELEEANLGQMVTDIDNLVTSSDQSVKEALRQINSV DIQTLNQAILDLSNLISPLARFFGK >gi|229783964|gb|GG667771.1| GENE 13 11623 - 12132 682 169 aa, chain + ## HITS:1 COG:L67226 KEGG:ns NR:ns ## COG: L67226 COG1335 # Protein_GI_number: 15672251 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Lactococcus lactis # 2 168 4 168 171 181 52.0 5e-46 MKLLIVVDMQKDFIDGALGSSEAQAIVPSVKKMIEEYQAAGDDVIFTLDTHDDKYMDTQE GKNLPVPHCIKGTDGWELDSSLKEFQGKRFEKNTFGSTALMEYVQERPYESVMLIGVCTD ICVISNALLVKAAIPEVPVLVDSSCCAGVTPESHENALNAMKMCHIAVQ >gi|229783964|gb|GG667771.1| GENE 14 12162 - 12854 775 230 aa, chain + ## HITS:1 COG:BS_yrkP KEGG:ns NR:ns ## COG: BS_yrkP COG0745 # Protein_GI_number: 16079696 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 1 227 1 226 231 231 52.0 6e-61 MAQRILIADDSEEIIEILQILLTGEGYEVVTAGDGEEAVRLADDTIDLIILDVMMPVKSG YVACSEIRKKTMAPILFLTARTQDSDKSMGFSAGGDDYLAKPFSYSELLSRCKSLLRRYL VYQGKSFAAPDNRLVLGDLTVDQDTRTVTVEGRDVNLTSREYCILVLLLKNRRKIFSSEN IYESVWNEPYFYSANNNVMVHIRNLRRKIEKDPQNPEYIKTMWGRGYYID >gi|229783964|gb|GG667771.1| GENE 15 12868 - 14199 1620 443 aa, chain + ## HITS:1 COG:BS_yrkQ KEGG:ns NR:ns ## COG: BS_yrkQ COG0642 # Protein_GI_number: 16079695 # Func_class: T Signal transduction mechanisms # Function: 
Signal transduction histidine kinase # Organism: Bacillus subtilis # 17 439 21 427 432 165 26.0 1e-40 MANPTRLSLATLYDVVIAAAAAFLCYVAVTNAMGLLAGMMTDQKNYYERHTKLLAEEFQD FVTENHVEIGDGETIGDWNSENWYVFLTVYQRERVYYNTMDRDNEKLKTELKDEHPGFQY GYLYRHGTAVTYPVKFADGDGEILIRAYFEARFRTLILVLGAGLSAVCFVVVFLVLLQKK LAYIRQIEKGIHILETGSRGYVIPLKGRDELYSLADSINQMSESLHKEIEEKDRIEKERS EIVTALSHDIRTPLTSVLFYLDLITAGKCTADESVLYMEKAKRQAYHMKNLMDDLFSYSY ATGDRFSLKNEEYDGNELFGQLIGDLTESLEEHGFRTEVHYKIETPFTVETDVVQLKRVL NNIGTNIEKYARRDGIVSVEVRLSDGFVCLNIENPIRDEEIEVESYGIGVKASEQMIEKM GGSFTYESHDYRYHTLMKLPEKQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:46:05 2011 Seq name: gi|229783963|gb|GG667772.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld165, whole genome shotgun sequence Length of sequence - 17695 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 6, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 547 593 ## gi|266624375|ref|ZP_06117310.1| conserved hypothetical protein 2 1 Op 2 . - CDS 519 - 1112 505 ## gi|266624376|ref|ZP_06117311.1| conserved hypothetical protein - Prom 1170 - 1229 80.4 3 2 Op 1 . - CDS 2077 - 2490 630 ## Sgly_1079 hypothetical protein 4 2 Op 2 . - CDS 2480 - 2995 509 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 3089 - 3148 3.8 - Term 3145 - 3202 13.4 5 3 Op 1 . - CDS 3265 - 3534 382 ## EUBELI_00576 hypothetical protein 6 3 Op 2 . - CDS 3552 - 3740 150 ## gi|288871363|ref|ZP_06410135.1| conserved hypothetical protein 7 3 Op 3 2/0.000 - CDS 3816 - 4706 858 ## COG0370 Fe2+ transport system protein B - Prom 4755 - 4814 6.1 8 3 Op 4 2/0.000 - CDS 5717 - 6565 868 ## COG0370 Fe2+ transport system protein B - Prom 6610 - 6669 8.5 9 3 Op 5 22/0.000 - CDS 7571 - 7753 303 ## COG0370 Fe2+ transport system protein B 10 3 Op 6 . - CDS 7830 - 8051 324 ## COG1918 Fe2+ transport system protein A 11 3 Op 7 . 
- CDS 7978 - 8277 310 ## Trebr_1215 FeoA family protein - Prom 8377 - 8436 80.4 12 4 Op 1 . - CDS 9488 - 11284 1274 ## COG3250 Beta-galactosidase/beta-glucuronidase 13 4 Op 2 . - CDS 11334 - 12710 958 ## gi|266624386|ref|ZP_06117321.1| conserved hypothetical protein - Prom 12767 - 12826 13.8 14 5 Op 1 . - CDS 13728 - 13925 94 ## gi|266624387|ref|ZP_06117322.1| conserved hypothetical protein 15 5 Op 2 . - CDS 13934 - 17446 2069 ## Trebr_0184 coagulation factor 5/8 type domain protein - Prom 17503 - 17562 2.4 - Term 17506 - 17552 7.1 16 6 Tu 1 . - CDS 17569 - 17685 192 ## Predicted protein(s) >gi|229783963|gb|GG667772.1| GENE 1 1 - 547 593 182 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624375|ref|ZP_06117310.1| ## NR: gi|266624375|ref|ZP_06117310.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 182 1 182 182 355 100.0 6e-97 MKKPEGWKIKTGKDWCLAFLFAALMVYVAVVRFFQWKILDFMQKFMPDKMSTMNLPGGYG TVLFWALIMALLVMAVLAHRRQKRKYIAAAGLAGFLIADCALGGFVYHCRLIVQRGTEEP AASAWICFEDGTEQVKLAAGDETLERLQALAFSLERQPAERQEELRELVRDGGKERSFVW FS >gi|229783963|gb|GG667772.1| GENE 2 519 - 1112 505 197 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624376|ref|ZP_06117311.1| ## NR: gi|266624376|ref|ZP_06117311.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 177 14 190 210 351 99.0 2e-95 MAERITPVLIGGAVYGSLAYFTGDSRIFTLRRYLKFAAIWAAVVLLWTGALLKADTLLAA KDDRMEDVTGFFLATDSQGNGFYEEKEPGANDTVFRLLRDALYQEGNPLKRIGAEVREPE DELQLQIFLGKYRRDSMICRLNPKEHYLVTERGTVYRVPEEFSRYMTQYQEDCFESIYEE YYGEEAGENEEAGGMEN >gi|229783963|gb|GG667772.1| GENE 3 2077 - 2490 630 137 aa, chain - ## HITS:1 COG:no KEGG:Sgly_1079 NR:ns ## KEGG: Sgly_1079 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 1 131 1 141 322 93 38.0 3e-18 MACDYRERVLKYLRENTEDPGVERHLEECMECRALVEGYLEKEKELDIPEAGYEGTDEEL 
KERVVHFEKGTRRILVFTLVGFVLGWFSIAYFTDSFLVTKVILAIPYKASEMLHNLFHSH PYSYYGGNGMFTEFNEF >gi|229783963|gb|GG667772.1| GENE 4 2480 - 2995 509 171 aa, chain - ## HITS:1 COG:BS_ylaC KEGG:ns NR:ns ## COG: BS_ylaC COG1595 # Protein_GI_number: 16078537 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 6 164 8 172 173 84 33.0 1e-16 MGVRFEEVYQRQKHAVYGYLSYMTKDEDTAEDLAQETFLKVYLGLKGFKGDCSEKTWCLT IARNTFLSFARKKQPVLLEEGDMERLEAGYERGPEEKLLEEEKAKLIREVLSSLNADDRT IILLRDYEKMTYAEIARVTGLSETVVKVRLHRARLRYRSRYEQRGGDGCGM >gi|229783963|gb|GG667772.1| GENE 5 3265 - 3534 382 89 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00576 NR:ns ## KEGG: EUBELI_00576 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 8 87 6 85 142 97 60.0 2e-19 MNQTDALEEKRSYVIRELKRNGCRITNQRQILIDVILQDECCCCKEMYYQALEKDPTIGM ATVYRMVKTLEEIGLIRRKNLYRIEGDSA >gi|229783963|gb|GG667772.1| GENE 6 3552 - 3740 150 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871363|ref|ZP_06410135.1| ## NR: gi|288871363|ref|ZP_06410135.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 62 1 62 62 94 100.0 2e-18 MVTWLAANAATIIISMIILALTGLAVRSVYKNRGTCSCQSKSGCGGQCSCCHGCTPVHKP HS >gi|229783963|gb|GG667772.1| GENE 7 3816 - 4706 858 296 aa, chain - ## HITS:1 COG:L190009 KEGG:ns NR:ns ## COG: L190009 COG0370 # Protein_GI_number: 15672169 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Lactococcus lactis # 1 284 429 703 709 249 47.0 5e-66 MTIMTSTFVPCGAKLPVIALVAGAIMGGAWWMAPSMYFIGIASVIVSCIILKKTKMFAGD PSPFVIELPQYHVPSVKTVLLHVWERVWSFLKKAGTILFLCCMVMWFLATFGVENGAFGM VEDMEASFLAIIGTFLAPIFAPLGFGSWQAVASSLSGFVAKEGIVSTMGVLANMAEATET DAALWQYVMTDFFHGSALAGFTFLTFNLLDSPCLAAISTMAKEMGNKGWTAFAVIFQNVF AYAVCLMIYQLGSFAMGGGFGIGTAVAVIVLAAGLYLLFRPAPGAGKKAETAAAGI >gi|229783963|gb|GG667772.1| GENE 8 5717 - 6565 868 282 
aa, chain - ## HITS:1 COG:L190009 KEGG:ns NR:ns ## COG: L190009 COG0370 # Protein_GI_number: 15672169 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Lactococcus lactis # 1 279 86 371 709 211 40.0 2e-54 MDATNLERNLYLTTQLLELGIPVVIALNMMDLAVKSGDKIDEKKLAAAFGCEVVATSALK GEGLTEVVDRAVRLANAKTIQVPKHEFSREVEGCLADIADLLPNSVPDHLARWYSIKLFE QDEKVLGTTPLSADSSKRIVEIVKDAEKQMDDDAESIITNERYEYISKIITGCLKKNRKG MTVSDKIDRIVTNRILALPIFVAVMFVVYYVSVTSLGTIVTDFTNDTLFGAWIQPAVQGL METAGASEWLVSLVVDGIIGGLAAPIGFAPQMAILFLFLLAS >gi|229783963|gb|GG667772.1| GENE 9 7571 - 7753 303 60 aa, chain - ## HITS:1 COG:CAC1031 KEGG:ns NR:ns ## COG: CAC1031 COG0370 # Protein_GI_number: 15894318 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Clostridium acetobutylicum # 5 60 8 64 683 89 73.0 2e-18 MAIRIALAGNPNCGKTTMFNNLTGSNQYVGNWPGVTVEKKEGKLKGKKDIIITDLPGIAS >gi|229783963|gb|GG667772.1| GENE 10 7830 - 8051 324 73 aa, chain - ## HITS:1 COG:L192240 KEGG:ns NR:ns ## COG: L192240 COG1918 # Protein_GI_number: 15672170 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Lactococcus lactis # 4 72 80 148 152 73 52.0 1e-13 MQTLKEIPCGTTVTVKKLNGEGAVKRRIMDMGITKGCSVFVRKVAPLGDPVEVMVRGYEL SLRKQDAELIVVE >gi|229783963|gb|GG667772.1| GENE 11 7978 - 8277 310 99 aa, chain - ## HITS:1 COG:no KEGG:Trebr_1215 NR:ns ## KEGG: Trebr_1215 # Name: not_defined # Def: FeoA family protein # Organism: T.brennaborense # Pathway: not_defined # 1 66 1 66 69 82 62.0 5e-15 MPLTMARTGEAQIIRKIGGNEETKRFLENLGFVAGAAVTVVSAISGNMIVNVKDSRIAVN QDMAKKNYGLVGGKEDADLEGNSMRHDRNREEVKRRGRS >gi|229783963|gb|GG667772.1| GENE 12 9488 - 11284 1274 598 aa, chain - ## HITS:1 COG:uidA KEGG:ns NR:ns ## COG: uidA COG3250 # Protein_GI_number: 16129575 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 1 587 1 588 603 
592 48.0 1e-168 MLYPYESRSREVKDISGVWRFKLDRDGVGMKERWYEKKLEGSIPMPVPASYNDITQEADV RDYVGEVWYEKTVTIPQSWKGRRIVLRFGSVTHFGRVWLNGDFLMVHKGGYLPFEAEINP EPGTEARITVMVSNVLDYQTLPPGEVEILDDPYKYERPIKVQRYFHDFFNYAGIHRPVKL YTTPQCYIRDITVNTDIDGETGVIRYQARGNEAGRLFVEVFDSNGQSVGRAEGEAGEIHI GQAELWEPLKPYLYQLNVYLEPEDGREADIYSMKVGIRTVKVAGKQFLINGKPFYFKGFG KHEDMDIKGKGLDLAMCVKDFNLMKWINANSLRTSHYPYAEEILDLADEYGFVVIDESPA VGMTFFNEDKKVFVEERVNRDTLKHHIEVMEELVARDKNHPCVVMWSVANEPSSWESGAV PYFESVIGAVRRADPGRPITMVSYAFPYGPRIDRVASMLDVVCINKYYSWYEDSGMLDVI DYQVKKELDLWHKVFEKPLIMTEYGADSIAGFHSCPPVMFTEEYQIEMVKRFNQAFDRLD FVIGEHLWAFSDFATKQGTTRVIGNRKGVFTRERNPKSIAFLLRDRWKKDFTTKTLDL >gi|229783963|gb|GG667772.1| GENE 13 11334 - 12710 958 458 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624386|ref|ZP_06117321.1| ## NR: gi|266624386|ref|ZP_06117321.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 458 33 490 490 949 100.0 0 MLKMAEKYDFEVGINWCDAWHFYPWIEEFHKDAVTRERKVELFKENVQYLLDTLYSTSYG ATVDGHPIILIFGGGPTLEEFSDIRNCSYRLPEGVKQPCFIVRAPINGTVGEDGKVNYYL EENGWPEVMDGMFGWIPTRVREGLADDGFKSWDRYAVLQDSRDYLDALYKAFGEKGCRVR IGSVNPEMDNRACASWNKHDLSHIPRERGKTYEGMWESLVDNSSQVDVAYIVSWNDYTES HQVEPTLQEGYRELLTTRKYGARWKGLAEESSAGWETEVLSLPLHLFQVRKQVEFYRKTG FQTEPYDKALDDIAGCLSNGRYGEAKLNLLGVEELISSLGDMLRSWEVELTQENGGLSLN ADQLTICCGKTLEEMERYYFRACLSFEYLDDDEESFQILHGRQELCDGKKDGSGLYKQVR IRLFKESLHGIVQERELTILGNMEIRNVKLLIKLYSPV >gi|229783963|gb|GG667772.1| GENE 14 13728 - 13925 94 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624387|ref|ZP_06117322.1| ## NR: gi|266624387|ref|ZP_06117322.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 125 100.0 9e-28 MEKKIYADYHGPRCTRECDGEVAQWKYYGTAKKSRAAQTYANHNADFIGPDGLHDLAARC YPLVS >gi|229783963|gb|GG667772.1| GENE 15 13934 - 17446 2069 1170 aa, chain - ## HITS:1 COG:no KEGG:Trebr_0184 NR:ns ## KEGG: Trebr_0184 # Name: not_defined # Def: coagulation factor 5/8 
type domain protein # Organism: T.brennaborense # Pathway: not_defined # 261 629 95 440 616 81 23.0 2e-13 MMRKWKRSMAVSMAAVLLGLAVLPGTANGFCVEKVQKANDSNAESTQTATGSNAENVQTA TGSNAENIQKATGSNAMARVIAVNPCSNPGNRMLAADGDTDTFWTSKDSHKQENRHYIEF ILDREYTIERIEMHLGRLTNGKWGFIPENLRVEARTDGEYDVVLDQNTVTEMTFLTLPEP TKASSLRLSFLDSRSKGFSVREVAFYLPGQPGGQTQYSQKKVYADYHGPRYTREHDGNVG NWGFVGEAKNSGAANKKVINNPDVIGEDGRRKLAAAAYPLVGMQSQMDPDYHEYQILLAK MANIDGFFVEWGFPGHGSDRQLEIMMETADKYGFELGINWCDAWHMKDWITLVKPEVVTR EEKVAEAVNSLSEILEKLYASNCGALFEGHPLVYLFGGGFDKAEVRKIVEEAVIPGQFLT DVPWYFRRASVSGSVNSSDQVNYSYAGTDWHTVVDGPYGWVPERVRSAQAAGFKDFDVYG TREDAVSYLQMLKRTFINNPDIPLRNSVVSPGMDNRPCAGWGDKFKYMDRDEGQLYRDMW EFNVLNRDYLDVVYIASWNDYTEGHQIEPTVEDGYRELFTTQDFAYQFKGKGSQDKTGFS LPLRLFELRKEAEKLEAMGGDAAALNEQLDQAGRDISEGRYEEAGSLLNEVQRVISAEIA TDSNATILCNEANGKLVLTEQIRENVAYRKTVTSNCSKEGLASVVDGVSDTAWSFEGAGS YIDIDIGSNANLVAGVFMGDMPFRLYYWEEEVLKQAEVVRDAGNGFYLPKSVTASRVRIQ LEASQGNVYEIKLYANSVPDVEVVSPQPNGQVDFTAPWQVEVEASDRDGSIVRVEAWLDG ILMTPLSVNPTDSRYSLLLPPAEPTLHQLTVKVVDNEGGYTVAGPFSLYPILENIALNKP AIANVSMQGYGPELAVDGIISLDSRWRTPASYTEHWLEIDLEGTYEICRADLFMGDLSGY AVRDFKLEYWDGSGWRQIPGTSFTGNTVKDMVLRFDPIITNKVRFYSSEAAGRGVRVKEI MIYAPAKATGTENCASINFHAKYLENADSGGGAESLSAAKGCYLTLSDEVAQQLSGKYYE GILSFEYLDDGFGSFSVLVSADGQAEFGNYEEAAVINKTGTGQWMVAKINLKGPHIRMNH TGENDSDVALKGPTRIRGISLEFVVAEPKL >gi|229783963|gb|GG667772.1| GENE 16 17569 - 17685 192 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSEYENFVLYLNDNGGKRLEEIYAGAYARYKALIEEIK Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:47:27 2011 Seq name: gi|229783962|gb|GG667773.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld166, whole genome shotgun sequence Length of sequence - 16413 bp Number of predicted genes - 18, with homology - 15 Number of transcription units - 8, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 975 207 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 + Term 1158 - 1206 8.2 + Prom 1035 - 1094 4.2 2 2 Op 1 . + CDS 1242 - 1982 797 ## Closa_2181 Inosine monophosphate cyclohydrolase-like protein 3 2 Op 2 . + CDS 2009 - 3184 1310 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) + Prom 3197 - 3256 4.0 4 2 Op 3 . + CDS 3279 - 3929 608 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit + Prom 3931 - 3990 8.1 5 3 Op 1 7/0.000 + CDS 4103 - 4930 918 ## COG3711 Transcriptional antiterminator 6 3 Op 2 . + CDS 4956 - 5447 676 ## COG2190 Phosphotransferase system IIA components 7 3 Op 3 . + CDS 5461 - 5562 105 ## + Prom 6405 - 6464 80.4 8 4 Op 1 . + CDS 6528 - 6659 188 ## 9 4 Op 2 . + CDS 6732 - 8171 1533 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 10 4 Op 3 . + CDS 8178 - 8288 69 ## 11 4 Op 4 . + CDS 8373 - 8720 427 ## COG0681 Signal peptidase I 12 5 Tu 1 . + CDS 9655 - 9840 141 ## Closa_3890 signal peptidase I + Term 9857 - 9892 7.1 + Prom 9859 - 9918 5.2 13 6 Op 1 . + CDS 9971 - 11332 1446 ## COG0534 Na+-driven multidrug efflux pump 14 6 Op 2 . + CDS 11365 - 11997 553 ## Closa_2167 hypothetical protein 15 6 Op 3 . + CDS 12056 - 13222 1156 ## Closa_2166 FliB family protein 16 6 Op 4 . + CDS 13265 - 13909 666 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Prom 13954 - 14013 6.0 17 7 Tu 1 . + CDS 14051 - 14611 709 ## Closa_4234 cell wall binding repeat-containing protein - Term 15708 - 15759 9.2 18 8 Tu 1 . 
- CDS 15795 - 16373 452 ## COG1346 Putative effector of murein hydrolase Predicted protein(s) >gi|229783962|gb|GG667773.1| GENE 1 1 - 975 207 324 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 146 324 248 413 413 84 29 5e-16 NIPLKEGSLPAADSQELEMVVGNQIPMRFHNPKSRGNVYMFSNDPADAPDVDFMNRPMFV IFDTNAYYQSQNGSSADDENPVRAPKKYMLKTAAMVGGEDYNNYSFSVYVDIDILKAQLK QVFKKNPIPDQPTNKKGKPYNYFIYDQAVVEVDDMKNVSEVQKIINDMGFQASSQMDWLE QSQKQSNMVQGVLGGIGTVSLFVAAIGIANTMMMSIYERTKEIGIIKVLGCDMRNIKNMF LLESGFIGFMGGVIGILLSYGISFIANKFLSGQFLVGIEGDLSRIPPWLSLAAIGFAIFV GMAAGFFPALRAMKLSPLAAIRNE >gi|229783962|gb|GG667773.1| GENE 2 1242 - 1982 797 246 aa, chain + ## HITS:1 COG:no KEGG:Closa_2181 NR:ns ## KEGG: Closa_2181 # Name: not_defined # Def: Inosine monophosphate cyclohydrolase-like protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 246 1 237 237 402 78.0 1e-111 MNMISLKDDLRGNSYPGRGIIIGKTPDGKKAVTAYFIMGRSENSRNRVFVEDGEGIRTQA FDPSKLTDPSLIIYAPVRVLGNKTIVTNGDQTDTIYEGMDRQMTFEQSLRSREFEPDSPN YTPRISGIMHVRGNSDGDASEAGGYNYAMSILKSSNGNPESCCRYTFAYENPAAGEGHFI HTYMHDGNPLPSFEGEPKLVETMDDIDAFASLLWENLNEDNKVSLFVRYITIATGEYETR IVNKNQ >gi|229783962|gb|GG667773.1| GENE 3 2009 - 3184 1310 391 aa, chain + ## HITS:1 COG:CAC2445 KEGG:ns NR:ns ## COG: CAC2445 COG0138 # Protein_GI_number: 15895710 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 3 391 5 391 391 529 64.0 1e-150 MKELELKYGCNPNQKPSRIFVKNGELPIEVLSGRPGYINFMDAFNGWQLVKELKEATGLP AAASFKHVSPAGAAVGLPLDETLRKIYWVDDMGELSPLASAYARARGADRMSSFGDFISL SDVCDVDTARIIKREVSDGVIAPGYEPEALEILKSKKNGSYNVIAIDPEYRPEPIERKQV YGIVFEQGRNELKIDEELLKEVVTANKEIPDSAKIDLIISLITLKYTQSNSVCFAKGGQA IGIGAGQQSRVHCTRLAGNKADNWFLRQCPKVLNLPFADKIRRADRDNAIDVYIGEDYMD VLADGRWENIFREKPEVFTKEEKRAWLDQMTDVALGSDAFFPFGDNIDRAYKSGVKYVAQ PGGSVRDDQVIETCDKYGMAMAFTGIRLFHH >gi|229783962|gb|GG667773.1| GENE 4 
3279 - 3929 608 216 aa, chain + ## HITS:1 COG:CAC1689 KEGG:ns NR:ns ## COG: CAC1689 COG1191 # Protein_GI_number: 15894966 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Clostridium acetobutylicum # 3 210 26 233 234 204 50.0 8e-53 MKTFPKPLTAREERECLDRYQEGDQEARATLIERNMRLVAHVAKKYQNTDYDMEDLLSVG TIGLIKAVNTFHTDRGSRLATYAAKCVENEILMLLRANKKYSKEVSLFEPIGVDKDGETV SLVDVIEMENKEALETIILSQDIRELYDAFDHCLRDTEKKVIGMRYGLYGGKEHTQREIA DMLGISRSYVSRIEKKSIEKLRMEFEKNNKKSGIIK >gi|229783962|gb|GG667773.1| GENE 5 4103 - 4930 918 275 aa, chain + ## HITS:1 COG:BS_licT KEGG:ns NR:ns ## COG: BS_licT COG3711 # Protein_GI_number: 16080959 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 273 1 273 277 251 46.0 1e-66 MKIDKIINNNIVSAIEADGKEVVIMGRGLGFGMKPGKEIPEGKIEKVFRLDSQNSTDKFK ELLANLPLEHIQASTEIINYAKSVLNRRLNQSIYITLTDHINFAIERFKERMVFTNPLLN EIKTFYKEEYLVGEYAVALIERRIGIKLPVDEAGFIALHIVNAEYNTVMRDTIDITNLIQ NVVKIVKEYFSMDLDETSLNYQRFVTHLRFLAQRIIGGEMLNSENPEFNQLISQMYPEEY ACSLKLKDYIQVTYHHDVTEEETAYLAVHIKRIRM >gi|229783962|gb|GG667773.1| GENE 6 4956 - 5447 676 163 aa, chain + ## HITS:1 COG:BH0296_3 KEGG:ns NR:ns ## COG: BH0296_3 COG2190 # Protein_GI_number: 15612859 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Bacillus halodurans # 17 163 13 159 161 151 50.0 6e-37 MFNSLKKMFSGSEKEREKILAPVEGKVVPLSEVSDPTFSQEILGKGVAVIPAKGRIVAPA SGVLTVMFETKHAVSVTTDGGAEIIVHMGLDTVNLKGEHYTSYKKQGEHVKAGELLLEFD IDAIKEAGYDVITPIIVCNTPNYPDMVCHTGMEVKELDPIIEL >gi|229783962|gb|GG667773.1| GENE 7 5461 - 5562 105 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MREFLYEVKDPMGFHARPASRLFMKAREFDCPS >gi|229783962|gb|GG667773.1| GENE 8 6528 - 6659 188 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDGKNLLALMGLEIREGELLRFRFEGTDEAEAEKALLEVLDQI >gi|229783962|gb|GG667773.1| GENE 9 6732 - 8171 1533 479 aa, chain + ## HITS:1 COG:PM0876_1 KEGG:ns NR:ns ## COG: PM0876_1 COG1263 # 
Protein_GI_number: 15602741 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Pasteurella multocida # 1 387 3 396 408 446 63.0 1e-125 MMKYLQKLGKSLMLPVAVLPAAGILMGIGYWIDPTGWGANSAIAAFLIKAGGAIIDNMAI LFAIGVAVGMADDHEGTAGLAGLVSWLMIQTILSTGVVAMLKGIDAGEVNAAFGKINNQF IGILSGIIGSTCYNRFKNTKLPDALAFFSGKRSVAIVTALASLVACLVLFFVWPVVYSAL VAFGEAILGLGAVGAGIYGFFNRLLIPVGLHHALNSVFWFDVAGVNDLGNFWAGTGTLGV TGQYMTGFFPVMMFGLPAAALAMYHTAKPGKKKVAYGLLLAAAVCSFFTGVTEPLEFAFM FLAPGLYLIHALLTGISLAICALLPVRAGFNFSAGFVDWFLSFKAPMAVNPLFIIPIGLI YGVIYYVVFRFAIVKFNFKTPGREDDDDIDENQVTLANNDYTQVASIILEGLGGKGNIKS VDNCITRLRLEINDYTLVNEKKIKSAGVAGVIRPGKTSVQVIVGTQVQFVADEFKKLCQ >gi|229783962|gb|GG667773.1| GENE 10 8178 - 8288 69 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no METAGRVTVPAVLLYPERDQSIGFIGSAMLERENKC >gi|229783962|gb|GG667773.1| GENE 11 8373 - 8720 427 115 aa, chain + ## HITS:1 COG:RSc1716 KEGG:ns NR:ns ## COG: RSc1716 COG0681 # Protein_GI_number: 17546435 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Ralstonia solanacearum # 46 112 35 108 230 67 47.0 5e-12 MREKKETLQEQEEPFDLKKEIFEWVKIIIAAAAIAFCLNTFIIANSTVPSGSMETTIMTG DRIIGSRLAYKFGGDPERGDIVIFDHKTGPGDKETRLVKRIIGLPGETVEIRDTS >gi|229783962|gb|GG667773.1| GENE 12 9655 - 9840 141 61 aa, chain + ## HITS:1 COG:no KEGG:Closa_3890 NR:ns ## KEGG: Closa_3890 # Name: not_defined # Def: signal peptidase I # Organism: C.saccharolyticum # Pathway: Protein export [PATH:csh03060] # 3 61 123 182 182 80 66.0 2e-14 MREAMDSEDYHFEVPDGCYLMLGDNRNYSADARYWPDPYVPEKKILAKVLFRYYPRIGKI E >gi|229783962|gb|GG667773.1| GENE 13 9971 - 11332 1446 453 aa, chain + ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 6 409 7 411 452 169 27.0 1e-41 
MKQENNLTEGNVFRVLLEFSIPFVIANVIQALYGAVDLLVIGKYCTPESVAAVSTGTQVT QIITSMITGLTLGGTIMVGKYTGMNAKEEVKKTIGTTLSIFGAAALVLTVLMIIFSPAVL RLLKTPEESFELAKQYVILCSCGIIFICGYNAISAILRGYGDSKRPMMFIALACVLNIIG DVVLTGILGLGVAGVAIATIGSQAVSMICAIWYLNRNRFIFTFRFSNLKIDRDKALELAK VGIPISLQECMVRLSFLYLTSVTNRLGVYAAAAVGVASKYDVFAMLPATSIANALAAITA QNYGAGKPERARRSLAAGLAFAVAASSVFWLWAQVSPQTMIGIFSDNQDIIAAGIPFFTS CSYDYLAVSFVFCLNGYLNGRAKTIFTMISCCFGALALRMPLIWLVYTRWPDNLFRIGTI APMVSGFMACYTLIYVICGFWKERNERRVSKSA >gi|229783962|gb|GG667773.1| GENE 14 11365 - 11997 553 210 aa, chain + ## HITS:1 COG:no KEGG:Closa_2167 NR:ns ## KEGG: Closa_2167 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 207 1 205 206 223 60.0 5e-57 MTMFHILVLVLALCMDTFVASMAYGANKVHITWEKIVAMNAVCSGCLGIALAAGSILDDL VPETMTRTVCFISLFLLGIVKLLDYFIKKYINRHVSLHKGITFSFSGLRVIVNIYGDPMA ADWDHSQSLSWKEVLFLSFAMSIDSLIAGALSAFLEIPVGLTMGTAFIMGIVVMAAGLFL GRKLAARCKCDLSWISGVLFMILAVTKSLS >gi|229783962|gb|GG667773.1| GENE 15 12056 - 13222 1156 388 aa, chain + ## HITS:1 COG:no KEGG:Closa_2166 NR:ns ## KEGG: Closa_2166 # Name: not_defined # Def: FliB family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 386 1 386 386 451 56.0 1e-125 MKITVPHYYSRFQCVAGQCPDTCCAGWQIMIDETSLKKYRKRRDSFGSRLKNSIDWENSA FRQYDRRCAFLDENNLCDIYSEAGPSMLCRTCRAYPRHIEEFEGVREISLSLSCPEAAKL ILDCKEPVRFLSKETEKEETYPFFDFFLYTKLMDCRDVIISTLQNRELDWNLRAGIVLGL AHDLQRRIVNDRLYETDSLLERYGRKGAVEKLASRLLAYQIDGDSRFEGMKELFGFLDGL EVLNAGWPDYIGRVKEVLYGAGENAYEEQRQEFYEYLESEEERLETWERWSEQLMVYFVF TWFCGAEYDERPYGKMKLAVVSTLLIRETAQALWKEQGGVLEKDAFFEAARRYSREVEHS DLNLERLETLFREKKVCSLDRLLTLVMG >gi|229783962|gb|GG667773.1| GENE 16 13265 - 13909 666 214 aa, chain + ## HITS:1 COG:ECs0395 KEGG:ns NR:ns ## COG: ECs0395 COG0110 # Protein_GI_number: 15829649 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli O157:H7 # 1 201 3 193 203 206 47.0 3e-53 
MSLEGKMKNGRLYCEFGHELEENRAYERKIEYQRVHCKELVYDYNQTRPSNVKEKRRILD ELLAEAGEEIWIESPAFFSYGCNTHIGHHFYSNFNLCVVDDCEVFIGNYVMFGPNVTLTV TGHPVWGEYRRKGAQFSLPIHIGDDVWIGANAIILPGVTIGNDVVIGAGSVVTHDIPSHS VAFGTPCKVVREITEEDRLYYRKGQPVNEDWDRD >gi|229783962|gb|GG667773.1| GENE 17 14051 - 14611 709 186 aa, chain + ## HITS:1 COG:no KEGG:Closa_4234 NR:ns ## KEGG: Closa_4234 # Name: not_defined # Def: cell wall binding repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 182 1 181 317 68 32.0 1e-10 MKGKRLGFILTLVLIMALAVPAWAADTKIDSVKLDFSYAGAEPKSGNEIGGIHVKVPDGQ PFTVEYAEYSNEGDVWVVGDRPLVHVELTAKSGYRFGSTSKSRFTLNGCSAVYKKARIYD DGTQMELEVYLKRIGGTLSSVQNLEWGGNVAYWDPLEGSKEYEVRLYRDEKSVVTVKTTD SSYNSS >gi|229783962|gb|GG667773.1| GENE 18 15795 - 16373 452 192 aa, chain - ## HITS:1 COG:MA3262 KEGG:ns NR:ns ## COG: MA3262 COG1346 # Protein_GI_number: 20092078 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Methanosarcina acetivorans str.C2A # 1 188 43 230 238 157 47.0 1e-38 MISIIAVMIFLSVFHIDYESYNLSARYLSYLLTPATVCLAVPLYQQLSLLKKHARAILAG TVTGVLTTMATVLVLALLFGLTHEQYVTFLPKSITTAIGMGVSEEMGGIVTITVACIIIT GIIGNIIAETVCRLFHITEPVAKGIAIGSASHAIGTTKAMEMGEVEGAMSSLAIAVSGLC TVLFASVFANFL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:48:03 2011 Seq name: gi|229783961|gb|GG667774.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld167, whole genome shotgun sequence Length of sequence - 9422 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 29 - 88 1.9 1 1 Tu 1 . + CDS 115 - 1293 1147 ## Closa_1929 glycosyl hydrolase family 88 + Term 1326 - 1379 15.1 - Term 1314 - 1367 2.3 2 2 Op 1 . - CDS 1390 - 2184 796 ## CLH_2455 hypothetical protein 3 2 Op 2 . - CDS 2250 - 3029 627 ## Aaci_0532 hypothetical protein - Prom 3106 - 3165 5.1 4 3 Tu 1 . 
- CDS 3234 - 3815 339 ## COG1280 Putative threonine efflux protein - Prom 3912 - 3971 8.2 + Prom 3868 - 3927 7.5 5 4 Tu 1 . + CDS 3965 - 4516 527 ## COG1396 Predicted transcriptional regulators + Term 4551 - 4589 7.6 - Term 4534 - 4582 9.4 6 5 Tu 1 . - CDS 4656 - 5885 1292 ## COG0137 Argininosuccinate synthase - Prom 5996 - 6055 8.9 + Prom 6002 - 6061 9.3 7 6 Op 1 1/0.333 + CDS 6231 - 7268 1355 ## COG0002 Acetylglutamate semialdehyde dehydrogenase 8 6 Op 2 . + CDS 7317 - 7763 446 ## COG0456 Acetyltransferases 9 6 Op 3 . + CDS 7784 - 9007 1472 ## COG1364 N-acetylglutamate synthase (N-acetylornithine aminotransferase) 10 6 Op 4 . + CDS 9052 - 9421 413 ## Cphy_3229 hypothetical protein Predicted protein(s) >gi|229783961|gb|GG667774.1| GENE 1 115 - 1293 1147 392 aa, chain + ## HITS:1 COG:no KEGG:Closa_1929 NR:ns ## KEGG: Closa_1929 # Name: not_defined # Def: glycosyl hydrolase family 88 # Organism: C.saccharolyticum # Pathway: not_defined # 1 392 1 392 392 602 74.0 1e-170 MAQTMERIRQYPGTDAAMRKKYMDAAAEIIRSNLPEFTEFFESSNSEGGFYLPTENVEWT TGFWTGVIWLAYEYTGEACFKEAAESQVESFLKRIREKIDVNHHDMGFLYSLSCVAAYKL TGSSHAREAALLAADHLAGRYRENGRFIQAWGNVNEPSEYRLIIDCLLNLPLLYWASEVT GNPDYADKAANHIRTAMKCVMRPDCSTYHTYFINTVTGEPDHGVTHQGNRDGSAWARGQA WGVYGIALSYRYLKKPEYLELFCKVTDYFITHLPEDLVPYWDFDFDTGSTEPRDSSAPAI AICGILEMSRYLDEETAEKYMTAADKMLRALLDHCANTDKTRSNGLLLHGTYARDSEYNT CRNRGVDECNTWGDYFYMEALTRRMKDWNPYW >gi|229783961|gb|GG667774.1| GENE 2 1390 - 2184 796 264 aa, chain - ## HITS:1 COG:no KEGG:CLH_2455 NR:ns ## KEGG: CLH_2455 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 1 221 1 221 272 243 60.0 5e-63 MEKRMEDLNLEDDFLFSRVMCDEEICRCVLEKILNISIRKIVLPEQQKTIDLLPDSKGVR LDIYVNDDIGTIYNCEMQRGKRGPLPKRSRYYQGNIDLDAISKGEPYTRLRRSYIIFICT FDPFTEGRHIYTFQNTCLESPGLLLGDEATKLFLNTRGTLDDVDDEMKEFLAYIENTTDT FARQAKSPLVKKIHQRVTAVKQNKEMEAEYMTLMQRDRENIEQGERRMAALTELLLDQKR TDDLKRAIKDEDYREQLYRHFNIG >gi|229783961|gb|GG667774.1| GENE 3 2250 - 3029 627 259 aa, chain - ## HITS:1 COG:no 
KEGG:Aaci_0532 NR:ns ## KEGG: Aaci_0532 # Name: not_defined # Def: hypothetical protein # Organism: A.acidocaldarius # Pathway: not_defined # 3 247 40 285 321 186 40.0 8e-46 MNEAILKTCMQQCNSASENVGIFVDFDNIYYSLKEYGVNPESPEYCVFSLMERIYSINKI RTLRAYADYDQVGVSLKHLQEMRVQIKNVYGNGLEEEYRKNASDIELSVDALEIYYRSPE IDTFVFLTSDSDMIPIMSRLTYKGKHIHLFCIDDHTSHYQDISRFCHFKCDLLTLFEIDP QRKNPEFWTDRALTEIAAWYSVRKNSDMMLGGKWLNRLLCEKLQISSRAASRIITYLKDN NLIRETSNDAGHTGFFPAV >gi|229783961|gb|GG667774.1| GENE 4 3234 - 3815 339 193 aa, chain - ## HITS:1 COG:yfiK KEGG:ns NR:ns ## COG: yfiK COG1280 # Protein_GI_number: 16130503 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Escherichia coli K12 # 4 192 6 194 195 124 39.0 1e-28 MFHLSDFLIYCFITAYTPGANNLLSMSNAVRLGFRKSIRFNFGILAGFSVVMAVCTIFSA ALYSLLPKVKPVMQIAGGVYMLYLAWKVLRSTADLDAGQKDGAGFLSGMILQFFNPKIYI YAITTISLYILPVFHSPGALAGFAVLLSLIGASGSFVWALFGAAFCNFFSRHTKAVNLVM ALLLVYCAAVLFI >gi|229783961|gb|GG667774.1| GENE 5 3965 - 4516 527 183 aa, chain + ## HITS:1 COG:BH2909 KEGG:ns NR:ns ## COG: BH2909 COG1396 # Protein_GI_number: 15615472 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 8 174 10 175 189 120 34.0 2e-27 MDPMNTIVAKNIKRLREEKKLSMEELSKLSSVSKSMLAQIERGEGNPTLSTLWKISNGMK VPFDALTVRPERIYEIVKTSEVQPLLEDSGKVKNYSIFPDDENRRFAVYYLELEEGSYWE SEPHLKGTTEFITVFEGGIVIRVEGQTFPVEKGESIRFRADNPHSYRNTGAGKAVLHMIL YNP >gi|229783961|gb|GG667774.1| GENE 6 4656 - 5885 1292 409 aa, chain - ## HITS:1 COG:CAC0973 KEGG:ns NR:ns ## COG: CAC0973 COG0137 # Protein_GI_number: 15894260 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Clostridium acetobutylicum # 1 406 1 400 400 451 53.0 1e-126 MKEKVILAYSGGLDTTAIIPWLKENFDYEVVCCCIDCGQGNELDGLEERAKLSGASKLYI EDLVDDFCDNYIVPCVQANSVYENKYLLGTSMARPAIAKRLVEIARKEGATAICHGATGK GNDQIRFELGIKALAPDLKIIAPWRMTDVWTMQSREDEIEYCKQHGINLPFSADNSYSRD 
RNLWHISHEGLELEDPANEPNYDHLLVLGVSPEKAPDEGEYVTMTFEKGVPVSVNGEAMK VSDIIRKLNELGGRHGIGIVDIVENRVVGMKSRGVYETPGGTILMEAHQQLEELVLDRAT MEVKKDMGNKFAQIVYEGKWFTPLREAIQAFVESTQQYVTGEVKFKLYKGNIIKAGTTSP YSLYSESLASFTTGDLYDHHDADGFITLFGLPLKVRAMKMQELEKKNGK >gi|229783961|gb|GG667774.1| GENE 7 6231 - 7268 1355 345 aa, chain + ## HITS:1 COG:Cj0224 KEGG:ns NR:ns ## COG: Cj0224 COG0002 # Protein_GI_number: 15791596 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Campylobacter jejuni # 1 340 3 338 342 392 54.0 1e-109 MKAGIMGATGYAGGELARLLLQRDDVEITWYGSKSYVGQKYASVFKNMVQFADDACMDDN MEALAEQVDVIFTATPQGFCASLMNEAVLSKAKVIDLSADYRIKDVAVYEKWYKIEHKTP EYIKEAVYGLPEINRDKIRDARLIANPGCYPTCSFLSVYPLVKEGLIDPNTVIIDAKSGT SGAGRGAKIDNLYCEVNESIKAYGVGVHRHTPEIEEQLSLAAGKPVTISFTPHLVPMNRG ILVTAYGTLLKSVTYEDVKAIYDKYYENEYFVRVLDREEVPQTKWVEGSNFVDVNFKIDP RTNRIVMMGAMDNLVKGAAGQAIQNMNIMFGLPENQGLKLVPMFP >gi|229783961|gb|GG667774.1| GENE 8 7317 - 7763 446 148 aa, chain + ## HITS:1 COG:Cj0225 KEGG:ns NR:ns ## COG: Cj0225 COG0456 # Protein_GI_number: 15791597 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Campylobacter jejuni # 1 146 1 146 148 181 56.0 5e-46 MTIRTMTIDDYDQVWHLWSGIKGFGIRTLDDSREGVEKFLKRNPSTSVVAEEAGTVIGAI LCGHDGRRGCMYHVCVAEKYRKHGIGHQMALAAMDALKAEGINKVSLIAFKSNTIGNKFW RKVGWTFRDDLNYYDFTLNEENITRFIE >gi|229783961|gb|GG667774.1| GENE 9 7784 - 9007 1472 407 aa, chain + ## HITS:1 COG:TM1783 KEGG:ns NR:ns ## COG: TM1783 COG1364 # Protein_GI_number: 15644527 # Func_class: E Amino acid transport and metabolism # Function: N-acetylglutamate synthase (N-acetylornithine aminotransferase) # Organism: Thermotoga maritima # 12 407 5 397 397 364 49.0 1e-100 MKKTDGGVTAAKGFLAASTAAGIKYKDRMDMAMIYSEKPCKAAGTFTTNIVKAAPVKWDQ EVVSASPYAQAVVVNAGIANACTGAEGYGYCLDTAKAAAEALQIPESAVLVASTGVIGMQ LPMDRITAGIKAMAPGLSGTLEGGTAASKAIMTTDTVNKEVAATFTAGGKTVTIGGMCKG SGMIHPNMCTMLSFVTTDLAISKELLTEALREDIKDTYNMISVDGDTSTNDTVLLLANGM 
AGNPEITEKNEDYREFVKALNFINETLAKKMAGDGEGCTALFEVKIVGAETKAQAVTLAK SVITSSLTKAAIFGHDANWGRILCAMGYSGAVFDPENVDLYFESAAGKLQIIENGVALPY SEEEATKILSEPEVTATADIKMGSCSATAWGCDLTFDYVKINADYRS >gi|229783961|gb|GG667774.1| GENE 10 9052 - 9421 413 123 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3229 NR:ns ## KEGG: Cphy_3229 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 4 117 3 115 130 76 38.0 3e-13 MAVKDDYIMRMIEDMARVLARLILGKDDINYVLPEDEKFTATHSLYKKLITMADEGQINE AENILLDELVDKSSGYFEMAASFYLHLNEYDDEFLDSHQFSREEIQEGIEHLGKEFGVDL PFH Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:48:24 2011 Seq name: gi|229783960|gb|GG667775.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld168, whole genome shotgun sequence Length of sequence - 13337 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 2801 2962 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain + Prom 2815 - 2874 2.1 2 2 Op 1 29/0.000 + CDS 2902 - 4188 1661 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) + Term 4202 - 4251 7.1 3 2 Op 2 24/0.000 + CDS 4270 - 4857 729 ## COG0740 Protease subunit of ATP-dependent Clp proteases 4 2 Op 3 18/0.000 + CDS 4904 - 6205 266 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 + Term 6222 - 6275 13.2 + Prom 7162 - 7221 80.4 5 3 Tu 1 . 
+ CDS 7369 - 8868 1763 ## COG0466 ATP-dependent Lon protease, bacterial type + Prom 9747 - 9806 80.4 6 4 Op 1 31/0.000 + CDS 9857 - 10585 845 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 10613 - 10661 3.2 7 4 Op 2 34/0.000 + CDS 10693 - 11493 940 ## COG0765 ABC-type amino acid transport system, permease component 8 4 Op 3 34/0.000 + CDS 11496 - 12239 267 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 + Term 12302 - 12348 13.2 + Prom 12295 - 12354 5.1 9 5 Op 1 34/0.000 + CDS 12390 - 13145 762 ## COG0765 ABC-type amino acid transport system, permease component 10 5 Op 2 . + CDS 13142 - 13337 80 ## COG1126 ABC-type polar amino acid transport system, ATPase component Predicted protein(s) >gi|229783960|gb|GG667775.1| GENE 1 3 - 2801 2962 932 aa, chain + ## HITS:1 COG:AGl1027 KEGG:ns NR:ns ## COG: AGl1027 COG5001 # Protein_GI_number: 15890630 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 499 908 222 631 648 125 28.0 3e-28 VLDEQHKDASCISNDYDLNRKIADGIINRAVDLMRRSYGTGIFVVLNGPAAQNSTENTKA GFYIRDSNPGRSYNQDNSSLLLERGLPSLANKYGIPLDSFWDLGFDVTADDGSTDYYKKP FDSAVENQAKAGERLNFAYLSKPFRLSPKDIPVITYSAPIILEDGSVVGVFGLEMGIGRI EQVLSADRTESSFEVCYALGVRQAGENIITPVASSGYLYNQYFKGDEYLEYTDNEEEGIG ELRSSDRAGWYASVKQLDVYNYNTPFEQEEWVLVGMARKNDLLSFYNSIRGMLLSSILIP IILSVFGVFVIGKIVTDPIRGLVNEVRKRSGEHGLTLTRVHIGEIDELTETIEQLSADVE RSASKISSILEHANVLIGVFEYEETNESVFCSRSIFEMMGWAEPEEPYCYMSSGEFQKRM EEAIRNTKKDGVNYLFHINVSPASDDTGEAGVACGPAAFSRGRWVRLILDKQEPGTILGV LSDVTSDVEEKEKLERERNYDLLTNLYNRRAFREQLVVAMADGGIQTAAVAMWDLDNLKY INDTYGHDEGDRYIVLFASCLKRLEKDGAIVSRYSGDEFVTLIYGSGGKEEIRHRVTGFM QFLQTVSLEMKGGYQIPIRVSGGMAWYPDDAVDFDTLLNYADFAMYMVKHSVKGIIMEFD SNDYSNNSYMLAGREELNRMLETREVDFAFQPITRRDGSVFGYELLMRPNFTNLKGIKEV LNLARAQAKLTQMETLTWYAGLKAVEKQDQAGALGQTERLFINSIASACMSKEEEADLWN 
RYGRYLSRVVTELTEGEPVNQEYMRRKIAVTKEWNGLIAIDDFGAGYNSETVLLDMEPDI IKVDMSLVQRIHEDPNRQLILNNILDFAAQNHITVLAEGVETKEELEFLIQCGVELFQGY YIGRPQMEIRPIDPYIVRKMQEFAEKSDFSQK >gi|229783960|gb|GG667775.1| GENE 2 2902 - 4188 1661 428 aa, chain + ## HITS:1 COG:BS_tig KEGG:ns NR:ns ## COG: BS_tig COG0544 # Protein_GI_number: 16079875 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Bacillus subtilis # 1 422 1 421 424 340 52.0 4e-93 MSLQVEKLEKNMAKITVEVPAEQFEKALTAAFNKNKSRFNIPGFRKGKAPQAMVEKMYGV EVLYEDAINEALDATYGDAVTESGLEVVSRPEIDVVQVEKGKELIYTATVAVKPEVTLGE YKGIEVEKASAEVTDEDIEAELKKVQEQNSRLITVEDRAVEDGDQTVIDFEGSVDGTPFE GGKGEDYPLTIGSHSFIDTFEEQLIGKNIGEECEVHVTFPEEYHAKELAGKPAVFKVTVK EIKRKELPELNDEFAGEVSEFETLEEYKNDVKSKLSLKKQKDAATENENHVVDKVVENAQ MDIPEPMIDSQVNNMVNDYARRMQSQGLSLEQYMQFTGMTIETLKEQMKPQAVKRIQTRL VLEAIVKAENITVSDEAVEKEIADMAESYKMEVAQIKEYMGENGIEQMKEDLAVQEAVDF LVAEAKLV >gi|229783960|gb|GG667775.1| GENE 3 4270 - 4857 729 195 aa, chain + ## HITS:1 COG:CAC2640 KEGG:ns NR:ns ## COG: CAC2640 COG0740 # Protein_GI_number: 15895898 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Clostridium acetobutylicum # 1 193 1 193 193 295 74.0 5e-80 MSLVPYVIESTSKGERSYDIYSRLLKERIIFLGEEVTDVSASIVVAQLLFLEAEDPNKDI NLYINSPGGSVTAGMAIYDTMNYIKCDVSTVCMGMAASMGAFLLSGGAKGKRFALPNAEI MIHQPSGGAQGQATEIKIVAENILKIKKRLNEILAANTGQPYEVIERDTERDNYMNAEEA KAYGLIDEIIYSHEH >gi|229783960|gb|GG667775.1| GENE 4 4904 - 6205 266 433 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 153 412 248 460 466 107 31 6e-23 MPVRTDDKIRCSFCGKTQDQVKKLIAGSNNVYICDECIDLCAEILEEEFDSHEEEGPDFS DINLMKPKEIKSFLDDYVIGQDSAKKVLSVAVYNHYKRITSRRSMDVDVQKSNILMLGPT GCGKTYVAQTLAKVLNVPFAIADATALTEAGYVGEDVENILLKLIQAADYDISRAEYGII 
YIDEIDKITKKSENVSITRDVSGEGVQQALLKILEGTVASVPPQGGRKHPHQELLQIDTT NILFICGGAFDGLEKIIESRLSAGSIGFNAEIVDKNRTDIDDLLKKVLPQDLVKFGLIPE FIGRVPVTVSLELLDKEALVKILTEPKNALIKQYQKLFELDDVKLELTEEAVERIAELAV ERKTGARGLRSIMESVMMDIMYEIPSDSNIGICTITKAVVNKEGEPEIIYRDTTVPRKSL AQKLRKDKTGEIA >gi|229783960|gb|GG667775.1| GENE 5 7369 - 8868 1763 499 aa, chain + ## HITS:1 COG:BH3050 KEGG:ns NR:ns ## COG: BH3050 COG0466 # Protein_GI_number: 15615612 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Bacillus halodurans # 1 491 280 770 774 557 57.0 1e-158 MPGGSQEANVLRMYLETVLDLPWKKMSKDNNNMTHAEEILNADHYGLEKVKERILEYLAV RILTKKGTSPIICLVGPPGTGKTSIARSVARALNKQYVRISLGGIHDEAEIRGHRKTYVG AMPGRIVEAMRQAGVSNPLMLLDEIDKVSSDYKGDTSSALLEVLDSEQNVKFRDHYVELP IDLSNVLFIATANTTQTIPGPLLDRMEVIEVNSYTENEKFHIAKNYLVRKQLERNGLTEE QVSITDKALEKIIHNYTREAGVRNLERRIGDICRKAAREFLEDKKKAIRITENGLEKYLG KEKITFEDVNEEDQVGIVRGLAWTSVGGDTLQIEVNVMPGKGTLQMTGQMGDVMKESAQT ALSYIRSICPEYGVADDYFEKHDIHLHIPEGAVPKDGPSAGITMATAMLSAVTGRKVDAK LAMTGEITLRGRALPIGGLKEKILAARMAHMKRVLVPDKNRPDIAELSKEITKGLTIIFV KDMADVLKEALVSPAPVSE >gi|229783960|gb|GG667775.1| GENE 6 9857 - 10585 845 242 aa, chain + ## HITS:1 COG:SP0453_1 KEGG:ns NR:ns ## COG: SP0453_1 COG0834 # Protein_GI_number: 15900370 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Streptococcus pneumoniae TIGR4 # 1 239 32 265 325 160 41.0 1e-39 MECSYAPYNWTQPTDGNGAVPISGSSDFAYGYDVMMAKKIADTLGYELEIVKLDWDSLVP AVQSGKVDCVIAGQSITSERLQSVDFTEPYYYATIVTLVKSGSKYENAASVADLAGATAT SQMNTIWYDNCLPQIQDATIQPAQESAPAMLVALSSGRCDLVVTDKPTGQAALIAYPDFK LLDFTGTEGEFKVSDEDINIGISLKKGNTELKEAIDGVLAAMTEDDFNRMMEEAISVQPL SE >gi|229783960|gb|GG667775.1| GENE 7 10693 - 11493 940 266 aa, chain + ## HITS:1 COG:SPy0277_2 KEGG:ns NR:ns ## COG: SPy0277_2 COG0765 # Protein_GI_number: 15674455 # Func_class: E Amino acid transport and 
metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 22 240 9 224 239 210 51.0 2e-54 MSLPSDFFGRIVFILQKYGVSFLDGAGKTMVIAIVGTLIGCLIGFAVGIVQTIPVDKSDN PVKRVILGFIKLILNVYVEVFRGTPMMVQAMFIYYGSSVLFDINMSPMFAAYFIVSINTG AYMAETVRGGILSIDPGQTEGAKAIGMTHFQTMVNVIMPQALRNIMPQIGNNLIINIKDT CVLSVIGVVELFYATKGVAGAYYTYFEAFTITMVIYFILTFTCSRILRYWEKKMDGPDSY DLATTDTLAYTSGMVKFPDPNEKEDK >gi|229783960|gb|GG667775.1| GENE 8 11496 - 12239 267 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 7 219 9 218 309 107 30 4e-23 MAGSTVLEIHHLSKTFGTNVVLRDIDFSVAVGDVTCIIGASGSGKSTLLRCINLLETPTT GDILYHNTDITSGKMSAASYRTKVGMVFQSFNLFNNMTVLDNCMVGQVKVLNKSKEDAKR NAMRFLEKVGMAPYINAKPRQLSGGQKQRVAIARALAMEPEILLFDEPTSALDPQMVGEV LAVMRTLAREGLTMIIVTHEMAFARDVSNHVVYMADGVIEEEGAPADIFGNPQKTSTKEF LSRFMQG >gi|229783960|gb|GG667775.1| GENE 9 12390 - 13145 762 251 aa, chain + ## HITS:1 COG:SPy0277_2 KEGG:ns NR:ns ## COG: SPy0277_2 COG0765 # Protein_GI_number: 15674455 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 20 231 9 218 239 191 51.0 8e-49 MPESFIGGIGYILQEYGMSYLLGAAKTMKIALVGTSIGCLIGLAVGIIRTIPVKKDDRMA RKVFLTAVKAVMAAYVEIFRGTPMMVQAMFLFFGASAVFQINMSIGFAAFFIVSINTGAY MAESVRGGINSTDPGQTEGAKALGMTHVQTMVLVVLPQALRNIMPQIGNNLIINIKDTCV LSVIGVVELFYTSKGVAGAYYRYFEAFTITMVLYFVLTFTCSRILLLIEKRMAGNGSYKM EPQAAGKGAVL >gi|229783960|gb|GG667775.1| GENE 10 13142 - 13337 80 65 aa, chain + ## HITS:1 COG:SA1674 KEGG:ns NR:ns ## COG: SA1674 COG1126 # Protein_GI_number: 15927431 # Func_class: E Amino acid transport and metabolism # Function: ABC-type polar amino acid transport system, ATPase component # Organism: Staphylococcus aureus N315 # 7 64 3 60 242 77 55.0 8e-15 MSGTEAILEMRHIGKYFGANVVLRDVNFSVREGEVTCIIGASGSGKSTLLRCINLLEMPT SGEIQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 
02:48:28 2011 Seq name: gi|229783959|gb|GG667776.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld169, whole genome shotgun sequence Length of sequence - 14444 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 5, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 1/0.000 + CDS 3 - 782 563 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases + Prom 1684 - 1743 11.9 2 2 Op 1 7/0.000 + CDS 1837 - 2850 931 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 3 2 Op 2 . + CDS 2865 - 4160 916 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 4 2 Op 3 . + CDS 4214 - 4306 77 ## + Term 4323 - 4348 -0.8 5 2 Op 4 . + CDS 4374 - 5438 553 ## Micau_2124 methicillin resistance protein + Prom 5450 - 5509 2.9 6 3 Op 1 3/0.000 + CDS 5567 - 6073 176 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 7 3 Op 2 . + CDS 6159 - 7007 110 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 8 3 Op 3 . + CDS 7000 - 8190 550 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) 9 3 Op 4 . + CDS 8214 - 9395 374 ## COG0673 Predicted dehydrogenases and related proteins 10 3 Op 5 . + CDS 9448 - 10476 342 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase + Prom 10558 - 10617 2.9 11 4 Op 1 . + CDS 10638 - 11672 323 ## ELI_2571 hypothetical protein 12 4 Op 2 . + CDS 11703 - 12905 401 ## COG0438 Glycosyltransferase + Prom 13456 - 13515 6.5 13 5 Tu 1 . 
+ CDS 13611 - 14165 -96 ## gi|266624440|ref|ZP_06117375.1| hypothetical protein CLOSTHATH_05806 + Term 14291 - 14321 -0.9 Predicted protein(s) >gi|229783959|gb|GG667776.1| GENE 1 3 - 782 563 259 aa, chain + ## HITS:1 COG:TM1548 KEGG:ns NR:ns ## COG: TM1548 COG1086 # Protein_GI_number: 15644296 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Thermotoga maritima # 1 257 15 266 605 113 30.0 3e-25 SYFFGLWLRFDLRFSKIPQVYLDSYLKFVPLYTVFALIVFYALHLYKSLWRFASFSELNR IFASSVITTVFHTVFITVLLKRMPISYYVIGSAMQLALVTAVRFAYRYVTLERTRRERNV SAVHNVMVIGAGAAGQVILKELKSSTEVMSKPCCVIDDNPNKWGRFMEGVPVVGGRDSIM KAVDKYRIDQILLAIPTASPKDKRDILNICKETHCELKSLPGIYQLANGEVSLSRMKAVA VEDLLGREPIKVNMEEISS >gi|229783959|gb|GG667776.1| GENE 2 1837 - 2850 931 337 aa, chain + ## HITS:1 COG:SA0147 KEGG:ns NR:ns ## COG: SA0147 COG1086 # Protein_GI_number: 15925856 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Staphylococcus aureus N315 # 2 285 352 584 607 258 50.0 2e-68 MVFEKYKPDIVYHAAAHKHVPLMETSPNEAIKNNVVGTYKTAYAALRNGAQKFVLISTDK AVNPTSIMGASKRLCEMVIQSMDAVSKEGRMDILPFLYAHANKTIDWQVVPDPVDHMAVE SVREEKNMDTIGQGSRARTQFVAVRFGNVLGSNGSVIPLFKKQIEAGGPVTVTHPDIVRY FMTIPEAVSLVMQAGTYARGGEIFVLDMGDPVKIDTLARNLIKLSGYEPGIDIQMAYTGL RPGEKLFEEKLMAEEGMLKTDNELIHIGKPVEFDTEEFLMLLEELARASYENSDHIVEMV KTIVTTFQPVGDKPTGKEFLRKSGQGAGEAAVTTEEE >gi|229783959|gb|GG667776.1| GENE 3 2865 - 4160 916 431 aa, chain + ## HITS:1 COG:BS_yvfE KEGG:ns NR:ns ## COG: BS_yvfE COG0399 # Protein_GI_number: 16080476 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Bacillus subtilis # 18 279 4 269 301 266 47.0 6e-71 MNKEQKGQEMQLIPFTNKVWLSSPTMHGQEIEFVKEAYESNWMSTIGSNIDEAERLICDK 
TGCKYAVALATGTAALHMAIKLTGIKPGDKVFCSDLTFSATVNPVKYEGGEAVFIDTEYD TWNMDPKALEKAFELYPDVKTVVVAHLYGTPGKIDEIKKICKRHGAMLVEDAAESLGAVY KGTQTGVFGDFNAVSFNGNKIITGSSGGMFLTDDGEAAKRVRKWSTQAREIASWYQHEEL GYNYRMSNVIAGVVRGQIPYLEEHISQKKAIYLRYKEAFEKAGLPVEMNPYDKENSEPNF WLSCLLIKPEAMSRQVRGEQEALYIRESGKSCPTEILEALTKYNAEGRPIWKPMHMQPFY RMNPFVTRYGNGRGKTDAYTDGRIGAALGKDELLLDVGADIFERGLCLPSDNKMSKADMD KVCEIVKRCFG >gi|229783959|gb|GG667776.1| GENE 4 4214 - 4306 77 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVFGHELKKMLRNDSIPAVDCVFNSFLVIL >gi|229783959|gb|GG667776.1| GENE 5 4374 - 5438 553 354 aa, chain + ## HITS:1 COG:no KEGG:Micau_2124 NR:ns ## KEGG: Micau_2124 # Name: not_defined # Def: methicillin resistance protein # Organism: M.aurantiaca # Pathway: not_defined # 13 351 15 353 362 203 36.0 1e-50 MQHITEIPVLERERWDQAVKSIKGYDVFYFNEYVTAFMHEDESKGEPVLLYYENGSDRAI NVVFKRDVAMDQKLAESLPNNTYFDLITPYGYGGFIGSITDYDALNASYTAYCVDKSYVC EFVRFELFSNYHMHYSGETETRTHNVVRSLDIPIDTIWMDFKQKVRKNVKRANTYGLEII VDEDGTHMDDFLRIYYSTMERSDAEDQFYFKKPFFEELMSMENSVMFYVRFQDKIISTEL VIYGSENCYSYLGGTESEYFYTRANDFLKFEIIKWAKEKGLRNFVLGGGYGADDGIFQYK MNLAPHGIRDFYIGRKVFDEGAYQELLHIRSKGNSEMEKKLIDSGFFPAYRGTH >gi|229783959|gb|GG667776.1| GENE 6 5567 - 6073 176 168 aa, chain + ## HITS:1 COG:SPAC18B11.09c KEGG:ns NR:ns ## COG: SPAC18B11.09c COG0110 # Protein_GI_number: 19113786 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Schizosaccharomyces pombe # 35 141 92 201 207 64 36.0 1e-10 MNKKRLLQTIGLTLKRGAGQRGEYLREKHILASVGENVYFQPRLVPLYPELIKLHNNVMV SAGVRLITHDASFVVLNRLGKGEFPEKVGCIEVMDNVYIGYNATIMPNVKIGSNVIVGAG ALVSKDLEPNGVYVGVPAKRICSFDDYMVRHCENPDNQGGVFLSLCTA >gi|229783959|gb|GG667776.1| GENE 7 6159 - 7007 110 282 aa, chain + ## HITS:1 COG:NMB1820_1 KEGG:ns NR:ns ## COG: NMB1820_1 COG2148 # Protein_GI_number: 15677656 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Neisseria meningitidis MC58 # 63 249 3 
189 223 198 50.0 8e-51 MDKKKKIAITAAVSAGVGVASYRLRKSLKHSKDSESITDLDSNNTESKSISQVFSHTECY YEKHIKRTMDIICSGLALVVLSPVFLVSAVAVWISLGRPVIFVQERLGKDEVPFNLLKFR SMKNVFDEYGVPLPDQKRRTRIGNIIRKLSIDELPSLINILKGDMSIVGPRPLPTNYGPW FYENERKRHSVKGGLTGLAQINGRNVLSWEERFVYDLQYVDHITFLGDVKIILKTVEKVL KRSNIGERGVNSPGDFHVYRSGMTEQELVMWEKEKREKECNA >gi|229783959|gb|GG667776.1| GENE 8 7000 - 8190 550 396 aa, chain + ## HITS:1 COG:SPy0835 KEGG:ns NR:ns ## COG: SPy0835 COG0458 # Protein_GI_number: 15674870 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Streptococcus pyogenes M1 GAS # 98 277 659 837 1058 73 26.0 8e-13 MRRKKLLMLNASNSDVPLILAAREEGFYVITTTTMSDYIGHKYSDEYIYHDYMDFDGLVE LCKEHIITAVSCGVSDGASFPASYLTDYFGWKGHDSYETIGKLHNKDEFKALAKRLKLRS PVSEGFDNIEGALAYTEKLTYPTIVKPVDLGGGQGISVVNNPQEYAKAVKVAISRSKIKR IVVEPYIVGTQHSFSSFLIDKKVVHYCTWDDLNYSDRFMVSKGSYPASYKNAEYIDKTIC EEIEKIAAELNLVDGLLHLQFIVSNGAPYIIEVMRRAPGNWDTCLSSVASGVEWNKWIIR AEAGMDCHGFPMGRIQSGYWGYYCLLGSRNGEMKDIMLSEDIKKNIIRFDRWVDNGFKIT DFKHQKLALFQFWFSSKEEMNQKMGIIDDLIQVEYK >gi|229783959|gb|GG667776.1| GENE 9 8214 - 9395 374 393 aa, chain + ## HITS:1 COG:SMc01163 KEGG:ns NR:ns ## COG: SMc01163 COG0673 # Protein_GI_number: 15964109 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Sinorhizobium meliloti # 1 392 1 374 376 73 22.0 9e-13 MNKKGEICLGMIGAGRATELHMNALTRYTGVPLCFKWIAAHRYEQVNMMKERYGFVNSSL DYHDILNDPEVDIVDICTPPYAHTQMIIEAMRAGKHVICEKPLTGYFGEEGDERPIGINV SKVHMYEKLCESLEEVKKVVDESGKKFMYAENFVYAPAVRKAAEILTKKKSKILYAKGEE SLKGSSSPVAGEWDKTGGGTFIRTGAHPLSAILWLKQIESEARSEEITVKSVFADMGHIT PSLSDYDHRHIAARPHDVEDNGTVIITFSDDSKAVVIATDVLLGGSKNYLELYCNDAVIN CTLTLSNLMSTYFLDEDNLDDVYISEMLPLKTGWNNPFVEDEIIRGYMDELRDFVESVYF DREPKSGFKLAYDSIKIIYAAYKSAELGKAVEP >gi|229783959|gb|GG667776.1| GENE 10 9448 - 10476 342 342 aa, chain + ## HITS:1 COG:L0063 KEGG:ns NR:ns ## COG: L0063 COG0722 # Protein_GI_number: 
15672098 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Lactococcus lactis # 9 337 11 337 345 322 50.0 5e-88 MILVNKLPDVEEVKNEYPLTKEQENNRGVFIKQIRDILSGIDKRKILIIGPCSADREDTV VDYMVRLSRVREDVKDTIFIIPRVYTSKPRTNGIGYKGLLHNPDTNDAHEDLISGVLASR KMHLKVIQETGMYAADEMLYPDSLMYFDDLLSYLAVGARSVEDQMHREMASGFNIPVGLK NPTSGDLSVLLNAIFAAQHSQRMLYHGWEVATKGNLYAHAILRGYIDRNGKMYSNYHYED LNDFHDQYIKLNLKNSSVIVDCSHCNSGKKYYEQERILEEVLSVCSRKRGISSLVKGFMV ESYLEEGNQMVGSGIYGKSITDGCIGWDKTQELIYKMADTLA >gi|229783959|gb|GG667776.1| GENE 11 10638 - 11672 323 344 aa, chain + ## HITS:1 COG:no KEGG:ELI_2571 NR:ns ## KEGG: ELI_2571 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 328 57 356 356 176 35.0 2e-42 MPSYCCYTMLEPFVRHGIKIRFYDIFYDEEKGLCAKLPEEDYTEVSKENHDNEIFYYMTF FGFSQLSGLDVKQIRDIYGLIIVDRTHSWLSDSTFQVEIDADYTYISFRKWTGFYGIAEA IKQHAPFVLELGTVGSEYSEHRKEAMMMKNDYIKWQRTIEVDEHLSHATEIAEAEGQKKN YLNKYDEAEDFLDEHYVGNKPTTECVYQLLLADWNFVKGQRRTNAVVLLQGLRDIPEIKL MYGEQSEMDVPLFVPILVKDDRDGLRRHLINKQIYCPVHWPLSKYHQNISNQGKEIYNRE ISLVCDQRYTEEDMNRIVDAIKDFFDKKSVACEQVTNIRDYVCN >gi|229783959|gb|GG667776.1| GENE 12 11703 - 12905 401 400 aa, chain + ## HITS:1 COG:TM0622 KEGG:ns NR:ns ## COG: TM0622 COG0438 # Protein_GI_number: 15643387 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 33 394 23 376 388 90 25.0 5e-18 MRETTVNKTLKNRKIKIAFVSLFSNISGATLAMLEIIDKLIEDGNFDVIVITGSRGTLET RLEERNIPVFKCLFFSWLKGRGIVEEIKIRVKRVLSFRSEIQILNILKQEKVDLLHINTG ISPVGIKSAKKLGIPVIWHLREEPVSYFNREPYDKKVELRSIKQTDAIITISKYIYDCYK DKFNHNARVIYDGVYTEGADDLRTKEIFTGETVHMSLCGYNAFKGHVEAIKALSLLLKKG YENIYLYFYGNIEPTFKKNLMAMVEKRGLGNHVKFEGYVDDMPKKWAESDIGLMCSDGEG FGRATVEAMATGALVIGANKGATPEIIEDGRGFLYEKGNAKQLARKIVYAIEHPEEARRT AKNGQRYITSGTFSMDRNIDNLKSLYQEILEKKKITSVKI >gi|229783959|gb|GG667776.1| GENE 13 13611 - 14165 -96 184 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624440|ref|ZP_06117375.1| 
## NR: gi|266624440|ref|ZP_06117375.1| hypothetical protein CLOSTHATH_05806 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05806 [Clostridium hathewayi DSM 13479] # 1 184 229 412 412 343 100.0 4e-93 MLSVGLWGLYLFYNSFGTGKIKRKWFGVFIVAVLMVLVAYIMSPQLQYSVQRIFTGIGGQ NSAIQGRLGGGQYYISLFNRSEFLFGTGDAGTELTMYMSGLYHLIYIDGVICAALFILMF AVQAIRRRGGDRLLPCIVILMTVFSDTTLIQTLVYVMIYVFATVNHVEVKQLKRKIKIRV PSKH Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:48:57 2011 Seq name: gi|229783958|gb|GG667777.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld170, whole genome shotgun sequence Length of sequence - 14984 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 6, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1384 1298 ## COG3119 Arylsulfatase A and related enzymes 2 1 Op 2 4/0.000 + CDS 1387 - 1827 382 ## COG0295 Cytidine deaminase 3 1 Op 3 7/0.000 + CDS 1824 - 3146 1334 ## COG0213 Thymidine phosphorylase 4 1 Op 4 . + CDS 3167 - 3820 762 ## COG0274 Deoxyribose-phosphate aldolase + Prom 4014 - 4073 6.0 5 2 Op 1 15/0.000 + CDS 4099 - 4599 674 ## COG0440 Acetolactate synthase, small (regulatory) subunit + Prom 5633 - 5692 26.8 6 2 Op 2 . + CDS 5810 - 6589 1052 ## COG0059 Ketol-acid reductoisomerase + Term 6637 - 6686 14.6 + Prom 6632 - 6691 3.2 7 3 Tu 1 . + CDS 6759 - 7124 382 ## COG1733 Predicted transcriptional regulators 8 4 Op 1 30/0.000 + CDS 8083 - 9159 1210 ## COG0065 3-isopropylmalate dehydratase large subunit 9 4 Op 2 . + CDS 9186 - 9653 553 ## COG0066 3-isopropylmalate dehydratase small subunit + Prom 10555 - 10614 10.9 10 5 Op 1 . + CDS 10665 - 11540 723 ## COG3773 Cell wall hydrolyses involved in spore germination 11 5 Op 2 . 
+ CDS 11648 - 12112 578 ## COG2731 Beta-galactosidase, beta subunit + Term 12129 - 12166 4.6 + Prom 12256 - 12315 4.4 12 6 Op 1 38/0.000 + CDS 12339 - 13211 1147 ## COG1175 ABC-type sugar transport systems, permease components 13 6 Op 2 14/0.000 + CDS 13211 - 14047 925 ## COG0395 ABC-type sugar transport system, permease component 14 6 Op 3 . + CDS 14104 - 14983 1203 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783958|gb|GG667777.1| GENE 1 2 - 1384 1298 460 aa, chain + ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 2 458 21 457 467 156 27.0 1e-37 SEGACFTHSYCATPLCGPTRRTMLTGLFAHTHKQYHNYTDPPYDHEVYLDTLAEAGYDNY YFGKWHAGPGAANEHHCSGFSRTDYSNPYIQPEYKEYLKRYNLPQALHHIDRYFDIPEIS SRGDWAECADNVDYQCKANWCGEHAVGSTLTPKETHESFFLATLACEQLEKLAKSKSDQP FSLRVDFWGPHQPHFPTQEFLDLYPKEDFVLPEYGSYRNRLQGKPGAYYRERSSPFGKDD QLIIPSTVNWEEWTEIIRHAFAHITMLDAAGGMIVDKLKELGLDKNTMIIWTTDHGDALA SQGGHFDKGSYMSEEVMRIPCAVKWEGVIPAGQIRDELISTIDYPVTILDAAGTAFTKNK VHGRSLLPLLTGKTDQWDDDLMAETFGHGYGEDINSRMLVHGDYKLIANKESIAELYNLR TDPFELHNLYEDPAFADVRADLENRLRIWQKRTDDPVEVL >gi|229783958|gb|GG667777.1| GENE 2 1387 - 1827 382 146 aa, chain + ## HITS:1 COG:CAC1544 KEGG:ns NR:ns ## COG: CAC1544 COG0295 # Protein_GI_number: 15894822 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Clostridium acetobutylicum # 9 136 6 127 132 116 47.0 1e-26 MDKKITQMLIEEAAEARKQSFSPFSGFAVGAALMTAEGAVYRGCNIECSGLTASNCAERT AFFKAVSEGERNFKAIAIVGGPSDRDVESLCYPCGVCRQVMAQFCDLDAFRIIVAADRSH WEECTLRELLPRAFHRHCTAEGSEEL >gi|229783958|gb|GG667777.1| GENE 3 1824 - 3146 1334 440 aa, chain + ## HITS:1 COG:SA1938 KEGG:ns NR:ns ## COG: SA1938 COG0213 # Protein_GI_number: 15927710 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine phosphorylase # Organism: Staphylococcus aureus N315 # 1 
399 14 413 446 447 59.0 1e-125 MRMYDVIMKKRDGGVLTDEEIGYFVDGYTKGEIPDYQASALLMAIFFQGMNEHETAYLTG CMAASGDEIDLSSIPGIKVDKHSTGGVGDKTTMVIGPVAAACGVPVAKLSGRGLGHTGGT LDKLEAIPGLTTSLDTAKFFDMVKEIGIAVAGQTGNLAPADKKLYALRDVTATVDNISLI AASIMSKKIASGSDRILLDVKTGSGAFMKTLDDSIALAKEMVNIGTNVGREVVALVTDMD RPLGRNIGNALEVMEAAATLKGRGPEDFTQICIALSSNMLYLAEKGTMEECTAMAKEAIK SGRAFEKFCQMAEAQGGDASYIRRPEKFTISPVEYELKAEKTGYITAMDTEKIGMAAVEL GAGRRKKEDPIDYGAGIILNAKIGDRVENGQTVAVLYTSAVPLLSGAVELLKEAVVIGKE QPEEEKLICARVTKDGVERY >gi|229783958|gb|GG667777.1| GENE 4 3167 - 3820 762 217 aa, chain + ## HITS:1 COG:SPy1867 KEGG:ns NR:ns ## COG: SPy1867 COG0274 # Protein_GI_number: 15675686 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Streptococcus pyogenes M1 GAS # 1 212 1 213 223 221 54.0 8e-58 MTYSEMLHLVDHTLLKACASWEEICRLCDEAVEYQTASVCVPPCYIKRIRETYPDLTICT VVGFPLGYSVTEAKVCEVREALADGADEIDMVINLTDVKNGEFQKVEDEIRTLKKAAGDR VLKVIIETCYLTGEEKVAMCHAVTNAGADFIKTSTGFGTAGAQLSDIELFKSNIGENVRI KAAGGVRTLEDLKNYVEAGCSRVGASSVVKQVAELQK >gi|229783958|gb|GG667777.1| GENE 5 4099 - 4599 674 166 aa, chain + ## HITS:1 COG:MTH1443 KEGG:ns NR:ns ## COG: MTH1443 COG0440 # Protein_GI_number: 15679440 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanothermobacter thermautotrophicus # 5 159 10 165 168 142 51.0 2e-34 MERIVLSLLVDNTSGVLSRVAGLFSRRGYNIESLTVGVTADERYSRMTVVSIGDQEILDQ IEKQLRKLEDVRDIKELKPGHSVYRELILVKVRANASERQAVSAIADIFRATIVDVGKDS LTVMLTGDQSKLDALINLLEDYEILELARTGLTGLSRGSDDVRYLP >gi|229783958|gb|GG667777.1| GENE 6 5810 - 6589 1052 259 aa, chain + ## HITS:1 COG:Cj0632 KEGG:ns NR:ns ## COG: Cj0632 COG0059 # Protein_GI_number: 15791992 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Campylobacter jejuni # 1 258 79 336 340 352 65.0 3e-97 MILINDEKQAAMYKESVEPNLEEGNMLMFAHGFAIHFGQIVPPKNVDVTMIAPKAPGHTV 
RNEYVVGRGTPCLVAVHQDYTGKAKDMALAYALALGGARAGVLETTFKVETETDLFGEQA VLCGGVCALMKAGFETLVEAGYAPENAYFECIHEMKLIVDLIFESGFAGMRYSISNTAEY GDYITGPKIVTDETKKAMKKVLTDIQDGTFAKDWLLENQVGCSHFNAMRRKEAEHPAEKV GAELRKLYSWNDGGKLIDN >gi|229783958|gb|GG667777.1| GENE 7 6759 - 7124 382 121 aa, chain + ## HITS:1 COG:FN0589 KEGG:ns NR:ns ## COG: FN0589 COG1733 # Protein_GI_number: 19703924 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 1 97 2 98 107 142 76.0 2e-34 MEKTLPACPVETTLMLISDRWKVLIIRDLMDGTKRFGELKKSIGSISQKVLTSNLREMEA DGLVNRKVYAEVPPRVEYTLTDTGYSLKPILDAMVEWDWNIRESRTEQDAGYRRHETSLA S >gi|229783958|gb|GG667777.1| GENE 8 8083 - 9159 1210 358 aa, chain + ## HITS:1 COG:CAC3173 KEGG:ns NR:ns ## COG: CAC3173 COG0065 # Protein_GI_number: 15896421 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Clostridium acetobutylicum # 2 356 67 421 422 517 74.0 1e-146 MDHFIPNKDIKSAENCKCCRDFACRHEISNYFDVGQMGIEHALLPEKGLVVAGDAVIGAD SHTCTYGALGAFSTGVGSTDMAAGMVTGKAWFKVPSALKFELVGKPNGWVSGKDVILHII GMIGVDGALYKSMEFTGDGIKNLTMDDRFTICNMAIEAGGKNGIFPVDDLAVQYMKDHSK RDFTVYEADPDAEYDEVYTIDLSELKPTVAFPHLPENTKTIGSFGDIPVDQSVIGSCTNG RIDDMRIAARVMKGRKVAKNVRCIVIPATQEIYLQAMREGLLEIFIEAGAIVSTPTCGPC LGGYMGILAAGERCISTTNRNFVGRMGHVDSEVYLASPAIAAASAITGKISAPEELGL >gi|229783958|gb|GG667777.1| GENE 9 9186 - 9653 553 155 aa, chain + ## HITS:1 COG:PAB0892 KEGG:ns NR:ns ## COG: PAB0892 COG0066 # Protein_GI_number: 14521550 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Pyrococcus abyssi # 1 153 1 153 164 213 65.0 2e-55 MKATGSVFKYGDNVDTDVIIPARYLNITDGYELARHCMEDIDKDFVKNVKKGDIIVANKN FGCGSSREHAPLVIKCAGISCVIAETFARIFYRNAINIGLPIIECPEAAKAISAGDEVEV DFDSGIITNKTTGESYEGQAFPPFMQKIISAGGSS >gi|229783958|gb|GG667777.1| GENE 10 10665 - 11540 723 291 aa, chain + ## HITS:1 COG:BH1631_2 KEGG:ns NR:ns ## COG: BH1631_2 COG3773 # Protein_GI_number: 15614194 # Func_class: M 
Cell wall/membrane/envelope biogenesis # Function: Cell wall hydrolyses involved in spore germination # Organism: Bacillus halodurans # 160 262 6 107 132 70 38.0 4e-12 MLTLHSILQHICQFFRGLAVKATKRMYRSSAVFMAGAAVITVVAFTSTGFGSGGKSALTA FAETPGQETAAEEEDAEEEEPVTEAKVQVQLTEIKKQGQLLAGNLLEKEVQQKQEIQQDS KEELERINEQIVMDAKIKALEAEENARQKALEEERRIEEEKKAASRKAISDDDYQVLLRI VQAEAGVCDEKGKILVANVVLNRVKSQEFPDSVRSVVYEPSQFSPVSDGSINSVKVTEET KECVNRALEGEDYSDGALYFMNRRGSRSRAVSWFDSHLTYLFRHQNHEFFK >gi|229783958|gb|GG667777.1| GENE 11 11648 - 12112 578 154 aa, chain + ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 1 152 1 150 152 114 41.0 5e-26 MIFDSAKNLDFYRGLGIEGRYGKAVDFLQNTDLENLEPGKYEIDGKNVYANVTEYTTIPW EEAKYESHRDYTDIQYMIRGTETMTYAPVDALNVKVPYNEEKDCVLYDNANPGLKVVVGA GEYMIFQPWDGHKPKAADGEPAPIKKVIVKIKEN >gi|229783958|gb|GG667777.1| GENE 12 12339 - 13211 1147 290 aa, chain + ## HITS:1 COG:mlr7001 KEGG:ns NR:ns ## COG: mlr7001 COG1175 # Protein_GI_number: 13475831 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mesorhizobium loti # 15 288 33 315 317 168 36.0 1e-41 MKTKRLTYSNFYFYLLLLAPILAVYILFFVIPVVSSMFFSLTNFNGISLNFKWVGFNNYD VAFHDKVFKKAMVNTFLFALGATVFQNLFAIVFAMALNTKIKSKNFLRMLLFAPCMLSPV VVAFIWQFIYMPDGMLNHLLGTDITWLGNRKTALFCVVIAHVWMWIGYSSTIYMSNLQSI STDILEAAAIDGAGPWQKFKSITLPMLAPATTINVTLAFTQSLKVFDIVYAMTNGGPLNS TETVGTYVVANMNRGLHGYASAQTVLLTIVIVLFGQVLIGVLKKREEAIC >gi|229783958|gb|GG667777.1| GENE 13 13211 - 14047 925 278 aa, chain + ## HITS:1 COG:SP1895 KEGG:ns NR:ns ## COG: SP1895 COG0395 # Protein_GI_number: 15901722 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 1 277 2 278 278 138 34.0 1e-32 
MKTRKKQLLFLFELLMAAACILMFYPVIMMVLVSLKDEALLAGEPLSLRTSFAFENYVTA AKGMKYWRALFNSSVLTVCSSLLTTFFGACGAYAIMRARRGKQLFLALNALFLIGLALPQ QVAMVPLVLWMQKLHVANTLFGLILAFIGANAAYGVFFFSGFVNTVPVTLEEAAYIDGAS PFTTFIRIVFPLLKPPMVTLLIVIALRVWNNFMYPLLLLQGQNSRTLPLTVFFFKGDLSI QWNILFAATTLVILPLMIVYFILQKQIISGMLNGSVKG >gi|229783958|gb|GG667777.1| GENE 14 14104 - 14983 1203 293 aa, chain + ## HITS:1 COG:BH3690 KEGG:ns NR:ns ## COG: BH3690 COG1653 # Protein_GI_number: 15616252 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 60 200 42 185 420 67 32.0 3e-11 MRKRKAAVWFVAAAMVMGTLAGCGQKAAPETTAAQTGETKKEETAAAAEETKAEAQKPVE QVELKVFGFKTGAEEGAIPELIEQFNKENPDIKVVYEGISNAGGYQDVLTARLASGQGDD VFFANPNYLPQLQEAGYTEDLSDMPVVAQYSGLVKDLLTINDSIPGLGMEIAVFGMFSNL DVLKEVGIDHAPANYQEFLEDCEILKKAGKTPIVAGAKDGTGVAVLSMAKSMDPVYQAAD KMDQIARMNSGELKFGDVMQPGFALVEDLISRGYLDGSKALVYAAGQDDIAEF Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:49:01 2011 Seq name: gi|229783957|gb|GG667778.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld171, whole genome shotgun sequence Length of sequence - 12211 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 607 651 ## gi|266624457|ref|ZP_06117392.1| hypothetical protein CLOSTHATH_05824 + Term 622 - 678 18.0 2 2 Op 1 7/0.000 - CDS 643 - 2247 1619 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 3 2 Op 2 . - CDS 2275 - 4080 1635 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 4230 - 4289 9.1 + Prom 4230 - 4289 5.9 4 3 Op 1 35/0.000 + CDS 4337 - 5701 1655 ## COG1653 ABC-type sugar transport system, periplasmic component 5 3 Op 2 38/0.000 + CDS 5877 - 6797 1079 ## COG1175 ABC-type sugar transport systems, permease components 6 3 Op 3 . 
+ CDS 6812 - 7639 901 ## COG0395 ABC-type sugar transport system, permease component 7 3 Op 4 . + CDS 7686 - 8864 1344 ## Closa_1929 glycosyl hydrolase family 88 8 3 Op 5 . + CDS 8881 - 10296 1334 ## COG3119 Arylsulfatase A and related enzymes + Term 10318 - 10371 5.5 - Term 10304 - 10360 3.6 9 4 Op 1 . - CDS 10394 - 11233 751 ## Closa_3187 hypothetical protein 10 4 Op 2 . - CDS 11261 - 12211 1034 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP Predicted protein(s) >gi|229783957|gb|GG667778.1| GENE 1 2 - 607 651 201 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624457|ref|ZP_06117392.1| ## NR: gi|266624457|ref|ZP_06117392.1| hypothetical protein CLOSTHATH_05824 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05824 [Clostridium hathewayi DSM 13479] # 1 201 1 201 201 349 100.0 8e-95 TVSFIEELKQYADNPDVMEAIRLAILEERETEENSTNQWVTNFIMTNPVVVGVDGPYQET YDELAAVFYESGKKEYYSSVLEKSSKETQAAMAERAYKEHNIPCFSMTVQILTAEERLQY LEKSYQDNQTAFFSILFSDLAGNEQTSYAYRAYEDDRKDFFSIAVESLDSYQCEALAKRA YREGKKEFLYLIPGAKDDKDF >gi|229783957|gb|GG667778.1| GENE 2 643 - 2247 1619 534 aa, chain - ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 526 1 521 526 176 25.0 1e-43 MYRVIIVDDEPLILAGIASLITWEEYDCTIIGKATNGPSAFDMIMELKPDIVITDIRMPV LNGLELVEKCKDEGCTFSFIVLTNLEEFHLVRKALSLGASDYLVKLDLNEEALAASLARA KEACELLVSQKQHQLLNDLMKDNQENASKNYLSQLLLSGNNPEVPDKVRAEYPDPFLLLF SISPINIAFETDDDPYDFHQISKQLMDILGGIATRTFRCHTLLEFRTDTYLLAASLKDGD VFETVVSEFCQKMTVALKTYFEFSVVFGVSRRKSDISLLRDALSEAQTALEVYYFESSSP VVFYQGQKYHMSRAKDFNINIFKKDLSAAIAQNDSDKLAAVFGQIITLFKENQPGKEQAA SACINLYTYLYSYFESADNSYGDIFPYAINVAGRLYQFNSLADILTWLQSFCSKLCRLLN DRKSTRSDKLVEQARAYVDEHYMEKLTLADIADCLNISAGHLSNTFKKLTGSTLSDYIAT VKIEHAKELIDTHQYLMYEISDMLGFDNPYYFSKVFKKVTGISPREHENRTPAS >gi|229783957|gb|GG667778.1| GENE 3 2275 - 4080 
1635 601 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 323 599 309 592 602 171 35.0 3e-42 MKRAEKQKNGKMPKTFQKKLILYNMLVIICIASAVSFYNYRSYCQDVIISETNNSANRVR ALSERLQVAYDEMVNIVLNCAERKSLFFATTLDSPNKPYNTHMAVYASDVLRDFCAISGY SRYINKIVLFHNGLVLQAGNSFGASTDVENIMKAPWFSSLLYRENSQYTLSLEDNPFTFS SGRSAGPKLLPLLRPLQYSARSKPEESWVFLGISTTLFSDTLKSLPPDALVYAVTAGGDI IASTDVNGEFDTSAIIGSLLADKGAEGNYQMKLNHENCVITYVKQPVSGLLFFEIRPESR MQLDHGVIRRTVLIVFFFCIAIGLTLSLLISRQLGAPISRLTKRLELISRGDFEPDHSIE TDDEIGMIGRQINHMSGHISTLLETRVENEKEKKDMEIKMLQAQINPHFLYNTLDSIKWI ATMQKNSGIVSVVTALSSLLKNMAKGFNEKVTLRQELDFLQNYVIIEKIRYIELFDIDII VEQEELYDARIIKLTLQPIVENAIFSGIEPSGRTGLITIRVWVKSGVLHISVTDNGIGIS PENIEKLLTDTSRITRSNMSGIGLPNVDRRFKLVYGEDYGLTIESVVDQYTTITITLPLE F >gi|229783957|gb|GG667778.1| GENE 4 4337 - 5701 1655 454 aa, chain + ## HITS:1 COG:BH1864 KEGG:ns NR:ns ## COG: BH1864 COG1653 # Protein_GI_number: 15614427 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 15 370 15 356 461 79 24.0 2e-14 MKLKRVAALTLAAGMACTSLAACGSKNTPTAAPETAAKEEATTAADAESKEETTAASGEK QVLSVTTWDYDTTPQFSAVVKAYEAKNPNVEIKVIDTSADECNNSLGISLSAAQADPDVI WVKDMGSMLQMADKGQLLPLDDMIKADNLDLSVYSGAAEQLMYNDTTYGLPYRSDWYVLY YNKDLFDAAGVEYPSNDMTWGEYDELAAKMTSGEGSAKVYGTHDHTWQALVTNWAVQDGK HTVVEKDYSFLKPYYEEALALQENGYMQDYSTLKTANIHYSSVFKNQQCAMMPMGTWFIA TMMQAQAEGETDFNWGVARIPHPEDTESGYTVGALTPIGISAYTDQKDLAWDFVKFASSE EAANILAEQGVFTGIQTDESINTIASAENFPEGDSNKEALTYTHYAFDRPLDPQIEEIRK PLDEVHEMIMIKAYSIDDGIAELNKRVAEIKGWN >gi|229783957|gb|GG667778.1| GENE 5 5877 - 6797 1079 306 aa, chain + ## HITS:1 COG:BH1865 KEGG:ns NR:ns ## COG: BH1865 COG1175 # Protein_GI_number: 15614428 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease 
components # Organism: Bacillus halodurans # 3 301 10 306 309 205 41.0 8e-53 MEEKKKKGLSSRQKRVIRDNMVGYAFILPNLIGYAIFIFIPVCFSFVLSVMKWDASQAPM EFVGLANFAQIFQDEIFMKSFWNTIEYALMTVLPTLVLSLLLAVLLNNKLKGIAIFRTAI YFPYIASIVAVGAVWNMLFQPDFGPVNEFLKFIGIAKPPRWVVDVNWAMVAISIVSIWKY MGYYMIVYLAALQGISSSLYEAASIDGANGWQKLRYITIPMLTPTTFFVLIMLTIQCFKV FDLVYVMTGGGPGNATKTLVNYIYEKAFTSWQLGPASAGAIILFAVVLVITLIQFTGEKK WSKDLM >gi|229783957|gb|GG667778.1| GENE 6 6812 - 7639 901 275 aa, chain + ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 11 274 18 281 282 201 40.0 1e-51 MQTKHKSYKGLIYILLVLLSLFMLVPFYWMVISSLKLNKDVFSIPMKWWPDTMHWDNYQI IWKKLPLMTFFLNTAKLTICTTVIQLFTSCLAAYGFTKTKFKGRDLIFLMYVTTIAVPWQ VYMIPQFILISKMGLNDTHIGLILMQAFSAFGVFLIRQFYISIPDELCEAARIDGLNEFG IFSRIVLPLGKPAMATLTIFTFTNVWNDFMGPLIYLKTKELKTIQLGIRMFISQYGADYA WIMAASVCSLIPVVIVFLSCQKFFVEGVAASGIKG >gi|229783957|gb|GG667778.1| GENE 7 7686 - 8864 1344 392 aa, chain + ## HITS:1 COG:no KEGG:Closa_1929 NR:ns ## KEGG: Closa_1929 # Name: not_defined # Def: glycosyl hydrolase family 88 # Organism: C.saccharolyticum # Pathway: not_defined # 1 392 1 392 392 696 87.0 0 MAQTLDTIRNYPGMEEGMITTALDGAVAVIRENLPVFTEKFQSSNSFGGFYEPTENVEWT TGFWTGSIWLAYEHTGDEAFKQAADIQVESFLERIEKKIDVNHHDMGFLYSLSCVAAYKL TGNEHAKKAALLAADHLAERYREKGNFLQAWGNPGEPKEYRLIIDCLLNLPILYWATEVT GDESYAQKAENHIKTAMKCILRPDNSTYHTHFFDVETGEPTYGVTHQGNRNGSAWARGQA WGVYGIALSYRYLKNPEYVDLFCRVTDYFIEHLPEDLVPYWDFDFDTGSTEPRDSSASAI AVCGILEMAKYLDQEKAGRYLTAADRMLRALVDRCANTDITKSNGLLLHGTYARDSKENT CTNRGVDECNTWGDYFYMEALTRLSQDWKLYW >gi|229783957|gb|GG667778.1| GENE 8 8881 - 10296 1334 471 aa, chain + ## HITS:1 COG:YPO0829 KEGG:ns NR:ns ## COG: YPO0829 COG3119 # Protein_GI_number: 16121138 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Yersinia pestis # 2 458 43 501 517 306 37.0 4e-83 
MKTNLLYVFADQWRAEAVGSLGSDQVVTPNIDRFSEESVCCTNAYSTFPLCSPHRASLMT GKYPFRLGMWTNCKIGLEEKIMLKPQETCIANVLKDAGFATGYIGKWHLDASELNFSPHP KSGAGEWDAYTPPGERRQGFDYFLSYGACDDHLDPHYWLDDETQIKPGKWSAEFETDKAI EYMNQKKDGEEPFALFVSYNPPHLPYELVPERYYEKFKNLKVHYRPNVPESMREEGGLLE TQTRQYFAAVHGIDEQFGRILAWLKENGMEEKTLVVLSADHGEMLGSHGLMSKNIWYDEA LHIPLIFRQKGRLKPGKNDVIFASPDHMPTLLELLDLAVPETCEGYSHADSLIRGSAVPG EPEDMLICSYPGGADMVAAFSKRGLTHKAYGWRGIRNRRYTYVITNGYAPDEPQREFLYD RELDPYEMNPAAIEKDCTDERILAFRERLKDYLELTEDPFLWDRPRCETVQ >gi|229783957|gb|GG667778.1| GENE 9 10394 - 11233 751 279 aa, chain - ## HITS:1 COG:no KEGG:Closa_3187 NR:ns ## KEGG: Closa_3187 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 275 1 275 277 430 71.0 1e-119 MAVDKKYEADAAVILSHRYDQGGLLWDTPDHRLQKGAPFSTLDCVSYLMELGMEPEEPVL KEAAELIFSAWQKDGRIRCYPKGSIYPCHTACAAQVLCRLGYGEDERLKATFRYFLDSQY QDGGWRCRKFSFGHGPETEYSNPHPTLLALNAFRFSDYINREPALDRAVEFLLDHWTIRK PIGPCHYGIGTLFMQVEYPFRNYNLFEYVYVLSFYERARKDARFLEAFHELEKKTVDGQI VVERVVPKLAKLAFCKKGEPSELATRRFGEIQENLAADI >gi|229783957|gb|GG667778.1| GENE 10 11261 - 12211 1034 316 aa, chain - ## HITS:1 COG:CAC2111 KEGG:ns NR:ns ## COG: CAC2111 COG1293 # Protein_GI_number: 15895380 # Func_class: K Transcription # Function: Predicted RNA-binding protein homologous to eukaryotic snRNP # Organism: Clostridium acetobutylicum # 5 304 265 566 570 228 43.0 1e-59 GYDGDGYSTVTYDSISTLLEEYYASRNVLTRIRQKSADLRKIVQTALERNYKKYDLQMKQ LKDTEKRDKYKVYGELLNTYGYELEGGEKKLTCLNYYTNEEITIPLDDQLTARENSQKFF DKYNKLKRTYEALSEQTQDTKREIDHLESVSAALDIALKEEDLVQIKEELMEYGYIKHRR AGDKKPKITSRPFHYITSDGFHIYVGKNNYQNDELTFKLANGGDWWFHAKGIPGSHVIVK TEGKELPDHVFEEAGSLAAYYSKGRENEKVEIDYIQRKHIRKVAGGAPGFVIYHTNYSLV AEPKLLLEEVSEKKNH Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:49:25 2011 Seq name: gi|229783956|gb|GG667779.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld172, whole genome shotgun sequence Length of sequence - 13448 bp Number of predicted genes - 14, with homology - 14 Number of 
transcription units - 6, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 575 619 ## COG2182 Maltose-binding periplasmic proteins/domains 2 1 Op 2 20/0.000 - CDS 649 - 1482 840 ## COG3833 ABC-type maltose transport systems, permease component 3 1 Op 3 . - CDS 1510 - 2772 1236 ## COG1175 ABC-type sugar transport systems, permease components - Prom 2813 - 2872 4.4 + Prom 2998 - 3057 7.2 4 2 Tu 1 . + CDS 3146 - 3931 686 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 5 3 Op 1 . - CDS 4939 - 5346 542 ## HMPREF0421_20730 hypothetical protein 6 3 Op 2 . - CDS 5348 - 6025 745 ## Clos_0666 hypothetical protein 7 3 Op 3 . - CDS 6028 - 6612 595 ## ELI_1341 putative transcription regulator 8 3 Op 4 . - CDS 6625 - 6795 76 ## gi|288871389|ref|ZP_06410145.1| hypothetical protein CLOSTHATH_05844 + Prom 6669 - 6728 5.9 9 4 Op 1 1/0.000 + CDS 6758 - 7351 604 ## COG1309 Transcriptional regulator + Term 7365 - 7401 3.1 10 4 Op 2 32/0.000 + CDS 7453 - 9186 1379 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 11 4 Op 3 . + CDS 9189 - 9686 439 ## COG0440 Acetolactate synthase, small (regulatory) subunit - Term 9577 - 9616 5.0 12 5 Tu 1 . - CDS 9683 - 10360 537 ## CDR20291_2612 hypothetical protein 13 6 Op 1 1/0.000 - CDS 11376 - 12107 478 ## COG0726 Predicted xylanase/chitin deacetylase 14 6 Op 2 . 
- CDS 12095 - 13198 1167 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase - Prom 13326 - 13385 5.3 Predicted protein(s) >gi|229783956|gb|GG667779.1| GENE 1 2 - 575 619 191 aa, chain - ## HITS:1 COG:BH2019 KEGG:ns NR:ns ## COG: BH2019 COG2182 # Protein_GI_number: 15614582 # Func_class: G Carbohydrate transport and metabolism # Function: Maltose-binding periplasmic proteins/domains # Organism: Bacillus halodurans # 1 191 2 178 424 76 31.0 2e-14 MKKRKLCSLLLASIMAASALCGCSKELPPEAVQNSAAEADPEDVSGKIIPEEGAKLLLWT DKQAYGEAIAKGFMEAYPGVTVTAEEVGFTDARTKMELDGPAGSGADVFMISHNRLGSAI SSGMVMAFEGDVRDRLLSELSDAAIKSASQDGKLYGAPVSVEANSFFYNKDIVGDEPAGT FEEIMEFAETF >gi|229783956|gb|GG667779.1| GENE 2 649 - 1482 840 277 aa, chain - ## HITS:1 COG:SA0209 KEGG:ns NR:ns ## COG: SA0209 COG3833 # Protein_GI_number: 15925920 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type maltose transport systems, permease component # Organism: Staphylococcus aureus N315 # 1 276 1 278 279 214 42.0 2e-55 MAKFKKVFVAALLHLELIIVSLLVLIPIVWIVSSSFNNSSSLASATLIPEKWTLANYARL FRETKYGSWFANTLKIAVVNSAVSVTIVMITAWVMSRFRFRGKKAGLMTLLLLSMFPSFL SMTAIYVLFLTFGLLNKPVALVIIYSAGAIPYNVWLVKGYLDGVSRSLDEAAYIDGCSKF RALFHVTLPLSMPIITYVTVTQFMTPWMDYILPNLLLSRNENRTLAVGLYAMINEQESTN FTMFAAGAVLVAVPISMIYIIFQKYIVQGVSAGANKE >gi|229783956|gb|GG667779.1| GENE 3 1510 - 2772 1236 420 aa, chain - ## HITS:1 COG:BH2925 KEGG:ns NR:ns ## COG: BH2925 COG1175 # Protein_GI_number: 15615488 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 42 419 54 426 430 297 41.0 3e-80 MKGILFFFVQALVIGIELQTGYWLEWLTGKIPVFELRLCGGFFTRGVWGLVTLGERPGAK TGDHSMMLMISGVIVGLLLLLFLGIYVWNLRDAWRSGREIDETGIYQGSANYFKKLYQEK FVYIVLAPVAVLILFITVMPMIFSVLTAFTNYSKGNLPPANLIDWVGFDNFYKLFHVPVW SSTFFGVLGWTVIWTVASTLSCYCFGMLQAVILNSECVKFKKVFRTIMILPWAMPGMICL LVFRNIFNGQFGPLNQFLLDMGWISARIPFLTDPVLAKITVILVNLWLGFGASMVMISGV 
LSNMDPGMYEAARIDGASAIKQFRFLTLPHLLRVTAPLLVMNFSGNFNNFGAVFFLTQGG PANPNYQFAGDTDILISWIYKLTLDNQMYSMAAVMSILIFLVVGSVAFWNFRRTSAFKEV >gi|229783956|gb|GG667779.1| GENE 4 3146 - 3931 686 261 aa, chain + ## HITS:1 COG:DR0470 KEGG:ns NR:ns ## COG: DR0470 COG1387 # Protein_GI_number: 15805497 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Deinococcus radiodurans # 7 235 8 229 260 85 30.0 9e-17 MKKVNYHTHTRLCRHAGGAAEDYAAAAWQNSLDVLGFSDHAPFPDGRFGLRMQYDELEPY IRDLNRLKTEYQGRVSICSGLEIEYCPDSLEYYESLLKPGRLDYLLLGQHFYTAAGALPV NIYELEEKGGTAAYIDYALSIREGLATGFFKVLAHPDVIFINDLAWDENCSLACEIIVEA AQVSGAVLELNANGIRRGMREYCDGVRYPYPHKRFWDRVSGTRIPVIVSSDCHNPDVLWD SCMEEGYRLGREWGLLLTDEI >gi|229783956|gb|GG667779.1| GENE 5 4939 - 5346 542 135 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0421_20730 NR:ns ## KEGG: HMPREF0421_20730 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis_ATCC14019 # Pathway: not_defined # 12 135 7 129 147 112 54.0 5e-24 MTNDRKKKAVLKLINAAFVYALLAMAGGVFYREFTKFNGFSGRTTLAFVHTHLFLLGMIV FLVAALFEFHAAVTEQKRFGLFFGIYNAGVAVTVVMLIVRGIAQVRGLELSRAMDASISG IAGIGHILTGVGIIL >gi|229783956|gb|GG667779.1| GENE 6 5348 - 6025 745 225 aa, chain - ## HITS:1 COG:no KEGG:Clos_0666 NR:ns ## KEGG: Clos_0666 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 1 217 1 217 225 272 64.0 7e-72 MQAVMETIFDIVYLCTVIALGVRMIAGAGERKQIKLFGIMAVVLGCGDAFHLVPRAYALC TDGLANHAAALGFGKLVTSVTMTVFYVILYYIWKMRYRVEGKNGLTAAVWGLAIVRIFLC FMPGNDWLSASAPLSWGIYRNIPFAVLGLLMIVLFYRKAKKTADSGFGFMWLAITLSFGF YIPVVLWADTIPAVGALMIPKTCAYVWIVVMGYREMKQTGQRKER >gi|229783956|gb|GG667779.1| GENE 7 6028 - 6612 595 194 aa, chain - ## HITS:1 COG:no KEGG:ELI_1341 NR:ns ## KEGG: ELI_1341 # Name: not_defined # Def: putative transcription regulator # Organism: E.limosum # Pathway: not_defined # 1 184 1 184 186 134 37.0 2e-30 MKKSATSKEELLAIAKEIAGREGIGNLNIRRLASESGIAIGTVYNYYPSKGDLVGAVMED 
FWRNVFHGSHFDTESGDFIGSIHDIYFRLHENLESFREEFLQEMEVLSQTDQKRGRELEA FYLGHMKEGLLRILERDSRVAPEVWNETFTPRKFVSFAFSNMVFLLKNQEDPAYFEEIMR RLIYKTDADTGKEV >gi|229783956|gb|GG667779.1| GENE 8 6625 - 6795 76 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871389|ref|ZP_06410145.1| ## NR: gi|288871389|ref|ZP_06410145.1| hypothetical protein CLOSTHATH_05844 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05844 [Clostridium hathewayi DSM 13479] # 13 56 1 44 44 67 100.0 4e-10 MAASTILWLVSSMYSTSMQVLTFITFYRKEEWMSRKIKAELFDKMTVTEYNEQCSY >gi|229783956|gb|GG667779.1| GENE 9 6758 - 7351 604 197 aa, chain + ## HITS:1 COG:BS_yvkB KEGG:ns NR:ns ## COG: BS_yvkB COG1309 # Protein_GI_number: 16080573 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 1 189 1 186 189 91 30.0 9e-19 MDDTSQRIVDAAMGLIRDKGYVATTTKDIAKQAGVNECTLFRKFSGKKDIVLAGIDQEKW RADVTPLAFENVTWDLRSDLEMFMNAYMDRITPDFVNLSIGLRAPQIYEETAPFIMMIPK AFVSSLIRYFEQMEEKGKLSHGNYESWAMTIFSATFGYTFLKASFDDSLSKLSRHDYVRD SVTVFLEGIGRREAPEQ >gi|229783956|gb|GG667779.1| GENE 10 7453 - 9186 1379 577 aa, chain + ## HITS:1 COG:AF1720 KEGG:ns NR:ns ## COG: AF1720 COG0028 # Protein_GI_number: 11499309 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Archaeoglobus fulgidus # 5 572 3 551 552 546 51.0 1e-155 MQITGAQLFVKALKEEGVEVLFGYPGGQAIDLFDALYGETGIRLILPRHEQGLVHAADGY ARSTGRPGVCLVTSGPGAANLVTGIATANYDSVPLVCFTGQVPTSLIGNDAFQEVDITGI TRSICKYSVTVRTRERLGEIIKKAFHIATTGRPGPVLVDLPKDIQREYGSSCYPDHVNLR GYRPGFAVHGGQIKKALELLNQAKRPLFLVGGGVNIARAEEAMTRLAERTGVPVVTTIMG KGAIPSTHPLYAGSIGIHGSFAANSAVSSCDVLFSIGTRFNDRITGKNGEFAKHASIIHV DIDPASISRTIAVDIPIVADAGAAIHALLEKAVPLDTDEWRKQIDGWKENHPITMKQEGL TAESVIRCINRLEGPLLVATDVGQNQLWAAQFLELDKDRRLLTSGGLGTMGYGLPAALGA KIGNPDKTVILITGDGGLQMNMQELATAVVCGLPVIICVLNNGWLGNVRQWQEMFFEHRY 
SATCLRRRYSCPAVCMEPGPDCPPYTPDFIKLAESFGITGIRVTKAEEIQPALQAAEASK DAPVLIEFLIAWDDNVLPIVPPGSPIDHMILENERGR >gi|229783956|gb|GG667779.1| GENE 11 9189 - 9686 439 165 aa, chain + ## HITS:1 COG:RSc2076 KEGG:ns NR:ns ## COG: RSc2076 COG0440 # Protein_GI_number: 17546795 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Ralstonia solanacearum # 6 121 5 119 163 105 48.0 5e-23 MKKRWINLYVENEPGVLALIAGLFSGKSYNLESLTVGPAEDETVSRMTIGLTGDDQTFEQ IKKQLARRVEVIKVVDLTETAAHCEELMFLRIHDCSEEEIRELFRMASVFHARVIDYSPS SILLKSVRPEEENTLLIKRMQKRFPNRIYAIRGGSVAIPALSPMD >gi|229783956|gb|GG667779.1| GENE 12 9683 - 10360 537 225 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2612 NR:ns ## KEGG: CDR20291_2612 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 15 206 158 350 354 110 33.0 5e-23 MGAAKFSYLMGLPLALKLLICFGILANGISVAFILLILFCRPLVERLVYRVIRLLNHIPS FDGEKAAQKAGSMIEEYSQGGAYLRKYPMAAVKTAVLTFCQLTFLYMASYCACLALGLGS VGLVNFIMLQSIVSLAVSAVPLPGAVGASESGFLVMFGGFLSSSQLLPVMVLSRGISFYG FLVVSGLAVASLHIGKKGTPKKEKEVRSRARGPMHKQFSVSADPR >gi|229783956|gb|GG667779.1| GENE 13 11376 - 12107 478 243 aa, chain - ## HITS:1 COG:BH1917 KEGG:ns NR:ns ## COG: BH1917 COG0726 # Protein_GI_number: 15614480 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus halodurans # 20 241 48 270 276 112 32.0 5e-25 MYGLRKNWSRLAAGFGLLAGGFTVHSVIPALYHKHRNPAVLRRTGVPGTVMLTFDDGPDP EYTGRLLDLLKKEEVKAAFFVVTREAVKEEALIDRMLSEGHVVGFHSMDHQNAMVRGYLH TKKDFRTGCEFMKRKGISDIWYRPPWGLTNLFSWHFVRKYGMKMVLWDVMAEDWEREATV GSIRKKCLSRVKDGSILCLHDGGRRTGGAPGAPEKTLEALAYVIPALKEAGYRFILPSES VSG >gi|229783956|gb|GG667779.1| GENE 14 12095 - 13198 1167 367 aa, chain - ## HITS:1 COG:CAC2897 KEGG:ns NR:ns ## COG: CAC2897 COG0707 # Protein_GI_number: 15896150 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: 
Clostridium acetobutylicum # 1 343 1 354 384 169 29.0 1e-41 MKLLFLTGKFGMGHYSAAFSLAERVSRVNPEADIVIRDIFEYAMPYYSDKVYHAFGVMVT HCSGTYNKYYNHMERKGPDLKPVFLPWFLKKIKNLLEEEQPDAVISTLPLCSQIMSWYKA VTGSRMPLITCITDISSHSEWINGATDCYLVPDRMVRTKLIEKGVEETKIYVYGIPVRPE FDYGSERPGETDGKKHILIMGGGLGILPESNEFYEELNDSGHIRVTVITGKNQEIYKKLH GKYENIEVIGYTNEVYRYMQEADVVISKPGGITLFETIHAGTPLLVFEPFLQQEINNTGF IVGRGIGMILEKNPMDCVREISKIVQDDTLLDAMKANVNRLKREYDQTVIGQVLALIEGP EGALCTG Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:49:49 2011 Seq name: gi|229783955|gb|GG667780.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld173, whole genome shotgun sequence Length of sequence - 15917 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 6, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1176 1258 ## gi|266624482|ref|ZP_06117417.1| hypothetical protein CLOSTHATH_05850 2 1 Op 2 . - CDS 1204 - 3372 1348 ## LGAS_1663 hypothetical protein 3 1 Op 3 . - CDS 3390 - 3557 124 ## gi|266624484|ref|ZP_06117419.1| conserved hypothetical protein 4 2 Op 1 . - CDS 4487 - 4801 436 ## gi|266624485|ref|ZP_06117420.1| putative stearoyl-CoA 9-desaturase 1 5 2 Op 2 . - CDS 4824 - 6899 1699 ## gi|266624486|ref|ZP_06117421.1| conserved hypothetical protein 6 2 Op 3 . - CDS 6896 - 7705 832 ## COG3505 Type IV secretory pathway, VirD4 components 7 3 Tu 1 . - CDS 8643 - 9344 821 ## COG3505 Type IV secretory pathway, VirD4 components - Prom 9466 - 9525 80.4 8 4 Op 1 . - CDS 10373 - 10696 334 ## gi|266624489|ref|ZP_06117424.1| conserved hypothetical protein 9 4 Op 2 . - CDS 10699 - 11505 648 ## CLI_A0011 CAAX amino protease family protein 10 5 Op 1 . - CDS 12657 - 13502 455 ## ELI_2013 hypothetical protein 11 5 Op 2 . - CDS 13504 - 14220 473 ## COG3279 Response regulator of the LytR/AlgR family - Prom 14301 - 14360 7.2 + Prom 14306 - 14365 7.5 12 6 Op 1 . 
+ CDS 14484 - 15293 300 ## ELI_1170 response regulator of the LytR/AlgR family + Term 15308 - 15336 -0.0 + Prom 15343 - 15402 9.0 13 6 Op 2 . + CDS 15461 - 15802 194 ## COG3152 Predicted membrane protein + Term 15822 - 15879 7.7 Predicted protein(s) >gi|229783955|gb|GG667780.1| GENE 1 3 - 1176 1258 391 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624482|ref|ZP_06117417.1| ## NR: gi|266624482|ref|ZP_06117417.1| hypothetical protein CLOSTHATH_05850 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05850 [Clostridium hathewayi DSM 13479] # 1 391 1 391 392 719 100.0 0 MRIIPKKTKVSIEFFKGVDIIDLIIGMIGIFLVAMVVMSSLPFKLPLTMVLIIIFGVLLV PIDEDKNYEAVLYILRYFAHPKVVKRNGEKESGQETAEAKGKAKKKSGKNVNVEDVMAFT GIKGNEILYGDSYVASVIGISPVEFRFMAEKRQDYMIERVFGAALRMVTGSNRGASIVKV ERPILYDETIQYEFQKMESLKEAFYNDIINEDELTTRVEIIYDRIMEIERINFDDKVYDS FHYLVLYDKNSKSLSDMTRSVLQTLDGGGLPVKLLDEKELAVFLKYNMTRDFDEREIETL KPEEYKAWIIPECVEFGTRTVKIDDVVTHNLKIVNYPTEVGNAWGHSIFNMENTKVIMKL TPVDRFKAVRNIDRSIDELRGQEANTSKESR >gi|229783955|gb|GG667780.1| GENE 2 1204 - 3372 1348 722 aa, chain - ## HITS:1 COG:no KEGG:LGAS_1663 NR:ns ## KEGG: LGAS_1663 # Name: not_defined # Def: hypothetical protein # Organism: L.gasseri # Pathway: not_defined # 595 722 2203 2330 2449 98 37.0 9e-19 MGTSQYMYTYLGIFDKIAEWVFSGISKAISWLFKTIFGPIFEAVLNPVANAIVDAVKSAM ADAFFYIFQKLLYLIDTIERAAKIFSGAEYVTYEGKQWSLINILIFGEGTVGRAIKYMTY IAVLFAFVFAFISLSRSIFDLETDNRRPIGVVLNYCLKAVISFLLVPVMCVAGFQLSSLL IQVTSQVTAPGGSAVTIADELFRTCAMGAFKTGKGVGDLSGGLWTDIANTKKVLDLAKVD YITAYLCSLFMIWNLCSISFVFIQRMFEMILLYLASPFFVATMTLDGGEKFQRWKNTFIG KAITGMGTLILLNLFLILVPLITSDTIILSAGVAESGNAAQQAYTALTKLLFILGSIMAV KNGGSLLTSIVDAQTGSAEQAASQQTSAMMAGALLGGTRMVRGGAAKWSAGKEQRTMKRE GKAFIKGEREYKRNEQAASAMQARGDIQGASRLRSLNESRHTRSLVRAEKAGLGTASRDR TGKVVSNGKFQLKGGEQMKDIKKQLNKVRGKSGIIYNLEHGSKKVGQKIHNLQRDLDERS EMSDLRKNNEMSSIEEGFGYQLKSIRSSYDKKTAKNPIIEGLRKGGLSVGGGESPKPPGP PKPPGSPIPSEPPKPPIPPKPSAPSKPSAPSKPSEAPEPSAPSKPSETPIPPEPPKTSEA SKPSETLEASETPIPPEPPKVSEPPKTSEPPKPSEAPIPPEPPKTSESSKQKDKSDLPGS DT 
>gi|229783955|gb|GG667780.1| GENE 3 3390 - 3557 124 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624484|ref|ZP_06117419.1| ## NR: gi|266624484|ref|ZP_06117419.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 55 4 58 58 66 98.0 5e-10 MGSCIAVKKGSGILLQVINPQAAQSSESAMGSLTAIAMTARSVAASGSHAFRGKK >gi|229783955|gb|GG667780.1| GENE 4 4487 - 4801 436 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624485|ref|ZP_06117420.1| ## NR: gi|266624485|ref|ZP_06117420.1| putative stearoyl-CoA 9-desaturase 1 [Clostridium hathewayi DSM 13479] putative stearoyl-CoA 9-desaturase 1 [Clostridium hathewayi DSM 13479] # 1 102 1 102 102 172 100.0 5e-42 MQMDRVYMIPLGFNFIKWLGDHLLKPIIDAIVKILDPIFTGFLDFLGDMIVKMLAGVFYA IYTSLLSMVDFSQTIFDFLSGARKVYSPQGGFALEDYMLNFFIS >gi|229783955|gb|GG667780.1| GENE 5 4824 - 6899 1699 691 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624486|ref|ZP_06117421.1| ## NR: gi|266624486|ref|ZP_06117421.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 691 1 691 691 1315 100.0 0 MKRRSVIAFIVCLMVYTGFGRIAYGAERIVSDEKGPGKSTVTLTETYNQEMGLYEEQFAG GRRFVTNIPNGGQMYDAAYISLGGSIRGKLERDGVSIAIPANGYLYDPGYYILTVEAQQV GKSIYETAVFSFRLLGQPRGAVNTEEYSHPKIVCGTEVTSVAGRDGWYRFAMPNYKSVFS SLSQNNSTVSSATFYIPGNVGASLTRDGAAVSLINQVEITAPGQYQLKAWAESYGPDNAC PVWYETTISFTIPQPESEAAAAFSSLFGSGGQGDEKENLETAETLSEIYHENADLYEEAF SNGTVFYTNTAGGSITGGNVYLDIPSNISVVMMKDGKPVSFESKVPINEEGTYLLNLSTG YREDGVVHTLTSTFRFRIQRSLGAALPEEENMSAPAEGEEVSPVESDIDMPEEITESLSG ESERTGSESYDADRGMFLHQFTDGTSFYINVPQNAVVNDAVRAELPEQATAVLTMEDGPS MEYESGSTIGDHGSYALTVETAGGENVTWNFRIILHAVNDVTAFTAPDGYLIASVNLTDK LGESSETLEQASYSLTGDGTYSFILKGKEAGLPELFTVIIQDTEAPVVTFEGVDDKGKAK KGSFSYSCDEENVSISIKKGNKAVNTLGNTVKGSGTYTVTAVDAAGNTSVYQIKAGVHLN MFSYILLVILAIIVGILIIFYSVNRDQMKVR >gi|229783955|gb|GG667780.1| GENE 6 6896 - 7705 832 269 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # 
Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 7 217 340 549 591 94 32.0 2e-19 MSIAFDKLNMFNDEGLCALTSNTDIDPNCFADEKTALFLKIPDEKDTRHGLASMFILCIY KALIKKASAREDLSLPRNVYFILDEFGNMPQINKFDKMITVGRSRKIWFNMVVQSYAQLN NVYGDTVANIIKSNCGMKMFIGSNDIETCKEFSELCGNMTVTTASVSNGGQDKDISVSSQ LQTRPLIYPSELQKLNNKQSTGNSIIVTFGNYPLKTQFTPSYQCPLYKIGAMDLTEARRN VFFGDEVYYDLGERNRSVLNTEEGREKEA >gi|229783955|gb|GG667780.1| GENE 7 8643 - 9344 821 233 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 11 86 162 235 591 72 44.0 7e-13 MNLASPMHALIIGSTGSGKTTTFINPMIQILGASDAGSSMIMTDPKGELFQLHSGFLKER GYQVMVLDLRDTYESYRWNPLGDIYDRYQLYIEAGKGIYIRNDAPENSGLELQDDLEKFG EEWYEYQGQGYADRRELITVVKIKKQKIYDELYEDLNDLVGVLCPIESKDDPVWEKGARS IVMATLLAMLEDSENPELGMTKDKFNFYNLNKAMSNSENNFEELKNYFAGRPM >gi|229783955|gb|GG667780.1| GENE 8 10373 - 10696 334 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624489|ref|ZP_06117424.1| ## NR: gi|266624489|ref|ZP_06117424.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 107 3 109 109 192 100.0 7e-48 MEVKGIIKKILIYGGLFICVSLAVGIAIEGMIVYLMAGMNGETVPFFSWSYFAEADTYKY GFGLTLAAILLYQVFSNQNSKKAKRMLKGKAKGVESVLENSRFMTDK >gi|229783955|gb|GG667780.1| GENE 9 10699 - 11505 648 268 aa, chain - ## HITS:1 COG:no KEGG:CLI_A0011 NR:ns ## KEGG: CLI_A0011 # Name: not_defined # Def: CAAX amino protease family protein # Organism: C.botulinum_F # Pathway: not_defined # 46 255 64 258 270 73 27.0 6e-12 MIDERTKAIGSMFFILLCYILSYGILNALLSGYHADACFVQMLSGSFLLYMIWKREKKKQ INFFEESRFLPFPQSWKPLGWAVLLGIGLNCFVGGFINLLPLSDSMVSSYMEASSAPVRG 
VAPAMAFVIISIIAPFIEECLFRGVILRRMRKSMDDLAAIALTSTVFGLLHGQIIWIMYA IILGMVLGLVYVLYDSIYPAIVLHMSFNLVSGIPMILNQQGLVYRFTYGNPFFLFLMMAG GFTGVAVILYRIYFKSFMQKTTTAGRGE >gi|229783955|gb|GG667780.1| GENE 10 12657 - 13502 455 281 aa, chain - ## HITS:1 COG:no KEGG:ELI_2013 NR:ns ## KEGG: ELI_2013 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 131 278 82 232 400 80 32.0 6e-14 MKEALLLAVGRLVCTTPWYVICCVPFYAHRRISRSKIAAVITGISMLFFVCNFYLQFRYD NFKTHGSLVFCALYIILIGLFVWGFHVEFAKLLYVFLIAQAISTAINYIAAIMLKPFFKD MRISLHATPAYILAIFFLTMAVYPFLWYFFSHQLREAVEELKDRDFRLLCIPPVLIFVIT VIFNDVGVNPAIPQSQAIAIFLLITASGLITYFLHVRLALDAVRRIRLEMEVAAIESQIA VQGQSYAQLTQNIESARAARHDLRHHLAAMSAFLENSDIAS >gi|229783955|gb|GG667780.1| GENE 11 13504 - 14220 473 238 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 231 2 230 234 102 25.0 4e-22 MNIAVCEDEENIRNFLLGSLKEELRCRRLYANVQTYESGEALLEAMKHMTFEIYFLDIFM PGVSGVSVAETLRAKDSDAAIVFTTFSQDYYAAGFAVGAVHYLVKPLKREDISEAFKRCL KQVGEMEQYIELTIDRENRRIMLPDCIWVESKDKVCELHLKSGAFRTYLQLNALECMLED SRFLRCHRSFIINMDYVSRMENGRFYMSDGAVIPVRQAERGRFRIAYEDYMFEKMRRR >gi|229783955|gb|GG667780.1| GENE 12 14484 - 15293 300 269 aa, chain + ## HITS:1 COG:no KEGG:ELI_1170 NR:ns ## KEGG: ELI_1170 # Name: not_defined # Def: response regulator of the LytR/AlgR family # Organism: E.limosum # Pathway: not_defined # 8 258 2 251 268 101 26.0 3e-20 MGKSSISESEIIQMTRDNVRSFYNRNPEITTAPMAKDFMWIGSNDFQWCEGLESFLQVTK KEYEEPPVYLSDEEYHLLFHERNVWVLYGRYQVTAILEDGSVMHAHVRGTYVWKRINGVI TLAHVHGSHAQDIPLNQLVPSTIPLTSDSDYFEFMKRLDSLKTNGPKLDFRDREKRHRYL HPSEIIYMQAALQWTTVYTTNGSFQAYGLLAAHEQAVPEDFQRIHKSFLVNTGYIRSLCR YEATLLNGQVLPVSKERYMSLKRFLEKRP >gi|229783955|gb|GG667780.1| GENE 13 15461 - 15802 194 113 aa, chain + ## HITS:1 COG:ECs3986 KEGG:ns NR:ns ## COG: ECs3986 COG3152 # Protein_GI_number: 15833240 # 
Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 110 1 111 118 77 38.0 6e-15 MPEYLSIWKNFANFKGRTSRRGYWLAIAFHIIVTIVLTLLSNFTLIFAVLTGIYVFASII PLFALEIRRLHDINRCGWWVFLPMIPLVGGIILLVWFCKSSVDEGNIYGSMQV Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:51:23 2011 Seq name: gi|229783954|gb|GG667781.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld174, whole genome shotgun sequence Length of sequence - 10017 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 5035 3891 ## Closa_0424 LPXTG-motif cell wall anchor domain protein + Prom 5878 - 5937 80.4 2 2 Tu 1 . + CDS 5968 - 10015 2875 ## COG4932 Predicted outer membrane protein Predicted protein(s) >gi|229783954|gb|GG667781.1| GENE 1 2 - 5035 3891 1677 aa, chain + ## HITS:1 COG:no KEGG:Closa_0424 NR:ns ## KEGG: Closa_0424 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 888 1502 1812 2548 4700 282 31.0 9e-74 WNRKGFEFSSSQSEGEGAITGYYPGNRAVSVREMEDVPQLSKSEKVTDIVQAGMSDEGNQ NSDSVTGQEKIPEDKGNEQEPAFKEHENTKKHESESSGDTAETGEEGAKEVDAGSEETAD TTEIKESDEASQPSMENREETDEEASPEAETQDSGNEAESVEEETITEDLPTVATRSQIG IEKDVATPSNASTYLVSRPASIKVGTGGHWEWDGVQEDSEVAPIEQGSYPSGYIGYAYLV KDHRTEGELHINKRDKELFELEEDSYGKAQADATLEGAVYGLYAAEDIRHPDGKTGVVFS AGELVSIATTDKNGDASFLTITEVSETSREVPNLYTGNEVRNGNGWIGRPLILGSYYIEE ISRSEGYERSVTGKNLSESNRTGKPIVLTASGSAYTDGFTHSINEWFEDSYDFTVKYYKT KGFDILLSGLPERVKAYEVTRKETASQEQVITGTERVEKKDAQGNILYRTAEGGEYKLDE TGNKIIKSDDSGNPVLSDRALTQTVSAVNRLNDYISSIEREEPSDLDMEESEDIDEDYIL WETSSALALSGYKSGLSDYPFKKLKLFGKTNGKMIDEILTFCSSESFWDAYDVDAVYEEA GTWYARIRYGYKALANQPAIYESGSGLLVIRKEYEGGFYYAVYEEGQYDMDGYRFTVEKK ELDLEALGQDEILLKTVYAPVYETYAAGEFILDSEGQKIPQMEAVPIYTSQEVVSYEEIL TPVKTTAVEGQVSIHVDTDEEFTEGEKHERTYRVVTEQKITEYTATVNVMTTKPAQKSGS 
YLKFPVLFYPGQYEIYEDNGTRKEPVIMLERVIKQAIEVKKDIALDSYEHNTYEIHRDPF TVLFGGYNGTQETKTLPGFFFKLYLRSDLEKTDKLMKKEDGSYDYVTFFKENPDAAFELA IEWDLEKYDADGDMTTVHANRGGGKDDYWGQSRMLPYGTYVLVEQQPTGLPQKHYEIDAP QEVEIPFVPQIDADGTVHDKIPSKEYLYDSAMTPEELTERYQIRFNEETHIIYAHNNDGD FEVFKYGLEPDSRRDCQNETVARYYHYGSISEDAGSADQVYYETYYDRDGTIADYGVTMN GVVTMTGKSTAVDRMYAKALVPWSVLDPRYGEVINDDGDIGNREAGLETDGTFNFISFAV KDFENEFYSSRLRIEKLDSETGENILHEGALFKIYAAKRDVIGNGASGVTGSGDILFDEE GTPLYEEREQIFMQDDTGAEVGVFKAYTTIRDGEVEGEDGSLHTEKQCVGYLETYQPLGA GAYVLVEVEAPDGYVKSKPIAFTVYSDKVEYYEDGNQEKRTQAVKYQYMRPIGADGKTVV EDMHQIIVKNAPTHIEIHKLENRADAVTYRVEGDEKQLKNRGDVDLQYKPNGEFAGFGYV TKRLENGQEKIYVENATLTLYEGLEVKQTGEHEYEKVKVKRNLFGSVTGIQAYDTGVDTD IRQTGTNAAGQAEWDITEEDNPPVDIWYFDLKYDPTELDEQTGILYGLDDWGNRLCMLDS ETGMAYVTDKTGAVIVWPLDENGDKIISQSVEVYTNGEGNSSINMDLQPVPDENGLPIYY KDGGVIWIENEWVTDHGACEIARVRQGAYILEETAAPLSDGYVQSASVGVIVRDVSEKQS YVMEDDYTKIEVSKLDMTSRKEIEGAVLTLYEAYRVYDDSDRGWHLEIQNPFRFQPS >gi|229783954|gb|GG667781.1| GENE 2 5968 - 10015 2875 1349 aa, chain + ## HITS:1 COG:L148778 KEGG:ns NR:ns ## COG: L148778 COG4932 # Protein_GI_number: 15672133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Lactococcus lactis # 295 1124 1093 1919 1983 178 27.0 9e-44 MEDKPIVAERWVSEGNVPHWIDHIMPGDYILQETRVPTKAGYVTAEDVEVTILETGEVQG YVMEDDHTAVEVLKLDSRTGAVMDNLHRATLALYEAQVDENGEVQYRDDGTILYHQEKKV YEWQTDDGSDVRKTAHQVTIPGGHSYTAYDYEIEQVPGTSQAVCYITETGAMRFEYLPVG KYVLVEEQAPYGYTVAAPVYVPVLDVGSKERVQTITMTDEPIQVLLTKVNVTGGKEISGA TVAVYRAKEDGTLAKHQMKDKNGNLLYVTDTDGNLLQDEDGNHIPAMEYEEAYLVERWIS GSDGTYTKRDQKEGRIPEGYEIGDLRLHELSQPAAGNYYFVEEQSPFGYVRAAELPFAIV DTLDIQKIELVNELILGQVEIIKTDKRNPEQVLSGARFRLSNLDTNVATILITDSDGRAV SSPVPIGGIGTDGAVSLYHFRIQEVEAPDGYLLDPTVHDFQFNIKTDQYQTLTYQYEAAD SPNRVIISKKQLTTKEELPGASLEVRRVTEMVDKDGTVIRIDGDVIESWISTEVPHELEC LTEGMYVLIETRAPEGYIEAEKVYFTISGNMTVDDMPMVEMFDDDTKIEIRKVDSENGNP LKGAKMQLILEASGEIIKEWITDETGAIQFFGLPAGVYLVKEVEAPEGYQIPEGPMRITV TKDYKQQTFVMENRMTELMIDKLDEETREPVTGAVLQLTDGEGNQVAKWVTTGEPELIRG 
LKAGWYILEEIQAADGYLLLKEPVRIEILEKAGIQTVTITNRKLEVEVAKTDKETGENLA GARLQLIRDADGMVLKEWVSGKIPEIFKGLSVGEYTIRELTAPEGYAVTGRLNFCVSGTE EKQEINLVNEKIVVELQKTDTLEGSPVEGAVLQLLKAAGTNEETVMKEWVSGKEPLILTG IPSGIYTIRETQAPEGYVPMEDMKIEILPNQTMQHFDIKNQPIQIEIGKVSGETGKLLGG AVLQLVRDSDGTVIRKWTSRDGEAEHFKNLAGGTYIIREVKAPSGYEKMEPQEIEIKDIE AIQEFVVKNYKITHSGGGGGSTPNHPRPSAEYMELFKIDGRTGQKLAGAKITVYNPDGSV YAEGFTNAAGIFRFKKPLKGAYTFKETEAPEGYYLNEVLHAFAVTEDGTIEGDTTMENYS KTEMIISKVDVTTAEELPGAEIEVTDQDGNHIFSGISDEKGKVYFPVPSPGEYHFRELTA PEGYDRNETVFSFTVFEDGSIIGDNTITDQKHYGTITANYETNRRGEGDLMVGELNHAPK TGDTSYFVGVFMAWLASVVSLLIVSLSKWRKKQKKSKTGIKAGMFLLLAIMVVSTPVMES QAAEDVIENIYEEHQYTTENPDSNEAEKL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:51:50 2011 Seq name: gi|229783953|gb|GG667782.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld175, whole genome shotgun sequence Length of sequence - 14673 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 3, operones - 3 average op.length - 5.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 961 1129 ## COG2195 Di- and tripeptidases 2 1 Op 2 . - CDS 977 - 2266 1401 ## COG0139 Phosphoribosyl-AMP cyclohydrolase 3 1 Op 3 . - CDS 2271 - 3005 789 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase 4 1 Op 4 . - CDS 2983 - 3996 669 ## Closa_1983 hypothetical protein 5 1 Op 5 6/0.000 - CDS 3999 - 4586 681 ## COG0131 Imidazoleglycerol-phosphate dehydratase 6 1 Op 6 18/0.000 - CDS 4590 - 5894 1532 ## COG0141 Histidinol dehydrogenase 7 1 Op 7 11/0.000 - CDS 5929 - 6576 787 ## COG0040 ATP phosphoribosyltransferase 8 1 Op 8 . - CDS 6585 - 7847 1535 ## COG3705 ATP phosphoribosyltransferase involved in histidine biosynthesis - Prom 7876 - 7935 2.6 9 2 Op 1 . - CDS 7941 - 8474 563 ## COG3331 Penicillin-binding protein-related factor A, putative recombinase 10 2 Op 2 . 
- CDS 8478 - 9440 200 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family 11 2 Op 3 . - CDS 9453 - 10022 530 ## Closa_1974 hypothetical protein 12 2 Op 4 . - CDS 10053 - 10811 758 ## COG0300 Short-chain dehydrogenases of various substrate specificities 13 2 Op 5 . - CDS 10832 - 11143 217 ## COG1032 Fe-S oxidoreductase 14 3 Op 1 . - CDS 12110 - 13684 1638 ## COG1032 Fe-S oxidoreductase 15 3 Op 2 1/0.000 - CDS 13681 - 14346 782 ## COG0637 Predicted phosphatase/phosphohexomutase 16 3 Op 3 . - CDS 14346 - 14672 378 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases Predicted protein(s) >gi|229783953|gb|GG667782.1| GENE 1 1 - 961 1129 320 aa, chain - ## HITS:1 COG:CAC0476 KEGG:ns NR:ns ## COG: CAC0476 COG2195 # Protein_GI_number: 15893767 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Clostridium acetobutylicum # 1 320 1 320 408 369 55.0 1e-102 MSEVLNHFLKYISYDTQSKEDVEVIPSTEKQRVLAEVLAEELRAMHAQDVRVDQNSYVYA TIPATMDKKVPVLGFIAHMDTAPAYSGAGVHPQIVKQYDGGDIVMNQKTGLTMKTADFPD LLNYRGQDIVTTDGTTLLGADDKAGVAEIMAMAEYLLAHPEIPHGTIRIGFTPDEEVGHG ADLFDVAGFAADAAYTVDGGALGELEYENFNAASAKVNIHGSSIHPGSSKGKMRNAILMA MEFHNMLPVFENPMYTEGYEGFFHLDTMTGSVEEAHLDYIIRDFDEKKFAEKKEFITRVA AYLNERYGKGTVELELKESY >gi|229783953|gb|GG667782.1| GENE 2 977 - 2266 1401 429 aa, chain - ## HITS:1 COG:CAC0942 KEGG:ns NR:ns ## COG: CAC0942 COG0139 # Protein_GI_number: 15894229 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Clostridium acetobutylicum # 235 329 13 107 115 130 62.0 7e-30 MKEYKKVIPGFGIREGKAVRLWGQGCCEDDVLTLSRYYGDSGADELFLYDMAETDEDHER TIGLIKEIARTADVPVITGGRVRRFEDVKKYLYAGAKAVFLNAGVDENVDLMKEASNRFG DDKIYAWLPDFSYLDRTAEYSQLGASVMILGTACPSQEEMARLEEKEEIFLLSSDRPDAV TLAALMASGRVSGAITSMTGSSASCMELKQKLKAAGIPADTFESTVAWKDFKLNSDGLIP VIVQDYRTSEVLMLAYMNEESFEATLHSGLMTYYSRSRRSLWLKGETSGHYQYVKSLSLD 
CDNDTMLAKVNQIGAACHTGARSCFFQTLVKKEYQETNPLKVFEDVFAVILDRKENPKEG SYTNYLFDKGIDKILKKLGEEATEIVIAAKNPDPEEVKYEISDFLYHMMVLMADKGVTWE DITRELANR >gi|229783953|gb|GG667782.1| GENE 3 2271 - 3005 789 244 aa, chain - ## HITS:1 COG:lin0573 KEGG:ns NR:ns ## COG: lin0573 COG0106 # Protein_GI_number: 16799648 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Listeria innocua # 1 234 1 233 240 219 48.0 4e-57 MQLYPAIDMKNGQCVRLKQGKFKEITVYSDSPEKVAAYWESQGATFLHLVDLDGALAGYS VNEEAIRRIVSSVSMPVEIGGGIRSEEAIERMLDLGVSRVIIGTKAVENPLFLRDAVLRF GADRIVAGVDAKNGLVAVEGWEKLSEITASDLCGQMKEYGVRHVVYTDISRDGMLTGPNV EATRALTEETGMDIIASGGMSSMDDLKRLHDAGVKGAIIGKALYENRINLKEAVRAFEGI EKGE >gi|229783953|gb|GG667782.1| GENE 4 2983 - 3996 669 337 aa, chain - ## HITS:1 COG:no KEGG:Closa_1983 NR:ns ## KEGG: Closa_1983 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 70 298 65 296 500 82 28.0 2e-14 MEDQLEMQEGMRRIGRLRFRIGRLLTGIACLSAAAAFIIAFSLLAWYDGIHTRNTAPEPF NAFQESRDCSVDIQLLAGPFASTQDDKQTFYWGYGLMMEPVMVLFTGELPEECRALVEYT HNPGITEIPEPVTIRGRSVSIENTDIYDYALEFYQTMWGLAPMERNEFKQTVASCYLDSA KQTWLKKLPWYGTLLVFFLPVVFFVTGFSECRRFAVQKKREQKRLSVLSEEEMQTAVRQL AGASEFEPRSRIYVTADYVITGNYQFDIIPYARITSIEETGGFLIAVTGDGAAHILLSSG HKKPDRISWLNRLKKVLEEKIIYAKIQGEHYAVISGN >gi|229783953|gb|GG667782.1| GENE 5 3999 - 4586 681 195 aa, chain - ## HITS:1 COG:CAC0938 KEGG:ns NR:ns ## COG: CAC0938 COG0131 # Protein_GI_number: 15894225 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: Clostridium acetobutylicum # 3 195 5 197 197 219 55.0 2e-57 MGRTAEIARVTNETDIRLKLNLDGTGKATVQTGIGFFDHMLNSFARHGLFDLEVDVKGDL EVDTHHTIEDTGIVLGTAIKEAVGSKKSLKRYGNMILPMDESLILCAVDLSGRPYFVFDA AFTAERVGQLETEMVKEFFYAISCAAEMNLHIRKLSGENNHHIIEGMFKAFAKALDEATQ TDSRITDVLSTKGTL >gi|229783953|gb|GG667782.1| GENE 6 4590 - 5894 1532 434 aa, chain - ## HITS:1 COG:CAC0937 KEGG:ns NR:ns ## 
COG: CAC0937 COG0141 # Protein_GI_number: 15894224 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Clostridium acetobutylicum # 32 429 36 431 431 445 58.0 1e-125 MRIVALDEASRKDILVDLLKRDPNNYTAYSDTVQEIVDTVKRDRDAAVFSYTEQFDHAVI DASTIRVTEEEIEEAMRAVEPGLLEVMKRSMKNIRAFHEKQKTYSWFDSKPDGTILGQKI TALSSVGVYVPGGKAAYPSSVLMNIIPAEVAGVEKIVMVTPPGRDGKVNPVTLTAAYLAG ATEVYKVGGAQAIAALAFGTESIPRVNKIVGPGNIFVALAKKAVYGHVSIDSIAGPSEIL VLADESANPRFVAADLLSQAEHDELASAILVTTSMELAERVSEETERFTKELSRSEIIQK SLDNYGYILVADTMADAIETVNEIAPEHLEIVTKNPFEDMTRVKNAGAIFIGEYSSEPLG DYFAGPNHVLPTNGTAKFFSPLNVDDFVKKSSIIYYSKDALEAIHEDIETFAEAEQLTAH ANSVRVRFEKTRED >gi|229783953|gb|GG667782.1| GENE 7 5929 - 6576 787 215 aa, chain - ## HITS:1 COG:CAC0936 KEGG:ns NR:ns ## COG: CAC0936 COG0040 # Protein_GI_number: 15894223 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 2 212 5 212 215 198 49.0 8e-51 MRYLTFALAKGRLAEQSLDLLEKIGITCEEMRDKSSRKLIFTNEELGMRFFLAKASDVPT YVEYGAADIGVVGKDTILEEGRKLYEVMNLGLGSCRMCVCGPKKAEELLKHHERIRVATK YPAIAKDYFYNKKHQTVEIIKLNGSIELAPIVGLSEVIVDIVETGTTLKENGLAVLEEIC PLSARMVVNQVSMKRENDRITRLIHDLRNYLQSAD >gi|229783953|gb|GG667782.1| GENE 8 6585 - 7847 1535 420 aa, chain - ## HITS:1 COG:CAC0935 KEGG:ns NR:ns ## COG: CAC0935 COG3705 # Protein_GI_number: 15894222 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase involved in histidine biosynthesis # Organism: Clostridium acetobutylicum # 1 324 1 326 407 245 38.0 1e-64 MANKLLHTPDGVRDIYGVECTRKAAIQNRILEVFHLYGYQDIETPTFEFFDIFNESRGSV KAREMFKFFDRDNHTLVLRPDETPAIARCVTKYFTEEDMPLRLCYLERTFINNTSYQGRL KEAAQTGVELIGDDSSDADAEILAMVIRALKAAGLTEFQVELGEVDFFRGLLEEAGMDEE MEERLRELIENKNYFGVEELVMEQPIPQELKEAFLKLPELFGSLEEIQAAREFTRNPRAL RAIDRLEEVNRILEYYDLSEYVSYDLGMLSQYQYYTGIIFKAYTYGTGDYIVNGGRYDKL LVQFGKDAPAVGFGISVDDLMLALSRQKIDTPVRVVGTMILFEPESREQAIQLAKHFRDT SIPVQLQLKKAGRTLEAYQSYAARTTITNLLYLDEKGFSVKIVNLNLNRTDEVPLSEYLK 
>gi|229783953|gb|GG667782.1| GENE 9 7941 - 8474 563 177 aa, chain - ## HITS:1 COG:BH3539 KEGG:ns NR:ns ## COG: BH3539 COG3331 # Protein_GI_number: 15616101 # Func_class: R General function prediction only # Function: Penicillin-binding protein-related factor A, putative recombinase # Organism: Bacillus halodurans # 10 169 8 167 168 119 37.0 3e-27 MGTWNTRGLRGSTLEDFINRSNDSYREKKLALIQKVPTPITPISIDKESRHITLAYFDQK STVDYIGAVQGIPVCFDAKECAVETFPLHNIHPHQIAFMREFEEQGGISFIILSYTVKNE VYYLPFDEIDRFWTRMEEGGRKSFTYEEVDKSWRIRSCRDIFVHYLEMIQKDLDRRD >gi|229783953|gb|GG667782.1| GENE 10 8478 - 9440 200 320 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 37 290 38 279 287 81 27 3e-15 MRSLIVTENESGQRLDKLLAKYLNLAGKGFLYKMMRKKNITLNGKKCDGSEKLVTGDEVR LFFSDETIGKFSVHTLQEVKKVNLQIIYEDDHILLVNKPSGMLSQKAKDSDESLVEYIID YLVSSGQLSTEQLKSFRPSVCNRLDRNTSGLVVAGKSLPGLQIMSAVFKDRSLHKYYRCV VKGSVTDKQVITGFLKKDSVTNQVTITAAEVEGSVPVMTEYEPLGQAEGCTLLSVKLITG RTHQIRAHLASIGHPILGDFKYGSSAVNEEAKKRWNIRSQMLHSCTVTFPELPEPLAYLS GKTFSAPLPQTFRKICRDWS >gi|229783953|gb|GG667782.1| GENE 11 9453 - 10022 530 189 aa, chain - ## HITS:1 COG:no KEGG:Closa_1974 NR:ns ## KEGG: Closa_1974 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 189 1 189 189 213 52.0 3e-54 MKKYDKGQYGYRNHHKKAETGKVAFGAAMIIVQLAARNFVDSTAWKNILTVMAILSVLPT ANVASPLLAGWRYKTPSEDFHRRVAGYEGRFSILYDLIITTRDTIIPADAAIVHPGGVFL YCTAPKLDTAKAEKCLKEIFAGHKLDSGVKIILDEKAFFHRLESLKPAEEYEDDGSVDYT LRLLKNLSM >gi|229783953|gb|GG667782.1| GENE 12 10053 - 10811 758 252 aa, chain - ## HITS:1 COG:MT1596 KEGG:ns NR:ns ## COG: MT1596 COG0300 # Protein_GI_number: 15841011 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Mycobacterium tuberculosis CDC1551 # 5 246 13 252 267 126 36.0 4e-29 MKIAVVTGASSGMGREFVRQLADRFNGIDEIWVIARRRDRLKELVSTVPVHLRLFAVDLT 
DQKELMEFQYALEEENPDVKWLINSAGFGKIGPVGSINLADEAGMVDLNCKALCVVTHMV LPYLSGNSRIIQLASSASFLPQPDFAIYAATKSFVLSYSRALNEELKPRGIYVTAVCPGP VKTEFFDIAETTGEIPVYKRIVMADPRRVVKKAIRDSMMGRTVSVYGISMKAFHLLCKIM PHGMILRIMKAL >gi|229783953|gb|GG667782.1| GENE 13 10832 - 11143 217 103 aa, chain - ## HITS:1 COG:MA4618 KEGG:ns NR:ns ## COG: MA4618 COG1032 # Protein_GI_number: 20093399 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Methanosarcina acetivorans str.C2A # 2 53 686 737 742 66 65.0 2e-11 MEKVYVPTNPHEKAMQRALIQYRNPKNYDLVEEALKKAGRTDLIGYDKKCLIRPRYSGGG AYGGGTKPGSGQAAGSKGPGKQRDDKRTGKKKTIRNVHKKVKK >gi|229783953|gb|GG667782.1| GENE 14 12110 - 13684 1638 524 aa, chain - ## HITS:1 COG:MA4618 KEGG:ns NR:ns ## COG: MA4618 COG1032 # Protein_GI_number: 20093399 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Methanosarcina acetivorans str.C2A # 5 521 126 662 742 501 47.0 1e-141 MIHDYLPMNRAEMNIRGWNQCDFVYVSGDAYVDHPSFGHAIITRLLEAHGYRVGIIAQPD WKDPESIRVLGEPRLGFFVSGGNMDSMVNHYSVSKKRRQQDSYTPGGVMGKRPDYATIVY CNLIRSVYKRIPIVIGGIEASLRRLAHYDYWSDRLKHSILIDSQADLISYGMGERSVVEI ADALNSGIDVRDITFIDGTVYKTSSLDSVYDALTLPSYSAMKADKLEYARSYYIQYSNTD PFIGKRLVEPYQDNLYVVQNPPSKPLTTEEMDAVYDLPYMRNYHPSYEEAGGVPAIREIK FSLISNRGCFGACSFCALTFHQGRIIQVRSHESLVAEAKLLTEEPDFKGYIHDVGGPTAD FRSPACEKQLTKGACPGRQCLFPEPCRNLKADHSDYIELLRKLRSLPKVKKVFIRSGIRF DYVLADPSRKFLKELCQYHVSGQLKVAPEHVADKVLSRMGKPQNKVYRQFVKEYKDMNAQ LGMKQYLVPYLMSSHPGSGLPEAIELAEYLRDLGYMPEQVQVAS >gi|229783953|gb|GG667782.1| GENE 15 13681 - 14346 782 221 aa, chain - ## HITS:1 COG:CAC3231 KEGG:ns NR:ns ## COG: CAC3231 COG0637 # Protein_GI_number: 15896477 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 1 214 1 214 215 183 46.0 2e-46 MIEETKAVIFDLDGTLVDSMWMWKDIDIEFLKNYGHDCPPELQKEIEGMSFSETAVYFKD RFGLRESIEDIKAIWRDMSIEKYRCHVPLKAGAREFLEHLRARGIAAGIATSNGREMVDA VIDSLKIGEYFNVIATACEVAAGKPAPDIYLNVADRLGVIPEDCLVFEDVPAGILAGKRA 
GMTVCAVADEFSRHMEAEKKSLADYFIRDYYDIINRNGAAL >gi|229783953|gb|GG667782.1| GENE 16 14346 - 14672 378 108 aa, chain - ## HITS:1 COG:L109527 KEGG:ns NR:ns ## COG: L109527 COG1187 # Protein_GI_number: 15674222 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Lactococcus lactis # 1 102 168 273 273 93 50.0 1e-19 IEGMLPEDAKERMAAGLTLSDGTETLPADLELVRRGGCGELSEIRLTIHEGKFHQVKRMF ETLGCHVVYLKRLSMGTLTLDETLKPGEYRPLTREELELLNKTDQEQE Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:52:01 2011 Seq name: gi|229783952|gb|GG667783.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld176, whole genome shotgun sequence Length of sequence - 11620 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 7, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 78 - 230 138 ## gi|288871397|ref|ZP_06117449.2| conserved hypothetical protein 2 1 Op 2 13/0.000 + CDS 248 - 1144 1223 ## COG0548 Acetylglutamate kinase 3 1 Op 3 . + CDS 1134 - 2258 1287 ## COG4992 Ornithine/acetylornithine aminotransferase 4 2 Tu 1 . - CDS 3175 - 4761 1380 ## COG0659 Sulfate permease and related transporters (MFS superfamily) - Prom 4819 - 4878 5.7 + Prom 4715 - 4774 1.9 5 3 Tu 1 . + CDS 4969 - 5883 602 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases + Prom 5895 - 5954 5.5 6 4 Op 1 . + CDS 6092 - 6718 649 ## COG3404 Methenyl tetrahydrofolate cyclohydrolase 7 4 Op 2 . + CDS 6738 - 7577 1115 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase + Prom 7725 - 7784 9.2 8 5 Tu 1 . + CDS 7843 - 8466 627 ## HEAR3314 putative signal peptide + Prom 9368 - 9427 11.9 9 6 Tu 1 . + CDS 9547 - 10341 815 ## COG2199 FOG: GGDEF domain + Term 10373 - 10420 5.4 - Term 10361 - 10408 5.4 10 7 Tu 1 . 
- CDS 10452 - 11567 564 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins Predicted protein(s) >gi|229783952|gb|GG667783.1| GENE 1 78 - 230 138 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871397|ref|ZP_06117449.2| ## NR: gi|288871397|ref|ZP_06117449.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 50 26 75 75 94 100.0 3e-18 MPNKQEYIQKRGTGIQTLRFQLDSHQFSREEIQEGIEHLGKEFGVDLPFH >gi|229783952|gb|GG667783.1| GENE 2 248 - 1144 1223 298 aa, chain + ## HITS:1 COG:Cj0226 KEGG:ns NR:ns ## COG: Cj0226 COG0548 # Protein_GI_number: 15791598 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Campylobacter jejuni # 5 286 4 280 281 331 58.0 8e-91 MDLDQSIMQKAEVLIEALPYIQRFNRKIIVVKYGGSAMVDEELKKHVIQDVVLLKLVGFK PIIVHGGGKEISKWVGKVGMEPQFINGLRVTDEPTMEIAEMVLNRVNKSLVQLVNELGVK AVGISGKDGMMLQCKKRYADGEDIGYVGDITGVDPQIIYDLLEKDFLPIICPVGFDEEFK TYNINADDAACAIATAMEAEKLAFLTDVEGVYRDFEDKESLISEMTIDEAQDFVSSGTLG GGMLPKLQNCIDAMKQGVSRVHIMDGRIPHCLLLEIFTNKGIGTAIFKTGDERYYHAE >gi|229783952|gb|GG667783.1| GENE 3 1134 - 2258 1287 374 aa, chain + ## HITS:1 COG:Cj0227 KEGG:ns NR:ns ## COG: Cj0227 COG4992 # Protein_GI_number: 15791599 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Campylobacter jejuni # 10 373 6 364 395 377 52.0 1e-104 MPNKQEYIQKAEQEFYKTYNRFPVIFDHGDGVFLYDTEGEEYLDFGAGIAVMGLGYNDKE YNEAVKTQVDKLLHTSNLFYNVPSIDAGEKLLAASGMDKVFFTNSGTEAIEGALKIARRY AYNKAAGEKRGEGCVHEIIAMRHSFHGRSMGALSVTGNDHYQEPFKPLIPGIKFADFNDL DSVKALVNENTCAVIMETVQGEGGIYPAEEEFLKGVRALCDEHSMLLILDEIQCGMGRTG SMFAWQQYGVKPDVMTVAKALGNGVPVGAFLASGKAAEAMVPGDHGTTYGGNPLVTAAAD TVLSIFEKRGIVDHVNQIGGYLWKKLDELADQFDCITGHRGMGLMQGLEFHMPVGPIVQK ALLEEKLVLISAGS >gi|229783952|gb|GG667783.1| GENE 4 3175 - 4761 1380 528 aa, chain - ## HITS:1 COG:L1004 KEGG:ns NR:ns ## COG: L1004 
COG0659 # Protein_GI_number: 15672032 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Lactococcus lactis # 81 528 1 443 460 362 45.0 1e-99 MIKNYVTQLKSEFRNYSGADCMKDLMAGLTVAAVALPLALAFGVSSGSTAAAGLVTAIIA GLVIGTLSGGYYQISGPTGAMAAILISIIARYGMQGVFTATLIAGILLVLCGIFHIGRLT GFIPAPVITGFTSGIAVIIALGQIDNFFGVTSEGSSAILKLLSYSQLGFPVNIQAAALGL FVVLFMVFFPKKWNAVVPASFLSIIFATVLSVILNLDIQTVGAIPKTLFLDTRLDLSAIT PAHLSGLIGPAVSIAMLGMIESLLCGASAGKMANVRLNSDQELVAQGIGNIILPFFGGIP ATAAIARTSVVLKSGARTRLTGIFHALGLLAFMFILGPVMAKIPLSALAGVLMVTAFRMN DWQEIRYIFSHHFKGAAAKYLITMAATIVFDLTTAILIGVVTALVLLVSRLANIEINYEK VNMDRVRSSDAALAKEFGNAVVAYLTGSVIFANTQAIEEMETCTKEYDTVLLSMRGVSYM DISGAIAFMHVLSDLQAEGKRILLCGVPTSTMAMLKRSDIYDMIGEDS >gi|229783952|gb|GG667783.1| GENE 5 4969 - 5883 602 304 aa, chain + ## HITS:1 COG:CAC2274 KEGG:ns NR:ns ## COG: CAC2274 COG0317 # Protein_GI_number: 15895542 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Clostridium acetobutylicum # 132 297 7 190 740 101 32.0 2e-21 MECALYIRLHAKEPLSVQGLSNRFGYSACHFSRMFRAEMGVTLMDYVKQQRLFGAAREIR EGRKILDAALDYGYETHSGFTRAFRAQFGYSPALLRAFRVGEAFEKGDWENMGIYLRNTP VHAQPEELYGLLFEVLGEAGTAYGRGNVEAVYELAERVYEGKKRRSGDEYVTHPLNTAII LADMGADEDTVCAGLLHDVWEMADEPETYLSDPAVTPAVKEILNEYREFDKYASCDERVV LVALADRLHNMRTIDFVDPATWEERAKMTLEVFGPMAAECGKVKCRMEFDDLAEKYLSKY GPAL >gi|229783952|gb|GG667783.1| GENE 6 6092 - 6718 649 208 aa, chain + ## HITS:1 COG:SPy2084 KEGG:ns NR:ns ## COG: SPy2084 COG3404 # Protein_GI_number: 15675842 # Func_class: E Amino acid transport and metabolism # Function: Methenyl tetrahydrofolate cyclohydrolase # Organism: Streptococcus pyogenes M1 GAS # 9 203 10 204 208 118 35.0 9e-27 MIKEMVTEEFLAELSSKKPTPGGGGAAALGGATGVSLGQMVINLTLGKKKYADVEEEMKD LNGQLEALKAEFLHLADEDAQVFAPLAAAYSLPGTTDEEKAHKAAVLEGHLLTASLVPIQ VMEAAGKALHIMDILADKGSRMAVSDVGVGVQFIRTALTGAVMNVWINTKSMKNREKAEE LNRYADQMMQSGTAAADAVYQKVERALR 
>gi|229783952|gb|GG667783.1| GENE 7 6738 - 7577 1115 279 aa, chain + ## HITS:1 COG:BS_folD KEGG:ns NR:ns ## COG: BS_folD COG0190 # Protein_GI_number: 16079487 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Bacillus subtilis # 8 277 14 280 283 211 43.0 1e-54 MITLKGAEVSAKIKEETAELLKQCGGRIPKLAIIRVGENPDDLSYERGAVKKMAAFGLTA EVYAYPADIGDEGFKTEFQRINGDPEIDGILLLRPLPKQICEKDIERMIDSKKDLDGISP ENIAKVFAGDSTGYAPCTAEAVVETLKAYEIPINGKRAVIVGRSMVVGKPLAMLLLKENA TVTICHTRTVNLKETCRNAEILVAAAGRAKMLDKESVGKDAVVIDVGINVDENGKLCGDV DFETLEGTASMATPVPGGIGAVTTAVLAKHLVMAAVRNE >gi|229783952|gb|GG667783.1| GENE 8 7843 - 8466 627 207 aa, chain + ## HITS:1 COG:no KEGG:HEAR3314 NR:ns ## KEGG: HEAR3314 # Name: not_defined # Def: putative signal peptide # Organism: H.arsenicoxydans # Pathway: not_defined # 14 198 14 191 492 103 32.0 5e-21 MGKNRELLKTNVLVSVILIFGFVVTAFFSYQANYEASLNNIEQVASLTTEGIYYQLTALF TKPINISITMAHDSLLVDHLENEADNLEDEAYAWKIKNYLKTYREKYDFDSVFLVSAQTN RYYNFNGVDRVLTEDEPENNWYFSMLKNDLEYSVNVDNDEVNGADNAITVFVNCKIFGPD EDFLGVVGVGMRVSYLKEFKPFRFQPS >gi|229783952|gb|GG667783.1| GENE 9 9547 - 10341 815 264 aa, chain + ## HITS:1 COG:RSp0380_2 KEGG:ns NR:ns ## COG: RSp0380_2 COG2199 # Protein_GI_number: 17548601 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Ralstonia solanacearum # 97 261 3 169 181 99 31.0 4e-21 MTEQYFESLGAGNLPYDQGLKVIAEKQIKEEYRAGYIAAFTPQNVIREYEKGNSHLRYDF MITQDGNGYFWMRIDAYIFFSQEDGCLHMFTYRKNIESEKEKEILASIDEMTGFLTKTET ERKIAALLSAKTDGIYAFFLFDIDNFKQANDSCGHAFGDFCIKEFTSIIRKHFRADDVLG RIGGDEFAVFLSAPDGGWIEGKAKELSQALQVTCEKENKSWSMSASIGVSIAPGAGADFD TLYQNADEALYQTKQRGKNGYTIV >gi|229783952|gb|GG667783.1| GENE 10 10452 - 11567 564 371 aa, chain - ## HITS:1 COG:STM1911 KEGG:ns NR:ns ## COG: STM1911 COG4225 # Protein_GI_number: 16765253 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of 
bacterial surface properties, and related proteins # Organism: Salmonella typhimurium LT2 # 50 369 63 375 379 246 40.0 5e-65 MDREVISQKIDLVVDRLMALGGADYAKDKMADSGNRFTGLIERDFGIEEWDWPQGLGLYG LIKCLQGKKHNRYMEFLDHWFQRNIALGLPSKNINTTAPYLALLPLSELTGNQIYRQMCT ERAEWLMNTLPRTKYEVFQHVTSAFGDRNGVNLHDGEIWVDTLVMAVLFLNQAGCKLGRP EYSQEALYQYLSHIKYLFDRSNSLIHHGWTFHGNNNFGGVYWCRGNAWFTYGIPCLLADS DDSISPGIFRYLVNTWKAQVDTLVSLQDHNGLWHTVLDDPASYCEVSGSSAIAAGILKGI HLGILDSSYLPTAELAISAVCDNIAADGTVLNVSAGTGIGMNADHYKNIVIKPMAYGQSL AILALTEALEG Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:52:13 2011 Seq name: gi|229783951|gb|GG667784.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld177, whole genome shotgun sequence Length of sequence - 6521 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 7, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 206 119 ## gi|266621700|ref|ZP_06114635.1| conserved hypothetical protein 2 1 Op 2 . + CDS 364 - 810 484 ## gi|266624527|ref|ZP_06117462.1| conserved hypothetical protein + Term 869 - 916 10.3 - Term 927 - 975 2.2 3 2 Op 1 . - CDS 1073 - 1714 271 ## bpr_I1885 hypothetical protein 4 2 Op 2 . - CDS 1785 - 1976 171 ## Clos_2043 hypothetical protein - Prom 2170 - 2229 6.3 5 3 Tu 1 . - CDS 2351 - 3076 212 ## FMG_1610 hypothetical protein 6 4 Tu 1 . - CDS 3429 - 3839 260 ## gi|266624531|ref|ZP_06117466.1| mobilization protein - Prom 4047 - 4106 3.7 - Term 3993 - 4037 4.0 7 5 Tu 1 . - CDS 4162 - 4713 491 ## gi|266624532|ref|ZP_06117467.1| putative protein GrpE - Prom 4799 - 4858 2.3 - Term 4748 - 4788 0.4 8 6 Tu 1 . - CDS 4894 - 5949 698 ## Ethha_1796 hypothetical protein - Prom 6061 - 6120 3.2 - Term 6070 - 6108 6.2 9 7 Tu 1 . 
- CDS 6161 - 6340 63 ## gi|288871401|ref|ZP_06410148.1| hypothetical protein CLOSTHATH_05904 Predicted protein(s) >gi|229783951|gb|GG667784.1| GENE 1 3 - 206 119 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621700|ref|ZP_06114635.1| ## NR: gi|266621700|ref|ZP_06114635.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 67 73 139 139 109 100.0 8e-23 VAIALFVSFKIDGIIASIPWLSEDGFVFIIIVLTVLCVVRDVKTIRTISQIQKSSVTQTN STDERNT >gi|229783951|gb|GG667784.1| GENE 2 364 - 810 484 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624527|ref|ZP_06117462.1| ## NR: gi|266624527|ref|ZP_06117462.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 148 1 148 148 241 100.0 1e-62 MKDYNYNENREEQSAREIGKKFAELGFGTVMTDEEYARSEKRIEKSEKRYKFFGSVFSEL RTIIYTPLSIAFHAVSYVAKGIGYISSFGLIAGVYYLYQSFCAFKSGVPFGEIGEFTKAV GFIVFPFIAYLVSYVSEKIYTYFEENAY >gi|229783951|gb|GG667784.1| GENE 3 1073 - 1714 271 213 aa, chain - ## HITS:1 COG:no KEGG:bpr_I1885 NR:ns ## KEGG: bpr_I1885 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 213 1 216 216 206 46.0 7e-52 MKTVDIHGMTNMELVQGGTDEWYWATDYIHGDLYEAEELFRQRHHVRSNRLYLIHYPDGT VYEPVPPVDGQYLGYPVYDEGAIVLLVVNFAESMIYILRFIHQQEKTQEVARLPLSTVKD CYNLILHTSPLSLTRQPNDGTFEMIWPEKISFAINDRESFNFREGDDLYFNIWYEDPDYR EETLVRSLRDGAILERLPGDIRIMPNGERWLIK >gi|229783951|gb|GG667784.1| GENE 4 1785 - 1976 171 63 aa, chain - ## HITS:1 COG:no KEGG:Clos_2043 NR:ns ## KEGG: Clos_2043 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 1 63 26 88 88 89 66.0 6e-17 MTKEIEIQGCITIPKDVSMDEVIDKFIAFIEKNEWSFGGGYRTIIDGYYMNADGTKGKCV LDE >gi|229783951|gb|GG667784.1| GENE 5 2351 - 3076 212 241 aa, chain - ## HITS:1 COG:no KEGG:FMG_1610 NR:ns ## KEGG: FMG_1610 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 1 
240 3 242 243 291 70.0 1e-77 MDTRFNQTKLKKLQIFNGAQLKYLAFASMLIDHVNNALITPYLNGQGFLLHLSNLFSILG RIAFPLFVFFLVEGFFKTGNRMKYLIMLLIFGVISEVPFDLFTSKTCFSPYWNNIMFTLA LCLITIWIIDILKDKISNKYPWYALSILIVAFFGFLSMELNLDYDYHAIVVAYLFYIFYD KPLLGAGLGYISIIKELYSFIGFGMTLTYNGERGKQYKWFNYFFYPVHILILGLLRIYLN I >gi|229783951|gb|GG667784.1| GENE 6 3429 - 3839 260 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624531|ref|ZP_06117466.1| ## NR: gi|266624531|ref|ZP_06117466.1| mobilization protein [Clostridium hathewayi DSM 13479] mobilization protein [Clostridium hathewayi DSM 13479] # 1 136 1 136 136 250 100.0 3e-65 MAEKDRVRTEQFRFTLLPEENAMLTKNAFEYGLSKSEYLRKLILAESLTGKQWHMDRDQA KQLIYEVNRIGNNVNQIAYNTNAVRFSSNKDWGELQTKYFELMTLFCQFALRENGEEKLW YKKILELMPKDDSSSV >gi|229783951|gb|GG667784.1| GENE 7 4162 - 4713 491 183 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624532|ref|ZP_06117467.1| ## NR: gi|266624532|ref|ZP_06117467.1| putative protein GrpE [Clostridium hathewayi DSM 13479] putative protein GrpE [Clostridium hathewayi DSM 13479] # 1 183 1 183 183 316 100.0 5e-85 MRDYSSQKIEIETIKPRTITVNLSNADVKRLAEKSGEGGLTISELLENFIGDLVDGTYSN GSDERMYAEQWYQRCWFAMFSDDTFLKFLLLWGDLDDYIDLMDELESNKKEMAEMTADAE EYSAEERDELQEYINELQKEIDYYWGKFLERKRQKESYVFEKEIENIMAWKSQLDTILDP ENP >gi|229783951|gb|GG667784.1| GENE 8 4894 - 5949 698 351 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1796 NR:ns ## KEGG: Ethha_1796 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 22 299 3 278 351 185 34.0 3e-45 MNESTTNMRTEEAIKEKIKESRFESINGKDLMEMDLPPLHYTISTILPHGFFILAGGAKV GKSWLAEQISYAVASGGQIWGYQAIQSEVLYLGLEDTVTRLQSRFEMLEAYEGMENIHFV LKANSITSGVEEDIRDYLTQHPNIKLVVIDTLQHIRGNEFVKNIYAGDVEFTNILRQISM EFDLTILALTHTNKGKHDDDISKISGSEGMAGGTDGNWVLTKSKRTDTRANLTISNRDTE AFEFALEFDKESCRWNKLTRDESVVNEKERFLHAIVNFVKAEEAHHWEGTATELLTNLQA REPILVSYSAAVFSKSLKAAQDSLREIYGVSYVSAFHNKKKWITLDYIEVK >gi|229783951|gb|GG667784.1| GENE 9 6161 - 6340 63 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871401|ref|ZP_06410148.1| ## NR: 
gi|288871401|ref|ZP_06410148.1| hypothetical protein CLOSTHATH_05904 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_05904 [Clostridium hathewayi DSM 13479] # 1 59 1 59 59 102 100.0 7e-21 MLASLAGFEPTAFRLGGKSRLPGYPFCNGFVTLVPGFCIFSVTIIETSSRLKLVITAYT Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:53:08 2011 Seq name: gi|229783950|gb|GG667785.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld178, whole genome shotgun sequence Length of sequence - 13441 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 2297 2236 ## COG0642 Signal transduction histidine kinase 2 1 Op 2 . - CDS 2239 - 2388 74 ## gi|288871403|ref|ZP_06410150.1| conserved hypothetical protein - Term 2408 - 2454 -0.7 3 1 Op 3 . - CDS 2471 - 3292 709 ## COG4295 Uncharacterized protein conserved in bacteria 4 1 Op 4 . - CDS 3292 - 4209 1101 ## COG1051 ADP-ribose pyrophosphatase 5 1 Op 5 . - CDS 4196 - 4594 410 ## COG3236 Uncharacterized protein conserved in bacteria - Prom 4639 - 4698 80.4 - Term 5650 - 5695 5.3 6 2 Tu 1 . - CDS 5804 - 6721 671 ## bpr_I2230 hypothetical protein - Prom 6810 - 6869 3.7 - Term 6829 - 6882 18.0 7 3 Tu 1 . - CDS 6905 - 7486 513 ## Closa_2015 cell envelope-related transcriptional attenuator 8 4 Op 1 . - CDS 8433 - 9305 999 ## COG1316 Transcriptional regulator - Prom 9342 - 9401 2.7 9 4 Op 2 . - CDS 9403 - 11199 1352 ## Closa_2016 hypothetical protein - Prom 11229 - 11288 5.8 10 5 Op 1 2/0.000 - CDS 11294 - 12031 730 ## COG0584 Glycerophosphoryl diester phosphodiesterase 11 5 Op 2 . 
- CDS 12051 - 13439 1243 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783950|gb|GG667785.1| GENE 1 2 - 2297 2236 765 aa, chain - ## HITS:1 COG:SMb20356_1 KEGG:ns NR:ns ## COG: SMb20356_1 COG0642 # Protein_GI_number: 16264090 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Sinorhizobium meliloti # 619 757 362 503 667 92 45.0 3e-18 MNEQITARECVKALCSAWFERRSMEETIVFFADDVMFIGTGEGESAHGKNEMANYITQDI QEIPEPFTVGVTIVHEQQIGENICNISIGLTLKNTLYSWLLRGFAVLMREHGAWLVKSLH FAEPSGKQREAEHYPQTLVMENLAKQRQKLLDDSLPGGMMGGYIEEGFPFYFINRRMLDY LGYENEDAFLQDIGGLITNCMHPADREMVDRQVARQLEDSDEYVIEYRMKKCDGSYIWVH DQGRVVTAEDGRPAIMSVCVDVTAQKQAREEVLSLYNNIPGAVFRCRFDAEFSVIDANDG LFEFLGYTREEFAQMGNAMASVIYPDDLREMTDRLNRQLTHSNTIHNENRLICRDGTVKW ISIKAQLLTESGGEQYFYCVFVDITEEKLLQDRMRELYEKELTYFAEMSSSDGSIQGRMN VTQNRVESYLSTLDISLTSTDHTYDESIESLAACVVDAAYGEEIRWSLRREKLLEDYAAG KTDYHYEFLCRYNGNAFWGSMNFKSCLNPETGDIIVFFYVMDITEQKLTERLLKSIAELD YDSITDVDIRRDTYQQVSFNSSEKNLIPWQGEFQKEITKIAEQYMDSAAGNEYLRQLDSG YMKAMLKDQNTYSFVTEMRDEYGNVRVKRFKVFYISRELGRICIVRTDVTDIVRGEQRQK EELAAALAAAEQANAAKSDFLSRMSHEIRTPMNAIIGMSTIAAKSIGDDEQVADCISKIG ISSRFLLSLINDILDMSRIESGKMLLKNEKIPMEDFLCGINSICY >gi|229783950|gb|GG667785.1| GENE 2 2239 - 2388 74 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871403|ref|ZP_06410150.1| ## NR: gi|288871403|ref|ZP_06410150.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 49 9 57 57 92 100.0 1e-17 MGFYAKIGTNGQKEMSLWQAGTTPAGGVTEDERADNSQGMCESALLRLV >gi|229783950|gb|GG667785.1| GENE 3 2471 - 3292 709 273 aa, chain - ## HITS:1 COG:DRB0099 KEGG:ns NR:ns ## COG: DRB0099 COG4295 # Protein_GI_number: 10957515 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 2 269 5 271 285 186 43.0 5e-47 MDRKATARETLRILEQGYYLYGDQRISIEEAHRKSVKGSVLITPEEGTRLLKAYEDGNRT 
GDAQSCCTVGDISTVEAAWKLFLEGKKDIAILNFASAKNPGGGFLNGAMAQEESLAASSG LYKTLTVHEEYYRNNRACPSMMYTDHAIYSPEVVFFRDGGFRLLEKPFPSSVLTLPAVNM GQVLLKGEDCGTAEHVMRRRMQLALAIFAERGAKNLVLGAYGCGVFRNDPVKIAAWWEEL LNGEFRGIFGQIVFAVLDRSKNKACLNAFLEKW >gi|229783950|gb|GG667785.1| GENE 4 3292 - 4209 1101 305 aa, chain - ## HITS:1 COG:CAC1777 KEGG:ns NR:ns ## COG: CAC1777 COG1051 # Protein_GI_number: 15895053 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Clostridium acetobutylicum # 1 298 1 294 307 218 45.0 2e-56 MKKNELRDRNGLTEAEFLSRYEPGDYVRPSVAADMVIFTVMETEEENYRKLPEKELEVLL IQRGVHPYLGCWALPGGFVRPDETTEAAAKRELKEETGVDHVYLEQLYTFSEPERDPRTW VMSCSYMALIDSSKVRVKAGDDADQAVWFRTSFRLMDEHIEYRKNMTYDGMKTIRIKRYE LELSHEDVVLRSLIEKKIVKTEHETVTEYRILENDGLAFDHAKIIAYALERLKGKVEYTD LALHMMPKEFTLTQLQQVYEVILGKTLLKAAFRRKIAHLVEGTDSFTENEGHRPSKLFRR KLEEE >gi|229783950|gb|GG667785.1| GENE 5 4196 - 4594 410 132 aa, chain - ## HITS:1 COG:PA4580 KEGG:ns NR:ns ## COG: PA4580 COG3236 # Protein_GI_number: 15599776 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 1 122 63 184 184 119 52.0 1e-27 MMAEKARMFGDGEMLSMILKARHPKEMKAYGRAVKNFDQDKWDGACYEIVKRGNLAKFSQ NPELWDYLKSTKRRILVEASPRDRIWGIGMGKSNPDAECPVKWRGTNFLGFSLTEVRDLL LEKEEEETDEEE >gi|229783950|gb|GG667785.1| GENE 6 5804 - 6721 671 305 aa, chain - ## HITS:1 COG:no KEGG:bpr_I2230 NR:ns ## KEGG: bpr_I2230 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 4 124 42 163 164 68 27.0 3e-10 MEGIMKKIICILGTMLAALLLMACNGMSFSGSRLGNGSQLIMKYSIFNTTDSQYFEMDQG DVIDADIVSDSGKLSVTIQSEDGETVYENEDVPTGTFQIEIRKKGIYKIKVTGRKAKGSL SFIKSTEQDSLEANLAALSNSYYDGQSSRAFQMLQKSIFKKMLANGWLDEMSGLDNARWN YDTFTNYTVLDRDPALHEAQGELYCCTFSADHDRYGYIVMSYNGDGLSKIRAVETPYLYD FLLEWDQIEKELETSGVDLSTASARRAEVLGEDGSDPAEGISFTDSKGNQYFYRFLKDGS AGNKK >gi|229783950|gb|GG667785.1| GENE 7 6905 - 7486 513 193 aa, chain - ## HITS:1 COG:no KEGG:Closa_2015 NR:ns ## KEGG: Closa_2015 # 
Name: not_defined # Def: cell envelope-related transcriptional attenuator # Organism: C.saccharolyticum # Pathway: not_defined # 1 170 339 504 507 134 56.0 2e-30 MRIGKMDCVIAATLESNVIQLHQFLYQDENYTPSSAVKSISARISEESGISEPGKNAPTG GGSTKKNPGAASTPKETAPPETESIAEESVEETMETTEETEAETTEEETAASETYEDGSL IGPGAGLDGAAKEPEETKGTKAPEEETKAETKASEPAAPEPTKAPETQTPPAPGDKGPAG GSSGGPGEVGPGV >gi|229783950|gb|GG667785.1| GENE 8 8433 - 9305 999 290 aa, chain - ## HITS:1 COG:CAC3046 KEGG:ns NR:ns ## COG: CAC3046 COG1316 # Protein_GI_number: 15896297 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 100 272 72 242 341 111 40.0 2e-24 MDYEDELDRMRAQKRRGSGAKRKVDNRGGRAYYETGAGRNPRVEYTGRPDVHQKLRQERM RKKKKKKRLLMMELAVLAILAVSAFFIFGKGTNQKGYWTIAVFGVDSRDGNLEKGALSDV EMICNIDKATGEIKLVSVYRDTYLKINSEGTYHKINEAYFKGGHKQAVEALNENLDLKID DYATFNWKAVAEAINILGGVDIEITDAEFSYINGFITETVNSTGIGSYQLKSAGMNHLDG VQAVAYARLRLMDTDFNRTERQRKVVTLALEKAKQADFGTLKTLVSAVFS >gi|229783950|gb|GG667785.1| GENE 9 9403 - 11199 1352 598 aa, chain - ## HITS:1 COG:no KEGG:Closa_2016 NR:ns ## KEGG: Closa_2016 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 598 1 537 537 449 44.0 1e-124 MKRSKTVKNQTISAIFMALLGTAVHVGSQSYLSFYDVPEHKPARPVVSIPPAVPEKIMPE ETVPATEAEEVFSPSLAVLNLAYLYPEGEEDTDAQNSVAGQILRVLSVKDSEWLASVYDL SNRKPVEAAAKLGLPAESIRGRYNPLDEMQKVDDPDSWELSEWSHIDMVFTDENGNRIMA DSNIKEIMAMANTYYYYTDPEDSEAFLDYVTSLWDRSHSIHYSIGEVYYCDGCLEETPGE SESVGESGAAGESESFRDLKSAVLEETQEEAAESSFAGSPEGTQNETDTELSAAAAANGQ ESAETAEIGPGIGTSAAETAAEQETTAEIGPGIALMESSRASEAEAEVEILAEEAVPEVK CPGHVDLHITAAVAGVSSSGNSLFSIAEQLITGKRESVFPGWTDENKNAVLELAAQDWKE VYGLTVSSAYITNPLSQSEMEYYVSLLPADISPERRRLIRYALSSVGTVPYYWGGKPSAP DYDGNHFGSLVNADEKGRTMKGLDCSGWISWIYWSALGKRLPAESTDGLAGCGAAVAREE LKPGDIIVRLGEEGHVVMFLAWESDHKMKVVHESSGSVNNVTISTMQADWPYYRKLID >gi|229783950|gb|GG667785.1| GENE 10 11294 - 12031 730 245 aa, chain - ## HITS:1 COG:CAC0430 KEGG:ns NR:ns ## COG: CAC0430 COG0584 # Protein_GI_number: 15893721 # 
Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Clostridium acetobutylicum # 7 245 9 248 249 160 39.0 1e-39 MKTKVWAHRGASACAPENTLEAFELAVSQGADGVELDIQMTKDGSLAVIHDETVDRVCDG SGCVADLTMKELKRFRCSRLHPEFGEARIPELGEVLSLLKPTKLTVNIELKTGIVRYKGI EEAAIKLVKEMGMEERVIYSSFHHPSLLKVKRLDPDSKMGLLYSDGWIHAASYGKKLEVD ALHPALYHLRSGKLIKEAKEKKLPLHVWTVNEEGTMRILAEQKIAALITNYPDVARRVVR ETSKS >gi|229783950|gb|GG667785.1| GENE 11 12051 - 13439 1243 462 aa, chain - ## HITS:1 COG:CAC0429 KEGG:ns NR:ns ## COG: CAC0429 COG1653 # Protein_GI_number: 15893720 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 44 462 24 447 447 421 50.0 1e-117 RTTIAACMALMVAAGSVTGCSSQTGTNESSTTAQDSAAGTEALQSDAAGNKTDGVTITFW HSMGGVNGEAINTLVEQFNKENTAGITVEAQYQGSYDDAINKLKSAQIGNMGADLVQIYD IGTRFMIDSGWVVPMQDLIDGEGWDLSQVEPNIAAYYTVDDILYSMPFNSSTPILYYNRD MFEKAGITEVPDSLPKIAAAGDSLLTKGGAGEVISLGIYGWFFEQFTSKQGADYADNGNG RSAAATAVAFDKNGAGVKTLTAWKALADQGYAPNVGRGGDAGLADFSAGKAAMTLGSTAS LKQILNDVNGKFEVGTAYFPKVSDEDQGGVSIGGGSLWALNNEDPARQNAVWQFVKFLVS PASQAYWNSQTGYFPVTLAAHEEPVFRENIEKYPQFQTAIDQLHDSTPQSAGALLSIFPE ARQIVETEIENMMNNGTSPEDTVIKMAESINKSIEDYNLLNE Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:53:34 2011 Seq name: gi|229783949|gb|GG667786.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld179, whole genome shotgun sequence Length of sequence - 10630 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 1, operones - 1 average op.length - 9.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 + CDS 2 - 835 921 ## COG1653 ABC-type sugar transport system, periplasmic component 2 1 Op 2 38/0.000 + CDS 894 - 1793 908 ## COG1175 ABC-type sugar transport systems, permease components 3 1 Op 3 2/0.000 + CDS 1818 - 2648 751 ## COG0395 ABC-type sugar transport system, permease component 4 1 Op 4 . 
+ CDS 2669 - 3748 954 ## COG2152 Predicted glycosylase 5 1 Op 5 2/0.000 + CDS 3778 - 4521 759 ## COG1609 Transcriptional regulators 6 1 Op 6 3/0.000 + CDS 5449 - 5709 268 ## COG1609 Transcriptional regulators 7 1 Op 7 . + CDS 5710 - 7263 1156 ## COG0366 Glycosidases 8 1 Op 8 . + CDS 7278 - 9737 1792 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 9740 - 9799 2.4 9 1 Op 9 . + CDS 9823 - 10630 728 ## COG1109 Phosphomannomutase Predicted protein(s) >gi|229783949|gb|GG667786.1| GENE 1 2 - 835 921 277 aa, chain + ## HITS:1 COG:TM1855 KEGG:ns NR:ns ## COG: TM1855 COG1653 # Protein_GI_number: 15644598 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 35 245 191 396 418 108 35.0 9e-24 TAVIREFAAQRDTVMKDMGVTHSFYRPSLTRPDQWWDRWYDFQMPYEALTGGKPWVEGNK LVLDKEGAIEAFELIGLFGNSIQSGEISSIWTEENPSVLVTINAPWEISLYRENNKVYGE DYVYGQAIVKKDGDIPYNFADSKGLVFYKNKSVSDEEHAGAVEFVKWVYNKDNSAQTDLD WLNATTMLPVRGDLNDNEMFADTMKEYPELEALAEAVPYAIPSIATEKVTDIQTALTESG TGPYMNEVMNAEPGNAPDAAPYVEAAMEAMKQAGGLE >gi|229783949|gb|GG667786.1| GENE 2 894 - 1793 908 299 aa, chain + ## HITS:1 COG:lin0853 KEGG:ns NR:ns ## COG: lin0853 COG1175 # Protein_GI_number: 16799927 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 15 299 8 291 292 252 45.0 8e-67 MKRKSITGNKKAGLMGWLLNGPYLIYSLIFFLIPLIWAFWLSTMDWNLMSEQKTFVGIRN FAGLFADSRVKAAFINSYRYLLPIVVLCFAGGLIIALLVSNLPDKIKGLVAVLFFIPYLT SGVATSVVVKYFFNYNSAFNTFLRDQFNLNINWLTDQRYAFAVMVGIVVWKMSGYYALFI LSGIESIGDDVTEAAMLDGSTGLHKLLHITLPMIMPTLTTVIVLATGLAFGVFTEPYLLT GGGPNMTTTTWQLEIYNTSFTRFQSGYGAAMAIASAVQIFITLRIINAVTDRLNRRFGW >gi|229783949|gb|GG667786.1| GENE 3 1818 - 2648 751 276 aa, chain + ## HITS:1 COG:lin0854 KEGG:ns NR:ns ## COG: lin0854 COG0395 # Protein_GI_number: 16799928 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease 
component # Organism: Listeria innocua # 16 276 15 277 277 239 47.0 4e-63 MGRKKKFGVKTFTITVITLTLLAISLFPYYYMVLQSFTAWDQVDKVMVPHGFTLRSYEYL IGKGGATNSMMWVRALVNSFLVAFPTAIISVIIGLCNGYSVCKLKFKGERFIMDTLLFQM FFPTIILLVPRYMIAKPMANTYGGMIIPMCISIWAIFMYINYFKTLPNEVFEAAKIDGAG ELRIIFYIAFPITKSVTTIVFLSIFMQRWSELMWDMLIAPNIQMQTLNVLISTQFKPMGA FPGPMYAASVILTLPIIILFLCFSKYFKEGISFMLK >gi|229783949|gb|GG667786.1| GENE 4 2669 - 3748 954 359 aa, chain + ## HITS:1 COG:YPO2474 KEGG:ns NR:ns ## COG: YPO2474 COG2152 # Protein_GI_number: 16122695 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Yersinia pestis # 6 352 3 345 351 369 48.0 1e-102 MLEVKRCGENPLLIPNNVKPSQENFTVEGVFNCGVARYRDEVILLCRVAESVKGARDGEV LVPVVKKVDGKDEITVISLIKSEHPELDFTDSRTINKMTKHGKRTVYLTSLSHLRVARSK DGIHFTIEDTPFIFPCAEEESWGMEDPRITQIGETYYINYTSVTENGAATSLISTKNFVT YERHGVIFAPENKDVTIFPEKIHGMYMAFNRPVPGGIGNPEMWLAQSPDLEHWGHQKHFC GISESDSWDNGRIGGGAVPFKTEQGWVKIYHAADKNDRYCLGAFLLDAEDPSKVIAKTER PILEPEASYETEGFFGNVVFTCGCLVEGDRIAIYYGAADDKICRADLSVSELLAAMKAV >gi|229783949|gb|GG667786.1| GENE 5 3778 - 4521 759 247 aa, chain + ## HITS:1 COG:lin0851 KEGG:ns NR:ns ## COG: lin0851 COG1609 # Protein_GI_number: 16799925 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 3 209 2 210 341 122 34.0 5e-28 MKKRVTMKEIADRLNLSINAVSLALNDRAGVSEETRRLVLNAAEEMGYLDQSTRYTAAYS NKNICILLEDRFFKDMQFYGRILLGLEEAAKKAGYDLFINSFEKNQEIPACVEQQKVSGL IVVGKIDDEYLKRLKSYSIPVLLLDHESVETPTDCVMTNNVAGAYKLTKYLLDKGYRKIG FFGDLTYTPSIRERFWGYQEAMQQCHNFPSLEDGIAYIRRYSALRDVEDAVIHQDAAGLA AFYQALD >gi|229783949|gb|GG667786.1| GENE 6 5449 - 5709 268 86 aa, chain + ## HITS:1 COG:lin2102 KEGG:ns NR:ns ## COG: lin2102 COG1609 # Protein_GI_number: 16801168 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 1 85 249 333 336 67 40.0 4e-12 MAILMCKALSQCGLSVPGDISIVGFDDIELSNMVNPRITTMRVDKVLMGKKAMERLLKCM ERPREKVEKLVLDVQLIERDSVRELT >gi|229783949|gb|GG667786.1| GENE 7 5710 - 7263 1156 
517 aa, chain + ## HITS:1 COG:lin0855 KEGG:ns NR:ns ## COG: lin0855 COG0366 # Protein_GI_number: 16799929 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Listeria innocua # 3 495 2 494 510 414 44.0 1e-115 MNQWWKNTVFYEIYVKSFCAAGQNGESGFRGIISKLPYLKELGVGGIWLTPFYPSPKVDN GYDVSDYYGIDPEYGTMKDFEEMVTAAHKLDIRVIIDMVVNHTSTQHPWFLNSVSSRNSS KRDWYIWKDPVNGREPNNWESFFGGSAWAYVESDGSGQYYYHSFSKEQADLNWQNPSVEE AIGQVLDFWLRKGVDGFRFDVINNLSVCSSLEDNPTAENGEQLHEHDVNQPGVRAVIRRL GNRIREKKPEAFLVGEISSDDLSRIHAYMGDGGLDTTFNFNLGSRETFQPEEIYEELKRM QAMYGSFEDPTLFFGSHDMRRFPDRFGFTDGETKCLLTLMLLCKGIPFLYYGDEIGMKSR VLNSLDEAEDVQGILACKTAIREGKSREEAVRILNEKSRDASRNPMDWEEEKRQKKDPSS VWNYVRKLIELRSGIRALTEGTMDLKYTAPGVIKIVRTLDGERVLAVINFSECEVSVPAE CGYRYLTGSEETGEGMDRNYGEKQLQVPGGSCVILKK >gi|229783949|gb|GG667786.1| GENE 8 7278 - 9737 1792 819 aa, chain + ## HITS:1 COG:XF0846 KEGG:ns NR:ns ## COG: XF0846 COG3250 # Protein_GI_number: 15837448 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Xylella fastidiosa 9a5c # 25 794 64 851 891 428 32.0 1e-119 MVIQKVTEGWKVRSCQTGQCFETKVPGTVYGTWLEHGAMEDPYWRDCEDDALAMMEGEYE YETTLPFCEELMSCRAIRLKFEGIDTLADIYFNGNHLGFADNMHRVWEYPVKSLAAESGG VNTIRVRFHSPLSFIKEAYEKSPYDGSPEAMQGFPLLRKSHCSFGWDWGPRLPDAGLFRA VSLVGVEQAEIRDVWIRQIHREGAVSLSCDIAVESVCDTVCTCELKVTGPKGDTSVSQLQ RKNDGFFGGIVIDNPELWWPNGYGSQPLYRVEAVLKANGEIQDIQSRMVGLRTMKVNTGA DEYGSRFAHEVNGVEIFAMGADYIPEDCIRGRVNRERTRTLLEHCRDCNFNVIRVWGGGY YPDDFFYDLCDELGLIVWQDFMFACAVYPLTPEFEANVTAEIRDNVRRIRHHACLGLWCG NNEMEMFLADNQWVRKPLQKTEYLIMNERIIPELLRQYDPDTFYWPSSPSSGGDFDSPND PDRGDVHYWEVWHGGLPFTDYRNYYFRYLSEFGFQSFPAVKTIEQFTAPEDKNIFSYVME KHQKNAGANGKILQYMSQMYLYPESLELLVYVSQLLQAEAIRYGVEHFRRNRGRCMGCVY WQLNDCWPVASWSSIDYYGRWKALHYFAKRFFAPVLLSCEEEGALSSSMNVNEENRRIRK TARFCVTNETRERITGVIRWSLRDASSRIHQQGQYCVEAEPLSVCSFPQMEFTEADMRRW YLSYELWIGDRRVSHGSSLFVPPKHFEFEDPQLEVWQEGEMVCIRSSAFAKGVELGNKDG DLLLSDNYFDMDAGVERVRVIQGSMEGLSVRSIYSLAHA >gi|229783949|gb|GG667786.1| GENE 9 9823 - 10630 728 269 
aa, chain + ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 4 269 3 267 575 296 53.0 3e-80 MADYRAEYRNWCDNTYFDEDTRREVKELTSEEEIKDRFYKELEFGTGGLRGIMGAGLNRM NRYTVRKATQGLANYMNRQNCTEKRAAIAYDSRHHSAEYAMEAALCLCANGIEVYLFESL RPTPELSFAVRELGCMAGIVVTASHNPPEYNGYKVYWADGAQITWPRDTEIMREVKAVTD FSSAASMERENAAGAGLFHLIGEEMDLRYLEKLKELVQEPSLLKLRAKELTIVYTPLHGT GNLPVRRILKELGFESVYVVEEQEQPDGD Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:53:37 2011 Seq name: gi|229783948|gb|GG667787.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld180, whole genome shotgun sequence Length of sequence - 9450 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 2103 1971 ## COG0514 Superfamily II DNA helicase + Prom 2179 - 2238 7.5 2 2 Tu 1 . + CDS 2280 - 3695 1564 ## COG1757 Na+/H+ antiporter + Term 3722 - 3761 4.1 - Term 3709 - 3747 3.1 3 3 Tu 1 . - CDS 3768 - 4433 618 ## SGGBAA2069_c09090 cAMP-binding protein - Prom 4537 - 4596 8.8 + Prom 4535 - 4594 9.7 4 4 Op 1 . + CDS 4618 - 6504 2090 ## DSY2571 hypothetical protein 5 4 Op 2 . + CDS 6590 - 7441 1180 ## COG1284 Uncharacterized conserved protein + Term 7515 - 7564 12.4 6 5 Op 1 . - CDS 7651 - 8856 1126 ## COG1940 Transcriptional regulator/sugar kinase 7 5 Op 2 . 
- CDS 8925 - 9449 340 ## SP70585_0538 hypothetical protein Predicted protein(s) >gi|229783948|gb|GG667787.1| GENE 1 1 - 2103 1971 700 aa, chain + ## HITS:1 COG:CAC2687 KEGG:ns NR:ns ## COG: CAC2687 COG0514 # Protein_GI_number: 15895945 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Clostridium acetobutylicum # 11 496 132 586 714 424 45.0 1e-118 EFLDFALNTKISMVAVDEAHCVSQWGQDFRPSYLKIVEFIDRLNVRPVVSAFTATATKEV RDDISDILMLREPTVLTTGFDRPNLYFGVQAPKDKYATMKNFLELHPGQSGVIYCLTRKN VEEVSDKLRADGFSVTRYHAGLSDAERKKNQEDFIYDRAAIVVATNAFGMGIDKSNVRFV IHYNMPKNMESYYQEAGRAGRDGEPAECILLYGGQDVITNQLFIDNNRDNQELDAVTRAI VQERDRERLRRMTFYCFTNECLRDYILRYFGEYGENYCGNCSNCLSQFETVDVTDIAKGI IGCVESSRQRYGTNVIIDTVHGANTAKIRQYHMDENPHYSELAKVPSYKLRQVINYLQLN EYLAVTNDEYAVVKLTEKSRAVLEEGEPVVMKMAKEQEHPAKVKAEKGKKSRKGLAAGLS EVGFSESDEALFDQLRALRSEIAKEEKVPPYIVFSDKTLTHMCVIKPKTRAEMLTVTGVG EFKFEKYGERFLECIRAETGAAHAETAENLSERAADRAFGDSAVSPAADYDEGFAGDDLY FSSDSGEFDDWSLDDVMAAWESGSQEPLKQEEPPVEAAAVKKKKGKSAKTEFVMTRELAD ELHYSERVSLSDFIGQINDLRDEDTMKRLTIKSVEQKLIEDGYYEEQFFNGMKRKRLTEA GEAFGIESEKRLSEKGNEYDVFFYTEKAQRAITAWLLESV >gi|229783948|gb|GG667787.1| GENE 2 2280 - 3695 1564 471 aa, chain + ## HITS:1 COG:BS_mleN KEGG:ns NR:ns ## COG: BS_mleN COG1757 # Protein_GI_number: 16079413 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Bacillus subtilis # 40 455 36 451 468 280 39.0 4e-75 METKEKVQKTLNGIQSAVLLLLIVAAVAVCIRFKVGGPMIGLFFSWLIIYLFCIGLKIDY EHVKAGAFDALKVVLPTMTILMAIGVLIGTWMASGTIPTIIVYGLKSINPAFLLTFTLIF TAVLSLVTGTSYGSAGSAGVAMMAIGNAMGINPGMVAGAVICGAMFGDKLSPFSDTTNLA PAVAGAKLGDHIKSMLWTTLPPMLISIVVFTVLGISQTNGGYDPGDLNVYIDYLQGAFHI GWTTLIPAVLIIVLLMFRVGAFEALGIGAVAGALVAVFVQGVDLRSILKICYDGFTSDTE IGVLKSILNRGGMASMLQYVAIITFAVGMGGMLDRLGVLNNILSVFVKRINSDGSLVLAT LLTGYVTSIVSCSSPMSHVLTGRLMQPLFKERGIEPRILSRCLEDSGTLAGPMIPWHGYG IYMAGTLGVTWAQYIGFVPLLWLTPLFSILYGFTGISIKRVADQKKNSDGK >gi|229783948|gb|GG667787.1| GENE 3 3768 - 4433 618 221 aa, chain - ## HITS:1 COG:no 
KEGG:SGGBAA2069_c09090 NR:ns ## KEGG: SGGBAA2069_c09090 # Name: not_defined # Def: cAMP-binding protein # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 3 212 10 222 223 94 30.0 3e-18 MESSAACLNELRDLGTLTRKKKNDCLYNRAHLDKYVYCLEEGLCALHSVSPGGRERIYQY FLPGDFVGFIPAYARSYPDDTFYAFSITAKSACLLYQIPYNTFREYVESHPGFYRWLFET AIAHYDNALKHCYALQDGDNFVSLCYALTELAVCEGPHYVIHKDFSYSELANYLGIHTIT VTRIMGKLKDMGVVSKRGHQTVIHDMARLTALAQQRTISKK >gi|229783948|gb|GG667787.1| GENE 4 4618 - 6504 2090 628 aa, chain + ## HITS:1 COG:no KEGG:DSY2571 NR:ns ## KEGG: DSY2571 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 6 626 4 626 628 683 53.0 0 MERKTIGYHRISHLVERQYQNEEAMRKELERGNYTSEHPYIVVNPYLVNPLTAMILFKTE KEEAVTLTVKGKEAAGDITHTFPKAKEQILPVLGLYPEYDNTVVIALEDGTAYEVTVTTE QIENMPYQADYINTTSDYMNGQLMFVTPAGDSLAGGYDYRGDCRWHLVEPFIFDMKPAAN GRILIGSNRLLNMPYYTAGVCEMDLVGKIYTEYCIPGGYHHDQFEMEDGNLLILTQEKNA ATAEDMCVLVDRKSGKIIKSWDYKKVLPQQAAKSGSWSEHDWFHNNAVWYDRRTNSLTLS GRHQDAVINIDFETGELNWIIGDPEGWPEDLVSRYFFTPAGGGDFDWQYEQHACMMLPDG DIMMFDNGHWRAKNKEQYRLNRDNFSRGVRYHIDTERMTIEQVWQFGKERKNDFFSSYIS NVEYYRDGYYLVHSGGMGYNHGVTCEELPVYMNLEDPECVLKSITVEIMDGELVYEMHLP SNYYRAEKMSLYREGANLDLGKGRVVGNLGVTGEFDTEVPAENTGELLPEACEAVLTEED DRIIFKAKFKKGQLVMFQLEKEDDPEEIHRYFISTSAQKFLAMCSGTFLPKDDREVTLNV DKEGLNGTFDVRVIIDDSKYETGLKLKF >gi|229783948|gb|GG667787.1| GENE 5 6590 - 7441 1180 283 aa, chain + ## HITS:1 COG:BH1678 KEGG:ns NR:ns ## COG: BH1678 COG1284 # Protein_GI_number: 15614241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 4 281 6 284 290 148 33.0 1e-35 MSKKIGKILGIVLGNGIFALGLSAFSIPNHFLVGGATGIARSVQHFAHLDVSVTVAALNV VMFLLGFYILGKRFALTTIVSTLVFPVFLNQFLKIEAFSHLTGDRLLAALFGGMLTGLGL GIVMRLGGSTGGMDIPPLILNKKFRIPVALSMYFFDIFILLSQMIFSNTEEILYGIMSVL LTSIVLNQVLVFGAGDVQVLIISREYERINEIIQTDLDRGSTFLPIQTGFEKLDQKAVLC VLPNREMSHLNHCVQEIDPKAFIIINGVREVRGRGFTLDKHLV >gi|229783948|gb|GG667787.1| GENE 6 7651 - 8856 1126 401 aa, chain - ## HITS:1 COG:SP0473 KEGG:ns 
NR:ns ## COG: SP0473 COG1940 # Protein_GI_number: 15900388 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Streptococcus pneumoniae TIGR4 # 11 395 11 395 407 280 37.0 4e-75 MYKGKKKKASNRSRILECIYRNAPIARTDIAEETEITPATVTMNVTTLISEGFVRELGEV STDEASSGRKRVLIDIIPDRAYSIGLEFTQKALVICITDLKGRIRFQRTEPFNEELAGRI TDAIIGGVKDLVAESGISWDEIVGIGAAVPGHMNGSASSLITNRKTWQNFDPKRIEQELP LPIVCENNARCMALGEYLFSPQESPDSFAFFHVGMGMFCASVVDGEMFLGENYVAGEIGH TIVSEGGRRCECGKYGCLQTYASENHLIRNARLLYRNTPNTILRSLVSDDSQITIETITT AYSMGDPVIGMYIAEALKYLGITISNIAIITNPGKIFLHGQLFDNQDIQNELMDYIERQL IFVDRTYAGNVEILPYGAADGAVGASALAILRFFIRPVMGE >gi|229783948|gb|GG667787.1| GENE 7 8925 - 9449 340 174 aa, chain - ## HITS:1 COG:no KEGG:SP70585_0538 NR:ns ## KEGG: SP70585_0538 # Name: not_defined # Def: hypothetical protein # Organism: S.pneumoniae_70585 # Pathway: not_defined # 1 166 463 628 628 127 43.0 2e-28 PNAFPAYVMGRMLFDADVTFGELKEEYFRAAYGPGWEQVLSYLTKLSSLCSCDYFNGKED RKDPREAAAMKELIRLAEHAPLPGQEGTDSLTDAQNLFWKYLDYHREYSLRLGKALMKLA GGEELEAQECWRQFQHMICERETEFQECLDVYRVTEVSTKYTGFLLEEPLISTL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:53:57 2011 Seq name: gi|229783947|gb|GG667788.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld181, whole genome shotgun sequence Length of sequence - 11953 bp Number of predicted genes - 13, with homology - 11 Number of transcription units - 9, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 541 507 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily 2 1 Op 2 . + CDS 567 - 2984 2649 ## COG0210 Superfamily I DNA and RNA helicases + Prom 3039 - 3098 9.8 3 2 Op 1 . + CDS 3168 - 3461 337 ## Closa_1728 hypothetical protein + Term 3484 - 3520 2.6 4 2 Op 2 . + CDS 3647 - 5017 1359 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase + Term 5213 - 5257 5.9 + Prom 5078 - 5137 3.6 5 3 Tu 1 . 
+ CDS 5271 - 5417 96 ## 6 4 Tu 1 . - CDS 6334 - 6453 66 ## + Prom 6721 - 6780 6.6 7 5 Tu 1 . + CDS 6822 - 7943 947 ## Trebr_1975 hypothetical protein + Term 8015 - 8071 15.2 + Prom 8460 - 8519 2.5 8 6 Tu 1 . + CDS 8558 - 8839 306 ## gi|288871416|ref|ZP_06117503.2| conserved hypothetical protein 9 7 Tu 1 . - CDS 9546 - 10013 153 ## COG1309 Transcriptional regulator - Prom 10226 - 10285 5.7 + Prom 10123 - 10182 6.5 10 8 Op 1 . + CDS 10254 - 10553 251 ## COG3681 Uncharacterized conserved protein 11 8 Op 2 . + CDS 10556 - 11266 392 ## COG3681 Uncharacterized conserved protein 12 8 Op 3 . + CDS 11289 - 11621 278 ## CLI_0908 CGGC domain-containing protein + Term 11673 - 11715 11.9 + Prom 11641 - 11700 4.6 13 9 Tu 1 . + CDS 11736 - 11952 124 ## Pjdr2_1164 transcriptional regulator, AraC family Predicted protein(s) >gi|229783947|gb|GG667788.1| GENE 1 2 - 541 507 179 aa, chain + ## HITS:1 COG:PM0638 KEGG:ns NR:ns ## COG: PM0638 COG4667 # Protein_GI_number: 15602503 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pasteurella multocida # 2 166 117 278 280 112 35.0 4e-25 CKTGKPEYFEKGKCEDFIEAVKASSSLPILSRMIDVDGKKYLDGGCSMPIAYQRAIDEGY EKIIVILTRNQGYRKKPVDKWTRRAYEHYFKPLPRLLKSLEKAPERYNRMQEEIDELEKA GRIYVIRPDKPVTVSRTEQDRKKLEALYEDGRRLAVEQLADIKKYLGIETTGYPAESNH >gi|229783947|gb|GG667788.1| GENE 2 567 - 2984 2649 805 aa, chain + ## HITS:1 COG:SA1721 KEGG:ns NR:ns ## COG: SA1721 COG0210 # Protein_GI_number: 15927479 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Staphylococcus aureus N315 # 2 638 3 640 730 652 53.0 0 MSIYDTLNEVQREAVFHTEGPLLVLAGAGSGKTRVLTHRIAYLIEEKGVNPWNILAITFT NKAAGEMRERVDKIVGFGAESIWVSTFHSSCVRILRRHIESLGYTTNFTIYDSDDQRTLM RQVLKTLELDPKLYKDRAMLSLISTAKNELVTAAEFELNAGGDFRQKKAAQIYKEYQGQL KKNNALDFDDLIMKTVELFQNNPEVLDYYQERFKYIMVDEYQDTNTAQFKLVSLLAGKYR NLCVVGDDDQSIYKFRGANIGNILNFEKSYPGAAVIKLEQNYRSSQNILNAANEVIRHNR 
GRKSKTLWTANEEGPLVQYKQFETAAEEADAIVRDIQNSGCGYQDCAVLYRTNAQSRLIE EKCLQRNVPYRMVGGVNFYQRKEIKDILAYLKTIDNGRDDLSALRIINVPKRGIGGTSVG KVQAFASEHDFSMYDAFCRAQAVPGLGKAADKILKFTALIEDFRERIRYEDYSIRELIED VLDESGYQKELEAEGEIEYQTRLENIEELINKAVSFERDHDEGTLSEFLEEVALVADVDR MDDSENRVTLMTLHSAKGLEFPRVYLSGMEDGLFPSMMSISSDDKEEIEEERRLCYVGIT RAQDFLMLTGAKQRMINGETRYSKTSRFVDEIPADLLDCDKLEPRLSSFRKSSDGYGDSR AERVSDRFDDSDLPWGQGGKNARGGSSWGGSGAGAAGSGGGASGGQRPTTWGNKAAGVSS FGPGSNAYASKTASPVQKPAFGKAFTVQKSGALDYKEGDRVRHIKFGEGNVVEIKDGGKD YEVTVCFDKAGIKKMFASFAKLKPV >gi|229783947|gb|GG667788.1| GENE 3 3168 - 3461 337 97 aa, chain + ## HITS:1 COG:no KEGG:Closa_1728 NR:ns ## KEGG: Closa_1728 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 97 1 96 96 90 85.0 3e-17 MNRFDTMSKEAMNRVEELVNSSRLNELLHKREEDEKKKNCILWILAIIGAVAAVAGIAYA VYRFFTPDYLEDFEDDFDDDFDDDFFEDDTPGEGSEQ >gi|229783947|gb|GG667788.1| GENE 4 3647 - 5017 1359 456 aa, chain + ## HITS:1 COG:CAC1435 KEGG:ns NR:ns ## COG: CAC1435 COG2265 # Protein_GI_number: 15894714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Clostridium acetobutylicum # 1 443 4 446 456 441 51.0 1e-123 MKKGEIYEGTIEKFDFPDKGTLVTESGKITVKHALPGQKVSVMITKKRGGRLEGKIVEVL EKSPLETAEKPCVHFGTCGGCIYQTIPYEKQLEIKEKQVRSLLDSVCASYEFEGILASPQ PCGYRNKMEFSFGDEFKDGPMALGMHRRGSFYDVVTTTDCQIVNSDFCEILKASKDYFEE LGLGFYKKMQHTGYLRHLLVRRAVKTGEILVDLVTTSQEEPDLMPFAERLLSLPLNGRIV GILHTWNDSLSDAVKNDKTDILYGQDYFYEELLGLKFRISPFSFFQTNSLGAEVLYEKAR SYVGDTKDMVVFDLYSGTGTIAQMIAPVAKKVVGVEIVEEAVEAAKENAKMNGLENCEFI AGDVLKVIDEIEEKPDLIILDPPRDGIHPKALGKIIDYGVNRIVYISCKPTSLVRDLAVF QERGYRVEKVCAVDMFPSTGNVEVCCRMSLVDEKGK >gi|229783947|gb|GG667788.1| GENE 5 5271 - 5417 96 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVNQLKKTQNTGKVNRRWLKKVNIKSFFKFIVIQVIILCKKFMHIKQI >gi|229783947|gb|GG667788.1| GENE 6 6334 - 6453 66 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MDSKFLEELRQKHIQNPPKGMTSKLVRTMSDSDLLDMHS >gi|229783947|gb|GG667788.1| GENE 7 6822 - 7943 947 373 aa, chain + ## HITS:1 COG:no KEGG:Trebr_1975 NR:ns ## KEGG: Trebr_1975 # Name: not_defined # Def: hypothetical protein # Organism: T.brennaborense # Pathway: not_defined # 8 363 6 372 425 151 29.0 4e-35 MDKVMIMLKDIPVLEIENYSCRILDYDLLPISLRYPGVNYDDVMHGWTENRTMNIGKTNA KKLLAGFRISQSNPYMIARLFHFASLSDCYWLKEAGENLTWDQVSLFRNPLEKAVSATAL LGVNKTFWPVVQKIYTPELTVQGMAAKAWIREDDGLYLYKVGKKELAASKILNALGIPHV DYKEAESYRLEQIADEAHIKKIYDNGEKIVKCKIISSEETAIVPWEDFQVYCSYHDENEY DFIRKKDPDRYYSMQVADYILGNEDRHGANFGFFMDNKNGGLKGLYPLMDHDHAFSEEKD IPSQTSEIDETLQDAAINAIRYTEVKFDQVLEMNRPEDLEEKSWTMILERCRELKQEQKI QPFRGHDEQTLKE >gi|229783947|gb|GG667788.1| GENE 8 8558 - 8839 306 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871416|ref|ZP_06117503.2| ## NR: gi|288871416|ref|ZP_06117503.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 93 16 108 108 171 100.0 1e-41 MISFEEIFGELLTESVNTLLREKYKRPYDEVTEERANHARGMEEQLKFLTGEQLAVVDAC LDDFIDRCGEDQEFLYRKGVEDGIRIMKGIQKL >gi|229783947|gb|GG667788.1| GENE 9 9546 - 10013 153 155 aa, chain - ## HITS:1 COG:CAC0724 KEGG:ns NR:ns ## COG: CAC0724 COG1309 # Protein_GI_number: 15894011 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 146 32 176 185 60 26.0 1e-09 MAVKQLMEATDLSRQTFYQIFNSKEEILEYYFDTIFSHFIIDFKKHTIHNLCDAAKIFFA FFEKYKPTLKLIIQNGKSCVIQRKCREYLRNEEFIQYHLQGIRSDQDQDYSTTFVISGIV AMLEQWIREDFTTAPSANDLALLVCRITGMPTATQ >gi|229783947|gb|GG667788.1| GENE 10 10254 - 10553 251 99 aa, chain + ## HITS:1 COG:STM3238 KEGG:ns NR:ns ## COG: STM3238 COG3681 # Protein_GI_number: 16766537 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 10 89 17 96 436 63 50.0 6e-11 MNKCEMLELLHKSVMPAVGCTEPACVAIAAADSARAVGGTIEKIRLEVSSNIYKNGMSVG IAGFNNVGLKYAAALGALIGKQEMGLEIMNMSLVFLQKV 
>gi|229783947|gb|GG667788.1| GENE 11 10556 - 11266 392 236 aa, chain + ## HITS:1 COG:yhaN+M KEGG:ns NR:ns ## COG: yhaN+M COG3681 # Protein_GI_number: 16132252 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 227 210 435 436 127 34.0 2e-29 MNQQLANFGIEYEAGIGIAGVLHLNSGGSFIGDSILEKIMIQVAAATEARLDGCPYAAMS SAGSGSKGIAVILPVAQVAEAVNADRKKMLQVIAFAHLLNEYINGYIGKLSAVCTCAVAS ATAASAAITWLLGGDDEQIGYVIRNMTGNITGMICDGGKVGCALKLSTASGAAFMSALLA INGVGLRITDGICAETPEACIKNIARVGNPGMLQTDKVIMHIMLSKGVTNIREDNM >gi|229783947|gb|GG667788.1| GENE 12 11289 - 11621 278 110 aa, chain + ## HITS:1 COG:no KEGG:CLI_0908 NR:ns ## KEGG: CLI_0908 # Name: not_defined # Def: CGGC domain-containing protein # Organism: C.botulinum_F # Pathway: not_defined # 1 110 1 110 110 186 80.0 4e-46 MRVGIIRCMQTEDFCPGTTDFKMIQEKRGAFEGIEEDIEIVGFINCGGCPGKKTVLRAKE LVKRGADTIVFASCIKKGTPIGYPCPFAKRMHDIVSEAVGENITILDYTH >gi|229783947|gb|GG667788.1| GENE 13 11736 - 11952 124 72 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_1164 NR:ns ## KEGG: Pjdr2_1164 # Name: not_defined # Def: transcriptional regulator, AraC family # Organism: Paenibacillus # Pathway: not_defined # 12 72 12 72 308 79 52.0 3e-14 MKYKLISLQKELEVKKIATLFYMEYDKNFDFPGEQHDFWELVYCDYGEVIISADGNKFNL KKGMIAFHKPGE Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:54:28 2011 Seq name: gi|229783946|gb|GG667789.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld182, whole genome shotgun sequence Length of sequence - 9623 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 4, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 44 - 430 379 ## gi|266624573|ref|ZP_06117508.1| MgtC family protein 2 1 Op 2 . 
+ CDS 454 - 1293 734 ## HMPREF9243_0629 hypothetical protein 3 1 Op 3 7/0.000 + CDS 1354 - 2520 1323 ## COG1840 ABC-type Fe3+ transport system, periplasmic component 4 1 Op 4 17/0.000 + CDS 2564 - 3589 1207 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 5 1 Op 5 . + CDS 3592 - 5313 1851 ## COG1178 ABC-type Fe3+ transport system, permease component 6 1 Op 6 . + CDS 5330 - 6001 514 ## COG5255 Uncharacterized protein conserved in bacteria + Term 6064 - 6105 5.3 - Term 6052 - 6091 8.1 7 2 Op 1 . - CDS 6162 - 6587 491 ## SpiBuddy_2157 C_GCAxxG_C_C family protein 8 2 Op 2 . - CDS 6643 - 7407 563 ## Closa_3092 hypothetical protein + Prom 7531 - 7590 6.7 9 3 Op 1 . + CDS 7630 - 8040 544 ## Ccel_1035 glyoxalase/bleomycin resistance protein/dioxygenase 10 3 Op 2 . + CDS 8054 - 8710 782 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 11 4 Tu 1 . + CDS 8813 - 9622 726 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|229783946|gb|GG667789.1| GENE 1 44 - 430 379 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624573|ref|ZP_06117508.1| ## NR: gi|266624573|ref|ZP_06117508.1| MgtC family protein [Clostridium hathewayi DSM 13479] MgtC family protein [Clostridium hathewayi DSM 13479] # 1 128 15 142 142 216 99.0 3e-55 MTTSAGIWATVGVGMAEGAGMYLIGVITTAMIILVEVFLGRRGVLADKQEEDRKIIVEYI DDIQKSRETTGYVKKLCEERGYRILAMETRRMGEKIQLQTMISASKDYDAADMTAALTED RDITKIVV >gi|229783946|gb|GG667789.1| GENE 2 454 - 1293 734 279 aa, chain + ## HITS:1 COG:no KEGG:HMPREF9243_0629 NR:ns ## KEGG: HMPREF9243_0629 # Name: not_defined # Def: hypothetical protein # Organism: A.urinae # Pathway: not_defined # 4 277 2 266 268 235 45.0 1e-60 MVAIKILETFKKGKVNERLCEDAFFQSENYIAVIDGVTSKTDFLYHGKTTGKLASEIIYR VLETLRGDEPLREIVERVNGEITAFYDQIDFPYDRGEKGLQAACVIYSAFYREIWMIGDC QAVVDGEVYLNPKKSDVVLSEMRSLILSTLEQERRNTTGLYREEWEEHDAAREIILPWIL 
RATIYANDDTTEYGYSVFNGQAIPESLIKVIRLREGAHEVILTSDGYPEVKETLRKTEEN LKEILIHDPGCYKKYLSTKGVKKGQTSFDDRTYVRFMTD >gi|229783946|gb|GG667789.1| GENE 3 1354 - 2520 1323 388 aa, chain + ## HITS:1 COG:SP1826 KEGG:ns NR:ns ## COG: SP1826 COG1840 # Protein_GI_number: 15901655 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 52 386 28 352 355 162 32.0 7e-40 MKLKNAVLIGILAAAAVSTAGCGKKEAVPSDTAAQTEAAKQTDAGQTEAQGNSEEEAPAD NQAVLSQKPTGKRLVLVTNNASDGRDVWLSEHAQEAGFNVEIVALGAGDATARVISEINN PTTNVIWGPSEDQFTAMAEAGSLAKFTPEWAEDVKGLSKENGYSWSYEIQPKLLVCNPDV YTEETAPKSYQDLWEKEEFHGKYAVPTTFDGNTNRAIVGGILVQYLDPAGELGVSDEGWK AIKAYFDNGYKTPKGENDFGNMASGKVPITFTFASGLKGKSESYNIKPLIVYTEVGEPTN TNQIAVVANKDQEMVEESMRFANWLGSAEVIGAYAAENGNMVANKKAEDSMVPIIKEVKE HYKPQEADWAYVNSMMDDWVAKIQLEIY >gi|229783946|gb|GG667789.1| GENE 4 2564 - 3589 1207 341 aa, chain + ## HITS:1 COG:SP1825 KEGG:ns NR:ns ## COG: SP1825 COG3842 # Protein_GI_number: 15901654 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Streptococcus pneumoniae TIGR4 # 1 338 1 336 336 336 52.0 5e-92 MIKFNNIEIKFGDFTAIKDLTLEVKKGEFFTFLGPSGCGKTTTLRSLVGFIKPTKGNIVV KGEDITDKPIEKRGIGMVFQSYALFPTMTVEENIVFGLKENKWAKDAIRKRVAEVSAMVN LTDEQLKKNVSALSGGQQQRVAIARALALNPEIIVLDEPLSNLDAKLRKSLRLELKKIQK ASGTTMIYVTHDQEEALTLSDQIAVFHNGGVEQVGTPREIYYHPASEFVTTFIGDTNRMT ESLITQLNEQNKGAGLNPSDHIYVRVENVKESVHAGEENRYYKLQARFETEEFYGITTRR TFLAGDTEIKSVNTKNDEKLEKGTEAVLYIDPADLIIFREG >gi|229783946|gb|GG667789.1| GENE 5 3592 - 5313 1851 573 aa, chain + ## HITS:1 COG:SP1824 KEGG:ns NR:ns ## COG: SP1824 COG1178 # Protein_GI_number: 15901653 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 9 560 2 553 563 407 40.0 1e-113 MEKLSANYRFKKNLGFTLLTFVLGWAIFTFLLFPNLSLLKITLFPEGKFDAKPVMQIMHS 
ARVVRALIHSFVLAVSLTVTVNVLGIFEILVLDYFDIKGSKWLNIAFHSPLICNGMVLVT AYNFLFGSQGFLTANLMQLYPDMNPYWFRGFGAVLIEMTFAGTSNHIMFVRDSLKNIDYQ TIEAAQNMGVRAEKILSRVVLPTLKPSIFAASILTFIVGISAFATPQVLGGENFETINPL VMSFSKTLTTRNYAAVLALFLGIITILVLGISNYIESRGNYVSVSKVKTPLKKMKIKNPV LNGIVTVTAHIAALVQTIPLLFVFIFSFMSVQDLYAGNISLSNLSINNYLTVFRSTTGLK PVLTSVLYSAAAAVTVVTIMLFIARILTKYHNRLTGTLEMLLQIPWFLPATLIALGLVMT FDMPTPFTFGTTLTGTVYIMLIGYIILKLPYTLRMIKAAYAGLDGSLEEAAKNLGASPAR TYLKVLFPILWPTILSVFLLNFIQQLAEYNVSVFLFHPMFQPLGVVLNTATSPDSSPEAQ MLTFVYSVIIMIVSTATIVFVYGRKSAALKERG >gi|229783946|gb|GG667789.1| GENE 6 5330 - 6001 514 223 aa, chain + ## HITS:1 COG:CC1790 KEGG:ns NR:ns ## COG: CC1790 COG5255 # Protein_GI_number: 16126034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Caulobacter vibrioides # 5 213 40 258 264 85 29.0 7e-17 MILEDAVYKFDENGSPLPYYGNTIVSYLNDQRWPVFDEAVRVQEALKKCGFSQKLAFLPP SSFHMTVLTLCREIDRGTPYWPKGMAADETFQGIDRILKERVAAVPLPEGVFVEVEDCEE IRIILKPHDKESADKLRAYRDQVAEVTGIRHSWHDTFRYHLSLDYLLKPLDEEEKRERDA VCARFAGELRRTVKPFLLPKPEFVIFNDMMSYETDLSKRGAAY >gi|229783946|gb|GG667789.1| GENE 7 6162 - 6587 491 141 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_2157 NR:ns ## KEGG: SpiBuddy_2157 # Name: not_defined # Def: C_GCAxxG_C_C family protein # Organism: Spirochaeta_Buddy # Pathway: not_defined # 6 137 3 133 141 103 39.0 2e-21 MSDKIEEKVQEALNRHNRGYNCSQSVACTFCEELGIDEATMFRMTEGMGLGMGSMDGTCG AIGAAAVLSGMKNSTVHLDHPDSKAVSYKAARQCLNEFKEQNGSVICRDLKGVDTGNVLR ACNDCIADAVRIIEKELYGEG >gi|229783946|gb|GG667789.1| GENE 8 6643 - 7407 563 254 aa, chain - ## HITS:1 COG:no KEGG:Closa_3092 NR:ns ## KEGG: Closa_3092 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 249 1 249 258 318 66.0 1e-85 MKEFIIKTAWMMVPPQPYSAFHICITAAGFLIAVLAAYVLSRKMERTFFIRLLFFCGVIL TASEIYKQLFLYYVVNQEKYDWWYFPFQLCSLPMYFCLLLPLVKNRKLQLIILTFMQDFN LLGGIMALAEPSGLLHPYWFLTMHGLLWHILLIFIGLFIGFSHESDSSFTGYVRTLPLFL LCCGIATLINVTARPLGQADMFYISPYYPNNQAVFYTISQKIGVLPGNIIYLLSICIGGL 
LFHMIFARLFRHTE >gi|229783946|gb|GG667789.1| GENE 9 7630 - 8040 544 136 aa, chain + ## HITS:1 COG:no KEGG:Ccel_1035 NR:ns ## KEGG: Ccel_1035 # Name: not_defined # Def: glyoxalase/bleomycin resistance protein/dioxygenase # Organism: C.cellulolyticum # Pathway: Pyruvate metabolism [PATH:cce00620] # 2 127 3 127 128 117 46.0 2e-25 MITGLDHITVNMADREASFRFYEEVLRLEKLYDVDMGDHRLTYFNLNGCTRLELIEYYND SPAHPEPSDDKGIYRHMALTTDEIGRLFDRCTAYGVKVLMEPVWVEKLNWTGMLVRDPNG VEIEIVQRESSAATDI >gi|229783946|gb|GG667789.1| GENE 10 8054 - 8710 782 218 aa, chain + ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 218 1 222 530 89 26.0 6e-18 MYKLLIVDDEADHRKGLMNLLAAMRPEDMLLEADDGVTALEVIELVDCDVIITDIRMTNM NGLELLKAAKAQNPDVEVIILSGYGQFDYAREALKYQAADYIMKPVDSDEIRTVLNKVTE RISERQKSRNRQESIENRLTETMPVYMDHLMNLFVQKPDFAGKERLKDMFQLERPGYIFC CHVAGGGRELSEEQRNDLKYQMKKSLDPCSSYPFFLEH >gi|229783946|gb|GG667789.1| GENE 11 8813 - 9622 726 269 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 19 262 276 520 525 90 27.0 2e-18 MICSPGGMVDNLYEEASVSYEEAVRYSQYEFYEPGTLLSEETIGKRIGKNSDAVTYHVLP VVEGLKRGELRSAYELLAAEIDEGVKTLFSPPVLLKQRVIFALFQIMRNFEVLISGNFKK EANLQISSLQKAETLTLLKRSLYAFFLDLGKELNCQKEGNGRDIMTDCGEYLENHYMDEI NLESVAEKYHFNPSYFSTLFKNTYKSTFSEYLIRIRMEKARELLLTTESRVRDIAVLTGY RDANYFVRAFKRYYEMTPEEYRKMSGRTE Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:54:54 2011 Seq name: gi|229783945|gb|GG667790.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld183, whole genome shotgun sequence Length of sequence - 10264 bp Number of predicted genes - 11, with homology - 11 Number 
of transcription units - 3, operones - 3 average op.length - 3.7
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 7/0.000 - CDS 81 - 1796 1821 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
2 1 Op 2 . - CDS 1808 - 3343 1985 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
- Prom 3515 - 3574 80.4
3 2 Op 1 1/0.000 - CDS 4423 - 4659 318 ## COG0703 Shikimate kinase
4 2 Op 2 . - CDS 4656 - 5528 1045 ## COG0169 Shikimate 5-dehydrogenase
5 2 Op 3 . - CDS 5509 - 6036 461 ## COG2179 Predicted hydrolase of the HAD superfamily
6 2 Op 4 1/0.000 - CDS 6045 - 6497 412 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains
- Prom 6562 - 6621 2.8
- Term 6564 - 6613 -0.2
7 2 Op 5 . - CDS 6681 - 6941 301 ## COG1873 Uncharacterized conserved protein
8 2 Op 6 . - CDS 7005 - 7832 740 ## Closa_2336 hypothetical protein
9 2 Op 7 . - CDS 7832 - 8470 512 ## Closa_2337 hypothetical protein
- Prom 8511 - 8570 12.2
10 3 Op 1 22/0.000 - CDS 8707 - 9609 1239 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen
11 3 Op 2 .
- CDS 9751 - 10191 526 ## COG2011 ABC-type metal ion transport system, permease component Predicted protein(s) >gi|229783945|gb|GG667790.1| GENE 1 81 - 1796 1821 571 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 8 546 9 574 602 153 24.0 9e-37 MLNWYYGSASIRKKLVISYLLLITVPILVLGVYSFSVSKRNMERQTEETIRNNVRLMASD LETSLKRENDNIKYLSYNAKFRQVLKNSRGNQVEVAQVLNEVVEPIFWYFITSDNNMKGI EIYTPYVSQEVGSFLKPVGDSSEESWYQYHQKNFGTLWTFENGSMFATRTVLDAQTASEP IGVLKLTVHPASVLEPISNNTFLNNGILLTDTEGKTIYSRQTGAEPADQAVLAGAEAGNL ADGDQGEYIVMSEAISTSGWRLYYYVTKSEITGQLYEILKSTLLIVGLCLAVSLFLIGIL SRVLSRRILELKESAEKVAEGNFNLELNHNDTDEIGIVSKSLQTMCDRLNEMINRVYKAE LEKKAMELKALQAMINPHFLYNCLSSIKWKAIRTGNDEISDLTGLLAKFYRTSLNGGKQI TTVGSELENVKSYVELQRMTHENSFDVEYELDETLFETSMPNFLLQPIVENAIEHGMNYQ EEIEGRGRLIVRLEEEDGHLIFSILNNGPRVELDRLEEILNTPGRGYGIYNIEERIRIYY GAGCGVYASVTADGYTCFTVRIGKELKELSV >gi|229783945|gb|GG667790.1| GENE 2 1808 - 3343 1985 511 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 505 1 522 525 121 22.0 3e-27 MFRILLVDDEAQERDGIRFLIEKYKLPLTVVEAQNGQKALEYLKEHPVDILFTDVKMPYK DGLELAKETFQFNQSIRIVIFSAYGEFEYAKRAMEANAVDYLLKPIELDEFLKVMHKVIG GLERQREESKQEREQRDASMKRVFCKAFTGGSVSAADQEMLTEYFQKLGKERKVLVSLES GRNIFEQQEDVFLKLLHTYIRIPCEYVNLYPDSSVVLLADENRIVGGELQKQLEKLVRDG KTLLGTEISIVVSREFETVEEMNRQMDEISQIRKNFYGTGGEITFLSEIVLNSEYYAAEV ERIKELMIGAIGDRNDDLVILYGDRLVETMVGGHVVSRIYVHHVFYDILSAFYSQYGIND KPRLFERVERLLSYKDGEELEVNFRAVLNEVLAKRTDAAGDSPRIVERIKNIVRDEYRND IGLEDIAERVNLAPAYLSYLFKKETGANLVKYITDYRMEQARKLLEEGELKIVLVAKACG YENQPYFNRLFKNYYGMTPRQYREKNGKQQG >gi|229783945|gb|GG667790.1| GENE 3 4423 - 4659 318 78 aa, chain - 
## HITS:1 COG:XF1335 KEGG:ns NR:ns ## COG: XF1335 COG0703 # Protein_GI_number: 15837936 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Xylella fastidiosa 9a5c # 3 69 6 72 180 67 50.0 4e-12 MRNVILIGFMGAGKTTVGERYGKKHGLPMVDTDWLIEERAGMTISEIFARKGEEEFRKTE TAVLQALLSDTQEKVISV >gi|229783945|gb|GG667790.1| GENE 4 4656 - 5528 1045 290 aa, chain - ## HITS:1 COG:MJ1084 KEGG:ns NR:ns ## COG: MJ1084 COG0169 # Protein_GI_number: 15669272 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Methanococcus jannaschii # 1 265 1 263 282 190 41.0 3e-48 MISGKSKVCGVMAYPVEHSMSPLMHNYYSEQTGTDLAYIPLKVEPGCVEAAVKGAYALNF TGLNVTVPHKQEVMKYLAEIDEDAKAIGAVNTLVRTEGGYRGYNTDAAGLKRAMTEAGIE IRDRSCILFGAGGAARAAACVLAKEGAAVIYALNRSVEKAEALSEIINERYGHRVMIPMA LADYGRLEGSSYLAVQTTSVGMHPNVDHAPVEDEEFYRKIDTAVDIVYTPMETKFMKYVR AAGGKAVGGLDMLIYQGVIAYELWSPETVFTRDIIDGARVLMREELEAAR >gi|229783945|gb|GG667790.1| GENE 5 5509 - 6036 461 175 aa, chain - ## HITS:1 COG:BH1322 KEGG:ns NR:ns ## COG: BH1322 COG2179 # Protein_GI_number: 15613885 # Func_class: R General function prediction only # Function: Predicted hydrolase of the HAD superfamily # Organism: Bacillus halodurans # 1 163 1 163 171 107 34.0 1e-23 MFNRFYPDEDVASAYDIPYDALYREGVRGVIFDVDNTLVPHDAPADERAKNLFSHLRALG MDTCLLSNNKEPRVAAFAEAVGGSNYIYKGGKPGVKNYKKAMELMGTGLQSTIFVGDQLF TDVYGAKRTGIKSFLVKPINPREEIQIVLKRYPEALILFFYRRDKKRRQYDKRKK >gi|229783945|gb|GG667790.1| GENE 6 6045 - 6497 412 150 aa, chain - ## HITS:1 COG:CAC1698 KEGG:ns NR:ns ## COG: CAC1698 COG1327 # Protein_GI_number: 15894975 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Clostridium acetobutylicum # 1 150 1 150 151 164 58.0 4e-41 MKCPFCNEADTKVIDSRPADDNSSIRRRRQCEKCGKRFTTYEKLETLPLMVIKKDDSREV YERSKLEKGILHSCHKRPVSSQQIDSMVDEIETQIFNMEEKEVSSATIGELVMKKLQEVD EVAYVRFASVYREFKDVNTFMEEIGKLLKK >gi|229783945|gb|GG667790.1| GENE 7 6681 - 6941 301 86 aa, 
chain - ## HITS:1 COG:BS_ylmC KEGG:ns NR:ns ## COG: BS_ylmC COG1873 # Protein_GI_number: 16078600 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 75 2 76 81 68 40.0 3e-12 MRISDLKQKEVINICDCKRLGYVGDVDFDMDTGCLLAIIVPGPGCFCGFLVREKEYIIPY SDICQVGRDIILVKVDLEKTEEKCKI >gi|229783945|gb|GG667790.1| GENE 8 7005 - 7832 740 275 aa, chain - ## HITS:1 COG:no KEGG:Closa_2336 NR:ns ## KEGG: Closa_2336 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 262 1 265 276 169 39.0 8e-41 MKTTSSDLKKRAKLTLSGNYGTAVGAMLIVDVFLIVICMVFIGVSMAGTFSIIGGAGYIR SSMTRSLVMMLVIYFIAILLIYLVMAGIRRMFYHMSTGRPFALGDMLFAFTHRPQRFLGL YFINMAFGMAAGIPYFVVSFSARITGYIPILMALQFLMYLLQLIGIVVYSLYFKMAGYLL VEDPERTVISCLRESAALMKGNKGRLFYLDLSFIGMYLLGAGSFGIGFLWILPYMETTMI HFYLDINAGRRQEEACYDQEPFYDGSSGFYENVTE >gi|229783945|gb|GG667790.1| GENE 9 7832 - 8470 512 212 aa, chain - ## HITS:1 COG:no KEGG:Closa_2337 NR:ns ## KEGG: Closa_2337 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 107 212 6 114 115 102 55.0 1e-20 MEDDRQYEYDGPDSEREPVHEEPAAMGEKPAAAGGAAAGRCEEPGGGIGQGFDEMERYPD QYSRQPEEWTQAGYQQEENYQAGYRQEENYQNEYGQNGYGQNGYQQQTGYYRDEYRQAVN SGYKQPQRQSSMALASLIMGIIGIVTSCCCYGGLIFGSLGILFALLSKAGDTMEGYARAG LITSVIGLFLAAMALIFVFGIMSSSLLSGGVY >gi|229783945|gb|GG667790.1| GENE 10 8707 - 9609 1239 300 aa, chain - ## HITS:1 COG:jhp1472 KEGG:ns NR:ns ## COG: jhp1472 COG1464 # Protein_GI_number: 15612537 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Helicobacter pylori J99 # 60 300 37 271 271 225 51.0 7e-59 MKKSLYVLAAAFFAAGALTACGGNNKETTAGTTTAAVTESKETESAAPADETTAEAGELK KLVVGASPAPHAEILKAAKDILKEKGYDLVIKEYTDYVQPNLALESGDLDANYFQHFPYL EQFNQENGTDLVSAGAIHYEPFGIYAGKTASLADLADGAQVAVPNDVSNEARALLLLADN GLIELKEGVGLEATKNDIVKNDKNLKIMEIEAAQLPRSLGDVDIAVINGNYAIEAGLKVS 
DALATEDSQSIAATTYGNVVAVRKGEEKSDATLALIDALTSETVKQYIEDTYQGAVVPLF >gi|229783945|gb|GG667790.1| GENE 11 9751 - 10191 526 146 aa, chain - ## HITS:1 COG:VC0906 KEGG:ns NR:ns ## COG: VC0906 COG2011 # Protein_GI_number: 15640922 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Vibrio cholerae # 2 144 83 225 225 120 54.0 1e-27 MVIPITKKIVGTTLGAKAVIPPLVIAAAPYVARVVESSFKEVDAGVIEAAKSMGASTMQI VWKVLLPEAKPSLLVGAALSVTTILSYSAMSGFVGGGGLGDIAIKYGYYRYQTEMMFVTV AILVIIVQIIQEAGMKLSRVSDKRIR Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:55:05 2011 Seq name: gi|229783944|gb|GG667791.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld184, whole genome shotgun sequence Length of sequence - 8087 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 1 - 1629 1687 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Prom 1795 - 1854 5.3 2 1 Op 2 35/0.000 + CDS 1927 - 3357 1575 ## COG1653 ABC-type sugar transport system, periplasmic component 3 1 Op 3 38/0.000 + CDS 3493 - 4398 794 ## COG1175 ABC-type sugar transport systems, permease components 4 1 Op 4 . + CDS 4401 - 5240 888 ## COG0395 ABC-type sugar transport system, permease component 5 1 Op 5 . + CDS 5281 - 7557 2017 ## AciPR4_1095 hypothetical protein + Term 7594 - 7652 14.6 + Prom 7699 - 7758 4.6 6 2 Tu 1 . 
+ CDS 7805 - 8087 118 ## gi|266624601|ref|ZP_06117536.1| putative SAM-dependent methyltransferase Predicted protein(s) >gi|229783944|gb|GG667791.1| GENE 1 1 - 1629 1687 542 aa, chain + ## HITS:1 COG:lin2119 KEGG:ns NR:ns ## COG: lin2119 COG2972 # Protein_GI_number: 16801185 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 256 538 288 574 577 151 27.0 4e-36 EFAQKLNIDSRLYEIFKNLDSTDEVSLMNANSRITAILNAYKPWGSNIYSVHLVTSYFRF GEEDKNFYPRGAFMNSRLEKEAREAKGSVCWIPTYSYTEMFGISNLTDTQVEYGSLFTAV CQMNFCDVSSGRVERLSETVEKPILVINYTEEFLMSLIRRYGAKDIVNDADYLVIDREGN VICSSDPAYDASHRYQADWLGTLKKGTNSGYTTEMDGREKYMVTYAQSEITDWLVVARVP VHLLIKDVERNLRSYIIAIMVILLFLSAVSSYMISRYINRKIYGTIHMIDQVGTGQFGSM ITYDDRDEFSFFYEKLNEMSRNIKALIHENYEVKLQQKDFEIMILTIQLNPHFLYNTLNI INWTCLAGDTPKASRMIVDLSRMLQYTSQSRKQLVYLRDDIDWLKRYIEIMQIRYENQFE VNFDIPEELLGKEVPKLFLQPLVENAIVHGFKNSEEKGILEISAETDEENIYFTVEDNGV GMTQQRVKEVLRFTGTSIGVANTDRRIKILYGDAYGVSLHSQLGEGTAVVVTIPKQGGSL PH >gi|229783944|gb|GG667791.1| GENE 2 1927 - 3357 1575 476 aa, chain + ## HITS:1 COG:BH1864 KEGG:ns NR:ns ## COG: BH1864 COG1653 # Protein_GI_number: 15614427 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 34 468 24 438 461 125 27.0 2e-28 MKHNHLMQLAALLTAGTMLLTGCGGNGGNTTTAGAGTAEGKTEAAQSETGGEPVEITFTY WGSGAEKTAIEASVKTFEEANPDIKVKAMHIPSDDFLTKLNAMIAAGETPDISYSASWKC QFGEDGLIYNFYDLIKDDPEVSVDDYLETCWWNWSPTESAGPIMANVTLSLMYNEDLFKD AGIDTPPTKVEDAWSWQEFVEKAQELTLDSAGRNAKDPAFDANNITQYGVMFAPNWNVYM PFIYSNGGAYLNEDMTEFGLNDPKAAEVIQNFADLINVYHVAPTAIQSNSMPAAATALAS RQAAMYIDGSWNHLDLAEAGINWGVGVLPIDKNYTTFFDGGSLIIFKATEHLEETLKLYQ WITNPESSAEITEMFRSIWLPVHKEYYTNDEKLDFWASEEFPARPEGFKDAVVQSTYDHQ VIATEIDVKNFNEIDTMVKAALQQVWAGDKTAEEAMTEIKTSVDPLVDGTYSGNRS >gi|229783944|gb|GG667791.1| GENE 3 3493 - 4398 794 301 aa, chain + ## HITS:1 COG:BH1118 KEGG:ns NR:ns ## COG: BH1118 COG1175 # Protein_GI_number: 
15613681 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 4 301 18 314 318 218 39.0 9e-57 MKGKIKMTHREKRENVAGFLFCLPCILGFLFFAAIPMIISFIMSFSNYSIVKESHFVGLD NYVRLFAGQDPYFYKSLTVTVIYVVLSVPTSIVFAFLIASLLNKSVKGKGLFRTLFYLPS IIPIVAMGAIWMWIFNPDVGLANSLLKAVGMPPNKWLSSESTVLPTLVFVNLWTTGSTMV IFLAGMQDVPRQLIEAVEIDGGGVLAKMFHVIVPIMTPTIFYNVVMGFINGFQIFTQSYV MTQGGPNNASLFYVYYLYRESFQFMRMGNSCAIAWILFVIIMILTAILFKFQNRWVYYGG E >gi|229783944|gb|GG667791.1| GENE 4 4401 - 5240 888 279 aa, chain + ## HITS:1 COG:BH1119 KEGG:ns NR:ns ## COG: BH1119 COG0395 # Protein_GI_number: 15613682 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 4 278 2 281 281 212 43.0 6e-55 MSAKMKKTVGKVITYAVLVAGSLFCIVPLVWMVRSSLMTPVEIFMVPVKWVPDKMMWSNY RKVFEVLPFFTYYRNSLFIAALVVAGSMLTSSVCAYGLARIKWKGRNVVFACIMGSMMLP FAVTLIPTFLMWRAIGFTNSFVPLIAPAWFGGGAFYIFLLRQFYMTIPKDFDEAAYLDGA NHFQIFTRIILPITKPSLAVVGLFTFMNTWNDFLGPLVYLNDEAKYTVALGLQLFTGSYR SEWHLMMAAACLVLIPVILVFIVGQRYLIEGITMSGVKG >gi|229783944|gb|GG667791.1| GENE 5 5281 - 7557 2017 758 aa, chain + ## HITS:1 COG:no KEGG:AciPR4_1095 NR:ns ## KEGG: AciPR4_1095 # Name: not_defined # Def: hypothetical protein # Organism: T.saanensis # Pathway: not_defined # 221 756 213 729 732 270 29.0 2e-70 MTTIKDMVHAPVNPFFSVAVKDGKYECGNRESYYAKKLPDETFSYDISNNKVLANTDEFG TLKTVTFYRGCYTCDDIPGVWVSKDFGQAGPFCFGMTIEGEAVSLCSGSLPCQSDLAENL FPRAVFTHPSLTATVLAYAPVSADGKTRPRALIYGLYLENTTEGDVSGTVIPPEFSPDSD YFTIPDASGKMVECETMVECGKKAKCVPFHLKKGCGQWLPMVIYAPGETDAVKVIEAAGT LHWLNETLAYFRGMLGNLTMDADPLTAAIFERAVYQGISAVGMDRDGEICGSNWGTFPAT RQIWMKDMYYSCLPLSMLNPDFFRQACLWFLEHGIRPEGSKYQGGVAHSLSNSLTAVLMG VLYYEATADREFFLTHREVADAFEEILEEVLALKEGDVWLFPTVWISDALALGKYHTGSN ICAWKAFDGMARFMGEVFGDQEKQKKYGMISRNIKEAIETYMVTEGRFGPQYLEGIGGLT RDTQKNYPISEYEKKYVDQALTFYPDIINGDTINLMMHDGEESDTTLIPFYGYKTYDDPV VRNYARFSSSEENPTYGTECRGIKWGHESGATFPGYTTAFSGVVDEETMNGERGFMTELK 
RLCDLDGSWWWWPYRCDTKTGDVVRLNCCGKCGWAAGVFASLFMTQILGVRYDAPSKTLN FKPFSPGSGFEWKHARTGSGEFDFTYQKSEGRVTASVTSRVDYEVNLVLTLVTEGLPEFT DRGINAEFEEVEFLGKRAVRVHTVLKPDQRIEIAAEER >gi|229783944|gb|GG667791.1| GENE 6 7805 - 8087 118 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624601|ref|ZP_06117536.1| ## NR: gi|266624601|ref|ZP_06117536.1| putative SAM-dependent methyltransferase [Clostridium hathewayi DSM 13479] putative SAM-dependent methyltransferase [Clostridium hathewayi DSM 13479] # 1 94 1 94 94 164 100.0 1e-39 MKYGEALLWIVIIFLLYMNELRIDQLPGIGEGETGTKVEYMIRAFVKAGILFSGWYIHLP LFIVYLLIFCGYGMTFWSIGKRSRLNLFFWMNSN Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:55:23 2011 Seq name: gi|229783943|gb|GG667792.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld185, whole genome shotgun sequence Length of sequence - 8685 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1314 868 ## CD1108 putative DNA-repair protein 2 1 Op 2 . - CDS 1307 - 1681 252 ## gi|160939432|ref|ZP_02086782.1| hypothetical protein CLOBOL_04325 3 1 Op 3 . - CDS 1684 - 4143 2198 ## COG3451 Type IV secretory pathway, VirB4 components 4 1 Op 4 . - CDS 4028 - 4495 231 ## Ethha_1896 hypothetical protein 5 1 Op 5 . - CDS 4498 - 4656 128 ## gi|266624606|ref|ZP_06117541.1| DNA (cytosine-5-)-methyltransferase - Prom 4891 - 4950 1.6 - Term 4941 - 4982 -0.9 6 2 Tu 1 . - CDS 5088 - 5477 124 ## gi|295104946|emb|CBL02490.1| hypothetical protein - Prom 5632 - 5691 1.8 7 3 Op 1 . - CDS 6918 - 7388 522 ## Ethha_1894 hypothetical protein 8 3 Op 2 . - CDS 7431 - 7553 221 ## gi|160939424|ref|ZP_02086774.1| hypothetical protein CLOBOL_04317 9 3 Op 3 . - CDS 7623 - 7841 242 ## Closa_3718 conjugative transfer protein - Prom 7862 - 7921 2.8 - Term 8023 - 8058 1.3 10 4 Tu 1 . 
- CDS 8061 - 8684 598 ## COG3505 Type IV secretory pathway, VirD4 components Predicted protein(s) >gi|229783943|gb|GG667792.1| GENE 1 3 - 1314 868 437 aa, chain - ## HITS:1 COG:no KEGG:CD1108 NR:ns ## KEGG: CD1108 # Name: not_defined # Def: putative DNA-repair protein # Organism: C.difficile # Pathway: not_defined # 6 437 71 514 646 366 50.0 1e-99 MNKEPRLRFTDEERADPALEKPIRKTEKATARADKAQANIPKKKVRQTVIDPDTGKKTSK LTFEDKKKPPSKLSQGVKEAPVHLVAGKFHKEIRETEQDNVGVESAHKSEEAVETSAYLV REGYRSHKLKPYRKAAQAEQKLEKANVNALYQKSLRENPQFASNPLSRWQQKQSIKKQYA VAKRTGQTAGNTAQAASKTGKAARTVKEKAQQAGAFVMRHKKGFLIAGVLFLIACMLMNT MSSCSMMAQSIGSVISGTTYPSDDPEMLAVEADYADREARLQEKIDNIESSHPGYDEYRY NLDMIGHDPHELAAVLSAVLQGYTRHSAQAELSRVFDAQYQLTLREEIQIRTYTDEDGDE HEYEYRILHVTLTSRSIASLAPELLTPEQMEMYQVYRQTMGNKPLLFGGGSPDTGVSEDL TGVEFINGSRPGNPQLV >gi|229783943|gb|GG667792.1| GENE 2 1307 - 1681 252 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939432|ref|ZP_02086782.1| ## NR: gi|160939432|ref|ZP_02086782.1| hypothetical protein CLOBOL_04325 [Clostridium bolteae ATCC BAA-613] hypothetical protein HOLDEFILI_02406 [Holdemania filiformis DSM 12042] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] putative malonyl CoA-acyl carrier protein transacylase [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF0866_01339 [Ruminococcaceae bacterium D16] hypothetical protein CLOBOL_04325 [Clostridium bolteae ATCC BAA-613] hypothetical protein HOLDEFILI_02406 [Holdemania filiformis DSM 12042] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] putative malonyl CoA-acyl carrier protein transacylase [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF0866_01339 [Ruminococcaceae bacterium D16] # 1 124 1 124 124 241 100.0 2e-62 MKQNICELDTMIFFREALEAHEFMLLPVMASAVVECRTADKELKTLNEDGEIGLARLFSI WANMMCAPGAATIVGCRPITMLSEILAQVHAYLTVHPLYDPEGLALYVELHHMMDAILMG DWFE >gi|229783943|gb|GG667792.1| GENE 3 1684 - 4143 2198 819 aa, chain - ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # 
Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 312 785 89 594 617 102 23.0 3e-21 MFTAIKKWLHRMFGKTEEKAVQPVKTKKKLSRADRKQIEAAIARANRTDKKEKSAQDSIP YERMWPDGICRVSDGHYTKTIQFQDINYQLSQNEDKTAIFEGWCDFLNYFDSSIQFELSF LNLAASEETFARAINIPLQRDDFDSIRVEYTTMLQNQLAKGNNGLIKTKYLTFGIDADSI KAAKPRLERIETDILNNFKRLGVAAETLDGKARLAQLHGIFHMDEQLPFRFEWDWLPTSG LSTKDFIAPSSFEFRTGKQFRMGKKYGAVSFVQILAPELNDRMLADFLDMESSLIVSLHI QSVDQIKAIKTVKRKITDLDKSKIEEQKKAVRAGYDMDIIPSDLATYGVEAKKLLQDLQS RNERMFLVTFLVLNTADNPRQLDNNVFQASSIAQKYNCQLTRLDFQQEEGMMSCLPLGLN QIEIQRGLTTSSTAIFVPFTTQELFQNGKEALYYGINALSNNLIMVDRKLLKNPNGLILG TPGSGKSFSAKREIANSFLLTSDDVIICDPEAEYAPLVERLHGQVIKISPTSTNYINPMD LNLDYSDDESPLSLKSDFILSLCELIVGGKEGLQPVQKTIIDRCVRLVYQTYLNDPRPEN MPVLEDLYNLLRSQEEKEAQYIATALEIYVTGSLNVFNHQSNVDINNRIVCYDIKELGKQ LKKIGMLVVQDQVWNRVTINRAAHKSTRYYIDEMHLLLKEEQTAAYTVEIWKRFRKWGGI PTGITQNVKDLLSSREVENIFENSDFVYMLNQAGGDRQILAKQLGISTHQLSYVTHSGEG EGLLFYGSTILPFVDHFPKNTELYRIMTTKPQELKKEDE >gi|229783943|gb|GG667792.1| GENE 4 4028 - 4495 231 155 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1896 NR:ns ## KEGG: Ethha_1896 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 155 1 141 152 176 60.0 3e-43 MAYVPVPKDLTKVKTKVMFNLTKRQLICFTGGALIGVPLFFLLRKPTGNSVAAMCMILVM LPFFMLAMYEKHGQPLEKIVGNILKVAVIRPKQRPYQTNNFYAVLKRQEMLDKEVYDIVH RNKKMAASDVRKNRGKSCAAGQNKEKAVPRRQKAD >gi|229783943|gb|GG667792.1| GENE 5 4498 - 4656 128 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624606|ref|ZP_06117541.1| ## NR: gi|266624606|ref|ZP_06117541.1| DNA (cytosine-5-)-methyltransferase [Clostridium hathewayi DSM 13479] DNA (cytosine-5-)-methyltransferase [Clostridium hathewayi DSM 13479] # 1 52 383 434 434 112 98.0 1e-23 MQGFPDGWTAYDFEHGIISDSKRYEMLGNSVAVPCVAYIMQGIYQVLSERSV >gi|229783943|gb|GG667792.1| GENE 6 5088 - 5477 124 129 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|295104946|emb|CBL02490.1| ## NR: gi|295104946|emb|CBL02490.1| hypothetical protein [Faecalibacterium prausnitzii SL3/3] # 1 120 1 127 179 72 41.0 1e-11 MDNIRKSQYGKTSPELCHPQEDVTLRLCCKRSQKPRFQCLILGDGPTPEWYEAEELISLG ACTTPNISEQHSDAGACSLSLILEENAPDKYFLSPKACEGILNRAERRGKHLPPLLQAAL EEQSRKTDI >gi|229783943|gb|GG667792.1| GENE 7 6918 - 7388 522 156 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1894 NR:ns ## KEGG: Ethha_1894 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 153 56 208 289 206 73.0 3e-52 MIENLSNSIMVPIAGVILAIVMTVDLIQMIADKNNLHDVDTWMIFKWVFKSAAAILIVTN TWNIVMGVFDMAQSVVAQAAGIINSDASIDISSVMTDLEPRLMEMDLGPLFGLWFQSLFI GITMWALYICIFIVIYGRMIEIYLVTSVAPIPMVAS >gi|229783943|gb|GG667792.1| GENE 8 7431 - 7553 221 40 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939424|ref|ZP_02086774.1| ## NR: gi|160939424|ref|ZP_02086774.1| hypothetical protein CLOBOL_04317 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04317 [Clostridium bolteae ATCC BAA-613] # 1 40 1 40 223 73 100.0 7e-12 MDFLLEALTNWLKEMLVGGIMSNLSGMFDSVNQQVADISV >gi|229783943|gb|GG667792.1| GENE 9 7623 - 7841 242 72 aa, chain - ## HITS:1 COG:no KEGG:Closa_3718 NR:ns ## KEGG: Closa_3718 # Name: not_defined # Def: conjugative transfer protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 70 1 70 71 92 87.0 6e-18 MAFFEQAITVLQTLVIALGAGLGIWGVINLLEGYGNDNPGAKSQGMKQLMAGAGVAVVGM VLVPLLSGLFSV >gi|229783943|gb|GG667792.1| GENE 10 8061 - 8684 598 207 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 26 164 414 556 591 82 34.0 5e-16 TQLFNLLCEKADDVYDGRLPVHVRCLIDECANIGQIPKLEKLVATIRSREISACLVLQAQ SQLKAIYKDNADTIIGNMDTSIFLGGKEPTTLKELAAVLGKETIDTYNTGESRGRETSHS LNYQKLGKELMSQDELAVMDGGKCILQLRGVRPFLSDKYDITKHPNFKYTADADDKNAFD IEAFLSARLKLKPNEVCDVYEVDTKGA 
Prediction of potential genes in microbial genomes
Time: Fri Jul 1 02:56:04 2011
Seq name: gi|229783942|gb|GG667793.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld186, whole genome shotgun sequence
Length of sequence - 10001 bp
Number of predicted genes - 10, with homology - 10
Number of transcription units - 6, operones - 2 average op.length - 3.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 3 - 209 266 ## COG4943 Predicted signal transduction protein containing sensor and EAL domains
2 2 Tu 1 . - CDS 230 - 1597 1257 ## COG0534 Na+-driven multidrug efflux pump
- Prom 1629 - 1688 7.2
3 3 Tu 1 . - CDS 1713 - 2255 698 ## COG0288 Carbonic anhydrase
- Prom 2279 - 2338 4.6
+ Prom 2241 - 2300 6.0
4 4 Tu 1 . + CDS 2478 - 3752 1166 ## COG2200 FOG: EAL domain
5 5 Op 1 . - CDS 3812 - 5305 920 ## Dfer_0637 hypothetical protein
6 5 Op 2 . - CDS 5355 - 6266 703 ## BF3495 hypothetical protein
7 5 Op 3 . - CDS 6284 - 6874 452 ## COG0395 ABC-type sugar transport system, permease component
- Prom 6938 - 6997 11.9
8 6 Op 1 . - CDS 7899 - 8069 126 ## gi|266624619|ref|ZP_06117554.1| sugar permease
9 6 Op 2 35/0.000 - CDS 8072 - 8974 554 ## COG1175 ABC-type sugar transport systems, permease components
10 6 Op 3 .
- CDS 8961 - 9860 729 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783942|gb|GG667793.1| GENE 1 3 - 209 266 68 aa, chain + ## HITS:1 COG:XF0470 KEGG:ns NR:ns ## COG: XF0470 COG4943 # Protein_GI_number: 15837072 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing sensor and EAL domains # Organism: Xylella fastidiosa 9a5c # 6 67 462 522 530 62 46.0 2e-10 TTNPRNRAIVESISEVCLKLGIRMVAEGIENEEQYAVLRACGIDLFQGYLFSRPIPVEEF ERDYLGKP >gi|229783942|gb|GG667793.1| GENE 2 230 - 1597 1257 455 aa, chain - ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 411 3 414 426 187 29.0 3e-47 MKAAKKIDMTQGSIMKRVLLFALPICVGNVLQQLYGTVDTLVIGNFCGSVSLAAVGTSSQ PVEILLCVFLGLGTGVSILVSQFTGRGDMESLKEIGATAISFLFLCSIPLTIIGQFAGPA VLRFMQVPDDTWELANAYISIIFFGTLGNMGYNMNAGILRGMGDSNASLLFLIVSCVVNI VLDLLFVAGMGMDVPGAALATTIAMYCSWLFSIVYIRKKYPEFGFTYLPHRMNRRMLGSI VAIGLPLGLNSSIYSVGHILMQSLINLQGSVFIAACSVSSKVTGIANVAITSLSSAATTF SGQNLGAGNCVYLKKGGIRIPLFSGLITCIAGLVVTFYCRPLLELFTREAEVLDIAVRYI RIVLPFTWAYAVFNGIISFVNGMGEVRFPTVVNLLMLWAVRIPCAWLITSYIDGGYVMAC LPISFVFGMLCMFSYFFSKNWKKIGKMAAEQQGRL >gi|229783942|gb|GG667793.1| GENE 3 1713 - 2255 698 180 aa, chain - ## HITS:1 COG:BS_ytiB KEGG:ns NR:ns ## COG: BS_ytiB COG0288 # Protein_GI_number: 16080121 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: Bacillus subtilis # 1 176 3 178 187 213 51.0 2e-55 MIEDVIDYNRKFVAEKNYEPYETSKYPDKKLAILTCMDTRLTELLPAALGIHNGDAKIIK NAGGVISHPYGSAVRSLLVAVLELGVEEIMVIGHTDCGVQGMDGKKMVEKLKARGIPEEH IDIIRKSGINLEQWLGGFESAKEAVKESVDSLKNHPLMPKDLVIRGFMMDSVTGELTALC >gi|229783942|gb|GG667793.1| GENE 4 2478 - 3752 1166 424 aa, chain + ## HITS:1 COG:mll7513_3 KEGG:ns NR:ns ## COG: mll7513_3 COG2200 # Protein_GI_number: 13476242 # Func_class: T Signal transduction mechanisms # 
Function: FOG: EAL domain # Organism: Mesorhizobium loti # 170 410 6 246 260 172 38.0 1e-42 MADLKNEQWIKARYEDICSKNPDDYALISMKVKRFRIFNRMFGREAGDRLTAMIYDCVES WLNPGEYIAHIHAGYYLLLVHISKDYDDIFHYTIDINTRIRDWPYGEQYGKVYMGYGIFL LGSDPPDFLTAQYNADICRTESSESHLRNSHMEVYGLTYNDTNLGNYNMEQDFQPALEHE DFKLYLQPKVNLRTGEVTEAEALVRWINPEKGMIPVSDFLPELEKNGLIGDLDLYMFEHV CRTINRWIKQYNKKIKISVNLSSNMFNYRYFLDLYKQVYEKAPCPKDCIEFELLESIVLN QIDLVQVVVKQLYHYGFSCSLDDFGSGYSSFSVLTTTELKTLKIDRSLFCNYSDPRERVL VRHIVETAKELNLKTVAEGIETKEYVEFLKKLGCDYIQGFIFYKPMPVEEFERRFLVNCE RAEV >gi|229783942|gb|GG667793.1| GENE 5 3812 - 5305 920 497 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0637 NR:ns ## KEGG: Dfer_0637 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 7 493 33 516 533 466 46.0 1e-130 MILEKVLETYVDRFNAEDEEIYRQEIGNDQALHWLKKNIPLFECPDPDIEEIYYFRWWTY RKHVKKTPEGFIISEFLPDVPWAGSYNSINCAAGFHIREGRWLRNGRKIIEDYIRFWLKG SGDIRSYSTWIADAVWDYCSVLEDYGFGIELLEELISNFEWWTKEHQRENGLYWSIDDRD AMEFSISGSGFRPTLNSYLYADAVAISRLAEKAGKDDIASRFAKEAEMLKECVQNHLWNG DFFQVIPARIDCREKRFIECSTLDFKKIPSDRNVKELIGYIPWYFNLPDAGYEKAFYYLT QEENFLGKSGLRTADRSHPRFAYEVPHECLWNGPAWPFATTQALVALANLLQNYSQEFVS KTDYYRILKSYSKSQHRITEEGKRIPWIDEDQDPETGEWIARTILKEAGWKPEKGGYERG KDYNHSMYCDLIITGLLGIDPGQEPPTVNPLIPEEWEYFLLEGLEIHGKIYTVLYDRDGS RYNRGEGLQILTEELFQ >gi|229783942|gb|GG667793.1| GENE 6 5355 - 6266 703 303 aa, chain - ## HITS:1 COG:no KEGG:BF3495 NR:ns ## KEGG: BF3495 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 303 38 328 328 319 54.0 1e-85 MLQHGTKGNWKKYINNPVLGGKYGTCFDIAVLDEGTEIAMYFSWRPRKSVAVVKSTDGIH WSEPEICIGPKTSAEGWEDELNRPSVVKKGNTYYMWYTGQYKAGEADGTSHIFCAVSTDG IHFERVDDKPVLYPQEPWEKAAVMCPSVLWDEETSQYKMWYSAGEQYEPNAIGYAESPDG LVWKKFTGNPVFCADPDTEWEQHKAAGCQVLKKDEYYYMFYIGYHNEDYAQIGIARSRSG MTGWERSSYNPIIAPGEDGFDSCACYKPFAVFKDGKWLLWYNGRNGTLEQIGLVTHDGYD FAF >gi|229783942|gb|GG667793.1| GENE 7 6284 - 6874 452 196 aa, chain - ## HITS:1 COG:BH1926 KEGG:ns NR:ns ## COG: BH1926 COG0395 # Protein_GI_number: 15614489 # 
Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 3 195 94 284 285 139 39.0 3e-33 MLTLIVSSLAAFAFSKYSFRGKNVIFIFLLATMMIPVEITIPAIYLMFSKVHMLNTYSVQ IFPGIANVFCLFMLKQYMDSLPDSLLEAARLDGAGDISIYQKVVLPLVSPAIGALAILTF LGKWNDYLWPSMLLTKTEVMPIMVVLPTLNTGNSQWSTPWELVMAGCVIVTLPLIIVFFI FQDQFMSSVTIGAVKE >gi|229783942|gb|GG667793.1| GENE 8 7899 - 8069 126 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624619|ref|ZP_06117554.1| ## NR: gi|266624619|ref|ZP_06117554.1| sugar permease [Clostridium hathewayi DSM 13479] sugar permease [Clostridium hathewayi DSM 13479] # 1 54 2 55 55 80 100.0 4e-14 MSRDIKKVTLHILLLLFAAAAVVPILLMFVNSFKTGSELAKNAWGIPSFWTFSNYS >gi|229783942|gb|GG667793.1| GENE 9 8072 - 8974 554 300 aa, chain - ## HITS:1 COG:XF2447 KEGG:ns NR:ns ## COG: XF2447 COG1175 # Protein_GI_number: 15839038 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Xylella fastidiosa 9a5c # 16 299 1 291 293 133 32.0 4e-31 MLLNNSGIDSQTGEGMKKKTIIPYLFITPFLIAFVLFFVYPAGYSLYLSFLKYKGYGEAT FVGFQNYKALLTYALFWKSLGNTAFYFLAHFVPVMAGAFLFALAMQSSYIGRAQKVIKPV LFLPQVVPLVASALIFKIIFSTNSGALNQITGLHVKWLEDPGLLKWPVVFLIIWRSVGWF MVIFLAGLTTISSDLNEAATLDGANNWQRITKITIPLMKPIFLFAFVTDVISSFKIYTEP NVMMVETGTIPVDAQPVMNIITNHIKGGNFGMSSAAGWILFLIILVVSLAQLFLMKNKED >gi|229783942|gb|GG667793.1| GENE 10 8961 - 9860 729 299 aa, chain - ## HITS:1 COG:mlr6435 KEGG:ns NR:ns ## COG: mlr6435 COG1653 # Protein_GI_number: 13475383 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 5 239 129 360 423 87 26.0 4e-17 MRYEDATVAFPFELKPRVWFYRSDIFEEAGIDADEIKTTDDFIAAGRKLQEKFPGKYICN VGSSAPGYLFYMALSGNGARFTDEDGNYMISQDEGTRKVLEDYRKIVESGVSMDVSDFTT DWEAALADGSLVSSLSASWLAQDSLLPTYAAGQEGKWACAQWPEIGGSVGGSDAGGSVFV IPSFSEHKEEAKELLAEMTLSEAGDLCIWDSISSIPVNKKALDNERMQVPNKFFGTSVVE 
AGKTAVSKMSVFNYSPKALAETDIVMPYFVKAVNGELSIDEALTSAEKDLKSQLGNAFE Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:56:24 2011 Seq name: gi|229783941|gb|GG667794.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld187, whole genome shotgun sequence Length of sequence - 9347 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 6, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 8 - 127 152 ## gi|266625429|ref|ZP_06118364.1| putative magnesium and cobalt transporter 2 1 Op 2 . + CDS 137 - 991 845 ## GYMC10_2909 hypothetical protein - Term 905 - 938 4.0 3 2 Tu 1 . - CDS 1057 - 2055 992 ## COG2855 Predicted membrane protein - Prom 2125 - 2184 11.7 + Prom 2147 - 2206 11.7 4 3 Op 1 . + CDS 2240 - 3127 1029 ## COG0583 Transcriptional regulator + Term 3177 - 3219 1.5 + Prom 3131 - 3190 5.0 5 3 Op 2 . + CDS 3260 - 4924 1315 ## COG2720 Uncharacterized vancomycin resistance protein + Term 4945 - 5003 12.5 + Prom 4945 - 5004 3.5 6 4 Tu 1 . + CDS 5059 - 5139 69 ## + Prom 5161 - 5220 10.9 7 5 Op 1 35/0.000 + CDS 5273 - 6559 933 ## COG1653 ABC-type sugar transport system, periplasmic component 8 5 Op 2 38/0.000 + CDS 6600 - 7523 547 ## COG1175 ABC-type sugar transport systems, permease components 9 5 Op 3 . + CDS 7535 - 8401 542 ## COG0395 ABC-type sugar transport system, permease component 10 6 Tu 1 . 
+ CDS 8504 - 9347 614 ## COG2942 N-acyl-D-glucosamine 2-epimerase Predicted protein(s) >gi|229783941|gb|GG667794.1| GENE 1 8 - 127 152 39 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625429|ref|ZP_06118364.1| ## NR: gi|266625429|ref|ZP_06118364.1| putative magnesium and cobalt transporter [Clostridium hathewayi DSM 13479] putative magnesium and cobalt transporter [Clostridium hathewayi DSM 13479] # 1 39 285 323 323 84 100.0 3e-15 MNFINMPELHWKYGYVGMILLSIAIVAVEIWFFKKKKIL >gi|229783941|gb|GG667794.1| GENE 2 137 - 991 845 284 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_2909 NR:ns ## KEGG: GYMC10_2909 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 3 280 124 374 374 229 45.0 1e-58 MEKKEKGSFVGSILLSSPVWDAEKLKRDLLEEWDIRVPDLEADEEGTGGEPMGREQPYSD QPGQEASDCGNSDNDTIVFEADDYMVAISLMPAPVPAGEAEYYAKSNYFWKGAVDAASEH RAHILVAVIGGRERDPFEAGKLYVKISSACLKQENALGIYTSGTVFEPEMYCAVAEDMKE EEETYPILDWIYIGLYQTEKGMNGYTYGMTAFGKDEIEVVESAAAPAELHEFLFDIASYV LYDDVLLQDGETIGFTEDEKLPITKSEGEAVEGESLKIAFMPAQ >gi|229783941|gb|GG667794.1| GENE 3 1057 - 2055 992 332 aa, chain - ## HITS:1 COG:SPy1056 KEGG:ns NR:ns ## COG: SPy1056 COG2855 # Protein_GI_number: 15675048 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 19 332 22 339 339 333 63.0 2e-91 MKKYTAGLLICLAISVPAWFLGRLFPVVGGPVCAILIGMAIAPFIKGKERFRPGIAFTSK KILQYAVVLLGFGMNLSVVLEKGRQSLPIILATITTSILIAVLLYRLLKLPTKSAILIGV GSSICGGSAIAATAPVIEADDEEIAQSISVIFLFNIAAALLFPPLGAALGLSNEGFGLFA GTAINDTSSVTAAAAAWDGLYGSNTLEAAAIVKLTRTLAIIPITLFLALYRTKKEQSGSG NTFQLVKIFPFFVLFFLLASVLTTVFHLPAAVTGPLKDLSKFFIIMAMAAIGTNTNLVTL VKTGMKPILLGLCCWIGITLVSLSMQSFLGVW >gi|229783941|gb|GG667794.1| GENE 4 2240 - 3127 1029 295 aa, chain + ## HITS:1 COG:PA3398 KEGG:ns NR:ns ## COG: PA3398 COG0583 # Protein_GI_number: 15598594 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 5 291 7 290 308 120 28.0 4e-27 
MLDFRMDTFLAVCRCMNFTKAAEELNITQPAVSHHIHYLEEHYQVKLFEYSGKKIGLTEA GRLLLSAATTMQHDELHLKQALQMSEGGGRRLVFGATMTVGEYVLPEALMRLMDAQPDIS VKMVVADTQDLLRSLDEGEIEFAVVEGFFQKKEYDFLVYDRGKFVIVASPEYRFHSEGKV IEDLTRERFFIREPGSGTRYVFERYLEGKNLLIHDFEDLMEISNIGAIKSLVADGRGITF LYEAAVKRELEEGSLRRVELEDFELSHDFTFIWRKDSIYADYYRELYRLLRGSEL >gi|229783941|gb|GG667794.1| GENE 5 3260 - 4924 1315 554 aa, chain + ## HITS:1 COG:BS_yoaR KEGG:ns NR:ns ## COG: BS_yoaR COG2720 # Protein_GI_number: 16078932 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Bacillus subtilis # 193 374 92 275 303 113 35.0 1e-24 MKRGRSRRRRRQARMVMYGGTVILLTAVLGTAGLAFGRRGGVRGQESSVSGNGTAVIAEH LTETVTAYEKTVAEGIDITGKSRGEAETLLKEKMKGLSVVWQDETVSVEEVTGDAVAGLL DKIYGDGQMKSGTFTLDTDSLKESFTVLAKKLAGKWNRGAVDSQLASFDKETGIYTYSEA KAGSSLNEDELVQSLMEAVKARTYDTAVIPEFTEISPKRTAAQAKEQYQVIGSFTTKTTN NKNRNENIRLAVEAIDGRILKPGEEFSFNLATGNRTKDKGYQPAGAYRNGVLIEEPGGGV CQVSTTLYNAIVSSGLTATERHSHSFTPSYIQPGEDAMVSFDGYAGPDLKFVNSAQTTVA VRASLKDNTLKISIVGLPILEDGVKVTLRSEKVRDVEPPAPVYEANDGLPAGTEKVVDQG QKGGVWKTFRVVTKDGNVVEETPLHNTTYRGKASVIQKNENAAGTGENSAEAGANESAAG GQTQENTGQAAGAGNAEGGSSGQQEGMPQDISQQGEPKQEAGQQAAQPEAEPQQEPQSQQ TPSADAVVVPTFPG >gi|229783941|gb|GG667794.1| GENE 6 5059 - 5139 69 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSEYSYSKIVNKAIELFINYYYNKK >gi|229783941|gb|GG667794.1| GENE 7 5273 - 6559 933 428 aa, chain + ## HITS:1 COG:BH1244 KEGG:ns NR:ns ## COG: BH1244 COG1653 # Protein_GI_number: 15613807 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 1 406 15 439 440 122 24.0 1e-27 MILILCLTACGQQKEVPKESGTTGETVKSGKLNVLFMSGVYADSARSMTDEFKEKTGYDV EVNDVPFSSFHETAMLDFASQAGSYDVVAVNLSWLGEFAPHLEPLDDRVKESGMEVDKFI PSVYDACKWNDVLYGFPKAPTPNMMAYRTDLIDTPPKTYEEYMQMAEKFNDPGNGMYGIS IPGKKEQYAVLYLVRSWAMGADVADKDWNVIVNGEVGRSAMEQLGEVTKYSDPAGLSWGL EESINAFLQGKAAFCEAWPTLGIIQAADDPEKSKIVDNWAIAPFPQEKTGINQMSIWCVA 
VSKYSKNKDAAFQWIKDYSSLEKQEKFFDEFGILPSYTSFWEKEEVKNSKMGPLGDGLKT SLPKWRIPVSAELDSIMANAVSSYMSKQMTLDQAMEFFDSELNRAIKNNPPPADSKNDNA IAIEKALK >gi|229783941|gb|GG667794.1| GENE 8 6600 - 7523 547 307 aa, chain + ## HITS:1 COG:AGc4553 KEGG:ns NR:ns ## COG: AGc4553 COG1175 # Protein_GI_number: 15889771 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 26 305 29 305 312 180 32.0 3e-45 MLDDPEEFMRREEKKKRISEDQVTTWLFLMPALIIVLVAVIIPLCNSFVLSFEKVRINVP GFIPQLVGLDNYLRMLDDPVFIRSVQNTALFAFASVPVELILGMMLAMMLSGDSTGARIM RSIMLIPMIMAPVAAGTMWRMMLDKNTGIINYLLTCIGIPAVDWLGNPKLAIFSVIMVDV WRMTPWFALLLISGIKSIPGDTIEAALVDGATKWKCFRYVILPQLYRVLTLVLMLRVIDA FKVFDIVFVMTGGGPGMATEMLPSYIYAQGLRYFDAGYAAALALVFIAAMAVFSSVFVFL RSRGRDE >gi|229783941|gb|GG667794.1| GENE 9 7535 - 8401 542 288 aa, chain + ## HITS:1 COG:mlr7094 KEGG:ns NR:ns ## COG: mlr7094 COG0395 # Protein_GI_number: 13475911 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 23 288 17 281 281 167 33.0 2e-41 MRRKENDRISSTITDFLGKPTKWIFITLFAVSVILPLYWLATNAVKVEADYLASPPVIVP TRITTENFIKIFVNDGVGKGLYNSAVITAVSTFLAASFGSLAAYALTKGKLGKRTRSIFA GWFLLQKMYPAVATAIPIYLVMRSLHLIDTKTAMIILNTSFNFPMVIWLMMGFFQEVPES IEESGKIDGCSMAQRFFYLVLPIVKPGLVASAILTFVSTWNEFLFAVILSVRNAKTLPVI ISGFITDRGLEWGPMAATGIVIIIPVFVVVWLLQKDFVQGMTMGAVKG >gi|229783941|gb|GG667794.1| GENE 10 8504 - 9347 614 281 aa, chain + ## HITS:1 COG:slr1975 KEGG:ns NR:ns ## COG: slr1975 COG2942 # Protein_GI_number: 16330802 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Synechocystis # 3 279 20 296 391 168 33.0 1e-41 MNDILSFWKERTADYEYGGYITSFDREGNCTGKEKNIWLQARQVWMFATIYSRVEKDPVW LELAKRGRDYLVSHACAGNGRWYYLLDENGTVQQGTISLYTDMFVLMALSAYAEASGLME DKIVIQETFDSLYQNVRNDNCRDTFPQEFQEGVVSHGRYMICLNAVACAREYLDGNKADE 
LIRFCMDRIFGVLGYGEDGTIYEARTLQGGYIDSEEGHRINPGHIFESMWFVLEQAERVG RLEYEERALHTIMATFQRSRDEIYGGILHMLSDNGTDEQYK Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:56:40 2011 Seq name: gi|229783940|gb|GG667795.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld188, whole genome shotgun sequence Length of sequence - 7480 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 8, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 422 230 ## gi|266624631|ref|ZP_06117566.1| conserved hypothetical protein + Prom 439 - 498 3.2 2 2 Op 1 . + CDS 575 - 1456 478 ## Dhaf_3299 hypothetical protein 3 2 Op 2 . + CDS 1422 - 1664 74 ## Closa_4187 hypothetical protein + Prom 1721 - 1780 4.2 4 3 Tu 1 . + CDS 1810 - 2286 263 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 2294 - 2338 1.5 + Prom 2298 - 2357 3.6 5 4 Tu 1 . + CDS 2499 - 2786 232 ## Fisuc_0607 hypothetical protein + Prom 2848 - 2907 4.7 6 5 Tu 1 . + CDS 3045 - 3149 58 ## + Prom 3293 - 3352 6.3 7 6 Tu 1 . + CDS 3406 - 4299 339 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 + Prom 4632 - 4691 3.6 8 7 Tu 1 . + CDS 4824 - 4955 159 ## gi|288871435|ref|ZP_06410166.1| conserved hypothetical protein + Term 4993 - 5038 6.2 9 8 Op 1 . - CDS 5468 - 5707 212 ## EUBREC_3583 hypothetical protein 10 8 Op 2 . - CDS 5748 - 6764 495 ## EUBREC_3583 hypothetical protein 11 8 Op 3 . 
- CDS 6736 - 7131 252 ## EUBREC_3584 hypothetical protein - Prom 7207 - 7266 2.0 Predicted protein(s) >gi|229783940|gb|GG667795.1| GENE 1 3 - 422 230 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624631|ref|ZP_06117566.1| ## NR: gi|266624631|ref|ZP_06117566.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 139 1 139 139 233 100.0 3e-60 QQPEESGETPSEEKASTLESFTENTFEKWEDSNPNLEGSIKELQEGQFTVVEAITEKADN GGDIVVSPGNGDDSEFNKVTVTYDENTIVAIQTIKNGGDSYEMSEATADFLSEGHMVKVW GTPSSNGLKATWICIVMVV >gi|229783940|gb|GG667795.1| GENE 2 575 - 1456 478 293 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_3299 NR:ns ## KEGG: Dhaf_3299 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 186 273 62 152 261 68 39.0 4e-10 MKLFPKMKVKKVYLCYLLILLIIMGILLIIVLYGKQNSSKDGNILAGASPDTSAFQFYYF DGKAVTLRTLYDPAMEKELIKKINALPMEIADKDAIAGMEVPFYGIRISDKEGYDITVAL SKGVWLKNDGTIYYGDTDLSLLWEQMEGEDEDNSLTVLNFPNAGLLTAYHLFFMMKAEEP DGEVVEGVVMSVEDIQPSRITVSITNNSGEEFSYGRYFSLQKQIDGQWYTMPVQADNIGF EDLACVLPDGKSASETYDLNIYGTLEPGIYRLVVQNLSAEFLVGHGNLSAKRD >gi|229783940|gb|GG667795.1| GENE 3 1422 - 1664 74 80 aa, chain + ## HITS:1 COG:no KEGG:Closa_4187 NR:ns ## KEGG: Closa_4187 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 10 80 81 153 153 103 63.0 2e-21 MDTVICPQKGIECNDEAEAPDGWAKWIIPGYEYIYVERDSEDSCSIKYLKDNGISLVGAV HDFISPLTGKNYMFFSIRKL >gi|229783940|gb|GG667795.1| GENE 4 1810 - 2286 263 158 aa, chain + ## HITS:1 COG:lin0387 KEGG:ns NR:ns ## COG: lin0387 COG0494 # Protein_GI_number: 16799464 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Listeria innocua # 3 156 2 155 169 93 38.0 1e-19 MKEVWDLYDAKGNKTGYTLVRGRPIPEGCYHLVVSVWIINNRGEYLLSQRHPDKPYPLRW ECTGGAVLSGEDSLGGALREVNEELGIILNPKDGQRISRICREQTRDLYDVWVFYKDVDI 
SDITLQETEVVDVKWVTDEELIKMERKGELHPLLALTM >gi|229783940|gb|GG667795.1| GENE 5 2499 - 2786 232 95 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_0607 NR:ns ## KEGG: Fisuc_0607 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 1 91 105 195 195 77 46.0 1e-13 MIAEFCMLGIEDMEEITGLNACKGSYINLTYTFSSGQAVKLLDDDKIYLGNQLSKKDSSR YYGIAADESVLIVSEYDADGSNAEIIVYKKRNIKR >gi|229783940|gb|GG667795.1| GENE 6 3045 - 3149 58 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLGFCGTINGSVLIEQGYFDNAMVMGTRALKSIL >gi|229783940|gb|GG667795.1| GENE 7 3406 - 4299 339 297 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 1 295 1 301 306 135 33 1e-31 MYDIIIIGAGPAGISAAIYGKSRGKRILMLEKNQVGGLIGTVSTVTHYTAIMKEESGKTF AERMKEQALSAGVEIRYENVTETKLIGNIKKVVTDKGSYECKTVILANGGSGRMLGIPGE SLKGIRLNAPKDGFDYKGKNIYIIGGADGAVKEAIYLAGIAGKVTIVCVEDELVCIQEFK DKVAAYHNIEIMPHSSLTAVYGNERAEELEFTDNKTGKKQIIKDAECGVFVYAGIVPNTQ IYTELKLDNGYIPVDELMQTEIPGVFAAGDICVKKVRQVATAVADGAIAGIQAAAVC >gi|229783940|gb|GG667795.1| GENE 8 4824 - 4955 159 43 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871435|ref|ZP_06410166.1| ## NR: gi|288871435|ref|ZP_06410166.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 43 106 148 148 84 100.0 2e-15 MYYGTAQANSLSWINGDVCYSLMDINKNVSKDEMVAMAKDMID >gi|229783940|gb|GG667795.1| GENE 9 5468 - 5707 212 79 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3583 NR:ns ## KEGG: EUBREC_3583 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 77 352 428 543 96 66.0 3e-19 MVDREAKRAEGKGIGYDRWAAVHNLKQMAATMSIYEESGFSSPEELEAALAAASAGLHET TGKLKAVESALREKKEFLK >gi|229783940|gb|GG667795.1| GENE 10 5748 - 6764 495 338 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3583 NR:ns ## KEGG: EUBREC_3583 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 332 3 331 543 
529 80.0 1e-149 MATFKHISSKNADYGAAEQYLTFEHDEFTMKPTLDKNGQIVPRTGYLISSLNCGDEDFAI ACMRSNLKYGKNQKREDVKSHHYIISFDPRDAPDNGLTVDRAQALGEEFCKTHFPGHQAL VCTHPDGHNHSGNIHVHIVINSLRIAEVPLLPYMERPADTREGCKHRCTDAAMEYFKAEI MEMCHRENLYQIDLLGGSKNRITEREYWAKRKGQAALDKENASLAVAGEPPKLTKFETDK EKLRRTIRTALSSAVSFEDFSGKLLQQGVTVKESRGRLSYLTPDRTKPITARKLGDDFDR AAVLEALTQNAQNAVRAAKKPLPKQELPPQHKRPFTAQ >gi|229783940|gb|GG667795.1| GENE 11 6736 - 7131 252 131 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3584 NR:ns ## KEGG: EUBREC_3584 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 17 131 1 115 115 174 77.0 1e-42 MKRYNTPHRHRVIKTRLTEEEYAEFSERVTLCKMSQAEFIRQALTKSRIYPVITVSPVND ELLSAVGKLTAEYGKIGGNLNQIARCLNEYGAPYNTLSQEVRAATAELAALKFEVLQKVG EAVGNIQTYQL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:57:18 2011 Seq name: gi|229783939|gb|GG667796.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld189, whole genome shotgun sequence Length of sequence - 5204 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 5, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 215 157 ## gi|266624641|ref|ZP_06117576.1| conserved hypothetical protein 2 1 Op 2 . - CDS 230 - 436 179 ## Ethha_1353 DNA binding domain protein, excisionase family - Prom 496 - 555 5.4 3 2 Op 1 . + CDS 845 - 961 77 ## 4 2 Op 2 . + CDS 1043 - 1399 251 ## Sgly_1451 transcriptional repressor, CopY family + Prom 1830 - 1889 1.6 5 3 Op 1 . + CDS 1911 - 2633 172 ## Swol_0431 antirepressor regulating drug resistance signal transduction N-terminal membrane component-like protein 6 3 Op 2 . + CDS 2645 - 3091 153 ## gi|266624646|ref|ZP_06117581.1| hypothetical protein CLOSTHATH_06027 7 3 Op 3 . + CDS 3124 - 3564 205 ## Sez_0479 hypothetical protein + Term 3607 - 3661 12.1 + Prom 3578 - 3637 3.4 8 4 Tu 1 . + CDS 3698 - 4927 332 ## Blon_0448 hypothetical protein + Prom 4972 - 5031 3.4 9 5 Tu 1 . 
+ CDS 5052 - 5202 68 ## gi|266624649|ref|ZP_06117584.1| putative acetylglutamate kinase Predicted protein(s) >gi|229783939|gb|GG667796.1| GENE 1 2 - 215 157 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624641|ref|ZP_06117576.1| ## NR: gi|266624641|ref|ZP_06117576.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 71 1 71 72 106 100.0 5e-22 MASIVKRGKTYSVVYYEGTGDKRQQKWESGLTYSAAKSMKAKIEHEQAQQTTGDESKNRL KEMTISEFKPF >gi|229783939|gb|GG667796.1| GENE 2 230 - 436 179 68 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1353 NR:ns ## KEGG: Ethha_1353 # Name: not_defined # Def: DNA binding domain protein, excisionase family # Organism: E.harbinense # Pathway: not_defined # 16 67 17 68 70 67 61.0 2e-10 MAVGEFNHEKQAVSEKRTYSVQEIADILQISRSMAYNLCKQSLFKTVKVGKYVRVSKPSF DEWLDTRK >gi|229783939|gb|GG667796.1| GENE 3 845 - 961 77 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIYFRKEKPRDYKYPILEYYLHYPASLSIFLSGIYEWF >gi|229783939|gb|GG667796.1| GENE 4 1043 - 1399 251 118 aa, chain + ## HITS:1 COG:no KEGG:Sgly_1451 NR:ns ## KEGG: Sgly_1451 # Name: not_defined # Def: transcriptional repressor, CopY family # Organism: S.glycolicus # Pathway: not_defined # 4 118 7 121 124 131 57.0 8e-30 MEKRLFDSELKVMETLWENGELSAKQIAELLRQQIGWSKTTTYTVLKKCIDKEIIKRSDP NYICSACISKEDVRQYETHELINKMYDGAPDKLVASIIGNEKMDKDMIRHLKELIQNL >gi|229783939|gb|GG667796.1| GENE 5 1911 - 2633 172 240 aa, chain + ## HITS:1 COG:no KEGG:Swol_0431 NR:ns ## KEGG: Swol_0431 # Name: not_defined # Def: antirepressor regulating drug resistance signal transduction N-terminal membrane component-like protein # Organism: S.wolfei # Pathway: not_defined # 6 132 117 239 442 106 44.0 1e-21 MDGEIDELHLEQILEHELIHIKRFDVLFKWLLAFICAVYWVNPFIWIMYSFANRDIELAC DEAVLKSRSKDYKKSYILTLIYLEEKRVRGDFLCNFFSRYPMEERVQIMIRNGDKKALKN MILPAIAALLIALFSISSMAGEYDGEWSPKNERRDNRNLTATTTDIQMQLPIFQKRLPDN FGSLSGGDFDIPTIIIRKSKENYSACAVDKNGTVIYEESGTTKNVEATLEHIYNKLFQRN >gi|229783939|gb|GG667796.1| GENE 6 2645 - 
3091 153 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624646|ref|ZP_06117581.1| ## NR: gi|266624646|ref|ZP_06117581.1| hypothetical protein CLOSTHATH_06027 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06027 [Clostridium hathewayi DSM 13479] # 1 148 1 148 148 277 100.0 2e-73 MRIGKHFWVLIIAVLISLSAISSVMAEEINSNVTDIPTLEITRDETGCYSVLAKDHEGSI IFTEKGLTGTIEDIIEHSYSNIFRAATKSCTHIPCNHEIVTGGINHVIDWDTDICTMITN DFYRCACCDQILGIVPGSTTVVGTHPAH >gi|229783939|gb|GG667796.1| GENE 7 3124 - 3564 205 146 aa, chain + ## HITS:1 COG:no KEGG:Sez_0479 NR:ns ## KEGG: Sez_0479 # Name: not_defined # Def: hypothetical protein # Organism: S.equi # Pathway: not_defined # 36 143 15 126 127 88 47.0 7e-17 MRKFKSNILKVSLAMGLLLSSSFPAYAVSTDSTVVGGWDEDTGYFVNADAYNKAMEKRGL LRSDPVHEGERQRKDQSGNTYFRAHGWTSWPGVYHYTRARMETYGGSILTDSGRKWGTSE TQATSPWHKFDPDVSDRARTYYGSEE >gi|229783939|gb|GG667796.1| GENE 8 3698 - 4927 332 409 aa, chain + ## HITS:1 COG:no KEGG:Blon_0448 NR:ns ## KEGG: Blon_0448 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 21 306 8 300 402 211 39.0 6e-53 MIKSSRIYNFFFKNSFSVLCIMLLTVFIFIVGITLTSLTISKPTSENKEHVSIGEKRYTL IDNFLDADSFREFRHNDKKVNMLGDFYNKLTGMDNAKLLSMFNQSVVIDDFQGDEKFYYH TKEFRDKFPDAELAIKSMQMNQLAFEHNKLKVKKGNMPIWNKISFNDNTFPILLGSSYEG IYNIGDIVKGSFYTKNINFQVIGILEDNTQVYYKTDPAYMLDEYIIIPYPAMAWTVNPND FVFEGILYFALVNSDIVIDSDEKNFLTGIRAIANSTGFVDFSLVGIDDQIIKNQELIFMI SEHQRLIGCILVVMYIILTAVLYCQLKVHLKKNDISGQPVFNGPSDRKKFFRKYSMFYVI SFILSLVLQLRLIPRIFLGVFAAELLILGSVYLIVSLAYYKMFLKENMK >gi|229783939|gb|GG667796.1| GENE 9 5052 - 5202 68 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624649|ref|ZP_06117584.1| ## NR: gi|266624649|ref|ZP_06117584.1| putative acetylglutamate kinase [Clostridium hathewayi DSM 13479] putative acetylglutamate kinase [Clostridium hathewayi DSM 13479] # 1 50 1 50 51 90 100.0 4e-17 MERDKLCSKIRDYKNGNRIALNEIISQMTPLVKKFARKCFFMEYDDAFQE Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:58:00 2011 Seq 
name: gi|229783938|gb|GG667797.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld190, whole genome shotgun sequence Length of sequence - 6380 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 + CDS 3 - 947 843 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases) 2 1 Op 2 . + CDS 925 - 1575 700 ## COG0637 Predicted phosphatase/phosphohexomutase 3 1 Op 3 . + CDS 1598 - 2680 890 ## COG3835 Sugar diacid utilization regulator + Term 2725 - 2768 -0.3 + Prom 2703 - 2762 5.7 4 2 Op 1 35/0.000 + CDS 2790 - 4145 1427 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 4173 - 4208 -0.1 + Prom 4151 - 4210 1.5 5 2 Op 2 38/0.000 + CDS 4326 - 5276 908 ## COG1175 ABC-type sugar transport systems, permease components 6 2 Op 3 . + CDS 5273 - 6316 1083 ## COG0395 ABC-type sugar transport system, permease component Predicted protein(s) >gi|229783938|gb|GG667797.1| GENE 1 3 - 947 843 314 aa, chain + ## HITS:1 COG:lin2966 KEGG:ns NR:ns ## COG: lin2966 COG1554 # Protein_GI_number: 16802024 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Listeria innocua # 1 288 403 677 725 259 50.0 5e-69 IIYGLISYYRATGDRNFMDQYGCRMILETAVFWYSRAVWNGDKRRFEIRDVIGPDEYTEH VDNNAYTNYMAFENVKAAGQVLASWDNPLADSCRKEGWSERFDHFLRHLYLPEPNEEKLI PQDDTFLSKKKLSNIEKYRNADVRQLILKDYSRNQVVDMQVLKQADVVMLLELLPERFDA ETVKKNVEYYEALNTHDSSLSLCAHAEAEAVIGEMDKAVGFFHQAMEIDLGMSYKDSAEG IHAASLGGIVNCILRGFAGIKTDGNMFRFTPHLPDHWQSIRFSFMDHGRKKTAVVTSEGA AVTAEEDDDTGVYI >gi|229783938|gb|GG667797.1| GENE 2 925 - 1575 700 216 aa, chain + ## HITS:1 COG:BS_yvdM KEGG:ns NR:ns ## COG: BS_yvdM COG0637 # Protein_GI_number: 16080508 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Bacillus subtilis # 2 204 1 204 226 175 44.0 7e-44 
MIQAFIFDLDGVVVDTAKYHYQAWKELAGELGFDFPEAEGERLKGVSRMDSLEIVLESGR ITGLTAEEKKRLADRKNKSYLTYINRLDEREILPGILKFLKKIRAEGYKTALGSASKSGG MILQKLGIADLFDVIVDGLSIVKAKPDPEVFLAAAAKLGADPGNCIVIEDAQAGVLAAKN GGMHCIGIGSEEILKGADVVLEHTGLLPNVNYRALF >gi|229783938|gb|GG667797.1| GENE 3 1598 - 2680 890 360 aa, chain + ## HITS:1 COG:CAC2833 KEGG:ns NR:ns ## COG: CAC2833 COG3835 # Protein_GI_number: 15896088 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sugar diacid utilization regulator # Organism: Clostridium acetobutylicum # 11 351 8 341 344 125 29.0 1e-28 MYENYLNTENAQKIADGIMKIIPCNINITDTAGEILASGDKSTLGTLHKGALTALKRKEA YVIYENTPTEQRGICLPIIYNNSILGVIAIGGDVEEVMPIGQICLSIAVLTIENRLLSDM SSIKESRLKDFLYEWISLSREQYDDAFYDQAAYLGIDMKIPRTAVVITSSRIRYSIIETI KSHLAAGEYIVRQGMEEVLILFRSDKRLKSRLEKILDISKDLENCYIGESDIIASRTTNS VMQTFHIARALNIRRRILCYHEVSLECLLNNVEVTRELEEILKLLKERDVDGVLKETIAA YVEDNDNYAQICDKLHIHRNTLNYRLAKIEELLNRNPRRAKDLMMLYIAVIKMGGNKEKR >gi|229783938|gb|GG667797.1| GENE 4 2790 - 4145 1427 451 aa, chain + ## HITS:1 COG:APE1630 KEGG:ns NR:ns ## COG: APE1630 COG1653 # Protein_GI_number: 14601527 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Aeropyrum pernix # 26 361 39 376 478 112 27.0 2e-24 MKKNLTKKLAALGLAVITTASMAGCSSKPAETKAPETTTAAAAGAESTEAPATEAAKKEA GGSDVLKVTYSGTPEIHEKQYLMDTFFKNFEEENNCTVEVDFVAQADAVKKIQSEQEGKN YITDIIYADTANMAPYVNGGWMQDITELVNSTGSTYTQMFDNTTNKDGVRYFVPFTFDVY VTCANKKALDYLPDGLTEEDVVSGITWEQYADWAVAIAAGEGEGKTMMPANLTSSQLLYP MGGMSLAYGGGFPEFTSDGFKDALGIIAQIAAGDGFYAEQDQYTAPTDPLNSGDVWLTFA HMAPVGTSYNAAPNNFVIGAAPKGSKGAGSTSGAWCYGIAQGAPNQELAEKFINYIADPE VNYEFCSNYGGSLSPIEEVGDILESSDVVMRAGIEMLKTTIVTGVPASEYSDWNAVKLLY GDIFNDVLKNKAVPGDDYLAEEQSKLEALKN >gi|229783938|gb|GG667797.1| GENE 5 4326 - 5276 908 316 aa, chain + ## HITS:1 COG:TM1234 KEGG:ns NR:ns ## COG: TM1234 COG1175 # Protein_GI_number: 15643990 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease 
components # Organism: Thermotoga maritima # 18 298 4 280 281 136 32.0 4e-32 MNENDSKQTIKKTIFSKKSIPYLLLLPAFLFYAVFWLLPVLSGVKEVFTDFDGNFTLLGN FKLMAQSDLIGPAIFNTATFAAVSVMLQYFLALLLAVLLSRKFKSSKLLMFISMIPMAIT PTAVAILWKTGLVRDGWINTVLIGLHLIDKPIVFMGAEGMSAVFLIILIDTWTVTPSVMI ILIAGLQGVQRELKEAAYLFGANKWQTFKDIVIPILKPSITTSIIMRLIAAIQVWSIAVM VLGYSKVPFLVERIAFYVEAVPGVNTSQKLAFTLSFLTTLIVLCATVVYLKASKGAGAGS HETGRKRPAKVKEAAK >gi|229783938|gb|GG667797.1| GENE 6 5273 - 6316 1083 347 aa, chain + ## HITS:1 COG:APE1633 KEGG:ns NR:ns ## COG: APE1633 COG0395 # Protein_GI_number: 14601529 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Aeropyrum pernix # 141 343 65 266 270 134 41.0 2e-31 MMDKLNIVQKTVMALILATMVLLVAVPLLAFTTVAFSNSVEMTEFPKRLIPKRTVTVRVA PTPEGEFEIFYDSGEGYESIITTKRASKLEAHFTRQYAVTVPGEELAEDFKQTLTNGPME FTYKKDLFYNFKTFFSIVPNAAAALKNSIIVSLYTILISLGIGSLAGFAMARFQFKFKEQ IHVSLLIVRMFPVVGISIPMAMLLIQFGLFDTRIGLALLYSVPNIALTAWITNSIFIGIN KELEEASMIFGANSVQTFAKITLPLAFPALAASSMYSFLTAWNDTISALILTNKNQTLSL VVYKAIGTTSSGIQYAAAGSIVLILPALVFTFIIRKYIGQMWGGVEL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:58:03 2011 Seq name: gi|229783937|gb|GG667798.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld191, whole genome shotgun sequence Length of sequence - 9783 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 79 - 282 244 ## gi|266624656|ref|ZP_06117591.1| conserved hypothetical protein + Term 332 - 383 12.0 2 2 Op 1 . - CDS 400 - 1020 700 ## EUBREC_1239 cytidylate kinase 3 2 Op 2 . - CDS 1013 - 2206 1021 ## COG0534 Na+-driven multidrug efflux pump + Prom 3403 - 3462 1.8 4 3 Op 1 2/0.000 + CDS 3492 - 4556 1262 ## COG1312 D-mannonate dehydratase 5 3 Op 2 . + CDS 4549 - 5379 600 ## COG1802 Transcriptional regulators 6 3 Op 3 . 
+ CDS 5459 - 7069 1985 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases + Term 7077 - 7122 11.1 - Term 7063 - 7110 13.1 7 4 Tu 1 . - CDS 7154 - 7984 702 ## COG0207 Thymidylate synthase - Prom 8119 - 8178 7.1 + Prom 8091 - 8150 10.3 8 5 Op 1 . + CDS 8175 - 8678 495 ## COG0262 Dihydrofolate reductase 9 5 Op 2 . + CDS 8706 - 9341 682 ## Closa_2936 hypothetical protein 10 5 Op 3 . + CDS 9373 - 9781 308 ## COG0262 Dihydrofolate reductase Predicted protein(s) >gi|229783937|gb|GG667798.1| GENE 1 79 - 282 244 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624656|ref|ZP_06117591.1| ## NR: gi|266624656|ref|ZP_06117591.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 67 1 67 67 129 100.0 8e-29 MKIINKKVEHKNYGAGTICAMNGGSVCVEFGKLFGMKRFPYPQVFSEGTMKLMDEALQEE LMEDLLT >gi|229783937|gb|GG667798.1| GENE 2 400 - 1020 700 206 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1239 NR:ns ## KEGG: EUBREC_1239 # Name: not_defined # Def: cytidylate kinase # Organism: E.rectale # Pathway: not_defined # 1 196 1 192 194 198 51.0 2e-49 MDNRIITISREFGSGGRTIGKMTAEKLGIPYYDKELVKKVAVETGFDESYIEKQGEDAPS KSIFGYAFAARGMNGAMNGMSADDFLWVIQRNVILDLAEKGPCVIVGRCADYILKERADC LNVFVHADADIRAERIVRLYGESEKSPLKRLEEKDKKRSVYYRYYTGREWGMAANYHIAL NSGKLGLERCVELIAELTDIQKALTS >gi|229783937|gb|GG667798.1| GENE 3 1013 - 2206 1021 397 aa, chain - ## HITS:1 COG:FN0944 KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 4 390 68 454 455 255 39.0 1e-67 MASLALALLAGDGSAAKLSLTLGSGDQETSHKCIGNGILATVFIGIVMTAAGFLFSDQIL RLFGVTEASYPYAREYMEVILIGIPFYIFASGMNAAIRADGSPAYSMFSTVIGAVLNLIL DPVAIFVFHMGVRGAAIATVIGQAASCFVTILYFRKPKSFRFSKTSFLPDWRLLGQIGQL GISSFITQIAIVIVMSVSNNMIGLAGPKSIYGADIPLSVVGIVMKVFGIVIAFSVGIAVG GQPVAGYNYGAGNFKRVFDTYRHIIAANAVVGIIATLLFEFCPQAIVSLFGSESSLYNEY ADMCFRIFLGGILLCCVQKASSIFLQSIGKPVKATVLSLSRDVIFLVPGVILLSMAFGVT 
GMLWAAPIADVLSLILTIILVSHEYKCLKRMEVKTHG >gi|229783937|gb|GG667798.1| GENE 4 3492 - 4556 1262 354 aa, chain + ## HITS:1 COG:TM0069 KEGG:ns NR:ns ## COG: TM0069 COG1312 # Protein_GI_number: 15642844 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Thermotoga maritima # 1 352 1 354 360 440 57.0 1e-123 MNMSLRWYGPGYDTVTLEQIRQIPGVKGVITTLYGTRPGEVWEREEIRSLKKTVEEAGLS ILGIESVNVHDDIKTGAGDRDRYIENYIKTLERLGEEDIHMVCYNFMPVFDWTRSELARK RADGSTVLAYNQDIIDQIDPEHMRESIEAMSAGYVMPGWEPERMGRLAELFAMYREVDEE KLFSNLVYFLEAIGPVCEKYDIRMGIHPDDPAWPIFGLPRIITDKAHIEKLLRAVDRPYN GLTLCTGSLGSNQKNDIPGIIEAAKGRIAFAHVRNLKYNSERDFEEAAHLSSDGSMDMYR IMKTLYDTGFDGVIRPDHGRAIWGEEAMPGYGLYDRALGAMYLQGLWEAIRKNG >gi|229783937|gb|GG667798.1| GENE 5 4549 - 5379 600 276 aa, chain + ## HITS:1 COG:BH1062 KEGG:ns NR:ns ## COG: BH1062 COG1802 # Protein_GI_number: 15613625 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 30 253 4 225 226 104 29.0 1e-22 MAEKDTKKAAAEIPAVKKNTMKTAETAKTAEKENSGSREDRMYDVLRADILDLKLRPGLV FSIKDVCDAYEVGRTPVREALIRLSKEGLITFLPQRGTMVSRIDFQRADNERFLRSCVEE QVMLEFMALSGPDSAAAAALKESLDRQEKIVKDRDCRAFLEEDMVFHSVFYQGADRETCA RIIHANSGDYRRIRLLSLTETGISDGVVRQHRELMEAVLMRDRERVHRLLALHLNKMVKE ERFLMEKYAESFTGGEAVEKRRNDGFATDFLAGCLH >gi|229783937|gb|GG667798.1| GENE 6 5459 - 7069 1985 536 aa, chain + ## HITS:1 COG:TM0068 KEGG:ns NR:ns ## COG: TM0068 COG0246 # Protein_GI_number: 15642843 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Thermotoga maritima # 1 533 1 532 539 571 52.0 1e-162 MNLTDTGLQDRQAWEKAGIAVPTYDRAAMVTRTKENPEWVHFGAGNIFRGFIANLQDALL EAGLAQTGIIAAETFDFDIMDKIYKPFDDLALLITLKADGSTEKKVIASIAEGVKADSGD PKEWQRLRTIFTSPSLQMISFTITEKGYALKGADGTYFPFVQNDIDNGPKELSCAMAVVC ALLLERYRHGAAPLAVVSMDNCSHNGEKLQNSILTMAEEWKKKGFVDDGFLAYLKNEEKI AFPWSMIDKITPRPSAELAGQLEADGVSSMQPVITSKRTYIAPFVNAEGPQYLVIEDKFP AGRPALEKAGVYMTDRDTVNRVERMKVTTCLNPLHTALAVYGCILGYTSIAEEMKDPELK 
ALVEKIGLTEGMPVVTDPKILSPEAFVNEVICDRLPNPFIPDAPQRIASDTSQKVGIRYG ETIKSYVERYGDAKKLTFIPLAIAGWCRYLLAVDDELKPFELSADPMIPELMECLKGVEA GKPETYAGQLKPILSNENIFGVNLYDAGIGSLTEAMFLEEIAGAGAVREALKKYLA >gi|229783937|gb|GG667798.1| GENE 7 7154 - 7984 702 276 aa, chain - ## HITS:1 COG:BS_thyA KEGG:ns NR:ns ## COG: BS_thyA COG0207 # Protein_GI_number: 16078831 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Bacillus subtilis # 1 265 1 269 279 212 43.0 6e-55 MSYADTLFIQNCREILSDGVWDTDSLVRPVWEDGTPAHTIKKFGIINRYDLSKEFPILTL RRTAFKSAIDELLWIWQKKSNNIHDLNSHIWDSWADETGSIGKAYGYQLGIRYQYKEGEF DQVDRVLFDLKHNPLSRRIMTNLYNHQDLHEMNLYPCAYSMTFNVSGNVLNGILNQRSQD MLTANNWNVTQYAVLLHMFASVSGLQAGELVHVIADAHIYDRHVPIVKEMIEKTPYDAPK FIMGPEIHDFYSFTKDSFRLEDYQYHAFTDKIPVAI >gi|229783937|gb|GG667798.1| GENE 8 8175 - 8678 495 167 aa, chain + ## HITS:1 COG:BH3450 KEGG:ns NR:ns ## COG: BH3450 COG0262 # Protein_GI_number: 15616012 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Bacillus halodurans # 4 147 5 147 163 102 41.0 3e-22 MKAIVAVDEAWGIGKDGKLLTHLPEDMKFFRTVTKGKVVVMGRKTLQSFPDAKPLKNRIN IVLTSDAALNGEGLIVCRSVADALRQLKEYDSDDVYIIGGQSVYEQFLPYCDTAYVTRMK RDFGADTWFVNLDRQEGWEETETGDEKEYEGLHFTFCTYKNRNCQDW >gi|229783937|gb|GG667798.1| GENE 9 8706 - 9341 682 211 aa, chain + ## HITS:1 COG:no KEGG:Closa_2936 NR:ns ## KEGG: Closa_2936 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 207 1 207 207 342 78.0 6e-93 MLKCERTSVMNFENAIRGARNPMNSWGRMDSGYDEEGNFVLGPNDMSLARRLRKAGSDHR KYVRQIFVSVDITAPLYWWKEYDTYKVATVANSTSTMHKIHSKPFSMDDFSHDHMTPDTL EFMNQMVEYLEKVRSRYLETKSRDDWYDMIQLLPSSYNQMRTCTMNYESLINMYFARRNH KLAEWHTFCGWIMTLPYAKELIAAEEENEEA >gi|229783937|gb|GG667798.1| GENE 10 9373 - 9781 308 136 aa, chain + ## HITS:1 COG:CAC3004 KEGG:ns NR:ns ## COG: CAC3004 COG0262 # Protein_GI_number: 15896256 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Clostridium 
acetobutylicum # 1 129 2 126 153 93 35.0 1e-19 MNILVTVDKNWAIGNQGQLLVSIPEDQRLLREETLGGIVVMGRKTFETLPGKQPLYNRVN VILSKDRNYRVKGAVVCHSVDETFEFLKQYPQASVFIIGGSSIYDQFLPYCDTVHVTFID YEYSADTHFSLTLDIS Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:58:20 2011 Seq name: gi|229783936|gb|GG667799.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld192, whole genome shotgun sequence Length of sequence - 10852 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 2, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 908 1033 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 2 1 Op 2 . + CDS 958 - 1347 337 ## DSY4205 hypothetical protein 3 1 Op 3 . + CDS 1364 - 2356 751 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 4 1 Op 4 . + CDS 2386 - 2919 620 ## COG1335 Amidases related to nicotinamidase 5 1 Op 5 . + CDS 2954 - 3817 869 ## gi|288871446|ref|ZP_06410172.1| hypothetical cytosolic protein 6 1 Op 6 . + CDS 3836 - 4699 630 ## COG0451 Nucleoside-diphosphate-sugar epimerases + Term 4750 - 4797 0.7 + Prom 4797 - 4856 3.3 7 2 Op 1 4/0.000 + CDS 4876 - 6399 1639 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 8 2 Op 2 26/0.000 + CDS 7345 - 7680 353 ## COG0242 N-formylmethionyl-tRNA deformylase 9 2 Op 3 1/0.000 + CDS 7682 - 8620 1060 ## COG0223 Methionyl-tRNA formyltransferase 10 2 Op 4 2/0.000 + CDS 8624 - 9340 706 ## COG2738 Predicted Zn-dependent protease 11 2 Op 5 4/0.000 + CDS 9333 - 10661 1230 ## COG0144 tRNA and rRNA cytosine-C5-methylases 12 2 Op 6 . 
+ CDS 10667 - 10850 169 ## COG0820 Predicted Fe-S-cluster redox enzyme Predicted protein(s) >gi|229783936|gb|GG667799.1| GENE 1 3 - 908 1033 301 aa, chain + ## HITS:1 COG:CAC2128 KEGG:ns NR:ns ## COG: CAC2128 COG0770 # Protein_GI_number: 15895397 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Clostridium acetobutylicum # 2 300 158 447 452 162 35.0 1e-39 DEIGVIELGMSEPGELTVIAGIAQIQMAVITNIGITHIEQLGSQENIFREKLSIQDGLTE HGILFLNGDDELLKKTTARDGFETIYYGTGENCHYRAVDVRLENGYPVFTAVCGDEQVPV RLKVMGTHNVANAMVSLAVASVCGIPMEAAAKQLAEFTGFQNRQQIYHTGGMTIIDDTYN ASPVSMKAGLEVLNSIEHSRRRVAVLADMKELGPDSPRYHYEIGTYIAAHPVDKVAVLGE PAKEIARGVREQAPHILVYEFMDREPLVEWLKTELREGDAVLFKGSNSMELGKVAAVFLE K >gi|229783936|gb|GG667799.1| GENE 2 958 - 1347 337 129 aa, chain + ## HITS:1 COG:no KEGG:DSY4205 NR:ns ## KEGG: DSY4205 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 125 1 123 124 131 48.0 1e-29 MDREAVIRNFFQDVISQNEVGLGRWFREDAVIRWHCTNEQFSAEEYIRANCEYPGRWNGT IERVEQGKDTAVTAARVWCEEAGISNHVVSFYRFDGDTIVSLDEYWGDDGPAPQWRLEMK LGRPIGERP >gi|229783936|gb|GG667799.1| GENE 3 1364 - 2356 751 330 aa, chain + ## HITS:1 COG:CAC1006 KEGG:ns NR:ns ## COG: CAC1006 COG0494 # Protein_GI_number: 15894293 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Clostridium acetobutylicum # 4 139 3 137 146 140 48.0 4e-33 MKRVVFYEKADDELLRFAVIAAKYQGQWVLCRHKARSTWEFPGGHRETGETIEAAAVREL YEETGAAEYRMAPVGVYSVTDEAEPEEKRSESFGMLFYAEITRFDPLPSEFEMEKTGYFE TPPDNWTYPEIQPVFLKRLNQEEAKRSREQGGHWGAEAPKARLFCGNQMICAEKTAGLLY RRAGICDRELLTRLRIQVLRAANGLDEDADMSRAESESRSYYGTCFLDDSHAAWLVFDGR EVVGTGAVSFYRVMPAYHNPSGKKAYIMNMYTRPDYRRRGIAYRVLELLTAEAEMRHVDA VTLEATAAGRPLYEAFGFTAMRDEMELLRE >gi|229783936|gb|GG667799.1| GENE 4 2386 - 2919 620 177 aa, chain + ## HITS:1 COG:CAC3465 KEGG:ns NR:ns ## COG: CAC3465 COG1335 # 
Protein_GI_number: 15896704 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Clostridium acetobutylicum # 1 172 1 172 178 178 47.0 4e-45 MGKIAVLIVDVQKALVEAHPYREEEFLQELKRVADGARKAGAEIIYVRHDGDAGDELESG TEGWNIHGAVEPKPGERIFDKNYNSAFLETDLEEYLKEKQITDLILMGMQTEYCMDSTCK AAFELGYGVVIPEGATTTYDNEFMSGEATVKFYETKIWDGRFADLMSVEEVLELLAR >gi|229783936|gb|GG667799.1| GENE 5 2954 - 3817 869 287 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871446|ref|ZP_06410172.1| ## NR: gi|288871446|ref|ZP_06410172.1| hypothetical cytosolic protein [Clostridium hathewayi DSM 13479] hypothetical cytosolic protein [Clostridium hathewayi DSM 13479] # 1 287 5 291 291 575 100.0 1e-162 MEEQTLAELLYAAVFPVKKQGKGCSEERKIWNSFLDQLNQGLDASGCGAKAEAYTESEPY GQACGFHRDYGYFMVSCGDRGTKSVTILEAEAETAVNRFLALIIDKQAYQWVLDIKDREE ASWPYHCRYDYRRLWFGKALEWLAPAVDWAFFAQCADRYIGLLNRWFARPHWEYDYQISR IIEISDSREYSDSHYLYRIEPAKEPGRSRIRLEPQFLEDGGAAGTFQFDIDGEALDMRRL DEKVLVLMDREEDNIDAYSLEGRFLWNIGEKLPVHRRYDNLYEDISG >gi|229783936|gb|GG667799.1| GENE 6 3836 - 4699 630 287 aa, chain + ## HITS:1 COG:lin1855 KEGG:ns NR:ns ## COG: lin1855 COG0451 # Protein_GI_number: 16800922 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Listeria innocua # 1 279 1 284 291 182 37.0 7e-46 MKILVLGGTRFFGVHLVNTLLLQGHEVAVATRGNAKPRFIAPVEAIRVDRIQEESMRAVF SGREFDVVYDDLAYCSNDVKNALETVRCRRYIMVSSISVYELKENTRETDYDPYRETLIW GKRSAFDYGMGKRQAEAALVQAFPEQESVMVRFPVVLGRDDYTERLRFYAAHTAKGIAMQ VENENCPVSFISAEECGRFLAFLADSSLEGPVNAAASGTITIGEIIRYIEEKTGKKAVLS PEGEEAPYNGFPGFSINTEKAEGAGYHFSVLDTWIYPLLDFYGETEG >gi|229783936|gb|GG667799.1| GENE 7 4876 - 6399 1639 507 aa, chain + ## HITS:1 COG:BS_priA KEGG:ns NR:ns ## COG: BS_priA COG1198 # Protein_GI_number: 16078634 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: 
Bacillus subtilis # 5 504 3 564 805 370 35.0 1e-102 MTKRFAQIIIDISHEKVDRTFDYRIPPQLEDRIFVGVLVKIPFGKGNSLRKGYVVGITDH ADYDADKIKEIAEIVEGGVSAESQLIMLAWWLKEQYGSTMNQALKTVLPVKQKVKPKEKK VLRLLIPDEQLEAVTAEAEKKSYKARVRLFKALKENHVIPYEVASGQMNLSAATLKPVIE KGYVALESEEIFRNPVKDAGSRVKAVELNKEQQAVVSAFCEDYEAGKRETYLIHGITGSG KTEVYMEMISRVISEGMQVIVLIPEIALTYQTVMRFYGRFGNRVSIINSRLSAGERYDQF ERARSGDIDIMIGPRSALFTPFARLGMIIIDEEHEGAYKSEVVPRYQAREVAVKRAQMQN ASVVLGSATPSLEAYTKALRGEYRLFKLNTRAKADSRLAEVAVVDLREELKEGNKSIFSR KLQQMITDRLEKKEQIMMFINRRGYANFVSCRSCGEAIKCPHCDVTLTLHKDNRLVCHYC GYSIPMPEQCPGCGSPYIANFGVGLAS >gi|229783936|gb|GG667799.1| GENE 8 7345 - 7680 353 111 aa, chain + ## HITS:1 COG:CAC1722 KEGG:ns NR:ns ## COG: CAC1722 COG0242 # Protein_GI_number: 15894999 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Clostridium acetobutylicum # 1 92 54 144 150 88 55.0 3e-18 MLKQIVVIDVEDGNQYVLINPVITETSGSQTGSEGCLSVPGKSGVVTRPDHVKVKAFDCE MNEFELIGEGLLARAICHECDHLSGDLFVDKVEGELTDVDQEEDETEEGDE >gi|229783936|gb|GG667799.1| GENE 9 7682 - 8620 1060 312 aa, chain + ## HITS:1 COG:CAC1723 KEGG:ns NR:ns ## COG: CAC1723 COG0223 # Protein_GI_number: 15895000 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Clostridium acetobutylicum # 1 309 2 310 310 290 47.0 2e-78 MRIVFMGTPDFSVPALTALVEAGHDVIAVVTQPDKPKGRGKEVQMPPVKVKALEYGIPVY QPVKARDPEFVSLLKEMQPDAMVVAAFGQLLPKTILDIPKYGCVNIHASLLPKYRGASPI QYAVINGEPVSGITTMMMAEALDTGDILDQETVALDEKETGGSLHDKLSAIGGRLIIKTL KKLEDGTAVRTVQDEKETCYVGMIKKSKGDIDWSQDAVSIERLIRGLNPWPSAYTGWEGR IIKLWEADVLEEEYEGACGQVVKTGKDCLYVKTGKGTLAVKELQLQGKKRMDTGAFLRGY QIAEGTVFERSV >gi|229783936|gb|GG667799.1| GENE 10 8624 - 9340 706 238 aa, chain + ## HITS:1 COG:CAC1724 KEGG:ns NR:ns ## COG: CAC1724 COG2738 # Protein_GI_number: 15895001 # Func_class: R General function prediction only # Function: Predicted Zn-dependent protease # Organism: Clostridium acetobutylicum # 7 234 1 226 227 197 48.0 
1e-50 MFYGGLMGYYFDPTWILVIIGAVLSMAASAKVNSTFSKYSKVRSMTGMTGEDAAKRLLNS QGIYDVTVRPVKGQLTDHYDPRTKTVNLSESVFHSTSVAAIGVAAHECGHAMQDNVGYVP LKLRGAIVPVANIGSQAAFPLIIIGVLIGGMGSPLVNIGLILFSLAVIFQLITLPVELNA SRRAITLLDQVGILGGQEVNQTRKVLGAAALTYVAALAASVLQLLRLVILFGGRRDND >gi|229783936|gb|GG667799.1| GENE 11 9333 - 10661 1230 442 aa, chain + ## HITS:1 COG:BH2507 KEGG:ns NR:ns ## COG: BH2507 COG0144 # Protein_GI_number: 15615070 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Bacillus halodurans # 6 441 5 448 450 292 37.0 1e-78 MTNGNSSREIILDILLEILEKGSYSHIVLSQALSKYQYLDKQERAFISRIVEGTVEYTLQ LDYVINSYSSVKVKKMKPVIRTLLRMSVYQILYMDRVPDSAVCNEAVKLAQKRKFTGLKG FVNGVLRGISRNKEQLKWPDDSVRYSMPSWILDMWKDTYGEEITVSMAKAFLKPSRTAVR CNLNRASKQEIMDSLKSQGVTVEETPLSETVLYLSKYDYLESLDAFASGLISVQDMSSSF VGEIADPKEGDVCIDVCGAPGGKSLHIADKLKGTGMVTVRDLTEQKVALIEENIARSGFT NIRAEVHDALVPDQAWEGRADIVLADLPCSGLGIIGRKPDIKYRMTMEQLAELAALQREI LSVVQSYVKPGGKLIYSTCTIDSQENEENAAWFLENFPFDAVNLEGKLGETLNAASMKSG MIQLLPGIHPCDGFFIAAFQKR >gi|229783936|gb|GG667799.1| GENE 12 10667 - 10850 169 61 aa, chain + ## HITS:1 COG:BS_yloN KEGG:ns NR:ns ## COG: BS_yloN COG0820 # Protein_GI_number: 16078638 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Bacillus subtilis # 2 57 16 71 363 58 44.0 4e-09 MEKTDIKSLNLEELTAFITASGEKAFRAKQLYEWMHKKLAPGYEEMTNLPKALKERLKET C Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:58:39 2011 Seq name: gi|229783935|gb|GG667800.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld193, whole genome shotgun sequence Length of sequence - 10896 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1183 909 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Prom 1186 - 1245 8.8 2 2 Op 1 . 
+ CDS 1332 - 1592 272 ## gi|266624680|ref|ZP_06117615.1| PTS system, IIB component, lactose/cellobiose family 3 2 Op 2 . + CDS 1597 - 3684 1634 ## COG3711 Transcriptional antiterminator 4 2 Op 3 . + CDS 3700 - 4362 683 ## COG3037 Uncharacterized protein conserved in bacteria + Prom 5201 - 5260 80.4 5 3 Op 1 . + CDS 5322 - 6035 463 ## COG3037 Uncharacterized protein conserved in bacteria + Term 6048 - 6090 6.9 + Prom 6123 - 6182 3.4 6 3 Op 2 8/0.000 + CDS 6225 - 6542 438 ## COG1445 Phosphotransferase system fructose-specific component IIB 7 3 Op 3 2/0.000 + CDS 6556 - 7077 588 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 8 3 Op 4 1/0.000 + CDS 7058 - 7777 886 ## COG0036 Pentose-5-phosphate-3-epimerase 9 3 Op 5 . + CDS 7791 - 8894 1055 ## COG1299 Phosphotransferase system, fructose-specific IIC component - Term 9850 - 9922 19.2 10 4 Tu 1 . - CDS 9951 - 10895 955 ## COG2200 FOG: EAL domain Predicted protein(s) >gi|229783935|gb|GG667800.1| GENE 1 2 - 1183 909 393 aa, chain + ## HITS:1 COG:BS_yhdW KEGG:ns NR:ns ## COG: BS_yhdW COG0584 # Protein_GI_number: 16078027 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Bacillus subtilis # 136 361 3 234 243 151 34.0 2e-36 KTAVLLHILGAFAAFGIAAALLLGAANLGLVYYTKQVEIPLMAAETFVSYFDRAVPAAVF ILSSLSTIWLFSLLISLYHQYREDVRPSAVLRNKTGLLYHFKQAVVIAFTVIAIVIFSET ELGGSFTAETYAGPKIVAHRSGALFAPENTMAALDHAIAMNIDMVEIDVQQLKDGELILL HDDSFNRTTGENKKVWEAGYDEVKHYDAGSWFSPQFSGERIPKLDDFLMRARDNIRVMIE LKSTGHETNLVERVAALIETYDMLDQCNIGSMNLELLKEVKAINPQIETVYITPLIYSLQ YDIDFIDAFSVETTMLTREMVVSMHWNGKDVYGWTANSKETIKKNLQCQVDGIVTDNPEL VNHYVMQTWHNRLLSALLQTFFNTTSADQNDMQ >gi|229783935|gb|GG667800.1| GENE 2 1332 - 1592 272 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624680|ref|ZP_06117615.1| ## NR: gi|266624680|ref|ZP_06117615.1| PTS system, IIB component, lactose/cellobiose family [Clostridium hathewayi DSM 13479] PTS system, IIB component, lactose/cellobiose family 
[Clostridium hathewayi DSM 13479] # 1 86 10 95 95 139 100.0 7e-32 MCCCGNGVGTSLIMQMTIEEALEYLGMDDIEVSFGSLSDISADRADLFIVSQELLESLGD LPAVGLEDLMDSEMAAELLKPLLGIE >gi|229783935|gb|GG667800.1| GENE 3 1597 - 3684 1634 695 aa, chain + ## HITS:1 COG:BH0220_1 KEGG:ns NR:ns ## COG: BH0220_1 COG3711 # Protein_GI_number: 15612783 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus halodurans # 6 528 3 556 556 200 26.0 8e-51 MAVGYLNERTKKIIDMILKRVGALSIQEIADELAVSTRTVYNELDRANDWLHTKHLPDIQ IIRGKIQPFTEEEKGSFEAVMQTEQPHDDYIFTPSERTRMIVCQIIVSREPVYVENLMNT CLVSRNTIFVDLQAVISQLHAYQLGLGYEKKRGYWIDGDPIRIRAIFFLYFNMLEPLLSS GKLRFLYLDEMEFHLKQIEQVEKQLGVSYVRNDMLALAAMLPVMEKGEDSLYFTDVSTEK VKVSKEYQLVMEYFPTLSDMEKVYLSLHFLGGRLESYSRTEADTGDNEFALEITKNLITE FEREACVLFERKEDLIRNLYRHIAASLYRYRFGIQIGNPMAEDIRREYPYIFDTTRAVVK YLEQQIGVLISDSEVAYLALHFGAHLEYAERDEKELRILVACMNGVATGNMISHELKRIL PQARIVAVMAASEIVNPQNMCDIIITSVHVKAVVPVIVVNTILNDFDRKNILNHPLIRSR FGFIDTDALFKVVKKYVDPQHHMELRRELEKFFARQKEEKQPVLKPDIWRLTDFLTEDRI LFLDRNGRERTEEKREWSGDGYSTGWERALYTAASPLLERGSMSEEYVRCIVDRMVEAGP YMFITRDLILAHARPENGVKHLDLSIGIAADGIEFEGGKTARVIFCLVVEDQQKHMGILQ DIRKTLAKPRQIDELVRAESAYEVCDILRSRLTDR >gi|229783935|gb|GG667800.1| GENE 4 3700 - 4362 683 220 aa, chain + ## HITS:1 COG:STM2342 KEGG:ns NR:ns ## COG: STM2342 COG3037 # Protein_GI_number: 16765669 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 1 205 4 207 463 165 44.0 7e-41 MQIVTVVREILATPAFLVGFVTLLGLLLQKKPVEHVIKGTITAIVGFVLLSAGSDFLQKG ALKDFGVLFNYDFHIQGVIPNMEAVASLGVAQYAVEVSMVMFLGMVANLVIARFGPFHYI FLTGHHTLYMACLLTVALSGSSMKEWQIIAAGALMLGLFMALMPALAQGEMKKITGGNGI ALGHFSTCGYLIAAKTARLAAKKDRRIKTAALETGTVLAS >gi|229783935|gb|GG667800.1| GENE 5 5322 - 6035 463 237 aa, chain + ## HITS:1 COG:SP2129 KEGG:ns NR:ns ## COG: SP2129 COG3037 # Protein_GI_number: 15901943 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 1 
218 227 431 448 172 45.0 6e-43 MTGIFLVVSGIAAARTDLAALDISYKTGGFQNWIIYAVIQGAQFSAAIYIILSGVRLVIA EIVPAFKGIAKRIVPHARPAVDCPVLFSYAPNAAMIGFLMSFLGGIVTMLILIGINTWFG ELIVPVIVPGVVAHFFCGGSAGVFANTEGGVKGCLVGSFVHGILISVLALIVMPVLGTLN LSGTSFPDSDFCIAGILFGNLASVLSGGGILLVCVLVFILPIIWEQTAKYRKTLKAD >gi|229783935|gb|GG667800.1| GENE 6 6225 - 6542 438 105 aa, chain + ## HITS:1 COG:SP1618 KEGG:ns NR:ns ## COG: SP1618 COG1445 # Protein_GI_number: 15901455 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Streptococcus pneumoniae TIGR4 # 1 101 6 106 108 111 68.0 3e-25 MKIVGVCACTVGIAHTYIAQEKLEKAAEKAGFEAKFETQGSIGRENALTQEDIAAADLVI LAIDVKIADRDRFEGKKTIQVSTDVAIKSPNKLMEKAKQVMEAAK >gi|229783935|gb|GG667800.1| GENE 7 6556 - 7077 588 173 aa, chain + ## HITS:1 COG:SP1619 KEGG:ns NR:ns ## COG: SP1619 COG1762 # Protein_GI_number: 15901456 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Streptococcus pneumoniae TIGR4 # 1 148 1 148 149 141 48.0 7e-34 MDFTEVLNPRTIITHLKVNQKAEALEAMAELFAEAGVIGDKEQYIKDVYVREEMGATGIG GYIAIPHGKSKAVITPGAAIAVLDHEIEWESLDGAGAKVIILFAVGADHEAAEEHLRLLT MFSKRLGDRAVVSRLIDADTAEEVMLAFTDDGCIRKEEPDEEEELNLDEISIM >gi|229783935|gb|GG667800.1| GENE 8 7058 - 7777 886 239 aa, chain + ## HITS:1 COG:alsE KEGG:ns NR:ns ## COG: alsE COG0036 # Protein_GI_number: 16131911 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Escherichia coli K12 # 17 229 4 216 231 204 45.0 9e-53 MKFQLCESEETNMVTFAPSLMCMDLGKIKEQFAVMDQYMKLYHVDIMDGHFCKNMALTPG ILKSFREHTNTDIDVHLMTTNPSDWIEMCAAAGATYISPHAETINTDAFRTMQMIQKLGC KCGVVLNPATPLSYVESYLEYIDMLTIMTVDIGFAGQKFIDQMYNKIMQAKELREKNGYH YKIQIDGHCNKENYRQLAEAGADVLIVGNAGLFGLDQDLDTACRKMEMEWKESMLEKAV >gi|229783935|gb|GG667800.1| GENE 9 7791 - 8894 1055 367 aa, chain + ## HITS:1 COG:SP1617 KEGG:ns NR:ns ## COG: SP1617 
COG1299 # Protein_GI_number: 15901454 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Streptococcus pneumoniae TIGR4 # 4 364 2 358 361 376 63.0 1e-104 MEKLKSLNIKSHLLTAISYMLPIVCGGGFLVAIGMALGGTNLYAAGLVQGQFTFWDALAT MGGAALGLLPLIISVGISFSIAGKPGIAPGFVAGLSAIAISAGFIGGILGGYLAGYLALF IIKNLKLPAWARGLMPTVVVPLISSFIAGLVMIYVFGVPIAAFTNWLTNVLLGLGTSSKL VLGLVIGSLCIVDFGGPINKTVYAFTLTLLASGIREPVTALQLVNTATPIGFGLAYLIAK ALHKNIYEPEQVENLKSAVPMGVLNIVEGVIPIVMSDLARALIASAVGGACGGAVTMVLG ADSTVPFGGFLMLPTMTRPWTGLLAIAVNVVVTGCLYAVIAKNKGNEGFADEREEEEEPD LEDILAS >gi|229783935|gb|GG667800.1| GENE 10 9951 - 10895 955 314 aa, chain - ## HITS:1 COG:STM3388_3 KEGG:ns NR:ns ## COG: STM3388_3 COG2200 # Protein_GI_number: 16766683 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Salmonella typhimurium LT2 # 53 304 3 254 273 189 36.0 7e-48 NFGVHIIQADTPWDAHRHSPDTAINGALLALNSVNGNTSNPVAFYDEALYEKARKKIEIE SQMYAALANGEFHMYLQPKYYLKDGSMHSAEALVRWYSSDGVIHYPDEFIPVFEENGFIT ELDMYMLEEACRKVSQWKQQGYKVSPVSVNQSRVFFYDEEYLDKFHEIVDQYHIDPSLII LEVTESVAMSNLEQVKMVIRKLHKIGFSISMDDFGSGYSSLNTLKDLDIAELKLDKDFLS EQSTSERGKIVIESVIHLAKALSITTVAEGIENDVQLDFMKSICCDIGQGYYFAKPMPAE EFERLLLKKEQEDV Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:58:49 2011 Seq name: gi|229783934|gb|GG667801.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld194, whole genome shotgun sequence Length of sequence - 8451 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 8/0.000 - CDS 2 - 851 972 ## COG0549 Carbamate kinase - Prom 950 - 1009 1.8 - Term 953 - 1000 9.1 2 1 Op 2 . 
- CDS 1014 - 2207 981 ## COG0078 Ornithine carbamoyltransferase - Prom 2333 - 2392 3.4 3 2 Op 1 4/0.000 - CDS 2403 - 3716 1339 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 4 2 Op 2 1/0.000 - CDS 3792 - 4997 1213 ## COG1171 Threonine dehydratase 5 2 Op 3 . - CDS 5042 - 5689 668 ## COG2964 Uncharacterized protein conserved in bacteria - Prom 5725 - 5784 8.9 + Prom 5787 - 5846 9.5 6 3 Op 1 1/0.000 + CDS 6022 - 6417 496 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases + Prom 7260 - 7319 80.4 7 3 Op 2 . + CDS 7451 - 8269 932 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases Predicted protein(s) >gi|229783934|gb|GG667801.1| GENE 1 2 - 851 972 283 aa, chain - ## HITS:1 COG:yqeA KEGG:ns NR:ns ## COG: yqeA COG0549 # Protein_GI_number: 16130776 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 5 278 3 274 310 350 66.0 2e-96 MNKKKRIVIALGGNALGNTLPEQMTAVKITAKAIVDLIEEGCEVVVAHGNGPQVGMINNA MSALSREDPNQPNTPLSVCVAMSQAYIGYDLQNALREELLNRNITDIPVTTMITQIRVDA EDPAFNAPSKPIGHFMTEEQAKIAEEKYGYIMKEDAGRGYRRVVASPKPAEIIEIGAIRS LVDSGQLVIACGGGGIPVTLIGNHLKGASAVIDKDFASELLAEELDADFLIILTAVEKVA INFGKPEEKWLDDITTDDARKYMDEGHFAPGSMLPKVQAAVKF >gi|229783934|gb|GG667801.1| GENE 2 1014 - 2207 981 397 aa, chain - ## HITS:1 COG:ECs3743 KEGG:ns NR:ns ## COG: ECs3743 COG0078 # Protein_GI_number: 15832997 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Escherichia coli O157:H7 # 1 395 2 395 396 540 65.0 1e-153 MKTLQDYIDKLNALNFKEMYNNDFFLTWEKTDEELEAVWTVADALRFMRENNISTKVFES GLGISLFRDNSTRTRFSFASACNLLGLEVQDLDEGKSQVAHGETVRETANMISFMADVIG IRDDMYIGKGNAYMHEVVDSVTQGHKDGILEQKPTLVNLQCDIDHPTQCMADMLHIIHEF GGLENLKGKKLAMTWAYSPSYGKPLSVPQGVIGLMTRMGMEVVLAHPEGYEVMPEVEEVA RKNAEKSGGSFRVSHDMADAFKDADIVYPKSWAPFAAMEKRTNLYGEGDTEGIKALEKEL LAQNAEHKDWCCTEELMKTTKDGKALYLHCLPADINDVSCVDGEVEASVFDRYRDPLYKE 
ASFKPYVIAAMIFLAKFENPQEILKKLEERNTPRVFK >gi|229783934|gb|GG667801.1| GENE 3 2403 - 3716 1339 437 aa, chain - ## HITS:1 COG:ECs3745 KEGG:ns NR:ns ## COG: ECs3745 COG0624 # Protein_GI_number: 15832999 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli O157:H7 # 6 431 10 396 403 554 63.0 1e-157 MDYAKIKEAAKNYEADMTRFLRDIVKFPGESCDEKAHIDRIAEEMRKLEFDKVVIDDMGN VLGYMGTGATLIGFDAHIDTVGIGNKSNWNFDPYEGYENDTEIGGRGVSDQCGGIVSAVY GARIMKDLGLLSDKYTVLVTGTVQEEDCDGLCWQYIINEDKVRPEFVVSTEPTDGGIYRG QRGRMEIRVDVKGVSCHGSAPERGDNAIYKMADILQDVRALNENDAADDKEVKGLVKMLD ESYNAEWKEANFLGRGTVTVSEIFFTSPSRCAVADSCSVSLDRRMTAGETWESCLEEIRS LPSVKKYGNDVTVSMYEYSRPSYKGLTYPIECYFPTWVIPEDHKVTKSLEEAYKNLYGDV RIGAEETLAMRTARPLTDKWTFSTNGVSIMGRNGIPCIGFGPGAEAQAHAPNEKTWKQDL VTCAAVYAALPSIYCEK >gi|229783934|gb|GG667801.1| GENE 4 3792 - 4997 1213 401 aa, chain - ## HITS:1 COG:STM1002 KEGG:ns NR:ns ## COG: STM1002 COG1171 # Protein_GI_number: 16764362 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Salmonella typhimurium LT2 # 33 399 36 402 404 403 54.0 1e-112 MNPIEWVVNTMPKTDDKNLSVMALDEIGKARKFHESFPQYSKTPLAKLDHMAEYLGLGQV YVKDESYRFGLNAFKVLGGSFAIAKYIAQQTGKDVSDLPYSVLTSEELRKEFGQATFFTA TDGNHGRGVAWAANRLGQKAVVFMPKGSTITRFNNIKAEGAEVTIEEVNYDECVRMAADA ASKTEHGVVVQDTAWDGYEEIPAWIMQGYGTMAMEAGEQLEEYGCYRPTHVFVQAGVGSL AGAVQGYFANRYPENPPKVVVVEADVAACLYKGAKAGDGGIQIVDGDMQTIMAGLACGEP NTISWDILKNHVDTFVSAPDWAAAKGMRMLAAPIKGDVSVTSGESGAAPFGILACIMTMD EYKDLRDHLGLNKDSKVLLFSTEGDTDPDRYKAIVWEGKER >gi|229783934|gb|GG667801.1| GENE 5 5042 - 5689 668 215 aa, chain - ## HITS:1 COG:YPO1671 KEGG:ns NR:ns ## COG: YPO1671 COG2964 # Protein_GI_number: 16121935 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 19 208 35 215 225 71 27.0 9e-13 MDSKLEQLTQIAHGVASHFGNACEVVIHDLKKQDIESSIVYIENGHVSNRKLGDGPSEIV 
LETLNRDPECLKDRLSYLTKTDDGRILKSSTMYIRGKDGSVDYIFSLNYDITGLLTVDNA IRSLISTAEMAGEDRQPRKITHNVNDLLDTLIEQSVQLVGKPVALMSKDDKITAIQYLND AGAFLITKSGDKVSNFFGISKFTLYSYMDAGKENR >gi|229783934|gb|GG667801.1| GENE 6 6022 - 6417 496 131 aa, chain + ## HITS:1 COG:Z4218 KEGG:ns NR:ns ## COG: Z4218 COG0402 # Protein_GI_number: 15803416 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli O157:H7 EDL933 # 1 131 23 150 464 81 32.0 3e-16 MLIIGNGRMITRNPEAPFLEDGAVAMDGSTIIKVGATKEIKEAYPDAEYIDAKGGVIMPA FINAHEHIYSAFARGLSIKGYDPHGFLEILDGMWWTIDRNLTLEETRLSAMATYIDSIKN GITTTFDHHAS >gi|229783934|gb|GG667801.1| GENE 7 7451 - 8269 932 272 aa, chain + ## HITS:1 COG:ssnA KEGG:ns NR:ns ## COG: ssnA COG0402 # Protein_GI_number: 16130781 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli K12 # 17 271 214 462 464 125 32.0 1e-28 MENAAFIKHALADDSDMIAGMMGMHAQFTISDETMELAAANKPAEVGYHIHVAEGIEDLH DCLKKYGKRIVDRLMDCNILGPKTLLGHCIYVNPHEMELIKETDTMVVHNPESNMGNACG CPPTMEIVHRGILTGLGTDGYTHDMTESYKVANVLHKHHLCDPNAAWGEVPQMLFEGNAK IANRYFKKPLGVLKEGAAADVIVTDYIPLTPMNADNVNSHILFGMTGRSVTTTVGNGKVL MKDRILTNIDEEKVMADCRQAAKELADRINSR Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:58:52 2011 Seq name: gi|229783933|gb|GG667802.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld195, whole genome shotgun sequence Length of sequence - 9394 bp Number of predicted genes - 10, with homology - 8 Number of transcription units - 6, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 117 110 ## EUBREC_1606 putative reverse transcriptasematurase of intron - Term 41 - 98 12.5 2 2 Tu 1 . - CDS 230 - 1441 781 ## Dhaf_4021 hypothetical protein - Prom 1464 - 1523 5.3 + Prom 1579 - 1638 3.8 3 3 Op 1 . 
+ CDS 1725 - 2888 868 ## Pjdr2_1985 glycosyl hydrolase family 88 4 3 Op 2 . + CDS 2906 - 4993 1509 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 5 3 Op 3 . + CDS 4990 - 6009 933 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 6065 - 6121 13.2 - Term 6058 - 6104 9.8 6 4 Tu 1 . - CDS 6111 - 6233 92 ## - Prom 6255 - 6314 80.4 7 5 Op 1 . - CDS 7158 - 7256 89 ## 8 5 Op 2 . - CDS 7249 - 8016 218 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 8049 - 8108 3.6 + Prom 8074 - 8133 4.5 9 6 Op 1 . + CDS 8210 - 8896 643 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 10 6 Op 2 . + CDS 8952 - 9393 402 ## Dhaf_0040 histidine kinase Predicted protein(s) >gi|229783933|gb|GG667802.1| GENE 1 1 - 117 110 38 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1606 NR:ns ## KEGG: EUBREC_1606 # Name: not_defined # Def: putative reverse transcriptasematurase of intron # Organism: E.rectale # Pathway: not_defined # 3 37 429 463 464 68 88.0 9e-11 IGFTTQTVAVNMATTKERLINSGYYDLATAYQSVHVNC >gi|229783933|gb|GG667802.1| GENE 2 230 - 1441 781 403 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_4021 NR:ns ## KEGG: Dhaf_4021 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 399 1 392 393 520 63.0 1e-146 MTEPSIEQIRIFRLRSHHLDAVYPKSEADRLAGACGMQNTPPGAWETAFFNRAPDCTLSD MEHLLYEQKTLLQAWSYRGTPVVFPVSESSAFLSALIPDEPEPWIYTQGITLALDYLQMD SDSLLEMLKQVIPQLDDHVIVSKSALDQTLAGWMTPLIPVQKRELWNQPSMYGSPDIQTV GGAVVSFLLRPCAFHGLVVFGKRDGISPTFTSFKNWTGSPLPPDRDAKKKLVRKYLHCYG PATADMFASWLGCSGKQARRMWNTISEELEPVTFFGKKAFILSEDRERLFAPASFTGKHP GDLQRNLLLLGGHDPYLDQRDRAVLQPDKTLQKQIWKLVTNPGAVVYRGEVIGIWTTKKK AKGMDIKMTLWTDAAGKTPLQNLAEEYAAFRQQALMNLEIAQL >gi|229783933|gb|GG667802.1| GENE 3 1725 - 2888 868 387 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_1985 NR:ns ## KEGG: Pjdr2_1985 # Name: not_defined # Def: glycosyl hydrolase family 88 # Organism: Paenibacillus # Pathway: 
not_defined # 1 387 1 384 384 550 66.0 1e-155 MISLQDNAWIDEAVEKMMKKMDLVSERSRNKIPYTAINGVHDDRSREDQTFSADDGINWW CNGFWAGMLWQMYHVAGDEKYAKIARLSEEKLDRCFSDYYGLHHDVGFMFLPSAVADYRL TGNPDSRRRGLHAANLLAGRFNPVGKFIRAWNDIPGSQDDTRGWAIIDCLFNIPLLYWAS EETKDPRFEQIAKMHADTVMETFVRPDGSVRHIVEFDPFTGEMVRDYGGQGYGQGSSWTR GQTWGLYGFLMSYRHTGKQEYLDTAKRIAHYFIANIPEDGLIPVDFRQPAEPLWYDDSAA AIAACGLLEIAKEVGDLEKELYEGAALKLLHALCREHCDFSMDSDCFLTKCTGSYHNEKD HEMPIIYGDYFFIEAMLKLKGQDVFLW >gi|229783933|gb|GG667802.1| GENE 4 2906 - 4993 1509 695 aa, chain + ## HITS:1 COG:TM0308 KEGG:ns NR:ns ## COG: TM0308 COG1501 # Protein_GI_number: 15643077 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Thermotoga maritima # 68 618 148 666 764 145 24.0 3e-34 MEKIAEGIYKITYQKPEPFTPVSMRKSKMADGALKRLENGRKDGAAALPFEETEIQFEQT ARGITVTLPMKTDEDIYGFGLQLLSLNAAGRKRYIKVNSDPRMDTGESHAPAPFYVSTAG YGLLVDTYRYATFYMGTNTEKGASAGKREVNIPHQEFSESALYAFKRAREERKVIIEIPA AGGIDLYFFAGNVKEAVMRYNLFSGGGCLPPMWGLGMWYRVYGGSDENHVKKLAKGFREN QMPVDVIGIEPGWHSHSYSCTYEWGYLFPNHQEMVDTLNRDGYHVNLWEHIFVYPAAPFY EKLLDYSGNYEVWNGLVPDFATEEAQNIFAGWHEEQFIRNGIAGFKLDECDNSDFNPSNW SFPDSTKFPSGMDGEQMHNAVGVLYQELIFRAFKNRGIRTLSSVRSSGALAAPLPFVLYS DLYGHREFIRGMVTASFSGMLWAPEVRDCANGKDLLRRIETIVFSPQALLNCWRIPNPPW EQVCIEKNLQNEAMEERGFYMAACRKLFELRMSLLPYLYSAFAEYHELGIPPVRAMAIED PEDYEARNIDDQYFFGKDLLVCPLTLEDGDSREVYLPGGLWYGFFDGQPYEGGRRFTVQA KEEEIPVFVRDGAIVPTAEPVLCVREDTVFEMTIRSYGWKEGEYTLYEDDFTSFRYENEE MRKVVIRRDGDGKIEIAGGENSQRYRFRIERRETP >gi|229783933|gb|GG667802.1| GENE 5 4990 - 6009 933 339 aa, chain + ## HITS:1 COG:AGl3399 KEGG:ns NR:ns ## COG: AGl3399 COG0111 # Protein_GI_number: 15891817 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 47 327 44 329 338 152 33.0 1e-36 MRDKLKPIKILVSIPAGEMRDSFFSEALSARLEQLGCVEWNTNTEQFSEEELAEKLSGVD 
ICISGWGNTPFYEKTLKYADKLKLIAHIGGSVRPMVGDAAFERGIRVCSGNRVFAESVAE GVLTYMLCSLRKIGEYEARMAAGEWPSLIGTRGLLGRSVGLVGYGMIAEYLVKFLKPFGC RILVSSRHISEEELAAAGIEAASAEEIFRTCDIVSLHNSLTMRTKHSIGAELLNSMKDGA LLVNTARGALIDEEALVSVLKERSVWAALDVYETEPLPMDSPLRDCERVLLMPHAAGPTA DRRYAVTSYVLDDMERFLNGENLDCEIDFARAGTMTTVV >gi|229783933|gb|GG667802.1| GENE 6 6111 - 6233 92 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIIYGSQVFKMDLVVMSIVILCIVSAILYQGITILEKKMK >gi|229783933|gb|GG667802.1| GENE 7 7158 - 7256 89 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIETHPTLAQQEFIAKEMLHRRNVRILRIMLS >gi|229783933|gb|GG667802.1| GENE 8 7249 - 8016 218 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 211 1 211 305 88 28 2e-17 MKPKLEVCGVSYSYHSLEGETLALSNISFTVDNGEFIAIVGPSGCGKSTLLSLLSGLIEP DEGEIIIDGVPRTHSGAGIGYMLQRDHLFEWRTILGNAELGLEIQKKLNDGSKKKLHEML NTYGLEAFKNSRPSELSGGMRQRAALVRTLALEPDLLLLDEPFSALDYQTRLAVCDDISS IIKSTHKTAILITHDLSEAISVADRVIILTSRPGRMKGIVPVSFGDGYVRPLERRNTPEF SAYFNQVWKELQNHD >gi|229783933|gb|GG667802.1| GENE 9 8210 - 8896 643 228 aa, chain + ## HITS:1 COG:BS_yrkP KEGG:ns NR:ns ## COG: BS_yrkP COG0745 # Protein_GI_number: 16079696 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 3 226 4 227 231 228 50.0 8e-60 MKKILFADDDPEIREVVRILLARDGYEVVEAVNGIEAVEKADHTIDLIILDVMMPGCSGV EACGEIRKKSSAPILFLTAKSRDRDKEEGFAAGGDDYLVKPFSSVELKARVNAMLRRYCV YKGKKGTEREVISLPGLEIFADTEEVIKNGERVLLTDIEYRILLLLAENRKKTYTTQDIY EAVWNEPFLFSSNNTVMVHIRNLRKKLEDDPQNPRCIRNVWGKGYRID >gi|229783933|gb|GG667802.1| GENE 10 8952 - 9393 402 147 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_0040 NR:ns ## KEGG: Dhaf_0040 # Name: not_defined # Def: histidine kinase # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 141 15 151 436 66 29.0 5e-10 MASVAIAMAAAFCLYMGIHTAAYDWIDRTQSTQEAYARQLTRSEESLRQYIRDKDIVLSD 
LSPLDEWVDREAYVAMVIYRSGRVIYSSYPYDVENVTISEEEAAETEVMPVSDDSTYTFT FADGTASVLLMPFYEYRYYQWADRISV Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:59:20 2011 Seq name: gi|229783932|gb|GG667803.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld196, whole genome shotgun sequence Length of sequence - 10837 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 42 - 863 732 ## gi|266624705|ref|ZP_06117640.1| putative metalloprotease 2 1 Op 2 . + CDS 863 - 2047 1042 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) + Term 2184 - 2216 1.8 - Term 2018 - 2066 15.2 3 2 Tu 1 . - CDS 2096 - 5323 3544 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) - Prom 5564 - 5623 7.4 + Prom 5293 - 5352 7.3 4 3 Op 1 1/0.000 + CDS 5567 - 7285 1302 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 5 3 Op 2 . + CDS 8241 - 8666 424 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Prom 8725 - 8784 5.8 6 4 Op 1 . + CDS 8810 - 9889 1092 ## COG1840 ABC-type Fe3+ transport system, periplasmic component 7 4 Op 2 . 
+ CDS 9902 - 10835 849 ## Closa_2527 integral membrane sensor signal transduction histidine kinase Predicted protein(s) >gi|229783932|gb|GG667803.1| GENE 1 42 - 863 732 273 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624705|ref|ZP_06117640.1| ## NR: gi|266624705|ref|ZP_06117640.1| putative metalloprotease [Clostridium hathewayi DSM 13479] putative metalloprotease [Clostridium hathewayi DSM 13479] # 1 273 1 273 273 517 100.0 1e-145 MKVKTVTRYKNPGYPVKEEILADPQLLRRIPSRWEKNAYVMTALGLTVSLSACGASTEGA RGASDRVAPVFIHGGGSGAYGCVSVVSPYFLSEEEAFEIIRAEAESYGGLILTQEAAPEL KNVKLPVTHIYSEPEKTKEKTVKGSLQTDGWDPDKRIAIEFVSKSDVGNWKETEEGVYAT VEGYAIQDTAVRLQKSLEKAAPEEVVGIFYDPYDYGDLIEEYDKIQKKYESREGNYWAAM SEEQDDMARAAAEENLKAQVRDFIDWLKGQGVI >gi|229783932|gb|GG667803.1| GENE 2 863 - 2047 1042 394 aa, chain + ## HITS:1 COG:CAC2279 KEGG:ns NR:ns ## COG: CAC2279 COG0641 # Protein_GI_number: 15895547 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Clostridium acetobutylicum # 3 330 97 439 454 104 25.0 3e-22 MHMTLHLTNDCNLQCSYCYVCQKPEYMNRSTAEHAVRLAAASGEPSCGIIFFGGEPLLRR DTIRDTVAYCRAMEREKPVRFHFKMTTNGTLLDRDFLEFSRKENIFIALSHDGTRTAHDF FRRDHGGAGTYDRLEAVTDMLLASRPYAPVMMTVSPETAAEYAQGVKELWQKGFHYFICS LNYAGNWTKESVRELRRQYRELADFYYELTKREEKFYFSPFEVKIASHIQGKQYCHERCE LGRKQISVAPGGLLYPCTQFVEHREFSIGDVVGGIDEKKRTELFERNELEKEECKGCAVM GRCNCHCGCLNYQVTGSIDHVAPMLCQHERIVIPVVDKLAKRLFQERNGMFIQKHYNDVF PLISMVEDMVKPEKRGTQISDETILPKHKSTDND >gi|229783932|gb|GG667803.1| GENE 3 2096 - 5323 3544 1075 aa, chain - ## HITS:1 COG:AF1274 KEGG:ns NR:ns ## COG: AF1274 COG0458 # Protein_GI_number: 11498873 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Archaeoglobus fulgidus # 1 1070 1 1075 1076 1078 51.0 0 MSQRTDIHKVLIIGSGPIIIGQACEFDYSGTQACKALRKLGYEIVLVNSNPATIMTDPET ADVTYIEPLNVERLEQIIAKERPDALLPNLGGQSGLNLCSELNKAGILEKYNVKVIGVQV 
DAIERGEDRVEFKKAMNELGIEMARSEVAYSVDEALKIADQLGYPVVLRPAYTMGGAGGG LVYNRDELKTVCARGLQASLVGQVLVEESILGWEELELEIVRDSEGRMITVCFIENIDPL GVHTGDSFCSAPMLTISPECQKRLQEQAYKIVESVQVIGGTNVQFAHDPVSDRIIVIEIN PRTSRSSALASKATGFPIALVSAMLATGLTLKDIPCGKYGTLDKYVPDGDYVVIKFARWA FEKFKGVEDKLGTQMRAVGEVMSVGKTYKEAFQKAIRSLETGRYGLGHAKDFDTRTKEQL LHMLVTPSSERHFIMYEALRKGASVEEIHEITKVKNWFIEQMKELVEEEEALLSCRGGLP SDEMLTAAKKDGFSDKYLSQLLQIPEETIRNHRIAIDVEEAWEGIHVSGTEDSAYYYSTY NAPDRNPVSTEKPKIMILGGGPNRIGQGIEFDYCCVHASLALKKLGFETIIVNCNPETVS TDYDTSDKLYFEPLTLEDVLSIYKKEQPAGVIAQFGGQTPLNLAADLEKNGVKILGTPPS VIDLAEDRDLFRAMMEKLEIPMPESGMAVNVEEALAIAAKIGYPVMVRPSYVLGGRGMEV VYDDESMIGYMKAAVGVTPDRPILIDRFLNHAMECEADAISDGTHAFVPAVMEHIELAGV HSGDSACIIPSVQISPENVETIKEYTRKIAEEMHVVGLMNMQYAIENDRVYVLEANPRAS RTVPLVSKVCNIRMVPLATDIITAELTGRPSPVPSLKEQIIPNYGVKEAVFPFNMFQEVD PILGPEMRSTGEVLGLSHSYGEAFYKAQEATQNKLPLEGTVLISVNRKDKAEVVEVARSF HEDGFKIVATGNTYELIHDAGIPATRVNKLYEGRPNILDMITNGQIQLIVNSPVGKDSIH DDSYLRKAAIKGKIPYMTTIAAAKATASGIRYVKEHGRGEVHSLQSLHSEIRDKE >gi|229783932|gb|GG667803.1| GENE 4 5567 - 7285 1302 572 aa, chain + ## HITS:1 COG:CAC1085 KEGG:ns NR:ns ## COG: CAC1085 COG1501 # Protein_GI_number: 15894370 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Clostridium acetobutylicum # 1 570 1 566 769 719 59.0 0 MKFTEGYWCIKKEITPLYAVEYADSRQNGNELTVYAPGKHISDRGDCLNVGMLTIRLTSP MEDVIRVSVSHFDGTVIRGPFARIYGTPPHTVIEETDEEIRYRSGRTTAVIDKRPNSWGI RFLDGDRELTNTGFRNMAYMTNRETGKCYMAEQLAIDVDEYIYGLGERYTPFVKNGQSVE MWNEDGGTASEITYKNIPFYITNKGYGVLLENEGDAAYEIASEKVERVQFSVEGERLDYC VINGSSPKGTISKYTELTGRPALPPAWSFGLWLTTSFTTSYDEKTTSGFINGMADRDIPL HVFHFDCYWMEAFEWCNFTWDPKTFPDPEGMLKRCHDRDLKICVWINPYIAQKSPLFEEG RRLGYLIKKENGDVWQTDMWQAGMGLVDFTNPGAVAWYQAKLKVLLDMGADCFKTDFGER IPVKDIAYYDGSDPVKMHNYYTHLYNQTVFELLERERGTGDAVVFARSATVGGQQFPVHW GGDCSATYSSMAETIRSGLSLACAGFGFWSHDISGFENTASADIYKRWCQFGLLSSHSRL HGSSSYRVPWLFDEEACLVLRKFVKLKCALAS >gi|229783932|gb|GG667803.1| GENE 5 8241 - 8666 424 141 aa, chain + ## HITS:1 COG:BH1905 KEGG:ns 
NR:ns ## COG: BH1905 COG1501 # Protein_GI_number: 15614468 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Bacillus halodurans # 1 127 608 732 773 123 44.0 1e-28 MLGDSLLTAPIFKESGEVEYYLPEGIWVNLLTGKTVTGGGWQKETHNYFSLPLMVRPGSV IAVGSNCEKPDYNYAKGVRFLLYLPETGMKAETAVTDLTGKTVMTACAEREGNTIRLTVN GGNNDYTYEVLGEELLSVVCS >gi|229783932|gb|GG667803.1| GENE 6 8810 - 9889 1092 359 aa, chain + ## HITS:1 COG:AGl591 KEGG:ns NR:ns ## COG: AGl591 COG1840 # Protein_GI_number: 15890411 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 358 38 383 387 167 31.0 3e-41 MDNAIRSEKRIWRKQAVRAVKGMALLFLLSAVLYGCSKTGKGGAGTAAGQQAEIYGVSED KKLIVYTSHKEEVYGPIIREFEERTGIWVEVRDGGSTELLEAIAGEAGQQTCDIMFGGGV ESYDAYREYFEPYRCKLIDKLDDRYLSPDDRWTAFTELPIVFIYNNKLVSGETAPHSWKD FLDDEWRGKIAFAAPGKSGTSYTALATMILVLDMDERELMEDFVYALDGNVSAGSGEVVD EVSTGIRLVGITLEETAKKRIAKGADISMIYPEEGTSTVPDGCALVKGAPHEDNAKLFID FTVSDDVQNLIMDQFCRRTVRKDLQMKAAGEDMKIMDFDLKWASAHQDEILELWSELIR >gi|229783932|gb|GG667803.1| GENE 7 9902 - 10835 849 311 aa, chain + ## HITS:1 COG:no KEGG:Closa_2527 NR:ns ## KEGG: Closa_2527 # Name: not_defined # Def: integral membrane sensor signal transduction histidine kinase # Organism: C.saccharolyticum # Pathway: not_defined # 6 311 6 314 572 107 23.0 6e-22 MWNFIRKSFKRELLVSFVAVALLPLILSCIFLIQMFKVKLERDYQKKDLEQAAVMEEKLT TLFQTIDDVTLHLSKEPEIAASIRESGNRGRSAVYAKLYEETASLREMAQFDLYSEEGIC LYSTGAGMFHTQLPVYWGILKVAEAHPDEMTVRREKEYSGSSNILLRAARPVLGSEDSSV GYVLVSINGSNFEKILGGTYGSQDGICILNNFWEPVYSTGTAVREEIGTVLRRRLMAGEA VDEPFHNNSVYISEIGRTGLYSVFLRPKVFTADTTKSMYSVLIVMTAASLMLCAAVAAKM SNHLSRPIQSL Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:59:40 2011 Seq name: gi|229783931|gb|GG667804.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld197, whole genome shotgun sequence Length of sequence - 6490 bp Number of 
predicted genes - 7, with homology - 7 Number of transcription units - 2, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 217 177 ## gi|266624712|ref|ZP_06117647.1| thermonuclease 2 1 Op 2 . - CDS 246 - 1484 1096 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 1537 - 1596 4.3 - Term 1616 - 1665 14.0 3 2 Op 1 42/0.000 - CDS 1681 - 2079 380 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 4 2 Op 2 42/0.000 - CDS 2091 - 3482 1565 ## COG0055 F0F1-type ATP synthase, beta subunit 5 2 Op 3 42/0.000 - CDS 3500 - 4354 1176 ## COG0224 F0F1-type ATP synthase, gamma subunit 6 2 Op 4 41/0.000 - CDS 4385 - 5887 1811 ## COG0056 F0F1-type ATP synthase, alpha subunit 7 2 Op 5 . - CDS 5922 - 6470 682 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) Predicted protein(s) >gi|229783931|gb|GG667804.1| GENE 1 1 - 217 177 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624712|ref|ZP_06117647.1| ## NR: gi|266624712|ref|ZP_06117647.1| thermonuclease [Clostridium hathewayi DSM 13479] thermonuclease [Clostridium hathewayi DSM 13479] # 1 72 1 72 72 99 100.0 9e-20 MKYRKKTGRLWLLLAAVTFLAAGYSMTAFGEEAIKSVSVTVTSSVEAEADEGTVSATANS SKYRVVSCDFAN >gi|229783931|gb|GG667804.1| GENE 2 246 - 1484 1096 412 aa, chain - ## HITS:1 COG:CAC0309_2 KEGG:ns NR:ns ## COG: CAC0309_2 COG0791 # Protein_GI_number: 15893601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Clostridium acetobutylicum # 246 395 2 156 171 92 33.0 1e-18 MGHQRPVRCCRFLLFVTAAVCLGQMVSFVSAASPISEAEREKQEAEARYEAVNGEAEELE RKKGEAAALAGRLDKELTGLLTDMDLLETDMDDMTRQIEREEAEYQRAEEKRKRQYEILK KRIQYIYEEGEVTYLDILLKAKTIGDVINGNEYFQQLYEYDRKLITEYEKTKCDAGNRKE ALEERQSELYSLRQEYGVREKTLRTMMTEKRAEESDFDLRLKAARAEAGKQAEVIRRRTE DIRLLREEEERRAEEQKRISEAARNEAQWRQSGSQGAQGGPGKTQNGGPGIIRSTGGTEF 
GRSVADYALQFVGNPYVWGGTSLTSGADCSGFVQSVYRHFGVSIPRTSAEQAGFGREIAY EDMEPGDLVCYPGHVAMYIGGGRIVHARSAKAGIRVDDNPAYRTIVSIRRPW >gi|229783931|gb|GG667804.1| GENE 3 1681 - 2079 380 132 aa, chain - ## HITS:1 COG:BH3753 KEGG:ns NR:ns ## COG: BH3753 COG0355 # Protein_GI_number: 15616315 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Bacillus halodurans # 7 130 6 130 133 97 42.0 7e-21 MADLFKLHVITPERRFYDGEASMVELSTTEGDIGVYRNHIPLTAIVAPGVLKIHEEGGVK EAALMSGFIEILPERITIMAEVAEWPDEIDGNRAEEARIRAERRLKEESGEIDTMRAELA LRRALVRLSLRK >gi|229783931|gb|GG667804.1| GENE 4 2091 - 3482 1565 463 aa, chain - ## HITS:1 COG:CAC2865 KEGG:ns NR:ns ## COG: CAC2865 COG0055 # Protein_GI_number: 15896119 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Clostridium acetobutylicum # 4 463 3 461 466 674 74.0 0 MAEQNIGKITQIIGAVLDIKFSQGRLPSINDAIDITMKNGNRLVVEVSQHLGDDTVRCIA MGSTDGLVRGMEAVATGAPITIPVGEETLGRIFNVLGDPIDNKPKPEVKEYLPIHRPAPT FEEQSTETEILETGIKVVDLLCPYQKGGKIGLFGGAGVGKTVLIQELIRNIATEHGGYSV FTGVGERTREGNDLYYEMQESGVINKTTMVFGQMNEPPGARMRVGLTGLTMAEYFRDQAG KDVLLFIDNIFRFTQAGSEVSALLGRMPSAVGYQPTLQTEMGALQERITSTKNGSITSVQ AVYVPADDLTDPAPANTFAHLDATTVLSRAIVELGIYPAVDPLESTSRILDPRIVGEEHY NVARGVQEVLQKYKELQDIIAMLGMDELSEEDKLTVSRARKIQRFLSQPFFVAGQFTGLE GRYVPLADTIQGFKEILEGKHDDIPESYFLNAGSMDDVLARVK >gi|229783931|gb|GG667804.1| GENE 5 3500 - 4354 1176 284 aa, chain - ## HITS:1 COG:BH3755 KEGG:ns NR:ns ## COG: BH3755 COG0224 # Protein_GI_number: 15616317 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Bacillus halodurans # 1 283 1 284 285 204 40.0 1e-52 MASMRDIKHRKESIQSTEQITKAMKLVSTVKLQKSRTKAEESKPYYDMMYDTICSMLRKS GNIEHKYLKAGDSKKKAVIAITSNRGLAGGYNNNIVKLVTGAEGLTPENAVIYAIGRKGK DSFVRKGYEIAEDYSEVINEPIYRDVADVTAELLDAFGQNQIGEIYLAYTSFKNTVTHIP TLKKLLPVEAGDAGEKTDLVLMNYEPNEDQVLDSIIPKYMSSLIYGALLEAVASENGARM 
TAMDSATNNAEEMIEELGLAYNRARQGSITQELTEIIAGANAIS >gi|229783931|gb|GG667804.1| GENE 6 4385 - 5887 1811 500 aa, chain - ## HITS:1 COG:CAC2867 KEGG:ns NR:ns ## COG: CAC2867 COG0056 # Protein_GI_number: 15896121 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Clostridium acetobutylicum # 1 496 1 497 505 665 66.0 0 MNLRPEEISSVIKEQIERYSTKLEVSDVGTVIQVADGIARIHGLENAMQGELLEFPGEIY GMVLNLEEDNVGAVLLGAGDISEGDTVKTTGRVVEVPVGDAMTGRVVNALGQPVDGKGPV QTDKFRQIERVAHGVIDRKSVDTPLQTGIKAIDAMIPIGRGQRELIIGDRQTGKTAIAID TIINQRGQGVHCIYVAIGQKASTVATIVKTLEEYGAMDYTTIVVSTASDLAPLQYIAPYS GCAIGEEWMENGEDVLVVYDDLSKHAAAYRTLSLLLKRPPGREAYPGDVFYLHSRLLERA SKLSDALGGGSLTALPIIETQAGDVSAYIPTNVISITDGQIYLETEMFNSGFRPAINAGL SVSRVGGAAQIKAMKKIAAPIRVELAQYRELASFAQFGSELDKETTEQLAQGERIKEVLK QGQYQPMPVEYQVMIIFAATRKLLLDIPTHRILDFEKALFSFIDSKYPQIPASIRDTKQI TEETDALLTKAINECKAEAF >gi|229783931|gb|GG667804.1| GENE 7 5922 - 6470 682 182 aa, chain - ## HITS:1 COG:BH3757 KEGG:ns NR:ns ## COG: BH3757 COG0712 # Protein_GI_number: 15616319 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Bacillus halodurans # 5 176 6 177 183 92 34.0 5e-19 MAELVSKVYGDALFEESIEKQSVDTLFSEVKALLAIWRENPELAELLDNPKIVKEEKIGI MTNIFAGRVSDDLMGFLAVIVDKGRQKDIPAICTYFIDTVKEYKKIGVARVASAVELKSE QKAMIEQRLLDTTQYVEFEMEYSVDPSLIGGLVIRIGDRVVDSSIKTQINELHRSLSKLQ LS Prediction of potential genes in microbial genomes Time: Fri Jul 1 02:59:48 2011 Seq name: gi|229783930|gb|GG667805.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld198, whole genome shotgun sequence Length of sequence - 10662 bp Number of predicted genes - 13, with homology - 11 Number of transcription units - 7, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 221 - 604 118 ## gi|266624719|ref|ZP_06117654.1| putative glycerol-3-phosphate cytidyltransferase 2 1 Op 2 . 
- CDS 670 - 828 116 ## gi|288871455|ref|ZP_06410177.1| conserved hypothetical protein - Prom 848 - 907 4.3 3 2 Op 1 . - CDS 912 - 1109 133 ## 4 2 Op 2 . - CDS 1154 - 1279 101 ## gi|266624721|ref|ZP_06117656.1| hypothetical protein CLOSTHATH_06106 5 2 Op 3 . - CDS 1297 - 1899 130 ## Bcell_0463 helix-turn-helix domain protein - Prom 2099 - 2158 5.6 6 3 Op 1 . - CDS 2207 - 2545 225 ## Cbei_3841 hypothetical protein 7 3 Op 2 . - CDS 2567 - 2764 130 ## gi|288871457|ref|ZP_06117659.2| putative membrane protein - Prom 2912 - 2971 6.1 - Term 3091 - 3123 -0.4 8 4 Tu 1 . - CDS 3150 - 3602 136 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 9 5 Tu 1 . + CDS 4844 - 4966 93 ## - Term 5600 - 5663 13.4 10 6 Op 1 14/0.000 - CDS 5803 - 7158 958 ## COG1653 ABC-type sugar transport system, periplasmic component 11 6 Op 2 38/0.000 - CDS 7203 - 8048 464 ## COG0395 ABC-type sugar transport system, permease component 12 6 Op 3 . - CDS 8051 - 8932 261 ## COG1175 ABC-type sugar transport systems, permease components - Prom 9079 - 9138 19.7 13 7 Tu 1 . 
- CDS 10040 - 10651 466 ## COG1032 Fe-S oxidoreductase Predicted protein(s) >gi|229783930|gb|GG667805.1| GENE 1 221 - 604 118 127 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624719|ref|ZP_06117654.1| ## NR: gi|266624719|ref|ZP_06117654.1| putative glycerol-3-phosphate cytidyltransferase [Clostridium hathewayi DSM 13479] putative glycerol-3-phosphate cytidyltransferase [Clostridium hathewayi DSM 13479] # 1 127 1 127 127 233 100.0 3e-60 MGGLTLKRKDEKKGTFLIFVLFAVLLTVCSSKQKEATHNETMIIKPSEFSEETKDLILEK AKFEDWETMYKNVWSRSEAALYMQWRVTESEAAARECIRKTIEYQKNTLGPDYTGKGYGK QILQLLL >gi|229783930|gb|GG667805.1| GENE 2 670 - 828 116 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871455|ref|ZP_06410177.1| ## NR: gi|288871455|ref|ZP_06410177.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 52 1 52 52 75 100.0 9e-13 MQGLKRILFGIAIILIGGFFLISPDSSLGGWGELVCFISGIGLGISGLKCDE >gi|229783930|gb|GG667805.1| GENE 3 912 - 1109 133 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAIYIIVSIVLLLDIINPRILWYIDFWKHNGDKPEPSGTYILLNRIFAFIGLVVLWSVYF TRLGL >gi|229783930|gb|GG667805.1| GENE 4 1154 - 1279 101 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624721|ref|ZP_06117656.1| ## NR: gi|266624721|ref|ZP_06117656.1| hypothetical protein CLOSTHATH_06106 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06106 [Clostridium hathewayi DSM 13479] # 1 41 1 41 41 72 100.0 7e-12 MRTKGKTIVVNEKVDVEFLTMAIGRNGCDVVLDLANRMFFQ >gi|229783930|gb|GG667805.1| GENE 5 1297 - 1899 130 200 aa, chain - ## HITS:1 COG:no KEGG:Bcell_0463 NR:ns ## KEGG: Bcell_0463 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: B.cellulosilyticus # Pathway: not_defined # 1 195 1 202 207 119 36.0 7e-26 MSLGENIKTKRNELKLSQEYVAEQLGVSRQAVSKWETDQSQPKAKNLTELATVFHVGLSG LVEPQKYQAEQAAQENEWRENRNNSKMLCCRWLGYIFLITGYSGYAGYYNSGLSSYYWMA FFALGIILLVITSISYFKKARMKPLQIVLGLLLIGSIFVLPVILPLQHGINVLIGHIAAS SMIGLLNLKYWRITWNTHNS >gi|229783930|gb|GG667805.1| 
GENE 6 2207 - 2545 225 112 aa, chain - ## HITS:1 COG:no KEGG:Cbei_3841 NR:ns ## KEGG: Cbei_3841 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 16 109 17 110 112 80 42.0 1e-14 MNFVGFWACIVLVLPFGIIGVLFAIFKEKSAKFVSGFNTLSKEEQEMYDKAWIARDIRNS CFVWAAIMLIGAVSSYVLTPYAAIGAYIIWLILFFRDVHFNAHKAFEKYLLK >gi|229783930|gb|GG667805.1| GENE 7 2567 - 2764 130 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871457|ref|ZP_06117659.2| ## NR: gi|288871457|ref|ZP_06117659.2| putative membrane protein [Clostridium hathewayi DSM 13479] putative membrane protein [Clostridium hathewayi DSM 13479] # 1 65 54 118 118 126 98.0 7e-28 MISSAFVLYIQGRPDFETLIVYLVFTQIAIMALTIIPTEIKLNNVFTKEGYRNQKNESRF AEPKN >gi|229783930|gb|GG667805.1| GENE 8 3150 - 3602 136 150 aa, chain - ## HITS:1 COG:MA2686 KEGG:ns NR:ns ## COG: MA2686 COG1028 # Protein_GI_number: 20091506 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Methanosarcina acetivorans str.C2A # 23 129 65 176 239 74 37.0 6e-14 MDMDVIIFDIQAPPEGFQYYQVDIRKDDQILTALSQIPRVDILINNAGIYFEKYLEDTSN EELDQMVDVNIKGTYRMTGNALPKIMEAKGNIVIIASCLGLVPELTSPLYCATKAGLIML KKCLAQQYADVDEVHRPARRHCPYGSISGE >gi|229783930|gb|GG667805.1| GENE 9 4844 - 4966 93 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKYIQKVNYLFLFYIHSSNKRTYVEDDEDKYRYNPKKGNK >gi|229783930|gb|GG667805.1| GENE 10 5803 - 7158 958 451 aa, chain - ## HITS:1 COG:AGpA77 KEGG:ns NR:ns ## COG: AGpA77 COG1653 # Protein_GI_number: 16119286 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 30 388 7 357 419 103 25.0 1e-21 MKLKCPLINHCRTSIFKKAFRTGTSAAVLLAVSLSLLTGCGGKAPDDTVTADGKVQVRFW YSGGKTAAGVMEDIIEDFNLSQSEYTVIGVQQADYDETYTKLQAGIAGKNAADIALLDSD 
KTRNLAKKELLAPLDQYIEQDSSFNKEDLIPVFYRQCLGDDGILYAMPAYGTTQVMYYNR DLFEKAGIQPDTIKTWEDLAKASALIQQNTGANGWEPMWGEDNLIDMALSNGGAMISDDG RTVLINTEPWVETWEAARRWIHEDKTMVIHYGGQGWEYWYDTMDDVLQDKAGGYTGSSGD QADLDFSKVGAMEQPGFGDNPSAPVARVLQIVMVETGKDKTKDGAYQFMKYFADASQQAR WSMATGYIPVRETTLDIPEYQSYTAENPHALVPLSQAMHASPDPIDPTGGKIFDALAIAA DKVEIENVPAADALKEAAETAQAALDAINQK >gi|229783930|gb|GG667805.1| GENE 11 7203 - 8048 464 281 aa, chain - ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 9 280 12 281 282 174 35.0 2e-43 MKYIPRTLHIIRSCVVHFILGGACCFTAFPFLWMAVSSLKTKEEVMAVDIFFPSDPQYQN YSEILFHSPIPRYLANSLLVAVSVVIIQVICCALFTYAVTFMKAKGKNILFRLVLFTYMV PTAVTYIPCYMILSRLGLLDSYGGLILSNGVSVFGIFLLHQAFAGVPSETMEAARMDGAG HLMILTRIIVPMTRPSFVTFILLSFINTYNSYMWPSLITDSPDLSLVSQGLRRFLYEGGA YGTQWSLVMAAGTVTVVPLLILFAGTWRLIVQGITDFGCKG >gi|229783930|gb|GG667805.1| GENE 12 8051 - 8932 261 293 aa, chain - ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 11 269 8 266 292 185 38.0 9e-47 MNDSHCSKKWKSEFTFIAFVTPAFMFLILFWLLPIVASIGISFTDWDYMTAGFHVTGLAN YTSLIKDKAFWDALTHTLLFALETVIPVLVLGLLLAVLIWSAGKHTIYSLIIFSPWITPA VAVSIVWTWIFEPDSGLANQLLHLLGLPGSAWLHSSNTALASVALVTVWKNLGYVMLLFT GALSRIPSELHEAASADGAGSMQRFLHITLPLLRPAILVTGIIMTADSLRAYDQIQILTQ GGPAGSTRTLLYLYYQMGFEQFDMGRAAAAVVILMIIGLSLALLQIRLKSKEA >gi|229783930|gb|GG667805.1| GENE 13 10040 - 10651 466 203 aa, chain - ## HITS:1 COG:CAC2422 KEGG:ns NR:ns ## COG: CAC2422 COG1032 # Protein_GI_number: 15895688 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 8 203 40 231 290 91 30.0 9e-19 MYDDLPFQFRMSPLTDIEADLQEAQYQLHERSSRVKRVYLVGANPFVLQFKRLKEISELI 
HQYFTECETIGCFARVTDVALKTDEELQELHGLGYDGITIGAETGDEAALDFMDKGYKVQ DIAIQAERLDNAHITYNFFYLTGVSGSGRGVEGALESAKIFNMTRPKIIGSSMLTIYPES ELYGEVQKGRWQEESELEKLEEV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:00:29 2011 Seq name: gi|229783929|gb|GG667806.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld199, whole genome shotgun sequence Length of sequence - 10529 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 132 - 536 203 ## Tresu_1954 Methyltransferase type 11 2 1 Op 2 . + CDS 620 - 892 159 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Prom 897 - 956 2.1 3 2 Op 1 . + CDS 1003 - 1476 296 ## gi|266624733|ref|ZP_06117668.1| conserved hypothetical protein 4 2 Op 2 . + CDS 1522 - 1656 169 ## gi|225568565|ref|ZP_03777590.1| hypothetical protein CLOHYLEM_04642 + Prom 2499 - 2558 80.4 5 3 Op 1 . + CDS 2664 - 4091 1181 ## COG3497 Phage tail sheath protein FI + Prom 4134 - 4193 2.4 6 3 Op 2 . + CDS 4265 - 5842 1035 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 5906 - 5942 3.2 - Term 5941 - 5996 -0.9 7 4 Op 1 . - CDS 5999 - 6541 210 ## gi|266624737|ref|ZP_06117672.1| conserved hypothetical protein 8 4 Op 2 . - CDS 6378 - 7295 212 ## Elen_3094 regulatory protein GntR HTH - Prom 7509 - 7568 4.9 + Prom 7457 - 7516 5.7 9 5 Tu 1 . + CDS 7668 - 9344 1376 ## COG0840 Methyl-accepting chemotaxis protein 10 6 Tu 1 . 
- CDS 9517 - 10527 782 ## COG2200 FOG: EAL domain Predicted protein(s) >gi|229783929|gb|GG667806.1| GENE 1 132 - 536 203 134 aa, chain + ## HITS:1 COG:no KEGG:Tresu_1954 NR:ns ## KEGG: Tresu_1954 # Name: not_defined # Def: Methyltransferase type 11 # Organism: T.succinifaciens # Pathway: not_defined # 2 134 129 261 261 203 70.0 2e-51 MPALSRMLKPNGRILVLYMAWLPFEDKIAGASEELVLRYSPKWSGAGETIHPIQIPACYS EEFELVYHEEYPLNVPFTRESWNGRMKACRGIGASLADAEIAAWEQEHMELLNKIAPPEF SILHYGAIAELKKK >gi|229783929|gb|GG667806.1| GENE 2 620 - 892 159 90 aa, chain + ## HITS:1 COG:TM1295 KEGG:ns NR:ns ## COG: TM1295 COG0491 # Protein_GI_number: 15644050 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Thermotoga maritima # 31 84 27 80 218 62 50.0 2e-10 MNIKSWFSVEKIDSDTFAISEYKHWEETHCYLLCGTEKAVFIDTGLGVGNIREIVDSLTS LPVTVMTTHVHWDHIGGHRYFEKIERECLL >gi|229783929|gb|GG667806.1| GENE 3 1003 - 1476 296 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624733|ref|ZP_06117668.1| ## NR: gi|266624733|ref|ZP_06117668.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 157 42 198 198 303 99.0 3e-81 MAETTMMEQNSESGTIPESQTLPLIDVMYSIADSQGLDGAVDYESGMQDGQKDALVFLCR SESGKYEAYGFISTEYGKNGLLINNVINDQDNWNFFEESWSCGDTLPEFEENGDYEVIFT FTQGKEGEYKERSIHFDTYDTGTMSIEGESPSSTPDR >gi|229783929|gb|GG667806.1| GENE 4 1522 - 1656 169 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225568565|ref|ZP_03777590.1| ## NR: gi|225568565|ref|ZP_03777590.1| hypothetical protein CLOHYLEM_04642 [Clostridium hylemonae DSM 15053] hypothetical protein CLOHYLEM_04642 [Clostridium hylemonae DSM 15053] # 1 42 1 42 561 72 85.0 1e-11 MAEYLSPGVYVEEYDNSPRAVEGVGTSTAGFVGMAVKGPSCGAS >gi|229783929|gb|GG667806.1| GENE 5 2664 - 4091 1181 475 aa, chain + ## HITS:1 COG:mlr6548 KEGG:ns NR:ns ## COG: mlr6548 COG3497 # Protein_GI_number: 13475469 # Func_class: R General function prediction only # Function: Phage tail sheath 
protein FI # Organism: Mesorhizobium loti # 124 473 166 526 528 256 41.0 9e-68 MARVVPPDAKKAAGTYGILTAEAVDEGKWGNKIMISVNTSCRKKMQLVEKLGDRVYRAKS PAGFLEGVLVSFHGEVNRIAAVYGGEITMEEEFTVNPVDNNIVPVNVVTSLEIDLSVRYE DEVESYEGMTLNQTSPDYVEKRMGKSKLVKITCTPEDKWKNPVEMILGEGVSDGSVTLSG GSDGTLALVNDGTFIGEDSGPGKRTGIQAFLENDFVSIMAVTGITSQAVEVSLVSHCENL KSRMAVLDMPKQMTKTSELIEFRSIIDSTYAAMYHPWIQQFDRTVKCPGMFPPSGAVMGV YSRCDINRGVHKAPANETISCTGLSVNYTTAEQDILAPEGINLIRALPGQGIRIWGARTA GSDPSFRYINVRRLFIFVEQSIKASTNWVVFEPNDATLWARVQKTIADFLEHLWRNGMLA GASSSEAYFVEIGPSTMSQDDMMAGRLICNIGIAPSRPAEFIIFRITQHTAEAGE >gi|229783929|gb|GG667806.1| GENE 6 4265 - 5842 1035 525 aa, chain + ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 402 522 490 621 621 105 43.0 2e-22 MTAPVAGEKPVTAITDTEQYTGKVTWRPGFAQDGTFAANTNYTAEITLEPKKGYTMSGVP ADFFEVEGAERAENKIDSGIIEASFARTAVTANRKNITGVTAPVTGEKPVTAITETEQYT GSVVWSPAIAEGESFADNTKYTAVITLIPKKGYTVSGVPKDYFEVEGAESTENRINSGII TARFAKTDAAVNSKNIKGVTAPATGEKPVTSITETEQYTGSVVWSPAIAPGGSFAANTEY TATITLALKKGYTLTGVTKDYFEVEGAARTANAENSGIITAHFAGTDAIRVSEVTLDRTS LFMKVGETEKLSVIIDPLDATNPSVIWTTGDAGIAVVDSDGTVHAVGTGRTALTVTALDG GKTAVCTVFVRTSVEESGDDGQEADNTYPILNGQIPASEGHGLWKEQGSGAWKFLKDKDG YAQKEWLKIGEDWYLFGDNSIMITGWSLVNQTWYYSGGSGAMLTGWQFINGSWYYLKPDG SMAVGWLEADGKWYYLRPDGSCLINGVTPDGYRVDGNGVWDGNEK >gi|229783929|gb|GG667806.1| GENE 7 5999 - 6541 210 180 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624737|ref|ZP_06117672.1| ## NR: gi|266624737|ref|ZP_06117672.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 89 180 1 92 92 189 100.0 6e-47 MKKKVTVLRTSEELAPIRIEQNIIENNLTLFLNVLQILAVCTGKLSFAAFSLLDDSQLAD TASEWESSPLRQRSSGIIHIITAFLKAHMPQKCLENILEQFDDALIWGHYLDRPYILVEE CAALAQEGFEQFALSCQSLRKGDRESAAASVQRTFRAIFLDARLYTLICFKNPDKVPAEI >gi|229783929|gb|GG667806.1| GENE 8 6378 - 7295 212 
305 aa, chain - ## HITS:1 COG:no KEGG:Elen_3094 NR:ns ## KEGG: Elen_3094 # Name: not_defined # Def: regulatory protein GntR HTH # Organism: E.lenta # Pathway: not_defined # 2 256 54 318 486 76 26.0 1e-12 MLKEEGVICTSAGKRAIVCFGESEKGYIQSLMQRRESILDVYRGLELIMPPLYTEGAMHC GRLSVFADAEGSSDDQMINSQRTTAFFTELLLPYRNQTVLDLQTDMEHYARFPYVMQSQL EDPFTVSADFLKQNLSIMQGMVERKERESLIAKLEEMYRNAGWQAAVYLNDLQKIVPVPG EPVTYQWFRGKNRSPLYAAVAQSLYRRAQLGEFSHRTYFPSEPEIMHTYKISKSTAAKAM ALLSDIGLIHTIEKKGYCTADIGGTGAYPHRAKYHRKQSDTFFKCPADSGCLYRETFLCC VFPAG >gi|229783929|gb|GG667806.1| GENE 9 7668 - 9344 1376 558 aa, chain + ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 4 558 6 515 555 238 31.0 2e-62 MLKNIKIRTRMLLSYAVIILLGLSASVAALIMMDRISGNLTVFYDNNYAVTVKAWTAKRE MQYARADILKGILETDDDDMQTALDQASAALANMRAAFPVIRERFKGDMALMDEVDSILV KAIVYRDQVFDLVEAKKNEEAFALMKASYIPLLDQISDTLDKISGQAERNAKEKVDQGKQ LQRIFMLVVIGIIVLNIILAALLALHISNGIRRPVEEIRTAAEKLASGNLDVSIDYYAED ELGNLSDSIRSLIHIFQGIIGDMGYGLAALGSGDFTVDSKAEELYVGDFQKLKASMDEII EKLSLMMVKISQSSDQIFAGSSQVSSGAQAVAGGATEQASAVEELAGSINEISAQVSENA QNARHGSELAETAGIKMIESNREMQELISAMGDISDKSGKIEKVLNIIEDIAFQTKILAL NAAVEAAHAGKNGKGFAVVANEVRNLAQKSAEASKNTAALIAETIQAVNLGTKLADETAN MLAEVVESVKQAVFAVDKISEASSEQAVSIAKVTQSVNQISDVVQNNSATAEESAAASEE LSVQAHILSSLVSQFKLK >gi|229783929|gb|GG667806.1| GENE 10 9517 - 10527 782 336 aa, chain - ## HITS:1 COG:slr1588_2 KEGG:ns NR:ns ## COG: slr1588_2 COG2200 # Protein_GI_number: 16332199 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 92 334 27 270 283 137 29.0 2e-32 GFEFVVLINGIGQSALEDQVVNLKHYILLSTNCKGAVGSIWTDKVTEIDSKISEADELMY KDKMDFYRKNPTSNRYRFHNDTILQFSDKSKLEEEIADGRFEVYLQPKVDFNRHIIGGEA LIRYHDHQGKLIMPNDFINYLENARGIHYIDFFVFETICKLLEKWKKENLLLKPVSVNFS RYTLRIPEYIEMLNAIWERYDVDKSLLEIEIIENDENYDNDFLISIIDKIKKAGFAISID 
DFGSRYSNMALFINADLNTLKVDKRLMDDIDHNKRSQMLISSLVQICHSLHMRLIAEGVE NEKQFSILQELGCDGVQGFLISRPVQIEQYENLFLK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:01:02 2011 Seq name: gi|229783928|gb|GG667807.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld200, whole genome shotgun sequence Length of sequence - 6638 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 1, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 557 424 ## Closa_0572 NAD-dependent epimerase/dehydratase 2 1 Op 2 6/0.000 + CDS 618 - 1811 1125 ## COG0500 SAM-dependent methyltransferases + Term 1819 - 1883 -0.9 + Prom 1817 - 1876 2.8 3 1 Op 3 12/0.000 + CDS 1990 - 2997 878 ## COG0451 Nucleoside-diphosphate-sugar epimerases 4 1 Op 4 11/0.000 + CDS 2984 - 4012 957 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 5 1 Op 5 8/0.000 + CDS 4009 - 6201 2670 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 6 1 Op 6 . 
+ CDS 6203 - 6638 355 ## COG1216 Predicted glycosyltransferases Predicted protein(s) >gi|229783928|gb|GG667807.1| GENE 1 3 - 557 424 184 aa, chain + ## HITS:1 COG:no KEGG:Closa_0572 NR:ns ## KEGG: Closa_0572 # Name: not_defined # Def: NAD-dependent epimerase/dehydratase # Organism: C.saccharolyticum # Pathway: not_defined # 1 179 117 306 312 246 62.0 3e-64 GSQAEYGIHQDAMTEETTLNPVSEYGKAKVDFCRKAMELTDGFRYIHARIFSVYGPGDHP WSLVESCLKTFPAGGYLSLGECTQMWNFLYLDDLLEALTALWQSGAEGVFNVAGEPDETR PLKEYVRLMYEACGCHGSYSCGRRPQNAEGPANLIPDVKKLMEVTGWRQETPFQEGIRRL LTGP >gi|229783928|gb|GG667807.1| GENE 2 618 - 1811 1125 397 aa, chain + ## HITS:1 COG:SMb21062 KEGG:ns NR:ns ## COG: SMb21062 COG0500 # Protein_GI_number: 16264389 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Sinorhizobium meliloti # 1 374 16 386 399 95 23.0 2e-19 MRDKICIACGKALPKRPLIVLDGMPASAQDIPDAEGRDEDKGISLHLHQCDFCGLVQFDC EPVGYYKDVIRSGGFSTTMVNLRRRQYRALIDKYHLEGKKFIEAGCGQGEFLSVLTEFPV DVYGIEHKADLVELAKSRGLKVWRQFTEEENTVLENEEAGYHGPYDVFLSFNFLEHQVNP VRMLKCLYNNLTKDGLGLVTVPSLEYILQYDGYYELIRDHLAYYTFDTLRFVLAEAGFVV LEEEMVNRDTLSVVVRKGTAAGQAEKDSRISSAVTGFAASRETLKLEVDRILKQYDETGR TLAVWGASHQGFTLAATTGLKDRVKYMIDSAPFKQGRFAPASHIPIVPPDHFEKEPVDGI LIVAPGYTEEIAGIIKERFGDKVEIMTLRSRHIENLQ >gi|229783928|gb|GG667807.1| GENE 3 1990 - 2997 878 335 aa, chain + ## HITS:1 COG:MA4464 KEGG:ns NR:ns ## COG: MA4464 COG0451 # Protein_GI_number: 20093250 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Methanosarcina acetivorans str.C2A # 4 324 3 298 298 64 22.0 2e-10 MENIVITGATGFLGSHLARWFLDRGDRVYALVRPGSAKLQDLPVHENLIPVYGTMEEAAD CVKEIGHADAWFHFAWGGVNREEIDSPEVQAKNIRGSLACVEAAHRLGCRVFMDAGSRVE YGVVSAGAPGGGAAEGDAGLLAGEGRGVMTEEMECRPVNEYGKAKLEFFRQAAPICEMYG MNFCHLRFFSVYGYGDHPWSIISTLVRELPRGGRVALSACRHEWNFMYIDDAVEAVGCLY 
EQVKEKEPEKGVAVNIASRDTRVLKSFVEEIFEIAGRRGTLEFGTFVQAKEGALSIRPDV SRMEAMTGGFKERYTFRRGITEMIKREMEHEDEEN >gi|229783928|gb|GG667807.1| GENE 4 2984 - 4012 957 342 aa, chain + ## HITS:1 COG:lin2695 KEGG:ns NR:ns ## COG: lin2695 COG0463 # Protein_GI_number: 16801756 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Listeria innocua # 1 306 1 306 315 170 32.0 4e-42 MKKISVLIPCYNEEENVVPISEAVIKTLTKDLPEYDYELVFIDNDSQDRTRSLIRQLCNE NPKIKAIFNARNFGQFNSPYYGMLQITGDCVIEMVADFQDPVELIPQYVHEWEKGYKIVI GIKTSSKENRLMYWLRGCYYKMMKKLSDVEQIEQFTGSGLYDRDFIEVLRNLDDPTPFLR GIVAELGFKIKQIPYEQPKRRAGETKNHFYQLYDAAMLSITSYTKAGLRLATIFGSICSL ISMVVAVVYLVMKLMYWDRFPAGMAPVLIGMCFLGSVQIFFIGLMGEYILSINARVMKRP LVIEEERLNFAAADGADREDADLEKCGDAELFAASLEDGDNV >gi|229783928|gb|GG667807.1| GENE 5 4009 - 6201 2670 730 aa, chain + ## HITS:1 COG:alr4487_2 KEGG:ns NR:ns ## COG: alr4487_2 COG0463 # Protein_GI_number: 17231979 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 184 706 1 504 519 214 29.0 5e-55 MKYKIDVARIRENSIVLNGWVIGSSPEAKAEFKVQDENQKDVEFKYVPVRRDDVSQNYFK TVYEKDFGFDIRIPYVRDENEDRFLIIRCDGKTSKVKLNTEIIEKHNSSAHKKHAKLKSM MNKETFHSAIEFWRENGLKAFLLKSSHKLQGIDSDYDYPEWYQLTKTSDEELEAQKSVEF DYMPKLSVVIPAYKTPERYLAAMLDSLRAQTYKNWEVCVADGSPRGESAERVLKRYALKD ERIRYVVLGENKGIAGNTNAAIDMAKGDFIVLADHDDTLAPDALFECVKAINSDPEIDVV YTDEDKLDIDGGELFEPHFKPDFNPDLLTSVNYICHLFVVNRELLDEVGGFQEEFDGAQD YDFIFRCTEKARKIYHVPKALYHWRCHQNSTSSNPESKLYAFEAGARAIQAHYERMGIRA LSVEKGVDYGIYHTKFEITGNPLVSVIIPNKDHAADLDVCMRSLIEKGTYKNLEFVIVEN NSTEQATFDYYERIQKEFDFVYVVTWEREFNYSAINNFGVTAASGEYLLFLNNDTELINP ESIEEMLGFCQRDDVGIAGARLLYSDDTIQHAGVVVGFGGIAGHTFIGLHRAESSYFNRA MCAQDYSAVTAACMMSKKSLFETVGGFSEELAVAFNDIDYCMKIRSLGKLVVYAPYALFY HYESKSRGLEDTPEKVERFNREIKMFADKWPDILKNGDPYYNPNLTLRKSNFALRDLRRE KIGEPYRLEV >gi|229783928|gb|GG667807.1| GENE 6 6203 - 6638 355 145 aa, chain + ## HITS:1 COG:MTH172 KEGG:ns NR:ns ## COG: MTH172 COG1216 # Protein_GI_number: 15678200 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Methanothermobacter thermautotrophicus # 7 142 6 152 332 99 38.0 2e-21 MSRKVTVVIPNYNGLKFMEPCFAALAAQSEKNFDVLVVDNGSSDGSVEWLKEHEIPSVFL PDNIGFPGAVNIGIKKAQTPYVILLNNDTEPKPDYIREMIRMIERSPKIFSVSSRILQLY HKELMDDAGDMYSLPGWAFQRGVGQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:01:09 2011 Seq name: gi|229783927|gb|GG667808.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld201, whole genome shotgun sequence Length of sequence - 11396 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 4, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 14/0.000 - CDS 2 - 1394 1386 ## COG1653 ABC-type sugar transport system, periplasmic component 2 1 Op 2 7/0.000 - CDS 1439 - 2323 781 ## COG0395 ABC-type sugar transport system, permease component 3 1 Op 3 . 
- CDS 2337 - 3146 500 ## COG4209 ABC-type polysaccharide transport system, permease component - Term 3502 - 3561 21.5 4 2 Tu 1 . - CDS 3588 - 4739 702 ## COG2207 AraC-type DNA-binding domain-containing proteins 5 3 Op 1 . - CDS 5718 - 6614 575 ## gi|266624751|ref|ZP_06117686.1| hypothetical protein CLOSTHATH_06141 6 3 Op 2 . - CDS 6611 - 6766 126 ## gi|266624752|ref|ZP_06117687.1| putative phospholipase A2 CB2 7 3 Op 3 17/0.000 - CDS 6782 - 7435 798 ## COG0569 K+ transport systems, NAD-binding component 8 3 Op 4 . - CDS 7535 - 8890 1329 ## COG0168 Trk-type K+ transport systems, membrane components 9 3 Op 5 . - CDS 8892 - 9254 338 ## Bmur_0596 ABC transporter 10 4 Op 1 42/0.000 - CDS 10226 - 10720 692 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 11 4 Op 2 . - CDS 10720 - 11394 182 ## PROTEIN SUPPORTED gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 Predicted protein(s) >gi|229783927|gb|GG667808.1| GENE 1 2 - 1394 1386 464 aa, chain - ## HITS:1 COG:AGl3560 KEGG:ns NR:ns ## COG: AGl3560 COG1653 # Protein_GI_number: 15891902 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 8 400 12 401 522 87 23.0 4e-17 MKLRKVTALMLAAAMAVSVLSGCSFKTKMEETAADAGQPGASGKEPITIEFMLLNNQSGD GERFVLKKIVKDKFNIDLNLVLNTKDAYEEKLNLLIASNELPDLISPVSGEIGKDIGPKG ALVAIDEHFDKMPNFKKKIMEDKNVYASVVAADGHIYNMPRFSKQQQFKSVPLLRTDLVK AVGKEMPKNFKELIDVLEAVKEKYPDTIGMISRNKMNCLWSFGIHYNTNQSVFYDKNQDK WVYGPLNSGFKEMIGDFHEMWAKGLMDKEFFTSSTQQWEEKILSNKGVFTLDYATRALTE TEAYKKLHPEDTEFQFQPIMPLVTDSNPEPTLNIAEQIGIWTSFGVSSDSKHIDRILEMI DWMYSDEAATLVQWGVEGEDYNIVDNMKKYVPSLKASYNPEGTIDPETELGLNHNRIMRI ENADGFEPYVEGYSEMIDAYKKNVHLFENNYRINLTFTDDEMEE >gi|229783927|gb|GG667808.1| GENE 2 1439 - 2323 781 294 aa, chain - ## HITS:1 COG:BH0481 KEGG:ns NR:ns ## COG: BH0481 COG0395 # Protein_GI_number: 15613044 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar 
transport system, permease component # Organism: Bacillus halodurans # 1 294 1 293 293 233 43.0 5e-61 MVEDKSLGAKVFHVINTLVMALVVIITLYPLLYVVALSLSSAEYVQAGMVNLVPKGFNTE AYKQVFGDKFFWKSYGNTILYTVLGTAINLLLTTTFAFCLSRKELVFRKPLTVLIVFTMF FSGGIIPSFILIRSLNLYNTLWAVVLPGAINTYNMIIVRTFMQGLPEEIFESARIDGAHD IHLFTKIVIPLSKPALATIGLYYAVDHWNAYFYPMLYLKDKGKYPLQVLLKEMLVEQDFS SMSQSAAEVMGNMQPTTDMLMGASIVIALIPILCIYPFIQRYFVQGIMIGSLKG >gi|229783927|gb|GG667808.1| GENE 3 2337 - 3146 500 269 aa, chain - ## HITS:1 COG:AGl3564 KEGG:ns NR:ns ## COG: AGl3564 COG4209 # Protein_GI_number: 15891904 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 269 41 309 309 254 49.0 2e-67 MYGTIIAFKNFDFRSGILGSEWVGLKYFKQFVTGPYFWRLVKNTFLISFYDLLFSFPAPI VFALLLNEVKAKKYKKLVQTVSYLPHFISVVILVGLMKTMFSPNGGVVNNIIQAFGFESI NFFMEKGWFRSIYVGSSIWQSFGWGSIIYLATMAGIDTQLYEAAHMDGAGRFRCMWNITL PSIRPTIVIMLILRLGRLLSVGFEKIILMYNPATYEVSDVISSYVYRYGILQANYSFGAA VDLMNSLIALALILITNRISQKVSDTSLW >gi|229783927|gb|GG667808.1| GENE 4 3588 - 4739 702 383 aa, chain - ## HITS:1 COG:BH0483 KEGG:ns NR:ns ## COG: BH0483 COG2207 # Protein_GI_number: 15613046 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 175 383 553 769 769 101 29.0 2e-21 MGNGNEKENEFEKIKSGIITLKKHNSTYRENQVLQDMIRNGGSRAELEKLFPCPLALGIL VVGKEQELISVLKNSLEQALVPENGSGVRLISGGRDQVYVVINVQKLEIELVKEAVLNFF AGNPSSLLIGMGTIEQKGDLKLSCDHAEAAFLEARISDPGPIYSWDIGGRIQEIYMPIDL ESRMLSFIYAANREKVKELVDEVFDRNRDISRSQLKKLIMDFANIYIKLAQKVGCGADIK KAEEVLEREYQFDKLKDCIYWLYVGLVYEEPVDKRADYVGKYIKDYVQHHYMDSTVSIET IASMLNLVPTYVSTLFKKEAKVSFSQYLSDYRIEQAKYLLEHTDKKVKDIASETGFGTYN NFTRVFKKKLGVTPIEYKNHKQS >gi|229783927|gb|GG667808.1| GENE 5 5718 - 6614 575 298 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624751|ref|ZP_06117686.1| ## NR: gi|266624751|ref|ZP_06117686.1| hypothetical protein CLOSTHATH_06141 [Clostridium hathewayi DSM 13479] 
hypothetical protein CLOSTHATH_06141 [Clostridium hathewayi DSM 13479] # 1 296 1 296 296 518 100.0 1e-145 MNNLKRGLKGISVLKKLLISYVLILLLLIASNLLIFNRASEVLVEKEITYNKMSLNYTRD ILDQNIKEIYNSCVALLKNNDTMRRIAKMDKVWNQTILIDRLMEFTQKNSLVEHAVLFNP TLDYVLYEDGGMNAEKAFGSYFQYQNLHGEKLPDKLNQASIGYVSGTGMYLSTKPAGLSE SYVTIYFSPQYINGNNKLILMLPESKLRDLFKSAYSQNKFMIVSQNGEVVVDARNQSRPE GLLESLTESIKLGSEAGQFDYGSYLVIHEKSEVFDWEYTVMVNYDYILKDINAIRNIS >gi|229783927|gb|GG667808.1| GENE 6 6611 - 6766 126 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624752|ref|ZP_06117687.1| ## NR: gi|266624752|ref|ZP_06117687.1| putative phospholipase A2 CB2 [Clostridium hathewayi DSM 13479] putative phospholipase A2 CB2 [Clostridium hathewayi DSM 13479] # 1 51 1 51 51 90 100.0 3e-17 MVSYNDVLPVIYCRKEFYMIFDVYCVLWGGKRGVCDCEKTIECMKCSGDRE >gi|229783927|gb|GG667808.1| GENE 7 6782 - 7435 798 217 aa, chain - ## HITS:1 COG:lin1022 KEGG:ns NR:ns ## COG: lin1022 COG0569 # Protein_GI_number: 16800091 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Listeria innocua # 6 217 7 219 219 153 37.0 3e-37 MKSILIIGMGRFGHHLCQNLAALGNDVMVVDQKEEVLEDLLPIVTSAKIGDCTNEAVLRS LGIANFDICFVCIGTNFQSSLEITSLVKELGGRRVISKANRDIHAKFLLRNGADEVIYPD RDIAEKMAMRYSANHVFDYIELTQEYSIYEIPPLPEWIHKTIREADIRNRYHISVLGIKR EGKAQLMPPADYVIQAQEHLMVIGKKVDIDNILEKMK >gi|229783927|gb|GG667808.1| GENE 8 7535 - 8890 1329 451 aa, chain - ## HITS:1 COG:BS_yubG KEGG:ns NR:ns ## COG: BS_yubG COG0168 # Protein_GI_number: 16080162 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus subtilis # 17 437 18 432 445 265 40.0 2e-70 MNEKKLIHARRHVSQTQFIAYGFFCVIITGTLLLMLPFASRDGQSEPFLNCLFTATSASC VTGLVVADTWSQWSLFGQLVILTMIQIGGLGFITVGVFISIILRRKIGLKERGLMMESVN TLQIGGVVRLAKKIIIGTCIFEGTGAVLLAIRFIPQFGFLRGLFYGIFHSISAFCNAGFD LMGGQAPYSSFVAYYDDWLVNLVIMSLIVIGGIGFIVWDDLSRNKLHFRKYMLQTKIVLV TTAILVFGGGLLFYLMERNNLLVGMNTSGKILTSLFSSVTARTAGFNTTDTAALTDGSKL 
LTIILMFVGGSPGSTAGGIKTTTLVVLLLCVHSNIKQTYGINIFGRRLENDAVKRAGTIL TINLLLALTASLAIMAIQPLGFSDILFETFSAIGTVGMTTGITRALHPVSRLIIILLMYC GRIGSLSFALAFVQSKRKPHVQQPAEAINIG >gi|229783927|gb|GG667808.1| GENE 9 8892 - 9254 338 120 aa, chain - ## HITS:1 COG:no KEGG:Bmur_0596 NR:ns ## KEGG: Bmur_0596 # Name: not_defined # Def: ABC transporter # Organism: B.murdochii # Pathway: ABC transporters [PATH:brm02010] # 1 92 179 270 272 99 64.0 4e-20 MLIAILTAITIVLGMRMMGAMLISSLIIFPALTSMRIFKSFRGVVLSSGVLGVVCFCIGM VMSYRFSTPAGASVVVVNLAAFLLFTVVQWVLHLRTGGKKEELKSPAGVSKNTEAERQQQ >gi|229783927|gb|GG667808.1| GENE 10 10226 - 10720 692 164 aa, chain - ## HITS:1 COG:MA0025 KEGG:ns NR:ns ## COG: MA0025 COG1108 # Protein_GI_number: 20088924 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 6 163 3 160 274 79 29.0 2e-15 MEMVTEMLSYPFLVRALVGGILVSLCASLLGVSLVLKRYSMIGDGLSHVSFGALSIAVAA GWSPLKISIPVVVLAAFFLLRITEHGKIKSDAAIAMISAAALAIGIIVTSMTTGMTTDVS SYMFGSILAMSKTDVRLSVVLSIVVLGLFLICYNKIFAVTFDDS >gi|229783927|gb|GG667808.1| GENE 11 10720 - 11394 182 224 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 [Haemophilus influenzae PittAA] # 9 197 17 205 205 74 29 3e-13 FGYENQDAVVDVTMEVNPGDYICIVGENGSGKSTLMKGLLGLLKPTKGTISVSDELKKSG IGYLPQQTAAQKDFPATVGEVVLSGCLSRRGNRPFYSGKEKKLAEKNMERLGIAELGSQC YRDLSGGQQQRTLIARALCATDRLLILDEPITGLDPSAILEFYEIIRKLNRKEGVAILMV SHDIANVVKQAGKILHLKREVLFYGTTKEYLESSAGHLYLGGEL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:01:33 2011 Seq name: gi|229783926|gb|GG667809.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld202, whole genome shotgun sequence Length of sequence - 10258 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 3 - 1293 971 ## COG4753 Response regulator 
containing CheY-like receiver domain and AraC-type DNA-binding domain 2 1 Op 2 . - CDS 1290 - 3257 1588 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 3 1 Op 3 . - CDS 3254 - 3376 69 ## - Prom 3509 - 3568 5.8 + Prom 3468 - 3527 6.1 4 2 Op 1 . + CDS 3566 - 4858 1253 ## COG1653 ABC-type sugar transport system, periplasmic component 5 2 Op 2 . + CDS 4873 - 5616 989 ## Celal_1465 hypothetical protein 6 2 Op 3 . + CDS 5641 - 7005 1607 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 7033 - 7071 7.0 + Prom 7041 - 7100 4.7 7 3 Tu 1 . + CDS 7127 - 7384 449 ## Thit_1739 glycosyl hydrolase family 88 8 4 Op 1 . + CDS 8315 - 8992 707 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 9 4 Op 2 38/0.000 + CDS 9006 - 9920 891 ## COG1175 ABC-type sugar transport systems, permease components 10 4 Op 3 . + CDS 9936 - 10257 346 ## COG0395 ABC-type sugar transport system, permease component Predicted protein(s) >gi|229783926|gb|GG667809.1| GENE 1 3 - 1293 971 430 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 2 125 3 127 525 93 37.0 6e-19 MKVLIVDDEAHVIRAVKLLVPWQELGITEIHEALTPQEAIRIMEAEQPEILITDIVMQDL SGIDLMKYITGTMQHIKVIVISGYNNFDYVRSSLQHGGVDYLLKPLDQDQLIAAVKKAVL AWNQEHNLYHTALVHKDQISSMTALCRENLVSRLIHGDHPEHAYNKLTELSPELSGLTDC GIAYFNAEPFVLPGDDSWKEPFHRYRDSISAFLTRWNAGFLLPADRVHEITAFLTVWDPS LMEKLKEFLTDRQSTLPFPACMGIASGRFPAGIIEVYQEAKTAFGAFDMAVFSPVSVKTD SLTQVMQIRMTDSQKELDNRMLLSAFLTGNDELIADSLSLWIKNRTANYNPYLAVIWNAV QDENQLIDSWADLLKQRHKGFVHASGYQVMRFCDTMDESMHLSFPRFLKRMQADISFLYQ ELKSVHAPEA >gi|229783926|gb|GG667809.1| GENE 2 1290 - 3257 1588 655 aa, chain - ## HITS:1 COG:BH1122 KEGG:ns NR:ns ## COG: BH1122 COG2972 # Protein_GI_number: 
15613685 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 9 540 9 516 586 119 24.0 2e-26 MNSQRFVSIRIKLLSLLFICILVPFLVYWFFSYEYIKTSLDEEFVSQTTDTMHATGNSIS DYVKLVDYTARGLYFNSEVMDILSGREQSVSSYRLLQTESRIFDFLQTLYSSVPDASQIQ LSAFNLKKSLLLQSDMQRYEKDHIYLSQERRFPCEPYHSYVTSAHMQSNYNFQELSGREF SLVFSLHMPIYQIPSVTDSLGEITIDIPVSVLDSICSPLFKEKEYLCLVDGNGSYVYSSK PEEIATQTRNPVIQDILSASLEPGDPVVLDADSEDIVICSRISEAPVDWYLIKISPKAFV YEKASRFFHMMLYSFILVTLAEILLIVITVIRFTAPIRKVTAYAEAVSAGDLDANMSSYI TYTKNDEIGSLLITIRKMMYSIKNFIIHQYQLELANRTSELKALQAQINPHFIHNTLQCL ATNALEGGNLPLYQSITALGQMMHYSMDTAHNLVPLNDALHYIELYLKLQKLRFPNSLET EFDISDEAGTMLIPKMTLQPLVENSIRHGNLLKLECSRLSIRAYAEGNYLHLFVIDTGTG IAAEKLEELTRSLHQVKAAAASSDAVTFVASLNRFSKEQPSRSSAVLKSEQQLEADKENR YVSNNIGMQNVYQRLLLNYKNQCSMEILSDGKTGTTIHIQADYRMLEYTEERSRS >gi|229783926|gb|GG667809.1| GENE 3 3254 - 3376 69 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTSPFGRKECTPKHLMRSFTNLVFMVEWCQITVGLGEISV >gi|229783926|gb|GG667809.1| GENE 4 3566 - 4858 1253 430 aa, chain + ## HITS:1 COG:BH2226 KEGG:ns NR:ns ## COG: BH2226 COG1653 # Protein_GI_number: 15614789 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 3 362 2 362 424 62 23.0 2e-09 MKRRIKTGLITAGIILGLTGCGSQNAERDAAELPAEPGIVLTVSMPSSEWGGYGKELEKL YQKDHPEIERIEWNLVDRSMYSDLLRVSLASQKLPDIICVGFEGSMEEWKDHLLPLEDFT SLEKLPMPYRTMGSLEGTVYSVPIQIQGIGIAYNRKLLKAAGWDRFPVTKSEFEQLCTDL EDQEIKPMMNHYKETLLTMANNLFMLPAMAADNPAAYMESLLEEERQQEYEENWQALADY FDLTLRFGNRDSLTTGAPTARDYFFIEKYAMLNDEGSWLAPVLAKSKPVLEKDISIGAIP LYEEAERNKLIVEVQALSVVKSSRHPEEAGEFLYWLATSEEASRYLKETMGCLPVMDMND ESLEGLSPLAAEVKKSINEGETALDVVNCLPGELKNKSAELWGFYLTGDLSREEVLQDYQ SLWKDYMKHN >gi|229783926|gb|GG667809.1| GENE 5 4873 - 5616 989 247 aa, chain + ## HITS:1 COG:no KEGG:Celal_1465 NR:ns ## KEGG: Celal_1465 # Name: not_defined # Def: hypothetical protein # Organism: C.algicola # Pathway: 
not_defined # 26 236 45 259 278 77 23.0 6e-13 MSVIHYQKKEDMKFEAAGNGILRAEMLPGMVEGVHTYKCRVAAGTVVTLETFGDQTQIYY FTKGEGYVATPKKAFNIEEPSLFIPNMDKEENLLHAVTDMEFLQLTCVMLPEDYEQFDHW HIALPRFCPLSGCIEYIEGFRPEGIKALSILDTHFLTRMTMGAIIGKGPNKADPHSHDNL YQWYYGMPGTKFIYRAADEEIELHEGGWAFIPTDIEHAIEIEPDETVNYVWFEIELTKAE LAAKKAQ >gi|229783926|gb|GG667809.1| GENE 6 5641 - 7005 1607 454 aa, chain + ## HITS:1 COG:YPO1719 KEGG:ns NR:ns ## COG: YPO1719 COG1653 # Protein_GI_number: 16121979 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 52 376 24 351 430 231 36.0 2e-60 MMKRITAAVLAAAMAASLTACGGNGAKETTQAAAQTSQEAATVAGGASTTDGKEPVTLRF SWWGGETRHKATMAAVEAFEQEYPWVTVECEYSSWDGWTDKVATQLAGGTAPDLMQINWN WLYQFSSDGSKFADLNQFKDIISLDNWSETNLAAMTVGGKLQGVPISITGKLPMYNKTTF DKAGVAIPTSFEELRAAGKAFQEKLGDDYYPLAEEGYERMIWMVMYLQEKYGKEWIKDLS VNYSVEEVTEGMEWINSLEEDHVFPKIATIAGDGAENFLSNKKWMDGHYAGLTEYDSNVQ EIADSLDEGQELVLGDYPTDLGPNPAGMSKVSQGFAITETSEHKEEAALLLEYITSNENG VKLMGTERGTVCNTAAKKILEDADILGGQTLEGNQKVLAFCHYTFDPNFENSALKERTGA YYEVFDNLSAGADPAEMAQYLIDSINDVNAANPY >gi|229783926|gb|GG667809.1| GENE 7 7127 - 7384 449 85 aa, chain + ## HITS:1 COG:no KEGG:Thit_1739 NR:ns ## KEGG: Thit_1739 # Name: not_defined # Def: glycosyl hydrolase family 88 # Organism: T.italicus # Pathway: not_defined # 6 83 11 77 372 63 42.0 3e-09 MNLTPFFKEYMKNYEGLDDYLVEIDGQSMYEWNYEDGCLFVGASRLYKLTGDREYLDFIV KMVSPYIEEDGTIKSYHREEYNLAS >gi|229783926|gb|GG667809.1| GENE 8 8315 - 8992 707 225 aa, chain + ## HITS:1 COG:CAC0359 KEGG:ns NR:ns ## COG: CAC0359 COG4225 # Protein_GI_number: 15893650 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Clostridium acetobutylicum # 2 220 141 357 361 179 43.0 3e-45 MEYENRFNERKNYSDIMNQFNQARTLLYDEKTGLYYHAYDEYKDRKWADPETGLSPNFWL RSIGWYLMALVDCYELASEEIYEVKARFGELLREAVHGILKYQDEKSKLFFQLPALPDQE 
GNYLETSGSLMVAYAVLKGCRLGALLEEKYAERGKEIYDAVLEQNIYQAEDGAYHLSGTC AVAGLGPRTERDGSIAYYLSEPVVDDDVKAVGVLMMVTAELMKMD >gi|229783926|gb|GG667809.1| GENE 9 9006 - 9920 891 304 aa, chain + ## HITS:1 COG:AGl3351 KEGG:ns NR:ns ## COG: AGl3351 COG1175 # Protein_GI_number: 15891796 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 16 304 4 292 293 276 54.0 3e-74 MEAQPAKTRNWKNFYRKYRGFFYILPWIIGFMIFKFYPFIMSLFYSFTNYSLFKPGTEFI GLDNYLSIFHTKKYIKAFKVTFQYAFLTVPLKLAFSLFIAYILNFKLKFVKMFRTIYYIP SILGGSVAIAVVWKFIFQGDGLINQFIQMFGGEKINFLGSTHYALFVICLLRVWQFGSAM VIFLAALKGVPQELYEAASIDGAGKWRQFFSVTVPLITPVIFYNLITQLCEAFQEFNSAF IITKGGPLGSTTLISLLIYQNAFRTYEMGLASAMAWLLFIIICTLTVASFASQKYWVYYS DDER >gi|229783926|gb|GG667809.1| GENE 10 9936 - 10257 346 107 aa, chain + ## HITS:1 COG:YPO1721 KEGG:ns NR:ns ## COG: YPO1721 COG0395 # Protein_GI_number: 16121981 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Yersinia pestis # 6 106 30 130 306 98 49.0 2e-21 MSKKTREQVSTVLRYTVLILVGFVMFYPMLWMVGASFKASNNEIYSTISFLPKSPSFQAY IDGWTATGNPYSLYLINTFKIVIPKVIGAVVASVVTAYGFSRFEFPG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:01:48 2011 Seq name: gi|229783925|gb|GG667810.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld203, whole genome shotgun sequence Length of sequence - 14367 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 10, operones - 5 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 493 450 ## COG0174 Glutamine synthetase - Prom 534 - 593 4.6 - Term 685 - 738 9.0 2 1 Op 2 . - CDS 765 - 869 109 ## - Prom 973 - 1032 80.4 3 2 Tu 1 . 
- CDS 1877 - 2377 508 ## Emin_0754 aminoglycoside phosphotransferase - Prom 2414 - 2473 4.1 4 3 Op 1 2/0.000 - CDS 2481 - 3140 595 ## COG5658 Predicted integral membrane protein 5 3 Op 2 . - CDS 3127 - 3408 284 ## COG0640 Predicted transcriptional regulators - Prom 3477 - 3536 6.1 - Term 3511 - 3538 -0.8 6 4 Tu 1 . - CDS 3539 - 4828 1256 ## Clole_0321 hypothetical protein - Prom 4874 - 4933 8.3 - Term 4889 - 4928 1.0 7 5 Op 1 19/0.000 - CDS 4936 - 5367 444 ## COG0822 NifU homolog involved in Fe-S cluster formation 8 5 Op 2 1/0.000 - CDS 5357 - 6301 748 ## COG0520 Selenocysteine lyase - Prom 6335 - 6394 80.4 9 6 Op 1 24/0.000 - CDS 7238 - 7504 254 ## COG0520 Selenocysteine lyase 10 6 Op 2 12/0.000 - CDS 7479 - 8558 964 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component - Prom 8687 - 8746 13.1 11 7 Op 1 . - CDS 9648 - 10394 689 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 12 7 Op 2 . - CDS 10425 - 10523 94 ## - Prom 10555 - 10614 18.9 13 8 Tu 1 . - CDS 11516 - 12112 163 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Prom 12194 - 12253 4.6 + Prom 12230 - 12289 8.5 14 9 Tu 1 . + CDS 12323 - 12742 460 ## COG1959 Predicted transcriptional regulator + Prom 12793 - 12852 9.1 15 10 Tu 1 . 
+ CDS 12883 - 14365 1278 ## COG0840 Methyl-accepting chemotaxis protein Predicted protein(s) >gi|229783925|gb|GG667810.1| GENE 1 1 - 493 450 164 aa, chain - ## HITS:1 COG:RSc1258 KEGG:ns NR:ns ## COG: RSc1258 COG0174 # Protein_GI_number: 17545977 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Ralstonia solanacearum # 1 164 1 165 471 130 38.0 9e-31 MSKSTEVLRICKEKGIRMIDFKMTDLNGRWRHITIPVERFDEEIFTQGIGFDGSNYGYAP VEKSDMVFIPDPGTAVVDPFTEIPTLTMCGDVCVIGKENRPFDQYPRNVSLRAIQYMKEQ GIADQMLIGPEFEFHLFDHVSYMVEPQRMAFTVDTRQAAWNSGR >gi|229783925|gb|GG667810.1| GENE 2 765 - 869 109 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQQWLSIAAASRKTKGIPEEEKLLNHWLNVLEYE >gi|229783925|gb|GG667810.1| GENE 3 1877 - 2377 508 166 aa, chain - ## HITS:1 COG:no KEGG:Emin_0754 NR:ns ## KEGG: Emin_0754 # Name: not_defined # Def: aminoglycoside phosphotransferase # Organism: E.minutum # Pathway: not_defined # 1 166 1 164 249 181 53.0 1e-44 MEQREILAKRSNKVVYREGNLVYKVFDRSFPKADILNEALNQARIEETGLPIPGIVSVNV TEQGEWTIVQEYVSGKTMAQLMKEEPENRNQYLEQFVDLQLLVHSKRSPLLNKLREKLDR QISTCEALSPNTRFIIRGRLDSMPKHACVLHGDFNPGNIIVSEDGS >gi|229783925|gb|GG667810.1| GENE 4 2481 - 3140 595 219 aa, chain - ## HITS:1 COG:MA3135 KEGG:ns NR:ns ## COG: MA3135 COG5658 # Protein_GI_number: 20091953 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Methanosarcina acetivorans str.C2A # 17 214 17 216 227 122 38.0 4e-28 MKKTDRKLTSRDLLYWGLALVPFIISIVFYSRLPEQVATHWGSDNQVNGYSSRNMAAFGI PAFMLLMAVIVNVIPVIDPKRENIRRSKELMTIVRWFIVLLAVMVQLVIVLSAVGISINV VSVVSVPIALLFVVIGNYLPKCRQNYTLGIKLPWTLADEENWTRTHRLAGYVWMIGGILM MILGFFHMEPLYFTVFLSMILIPGVYSYLIYRKKAGHKL >gi|229783925|gb|GG667810.1| GENE 5 3127 - 3408 284 93 aa, chain - ## HITS:1 COG:BS_yvbA KEGG:ns NR:ns ## COG: BS_yvbA COG0640 # Protein_GI_number: 16080432 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 5 81 3 79 90 103 61.0 9e-23 
MPFGDTFKALSDPTRREILTLLKPGSMTAGQIVERFDTSGATISHHLNILKQAGLIEDRK SGKYIYYELNTTVFQEVLSWLQTLMEEQHHEKD >gi|229783925|gb|GG667810.1| GENE 6 3539 - 4828 1256 429 aa, chain - ## HITS:1 COG:no KEGG:Clole_0321 NR:ns ## KEGG: Clole_0321 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 56 412 2 348 356 127 27.0 8e-28 MKLTEQLQEWNENDEYQKIIDAIEGMPVEERTPELISELARAYNNIAGPALPYFEQSFRE KTAEGWSVFEKGEARLRALMDQKDQGAVREELIETCSGLLSAVFSDVSFELGFNGKKYEL ILTPEGKRTKLFELVYFQSHAPSSVLEHWNILVGRQPSRGYSLGAFGEMVSGQDVQVWIE KTEEQSQSKHISLELYCEKLLPLFKTDESKVWWLLSTLLDQILGELAAMELVEDFQVLDT PKDAVPVLLDELPEILKKMDIECSSDPEHFLENSFIGYEMVPDQNPEADWRLDVFAGSTR CAAVINEYLRGEGDIMDEFHRDGVTAGFLCYPLDSFAHEEERGRVVLDFRDELIAYVLEN AGDDAMTFLGGATGIYCGYLDFIAWDLNAVLASAAAFFENSPVEWAAFHSFRRNVNAVRL MDKRPAEAE >gi|229783925|gb|GG667810.1| GENE 7 4936 - 5367 444 143 aa, chain - ## HITS:1 COG:CAC3292 KEGG:ns NR:ns ## COG: CAC3292 COG0822 # Protein_GI_number: 15896537 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Clostridium acetobutylicum # 1 142 1 143 143 144 51.0 7e-35 MENRSFYNEVLTEHNLHPAHKHDLPDANLVLEGVNPSCGDDILLKLKVEDGIIREGSFVG DGCAVSQASADMMLDLVIGKKKEEAVKLAEKFFQMIRGTAEEADLEVLEEAAVLQDISHM PSRVKCAVLGWRTLDEMLAGEKS >gi|229783925|gb|GG667810.1| GENE 8 5357 - 6301 748 314 aa, chain - ## HITS:1 COG:CAC3291 KEGG:ns NR:ns ## COG: CAC3291 COG0520 # Protein_GI_number: 15896536 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Clostridium acetobutylicum # 1 313 95 407 408 360 52.0 2e-99 MNLIAYSYGLSKLKAGDEILVSIMEHHSNLLPWQMAAGKTGAVLRFLECEPDGRISQEKL EAAFTDRTRLVAVAHVSNVLGCENPIRRIAAMARERGAVVVVDAAQSAPHIPIDVQELGV DFLAFSGHKLMGPMGIGVLYGRRELLEEMPPFLTGGEMIDSVTRTGAVFAPIPHKFEAGT VNAAGAWGLKEAVCYLKSVGFDEVRRRELELTAAALEGLKQIPHVHVLGSEKPDEHCGIV TFTVDGVHPHDVSAILDADGIAVRAGHHCAQPLLEYLKVPSATRASIYFYNTREEIEAFL KSAAGIRRKMGYGE >gi|229783925|gb|GG667810.1| GENE 9 7238 - 7504 254 88 aa, chain - ## HITS:1 
COG:mlr0021 KEGG:ns NR:ns ## COG: mlr0021 COG0520 # Protein_GI_number: 13470346 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Mesorhizobium loti # 10 88 18 100 413 83 51.0 8e-17 MENRNPYHEDFPLLANNPVIYLDSAATAQKPGCVLRAERDFYEQCYANPMRGFYELSMEA TKWLEEAREEVRRFINAPSADEIIFTSS >gi|229783925|gb|GG667810.1| GENE 10 7479 - 8558 964 359 aa, chain - ## HITS:1 COG:CAC3290 KEGG:ns NR:ns ## COG: CAC3290 COG0719 # Protein_GI_number: 15896535 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Clostridium acetobutylicum # 9 352 8 365 366 150 27.0 5e-36 MGQGLDLTINRLPARTWNWLRMNEADLKQIEAGESPTVITEGLGEAEPWNEDISVLPTGM GEDMDRLAGEAGTAAIRIQSGPLKEKKRAALHFVCSDGGKSFQPVELLVREGECLTVIMD FTSPPEAAGLAAVQTRFQVNKNGRLKLIQLQLLGSGYRFLNDIGGRCEEDGAAEVVQLFL GGSETYAGCRADLAGRDSVLNADIGYLGRGNQRFDMNYTADHRGERSSSRILAGGVLRDR AFKLFRGTIDFKSGAVGAEGEEKEDVLLFGDDVVNQTIPLILCAEEDVLGSHGATIGRLD EELLFYLCSRGMRREEAEAMITRARLEAVCRKSGDEAAEQMVHNYLKEVKENGKQKSIS >gi|229783925|gb|GG667810.1| GENE 11 9648 - 10394 689 248 aa, chain - ## HITS:1 COG:CAC3289 KEGG:ns NR:ns ## COG: CAC3289 COG0719 # Protein_GI_number: 15896534 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Clostridium acetobutylicum # 3 247 4 247 466 361 66.0 1e-100 MEEKTYVNDIDRSIYDIRNEETDAYRIEEGLKPAIVEQISKEKNDPLWMELFRLKSLQIY NSMEVPDWGPSLEGLDMSHIATYVRPGTKMKTKWSDVPEAIKDTFERLGIPQAERKSLAG VGAQYDSELVYHNVRQEVAELGVVYTDLESALKGDYADMVKKHFMKLVKPSDHKFAALHG AVWSGGSFVYVPPGVSVEIPLQSYFRLNAPGAGQFEHTLIIVDEGASLHFIEGCSAPKYN VANLHAGS >gi|229783925|gb|GG667810.1| GENE 12 10425 - 10523 94 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIDGRIVKTGGASLVEEINKNGFEQYLEQADR >gi|229783925|gb|GG667810.1| GENE 13 11516 - 12112 163 199 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 14 196 31 212 329 67 30 6e-11 MANQLLEVKGLSVKVEDKEILHGINLEIKKGETHVLMGPNGAGKSTLGYALMGNPRYEVT QGEIRFQGKLMNGETADKRAGAGMFLSFQNPLEVPGLSLGTFIRNAVEQRRGARVRLWDF RRELWKAMELLQMERTYGDRDLNVGFSGGEKKKAEILQMLLLKPGLAILDETDSGLDVDA VRTVSKGVEEYQKDRNGGL >gi|229783925|gb|GG667810.1| GENE 14 12323 - 12742 460 139 aa, chain + ## HITS:1 COG:DR2094 KEGG:ns NR:ns ## COG: DR2094 COG1959 # Protein_GI_number: 15807088 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Deinococcus radiodurans # 1 138 46 181 197 87 31.0 8e-18 MMISTRGRYALRVMLDLAQQTTDRYVPLKEIANRQEISEKYLEIVIKVLVQNKLLTGLRG KGGGYRLTRAPGEYTLGEILRLTEGSLAPVACAEESYTECSRRGFCLSRPVWQELDKIIN EYLDGITLEDLLKSPQKSV >gi|229783925|gb|GG667810.1| GENE 15 12883 - 14365 1278 494 aa, chain + ## HITS:1 COG:CAC2617 KEGG:ns NR:ns ## COG: CAC2617 COG0840 # Protein_GI_number: 15895875 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 4 315 2 309 664 67 21.0 7e-11 MVDKGVKKHIVVSAILVSFISAALVLIGGISYLSIERAQEKEAQKYMKEIVSQYKNIITA QIDGDFQTLEALAVLIGYDDVIHLDETLSCLEEQSIRNDFTRMGFVSPAQTGYFIDSDGE KHFDVDVSDEAFIQKTLSGQRTVSEIMTDRLSGEPVVCYGVPITYHDTITGAITATRLTS NLTDIISQDIFDGVAYIHIVDHEGNFIIRSDHAVIPKQMHNLFDDGEIEDGIRASILASM ENNGDSFATFRYQGEAYWVTFLPIGVNDWQLFCLVPQTFLNHNFHTLLLVFSGVMICIFF LFGILFLYIYSLLKKEHQTLQQFAYKDLLTGADNRNRFVTDLPALLKEPKDYAMVLMNIN GFKFVNEFFGFESGNRLLQHIAMVLHTNVCHEERYYRDSADHFGMLLTYHSQEELIDRIR NIQQEINEYTVSPNQDYRINCNFGVHIIQADTPWDAHRHSPDTAINGALLALNSVNGNTS NPVAFYDEALYEKA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:02:09 2011 Seq name: gi|229783924|gb|GG667811.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld204, whole genome shotgun sequence Length of sequence - 11530 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 7, operones - 2 average op.length - 3.5 N Tu/Op Conserved S 
Start End Score pairs(N/Pv) + Prom 93 - 152 4.2 1 1 Op 1 . + CDS 266 - 1420 1071 ## COG0523 Putative GTPases (G3E family) 2 1 Op 2 . + CDS 1425 - 2378 1108 ## Closa_2430 G3E family GTPase-like protein 3 1 Op 3 . + CDS 2417 - 2575 92 ## + Term 2777 - 2811 -0.0 - Term 2413 - 2465 5.6 4 2 Tu 1 . - CDS 2494 - 3471 1084 ## Closa_2429 polysaccharide deacetylase 5 3 Tu 1 . - CDS 4422 - 4763 323 ## gi|266624786|ref|ZP_06117721.1| putative fimbrial assembly - Prom 4854 - 4913 4.4 + Prom 4700 - 4759 5.0 6 4 Tu 1 . + CDS 5000 - 6223 1095 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Term 6247 - 6291 14.8 - Term 6233 - 6279 15.2 7 5 Tu 1 . - CDS 6285 - 7613 1125 ## COG3808 Inorganic pyrophosphatase 8 6 Tu 1 . - CDS 8531 - 9172 601 ## COG3808 Inorganic pyrophosphatase - Prom 9340 - 9399 5.9 + Prom 9286 - 9345 9.4 9 7 Op 1 . + CDS 9416 - 9667 203 ## COG1983 Putative stress-responsive transcriptional regulator 10 7 Op 2 . + CDS 9709 - 10029 304 ## COG1695 Predicted transcriptional regulators 11 7 Op 3 . + CDS 10026 - 10754 907 ## Closa_2422 hypothetical protein 12 7 Op 4 . 
+ CDS 10751 - 11528 899 ## Closa_2421 hypothetical protein Predicted protein(s) >gi|229783924|gb|GG667811.1| GENE 1 266 - 1420 1071 384 aa, chain + ## HITS:1 COG:FN0779 KEGG:ns NR:ns ## COG: FN0779 COG0523 # Protein_GI_number: 19704114 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Fusobacterium nucleatum # 3 197 2 193 294 135 37.0 1e-31 MTKIDIISGFLGAGKTTFIKKLLEEAISGEQVVLIENEFGEIGIDGGFLKNSGIEIREMN SGCICCSLVGDFGKSLAEVLTKYKPDRVIIEPSGVGKLSDVMKAVRDVAAEIEVTLNSAV TIVDVAKCRMYMKNFGEFFNNQIENAGTVVLSRTDIAAVDKVNQAVELIREKNPAAVIVT TPCSQLSGAQLLEIIEKPDTMAEDLMKEVEEHRRHHHDHEHGEECGCGHDHDHHHDHEHG EECGCGHDHDHHHDHDHGHGEECGCGHDHDHHHDHDHGEACSCGHDHGHHHADEVFTSWG LENVGAMKREELDRILDELAYGDQYGEVLRAKGMVPGEKEGSWLYFDLVPEQYEIRDGAP EYTGKVCVIGAKLKENELKKAFGR >gi|229783924|gb|GG667811.1| GENE 2 1425 - 2378 1108 317 aa, chain + ## HITS:1 COG:no KEGG:Closa_2430 NR:ns ## KEGG: Closa_2430 # Name: not_defined # Def: G3E family GTPase-like protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 317 1 317 317 556 82.0 1e-157 MEYTDTLDDDFMPVFLINGFLEAGKTQFLQFTMQQDYFQTDGKTLLIVCEDGDTEYDTEL LKRTKTAGIFIENMAELTPERLLELELLYNPERVLIEWNGMWNQDELQLPDDWRVYQQIT IIDMSTFDLYVANMKPLLYAMIKNSELVICNRCDGIEDLSGYRRTLKSMCQRGEIVFEDS EGEINEIADEDLPYDLGADVIEISPEAYGIWYIDCMDRRDRYDGKIVEFTAMVLKSPEFP KNYFVPGRMAMTCCEADMTFLGFVTKAREAKDLQTKQWVKVRAKIAYEFWKDYEGEGPVL YAETVQPAEPIREVVQF >gi|229783924|gb|GG667811.1| GENE 3 2417 - 2575 92 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNSCEVDRLRFRFTAIFEFVIRPEDYIWGTGVGLAGSNTSGTLNKSLKLSG >gi|229783924|gb|GG667811.1| GENE 4 2494 - 3471 1084 325 aa, chain - ## HITS:1 COG:no KEGG:Closa_2429 NR:ns ## KEGG: Closa_2429 # Name: not_defined # Def: polysaccharide deacetylase # Organism: C.saccharolyticum # Pathway: not_defined # 1 325 145 469 469 579 83.0 1e-164 MFFHSLIVDTSKAFDGDSREGGYNQMMTTKDEFIKMMQSMYDRGYVLVKIHDLAYETTDE NGNPKFVYGDIMLPEGKKAFVMSQDDVCYYEYMKGDGFATRMVIGEDGYPTTEMEQEDGS VITGDFDLVPILETFIKEHPGFSYKGARAIIAFTGYNGILGYRTDESYKDTNPNYDEDLK 
QAAAVAQCLKDHGWELASHSWGHRNLGTISMEHFKVDADKWEKNVESLIGPTDIILYPFG SDIGTWKPYTDDNERFTYLKSLGFRYFCNVDASTPYWMQLGPDHFRQGRRNLDGYRMYYY PDSLSDLFNVPDVFDPARPTPVPQM >gi|229783924|gb|GG667811.1| GENE 5 4422 - 4763 323 113 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624786|ref|ZP_06117721.1| ## NR: gi|266624786|ref|ZP_06117721.1| putative fimbrial assembly [Clostridium hathewayi DSM 13479] putative fimbrial assembly [Clostridium hathewayi DSM 13479] # 1 110 1 110 110 164 100.0 2e-39 MNMEEYRERRQQEKRRRMIHNILVIVTGFLLIAALVTGGIILGRYRSGFGSVKASVHQTT AETTAPEIITTAAETEDTGLTDLLTQAESLAAQYDYEGAINLITSDSRYALAS >gi|229783924|gb|GG667811.1| GENE 6 5000 - 6223 1095 407 aa, chain + ## HITS:1 COG:BH1535 KEGG:ns NR:ns ## COG: BH1535 COG1686 # Protein_GI_number: 15614098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 43 406 25 384 387 270 44.0 5e-72 MLALVLSMLLFLCPSHAAWGQEPDIAVASDMVIRSEEAKAVNAEGGPSIQAPSALLMEAS TGQVIYEKDADTKRSPASITKIMTLILIFDALDSGKIKMTDEVVTSAHAKSMGGSQVFLE EGETQTVETMIKCIVVASGNDASVAMAEYIAGTEEEFVKMMNERAAGLGMTNTHFEDCCG LTESPDHVTTARDIALMSRELINKYPQIHNYSTIWMENITHVTKQGTKEFGLSNTNKLLK MATNFQVTGLKTGSTSLAKYCLSATAEKDGVRLIASIMAAPDYKARFADAQTLLNYGYAN CRLYEDKEMLPLPELAVDNGVTDSVPLKYGGTFSYLSLNGEDLAAIEKELVLPESIAAPV EEGQSAGVIQYSLAGKKLGEVPVLTAGAVKEAGFLDYFKRVLAGFAV >gi|229783924|gb|GG667811.1| GENE 7 6285 - 7613 1125 442 aa, chain - ## HITS:1 COG:FN2030 KEGG:ns NR:ns ## COG: FN2030 COG3808 # Protein_GI_number: 19705321 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Fusobacterium nucleatum # 1 438 225 670 671 389 56.0 1e-108 MGADLFESYVGSLISALTLGVVFYREAGVIFPLLLSACGVIASIIGALIVRAGNNSNPHR ALKSGEYSATALVVAAAFLLSRMYFGSFNAAFAVIAGLLVGVMIGFVTEIYTSGDYRFVK KIASQSETGSATTVISGIAVGMQSTAIPILLVCIGILVSYRLMGLYGIALAAVGMLSTTG ITVAIDAYGPIADNAGGIAEMAGLDDSVREITDKLDSVGNTTAAIGKGFAIGSAALTALA LFVSYAEAVHLTTIDILNAHVIIGLFIGGMLTFLFSSMTMESVSKAAHQMIEEVRRQFRE 
KPGILKGTDRPDYASCVSISTTAALREMFLPGLMAVLVPLAVGLILGPEALGGLLTGALV TGVLTAIFMSNAGGAWDNAKKYIETGHHGGKGSSAHKAAVVGDTVGDPFKDTSGPSINIL IKLMTIVSLVFAPLFLQYGGLL >gi|229783924|gb|GG667811.1| GENE 8 8531 - 9172 601 213 aa, chain - ## HITS:1 COG:MA3880 KEGG:ns NR:ns ## COG: MA3880 COG3808 # Protein_GI_number: 20092676 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Methanosarcina acetivorans str.C2A # 1 210 1 210 671 184 54.0 1e-46 MEKMILSVPVIGLIGLLFAVFLRLHVIKQDPGNDRMREIADAIAEGARAFMTSEYRVLII FVAVLFVVIGLGTRSWTTAICFLVGSAFSTIAGYLGMSVAIRANCRTANAARTSGMSRAL SVAFSGGSVMGMAVVGLGLLGVGTLFLITKNVEVLAGFSLGASSIALFARVGGGIYTKAA DVGADLVGKVEAGIPEDDPRNPAVIADNVGVAS >gi|229783924|gb|GG667811.1| GENE 9 9416 - 9667 203 83 aa, chain + ## HITS:1 COG:CAC2659 KEGG:ns NR:ns ## COG: CAC2659 COG1983 # Protein_GI_number: 15895917 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Clostridium acetobutylicum # 4 78 3 62 63 68 41.0 3e-12 METKRLYRSVKNKVICGVCGGIGEYFNVDPVIIRLIWVILMFFQPWRHLFHSFAGFSLIG GSLVVYFIAAVIIPRAPSDDGSC >gi|229783924|gb|GG667811.1| GENE 10 9709 - 10029 304 106 aa, chain + ## HITS:1 COG:SPy2172 KEGG:ns NR:ns ## COG: SPy2172 COG1695 # Protein_GI_number: 15675909 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 102 1 100 108 80 41.0 9e-16 MVFNTGAALLDAIVLAVVSKEQEGTYGYKITQDVRNVIDISESTLYPVLRRLQKEECLEV YDMAIDGRNRRYYKITEKGRVQLKLYKGEWAGYSRKISQIFEEAFI >gi|229783924|gb|GG667811.1| GENE 11 10026 - 10754 907 242 aa, chain + ## HITS:1 COG:no KEGG:Closa_2422 NR:ns ## KEGG: Closa_2422 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 241 1 246 247 262 67.0 1e-68 MNREEFMKELEYLLSDIPDEEKAEAIGYYRDYLEEAGEENEEAAIREFGSPERIAAIIRS DLSGHLEDGGEFTDKGYEDERFRDPNYQVAKRYDLPDAVEGQKTEEREEKRENQPRTNRT VKLILWIILIIVAAPVLFGIGGGLVGMFGGLFGLLVAAVVVVGVLTAALFICGIALLIVG 
IVSMVIHPLSGVLMIGISVLTLGFAFACLALCGAFYGKFLPFLFRSIVNGISGLLYGRRS RS >gi|229783924|gb|GG667811.1| GENE 12 10751 - 11528 899 259 aa, chain + ## HITS:1 COG:no KEGG:Closa_2421 NR:ns ## KEGG: Closa_2421 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 259 1 236 300 128 32.0 3e-28 MKNFMKVCLIAGGVCILIGGGITTVAAAMGGRLSDLPSWNYRWLGRDIVGDITGDLRSEM DDLGRDLSDQFDDSFDELDREFDEIGRRHDGVNDGELNESEFSGIYARKLDLEVRRGRAF ITYDAPADKIVVSSGEDMSRWNVYSEEDELKVTVGSKSWDSYDYEDTVVYIHVPADYRFE EVELKVKAQRNNRVINSDNGPAITAEALKADKIEIDAAVGAVSVTGADAGKLEVDSDVGA VKFEGNVDGDVDAECSVGA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:02:44 2011 Seq name: gi|229783923|gb|GG667812.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld205, whole genome shotgun sequence Length of sequence - 14499 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 7, operones - 5 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 72 - 395 170 ## PROTEIN SUPPORTED gi|42783874|ref|NP_981121.1| 30S ribosomal protein S14 homolog-related 2 1 Op 2 . + CDS 437 - 793 197 ## gi|239626903|ref|ZP_04669934.1| conserved hypothetical protein + Term 1017 - 1051 0.5 3 2 Tu 1 . - CDS 936 - 1169 246 ## BCA_1836 GTP-binding protein 4 3 Tu 1 . - CDS 2114 - 2995 495 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 + Prom 5140 - 5199 80.4 5 4 Op 1 3/0.000 + CDS 5330 - 5707 362 ## COG3682 Predicted transcriptional regulator 6 4 Op 2 . + CDS 5704 - 7410 928 ## COG4219 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component + Term 7453 - 7494 0.5 - Term 7435 - 7488 14.2 7 5 Op 1 3/0.000 - CDS 7529 - 8581 998 ## COG0582 Integrase 8 5 Op 2 . - CDS 8578 - 9309 470 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Prom 9403 - 9462 5.1 - Term 9619 - 9656 8.4 9 6 Op 1 . 
- CDS 9668 - 10291 522 ## Cbei_0730 hypothetical protein 10 6 Op 2 . - CDS 10354 - 11853 1277 ## COG1690 Uncharacterized conserved protein - Prom 12033 - 12092 16.4 11 7 Op 1 . - CDS 12994 - 13116 138 ## 12 7 Op 2 . - CDS 13183 - 14358 1211 ## gi|266624807|ref|ZP_06117742.1| conserved hypothetical protein - Prom 14439 - 14498 2.2 Predicted protein(s) >gi|229783923|gb|GG667812.1| GENE 1 72 - 395 170 107 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42783874|ref|NP_981121.1| 30S ribosomal protein S14 homolog-related [Bacillus cereus ATCC 10987] # 1 94 6 98 114 70 36 9e-12 NMNAQFKKGVLELVVLESVRKKDMYGYELVEEVSKVIDVNEGTIYPLLKRLTNEHYFETY LRESTEGPPRKYYHLTAAGVLYREKLEQEWNEFQQRVCTFLKEQKDE >gi|229783923|gb|GG667812.1| GENE 2 437 - 793 197 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|239626903|ref|ZP_04669934.1| ## NR: gi|239626903|ref|ZP_04669934.1| conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] # 9 115 63 165 169 83 36.0 6e-15 MKMRNSRIFWKNFFYSHSGNPTKIIPDETVPENTVRFQITYNEQAVSPYLRDSEQESEDP FVGIEFAYLQNDMELFLAGKEQLLDDIKNRRIGSYETVSVDRIRIFVNPASADLVIMD >gi|229783923|gb|GG667812.1| GENE 3 936 - 1169 246 77 aa, chain - ## HITS:1 COG:no KEGG:BCA_1836 NR:ns ## KEGG: BCA_1836 # Name: not_defined # Def: GTP-binding protein # Organism: B.cereus_03BB102 # Pathway: not_defined # 1 75 343 417 419 70 44.0 2e-11 MSAKYGIGIPELLRMMEQILFSEYQEARFLIPYENGNVVHYFNSRAMVHSQEYQEQGVCL TVSCREGDREKYKQYLT >gi|229783923|gb|GG667812.1| GENE 4 2114 - 2995 495 294 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. 
AzwK-3b] # 33 293 64 316 425 195 45 2e-49 MKELNNLVKACDMEPVARIDQNLASINAAYYIGSGKVGEIREAAETLRADYVVFNDTLSP SQLKNLQREIDVPVWDRTYLILEIFSRRAQTKEARMQVESARLQYMLPRLMGLRDSLGRQ GGASGSLSNKGSGEKQIELDRRKIEGRISELRRELERMEQERNVQRRKRSRGTCPQAALV GYTNAGKSTLLNKLVELSNGKEEKMVMAKDMLFATLDTTVRKISPNGSQDFLLSDTVGFI SRLPHSLVKAFRSTLEEIRYADLLLHVVDFSDEHYKEQMEVTEETLKELGAGDI >gi|229783923|gb|GG667812.1| GENE 5 5330 - 5707 362 125 aa, chain + ## HITS:1 COG:CC1640 KEGG:ns NR:ns ## COG: CC1640 COG3682 # Protein_GI_number: 16125886 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Caulobacter vibrioides # 2 125 23 144 144 66 30.0 1e-11 MHQITEYELELMKIIWANGNKALYADIAKALEDRGTPWTKNTIITLSSRLIAKGFLKTAK IGRRNEYIAIVSESSYQTAQAETFLEKIYEGNAKGLVSTLIEKDLLSAEDMEDLRRYWEG GESGK >gi|229783923|gb|GG667812.1| GENE 6 5704 - 7410 928 568 aa, chain + ## HITS:1 COG:CAC3437 KEGG:ns NR:ns ## COG: CAC3437 COG4219 # Protein_GI_number: 15896678 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component # Organism: Clostridium acetobutylicum # 9 382 12 365 541 127 23.0 5e-29 MTEILKTVLSLSLSGSLLILLLFMLRPVIKARLSKRWQYYIWLVVAARLMLPFAPEENFM GTFFQGIDRAVTQTELAAPTGQQTAVAGISSAEYKTDGSKKLHHEQRTPMKSAGILARIV SSAMWKHLWPGWFAIALILLIRKITIYQDFVKYIRAGSEEAGDIDLLERFGKLVEKSRVK NTVEFCTNSLISSPLLIGFFHPCIVLPTADLSSSDFEYTILHELTHYKRRDMFYKWLIQL IVCVHWFNPLVYLMRREIDRACELSCDEAVIKTLDPGGRKAYGDTLLHAMGAGGRYRDSL ASVTLNESRELLKERLDAIMRFKRTTRLTAAVSAALVLVLSFTAAVTGAYAASPDAENTV VINLSNEGQHSVIHSSSFEAKDGQILTLKITSSIKGAADLFLFSPSNQEQRIPIGGNSET KTIPLSEGRWSYNCTGFFDSGNISIVGTLSSAQSAAAGAAGDAPSGLTVTDPDPNDMIVI DLSDEGQKCIVHSSGFEAKDGQILTLEITSSINGSVDLFLFSPSNQEQRITMGGSSETKT IPLSEGRWSYNCTGFFDSGTISIVGTLQ >gi|229783923|gb|GG667812.1| GENE 7 7529 - 8581 998 350 aa, chain - ## HITS:1 COG:CAP0080 KEGG:ns NR:ns ## COG: CAP0080 COG0582 # Protein_GI_number: 15004784 # Func_class: L Replication, recombination and repair # Function: Integrase # 
Organism: Clostridium acetobutylicum # 41 344 19 307 323 131 32.0 2e-30 MNTLSYHQQRDIENVKHLRELIRELPPFCADFFRGIEPRTSSRTRIAYAYDLKVFFDFLL KENPVIAKMDMKDISLDHLDNLKLVDMEEYMEYLKYRFNDKNQEITNKERGIMRKISSLK SFFNYYYRNERLVNNPAALVQLPKLHEKEIIRLDVDEVALLLDEVEKGEALTEKQKSYHE KTKLRDLALLTLMLGTGIRVSECVGLDIDDVDFKNGGIRIHRKGGKEVTVYFGSEVEDAL NDYLEERKMIIAEEGHESALFLSLQRKRLAVRSVENLVKKYARIVTPLKKITPHKLRSTY GTNLYKETNDIYLVADVLGHADVNTTKKHYAALEDERRRSARNKVRLRED >gi|229783923|gb|GG667812.1| GENE 8 8578 - 9309 470 243 aa, chain - ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 12 228 40 258 277 145 35.0 5e-35 MKLQRLYSLTRQAIDYYHLIEDGDHIAVGISGGKDSLTLLYALQGLKQFYPKHFELSAIT VDLGFGDFDLSPVKELCERFSVPYTIVSTEIGKILFETRKESNPCALCAKLRKGALNDAA IKLGCNKIAYAHHREDLIETMLLSLIYEGRFYAFSPSTYLDRTGLTVIRPMIYVKEADVI GFKNKYDLPVCKNPCPVDGHTKREYVKKLTKTLEQENPGVKDRLFHAIVDGTIEGWPERG QLS >gi|229783923|gb|GG667812.1| GENE 9 9668 - 10291 522 207 aa, chain - ## HITS:1 COG:no KEGG:Cbei_0730 NR:ns ## KEGG: Cbei_0730 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 2 202 1 220 480 141 35.0 2e-32 MINRDDMLELTRRMTLARTSFTRIAGCYVDRDGDFEGSFNTNFLKVPAPERTKKLKLAKE IPFSATNVNLKKYEFTQDMRKPGSMWQLLMAMNECGLKNDALMDTFYDLIMEKYHKNEPY AILVFHDRYDIPAKGSDGERQWESEEVFEYMICAICPMTGEYEPGKPECGFLFPAFTDRS GDQNHINIFQADAGRPHRELLEILGLG >gi|229783923|gb|GG667812.1| GENE 10 10354 - 11853 1277 499 aa, chain - ## HITS:1 COG:TM1357 KEGG:ns NR:ns ## COG: TM1357 COG1690 # Protein_GI_number: 15644109 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 59 499 54 474 474 185 32.0 2e-46 MELKLNFENRSCGVSVFLPEGTKPDKKALDELNRMMELEHTVECMKKAGSFFKNPDSAVR KAAVTPDFHKGAGIPIGTVLMTEGFVIPQAMGNDINCGMRLYTTDLKEEELEGKLSVLLP 
EIRRIFFEGGRGIAMNGIQREAMLREGLTGLLETSGEAAGTGIWRFYDREEQEEELERTA FKGSLWAEETEGLEDFIGDAHTLTYDDQIGSIGGGNHFVEMQRVAEIYDGQTANAWGIKR GAILLMIHTGSLMIGHQSGRISQMITRGLYPKLIPHPDNGIYLLPEGEGEGREWRRFRST TGNAANFGFANRMFLGLMLQNAVENRVHPFEMKLLYDSPHNFIWEKESEGRKSYIHRKGA CSARGMAEMEGTPFAYYGEPVMIPGSMGSSSYLLRGMGNPDALWSASHGAGRRLSRGEAI HGSDREFQEFMKKFHIVTPIDPNRSDLKGRNDILKKWEESLRSEAPYAYKDIDKVVKVHA DHGMAGLVARMEPVFTIKA >gi|229783923|gb|GG667812.1| GENE 11 12994 - 13116 138 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDKNGLDFVVEKSKELMSASTCSSEAKAAAQAWLDAVGTD >gi|229783923|gb|GG667812.1| GENE 12 13183 - 14358 1211 391 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624807|ref|ZP_06117742.1| ## NR: gi|266624807|ref|ZP_06117742.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 391 1 391 391 706 100.0 0 MKSRNDRWIAAAACIALCGVLTACGQKEKIQDIETTVTSQAEEEQSAGDHDRRPESETLS SDGAENGYRIIEDQTFEANLNPLGEVTFVSFEPEDRANSFADAIFELKDGDRTIAVLEGI SRDNIREGERLQKVKAVSFPDYNSDGYSDIIIICSYLSVSETAAGAVYSEVRIYQGSESG TFTLEKGLSKDADAALAEKTVQSVLGFLGAGKKSEAPSGWKQAYIDYIQAQDEEEWDGYQ LIYIDDDDIPELVKIGNSEAVGCMIAAWDGGQVVENQLNRLYFSYIEKENLLCNSEGSMD YYYDLVYSLTDGRLSLIASGYYGAGDRSNLEFDEDGNVIYQYEWNGTPMSREEYSQAFNA VYDTSKARDGYEWGQWLTQSQTIQKISEMQQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:03:22 2011 Seq name: gi|229783922|gb|GG667813.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld206, whole genome shotgun sequence Length of sequence - 11058 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 8, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 662 290 ## COG1357 Uncharacterized low-complexity proteins - Prom 768 - 827 30.8 2 2 Op 1 . + CDS 1996 - 2532 571 ## COG4720 Predicted membrane protein 3 2 Op 2 . + CDS 2559 - 2969 361 ## gi|266624811|ref|ZP_06117746.1| conserved hypothetical protein 4 2 Op 3 . + CDS 2983 - 3090 70 ## 5 2 Op 4 . 
+ CDS 3116 - 3565 535 ## COG0242 N-formylmethionyl-tRNA deformylase 6 2 Op 5 . + CDS 3637 - 4113 560 ## COG4894 Uncharacterized conserved protein + Term 4168 - 4202 3.6 7 3 Op 1 . + CDS 5358 - 5945 281 ## COG1309 Transcriptional regulator 8 3 Op 2 . + CDS 5942 - 6490 201 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 9 4 Tu 1 . - CDS 6499 - 7848 1108 ## COG0534 Na+-driven multidrug efflux pump - Prom 8081 - 8140 4.3 + Prom 7955 - 8014 4.8 10 5 Tu 1 . + CDS 8049 - 8426 107 ## gi|266624819|ref|ZP_06117754.1| toxin-antitoxin system protein - Term 8367 - 8419 -1.0 11 6 Tu 1 . - CDS 8423 - 9130 507 ## BcerKBAB4_0681 alkaline ceramidase domain-containing protein - Prom 9166 - 9225 9.6 12 7 Tu 1 . - CDS 10127 - 10288 168 ## gi|266624821|ref|ZP_06117756.1| hypothetical protein CLOSTHATH_06213 - Prom 10359 - 10418 5.9 - Term 10392 - 10428 4.1 13 8 Tu 1 . - CDS 10434 - 10697 285 ## CLJ_B0712 hypothetical protein - Prom 10942 - 11001 4.4 Predicted protein(s) >gi|229783922|gb|GG667813.1| GENE 1 2 - 662 290 220 aa, chain - ## HITS:1 COG:CAC0073 KEGG:ns NR:ns ## COG: CAC0073 COG1357 # Protein_GI_number: 15893370 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Clostridium acetobutylicum # 24 190 1 152 270 139 44.0 4e-33 MDPIHESIPHRHAVTGEEDLMPALITHGRKRLAADCSRCCGLCCTALYFSKEEGFPGDKE AGIPCIQLCEDFSCRIHSSLSGRRLKGCINYDCFGAGQTVTEIVFEGKSWCQLPDPEEMF RIFLVELTLHEAMWYLEEAALLRPAAALRPRIHAALTDLEKLAGGSRQTLTEEGIRDTIT AANGLLKETIRLTLDYAASQTMPSKKGRAPRSARSSACSR >gi|229783922|gb|GG667813.1| GENE 2 1996 - 2532 571 178 aa, chain + ## HITS:1 COG:PH1832 KEGG:ns NR:ns ## COG: PH1832 COG4720 # Protein_GI_number: 14591582 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pyrococcus horikoshii # 8 156 42 184 202 77 34.0 1e-14 MKTNNNTRKMVMTALFMALVCTATMVIRIPTPGTNGYIHPGDALVILSGIFLGPVQGLLA AGIGSALADFLGGYLLYVPATFLIKGLTAFCCGLVFQRLSATAGKRTAGVVIGGLTDMVF 
VACGYWLYETCYYGMAAALASVPSNLIQGIGGLVLAVILYPVLAVCRRQMEQEEKRLE >gi|229783922|gb|GG667813.1| GENE 3 2559 - 2969 361 136 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624811|ref|ZP_06117746.1| ## NR: gi|266624811|ref|ZP_06117746.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 136 1 136 136 274 100.0 1e-72 MAIAKKHKRSLIYGDSTYYWWVKRDDDYDRPVLHIASEDKGLIVLYPTEQHGERRYLVSI GRVFKNRTTSGCWQRYRIPKLDTGREPDAGQDTEKADEPVMEAFPVTPGFVASLLGWCLD EGDAESVTWNGRDIWL >gi|229783922|gb|GG667813.1| GENE 4 2983 - 3090 70 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVILSLQGCMAVGKTTAVQHVKERRPEMDRHRSE >gi|229783922|gb|GG667813.1| GENE 5 3116 - 3565 535 149 aa, chain + ## HITS:1 COG:CAC2474 KEGG:ns NR:ns ## COG: CAC2474 COG0242 # Protein_GI_number: 15895739 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Clostridium acetobutylicum # 1 149 1 149 150 175 61.0 2e-44 MAVLNMRYDGDEILRKKCKEVKEVDDRIREILNDMTDTLHATPNGAAIAANQVGILKRLV VIDMGTGLMKLVNPVIVEQTGEQDCIEGCLSFPEKYGRTIRPQTVIVKALDENGEVVTLT GIDEMAKCFCHELDHLDGVCFVDKVTEWL >gi|229783922|gb|GG667813.1| GENE 6 3637 - 4113 560 158 aa, chain + ## HITS:1 COG:BS_yxjI KEGG:ns NR:ns ## COG: BS_yxjI COG4894 # Protein_GI_number: 16080945 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 149 2 150 162 92 31.0 3e-19 MQLLFKQRLFSWFDSYDIYDEAGNTVYTVKGKLSWGHLLEIYDRFDNHIGTVQERVLTFL PKFLMYINGQEIGEIRKELTFLKPKFTLDCNGWTVDGDWLEWDYRVKDSSGSLIMTASKE FLRFTDTYVLDIVREEDALLCLMIVLAIDAAKCSGGQD >gi|229783922|gb|GG667813.1| GENE 7 5358 - 5945 281 195 aa, chain + ## HITS:1 COG:CAC0723 KEGG:ns NR:ns ## COG: CAC0723 COG1309 # Protein_GI_number: 15894010 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 182 1 183 185 120 36.0 1e-27 MKETNNPVVWQSRTWLMDAFTALMAEKKFEEITVVELCRRADLSRRTFYRNYSSKEALFR 
SCCERLCREYVRYLERDTAYFAAHITEVYFSFWKEHKTFLRSVCRDDKICCLIEVSNTFW PEIYSRLQSYWNETMSGEELEYCLFFNMGGLWNIMMKWLYEEPERSPGEMECMIRRALHN LLAGTEKSPDRSDAK >gi|229783922|gb|GG667813.1| GENE 8 5942 - 6490 201 182 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 14 176 484 655 904 82 32 2e-15 MKKFTRKELLSIPNLMGYFRLLMIPVFVSLYLHAETDRGYYIAAAVMGISSITDMFDGLV ARKFNMITEFGKFLDPFADKMTHGAILLCLWTRYPYMAVLVILFVVKEGFMAVMGLVKLR EGKKLDGAMWFGKVCTALLFVVMFVLILVPHIPVAAANALIAVCAVVMAATLILYIPVFR NM >gi|229783922|gb|GG667813.1| GENE 9 6499 - 7848 1108 449 aa, chain - ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 10 419 12 421 426 244 35.0 2e-64 MEKSTSSTLMTEGSIWKKIIAFAVPLFLGNLFQQLYNTADSLIVGNFLGSNALAAVSSSG NLIFLLVGFFNGIAIGAGVVIARYYGAGRHDDVQKAVHTTIAFGLAAGAALTVIGVVLAP MILVWMGTPADVLPESTVYFRTYFAGSLSFVMYNVFVGILQSVGDSRHPLVYLIISSLIN IVLDLVLVGIFHFGVEAAAFATIVSQFVSAFLCLRRLMRSPAEYRVCLKKIRFDTALLRQ IIANGLPAGFQNSVISLANVVVQSNINRFGKMAMAGCGAYSKIEGFGFLPITCFAMALTT FVSQNLGAGQYERAKKGAGFGILCSIITAELVGIFIYAFSPLLISAFNNDPQVIINGTAQ ARTVTLFYFLLAFSHCIAGILRGAGKSTVPMLVMMVCWCIIRVTYITVTVHFIPDIRVVF WAYPLTWFLSSVVFLRYFLKGNWVHGFEQ >gi|229783922|gb|GG667813.1| GENE 10 8049 - 8426 107 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624819|ref|ZP_06117754.1| ## NR: gi|266624819|ref|ZP_06117754.1| toxin-antitoxin system protein [Clostridium hathewayi DSM 13479] toxin-antitoxin system protein [Clostridium hathewayi DSM 13479] # 11 125 11 125 125 217 100.0 3e-55 MKKKTKNYKKQREFIKQWLKAGMYAGGFCETCGGRLILFFKHDAVCCPGCNQWIDLRCGD PECPYCSQRPQTPADALEEERSRLDFTQTADQKEYCIRQYERSARGEHRKAEKIRYRESK PPFRF >gi|229783922|gb|GG667813.1| GENE 11 8423 - 9130 507 235 aa, chain - ## HITS:1 COG:no KEGG:BcerKBAB4_0681 NR:ns ## KEGG: BcerKBAB4_0681 # Name: not_defined # Def: alkaline ceramidase domain-containing 
protein # Organism: B.weihenstephanensis # Pathway: not_defined # 32 225 223 416 420 110 33.0 5e-23 MMIQGASGDVRPRFHQENTEYPEIHSFKAAEKGFSEEYQKRYRVQSKNALEQTADSICEA VSAVLPAMTAKPVTHLTMKSVFSHFTADVPSLTQAERIAGEALKEAGIDGTDWLREVKCL NGEGVSQQFSEVEIQYFSLNDGCLCGIANEAMCRISIDAMEAAGTPLLFLNGYTNGCNSY LPTAGEYEKGGYEVLWSNLVYFPYHGRVMPFNRDTAGQLVEEVVQNWRKLRADYG >gi|229783922|gb|GG667813.1| GENE 12 10127 - 10288 168 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624821|ref|ZP_06117756.1| ## NR: gi|266624821|ref|ZP_06117756.1| hypothetical protein CLOSTHATH_06213 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06213 [Clostridium hathewayi DSM 13479] # 1 52 1 52 52 110 100.0 4e-23 MNTYDIAQMGYGEADITPETFVELVGFFRPNNHSKGVRDPLKLQTMVWKYRGS >gi|229783922|gb|GG667813.1| GENE 13 10434 - 10697 285 87 aa, chain - ## HITS:1 COG:no KEGG:CLJ_B0712 NR:ns ## KEGG: CLJ_B0712 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_Ba4 # Pathway: not_defined # 1 87 1 87 87 104 58.0 1e-21 MKYINAAAVLPASLVEELQNYVQSGYLYVPAREGQHRAWGERNGGREELRKRNRDILDAW QNGVSMETLADQYCLSIHAIRKIIYQK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:03:56 2011 Seq name: gi|229783921|gb|GG667814.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld207, whole genome shotgun sequence Length of sequence - 6727 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 143 - 4639 4525 ## Selsp_1785 hypothetical protein 2 1 Op 2 . - CDS 4632 - 5342 964 ## Selsp_1786 hypothetical protein 3 1 Op 3 . 
- CDS 5383 - 6726 1558 ## Selsp_1787 hypothetical protein Predicted protein(s) >gi|229783921|gb|GG667814.1| GENE 1 143 - 4639 4525 1498 aa, chain - ## HITS:1 COG:no KEGG:Selsp_1785 NR:ns ## KEGG: Selsp_1785 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 1 1489 1 1459 1470 867 40.0 0 MSKINAVRIINLNYNNNAIRINDETFRLGGKSTLLSLRNGGGKSVLVQMMTAPFVHKRYR DSRDRPFASYFTTNKPTFILVEWVLDGGAGYVMIGMMVRKNQDLSEGNGEELEMVNFISE YKTRCELDIDHLPVIEKTKKEMALKGFGVCRQMFEQWRRDRSVKFSSYDMNQAPQSRQYF DRLMEFRIYYKEWETIIKKVNLKESGLSELFSDCRDEKGLVEKWFLDAVENKLNKDKNRM KEFENIVTKYVSQYKENRSKIERRDTIRLFREEAGEILKDAECYREEAAAAVAHEGKIAA FIGELDRLEAEAENEKAAVQDETEALEAEIKRIGYEEISCEIHGYEEEQTLHARRRDMAG FELDSLEHECEGLTQKLRLLECARWREITRESFRDYRFADERLAALRQKKEDLEPERVML GTRLKAYYGEVCRKESEEAEILTESLRVTRERIAALKGREAGYQKALQEAERLAGGLQTR ISGYDGEEERYNRRHRQSLRRNILGEYEPAALEMAADGLIREREQVSRLMTEGKREAGRL EEELRIVDRNLEDAKKDLILTETAKKEQTKELGRLEEERKDLRVILRYLELPEEYLYERD RILTAIQRKLSDIAMARRGFEKELDELEKEYRKLAEGSVLELPAEFEEMLEQTGLRRVYG MEWLKKNGNSVEKNQELIRRQPFLPYSLILSRGELEKLIKSGNELFTSFPIPIIKREDLE DGAVECDGGIVSLGKISFYVLFNDNLLDEEKLALMMREKEAQIGRKRQAADQKDKEYREY FEKQEKIRSQTITKERYDAVEQEIGRLEEKISELNAGILELSAKQSSKKERKRVLERESA EAERKIEDLKRQEEDLAALTGFYKVYLESRAELLKCRDKMEATRQDQERSRQLCDKLEAK EKSLEGENGLLRERMKETEGKRQIYVGFGGESAGEAAIETVEEAAVSGTGVSAADAGEWH PVIPSLEEAVGLEARYAAITGNLSGEERELEENCRKTLSVYEKNEKELKRLAEKYRLSAA EYQDIWYSAEEKDRVDDMLETKQTEKKISERKWNEEDKLVALAVQNKERCQKDMVKKFGE DEPLPKDAITQVDFDVRRSRATDRVKDARDRLRAVEEKLRSYEENLTALSEYREFAPAEG LTWDSQIPLENMTPKQLGDFKGSMIRDYRQALEKRSGKKADLERTLNRVVRREEFAEEFF KKPLETLLLLTSSAEQVIMQLTTVLSSYESLMEKLEVDISMIEKEKDKIVEILEDYVREV HANLGKIDRNSTITVRDRQIKMLKIQIPGWEENENLYHVRLMDFMDEITARGIELLQENQ NASEYLGTRVTTKTLYDTVVGIGNVEIRLYKIEEKREYPITWSEVAKNSGGEGFLSAFII LNSLLYYMRKDETDIFADRAEGKVLLMDNPFAQTNASHLLKPLMDMAEKTDTQLICLSGL GGESIYSRFNNIYVLNLIAANLRNGMQYLKSDHIRGSEEETIVLSQVEVNEQLELNLF >gi|229783921|gb|GG667814.1| GENE 2 4632 - 5342 964 236 aa, chain - ## HITS:1 COG:no KEGG:Selsp_1786 NR:ns ## KEGG: Selsp_1786 # Name: 
not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 1 234 1 234 251 266 65.0 4e-70 MAYEFSDIRTSQEIFYHLLMKRELREEDEPYLYQAYTEQEQIQNLVKSQGDAAESFVERY GGVIYLIPKEDNNFLGFSKNQLKQALCKSTATDKDYYLSQFVILTLLVEFYDGQGSSSKA REYMKVGELQNCISERLKEGCERAKDEEEREGLAFSNMLEAYEALRSDDRGSKAKTTKEG FLYHILNFLEKQGLIDFVEEDEMIKTTKKLDSFMDWNLLNQNQFQRVLKVLGVEHE >gi|229783921|gb|GG667814.1| GENE 3 5383 - 6726 1558 447 aa, chain - ## HITS:1 COG:no KEGG:Selsp_1787 NR:ns ## KEGG: Selsp_1787 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 7 442 95 535 543 389 49.0 1e-106 LEPKGFEFLINVVLCNEGRAMYFDGYDFEMNAYKIMHISFVANRIVYVDSDVKRTSYYLT DDGYNLILSTLEVENNMKLTIHEIIFQMHLEKASYDKAVDEIKNIFNLLRIQLQKIEETM NRIRRNALEYTVADYREILEGNLETIADTKQKFQNYREVVRKRERELEEEHINVKRLGKQ DEQSLENLRIIEGYLDRSIDEYQKILSSHFDLKALYTRELEQISQMALIRRFHLRNDLYD KILDDVTALDRLEYFLRPLFNREQDKIYNINKALEFQKPVREKKEEAESIYSFEEEVLDS EEEYRRERLKMYEASLKVLLGFLKREGEISLKAIRETLGDQLKELVPDTDIFKEIMVELL KSGRLDVQMLKKERSESLGDKSAEFRLSEMFLDLLEQDDSLADVTALEAYRIEDGETIVF EGVKNAEGRLCNIRCSNVLLRLLGHRV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:04:27 2011 Seq name: gi|229783920|gb|GG667815.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld208, whole genome shotgun sequence Length of sequence - 7261 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 310 239 ## COG0583 Transcriptional regulator - Prom 345 - 404 6.6 + Prom 304 - 363 7.5 2 2 Op 1 . + CDS 478 - 2160 1263 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 3 2 Op 2 . 
+ CDS 2178 - 2414 168 ## gi|317485248|ref|ZP_07944129.1| 4Fe-4S binding domain-containing protein + Term 2448 - 2485 1.2 - Term 2432 - 2475 1.2 4 3 Op 1 34/0.000 - CDS 2494 - 3282 571 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 5 3 Op 2 31/0.000 - CDS 3272 - 3964 806 ## COG0765 ABC-type amino acid transport system, permease component 6 3 Op 3 2/0.000 - CDS 3982 - 4839 994 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 4895 - 4954 6.5 7 3 Op 4 5/0.000 - CDS 5060 - 6325 1147 ## COG2199 FOG: GGDEF domain 8 3 Op 5 . - CDS 6326 - 6919 598 ## COG2199 FOG: GGDEF domain 9 3 Op 6 . - CDS 7000 - 7260 157 ## gi|266624834|ref|ZP_06117769.1| conserved hypothetical protein Predicted protein(s) >gi|229783920|gb|GG667815.1| GENE 1 1 - 310 239 103 aa, chain - ## HITS:1 COG:CAC3361 KEGG:ns NR:ns ## COG: CAC3361 COG0583 # Protein_GI_number: 15896604 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 90 1 90 312 73 44.0 8e-14 MELLQLQYFERIAKYQSMSKAAEELHVSQPSLSSCIIRLEKELGVKLFDRNGRKIVLNSY GKYFLNMTRQILNLVNECKVPTNEKALMERINIGFMNYNEKLF >gi|229783920|gb|GG667815.1| GENE 2 478 - 2160 1263 560 aa, chain + ## HITS:1 COG:PA2298 KEGG:ns NR:ns ## COG: PA2298 COG1053 # Protein_GI_number: 15597494 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Pseudomonas aeruginosa # 12 531 7 521 574 145 27.0 2e-34 MSDMIEMNKPLECDVLVAGGGVAGLMAAIAAAEEGAKVIVAEKANSKRSGSGATGNDHFL CYIPAVHGDDPSEFIREMEESQIGGNNDMDLVRRYVAESFNRVKDWDSWGVGMREHGDWE FNGHAKPGHLRVWLKYAGREQKPIFTREAIKRGVTILNHHPFADVITDGHNRVAGGICID ITGEVPKMQVIRAKSVVLATGSVNRLFGSKTMGWMFNNAGCPSDVCTGRAAAYRAGARLV NLDRTGTGAGCKYFNRGGKATWLGVYSDISGKPLGPFVTKPSKEYGDLAGDIWHDMFSIK RSQGEPVFMNCSECTDEQLDYMLWGLSNEGNVGTLDHLASEGFDFRKHMVEFDCKDGGNI GGGIDISVDAQCTVDGLFAAGDETGNFRADMGGAATYGHIAGVSAACYSRGQTLIPAEEM 
EIVTARAHHYSEIISRDTDTATPDWREINVAIQQVMTDYCGPEVRSEHILTTGLSHLKRL QAKAATVHCADSHEFMRCLEAEDLALLGELAFVSALERKETRGKHNRVDYPFTNPLLNNK FVTVEKVNGVPSTGWRDKHK >gi|229783920|gb|GG667815.1| GENE 3 2178 - 2414 168 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|317485248|ref|ZP_07944129.1| ## NR: gi|317485248|ref|ZP_07944129.1| 4Fe-4S binding domain-containing protein [Bilophila wadsworthia 3_1_6] 4Fe-4S binding domain-containing protein [Bilophila wadsworthia 3_1_6] # 1 73 1 73 80 76 50.0 6e-13 MPPIIDKNKCIGCHTCVDICPVDVYGQKQTKKQPPVIQYPDECWHCNACVFDCPVKAISL RIPAPASIVFVDAPKKES >gi|229783920|gb|GG667815.1| GENE 4 2494 - 3282 571 262 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 23 261 1 242 245 224 44 1e-58 MSTRVNMESTTRVRSAGGMDEPLIQVQNLGKKFGSIEVLKDITVDIYKGDVVCVIGPSGS GKSTFLRCLNRLEEPTSGHILFEGTDITDKKTDIDRHRQKMGMVFQQFNLFPHMTILKNL TLAPVKLQGRSEEEAQAQAMTLLEKVGLADRASAYPNQLSGGQKQRIAIVRALCMNPDVM LFDEPTSALDPEMVGEVLNVMRDLAEEKMTMVVVTHEMGFAREVATRVMFMDGGYFLEEN EPKEFFEHPKNDRLKGFLSKVL >gi|229783920|gb|GG667815.1| GENE 5 3272 - 3964 806 230 aa, chain - ## HITS:1 COG:FN0802 KEGG:ns NR:ns ## COG: FN0802 COG0765 # Protein_GI_number: 19704137 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 13 229 11 236 236 163 46.0 3e-40 MWEKLVADFYFNFIEDNRWKYITDGLQNTLKITFFAVLIGIVLGFLVAIIRSTYENTHKL KILNAVCTVYLTVIRGTPVVVQLMIVYFVIFVAINPGMVTTAILAFGINSGAYVAEIFRS GISSIERGQFEAGRSLGFNYAQTMWYIIMPQAFKNVVPTLANEFIVLLKETSVAGYIGLQ DLTKGGDIIKSRTYSAFMPLIAVALIYLVMVMIFSYLVKLLERRLSRSEH >gi|229783920|gb|GG667815.1| GENE 6 3982 - 4839 994 285 aa, chain - ## HITS:1 COG:FN0800 KEGG:ns NR:ns ## COG: FN0800 COG0834 # Protein_GI_number: 19704135 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Fusobacterium nucleatum # 
74 281 20 228 230 177 44.0 3e-44 MKKNVLVLVAAVCMAAALTACGSSNSTAETTAAAAETTTADTAAGSTDTAADTAADTSAA EETEGAGGKIVMVTNAEFPPYEYHENNTIVGIDADIARAIADQMGMELEIQDMAFDSLIP AIQSGKADFAAAGMTVNEERLRNVDFTETYAEAAQVIIVKEGSAIAAPADLTGKKIGVQT GTTGDIYADDVENAEVQRFNKGMEAVMALTQDKLDAVIIDREPAKVFVKENEGLKILDEA FTEEEYAIAIKKGNTELLDKMNAAIKELKESGELQKIVDKYITAE >gi|229783920|gb|GG667815.1| GENE 7 5060 - 6325 1147 421 aa, chain - ## HITS:1 COG:aq_1455 KEGG:ns NR:ns ## COG: aq_1455 COG2199 # Protein_GI_number: 15606624 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 246 409 25 189 236 92 38.0 1e-18 MSLVLSWKGYYTMFVAVCYYIYNHHMFAYIDYLTGKKLPKWKRSVVTFIINYEVFMLISQ LRFYLIINWTIFAVFLVLEIALLYRVPWTMSLFCGIQGAMIGLAFNFITRSGTALLVNQS LAAFDSRISSSIASLKPYPIAAGFVLAGLGFWGLQHKFHDWAEKEENPDKPVNLIFSLCL VLSLFIYLDLNLLIYFTPSNEPIIKLWGMKSGICVLIGYYIGISHIHTLTELQHFERQSY SVRKELISHKILEQKLEQLAFHDVLTGCQSREVARKAMLYHYENEIPFYICFADLNNLKP VNDNLGHDAGDRYLAAVAHALEEAAGADDIVSRYGGDEFLLMTAEDGIERVRAGIKNAQW ALKELSNSSEYPFELWVSYGIAYSGECRDMEELMRLADDRMYENKRRDKQERSGGKAAGG G >gi|229783920|gb|GG667815.1| GENE 8 6326 - 6919 598 197 aa, chain - ## HITS:1 COG:alr3504 KEGG:ns NR:ns ## COG: alr3504 COG2199 # Protein_GI_number: 17230996 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Nostoc sp. 
PCC 7120 # 41 193 128 284 307 92 34.0 5e-19 MIQFILFLAHTHRIIEKAHYEAEYYRLEEERAEHVKRQMILQKLAYIDGLTGAFTRRYAM EMLESMQKDGLEVTVAYIDVNGLKKVNDTLGHQEGDRYLKLIADSLNESLNKSDILSRIG GDEFMIVSNSSEKEGLESMLKKVNELLGTAGREGYRPSFSYGVVSAPHKVPFDLEELLRE SDRRMYVNKMQFKRGNV >gi|229783920|gb|GG667815.1| GENE 9 7000 - 7260 157 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624834|ref|ZP_06117769.1| ## NR: gi|266624834|ref|ZP_06117769.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 86 1 86 86 134 100.0 3e-30 NFFSVCFIAVNLISMAAVSLVTGSMLRDVYLSNTYCVITLCITLIMIHLMNMLWKKKIFT DWITLLTSDEKRFNQLIFLNGMHLDI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:04:42 2011 Seq name: gi|229783919|gb|GG667816.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld209, whole genome shotgun sequence Length of sequence - 7485 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 70 - 216 222 ## Closa_2872 chaperonin GroEL + Term 236 - 294 18.3 + Prom 345 - 404 4.3 2 2 Op 1 . + CDS 433 - 1068 643 ## SpiBuddy_0012 regulatory protein ArsR 3 2 Op 2 . + CDS 1071 - 2300 847 ## COG0477 Permeases of the major facilitator superfamily 4 3 Tu 1 . - CDS 2303 - 4474 2402 ## COG1511 Predicted membrane protein - Prom 4532 - 4591 7.0 - Term 4612 - 4667 14.3 5 4 Tu 1 . - CDS 4735 - 5223 404 ## gi|266624839|ref|ZP_06117774.1| surface protein - Prom 5270 - 5329 8.8 6 5 Op 1 3/0.000 - CDS 5332 - 6210 1308 ## COG1940 Transcriptional regulator/sugar kinase - Term 6238 - 6271 2.7 7 5 Op 2 . 
- CDS 6279 - 7241 983 ## COG1482 Phosphomannose isomerase - Prom 7295 - 7354 5.5 Predicted protein(s) >gi|229783919|gb|GG667816.1| GENE 1 70 - 216 222 48 aa, chain + ## HITS:1 COG:no KEGG:Closa_2872 NR:ns ## KEGG: Closa_2872 # Name: not_defined # Def: chaperonin GroEL # Organism: C.saccharolyticum # Pathway: not_defined # 1 47 1 47 49 73 93.0 2e-12 MSYIAPAVQAKFETLSIDLKNEILERNVQINTIYDLINVLQEIVNEGQ >gi|229783919|gb|GG667816.1| GENE 2 433 - 1068 643 211 aa, chain + ## HITS:1 COG:no KEGG:SpiBuddy_0012 NR:ns ## KEGG: SpiBuddy_0012 # Name: not_defined # Def: regulatory protein ArsR # Organism: Spirochaeta_Buddy # Pathway: not_defined # 19 200 19 194 196 100 34.0 5e-20 MITSNIPKLLKNGLQQKEDLMSEQNMITLVTRKQMDIYMNPQRQRLLKAMDVSGVPMTPK QLSTILKISASSVSLHIRKLEELGLVKLDHTEAIHGIQAKYYKKLPVSVSLGGNLDDDLK EERSCLSDFIMTELWNGFKDRLKRAKDQSDVMTTGDFTSGVVHLSRKDAEELYHLILDFT DAHTVPGDETVPWEFGLVAYPHIPLPEEKEE >gi|229783919|gb|GG667816.1| GENE 3 1071 - 2300 847 409 aa, chain + ## HITS:1 COG:HP1165 KEGG:ns NR:ns ## COG: HP1165 COG0477 # Protein_GI_number: 15645779 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Helicobacter pylori 26695 # 44 202 48 202 386 72 35.0 1e-12 MKRSHILLFGANAFGTGLLAPVLSLVLLSHGATMENLSLCIGIFAVMVVVLELPSGILAD LIGRKRIFLISACFMMANYALLLFSRHFLMLAAACAMQGIGRAFSSGSIEALEIETYLCE HEGRNLEKINSTMAVIESVGLASGSIAGGILGYLDVSYSLLLILAFILQILILGLAAWFV KEPPRKKSPLSPVLQLKEQVRGLYHSLLHSLPVTTIVLMSAACGMLLCTVEVYWQPTLQA FLPGQMGWIFGVITCLGYLGVTIGSKAAEAVMQSKRTVFTPEKSWRTYWILRFLLILSVA VLGFTRRVWLFPVLFVLVYVVLGAGNLIENTIFHGTVASSQRAGMMSLLSLSLRAGGLGT SLLGSLIISGLSLSYVWLLLPLFSALLIGVIMIFHHRSAANCRTAKAGC >gi|229783919|gb|GG667816.1| GENE 4 2303 - 4474 2402 723 aa, chain - ## HITS:1 COG:CAC2582 KEGG:ns NR:ns ## COG: CAC2582 COG1511 # Protein_GI_number: 15895842 # Func_class: S Function unknown # Function: Predicted membrane protein # 
Organism: Clostridium acetobutylicum # 4 722 7 720 722 295 29.0 3e-79 MKQIWAIFLRDIKNIGKNPVAVIVALGIMILPSLYAWFNIAANWDPYGNTKGLRVAVASL DKGTRIEQLDITVNIGDMIISNLHENDQIGWQFVDEEQAVEGVKAGRYYACVVIPEDFSE KTASIFTSNVRKPTLQYYVNEKKNAIATKITDKGVSTIQQQVNESLISASSEALGKAFHV TYDTVEGKKEDLADEMVASMKDARNDLVLMGSSVNALKSALTAGKGLLSSVQAMLPDSRE LLESGQDTGAEVSRLIRSSEDLSDTMTDSVGNVLDITDGLMDSVDGSIRDAIDMWDRDTN AAAGALNRASGILDKLLTVNQKLFDLVNRLKENLPGPGKVLGAMSDMIGSQIERQKNMIH IIDTAEDAVLKTGKLPADARQEIKKSLEEMMDQGAQITDQYHSQVKPAMTAGLDSLYDSL SAASDYAVGINGLLPQIDGTLTQTTDSFDSLIQTLDQTAVMITAGEEKLDRIIGEVESVE ESERLAKIVEIMKNDPETMGNFLSSPVNLEENQIYPIENYGSAMTPFYTILAIWVGGLVL VAVLKCRVLEDENIHGGKPCEKYFGRYLIFMLFGVAQALIVALGDLYLLKIQCLEPAKFV LAAVTASVVFVNIIYTLTISFGDVGKALAVVLLVIQVAGAGGTFPIEVTPHFFRMVNPML PFTHAINAMRECVGGIYGNAYFEDMGKILLYLPVSLFVGVVLRKQVMKMNDFFERKLEET GVM >gi|229783919|gb|GG667816.1| GENE 5 4735 - 5223 404 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624839|ref|ZP_06117774.1| ## NR: gi|266624839|ref|ZP_06117774.1| surface protein [Clostridium hathewayi DSM 13479] surface protein [Clostridium hathewayi DSM 13479] # 1 162 1 162 162 185 100.0 1e-45 MRKYPLFVLAAAAAMSIASVPMTAHAAAFQVTGGGNGMSGYMTNCSGQTGNLAGIFNQLS ATANGRGGLGCGLGQNYTGNGSCDYGQGCGDNGFCGFGQDCGDNGSCGFGQNCGDNGSCG FGQNCGDNESCGFGQSCSGNRNCGGNQNMGMNGQGIPFLMMR >gi|229783919|gb|GG667816.1| GENE 6 5332 - 6210 1308 292 aa, chain - ## HITS:1 COG:CAC1523 KEGG:ns NR:ns ## COG: CAC1523 COG1940 # Protein_GI_number: 15894801 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Clostridium acetobutylicum # 1 290 1 287 288 330 54.0 2e-90 MRLGALEAGGTKMVCAIGNEHGEIFERVSIPTETPEITMPKLIDYFKDKEIEALGIGCFG PIDLNRKSETYGYITTTPKLKWANYNIVGAFKEALGVPVGFDTDVNGSALGEATWGITKG LENSVYFTIGTGVGAGIISNGRLLHGMLHPEGGHVLLAKHPEDTYAGKCPYHRNCLEGLA AGPAIEERWGKKGIELADRKEVWEMEAFYIGQAIVDYIVILSPQRIILGGGVMHQEHMMP LVREEVKRQLNGYIQTKELEDLDSYIVLPSLNDNQGIMGALKLAMDELELAE >gi|229783919|gb|GG667816.1| GENE 7 6279 - 7241 983 320 aa, chain - ## HITS:1 
COG:lin2215 KEGG:ns NR:ns ## COG: lin2215 COG1482 # Protein_GI_number: 16801280 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Listeria innocua # 2 309 3 315 318 318 49.0 6e-87 MELLFMEPVFKEAIWGGTKLRDVFGYEIPGDTTGECWAVSAHKNGDCRIAGGMYDGRLLS SLWKEKPELFGNYPGDQFPLLVKIIDAKADLSIQVHPDDEYAKIHEDGSLGKTECWYILD CDPGTEIVIGHHAGSHEEMEAMIRGHQWDAFIRRVPVKKGDFFQINPGCVHAIKGGTLIL ETQQSSDITYRVYDYDRLSDGKPRQLHVEQSIATIEAPFQEASCDTFTEKIPGAVHTHLI TCPFYSVDKYEIDGVFTREFSRYFANVSVISGKGCVNGIPLETGQHFIVPAGAGELRFEG KMTVICSNPETETRQNAADL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:05:02 2011 Seq name: gi|229783918|gb|GG667817.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld210, whole genome shotgun sequence Length of sequence - 10181 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 889 553 ## Acid_0198 glycoside hydrolase family protein + Term 933 - 973 9.0 - Term 2160 - 2217 12.5 2 2 Tu 1 . - CDS 2327 - 3553 832 ## COG1940 Transcriptional regulator/sugar kinase - Prom 3633 - 3692 3.4 + Prom 3470 - 3529 5.4 3 3 Op 1 5/0.000 + CDS 3677 - 6238 1818 ## COG0210 Superfamily I DNA and RNA helicases 4 3 Op 2 . + CDS 6211 - 6912 652 ## COG0210 Superfamily I DNA and RNA helicases + Term 7000 - 7034 3.2 + Prom 7037 - 7096 7.3 5 4 Tu 1 . + CDS 7147 - 7569 388 ## COG3973 Superfamily I DNA and RNA helicases + Prom 8408 - 8467 80.4 6 5 Tu 1 . 
+ CDS 8557 - 10180 1398 ## COG3973 Superfamily I DNA and RNA helicases Predicted protein(s) >gi|229783918|gb|GG667817.1| GENE 1 2 - 889 553 295 aa, chain + ## HITS:1 COG:no KEGG:Acid_0198 NR:ns ## KEGG: Acid_0198 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: S.usitatus # Pathway: not_defined # 1 295 641 932 932 167 35.0 6e-40 RENYRLQVQNGVPLGNGYYLRREPVAFEHRGNHDVTLAGGKGVACAAAAKNDYALAELAQ RQFEWISGKNPFAESVMFGEGYDYCQEYAVLPGEMVGELGVGFATLDEHDSPFWPQVNTC VYKEVWIRSVLQWIWLASDLHGGAKISGIMPQKNGKVLFTNMDYGCIYELSVNSETGWYE GELPAGNYEICCCGQIKHMTLLASRSYRLDAPFYDYQIKARKEGNEVTLVIRTQGSGRAR IKLNMINLTCCDFDREIILGEEIEIKGEIREARRPYYAVLIPDGKLEQIKEVYGR >gi|229783918|gb|GG667817.1| GENE 2 2327 - 3553 832 408 aa, chain - ## HITS:1 COG:lin0217 KEGG:ns NR:ns ## COG: lin0217 COG1940 # Protein_GI_number: 16799294 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 27 307 12 277 404 98 24.0 2e-20 MAIFDTIKLFYDKGVDNRMKAYIPTYLKKQNRTTVLDLFLEHKELTRADVSKLTGISLPT VLKIVDFLRMKQILIDKPDALTLSDSGLGRKSQLLQLNEDSYSTIGIYFEGNYLHVGLIN LAYSIIDQKAFPLNSLKVSEDHFSSLSYMLVDTVRYLCSQHPTTEILGIGFGLPGIVDYL NKKIMKKNYQFFIEFYDLFPAFKELKEFPLFIENDINAASLGEILLRNRSHNGNLIYLSL GSGFGSGIIIDGEIWHGANNFAGDISNSILVLPKPSSAVSPEDLRVEKNVCLETLQKQFD FDIRYKETCDDRKREKICQYLSYYLIPLVNNLSYTLDISDFVLAGLTTNFLGKELFTKLE KGLEALRTPAILGPETHIMPTISDSTGIVGAAAIAFSNCLPKLLEENV >gi|229783918|gb|GG667817.1| GENE 3 3677 - 6238 1818 853 aa, chain + ## HITS:1 COG:Cgl0831 KEGG:ns NR:ns ## COG: Cgl0831 COG0210 # Protein_GI_number: 19552081 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Corynebacterium glutamicum # 473 835 8 380 763 200 35.0 1e-50 MYIADLHIHSRYSRATSRDCTPEYLDLWARRKGISIVGTGDFTHPAWRKELEEKLEPAED GLYVLKEEYRIKDETVGASVTPRFVITGEISSIYKKHDRVRKVHSVILLPGLEAAECLSR KLETIGNIHSDGRPILGLDCHDLLEIMLELNSESIYIPAHIWTPHFSLFGAFSGFDTVEE CYEDLTPYIHAMETGLSSDPPMNWRLSALDSYQLVSNSDAHSPAKLGREANLLDMELSYQ 
GLQKAIQTGEGLAGTIEFFPEEGKYHYDGHRKCHLCLSPLEAEKYSGKCPVCGRKLTIGV SHRVEQLADREEGFVLPGAKAFESLVPLPEVIAASTGHSAASRKVENQYRDMLSALGSEF AILRELPIEDIRKVSGRLTAEGIRRLREGKVERLPGYDGEYGTIRLFDQMEREEIDGQLS LFTGAEMAELSHMETAAAVVADCTAVAVAADSAAVAVRTAAAEGAIHGKEAEPESDQPAP GLNLQQEAAVTAIGRAINVVAGPGTGKTKTLVSHAFHLLKERGVKPTELTAVTFTNQAAR EMHSRLEGQLGGKRFVSRMRIGTFHSICYDLLKKSGMEFTLADEVEMKDTAEEIIAEMQL NLSARQFLRQISAEKCGLEKQTETCPKEAYQAYQKRLKEQGILDFDDLLLETLHLFEEDG KNKTLRQGFSYLLVDEFQDISPVQYRLIREWNKGGRELFVIGDPDQAIYSFRGSDERCFE KLKEDWPNVTTICLEQNYRSTPQILDAASNVISQNRGQKRRLIPQKRKGVPVRLVTAAGE MSEDIFAAKEINRLIGGIDMLDAQGHQEMGDGHRVRSFSEIAVLYRTNRQAELLETCLKK KEFLMWLPAGKIS >gi|229783918|gb|GG667817.1| GENE 4 6211 - 6912 652 233 aa, chain + ## HITS:1 COG:TP0102 KEGG:ns NR:ns ## COG: TP0102 COG0210 # Protein_GI_number: 15639096 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Treponema pallidum # 98 201 518 628 657 80 41.0 2e-15 MVAGREDFLMEPAVRGTICFFKALLNPEDRAARRLSLKLLWKLPEDGISEGIFDGMAETY RKKIKREKPQKVLEAWISDMNLQEEEGVAKLAAMSVFYKNMQEFLDNLTFGQESDLKRCG GKTYTSDSVTLMTLHGSKGLEFPVVLMCGVRKGLIPLESGRQEIDEAEERRLFYVGMTRA KEELILITSQEASPFLKDIPEDVTERENAGKQRDPGMGKQMSLFDFNTFSEGH >gi|229783918|gb|GG667817.1| GENE 5 7147 - 7569 388 140 aa, chain + ## HITS:1 COG:BS_yvgS KEGG:ns NR:ns ## COG: BS_yvgS COG3973 # Protein_GI_number: 16080398 # Func_class: R General function prediction only # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 44 137 56 149 774 58 36.0 3e-09 MNQSQTKRIPGVTLEEERRTLSEILAIADRNLKQVKSSVQNLADELHELKEIYDAEDKEG LALWFNTDARFQQVRQELLRMERCRKKPYFGRIDFTDSSLLKKECYYIGKAAITKDAAEL VVIDWRAPIASVYYEGSLAS >gi|229783918|gb|GG667817.1| GENE 6 8557 - 10180 1398 541 aa, chain + ## HITS:1 COG:CAC0603 KEGG:ns NR:ns ## COG: CAC0603 COG3973 # Protein_GI_number: 15893892 # Func_class: R General function prediction only # Function: Superfamily I DNA and RNA helicases # Organism: Clostridium acetobutylicum # 4 508 198 714 721 162 29.0 2e-39 
MANDELLTKYLAKNKRTVLNDIIATIQQEQNEIIRKRPQHNVIVQGGAGSGKTTVAMHRI SYILYNYDLEFQPKDFYIIGSNRILLNYITGILPDLDVYGVSQMTMEQLFTRLLYEDWDD QKYTIKKLDRENKTACIKGSYSWFHDLEDFCRRYEWSYIPREDVYTEKTHHLLMSRSSIE ELLHKFDFLSLPDKLNKLTEHLMAKLENEIYGRYYTYPPEEQKALLKYYQTYFGKREWKG SVFELYDSFLTEQNNKGNQIPLPGHAFDLYDLAALAYLYKRIKETEVIQEASHVVIDEAQ DFGMMAYGALKYCLSKCTYTIMGDVSQNIYFDYGLTDWEELRSLMLPGEFDYFGLLRKSY RNTVEISDFATNILQHGNFPIYPVEPIIRHGNEVRKSECKDEEQLILETAKTALQWQKNG YETIAVICRDEEEERRVSKELRKRAAIREFDAETEEFGSGVMVLPIEYAKGLEFDAVLLF NASDRNYPAEDGFARLLYVAATRALHELAVLYTGKLTDLIAVTVSEEKKQKYVMVRQEQP V Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:05:09 2011 Seq name: gi|229783917|gb|GG667818.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld211, whole genome shotgun sequence Length of sequence - 4559 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 16/0.000 + CDS 29 - 388 415 ## COG2205 Osmosensitive K+ channel histidine kinase 2 1 Op 2 . + CDS 381 - 1091 790 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Term 837 - 888 3.1 3 2 Tu 1 . - CDS 1088 - 2080 1068 ## Closa_4231 Exonuclease RNase T and DNA polymerase III - Prom 2222 - 2281 5.3 + Prom 2228 - 2287 7.1 4 3 Tu 1 . + CDS 2396 - 2824 417 ## COG0071 Molecular chaperone (small heat shock protein) + Term 2864 - 2907 10.4 5 4 Tu 1 . + CDS 2945 - 3541 437 ## EUBELI_01438 hypothetical protein 6 5 Tu 1 . 
- CDS 3544 - 4557 1022 ## COG2200 FOG: EAL domain Predicted protein(s) >gi|229783917|gb|GG667818.1| GENE 1 29 - 388 415 119 aa, chain + ## HITS:1 COG:lin2827 KEGG:ns NR:ns ## COG: lin2827 COG2205 # Protein_GI_number: 16801887 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 1 113 777 890 896 115 50.0 2e-26 MVPMDATLIEQVIINLLENAVYHSHTDEPIDLTVTVHDGLAWFDITDHGIGIPPERLDTL FDGYTPSPNSSGDSHKGMGIGLSICKTIITAHGGTITAANGKHGARFTFTLPLGEDTYE >gi|229783917|gb|GG667818.1| GENE 2 381 - 1091 790 236 aa, chain + ## HITS:1 COG:lin2826 KEGG:ns NR:ns ## COG: lin2826 COG0745 # Protein_GI_number: 16801886 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 2 233 3 230 231 269 59.0 3e-72 MSKLTIVIIEDEKNILSFIEAALETHDYKVLTAVSGKEGLSLINACCPDMILLDLGLPDL DGIDIIKNVRSWSSIPIIVISARTQEQEKVAALDFGADDYITKPFGTSELMARIRTALRH GKPTVSTGDGKLIQPPYRSGDLVIDFDKRLVMLKDKEIHLTQIEYKLVSLLAEQSGKVLT YDYIISRIWGPYADSNNQILRVNMAHIRRKLELNPAEPQYIYTEIGVGYRMRESET >gi|229783917|gb|GG667818.1| GENE 3 1088 - 2080 1068 330 aa, chain - ## HITS:1 COG:no KEGG:Closa_4231 NR:ns ## KEGG: Closa_4231 # Name: not_defined # Def: Exonuclease RNase T and DNA polymerase III # Organism: C.saccharolyticum # Pathway: not_defined # 1 319 1 319 330 473 72.0 1e-132 MKNYIVLDLEWNQSPDGKEGTIDHFPFEIIEIGAVRLDEERKEIGEFRRLIRPRVYPQMH YRISEVTHMDMAELERNGETFETVIRDFLEWCGDDCIFCTWGSMDLTELQRNMVYYGVEI PFDKPLLYYDVQKLYSLLQGDGKQKQSLDITVEELGIREDRPFHRALDDAHYTGRVMAAM DFERVLEYWSTDYYRLPESKEEEVYLVFPGYSKYISRAFETKEEAIADKTVTDLICYRCN RMLRKKVRWFSVNQKFYFALGICPEHGYLKGKIRMKRSDDGRIYAVKTMKLVDEEGARQI SEKKEEAKKKRVEKTKHRKQKTNMKMRGSD >gi|229783917|gb|GG667818.1| GENE 4 2396 - 2824 417 142 aa, chain + ## HITS:1 COG:RSc0200 KEGG:ns NR:ns ## COG: RSc0200 COG0071 # Protein_GI_number: 17544919 # Func_class: O Posttranslational modification, protein turnover, 
chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Ralstonia solanacearum # 34 141 37 139 140 65 37.0 2e-11 MMMPSIFGENLFDDFMEDAFKSPIFGKREKNLMKTDIRENDNGYELDMDLPGFKKDEITV NLRDGYVTISAERGMERNEKDEKTGKFVRQERYSGSCQRSFYVGEDVKQEDMKARFEDGI LHLEFPKASPKQVEESHRILIE >gi|229783917|gb|GG667818.1| GENE 5 2945 - 3541 437 198 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01438 NR:ns ## KEGG: EUBELI_01438 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 4 194 43 233 235 78 28.0 1e-13 MIWKEMKRSYLVISICAAFLGLLLLLLPGPTLATAGICTGVCFIVYGVTKLVEITRFKSF FGKYWLHCLLALFPIVFGVIFLFKPLAATSILPFILGVFLAVYGIMGLKTSMKLKRFGLD SWWFHLITAVITILLGALSIFNPFATAAAMVMMMGAILFTLGVFHVVQYFVASHRFKKAG RQISELEKEWDSFDWFWF >gi|229783917|gb|GG667818.1| GENE 6 3544 - 4557 1022 337 aa, chain - ## HITS:1 COG:slr1692 KEGG:ns NR:ns ## COG: slr1692 COG2200 # Protein_GI_number: 16330979 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 96 330 75 309 332 147 33.0 2e-35 SFPEGDIFRLSGDEFVIVAMNMAYENFMKLVKNMGIELDSRTPSGVSLGTTWVEHLTDFD VLLHHAEELMLVNKQIYYKNSDEVRKHYSPEGLKLLVHDVEQGYYRLYLQPKFDPETGTV HSVEALSRYQAPGRDLQSPVKFVSLLEKMKLIRYLDFYMLEEVFRLLSRWKAEGKPLIPV SVNFSRITLLESDLFQMLTEIQSKYDVPCDMVMIEITERIGDMEHKVMESIGSKLRKAGF RISLDDFGADYANMSILSIMHFDEVKLDKSLVENLVENEKNQIIVRCIIEMCRRLHVDCV AEGVETKEQLDLLKEYGCTTIQGFYYSKPVDAEQFLI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:05:22 2011 Seq name: gi|229783916|gb|GG667819.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld212, whole genome shotgun sequence Length of sequence - 10297 bp Number of predicted genes - 13, with homology - 11 Number of transcription units - 10, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 123 131 ## - Prom 146 - 205 5.1 + Prom 228 - 287 4.2 2 2 Tu 1 . 
+ CDS 309 - 569 217 ## gi|266624856|ref|ZP_06117791.1| conserved hypothetical protein + Term 586 - 634 9.3 - Term 574 - 622 13.1 3 3 Op 1 . - CDS 636 - 1328 922 ## COG1051 ADP-ribose pyrophosphatase 4 3 Op 2 . - CDS 1364 - 1807 465 ## COG2426 Predicted membrane protein - Prom 1869 - 1928 8.5 - Term 1940 - 1980 8.1 5 4 Tu 1 . - CDS 1988 - 2719 805 ## COG0846 NAD-dependent protein deacetylases, SIR2 family - Prom 2839 - 2898 7.0 6 5 Tu 1 . - CDS 2995 - 3147 166 ## - Prom 3218 - 3277 4.4 + Prom 3233 - 3292 8.2 7 6 Tu 1 . + CDS 3364 - 4020 545 ## Closa_2516 phosphoesterase PA-phosphatase related protein + Term 4075 - 4138 23.7 - Term 4066 - 4122 19.3 8 7 Tu 1 11/0.000 - CDS 4169 - 4507 327 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 4590 - 4649 80.4 9 8 Op 1 . - CDS 5498 - 5827 398 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 10 8 Op 2 . - CDS 5824 - 6570 470 ## Closa_2518 Abortive infection protein - Prom 6635 - 6694 80.4 + Prom 7518 - 7577 22.0 11 9 Op 1 . + CDS 7728 - 8660 1003 ## COG0549 Carbamate kinase + Term 8666 - 8715 12.3 12 9 Op 2 . + CDS 8788 - 9618 538 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) 13 10 Tu 1 . 
- CDS 9597 - 10241 591 ## CD2025 putative ABC transporter permease Predicted protein(s) >gi|229783916|gb|GG667819.1| GENE 1 3 - 123 131 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYKILIVEDDEVIGRTVQRHLQEWGYEAVCVNDLKHVLEE >gi|229783916|gb|GG667819.1| GENE 2 309 - 569 217 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624856|ref|ZP_06117791.1| ## NR: gi|266624856|ref|ZP_06117791.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 86 1 86 86 172 100.0 1e-41 MITIFNRKELYSGFSMVECSRIGRILKSNNIPYTIKDCDLNPIRPGFCRLTAPMLQTKPI IEYSLYVNKQDYEEARFLLMCEEGSV >gi|229783916|gb|GG667819.1| GENE 3 636 - 1328 922 230 aa, chain - ## HITS:1 COG:CAC1777 KEGG:ns NR:ns ## COG: CAC1777 COG1051 # Protein_GI_number: 15895053 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Clostridium acetobutylicum # 1 219 4 228 307 175 45.0 7e-44 MELKNEDGLTEKEFLAQYRPGNYERPSVTVDMLIFAVDPEAEEEEPELLLIRRRNHPCIG QWAIPGGFVNMDESLEEAAARELEEETGITGICLEQLYTWGNVKRDPRTRIISVSYMAVV PKDELNLVAGDDAKEAVWFQVKKKKLSDLADGATYALTVENEERHIFIGYRITEGYTHMG LMWKKDTHAELLPAIDVTDTEALAFDHAYILNAALDRLEEMQKEYREQLE >gi|229783916|gb|GG667819.1| GENE 4 1364 - 1807 465 147 aa, chain - ## HITS:1 COG:AF1696 KEGG:ns NR:ns ## COG: AF1696 COG2426 # Protein_GI_number: 11499286 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Archaeoglobus fulgidus # 2 142 1 137 137 59 35.0 2e-09 MVPLIELRGAIPYSQVMGLPLLESYIIAIIGNMLPVPIIYLFARKVLEWGADKPVIGGFF TWCLEKGKHGGEKLQAKAGRGLFVALLLFVGIPLPGTGAWTGTLAASLLDIDFKSSILAV MGGVLLAGVIMGLASVGVLGALNSVIF >gi|229783916|gb|GG667819.1| GENE 5 1988 - 2719 805 243 aa, chain - ## HITS:1 COG:CAC0284 KEGG:ns NR:ns ## COG: CAC0284 COG0846 # Protein_GI_number: 15893576 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Clostridium acetobutylicum # 4 238 6 241 245 139 33.0 7e-33 
MEGKISKLREILDESRYTVAICGSGMMEEGGFIGVKNPDKAYDIETKYGYSAEEIFSSAF YNTRPEFFFEFYKKEMLDNPPEDTASGPALAAMERAGKLQCLITSNIYAKAQRAGCKNVI NLHGTIYENKCPHCGKDYTMEQVRDAKKILLCDKCGIPVRPQVSLFGEMVDSRRMTKTSN EISKAEVLLLLGTTLDSEVFGNYVKYFTGRWLVIIHPNRNYLDDKADMLFLDYPKNILPK LGY >gi|229783916|gb|GG667819.1| GENE 6 2995 - 3147 166 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDKDKQEKLAQKKSNEIFNSITEKVKPENQNQTHNSRREGMGPNMKRKPM >gi|229783916|gb|GG667819.1| GENE 7 3364 - 4020 545 218 aa, chain + ## HITS:1 COG:no KEGG:Closa_2516 NR:ns ## KEGG: Closa_2516 # Name: not_defined # Def: phosphoesterase PA-phosphatase related protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 218 1 218 218 330 77.0 3e-89 MKNLLHKYRHAWILCYAFIYIPWFVYLEKTVTSNYHIMHVRLDDFIPFNEYFIIPYMLWF AYVAGAVLYLFFTDVKGYYRLCTMLFTGMTISLLVCTIFPNGTDFRPVVNPDKNICSAIV AWLYSTDTCTNVFPSIHVYNSVCVHIAVSHSEVLKKFKSVQIGSFILMVSICLATVFLKQ HSAFDGLGAVVMAYVMFQFIYTDGYVTSRKKVTQKAIS >gi|229783916|gb|GG667819.1| GENE 8 4169 - 4507 327 112 aa, chain - ## HITS:1 COG:CAC1488 KEGG:ns NR:ns ## COG: CAC1488 COG0463 # Protein_GI_number: 15894767 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 1 112 227 338 338 103 48.0 7e-23 MIGRIDQQIRVTKLMLGYCDVTKLKKRKLRHYMVRYLEIMMTVSSILAIKSGTEENMEKK KELWQYLRKLNLALYLRLRWGFLGQGTNLPGKSGRKFSIAGYKITQKFFGFN >gi|229783916|gb|GG667819.1| GENE 9 5498 - 5827 398 109 aa, chain - ## HITS:1 COG:CAC1488 KEGG:ns NR:ns ## COG: CAC1488 COG0463 # Protein_GI_number: 15894767 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 1 108 1 107 338 144 58.0 3e-35 MKLLSIAIPCYNSEAYMRHCIESLLPGGDEVEILIVDDGSTKDRTAEIADEYERKYPGIC RAIHQENGGHGEAVNAGLRNAAGIYYKVVDSDDWVDEAAYQEILATLRR >gi|229783916|gb|GG667819.1| GENE 10 5824 - 6570 470 248 aa, chain - ## HITS:1 COG:no KEGG:Closa_2518 NR:ns ## KEGG: Closa_2518 # Name: not_defined # Def: Abortive 
infection protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 248 3 247 247 188 46.0 2e-46 MVILYLILPLVFYSLVTGVIALILPLQAVTCTLLGAVVTFPVLCYWPYRSDQARRGVIPP IPFRFRGCLPVIILLGIGTCISVNNIISLTPLPELSSGFEAVRDTFYSPPLLIQILGPGV LIPAAEEMIFRGLMFAPLRDRMPFWPAALISAVLFGLYHGNLLQGVYAFLLGLVLAWLYE RFQTLAAPWLFHAAANMTSIIAVNTGLEILTEKTDRFVFGTALVLAAVLSVFCLIRIERK TNIKEEEV >gi|229783916|gb|GG667819.1| GENE 11 7728 - 8660 1003 310 aa, chain + ## HITS:1 COG:yqeA KEGG:ns NR:ns ## COG: yqeA COG0549 # Protein_GI_number: 16130776 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 4 309 3 308 310 294 51.0 1e-79 MAKKRIVMALGHNALGTNLPEQKKAVTETAKVIADFIQAGWQVALTHSNAPQVGMIHTAM NEFGKQHDGYTSAPMSVCSAMSQGYIGYDLQNGIRAELVKRGIYKPVATILTQMMVDPYD DSFYTPMKPVGRFMTAGEAKEEEEKGNYVEEIPGKGFRRVIASPKPVSIVEIDVIKAVMD ADQIVIACGGGGIPVMEQGYNLKGASAVIEKDRAAGLLAKEIDADVLMILTSVDNVTIDY GTPQERPISHMSIEEAESYINEGQFEFASMLPKIEASVDFVTHGKGRCAIITSLSKAKAS LAGKAGTTIE >gi|229783916|gb|GG667819.1| GENE 12 8788 - 9618 538 276 aa, chain + ## HITS:1 COG:VC1177 KEGG:ns NR:ns ## COG: VC1177 COG0613 # Protein_GI_number: 15641190 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Vibrio cholerae # 1 257 1 250 290 97 29.0 3e-20 MKVDYHMHTLCSDGIYEVETVVDMAVNNGILAMAVTDHDTLAGIGRARQRCQGRVRYMTG TELTCAERPWGSLPVPQSIHLLGYGFDENDETLCALLSSRKERTETVYRDLAEAVTASGC PLRIPDVPKSCGNVLQLRDVMAHVRKTFPSVSEETIRLIDSFSKNLTEANITVEDGIAAI HHAGGKAVWAHPYHSYHLFQKQTLQADEVSVMLNELVRMGLDGLETEYPAFSLPEKSFLN GLAADYQLFTTAGSDFHGSASRNRMGMETAQNSFFL >gi|229783916|gb|GG667819.1| GENE 13 9597 - 10241 591 214 aa, chain - ## HITS:1 COG:no KEGG:CD2025 NR:ns ## KEGG: CD2025 # Name: not_defined # Def: putative ABC transporter permease # Organism: C.difficile # Pathway: not_defined # 1 213 1 211 212 70 28.0 7e-11 MKGLLIKDYYILRQSIKSMLFILVVWSLVFLGGRKTGMFLIPMFIMVAGINVLNLFSYDR QTKWETYVLTLPVPRWKMVLEKYFYAVCISTFFGLIAVAVVAAAAAVKGMAMGPEFLLEL 
LMNWLTGIVISFFYNSFSIPLTYWLGVEKARLIPSVLIAAAVFLFVAFASAYGSGFTVSE GTAAIAIAAGALGTVIMMVLSYFISVRIYKKKEF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:05:52 2011 Seq name: gi|229783915|gb|GG667820.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld213, whole genome shotgun sequence Length of sequence - 9735 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 7, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 5.7 1 1 Op 1 . + CDS 90 - 608 417 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 2 1 Op 2 . + CDS 598 - 1356 522 ## EUBELI_01517 hypothetical protein 3 1 Op 3 . + CDS 1331 - 2215 305 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 4 1 Op 4 . + CDS 2217 - 3428 835 ## EUBELI_01515 hypothetical protein 5 1 Op 5 . + CDS 3463 - 3726 88 ## gi|266624874|ref|ZP_06117809.1| conserved hypothetical protein 6 1 Op 6 . + CDS 3764 - 4315 389 ## gi|266624875|ref|ZP_06117810.1| hypothetical protein CLOSTHATH_06269 + Prom 4321 - 4380 5.8 7 2 Tu 1 . + CDS 4444 - 4578 86 ## gi|288871494|ref|ZP_06410196.1| hypothetical protein CLOSTHATH_06270 8 3 Tu 1 . - CDS 4568 - 4906 66 ## COG1733 Predicted transcriptional regulators - Prom 4990 - 5049 6.3 + Prom 4800 - 4859 2.8 9 4 Tu 1 . + CDS 5036 - 5833 383 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold + Prom 6021 - 6080 3.2 10 5 Tu 1 . + CDS 6253 - 7977 839 ## gi|288871495|ref|ZP_06117813.2| hypothetical protein CLOSTHATH_06273 11 6 Tu 1 . - CDS 8890 - 9033 74 ## + Prom 8820 - 8879 80.4 12 7 Tu 1 . 
+ CDS 8969 - 9734 380 ## Closa_3504 cell wall/surface repeat protein Predicted protein(s) >gi|229783915|gb|GG667820.1| GENE 1 90 - 608 417 172 aa, chain + ## HITS:1 COG:BH0263 KEGG:ns NR:ns ## COG: BH0263 COG1595 # Protein_GI_number: 15612826 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 4 169 5 185 187 77 29.0 9e-15 MLSDDELVEQIQDGDENAAEELIKRYYTSILRYCKWHCSNLERAEDLTQETFLKLFKNIY RYKGKKKFKAYLYTIANHLCIDESRKLQAYSLEDDETILNECEEILRVEDREEIKYLLNA LSLEQRETVILRFGEQLSFEEIAKVMGCNMRTAQSRVRNALKAMRKEQQDGR >gi|229783915|gb|GG667820.1| GENE 2 598 - 1356 522 252 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01517 NR:ns ## KEGG: EUBELI_01517 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 252 1 253 253 339 68.0 5e-92 MEDKVLKAHLQHALRQEVQPERLEETINSCIEVMRKQKFTAEARTGFWGYLLDVFRFEGI PILGLQAITLLLVCLAISNLSDIPKYIPLFIPLFVLAIMPVIFKSQYYGMSEMEAVTRAS GAQIMLAKLILSGAANLVCITILLFMETYQQNSCKGIGQMILYCLVPYLVCMVGLLRVIR MRKKEGLPVCVFIMLGSCVFWGLSANALPWLYEATATGLWLIMFVVFSAFFLQEIRYIAA IRKEGKMYGIIA >gi|229783915|gb|GG667820.1| GENE 3 1331 - 2215 305 294 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 1 290 7 311 312 122 27 1e-27 GRCMELSLDRLTKQYGRKIAVDCVSTVLKPGVYGLLGANGAGKTTLLRMLCAVLEPTSGE VLFNETEVVSMGADYRNIIGYLPQDFGYYPGYTAMEFLMYIAALKGIPKRMAKRRAAELL EVVGLSQEAGKKIKTFSGGMKQRVGIAQALLNNPKIVILDEPTAGLDPKERVRFRNLLSD YAEDKIVILSTHIVSDIEAIADEVLLIKKGRFVMQGTVPELTRRADGKVWELSVSQEEAR RWQEKATVANLRHEENRIVLRIVSDDRPGDNAVPCEAALEDIYLYYFQTEEGDR >gi|229783915|gb|GG667820.1| GENE 4 2217 - 3428 835 403 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01515 NR:ns ## KEGG: EUBELI_01515 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 402 1 402 403 610 73.0 1e-173 MNLFLLEHKKLWRKRLTKISVLLCFVYVVVFGGILSFQWFSFGSSGDHTSAFGNNFDGYA VVKGSQEYARFFGGELTDETLQELVRDYQRMEAAGMEKELEHTDWHIVNNWLGMLYPELK 
DPNSYKIMMSYVNPDKLADFYERRQRVISEFLEVNGQTGTEREYLLQMEQEVRKPLRYEW TEGWSVLLGSMLADIGMVMALFLAIVLSSLFSGEWSDRTSTLVLTTPNGWQRLALAKIFT GVAFTVELFALLAAGNLVTQIFYMGTSGWDMPIQTIKLIAIAPVNMLQAEIYEYAFALLG AIGYAGGVMFLSAAVRSNVLSLLLSLAAVHGPMMIAGYLPFGAQKALDLLPLVGSSADIF RTNTFCVFGKLIWSPYLLITVPVMIGVLCMPFAVRKWSKRMKV >gi|229783915|gb|GG667820.1| GENE 5 3463 - 3726 88 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624874|ref|ZP_06117809.1| ## NR: gi|266624874|ref|ZP_06117809.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 87 1 87 87 126 100.0 5e-28 MDKKSGLSSGIETGFVKGGVNGLCAIHAGQLIGELKKLICTISYKKSTSKCDSMTRRGQP RRFRRQRIREHAPLMASMLMKMGFLQK >gi|229783915|gb|GG667820.1| GENE 6 3764 - 4315 389 183 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624875|ref|ZP_06117810.1| ## NR: gi|266624875|ref|ZP_06117810.1| hypothetical protein CLOSTHATH_06269 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06269 [Clostridium hathewayi DSM 13479] # 1 183 1 183 183 363 100.0 3e-99 MKKGKSVFSLIWPIALIIIALYLFLLFKGGKAFDFGRVEYKETAATCESVGHEITRSQID RSMEGRPNDKDQYYIIYNYSYEVDGITYHASDRNHREFQDYQSYQRDIEESQEWIGTTKT VYYHPEDPSEYSFSSKNFKESVSETNWVVGVIVVLCSVILLPWIIFMGVIRSVLRKRADR FRK >gi|229783915|gb|GG667820.1| GENE 7 4444 - 4578 86 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871494|ref|ZP_06410196.1| ## NR: gi|288871494|ref|ZP_06410196.1| hypothetical protein CLOSTHATH_06270 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06270 [Clostridium hathewayi DSM 13479] # 1 44 1 44 44 85 100.0 2e-15 MAEVRQNGFPCSAAHKRENISIKMEDTKFDFYQWDKKRGRAIMV >gi|229783915|gb|GG667820.1| GENE 8 4568 - 4906 66 112 aa, chain - ## HITS:1 COG:CAC2934 KEGG:ns NR:ns ## COG: CAC2934 COG1733 # Protein_GI_number: 15896187 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 111 1 111 112 105 49.0 2e-23 MAKTMEVPERCPMDVGLNILSGKWTLRILWHLSKGPVRFNELQRLLGTITTKTLSQQLRQ 
LEEQQIILRTVYPETPPKVEYSLTDLGKTIQPVLKKLCEWGISYQKAISAIP >gi|229783915|gb|GG667820.1| GENE 9 5036 - 5833 383 265 aa, chain + ## HITS:1 COG:CAC3337 KEGG:ns NR:ns ## COG: CAC3337 COG2159 # Protein_GI_number: 15896580 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Clostridium acetobutylicum # 1 265 1 262 262 233 41.0 3e-61 MIIDSHEHMMLPTEMQLQMMDAAGIDKTILFCTAPHPEKASSLNALETEMDELYKVLSGS NSIEANIIRQQKNISELISIIRKHPDRFGGFGSVPLGMSFNETQDWITDYIISNSLLGIG EFTPGNEQKILQLDTVFQAIEATEICPVWVHTFSPVTLEGIKLLMDLCERYPQVPVIFGH LGGTNWIEVIKFAKAHKNAYLDLSAAFASIATRMALIELPQRCLFSSDAPYGDPYLYREL VEFVSPSKEIADMALGDNIKELLHL >gi|229783915|gb|GG667820.1| GENE 10 6253 - 7977 839 574 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871495|ref|ZP_06117813.2| ## NR: gi|288871495|ref|ZP_06117813.2| hypothetical protein CLOSTHATH_06273 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06273 [Clostridium hathewayi DSM 13479] # 15 572 1 558 558 836 100.0 0 MKRSNKRQYARLCSMALSAVMLCVSIPMTAFTAHAESDSNHVHDESCYRKKLHCLHEHDD SCYEGDETATSSNMKKLVCDHEHGEGCYEVVLDCGYRQEENSSEGAEGDGNAGADKNGGL ASVASPAAAARALTLTGVERLVADINTWAAGNGIVAEVDEYSGDVIVTGKTTTEASDKLE LEIPENVTVEWRAVLTSSAEPVIGIVEESGTDSVFSVTDGEIISTSNKNHTGSSYNTIDN AAIYNAAANTSILVSGDSIRGGGGDRSTGIYNFGEGLVSIQGGRISGGTGTKNYGIFNSF SGITEVSEGIVSGGGEESYGIQGSGEIRVSGGSISGGTGSVENVGIVFKNSTNSVISGGN ISGGNGANTCGILNDSPNSLLVSGGTIDGGKGTSISVGIVNEENSSITVKDGSINADGKK KVCIMNEGGTVVVSGSKIVNTGGYAMIDNSSSSGIQIGGSAVIFGKDLDSTGSAIKTAGT QPVPGGTAAIITWAGDKTTYLKGSAADLDVSPAAVKAEWSTQNGHGGVFYDVSGINNGFI TLPGVTVKAAPAIITETLPDGTIGVAYSQTLAAS >gi|229783915|gb|GG667820.1| GENE 11 8890 - 9033 74 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVLAGPVVTAMEDKPAAMVIDIAFVSLPMSLEALTVKWLIPAFVGIS >gi|229783915|gb|GG667820.1| GENE 12 8969 - 9734 380 255 aa, chain + ## HITS:1 COG:no KEGG:Closa_3504 NR:ns ## KEGG: Closa_3504 # Name: not_defined # Def: cell wall/surface repeat protein # Organism: C.saccharolyticum # Pathway: not_defined # 72 159 121 
207 745 80 47.0 8e-14 MSITIAAGLSSIAVTTGPAKTTYTAGQSFDSSGMVVTAAYSDGSTRAVTGYTVTPSGALT EADSRVTVTYTEGGIAKTATLAITVKAAGTNHTITYDANGGSVTPATGITDTDGRLGSLP TPVRGGSYQFDGWFTAASGGVKVTTTLIFTENCTIYAHWTYIGGGSGSGGGSGGGGGSGS GHGFAGGNSGPKESLPKNYTGGTKNIDGVIVPSYVEKVVWVMTEGGRWRLRRADGSDYVN TWVAAYNPYADLSAG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:07:11 2011 Seq name: gi|229783914|gb|GG667821.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld214, whole genome shotgun sequence Length of sequence - 10795 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1690 1512 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 1723 - 1782 7.2 + Prom 1846 - 1905 8.1 2 2 Tu 1 . + CDS 2116 - 3480 1578 ## COG1653 ABC-type sugar transport system, periplasmic component 3 3 Tu 1 . + CDS 3607 - 3693 166 ## + Prom 4595 - 4654 7.7 4 4 Op 1 . + CDS 4718 - 4903 232 ## Spico_1737 carbohydrate ABC transporter membrane protein 1, CUT1 family 5 4 Op 2 2/0.000 + CDS 4921 - 5757 883 ## COG0395 ABC-type sugar transport system, permease component 6 4 Op 3 . + CDS 5778 - 6848 967 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 7691 - 7750 80.4 7 5 Op 1 . + CDS 7827 - 8939 1207 ## Rleg2_5334 glycosyl hydrolase family 88 8 5 Op 2 . 
+ CDS 8956 - 10662 1899 ## COG4289 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|229783914|gb|GG667821.1| GENE 1 1 - 1690 1512 563 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 285 563 260 530 602 145 32.0 2e-34 MIKKFFPSSIRGKIMAVTAAITVLIAVITMAVCYSVFQSFLMKNQIQSTEFNLSVISGNM AEDMKEIVTFTKWCCSNSEVGHYLERLKDQPKLPTASKDTEGLRPLALSAYKRLNEEYIT SGSAGGKKYMARVIISTPNTGNFLQSMTSANYSSSFDAARVSSSSFFAPLLESSDFRWIG VVDDPISNANHEKIIPIVRPVYDEYSLEVIGWTYVEVSSRLFTDYLASYPQADDSQLFLT IGDKTYGSSRRTWDDPSMPYTDFKPVTGTATLDPGTQVFSVVSPEGEKRMLVRRPVEVEG SDGWYLSQVLSRQQFNQQRMVYLLLIVGISAAIISLGCLLTVLLNKIISQPIKRLSLKID AIADGDFSRDSEIEWNHELGSIGRGINNMSENIVTLMDKRVSDEKQKKDLEYQILQSQIN PHFLYNTLNSIKWMATIQNATGIADMTTALARLLKNVSKGTSSMVTLREELDLVKDYFLI QQYRYGGSITMDYHIESDELYDCEIHRFTLQPIIENALFHGIEPKGTAGAITVSVTSDTM NGKKVLKISVTDNGVGMTGETIE >gi|229783914|gb|GG667821.1| GENE 2 2116 - 3480 1578 454 aa, chain + ## HITS:1 COG:PH0753 KEGG:ns NR:ns ## COG: PH0753 COG1653 # Protein_GI_number: 14590623 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pyrococcus horikoshii # 1 454 4 464 464 105 26.0 3e-22 MKKRALSLALAGAMVVTTLAGCGSGSKETTAAPEGTTAETTAKAEAPADTTAAASGDKTT LKWSVWDISSTTYYQPLIDEFEKAHPDVTIEMVDLGSTDYQTVLATELTGNGSDFDVVTV KDVPGYMTLVNKGVLEPLDSYIADSGVDLAQYKGLTDQITVDGKLYELPFRNDFWVLFYN KDVFDKAGVAYPTNDMTFDEYDKLARSVTVDTPGQEVYGAHYHTWRSAVQLFGILDGKNT ILDGKYEFLKPYYDMVLAEQEDGVCQDYATLKTSGLHYSGAFAQGNVAMMNMGTWFISTL IEKIKTGEYTDCTNWGIAKYPHADGVEAGSTLATITSLAIPTNAPHKDLAWEFVNFVSGA EGAEVLAATGTIPAVMTDEVANLVSATEGFPKDDDTSVDALNTANLYLEMPVHPKSSEIE TALNEAHDAIMTGSISVDDGITQMDEQVSAILAE >gi|229783914|gb|GG667821.1| GENE 3 3607 - 3693 166 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKTLTETQKNEKIAALLTKREALELAS >gi|229783914|gb|GG667821.1| GENE 4 
4718 - 4903 232 61 aa, chain + ## HITS:1 COG:no KEGG:Spico_1737 NR:ns ## KEGG: Spico_1737 # Name: not_defined # Def: carbohydrate ABC transporter membrane protein 1, CUT1 family # Organism: S.coccoides # Pathway: not_defined # 12 58 258 304 306 68 68.0 1e-10 MLAGGSNGVVSTSAMVLVYHIYEEAFRNWNLGYASAVAMVLFLMVLVITLIQFRGEKKYA N >gi|229783914|gb|GG667821.1| GENE 5 4921 - 5757 883 278 aa, chain + ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 10 277 14 281 282 210 40.0 2e-54 MKVQSTNYKISRVILYAVLIALAFLMLVPFAWMLSASLKLDKDVFIFPIQWIPENPRWRN YIDIWTKIPLMKFVLNTIKITLIVTCLQLLTSSFAAYAFAKLNFKHKNALFLAYIATIAV PWQVYMVPQFMMMRRFGLNDTHLAIICLQAFSAFGVFMMRQFYQGIPDELCEAARIDGMN EYQIWAKIMLLLSKPALSTLTIFTFVNTWNDFLGPLIYLKTEAKKTLQLGLKMFLSQYSS EYGLIMAASVLSLIPVLIVFLALQQYFVEGIAATGVKG >gi|229783914|gb|GG667821.1| GENE 6 5778 - 6848 967 356 aa, chain + ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 36 354 6 305 570 90 25.0 4e-18 MEEIKNFVDNIHNEAYKNAYSAPLIGAESLIMQGGRTEESLDGLWNFSPDLYDNCLRAKW YLEEKTNEEGRTVPLDYAFDDWETVPVPGVFNLAKPEYFYYEGPAVYSRRFSYERREHER VFIRFGAVAGEARVFLNGCFLGVHRGGSTPFCVEATKNLREGGDNRLIVVADGTRHQADV PSLNTDWFPYGGIYRSVSLVRLPEFFIQDMKVGLSGRRGREITVEATAEGAAGSSFSDFS AFFEIPEIGVCEIMDFNSDGTGCLKLKVPELELWSPEHPRLYTVVLTLYDGENRAIDQIT DRVGFRTIETEGRSILLNGRPLFLRGVCLHEDSKDHGKAVTKAEIREAFCLAKETS >gi|229783914|gb|GG667821.1| GENE 7 7827 - 8939 1207 370 aa, chain + ## HITS:1 COG:no KEGG:Rleg2_5334 NR:ns ## KEGG: Rleg2_5334 # Name: not_defined # Def: glycosyl hydrolase family 88 # Organism: R.leguminosarum_trifolii # Pathway: not_defined # 1 370 21 390 390 471 59.0 1e-131 MDFAAGQVKENLKEFTHSFKKAYSEEGFYRPTPNVNWTTGFWTGQIWLAYEWSGDETLKQ 
AGEIQIDSFYQRIERKQDVDHHDMGFLYSPSCVAGYKLIGSEQGKKAAVMAADQLITRYH PTGEFIQAWGPMDQPENYRFIIDCLLNLPLLYWASEETGDEKYRDIAEKHIHTAIANVIR EDYSTWHTFFMNMETGEPDHGATCQGYRDGSAWARGQAWGVYGTAMAYRYTKREEYIDDF KHVTGYFLDHLPEDLVPYWDLEFGDGSGEPRDSSSASIAACGMLEMARYLKDEDAKYYTS LAEKIMKSVVENYAVKSPEESNGLVLHSTYSKKSPYNTCTPEGVDECNIWGDYFYMEALT RLSKDWNPYW >gi|229783914|gb|GG667821.1| GENE 8 8956 - 10662 1899 568 aa, chain + ## HITS:1 COG:SMb20536 KEGG:ns NR:ns ## COG: SMb20536 COG4289 # Protein_GI_number: 16264263 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 3 555 16 567 617 389 38.0 1e-108 MRLETKQDFRDWMFSVLNPLKPLYSEGGARLHLGDSGVTYPEVSIEMEAFSRPLWALVPF WKGGGDGFEELYQKGLKNGTDPDHPEYWGGFTDYDQRFVEMAAIASGMIFTPEKVWEPLG EEGQNHLADWLYGINGHIIPDCNWQFFMILVNVALKKNGCRYDAKNLQSGLDKIESYYLE DGWYRDGASGQKDYYISFAMHYYGLLYSVAMADEDPERCARFKERAEIFAGDFIYWFDED GAALPYGRSLSYRFGEAAFWSAYVFAGLDEIPVAVVKGILARHLNWWCGQKIFDRDGVLT IGYAYPNLIMAERYNAPGSPYWGMKTLLCLALPDEHPFWSVEAAPLPALAEVKMMPQANM IVQRRGRDVTAYPAGVCEKYGHGHVPEKYSKFAYSTKFGFSVARSQIVLHENAPDSALAF VIDGDDYVFVRKVSESYHIFADRVVSDWQAFPGIRVTSVVIPTEYGHMRIHEVESDYDCT AYDCGFAVEKFTEGYGESAEGKTASVSWEKQGCTVSGEGPEAAGVVIGADPNTNVLYANA SIPAVAYRIRRGEKIRIETRIETFVKAD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:07:26 2011 Seq name: gi|229783913|gb|GG667822.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld215, whole genome shotgun sequence Length of sequence - 9941 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 6, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 726 605 ## COG3209 Rhs family protein + Prom 769 - 828 4.2 2 2 Op 1 . + CDS 864 - 1601 433 ## CA_C1004 hypothetical protein 3 2 Op 2 . + CDS 1610 - 1960 165 ## CA_C1005 hypothetical protein + Term 1966 - 2015 -0.3 4 3 Tu 1 . + CDS 3093 - 4280 718 ## COG3547 Transposase and inactivated derivatives + Prom 5360 - 5419 25.3 5 4 Op 1 . 
+ CDS 5619 - 6104 -5 ## PMI1136 hypothetical protein 6 4 Op 2 . + CDS 6128 - 6340 87 ## + Term 6538 - 6567 1.2 - Term 6304 - 6364 17.5 7 5 Tu 1 . - CDS 6393 - 7229 673 ## gi|266624894|ref|ZP_06117829.1| hypothetical protein CLOSTHATH_06292 - Prom 7390 - 7449 6.9 + Prom 7358 - 7417 5.1 8 6 Op 1 . + CDS 7507 - 9276 2215 ## COG0018 Arginyl-tRNA synthetase 9 6 Op 2 . + CDS 9301 - 9940 564 ## Closa_2319 SNF2-related protein Predicted protein(s) >gi|229783913|gb|GG667822.1| GENE 1 1 - 726 605 241 aa, chain + ## HITS:1 COG:MA2043 KEGG:ns NR:ns ## COG: MA2043 COG3209 # Protein_GI_number: 20090890 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 72 225 100 239 440 80 33.0 2e-15 FQLNYNPDAECGYGKNVSGEVFMPENNKNEDGSLTAEGDLFSYICSATGRAYDLTEYVND TNRQYTEVLTAYTVNSGATESYSYNGRQRLSRNDIWTEARDLVCNETSYYLYDGRGSVTA NTWQNGMVTSVYQYDPYGQVTLGSTEHTDFYGYNAESYNPNTGLEFLRARYYNAEIGRFF QEDTYLGDINDPLTLNRYAYTKNSPLNYIDPSGHTQVGIPKGIISRTFDKLVDNIKGKLF I >gi|229783913|gb|GG667822.1| GENE 2 864 - 1601 433 245 aa, chain + ## HITS:1 COG:no KEGG:CA_C1004 NR:ns ## KEGG: CA_C1004 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 144 244 225 325 327 129 65.0 1e-28 MIMGIDMELAYQYGFGDTRTKAMLDQFQKLTGSEDLLVDSLLDRESYYYGKYIGDMIAMI ANSGMTAAGIAAFIELLPAAAPIMSASFSGDIGVIQMVLSGGMIFEIEVAGAAGAGLGAG LYNIAAIKQRAYDNLDKAQNSGKGDSKTELTSSIDKDPRLVKAAEEMGKNERVQQEADHL IDELLKGNENPGLGSKNLFKDVSYLRGRNGARVFYRKTANGYEILGKADKANEQTVINIL KKLYE >gi|229783913|gb|GG667822.1| GENE 3 1610 - 1960 165 116 aa, chain + ## HITS:1 COG:no KEGG:CA_C1005 NR:ns ## KEGG: CA_C1005 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 1 116 1 112 115 81 41.0 1e-14 MKIKNITYPTVLEKIPDLFNDNIDVFVETEDGMHFTMTICTPMFYLHYMEKENLNFIPAS PPDIIVKELTHNNIKEALESFCEDDGYWMKLYFLSGASEGIFSKDKLDEMIQNIEK >gi|229783913|gb|GG667822.1| GENE 4 3093 - 4280 718 395 aa, chain + ## HITS:1 COG:BH4041 KEGG:ns NR:ns ## 
COG: BH4041 COG3547 # Protein_GI_number: 15616603 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 1 390 19 409 427 110 24.0 3e-24 MGIDLHKETHTAVMLDCWNQKLGEITFENKPSEFSKLTRKVSRFVTEEKEPVYGLENAYG YGRALAVWLIEKGVAVKDVNTALSYAQRKSVPMYQKSDSYDAEAVALVLINMLDKLPDAI PDDKYWTLSQLVNRRDNICTHLHCLKNQLHEQLCIAYPSYKQFFSDISRATALYFFMEYP SPEHLQGKTAEELAEELRPVSHNNCSVKRAEKILSLVRVDGDTKRDHQESRDTITRSLVS DLEHYRVQLEEVNQAIEMLMPEFDCTLMTMPGIELITAANMLSEIGNISRFPNSAKLAKF AGIAPVNFSSAGKGKDVCPKQGNRRLQAIFYFLAIQMIQVSINGTPRNPVFREYFLRKLE DGKNKQQALICIARRLVNIVYGMLKNHMEYREPGQ >gi|229783913|gb|GG667822.1| GENE 5 5619 - 6104 -5 161 aa, chain + ## HITS:1 COG:no KEGG:PMI1136 NR:ns ## KEGG: PMI1136 # Name: not_defined # Def: hypothetical protein # Organism: P.mirabilis # Pathway: not_defined # 9 160 6 153 155 68 32.0 7e-11 MYEIRYCNSIVKQGIMKKFRKPKGLNLFEHACNCNSLEDIIAISYLFCPDFIQIGEYIFV SVFFEEEGEEAIKKVEKLEERFGKNKKLIEQWVNSWSIGDLFINCTDESYQNNSLIMQFC DILVYNWQQRLKELFPHKKIIVETGNEIMGELGLTITVYEQ >gi|229783913|gb|GG667822.1| GENE 6 6128 - 6340 87 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIVRPALPGDVPRGYALSSITSLTCTLFDPAILVNFNWKTRSRIGPTRRNICLMVHFTYK RNSMTKTAIK >gi|229783913|gb|GG667822.1| GENE 7 6393 - 7229 673 278 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624894|ref|ZP_06117829.1| ## NR: gi|266624894|ref|ZP_06117829.1| hypothetical protein CLOSTHATH_06292 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06292 [Clostridium hathewayi DSM 13479] # 1 278 1 278 278 435 100.0 1e-120 MRVNLNFSLHKNTLINTARVNASFAKYQSVQDKKSGGILGNRPKNDRFTLSPQGKLINMI ENLTKQKQALVDQRNSLVETTLGDGGKLEDIKDQLKSYKKQIDAIDKQISNAYTQQARQC IEPEDKKSSDKTGKNKTDDQLTTEHLTTLAAVSQDLRHAEKISAVQSHIESEARIKESEI DLGAAHVDTLVSKGLGGDTVGALIENEMDSIGRSDQEAGQLHDKASELALRQGEQLRKSR EELEQSNGAQTKMATAQEGTEGPMVDQGKDGNDARDGE >gi|229783913|gb|GG667822.1| GENE 8 7507 - 9276 2215 589 aa, chain + ## HITS:1 COG:CC3359 KEGG:ns NR:ns ## COG: CC3359 COG0018 # Protein_GI_number: 16127589 # 
Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Caulobacter vibrioides # 20 589 20 600 600 476 43.0 1e-134 MKKILDLITSEVKEAFAACGYDESFAKVTLSNRPDLCEYQCNGAMAAAKAYKKKPIDIAN EVVEKLSGGDIFSEVNAVMPGFINIKLEAGYLATYMNNMREEDQCGCEKAKTPLTIIVDY GGANVAKPLHVGHLRSAIIGESIKRMGRFLGHKVIGDVHLGDWGLQMGLIIEELRDRKPE LAYFDETYTGEYPKEAPFTIGELEEIYPAASAKSKTDEEFSARAHEATLKLQNGYAPYRA IWNHIIDVSVTDLKKNYGNLNVSFDLWKGESDAQAYIPGLIQELEEKKLAYESQGALVVD IAEPGDSKELPPCIIRKSDGAALYSTSDLGTILEREKEFCPDWYIYITDKRQELHFTQVF RVAKKAGFVPEERKMSHLGFGTMNGKDGKPFKTRDGGVMRLETLISDIDAAAYEKIMENR TVSEEEALKTSKIVGMAALKYGDLSNQASKDYIFDMDRFVSFEGNTGPYILYTIVRIKSI LAKASDLEVQGVCTGDILPAVSGAEKDLMLQLSKYNEVIETSFMELAPHKICQYVYELAN AFNRFYHDTKIIAEENEKQRASWLSLISLAKDVLTACIDLLGIEAPERM >gi|229783913|gb|GG667822.1| GENE 9 9301 - 9940 564 213 aa, chain + ## HITS:1 COG:no KEGG:Closa_2319 NR:ns ## KEGG: Closa_2319 # Name: not_defined # Def: SNF2-related protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 213 1 212 1047 312 68.0 7e-84 MKVSQMTTSAFWKGESGVKGLVEDQGKQYKANLYIKGSQVYDYSCSCVQGNSYKGMCPHC QELYREYREKAAENSGRPVTTSQQARTMIREYTNREVAQIMIEGEEKQVQLAVRIIADRS DVKLEFRLGRERFYVLKDLVAFTRAIESGSFVEYGKNLAFHHNLSVFTKESRPLVEFIME LVGSYCEHYEQFQKSSFSAMPALRSLNLSRSNR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:08:00 2011 Seq name: gi|229783912|gb|GG667823.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld216, whole genome shotgun sequence Length of sequence - 10162 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 654 122 ## COG3279 Response regulator of the LytR/AlgR family - Term 1044 - 1095 15.3 2 1 Op 2 . - CDS 1152 - 2474 1297 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase - Prom 2512 - 2571 6.2 - Term 2554 - 2606 15.1 3 2 Tu 1 . 
- CDS 2610 - 2867 295 ## COG5263 FOG: Glucan-binding domain (YG repeat) 4 3 Op 1 . - CDS 3832 - 4542 801 ## gi|266624901|ref|ZP_06117836.1| conserved hypothetical protein 5 3 Op 2 . - CDS 4577 - 9826 5285 ## Elen_1817 coagulation factor 5/8 type domain-containing protein - Prom 9854 - 9913 6.5 6 4 Tu 1 . - CDS 10041 - 10160 83 ## EUBREC_2457 putative recombinase Predicted protein(s) >gi|229783912|gb|GG667823.1| GENE 1 3 - 654 122 217 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 29 193 50 218 234 84 31.0 2e-16 MIESHGEEADIEDFSDGQTCLSQLRNRNFNILFLDIEMPGMDGFSMAGKMQKSYPNIVLI FVSSHESLVFQSYEYDTFWFMRKSALIADLKKAIDKYFARIIYNELFYVVNAQKVYYKDI LYIECIGHLITIKTTDKTYSMCGSLKKLEEELTPYQFIRVHKSFLVNMQQINVIHNDKII LTDNSSVLLSKARQQTAPEKPWSTSMMRRTICRRSSI >gi|229783912|gb|GG667823.1| GENE 2 1152 - 2474 1297 440 aa, chain - ## HITS:1 COG:CAC0326 KEGG:ns NR:ns ## COG: CAC0326 COG2256 # Protein_GI_number: 15893618 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Clostridium acetobutylicum # 3 432 5 433 443 573 61.0 1e-163 MEQLSLFDDRAAVGPLASRLRPTGLEEFVGQKHLLGEGKVLRRIIDQDMVCSMIFWGPPG VGKTTLARIIANRTKASFVDFSAVTSGIKEIKEVMAQAERDRHMGLRTLVFVDEIHRFNK AQQDAFLPYVEKGSIILIGATTENPSFEINAALLSRCKVFVLQALQTDDLVILLHHALTS PLGLGDQRIGITEDMLRMIAQFANGDARTALNTLEMAVLNGEISAEETTVTPEVLEQCTS RKSLLYDKNGEEHYNLISAFHKSMRNSDPDAALYWLARMLEAGEDPLYVARRLIRFASED IGMADSQALTLAVSAYQACHFNGMPECNVNLAHTVIYLSMAPKSNSAYAGYEHCREDALH TLAEPVPLVIRNAPTGLMKELHYGKGYQYAHQTEDRIAAMQCMPDSLKDRTYYVPTEEGN EKAAKKRLEEVKAWREAHGM >gi|229783912|gb|GG667823.1| GENE 3 2610 - 2867 295 85 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: 
Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1 77 541 617 621 99 57.0 1e-21 MKTGWVNTGDHWFYFSPSGAMKTGWVLEKGCWYYMSESGAMKTGWVQTDGKWYYLDESGK MLADTVTPDGVRVDKNGVRVNGITE >gi|229783912|gb|GG667823.1| GENE 4 3832 - 4542 801 236 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624901|ref|ZP_06117836.1| ## NR: gi|266624901|ref|ZP_06117836.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 234 6 239 239 354 100.0 3e-96 MNKKVVIGLLGILILLAAFGRIMTTPAEEYRFVTWEKGTSENEVKVSVMLSAREDVQKAG TFQVQFGVASDDPDGIEDIKFKFDSDLKNNSDVTVKTYRYNQSNGTLTIYVSGVADDILK KGTALNLGTIVVNADSDVTFSVEPGGCKTADENYTERTLTVFGSQDDYTVKRETSDDETP NETPEETTPDETPEETPGETAPAETPEETLPDETPGGTLPEETTDSSEPEETPEHS >gi|229783912|gb|GG667823.1| GENE 5 4577 - 9826 5285 1749 aa, chain - ## HITS:1 COG:no KEGG:Elen_1817 NR:ns ## KEGG: Elen_1817 # Name: not_defined # Def: coagulation factor 5/8 type domain-containing protein # Organism: E.lenta # Pathway: not_defined # 34 1745 64 1778 1787 1123 40.0 0 MKRAAACLLALCLILQGPASLMTSFADFGQRGVASPSNSVSASPSSAWKENESEGKNGEL KVEIRGVLPVQRAAGWELFLTKDEELSDQGTLQFEAVESAGVYSSGSYTFTDLPRGKYQL LVKSLNGGYEDYIQKNISINGDRASILLLNDYPEKYGYTGNKMPGVIRMGDVNGDGQIDG EDMEELIDAIDEAESNSTDSRCDFNGDGEVDLVDLQYFTIFYKNKSNTKATVTSQALVNP DAVTASSSNAVFSVDTLKEVFAGNSSEALKLETAEETAITPETPVEVSAEIEKGLTAAGF TIQPVIGSGNTIKDGFVTVECEGEDDPVVFKIENGVAVRQTGAKARAAFRSASPRSGAED SAAGRTIVIDLGRQVAIKKVTITVTAALEKEATLVEISKVEFLNNMENRIPEPEMNIPEN LNAVAGSESFDLTWKRAVNVTGYEAEVTGRVKSGIKTAVIPVSENRLSVESIQNEDLING EEYTVRVQSVNGAWKSGYSESVKVTPEASKRPDPPEGITVKGGYRSLDISWKKMKDTDSY SMFYREYDDADGSYIRIDNIENTSQTVPNLKDETKYDIYLTGTNKIGESAPSMHYSGTTE SVNAPVTPNYKLINVAGENGGTEHIKAVNNHGGTADSEFAVADNDAVSAWVRNDWDAGCV YPGESKSPTVTLDDSYTMDTVVIIPDQAQKFAYTDATFFYWPEGKNSQVQAAGTFSRKTS SNGKTYYEFQTSQPVTTDRVQVRLTTGYGPSSRISIAEMKFYYYDPIEHEVYDLFADDMH LSLKEGVTQESIDSLRARLEVKDEVSGELHPKKEILERELTTAEQILKDEALADILTIDN RVTKKADGSITFGGGLNAWQPLGVTAMAGDTVMVYVGSPEKKTGDSTNLRLIATQYHGES 
SAWSKTIGMLKAGINEITIPKITDLDVEQGGQLYVEYTGAQGKERYSVRVSGGHQIPTLN VTMASDSNAARALVTKYVEELEATAANLETEHESHKKAHDGDWSSAKKNCILGATDIVTK YMMFSVSSQQILAGLSGGTVEEKAEQLYQSLTAADEMVNLFYQHKGLSSDPDAGEKNKLP VSRLNLRYQRMFAGAFMYAGGLHIGIEWGSIPGLTRGVPVKADPKGRYESGQYFGWGIAH EIGHEINEGAYAIAEITNNYFSVLAQAHDTNDSVRFQYPEVYKKVTSGVTGRSSNVFTQL GLYWQLHLAYDMGGYNYKTYDKYRDQFNNLFFARVDSYVRNTEAAPKPGGVSLSLSGDVD NKLMRLACAAAEKNILEFFERWGMVPDETTKKYAQQFEKETRAIWFVNDEARAYVLKNGK DGSVAASTVVEADLSYTENSNEVTIHLASQSKKPEAMLGYEIYRSETIKSHVEKKPVGFV TADQTEFVDSVSTVNNRVFTYEVVGYDKYLNATKPVVLEPVKVSHGGVIDKSDWTVTTNM VSAEDKVDEEINPDTVTMEAIGKVIDNDAGTTYTGKTVAGSNGKAPAASVTIHLNREETI TGLTYKLSNADNAAGSPIGNFKVEISDTGAEGSWTQVKAGTFSVAQGQLVNGSETVYFNK NDDTWLYAYDTSYVRITAVDQKGTDISISEIDLLGQTGDDIDFGQTDSIGILKEDYHAGH GESGEAVIPKGSLIFTGTYKGNPAYNVVLLYDEQGKIVGGTDEAGNISAAQMIFAEVPEH GELGETSSGTWIYYIEPGNFKKDALPKRVRAELYRVDNAHDNRGERLVSNTLFVDMPEKL PVIEIKAQD >gi|229783912|gb|GG667823.1| GENE 6 10041 - 10160 83 39 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2457 NR:ns ## KEGG: EUBREC_2457 # Name: not_defined # Def: putative recombinase # Organism: E.rectale # Pathway: not_defined # 1 38 426 463 464 75 89.0 7e-13 GYWFTTQTVAVNMAMTKERLINSGYYDLATAYQSVHVNR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:08:37 2011 Seq name: gi|229783911|gb|GG667824.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld217, whole genome shotgun sequence Length of sequence - 10542 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 557 495 ## COG0657 Esterase/lipase 2 1 Op 2 . - CDS 584 - 1783 1516 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase - Prom 1821 - 1880 7.5 + Prom 1799 - 1858 6.3 3 2 Tu 1 . + CDS 1919 - 2683 733 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor + Prom 2748 - 2807 4.3 4 3 Tu 1 . 
+ CDS 2882 - 3658 697 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit + Term 3702 - 3761 2.5 - Term 3691 - 3746 5.4 5 4 Tu 1 . - CDS 3749 - 5791 1723 ## COG4219 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component 6 5 Op 1 . - CDS 6724 - 6933 99 ## Clole_4055 peptidase M56 BlaR1 7 5 Op 2 . - CDS 6933 - 7328 551 ## COG3682 Predicted transcriptional regulator 8 5 Op 3 . - CDS 7401 - 8378 902 ## Cphy_1046 hypothetical protein 9 5 Op 4 . - CDS 8375 - 8578 152 ## gi|266624912|ref|ZP_06117847.1| RNA polymerase sigma factor, sigma-F - Prom 8598 - 8657 80.4 10 6 Tu 1 . - CDS 9506 - 9706 343 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 9805 - 9864 4.9 - Term 9832 - 9867 4.1 11 7 Tu 1 . - CDS 9960 - 10541 684 ## Closa_2348 hypothetical protein Predicted protein(s) >gi|229783911|gb|GG667824.1| GENE 1 2 - 557 495 185 aa, chain - ## HITS:1 COG:CAC2917 KEGG:ns NR:ns ## COG: CAC2917 COG0657 # Protein_GI_number: 15896170 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Clostridium acetobutylicum # 33 174 34 175 272 155 47.0 6e-38 MNTEKKEIIVPGQLTGPALLTGYLIKEVSVAPDRKRPAVIVCAGGGYWQRSDRESEPLAL QFMAMGCHAFILDYSVAPNHFPTSVRELGEAVAYIREHAAEWKVDPEKIIACGSSAAGHL VCSLGVFWNRELVYDAIGRTPEQIRPDGMILCYPVITGGEYAHENSFKRLLGDDVTQEQR DAVSL >gi|229783911|gb|GG667824.1| GENE 2 584 - 1783 1516 399 aa, chain - ## HITS:1 COG:BH2510 KEGG:ns NR:ns ## COG: BH2510 COG0452 # Protein_GI_number: 15615073 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus halodurans # 1 389 1 390 404 364 51.0 1e-100 MLKGKHVVLGITGSIAAYKTAGLASMLVKKGCHVQVLMTENATNFINPITFETLTGNKCL VDTFDRNFEFSVEHVALAKQADVVMIAPASANIIAKLAHGLADDMLTTTVLACRCKKIIA PAMNTNMYENPVVQDNIKICEKYGMEVIKPAVGYLACGDTGAGKMPEPAELFDYIEKEIG AQKDLEGRKILVTAGPTREAIDPVRYITNHSTGKMGYAVAKAAALRGADVTLITGKTDTP 
KPRFVKLIEIESARDMFEAVTKAAAEQDIIIKAAAVADYRPKSAGTEKTKKTDGDMAIEL ERTDDILKWLGAHRKEGQFLCGFSMETQNMLENSRVKLDKKNIDMIVANNLKVEGAGFGT DTNVVTIITRERDLELEKMTKEEVADRLLDEIQAITAEK >gi|229783911|gb|GG667824.1| GENE 3 1919 - 2683 733 254 aa, chain + ## HITS:1 COG:BH0086 KEGG:ns NR:ns ## COG: BH0086 COG1521 # Protein_GI_number: 15612649 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Bacillus halodurans # 1 253 1 253 254 200 43.0 3e-51 MILAIDMGNSNIVIGGIDNSRTWFVERVTTNQGKTDLEYAVNIKDIMAIHNLPISSIEGA VVSSVVPPLTDVVLSAVKKITGKTPMLVGSGMKTGLNIIMDNPKTVGSDLIVDAVAALKD YPAPIIVIDMGTATTMSVIDKSGNYTGGIIFPGLRVSLDSLSSRAAQLPYIGLNKPARVI GKNTIDCMKNGILYGNAAMIDGIIDRMEEELGQAASLVATGGLAPSVLPLCRHKINYDEA LLLKGLLILYEKNK >gi|229783911|gb|GG667824.1| GENE 4 2882 - 3658 697 258 aa, chain + ## HITS:1 COG:CAC1696 KEGG:ns NR:ns ## COG: CAC1696 COG1191 # Protein_GI_number: 15894973 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Clostridium acetobutylicum # 1 257 1 257 257 333 64.0 2e-91 MPVYKVEICGVNTSKLPLLNNEEKEALFKRILDGDMEAREEYIKGNLRLVLSVIQRFSGS NENVDDLFQIGCIGLIKAIDNFDITQNVRFSTYAVPMILGEVRRYLRDNNSIRVSRSLRD TAYKAIYAKENLMKKNLKEPTIMEIADEIGVSKEDITYALDAIQNPVSLYEPVYSDGGDP LFVMDQISDKKNLEENWVEDISLNEAMKRLPERERHIIDMRFFEGKTQTEVAEEIHISQA QVSRLEKNALKSMKGYLS >gi|229783911|gb|GG667824.1| GENE 5 3749 - 5791 1723 680 aa, chain - ## HITS:1 COG:CAC3437 KEGG:ns NR:ns ## COG: CAC3437 COG4219 # Protein_GI_number: 15896678 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component # Organism: Clostridium acetobutylicum # 71 226 171 334 541 68 26.0 4e-11 MAASNHVTSVNPIQILLFVMLVVWEAGMLAFFLCHFAGYIRLKWRLGTAVKIQEADPGQG AGTCRRPEHAIYESDRIPGAFVMGIFRPVIYLPANLTEIERECILRHERIHIRRRDYIVK LLGLLMVMVHWFNPIAHVSFRLFCRDMEMSCDEAVVGQLGEGGKKQYSLALLSAAEKGNG ARLPLAFGESHTKSRIKNVLNYKKAGLWITLCGVTVTAAAVLCLLTVREKPAASAVSIIG 
GGDGPTSIFIAGKDDGETPEMPVELPDTSWLAGQVLGKGIESTITIDYASKDQIIFHGSC GLFSFQLEDGKWMPNLFIGPDDAGTVDLEAVTSALDAGKKEPESEDSVHAEDSFIGRNPT AGSSFIDYDVTKLADGSIAVLGGMGTSPGSGELTGKLVDVFYGWYHPDDMVMHQVYLFAE DGTMAENNSGEIEERRYLFTTDGADYFLRTPQTALEFELQDGKEPESLHFPYDRLELVRD EGGYTRVLDPMVIMGQPENQKMVLAEGRIYYYGAGKADLMSFKSQSLIGIRTDGSDRRVA GIDYSVCRGLSYDNGYLYYEGWKNAMEFPRPIYRMKPDFSEITKLGDFNGNLISVVDGTF YMLSAEKPAIVLTREGRFDEVMYYDKCGYDAKNYECVAASAADGKLTMKFENLSGNREYV NYEIPLAPKDTWEKTAVKKK >gi|229783911|gb|GG667824.1| GENE 6 6724 - 6933 99 69 aa, chain - ## HITS:1 COG:no KEGG:Clole_4055 NR:ns ## KEGG: Clole_4055 # Name: not_defined # Def: peptidase M56 BlaR1 # Organism: C.lentocellum # Pathway: not_defined # 5 64 4 63 887 70 50.0 2e-11 MAGNQVMITLLNMSMTAGLTALFVMALRLLYKRLPKRFSCVLWMVVIFRFLCPFSLQSAY SLLPFYSNS >gi|229783911|gb|GG667824.1| GENE 7 6933 - 7328 551 131 aa, chain - ## HITS:1 COG:CC1640 KEGG:ns NR:ns ## COG: CC1640 COG3682 # Protein_GI_number: 16125886 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Caulobacter vibrioides # 12 128 19 139 144 65 30.0 2e-11 MEEKKKTEEKKQTEEKKLYESEYRFMNVIWDNEPVNSTELVRICGKELGWKKSTCYTVLK KLAGRGFVKNEQAVVCSLIPREEVLKYESETVVDRNFDGSLPAFVTAFLKDRKLTEKEAE ELRQMIEKAVK >gi|229783911|gb|GG667824.1| GENE 8 7401 - 8378 902 325 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1046 NR:ns ## KEGG: Cphy_1046 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 305 1 296 298 180 38.0 7e-44 MSRLKDSKNVYDNIPIPDRLEETVASAIEQAAARRKEDHGVPDQPVTEKRNRKKWRRRGK MVSGCGLAAAAALTVVAAVGVNTSPTFAEEMQGIPVIGAIVRLFTAESWQSDTGDAGISV DVPGIEMIRGDTKNLADEVNEEIKAKCDAYAAAAVERAKEYKKAFLDTGGTEAEWKEHDI RIRVWYELKSQDDSHLSFAVSGSENWVSAYSETYYYNISLTDGAYLTLEDLLGPDYIEKA NESIRAAIKEREADSGEEFFSKDEGGFVTVTADTPFYINEKGNPVVVFEKYEIAPGFMGR PEFEIEREVDKSGLSEDTLDIQGLS >gi|229783911|gb|GG667824.1| GENE 9 8375 - 8578 152 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624912|ref|ZP_06117847.1| ## NR: gi|266624912|ref|ZP_06117847.1| RNA polymerase sigma factor, sigma-F 
[Clostridium hathewayi DSM 13479] RNA polymerase sigma factor, sigma-F [Clostridium hathewayi DSM 13479] # 1 67 2 68 68 102 98.0 1e-20 MEPGYEETEGFFEELDRLPADTQNVIRLRFYEEMSLQEIAEVMELSVNTVKSKLYRGLKS LKEELVI >gi|229783911|gb|GG667824.1| GENE 10 9506 - 9706 343 66 aa, chain - ## HITS:1 COG:BS_sigV KEGG:ns NR:ns ## COG: BS_sigV COG1595 # Protein_GI_number: 16079766 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 8 66 10 68 166 58 45.0 2e-09 MEHDHYDIVVNYIIANQKKFYRLAYTYVRNENDALDIVQNAIYSALEHYGSIREISYIKT WFYRVL >gi|229783911|gb|GG667824.1| GENE 11 9960 - 10541 684 193 aa, chain - ## HITS:1 COG:no KEGG:Closa_2348 NR:ns ## KEGG: Closa_2348 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 190 99 288 289 263 74.0 3e-69 KNNYEAKIDIPQVSIGDQSSAEVNKSIEEYANQLIGEYEKEVTGDLAGDGHYSVTSTYQV VTDNEKYLSLRINTTVIMASGAEYVKIFTIDKATGQVVTLKDLFRNKADYVKALSDNIKE QMREQMAADDSNKYFFESGEDAADDFDQITGDESFYFNENGELVIVFDEYTVAPGYMGVV EFTIPKSVTGDSF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:08:55 2011 Seq name: gi|229783910|gb|GG667825.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld218, whole genome shotgun sequence Length of sequence - 7679 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 72 - 1169 964 ## COG3119 Arylsulfatase A and related enzymes 2 1 Op 2 . + CDS 1222 - 2247 995 ## COG1609 Transcriptional regulators + Prom 2274 - 2333 5.0 3 2 Op 1 . + CDS 2383 - 3729 1399 ## COG0534 Na+-driven multidrug efflux pump + Prom 3758 - 3817 7.8 4 2 Op 2 . + CDS 3858 - 4184 485 ## gi|266624918|ref|ZP_06117853.1| conserved hypothetical protein + Term 4227 - 4292 19.1 - Term 4219 - 4277 7.5 5 3 Op 1 . 
- CDS 4394 - 4969 246 ## gi|266624920|ref|ZP_06117855.1| hypothetical protein CLOSTHATH_06320 6 3 Op 2 . - CDS 5003 - 6133 888 ## Cphy_3002 cell wall hydrolase SleB - Prom 6192 - 6251 10.4 + Prom 6106 - 6165 8.1 7 4 Tu 1 . + CDS 6372 - 6509 226 ## gi|288871506|ref|ZP_06117857.2| sensory box protein 8 5 Op 1 . + CDS 6637 - 7359 696 ## Ccel_2016 AraC family transcriptional regulator + Term 7363 - 7402 -0.6 9 5 Op 2 . + CDS 7436 - 7679 217 ## Ccel_2014 hypothetical protein Predicted protein(s) >gi|229783910|gb|GG667825.1| GENE 1 72 - 1169 964 365 aa, chain + ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 60 359 167 454 467 113 27.0 6e-25 MKKRNLPAAAHRIERVFFWADKSQQAYQKGSPDEVPRNGELFKLQDPRHCCEAAYGITVT PKETLEFFFLSELACEKLEELSADGDEAPFSLRVDFWGPHQPYFPTQEYYDLYKDVSYPP YASFSSGLDGKPEVYFTEKNVPIGKDNRIVIPNALDWKAYEEMLKCCAAQITMIDEAVGR ILDKVKELGMEENTIIIWTADHGDGLACHGGHFDKGSYLSQEVLRVPLGIKWKGTIEAGM KIDAPVCTVDVPVTIMDAAGLTYKNSVHGESLIPICQGKRKPEYAVSETYGLGYGEYVKA RAVTDENYKLIATLGQTYELYDLKADPYELKNRYDDPDYRAVQAEMTKKLKEWQRETGDN IPFFH >gi|229783910|gb|GG667825.1| GENE 2 1222 - 2247 995 341 aa, chain + ## HITS:1 COG:BH2227 KEGG:ns NR:ns ## COG: BH2227 COG1609 # Protein_GI_number: 15614790 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 2 332 3 333 347 145 32.0 1e-34 MTLKEIAQETGLSVSTVSRIINKENYKCTSREAEARVWEIVRNSGYTPNLMAKELKKNGS KRPMSRNISCLFARTENSVAEQFFSPIAQAVEQEALVQGCPLGYSFSMLHISEIQNERIS QLEKDGIVVLGKPGSDLDRLLRDKFKNVVVVGIDVNEVPFDRVRCDGYEAAKTAIRYLYD LGHRRIGYAGLKDISAYDAYFDFFTEHNMKIDESVIHGYKSAFGIDYKAACAFLQAEKDL LPTAYFCSCDMIAVGIIRACRELKIKVPEELSVVGIDNIEMGQIVSPMLTTVNVPVREMG RMAVKILLDRINKGHTSAVNIEFPTELVKRESCVALPVKKA >gi|229783910|gb|GG667825.1| GENE 3 2383 - 3729 1399 448 aa, chain + ## HITS:1 COG:FN1789 KEGG:ns NR:ns ## COG: FN1789 COG0534 # Protein_GI_number: 19705094 # 
Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 4 448 12 457 459 215 30.0 1e-55 MEKDEGLYRRFLSLAVVLVLQNVVTLSVNLADNMMLGAYSETALAGVAAVNQIQFVYQQL LHALGDGVVIVGSQYWGKNQTVPMKRLASAAMRFGLVLSIALFAVVSLIPYRVMGIFTRD SGIIAEGARYLNIIRFTYLFFAVTQILLATLRSVETVKIAFKLSVLTLFVNCGINYVLIN GHFGAPEMGVKGAAIGTLAARILECIVLLLYISKRERKLNLKWRDYLAWDRILTGDYLKV TIPMLTVQGLWGVNTALQTVILGHMTANAIAANSVASTLFLMVKSMAVGSAAATSVMIGK AVGSGDCDRAVSYARSLQRIFVLIGIVSGTLLFFIRIPVLSLYDLSAETREMANTFLIIL SVVCVGMSYQMPTNNGIIRGGGNPMFVVKMDLISIWLIVIPVSLFMAFVVKASPVVVVCC LNADQIFKVVPAFLEVRYGNWMRKLTRD >gi|229783910|gb|GG667825.1| GENE 4 3858 - 4184 485 108 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624918|ref|ZP_06117853.1| ## NR: gi|266624918|ref|ZP_06117853.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 108 1 108 108 178 100.0 1e-43 MEKLLDVIERKDALVIALSDGHIRAVDLSGYHRGNEAEISAEELYEEGKRVYTILPEDME QLQNCEYIFREDAMWVRYGSKLHEELQEHRLQELLDREIYCEDENGMQ >gi|229783910|gb|GG667825.1| GENE 5 4394 - 4969 246 191 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266624920|ref|ZP_06117855.1| ## NR: gi|266624920|ref|ZP_06117855.1| hypothetical protein CLOSTHATH_06320 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06320 [Clostridium hathewayi DSM 13479] # 1 191 1 191 191 370 100.0 1e-101 MNKKIIVIEGYLASGKSTFAVQLSKTLTIPYLIKDTFKIALCSSLSIADRYESSRFSAVT FDAMMYVTERLMETGYPFIIEGNFAPAGIKKTDESSVIKTLIHKYDYTPLTFQFMGDTQV LHQRFIERENTLERGQVNKMGFIPSYSEFDKWCHNLDMFHIGGKIIKVDTTNFSTVDFTG YFNTASLFINS >gi|229783910|gb|GG667825.1| GENE 6 5003 - 6133 888 376 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3002 NR:ns ## KEGG: Cphy_3002 # Name: not_defined # Def: cell wall hydrolase SleB # Organism: C.phytofermentans # Pathway: not_defined # 79 376 185 488 488 183 35.0 1e-44 MKHPWKNTRENRAVLTAAAIAAALSLTAFSSPGHLTLSLPDTAKTDIVTVGGSVSVKSAF SVPQADIERKEVPVTYAPAASEYDNKAVANVTDVLNLRAEPSLEGKVLGKCYRGAGGTVL 
EKKDGWTKIRSGGLEGWLKNDYLVFGQDIKPLAKELGLFTARVTTQTLHVRETPSTDAAI IGLAAADDYYPVLEESDGWIRVQLSSDTSGYISSQYAKVSLTPGKAVSMEAEQAALKTAE SKEKKKEEEKPKYVINASSDEIYLMAACVMMESGSRSYDGQLAVASVIVNRVKSGRWGNS ITDVIYADGQFPGATSGLLDKYLAKGPSSDALKAAKAALSGSNNIGDYLFFQSAKRADYN SYASYTVVDGNCFYKK >gi|229783910|gb|GG667825.1| GENE 7 6372 - 6509 226 45 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871506|ref|ZP_06117857.2| ## NR: gi|288871506|ref|ZP_06117857.2| sensory box protein [Clostridium hathewayi DSM 13479] sensory box protein [Clostridium hathewayi DSM 13479] # 1 45 7 51 51 88 100.0 1e-16 MELLTNAGCDVIQGYLFAKPMPKEAFEAMPGSTDHLSMNHREQGG >gi|229783910|gb|GG667825.1| GENE 8 6637 - 7359 696 240 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2016 NR:ns ## KEGG: Ccel_2016 # Name: not_defined # Def: AraC family transcriptional regulator # Organism: C.cellulolyticum # Pathway: not_defined # 19 240 57 279 279 172 36.0 1e-41 MEDEVPQEMDGKEPGSPVLVIPDTCADIIVRINHTRQEISGFLCGIQDQPFHSVPHVSGD EVSCFAIRFYFWSAGLFLNLNYKETSNTTIELEELGRDWSLLLEPFFYLTGMEERIAHVE KFLLKKLASIEWNPDLFNAVHRILASAGRISVKDICEYSCVSQRQMERIFLKEAGLPIKR IAGMVRYQNVWREMAFREAFDIQDAVYRYGYTDQAHLLKEFRRYHGMAPEEARRIARLNR >gi|229783910|gb|GG667825.1| GENE 9 7436 - 7679 217 81 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2014 NR:ns ## KEGG: Ccel_2014 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 80 1 80 314 109 58.0 3e-23 MKVLNREDYQEIRSWIHRNARQLELSVWNYFFENGSREAVIDALSYYQNEDGGFGNAVEP DVWNPESSPYATMVVTGILRR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:09:31 2011 Seq name: gi|229783909|gb|GG667826.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld219, whole genome shotgun sequence Length of sequence - 7311 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 38/0.000 - CDS 112 - 912 698 ## COG0395 ABC-type sugar transport system, permease component 2 1 Op 2 35/0.000 - 
CDS 905 - 1786 751 ## COG1175 ABC-type sugar transport systems, permease components 3 1 Op 3 1/0.000 - CDS 1806 - 3143 1530 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 3212 - 3271 5.1 4 1 Op 4 7/0.000 - CDS 3276 - 5045 1403 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 5 1 Op 5 . - CDS 5042 - 6544 1408 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 6667 - 6726 7.8 + Prom 6665 - 6724 9.5 6 2 Tu 1 . + CDS 6762 - 7311 521 ## COG1609 Transcriptional regulators Predicted protein(s) >gi|229783909|gb|GG667826.1| GENE 1 112 - 912 698 266 aa, chain - ## HITS:1 COG:BH2724 KEGG:ns NR:ns ## COG: BH2724 COG0395 # Protein_GI_number: 15615287 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 2 266 1 271 272 246 53.0 2e-65 MIKKVGKYVFLCIASFLSIFPFLWMIVGATNASVDITSGRMIPGNQLLTNIHTLFTTTNI VTGIKNSMIITVIATILTVLVAAMAGYGFEIYKSRIKDKVMALLLMSMMVPMATLLIPRY RMFAQWGLLNTFLSVILPSMATAFLIFFFRQNTKSFSKEILQAARVDGLGEMKSFFLIYC PVMRSTFAAATIITFMNIWNSYLWPLVALQTDDKMTLPLIISAMNSAYTPDFGMIMVAIV TATFPTAAIFFIMQKSFVEGMLGSVK >gi|229783909|gb|GG667826.1| GENE 2 905 - 1786 751 293 aa, chain - ## HITS:1 COG:BH2725 KEGG:ns NR:ns ## COG: BH2725 COG1175 # Protein_GI_number: 15615288 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 12 291 11 290 290 354 65.0 1e-97 MNSKKISSVIKGWLFIGFGTLIIVVFCFYPMIGALANSFQTGMGMNLHFCGLANYKRLLV DPIFLSSVKNTFLYLIVQVPVMLLLAMLYAVLLNDKKLKFRSFFRIAIFLPCVTSLVAYA VLFKSLFSDSGFINQMLMTLHVIQSPIHWLTDAFWGKVTIIIAITWRWTGYNMIFFLSAL QNIDPEIYEAAEIDGANVVQKFFSITAPLLKPIILFTSITSISGTLQLFDEVVNLTQGGP GNATITMSQYIYNLCFKYTPDFGYASAVSYAILLMIVILTAIQFKVTGGNKHD >gi|229783909|gb|GG667826.1| GENE 3 1806 - 3143 1530 445 aa, chain - ## HITS:1 COG:BH2726 KEGG:ns NR:ns ## COG: BH2726 COG1653 # 
Protein_GI_number: 15615289 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 46 445 31 425 425 357 45.0 2e-98 MKKMYLFAAAVALAAGLTACGSKAPADNTGTQTGETAKQEAAGTSTGDKKTVTFWAWDKE FNIAALEMAKEIYEADHPDVEINIVEVGQNDVVQKLNTGLGSGSLKGLPNAVPIEDYRIQ SFLKSYPGAFMDLTSEINYDNFAAYKKGPMTLDGKSYGIPWDNGAAVLYYRKDMIEQAGY TEEDMQDLTWSQFIEMGKKIKEATGKKMISMDPSDIELMKLTMQSAGTWFTKEDGTTPNM ENNAALKECLQIIKTITDEDLVRTYSGWAGLLEGVNKGEVAFQMKGCWFTPSIMKGEGQD GLWRVAKIPRLDSTPGATNASNLGGGSWYILNGVENSDIALDFLKSTFGENKELYNRLLD EKGIVGTYLPSQESDVYNKEIPFFGNQKIYQEIASITKEIPNVNYGTYTYAFQDIVMAEI QKILQGEDIDAAMKSMQVQAEAQAR >gi|229783909|gb|GG667826.1| GENE 4 3276 - 5045 1403 589 aa, chain - ## HITS:1 COG:BH2727 KEGG:ns NR:ns ## COG: BH2727 COG2972 # Protein_GI_number: 15615290 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 10 588 18 592 597 301 30.0 3e-81 MKSQIRWGTFFKLSLVVVSSIVLTNVCTMASSDWITKNLFIQSSTFLNAKLMDNIKKNVE ENNNSYLTLLRLLETDPALKNYFLAADACGTEKFIKTYDILKVYYDSVPSGTAENLVAVS NQGSAFSFTGERISVPVEAIEEMPCSAKISETGRRLQYTYQSHGITADFQDRKCVIAARR LFLPSSSESFATAYITIPEEEFYQMFSEAVPEGSRMVILSSDGQIVSGNGKDSLGREEPK LLSLVQSMRENGETYRTVKGNHKPEIVISGYIPELDMYMVNTVEEAALLRSYAGVRRNIL FAGFLITGTAIAVVFLIMRRIVSPLNFFIQQLDTFKNRRFEKVSPVKGNSEVRQMADVYN RMVDEIDSYVRQLIEEQERGRRLEIEALQMQINPHFMYNTLASIKYLTWENNTEKASDTI NALIAILKKTIGNMEEEITLREEIDLVRQYVFINQVRYGDSIEVEYFLAEEEMECMVPKL ILQPFIENAFFHAFQNKKEGKIRIFTKKQRDDLICEIIDEGDGMSQEETADLLNSDRRRH LTGIGIPNVHERLRLIYGEEYGVTIHSVIGTGTSVSLRLPYHLKKVPKL >gi|229783909|gb|GG667826.1| GENE 5 5042 - 6544 1408 500 aa, chain - ## HITS:1 COG:BH2728 KEGG:ns NR:ns ## COG: BH2728 COG4753 # Protein_GI_number: 15615291 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 8 141 2 136 510 115 40.0 2e-25 
MGYEGDKCKILIVEDELLVRKALRYIIEQEGDEFEIIGEVTNGEEALDFLEQSLPHMMIC DIMMPRMNGLELLHTVNLKYPEVSTIILSGYQDFEYVKSAFKYGVFDYILKPELESDYLL KALRHIGRKKGVIHAKDPGKAGKEEESWETSDGYASYYLIAIQPGKILQGATNKQELTIG HMEELMKEAFGRFYLDSFMLKKSQVYIYTIGCFYPEMFEEKLHKMMADGIDYLRNIEMII SPPYGKGDSLEAEAQKLSELFSCRFFDSRKDYLDLRETGLPQLPALDQKTVCDLVRQQKA GTAGAVLVNYLNSLRGMAVGEVSVKKLMESTLYNTIFELNENEFCPPQLEERKLSFLTSI GTTDSLDGLIDAITAIYQEIEQITEEARGNGLYLRITEYLYLHYSEPVTLKTLAADFNMN YSYLSSYFSSHMGQGINEFLNSIRVEKAKELLSSSELSLSEIAQEIGYTDQSYFGKVFKK QTGYSPHTYRNRAKRRGETL >gi|229783909|gb|GG667826.1| GENE 6 6762 - 7311 521 183 aa, chain + ## HITS:1 COG:SMb20674 KEGG:ns NR:ns ## COG: SMb20674 COG1609 # Protein_GI_number: 16265129 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 3 183 1 176 339 110 34.0 1e-24 MNIYDVSERANVSIATVSRVINGNPNVSEKTRNRVLAVMEELGYTPNVFARSLGLGTMRT IGIMCADSSDPWLAGAIYYLEQELRRNGYDSLLCCSGYLPETKKKYLELLLSKRVDAVIL AGSHYVEAKQKDNAYLLEASKELPVMLVNGCLEGKQIYSTVCDDRTAVYESVTRLIQSGH TSV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:09:34 2011 Seq name: gi|229783908|gb|GG667827.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld220, whole genome shotgun sequence Length of sequence - 9072 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 971 - 1030 3.5 1 1 Op 1 . + CDS 1249 - 1845 476 ## Trebr_0443 thiamine pyrophosphate TPP-binding domain-containing protein 2 1 Op 2 . + CDS 1874 - 2962 148 ## Trebr_0444 NAD-dependent epimerase/dehydratase 3 2 Op 1 . + CDS 3554 - 3961 209 ## COG1216 Predicted glycosyltransferases 4 2 Op 2 . + CDS 4002 - 4478 250 ## Trebr_0281 Formate C-acetyltransferase (EC:2.3.1.54) + Prom 5317 - 5376 80.4 5 3 Op 1 1/0.000 + CDS 5552 - 5725 216 ## COG1882 Pyruvate-formate lyase 6 3 Op 2 . 
+ CDS 5786 - 7240 453 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 7 3 Op 3 . + CDS 7230 - 7409 170 ## gi|266624937|ref|ZP_06117872.1| EpsIJ, UDP-galactopyranose mutase + Prom 7616 - 7675 4.8 8 4 Op 1 . + CDS 7733 - 8245 194 ## COG1180 Pyruvate-formate lyase-activating enzyme 9 4 Op 2 . + CDS 8248 - 8466 100 ## gi|288871511|ref|ZP_06410205.1| putative alpha amylase, catalytic region + Term 8605 - 8649 -0.2 + Prom 8502 - 8561 4.7 10 5 Tu 1 . + CDS 8665 - 8907 167 ## gi|266624939|ref|ZP_06117874.1| hypothetical protein CLOSTHATH_06340 Predicted protein(s) >gi|229783908|gb|GG667827.1| GENE 1 1249 - 1845 476 198 aa, chain + ## HITS:1 COG:no KEGG:Trebr_0443 NR:ns ## KEGG: Trebr_0443 # Name: not_defined # Def: thiamine pyrophosphate TPP-binding domain-containing protein # Organism: T.brennaborense # Pathway: not_defined # 2 194 384 578 578 189 49.0 6e-47 MFEAPGSVVGYANTGGYGIDGGLSSLLGNSLVTDKLVFGVFGDLAFFYDMNSLGNRHYGK NVRIILVNNGLGTEFKNPDNRAYQFGDDANPYIAAEGHNGAKSKKLVKHYAEDLGFKYLS AENKEEYMENLPTFLSPEMDSSIIFEVFVDTKEDTEAYSVLSHSHTTVKGKVKTEVKKVL GGEGIKTVKKLLGRNRDF >gi|229783908|gb|GG667827.1| GENE 2 1874 - 2962 148 362 aa, chain + ## HITS:1 COG:no KEGG:Trebr_0444 NR:ns ## KEGG: Trebr_0444 # Name: not_defined # Def: NAD-dependent epimerase/dehydratase # Organism: T.brennaborense # Pathway: not_defined # 1 352 1 350 352 328 48.0 2e-88 MRVLVLGGTGAMGMPLIEKLSNRGEQVFVTSRYQHGSAEDIHYLRGNAHDENFLSEILKQ KYDTIVDFMIYRSDEFASRVNSLLESTDQYIFTSSSRVYADSESPITECSSRLLDVSNDD TFLATDEYALAKARQENLLFNSGSSNWTVIRPYITYNVERLQLGTIEKNVWLYRALHGRN VPLPKDVACHQTTMTCGGDVAQAIADLVGNKRAYGEVLNLTGTQHMEWWEVWKIYSRVLQ EHAGITSKLYQPEDSSGICKVMENEYQVRYDRLFDRTFDNSKLLSICPDLSFVSMEEGLT MCLRKFIKKPSWKGTFGCGVEAYLNRQTGEKTKLSELDSIATKLRYLGYRWMPGIVNILK GR >gi|229783908|gb|GG667827.1| GENE 3 3554 - 3961 209 135 aa, chain + ## HITS:1 COG:MTH348 KEGG:ns NR:ns ## COG: MTH348 COG1216 # Protein_GI_number: 15678376 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # 
Organism: Methanothermobacter thermautotrophicus # 2 127 192 313 313 78 39.0 3e-15 MDIVRQVGYPDSSYFIMYDDSDYARRCLAYTKIRYVTSACLHKQIIPPANDEFSWKGYYG YRNCFLYDMKYGKNIGVRKLRPFFIMMATYLMNKYVRHKDLVADTVKVAFKDAVAGRTGK TVEPGKLEEYLEKRK >gi|229783908|gb|GG667827.1| GENE 4 4002 - 4478 250 158 aa, chain + ## HITS:1 COG:no KEGG:Trebr_0281 NR:ns ## KEGG: Trebr_0281 # Name: not_defined # Def: Formate C-acetyltransferase (EC:2.3.1.54) # Organism: T.brennaborense # Pathway: not_defined # 3 138 7 143 677 73 37.0 2e-12 MNLKNFIVDKIKRTDIYQNKKEEAYYSSRSNMLSMLTPDEKSSSKRHVYFKKSKADEYNK MFGLINITIRFGNRFQTWIDTGLYFSNIYALEDNTTPDYELILDNSINDLINRSGNYNNS VSYEVQIMLRGILSYIDRIVEEIQEAILTLKDTADLAS >gi|229783908|gb|GG667827.1| GENE 5 5552 - 5725 216 57 aa, chain + ## HITS:1 COG:MTH346 KEGG:ns NR:ns ## COG: MTH346 COG1882 # Protein_GI_number: 15678374 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Methanothermobacter thermautotrophicus # 1 52 582 633 642 74 53.0 3e-14 MQMNVVSSKTLIAARKDPSAFPNLIVRVWGFSAYFNDLPENYKDILIKRAIESEIAV >gi|229783908|gb|GG667827.1| GENE 6 5786 - 7240 453 484 aa, chain + ## HITS:1 COG:SA0127 KEGG:ns NR:ns ## COG: SA0127 COG2244 # Protein_GI_number: 15925836 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Staphylococcus aureus N315 # 1 477 1 476 476 181 26.0 3e-45 MARKSIKKNYLFNLSYQILLLITPLITTPYVSRALGADGIGTVSYAGSMVTYFTLFATLG ITIYGQREISYVQDSVDKRSKVFWNTKILGFFTSGCAFLIYVIFSLFNDNTPLYLVLAFN ILAVFVDVTWFFQGLEEFGKIVIRNTIIRIISILYIFIFVKTKDDILVYAFGLAVFVFLS NLSLWAYLPKYITKISKKDIHPFRLLPTVITLFVPTIAIQVYTVLDKTMIGVITQSSFEN GYYEQALKISRTILTLVTALGTVMIPRIGYYFELGDTDEIRRLMYRGYRFVWFLGVPLCL GLISVSSNLVPWFFGPNYDKVVILLKILALLILAIGINNVTGMQYLIPSKRQNIFTFTVV FGACVNFVLNSIFISLWQSIGAAIASVVAEMSIAVIQLIIVRKELSPLRVIKEGVHYYIA GLIMFGVAYMLGNALSPSILHTSIIVIAGALTYFAVLFIIKDQFFISNIKNIASKVISRG RHEV >gi|229783908|gb|GG667827.1| GENE 7 7230 - 7409 170 59 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|266624937|ref|ZP_06117872.1| ## NR: gi|266624937|ref|ZP_06117872.1| EpsIJ, UDP-galactopyranose mutase [Clostridium hathewayi DSM 13479] EpsIJ, UDP-galactopyranose mutase [Clostridium hathewayi DSM 13479] # 1 59 1 59 59 99 100.0 7e-20 MKYDYLVVNSGLYGAVADKVVYTGPIDACFDYRLGEYKYYDMDQVIAAAFDMCDKEIRS >gi|229783908|gb|GG667827.1| GENE 8 7733 - 8245 194 170 aa, chain + ## HITS:1 COG:MTH345 KEGG:ns NR:ns ## COG: MTH345 COG1180 # Protein_GI_number: 15678373 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Methanothermobacter thermautotrophicus # 1 159 131 278 288 70 31.0 2e-12 MQMSELEDVLRRLHSEKIHITIETSLFSNIEQLEIALKYVDLFYVDIKILDKMRCRNVLK GNLDSYYNNLSVLMKRGALTVARIPVIAGFTDDIENRERVAELLGSFQGNLLKVEIIKEH NLGISKYQSLRKAGTSIKVPDYKGVEDNMLIDYRARLKKYVNVPVEVCKI >gi|229783908|gb|GG667827.1| GENE 9 8248 - 8466 100 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871511|ref|ZP_06410205.1| ## NR: gi|288871511|ref|ZP_06410205.1| putative alpha amylase, catalytic region [Clostridium hathewayi DSM 13479] putative alpha amylase, catalytic region [Clostridium hathewayi DSM 13479] # 1 72 1 72 72 131 100.0 2e-29 MNIRKFKSDSITFSNRASIYYGNEMEFLHNPINPNCLTHTSSWGYNISEDFFNNAISVKK LEKAIRLGNLGA >gi|229783908|gb|GG667827.1| GENE 10 8665 - 8907 167 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624939|ref|ZP_06117874.1| ## NR: gi|266624939|ref|ZP_06117874.1| hypothetical protein CLOSTHATH_06340 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06340 [Clostridium hathewayi DSM 13479] # 1 80 1 80 80 155 100.0 1e-36 MIHAYDEMYVEGAMIHMGDMIEYACPDCRYAPDGFWRMFLQNEVVRRFEIGDESVVAGKS GSELAIRGFGGDREERNDRY Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:10:06 2011 Seq name: gi|229783907|gb|GG667828.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld221, whole genome shotgun sequence Length of sequence - 11116 bp Number of predicted genes - 14, with homology - 14 Number of transcription 
units - 6, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 16 - 636 472 ## CLJ_B3040 hypothetical protein + Term 807 - 852 -0.2 - Term 460 - 495 3.0 2 2 Tu 1 . - CDS 637 - 1917 1250 ## COG2357 Uncharacterized protein conserved in bacteria + Prom 2021 - 2080 6.5 3 3 Tu 1 . + CDS 2112 - 2258 138 ## gi|266624942|ref|ZP_06117877.1| conserved hypothetical protein + Prom 3161 - 3220 11.7 4 4 Op 1 . + CDS 3242 - 3520 316 ## COG1863 Multisubunit Na+/H+ antiporter, MnhE subunit 5 4 Op 2 . + CDS 3508 - 3906 531 ## Shal_3357 multiple resistance and pH regulation protein F 6 4 Op 3 . + CDS 3903 - 4277 415 ## Acear_1092 monovalent cation/proton antiporter, MnhG/PhaG subunit 7 4 Op 4 . + CDS 4264 - 4536 400 ## COG2111 Multisubunit Na+/H+ antiporter, MnhB subunit 8 4 Op 5 . + CDS 4511 - 4792 260 ## gi|266624947|ref|ZP_06117882.1| protein phosphatase 1 regulatory subunit 14D + Prom 5636 - 5695 80.4 9 5 Op 1 16/0.000 + CDS 5757 - 6395 633 ## COG2111 Multisubunit Na+/H+ antiporter, MnhB subunit 10 5 Op 2 21/0.000 + CDS 6397 - 6774 541 ## COG1006 Multisubunit Na+/H+ antiporter, MnhC subunit 11 5 Op 3 . + CDS 6771 - 8279 1721 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 12 5 Op 4 . + CDS 8272 - 8694 491 ## gi|288871514|ref|ZP_06117886.2| putative pH adaptation potassium efflux protein 13 6 Op 1 5/0.000 + CDS 9670 - 10401 701 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 14 6 Op 2 . 
+ CDS 10415 - 11114 670 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit Predicted protein(s) >gi|229783907|gb|GG667828.1| GENE 1 16 - 636 472 206 aa, chain + ## HITS:1 COG:no KEGG:CLJ_B3040 NR:ns ## KEGG: CLJ_B3040 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_Ba4 # Pathway: not_defined # 2 206 54 272 272 88 33.0 2e-16 MVWGYHYTRNRKKYLLRLYLMSIFMTGFMYFIKIRFNAVVDYGYHNIFLSMFLVGVLIST IELFIKDRKKGGILIGVIVLVQILYYMLPRFFPFLRSLSGDTLTGVIPNLAMNEYGLEFV ALGVLMYFLKEQKDVFTAVYLIFCICQFSEEMLAAGTATQWLMVLALPFMLSYNNQKGPG LKYFFYVFYPAHTFLLFYTANYIFSK >gi|229783907|gb|GG667828.1| GENE 2 637 - 1917 1250 426 aa, chain - ## HITS:1 COG:DR1631 KEGG:ns NR:ns ## COG: DR1631 COG2357 # Protein_GI_number: 15806636 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 9 178 26 190 394 66 31.0 1e-10 MYMDLINQFIENYKKKLNFYEMAGHIAARQLESALQAAGIRAMVTSRAKNPVRLKSKVQR RNSRRTVPYKNMREIYEDIADLSGVRVSLYFPGDRDKANSLIADLFTIIETKQFPEQSKP PTYNKRFSGYWANHYRAHMKEESLDKTQQKYATARIEIQVASVLMHAWSEVEHDLVYKPL SGTLSEEELAIIDELNGLVLAGEIALERLQSAGNERVRSKNATFGSQYELASYLYNYLSN NFRREDIELRMGNIELLFKLLSRLKMNGVKEIEPVLKSVKFEKDRRNISQQIIDQIITGN EKRYQIYRELRIAGETDEVASNAITYFFNQWVPLEHFLNRISHKSSPKTRGAFNINTLKR LDILDKECLNKIVALRKTRNVLIHNIETPEVDDLLEQGNEARDLLVKLAEQFSAPAAAAE SAKQTN >gi|229783907|gb|GG667828.1| GENE 3 2112 - 2258 138 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624942|ref|ZP_06117877.1| ## NR: gi|266624942|ref|ZP_06117877.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 46 3 48 48 68 97.0 1e-10 MFLLFFAVWVILNGKVTAEICIFGVLISAALFYFMCRYMEYSLKKEPS >gi|229783907|gb|GG667828.1| GENE 4 3242 - 3520 316 92 aa, chain + ## HITS:1 COG:VNG0571C KEGG:ns NR:ns ## COG: VNG0571C COG1863 # Protein_GI_number: 15789782 # Func_class: P Inorganic ion transport and metabolism # Function: Multisubunit Na+/H+ antiporter, MnhE subunit # 
Organism: Halobacterium sp. NRC-1 # 2 75 238 316 364 67 48.0 7e-12 MKANVCVLKIILSPELQPEPAFVYFDTAFQTGLAKVLLANSITLTPGTITVSVEDDRFCV HCLDKELAEGMEDSVFVKLLEEMEEVEARWTR >gi|229783907|gb|GG667828.1| GENE 5 3508 - 3906 531 132 aa, chain + ## HITS:1 COG:no KEGG:Shal_3357 NR:ns ## KEGG: Shal_3357 # Name: not_defined # Def: multiple resistance and pH regulation protein F # Organism: S.halifaxensis # Pathway: not_defined # 11 82 1 72 98 63 38.0 4e-09 MDKIGQAYEMLLTGAALILAVLMIISIIRSVLGPKISDRIIAVNMTGTMVIMVIAILSVY LDENYLVDVSLIYAMISFLGVVVLCKVYTGVYLQKKNRKADLGAIEDNVERQAEETGREG ETGNNTEQEDLL >gi|229783907|gb|GG667828.1| GENE 6 3903 - 4277 415 124 aa, chain + ## HITS:1 COG:no KEGG:Acear_1092 NR:ns ## KEGG: Acear_1092 # Name: not_defined # Def: monovalent cation/proton antiporter, MnhG/PhaG subunit # Organism: A.arabaticum # Pathway: not_defined # 15 90 9 84 115 62 35.0 8e-09 MTWQWIRFALSAACLVTGLVFMMLAVFGVNRFHRALNRMHAAAMGDTLGILFVFAGLILI RGFSMASFKLLLVILFFWTAGPVSGHMISRLEAMTDEDLGEILVIRKEKTDKKQKGEEKG DETL >gi|229783907|gb|GG667828.1| GENE 7 4264 - 4536 400 90 aa, chain + ## HITS:1 COG:VNG0566C KEGG:ns NR:ns ## COG: VNG0566C COG2111 # Protein_GI_number: 15789779 # Func_class: P Inorganic ion transport and metabolism # Function: Multisubunit Na+/H+ antiporter, MnhB subunit # Organism: Halobacterium sp. 
NRC-1 # 5 73 4 72 176 68 44.0 3e-12 MKLFEIILLICLIVCAVSVAFTKDLLTSIVIFMSYSLIMCIIWILLQSPDLAITEAAVGA GVTSILFFITLKKIRAIRKEERDEQDEGSL >gi|229783907|gb|GG667828.1| GENE 8 4511 - 4792 260 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624947|ref|ZP_06117882.1| ## NR: gi|266624947|ref|ZP_06117882.1| protein phosphatase 1 regulatory subunit 14D [Clostridium hathewayi DSM 13479] protein phosphatase 1 regulatory subunit 14D [Clostridium hathewayi DSM 13479] # 1 91 1 91 91 150 100.0 2e-35 MSRTKDHYENSLWLKFRKWVDGETDLLADGMETRAPEEELKELSQAAQTAQAEHLQSDSE KAARMRRKEERGLLLFRRFYQVMSVLLCLTITS >gi|229783907|gb|GG667828.1| GENE 9 5757 - 6395 633 212 aa, chain + ## HITS:1 COG:RC0355_2 KEGG:ns NR:ns ## COG: RC0355_2 COG2111 # Protein_GI_number: 15892278 # Func_class: P Inorganic ion transport and metabolism # Function: Multisubunit Na+/H+ antiporter, MnhB subunit # Organism: Rickettsia conorii # 2 207 49 261 265 96 29.0 3e-20 MSARYIEKGVEETGALNYVTGMILDYRAFDTFGESCVLFIASCCVLALLRIDAVSRDEQT RKRLDEANDRLFEPKNDIILQKCACILVPFIMIFGVYIILNGHLSPGGGFSGGAVLGSGL ILYLNAFGFRKTERFFTETVYRRITLCALTFYCVAKSYSFFTGANHLESGIPLGTPGAIL SSGLILPLNICVGLVVACTMYAFYTLFRKGGM >gi|229783907|gb|GG667828.1| GENE 10 6397 - 6774 541 125 aa, chain + ## HITS:1 COG:PAB0488 KEGG:ns NR:ns ## COG: PAB0488 COG1006 # Protein_GI_number: 14520916 # Func_class: P Inorganic ion transport and metabolism # Function: Multisubunit Na+/H+ antiporter, MnhC subunit # Organism: Pyrococcus abyssi # 13 115 8 109 114 76 43.0 1e-14 MAAHLLSNMEETAAVILFGVGFTMLLLHKNLIKKIMGMNIMDTAVYLFLAAKGYIQGRAV PIEMNGIKEASAYINPVPSGLVLTGIVVSVSTTALMLALTIRLYERYGSLDLDAILIQAK EEEKT >gi|229783907|gb|GG667828.1| GENE 11 6771 - 8279 1721 502 aa, chain + ## HITS:1 COG:PAB2416 KEGG:ns NR:ns ## COG: PAB2416 COG0651 # Protein_GI_number: 14520917 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Pyrococcus abyssi # 35 455 35 442 493 164 28.0 5e-40 
MTLVQNFPFFSIIIAMFSGIISCTLSGKRARNVSLAVIVVTTAMSAAVFLFCIRTGESYT YMMGHFPAPWGNEIRAGVLEGLTAAVFGIVMLLSVLGGMDHTAADVEGTKLNLFYVMIDL MLSSLLALIYTNDLFTAYVFVEINTIAGCGLIMIRQKGRSLSAAIRYMIMSQLGSGLFLI GLSMLYDVTGHLLMSPAKAAVAAIEAAGSYEIPMLVILAVISVGLAIKSGLYPFHYWIPD TYGYATPTASAVLSGLVSKGYIFLLIKIFYRVLGRENVVASRIVNVLFVFGLAGMLMGSL HAILEKDTRRMIAYSSVAQIGYIYMGIGLGTEAGMVAAVFHMFTHSATKALLFISAVGLY EVSGDKKDYKSLRGAGYRNPLAGIGFSVGALSMVGFPMLAGFISKLLFATSALQSPNKMI VTLIGLAVSTTLNAIYFLRLVITLYSREGDRETRGAKERSAVRKSSWKLRLAVVCFIFLN LVLGLKSQPIVQAIADGLLMFE >gi|229783907|gb|GG667828.1| GENE 12 8272 - 8694 491 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871514|ref|ZP_06117886.2| ## NR: gi|288871514|ref|ZP_06117886.2| putative pH adaptation potassium efflux protein [Clostridium hathewayi DSM 13479] putative pH adaptation potassium efflux protein [Clostridium hathewayi DSM 13479] # 1 137 1 137 137 245 100.0 8e-64 MNKSYLQSAHNKEKLMYQEMDETIVLLLPVLIPIAAGVLLLTVKRLRNSRKTMISLVMAA LTAGALCAFAAVARGGGLTLWKLTDTITIEFRVDGISRLFAVLTASVWLLVGIYSMTYMT HERDEHRFFGFYLIVLGLAS >gi|229783907|gb|GG667828.1| GENE 13 9670 - 10401 701 243 aa, chain + ## HITS:1 COG:SMa1541 KEGG:ns NR:ns ## COG: SMa1541 COG0651 # Protein_GI_number: 16263292 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Sinorhizobium meliloti # 1 238 259 485 487 127 35.0 2e-29 MLAVIRVVYYVIGPELLRGTWVQTMWILLSLLTVFMGSMLAYREPVLKKRLAYSTVSQVS YILFGLSLLEPMGFVGALSHVVFHSMIKNGLFLAAGAIIFKTGWTRVEEMRGLGRVMPVM LGGYTVLSLALIGIPPCSGFVSKWYLACGALASQTGVWTIAGPAVLLVSALLTAGYLLPL TIHGFFPGEDFDESGLKARGLIGQEPSLCMMVPILIFTAGALLFGCFPGRFLAMLESIAA TVL >gi|229783907|gb|GG667828.1| GENE 14 10415 - 11114 670 233 aa, chain + ## HITS:1 COG:SMa1541 KEGG:ns NR:ns ## COG: SMa1541 COG0651 # Protein_GI_number: 16263292 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ 
antiporter, MnhD subunit # Organism: Sinorhizobium meliloti # 65 232 63 228 487 90 39.0 3e-18 MNGILLAVPALLPIAAGMAILASGYGAEDERTYRCLEFVVEGLVLLNSLLIAVVIAGRSG LDHGLVLFKLYGNLTVRFKLDRMGSVFAGLVAFLWPLATLYSFEYMRHEKRRVSFFAYYL MTYGVTAGIAFSGNLVTMYLFYELLTLVTFPLVLHPMTKEAMRASRKYLYYSIGGAAFAF IGLIFVIQYSATGTTEFVPGGVLSMAAAEGNRGMLLFVYILAFFGFGVKAAVF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:10:36 2011 Seq name: gi|229783906|gb|GG667829.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld222, whole genome shotgun sequence Length of sequence - 9061 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 256 180 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 + Term 302 - 358 1.7 + Prom 264 - 323 5.8 2 2 Op 1 . + CDS 412 - 744 190 ## PROTEIN SUPPORTED gi|18309686|ref|NP_561620.1| 30S ribosomal protein 3 2 Op 2 . + CDS 737 - 2254 831 ## Closa_0212 hypothetical protein + Term 2305 - 2359 9.6 - Term 2289 - 2351 13.9 4 3 Tu 1 . - CDS 2403 - 2519 75 ## gi|210610570|ref|ZP_03288496.1| hypothetical protein CLONEX_00686 5 4 Op 1 2/0.000 - CDS 3489 - 5282 2451 ## COG1190 Lysyl-tRNA synthetase (class II) 6 4 Op 2 . - CDS 5318 - 5800 793 ## COG0782 Transcription elongation factor 7 4 Op 3 . - CDS 5886 - 6755 820 ## Closa_2529 hypothetical protein - Prom 6922 - 6981 5.9 - Term 6933 - 6973 3.1 8 5 Op 1 2/0.000 - CDS 7025 - 8074 1236 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 9 5 Op 2 . 
- CDS 8131 - 8976 823 ## COG0395 ABC-type sugar transport system, permease component Predicted protein(s) >gi|229783906|gb|GG667829.1| GENE 1 2 - 256 180 84 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 1 78 364 441 456 73 42 4e-13 TIIGWGLYGARCIEFLFSSRIIKAFMIVYSLVAVLGATMDLGMLWSIAETFNGLMAIPNL IAVFLLSGTVVKLVREYFDGEGKG >gi|229783906|gb|GG667829.1| GENE 2 412 - 744 190 110 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|18309686|ref|NP_561620.1| 30S ribosomal protein [Clostridium perfringens str. 13] # 13 106 15 109 110 77 41 3e-14 MDKNYMAVGSSMLILKLLEQQDMYGYQIIKELEKRSERVFTLKEGTLYPLLHALEQEGAV ICEQRTAENGRGRKYYIITDNGRRLLNEKLKEWDSFQTAVNQVIGGVLFG >gi|229783906|gb|GG667829.1| GENE 3 737 - 2254 831 505 aa, chain + ## HITS:1 COG:no KEGG:Closa_0212 NR:ns ## KEGG: Closa_0212 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 19 500 13 502 507 362 41.0 2e-98 MDKPSHKNFDQQRNYEEAFLDDVAEQIAYKPLRPAVVRELKDHIEDRTEEYLSEGMTVED AVKKAVDSMGDGVAIGTGINAVRRVQSNRLLILLTALLLFTGLAAESWTAMIPELSLRGI RFYLTGTILLVFAVLKGYPLLIRYQKRLLILAAAVFSAEAVLLQMIHFDLIPDIVWNTAY IIPWHMIDYYTLLLSGPVFVIFLYRIRKQENQAMTAAFLLTAGAVLIHYFTYKSFTLAEI VIFLFSMIGSITYMVCRGIFAGNKRKLLLRTAVGSVVLLLVYGMLPGQSRYFQAFVNPEA NARDTSDDSYNGVLIRNLLSKSPAAGSLSLTPEELMDYGTGEWYFTYGDAKEAAAEGKKP FSEHHWITFPYDKSEVTLWNILPQHYANNYLLAVSILQFGWLAGMLLLAVIAVFYVILFS CILRIRGALAGAVSFHCGLCLLFQNVLYILGNFGFQYGSFPNLPMISEGRFSILVNMLLL GFILSACRYDRVIDAPAADRKTAFF >gi|229783906|gb|GG667829.1| GENE 4 2403 - 2519 75 38 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610570|ref|ZP_03288496.1| ## NR: gi|210610570|ref|ZP_03288496.1| hypothetical protein CLONEX_00686 [Clostridium nexile DSM 1787] hypothetical protein CLONEX_00686 [Clostridium nexile DSM 1787] # 1 38 114 151 151 77 97.0 4e-13 MGIDSEGMLISAVHEEDGHEGLNLLMVDDRIPAGAKLY >gi|229783906|gb|GG667829.1| GENE 5 3489 - 5282 2451 597 aa, chain - ## HITS:1 COG:CAC3197 KEGG:ns NR:ns ## COG: CAC3197 COG1190 # 
Protein_GI_number: 15896444 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 9 495 19 509 515 546 58.0 1e-155 MAEQQNQNQVKEEDLNQLLKVRRDKLAELQEAGKDPFKITKYDVTVHSSDIKEHYEEWEG KEVSVAGRMMSKRVMGKASFCNVQDLKGNIQTYVARDSIGEESYKDFKKLDIGDIVGIKG TAFTTKTGEISIHATSVVLLTKSLQILPEKFHGLTNTDLRYRQRYVDLVMNPEVRDTFVK RSQIIKAIRDHLDDQGFMEVETPMLVANAGGAAARPFETHYNALDEDVKLRISLELYLKR LIVGGLERVYEIGRVFRNEGVDTRHNPEFTLMELYQAYTDYHGMMDLVEDLFRSVALKVL GTAVVTYDGVEIDLSKPFERITMVDALKKYKGIDFTEIHTDEEAKALADQYHIQYEARHK KGDILSMLFEEFVEEHLVQPTFIMDHPVEISPLTKRKPDDPDYTERFELFITKREMANAY SELNDPIDQRVRFEAQEVLKAAGDEEANSMDEDFLNALMIGMPPTGGIGIGIDRFVMLLT DSYAIRDVLLFPTMKSLGGSESGKKAEKAAAAEVKTEEKPVEKIDFSNVKIEPLFEEMID FDTFAKSDYRAVKIEACEVVPKSKKLLKFTLNDGTDRKRTILSGIHEYYEPEELIGR >gi|229783906|gb|GG667829.1| GENE 6 5318 - 5800 793 160 aa, chain - ## HITS:1 COG:CAC3198 KEGG:ns NR:ns ## COG: CAC3198 COG0782 # Protein_GI_number: 15896445 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Clostridium acetobutylicum # 3 157 4 158 158 125 53.0 3e-29 MADKKNILTYEGLKRYEDELQNLKVVKRKEVAQKIKEAREQGDLSENAEYDAAKDEQRDI ELRIEELEKLLKNAEVVVEDEIDLDKINIGCKVKVYDVDEDEEMEFKIVGSTEANSLQNK ISNESPVGQALMGKKAGDVVDVETQAGVIQYKVLEIQRVS >gi|229783906|gb|GG667829.1| GENE 7 5886 - 6755 820 289 aa, chain - ## HITS:1 COG:no KEGG:Closa_2529 NR:ns ## KEGG: Closa_2529 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 274 1 278 281 256 50.0 9e-67 MGERIIETWIQAKHPNQELCEDGLYTGRHYFAVIDGVTSKGSLVWPGEKTGGRFAKDVLL GALETLPADCSAQEAVSCLNHALRSASFQALEHPELLERMREERPQAVLVVYSRMRREVW CFGDCQCMIGGRLYRKTTVMDELTAGVRSAYNQLELAAGKTLEELAEHDTGREYIMPLLV GECRFANGDGEYSYDVLDGFPIHEEHTRVYPVRPGEEVVLASDGYPELFGSLAESEAVIR EVKETDPMCMFRFKGPKGIAPGFHSFDDRTYLRFVLEAEDETLAICKGK >gi|229783906|gb|GG667829.1| GENE 8 7025 - 8074 1236 349 aa, chain - ## HITS:1 COG:BS_yesR KEGG:ns NR:ns ## COG: BS_yesR COG4225 # Protein_GI_number: 16077767 
# Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Bacillus subtilis # 1 339 1 337 344 332 49.0 6e-91 MGKLEFDRDIIEETMDRIVERTMRMDMTWDWPCGVAYYGIAEAYEATGKPQYLDRMKERV DELIDLGLPGFTVNTCAMGHVLLTLYQHTGSELYRDIIEAKVDYLEHEALRFGDGVLQHT VSADNDFPEQCWADTLFMAALFLLRAGVLLDRPELTADALKQWYWHIQYLQDEKNGLWYH GYNHITKDHMSGFYWGRANCWAAYTMSRVSHILPECYLYPEYLEIAGSLNEQLSALKLYQ TESGLWRTLLDDGESYEEVSASAGIAAAMLAKKNPLYLKYVNKTVQGLLANVTEDGRVTN VSGGTAVMKDREGYRSISRKWIQGWGQGMALAFFAGLLQYDSMKGDGAL >gi|229783906|gb|GG667829.1| GENE 9 8131 - 8976 823 281 aa, chain - ## HITS:1 COG:BS_yesQ KEGG:ns NR:ns ## COG: BS_yesQ COG0395 # Protein_GI_number: 16077766 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 8 280 25 296 296 293 56.0 3e-79 MKAKKRIRDIVYHVIVFGIGILMIYPLIWMFMSSFKETKTIFQTAGSLIPNPFTLDNYIN GLKGFAKVPFLVFFKNSLFISVIATIGTVISSAVVAYGFARFEFRGKKILFAAMLLSMML PAQVLMIPQYLWYQKLGWVGSYLPLIVPYFFAIQGFFVYLMTNFIDGIPKELDEAAKIDG CSYPQIFARVILPLITPALVTGGIFSFMWRWDDFLSALLYINKSIKYPVSLALKLFCDPG SSSDYGAMFAMASLSILPSVIIFIFFQKYLVEGISTSGLKG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:10:56 2011 Seq name: gi|229783905|gb|GG667830.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld223, whole genome shotgun sequence Length of sequence - 9150 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 4, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1255 949 ## COG1797 Cobyrinic acid a,c-diamide synthase 2 1 Op 2 . + CDS 1255 - 1890 701 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 2729 - 2788 80.4 3 2 Op 1 2/0.000 + CDS 2869 - 3744 850 ## COG2038 NaMN:DMB phosphoribosyltransferase 4 2 Op 2 . 
+ CDS 3738 - 4388 588 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase 5 2 Op 3 . + CDS 4456 - 5106 606 ## COG4832 Uncharacterized conserved protein 6 2 Op 4 . + CDS 5120 - 5533 270 ## Acfer_1674 hypothetical protein 7 2 Op 5 . + CDS 5530 - 6474 787 ## EUBELI_20121 endonuclease + Prom 6531 - 6590 5.3 8 3 Tu 1 . + CDS 6663 - 7187 442 ## TUZN_2187 hypothetical protein + Term 7333 - 7365 -0.9 9 4 Op 1 . + CDS 7532 - 7783 309 ## Corgl_0946 hypothetical protein 10 4 Op 2 . + CDS 7798 - 8859 999 ## COG0502 Biotin synthase and related enzymes 11 4 Op 3 . + CDS 8913 - 9150 333 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes Predicted protein(s) >gi|229783905|gb|GG667830.1| GENE 1 2 - 1255 949 417 aa, chain + ## HITS:1 COG:BH1898 KEGG:ns NR:ns ## COG: BH1898 COG1797 # Protein_GI_number: 15614461 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Bacillus halodurans # 3 412 59 450 465 236 35.0 5e-62 GGNLDTFFLSEEAVRNQFKELAAGADLSVVEGVMGYYDGVGGNDTWASSYDTARTIDTPV VLILDGRGASLSLAAEIKGFLEYKKDSRIQGVILNRVSRVMAERLVPEIEKMGIAVYGCL PECDAAKVTSRHLGLVIPEESKALRESLGRLACEIEKNVDVEGLFRLACGAGTLQDDGYR ASGGTGHEPDGKTADRKPEKHDRVPAGPDRVRIGIARDEAFCFYYQENLKLFESLGAEFA EFDPMRDEHLPPGIAGLMFGGGYPELYAGELSANRPILQEIRNAAAGGMPILAECGGFLY LHEELETKEGEVFPMAGVISGRAYPTGKLSRFGYIELVPYGDTPLLKEGERIRGHEFHYW DSTACGTDMKAVKPGGKRSWDCIHANGGFLAGFPHLYYPSNPSAAERWLELCRKGTE >gi|229783905|gb|GG667830.1| GENE 2 1255 - 1890 701 211 aa, chain + ## HITS:1 COG:CAC1007 KEGG:ns NR:ns ## COG: CAC1007 COG0454 # Protein_GI_number: 15894294 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 4 175 8 172 174 77 29.0 2e-14 MTEELNIRQGVLKDIPELVRLYEETVEHLESHVNYPGWKKGVYPGEAAAEEGVKTGTLYV AEAGGRIIGSIILNEKQEAAYSEASWSVDAAPGEVMVIHTFLVHPDFRRSGAGKLLMDFA 
EVQALKEGKKTIRLDVYEKNEPAVRLYERQGYHYTATVDLGLGSRGLPWFKLYEKTAGEC LYGLLSRISPLSSEAMELAALTWSHVAKLAS >gi|229783905|gb|GG667830.1| GENE 3 2869 - 3744 850 291 aa, chain + ## HITS:1 COG:CAC1372 KEGG:ns NR:ns ## COG: CAC1372 COG2038 # Protein_GI_number: 15894651 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 289 64 349 352 244 46.0 1e-64 MCGDNGIVEEGVTQTGQEVTAVVTENMTDGNSSVCLMAERAGVDVFPVDIGVCRELRSGT RNPLIRKKLAYGTKNFRKEPAMSRETAVSAIAAGLCMAGELKEKGYRLIATGEMGIGNTT TSSAVAAMLLGRDPAEMTGRGAGLSDEGLRKKVSVIREAVKRYEAQCRDAIDVIACVGGL DLAGLTGVFLGGAVYRIPVLIDGFISGTAALAAAKLAPSACDYMLATHVSAEPAGKMILE ELGLKPFVAAGMCLGEGTGAVASIPLLDMALEVYTKMSTFQDIRIEEYKQW >gi|229783905|gb|GG667830.1| GENE 4 3738 - 4388 588 216 aa, chain + ## HITS:1 COG:CAC1383 KEGG:ns NR:ns ## COG: CAC1383 COG2087 # Protein_GI_number: 15894662 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 2 216 4 185 185 79 30.0 6e-15 MVTLVTGGSGSGKSAFAEQEIVKLGTRRRIYIATMKPWDEECRRRIERHRVMRADKQFET IECYRGLDRLELPAAGKEGEAGNAVLLECLSNLVSNELFGTGDDDCPDRLSSQYGSATAD LVVGGIMRLMRQADDLVIVTNEVFSAGDYRRNPIGENRHEGENRYEGGKCPESGDFSGWD ESTALYLKVLGEVNCRLGAAADRVTEVTAGIPVIIK >gi|229783905|gb|GG667830.1| GENE 5 4456 - 5106 606 216 aa, chain + ## HITS:1 COG:lin2189 KEGG:ns NR:ns ## COG: lin2189 COG4832 # Protein_GI_number: 16801254 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 5 210 7 208 208 204 50.0 1e-52 MTAFDYKKEYRALYLPKDTPSVIDVPPMQYAAVRGCGNPNEEGGEYGRAIAVLYGISYTI KMSYKGSRKIDGFFEYVVPPLEGFWWMEGGAPGVDYRNKSGFNWISIIRVPDFVTEEVFS WAKEEAQRKKGIDMGPAELITVSEGRCVQCLHIGSFDEEPATVAKMDRYLEEHGYVNDFS ETRRHHEIYLSDPRKVPRERMKTVIRHPVSQNSIND >gi|229783905|gb|GG667830.1| GENE 6 5120 - 5533 270 137 aa, chain + ## HITS:1 COG:no KEGG:Acfer_1674 NR:ns ## KEGG: Acfer_1674 # Name: not_defined # Def: hypothetical protein 
# Organism: A.fermentans # Pathway: not_defined # 1 135 1 133 135 129 45.0 3e-29 MIVMVCIDDHNGMMFNHRRQSRDRAVIERVLRCAESSRLWMSEYSYRLFPGESRDQLAVG PDFLERAGRGEYCFVEDRDILPYEAELEAIVLFRWNRSYPADFYFPVEILSRGWSMTESE EFEGSSHEKIIKEVYKR >gi|229783905|gb|GG667830.1| GENE 7 5530 - 6474 787 314 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20121 NR:ns ## KEGG: EUBELI_20121 # Name: not_defined # Def: endonuclease # Organism: E.eligens # Pathway: not_defined # 40 314 37 316 316 359 63.0 8e-98 MKRIRTYIAVLLMAALFLSGCAPAPVQTDGSGYTAGVLANETELVGSTGPKQFSVSDIPV FSGAAYAAVNGNIPYFEQSDYSTESFETYSDLDSLGRCGTAFANVGTDLMPTEKRGSIGQ VKPSGWQTVKYDFVDGKYLYNRCHLIGFQLTGENANERNLITGTRYMNVDGMLPFENMVA DYVKETKNHVLYRVTPVFEGDNLVADGVLMEAESVEDEGDGILFCVFVYNVQPGVAIDYA TGDSSLSEDSNTDYHADAAGQQENTYIFNTNTHKFHDPSCSSVSQMNESSKQSFTGSREE AIAQGYEPCGRCKP >gi|229783905|gb|GG667830.1| GENE 8 6663 - 7187 442 174 aa, chain + ## HITS:1 COG:no KEGG:TUZN_2187 NR:ns ## KEGG: TUZN_2187 # Name: not_defined # Def: hypothetical protein # Organism: T.uzoniensis # Pathway: not_defined # 18 128 69 185 306 81 47.0 2e-14 MTDSEKLDLLLLKMEGFATKDDLKSFATKDDLKSFATKDDLKSFATKDDLKPFATKDDLK SFATKDDLKSFATKDDLKPFATKDDLKSFATKDDLKSFVTNDSLRHEMNLILSEIDSTEQ RLNRRFEEVDKKILDLRNYVTSMRYSSEAVEILMKRQDNVEQRLQKVERKVLYS >gi|229783905|gb|GG667830.1| GENE 9 7532 - 7783 309 83 aa, chain + ## HITS:1 COG:no KEGG:Corgl_0946 NR:ns ## KEGG: Corgl_0946 # Name: not_defined # Def: hypothetical protein # Organism: C.glomerans # Pathway: not_defined # 1 81 6 86 86 104 65.0 1e-21 METRIAVVGIVVEEEESVEELNEILHEYRQYIIGRMGIPYHEKKISIISIAVDAPQSIIS ALSGKVGKLKGVSSKTAYSGVKS >gi|229783905|gb|GG667830.1| GENE 10 7798 - 8859 999 353 aa, chain + ## HITS:1 COG:CAC1631 KEGG:ns NR:ns ## COG: CAC1631 COG0502 # Protein_GI_number: 15894909 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Clostridium acetobutylicum # 19 348 15 339 350 257 39.0 3e-68 MSETAGKIMDPAVRLAEHHDLPDEELYTLITERDEATAQFLKQEARALCERMYGRVVYIR 
GLIEFTNYCKNDCYYCGIRRSNREAERYRLTEKEILSCCEAGYGLGFRTFVLQGGEDPYF TDERLCTIVSSIRQRFPDCAITLSAGERSFESYQRLYDAGADRYLLRHETADRCHYGKLH PESMSFDHRMAALRDLKEIGYQTGCGFMVGSPYQTAAMMVKDLRFMKEFGPHMIGIGPFL SHKDTPFRDRKNGTAELTLYLLSIIRLMMPKVLLPATTALGTVTEGGREAGILSGANVVM PNLSPAGVRKKYMLYDNKLSSGSEAAESLNLLRESLSKIGYGIAVSRGDSLVI >gi|229783905|gb|GG667830.1| GENE 11 8913 - 9150 333 79 aa, chain + ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 1 79 1 79 472 94 62.0 6e-20 MYNPESLKAEEFISHEEILETLDYAEKNKENRELIDSIIEKARQLKGLSHREASVLLACE MPDKIEEMYGLAEEIKKKF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:11:11 2011 Seq name: gi|229783904|gb|GG667831.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld224, whole genome shotgun sequence Length of sequence - 6697 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 404 339 ## Closa_0839 DNA polymerase III, delta subunit 2 1 Op 2 . - CDS 450 - 2771 1772 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) 3 2 Op 1 . - CDS 3007 - 3957 1222 ## Closa_0837 Lipoprotein LpqB, GerMN domain protein 4 2 Op 2 . - CDS 3976 - 5421 1705 ## COG0642 Signal transduction histidine kinase 5 2 Op 3 . - CDS 5450 - 6316 1024 ## COG1307 Uncharacterized protein conserved in bacteria 6 2 Op 4 . 
- CDS 6313 - 6696 362 ## COG1132 ABC-type multidrug transport system, ATPase and permease components Predicted protein(s) >gi|229783904|gb|GG667831.1| GENE 1 2 - 404 339 134 aa, chain - ## HITS:1 COG:no KEGG:Closa_0839 NR:ns ## KEGG: Closa_0839 # Name: not_defined # Def: DNA polymerase III, delta subunit # Organism: C.saccharolyticum # Pathway: Purine metabolism [PATH:csh00230]; Pyrimidine metabolism [PATH:csh00240]; Metabolic pathways [PATH:csh01100]; DNA replication [PATH:csh03030]; Mismatch repair [PATH:csh03430]; Homologous recombination [PATH:csh03440] # 1 133 1 133 323 216 77.0 2e-55 MQNLTQDIKNHSFKPVYLLYGDEAFLKNSYKNQLKNAISGDDTMNYNYFEGKGLDLNELI SLADTMPFFSEKRLIMVEDSGFFKGASDELVAYLPQMPDTTCMIFVESEVDKRSRMFKKV KEIGYAAEMARQDF >gi|229783904|gb|GG667831.1| GENE 2 450 - 2771 1772 773 aa, chain - ## HITS:1 COG:CAC0946 KEGG:ns NR:ns ## COG: CAC0946 COG2333 # Protein_GI_number: 15894233 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Clostridium acetobutylicum # 496 761 32 284 320 124 30.0 6e-28 MWLSVFLLFALIGMARAGYTKQEWEARDRLCGELAGTYVELTGHVGAVEEAEGTVQLTLE GNDVRPAAGERKEAALAGGDRSVSGGRQIPRVLVLWKTDGDGEMPEAREFKAGSKVTVRG QLALFARARNPGEFDSRNYYRGQGVDCRLYGEEVERSGGDVSPIPECLRRIREKAKQNIK RSVPEEDAGIFLAAVLGDKKALDSEKKDLYQKNGIAHLLAISGLHLSLIGMGVYRLLRKA GLGYKGAGLAGAGFIFCYGAMTGGSPSVTRAVIMMGTGFLASYLGRTYDLLSALSLALLL LAWETPELLTQGGVQLSFGAVFAVGSVLPVIQNYLGKKRGALGALSVSAAIQLVTLPVIL RDFFQLPVYGIFLNLLVIPLMGGVVCSGIGAALFGGISGEAGAVAAGAGHCILRFYEWLC HGAERLPYHTLVMGRPEAAAVIVYYTVLAAILAAMRRAGGKSASGETGRKKRRADGSRWI RTAVFFLICILLPCILMPRPVDGMEVLFLDVGQGDGILLRSGRSAVLIDGGSTSEKKLGE YRLEPCLKSCGVSVIEYAFVSHGDLDHISGIRYLLESCEEIEVRNLLLPCQGQEDEALSD LAELARARGTRVAFLEAGEHFLVEGIGITCLYPGIHDIPADKNEESEVLKVDYGNCHILF TGDMSGDGELRLLDALSERTGVLSEIQILKTAHHGSRFSSGAEFLDALGVRWAVISYGEG NSYGHPHQEVLDRFHERGVKIYETAKGGAVTLKTDGKMVRWKAFLLVDAFKIH >gi|229783904|gb|GG667831.1| GENE 3 3007 - 3957 1222 316 aa, chain - ## HITS:1 COG:no KEGG:Closa_0837 NR:ns ## KEGG: Closa_0837 
# Name: not_defined # Def: Lipoprotein LpqB, GerMN domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 30 313 37 320 330 462 81.0 1e-129 MKRWIYSLLLLVLLAGLAGCQRRSGDTEGVYQIYYMNSSMTRLEPQDYTMPQMPADGTAT ETDWKITQLMEQLRNVPKDLDRQSAVPDKVGFERYKLEDTVLYLYFDNNYAMMNPTREIL CRAALVRTLTQAEGVDYVAIYTAEQPLMDSTGVPVGAMTGADFIDNISNVNAFEKTELTL YFADSTGEKLVKEQREVVHNVTTSLEKLVVEQLIAGPDRPGINATLPKDTKLLNVSVNEN VCYLNFDASFLNNSLDVKEYIPVYSIVNSLAAVSSVNKVQIAVNGSQDVMFRDSISLNQL FERNLDYNAENEENPQ >gi|229783904|gb|GG667831.1| GENE 4 3976 - 5421 1705 481 aa, chain - ## HITS:1 COG:SA1515 KEGG:ns NR:ns ## COG: SA1515 COG0642 # Protein_GI_number: 15927270 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Staphylococcus aureus N315 # 245 473 321 547 554 156 41.0 7e-38 MFFQVKPIKLKHTVSLRLVLILFFLIIGLAPMMIQNKIMLGTYRQNQIDARIQDIQNQCS ILSNKMTRGGYLVNEPRDTLLDNEMSVKADMFNGRIVIVNKNFRIVSDTFSLSLGKLNVS EEVIRCFNGENSSKYNSDKNYAALTIPVYSNSEDKVIDGVMIVTASTEDLLTRLSSVSEK GSFLHLMIFSMLAILVVLLVKLLMNPFKELQWKLKRVEEGNLDEELSSNAYTETRQITES ISKTLKKLKAVDQSREEFVSNVSHELKTPITSIRVLADSLMSMDDVPVELYREFMTDISD EIDRESKIIDDLLALVRMDKTAGTELNIAQVNVNGLLELILKRLRPIAGKRNVELTFESI REVTADVDEVKLSLALNNLVENAIKYNVEGGWVRVTLDADHKFFYVKVADSGIGISEEFQ EHIFERFYRVDKARSRETGGTGLGLAITKNIILMHQGAIRLSSKEGEGTTFTVRIPLTYI P >gi|229783904|gb|GG667831.1| GENE 5 5450 - 6316 1024 288 aa, chain - ## HITS:1 COG:BS_yviA KEGG:ns NR:ns ## COG: BS_yviA COG1307 # Protein_GI_number: 16080601 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 280 1 276 281 126 32.0 5e-29 MRVAIVTDSNSGITQEQGREAGIYVVPMPFMMAGETYFEDISLTQKSFYEKLAEGAEIST SQPSPASLIELWNGLLKEHDAVIHIPMSSGLSSSCETAHALAADFGGKVLVVNNQRISVT QRQSVLEAKAMADQGMSAEEIKRVLEETALDSTIYITVDTLEYLKKGGRITPAAALLGTF LKIKPVLTIQGEKLDAFAKARTMKQAKSMMITAVKKDLEERFSDPEGAGVHMAVAHTDNE TAALEFKKELMELFPKAGDIYVDHLSLSVSCHIGPGALAVACTKKLSL >gi|229783904|gb|GG667831.1| GENE 6 6313 - 6696 362 127 aa, chain - ## HITS:1 COG:CAC1613 KEGG:ns 
NR:ns ## COG: CAC1613 COG1132 # Protein_GI_number: 15894891 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 1 108 463 570 579 168 73.0 3e-42 GYDTYVGERGVKLSGGQKQRISIARVFLKNPPVLILDEATSALDNESEHLVSQSLERLAS GRTTLTIAHRLTTIRNADRILVLSGSNIIEEGNHEELIEKQGIYYQLYTSAGEAEQESKM EMKENSR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:11:21 2011 Seq name: gi|229783903|gb|GG667832.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld225, whole genome shotgun sequence Length of sequence - 7069 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 1700 1159 ## COG0210 Superfamily I DNA and RNA helicases 2 1 Op 2 . + CDS 1713 - 2657 885 ## COG2971 Predicted N-acetylglucosamine kinase 3 1 Op 3 . + CDS 2691 - 3728 712 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 + Term 3746 - 3790 4.3 - Term 3733 - 3778 9.1 4 2 Tu 1 . 
- CDS 3845 - 6208 1899 ## CLL_A1507 putative lipoprotein - Prom 6439 - 6498 7.5 Predicted protein(s) >gi|229783903|gb|GG667832.1| GENE 1 3 - 1700 1159 565 aa, chain + ## HITS:1 COG:CT608 KEGG:ns NR:ns ## COG: CT608 COG0210 # Protein_GI_number: 15605339 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Chlamydia trachomatis # 24 444 200 632 634 154 30.0 6e-37 KECSGEKECGEEKECSGENGFSEENECSRNIQWVIVDEVQDCDGKQLEFIDCFMTGGAKL FAVGDPNQVIYSWRGSAFNLFYTLKTRYQAAELSLPVNYRSSNEILEAARCFQQNGGALE GTRGGGEKIVVKNHYNSFQEACYLADRICELQKQGIPFDEIAIFYRLQSQSKVMEDVFSK NEIPYEVAVKTGIHEIPALEWLLRLLRFLADPADRAAAVFVLSSREYGVGISAKKAEKLA EESCAWVCGRKEAQENGSEKAAAPGLFERMVSFARNGAEIEDTNGFYEYFDLDCSLKPNA SSYGEDKANVCHFLDRMFAFMKAKGFSLPEGLREFLNSAALYGTSRELLPETDGAGENRV KMMTLHASKGLEFSHVFIIGVNYGLIPLRTKGLDSEDEERRLFFVGITRARNYLELSWYT NPDIFGVMAGESRYIRMIPSELVKGQESEMRTGTANLQELRRQVQMAKENGEKMADREDV AKTDKMEEWVDAGELKEMEAETPYMESVSVSESASASAEIRVRHETYGYGKVVEEDDTMI TVEFDGYGRKEFIKAFSCLEMLSEQ >gi|229783903|gb|GG667832.1| GENE 2 1713 - 2657 885 314 aa, chain + ## HITS:1 COG:CAC0183 KEGG:ns NR:ns ## COG: CAC0183 COG2971 # Protein_GI_number: 15893476 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted N-acetylglucosamine kinase # Organism: Clostridium acetobutylicum # 1 236 1 237 306 119 30.0 1e-26 MNYYAGMDIGGTNGRLKLEDGDGNILGKFTGPGCSINTDGYDKSRSRCRELVLPALKELG LNPTDCMGICVAASGIDSPGYEKMCRDIFEEMGFKSDCIRAVNDCEVFLYLKGGPSMVVV SGTGSICYGIDAKGTITRTGGWNHILSDEGSAFYLGLQTMRLAAEDLDERTKCPYLTRRF METSGLDTLEKIDLFVNEHLFDKPEIARFSMIAYEAAKQGDSQAEGILKDCAEKLWKLIE DTARKAGWTRQSQVFPDSGKETRHLWLWGSVLVKNEIIRDEVVKRVRQAMPDIIVAVPET SALDLALMLAGKIR >gi|229783903|gb|GG667832.1| GENE 3 2691 - 3728 712 345 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 1 344 1 330 339 278 43 6e-75 MEELRTQFLNQIDNLLQNARNSVKTAVNLTMVYTYYDIGRMIIEEEQNGVNRAEYGKYIL NQLSVYLTDKFGKGFSVTNLKQMRQFYMIYSHDKIGQTLSDQSKELSSVEPKRKFVLSWS 
HYLQLMRITNANERHFYEIEAMKNGWSLSELKRQFNSSLYERLALSKDKKAVARLAAEGQ VVESPQDLIKDPYILEFLGVPELPEYSETELESKIIDNLQKFLLELGTGFAFVGRQVRFT FDEEHFRVDLVFFNRLLRCFVLFDLKIGELKHQDIGQMQMYVNYYDRKVKLPEENQTIGI ILCKDKKQSIVEMTLPESNNQIFASKYETILPSKEDLKRLLEQQS >gi|229783903|gb|GG667832.1| GENE 4 3845 - 6208 1899 787 aa, chain - ## HITS:1 COG:no KEGG:CLL_A1507 NR:ns ## KEGG: CLL_A1507 # Name: not_defined # Def: putative lipoprotein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 45 780 33 765 768 678 43.0 0 MKYLLSRKPSLLLPLALVLTGCASNSQNPEPSSAAVPVETEVFESQYTGLPFDFEAEPEE FTLTFEVEQTLITAAAKGERRTVLDYGKVGDTVSWRYPDEQISVSLTPEKDYLSVEITSE NENDNSFTWPVISADTYYFPFGEGKRVPADDPAWSDYLNGGEFQVMEQLSMPFWISSAGS YNLLFVMEDPYRTQMNFTSDSPIAFSVSHQYPAIDNNRTNRFRIYITDKDPVSAAKLYRN YVIEQGQFTTLEQKAETNPNIRKLYGAPFIYLWGDFAISADDINWQALRMSLSSPAMEYF LSFAGRLENGQEFTAVLKEIQGQDYVAEYQKNVICSYISQLLKRDDFWEPSVFTQSSSEL EVLLAKGYDNLSETERIQVHKYAVAANLPEVFADAGQWMDSQTVTLLQEMKEGGIDHAWI GLNSWEQAYMKPELVETAVSQDYLIASYDSYHSIHEPDREQWITAQFDDQSLYENAVIRD HNGKPESGFQNVGRKLNPALSLPVVKNRMETIMSNQLPFNSWFIDCDATGEIYDDYTPSH ITTQEEDLAARLERMSYIRDQYHLVIGSEGGNDFAASDIAYAHGIELKTFAWMDEDMKKN RDSDYYIGKYYNPNGGAAEHFSKRIPVKDQYHTIFVDPKYDTPLFKLVYNDSVITSYHWD WSTFKIKGATADRMIREILYNVPPLYHLDAVEWSDYKDDILRHQTVWSDFSQTAVTKEMT GFEYKSEDGSVQKTVFGGSGDAVVAAVANFGDSSYQYENIEIPAHSVLIEMNGSREVYTP LLDPKHV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:11:35 2011 Seq name: gi|229783902|gb|GG667833.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld226, whole genome shotgun sequence Length of sequence - 7196 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 176 - 817 734 ## COG3153 Predicted acetyltransferase - Term 904 - 939 0.2 2 2 Tu 1 . - CDS 970 - 1569 178 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 - Prom 1610 - 1669 4.9 3 3 Tu 1 . 
- CDS 1684 - 2142 645 ## COG1522 Transcriptional regulators - Prom 2180 - 2239 7.3 + Prom 2200 - 2259 5.8 4 4 Tu 1 . + CDS 2302 - 3633 1136 ## COG0534 Na+-driven multidrug efflux pump 5 5 Tu 1 . - CDS 3630 - 6188 1883 ## COG0474 Cation transport ATPase - Prom 6373 - 6432 6.2 - Term 6447 - 6499 7.1 6 6 Op 1 . - CDS 6547 - 6801 330 ## COG3070 Regulator of competence-specific genes 7 6 Op 2 . - CDS 6815 - 7195 501 ## COG2195 Di- and tripeptidases Predicted protein(s) >gi|229783902|gb|GG667833.1| GENE 1 176 - 817 734 213 aa, chain - ## HITS:1 COG:MA1701 KEGG:ns NR:ns ## COG: MA1701 COG3153 # Protein_GI_number: 20090553 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 4 201 17 214 217 138 38.0 8e-33 MSTIMIRSEKETDYSVVEEITRKAFYNIYVPGATEHYLVHIMRQHEDFIPELDFVIELDG RVIGNIMYTKARLIDEAGTEKEILTFGPVSIDPEYQRAGYGKLLLEHSFEQAARLGYDVI VIFGSPMNYVSRGFKSCKKYHICIENGKYPAAMMVKELTPHALDGRKWFYYDSPVMAVSE EEAQKYDDTLEKLEKKFLPSQEEFYIMSHSFIE >gi|229783902|gb|GG667833.1| GENE 2 970 - 1569 178 199 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 15 191 5 174 190 73 30 5e-13 MRQKNGVIPQRRELTKDMAVCGLFASLIAVGAFIKIVIPVGMDTMNFTLQWLFVLLAGLL LGSKRAFYSVTTYLVIGLVGFPVFARGGGPAYLIRPTFGFLLGFALAAYAMGKVCEILHS SKIRTWVLAAAIGYVIYYGTGILYFYFITHFVVVTPNTVGWAAIFAVYCLPTMFPDGLLC ILAIMLAARLRPVVSQMLS >gi|229783902|gb|GG667833.1| GENE 3 1684 - 2142 645 152 aa, chain - ## HITS:1 COG:BS_ywrC KEGG:ns NR:ns ## COG: BS_ywrC COG1522 # Protein_GI_number: 16080664 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 6 152 12 158 158 117 40.0 6e-27 MFLEGLDELDQKIVQLLIKNARMSYSEIGSLVGISRVAVKMRVQSLEKRGVIEEYTTIIN PQKISGAVSCYFEIETKPDTLGQVTETLRQNETITQIYRVTGKNKLHVHAVAASGEEMER LIQEVIDPLPGVVSSSCNMILSRIKDIKGLRL >gi|229783902|gb|GG667833.1| GENE 4 2302 - 3633 1136 443 aa, chain + ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 
COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 431 1 429 447 253 38.0 7e-67 MKGDLTRGPVMKTMLIFAVPMILGNLLQQCYNVADTLIVGRFLGPDALAAVGSSFTLMTF LTSILLGLCMGSSAVFSIRFGQKDFDGLKDSIFTSFLLIAAICAVLNLAAFLSIGWILLL LKVPEEIQGLMRDYLMVIFSGITAVFLYNYFSCLLRAVGNSVIPLVFLAVSAVMNIGLDL WFVLGLKRGVAGATEATVISQYVSGIGIAVYSWFACPWLRIGREHCRIRLSCLKEIAGFS VLTCVQQSVMNLGILMVQGLVNSFGTTVMAAFAVAVKIDSFAYMPVQDFGNAFSSFIAQN FGAGRKDRIHAGFRGAVLVSLLFCLIISAVIWMFARPLMLLFVNEQETAIIAEGIRYLHI EGAFYCGIGCLFLLYGLYRALGRPGMSVVLTVISLGTRVALAYILSSIPEIGVVGIWWSV PIGWILADAAGFVYFIRRRRLLL >gi|229783902|gb|GG667833.1| GENE 5 3630 - 6188 1883 852 aa, chain - ## HITS:1 COG:MTH1001 KEGG:ns NR:ns ## COG: MTH1001 COG0474 # Protein_GI_number: 15679019 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Methanothermobacter thermautotrophicus # 10 817 23 807 844 423 34.0 1e-118 MDTGKNGIPEGLTTAGAQKLLEQYGRNELTPQKKESFFKKALCVMAEPVFLLLMVASVIY FILGEPRDGAVMLLFVTGMITIDVAQEWKTDRTLRALKNLSAPRITVLRDGTRKEIVSTE LVPGDIMVLCEGVKIPADGVILSSSDFCVDESSLTGEPIRVWKHARPSDTVPARAEEEGE GYWKKDFCYAGTLVIQGNALVLVEQTGTATEYGKAGSRVLQAEKVLTPLQKQTKSLVTVC AGIAALLFLLVSAFTYGNLSGYPFQERLIKSILSGVTLAMAMIPEEFPVVLTVFLSMGAW RLARKHSLVRRLPSVETLGAVSVLCVDKTGTITKNHMEVKELWPWKTDSDGLARLMGLAC ETEAYDPMEKAMLEFCDQSGFPKEALFSGILLKEYAFTSETRRMGHVWELEEKRIIAAKG SPEAVLSLCTISAEDRKTVEEKIEELSGRGLRVIAMASAELSPNEKIPESLEECSLVFSG LSGLYDPPREGIKEDIALCWKAGIRIVMITGDSGFTAAAIAAQTGIDCEGGSITGEMLDR MTDEELRQAVTSSSIFARVIPEHKMRIVQALKDNGEIAAMTGDGVNDAPALKYADIGIAM GKRGSEVSREAADLILMDDNFHTIVETVRDGRRIYDNIRKAVGYIFTIHIPIALASLVAS VLGIVPQDLMFLPVHIVLMELMIDPTCSVILERQPAESCVMCRPPRKRSEKLITLSSLTK SVIQGLVLFAASFGSYYLMLRRGEAAPQARSMGLAIVILGNLFLVLVNSSECDSAWRSAR KLARDRVMLLAAALTLLLLGASLYSPVHSFLKLAPLTGIRLLTAFFTSAASVFWYELVKA AGKQKHGRKQPV >gi|229783902|gb|GG667833.1| GENE 6 6547 - 6801 330 84 aa, chain - ## HITS:1 COG:lin2889 KEGG:ns NR:ns ## COG: lin2889 COG3070 # Protein_GI_number: 16801949 # 
Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Listeria innocua # 1 83 1 83 83 69 48.0 2e-12 MGELAKLPNIGKVLEEQLNQAGITTYEELKEIGSRQAWLKIKAMDDTACLHRLYSLEGAI LGIKKAELTTETKQDLKDFFHSFK >gi|229783902|gb|GG667833.1| GENE 7 6815 - 7195 501 126 aa, chain - ## HITS:1 COG:CAC0476 KEGG:ns NR:ns ## COG: CAC0476 COG2195 # Protein_GI_number: 15893767 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Clostridium acetobutylicum # 1 126 283 408 408 149 56.0 8e-37 DFDEKKFAEKKEFITRVAAYLNERYGKGTVELELKESYRNMKAIIEPHIYLIDIAKAAME ETGIEPIVTPIRGGTDGARLSYMGLPCPNLCTGGHNFHGKYEFIPVQSMEKVVELLLKIV EKFAER Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:11:37 2011 Seq name: gi|229783901|gb|GG667834.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld227, whole genome shotgun sequence Length of sequence - 8378 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1398 1038 ## Closa_2607 Cna B domain protein 2 2 Tu 1 . - CDS 2342 - 4354 1385 ## BVU_2161 putative glycoside hydrolase - Prom 4426 - 4485 5.1 3 3 Op 1 . - CDS 4984 - 6459 625 ## Dfer_0637 hypothetical protein 4 3 Op 2 . - CDS 6492 - 6926 332 ## BF3495 hypothetical protein - Prom 7089 - 7148 3.1 5 4 Tu 1 . + CDS 7001 - 7174 69 ## gi|266624307|ref|ZP_06117242.1| putative AraC/XylS-type transcriptional regulator - Term 7215 - 7257 6.5 6 5 Tu 1 . 
- CDS 7406 - 8314 509 ## COG3533 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|229783901|gb|GG667834.1| GENE 1 3 - 1398 1038 465 aa, chain - ## HITS:1 COG:no KEGG:Closa_2607 NR:ns ## KEGG: Closa_2607 # Name: not_defined # Def: Cna B domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 153 398 968 1200 2845 114 35.0 1e-23 MAPGDTFFEVESLPYNLSPDSKLDGFWWEDKGVSGKWMKVIPKGIGDWVEFDLDMPASDY EYEVIAGYRSANNRGQIQLAVNGEECGDPVDMYASGSEFTEKSFGSVKFDSDGTKKMRFT VTGKNSKSSGYVIGLDYIRLVPQVPAAERTAVVNGELPELPKTLEVMEQGKIVEKNVEWD LDPEDFKENYSVVTVNGKVEDSDVVAQAKVEVVPKNLVYFIDCNSEDSESYKAVEKLGLD LLNDRADQSWKNSDDKWGITDTYDGKKSDGDGDKFHDGYYGKNKAGKENGYAYKLTLEAG EYDITVQSHEWWNDSRSSNFEAVYEDKDGKKVSVMIAENVKVGAGNSIDALRTGRLTVDH ETEVTLNIYANSEKGAVITFLGVAYANVADKTDLKLLLEKAKEEVQKTGSYTAESIEILK KAIDEAKKIMDEDQTDQESIDLRKAELQKALDALTVQDAEYYTDT >gi|229783901|gb|GG667834.1| GENE 2 2342 - 4354 1385 670 aa, chain - ## HITS:1 COG:no KEGG:BVU_2161 NR:ns ## KEGG: BVU_2161 # Name: not_defined # Def: putative glycoside hydrolase # Organism: B.vulgatus # Pathway: not_defined # 275 667 108 469 657 108 26.0 1e-21 MAVSGLCLTMMTIPASGMPLASDNGRPVSWVDNSRDTYERQTLMVDGKPFFFNGVQIRID KVKDFYHYSDEQIQNLFQIARDDGFTVANAQLRWMDIQPDQFYDASETTYIRNGEYADQN FSDEKSMLSGYSDTPDEQSLTYLKFDFAELAGADWAGSKLRVWVNSAEEANSLRLYGIED DSWDSSTMTWNSGAPNHNGYEITGEGVIDLGLTPSYDPVNKAQYYDFDVTDFINEYCDED KKASFILRTDKDTKNMVGIDGKEGEKAAPQLVISRDDVYNWDYLDKVIGWAEEAGIKLEL LWFSTDTVNSTIDNRVPYYVFQGYQKSLKTDGTPFFEKKTDPVYGTYWFLMCKNDPELRA KEKEAMKAMFDHIAEYNRENGNLNTVVGCQVANEPAVGRLHQTKYGEHCYCDTCMKKKGT MSDQEFRDLTMWEYTNNLAAGVKESDYSVWTRVNNYTGTDAVGVTYNEKMRKQGGTNVDF IGPDPYGADTQLCYDFGHKNTWLGPYDQGDNLTMIMENSGSKSGTPNWILAALAGGSFYN VYDLCSPDGNGLYDNKNGIPVPHGEYVEDMRNINHMLNKISFDLATKVADGDGGKKLVFL NQLGKSTSAEKKVRNIEVSYVSPEKGVGIVVDHGEKEIILESTRDAKYTLKGLAGYGVSS LEEGYYDVAS >gi|229783901|gb|GG667834.1| GENE 3 4984 - 6459 625 491 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0637 NR:ns ## KEGG: Dfer_0637 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: 
not_defined # 6 490 32 513 533 455 46.0 1e-126 MILDRRLEEYIQKFNNEDEEVYKQEIDNEHALGWLKDNVPLFECPDPVIEEIYYFRWWTY RKHIKKIPEGYLITEFLPEVPWSGKYNSINCAAGFHIREGRWLRNGQQIMEDTIRFWLQG SGDIRSYSTWIADALWDYCCVLDDYSLGTELLDSLTDNFEGWIKEHGTETGLFWSIDDRD AMEFSISGSGFRPTLNSYLYADARAISQFAEKAGRKSLAEEYAERAENIKKQMQKTLWSG DFFQVIPAEIDPKPGRFLNCKTLDFEKIPMEHNVRELIGYIPWYFNLPDPGYEKAFAYLM DEKHFLGKYGFCTADQAHPRYKYEVPHECLWNGPVWPFATSQALVAMANLLRNYHQEIVT EKDYHRALKIYAESQHRITESGKKIPWIDEDSDPETGEWISREILKKDGWKAEKGGYERG KDYNHSMYCDLIITGLLGLDPQQISLAVKPMIPPEWDYVLLEGVRVGKKEYTVLYDRDGT RYGKGKGLMIY >gi|229783901|gb|GG667834.1| GENE 4 6492 - 6926 332 144 aa, chain - ## HITS:1 COG:no KEGG:BF3495 NR:ns ## KEGG: BF3495 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 144 184 328 328 176 60.0 2e-43 MWYSAGEQYEPNAIGYAESSDGISWKKYAENPIFQADPKIEWEKHKAAGCQVFQKDGYFY MFYIGYHDEDYAQIGMARSKDGIRNWERSELNPIIAPDEGFDKSACYKPFTIFWDGKWML WYNGRNGAPEQIGLAFHEGRDFEF >gi|229783901|gb|GG667834.1| GENE 5 7001 - 7174 69 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624307|ref|ZP_06117242.1| ## NR: gi|266624307|ref|ZP_06117242.1| putative AraC/XylS-type transcriptional regulator [Clostridium hathewayi DSM 13479] putative AraC/XylS-type transcriptional regulator [Clostridium hathewayi DSM 13479] # 1 57 240 296 296 94 77.0 3e-18 MSEASYYLSETTLDVKEIAQKLGFSDSHNFMKVYKKETGMTPSEYRNSFPNRLNYDS >gi|229783901|gb|GG667834.1| GENE 6 7406 - 8314 509 302 aa, chain - ## HITS:1 COG:ECs4459 KEGG:ns NR:ns ## COG: ECs4459 COG3533 # Protein_GI_number: 15833713 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 301 348 651 656 271 42.0 1e-72 MLQIRPDAQYADVMERVLYNGVLSGMALDGKSFFYVNPLEVVPEACHRDERKSHVKPVRQ KWFGCACCPPNVARLLSSVGSYAYTEKEDTIFIHLYIGAILKKQINGKEMEVKIQSEFPW NGKVNVYVKGVREVCTIAFHIPEWGEAYQLSKINGATIKVKERYLYVTKKWEEEEEIHLQ FPMEVRLIEANPFVRENIGKNAVMRGPLVYCLEEVDNGSSLHLLSIVKDAEVKTMYRDIA GVTMVCVELSGRKQVAKLKENTPLYYDADDKRGEQIQLQYIPYYAWANRGENEMQVWTRR ET 
Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:12:12 2011 Seq name: gi|229783900|gb|GG667835.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld228, whole genome shotgun sequence Length of sequence - 7451 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 4, operones - 1 average op.length - 8.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 196 - 255 3.8 1 1 Tu 1 . + CDS 275 - 343 82 ## 2 2 Op 1 . - CDS 587 - 748 56 ## gi|266625002|ref|ZP_06117937.1| cystathionine beta-lyase 3 2 Op 2 . - CDS 777 - 998 150 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases 4 2 Op 3 . - CDS 1075 - 1707 323 ## COG2755 Lysophospholipase L1 and related esterases 5 2 Op 4 16/0.000 - CDS 1708 - 2601 893 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 6 2 Op 5 34/0.000 - CDS 2626 - 3381 245 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 7 2 Op 6 17/0.000 - CDS 3386 - 4060 562 ## COG0765 ABC-type amino acid transport system, permease component 8 2 Op 7 . - CDS 4041 - 4700 391 ## COG0765 ABC-type amino acid transport system, permease component 9 2 Op 8 . - CDS 4743 - 6035 1133 ## COG2873 O-acetylhomoserine sulfhydrylase + Prom 6378 - 6437 6.3 10 3 Tu 1 . + CDS 6463 - 6858 289 ## COG1959 Predicted transcriptional regulator + Term 6952 - 7002 16.1 - Term 6937 - 6991 16.3 11 4 Tu 1 . 
- CDS 7047 - 7451 217 ## COG0491 Zn-dependent hydrolases, including glyoxylases Predicted protein(s) >gi|229783900|gb|GG667835.1| GENE 1 275 - 343 82 22 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFSENLGNLCKAKSISQEVMAE >gi|229783900|gb|GG667835.1| GENE 2 587 - 748 56 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625002|ref|ZP_06117937.1| ## NR: gi|266625002|ref|ZP_06117937.1| cystathionine beta-lyase [Clostridium hathewayi DSM 13479] cystathionine beta-lyase [Clostridium hathewayi DSM 13479] # 1 53 1 53 53 100 100.0 4e-20 MVDALTKFINGHGDAQEGAILTNNLETMDRIRYEAHRLTLAPSSAHLIHEDFP >gi|229783900|gb|GG667835.1| GENE 3 777 - 998 150 73 aa, chain - ## HITS:1 COG:BS_yjcJ KEGG:ns NR:ns ## COG: BS_yjcJ COG0626 # Protein_GI_number: 16078253 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Bacillus subtilis # 4 62 107 165 390 60 40.0 8e-10 MTVYRMFHELFKKNNVDTVMADMTDLEAVKKAIIPGTRLVHIETPDNPTVGITDIEAIAK IACSQLITPLLPR >gi|229783900|gb|GG667835.1| GENE 4 1075 - 1707 323 210 aa, chain - ## HITS:1 COG:RSp1245 KEGG:ns NR:ns ## COG: RSp1245 COG2755 # Protein_GI_number: 17549466 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Ralstonia solanacearum # 2 197 1 200 213 108 31.0 9e-24 MMKNILVFGDSNTWGLIPGTHERYPWGVRWTSILQEKERCNNTRIIEEGLCGRTTIFEDE LRPGRKGLDMLPVLLETHNPLDAAIIMLGTNDCKILYRAPAHVIGKGIVLCLDELLKVIP ADRILLVSPLLLGENVWKPEKDPEFDQASVLICKQLKEEYRKIASAKGTAFIAASDYAAA SSIDDEHLTEDGHSALADILYNKLIEMKVI >gi|229783900|gb|GG667835.1| GENE 5 1708 - 2601 893 297 aa, chain - ## HITS:1 COG:Cj0982c KEGG:ns NR:ns ## COG: Cj0982c COG0834 # Protein_GI_number: 15792309 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Campylobacter jejuni # 8 295 1 278 279 226 44.0 4e-59 
MEIIMKKIRKLVTVTLTLSLAFSLAACGSENSSNNNVTANSRKTQNVVYRTLDEIKESGT INIGVFSDKNPFGYVDENGEYQGYDVYFANRIGEDLGVKINYISTEAANRIEYLQTGKAD IILANFTVTPQRAEEVDFALPYMNVALGVVSPDSRVIESLDNWNAEDPIIVISGTTAETY LIENYPDIPLQKYDSYATAKNALENGNGAAWANDNTEVIAFALQNAGYTVGIPSLGSQDS IAPAVSKGNSTLLDWLNEEIKALGEEQFFHKDYEATLVDTYGLDYEDSLVVEGGKTK >gi|229783900|gb|GG667835.1| GENE 6 2626 - 3381 245 251 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 220 7 223 318 99 30 1e-20 MSESVLTIEHLTKQFDNSTVLNDLSLQIHEGEVVVIVGPSGCGKSTLLRCINALEDTQNG EIRLHGELVTKRSRNLAGIRQKIGMVFQSYELFPHLNVLDNILLAPTKVQKRTKDEVKKE ALALLSRVGLSEKAKSFPRQLSGGQKQRVAIVRALCMHPEILLFDEVTAALDPEMVREVL EVILNLADQGRTMIIVTHEMQFAKAVADRIIFLDDGNIKEEGTPEQFFDAPKTDRAKKFL QTFTFERKGSK >gi|229783900|gb|GG667835.1| GENE 7 3386 - 4060 562 224 aa, chain - ## HITS:1 COG:SP0710 KEGG:ns NR:ns ## COG: SP0710 COG0765 # Protein_GI_number: 15900608 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 1 224 1 224 225 240 55.0 2e-63 MPDLGLEVLLKGKNLARILGGLDAALKISMISVAISIPLGIILGILMTWKNPICRAILRC YLEFVRIMPQMVLLFLVYFGTTRAFGWDLSGELASVIVFTVWGAAEMSDLVRGSLISIPI HQYESAEALGLSRIQSYQYIVLPQAVRRLIPLSINLITRMIKTTSLILMIGVVEVLKVAQ QIIEANRMGSPNAAFGIYLTVFFLYFIACWPISLLAKYLEKRWR >gi|229783900|gb|GG667835.1| GENE 8 4041 - 4700 391 219 aa, chain - ## HITS:1 COG:SP0711 KEGG:ns NR:ns ## COG: SP0711 COG0765 # Protein_GI_number: 15900609 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 23 219 3 199 206 204 55.0 1e-52 MNWEIIGKYLPLYQKAAVLTVKIGFMGIAAAIMIGLLCTIIQYYKIPVIRRIVSVYIEIS RNTPLMVQLFFIYYGLPKIGIQTDAAACGVAGLAFLGGSYMSEAFRSGIEAVAPIQEESA YSLGLSKLQTFSYVILPQAFSISVPAFMANVIFLLKETSVFSAISLMDLMFTAKDLIGLY YKTAESLFLLVVFYLIILLPVSVLGSLLERRSRYAGFGS >gi|229783900|gb|GG667835.1| GENE 9 4743 - 6035 1133 430 aa, chain - ## 
HITS:1 COG:L75975 KEGG:ns NR:ns ## COG: L75975 COG2873 # Protein_GI_number: 15672055 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Lactococcus lactis # 6 430 4 426 426 477 57.0 1e-134 MSKYKNEYKFETLQLHVGQEQADPATGSRAVPIYQTTSYVFHDSAHAAARFGLADAGNIY GRLTNSTQDVLEKRLAALEGGAAALATASGAAAITYTIQALAQAGDHIVAQKTIYGGSYN LLAHTLPQFGVTTTFVDAHNLEELTAAIRPNTKAVYLETLGNPNSDIPDIDTIAEIAHAH GLPLVIDNTFGTPYLIRPIEHGADIVVHSATKFIGGHGTTLGGIIVDSGKFDWKASGKYD PIAKPNPSYHGVSFVDAAGPAAFVTYIRAILLRDTGATLSPFNAFLLLQGVETLSLRLER HAENTKKVVEYLSNHSQVEKVNHPSLPDHPDHALYEKYFPDGGASIFTFEIKGGQEEAWR FIDNLKIFSLLANVADVKSLVIHPASTTHSQLSREELLDQGIKPNTIRLSIGTEHIDDIL ADLENGFASL >gi|229783900|gb|GG667835.1| GENE 10 6463 - 6858 289 131 aa, chain + ## HITS:1 COG:DR2094 KEGG:ns NR:ns ## COG: DR2094 COG1959 # Protein_GI_number: 15807088 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Deinococcus radiodurans # 2 131 48 175 197 95 37.0 3e-20 MISTKGRYALRVMIDLAEHGGEGYIPLKEVAERQEISKKYLEIIVKELIKGKMLIGLGGR GGGYKLCRKPEEYKVGEILELMEGSLATVACLSKDAKPCSRAAICKTLPLWSEFDKMVHD YFYEKTLADLL >gi|229783900|gb|GG667835.1| GENE 11 7047 - 7451 217 134 aa, chain - ## HITS:1 COG:TM0607 KEGG:ns NR:ns ## COG: TM0607 COG0491 # Protein_GI_number: 15643373 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Thermotoga maritima # 2 132 150 281 282 91 34.0 4e-19 VADCLSSMETLDKYQIGFVYDIAAYIKTLEMVQTLNAEMFVPSHAEATNDIAPLAQYNID KVLEIAKKITDLCKDPLCFEMILQKLFTDYSLTMNFEQYVLVGSTVRSYLAWLNDTNRIN ACFENNMLLWKKNN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:12:24 2011 Seq name: gi|229783899|gb|GG667836.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld229, whole genome shotgun sequence Length of sequence - 9305 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 7, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 
1 Tu 1 . - CDS 1 - 166 133 ## gi|288871531|ref|ZP_06410215.1| NLP/P60 family protein - Prom 227 - 286 6.8 2 2 Op 1 . + CDS 604 - 1098 422 ## PROTEIN SUPPORTED gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 3 2 Op 2 . + CDS 1126 - 1323 241 ## PROTEIN SUPPORTED gi|228581564|ref|YP_002852265.1| ribosomal protein L35 4 3 Tu 1 . - CDS 2292 - 3119 403 ## Closa_1701 hypothetical protein - Prom 3286 - 3345 5.3 + Prom 3095 - 3154 8.0 5 4 Op 1 . + CDS 3394 - 3888 438 ## gi|266625016|ref|ZP_06117951.1| conserved hypothetical protein 6 4 Op 2 . + CDS 3900 - 4112 216 ## PROTEIN SUPPORTED gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 + Term 4338 - 4372 1.0 7 5 Tu 1 . - CDS 4376 - 4651 86 ## - Prom 4673 - 4732 2.4 8 6 Tu 1 . + CDS 4683 - 6011 375 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - TRNA 6988 - 7060 76.0 # Asn GTT 0 0 9 7 Op 1 17/0.000 + CDS 7236 - 8531 1305 ## COG0168 Trk-type K+ transport systems, membrane components 10 7 Op 2 . + CDS 8545 - 9225 682 ## COG0569 K+ transport systems, NAD-binding component Predicted protein(s) >gi|229783899|gb|GG667836.1| GENE 1 1 - 166 133 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871531|ref|ZP_06410215.1| ## NR: gi|288871531|ref|ZP_06410215.1| NLP/P60 family protein [Clostridium hathewayi DSM 13479] NLP/P60 family protein [Clostridium hathewayi DSM 13479] # 1 55 11 65 65 64 98.0 3e-09 MNKKIILFSALLLTAALSACSSRQASPVPADDPAATVEGVTEEPMAEEPMAASQN >gi|229783899|gb|GG667836.1| GENE 2 604 - 1098 422 164 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 [Haemophilus parasuis 29755] # 9 164 2 158 159 167 51 3e-41 MINEQIRDKEVRLIGEDGEQLGVMPARDALQMAKEAELDLVKIAPTAKPPVCKIIDYGKY RYELARKEKEAKKKQKVIEVKEVRLSPNIDTNDLNTKMGAARKFLEKGDKVKVTLRFRGR EMAHMSKSRYILEDFAKELADIAVIDKPSKVEGRSMVMFLTPKR >gi|229783899|gb|GG667836.1| GENE 3 1126 - 1323 241 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228581564|ref|YP_002852265.1| ribosomal protein L35 [Clostridium 
sp. 7_2_43FAA] # 1 65 1 65 65 97 73 3e-20 MPKLKTSRAAAKRFKKTGTGKLVRNKAYKSHILTKKSTKRKRNLRKDIVTDATNAKVMKK ILPYL >gi|229783899|gb|GG667836.1| GENE 4 2292 - 3119 403 275 aa, chain - ## HITS:1 COG:no KEGG:Closa_1701 NR:ns ## KEGG: Closa_1701 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 58 275 59 291 291 140 37.0 9e-32 MTQFETLPLSAAQADDEPVDPGFLVPDYNTVIPPFDEDIDPDFSVMPPIYPDRPYPPDRP FPDRPIYPPFIPTPMPCLFCNNNQWARGSIRFLNAATGYNAFTIYIDNRPVYSNLEFPEL TRYQQITQGYHRFSIVANGYTYLQKSLYVGDGMATIAIINSDTGLDLTSIADTTCPTPNA NACFRVCNLAYYSGPVNAVIGNVYFNSVNFKQAASFSSMMSGTYTVNVARSARPQVPLVT TTITMRPGRIYTLYVLNWNTSPDTIQTLLVEDRRY >gi|229783899|gb|GG667836.1| GENE 5 3394 - 3888 438 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625016|ref|ZP_06117951.1| ## NR: gi|266625016|ref|ZP_06117951.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 164 1 164 164 278 100.0 7e-74 MKENDFNLDTTVREKKVFLKKIRIVFTAALFLTGVLSFLGLMMSAWGIKNRIWLQMGLAD FVWNMVLTAVMILVFVSLVKIAADEQPFSRTLTWSIRMIGILLLTASFVIPRLEGDQSSG YEFLSRGSFVLIDAAIFIPGLLFVILGNIIMEGFSMQKEMDEIL >gi|229783899|gb|GG667836.1| GENE 6 3900 - 4112 216 70 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 [Kordia algicida OT-1] # 1 65 1 65 67 87 66 3e-17 MAIILRLDRVMADRKISLNDLSKKVGISTVNLSNLKTGKVKAIRFSTLDAICRELNCQPG DILEYTGDAD >gi|229783899|gb|GG667836.1| GENE 7 4376 - 4651 86 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLPSSAKLSAFCSLPQVRAPPSAVSFIYGAFSVVVHPLITIRIGICRIACHSGPPGPPF LTDRIIVRLIKKDFPAAGDRQHKSCSCSPDI >gi|229783899|gb|GG667836.1| GENE 8 4683 - 6011 375 443 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 18 439 10 426 447 149 28 1e-35 MKSNAEISINNIYRLDGRVPLGKAIPFGLQHVLAMFVANLTPIVIVTAAAGLETSQTAAL IQNAMFIAGVATLIQLYPVWRIGAKLPIVMGVSFTFVTILSSVGAKYGYPAILGAILIGG 
LVEGTLGLFAKYWRKLISPIVSACVVTSIGFSLLTVGARSFGGGYSDSFGSATNLLIGTV TLLSCILFNVFAKSFWKQLSVLFGLVVGYVLSIFLGVVDLSGLLDGGILSLPRILPFMPE FHAGAILSVVVIFLVSATETIGDTTAMVASGLDRDITEKEISGSLACDGFASSISSLFGC LPVTSFSQNVGLIAMTKVVNRFTIMTGAFCMILSGLFPPIGAFFASLPDAVLGGCTIMMF GSIIISGVQMLARAGFSQRNTIIGSLSLSVGIGFTLLPEIFQIFPSIIQDVFAENCVAVV FVLAILLNLILPENMDIEKAVEA >gi|229783899|gb|GG667836.1| GENE 9 7236 - 8531 1305 431 aa, chain + ## HITS:1 COG:DR1668 KEGG:ns NR:ns ## COG: DR1668 COG0168 # Protein_GI_number: 15806671 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Deinococcus radiodurans # 7 414 69 492 512 252 40.0 1e-66 MTEHLRKLSPGRIVVLGFAFVILTGSLILWLPISANEGVDVSYIDALFTSTSAVCVTGLI AVDTADTFNVFGRTVVALLIQVGGLGVTCVGVGVILLAGKKIGIHGRVLIRDSMNLTTVK GVVRLVEAILFMTLLFEGAGALLSFLVFSKDYPPLDALGISVFHSVAAFNNSGFDILGGL KNLIPYQSNVLLNLTTCGLIIFGGLGFLVIREIWEKHSWRKFSLHTKVVIATTIALLAAG TVLLKMTEDITWLGALFQSTSARTAGFSTYPIGAFSNAGLFVLAVLMFLGASPGSTGGGI KTTTTFVLMKSMFSAATNRHCSAFKRRIPTEVVSQAFLIAILALAVVCVQTFLMCIAEPE LDFMKLLFETVSAFGTVGLSTGITPDLNAGAKLILITTMFIGRLGPLTMATVWSFKPKAA AWYSEESITIG >gi|229783899|gb|GG667836.1| GENE 10 8545 - 9225 682 226 aa, chain + ## HITS:1 COG:BH2663 KEGG:ns NR:ns ## COG: BH2663 COG0569 # Protein_GI_number: 15615226 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Bacillus halodurans # 9 218 4 214 220 154 39.0 1e-37 MKRRNDTIEYGIIGLGRFGTALATALSEAGKEVMVVDNTESKVKQIRDLVSEAFVVEGID RDSFESAGIQNCETVIVCIGEKIDTSILATLTVISMGVPRVISKATSAEHGCVLEKIGAE VVYPEKDIAIRLARRLVSPHALDFISLNDDIAVSEIRLTSVLEGRSVMEAGIRNRFGLNI VAMERERDTTIDIQPTYRFRKEDVIVVIGKRDHINKFERYLSEEQR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:12:52 2011 Seq name: gi|229783898|gb|GG667837.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld230, whole genome shotgun sequence Length of sequence - 6626 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, 
operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 53 - 1141 1195 ## Closa_2156 S-layer domain protein 2 1 Op 2 24/0.000 + CDS 1155 - 2783 1286 ## COG0845 Membrane-fusion protein 3 1 Op 3 36/0.000 + CDS 2803 - 3507 325 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 1 Op 4 . + CDS 3491 - 4669 371 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 + Term 4815 - 4856 5.1 + Prom 4735 - 4794 7.2 5 2 Op 1 32/0.000 + CDS 4870 - 5337 641 ## COG0779 Uncharacterized protein conserved in bacteria 6 2 Op 2 . + CDS 5372 - 6577 744 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA Predicted protein(s) >gi|229783898|gb|GG667837.1| GENE 1 53 - 1141 1195 362 aa, chain + ## HITS:1 COG:no KEGG:Closa_2156 NR:ns ## KEGG: Closa_2156 # Name: not_defined # Def: S-layer domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 362 1 363 363 363 53.0 7e-99 MNRRLFAASTAACIMVAAVYPTTSLAAVNMDLKKKVVGMAGIMNVTNTEKNVTRAEYARM VVLASPYGSSVPPEGSSSVFADVGKDHACASYIKTAVEKGYMTGYLGGVFKPDQNVTLQE AVRGILALLGYKDEDFAGSQAGGRISQYRFLKLDRNVNREAAELLSRGDCINLFYNLLKT KQKDGGDIYGKLFGCELTSDGEINPLKMADNGLKGPRLVRSKRSLSSYIPFKLDKANVFI NGESSTVSTLKDAVESGGAVLLYYHPGSKSIWAYTEDSSDSRRGIVRGTVSNIYYTSVDV MSPSAVTLEESGDQYQLASSEMQFAFSMYGNVRVGDTVTLVYEKTVKEDGTETYTVLDYL ED >gi|229783898|gb|GG667837.1| GENE 2 1155 - 2783 1286 542 aa, chain + ## HITS:1 COG:CAC0318 KEGG:ns NR:ns ## COG: CAC0318 COG0845 # Protein_GI_number: 15893610 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Clostridium acetobutylicum # 321 533 135 367 392 82 30.0 3e-15 MKNRKKKKTVIIAAAAFVLLAAAGTAALRSSGRSAGTGRGTAVMTAPVTRQDISSRLSSS GVLKPRNSYQVSSLAEGEITSADFEVGDQVEKGQVLYRIDVSSMEEEMKAAQNSAERAAR NLEEAQKDYSRAQGRFNDGIFRSGDSGYIKKIYIKAGETVSSGTKIADIEDDSIMRLKVP FLSAEAAAIGTGSTALVTLSSTGEQIPGTVTAVSSKDETLTGGRIVRYVTIEAANPGGLT 
TEQSAAASVGDYSSSLEGTFSPVVETILSADLPDSVTVEQLLVSEGDYAAKGAPIFRFDR ESADKLMRSFREKLDSAKDAMDSANSKIESTKSSMNNYTITAPISGTVIRKNMKAGDKIT SKGSGDGVMAVIYDLSSITFSMSIDELDISKVKTGQTVQITAEAAGDASFRGIVENVSLE SANNNGITTYPVSVTVKDPKGLLPGMNVNGKILLEESKNALVIPAASLMRGNVVYVKDPS VKAAQGEVPAGFREVTVETGIMNDEQVEVTSGLAEGDEVYLVPGAESTAGSENMGEGMTM DL >gi|229783898|gb|GG667837.1| GENE 3 2803 - 3507 325 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 229 1 225 245 129 37 5e-30 MGSLIDLQNIYKIYEMGDEEVRANDGISLSIERGEMTAIVGKSGSGKSTLMNIIGALDVP TSGTYHLEGENVGEMTDNQLADIRNRLIGFIFQQYNLLTRSTLLENVELPLIYAGIDGEE RRERAMASLKRVGLADKYKNLPGQLSGGQQQRVSIARALAGSPSLILADEPTGALDSRTS REVLDFLKQLNEEGNTIVMITHDNSIALEAKRVIRIHDGKINFDGDVSDYAAVI >gi|229783898|gb|GG667837.1| GENE 4 3491 - 4669 371 392 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 2 392 6 413 413 147 27 2e-35 MRQSFKMAIKNIMSNKMRSFLTMLGIIIGVASVIILVGLVEGQMNYIRASFSDMGTNQLQ VQLTNTPTRSVSPQDLYAFYDKNRAYFKEMSPQVSVKATIKHKTDKLTRTRISGVSEQFY GIDAKKLEAGRYIQYSDIASRQSVCVLGYFTATRLFGSAEAAVGESVRFDGNVVKVVGVL KRRDKDELKQGGPDDRALLPYTTALLMNQNADVSDYIITVDKVEEITAARILLTVFLQGR YKSPDYFYISSMSEMLKSMESTMSMMSAALGGIAGISLLVAGVGVMNIMLVSVTERTREI GIRKSLGAKKSVILQQFVIEAAVTSSIGGLVGIVLGCVITPAAGSLMQMKAAATLPAILV SFGVSAAIGLVFGYMPARRAASLNPIDALRSE >gi|229783898|gb|GG667837.1| GENE 5 4870 - 5337 641 155 aa, chain + ## HITS:1 COG:lin1358 KEGG:ns NR:ns ## COG: lin1358 COG0779 # Protein_GI_number: 16800426 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 9 155 6 155 155 121 40.0 4e-28 MTKRETYEKKTEDYLLPLMEEYHFELVDVEYVKEAGNWYLRAYIDKEGGITVDDCEVVSR RLGDWLDEKDFIEDSYILEVSSPGLGRPLKKDKDFDRSIGKDVDIKLYKPMNKQKDYTGT LKAYDKDTVTVTVEDGTELVLNRPEIALIRLAFDF >gi|229783898|gb|GG667837.1| GENE 6 5372 - 6577 744 401 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor 
NusA [Brucella melitensis 16M] # 4 395 9 416 537 291 38 1e-78 MNKELLEALNILEKENNISKDTLLEAIENSLLTACKNHFGKADNVKVSMNHETCDFSVYA EKEVVEEVEDPLLQISLAEAKMTDPKYDIGDIVQCPIDSKKFGRIATQNAKNVILQKIRE EGRKALYNDWYCQEKEVVTGIVQRYLGKNVSINLGKVDAILNETEMVKGEVFKATERIKV YVLEVKDTPKGPRISVSRTHPDLVKRLFESEVAEVKDGTVEIKAIAREAGSRTKIAVKSN DQNVDPVGACVGLNGARVNSIVNELRGEKIDIINWDDNPAYLIENALSPAKVICVVADEE SREAQVIVPDYQLSLAIGKEGQNARLAARLTGYKIDIKSETQAREMGLFEELGLDYQEDG GYDDYESDYQEDGEAEYQEDYQDDYQEDYQDDGLEADSQEK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:13:01 2011 Seq name: gi|229783897|gb|GG667838.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld231, whole genome shotgun sequence Length of sequence - 10296 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 41 - 736 580 ## COG1270 Cobalamin biosynthesis protein CobD/CbiB 2 1 Op 2 . + CDS 789 - 929 200 ## Closa_2381 aminotransferase class I and II + Prom 1831 - 1890 19.6 3 2 Op 1 . + CDS 2108 - 3667 1675 ## COG1492 Cobyric acid synthase 4 2 Op 2 . + CDS 3664 - 4314 814 ## COG2082 Precorrin isomerase 5 2 Op 3 . + CDS 4332 - 5531 1610 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 6 3 Op 1 . + CDS 6478 - 6834 417 ## COG2207 AraC-type DNA-binding domain-containing proteins 7 3 Op 2 . + CDS 6835 - 8628 2040 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 8 3 Op 3 . + CDS 8631 - 9632 1195 ## Closa_2376 ABC-type sugar transport system periplasmic component-like protein 9 3 Op 4 . 
+ CDS 9649 - 10294 672 ## COG1879 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783897|gb|GG667838.1| GENE 1 41 - 736 580 231 aa, chain + ## HITS:1 COG:lin1155 KEGG:ns NR:ns ## COG: lin1155 COG1270 # Protein_GI_number: 16800224 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobD/CbiB # Organism: Listeria innocua # 2 231 106 308 315 185 45.0 6e-47 MKVYDALAAGDVEGARKAVSMIVGRDTASLSDLGITKAAVETVAENTSDGIIAPLFYMVL GGGTGIFLYKAVNTMDSMVGYKNDRYINFGRTAARLDDGCNYIPARLSAFFMIAAGFLLE AAEGAGRRLKKGDEAGIRHYSGTGGIRIFKRDRYNHKSPNSAQTESVCAGVLGIRLAGNA CYFGRLYEKPTIGDADRPVEYADIRRANALMTGTYAAALFVTAVIGFFLFV >gi|229783897|gb|GG667838.1| GENE 2 789 - 929 200 46 aa, chain + ## HITS:1 COG:no KEGG:Closa_2381 NR:ns ## KEGG: Closa_2381 # Name: not_defined # Def: aminotransferase class I and II # Organism: C.saccharolyticum # Pathway: Porphyrin and chlorophyll metabolism [PATH:csh00860] # 1 46 1 46 362 78 71.0 9e-14 MEYQHGGDIYTNQVELDYSANLNPLGLPEGVRTAYIRAADRCSVYP >gi|229783897|gb|GG667838.1| GENE 3 2108 - 3667 1675 519 aa, chain + ## HITS:1 COG:lin1171 KEGG:ns NR:ns ## COG: lin1171 COG1492 # Protein_GI_number: 16800240 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Listeria innocua # 1 503 1 495 511 479 48.0 1e-135 MAKSIMIQGTMSNAGKSLIAAGLCRVFKQDGYRVAPFKSQNMALNSYITEEGLEMGRAQV VQAEAAGIKPQADMNPILLKPTSDSGSQVIVHGVPVGTMPAKEYFVYKKSLVPEVENAYR RLAQEYDIIVIEGAGSPAEINLKQDDIVNMGMAKMADAPVLLVGDIDRGGVFAQLYGTVA LLSEEERARVKGLVVNKFRGDKTILEPGLTMIEDLCGIPVTGVVPYADVDIEDEDSLGHH LKGTEKGKTTLVDIAVIRLPRISNFTDFQIFSCMPEVGLTYVERLSELGKPDLIVLPGTK NTIEDLLWMRENGLEAAVLKLASAGVPVFGICGGFQMLGESLEDPFGAETERYRGRPVEG LKLLPVRTVFGKEKTRTRVTGSCASVGGIFEDLSGMEIEGYEIHMGGTVRAVPPLTYVME QNGSHLAKMDGCQRNHVYGTYLHGFFDREGIAETIVNALLKKKGLDAVPGGEFQYAAYRE QQYDRLAAVIRESFDMKAVYGMMGIDGNSMPDSRKEDSE >gi|229783897|gb|GG667838.1| GENE 4 3664 - 4314 814 216 aa, chain + ## HITS:1 COG:FN0970 KEGG:ns NR:ns ## COG: FN0970 COG2082 # 
Protein_GI_number: 19704305 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin isomerase # Organism: Fusobacterium nucleatum # 10 209 9 210 219 176 45.0 4e-44 MIETELERVLPGDIEKRSFEMITEELGDRILNPEYELVIKRVIHTTADFDYADNLMFSEH AVEQAKAAIRQGVRFITDTNMGKAGINRSALQACGCRVDCFMADEDVAEAAKRLGTTRAC ASMEKAAELGEDCIFAVGNAPTALIRLYELIREGRLKPKLVIGVPVGFVNVVQSKEMILS LPDTPFIVARGRKGGSNVAAAICNALLYQCQSVPEA >gi|229783897|gb|GG667838.1| GENE 5 4332 - 5531 1610 399 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 396 1 381 525 124 23.0 4e-28 MNLYSIILVDDEEEVRKSIIKQIDWESAGFQVVGDAENGEDALEKIEVLEPDVVLTDIRM PYMDGLTLAEKIRQRYPSTKVVIFSGYDDFEYAQKAIKLNVTEYILKPVNVEELTSILKR IKSNLDEEIEEKRNVSRLRENYRKSLPIIREQFFNDMVHRRLADDLIESKLREYDIPITG ARKWIIAAIDVEKSDDRSKKTLSLHEEEELIPISVMQIVREKLKSYCRFSLFQSTAEAGM VVIAALDDDNTTTGLIDVLGDICKETKRILEVPVTIGIGHSVTGLSKIAGSYQSAVEALG YKAVVGSGITIYINDMEPVGSGKLEFDNSDESDFISAVKFGPDEKIEAVMVRISGKLESA RVHYRQQQVYVFGVLNTVIQMIQQYDLNLEEILGGELEY >gi|229783897|gb|GG667838.1| GENE 6 6478 - 6834 417 118 aa, chain + ## HITS:1 COG:BH3443 KEGG:ns NR:ns ## COG: BH3443 COG2207 # Protein_GI_number: 15616005 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 15 118 98 201 207 78 37.0 3e-15 MNQAINQERDMTTRQVIQQAKQYIMDNYQNPDLSVEMICRHLHMSPAYFSTMFKKETGQA YIAYLTEIRLNKAVELLNKTDDKTYVIASKVGYQEQNYFSYVFKKKFGVSPTKFRGAR >gi|229783897|gb|GG667838.1| GENE 7 6835 - 8628 2040 597 aa, chain + ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 19 588 11 563 563 253 29.0 6e-67 
MNKKKKGKIFHNSSIRYTIFTYFTVTALAASLLITFSIYQRLASQVSETVMEENQSLIGQ VARSVENYLRTVMKLSDSLYYGAVKNADLSSESINSEITLLYDNNKDNVDNIALFSQDGT LKEAVPAARIKTNLDVTGDSWFSNALEKTENQHFSNPHVQYIFDGNESQYRWVISLSRAV ELTEGTSTAQGVLLVDLSYSSLAHLFDGVTTGTGGYVYLISNDGEILYHPKIQLIDSGRV VENNLAAAGYKDGNHLEKFEGRERIITVKSIGYTGWKVVGVTPKNVVSLNTIKTRLFIVF LITLILFILTLINSYISSRITNPIKELEKSVGFLEEGNLETPVYIGGSYEIQHLGTSINS MARQIRVLMNDIVTEHEAKRKQEFDMLQSQINPHFLYNTLDIIVWMIENEQKAEAVKVVT ALARFFRISLSKGKSIITVRDELEHVRNYLMIQHMRFKNKFSYYIDASEDCMDLASLKLM LQPLVENAIYHGMEFMDGDGEITLKVWKEEEDLYFSVRDNGLGMTAEQVESLFSDSIHVT SKKGSGIGVKNVNERIKLYFGEEYGLMIHSEPDEGTEITIHLPAVPYSRILTEGGGL >gi|229783897|gb|GG667838.1| GENE 8 8631 - 9632 1195 333 aa, chain + ## HITS:1 COG:no KEGG:Closa_2376 NR:ns ## KEGG: Closa_2376 # Name: not_defined # Def: ABC-type sugar transport system periplasmic component-like protein # Organism: C.saccharolyticum # Pathway: ABC transporters [PATH:csh02010]; Bacterial chemotaxis [PATH:csh02030] # 1 333 1 323 323 403 61.0 1e-111 MTRQEKIIWSLFAGVLILLFLLSSTDLIIKEKKTEIYPVSVIIGETTDDYYTNFRKGVDK AADEYNVDVNFITLYEKGDVTEQMELLKREIDDGAKAVVLVPLKQKECSEILDGMVLASP LIVMGTIFPGDWGMTGISQDCQEAGKMLGEAIAAENTPDKPVFLFSEGLEYGYNRDVYDG LLGVLSRAGFQLHLYEMDKSGITEPDAAENGEIFRRIMEETVYPGGEAVIAALDVKSLDI TADIIAGSPAYGRYLPRLYGFGSTTKILNQMDRGVIKGLVVTNQFDAGYLSIVKAVEAIE KRGDREQIKLESYYIEKEDLKKTHFEKILYPIE >gi|229783897|gb|GG667838.1| GENE 9 9649 - 10294 672 215 aa, chain + ## HITS:1 COG:HI0822 KEGG:ns NR:ns ## COG: HI0822 COG1879 # Protein_GI_number: 16272763 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Haemophilus influenzae # 39 215 46 216 349 140 40.0 2e-33 MKKNYLLFAALAAAFLLAAGCIVFQNRSHEEREEKKTLRVGVTLYRGDDSFINNLCGKME EKAKSYEKETGVKVILDVVDGKGSQNTQNSQVDRFISLGVDVICVNMVDRSAASYIISRA MEGDIPVVFFNREPVEEDMNRWEKLYYVGENAKESAVLQGNILVEAYKKDPASLDLNGDG KVSYVLLEGETSHQDSLIRTEWSIQTLKDGGVPLE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:13:10 2011 Seq name: gi|229783896|gb|GG667839.1| 
Clostridium hathewayi DSM 13479 genomic scaffold Scfld232, whole genome shotgun sequence Length of sequence - 8380 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 2930 2053 ## PPSC2_c2592 ig domain protein group 2 domain protein + Prom 2947 - 3006 7.3 2 2 Op 1 2/0.000 + CDS 3031 - 3987 879 ## COG1653 ABC-type sugar transport system, periplasmic component 3 2 Op 2 . + CDS 4941 - 5306 391 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 5323 - 5364 7.4 4 3 Tu 1 . + CDS 5379 - 5552 189 ## gi|266625041|ref|ZP_06117976.1| putative sugar ABC transporter, permease protein 5 4 Op 1 38/0.000 + CDS 6507 - 7172 462 ## COG1175 ABC-type sugar transport systems, permease components 6 4 Op 2 . + CDS 7172 - 8014 715 ## COG0395 ABC-type sugar transport system, permease component + Term 8031 - 8079 3.4 7 5 Tu 1 . + CDS 8120 - 8320 216 ## gi|288871539|ref|ZP_06117979.2| toxin-antitoxin system, toxin component, Fic family Predicted protein(s) >gi|229783896|gb|GG667839.1| GENE 1 3 - 2930 2053 975 aa, chain + ## HITS:1 COG:no KEGG:PPSC2_c2592 NR:ns ## KEGG: PPSC2_c2592 # Name: not_defined # Def: ig domain protein group 2 domain protein # Organism: P.polymyxa_SC2 # Pathway: not_defined # 1 974 170 1207 1215 798 42.0 0 NDEEELERWAKLHMEEDTIGVHTYEKIFELLLRLKANYIWPAMHVNSFNRRKENGALADR MGIVVGTSHCDMLMRSNNREWLPWLKEKGYEGVKYDYTIEGRNREILHEYWRESVIQNRD FEVSYTLGMRGIHDSGFETSNLNGRTEEELRTQKIELLETIIASQNEILKEELDKTPLKL FIPYKEVLELYDHGLKVPDDFTMIWANDNYGYVRRYPSEEDRKRVGGHGIYYHNSYWSPP GRSYLFFCSIPLTHTKYELMKAYDEGIQKLWILNVGALKPLEMEVEFFLRLAWEAGSAKG RTQDVDSYVSDWIDRNFTGKIGEKMGPLLNRFSQIANVRKLEMMEDDVFSQTAYGDEGVM RLHKLQEILDQADVVYEGLLEEEKDAFFQLVLLRIHALYLTMGQYYFSDRSTLCHKQGKQ QAADLYVKETRAYEDARRKLLLYYNERISGGKWKGIVTPEDFPPPRTAMYPACTPSVHMG GRNMLVHIWNNGEELCFVRPGTKWFEISNGGEGSFAWRAETPDWIQLSETSGEISCETRI LVTVKETQEEKTGIILIRNETDNVQCEVPVLVSPVPAGCENPEEAGVVSVSVTGLRVDGF 
RLISYLGREEGDLLEGYKEGAEASFPVYFSSEGEFLLEIHRFPSLNSTGRIRMGVKIDRG TVLTVESLANDEWRDTWTYNSTNNVDKLYLKLPYLKKGAHQVTFKVIDPYFAISRFVIYT KERAENNLGIICAGQVNREFPREQALLNNGRILDWSDRFYGAPELKPRKEIYANREVTRD SLVATDHFEEPVEYGKTKSPKEVLTAAHSLFCEKDGVVKIDAVTAYEQTEFAYTENGQWQ YCSSESYGRSGLAIYMRKRGQQWKQEEEAPNLNYQIRCDGGTYDFWVLLRIDPASPSYLG VAADGNFVDRTLLYNSGKTWRYEAEQVWRWIPLAGLALSGGKHVLTLAVLASGVRIDRLY LTRKGDRPPVDCSWE >gi|229783896|gb|GG667839.1| GENE 2 3031 - 3987 879 318 aa, chain + ## HITS:1 COG:BS_yesO KEGG:ns NR:ns ## COG: BS_yesO COG1653 # Protein_GI_number: 16077764 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 50 318 12 276 412 159 34.0 9e-39 MRMKKIMALFLTAAMAAGVLSGCGGGNSQTTDNTSADGTSSVETTSAVKNGKDGRTTIRF AWWGGQERADLTNEAVKLFMEKNPDIEVETSFYPWDSYWENLSVASTANNIPDVYQGYIG SGDFQQFMDGGIVEPLDSYAESGLIDLSSISENLIAEGQVDGKLYGLPFGVNTRAMLVAP DIYEKAGLTIPENGYESWEALEADLPKLKEATGKYAAADFLMMEGDVFKYYCRQQGESVY APEGDSLINFSKDTFNNFYGMKLKWAGEGLIPPYDVSQAENGPEDSSIVKGETAVNIIPA SQYANFANAANKELRMIL >gi|229783896|gb|GG667839.1| GENE 3 4941 - 5306 391 121 aa, chain + ## HITS:1 COG:BS_yesO KEGG:ns NR:ns ## COG: BS_yesO COG1653 # Protein_GI_number: 16077764 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 5 110 291 396 412 75 34.0 2e-14 MVPASLHLCMSSKSENKEAAAKLIDFLINDVEANKIMKAERGMPASDKVRESMESTFDEN QKKVSAIVDQAVEYSSANDRPSMAGSSKIQKLLAEYEERMMYQDITPDEAYDELVEAAKL N >gi|229783896|gb|GG667839.1| GENE 4 5379 - 5552 189 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625041|ref|ZP_06117976.1| ## NR: gi|266625041|ref|ZP_06117976.1| putative sugar ABC transporter, permease protein [Clostridium hathewayi DSM 13479] putative sugar ABC transporter, permease protein [Clostridium hathewayi DSM 13479] # 1 55 1 55 55 103 100.0 4e-21 MKNKKLRKRCEGYLYVAPWLFGFIVLTLIPFLCLIYFSLTEYNMLSVPKWNGIQNSS >gi|229783896|gb|GG667839.1| GENE 5 6507 - 7172 462 221 aa, 
chain + ## HITS:1 COG:BS_yesP KEGG:ns NR:ns ## COG: BS_yesP COG1175 # Protein_GI_number: 16077765 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 3 217 91 306 309 238 58.0 8e-63 MSVPLKLLAALLVAMLLNQKRRGIGIYRTLFYIPSIIGGSVAVAVMWRNLFSRSGAFNAL ITAVTGINMDISWVADPRTALGSLILMAIWQFGSPMLIFLAGLKNISSTYYEAATVDGAN KVQCFFSITLPMLSPIIFFNLIMQMINSFMTFTQGLVVTNGGPMNRTLFYQLYVYQKGFE DFNMGYASALSCIMLVIVLFFTALVFRSSDAWVYYESGGKK >gi|229783896|gb|GG667839.1| GENE 6 7172 - 8014 715 280 aa, chain + ## HITS:1 COG:BS_yesQ KEGG:ns NR:ns ## COG: BS_yesQ COG0395 # Protein_GI_number: 16077766 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 10 279 27 296 296 274 52.0 1e-73 MRRKKQIGNIAFHVICILFALTSLYPILWMISSSLKESSQVFVDAANLIPHEFKIENYIN GWRGFGGISFAYFFKNTFIVVILSVLGVVVTCPFVAYGFARLDFPFKNVCFMTILISLML PGQVVMIPQYIMFNKLGWLNTYLPLILPLWFGSAFFIFQHMQFIRSIPNELDEAAFLDGA SKFKVYTNVILPLIKPSLVTSIIFQFYWKWEDFFGPLIYLTSPKKYTVSVALRLFSDPTS MTDWGAMFAMGILSLLPPVILFFCFQKYIVEGISTSGLKG >gi|229783896|gb|GG667839.1| GENE 7 8120 - 8320 216 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871539|ref|ZP_06117979.2| ## NR: gi|288871539|ref|ZP_06117979.2| toxin-antitoxin system, toxin component, Fic family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, toxin component, Fic family [Clostridium hathewayi DSM 13479] # 1 66 14 79 79 128 100.0 1e-28 MNLKLGSFVLAATVALSVNGCWQKTEQTVEVEAEMVPVLDVYAGKNEEKNLNFWYRRMRK RIRMYP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:13:35 2011 Seq name: gi|229783895|gb|GG667840.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld233, whole genome shotgun sequence Length of sequence - 6728 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 22 - 183 185 ## gi|266625045|ref|ZP_06117980.1| putative toxin-antitoxin system, antitoxin component
2 1 Op 2 . + CDS 203 - 451 281 ## gi|266625046|ref|ZP_06117981.1| conserved hypothetical protein
- Term 446 - 482 3.3
3 2 Tu 1 . - CDS 497 - 1099 236 ## gi|288871540|ref|ZP_06117982.2| putative lipase ATG15
- Prom 1119 - 1178 7.2
+ Prom 1128 - 1187 3.8
4 3 Tu 1 . + CDS 1211 - 1405 264 ## Closa_0724 hypothetical protein
+ Prom 1607 - 1666 2.1
5 4 Op 1 . + CDS 1809 - 2243 169 ## gi|266625050|ref|ZP_06117985.1| hypothetical protein CLOSTHATH_06461
6 4 Op 2 . + CDS 2248 - 3123 644 ## CLJ_B1878 putative phage replisome organizer
7 4 Op 3 . + CDS 3120 - 3638 471 ## gi|266625052|ref|ZP_06117987.1| hypothetical protein CLOSTHATH_06463
8 4 Op 4 . + CDS 3635 - 5677 1430 ## Mpet_0475 hypothetical protein
9 4 Op 5 . + CDS 5686 - 5916 142 ## gi|266625054|ref|ZP_06117989.1| ATP synthase F0, subunit B
10 4 Op 6 . + CDS 5979 - 6404 419 ## gi|266625055|ref|ZP_06117990.1| conserved hypothetical protein
11 5 Tu 1 . 
- CDS 6394 - 6645 103 ## gi|266625056|ref|ZP_06117991.1| conserved hypothetical protein Predicted protein(s) >gi|229783895|gb|GG667840.1| GENE 1 22 - 183 185 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625045|ref|ZP_06117980.1| ## NR: gi|266625045|ref|ZP_06117980.1| putative toxin-antitoxin system, antitoxin component [Clostridium hathewayi DSM 13479] putative toxin-antitoxin system, antitoxin component [Clostridium hathewayi DSM 13479] # 1 53 8 60 60 78 100.0 1e-13 MKNADRIRTMTDEELACFLVRVDAKLYRDDLDVVTYRTDKAVDTLEWLEREVY >gi|229783895|gb|GG667840.1| GENE 2 203 - 451 281 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625046|ref|ZP_06117981.1| ## NR: gi|266625046|ref|ZP_06117981.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 82 1 82 82 155 100.0 7e-37 MNNGQKIKYMELCLAVAREEVEYAELYKEKEPDYDEDFDAWCVYTRSHRNPNKALITDNL RNVARTAFILAKEINVSGFFRE >gi|229783895|gb|GG667840.1| GENE 3 497 - 1099 236 200 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871540|ref|ZP_06117982.2| ## NR: gi|288871540|ref|ZP_06117982.2| putative lipase ATG15 [Clostridium hathewayi DSM 13479] putative lipase ATG15 [Clostridium hathewayi DSM 13479] # 80 200 1 121 121 218 100.0 3e-55 MNFKFLLLATLCTIVLSACGTQSPSNTNLDSSSDPETTTESNNSNGTEASTFSQATTKTS VSETVFEKNDKNTVDFKTLMELIGKSDSNVVSVLGEGDPLTNDDSVLLNRDYTLSLFNED VSVSLAFNLYQHKANLLDQCTIYLTKPDLDGYKKILVGLLGTPSETYEKSYFFETSTATV LLADPFDDVPYIEISPNKID >gi|229783895|gb|GG667840.1| GENE 4 1211 - 1405 264 64 aa, chain + ## HITS:1 COG:no KEGG:Closa_0724 NR:ns ## KEGG: Closa_0724 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 57 1 57 58 66 61.0 3e-10 METKIFIRDGETWTRFKVKIREVGVYAYKLKKYVDVDKPVRQSSRYAYYEVKGDLLNDHK QKAR >gi|229783895|gb|GG667840.1| GENE 5 1809 - 2243 169 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625050|ref|ZP_06117985.1| ## NR: gi|266625050|ref|ZP_06117985.1| hypothetical protein CLOSTHATH_06461 [Clostridium 
hathewayi DSM 13479] hypothetical protein CLOSTHATH_06461 [Clostridium hathewayi DSM 13479] # 1 144 17 160 160 299 100.0 6e-80 MYYLTCYLWAKFFEDNNMAWVYCPESGRDGMVDGAADFYLPDQDAYMLADLGRPGRKYIN IQKLANDSGKTIILGGAQGKFSIIEEGKRFSGPDAWLCECAACRRYYFMNSSGGFACRVC GEHDGDHHLQNVMYGDDGLFGLQE >gi|229783895|gb|GG667840.1| GENE 6 2248 - 3123 644 291 aa, chain + ## HITS:1 COG:no KEGG:CLJ_B1878 NR:ns ## KEGG: CLJ_B1878 # Name: not_defined # Def: putative phage replisome organizer # Organism: C.botulinum_Ba4 # Pathway: not_defined # 1 157 1 170 286 140 47.0 6e-32 MAEQEKKYYWLKLDKNFFKRHDIRIIEAMPNGKDYILFYLKLLVESVSHDGMLRFSDAIP YDENMLSTITNTNIDIVRAAIGVFTNLQMIEIMDDKTIFMVEVENMTGSETKWAEKKRNY RAKIGQGADNVLPGVDVVRQEKELEKEKELEIEIEKEVVVAPGGSTSPFSNNYDFDSFEM QCVEYLIASCLETFPKAKVPDTLEKKRKWASEIEKMKRLDHLSESEIKQSLYFATHDSFW KTNIRSTKKFREKFETLYTQNRRGQGKTGNDLYDTADRLNKLEEKIGGDQK >gi|229783895|gb|GG667840.1| GENE 7 3120 - 3638 471 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625052|ref|ZP_06117987.1| ## NR: gi|266625052|ref|ZP_06117987.1| hypothetical protein CLOSTHATH_06463 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06463 [Clostridium hathewayi DSM 13479] # 1 172 1 172 172 303 100.0 4e-81 MKYEEFKNIAITLKAAFPALKAFENDEGIRTWYEMLKDLDYAVASAAVSAYIRESPYPPA IADIRNISRKIVVPDWSIEWQKLLKNASFEELNAPAQYAVQTLTEEYVREMLESSERVVL CMKEFERLYNNFFRLSRQDEEVLKKLGVWSNGIGFIQMQPKLLITADGRELE >gi|229783895|gb|GG667840.1| GENE 8 3635 - 5677 1430 680 aa, chain + ## HITS:1 COG:no KEGG:Mpet_0475 NR:ns ## KEGG: Mpet_0475 # Name: not_defined # Def: hypothetical protein # Organism: M.petrolearius # Pathway: not_defined # 5 328 11 338 983 243 40.0 2e-62 MNFNETEVRKAITVMKPGNALFEVRVISGRGNATGYFTTADTLINELKRLNLAATCNVYI TLNSIKDECYSRQQRDQFIQNGKPTTSDTDITLYDWLMVDIDPVRAAGTSASNEQIKKAK LKANEVYAYMKKTGFEEPLVGFSGNGVHLLYSVALSTNDENKALMKNCLTVLDMFFSDDA VKIDTANFNPARVCKLYGTVAQKGANTPERPHRPSYIIRSPEKPVQNKKMLLVKLAGYLP EPEKPQRYNNYNPRQFDLDEWLDHYGLRYTKASYGSGTKYILEKCPFDDNHTGKDACIFK AANGAIGFHCFHNSCSDKTWQDVRRLYEPDAYDRQYVPDQRHPNYRNPNYVVEKKEEVKI 
VEGQPVFFTTEQIRLLEEPPEEFIKTGIDTIDEKMRGLKKGFVSCLSGLRAAGKSSVISQ LTIEAAEQGYRTALFSGELKPKNLLKWLLLQAAGKQYVSQTQYDYYYVVRSPYDEIISKW LDEKVWVYNNYYGNNFGLIMTQIRKCVTEHKVDLVILDNMMALNLMEMGSDKYQQQSHFV ESLEDYAKQANIHILFVAHPRKSTGFLRLDDVSGSNDIVNRVDNAFILHRVNEDFKRLSK EMFKWKADDPLYQCSNVIEICKDRDGGVQDEFVPLYFEQSTKRLRNSPGETKTYTWTEKI GEYIRNDFESVPLDEQLPFD >gi|229783895|gb|GG667840.1| GENE 9 5686 - 5916 142 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625054|ref|ZP_06117989.1| ## NR: gi|266625054|ref|ZP_06117989.1| ATP synthase F0, subunit B [Clostridium hathewayi DSM 13479] ATP synthase F0, subunit B [Clostridium hathewayi DSM 13479] # 1 76 1 76 76 131 100.0 2e-29 MNNEQMKEIFWQTYNVFWNKWKNVLLTRQSPEWDEIVEEGRELIKKYHCDICSHMISDMI QILKERYEKEERKGGT >gi|229783895|gb|GG667840.1| GENE 10 5979 - 6404 419 141 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625055|ref|ZP_06117990.1| ## NR: gi|266625055|ref|ZP_06117990.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 141 1 141 141 251 100.0 1e-65 MTAKEYLKEIKKIDVAIDQKQIEYETLKGSRTYVGGTDYSAERVQTSPDGSGFTRISDRI TDMQREINDEIDQWHDMRHERIGQIQQLSKVEYVDILFRKYVQYQSLETIAGDLDKSYYW TCHLHGEALQEFEERFLKVSN >gi|229783895|gb|GG667840.1| GENE 11 6394 - 6645 103 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625056|ref|ZP_06117991.1| ## NR: gi|266625056|ref|ZP_06117991.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 83 1 83 83 166 100.0 6e-40 MMNEAEHHGIEPEPRARPCPSAIDGMLHIYRIVSGIALIDFERSVKGELLDLAEKACQGC VLLIHFLQYYYSTLDKISLAVSC Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:14:51 2011 Seq name: gi|229783894|gb|GG667841.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld234, whole genome shotgun sequence Length of sequence - 6181 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End 
Score pairs(N/Pv) - Term 373 - 416 -0.6 1 1 Op 1 . - CDS 463 - 873 192 ## COG5485 Predicted ester cyclase 2 1 Op 2 . - CDS 923 - 1408 167 ## Clole_4115 hypothetical protein 3 1 Op 3 . - CDS 1473 - 1679 172 ## Clole_2891 hypothetical protein 4 1 Op 4 . - CDS 1774 - 2256 183 ## Rumal_2951 hypothetical protein - Prom 2339 - 2398 4.5 5 2 Tu 1 . - CDS 2552 - 2803 189 ## gi|288871543|ref|ZP_06117997.2| putative DNA polymerase III, chi subunit 6 3 Op 1 . - CDS 3202 - 3924 395 ## COG4422 Bacteriophage protein gp37 7 3 Op 2 2/0.000 - CDS 3988 - 4413 520 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 4434 - 4493 3.8 - Term 4551 - 4586 5.1 8 3 Op 3 5/0.000 - CDS 4716 - 5066 322 ## COG0526 Thiol-disulfide isomerase and thioredoxins 9 3 Op 4 . - CDS 5131 - 6132 725 ## COG0701 Predicted permeases Predicted protein(s) >gi|229783894|gb|GG667841.1| GENE 1 463 - 873 192 136 aa, chain - ## HITS:1 COG:all4540 KEGG:ns NR:ns ## COG: all4540 COG5485 # Protein_GI_number: 17232032 # Func_class: R General function prediction only # Function: Predicted ester cyclase # Organism: Nostoc sp. 
PCC 7120 # 43 119 45 121 137 58 33.0 3e-09 MSNKEIIKYFYEVVVSENMLDELPQYISEDCVQRDGKNEIFIGIDGMKQHLMSVKNTYPD YTMEIIRQFEDGDTVISEFIMRGTHKGEFIGIMPMNRVIEMTGVDIDKIVNGKIVEHGGA VNTFDAFWENGLIKPV >gi|229783894|gb|GG667841.1| GENE 2 923 - 1408 167 161 aa, chain - ## HITS:1 COG:no KEGG:Clole_4115 NR:ns ## KEGG: Clole_4115 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 8 154 5 152 154 109 38.0 5e-23 MAKEKASFKEFLSAVAPEHQAFVEKLNTKLIEQGCVLVIKEAKSGYAASYQLEKKTVMNW VFRKSGILARIYGDNAGKYEDTIASLPADMQKKMITSRDCKRLIDPNACSDTCVKGFIYA LNGVTHKKCRNDGMFFLLTNETAEHIAELVCAEVNVRKSVL >gi|229783894|gb|GG667841.1| GENE 3 1473 - 1679 172 68 aa, chain - ## HITS:1 COG:no KEGG:Clole_2891 NR:ns ## KEGG: Clole_2891 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 68 1 68 68 106 82.0 2e-22 MSKYNSLWEYIQKNGGSSFKMTFEEIQNIAGIPIDHSFLKYKKELTEYGYEVGKISMKEQ TIIFNKVD >gi|229783894|gb|GG667841.1| GENE 4 1774 - 2256 183 160 aa, chain - ## HITS:1 COG:no KEGG:Rumal_2951 NR:ns ## KEGG: Rumal_2951 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 33 160 17 130 130 75 38.0 6e-13 MIWVERLAKDISNGRYKTKKTEKEIILDIITQRNLASFMNNTKWRELRTGMLNEMPFVPP YEYKTLFDDSDYISEDYVQHLIKNEGPSCLCSLDEESFNFLNYKAIEWLKVRPCFFTEEG GQLVKKKVWYDCEKEFTEILKKYSIPFELQNGVYTIYGYK >gi|229783894|gb|GG667841.1| GENE 5 2552 - 2803 189 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871543|ref|ZP_06117997.2| ## NR: gi|288871543|ref|ZP_06117997.2| putative DNA polymerase III, chi subunit [Clostridium hathewayi DSM 13479] putative DNA polymerase III, chi subunit [Clostridium hathewayi DSM 13479] # 1 83 12 94 94 159 98.0 6e-38 MDIANRWVALAFNTWDENGIAHMYQPINQKYEDSQEDAPVNIGSQTPVLKRNALDNLDLA AECVLHFAKTGELYPNLKWEEAE >gi|229783894|gb|GG667841.1| GENE 6 3202 - 3924 395 240 aa, chain - ## HITS:1 COG:MT2803.2 KEGG:ns NR:ns ## COG: MT2803.2 COG4422 # Protein_GI_number: 15842273 # Func_class: S Function unknown # Function: Bacteriophage protein 
gp37 # Organism: Mycobacterium tuberculosis CDC1551 # 2 216 12 229 284 80 31.0 3e-15 MATWNPWHGCKKISPGCANCYVYRRDAEFGKDSSIVTRTSSFDMPIKRNRKGEYKLQPDG ESVYTCMTSDFFLDEADEWREEAWDIIRRRSDLHFVIITKRIHRFEVELPGDWGSGYENV TICCTCENQNRADYRLPLFLNLPIKHRTVIHEPMLEQIDIRKYLTTGKIEGVTCGGESGP EARVCDFAWILDSMEQCVEYDVPFWFKQTGARFKKGNKVYLIDRKDQMSQAQKAGVNYRF >gi|229783894|gb|GG667841.1| GENE 7 3988 - 4413 520 141 aa, chain - ## HITS:1 COG:CAC1468 KEGG:ns NR:ns ## COG: CAC1468 COG0454 # Protein_GI_number: 15894747 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 135 1 138 142 93 38.0 1e-19 MIRLFEFRDLDKIMNIWLEGNLEAHSFIDAEYWNKNFDSVKSVLPNAEVYVYEEAGEILG FIGMDAEYIAGIFVAAGHKGQGIGHQLIEAVKKKKRLTLHVFEKNTGAMAFYLAEGFKVR ERMTEKETGERECLMVYEDGQ >gi|229783894|gb|GG667841.1| GENE 8 4716 - 5066 322 116 aa, chain - ## HITS:1 COG:TM0996 KEGG:ns NR:ns ## COG: TM0996 COG0526 # Protein_GI_number: 15643756 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Thermotoga maritima # 39 113 5 79 80 73 50.0 7e-14 MNLFGKKKKEETAGCCCGGNCTPETMAQAENAKNEGASVKILGSGCAKCNQLEEATRAAL EQLGMDTAVDHVTDFSQIAAYGVMTTPALVVDGKVVSMGKVLKTEEVKKILQKMRG >gi|229783894|gb|GG667841.1| GENE 9 5131 - 6132 725 333 aa, chain - ## HITS:1 COG:MTH894 KEGG:ns NR:ns ## COG: MTH894 COG0701 # Protein_GI_number: 15678914 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Methanothermobacter thermautotrophicus # 28 328 19 326 327 253 45.0 5e-67 MSIWEFIQDQVLGMKWLNKMIGSMLSSLGLDISGRVGGSVQFFIYDVVKITVLLCFLIFM ISYIQSYFPPERSKRILGRFHGIGANIISALLGTVTPFCSCSSIPLFIGFTSAGLPLGVT FSFLISSPMVDLGSLILLMSIFGTRVAIIYVVVGLVIAVVGGSLIEKMHMEPYVEEFIKT AGSVDIESPSLTKKERLTFAKEQVAATFTKVFPYILIGVGIGAMIHNWIPESWVEAVLGS NNPIGVILATLVGIPMYADIFGTIPVAEALLFKGAQLGTILSFMMAVTTLSLPSIIMLRK 
AVKLRLLGLFIGICAGGIILVGYFFNAIQAFII Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:15:08 2011 Seq name: gi|229783893|gb|GG667842.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld235, whole genome shotgun sequence Length of sequence - 7671 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 125 - 1519 1600 ## COG1653 ABC-type sugar transport system, periplasmic component 2 1 Op 2 . - CDS 1543 - 3507 1105 ## PROTEIN SUPPORTED gi|90020672|ref|YP_526499.1| ribosomal protein S18 3 1 Op 3 . - CDS 3495 - 4649 974 ## COG0673 Predicted dehydrogenases and related proteins - Prom 4731 - 4790 7.7 + Prom 4801 - 4860 7.1 4 2 Op 1 . + CDS 4906 - 5745 701 ## COG2207 AraC-type DNA-binding domain-containing proteins 5 2 Op 2 . + CDS 5742 - 6113 511 ## gi|266625071|ref|ZP_06118006.1| NAD dependent epimerase/dehydratase family protein 6 3 Tu 1 . - CDS 7018 - 7386 72 ## gi|288871546|ref|ZP_06118007.2| conserved hypothetical protein + Prom 6952 - 7011 80.4 7 4 Tu 1 . + CDS 7186 - 7518 384 ## gi|255282583|ref|ZP_05347138.1| NAD dependent epimerase/dehydratase family protein - Term 7430 - 7463 1.0 8 5 Tu 1 . 
- CDS 7513 - 7668 84 ## gi|266625073|ref|ZP_06118008.1| hypothetical protein CLOSTHATH_06485 Predicted protein(s) >gi|229783893|gb|GG667842.1| GENE 1 125 - 1519 1600 464 aa, chain - ## HITS:1 COG:AGl3359 KEGG:ns NR:ns ## COG: AGl3359 COG1653 # Protein_GI_number: 15891799 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 60 458 51 443 444 184 32.0 2e-46 MKKWSKAGMVCAAILAAGMAVTACGGGNTAKTENTGAVETEKKAEGEPTEAAKPSGETAK LSFSWWGNDDRHQATQNAIDAFNAAHAGAIEVTGEPSGFGNLEETFATRYAGGTAADVMT VNYPWVIQYSPNGEGFYDLDQVKDIFDFSQYDEGFLEFGKTGGKMQAVPYGQNTLGLYVN KSAYDRAGITELPKTFEEYKEAAKIFTQQDPNTYMIVSPTFRFAAVYYLQQKTGKGEFAE DLSMNYTVDDYKEALLWYKDLADAHVFCSRKDYIENVGNEPVSIAQNAKYINGGYVGVLE WTGGIASNAKTLEDKGDELVVAPLPVIEGAAFEGTMAKPSLLFAISKDTKNPKEAAEFLQ YILNDPEGTKLMGSTRGMVASRAAKASLEADGQIEGAVKQAYDFTDEAKVINNSPIFENA VFTNSYESNYEKFEFGKSTAEEAAQAIFDATSEQVAKLKQDYGK >gi|229783893|gb|GG667842.1| GENE 2 1543 - 3507 1105 654 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020672|ref|YP_526499.1| ribosomal protein S18 [Saccharophagus degradans 2-40] # 54 646 127 729 754 430 39 1e-120 MHELRDFCWLAKQVQNRKIFTYYMEDMENTAIAETIEREMPLLFGEVRRCERPEAAIVFS RTADEKLGQEGYRIEETENGYVVTSTAASGALYGMYGLYRLLITDRAGEFPYESVPDQSI RMINHWDNFDGSIERGYAGESIFFDNNTFRGDMELIRQYARLLASAGINAVSINNVNVHK LETFFVQGDAPAEIRKIADVLTAYGIKTFLSINYAAPMTVGGLTNADPLNPEVAAWWEET VRHIYEEIPDFAGFVVKADSEGEPGPFTYGRDHDEGANMLARAVRPYGGLIIWRCFVYDC AQDWWNRKADRARAAYDIFKKLDGHFEDNVILQIKNGPIDFQIREPVSPLFGALEKTNQI LEFQITQEYTGHQKDVCYLVPMWKETLDFDTKYGDGLTVNDAVRNHSQIRSRSGIAGVGN VGMDSNWTGNKLAQANLYGFGRLSWDNGLSPEEIAGEWVRQSFCLTAEEEDRVTALLCTS REVYEDYTCPLAVGFMCRPKIHYGVDVDGYEYDRWGTYHYADRDGVGRDRTVKTGTAYTR QYSDPRFEEYDDLSTCPDELLLFFHHVPYTHVLQSGKTVIQHIYDTHFRGVEKVLEYQAL WDSLKDSLDEESYFNVKERLAWQYENAVNWRDQVNTYFYRKSGIPDEKGRRIYQ >gi|229783893|gb|GG667842.1| GENE 3 3495 - 4649 974 384 aa, chain - ## HITS:1 COG:SMc04139 KEGG:ns NR:ns ## COG: SMc04139 COG0673 # Protein_GI_number: 15963866 # 
Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Sinorhizobium meliloti # 4 351 6 349 357 184 33.0 3e-46 MTGVAIIGTGVISERHIEAYAAFHGRCRIVALCDLMPEKAERLKEKYHLEDTEIYSSFRE MVKRPDIHLVSICTPPFTHAESAVSCMRAGKHVLVEKPMASSLEECDEMIRAQKETGCYL GVISQNRYLQDNRNLKNMIASGAVGRVLFGQVESLWFRGREYYNLWWRGTWEKEGGGCTL NHAVHQIDLLNWIMGSPKTVTAVLSNTAHDNAEVEDVSAAVLTYDSGAVVTVTASLVTHG EGQRLVFQCERAGLSSPWQVKSSRSDRTGFPAENREIEDRLLKIREEQPPLPYTEHAGQI DTVLRFLEGGETPPENSMDGRLALEVITAVYQAGFTGRAAQLPILSDSPFYTKKGFLANV KHFYEKKEAQSSKGAVEDEKTCTN >gi|229783893|gb|GG667842.1| GENE 4 4906 - 5745 701 279 aa, chain + ## HITS:1 COG:BH2229 KEGG:ns NR:ns ## COG: BH2229 COG2207 # Protein_GI_number: 15614792 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 22 256 30 272 287 87 28.0 3e-17 MNLDFNNMIPELHYYIHRKCTPNWKIEPHVIPFIDITYVIDGKAEYNIGGRTYIVKKGDL ICIPQNTFRSAVNIPEDLMECYSTNIFLRNQFGQEAYLPVPIISHIGIIPRLISRFHEIH EEWVQKDFGYMLKIRGILCLILYQISNLILNENHLSQEDPRIKSSIRYMSLHYAEPLTID RMAEQFHLHPVYYGSLFREAMGMTFKQYLISLRLNYAENMLKSGEYSVSEVALQCGFSDI FYFSKLFKEKKGIPPSGMFPPERRKKKKEITSLSGGKAE >gi|229783893|gb|GG667842.1| GENE 5 5742 - 6113 511 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625071|ref|ZP_06118006.1| ## NR: gi|266625071|ref|ZP_06118006.1| NAD dependent epimerase/dehydratase family protein [Clostridium hathewayi DSM 13479] NAD dependent epimerase/dehydratase family protein [Clostridium hathewayi DSM 13479] # 19 120 19 120 120 205 100.0 1e-51 MKALIIGGSGGLSGALASMAKESYEVWALTRGKRPLGEGIIPLRADRDNEEEFRSAVLGA GVTWDVVFDCICMNERHAEQDLEVLSKVSGRLVVISTDSVYDPARKRTPQTEEGFFVEET LAS >gi|229783893|gb|GG667842.1| GENE 6 7018 - 7386 72 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871546|ref|ZP_06118007.2| ## NR: gi|288871546|ref|ZP_06118007.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 119 7 125 125 220 99.0 3e-56 
MGIQVPRERQLADNGLHAQKLSENLIVDAVFHRLRPADTENLLKCAGFIHTLQHHARQIT DEDGLGHIDAAGYKRERLTGENEIRQFLLAAVLRITADQKAGAIDVTRAKQCDIQTAAFL AS >gi|229783893|gb|GG667842.1| GENE 7 7186 - 7518 384 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255282583|ref|ZP_05347138.1| ## NR: gi|255282583|ref|ZP_05347138.1| NAD dependent epimerase/dehydratase family protein [Bryantella formatexigens DSM 14469] NAD dependent epimerase/dehydratase family protein [Bryantella formatexigens DSM 14469] # 1 101 205 305 317 150 63.0 4e-35 MTQPIFVRDLARVMLECVDKPGTFQQIFCIGGPEAVENRIYYEILGKLLGVETVIRELPL TGYLDAHPDYSGHLCHRIYDLSKLKAAGVELPATPLEEGLKLHLESLGYL >gi|229783893|gb|GG667842.1| GENE 8 7513 - 7668 84 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625073|ref|ZP_06118008.1| ## NR: gi|266625073|ref|ZP_06118008.1| hypothetical protein CLOSTHATH_06485 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06485 [Clostridium hathewayi DSM 13479] # 1 51 2 52 52 94 100.0 2e-18 MCSPIPAIFYDSYSDVGTLGYQTAPPPKARKKLVFQTLSFVYTNRNRPGGL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:15:37 2011 Seq name: gi|229783892|gb|GG667843.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld236, whole genome shotgun sequence Length of sequence - 10283 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 49/0.000 - CDS 3 - 279 190 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 2 1 Op 2 . - CDS 290 - 964 241 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 3 2 Op 1 . - CDS 1891 - 2154 382 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 4 2 Op 2 . - CDS 2171 - 2632 564 ## Trad_1319 peptidase M24 5 2 Op 3 . - CDS 2676 - 3503 937 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Prom 3541 - 3600 80.4 6 3 Op 1 . 
- CDS 4448 - 4651 206 ## gi|266625079|ref|ZP_06118014.1| ribosome-binding factor A 7 3 Op 2 . - CDS 4655 - 5857 1515 ## COG0006 Xaa-Pro aminopeptidase 8 3 Op 3 1/0.000 - CDS 5889 - 7301 1274 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Prom 7376 - 7435 13.9 9 3 Op 4 . - CDS 8337 - 9551 1495 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 9648 - 9707 9.6 - Term 9586 - 9642 15.2 10 4 Tu 1 . - CDS 9827 - 10249 421 ## gi|266625083|ref|ZP_06118018.1| putative transcriptional regulator Predicted protein(s) >gi|229783892|gb|GG667843.1| GENE 1 3 - 279 190 92 aa, chain - ## HITS:1 COG:HP0300 KEGG:ns NR:ns ## COG: HP0300 COG1173 # Protein_GI_number: 15644928 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Helicobacter pylori 26695 # 20 92 9 81 285 60 44.0 1e-09 MEKKITYKRRTNLFKTDAWKRFSKNKLALIGSIIILLMIAAAILAPLIVQNDPYVSLTDA NGLILKNSSPAKSGTILGTDSLGRDTFSRVIY >gi|229783892|gb|GG667843.1| GENE 2 290 - 964 241 224 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 1 220 74 310 320 97 28 4e-20 PATLLLTVTAFVLSFVIGIPLGVYSATHKYSAGDYGLTIFSFIGISIPSFFFGMGLIYIF AIKLKWFPTSGFGDTTFKGTGMALFLNKMKYLVMPALVMSLANLATVMRFTRSSMVETLN QDYIRTARAKGLSEKVVIYRHALKNSLIPVITIFGLSIPNLFGGAYITEKVFSWPGMGLL GVDAIANRDYAVLMGLTLFTAILVLVGNLVADILYSFVDPRIRY >gi|229783892|gb|GG667843.1| GENE 3 1891 - 2154 382 87 aa, chain - ## HITS:1 COG:lin0183 KEGG:ns NR:ns ## COG: lin0183 COG0601 # Protein_GI_number: 16799260 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Listeria innocua # 1 87 1 87 316 77 42.0 5e-15 MGKYIRKRLIQSIPVVFGITILTYFIMKLAPGGPLANMINPRTSAESILRAKEAMGLNKP 
IIIQYVNWLRELLQGNFGYSTNSGQQV >gi|229783892|gb|GG667843.1| GENE 4 2171 - 2632 564 153 aa, chain - ## HITS:1 COG:no KEGG:Trad_1319 NR:ns ## KEGG: Trad_1319 # Name: not_defined # Def: peptidase M24 # Organism: T.radiovictrix # Pathway: not_defined # 18 126 20 129 395 77 37.0 1e-13 MKNFEFHNRVIHDVTEKLKELDIDLYLIITSEGSDKMTRFIPGVDTVGSGAFFFTKEGKR YGVASTIDAQDVEESGLFDEVVRYQDYESAVAGLLERLNPKRVALDFSETDAECDGLTLG RYEHYKSHIKGQMEYTEVSSDLFIPQVKAAVHQ >gi|229783892|gb|GG667843.1| GENE 5 2676 - 3503 937 275 aa, chain - ## HITS:1 COG:MT2199 KEGG:ns NR:ns ## COG: MT2199 COG0624 # Protein_GI_number: 15841632 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Mycobacterium tuberculosis CDC1551 # 3 270 165 445 448 72 26.0 1e-12 MEYVKEAKPDLFVRAAVINEGGGFPLRINGKDYMMVTVGEKAVCKVKLSASGRGGHASAP GGDQAILKLAAVLEEVFGAEEELDFGSHATQETMARILGSRDCDNPVAADIFGYSGQNTI GLRDYRIGDRSNVIPAAVEVVLEFKVLPGTRVEEIEAFIRRHVTDGVQYEIMSFERGFES NFDNSALKSVIEDLKTISSEEGFSCEVLPMLALGRTDGRFFGSEGSMVYGCSPLLLDDSF DVVLPKVHGNDESISTESYLFGCRVLDRFIEKNCL >gi|229783892|gb|GG667843.1| GENE 6 4448 - 4651 206 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625079|ref|ZP_06118014.1| ## NR: gi|266625079|ref|ZP_06118014.1| ribosome-binding factor A [Clostridium hathewayi DSM 13479] ribosome-binding factor A [Clostridium hathewayi DSM 13479] # 1 67 1 67 67 93 100.0 6e-18 MDNVKETLIHLISIPSVTASPSEKEAIMYLEEILKEEGIETERIYKDPERLNLLAHLPAE KPEKEPL >gi|229783892|gb|GG667843.1| GENE 7 4655 - 5857 1515 400 aa, chain - ## HITS:1 COG:SA1360 KEGG:ns NR:ns ## COG: SA1360 COG0006 # Protein_GI_number: 15927110 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Staphylococcus aureus N315 # 148 393 120 350 353 101 28.0 3e-21 MVDRKNVAVEKMDRAVQILKENQIDMWMFYSRQNQDPSLELMFNTDTKNEVLFVLTADGD RMAFAEASDAAVYEASGIFTCVKTVTPDTIMKEFTAVCDEKKPNRIAVNDSTEDSRCDGL GLGLYKKVCGALGEDRMKALKTGSYRMLEELRAVKTPSEVAIMEECSRLTTDIYDALFER 
LHVGLSEIDVGEIMMEECAKRNVVTAFGNPPEYPLVLLVKGGMSHRKPNAKNICMPGDML VIDFSIRFNGYTSDIARTMYFLKPGEEHAPKEVTDCANGAIRAVGEVMKVIKPGMKGYEV DAVGRQSILDSGYPNIPHSVGHQVGLEVHDGGTTLGPNKAKVSCQGVLRKNEIYALEPTV LQPDGLPCAIIEDDVILTDDGCRLISKRQTALIEIPYREA >gi|229783892|gb|GG667843.1| GENE 8 5889 - 7301 1274 470 aa, chain - ## HITS:1 COG:MT2199 KEGG:ns NR:ns ## COG: MT2199 COG0624 # Protein_GI_number: 15841632 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Mycobacterium tuberculosis CDC1551 # 8 456 14 441 448 132 27.0 2e-30 MERQMKYDYTEVLKDLVRIDTTNPPGNEKMALDYIQQILEQEGINSKIYESAPGRGNLLA CLPASFHMDEGAGPDRDSAEDEAADRPLILMSHIDVVLAKESQWKHPPFAAVEEDGYIYG RGTVDTKQLTVMELAAFLALAENGEYRKRDVFLLVTCDEESGSACGLSAFLEETIPAGSR PITGRELFENSEVISEGGGFPILVGDTAFYLCESGQKGCGTVEFTVAARLAKGPFFGSGD GMVRAMKLVADIGSMELEQRKLPTVAHFEQRLSQCVDGGSHGEPDAGQDAARLGEVLSPV MNNILTAMNRNTMTVTMIQGKNSSEVKVICDVRLLPGYGTEYLKELTDSLAEKWDADCRI LSFSEGYESQPEGGLIGCLEEATRMELGEAGKRAELLPFISMGSSDGRFLVPMKAKVYGY SPVLPWDMTFDQAVSMVHGVDEKIHRESVLFGCRVLERAVRQAVCGSHGE >gi|229783892|gb|GG667843.1| GENE 9 8337 - 9551 1495 404 aa, chain - ## HITS:1 COG:CAC0176 KEGG:ns NR:ns ## COG: CAC0176 COG0747 # Protein_GI_number: 15893469 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Clostridium acetobutylicum # 4 404 2 418 569 189 30.0 6e-48 MRKRRSVLAAILLSAMALTTACSGGGGTTETNGGGTVQAEAGSEAGNTGEETVVQIGQTS EVANLNPTIQPRTPDSNVQCMIFDYLVIPDEELNYVGALAEGWDVSDDGRTYTFHLRDGV KWHDGEPFTSADVAFTLTSLAAPTYKGGADSRIVSIVGAKAYQEGSADSVSGITTPDDKT VVVELEEANAAFIGNMYTCILPKHILGDVDPGTWDTDDFNRHPIGTGKYKFVEWKSGQYI ELEKNEDYFGAKPSIDRVYVKFGDETTLTASLINGELDVLYGLSASEIETVEAMGGVRVE TYDMLSVYYIGLNQLNEDLSDLKVRQALAYGIDKAKIVATVYGESGYVQDSIFPSNHWTY SDDVTKYPYDPAKAKSLLEEAGYAMSASTGFYEKNGKTLHLTYS >gi|229783892|gb|GG667843.1| GENE 10 9827 - 10249 421 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625083|ref|ZP_06118018.1| ## NR: 
gi|266625083|ref|ZP_06118018.1| putative transcriptional regulator [Clostridium hathewayi DSM 13479] putative transcriptional regulator [Clostridium hathewayi DSM 13479] # 1 140 12 151 151 253 100.0 3e-66 MNDNRSKNQPRVRNFPLEQTEEKGPGRAAKGGRNKEELLREMGNRLRQIRLEKNWTQDKM AECLGITKAFYGKIERGESSIALEKLALLNETMDIDLNYLITGETIPVLPINFQDVPREK RYSMEQLIKYAVSLASSKEE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:15:57 2011 Seq name: gi|229783891|gb|GG667844.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld237, whole genome shotgun sequence Length of sequence - 7781 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 3, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 156 143 ## gi|266625084|ref|ZP_06118019.1| conserved hypothetical protein 2 1 Op 2 . + CDS 200 - 391 268 ## COG1983 Putative stress-responsive transcriptional regulator + Term 419 - 460 7.0 + Prom 496 - 555 4.5 3 2 Tu 1 . + CDS 584 - 796 227 ## Closa_2419 hypothetical protein + Term 830 - 872 -0.6 + Prom 906 - 965 2.4 4 3 Op 1 1/0.000 + CDS 1059 - 2258 1156 ## COG1903 Cobalamin biosynthesis protein CbiD 5 3 Op 2 9/0.000 + CDS 2258 - 2935 733 ## COG2243 Precorrin-2 methylase 6 3 Op 3 12/0.000 + CDS 2938 - 3732 825 ## COG2875 Precorrin-4 methylase 7 3 Op 4 6/0.000 + CDS 3729 - 4817 1026 ## COG2073 Cobalamin biosynthesis protein CbiG 8 3 Op 5 4/0.000 + CDS 4838 - 5554 779 ## COG1010 Precorrin-3B methylase 9 3 Op 6 . + CDS 5547 - 7040 1138 ## COG2099 Precorrin-6x reductase 10 3 Op 7 . 
+ CDS 7037 - 7781 535 ## COG2242 Precorrin-6B methylase 2 Predicted protein(s) >gi|229783891|gb|GG667844.1| GENE 1 1 - 156 143 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625084|ref|ZP_06118019.1| ## NR: gi|266625084|ref|ZP_06118019.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 51 1 51 51 94 100.0 3e-18 DFNYEIKCRVGSITVDGEGFSGLSKERKINNGAAKKMELDCSTGAIEVKFH >gi|229783891|gb|GG667844.1| GENE 2 200 - 391 268 63 aa, chain + ## HITS:1 COG:MA4106 KEGG:ns NR:ns ## COG: MA4106 COG1983 # Protein_GI_number: 20092899 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 7 61 2 56 59 77 65.0 6e-15 MDDRETRRLYRSDTDKMICGVCGGIGEYFNVDPTLIRLLWAVLACSGTGVVAYFIAAIII PRR >gi|229783891|gb|GG667844.1| GENE 3 584 - 796 227 70 aa, chain + ## HITS:1 COG:no KEGG:Closa_2419 NR:ns ## KEGG: Closa_2419 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 67 1 67 73 94 85.0 1e-18 MSKGKKNEPFDPHNLKPEEILKFEIATELGLSDKVITGGWRSLTAKESGRIGGLITKRKR ELKQEALGEK >gi|229783891|gb|GG667844.1| GENE 4 1059 - 2258 1156 399 aa, chain + ## HITS:1 COG:FN0967 KEGG:ns NR:ns ## COG: FN0967 COG1903 # Protein_GI_number: 19704302 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiD # Organism: Fusobacterium nucleatum # 1 323 1 315 375 209 40.0 8e-54 MEERKLRSGFTTGTCAACAAKAAAVMLLKQEAVESVQVMTPGGTRAELYPEEAVITENSA VCAVRKDAGDDPDVTDKALIYASVEYIKDGGAPENCYRNDRIFLTGGQGVGIVTKPGLAC PVGFYAINPVPRKMIFDEVEAVAHLAGCEGGLLVRIWIPEGERLALSTFNPKLGIEGGIS VLGTSGIVEPMSEAALKATIQLELHMKAVEGQREVILTPGNYGENFLRETLKLSLAQGVQ CSNFVRESVLMAADEGMDGLLFVGHIGKLIKVAGGVGNTHSKYGDRRMEIMWDCARSHCG KLSERARSRLKEQVLDSNTTEEAVRRLNEAGMLKPAMETAVERIRHYLEVWSERPVRAEV VTFTNGYGILGMTRGAEAMIDTFRVREKANEIQKEQEKR >gi|229783891|gb|GG667844.1| GENE 5 2258 - 2935 733 225 aa, chain + 
## HITS:1 COG:CAC1379 KEGG:ns NR:ns ## COG: CAC1379 COG2243 # Protein_GI_number: 15894658 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-2 methylase # Organism: Clostridium acetobutylicum # 5 223 4 217 221 122 38.0 6e-28 MSGILYGIGVGPGDPELMTIKAVRRIRECTVIAIPHRSKELCTAYQIACQAVPEMEQKEC LYLPMPMTKDKKVLDESHDLAAGAVMERLDRGEDVGFITLGDVSIYSTCSYLLERLWNRG YATRLECGIPSFSAAAARLGIPLVSGAEELHVIPATYQVKDALKLPGVKVLMKTGKQMKT VKEEIRKCGASAVMVENCGMADERVFASLEEIPDEPGYYSLLIVR >gi|229783891|gb|GG667844.1| GENE 6 2938 - 3732 825 264 aa, chain + ## HITS:1 COG:MJ1578 KEGG:ns NR:ns ## COG: MJ1578 COG2875 # Protein_GI_number: 15669774 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-4 methylase # Organism: Methanococcus jannaschii # 2 249 7 254 259 266 53.0 3e-71 MVHIVGAGPGAPDLITVRGKELISQADTIIYAGSLVNPALLEYRKEGCRVLNSAYMTLEQ VIEAIRETEAEGKMTVRLHTGDPCIYGAIREQMDALERLGISYEVCPGVSSFCGAAAALN MEYTLPEISQSVVITRMAGRTPVPERESVASFAAHGATMVLFLSSGLLTELAEELMRGGY GADTPAAIVYRATWEDEKKVLCTVGTLAERAAGEKITKTALIIVGNAVSQSGYALSKLYD PGFSTEYRQGSDSVGAEGQKEDEV >gi|229783891|gb|GG667844.1| GENE 7 3729 - 4817 1026 362 aa, chain + ## HITS:1 COG:lin1161 KEGG:ns NR:ns ## COG: lin1161 COG2073 # Protein_GI_number: 16800230 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiG # Organism: Listeria innocua # 2 354 5 333 343 172 36.0 8e-43 MRLSLICFTGAGAQLGARLLKELKFSGHECRGFVLEKFLNPFHEANGLEPLKTPLREWTG EQFTARDGLIFIGAAGIAVRAIAPWVKDKRTDPAVVVIDDCGRYSISLLSGHLGGANGLA EETAKITGGIPVITTATDLHGRFAVDNFAKEHGLWMSDMKTAKAVSADVLAGEPVGLFSD FPVMGSLPEGFTQKESCRRNVWITVKREPEEGGFLKLFLPEGGEVLRLVPRIVILGIGCK KGTGKEQIEAAVGEALSRWNIEPESAAAVATIDIKKEEAGLLSYVRDHGLSFHTYPAGRL LQAEGEFSPSSFVREITGVDNVCERAAVSLAEDLGGGRLMMKKQAGGGVTVAAAVRDWKV KL >gi|229783891|gb|GG667844.1| GENE 8 4838 - 5554 779 238 aa, chain + ## HITS:1 COG:lin1162 KEGG:ns NR:ns ## COG: lin1162 COG1010 # Protein_GI_number: 16800231 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-3B methylase # Organism: 
Listeria innocua # 1 235 4 239 241 223 48.0 4e-58 MVGIGPGSYEQMTIKAVRALEESRVVVGYTVYADLMREHFPGKEWITTPMRQETERCRMA IEKARSGLTVALICSGDAGVYGMSGLILELVGEEDSPAVEVIPGVTAALSGGALLGAPLG HDFAVISLSDLLTPMELIEDRLFHAAKADFAVCLYNPSGRKRSDYLKRACEIMLTVRKPD TVCGIAKNIGREGESMEVMTLSQLRDTPADMFTTVFIGNEKTKVINGKMVTPRGYRNV >gi|229783891|gb|GG667844.1| GENE 9 5547 - 7040 1138 497 aa, chain + ## HITS:1 COG:lin1163 KEGG:ns NR:ns ## COG: lin1163 COG2099 # Protein_GI_number: 16800232 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6x reductase # Organism: Listeria innocua # 4 239 2 225 250 135 34.0 1e-31 MCRILIFGGTTEGRLLAEYCMQQEIAVCVSVVSGYGADLLPESRLIHVVSGRMEEKEMEE FMSRESIRTVFDATHPYAVEATGNIREACKRSGASYVRVTRESAAENGGGGREASASQTV YVDSASEAARYLKDREGGILVTTGSRELAAFTAIPGYEDRLFARVLPSAAVICACEELGI KGKHVIAMQGPFSEELNRAMMEQLGVRYLVTKEAGTAGGFLEKLSAASALSVTTVVIGRP SEDRDGVSLEEAKKLLMESKLQTCSMAGSENTSAKRKISLMGTGMGGRGQMTLAVAEELK RCDVLFGAKRMTEAAETLGAPADGILRIPIYGNREILEWLENHPEYRNAGVLYSGDTGFY SGASGMAAMLLQEPYCDIYDYCIYPGISSVSYLCAKMGRSWEQVKLISLHGRDCDVSLEV SQNPAVFTLLGGVHTVKELCEQLLNHGLSNVRMTVGERLSYSDERIVAGTPEQLAAMECS SLAVVMMERDEGREGTR >gi|229783891|gb|GG667844.1| GENE 10 7037 - 7781 535 248 aa, chain + ## HITS:1 COG:FN0964 KEGG:ns NR:ns ## COG: FN0964 COG2242 # Protein_GI_number: 19704299 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 2 # Organism: Fusobacterium nucleatum # 3 179 5 182 189 141 40.0 9e-34 MRDEWFIRGEVPMTKSEVRAVSVEKLELSADSVLYDIGAGTGSVSVEAAAFLPEGTVYAI EKKREAVELLKKNREKFRAERIRIIEGAAPEALEGLEAPTHAFLGGTSGKMADILSLLLE KNPEVRVVVNAITLESVSKVLEWTAGRGIEADIVLVSVSRAKAAGRVHMMMAQNPVYVIS FGGRPAQLWNAPGRAERETKNTEYPRLMLAAPKSGSGKTMVTCGLLAAWQKRKLNCRAFK CGPDYIDP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:16:06 2011 Seq name: gi|229783890|gb|GG667845.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld238, whole genome shotgun sequence Length of sequence - 8798 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones 
- 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1177 573 ## PROTEIN SUPPORTED gi|39938628|ref|NP_950394.1| ribosomal protein L13 2 1 Op 2 . + CDS 1269 - 2153 1004 ## COG0679 Predicted permeases + Prom 2691 - 2750 4.0 3 2 Tu 1 . + CDS 2825 - 3847 881 ## COG0392 Predicted integral membrane protein + Term 3869 - 3913 1.3 + Prom 3951 - 4010 1.6 4 3 Op 1 . + CDS 4031 - 4480 606 ## COG3279 Response regulator of the LytR/AlgR family 5 3 Op 2 . + CDS 4482 - 4880 442 ## gi|266625096|ref|ZP_06118031.1| conserved hypothetical protein + Term 4883 - 4944 9.8 - Term 4870 - 4933 10.2 6 4 Op 1 19/0.000 - CDS 4944 - 5369 629 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit 7 4 Op 2 . - CDS 5363 - 6037 540 ## COG0540 Aspartate carbamoyltransferase, catalytic chain + Prom 6926 - 6985 80.4 8 5 Op 1 . + CDS 7018 - 7152 115 ## Closa_2184 ABC transporter 9 5 Op 2 . + CDS 7136 - 8798 1948 ## Closa_2183 hypothetical protein Predicted protein(s) >gi|229783890|gb|GG667845.1| GENE 1 2 - 1177 573 391 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|39938628|ref|NP_950394.1| ribosomal protein L13 [Onion yellows phytoplasma OY-M] # 6 389 163 544 546 225 32 1e-58 RYRRVSRVNIYSIGGFEDYYYGYMVQNTGYIKYFDLFLYDDGFMLMLPQKDNPKEVPVFE AEKKQKLFQVLKESVKWGERLNVSHVGELNEVIAKGDISRLILIQEALQEKKIAEIAEKV TKERDKKFVMIAGPSSSGKTSFSHRLSIQLKAQGMIPHPIGVDDYFVNRVDSPKNPDGSY NYEVLECLDVEQFNKDMTALLAGDTVEMPRYNFKTGMREYHGDMLKLGRDDILVIEGIHC LNDRLSYSLPAESKFRIYVSALTQLNVDEHNRIPTTDCRLLRRMVRDARTRGASAQDTIR MWPTVRHGEEQYIFPFQESADMMFNSATVYELAVLKQYAEPLLFGIPRESEEYMEAKRLL KFLDYFLGVNSETIPNNSIIREFIGGSCFRV >gi|229783890|gb|GG667845.1| GENE 2 1269 - 2153 1004 294 aa, chain + ## HITS:1 COG:CAC0366 KEGG:ns NR:ns ## COG: CAC0366 COG0679 # Protein_GI_number: 15893657 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 1 288 10 298 301 118 28.0 1e-26 MFVMALIGMICCRIGLISGHTNEKFSDILLLLVSPLLIFTSYQQAFDPEKLYGLLTAFFM 
AAVSHAVAITLASFLVKKGRTDWEVERISAIYSNCGFMGIPLVSALFGADGVFYATAYIT VFNILVWTHGVILMTGKQDFKSFLTALRSPCIIAVVLGLFCYLARIRFPAVILEPLTAIA DMNTPLAMLVAGVSIAGSDIKALLMKVQIYYICAIRLLIIPTVVLLVLKAFRLDDMVTTV VVLATACPTATTGTLFALKFHRNSGYASEIFGLATASSIVTIPLMMLLCGLING >gi|229783890|gb|GG667845.1| GENE 3 2825 - 3847 881 340 aa, chain + ## HITS:1 COG:lin2698 KEGG:ns NR:ns ## COG: lin2698 COG0392 # Protein_GI_number: 16801759 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 36 324 45 332 357 87 26.0 3e-17 MICLAVLAVLSFMTFHILFKDHSPESLYEVFSGADWRFVGLGAACMAIYLICEAESIRTL MGAFQTKVPFLRSAGYSFADFYFSAITPSATGGQPMQLYYMTRDGFGFAQSSFSLLTLAA VYQLMVLVYGVVMVILKYPYVQSQGRLIKWLLVFGIAVNGLCSLGILLVILKRALVERIA MTGISIAVKLRIIKDRRKLERKAEHLIDEYSRGGECFRKYPQVFVKVMILTAFRLTALFL IPYFACLALGIRGVEPLEFLAVQAILSLAVTAVPLPGSVGASEGSFMILYSGMVAADHLF PMMMLSRGISFYGFLAVSGAATLVLQFWNAGKHRKTEVQT >gi|229783890|gb|GG667845.1| GENE 4 4031 - 4480 606 149 aa, chain + ## HITS:1 COG:SP1915 KEGG:ns NR:ns ## COG: SP1915 COG3279 # Protein_GI_number: 15901739 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Streptococcus pneumoniae TIGR4 # 1 137 2 137 149 63 33.0 1e-10 MKIIIEPGDMEEPEVIIRGRSGSEEVRQLLELLGGAGGISQIPLYAEEKEYFFKPEEICY FLTEGGKVNAYLESGVYEAKGRLYELARQLRHRGFIQISKSTVINTAMVAFVEVEFSGNY TAFLKDGKTKLLISRNYMKDFRKYIMEGR >gi|229783890|gb|GG667845.1| GENE 5 4482 - 4880 442 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625096|ref|ZP_06118031.1| ## NR: gi|266625096|ref|ZP_06118031.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 132 1 132 132 213 100.0 5e-54 MKRFEEFLISIFTGIGIGAVVCTVTMAAMGGMDATLKQVLVWVAASALFAVISQIMCMDF GNLLIRTVIHFCLCFTLAATVGTFLHYSSDWISSARVMLPAFIIIYVIIYVAIFLVRLAE TKELNKKLKEPE >gi|229783890|gb|GG667845.1| GENE 6 4944 - 5369 629 141 aa, chain - ## HITS:1 COG:CAC2653 KEGG:ns NR:ns ## COG: CAC2653 COG1781 # 
Protein_GI_number: 15895911 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Clostridium acetobutylicum # 1 139 1 138 146 129 50.0 1e-30 MLNISGLNEGIVLDHIQAGKSLDIYYNLGLDKLDCQVAIIKNAKSSKMGRKDIIKIEGGL DHIDLDILGYIDHNITVNIIRDNQIAEKRTLKLPKKITNVIHCKNPRCITSIEQELPHIF YLADSETETYRCMYCEEKYSK >gi|229783890|gb|GG667845.1| GENE 7 5363 - 6037 540 224 aa, chain - ## HITS:1 COG:CAC2654 KEGG:ns NR:ns ## COG: CAC2654 COG0540 # Protein_GI_number: 15895912 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Clostridium acetobutylicum # 1 220 87 306 307 327 71.0 7e-90 MADTIRVISCYADICAMRHPKEGAPLVAATHSGIPVINAGDGGHQHPTQTLTDLLSIRSL KGRLHDLTIGLCGDLKFGRTVHSLINALVRYENIKFVLISPPELRVPEYIREDVLKANNI EFVEMDSLDEAMPSLDILYMTRVQKERFFNEEDYIRLKDCYILDKDKMKLAKEDMYVLHP LPRVNEISVEVDNDPRAAYFKQAQYGVYVRMALIMTLLEVEKPC >gi|229783890|gb|GG667845.1| GENE 8 7018 - 7152 115 44 aa, chain + ## HITS:1 COG:no KEGG:Closa_2184 NR:ns ## KEGG: Closa_2184 # Name: not_defined # Def: ABC transporter # Organism: C.saccharolyticum # Pathway: not_defined # 1 42 204 245 246 72 83.0 5e-12 MVTHDNYIAGFADRQLHIVDGKIFKIEEQHHDDTEEETENEEER >gi|229783890|gb|GG667845.1| GENE 9 7136 - 8798 1948 554 aa, chain + ## HITS:1 COG:no KEGG:Closa_2183 NR:ns ## KEGG: Closa_2183 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 549 2 552 611 753 71.0 0 MKKRGNRVLAGLLAVLVVLGAVPVQTMGAQYGIQKNKYLDVKKTPTGKTGQNMTISMIFD NSKGSEDLQGVSVRFDDSAVDSEYDAAEDPDEETRYNGSIFPFEITSSIFDAKVIGSVKA GATKTVSLSARVRRDIAEGYYSVPIIVKANGSDATTDHINVWITKSSGTTESGDDEGSIR FELGENQSTPSGQYPDVMNFTVNLRNASNITAFDVNVRMGLSADSTKFPFNINDGNYVRH FDRVGGGETQEVSYSMAIRKEAYTGYYPITFTIEYRDSTEGDIQKAESIFYVHVQNKDKE EETKEFNANDRTRARLIVDGFQTNPEVVYAGDEFDLILRMKNASENVAASNILFTLESEK VTDSAVFTTDSGSSSIVVNSLAAGQSTELKIRMRAGAWVDQRTYAITINEKYDSPEYKNA EEKVTVDIPVKQMARLNTGTIEVMPDTMSVGAESNVMFPINNTGKVLLYNVMVTFVGDSI 
QQTDSYVGNIKPGESKTVDAMVSGVTPTTDDGKVKILITYEDENGVVSDPIEKEMTLMVT EEMEPEDGMLDDMG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:16:26 2011 Seq name: gi|229783889|gb|GG667846.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld239, whole genome shotgun sequence Length of sequence - 7111 bp Number of predicted genes - 9, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 336 - 524 178 ## 2 2 Op 1 2/0.000 - CDS 569 - 979 362 ## COG3328 Transposase and inactivated derivatives 3 2 Op 2 . - CDS 954 - 1148 153 ## COG3328 Transposase and inactivated derivatives 4 2 Op 3 . - CDS 1130 - 1468 206 ## CbC4_1308 transposase, Mutator family - Prom 1491 - 1550 5.3 + Prom 1283 - 1342 6.7 5 3 Tu 1 . + CDS 1534 - 1638 63 ## - Term 1539 - 1603 12.2 6 4 Op 1 . - CDS 1795 - 3726 748 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 7 4 Op 2 . - CDS 3726 - 5969 696 ## gi|288871562|ref|ZP_06410226.1| conserved hypothetical protein 8 4 Op 3 . - CDS 5972 - 6742 488 ## COG3279 Response regulator of the LytR/AlgR family - Prom 6769 - 6828 6.8 + Prom 6869 - 6928 4.3 9 5 Tu 1 . 
+ CDS 6972 - 7110 106 ## gi|266625109|ref|ZP_06118044.1| conserved hypothetical protein Predicted protein(s) >gi|229783889|gb|GG667846.1| GENE 1 336 - 524 178 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEGEIKERIPVWLYPSTLAALDEWKEKDNCKSRSEFLEKAILFYRLSSANDYFRINWYDR FF >gi|229783889|gb|GG667846.1| GENE 2 569 - 979 362 136 aa, chain - ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 5 135 170 300 402 172 56.0 1e-43 MLFTIKVREEGRIVTKAAYVALGINAEGRKDILGIWIGENEGAKFWLKVCNELKNCGVKD ILIACIDGLQGFPDASKTIFPETMIQLCIIHQIRNTMKYIAYKDSKAFMKDLKRVYGAES EEIAFMNLEAMKDSWK >gi|229783889|gb|GG667846.1| GENE 3 954 - 1148 153 64 aa, chain - ## HITS:1 COG:SMa0384 KEGG:ns NR:ns ## COG: SMa0384 COG3328 # Protein_GI_number: 16262658 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 1 62 109 170 400 73 50.0 8e-14 MYAKGMTVRDIVSHVQTLYGIELSPASVSNITDKIMEEAREWYSRTLDGFYPVVFLDAVH YKSP >gi|229783889|gb|GG667846.1| GENE 4 1130 - 1468 206 112 aa, chain - ## HITS:1 COG:no KEGG:CbC4_1308 NR:ns ## KEGG: CbC4_1308 # Name: not_defined # Def: transposase, Mutator family # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 5 93 9 102 410 95 45.0 7e-19 MTSGFNYQEELSKCSTMEDITGPNGLVQRMVKDAIEQILQNEIADYITDEKSKGNTPKRN GASPKKVKTSYGSINIDVPRVREGEFEPEVIKKRAVSRKGWKHKSFRCMRKG >gi|229783889|gb|GG667846.1| GENE 5 1534 - 1638 63 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIPIQKGGVSNNLQVIHSLFDTPPQTAASDRLER >gi|229783889|gb|GG667846.1| GENE 6 1795 - 3726 748 643 aa, chain - ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 405 633 204 442 452 90 27.0 
8e-18 MRGEKVMQRSNHFRFAFILLLLVISGITSSICLLLYQYDNKYTAPGPASAYGSMEVEAET LNGQGVIWLVDGWEFYGGLLLTPADFRSNMPAPDRYLFAGQQGGFESVTGNPHGSASYRL TIKLPKQTQTYSLYLPELRSSCRLYVNGVERLALGEISPEHYKAETAEKIVPFEAAGEVE LLLAVSDFSGAYGGLTYPPALGTPDSIHDMTTVRLNLRTALLTMTALFGVISLLIGIANP QNSLTGLYALYCLTFLGFTCYPVAKTLFPAFGGLYFVENLSFCAMILMIFILQRRLFTLE TPVNTFGIAFGGLVCVVCIFYHLFLPQASLTMLLSYSRLISLYKSISAALLTGSTIWMLL RRSDETAHHAKFLLCGIVIFDCTLIMDRFLPLYEPIYSGWFLEISSLCMVLLIGAILLQE IYIRTAASLRLEETIRLTGQNLAMQKEQHQTMIEAIAGERTFRHDLRQHISVLHEFAERD DMDSLKQYFDQLEGRLPANYAGVFCENAAVDAILRHYKTNAESAGIEFSAAVQVAEGMPI SDVDLCVIFGNCLENALEACKRQTDHKKYVMVSAGTSVGVFAITVENSFNGSLRRSGAGF FSSKQSRIGIGTASVQAIAKKYNGKATFKSDDNKFQALIVLHL >gi|229783889|gb|GG667846.1| GENE 7 3726 - 5969 696 747 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871562|ref|ZP_06410226.1| ## NR: gi|288871562|ref|ZP_06410226.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 747 3 749 749 1359 100.0 0 MRRILSVTLVLGLLLCGTPAFAAVKGSGEAFPSQQNAPFFLSGINLSGKDSILEAASGDT LPGTATDSNATSETAKPATPSDADDNFDSNHQTLPPEIATDSNSTAPLTPEMQTVYTADE LRDLLENYDGTGGTISFGATITLEHSIYFYGESVCINTGTFGLIYDGGCIETLGNDTLTL ELTGEGIDMPVLEIKRTVLYPEWYHIIDWNCALSGMEVTAVGREGNGGTAVRVTADYERR TPTTEYQTRGRIRSYGDGAVGLELCSSAQTYLLDIEAEGRNACAVAAVNGADLFACKLSA QGEGAVVATGNGVILDSCILSGESDGVTVIQSTIESRTGLEPQLMQYTDSDSIVYDAMVH DEQQYLLSNGKYIDLYLQYDENLLEHLDTSVLGAVDIPVSLHPCFQGFGLEGEKELTFRI WIRDPAKPVITEYWQEDSIITFFSWYNDGLEPGTRLWSSKDGGSTWVDISDSDGVTLCVN SNDTSLFILDTSGLDQTIFLALKNAAGWSNVVTLSPGKDGKIPVGAGGDRDGGDREELPG GDGGNHDNGNTDDDSAIGDDSGSSGSTPPSEAGQGGQPDNSSGEHDDRDGNNDTTTDTGA NSAVNESQQTDHQQEQPNNHGESSVSPSTFPVSSEKWTIDFSPIRLSQQDHAQPDSQNTQ TFSDIPVVQMPAHDETEAVLNPLETPAELNSFDAVPNLEAQERESGFTTPDKSASSRQAD TVLSVLVLCGAGGITAFVTFRKKRGER >gi|229783889|gb|GG667846.1| GENE 8 5972 - 6742 488 256 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response 
regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 54 225 50 217 234 92 30.0 6e-19 MKIKIAVCEDCQKDADLLCEQLSRYCQESGLPTYDVDIYPNGFTFSKNCLPGSYDLIFMD IYLEQDDGMRLAQNFRKADKECPIVFFTRSTDHAVEAFKVNAAHYLTKPLAYENLAEALD RCLRLHERQSKFILLPTEKAVQKIRVAEIVYIEVFNNISVVHLKRETVSSRISLKNLLME LTKTEADAEFLRCHRSYMVNMNWIRALKSDFFLMDTGIPVPISKYMKKQVICAYEEFALK QIRDTQTIPATPKEVR >gi|229783889|gb|GG667846.1| GENE 9 6972 - 7110 106 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625109|ref|ZP_06118044.1| ## NR: gi|266625109|ref|ZP_06118044.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 46 1 46 47 83 100.0 5e-15 MKIIVVDNEPDRLEELVKCLRAAFPNAEITDFTSPSLAVQYSFENP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:17:16 2011 Seq name: gi|229783888|gb|GG667847.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld240, whole genome shotgun sequence Length of sequence - 9771 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 8, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 920 - 1102 129 ## gi|266625111|ref|ZP_06118046.1| acetyltransferase, GNAT family + Prom 1080 - 1139 9.1 2 2 Tu 1 . + CDS 1222 - 1713 415 ## COG3546 Mn-containing catalase + Term 1781 - 1836 -0.1 3 3 Tu 1 . - CDS 1679 - 1810 82 ## - Prom 1830 - 1889 5.4 + Prom 1754 - 1813 3.9 4 4 Tu 1 . + CDS 1963 - 2274 360 ## COG0209 Ribonucleotide reductase, alpha subunit 5 5 Tu 1 . + CDS 3242 - 4555 1178 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase + Prom 4567 - 4626 2.6 6 6 Tu 1 . + CDS 4686 - 4895 120 ## gi|288871563|ref|ZP_06118050.2| purine nucleoside phosphorylase DeoD-type + Prom 6098 - 6157 3.1 7 7 Op 1 . + CDS 6221 - 7033 195 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 8 7 Op 2 . 
+ CDS 7091 - 7228 71 ## gi|288871565|ref|ZP_06410228.1| hypothetical protein CLOSTHATH_06534 9 7 Op 3 . + CDS 7203 - 7388 264 ## Closa_2877 transcriptional regulator, GntR family 10 8 Op 1 . + CDS 8315 - 8743 528 ## Closa_2877 transcriptional regulator, GntR family 11 8 Op 2 . + CDS 8747 - 9532 968 ## Closa_2876 hypothetical protein + Term 9565 - 9607 2.5 Predicted protein(s) >gi|229783888|gb|GG667847.1| GENE 1 920 - 1102 129 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625111|ref|ZP_06118046.1| ## NR: gi|266625111|ref|ZP_06118046.1| acetyltransferase, GNAT family [Clostridium hathewayi DSM 13479] acetyltransferase, GNAT family [Clostridium hathewayi DSM 13479] # 1 60 1 60 60 101 100.0 2e-20 MNRTDELFFEIIETYQRHAQAARNAKYQETREMAEILLNMDITAMWFLTQHIPDTRSRLV >gi|229783888|gb|GG667847.1| GENE 2 1222 - 1713 415 163 aa, chain + ## HITS:1 COG:CAC1338 KEGG:ns NR:ns ## COG: CAC1338 COG3546 # Protein_GI_number: 15894617 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: Clostridium acetobutylicum # 1 146 40 185 200 183 58.0 1e-46 MRYLSQRYAMPYKECKGLLTDIGTEELAHMEIIAAIVHQLTRNLTPEQIVESGFGPYYID HTTGIWPQAAGGIPFNACEFQSKGDPITDLVEDMAAEQKARSTYDNILRVVTDPEVCDPI RYLREREVVHFQRFGEALRITQDNLDSKNFYAFNPSFDVKSNC >gi|229783888|gb|GG667847.1| GENE 3 1679 - 1810 82 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYSNTCNKEKPNKTFFHTLAGKNPPLAGPSLFISNYSLHQSWD >gi|229783888|gb|GG667847.1| GENE 4 1963 - 2274 360 103 aa, chain + ## HITS:1 COG:PA1156 KEGG:ns NR:ns ## COG: PA1156 COG0209 # Protein_GI_number: 15596353 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Pseudomonas aeruginosa # 7 98 30 128 963 60 37.0 1e-09 MRKEGTGMILVVKRDGEIVEFHLEKISEAIRKAFKATGKFYTEDIVELLSIRVTSDFQGK MQGDRIAVEDIQDSVEKVLEESGYTDVAKAYILYRKQREKLAS >gi|229783888|gb|GG667847.1| GENE 5 3242 - 4555 1178 437 aa, chain + ## HITS:1 COG:PA1920 KEGG:ns NR:ns ## COG: PA1920 COG1328 # Protein_GI_number: 15597116 # Func_class: F 
Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Pseudomonas aeruginosa # 1 401 257 656 675 358 44.0 8e-99 MEAFVFGVNTPSRWGTQAPFSHISLDLMIPEDMAEQKAIVGGKEMEFTYGECQKELDMVD RAFLETMLEGDASGLGFQYPIPSYGLTPDFDWSDNERNRLLFSLASHYGTPYFTNYVRSG LTPGDVRISYEGVTPDFHKLRSKAGGFFGFGEHTGSIGVVTINLPRIAYQSKDETEFYER LDEMMDLAARSLRIKREVISNLLTNGLYPYTACYLSDFSNHLSSIGVIGMNEAGLNAPWL GEGMESEKTREFAVAMLNHMRERIVGYQLEYGDLYGLEATPAESATYRFAMLDKEKYPGI RTAGHEGDVPYYTNSTKLPADYDGTLEKALEHQDRLQPLYTAGTVFHVYMEREFDDWKKA RDLLHSITTGHDIPYYTLSPVYSICQSCGYLPGRQNVCPKCGGRADVYSRIAGYYRPVHD WNEGKAQEFKNRIMYRM >gi|229783888|gb|GG667847.1| GENE 6 4686 - 4895 120 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871563|ref|ZP_06118050.2| ## NR: gi|288871563|ref|ZP_06118050.2| purine nucleoside phosphorylase DeoD-type [Clostridium hathewayi DSM 13479] purine nucleoside phosphorylase DeoD-type [Clostridium hathewayi DSM 13479] # 1 67 1 67 67 70 100.0 4e-11 MDVIVDAVVPGAADAEIVILPADVDAEITILPVDAAEMTGQTATIAAAMQQALMQVMRLA SEMASGPAS >gi|229783888|gb|GG667847.1| GENE 7 6221 - 7033 195 270 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 15 259 4 238 242 79 25 8e-15 MNMDYLTDKFSLNGKVALITGASYGIGFAIAKGMAAAGATIVFNDIKQELVDKGIASYKE EGIDAHGYVCDVTDEAAVNEMVARIESEVGVIDILVNNAGIIKRIPMCDMTAEQFRQVID VDLNAPFIVSKAVIPSMIKKGHGKIINICSMMSELGRETVSAYAAAKGGLKMLTKNIASE YGEFNIQCNGIGPGYIATPQTAPLRETQPDGSRHPFDQFIISKTPAGRWGEAEDLAAPAV FLASDASDFINGHVLYVDGGILAYIGKQPQ >gi|229783888|gb|GG667847.1| GENE 8 7091 - 7228 71 45 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871565|ref|ZP_06410228.1| ## NR: gi|288871565|ref|ZP_06410228.1| hypothetical protein CLOSTHATH_06534 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06534 [Clostridium hathewayi DSM 13479] # 1 45 1 45 45 77 100.0 3e-13 MSIRCAAPGKKRMYIHAMILFSAELMACFLEMITERCHGREDESL >gi|229783888|gb|GG667847.1| GENE 9 7203 - 7388 264 61 aa, chain + ## HITS:1 COG:no 
KEGG:Closa_2877 NR:ns ## KEGG: Closa_2877 # Name: not_defined # Def: transcriptional regulator, GntR family # Organism: C.saccharolyticum # Pathway: not_defined # 1 61 1 61 233 98 91.0 6e-20 MEEKTKAYKNRGELVFETLKREILDLELKPGQMISENEICERFGVSRTPVREAVRRLQEQ G >gi|229783888|gb|GG667847.1| GENE 10 8315 - 8743 528 142 aa, chain + ## HITS:1 COG:no KEGG:Closa_2877 NR:ns ## KEGG: Closa_2877 # Name: not_defined # Def: transcriptional regulator, GntR family # Organism: C.saccharolyticum # Pathway: not_defined # 1 142 89 230 233 221 90.0 7e-57 MAVETMVMRDFVDIATPLWMEEVRYMIRKQQALIQEKGFEPEQFYRMDAEMHAIWFRATD KMKLWDMIQAQQLHYTRFRMLDFVTETDFTRIIREHTELFELLEKKDIAGLERSLKEHLY YSMKRMKHQIDVEYKDYFEQEE >gi|229783888|gb|GG667847.1| GENE 11 8747 - 9532 968 261 aa, chain + ## HITS:1 COG:no KEGG:Closa_2876 NR:ns ## KEGG: Closa_2876 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 260 1 260 264 416 75.0 1e-115 MANVSITIKDENGVVKCFGCGEERAVLVYEGEYCQGDTITFATDAVNCHYVVRVDDTMDE ALVYITKPEVVYVIPFEEKKVSYNPKSFTGGKHYLTIRTAEAYETGAYRNLAKNVMDQHG DPGCYPHVYANVETRGEAVFAARNAIDGVVSNESHGEWPYESWGINRQDDAEMTLEFGRP VDFDKIVLVTRADFPHDNWWVKATFTFSDGTTEVVDMEKSVKPHEFRIEKKGITWLKLSN LIKADDPSPFPALTQIEVYGA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:17:47 2011 Seq name: gi|229783887|gb|GG667848.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld241, whole genome shotgun sequence Length of sequence - 7391 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 6, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 870 817 ## COG3359 Predicted exonuclease + Term 892 - 933 -0.9 2 2 Op 1 . - CDS 881 - 1687 619 ## COG2207 AraC-type DNA-binding domain-containing proteins 3 2 Op 2 . - CDS 1701 - 2246 495 ## Closa_2241 hypothetical protein - Prom 2291 - 2350 5.5 + Prom 2360 - 2419 7.3 4 3 Tu 1 . 
+ CDS 2439 - 2636 321 ## COG1278 Cold shock proteins + Term 2757 - 2794 3.1 - Term 2745 - 2780 6.5 5 4 Op 1 1/0.000 - CDS 2829 - 3710 1103 ## COG1281 Disulfide bond chaperones of the HSP33 family 6 4 Op 2 . - CDS 3713 - 4474 1092 ## COG0500 SAM-dependent methyltransferases - Term 4490 - 4526 4.8 7 4 Op 3 . - CDS 4540 - 4959 487 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) - Prom 5013 - 5072 80.4 8 5 Tu 1 . - CDS 5924 - 6169 341 ## PROTEIN SUPPORTED gi|160880683|ref|YP_001559651.1| ribosomal protein L21 - Prom 6270 - 6329 6.7 9 6 Tu 1 . - CDS 6350 - 7333 947 ## gi|288871569|ref|ZP_06118064.2| conserved hypothetical protein Predicted protein(s) >gi|229783887|gb|GG667848.1| GENE 1 1 - 870 817 289 aa, chain + ## HITS:1 COG:CAC0978 KEGG:ns NR:ns ## COG: CAC0978 COG3359 # Protein_GI_number: 15894265 # Func_class: L Replication, recombination and repair # Function: Predicted exonuclease # Organism: Clostridium acetobutylicum # 3 118 89 203 274 87 37.0 3e-17 EFLKRFSMVVHFNGDGFDIPYLLKRCRAYGLPYDFSGVTSLDIYKKIRPYRNLLGLESMK QKAIEQFLGVGREDIYSGGQLIEVYQDYLSSQDQALLDLLLLHNADDLRGMPGILPILNY PDYLEHDFKLESQELLTRSDLFGREYHALKLVYQSDYTVPVSFSRTSSVADIEAKGGQLT ASVDLYEGELKYFYPDYKNYYYLIYEDRAIHKSVAEYVDREARIKATAKTCYTRRSGCYL PQFTPVFEPVLQKEYKDRLTYFPYDDRQFAELEQSGGYVRHLLDYLCGK >gi|229783887|gb|GG667848.1| GENE 2 881 - 1687 619 268 aa, chain - ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 20 264 30 286 292 99 27.0 5e-21 MKYNISRTKRSYEFHMAESHWHTHYEFYYLYSGHCKIFINHTLYYVEPGDIIFLKPGEIH RTTYHSSPVNERVTISFDGEYLDSMKELCGEENVEYIFRNSKVTVPNGNRSRVEELIRRI EYEETGQDGFSPLMKKGCFYELLVSLSRCQEAETDHGRLEIGEETIEAAAKYIYDHYQNP ITLEEVAELSHMSPAYFSRKFKAVTGFGYKEYLTNIRIREASRLLLHSACSVTEIADRCG FGDGNYFGDAFRKAKGVSPRAFRKMQGI >gi|229783887|gb|GG667848.1| GENE 3 1701 - 2246 495 181 aa, chain - ## HITS:1 COG:no KEGG:Closa_2241 NR:ns ## KEGG: Closa_2241 # 
Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 181 1 181 181 275 84.0 4e-73 MKLHNLWGHFHTITAHKFRVMKNCFRVGLYKQGLLHDLSKYSPEEFIPGVIYYQGTRSPN AAEREEKGFSKAWLHHKGRNKHHYEYWIDFTTDMSEGLRGHKMPLKYVIEMMMDRIAASK TYKGKDYTDASPWEYYIHAKKYMVIAPETAKLLEELLLMLKENGEERTFAHMKALLKKGD Y >gi|229783887|gb|GG667848.1| GENE 4 2439 - 2636 321 65 aa, chain + ## HITS:1 COG:BH3610 KEGG:ns NR:ns ## COG: BH3610 COG1278 # Protein_GI_number: 15616172 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Bacillus halodurans # 1 65 1 65 65 93 70.0 1e-19 MKGTVKWFNNQKGYGFISDEQGNDVFVHYSGLNMDGFKSLDEGASVEFDVVDGAKGPQAV NVTKL >gi|229783887|gb|GG667848.1| GENE 5 2829 - 3710 1103 293 aa, chain - ## HITS:1 COG:SA0470 KEGG:ns NR:ns ## COG: SA0470 COG1281 # Protein_GI_number: 15926189 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond chaperones of the HSP33 family # Organism: Staphylococcus aureus N315 # 3 284 4 285 293 282 50.0 5e-76 MTDYIVRATAGDHQIRAFAATTRELVEHARSAHNTSPVATAALGRLLTAGAMMGVMMKGE QDLLTLKVQGSGPIEGLTVTADSKGNVKGYAYNPGVMLPPNDKGKLDVGGAVGEGVLSVI KDIGLKEPYVGQTILVGGEIAEDLTYYYATSEQTPSSVALGVLMNKDNTVKQAGGFIIQL LPGASDEMIDQLEKKLGEITSITSLLDEGKTPEMILNYILGEFGLEIMDKVPAEFTCNCT KDRVEKALISIGKKELNEMIDEGKSIEVNCHFCNKHYNFSVEELTDIRDRAVR >gi|229783887|gb|GG667848.1| GENE 6 3713 - 4474 1092 253 aa, chain - ## HITS:1 COG:BS_yqeM KEGG:ns NR:ns ## COG: BS_yqeM COG0500 # Protein_GI_number: 16079615 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus subtilis # 4 247 3 241 247 153 34.0 3e-37 MEAYTGFAEVYDTFMDNIPYGEWCRYLTGLLKKYGVGDGLILDLGCGTGSVTELLSEAGY DMIGVDNSEDMLQIAMDKRAASGHDILYLLQDMRDFELYGTVAAVVSICDCMNYILEYDD LVEVFSLVNNYLDPEGIFIFDLNTIYKYETLLGESTIAEDREESSFIWENYYDRDTMINE YDLALFIRTEEDLYRKYVETHYQKAYSLNTVKKALKEAGMEFVAAYDAFTEEPVKEDSER IYIIARECGKKEE >gi|229783887|gb|GG667848.1| GENE 7 
4540 - 4959 487 139 aa, chain - ## HITS:1 COG:CAC3537 KEGG:ns NR:ns ## COG: CAC3537 COG0653 # Protein_GI_number: 15896773 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Clostridium acetobutylicum # 1 138 30 166 166 146 55.0 1e-35 MEKGIYEQILSAPEAVVEGTVKELAEKYGTDIQTMTGFLDGINESLKGYENPIDTMDENT SVKIEIDPEKLYYNMVEAKAEWLYGLPQWDSILTEEKKKELYKKQKASGTIVKGPKIGRN DPCPCGSGKKYKKCCGKNA >gi|229783887|gb|GG667848.1| GENE 8 5924 - 6169 341 82 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160880683|ref|YP_001559651.1| ribosomal protein L21 [Clostridium phytofermentans ISDg] # 1 82 1 82 102 135 81 7e-32 MYAIIATGGKQYKVSEGDTIKVEKLGVEAGETVTFDQVLVVNNGELAVGCPTVAGATVTG TVVKEGKAKKVIVYKYKRKSGY >gi|229783887|gb|GG667848.1| GENE 9 6350 - 7333 947 327 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871569|ref|ZP_06118064.2| ## NR: gi|288871569|ref|ZP_06118064.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 327 20 346 346 610 100.0 1e-173 MKRQEFNCKKLMVLGAALCLGLTACGAKTAAESQGSGVSQEATEPQTVQESQTVQETETA EEPQEAAGMQITMEQVIEANSTEKLLGHYHSAYEVSPTMRRYVETEFRYYNEDGVKEEIQ MPDTVYGLYDGQYWGQLTPGGVWNSDSEYYAYFGLDHEATALEEIEKIEDNGDTLMLYTV LQPDSLSAKKDTLGIDYRDGDWIECSYKLDSETLAILETENVQCHADGTKEEPSTVTVTY DAPRPEGALALLERLSDKQNLRTITMILNPGTEKESSDSVYVQKGEGVMLINPEGKTLEV FKDSACTEPYTGGADTNKDLTIYGRFE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:18:07 2011 Seq name: gi|229783886|gb|GG667849.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld242, whole genome shotgun sequence Length of sequence - 7123 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 86 - 145 7.0 1 1 Op 1 1/0.000 + CDS 172 - 789 506 ## COG1323 Predicted nucleotidyltransferase 2 1 Op 2 . 
+ CDS 1715 - 2179 281 ## COG1323 Predicted nucleotidyltransferase - Term 1987 - 2017 2.0 3 2 Op 1 . - CDS 2174 - 2677 566 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 4 2 Op 2 . - CDS 2700 - 3080 117 ## Closa_1890 hypothetical protein - Prom 3101 - 3160 5.0 5 3 Op 1 . - CDS 3186 - 4391 1361 ## COG0500 SAM-dependent methyltransferases 6 3 Op 2 . - CDS 4409 - 4741 313 ## Closa_1888 Phosphoglycerate mutase 7 4 Tu 1 . - CDS 5702 - 7060 1834 ## COG2848 Uncharacterized conserved protein Predicted protein(s) >gi|229783886|gb|GG667849.1| GENE 1 172 - 789 506 205 aa, chain + ## HITS:1 COG:CAC1741 KEGG:ns NR:ns ## COG: CAC1741 COG1323 # Protein_GI_number: 15895018 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Clostridium acetobutylicum # 9 205 3 208 402 204 46.0 1e-52 MTKNENAYIVGIIAEYNPLHKGHAWHIKKAKELAGADYCIAVMSGDFVQRGAPAIYDKYT RTAMALEAGADLVLEMPAIFATSSAEDFAACGIALLDRLGVVGSVCFGSECGDMEEISRA ASILASEPELYTAALKEHLKMGFTWPQARAHALSLTGDFDEALLNSPNNILGIEYAKAVV RRKSPIRPITVRRQGSGYHDTGLAS >gi|229783886|gb|GG667849.1| GENE 2 1715 - 2179 281 154 aa, chain + ## HITS:1 COG:CAC1741 KEGG:ns NR:ns ## COG: CAC1741 COG1323 # Protein_GI_number: 15895018 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Clostridium acetobutylicum # 13 104 273 363 402 64 38.0 6e-11 MTAASVPFDQYLDVSSDLAARIRKDLLSFTTFEDRISALKTRQYTYTRISRALLHLLLGI TDQEIMAARAADYAPYARVLGFNRKASAVLSDIKKRGTIPLITKTADAGSLLEGTAWEMF QKDLFCSHLYQAVVSGKYHTRLKNEYTHSVVMKS >gi|229783886|gb|GG667849.1| GENE 3 2174 - 2677 566 167 aa, chain - ## HITS:1 COG:lin0387 KEGG:ns NR:ns ## COG: lin0387 COG0494 # Protein_GI_number: 16799464 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Listeria innocua # 1 149 1 151 169 92 36.0 3e-19 
MELWDVYDENRKLTGKKHERGVPLGPGEYHIIADVWTVNQRSEILLTRRHPDKPYGLLWE CTGGSVLTGENSVEGALRELSEEVGIHASKEELHLIHSICLKERFVDTYITRQKVNLEDL KLQAEEVVDAKFVVFDELMEMWRQGIVVPKSRFLLYKDKIESFIRPS >gi|229783886|gb|GG667849.1| GENE 4 2700 - 3080 117 126 aa, chain - ## HITS:1 COG:no KEGG:Closa_1890 NR:ns ## KEGG: Closa_1890 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 66 1 66 114 114 80.0 1e-24 MPFINSRVNVPLTDGEKETLKGKLGQAISLIPGKSESWLMVGFEDNYSLYFKGKNDTKLA YVEVKISAVLRMKLTISLRPDCVRSMRRCWGFLRIRFISNTRKWNTGAGTALISSIIRKV GIFPLT >gi|229783886|gb|GG667849.1| GENE 5 3186 - 4391 1361 401 aa, chain - ## HITS:1 COG:FN0778 KEGG:ns NR:ns ## COG: FN0778 COG0500 # Protein_GI_number: 19704113 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 9 394 2 396 412 263 39.0 5e-70 MEQSGMDREKRDEIKILFEDFLDEGLIRIILSNPVLKEGAGPGKVLKVRVRPVMLKGGMV FQAEELTEKQAFHRNLTKEEGVPYLLGLLEGGFKQAEAESVKGQARVMVGKKGTVTVKVK KNQQKIVAAPNVASHNRQKRYILEEGKPVAFLEDLGVMTAEGKVIRSRYDKFRQINRFLE FIEDILPRLDKSRENVIIDFGCGKSYLTFAMYYYLHELRGYEVRIIGLDLKQDVIDRCNR LSEAYGFDKLKFYHGDIASYDGVDHVDMVVTLHACDTATDYALEKAVKWDASVILSVPCC QHELNKQMDNKLLRPVFQYGLIKERMAALYTDALRAEILENRGYRTQILEFIDMEHTPKN ILIRAVKQGGPKDNRKEIEDILQFLGTEQTLARLLLEDQGI >gi|229783886|gb|GG667849.1| GENE 6 4409 - 4741 313 110 aa, chain - ## HITS:1 COG:no KEGG:Closa_1888 NR:ns ## KEGG: Closa_1888 # Name: not_defined # Def: Phosphoglycerate mutase # Organism: C.saccharolyticum # Pathway: Glycolysis / Gluconeogenesis [PATH:csh00010]; Methane metabolism [PATH:csh00680]; Metabolic pathways [PATH:csh01100]; Biosynthesis of secondary metabolites [PATH:csh01110]; Microbial metabolism in diverse environments [PATH:csh01120] # 1 110 104 213 213 175 74.0 5e-43 METDLPYPGGECAGDVIRRAMPVFLEMAESGYQKIAVVTHGGVIRSMLSAFLGMEPARYR ILGKTLENCSITEVVYQPDNQCFTVERFNDYAHLEPYPELLRSSWVSAEN >gi|229783886|gb|GG667849.1| GENE 7 5702 - 7060 
1834 452 aa, chain - ## HITS:1 COG:lin0538 KEGG:ns NR:ns ## COG: lin0538 COG2848 # Protein_GI_number: 16799613 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 6 451 5 448 451 581 71.0 1e-166 MLNMFEVNETNKMIEQEKLDVRTITLGISLLDCSDSNLDAVNEKIYNKITTTAKNLVAVG REIERDFGIPIVNKRISVTPIAMVGAAACKSPEDYVTIAQTLDRAAHEVGVNFIGGYSAL VSKGMTDADERLIRSIPQALAVTERVCSSVNVGSTKTGINMDAVKLMGEIVLATAEATKE SDSLGCAKLVVFCNAPDDNPFMAGAFHGVTEAESIINVGVSGPGVVKTAIETARGKDFEV LCETIKKTAFKITRVGQLVAQEASKRLNIPFGIIDLSLAPTPAIGDSVAEILEEIGLERV GAPGTTAALAMLNDQVKKGGVMASSYVGGLSGAFIPVSEDQGMIDAVSMGALTLEKLEAM TCVCSVGLDMIAIPGDTPATTIAGIIADEAAIGMINQKTTAVRLIPVIGKTIGDTVEFGG LLGYAPVMPVNPYSCEKFVTRGGRIPAPIHSS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:18:14 2011 Seq name: gi|229783885|gb|GG667850.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld243, whole genome shotgun sequence Length of sequence - 3787 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 32 - 418 115 ## gi|266625138|ref|ZP_06118073.1| response regulator + Term 422 - 461 5.1 - Term 402 - 452 5.0 2 2 Tu 1 . - CDS 457 - 3030 1735 ## COG3409 Putative peptidoglycan-binding domain-containing protein - Prom 3065 - 3124 12.4 3 3 Tu 1 . 
- CDS 3167 - 3400 211 ## gi|266625140|ref|ZP_06118075.1| conserved hypothetical protein - Prom 3447 - 3506 7.6 Predicted protein(s) >gi|229783885|gb|GG667850.1| GENE 1 32 - 418 115 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625138|ref|ZP_06118073.1| ## NR: gi|266625138|ref|ZP_06118073.1| response regulator [Clostridium hathewayi DSM 13479] response regulator [Clostridium hathewayi DSM 13479] # 1 128 11 138 138 200 100.0 2e-50 MVLIEAVSKIRTYENDGQCIVYINTCFKNKYCLLCKNYYIYKEIEELYENKDIPSQIECF SDIIFIIDISRYINQIECDSHKKIAELYLVEGKSDREISETLCISRQYVNRVRRRLLNDL RTEYFMQK >gi|229783885|gb|GG667850.1| GENE 2 457 - 3030 1735 857 aa, chain - ## HITS:1 COG:BH1295_1 KEGG:ns NR:ns ## COG: BH1295_1 COG3409 # Protein_GI_number: 15613858 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative peptidoglycan-binding domain-containing protein # Organism: Bacillus halodurans # 67 450 16 459 459 175 31.0 3e-43 MLLKNGDTGIQVKYLQQGLKIMCCNPGSIDSAFGPGTQAAVEKFQEEWGLTVDGIVGNDT WNCLLAEIKPIQQALKNKGFYTGAITGIAKDSTYNAVIRFQSSRDLTADGMVGAATRARL FNEDQGGGDESMLPLSIGDRGDYVLYLQYGLRILCCSPGALDGVFGSGTAEAVKKFQAKY GITDNGIADTTTWNTLKGQITDIQSRLLERNYSIAIVDGLATSALVETIKKYQEANWLTA DGQVGPATYELLFSDVEDGATDALPLKTGSRGPRVLYFQYALRISCINPNGTDGVYGPGT KSAVDRYKTRKGLTADGMVDTVTWEKMRDEIRPLQTALVNRGYDVGFVDGIATEKVYNSV LQFQTDHNLVADGMIGNATKALLLGGTAGGGTVSSTLKLGSNGSLTRYLQRLFNELGYQI PIDGIFSQETHNAALSFQTTHGLEADGIVGGGTWRKLFEVYRVDVPGTGVEKLLNVVKHE LAWGFAEDNANNITPYGQWYEMNRSPWCAMFVSYCAYQAGVLDTLVPKFAWCPSGMTWYK NRQKYHKRNSGYIPKKGDVIFFYNDELGRVAHTGIVVDGDENYVTTIEGNTTIDAVEQRT YNRNHSTIDGYGDNGGEAIELPAPPTEEEINEILVDHYREFLDACYIILPSEQITLNYEA TIPMPPNGKALVEASADTTIFDNSINNPNAVTFDVEGGIAMSQEIALSEALTLTFEESGL EDAQSLADIVFDINMSLDTGASVVASGIRTEADGTWFYISYAVKKEVQIADGYPPVNFVF KYTLCLKSDDSAGARFFELVEEFVTEYRKEINVVVGVAAVIGLAIAFKALLLAGGISGLI AATKAVLGAAAKVAIVA >gi|229783885|gb|GG667850.1| GENE 3 3167 - 3400 211 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625140|ref|ZP_06118075.1| ## NR: gi|266625140|ref|ZP_06118075.1| conserved hypothetical protein 
[Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 77 1 77 77 108 100.0 2e-22 MEELIVNLVQSQGIWAVLFVFLLLYTIKKNDKLDELQEARERKYQELLTQLTVKLSIVNT VNEKLDTIQAVLKEKSD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:18:29 2011 Seq name: gi|229783884|gb|GG667851.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld244, whole genome shotgun sequence Length of sequence - 6304 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 3 - 299 362 ## COG0637 Predicted phosphatase/phosphohexomutase + Prom 318 - 377 1.9 2 1 Op 2 . + CDS 405 - 1400 1034 ## COG1609 Transcriptional regulators 3 1 Op 3 . + CDS 1443 - 2486 1203 ## COG1363 Cellulase M and related proteins 4 1 Op 4 . + CDS 2474 - 3451 835 ## Clocel_0465 aminoglycoside phosphotransferase 5 2 Tu 1 . - CDS 3390 - 3563 113 ## - Prom 3666 - 3725 15.3 6 3 Op 1 2/0.000 - CDS 4627 - 5673 1193 ## COG1904 Glucuronate isomerase 7 3 Op 2 . 
- CDS 5706 - 6272 392 ## COG1609 Transcriptional regulators Predicted protein(s) >gi|229783884|gb|GG667851.1| GENE 1 3 - 299 362 98 aa, chain + ## HITS:1 COG:CAC2614 KEGG:ns NR:ns ## COG: CAC2614 COG0637 # Protein_GI_number: 15895872 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 3 88 122 204 215 89 53.0 1e-18 GFILGQLGLGDYFDAVSDGNQITRSKPDPEVFLLAAKLIGEKPEDCLVVEDARAGLAAAK AGGMTGAGIGDAADCELADYHLRTFQDLLTATHLSACN >gi|229783884|gb|GG667851.1| GENE 2 405 - 1400 1034 331 aa, chain + ## HITS:1 COG:BH2227 KEGG:ns NR:ns ## COG: BH2227 COG1609 # Protein_GI_number: 15614790 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 326 1 334 347 228 38.0 2e-59 MATIKDIAEKAGVSIATVSRVLNYDETLNVQDETRKRIFEVAEQLDYQVKDKKRRKRKLK IGVLYSYSLEEELDDTYYLSVRVAIEKKLAEEGYKKYRINSSDTREDLAALEGIICLGTF SQSMMERIESFRKPTVFADAVPDTEKFDSVIHDIERSVLKVMKYLMNQGHKKIAFIGGYE TDSDGAEVVDNRTYAYQRFMKQEGLFHEEYVKIGGYTPKYGYQMMKELLELPDRPTAVFV ANDSLAIGCYKAVNEKGLAIPEDISVVGYNDIPAAKYLVPPLTTVHLHMDFMGEQAVSML AERILTGREIAIQTLIPAKLVIRESVRKLNG >gi|229783884|gb|GG667851.1| GENE 3 1443 - 2486 1203 347 aa, chain + ## HITS:1 COG:DR0229 KEGG:ns NR:ns ## COG: DR0229 COG1363 # Protein_GI_number: 15805265 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Deinococcus radiodurans # 6 340 34 369 375 167 33.0 2e-41 MTDERFLEELLEAVTVSGFEEEGQAVVRKYMEPLADELRTDEIGDTVCVLNPESRLKILM TAHLDEIGLMVTAVNEQGRLLVIDRGGIIPATYPGHRVKVMTEQGPVFGVVESYRDLFKK EGGLKTSDFLVDIGVDSKEEACRLVNPGAPVILDTGMQKIAGGRFVSRALDDRLGVYIIM EAFKKAGKRGCTSGVYSAATVGEETTKNGACWTAARVKPDLAVVVDVTYTSDCLGMNAAD SGEVRLGGGPVLCNSPIVSKRLNRMMRECASGIGIEVQTEAASRLSYTDADKIHFAGEGI PVVLVSIPLRYMHHPAEMADEKDVEGCIDLIAEFLVKYEEMCAAWRV >gi|229783884|gb|GG667851.1| GENE 4 2474 - 3451 835 325 aa, chain + ## HITS:1 COG:no KEGG:Clocel_0465 NR:ns ## KEGG: Clocel_0465 # Name: not_defined # Def: aminoglycoside 
phosphotransferase # Organism: C.cellulovorans # Pathway: not_defined # 1 314 1 314 323 305 47.0 1e-81 MESLTKNRQSREIVQKIAEKFFPDDRMAEYEELTEGYFNVAYEITLLSGRSVILKVAPPD EVPVMSYEKNIMFAETESMKMAAANPDIPVPLVYGYDDSRSICPSPYFLMEKLRGNSLYS IRENLPEENINAIKKETGEVNRRINEIICPRFGLPGQQECQGEEWFPVFRRMMEMGIEDA EARAVDLKVSAGEILECLEKDRRFFEEVKTPCLVHWDIWAGNIFTDGICVTGIIDWERSL WGDPLLEVGFRTFDVDICFRKGYGKERLTESEERRALWYDVYMAVLQALECEYRHYDTME MYERGVRLLGEQMKRMRLLQGQQDR >gi|229783884|gb|GG667851.1| GENE 5 3390 - 3563 113 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKAKEILNTIGYDIDITIVPAGSQVNDSPSDGPDNPLLIDPAAPATASSVSSAPPEV >gi|229783884|gb|GG667851.1| GENE 6 4627 - 5673 1193 348 aa, chain - ## HITS:1 COG:uxaC KEGG:ns NR:ns ## COG: uxaC COG1904 # Protein_GI_number: 16130987 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Escherichia coli K12 # 1 347 1 348 470 392 52.0 1e-109 MKQFMDKDFLLSTDAALNLFHNYAEKMPIVDYHCHINPQEIYEDRKFDNITQVWLGGDHY KWRQMRSNGVDEKYITGDATDREKFQKWAETLPKLIGNPLYHWSHLELQRYFGFTGYLNG DTAEEVWNLCNEKLSQDSMSVRNLIRQSKVTLICTTDDPADTLEWHQKLAADDSFEVKVL PAWRPDKAMNIEKADFAAYMAKLAKAAGIEITDFASLKLALKNRMEFFASQGCSVSDHAL EYVMYVPADDVELDTIFAKGLAGEPVSREDELKYKTAFMLFVAKEYNRMGWVMQLHYGCK RDNNAMMYANLGPDTGFDCINNYAPSAQMADFLNALSATDEIPKTIIS >gi|229783884|gb|GG667851.1| GENE 7 5706 - 6272 392 188 aa, chain - ## HITS:1 COG:ascG KEGG:ns NR:ns ## COG: ascG COG1609 # Protein_GI_number: 16130621 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 185 146 331 337 96 31.0 2e-20 MLVNGCLEGKQIYSTVCDDRTAVYESVTRLIQSGHTSVLYLYAGSSYSNLQKLAGYHDAC AAAGLIIRDDLIRICPKDLDGTEELLQSLAAAGIKFDAVFAAEDVMAVGAVKYACRAGIK IPEEMNIIGYNNSILARCTDPELTTVDNKVESLCTTTVTTLMKVLEGGNVPVKTTISAEL IKRGTTNF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:18:41 2011 Seq name: gi|229783883|gb|GG667852.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld245, whole genome shotgun sequence Length of sequence - 4987 bp Number of predicted genes - 6, 
with homology - 6 Number of transcription units - 3, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 73 - 606 532 ## COG0602 Organic radical activating enzymes 2 1 Op 2 . + CDS 628 - 1335 756 ## Cphy_0177 hypothetical protein + Prom 1339 - 1398 6.0 3 2 Op 1 . + CDS 1464 - 1697 338 ## Closa_2230 hypothetical protein 4 2 Op 2 . + CDS 1710 - 2228 680 ## COG0717 Deoxycytidine deaminase + Term 2302 - 2357 11.5 - Term 2288 - 2345 8.1 5 3 Op 1 . - CDS 2363 - 3259 893 ## Rumal_3043 hypothetical protein 6 3 Op 2 . - CDS 3354 - 4802 1204 ## COG0534 Na+-driven multidrug efflux pump - Prom 4849 - 4908 3.6 Predicted protein(s) >gi|229783883|gb|GG667852.1| GENE 1 73 - 606 532 177 aa, chain + ## HITS:1 COG:FN0312 KEGG:ns NR:ns ## COG: FN0312 COG0602 # Protein_GI_number: 19703657 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Fusobacterium nucleatum # 1 162 1 157 168 148 44.0 4e-36 MKYADIKHCDVANGTGVRTSLFVSGCTHRCRGCFNEIAWDFNYGKEFTDEAVQEILDSCG KSYISGLSLLGGEPMEPQNQRALLPLVLRFKEQYPEKTIWCYTGYTYEADLLSPEGRAHC EVTPEFLACIDILVDGEFDQECYDLKLKFRGSSNQRVLQIHEPGCPELFAVKDESTQ >gi|229783883|gb|GG667852.1| GENE 2 628 - 1335 756 235 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0177 NR:ns ## KEGG: Cphy_0177 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 234 1 222 224 278 54.0 2e-73 MKREISLEEISDGKLYGSGDMVKVGCDDCRGCSACCRGMGSSIVLDPYDIFRLETGLDLS FEELLAEAVELNLVDGMILPNLKMSGEGEACTFLNEEGRCRIHPCRPGICRMFPLGRIYE DHGFRYFNQIHECRKAQKTKVKIRKWMDTPDLGRYENFIVEWHYFLKELEERLEKKREQK PEEGSEQAGGEDARKRAAMGVLKVFYLIPYQKELDFYQQFEARMAAAREEFINFL >gi|229783883|gb|GG667852.1| GENE 3 1464 - 1697 338 77 aa, chain + ## HITS:1 COG:no KEGG:Closa_2230 NR:ns ## KEGG: Closa_2230 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 76 1 76 77 107 64.0 1e-22 
MAKVKRFSTKINSFFVEWSALNHYLSSYECEKNGVPYIGFKDPRVQKLWNSLSSAEKRDL MKHYGKKNEAELGDYLG >gi|229783883|gb|GG667852.1| GENE 4 1710 - 2228 680 172 aa, chain + ## HITS:1 COG:CAC0025 KEGG:ns NR:ns ## COG: CAC0025 COG0717 # Protein_GI_number: 15893323 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidine deaminase # Organism: Clostridium acetobutylicum # 1 171 1 171 173 272 73.0 2e-73 MILSGKEILKHMGKEIIITPFDEKRINPNSYNLSLADELLVYEKEALDMKKPNPTKRIVI PDEGLLLEPNRLYLGRTNEFTKTDRYVPMLEGRSSTGRLGLFIHVTAGFGDIGFAGYWTL EIFCVQPVRIYPNVEICQIYYHDIDGEYDLYSSGKYQNNSGIQASLMYKDFE >gi|229783883|gb|GG667852.1| GENE 5 2363 - 3259 893 298 aa, chain - ## HITS:1 COG:no KEGG:Rumal_3043 NR:ns ## KEGG: Rumal_3043 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 14 287 5 281 291 305 51.0 2e-81 MGLFSKKKGPKEELKQNLEEKAEFQNLYQIYLLFEEKPVKPEAGIIRTALEETFGAIDIV SPSPALTSFAVKKYLAQFKDGALPPQVVMGDVQEFRQDSIDAFSRSQLWDVADGDALLER CRYEIFLFDMMAAVLEYKERCEMMMDWMETAVRLFPDCTAVWVEPAGKLFTAEQIREHRI PREDRFVYFGVNARLFNISGSDDKVVDTLGLYAIGLPDVQYHFRGLEPQAVVNHAYNVAS YLYDADAPVKSGETIDGIADGQLSREVQWKCQYEDALIQPVRVVMDICPGEYAAGHRE >gi|229783883|gb|GG667852.1| GENE 6 3354 - 4802 1204 482 aa, chain - ## HITS:1 COG:BB0473 KEGG:ns NR:ns ## COG: BB0473 COG0534 # Protein_GI_number: 15594818 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Borrelia burgdorferi # 11 440 1 428 454 92 20.0 2e-18 MSRQLLPKAVLSAKDQHFREFALHDNMWKVILYVGTPLALYQSLNQLFKIFDSMMAAHIS AGSVSAVAYLSQISLMLSALGGGLAIGSSLKISEAYGAGDFDLVKKRVSTLFAMCSVLSA VILALLVPFTPWFLRLARTPEAFISEGTLYFRLELFGMVISFFNNIYIAIERARGNSGRI LRLNMAVIAVKMAFTALFVYGLQCGISMISTATILSQLFLLAAAVRNMNQKDNVFGFSMR AVSFQRDTTVPMISLSIPVIAEKMAFSFGKVIINSMSTVYSALTVGALGISNNIGGITTM PQNGFQEGGSAIISQSVGAGKPRRALKAFGCVLTVNLAVGIVLMACSLYFLEELSRLFAG NDPAFQKMIISIYRFEALGAIPLGITASVMALLYGFGKTKLTLAINFCRVFVFRVPVLWF LQSYTSMGSLSVGVVMAVSNIATGVFALIIGAYEIVRICRRYEIPLSAPFHFSVFQKAWG KS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:18:53 
2011 Seq name: gi|229783882|gb|GG667853.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld246, whole genome shotgun sequence Length of sequence - 6284 bp Number of predicted genes - 8, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 247 - 654 163 ## CKR_3435 hypothetical protein - Prom 871 - 930 3.2 2 2 Tu 1 . - CDS 1006 - 2754 1156 ## COG3344 Retron-type reverse transcriptase - Prom 2860 - 2919 3.8 3 3 Tu 1 . - CDS 3321 - 3437 182 ## - Prom 3457 - 3516 2.8 - Term 3468 - 3510 1.0 4 4 Tu 1 . - CDS 3643 - 3876 220 ## gi|288871578|ref|ZP_06118096.2| conserved hypothetical protein - Prom 4068 - 4127 3.3 - Term 4275 - 4309 4.0 5 5 Op 1 . - CDS 4362 - 5093 363 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 6 5 Op 2 . - CDS 5115 - 5276 143 ## 7 5 Op 3 . - CDS 5273 - 5722 327 ## gi|266625165|ref|ZP_06118100.1| hypothetical protein CLOSTHATH_06577 8 5 Op 4 . 
- CDS 5734 - 6282 353 ## Rumal_2264 hypothetical protein Predicted protein(s) >gi|229783882|gb|GG667853.1| GENE 1 247 - 654 163 135 aa, chain - ## HITS:1 COG:no KEGG:CKR_3435 NR:ns ## KEGG: CKR_3435 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 6 134 7 139 139 113 58.0 3e-24 MLDRRAVQAIVFLLLLLAASVTDIKKRVVPDLLCMLIALTAAISFHPEYVWGIFIALPFL LAAVFYGGMGGGDIKLMAAAGLVLGLPAGIAATIVGLSLVLVYSVLLKICKKTQVIAVPL VPFLSAGCAVGYLIG >gi|229783882|gb|GG667853.1| GENE 2 1006 - 2754 1156 582 aa, chain - ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 50 515 256 707 834 244 34.0 3e-64 MHMAKGGSLSLNTKKGKVREALRNPIDVLNSLKSKACDEGYCYERLYRNLYNEEFYLLAY QNIYAREGNMTAGADGKTIDGMGMDRIQRLIGKMKNHSYQPNPARRTYIEKKNGKKRPLG IPSFDDKLVQEVVRLILESIYEPTFSNYSHGFRPNRSCHTALQQIQSTFTGVKWFIEGDI CGFFDNIDHAVLINILRKRIHDEYFIALLWKFLKAGYVEDWMFHNTYSETPQGSIISPIM SNIYLDELDRYMEIYMKEFERGKRRQKSAEYANWEYKLKYLRYEVYAKDKWVDLTAEERK TANDQIHAVRSKMLKCEYSDPQDSGYRRLFYVRYADDWLCGVIGSKQDAEEIKADMKRFL SETLKLELSEEKTLITNARDMARFLSYDIFLSDNESLREDKNGHTRRVRRGKVKLYVPRE KWQKKLMDYKALEIKYQNGKEIFIPVHRTYLISNDDFEIVQQYNAEVRGLYNYYKIADNV NVLGHFNYVMKFSMFKTFGAKYKLHISGVRKKYGYKHFGVKYQTKQGEKILYYYEDGFKK VKTGIAKPEVDNVPKVYRNNNPTGLIARLRAGHCEWCGAENV >gi|229783882|gb|GG667853.1| GENE 3 3321 - 3437 182 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKAKMKDYLIKKGMKVKAALENNKAEGFVDTASASVRA >gi|229783882|gb|GG667853.1| GENE 4 3643 - 3876 220 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871578|ref|ZP_06118096.2| ## NR: gi|288871578|ref|ZP_06118096.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 77 9 85 85 139 98.0 5e-32 MNKTSFEYGGYHFYPLRTFTAEDGDFYELTRKLGQDREIGLTKCDETWQKYPYDYEEFYK AAGGKVFDIFLCEENGK >gi|229783882|gb|GG667853.1| GENE 5 4362 - 5093 363 243 aa, chain - 
## HITS:1 COG:MA3114 KEGG:ns NR:ns ## COG: MA3114 COG0454 # Protein_GI_number: 20091932 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Methanosarcina acetivorans str.C2A # 96 241 7 152 166 117 42.0 2e-26 MAICEICGGEMLERVGCKIGICDCNGKSYPRIRFGAEKRFEDVFGEEDSCPDCLAPFGSF HHFGCDIEECPVCGEAIYGDCKCDLVFPDLEDLVIIKRAEQDDLQKILNLQYLAYQSEAK LFNNPDIPPLKQSLTDVVNEYQKGIVLKAEDVKGNIIGSVRAYSENGTVYIGKLIVHPER QGKGIGTKLLINIENEYPGCRYELFTSTKSEKNICLYERLGYAIFKEKEITEELKFLYLE KQN >gi|229783882|gb|GG667853.1| GENE 6 5115 - 5276 143 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTILCAIIKFIGKLLLGIAYALLKIASFSVKVIGCMFLFVFHIFMLLVDGTRF >gi|229783882|gb|GG667853.1| GENE 7 5273 - 5722 327 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625165|ref|ZP_06118100.1| ## NR: gi|266625165|ref|ZP_06118100.1| hypothetical protein CLOSTHATH_06577 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06577 [Clostridium hathewayi DSM 13479] # 1 149 1 149 149 236 100.0 3e-61 MDDKYVTTLRLYKNNPAHFQALTCLEAYNRELFASQNDFIAEAVIHYSEYLKQKTEYEKL ERTLSNIREKEDYFRELTKRALEDVLGSMQITTIPAVAESDMKTKEEQETNSESERPAEM TDQATEKDFNLLDFYSSFTGFEDEGEDRI >gi|229783882|gb|GG667853.1| GENE 8 5734 - 6282 353 182 aa, chain - ## HITS:1 COG:no KEGG:Rumal_2264 NR:ns ## KEGG: Rumal_2264 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 1 167 129 294 307 100 32.0 2e-20 FNVYLEGVSVYMQGFAAITDILAETQGKSCLLVDIGGGTVDGVPIENMRPSGAQPIIDNN GTIKCISNVNEVLMAEFGEKAKPYIIETIMRTGTYAGDEMYLRVIRRELNKYTEYIYNLI KKHGYNLHLEKIIFMGGGASIMQNFGDNEGKDVICIPDIHANARGYEETLKSIWKARQNM AG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:19:26 2011 Seq name: gi|229783881|gb|GG667854.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld247, whole genome shotgun sequence Length of sequence - 8788 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 3 average op.length - 
2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 831 832 ## Mbar_A2481 hypothetical protein 2 1 Op 2 5/0.000 - CDS 875 - 1282 574 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 3 1 Op 3 . - CDS 1279 - 2790 1489 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit 4 2 Op 1 . - CDS 3713 - 4183 447 ## COG0223 Methionyl-tRNA formyltransferase 5 2 Op 2 5/0.000 - CDS 4200 - 5060 570 ## COG0534 Na+-driven multidrug efflux pump - Prom 5090 - 5149 13.7 6 3 Op 1 . - CDS 6051 - 6821 725 ## COG0534 Na+-driven multidrug efflux pump 7 3 Op 2 . - CDS 6843 - 7148 301 ## bpr_I0261 anti-sigma factor antagonist 8 3 Op 3 . - CDS 7188 - 8780 1444 ## gi|266625174|ref|ZP_06118109.1| hypothetical protein CLOSTHATH_06586 Predicted protein(s) >gi|229783881|gb|GG667854.1| GENE 1 3 - 831 832 276 aa, chain - ## HITS:1 COG:no KEGG:Mbar_A2481 NR:ns ## KEGG: Mbar_A2481 # Name: not_defined # Def: hypothetical protein # Organism: M.barkeri # Pathway: not_defined # 23 274 5 255 267 128 35.0 3e-28 MEITENIENEAEAARSPGEVKKEIRKLRHKAERPLYWILFVMNILILLCAFVFTMAMQSS DLSGFIDDPDIIELLQTINGGIVFFYLAYVIIAFLCAFYQLYASELSYAIKVTPRNFPEI YEKSVEFSKNLGMKKVPEVYIRQQNGILNAFSAWVIGKRYVQLNSEIVDIAYMENKDFDT LYFVMAHEFGHHYFNHVTLAHSFSILLPRIIPILGPLYSRSQEYTADRVAQVLTESTGIK CMAMLSAGRHLYPYVDAEDYLENIYRHPNVLERLGR >gi|229783881|gb|GG667854.1| GENE 2 875 - 1282 574 135 aa, chain - ## HITS:1 COG:slr1861 KEGG:ns NR:ns ## COG: slr1861 COG2172 # Protein_GI_number: 16330247 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Synechocystis # 1 135 1 137 143 81 34.0 5e-16 MKELTISATIENIERVTDFVNEQLESLGCSMKAQMQIDIAIDELFGNIAHYAYHPENGFA TVRVEVMDNPMAVVITFMDNGVPYDPLANEDPDITLSADERGIGGLGIYIVKKSMDEITY EYKDGKNILKIRKNF >gi|229783881|gb|GG667854.1| GENE 3 1279 - 2790 1489 503 aa, chain - ## HITS:1 COG:TM0467 KEGG:ns NR:ns ## COG: TM0467 COG2208 # Protein_GI_number: 15643233 # Func_class: T Signal transduction mechanisms; K Transcription # 
Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Thermotoga maritima # 249 500 199 443 445 149 35.0 1e-35 MYRGQMEEQYGKVAFDEATIAAEMIDGDMVEKYYATGEKDAYYEKVAAALLRVKQTIGLK YFYVVVPEEDVMVYIWDVGEEGEEGVRRLLDTDSYYGGGNEVMHAAFSANAKRNILFTRN EEYGYLASAYVPIFNSAGEPVALSSVDISMEKINRDIYRFVWTAVSIAGLAMFVSIIAYY YYIRRIVIRPAEVLHDAASALIKNDMDDLENFSISVKTGDEFEELADAFQFMTVELSEYI KNLTVVTAERERIGAELDVATQIQSSMLPCIFPAFPDRREFDIYATMNPAKEVGGDFYDF FMVDERHLAIVMADVSGKGVPAALFMVIGKTLIKDHTQPGRDLGEVFTEVNDLLCEANSE GLFITAFEGVLDLVTGEFQFVNAGHEIPFICKSGGFYEPYKIRAGFVLAGMEGIRYKAGT MQLEPGDKIFQYTDGITEAMNSRNELYGMKRLEETLRKNVHKAPMELLPEVKADIDSFVG SAVQFDDITMLCLEYRARMEGIS >gi|229783881|gb|GG667854.1| GENE 4 3713 - 4183 447 156 aa, chain - ## HITS:1 COG:FN1489 KEGG:ns NR:ns ## COG: FN1489 COG0223 # Protein_GI_number: 19704821 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Fusobacterium nucleatum # 1 149 8 163 317 69 29.0 3e-12 MKIVYFGTDVFLPCFSYFVRNHQVLSLYTYHNDEDCFTEYGIVKEAEKYGIPVHYEDMTA AETKRLFTEEGCGLFFSAEYNRILPLPEDVTAFRGINLHSSLLPEGRSYYPIEAAMERGF LESGVTMHKMTAALDGGDILDQSSVEITEGMDSIAS >gi|229783881|gb|GG667854.1| GENE 5 4200 - 5060 570 286 aa, chain - ## HITS:1 COG:BH2163 KEGG:ns NR:ns ## COG: BH2163 COG0534 # Protein_GI_number: 15614726 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 1 140 299 440 464 60 27.0 3e-09 MQPILSTYYGEHNVSGRRNAVRLGFASGIAVGTLLIAAVMVFPNAFCRLFGMTEAALASM AAEALRIYCAGAFFAGISILTCNYYQSCERLKAVFLIETMRGAVVLLPCTLLFASCGIEK FWLLFPATEIVSLLIFLLWRRFSHYIVDDGDPERVFQYTISGSVTDLTEVCGRIEDFCVK WNASAGQRYFVVMTVEELGMAILTKGMKDRDDGYIQITVIAAENGKFKLYLRDNAWKFNP FEMETKRANSEDEVFMDSMGILVLKEKAEDFYYQRYQGFNSLVLTI >gi|229783881|gb|GG667854.1| GENE 6 6051 - 6821 725 256 aa, chain - ## HITS:1 COG:lin2873 KEGG:ns NR:ns ## COG: lin2873 COG0534 # Protein_GI_number: 16801933 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 13 
217 19 221 450 81 28.0 2e-15 MNKDKFTGPMYFKLWWPAMFSSVGWALSDMADAVVVGQKLGATGLAAISLILPVYMFNCM MAHGFGLGGSVRFSTLLSKGKEAEANRHFSAVLELSMACSIVTAALGILFLDQLLAFLGT TSSDGKLFYATKDYLFILVLSTPLFYLSNILNYFLRNDSCQRRAGIGSVIGNITDIGLNI ILVLGLGLGTGGAALSTSIGQIIAIAVYLPGLYGKESHLTPKLLKRLPVKDVMKSLRSGF ATSVQYLYQMVFFWLS >gi|229783881|gb|GG667854.1| GENE 7 6843 - 7148 301 101 aa, chain - ## HITS:1 COG:no KEGG:bpr_I0261 NR:ns ## KEGG: bpr_I0261 # Name: not_defined # Def: anti-sigma factor antagonist # Organism: B.proteoclasticus # Pathway: not_defined # 1 98 1 98 98 97 51.0 1e-19 MEIIKEKNGTAVTMSLQGRLDTATAPQMEAELKKDMEEVTRLILDMRELEYLSSAGLRVI LNAQKMMGKRGGMVVRHVNETIMEVFELTGFSDILTIEQES >gi|229783881|gb|GG667854.1| GENE 8 7188 - 8780 1444 530 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625174|ref|ZP_06118109.1| ## NR: gi|266625174|ref|ZP_06118109.1| hypothetical protein CLOSTHATH_06586 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06586 [Clostridium hathewayi DSM 13479] # 1 530 3 532 532 1000 100.0 0 MDDETKTAVEFLRRTLLETAPDSETLKDISAVLNNRELLDWLEQTERNLAQASGFAESVT YPAVEAVIDGQTVKKDPYRDGYTVYMDKVTAMKEHLKVMTEQSQLLAEARKSLGELESQI KLERRDDLTFDNQQSFLLKQCDRMQELIQQVNQGANDKVVNDAKKELKETAWKIGLSPGI GDNSVLKLDSALLARAVYLKKTVVQQIDSGKNPYILNRISHSRVAMTDGNYVTLLNFKNG DIETNKFDKTEDFKNYSLVEDSCMGGGEGAVYANADTDYKKKCIVDINQLVGTELPAAGG SSVYLMYSGTALADDRLGSTVNFTKGNTYVCMNSLGKPADTVYNVSSYSFYYKRAFLEYA SSQKVEKKDQEKVSTNLDVEMDELYGKAKATEELKTDTEFLSQEHTIILPQKNRYDNDIY FLNISLQEGVVLYKYSKAYSTSVPKMEVVRRITDEMNYRGGKMGGSYFRGWLKEKKADGD SGKVTGYDLVLLGFSKEDTLTEVNGVKEYRSIVPDDIYQAHLYKISIPKP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:19:54 2011 Seq name: gi|229783880|gb|GG667855.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld248, whole genome shotgun sequence Length of sequence - 4798 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 43 - 858 459 ## Shel_00880 putative cell wall binding protein - Prom 985 - 1044 4.8 + Prom 947 - 1006 4.9 2 2 Tu 1 . + CDS 1079 - 1951 712 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 1997 - 2056 5.3 3 3 Op 1 . + CDS 2084 - 3436 1082 ## COG2211 Na+/melibiose symporter and related transporters 4 3 Op 2 . + CDS 3453 - 3581 207 ## 5 3 Op 3 . + CDS 3636 - 4334 517 ## COG0657 Esterase/lipase + Term 4525 - 4555 1.1 6 4 Tu 1 . - CDS 4350 - 4766 194 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA Predicted protein(s) >gi|229783880|gb|GG667855.1| GENE 1 43 - 858 459 271 aa, chain - ## HITS:1 COG:no KEGG:Shel_00880 NR:ns ## KEGG: Shel_00880 # Name: not_defined # Def: putative cell wall binding protein # Organism: S.heliotrinireducens # Pathway: not_defined # 45 105 418 476 478 67 50.0 7e-10 MKLKKRLFTISAGICLSLALTAVHFPARLQETFAAADTGSETNHTGWLHNGSRWQYALSD GSLICGGWIKDGDGWYYFDTDGMMVSDTTLDIEGHTYRFEPSGAWRETPPADPLFVHLPS GRFEQSVYEHPWAGIRITLPEYTFALTAEELNALSAHDYIPSYYDFLAIMPEQSLFGVVI QYNSDPIEQILTDDLDVFSAFWGKYDLSPSVPQPVTVAGQTYQKMVCSIPENLKMNVYLR NQENKVIYLFSIAADEEAADRFIETIIPCTR >gi|229783880|gb|GG667855.1| GENE 2 1079 - 1951 712 290 aa, chain + ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 31 283 23 286 292 101 28.0 2e-21 MSEPVAKRQMFEYDGDNPFHIYNLPYYPSIGDELLFNHWHEELEIVYTLQGNSLHYIDGT CIRSQPGRLIVVNSESSHNIIPDRSVYGNGRKVVIALIISREFLEEVFPDFRNMYFLNEK EVTTPVIRELMVTLSHFGDGTELSRYEKIYRKGLVLQLLYHITQEGVAEKNVILPLNNQK NIERLKGVLQFIENHYMEKISQAEIAEKFYFSREYFARFLKKHTGMTFTEYLTKYRLQKA RMQLMASDSGILEIALNNGFSDDRRLILAFRQEYGTTPFQYRKMERNKEK >gi|229783880|gb|GG667855.1| GENE 3 2084 - 3436 1082 450 aa, chain + ## HITS:1 COG:BS_ydjD KEGG:ns NR:ns ## COG: BS_ydjD COG2211 # Protein_GI_number: 16077683 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related 
transporters # Organism: Bacillus subtilis # 3 446 17 460 463 186 31.0 1e-46 MGRKKNGLSWTEKISYGMGDCGANVTVALCSTFLTAYYTDTVGIAAAAVGTMMLLARVFD GITDIVMGAVVDRTKTRWGKARPWVLWTAPLMAIALILEFNVPGGLSGNGKLVYAYLTYI FQNCIVYTANNLPFNALLSRMTLNVQDRASAATTRFVMTQLTTLVINAVTATLLSTAGWF WLSIIYGIVTFVMCLICFLGTREHIGEDAETGVVQVENVPLKTALPALLKNRYFYIQSLL FMFLYIGIVSTGSTTFYFCNIVLNNLGYLTYISMATTIPAIIVNLMLPKVIRKYGKWKLM VSGAFLMVLGGLVVGAAGSSFPLVLLGLIIKGTGMGPIMSGIFAMTADVVDYGEWKTGVR SEGLVNCCTSFGMKVGIGLGSALCTWIITMGGYDGTAAVQPASAVAMIRFGFGYSGAIIA VVCLALCFMMNIDKYIVTIQHDLEAKRNTK >gi|229783880|gb|GG667855.1| GENE 4 3453 - 3581 207 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSELSDTIRKLFKEGDDRRDAGLTTPEGILRYDDIVYGEDAG >gi|229783880|gb|GG667855.1| GENE 5 3636 - 4334 517 232 aa, chain + ## HITS:1 COG:SA0610 KEGG:ns NR:ns ## COG: SA0610 COG0657 # Protein_GI_number: 15926332 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Staphylococcus aureus N315 # 3 218 88 307 347 92 29.0 5e-19 MSVHGGGWVYGDKERYQYYCMDLARRGFAVVNFTYRLAPEFKYPASLEDTNRVFAWVLDH AKEYGFDRSRIFGVGDSAGAHILGLYAAICTNPDYAAAYPFQPPKGFVPAAVALNCGVYR TPVSDDEKDQSNLLMADLLPEKGTREEFEGISLVNFVTSQYPPVFLMTASGDFLRDQAPL LAEKLMEFEIPFVYRYFGNADQPLGHVFHCNIRLKEALRCNDEECEFFREYC >gi|229783880|gb|GG667855.1| GENE 6 4350 - 4766 194 138 aa, chain - ## HITS:1 COG:SPy0634 KEGG:ns NR:ns ## COG: SPy0634 COG2893 # Protein_GI_number: 15674706 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: Streptococcus pyogenes M1 GAS # 1 99 1 98 145 75 40.0 3e-14 MIGLVITGHGQYAAGLVSALELLIGHQDLLTAVDFEAGQNEDLLTEHLKHAVEGMHTCDE ILILCDMIGGSPYKCAVRLTAILPKITVIYGINLGMALELAMRCRMGLDRDAGALADEMI DTGKAQIGKYRPVPAGSS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:20:06 2011 Seq name: gi|229783879|gb|GG667856.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld249, whole genome shotgun sequence Length of sequence - 7444 bp Number of predicted genes - 6, with 
homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 - CDS 1 - 262 233 ## COG1175 ABC-type sugar transport systems, permease components 2 1 Op 2 . - CDS 278 - 1378 1024 ## COG3839 ABC-type sugar transport systems, ATPase components - Prom 1406 - 1465 7.7 3 2 Tu 1 . + CDS 1870 - 3687 1678 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains + Term 3693 - 3747 13.8 - Term 4789 - 4835 1.6 4 3 Op 1 15/0.000 - CDS 4849 - 5796 309 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 5 3 Op 2 . - CDS 5760 - 6278 522 ## COG0597 Lipoprotein signal peptidase 6 3 Op 3 . - CDS 6289 - 7377 1186 ## COG0337 3-dehydroquinate synthetase Predicted protein(s) >gi|229783879|gb|GG667856.1| GENE 1 1 - 262 233 87 aa, chain - ## HITS:1 COG:CAC0427 KEGG:ns NR:ns ## COG: CAC0427 COG1175 # Protein_GI_number: 15893718 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Clostridium acetobutylicum # 13 87 28 102 304 61 46.0 4e-10 MKVNAMVKKAGPYAYLVPALAVFAVFLFYPFFKTIYLSLYKTNKMGQARLFVGAGNYKDL LTSSSFLNSLVVTGVFVAIVVSVSMLL >gi|229783879|gb|GG667856.1| GENE 2 278 - 1378 1024 366 aa, chain - ## HITS:1 COG:CAC3237 KEGG:ns NR:ns ## COG: CAC3237 COG3839 # Protein_GI_number: 15896483 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Clostridium acetobutylicum # 1 364 1 369 369 404 54.0 1e-112 MASLSLVNVCKTYKNGFEAVKNVSLDIRDREFLILVGPSGCGKSTTLRMIAGLEDISSGQ LWIDGQLVNMLEPKNRDLSMVFQNYALYPHMTVYQNMAFGLKVRKTPKEEIDRRVRQAAK ILDIAHLLDRKPHALSGGQKQRVAIGSVIVRQPKAYLMDEPLSNLDAKLRSQMRVELSKL HRELGATVIYVTHDQVEAMTLGTRIVVMKSGKIQQAATPGDLYQNPVNKFVAGFIGSPAM NFLPVWVERRDERVCLEFGSSRLYVNRMCASRLIEGGYMGRRVYLGIRPEDFHETGPGEN ALLFEVEIREMLGAEQLLYGNSGRNELCVRMKPEFKAEPGSTVTVFVDMDRIKLFDMETE DNILYL 
>gi|229783879|gb|GG667856.1| GENE 3 1870 - 3687 1678 605 aa, chain + ## HITS:1 COG:CAC0158 KEGG:ns NR:ns ## COG: CAC0158 COG0449 # Protein_GI_number: 15893453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Clostridium acetobutylicum # 1 605 1 608 608 514 44.0 1e-145 MCGIIGYTGPLDSKKILLNGLSQLEYRGYDSAGVALCMENSHICLVRRTGKVANLRETME RITDVSHCGLGHTRWATHGGVTTQNTHPHQAGKVTLVHNGIIENYHQLTEQFHLQGRLHS ETDSEVAAWVLDSLYQGDPLEAITRLVALIKGSYSFCIMFEDYPGEIYAIRNVSPLVAAY TRSGSFVASDLTALIPYTRQYFVVPEDHIVRLTSYKVHLYNLHRHEEMPELMEVNWNMDA AMKNGFPHFMLKEIHEQPEALRNTILPRLNKGLPDFTDDQIPDDVFTSCSQIHIVACGTA MHAGMVARSMMEPLLRIPVTVSVASEFRYEEPLIGEDTLVIIISQSGETIDTLAAMRLAK NYGAAALSIVNVKGSTIARESDYTLYTHAGPEIAVASTKAYSVQLAALYMIGCRMALVRG KYSVSQAADFMKKLLDAIPAMEAMIAQKDSIKALVTHLINKSDAFFIGRGLDYAFSLEGA LKLKEISYIHAEAYAAGELKHGTIALITEEVPVIAIATQEKVFAKTISNIREVKARGAFV ILITREDAVLDGGGADIHIRIPKIGDRFTVFPIAVVLQLIAYYASTGKNLDVDQPRNLAK SVTVE >gi|229783879|gb|GG667856.1| GENE 4 4849 - 5796 309 315 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 38 303 27 280 285 123 33 4e-28 MPVRPEGKGSGLKQQFVVEEEDKAIRIDKYLSERCPELSRSYLQKLLKAEAVFVGGKPVK SNYKTAAGDLIELEVPEAVEPKIVPEEMDLDILYEDSDIILINKPKGMVVHPAAGHYSGT LVNGLMAHCRADLSGINGVMRPGIVHRIDMDTTGVLIVCKNDMAHNSIAEQLKVHSITRK YYAVVHGVMKEDEGTVSGPIGRHPVDRKKMCINEKNGRDAVTHYRVLERFRQFTYVECRL ETGRTHQIRVHMASIGHPLLGDAVYGPAKCPYKLTGQTLHAGVLGIVHPRTGEYMEFTAP LPEYFEELLRKLRTT >gi|229783879|gb|GG667856.1| GENE 5 5760 - 6278 522 172 aa, chain - ## HITS:1 COG:SPy0826 KEGG:ns NR:ns ## COG: SPy0826 COG0597 # Protein_GI_number: 15674864 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Streptococcus pyogenes M1 GAS # 11 159 6 149 152 95 37.0 3e-20 
MKQKAKLVTGLIIGFLAAIGLDQWTKLLAVNHLRNQPPYVIWDGVFEFLYSENRGAAFGM LQGKQWFFLIIAVIVVAAAVYAVFRMPASKKYLPLHLIAMFLSAGAIGNMIDRFTRGYVV DFLYFKLIDFPIFNVADCYVTVSMFFFILLFLFLYKEEDLNCLSGRKEREAD >gi|229783879|gb|GG667856.1| GENE 6 6289 - 7377 1186 362 aa, chain - ## HITS:1 COG:RSc2969 KEGG:ns NR:ns ## COG: RSc2969 COG0337 # Protein_GI_number: 17547688 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Ralstonia solanacearum # 27 357 24 354 368 248 40.0 2e-65 MEKRIPVHMDGKKVYDIVMEQSFGQLEEELRGMNLGDRKICVVTDSNVAPLYLEEVERIV SACCRKTEHFIFPAGEENKNLNTVKDLYETLILKKFDRHDYLLALGGGVVGDLCGFAAAT YLRGISFIQVPTTLLSQVDSSIGGKTGVDFDAYKNMVGAFHMPKLVYTNISTLKTLTDVQ FSSGMGEIIKHGLIKDAAYYDWLGEHWQEINERDLSVCEEMVLISNRIKRDVVETDPTEQ GERALLNYGHTLGHAIEKLADFKLMHGHCVGLGCIAAMGISAARGALPEEALTNLKERME RFHMPLTVSGLSAEDIIETTKSDKKMDSGTIRFVLLEEVGKAYLDRTVTDDEMNAGLSRI LA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:20:08 2011 Seq name: gi|229783878|gb|GG667857.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld250, whole genome shotgun sequence Length of sequence - 3406 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 29 - 88 2.2 1 1 Op 1 . + CDS 113 - 436 105 ## gi|295115643|emb|CBL36490.1| hypothetical protein 2 1 Op 2 . + CDS 451 - 1155 623 ## gi|266625189|ref|ZP_06118124.1| conserved hypothetical protein + Term 1183 - 1224 -0.5 + Prom 1158 - 1217 2.1 3 2 Op 1 . + CDS 1239 - 1658 510 ## gi|266625190|ref|ZP_06118125.1| hypothetical protein CLOSTHATH_06603 4 2 Op 2 . + CDS 1673 - 2215 292 ## gi|266625191|ref|ZP_06118126.1| conserved hypothetical protein 5 2 Op 3 . 
+ CDS 2224 - 3216 414 ## DET1070 endolysin, putative Predicted protein(s) >gi|229783878|gb|GG667857.1| GENE 1 113 - 436 105 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295115643|emb|CBL36490.1| ## NR: gi|295115643|emb|CBL36490.1| hypothetical protein [butyrate-producing bacterium SM4/1] # 1 107 1 107 107 203 97.0 3e-51 MSRRLKLHNALCDILSCPNKGPECRAYFQPPSSVKMKYPAIVYALDDIENTFANDGVYLS ARKYSVTVIDSDPDSSLVGKVASMPTSRFNRHYTKDNLNHDVFEIFF >gi|229783878|gb|GG667857.1| GENE 2 451 - 1155 623 234 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625189|ref|ZP_06118124.1| ## NR: gi|266625189|ref|ZP_06118124.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02267 [Clostridium symbiosum WAL-14163] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02267 [Clostridium symbiosum WAL-14163] # 1 234 1 234 234 447 100.0 1e-124 MKKKLVWDKTGERLYETGVSQGVLYPIQTGGVYNSGTAWNGLSTVTESPSGAEPTAIYAD NIKYLNLMSAEEFGGTIEAYMAPDEFAECDGSKEIAPGVFAGQQNRKMFGLSYKTLLGND VDSNDYGYKLHLVYGCLASPSEKGYSTVNDSPEAITLSWEFSTTPVEIATLIDGKKLKPT SILTFDSTKVDAKKLAALEEILYGKDPSSAEADDGVEPRLPLPDEVIKIMTAEG >gi|229783878|gb|GG667857.1| GENE 3 1239 - 1658 510 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625190|ref|ZP_06118125.1| ## NR: gi|266625190|ref|ZP_06118125.1| hypothetical protein CLOSTHATH_06603 [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02266 [Clostridium symbiosum WAL-14163] hypothetical protein CLOSTHATH_06603 [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9474_02266 [Clostridium symbiosum WAL-14163] # 1 139 1 139 139 251 100.0 9e-66 MYAVTKTYKDFNGVERTETKLFNLTETEVMEMELGTAGGVAEMLQRIVDAKDQPTIIKFF KEFILKAYGEKSADGTYFEKSEEISRKFACTQFYNLLFMELATDDSKAAEFVNHVIPKVV DIKKHSENPEIAPVVATMN >gi|229783878|gb|GG667857.1| GENE 4 1673 - 2215 292 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625191|ref|ZP_06118126.1| ## NR: gi|266625191|ref|ZP_06118126.1| conserved hypothetical protein [Clostridium hathewayi DSM 
13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 180 1 180 180 351 100.0 1e-95 MLELTIPRTDLWDERNQRFIPVKEQKLRLEHSLVSLSKWESKWCKVFLSKEQKTYEETID YIRCMTLTQNVDPLVYRCITNSHIDAVNAYIEEPMTASTVKEEKGGPINRQQITSELIYY WMTAYHIPFECQKWHLNRLLMLIRICNAENKPPKKRSKRDLYRHHAEVNAANRKKFNSKG >gi|229783878|gb|GG667857.1| GENE 5 2224 - 3216 414 330 aa, chain + ## HITS:1 COG:no KEGG:DET1070 NR:ns ## KEGG: DET1070 # Name: not_defined # Def: endolysin, putative # Organism: D.ethenogenes # Pathway: not_defined # 5 256 223 469 491 192 43.0 1e-47 MAKSRQAVVNLVESWDGKKESNGSHKSIIDLYNDFFEKICAGKFPRGIRMRYDWAWCACT WSALAAALRYESIMPMEISCYYLIEAAKKMGCWQENDAYVPSPGDAILYDWQDNGFGDNS GNPDHVGTVIEVHKESGYMVIEEGNYSNAVKKRTLSINGKFIRGFITPKYDDNTVAAPGL SKGKDIKTIAHEVIVGLWGSGENRKKLLTEYGYSYSEVQNMVNQILNGSAVTPSNTKQDQ NQSVSKKVVATCSAKQFNKTCAGEYKTTAVLYCRNDAGTNKKAICKIPAGTKVKCYGYYT MANGVKWLYIQFVLDGVQYTGFSSSAYLAK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:20:49 2011 Seq name: gi|229783877|gb|GG667858.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld251, whole genome shotgun sequence Length of sequence - 4340 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 2 - 1298 1303 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 2 1 Op 2 . - CDS 1303 - 3084 1511 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 3153 - 3212 5.5 + Prom 3217 - 3276 4.8 3 2 Tu 1 . 
+ CDS 3337 - 4339 1241 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783877|gb|GG667858.1| GENE 1 2 - 1298 1303 432 aa, chain - ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 1 128 1 126 368 155 55.0 2e-37 MLKLIIADDERIIRETISTIIDWSQYGIELAGLCSNGIEAYDMILDESPDLVITDIRMPG MNGLELIGRICETDLDTQFILLSGYGEFEYAKQAMKYGVRHYLLKPCNEQQIIDAVKESV SDCYAKRQARRILDSQFSQADSIRHNVVSSIINDALYLEKPLEEIIPTYEPYMDFYFTPY RLFYIYYLEQENLEEFLGLLNAFARNSLPQVSIYGIYVRNTFLIFCQDISFDFDQLEAKL DSVSLEGQHTSLELHMETYSDLKGLFAIVLEKIRRFSTFYFVNNDHALCTCNYTAITQET ENYCRAISEGDAAAMSGLMELISDIMDIRFLKQLSGSMFLKITLGNPNISGSGLTEWLMQ AEGETDLSALKESVFQKLTEVVESGSRTETVSSMTRQIFTYIKENLENPNLTLKYIAEQH LYMNVDYVSRKF >gi|229783877|gb|GG667858.1| GENE 2 1303 - 3084 1511 593 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 1 581 1 564 577 256 30.0 1e-67 MKKRAEAIQIKFLTLFKNLSLRWKIFFIVLLNVVSLLLFSFIGFTILSKTYNELLYRSIA GNLSFSANTISTNLKDVERLSSIILSTSSIQTSLSAIHESEDAVVRSNANRSINTALSSY YETFRQNGVSYIVLFNDNFSNATNWAAYEKTDREFIDHALQNAELGDGAVSWTTGGGSTS GLFLGRNIRKIEGLDFSTIGRLLIYVDTDRLINTINRSFQTFDTGYYILADGPDTIYASS GLSKDLVNRTLLLEDDTYQTLQADNHTYFAVSSKIPYYDWRYISLIPFDSIIHSIGRSAK IILGIMVLGILTALVLTQWLIYSILRHFNALIVKMDVFSKNELTLEQSAYDYSTRNDEVG RLHQRFDRMATRIRTLVDTNYKNELLRKEAQIKALESQINPHFLYNTLESINWRAKASKN AEISMMAESLGTLLRSTLSNHESLVELDYELELVRAYITIQKIRFEDRLDFLSLVDDDLH HILMPPLTIQPLIENAIRYGMEEMTETCHIELLVRHESDQVLIRVKNNGSVFEDHLLEKL KSREKQPNGFGIGLMNIDQRIRLLFGDAYGLELNNEDDCATADIHIPFQVKEN >gi|229783877|gb|GG667858.1| GENE 3 3337 - 4339 1241 334 aa, chain + ## HITS:1 COG:BS_yesO KEGG:ns NR:ns ## COG: BS_yesO COG1653 
# Protein_GI_number: 16077764 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 40 334 3 290 412 144 31.0 2e-34 MKKKLAVVLAASMAAVSLAACGGGAAESGSTTGADSQTTAGAAAENTTAAASGDASLTVA WWGNQTRNERTQAALDKYSELNPGVAFDGQFSEWSDYWNKLATAAAGHSLPDVIQMDYAY LDQYVTNNLLVDLTPYIEDGTLNVDNCSQDIINSGSVDGKVYSIAIGINAPAMVYNKTIT DQAGVEIKDNMTLDEFVAVSKEIYEKTGCKTNMAYGVSQEILQAMIRGYDGHVLFDEGKL GVDSADEFVPFYKVYEDGVKEGWYIEPSVFAELTPGSIEQDPLIYGSSPETMSWVAFCWS NQYAAFTNAAPEDMELDLTTWPSKDPKKSNFLKP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:20:51 2011 Seq name: gi|229783876|gb|GG667859.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld252, whole genome shotgun sequence Length of sequence - 6851 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 7, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 205 170 ## Ccel_1549 outer membrane protein - Prom 238 - 297 7.1 + Prom 388 - 447 7.4 2 2 Tu 1 . + CDS 486 - 950 415 ## Cphy_1528 AraC family transcriptional regulator 3 3 Op 1 . - CDS 2232 - 3374 1221 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family 4 3 Op 2 . - CDS 3391 - 3741 396 ## Odosp_1093 hypothetical protein - Prom 3879 - 3938 4.8 + Prom 3803 - 3862 6.8 5 4 Tu 1 . + CDS 3942 - 4151 243 ## gi|288871592|ref|ZP_06118137.2| putative secreted protein + Term 4229 - 4284 2.3 + Prom 4187 - 4246 7.5 6 5 Tu 1 . + CDS 4333 - 4563 193 ## COG1309 Transcriptional regulator 7 6 Tu 1 . + CDS 5536 - 5856 307 ## TepRe1_0944 TetR family transcriptional regulator + Term 5896 - 5942 1.6 - Term 5882 - 5930 8.0 8 7 Tu 1 . 
- CDS 5972 - 6709 665 ## COG2340 Uncharacterized protein with SCP/PR1 domains - Prom 6731 - 6790 10.0 Predicted protein(s) >gi|229783876|gb|GG667859.1| GENE 1 1 - 205 170 68 aa, chain - ## HITS:1 COG:no KEGG:Ccel_1549 NR:ns ## KEGG: Ccel_1549 # Name: not_defined # Def: outer membrane protein # Organism: C.cellulolyticum # Pathway: not_defined # 2 68 3 69 644 106 70.0 3e-22 MREYHVAVTGCDSNSGTKDQPFRTISRAASLAMPGDRVIVHEGEYREWVKPAQGGTGSVS RITYEAAE >gi|229783876|gb|GG667859.1| GENE 2 486 - 950 415 154 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1528 NR:ns ## KEGG: Cphy_1528 # Name: not_defined # Def: AraC family transcriptional regulator # Organism: C.phytofermentans # Pathway: not_defined # 1 152 1 152 284 121 39.0 1e-26 MIEHLDGIFETVTFRENMQIRISDMDICEDFPDHWHTPMEIIHATKNWYKIVVDGRTYRL NEGEIAIIRPGTIHALHAPDTGSRTIYLADLSFLGGISNLETLLALLPPVTTITPEREPE LYRKAVELLSFMEEEYAGSRSFYDMLLYGALIAS >gi|229783876|gb|GG667859.1| GENE 3 2232 - 3374 1221 380 aa, chain - ## HITS:1 COG:MA0404 KEGG:ns NR:ns ## COG: MA0404 COG1453 # Protein_GI_number: 20089299 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Methanosarcina acetivorans str.C2A # 1 379 1 362 364 290 39.0 4e-78 MNYRKLGKTGLMVSEIGLGGEWLERHNTDEVKAVIDCCEEAGINILDCWMSEPNVRSNIG AALAGRREKWIIQGHIGSTWQDGQYVRTRDLDKVKPAFQDLLDRMRTDYIDLGMIHFVDE AAEFQSIIHGEFMEYVRELKDRGIIRHIGMSTHNPAVAKMAALCGEIEMILFSINPAFDM LPASENMDDYFKEEYEEALGGIAPERTELYRICEREEVGITVMKGYAGGRLFSAEASPFG TALTPVQCLHYALTRPAVASVMVGYDTPEHVAAAVAYEYASAEEKDYATVLAGAPHLAYA GQCTYCGHCAPCPSAIDIAMVNKLYDLAVMQEEVPASLRAHYNGLSAGAGDCIGCGSCES RCPFGVPVVERVEQAKVLFS >gi|229783876|gb|GG667859.1| GENE 4 3391 - 3741 396 116 aa, chain - ## HITS:1 COG:no KEGG:Odosp_1093 NR:ns ## KEGG: Odosp_1093 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 113 1 113 114 204 84.0 1e-51 MRKCKITVLKTTLDEELAKEYGVPGLQACPMLKAGQIFFADYAKPEGFCDEAWKAVYQYV FALSHGAGEELFYYGDWIKTPGIAICSCNDGLRPVIFKIEATGEESRLDYNPVVRE 
>gi|229783876|gb|GG667859.1| GENE 5 3942 - 4151 243 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871592|ref|ZP_06118137.2| ## NR: gi|288871592|ref|ZP_06118137.2| putative secreted protein [Clostridium hathewayi DSM 13479] putative secreted protein [Clostridium hathewayi DSM 13479] # 1 69 6 74 74 130 100.0 4e-29 MIEEGTVVYYLDEDLVHSGRVTDVTPVSGGFTFSIDSYGACEGPYVIASGQIGKTVFFTE KEAKDRLGL >gi|229783876|gb|GG667859.1| GENE 6 4333 - 4563 193 76 aa, chain + ## HITS:1 COG:FN1034 KEGG:ns NR:ns ## COG: FN1034 COG1309 # Protein_GI_number: 19704369 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 4 66 9 71 205 65 49.0 3e-11 MKEDRPYHHGNLQTELIEEGLALIHEEGKSNFSLRKLAKRLGVSPAACYNHYSTVDELMR EMKNYVTQKFCGALAS >gi|229783876|gb|GG667859.1| GENE 7 5536 - 5856 307 106 aa, chain + ## HITS:1 COG:no KEGG:TepRe1_0944 NR:ns ## KEGG: TepRe1_0944 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 1 91 92 183 203 82 39.0 6e-15 MGIAYVNFFAVNPHYFTFIYDGDDYRIDLTEDTFDGDFEPFHLFKELGLLCLEYNHVEKD KRRDSLIIMWAAAHGLAAMANMKGFYYDGDWGALAGKLLQEKINLT >gi|229783876|gb|GG667859.1| GENE 8 5972 - 6709 665 245 aa, chain - ## HITS:1 COG:CAC2230 KEGG:ns NR:ns ## COG: CAC2230 COG2340 # Protein_GI_number: 15895498 # Func_class: S Function unknown # Function: Uncharacterized protein with SCP/PR1 domains # Organism: Clostridium acetobutylicum # 110 244 43 173 175 102 42.0 5e-22 MKRLTAYALTAAAALATAGTIPSTANAAVHTYNFSAGGGKVIMVTGNNGMDFNHILNQIP GFHASPDCQPEFPGYPGEVFPDNSFPTPELPSPDFPDNSLPSPGLPDNSLPTPDQPSDEN QDEAVGAVLKLVNEERAKAGLPVLALHAGATRAAQQRAGEIETAFSHTRPDGSNFTTALT AAGVTYRAAGENIAYGQKSAGQVMQDWMNSAGHRANIMNGNFTSIGIGHYKSAAGVDYWT QLFIN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:21:09 2011 Seq name: gi|229783875|gb|GG667860.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld253, whole genome shotgun sequence Length of sequence - 6197 bp Number of predicted genes - 5, with 
homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 421 443 ## Cphy_1627 acyl-CoA reductase 2 1 Op 2 . - CDS 408 - 1541 592 ## Clole_4024 acyl-protein synthetase LuxE 3 1 Op 3 . - CDS 1560 - 3356 1609 ## COG3706 Response regulator containing a CheY-like receiver domain and a GGDEF domain 4 1 Op 4 . - CDS 3380 - 4000 643 ## Closa_2209 metallophosphoesterase 5 1 Op 5 . - CDS 4005 - 5960 2265 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein - Prom 6002 - 6061 5.2 Predicted protein(s) >gi|229783875|gb|GG667860.1| GENE 1 1 - 421 443 140 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1627 NR:ns ## KEGG: Cphy_1627 # Name: not_defined # Def: acyl-CoA reductase # Organism: C.phytofermentans # Pathway: not_defined # 14 140 29 157 415 94 41.0 1e-18 MRYDELEQKTTALLEQPPLSRTVVIEACSRFSELLNGGAYDEKIREYGMEKLLTGEMISR AASLLRADALKKRVETELSGFQNQFLMPLGVLMHVTAGNMTGIGAYSVLEGLLAGNINLL KLSSEDDGLSVFLLRELIRI >gi|229783875|gb|GG667860.1| GENE 2 408 - 1541 592 377 aa, chain - ## HITS:1 COG:no KEGG:Clole_4024 NR:ns ## KEGG: Clole_4024 # Name: not_defined # Def: acyl-protein synthetase LuxE # Organism: C.lentocellum # Pathway: not_defined # 1 370 1 370 373 404 52.0 1e-111 MDYRTKLCFCRDIYDQRASDRTFMEAVRRNVDCHLLKCKDYRRILESRGFSRSDGQRVMN AGEVPCIPTLFFKYHDLYSVPEASMVIKATSSGTGGVKSRIGFDKKSLFFAGVAAGRVIA RHGLFSVRPVHYVILGFPPGKENQTAISRTQRLSTMFAPALSRKYALRKGKHGYRPDFNG LLEALRRYGRGNAPVRIIGFPAYLYFLLKEMERNRIFIRLPKHSMIMAGGGWKQFYKETV DKRELYRLGERHLGIPESHFREFFGAVEHPGVYCDCPNHHFHIPVTSRVVIRDVDTLEPV GYGTTGLVNLISPLVESMPLVSILTDDLGILHRGCECGCGIEAPYLEIIGRTGVPEIRTC AAGAEEVLKEAGRSALR >gi|229783875|gb|GG667860.1| GENE 3 1560 - 3356 1609 598 aa, chain - ## HITS:1 COG:SMc01370 KEGG:ns NR:ns ## COG: SMc01370 COG3706 # Protein_GI_number: 15965047 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and a GGDEF domain # Organism: 
Sinorhizobium meliloti # 290 457 290 454 455 92 35.0 2e-18 MDVFYRPELSVQTLLMDANTVQETAENLADYINHLMSADSVLLVRLDNQMRYKESGSWKA GNDGENGVEAVMRQWLPRLVESWNKREKLSCGELKDFCQALTPYTVPGGYLTILPICNHI RFVGFAVFGKRAEEGDWTDKELPVLRFLLSIIAVTFANKNLYEDYMLQNWVYNTMMGRMQ ANLYVTDIHTDQILFMNKTMQEAFGLENPVGKVCWQVLQKGIKGRCPGCPVTRLTESGEE GVYRRWEEYNTVTGRYYDNYDSLMKWTDGSTVHFQHSLDITASKKLSKEASTDELTGLLN RRGGKNRLAEMLNMARHQEEVLTVCMYDVNSLKKVNDTFGHREGDNLLRVIANMVKSVLT ERDFACRLSGDEFLITFAGKTEQEADELIKLVEERLLAVRRQDRIPYELSFCTGILELKP EHTMSVTDILREVDERMYDQKRQYHMRQGKESVLAASEAQWRLLKPEAFDYDQQYLYDAL VESTDDYLYVCDMATNTFRYSRKLVEEFAFPGEVIQNAADIWRSIIHPADWERFWESNQA VSDGRTVSHEVEYRARNRKGQWIKLHCRGHVAQNSAGEPRVFAGFIANEGQWFPEWTE >gi|229783875|gb|GG667860.1| GENE 4 3380 - 4000 643 206 aa, chain - ## HITS:1 COG:no KEGG:Closa_2209 NR:ns ## KEGG: Closa_2209 # Name: not_defined # Def: metallophosphoesterase # Organism: C.saccharolyticum # Pathway: not_defined # 1 200 1 200 201 345 78.0 8e-94 MKILVLADEESKSLYEYFSPEKIKGVDLIISCGDLKASYLTFFATFSHAPLLYVKGNHDG HYEDRPPEGCVCIEDDIFVFRGVRILGLGGSMEYIPGADNQYTERGMERRIKKLWWKLKK NKGFDILVSHAPAYQINDMQDLPHRGFACFKDLMDKYMPKYFIHGHVHANYGGGFKREDT YGETRIVNAYEYYIFDYPENGEQTTS >gi|229783875|gb|GG667860.1| GENE 5 4005 - 5960 2265 651 aa, chain - ## HITS:1 COG:AF0890 KEGG:ns NR:ns ## COG: AF0890 COG1744 # Protein_GI_number: 11498495 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Archaeoglobus fulgidus # 299 627 63 397 397 148 27.0 3e-35 MALYDYSGALKKGRKQYQVSVAKGEYPYLPVLDDILSYTDIISEVNLGLMDVPLDKIVGT KTKGRTSAFANNFMPLLAEKSEFGAKWAYLYDHQIEEGIHDPIVAYEFMNQYYVQEGNKR VSVLKYVGAFSITASVTRLIPKRTDDLDNRLYYEFLEFYQVSFNCDVWFSQEGCYDRLLK AMGKAPEEVWSEDDRIYFKSAYDQFSKAFRAFGGSSYEMTCSDAFLVYVELFGYDVVKLK TEGEITKDLAKIKDEPLLASRGSKIALVEQPEEVEEQDNGPLKIINWLRPSQNIEPEMLK IAFIHAKTAETSSWTYGHELGRMYLEQAFEGRLKTVAFFEADTDAEIANAIDLAIAARCN MIFTTASQMIGLSVKAAIEHPEVKIFNCSVNMSYSSVCTYYARMYESKFLMGALAASMAQ 
CDKLGYIADYPIYGTIANINAFALGARMINPYVKVHLEWARVKGRNAREELEKEGIAFIS GEDMITPKTASREYGLYKIESDGEFRNLATPIWHWGKFYERIVNITCRGGSDVKEMKGKQ AINYWWGMSADVIDVICSQNLPHGTNRLITFLKNSIRSGSFQPFVGTIYSQDGTIQCEEE QRLSPEEIITMNWLADNVVGKLPEFDELTEEAQSLVRLQGLTIYDNAVMEE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:21:23 2011 Seq name: gi|229783874|gb|GG667861.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld254, whole genome shotgun sequence Length of sequence - 7905 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 655 403 ## Dhaf_0962 cell division membrane protein-like protein + Prom 662 - 721 3.4 2 2 Op 1 . + CDS 745 - 1836 1454 ## COG4859 Uncharacterized protein conserved in bacteria 3 2 Op 2 . + CDS 1742 - 1987 114 ## gi|266625213|ref|ZP_06118148.1| conserved hypothetical protein + Prom 2305 - 2364 1.5 4 3 Tu 1 . + CDS 2519 - 3778 1476 ## COG0525 Valyl-tRNA synthetase + Prom 4680 - 4739 9.8 5 4 Tu 1 . + CDS 4902 - 6095 1348 ## COG0525 Valyl-tRNA synthetase + Prom 6997 - 7056 18.6 6 5 Tu 1 . 
+ CDS 7076 - 7900 555 ## Closa_2351 peptidase U4 sporulation factor SpoIIGA Predicted protein(s) >gi|229783874|gb|GG667861.1| GENE 1 2 - 655 403 217 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_0962 NR:ns ## KEGG: Dhaf_0962 # Name: not_defined # Def: cell division membrane protein-like protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 216 232 449 455 89 29.0 8e-17 VCKGIFHVRKTRTIAALIALPLITSLLLYWKGADLGLLHHYQLARLQYVFSPDMLDYNSN GGIIPYLWESVSGFRMFGSSAAPVPKAMKSLNCDYVVFFVFAKYGIAAGTAMLSILAVTA VKAFSISRRQKNRLGFLVGTACSVVLTIQMMVYVAANFGVPLVEPMTIPFLSYGGQSTLV NYILLGLILSVHRNKDIVSERHGTRGKWKIRLERIES >gi|229783874|gb|GG667861.1| GENE 2 745 - 1836 1454 363 aa, chain + ## HITS:1 COG:PA2504 KEGG:ns NR:ns ## COG: PA2504 COG4859 # Protein_GI_number: 15597700 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 265 357 107 197 205 68 39.0 1e-11 MGDNIWQNVFQEIFEKNLERMKKEPETAGLNTLFDSEGAYEQLTIGEVRLNTGRIEIGDP LCYMNTKYSCTLEEMVEPGSYPVSLSVIDHPVFGFRFLAAKLDVNGKTPVRYELAMPQGC TIEDKDKPGVFAMFGVDTGLACICDRAVSAVYDDFIKEWRRKNPDKNLYDDYFEEVMKAY AEAYPRYQREDGDYLDWCPPGSDGNLILFTSGFGDGAYSGYWGFDENGDKACLVVRFIDP EAYDVPMPELPKSKKFFMKAEEIKPLLKSGQFGIATDKIMVEGSKVGYMVRNEPQEEHPE DSGWIFYEGSEDREYCEDSGNFGLYDLNTVANYDPDIIPLLDAPAGMAFFRGDDGEFYVD AGV >gi|229783874|gb|GG667861.1| GENE 3 1742 - 1987 114 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625213|ref|ZP_06118148.1| ## NR: gi|266625213|ref|ZP_06118148.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 28 81 1 54 54 89 100.0 8e-17 MIPISFRCLTPPRAWHFSAVMTENSMWMRVFKLTLENVTNRNIEILAEGHQIINRRPDDA VKSESNIEMTRIRTNILEEEI >gi|229783874|gb|GG667861.1| GENE 4 2519 - 3778 1476 419 aa, chain + ## HITS:1 COG:CAC2399 KEGG:ns NR:ns ## COG: CAC2399 COG0525 # Protein_GI_number: 15895665 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 3 417 
6 419 881 580 63.0 1e-165 MKELEKNYNPAEIEGKLYQKWLDKKYFHAEPNPDKKPFTIVMPPPNITGQLHMGHALDNT MQDILIRYKRMQGYEALWQPGTDHAAIATEVKVIESLKKQGIDKEDLGRDGFLEKCWDWR KEYGGRIINQLHKMGSSADWDRERFTMDEGCSEAVQEVFIRLYEKGYIYKGSRIINWCPV CQTSISDAEVEHVDQDGFFWHINYPVVGEEGSFVEIATTRPETLLGDTAVAVNPEDERYT NLVGKMLELPLTGRTIPVIADSYVDKEFGTGCVKITPAHDPNDFEVGKRHSLPELTIMND DATISLPGSKYDGMERYEARKAIVEDLKELGLLVKVVPHTHAVGTHDRCKTTVEPLVKQQ WFVRMEEMAKPAIEALKTGKLKFVPERFDKTYLHWLEGIRDWCISRQLWWGHRIPAYSS >gi|229783874|gb|GG667861.1| GENE 5 4902 - 6095 1348 397 aa, chain + ## HITS:1 COG:CAC2399 KEGG:ns NR:ns ## COG: CAC2399 COG0525 # Protein_GI_number: 15895665 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 2 396 484 878 881 437 56.0 1e-122 MLVTGYDIIFFWVIRMVFSGIEHTGQLPFNTVLMHGLVRDSEGRKMSKSLGNGIDPLEVI DKYGADALRMTLITGNAPGNDMRFYWERVEASRNFANKVWNASRFIMMNIEKAPDSRAEL SDLTMADKWILSKANSLARDVTDNLDKFELGIALQKVYDFIWEEFCDWYIEMVKPRLWNE EDTTKAAAIWTLKTVLINSLKLLHPYMPFLTEEIFCNLQDEEESIMVSAWPEYKEEWNFA GEEHAVETIKEAVRGIRNVRTSMNVPPSKKAKVYVVSENSEILDIFENSRVFFAMLGYAS EIVLQKDKEGIGEDAVSAVIHQAVIYMPFAELVDIEKEKERLKKEEERLTKELARVKGML SNEKFVSKAPAAKLEEEKAKLEKYTQMMDQVKERIAH >gi|229783874|gb|GG667861.1| GENE 6 7076 - 7900 555 274 aa, chain + ## HITS:1 COG:no KEGG:Closa_2351 NR:ns ## KEGG: Closa_2351 # Name: not_defined # Def: peptidase U4 sporulation factor SpoIIGA # Organism: C.saccharolyticum # Pathway: not_defined # 3 274 26 297 297 330 62.0 5e-89 MTVDLYIDVFFVINMILDYLVLSITGKVMRYRAGRYRKLAGAALGAAWAALCAAFPSIPP AVSFFMTYAVVSCLMVATAFGVRKKREMGRAVACLYLVTVMMAGVMQVLYQHTMAGYYIE QILRGNSRAAMPFYRLLLLAAGAYFGVRGVLGFVLEARKNRNHFYEVTMHYRGRKKTVTA LLDTGNRLYEPVTHRPVHVVTYEALKELCESVSSVIYIPFGSVGKKEGVMPGIFLDEMEI RQGSDVKIIEKPLVAVCRRPLCADGEYQMLLHED Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:21:39 2011 Seq name: gi|229783873|gb|GG667862.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld255, whole genome shotgun sequence Length of sequence - 3711 bp Number of 
predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 790 498 ## COG0673 Predicted dehydrogenases and related proteins 2 1 Op 2 . + CDS 864 - 1331 555 ## Tthe_1941 lactoylglutathione lyase and related lyase 3 1 Op 3 . + CDS 1352 - 2173 528 ## Rumal_2051 xylose isomerase domain-containing protein 4 1 Op 4 . + CDS 2174 - 3028 731 ## COG0451 Nucleoside-diphosphate-sugar epimerases + Term 3171 - 3212 -0.7 - Term 3038 - 3084 11.5 5 2 Tu 1 . - CDS 3120 - 3710 571 ## COG0438 Glycosyltransferase Predicted protein(s) >gi|229783873|gb|GG667862.1| GENE 1 2 - 790 498 262 aa, chain + ## HITS:1 COG:AGl3148 KEGG:ns NR:ns ## COG: AGl3148 COG0673 # Protein_GI_number: 15891691 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 262 169 413 415 178 37.0 1e-44 APDTFLSSGLQSMNKYIRDGIIGKPLYVTANMMSYGVEMWHPCPAPFYEAGGGPLYDMAG YYLSAIVSMLGPVDSVSGFCGKGFDRRLILNKAQFGTYLDVEIPTHYSAVLNLKSGVIVS MNMSFDIWRSGLPKFEIYGTDGTISYPDPNFGGGVPSVYRKEQVLDTVFQENEEYRQREA KMYELPELYHRVSNYSRGLGVLDLAYAIEHKQPNRASAELARHITEVIDGIITAANEHCV CKMATSCEIPKPIEPGLFVGEL >gi|229783873|gb|GG667862.1| GENE 2 864 - 1331 555 155 aa, chain + ## HITS:1 COG:no KEGG:Tthe_1941 NR:ns ## KEGG: Tthe_1941 # Name: not_defined # Def: lactoylglutathione lyase and related lyase # Organism: T.thermosaccharolyticum # Pathway: not_defined # 4 153 5 150 150 126 45.0 2e-28 MSGVVGTNLVAQVGFIVKDIEAAKEKWAQFLGVEVPPTQPVGDYAVTGTVYKGQPAPEAS CLMAFFDVGPGLQLELIQPNEAPSTWRNYLNEKGEGIHHVAFQVKDSKEAIARCEAFGLS LEQHGVYGDGSGEYNYMNGEKDLKCIVELLESYKK >gi|229783873|gb|GG667862.1| GENE 3 1352 - 2173 528 273 aa, chain + ## HITS:1 COG:no KEGG:Rumal_2051 NR:ns ## KEGG: Rumal_2051 # Name: not_defined # Def: xylose isomerase domain-containing protein # Organism: R.albus # Pathway: not_defined # 2 272 1 271 274 284 50.0 3e-75 
MIRIGLSAALEHATPLEWAQRNASAGCRCVNFPINYRQGEDLVQSYVKAAGDHDLLIAEV GVWRNPVSPFEDVRTDAIEYAVGQLRLADSICANCAVNVAGSMGERWDGAYKDNFTKETW KKTVKSIQTIIDEAAPKHTYYTIEPMPWMYPISPDEYLHLIEDVGRDRFAVHMDIFNWIT TPYRYFFNEEFMEEVFEKLGRYIKSCHIKDVLLEQDFTMMYREVKCGGGIINLEKYAELA ARYNPDMPMIIEHLHSEKEYLESIEYVKRRLKL >gi|229783873|gb|GG667862.1| GENE 4 2174 - 3028 731 284 aa, chain + ## HITS:1 COG:mll1687 KEGG:ns NR:ns ## COG: mll1687 COG0451 # Protein_GI_number: 13471652 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Mesorhizobium loti # 4 283 3 295 296 221 38.0 1e-57 MSKKKIVITGTAGLLGPHVARHFAEHGYHVVGIDRAVPKESAAQENLIVDLTNAGEVYSA LSGAMGVVHLAAIPRPGIYTSEVTFGNNILASYHILEAADNLGIKKAVIASSESAYGFCF SKNNLRPQYFPVDEEHPALAEDSYGIGKIAAEKVAEGIHRRNHMQIITFRLGNIITEPMY ENFKDWIHDPHKRVLNVWNYIDARDIASACRLAVERDGLGCDIMNLAADDNCMDIKSRDL IQEVFPDITDIRGDLASYETLYSNAKAKALLGWQPVHHWRDYVS >gi|229783873|gb|GG667862.1| GENE 5 3120 - 3710 571 196 aa, chain - ## HITS:1 COG:XF0885 KEGG:ns NR:ns ## COG: XF0885 COG0438 # Protein_GI_number: 15837487 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Xylella fastidiosa 9a5c # 8 178 261 430 443 127 40.0 1e-29 AYVFEKFLESVPDDFEKREGLLFVGGFAHPPNADAVLWFAREIFPKIREKLEVPFYIVGS KVTEEIQALEQPGNGIIVKGFVSEEELSELYRTCRIVVVPLRYGAGVKGKVIEALYNGAP VVTTSIGAEGIAEAESVMCIKDAPEEFAEETVRLYQNPEALRELSRKTQNYIRRYYSVDA AWSVVEEDFTKETVKR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:21:48 2011 Seq name: gi|229783872|gb|GG667863.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld256, whole genome shotgun sequence Length of sequence - 5245 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 440 442 ## Ccel_3177 helicase 2 1 Op 2 . 
+ CDS 466 - 1083 592 ## gi|288871596|ref|ZP_06118158.2| stage V sporulation protein G 3 1 Op 3 . + CDS 1088 - 1492 375 ## gi|266625224|ref|ZP_06118159.1| conserved hypothetical protein 4 2 Tu 1 . + CDS 2514 - 2816 265 ## TepRe1_1316 transposase IS4 family protein - Term 2828 - 2879 6.9 5 3 Tu 1 . - CDS 2903 - 3361 94 ## gi|288871597|ref|ZP_06118161.2| peptidase, M56 family - Prom 3611 - 3670 7.4 - Term 4155 - 4184 1.4 6 4 Tu 1 . - CDS 4226 - 4582 257 ## GYMC10_1102 transcriptional repressor, CopY family - Prom 4737 - 4796 6.1 + Prom 4522 - 4581 2.8 7 5 Op 1 . + CDS 4605 - 4712 64 ## 8 5 Op 2 . + CDS 4765 - 5223 298 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog Predicted protein(s) >gi|229783872|gb|GG667863.1| GENE 1 3 - 440 442 145 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3177 NR:ns ## KEGG: Ccel_3177 # Name: not_defined # Def: helicase # Organism: C.cellulolyticum # Pathway: not_defined # 2 144 1914 2060 2077 78 33.0 8e-14 DEREAAGSLLLEIGQSLDSMESRKIGTYKGFDVVINKKFSDCTMQLCGNMKYTADMSSSA SGNMVRLENLLSGLEKRVAAHKENLEQYKRNMEESKKEFNKTFTYELELRQKLVRQKEIN DELEIKEEGEELVVTDNLPEQAVAR >gi|229783872|gb|GG667863.1| GENE 2 466 - 1083 592 205 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871596|ref|ZP_06118158.2| ## NR: gi|288871596|ref|ZP_06118158.2| stage V sporulation protein G [Clostridium hathewayi DSM 13479] stage V sporulation protein G [Clostridium hathewayi DSM 13479] # 1 205 3 207 207 392 100.0 1e-107 MKYTVNIYEVSTEKDTKLRAFAAVTFGDCFKVTGITVREGKAGNLYVSMPQYPTGERGED NKLIYSDVFFPKTTDFSTLLRGEILEAYQNREDGRNEVDLEYGDTGFEYYVQVVNNKDLS CATKAFARLVIDDVFVVNQISVKRSFAGNNFVAMPSQRRMIDGKAEYQDIAYPVTKDFHD TLYGDIMKQYEQNLDKQRTAKEKAR >gi|229783872|gb|GG667863.1| GENE 3 1088 - 1492 375 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625224|ref|ZP_06118159.1| ## NR: gi|266625224|ref|ZP_06118159.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 134 1 134 134 212 100.0 8e-54 
MEKKIKITIGNLEELMENEEWETENFLFEEDEEHQVWLDKNYLILRGLGLEIHSGVWYEK YEDDKDGNFEPDFDFYMFFDAETKVYLYMEQGSSLITCIHNFTGLSWEEIEGQACEVVLT DQNIVVEDQFLIFE >gi|229783872|gb|GG667863.1| GENE 4 2514 - 2816 265 100 aa, chain + ## HITS:1 COG:no KEGG:TepRe1_1316 NR:ns ## KEGG: TepRe1_1316 # Name: not_defined # Def: transposase IS4 family protein # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 5 89 367 452 582 72 44.0 7e-12 MKWFLNSREYQRELREKQVERTKAILAGGNHKKTWKNPNDPARFIDKSAVTNDGETVDIL YVLDEQKIVEEARYDGFYAVCTTLFEDDSGEILKGKSWLP >gi|229783872|gb|GG667863.1| GENE 5 2903 - 3361 94 152 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871597|ref|ZP_06118161.2| ## NR: gi|288871597|ref|ZP_06118161.2| peptidase, M56 family [Clostridium hathewayi DSM 13479] peptidase, M56 family [Clostridium hathewayi DSM 13479] # 1 152 276 427 427 308 100.0 9e-83 MTPPKTKKFSRKLSVILTSCMVLCSSFTSFAYQPSTELDMLGTFSTNPLSNINSSDTVVF QSGDFQPENQYGPILYNNQFTDINGNIYNIDSDRSSYATCEHNYIDGTLSRHTKKSDGSC ITKFYDSQRCTNCGQLIVGELINTETYTKCPH >gi|229783872|gb|GG667863.1| GENE 6 4226 - 4582 257 118 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_1102 NR:ns ## KEGG: GYMC10_1102 # Name: not_defined # Def: transcriptional repressor, CopY family # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 3 115 5 121 124 81 36.0 1e-14 MKQNLTPAEQAIMEILWKNNHWMTISELIEIFEHLGKDWKRQTVNTFLARLIEKGLVVKN GRKYFYTYTKDEYEAQIASEMLKTLYGGSLKNFIAALSGNHTLTDENLKELREYLDNF >gi|229783872|gb|GG667863.1| GENE 7 4605 - 4712 64 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTNYISRMYCSINLADLSITIKITEQVQQYAKIKI >gi|229783872|gb|GG667863.1| GENE 8 4765 - 5223 298 152 aa, chain + ## HITS:1 COG:BS_sigW KEGG:ns NR:ns ## COG: BS_sigW COG1595 # Protein_GI_number: 16077241 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 10 148 27 184 187 67 30.0 1e-11 MDREELFLKYMDQIYRMAFFMLSNTHDAEDAVQETYVRMLVKKPKFADEEHGKAWLLRVC INICKNQIRFWKRHPQYELEEIPIKIDSFKEWELLHEISRLRGKSKEVLILFAVEGYSIK 
EISAMLKISESAIKKRLQRGREELSRQLGVRQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:22:29 2011 Seq name: gi|229783871|gb|GG667864.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld257, whole genome shotgun sequence Length of sequence - 5306 bp Number of predicted genes - 9, with homology - 7 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 438 439 ## Cphy_2664 CarD family transcriptional regulator - Prom 472 - 531 4.6 - Term 480 - 524 8.4 2 1 Op 2 . - CDS 534 - 752 358 ## gi|288871598|ref|ZP_06118165.2| protein pyrBI - Prom 773 - 832 7.6 3 2 Op 1 . - CDS 890 - 1390 473 ## Closa_1769 MarR family transcriptional regulator 4 2 Op 2 . - CDS 1387 - 1524 86 ## 5 3 Tu 1 . - CDS 2495 - 2623 64 ## - Prom 2732 - 2791 11.2 6 4 Op 1 . - CDS 2897 - 3160 234 ## CDR20291_3072 hypothetical protein 7 4 Op 2 . - CDS 3192 - 3776 713 ## Spico_0824 putative phage-related protein 8 4 Op 3 . - CDS 3872 - 4921 1003 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 9 4 Op 4 . 
- CDS 4971 - 5258 356 ## gi|266625238|ref|ZP_06118173.1| conserved hypothetical protein Predicted protein(s) >gi|229783871|gb|GG667864.1| GENE 1 3 - 438 439 145 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2664 NR:ns ## KEGG: Cphy_2664 # Name: not_defined # Def: CarD family transcriptional regulator # Organism: C.phytofermentans # Pathway: not_defined # 1 145 1 145 168 137 42.0 1e-31 MFSVNDYVVFGNHGICVIKAIGPLDLGIAERGRLYYTLEPLYTQKNTIYTPVDSEKNSLR RAITREEALELIDRIPQVETVWVPDEKRREERYREIMRQNDCMGWMQIIKTLYLKKQKRL AEGRKNTARDELYLKLAEDFLYGEF >gi|229783871|gb|GG667864.1| GENE 2 534 - 752 358 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871598|ref|ZP_06118165.2| ## NR: gi|288871598|ref|ZP_06118165.2| protein pyrBI [Clostridium hathewayi DSM 13479] protein pyrBI [Clostridium hathewayi DSM 13479] # 13 72 1 60 60 106 100.0 5e-22 MRRIISACLLQTMRFDTTKEADPEQDFAIFCKKLDKSSVNYVIEEKTKEADGSLVVKIRK QYNSYSTDGYLQ >gi|229783871|gb|GG667864.1| GENE 3 890 - 1390 473 166 aa, chain - ## HITS:1 COG:no KEGG:Closa_1769 NR:ns ## KEGG: Closa_1769 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 27 142 27 144 191 79 41.0 6e-14 MREHRCRSFGMEEKLQGQLRACGHFLHYNMGEKAGQARILSALLESGTITQRELQDILEV RSGSLSEILNKVEANGFIERSQSENDRRQMEVRLTESGMERASRLEGEREDAAAGLFACL NDEEKKTLSELLDKLLDGWEGQERGRGRGGRGRGRHMHGRSGDEMQ >gi|229783871|gb|GG667864.1| GENE 4 1387 - 1524 86 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQKQIAEPICRAVTLREKDVKVMAEENWESCEDAEEWKRMEEMGK >gi|229783871|gb|GG667864.1| GENE 5 2495 - 2623 64 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDENRMNCPCCPNHCEAGALQCGKGRAYFSQEENNRSGRIAS >gi|229783871|gb|GG667864.1| GENE 6 2897 - 3160 234 87 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3072 NR:ns ## KEGG: CDR20291_3072 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 6 85 4 83 83 94 61.0 9e-19 MDCKDKKKLLLNLDKLHTTDLGVMRIRRNLSLDAEDAVVWCREMIGNPNTCITRNGKNWY AENEFCIITVNAFSYTIITAHRRKKPS 
>gi|229783871|gb|GG667864.1| GENE 7 3192 - 3776 713 194 aa, chain - ## HITS:1 COG:no KEGG:Spico_0824 NR:ns ## KEGG: Spico_0824 # Name: not_defined # Def: putative phage-related protein # Organism: S.coccoides # Pathway: not_defined # 7 184 1 182 188 90 32.0 3e-17 MERMDALLYDCDIREPLFDYLEERFGKARMFEEKIIGKSRADVLMVTERRITGLEIKSDA DTYERLRRQIRDYDKYCDENYVVIGRSHAKHVEEHIPAYWGVLVVSVNGRDIVIEEMRPP QQNPKMKRELQLAILWRAELQNIIEQNHLPHYRQKSKRFVREKLLEKLEWDQLKLEVCEE LFQRDYTLLEEEEE >gi|229783871|gb|GG667864.1| GENE 8 3872 - 4921 1003 349 aa, chain - ## HITS:1 COG:FN1711 KEGG:ns NR:ns ## COG: FN1711 COG0275 # Protein_GI_number: 19705032 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Fusobacterium nucleatum # 52 311 9 259 314 173 39.0 4e-43 MENQEQTHKRRVRYKGTHPRNYKEKYKELQPEKYADTVAKVVSKGSTPAGMHLSICVREI LDFLQIKPGQKGLDATLGYGGHTREMLKCLQSQGHLYALDVDPIELAKTRERLQSQGFGP EILTIIQENFANIDRVAAEVGPFDFVLADLGVSSMQIDNPDRGFSYKTEGPLDLRLNPEK GITAAERLQTIEEEELRGMLLENADEPYAEEIARAVVQEIRRGNPIDTTTRLRQVIEKAL SRVPAADRKEAVKKSCQRTFQALRIDVNSEFEVLYSFLEKLPDVLAPGGRVAVLTFHSGE DRLVKKSFKRLKKEGVYSEIADEVIRPSAEECLRNGRARSTKMRWAVKA >gi|229783871|gb|GG667864.1| GENE 9 4971 - 5258 356 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625238|ref|ZP_06118173.1| ## NR: gi|266625238|ref|ZP_06118173.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 95 17 111 111 183 100.0 3e-45 MIAFPKKVVLPDKQVTVQSPEEFLAYYDEIFTADYRERIGQLMAEDDVWWSYRGVAVGNG EVWLNERDGTLWIEALNNGEDRAVQYPENTGIQAE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:23:03 2011 Seq name: gi|229783870|gb|GG667865.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld258, whole genome shotgun sequence Length of sequence - 8095 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.5 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 109 90 ## + Prom 165 - 224 5.2 2 2 Tu 1 . + CDS 403 - 963 645 ## Closa_2162 hypothetical protein + Term 1005 - 1062 2.2 3 3 Tu 1 . - CDS 960 - 1745 674 ## EUBREC_3191 hypothetical protein - Prom 1838 - 1897 80.4 4 4 Op 1 . + CDS 2775 - 3584 750 ## COG0561 Predicted hydrolases of the HAD superfamily 5 4 Op 2 . + CDS 3588 - 4427 964 ## COG0489 ATPases involved in chromosome partitioning 6 4 Op 3 . + CDS 4485 - 5183 815 ## COG2357 Uncharacterized protein conserved in bacteria - Term 6121 - 6170 6.1 7 5 Tu 1 . - CDS 6182 - 6856 674 ## COG1802 Transcriptional regulators - Prom 6986 - 7045 6.4 + Prom 6964 - 7023 8.0 8 6 Op 1 . + CDS 7062 - 7745 687 ## COG1878 Predicted metal-dependent hydrolase 9 6 Op 2 . + CDS 7763 - 8093 414 ## COG1893 Ketopantoate reductase Predicted protein(s) >gi|229783870|gb|GG667865.1| GENE 1 2 - 109 90 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no DVLTGLVLGTVIAVAVHKIVEWSVRQKKETAGEAE >gi|229783870|gb|GG667865.1| GENE 2 403 - 963 645 186 aa, chain + ## HITS:1 COG:no KEGG:Closa_2162 NR:ns ## KEGG: Closa_2162 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 182 1 182 190 251 72.0 1e-65 MKKFVTLFTDSYKELKSVRTITTAAMLAAIAIILGMFSIDVGSTIRIGFSSIPNGVCAYL FGPVVGGIFAGGLDVLKYLLKPTGPFFPGLTAVVILAGVLYGCFYYKKPITFWRVLLAKF TVMLICNVLLNTLCLSVLYGKGFMVILPARVIKNLIMWPIDSMIFYSVLKALNLLGILKA VRTNFA >gi|229783870|gb|GG667865.1| GENE 3 960 - 1745 674 261 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3191 NR:ns ## KEGG: EUBREC_3191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 243 3 248 261 135 33.0 3e-30 MDFAQAKKIFAGYLKQYDTKNDKIRLKIVHTYGVVDAARLIAEGLGLSGEDTALALHIAL LHDIGRFEQLKRYDSFDDSIVPHADLSLSILFDENLIRSFIPERDYDSVIYTAIKNHGVY LMDDSLTGRDRLHSMIIRDADKLDNFRVKEQDSIPAMLDIEAEELGAEDISDHILTSFLN RRPIKNSDRVTHMDMWISYLGYIYDLNFPESRRCAAERGVIDTLIDRVPYSNEATRRKME LIRQAAWNFLSESTAGGSRFH >gi|229783870|gb|GG667865.1| GENE 4 2775 - 3584 750 
269 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 7 261 3 253 256 158 34.0 1e-38 MRRRQIRAIFFDIDGTLRDFDTKRVPDSTKEALRKAKEAGILLFVATGRHKLEIEEENLL EDMEFDGYVTLNGQYCYCGSTVVYDVPIDGAGVAAMLRLIGKDPFPCLFMEADRMYINMV DEVVKKAQEGIGTRIPPVMDVSRAAGQPIYQIVPYIGRNREEEIRRAVPGCEIIRWHDEY AVDVIPRGGSKCIGITRMAAHFGLSLEETAAVGDGANDVSMVEMAGLGIAMGNGKDAVKA VADYITDSIETDGLSRAVFYILENNHQEE >gi|229783870|gb|GG667865.1| GENE 5 3588 - 4427 964 279 aa, chain + ## HITS:1 COG:FN2098 KEGG:ns NR:ns ## COG: FN2098 COG0489 # Protein_GI_number: 19705388 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Fusobacterium nucleatum # 37 279 15 254 257 229 48.0 5e-60 MSETNCTHNCNTCGESCSSRQAEQTSFLEPLNPASTVKKVIGVVSGKGGVGKSLVTTMMA VRMNAKNYKTAILDADITGPSIPKAFGLGDDGVGMTPTGLMIPATTSMGIEVMSANLILD HETDPVIWRGPVIAGAVKQFWQEALWEDIDYMFVDMPPGTGDVPLTVFQSLPVDGIIIVT SPQELVSMIVAKAVNMAKKMNIPILGLVENMSYLECPDCGRRISVFGESRIDEVAKENEI PVLAKIPIEPRIAKAVDEGTVEYLEAPWLDEAVKAVESI >gi|229783870|gb|GG667865.1| GENE 6 4485 - 5183 815 232 aa, chain + ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 26 224 17 214 217 217 57.0 1e-56 MNPNSELHQSIVNVPNLVQVPDPLLEQAYQFQEAMMMYTCAIREIKTKLEVLNDELSVRN SRNPIEMVKSRVKKPLSIVEKLQRRGLPITMESMMENLDDVAGIRVICSFLDDIYAVADM LTRQDDVHIIAIKDYIRHPKDNGYRSYHMIVEIPVFFSDRKKWMRVEVQIRTIAMDFWAS LDHQLKYKKEVGDSSEEISAELKECAEVIAETDERMLKIRMKIEAEGITVSK >gi|229783870|gb|GG667865.1| GENE 7 6182 - 6856 674 224 aa, chain - ## HITS:1 COG:CAC0379 KEGG:ns NR:ns ## COG: CAC0379 COG1802 # Protein_GI_number: 15893670 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium 
acetobutylicum # 7 206 13 207 216 80 28.0 3e-15 MKEKSFLQSKAYDYIKEQILEGKLLPDTLYSEARLSKELDISRTPIREALQCLSQDGYIS IVPSKGFMLRHLTEKDMQETIEVRCAIEGFCTHRIAGQLTTEKGKRLLEELEEILEQMRN AKDADDGLQTFIDCDHAFHLAIVGYVENNEFNQLFQRLLYLIHLTSATALSVTGRVEGTL QEHELYYRALREGNGNEAYQIMIRHLMMPLNLHDHSGSPSSSVS >gi|229783870|gb|GG667865.1| GENE 8 7062 - 7745 687 227 aa, chain + ## HITS:1 COG:PAB0997 KEGG:ns NR:ns ## COG: PAB0997 COG1878 # Protein_GI_number: 14521703 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Pyrococcus abyssi # 11 221 33 208 217 64 30.0 2e-10 MCVYKMGNLRVVDLTKELDPKTESRRCHMYRFNTGGPIPDYHTIMDITSHLGTHCECPYH HDDNWPSVAELPLTAFMGRALYVDFKEDVAPGTHITAADLDKATEGRIMEGDIVIIDSSY KLPPFTPATNTEADKRLLIGRESAEWFRDHKVKCVGFGDGVSIENCNEDVKPFHDILMAE NIVFLEVLKNLDKLEQDVFFMSYSPLPIHGLDSCPVRAYAIEGLEGF >gi|229783870|gb|GG667865.1| GENE 9 7763 - 8093 414 110 aa, chain + ## HITS:1 COG:CAC2937 KEGG:ns NR:ns ## COG: CAC2937 COG1893 # Protein_GI_number: 15896190 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Clostridium acetobutylicum # 1 107 1 109 307 99 46.0 2e-21 MKIAVIGAGAMGSIYGGHLSKNHQVYLVDTNPDIVKQINSEGLKIDEDGVTNIWHPTAVT DTEGLGEMDLVILFVKSIFSRAALAGNRGVIGEKTRLLTLQNGAGHEDIL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:23:18 2011 Seq name: gi|229783869|gb|GG667866.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld259, whole genome shotgun sequence Length of sequence - 4446 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 674 658 ## COG1472 Beta-glucosidase-related glycosidases 2 1 Op 2 . - CDS 674 - 1999 1467 ## COG2211 Na+/melibiose symporter and related transporters - Prom 2034 - 2093 4.0 - Term 2113 - 2155 2.0 3 2 Tu 1 . 
- CDS 2165 - 3079 959 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 3115 - 3174 5.6 + Prom 2876 - 2935 4.0 4 3 Tu 1 . + CDS 3081 - 4446 909 ## ELI_2091 hypothetical protein Predicted protein(s) >gi|229783869|gb|GG667866.1| GENE 1 2 - 674 658 224 aa, chain - ## HITS:1 COG:Cgl0317 KEGG:ns NR:ns ## COG: Cgl0317 COG1472 # Protein_GI_number: 19551567 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Corynebacterium glutamicum # 29 223 9 219 548 92 30.0 4e-19 MEKKWSQEEKDGYRLIHNEGGKDLGISSGSRVAILSEDGYAFKDFLGTGKIVPYEDWRLP AGERAADLASRLSIEDIAGLMLYSAHQLIPAKGPLAAAFGGTYDGKAFEESGASPWDLTD QQKEFIVKDRVRHVLIMKLQDTETAVRWNNKLQALAENTGFGIPANNSSDPRHGAGSSAE YMGVTGEPISKWANGIGLTAAFEPEAVREFGEIGAAEYRALGIT >gi|229783869|gb|GG667866.1| GENE 2 674 - 1999 1467 441 aa, chain - ## HITS:1 COG:CAC3422 KEGG:ns NR:ns ## COG: CAC3422 COG2211 # Protein_GI_number: 15896663 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Clostridium acetobutylicum # 3 439 5 440 445 267 38.0 4e-71 MEKISLKERFSYGLGDLACNLMFQLITAYLMFFYTDVAGIGLGAISLIMLVARVVDAVTD PMMGVIIDKTHTRWGKARPYVLWMAVPFGIISASMFLVPNFGGTGKFIYALITYILFCIV YTALNIPYSTMLSCMTDDVSDRLSFNMFKSLGASLGGFVVMGATLTLVAVFGQGDQKKGF FGTVVLYAVVGILLLMICFKNTKERIIPETEKDSVTGAIKVAVKNKYWLLLCGITFFTFA GMILKNQSTMYYAKYYLNNEGIASLLMTIPTLLSFLMAFIIPALAKKIGKRNCILSGGLV TIAGNLLVLISKTNLAGILAGTVLCGIGLGLGMGVTFVMGAETVDYSEWQTGRRPQGIMT ALMGFGVKLSMAMSGVVGAQILKFGGYVENSVQTEGALRAIQMNFIWIPVIFFIVVVILS LFYDLDKKYEKILAEIHERRN >gi|229783869|gb|GG667866.1| GENE 3 2165 - 3079 959 304 aa, chain - ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 5 278 4 286 292 100 27.0 3e-21 MDLFYHETFTQNEQMPMQLRLHRGQINYMILEHWHRSIEIDYLLDCEADFYINGKKKQVA AGSITLINSGDIHALEPREVPVDTEDKIHGISLFISYEFLKKICPEIDQITFELEGQEER 
LKELKEVFNSLIRLELAPPEEYGYLKKNAMLHSLMYLLLTYFKEPRNQSSVKSQKYIDRL RVVLDYVEENYREPLTLQGVADHFSVSMEYLARILKKYTGNTFKTHLNQVRVSKAFRDLL ETDYSVLEIAMRNGFSDTRSLINVFRDTYGMTPSQYRKKNARREAAKEAASAEYPPHIRV VDAP >gi|229783869|gb|GG667866.1| GENE 4 3081 - 4446 909 455 aa, chain + ## HITS:1 COG:no KEGG:ELI_2091 NR:ns ## KEGG: ELI_2091 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 70 450 15 395 666 378 47.0 1e-103 MYPLLPYDFTLLLLFYHSAHMATTWKTAAKVVRLLFLSCNGMRNMVLLKLKEITMTLNPE APMQKKILPVILAIALILSSTFSILCFHNLQGNAKVINYAGIVRGATQRLVKEELNGIPN DLLIEKLDGILEELQSGKGVYGLIRMNSSDFQDLILRMETDWNLLKEEIYLVRAGGDTEK LFEDSESYFDLADRAVLAAEQFSENRELLAEKSLLLLNCFFLLLVFLFCLYSAGQARRQR ELQRTEEDSRKKKEHLSHMVDSLLGPMNEISELVYVSDIEDHTLLFVNKAGMETFHIDAL DSRKCYQVLQGFDSPCEFCTSSILKEGETYTWEYTNPLTRRHYILKDRLIQWEGRTARME LAFDTTESEKEKLQLKLTLEANEMITKCVQTLYQSEDIDTAISQVLERLGSFLSADRAYI IYIRHGKMYNDYEWCTDGVEPQREMLQDLPLNLIT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:23:26 2011 Seq name: gi|229783868|gb|GG667867.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld260, whole genome shotgun sequence Length of sequence - 6725 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 52 - 111 4.2 1 1 Tu 1 . + CDS 320 - 544 279 ## Rumal_3364 helix-turn-helix domain protein + Prom 553 - 612 9.4 2 2 Tu 1 . + CDS 721 - 1068 211 ## gi|266625253|ref|ZP_06118188.1| toxin-antitoxin system, antitoxin component, PHD family + Term 1088 - 1123 3.4 + Prom 1205 - 1264 8.4 3 3 Op 1 . + CDS 1331 - 2017 410 ## gi|266625254|ref|ZP_06118189.1| hypothetical protein CLOSTHATH_06668 + Prom 2019 - 2078 1.9 4 3 Op 2 . + CDS 2172 - 3005 407 ## COG0778 Nitroreductase + Prom 3067 - 3126 3.3 5 4 Tu 1 . + CDS 3146 - 3553 300 ## COG4405 Uncharacterized protein conserved in bacteria + Prom 3638 - 3697 1.8 6 5 Tu 1 . + CDS 3777 - 4043 346 ## EUBELI_01517 hypothetical protein 7 6 Op 1 . 
+ CDS 4958 - 5470 402 ## Cphy_2938 LytTR family two component transcriptional regulator 8 6 Op 2 . + CDS 5467 - 6651 855 ## Amet_3568 signal transduction histidine kinase regulating citrate/malate metabolism Predicted protein(s) >gi|229783868|gb|GG667867.1| GENE 1 320 - 544 279 74 aa, chain + ## HITS:1 COG:no KEGG:Rumal_3364 NR:ns ## KEGG: Rumal_3364 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: R.albus # Pathway: not_defined # 4 72 2 70 74 67 40.0 1e-10 MSYPVIDPVATGARINTYRIDRGFSVASLREYFGLSTTNAIYKWLRGDSLPTLDNFLALS VLFNVSMNDLIVYH >gi|229783868|gb|GG667867.1| GENE 2 721 - 1068 211 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625253|ref|ZP_06118188.1| ## NR: gi|266625253|ref|ZP_06118188.1| toxin-antitoxin system, antitoxin component, PHD family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, PHD family [Clostridium hathewayi DSM 13479] # 1 115 1 115 115 176 100.0 3e-43 MVAKPHPVSIAERLLEIERNAKKEHITELEQMMDKFMMQHKKEIVRHVFADYSEDFKELA KEFDKLKKELEIMEMDLPDAGATAEKSEIKEECSNHCSDASESAGQCVPPASYTE >gi|229783868|gb|GG667867.1| GENE 3 1331 - 2017 410 228 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625254|ref|ZP_06118189.1| ## NR: gi|266625254|ref|ZP_06118189.1| hypothetical protein CLOSTHATH_06668 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06668 [Clostridium hathewayi DSM 13479] # 1 228 1 228 228 463 100.0 1e-129 MKGQPHGRCEMRIYKIRFDVLLCIVLSSLVVLGISKILPVFYSMLYQRYTQTHMVSDGDI GGKAGDDVYRVENVEELFSHNTFTVQVEYGSLLTADTDYFGDIYLMNLELPSGERVAACI NSDAVQQNYEEDYFIMPVGKVVEADLSKDVEFINGIERSDALSRTDFYLDMNRNSGSGLV SVESYDEQYTIYIKAIAAVLCFMAFHIAGCKAGIFPSLIPGKKKRERT >gi|229783868|gb|GG667867.1| GENE 4 2172 - 3005 407 277 aa, chain + ## HITS:1 COG:MTH120 KEGG:ns NR:ns ## COG: MTH120 COG0778 # Protein_GI_number: 15678148 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanothermobacter thermautotrophicus # 89 277 7 191 191 146 40.0 4e-35 
MIAIERFDAQIKKDDSGRLTIVEIPFNAKEVFHKSRGTIYVSGTMNGIEYRSKLLSRGSG KFVMVLDKAMQKSIGFHGQEMTAEITMFSEDLPAAVNVPDKTADIKCDQDVLTVIKTRQS IRKFTEKPVSEHMVTAILSAGMCAPTAKDKRPYHFIVVRDKGVLSMLARHNSNAAMLEIS AGAIIICGDKTREGIKEFLYADCAAAAQNILLSIHGLGLGGVWCGVVPNSDWRKLLIEQL SLPHKVEPISVIAFGWPDEEKELRPRWETAMVHYDKW >gi|229783868|gb|GG667867.1| GENE 5 3146 - 3553 300 135 aa, chain + ## HITS:1 COG:STM4186 KEGG:ns NR:ns ## COG: STM4186 COG4405 # Protein_GI_number: 16767436 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 14 134 14 133 135 86 36.0 2e-17 MDTIENTKKDYFECFAFGDSPEMADELLALVLAGKKTATVSVILEDEQAPSVGDLSLVLD GRGNPACVIKTVYLETVNFCDLTWDMVKLEGEDEDFEHWKAGNIRYWTRDAANRGYTFTD QTPITFERFQVVEVF >gi|229783868|gb|GG667867.1| GENE 6 3777 - 4043 346 88 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01517 NR:ns ## KEGG: EUBELI_01517 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 12 85 76 149 253 98 71.0 6e-20 MELIQLDIFTVLFISNLSDIPQCIPLFIPLFVLAFVPVIFKGQYYGMSEMEAVTRASGAQ IMLAKLVLARSANLVCITILLFMDLLAS >gi|229783868|gb|GG667867.1| GENE 7 4958 - 5470 402 170 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2938 NR:ns ## KEGG: Cphy_2938 # Name: not_defined # Def: LytTR family two component transcriptional regulator # Organism: C.phytofermentans # Pathway: not_defined # 1 161 78 231 240 83 31.0 4e-15 MPEYRFTPILFTTELAGEELSAYREIKCYDFLVKPFTEAEFQKTFQAALEMGTQMQKAPE ILRIEQKQFLFEYEIRNILYIESFGKKLVIHSEQYGDCEIADQISGYSLSKLLNMVPQNR LLQCHKSYLVNPVHISKIDKANRLLYLKGCKTAVPIGEKYQKAVFEREQP >gi|229783868|gb|GG667867.1| GENE 8 5467 - 6651 855 394 aa, chain + ## HITS:1 COG:no KEGG:Amet_3568 NR:ns ## KEGG: Amet_3568 # Name: not_defined # Def: signal transduction histidine kinase regulating citrate/malate metabolism # Organism: A.metalliredigens # Pathway: not_defined # 166 347 160 337 429 69 30.0 2e-10 MTISIISQSVLDGLLLSLSSLGLLGRSRGKLDWLLPLVFAGICLLSRLGLGDHDTEMAYA VLPVDQIVSFLFFFLGVLLLNSLWFQKSEGHTLFGTVAAFALYLLLRELCLLFLALCGSA 
EPVWYLYVGRVLSLGLWLALWGTGALRWLREKLADGDPVICIVCSNTELLLLLMLILFQF DFSVMIRWLPVTVGAVGLLAVGDVIAMLLRQHRVQARRRTALLEQYLPLVEELIEQVRAR QHDFNNQMMAVAAAVSTARDLQEAQEAVTALLQHVRLDGADKELLKCDSKVISGLLFGKM KQAEFKQVRVEVTIAGAFLRRVLSEADWVEVIAILLDNAIEAAAPGDVLYARAIEEGDSL RFSVCARRRTFLTGASPESARWWEGYSRRQGCPS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:23:58 2011 Seq name: gi|229783867|gb|GG667868.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld261, whole genome shotgun sequence Length of sequence - 5008 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 2173 2125 ## COG0827 Adenine-specific DNA methylase 2 2 Tu 1 . + CDS 3148 - 5008 1580 ## COG4646 DNA methylase Predicted protein(s) >gi|229783867|gb|GG667868.1| GENE 1 2 - 2173 2125 723 aa, chain + ## HITS:1 COG:pli0004 KEGG:ns NR:ns ## COG: pli0004 COG0827 # Protein_GI_number: 18450290 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Listeria innocua # 1 281 467 717 756 226 45.0 2e-58 AWASEYAQLKELLTPEEYAAARSSTLNAHYTSPTVIQAIYEAVGRMGFETGNILEPSMGV GNFFGMLPEEMRNSRLYGVELDPVSGRIAKQLYPKADITVGGFETTDRRDFFDLAIGNVP FGQYQVNDKAYNKLNFSIHNYFFAKSLDQVRPGGVVAFVTSRYTMDAKDSTVRRYLAQRA ELLGAIRLPNDAFKKNAGAEVVSDIIFLQKRDRPLDIAPEWTQTGIIRVPAKSQDFVGEG GATEQASFSPQGGNEQAQSATTQDGFAINRYFIDHPEMVLGRQEPVSTAHGMDYTVNPIE GLELSDQLHDAVKYIHGTYQEAELPELGEGEAIDTSIPADPNVKNYSYAIVDGQVYYREN SRMVRPDLNATAEARVKGLVGLRDCVQELIDLQMDAAVPDSTIREKQAELNSLYDSFSAK YGLINDRANRLAYADDSSYYLLCALEVIDEDGKLERKADMFTKRTIKPHQAVAVVDTASE ALAVSISEKACVDMDYMSQLTGKTKEELAGELQGVIFRVPGQLEQDRTPHYVTADEYLSG NVRRKLRQAQRAAQQDPVYAVNVEALTAAQPKDLDASEIEVRLGATWIDKEYIQQFMYET FNTPFYLQRSIEVNYSSFTAEWQIKGKSSVSYNDVAAYTTYGTSRANAYKILEDSLNLRD VRIYDTIEDADGKERRVLNAKETTLAAQKQQAIREAFRDWIWRDPERRQTLVSQYNEEKE VGC >gi|229783867|gb|GG667868.1| GENE 2 3148 - 5008 1580 620 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns 
## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 614 462 1060 1315 403 36.0 1e-112 MVSQYNEEMNSTRPREYDGSHITFGGMNPAITLREHQKSAIAHVLYGGNTLLAHEVGAGK TFEMVAASMEAKRLGLCQKSLFVVPNHLTEQWASEFLRLYPSANILVTTKKDFETHNRKK FCARIATGDYDAIIMGHSQFERIPISRERQERLLYEQIDEITEGIAEVQASGGERFTVKQ LERTRKSLEARLEKLQAEGRKDDVVTFEQLGVDRLFVDEAHNYKNLFLYTKMRNVAGLST SDAQKSSDMFAKCRYMDEITGNRGVIFATGTPVSNSMTELYTMQRYLQYERLQELNMTHF DCWASRFGETVTALELAPEGTGYRARTRFSKFFNLPELMNLFKEVADIKTADQLNLPTPE VEYHNIVAQPTEHQQEMVKALSERASLVHSGTVDPSQDNMLKITSDGRKLGLDQRIVNQM LPDEPGTKVNQCVDNIMQIWRDGKADKLTQLVFCDISTPQAKAPASKAAKTLDNPLLHAL EDAVPLPEQEPTFTVYDDIRQKLIAQGMPADQIAFIHEANTEVRKKELFSKVRTGQVRVL LGSTAKMGAGTNVQDRLVALHDLDCPWRPGDLAQRKGRIERQGNQNPLVHVYRYVTEGTF DAYLWQTVENKQKFKPFRFQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:24:00 2011 Seq name: gi|229783866|gb|GG667869.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld262, whole genome shotgun sequence Length of sequence - 3085 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 582 403 ## COG0451 Nucleoside-diphosphate-sugar epimerases 2 1 Op 2 . - CDS 670 - 2664 1872 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] - Prom 2814 - 2873 5.7 3 2 Tu 1 . 
- CDS 2882 - 3085 184 ## COG0582 Integrase Predicted protein(s) >gi|229783866|gb|GG667869.1| GENE 1 3 - 582 403 193 aa, chain - ## HITS:1 COG:MA1185 KEGG:ns NR:ns ## COG: MA1185 COG0451 # Protein_GI_number: 20090051 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Methanosarcina acetivorans str.C2A # 1 176 1 186 311 64 29.0 9e-11 MNIVVTGATSFIGAPMVDKLIGMGHDVYAVVRPASKNRARLGTGRERLHVIEKNLQEADR LDEEILVPCEAFFHFGWDGAGSGNRMNREVQQKNVADSLKALEGAKKLGCRTFLFSGSQA EYGIHQDAMTEETTLNPVSEYGKAKVDFCRKAMELTDGFRYIHARIFSVYGPGDHPWSLV ESCLKTFPAGGYL >gi|229783866|gb|GG667869.1| GENE 2 670 - 2664 1872 664 aa, chain - ## HITS:1 COG:all4613 KEGG:ns NR:ns ## COG: all4613 COG0028 # Protein_GI_number: 17232105 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Nostoc sp. 
PCC 7120 # 13 646 43 608 632 228 29.0 2e-59 MRMKVSNYISQKLVEFGITQVFTVTGGGAMHLNDALGHQEGLTCLYNHHEQACAIAAECY ARVQGRIAAVCVTTGPGGTNAITGVVGGWLDSIPMLVLSGQVRYDTTARWSGVGIRAMGD QEFDITKAIDCMTKYSEMVIDPMRIRYCLEKAIYLAYSGRPGPAWLDIPLNVQGAYIETE ELVGFSQEDYEAGGDGWAAPSGSKTEADNAGQGEKRQVLPPAVTRETARVIIEKIKKSQR PVINAGNGIRIGHAFEVFSRVVEKLGVPVVTGWDSEDCMPDDHPLYTGRGGGMGDRAGNF AIQNSDLVLSLGSRLSIRQVGYNYSTWARAAYTIVNDIDPEELKKPSVHIDMAVHADVKD LLEQLERVLDEEYGNGQPVFAGGAGLPGMTWTDTCRMWKEKYPVVLPKHYDHGEEEDANV YAFMKEMSSRLKEEQVIVVGNGSACVVGGHACIIKQGQRFISNSAIASMGYDLPAAIGAC MAVREGGAQEGRMQGAESQDTELQDTSSDIILITGDGSIQMNLQELQTIIHHRMPIKIFL INNGGYHSIRQTQKNFFGEPLVGIGVDSHDLSFPDMEKLAAAYGYPYCRACHNGELVPAI ETALRTDGPVICEIFVSRDQNFEPKSAAKRLPDGTMVSPPLEDLSPFLPEEEMDENMIIP RIRE >gi|229783866|gb|GG667869.1| GENE 3 2882 - 3085 184 67 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 1 67 199 265 265 106 71.0 8e-24 AVRYGIPVEVVYPHSFRHRFAKNFLEKFSDIALLADLMGHESIETTRIYLRRTSSEQQQI VDQVVDW Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:24:02 2011 Seq name: gi|229783865|gb|GG667870.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld263, whole genome shotgun sequence Length of sequence - 7555 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 44 - 916 531 ## COG0395 ABC-type sugar transport system, permease component 2 1 Op 2 . - CDS 929 - 1798 845 ## COG4209 ABC-type polysaccharide transport system, permease component - Prom 1862 - 1921 3.0 3 2 Tu 1 . - CDS 1924 - 2067 133 ## gi|266625267|ref|ZP_06118202.1| nucleotide pyrophosphatase 4 3 Tu 1 . 
- CDS 3037 - 4110 662 ## COG1524 Uncharacterized proteins of the AP superfamily - Prom 4252 - 4311 7.7 - Term 4256 - 4313 5.7 5 4 Op 1 4/0.000 - CDS 4322 - 5797 864 ## COG1621 Beta-fructosidases (levanase/invertase) 6 4 Op 2 . - CDS 6765 - 7550 624 ## COG0524 Sugar kinases, ribokinase family Predicted protein(s) >gi|229783865|gb|GG667870.1| GENE 1 44 - 916 531 290 aa, chain - ## HITS:1 COG:BH0795 KEGG:ns NR:ns ## COG: BH0795 COG0395 # Protein_GI_number: 15613358 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 9 290 36 322 322 220 41.0 3e-57 MKKKLTKLDVISYCFISLLVLTSLLPMVMILIASFTDNNTILRNGYTFFPEKWSAHAYNY IFASAGEVIRSYEVTLFITAAGTGAGLIVTALAGYVLNCRKFKYRNYFSFFFYFTTLFSG GLVPFYIMVAAILGLKDNLLAVILPGLTSPFLIILMRSFITTSIPDSLMESMRIDGATDF TVFARLVLPLIKPALATVGLFLALNYWNGWYLPMLFLNRPKDFPIQYYLHNMMSVQRMAA TSGAAVQGMQFPGESIKMAMAVVATAPILLAYPFVQRYFVEGLTVGAVKE >gi|229783865|gb|GG667870.1| GENE 2 929 - 1798 845 289 aa, chain - ## HITS:1 COG:lin2117 KEGG:ns NR:ns ## COG: lin2117 COG4209 # Protein_GI_number: 16801183 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Listeria innocua # 1 289 17 309 309 221 43.0 1e-57 MIMPAVLFFLIFNYGPMAGIYYAFTQYNFRGGLFGSPFIGLTNFKILIQNGSLGYLTRNT VLYNLVFIAAGNILEVLVAVLIHRLSSKKFKKISQSVILFPYVISYVIVQVFAYAMLNGN SGAVTHFVREEMGAAGFNAYTTPGIWKYIIVLIYLWKNTGYGMIIYLAALSGISKDYYEA AQIDGASAYQQIRYITLPQLTPTFITLLLLAIGNILRGQFELFYQLVGTNGLLYEQTDIF DTFVFRLLQNSFDVGLGTAAGLYQSFFGLFVVLSCNYFVKRSNPDYALF >gi|229783865|gb|GG667870.1| GENE 3 1924 - 2067 133 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625267|ref|ZP_06118202.1| ## NR: gi|266625267|ref|ZP_06118202.1| nucleotide pyrophosphatase [Clostridium hathewayi DSM 13479] nucleotide pyrophosphatase [Clostridium hathewayi DSM 13479] # 1 47 18 64 64 89 100.0 1e-16 MSGPDVKAGAVMERQRIIDEAPTFAYMLGVSMEEAQGRCMEELLLKP >gi|229783865|gb|GG667870.1| GENE 4 3037 - 
4110 662 357 aa, chain - ## HITS:1 COG:CAC0477 KEGG:ns NR:ns ## COG: CAC0477 COG1524 # Protein_GI_number: 15893768 # Func_class: R General function prediction only # Function: Uncharacterized proteins of the AP superfamily # Organism: Clostridium acetobutylicum # 2 354 4 348 434 201 33.0 2e-51 MKRNKLIVMSIDALFDEDIPCIKELPNFNRLLKNSARVRGGMRGIYPALTYPSHVSMITG TYPERHHICHNEIVDPFSREMDWYWYRDQIAVPTILDAAHAAGYTTACLGWPCMGGDPAS DWNVPEIWPKRGAGNLEEVLIGSSSESVLADGGVLRKHMHVYKKLARFFVDEAIVGCACD LIRTRQPDVLFMHVAHLDHMRHVHGVYGREIEEALMVHDDWLGRILDEVKEAGLDDFTNI AVVSDHGHLAVKRMFQPNVLLAEAGLIRLDQSGTITGWEAFCNSASLSTQVHMRDPEDRQ AREKLEAVLADVIKNPIYGVEAVFTKAEAKTEHHLEGGFEYIMEGGTGTAFGNAVAS >gi|229783865|gb|GG667870.1| GENE 5 4322 - 5797 864 491 aa, chain - ## HITS:1 COG:CAC0425 KEGG:ns NR:ns ## COG: CAC0425 COG1621 # Protein_GI_number: 15893716 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Clostridium acetobutylicum # 1 481 1 483 490 268 33.0 2e-71 MSSLQLKKNIEKAERAIQEKRETAGQGKMRQQFHFMAESGWINDPNGLIYFHGQYHLFYQ FNPFKPAWGEMYWGHAVSDNLVDWTYLPVALAPSENYDSHEQGGCFSGTSIEYDGKLYLL YTGTSSGEHGFVQVQCLAESSDGITFEKYEGNPVLRAPAGYDECDFRDPKVWSHDDSFYM ICGTKKDGYAKLLLFKSADLKKWEYKSVMAESRGEWGEMWECPDFYSIEGTDVLTFSPVG AGERKTVYMTGQLDYQTGKFCCTREGEVDWGFDYYAPQTLEDEDGRRIMFGWAGGWDWMP WWNGHGPSEKEEWCGFFGIPREVRLLDDHSLQFIPVKELESLRRREMISVSEESITEETP FFLPDVRIYEMKLSINLKESQAECCILRLCGTEDFYTDITADLKNGELRVDRNHSDKWNH GVSRSNLYLQDKERLDLHIFMDQSSVEVFADEYRNVHTCRLYSDSGQCRNRIMTMGGTVR IEHLKLWSFRC >gi|229783865|gb|GG667870.1| GENE 6 6765 - 7550 624 261 aa, chain - ## HITS:1 COG:PA1950 KEGG:ns NR:ns ## COG: PA1950 COG0524 # Protein_GI_number: 15597146 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Pseudomonas aeruginosa # 2 258 4 268 308 122 33.0 8e-28 MKILNLGSLNIDKVYQVNHFVKGQETISAEECHVYPGGKGLNQSVAIARAGGEVFHAGAV GSDGGELLSVLEKENVHLELVRRTGGVSGSAVIQVTGGENAIIVYGGANKEITREDVDRA LEPFGKGDYLLLQNETSLVSYAIRKASEKGMTVVFNASPVTEDMQGYPLEMVDIFFVNEI 
EAKYLAECDEGDYQIILERLHQKYPAAILVMTVGEEGVYYQSESDFFHLPANQVSVVDTT AAGDTFGGYYLACISKSMTVS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:24:08 2011 Seq name: gi|229783864|gb|GG667871.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld264, whole genome shotgun sequence Length of sequence - 4868 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 233 140 ## Closa_2239 DNA polymerase B exonuclease - Prom 300 - 359 6.0 + Prom 424 - 483 5.4 2 2 Op 1 29/0.000 + CDS 515 - 1123 819 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 3 2 Op 2 . + CDS 1145 - 2140 1256 ## COG2255 Holliday junction resolvasome, helicase subunit + Prom 2160 - 2219 7.5 4 3 Op 1 . + CDS 2257 - 2679 568 ## Closa_2236 hypothetical protein + Term 2688 - 2729 5.5 + Prom 2701 - 2760 2.3 5 3 Op 2 . + CDS 2831 - 4390 1298 ## COG0826 Collagenase and related proteases + Term 4421 - 4473 13.0 + Prom 4411 - 4470 3.6 6 4 Tu 1 . 
+ CDS 4533 - 4850 328 ## CPF_0987 hypothetical protein Predicted protein(s) >gi|229783864|gb|GG667871.1| GENE 1 2 - 233 140 77 aa, chain - ## HITS:1 COG:no KEGG:Closa_2239 NR:ns ## KEGG: Closa_2239 # Name: not_defined # Def: DNA polymerase B exonuclease # Organism: C.saccharolyticum # Pathway: not_defined # 1 76 1 76 364 112 69.0 6e-24 MITIHETMLFPETYPLDRLGRREDLLFFDIETTGFSGEYTNLYLIGCVYYEGGRWNLVQW FADTVSAEKEVLVTFFE >gi|229783864|gb|GG667871.1| GENE 2 515 - 1123 819 202 aa, chain + ## HITS:1 COG:VC1846 KEGG:ns NR:ns ## COG: VC1846 COG0632 # Protein_GI_number: 15641848 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Vibrio cholerae # 1 199 1 203 204 130 36.0 2e-30 MISYVKGPLVDIFEDTVVIEAGYIGLEIHVPLSVLDRLPGIGMETVLYTYFQVREDGMCL YGFLNRQDLQMFRQLISVSGIGPKGALGVLSAMTPDELRLAIITGDAKAISRAPGIGVKT AQRVILDLKDKIDMADVLPAQFAAEEEPGISAGGVAKEAVEALVALGYSTAEANRAVGKV EVTGDMTSEDVLKASLKHLAFL >gi|229783864|gb|GG667871.1| GENE 3 1145 - 2140 1256 331 aa, chain + ## HITS:1 COG:BS_ruvBm KEGG:ns NR:ns ## COG: BS_ruvBm COG2255 # Protein_GI_number: 16081161 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Bacillus subtilis # 1 328 3 330 336 419 62.0 1e-117 MERRIITTEVTEEDKRTEPNLRPQSLNEYIGQEKLKANLKVYIDAAKARGESLDHVLFYG PPGLGKTTLSGIIANEMGVNMKVTSGPAIEKPGEMAAILNNLQEGDVLFVDEIHRLNRQV EEVLYPAMEDYAIDIMLGKDSAARSIRLDLPRFTLVGATTRAGLLTAPLRDRFGVVQKME FYTPKELEIIVCHSAKVLEVEIEPEGAAEIAKRSRGTPRLANRLLKRVRDFAQVKYHGVI TKEVADFALDILDVDKFGLDNNDRAILTTMIEKFSGGPVGLETLAASLGEDAGTLEDVYE PYLLMNGFINRTPRGRVATERAYHHLGLTLE >gi|229783864|gb|GG667871.1| GENE 4 2257 - 2679 568 140 aa, chain + ## HITS:1 COG:no KEGG:Closa_2236 NR:ns ## KEGG: Closa_2236 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 119 1 119 122 139 74.0 4e-32 MDTKHYTEVLIDGRIYTLGGTEDEGYMQRVASYINEMITTLKRQEGFTKQSAEYQTAMIE 
LNMADDYFKAREQIARLEQLKHEMEKEVYSLKHELVTTQMKLEAANKELAAGRTARPSGS SGNASESKKGSDRTGSENKE >gi|229783864|gb|GG667871.1| GENE 5 2831 - 4390 1298 519 aa, chain + ## HITS:1 COG:CAC2341 KEGG:ns NR:ns ## COG: CAC2341 COG0826 # Protein_GI_number: 15895608 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Clostridium acetobutylicum # 3 295 18 302 787 293 53.0 5e-79 MKAAVAAGADAVYIGGSRFGARAYAENPEEDRLLEAIDYAHLHGCRLYMTVNTLMKESEM AELCAFLEPYYMRGVDAVIVQDMGAFALIRERFPDLPVHASTQMTITGSYGARMLKELGA ERVVTARELSLEEIERIHREVAVEIESFVHGALCYCYSGQCLFSSLIGGRSGNRGRCAQT CRLPYDVKRDGRTLNGKDSRYVLSLKDLCTLDLIPDMIEAGIYSMKIEGRMKSPRYTAGV VEIYRRYTDLYLERGRAGYRVEEKDRKRLLELFDRGGQTDGYYRRHNGKDMVVWKEKPSF REGNQELFDYLDKNYVEKQLQEPVRGRIVLETGKPAVLELSCGECSVSLAGDTVLEALKQ PMDETRIRKQLQKTGNTPFYFEELTVTLTGEVFLPVQSLNELRRKGLEALEEAVLKQYRR EVLQNQREVLQCRREDPQDCRETPHSAVSGKRESSSGQPRIFVSLERPDCLPAVLHNPYV ERIYVDAAEFAVSENWWKASKAIEKLRFLMSFGRVKTEA >gi|229783864|gb|GG667871.1| GENE 6 4533 - 4850 328 105 aa, chain + ## HITS:1 COG:no KEGG:CPF_0987 NR:ns ## KEGG: CPF_0987 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 8 104 1 97 97 134 62.0 1e-30 MVNGGKNMKYYMVEGMILDADRMNDAIMKDHMAYTQKAMDSGMIFLSGLKTDRSGGAFLM KAENPEEIESYLASEPFKLYGIQDYRWAEFDIHYINPQPEQWFQN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:24:18 2011 Seq name: gi|229783863|gb|GG667872.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld265, whole genome shotgun sequence Length of sequence - 3814 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 37/0.000 - CDS 3 - 323 200 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc 2 1 Op 2 23/0.000 - CDS 316 - 1500 1410 ## COG0133 Tryptophan synthase beta chain 3 1 Op 3 9/0.000 - CDS 1503 - 2141 
427 ## COG0135 Phosphoribosylanthranilate isomerase 4 1 Op 4 21/0.000 - CDS 2170 - 2976 755 ## COG0134 Indole-3-glycerol phosphate synthase 5 1 Op 5 . - CDS 2989 - 3813 996 ## COG0547 Anthranilate phosphoribosyltransferase Predicted protein(s) >gi|229783863|gb|GG667872.1| GENE 1 3 - 323 200 107 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. AzwK-3b] # 1 107 1 113 263 81 40 9e-16 MSRVRDAFADGKAFIPFVTAGDPELAVTEELIVAMAEAGAALIEIGIPFSDPVAEGIVIQ EADIRALASGTTTDKIFDMVKRIREKTEVPLAFMTYINPIFVYGSEK >gi|229783863|gb|GG667872.1| GENE 2 316 - 1500 1410 394 aa, chain - ## HITS:1 COG:CAC3158 KEGG:ns NR:ns ## COG: CAC3158 COG0133 # Protein_GI_number: 15896406 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Clostridium acetobutylicum # 4 386 3 385 394 533 66.0 1e-151 MSKGRFGQHGGQFIPETLMNAVNELETAYEKYIHDESFLAELDELNRTYTGRPSLLYYAA KMTEDLGGAKIYLKREDLNHTGSHKINNALGQTLLAKKMGKTRVIAETGAGQHGVATATA AALLGLECEVYMGKEDTDRQALNVFRMELLGARVHPVTSGTQTLKDAVNETLREWTNRVE DTHYVLGSVMGPHPFPTIVRDFQSIIGREVKSQLLEAEGKLPDMLIACVGGGSNAMGIFY DFIDEPGVKLVGCEAAGHGIDTDKNAATIATGSLGIFHGMKSYFCQDEYGQIAPVYSISA GLDYPGIGPEHASLHDSGRAQYVPVTDAEAVDAFEYLSRMEGIIPAIESAHAVAYARKIA PAMGKDQVIVINLSGRGDKDVAAIARYKGVEIYE >gi|229783863|gb|GG667872.1| GENE 3 1503 - 2141 427 212 aa, chain - ## HITS:1 COG:CAC3159 KEGG:ns NR:ns ## COG: CAC3159 COG0135 # Protein_GI_number: 15896407 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylanthranilate isomerase # Organism: Clostridium acetobutylicum # 1 210 2 202 205 129 36.0 4e-30 MKIKICGLTRPEDIAAVNEYRPDYIGFVFAPSRRNVSEMRAAGLKKLLSPEIMAVGVFVN SPIEDIVRIASRGIIDAVQLHGDMDGFETARYAENLKKALAAAGLAASVPLIRAVRVKDR NDIQRAVSYPAEYLLFDTFVKDQYGGSGVRFDWSLIPHIEKPWFLAGGLSAATIKEALKT EACCLDLSSAAETGGRKDPKKIQEIIQTVRSA >gi|229783863|gb|GG667872.1| GENE 4 2170 - 2976 755 268 aa, chain - ## HITS:1 COG:BH1661 KEGG:ns NR:ns ## COG: BH1661 COG0134 # 
Protein_GI_number: 15614224 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Bacillus halodurans # 47 255 28 239 257 184 48.0 1e-46 MILDTLAASTRKRVEAAKEKIPLGTMMNLASQASEARQNGAKESVFSFEKALKEPGMSFI CEVKKASPSKGLIAPDFPYVDIARDYEAAGAAAISVLTEPEYFLGSSDYLREIRGHVSVP LLRKDFTIDPYQIFEAKVIGASAVLLICALLDTETLRTGIHLCDSLGMSALVEAHDEEEI KSALRAGARIIGVNNRNLKTFEVDFSNSIRLRNLVPEGVLFIAESGVKSAEDIRLLHNAG VDGVLIGETLMRSGSKKLILDSWKQACE >gi|229783863|gb|GG667872.1| GENE 5 2989 - 3813 996 274 aa, chain - ## HITS:1 COG:MJ0234 KEGG:ns NR:ns ## COG: MJ0234 COG0547 # Protein_GI_number: 15668409 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Methanococcus jannaschii # 4 247 88 329 336 253 54.0 3e-67 EAGTFNISTTSAFVVAAGGVPVAKHGNRSVSSKSGAADVLENLGVNLKLTVEQSEQILKD TGMCFMFAQSYHASMKYAGPVRKELGVRTIFNILGPLSNPAGATMQLLGVYDEKLVEPLA KVLSNLGVNRGLVVCGNDGLDEATVTGPTHVCEIRFGELTMYEITPEQFGFSRCDLSELV GGNPEENARITRDILTGKLTGPKRDIVVFNSALSLYLGIDDCTIEDCIVLANDLIDSGKA AAKLEEFAKATNQYNIPSAAGETGKSPSQPFGGM Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:24:19 2011 Seq name: gi|229783862|gb|GG667873.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld266, whole genome shotgun sequence Length of sequence - 4248 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 112 - 255 125 ## gi|266625282|ref|ZP_06118217.1| hypothetical protein CLOSTHATH_06698 - Prom 276 - 335 5.8 - Term 278 - 322 5.2 2 2 Tu 1 . - CDS 441 - 1196 775 ## COG2188 Transcriptional regulators - Prom 1351 - 1410 5.2 - Term 1453 - 1503 10.3 3 3 Op 1 1/0.000 - CDS 1541 - 2638 1266 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs - Prom 2825 - 2884 2.0 - Term 3151 - 3196 1.0 4 3 Op 2 . 
- CDS 3259 - 4248 945 ## COG1690 Uncharacterized conserved protein Predicted protein(s) >gi|229783862|gb|GG667873.1| GENE 1 112 - 255 125 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625282|ref|ZP_06118217.1| ## NR: gi|266625282|ref|ZP_06118217.1| hypothetical protein CLOSTHATH_06698 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06698 [Clostridium hathewayi DSM 13479] # 1 47 1 47 47 77 100.0 3e-13 MLTFVKCFGMIYLAAENGGKLFRKTSGLTAERVLKKMKKVVDKNESL >gi|229783862|gb|GG667873.1| GENE 2 441 - 1196 775 251 aa, chain - ## HITS:1 COG:BH0419 KEGG:ns NR:ns ## COG: BH0419 COG2188 # Protein_GI_number: 15612982 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 6 251 4 238 240 103 29.0 2e-22 MLNHEKGAPPLYSQLESIIKMQIERGEYNRGDFLPTEKMLMDQYEISRVTVRQAMAALSQ AGYIKCSRGIGTEVIYEKIDEHMKSVISFTDEMKQHNITMETTYCRMEKIMPGTTVASAL QIPLTDQCYCLTRVRSVQGRPLVYTITYMRDLVELPMDSQYYMESLYKYLSEVHEIRIVR GRDTLEAALPSLEVQKFLEIDAQMPVFKRTRQTFLPGEDFPALHTSGKDGLVFEYSVCYY PGNRYKYTVDL >gi|229783862|gb|GG667873.1| GENE 3 1541 - 2638 1266 365 aa, chain - ## HITS:1 COG:CAC3381 KEGG:ns NR:ns ## COG: CAC3381 COG0330 # Protein_GI_number: 15896623 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Clostridium acetobutylicum # 1 363 1 362 365 307 44.0 3e-83 MRIHINQTQCGFVIKNGCYQKMIFAGTYYYPACMGYKVVIEEMRGRVSFKEVPLKVLKED KVFLSRVTIAEVPEGQLGVLYINHAASAVVTDREAAYWNVWEPCELRMIDMREPAMDRTV EKSLLRLIDQQYYKKIEILPGEKGLLYQDNILTDELGPGTYYYWLYARDVLCRVVDLKMK ELEVSGQEILTADRVGIRLNLTATYRIADPRRLVETIKGVENQLYTRIQLIVREYIGRYR LDEILEQKEAIAGFLAQRMREEQEQYCVEVQTIGIKDIILPGEIRDIMNTVLIAEKRAQA NVITRREEVASTRSLLNTAKLMDENKTLYKLKELECLERICTQVGNISVAGSGTLIQQLK ELLGT >gi|229783862|gb|GG667873.1| GENE 4 3259 - 4248 945 329 aa, chain - ## HITS:1 COG:STM3519 KEGG:ns NR:ns ## COG: STM3519 COG1690 # Protein_GI_number: 16766807 # Func_class: S Function unknown # Function: Uncharacterized conserved 
protein # Organism: Salmonella typhimurium LT2 # 69 329 153 405 405 176 42.0 4e-44 DFVSTNIKAEDMKPVMTGNGTMFQAITGEIMRNIPVGYNSYKKRQESEVLDRCLEHASDY EKNEALFPLMEDAFYQIGTLGGGNHFIELQEDQDGYLGIMIHSGSRHFGKEICDYFHKRA RELNAAWHSAVPDSYHLAFLPVDSAEGQDYIPWMNLALEYAFENRKRMMEKTLEIVAEKV EKYLDTTMEFTEHINCHHNYASLETHYGKEVWVHRKGATRAGNGEMAVIPGAMGSYSYVV MGKGNPESFESSSHGAGRAYSRKAAMEKFSCEEVINDLKAQGVVLGKMKKSDVAEESRFA YKDIDHVMDQQKDLVVPVKKLKTVGVVKG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:24:26 2011 Seq name: gi|229783861|gb|GG667874.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld267, whole genome shotgun sequence Length of sequence - 5896 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 99 119 ## + Term 222 - 284 8.3 - Term 210 - 272 20.3 2 2 Op 1 10/0.000 - CDS 305 - 1960 1646 ## COG4211 ABC-type glucose/galactose transport system, permease component 3 2 Op 2 16/0.000 - CDS 1979 - 3487 1383 ## COG1129 ABC-type sugar transport system, ATPase component - Prom 3525 - 3584 8.6 4 2 Op 3 . - CDS 4486 - 5484 1190 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 5543 - 5602 6.0 5 3 Tu 1 . 
- CDS 5671 - 5856 131 ## Closa_2375 LacI family transcriptional regulator Predicted protein(s) >gi|229783861|gb|GG667874.1| GENE 1 1 - 99 119 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no VEEAEAFTPPEWFGRDVTFSGEYQNSRLAGLD >gi|229783861|gb|GG667874.1| GENE 2 305 - 1960 1646 551 aa, chain - ## HITS:1 COG:TP0686 KEGG:ns NR:ns ## COG: TP0686 COG4211 # Protein_GI_number: 15639673 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Treponema pallidum # 31 551 31 531 531 412 42.0 1e-114 MAKGNSKTLTAEQEMKLRQPIEDYVGRIQSKIDNLRADGTNRAVALQNNIDAIKRDRSIP KEEKENRIAAYKTELDKAKAVEAKNKDEISKLISDATAYLKAHFDKDYYQPLKESCEQEK VLAKEKYQKKIAELEKEHQSNLSKLTDHQEVKDEKYVHKNRMFDAKMELEKELQQIKDRR HAAFAWQYHLIDLLRISKFTFMESRAQKWENYKYTFNRRAFLLANGLYIAIILIFIALCI ATPIIKGVPLLTYNNVLNILQQASPRMFLALGVAGLILLTGTDLSIGRMVGMGMTTATII MHQGVNTGMVFGHVFDFTGLPIGVRVIFALFMCIVLCTCFTAIGGFFTARFKMHPFISTM ANMLVIFGLVTYATKGVSFGAIDPSIPKMIIPRLNGFPTIILWAIAAIVIVWFIWNKTTF GKNLYAVGGNPEAASVSGISVFWVTMGAFILAGILYGFGSWLECARMVGSGSAAYGQGWE MDAIAACVVGGVSFTGGIGKISGVVVGVLIFTVLIYSLTILGIDTNLQFVFEGIIIITAV TLDCLKYVQKK >gi|229783861|gb|GG667874.1| GENE 3 1979 - 3487 1383 502 aa, chain - ## HITS:1 COG:TP0685 KEGG:ns NR:ns ## COG: TP0685 COG1129 # Protein_GI_number: 15639672 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Treponema pallidum # 9 502 4 496 496 614 62.0 1e-175 MADKKDEIVLSIRGMSKSFGRNRVLDHINLDVKRGTVMGLMGENGAGKSTMMKCLFGTYQ KDEGNIFLKGKEVSFSGPKDALENGIAMVHQELNQCLERNVIDNLFLGRYPVNSMGVVDE GRMKKEASELFRKLGMTVNLTQPMRNMSVSQRQMCEIAKAISYNSQVIVLDEPTSSLTVQ EVDKLFEMMRMLKEQGIALIYISHKMDEIFEICDEISVLRDGNLVMTKSTKDANMNELIS AMVGRSLDNRFPPVDNTPKDVVLSIQHLSTKFEPHLQDVTFDVKEGEIFGLYGLVGAGRT ELLETIFGVRTRAAGRVYFHNRLMNFSSAKEAMEHGFAMITEERKANGLFLKGDLTFNTT IANLNQYKSGLALSDAKMIKATANEIKIMHTKCMGPDDMISSLSGGNQQKVIFGKWLERG PQVFMMDEPTRGIDVGAKYEIYELIINMAKQGKTIIVVSSEMPEILGITNRIGVMSNGRL SGIVNTKETNQEELLRLSAKYL 
>gi|229783861|gb|GG667874.1| GENE 4 4486 - 5484 1190 332 aa, chain - ## HITS:1 COG:TP0684 KEGG:ns NR:ns ## COG: TP0684 COG1879 # Protein_GI_number: 15639671 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Treponema pallidum # 66 332 35 292 403 218 45.0 2e-56 MRLLKKVVAVGMASTMVFTIAGCGSKTTETTAAPAETTAAAAETTAAGETTAAAAETKDP NAKVENKDKPLVWFNRQPSSSATGQLDMTALNYNKDTYYVGFDANQGAELQGTMVKDYIE KNIDSIDRNGDGIIGYVLAIGDIGHNDSIARTRGVRKALGTAVEKDGNINSDPIGTNTDG TSTIVQDGKIEVNGKSYIVRELASQEMKNSAGATWDAATAGNGIGTWASSFGDQIDIVAS NNDGMGMSMFNAWSKDNNVPTFGYDANSDAVAAIAEGYGGTISQHADVQAYLTLRVLRNA LDGVDINTGISTADDAGNVLSDDVYVYKAVAS >gi|229783861|gb|GG667874.1| GENE 5 5671 - 5856 131 61 aa, chain - ## HITS:1 COG:no KEGG:Closa_2375 NR:ns ## KEGG: Closa_2375 # Name: not_defined # Def: LacI family transcriptional regulator # Organism: C.saccharolyticum # Pathway: ABC transporters [PATH:csh02010]; Bacterial chemotaxis [PATH:csh02030] # 1 57 282 338 340 82 68.0 4e-15 MEGLKSGKLFGTVQCDSDEYGSVMFQIAAAAGLGQNVQEIVSLEDGKYYNCRQTALTAGE N Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:24:34 2011 Seq name: gi|229783860|gb|GG667875.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld268, whole genome shotgun sequence Length of sequence - 3731 bp Number of predicted genes - 7, with homology - 5 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 43 - 297 212 ## gi|266625292|ref|ZP_06118227.1| conserved hypothetical protein + Prom 310 - 369 6.9 2 2 Op 1 . + CDS 582 - 1316 659 ## gi|266625293|ref|ZP_06118228.1| hypothetical protein CLOSTHATH_06708 3 2 Op 2 . + CDS 1320 - 1520 103 ## 4 2 Op 3 . + CDS 1462 - 1878 352 ## gi|266625294|ref|ZP_06118229.1| hypothetical protein CLOSTHATH_06709 + Term 1918 - 1964 11.2 - TRNA 1963 - 2036 70.4 # Pro GGG 0 0 5 3 Tu 1 . + CDS 1978 - 2079 56 ## + Prom 2085 - 2144 4.8 6 4 Op 1 . 
+ CDS 2167 - 2952 1067 ## COG0561 Predicted hydrolases of the HAD superfamily 7 4 Op 2 . + CDS 3011 - 3730 376 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 Predicted protein(s) >gi|229783860|gb|GG667875.1| GENE 1 43 - 297 212 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625292|ref|ZP_06118227.1| ## NR: gi|266625292|ref|ZP_06118227.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 84 1 84 84 150 100.0 4e-35 MPGGWSYQSYLTPYDSFIFYNAIGKHNDYFYTPIAVATQVVNGTNYKFMTIAEPQTEDLN PQFAVVDIYQPINGKPYLTSIAML >gi|229783860|gb|GG667875.1| GENE 2 582 - 1316 659 244 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625293|ref|ZP_06118228.1| ## NR: gi|266625293|ref|ZP_06118228.1| hypothetical protein CLOSTHATH_06708 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06708 [Clostridium hathewayi DSM 13479] # 1 244 15 258 258 480 100.0 1e-134 MNYLDFAVAKIAKLYPEVLREEIYLLFYLIKDNFQANMKYLRENNVDVDCFGFLPVRRNT RNSVEAFFDMFNLINAENYISVLLYMHDKKNNRYDDRYKKYAFKGDFTIVSKRDIAKDNG LGRITDDETMAKIDQAESGSNSYIHPSIFIPLYTSADIEAKEEILRDLLRVNTWLLGGAY HQLCCFIKKKYGTLKSFECLNCTKEAACFECFEAYYVAFKEMIEQNLFPLSQEGGHEEPY KNRS >gi|229783860|gb|GG667875.1| GENE 3 1320 - 1520 103 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSGKDGQTGLEFKVNLSGKATRPEPGRASPRNTRCHRMQGSEMWEEYNVNYILYSKQKTK KRVQGV >gi|229783860|gb|GG667875.1| GENE 4 1462 - 1878 352 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625294|ref|ZP_06118229.1| ## NR: gi|266625294|ref|ZP_06118229.1| hypothetical protein CLOSTHATH_06709 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06709 [Clostridium hathewayi DSM 13479] # 1 138 1 138 138 246 100.0 4e-64 MSTTYYIVNRKRKKECREFEKFWEEEWLPMVTDKLHQFCAEANGEIVNDELAERLMRDSF SVFSRSPLSDSLYKEPFLTVNHAGVFWHKCETEGALLNSLEDLIKFFSKRANQEKYSLED ESGRGCTLNELISGEHRD >gi|229783860|gb|GG667875.1| GENE 5 1978 - 2079 56 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNLRPLGPQPSALPSYAILRFWSLWYYSRKFAI 
>gi|229783860|gb|GG667875.1| GENE 6 2167 - 2952 1067 261 aa, chain + ## HITS:1 COG:BS_ykrA KEGG:ns NR:ns ## COG: BS_ykrA COG0561 # Protein_GI_number: 16078519 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Bacillus subtilis # 4 260 5 257 257 97 30.0 2e-20 MKPIVSFDVDMTLLDHKDWTIPDSAMETLQKLRENYTIVLATGRDMDSSFSVVLKEQIRP DAIIHLNGTKITVGDTIIYEHLFDKELMKRILAYAQDTPYSVGASVGGYDYYIHPEVVNR HDTDLWGECGRRFADPWKLPELDVRTMAYIGDERGAKAMGDRFPEINVHMFAEKRGADVV EKTTSKAMGLIRLCDYFQTPLSDTVAFGDSMNDYDIVKTAGIGIAMGNAMEELKEAADYV TDAVDADGIRNACVHFGLIRL >gi|229783860|gb|GG667875.1| GENE 7 3011 - 3730 376 240 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 5 240 4 234 458 149 34 3e-36 MAKHYDLFIIGGGPGGYTAALKAAGLGLKTAIAEKEKLGGTCVNKGCIPTKALLHASSMF AAFQNCDVFGVSADFISFDFKKMQDYKKRSVKEYRKEVKEAVESAGIDYIEGTATIRRGR TVEVNSPSGKDYYEADHIIIATGAKPVIPDIPGAMLPGVLTSDRLLASDTWNYDRLTIIG GGVIGVEFATIFQALCSHVTIIESREHLLGPMDNEVSEVLEEELRRKGISVQCEARVLEI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:25:11 2011 Seq name: gi|229783859|gb|GG667876.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld269, whole genome shotgun sequence Length of sequence - 5287 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 26/0.000 - CDS 2 - 335 316 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 2 1 Op 2 . - CDS 460 - 1929 1481 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 3 1 Op 3 . - CDS 1941 - 3611 2056 ## COG2759 Formyltetrahydrofolate synthetase 4 1 Op 4 . 
- CDS 3638 - 5287 1926 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins Predicted protein(s) >gi|229783859|gb|GG667876.1| GENE 1 2 - 335 316 111 aa, chain - ## HITS:1 COG:aq_821 KEGG:ns NR:ns ## COG: aq_821 COG0770 # Protein_GI_number: 15606186 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Aquifex aeolicus # 3 111 2 109 445 72 34.0 2e-13 MEQITVKEILEATGGRLLCGSPDTPLDHVSIDSRTMKGNDLFVPLVGEKVDAHRFIGQAF DNGAAATFTSEHDVMEDNRPWIRVSDTKRALQALGAWYRRRLKLPLVGITG >gi|229783859|gb|GG667876.1| GENE 2 460 - 1929 1481 489 aa, chain - ## HITS:1 COG:CAC2129 KEGG:ns NR:ns ## COG: CAC2129 COG0769 # Protein_GI_number: 15895398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Clostridium acetobutylicum # 1 485 1 481 482 331 38.0 2e-90 MKFREWLEELEYDVISGNLDIEVEDVIYDSRKARENVVFVCTKGTKIDSHEFIPDVVKAG VKALVVERQAEIPDGVTAIRVHSSREALALLSAARFGRPQEKMITIGVTGTKGKTTTTHM IKAILESCGKKVGMIGTTGIVIGSEVTPTMNTTPESYQLFESFERMVQAGCEYMVMEVSS QAFKMHRVDGIRFDYGLFTNISPDHIGPDEHADFAEYLYYKSQLLSRCRIGIVNRDDEHY REIIKDTACELASYSLEQEADFMAGQIHYVSEPEFVGLEFDVTGRYKLEARVNIPGKFNV YNALAAISVCSFLDLPKEKVRHALEHLSVNGRMEIVYSSKKCTVIVDYAHNAVSMESLLK TLREYHPKRLVCVFGCGGNRSKDRRYSMGDSAGRLADFTILTADNSRFEKTGDIIADIRS SIEKTGGEFIEIPDRREAIAYSMIHARPGDMIAVIGKGHEDYQEVNGVRHHFSDREVVLE IAAELEKNE >gi|229783859|gb|GG667876.1| GENE 3 1941 - 3611 2056 556 aa, chain - ## HITS:1 COG:CAC3201 KEGG:ns NR:ns ## COG: CAC3201 COG2759 # Protein_GI_number: 15896448 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Clostridium acetobutylicum # 1 556 1 556 556 731 66.0 0 MKTDIEIAQEAKMIPIKEVAASYGISEDDLELYGKYKAKLTDELWEEVNDRPDGKLVLVT AINPTPAGEGKTTTTVGLGEAFGRMDKKAIIALREPSLGPCFGIKGGAAGGGYAQVVPME DLNLHFTGDFHAITSANNLLAALLDNHIHQGNALGIDTRQILWKRCLDMNDRALRNVVVG LGAKADGFVREDHFVITVASEIMAILCLADDMNDLKERLGRIIVAYNYAGEPVTAAQLNA 
VGAMAALLKDAMKPNLIQTLEHTGAIVHGGPFANIAHGCNSVRATKTALKLADIVVTEAG FGADLGAEKFLDIKCRMAGLKPDAIVLVATVRALKYNGGIPKDQLKEENLEALAKGIVNL EKHIENMQKYDVPVIVTLNSFITDTEAEYQFVKKFCEDRGCEFALSEVWEKGGEGGTALA EKVLYTLENKESRYKPLYPDEAGLKEKIAAVAKEIYGADGVSYAPAASKALKKIEDMGFG SLPVCMAKTQYSLSDDQTKLGRPSGFQINVRDAYVSAGAGFVVVLTGAIMTMPGLPKVPA ANNIDVNNDGVITGLF >gi|229783859|gb|GG667876.1| GENE 4 3638 - 5287 1926 549 aa, chain - ## HITS:1 COG:CAC1812 KEGG:ns NR:ns ## COG: CAC1812 COG1674 # Protein_GI_number: 15895088 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Clostridium acetobutylicum # 18 541 249 765 765 536 55.0 1e-152 GDTFSIPEEPKRVVTASGKVIETDTEALQKKLEKKKAEKAEKEQAGELSVSQEIKQKEEV VKKEYVFPPVTLLKKGKNSGPFSDKEYRETAIKLQQTLQNFGVGVTVTNISCGPSVTRYE LHPEQGVKVSRIVGLADDIKLSLAAADIRIEAPIPGKSAVGIEVPNKENNMVYLRDILEA DEFQKHASRIAFAVGKDIGGQVVVTDIGKMPHLLIAGATGSGKSVCINTLIMSIIYKANP DDVKLIMVDPKVVELSVYNGIPHLLLPVVTDPKKASGALNWAVAEMDDRYKKFAQYNVRD LKGYNAKVENIKDIEDENKPKKMPQIIIIIDELADLMMVAPGEVEDSVCRLAQLARAAGI HLVIATQRPSVNVITGLIKANVPSRIAFAVSSGVDSRTIIDMNGAEKLLGKGDMLFYPSG CPKPVRVQGAFVSDTEVSAVVDFLTEQGMTANYNPEVENQIVQTPAAGDAKGGGNDRDEY FVQAGKFIIEKDKASIGMLQRMFKIGFNRAARIMDQLAEAGVVGEEEGTKPRKVLMSAEE FEELLEQGY Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:25:13 2011 Seq name: gi|229783858|gb|GG667877.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld270, whole genome shotgun sequence Length of sequence - 5397 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 91 - 1533 1291 ## COG0534 Na+-driven multidrug efflux pump 2 1 Op 2 . - CDS 1580 - 2089 619 ## COG1052 Lactate dehydrogenase and related dehydrogenases 3 2 Op 1 . - CDS 3032 - 3409 443 ## gi|266625304|ref|ZP_06118239.1| glyoxylate reductase 4 2 Op 2 . - CDS 3435 - 5141 1797 ## COG0004 Ammonia permease 5 2 Op 3 . 
- CDS 5220 - 5396 255 ## COG1126 ABC-type polar amino acid transport system, ATPase component Predicted protein(s) >gi|229783858|gb|GG667877.1| GENE 1 91 - 1533 1291 480 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 3 439 9 433 448 201 30.0 2e-51 MGTAPLFPLLLSMGIPGLLGTMTTYLYRTVDQVFVGNYVGRYALGGLSVLAPFNNVMIAL SLFITVGGASLLSLAIGSGNDEKAGRLFTNIIVQATGMATVVSLVFSVSAEPFVRLCGAS EETETYRYAVEYLRIVSLGQVFNMLNVGLAAIIRTEGSAGYSMAANMIGAVCNIILDYFL IARGGMGVEGAALATVISQMAGAAFSAAYFLTGRSGLKWAGFSAVDGRQMIYIIKMGIAP SIFQMLSFVTNIMMNKALLKYGDLDPVYSLLGGGELCVSAMAAAVTMESFIVSLTSGINQ AASPVISYNYGARKYSRVKKASLISQSMTVILSAAVWAAMMAVPAALVRMFGAGDEAFVT FGAYAMRICKILALFSGYQMLVSMYFSAIGRSDQATLVSFSRHGIFLIPALLLFPRFFGL AGVLYAAPFSDACSLLVVSCLYGKDIRRVGRLRDGETVLNKNRFISAKMKEKTAVKSGLA >gi|229783858|gb|GG667877.1| GENE 2 1580 - 2089 619 169 aa, chain - ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 168 155 322 324 165 47.0 3e-41 MGIIGLGSIGLWTARMAAGFGMKVISFSRHKKTGPEYDFIEQVGLEELFLRSDVISIHCP LTDESYHMIDRAAIEKMKDGVILINTARGAVIDEEALIGALDSGKIYAAGLDVVDNEPLK ERCALMNCSNAVITPHIAWAPEEARYRTVRVAAENLKNWIHGMPTSVIA >gi|229783858|gb|GG667877.1| GENE 3 3032 - 3409 443 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625304|ref|ZP_06118239.1| ## NR: gi|266625304|ref|ZP_06118239.1| glyoxylate reductase [Clostridium hathewayi DSM 13479] glyoxylate reductase [Clostridium hathewayi DSM 13479] # 1 123 1 123 123 258 100.0 9e-68 MKVVICQGEAIGGRPSREELIRYWKEQLKEIPEVDEVVFHNGFSSRKKDEIAGDADALLG AWIQDCCEETFLSNHPRLKYLASFAHGYGEFDRDAAEKHHITMTNTVYGDVTIAQFAMAL LLDIS >gi|229783858|gb|GG667877.1| GENE 4 3435 - 5141 1797 568 aa, 
chain - ## HITS:1 COG:TM0402 KEGG:ns NR:ns ## COG: TM0402 COG0004 # Protein_GI_number: 15643168 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Thermotoga maritima # 3 406 30 429 435 389 55.0 1e-108 MSAVNTIWVLFGAALVFFMQAGFAMVETGFTRAKNSGNIIMKNLMDFCIGTPIFWFVGFG IMFGSWGSVFGGFDFFLMGNYDSVLPAGISKWTFVIFQSVFCATAATIVSGAMAERTKFS AYCVYSAVISMVIYPVSGHWIWGGGWLANLGFHDFAGSAAVHMVGGTAAAVGAYLLGPRI GKYGEDGKIKAIPGHNITLGALGVFILWFCWFGFNGCSTVSMEGDAMETAGLIFMNTNLA AALAGCATMFYTWFKYKKPDVSMTLNGVLAGLVAITAGCDAVTPFGAAAIGICAGILVVV AIEMIEKKLKIDDPVGAIGVHGVCGAFGAAAVGLFAKEGGLFYGGGFSCLGIQLLGVLSV AAWVFLTMNVVFRVIRKTMGLRVTRREEIEGLDSTEHGLANAYADFLPVVPADDLMANVD SVLGAAPGRRDIPVDEAIPVRARVSETGESELSRVDIILKQSKFEELKNALNGIGITGMT VTQVLGCGMQKGAAEYYRGVPVDMTLLPKVKIEVVVAKVPVEEVVSTARKVLYTGHIGDG KIFIYDVRDSVKIRTGETGYDAMQGDQD >gi|229783858|gb|GG667877.1| GENE 5 5220 - 5396 255 58 aa, chain - ## HITS:1 COG:BS_yckI KEGG:ns NR:ns ## COG: BS_yckI COG1126 # Protein_GI_number: 16077427 # Func_class: E Amino acid transport and metabolism # Function: ABC-type polar amino acid transport system, ATPase component # Organism: Bacillus subtilis # 1 50 197 246 247 69 62.0 2e-12 THEIAFAREVASRVIFMEGGVVVEQGPPSELLVNPKEARTKQFLKRITSPNDGMAEVC Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:25:23 2011 Seq name: gi|229783857|gb|GG667878.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld271, whole genome shotgun sequence Length of sequence - 6005 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1273 1457 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase - Prom 1295 - 1354 4.9 - Term 1321 - 1356 -0.7 2 1 Op 2 . - CDS 1414 - 1608 158 ## gi|266625308|ref|ZP_06118243.1| putative creatinine amidohydrolase - Prom 1648 - 1707 80.4 - Term 2578 - 2610 3.1 3 2 Op 1 . 
- CDS 2662 - 3282 600 ## COG0655 Multimeric flavodoxin WrbA 4 2 Op 2 . - CDS 3334 - 5955 2731 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family Predicted protein(s) >gi|229783857|gb|GG667878.1| GENE 1 1 - 1273 1457 424 aa, chain - ## HITS:1 COG:SPy2110 KEGG:ns NR:ns ## COG: SPy2110 COG1328 # Protein_GI_number: 15675860 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Streptococcus pyogenes M1 GAS # 2 424 15 432 732 257 38.0 4e-68 MIKVQKRDGRIVNYDREKIVRAIQKANAEVDAAERAGEALIDGILDDVERECEDVVHVES IQDMIESRLVKQNKYTLSKKYMIYRYQRALLRKANTTDESILKLIRNENKELAEENSNKN TILASTQRDYIAGEVSRDLTKRMLLPERISMAHEQGAIHFHDADYFVQPIFNCCLINIAD MLDNGTVMNEKMIESPKSFQVACTVMTQIIAAVASNQYGGQSVDMRHLGKYLRKSSNKFH RQIREEFGDSLDEESVEKMARIRLRDELKSGVQTIQYQINTLMTTNGQAPFVTLFLRLDD EDEYINENALIVEEILRQRLEGIKNEAGVYVTPAFPKLIYVLDENNCLKGGKYDYITRLA VRCSSKRMYPDYISAKKMKENYEGNVFSCMGCRSFLSPWKDENGDYKFEGRFNQGVVSLN LPQI >gi|229783857|gb|GG667878.1| GENE 2 1414 - 1608 158 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625308|ref|ZP_06118243.1| ## NR: gi|266625308|ref|ZP_06118243.1| putative creatinine amidohydrolase [Clostridium hathewayi DSM 13479] putative creatinine amidohydrolase [Clostridium hathewayi DSM 13479] # 1 64 8 71 71 122 100.0 6e-27 MTRIQMPAHWNIHAPGTRPAAARRFFGCVLAMLVKRWGTGLRAELLGREKEKEKKAFTFA ISVV >gi|229783857|gb|GG667878.1| GENE 3 2662 - 3282 600 206 aa, chain - ## HITS:1 COG:CAC3341 KEGG:ns NR:ns ## COG: CAC3341 COG0655 # Protein_GI_number: 15896584 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 206 1 208 208 292 65.0 2e-79 MKVLLLNGSPHEKGCTYRALTEVAGALTNEGIETEIVHIGNQAIHGCIGCGKCAAAGRCV FKGDPVNAFLDKMEHADGLVVGSPVYYASANGSLYSFLDRLFYAGSCFAFKPAAAVASAR RAGTTATLDSMNKYFTISQMPVVSSQYWNMVHGNSPEEVEQDLEGLQVMRTVGKNMAWLL HCIEAGKAAGITTPEPEQRLRTNFIR >gi|229783857|gb|GG667878.1| GENE 4 3334 - 5955 2731 873 aa, chain - ## HITS:1 COG:FN1160 KEGG:ns NR:ns ## COG: 
FN1160 COG0553 # Protein_GI_number: 19704495 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Fusobacterium nucleatum # 267 872 493 1086 1089 507 45.0 1e-143 MEFIMELVGSYCEHYEQFQKSSFSAMPALRSLNLSRSNRDRFFRLMMGETLELEDYRAHK RMVTLTDHNPDLAVKIRKKGRDGILVSIDKGLLIFEGERHLYVVDKERIYRCDEACSSTL HVFLEQMIKSFGSNCEAEVNDKDMPLFYERVLKKIASYSIIDAGDVELEAYKPEELKARF EFDSQGPNDLVLKPILSYGDYSFQPVEDEKLPRTVCRDVPGEFRVSQLITKYFKYKEHDT EYPAIRDDEEAVYRLLTEGMSEFMAQGEVYLSEAFQKLKVLPPPKISIGVKSSGSWLELT VDTEGMSGAELTKLLSEYSQKKRFYRMKNGEFLALDDNGLMTVAKLVEGLAVNKTDLQSQ KFRLPRYRALYLDGILKEGSGITLYRDALFKAVVRGMKSVEDSDYEIPLTLRSVLREYQK TGFRWLKTLDSYGFGGILADDMGLGKTIQVIALLLDESNREPDSSALIVCPASLVYNWEN EIHHFAPTLKVRTISGTAQEREELLKAASAGEILITSYDLLKRDIAFYEEREFRFQIVDE AQYIKNASTQSAKAVKSVNARTRFALTGTPIENRLSELWSIFDFLMPGFLFSYQRFKKEY ELPIVRDQDENCLKGLHRMIGPFILRRLKKDVLKELPDKLENVIYSGFEKEQKELYTANA WQVKQQLELAGDGGSDRIQILAQLTRLRQICCDPHLCYSNYNGSSAKLETCIDLIRNGVE GGHKILLFSQFTSMLEIIEKRLKKEGMAYYILTGATPKEERLHMVSSFKDDGVPVFLISL KAGGTGLNLTAADVVIHYDPWWNVAAQNQATDRTHRIGQEKQVTVFKLITKGTIEENILK LQESKKNLAEQIITEGTVSFGSLTKEDLIGLLD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:25:29 2011 Seq name: gi|229783856|gb|GG667879.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld272, whole genome shotgun sequence Length of sequence - 4381 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 566 427 ## ELI_1104 hypothetical protein 2 2 Op 1 . - CDS 744 - 1373 353 ## COG4845 Chloramphenicol O-acetyltransferase 3 2 Op 2 . - CDS 1453 - 2022 402 ## COG1309 Transcriptional regulator - Prom 2046 - 2105 6.3 4 3 Op 1 . - CDS 2194 - 3096 329 ## Sterm_0531 hypothetical protein 5 3 Op 2 . - CDS 3166 - 3486 61 ## CD3206 hypothetical protein 6 3 Op 3 . 
- CDS 3544 - 4005 278 ## Closa_4187 hypothetical protein - Prom 4142 - 4201 11.1 Predicted protein(s) >gi|229783856|gb|GG667879.1| GENE 1 3 - 566 427 187 aa, chain + ## HITS:1 COG:no KEGG:ELI_1104 NR:ns ## KEGG: ELI_1104 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 187 301 486 500 66 24.0 6e-10 SDIGIVETSPGKGTRVLVGVDNPAPISSFSTPNLKKALLQYLQSMQIILIICKDAARSVF PSLTDSSIQDRASWLKTIVISGDYYMTFGVCFSLLSENAQSPALREILGGLMQIQYLGYP LNDVKPYPFRFDSRSTTALLKSLEERDADLFAEELQCLALDIFKAGKEKLIAGGILEAEA IVLPSLD >gi|229783856|gb|GG667879.1| GENE 2 744 - 1373 353 209 aa, chain - ## HITS:1 COG:MA1703 KEGG:ns NR:ns ## COG: MA1703 COG4845 # Protein_GI_number: 20090555 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 1 208 1 208 209 266 55.0 2e-71 MSNQYQVIDEKSWDRAMHCFIFRNSIEPAFCVTFEANITNFKRKIKEQGLSFTMAMVYAV CKCANEIEAFRYRFLDGQVVLFDKIDTAFTYLNNETELFKVVTVPFVGNLKEYCESALKT AQEQKEYFTGPLGNDVFQCSPMPWVTYTHISHTNSGKKDQATPLFDWGKYYEKNGEILIP VSVQAHHSFVDGLHIGQFVDKLQNFFDEY >gi|229783856|gb|GG667879.1| GENE 3 1453 - 2022 402 189 aa, chain - ## HITS:1 COG:BS_yobS KEGG:ns NR:ns ## COG: BS_yobS COG1309 # Protein_GI_number: 16078967 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 3 183 4 183 191 104 30.0 8e-23 MARAGLDKNAVVERAAQLANEVGIDHIQLKTLAESLNIQPPSLYNHIRGLDDLHHELMLY GWRHMEERMLEAMGSSDGYAAWESVCWAFYRYATENPGVFSAMLWYNKYRDDETQSVTEE LFTMCFKIASSLNISEENCNHLLRTFRAFLEGFSLLVNNNAFGHSLSVEESFDLSLKVMI AGMKELEGK >gi|229783856|gb|GG667879.1| GENE 4 2194 - 3096 329 300 aa, chain - ## HITS:1 COG:no KEGG:Sterm_0531 NR:ns ## KEGG: Sterm_0531 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 5 297 6 301 306 305 50.0 1e-81 MNEHQARNFVYRNARPLELARWKYLFEGGSREEVLTALASYQNGDGGFGHALEPDCWNPD SSPIQTWAAAEILREIGFEEKDHPIIQGILQYLSSGKDFDGHTWANTIATNNDAPHAPWW EYAPDAERSYNPTASLITFILKYAEPESKLFALACSLAEEAYHYFKAHFPMESVGNVSCF 
VRLYDYLKESETDTLINLIEFKALLQRQIQYVMTCDTARWTAEYVCKPSFFISSKSSDFY EENRELCKYECEFIENTQEPDGTWAITWSWADYAEEWNISKNWWKSDWIIKYLHYLREIR >gi|229783856|gb|GG667879.1| GENE 5 3166 - 3486 61 106 aa, chain - ## HITS:1 COG:no KEGG:CD3206 NR:ns ## KEGG: CD3206 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 93 2 93 104 112 65.0 4e-24 MKKVLKSFTFWFLLISVLEIFMHQMGQDSKSIVLIRLNPILNMIAESDGFLYSFMKSGLQ ISCNTIEGQISIYWYIGSVLVFIFYGVILDCIKWGVKRWRSHRECP >gi|229783856|gb|GG667879.1| GENE 6 3544 - 4005 278 153 aa, chain - ## HITS:1 COG:no KEGG:Closa_4187 NR:ns ## KEGG: Closa_4187 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 153 1 153 153 230 68.0 2e-59 MKVNRCIKESFIVIGKEGSTLDGEDFIQKLWHHANSHFEEIQHLAKKDEDGNIRGIWGAM TDFSRSFKPWENFNKGLYLAGVECVDDAEAPDGWSKWIVPGYEYIYAECEEEAVFPRVIG YMRENGIPLAGAVQDFTCPQTEKNYVFFPIRKL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:25:44 2011 Seq name: gi|229783855|gb|GG667880.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld273, whole genome shotgun sequence Length of sequence - 3626 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 271 231 ## Closa_0232 single-strand binding protein 2 1 Op 2 . + CDS 285 - 803 384 ## Cphy_2986 hypothetical protein 3 1 Op 3 . + CDS 784 - 1275 366 ## Thebr_0094 zinc finger, CHC2-family protein 4 1 Op 4 . + CDS 1277 - 3151 1366 ## COG5519 Superfamily II helicase and inactivated derivatives + Prom 3257 - 3316 6.1 5 2 Tu 1 . 
+ CDS 3460 - 3625 63 ## gi|266625321|ref|ZP_06118256.1| conserved hypothetical protein Predicted protein(s) >gi|229783855|gb|GG667880.1| GENE 1 2 - 271 231 89 aa, chain + ## HITS:1 COG:no KEGG:Closa_0232 NR:ns ## KEGG: Closa_0232 # Name: not_defined # Def: single-strand binding protein # Organism: C.saccharolyticum # Pathway: DNA replication [PATH:csh03030]; Mismatch repair [PATH:csh03430]; Homologous recombination [PATH:csh03440] # 1 89 56 150 150 126 76.0 4e-28 AFDRAGEFAEKYFRQGMRVLISGRIQTGSYINKDGIKVYTTDIIVEDQEFADSKGAASGG QTNGRPEPADADGFVNIPDNLEDDSLPFN >gi|229783855|gb|GG667880.1| GENE 2 285 - 803 384 172 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2986 NR:ns ## KEGG: Cphy_2986 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 163 1 161 166 209 59.0 3e-53 MNIQIDSREKAKAIQKIIKEFDRQGVDHFVSKLYVGDYMNFDNPRVIIDRKQNLTELCGN VCQGHNRFRDEILRANEHGIEIIILCEHGKGIERLEDVIWWDNPRRIERYKDLHAGKWLQ RETKALTGEKLYKILSTFQRKYGCQFLFCNKEETGRRIIELLGGELIDSGRD >gi|229783855|gb|GG667880.1| GENE 3 784 - 1275 366 163 aa, chain + ## HITS:1 COG:no KEGG:Thebr_0094 NR:ns ## KEGG: Thebr_0094 # Name: not_defined # Def: zinc finger, CHC2-family protein # Organism: T.brockii # Pathway: not_defined # 6 128 12 134 203 68 34.0 6e-11 MTVEEIKAAYSMRDIVERYGFRPTRAGFIPCPFHSGDRQASLKVYDRDFHCHACGAHGDI IDFVMMMDDVDFKTAYQSLGGEYQKPTFSSRLAVYEAQKQREMRHKMQERIAAQKDLNNT LLTVYRTYMERSEPLSDIWTDCYNALQYQLYLQAELNDMEARW >gi|229783855|gb|GG667880.1| GENE 4 1277 - 3151 1366 624 aa, chain + ## HITS:1 COG:SA1828 KEGG:ns NR:ns ## COG: SA1828 COG5519 # Protein_GI_number: 15927596 # Func_class: L Replication, recombination and repair # Function: Superfamily II helicase and inactivated derivatives # Organism: Staphylococcus aureus N315 # 126 511 68 456 569 118 25.0 4e-26 MVPLKELTAETILSKEILTEVFDQEDELYRAELLASLGLRAAELKVKTEFREMVATYKRI EKEMKRQERDKKSQPCTLEQWTNFDGPYDNMQCKQWIASENGIYLNNPSTGYTDILACYH PILPIERLKNLETGEEQIKLAYKRNARWEEIIVPKTLVTSANKIVALSGRGIAVTSENAK 
YLVRYLADVENANEEHINVQYSTSKLGWIRKGFLPYDTEIVFDGDTRFRQIYDSVEQAGS REEWFSHAKELRRSGRIEIKFMLAASFSSALVQPLGGLPYFVDLWGETEGGKTVDLMLAA SVWADPDENAYIGDYKTTDVALEAKADMLNHLPLILDDTSKKNRKIEDNFEGLVYDLCSG KGKSRSNKELGLNRENRWKNCILTNGERPLTSYVTQGGAINRILELECGDHVFKDPGYTA ELVKRSYGHAGREFVELIKDLGIDAIREIQQEFLRQLADDEKMQKQSLSLSIVLTADKLA TDYLFKDRQYISLEEAREVLVDRNELSDNERCYQFLMDKIAMNPARFDGDNENIEKWGVI EEGYAIIYATAFSTLCKDGGFSRTSFLSWANRKGLLQTEKSGKKLDKIKSFKGNKIRCVF LKLNDGADKDGFIQTDEQMELPFE >gi|229783855|gb|GG667880.1| GENE 5 3460 - 3625 63 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625321|ref|ZP_06118256.1| ## NR: gi|266625321|ref|ZP_06118256.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 55 1 55 55 102 100.0 1e-20 MTNDEIKNILNEVHNVFWVKWRNKVPERGSYEWEQFIQDGGELMKKYSYCSLVIK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:25:59 2011 Seq name: gi|229783854|gb|GG667881.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld274, whole genome shotgun sequence Length of sequence - 4244 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 338 180 ## COG0733 Na+-dependent transporters of the SNF family 2 1 Op 2 . - CDS 396 - 1322 1132 ## GY4MC1_1285 hypothetical protein - Prom 1344 - 1403 6.2 - Term 1348 - 1403 12.6 3 2 Tu 1 . - CDS 1438 - 3801 2286 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase - Prom 3853 - 3912 7.4 - Term 3870 - 3929 11.2 4 3 Tu 1 . 
- CDS 4061 - 4243 229 ## Closa_2505 GTP-binding proten HflX Predicted protein(s) >gi|229783854|gb|GG667881.1| GENE 1 2 - 338 180 112 aa, chain - ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 27 109 3 85 459 99 66.0 1e-21 MVQNLCRSSQKEADFDRFGGSNLINQERKSMGREKFSSRLGFILISAGCAIGLGNVWRFP YIVGQYGGAAFVLIYAVFLVILGLPIVVMEFAVGRASQRSVALSYDVLETKR >gi|229783854|gb|GG667881.1| GENE 2 396 - 1322 1132 308 aa, chain - ## HITS:1 COG:no KEGG:GY4MC1_1285 NR:ns ## KEGG: GY4MC1_1285 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y4.1MC1 # Pathway: not_defined # 1 308 1 305 305 310 52.0 5e-83 MKRFLNCNASDFEKMNGKELVESIASSEGRVLVCETIGTTQPMLGDVSNAELAAAMGADV ILLNIFDVDHPVIHGLPPVDGADIIRKVKALTGRPVGINLEPKEQGVESAVEADAMWGIT KGRLGTLENGKKAVAMGVDFIVLTGNPGVGVTNRAIAETLALYRKEFGDAVVLVAGKMHA AGILSEAGEKIITREDVKQFREAGADVILMPAPGTVPGITMEYIRGLVSFAHELGALTLT AVGTSQEGSDTATIREIALMCKMTGTDMHHLGDAGYGGMALPENILEYGKVIRGVRHTYH RMAASIRR >gi|229783854|gb|GG667881.1| GENE 3 1438 - 3801 2286 787 aa, chain - ## HITS:1 COG:PA1920 KEGG:ns NR:ns ## COG: PA1920 COG1328 # Protein_GI_number: 15597116 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Pseudomonas aeruginosa # 4 663 13 665 675 722 51.0 0 MIAVLKRDGEVAEFSLNKITEAIKKAFKATKKDYNNEILELLSLRVTADFQEKMRDGQIT VEQIQDSVEHVLEQTGYTDVAKAYILYRKQREKIRNMKNTILDYKDVVNSYVKVEDWRVK ENSTVTYSVGGLILSNSGAVTANYWLSEIYDQEIADAHRNADIHIHDLSMLTGYCAGWSL KQLLMEGLGGIPGKITSSPAKHLSVLCNQMVNFLGIMQNEWAGAQAFSSFDTYLAPFVKI DNLTYPEVKKCIESFIYGVNTPSRWGTQAPFSNITLDWTVPGDMAELPAIIGGKEADFKY KDCKAEMDMINKAFIETMIEGDANGRGFQYPIPTYSITKDFDWSDTENNRLLFEMTSKYG TPYFSNYINSDMEPSDVRSMCCRLRLDLRELRRKTGGFFGSGESTGSVGVVTINMPRIAY LAKDEADFYSRLDHMMDVSARSLKTKREVITKLLNQGLYPYTKRYLGTFENHFSTIGLIG 
MNEVGLNAVWLGEDMTSTKTQEFTKNVLNHMRERLSDYQEKYGDLYNLEATPAESTTYRL AKHDRKKYPDIRTAGAEGDTPYYTNSSHLPVDYTADIFDALDIQDELQTLYTSGTVFHAF LGEKLPDWAAAARLVRTIAENYKLPYYTLSPTYSICAEHGYLIGEHSVCPICGKRAEIYS RITGYYRPVQNWNEGKTQEYKNRTTYDITGSVLKKGNKGAETVKEAVHNAEIPAGSYLFT TKTCPNCKMAKQFLKDVDYEVVDAEENAELAEQLGIMQAPTLVVIKDGQVKKIANASNIR KYAETIA >gi|229783854|gb|GG667881.1| GENE 4 4061 - 4243 229 60 aa, chain - ## HITS:1 COG:no KEGG:Closa_2505 NR:ns ## KEGG: Closa_2505 # Name: not_defined # Def: GTP-binding proten HflX # Organism: C.saccharolyticum # Pathway: not_defined # 1 59 356 414 423 96 74.0 4e-19 NLLESILRSRKVFLERVYPYQEAGKIQLIRKYGQLLKEEYRDDGIFVSAYVPAEMYTNLI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:26:08 2011 Seq name: gi|229783853|gb|GG667882.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld275, whole genome shotgun sequence Length of sequence - 4877 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 192 - 251 7.0 1 1 Op 1 . + CDS 306 - 1751 1671 ## HMPREF0868_0284 putative lipoprotein + Term 1778 - 1828 13.3 2 1 Op 2 8/0.000 + CDS 1836 - 3668 1842 ## COG1178 ABC-type Fe3+ transport system, permease component 3 1 Op 3 . 
+ CDS 3670 - 4876 1358 ## COG3839 ABC-type sugar transport systems, ATPase components Predicted protein(s) >gi|229783853|gb|GG667882.1| GENE 1 306 - 1751 1671 481 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0868_0284 NR:ns ## KEGG: HMPREF0868_0284 # Name: not_defined # Def: putative lipoprotein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 3 477 2 452 456 584 65.0 1e-165 MKKQILSAVLISTMLAATLAGCGSKAADPTTAAPAAQTEAAKGEETKAPETEAKAEEPKA DDGNLSEVEKIIKEAESMSMEELAKKAIEESNGKTFYGVGNSSRGKTALPLFIEYLQSVD PSYTMEYEWQQPKNNKIFEQLSADSLKSTGTFAMTLIQDGNQIESKMVQTGILKTFIPKE WAEANGTTADAYTGFLPLQTLNKVFMYNSTGTAEFNNCWDFVAEGVHPLYMDIDSEVVGK NFLYMLTEDKYAGWLKDAYDALDSSKQAYFKPVIDEMATDAADLGLGENGAYALAWIKLW VENYNEQTDDGPICNTLVSKTATDQCGLLVYSKLRSVEESSEVSVNNIKVAAYQDGYKGI GGYGYCHYLFLTNNSPLPWTSCAFIAYMTNTADGFSAWGKDMGGYSSNPKVMEETEKIYG HSKGGYNDKGENEFDSKNDRGYEWWTTDGELVLEDPEYCADVSFTVGSWIELLTKYSDSA Q >gi|229783853|gb|GG667882.1| GENE 2 1836 - 3668 1842 610 aa, chain + ## HITS:1 COG:SMb20155 KEGG:ns NR:ns ## COG: SMb20155 COG1178 # Protein_GI_number: 16263903 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Sinorhizobium meliloti # 18 608 29 615 616 277 31.0 4e-74 MDQTVNRKSNNKVIVLNKIKTYFSKPQNIILLLFGIVLTFTTVAPIVAIVKDTLMIHPGT IDAHLTGKASGYTIVNYVDLFTSRLARTNLWKPLWNTLLLAVLTCLISILYGGVFAFLVT RTNLKFKKYLSSIFIFPYIMPQWTLAVVWQNLFNSNAVVGTANGLLASLFGILAPKWVCQ GLFPSAVVLGLHYAPFAYILIGGIFRNMDANLEEAATILDTPKWKTMFRITLPMVKPAIL STVLLVFGSAMGSYPVPHYLGFTTLSTKYISMNSKYTGEASILAIIMMVFGVAILTMNQV SLKSRKNYTTVTGKSGQLSKINLGKMGKYVIGIVMIAATFFTSIFPILSFALETFLPNPG DYSFLYTKNLSNLTTKWWLTSENITENGMYGQKGILYNSTIWHAFWGTLLVSVCCALIAG TIGTLIGYAVSKNRRSKWANYVNGVAFLPYLMPSIAVGAAFFILFSNEKINLFNTYTLLI IAGTIKYIPFASRSALNSMLQISNEIEEAAIIQDLPWFKRMFRIIIPIQKSAIISGYMLP FMTCLRELSLFLLLCTQGFILSTTLDYFDEMGLYAFSSAINLILIVTILIFNTLVNKLTG ASLDDGIGGN >gi|229783853|gb|GG667882.1| GENE 3 3670 - 4876 1358 402 aa, chain + ## HITS:1 COG:MYPU_0990 KEGG:ns NR:ns ## COG: MYPU_0990 COG3839 # Protein_GI_number: 
15828570 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Mycoplasma pulmonis # 4 371 36 402 586 194 34.0 2e-49 MPEIKLTNVTKRWGKFYAVDNLNLEIEDNSFITLLGPSGCGKTTTLRMIAGLETPTSGRI SIGDRVVFDSEAGINVPANKRKVGFLFQNYALWPNMTVYQNISFGLSNIKEEMPQIDFEA KTANDLIQALKNGKRIAELVEECRDKKGRLDMDKVYLKLIDAYTLSIYTAKTLFDFKIHE SSDPDGAAKAKAAELQTKLDSIRASYRGKGQELNSEFAVVNGGKLLTENRKLHKEEVEQA VRRVSRIVKIGPFMNRYPAELSGGQQQRVAIARTLAPEPAVLFMDEPLSNLDAKLRLEMR YELQRLHVETGSTFVYVTHDQMEAMTLATKICLINNGVLQQYEAPLDVYNRPRNLFVADF VGNPSINFMEAKGRQRTDGSLELTVLDGEKAIFLPEKPLSMD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:26:17 2011 Seq name: gi|229783852|gb|GG667883.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld276, whole genome shotgun sequence Length of sequence - 3673 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 61 - 120 5.2 1 1 Tu 1 . + CDS 218 - 985 712 ## Closa_0879 3D domain protein + Term 1039 - 1083 1.6 - Term 1019 - 1078 8.5 2 2 Tu 1 . - CDS 1090 - 2043 991 ## EUBELI_01420 hypothetical protein - Prom 2092 - 2151 4.7 + Prom 2422 - 2481 11.8 3 3 Tu 1 . + CDS 2572 - 3300 701 ## Closa_0878 3D domain protein + Term 3328 - 3380 15.0 - Term 3316 - 3368 15.0 4 4 Tu 1 . 
- CDS 3419 - 3673 351 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases Predicted protein(s) >gi|229783852|gb|GG667883.1| GENE 1 218 - 985 712 255 aa, chain + ## HITS:1 COG:no KEGG:Closa_0879 NR:ns ## KEGG: Closa_0879 # Name: not_defined # Def: 3D domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 110 255 49 187 188 134 52.0 4e-30 MLCVTLLCTLFLSMTAAAEEKGALTAVDGTHITGWLKNDADSGEAAVVELHIYKDGNSEA AKVITVPEESYSNEAGNSTGDGRRVFDYKINWQELGGTACKVEAFLVSGDKKTLLTGAME YKPSSEVLKKTESSGEEIGPGIPKKEEIPAEQTFEGYKKGESLGIFKTTGYCSCSKCSGG SGLTYSGTVPQPNHTISADITVLPLGTKVIIGETVYTVEDIGSSVDGHTVDLYFSSHQEA LAYGVKKQEVFQAIE >gi|229783852|gb|GG667883.1| GENE 2 1090 - 2043 991 317 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01420 NR:ns ## KEGG: EUBELI_01420 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 262 10 265 321 130 30.0 8e-29 MGKKDIALARYFEDEDRYADLINGYVFSGEQVVSGSDILEKNILVTGVLGTVRKWFAVQK YRDAVKRVVMGMNFAVIALEHQDLVHYGMPVRVMLEDAAGYDEQMRRIQKRNRKARGLSR GEFLGGFRKEDRLNAIFTIVLYYGAEPWDGAGDLYSLINFEEIPEGLKNLFNNYRLHVLE VRRFKDTDLFQTDLREVFGFIGFSGDKEAERSYVFGHRESFEQLSEDAYDVITVMSGSKE LEAVKETYREKGEKINMCEAIRGMIEDGRLEGKLEAEQIVAKNMYLRGMTEEDVAGLCEE EIELVRGWFREWKKVEN >gi|229783852|gb|GG667883.1| GENE 3 2572 - 3300 701 242 aa, chain + ## HITS:1 COG:no KEGG:Closa_0878 NR:ns ## KEGG: Closa_0878 # Name: not_defined # Def: 3D domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 242 1 240 240 301 62.0 1e-80 MKRLKRNQLFFGTLLMMTFLLGTSLPAFAGQTPGVLDCVEGDSVRGWAWNSSDPASAVEV KVTVTNQATGQTEAEYTAPADDYREDLVRKGTGTGNYGFEVTVPWETYEDGSYLVEAYAS GMKLSNPKVHSVGDIQAMGSGGTTGTLRSLGVFKTTGYCPCSICSEGWGRHTSTGAIASA NHTIAVDPRVIPYGSKVMINGVIYTAEDRGGAVKGNHIDIFFNTHGETRAYGTRSAEVFL VQ >gi|229783852|gb|GG667883.1| GENE 4 3419 - 3673 351 84 aa, chain - ## HITS:1 COG:CAC1279 KEGG:ns NR:ns ## COG: CAC1279 COG0635 # Protein_GI_number: 15894561 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S 
oxidoreductases # Organism: Clostridium acetobutylicum # 9 83 300 374 374 69 52.0 2e-12 GQDFEQLSVKSKMEEFMFLGLRLTRGVSAEGFITRFGQSIRNVYGGVIDKLEREGLLEHK NGYYRLTERGLDLSNYAMSLFLLD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:26:32 2011 Seq name: gi|229783851|gb|GG667884.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld277, whole genome shotgun sequence Length of sequence - 2765 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 173 190 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 245 - 304 7.2 2 2 Op 1 38/0.000 + CDS 330 - 1226 919 ## COG1175 ABC-type sugar transport systems, permease components 3 2 Op 2 . + CDS 1223 - 2065 664 ## COG0395 ABC-type sugar transport system, permease component 4 2 Op 3 . + CDS 2078 - 2765 469 ## Thebr_0630 glycosyl hydrolase 38 domain-containing protein Predicted protein(s) >gi|229783851|gb|GG667884.1| GENE 1 3 - 173 190 56 aa, chain + ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 56 472 525 526 58 46.0 3e-09 MAWLTSVRIEKAASLMKTTDDRIYEICEKVGYNNPQYFSVIFKKYMGVSPREYMEK >gi|229783851|gb|GG667884.1| GENE 2 330 - 1226 919 298 aa, chain + ## HITS:1 COG:SMc01978 KEGG:ns NR:ns ## COG: SMc01978 COG1175 # Protein_GI_number: 15966265 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 4 268 19 281 311 167 35.0 2e-41 MRGKKRLDHYYPYLLILPTVLLIIIFIMIPILDVFKLSAQDYSTNSLSGVGTFVGFENFR KILMEDEVFRKAAGISLKWVACTVAGQLVIGMGIALLLNQKVWFRGLFRSVTFMPWAVSG VLATMLWTIMYNQNVGVINDLLIKAGILEKGVAWLSNPHTTFWAVVVTELWRGIPFFAIM 
ILAGLQSIPEEIYESCNVDGCGGFRKFLYITLPYLKESLVFSTLLRCIWEFNSIDLIFTM TNGGPLRLTTTLPVYLMQRAVVGGQYGYGSAMAVMMGAGLLVFAFLYLKITRYGREDV >gi|229783851|gb|GG667884.1| GENE 3 1223 - 2065 664 280 aa, chain + ## HITS:1 COG:mll0851 KEGG:ns NR:ns ## COG: mll0851 COG0395 # Protein_GI_number: 13470996 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 13 280 15 282 282 169 37.0 5e-42 MTNKQKRTVKRILGIYIPLCVTALFTIAPILWSLITSLKEPGSVFKLPVQYFPSPATLKN FVTVWGRNKFSLYFGNSLYIGVLSVVFIVILAILNGYALARFKFKAKGAFLSLLLLTQIM PVIIYIVPLFLTFNKFGLINTRASVILFYVVSQTPFNSLLMRGFVSDIPVEIEEAAMVDG AGRFYIITRMIPKLILPGIVATSAFAFVGCWNEFQVAFSFITQQSKFTIPVALKSLVTEA SVDMAALAASSIIALIPPIILFAFIQKHLISGMAAGAVKG >gi|229783851|gb|GG667884.1| GENE 4 2078 - 2765 469 229 aa, chain + ## HITS:1 COG:no KEGG:Thebr_0630 NR:ns ## KEGG: Thebr_0630 # Name: not_defined # Def: glycosyl hydrolase 38 domain-containing protein # Organism: T.brockii # Pathway: Other glycan degradation [PATH:tbo00511] # 7 229 3 221 1053 104 30.0 2e-21 MNQTESYEIERVSQILSVLKAAVKKNVTVPEKLRYAPEKLNFNQLPVFQEFRNGGSWGGF DQYGWFQCHFEIPEETEGCGLWIEITQNKRDWYAQNPQFLLYCNQIPMQGFDIYHEECLI TEHARGKETVCIDIDAWSGMVLKEVSWDNRENLPGILSIRFYEKDCNTEQLYYDMKSMLD TAVCCGKESDEGIILLHALHEAVNLLDLREIGSESYDNSIKEAAKYLKE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:26:38 2011 Seq name: gi|229783850|gb|GG667885.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld278, whole genome shotgun sequence Length of sequence - 5255 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 744 229 ## EUBREC_1163 UvrD/Rep helicase family protein 2 1 Op 2 . + CDS 716 - 1357 296 ## gi|288871622|ref|ZP_06118273.2| conserved hypothetical protein 3 2 Op 1 . - CDS 1346 - 1651 347 ## Closa_1065 branched-chain amino acid transport 4 2 Op 2 . 
- CDS 1644 - 2111 315 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) 5 3 Tu 1 . - CDS 3071 - 3280 200 ## Closa_1066 AzlC family protein - Prom 3455 - 3514 5.9 + Prom 3325 - 3384 7.5 6 4 Op 1 . + CDS 3449 - 3826 445 ## gi|288871624|ref|ZP_06118277.2| ty transcription activator TEC1 7 4 Op 2 . + CDS 3903 - 4403 536 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 8 4 Op 3 . + CDS 4390 - 5254 833 ## gi|266625344|ref|ZP_06118279.1| conserved hypothetical protein Predicted protein(s) >gi|229783850|gb|GG667885.1| GENE 1 1 - 744 229 247 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1163 NR:ns ## KEGG: EUBREC_1163 # Name: not_defined # Def: UvrD/Rep helicase family protein # Organism: E.rectale # Pathway: Nucleotide excision repair [PATH:ere03420]; Mismatch repair [PATH:ere03430] # 26 243 816 1030 1046 148 35.0 2e-34 VVQNSIDEKRAEKMKNTANRIEKSEFGGIPDTNSLRPLGHSRLNPEVRWVTGHRDCLELT NAFGVLRITPVSENTVRISFAGEAPDHLPDIPSEIETASRVKWNYREDKSRVEVRFGKLL LRIDKKTGGISFYTDKETLLLSESPSLPRQISSSPKNQTWTYFDWSKKEVLKARGAVDQQ WLDLKTTAKYISFGAKSKRPACILSNRGYQILIPGGRRVMCCTIPMYGLYVYTEGEVQID YFFRTAL >gi|229783850|gb|GG667885.1| GENE 2 716 - 1357 296 213 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871622|ref|ZP_06118273.2| ## NR: gi|288871622|ref|ZP_06118273.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 21 213 1 193 193 367 100.0 1e-100 MIISSERHCKASSDHHNRRPMRERPTLERMNIPIAALLKSPSVRDAVHSICRTQCVNDRF LAGAAVTFRQLSLLSSDTRIPGETLKLIFEFLAEEDRKHPVFLEERYPYLKQSSWSLNMS EISYMKVSAERQGEYLFSIRAIQKEVDPVSEQPYLMLFPEYSGERSGCSNGEGERKEESI KVSFDDEYHMQEFMKNIILNGRAELFRRPALQK >gi|229783850|gb|GG667885.1| GENE 3 1346 - 1651 347 101 aa, chain - ## HITS:1 COG:no KEGG:Closa_1065 NR:ns ## KEGG: Closa_1065 # Name: not_defined # Def: branched-chain amino acid transport # Organism: C.saccharolyticum # Pathway: not_defined # 1 101 1 101 101 108 79.0 6e-23 
MNNRIYLYILVMAGVTYLIRVLPLTLIRKEIKNTYIRSFLYYVPYVTLSVMTFPAILTAT ASVWSAVTALVIAIFLAYKGKSLFIVSLAACTAVFLTELFL >gi|229783850|gb|GG667885.1| GENE 4 1644 - 2111 315 155 aa, chain - ## HITS:1 COG:BH2910 KEGG:ns NR:ns ## COG: BH2910 COG1296 # Protein_GI_number: 15615473 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Bacillus halodurans # 1 135 93 221 237 62 29.0 3e-10 MSCSLSQKLDDKTPFFHRFLIGYGVTDEIFGVSVCRPGRLNPFFSYGLIGAAVPGWTLGT LLGAVSGDFLPPRILSALNVALYGMFLAVIIPPAKGNRILSGVILVSMVCSLIAAVTPVL NTISSGFKIIILTLLIAGAAAYLFPVKEPEEEQHE >gi|229783850|gb|GG667885.1| GENE 5 3071 - 3280 200 69 aa, chain - ## HITS:1 COG:no KEGG:Closa_1066 NR:ns ## KEGG: Closa_1066 # Name: not_defined # Def: AzlC family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 69 1 69 240 96 72.0 4e-19 MNTNAIQWKNGLKDGIPICLGYFAVSFTFGIMASGAGLSPWQAVVMSFTNLTSAGQFAAL GIIAAKAPF >gi|229783850|gb|GG667885.1| GENE 6 3449 - 3826 445 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871624|ref|ZP_06118277.2| ## NR: gi|288871624|ref|ZP_06118277.2| ty transcription activator TEC1 [Clostridium hathewayi DSM 13479] ty transcription activator TEC1 [Clostridium hathewayi DSM 13479] # 1 125 13 137 137 253 100.0 3e-66 MKKIHITLFSAIVLVIIAAALYGSLTGKNDPLFDQGTRTTFYAETAAGHEPDISAENYFS GEDYDLSKFTFSMSGCNWEEKGIYQIPVFYDGKETNCVISLEVTGPAGDVPETKEGLNED TRITN >gi|229783850|gb|GG667885.1| GENE 7 3903 - 4403 536 166 aa, chain + ## HITS:1 COG:Cgl0743 KEGG:ns NR:ns ## COG: Cgl0743 COG1595 # Protein_GI_number: 19551993 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Corynebacterium glutamicum # 17 165 29 192 206 64 31.0 7e-11 MEQDEFTAVYRQFYRPLYVYALSLSGNRADAEDLVQSTFLKALLSYEGGGSLRYWLTRVL KNEYFNLWKKRNRLVDEGRFDFSLLREEDHTLEGLIQDEERRLLLEAIMKLPVHYKEVLL DSIYFHLSDEEIGKSMGITKENVRQIRSRAKKQVMRLLEVEEHEGF >gi|229783850|gb|GG667885.1| GENE 8 4390 - 5254 833 288 aa, chain + ## HITS:1 
COG:no KEGG:no NR:gi|266625344|ref|ZP_06118279.1| ## NR: gi|266625344|ref|ZP_06118279.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 288 1 288 288 565 100.0 1e-159 MKDFDELADGLNELPDWNAMEDTGLEKQIERQINRRITRISLRTLFAVIMAAAVLLLIIN PVMKFCSFDPRPLFTSDDAAYENEFTKYMRIYYETAQPYVTVYDSSLKDLGFGRYAIDMG VMDTSGPIYVGLPPNVRLNVKWGTIGVEDPQRLTTVTASRYGYGLMSEEDKAALMEELEE LPDSSIIHVSLSAREEKPVSEVIAAPVQVNWVEIYSEDGEIRGGLTIRKSILGEGERDRV EMTDEEIKQEYLGNLEYLLSQPKLLNALGLRMGSEGRYFQAPTTVEPI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:27:18 2011 Seq name: gi|229783849|gb|GG667886.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld279, whole genome shotgun sequence Length of sequence - 4293 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 + CDS 3 - 1880 1059 ## COG0642 Signal transduction histidine kinase 2 1 Op 2 . + CDS 1840 - 2553 598 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 2569 - 2628 10.1 + Prom 2565 - 2624 4.4 3 2 Op 1 . + CDS 2655 - 2846 200 ## gi|266625347|ref|ZP_06118282.1| conserved hypothetical protein 4 2 Op 2 . + CDS 2859 - 3866 932 ## COG0784 FOG: CheY-like receiver + Term 3894 - 3934 8.4 + Prom 3967 - 4026 11.5 5 3 Tu 1 . + CDS 4071 - 4293 63 ## gi|266625349|ref|ZP_06118284.1| transcriptional regulatory protein Predicted protein(s) >gi|229783849|gb|GG667886.1| GENE 1 3 - 1880 1059 625 aa, chain + ## HITS:1 COG:all0824_2 KEGG:ns NR:ns ## COG: all0824_2 COG0642 # Protein_GI_number: 17228319 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 392 618 1 248 273 139 33.0 2e-32 IIETSANGFWDLSETDIRQKAVRLTGAVEYVADALLTPQEFSERPDIHVGQPGSEIQYAT SRMHLTVPQGSYLICGYSVDFASRMYVNGELIFEAGIPGNSRETTTPGVKFYLLPVSPDA NGELVIVQQAANFTHKDGGTYGSFFIGPPEQMTQYVFRDLWPEAVLMGGYLVLFAVHLIL YLLMRGYRPNLLFALFCLTWFIRTGVTGQRMLEVMVPGLPWTVMFRLEYLTMPISGILLV WLLYLLFPGVLPKWFPPAASFLCGGFAAVDLFGSTLLISHTMVWRVVILGVIGLFFFARL LLRWQRPDTGQLAVLLGFGFLLFAALWDMLYHRDIYLLPALRFAISEMAMAVFVLYSMTA LFLATMREVKRARESEAHMAAEKEMLAEMNRMKNQFYTDMSHEMKTPLTVISVNAQFAAQ SIRSGAIDEETVTDLTAISTEARRLAQMVTSLVGLGRMQGTDSGSRLLALDSLVAETVRI YQSMFARQGNTLTADTEPGLPFVEGSADQLVQVLINLLSNANRHTRNGSVLVRAKALENQ VLVSVTDNGDGISPELLPHVFERFQRGDSGGSGLGLTICKAIIEEHGGKIGVKSEEGKGT EIWFTLPVKEAEHEQDGNNPPGRGR >gi|229783849|gb|GG667886.1| GENE 2 1840 - 2553 598 237 aa, chain + ## HITS:1 COG:CAC0860 KEGG:ns NR:ns ## COG: CAC0860 COG0745 # Protein_GI_number: 15894147 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 7 215 7 215 231 128 35.0 9e-30 MNRTGIILLVEDDKNIQRINHRILEREGFQVLCAGTLSEARELLQTHTPDVLVLDIMLPD GSGLSFCEKIRPTTSAPVLFLTALDEKSEIIEGLVAGGNDYITKPYDVDEFVARVKAQLR FVEITRREMEQTKILKRDPLVLDTVALRAFLNGQDMLLTAREFSVLLYLLRHEGSTLPAA KIYEEIWKQPMSGNSSALWKCMSRLKNKLAVSNGKVSLMNFRNEGYLLEIVEDSPVK >gi|229783849|gb|GG667886.1| GENE 3 2655 - 2846 200 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625347|ref|ZP_06118282.1| ## NR: gi|266625347|ref|ZP_06118282.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 63 1 63 63 102 100.0 8e-21 MKKNQPVTFEVSIEEYNNHTWQGTVRMNGQTIPFQSDLDLLLLINRLVNEPCEKTFHKTS AQR >gi|229783849|gb|GG667886.1| GENE 4 2859 - 3866 932 335 aa, chain + ## HITS:1 COG:slr1305_1 KEGG:ns NR:ns ## COG: slr1305_1 COG0784 # Protein_GI_number: 16329450 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Synechocystis # 208 313 
8 116 120 86 40.0 8e-17 MELHAGSAGCPPHPHYQESAARNEQFVGECLKKIRAAPDAEEGLRRLLRFLGERLECDRV YVFEEMDRQHIRNTYEWCTPGISSGIEELPYVAKKDLLPWYGQLTAGENIIEPKVETLRQ RDPLIYEFLQPQKIHSIILSPLLSQGKMYGFLGADNPPPEKMEHISVVFDVLAYFVCSLV SQRELARLRERHVPLQKPEAAARYTGKTVLLVDDSPELLRLNERVLRPEGYSLFCAGTLG EARAVLDQTAPDAIVLDIDLPDGNGLDFCRELRAAANIPVVFLTAHSDAQTVQKGAESGS CAFLTKPYQMEELQKAVAEAVGEKTDRKRDGARIT >gi|229783849|gb|GG667886.1| GENE 5 4071 - 4293 63 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625349|ref|ZP_06118284.1| ## NR: gi|266625349|ref|ZP_06118284.1| transcriptional regulatory protein [Clostridium hathewayi DSM 13479] transcriptional regulatory protein [Clostridium hathewayi DSM 13479] # 1 74 1 74 74 122 100.0 8e-27 MSRKAYTEQERKQIKEALFVTMLQCINERGIIHSSIEFICRKVGISKSYFYSFFSSKEEL VLCALQYQQPKILY Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:27:30 2011 Seq name: gi|229783848|gb|GG667887.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld280, whole genome shotgun sequence Length of sequence - 2823 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 887 449 ## HMPREF0421_21181 hypothetical protein - Prom 1087 - 1146 10.0 2 2 Op 1 38/0.000 + CDS 1210 - 2082 954 ## COG1175 ABC-type sugar transport systems, permease components 3 2 Op 2 . 
+ CDS 2092 - 2821 775 ## COG0395 ABC-type sugar transport system, permease component Predicted protein(s) >gi|229783848|gb|GG667887.1| GENE 1 2 - 887 449 295 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0421_21181 NR:ns ## KEGG: HMPREF0421_21181 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis_ATCC14019 # Pathway: not_defined # 1 295 1 295 633 226 38.0 8e-58 MTINILYSSHSGSPVVFAAQELARCLAKIVADSIIITNNQNCRSDEVTLRLESLKNPKNS EDAYSIDVTPMGGTISGSNDRSVLLGVYQYLWLLGCRFPAPGRKHESFPSLYKKEQLSAS CQKQAALRHRGVCIEGANSLENILDYIDWLPKLGCNSFFLQFQLPYTFMARWYHHEMNPL LKPEEFTRETAAAFTTRIEEALQERGLLLHQAGHGWTGDVLGFPCADWKAASEPLPPETA PLAACINGKRELFHGVPMNTNLCYSNETVIEKFSDRVVEYCIQHPAISCVHVWLA >gi|229783848|gb|GG667887.1| GENE 2 1210 - 2082 954 290 aa, chain + ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 2 263 12 268 292 150 35.0 3e-36 MWCWIFMIPTMILYILFQGYPILSSIFYSTLDWSGMTSNALFIGLDNFKELLQDKYFFNA VANSFKYMIMAVPLQLVISLALAYIFNSILKKGATFFRTVFFLPVITTASIVGIIMMFIF GGTGSINQFLSLFGVRGLNWLGNANTALIVVVLIGVWKDSGTFMIYWLAALQSVSQDVYE AATIDGANKWQTFLHVVFPLIIPIGGVITVLCIISSLKVFYLIQTMTNGGPFYSTDVAAT FVYRTAYASSSGMPRLGYASAAAMTFGIIVVVIGTVGNVVKTAFQKKNAI >gi|229783848|gb|GG667887.1| GENE 3 2092 - 2821 775 243 aa, chain + ## HITS:1 COG:BH1119 KEGG:ns NR:ns ## COG: BH1119 COG0395 # Protein_GI_number: 15613682 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 16 234 14 234 281 140 37.0 2e-33 MERTSILSKVGKTALYAVLAVASFLWIYPFLWMISASFKTQNEFFASGLNLIPQSLNLDN FIRAWNNANFGIYFKNSIIVTVSVVVIVLLSTSLAGYVMGRYSFVGKGLIMKIFMASITI PLVFTVIPIYELLKNMGLSQSLVGLILAESGGGHVIFLMLFSSFYASIPNEMEEAATIDG ANFVQTYGSVMFPLAKPIMVTVVIMQFIWTWNSFLLPLIITLNNPSLRTLAVGLYALRGE NVV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:27:36 2011 Seq name: 
gi|229783847|gb|GG667888.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld281, whole genome shotgun sequence Length of sequence - 2862 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 15 - 74 3.7 1 1 Op 1 35/0.000 + CDS 183 - 1664 1404 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 2 1 Op 2 13/0.000 + CDS 1664 - 2302 459 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 3 1 Op 3 . + CDS 2356 - 2860 607 ## COG0547 Anthranilate phosphoribosyltransferase Predicted protein(s) >gi|229783847|gb|GG667888.1| GENE 1 183 - 1664 1404 493 aa, chain + ## HITS:1 COG:aq_582 KEGG:ns NR:ns ## COG: aq_582 COG0147 # Protein_GI_number: 15606032 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Aquifex aeolicus # 1 490 1 490 494 390 43.0 1e-108 MKLTPDCTEIEALAAKYPIIPVCREVFADIVTPITLLRKIAGLSSRYCLLESIEGGENWG RYSFLGYDPILRVTCKDRVVTIRRKPAGPAEKIIETDRPFDILRDILKEYRSPKLKGMPP FTGGFIGYFAYGMIGYAEPTLNIKKGECSDFDLMLFDTVIAYDHLKQKIDIVVNMKTGQQ AECMANYGAACARLESIAQMIANPAPMEKSVADDKPEFTCNVSKETYCRMVEKAKEYIVD GDIFQAVISRQFSSEYHGSLQNAYRVLRTTNPSPYMVYLRMDDMEIMSTSPETLVRLKDG RLTTFPVAGSRPRGAAPEEDDRLEKELLADEKELAEHNMLVDLGRNDLGKISKFSTVEVT GYQMIHKYSRIMHICSQVEGDILNGCDAFSAIEAVLPAGTLSGAPKIRACEIIEELEPVP RGIYGGALGYIDFTGNMDTCIAIRMAVKTNGKVYVQAGGGIVADSVPEKEYEESANKAKA VIHAIEAASEVND >gi|229783847|gb|GG667888.1| GENE 2 1664 - 2302 459 212 aa, chain + ## HITS:1 COG:TM0141_1 KEGG:ns NR:ns ## COG: TM0141_1 COG0512 # Protein_GI_number: 15642915 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Thermotoga maritima # 2 212 47 234 246 194 45.0 7e-50 MILLIDNYDSFSYNLVQLFGTLCLENRISLTPSRPGESAGSEAAGLEPDIRVIRNDAMTT 
EDIRNLHPDHIILSPGPGYPKDAGVCEDVVRQMKGEVPILGVCLGHQGICEVFGAEIVHA AKLMHGKQSTITIDNSDPIFQDLPERIPAARYHSLIARRSSLPDSLKVIGEDDMGEIMAV RHTDYPVYGLQFHPESILTPDGMTILRNFLNI >gi|229783847|gb|GG667888.1| GENE 3 2356 - 2860 607 168 aa, chain + ## HITS:1 COG:MJ0234 KEGG:ns NR:ns ## COG: MJ0234 COG0547 # Protein_GI_number: 15668409 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Methanococcus jannaschii # 1 168 1 168 336 175 56.0 4e-44 MIQQAIHDVIEGQDLSFEAAKEVMNEIMSGETTPAQMAAFLTGLRMKGETIDEITACATV MREKGLKLEPDFAVIDIVGTGGDEAGTFNISTTSAFVVAAGGVPVAKHGNRSVSSKSGAA DVLENLGVNLKLTVEQSEQILKDTGMCFMFAQSYHASMKYAGPVRKEL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:27:38 2011 Seq name: gi|229783846|gb|GG667889.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld282, whole genome shotgun sequence Length of sequence - 4496 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 56 - 385 311 ## gi|266625356|ref|ZP_06118291.1| conserved hypothetical protein 2 1 Op 2 . - CDS 428 - 1066 331 ## gi|288871627|ref|ZP_06118292.2| phage shock protein A - Term 1079 - 1113 1.2 3 1 Op 3 . - CDS 1148 - 2587 82 ## LM5578_1883 hypothetical protein - Term 2882 - 2916 4.0 4 2 Op 1 . - CDS 2983 - 3573 449 ## LM5578_1884 hypothetical protein 5 2 Op 2 . - CDS 3648 - 4193 442 ## LM5578_1885 hypothetical protein 6 2 Op 3 . 
- CDS 4190 - 4495 190 ## LM5578_1886 hypothetical protein Predicted protein(s) >gi|229783846|gb|GG667889.1| GENE 1 56 - 385 311 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625356|ref|ZP_06118291.1| ## NR: gi|266625356|ref|ZP_06118291.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 109 1 109 109 202 100.0 5e-51 MEDKANDYIQRLAEDVNRLVDDLGTFGCMDDKEPKKTPEDRIREVYQWLTNSKGCGEASE YFKELLEILDKSDTEYERISDICGRIHKVQKMEPWKETIPIRKNTGSYR >gi|229783846|gb|GG667889.1| GENE 2 428 - 1066 331 212 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871627|ref|ZP_06118292.2| ## NR: gi|288871627|ref|ZP_06118292.2| phage shock protein A [Clostridium hathewayi DSM 13479] phage shock protein A [Clostridium hathewayi DSM 13479] # 1 212 21 232 232 395 100.0 1e-108 MDNKIKTLDTLKNEIFLLKEKLRIAEEALSADRIPYELVKKIMEVGGYSNKINALRKAYL MGIDWDSLIGLVHDSDGSEEVRAIAGALERKLDIQKIRIVADGKHNYRQMELVFYGFYIG RSVQEMELATDNRFDEDQIEEILSSFQYGLTYDQVAVYAKEHFDCYQMRTIKEAFLYNHL SIDEAAAIAVPSNNTRRMRQEIKESIKRRGEK >gi|229783846|gb|GG667889.1| GENE 3 1148 - 2587 82 479 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1883 NR:ns ## KEGG: LM5578_1883 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 479 91 566 566 643 65.0 0 MSYTSGRGIGISGSRYQYIRPAQRLPKIISSQSLGPASIAEIKRYFTDEQTVRGIAGYVG IDFNVLINGEYRIMIEPIAYVTLQGVRVAMTATEAALYDEVAGGVVYSWMRGVVFQNLPL SMFLEKPDLGYPAWNGPITGLRSHQEIKTSLGLGIIRFNEELPDLSPEVASFDYEYHTNT EVITSIRVGGSQSDPDHPVSVTFKLKGRNYKVSNVYYPEGGSQLAWVRWTTPSTEQTMEI PVTVKGGGHTDKGTIRVKIVNLDKNPPPNPIADDRNDSYTRPEVPSKNQQTSADWSIWRP YWYSYWVWHSGNDDDDGFWCDHGWWKFNIDRYHANMNATMRVQPDGKSPTASGKTLKSGY GFNEFVSSNVSTNQSSAVTGAQNAVTYFPEFQYKTYWRLLDRMSGGLQSQFEFKQNQYST FKRRTHFTPIWMPDGRYTPYTWLIDAWTPSGMLSLNLTDNLTINGNLWTDWHISPQAPD >gi|229783846|gb|GG667889.1| GENE 4 2983 - 3573 449 196 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1884 NR:ns ## KEGG: LM5578_1884 # Name: not_defined # Def: hypothetical protein # Organism: 
L.monocytogenes_08-5578 # Pathway: not_defined # 1 196 3 188 188 100 39.0 3e-20 MSDKVKKGMIIAGCVVCCIVLVVMIASNFKGPAKTKDKLAESQSTEIEVKPDIKKSELDK KPKETDSKKEESKKEPETTEGQPESEEGKEGDQDNGQPVQNIQPRVTKPETPDQEVLKNP NQKPDGESVEGVPVPENHDEVQQPETDTVPGAAPEGESQEGKIYAPGFGWIDDIGEGEGI EDSEIYENGNKIGIMD >gi|229783846|gb|GG667889.1| GENE 5 3648 - 4193 442 181 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1885 NR:ns ## KEGG: LM5578_1885 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 181 1 181 181 266 66.0 2e-70 MKRLICCLKDKRGSSFPFVVAVTLVLMLIMCGFLEFYRLKIIANGVRDAAQEAVMITVND NYANVYHGVREGYSGGYQPNNGGFKYSVDKGDVLSEMDKILGTRVESGRHVKYTDGVLEY SISDLSVTARNAPLAPGTPENEQRFEIDAVVTLKVPVQFAGKRLPEMKIRLKVQAGYIEV F >gi|229783846|gb|GG667889.1| GENE 6 4190 - 4495 190 101 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1886 NR:ns ## KEGG: LM5578_1886 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 101 31 131 131 164 75.0 1e-39 VSVFPVFLAKNQLDTFATELCREAEISGRVGTETARREQVLRERTGLDPTVEWSQRGDIQ LNHEVTVKLTLHRDLGLFGNFGSFPITLKASATGKSEVYHK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:28:12 2011 Seq name: gi|229783845|gb|GG667890.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld283, whole genome shotgun sequence Length of sequence - 2609 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 616 618 ## DSY0163 hypothetical protein - Prom 684 - 743 9.9 - Term 705 - 768 10.5 2 2 Op 1 . - CDS 782 - 1450 661 ## COG0274 Deoxyribose-phosphate aldolase 3 2 Op 2 . - CDS 1500 - 2051 449 ## COG0778 Nitroreductase 4 2 Op 3 . 
- CDS 2059 - 2526 411 ## gi|266625365|ref|ZP_06118300.1| conserved hypothetical protein Predicted protein(s) >gi|229783845|gb|GG667890.1| GENE 1 1 - 616 618 205 aa, chain - ## HITS:1 COG:no KEGG:DSY0163 NR:ns ## KEGG: DSY0163 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 196 1 193 336 154 41.0 3e-36 MRKEASLEQWKELYEVTLNLKALEPWHYFGSEDLVAIALQGEEEPVFMSIMGMMGSCYGI SMYEGMEGFCDFDMVARAGGEDGLPVPYAMMEQSCITWYVGDREEVPEDQRKVIKKLELG FRGKGQWQYFYSFAKGYMPFTPDAREVSVLTEAFKGLFMATRAVKEKRISVDFEHGEVLW RVYNAETEEWNMFAGPLSPYERNYP >gi|229783845|gb|GG667890.1| GENE 2 782 - 1450 661 222 aa, chain - ## HITS:1 COG:SPy1867 KEGG:ns NR:ns ## COG: SPy1867 COG0274 # Protein_GI_number: 15675686 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Streptococcus pyogenes M1 GAS # 6 219 1 214 223 207 51.0 9e-54 MNNSEMNNSEIYSHVDHTLLKAYASWEEIKKLCDEAVTYRTASVCIPPSYIKRVRETYGK DFRICTVIGFPLGYHTKDIKVRETLQAIEDGADEIDTVINIGDVKNGDFDLVRSELKAIR EAVGEKILKVIIETCYLTDAEKREMCRIVTEAGADYIKTSTGFGTEGAVLKDIFLFKEEI GPGVKIKAAGGMHSRQDFVEFLNAGCDRLGTSSAVKVLEKTD >gi|229783845|gb|GG667890.1| GENE 3 1500 - 2051 449 183 aa, chain - ## HITS:1 COG:TM0383 KEGG:ns NR:ns ## COG: TM0383 COG0778 # Protein_GI_number: 15643149 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Thermotoga maritima # 3 168 2 166 166 152 47.0 3e-37 MDNICLKRYSCRCFRKERVSEEALIRIFQAAMQAPSALNEQPWEFLLIREACMKQEVAGI NPYCHPAAEADCLILLLADLNRLKKDSPWWVQDMSACAENMLLQAVELGLGAVWLGVYPR EERMKKLSRLVCLPENVIPFAVISVGYPDEERDPVLRFDGDRIYGEQYGTRLLLQKDSGE TER >gi|229783845|gb|GG667890.1| GENE 4 2059 - 2526 411 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625365|ref|ZP_06118300.1| ## NR: gi|266625365|ref|ZP_06118300.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 155 1 155 155 317 100.0 3e-85 MKTVVIEDIRPDNDYGLIVQMPEEGNVYQDDYFYWKGSLIPACFQTNRISTGTMKTWHRD 
PVFQKLEFHEDYEKFVFLTGTALFPVADTRDGAVCDESFRVLRVSPGTEVLVPPGKAHFM PVAEGDEPVRITVVCPEMPFYHVFLSEAVRAEAKE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:28:26 2011 Seq name: gi|229783844|gb|GG667891.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld284, whole genome shotgun sequence Length of sequence - 6309 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 594 485 ## COG1893 Ketopantoate reductase - Term 642 - 701 21.9 2 2 Tu 1 . - CDS 711 - 1412 505 ## COG0642 Signal transduction histidine kinase - Prom 1501 - 1560 8.4 3 3 Op 1 . - CDS 2462 - 2731 191 ## Closa_2158 integral membrane sensor signal transduction histidine kinase 4 3 Op 2 . - CDS 2725 - 3402 782 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 3620 - 3679 80.4 5 4 Op 1 . + CDS 4562 - 5653 1284 ## Closa_2156 S-layer domain protein 6 4 Op 2 . 
+ CDS 5668 - 6307 576 ## Closa_2155 secretion protein HlyD family protein Predicted protein(s) >gi|229783844|gb|GG667891.1| GENE 1 1 - 594 485 197 aa, chain + ## HITS:1 COG:CAC2937 KEGG:ns NR:ns ## COG: CAC2937 COG1893 # Protein_GI_number: 15896190 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Clostridium acetobutylicum # 1 189 114 302 307 122 31.0 3e-28 EFVPEERVIIGTTEDNGAVLAPGHVRRGGVGNTNVGMLTEDREGFLPRLKEAFDSCGFHV KIHENIQYLIWDKLFTNVSLSAVTGILQVDMGFIAGNEYAWNLTKALIHEATAVAGSLGL SFDEETVTERVRQTSIGNPTGCTSIRADLRDGRRTEVNTISGAVVTAAKRCGVPVPSHEF VVNMVHAMEMKRQDSSI >gi|229783844|gb|GG667891.1| GENE 2 711 - 1412 505 233 aa, chain - ## HITS:1 COG:BH0373 KEGG:ns NR:ns ## COG: BH0373 COG0642 # Protein_GI_number: 15612936 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 1 232 225 448 459 140 39.0 2e-33 MEQAWKQQNQFIADASHELKTPLTVILANLQILLSHQDSTIRNQLKWVDNTREEASRMKQ LVEELLFLARSDAGTVEVSHAAYTEFDYSDSVLNCALLFEAVAFENKVILNNDIVPGIHV CGSETQMKQVVTILLDNACKYAGLHGTVSVSLKSASHHAVLTVNNTGEPIPSSDQKHIFE RFYRTDKSRVRKEGGYGLGLSIARTIVEQHKGKITVSSSQEEGTTFTVVLPEL >gi|229783844|gb|GG667891.1| GENE 3 2462 - 2731 191 89 aa, chain - ## HITS:1 COG:no KEGG:Closa_2158 NR:ns ## KEGG: Closa_2158 # Name: not_defined # Def: integral membrane sensor signal transduction histidine kinase # Organism: C.saccharolyticum # Pathway: not_defined # 1 87 1 87 418 85 48.0 9e-16 MLKKLRGKFIIINMSLVITILIIVLGNFYHVNIWRLERQTEMALMQAQDSARQNGKPNKV EPGRRTDDAPPPFVPNAVIILKKDGKIAS >gi|229783844|gb|GG667891.1| GENE 4 2725 - 3402 782 225 aa, chain - ## HITS:1 COG:BH0372 KEGG:ns NR:ns ## COG: BH0372 COG0745 # Protein_GI_number: 15612935 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 3 222 4 222 222 178 44.0 7e-45 
MNILIVEDEVRLADALGQIMEEQHYHTDIVYTGTDGLYCGLSGEYDVIVLDVMLPGENGF DVVRKLRKAHVQTPVLMLTARDDISDKVTGLDRGADDYMTKPFVPEELLARVRALSRRQG EVVVEELHFGDLVLNLSTNDLLRGRKSIHLGYKEFEVLKLLMNNAGKIISKDTLISRVWG NDSDAEDNNVEAYISFLRKKLFFLNSRVEIVTVRKVGYRLEEPSC >gi|229783844|gb|GG667891.1| GENE 5 4562 - 5653 1284 363 aa, chain + ## HITS:1 COG:no KEGG:Closa_2156 NR:ns ## KEGG: Closa_2156 # Name: not_defined # Def: S-layer domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 363 1 363 363 541 77.0 1e-152 MNGRKIAASMAVVTAAMAVMPITAEASTNIDLRKKVIGISGIMSVSNMDSTITRGEFASM LVNASSYRSTVSSVSNTSVFADVPRDHAYAASIRIAAEQGWMTGYLGGNFKPDESVTLQE AIKGVLALLGYDNTDFTGDQTGARQSKYHFLELDENMNKSPEEVLIKEDCINLFYNLLKT DTKAGTMFGKSLGCELTSDGEINPLTMVDNSLKGPKIVRSKSRLSDYLPFKLSAANVYLD GSPISNSSDAIAAALENDNGVLVYYHPVSKTVWLYTVGSENENGRSAVYGEITNIIYNSA DLMTPDAIILDDGNTYELDSTEMKFAFSTYGDMRVGDTVTLVYSVTTDNNGDETRTVLDY IED >gi|229783844|gb|GG667891.1| GENE 6 5668 - 6307 576 213 aa, chain + ## HITS:1 COG:no KEGG:Closa_2155 NR:ns ## KEGG: Closa_2155 # Name: not_defined # Def: secretion protein HlyD family protein # Organism: C.saccharolyticum # Pathway: not_defined # 24 213 24 213 579 226 71.0 6e-58 MKKISFGNRSKNGKKTGKKGKKLLLLLLILLAVILGLYAAVRMKAEKAASANTKEVKTAV VEKRDITSELSSSGTISPKNTYDITSLVEGEVISADFEEGDQVEAGQILYQIDTSSMESE LTSVNNSLSRAQENYEVALDDYNTALSDYSGNTYKSTETGYIRTLYIKEGDKVSSNTKIA DIYDDKVMKIKLPFLAGEAALIGAGNDAVLTLT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:28:39 2011 Seq name: gi|229783843|gb|GG667892.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld285, whole genome shotgun sequence Length of sequence - 3864 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 1224 1200 ## COG1228 Imidazolonepropionase and related amidohydrolases 2 1 Op 2 . - CDS 1283 - 2545 1189 ## COG0477 Permeases of the major facilitator superfamily - Prom 2589 - 2648 10.6 3 2 Tu 1 . 
- CDS 2748 - 3428 745 ## ELI_4287 TetR family transcriptional regulator - Prom 3467 - 3526 8.8 + Prom 3500 - 3559 8.8 4 3 Tu 1 . + CDS 3608 - 3863 307 ## gi|266625375|ref|ZP_06118310.1| high affinity choline transporter Predicted protein(s) >gi|229783843|gb|GG667892.1| GENE 1 3 - 1224 1200 407 aa, chain - ## HITS:1 COG:CC3125 KEGG:ns NR:ns ## COG: CC3125 COG1228 # Protein_GI_number: 16127355 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Caulobacter vibrioides # 19 364 39 396 431 132 32.0 2e-30 MILKNCKLVPELTEANAPVMADVIVKGSLIEAVVPCGSADADGHEVIDLDGATLLPGLID AHVHLFNGKRSGGFTLGDRDMAPSQWAYDAYAFAKWFLDNGYTTIRDVGDESNYCGIATR NAIQEGIVVGPRVFCSGMTIVPLTAGFNTFEFMCAFYGNKEEIRSQARNQFYHGADFLKL YGTGSMLVEDSMPGRRIMLEDEISEAVAVARLRGSYCAVHCHGAEACDVMVDLGVRTIEH ASFISDETCRKLDGRKDAGLVPTISCSLPEAQGITRESGAVYDRFENISAERDACIRNAG ENYDILMGWGTDMDIMTMEKTPYIEWKARKDRLGFSNIDLLKQATINSAILMGVDDKIGS IAAGKFKPFRFQRWNRKGLNFLNGTQGDGSESRPFLIKNREQLMGLS >gi|229783843|gb|GG667892.1| GENE 2 1283 - 2545 1189 420 aa, chain - ## HITS:1 COG:ECs3631 KEGG:ns NR:ns ## COG: ECs3631 COG0477 # Protein_GI_number: 15832885 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 6 419 7 425 425 118 24.0 2e-26 MKKLGKRFWPLVFYSIAANVGNVIFFPVMFYAAFQQVFHLTNTQIGSLTAAYASLAIPAY LVSGIIADKFNSKLLMSIACISSTVVVFIMATIPPYRVLLICFFVLSITLGLLFWSATAK LRRMLGDASEQGTINGIYQCIDGILSLGLMVGLVALLGDKLATPAGMRTLFLVFGIIYAI GTVGFIITYDYKKYAATYVISTGEPVRLKNYLVGLKIPAMWITALMSFGVYITSTAFNYI NPYMVSEFAMSAAFASIFGILLRYGIKIVAAPLGGMIRDKINNTTRMIVSLGIPTIILVI VFMFIPRGAAYTVPAVIVALLLTCTYRSVNNLCEIPVAELKVPLEILGVVGGLYLFFGYC SDWFLPALIGHWMDTKGGNAYYYIFSIAVVGLFIFVLASFWLKAELKKQRAKEAADPAKS >gi|229783843|gb|GG667892.1| GENE 3 2748 - 3428 745 226 aa, chain - ## HITS:1 COG:no 
KEGG:ELI_4287 NR:ns ## KEGG: ELI_4287 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: E.limosum # Pathway: not_defined # 20 225 10 214 216 65 23.0 2e-09 MRFKKAPAKRKKCDVADSATAQAIFLAAKECFMEKGYSGVTFREIAERAGVASSLISYYY DSKENLAAEVCSRFLDEVAEELGQTDFHNLASGERFYITVYMEWLKIDEVPAYSRFYYSY YENSLGQKVLNGSNYLKMVNDIIEEYGLLVSSAENTMYMIANRGATRELMLHHHKGQYGI TREDVMDITTSNYFYNLGLDDEQIYSIVTRSKEFLTNYYGSDAYLR >gi|229783843|gb|GG667892.1| GENE 4 3608 - 3863 307 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625375|ref|ZP_06118310.1| ## NR: gi|266625375|ref|ZP_06118310.1| high affinity choline transporter [Clostridium hathewayi DSM 13479] high affinity choline transporter [Clostridium hathewayi DSM 13479] # 1 85 1 85 85 119 100.0 7e-26 MFTSGLGPVDIAVLILYIAAIISIGFIVSRRVKTVKDFTSAGQSLSPLVMIASCLATFFG AFAGSGAMEMIGQFGLTTLTILLGA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:28:52 2011 Seq name: gi|229783842|gb|GG667893.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld286, whole genome shotgun sequence Length of sequence - 4617 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 38 - 331 232 ## PROTEIN SUPPORTED gi|212638657|ref|YP_002315177.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein 2 1 Op 2 14/0.000 - CDS 375 - 1658 1347 ## COG0536 Predicted GTPase - Term 1685 - 1726 8.0 3 1 Op 3 14/0.000 - CDS 1733 - 2023 454 ## PROTEIN SUPPORTED gi|239623849|ref|ZP_04666880.1| ribosomal protein L27 4 1 Op 4 14/0.000 - CDS 2027 - 2356 163 ## PROTEIN SUPPORTED gi|116492579|ref|YP_804314.1| ribosomal protein 5 1 Op 5 . - CDS 2371 - 2679 441 ## PROTEIN SUPPORTED gi|160880683|ref|YP_001559651.1| ribosomal protein L21 - Prom 2763 - 2822 6.1 + Prom 2820 - 2879 6.1 6 2 Tu 1 . + CDS 2907 - 4082 854 ## Clole_3937 major facilitator superfamily MFS_1 7 3 Tu 1 . 
- CDS 4104 - 4616 557 ## COG0168 Trk-type K+ transport systems, membrane components Predicted protein(s) >gi|229783842|gb|GG667893.1| GENE 1 38 - 331 232 97 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212638657|ref|YP_002315177.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Anoxybacillus flavithermus WK1] # 1 97 2 96 97 94 50 2e-19 MTSKQRSYLKGLAMTTEPIFQIGKSSITPEITAAIAEALEARELIKITVLKNCLDDGNAI AATLAERTHSEVVQVIGRKIVLYKPAKEEKKRKIVLP >gi|229783842|gb|GG667893.1| GENE 2 375 - 1658 1347 427 aa, chain - ## HITS:1 COG:CAC1260 KEGG:ns NR:ns ## COG: CAC1260 COG0536 # Protein_GI_number: 15894542 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 1 427 1 424 424 428 53.0 1e-119 MFADRAKIFIKSGKGGDGHVSFRRELYVPCGGPDGGDGGEGGDIIFEVDDGLNTLSDFRQ VRKYAAQDGEQGGKKRCHGKNGSDLIVKVPEGTVIKEFESGKVIADMSGENRREVILKGG RGGQGNMHYATPTMQAPKYAQPGQSGQELWVQLELKVIADVGLVGFPNVGKSTLLSRVSN ARPKIANYHFTTLNPHLGVVDIDGGKGFVMADIPGLIEGASEGVGLGHDFLRHIERTRVL VHVVDAASTEGRDPIEDILAINKELEAYNPELMKRPQIIAANKTDVIYAGDEDPVAKLKA EFEPKGIKVYPISAVSGQGVKELLYAVYDLLQTVDSTPVIFEKEFDPSTMIDQTLPYTVE RNEDGIYVVEGPRIEKMLGYTNLESEKGFLFFQRFLKENGILEELEEAGIEEGDTVRMYG LEFDYYK >gi|229783842|gb|GG667893.1| GENE 3 1733 - 2023 454 96 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623849|ref|ZP_04666880.1| ribosomal protein L27 [Clostridiales bacterium 1_7_47_FAA] # 1 96 1 96 96 179 90 4e-45 MLKMNLQFFAHKKGVGSTKNGRDSESKRLGAKRADGQFVLAGNILYRQRGTHIHPGLNVG RGGDDTLFATVDGVVRFERKGRDKKQVSVYPRVISE >gi|229783842|gb|GG667893.1| GENE 4 2027 - 2356 163 109 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116492579|ref|YP_804314.1| ribosomal protein [Pediococcus pentosaceus ATCC 25745] # 1 102 1 105 108 67 39 2e-11 MIKVTVFEDSDRVVRGFKLTGHAGYGEEGQDIVCAAVSALVFSAYNSIETFTEDDFEGSA DERSGDFQLRFSGEISPESKLLMNSLVLGLTNIMESYGKQYIKIRFEEV >gi|229783842|gb|GG667893.1| GENE 5 2371 - 2679 441 102 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160880683|ref|YP_001559651.1| ribosomal protein L21 
[Clostridium phytofermentans ISDg] # 1 102 1 102 102 174 83 1e-43 MYAIIATGGKQYKVAEGDIIKVEKLGVDAGEAVTFDQVLVVNNGELAVGCPTVAGATVTG TVVKEGKAKKVIVYKYKRKSGYHKKNGHRQAYTQVKIDKINA >gi|229783842|gb|GG667893.1| GENE 6 2907 - 4082 854 391 aa, chain + ## HITS:1 COG:no KEGG:Clole_3937 NR:ns ## KEGG: Clole_3937 # Name: not_defined # Def: major facilitator superfamily MFS_1 # Organism: C.lentocellum # Pathway: not_defined # 5 385 3 386 392 341 50.0 4e-92 MKTDKRKQIFIRCCYGYSVSGMAVLVIGAVLPSVIAEAGISFLYAGGLLSVMAVGNLLSS FLFPAMVPVMGRRASITLMTALAPVCLLGFTLLPPLPVMYGIMLVYGLVRGSVTIINNAA VNDIYDEGAAGKLNLLHCSFAVGAFLAPFLTAVFLRLGFGWRSVVCLLIVVTATSAVSYG TMDYSLLNERAGDSAFKKNSGRRSGRSFLRSLQFYCIALILFFYLGVENCINGWFVTYLQ NTGVMSETYAATLVSVTWLVIMAGRMVCAALSGKMARSSLILLNALGSGICFFLLISTKS LPLITLALTGFGFFLAGIYPTCIADAGPLIQGSTFGMSVLTAISAMGGILTPQIVGGAAD RVGIVAAISILAVNVVLVILLSAVNRRLYHK >gi|229783842|gb|GG667893.1| GENE 7 4104 - 4616 557 170 aa, chain - ## HITS:1 COG:FN0993 KEGG:ns NR:ns ## COG: FN0993 COG0168 # Protein_GI_number: 19704328 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 1 170 312 481 483 168 51.0 4e-42 ITTTGFSTVDFDKWPEFSKGLLVVLMFIGACAGSTGGGIKVSRVIILLKTVKKELFSLLH PRSVKVLKIEGKPVEHQVIRSINTFLVAYLVIFIFSTILVSLDNFDFTTNFTAVAATFNN IGPGLGAVGPTGNFSGFSALSKWVMMFDMLAGRLEVFPMLFLFAPSTWKK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:29:00 2011 Seq name: gi|229783841|gb|GG667894.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld287, whole genome shotgun sequence Length of sequence - 3643 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 16 - 61 -0.5 1 1 Op 1 . - CDS 161 - 1153 427 ## DSY4633 hypothetical protein 2 1 Op 2 . - CDS 1199 - 2083 204 ## COG1032 Fe-S oxidoreductase - Prom 2169 - 2228 3.8 3 2 Tu 1 . 
- CDS 2448 - 3317 573 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes - Prom 3383 - 3442 3.1 4 3 Tu 1 . - CDS 3478 - 3642 82 ## gi|266625356|ref|ZP_06118291.1| conserved hypothetical protein Predicted protein(s) >gi|229783841|gb|GG667894.1| GENE 1 161 - 1153 427 330 aa, chain - ## HITS:1 COG:no KEGG:DSY4633 NR:ns ## KEGG: DSY4633 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 4 302 14 290 477 162 36.0 1e-38 MDIARSVDLVSIAERQGFTTKRVGSYYTLKEHDSVRIKNRRTWLRYSIPQSMEGHNGDTI EFLKQFCGMDFRGSVIYLLNEAGYRYEDKEFESDTYRKRHECRKAEQETEPEEKGEFVLP EANDNYRRLYAYLLKTRCLSKAVVDWFVKKKLIYETRRYHNVVFIGRDSFGTARFASMRG TVDNYGKPFKGDVDNNDKTYGFNVRNRRNKEVKVFEAAIDLMSYMDYKRDGQSNKLALGT TWDGALDRFLLENPQIKLVSLYFDNDKAGRKAAAACRKKYREMGYRVKVRFPPHGKDWNE FLVTERKKATVQYLDRSRNVVRKQAIGKIQ >gi|229783841|gb|GG667894.1| GENE 2 1199 - 2083 204 294 aa, chain - ## HITS:1 COG:Ta1390 KEGG:ns NR:ns ## COG: Ta1390 COG1032 # Protein_GI_number: 16082367 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Thermoplasma acidophilum # 57 227 70 237 425 64 27.0 3e-10 MEHVKYGGTGFYLDQAPGLPYEVEHTMPDYSLYANWLASQGEDGKRSKKFKYYREYSVGF LTRGCFRQCPFCVNRNKTASVPASPVTEFYDPKRKKICLLDDNFLACREWHTLLEQLRET GRPIQFRQGLDIRLITSQKAEELAACRLDGDLIFAFNNIKDKRVIVEKMKVINNVFQGSK RVRFYVLTGFDSKGRYDQSFWRQDLLQTFERIRILMGMRAYPYIMRYEKYRESPYRGMYV NLSRWCNQPHMFKRMSLRDFCFLAKDGSACRNYLQEFERNFPEGRSYLDMKFID >gi|229783841|gb|GG667894.1| GENE 3 2448 - 3317 573 289 aa, chain - ## HITS:1 COG:PAE1989_2 KEGG:ns NR:ns ## COG: PAE1989_2 COG0175 # Protein_GI_number: 18313015 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Pyrobaculum aerophilum # 3 264 17 251 325 91 26.0 2e-18 MDLEAKSIERIKLASEMSKRYYDAPIICTYSGGKDSDVLLDLFIRSNVEFEVHNSHTTVD 
APDTVYYIREKFRGLKEKGIPCYIHMPSLSMWQLIVKKKMPPTRMQRYCCEILKENTTPD RMVATGVRWDESVKRSKRGQLEAAAAKVENRITLMNDNDDKRLVIERCAVRGRVVSNPII DWEHRDIWDYIRSNHMSYNPLYNMGYKRVGCVGCPMAGKNRYKEFADFPKYKAAYIRAFE KMLVAIHAAGKETKWKNGEDVFRWWMQDQNIEGQLCLADFMDMGPDNNI >gi|229783841|gb|GG667894.1| GENE 4 3478 - 3642 82 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625356|ref|ZP_06118291.1| ## NR: gi|266625356|ref|ZP_06118291.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 54 56 109 109 102 100.0 1e-20 GEASEYFKELLEILDKSDTEYERISDICGRIHKVQKMEPWKETIPIRKNTGSYR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:29:11 2011 Seq name: gi|229783840|gb|GG667895.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld288, whole genome shotgun sequence Length of sequence - 4388 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 936 369 ## COG2199 FOG: GGDEF domain - Prom 958 - 1017 2.1 2 1 Op 2 . - CDS 1024 - 1251 91 ## gi|288871633|ref|ZP_06410249.1| coagulation factor X-activating enzyme heavy chain 3 1 Op 3 . - CDS 1217 - 1474 118 ## gi|288871634|ref|ZP_06410250.1| conserved hypothetical protein 4 2 Tu 1 . + CDS 2698 - 3120 151 ## Closa_3846 SARP family transcriptional regulator + Term 3248 - 3309 -0.9 5 3 Tu 1 . 
- CDS 4135 - 4323 71 ## gi|266625390|ref|ZP_06118325.1| putative membrane protein Predicted protein(s) >gi|229783840|gb|GG667895.1| GENE 1 3 - 936 369 311 aa, chain - ## HITS:1 COG:BMEII0654 KEGG:ns NR:ns ## COG: BMEII0654 COG2199 # Protein_GI_number: 17988999 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Brucella melitensis # 158 305 64 211 212 95 35.0 1e-19 MEKNISANDFCRFIWKSYLEERRYDLISDFISDKISIIGTGAHEVERNLQEFIAKIEEES QEWTGRFIIRDQWYQTTELSDSHSLVMGELAVREDADEGILYDMCFRFTIVLEKAECGWK IVHIHQSVADPNQASDEFFPHHMVEQTYTQIVYNLRHDSLTGLLNRLYFKETCERFLTAG DSGAFLMIDIDMFKNINDQYGHPAGDKTLISFSESLKSVITPNALAGRVGGDEFTLFLPG MNHKGEMEGFFTMLNTDWGERQKLLQMHEPVSISVGVTFTSKGNNSYGVLLNKADQALYL AKTNKNAGLVH >gi|229783840|gb|GG667895.1| GENE 2 1024 - 1251 91 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871633|ref|ZP_06410249.1| ## NR: gi|288871633|ref|ZP_06410249.1| coagulation factor X-activating enzyme heavy chain [Clostridium hathewayi DSM 13479] coagulation factor X-activating enzyme heavy chain [Clostridium hathewayi DSM 13479] # 7 75 1 69 69 109 98.0 7e-23 MNLEEFVQQKNDSELCEAYKELQVMKETGNRGFQIMKMLMLCKTSFSCSEADMTTVENAV CCEMARRYYDIKCNS >gi|229783840|gb|GG667895.1| GENE 3 1217 - 1474 118 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871634|ref|ZP_06410250.1| ## NR: gi|288871634|ref|ZP_06410250.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 85 4 88 88 162 100.0 9e-39 MIPVPSIIQAFDMHILISARIYEYIYDSENGHRVWYNSYAEALLDEGAACEWVCIMSGER IKQIKGEMGEKGVYYESGRIRAAEK >gi|229783840|gb|GG667895.1| GENE 4 2698 - 3120 151 140 aa, chain + ## HITS:1 COG:no KEGG:Closa_3846 NR:ns ## KEGG: Closa_3846 # Name: not_defined # Def: SARP family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 115 264 378 399 79 33.0 7e-14 MALIRSELSEQELISGAYCQDYETFKRIYRFVERRLRRSSESVYIILFTLTDKNGDLPKL LTREKQIDTLKSVIQHSLRLGDVFTQYSSCQYLIMVSDVNSQNVELIAQRISESFFSAMT EIEDKLLLHHCFPLKPAGTS 
>gi|229783840|gb|GG667895.1| GENE 5 4135 - 4323 71 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625390|ref|ZP_06118325.1| ## NR: gi|266625390|ref|ZP_06118325.1| putative membrane protein [Clostridium hathewayi DSM 13479] putative membrane protein [Clostridium hathewayi DSM 13479] # 1 62 22 83 83 89 100.0 1e-16 MVFGMENRFTQYFVGIGVAILFILAIVFAVVAIKELIHDLKIFIKSKKVVHRVRKKDLYK YK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:29:31 2011 Seq name: gi|229783839|gb|GG667896.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld289, whole genome shotgun sequence Length of sequence - 3644 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 19/0.000 - CDS 58 - 1485 1538 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 2 1 Op 2 . - CDS 1472 - 2662 1252 ## COG0772 Bacterial cell division membrane protein - Prom 2732 - 2791 1.9 - Term 2720 - 2755 -1.0 3 2 Tu 1 . 
- CDS 2808 - 3644 422 ## COG0826 Collagenase and related proteases Predicted protein(s) >gi|229783839|gb|GG667896.1| GENE 1 58 - 1485 1538 475 aa, chain - ## HITS:1 COG:CAC0506 KEGG:ns NR:ns ## COG: CAC0506 COG0768 # Protein_GI_number: 15893797 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 24 453 18 460 482 268 36.0 2e-71 MKKADPNPKANRHILMITYGMAALFAGLVLYFAFFLQIKSESVINNSYNARLDSFSDRVV RGSILSSDGRVLARTNVDENGGETRYYPYDSMFSHVVGYSTRGKTGLEALGNFYLLTSHV NLAEQVINELSSVKNLGDNVVTTLDVDLQKAAYDALGGRKGAVVAMEPDTGKILAMVSKP DYNPNTLSTDWDALVAGDNGEGQLLNRAAQGLYPPGSTFKIVTALEYIRENPENFRDYRF DCSGFFHYEDYTIKCYHETAHGSQDFTRAFANSCNGAFANLALTLDLGRLRNTAEQLLFN TPLPVAIPYSQSAYALTPGGDTWQILQTSIGQGQTQMTPIHGGMIAAAIANGGTLMKPYL IDHVENVGGDVIKKFMPQSYGNLMTAEESAVLTELMRAVVTEGTGSAVRTDAYTVAGKTG SAEFDKDKETHAWFIGFAPAEQPKLVVSVIVEEGGSGGKTAVPIARTLFDTYFAR >gi|229783839|gb|GG667896.1| GENE 2 1472 - 2662 1252 396 aa, chain - ## HITS:1 COG:CAC0505 KEGG:ns NR:ns ## COG: CAC0505 COG0772 # Protein_GI_number: 15893796 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Clostridium acetobutylicum # 27 378 41 393 400 180 34.0 5e-45 MFLIHFLAYGILWLKTEDERLIAFYAAQVIFFLCYIYVSRLFYRHVSRLLVGNMCMLLSV GFIILTRLSFDRALKQFIIVLAAAVITWIIPFIIDRVWQLSKIPWMYGVLGLILLGVVLI TGINSFGAQLSIGIGGFTFQPSEFVKISFVFFVATMFYRSTKFVTVCITTLVAALHVLIL VASRDLGSALIFFVTYVLMLFVATGKWSYLLGGAGAGAAASVLAYQLFDHVRARVLAWRN PWSDIENKGYQITQSLFAIGTGGWFGMGLCQGMPGKIPVVEKDFVFSAVSEELGGIFALC VLLICFGCFLQFMMIASRMKAVFYKLIAFGLGTVYIVQVFLTVGGVTKFIPSTGVTLPLM SYGGSSVFSTFILFGVMQGLYILKRNEEEEEKYEKG >gi|229783839|gb|GG667896.1| GENE 3 2808 - 3644 422 278 aa, chain - ## HITS:1 COG:CAC2341 KEGG:ns NR:ns ## COG: CAC2341 COG0826 # Protein_GI_number: 15895608 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Clostridium 
acetobutylicum # 142 278 644 787 787 60 28.0 4e-09 GGNGRAAAARPRIFVSLERPDCLPAVLHNPYVERIYVDAAEFGPELWKETASSCHDAGKE CLLTMPHIFRTEAERYFERSRKAFCEAGFDGVLIRSMEEIGYLKTHGPELPWVFDYGMYG MNDRAESFLMKMGAAELTWPVELNERELSRLSVPGELIVYGRLPMMVTAQCIHQGTESCD KKPGLLTLKDRMGKEFPVKNHCAFCYNTIYNAAPVSLLGLEKQIRALAPTAVRLQFTVET RKETAEVIRCFADSLLLGKQTEQPVGEFTRGHFKRGVE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:29:33 2011 Seq name: gi|229783838|gb|GG667897.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld290, whole genome shotgun sequence Length of sequence - 2819 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 716 874 ## Bcav_1543 extracellular solute-binding protein family 1 + Prom 723 - 782 2.7 2 1 Op 2 . + CDS 815 - 1813 869 ## COG1533 DNA repair photolyase + Term 1838 - 1871 4.5 + Prom 1870 - 1929 8.1 3 2 Tu 1 . + CDS 2002 - 2818 506 ## PPE_04600 hypothetical protein Predicted protein(s) >gi|229783838|gb|GG667897.1| GENE 1 3 - 716 874 237 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1543 NR:ns ## KEGG: Bcav_1543 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: B.cavernae # Pathway: not_defined # 2 230 247 477 478 65 30.0 1e-09 YEFYDTLVKEGCFHPDTVSIKAPEARALFAQNQAAFIIQGAWCISTWRKDNPDLDFGVMA LPAPDDGMKGKLPYIGAQPWMGISANCKHPDVAARYLTELYSEDYQAGLVEDGGFVSVID AANKAHMTDDVMLQYYNLNNEVAALAPDPIVGNPAAADVYAEVSAITPGLGEIAQGILAQ NIDYKTELKTLSDKTQEEWMRAIDAAKAKGADVSADDFEFKNWNPMENYTADLYESR >gi|229783838|gb|GG667897.1| GENE 2 815 - 1813 869 332 aa, chain + ## HITS:1 COG:BS_splB KEGG:ns NR:ns ## COG: BS_splB COG1533 # Protein_GI_number: 16078457 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Bacillus subtilis # 7 320 10 326 342 172 34.0 7e-43 MQHFYEVYYEPDIVNYQLGRELREQFKNLPWIPIESHNRIPEMQEKPNKEFGRMKRNLII GTRKTHKYVENHKVSDYLVPYTSSGCSAMCLYCYLVCNYNKCSYLRLFVNREQMMDRLIK 
VSMNGETGKTFEIGSNSDLVLENTITGNLEWTITQFAQRGRGHITFPTKFHMVEPLLDLD HRGKAIFRMSVNPQSMISRIELGTSSLMKRIDAVNRMCDAGYPCGLLIAPVIIMEGWKEE YRTLLQTLRDHLSPKMKREMFLEVILMTYSFVHRAINGEAFPAAPELYDKELMTGRGRGK YCYRPEVRADAEQYLREEIKKALGDVEILYIA >gi|229783838|gb|GG667897.1| GENE 3 2002 - 2818 506 272 aa, chain + ## HITS:1 COG:no KEGG:PPE_04600 NR:ns ## KEGG: PPE_04600 # Name: not_defined # Def: hypothetical protein # Organism: P.polymyxa # Pathway: not_defined # 66 235 505 668 673 78 28.0 3e-13 MKRKNPMKYVVLIGAVSAGLLSACSHGQTAETAKSDRSSAEVPQTAESFTVSASENPVSA GETESAEMESAEMESTEMKNAEMENTEMGNTLDDSRIITGQSFDVTWKNWGEVRFISYAP PADHAHDDVKFYLSKNNEIVYEFPKAWAEGSLPWTFESVDMVSFRDIDGDGYEDVITIIS YITGAGTEAMVPFPTTRIFLGDGTKFVLDAELSEAVDKAQANESISTIMEFINNRTSPDA GYSESAGGEDQMYEITGIREEEAKAFLKKFKP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:29:44 2011 Seq name: gi|229783837|gb|GG667898.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld291, whole genome shotgun sequence Length of sequence - 4612 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 113 104 ## 2 1 Op 2 35/0.000 - CDS 146 - 1891 192 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 3 1 Op 3 35/0.000 - CDS 1893 - 3221 225 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 4 1 Op 4 . - CDS 3164 - 3622 361 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 3663 - 3722 10.3 + Prom 3644 - 3703 10.8 5 2 Tu 1 . + CDS 3740 - 4610 800 ## COG0583 Transcriptional regulator Predicted protein(s) >gi|229783837|gb|GG667898.1| GENE 1 2 - 113 104 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKTGHYAAFIGLLSAVALSVTACGAPAKAPAAAGAA >gi|229783837|gb|GG667898.1| GENE 2 146 - 1891 192 581 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 343 559 279 502 563 78 25 9e-15 MNMKKHKTTIRRILTSIGRYKWGGLASLACAFVTVVLTLYVPILTGKAIDCILAPGNVDF ALMSRILLKIGVIIAVTAVAQWLMSHINNTITYRVVKDIRVRAFAKMESLPLKYIDSHSH GDIISRLIADIDQFSDGLLMGLTQLFTGVLTILGTLIFMFTINPAIAVVVILVTPVSLFV ASFIARRTYTMFRKQSQTRGELTSLVEEMLGSQKVVLAFGHEKEAQAQFEEINERLRVWG LKATFFSSITNPATRFVNSLVYACVGIAGAFAAISGYLTVGQLSSFLSYANQYTKPFNEI SGVVTELQNALASAARVFELIDEEPQIPDAPDAAVLTEVKGEVSLKNVNFSYDPEVSLIE NMNLEVKPGQRVAIVGPTGCGKSTVINLLMRFYDVDSGSIRVDGTDVRKITRKSLRTSYG MVLQETWLKSGTIRDNIAYGRPEASEEEVVRAAKEAHAHSFIKRMKDGYDTVISEDGSNL SQGQKQLLCIARVMLCLPPMLILDEATSSIDTRTEIRIQKAFSKMMEGRTSFIVAHRLST IREADVILVMRDGHIIEQGTHESLLEKNGFYAELYNSQFAV >gi|229783837|gb|GG667898.1| GENE 3 1893 - 3221 225 442 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 183 424 117 354 398 91 28 1e-18 MRSPFIVFGAMAMAFTIDVKAALIFVVVIPVLPLVVLGIMLISMPLYKKVQRQLDRVLLT ARENLSGVRVIRAFNRQEDESRRFDGENGKLVTIQVFVGKISSLLNPLTYVIINLGIIVL IQTGAREVNRGAITQGEVVALVNYMSQILVELVKLANLIITISKALACANRISAVFEQKP GITEKNRTCLEAGEDAPAVEFRDMSFTYEGAREPALSGISFSVKRGQTVGIIGGTGSGKS TLVNLIPRFYDVTEGGVYVDGHDVREYPLDQLRGKTGVVPQKAVLFHGTLRENMQWGKKH ASDGEIYEALKTAQAFEFVEEKDEGLELYIQAGGKNLSGGQRQRLTIARALVRKPEILIM DDSASALDFATDARLRKAMKERTQDMTVFIVSQRATTVRNADQIVVLDDGVMAGCGTHKE LYETCEIYREICLSQLSKEEVQ >gi|229783837|gb|GG667898.1| GENE 4 3164 - 3622 361 152 aa, chain - ## HITS:1 COG:CAC2393 KEGG:ns NR:ns ## COG: CAC2393 COG1132 # Protein_GI_number: 15895659 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 1 133 1 133 577 140 48.0 6e-34 MKKLLKYLREYRLESFLGPLFKLLEASFELMVPLVMAKVIDVGIRTQDLPYILKMGGILV LLGVVGLICSITAQYFAAKAAAGFGTALRNDLFAHIGSLSYAELDKEGTATLITRMTSDA NQVQSGVNLFLRLFCGLRLSCSEPWPWHLPLM >gi|229783837|gb|GG667898.1| GENE 5 3740 - 4610 800 290 aa, chain + ## HITS:1 COG:CAC2394 KEGG:ns NR:ns ## COG: CAC2394 COG0583 # Protein_GI_number: 15895660 # Func_class: K 
Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 4 285 1 280 286 263 46.0 2e-70 MGQMNSRHLYYFTVIAETGSFTAAAKKLGLSQPPLSKQIMMLEEDLGVRLLNRGARKAEL TEAGAYLYSRARDILSMMDTAASELQNFPSSAKGILKLGTISSSGDYLADRVLPGFCRLH PNVRFEISEGNTYELLEKLKNGIIECAVVRTPFNTEGLHCVFGTEEPLTAVGVASWFDAL PDGSIRLTDLAGKPLIYYRRFDAIISMAFQNIGAVPDIFCRNDDARTCLQWAGAGLGVAL VPESISSMKTAEGLTVRTIDSPETVTRMTAIYKKNGYVSNIAKEFKPFTV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:29:49 2011 Seq name: gi|229783836|gb|GG667899.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld292, whole genome shotgun sequence Length of sequence - 4168 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 386 195 ## Rumal_2310 hypothetical protein 2 1 Op 2 . - CDS 409 - 612 287 ## gi|266625402|ref|ZP_06118337.1| alcohol dehydrogenase 3 1 Op 3 . - CDS 627 - 1052 335 ## MGAS2096_Spy1123 hypothetical protein 4 1 Op 4 . - CDS 1080 - 1232 65 ## gi|266625404|ref|ZP_06118339.1| toxin-antitoxin system, antitoxin component, Xre family - Prom 1475 - 1534 80.4 5 2 Op 1 1/0.000 - CDS 2378 - 3337 790 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 3410 - 3469 6.9 6 2 Op 2 . 
- CDS 3478 - 4095 423 ## COG2207 AraC-type DNA-binding domain-containing proteins Predicted protein(s) >gi|229783836|gb|GG667899.1| GENE 1 2 - 386 195 128 aa, chain - ## HITS:1 COG:no KEGG:Rumal_2310 NR:ns ## KEGG: Rumal_2310 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 4 122 9 125 305 83 35.0 3e-15 MDREILGIDHGNRQMKTANTAFLSTVTQNKVKTSNLSQILEFKGKYYSIGGSREDVDTKV DKTVDDDYYILTLASLAAELKARGKNQAAVRLATGLPPRWYESQMKAFRKYLGRERELCF RYQGEEFN >gi|229783836|gb|GG667899.1| GENE 2 409 - 612 287 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625402|ref|ZP_06118337.1| ## NR: gi|266625402|ref|ZP_06118337.1| alcohol dehydrogenase [Clostridium hathewayi DSM 13479] alcohol dehydrogenase [Clostridium hathewayi DSM 13479] # 1 67 1 67 67 95 100.0 1e-18 MKMYKIDQIIELLRLGKESEDDYVGFENNEKGNLVITLKQHAPDWEDEEGFGEAVEEVEL LEHAGEY >gi|229783836|gb|GG667899.1| GENE 3 627 - 1052 335 141 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1123 NR:ns ## KEGG: MGAS2096_Spy1123 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 12 139 4 116 134 66 31.0 3e-10 MFDINKLEVMDIVFADEAHERFFREKVRELPNSRRDNEEIAMIYTLGICDKTRAAFSKIV DQKTLQVNPLVLFAGWQTSASEKVTRLAINLYTGFAQEVKRDNSEEGYYIGVDSSVYAVG EIFDCPYAKYFLEAVKLRYDI >gi|229783836|gb|GG667899.1| GENE 4 1080 - 1232 65 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625404|ref|ZP_06118339.1| ## NR: gi|266625404|ref|ZP_06118339.1| toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] # 1 50 1 50 50 80 100.0 5e-14 MTQWELADKLGISLRTYQRIEYGQQKPSYRVILILQKIFNENIESILQEL >gi|229783836|gb|GG667899.1| GENE 5 2378 - 3337 790 319 aa, chain - ## HITS:1 COG:STM2190 KEGG:ns NR:ns ## COG: STM2190 COG1879 # Protein_GI_number: 16765520 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: 
Salmonella typhimurium LT2 # 66 280 27 250 332 67 28.0 5e-11 MEKVLTTVICTALVCVQLIGCSQQTAPADTQPKATTEKAGATETTEITGDTEETGEAEGE QSRVYKIGINSYAENFESSQRYLSSFRAAAEKAGNVELVYADCNADPQKLAPNYDAFILQ NVDAIIDASWMGEVGPIAVEKCKGAGIPLVVCDSPFDEEYSYLIGTDQYQAGVIAGKYLA DYVKENWDGSIDYLVLEYFQSGGPQVKDRMQGCLDGLKENGIPVEEDQVFWFDNEAQTQK TNQITRDFLTSHPDATKIIFGTNNDPCAIGVVSAVEASNRVDNCISYSYGGEDSALDLLK KDDNCYIGSVSFQQLQYGS >gi|229783836|gb|GG667899.1| GENE 6 3478 - 4095 423 205 aa, chain - ## HITS:1 COG:SMa0319 KEGG:ns NR:ns ## COG: SMa0319 COG2207 # Protein_GI_number: 16262625 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Sinorhizobium meliloti # 51 193 197 330 333 70 29.0 2e-12 MEEFVGKILTLNEREQKMLAKIVNESNNVFEIEQERNVIYLKPKKDPGVFGGQQYVKNLL EIFLIELIRRQKFKVVASESKDRITTINVNTMYEDITRTIMKYLEEYICENLNIDDLCSK FSFSKSYIEYIFKRETGLGIMAKFSEMKIEKAKELLRNQDYNITMISEMLGYSSVHYFSR KFKTVVGMPPSEYASSIKLKAQTIK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:30:07 2011 Seq name: gi|229783835|gb|GG667900.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld293, whole genome shotgun sequence Length of sequence - 5043 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 75 - 299 281 ## gi|266625407|ref|ZP_06118342.1| putative succinate dehydrogenase hydrophobic membrane anchor subunit 2 1 Op 2 . + CDS 306 - 506 353 ## COG1476 Predicted transcriptional regulators + Term 569 - 620 2.1 3 2 Tu 1 . - CDS 586 - 894 132 ## COG1295 Predicted membrane protein - Prom 916 - 975 18.8 4 3 Tu 1 . - CDS 1877 - 2260 313 ## COG1295 Predicted membrane protein - Prom 2286 - 2345 9.4 + Prom 2730 - 2789 4.8 5 4 Op 1 . + CDS 2817 - 3716 826 ## COG1533 DNA repair photolyase 6 4 Op 2 . 
+ CDS 3745 - 5016 1425 ## COG2270 Permeases of the major facilitator superfamily Predicted protein(s) >gi|229783835|gb|GG667900.1| GENE 1 75 - 299 281 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625407|ref|ZP_06118342.1| ## NR: gi|266625407|ref|ZP_06118342.1| putative succinate dehydrogenase hydrophobic membrane anchor subunit [Clostridium hathewayi DSM 13479] putative succinate dehydrogenase hydrophobic membrane anchor subunit [Clostridium hathewayi DSM 13479] # 1 74 25 98 98 117 100.0 3e-25 MIDERYEANVNRAAAIANRVSLTLIIMAAILALSLFRIHDSYGMLKLLLAIIGFAFGLSL FLQEYLLYRFENEE >gi|229783835|gb|GG667900.1| GENE 2 306 - 506 353 66 aa, chain + ## HITS:1 COG:MA4668 KEGG:ns NR:ns ## COG: MA4668 COG1476 # Protein_GI_number: 20091114 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 8 66 16 74 79 67 54.0 6e-12 MAEFTCNLKKYRLQKEMTQEQLAQITGVRRETIMRLEAGKYNPSLKLAIDISKAVEAPIE ALFIFD >gi|229783835|gb|GG667900.1| GENE 3 586 - 894 132 102 aa, chain - ## HITS:1 COG:lin1818 KEGG:ns NR:ns ## COG: lin1818 COG1295 # Protein_GI_number: 16800885 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 3 86 190 273 289 78 45.0 3e-15 MLLLILTICFTFLYTYVPEKKQHFKKQLPGAVFSTLGWIAFSFAFSIYFNNFSNFSVMYG SLTAIVLLMLWLYVCICILFLGAEINYFYSGDWKEIKKTGGW >gi|229783835|gb|GG667900.1| GENE 4 1877 - 2260 313 127 aa, chain - ## HITS:1 COG:CAC0168 KEGG:ns NR:ns ## COG: CAC0168 COG1295 # Protein_GI_number: 15893462 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 14 127 11 123 273 64 26.0 4e-11 MIGLILRGKQIYDKFSRDEMTVYAAQASFFTIIAAFPFIMLLMAMIQLIPTITKSNLLMV ITNIVPANLKSLVFGIVENIYTNSPATVLSVTAIAAIWSASRGMLSIERGLNRVFGKRKK RNYVVTR >gi|229783835|gb|GG667900.1| GENE 5 2817 - 3716 826 299 aa, chain + ## HITS:1 COG:CAC3492 KEGG:ns NR:ns ## COG: CAC3492 COG1533 # Protein_GI_number: 15896729 # Func_class: L Replication, recombination and repair # 
Function: DNA repair photolyase # Organism: Clostridium acetobutylicum # 25 288 16 278 290 208 39.0 1e-53 MEYIPAKTIVTKTKSSEWFGIDYNMNIYKGCCHGCIYCDSRSSCYRIERFDTVRAKEDAL RIVRDDLRRKVKTGVVGTGAMSDPYNPFEKELELTRHALELVDTFGFGAGIATKSTLLKR DADILESIKEHSPVILKVTVTTADDELAGKIEPGVPSSTERFELIDYLSGRGLFTGILLM PVLPFLEDSPENIRAVVRNAHEAGASFIYAAFGVTLRDNQREWYYDKLDQLFPERKLAAE YRKRYGERYECRSPAAKRLWAIFAAECEKYGILYRMEDIIHGYKKNYESSQMSLFDFLQ >gi|229783835|gb|GG667900.1| GENE 6 3745 - 5016 1425 423 aa, chain + ## HITS:1 COG:CAC1585 KEGG:ns NR:ns ## COG: CAC1585 COG2270 # Protein_GI_number: 15894863 # Func_class: R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 9 412 4 409 425 317 41.0 4e-86 MEQEKKEKLTKMEIDWILYDVGNSAFVLLATTILPIYFNYLAGLAGISEVDYLAYWGYAA SVATLLVAFIGPVCGAIADTKGYKKPIFTICMLVGVVGCAALAIPKSWIVFLVVFVIAKV GYSASLVFYDSMLPDITTPERMDAVSSHGFAWGYIGSCVPFVFSLVFVLFYEKIGITMVT AMVIALIINALWWFLVTVPLLRHYRQVHYTNRTETERGVVTESLRRLGQTLLDLKNHKKI FVFMIAYFFYIDGVYTIIEMATAYGSSLGLDSQGLLLALLLTQIVAFPFALLFGKAAKKY RTEKLIRLCILAYLCIALYAIQLDKQYEFWVLAVCVGMFQGAIQALSRSYFARIIPPEKS GEFFGIFDICGKGASFMGTALMGLFAQITGKSNGGVAILAVMFLIGFFVFGKAVKLNEDK GDY Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:30:14 2011 Seq name: gi|229783834|gb|GG667901.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld294, whole genome shotgun sequence Length of sequence - 3902 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 + CDS 14 - 442 543 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 2 1 Op 2 2/0.000 + CDS 442 - 2316 176 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 2473 - 2532 10.3 3 2 Op 1 5/0.000 + CDS 2595 - 2921 422 ## COG1695 Predicted transcriptional regulators 4 2 Op 2 . 
+ CDS 2908 - 3498 629 ## COG4709 Predicted membrane protein 5 2 Op 3 . + CDS 3495 - 3896 455 ## gi|266625417|ref|ZP_06118352.1| hypothetical protein CLOSTHATH_06838 Predicted protein(s) >gi|229783834|gb|GG667901.1| GENE 1 14 - 442 543 142 aa, chain + ## HITS:1 COG:CAC3414 KEGG:ns NR:ns ## COG: CAC3414 COG1132 # Protein_GI_number: 15896655 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 2 133 436 567 577 155 61.0 2e-38 MEEIRAVAESAQADGFVTSFQDGYEMDLGQGGVNVSGGQKQRLCIARALLKKPKILILDD STSAVDTATEAKIRECFQTTLKETTKFIIAQRIGSVESADKIIVLDEGMIVGVGTHEELI KTCEAYQEIYYSQRDREKEASA >gi|229783834|gb|GG667901.1| GENE 2 442 - 2316 176 624 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 385 606 2 226 245 72 27 5e-13 MDNQKRVESLKRRYGRNAAPAKRRGPMGGPGHGRQHGKGGMPKNSKETIRRLLSYLNGYR LHMGLAFFCVIVNTVATLAGSYMLRPIINKYIAPMDGSRGSISGLAGSLVFMAAIYLIGV AANYAQSRVMLTVAQNALQKIRDELFTKMQKLPVRFYDTNNNGDLMSRFTNDVDTVGQML SSTLVQLFSGALSIIGTLALMIYTNIWLTLVTVFMIPLMMKAGGYVAGKSQKYFSAQQTS LGAVNGYIEETISGQKVVKVFCHEDIAEEEFEILNEDLRDNMIKAQFFGGMMGPVMSNLS QLNYTLTAAIGGLLCVLRGFDVGGLTVFLNFSRQFSRPINEISMQISNVFSALAGAERVF AVMDEEPEPVDDGDAAVLDPMKGYVELKNVTFGYNPDKTILKDISLYAKPGQKIAFVGST GAGKTTITNLLNRFYDIQSGSITIDGINIRHISRDNLRHNIAMVLQDTHLFTGTVMENIR YGRLDATDEEVIQAAKTASAYSFIMRLPHGFDTVLEGDGANLSQGQRQLLNIARAAISKA PILILDEATSSVDTRTEKHIEHGMDRLMADRTTFVIAHRLSTVRNANAIMVLENGEIIER GDHEDLLAQKGRYYELYTGLKELD >gi|229783834|gb|GG667901.1| GENE 3 2595 - 2921 422 108 aa, chain + ## HITS:1 COG:SPy2172 KEGG:ns NR:ns ## COG: SPy2172 COG1695 # Protein_GI_number: 15675909 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 108 1 108 108 101 45.0 4e-22 MVYQLTPALLDAMVLAIAEGEDAYGYMIAQRLKSIASQKESSLYPVLKRLLEAGLLETYD QEYQGRNRRYYRITEEGRRRLAFYRKEWDEFKAAADEILVGGEQDEQE >gi|229783834|gb|GG667901.1| GENE 4 2908 - 3498 629 196 
aa, chain + ## HITS:1 COG:SPy2173 KEGG:ns NR:ns ## COG: SPy2173 COG4709 # Protein_GI_number: 15675910 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 1 187 1 185 195 105 33.0 4e-23 MNRNEYMKELEQALKRLPKAEREEALSYYNEYFDDAGPEREAGVMEELGDAKGIAAQIVK EVALKRLAEPKPEKAARKGLSTLWIVLLALCAAPIGLPLLLAVVIFGLAMVIMVFSIFAA LLICGVVFVAVGIVSAVAGIYFLPAQMASGIFILGTALGESGVGLLLICAGCASCKYIYR GMAGFTKRLLTRGDKS >gi|229783834|gb|GG667901.1| GENE 5 3495 - 3896 455 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625417|ref|ZP_06118352.1| ## NR: gi|266625417|ref|ZP_06118352.1| hypothetical protein CLOSTHATH_06838 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06838 [Clostridium hathewayi DSM 13479] # 1 133 1 133 133 266 100.0 4e-70 MNKVMRGTLAAAGICIVAGLGLSLAGFVMGGEPGFWIGRSGLYTNREVRTKNAGKLVELE KTEIDPFTSMDVRADYGSITVKPSGDDSCYLEYSLYIRDKDPVYTVKNGILTFTCVPDAN SNPFGSNVGTERV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:30:24 2011 Seq name: gi|229783833|gb|GG667902.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld295, whole genome shotgun sequence Length of sequence - 5618 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 353 371 ## COG0395 ABC-type sugar transport system, permease component 2 1 Op 2 . + CDS 361 - 1239 741 ## CPF_1116 hypothetical protein 3 1 Op 3 . + CDS 1254 - 3311 2020 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases) + Prom 4213 - 4272 13.1 4 2 Tu 1 . 
+ CDS 4305 - 5180 977 ## CPF_1118 hypothetical protein + Term 5192 - 5250 20.5 Predicted protein(s) >gi|229783833|gb|GG667902.1| GENE 1 3 - 353 371 116 aa, chain + ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 3 116 180 293 293 94 38.0 5e-20 EGTVPKELEEAAEIDGCTVFRKFFRVVLPLLTPVISTVVIIVTLNVWNEFMTPLLFLQSR EKDVILQEVSRNIGQFATDWTALFPMLMLGVAPLMIFYVFMQKYIIAGVAAGAVKG >gi|229783833|gb|GG667902.1| GENE 2 361 - 1239 741 292 aa, chain + ## HITS:1 COG:no KEGG:CPF_1116 NR:ns ## KEGG: CPF_1116 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 12 292 12 289 289 297 51.0 3e-79 MEKKKSGTLHDSLMVIVPHQDDEILMCAGILWEAAKRKIPAVVVMVTNGDYGSSDLSIGR SRLKETLAGLAMLGIDAGNVEFLGYADTGMPKEESFLNGLYEETDAGRLHRSHCSEETYG LEEKDEYHKKRWGAHAPYDRAHLAADLYGVIEEHRPAMIFTTSEFDTHGDHSALCLFVRD SLVRMKAEADTECGLPRLYSGIVHSLAGDENWPVRTTEVGPYTEPEQFEETSRFRWEERI SFPVPECMRTEVREDNLKYRALSCHVTALKPDAVDFLYAFIKNEELFWEINY >gi|229783833|gb|GG667902.1| GENE 3 1254 - 3311 2020 685 aa, chain + ## HITS:1 COG:MT3509 KEGG:ns NR:ns ## COG: MT3509 COG1554 # Protein_GI_number: 15842995 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Mycobacterium tuberculosis CDC1551 # 9 682 7 648 786 375 35.0 1e-103 MEINKQPIYPVEPWTVTETEFKMETNYRNETTFSLSNGYLGTRGTLEEGYPFTADEGLEG NFINGFYESEPIRYGEWNYGFPTESQSLLNLPNLKTIRIFLEGEELDIRTGKVSGWRRTL HMDRGTVTRSFDWTSPSGRTVHVETARLVSFEVKNLMAMSCEVTPVNFDGEITFLSVLDA DVENHTRKTNPLVDYGPFGRCLITDVISADETMLFYQGTTEHSGITMACGSVMTVKAGQD GAPDRVTEEGSAPQEWWITEDGLHCCHKLVVQGRKGRKITAEKQMAYTSSLDLEKSGMKE LILSILNAAAEKGFKRTADGQEACMKAMWKQADVVIEGDEALQQGMRFNLFHILQSASRD GKNGMGAKGLSGEGYEGHYFWDTEMYVIPVLIYTNASLAKSLLGYRFRTLNEARARARVL GHEKGALYPWRTINGREASTYFPLGTAQYHINADIAYAFKLYLDVTGDMEFLKDQAAEVL 
VETARVWADVGCFAEARGGKYCICAVTGPDEYNAIVDNNFYTNLMARENIRDAGWALDTL REMDEPAYEALAAKLGLKEEERELWNRIVENMYFPYDEDRKVYPLDDGFMIRKPWDESKI PPEKRHLLYENYHPLFIFRQRMSKQADAILGMVLHSNYFTEEELRRNYDFYQEVTLHHSS LSTCIFGILACQIGYGEEAYAYFSS >gi|229783833|gb|GG667902.1| GENE 4 4305 - 5180 977 291 aa, chain + ## HITS:1 COG:no KEGG:CPF_1118 NR:ns ## KEGG: CPF_1118 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 7 286 6 285 291 382 65.0 1e-105 MSAVKPRAIFFDWDGTAVYSRKAPADAALSAMKPLLAQGIKLVIVSGTTIENIAGGQIET YFTPEELRHLYLGLGRGAYNYAFDEEGRPYVFADKLPDREGLLAVHSICFDIHKELLKKY DFPTDIVFSRPNYCKIDLMVSNQRGDQLFLQENELVILKESLRRHGIEGGLAKLLEISGM IGASYGIAVAPTCDAKYLEVGLSSKSDNVDTILDKLQREFGILPEECSYWGDEYVGIEEG IFGSDSFMITEKSRTGRFFDVSEVPGKRPQEVTVTGGGVERFLEYLREISI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:30:35 2011 Seq name: gi|229783832|gb|GG667903.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld296, whole genome shotgun sequence Length of sequence - 3278 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 376 - 435 2.9 1 1 Tu 1 . + CDS 470 - 667 103 ## gi|266625423|ref|ZP_06118358.1| conserved hypothetical protein + Prom 682 - 741 2.9 2 2 Op 1 . + CDS 810 - 1508 367 ## Rumal_0360 helix-turn-helix domain-containing protein 3 2 Op 2 . + CDS 1588 - 1743 57 ## gi|266625425|ref|ZP_06118360.1| conserved hypothetical protein + Term 1750 - 1795 4.3 - Term 1738 - 1782 4.1 4 3 Tu 1 . - CDS 1785 - 2132 318 ## COG1733 Predicted transcriptional regulators - Prom 2155 - 2214 10.4 + Prom 2112 - 2171 4.9 5 4 Tu 1 . + CDS 2295 - 2822 527 ## COG0778 Nitroreductase + Term 2850 - 2899 12.1 - Term 2838 - 2885 10.1 6 5 Tu 1 . 
- CDS 2912 - 3202 133 ## gi|266625428|ref|ZP_06118363.1| conserved hypothetical protein Predicted protein(s) >gi|229783832|gb|GG667903.1| GENE 1 470 - 667 103 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625423|ref|ZP_06118358.1| ## NR: gi|266625423|ref|ZP_06118358.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 65 1 65 65 104 100.0 2e-21 MLYAIALALAPLTVLIWNQLFLTIKWSDAAFRRLSAYVHKRRYTQDIFILIFSKRMLKFE DSVIK >gi|229783832|gb|GG667903.1| GENE 2 810 - 1508 367 232 aa, chain + ## HITS:1 COG:no KEGG:Rumal_0360 NR:ns ## KEGG: Rumal_0360 # Name: not_defined # Def: helix-turn-helix domain-containing protein # Organism: R.albus # Pathway: not_defined # 1 146 1 148 165 130 46.0 3e-29 MDTKKIGEFLKVLRKEKGLTQEQLAESLLVSGRTVSRWETGMNMPDLSVLIQMAEFYDVE VKEILDGERKSEIMDKELKETLSKVADYNKLEKEKVTKAGNVAFGLTFVVCTVVIVIQLL IAANLSVVVGETVVLLVGGVAYIGVMTYNGVWETGSRFKSTPFIDMLIAVICAGAFAIVL AICYIRLGATMPQAVHSALIFFVGIAIVGFGVLRTLAYLSHRKKSRKNQEKK >gi|229783832|gb|GG667903.1| GENE 3 1588 - 1743 57 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625425|ref|ZP_06118360.1| ## NR: gi|266625425|ref|ZP_06118360.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 51 1 51 51 85 100.0 1e-15 MEGIGNSALEKGDNVLIVSHAFAVKTMIYLFDNSHLSEVEKIRNASITTII >gi|229783832|gb|GG667903.1| GENE 4 1785 - 2132 318 115 aa, chain - ## HITS:1 COG:CAC1483 KEGG:ns NR:ns ## COG: CAC1483 COG1733 # Protein_GI_number: 15894762 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 7 105 6 104 108 157 73.0 5e-39 MKSCPSSYNCPVEATLDLIGGKYKALILWHLIDNILRFNEISKLIPQATPKMLTQQLREL EADQLITRTVYPVVPPKVEYSLSSFGKSIIPVLDSMCNWGAAYLDGLDVKPPCSK >gi|229783832|gb|GG667903.1| GENE 5 2295 - 2822 527 175 aa, chain + ## HITS:1 COG:CAC1484 KEGG:ns NR:ns ## COG: CAC1484 COG0778 # Protein_GI_number: 15894763 # Func_class: C Energy 
production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 175 1 172 172 260 68.0 9e-70 MSFLDLAKKRYSVRAYTEKKVEKEKLEAILEAAHVAPTGANNQPQHLIVVESDEGMNKVG KAANTYGAPLVIIICSDRNVTWTRPFDGKKLTDIDASIITDHMMMEATDLGLGSVWICYF KPDVLKEELAIPEGLEPVNILAIGYADTEKEPALSPDRHGRLRKPLSSTVSYEQL >gi|229783832|gb|GG667903.1| GENE 6 2912 - 3202 133 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625428|ref|ZP_06118363.1| ## NR: gi|266625428|ref|ZP_06118363.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 96 26 121 121 186 100.0 5e-46 MLSVFIPACFMGTNILLDKELDRVKTARVELLYVIFQAGILILGEVLIGNIISLALSALV YWGYVVKINRSKLNRPLYIDKSKKQAKRDSSSPDFH Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:30:57 2011 Seq name: gi|229783831|gb|GG667904.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld297, whole genome shotgun sequence Length of sequence - 3340 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 974 976 ## COG0598 Mg2+ and Co2+ transporters 2 1 Op 2 . - CDS 999 - 1919 1059 ## COG1432 Uncharacterized conserved protein - Prom 1948 - 2007 10.5 3 2 Op 1 . + CDS 2293 - 2589 349 ## Closa_1284 spore coat protein CotJB 4 2 Op 2 . 
+ CDS 2589 - 3200 630 ## COG3546 Mn-containing catalase + Term 3214 - 3254 6.2 Predicted protein(s) >gi|229783831|gb|GG667904.1| GENE 1 3 - 974 976 323 aa, chain - ## HITS:1 COG:FN0332 KEGG:ns NR:ns ## COG: FN0332 COG0598 # Protein_GI_number: 19703675 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Fusobacterium nucleatum # 144 321 171 349 351 101 31.0 2e-21 MVRYSFGTRLTRLSDEEGRNRAGTCITVMTLEEFRETKELYLHRKVMVHSMENIHYCKAE AYGGCTMGTFMIPNKKDLLGEKFSFGFYVTDTELILLDEGDMMTDILRRMEDIASKEGSG HGHTGEETGKEKETAAAPAGFLILLMDYLIRNDVLFLQKYEEKLEDMEDELLDHQPKNFY ETVIVSRKELLALHTYYEQLISLGEELESNDNHLFSEGECTGFGMFASRASRLHDHVEML REYVFQIREMYQSQIASNQNQIMSWLTVVTTIFLPLSLLVGWYGMNFINMPELHWKYGYV GMILLSIAIVAVEIWFFKKKKIL >gi|229783831|gb|GG667904.1| GENE 2 999 - 1919 1059 306 aa, chain - ## HITS:1 COG:slr1870 KEGG:ns NR:ns ## COG: slr1870 COG1432 # Protein_GI_number: 16330259 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Synechocystis # 2 234 4 243 249 171 42.0 1e-42 MNDKKFAVLIDSDNISAKYISYILDEMTKYGVITYKRIYGDWTSSQMGKWKQELLENSIT PIQQFSNTVGKNATDSALIIDAMDLLYTDNVDGFCIVSSDSDFTKLASRLRESGKEVIGM GEEKTPKSFRAACTVFTNLEILLDQDEESVGGGTVGREVIEQDIASIILENDDKNKATGL GEIGNRLVKKYSDFDVRSYGYSSLSKFLEEMDSFELKKSNNVVTVRMKDNRTRKQELDDY AIKLVKNSGKNGLELAALGNGMHRKYGDFKVKDYGYSTFSKFVQSISQLEIRENKINNKV VFIRQA >gi|229783831|gb|GG667904.1| GENE 3 2293 - 2589 349 98 aa, chain + ## HITS:1 COG:no KEGG:Closa_1284 NR:ns ## KEGG: Closa_1284 # Name: not_defined # Def: spore coat protein CotJB # Organism: C.saccharolyticum # Pathway: not_defined # 1 98 1 98 98 165 82.0 6e-40 MADRETMLKQINEISFTVNDLTLFLDTHPLDDNALKAFSDAMKQRKQLMQTYAENFEPLT VDCVCPDTNNKSETHTKYPGQKHFTWSDGPLPWEGGTV >gi|229783831|gb|GG667904.1| GENE 4 2589 - 3200 630 203 aa, chain + ## HITS:1 COG:CAC1338 KEGG:ns NR:ns ## COG: CAC1338 COG3546 # Protein_GI_number: 15894617 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: Clostridium 
acetobutylicum # 1 185 1 185 200 238 62.0 7e-63 MWNYEKRLQFPVKISTPCPKTASLIITQFGGPDGELAASMRYLSQRYSMPCKEVGGLLTD IGTEELAHLEMICAIVFQLTKNLTTEQAKTAGFDAYYVDHTTALWPTAAAGVPFNACEFQ SKGDAITDLTEDLAAEQKARTTYDNLIRIIDNPEVREPLKFLREREVVHFQRFGEALEKI KSNLDAKNFYYFNPEFDKQFVKK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:31:01 2011 Seq name: gi|229783830|gb|GG667905.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld298, whole genome shotgun sequence Length of sequence - 4639 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 16 - 261 266 ## gi|266625434|ref|ZP_06118369.1| transcriptional regulator, AraC family + Term 314 - 348 -0.9 + Prom 342 - 401 4.1 2 2 Tu 1 . + CDS 421 - 981 688 ## gi|266625435|ref|ZP_06118370.1| conserved hypothetical protein + Prom 1824 - 1883 80.4 3 3 Op 1 7/0.000 + CDS 1936 - 2424 425 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 4 3 Op 2 . + CDS 2405 - 3931 1522 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 3947 - 4006 7.2 5 4 Tu 1 . 
+ CDS 4044 - 4638 529 ## Clole_2589 extracellular solute-binding protein family 1 Predicted protein(s) >gi|229783830|gb|GG667905.1| GENE 1 16 - 261 266 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625434|ref|ZP_06118369.1| ## NR: gi|266625434|ref|ZP_06118369.1| transcriptional regulator, AraC family [Clostridium hathewayi DSM 13479] transcriptional regulator, AraC family [Clostridium hathewayi DSM 13479] # 1 81 6 86 86 160 100.0 3e-38 MGIVCAEGEGTIEKGHPSVAQYNYALEGADFELTVDGETINTKEGDWSYVPAGLDHSLVS EKGKKVFYIWFEHMVAEVKPS >gi|229783830|gb|GG667905.1| GENE 2 421 - 981 688 186 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625435|ref|ZP_06118370.1| ## NR: gi|266625435|ref|ZP_06118370.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 184 1 184 184 368 100.0 1e-100 MERYRDYPLQQKLVAALIICIVVPLTVLGLFLNSYVAGNTRKKEYQINQTLVSEVAGDLD KLLNNVQELKYDCLTDFSLQDIITGKGNVADYSSVGNWLDSIIRNEKCYHSICIADESEV MIQRGTYLVSEEEERIRQAKARVSGGFWIGPRQAKRVAGGGMQKEDMVLSYYCGINNYEQ IRKVAS >gi|229783830|gb|GG667905.1| GENE 3 1936 - 2424 425 162 aa, chain + ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 2 146 412 553 563 107 32.0 1e-23 MLNRGEECVTIASEVEFIQNYMAIMESRFGRRVKIQIQTEPGLEKVMIPKLIIQPLVENG ILHGLEPKKEGGTILITIAEREKKLVIAVEDDGVGADAAEIVKKMKDGDGGKDTFALKNI DDRIRLRYGNSYGLRFESKPGEGTRVLVVMPLEEENDEADDC >gi|229783830|gb|GG667905.1| GENE 4 2405 - 3931 1522 508 aa, chain + ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 503 2 514 530 135 23.0 1e-31 
MKLMIVDDDIQIREGIQYGIDWDTIGITGVKGYGDGIEALEDYDAFSPSIILADIRMPAM DGLEFLKRVKERSDTVKVILISAYSDFEYCQRAIQLGASGYELKPLKVGNLIQKIQEMIE LVRKEEQGKESYERYAASYREKMVEELFAGKITDRNVIIELLRMHFHLEDARNLVCMILA GDPAGGTSATPAVSAENRVHLNSLLREDQILFEYRNQYVVLSKTKDSTMYTLERQNHLKT VWYQMKQYGTGRGFSISAGISGMVPAEKISVGYETALQALGCRFLNGPGSCKILDRQGKP EETGLHLPPEYESGLKKHELEEILDAIDHFGAELSKQEITDQRRVKNLMINGIVRLAQEM GQNTDCSREWEDVWWLTDCLCLWKEECRRIVNSYEESRNGHYSANIRKALAYIQEYYAKD LLVEEVAAYIGKTPNYFSSIFRTEVGVTFREYLNHYRIERAKELIEESDMMIYEIAEQVG YSDYTYFSQVFKKMTGISPTSMRGRSKL >gi|229783830|gb|GG667905.1| GENE 5 4044 - 4638 529 198 aa, chain + ## HITS:1 COG:no KEGG:Clole_2589 NR:ns ## KEGG: Clole_2589 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: C.lentocellum # Pathway: not_defined # 1 195 1 194 450 78 28.0 1e-13 MRRQVKQMLSLALAGCMVLGMTACGSKSSTGTAAPESKQDVTTEGSSAGTAAEKPDKTAE EASLVMLIEEPNVNYYPVFLENIRAKYPEYSITSKTWDQSQVEKTVKTAFAGGEAVDIVK WFPNQMETFISSDMALDLTPYMDDEWKSIFTENALDIGTYGGKLFCLPITTVYPDIDVNA DIFEEAGVEVKEHWTWDE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:31:21 2011 Seq name: gi|229783829|gb|GG667906.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld299, whole genome shotgun sequence Length of sequence - 3016 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 347 416 ## COG0006 Xaa-Pro aminopeptidase 2 1 Op 2 44/0.000 + CDS 363 - 1331 1047 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 3 1 Op 3 1/0.000 + CDS 1324 - 2283 859 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 4 1 Op 4 . + CDS 2315 - 2689 479 ## COG0251 Putative translation initiation inhibitor, yjgF family 5 1 Op 5 . 
+ CDS 2704 - 3016 361 ## gi|288871644|ref|ZP_06118378.2| putative aminotransferase B Predicted protein(s) >gi|229783829|gb|GG667906.1| GENE 1 3 - 347 416 114 aa, chain + ## HITS:1 COG:SPy0513 KEGG:ns NR:ns ## COG: SPy0513 COG0006 # Protein_GI_number: 15674617 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Streptococcus pyogenes M1 GAS # 1 111 246 356 361 107 45.0 5e-24 IFNLVLEAGKRAVEAVRPGVHVNEVEAAHREVFIREGLDDYALRGLGHGIGLQIHECPRV QIGKDTVLKPNMIFTIEPGLYFPGVCGVRTEDDVLVTESGVENLSHTPHEIHID >gi|229783829|gb|GG667906.1| GENE 2 363 - 1331 1047 322 aa, chain + ## HITS:1 COG:CAC0179 KEGG:ns NR:ns ## COG: CAC0179 COG0444 # Protein_GI_number: 15893472 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Clostridium acetobutylicum # 2 316 4 318 325 380 60.0 1e-105 MSLLEVKELKTYFYTDAGCAKAVDGVSFSLDKGKVLGIVGESGCGKSVTSLSIMGLVDPV TGRHEGGEILFHGEDLSKLSEKERRKIRGNKISMIFQEPMTSLNPVFTVGYQIEEALMLH LGLDKNAARQRAIELLKLVEIPEAEKRVDEYPHQLSGGMRQRVMIAMALSCNPELLIADE PTTALDVTIQAQILELLNELKDKLGMAIIFITHDLGVISEMADEVVVMYAGEIVEKAETR QLFDNPKHPYTEGLMASIPDIDQEVDRLQTLEGLVPSLYDMPSGCRFGPRCKYYTEECGK NHSQLTRLSDQREVRCRRYEHE >gi|229783829|gb|GG667906.1| GENE 3 1324 - 2283 859 319 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 316 8 326 329 335 51 3e-92 MSEILIQAKNIKKYFPTSDRGKVVKAVDDISFDIYKGETLGVVGESGCGKSTTGRMVLKL LDCTEGSVLYKGKDIGKMKGKEMRSLRKEMQIIFQDPYASLDPRKTVFQILAEPFWIHHP EMRKEDIYREVEKLIECVGLRPEHILRYPHEFSGGQRQRVGIARAIALHPEFVVCDEPVS ALDVSIQAQVINLMQDIKKEYQLTYLFISHDLRIIKHFCDRVMVMYLGNIVELGTKEAIY GNQRHPYTKALLSAVSNVKGQGSKNRIILQGDIPSPVNPPSGCPFHTRCGYCMEICKEKK PELKKMADGTLVACHLDME >gi|229783829|gb|GG667906.1| GENE 4 2315 - 2689 479 124 aa, chain + ## HITS:1 COG:L52644 KEGG:ns NR:ns ## COG: L52644 COG0251 # Protein_GI_number: 15673211 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Lactococcus lactis # 1 121 1 122 126 125 54.0 2e-29 MNKIVTENAPKAIGPYSQGMVFENLVFSSGQIPVNPATGEIEGTSIETQAAQSCKNVAAV LEAAGSSMERVIKTTCFLADMADFPVFNQVYESYFVSSPARSCVAVKTLPKNVLCEIEAI AYKE >gi|229783829|gb|GG667906.1| GENE 5 2704 - 3016 361 104 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871644|ref|ZP_06118378.2| ## NR: gi|288871644|ref|ZP_06118378.2| putative aminotransferase B [Clostridium hathewayi DSM 13479] putative aminotransferase B [Clostridium hathewayi DSM 13479] # 1 104 3 106 106 207 100.0 3e-52 MALFDGNNIKLDVLKKWAYNYRWAEVPEGTIPLTAADPDYPVAPEIRKALRDYVDNGYFS YTPKMGYPEFMEAASRALWERKHEKISQDCILPIDSAARGMYII Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:31:30 2011 Seq name: gi|229783828|gb|GG667907.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld300, whole genome shotgun sequence Length of sequence - 2661 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 228 158 ## gi|266625444|ref|ZP_06118379.1| sn-glycerol-3-phosphate-binding protein 2 1 Op 2 35/0.000 - CDS 240 - 1136 614 ## COG1175 ABC-type sugar transport systems, permease components - Term 1214 - 1254 8.3 3 1 Op 3 . 
- CDS 1264 - 2568 1160 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783828|gb|GG667907.1| GENE 1 3 - 228 158 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625444|ref|ZP_06118379.1| ## NR: gi|266625444|ref|ZP_06118379.1| sn-glycerol-3-phosphate-binding protein [Clostridium hathewayi DSM 13479] sn-glycerol-3-phosphate-binding protein [Clostridium hathewayi DSM 13479] # 1 75 1 75 76 134 100.0 2e-30 MKKKKRNHIIWIVCMTVVSAIILLPIIVMILTSFKTTGEINSAVFHFLPDSLNFDNYKTA MSTGNWLIYFRNSNP >gi|229783828|gb|GG667907.1| GENE 2 240 - 1136 614 298 aa, chain - ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 18 271 15 268 292 216 47.0 6e-56 MAGKKVRSKTHSDKEAYIMILPAYLIFTIFILIPIGMVLYYSLTDFNMYQIPEFLGLKNY MKLFHDSEFLTSVGNTLLYTVVTLTIQMALGLLLAVLLYRESRAVPFFRTALYMPNIMSM VCVSMVWLWLYNPNFGILNSFLRLIGIPAQQWLINPNQAMLCIIIIGIWKNCGYSMVIFL GGLTGIPQSLYEAAELDGAGAFYKFLYITWPMLYPTTFFLLVTGIVNSFAVFEQVNIMTN GGPLNRTTTIVHQIYRRGFLEFKMGYASAMAVLLLVFSLFVTMMVFKFGNKGQDVDVS >gi|229783828|gb|GG667907.1| GENE 3 1264 - 2568 1160 434 aa, chain - ## HITS:1 COG:BH3690 KEGG:ns NR:ns ## COG: BH3690 COG1653 # Protein_GI_number: 15616252 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 9 431 10 419 420 116 26.0 1e-25 MKRNLKAVLSTVAAGCMVLSLAACGSGNEDLVSDAGKDTGDENVKLVFLRAGTEDYKKDA FTQMIEGFEAKYPNYTVEYQEAPWGDDFETKLNTGFASGTAPDVIHYSLASIGARVPMGQ YECLDPYTEGWEGMDDMYDSVKEAGSIGGKLYGVPYTPDARMFVINTELFEKAGLDPDSP PSNWDELLEAHKKLVVKDDSGNVIQCGFGVPTSGRNINQYLEIFLVQNGLGNLVDEASNE ILFNKPEAVEAMDFLRQLKEAGLVDWNNTQQDQNPFYNGTAAMTVVSENEFNNINTGDLE GKIKLVPMFSKVNSGTFCGMHFMFMNANSKHKEAAWKLIEYMSSKESMQIWMDTAKTAPV RASLEEAYLKKNPVNGPMIMEAIEIGKGSPKVPYFNSVFTYVDDAMEQVFYGQSSPQDAL NAAAEKVQEEIDNQ Prediction of potential genes in 
microbial genomes Time: Fri Jul 1 03:31:37 2011 Seq name: gi|229783827|gb|GG667908.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld301, whole genome shotgun sequence Length of sequence - 2949 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 851 806 ## Closa_2455 hypothetical protein + Term 1070 - 1119 9.5 + Prom 920 - 979 4.9 2 2 Op 1 21/0.000 + CDS 1120 - 2223 1215 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 3 2 Op 2 . + CDS 2220 - 2949 888 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs Predicted protein(s) >gi|229783827|gb|GG667908.1| GENE 1 3 - 851 806 282 aa, chain + ## HITS:1 COG:no KEGG:Closa_2455 NR:ns ## KEGG: Closa_2455 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 8 277 102 383 390 146 35.0 8e-34 TGRPAQMGQIRQPGQAPQRSGGLNGSYRSPAGTGSVKPEGTKNHTGKTVLITAVTGLAAV AVIAFVNVGGLLRDSIRGSVETTEAYSYNYESDFTELSDEEVMEAGERCLGFYHFPADGA YVSEQMLKAAESSGFGYSLDSDEFYTDNYLYEDTSYYESIRSFYLTDSGAETADGEYSYQ YVDVNYDTATGELHDYISRLNSGEASLEFLERFLTATEESSGIALEDRKTGEIMEEARDI AKQGQSSYIYEGMFAIEIYRYEGEDAMYVYVSYNDPEAGSEL >gi|229783827|gb|GG667908.1| GENE 2 1120 - 2223 1215 367 aa, chain + ## HITS:1 COG:BH3155 KEGG:ns NR:ns ## COG: BH3155 COG0330 # Protein_GI_number: 15615717 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Bacillus halodurans # 63 343 13 301 319 174 35.0 2e-43 MEKNSPVYRKETRPRFQKSLREIPETYRKKGGEIMETFDPEKKPKLKKVSGLLKKSGVAA VLVIAIPVIGLSSVYNIQEQEQAVLTTLGTAKAVAEPGLHFKIPFIQRVQKVNTTIQGVA IGYDPSDNQSEEADSLMITSDYNFVNVDFFVEYKVVDPVKAVYASQDPFTILQNISRSCI RTVIGSYDVDSVLTNGKNEIQSKVKEMIMNKLEQHDVGLSVVNVTIQDSEPPTVEVMEAF KAVETAKQGKETAINNANKYRNEKLPEATAQTDKILQEAESSKVQRVNEANAEVAKFNAM YVEYSRNPEVTRKRMFYEAMEDVLPGMKVIIDGTGKTETILPLDSFTGSGAQSTADEADY 
GTEEAAE >gi|229783827|gb|GG667908.1| GENE 3 2220 - 2949 888 243 aa, chain + ## HITS:1 COG:BH3154 KEGG:ns NR:ns ## COG: BH3154 COG0330 # Protein_GI_number: 15615716 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Bacillus halodurans # 10 241 25 262 310 162 38.0 4e-40 MKKTLKKVAGTIAGLAVVIVLLGSVVVTKENEYKLIRQFGRVERVVDTAGVTLKLPFIQT ADTLPKQILLYDLAASDVITMDKKTMLSDSYVLWRITDPLKFAQTLNSSVANAEGRIDTV VYNSVKNVISSMSQNEVISGRDGELSQAIMTNVGDSMAEYGITLLAVETKRLDLPADNKA AVYERMISERDKIAATYTAEGQAEAQKIRNTTDREIAISISDAKAQAAAITADGEAEYMR IMA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:31:43 2011 Seq name: gi|229783826|gb|GG667909.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld302, whole genome shotgun sequence Length of sequence - 3279 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 456 424 ## COG1109 Phosphomannomutase + Prom 480 - 539 5.8 2 2 Tu 1 . + CDS 565 - 978 275 ## Clole_2554 hypothetical protein + Term 1096 - 1139 8.5 3 3 Tu 1 . + CDS 1324 - 2172 802 ## COG0582 Integrase + Term 2287 - 2332 6.6 + Prom 2179 - 2238 3.5 4 4 Op 1 . + CDS 2349 - 2501 128 ## 5 4 Op 2 . 
+ CDS 2473 - 3277 389 ## COG2200 FOG: EAL domain Predicted protein(s) >gi|229783826|gb|GG667909.1| GENE 1 1 - 456 424 151 aa, chain + ## HITS:1 COG:BS_yhxB KEGG:ns NR:ns ## COG: BS_yhxB COG1109 # Protein_GI_number: 16077996 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Bacillus subtilis # 1 146 398 554 565 124 36.0 5e-29 SRDKDAVAAVMALCEAAVYYRMKGMTLWDQMQNLYRKYGYYKEELHTIVKMGADGLAEIQ AVMEDMRRRPLSSAGGFKTVTVRDYLCDHMELPKSDVLYYELEDDNWFCIRPSGTEPKIK VYTGTRGNTEREAEERLAAIWKEIEKRLASG >gi|229783826|gb|GG667909.1| GENE 2 565 - 978 275 137 aa, chain + ## HITS:1 COG:no KEGG:Clole_2554 NR:ns ## KEGG: Clole_2554 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 134 1 134 139 172 54.0 6e-42 MYERMLNKKEVPTIAVMTAYCGENGGVFTSLNEWLSRTFGTEQKIVFPYGSQYGWGIAHK LKQKLMCNIFAENNSFTVMIRLSDKQFQSVYGQLQPYAREYIDTKYPCGDGGWIHYRVTC REQFDDIQKLLAVKCSR >gi|229783826|gb|GG667909.1| GENE 3 1324 - 2172 802 282 aa, chain + ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 7 272 1 262 265 145 32.0 9e-35 MENARNIKRYEDYLKNEEKADATIRKYLHEVEQLLAYMDKAEISKAAILQYRRYLSSQYK AGTVNGKLSAIHSYLEFMNLDMCKVKFLKVQKKAYIDEERELTEQDYRRLLDSAGRMGKS QLYYLMMVLYSTGIRISELPYVTVEAVCQGKTEIYMKGKCRLIIFPKNLIKKLKEFIRNE KIKSGCIFRTRSGRDLDRSNICHSLKRLCEEARVDPSKVFPHNFRHLFAKTFYSIEKNLA HLADILGHSSIETTRIYVAASTRQYEKVMNRMRIGLDKITTE >gi|229783826|gb|GG667909.1| GENE 4 2349 - 2501 128 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNYGRKMEDMVRVLCGFSDSGYHRIVIPWYPAAHTNERRKSYESGIDHYH >gi|229783826|gb|GG667909.1| GENE 5 2473 - 3277 389 268 aa, chain + ## HITS:1 COG:PA1727_3 KEGG:ns NR:ns ## COG: PA1727_3 COG2200 # Protein_GI_number: 15596924 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Pseudomonas aeruginosa # 159 267 2 110 254 95 40.0 8e-20 
MRAVLIIIIENCKNQTISGWLQSFYEEKPVCFHGLDHMLFLIEEILDSEGVPAENGECRH LVKMDPVKTKRRTKKITHPEKVGRMVKFCPVRNRLKTGENVFTVTIYSRQNASIQGIIRN SRGSASFRSGLEVIRIMNECMFASCLGVYDKGENGREIEASEIMHDMRQALEQEQFIIYF QPKFDLATDTDYGAEALVRWNHPKAGLIPPGTFIPLFEKSGFIVRLDYYMAEHVCRIIGK WCSDGIKLNPITINLSRITLLDPKLVDI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:31:52 2011 Seq name: gi|229783825|gb|GG667910.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld303, whole genome shotgun sequence Length of sequence - 5194 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 134 - 193 4.8 1 1 Tu 1 . + CDS 431 - 2167 619 ## ML5_2095 hypothetical protein 2 2 Tu 1 . + CDS 3086 - 3472 293 ## Cphy_2148 DNA-repair protein + Prom 4374 - 4433 15.5 3 3 Tu 1 . + CDS 4508 - 4942 513 ## Closa_1454 hypothetical protein + Term 4948 - 4999 4.3 - Term 4937 - 4985 7.5 4 4 Tu 1 . - CDS 5017 - 5184 148 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes Predicted protein(s) >gi|229783825|gb|GG667910.1| GENE 1 431 - 2167 619 578 aa, chain + ## HITS:1 COG:no KEGG:ML5_2095 NR:ns ## KEGG: ML5_2095 # Name: not_defined # Def: hypothetical protein # Organism: Micromonospora_L5 # Pathway: not_defined # 94 209 1 127 189 80 37.0 2e-13 MSDEDLYTWKPGGKTLSEFLAHNLCWNLNDCKELAVKEMSVEQRNNRFYPQTGFYAAGEA HTVYAADFDHTDQLVYLASVSPAVLEPPVWEGAMTVEKFWRIIELAKEAESSETAEQVTD FLHTLPLTSLLQWKWIFDEYFVLADKNGLWAAAFLINEGCGDDSFAYFRGWLITQGRMIY HKALADPDFLANVWIDKWNAQNEKILSAARSSAVLEWYRSKEWTDWLKKQPDLSADVQES LNALEKAQKSVEDAEQDTGEPGCIEELLNRHMTAINTIGIKVIGPAFDRMMEEVSFQGEA RKLSDEIRRGIHYSPDIDRPWDENDDVTLARLLPKLWEIYGDDEKPDHDAPPELVRLIKQ GNHYLEQENHTVAFETFLQAKEKYPQFPDGHFGAAFSLQLMQQNSEAVSHYQRAFQRIEA GCRIVTGAVYDYQRRPNYYYGCCLYDAGRIDEAIEALLNGKKQKELYASDTDCLAGIFFL EKGNYQQAAAYFEESLKRFYTTDKSAFSNYSPCDSWNHWGNVCMLMEQYPIALRNHINAL DFDTELWSAWKMAGLALQMLGQESLGSEFLAVSEQQAS >gi|229783825|gb|GG667910.1| GENE 2 3086 - 3472 293 128 aa, chain 
+ ## HITS:1 COG:no KEGG:Cphy_2148 NR:ns ## KEGG: Cphy_2148 # Name: not_defined # Def: DNA-repair protein # Organism: C.phytofermentans # Pathway: not_defined # 1 128 279 406 507 199 73.0 4e-50 MLMDHAWGWEPCTMADVKAYKPETNSTGSGQVLQCPYESGKARLIVREMTDQLVLDLVDK GLVTDQLVLTVGYDIENLKKPEIRKKYKGPVTTDHYGRSVPKHAHGTINLERRTSSTKLI MAAVMTLF >gi|229783825|gb|GG667910.1| GENE 3 4508 - 4942 513 144 aa, chain + ## HITS:1 COG:no KEGG:Closa_1454 NR:ns ## KEGG: Closa_1454 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 11 142 5 136 145 143 53.0 2e-33 MQNQMNDPHRYDDIIHLPHHVSAVHPPMDVADRAAQFSPFAALTGHDAAIHEAARLTDQK MELDENRKEILNEKINVLIGKLREHPAVTITRFVPDDKKDGGYYTSINGMIRKIDEYEGM IVLMDETRIPLDDVIEIEGDVFIL >gi|229783825|gb|GG667910.1| GENE 4 5017 - 5184 148 55 aa, chain - ## HITS:1 COG:VCA0886 KEGG:ns NR:ns ## COG: VCA0886 COG0156 # Protein_GI_number: 15601641 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Vibrio cholerae # 2 52 346 396 397 72 60.0 2e-13 MMAKGVYVVAFSFPVVPKGKARIRTQVCASHTKEDIDFIIKCFREVREEMGLNEG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:32:07 2011 Seq name: gi|229783824|gb|GG667911.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld304, whole genome shotgun sequence Length of sequence - 3108 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 837 1057 ## COG0784 FOG: CheY-like receiver 2 1 Op 2 . + CDS 884 - 2257 1422 ## COG0534 Na+-driven multidrug efflux pump + Term 2447 - 2490 4.0 3 2 Tu 1 . 
- CDS 2290 - 3099 867 ## COG1284 Uncharacterized conserved protein Predicted protein(s) >gi|229783824|gb|GG667911.1| GENE 1 1 - 837 1057 278 aa, chain + ## HITS:1 COG:VC1349_4 KEGG:ns NR:ns ## COG: VC1349_4 COG0784 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Vibrio cholerae # 11 264 3 250 250 135 36.0 1e-31 KKQNFNFSHLKTLVVDDDVAVCESAVVTLHEMGIKAEWVDSGRKAVDRIRECWARGAYYD MVLIDWKMPEMDGIETARQIRRIVGPDVTIIIMTAYDWSSIEHEAKLAGVNLLMSKPMFK SSLVSAFSRALGEKEEQEQEAAGFDFDFSGKRVLLVEDNAINTEVAALLLTDQGFAVDTA ENGLRALEIFSKSEPGYYDAILMDIRMPIMDGLSATVNIRHLSNPDARDIPIIAMTANAF DDDIEKSRAAGMNAHLAKPIEPERLYQTLYDFIFGKEE >gi|229783824|gb|GG667911.1| GENE 2 884 - 2257 1422 457 aa, chain + ## HITS:1 COG:lin0989 KEGG:ns NR:ns ## COG: lin0989 COG0534 # Protein_GI_number: 16800058 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 15 439 14 432 453 177 28.0 4e-44 MIDNQKTALFEQMPVPKAVMKLAVPTILSSLVMVIYNLADTYFVGMMNNPVQNAAVTLAA PVLLAFNAVNNLFGVGSSSMMSRALGRKDYDTVRQSSAFGFYCALICGFLLSFACTIFQQ PLLTLLGADVNTAQATEEYLKWTVSFGAAPAILNVVMAYLVRSEGAALHASIGTMSGCLL NIILDPIFILPWGFNMGAAGAGLATFLSNCVACIYFFILLYVRRSNTFICILPKNFRMKK AIILGVCGVGIPAAIQNLFNVTGMTILNNFTSSYGADAVAAMGITQKINMVPMQIALGFS QGIMPLVSYCYASGNRKRMKEAVLFTVKTLLPFLFVVSLCYYIGAGSLTRAFMNNEAIIA YGTRFLRGFCLALPFLCMDFLAVGVFQAVGMGKAALTFAVLRKIILEIPALYALNYLFPL YGLAYAQFTAEFVLGITAVWMLARIFRKMREGEAAAG >gi|229783824|gb|GG667911.1| GENE 3 2290 - 3099 867 269 aa, chain - ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 3 264 12 273 283 163 35.0 4e-40 MITAATIIVAAAVFFFLIPSHVSVGSISGLAIVLGNFIPLNISAITLVLNILLLILGFLF IGREFGAKTVYTSLLLPVILGVFELIFPNNQSITNDAFLDMLCYIFVVSIGLAMLFNRNA SSGGLDIAAKFLNKYFHMDLGKAMSLAGMCVALSSALVYDKKIVVLSVLGTYLNGIVLDH FIFGFNLKKRVCIISKHEEEIRDFILNQLHSGATIYEAIGAYDHQPKKEIITIVDKNEYA 
LLMSYIMKTDKNAFVTVYTVNEIIYRPKR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:32:08 2011 Seq name: gi|229783823|gb|GG667912.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld305, whole genome shotgun sequence Length of sequence - 2064 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 + CDS 142 - 1812 2106 ## COG0481 Membrane GTPase LepA 2 1 Op 2 . + CDS 1802 - 2063 180 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases Predicted protein(s) >gi|229783823|gb|GG667912.1| GENE 1 142 - 1812 2106 556 aa, chain + ## HITS:1 COG:CAC1278 KEGG:ns NR:ns ## COG: CAC1278 COG0481 # Protein_GI_number: 15894560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Clostridium acetobutylicum # 1 554 49 602 602 778 68.0 0 MDLERERGITIKAQTVRTVYKAEDGEEYIFNMIDTPGHVDFNYEVSRSLAACDGAVLVVD SSQGIEAQTLANVYLALDHDLDVFPVLNKIDLPSAEPERVVGEIEDVIGIEAHDAPRISA KTGENIEAVLEAIVKKIPAPKGDADAPLQALIFDSLYDSYKGVIVFCRIMEGRVKKGVRI RMMATGAVEEVVEVGYFGAGQFIPCDELTAGMVGYVTASIKNVKDTAVGDTITDADNPCD KPLPGYKKVTSMVYCGMYPADGARYNDLRDALEKLQLNDASLFFEPETSLALGFGFRCGF LGLLHLEVIQERLEREYNLDLVTTAPSVIYRVYKKNGEMIQLTNPSNLPDPSEIEYMEEP IVSAEIMVTTEFVGAIMKLCQERRGVYLGMEYMEATRALLKYELPLNEIIYDFFDALKSR SRGYASFDYDLKGYERSELVKLDILVNKEEVDALSFIVHAESAYDRGRRMCEKLKEEIPR QLFEIPIQAAVGGKIIARETVKAMRKDVLAKCYGGDITRKKKLLEKQKEGKKRMRQIGNV EIPQKAFMSVLKLDDE >gi|229783823|gb|GG667912.1| GENE 2 1802 - 2063 180 87 aa, chain + ## HITS:1 COG:CT052 KEGG:ns NR:ns ## COG: CT052 COG0635 # Protein_GI_number: 15604771 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Chlamydia trachomatis # 1 87 1 88 378 72 39.0 1e-13 MMNKKKLEIYVHIPFCAKKCAYCDFLSFPGNMRMRREYTDKLLEEIRIQSSFVREYQIDT IFLGGGTPSVLDAADITAIMGALKEHY Prediction of potential genes in microbial genomes 
Time: Fri Jul 1 03:32:09 2011 Seq name: gi|229783822|gb|GG667913.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld306, whole genome shotgun sequence Length of sequence - 2259 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 362 218 ## CA_C3386 hypothetical protein 2 1 Op 2 . - CDS 352 - 1704 1143 ## CA_C3385 hypothetical protein - Prom 1869 - 1928 7.5 + Prom 1902 - 1961 4.5 3 2 Tu 1 . + CDS 2018 - 2258 300 ## COG1690 Uncharacterized conserved protein Predicted protein(s) >gi|229783822|gb|GG667913.1| GENE 1 2 - 362 218 120 aa, chain - ## HITS:1 COG:no KEGG:CA_C3386 NR:ns ## KEGG: CA_C3386 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 4 120 2 112 326 85 36.0 5e-16 MEPELFSEIYNCYFLVVRRILDEAAEHGLSECDLNRIADTYGYEESALSIVPKLVSGEWN LLERSGEGNPGRPLFRSRVKAPAPLPLTKLQRSWLKAISADPRFRLFFTDEECRELDQDL >gi|229783822|gb|GG667913.1| GENE 2 352 - 1704 1143 450 aa, chain - ## HITS:1 COG:no KEGG:CA_C3385 NR:ns ## KEGG: CA_C3385 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 1 431 1 407 413 353 46.0 6e-96 MKEFSELVKRFDRIRTYVRDFYVYGFKTREDYSEGSGRTYDNERRRIESWFSPYIHSEYN EERKKTVAIIVHSNRIAVNPLFSVWKSKSFTSNDIMLHFFILDLFGNGSFYTADETADEF QNRYGILFDSQTVRKKLVEYESLGILSVRKEGRRLLYGRLENCCGDKENASSNLHWTGPA EELTRSPGFIDMVSFYQGCAPFGFIGSTILDNLQKENTVFRLKHDFLVHTLEDEILLELL RAMREHKEIRLENKSAKTGRTELTEGIPLKILVSTQTGRRYVVVYQTRTKRFSSFRLDYL LSVNLLASEEKSLEAQYEQHKNDLEKNLPKTWGVSFGDGRRKERLEMLSFTLSLNEEREA HIISRLEREGRGGTITRLDHGVWHYTRECFDCNELMPWVKTFTGRILSFSCSNSMIEKKF FRDMERMAKMYAESPPHIHSADREVNNDGA >gi|229783822|gb|GG667913.1| GENE 3 2018 - 2258 300 80 aa, chain + ## HITS:1 COG:STM3519 KEGG:ns NR:ns ## COG: STM3519 COG1690 # Protein_GI_number: 16766807 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 10 80 
11 80 405 73 50.0 1e-13 MFTWKEEQMNKPVKIWLENIEEAEEGCLEQARHLAQLPFIHKWACLMPDTHQGKGMPIGG VIAADGVIIPNAVGVDIGCG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:32:19 2011 Seq name: gi|229783821|gb|GG667914.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld307, whole genome shotgun sequence Length of sequence - 3884 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 26/0.000 - CDS 1 - 358 464 ## COG1079 Uncharacterized ABC-type transport system, permease component 2 1 Op 2 24/0.000 - CDS 360 - 1433 1264 ## COG4603 ABC-type uncharacterized transport system, permease component 3 1 Op 3 . - CDS 1417 - 1626 192 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 4 2 Op 1 . - CDS 2546 - 2983 474 ## gi|266625472|ref|ZP_06118407.1| membrane protein, Bmp family 5 2 Op 2 . - CDS 3020 - 3757 640 ## gi|288871651|ref|ZP_06118408.2| putative cyclic nucleotide-binding domain protein - Prom 3788 - 3847 4.3 Predicted protein(s) >gi|229783821|gb|GG667914.1| GENE 1 1 - 358 464 119 aa, chain - ## HITS:1 COG:TM0105 KEGG:ns NR:ns ## COG: TM0105 COG1079 # Protein_GI_number: 15642880 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Thermotoga maritima # 12 119 20 127 319 89 48.0 1e-18 MLNFLIFMLASTLRMGTPIALTALGGLTSERSGVVNIGLEGIMLASAFGAVLGSYLTGNP WIGVLTAMFVGVLISAIHSVISITWGGNQSVSSMALVLLATGFSGVGLKAVFGQQGSSP >gi|229783821|gb|GG667914.1| GENE 2 360 - 1433 1264 357 aa, chain - ## HITS:1 COG:TM0104 KEGG:ns NR:ns ## COG: TM0104 COG4603 # Protein_GI_number: 15642879 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Thermotoga maritima # 21 351 7 338 344 202 39.0 9e-52 MGQQVKTGIKMPSKFREVGLNFLISLIALVLAIITGGIIIWLFGNNPFEAYKALIQGAFG 
TPMALTQSLTKSVPLMLTGLAVALAYKCTVFNIGAEGQLLVGAIAAGIAGTLFPMPGILN IPFVLIVSMLAGMVWAFFPALLKQKANVNVVISTIMFNYVGQYLVQYLILGPFKAEGAAS ATRPIQATAMLPKLLPSPYVLNLGIVIALTAAVVVYFLLNRTSLGYEMCAVGLNKNASLT NGINVERNMFLALLMSGALAGLAGGIEVTGTMGKVINGFSANYGFSGIPVALMARNNPFA IILTALLMGSMRSGSLMMQSSVGVSKNMVDIIQGLVIVFLCAENVIRYYIKKGRGGK >gi|229783821|gb|GG667914.1| GENE 3 1417 - 1626 192 69 aa, chain - ## HITS:1 COG:BS_yufO KEGG:ns NR:ns ## COG: BS_yufO COG3845 # Protein_GI_number: 16080207 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Bacillus subtilis # 1 66 445 510 510 72 57.0 2e-13 MEQRDKGKGVLLFSLELDEILSVSDRIAVIFKGEIIDVVDAQSVTRQELGMMMLGSVRGK GGKEVGTAG >gi|229783821|gb|GG667914.1| GENE 4 2546 - 2983 474 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625472|ref|ZP_06118407.1| ## NR: gi|266625472|ref|ZP_06118407.1| membrane protein, Bmp family [Clostridium hathewayi DSM 13479] membrane protein, Bmp family [Clostridium hathewayi DSM 13479] # 1 142 1 142 142 246 100.0 6e-64 MKHYKRLLGVLLAVGMGLSLLTGCQSSQPAGGSTAADKAETDKAGEDASGDGAQAKKSDL VVGMIANAFGTQSYNDDVLAGMELAEKELGVKGIPLEVPEISDSANSLRTLISQGANFIM VPSSEYKDGMLEVAAENPEVKFLAS >gi|229783821|gb|GG667914.1| GENE 5 3020 - 3757 640 245 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871651|ref|ZP_06118408.2| ## NR: gi|288871651|ref|ZP_06118408.2| putative cyclic nucleotide-binding domain protein [Clostridium hathewayi DSM 13479] putative cyclic nucleotide-binding domain protein [Clostridium hathewayi DSM 13479] # 1 245 13 257 257 467 100.0 1e-130 MKTEKTSGQLRELFRVAYVENQDQTIKMHDLDRFLNPCITMRMVKKGEDIVRPTDKLQSI ILVVKGGAHFTRVSTDGNTNMLAAVNAPFFIGITQLVSNDKEYYSQIIAAKNCLLLYIDC SYFLKGIREDGEAALIVIRDLAKVVERNHTRMDRMVFLKASNNLMAYIESKWNESEICRR DRQLTIRERHGVIAADLSVSVRTLYRSINSLKDEGLLTVQKGGALMVNEAQMEEMKRRLE KLGKM Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:32:40 2011 Seq name: gi|229783820|gb|GG667915.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld308, whole 
genome shotgun sequence Length of sequence - 3122 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 416 334 ## Closa_2490 A/G-specific adenine glycosylase + Prom 445 - 504 5.5 2 2 Tu 1 . + CDS 527 - 1420 1032 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase + Term 1466 - 1514 1.0 + Prom 1422 - 1481 6.1 3 3 Tu 1 . + CDS 1527 - 2672 1228 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 2684 - 2724 5.4 4 4 Tu 1 . + CDS 2841 - 3120 377 ## COG1636 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|229783820|gb|GG667915.1| GENE 1 3 - 416 334 137 aa, chain + ## HITS:1 COG:no KEGG:Closa_2490 NR:ns ## KEGG: Closa_2490 # Name: not_defined # Def: A/G-specific adenine glycosylase # Organism: C.saccharolyticum # Pathway: Base excision repair [PATH:csh03410] # 1 134 228 361 365 157 61.0 2e-37 ASVCVARRDGLTKEIPVKTPKKARRIEKKTILVLDWKGRTAIRKREDSGLLASLYEFPNE DGWLEEEALAETYRVPLTAVRCLPEAKHIFSHVEWHMRGFAVELQEKPAGDYLWVTPEEI RETYSLPGAFKVYTMST >gi|229783820|gb|GG667915.1| GENE 2 527 - 1420 1032 297 aa, chain + ## HITS:1 COG:BH0676 KEGG:ns NR:ns ## COG: BH0676 COG1597 # Protein_GI_number: 15613239 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Bacillus halodurans # 3 290 6 291 295 178 36.0 1e-44 MLFIFNPRSGKAQIRNRLMDILDIFTKAGYEVSVHVTQRPKDAMEVAAARGKDADIIVCS GGDGTLNETISGMMTLDRIPDLGYIPAGSTNDFASSLKIPKNMIAAAHLAVEGEAYPVDI GSFCGDRNFVYIAAFGAFTEVSYLTPQDKKNVLGHQAYMLESVKSLTSIKSYRMRVESEE ITLEGEFIFGMVTNTISVGGFKGLVTQNVALNDGEFEVLLIRTPRTPLDLSNIVNYMFLK EEPNEYVHKFRTKALTVIPEEPVDWVLDGEFGGTRMETQIINLQKQIRIMRNPSKKQ >gi|229783820|gb|GG667915.1| GENE 3 1527 - 2672 1228 381 aa, chain + ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # 
Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 12 381 22 371 372 358 55.0 1e-98 MEKDNFLEKLGKLVELGKTKHNALDVTEINNFFMGDSLTPEQMDQIYSYLENRGVDVIPV IDDSILADDVLLSDDALLLDDDLDDSFIKGADEEDIDLDAIDLLEGIGTEDPVRMYLKEI GTVPLLSADEELRLAKRKADGDEDAKERLIEANLRLVVSIAKRYTGRGMSFLDLVQEGNL GLIKGVEKFDYTKGYKLSTYATWWIRQSVTRALADQARTIRVPVHMVETINKMSKMQRKL TLELGYEPSVTELADALDMSEDKVMEIMQIAREPASLETPIGEEDDSNLGDFVADNNVVT PEGNVESVMLREHIDALLGDLKERERQVIVLRFGLEDGHPRTLEEVGKEFNVTRERIRQI EAKALRKLRNPVRSKRIRDFL >gi|229783820|gb|GG667915.1| GENE 4 2841 - 3120 377 93 aa, chain + ## HITS:1 COG:CAC1577 KEGG:ns NR:ns ## COG: CAC1577 COG1636 # Protein_GI_number: 15894855 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 92 1 92 208 100 58.0 6e-22 MNHRNYQKELDTILEDFEKQGKVPRLLLHSCCAPCSSYVLEYLSKYFEITLYYYNPNIYP IQEYMKRVKEQEKLISEMKFVHPVLFRTGPYEP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:32:45 2011 Seq name: gi|229783819|gb|GG667916.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld309, whole genome shotgun sequence Length of sequence - 4895 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 95 - 760 508 ## Clocel_3236 VanZ family protein 2 1 Op 2 . - CDS 765 - 1502 666 ## gi|266625479|ref|ZP_06118414.1| transcriptional regulator, AraC family - Prom 1631 - 1690 4.9 - Term 1510 - 1547 1.1 3 2 Tu 1 . - CDS 1692 - 1862 90 ## gi|266625480|ref|ZP_06118415.1| PAS sensor diguanylate cyclase/phosphodiesterase - Prom 1896 - 1955 10.6 4 3 Tu 1 . 
- CDS 2857 - 4299 1051 ## COG2199 FOG: GGDEF domain - Prom 4359 - 4418 3.7 Predicted protein(s) >gi|229783819|gb|GG667916.1| GENE 1 95 - 760 508 221 aa, chain - ## HITS:1 COG:no KEGG:Clocel_3236 NR:ns ## KEGG: Clocel_3236 # Name: not_defined # Def: VanZ family protein # Organism: C.cellulovorans # Pathway: not_defined # 62 177 60 175 471 73 33.0 5e-12 MFNSITKLISWDSIWNPYTFTDAKIIVASLLLAMAGSIVLTAIVWGRKRAAVVLKREMLL FWLFTIFSATLLARKETAEPKINLELFWTIRYAWKYHSGIHWFYIVGNIALFIPLGFLLP LNGKPFRSCVLTALAGCLLSVSIEATQLFLKLGLCELDDVFHNTWGTLLGCCMFRLITPD AVVCSKGRKVINILSRGLCAVMLFGTVITFCILIWMNAPVK >gi|229783819|gb|GG667916.1| GENE 2 765 - 1502 666 245 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625479|ref|ZP_06118414.1| ## NR: gi|266625479|ref|ZP_06118414.1| transcriptional regulator, AraC family [Clostridium hathewayi DSM 13479] transcriptional regulator, AraC family [Clostridium hathewayi DSM 13479] # 1 245 10 254 254 487 100.0 1e-136 MQDIMEAIAEAYENPADDDFGGDSCAAEKRKEHANLNLIAEEFGMTPLKVRKLLITAGYH YQREIYSTPISRKVNDLYIEGKNIEEIMELTGLSRASVHGYLPYSRTVYKMEEGSAASER IRRYQERNYACERLRTAIHLQEPEVDELLWNTIIQFEGYPFCTSKGLKFSYIIKKRRDGS NSGEMFISRKEKSITKATVMIAFHKALELMDAEGSVSGPKKLGIFGASYLYPVFIRLFLP ENRRL >gi|229783819|gb|GG667916.1| GENE 3 1692 - 1862 90 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625480|ref|ZP_06118415.1| ## NR: gi|266625480|ref|ZP_06118415.1| PAS sensor diguanylate cyclase/phosphodiesterase [Clostridium hathewayi DSM 13479] PAS sensor diguanylate cyclase/phosphodiesterase [Clostridium hathewayi DSM 13479] # 1 56 25 80 80 89 98.0 6e-17 MSCSLGVVFYPRNGTTFKELYRNADIALYDVKRAGRCDYKIYGDEEEEERPPEDLI >gi|229783819|gb|GG667916.1| GENE 4 2857 - 4299 1051 480 aa, chain - ## HITS:1 COG:VC1370 KEGG:ns NR:ns ## COG: VC1370 COG2199 # Protein_GI_number: 15641382 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Vibrio cholerae # 380 479 256 353 443 79 42.0 1e-14 MEGSILERYKNILDLVPCGICQVALDDDLTILYANRFYYDIYGYTPQNAEEQGFINVRFI 
LPEEEYPEIHRKVLDYAGAGTTKFQMEFRGKHRTGKVLWLLVQCTYNPDTPEYILCSLMD IGDRKRMEEEIRMSMEENRIAFELTNKQLYIYDIENKRLFQPESAAEEFGLPPVAGYSPY SIVESGVIAEESRQEYIDFYESMIDGDPRGYAVVKKRKKDGTFGWYSAKSAVIYDREGKP CRAVISCENITEQREKELTYQKWSQYFKSQEGKTIGYYEYNLTKDVVEAGDDPPVYLKLL NKYTETVRYIAERFVYEGDRARFYHFFNRDRMLMRYYSGQTDGTIDYLRKREDGSLYWVR ATIQLIGDPYNNDVRLFMMTLNIDEEKTESLRLQHKMECDGMTDILNRDTFMTRVTEILE KSSPDMRHALIMLDIDQFKLHNDSYGHQFGDYVIRETAQILKRFLRKNDSCGRIGGDEFS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:33:07 2011 Seq name: gi|229783818|gb|GG667917.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld310, whole genome shotgun sequence Length of sequence - 5849 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 399 423 ## COG1625 Fe-S oxidoreductase, related to NifB/MoaA family - Prom 577 - 636 6.3 + Prom 387 - 446 7.2 2 2 Tu 1 . + CDS 607 - 1692 1251 ## COG0462 Phosphoribosylpyrophosphate synthetase 3 3 Op 1 8/0.000 - CDS 2605 - 2841 311 ## COG0164 Ribonuclease HII 4 3 Op 2 . - CDS 2825 - 3670 875 ## COG1161 Predicted GTPases + Prom 4526 - 4585 80.4 5 4 Tu 1 . 
+ CDS 4745 - 5849 383 ## Clocel_0596 integrase catalytic region Predicted protein(s) >gi|229783818|gb|GG667917.1| GENE 1 3 - 399 423 132 aa, chain - ## HITS:1 COG:CAC1710 KEGG:ns NR:ns ## COG: CAC1710 COG1625 # Protein_GI_number: 15894987 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase, related to NifB/MoaA family # Organism: Clostridium acetobutylicum # 6 132 3 130 437 156 55.0 1e-38 MSRKGHKIKAVDPGSIAEALELEPGDWVLSIDGHELEDIFDYEYYINSESIVMEVRKANG EEWELEIENGYEDPGITFENGLMSEYRSCRNQCVFCFIDQMPPGMRETLYFKDDDTRLSF LQGNYVTLTNLS >gi|229783818|gb|GG667917.1| GENE 2 607 - 1692 1251 361 aa, chain + ## HITS:1 COG:CAC0819 KEGG:ns NR:ns ## COG: CAC0819 COG0462 # Protein_GI_number: 15894106 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Clostridium acetobutylicum # 16 361 13 340 371 355 50.0 9e-98 MANDYVFNDQLPVAPLKIAALESCRELAEKVNDHIVGFRRNDIEELKRREADLHYRGYDV DSYLLNCSCPRFGSGEAKGIIKESVRGTDIFAMVDITNYSISYTVNGYTNHMSPDDHFQD LKRIIGACSATAHRVNVIMPFLYESRQHKRSKRESLDCAMALEELTSMGVSNIITFDAHD PRVQNAIPLHGFDNFMPTYQFVKAFLRHYPNLPIDKDHLMVISPDEGAMDRAVYLANNLS VDMGMFYKRRDYSRVVGGRNPIVAHEFLGSSVEGKTVLIVDDMISSGESMLDTAKELKER GVEKVIVCCTFGLFTNGLDKFDEYYAKEYIDLVVTTNLNYRPKELLSREWYVEADISKLA S >gi|229783818|gb|GG667917.1| GENE 3 2605 - 2841 311 78 aa, chain - ## HITS:1 COG:BS_rnh KEGG:ns NR:ns ## COG: BS_rnh COG0164 # Protein_GI_number: 16078669 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Bacillus subtilis # 1 78 43 120 255 80 57.0 7e-16 MRKLSEEKLAAERERLEQMKSYERQYGDHILVCGIDEAGRGPLAGPVVAGAVILPGDCEI LFLNDSKKLSEKRREELF >gi|229783818|gb|GG667917.1| GENE 4 2825 - 3670 875 281 aa, chain - ## HITS:1 COG:BH2476 KEGG:ns NR:ns ## COG: BH2476 COG1161 # Protein_GI_number: 15615039 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Bacillus halodurans # 1 262 17 280 284 259 48.0 4e-69 
MKEDIKLIDLIIELVDARVPLSSRNPDIDDIGAGKARLILLNKSDLADERCNLKWAAYFQ EKGYFVVKVNSRSGAGLKAINGVVQEACKEKIERDRRRGILNRPVRAMVVGIPNVGKSTF INSFAGKACAKTGNKPGVTKGNQWIRLNKSLELLDTPGILWPKFEDQTVGIRLALIGSIN DEILNKDELALELIRFLQKQYPNVLEDKYQADASDAVELLNQIAKARACLSKGGELDLAK ASRLLIDDFRNGKLGKITLELPPECPSGNTNVEQTCDEKVK >gi|229783818|gb|GG667917.1| GENE 5 4745 - 5849 383 368 aa, chain + ## HITS:1 COG:no KEGG:Clocel_0596 NR:ns ## KEGG: Clocel_0596 # Name: not_defined # Def: integrase catalytic region # Organism: C.cellulovorans # Pathway: not_defined # 5 368 4 369 435 349 48.0 1e-94 MKKTHLSLDDRITIVHMLSEQCSFKQIALAVGKNCTSISREVRNHILFRKTGGYGRHYNA CLLRKDCSHSTLCPSCLSSNRKRCSFCDKCNANCPDFQQESCSRLLMPPYVCNGCPDRRS CTLEKHLYSADYAHKEYRELLSESRTGISYSEEDIRRLDSIVSPLIFRGQSLNHICANNR DSLMVSESTLYRLVDYNVFKARNIDLPRKVRYSRRRKAKEFKVDKACRSGRTYEAYLNHC SRHPDLPVTQMDSVEGKKGGKVLLTLHFVKTELMLAFLRDANDSQSVIDIFDRLYLELGP DRFSSLMPLILTDNGSEFSNPRAIEYDRQGNLRTRLFYCDASSPGQKGSAEKNHEFIRYV LPKGTSFD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:33:14 2011 Seq name: gi|229783817|gb|GG667918.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld311, whole genome shotgun sequence Length of sequence - 2946 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 583 291 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) - Prom 652 - 711 5.5 + Prom 609 - 668 6.3 2 2 Tu 1 . + CDS 698 - 2107 549 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen + Term 2111 - 2166 11.9 - Term 2101 - 2152 6.4 3 3 Op 1 . - CDS 2181 - 2480 355 ## Ethha_1912 membrane protein 4 3 Op 2 . 
- CDS 2477 - 2779 369 ## gi|266625492|ref|ZP_06118427.1| putative P-aminobenzoate synthetase - Prom 2810 - 2869 5.2 Predicted protein(s) >gi|229783817|gb|GG667918.1| GENE 1 1 - 583 291 194 aa, chain - ## HITS:1 COG:SP1056_1 KEGG:ns NR:ns ## COG: SP1056_1 COG3843 # Protein_GI_number: 15900926 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD2 components (relaxase) # Organism: Streptococcus pneumoniae TIGR4 # 1 173 1 180 402 80 30.0 2e-15 MATTRIMPLHVGKGRTESRAISDIIDYVANPQKTDNGRLIAGFACDSRTVDAEFLLAKRQ YIVATGRVRGADDVIAYHVRQSFRPGEITPEEANRLGVEFAKRFTKGNHAFVVCTHIDKS HIHNHIIWSSVSLEYDRKFRNFWGSTKAVRRLSDTICIENRLSIVENPKLHGKSYNKWLG DQAKPSHRELLRVM >gi|229783817|gb|GG667918.1| GENE 2 698 - 2107 549 469 aa, chain + ## HITS:1 COG:UU038 KEGG:ns NR:ns ## COG: UU038 COG2865 # Protein_GI_number: 13357594 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Ureaplasma urealyticum # 2 384 4 386 463 334 46.0 2e-91 MIEQLIAEATECDFKVALETKKPKSWLKSVSAFSNGIGGTLFFGVSDDRGPIGLSDVQKD AEAISRLIKERITPLPQFILKPLQEDGKNLLALEVSPGRSTPYYYKADGVMEAYIRVGNE SVIAPDYIVNELILKGTNQSFDTLTTEAVKKDYSFTLLEATYLERTGLRFEPSDYVSFGL ADKNGFLTNAGKLMTDQHTVYNSRMFCTRWNGLEKGSIFDDALDDKEYEGNLIYLLKSGS EFIRNNSKVRFVKEAQYRVDKPDYAERAVTEALVNALIHRDYIVLGSEVHIDMFDDRVEI TSPGGMFGGGSIQEYDIYSIRSMRRNPVIADLFHRMKYMERRGSGLRKIVSETEKLPGYT EAYKPEFSSTATDFRVILKNVNYHHGPSIIETTHDKTYDATHDATHDKILAFCIEPHSKL EIAEHCGYKNTKNFTQKYLRPLLNNGTLKMTLPDKPKSKHQKYITVRSE >gi|229783817|gb|GG667918.1| GENE 3 2181 - 2480 355 99 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1912 NR:ns ## KEGG: Ethha_1912 # Name: not_defined # Def: membrane protein # Organism: E.harbinense # Pathway: not_defined # 4 99 1 96 96 81 55.0 9e-15 MKILKCLLMIVTAPVVLVLTLFVWLCMGLIYISGLVLGLLSTVIALLGVAVLITYSPQNG AILLIMAFLISPMGLPLAAIWLLGKVQDLKFAIQDWMYG >gi|229783817|gb|GG667918.1| GENE 4 2477 - 2779 369 100 aa, chain 
- ## HITS:1 COG:no KEGG:no NR:gi|266625492|ref|ZP_06118427.1| ## NR: gi|266625492|ref|ZP_06118427.1| putative P-aminobenzoate synthetase [Clostridium hathewayi DSM 13479] putative P-aminobenzoate synthetase [Clostridium hathewayi DSM 13479] # 1 100 1 100 100 174 100.0 3e-42 MLTFEKVLEIFADYLTADETIEVYISRHGCVRVEFDQDFHYCSGKVCHTPKELFNLLADD YRTYVEIELTKGRRELTEDDEREADALCKRYLERWREELE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:33:24 2011 Seq name: gi|229783816|gb|GG667919.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld312, whole genome shotgun sequence Length of sequence - 3315 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 3042 2272 ## gi|266625493|ref|ZP_06118428.1| hypothetical protein CLOSTHATH_06914 2 1 Op 2 . - CDS 3069 - 3314 191 ## gi|266625494|ref|ZP_06118429.1| glycolate oxidase iron-sulfur subunit Predicted protein(s) >gi|229783816|gb|GG667919.1| GENE 1 3 - 3042 2272 1013 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625493|ref|ZP_06118428.1| ## NR: gi|266625493|ref|ZP_06118428.1| hypothetical protein CLOSTHATH_06914 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_06914 [Clostridium hathewayi DSM 13479] # 1 1013 1 1013 1014 1974 100.0 0 MLNPNEYLDMLPMLAEFAKTRPDAWKKFVEERDKLSTAGQYRGLRKDMYEIKGSNVESML SNVVAKELDIRNPNRPNKIAQISNTLFVTADLSINSTYGAEEAMNLRTYKNMVKSISGEK KELAEFLIACYEEVLTMSFAYAEDVDISQKMNGFDPLGEMTSNLGEFQSAEMCLAQMFIP LIQGEDIQIHQIHLEREKSKGQDSNEYESREMYRNRIREERGKIIYAADHSNIPADAMAV DLFNYNLVFEKVSSRPEFQKFRGITADQLCDFDYRHRLAQSGEYRLPDRMDFQIKTNEII DPAAFVEFMEKKTEPERKALVEFYNTNSAKAKKGDDIYYIFKSYSSLPEEVQTGCKALVG YMEYGSTWYNDNIGCMIQKETHQLLEEAGLDWTDMIFIGDKSLREIVGSQYDDMPKELAE KEKQARVTAAVIKAEEPVSFGVMEKAPDGKIRLARVIPVYGDVLPFQKTEAEKTAYRKKQ EALNASPVFAEQEAAIRWKMKEYKPAYSNAYTENHVKPSGSDTLKYMTGGLDPESFTSIW HGYVPLAFNELNKPGRVNSLTNEIDRKAAMKWKALYDEQTDPKKKDTVKAMYELQRCGVR 
RTEPTRSHGSFFESILYLNIKAKHPEYHMSDIMCTDRCENQELADQIRLEKEAMGPQIMA KMGTTEGLADVMKVCMRTVSGLDLGREADYAAKIPETLTVEERKELCKRPEYEGARAGFF VTLGGAIQGLTQGLESAGGFTKKVPPVKVTNENRAVFNKYPEALKLREKIGELSVQELAE FNHAVEAVIKVSDLNRNKAQFMGDGKRDQISDAELAACAEATILNEQLEERGRIGGKKIA TGRAGTLLPLVADIPSQERPRMMRGHLPETLKEAYFNDEGIKKKLTDDAMGAMPNISKEE ESVLRYLSTPESERKFKPEAVPAHPLYDITKFSKTATLPEDVVIRMQEENAKRAIVTNTV LTFLEEHKIDNPGDTRDFSGMIDTKGLDNAQIEEKTKLALEDYYSGDPQRRKK >gi|229783816|gb|GG667919.1| GENE 2 3069 - 3314 191 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625494|ref|ZP_06118429.1| ## NR: gi|266625494|ref|ZP_06118429.1| glycolate oxidase iron-sulfur subunit [Clostridium hathewayi DSM 13479] glycolate oxidase iron-sulfur subunit [Clostridium hathewayi DSM 13479] # 1 81 1 81 81 164 100.0 1e-39 ADRVAQVLTESTGIKCMAMLSAGRHLYPYVDAEDYLENIYRHPNVLERLGRWIVNLTVDH PIAPFRVAAIVDPEQKSGRLI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:34:02 2011 Seq name: gi|229783815|gb|GG667920.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld313, whole genome shotgun sequence Length of sequence - 3151 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 120 - 179 7.0 1 1 Op 1 . + CDS 241 - 2322 1909 ## COG3250 Beta-galactosidase/beta-glucuronidase 2 1 Op 2 . + CDS 2195 - 2545 195 ## gi|266625496|ref|ZP_06118431.1| beta-galactosidase 3 1 Op 3 . 
+ CDS 2569 - 3151 327 ## COG3507 Beta-xylosidase Predicted protein(s) >gi|229783815|gb|GG667920.1| GENE 1 241 - 2322 1909 693 aa, chain + ## HITS:1 COG:SPy1586 KEGG:ns NR:ns ## COG: SPy1586 COG3250 # Protein_GI_number: 15675473 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pyogenes M1 GAS # 21 551 78 675 1168 221 27.0 6e-57 MPRKETKIMNNWEFALEKDGEKHFQTVDLPHDWAITAPFSREMAEGEAQGFRDRWGVGWY RKIQVLEEKKSSYRYYLYFGGIYEDSTVWVNDKVVGGRKYGYSPFRLDITDFVHTGDNEI LVRVDNTKKPADRWYSGAGIYRTVKWIEMEERHLEENQVIIKTSLKGKVGIIEVNPGIEG IVRVTVRGGDETYSCKGGKWPLTLWIPDAKRWSAKEPNLYELTLELMDGERSTDRITQKV GIREFSFVPNQGMFVNGEPVVLKGVCIHQDVGCRGNAAVKELWRARLMDLKELGCNCIRA AHHVYAEEFLDLCDELGFYVYEECFDKWTGGLYGRYFETEWKKDVSAMVERDRNRACIFI WGVGNEVENQAQDSMIRILKMLKGYTVSLDSSRPVSYAMNPHFKRESNVDLSKIKDIQKF VDEEDNTEIYDMEERLDRISRIAEVVDIISCNYQEQWYERIHERFPEKLILGTEIYQFFK GNENQMQNFTAENPSLVPLEKNYVIGGIIWTGIDYLGESMGYPAKGWGGSLIRTNGSRRA SYYIMQSYWSRKPMVYFAVMDYSLEDEGVKEHWDIPMYAEHWHFPQIHRAVIPYMIASNC EEVKLFLNEKEFYLPKPAECPNRMITGFLPYVRGRVKVVGYNQGKEVCNRGGGHAGSTGT AEIQRRGSGGNPGGRAAASDRGGLRRIRKPIFP >gi|229783815|gb|GG667920.1| GENE 2 2195 - 2545 195 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625496|ref|ZP_06118431.1| ## NR: gi|266625496|ref|ZP_06118431.1| beta-galactosidase [Clostridium hathewayi DSM 13479] beta-galactosidase [Clostridium hathewayi DSM 13479] # 46 116 1 71 71 137 100.0 3e-31 MVTPEAPVQLKFKEGEVVEIPAGEQRLLTVAACDESGNPYFRESSMVTFTAEGAGEILAV DNGNLMGSEPYHETMIHMFQGEASVLVRSGESGESFRVLAWAPGLKSAAVVCRIIQ >gi|229783815|gb|GG667920.1| GENE 3 2569 - 3151 327 194 aa, chain + ## HITS:1 COG:CC2802 KEGG:ns NR:ns ## COG: CC2802 COG3507 # Protein_GI_number: 16127034 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Caulobacter vibrioides # 2 193 34 241 548 182 46.0 4e-46 MKYQNPILKGFHPDPSICRVGEDYYLVNSSFEYFPGIPVYHSRDLVNWKQIGNCISRPEQ LSLKHAGNSGGIWAPVIRYHEGVFYVTATVEKYGNFIISTQDPREGWSDPVWVPVGGIDP 
SLYFEGGKAYYCTNQSVHPGKEEITLEEIDVTTGKLKSPITPIWSGTGGGHLEGPHIYYK DSWYYLMAAEGGTF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:34:11 2011 Seq name: gi|229783814|gb|GG667921.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld314, whole genome shotgun sequence Length of sequence - 3001 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1001 1197 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 2 1 Op 2 . - CDS 1035 - 1487 207 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 3 1 Op 3 . - CDS 1516 - 2937 1647 ## COG4108 Peptide chain release factor RF-3 Predicted protein(s) >gi|229783814|gb|GG667921.1| GENE 1 2 - 1001 1197 333 aa, chain - ## HITS:1 COG:SPy1070 KEGG:ns NR:ns ## COG: SPy1070 COG0624 # Protein_GI_number: 15675062 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Streptococcus pyogenes M1 GAS # 10 331 10 332 469 205 36.0 1e-52 MYRKEIEEFVEAHRQEMLEDICTLCRINSEKMPYKEGMPYGEGAFTALAEALSMAENYGF SITNYDNYVGTVDLNEKESQLDILAHLDVVPAGEGWKETEPFEPVVKGDKLFGRGTADDK GPAVAALYAMRAVKELGIPLKKNARLILGTDEECGSSDIAHYYAIEKEAPMTFSPDGSYP VVNTEKGGLNGHFTASFAPSDALPKLVSVEAGIKVNVVPGKARATVQGIDVEVMEKAAEE VSGETGIRFEFDVEEDAATITAIGAGAHASKPEEGNNALTGLLVLIQRLPFAPCEQISAI GRLLELIPHGDTSGKALGIAMSDELSGDLTLAF >gi|229783814|gb|GG667921.1| GENE 2 1035 - 1487 207 150 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 1 149 1 146 147 84 34 1e-16 MSKIDVVRAAMVEAMKAKDKQRKDALSMLLSALKNFEIDKKDHTPITEDEANAVVKKEIK QSQETYDTAPADRDDIREEAAYRLSVYKEFAPEDMSEEQIREIVTAVLAELEIEKPAPSD KGKIMKVLMPKVKGKADGKAVNEILASMMK >gi|229783814|gb|GG667921.1| GENE 3 1516 - 2937 1647 473 aa, chain - ## HITS:1 
COG:CAC0630 KEGG:ns NR:ns ## COG: CAC0630 COG4108 # Protein_GI_number: 15893918 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Clostridium acetobutylicum # 1 472 58 526 526 667 67.0 0 MEIEKERGISVTSSVMQFNYDGYCINILDTPGHQDFSEDTYRTLMAADSAVMVIDGSKGV EAQTIKLFKVCVMRHIPIFTFINKMDREARDPFELMDEIETVLGIRTCPVNWPIGCGHEF KGVYDRNKKTITTFQAAMNGQKAVDATEVLLGDPQVDWLIGEDYHHQLMEDIELLDGASD ELDMDRVGKGDLSPVFFGSALTNFGVETFLQHFLQMTTTPLPRLADKGVIDPFEEEFSAF VFKIQANMNKAHRDRIAFMRIVSGKFSAGMEVNHVQGGRKLRLSQPQQMMAQDRKIVEEA YAGDIIGVFDPGIFSIGDTLCLSNEKFEFEGIPTFAPEHFARVRQMDTMKRKQFIKGVNQ IAQEGAIQIFQEFNTGMEEIIVGVVGELQFEVLTYRLENEYNVEVKLEKLPYEYIRWVEN KDEIDVAKIQGTSDMKRIKDLKDNPLLLFVNSWSVGMVLERNPGLKLSEFGKN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:34:12 2011 Seq name: gi|229783813|gb|GG667922.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld315, whole genome shotgun sequence Length of sequence - 2809 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 23 - 199 112 ## gi|266625501|ref|ZP_06118436.1| conserved hypothetical protein 2 1 Op 2 . + CDS 226 - 372 129 ## CD3333 hypothetical protein + Term 413 - 448 3.5 + Prom 500 - 559 6.9 3 2 Tu 1 . + CDS 603 - 845 219 ## CD3333A hypothetical protein + Term 866 - 903 4.1 4 3 Tu 1 . + CDS 1364 - 1489 181 ## SSUBM407_0953 hypothetical protein + Term 1499 - 1536 5.3 + Prom 1530 - 1589 12.1 5 4 Op 1 . + CDS 1629 - 2192 307 ## COG1309 Transcriptional regulator 6 4 Op 2 . 
+ CDS 2192 - 2807 92 ## bpr_I1935 CAAX amino terminal protease family protein Predicted protein(s) >gi|229783813|gb|GG667922.1| GENE 1 23 - 199 112 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625501|ref|ZP_06118436.1| ## NR: gi|266625501|ref|ZP_06118436.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 58 8 65 65 101 98.0 2e-20 MKKTVYVKADEYHSEYITNIQSENPMKDIGLKLEPKRFELLSIIVEKSLAVYMMIGYW >gi|229783813|gb|GG667922.1| GENE 2 226 - 372 129 48 aa, chain + ## HITS:1 COG:no KEGG:CD3333 NR:ns ## KEGG: CD3333 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 41 34 74 81 76 100.0 3e-13 MPTVTYEIPYENAKEMLLVEEIDNKDFLTGLFNVMYDELPTPKPKKKK >gi|229783813|gb|GG667922.1| GENE 3 603 - 845 219 80 aa, chain + ## HITS:1 COG:no KEGG:CD3333A NR:ns ## KEGG: CD3333A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 80 6 85 85 130 98.0 2e-29 MANELLENLDRLHTTELGVIRIKKNLSLNVENVIEWCKEKISLSNAKIIRKGKNWYITID NCIITVNAYSYTVITAHKVK >gi|229783813|gb|GG667922.1| GENE 4 1364 - 1489 181 41 aa, chain + ## HITS:1 COG:no KEGG:SSUBM407_0953 NR:ns ## KEGG: SSUBM407_0953 # Name: not_defined # Def: hypothetical protein # Organism: S.suis_BM407 # Pathway: not_defined # 1 41 84 124 124 67 82.0 2e-10 MKQAQGITERLKAKNALEWVGQMNNIRACAMEIVEREIIFA >gi|229783813|gb|GG667922.1| GENE 5 1629 - 2192 307 187 aa, chain + ## HITS:1 COG:BH0317 KEGG:ns NR:ns ## COG: BH0317 COG1309 # Protein_GI_number: 15612880 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 1 164 1 161 190 78 32.0 7e-15 MPPKVKITKEMIIDAAFEIARSEGAENINARTVSKKLGCSTQPVMYHFKTIEELKKTVYV KADEYHSEYITNIQSENPMKDIGLNYIRFAETEKNLFRFLFQTNEFIGKNISELINSEEL QPIIAILSKEVEVNTEQAKTIFRSLFLIAHGYASMFANNEMTYDEQTIISDLDLVFDGTV YSLKGGL >gi|229783813|gb|GG667922.1| GENE 6 2192 - 2807 92 205 aa, chain + ## HITS:1 COG:no KEGG:bpr_I1935 NR:ns ## KEGG: 
bpr_I1935 # Name: not_defined # Def: CAAX amino terminal protease family protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 204 1 206 248 187 48.0 2e-46 MYRLYKKNELNFSLVWIISYVVLFSVADSFSASLGTEKIIAAPVAIVFTLLLLVFVSKHN LKEKYGLCSFNGSLRNFLYFTPLLLIMSINLWNGVTMKLSVLETVLYILSMLCVGFIEEI IFRGFLFKALYNDNVKLAIVISSVTFGIGHIVNLLNGKDLIPTLLQVCYAIAIGFLFTII FYKGKSLLPCIIAHSFVNSSSVFAV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:34:26 2011 Seq name: gi|229783812|gb|GG667923.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld316, whole genome shotgun sequence Length of sequence - 1923 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 151 232 ## gi|266625506|ref|ZP_06118441.1| oxidoreductase, NAD-binding 2 1 Op 2 . - CDS 165 - 896 876 ## COG2820 Uridine phosphorylase 3 1 Op 3 . - CDS 918 - 1850 1014 ## COG1079 Uncharacterized ABC-type transport system, permease component Predicted protein(s) >gi|229783812|gb|GG667923.1| GENE 1 1 - 151 232 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625506|ref|ZP_06118441.1| ## NR: gi|266625506|ref|ZP_06118441.1| oxidoreductase, NAD-binding [Clostridium hathewayi DSM 13479] oxidoreductase, NAD-binding [Clostridium hathewayi DSM 13479] # 1 50 1 50 50 82 100.0 8e-15 MKTGIWGAGNIAHTHAEALKASGIKIGAIVDVSEEKAKAFAEEFKPFRFQ >gi|229783812|gb|GG667923.1| GENE 2 165 - 896 876 243 aa, chain - ## HITS:1 COG:SPy1869 KEGG:ns NR:ns ## COG: SPy1869 COG2820 # Protein_GI_number: 15675688 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Streptococcus pyogenes M1 GAS # 19 238 25 249 259 131 37.0 1e-30 MEKIVPIIDIPYNKIAARVIACGDPARAEKISKRLDHVEVVAKSREYWTFNGTYKGVPVT VSSHGVGGNGASVSFEGLILGGAKVIIRVGTCGTLQKDMPQGALIIATAACRDDGITPQF IPISYPAVASMDVVAAMQEAAKEINPDARTGIITTSGLFYTGLMPTNNRLFSEAGVLAME NEASVLFVIASVRGIKAGCIVAADGPCFEFVGSEEFDHYPEKMAKAVEDEIMIALESIIK VEV 
>gi|229783812|gb|GG667923.1| GENE 3 918 - 1850 1014 310 aa, chain - ## HITS:1 COG:BMEII0087 KEGG:ns NR:ns ## COG: BMEII0087 COG1079 # Protein_GI_number: 17988431 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Brucella melitensis # 3 296 4 284 303 182 43.0 6e-46 MKILEIVTSVSFLFSIIRMTTPILYASMSYLVADLAGVTNIGIEGMMLTCALIGVVASAA FGGNAFLGFTAAVLVGALLGILMCFIVTKLKTDPTITGIAYNLTAAGGTVFLLYLAVGEK GISSSLKSGVLPQVNIPWIKDIPVLGTVISGHNVLTYAALLLIAILTIFLYKTSLGLHIR SCGENPEALESVGVHVLQVKYIALAISGMLAAMGGAYMSMGYVSYFVKDMTAGRGFIAIA AASLGGNKPVPTLLVCLLFGVADAFAVNPGTQNMGIPTELVSTIPYIVTVIVLVIYSYKK MVDKRRAAAA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:34:32 2011 Seq name: gi|229783811|gb|GG667924.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld317, whole genome shotgun sequence Length of sequence - 2877 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 8 - 76 9.1 1 1 Op 1 . - CDS 325 - 636 197 ## gi|266625509|ref|ZP_06118444.1| phosphoglycerate mutase family protein 2 1 Op 2 . - CDS 545 - 835 62 ## gi|323485907|ref|ZP_08091242.1| hypothetical protein HMPREF9474_02993 - Prom 1029 - 1088 9.2 3 2 Tu 1 . - CDS 1095 - 1202 100 ## - Prom 1372 - 1431 3.8 - Term 1678 - 1724 -0.7 4 3 Tu 1 . 
- CDS 1755 - 2552 416 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold - Prom 2649 - 2708 3.7 Predicted protein(s) >gi|229783811|gb|GG667924.1| GENE 1 325 - 636 197 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625509|ref|ZP_06118444.1| ## NR: gi|266625509|ref|ZP_06118444.1| phosphoglycerate mutase family protein [Clostridium hathewayi DSM 13479] phosphoglycerate mutase family protein [Clostridium hathewayi DSM 13479] # 1 103 1 103 103 194 100.0 2e-48 MADISEQWTGGGDGTNYSGRHQHPVKLKQDKRLREWCFGSLEGISNRDFTQKIRQELETE IGMKELNQRLQEIPDILVWADESGWLESFECIGTRLKSARQAL >gi|229783811|gb|GG667924.1| GENE 2 545 - 835 62 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|323485907|ref|ZP_08091242.1| ## NR: gi|323485907|ref|ZP_08091242.1| hypothetical protein HMPREF9474_02993 [Clostridium symbiosum WAL-14163] hypothetical protein HMPREF9474_02993 [Clostridium symbiosum WAL-14163] # 4 76 59 131 141 89 57.0 9e-17 MHPDGVQISLSRYEFFTLCFLAGHQGWVFSKEQIYEFVRKEAGEHCVTAVTNIISQIRRK LKQENPNGGYIRTVDGRWRRHELFWPASASGKAKTG >gi|229783811|gb|GG667924.1| GENE 3 1095 - 1202 100 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MENNKTGMETLFLSMESSFHLGVVYFAAMGIVRFF >gi|229783811|gb|GG667924.1| GENE 4 1755 - 2552 416 265 aa, chain - ## HITS:1 COG:CAC3337 KEGG:ns NR:ns ## COG: CAC3337 COG2159 # Protein_GI_number: 15896580 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Clostridium acetobutylicum # 1 265 1 262 262 230 42.0 2e-60 MIIDSHEHMMLPTEMQMQMMDAAGIDKTILFSTVPHPEKASSLNGLEIEMDKLYKVLSGS NSIEANIIRQQKSISELVSIIRKHPDRFGGFGPVPLGMSFNETQSWITDYIISNSLLGIG EFTPGNEQQILQLDTVFQAIEATEICPVWVHTFSPVTMGGIKLLMGLCERYPRVPVIFGH LGGKNWIEVIKFAKTNKNVYLDLSAAFASIATRMALIELPQRCLFSSDAPYGEPYLYREL IEFVSPSKEIADMALGNNIKELLHI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:34:51 2011 Seq name: gi|229783810|gb|GG667925.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld318, whole genome shotgun sequence Length of 
sequence - 2761 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 28 - 1416 1245 ## COG1653 ABC-type sugar transport system, periplasmic component 2 1 Op 2 . - CDS 1472 - 1657 121 ## gi|266625514|ref|ZP_06118449.1| conserved hypothetical protein - Prom 1684 - 1743 5.9 + Prom 1615 - 1674 10.5 3 2 Tu 1 . + CDS 1699 - 2679 476 ## COG1609 Transcriptional regulators + Term 2720 - 2760 1.8 Predicted protein(s) >gi|229783810|gb|GG667925.1| GENE 1 28 - 1416 1245 462 aa, chain - ## HITS:1 COG:BH3845 KEGG:ns NR:ns ## COG: BH3845 COG1653 # Protein_GI_number: 15616407 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 4 304 2 301 436 78 25.0 2e-14 MRKMKRIAALGLAGIMAFGLTACGGKNVDDAASGAAENSGTQEATEPGNAAKEGGRTLVI SMESVTAEADELRKDYFDKQLKAAFPNDNITIDVASDAQSLQVQVAGGSGPDLFHLNGPT DAVEYAKAERIIDLSAYADQYNWKDLFYDWAYGTSYYDGKLYSLPNSFEGMVIYYNTDVF EANGWEKPETLDELLMLCDDAQAKGIIPFSFGNSDYQGAVDWLYSTFLSCYAGPDAMKEA LEGSRSWTDPQIKGGIETMVDWWQKGYIGDKKSQALTGDDMVSLFANGKAAMMIDGTWAS NSLITTYPECKWDVEVMPEIHDGSGRILPLATGGCFAVNKSCKDPDFAAEVLNWIYTGNL ETHIRGVVEAGFQPFPIKTIDPESFNGMNEKMLDMYQALFDGMDSGKVGYCSWTFFPSTA RNYMNENTDALFLGQLGVDDYLNRVEELVKAAVADGSAPQLP >gi|229783810|gb|GG667925.1| GENE 2 1472 - 1657 121 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625514|ref|ZP_06118449.1| ## NR: gi|266625514|ref|ZP_06118449.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 61 1 61 61 97 100.0 3e-19 MGEKKYGNFFTKKGMDFQKVFTNMVTVDRKYMKTVDIQKKLNYDKATIQLVARICLLVRM L >gi|229783810|gb|GG667925.1| GENE 3 1699 - 2679 476 326 aa, chain + ## HITS:1 COG:SP1725 KEGG:ns NR:ns ## COG: SP1725 COG1609 # Protein_GI_number: 15901558 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus 
pneumoniae TIGR4 # 1 317 2 317 321 234 42.0 2e-61 MAKITINDVAKHAGVSITTVSRVLNNRGYISDATREKVQSSINELGFIPNELARSFFTCS SNLIGLIIPTTANPFFGELTFYIEKQLSKEGYKLFICNSINENENEKEYLRMLQENRVDG IIVGSHNLNIAEYDKLPLKAVSVERVLNDSIPVIQCDNYQGGRIATQTLIHAGCRKILCL SGDFKLNAPANSRYTAHKDCMAEHGLPYHIESIPFTVSNEEKIARITEIMDSCTDIDGVF AGDDILASIVYNYAVQHQITVPERLKIIGFDGTEAIHTIFPALTTIQQPIEFLAEKSVRL LLSLIREERVPPVTTLPVDLYRGGTV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:34:56 2011 Seq name: gi|229783809|gb|GG667926.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld319, whole genome shotgun sequence Length of sequence - 2122 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 2122 2341 ## Pjdr2_4665 S-layer domain protein Predicted protein(s) >gi|229783809|gb|GG667926.1| GENE 1 1 - 2122 2341 707 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_4665 NR:ns ## KEGG: Pjdr2_4665 # Name: not_defined # Def: S-layer domain protein # Organism: Paenibacillus # Pathway: not_defined # 2 701 382 1104 2026 330 33.0 2e-88 FEFADLGWNYLPDACYGDGSYSDGGVLADTGTNNYLTLKDPDTDDYSMIFANNTSQTRKY KVTVKNLKKASSPLNVWETRGPDEGQEYDSNWMQMVDSFTPNKESYGVYSFTLTVKPYSI KTVTSLLDRGQEYSPGQNDPNVDRDVLALPYTDDFEYADYDTDEKGRTYLERRGGTPRYT TDQYGAFEVREGSFGNHVLAQMINYDNRPYDWNVWGNGTDENSQTTGRPRTVLGDHRWTN YTAGIDFKLDLTSPECFENYAGIGVREVVHEGTNANDIATYTFQVFQNGSYQLSCRTGKS VKGTLENFDSTVWHNMQLSADENVLTAYVDGKKAAELVDDNHTSMSGRVTLTSGFYNTQF DNLEVLPIEGKAAAAEQKLDDTSELITWNGNWNHVLNEGYGHYNRTRTYGGSIEKGVYAH NSGQIRYYKGGNPLAWGSNDSNAWGSAADEAHYEFDFYGTGIEIFGEANGSNGTGDVYVD GEKVGTVNYNSGNGGTGHNVFNLDLDPDQLHTLKVTATSSFISLQKLKVIGEPDDSGESR ENSFSLTFHGTGIQVFGNSGAATIKVEVDGEIREAAYQTPAAGNRQTTYALQNLPEGAHT VKITVEGGTYYVDGIDVLGTAAEGAAVNTAKLLTLIDAAKKLTNDDETYMPEDWENFQKA IAAAEKALAGGSAEAVNQAYLALRNAMEKLGLAGAITEVTGLEKLYI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:35:08 2011 Seq name: gi|229783808|gb|GG667927.1| Clostridium hathewayi 
DSM 13479 genomic scaffold Scfld320, whole genome shotgun sequence Length of sequence - 2721 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 782 636 ## Closa_0875 Stage II sporulation P family protein 2 1 Op 2 . - CDS 871 - 1848 1126 ## Closa_0874 spore protease (EC:3.4.24.78) + Prom 1970 - 2029 7.4 3 2 Tu 1 . + CDS 2115 - 2378 264 ## PROTEIN SUPPORTED gi|160880450|ref|YP_001559418.1| ribosomal protein S20 + Term 2405 - 2453 8.2 Predicted protein(s) >gi|229783808|gb|GG667927.1| GENE 1 2 - 782 636 260 aa, chain - ## HITS:1 COG:no KEGG:Closa_0875 NR:ns ## KEGG: Closa_0875 # Name: not_defined # Def: Stage II sporulation P family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 260 1 247 468 183 42.0 7e-45 MKEKGGRINWLIRHMMLITGLVLAVLLLVRGAAGTGLSESRERLTALLRGQADAGLEWIW RMNYPAAGHNDERETEAGKAEEESISGFICRLIFGQSPLYRYLGREEDGEHNYREEDPAF AGYLESGKFYEEHQYLMYDGGEETGAGTMEQAGTTVLAAGGNGADGDGNTGGQGTVAGGN AENGAMAGSPASETGAAGGSADANAAGIPQKNENMTCAASVLPLIGTTYRKEQLADYDYM MKHFYSVHTSTTAGRDLMKA >gi|229783808|gb|GG667927.1| GENE 2 871 - 1848 1126 325 aa, chain - ## HITS:1 COG:no KEGG:Closa_0874 NR:ns ## KEGG: Closa_0874 # Name: not_defined # Def: spore protease (EC:3.4.24.78) # Organism: C.saccharolyticum # Pathway: not_defined # 1 324 1 324 325 556 88.0 1e-157 MENTFGVRTDLAVEETESFPGNGGEISGVSLREWHRAGSRIKMTEVKILDKKGEKAMGKP MGSYITLEADQLSRKDEDYHREVSEELARQIGRLLEGKNYRSSSYHVLAVGLGNSSVTPD SLGPRVLHNLQVTRHLKVQYGDEFWSGRSMPVISGIVPGVMAQTGMESAEILKGIIHETK PDLIIAIDALAARSVKRLGTTIQLTDTGIHPGSGVGNHRHSLTEESLGVPVMAIGVPTVV GAAAIVHDTVSAMIGALSKSMETKGMGDFVGSMSSDEQYDLIRELLEPEFGPLYVTPPDI DETVKQLSYTISEGIHLAFMGEEEE >gi|229783808|gb|GG667927.1| GENE 3 2115 - 2378 264 87 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880450|ref|YP_001559418.1| ribosomal protein S20 [Clostridium phytofermentans ISDg] # 1 87 1 87 87 106 63 2e-23 
MANIKSAKKRILVNETKAARNKAIRSRVKTSIKKVEAAVTAGDKAAAQACLTSAIAEIDK AATKGVYHKNTASRKVSRISKAVNAMA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:35:23 2011 Seq name: gi|229783807|gb|GG667928.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld321, whole genome shotgun sequence Length of sequence - 2851 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 128 84 ## 2 1 Op 2 . + CDS 67 - 711 534 ## Closa_2453 Vitamin B12 dependent methionine synthase activation region 3 1 Op 3 . + CDS 717 - 2850 2161 ## COG1410 Methionine synthase I, cobalamin-binding domain Predicted protein(s) >gi|229783807|gb|GG667928.1| GENE 1 3 - 128 84 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no FFHECIEGKLQEVPVKKWSDVYGNQPERDSEVSGVWQTGSG >gi|229783807|gb|GG667928.1| GENE 2 67 - 711 534 214 aa, chain + ## HITS:1 COG:no KEGG:Closa_2453 NR:ns ## KEGG: Closa_2453 # Name: not_defined # Def: Vitamin B12 dependent methionine synthase activation region # Organism: C.saccharolyticum # Pathway: not_defined # 1 213 1 213 214 295 73.0 1e-78 MEISRREIRRYLGYGRQEADEKVSKLIEECVEELEAAVSPKSIYRVYPFSFLTEEELDFT VFRTESRSLSRNLKDCEQVILFAATLGTEADSLIRKYSKLQMSKAVTMQAAATAMLESYC DEINTELKETFEKKGLYLRPRFSPGYGDFPLECQRDITAVLETAKRIGIMLTDSLLMTPS KSVTAVMGVSRKPHRCEVRGCELCSKTDCAYRRG >gi|229783807|gb|GG667928.1| GENE 3 717 - 2850 2161 711 aa, chain + ## HITS:1 COG:alr0308_2 KEGG:ns NR:ns ## COG: alr0308_2 COG1410 # Protein_GI_number: 17227804 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Nostoc sp. 
PCC 7120 # 309 711 22 438 868 203 33.0 1e-51 MTELMEEIRKRIVFFDGGTGSLLQANGLKPGELPETWNILHPEIVTKLHYDYLEAGADIV KTNTFGANGLKFNDAGEYGLDEIVAAAMENAKKAVSKAGDKGYIALDIGPTGKLLKPLGD LGFEEAYRLFSDVVAIGAREGADLVLIETMSDSYEVKAAVLAAKENCNLPVFATMIFDSK GKLLTGGTVESTVALLEGLGVDALGINCGLGPVQMKGILADIMKAASVPVIVNPNAGLPR SEGGRTFYDIDADEFAGTMREIVEMGACVVGGCCGTTPEHIRKTIALCKDQPARMPEKKN RTVISSYAQAVEIDKNPVLIGERINPTGKSKFKQALRDHNLEYILREGVAQQDNGAHVLD VNVGLPEIDEAAMMEEVVMELQSIIDLPLQIDTSNIQAMERALRVYNGKPLINSVNGKQE VMEAVFPLVKRYGGVVVALALDEDGIPETADGRLKVAEKIYAKASEYGIERKDIVIDALC MTVSSDSRGAITTLETVRRVRDELGGKTILGVSNISFGLPQREIVNAAFFTMALQNGLNA AIINPNSEAMMRSYYSFRVLADLDPQCSEYISVYSGQVATLGQTVRQGGGSGKADGSGSA MSASLAESIERGLKESAHQAVTELLKTLEPLVIINEEMIPALDRVGKGFEKGTVFLPQLL MSAEAAKAAFEVIKEQLAKSGREEEKKGKIILATVKGDIHDIGKNIVKVLL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:35:32 2011 Seq name: gi|229783806|gb|GG667929.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld322, whole genome shotgun sequence Length of sequence - 2398 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 25 - 84 4.8 1 1 Op 1 . + CDS 169 - 582 439 ## gi|266625522|ref|ZP_06118457.1| V-type ATP synthase subunit E 2 1 Op 2 . + CDS 665 - 1057 469 ## gi|288871666|ref|ZP_06118458.2| conserved hypothetical protein 3 1 Op 3 . 
+ CDS 1065 - 2397 1095 ## gi|266625524|ref|ZP_06118459.1| putative GGDEF domain protein Predicted protein(s) >gi|229783806|gb|GG667929.1| GENE 1 169 - 582 439 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625522|ref|ZP_06118457.1| ## NR: gi|266625522|ref|ZP_06118457.1| V-type ATP synthase subunit E [Clostridium hathewayi DSM 13479] V-type ATP synthase subunit E [Clostridium hathewayi DSM 13479] # 1 137 1 137 137 244 100.0 2e-63 MFTSAKDIMKIIWNGLKEIKKNCSEYCEKDARKIYEETLERVKDELSGPEPEEIRKERLV RNRWEKMRADRKIAAVSILTILASLVLVTSYDRWMGFGSFILVMAAVAVISVLWISHLFM RRFKLSLEGWLLEAKEV >gi|229783806|gb|GG667929.1| GENE 2 665 - 1057 469 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871666|ref|ZP_06118458.2| ## NR: gi|288871666|ref|ZP_06118458.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 130 2 131 131 261 100.0 1e-68 METFLCFESKGAFFLIPLKLVRHIVQGNAGQEKTIVFEGCEAEIYSVGAFWGDSREQEYA VLLDAEDSLPAILADRVAGVFEVPDTRIFKLPEDVTGEKNDFLTQAVYLDSSGCWAFVMD IERFLNKERD >gi|229783806|gb|GG667929.1| GENE 3 1065 - 2397 1095 444 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625524|ref|ZP_06118459.1| ## NR: gi|266625524|ref|ZP_06118459.1| putative GGDEF domain protein [Clostridium hathewayi DSM 13479] putative GGDEF domain protein [Clostridium hathewayi DSM 13479] # 1 444 1 444 445 884 100.0 0 MVSNREERLDRLTNTWTEASAEVMISEYLKDTEHWGGDGLFFLEAANGGTKLSEASVKQA AFLLQCTLRRTDLVARVGEDCFLIFLLGCRSREDALTSMNSVRECLERFAGLSVTAGGVL TDGREETYELLRERALAALAHAKREKRPFCFAEECERGDGEETAAEMHAAKGCLIRPVPE YIEDSGKADMEFVRIMMDQLLPGKKECSIEKGLERICAYYGCGEAYILERKSGGKDYEIS FSWREEKPMVRNDHLKTAPGLIVDRYLDLFRNREVLACNRICELEILDPVIAERQKLRKT RALMQFPIMENGDYIGYISVNDSKKERLWTLEETMTFALAGRVLAARILECRFQRFAQIL IDHDRLTEAWNYNRFLSEGRKRLKRAPLLQAVVTMDIKNFKVINANYSYETGNDILVGIS SLLNHFTGGGECFARIEADKFALL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:36:07 2011 Seq name: gi|229783805|gb|GG667930.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld323, 
whole genome shotgun sequence Length of sequence - 3504 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 202 200 ## COG0778 Nitroreductase + Prom 254 - 313 10.1 2 2 Tu 1 . + CDS 360 - 1049 742 ## COG2364 Predicted membrane protein + Prom 1062 - 1121 4.3 3 3 Op 1 . + CDS 1178 - 2680 1508 ## Closa_1674 hypothetical protein 4 3 Op 2 . + CDS 2677 - 3502 450 ## COG1180 Pyruvate-formate lyase-activating enzyme Predicted protein(s) >gi|229783805|gb|GG667930.1| GENE 1 2 - 202 200 66 aa, chain + ## HITS:1 COG:FN1223 KEGG:ns NR:ns ## COG: FN1223 COG0778 # Protein_GI_number: 19704558 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 1 66 109 174 175 86 60.0 1e-17 AAHSLGIDSCWIHRAKEEFDSPEGKALLEEWGIHGDYEGIGHCILGYRSCEYPEAKPRKE GTVIRI >gi|229783805|gb|GG667930.1| GENE 2 360 - 1049 742 229 aa, chain + ## HITS:1 COG:CAC0198 KEGG:ns NR:ns ## COG: CAC0198 COG2364 # Protein_GI_number: 15893491 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 201 1 201 227 158 43.0 7e-39 MDHHVIMKKYRMFLYGMIVAALGISMITKAELGTSPITSPSYVLTFIFPYSLGAFVLAVN STLFLLECLVLGRGFKKIQLLQLPATLIFSACIDGWMWVLSFWTPVFYLQKVLLLLAGCA ALGLGVALEVVPDVLILPGEGLVRAISRRKGWDFGLVKTCFDLSLVLAAILLSVFCLGHV EGIREGTVAAAFLVGGISKFFIRHITEFDRNHRALFVREEEERLLVTEE >gi|229783805|gb|GG667930.1| GENE 3 1178 - 2680 1508 500 aa, chain + ## HITS:1 COG:no KEGG:Closa_1674 NR:ns ## KEGG: Closa_1674 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 498 2 495 498 675 65.0 0 MEQNRVLDVLTSKELTYEQKVFSLAREAEDSLEVLDIPPKTRAYFETGALNDLFEGHAPY RPRYIMPDYKKFVRQGSKFLRIEPPKTLDELLTGLEILYHHVPSITGFPVYLGELDTMIE PFTTGLADDEVKEKIRLFLIYLDRTITDSFCHGNLGAKPTRTGRIILELEKELQNAVPNL TLKYDPDVTPDEYAELAVSTSLYCANPAICNDADNRTTYPCEYGISSCYNILPVGGGAYT 
LSRITLTELVKSAKSTEHFLKELLPDCLNAVGNYMNERVKFLVERSGFFESSFLVREGLI EREKFVGMFGVTGLAECVNALMADTGKRYGHSAESDDLGVQIMEIIQKFVADFPAVYSEI GRGHYLLHAQVGLDSDKGITSGVRIPVGDEPENLLDHLRHSARFHEYFPTGVGDIFPFAV TARNNPGALVDIVKGSFQLGVKYMSFYASDSDLVRITGFLVKRSEMEKYRNHDVVLQNTT VLGSKNYDTNHLELRKERMA >gi|229783805|gb|GG667930.1| GENE 4 2677 - 3502 450 275 aa, chain + ## HITS:1 COG:STM4565 KEGG:ns NR:ns ## COG: STM4565 COG1180 # Protein_GI_number: 16767806 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Salmonella typhimurium LT2 # 3 245 6 252 287 198 42.0 1e-50 MKAPVNKIIPVSVVDGPGNRTAVFLQGCNISCAYCHNPETQNLCTGCGVCVDGCPAGALK LNGETVVWDEDKCISCDRCIAVCPSFASPKVREMSAEEVFCKVSLNMPFIRGITVSGGEC CLYPEFLKELFGLCRKAGLSCLIDSNGMVDFSSCPELLDLCEGVMLDVKSWDPCVYRRLT GADNAVVRKNLEFLWRKRKLEEIRIVCLPGESDYEDVLLGIAGLLGSPITVRVKLIKFRP FGVRGRLSGASRPEDGLMRELASWAGELGYTNIVL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:36:15 2011 Seq name: gi|229783804|gb|GG667931.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld324, whole genome shotgun sequence Length of sequence - 2422 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 2 - 659 522 ## COG3408 Glycogen debranching enzyme 2 1 Op 2 2/0.000 - CDS 685 - 1674 1124 ## COG1609 Transcriptional regulators 3 1 Op 3 . 
- CDS 1691 - 2422 965 ## COG2182 Maltose-binding periplasmic proteins/domains Predicted protein(s) >gi|229783804|gb|GG667931.1| GENE 1 2 - 659 522 219 aa, chain - ## HITS:1 COG:MA0905 KEGG:ns NR:ns ## COG: MA0905 COG3408 # Protein_GI_number: 20089784 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen debranching enzyme # Organism: Methanosarcina acetivorans str.C2A # 3 149 25 169 680 94 35.0 2e-19 MRLEYGRNSFTGFDRGQEQCSLLTNGLGGYSSQTVIGSNTRGDHALLMAALQAPNLRYHM VTRVDETLMTGGSCFSLGSQQYAGYCENTEGFRYLDAFCMEYFPVYTYRVRGVEIKKTVV MPQQENTVGIRYEIENRTKREALLEVVPLFQFVEKGEKCMPGRQFTLSAAKAAGRISDGV LNLNFWTDGEIVEQTQQYIDDLYFEYDARDGRPAAGGAY >gi|229783804|gb|GG667931.1| GENE 2 685 - 1674 1124 329 aa, chain - ## HITS:1 COG:BH3692 KEGG:ns NR:ns ## COG: BH3692 COG1609 # Protein_GI_number: 15616254 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 289 1 296 337 110 29.0 3e-24 MTSINDIARMAGVSKSTVSKVLNQYDSVSDQTRRKVEQAVRALNYVPNAAAISLSKKNFH RVGLIVDMTYYKQAVDEIDILYLFGAFEKAGEYQMEAVTFFTSQFAHMSGDDMIRYLKSQ GITGIIVYSLARENVKLYNVIMSKEFKAVVVDSPIVSQMISSVSVDHYKGQYEVAKKAVE SGTGSTVLYIACGPDGYMTDARLAAMRRLKEERNLELIVEYADYSEKKAREITFARGREA DVIVCASDLMAIGAACALEEMGVRRPVSGYDGITLMGYMRTPMYTLKQDFKKISSTAFLE LVRLFEGGTGRHVTMEAKVVQIFYKDVMK >gi|229783804|gb|GG667931.1| GENE 3 1691 - 2422 965 243 aa, chain - ## HITS:1 COG:lin2230 KEGG:ns NR:ns ## COG: lin2230 COG2182 # Protein_GI_number: 16801295 # Func_class: G Carbohydrate transport and metabolism # Function: Maltose-binding periplasmic proteins/domains # Organism: Listeria innocua # 6 234 175 407 419 98 28.0 8e-21 EFAETFNDPAKNQFAFMMDAANGYSAYGFLTTYGYKLFGESGTDRDEPGFDSPEFEKGLE FYQELKKILPVESQDLKGEFVNEQFKQGKTAYILGGPWNIVDFRNAGVNFGVMTMPTLNG RTPTPFAGLKVAHVSAYTDYPHAAMLLAEYMASEPGAIILYDTNYKTTALKDISAVPELL EDRDLSVFSRQFETAFPVPNIDRMDYYYSISEKALALVFDGQLEPAEAAKKAMEEWNSLV ASE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:36:16 2011 Seq name: gi|229783803|gb|GG667932.1| Clostridium hathewayi DSM 
13479 genomic scaffold Scfld325, whole genome shotgun sequence Length of sequence - 2543 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 2472 2746 ## COG0013 Alanyl-tRNA synthetase Predicted protein(s) >gi|229783803|gb|GG667932.1| GENE 1 3 - 2472 2746 823 aa, chain - ## HITS:1 COG:CAC1678 KEGG:ns NR:ns ## COG: CAC1678 COG0013 # Protein_GI_number: 15894955 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 822 1 824 881 851 52.0 0 MNKTGVNELRRMFLEFFESKGHLAMKSFSLVPHNDNSLLLINSGMAPLKPYFTGQEIPPR KRVTTCQKCIRTGDIENVGKTARHGTFFEMLGNFSFGDYFKTEAIRWSWEFLTEVVGLDP DRLYPSVYLEDDEAFDIWNKEIGIAPERIFRFGKEDNFWEHGAGPCGPCSEIYYDRGEKY GCGKPGCTVGCECDRYMEVWNNVFTQFENDGHGNYEELAQKNIDTGMGLERLAVVVQDVD SIFDVDTIKALLNRVAELARTEYQKDASVDVSLRLITDHVRSCTFMISDGIMPSNEGRGY VLRRLLRRAARHGRLLGIEGRFLAELSRTVIELSKDGYPELEEKKAMILKVLTEEEDKFN RTIDQGLAILSEMEEQMSAAGKTVLSGEDAFKLYDTYGFPLDLTREILEEKHFSIDEDGF KKAMQEQREKARAARKTTNYMGADVTVYQSIDPALTTEFVGYDRLVCDSKITALTTETDL VEALTDGETGTIVTEETCFYGTMGGQQGDKGVIVSANGEFSVEDTIHLQGGKVGHVGKMK KGMFQIGDVVTLKVCEESRMNTGKNHSATHLLQKALRTVLGEHVEQAGSYVDEDRLRFDF THFSAMTPEEIKKVECLVNEKIKEALNVKTEVMSLDEAKKSGAMALFGEKYGENVRVVKM GDFSTELCGGTHISNTGVIGSFKILSETGIAAGVRRIEALTGDGLMKYYQEAETELHEAA KAAKATPQTLAARIEAMLEEIKTLHSENEKLKSRLAKDSLGDVMDQVKEIGGVKVLAAKA DDVDMNGLRNLGDQLKDKLGEGVIVLASVLDGKVNLMATATEE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:36:17 2011 Seq name: gi|229783802|gb|GG667933.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld326, whole genome shotgun sequence Length of sequence - 1892 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 + CDS 1 - 1356 1115 ## COG1205 Distinct helicase family with a unique 
C-terminal domain including a metal-binding cysteine cluster 2 1 Op 2 . + CDS 1357 - 1891 555 ## COG0210 Superfamily I DNA and RNA helicases Predicted protein(s) >gi|229783802|gb|GG667933.1| GENE 1 1 - 1356 1115 451 aa, chain + ## HITS:1 COG:MA2672 KEGG:ns NR:ns ## COG: MA2672 COG1205 # Protein_GI_number: 20091495 # Func_class: R General function prediction only # Function: Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster # Organism: Methanosarcina acetivorans str.C2A # 2 299 479 768 912 131 30.0 3e-30 AWSGGSFPAGDYSLRGIDPNRYKLLLKDSGKEITEMDEMQAFREIHPGAVYLHGGVQYQV VTLDLETRTARAVPFNGNYYTTPMGNTDITIIHGRKEKSWKRTEVVFGDVNVNAQVSLYR KLQFHNHQNLGYEQIQPSLSREFDTESIWLKLPGNVVTAYRRLLQESPNGKMIRNNHFEG LCYALQNAARLVTMTEQEDIGTTVSTNAVYAEKSTQESVFLFLYDQYTGGLGYAEKAYEL IPEIIENGIAMVGGCPCEDGCAACVGDYHLDKSMVLWGLKNLLEESEVPKHVKMAPYGPS TFIQKPFRFNGLEEQWEAFTAMLLKNGEPFASFLNTVAGVRTEERRLALITDQPFYREWI MEETNRKALHNLISYYTDAPAGFELEVEIRKKEENRRNGCDTGSGTAGKKAGNSGEGNTG IESDGFGSSEDDGDMRRKLEQRLDKLKKKEN >gi|229783802|gb|GG667933.1| GENE 2 1357 - 1891 555 178 aa, chain + ## HITS:1 COG:L0287 KEGG:ns NR:ns ## COG: L0287 COG0210 # Protein_GI_number: 15673102 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Lactococcus lactis # 10 139 8 132 758 77 35.0 1e-14 MDWEREWERLNPYQREAVTNENTACLVSANVGSGKTTVLTAKILYLHEVKHVSYRNIMVL TFTNRAAAEIRERLVSADPAVTPEETENFGTFHGVALGLLKKKLPVENLGYTKDFMVMEP DEELELAHTLIAEKKLKIKYKNRLRKRLSEAERVSAAGAAGGQDDLEILAGLLTEEKI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:36:18 2011 Seq name: gi|229783801|gb|GG667934.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld327, whole genome shotgun sequence Length of sequence - 2424 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 3 - 921 463 ## COG4962 Flp pilus assembly protein, ATPase CpaF 2 1 Op 2 . - CDS 915 - 1727 731 ## LM5578_1890 hypothetical protein 3 1 Op 3 . - CDS 1728 - 2423 469 ## LM5578_1891 pilus assembly protein CpaB Predicted protein(s) >gi|229783801|gb|GG667934.1| GENE 1 3 - 921 463 306 aa, chain - ## HITS:1 COG:RSp1085 KEGG:ns NR:ns ## COG: RSp1085 COG4962 # Protein_GI_number: 17549306 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Ralstonia solanacearum # 40 298 3 259 450 87 26.0 4e-17 MLGERKTRGVFNEGRNSEWSARMQELMPAQTAPAEPVVREPAEVVNKSADDIFFAAVEGK DFATVLQEVQEHIASRYSSLLTDGNSDDVKRQMKRYITKYVQENRVAVQGMSNQELVDNL YTEMAEFSFLTKYIFGEGVEEIDINSWKDIEIQYSGGRTEKLEEHFDSPAHAINVIRRML HVSGTILDDASPVVTGTLAKNIRIAVLKTPIVDEEVGLAASIRIVNPQSMKKEDFVSGGT ATGEMLDFLSECIRYGISVCVAGATSSGKTTVAGWLLTTIPDRKRIFSIENGSRELDLVR EYDGRV >gi|229783801|gb|GG667934.1| GENE 2 915 - 1727 731 270 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1890 NR:ns ## KEGG: LM5578_1890 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 270 1 271 271 338 61.0 1e-91 MLNFKGKKRVDKSDSKEEPETGKRKAQVLAVWGSPGCGKTTISVKIAKYLAEKKKNVILV ISDMTVPMIPCICPMDELENIWSLGNVLAAPHITKGLIEENCIVHKKIKYLAMLGMLRGE NEYTYAPYEAEQALEFLAGLRELSSYIIIDCGSYIANDILSAVALMEADAVLQMVNCDLK SISYLASQLPLLQEPKWKVERHLKAINNIKGQEAINHMEQVLGKVSFQIPSASEVTEQGM AGNLLAGLEGKKSRGFNNEIERIVKEVYGC >gi|229783801|gb|GG667934.1| GENE 3 1728 - 2423 469 231 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1891 NR:ns ## KEGG: LM5578_1891 # Name: not_defined # Def: pilus assembly protein CpaB # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 207 35 242 284 300 80.0 3e-80 KKIEIVRVVKDIKIGDKVTGDMVRSVEVGSLNLPSEVMKNKENVIGKYASADMVPGDYII NSKVADEPAAENAYLYNLSGEKQAISVSVKSFATGLSGKLKSGDIVSVIAPDYQKQGETV IPPELKYVEVIAVTAGSGYDANTGEEDPEEKELPSTVTLLATPEQSRILAMMEKDGNLHI SLVYRGTKENAAKFIAEQEQALQELYPPETEAVESEPETGQPEGETEGSAE Prediction of potential genes in microbial 
genomes Time: Fri Jul 1 03:36:28 2011 Seq name: gi|229783800|gb|GG667935.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld328, whole genome shotgun sequence Length of sequence - 2516 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 50 - 109 6.4 1 1 Tu 1 . + CDS 182 - 1891 1731 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Term 1928 - 1978 0.3 2 2 Tu 1 . - CDS 2390 - 2515 73 ## gi|323695156|ref|ZP_08109292.1| transposase Predicted protein(s) >gi|229783800|gb|GG667935.1| GENE 1 182 - 1891 1731 569 aa, chain + ## HITS:1 COG:SP0155 KEGG:ns NR:ns ## COG: SP0155 COG2972 # Protein_GI_number: 15900093 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 267 566 243 545 548 126 28.0 1e-28 MKWRMIGIMLMCWFLPFCLIIGVAGYYILSNRYGTVAQGIVDQVVYDGQICVERLNYGIS ASRKATYDQTLEKAWREYSRGIRSYGGLYTVSNDYLNKEYRQNQCFSTTILWCYEEPERM NCSVYNNSAGGTYQQVQTYWREDHERIKEFLSTLDTALGYINIDGRIYMVRNLLDNTYYP FGALIMRLNTSYCFGPLIDSPQGENVLVRLDGQTILLKGENMEEKIPKDSDVSQTGYSRR NGNMYVYDTQKEEYFTLSVLMQVDKSIIQAPFYGYPFVIAGMLLFLVPLLLILLRVFNRN VTKPITVMMEGADQIETGNLGYQIKEAPENLEFKYLMESFNQMSERLKYQFDHIYEEEIA LRDARIMALQSHINPHFMNNTLEIINWEARLGGNEKVSRMIEALATLMDAAMDRKKRTLV PLSEEMIYVNAYLYINSERFGKRLTVVKELDESIMQYEVPRLILQPVIENAIEHGVRPSG RGTVVLRGYEKEGYLYLEIVNDGTLTPAEEEKICRLLSPEYDFSREPSGNLGIANVNQRL RILYGEPCGLTIEKDDEQHVIARLTIAAR >gi|229783800|gb|GG667935.1| GENE 2 2390 - 2515 73 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|323695156|ref|ZP_08109292.1| ## NR: gi|323695156|ref|ZP_08109292.1| transposase [Clostridium symbiosum WAL-14673] transposase [Clostridium symbiosum WAL-14673] # 1 41 306 346 346 89 100.0 6e-17 YQVLNKVDMLEDADNMPGKWQLLILLGQQTILHLQEQLAAG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:36:34 2011 Seq name: 
gi|229783799|gb|GG667936.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld329, whole genome shotgun sequence Length of sequence - 1731 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 772 681 ## COG3119 Arylsulfatase A and related enzymes 2 1 Op 2 . - CDS 798 - 1616 889 ## COG0395 ABC-type sugar transport system, permease component 3 1 Op 3 . - CDS 1613 - 1729 101 ## Predicted protein(s) >gi|229783799|gb|GG667936.1| GENE 1 1 - 772 681 257 aa, chain - ## HITS:1 COG:MT0310 KEGG:ns NR:ns ## COG: MT0310 COG3119 # Protein_GI_number: 15839682 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 2 252 10 257 465 163 37.0 3e-40 MKKNILYLHTHDSGRYLEPFGVSVKTPNLMKLAKEGTLFRQAYCAAPTCSPSRVGLLSGQ SPHSTNMLGLAQRGFQMDCYDTHLSNFLRSQGYHTCLFGVQHEAPDAAMLGYDYWFDEDC HHSEFIRRDTAACTRAADYLLHYQEEKPFFMSVGFINTHRRYPEAPADFVNPDYVKPPMT IPDTKENRKDMADFMYSARIADDSVGVVLDALKRSGREDDTIVIFTTDHGIAFPFMKCSC YDTGIGVTLIMKFKPFR >gi|229783799|gb|GG667936.1| GENE 2 798 - 1616 889 272 aa, chain - ## HITS:1 COG:mlr7002 KEGG:ns NR:ns ## COG: mlr7002 COG0395 # Protein_GI_number: 13475832 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 12 271 21 287 288 142 33.0 5e-34 MKKRLFTVLANLAAWIASLIVLVPMLVVLVNSFKSQKESMVMTLTLPEKFMFENYTTVIE RGKLLSSFFNSALYATFSVLLILAVVSAASYVLSRNRSRLHRVIYYLIVLGIAMPVNNVA LMKIMKATALINTRYGLIFLYVAMNIPLALFLMFGFVETLPREIDEAAVIDGAGPMRLFS SVILPLMKPTIVTVGILNFMAIWNDFSMPLYFMNNSGKWPMTLAVYNFFGQFEQSWNLVS ADITLTALPVLIIFVFGQKYIVGGVSAGAVKG >gi|229783799|gb|GG667936.1| GENE 3 1613 - 1729 101 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no FSKGNYAMGTTLSSLLFIFVMMISYFILKAVGNKEVEQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:36:39 2011 Seq name: 
gi|229783798|gb|GG667937.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld330, whole genome shotgun sequence Length of sequence - 2775 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 76 - 930 756 ## COG0642 Signal transduction histidine kinase + Prom 1278 - 1337 9.4 2 2 Op 1 . + CDS 1363 - 2220 842 ## CD1125 nitroreductase family protein 3 2 Op 2 . + CDS 2289 - 2774 170 ## gi|266625546|ref|ZP_06118481.1| conserved hypothetical protein Predicted protein(s) >gi|229783798|gb|GG667937.1| GENE 1 76 - 930 756 284 aa, chain + ## HITS:1 COG:BS_resE_4 KEGG:ns NR:ns ## COG: BS_resE_4 COG0642 # Protein_GI_number: 16079368 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 90 280 11 187 269 93 32.0 5e-19 MVMACFALPVLGVFLLPNTAAVAGLFLALIPNGLAAALALRSGKVGRDPYRLLYLFSLFL FIIYAPLTKTVLEAKLYIPGTVSNLFLILSQCVMLSRSYADAHEQVERVNENLERIVEER TAQLNDTNRQLSASQAALREMISNISHDLKTPLTVLNNYLELLGDDAIASNEKERAEYIG IAYHKNLDLQRLIHNLFEVTRMEAGTAVYRLEWVEGSRLLEEAGSKYGDLIRDKELSFSV GADCAAELLIDRHKIWSVLDNLVYNALRHTPKGGSISLNLCEAG >gi|229783798|gb|GG667937.1| GENE 2 1363 - 2220 842 285 aa, chain + ## HITS:1 COG:no KEGG:CD1125 NR:ns ## KEGG: CD1125 # Name: not_defined # Def: nitroreductase family protein # Organism: C.difficile # Pathway: not_defined # 1 285 1 289 289 182 34.0 2e-44 MDYRKAAEARKSIREFTDKPVSGELKEALKKSFAECRRLVPDIKTEILILDQSECGCLQG NAGYEGLTFEAPAYLLLLSEVKPGYLENAGYMNEDIILHMTDMGLDSCWLTVDDDPALRA ALKIGDDRKVAAITAFGYGKKERSLTRLHIRSISDVDVITREGHLAPKISLAQMVYGDAW GKEADLDEMYIDDGLRDALYAASYAPTFLNRQSFRLLLADGKAILVKMMDSMTGELDARL NCGAVMLNFAAALDERRPFQTKWTLGSLSGYELPADCEIVGYCSL >gi|229783798|gb|GG667937.1| GENE 3 2289 - 2774 170 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625546|ref|ZP_06118481.1| ## NR: gi|266625546|ref|ZP_06118481.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] 
conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 161 9 169 169 306 100.0 3e-82 MLKERLEAVMKATKITRIQKLASLAVTAVLTGSLLILGGFASDAQEPERQIQTDSDITDG WLNDSKNPESKKWRHSFKTEGFFVNQYGIRLAWNNDSSLYQTTRQVTAGEVYTVSFIEEL KQYADNPDVMEAIRLAILEERETEENSRRIYMRQSPPFWRD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:36:53 2011 Seq name: gi|229783797|gb|GG667938.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld331, whole genome shotgun sequence Length of sequence - 3829 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 2, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 9 - 368 252 ## CLL_A1913 prophage protein 2 1 Op 2 . + CDS 380 - 622 415 ## gi|266625548|ref|ZP_06118483.1| transcription termination factor Rho 3 1 Op 3 . + CDS 623 - 952 301 ## gi|266625549|ref|ZP_06118484.1| prophage protein 4 1 Op 4 . + CDS 918 - 1289 140 ## TepRe1_1056 hypothetical protein 5 1 Op 5 . + CDS 1322 - 1822 339 ## SPJ_1857 hypothetical protein 6 1 Op 6 . + CDS 1822 - 2163 335 ## SPJ_1856 conserved structural protein, putative 7 2 Tu 1 . 
+ CDS 3095 - 3827 538 ## Ccel_3035 Phage-related protein-like protein Predicted protein(s) >gi|229783797|gb|GG667938.1| GENE 1 9 - 368 252 119 aa, chain + ## HITS:1 COG:no KEGG:CLL_A1913 NR:ns ## KEGG: CLL_A1913 # Name: not_defined # Def: prophage protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 4 118 192 306 311 139 57.0 4e-32 MNTTFSKGGIDTAVPSVDGVPIISTPSNRMYTVIKINDGKTGGQERGGYEKGESAKNINF FICPRTTPIAVTKQDIMRIFDPTINQKLNAWQMDYRRFHDIWVLDNKLDSIYLNIKESA >gi|229783797|gb|GG667938.1| GENE 2 380 - 622 415 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625548|ref|ZP_06118483.1| ## NR: gi|266625548|ref|ZP_06118483.1| transcription termination factor Rho [Clostridium hathewayi DSM 13479] transcription termination factor Rho [Clostridium hathewayi DSM 13479] # 1 80 1 80 80 101 100.0 2e-20 MRLIMNNVERIAEDDVQIRKLMAAGFEPLGEAACEEKAALKPELEKMKVEELKALAKEKG IEGAASLTKDELQAVLKEVV >gi|229783797|gb|GG667938.1| GENE 3 623 - 952 301 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625549|ref|ZP_06118484.1| ## NR: gi|266625549|ref|ZP_06118484.1| prophage protein [Clostridium hathewayi DSM 13479] prophage protein [Clostridium hathewayi DSM 13479] # 1 109 1 109 109 166 100.0 6e-40 MTETEKLKLLTGESDEDLLSLLLADATEYVLGYTNRTELPAALNKSVRDLAVIAYNRLGT EGETGRSEGGESYSFDTAPKQIHDVLDRYRLVGVGGRRYEAKKKPTENV >gi|229783797|gb|GG667938.1| GENE 4 918 - 1289 140 123 aa, chain + ## HITS:1 COG:no KEGG:TepRe1_1056 NR:ns ## KEGG: TepRe1_1056 # Name: not_defined # Def: hypothetical protein # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 1 123 1 108 110 65 33.0 8e-10 MRLKRNRLKTYSHRSAISKKDNEGNSYIEYGLPSSFEAEVWPAGGKLQAEMYGQRVNHIQ NCRINAGYEVMADEKGRVSYRIGYTTIQEGDGICVYVPGEQEPDYRIVAIRPYRYLTLEV ERI >gi|229783797|gb|GG667938.1| GENE 5 1322 - 1822 339 166 aa, chain + ## HITS:1 COG:no KEGG:SPJ_1857 NR:ns ## KEGG: SPJ_1857 # Name: not_defined # Def: hypothetical protein # Organism: S.pneumoniae_JJA # Pathway: not_defined # 58 166 7 108 108 123 56.0 4e-27 
MKKYGELAEQAAGESMGRAVGASAKLIQGEAKLLCPGNDGELRRSIKTRVETAEEGTIGA VYTNKKYGPYVELGTGPRGEADHAGISPLVTPAYSQSPWWIHESQIDAKIAEKYHWFYID TPEGRFYQCSGQPAQPFLYPALKNNEERVTRNIANYVAREIRKVCK >gi|229783797|gb|GG667938.1| GENE 6 1822 - 2163 335 113 aa, chain + ## HITS:1 COG:no KEGG:SPJ_1856 NR:ns ## KEGG: SPJ_1856 # Name: not_defined # Def: conserved structural protein, putative # Organism: S.pneumoniae_JJA # Pathway: not_defined # 1 110 1 110 122 101 45.0 1e-20 MINVKDEVYAALCTVTENVTDYYPRSWENDISIQYMEEDNKVAEESGRGEVKSYVRYRID IWSRKSTSAAAVAVDAAISPLGLKRTQCMDVEDPSGLKHKQMRYEGIIDVRTS >gi|229783797|gb|GG667938.1| GENE 7 3095 - 3827 538 244 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3035 NR:ns ## KEGG: Ccel_3035 # Name: not_defined # Def: Phage-related protein-like protein # Organism: C.cellulolyticum # Pathway: not_defined # 8 233 129 357 841 165 41.0 1e-39 MGEAGYQMSAAITGLTGDVASFYNLSTDEAYTKLKSIFTGETESLKELGVVMTQTALDQY ALNNGFGKTTAKMTEQEKVMLRYQFVMSSLADASGDFARTSTSWANQVRVLSLQFQSLKA TIGQGLINAFTPVIRVINTILAKLQTLAAYFKAFTVALFGDAGGSSNIADSMESAAGSSG AVADNMDKAAGAAKKMKEYTLGIDELNVLNPDEGNGGSGGGAGGGGSLDFGDMSGELFGE VTVN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:37:22 2011 Seq name: gi|229783796|gb|GG667939.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld332, whole genome shotgun sequence Length of sequence - 2133 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1945 927 ## Acid_0198 glycoside hydrolase family protein 2 1 Op 2 . 
- CDS 1961 - 2068 123 ## gi|266625555|ref|ZP_06118490.1| endochitinase Predicted protein(s) >gi|229783796|gb|GG667939.1| GENE 1 1 - 1945 927 648 aa, chain - ## HITS:1 COG:no KEGG:Acid_0198 NR:ns ## KEGG: Acid_0198 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: S.usitatus # Pathway: not_defined # 29 634 35 634 932 422 39.0 1e-116 MREWFKQDLKESKNIHRPLRIHEEEMRETRCLRKRVVSSILLDDLTSLDHWETEGPYARM ELSKEHTRQCVNSLKFTSVTKLDHWPNAFGRIYTVPKVIRKVNHENWEHINRISFWVYAD MPGFRSISFRMQFRNGGDHPVPDRYDREGHHNVNMTNGKWQYVSVEIPYVHRDDVLGLSF EYDMVGGEPGATETACWYITDIMLETLAPEDLDVYEGWQPGEHRIAFSGSGYQAGSMKTA VVNEPEAKSFKLVETASGRTVLEKEIEVQCDSNGVHQVLDFSEIQTPGEYLLVCGSVVSR SFPIGTGIWEDAIWKTLNFLFCERCGCEVPGKHYSCHHDVTIRHGERSIIANGGWHDAGD MTQVTTNTSECTYALFELLDKVRDNEPLFERILEEAKWGLEWLLKTRFGDGYRVPSSSKS CWTNGIIGDDDDIFREAALEPIENFMCAGAEALAYMELREIEPELAAYSLKCAKEDWNFA YTNREMPDFCKFDDPNRICTPLLKSSAACWSAVDLYKATEEVYYRDMAICFADEIIACQQ QEITDWDTPYVGFFYDDASHKLIAHCSHRSHDNEPIQSLARLAACMPDHENFGKWMYAVN LYSDYIQNASEATAPYYVVPASIYHEDEAMHDNEYDMGMIKKYLTDDE >gi|229783796|gb|GG667939.1| GENE 2 1961 - 2068 123 35 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625555|ref|ZP_06118490.1| ## NR: gi|266625555|ref|ZP_06118490.1| endochitinase [Clostridium hathewayi DSM 13479] endochitinase [Clostridium hathewayi DSM 13479] # 1 35 22 56 56 67 100.0 2e-10 MNPDEHPEIQDNWRWAEQWLPHASWFLYAVGLKMK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:37:37 2011 Seq name: gi|229783795|gb|GG667940.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld333, whole genome shotgun sequence Length of sequence - 1606 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 + CDS 67 - 921 414 ## COG2207 AraC-type DNA-binding domain-containing proteins 2 1 Op 2 . 
+ CDS 914 - 1604 156 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|229783795|gb|GG667940.1| GENE 1 67 - 921 414 284 aa, chain + ## HITS:1 COG:BH3634_1 KEGG:ns NR:ns ## COG: BH3634_1 COG2207 # Protein_GI_number: 15616196 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 8 116 6 112 132 101 47.0 2e-21 MGNQSVISIEAVIDYIESHLDGKLELETVAEAVHYSKYHLHRLFTETVGMTIHDYVQRRQ LTEAAKLLVFSDKPIIEIAFICGYESQQSFSLAFKAMYKSPPAEYREERSFYPLQLRFIL HRRTTAMEFTIQDIRLAEKKDIVDWMNLMRLVIDGYPVMDEDDYLAKLEESIDEKRALVL REGDILIGAMAFTYSPGSIEFLGVHPQYRNRGLQKLFLDALLETYLQGQEISTTTFREQD KADTGHRDMLLQLGFAEKELLTEFGYPTQRFVLPPKKQEDIQHG >gi|229783795|gb|GG667940.1| GENE 2 914 - 1604 156 230 aa, chain + ## HITS:1 COG:SP1116 KEGG:ns NR:ns ## COG: SP1116 COG0477 # Protein_GI_number: 15900983 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Streptococcus pneumoniae TIGR4 # 35 230 28 223 392 75 25.0 6e-14 MDDTQRTGWQKRILLFLTSQCITLFGSTLVQMALVWYATMQTSSGVWVAAFTVCSYLPQF LISFIGGVWADRYHRKKLIIGADLLIAFVTFVMVLAIPHISSEPALLGGLLVMSVIRSLG AGIQTPAVNAVIPQLAPEDQLMRYNGINATMQSIVNFAAPAAAGAVFAISTLRMTLMIDI VTAILGTGLLSCLALPKQNISIEKASVFSDMKIGVKYAFADKLIGKLLII Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:37:38 2011 Seq name: gi|229783794|gb|GG667941.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld334, whole genome shotgun sequence Length of sequence - 2432 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 79 - 1815 1417 ## CD1108 putative DNA-repair protein 2 1 Op 2 . - CDS 1808 - 2185 462 ## gi|166033488|ref|ZP_02236317.1| hypothetical protein DORFOR_03214 3 1 Op 3 . 
- CDS 2188 - 2430 218 ## Ethha_1897 hypothetical protein Predicted protein(s) >gi|229783794|gb|GG667941.1| GENE 1 79 - 1815 1417 578 aa, chain - ## HITS:1 COG:no KEGG:CD1108 NR:ns ## KEGG: CD1108 # Name: not_defined # Def: putative DNA-repair protein # Organism: C.difficile # Pathway: not_defined # 6 578 71 646 646 691 64.0 0 MNKEPRLRFTDEERSDPALEKPIRKAEKAAARADKAQANIPKKKVRQTVIDPDTGKKTSK LTFEDKKKPPSKVSQGVREAPVHLVAGKLHKEIRETEQDNVGVESAHKSEEVVETGAYLV REGYRSHKLKPYRKAAQAERQLEKANVNVLYQKSLQENPQFASNPLSRWQQKQTIKKQYA AAKHGGQTAGNTAQAASKTGKAARTVKEKAQQAGAFVMRHKKGFLMAGVLFLITCMLMNT MSSCSMMAQSIGSVLSGTTYPSDDPEMLAVEADYVAREVQLQEEIDNIESSHPGYDEYRY DLGMIGHDPHELAAFLSAVLQGYTRQSAQAELVRVFAAQYQLTLTEEVEIRYRTETSTDP ETGETTSEEVPYEYYILNVKLASKPISAVVSELLTTEQMKMYQVYRQTMGNKPLLFGGGS PDTSGSEDLSGVQFINGTRPGNPQLVELAKRQVGNVGGYPYWSWYGFDSRVEWCACFVSW CYNQAGKSEPRFAGCEWQGVPWFQSHGQWGARGYNNLAPGDAIFFDWDLDGTADHVGIVI GTDGSRVYTVEGNSGDACKIKSYDLNYQSIKGYGLMNW >gi|229783794|gb|GG667941.1| GENE 2 1808 - 2185 462 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|166033488|ref|ZP_02236317.1| ## NR: gi|166033488|ref|ZP_02236317.1| hypothetical protein DORFOR_03214 [Dorea formicigenerans ATCC 27755] hypothetical protein DORFOR_03214 [Dorea formicigenerans ATCC 27755] # 1 125 3 127 127 239 98.0 4e-62 MNPNILNQNPLMFFDRAVNAQRSQLLTVMADAVSECRTAADQAAELNETGQVGLLRLAEI WSTIRAKEGMGGLVLEGTEAKILSDVVAQFYAYLSGCMFNDPVGMAIYAELHYMMSSLML GEWFE >gi|229783794|gb|GG667941.1| GENE 3 2188 - 2430 218 80 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1897 NR:ns ## KEGG: Ethha_1897 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 78 712 789 794 145 84.0 4e-34 FENSDFVYMLNQAGGDRQILAKQLGISTHQLSYVTHSGEGEGLLFYGSTILPFVDHFPKN TELYRIMTTKPQELKKKEDE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:37:55 2011 Seq name: gi|229783793|gb|GG667942.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld335, whole genome shotgun sequence Length of sequence - 1798 bp Number of predicted genes - 1, with 
homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1738 901 ## Closa_0424 LPXTG-motif cell wall anchor domain protein Predicted protein(s) >gi|229783793|gb|GG667942.1| GENE 1 1 - 1738 901 579 aa, chain - ## HITS:1 COG:no KEGG:Closa_0424 NR:ns ## KEGG: Closa_0424 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 334 548 317 554 4700 111 35.0 1e-22 MPKALCSVGFSIISADASEGENRGSTFAYYGSQNLLKEGASFPFYGKNHNRVGLWPYGIT NVEGGHSAPGYCLEPNKSMRSGTPGTIVTYDLDTDGDNLPLGLTREDAEILWYALSSSGN FEGGISGNGKIGQGHYILGQAATWAIMSGNWNGLDDFRSQMEVLIENLKDPMLAVLTRGA LEQFFKQVNGAVEEGAVPPFASKFQSQAPVHKMKENGDGTYSITLEFDGDDWRQSTLVYD LPEGWNVSLEHGRITFTCTTGNPDIGLVRGHFQDGSLGAQYWVKPNSFKIWYPDGWNESS AVDGKQAMITMAGKQESWEVWLSFGKSTTHRGEGDYEIPYTQYLHEETFKRDYVIELEKQ CSETGKTLENSTFEVLEKFEFSQLDGTNLEKDQFMKMVPTSEGKFEDLTVCDMGLGTDAN GHFSHSDKKLYKYEKTYCGGHPDPVIHYVDGDSDSADEENERRKKKAWDAWQECVDWCEE NCDFHSIDEGVARDLMEEDRDEAWSTYIHLKRIYTVRETDARTGYILHDLHNDDVSIEIV EFSSSQSEGEGAITGYYPGNRAVSVREMEDVPQLSKSEK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:38:05 2011 Seq name: gi|229783792|gb|GG667943.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld336, whole genome shotgun sequence Length of sequence - 2327 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 20 - 79 8.3 1 1 Op 1 . + CDS 116 - 592 339 ## COG1720 Uncharacterized conserved protein 2 1 Op 2 . + CDS 605 - 1174 198 ## PROTEIN SUPPORTED gi|229087394|ref|ZP_04219532.1| Acetyltransferase, including N-acetylase of ribosomal protein + Term 1261 - 1323 11.4 3 2 Op 1 . - CDS 1317 - 1913 458 ## COG1573 Uracil-DNA glycosylase 4 2 Op 2 . 
- CDS 1913 - 2326 355 ## COG2378 Predicted transcriptional regulator Predicted protein(s) >gi|229783792|gb|GG667943.1| GENE 1 116 - 592 339 158 aa, chain + ## HITS:1 COG:MK0151 KEGG:ns NR:ns ## COG: MK0151 COG1720 # Protein_GI_number: 20093591 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanopyrus kandleri AV19 # 4 136 25 155 173 105 43.0 3e-23 MNNLIVRPVGKIVFQNEMPLILLEKPYRSALEGLDGFSHIQIFWWFDACDNEASRGVLKV TSPYRNSPETLGTFATRSPERPNPVALSTAQILRIDPAAGIIRLSHIDARDNTPVIDLKP YTPSLDRVAQPGVPEWCASWPKSLEESADFDWSGVFTF >gi|229783792|gb|GG667943.1| GENE 2 605 - 1174 198 189 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229087394|ref|ZP_04219532.1| Acetyltransferase, including N-acetylase of ribosomal protein [Bacillus cereus Rock3-44] # 9 180 20 193 198 80 30 9e-16 MSLKDYLTQKPFLETERLILRELKPEDTDDLREWMSDPGLYLYWGKRPGRHDLNPELLFT AANRRPTKSFHWGIANKKDNKVIGEFWVYLIENDRMAKVSYRLSPACQGNGLMTETLKSV VDFCFTHTELQRLWTDVDVRNTASCKTLEKAGFTREGCIRQGKMVSSWCDYYLYGLLKTD LLPEHESQA >gi|229783792|gb|GG667943.1| GENE 3 1317 - 1913 458 198 aa, chain - ## HITS:1 COG:MA2265 KEGG:ns NR:ns ## COG: MA2265 COG1573 # Protein_GI_number: 20091103 # Func_class: L Replication, recombination and repair # Function: Uracil-DNA glycosylase # Organism: Methanosarcina acetivorans str.C2A # 33 197 1 165 165 150 49.0 1e-36 MNIREELEKRNGNHEAFVFQDVEVKPERIRAVLINEVVPSNPADDFYGGDGAAYLSTAIP LFQKAGAQADSAGELLEQGIYLTNAVKMPKTAPAIERSSIEKSLPYLEWELSLFPHVKVI MLMGDVAKKAFNMISKKAIGKHVIPPVSTYKIRKSEFWYQGIRIMPSYIMTGQNILIEKS KAEMAAEDISVMLSIIKE >gi|229783792|gb|GG667943.1| GENE 4 1913 - 2326 355 137 aa, chain - ## HITS:1 COG:CAC3494 KEGG:ns NR:ns ## COG: CAC3494 COG2378 # Protein_GI_number: 15896731 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 2 134 169 299 300 73 28.0 1e-13 IYKDKAWYLYGWCLAREDCRLFRLSRIKEIVLTGIQFDRIPEGTEPDLTRWENEKEPVSV DLRFPEEAASRVYDVFDDRAVTKRGGELTVRAVIPDNEWLYGFLMSFGDKVTVLSPLSLK RAMEKRLKSALQHYEEK Prediction of potential 
genes in microbial genomes Time: Fri Jul 1 03:38:06 2011 Seq name: gi|229783791|gb|GG667944.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld337, whole genome shotgun sequence Length of sequence - 2034 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 423 441 ## gi|266625566|ref|ZP_06118501.1| conserved hypothetical protein - Prom 454 - 513 6.7 2 2 Tu 1 . - CDS 529 - 1860 1083 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 1929 - 1988 4.5 Predicted protein(s) >gi|229783791|gb|GG667944.1| GENE 1 3 - 423 441 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625566|ref|ZP_06118501.1| ## NR: gi|266625566|ref|ZP_06118501.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 140 1 140 141 286 100.0 5e-76 MSIAIARANEIQYRVENGYGETEVLPGVYPLVRGFKCVLKAGHDVKPEVYADKTAIYCFT RGTGYITDGLKAYNIDELSFYIPDFSREFCIHAVSDMEIMKIVVDLLESDKAAYEDTHMV LPAFKRFSETEPYDQSCKGP >gi|229783791|gb|GG667944.1| GENE 2 529 - 1860 1083 443 aa, chain - ## HITS:1 COG:SMb21221 KEGG:ns NR:ns ## COG: SMb21221 COG1653 # Protein_GI_number: 16264473 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 131 365 97 327 412 65 25.0 2e-10 MFLAAKAADRIIKNEKDSVNFMQRGTKVCVMAALSAVLLLTACTGVEMGTLESAGSRADQ TENVDLTVMIDETTVLYYRDMVEELRQEFSEYGIHSIEWSLPEVEKTVKTAMARKESIEV VKWFPNQMENFISSSAAMDLTPYMDGEWKNIWEDGALDIGTYDGKVYCLPSVVVYPVLEV NEEILERAGVTVREAWTWEEFVEACGRIKSRTDVYPFGIRDSRVCWFMRNALLQIWDDAE EMKRFQSGELSFYDSRIAEAFDRVTSLFRQDYAYPGQRAFSQTNEQIDAAFERGEIAMMF NVNNSVKESLERMEAAGRGRIRVISFPTMASADCDYLLGGCEGFFIPSNTDHPDEAIRLL KFLTSSRIFTELKNQGFAVPVNLGEEEKKITSDSGKVFSQELMNLSSRLYNYINYELPAA YYMDKENTLKELETMRLEAVSVR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:38:15 2011 Seq 
name: gi|229783790|gb|GG667945.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld338, whole genome shotgun sequence Length of sequence - 2014 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 568 298 ## LM5578_1889 pilus assembly protein CpaF 2 1 Op 2 . + CDS 565 - 1494 636 ## LM5578_1888 hypothetical protein 3 1 Op 3 . + CDS 1499 - 2012 303 ## LM5578_1887 hypothetical protein Predicted protein(s) >gi|229783790|gb|GG667945.1| GENE 1 2 - 568 298 188 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1889 NR:ns ## KEGG: LM5578_1889 # Name: not_defined # Def: pilus assembly protein CpaF # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 182 327 508 511 287 81.0 1e-76 VVHTLTRDSENDRQRIDQIALLDIALRFNPDILVVGEMRGAEANAAQEAARTGVAVVTTI HSNSCEATYRRMVSLCKRAVDMSDETLMGYVTEAYPIIVFCKQLENRQRVIMEIMECEIK PDGSRKYRPLYQYEILENQLEGDYFVVNGKHRAMNTISDGLYKRLLENGMPMETLNRFRS KGEGGEEA >gi|229783790|gb|GG667945.1| GENE 2 565 - 1494 636 309 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1888 NR:ns ## KEGG: LM5578_1888 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 309 1 309 309 462 72.0 1e-128 MTSIQLFACLGLIVGGFLLFGIRPMEFTDGLFGFLSKEPSNIRNEIQTVTKRKKARFLKK AIIEAQQTLKMTGREKQFSVICVVALILFTVGASAAIMIGNFFLAPVMAVGFMFFPFWYV ILSASSYKKNVAAELETALSVITTGYLRTEDILTAIEENIQYLNPPVQRVFQDFIVRINL VNPDIHEALKELREKIDNDVFQEWCDALCDCQYDRSLKTTLTPIVSKLSDIRIVNAELEY LIVEPRKEFITMAIFVIGNIPLLYFLNKSWYDTLMHTPMGQVILAITAAIIFVSTAVVVR LTKPLEYRR >gi|229783790|gb|GG667945.1| GENE 3 1499 - 2012 303 171 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1887 NR:ns ## KEGG: LM5578_1887 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 3 171 2 170 290 202 60.0 4e-51 MSVLLLLLFGGFLFLGMFFLAADLLKVPYMKTEKALLDTGRNNKKGSAAFDAFLLQGAMR LSHVIRMDEYRKSRMKNVLKATGIKMTPEVYQAYALTKAGLLMAGIIPCLLLFPLLTPIV 
VFLAVMVYFKEIQRADEMLKSKRKSIEDELPRFVANLEQELQNNRDVLSMV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:38:27 2011 Seq name: gi|229783789|gb|GG667946.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld339, whole genome shotgun sequence Length of sequence - 2109 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 20/0.000 + CDS 1 - 1161 1093 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component 2 1 Op 2 . + CDS 1212 - 2109 905 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components Predicted protein(s) >gi|229783789|gb|GG667946.1| GENE 1 1 - 1161 1093 386 aa, chain + ## HITS:1 COG:BH0251 KEGG:ns NR:ns ## COG: BH0251 COG0683 # Protein_GI_number: 15612814 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Bacillus halodurans # 14 385 44 415 419 498 65.0 1e-141 SDSTTAAVTTSAGETIKVGILHSLSGTMAISETSVRDAELLAIKEINSAGGVLGKQIVPV VEDGASDNATFKEKAEKLLLNDKVATVFGCWTSASRKAVLPTFESNNGLLWYPVQYEGME SSPNIFYIGAAPNQQIVPAIEYMNENYGKKVFLLGSDYVFPRTANSIVKKQAESLGMEVV GEEYTPMGHTDYTTILSKIKQAEPDFIFNTLNGDSNVAFFKQFKDAGLTPEQIQTLSVSI AEEEAAGIGASYLEGHLTAWNYYQTTDTPENKKFVEAYKAEYGTDRVTDDPIEAGYDAVY LWAAAVEAAGSTDVDSVKAAAAGISIEAPEGTITIDGENQHIYKPVRIGKINEEGLIDEV WSTPEPVKPDPYLKGYDWASGLAGAQ >gi|229783789|gb|GG667946.1| GENE 2 1212 - 2109 905 299 aa, chain + ## HITS:1 COG:BH0250 KEGG:ns NR:ns ## COG: BH0250 COG0559 # Protein_GI_number: 15612813 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Bacillus halodurans # 5 299 2 295 297 266 57.0 3e-71 MELFLMQLFNGISVSSILLLAALGLAITFGLMGVINMAHGEFIMIGAYTAYVVQNIFQAY FPSQVFDVYCIVAIILSFLAAASLGYVLERLVICRLYGRAADSLLVTWGISLILQQAARS 
IFGSPNVGVKAPAFLERTIRISGMVALPMKRIFILLIAVCCLTGVYVLMYKTRQGRNIRA VMQNREMAASLGVNTKRIDSMTFAIGSGLAGLAGCALTWIGAIGPTLGTNYIVDTFMTVV VGGAGSILGSVLGAGFIGIGQTAFEFLTSASIGKVLIFACVIVLLQFRPKGIFAVQNRA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:38:28 2011 Seq name: gi|229783788|gb|GG667947.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld340, whole genome shotgun sequence Length of sequence - 1813 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 13 - 41 -0.3 1 1 Op 1 . - CDS 121 - 978 753 ## COG1496 Uncharacterized conserved protein 2 1 Op 2 1/0.000 - CDS 1066 - 1407 291 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 3 1 Op 3 . - CDS 1419 - 1811 472 ## COG0164 Ribonuclease HII Predicted protein(s) >gi|229783788|gb|GG667947.1| GENE 1 121 - 978 753 285 aa, chain - ## HITS:1 COG:BS_ylmD KEGG:ns NR:ns ## COG: BS_ylmD COG1496 # Protein_GI_number: 16078601 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 36 284 30 277 278 165 35.0 1e-40 MRDIWKRKQEMAVLDYRTVNGVPYLAFPALERTGIVCHGFSTRMGGVSTGVYSTMNFALL KGDDPDHVRENYRRMAKAIGVDQEKMVLSWQTHTTNIRKVTEDDAGKGITRERDYHDIDG LITDVPGLTLVTFYADCVPLYFLDPVRRAVGLSHSGWRGTVGRMGQATVEAMGREYGSNP GDIIACIGPSICQDCYEVGEEVAAEFRRAFSEKYWPELLCAKPEGKYLLNLWRANEIVLL EAGVKAENIQVTDICTHCNPDYLFSHRTMGSERGNLAAFLSIKEL >gi|229783788|gb|GG667947.1| GENE 2 1066 - 1407 291 113 aa, chain - ## HITS:1 COG:STM3265 KEGG:ns NR:ns ## COG: STM3265 COG0792 # Protein_GI_number: 16766563 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Salmonella typhimurium LT2 # 1 113 14 127 131 85 41.0 2e-17 MNKHEIGSGYEEMAAAYLIEQGYKIIARNFSDRRGEIDIIARDGEYLVFVEVKYRRDEKQ GNPAEAVDLRKQQHIRHAAEYYLYKNRVSDAMPCRFDVVAILGDRITLIRDAF 
>gi|229783788|gb|GG667947.1| GENE 3 1419 - 1811 472 130 aa, chain - ## HITS:1 COG:CT029 KEGG:ns NR:ns ## COG: CT029 COG0164 # Protein_GI_number: 15604747 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Chlamydia trachomatis # 3 124 84 204 217 134 57.0 5e-32 KAVSFAVGVVGPERIDEINILQATYEAMRLAVSQLTNPPQVLLNDAVTIPGLDILQVPII KGDAKSVSIAAGSIMAKVTRDHMMVEYDKLFPEYGFAKHKGYGTAAHIAALKEFGPCAIH RRSFIRNFVD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:38:29 2011 Seq name: gi|229783787|gb|GG667948.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld341, whole genome shotgun sequence Length of sequence - 2356 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 317 154 ## COG0491 Zn-dependent hydrolases, including glyoxylases 2 1 Op 2 . - CDS 329 - 832 259 ## COG1396 Predicted transcriptional regulators - Prom 852 - 911 7.5 - Term 1035 - 1085 10.1 3 2 Op 1 . - CDS 1095 - 2216 410 ## COG0642 Signal transduction histidine kinase 4 2 Op 2 . 
- CDS 2267 - 2347 59 ## Predicted protein(s) >gi|229783787|gb|GG667948.1| GENE 1 2 - 317 154 105 aa, chain - ## HITS:1 COG:TM0607 KEGG:ns NR:ns ## COG: TM0607 COG0491 # Protein_GI_number: 15643373 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Thermotoga maritima # 13 105 3 91 282 65 35.0 2e-11 MYELIQVGTNSYYIQSPAKIGLYVENEKDVCLIDSGNDKDSGRKVRQILEAHNWRLKAIY NTHSNADHIGGNKYLQGQTMCKIYAPGIDCDFTRHPLLEPSFLYG >gi|229783787|gb|GG667948.1| GENE 2 329 - 832 259 167 aa, chain - ## HITS:1 COG:ECs2037 KEGG:ns NR:ns ## COG: ECs2037 COG1396 # Protein_GI_number: 15831292 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 16 165 36 178 178 94 35.0 9e-20 MKEGSMTTPTEMIGMMLGQIERGQSSPTINTLWKIATGLKIPLSLLLEEQSSEYTIAAAE EQNVILEDNGRMKAYPIFAYDPIRSVESFYITFEPDCKHTSDKHNDGVEEQVFVLHGSLQ LVLNGKTIIVTENQAIRFHADIPHEYNNPFADSCTVYDMIFYSKNEK >gi|229783787|gb|GG667948.1| GENE 3 1095 - 2216 410 373 aa, chain - ## HITS:1 COG:MTH444 KEGG:ns NR:ns ## COG: MTH444 COG0642 # Protein_GI_number: 15678472 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanothermobacter thermautotrophicus # 121 360 135 369 373 141 35.0 2e-33 MEESAVILIVDDMPEQLRITASLLKEHGYSVRAAISGSAALHLVEREIPDLILLDIHMDD IDGFTVCQKLRSSSRYEDTAILFMTASGDHDSLQKGFEAGAQDYIIKPCHASELLARVST HVQIVTQARRLKAAYQEMDQFCHSVSHDLKSPVQVMKQLTGCLREELFSSPTGIAEGTEA ILSRLDEKCIQMESMIGRLLDFSQMTTVKYHPELLDAAAMIREIFEELSSLEPDRVILLS DSQFACPPIPGDPTLLRLLFQNIIGNAVKFTKGRDPAVITISARQTSSHTLLEVKDNGTG FDMTDSGRLFRVFERLHTQAEYEGTGVGLAISKRIMEKHGGTISIIGKKGKGATVTLKFP VKLTPLFSSSRTP >gi|229783787|gb|GG667948.1| GENE 4 2267 - 2347 59 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVYYEHWVHPLTNLIRTIPTTWSGLL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:38:35 2011 Seq name: gi|229783786|gb|GG667949.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld342, whole genome shotgun 
sequence Length of sequence - 2336 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 561 565 ## COG0191 Fructose/tagatose bisphosphate aldolase - Prom 590 - 649 4.4 2 2 Tu 1 . - CDS 708 - 1718 902 ## COG1609 Transcriptional regulators - Prom 1747 - 1806 7.5 3 3 Tu 1 . - CDS 1831 - 2334 401 ## gi|266625581|ref|ZP_06118516.1| putative sodium:proline symporter Predicted protein(s) >gi|229783786|gb|GG667949.1| GENE 1 3 - 561 565 186 aa, chain - ## HITS:1 COG:SA1927 KEGG:ns NR:ns ## COG: SA1927 COG0191 # Protein_GI_number: 15927699 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Staphylococcus aureus N315 # 2 183 3 186 286 138 41.0 5e-33 MIVQLDRLYADAMKRRYIVGAFNVFNYDSLCAVLEAAEEEQSPVILQVSMGARNYVPDFR QFIEVMKIAASGVSVPVGINHDHCPTAEAAMQAVDAGVGGVMFDGSHLSFEENVRFTGEV VEYAHTRGVCVEAELGRLPGFEDEVFADHVEFTNPGAAKRFVALTGCDSLAVSAGTSHGG VKAAEN >gi|229783786|gb|GG667949.1| GENE 2 708 - 1718 902 336 aa, chain - ## HITS:1 COG:SMb20674 KEGG:ns NR:ns ## COG: SMb20674 COG1609 # Protein_GI_number: 16265129 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 8 316 3 310 339 206 36.0 7e-53 MKNVTIKEVAAHAGMSITTVSRALNNNYPVSAEAREKIERAVSELGYRPNLVARSLKSNN TGIIGLVVADISNRFFMKAARKLEDVVSAQGYQIIYASSDGNVEKEHQILSMFEERCVDA LVIASSDSSSSRLNNIAANGTPVIAIDRKVPGLQADIVVEDNRDSSYQLVEALIQRGHRE IAVNNVLMNISSGKERLEGVRMAMQAHGLPLREEWVSKGGFTSDDARQWVKQIFDGDGLK PTAVFCANNVMTEGTMLALEELNLRIPEDVSVVSFGELPMHQLIQPQIESVVQDPFRIGQ LAGELVLMRLKGQNEEFCHYELPLKLKHGHSIRSLV >gi|229783786|gb|GG667949.1| GENE 3 1831 - 2334 401 167 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625581|ref|ZP_06118516.1| ## NR: gi|266625581|ref|ZP_06118516.1| putative sodium:proline symporter [Clostridium hathewayi DSM 13479] putative sodium:proline symporter [Clostridium hathewayi DSM 13479] # 1 167 1 167 
167 276 100.0 3e-73 AFLPPVLRGLGIAAFISLLLTTGNALLVSETSVISNDILPRIWPDADERQGLLISRGVVV ALGGLAVILGLYFQSVCAVMLLFVGMYGASIFPPVLLSVVFPKRELRSETVALCMGITAV LTLTLDLIPAFPAEGIFIGVPVHLILLLGLSKAESWRIEPAGEEVQV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:38:45 2011 Seq name: gi|229783785|gb|GG667950.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld343, whole genome shotgun sequence Length of sequence - 1687 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1686 1844 ## COG1032 Fe-S oxidoreductase Predicted protein(s) >gi|229783785|gb|GG667950.1| GENE 1 3 - 1686 1844 561 aa, chain + ## HITS:1 COG:CAC1254 KEGG:ns NR:ns ## COG: CAC1254 COG1032 # Protein_GI_number: 15894536 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 1 558 8 562 622 631 51.0 0 ILLSIQQPARYIGGEVNTVMKDSAKADIRFAMCFPDVYEIGMSHLGMQILYDMFNRREDI YCERVYSPWTDLDKIMREKNIPLFALESQDPIREFDFLGITLQYEMSYTNILQVLDLGHI PLHAADRSEEDPIVIGGGPCAYNPEPLADFFDMFYIGEGETVYFELMDCYKENKKKGGSR RQFLEQAAGITGIYVPAFYDVEYHEDGTIKSFHPNNSHAKETITKQLVVNMDDAYYIEAP VVPFIKATQDRVVLEIQRGCIRGCRFCQAGNVYRPLREHSLEYLKDYAYKMLKSTGHEEI SLSSLSSSDYTYLEGLVNFLIDEFKGQGVNISLPSLRIDAFSLDVMSKVQDVKKSSLTFA PEAGSQRLRDVINKGLTEEVILQGARDAFYGGWNRVKLYFMLGLPTETKEDMEGIAELSE KVAEVYYEIPKEQRNGKVNVVASSSFFVPKPFTPFQWARMCTKEEFIERAYIVKDKFRQM KNQKSLKYNYHEADLTVLEGVLARGDRKTGALIEEAYKNGAIYDSWSEYFDNRIWMKAFE TCGLSIDFYTTRERSLEEYSR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:38:46 2011 Seq name: gi|229783784|gb|GG667951.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld344, whole genome shotgun sequence Length of sequence - 1196 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 87 - 287 181 ## gi|266625583|ref|ZP_06118518.1| putative multisubunit Na+/H+ antiporter 2 1 Op 2 . + CDS 306 - 1194 622 ## COG0747 ABC-type dipeptide transport system, periplasmic component Predicted protein(s) >gi|229783784|gb|GG667951.1| GENE 1 87 - 287 181 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625583|ref|ZP_06118518.1| ## NR: gi|266625583|ref|ZP_06118518.1| putative multisubunit Na+/H+ antiporter [Clostridium hathewayi DSM 13479] putative multisubunit Na+/H+ antiporter [Clostridium hathewayi DSM 13479] # 1 66 29 94 94 128 98.0 2e-28 MNSTAFAIGKGFGTIIGGASFDMAKKYTLENAPWYLYGVLVGTAMIGFMLYKNFGTPKDE TEKQKQ >gi|229783784|gb|GG667951.1| GENE 2 306 - 1194 622 296 aa, chain + ## HITS:1 COG:FN1523 KEGG:ns NR:ns ## COG: FN1523 COG0747 # Protein_GI_number: 19704855 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 8 294 6 287 526 104 27.0 2e-22 MTGTKNRQKSILMICCLLAVSMLAGCGAKPTDESSNALTESNATISASELVVAGNEIAIG VDPTVYPAADYLLNMGAGEILFKADANGVIQPYFAKDVSQLDEYTWEIKLHPNATFWSGD PADAEAVIASLMRSKALDPKANPFLQGIDFTAVDSETIQAVTETPNVDITLNLSFPQLLI HNTAEKYTYESIETADYTGMYRIVEFKPAQCMVFEKNENYWGTKPEIQRVVHEQIGDADA RVIAALSGRYHVVMDIPQSAYSQFDNSEEAKIVLVPGAQTQTIYLNCEKEALNDYR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:38:53 2011 Seq name: gi|229783783|gb|GG667952.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld345, whole genome shotgun sequence Length of sequence - 2298 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 787 803 ## Cphy_3024 phosphotransferase domain-containing protein 2 1 Op 2 . 
+ CDS 840 - 2298 1418 ## Cphy_3030 hypothetical protein Predicted protein(s) >gi|229783783|gb|GG667952.1| GENE 1 2 - 787 803 261 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3024 NR:ns ## KEGG: Cphy_3024 # Name: not_defined # Def: phosphotransferase domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 16 247 89 320 331 336 69.0 7e-91 DINEHFQVPGDFSRVKTYRINLYDAKPEEFSEEKKQIPLPERRYGDYHYINEYIDQMKAY GFFACYNHPYWSLQNYDDYKNLRGFWGMEIYNYGCEHDGLYGYNPQSYDEMLRLGNRLFC VSTDDNHNSYPFGDPLCDSFGGFTMIKAEKLTYDSIVDALLKGSFYSSMGPEIKELYVED GVLTVKTSPVEKIYVLMEGRNCLKKVAGPGEMVTEASFALTGDETYIRVTCQDGRGLHAD SNAYELNGKEPVLREYDMRVL >gi|229783783|gb|GG667952.1| GENE 2 840 - 2298 1418 486 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3030 NR:ns ## KEGG: Cphy_3030 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 486 1 487 724 892 86.0 0 MSEKYGRVTVPTDVDMIEETKEIVKRWGADALRDCDGTSMPDELKSMPVKIYSTYYTTRK DNPWAEANPDEVQQMYLMTEFYTAMEAGELRIPVMKHLYDQQLKPNTIDDIKRWWEVVER TTGEVVAPEEWSYDEAAREVIVAQPERYHDYTVSFLAFIIWDPVHMYNFITNSWENVEHQ ITFDVRQPKTQQHVIDRLKNWMVENPDTDVVRFTTFFHQFTLVFNEYAKEKFVDWFGYSA SVSPYILEQFEKEAGYRFRPEYIIDQGYHNNTNRVPSKEFKDFQKFQQREVAKLMKVLVD ICHENGREATMFLGDHWIGTEPFGEYFKEVGLDAVVGSVGNGTTLRLISDIPGVKYTEGR FLPYFFPDVFHEGGDPIREAKVNWVTARRAILRKPIDRIGYGGYLKLALQFPDFIQYIEE ICDEFRLLYENVGGQTPYNHFTVGVLNSWGKLRSWGTHMVAHAIDYKQTYSYAGVLEGLS GMPFDV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:39:05 2011 Seq name: gi|229783782|gb|GG667953.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld346, whole genome shotgun sequence Length of sequence - 1918 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 205 - 456 328 ## gi|266625588|ref|ZP_06118523.1| peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 2 1 Op 2 . 
- CDS 463 - 1110 588 ## gi|266625589|ref|ZP_06118524.1| hypothetical protein CLOSTHATH_07013 3 1 Op 3 . - CDS 1122 - 1904 880 ## COG3451 Type IV secretory pathway, VirB4 components Predicted protein(s) >gi|229783782|gb|GG667953.1| GENE 1 205 - 456 328 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625588|ref|ZP_06118523.1| ## NR: gi|266625588|ref|ZP_06118523.1| peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 [Clostridium hathewayi DSM 13479] peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 [Clostridium hathewayi DSM 13479] # 1 83 1 83 83 118 100.0 1e-25 MAYFDDPENRKNWERELRQMKVEKARRAGGYESSVKRAEESYRRAVTFEELMAMEGIGTG NAVSKKSMEKGLEKQNQIEGRTL >gi|229783782|gb|GG667953.1| GENE 2 463 - 1110 588 215 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625589|ref|ZP_06118524.1| ## NR: gi|266625589|ref|ZP_06118524.1| hypothetical protein CLOSTHATH_07013 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07013 [Clostridium hathewayi DSM 13479] # 1 215 1 215 215 409 100.0 1e-113 MKLKWRMITALALLVLSMCGCMRQETESQVIRLAVESGGENLSGLTEELSKAWNRPVEIV SVAEDRLFDEISQGNCDAGIGLPEREERIKIGIWSLAVGEPANAVVIGNQPYRYSKELEG KKIGYPAGFPESQLVLSGGLQLNSYQNRNSMEQDLVNGVLDGIMLEGSDGEVFMEEHSGS GYLISQLKDQPVLSRYLYSSSMDLIRLAAGRKKGE >gi|229783782|gb|GG667953.1| GENE 3 1122 - 1904 880 260 aa, chain - ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 19 249 625 842 853 96 28.0 5e-20 MIRVYDEKGIDETTNLNLLGPEDYPTFDDLYDKLLTDYQASDSDYMKNHLRVLISYISKF STGGRNSHLWNGPSTLDVKENFTVFNFQSLLANKNNTVANAQMLLILKWLDNEIIKNRDY NEKYHASRKIIVVVDEAHVFIDEKYPIALDFMFQLAKRIRKYNGMQVVITQNVKDFVGTE ELARKSTAIINASQYSFIFPMAPNDMHDLCRLYEKAGEINETEQDEIVNNGRGHAFVVTS PASRSSIEITADDSMVRMFE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:39:22 2011 Seq name: gi|229783781|gb|GG667954.1| Clostridium hathewayi DSM 
13479 genomic scaffold Scfld347, whole genome shotgun sequence Length of sequence - 1223 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 667 580 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 781 - 840 6.2 + Prom 825 - 884 9.8 2 2 Tu 1 . + CDS 1011 - 1221 286 ## COG1285 Uncharacterized membrane protein Predicted protein(s) >gi|229783781|gb|GG667954.1| GENE 1 1 - 667 580 222 aa, chain - ## HITS:1 COG:CAC1333 KEGG:ns NR:ns ## COG: CAC1333 COG2207 # Protein_GI_number: 15894612 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 13 222 8 210 286 99 33.0 6e-21 MFEIKKKSEPMPYYHKPPMRFRDFEIHYNRDYSLNHIVRNKHSYYEFYFLISGDVTYYIE DREFDLKPGAIILIAPEQYHQATINTRTGQAYERYVLWLDPEYLKRLSTDKSDLLLPFQK TYITSAHIQLTRDMQILINNLLEMILTSSVSQEYGADLLTNSYIIELLVHIARMKLFQQD TYLDQRLNDNPENSPIITNTLNYINSHIYEDIRIQDITDYLY >gi|229783781|gb|GG667954.1| GENE 2 1011 - 1221 286 70 aa, chain + ## HITS:1 COG:CAC3658 KEGG:ns NR:ns ## COG: CAC3658 COG1285 # Protein_GI_number: 15896891 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Clostridium acetobutylicum # 8 64 7 63 229 79 64.0 1e-15 MSLAVMTELEYVLRILAAGICGGVIGYERKNRNKAAGSKTHVIVAVSSALMIIISKYGFY DVLGEYIKLD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:39:23 2011 Seq name: gi|229783780|gb|GG667955.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld348, whole genome shotgun sequence Length of sequence - 1634 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 307 324 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 322 - 351 0.4 - Term 360 - 400 7.2 2 2 Tu 1 . 
- CDS 431 - 1615 1339 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component Predicted protein(s) >gi|229783780|gb|GG667955.1| GENE 1 2 - 307 324 101 aa, chain + ## HITS:1 COG:CAC0777 KEGG:ns NR:ns ## COG: CAC0777 COG0110 # Protein_GI_number: 15894064 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Clostridium acetobutylicum # 1 100 110 209 210 142 63.0 1e-34 TASWDNKGDIIIGNDVWIGYEAVLMAGVTIGDGAVIGARAVVTKDIPPYTIAGGVPARPI KRRYTEETIAALSSLKWWDWPEERIAQNLDAIQSGQLDWLK >gi|229783780|gb|GG667955.1| GENE 2 431 - 1615 1339 394 aa, chain - ## HITS:1 COG:MA4536 KEGG:ns NR:ns ## COG: MA4536 COG0614 # Protein_GI_number: 20093321 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 47 391 41 384 387 193 31.0 4e-49 MRKRKQIWLGLSLAAAMMMGLNGCAKGTGNAGDTQQATVAETTVSDTSEAAQQEETDSSA AGDNGVREEGTRIVTDSAGREVEIPSEITKIAPSGPLAQIVLYTVSPDKLAGLAADFSDE AKQYIDEKYWGLPKFGQFYGKNASLNMEALIAEAPDVIIDIGEAKKTVKEDMDALQEQLN IPVIFVEATLPTMADAYEMLGDITGEKEQAGKLADYCRAEIAKADKNAAAIDDSDKKSVY FGLGDDGLHTNAKGSIHADVIERIGAVNAADVEAVSSGGGSEVSFEQVLLWNPDLIIVDS QKLYDTLTSDSMWQELDAVKNGKIFKIPTAPYSYMSSPPSVNRMIGIEWLGNLVYPDLYS SNIREEVKNFYQLFYHIDVTDEQLDTILKDAVSN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:39:25 2011 Seq name: gi|229783779|gb|GG667956.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld349, whole genome shotgun sequence Length of sequence - 1991 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 187 - 229 11.0 1 1 Tu 1 . - CDS 247 - 843 204 ## PROTEIN SUPPORTED gi|52078547|ref|YP_077338.1| 50S ribosomal protein L25/general stress protein Ctc - Prom 907 - 966 4.8 2 2 Tu 1 . 
- CDS 969 - 1991 962 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase Predicted protein(s) >gi|229783779|gb|GG667956.1| GENE 1 247 - 843 204 198 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|52078547|ref|YP_077338.1| 50S ribosomal protein L25/general stress protein Ctc [Bacillus licheniformis ATCC 14580] # 1 196 1 208 208 83 28 2e-16 MNTLKAEKRSMSIKAKRLRREGYVTGNVFGRDIQESIPVKIERTVVERLLKTCNKGSQIL LDVDGQAYDVLIKDISFNAMKGIVEEIDFQALVRGEKVHSVAEVILVNHDKVMNGVLQQQ LQEISYKALPEALIDKVRIDVGDMKVGDTIRVADLDIAKNKNVDLVTDLDTTVATVTAVH AAAEEPAEGTEEAASETK >gi|229783779|gb|GG667956.1| GENE 2 969 - 1991 962 340 aa, chain - ## HITS:1 COG:BH1665 KEGG:ns NR:ns ## COG: BH1665 COG0079 # Protein_GI_number: 15614228 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Bacillus halodurans # 9 333 31 359 369 199 34.0 7e-51 VGKTDIGRLVRMGLNENPYGMSPKALNAIAETAAGSNLYGDFQANELKNTLAEYYGMSYD NFITAAGSSACIELVGNTFLNPGDEVLMCPTFAAFLDMAYIHQAKPVTVPLKSDLTYDLE GLLEAVTEKTKIIVICNPNNPTGTYVNGDSLRSFIDRVPEDIVIVMDEAYIEFSTAPDCE SMIDLVRQGIEKPLLVLRTFSKYYAMAGVRVGYVAAAPELIAAMKKCPCNCNINKMGQAA AVAAMKDQEYYQEVKALVVAGRKYIETELEAMGCRVYHSQTNFIYFDAHVSPEWLRDRLA ERGIMISSNQISRVSVGTAKENEMFIGVMKEVLENAELAS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:39:26 2011 Seq name: gi|229783778|gb|GG667957.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld350, whole genome shotgun sequence Length of sequence - 2732 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 84 - 818 792 ## gi|288871672|ref|ZP_06118532.2| conserved hypothetical protein - Prom 843 - 902 4.0 - TRNA 1024 - 1095 58.8 # Glu CTC 0 0 - TRNA 1119 - 1195 81.3 # Pro CGG 0 0 2 2 Op 1 . 
- CDS 1281 - 1475 345 ## gi|266625598|ref|ZP_06118533.1| conserved hypothetical protein 3 2 Op 2 4/0.000 - CDS 1513 - 1818 407 ## COG1937 Uncharacterized protein conserved in bacteria 4 2 Op 3 . - CDS 1843 - 2730 1011 ## COG2217 Cation transport ATPase Predicted protein(s) >gi|229783778|gb|GG667957.1| GENE 1 84 - 818 792 244 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871672|ref|ZP_06118532.2| ## NR: gi|288871672|ref|ZP_06118532.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 244 12 255 255 490 100.0 1e-137 MLKPVDENFKKYCVTKDGSYLKKIRSIGGGSAALTAAGFLIAGICVLLIAATKDVVTAEG LTLFAAVAAGSALLAIIGIFMRRRRIRTYLEYFSKESGYTPDQLKEFEREVLEPDSCYDT VSRKLAKNSAAFSWVLTEHWFKQMDHIPIRIEDMAAAFYMNGITYKKIQYGKSIFFVLKD GTIHDVYNWQYDKEGTARIVEELRKRNPLLIPAKSVRAGEEVYNCLEQPERVAELYRSAR ERRS >gi|229783778|gb|GG667957.1| GENE 2 1281 - 1475 345 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625598|ref|ZP_06118533.1| ## NR: gi|266625598|ref|ZP_06118533.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 112 100.0 9e-24 MKTYRCDEMMCNGCVSRINKGLDHEGIGHKVDLETKTVMIEEEKDYDKAVEILDDLGFTA VEQQ >gi|229783778|gb|GG667957.1| GENE 3 1513 - 1818 407 101 aa, chain - ## HITS:1 COG:BS_yvgZ KEGG:ns NR:ns ## COG: BS_yvgZ COG1937 # Protein_GI_number: 16080405 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 14 101 12 100 101 73 48.0 1e-13 MEEHTCDCAVKHKHRQAAEEKDLLNRLNRIEGQIRGIKSMVEDERYCVDILTQVSAVQAA LNSFNKVLLSSHIKSCVVEQIQEGNLEAVDELCMTIQKVMK >gi|229783778|gb|GG667957.1| GENE 4 1843 - 2730 1011 295 aa, chain - ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 5 294 525 813 818 310 57.0 2e-84 
DQAETSDGLLKLAAGCEQMSEHPLGQAIVLKAREQGLELAMPEKFESVTGAGIIASLGTS QVAVGNERLVEQMKVSMGDPVKEQAHQMANQGKTPMYLIADGKLKGIICVADTIKETSVE AVDQLKRLGITVCMLTGDNQKTADYIGKQAHIDTVIAEVLPEDKANVVESLQKQGKTVMM VGDGINDAPALVSADVGTAIGSGSDIALESGDIVLMKSDLRDVYKAVKLSRLTIRNIKQN LFWAFFYNSLGIPVAAGVLYLFGGPLLNPMFAGFAMSLSSVCVVSNALRLKTVKL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:39:44 2011 Seq name: gi|229783777|gb|GG667958.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld351, whole genome shotgun sequence Length of sequence - 1130 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 192 217 ## Closa_0578 Undecaprenyl-phosphate glucose phosphotransferase (EC:2.7.8.6) - Prom 257 - 316 5.9 - Term 541 - 598 3.0 2 2 Tu 1 . - CDS 607 - 1119 436 ## COG1216 Predicted glycosyltransferases Predicted protein(s) >gi|229783777|gb|GG667958.1| GENE 1 3 - 192 217 63 aa, chain - ## HITS:1 COG:no KEGG:Closa_0578 NR:ns ## KEGG: Closa_0578 # Name: not_defined # Def: Undecaprenyl-phosphate glucose phosphotransferase (EC:2.7.8.6) # Organism: C.saccharolyticum # Pathway: not_defined # 1 63 1 63 472 98 76.0 1e-19 MIKDNQKKLNGFHVVLDGLIIIASYALAWFILLMGNRLFSPEKKLLAPQYYFAALIIIVP GYL >gi|229783777|gb|GG667958.1| GENE 2 607 - 1119 436 170 aa, chain - ## HITS:1 COG:CC3632 KEGG:ns NR:ns ## COG: CC3632 COG1216 # Protein_GI_number: 16127862 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Caulobacter vibrioides # 1 111 167 277 341 85 39.0 4e-17 MFAACAAAAIYRREVFEEIGYFDESHFAYLEDVDVGYRARIYGYDNVYCPAAEVYHVGSG TSGSKYNSFKVKLAARNNIYLNYKNMPLLQLLFNLPFLLAGTGIKYLFFKKMGYGADYAE GFCEGIKTFGTCKKVAFRTEHIGNYLQIEWELFTGTLLYVWEFLKRRLKK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:39:47 2011 Seq name: gi|229783776|gb|GG667959.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld352, whole genome shotgun 
sequence Length of sequence - 1841 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1839 925 ## Mahau_2083 ADP-ribosylation/Crystallin J1 Predicted protein(s) >gi|229783776|gb|GG667959.1| GENE 1 3 - 1839 925 612 aa, chain + ## HITS:1 COG:no KEGG:Mahau_2083 NR:ns ## KEGG: Mahau_2083 # Name: not_defined # Def: ADP-ribosylation/Crystallin J1 # Organism: M.australiensis # Pathway: not_defined # 1 586 115 699 711 500 44.0 1e-140 LHDGIPAPRSGSIRQNGRAVAEQIGGQIFIDSWGLVAPGNPDLAAKLAREAAGVTHGGDG IHGGIFIAACISAAFEEKNIISIIEKGLSYLPEESGYYKIVKEVMRFYMEHPGTPESCLE YIQEHYGYGKYPGNCHIIPNTAVIILALLYGNGDFSETLNICNRCGWDTDCNVGNVAVIM GVRGGLSVISQALREPINDFVACSSVIGSLNNQDIPYGACYMIKQAWNLAGEELPEPFGT IIETRIDSCHFEFSGSTHAFLLRNGEEKEGKIQETVMVNTDEAAHTGKRSLKLSAGPAEP GGRIFAFKKTYLRPADFHDSRYDPAFSPLVYPGQEIHGSAFLPGYGVPCRAALYVRNAET QEIYPSSEMELKAGMWQELDFKIPAMQGVLLDEAGFVFTINPEQSEASVLTVMVDDFYVT GAPEYTIDFSKERVEYWTPVHKEISQFTRLKGQFCLEEGRALISCSDYAECYTGHHGWKD YEVTAELEPVIGDEHYVAVRVEGAVRSYLAGFRSDRTFAVMKKTRQGMIVLGEIPCEWEH GQTYRITVNVTGNRIQAKLQNTVLTVLDQDRPYLAGCIGLAVREGSRFYCRSMEIKSAGD KEIYENSNPFGS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:39:56 2011 Seq name: gi|229783775|gb|GG667960.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld353, whole genome shotgun sequence Length of sequence - 808 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 16 - 75 8.9 1 1 Tu 1 . 
+ CDS 133 - 808 799 ## Clos_2091 sodium/hydrogen exchanger Predicted protein(s) >gi|229783775|gb|GG667960.1| GENE 1 133 - 808 799 225 aa, chain + ## HITS:1 COG:no KEGG:Clos_2091 NR:ns ## KEGG: Clos_2091 # Name: not_defined # Def: sodium/hydrogen exchanger # Organism: A.oremlandii # Pathway: not_defined # 15 225 4 216 399 119 36.0 1e-25 MELLYSQMDESTAVLCSLSIILFAGFIVTRITKRLRLPNVSGYIIAGVLIGPCCLKMVPS DILGHMGFMSDIALAFIAFGVGKFFKKEIIKEAGIKIIVITIFEALAAGILVTISMRMIF GMSWNFALILGAIATATAPASTMMTIHQYHARGDFVNVLLQVVALDDVVCLLTFSAVVAV VNARTSGHISFTDVMVPIFWNLMALALGFLCGIVLSKLLTPARSK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:40:01 2011 Seq name: gi|229783774|gb|GG667961.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld354, whole genome shotgun sequence Length of sequence - 1712 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 16/0.000 + CDS 3 - 149 223 ## COG1159 GTPase 2 1 Op 2 . + CDS 165 - 827 764 ## COG1381 Recombinational DNA repair protein (RecF pathway) + Term 834 - 871 1.1 3 2 Tu 1 . 
+ CDS 901 - 1711 556 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen Predicted protein(s) >gi|229783774|gb|GG667961.1| GENE 1 3 - 149 223 48 aa, chain + ## HITS:1 COG:lin1499 KEGG:ns NR:ns ## COG: lin1499 COG1159 # Protein_GI_number: 16800567 # Func_class: R General function prediction only # Function: GTPase # Organism: Listeria innocua # 1 47 254 300 301 57 53.0 4e-09 KRIGSAARREIEDLMDTQVNLQIWVKVRKEWRDSELYMKNYGYNEKEI >gi|229783774|gb|GG667961.1| GENE 2 165 - 827 764 220 aa, chain + ## HITS:1 COG:BH1369 KEGG:ns NR:ns ## COG: BH1369 COG1381 # Protein_GI_number: 15613932 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Bacillus halodurans # 9 147 7 144 254 73 29.0 3e-13 MREAVRLTGMVLKAVPSGEMDKRLVILTKERGKVTAFARGARRPGSMLMACGRPFAFGQF TLYEGRDAYSLQSAEISNYFDELSTDVEGTCYGSYFLEFSDYYTRENMDGTAMLKLLYQS VRALMKPAVPNELVRRVYELKAMAINGEYTEKPPREVSDSANYAWEFVVLSPVEHLYTFA LTDAVFEEFGRCVEINKKRFIDREFHSLDILEAMTGRRPL >gi|229783774|gb|GG667961.1| GENE 3 901 - 1711 556 270 aa, chain + ## HITS:1 COG:UU038 KEGG:ns NR:ns ## COG: UU038 COG2865 # Protein_GI_number: 13357594 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Ureaplasma urealyticum # 1 248 2 249 463 205 45.0 7e-53 MNIKKLIGETNAYDKKQALEERRPKSWCKSVSAFANGNGGRLIFGVADDDSITGLRDVRH DSEVISEQIKNRIDPIPKFDLVIQGGDDKEIIVLEISKGLETPYYYVSEGSRTAYQRVGN ESIPVDARGLKELILRGSDSTYDSSISRYEFRNYSFTKLKAVYRQRTGESFENDDFISFG IVSEDGKLTNAGALLADESPIRHSRLFCTRWNGLDKAAGTMDALDDKEFSGSLVVLLENG LDFVRTNSNLFGSNIIDSEVPVIKVTPKET Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:40:02 2011 Seq name: gi|229783773|gb|GG667962.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld355, whole genome shotgun sequence Length of sequence - 2145 bp Number of 
predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 145 94 ## Closa_1994 Radical SAM domain protein 2 1 Op 2 2/0.000 + CDS 129 - 839 899 ## COG5011 Uncharacterized protein conserved in bacteria 3 1 Op 3 . + CDS 832 - 2034 950 ## COG1530 Ribonucleases G and E 4 1 Op 4 . + CDS 2031 - 2145 99 ## Closa_1997 TrkA-N domain protein Predicted protein(s) >gi|229783773|gb|GG667962.1| GENE 1 2 - 145 94 47 aa, chain + ## HITS:1 COG:no KEGG:Closa_1994 NR:ns ## KEGG: Closa_1994 # Name: not_defined # Def: Radical SAM domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 45 575 619 619 85 75.0 1e-15 DAGVSKDFLKREWQNAIGEKLTPNCRQKCSACGAMRFGGGVCHEGKN >gi|229783773|gb|GG667962.1| GENE 2 129 - 839 899 236 aa, chain + ## HITS:1 COG:CAC1255 KEGG:ns NR:ns ## COG: CAC1255 COG5011 # Protein_GI_number: 15894537 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 215 1 224 238 108 31.0 1e-23 MKVRIKFSKQGPVKFIGHLDVMRYFQKAMRRAAIDIKYSEGFSPHQIMSFAAPLGLGLTS NGEYMDIEVNSMTDCETMKNQLNEVMVEGIQILECHKLEDRAKNAMSIVAAADYTLTFRE GKAPSDPEAFFEGLEEFYRQDQIITVKKTKKGEREMDLKPAIYELSTEGGRIFMKVSAGS QDNLKPELVMETYDRWRGESCPEFAFLIQREEVYGNIGDEELKNLVPLGLIGEAVE >gi|229783773|gb|GG667962.1| GENE 3 832 - 2034 950 400 aa, chain + ## HITS:1 COG:ML1468 KEGG:ns NR:ns ## COG: ML1468 COG1530 # Protein_GI_number: 15827770 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Mycobacterium leprae # 17 383 337 719 924 220 35.0 3e-57 MNKLIITKWNGMVLTMVQSEGSAVSLNLEPGEERSILGNIYIGKVKNIVKNIGAAFVEIG DGCMGYLNLADGFIHHASAGGADGKLRVGDEIVVQVERDAVKTKAPVLTGNLNFTGRYIV LTAGKRQLGFSSKLSDMEWKKQIKPILEAEKEEDFGIIVRTNAPEASFDAILEELRSLKA LYRKVMGDAVHRTCLSLLYQTPPAYLADIRDSLHGSLESVVTDEPDIYEAIKVYLETFQP EDLKKLTFYDDALLPLFKLFKVEKAMDEALGKRVWLKSGGYLVIEPTEALCVIDVNTGKY 
SGKKNLHDTIMKINTEAAVQIARQLRLRNLSGIIIIDFIDMETEEDRQELLTVLSRALSK DPVKTSVVEITKLNLVEVTRKKIRRPFHEQAALIKGKAES >gi|229783773|gb|GG667962.1| GENE 4 2031 - 2145 99 38 aa, chain + ## HITS:1 COG:no KEGG:Closa_1997 NR:ns ## KEGG: Closa_1997 # Name: not_defined # Def: TrkA-N domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 38 1 38 451 65 84.0 7e-10 MKIIIVGCGKVGTTLAEQLNNEHHDITVIDKNTSVINA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:40:06 2011 Seq name: gi|229783772|gb|GG667963.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld356, whole genome shotgun sequence Length of sequence - 1728 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 56 - 115 4.8 1 1 Tu 1 . + CDS 299 - 1156 704 ## Selsp_1746 Conserved hypothetical protein CHP01784 + Term 1167 - 1230 13.2 - Term 1154 - 1218 17.2 2 2 Op 1 . - CDS 1232 - 1519 265 ## CKR_1428 hypothetical protein 3 2 Op 2 . 
- CDS 1534 - 1728 168 ## gi|225390651|ref|ZP_03760375.1| hypothetical protein CLOSTASPAR_04406 Predicted protein(s) >gi|229783772|gb|GG667963.1| GENE 1 299 - 1156 704 285 aa, chain + ## HITS:1 COG:no KEGG:Selsp_1746 NR:ns ## KEGG: Selsp_1746 # Name: not_defined # Def: Conserved hypothetical protein CHP01784 # Organism: S.sputigena # Pathway: not_defined # 7 267 10 252 287 125 32.0 2e-27 MKKRKLEDLNLMDDFLFQEMVSKEGIGEEFCRILLSTILGRPIQKIKVIPQKVLLGIDTG RHGIRMDAYIEDRSGDDIEVAADIYDIEPNNTYEKETLPKRTRYYDALIDSQLLETGADY SKLKNVMIIMILPYDPFGMDRMVYTVKNLCLEEPSVPYDDGLVKLYLYTKGTAGNPGKSL RDMLKYMEKSTEDNVTDDNIAKIHDLVSKVKRDKEVGINYMKSWEREQFIRREGLEEGLA EGLTEGQNRINQLNLKLIAASRTEDMIRAAGDKEFQHKLLEEFGV >gi|229783772|gb|GG667963.1| GENE 2 1232 - 1519 265 95 aa, chain - ## HITS:1 COG:no KEGG:CKR_1428 NR:ns ## KEGG: CKR_1428 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 9 87 372 446 454 64 39.0 2e-09 MMADFFGDDNICRLGGDEFLILIPDKTEEETENMLEEACQKMKETFNEQNVPIRLSVSYG VVEVGKLPFAAVSDILEPTDRKMYTKKKETHKMKR >gi|229783772|gb|GG667963.1| GENE 3 1534 - 1728 168 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225390651|ref|ZP_03760375.1| ## NR: gi|225390651|ref|ZP_03760375.1| hypothetical protein CLOSTASPAR_04406 [Clostridium asparagiforme DSM 15981] hypothetical protein CLOSTASPAR_04406 [Clostridium asparagiforme DSM 15981] # 14 62 142 190 405 70 71.0 3e-11 VILILLVLLASIICLEIFDRLIDQKSIQILIGNKRQLRFAVTSLMFIDVYLLIIPTAIRQ ATAI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:40:19 2011 Seq name: gi|229783771|gb|GG667964.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld357, whole genome shotgun sequence Length of sequence - 2239 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 25 - 171 100 ## gi|266624026|ref|ZP_06116961.1| putative cyclic nucleotide binding regulatory protein 2 1 Op 2 . 
+ CDS 235 - 1203 655 ## COG2267 Lysophospholipase 3 2 Tu 1 . - CDS 1233 - 2237 743 ## COG2755 Lysophospholipase L1 and related esterases Predicted protein(s) >gi|229783771|gb|GG667964.1| GENE 1 25 - 171 100 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624026|ref|ZP_06116961.1| ## NR: gi|266624026|ref|ZP_06116961.1| putative cyclic nucleotide binding regulatory protein [Clostridium hathewayi DSM 13479] putative cyclic nucleotide binding regulatory protein [Clostridium hathewayi DSM 13479] # 1 48 150 197 197 88 100.0 1e-16 MQAMERYLQFREDYRDLLERVPQNVIASYLGIKKESLSRLRRNLRDNG >gi|229783771|gb|GG667964.1| GENE 2 235 - 1203 655 322 aa, chain + ## HITS:1 COG:RSc2319 KEGG:ns NR:ns ## COG: RSc2319 COG2267 # Protein_GI_number: 17547038 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Ralstonia solanacearum # 9 315 5 309 319 159 32.0 5e-39 METEEIGMTYNNQKAWKEIEKFLPLEYHFRGEYQPAEELWNWKGNKVHLDTFRNPDAEAK MICFHGVGTNGRQISLIIGGPLAKEGFETISVDMPTYGVTEVNPKMRITYADWVQCGSDL VEEELKKDDRPVFLYGLSAGGMETYHIAAKNGRVRGIIGMTFLDQRQKPVQITTASNPFW GHAGAILAKLACAAGFSGFRIKMSIPSKMKALVNDRKCLDIMMADRTSAGNRVTMEFLHS YITYKPEVEPKDFSVCPILLTQPEKDGWTPQFLSDDFLNQIEKIPVTKTVLQNGSHYPIE KEALSDLQTSVLSFLRNQLAME >gi|229783771|gb|GG667964.1| GENE 3 1233 - 2237 743 334 aa, chain - ## HITS:1 COG:BS_yxiM_2 KEGG:ns NR:ns ## COG: BS_yxiM_2 COG2755 # Protein_GI_number: 16080963 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Bacillus subtilis # 66 268 21 223 229 101 30.0 2e-21 GRKLLRDIISLKKGERYERTFYQSIAGIIPRYHEEYYPAEHLFVTLCTPTPGAVKIDGCH AETAGPVTTVYLCGDSTVTDQAAEIPYLPGACYASWGQALPAFLEGRTAVENQAHCGLTT ETFRQEGHMDIVKRFLRPGDFCLFQFGHNDQKLPHLLAHREYPVNLRRYVEEVRALGGIP VLATSLGRNIWNADGTYHELLAEHVQAVKDVAAATNTPVIDLHEFSVQFYKKTGMERSCG YFHPGDYTHTNEYGAYLFASFIAAELNRLFPETFHVTPGRTDFTPPASLWEELENRNNRA GGQEQREQFDAMEKSTAALLKALEEAEKPYTDTL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:40:25 2011 Seq name: 
gi|229783770|gb|GG667965.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld358, whole genome shotgun sequence Length of sequence - 1370 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 38/0.000 - CDS 3 - 468 504 ## COG0395 ABC-type sugar transport system, permease component 2 1 Op 2 . - CDS 465 - 1370 832 ## COG1175 ABC-type sugar transport systems, permease components Predicted protein(s) >gi|229783770|gb|GG667965.1| GENE 1 3 - 468 504 155 aa, chain - ## HITS:1 COG:BH2224 KEGG:ns NR:ns ## COG: BH2224 COG0395 # Protein_GI_number: 15614787 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 3 155 6 155 276 83 37.0 1e-16 MTKRKRNLYILEAVMLVIAVIWFIPIYYLIVTTLKNPQEATANPLGLPIHLEIGNYIKAW TNMQFPRAFANTLFITATAVVIIVVFGAMAGYALARTKTKMGNRMFLFFLAGLIVPFQMN IVSLYKIVKSLHLMNTPFAVILVNVAINTPQAVFL >gi|229783770|gb|GG667965.1| GENE 2 465 - 1370 832 301 aa, chain - ## HITS:1 COG:BH2225 KEGG:ns NR:ns ## COG: BH2225 COG1175 # Protein_GI_number: 15614788 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 23 289 28 295 304 156 35.0 4e-38 VEQIQRKKRQAGTLGKYFYVFILPAFVVYLLFSIVPFLYTFYYSFTDYTDMNPVNLSFVG LKNYIKVFNTPLMITAIKNSVLYAIMLTSLQTILALPLAVLLDKKLKTRNLLRAVFFFPA VFSSLIIGYLWNFIMSSSDYGLINNLLHQLGLGTFNFFTADRALFSVILTQAWQWTGWAM VIYLANLQSISPDLYEAADIDGAGGIKKFFYVTLPLMCPSVKIIVVTGLIGGMKVFDIIY SMTSGGPGNATETVMTVMMKKGISDGFYSTGSAFGVCFFAIVLVISAVVTKLMGKWSDSI Q Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:40:26 2011 Seq name: gi|229783769|gb|GG667966.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld359, whole genome shotgun sequence Length of sequence - 1841 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, 
operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1202 871 ## COG1075 Predicted acetyltransferases and hydrolases with the alpha/beta hydrolase fold 2 1 Op 2 . - CDS 1291 - 1839 439 ## COG0042 tRNA-dihydrouridine synthase Predicted protein(s) >gi|229783769|gb|GG667966.1| GENE 1 2 - 1202 871 400 aa, chain - ## HITS:1 COG:CAC1028 KEGG:ns NR:ns ## COG: CAC1028 COG1075 # Protein_GI_number: 15894315 # Func_class: R General function prediction only # Function: Predicted acetyltransferases and hydrolases with the alpha/beta hydrolase fold # Organism: Clostridium acetobutylicum # 57 397 79 413 479 342 49.0 6e-94 MKAKKVLKRLYLGVMIATVSNLPLLVYGMFTGRLLEETVPAQPAAAVMGAILLSICWLYF NIRPYRDRNHAGLRLVIMRGGRTLGFAAIYGIAVQCLVIMILYPRLLHGNADAAGWPAGI LIGNTVFAAAALLVLLWNGILRLYLTSARLRVRMRVLMLLAMWIPVVNLAVLLHAMRLVR AEYDYACYKEAVRSTRAESDLCRTKYPLIMVHGIGFRDLHYFNYWGRIPRELVRYGASVY YGNQEALGTIACNAEDIRKKILEVVEETGCEKVNIIAHSKGGLDARYAISKLGMAPYTAS LTTVSTPHHGCRFVDYAIRLPEGLYRLVARCFDRTFSRFGDKNPDFYTATHQFSTASSRA FNESVSDVPGVFYQSYASKMRDCFSDPLLWIPYCLIKPLE >gi|229783769|gb|GG667966.1| GENE 2 1291 - 1839 439 182 aa, chain - ## HITS:1 COG:CAC3454 KEGG:ns NR:ns ## COG: CAC3454 COG0042 # Protein_GI_number: 15896694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Clostridium acetobutylicum # 2 172 140 310 311 183 47.0 2e-46 ASPDEFEELLEIYNKYPLKELIIHPRVRSDFYKNTPDRAAFCRAVKASKNEVWYNGDIFT AGDWKEFHAGVPEVDCIMMGRGLLANPGLAGEIKGLGGPDKEWIREFHDMVYQGYRETIS GDRNVLFKMKEFWFYLLFLFPDSEKYGKKICKAERAAEYEAAVAALFRDLDIKEGRGYGG EH Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:40:27 2011 Seq name: gi|229783768|gb|GG667967.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld360, whole genome shotgun sequence Length of sequence - 1405 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) + Prom 15 - 74 4.3 1 1 Tu 1 . + CDS 125 - 778 577 ## COG3619 Predicted membrane protein + Term 907 - 963 13.4 + Prom 890 - 949 3.6 2 2 Tu 1 . + CDS 1013 - 1403 478 ## COG0474 Cation transport ATPase Predicted protein(s) >gi|229783768|gb|GG667967.1| GENE 1 125 - 778 577 217 aa, chain + ## HITS:1 COG:lin0467 KEGG:ns NR:ns ## COG: lin0467 COG3619 # Protein_GI_number: 16799543 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 13 209 23 217 221 175 48.0 6e-44 MPTGIFLTLSGGFQDAYTYYTRGKVFANAQTGNIILLGHNAMNGDFTDAFRYLAPVLAFA GGIYISEVIRGIYREYGKLHWRQIVVALEILLLFVVGFLPQSMNMAANILVSFVCAMQVE AFRKMKGSAYASTMCIGNLRSATEMLYRYRHTKEKGCLEKCLRYYGVILVFGIGAALGSL MTSLFKERTIWISCGFLFVCFCIMFMKEDIEGEGGLD >gi|229783768|gb|GG667967.1| GENE 2 1013 - 1403 478 130 aa, chain + ## HITS:1 COG:SPy0623 KEGG:ns NR:ns ## COG: SPy0623 COG0474 # Protein_GI_number: 15674699 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pyogenes M1 GAS # 4 126 11 131 893 90 43.0 7e-19 MKGYLDTLEEVLEKLNTKRTGLSAEEAAGRLMADGKNKLKEAKPVPLWRRFLSQLADPMI VILIAAAVVSGMTAVYAGESFADVIIIMIVVLINAVLGVAQESKAEKAIAALQEIAAATS RVMRDGKQQT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:40:28 2011 Seq name: gi|229783767|gb|GG667968.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld361, whole genome shotgun sequence Length of sequence - 2198 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 218 - 260 6.0 1 1 Tu 1 . - CDS 297 - 443 149 ## gi|266625627|ref|ZP_06118562.1| conserved hypothetical protein - Prom 548 - 607 6.7 + Prom 651 - 710 4.4 2 2 Tu 1 . + CDS 747 - 893 166 ## + Term 920 - 960 6.4 - Term 890 - 962 23.1 3 3 Op 1 . - CDS 988 - 1536 270 ## PROTEIN SUPPORTED gi|229236145|ref|ZP_04360568.1| acetyltransferase, ribosomal protein N-acetylase 4 3 Op 2 . 
- CDS 1557 - 1961 596 ## COG4747 ACT domain-containing protein 5 3 Op 3 . - CDS 2018 - 2197 248 ## COG4443 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|229783767|gb|GG667968.1| GENE 1 297 - 443 149 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625627|ref|ZP_06118562.1| ## NR: gi|266625627|ref|ZP_06118562.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 48 1 48 48 72 100.0 1e-11 MTIYPVCTSLMRYRLFVNCMERYYRYKMKRPCFLIMQNSRWTVVTLDM >gi|229783767|gb|GG667968.1| GENE 2 747 - 893 166 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNSLFWIIYIIVIIAVPVVFFILGYFLVKMAVKNGILEAYKEIHKDQP >gi|229783767|gb|GG667968.1| GENE 3 988 - 1536 270 182 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229236145|ref|ZP_04360568.1| acetyltransferase, ribosomal protein N-acetylase [Chitinophaga pinensis DSM 2588] # 12 181 8 179 181 108 38 4e-24 MRAEEEIIDLGGPQLLLRNAEEEDAEMLLDFLKTTCGETRYLVKEPEEIDLTLEDERTFI RNQNESDQNIMLLGFIDGEYAGNCSLMGMGAARFKHRASLGIALYQKYTGRGIGKAMIGK LIEIAREHNLEQIELEVVADNQRAVSLYRKMGFEIFGTLPNNMKYQDGTYADAYWMMKSI RR >gi|229783767|gb|GG667968.1| GENE 4 1557 - 1961 596 134 aa, chain - ## HITS:1 COG:AF1672 KEGG:ns NR:ns ## COG: AF1672 COG4747 # Protein_GI_number: 11499262 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Archaeoglobus fulgidus # 1 130 1 129 137 97 39.0 7e-21 MLKQLSIYAENRKGTFKHVTGILEEENINILGSVTNDSAEYGIIRMVVSAPDKAVAALEK ADYLCKLTDVVGVEMADEVGNLHHLLMALSESNINVDYVYLSFNRDSNKPILIFHADDIR AVEACLAAKGFTVV >gi|229783767|gb|GG667968.1| GENE 5 2018 - 2197 248 59 aa, chain - ## HITS:1 COG:CAC0545 KEGG:ns NR:ns ## COG: CAC0545 COG4443 # Protein_GI_number: 15893835 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 58 15 72 74 76 72.0 1e-14 VLSENAKGWRKEVNLISWNGGIPKYDIRDWAPEHEKMGKGTTLSEEEIKKLKEILGEIL Prediction of potential genes in microbial genomes 
Time: Fri Jul 1 03:40:38 2011 Seq name: gi|229783766|gb|GG667969.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld362, whole genome shotgun sequence Length of sequence - 1295 bp Number of predicted genes - 4, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 28 - 303 124 ## COG3344 Retron-type reverse transcriptase + Prom 352 - 411 4.3 2 2 Op 1 . + CDS 458 - 577 89 ## 3 2 Op 2 . + CDS 607 - 741 86 ## 4 2 Op 3 . + CDS 723 - 1293 447 ## CbC4_0986 hypothetical protein Predicted protein(s) >gi|229783766|gb|GG667969.1| GENE 1 28 - 303 124 91 aa, chain + ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 1 88 383 470 470 69 37.0 1e-12 MKNRIEGLNQWLYHRIRMCIWKQWKKPKTKVKNLLKMGVPKELAWKAGNSRRGYWFTTQT VAVNMAMTKERLINSGYYDLATAYQSVHVNC >gi|229783766|gb|GG667969.1| GENE 2 458 - 577 89 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVMACIMSDRSWNVMGGKIITGNDRLNGIPYITIGVLVS >gi|229783766|gb|GG667969.1| GENE 3 607 - 741 86 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGGPFRSRKTPVCSLGLVFPEADAKIKVQIKINRKEVLICETLY >gi|229783766|gb|GG667969.1| GENE 4 723 - 1293 447 190 aa, chain + ## HITS:1 COG:no KEGG:CbC4_0986 NR:ns ## KEGG: CbC4_0986 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 43 190 46 195 254 67 28.0 4e-10 MRNFILEGKKIAKPVLVTTCLLTLLTCILSCTLYRGYTLHFELDAWEIGTEFLSLLFPLF VTIPICWQLYYERRDRFLTYTLTRIGRQRYLVIKWCACAVSAFIILVIPCFISALFALYI REPQALWPIPTGYAHVFSNLYAKMPVLYALLLSLWKGLLGILTMTFGFVLALYGSNVFVI LTGPFIYVIL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:40:52 2011 Seq name: gi|229783765|gb|GG667970.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld363, whole genome shotgun sequence Length of sequence - 1489 bp Number 
of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 22 - 243 242 ## gi|266625636|ref|ZP_06118571.1| shikimate kinase 2 1 Op 2 . + CDS 240 - 671 594 ## COG0757 3-dehydroquinate dehydratase II + Prom 700 - 759 8.3 3 2 Tu 1 . + CDS 791 - 1348 611 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) Predicted protein(s) >gi|229783765|gb|GG667970.1| GENE 1 22 - 243 242 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625636|ref|ZP_06118571.1| ## NR: gi|266625636|ref|ZP_06118571.1| shikimate kinase [Clostridium hathewayi DSM 13479] shikimate kinase [Clostridium hathewayi DSM 13479] # 1 73 8 80 80 125 98.0 1e-27 MVYLEVNPETVLKRLKGDTTRPLLQGGDVTQRVKEFLAVRGPIYRETAGMVVDVNDRTVD EIVAELEERMGFE >gi|229783765|gb|GG667970.1| GENE 2 240 - 671 594 143 aa, chain + ## HITS:1 COG:TM0349 KEGG:ns NR:ns ## COG: TM0349 COG0757 # Protein_GI_number: 15643117 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Thermotoga maritima # 1 139 1 136 144 163 54.0 7e-41 MKLLLLNGPNLNFLGIREKGVYGTEDYGYLVRMMKEKAEREGHELDIYQSNYEGGLIDKI QEAYYNGTEGIIINPGAFTHYSYALRDALSSVEIPKVEVHISNVHKREEFRHTSVTAPVC DGQVVGLGLKGYALAMDYLTGRE >gi|229783765|gb|GG667970.1| GENE 3 791 - 1348 611 185 aa, chain + ## HITS:1 COG:BH2799 KEGG:ns NR:ns ## COG: BH2799 COG0231 # Protein_GI_number: 15615362 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Bacillus halodurans # 1 184 1 184 185 214 57.0 7e-56 MISAGDFRNGITLEIDGNVYQIMEFQHVKPGKGAAFVRTKIKDVVNGGVVERTFRPTEKF PAARIDRVDMQYLYADGDLHNFMDTNTYEQIALSPEVIGDALKFVKENDILKVCSYNGKV FSVEPPLFVELEITDTEPGFKGDTAQGATKPATVETGAVVYVPLFVDQGDKIKIDTRTGE YLSRV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:40:59 2011 Seq name: gi|229783764|gb|GG667971.1| 
Clostridium hathewayi DSM 13479 genomic scaffold Scfld364, whole genome shotgun sequence Length of sequence - 1726 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1704 1811 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) Predicted protein(s) >gi|229783764|gb|GG667971.1| GENE 1 1 - 1704 1811 567 aa, chain + ## HITS:1 COG:CAC3442 KEGG:ns NR:ns ## COG: CAC3442 COG2176 # Protein_GI_number: 15896683 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Clostridium acetobutylicum # 1 566 886 1450 1452 707 60.0 0 DGYLVGSRGSVGSSFVAYMSGITEVNSLSPHYYCASCHYYDFDSEEVKKYTGMAGCDMPD KTCPVCGHPLTKDGFDIPFETFLGFKGDKEPDIDLNFSGEYQSKAHDYTEVIFGKGQTFR AGTIGTLADKTAYGYVKNYYEEHGVHKRNCEINRIVQGCVGVRRTTGQHPGGIIVLPHGE EIYSFTPVQRPANDMTTMTVTTHFDYHSIDHNLLKLDILGHDDPTMIRMLQDLIGFDPVK DIPLDSREVMSLFQDTSALGITPEDIGGCKLGALGVPEFGTDFAMQMLIDTQPKYFSDLV RIAGLAHGTDVWLGNAQTLIQEGKATIQTAICTRDDIMIYLISMGLEEGLSFTIMESVRK GKGLKPEWEEEMISHGVPDWYIWSCKKIKYMFPKAHAAAYVMMAWRIAYCKVFYPLAYYA AFFSIRANGFSYELMCQGRDKLEYYLADYKKRMDTLSKKEQDTLRDMRIVQEMYARGYDF TPIDIYQAKARHFQIIGGKLMPSLSSIDGLGEKAADAVVDAVKDGKFLSKEDFRNRTKVS KTVCDLMADLGLLGELPESNQLSLFDF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:00 2011 Seq name: gi|229783763|gb|GG667972.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld365, whole genome shotgun sequence Length of sequence - 2453 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 543 421 ## RB13146 xylosidase/arabinosidase 2 1 Op 2 38/0.000 - CDS 596 - 1441 810 ## COG0395 ABC-type sugar transport system, permease component 3 1 Op 3 . 
- CDS 1444 - 2355 665 ## COG1175 ABC-type sugar transport systems, permease components 4 1 Op 4 . - CDS 2370 - 2453 71 ## Predicted protein(s) >gi|229783763|gb|GG667972.1| GENE 1 3 - 543 421 180 aa, chain - ## HITS:1 COG:no KEGG:RB13146 NR:ns ## KEGG: RB13146 # Name: not_defined # Def: xylosidase/arabinosidase # Organism: R.baltica # Pathway: not_defined # 21 177 46 200 329 90 32.0 3e-17 MKLPKELLEVERKTVETVNLPSDNNHDPSNILYDQGTYYLWYTQHDNERPYDHFADCKIM CCTSKDGIHWEEGKDALLPAESGWDCAGVLTANVIYDKGRFYMFYTGVGTDFAEGKTTRR CCGLAAANTPDGPFERLGNEPVLQWEEEGSWDDEAVDDISAVFFQNRWLVYFKGSRLTEP >gi|229783763|gb|GG667972.1| GENE 2 596 - 1441 810 281 aa, chain - ## HITS:1 COG:BH1926 KEGG:ns NR:ns ## COG: BH1926 COG0395 # Protein_GI_number: 15614489 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 10 280 16 284 285 170 35.0 3e-42 MKKEKAENILKKLFSLLMLLIVIIVTCFPFLMMFMGSFKEDYEIFSLSPKILPAAGFQLK MYRMLFANWPFVTNMMNSVIVAVSTTALACFFCTAAGFTFAKYQFPFKNVLFIITLSSLM IPLETRLVPTYLLVKQLGGVNHLWSMILPNAIPAFGIFMMRQFASGTVPQETMEAARIEG ATENQILIRIGFPMLKPAIVSLAILTFMNTWNEFLWPIIITTKKEKLTVTALLRSIGDVS MNGNYGVLLAAAALSALPILVLYLIFHNQMIDGIVEGTGKE >gi|229783763|gb|GG667972.1| GENE 3 1444 - 2355 665 303 aa, chain - ## HITS:1 COG:slr1202 KEGG:ns NR:ns ## COG: slr1202 COG1175 # Protein_GI_number: 16329975 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Synechocystis # 20 295 17 289 298 156 35.0 5e-38 MEADEMTSTLKEMKKYKHCYLWIAPFFILFAVFYIYPAVSGFLISLTNFDGLGTVAAADH IGFKNYEKLLRDDKFWKSLMNTVLIWLYIVPLRTFLALILAAVLNSSRLTGKRVYSVIVL LPYITAVAVAAIIFRILLTTEGGLINVLLERCLGIGPVGWLDTTALSKVSVAFMNIWRMT GYFSLVLLAGMQKIPGSVGEAASIDGAGAFTKFFKITLPLMVPEIFFVVLMSTIWIFQNV GDVMVLTQGGPMNSSTTLVYYMYQNAYEYSKMGYAAAITYVLFAFLVILSVFVVKNYYGR TEE >gi|229783763|gb|GG667972.1| GENE 4 2370 - 2453 71 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no VIGEMFLNGAYTPETAVEVMKEKLNDI Prediction of potential 
genes in microbial genomes Time: Fri Jul 1 03:41:09 2011 Seq name: gi|229783762|gb|GG667973.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld366, whole genome shotgun sequence Length of sequence - 1951 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1443 805 ## COG3291 FOG: PKD repeat - Prom 1465 - 1524 2.6 2 1 Op 2 . - CDS 1538 - 1738 112 ## gi|266625645|ref|ZP_06118580.1| assembly factor CBP4 3 1 Op 3 . - CDS 1713 - 1949 225 ## gi|266625646|ref|ZP_06118581.1| germination-specific N-acetylmuramoyl-L-alanine amidase Predicted protein(s) >gi|229783762|gb|GG667973.1| GENE 1 3 - 1443 805 480 aa, chain - ## HITS:1 COG:MA4285_2 KEGG:ns NR:ns ## COG: MA4285_2 COG3291 # Protein_GI_number: 20093074 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 279 431 676 834 1325 84 36.0 7e-16 MQRRKRWSSRIAAIALSAAMVFTMTPSVGVMAATTDGAKSVSEGLCEHHPEHTADCGYVA GTEGTPCTHVHTDDCYDVVEDCVHAHKAECYPEGDDMEEASPSNAKEPTECSHVCNEETG CIKEKLNCQHKHDEDCGYVAAEPGEPCGYVCDICNAENSNELKPPVTEPEELQYPECTSK EESNRLLDRLLAVEKIVTDPEADKDSEEYQEALEKDDALWEHVDGCEICQGQIAEDGAYY KKFYGTATAAMLSGSGWIFDTSTGTLTISTNDGTLMWYGGTFTDGDAVKNIVLTSGVTLI YRKAFIACTQLVSITIPDSVTMIGDEAFSGCQQLDNVTIPNSVASIGDRAFIQTGLTSIE IPESVNSIGMNAFTGSSLKSIEFKSNTPPSSLNSRCFAGVPIAGNVTVPEGTEEAYETAL TAAGLHFGVDEWTINATDPVAQYKKSGDRDWTEVGSLAAAITAIGTGTGEIQLVKDINST >gi|229783762|gb|GG667973.1| GENE 2 1538 - 1738 112 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625645|ref|ZP_06118580.1| ## NR: gi|266625645|ref|ZP_06118580.1| assembly factor CBP4 [Clostridium hathewayi DSM 13479] assembly factor CBP4 [Clostridium hathewayi DSM 13479] # 1 66 1 66 66 90 100.0 3e-17 MLNVHRKDEIFKIREQLKHEKEREMRMQKKKELIRIPRSHHDSYNDWNTGLVKKARLGTD VRDDSR >gi|229783762|gb|GG667973.1| GENE 3 1713 - 1949 225 78 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|266625646|ref|ZP_06118581.1| ## NR: gi|266625646|ref|ZP_06118581.1| germination-specific N-acetylmuramoyl-L-alanine amidase [Clostridium hathewayi DSM 13479] germination-specific N-acetylmuramoyl-L-alanine amidase [Clostridium hathewayi DSM 13479] # 1 78 1 78 78 117 100.0 2e-25 EHLLENFHFAIEEYNRSDQKDKHLSIARGIAVYNSVTDLVFSNVFKRADDAMYQNKADMK RKHVAAEAEENAERPSKG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:20 2011 Seq name: gi|229783761|gb|GG667974.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld367, whole genome shotgun sequence Length of sequence - 1680 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 199 - 237 0.8 1 1 Op 1 . - CDS 245 - 562 315 ## Closa_2009 Peptidoglycan-binding lysin domain protein 2 1 Op 2 . - CDS 579 - 701 61 ## - Prom 776 - 835 5.2 + Prom 659 - 718 7.2 3 2 Tu 1 . + CDS 747 - 1367 603 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) + Term 1400 - 1469 28.1 - Term 1387 - 1457 28.3 4 3 Tu 1 . 
- CDS 1513 - 1680 166 ## COG0799 Uncharacterized homolog of plant Iojap protein Predicted protein(s) >gi|229783761|gb|GG667974.1| GENE 1 245 - 562 315 105 aa, chain - ## HITS:1 COG:no KEGG:Closa_2009 NR:ns ## KEGG: Closa_2009 # Name: not_defined # Def: Peptidoglycan-binding lysin domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 94 1 94 104 122 64.0 6e-27 MKKIMITMALVVLILSNVCGHSLMNALADETELPESQKYYTSIEIQKGDTLWGIADEYAG SCRMSTADYVTELKNMNGLKEDVIHSGQHLTIMYCVPVSEVLAER >gi|229783761|gb|GG667974.1| GENE 2 579 - 701 61 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFEKNANLCSKKMIFFLDKHLFEWYDIFIQNMCSAFVTKE >gi|229783761|gb|GG667974.1| GENE 3 747 - 1367 603 206 aa, chain + ## HITS:1 COG:SA1174 KEGG:ns NR:ns ## COG: SA1174 COG1974 # Protein_GI_number: 15926920 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Staphylococcus aureus N315 # 5 202 3 205 207 220 57.0 1e-57 MAQEKITAKQQEILEYIKETILKKGYPPAVREICEAVHLKSTSSVHSHLETLEEKGYIRR DPTKPRTIEIIDDCFNLTRREVVNVPLLGTVAAGVPLLAEENIENYYPIPVELLPNADTF MLNVKGNSMINAGIFDGDQLIVERCSTAYDGEIVVALVDDSATVKRFYKEDGYYRLQPEN DEMEPILVDHLEILGKVIGLFRLGIH >gi|229783761|gb|GG667974.1| GENE 4 1513 - 1680 166 55 aa, chain - ## HITS:1 COG:BH1328 KEGG:ns NR:ns ## COG: BH1328 COG0799 # Protein_GI_number: 15613891 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Bacillus halodurans # 2 51 63 112 117 59 42.0 1e-09 VGCTCRQVEGYQSANWILMDYGDIIVHVFDRDNRLFYDLERIWRDGKRLDVSELE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:27 2011 Seq name: gi|229783760|gb|GG667975.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld368, whole genome shotgun sequence Length of sequence - 928 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:28 2011 Seq name: gi|229783759|gb|GG667976.1| Clostridium hathewayi DSM 13479 genomic 
scaffold Scfld369, whole genome shotgun sequence Length of sequence - 1736 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 1013 978 ## Clole_2789 hypothetical protein 2 1 Op 2 . + CDS 1023 - 1652 398 ## PSPA7_2380 hypothetical protein 3 1 Op 3 . + CDS 1649 - 1736 107 ## Predicted protein(s) >gi|229783759|gb|GG667976.1| GENE 1 3 - 1013 978 336 aa, chain + ## HITS:1 COG:no KEGG:Clole_2789 NR:ns ## KEGG: Clole_2789 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 323 143 530 540 113 26.0 1e-23 IREWFGEIPGWVSYDQNILQVLYDIQSENGEYFQTRQDINRDIRNKRAFIEDIADTIPAG YEAEKWENENVGELYREIEKIRKENETIEKAKRMIESRSNKMRAFEADKEIELSAIEKEF SRRETNLKEQIASLEEQIRSCKKELAGLNEKKQDKISLAEQTYKTNVAKYDAELAQYEEY ADKEVKSTAELTEKAEYAEEMKGHLNEYRRMENLQSEVEKLVAESRTLTDKIEKARELPG EILQEATIPINGLTVKDGIPLIHGLPISNLSDGEKLDLCIDVAIQKPNGLQIILIDGVEK LSTDMRSELYRKCKEKGLQFIATRTTDDPELTVVEL >gi|229783759|gb|GG667976.1| GENE 2 1023 - 1652 398 209 aa, chain + ## HITS:1 COG:no KEGG:PSPA7_2380 NR:ns ## KEGG: PSPA7_2380 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa_PA7 # Pathway: not_defined # 19 192 27 200 264 163 43.0 4e-39 MDELTLNQPVTSLSSGVFSSAESFQELFNIGKMFSASTLVPQAYQGKPMDCTIAVDMANR MGVSPMMVMQNLYVVKGKPSWSGQACMSMIRASKEFKNVRLVYTGDKGTDSWGCYVQAEH RETGEPVKGTEVTIKMAKEEDWYGKTGSKWKTMPEQMLAYRAAAFFARVYIPNSLMGLHV EGEAEDITNESEIPEIPDIFGEKEKEAQV >gi|229783759|gb|GG667976.1| GENE 3 1649 - 1736 107 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MILTAENYFSKEADREYLSVSQYKNFMGT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:43 2011 Seq name: gi|229783758|gb|GG667977.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld370, whole genome shotgun sequence Length of sequence - 1430 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N 
Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 19/0.000 + CDS 2 - 166 190 ## PROTEIN SUPPORTED gi|240146498|ref|ZP_04745099.1| 30S ribosomal protein S16 2 1 Op 2 12/0.000 + CDS 218 - 445 349 ## COG1837 Predicted RNA-binding protein (contains KH domain) + Term 459 - 503 -0.9 3 1 Op 3 30/0.000 + CDS 522 - 1034 172 ## PROTEIN SUPPORTED gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 4 1 Op 4 . + CDS 1045 - 1429 414 ## COG0336 tRNA-(guanine-N1)-methyltransferase Predicted protein(s) >gi|229783758|gb|GG667977.1| GENE 1 2 - 166 190 54 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240146498|ref|ZP_04745099.1| 30S ribosomal protein S16 [Roseburia intestinalis L1-82] # 1 54 27 80 81 77 64 5e-15 SPRDGKFIEEVGYYDPNTDPSVIKFNEESAKKWLATGAQPTEKVSKLLKIAGIQ >gi|229783758|gb|GG667977.1| GENE 2 218 - 445 349 75 aa, chain + ## HITS:1 COG:CAC1756 KEGG:ns NR:ns ## COG: CAC1756 COG1837 # Protein_GI_number: 15895033 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein (contains KH domain) # Organism: Clostridium acetobutylicum # 1 74 1 74 75 72 70.0 2e-13 MKELVEVIAKALVDNPDEVAVTESVKDDEIVLELTVAPSDMGKVIGKQGRIAKAIRSVVK AAASKEDKKVIVEIQ >gi|229783758|gb|GG667977.1| GENE 3 522 - 1034 172 170 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 [alpha proteobacterium BAL199] # 7 161 7 159 179 70 33 6e-13 MEQLLRVGVISSTHGVKGEVKVFPTTDDAARFKKLKQVILDTGKEQLKLEIEGVKFFKNM VILKFKGYDTIEDIEKYKGKDILVTRDQAVKLGPDENFIVDLIGLKVVTEDGEPFGTLTD VIKTGANDVYEVKTAEGKEVLFPAIRECIRNVDLEAGVVTVHIMDGLLDL >gi|229783758|gb|GG667977.1| GENE 4 1045 - 1429 414 128 aa, chain + ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 128 1 128 246 148 52.0 2e-36 MNFHILTLFPDMVLNGLHTSIIGRAVEHGLISIEAIDIREYSTDKHRHVDDYPYGGGAGM 
VMQPMPICDAYDDLCARIGKKPRVIYLTPQGTVFNQKIAEELAEEEELVFLCGHYEGVDE RALELVAT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:44 2011 Seq name: gi|229783757|gb|GG667978.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld371, whole genome shotgun sequence Length of sequence - 1583 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 3 - 756 961 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 2 1 Op 2 . - CDS 749 - 1582 861 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain Predicted protein(s) >gi|229783757|gb|GG667978.1| GENE 1 3 - 756 961 251 aa, chain - ## HITS:1 COG:SP0661 KEGG:ns NR:ns ## COG: SP0661 COG4753 # Protein_GI_number: 15900562 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 4 250 5 240 245 162 36.0 4e-40 MLKVLVIDDEVFVRKGIVMETDWASLDCMVAGEAANGLEGLEACRKYNPDLIISDIRMPK MDGIEMLKILREEGNDVSVIFLTAYSEFEYARNALKLLAFDYLLKPFEDGELEDAVSRVR DKLVREREEKERREESKILSPGARTGDKSRYIKEAIAYITEHYNDSDLSVGTIAESLGIS EGHLSHLFKKETDYTVMAYITRCRIREAMKLLTNCRYKVYEVGEMVGYRDITYFSSTFKK IAGVTPSEYQA >gi|229783757|gb|GG667978.1| GENE 2 749 - 1582 861 277 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 1 271 324 595 602 159 34.0 5e-39 IQSLNEAMHEVQSGNLEARVHSTRNDEFGQLSESFNTMASELKTYMEMQVSQQKKLNEVQ IAMMQAQLNPHFLYNTLDTMKWVAKANHIPEIATLAAKLAKILRTSISNAQFITLKEELE LVDCYAEIQKIRFQGKFCFDYELPEELAGCVVPKLVVQPIVENAMIHGLKDCEEGHILVR ACSKEGRLFIEVMDDGCGISDEVLNHLNNRGAGAGVPEEGGEEALLRLGHIGFYNVDTII 
RLHYGAEYGLSVCRPEGGGTKVTIMIPVRQGEELCDA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:45 2011 Seq name: gi|229783756|gb|GG667979.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld372, whole genome shotgun sequence Length of sequence - 1827 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 493 571 ## COG1283 Na+/phosphate symporter - Prom 522 - 581 6.4 2 2 Tu 1 . - CDS 602 - 1594 1124 ## COG2355 Zn-dependent dipeptidase, microsomal dipeptidase homolog - Prom 1665 - 1724 4.5 Predicted protein(s) >gi|229783756|gb|GG667979.1| GENE 1 1 - 493 571 164 aa, chain - ## HITS:1 COG:SA0100 KEGG:ns NR:ns ## COG: SA0100 COG1283 # Protein_GI_number: 15925808 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Staphylococcus aureus N315 # 1 163 3 156 555 145 49.0 4e-35 MSITDVQMLFQFVGGLGMFLYGMNAMADGLQKSAGHRMQQLLGALTKNRLMGVILGAGIT AIIQSSSATTVMVVGFVNAGIMNLGQAVGIIMGANIGTTVTSWIVSMSEWGEMLKPEFFA PALVGVGAIMTLFCKNSRKKQVGEILVGFGVLFIGLTFMSAAIT >gi|229783756|gb|GG667979.1| GENE 2 602 - 1594 1124 330 aa, chain - ## HITS:1 COG:PAE0005 KEGG:ns NR:ns ## COG: PAE0005 COG2355 # Protein_GI_number: 18311646 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent dipeptidase, microsomal dipeptidase homolog # Organism: Pyrobaculum aerophilum # 119 330 113 307 308 145 38.0 1e-34 MKVVDMHCDTISELFYRKEEGRGHSVWKNDCHVDLERMEKGDYCLQNFALYTPLDKHERP FEYCMKLVDLFYREMEEQRERIGIVRSYKDILENQRLGKMSAMLTIEEGGVCQGETAFLR DFYRVGVRMMTLTWNFKNELAWPNIVRIEGEEAYFEPDTEHGLTEKGIEFVEEMERLGMI LDVSHLSDAGILDVFRYTKKPFVASHSNARAVSSNPRNLTDEMVKMLSERGGVAGINYCA DFLHDWQKGEHAVSRVDDMILHMKHMKKIGGIQCIGLGSDFDGIGGDLELKSPEDLPVLE QAMRRAGFTESEIEAVFFRNVLRVYKEILK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:46 2011 Seq name: gi|229783755|gb|GG667980.1| Clostridium hathewayi DSM 13479 
genomic scaffold Scfld373, whole genome shotgun sequence Length of sequence - 1071 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1070 1011 ## COG1472 Beta-glucosidase-related glycosidases Predicted protein(s) >gi|229783755|gb|GG667980.1| GENE 1 2 - 1070 1011 356 aa, chain - ## HITS:1 COG:Cgl0317 KEGG:ns NR:ns ## COG: Cgl0317 COG1472 # Protein_GI_number: 19551567 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Corynebacterium glutamicum # 2 142 363 487 548 75 36.0 1e-13 IVNDLLRGRLGYDGVVCTDWGITHDMGKTEEEFAGKCWGVEGLTEAERHLKALEAGVDQF GGNNDVGPVLEAYRLGCEKYGEEAMRSRFERSAARLLLNMFRTGLFENPYLDMEHSCQVV GCREFSEKGYLSQLKSVTLLKNRGGVLPLAEQIKVYIPRRHIRPYLNFMSSMTPDVSVEP AGKGAAADYFVVVDTPEEADAAVCFIESPISVGYDPEDRKRGGSGYVPVTLQYRPYEAKQ ARLHSLAGGDPLETSDDRSYRGKWNTAANEEDLDLVIETKKRMAERPVIVSVNLKNPMVM AELEPYADAVLVDYGVSPQAVLDVITGRFTPCGLLPLQMPADMDTVERQAEDEAFD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:47 2011 Seq name: gi|229783754|gb|GG667981.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld374, whole genome shotgun sequence Length of sequence - 1610 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 - CDS 2 - 1265 1380 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit 2 1 Op 2 . 
- CDS 1286 - 1609 295 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit Predicted protein(s) >gi|229783754|gb|GG667981.1| GENE 1 2 - 1265 1380 421 aa, chain - ## HITS:1 COG:CAC2356_2 KEGG:ns NR:ns ## COG: CAC2356_2 COG0072 # Protein_GI_number: 15895623 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Clostridium acetobutylicum # 153 417 1 265 654 293 56.0 4e-79 MNTPLSWIKAYVPDLDVSDQEYTDAMTLSGTKVEGFERLDRNLEKIVVGRIDKIERHPDA DKLIICQVNIGGETIQIVTGAPNVKEGDKVPVVLDGGKVAGGHDGGPLPENGVAIKKGRL RGIESNGMLCSIEELGSSKEMYPDAPESGIYILPEDAETGADAIEVLGLRDSVFEYEITS NRVDCYSVIGIAREAAATFDKPFQPPVVTATGNDEDINSYLKVSVEDSHLCPRYCARMVK NIKLAPSPSWMQRRLAACGIRPINNIVDITNYVMEEYGQPMHAFDYDLLSGHEIIVKCAK DGDVFQTLDGQERKLDSSILMINDGEKEVGIAGIMGGENSKITDDVKTMVFESACFDGTN IRLSAKKVGLRTDASGKYEKGLDPNNAEAAVNRACQLIEELGAGEVVGGMIDIYPEKRVE K >gi|229783754|gb|GG667981.1| GENE 2 1286 - 1609 295 107 aa, chain - ## HITS:1 COG:CAC2357 KEGG:ns NR:ns ## COG: CAC2357 COG0016 # Protein_GI_number: 15895624 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Clostridium acetobutylicum # 1 107 233 339 339 187 73.0 5e-48 AKELFGPETKVKFRPHHFPFTEPSAEVDVTCFKCGGSGCRFCKGSGWIEILGCGMVHPNV LRMSGIDPDEYSGFAFGVGLERIALLKYEIDDMRLLYENDIRFLRQF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:47 2011 Seq name: gi|229783753|gb|GG667982.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld375, whole genome shotgun sequence Length of sequence - 1737 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 1736 1543 ## COG4646 DNA methylase Predicted protein(s) >gi|229783753|gb|GG667982.1| GENE 1 2 - 1736 1543 578 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 577 5 595 1315 300 35.0 4e-81 FIDHPEMVLGRQEPVSTAHGMDYTVNPIEGLELSDQLHDAVKYIHGTYQEAELPELGEGE AIDTSIPADPNVKNYSYAIVDGQVYYRENSRMVRPDLNATAEARVKGLVGLRDCVQELID LQMDAAVPDSTITPKQAELNSLYDSFSAKYGLINDRANRLAYADDSSYYLLCALEVIDED GKLERKADMFTKRTIKPHQAVATVDTASEALAVSISEKACVDMSYMSRLTGKTKEALAGE LQGVIFRVPGQLEQDGTPHYVTADEYLSGNVRRKLRQAQRAAQQDPSFAVNVEALTAAQP KDLDASEIEVRLGATWIDKEYIQQFMYETFNTPFYLQRSIEVNYSSFTAEWQIKGKSSVS YNDVAAYTTYGTSRANAYKILEDSLNLRDVRIYDTIEDADGKERRVLNAKETTLAAQKQQ AIREAFRDWIWRDPERRQTLVRQYNEEMNSTRPREYDGSHITFGGMNPAITLREHQKSAI AHVLYGGNTLLAHEVGAGKTFEMVAAAMEAKRLGLCQKSLFVVPNHLTEQWASEFLRLYP SANILVTTKKDFETHNRKKFCARIATGDYDAIIMGHSQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:41:48 2011 Seq name: gi|229783752|gb|GG667983.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld376, whole genome shotgun sequence Length of sequence - 1713 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1575 633 ## gi|288871687|ref|ZP_06118599.2| hypothetical protein CLOSTHATH_07089 - Prom 1627 - 1686 3.1 Predicted protein(s) >gi|229783752|gb|GG667983.1| GENE 1 3 - 1575 633 524 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871687|ref|ZP_06118599.2| ## NR: gi|288871687|ref|ZP_06118599.2| hypothetical protein CLOSTHATH_07089 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07089 [Clostridium hathewayi DSM 13479] # 1 524 47 570 571 1008 99.0 0 MEYTKSEGFKPICIVDPAISDITYNGALGLYVQSSAKLKVFGIKSSEVGGRAGVELNGEL KVISDQPKQTCCDVKISLAASLFAQIGPDYFNIRKTIPIGAVVLGNLHIEETGIVPKCTR SEKDKGSYAGLVTDSITHTPISYARIELLQEGALIETIFSDANGSFKGTKKKPGSYTLCV SAPFYNESESRITIETGKDTTFEIALKPEKVSKSPWTATVISAESGELIENAKLEIILDN TRLAEIHTDTNGHGSGLLPVGTYTVRISHPKYKARTETIEIEENKTADFSWELTPTSEET GERMYGYLECPAINGYNPESGVGRIPFGTIEFYDASSQLISSVTADENGYYECDVPENCY SYKAYGEKCLKSESSYTPLFTSGGSYDFELKPKTVRTHLRLSALNGSTIYLAYVKITAVE SDYFTEAELVYSDDYYFMDSFSFDLPIGTYQISVTNASYLSDDAPYKAEDKEEIFTIRDD NDIFYLRVFLEPTYVIDSDINEVDLLDEAALVEPEISGETDLTN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:08 2011 Seq name: gi|229783751|gb|GG667984.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld377, whole genome shotgun sequence Length of sequence - 1690 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 5 - 151 228 ## Corgl_0944 iron-only hydrogenase maturation protein HydG 2 1 Op 2 . 
+ CDS 206 - 1426 1213 ## COG1160 Predicted GTPases + Term 1467 - 1519 11.8 Predicted protein(s) >gi|229783751|gb|GG667984.1| GENE 1 5 - 151 228 48 aa, chain + ## HITS:1 COG:no KEGG:Corgl_0944 NR:ns ## KEGG: Corgl_0944 # Name: not_defined # Def: iron-only hydrogenase maturation protein HydG # Organism: C.glomerans # Pathway: not_defined # 1 48 434 481 481 84 79.0 2e-15 MDYASEDTKAIGEKLIEAELANIPKDKVREICKDHLDQIEQGIRDFRF >gi|229783751|gb|GG667984.1| GENE 2 206 - 1426 1213 406 aa, chain + ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 3 405 4 401 411 397 50.0 1e-110 MGLNDTPGGERLHIGFFGRRNAGKSSIVNAVTGQELSVVSDIKGTTTDPVYKSMELLPMG PVVIIDTPGFDDEGALGELRVKKTREILNRADCAVLVVDSLQGKTKADEELLSLFKEKEI PFLIAYNKWDLLEKNESHGGQCTAEELTEENGITVSAETSYHIQELKEKIAGLVKEKEHG CRLVGDFLKPLDLVVLVVPIDKAAPKGRLILPQQQAIRDILEAGALSVAVRDTELSAALK KLGRDPDLVITDSQSFSEVAAVVPDHVLLTSFSVLMARYKGFLETAASGVKAVDRLQDGD RVLISEGCTHHRQCNDIGTVKIPAWIKNYTGKNVIIETSSGRSFPEDLSGYRLVIHCGGC MLNEREVRYRMKCAGDAGVPFTNYGMVIAQMKGILERSLEIFQKAR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:11 2011 Seq name: gi|229783750|gb|GG667985.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld378, whole genome shotgun sequence Length of sequence - 980 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 113 - 172 2.3 2 2 Tu 1 . 
+ CDS 246 - 395 105 ## + Term 486 - 521 3.7 - TRNA 399 - 469 61.9 # Cys GCA 0 0 - TRNA 512 - 586 63.1 # Glu TTC 0 0 - TRNA 634 - 706 76.0 # Asn GTT 0 0 Predicted protein(s) >gi|229783750|gb|GG667985.1| GENE 1 1 - 289 337 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625667|ref|ZP_06118602.1| ## NR: gi|266625667|ref|ZP_06118602.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 96 1 96 96 171 100.0 2e-41 MGALIDRIRRAVMNTGIRKKLLLIYCAAGLVPMGVIGSLMLYNTHRMLYGQYRGQLESEN KRVHNIVFEVTYLVSNISNVLGYDSALNTLLSTSYE >gi|229783750|gb|GG667985.1| GENE 2 246 - 395 105 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFITARRILSINAPIYTAPFFIIMPEFICFVNNVKSPRTLDFSRFLRLF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:22 2011 Seq name: gi|229783749|gb|GG667986.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld379, whole genome shotgun sequence Length of sequence - 1528 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 266 - 325 11.7 1 1 Tu 1 . 
+ CDS 557 - 1498 901 ## EUBELI_00788 hypothetical protein Predicted protein(s) >gi|229783749|gb|GG667986.1| GENE 1 557 - 1498 901 313 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00788 NR:ns ## KEGG: EUBELI_00788 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 4 233 3 248 315 207 43.0 6e-52 MQQIKLNRNYKDRLFRLAFQEKKDLLDLYNAVSGRQYTNPDDLIITTLADAIYLGMKNDI SFLVSDVLNLYEHQSSFNPNMPVRGLNYFADTYREYIDRNGFDIYGEKLIRLPMPQYIVF YNGTKEEPDRIELRLSDAFLCQNPEEKGCLECRATMININYGHNKELMDRCRRLKDYAVF VSRIRNNEKRGMALDEAVKQAVHSCIEEGILADILKKNRAEVCNLILYEYDEQRQLAIAR EGAMKAGREEGRAAEQVTIIRNMAGKGLNPSAIADMLGLEEGYVKKVLYLLAEETGRTDL EIAGILLEQRQEG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:28 2011 Seq name: gi|229783748|gb|GG667987.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld380, whole genome shotgun sequence Length of sequence - 1870 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 15/0.000 + CDS 3 - 674 805 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 2 1 Op 2 . 
+ CDS 711 - 1869 1043 ## COG3845 ABC-type uncharacterized transport systems, ATPase components Predicted protein(s) >gi|229783748|gb|GG667987.1| GENE 1 3 - 674 805 223 aa, chain + ## HITS:1 COG:CAC0702 KEGG:ns NR:ns ## COG: CAC0702 COG1744 # Protein_GI_number: 15893990 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Clostridium acetobutylicum # 4 217 136 351 357 125 35.0 9e-29 GGGNIMSVAYRENEAAFLGGALAGLMTKTDNVGAVMAVGETLQYRYQFGFTAGVKAVNPD CEVQSAFTNSYTDVGQGSEVAKIMYNKGADIIGTYSGACNLGVFNAAKDAGEGTYCLGAA NGQFDKMPDKILASVVKPADQAILSILKEYQETGVFDTSEPMSLGLKEGGVKLLFTNNEE LLKLIPEDVMNTIDDLAAKVESGEIKVPSDEEEFNTFTYRYAK >gi|229783748|gb|GG667987.1| GENE 2 711 - 1869 1043 386 aa, chain + ## HITS:1 COG:CAC0703 KEGG:ns NR:ns ## COG: CAC0703 COG3845 # Protein_GI_number: 15893991 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Clostridium acetobutylicum # 1 369 1 368 514 351 49.0 1e-96 MGSIIELKGITKRFPGIIANDHISISFEEGEIHTLAGENGAGKSTLMNILYGLYQPDEGE LLIRGEPVKFSNSKDSIRHRIGMVHQHFMLIPKLTVTENIIVGQEIGTAIKVDRKRAAET IGALSEQYGLKIDPNKKVADLSVTEQQRVEILKVLYREAEILIFDEPTAVLTPQEIDEFC DILLKLKEKGKTIIFISHKLAEVMKISDRITVIRLGKVVGTVEKKETDSSELTRMMVGRD VNLGRRSRRKFEKAENILEVRDVSYTVDKTVKKLNHINLELRKGEVLGIAGVDGNGQEEL VNVICGKLMPDSGSIVMNGNDITKDKIRDRKDHGLGLIPEDRHRDGLVLEYSIANNLILG LQYHETFAKNGVWLNFKKIHDTAEEL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:29 2011 Seq name: gi|229783747|gb|GG667988.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld381, whole genome shotgun sequence Length of sequence - 1312 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 311 - 370 3.8 1 1 Tu 1 . + CDS 446 - 664 106 ## - Term 725 - 785 7.2 2 2 Tu 1 . 
- CDS 805 - 1224 588 ## Closa_2035 hypothetical protein - Prom 1246 - 1305 4.1 Predicted protein(s) >gi|229783747|gb|GG667988.1| GENE 1 446 - 664 106 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPEKRRGTESWRILPRVADSWSGEGAEQGVREGDQRPDERGHGGELIRSSAEFARSPCAA SSPGLESAPEGT >gi|229783747|gb|GG667988.1| GENE 2 805 - 1224 588 139 aa, chain - ## HITS:1 COG:no KEGG:Closa_2035 NR:ns ## KEGG: Closa_2035 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 139 1 139 139 219 87.0 2e-56 MVQIIAGKKGKGKTKHLLDRANSAIRDSKGSIVYLDKSSKHMYELSNKIRLINVNEYPIK SSEGFIGFICGIISQDHDLEMMFLDSFLKLSCLEGEDITDTITTLERIGEKYHVTFVLSV SMDAENLPENAQADVVVSL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:38 2011 Seq name: gi|229783746|gb|GG667989.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld382, whole genome shotgun sequence Length of sequence - 1190 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 1 - 687 654 ## COG2199 FOG: GGDEF domain + Term 701 - 768 10.2 + Prom 744 - 803 10.5 2 1 Op 2 . 
+ CDS 863 - 1190 361 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase Predicted protein(s) >gi|229783746|gb|GG667989.1| GENE 1 1 - 687 654 228 aa, chain + ## HITS:1 COG:VC1216_2 KEGG:ns NR:ns ## COG: VC1216_2 COG2199 # Protein_GI_number: 15641229 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Vibrio cholerae # 43 216 15 194 204 76 31.0 3e-14 GSVSDVTEGKRREEKLIASRKILADVLNINGEEAEFGLDEIMALSREVKECRKRLGELER CRLAETDPLTGLHNGKTAVPLMKEWIARRRGQPSALVLFRFDGFKAVNEELGQINGERIL VNSVEKLKRFFREEDIVCRIGGYEILVLCKNIGESDMRRKLEQLMADMKLELALESAGMR PSINAGYVMTAGGGTDFDDLYEKARQALKMASARGKGICMRCGQEITD >gi|229783746|gb|GG667989.1| GENE 2 863 - 1190 361 109 aa, chain + ## HITS:1 COG:CAC2832 KEGG:ns NR:ns ## COG: CAC2832 COG0436 # Protein_GI_number: 15896087 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 1 108 1 108 393 117 54.0 6e-27 MIAEKMRPLVENNSAIRAMFEEGKRLAAIYGPENVYDFSLGNPNVPAPEAVNRAITDIVA EEASTMVHGYMSNAGFEDVRDTVAQSLNRRFGTHFHLENILMTVGAASG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:39 2011 Seq name: gi|229783745|gb|GG667990.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld383, whole genome shotgun sequence Length of sequence - 1248 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 369 331 ## gi|266619878|ref|ZP_06112813.1| putative acyl carrier protein - Prom 394 - 453 7.9 Predicted protein(s) >gi|229783745|gb|GG667990.1| GENE 1 3 - 369 331 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619878|ref|ZP_06112813.1| ## NR: gi|266619878|ref|ZP_06112813.1| putative acyl carrier protein [Clostridium hathewayi DSM 13479] putative acyl carrier protein [Clostridium hathewayi DSM 13479] # 1 122 1 122 246 195 94.0 1e-48 MEKIKIGTQEALFEIESIRPVSENVMQLVFPDTMPSVWGDITIYTDDGTEATTITGYETL YRDEGQTVYLSNDGSVYTAPETPEGPGEPPEPYIPTLEELQAAKRREISVACQQIIYQGV NV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:48 2011 Seq name: gi|229783744|gb|GG667991.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld384, whole genome shotgun sequence Length of sequence - 1855 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 25 - 756 826 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit + Term 778 - 831 13.5 + Prom 785 - 844 2.2 2 2 Tu 1 . 
+ CDS 939 - 1854 861 ## COG0148 Enolase Predicted protein(s) >gi|229783744|gb|GG667991.1| GENE 1 25 - 756 826 243 aa, chain + ## HITS:1 COG:CAC2356_2 KEGG:ns NR:ns ## COG: CAC2356_2 COG0072 # Protein_GI_number: 15895623 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Clostridium acetobutylicum # 1 242 414 654 654 211 43.0 7e-55 MRTSPLHGMLSSLSTNYNRRNKNVRLYELANIYLPKSLPLTELPDERMQFTLGMYGDGDF YDMKGVIEEFFDKAGMHKKPHYNPDDKKSYLHPGRQADIVYDGVTVGYLGEVHPDVADNY KIGERTYVAVIDMPSMIPFTTFDRKYTGIAKYPAVSRDISMVVPKDVLVGQIEDVIEQRG GKLLESYQLFDIYEGAQILAGFKSVAYSITFRAKDHTLEDKEAGAVMNKILNGLQALGIE LRA >gi|229783744|gb|GG667991.1| GENE 2 939 - 1854 861 305 aa, chain + ## HITS:1 COG:aq_484 KEGG:ns NR:ns ## COG: aq_484 COG0148 # Protein_GI_number: 15605961 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Aquifex aeolicus # 7 300 4 309 426 201 36.0 2e-51 MTEGLEIIEVTGRRLFDSEGLPAVEAEVELENGARGRAMVALCENSGNTDQMMEVLSECV LFEDASDQRGVDKILEETVSSVGTAKCGGFLTAVSMAVSRAAAAGLGMPLYRYLGGTAPG RLPVPMMTMISGGRYGKNNSLDMESYMVVPVGAASFREGAARCMEVYQALKRLLTISGHQ TGVGEDGGFVPDLRNSEEALEYVMKAMTLCGCEPGKDMCMAIGAGAGALYHEGTGFYQFK GESRMTGVPVERDRNDMMAFYIRLMDGFPVCAVFDGLAESDAEGWGLLTKILGNRAVFIG QHQYR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:48 2011 Seq name: gi|229783743|gb|GG667992.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld385, whole genome shotgun sequence Length of sequence - 1517 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 8 - 1357 1279 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 1402 - 1461 4.4 Predicted protein(s) >gi|229783743|gb|GG667992.1| GENE 1 8 - 1357 1279 449 aa, chain - ## HITS:1 COG:BH3690 KEGG:ns NR:ns ## COG: BH3690 COG1653 # Protein_GI_number: 15616252 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 3 366 2 343 420 89 28.0 2e-17 MYKKKVLSLIMTGILAASALGGCSNNEQGGEYKQTSDTVTDVVSQTNGEEKTLLKWALWD LNSVFYYKTMAEAFEAAHPDIEIELVDLGSAGGTDYMTSLSIELTGKDTEFDMVCIQDVP GYSTLVKKGVLEPLTGYIERDGFDLSGYGGLTDQVTLEGKLYELPFRSDFWVLYYNPDLF EEAGAECPTNDMTWTDYDALARKVTDITPGNEVYGTHYHTWRSCVECLGLTDGKQTMVDG NYDFLKPYYEMVLKQQDDGVCHNYAELKTTGLHHSSAFAQGNVAMYIMGTWQISGFINKI KSGEFTEIPNWRIAKMPHSEGVEAGSTIATITGIAVPAMSDNKEEAWEFIKFMAGPEGAE IVADSGSIPAVKTEAVIDKLAAMEGFPQDEQSREALHTANTYLEMPVHEKTVQIDTALNE VHDSIMTGAVSVDDGIAQLNEEIPKLLAE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:49 2011 Seq name: gi|229783742|gb|GG667993.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld386, whole genome shotgun sequence Length of sequence - 1769 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 116 - 175 7.4 1 1 Tu 1 . 
+ CDS 219 - 1712 1741 ## COG2317 Zn-dependent carboxypeptidase Predicted protein(s) >gi|229783742|gb|GG667993.1| GENE 1 219 - 1712 1741 497 aa, chain + ## HITS:1 COG:FN0061 KEGG:ns NR:ns ## COG: FN0061 COG2317 # Protein_GI_number: 19703413 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent carboxypeptidase # Organism: Fusobacterium nucleatum # 6 497 3 496 496 415 46.0 1e-115 MSKAYERLEKVLEKTMALQTALILFEWDNETLAPEEAGANTARVIGALSEEYYRVMASEE MKEAIETCEKEEGLTDVEKAIVKEAKETREQLVCIPSEEYRDNAQLVSEATRVWAKAKKD NDFDKFVPTLEKVVDYQKKFASYRAKDGEKLYDSMLDLYEKDFNMELLDQFFSELKEAIV PLIREIRERGKKIDSEFLKGDYPEEKQRELAEFLAEYVGFDFKKGVLAVSAHPFTTNLHN HDVRITTHYSDRVDSSLFSVIHEAGHGIYELGIGDELTQTLAGQGASMGMHESQSRFFEN IIGRNAAFWVPLYEKLQSIFPEQLEGVSRDQFVEAINKAEPGPIRTEADELTYSLHVLIR YEIEKMIIEEDLDLKKLPEIWADKYEEYLGIRPENASEGVLQDIHWSQGSFGYFPSYALG SAFGAQLYYYMKKEMDFEGLLERGEIGVIREYLRDHVHRFGKLKTSREILKDTTGEDFNP KYYIQYLKEKYSRLYDL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:50 2011 Seq name: gi|229783741|gb|GG667994.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld387, whole genome shotgun sequence Length of sequence - 1230 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 924 1093 ## COG0173 Aspartyl-tRNA synthetase + Term 983 - 1038 11.1 Predicted protein(s) >gi|229783741|gb|GG667994.1| GENE 1 1 - 924 1093 307 aa, chain + ## HITS:1 COG:CAC2269 KEGG:ns NR:ns ## COG: CAC2269 COG0173 # Protein_GI_number: 15895537 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 307 288 594 595 377 60.0 1e-104 RFGSDKPDLRFGMELKNVSDVVRGCEFAVFKNALEAGGSVRGINADGQGAMPRKKIDALV EYAKGFGAKGLAYIAIAEDGTYKSSFAKFMTEDELHALVEAMAGKPGDLLLFAADKDKVV FDVLGNLRLELARQQDLLKKDDFKFLWVTEFPLLEYSEEQGRFVAMHHPFTMPMDEDWHL IDSDPGAVRAKAYDIVLNGTEMGGGSVRIHQGDIQSKMFEVLGFTPEQAQEQFGFLLDAF KYGVPPHAGLAYGLDRMIMLMVGADSIRDVIAFPKVKDASDMMTEAPAPVDEKQLEELGI EVSETEE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:51 2011 Seq name: gi|229783740|gb|GG667995.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld388, whole genome shotgun sequence Length of sequence - 1826 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 310 378 ## COG0325 Predicted enzyme with a TIM-barrel fold 2 1 Op 2 . 
- CDS 325 - 1764 1587 ## Closa_2034 hypothetical protein Predicted protein(s) >gi|229783740|gb|GG667995.1| GENE 1 1 - 310 378 103 aa, chain - ## HITS:1 COG:CAC2121 KEGG:ns NR:ns ## COG: CAC2121 COG0325 # Protein_GI_number: 15895390 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Clostridium acetobutylicum # 25 102 16 93 221 104 62.0 5e-23 MVKENLEEVNRRISEACRRAGRNREDVTLIAVSKTKPVSMLEEAYETGIRDFGENKVQEL CEKHEVLQSDIRWHMIGHLQRNKVRQVIDKTVLIHSVDSVRLA >gi|229783740|gb|GG667995.1| GENE 2 325 - 1764 1587 479 aa, chain - ## HITS:1 COG:no KEGG:Closa_2034 NR:ns ## KEGG: Closa_2034 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 479 1 479 479 722 78.0 0 MAQQKVRPRKKEAAPKKKNKKIIKYRKPLHINVGMIIFALIFVYMVFSVYTYIRRDKIQF YEVQEGSIVNNTDYTGIILREEEVKTADRSGYINYYVREGKRASVGTRIYSIDETGNIAS FLEENMEEGVTLTAENLADLKKQLTVFSLTYDDNRFNSVYDTKYSLEADVMEYMNFNALD HAGNLLDEAGINFQQVRADEAGVVSYGVDSYESLNISGISEAVFDRSKYTKTITKAGNLI EKGIPVYKMITSDLWSVVFPMSDDDVKAYGDKTSLTVTFSGHDLKAAGNFSMITGTDGKP YGRLDFDKYMVQFASDRYVDFEIISDKVDGLKIPASAVTTKDFYLVPMEYLTQGGDSSES GFNKEVYSESGSSVVFVPATIYYSEEDNYYIEMGGENGFNAGDYIVKPDSTDRYQIGASA SLQGVYNINKGYAVFKQIDVLASNDEYYTIKKNMTYGLAVYDHIVLDASTVSEGELIYQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:42:59 2011 Seq name: gi|229783739|gb|GG667996.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld389, whole genome shotgun sequence Length of sequence - 943 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 498 224 ## Closa_1410 hypothetical protein 2 1 Op 2 . 
+ CDS 488 - 941 226 ## gi|266625689|ref|ZP_06118624.1| putative alkylphosphonate utilization operon protein PhnA Predicted protein(s) >gi|229783739|gb|GG667996.1| GENE 1 1 - 498 224 165 aa, chain + ## HITS:1 COG:no KEGG:Closa_1410 NR:ns ## KEGG: Closa_1410 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 129 46 173 177 135 52.0 6e-31 GSASDALTRYEHLLGLVPDAAKSDRYRRERIKAKISGAGTTTTSLIQNIAESFTNAAVNI VENFPAYTITVRFTGTSGIPGNMADIKQTIEEAVPAHLKVLYEYIFNTYGAVGTFTHAEL AAYTHERIRGGHLKNRIQELQAYQHAELAQLTHNELSKGELPNGN >gi|229783739|gb|GG667996.1| GENE 2 488 - 941 226 151 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625689|ref|ZP_06118624.1| ## NR: gi|266625689|ref|ZP_06118624.1| putative alkylphosphonate utilization operon protein PhnA [Clostridium hathewayi DSM 13479] putative alkylphosphonate utilization operon protein PhnA [Clostridium hathewayi DSM 13479] # 1 151 1 151 152 261 100.0 1e-68 MATNTTNYNFKKPDESDFYSVQDQNNNWDKADAALKDLDTPTFEDYSGSTTVPDAATAIN NIKSKGKLSTIVSNVKAAFKGACLIGQIVNNCVTNNAKLPLSAAQGKALMDLYTQLNSDL AIVGNVNGMEFGTDGEKRYFVFKHNDGSKSS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:11 2011 Seq name: gi|229783738|gb|GG667997.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld390, whole genome shotgun sequence Length of sequence - 1655 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 182 - 241 7.6 1 1 Tu 1 . 
+ CDS 281 - 1384 1137 ## COG1940 Transcriptional regulator/sugar kinase + Term 1401 - 1436 6.5 Predicted protein(s) >gi|229783738|gb|GG667997.1| GENE 1 281 - 1384 1137 367 aa, chain + ## HITS:1 COG:CAC3673 KEGG:ns NR:ns ## COG: CAC3673 COG1940 # Protein_GI_number: 15896905 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Clostridium acetobutylicum # 5 363 6 378 385 152 31.0 7e-37 MIEKKKQNKVKIAKFIRYKGHTSKPEIAAELGISMPTVLQNVKELTESGIVEEIGEYEST GGRKAKTLSITASWKMAVGLDITLNHISMVMLDLKGTITGKKRIRKNYENSYEYYRGLAD ELDCFVEETGTAPEKILGVGISIPGVIDQSRELITRSHVLHLNNVKFQAVSQFIRYPVSF ENDANSAALAEFVHRERDAVYLSLSNSVGGAIYVNRGIYAGDNFKSAEFGHMILEPDGKL CYCGKKGCMDAYCSARVLSDHTGDSLELFFQRLESGDKTIREIWNSYLSYLAIAVTNLRM AFDCDIILGGYIGEYIKPYMVELSRSMSGCNMFEQDALYLKNSRYGMEAAAVGGAMKYIE AYFETLS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:11 2011 Seq name: gi|229783737|gb|GG667998.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld391, whole genome shotgun sequence Length of sequence - 889 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 558 402 ## Spirs_1114 ROK family protein 2 1 Op 2 . 
- CDS 594 - 887 135 ## COG2017 Galactose mutarotase and related enzymes Predicted protein(s) >gi|229783737|gb|GG667998.1| GENE 1 3 - 558 402 185 aa, chain - ## HITS:1 COG:no KEGG:Spirs_1114 NR:ns ## KEGG: Spirs_1114 # Name: not_defined # Def: ROK family protein # Organism: S.smaragdinae # Pathway: not_defined # 11 172 9 166 406 72 28.0 1e-11 MLDARKVKAEKPRNIKRRNQKAIVALLQNSAEMTIPEIADRLELSRTSATQIVNELCREK VLKKVGTRDSTTAGGKPPRVFSMNASFRYTIIVIIGNDFISCNISNMKCRVIQRRVRELS DLGRWNGWNLAMEVTADEVKLLLKESGIRKEQICMMVVACGGVVDRKRGVCMFPIGTKRM NDFPV >gi|229783737|gb|GG667998.1| GENE 2 594 - 887 135 97 aa, chain - ## HITS:1 COG:TM0282 KEGG:ns NR:ns ## COG: TM0282 COG2017 # Protein_GI_number: 15643051 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Thermotoga maritima # 1 96 260 355 356 84 45.0 4e-17 YVLRGFGYRKAAELRHVESGRRMEVYTDQPGIQIYTGNHFDGSLACKGGVHYQRYAGICF ETQGFPDAVNRSHFPGCIVRKNTVFTSRTTYRFHMEP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:15 2011 Seq name: gi|229783736|gb|GG667999.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld392, whole genome shotgun sequence Length of sequence - 1697 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 17/0.000 + CDS 5 - 766 1045 ## COG0569 K+ transport systems, NAD-binding component 2 1 Op 2 . 
+ CDS 782 - 1697 750 ## COG0168 Trk-type K+ transport systems, membrane components Predicted protein(s) >gi|229783736|gb|GG667999.1| GENE 1 5 - 766 1045 253 aa, chain + ## HITS:1 COG:FN0242 KEGG:ns NR:ns ## COG: FN0242 COG0569 # Protein_GI_number: 19703587 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Fusobacterium nucleatum # 1 253 199 451 452 155 37.0 6e-38 MQSGDKISIIAPPAESTEFFKQAGIINNTIKTAMFVGGGKITYYVAKLLENTKIQIKIIE QNVERCNELSELLPKAMIIHGDGSDQELLLQEGIRQTEAFASLTGFDEENIMLSLYAASQ SKAKLITKVNKIAFENVINSMNLGSVIYPKLITSDSILQYVRAMQNSMGSNVETLYKIVA NRAEALEFRVTNNPAIVGIPLEKLRLKDNLLVACINRNGKIRSPRGKDVIEAGDRVIIVT TTTGLNDLKDILK >gi|229783736|gb|GG667999.1| GENE 2 782 - 1697 750 305 aa, chain + ## HITS:1 COG:FN0993 KEGG:ns NR:ns ## COG: FN0993 COG0168 # Protein_GI_number: 19704328 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 1 295 1 297 483 242 47.0 6e-64 MNKKIIFYLLGWIMNMEAIFMLLPCAVAVIYRESSGIWFLAVMAVCGSAGLFLTRRKPDN MVFFAKEGFVSVALSWIVLSFFGSLPFYLSREIPRFEDAMFEVISGFTTTGASILPDVES LSHCMLIWRSFTHWIGGMGVLVFILAVLPLAGGYNMYLMKAESPGPSVGKLVPKVKSTAK ILYGIYFFMTVVEILLLLAGGMPLFDSLATGFGTAGTGGFGIKNSSIAYYDSYYLQGVIT VFMILFGINFNVYYLFLVHHPKDALRCEEARVYLGIIGVTVLLITLNIRGSFTSLFAAFH HAAFQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:16 2011 Seq name: gi|229783735|gb|GG668000.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld393, whole genome shotgun sequence Length of sequence - 1633 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 325 - 384 7.2 1 1 Tu 1 . + CDS 634 - 771 85 ## + Prom 821 - 880 3.5 2 2 Tu 1 . 
+ CDS 946 - 1422 379 ## COG4422 Bacteriophage protein gp37 Predicted protein(s) >gi|229783735|gb|GG668000.1| GENE 1 634 - 771 85 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTCDSNRRFIRVLLLAEENLLLWDIKYWLYVISVENAFLSVRQTV >gi|229783735|gb|GG668000.1| GENE 2 946 - 1422 379 158 aa, chain + ## HITS:1 COG:SMa2239 KEGG:ns NR:ns ## COG: SMa2239 COG4422 # Protein_GI_number: 16263660 # Func_class: S Function unknown # Function: Bacteriophage protein gp37 # Organism: Sinorhizobium meliloti # 1 135 84 223 259 69 32.0 3e-12 MTSMSDFSCWKPEWREKVFQIMGQNPQHVYLFLTKRPQLINFETMMEQVWIGVTITSQAD KDRILCMKKHIKAPNYFITFEPLFGDMGPVDFNGIGWVVIGTETGNRNGKITSDKTWILN IAKQAEKHGISVFMKSSLLNIVGPEDLKQEIPASFCKH Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:21 2011 Seq name: gi|229783734|gb|GG668001.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld394, whole genome shotgun sequence Length of sequence - 1113 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 104 - 613 366 ## Amet_3916 transcription activator, effector binding 2 1 Op 2 . 
- CDS 616 - 777 92 ## Clole_0135 AraC family transcriptional regulator Predicted protein(s) >gi|229783734|gb|GG668001.1| GENE 1 104 - 613 366 169 aa, chain - ## HITS:1 COG:no KEGG:Amet_3916 NR:ns ## KEGG: Amet_3916 # Name: not_defined # Def: transcription activator, effector binding # Organism: A.metalliredigens # Pathway: not_defined # 5 169 78 240 241 157 48.0 1e-37 MFNTFDKPEVYLGLEPEVRFSGQIPAGESTGVDVPGQLWKRYHMEKTSISAYININFEMG MSHTEDTAQGLFTYFAGGLVRNIPHKQQGDFVKRELPAGEYIVCRIEAESFEDLVTDALN QGGKYLFGTWLPHHNLTTEPFSAEKYYRDLKDMACMEIWVKPIPLEKKE >gi|229783734|gb|GG668001.1| GENE 2 616 - 777 92 53 aa, chain - ## HITS:1 COG:no KEGG:Clole_0135 NR:ns ## KEGG: Clole_0135 # Name: not_defined # Def: AraC family transcriptional regulator # Organism: C.lentocellum # Pathway: not_defined # 1 48 54 101 302 66 60.0 4e-10 MKLRRLTKVIENLGNTEQRILDVALDYDITSHANFTRAFKETYGIHPKSIGGI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:27 2011 Seq name: gi|229783733|gb|GG668002.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld395, whole genome shotgun sequence Length of sequence - 1389 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 21 - 818 165 ## COG3344 Retron-type reverse transcriptase 2 1 Op 2 . 
+ CDS 899 - 1388 171 ## COG4646 DNA methylase Predicted protein(s) >gi|229783733|gb|GG668002.1| GENE 1 21 - 818 165 265 aa, chain + ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 1 260 553 823 834 71 28.0 2e-12 MELSEEKTLITNAQKSAKFLSYEIRVRHSNLTKRDKTGKLVRNYMGRIVLEVSSDTIKKH LIDTGAMKLIYHNGKEIWKPKAIYRLKNCDDLEILDYYNSMIRGFYNYYCIANNSSIINS YKYIMEYSMYKTYGTKYRTSISKVIGKFRVGKDFAVKFRNSKGFEKMRVFYNEGFKRQKK VFQSNADLVPNTIKYFSSTKLIDRLKARECELCGKTNTPIEIHHVHRLKDLQGKTFWEAL MIARNRKTIALCRECHKKLHCGQLN >gi|229783733|gb|GG668002.1| GENE 2 899 - 1388 171 163 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 113 1020 1124 1315 81 40.0 5e-16 MARRWVLSLQREGRIIRQGNDNEEVSIYRYITKKTFDSYLWGIVENKQRFISQIMTNKTV ERECQDVDETVLSFAEIKAIASGNPLIMEKTEVDTEVARLQMLKANYESQRYAHQDNFLF KYPKLISETENRLFGIEKDIRKRDNELAMEPDFLITLNGHTYD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:28 2011 Seq name: gi|229783732|gb|GG668003.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld396, whole genome shotgun sequence Length of sequence - 1376 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 430 249 ## bpr_I0846 carboxylesterase (EC:3.1.1.1) + Term 444 - 486 7.1 + Prom 474 - 533 3.9 2 2 Tu 1 . 
+ CDS 559 - 1375 838 ## COG0517 FOG: CBS domain Predicted protein(s) >gi|229783732|gb|GG668003.1| GENE 1 2 - 430 249 142 aa, chain + ## HITS:1 COG:no KEGG:bpr_I0846 NR:ns ## KEGG: bpr_I0846 # Name: not_defined # Def: carboxylesterase (EC:3.1.1.1) # Organism: B.proteoclasticus # Pathway: not_defined # 17 142 409 529 529 153 54.0 2e-36 LEPKGFEFVRSLAAAGGSVYSYLFALEFPYQNQKTAWHCSDIPFVFHNTELVPVANIPEV SDRLEEQMFGAVMAFARSGKPEYGGLPQWPASREDDEATMIFDRVCEVRHNHDNRLLKLH AESSPKFDLAAVMAEMGDEIQH >gi|229783732|gb|GG668003.1| GENE 2 559 - 1375 838 272 aa, chain + ## HITS:1 COG:lin0179_2 KEGG:ns NR:ns ## COG: lin0179_2 COG0517 # Protein_GI_number: 16799256 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Listeria innocua # 89 226 2 139 139 184 63.0 2e-46 MAFYYEEPSRTFSEYLLVPGYSSTKCIPAEVNLRTPLVRFKKGEEPALHINIPMVSAIMQ SVSDDRLAVALAQEGGLSFIYGSQTVAEQAAMVNRVKRYRAGFVVSDSNVSPEMTLEDVL SLTERTGHSTIAVTDDGGPGGRLLGIVTNKDYRVSRMTPDTKVKEFMTQIGDLVYADEDT TLKEANDIIWEHKINCLPLVNKEGRLVYLVFRKDYDSHKKNENELIDSSKRYMVGAGINT RDYEERVPALLEAGADVLCIDSSEGFSEWQKI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:31 2011 Seq name: gi|229783731|gb|GG668004.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld397, whole genome shotgun sequence Length of sequence - 678 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 676 551 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783731|gb|GG668004.1| GENE 1 1 - 676 551 225 aa, chain - ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 16 214 147 344 438 65 28.0 5e-11 TTWFTDQVRSYSDFGDGTLMMLPNEASVDGLFYNKTLFEKYGWKLPETFDDLISLAEEVQ KEGYYLLSAGGAENRWAWLMSQLMVRTGGIDSYNHLTYGEGLKDWADPQYGYPQALEKFK QLVDAGAYYPGTVGMMVNEADTLFCTEKVVMYYEGSWKPGNWKNIVGEEWLNANVGVMSF PAMTDMEKGDPEAIIGGTLNGWIINESLSDEEKQACCEFKPFRFQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:32 2011 Seq name: gi|229783730|gb|GG668005.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld398, whole genome shotgun sequence Length of sequence - 1087 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 603 619 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 727 - 786 7.7 - Term 724 - 771 3.3 2 2 Tu 1 . - CDS 796 - 963 69 ## - Prom 1006 - 1065 2.4 + Prom 691 - 750 11.3 3 3 Tu 1 . 
+ CDS 881 - 1085 177 ## COG1725 Predicted transcriptional regulators Predicted protein(s) >gi|229783730|gb|GG668005.1| GENE 1 3 - 603 619 200 aa, chain - ## HITS:1 COG:BH0895 KEGG:ns NR:ns ## COG: BH0895 COG0726 # Protein_GI_number: 15613458 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus halodurans # 46 199 33 186 264 147 45.0 2e-35 MDMNKKKQLFRLVKVLLLFALAFMAGRFGAMMTDGKQAVETAADGNWGLSFQQEGQPPVA NATADYLKKFNAYYAENTPEKVLYLTFDAGYENGNTAAILDALKKHNAPATFFVVGNYIE TAPDLVKRMVAEGHIVGNHTYHHPDMSKISSKEAFEKELGDLEKLFTETTGQTMKKYYRP PQGKYSETNLQMAKDMGYST >gi|229783730|gb|GG668005.1| GENE 2 796 - 963 69 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTLIRKSFARVLILGRASPAISCPLIIIFCICDVICSYIGFPLELLMMIFTAPPP >gi|229783730|gb|GG668005.1| GENE 3 881 - 1085 177 68 aa, chain + ## HITS:1 COG:BH0651 KEGG:ns NR:ns ## COG: BH0651 COG1725 # Protein_GI_number: 15613214 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 2 67 25 90 123 81 65.0 3e-16 MIISGQLMAGDALPSMRTLAKDLRISVITTKRAYNDLEQEGFIETITGKGSFVAARNTEV IREGNLRM Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:38 2011 Seq name: gi|229783729|gb|GG668006.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld399, whole genome shotgun sequence Length of sequence - 1493 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 21/0.000 + CDS 3 - 695 537 ## PROTEIN SUPPORTED gi|149378138|ref|ZP_01895857.1| Ribosomal protein S7 2 1 Op 2 . + CDS 689 - 1276 821 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 3 1 Op 3 . 
+ CDS 1296 - 1491 211 ## COG0151 Phosphoribosylamine-glycine ligase Predicted protein(s) >gi|229783729|gb|GG668006.1| GENE 1 3 - 695 537 230 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149378138|ref|ZP_01895857.1| Ribosomal protein S7 [Marinobacter algicola DG893] # 5 228 121 341 354 211 44 3e-55 PEKIADIVGGVADGCKQAGAALIGGETAEMPGFYPEDEYDLAGFAVGVVDEKNLITGKDL KAGDVLIGMASSGVHSNGFSLVRKVFEMTKESLDTYYDELGGTLGETLIAPTKIYVKALK RVKDSGVTIKACSHVTGGGFYENIPRMLPDGVRAVIKKDSYEIPAIFKLLAEKGSIEEEM MYNTYNMGIGMVLAVDPSAVDTAMEAIREAGETPYVIGSVEAGEKGVTLC >gi|229783729|gb|GG668006.1| GENE 2 689 - 1276 821 195 aa, chain + ## HITS:1 COG:CAC1394 KEGG:ns NR:ns ## COG: CAC1394 COG0299 # Protein_GI_number: 15894673 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Clostridium acetobutylicum # 1 193 1 189 204 199 52.0 3e-51 MLRVGILVSGGGTNLQAILDRLDDGSLTNVSVEVVISNNRSAYALERAKNHGIETAAISP KEFGTREEFNEAFLSKVDEYHLDLIVLAGFLVTIPEAMTRKYKNRIINIHPSLIPSFCGV GYYGLKVHEAALKRGVKVTGATVHYVDEGVDSGPILLQKAVEVKDGDTPEILQRRVMEEA EWVILPQAIQMIANS >gi|229783729|gb|GG668006.1| GENE 3 1296 - 1491 211 65 aa, chain + ## HITS:1 COG:BH0634 KEGG:ns NR:ns ## COG: BH0634 COG0151 # Protein_GI_number: 15613197 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Bacillus halodurans # 1 65 1 65 428 80 55.0 9e-16 MKVLIVGSGGREHAIAWKVAGSPKADKIYCAPGNAGIAEYAECVPFGSMEFDKLAAFAKE NQIDL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:39 2011 Seq name: gi|229783728|gb|GG668007.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld400, whole genome shotgun sequence Length of sequence - 1474 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 875 1010 ## COG2199 FOG: GGDEF domain 2 1 Op 2 . 
+ CDS 865 - 1474 390 ## BHWA1_01124 hypothetical protein Predicted protein(s) >gi|229783728|gb|GG668007.1| GENE 1 3 - 875 1010 290 aa, chain + ## HITS:1 COG:mlr5331_2 KEGG:ns NR:ns ## COG: mlr5331_2 COG2199 # Protein_GI_number: 13474448 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Mesorhizobium loti # 114 285 24 196 217 91 35.0 1e-18 INRMVGSLLEATGKLSFVFQSVDIPIAVYEYNQDMRRVLATKKLGEILMIPEEELRTVLA DRTLFVERIRELCAEPLAEEKEVYLLEGSERRYVRITSYEGEGRTLGIVIDVTEAIVEKQ QIEQERDIDLLTGLLTRRAFFRTADQLFADPGRMKTAMLLMMDLDNLKHVNDTWGHEYGD KLLKAAADLFAGLKAPEKVAARLSGDEFVLLIYGADEKEELYACLERLEKSIREARLALP EGGEAEISVSGGYLIYPEVGGDCTEMLHLADQAMYRVKKSGKGRFEKYGR >gi|229783728|gb|GG668007.1| GENE 2 865 - 1474 390 203 aa, chain + ## HITS:1 COG:no KEGG:BHWA1_01124 NR:ns ## KEGG: BHWA1_01124 # Name: not_defined # Def: hypothetical protein # Organism: B.hyodysenteriae # Pathway: not_defined # 15 203 1 191 256 154 42.0 3e-36 MAGRTTGLFEGGGIMKLYEIYFSPTGGTKKVTEIIGSAWTCEKERIDLMNPRLEPAGYSF TASDICIVAVPSFGGRVPETALERLKKMNGGGAKTVLAAVYGNREFEDTLIELKDALLEA GFCCMAAAAAVAEHSIIHKFGVGRPDEEDRRELKAFAERIKIHLEENGSGEVSVPGDHPY REYKGVPMKPKADRDCIRCGKCA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:43:44 2011 Seq name: gi|229783727|gb|GG668008.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld402, whole genome shotgun sequence Length of sequence - 2259 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 47 - 73 -1.0 1 1 Op 1 . - CDS 151 - 1278 771 ## COG2849 Uncharacterized protein conserved in bacteria 2 1 Op 2 . - CDS 1230 - 1841 271 ## gi|266625716|ref|ZP_06118651.1| conserved hypothetical protein 3 1 Op 3 . 
- CDS 1856 - 2257 226 ## gi|266625717|ref|ZP_06118652.1| conserved hypothetical protein Predicted protein(s) >gi|229783727|gb|GG668008.1| GENE 1 151 - 1278 771 375 aa, chain - ## HITS:1 COG:FN2119 KEGG:ns NR:ns ## COG: FN2119 COG2849 # Protein_GI_number: 19705409 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 207 373 162 327 338 93 31.0 5e-19 MLYGAGVTWHAVNLRVNGIKTTAEVIQIASEMRPGKDRGIGEPIFYSYFEVVNFTTKSGS EVQSRLDVKGESEDHYAIGDEVEIYYNPEAPAQIMDAGSHARLIKGIAAIAVGLFIFIVI AVMNKGLFASMGSLGWIASIGIKAVLFLLALYLLADQNLYTLYRCFPAVDEEACVNSDGV MVRRNDGKPFSGRMKSRTDQKLSVYSYKNGQLDGLDVVYYDGIVKETGCWKEGKQNGLFR LYTPSGILVDYANFENGERHGLTRQFDPETGGITAEGRYNNGQLDGTWIQYYPDGRTVAI EQTYQNGILNGSARQYYENGQLQIDMSYSDGVPSGPYKFYYPNGQIAVDGVLENGSYRSD TKMYDEDGISITEIP >gi|229783727|gb|GG668008.1| GENE 2 1230 - 1841 271 203 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625716|ref|ZP_06118651.1| ## NR: gi|266625716|ref|ZP_06118651.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 203 1 203 203 396 100.0 1e-109 MMYYKQTQNDSVRMLMEVLEMEAQENPVWTLQNLFILCLDSSFMDDNPQNSDYYDSTYLI LAELYLEFKQYDQMADVPAFLRSVRSFTMDEESKQEMIAVLTDLVSDIIDDRARVLTVQW EREGTLRDWQPHIEELVSEIMETVPEQPSVNPDTANELEVKFQKKRSKKRQLYFLQAPSC LHGCHFTSCCTGQGLPGMLSIFA >gi|229783727|gb|GG668008.1| GENE 3 1856 - 2257 226 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625717|ref|ZP_06118652.1| ## NR: gi|266625717|ref|ZP_06118652.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 133 1 133 133 272 100.0 7e-72 YYAQTLMDDPTLSWREGVEEFLKSCCYGSRSGIAVLSIEEAQKVYHCLNPENFQVFRQSQ IALSRSLMNIFGISVQVFDPRLFGNLILTMIMVHKAIPDTMPFLFPETAENMVEFQIDAL IHKMEQARDSKVT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:01 2011 Seq name: gi|229783726|gb|GG668009.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld403, whole 
genome shotgun sequence Length of sequence - 1609 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1555 775 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|229783726|gb|GG668009.1| GENE 1 1 - 1555 775 518 aa, chain - ## HITS:1 COG:VC1831_1 KEGG:ns NR:ns ## COG: VC1831_1 COG0642 # Protein_GI_number: 15641833 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 286 516 353 583 584 162 37.0 2e-39 MKRKFRFNQKIMNCIGAAALIFVVLAMIISTWNVKRSVAAEVRVAAKRDRCGKLCEELKE AADYLSEQARLYAVTGDNSCLERYWEEVHSGKRRDGVVKRLGELGLDDRDMRLVESAKKN SDLLIYLETRAMRLAADGKGENEEVLPLEVQTYVLNVVEAAMTPEEKRMEAISLLCGDRY AGEKKVIDQYASEFMESAVKRMNEELAATRRRTDRALVMLWGLQGASMLLFLLYAFFCYR GTIYPALAYQMCLEEGRAAELKPDGISEIYRLGQSVLNMYCRMAEAMKAKDEFLASVSHE IRTPLNSVIGCETLLKQTRLNEKQREYLACMETASKQLLDMVNELLDYASLEKKMEIQET EWSLREAAKDLENSFCCLAVEKGLEFSVEADDKVPEQVLADEGKFRLIAGNLISNAIKFT DEGLVHAELLWEKGTEDEGVLVLRVEDTGRGIKKEDEERIFEAFERGSKEESAPVAGTGL GLAICQRTAGLLRGSLTVESCWRKGSAFTARIPMKIGG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:02 2011 Seq name: gi|229783725|gb|GG668010.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld404, whole genome shotgun sequence Length of sequence - 1399 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 195 - 254 9.3 1 1 Tu 1 . 
+ CDS 397 - 1257 1165 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain Predicted protein(s) >gi|229783725|gb|GG668010.1| GENE 1 397 - 1257 1165 286 aa, chain + ## HITS:1 COG:lin2443 KEGG:ns NR:ns ## COG: lin2443 COG0834 # Protein_GI_number: 16801505 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Listeria innocua # 27 271 21 263 269 146 36.0 4e-35 MKKRMFGTRLTGLALAASLAVGALSGCGAKTEAKAPAADGVTVVKVGTGNNAAPFCYLDE DGNPVGYDLDVLKELDKRLEQYKFDIQTMDFSTLVVSIDSGSIDMLSHQLVKSEARKEKY LFPDEYYCLSPMSLCVRSDSGITSMADLAGKTMEQNPSAYEYQMMMAYNEAHPGNEVEII GVSDQSTADSYKKVSNGQVDAALTYKATYDSVMDDLGITNLTLTDVVMCEDTYMMFAKDQ AELVAAVSQATKEMKEDGTLGEISKKWFGGEDVFTSYADMVAISVE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:03 2011 Seq name: gi|229783724|gb|GG668011.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld405, whole genome shotgun sequence Length of sequence - 1360 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 340 216 ## COG2610 H+/gluconate symporter and related permeases 2 1 Op 2 . 
- CDS 354 - 1259 650 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase Predicted protein(s) >gi|229783724|gb|GG668011.1| GENE 1 1 - 340 216 113 aa, chain - ## HITS:1 COG:PM0793 KEGG:ns NR:ns ## COG: PM0793 COG2610 # Protein_GI_number: 15602658 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Pasteurella multocida # 7 112 2 107 449 71 41.0 3e-13 MITGGWLIVIFLIAIAILLITIIKFKVNAFIALLVTIIGTGLMVRMPVSDISKVIGNGFG NTLGSIGVIIGLGVMLGRFLYESGGIEVIADTFLKKFNGSKSRIAVAMAGFIT >gi|229783724|gb|GG668011.1| GENE 2 354 - 1259 650 301 aa, chain - ## HITS:1 COG:PH0847 KEGG:ns NR:ns ## COG: PH0847 COG0329 # Protein_GI_number: 14590708 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Pyrococcus horikoshii # 6 284 1 277 287 162 32.0 5e-40 MKNVTLQGITAPVLTPFCENGEVNTDEYARLIQYITGCGIKGIFVGGTSGEFVNLRMEER EKLLDAAREAVEKDTTVLFNVTAMNEYELYHMIEWGRKKGADAVSVTAPYYHRYDEKALC RYFGKVAEAAEGMPVYLYNMSGMTNNPITPAILKTVAEQHRNICGVKDSSMDFMTLLNYQ IAMEKMDFELITGNDAQVLPALQAGASGGIIAVASVFPELAVSIWNHFKDDNIEAARESQ KKVLRIRELFRTVMPTIAHKEALKLQGFDMGPSRFPFRDLTGEERERLKKGLEDLGIIKE V Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:04 2011 Seq name: gi|229783723|gb|GG668012.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld406, whole genome shotgun sequence Length of sequence - 1783 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 149 61 ## gi|266625724|ref|ZP_06118659.1| glycine cleavage system H protein - Prom 169 - 228 6.1 2 2 Op 1 9/0.000 + CDS 456 - 1511 1116 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 3 2 Op 2 . 
+ CDS 1527 - 1782 347 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes Predicted protein(s) >gi|229783723|gb|GG668012.1| GENE 1 2 - 149 61 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625724|ref|ZP_06118659.1| ## NR: gi|266625724|ref|ZP_06118659.1| glycine cleavage system H protein [Clostridium hathewayi DSM 13479] glycine cleavage system H protein [Clostridium hathewayi DSM 13479] # 1 49 34 82 83 82 97.0 8e-15 MRYAQKYFDFVQMQEYNKITDKKEVLMDTKAKKLTNEVGAPVAENEHSL >gi|229783723|gb|GG668012.1| GENE 2 456 - 1511 1116 351 aa, chain + ## HITS:1 COG:STM3708 KEGG:ns NR:ns ## COG: STM3708 COG1063 # Protein_GI_number: 16766993 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Salmonella typhimurium LT2 # 6 345 1 340 341 462 62.0 1e-130 MQEQTMWALTKKKPEKGLWMECVPVPVVGPNDVKIKIHKTAICGTDVHIYEWNDWAQHTI PIGLTAGHEYVGEIVETGAGVTGHQVGDLVSGEGHITCGKCRNCLEGHKENCRDAKGVGV NRNGAFAEYLVIPSSNVWPCNPNIPEELYSIFDPFGNAAHTALSYDLLGEDVLIAGAGPI GIMAAAIAKFCGARHVVVTDLNDYRLELAKKLGATRTVNLRKETLTNVMKAIGMTEGFDV GMEMSGSPAGLSDMIHNMKHGGKIALLGLQGRETTVDLETVIFNGLNLRGIYGRKVWDTW YKMTTMVQAGLDISPIITHRFDIRDYEKGFEAMISGQSGKVILDWSHINDK >gi|229783723|gb|GG668012.1| GENE 3 1527 - 1782 347 85 aa, chain + ## HITS:1 COG:RSp0961 KEGG:ns NR:ns ## COG: RSp0961 COG0156 # Protein_GI_number: 17549182 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Ralstonia solanacearum # 14 85 14 85 399 92 62.0 2e-19 MERKNDILSIYAKEVEDIKEAGLFKGEAPIASAQGARVKLEDGREVINMCANNYLGLGDS QRLIDAAKRTYDNRGYGMASVRFIC Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:10 2011 Seq name: gi|229783722|gb|GG668013.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld407, whole genome shotgun sequence Length of sequence - 1956 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones 
- 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1617 1474 ## Ccel_1549 outer membrane protein + Term 1640 - 1701 17.0 Predicted protein(s) >gi|229783722|gb|GG668013.1| GENE 1 1 - 1617 1474 538 aa, chain + ## HITS:1 COG:no KEGG:Ccel_1549 NR:ns ## KEGG: Ccel_1549 # Name: not_defined # Def: outer membrane protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 537 111 644 644 795 70.0 0 LGGDWFIYPADNSLHTGDVYLNGKSFYEAKSLDEVKNPVIRTEGVNPPWMKHPEAILQPE DTVFQWYAETDRDTTVIYANFQGANPNEELTEINVRKCCFYPEKTGINYITVRGFEMAQA ACPWTPPTADQPGLLGTNWSKGWIIENNRIHDARCSGISIGKEASTGHNLCTRTHRKPGY QYQMEAVFRARQIGWSKETIGSHVIRNNEIYDCGQNGIVGHMGCVFSEIAHNHIYNIAVK HEYFGYEIGGIKLHAAIDVQIHHNNIHNCTLGTWLDWQAQGTRVSKNLYYANDRDLMVEV THGPYLVDNNIFASSYNFDNIAQGGAYLHNLCCGTMRREDVLDRSTPYHFPHTTEVAGTT VVYSGDDRIFQNIFLGGTVTYTEQSVHGTEGYNGHTNSLEEYINDVVSRGNGDLEQFKHV KQPVYIRGNAYLKGAKPYEREENTYVSDTDPAVRIVEEDGKTYLELNVEKGMLEIPTEVY GTEKLGMPRITEAPYENPDGTPIVFDTDYLGQARSGQPAAGPMEGLKEGMNRILVWGE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:19 2011 Seq name: gi|229783721|gb|GG668014.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld408, whole genome shotgun sequence Length of sequence - 1423 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 511 350 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 2 1 Op 2 . 
+ CDS 501 - 1422 649 ## gi|266625730|ref|ZP_06118665.1| conserved hypothetical protein Predicted protein(s) >gi|229783721|gb|GG668014.1| GENE 1 2 - 511 350 169 aa, chain + ## HITS:1 COG:BS_yhdM KEGG:ns NR:ns ## COG: BS_yhdM COG1595 # Protein_GI_number: 16078017 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 7 161 1 153 163 73 32.0 2e-13 GSHEQNMDMESIYRTYENLVYRFLYARTCDSEWAQELMQETFLRAITSISRYDGSCKLSV WLCQIAKHVLWQELRKKKRLEPVELTDALPDTSVLDGEASVLQKENRLELYQAIHHLPEL EREVVLYRITGELSFRDIGEILGKSENWARTVFYRTKQKIRKELKEHES >gi|229783721|gb|GG668014.1| GENE 2 501 - 1422 649 307 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625730|ref|ZP_06118665.1| ## NR: gi|266625730|ref|ZP_06118665.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 307 1 307 308 625 100.0 1e-177 MSPNYTCDIIRDLIPGYVDGILSEAGTDLVKSHLECCQECRDFYEELKADMKLAAEPSKE ETLVLDGFKKIKKRTRRLKLAVAIVSGLLAAFVLTVLIRVFIVGVPLEPHLIQISHIAYQ ENTDSLSMEGDLNISGYRISRVVWKQSKKDRNEVNVIVYAAETLPFFHGSTHFSIEIPDM KGRKAYLACPEYDQLEVYNWKTDHYEILDQLEKEIQQRIPDWDETRDILEYSGGITAVAG EEGISYYMTYLIGKDASYWRLNDQLITDGDFKPADFEIWISLKEPHQILIYDYQTGEWNE DYSIVAE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:34 2011 Seq name: gi|229783720|gb|GG668015.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld409, whole genome shotgun sequence Length of sequence - 1233 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 47 - 1232 264 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains Predicted protein(s) >gi|229783720|gb|GG668015.1| GENE 1 47 - 1232 264 395 aa, chain + ## HITS:1 COG:CAC3339 KEGG:ns NR:ns ## COG: CAC3339 COG0488 # Protein_GI_number: 15896582 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 2 395 107 499 518 325 44.0 1e-88 MQLYEKAADGDMKSLELAARYQEQLEAHDFYSIDTAIDRVANGLGLLAIGLDRPIAEMSG GQRAKVILAKLLLEKPDVLLLDEPTNFLDKDHVAWLAEYLSSLENAFLVVSHDFDFLDKI ANRICDIDNDTITKYYGTYSEFLRKKTLLREDYIRQYSAQQKEIKKTEEFIRKNIAGRKA KMARGRQKQLDRMDKMEALEQKEIIPYFHFPVLPLTNTEHLLVKHLAVGYHYPVLSDIDF SIKGGQKVVITGFNGIGKSTLLKTLIGQIPSMQGHFKFSDQVTLGYFEQDLIWDEPEQTP IQIVSNVHPDMVIKDVRKHLARCGISSKHAMQAIGTLSGGEQAKVKMCLLTLIPCNFLIM DEPTNHLDVQAKEALKSALMDFAGTALLVSHEEAF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:35 2011 Seq name: gi|229783719|gb|GG668016.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld410, whole genome shotgun sequence Length of sequence - 825 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 8 - 67 3.0 1 1 Tu 1 . 
+ CDS 130 - 823 720 ## COG1420 Transcriptional regulator of heat shock gene Predicted protein(s) >gi|229783719|gb|GG668016.1| GENE 1 130 - 823 720 231 aa, chain + ## HITS:1 COG:CAC1280 KEGG:ns NR:ns ## COG: CAC1280 COG1420 # Protein_GI_number: 15894562 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: Clostridium acetobutylicum # 6 229 2 229 343 152 38.0 7e-37 MNDKDQLDGRKITILKAIIKTYLETGEPVGSRTISKYTDLNLSSATIRNEMSDLEELGYI VQPHTSAGRIPSDKGYRFYVDQIMQEKEEEVTEIKDLMLKRVDRVELLLKQMARILAQNT NYAALISAPQYHRNKLKFIQLSRVDDGKLLVVIVVEGNMIKNTMIPISQQLSDEGLLNLN ILLNNALNGLTIEEINLDVISRLKEQAGIHSEVVDRVLNEVAEAIRADDDD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:36 2011 Seq name: gi|229783718|gb|GG668017.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld411, whole genome shotgun sequence Length of sequence - 700 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 698 800 ## gi|266625733|ref|ZP_06118668.1| conserved hypothetical protein Predicted protein(s) >gi|229783718|gb|GG668017.1| GENE 1 2 - 698 800 232 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625733|ref|ZP_06118668.1| ## NR: gi|266625733|ref|ZP_06118668.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 232 1 232 232 429 99.0 1e-119 VDDINTKLQQLSELNQTIAHEIFVNADYDGVNYGPNDLLDQRNTIIDDLSRYGKLEVISL DQGRIQVKLGGKLVVDANGGSCSNESIRIGLDGTTLSWGDGTAANLGAGAIRGFEDMLTG SNSLNVGIPYYERKLDEFAQTMANVFNSMVHEDDPDKPGPFKTLIQGDFNGKVSAGSIRI SDLWTKDSSYIIRKKNPDGDLDNEDILAMKAALEKDFEFGDGSDKFTGTFSE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:49 2011 Seq name: gi|229783717|gb|GG668018.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld412, whole genome shotgun sequence Length of sequence - 1386 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 988 809 ## COG0784 FOG: CheY-like receiver 2 1 Op 2 . 
+ CDS 1011 - 1386 429 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|229783717|gb|GG668018.1| GENE 1 2 - 988 809 328 aa, chain + ## HITS:1 COG:PA2824_3 KEGG:ns NR:ns ## COG: PA2824_3 COG0784 # Protein_GI_number: 15598020 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Pseudomonas aeruginosa # 1 112 7 119 126 80 42.0 6e-15 LVDDNRINRRIQKELFAVLGVAVMEAESGEEAVRMASERHFDLIFMDLRMPGMSGYEAAG RIRSLKTFQCPIIALTADGKAQIRERALAYGMEEVVMKPISLAELEKVLKTYISPSCLDV ELGIGRLGGNELLYRQVTEMFAEEHGRDGKKLMDLAEKGDSEEIYALLHALKGASGAIGA EPLSKACERLEDELKGEELVKGELKERVCRIEKLVKDLIGDMNKKLNEEPIGKMPVEVSE SACMKFEENQKNMEEISLNEVIRFSGELQKDLIKEWYGMVERADFAAEELWSDNREVFIS AFGRGCAEQLERALLRFDYEMILQRIKV >gi|229783717|gb|GG668018.1| GENE 2 1011 - 1386 429 125 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 116 1 117 525 87 36.0 8e-18 MYRILFVEDDEAIRFLVSKYHFWKQSDFSVAGEAGNGKEALALMAQKTYDVVITDIRMPV MDGLELCRQMKRRGYEMPVILASTYSDFAYAKEGMRLGALEYIEKPYSCEKMEEALEIAR NYLEQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:50 2011 Seq name: gi|229783716|gb|GG668019.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld413, whole genome shotgun sequence Length of sequence - 1395 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 172 - 327 101 ## Pjdr2_0202 binding-protein-dependent transport systems inner membrane component 2 1 Op 2 . 
+ CDS 378 - 1395 543 ## COG3533 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|229783716|gb|GG668019.1| GENE 1 172 - 327 101 51 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_0202 NR:ns ## KEGG: Pjdr2_0202 # Name: not_defined # Def: binding-protein-dependent transport systems inner membrane component # Organism: Paenibacillus # Pathway: not_defined # 1 51 245 295 295 70 64.0 2e-11 MTAEQLKQMTMMSNTTLNAAKIFISMVPVLCIYPFLQKYFVTGITLGGVKE >gi|229783716|gb|GG668019.1| GENE 2 378 - 1395 543 339 aa, chain + ## HITS:1 COG:ECs4459 KEGG:ns NR:ns ## COG: ECs4459 COG3533 # Protein_GI_number: 15833713 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 8 339 4 312 656 280 42.0 3e-75 MNKIEYSSPIPLNRIRITDSFWKSEMELVRKEMIPFQWKALNDQLEEAPPSFCMHNFKVA GKRSEERKRQGKAFKEPAYTDRGFEILPEDRKELKEEFYGYVFQDSDFYKWIEAVGYSLA QEPDPELEKVADQGIDIVCNAQQDDGYLDTYYIINGINKAFTNLRDYHELYCLGHLAEGA VAYFEATGKDKLLKAAERYADCILQNFGSEEGKRKGYPGHEIAEMALVRLYEATGKEKYL KLGSFFINERGTRPYYFDLEHPESIKAGHEEDIRYAYQQAHMPVRQQTEAVGHAVRAVYL YSGMADTARCMEDRTLYEACERLWSDIVNRKMYITGGIG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:53 2011 Seq name: gi|229783715|gb|GG668020.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld414, whole genome shotgun sequence Length of sequence - 1261 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 38/0.000 + CDS 1 - 678 323 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 689 - 737 7.1 2 1 Op 2 . 
+ CDS 746 - 1259 224 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components Predicted protein(s) >gi|229783715|gb|GG668020.1| GENE 1 1 - 678 323 225 aa, chain + ## HITS:1 COG:MA1250 KEGG:ns NR:ns ## COG: MA1250 COG0747 # Protein_GI_number: 20090114 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 2 208 297 511 527 77 28.0 3e-14 GWALDREELIALGMEGLSEPVTTWLGSNPTYEKVKNAYYDSCNVEKSAQLLDDAGWILDE DGFRYKDGQQLTILLRTFRNDQALGETIQMQWKKIGVNVSVQHGDYSLITTARETGDWDA SVEAWGTFGNVSALLNAQYAADGAANYGRFQDKKLAGMLESLAQAATTEERIKIGEEISL YVAEQSPALYIAPRPQITAVSTSLEGFVPHFRQFENVVNANLRIK >gi|229783715|gb|GG668020.1| GENE 2 746 - 1259 224 171 aa, chain + ## HITS:1 COG:jhp0284 KEGG:ns NR:ns ## COG: jhp0284 COG0601 # Protein_GI_number: 15611354 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Helicobacter pylori J99 # 4 165 3 155 334 94 33.0 6e-20 MIRNMVLKRIGSALLALLLSSVVIFLLVRLAPGDPINLVLGEGPGDIGVNTELLEERRES LREAHGLNDSIPVQYANWFKKIITLDMGTSIRSGRPITQELWSRVPATFTLALAALFIET ILGVLFGIYSAVHAGKLQDRAIRLFCVILASLPAFVLSLLFLFLFCLFGFI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:44:53 2011 Seq name: gi|229783714|gb|GG668021.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld415, whole genome shotgun sequence Length of sequence - 1299 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 239 131 ## Closa_2367 diguanylate cyclase/phosphodiesterase 2 1 Op 2 . 
- CDS 242 - 1228 1142 ## Closa_2368 lipoprotein Predicted protein(s) >gi|229783714|gb|GG668021.1| GENE 1 2 - 239 131 79 aa, chain - ## HITS:1 COG:no KEGG:Closa_2367 NR:ns ## KEGG: Closa_2367 # Name: not_defined # Def: diguanylate cyclase/phosphodiesterase # Organism: C.saccharolyticum # Pathway: not_defined # 1 75 1 75 991 92 69.0 4e-18 MKKSRNESIRTHLLWPLLVLLIIQALIMAGMVLFGGVSKKLKNNEIHILSENTDNTRLYL EKETIHQWINVINDSGSMA >gi|229783714|gb|GG668021.1| GENE 2 242 - 1228 1142 328 aa, chain - ## HITS:1 COG:no KEGG:Closa_2368 NR:ns ## KEGG: Closa_2368 # Name: not_defined # Def: lipoprotein # Organism: C.saccharolyticum # Pathway: ABC transporters [PATH:csh02010] # 1 328 148 475 475 480 71.0 1e-134 MAKSTEVFAINKTDWDKFAEATGETDAAFSSWEGIARVAEKYYKWTDSLTETPDDGKAFF GRDAFANYIIIGSLQLGHEIFRVENGEVIMDFDRTAMKRLWDNYYVPYVNGYFGAFGKFR SDDVKTGQLDAFVGSTSGFAYFPTSVTLEDGTSYPIESKVYPLPNFSGTEPCAVQQGAGM MILKSEEKKEYAAATFLKWFTGVDNNIRFSVSSGYLPVKKEAGVKERLETVLAEAGEEDA ASDNLLIGLETANKYELYTTKPFEGGDRARAVLNATMADKAGEDRETVLSLMAQGMPREE AAAQFTTEENFERWYEDTKEQLETIIGE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:01 2011 Seq name: gi|229783713|gb|GG668022.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld416, whole genome shotgun sequence Length of sequence - 1742 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 1726 1670 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain Predicted protein(s) >gi|229783713|gb|GG668022.1| GENE 1 1 - 1726 1670 575 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 304 575 310 585 602 202 38.0 2e-51 MRIKTWMESRLNRKIFLSFLTVSLIPLCIVYCYINSAYSEKLRSDAFTLNQLTERNTAVL LEDYLTKIEYISNLFFDSDTQAMIRGASDPLSSYSARMALEKKVRVNLDLFKVMAYVDQV TFLKQDGTAINVMNSLGYADTIPFSAPQEGVGKRFKNQRLMQGDKLNESCTDKYVYLRQI DDLDMGNGTLGWLFIVFDRDQFDLLLQDLRAVLNTEIVMDDGQLLYDTTNGGPDAVNRLR AASVSYSAPSKERRRAERISWDYHIENLGISVTFFDDMKGVESNIRALSIMTGTVIFVTV LIILAASFLFSETIVRPVTRLHRNLVRVKEGDYTVRVQVETRDEVGDLCEAFNGMAEEID RLVNQVYSVELKEKEAAIKALQAQINPHFLYNTLDMIKSMAELNGALQVSEMIMALSKLF RYATHTDGVLVTIREELENLSSYMTIVNARFGGRIEFAARVPEELLTESIVKVCLQPLVE NAISHGLGRGRAGGRIAVTVEKENGIITVTVEDNGGGILPDRLEEIRNRLERKEHVEEES GRGHVGLKNIHDRIRLYYGETYGIVIDSFPERGTV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:02 2011 Seq name: gi|229783712|gb|GG668023.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld417, whole genome shotgun sequence Length of sequence - 1113 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 1111 575 ## Closa_0416 cell wall binding repeat-containing protein Predicted protein(s) >gi|229783712|gb|GG668023.1| GENE 1 1 - 1111 575 370 aa, chain + ## HITS:1 COG:no KEGG:Closa_0416 NR:ns ## KEGG: Closa_0416 # Name: not_defined # Def: cell wall binding repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 114 309 262 466 763 150 39.0 7e-35 IDGRCYYFAETSSDFYPQGAMYAACKTPDGYLVGTGGAWLDEHGDEWYIAGRGLSALKTD DGNHLPELDAGRKRKYGSGGGGGSTGGGGSRLPEGGNKESNTDKDQPEEKNDSERESFPE TDDSTTAPATPSDAALVSWEVRFVEEKDPNRKIIRTQSGKSEEGTEVTVDFPEIVPGSDG WQYQALEKSPKQIMVSGTGIQKYTVLFRKTDRIPDESGQEEGERRLESWIRLAEHADLEI TGGRTEGWSVITENQEESRERLRNLVSMVHDKKRHEIYLIARNHNPSTVVIGQEFPDILN ISGLLMDEFKPFRFQLDGRCYYFAETSSDFYPQGAMYAACKTPDGYLVGTGGAWLDEHGD EWYIAGRGLS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:10 2011 Seq name: gi|229783711|gb|GG668024.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld418, whole genome shotgun sequence Length of sequence - 1231 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 614 699 ## CLL_A1913 prophage protein 2 1 Op 2 . 
- CDS 634 - 1230 515 ## SP670_2151 phage scaffold protein Predicted protein(s) >gi|229783711|gb|GG668024.1| GENE 1 2 - 614 699 204 aa, chain - ## HITS:1 COG:no KEGG:CLL_A1913 NR:ns ## KEGG: CLL_A1913 # Name: not_defined # Def: prophage protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 4 204 3 202 311 185 48.0 1e-45 MAINTLATATLFMNTLDKVAIREAVTGWMDANAGQVIYNGGAEVKIPKMSVQGLGDYDRD NGYQQGGVTLEYETRKMTQDRGRKFQLDPIDINENNFVTTAAAVMGEFQRMFVVPEIDAY RISKIATETITANKAGMISYDYTPGAPGTSALRKIKEGIKAIRELYNGPLVIHATPDMIM ELEMELSGKIMNTTFSKGGIDTAV >gi|229783711|gb|GG668024.1| GENE 2 634 - 1230 515 198 aa, chain - ## HITS:1 COG:no KEGG:SP670_2151 NR:ns ## KEGG: SP670_2151 # Name: not_defined # Def: phage scaffold protein # Organism: S.pneumoniae_670-6B # Pathway: not_defined # 46 180 24 159 194 89 45.0 1e-16 IYRFFAESGDGAGADQGEGGGTGNKPDEGSSDGVDTGNKEPKSFDDLLQNKDYQAEFDRR VQKALGTAKEKWTALMDDKLSEADKLAKMNKEEKAEYLRQKQEKELKEREAAITRRELMA EAKNTLAEKKLPVGLAEVLNYTDAESCNKSMAAVEKAFQEAVQAAVEEKLKGGEPLKKAP SEDGKDLAKQVEDLMMGI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:19 2011 Seq name: gi|229783710|gb|GG668025.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld419, whole genome shotgun sequence Length of sequence - 1611 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 849 499 ## COG2942 N-acyl-D-glucosamine 2-epimerase 2 1 Op 2 . - CDS 872 - 1165 393 ## SpiBuddy_0440 extracellular solute-binding protein family 1 3 1 Op 3 . - CDS 1138 - 1611 370 ## SpiBuddy_0440 extracellular solute-binding protein family 1 Predicted protein(s) >gi|229783710|gb|GG668025.1| GENE 1 3 - 849 499 282 aa, chain - ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. 
PCC 7120 # 11 281 6 277 388 132 29.0 8e-31 MKIGITKEEARQLSALYKSQLSEIMEEWAEHSADPSGGYLTDFGENWKLVSRRRNIWAQA RQTYMFAAYYEYSGREEKWLSLAKAGRDFLVCHAYAGEGRWNYEVSENGEEVIEGTTSIF TDLFALIALAQYASASGDQTDYDLIRKTFDSARNHIMDPEFRDIKPHVWQNGIERHSPYM IAVHSSMVAEKVLGKEVTGPFIDFCIRKLLYFFGKNESGYLLESLKEDGSVWDTPEGRIV NPGHIFEGMWFCIDYARDTDDAVISRALTIMQETAKAAVDRT >gi|229783710|gb|GG668025.1| GENE 2 872 - 1165 393 97 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_0440 NR:ns ## KEGG: SpiBuddy_0440 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: Spirochaeta_Buddy # Pathway: not_defined # 1 94 327 422 426 86 50.0 3e-16 MEKLVEYIITNDNLSYRMSNETLAYPAKKSMEHAVDDSLILKGAAGSEANTKTQVFIPQN SDLNLELCALAQAVTVGKTDVDKAIADFKKAAEMILD >gi|229783710|gb|GG668025.1| GENE 3 1138 - 1611 370 157 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_0440 NR:ns ## KEGG: SpiBuddy_0440 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: Spirochaeta_Buddy # Pathway: not_defined # 1 147 178 324 426 234 76.0 1e-60 AEKLSQLKDADGNQVYAFGQTTASVPVSGASLTSMVFNFGGQVLDENGKLSCDNEGFKQA FEMLKLLDEKGYNPQNAKLKDLRNLFALGRLAMYYDQSWGFAGIKPINPEAEAFTVTAKP LKGGSGTGQSILQSHCLMMVDNGEARKKPWKSLLNIL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:24 2011 Seq name: gi|229783709|gb|GG668026.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld420, whole genome shotgun sequence Length of sequence - 982 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 980 825 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783709|gb|GG668026.1| GENE 1 2 - 980 825 326 aa, chain - ## HITS:1 COG:BH1864 KEGG:ns NR:ns ## COG: BH1864 COG1653 # Protein_GI_number: 15614427 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 36 326 146 435 461 64 24.0 2e-10 FFPEAGEAAIWKNKKILKNLSPWYETKNQVNASLYGDVEKDGALYMIPYRVGKICVYYNK TLFDSRGVEYPQENWTWEDYREKAGILTGWGNNKKVYGALGFENSSSWWMLPARTRGAFD PFKEEDLKMFRESAEWCHAFISESAEKTPYMSRTDDDWSGYHILFAEGRLGMYFGEDGEV NTINQELRKTGMNIEYDVMELPSWEGTEKADVYNTAVVAMAEATKYPEETYRFMKFCTGE EGGKILAKNRTVPFLQSPEVLKIYLSGTQIPEHAEYFLMDQAPRGLAVGTLHNGGIEVMR REVSMYLLGEQELEYTFKTIEEELAK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:25 2011 Seq name: gi|229783708|gb|GG668027.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld421, whole genome shotgun sequence Length of sequence - 1077 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 126 - 185 5.4 1 1 Tu 1 . 
+ CDS 211 - 417 281 ## COG1396 Predicted transcriptional regulators Predicted protein(s) >gi|229783708|gb|GG668027.1| GENE 1 211 - 417 281 68 aa, chain + ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 67 1 67 195 62 44.0 3e-10 MSLGNNLFNARKGKGLSQEVVAEKLGVSRQTISKWETDETLPDIQQSKKLAVLYGLTLDE LIEFDVDI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:26 2011 Seq name: gi|229783707|gb|GG668028.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld422, whole genome shotgun sequence Length of sequence - 1159 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1159 898 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit Predicted protein(s) >gi|229783707|gb|GG668028.1| GENE 1 1 - 1159 898 386 aa, chain - ## HITS:1 COG:hyfB KEGG:ns NR:ns ## COG: hyfB COG0651 # Protein_GI_number: 16130407 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Escherichia coli K12 # 2 197 219 419 672 132 37.0 1e-30 VGVKAGCFPLHIWLPKAHPVAPAPASALLSGILTKAGVFGIIVITGALFFSDGNWGLLLT ALGLVTMFLGALLALLSVNLKRTLACSSISQIGFILTGIGMAVLLACVGEKNSLAVRGTL LHMVNHSLFKLVLFLCAGAVYMNLHKLDLNEIRGFGRRKPALGFCFLMGSLGIGGIPLWS GYVSKTLLHESMVEYAKAVGEGAVVLGNQGMPSAFAGMPGIITALLLNPGVWKAAEWVFL LTGGMTVAYMLKLFIALFVEKHPANQEAYDAMSASYMKPLSRFVLVGCSALIPLLGMTPH LTMDRIAGAGEAFFQGDGLAHSMAYFSPENLKGGAISIGIGIVFYVAVVRGLLMKRSDGG EAVYADRLPSWLDLENLLYRPLLLRI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:27 2011 Seq name: gi|229783706|gb|GG668029.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld424, whole genome shotgun 
sequence Length of sequence - 1638 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 294 232 ## Closa_2382 cobalamin biosynthesis protein CobD 2 1 Op 2 . - CDS 305 - 742 364 ## Closa_2383 hypothetical protein 3 1 Op 3 . - CDS 776 - 1147 331 ## Closa_2384 cobalamin-5-phosphate synthase CobS - Prom 1178 - 1237 1.6 + Prom 1188 - 1247 5.8 4 2 Tu 1 . + CDS 1447 - 1566 71 ## Predicted protein(s) >gi|229783706|gb|GG668029.1| GENE 1 3 - 294 232 97 aa, chain - ## HITS:1 COG:no KEGG:Closa_2382 NR:ns ## KEGG: Closa_2382 # Name: not_defined # Def: cobalamin biosynthesis protein CobD # Organism: C.saccharolyticum # Pathway: Porphyrin and chlorophyll metabolism [PATH:csh00860]; Metabolic pathways [PATH:csh01100] # 1 97 1 97 331 106 57.0 3e-22 MIRYHILAALAGCALDALFGDPRRIPHPVCGIGNLIAWLEARLRKWFPADAKSERRAGAV MVSSVLLITGLAAALILTAAYAVHPLAGLAVESVMCG >gi|229783706|gb|GG668029.1| GENE 2 305 - 742 364 145 aa, chain - ## HITS:1 COG:no KEGG:Closa_2383 NR:ns ## KEGG: Closa_2383 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 145 1 140 140 134 47.0 9e-31 MIFITGGAWQGKTAFAETLAVDFVTMESGERVSCAGSITEKKAVPVACGRSDPFETAFER PVISGFHYFVRRLLREERDVNEFIAAIGEKNPNAVITADELGCGIVPADPEERAWREAAG RAAEILAADSDAVYRMICGIAVRLK >gi|229783706|gb|GG668029.1| GENE 3 776 - 1147 331 123 aa, chain - ## HITS:1 COG:no KEGG:Closa_2384 NR:ns ## KEGG: Closa_2384 # Name: not_defined # Def: cobalamin-5-phosphate synthase CobS # Organism: C.saccharolyticum # Pathway: Porphyrin and chlorophyll metabolism [PATH:csh00860]; Metabolic pathways [PATH:csh01100] # 9 118 141 250 255 129 69.0 4e-29 MILAAYGGSFVITRALSGLAVVTFPMAKNSGLAASFSGQAQKRTVAVTMWLYLVFTECWI LYTGGIAAAVMTAAAAALTFLYYYRMSKKEFGGITGDLAGYFLQVCELVLTAVFAAVSRG ILA >gi|229783706|gb|GG668029.1| GENE 4 1447 - 1566 71 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MAYFIRSFVQATVGMGILEYMENAIARLQNRFIVHFLPL

Prediction of potential genes in microbial genomes
Time: Fri Jul 1 03:45:40 2011
Seq name: gi|229783705|gb|GG668030.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld425, whole genome shotgun sequence
Length of sequence - 1492 bp
Number of predicted genes - 3, with homology - 2
Number of transcription units - 2, operones - 1 average op.length - 2.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 13 - 153 85 ##
2 2 Op 1 1/0.000 - CDS 134 - 772 778 ## COG2964 Uncharacterized protein conserved in bacteria
3 2 Op 2 . - CDS 792 - 1427 807 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase
Predicted protein(s)
>gi|229783705|gb|GG668030.1| GENE 1 13 - 153 85 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MGILIRGMKKGRGAAAERQAVPASIPGSSYARRAPRPVFVIICALS
>gi|229783705|gb|GG668030.1| GENE 2 134 - 772 778 212 aa, chain - ## HITS:1 COG:YPO0626 KEGG:ns NR:ns ## COG: YPO0626 COG2964 # Protein_GI_number: 16120952 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 11 186 1 179 196 89 32.0 6e-18
MHENLQVIQTMLRSIHQMLGDRYEVILHDLSHVESSIVGIEGCITHRKIGGPATNYLIQL
LREYGDEAKDSINYKNVMPDGRVLRSSTIFIRDNDGKIIGSLCINQDLTDFMVANRLLER
LVEFKSEQHEVPKEMFAQDISEVMEAMVGSELALMNKPVAYMQKEDKLALVDDLERKGVF
DVKGSVEYVAECLGVTNFTVYNYLKEIRTKHK
>gi|229783705|gb|GG668030.1| GENE 3 792 - 1427 807 211 aa, chain - ## HITS:1 COG:PH0771 KEGG:ns NR:ns ## COG: PH0771 COG0436 # Protein_GI_number: 14590640 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Pyrococcus horikoshii # 3 204 193 388 391 112 32.0 5e-25
MSLCEKYDLYIMNDEIWSDIVYPEQEFLSILSLGEERTGRVMSVFGFSKSFGIAGLRAGM
IYCTDKAVFERLVEKSAVMTTAGGIASVSQIAGITCLKECYYWVDEFISHLVKNRDYAVE
RINSMPVISCHKPQATYLLYVDISGLGVPSAQFTDYLIEHAGLALVPGGETFFGPGSEGA
MRICFATSREILEEGLNRLETGLKMFLEEKK

Prediction of potential genes in microbial genomes
Time: Fri Jul 1 03:45:46 2011
Seq name: gi|229783704|gb|GG668031.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld426, whole genome shotgun sequence
Length of sequence - 1434 bp
Number of predicted genes - 4, with homology - 4
Number of transcription units - 1, operones - 1 average op.length - 4.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 . + CDS 7 - 237 71 ## Ethha_1903 DNA topoisomerase III
2 1 Op 2 . + CDS 219 - 377 59 ## Ethha_1903 DNA topoisomerase III
3 1 Op 3 . + CDS 374 - 652 294 ## Ethha_1904 hypothetical protein
4 1 Op 4 . + CDS 659 - 1433 511 ## COG0358 DNA primase (bacterial type)
Predicted protein(s)
>gi|229783704|gb|GG668031.1| GENE 1 7 - 237 71 76 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1903 NR:ns ## KEGG: Ethha_1903 # Name: not_defined # Def: DNA topoisomerase III # Organism: E.harbinense # Pathway: not_defined # 1 48 562 609 686 69 58.0 4e-11
MLKDLVGTYQVIKGTEYLFSPPRDVVGKCPHCGGEVVELQKGFFCQNDLANLQSGKITNG
GLPRKNSRPRLWCLHC
>gi|229783704|gb|GG668031.1| GENE 2 219 - 377 59 52 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1903 NR:ns ## KEGG: Ethha_1903 # Name: not_defined # Def: DNA topoisomerase III # Organism: E.harbinense # Pathway: not_defined # 1 51 633 686 686 74 68.0 2e-12
MVSALLKDGRVRVTGLYSEKTGKTYDATVVLEDDGQYANFKLEFDQRKGGSR
>gi|229783704|gb|GG668031.1| GENE 3 374 - 652 294 92 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1904 NR:ns ## KEGG: Ethha_1904 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 92 1 97 100 71 51.0 1e-11
MKLSLVERETIFLYNQAEPMAEVYTHDPRLMEKLELLAKKHPDQITRKDAHNFTVPKRCV
SVREPYSAERRKAASERAKAAGYQPPVRKSSS
>gi|229783704|gb|GG668031.1| GENE 4 659 - 1433 511 258 aa, chain + ## HITS:1 COG:RC1330 KEGG:ns NR:ns ## COG: RC1330 COG0358 # Protein_GI_number: 15893253 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Rickettsia conorii # 4 117 16 124 595 67 33.0 3e-11
MAENVFEAVKQSVSTREAAEFYGIEVKRNGMACCPFHDDKNPSMKVDQRFHCFGCGADGD
VIDFTAKLFDLSSKEAAEKLAQDFGLIYDSQAPPRRKYVRQKTEAQKFREDRQRCYRVLS
DYYYLLKKWEIDNSPRTPEEEPHPRFVEAIQKKTYVEYLLDLFLYESEEEQKAWIADHTA
EITHLERRLKIMAENKPTNRERLREITDGIEQGIKELFESEKYMRYLSVMSRFHRYSVNN
TMLIYMQKPDATLVAGYN

Prediction of potential genes in microbial genomes
Time: Fri Jul 1 03:45:52 2011
Seq name: gi|229783703|gb|GG668032.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld427, whole genome shotgun sequence
Length of sequence - 1027 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 2 - 1025 924 ## COG1409 Predicted phosphohydrolases
Predicted protein(s)
>gi|229783703|gb|GG668032.1| GENE 1 2 - 1025 924 341 aa, chain - ## HITS:1 COG:CAC0205 KEGG:ns NR:ns ## COG: CAC0205 COG1409 # Protein_GI_number: 15893498 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 3 192 307 505 652 130 42.0 4e-30
SDWTVVTFHHSIYSTASHESDNDIIQRRAELSPVFTELGIDVVLMGHDHVYTRSYMMNGT
DPIIPEDGTVPESVTDPAEGEVLYVTANSASGSKYYSIHNKDFPYAAVMNQESTPNITNV
EVTDKSFAITTYRTKDMSVVDTFAIYKDGYQPPESVIKSVSLGVGADESETMVTWYSDSK
LPGKVQLVKKSDLANGVFPETAAEFAAEKESANEEGFFTNQAVIRGLESATEYAYRVGDG
TAWSDVYDLTVQDYENGFNFLLAGDPQIGAGSTDTDIKGWQNTMETAMKAFPETSFLISA
GDQVNTASNETQYAGYLSPKELLSLPAAVNVGNHDAGSSAY

Prediction of potential genes in microbial genomes
Time: Fri Jul 1 03:45:53 2011
Seq name: gi|229783702|gb|GG668033.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld429, whole genome shotgun sequence
Length of sequence - 728 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 .
- CDS 2 - 728 728 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes Predicted protein(s) >gi|229783702|gb|GG668033.1| GENE 1 2 - 728 728 242 aa, chain - ## HITS:1 COG:YPO0059 KEGG:ns NR:ns ## COG: YPO0059 COG0156 # Protein_GI_number: 16120412 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Yersinia pestis # 2 240 108 346 403 335 68.0 3e-92 ADDTILYSSCFDANGGLFETILSENDAVISDELNHASIIDGVRLCKAKRYRYKNNDMTDL ESKLKEADAAGARIKLIATDGVFSMDGIICNLKGVCDLADQYHALVMVDDSHAVGFIGKH GRGTAEYNGVEGRVDIITGTLGKALGGASGGYTSGRREIIDLLRQRSRPYLFSNTLAPAI VGASLELFDMLEESTKLRDHLEETTAYYRKQLTDNGFDIIPGTHPCVPVMLYDEKLAGEF KP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:54 2011 Seq name: gi|229783701|gb|GG668034.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld430, whole genome shotgun sequence Length of sequence - 1397 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 1367 1551 ## COG1132 ABC-type multidrug transport system, ATPase and permease components Predicted protein(s) >gi|229783701|gb|GG668034.1| GENE 1 3 - 1367 1551 454 aa, chain + ## HITS:1 COG:lin0155 KEGG:ns NR:ns ## COG: lin0155 COG1132 # Protein_GI_number: 16799232 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Listeria innocua # 3 453 140 587 593 467 52.0 1e-131 GDVNSLKDVLSDNVTQLIPDLITVVCVAVIMLIKNYKLAMAALLTLPILVVGMLVIETTA HKRWQIYRKKTSNLNAYVHEDLSGIRVIQSFAAERETRAVFYDLVDQHYRAFIDAVVVAD GFGPVVEITWGLGGFLLYFIGIRVIGVGEVGIGTFLAFSTYIAMFWSPIRNLANFYNKLT TNISAAERIFDIIDTEAGIRDCPGAAELPEIEGTVDFEHVSFAYDDEPERMILKDVNFHI RQGETIALVGPTGAGKTTIVNLLSRFYETTDGRVLIDGHDIKKVTLKSLRSQMGIMTQDN FLFSGTIKYNIKYGRLDATDEEMMEAARAVNAHDFIMKLEHGYDTEISERGARLSIGQRQ LLAFARTMLSKPGILILDEATSSIDTHTELLVQKGIEALLEGRTSFVIAHRLSTIRKADR IFVIDQGNIMEAGSHEELMERKGAYYQLYQSQFL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:55 2011 Seq name: gi|229783700|gb|GG668035.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld431, whole genome shotgun sequence Length of sequence - 855 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 727 683 ## COG0642 Signal transduction histidine kinase + Term 739 - 790 18.1 Predicted protein(s) >gi|229783700|gb|GG668035.1| GENE 1 2 - 727 683 241 aa, chain + ## HITS:1 COG:BS_yrkQ KEGG:ns NR:ns ## COG: BS_yrkQ COG0642 # Protein_GI_number: 16079695 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 1 234 193 425 432 115 31.0 6e-26 SVKGSDELGELAEGINRMRISILEREQKEAEARQANQELVTAMSHDLRSPLTSLLGYLEF LDEEETEDAAQQKHFIATSRKKAMQIKEMSDRLFEYFLVYGREEKPAMEPVNGAALFQQT VGESAFALEGLGFTVDFQMEEMKGIYYVSVDLFRRVVDNLFSNLAKYAAREFPVVITCVQ TKEGARITIRNRKRTDGALVESSGIGLKTCRKIMEEHEGSFETTEDEDRFTATLFLPGTA R Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:55 2011 Seq name: gi|229783699|gb|GG668036.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld432, whole genome shotgun sequence Length of sequence - 1354 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1242 1123 ## COG2272 Carboxylesterase type B Predicted protein(s) >gi|229783699|gb|GG668036.1| GENE 1 3 - 1242 1123 413 aa, chain - ## HITS:1 COG:BS_pnbA KEGG:ns NR:ns ## COG: BS_pnbA COG2272 # Protein_GI_number: 16080492 # Func_class: I Lipid transport and metabolism # Function: Carboxylesterase type B # Organism: Bacillus subtilis # 12 412 5 391 489 197 34.0 3e-50 MAKQFLYDNLPVVETKAGKLRGYQWEGTYIFKGIRYARANRFQLPEEVEPWEGVKEAASY GFVCPMLTRDHPQGELLVPHRYWPQDEDCLSLNIWSQSLDRSAKKPVMFWIHGGAFSMGS SIEQKAYNGENMSRYGDVVVVTVNHRLNILGYLDLSPYGERYAGSANAGQADLVAALKWV RDNIEAFGGDPDNVTIFGQSGGGMKVSGLMQTPEADGLFHRAMIMSGVAGDVLPYSTGDS RPLIQAMLKELGLAEQEAGRLETVPYYDLAAAYNRVSPAIARAGGYIGCTPRPDDFYKGE GPAVGFTDHAKTIPVMVGTVFGEFAMMPLPFNKETISEAELDEILDKRFQGHGKELKTVF AEAYPGKSPVDLLTLDTIFRGPTKEFVRSLAAAGGSVYSYLFALEFPYQNQKT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:56 2011 Seq name: gi|229783698|gb|GG668037.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld433, whole genome shotgun sequence Length of sequence - 504 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 485 427 ## COG2199 FOG: GGDEF domain Predicted protein(s) >gi|229783698|gb|GG668037.1| GENE 1 2 - 485 427 161 aa, chain - ## HITS:1 COG:DR0267 KEGG:ns NR:ns ## COG: DR0267 COG2199 # Protein_GI_number: 15805298 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Deinococcus radiodurans # 69 161 335 430 511 66 36.0 2e-11 MIIEDLEEIREPFPEEYQILHAQSITSLAAAPMEQDGTLIGYLGVDNPSPARLQNISPLL QTLCYFLMLARNHAENKQLLTHLSYYDKLTDFYNRNKYIVDTGALAGSNQPVGIVYLDVN GLKDINDHYGHESGDRVLIECAKRIRAAFTQADFYRIGGDE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:57 2011 Seq name: gi|229783697|gb|GG668038.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld434, whole genome shotgun sequence Length of sequence - 670 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 8 - 670 624 ## COG3119 Arylsulfatase A and related enzymes Predicted protein(s) >gi|229783697|gb|GG668038.1| GENE 1 8 - 670 624 220 aa, chain - ## HITS:1 COG:YPO0829 KEGG:ns NR:ns ## COG: YPO0829 COG3119 # Protein_GI_number: 16121138 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Yersinia pestis # 5 207 302 493 517 114 34.0 1e-25 EGFEFAEYYGAIEGLDDNFGRIMEYLDREGLEEDTIVVLSADHGDCMGSHGLMGKNIWYD ESIRIPLYIRGPRIAAGRTDALIASQDHMPTLLELLDAAVPDTVQGRSFASLARGESMEE EPEHAFLCMIPGMPELVEEYRKLGLNNKAFGWRGIRTKDSTYIIDNGTSPSATQRRLFYD NQKDPLQLNPIELDKGSALANAYDEVIESYLRKTRDPFLM Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:45:58 2011 Seq name: gi|229783696|gb|GG668039.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld435, whole genome shotgun sequence Length of sequence - 1326 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 665 372 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Term 685 - 726 8.2 2 2 Tu 1 . - CDS 794 - 886 133 ## - Prom 939 - 998 4.8 3 3 Tu 1 . - CDS 1013 - 1324 401 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases Predicted protein(s) >gi|229783696|gb|GG668039.1| GENE 1 3 - 665 372 221 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 220 2 232 329 147 36 3e-36 TKDMQSSEPIIQLVGLGKEFRTAGGPIKALDDINLTICQGEIFGIIGLSGAGKSTLVRCI NYLEVPTSGEVLFENRSLASMSGRDQRKARQSMGMIFQQFNLLAQRNVIQNVCFPLEIAG ASKKEAKERAMELLGLVGLADRKDAYPSQLSGGQKQRVAIARALATNPKILLCDEATSAL DPNTTKSILALLKEINKSMGVTVVVITHEMSVIEAICDRVA >gi|229783696|gb|GG668039.1| GENE 2 794 - 886 133 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNQLKAMQMNMYMMNGMMMCRMSMCVKNRK >gi|229783696|gb|GG668039.1| GENE 3 1013 - 1324 401 103 aa, chain - ## HITS:1 COG:BS_ytjP KEGG:ns NR:ns ## COG: BS_ytjP COG0624 # Protein_GI_number: 16080050 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Bacillus subtilis # 19 101 379 461 463 75 42.0 3e-14 KAKMAEKGFTLLNDSMKPPHHVDEDSEFIRTLLRTYEEYTGRTGECIAIGGGTYVHELKN GVAFGAAMPETDNRMHGADEFAVIEELIVSAKMFAQVIVDLCS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:02 2011 Seq name: gi|229783695|gb|GG668040.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld436, whole genome shotgun sequence Length of sequence - 1096 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 1094 620 ## SEQ_1253 conjugative transposon DNA recombination protein Predicted protein(s) >gi|229783695|gb|GG668040.1| GENE 1 2 - 1094 620 364 aa, chain - ## HITS:1 COG:no KEGG:SEQ_1253 NR:ns ## KEGG: SEQ_1253 # Name: not_defined # Def: conjugative transposon DNA recombination protein # Organism: S.equi_equi # Pathway: not_defined # 2 271 895 1163 3975 245 45.0 2e-63 SAAERIGELLESGQFASNVELAEAAGYERSLLAEKLWHLYHDFSDKARDSGYLSCLSGIQ RTGFPEETAWLAEQLSDPAFRQTLKEEYAAFWTAYQQDRDLLRFHYHRPREIWENLKDLD LPRRTFSSDLSQVPTVQHFITEDEIDAAMTGGSSFAGGKGRIYAFFMENHTDKEKVRFLK DEYGIGGRSHALSGATHSGEDHDGKGLHYKKQDCPDVHLNWEKVAKRITSLVQKGRYLTE QEQAQYDKIQAEKELAEEDAIQAQQPEVEEETPKPTLREQFEQYKPVVTAAISEDAAYRN ACGHSDHENAVIEGNAAVRRAVLGSKDMELIRLYSDVPEFRQRLHREVIDETYPKLHELL RPLS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:09 2011 Seq name: gi|229783694|gb|GG668041.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld437, whole genome shotgun sequence Length of sequence - 1203 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 380 231 ## AB57_3247 prophage LambdaBa04, DnaD replication protein, putative 2 1 Op 2 . - CDS 386 - 523 148 ## gi|266625775|ref|ZP_06118710.1| hypothetical protein CLOSTHATH_07202 - Term 676 - 724 -0.9 3 2 Tu 1 . - CDS 842 - 1000 138 ## COG1129 ABC-type sugar transport system, ATPase component - Term 1028 - 1072 8.2 4 3 Tu 1 . 
- CDS 1093 - 1203 56 ## Predicted protein(s) >gi|229783694|gb|GG668041.1| GENE 1 2 - 380 231 126 aa, chain - ## HITS:1 COG:no KEGG:AB57_3247 NR:ns ## KEGG: AB57_3247 # Name: not_defined # Def: prophage LambdaBa04, DnaD replication protein, putative # Organism: A.baumannii_AB0057 # Pathway: not_defined # 1 126 1 126 297 147 51.0 2e-34 MREYTTGNSIVDASAEISITGNITPQTWYKTIVKETGKPHLTAIVILADIVYWYRPTELR DESTGQIIAIRKKFKADLLQRSYQQIAEQFGLSKKEATNAIIFLEKLGVIKRVFRTINLN GLVVNN >gi|229783694|gb|GG668041.1| GENE 2 386 - 523 148 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625775|ref|ZP_06118710.1| ## NR: gi|266625775|ref|ZP_06118710.1| hypothetical protein CLOSTHATH_07202 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07202 [Clostridium hathewayi DSM 13479] # 1 45 1 45 45 68 100.0 2e-10 MQIVFIGIGFLITLSAVVSIALAVTAKKSDERLRAICTSLEEKGR >gi|229783694|gb|GG668041.1| GENE 3 842 - 1000 138 52 aa, chain - ## HITS:1 COG:BH3441 KEGG:ns NR:ns ## COG: BH3441 COG1129 # Protein_GI_number: 15616003 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Bacillus halodurans # 6 50 7 51 517 59 64.0 1e-09 MGDTMLLEMKNITKEFPGVKALDDVSLELKEGEILALVGENGAGDRVIIRPS >gi|229783694|gb|GG668041.1| GENE 4 1093 - 1203 56 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no AIPAAIDLIEGKTDVPRSQGPTPFMIDRGNLAENSN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:21 2011 Seq name: gi|229783693|gb|GG668042.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld438, whole genome shotgun sequence Length of sequence - 1196 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1194 1040 ## COG0827 Adenine-specific DNA methylase Predicted protein(s) >gi|229783693|gb|GG668042.1| GENE 1 3 - 1194 1040 397 aa, chain - ## HITS:1 COG:pli0004 KEGG:ns NR:ns ## COG: pli0004 COG0827 # Protein_GI_number: 18450290 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Listeria innocua # 279 396 379 502 756 111 50.0 2e-24 SISDEEYDAVRRPIPQRTSYDPAAPVYAVGDTVYIEDDAYQITELREDTVQLLPTGMVYP IYRAERKEQFEQLLRADRRNAYYTELLPIDPDKADQDLRDVLSHGLMDEADKKQVSTLLQ SGRSNSEIAYWLSRAYPHEIETLNLETGDIADYRTTAQGMELEVLDAEEKRLAVLYFRWD EVAPLLRGMYARQMDGFGQEQPQPSTESPTFHSETVAVYPGDKNNLPYDVVVERLHIEEP EPPAPVTESEKLFEEVLDEHPVSIQVNGQWQTFPNAKAAEEASYEEYKANLRRNAQNFRI TDEHLGEGGPKAKFQANINAIRLLKELEAAGQQASPEQQEVLSRYVGWGGLSDAFDPEKP AWASEYAQLKELLTPEEYAAARSSTLNAHYTSPTVIQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:22 2011 Seq name: gi|229783692|gb|GG668043.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld439, whole genome shotgun sequence Length of sequence - 552 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 552 168 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain Predicted protein(s) >gi|229783692|gb|GG668043.1| GENE 1 3 - 552 168 183 aa, chain + ## HITS:1 COG:CAC0289 KEGG:ns NR:ns ## COG: CAC0289 COG0745 # Protein_GI_number: 15893581 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 183 30 212 235 174 46.0 6e-44 GFREIRRAYRGSEAVTLCREFQPDAVILDVMLPDMDGLEVCRRIREFSYCSILFLSSRND DIDKILGLSSGGDDYITKPFSPREVAFRVKAQLRRQRYQNAPSPAVSSVLTAGPLSLDQE SGRVWKNGREISLTGREFLLLSYLMENTDKIISKERLYEQVWGESSCICDNTIMVHIRHL REK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:23 2011 Seq name: gi|229783691|gb|GG668044.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld440, whole genome shotgun sequence Length of sequence - 1277 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 1277 1026 ## COG5263 FOG: Glucan-binding domain (YG repeat) Predicted protein(s) >gi|229783691|gb|GG668044.1| GENE 1 2 - 1277 1026 425 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 311 420 520 625 744 78 39.0 2e-14 ADTIKVTKEPDKMEYVTGEKFDPAGMEVTVYETASKSNADRKRILSPEDYEVTPNRFDTA GVNEVTVTYSGKNAEGVNEDFTDSFSVRVIESEEEFYTTGIKVTKKPTKTEYEIGEDFNP EGMEVVAYETASSSNAIRREIELSPEDYDISQEEFDTAGTKMITVTYHAENKDGEGEAFS ASFSVKVTEAWEDYYTTRIKVEKKPEKIVYKTGENFEPEGMKVVAYERRASSSNAERRER VLSEDEYELDIPPFDTQGAKSVKVLYCSVDQKGEEKTFRDSFTVRVLGNQTDDDNSDDDD ETTYKHETTYKPDDNVTGTWQGGKNEPWRFKKSNGTYATNEWAKINGKWYHFDTESNMQT GWLSDQNKWYFLGPDGSMCADIWSMVNGKWYYFNADGSMKCNEWFFYKETWYYLGSDGEM LVSNI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:23 2011 Seq name: gi|229783690|gb|GG668045.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld441, whole genome shotgun sequence Length of sequence - 1280 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 1160 565 ## COG2199 FOG: GGDEF domain - Prom 1181 - 1240 3.4 Predicted protein(s) >gi|229783690|gb|GG668045.1| GENE 1 2 - 1160 565 386 aa, chain - ## HITS:1 COG:DR0267 KEGG:ns NR:ns ## COG: DR0267 COG2199 # Protein_GI_number: 15805298 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Deinococcus radiodurans # 295 386 331 430 511 69 37.0 1e-11 MLMTSLHISVSKHLLNDDFQVIWANRFFYDKTGYTKEEYDQNFHGSVRLYFSSAPEEYQT IEKAVKDALTANQRGYDAVCKMPRKDGSSIWIRFIGTFTNESIDGVPVIYVVYTDVNDLV NARTKVEKEHSKSQNMLRKELQTMECLKRMYGCMDMAEYMDDILYIIGTFLEAERAYMFE INDTKMDCIYEWCRADIVPQMDYCQRMDIKLLEPWYEEFCQGKSIVIPNADALKEENTYT FEVLHKQGIHSMVLSPIVMHDQLTGFIGADNSNVELLMNTSMMETLSYFIGISIEKSELN DKLIYHSFYDELTGAFNRNKYLQDIEKLCQSDSFTSLGVVYMDINGLKDINDHMGHKFGD QILIEGAEMLKKAFPAGDIYRLGGDE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:24 2011 Seq name: gi|229783689|gb|GG668046.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld442, whole genome shotgun sequence Length of sequence - 583 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 14 - 232 101 ## COG0055 F0F1-type ATP synthase, beta subunit 2 1 Op 2 . + CDS 157 - 378 112 ## Closa_1969 Fmu (Sun) domain protein 3 1 Op 3 . 
+ CDS 363 - 582 293 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases Predicted protein(s) >gi|229783689|gb|GG668046.1| GENE 1 14 - 232 101 72 aa, chain + ## HITS:1 COG:NMA0519 KEGG:ns NR:ns ## COG: NMA0519 COG0055 # Protein_GI_number: 15793517 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Neisseria meningitidis Z2491 # 2 44 365 409 465 68 75.0 4e-12 MVQETLQKYRELQDIIAILGMEELGDEDKTTVNRARKIQKFKPFRFQRWNRKGLNFAGPS PGTGRTTVSSGT >gi|229783689|gb|GG668046.1| GENE 2 157 - 378 112 73 aa, chain + ## HITS:1 COG:no KEGG:Closa_1969 NR:ns ## KEGG: Closa_1969 # Name: not_defined # Def: Fmu (Sun) domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 72 388 453 454 121 80.0 9e-27 MEPKGFEFRRTISWNREDDRVIRYLKGETISLAPEEGPVKGWCLVCVDGSPLGFAKGTGM ALKNKYYPGWRWM >gi|229783689|gb|GG668046.1| GENE 3 363 - 582 293 73 aa, chain + ## HITS:1 COG:L109527 KEGG:ns NR:ns ## COG: L109527 COG1187 # Protein_GI_number: 15674222 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Lactococcus lactis # 4 73 27 95 273 59 42.0 2e-09 MAVDVMKELMRLDKYLTEMGQGTRSQIKEMARKGRITVNSVTEKKSDRKISPSSDAVAVD GRIVAYAEYEYYM Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:27 2011 Seq name: gi|229783688|gb|GG668047.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld443, whole genome shotgun sequence Length of sequence - 1264 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 434 379 ## COG0582 Integrase + Prom 509 - 568 6.1 2 2 Op 1 . + CDS 589 - 741 71 ## gi|288871716|ref|ZP_06410278.1| conserved hypothetical protein 3 2 Op 2 . 
+ CDS 675 - 1013 245 ## gi|288871717|ref|ZP_06410279.1| toxin-antitoxin system, antitoxin component, Xre family + Term 1213 - 1244 -0.5 Predicted protein(s) >gi|229783688|gb|GG668047.1| GENE 1 3 - 434 379 143 aa, chain + ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 2 139 124 262 265 90 35.0 9e-19 IRVSELCFVTVKAVRLGYAEVCCKGKNRRIMIPGNLRKKLLIYIQKNGIEKGEVFITKTG NAMDRSNIWRGMKGLAAEAGVLADKIFPHNLRHLFARSFYEIDKDIAKLADVLGHSSINT TRIYMISTGEEHRRQLERLRLVI >gi|229783688|gb|GG668047.1| GENE 2 589 - 741 71 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871716|ref|ZP_06410278.1| ## NR: gi|288871716|ref|ZP_06410278.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 50 1 50 50 87 100.0 3e-16 MTQMQAAGDISPYKQHNIYYVVSSKGGMKWLPITGRKWENDWAAVDVKWD >gi|229783688|gb|GG668047.1| GENE 3 675 - 1013 245 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871717|ref|ZP_06410279.1| ## NR: gi|288871717|ref|ZP_06410279.1| toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] # 20 112 1 93 93 161 100.0 1e-38 MVTYNRQEVGKRLGCRRREMGLTGEEMGRRIGKNGRYYRDIENGRCGMSVETLILLSEEM GLSLDYLIYGDSSDDGRLGEQKQIVNILSHCNGGIREGAVNLLKVYLESVSK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:40 2011 Seq name: gi|229783687|gb|GG668048.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld444, whole genome shotgun sequence Length of sequence - 1330 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 585 594 ## COG5632 N-acetylmuramoyl-L-alanine amidase + Term 611 - 650 6.6 2 1 Op 2 . 
+ CDS 668 - 1328 475 ## Ppro_1118 BRO domain-containing protein Predicted protein(s) >gi|229783687|gb|GG668048.1| GENE 1 1 - 585 594 194 aa, chain + ## HITS:1 COG:BS_yqeE_1 KEGG:ns NR:ns ## COG: BS_yqeE_1 COG5632 # Protein_GI_number: 16079624 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 2 100 60 150 182 66 40.0 2e-11 YVDDVAIYQTLNHTDGAWAVGKQYGTPLVAGVNNNNTINIEICVNPDSNYDKARLNCVDL VRHLMQETEIPADHVIRHYDAKRKWCPRKMMDSPELWTDFCLRVRGQVDEVKSFEDGAGN WHFTINGELQKTRWVKYKNKWFYVDDSGNMVTGYAVIGGLVYMLNPSKADMATYGALMVT NNLTQGNLEVQWVE >gi|229783687|gb|GG668048.1| GENE 2 668 - 1328 475 220 aa, chain + ## HITS:1 COG:no KEGG:Ppro_1118 NR:ns ## KEGG: Ppro_1118 # Name: not_defined # Def: BRO domain-containing protein # Organism: P.propionicus # Pathway: not_defined # 87 203 71 188 247 84 37.0 3e-15 MKLRLVKQGDFLGTKCDFYVNEIGDIFMSRTQIGYALKYKQPQNAVLIVHKRHKERLDKF SVEVSGCQFVTPIYKNENTDKVFMYKERGIYEICRYSNQPIADDFNDWVYDTILSIKKNG YYIAAEKDEKWLGTRQETKEVRKAETDQIKLFVEYARAQGSQHADRYYVSLTKLINRRLG MESGGRDKADQRTLMHLKSLETVVELHLATLMAEGLPYKE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:44 2011 Seq name: gi|229783686|gb|GG668049.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld445, whole genome shotgun sequence Length of sequence - 804 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 34 - 531 283 ## Selsp_1788 hypothetical protein 2 1 Op 2 . 
+ CDS 521 - 803 173 ## Selsp_1787 hypothetical protein Predicted protein(s) >gi|229783686|gb|GG668049.1| GENE 1 34 - 531 283 165 aa, chain + ## HITS:1 COG:no KEGG:Selsp_1788 NR:ns ## KEGG: Selsp_1788 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 1 155 183 325 325 140 45.0 2e-32 MRRHLISGGGPIFGLETGTLIYGAGKQILRSFRDFSLCMEPYITNSGNKIYYFGDLDYEG ISIYEDLCGRFGREWVIEPFKAAYIAMTEKVLNTLTVQDSLDSGLCSLPGMKEKQSRRGG DLFFGYFEAAEQEKMKAVLLAGKYIPQECLTISDLPMRPGDYDGT >gi|229783686|gb|GG668049.1| GENE 2 521 - 803 173 94 aa, chain + ## HITS:1 COG:no KEGG:Selsp_1787 NR:ns ## KEGG: Selsp_1787 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 1 94 1 94 543 124 59.0 9e-28 MEHDFLKQFWKRMKSVGMYALLFQNSFQKTTWKQYGFLKMDEQINMIFAVLLYIMEQSLK DEPCTMDDIGAYLDSVNQKYLQKPLSYEECKELG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:50 2011 Seq name: gi|229783685|gb|GG668050.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld446, whole genome shotgun sequence Length of sequence - 642 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 6 - 512 336 ## COG0534 Na+-driven multidrug efflux pump Predicted protein(s) >gi|229783685|gb|GG668050.1| GENE 1 6 - 512 336 168 aa, chain + ## HITS:1 COG:MA0334 KEGG:ns NR:ns ## COG: MA0334 COG0534 # Protein_GI_number: 20089232 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 1 154 297 449 466 92 37.0 3e-19 MKGFQPIAGFSYGAKKYDRLHEAIKTSILWSTIFCVIFGLAAVIFSEGIVSLFTKGDMEM IRVGQIALRANGLSFMLFGFYTVYSFLFLVMGKATEGCILGACRQGICFVPVILILPGLL GLNGILYAQPIADVLSAVVTVFMAVHLHKELATEKIRIVTMDVGHSVS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:51 2011 Seq name: gi|229783684|gb|GG668051.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld447, whole genome shotgun sequence Length of sequence - 1242 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 18 - 338 152 ## Clole_2737 hypothetical protein 2 1 Op 2 . + CDS 360 - 815 573 ## Cphy_0798 toxin secretion/phage lysis holin + Term 915 - 948 -0.9 + Prom 835 - 894 2.4 3 2 Tu 1 . 
+ CDS 957 - 1242 186 ## CbC4_4176 putative N-acetylmuramoyl-L-alanine amidase Predicted protein(s) >gi|229783684|gb|GG668051.1| GENE 1 18 - 338 152 106 aa, chain + ## HITS:1 COG:no KEGG:Clole_2737 NR:ns ## KEGG: Clole_2737 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 106 36 141 141 108 55.0 9e-23 MFLLGGICFAALGLINEILPWSMALWKQILIGTGIITALEFLTGCVVNLCLGWNIWDYSH LPGNILGQICPQYCLLWLSVSLAGIVLDDWLRYWWWGEERPKYKMI >gi|229783684|gb|GG668051.1| GENE 2 360 - 815 573 151 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0798 NR:ns ## KEGG: Cphy_0798 # Name: not_defined # Def: toxin secretion/phage lysis holin # Organism: C.phytofermentans # Pathway: not_defined # 18 136 20 146 160 63 29.0 4e-09 MKFIDKYNAVVGAAVTVLTAIFGVYWYVFAGYLLCNALDYITGWAKARKTHKESSSIGIV GIVKKVGYWIIVLVAFMIPELFIHLGQDLLGVNLSFLALLGWFTLATLLINEIRSILENL VEYGINVPDFLIKGLAITEKLISAGADTGEK >gi|229783684|gb|GG668051.1| GENE 3 957 - 1242 186 95 aa, chain + ## HITS:1 COG:no KEGG:CbC4_4176 NR:ns ## KEGG: CbC4_4176 # Name: not_defined # Def: putative N-acetylmuramoyl-L-alanine amidase # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 4 95 5 96 304 81 47.0 1e-14 MLPITKQIKQINCYASQNHPKYIVIHETDNFNKGAGAEAHSRAHNKGNLSTSVHYYVDDV AIYQTLNHTDGAWAVGKQYGTPLVAGVNNNNTINI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:46:59 2011 Seq name: gi|229783683|gb|GG668052.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld448, whole genome shotgun sequence Length of sequence - 1005 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 6 - 476 452 ## COG2200 FOG: EAL domain + Term 515 - 554 6.9 + Prom 606 - 665 5.0 2 2 Tu 1 . 
+ CDS 702 - 1005 355 ## gi|288871719|ref|ZP_06118732.2| conserved hypothetical protein Predicted protein(s) >gi|229783683|gb|GG668052.1| GENE 1 6 - 476 452 156 aa, chain + ## HITS:1 COG:RSp0254_4 KEGG:ns NR:ns ## COG: RSp0254_4 COG2200 # Protein_GI_number: 17548475 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Ralstonia solanacearum # 16 147 118 247 261 103 40.0 1e-22 MLDESFIDFVKQTLNQYNLPPAHIVLELTESCFVTDMEALKEAFRLLRNLKIQIAMDDFG TGYSSLGMLSQSPADIVKIDRLFITAIGDKENEFNRSFIGSVIQLCHRVGISVCVEGVER KNELDTVCSLEADCIQGYYISKPLLQEEFEKKFWEA >gi|229783683|gb|GG668052.1| GENE 2 702 - 1005 355 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871719|ref|ZP_06118732.2| ## NR: gi|288871719|ref|ZP_06118732.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 101 16 116 116 90 100.0 4e-17 MKDEFNHVEIDGKEADAIQVKMDDVVADAEKFAAEERSADPDRNRKGGAESELEMEEQMS DEEFNEELKELMEEDNSGKKKKKKKKKGERGRRKKWSVRKK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:07 2011 Seq name: gi|229783682|gb|GG668053.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld449, whole genome shotgun sequence Length of sequence - 896 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 40 - 411 284 ## COG1380 Putative effector of murein hydrolase LrgA - Prom 446 - 505 5.5 + Prom 451 - 510 9.7 2 2 Tu 1 . 
+ CDS 554 - 896 256 ## COG0671 Membrane-associated phospholipid phosphatase Predicted protein(s) >gi|229783682|gb|GG668053.1| GENE 1 40 - 411 284 123 aa, chain - ## HITS:1 COG:MA3263 KEGG:ns NR:ns ## COG: MA3263 COG1380 # Protein_GI_number: 20092079 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Methanosarcina acetivorans str.C2A # 4 121 2 119 165 73 37.0 6e-14 MKFIKQFSIILIISLIGEALHYFIPLPVPASIYGLLIMLAGLYTKLIPLDSVREASFFLI DIMPLMFIPAAVGLLDSWGLLRPILVPFLVITLVSTVVVMVVTGKITQFFIRSDKEKKVP EHE >gi|229783682|gb|GG668053.1| GENE 2 554 - 896 256 114 aa, chain + ## HITS:1 COG:CAC2438 KEGG:ns NR:ns ## COG: CAC2438 COG0671 # Protein_GI_number: 15895703 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Clostridium acetobutylicum # 19 112 24 117 180 70 40.0 5e-13 MIEAEFGILYFLQSLHTPWLDAVMKNITHLGDSGIFWILTGLALFCFKKTRRMGFCVLLS LAGGLLIGNIFLKNLVARDRPCWIDPTIQLLVASPKDFSFPSGHTMASFEAAVS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:08 2011 Seq name: gi|229783681|gb|GG668054.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld450, whole genome shotgun sequence Length of sequence - 1470 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 119 - 520 157 ## DSY0014 hypothetical protein + Term 608 - 654 13.6 + Prom 766 - 825 3.5 2 2 Tu 1 . 
+ CDS 973 - 1386 428 ## Ethha_1950 sigma-70 region 4 type 2 + Term 1399 - 1460 3.1 Predicted protein(s) >gi|229783681|gb|GG668054.1| GENE 1 119 - 520 157 133 aa, chain + ## HITS:1 COG:no KEGG:DSY0014 NR:ns ## KEGG: DSY0014 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 123 268 390 413 139 57.0 4e-32 MMAGGVIMSTWGGFKHRGKTLAVGLFAFGSFAIGMGFSKNFIFYLGLMVCYGVALTMVQT AITTMLQEKTDTSMQGRVFGLLGTMYAGFLPIGMAMFGPLADILPLQWIMVGSGIALIVI AGIAYYSRELKTI >gi|229783681|gb|GG668054.1| GENE 2 973 - 1386 428 137 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1950 NR:ns ## KEGG: Ethha_1950 # Name: not_defined # Def: sigma-70 region 4 type 2 # Organism: E.harbinense # Pathway: not_defined # 1 133 1 133 141 116 48.0 3e-25 MKKINLRELYPDVYTTDFFVDVTEEVMETIRAAERAEAAYERKMYRYKAQYSLDCENGIE NAVLLKPQTPEMVLEEKQFQEQVYAAVMKLPEKQAKRIYARYYLGMTVNEIAEVEGVDPS RVRDSIRRGLKQLAKYF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:14 2011 Seq name: gi|229783680|gb|GG668055.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld451, whole genome shotgun sequence Length of sequence - 904 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 61 - 903 942 ## COG1283 Na+/phosphate symporter Predicted protein(s) >gi|229783680|gb|GG668055.1| GENE 1 61 - 903 942 280 aa, chain - ## HITS:1 COG:SP0496 KEGG:ns NR:ns ## COG: SP0496 COG1283 # Protein_GI_number: 15900410 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Streptococcus pneumoniae TIGR4 # 18 273 284 535 543 139 29.0 5e-33 VFKLNPVFASSTISSVEISIFHTVFNVSNTLLLFPFAGFLVKASSLLVRDGKSGKAETEG SQMQRHLDERILETPSFAIENATQEVINMGKAALENFDIAAAALLDNSRAEAEQVEKLEA KINQYEKLLTSYLVKINNQSLNEEQHLLVKNLFYTVSNFERVSDHCENLSELAVEKADKN IEFSEDAAEEMKEMIKTVRAALEHAIKARAGAGMSEVRAVVQSEENVDMLEEELRERHIE RLSSHKCTPENGIVFLEALSNLERISDHAHNIAGFVRDEM Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:15 2011 Seq name: gi|229783679|gb|GG668056.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld452, whole genome shotgun sequence Length of sequence - 518 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 517 338 ## Closa_3736 cell wall binding repeat-containing protein Predicted protein(s) >gi|229783679|gb|GG668056.1| GENE 1 1 - 517 338 172 aa, chain - ## HITS:1 COG:no KEGG:Closa_3736 NR:ns ## KEGG: Closa_3736 # Name: not_defined # Def: cell wall binding repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 14 160 89 238 408 94 39.0 2e-18 LEPKGGLNSRGSWKAGETPKVTIELEAEEGYYFSSINAGKATVKGAVYSSAKKGSDNHSL TLSVKLKGVKGTLGTVESAYWESAPLGKARWSRVENAPAYEVKLYCGDSMVYRVEKTSST SYDFFSRMTSRGDYYFKVRAVAKTASEADYLKAGEWTESDNQEITKTGMPKP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:19 2011 Seq name: gi|229783678|gb|GG668057.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld453, whole genome shotgun sequence Length of sequence - 1610 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 79 - 294 313 ## Ethha_1906 hypothetical protein + Term 318 - 368 9.5 + Prom 351 - 410 7.7 2 2 Op 1 . + CDS 443 - 805 313 ## COG0346 Lactoylglutathione lyase and related lyases + Prom 883 - 942 7.1 3 2 Op 2 . + CDS 968 - 1357 103 ## gi|288871723|ref|ZP_06118742.2| conserved hypothetical protein + Term 1415 - 1453 7.0 + Prom 1373 - 1432 2.8 4 3 Tu 1 . 
+ CDS 1479 - 1608 62 ## gi|160939442|ref|ZP_02086792.1| hypothetical protein CLOBOL_04335 Predicted protein(s) >gi|229783678|gb|GG668057.1| GENE 1 79 - 294 313 71 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1906 NR:ns ## KEGG: Ethha_1906 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 4 71 2 69 72 65 57.0 8e-10 MGNFTFEEMNLMCIYNTGSRTGLIDSLHEMRGELSPEETELRELTDSALTKLCAMTDEDF AELELYPDFEQ >gi|229783678|gb|GG668057.1| GENE 2 443 - 805 313 120 aa, chain + ## HITS:1 COG:MA0108 KEGG:ns NR:ns ## COG: MA0108 COG0346 # Protein_GI_number: 20089007 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Methanosarcina acetivorans str.C2A # 1 120 10 131 163 79 38.0 2e-15 MKLKNILIVVKDIEKSKQFYNDLFGLDLVLDNDGNMILTEGLVLQDEKIWKKFLGKDIIP KSNSCELYFEEQDIEAFVEKLERSYPSIQYVNKLMTHSWGQKVIRFYDLDGNLIEVGTPM >gi|229783678|gb|GG668057.1| GENE 3 968 - 1357 103 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871723|ref|ZP_06118742.2| ## NR: gi|288871723|ref|ZP_06118742.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 129 3 131 131 230 100.0 2e-59 MKKYVSIFLCLACVLTLVACGKKADPITLPQTDEITSIDITVGENTVSHSDKTWISEIIA NISSSEPTKKESVQDVPQAESYIKIDFQFETGTSTLFAYEDSGKYYVEQPYQGIYKIDSQ LYSQLQETN >gi|229783678|gb|GG668057.1| GENE 4 1479 - 1608 62 43 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939442|ref|ZP_02086792.1| ## NR: gi|160939442|ref|ZP_02086792.1| hypothetical protein CLOBOL_04335 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04335 [Clostridium bolteae ATCC BAA-613] # 1 43 1 43 2451 93 100.0 6e-18 MPTKAELYAQMADKVATQLTGSWQEWAGFLTTASRLYKYPFHE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:33 2011 Seq name: gi|229783677|gb|GG668058.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld454, whole genome shotgun sequence Length of sequence - 500 bp Number of predicted genes 
- 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 51 - 500 225 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc Predicted protein(s) >gi|229783677|gb|GG668058.1| GENE 1 51 - 500 225 149 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. AzwK-3b] # 1 128 114 242 263 91 37 1e-19 FIRTAAETGIDALIVPDMPFEEKGELLPLCRQYGLDLISLIAPTSKERIHAIAKEAEGFV YCVSSMGVTGVRTEITTNIGEMVSLVKEAQSIPCAIGFGISTPEQAKKMSEHADGVIVGS AIVRMVAKYGKDCVEPVCQYIREMKAAIS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:34 2011 Seq name: gi|229783676|gb|GG668059.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld455, whole genome shotgun sequence Length of sequence - 1131 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 229 94 ## gi|160941510|ref|ZP_02088844.1| hypothetical protein CLOBOL_06400 - Prom 310 - 369 5.6 2 2 Tu 1 . - CDS 385 - 657 203 ## - Prom 899 - 958 7.5 + Prom 757 - 816 6.7 3 3 Tu 1 . 
+ CDS 922 - 1119 93 ## gi|266625813|ref|ZP_06118748.1| conserved hypothetical protein Predicted protein(s) >gi|229783676|gb|GG668059.1| GENE 1 1 - 229 94 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941510|ref|ZP_02088844.1| ## NR: gi|160941510|ref|ZP_02088844.1| hypothetical protein CLOBOL_06400 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06400 [Clostridium bolteae ATCC BAA-613] # 1 59 1 59 113 84 74.0 3e-15 MKYERETSNLLRCLGSNNSYKGFRYTSYGVGLDIHDPELLTYISKGLYVEITSKFHTSIY HIVNQFKRTCKFTANS >gi|229783676|gb|GG668059.1| GENE 2 385 - 657 203 90 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPSVFHSHKDPNCRPNYRPNCFLHQYCTTMDNSSSNSNRSYYYYYYYYYYYYYYHSICNS ICNNIYNNIYSNTFFTPPFNIGLYYTIWDE >gi|229783676|gb|GG668059.1| GENE 3 922 - 1119 93 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625813|ref|ZP_06118748.1| ## NR: gi|266625813|ref|ZP_06118748.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 65 1 65 65 116 100.0 5e-25 MGKAAHKAMCYDAHINYGLILSRLQQETAYSRYSEKLKQAELLTINRLLETVLPQVANSI DMEDY Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:52 2011 Seq name: gi|229783675|gb|GG668060.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld456, whole genome shotgun sequence Length of sequence - 644 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 643 447 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 Predicted protein(s) >gi|229783675|gb|GG668060.1| GENE 1 2 - 643 447 214 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. 
AzwK-3b] # 1 214 148 368 425 176 45 3e-45 LSRLGGGIGTRGPGEKKLEADRRLIHDRIGQLKAELEDVKRHRDVARQQRTRNHTPVAAI VGYTNAGKSTLLNTLTDAGILAEDKLFATLDPTTRNLELPGGEQILLTDTVGFIRKLPHN LIEAFKSTLEEAKYSDIILHVVDCSNPQMDMQMYVVYETLKDLGVHDKEVITVFNKIDAA GEMRIPRDLSSDYQVKISAKTGEGLDELLNLLES Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:52 2011 Seq name: gi|229783674|gb|GG668061.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld457, whole genome shotgun sequence Length of sequence - 1471 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1469 1434 ## COG4646 DNA methylase Predicted protein(s) >gi|229783674|gb|GG668061.1| GENE 1 2 - 1469 1434 489 aa, chain - ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 489 567 1044 1315 286 33.0 5e-77 KKDFETHNRKKFCARITTGDYDAIIMGHSQFERIPISRERQERLLYEQIDEITEGIAEVQ ASGGERFTVKQLERTRKSLEARLEKLQAEGRKDDVVTFEQLGVDRLFVDEAHNYKNLFLY TKMRNVAGLSTSDAQKSSDMFAKCRYMDEITGNRGVIFATGTPVSNSMTELYTMQRYLQY ERLQELNMTHFDCWASRFGETVTALELAPEGTGYRARTRFSKFFNLPELMNLFKEVADIK TADQLNLPTPEVEYHNIVAQPTEHQQEMVKALSERASLVHSGTVDPSQDNMLKITSDGRK LGLDQRIVNQMLPDEPGTKVNQCVDNIMQIWRDGKADKLTQLVFCDISTPQAKAPASKAA KTLDNPLLHALEGAVPLPEQEPVFTVYDDIRQKLIAQGMPADQIAFIHEANTEVRKKELF SKVRTGQVRVLLGSTAKMGAGTNVQDRLVALHDLDCPWRPGDLAQRKGRIERQGNQNPLV HVYRYVTEG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:53 2011 Seq name: gi|229783673|gb|GG668062.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld458, whole genome shotgun sequence Length of sequence - 1096 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 
1 1 Op 1 34/0.000 - CDS 3 - 476 468 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 2 1 Op 2 . - CDS 493 - 1095 610 ## COG0765 ABC-type amino acid transport system, permease component Predicted protein(s) >gi|229783673|gb|GG668062.1| GENE 1 3 - 476 468 158 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 157 1 154 245 184 53 2e-47 MEYMVRLDNIHKSFGKNHILKGVDLTIKKGEVVVILGPSGSGKTTLLRTVNFLDSADDGS ITVNDFTVNAKKHTKSQIIELRRKTAMVFQNYNLFQNKTILENVMEGLVTVKKFKKEDAE RVSREILEKVGLSERCDFYPSQISGGQQQRAGIARALI >gi|229783673|gb|GG668062.1| GENE 2 493 - 1095 610 200 aa, chain - ## HITS:1 COG:L163056 KEGG:ns NR:ns ## COG: L163056 COG0765 # Protein_GI_number: 15672920 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Lactococcus lactis # 7 191 86 270 295 136 38.0 2e-32 GLGIASLRRAKTPVVSQICAVFVSFMRGVPMVILLYVAYYALPVMVYSYGVSIGREFDVS VVPPAVYAIAALTLDQAAYSSEIFRAALGAVDEGQREAAYSVGMTKLQAMTRIVFPQAMA VAMPNLGGLFLGLVKGTSLAYYVGVYEITATANLLAMPALNFIEAYLITTVIYELISFFF NRLFGTVETRLKRFRAGAAA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:54 2011 Seq name: gi|229783672|gb|GG668063.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld459, whole genome shotgun sequence Length of sequence - 582 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:47:54 2011 Seq name: gi|229783671|gb|GG668064.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld460, whole genome shotgun sequence Length of sequence - 677 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 46 - 105 1.7 1 1 Tu 1 . 
+ CDS 170 - 352 230 ## gi|288871725|ref|ZP_06410283.1| conserved hypothetical protein Predicted protein(s) >gi|229783671|gb|GG668064.1| GENE 1 170 - 352 230 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871725|ref|ZP_06410283.1| ## NR: gi|288871725|ref|ZP_06410283.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 60 9 68 68 92 100.0 8e-18 MKITIARGTLTKEEQYKLGELLLKAGYVVSVRKGTTKETKSMTIINLVDPEPEAVEEDLS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:48:00 2011 Seq name: gi|229783670|gb|GG668065.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld461, whole genome shotgun sequence Length of sequence - 1023 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 58 - 117 7.8 1 1 Tu 1 . + CDS 145 - 1021 380 ## ELI_1104 hypothetical protein Predicted protein(s) >gi|229783670|gb|GG668065.1| GENE 1 145 - 1021 380 292 aa, chain + ## HITS:1 COG:no KEGG:ELI_1104 NR:ns ## KEGG: ELI_1104 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 2 292 3 297 500 134 29.0 4e-30 MDDLNLYNMAYKHLFNCIHFGLYPTGSSLPTVPELCKTFNVSSSTIHNALRLLQEDHYVS LSQGRSAIVTYDVNEEECRTEYRTYSYATKDALLDLCDAMLIIWPEVMLQGLKLCTDDDL EKLTEIFQRMSPCNEYPFFDFFFHILKGLGNPLFLNLYLSTTFFGHSTIMRLRDKAYIYG YMTFLKRETAQILALCKASDYGHLKDLLVRLYQKHIEGIHQYHDSLPPPEPPVKPIPYEW NYFSERPLVNFNLAMKLLRQIYSLLGNQEYLPSFSTLAKQYSVPFITVRRAV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:48:05 2011 Seq name: gi|229783669|gb|GG668066.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld462, whole genome shotgun sequence Length of sequence - 624 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 411 384 ## Closa_3048 hypothetical protein + Term 430 - 484 18.0 - Term 417 - 471 18.0 2 2 Tu 1 . - CDS 512 - 622 99 ## Predicted protein(s) >gi|229783669|gb|GG668066.1| GENE 1 1 - 411 384 136 aa, chain + ## HITS:1 COG:no KEGG:Closa_3048 NR:ns ## KEGG: Closa_3048 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 14 136 33 155 155 157 61.0 2e-37 QAVTGAVPADGEVLSGIPVDAEKSEAVSIYYGSVCYSFFSEDRDRIKEVADLFTGFSLEE VPNGQLDEATTYQIYFSTDTEQIAAINVDKNGIFYIPEEKKFYKVKAGTFHFETLDQIYK DSMYADGFDENQCLIQ >gi|229783669|gb|GG668066.1| GENE 2 512 - 622 99 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no GFEFADLIVVNGDPVNDITAMYQKPVHVFKDGVMIR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:48:13 2011 Seq name: gi|229783668|gb|GG668067.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld463, whole genome shotgun sequence Length of sequence - 890 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 + CDS 2 - 511 398 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 2 1 Op 2 . 
+ CDS 530 - 890 155 ## COG0673 Predicted dehydrogenases and related proteins Predicted protein(s) >gi|229783668|gb|GG668067.1| GENE 1 2 - 511 398 169 aa, chain + ## HITS:1 COG:SMb20974 KEGG:ns NR:ns ## COG: SMb20974 COG1028 # Protein_GI_number: 16264847 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Sinorhizobium meliloti # 5 151 64 204 222 58 29.0 4e-09 SDKFKQIDFIVNNAGVLFGSKYDKIDEIVDLDVAKFRKTLDVNVTGMAIVLKYFMPYLYK SSCPVIINITSEAAYLGSGGYNYLSYSVSKYAANMYTQKIRNYLHDERPELGVRIFMMHP GRMQTVMGAENAQIPSSESAEGIYKVIEGETDPKLEIPFINYKGEAMPH >gi|229783668|gb|GG668067.1| GENE 2 530 - 890 155 120 aa, chain + ## HITS:1 COG:BH0708 KEGG:ns NR:ns ## COG: BH0708 COG0673 # Protein_GI_number: 15613271 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 1 120 1 119 369 99 42.0 1e-21 MRTYKIGIIGCGSISNTYIPDIQNIYHDLEIHAVADADLNRAGETAKRFGIKHAYRTEEL LADDEIEIAVNLTPPRFHQQINLSVLSAGKHLFTEKPFAMNLPEADELITLAGEKGVMIG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:48:14 2011 Seq name: gi|229783667|gb|GG668068.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld464, whole genome shotgun sequence Length of sequence - 1362 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 + CDS 1 - 480 623 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 2 1 Op 2 . 
+ CDS 536 - 1361 914 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis Predicted protein(s) >gi|229783667|gb|GG668068.1| GENE 1 1 - 480 623 159 aa, chain + ## HITS:1 COG:BS_yluC KEGG:ns NR:ns ## COG: BS_yluC COG0750 # Protein_GI_number: 16078719 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Bacillus subtilis # 11 158 265 420 422 106 42.0 2e-23 NEEYNQYMIGISVSPVNVRTSSFLELVKYGAYEVKYDITVTIKSLGMLLSGKASVNDLSG PVGIVVMIDDSVKAGLTVSVMAAIMNVLSMCILLSANLGIMNLLPIPALDGGRLVFLVIE AVRGKRMDPEKEGMVNLIGMMALMALMVFVVFNDISRFT >gi|229783667|gb|GG668068.1| GENE 2 536 - 1361 914 275 aa, chain + ## HITS:1 COG:CAC1797 KEGG:ns NR:ns ## COG: CAC1797 COG0821 # Protein_GI_number: 15895073 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Clostridium acetobutylicum # 4 275 3 274 349 308 58.0 7e-84 MTDRKQTKTVSIGNRKIGGGNPILIQSMCNTKTEDAEATVSQILALEEAGCDIIRVAVPT MEAAGALKEIKRQIHIPLVADIHFDYRLALAAIENGADKIRINPGNIGSEERVRAVVDKA KEYGIPIRVGVNSGSLEKELLEKYHGVTAEGIVESALMKVRMIEEMGYDNLVISIKSSDV LMCIKAHELIAEQTAYPLHVGITEAGTVRSGTIKSSVGLGIILHQGIGDTIRVSLTGDPV EEVMTAKQILKTPGLRKGGVEVVSCPTCGRTRIDL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:48:15 2011 Seq name: gi|229783666|gb|GG668069.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld465, whole genome shotgun sequence Length of sequence - 1480 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 53 - 274 269 ## COG2608 Copper chaperone - Prom 325 - 384 6.3 + Prom 395 - 454 3.6 2 2 Tu 1 . 
+ CDS 499 - 1480 838 ## COG2207 AraC-type DNA-binding domain-containing proteins Predicted protein(s) >gi|229783666|gb|GG668069.1| GENE 1 53 - 274 269 73 aa, chain - ## HITS:1 COG:FN0259 KEGG:ns NR:ns ## COG: FN0259 COG2608 # Protein_GI_number: 19703604 # Func_class: P Inorganic ion transport and metabolism # Function: Copper chaperone # Organism: Fusobacterium nucleatum # 1 70 1 70 73 65 51.0 3e-11 MKKTFKLMDLDCAHCASKIEGAVKKINGVTNVEVHFLSQKMVLEAADDRFQEIAEEAVRL IKKIEPDVTVQTM >gi|229783666|gb|GG668069.1| GENE 2 499 - 1480 838 327 aa, chain + ## HITS:1 COG:all2613 KEGG:ns NR:ns ## COG: all2613 COG2207 # Protein_GI_number: 17230105 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Nostoc sp. PCC 7120 # 175 319 174 326 326 78 33.0 2e-14 MEELTKPRYQIYYNGEPLPPQIFSDLHVQYRYGNHENPGSMDVYTIFPGIELIYNDFSMT CCDSRKKVEGELLEIHYCYKGREECQWLCGDYLYLGEGDLCITRIEEETPGLCFPTGRYL GITIVLDLGILKDHTPPLLDSESMRLQDFGDRFCPGNHFLAIRANRQIDHIFGELYQIPQ EFRHDYFRIKVLELLLFLRMIDPEQERALDRVTRNQIDIIRQVRRRITQNLQENITIDQL AREYCISPTSLKSNFKQVYSTTIKDYLRKVRMERASALLLTTDSTVADIAASLGYTNQSK FSAAFKTVYGLAPAEYRKKCRHEYICT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:48:16 2011 Seq name: gi|229783665|gb|GG668070.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld466, whole genome shotgun sequence Length of sequence - 1322 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 230 278 ## Closa_2370 LacI family transcriptional regulator 2 1 Op 2 . + CDS 259 - 819 735 ## COG0693 Putative intracellular protease/amidase 3 1 Op 3 . 
+ CDS 833 - 1322 628 ## Closa_2368 lipoprotein Predicted protein(s) >gi|229783665|gb|GG668070.1| GENE 1 3 - 230 278 75 aa, chain + ## HITS:1 COG:no KEGG:Closa_2370 NR:ns ## KEGG: Closa_2370 # Name: not_defined # Def: LacI family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 74 275 348 354 105 70.0 5e-22 GRKTGKDIYLVGVDALEETVQYIKEGKVTGTVLNDHTGQSHTAADVLMKMIHGDEAETRY LVDYVKISTISTLQK >gi|229783665|gb|GG668070.1| GENE 2 259 - 819 735 186 aa, chain + ## HITS:1 COG:CAC1629 KEGG:ns NR:ns ## COG: CAC1629 COG0693 # Protein_GI_number: 15894907 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Clostridium acetobutylicum # 3 179 2 177 188 124 40.0 1e-28 MGKVYAFVANGSEEVELLAVVDVLLRGGQDVKLVSVTGSRDVVSAHQIKIQADLEFSEAD FSDADVLFLPGGMPGTRNLAAHRGLCDELLKADRAGKRLAAICAGPSVLGGLGILEGRKA TCYPGFEGELKGAEYTRQGVVTDGNVTTARGLGYALDLGIELLALLTDREHAAQIKESIQ YDQIPM >gi|229783665|gb|GG668070.1| GENE 3 833 - 1322 628 163 aa, chain + ## HITS:1 COG:no KEGG:Closa_2368 NR:ns ## KEGG: Closa_2368 # Name: not_defined # Def: lipoprotein # Organism: C.saccharolyticum # Pathway: ABC transporters [PATH:csh02010] # 11 163 1 152 475 228 71.0 6e-59 MGMKMKKNWVLPVLAAGILASGLTGCEKKDPYGLSAKNPVTITIWHYYNGVQKEGFDQLV QTFNESEGREKGIIVEAYSKGSIDDLSQAVTDSIDKKIGSDPIPDVFAAYADKVYEIDRR GMAVDISKYLTAEEIGEYVDAYIEEGRFDGSEGIKVFPVAKST Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:48:22 2011 Seq name: gi|229783664|gb|GG668071.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld467, whole genome shotgun sequence Length of sequence - 1044 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 38 - 256 204 ## gi|266625835|ref|ZP_06118770.1| conserved hypothetical protein 2 1 Op 2 . + CDS 280 - 444 134 ## gi|266625836|ref|ZP_06118771.1| conserved hypothetical protein 3 1 Op 3 . 
+ CDS 461 - 634 133 ## gi|266625837|ref|ZP_06118772.1| conserved hypothetical protein 4 1 Op 4 . + CDS 627 - 1044 196 ## CLJ_B2545 hypothetical protein Predicted protein(s) >gi|229783664|gb|GG668071.1| GENE 1 38 - 256 204 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625835|ref|ZP_06118770.1| ## NR: gi|266625835|ref|ZP_06118770.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 72 1 72 72 110 100.0 3e-23 MKVLDAMKQIEHIDNEVKNLQKFCRLSPEDRERIADEVGLNSNFEKLVNVSVNAMLSWKE TLKEKINNAELN >gi|229783664|gb|GG668071.1| GENE 2 280 - 444 134 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625836|ref|ZP_06118771.1| ## NR: gi|266625836|ref|ZP_06118771.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 54 1 54 54 86 100.0 6e-16 MIKQMDEMVSGLADTRRRAVRIAKYWGKVIAKIVIAVTLPVWLIPYLVWRRRHG >gi|229783664|gb|GG668071.1| GENE 3 461 - 634 133 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625837|ref|ZP_06118772.1| ## NR: gi|266625837|ref|ZP_06118772.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 12 57 20 65 65 79 100.0 8e-14 MALQAARARAAKQEYILKGPRPVTHSATMPAYCYTAACPDPRLRKPIRKRTGGEAVG >gi|229783664|gb|GG668071.1| GENE 4 627 - 1044 196 139 aa, chain + ## HITS:1 COG:no KEGG:CLJ_B2545 NR:ns ## KEGG: CLJ_B2545 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_Ba4 # Pathway: not_defined # 5 123 7 127 145 76 37.0 3e-13 MDKGILEQYIDACELIKETEADMRRVKKQRKTIIQDSVKGSMHDFPYAAQNFKIQGMTYS AVRDPGALAAYERLLEERKAKAEEIKVQVEAWLNTIPQRMQRIIRFRFFEGLSWGETASR IGRKATADGVRMEFTNSNP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:48:41 2011 Seq name: gi|229783663|gb|GG668072.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld468, whole genome shotgun sequence Length of sequence - 683 bp Number of predicted genes - 1, 
with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 683 530 ## gi|266625839|ref|ZP_06118774.1| conserved hypothetical protein Predicted protein(s) >gi|229783663|gb|GG668072.1| GENE 1 2 - 683 530 227 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625839|ref|ZP_06118774.1| ## NR: gi|266625839|ref|ZP_06118774.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 227 1 227 228 390 99.0 1e-107 EGFEFIAELDEEKMKAEWERTYDETLTKTESDGGKTVNITVYPDYSRYLIEAMKEYGKKN GYEMTPGSWNPDNRRTFEAAFVYHGVTYHFKVTGKRSQVSSELAADQEITLTDMEIYPTK NTYQLYDDGDQQYLAVFQFRTEEKKSTEENVEAGPDSNADDTADYSVDEKMNAENPAFYT VAAAIFGATRDKYMESNISVIQITNNQDMLQAFKEERERIRQEIAAR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:48:53 2011 Seq name: gi|229783662|gb|GG668073.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld469, whole genome shotgun sequence Length of sequence - 906 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 659 357 ## COG4962 Flp pilus assembly protein, ATPase CpaF - Prom 716 - 775 2.1 2 2 Tu 1 . 
- CDS 783 - 905 157 ## gi|266621228|ref|ZP_06114163.1| group II intron reverse transcriptase maturase Predicted protein(s) >gi|229783662|gb|GG668073.1| GENE 1 2 - 659 357 219 aa, chain - ## HITS:1 COG:mlr6483 KEGG:ns NR:ns ## COG: mlr6483 COG4962 # Protein_GI_number: 13475420 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Mesorhizobium loti # 12 219 145 342 461 79 31.0 5e-15 MNSGRSIKIPDKFTSPQHAIDVVRRMLNACGMVIDDTMPSVIGFLDKNIRISVDKAPIVD SDVGINASIRIVNQQTVSEQKLLESKSATAEMLHFLTACIRYGVSVCIAGSTGSGKTTVM AWLLSQVPNNRRLITIEEGSREFDLIKRDADGNILNSVVHLLTRPHENPAMNINQDFLLE RVLRKHPDVIGVGEMRSAAESLSAAESSRTGHTVCTTIH >gi|229783662|gb|GG668073.1| GENE 2 783 - 905 157 40 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621228|ref|ZP_06114163.1| ## NR: gi|266621228|ref|ZP_06114163.1| group II intron reverse transcriptase maturase [Clostridium hathewayi DSM 13479] group II intron reverse transcriptase maturase [Clostridium hathewayi DSM 13479] # 1 40 220 259 259 76 100.0 8e-13 TKEVETIEKYLELLKRKSIDFVKLNNLRRLVGNCEISENK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:48:58 2011 Seq name: gi|229783661|gb|GG668074.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld470, whole genome shotgun sequence Length of sequence - 1412 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 111 - 1310 1441 ## COG1653 ABC-type sugar transport system, periplasmic component 2 1 Op 2 . 
- CDS 1282 - 1401 72 ## Predicted protein(s) >gi|229783661|gb|GG668074.1| GENE 1 111 - 1310 1441 399 aa, chain - ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 42 374 73 415 438 96 26.0 7e-20 MTEAAGNTGDKAGTDKAKIRMTYWNSEDTMKAMLDYLAETLPDVEIEYQFIDNSNYDTIV DTQLSAGEGPDIICESPGSSLKHARLGYLEPLNDLGAKYSSAGTSVYSYDGSIYALPGIS WFEGIYYNKALFEENGIALPTTFDEYISVCREFKEKGITPLAAGLKSWEPMLKNSMAFVT AEYLSTDAGKNFGSDYREGKAKMEGTWDPYVEKWSEMITEGVYTTDMTGIDHDQALEQFA TGGAAMFCSGPWDLDTITSKNPDLQIDMMPFYGTARSDGWLIGGPGCGFAVNSSSSNKEA AMKVVEAISTVEGQKALWENNQGGSSYLEGASFDLPDAYEGVSGTLAAGNVYCPWNEWGA AAGAHETYGTEMQSYLLGSQDLKTTLSNVDAAVAELLEK >gi|229783661|gb|GG668074.1| GENE 2 1282 - 1401 72 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYSNDCFGFGRLRRLRDVRDGSAQDNGGRSRDGGSRQYR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:03 2011 Seq name: gi|229783660|gb|GG668075.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld471, whole genome shotgun sequence Length of sequence - 981 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 390 285 ## Ethha_1896 hypothetical protein 2 1 Op 2 . 
- CDS 411 - 980 479 ## Ethha_1894 hypothetical protein Predicted protein(s) >gi|229783660|gb|GG668075.1| GENE 1 3 - 390 285 129 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1896 NR:ns ## KEGG: Ethha_1896 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 124 1 126 152 171 70.0 6e-42 MAYVPVPKDLTKVKTKVMFNLTRRQLVCFTAGALVGVPLFLWLREPAGNSMAAMCMMLVM MPFFLLAMYEKHGQPLEKIVGNILKVAVIRPKQRPYQTNNFYAVLKRQEMLDKEVYDIVH RKIKKMAAS >gi|229783660|gb|GG668075.1| GENE 2 411 - 980 479 189 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1894 NR:ns ## KEGG: Ethha_1894 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 189 101 289 289 270 77.0 3e-71 KWVFKSAAAILIVTNTWNIVMGVFDMAQSVVAQAAGIINSDASIDISSVMTDLEPRLMEM DLGPLFGLWFQSLFIGITMWALYICIFIVIYGRMIEIYLVTSVAPVSMAAMMGKEWGGMG QNYLRSLLALGFQAFLIIVCVAIYAVLVQNIALEDDIIMAIWSCVGYTVLLCFTLFKTGS LAKSVFQAH Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:10 2011 Seq name: gi|229783659|gb|GG668076.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld472, whole genome shotgun sequence Length of sequence - 724 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 716 363 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member Predicted protein(s) >gi|229783659|gb|GG668076.1| GENE 1 2 - 716 363 238 aa, chain - ## HITS:1 COG:CAC2854 KEGG:ns NR:ns ## COG: CAC2854 COG0507 # Protein_GI_number: 15896108 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Clostridium acetobutylicum # 1 237 423 676 739 145 39.0 9e-35 MSDMKLASMLFSRIKTGARLILVGDADQLPSVGPGDVFRELIVSDVIPVTVLDEFFRQAK GSNIIWNADLMNRNQKNLLYGDDFTFTYVEDAYDAAEKISEIYQEELERSGGDLDMVQVL SPLRTKTEAGVTALNNRLQGIANPFAVDKAEWQTKYGLFRVGDRVMQTHNTEEIANGDIG RVVQIGKSKAGEMEMTVDFGDVIKTYQEDELSILELAYATSIHKSQGAEFPIVIIPVL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:11 2011 Seq name: gi|229783658|gb|GG668077.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld473, whole genome shotgun sequence Length of sequence - 1140 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 42 - 91 10.6 1 1 Tu 1 . 
- CDS 113 - 1102 640 ## COG2610 H+/gluconate symporter and related permeases Predicted protein(s) >gi|229783658|gb|GG668077.1| GENE 1 113 - 1102 640 329 aa, chain - ## HITS:1 COG:YPO3954 KEGG:ns NR:ns ## COG: YPO3954 COG2610 # Protein_GI_number: 16124082 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Yersinia pestis # 1 320 124 436 438 135 30.0 1e-31 MVRVISKKTKVSIVSLCCATVCATVATGAMVIPTPNPMAVAENLHLDYGVFFLYAALAGL FGTLVGGLGYGRFLDWQDRKNHHEYAFEDIEDFEKESETSAASRKVPGFGVAVSILMVPI MLILLGSFGPMILPEGGTIIKILNFIGDKNIAMLIGVIYAALVSLPYLQKPVGDVMNDAA GQVGLVLLITGSGGAFGKMLQSTGIADYIATSLSQFHIPVLVLCFIIAQILRCAQGSTGV ALITTSSMFAPLIAASGISPVLCGIAICAGGIGLSLPNDSGFWAINRFYKISVEDTIRAW TIGGFVTGVAALVFVCILSLFQTHLPGLL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:11 2011 Seq name: gi|229783657|gb|GG668078.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld474, whole genome shotgun sequence Length of sequence - 1140 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 + CDS 1 - 501 354 ## COG0191 Fructose/tagatose bisphosphate aldolase 2 1 Op 2 . 
+ CDS 417 - 1140 540 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) Predicted protein(s) >gi|229783657|gb|GG668078.1| GENE 1 1 - 501 354 166 aa, chain + ## HITS:1 COG:TP0662 KEGG:ns NR:ns ## COG: TP0662 COG0191 # Protein_GI_number: 15639649 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Treponema pallidum # 3 156 165 330 332 125 39.0 4e-29 VEFTNPGAAKRFVALTGCDSLAVSAGTSHGGVKAAENLPLHMEVLERIQAELPGFPLVLH GAASLPMHLIEAVNEQGGRVEVMKNCSESSIRRSAEYGVCKANMDVDNFLAFTGAVRRVL NEQPDKYDPRVYLKQGKNAWKKEVMWKMSQVTGSSGHNWVKEGVRQ >gi|229783657|gb|GG668078.1| GENE 2 417 - 1140 540 241 aa, chain + ## HITS:1 COG:BS_fruB KEGG:ns NR:ns ## COG: BS_fruB COG1105 # Protein_GI_number: 16078503 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Bacillus subtilis # 28 231 1 200 303 101 31.0 1e-21 MEERSHVEDESGDREFRPQLGERGGEAMIRAICLNPVIDRSYHVNGFEPGKKYFHLLREV SIGGKGINVAKVCKQCGEPVTLYGFIAGSNGKMIRAYMEKEGIDACLIEINGNTRETINI IDLEQGRESELVEQGPTVEVRQVEQLLCRLREDLQRDDIVICSGMTIEGAPADLYSEISA MCRWKEARCFLDTNSVTVEQLRKDGYYFYKPNLQELFELFGREPVSDRTAVRKLALQLVQ D Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:12 2011 Seq name: gi|229783656|gb|GG668079.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld475, whole genome shotgun sequence Length of sequence - 1043 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 735 485 ## gi|288871732|ref|ZP_06118784.2| hypothetical protein CLOSTHATH_07272 2 1 Op 2 . 
- CDS 782 - 1042 59 ## gi|266625850|ref|ZP_06118785.1| conserved hypothetical protein Predicted protein(s) >gi|229783656|gb|GG668079.1| GENE 1 3 - 735 485 244 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871732|ref|ZP_06118784.2| ## NR: gi|288871732|ref|ZP_06118784.2| hypothetical protein CLOSTHATH_07272 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07272 [Clostridium hathewayi DSM 13479] # 1 244 8 251 252 452 100.0 1e-125 MRDYDEDQAYDEVREDFIEEFKDQLKQRGKGEEMEFIAEGTPFESFMIKGDGASICPSMS MELVYINCYTIAAYEIDEAARYILGMHDNAERIACEQLEQIDNVLIFPMNLSKYWEKLEK NHITYMPYRDMAITFQFIYETENRSYCKDVKEDHLIQWGINVEELFEKVRYKEVEEIGTQ IINVYDNVKKSLENNQSIIFSSVEDVREEARQVYQVYGVNGFGAVFYPKVMEEVIKKLGG DMVI >gi|229783656|gb|GG668079.1| GENE 2 782 - 1042 59 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625850|ref|ZP_06118785.1| ## NR: gi|266625850|ref|ZP_06118785.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 86 1 86 86 177 98.0 2e-43 LEPKGFEFAEGVNTDLYLLPTTNGQTRIYAADQYTLVDIKMGLERQKHFKDFENHALSGA VLHYKQKCKSLVVVPQPRKCSVVRRG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:31 2011 Seq name: gi|229783655|gb|GG668080.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld476, whole genome shotgun sequence Length of sequence - 698 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 669 258 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase Predicted protein(s) >gi|229783655|gb|GG668080.1| GENE 1 1 - 669 258 222 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 2 206 102 318 319 103 30 3e-23 FDRELDIPVFLEHNANAGAYAHMWDLKDAYRDDILVYIAAGQGIGAGIVMDGKIYKGALG TSGEIGHMTIDRNGRPCACGNKGCLERYASSLELVKAVYGDQAGRSGCNFEDVERRIREG DNFCIDCYRRACESLGIGIINIVNVINPDIIIIGDDMARPDPELMESVVRETVREGVLDD VWEGLTLSISTYQGDPILTGAAIVAIDRIFDSPGQFIKESED Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:32 2011 Seq name: gi|229783654|gb|GG668081.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld477, whole genome shotgun sequence Length of sequence - 605 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 603 490 ## Cphy_2209 hypothetical protein Predicted protein(s) >gi|229783654|gb|GG668081.1| GENE 1 3 - 603 490 200 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2209 NR:ns ## KEGG: Cphy_2209 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 5 199 89 283 314 193 45.0 5e-48 MGAGHPMMQGIFRYLGSGAGSDEDGWLFSVPSNDAWPRAPWWSYDEAENKLQSMGITAGL CAFILHYGEKESGIYQTALEHTEKILKKAAATEDFGEMGAGGVCMLLGEVMMSGAEVSFP GEALMGKMAEVVNRSIERDTEKWAGYTPRPSEFIWGPDSPFYKGNEEIVEKELDYLIDTR KPGGVWDITWTWFALGEKYP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:36 2011 Seq name: gi|229783653|gb|GG668082.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld478, whole genome shotgun sequence Length of sequence - 1029 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 21 - 80 6.8 1 1 Tu 1 . 
+ CDS 323 - 997 181 ## gi|266625854|ref|ZP_06118789.1| hypothetical protein CLOSTHATH_07277 Predicted protein(s) >gi|229783653|gb|GG668082.1| GENE 1 323 - 997 181 224 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625854|ref|ZP_06118789.1| ## NR: gi|266625854|ref|ZP_06118789.1| hypothetical protein CLOSTHATH_07277 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07277 [Clostridium hathewayi DSM 13479] # 1 224 1 224 224 394 100.0 1e-108 MRKIKKIVTLLLTLVFVASNTIYSFANENSTLSSDVPHETCIITLTSPNGIKYNVMAVEE PQMQLRSSNGDIAKTYSYSLDSKNMQRVTPSTRGGQSNDEWDDSISVHGYITITYASTTL SNGLKGYKLTKVNGKWEKSDNAVTMSNRRVSYTCQDVNHQNQITAKFPSSNTFSYSTGYT TYVSDIATGVLGANSHIDLNHAGSGTWSLDVTCNYFDNNILDYL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:48 2011 Seq name: gi|229783652|gb|GG668083.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld479, whole genome shotgun sequence Length of sequence - 643 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 642 419 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 Predicted protein(s) >gi|229783652|gb|GG668083.1| GENE 1 1 - 642 419 214 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 13 214 13 214 378 166 42 5e-42 EFSKEERLRAMEKPLLSWYKGHARILPWRENPEPYRVWISEIMLQQTRVEAVKPYFERFM EALPDTAALAAVSEDRLFKLWEGLGYYNRARNLKKAAGVVMEQYGGVLPASWEELKKLPG IGSYTAGAIASIAYGIPVPAVDGNVLRVISRVTGSREDILKQSVKKQMEDLLLGVMPREG AGNYNQALIEIGAIVCVPNGEPLCRECPMASVCV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:49 2011 Seq name: gi|229783651|gb|GG668084.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld480, whole genome shotgun sequence Length of sequence - 720 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 718 478 ## COG1686 D-alanyl-D-alanine carboxypeptidase Predicted protein(s) >gi|229783651|gb|GG668084.1| GENE 1 1 - 718 478 239 aa, chain - ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 1 148 140 293 425 107 38.0 3e-23 FAKLMNKKAKELGCTGSNFVNPSGLNDPDHYTTAYDMALIARSALNYPEIVEIMGTRVYK IPPSKKAPEGQTISPGHKMLKKNDSAYDPRVFGGKTGFTSLADNTLVTYAKKDDMTLISV ILNGHQTHYTDTKKLLDFGFSQFESKKLADYEGSLVPENHLTYSDSRTDQSLLLLNQNSR IVLPKEADLSAVVPIVSYDETSLTPSGTVVTVNYLYGERTVGHAWITPNPAIQSRLTRS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:49 2011 Seq name: gi|229783650|gb|GG668085.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld481, whole genome shotgun sequence Length of sequence - 1251 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 249 - 464 169 ## gi|266625858|ref|ZP_06118793.1| conserved hypothetical protein 2 1 Op 2 . - CDS 461 - 1213 503 ## COG5585 NAD+--asparagine ADP-ribosyltransferase Predicted protein(s) >gi|229783650|gb|GG668085.1| GENE 1 249 - 464 169 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625858|ref|ZP_06118793.1| ## NR: gi|266625858|ref|ZP_06118793.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 71 1 71 71 130 100.0 3e-29 MNVGKAVEQVKLGKGMRLPHWTKDTAIRMKFPDEHSDMTEPYLYVDSLVLGRWPWKESTE ELLSTKWEVVD >gi|229783650|gb|GG668085.1| GENE 2 461 - 1213 503 250 aa, chain - ## HITS:1 COG:BH3531 KEGG:ns NR:ns ## COG: BH3531 COG5585 # Protein_GI_number: 15616093 # Func_class: T Signal transduction mechanisms # Function: NAD+--asparagine ADP-ribosyltransferase # Organism: Bacillus halodurans # 33 197 135 300 490 107 35.0 3e-23 MEQLQNQLDQTMREVYKQEKARTTSHYVDLANEAYYRSIFDIQQRTGLGFSFSAIDPAVI DRVINSKWSGANYSTRIWNNTQALAQDLKEELLVNLVTGRTDREVAEIIANKYAQGASNA RRLVRTESCNLANQMEMQSYEECGIEKYRFVATLDLKTSAVCRKLDGKVFPVSEQQPGKN CPPMHPWCRSTTICVIDEIDMSNMKRRARDPVTGKTNTVPADMTYNEWYNQNVKGKAEAE AKEKEARKRK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:49:56 2011 Seq name: gi|229783649|gb|GG668086.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld482, whole genome shotgun sequence Length of sequence - 1088 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 292 290 ## gi|288871733|ref|ZP_06118795.2| putative ABC-type sugar transport system, permease component 2 1 Op 2 . 
- CDS 365 - 1087 825 ## gi|266625861|ref|ZP_06118796.1| hypothetical protein CLOSTHATH_07284 Predicted protein(s) >gi|229783649|gb|GG668086.1| GENE 1 1 - 292 290 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871733|ref|ZP_06118795.2| ## NR: gi|288871733|ref|ZP_06118795.2| putative ABC-type sugar transport system, permease component [Clostridium hathewayi DSM 13479] putative ABC-type sugar transport system, permease component [Clostridium hathewayi DSM 13479] # 17 97 1 81 81 142 100.0 7e-33 MNRPAKKKYSRLERHLMGIGIVFLLPAAILIIFTTLVPIVWNGVLSLCEWNGNGPMEFIG LKNYIKVFTDKPTMKTIGNSMVVAAGSTAVSMLLGIM >gi|229783649|gb|GG668086.1| GENE 2 365 - 1087 825 240 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625861|ref|ZP_06118796.1| ## NR: gi|266625861|ref|ZP_06118796.1| hypothetical protein CLOSTHATH_07284 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07284 [Clostridium hathewayi DSM 13479] # 1 240 1 240 240 469 99.0 1e-131 VAACEKIKQNTDVFPFAICDTRVGWFQRNAFLQIWDTKEELEAFNRGEISFKDERVVTAL DKINDLFAKEYAYPGNGAFSQTKDQTKAAFSQGKIAMLFNVNNGGKTIRTVMEEAGRTNI RTISWPTMAAGECDYVLGGCDGYFISSNTKNPEASINILKYLTSKEAFELQVDDGDVVPA NVEGGTDEEFARDSSKVYPKEIMAISAEMNDYINYNIGANYFYDKEGTLEELESLRTSAQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:50:15 2011 Seq name: gi|229783648|gb|GG668087.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld483, whole genome shotgun sequence Length of sequence - 1144 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 - CDS 2 - 695 705 ## COG1175 ABC-type sugar transport systems, permease components - Prom 750 - 809 3.2 - Term 735 - 781 -0.9 2 1 Op 2 . 
- CDS 812 - 1144 424 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783648|gb|GG668087.1| GENE 1 2 - 695 705 231 aa, chain - ## HITS:1 COG:BS_yesP KEGG:ns NR:ns ## COG: BS_yesP COG1175 # Protein_GI_number: 16077765 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 6 231 13 239 309 231 55.0 1e-60 MKMLKKAMNNERTAGMIFTMPFTIGFLLFMIVPMGISLYYSFCSYDILSPPVFTGLQNFK TMFADETFYKSIKVTFFFAFVSVPLRLLFALIVAMLLLKSTKLTGFYRAAYYLPSIIGGS VAVAILWKRMFATDGVINKLLQAVGINCTMSWLGNTKTAIWVLIILAVWQFGSSMLIFLS SLKQIPQSLYEAANVDGANGISKFFRITLPLLTPTIFFNLVMQMINGFLAF >gi|229783648|gb|GG668087.1| GENE 2 812 - 1144 424 110 aa, chain - ## HITS:1 COG:BH1117 KEGG:ns NR:ns ## COG: BH1117 COG1653 # Protein_GI_number: 15613680 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 1 105 330 433 434 73 42.0 8e-14 AGAKIIDYWTNSMDCNEILLAERGVPISSKVAEELAPSLTESDQKVISFINDVVTPNSSQ INPPYPNGSAEVSDLINKLGEKVCYGELTAEEAAEQLYTEGNKIMAEKAK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:50:16 2011 Seq name: gi|229783647|gb|GG668088.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld484, whole genome shotgun sequence Length of sequence - 1064 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 36 - 61 -0.8 1 1 Tu 1 . - CDS 248 - 580 66 ## gi|288871735|ref|ZP_06118799.2| hypothetical protein CLOSTHATH_07287 - Prom 620 - 679 7.5 - Term 708 - 740 -0.2 2 2 Tu 1 . 
- CDS 953 - 1054 61 ## Predicted protein(s) >gi|229783647|gb|GG668088.1| GENE 1 248 - 580 66 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871735|ref|ZP_06118799.2| ## NR: gi|288871735|ref|ZP_06118799.2| hypothetical protein CLOSTHATH_07287 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07287 [Clostridium hathewayi DSM 13479] # 1 110 15 124 124 188 100.0 1e-46 MKKADIERLLFYYLCGNETAKEMSQGNNITYEEFDRITYILIELKFYNLLMEFWDEFYNQ FQEDIDKSAEDFSDSFEREISRCEKWLYQFCKEAPTQELQGILKEIFDIS >gi|229783647|gb|GG668088.1| GENE 2 953 - 1054 61 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPVEIAGYVVDEAENPLALDNEEFTPKNISIKM Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:50:28 2011 Seq name: gi|229783646|gb|GG668089.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld485, whole genome shotgun sequence Length of sequence - 1040 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 9 - 1040 1151 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes Predicted protein(s) >gi|229783646|gb|GG668089.1| GENE 1 9 - 1040 1151 343 aa, chain - ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 2 343 81 423 472 575 80.0 1e-164 DGNRIVMFAPLYLSNYCINGCTYCPYHGKNKHIARKKLTQEEVAKEVIALQDMGHKRLAI EAGEDPVRNPIEYILECIDTIYSIKHKNGAIRRVNVNIAATSVENYRKLKDAGIGTYILF QETYHKESYEVLHPTGPKHDYAYHTEAMDRAMEGGIDDVGLGVLFGLELYKYEFAGLLMH AEHLEAVHGVGPHTISVPRIKHADDIDPSAFDNSISDDIFAKLCALIRLAVPYTGMIIST RESQEVREKVIRLGVSQISGASRTSVGGYQEEIRPTDTEQFDVSDQRSLDEVVRWLMEMG YIPSFCTACYREGRTGDRFMSLCKSGQIQNCCHPNALMTLKDF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:50:28 2011 Seq name: gi|229783645|gb|GG668090.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld486, whole genome shotgun sequence Length of sequence - 883 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 137 127 ## gi|266623968|ref|ZP_06116903.1| Two-component response regulator YesN - Term 108 - 152 2.1 2 2 Tu 1 . 
- CDS 166 - 633 288 ## gi|266625869|ref|ZP_06118804.1| putative lipoprotein - Prom 732 - 791 6.0 Predicted protein(s) >gi|229783645|gb|GG668090.1| GENE 1 3 - 137 127 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266623968|ref|ZP_06116903.1| ## NR: gi|266623968|ref|ZP_06116903.1| Two-component response regulator YesN [Clostridium hathewayi DSM 13479] Two-component response regulator YesN [Clostridium hathewayi DSM 13479] # 1 44 375 418 418 94 100.0 3e-18 YINTDMKIYEIAELVGYSDWHYLYSVYKKQLGHSMSKEKRGKTT >gi|229783645|gb|GG668090.1| GENE 2 166 - 633 288 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625869|ref|ZP_06118804.1| ## NR: gi|266625869|ref|ZP_06118804.1| putative lipoprotein [Clostridium hathewayi DSM 13479] putative lipoprotein [Clostridium hathewayi DSM 13479] # 1 155 1 155 155 253 100.0 4e-66 MKRTALVTILMIFCLSITACAADPNHPTNAQSSIFHSLNVTHTDRSIIVKADAYVNITDA AFLTIEISDEESVSLTYTLTKNTGDITLSCRTPEDETILLADTEDKTYGEETVTLKPGTT TFFLAGTDSSFNVSFSIKDIDISKVQAINGSAPEL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:50:42 2011 Seq name: gi|229783644|gb|GG668091.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld487, whole genome shotgun sequence Length of sequence - 1146 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 134 74 ## gi|266621230|ref|ZP_06114165.1| conserved hypothetical protein 2 1 Op 2 . - CDS 158 - 703 625 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase - Prom 725 - 784 7.0 3 2 Tu 1 . 
- CDS 791 - 1144 148 ## Closa_1294 diacylglycerol kinase catalytic region Predicted protein(s) >gi|229783644|gb|GG668091.1| GENE 1 2 - 134 74 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621230|ref|ZP_06114165.1| ## NR: gi|266621230|ref|ZP_06114165.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 7 44 25 62 310 82 100.0 1e-14 MLEPKGFEFLEGLTKWIKPKDNSLKRRIRESRNQKPPKGVKRLV >gi|229783644|gb|GG668091.1| GENE 2 158 - 703 625 181 aa, chain - ## HITS:1 COG:CAC0434 KEGG:ns NR:ns ## COG: CAC0434 COG0245 # Protein_GI_number: 15893725 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Clostridium acetobutylicum # 1 154 1 154 155 186 62.0 1e-47 MRIGQGYDVHKLTDGRDLILGGVKVPYEKGLLGHSDADVLVHAVMDALLGAAALGDIGEH FPDTDPAYRGISSVELLKRVGALLEEKGYVIENIDATIIAQRPKLKDYRPQMVQNIAAAL GIPADRVSVKATTEEGLGFTGSGEGISAQAITLLTAVGDYCYDDKIMDGGCQGCGGGCCG K >gi|229783644|gb|GG668091.1| GENE 3 791 - 1144 148 117 aa, chain - ## HITS:1 COG:no KEGG:Closa_1294 NR:ns ## KEGG: Closa_1294 # Name: not_defined # Def: diacylglycerol kinase catalytic region # Organism: C.saccharolyticum # Pathway: not_defined # 1 117 192 308 308 196 70.0 2e-49 LLLDGVQKVEFNHAYFISAHIHPYEGGGFKFAPDASYDDGKLSICVMNNRKKRKLIPVLL NSMFGRQSHNKGTRFYTCGEAVVHVDKPMAVHVDGESCFCQNDIQLRCIKKAVRMIV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:50:49 2011 Seq name: gi|229783643|gb|GG668092.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld488, whole genome shotgun sequence Length of sequence - 1219 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 229 - 1219 957 ## COG2217 Cation transport ATPase Predicted protein(s) >gi|229783643|gb|GG668092.1| GENE 1 229 - 1219 957 330 aa, chain + ## HITS:1 COG:BS_yvgW KEGG:ns NR:ns ## COG: BS_yvgW COG2217 # Protein_GI_number: 16080402 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Bacillus subtilis # 35 328 120 414 702 380 64.0 1e-105 MTKNRKTVYRIFAAAAIYLITMALPLTPTGKLAGFLLCYGVIGWDIIWKAVTNILHGQVF DENFLMTVATVGAMFLGEYAEGVAVMLFYQVGELFQSYAVSKSRRSISGLMDIRPDYANV VREDGIVSVDPDEVAVGEVIVVKPGERVPLDGTILEGRSALNTSALTGESLPREVEPGAD VISGCINQSGTLKILVSKPYGESTVAKILDLVENASSKKSKSEAFITRFARYYTPAVVVA AVLLAVIPPLLTGQPFAVWIERALTFLVISCPCALVISVPLSFFGGIGGASKCGVLIKGS NYLEALAKTETVVFDKTGTLTKGSFAVSGC Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:50:50 2011 Seq name: gi|229783642|gb|GG668093.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld489, whole genome shotgun sequence Length of sequence - 1308 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 19 - 966 1072 ## SPJ_1869 phage portal protein, SPP1 family 2 1 Op 2 . 
+ CDS 959 - 1307 178 ## SPJ_1868 phage putative head morphogenesis protein, SPP1 gp7 family Predicted protein(s) >gi|229783642|gb|GG668093.1| GENE 1 19 - 966 1072 315 aa, chain + ## HITS:1 COG:no KEGG:SPJ_1869 NR:ns ## KEGG: SPJ_1869 # Name: not_defined # Def: phage portal protein, SPP1 family # Organism: S.pneumoniae_JJA # Pathway: not_defined # 1 311 107 421 433 248 45.0 3e-64 MYYYAKKDDSDKKRAIYVATVLTEHYKWVLNIENIDSPQALLEEPTPHYFDEVPVIEYLN NKLAIGDFELQIPLIDAYNALMSDRITDKEQFIDAILALYGAMLGDNEAKDADGRTAAQK LKEDRLMELPKDAKAEYITRTFDESGVEVLKKAIEQDIHKFSHIPCMTDESFGGNVSGVA MEFKLLGMENITKIKTRYYKRGLRKRLRLFEDWLAKSKSVQVDISGITPTFSRAMPKNLL EISQVTANLWGKVSRKTLLSQIPFVENVEEELAAVKKEEDEAAKKQMEMFGLGSNTPPPD GEEEPKKKKPGGVDE >gi|229783642|gb|GG668093.1| GENE 2 959 - 1307 178 116 aa, chain + ## HITS:1 COG:no KEGG:SPJ_1868 NR:ns ## KEGG: SPJ_1868 # Name: not_defined # Def: phage putative head morphogenesis protein, SPP1 gp7 family # Organism: S.pneumoniae_JJA # Pathway: not_defined # 19 116 1 100 527 65 37.0 8e-10 MSSASYWEKRKAQRMFEYMQSAEDTADEIAKLYLRSSEYLSAELDKIYERYKRKHHLTDA EAYRLLNCLHDKTSIEELKEALRAGDGVEKDILAELEGPAYRARLERLEQLQNQLD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:50:58 2011 Seq name: gi|229783641|gb|GG668094.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld490, whole genome shotgun sequence Length of sequence - 1394 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 309 - 605 306 ## COG2002 Regulators of stationary/sporulation gene expression 2 1 Op 2 . - CDS 602 - 1174 290 ## gi|266625879|ref|ZP_06118814.1| conserved hypothetical protein 3 1 Op 3 . 
- CDS 1185 - 1394 104 ## gi|288871739|ref|ZP_06118815.2| conserved hypothetical protein Predicted protein(s) >gi|229783641|gb|GG668094.1| GENE 1 309 - 605 306 98 aa, chain - ## HITS:1 COG:BH0070 KEGG:ns NR:ns ## COG: BH0070 COG2002 # Protein_GI_number: 15612633 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Bacillus halodurans # 1 96 1 99 179 63 38.0 1e-10 MNSTGIIRRIDDLGRVVVPRDMRKSFGLQEGTPLEVCATEEGILFKKYDPGITLMDIVNN LESALDDNYVELGVDKTREIRLCISDLKEILKEADGRR >gi|229783641|gb|GG668094.1| GENE 2 602 - 1174 290 190 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625879|ref|ZP_06118814.1| ## NR: gi|266625879|ref|ZP_06118814.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 190 1 190 190 374 100.0 1e-102 MLKDIEVKIIAPAQLPPVLYWLLNHKYHTEQWDFVVMFDAKWQILYVNRTVPESDVKKFV DIASWQTWYIGDMDCPIADDVEYVYVAYGRNVWNILTDAHKDRMRKRETEKAQEKAKKIL PVIKAEMNTIVDDEIPDPMDDYLVSCINDAGREADRDRDMHECLVNTGTKYVFYLGYLMG SGKIKEDTEA >gi|229783641|gb|GG668094.1| GENE 3 1185 - 1394 104 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871739|ref|ZP_06118815.2| ## NR: gi|288871739|ref|ZP_06118815.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 21 69 1 49 49 90 100.0 4e-17 DGSICPHCGARTFFTEPLKVMRIYCECGLYTRYMTNLKEEVFDVNCINCGSPVAVKYNGR KNCYETIRE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:51:14 2011 Seq name: gi|229783640|gb|GG668095.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld492, whole genome shotgun sequence Length of sequence - 1007 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 36 - 398 365 ## GYMC10_0575 protein of unknown function DUF1861 2 1 Op 2 . 
+ CDS 414 - 1005 544 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783640|gb|GG668095.1| GENE 1 36 - 398 365 120 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_0575 NR:ns ## KEGG: GYMC10_0575 # Name: not_defined # Def: protein of unknown function DUF1861 # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 4 118 199 313 314 150 60.0 2e-35 MEHQFIPEEWGGANELHRLACGKVGVLSHIARFDEKGSRHYYSTVFCFDPETGKYSPMKL IAERKDFQPGPCKRPDLEDVIFSGGIVRLGGGKAELYCGVSDAEGQKALIDDPFDEYENK >gi|229783640|gb|GG668095.1| GENE 2 414 - 1005 544 197 aa, chain + ## HITS:1 COG:lin0852 KEGG:ns NR:ns ## COG: lin0852 COG1653 # Protein_GI_number: 16799926 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Listeria innocua # 6 192 4 179 440 135 43.0 4e-32 MKRMKKTMAIAAAAVMAAAMMTGCGKSGGTDSSANSGNSGTAGGVVSVDFWTAPQQVQYN FWESKAQAFNDAGITVDGKKVEVKVQQMPESPSSEAGIQNAIATGTVPAVSENINRGFAA TLAASGAVYDLSGEAWFQEVIEARKMEDTITNWAIDDKQYVLPVYVNPMIWQWNMKALKA LGYDDAPKTVDEFTAVI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:51:18 2011 Seq name: gi|229783639|gb|GG668096.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld493, whole genome shotgun sequence Length of sequence - 1137 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 505 307 ## CD1108 putative DNA-repair protein 2 1 Op 2 . + CDS 535 - 786 415 ## CD1107A hypothetical protein 3 1 Op 3 . 
+ CDS 776 - 1136 420 ## CD1107 hypothetical protein Predicted protein(s) >gi|229783639|gb|GG668096.1| GENE 1 2 - 505 307 167 aa, chain + ## HITS:1 COG:no KEGG:CD1108 NR:ns ## KEGG: CD1108 # Name: not_defined # Def: putative DNA-repair protein # Organism: C.difficile # Pathway: not_defined # 1 167 480 646 646 324 83.0 6e-88 KPLLFGGGSPDTGVSEDLTGVEFINGSRPGNPQLVELAKSQVGNVGGQPYWSWYGFDSRV EWCACFVSWCYGQSGRTEPRFAGCQSQGVPWFQSHGQWGARGYENIAPGDAIFFDWDLDG SADHVGIFVGTDGSRVYTVEGNSGDACKIKSYDLNYQSIKGYGLMNW >gi|229783639|gb|GG668096.1| GENE 2 535 - 786 415 83 aa, chain + ## HITS:1 COG:no KEGG:CD1107A NR:ns ## KEGG: CD1107A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 67 1 67 85 76 70.0 3e-13 MAKNKIERIDQEITKVHKKIAEYQEKLKALEAQKTEAENLEIVQMVRALRMTPAQLSAML SGGTVPGRLADDNNEQEENSYEE >gi|229783639|gb|GG668096.1| GENE 3 776 - 1136 420 120 aa, chain + ## HITS:1 COG:no KEGG:CD1107 NR:ns ## KEGG: CD1107 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 120 3 121 244 164 74.0 1e-39 MRNKRLLRTLSALCLTLVLASGFTVPAFAQGAAPPPAEDTTNDSNVVVEETEKAPPLTPD GNAALVDDFGGNKQLITVTTKAGNYFYILIDRANEDKKTAVHFLNQVDEADLMALMEDGK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:51:26 2011 Seq name: gi|229783638|gb|GG668097.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld494, whole genome shotgun sequence Length of sequence - 1251 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 22 - 1008 879 ## gi|266625886|ref|ZP_06118821.1| conserved hypothetical protein 2 1 Op 2 . 
+ CDS 1075 - 1249 196 ## gi|266625887|ref|ZP_06118822.1| conserved hypothetical protein Predicted protein(s) >gi|229783638|gb|GG668097.1| GENE 1 22 - 1008 879 328 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625886|ref|ZP_06118821.1| ## NR: gi|266625886|ref|ZP_06118821.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 328 8 335 335 650 100.0 0 MKGRGIFFVSVFSLLTAFNAFAQTETVSSSAVQDHLVQGYDSAAKMYTQTFANKKTFRTN VPNRMITSRSVVMEFDTGELIYTLTRDGEAMAYESGQIITEPGYYCLKAMALPEILAEDV PEPTIDQLYGEAAMPEFTLYEDDNVYQSCFYFFIPGKAENRLDYINAPEGYQIGYVEKNG QRVPGDSPDWQRLKEDGSYRFLWNPVKEGLPVCESVLVRDTKAPFLEIDGVKDGGYSKGG ISVYADEADITIRVTDGSFSWNLDGSKLERPGIYEITAVDAAGNQSRYGVVIDMDMTWAM VLAGAALAAAIGACIWYIHVQKKKIQVR >gi|229783638|gb|GG668097.1| GENE 2 1075 - 1249 196 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625887|ref|ZP_06118822.1| ## NR: gi|266625887|ref|ZP_06118822.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 58 1 58 59 95 100.0 1e-18 MFQNLFTKVMMQGRFIFAASTEGTKAGTVPDFTEAVKPIVDFINALKGPALTVVIALA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:51:47 2011 Seq name: gi|229783637|gb|GG668098.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld496, whole genome shotgun sequence Length of sequence - 716 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 + CDS 2 - 424 483 ## COG1846 Transcriptional regulators 2 1 Op 2 . 
+ CDS 426 - 714 375 ## COG1132 ABC-type multidrug transport system, ATPase and permease components Predicted protein(s) >gi|229783637|gb|GG668098.1| GENE 1 2 - 424 483 140 aa, chain + ## HITS:1 COG:FN2010 KEGG:ns NR:ns ## COG: FN2010 COG1846 # Protein_GI_number: 19705306 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 1 128 11 141 160 59 29.0 2e-09 RIMKLHRSILEQRLNKTGVYRSQHQILTAMAEHPNTSQKELAEYLNVTAATVAVSVKKLE KGGYITRIVDQEDNRYNKLCLTEEGRDVVEHSRQFFKNVENQMFFGFTEEEFQIMEGYLE RVYANLSDITAKTMTKREDS >gi|229783637|gb|GG668098.1| GENE 2 426 - 714 375 96 aa, chain + ## HITS:1 COG:TM0287 KEGG:ns NR:ns ## COG: TM0287 COG1132 # Protein_GI_number: 15643056 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Thermotoga maritima # 1 96 1 96 577 80 40.0 8e-16 MKRYWKYIKPYWGAFLLAPLLMLTEVFGEIMLPKLMSLIINNGVANRDTSYIIAVGGVMV VTAFVMAAGGIGAAYFSSKASICFSSDLRKGVLDKV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:51:48 2011 Seq name: gi|229783636|gb|GG668099.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld497, whole genome shotgun sequence Length of sequence - 1363 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 1 - 649 591 ## COG0772 Bacterial cell division membrane protein 2 1 Op 2 . 
- CDS 662 - 1138 395 ## COG1695 Predicted transcriptional regulators - Prom 1196 - 1255 10.5 Predicted protein(s) >gi|229783636|gb|GG668099.1| GENE 1 1 - 649 591 216 aa, chain - ## HITS:1 COG:BH3360 KEGG:ns NR:ns ## COG: BH3360 COG0772 # Protein_GI_number: 15615922 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus halodurans # 1 90 1 90 431 70 37.0 2e-12 MELSEYLDTVSEQIRCKRARTMVREELENHVQEQAEAYEADGMTVPEAMREAVRQMGDPI ETGTALNRIHRPQLEWKFLILVLLLSALGLALQYMTCYTGLFGGFSSDLADYFWKRQCFF TVAGLGVLAAVYIVDYTILGRYPLLLWFGYFAVIFIIANTHTVIMGKVRIYNYLTLFLPL YAGVLYRFRSKGYGAVAVCYLLSFMPFFVGLKTILV >gi|229783636|gb|GG668099.1| GENE 2 662 - 1138 395 158 aa, chain - ## HITS:1 COG:DR1954 KEGG:ns NR:ns ## COG: DR1954 COG1695 # Protein_GI_number: 15806952 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Deinococcus radiodurans # 3 101 1 101 107 77 38.0 7e-15 MPIDKSLISGSTSMLLLKLLTEKDMYGYEMIETLRKKSNDVFELKAGTLYPLLHSLEQKG YLTSYEEEASGKVRKYYSITGEGKRYLKGKEEEWKQYAGAVSGVLALHCINPCAIAPAIY ARSTPAKLDVGSATPSLRGRTPDADIRHYNPVIYATQY Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:51:49 2011 Seq name: gi|229783635|gb|GG668100.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld498, whole genome shotgun sequence Length of sequence - 1033 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 20 - 1031 656 ## COG3408 Glycogen debranching enzyme Predicted protein(s) >gi|229783635|gb|GG668100.1| GENE 1 20 - 1031 656 337 aa, chain + ## HITS:1 COG:MA0905 KEGG:ns NR:ns ## COG: MA0905 COG3408 # Protein_GI_number: 20089784 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen debranching enzyme # Organism: Methanosarcina acetivorans str.C2A # 3 337 279 611 680 266 44.0 4e-71 MAQESKRQQELEENSGFRDAEARLIARNCSQYVVNRESTGGKSVIAGYPFFGDWGRDTMI ALPGLCIVTGQMETAKSILSTFIQYRRRGLMPNVFPEGDGEPEYNTADASLLFIGSVYEY YQAGGDLDFIEKEAYPVILEIIDWYRKGTDYHICMDRDGLISAGGGLEQVTWMDVRINGF LPTPRHGKAVEINAQWHSALMIAGHFAALFSESGKVFTDLAETVKNSFVREFWLEDEGCL KDVISNSAEHHPDRQIRCNQIWAVSQPFPILDREKECRIVDTVYEHLYTPYGLRSLSGKD GEFCAEYGGSVVNRDMAYHQGTVWTFPLGAYYLAYLK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:51:50 2011 Seq name: gi|229783634|gb|GG668101.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld499, whole genome shotgun sequence Length of sequence - 1301 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 353 180 ## gi|266625893|ref|ZP_06118828.1| conserved hypothetical protein 2 1 Op 2 . - CDS 368 - 811 508 ## SP670_2143 prophage protein 3 1 Op 3 . 
- CDS 829 - 1242 480 ## SPJ_1855 major tail protein, putative Predicted protein(s) >gi|229783634|gb|GG668101.1| GENE 1 2 - 353 180 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625893|ref|ZP_06118828.1| ## NR: gi|266625893|ref|ZP_06118828.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 117 1 117 118 223 100.0 4e-57 MIDDLFPLALDCGISPERFWDLSIPDIIDIVECSRRQEERKVKHELMNLHFLARDIGQFT AAAIQGSDKVKIMELWDFFPDLFEREHEETKKKIQEKQLAEYKARFNDFVIRHNHAR >gi|229783634|gb|GG668101.1| GENE 2 368 - 811 508 147 aa, chain - ## HITS:1 COG:no KEGG:SP670_2143 NR:ns ## KEGG: SP670_2143 # Name: not_defined # Def: prophage protein # Organism: S.pneumoniae_670-6B # Pathway: not_defined # 30 127 5 107 126 62 33.0 5e-09 MTQGFDEILETKEEEEKVVSLEEKKKRRPFAYWEVGGQTYKMKLTTQNICRLEDKYKTSL LNLLFGSGNVPTLSIMLTVTQAAMLPYHHKIKFVDVQNLFDRYCEEGGTQMTFMTDVFME IYKVSGFFTEDQAEEMDKRLEEAKDQM >gi|229783634|gb|GG668101.1| GENE 3 829 - 1242 480 137 aa, chain - ## HITS:1 COG:no KEGG:SPJ_1855 NR:ns ## KEGG: SPJ_1855 # Name: not_defined # Def: major tail protein, putative # Organism: S.pneumoniae_JJA # Pathway: not_defined # 1 134 1 133 137 137 55.0 2e-31 MLANGITLEVKKNGSESYTALQDLKEVPELGVDAEKVENTRLKDKFKHSELGIGDPGDMA YKFVYDNSSASSDYRVLREIADSNKVASYRQTFPDGTKYEFDAYSSIKVGGGGVNAAIEF TLTLGLQSDINVTDPTA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:52:03 2011 Seq name: gi|229783633|gb|GG668102.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld500, whole genome shotgun sequence Length of sequence - 1098 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 337 292 ## COG1131 ABC-type multidrug transport system, ATPase component 2 1 Op 2 . 
- CDS 334 - 1098 279 ## gi|266625897|ref|ZP_06118832.1| putative nicotinamide nucleotide transhydrogenase, subunit alpha 2 Predicted protein(s) >gi|229783633|gb|GG668102.1| GENE 1 1 - 337 292 112 aa, chain - ## HITS:1 COG:SP0707 KEGG:ns NR:ns ## COG: SP0707 COG1131 # Protein_GI_number: 15900606 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Streptococcus pneumoniae TIGR4 # 1 112 1 112 215 98 42.0 3e-21 MKPIISLDQVSKSFSGRRVLSDVSLDVEKGSTVGIVGANGSGKSVLFNIICGFLTPDSGQ VYVRGQALGKGRDFPENVGVLINSPGFIGLNTGLQNLRYLAGIRGVAGEKEI >gi|229783633|gb|GG668102.1| GENE 2 334 - 1098 279 254 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625897|ref|ZP_06118832.1| ## NR: gi|266625897|ref|ZP_06118832.1| putative nicotinamide nucleotide transhydrogenase, subunit alpha 2 [Clostridium hathewayi DSM 13479] putative nicotinamide nucleotide transhydrogenase, subunit alpha 2 [Clostridium hathewayi DSM 13479] # 2 254 2 254 254 406 100.0 1e-112 GRHSAFKHGGKGLASCYRRILFTGKNGLLLIGLILLMTVWTWISGGAPEEGTDWLVRLFA GHGTGYFYPMGLFMLLTMDILPLWPLGVFCSQAVGERSVFLTVRLRRRSELLEALLSTGF LWIFLYGCLLALAAVIPPLLQGFSPDMGLTAAVVGLKLLDIAFQFLIILSVLCVTGQVTA GFTAVLLQHFLCLLPISWLPAGISSLARLKLPQTGGTIPAAAAAVLLAALSLPLILWLSI RGIRRLFNHEGGNI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:52:18 2011 Seq name: gi|229783632|gb|GG668103.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld501, whole genome shotgun sequence Length of sequence - 1187 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 128 - 1187 368 ## COG1882 Pyruvate-formate lyase Predicted protein(s) >gi|229783632|gb|GG668103.1| GENE 1 128 - 1187 368 353 aa, chain + ## HITS:1 COG:MTH346 KEGG:ns NR:ns ## COG: MTH346 COG1882 # Protein_GI_number: 15678374 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Methanothermobacter thermautotrophicus # 1 353 179 515 642 235 38.0 1e-61 MVGIGRLDKVLARYKLDIPESVQIIGDFYSEMHRYFAFKSSGKLLGDTGQIIVLGGIDPN GDYFCNEYTHIFIEVMRDKPIPDPKILLRISSKMPDDLLELAIQCIATGVGCPLLSNDDV IIPALEEFGYTHADACNYVTSACWEPMAYGKSLEKNNIKSINFGRCFEKTYNNDEFTACK NFEGLLTLFLNEVDREAETVINILNNIRWEKNPLRSLFTEDCIVKGKDISEGGATYNNYG VLTVGLANAVDSLLNIKELCFEEKKFTLDQVRSICRESNFESVDSSLKKWYGTENTQVAQ LVEQMTDRVYNNLSSYRNRFYGKVKWGLSASNYVESGAVVGATFDGRNKNEPL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:52:19 2011 Seq name: gi|229783631|gb|GG668104.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld502, whole genome shotgun sequence Length of sequence - 564 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 130 - 180 5.1 1 1 Tu 1 . 
- CDS 206 - 562 88 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) Predicted protein(s) >gi|229783631|gb|GG668104.1| GENE 1 206 - 562 88 118 aa, chain - ## HITS:1 COG:FN1345 KEGG:ns NR:ns ## COG: FN1345 COG0596 # Protein_GI_number: 19704680 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Fusobacterium nucleatum # 1 117 156 272 275 128 51.0 3e-30 FCEYSVLANQYTHQRYKNEVLAGLVLADNNFINILENNYEFSFNADEIIRDITYQKPVLF ICGKQDKCVGYQDAWRLTESYSRATFSVLDMAGHNLQIEQPHLFNELVIDWLLRIEQE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:52:20 2011 Seq name: gi|229783630|gb|GG668105.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld503, whole genome shotgun sequence Length of sequence - 1254 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 27 - 134 72 ## 2 2 Op 1 5/0.000 + CDS 236 - 583 469 ## PROTEIN SUPPORTED gi|160880535|ref|YP_001559503.1| ribosomal protein L19 + Term 611 - 650 6.8 3 2 Op 2 . 
+ CDS 662 - 1207 721 ## COG0681 Signal peptidase I Predicted protein(s) >gi|229783630|gb|GG668105.1| GENE 1 27 - 134 72 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQLAVYIGVAFTLKIYYQDIQEENCKKGLAIDNPL >gi|229783630|gb|GG668105.1| GENE 2 236 - 583 469 115 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880535|ref|YP_001559503.1| ribosomal protein L19 [Clostridium phytofermentans ISDg] # 1 115 1 115 115 185 79 2e-47 MNDIIKNIEAAQLKETVPEFHVGDTVKVYNKIKEGTRERIQVFEGTVIKRQNGGARETFT VRKNSNGIGVEKTWPLHSPSVDNIEVVRRGKVRRAKLNYLRDRVGKAAKVKELVK >gi|229783630|gb|GG668105.1| GENE 3 662 - 1207 721 181 aa, chain + ## HITS:1 COG:lin1310 KEGG:ns NR:ns ## COG: lin1310 COG0681 # Protein_GI_number: 16800378 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Listeria innocua # 40 180 34 179 180 87 37.0 2e-17 MEFYNSEENNRIRHVVNWIVDITVVIAFAWFIVYAYGTQIPIAGHSMTPLLQSEDIVLMD RLSYDFGKPDRFDVVVFEREDRKMNVKRVIGLPGETVQIKGGQIYINDEWIEQPEGATSI SLAGIAENPVKLGEDEYFLLGDNRDSSEDSRFSNVGNVSGKQIQGKVWIRIAPLANLELI R Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:52:25 2011 Seq name: gi|229783629|gb|GG668106.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld504, whole genome shotgun sequence Length of sequence - 919 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 730 287 ## gi|266625904|ref|ZP_06118839.1| major facilitator superfamily metabolite/H+ symporter 2 1 Op 2 . 
- CDS 735 - 917 143 ## gi|266625905|ref|ZP_06118840.1| transposase, IS110 family Predicted protein(s) >gi|229783629|gb|GG668106.1| GENE 1 1 - 730 287 243 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625904|ref|ZP_06118839.1| ## NR: gi|266625904|ref|ZP_06118839.1| major facilitator superfamily metabolite/H+ symporter [Clostridium hathewayi DSM 13479] major facilitator superfamily metabolite/H+ symporter [Clostridium hathewayi DSM 13479] # 1 243 1 243 243 412 100.0 1e-113 MVRRVWEKAGLGKFFLLFVLSLLFGLSERVSTQDTLPVHLLAVLNDQYYYTFAVLPVFLL LCTSVMEDDTPFVLVRYGTFGRYFFHKYRALLMIAALLWLGQMAAILLTGLGLPIAGRWP GTSGGQWREVFTLLQGIFPSPWSAILCCAGQTLLGYGLIALTALCLGHFCSRSLAVRLLM ALYLFAVLWIQLPVMSRPPFVFLTGFNHWVFLLHNLACPWRFPLTAVTTAGLAAGMVWLV TQR >gi|229783629|gb|GG668106.1| GENE 2 735 - 917 143 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625905|ref|ZP_06118840.1| ## NR: gi|266625905|ref|ZP_06118840.1| transposase, IS110 family [Clostridium hathewayi DSM 13479] transposase, IS110 family [Clostridium hathewayi DSM 13479] # 1 60 1 60 60 98 100.0 1e-19 FWAILGLPGLRFVTAFEPASLSAAAVHAASFTAGPVLMCIVMGLVALYFQKVKKRTIYQV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:52:44 2011 Seq name: gi|229783628|gb|GG668107.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld505, whole genome shotgun sequence Length of sequence - 1083 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 435 - 494 7.9 2 2 Tu 1 . 
+ CDS 615 - 1083 333 ## GWCH70_3218 hypothetical protein Predicted protein(s) >gi|229783628|gb|GG668107.1| GENE 1 3 - 104 75 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no ASKREAPLKMRYPIARVRPLEDDEVDRAEVEVR >gi|229783628|gb|GG668107.1| GENE 2 615 - 1083 333 156 aa, chain + ## HITS:1 COG:no KEGG:GWCH70_3218 NR:ns ## KEGG: GWCH70_3218 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_WCH70 # Pathway: not_defined # 4 156 3 155 323 145 51.0 7e-34 MLQFNRSTVRTCRELLLEFNEDKIQVNSSKLKFSVKQTEKDVYNITAPFAADGKTWIAGR VEDRDSEYSRIGFFEERDGEWVEADREEFCLQDPFVTRIGGKLVLGGVEVFDDRENPGHL NYRTVFYSGNGVLDLSRFAVGPDRMKDIRLCELEDG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:52:52 2011 Seq name: gi|229783627|gb|GG668108.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld506, whole genome shotgun sequence Length of sequence - 861 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 678 316 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 2 1 Op 2 . 
- CDS 696 - 860 141 ## Predicted protein(s) >gi|229783627|gb|GG668108.1| GENE 1 1 - 678 316 226 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 5 222 1 217 245 126 36 6e-30 MEETLIELKDIYKVYEMGDEEVRANDGISLTINRGEFVAIVGKSGSGKSTLMNIIGALDV PSSGEYLLGGKDVGSMTDNQLADIRNRMIGFIFQQYNLLPKLNLLENVELPLLYAGASNT ARRMRAMASLERVGLAEKWRNLPNQLSGGQQQRVSIARALAGDPSLILADEPTGALDSKT SRSVLDFLKQLNDEGNTIVMITHDSSIAMEARRVVRVHDGKINFDG >gi|229783627|gb|GG668108.1| GENE 2 696 - 860 141 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no DVVYVAESTVNSSFNMMGGMQGGPGGMGGNPGGGAPGGGNSGGRSGGNNGGAPR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:01 2011 Seq name: gi|229783626|gb|GG668109.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld507, whole genome shotgun sequence Length of sequence - 650 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 437 426 ## COG2378 Predicted transcriptional regulator - Prom 457 - 516 3.1 Predicted protein(s) >gi|229783626|gb|GG668109.1| GENE 1 2 - 437 426 145 aa, chain - ## HITS:1 COG:CAC3494 KEGG:ns NR:ns ## COG: CAC3494 COG2378 # Protein_GI_number: 15896731 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 145 1 146 300 133 48.0 1e-31 MTESRLFKIIYYLMEKGQATAPELAEKFEVSVRTIYRDIDRISGAGVPVYSVAGRTGGIR LLDRFVLEKSLLSEEEMRDILFGLQSLAAVQNSDMEMVLSKLRATFQMADTSWIEVDVSR WGSDPERERMDFNVLKQAVMTRRQI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:01 2011 Seq name: gi|229783625|gb|GG668110.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld508, whole genome shotgun sequence Length of sequence - 681 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 680 649 ## gi|266625911|ref|ZP_06118846.1| conserved hypothetical protein Predicted protein(s) >gi|229783625|gb|GG668110.1| GENE 1 2 - 680 649 226 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625911|ref|ZP_06118846.1| ## NR: gi|266625911|ref|ZP_06118846.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 226 1 226 227 366 100.0 1e-100 ASDAKGTTIEYSTDGESWSGTIPQFTEYQEGGYPVYVRATNDNYSNAATAEVVFNITKRP VTVSAGIMTAEYDGSEKEVTEIRYTEADSENAEGVLAGHTVTAKLLNNKRTDAGEQAVSI EGGSVRILSGRTDVTKNYAVSLGDGKLTISQKGDLKVTVDAESLSHVYDGAGHGIKAAEA SDAKGTTIEYSTDGESWSGTIPQFTEYQEGGYPVYVRATNDNYSNA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:14 2011 Seq name: gi|229783624|gb|GG668111.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld509, whole genome shotgun sequence Length of sequence - 1032 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N 
Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 150 179 ## gi|266625912|ref|ZP_06118847.1| HflC protein 2 1 Op 2 . + CDS 185 - 1031 1016 ## COG0685 5,10-methylenetetrahydrofolate reductase Predicted protein(s) >gi|229783624|gb|GG668111.1| GENE 1 1 - 150 179 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625912|ref|ZP_06118847.1| ## NR: gi|266625912|ref|ZP_06118847.1| HflC protein [Clostridium hathewayi DSM 13479] HflC protein [Clostridium hathewayi DSM 13479] # 1 49 1 49 49 87 100.0 3e-16 MAEAYRDPQKAEFYSYTRSLEAARASLKGDGNTLILPADSPIARIFMGQ >gi|229783624|gb|GG668111.1| GENE 2 185 - 1031 1016 282 aa, chain + ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 1 279 1 285 296 245 44.0 5e-65 MKIKDILGQGKPTLSFEVFPPKTEDKYESVERAAAEIAKLKPAFMSVTYGAGGGTSRYTV DIASTLHHEYQVTALAHVTCVSSTREKVHEVLEELKAAGIENILALRGDIPKDGPVASDY RFASELIREIRAYGDFCIGAACYPEGHVESVSKTIDMDYLKQKVEAGCDFVTTQMFFDNN ILYNYLYRIREKGITVPVVAGIMPVTNVKQIIRSCQLSGTYLPSRFKAIVDRFGDNPAAM KQAGIAYATEQIIDLIANGVDAIHVYSMNKPDVAAKIKENLS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:19 2011 Seq name: gi|229783623|gb|GG668112.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld510, whole genome shotgun sequence Length of sequence - 802 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 708 520 ## Ethha_1905 hypothetical protein Predicted protein(s) >gi|229783623|gb|GG668112.1| GENE 1 1 - 708 520 235 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1905 NR:ns ## KEGG: Ethha_1905 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 2 234 582 816 817 249 58.0 5e-65 GGLFGVDREEWEKSPQFHEKVMERQEHQQEREQAFLSQNRNCFAIYQVSRDDPQNVRFMN LDWLESHDVSVDRSNYDLIYTAPLSESGTVPEQLEKLYQQFNLEKPVDFHSPSMSVSDIV AIKQDSKVSCHYCDSVGFTQIPGFLPENPLKNAEMAVEDDYGMIDGIINNGTKEPTVAEL EQQARSGQPISLMDLAAATHREEREKKKSVVNQLKSQPKAEHKKTAQKKSAEREI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:24 2011 Seq name: gi|229783622|gb|GG668113.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld511, whole genome shotgun sequence Length of sequence - 908 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 52 - 111 10.0 1 1 Tu 1 . 
+ CDS 144 - 906 727 ## COG2217 Cation transport ATPase Predicted protein(s) >gi|229783622|gb|GG668113.1| GENE 1 144 - 906 727 254 aa, chain + ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 8 244 78 309 818 194 46.0 2e-49 MKTEVYDIEGMSCAACSSAVERVTRKLEGVESSDVNLTTNKMTITYDETKVTPEMIMGKV EKAGFGACPLAEEKDSKAQEEENWGKQEQQRDTMKRRLIVAICFAVPLLYISMGHMLPFE LPLPHFLHMDSNPLNYALAQLILTVPVLIAGRKFYLVGLRSLFKGNPNMDSLVAIGTGSA FIYSLCMTVTIPSNPMNAHHLYYESAAVVVTLVMLGKYMESRSKGKTSEAIRKLMELAPD TAILYEDGTEREVE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:24 2011 Seq name: gi|229783621|gb|GG668114.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld513, whole genome shotgun sequence Length of sequence - 1002 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 1000 1096 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) Predicted protein(s) >gi|229783621|gb|GG668114.1| GENE 1 1 - 1000 1096 333 aa, chain - ## HITS:1 COG:CAC3442 KEGG:ns NR:ns ## COG: CAC3442 COG2176 # Protein_GI_number: 15896683 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Clostridium acetobutylicum # 1 333 508 841 1452 363 53.0 1e-100 VLVAHNASFDVSFINRNAEILGLPFEPTVLDTVTLARFLLPNLNRFKLDTVAKALNISLA NHHRAVDDAGCTAEIFVKFVQMLRERGVETLTELNSLKTMTPEMIKKLPTYHVIIIATND VGRVNLYKLISWSHIDYYARRPRIPKSLLNQNREGLIVGSACEAGELYQALLRGGSNAEI ARLVRFYDYLEIQPLGNNSFMLRDEKSTVNSEQDLIDINKKIVELGDQFGKPVCATCDVH FLDPEDEVYRRIIMAGQGFKDSDEQAPLYLRTTEEMLKEFEYLGPNKAEEVVIRNTRKIA DMCEKISPVRPDKCPPVIENSDGDLRQICYDKA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:25 2011 Seq name: gi|229783620|gb|GG668115.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld514, whole genome shotgun sequence Length of sequence - 1059 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 290 218 ## Mahau_2083 ADP-ribosylation/Crystallin J1 2 1 Op 2 . 
- CDS 302 - 1057 518 ## COG0395 ABC-type sugar transport system, permease component Predicted protein(s) >gi|229783620|gb|GG668115.1| GENE 1 2 - 290 218 96 aa, chain - ## HITS:1 COG:no KEGG:Mahau_2083 NR:ns ## KEGG: Mahau_2083 # Name: not_defined # Def: ADP-ribosylation/Crystallin J1 # Organism: M.australiensis # Pathway: not_defined # 6 96 5 95 711 104 48.0 1e-21 MSRMRKEYIEKIYAGWLAKIIGIRLGAPVEGWSSKKIRDVYGRLTGYAVKYNRFAADDDS NGPLFFIRALEDSGKKEKLCSEDVAEALLNYVPFEH >gi|229783620|gb|GG668115.1| GENE 2 302 - 1057 518 251 aa, chain - ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 1 251 51 300 300 159 35.0 5e-39 PILQVYLNSFRTDGDVKQKPFGLPAKWIFTNWAETWKVGGYTTAFLNSLFIAAVVIAVVL FLISMCAYALSKMKFKGKGFLTGYFFVAISLPGFLYIVPDYFMFNKIGLVNSRISLILIY IAMQIPFNMLLLRTYLADIPGELEEAARVDGCSDASVFFRIILPIAKPMLFTITILIFVN VWNEFLWANTFIATDALKPLATRFVKFVGEYSSNMARIYTASAITITPIIVVYLLFSRRF IEGMTSGSVKG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:28 2011 Seq name: gi|229783619|gb|GG668116.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld515, whole genome shotgun sequence Length of sequence - 950 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 56 - 493 247 ## COG2191 Formylmethanofuran dehydrogenase subunit E + Term 501 - 555 -0.3 2 2 Tu 1 . 
- CDS 562 - 948 396 ## COG1307 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|229783619|gb|GG668116.1| GENE 1 56 - 493 247 145 aa, chain + ## HITS:1 COG:MA4602 KEGG:ns NR:ns ## COG: MA4602 COG2191 # Protein_GI_number: 20093386 # Func_class: C Energy production and conversion # Function: Formylmethanofuran dehydrogenase subunit E # Organism: Methanosarcina acetivorans str.C2A # 1 142 46 215 217 129 42.0 2e-30 MELLRLNFSQDEEVVCIAENDACGVDAVQVVLGCSVGKGNLLFHIRGKQAFSFFNRKTGK SVRLVLKDRPAFASKEESFAYLQEREPEELFDVKPTVCRIPERARLFRSFPCDCCGEMTA EHMMRLQDGKKLCLDCWKEYDRFQV >gi|229783619|gb|GG668116.1| GENE 2 562 - 948 396 128 aa, chain - ## HITS:1 COG:CAC1624 KEGG:ns NR:ns ## COG: CAC1624 COG1307 # Protein_GI_number: 15894902 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 2 127 155 279 280 97 39.0 6e-21 CKIFYYMDTLTYLRKGGRIGLVTSVVGSMLNLKPIISCNEEGVYYTVAKIRGSRQGLSRL LAEVKAFAGDAPCLTALLNGDGKEAAEELRPKLIAGIPHGTLVMEKAITASLAVHTGPGL VGIGALKL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:29 2011 Seq name: gi|229783618|gb|GG668117.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld516, whole genome shotgun sequence Length of sequence - 971 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 85 - 144 1.5 1 1 Tu 1 . 
+ CDS 197 - 971 1034 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783618|gb|GG668117.1| GENE 1 197 - 971 1034 258 aa, chain + ## HITS:1 COG:TM0810 KEGG:ns NR:ns ## COG: TM0810 COG1653 # Protein_GI_number: 15643573 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 60 202 22 163 420 63 29.0 2e-10 MRKVKRALAAALAVLMAGSLSACGVNVQTKDDANTTAAVTEAKEETGAVDTPADEKVTLK IIDWSDSTKERREAFHKKFMEENPNVTIEYTVLTADQFKETVISAIKAGNAPDLFPLPSG MKLSAALKENWFMPMNDYVSDDFLNSFADGALNEGITTIDGKTYVLPESANIINTLVFYN KNVLKEAGIDESQLPKTRSEFLEVCKKVSDAGNGKFFGIIDSGAQANRLELALRSLASLD GGKCSDISQMILVDGQNT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:30 2011 Seq name: gi|229783617|gb|GG668118.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld517, whole genome shotgun sequence Length of sequence - 594 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + 5S_RRNA 82 - 198 94.0 # CP000885 [D:447851..447967] # 5S ribosomal RNA # Clostridium phytofermentans ISDg # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. + TRNA 207 - 283 88.9 # Ile GAT 0 0 + TRNA 379 - 451 80.2 # Ala TGC 0 0 Predicted protein(s) Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:30 2011 Seq name: gi|229783616|gb|GG668119.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld518, whole genome shotgun sequence Length of sequence - 702 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 531 577 ## gi|288871744|ref|ZP_06118859.2| sulfatase family protein Predicted protein(s) >gi|229783616|gb|GG668119.1| GENE 1 3 - 531 577 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288871744|ref|ZP_06118859.2| ## NR: gi|288871744|ref|ZP_06118859.2| sulfatase family protein [Clostridium hathewayi DSM 13479] sulfatase family protein [Clostridium hathewayi DSM 13479] # 1 176 1 176 177 341 100.0 1e-92 MKYLQDGGYYTAVIGKQHFWRSEIERGYDYEDIVDEHEPPAVISKELPEGAFGLPANKTV SDRVSSYVEFLADSDFTSGSQLYREINSKGIYEFTGEEKYHVDAYIGDRGRKWLEESCPG DRPWFLTLSFPGPHMPFDGIGLPDEKAYEDTELDLPETGLSDLFEKPPHYLDIARK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:41 2011 Seq name: gi|229783615|gb|GG668120.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld520, whole genome shotgun sequence Length of sequence - 902 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 210 170 ## Predicted protein(s) >gi|229783615|gb|GG668120.1| GENE 1 3 - 210 170 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVELSYKEKNGLLYPLLETGSMMEETESMSLGRYGKLAEEYLKVKDKSRYSLLLRTGKLQ ETLKQIEEE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:47 2011 Seq name: gi|229783614|gb|GG668121.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld521, whole genome shotgun sequence Length of sequence - 1265 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 + CDS 1 - 876 785 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 2 1 Op 2 . 
+ CDS 892 - 1264 524 ## COG0691 tmRNA-binding protein Predicted protein(s) >gi|229783614|gb|GG668121.1| GENE 1 1 - 876 785 291 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 1 270 419 687 730 306 52 4e-84 TKVVLDKNGKPLEIMPYERNAATKIIEDFMLIANETVAEDYFWQELPFLYRTHDNPDPEK MKSLATFINNFGYSIRFHNGEVYPKEVQKLLVKAEDTPEEALISRLALRSMKQAKYTVVN TGHFGLAAKYYTHFTSPIRRYPDLQIHRIIKENLRGGLSEKRTAHYDKILTGVSVQTSAM ERRADEAERETVKMKKCEYMAGHIGEEFDGVVSGVTNWGLYVELPNTVEGLIRVGELKDD YYRFDEEHYELVGEMTRKVYKLGQPIRVMVTGTDKLARTIDFIPARTFTEE >gi|229783614|gb|GG668121.1| GENE 2 892 - 1264 524 124 aa, chain + ## HITS:1 COG:BS_yvaI KEGG:ns NR:ns ## COG: BS_yvaI COG0691 # Protein_GI_number: 16080413 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Bacillus subtilis # 1 124 1 124 156 135 54.0 2e-32 MGKESFKLVANNKKAYHDYFIDEKYETGIELAGTEVKSIRMGKCSIKEAFVRIDKGEVWV YGMNINPYEKGNIFNKDPLRPRKLLMHRAEISKLDSKIAVKGYTIVPLQVYFKGSLVKLE IGLA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:48 2011 Seq name: gi|229783613|gb|GG668122.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld522, whole genome shotgun sequence Length of sequence - 1132 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 650 397 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 - Prom 744 - 803 6.7 - Term 771 - 809 1.2 2 2 Tu 1 . 
- CDS 818 - 1051 240 ## Closa_2213 preprotein translocase, SecG subunit Predicted protein(s) >gi|229783613|gb|GG668122.1| GENE 1 3 - 650 397 216 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 13 216 7 211 730 157 38 4e-39 MTEEQLTQRKSMILKLVEEPAYVPMKLKEMAILLDVPKGQREELKEVLDALLAEGKIGIS KKGKYGKPDIGAITGIFSGHPRGFGFVTVEGRDQDVFIPEDKTADAMNGDTVQIAVEAEG EGSKRAEGRVLRVLEHANRTVIGYYQKNKNFGFVIPDNQKIGKDIFIPQGKDMGAMTGHK VVAVLTDFGGSQKKPEGVVTEILGHVNDPGTDIISI >gi|229783613|gb|GG668122.1| GENE 2 818 - 1051 240 77 aa, chain - ## HITS:1 COG:no KEGG:Closa_2213 NR:ns ## KEGG: Closa_2213 # Name: not_defined # Def: preprotein translocase, SecG subunit # Organism: C.saccharolyticum # Pathway: Protein export [PATH:csh03060]; Bacterial secretion system [PATH:csh03070] # 1 77 1 77 77 110 90.0 2e-23 MIRTILSIIFVIICIALSAIILLQEGKTQGLGSIGGMADTYWGRNKGRSMEGKLEKFTRY GAILFFVLALVLNLNIL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:50 2011 Seq name: gi|229783612|gb|GG668123.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld523, whole genome shotgun sequence Length of sequence - 1210 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 1209 898 ## Ethha_1897 hypothetical protein Predicted protein(s) >gi|229783612|gb|GG668123.1| GENE 1 3 - 1209 898 402 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1897 NR:ns ## KEGG: Ethha_1897 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 18 401 1 384 794 620 77.0 1e-176 FGKTEKKTVQPVKTKKKLSRADKKQIEAAIARANRTDKKGKSAQDSIPYERMWPDGICRV SDSHYTKTIQFQDINYQLSQNEDKTAIFEGWCDFLNYFDSSIHFQLSFLNLAASEETFAN SISIPPQRDAFDSIREEYTTMLQNQLARGNNGLIKTKYLTFGIDADSIKAAKPRLERIET DILNNFKRLGVAARTLDGKERLSQLHAVFHMDEQLPFQFEWDWLAPSGLSTKDFIAPSSF EFRTGKQFRMGKKYGAVSFLQILAPELNDRLLADFLDMESSLIVSMHIQSVDQVKAIKTV KRKITDLDRSKIEEQKKAVRAGYDMDIIPSDLATYGSEAKKLLQDLQSRNERMFLLTFLV LNTADNPRQLGNNIFQAGSIAQKYNCQLTRLDFQQEEGLMSC Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:56 2011 Seq name: gi|229783611|gb|GG668124.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld524, whole genome shotgun sequence Length of sequence - 624 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 131 - 190 1.7 1 1 Tu 1 . 
+ CDS 263 - 623 337 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases Predicted protein(s) >gi|229783611|gb|GG668124.1| GENE 1 263 - 623 337 120 aa, chain + ## HITS:1 COG:SMc01214 KEGG:ns NR:ns ## COG: SMc01214 COG1063 # Protein_GI_number: 15965330 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Sinorhizobium meliloti # 2 119 176 296 347 71 35.0 3e-13 MVQLAKLAGAARVALLEPVEEKRRVGRRMGADDCIDPLTEDVKAELEDAGMNWIQTVIEC VGRPSTIEQAIEIAGKQAVVMMFGLIKPDETVAVKPFTVFQKELVLKASYINPYTQKRAL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:53:57 2011 Seq name: gi|229783610|gb|GG668125.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld525, whole genome shotgun sequence Length of sequence - 1129 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 33 - 92 2.8 1 1 Tu 1 . 
+ CDS 118 - 1065 259 ## Closa_1057 hypothetical protein Predicted protein(s) >gi|229783610|gb|GG668125.1| GENE 1 118 - 1065 259 315 aa, chain + ## HITS:1 COG:no KEGG:Closa_1057 NR:ns ## KEGG: Closa_1057 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 25 97 46 116 336 74 42.0 4e-12 MRRWKSTLLITGILMGFAAIPAYAAGWQWIDGNNDGVSECYYIDDEGKMLTGTTTPDGFT VNEQGAWVENGVVKTKEEKQIVSAYSSLNKRSMKTSQMDEFQYYIFSPENATENMPLIVY LHGHGLGDNIDDFKNEKYFAALREGGKRGSSAYILVPYLPPELDHGWKGMWPGIEPSIME LIESVADTYKIDCNKISIIGASMGADAAIQIVSSHPDTFSCMAGVVPFHYQCPIAKWEDN WGEQLKTVPVWLFVEDHEIALNMAESAVNAINAAGGQAWLDIQHGANHGDATKRVASCMN SGQYEIYDWLISVSK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:54:03 2011 Seq name: gi|229783609|gb|GG668126.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld526, whole genome shotgun sequence Length of sequence - 1095 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 455 361 ## gi|266625932|ref|ZP_06118867.1| hypothetical protein CLOSTHATH_07355 - Prom 572 - 631 3.5 - Term 850 - 883 5.4 2 2 Tu 1 . 
- CDS 922 - 1095 174 ## COG1075 Predicted acetyltransferases and hydrolases with the alpha/beta hydrolase fold Predicted protein(s) >gi|229783609|gb|GG668126.1| GENE 1 2 - 455 361 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625932|ref|ZP_06118867.1| ## NR: gi|266625932|ref|ZP_06118867.1| hypothetical protein CLOSTHATH_07355 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07355 [Clostridium hathewayi DSM 13479] # 1 151 1 151 151 254 100.0 2e-66 MVSRAKLLQLTDYEKTDILNVHGLKLRTMDQHMEAIMIKLMGITIESREEQRKREEEALY RYFRYGAKHRNKVGRLLEELIPGEKREHLLLYYLQIKDAMEACGSQNFDEAVTRMPSKSR IITVNKTVHQYYKAVMEADAGMDEDLVLPSA >gi|229783609|gb|GG668126.1| GENE 2 922 - 1095 174 57 aa, chain - ## HITS:1 COG:CAC1028 KEGG:ns NR:ns ## COG: CAC1028 COG1075 # Protein_GI_number: 15894315 # Func_class: R General function prediction only # Function: Predicted acetyltransferases and hydrolases with the alpha/beta hydrolase fold # Organism: Clostridium acetobutylicum # 1 57 423 479 479 84 64.0 6e-17 VSTESARWGEFRTVFASSGHRGISHGDMIDLKREDYKGFDVVECYVQIVSELAGKGF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:54:12 2011 Seq name: gi|229783608|gb|GG668127.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld527, whole genome shotgun sequence Length of sequence - 889 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 496 598 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase 2 1 Op 2 . 
+ CDS 539 - 889 308 ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase Predicted protein(s) >gi|229783608|gb|GG668127.1| GENE 1 2 - 496 598 164 aa, chain + ## HITS:1 COG:TM0446 KEGG:ns NR:ns ## COG: TM0446 COG0041 # Protein_GI_number: 15643212 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Thermotoga maritima # 1 161 9 169 171 204 63.0 8e-53 GSDSDMPVTAQAADVLEELGVDFEITVISAHREPDIFFEYAKTAEARGVKVMIAGAGKAA HLPGMCAALFPMPVIGIPMKTSDLGGVDSLYSIVQMPSGIPVATVAINGGTNAGILAAKI LAASDEALLGKLKEYSENLKNDVVKKAEKLDQIGYKEYLAQMKK >gi|229783608|gb|GG668127.1| GENE 2 539 - 889 308 117 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 1 108 12 117 356 123 57 6e-29 MDYKNAGVDIEAGYKSVELMKEHVKGTMRPEVLGGIGGFSGAFSMSAFKEMEKPTLVSGT DGVGTKLKLAFLMDQHNTVGIDCVAMCVNDIACAGGEPLFFLDYIACGKNSNPFGSN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:54:12 2011 Seq name: gi|229783607|gb|GG668128.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld528, whole genome shotgun sequence Length of sequence - 537 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 64 - 180 124 ## 2 1 Op 2 . 
+ CDS 186 - 537 131 ## gi|288871746|ref|ZP_06118871.2| hypothetical protein CLOSTHATH_07360 Predicted protein(s) >gi|229783607|gb|GG668128.1| GENE 1 64 - 180 124 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVRHRDMMCALCLENFREEAFLGRESKLMRKKSTVNSR >gi|229783607|gb|GG668128.1| GENE 2 186 - 537 131 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871746|ref|ZP_06118871.2| ## NR: gi|288871746|ref|ZP_06118871.2| hypothetical protein CLOSTHATH_07360 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07360 [Clostridium hathewayi DSM 13479] # 1 117 1 117 117 202 100.0 9e-51 MELVFIFTVILIQMFFLNPGEVMQIEGTLWIDTFVDAEEPPVLFRDKGVSTVRAHETDWG GNNLPSDKRLSTDLALVLSVAAIIIIEIVVGSATEGTDGILGNGFPVAPLDRPDGFT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:54:24 2011 Seq name: gi|229783606|gb|GG668129.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld529, whole genome shotgun sequence Length of sequence - 513 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 142 - 511 92 ## gi|266625937|ref|ZP_06118872.1| conserved hypothetical protein Predicted protein(s) >gi|229783606|gb|GG668129.1| GENE 1 142 - 511 92 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625937|ref|ZP_06118872.1| ## NR: gi|266625937|ref|ZP_06118872.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 123 48 170 171 228 99.0 1e-58 MEFFNNEDDNIITIPNASSSDKFEEGQIYVLHNEASSEGDIAVKVIDVKETTNGKQVYYE EPELEEVIDSLHISGTETSESEFIPAEGVTVTQAGRMLRSSGTVKFNCFNGLNLKATVGP VSY Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:54:32 2011 Seq name: gi|229783605|gb|GG668130.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld531, whole genome shotgun sequence Length of sequence - 1001 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 13 - 918 785 ## COG1175 ABC-type sugar transport systems, permease components Predicted protein(s) >gi|229783605|gb|GG668130.1| GENE 1 13 - 918 785 301 aa, chain + ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 12 296 4 289 292 166 34.0 7e-41 MEKDTAAVRRKKKNVVPYLFVLGAFMVHLCLVTIPALSTLVMSFFDWNGLGDPKFVAFHN FKEIFTQDPVVITAVVNNLKWTAIFLTVPIVLGFTVAVMVSRVKRFQMALRTVYFIPYVL SAAVIGKVFTAYYNPFYGINQFFGQIGLTGLAETLWLGNPKIALYSVAFVDLWHWWGFVM IMFLGALQQVDPSLYESARVDGVNVFQEIWYITVPAIKRTIAFVLIMTVMWSFLTFDYVY VMTNGGPANSTEILATWIYKNAFTKYRAGYASALCVIQCAICFVFYLLQSKASKMGGLDE S Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:54:33 2011 Seq name: gi|229783604|gb|GG668131.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld532, whole genome shotgun sequence Length of sequence - 922 bp Number of predicted genes - 1, with 
homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 496 357 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases Predicted protein(s) >gi|229783604|gb|GG668131.1| GENE 1 1 - 496 357 165 aa, chain - ## HITS:1 COG:MA3541 KEGG:ns NR:ns ## COG: MA3541 COG0454 # Protein_GI_number: 20092348 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Methanosarcina acetivorans str.C2A # 6 156 10 163 321 95 33.0 3e-20 MEIIDFKREYIEEASLLAMANYNEERTKVTCLPEVSRIPGLQSFADNGLGVAAFENRKMT GFLCCCSPFEHAFHTEAVGTFSPVHAHGAVKENRAFLYRRMYQKAAEKWVEHKIVSHAVS LYAHDREGLEAFFTCGFGLRCVDAIRPMEPLSFTAPLPLTLSSGI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:54:34 2011 Seq name: gi|229783603|gb|GG668132.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld533, whole genome shotgun sequence Length of sequence - 1206 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1206 670 ## gi|266625940|ref|ZP_06118875.1| hypothetical protein CLOSTHATH_07365 Predicted protein(s) >gi|229783603|gb|GG668132.1| GENE 1 3 - 1206 670 401 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625940|ref|ZP_06118875.1| ## NR: gi|266625940|ref|ZP_06118875.1| hypothetical protein CLOSTHATH_07365 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07365 [Clostridium hathewayi DSM 13479] # 2 389 2 389 402 657 100.0 0 GSLDLSLLTTNPPRDPAKRAKSAEDLVRLYEVVVYDQTRETKLKENPEYRADRYSDPEKH AQYVHLEKGLKMVAPYFYQVVFANSGYDQALKVHDVKMMRASVLPECTASMIVETKRFYD RAAHMEGKKTDGVLPFVAVSEMENLLDVRKNPERNAAGLSAQLRTLYISEFQTDKKEWEK GGITQLSQIFFIDGKTAEEYCGNRFSNLPEQEREKRIEAEIMTAIASGEHHVDIATMSPT KEGTLDIQMYSIKADLHALDKKQREKEGMFASLISKKADKTWNNDKDRASRLEKEQQNLS ERMVSTIQRKIQEKDELARQSAGAEVKQAPQAPAAGEEVKQAPQAPAAGAEVKQVPQQSS GGKTAVKIASTLEEMCSIQKNADQAQALQKAEKAPAAPEAA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:54:51 2011 Seq name: gi|229783602|gb|GG668133.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld534, whole genome shotgun sequence Length of sequence - 502 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 191 274 ## Closa_2339 binding-protein-dependent transport systems inner membrane component 2 1 Op 2 . 
- CDS 181 - 501 286 ## Closa_2340 ABC transporter Predicted protein(s) >gi|229783602|gb|GG668133.1| GENE 1 2 - 191 274 63 aa, chain - ## HITS:1 COG:no KEGG:Closa_2339 NR:ns ## KEGG: Closa_2339 # Name: not_defined # Def: binding-protein-dependent transport systems inner membrane component # Organism: C.saccharolyticum # Pathway: ABC transporters [PATH:csh02010] # 1 62 1 62 221 87 72.0 2e-16 MTFDAATFDMLKAGIWETLFMTFTSSFFAYLIGIPLGIILIVTDKDGIRPVPWLQKIQTL SVP >gi|229783602|gb|GG668133.1| GENE 2 181 - 501 286 106 aa, chain - ## HITS:1 COG:no KEGG:Closa_2340 NR:ns ## KEGG: Closa_2340 # Name: not_defined # Def: ABC transporter # Organism: C.saccharolyticum # Pathway: ABC transporters [PATH:csh02010] # 1 106 235 340 340 176 81.0 2e-43 FSEPKSKIGRQLILGDAAGQAAKFGKSRRVRISFDGRSSFEPVLSNMILACKVPVNIIHA ETRDISGTAMGQMVIQLPEEEVDANRVVNYLKTAKVPFEEVTDYDI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:54:56 2011 Seq name: gi|229783601|gb|GG668134.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld535, whole genome shotgun sequence Length of sequence - 829 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 26 - 827 789 ## EUBREC_0516 response regulator PleD Predicted protein(s) >gi|229783601|gb|GG668134.1| GENE 1 26 - 827 789 267 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0516 NR:ns ## KEGG: EUBREC_0516 # Name: not_defined # Def: response regulator PleD # Organism: E.rectale # Pathway: not_defined # 4 233 81 309 476 159 36.0 1e-37 MQKQPEVADDLEEMGKIASLLRVDELHLFDKKGRLYAGSEPKYYDYTFQSGEQMRYFLPM LDDYTLQLCQDVTPNTAEGKMMQYVAVWREDHEGIVQIGMEPARLMVAMEKNELSYVFSM MTTESGVTIFAVDRETGTIVGSTGDIPAGTAAEALGLHLPEDMETGGPYERELDISGTKN YCLVELTDGIVVGVSATYGKLYENVPNNMMLILFSLCVLSFVTVFLIVKMLDKFILSDIY GTIEGTKKIAAGNLDYRLEIDHTPEFK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:00 2011 Seq name: gi|229783600|gb|GG668135.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld536, whole genome shotgun sequence Length of sequence - 666 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:01 2011 Seq name: gi|229783599|gb|GG668136.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld537, whole genome shotgun sequence Length of sequence - 849 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 848 889 ## COG3451 Type IV secretory pathway, VirB4 components Predicted protein(s) >gi|229783599|gb|GG668136.1| GENE 1 2 - 848 889 282 aa, chain - ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 44 224 132 314 617 74 30.0 2e-13 KITDLDKSKIEEQKKAVRAGYDIDIIPSDLATYGVEAKKLLQDLQSRNERMFLVTFLVLN TADNPRQLDNNVFQASSIAQKYNCRLTRLDFQQEEGLMSCLPLGLNQIEIQRGLTTSSTA IFVPFTTQELFQNGKEALYYGINALSNNLIMVDRKLLKNPNGLILGTPGSGKSFSAKREI ANCFLLTSDEVEILDPEAEYAPLVERLHGQVIKISPTSTNYINPMDLNLDYSDDESPLSL KSDFILSLCELIVGGKEGLQPVQKTIIDRCVRLVYQDYLNDP Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:02 2011 Seq name: gi|229783598|gb|GG668137.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld538, whole genome shotgun sequence Length of sequence - 500 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 31 - 429 402 ## ECSP_1541 putative DNA packaging protein of prophage CP-933X Predicted protein(s) >gi|229783598|gb|GG668137.1| GENE 1 31 - 429 402 132 aa, chain - ## HITS:1 COG:no KEGG:ECSP_1541 NR:ns ## KEGG: ECSP_1541 # Name: not_defined # Def: putative DNA packaging protein of prophage CP-933X # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 132 14 145 145 212 98.0 5e-54 MTKDELIARLRSLGEQLNRDVSLTGTKEELALRVAELKEELDDTDETAGQDTPLSRENVL TGHENEVGSAQPDTVILDTSELVTVVALVKLHTDALHATRDEPVAFVLPGTAFRVSAGVA AEMTERGLARMQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:05 2011 Seq name: gi|229783597|gb|GG668138.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld539, whole genome shotgun sequence Length of sequence - 582 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 64 - 123 1.9 1 1 Tu 1 . + CDS 168 - 311 126 ## gi|288871750|ref|ZP_06118882.2| conserved hypothetical protein Predicted protein(s) >gi|229783597|gb|GG668138.1| GENE 1 168 - 311 126 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871750|ref|ZP_06118882.2| ## NR: gi|288871750|ref|ZP_06118882.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 47 1 47 47 71 100.0 2e-11 MGFSYHLVKYTRQAIEDAKKATEEVKKQMTLGKEHYTGDQQVISRLE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:10 2011 Seq name: gi|229783596|gb|GG668139.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld540, whole genome shotgun sequence Length of sequence - 750 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 553 444 ## gi|266625948|ref|ZP_06118883.1| conserved hypothetical protein 2 1 Op 2 . 
- CDS 581 - 748 229 ## gi|266625949|ref|ZP_06118884.1| anti-anti-sigma factor family protein Predicted protein(s) >gi|229783596|gb|GG668139.1| GENE 1 1 - 553 444 184 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625948|ref|ZP_06118883.1| ## NR: gi|266625948|ref|ZP_06118883.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 184 1 184 184 321 100.0 2e-86 MKKITVQMTAVICALSLTGCSSLFGMATPPPPFPDDFMIVDGSGVDLSDPASIQAAYDRH IQSLMQADYVGNPAFSSQISEIVSAKPEWSRNFQNFQIVDYDMDNDILVYAYQTTLLAQD GEDYSAPAVEVAPVDYSIDTKVKKENEKLIERDSKVKEEYGGAVSDDKKKLISAVGTYRW KDDD >gi|229783596|gb|GG668139.1| GENE 2 581 - 748 229 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625949|ref|ZP_06118884.1| ## NR: gi|266625949|ref|ZP_06118884.1| anti-anti-sigma factor family protein [Clostridium hathewayi DSM 13479] anti-anti-sigma factor family protein [Clostridium hathewayi DSM 13479] # 1 55 1 55 55 105 100.0 9e-22 SYICSAALRVFLKTQKEINKLPGSSMEILHCNQGVKDIFEITGFSGILTIKEDEK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:25 2011 Seq name: gi|229783595|gb|GG668140.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld541, whole genome shotgun sequence Length of sequence - 1045 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 32 - 91 2.5 1 1 Op 1 . + CDS 161 - 853 655 ## gi|309776206|ref|ZP_07671197.1| putative TraG/TraD family protein 2 1 Op 2 . 
+ CDS 834 - 1044 295 ## COG0550 Topoisomerase IA Predicted protein(s) >gi|229783595|gb|GG668140.1| GENE 1 161 - 853 655 230 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|309776206|ref|ZP_07671197.1| ## NR: gi|309776206|ref|ZP_07671197.1| putative TraG/TraD family protein [Erysipelotrichaceae bacterium 3_1_53] putative TraG/TraD family protein [Erysipelotrichaceae bacterium 3_1_53] # 1 230 1 230 230 415 93.0 1e-114 MSVLIIFIKLIAVFDLLVIAVYFGGDILSTMLNKLQRKPKQDDPFDFSLDTMTVSLSIDS IRQLYSGDAADFVEDKILPNAEKTMAVLTDQEQAAARLLLSALIGFLAAEAPMDEQSFPM VMELLNCMEGEKEDGCQDAVESLLEDAVRNTHRHEEYYSNYQRYQLMQVDKTRVILACRI IINDLLGKLYRYDYRFGYNLLLDEENSIEKKLHTPVREEWEVEEYEAGDC >gi|229783595|gb|GG668140.1| GENE 2 834 - 1044 295 70 aa, chain + ## HITS:1 COG:topB KEGG:ns NR:ns ## COG: topB COG0550 # Protein_GI_number: 16129717 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Escherichia coli K12 # 1 69 1 71 653 79 53.0 1e-15 MKLVIAEKPSVAQSLAAVIGATARKDGYLEGNGWRVSWCVGHLAGLADADSYDPKYAKWR YDDLPILPEH Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:37 2011 Seq name: gi|229783594|gb|GG668141.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld542, whole genome shotgun sequence Length of sequence - 850 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 83 63 ## 2 1 Op 2 . 
+ CDS 94 - 849 261 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein Predicted protein(s) >gi|229783594|gb|GG668141.1| GENE 1 3 - 83 63 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no AAEIARRSGISYEELCEVLKMLYDGE >gi|229783594|gb|GG668141.1| GENE 2 94 - 849 261 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 245 1 253 311 105 31 1e-23 MEYAVRIKNLSKQYPDFSLNNISLDIPMGSIMGLIGENGSGKTTTIKAMLDIVKRDAGDV EILGMDIREREQEVKEEIGVVFGESQFHDFLSAVQISGIMKRVYKNWDEELFSSYLSRFA LPEKKKVKEYSRGMAMKLAIAAALSHHPKLLILDEATSGLDPVARDEILDLFFGFIEDGE HSVLIASHITSDLDKVADYVTMIHNGSILFSEEKDQLLEDMGILRCGEEELAGLTDVHVV GVRKNPYGVEAL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:42 2011 Seq name: gi|229783593|gb|GG668142.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld543, whole genome shotgun sequence Length of sequence - 810 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 361 386 ## COG1420 Transcriptional regulator of heat shock gene 2 1 Op 2 . 
+ CDS 373 - 808 471 ## Closa_0881 GrpE protein Predicted protein(s) >gi|229783593|gb|GG668142.1| GENE 1 2 - 361 386 119 aa, chain + ## HITS:1 COG:BH1344 KEGG:ns NR:ns ## COG: BH1344 COG1420 # Protein_GI_number: 15613907 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: Bacillus halodurans # 3 114 230 336 343 72 33.0 1e-13 DLQIYTSGATNIFKYPELSDGEMASRLIGTLEQKELLQELVDDVNSSESSSGIQVYIGEE APVQTMRDCSIVTANYELGEGLRGTIGIIGPKRMDYEKVLNTLRNLMTQLDSILKKDER >gi|229783593|gb|GG668142.1| GENE 2 373 - 808 471 145 aa, chain + ## HITS:1 COG:no KEGG:Closa_0881 NR:ns ## KEGG: Closa_0881 # Name: not_defined # Def: GrpE protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 143 1 137 216 94 63.0 1e-18 MSSEETRIEEELAEEAMKAEDTGSVNEEIEKGETEAAETEACAEDAADEKSGDEEPAEEK QAEPEKKGFFGKKKEKKDPKDAKIEELTDRLQRNMAEFDNYRKRTEKEKSAMFEIGARDI IEKILPVVDNFERGLAAVPEEDKGT Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:46 2011 Seq name: gi|229783592|gb|GG668143.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld544, whole genome shotgun sequence Length of sequence - 874 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 871 987 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases Predicted protein(s) >gi|229783592|gb|GG668143.1| GENE 1 1 - 871 987 290 aa, chain - ## HITS:1 COG:CAC1279 KEGG:ns NR:ns ## COG: CAC1279 COG0635 # Protein_GI_number: 15894561 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 27 274 54 282 374 208 43.0 1e-53 MRREYTDKLLEEIRIQSSFVREYQIDTIFLGGGTPSILDAADITAIMGALKEHYDIAPDA EVTIEVNPGTVKMEGLAAYREAGINRVSMGLQSADDTELRYLGRIHTYDEFLKSFQRVRM AGFTNVNVDLISAIPGQTPESWRNTLKKTAMLKPEHISAYSLIVEEGTPYYDRYGGHVEM EGYEMSPEERRILMALPDLPDEDTEREMYYMTRNCLGEQGYERYEISNYARPGFECRHNV GYWTGTGYLGLGLGASSYLEGCRFHNTSDFQSYVSAHFDDKAEFNQTLRQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:46 2011 Seq name: gi|229783591|gb|GG668144.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld545, whole genome shotgun sequence Length of sequence - 616 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 615 434 ## gi|266625956|ref|ZP_06118891.1| conserved hypothetical protein Predicted protein(s) >gi|229783591|gb|GG668144.1| GENE 1 3 - 615 434 204 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625956|ref|ZP_06118891.1| ## NR: gi|266625956|ref|ZP_06118891.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 204 1 204 205 410 100.0 1e-113 DLIGYHSDLIVGLSQIQHGLYEIPGWNEGFEFSLISTALDNLNAQLSDIGTVMHENQQLV RERLLSGILYNYVDITSLPPEYEEHGLLFPFPYYAVILISLPSLDQMEDYTRREQLKLVI RTNTTNAFSALGTAYSLYIDNKSICIILNTGLSDTLSKELSRLCTALKKRMKQTLSVYPL FSIGICSETEPAPWQVWQLARHNF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:55:57 2011 Seq name: gi|229783590|gb|GG668145.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld546, whole genome shotgun sequence Length of sequence - 999 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 189 262 ## Closa_1314 selenium metabolism protein YedF - Term 255 - 295 6.1 2 2 Tu 1 . - CDS 311 - 646 577 ## Closa_1312 hypothetical protein - Prom 677 - 736 2.8 3 3 Tu 1 . 
- CDS 794 - 997 221 ## COG2109 ATP:corrinoid adenosyltransferase Predicted protein(s) >gi|229783590|gb|GG668145.1| GENE 1 3 - 189 262 62 aa, chain - ## HITS:1 COG:no KEGG:Closa_1314 NR:ns ## KEGG: Closa_1314 # Name: not_defined # Def: selenium metabolism protein YedF # Organism: C.saccharolyticum # Pathway: not_defined # 1 62 1 62 204 85 72.0 7e-16 MKKIIDAKGLPCPQPVIKAKAVLKEMTEGTVEVLVDNEIAVQNLMKLGGYFGLKPVSEKV SE >gi|229783590|gb|GG668145.1| GENE 2 311 - 646 577 111 aa, chain - ## HITS:1 COG:no KEGG:Closa_1312 NR:ns ## KEGG: Closa_1312 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 109 1 109 114 122 88.0 3e-27 MAQDPNLDQENMEEQEEMTVTLTLDDGSEIECVVLTIFTAGERDYIALLPMEGAEAEEGE VYLYRYSESEDGQPNLENIEDDDEYEIVADAFDELLDDAEYDELVGEDETE >gi|229783590|gb|GG668145.1| GENE 3 794 - 997 221 67 aa, chain - ## HITS:1 COG:PAB2289 KEGG:ns NR:ns ## COG: PAB2289 COG2109 # Protein_GI_number: 14520300 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Pyrococcus abyssi # 7 67 115 175 175 65 49.0 2e-11 MAVMNCGLVQEETVRAFITGRPSGLEVVMTGREPSEELISLADYVSEIRKVKHPYDQGIS AREGIEY Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:02 2011 Seq name: gi|229783589|gb|GG668146.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld547, whole genome shotgun sequence Length of sequence - 559 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 375 153 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 2 1 Op 2 . 
- CDS 400 - 558 191 ## Closa_3045 binding-protein-dependent transport systems inner membrane component Predicted protein(s) >gi|229783589|gb|GG668146.1| GENE 1 1 - 375 153 125 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 4 124 12 133 312 63 29 3e-11 MAFMTLKDIDVCYDKKKQVLKGLNLEVEEGELVSLLGPSGCGKTTTLRVVAGFIEPQSGI FSLDGADLTKVPVHKRNFGIVFQSYALFPHLSVYDNVAFGLKIRKLDKAEMDRKVKDILE VCGLT >gi|229783589|gb|GG668146.1| GENE 2 400 - 558 191 52 aa, chain - ## HITS:1 COG:no KEGG:Closa_3045 NR:ns ## KEGG: Closa_3045 # Name: not_defined # Def: binding-protein-dependent transport systems inner membrane component # Organism: C.saccharolyticum # Pathway: not_defined # 1 52 209 260 260 81 92.0 1e-14 GPGVSTFPATLMNYIEYNYDPTVSAVSVLLMAATVMIMVIVDKTLGIAALTK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:04 2011 Seq name: gi|229783588|gb|GG668147.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld548, whole genome shotgun sequence Length of sequence - 1056 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 18 - 623 669 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 798 - 851 11.5 + Prom 694 - 753 4.6 2 2 Tu 1 . 
+ CDS 879 - 1056 174 ## gi|266625963|ref|ZP_06118898.1| putative lipoprotein Predicted protein(s) >gi|229783588|gb|GG668147.1| GENE 1 18 - 623 669 201 aa, chain + ## HITS:1 COG:BH1910 KEGG:ns NR:ns ## COG: BH1910 COG4753 # Protein_GI_number: 15614473 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 80 199 381 499 506 74 33.0 1e-13 MLYHKLLEVTETTGGIAAEIIKGEELRFQGQTLKVRATLEKQLELVLEYMGKQAPWIPVL DKMDVCIGERVLTDVERVLQELRYLIDRYHLEKPDSMMVRITTIISESVLQEHIQDIVAE EMGFSKDYLAKQFRGKIGITMSEYCMRVKMEYAKRLLKDTNKKVYEIGEELGYTTVDYFT RLFKNYTGCTPAYYRKYGDVV >gi|229783588|gb|GG668147.1| GENE 2 879 - 1056 174 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625963|ref|ZP_06118898.1| ## NR: gi|266625963|ref|ZP_06118898.1| putative lipoprotein [Clostridium hathewayi DSM 13479] putative lipoprotein [Clostridium hathewayi DSM 13479] # 1 59 1 59 59 62 100.0 9e-09 MRKTLWKKLGALAVTAVTAVSLAGCSGGAMSEIGSQTTAADSAAESSQAEPQAAEAESD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:09 2011 Seq name: gi|229783587|gb|GG668148.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld549, whole genome shotgun sequence Length of sequence - 860 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 14 - 673 846 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 750 - 783 5.2 Predicted protein(s) >gi|229783587|gb|GG668148.1| GENE 1 14 - 673 846 219 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 32 218 528 744 744 159 37.0 6e-39 MTVKTTDSSYNFSNNMDREGSYTFRVRAIASYNSRAGEWSDYSEDYYIDEDEVWYYGGNG AWQQNARGWWYAYSGGGYPRSCWKLIDNAWYYFDGDGYMTTGWRRVDGNYYYLGPDGKMR TGWQQIDGAWYYLDGSGVRTTDWQYVNGRWYYMNSSGVMQIGWQRINGDWYYLDGSGAMA TGWQYINGAWYYLDGSGVMYAGRWTPDGRYVDYNGVWVQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:10 2011 Seq name: gi|229783586|gb|GG668149.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld550, whole genome shotgun sequence Length of sequence - 659 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 525 348 ## COG1228 Imidazolonepropionase and related amidohydrolases - Prom 591 - 650 2.4 Predicted protein(s) >gi|229783586|gb|GG668149.1| GENE 1 3 - 525 348 174 aa, chain - ## HITS:1 COG:SMa1677 KEGG:ns NR:ns ## COG: SMa1677 COG1228 # Protein_GI_number: 16263373 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Sinorhizobium meliloti # 2 173 291 459 480 107 36.0 9e-24 MAHLYTCESMQRAVRAGVMSLEHTQLMDDETARMIRDNGVWVCPCPAFAEDSMMDFISTE DMQKKYRIVKQGVEQQTELIDKYHLNVVFGTDMATNKYFCEEHQLKDFGTWGRRYGSFKT LQAATGRAYDLFKLSTYRNPYPEGKIGVLEEGSFADLLIVDGNPVRQVDLLTDK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:11 2011 Seq name: gi|229783585|gb|GG668150.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld551, whole genome shotgun sequence Length of sequence - 720 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 332 - 556 128 ## - Prom 660 - 719 7.3 Predicted protein(s) >gi|229783585|gb|GG668150.1| GENE 1 332 - 556 128 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVTTDENGNCNIPFGYTFADTDICVIPMVAYSYEQCSASVRGVNTTGFSVSVNVAGQHWT NKEITLIWVAIGNK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:17 2011 Seq name: gi|229783584|gb|GG668151.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld552, whole genome shotgun sequence Length of sequence - 562 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 560 673 ## COG1175 ABC-type sugar transport systems, permease components Predicted protein(s) >gi|229783584|gb|GG668151.1| GENE 1 2 - 560 673 186 aa, chain + ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 3 186 16 195 292 133 38.0 2e-31 YSFIAPNFIGFAVFTLGPIIFAFVLAFMKWDGNSPMEFAGIKNFVQMVGNARFRASFVNT IVYCLATVPFTLACALGLAVLLNQKVKGRNFFRTVSFFPYVASLVAVAAVWNMLFSPQKS GPVNMILYQLGVSAKSLPKWAADPHWVMFTIVLFSVWKNMGYYMVIYLAGLQGINAELYE AASLDG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:18 2011 Seq name: gi|229783583|gb|GG668152.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld553, whole genome shotgun sequence Length of sequence - 780 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 434 112 ## DSY2531 hypothetical protein 2 1 Op 2 . 
- CDS 431 - 778 107 ## gi|240145800|ref|ZP_04744401.1| hypothetical protein ROSINTL182_07720 Predicted protein(s) >gi|229783583|gb|GG668152.1| GENE 1 2 - 434 112 144 aa, chain - ## HITS:1 COG:no KEGG:DSY2531 NR:ns ## KEGG: DSY2531 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 20 144 29 161 187 93 41.0 3e-18 MKNKLFILLLAFVLLLSACSNNSPPDDNTKEVLPELESAGSIESRGEKFIENFPQDYPSF ELLEYVFGSAENAPIQLVAIARNKETGSSATLFVLDENGVGQVVLASGYSATYCKEDGLQ LDKNVISISLNLEISSTNSEIHDF >gi|229783583|gb|GG668152.1| GENE 2 431 - 778 107 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|240145800|ref|ZP_04744401.1| ## NR: gi|240145800|ref|ZP_04744401.1| hypothetical protein ROSINTL182_07720 [Roseburia intestinalis L1-82] hypothetical protein ROSINTL182_07720 [Roseburia intestinalis L1-82] hypothetical protein RO1_33370 [Roseburia intestinalis XB6B4] # 1 115 101 215 215 196 100.0 6e-49 NVVFAKMQAEVGVEVTGSRSWTKGTSAGVSYSIAPGKFEVLNVYIPAVRTAGRLKYKVYM DGYPENVFYEYKTLTESYAPQKNSVHYKVTTSSYSAIKVPKGMTVYTPTGVYKAQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:28 2011 Seq name: gi|229783582|gb|GG668153.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld554, whole genome shotgun sequence Length of sequence - 809 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 809 992 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase Predicted protein(s) >gi|229783582|gb|GG668153.1| GENE 1 2 - 809 992 269 aa, chain - ## HITS:1 COG:ECs5215 KEGG:ns NR:ns ## COG: ECs5215 COG1328 # Protein_GI_number: 15834469 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Escherichia coli O157:H7 # 2 269 415 684 712 242 45.0 5e-64 AQIGILAKGDEEVFWKLLDERLTICFEALMCRHKALEGTLSDVSPIHWQYGAIARLPKGA KIDPLLHDGYSTISLGYIGIYEMTKLMKGVSHTDPAGEEFALRVMNRMKEAVECWKEETG LGFGLYGTPAESLCYRFARIDRERFGLIEDVTDKGYYTNSYHVDVREKIDAFSKFTFESQ FQKISSGGAISYVEVPNMAKNLEAMEEVVKFIYDNIQYAEFNTKSDYCQECGFDGEIIIN DDMEWECPQCHNKDKDKMNVTRRTCGYLG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:29 2011 Seq name: gi|229783581|gb|GG668154.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld555, whole genome shotgun sequence Length of sequence - 899 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 78 - 818 852 ## COG3250 Beta-galactosidase/beta-glucuronidase 2 1 Op 2 . 
+ CDS 778 - 898 182 ## gi|266624885|ref|ZP_06117820.1| glucuronyl hydrolase Predicted protein(s) >gi|229783581|gb|GG668154.1| GENE 1 78 - 818 852 246 aa, chain + ## HITS:1 COG:TM1062 KEGG:ns NR:ns ## COG: TM1062 COG3250 # Protein_GI_number: 15643820 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 9 231 351 563 563 84 29.0 2e-16 MYWWIDFANPDTWKQAKSQLGEMMKRDANRASVIIWSVGNENPDTDERYRFMADLAEYAR KSDPSRLISAACLVDTVNLRIADRLEAHLDVIGLNEYYGWYDPDYSKLGTILEQSAPEKP VIISEFGADGAFAAEEGGEEERAGSGVTAGSSAGCVRGSVKEQMEIYENQIAMFRKIPYI QGTTPWILFDFRTPKRLGKYQKGYNIKGLMTEDRSRKKPAFYLMQNYYKEVLEYERGDHQ ASEGKE >gi|229783581|gb|GG668154.1| GENE 2 778 - 898 182 40 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266624885|ref|ZP_06117820.1| ## NR: gi|266624885|ref|ZP_06117820.1| glucuronyl hydrolase [Clostridium hathewayi DSM 13479] glucuronyl hydrolase [Clostridium hathewayi DSM 13479] # 5 40 1 36 389 75 100.0 9e-13 MKEETIRRLKEKNELSKEACRQAMDFAAGQVKENLKEFTH Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:34 2011 Seq name: gi|229783580|gb|GG668155.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld556, whole genome shotgun sequence Length of sequence - 934 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 291 410 ## COG2606 Uncharacterized conserved protein 2 1 Op 2 . 
- CDS 315 - 809 577 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 857 - 916 5.8 Predicted protein(s) >gi|229783580|gb|GG668155.1| GENE 1 3 - 291 410 96 aa, chain - ## HITS:1 COG:lin0783 KEGG:ns NR:ns ## COG: lin0783 COG2606 # Protein_GI_number: 16799857 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 3 96 4 97 158 87 48.0 4e-18 MVKTNAMRMLDKAKIEYRTKEYEVDESDLSGSHAADMMGVDHGSVFKTLVLKGEKTGYLV CCIPVDAELDLKKTARAAGDKKVEMIPMKDLLAVTG >gi|229783580|gb|GG668155.1| GENE 2 315 - 809 577 164 aa, chain - ## HITS:1 COG:lin2807 KEGG:ns NR:ns ## COG: lin2807 COG0454 # Protein_GI_number: 16801868 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Listeria innocua # 6 154 9 157 159 169 55.0 2e-42 MTIQFAAEEDTGLILRFIKELADYEGMLSEVEATEPLLREWIFEKKKAEVLIGLEGGEPV GFALFFHNFSTFLGRAGIYLEDLYVRPEYRGRGCGTAFLKRLAKIAVERGCGRLEWWCLD WNKPSIEFYRSMGAVPMYDWTVYRIAGDQLKMLSERDESPGYGK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:35 2011 Seq name: gi|229783579|gb|GG668156.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld557, whole genome shotgun sequence Length of sequence - 1238 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 396 268 ## PRU_2733 hypothetical protein 2 1 Op 2 . 
- CDS 423 - 1193 483 ## COG3507 Beta-xylosidase Predicted protein(s) >gi|229783579|gb|GG668156.1| GENE 1 3 - 396 268 131 aa, chain - ## HITS:1 COG:no KEGG:PRU_2733 NR:ns ## KEGG: PRU_2733 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 11 131 44 160 830 107 44.0 2e-22 MAVLTIKNESSIHIGISWKENSAVRIAAENLKNDLKKVLGTEVTLGEFKGGESILVGTAG VSAEIEGLFDEKKLQDKNGNFRKEAYIRTVSKDRLVIVGTDRRGTIYGIYDLCEEIGVSP WYFWADVPVKK >gi|229783579|gb|GG668156.1| GENE 2 423 - 1193 483 256 aa, chain - ## HITS:1 COG:yagH KEGG:ns NR:ns ## COG: yagH COG3507 # Protein_GI_number: 16128256 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Escherichia coli K12 # 16 252 287 536 536 68 28.0 9e-12 MVHLAVRTARRTMSHLGRETFLTSVTWKDGWPYVGGNKLETIEREGPLWAGQERVQLWSA DFSSKNWEAPWLFLRERRDSLFKRGDGKLSLIPSVQADGTDIGSTFAAVRPLDFECIVET ELRFCPEQTGDEAGLLIYLESKFHYRFCKKRLDDGVFLVLEKTAEDFKQTICFAPVRDGN IWLRIECDREQYHFYYALEGGPFTLAGTASTRFLSCEIAGKCFTGTVVGLYALAQKQTSA YAEAKSFSIGPVDAYK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:38 2011 Seq name: gi|229783578|gb|GG668157.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld558, whole genome shotgun sequence Length of sequence - 608 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 56 - 142 70 ## + Term 286 - 339 2.2 - Term 271 - 328 8.6 2 2 Tu 1 . 
- CDS 364 - 606 213 ## Closa_1455 Glucan-binding domain (YG repeat)-like protein Predicted protein(s) >gi|229783578|gb|GG668157.1| GENE 1 56 - 142 70 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEMILETADLCKSFKNQMAVNNVSLHIN >gi|229783578|gb|GG668157.1| GENE 2 364 - 606 213 80 aa, chain - ## HITS:1 COG:no KEGG:Closa_1455 NR:ns ## KEGG: Closa_1455 # Name: not_defined # Def: Glucan-binding domain (YG repeat)-like protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 73 1299 1370 1374 81 53.0 1e-14 DVNGYMVTGWYTDSQGGSYYLNPVSDNTKGRMMTGWVLIDGSYYYFNEEPDGTRGRLFRN TKAPDGRYVDENGVWDGKEK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:45 2011 Seq name: gi|229783577|gb|GG668158.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld559, whole genome shotgun sequence Length of sequence - 710 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 2 - 227 192 ## COG1904 Glucuronate isomerase 2 1 Op 2 . 
- CDS 260 - 709 368 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases Predicted protein(s) >gi|229783577|gb|GG668158.1| GENE 1 2 - 227 192 75 aa, chain - ## HITS:1 COG:L0019 KEGG:ns NR:ns ## COG: L0019 COG1904 # Protein_GI_number: 15673610 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Lactococcus lactis # 4 75 3 74 472 112 63.0 2e-25 MKPFINDDFLLQTNTARELFSGHARKMPIIDFHNHLNPQEIYEDRCYDNIAQVWLGGDHY KWRAMRANGIPEQLI >gi|229783577|gb|GG668158.1| GENE 2 260 - 709 368 149 aa, chain - ## HITS:1 COG:TM0068 KEGG:ns NR:ns ## COG: TM0068 COG0246 # Protein_GI_number: 15642843 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Thermotoga maritima # 1 149 385 534 539 174 56.0 5e-44 IDTVLNVRIPNSFMPDTPQRIATDTSQKLAIRFGETIKAYEASDTLSTSDLKFIPLVFAG WLRYLMGVDDQGNPFTLSPDPMLDTLRPYISGIKLGDAVDDALPKPILENSTIFGVNLYK VGLAPLVCRYFREMTAGCGAVRATLEKYI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:46 2011 Seq name: gi|229783576|gb|GG668159.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld560, whole genome shotgun sequence Length of sequence - 852 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 377 269 ## ELI_3215 hypothetical protein 2 1 Op 2 . + CDS 391 - 585 254 ## gi|266625982|ref|ZP_06118917.1| putative 60S ribosomal protein L17 3 1 Op 3 . 
+ CDS 656 - 851 201 ## gi|266622261|ref|ZP_06115196.1| phage portal protein, SPP1 family Predicted protein(s) >gi|229783576|gb|GG668159.1| GENE 1 3 - 377 269 124 aa, chain + ## HITS:1 COG:no KEGG:ELI_3215 NR:ns ## KEGG: ELI_3215 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 2 112 324 434 441 152 63.0 6e-36 AEDLKKFLDGTPVKAVVVDPSAASFIAELNKHGFTVIQADNAVEDGIRLVATLLNTERIA FSQSCKNTIMEFASYIWDPKAAERGEDKPIKQHDHAMDAVRYFCYTILNNKAVRVRKKSD YGLH >gi|229783576|gb|GG668159.1| GENE 2 391 - 585 254 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625982|ref|ZP_06118917.1| ## NR: gi|266625982|ref|ZP_06118917.1| putative 60S ribosomal protein L17 [Clostridium hathewayi DSM 13479] putative 60S ribosomal protein L17 [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 110 100.0 5e-23 MGRYSEIAQNIEKECDYMGSRKSLEVSLKVLDTLDQNTDSLPVKQAITILDDAKNVLLQL VGIN >gi|229783576|gb|GG668159.1| GENE 3 656 - 851 201 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266622261|ref|ZP_06115196.1| ## NR: gi|266622261|ref|ZP_06115196.1| phage portal protein, SPP1 family [Clostridium hathewayi DSM 13479] phage portal protein, SPP1 family [Clostridium hathewayi DSM 13479] # 1 65 1 65 468 130 96.0 3e-29 MYIYTMPRENWDESNPDKQAIRTLIVKHRREANQLRKSMKYYEGEHKILTESRKTKLVCN HAKDI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:56:59 2011 Seq name: gi|229783575|gb|GG668160.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld561, whole genome shotgun sequence Length of sequence - 825 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 344 311 ## COG2954 Uncharacterized protein conserved in bacteria - Prom 381 - 440 6.0 + Prom 319 - 378 5.0 2 2 Tu 1 . + CDS 465 - 599 80 ## gi|288871758|ref|ZP_06410291.1| hypothetical protein CLOSTHATH_07411 3 3 Tu 1 . 
- CDS 662 - 823 102 ## gi|266625985|ref|ZP_06118920.1| conserved hypothetical protein Predicted protein(s) >gi|229783575|gb|GG668160.1| GENE 1 2 - 344 311 114 aa, chain - ## HITS:1 COG:SMc03154 KEGG:ns NR:ns ## COG: SMc03154 COG2954 # Protein_GI_number: 15966638 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 2 114 4 118 157 58 38.0 4e-09 MEIERKFLITNPLFNYDTYPFHTIEQGYLCTEPVVRVRREDDTCYLTYKSKGLLSREEYN LPLTKESYDHLISKADGNIISKKRYFIPIEGTELTIEFDVFEGKFEGLMLAEVE >gi|229783575|gb|GG668160.1| GENE 2 465 - 599 80 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871758|ref|ZP_06410291.1| ## NR: gi|288871758|ref|ZP_06410291.1| hypothetical protein CLOSTHATH_07411 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07411 [Clostridium hathewayi DSM 13479] # 1 44 1 44 44 64 100.0 3e-09 MRLRAISGFSFRQGAWAGALPPVSAHGFGRGISKAGRIQKQGYR >gi|229783575|gb|GG668160.1| GENE 3 662 - 823 102 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625985|ref|ZP_06118920.1| ## NR: gi|266625985|ref|ZP_06118920.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 53 1 53 53 92 100.0 8e-18 NYEKLLSSITLAGISGFSFRQGAWAGALPPVSAIFAGRRQLLGRKDTKTRGKI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:57:09 2011 Seq name: gi|229783574|gb|GG668161.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld562, whole genome shotgun sequence Length of sequence - 647 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 25 - 108 62 ## + Term 202 - 248 13.2 + TRNA 114 - 185 74.9 # Gly GCC 0 0 - Term 190 - 235 9.2 2 2 Tu 1 . 
- CDS 241 - 588 364 ## Clos_1311 hypothetical protein Predicted protein(s) >gi|229783574|gb|GG668161.1| GENE 1 25 - 108 62 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRYVQYVVLRVYALRENGEMRVASFL >gi|229783574|gb|GG668161.1| GENE 2 241 - 588 364 115 aa, chain - ## HITS:1 COG:no KEGG:Clos_1311 NR:ns ## KEGG: Clos_1311 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 5 115 6 116 116 115 50.0 6e-25 MIKTEFGIIDAIDPEKNYNIYEPETYHCVAIDDDKYISDWWPRLLLIRTYTQSLKRPSFA LDRWGVTLIPPESLPALQDIVISDRRITHDHQLIALADKISQAIDEQKYMIHFGI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:57:16 2011 Seq name: gi|229783573|gb|GG668162.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld563, whole genome shotgun sequence Length of sequence - 603 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 480 183 ## PROTEIN SUPPORTED gi|168184665|ref|ZP_02619329.1| conserved hypothetical protein - Prom 533 - 592 2.4 Predicted protein(s) >gi|229783573|gb|GG668162.1| GENE 1 1 - 480 183 160 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168184665|ref|ZP_02619329.1| conserved hypothetical protein [Clostridium botulinum Bf] # 36 159 31 150 166 75 31 1e-14 MEINMLINLSELFTLEGKEKTYTPDIEMDIYHGPGGDYEIVSKEPVVLRIMNLGNRKLEA EGKAMLTLRIPCDRCLDPVEVPLEFDIVRTLDLNESEEERVEELDEQPYLKGYNLDVDQL VCDELILNLPMKVLCSESCKGICNRCGTNLNHETCDCDKR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:57:17 2011 Seq name: gi|229783572|gb|GG668163.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld564, whole genome shotgun sequence Length of sequence - 673 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 257 119 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains Predicted protein(s) >gi|229783572|gb|GG668163.1| GENE 1 2 - 257 119 85 aa, chain - ## HITS:1 COG:SP0770 KEGG:ns NR:ns ## COG: SP0770 COG0488 # Protein_GI_number: 15900664 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Streptococcus pneumoniae TIGR4 # 1 75 1 75 513 87 46.0 7e-18 MSLLEIEGLTHSFGENLLYKNAGFTLNRGEHIGIVGQNGTGKSTLIKICTEQIIPDSGRI VWQPNTTIGYLDQYAEIDHTLTMKE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:57:17 2011 Seq name: gi|229783571|gb|GG668164.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld565, whole genome shotgun sequence Length of sequence - 590 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 590 550 ## gi|266625989|ref|ZP_06118924.1| conserved hypothetical protein Predicted protein(s) >gi|229783571|gb|GG668164.1| GENE 1 2 - 590 550 196 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625989|ref|ZP_06118924.1| ## NR: gi|266625989|ref|ZP_06118924.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 196 1 196 196 306 100.0 7e-82 KAAEASDAKGTTIEYSTDGESWSGTIPQFTEYQEGGYPVYVRATNDNYSNAATAEAVFNI TKRPVTVSAGIMTADYDGSEKEVTEIRYTEADSENAEGVLAGHTVTAKLLNSKRTDAGEQ TVSIEDGSVRILSGRTDVTKNYAVSLGDGKLIINQKGDLKVTVDAESLSHVYDGAGHGIK AAEASDAKGTTIEYST Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:57:28 2011 Seq name: gi|229783570|gb|GG668165.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld566, whole genome shotgun sequence Length of sequence - 771 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End 
Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 119 62 ## 2 1 Op 2 . + CDS 116 - 770 816 ## COG1653 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|229783570|gb|GG668165.1| GENE 1 3 - 119 62 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no GIANAMEKQVGRVDPERLEMIKNKINIENIPNKEVSHL >gi|229783570|gb|GG668165.1| GENE 2 116 - 770 816 218 aa, chain + ## HITS:1 COG:PH1039 KEGG:ns NR:ns ## COG: PH1039 COG1653 # Protein_GI_number: 14590877 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pyrococcus horikoshii # 49 218 47 222 420 82 28.0 5e-16 MKKRGLAMGIAVVLGCMSLTGCGGNSNEAKNGETAAATSRVEAKNEGEIELTMMGGAHLV SVAEIVLRDYLAEHPNVKINFEKYSYGEYPTKMKLQLSNDESTPDIMIVHDLYAPQFAKA GYLVDLSDMFTEGEVLPVMDPVTMDGKVYGIPNQVTNQYVFMYRQDIYDELGLTVPETFD DYFNQAVALKENGYYAGAFDPADSGCNEVFQNFIYMLG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:57:33 2011 Seq name: gi|229783569|gb|GG668166.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld567, whole genome shotgun sequence Length of sequence - 862 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 222 194 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 - Term 252 - 312 6.2 2 2 Tu 1 . 
- CDS 338 - 862 528 ## COG0733 Na+-dependent transporters of the SNF family Predicted protein(s) >gi|229783569|gb|GG668166.1| GENE 1 1 - 222 194 73 aa, chain + ## HITS:1 COG:CAC0908 KEGG:ns NR:ns ## COG: CAC0908 COG1179 # Protein_GI_number: 15894195 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Clostridium acetobutylicum # 1 69 166 248 251 87 54.0 4e-18 SVCPLCKVMRRELKKRGITSLKVLYSKEEPQKPLEDSGEVTSKRAVPGSVSFVPPVAGLL IAGEVIRGLTGRN >gi|229783569|gb|GG668166.1| GENE 2 338 - 862 528 174 aa, chain - ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 1 171 283 453 459 190 56.0 9e-49 VEPGQGPKLIFVTLPNVFNNMAGGRLVGTLFFLFMSFAAVSTVIAVFQNIVSFATDLTGC TIKKAVMCNAAAIILLSLPCVLGFNLWSGFMPFGEGSNVLDLEDFIISNNLLPLGSLIYL AFCTTRYGWGFENFMKEANEGKGIRFPRWVRGYVTFVLPVIVLFIFIQGYIAKF Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:57:33 2011 Seq name: gi|229783568|gb|GG668167.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld568, whole genome shotgun sequence Length of sequence - 597 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 498 418 ## COG2199 FOG: GGDEF domain 2 1 Op 2 . 
- CDS 449 - 595 86 ## gi|266625995|ref|ZP_06118930.1| conserved hypothetical protein Predicted protein(s) >gi|229783568|gb|GG668167.1| GENE 1 3 - 498 418 165 aa, chain - ## HITS:1 COG:DR0267 KEGG:ns NR:ns ## COG: DR0267 COG2199 # Protein_GI_number: 15805298 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Deinococcus radiodurans # 31 140 338 451 511 66 33.0 2e-11 MTKGDLSISLTQQTKDEVGMLSDSFQQTVDHLQKYINYINSLAYRDALTGVKNKTAYQDA ERRLEEQMRNGRPEFAVVVLDINDLKRINDNYGHDFGDMFIVDACRLICKCFPHSPVYRI GGDEFVVIMEGADYANYEHLLENFHFAIEEYNRSDQKDKHLSIAR >gi|229783568|gb|GG668167.1| GENE 2 449 - 595 86 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625995|ref|ZP_06118930.1| ## NR: gi|266625995|ref|ZP_06118930.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 48 1 48 48 91 100.0 2e-17 FLVIAPVSVWVTVLITRRMVRPLKELNEAAKQIDQRGLKHFPDAADKG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:57:39 2011 Seq name: gi|229783567|gb|GG668168.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld569, whole genome shotgun sequence Length of sequence - 813 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 812 366 ## Clole_2765 phage terminase, large subunit, PBSX family Predicted protein(s) >gi|229783567|gb|GG668168.1| GENE 1 2 - 812 366 270 aa, chain + ## HITS:1 COG:no KEGG:Clole_2765 NR:ns ## KEGG: Clole_2765 # Name: not_defined # Def: phage terminase, large subunit, PBSX family # Organism: C.lentocellum # Pathway: not_defined # 1 270 24 295 434 429 73.0 1e-119 WWCDTSPVKDKDGIIADGAIRSGKTVCMSLSFVFWAMANYSDQNFAMCGKTIGSFRRNVL TILKLMLRSRGFQVADHRADNLVEISRNGVTNHFYIFGGKDESSQDLIQGITLAGVFFDE VALMPESFVNQATGRCSVDGSKYWFNCNPDGPYHWFKLNWLDKAKEKNLLVLHFTMDDNL SLSEHIKERYRNMYTGVFFKRYILGLWAMAEGIIYDMFSEDRHVKTILEYARQLIDGGRY VSIDYGTQNATAFLLWNKGRDGKWYCIREY Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:57:44 2011 Seq name: gi|229783566|gb|GG668169.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld570, whole genome shotgun sequence Length of sequence - 511 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 28 - 432 450 ## gi|266625997|ref|ZP_06118932.1| protein SWAP Predicted protein(s) >gi|229783566|gb|GG668169.1| GENE 1 28 - 432 450 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266625997|ref|ZP_06118932.1| ## NR: gi|266625997|ref|ZP_06118932.1| protein SWAP [Clostridium hathewayi DSM 13479] protein SWAP [Clostridium hathewayi DSM 13479] # 1 134 1 134 134 228 100.0 2e-58 MNAAYKEAYLHYRESFSSIFNEEHREEQKGIPETADLDDGFSLRSRYYTGTLKGNIFDFE DPFTEPEGINGYELMDREYDLYDDIEFSDFQNGKILLKAYNTRDQKQEILAYDTKYLKRL LKERLKAVRMEDAR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:57:52 2011 Seq name: gi|229783565|gb|GG668170.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld571, whole genome shotgun sequence Length of sequence - 943 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 4 - 159 99 ## gi|266625998|ref|ZP_06118933.1| putative DNA-binding response regulator 2 1 Op 2 . + CDS 156 - 942 668 ## gi|266625999|ref|ZP_06118934.1| hypothetical protein CLOSTHATH_07427 Predicted protein(s) >gi|229783565|gb|GG668170.1| GENE 1 4 - 159 99 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625998|ref|ZP_06118933.1| ## NR: gi|266625998|ref|ZP_06118933.1| putative DNA-binding response regulator [Clostridium hathewayi DSM 13479] putative DNA-binding response regulator [Clostridium hathewayi DSM 13479] # 1 51 2 52 52 97 100.0 3e-19 MIERGYVINLYHVWKMEGTSVVLDNKTVLPVSQNHAKEVRETITGFFRRHL >gi|229783565|gb|GG668170.1| GENE 2 156 - 942 668 262 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266625999|ref|ZP_06118934.1| ## NR: gi|266625999|ref|ZP_06118934.1| hypothetical protein CLOSTHATH_07427 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07427 [Clostridium hathewayi DSM 13479] # 1 262 1 262 263 422 100.0 1e-116 MNTLFKLCELAAYFFEALICYFFIQLFFPEKAKGKIRFFLLSGVLLVTVFIADSLHVFPF LVTLWFVFYICMTTVLIFHVDVFYAVSLVSFYILCVYIIDFFCISVMGVFLKNQQFAQMV IAQLSQWRLVYQAADKLLLTAAYFLARKAFRRKLAYNPGMLFLTSGLGLCGVGFLSWLTI QETSLHALFGWSMCIILLFLIYFLLLFYSKYVEEQKQRLALEMKEQMIDREYDLMVRQQR EQEELSHDMKNHLLVLSSMIGE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:11 2011 Seq name: gi|229783564|gb|GG668171.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld572, whole genome shotgun sequence Length of sequence - 1079 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1053 1271 ## COG1132 ABC-type multidrug transport system, ATPase and permease components Predicted protein(s) >gi|229783564|gb|GG668171.1| GENE 1 3 - 1053 1271 350 aa, chain - ## HITS:1 COG:CAC3282 KEGG:ns NR:ns ## COG: CAC3282 COG1132 # Protein_GI_number: 15896527 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 2 350 32 387 584 275 41.0 7e-74 MLPKLMSLIINNGVANRDTSYIIAVGGVMVVTAFVMAAGGIGAAYFSSKASICFSSDLRK DVFDKVQQFSFKNIDDFSSGSLVTRLTNDIQQIQNVLMMSLRLMFRAPGMLIGGLIMAYL MNSRLVTILLVVIPLLLLAIAVILRTAFPRFEVMQKKIDRLNSGVQEALTNVRVIKSFVR EDYEEEKFRRTNEDLKEGSLRAMKIVIATMPVMMLAMNVTTLAVVWYGGNLIIGGSMPVG DLTAFTTYIVQILMSLMMVSMVFLQASRAMASLKRVREVLDTEIDLTDEHAARKDAVVTE GRIEFRDVCFQYAKDTDELVLDHINFTANPGETIGIIGSTGSGKTSLVQL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:11 2011 Seq name: gi|229783563|gb|GG668172.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld573, whole genome shotgun sequence Length of sequence - 699 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 327 210 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|229783563|gb|GG668172.1| GENE 1 1 - 327 210 108 aa, chain + ## HITS:1 COG:BH2728 KEGG:ns NR:ns ## COG: BH2728 COG4753 # Protein_GI_number: 15615291 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 2 108 399 505 510 82 40.0 1e-16 YSDSVFKIMQYMEARYAEPVTLDELANHVHMNRSYISHLFKKETGRNINAYLLEFRMEKA KILLESSNHNIQDICCQIGIPDSAYFSKVFKKYTGFTPIEFRRISNEN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:12 2011 Seq name: gi|229783562|gb|GG668173.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld574, whole genome shotgun sequence Length of sequence - 726 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 524 638 ## COG1309 Transcriptional regulator - Prom 626 - 685 4.9 Predicted protein(s) >gi|229783562|gb|GG668173.1| GENE 1 2 - 524 638 174 aa, chain - ## HITS:1 COG:BH3394 KEGG:ns NR:ns ## COG: BH3394 COG1309 # Protein_GI_number: 15615956 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 11 171 7 163 186 65 26.0 4e-11 MTAYVSKNEKTKYRLADSVKQCMMTTPVDKITVQNIVDGCDMTRQTFYRNFKDKYDLINW YFDKLVLESFARIGVEKTVCQSLEEKFEFIKKEKVFFTEAFRSDDYNSLKEHDFELIMGF YQELITRRTHKPLEEDIQFLLEMYCRGSVYMTVKWVLSGMKQTPGEMAASLVEA Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:12 2011 Seq name: gi|229783561|gb|GG668174.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld575, whole genome shotgun sequence Length of sequence - 824 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 348 - 407 8.5 1 1 Tu 1 . + CDS 564 - 728 149 ## gi|288871763|ref|ZP_06118939.2| Tas protein Predicted protein(s) >gi|229783561|gb|GG668174.1| GENE 1 564 - 728 149 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871763|ref|ZP_06118939.2| ## NR: gi|288871763|ref|ZP_06118939.2| Tas protein [Clostridium hathewayi DSM 13479] Tas protein [Clostridium hathewayi DSM 13479] # 1 54 1 54 54 96 100.0 6e-19 MYIVPIPVSRKESPLRENAGAVEVNLTTEKIADINEKLGQMEIFSMFGGSPVKK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:18 2011 Seq name: gi|229783560|gb|GG668175.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld576, whole genome shotgun sequence Length of sequence - 1277 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 630 428 ## Closa_0419 Ig domain protein group 2 domain protein 2 1 Op 2 . 
+ CDS 682 - 1002 165 ## gi|266626006|ref|ZP_06118941.1| transketolase + Term 1008 - 1065 5.6 Predicted protein(s) >gi|229783560|gb|GG668175.1| GENE 1 1 - 630 428 209 aa, chain + ## HITS:1 COG:no KEGG:Closa_0419 NR:ns ## KEGG: Closa_0419 # Name: not_defined # Def: Ig domain protein group 2 domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 83 204 1275 1402 1413 100 40.0 5e-20 MNFNGWFTAATGGTAVTTDTVYTADTTIYAHWTYNGSGSSGSGSSGGGSSSSSSTTTYPE CLPNNYKGATKILHNVRVPDYVVEGNWKTVGDGRWRLGSPDGTDYAGTWVPAYNPYANNG QRVFDWFLFDAEGYLVTGWYTDEQGNTFYLNSSVDNTQGAMFFGWNIINGKYYYFNEEPD GSRGKLYRNTTTPDGYYVDENGVWDGIQK >gi|229783560|gb|GG668175.1| GENE 2 682 - 1002 165 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626006|ref|ZP_06118941.1| ## NR: gi|266626006|ref|ZP_06118941.1| transketolase [Clostridium hathewayi DSM 13479] transketolase [Clostridium hathewayi DSM 13479] # 1 106 1 106 106 193 100.0 3e-48 MRAAGNDVLPSWYYKKGTAPNKEMRIISEKLLKRLTELTKLSYISDLHGGASGRIISKAL HEIKPEEYSIEEWEYAIGYILGQTQHFESREAALQYLYEQLNKIDK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:30 2011 Seq name: gi|229783559|gb|GG668176.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld577, whole genome shotgun sequence Length of sequence - 798 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 404 568 ## Closa_1706 sulfatase + Term 409 - 475 15.0 - Term 247 - 272 -0.8 2 2 Tu 1 . 
- CDS 510 - 725 292 ## COG2768 Uncharacterized Fe-S center protein Predicted protein(s) >gi|229783559|gb|GG668176.1| GENE 1 3 - 404 568 133 aa, chain + ## HITS:1 COG:no KEGG:Closa_1706 NR:ns ## KEGG: Closa_1706 # Name: not_defined # Def: sulfatase # Organism: C.saccharolyticum # Pathway: not_defined # 1 127 503 629 642 223 77.0 2e-57 MRFHQNYLNGGMEDEETYLADMKALEYDILYGDREVYGGENPYQTTDLQMGIDPITIDDI VYNDSNILVYGENFTPYSRICLDGKAVETTFVWPELIIAKNIPEKKVTDADITVWQIGRD KIPLGESAGTSHK >gi|229783559|gb|GG668176.1| GENE 2 510 - 725 292 71 aa, chain - ## HITS:1 COG:TM0034 KEGG:ns NR:ns ## COG: TM0034 COG2768 # Protein_GI_number: 15642809 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S center protein # Organism: Thermotoga maritima # 4 71 298 352 357 63 48.0 7e-11 MFASFDPVALDMACVDAVNRQPVIAGSILEKHGSKHHDHFTDVHPDTNWKTAVEHGVKIG LGTKEYELITI Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:33 2011 Seq name: gi|229783558|gb|GG668177.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld578, whole genome shotgun sequence Length of sequence - 585 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 220 199 ## Closa_2136 hypothetical protein 2 1 Op 2 . 
- CDS 224 - 583 361 ## COG1120 ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components Predicted protein(s) >gi|229783558|gb|GG668177.1| GENE 1 1 - 220 199 73 aa, chain - ## HITS:1 COG:no KEGG:Closa_2136 NR:ns ## KEGG: Closa_2136 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 73 3 72 249 70 46.0 1e-11 MWELYERLIEPIPDDVPVDEILVGTSCTMVRAGGAAGAAANQRLESRPRILGEGEWEQEL TWRQAASLINSWN >gi|229783558|gb|GG668177.1| GENE 2 224 - 583 361 119 aa, chain - ## HITS:1 COG:MA1233 KEGG:ns NR:ns ## COG: MA1233 COG1120 # Protein_GI_number: 20090097 # Func_class: P Inorganic ion transport and metabolism; H Coenzyme transport and metabolism # Function: ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components # Organism: Methanosarcina acetivorans str.C2A # 1 117 142 257 257 113 43.0 7e-26 GEQQLVLIARAIAQQAGILIMDEPCANLDYGNQARVMEELKRLSREGYLIVQSTHSPDQA FLYADQAAVLSDGVIRAFGKPEEVLTEALLEAMYGIPVRLFDAGDTGRKLCMPERVKGE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:36 2011 Seq name: gi|229783557|gb|GG668178.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld579, whole genome shotgun sequence Length of sequence - 579 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 579 720 ## COG0733 Na+-dependent transporters of the SNF family Predicted protein(s) >gi|229783557|gb|GG668178.1| GENE 1 3 - 579 720 192 aa, chain + ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 1 192 67 254 459 170 52.0 1e-42 FAVGRASQRSVALSYDVLEPKGTKWHYAKYLGMAGNYILMMFYTTVAGWMILYFLKMAKG DFTGLDAAGVGAEFEGMLTNPVLMAVFMILVVIGCFAVCAKGLQGGVEKITKVMMVCLLI LMVILAIHSVLMDNSGPGLEFYLKPDFGKIKEAGLGEVIFAAMGQAFFTLSLGIGALAIF GSYIGKERSLMG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:37 2011 Seq name: gi|229783556|gb|GG668179.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld580, whole genome shotgun sequence Length of sequence - 685 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 685 619 ## gi|266626013|ref|ZP_06118948.1| conserved hypothetical protein Predicted protein(s) >gi|229783556|gb|GG668179.1| GENE 1 1 - 685 619 228 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626013|ref|ZP_06118948.1| ## NR: gi|266626013|ref|ZP_06118948.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 228 1 228 228 376 100.0 1e-103 DAKGTTIEYSTDGESWSGTIPQFTEYQEGGYPVYVRAKNDNYSNVAEADVVFRITKRPIK VAAGILETEYDGSEKAVTEFTYTVSNEENTAGALKDHKVSAVLKNNTRTEAGEQTVSVKE NSVRILSGEADVTKNYAVSLEDGKLTVNQKGGLKVTVDAESLSHVYDGAGHGIKAAEASD AKGTTIEYSTDGESWSGTIPQFTEYREEGYPVHVRAKNDNYSNVAEAD Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:49 2011 Seq name: gi|229783555|gb|GG668180.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld582, whole genome shotgun sequence Length of sequence - 641 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 348 - 395 9.0 1 1 Tu 1 . 
- CDS 427 - 639 109 ## COG0336 tRNA-(guanine-N1)-methyltransferase Predicted protein(s) >gi|229783555|gb|GG668180.1| GENE 1 427 - 639 109 70 aa, chain - ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 60 186 245 246 74 58.0 6e-14 EYKGLRVPEVLLSGHHKNIETWRREQSIKRTYERRPDLLKDAVLSKKEQEYLERLAKKPE NQDIQVEFQG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:49 2011 Seq name: gi|229783554|gb|GG668181.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld583, whole genome shotgun sequence Length of sequence - 973 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 66 - 380 341 ## COG0846 NAD-dependent protein deacetylases, SIR2 family - Prom 438 - 497 6.9 2 2 Tu 1 . 
- CDS 568 - 936 341 ## DSY3774 hypothetical protein Predicted protein(s) >gi|229783554|gb|GG668181.1| GENE 1 66 - 380 341 104 aa, chain - ## HITS:1 COG:SA0315 KEGG:ns NR:ns ## COG: SA0315 COG0846 # Protein_GI_number: 15926028 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Staphylococcus aureus N315 # 8 102 24 118 315 92 46.0 1e-19 MNETLSAVRTLLREAEAVVVGAGSGMSAAAGLTYAGRRFHEKFGDFIEHYGMTDMYSAGF YPFSTQEEKWAYWSRHIYYNRYDVRIGKPYLDLLQLVKDRNYLF >gi|229783554|gb|GG668181.1| GENE 2 568 - 936 341 122 aa, chain - ## HITS:1 COG:no KEGG:DSY3774 NR:ns ## KEGG: DSY3774 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 5 122 1258 1375 1378 119 50.0 3e-26 MTAALKQPEDEVIWLNASDPAQIWGKVLEQPEGSSFMNIPGTAAALRQGRAAAVMERQGK VLRIFEEEGMEAVLSEFVRCFQRKLLFPDQKRLILKEYPEHAGELLKKAGFIREMQDYVL YR Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:53 2011 Seq name: gi|229783553|gb|GG668182.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld584, whole genome shotgun sequence Length of sequence - 765 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 763 852 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase Predicted protein(s) >gi|229783553|gb|GG668182.1| GENE 1 1 - 763 852 254 aa, chain - ## HITS:1 COG:CAC2832 KEGG:ns NR:ns ## COG: CAC2832 COG0436 # Protein_GI_number: 15896087 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 3 254 117 367 393 282 55.0 3e-76 KTILDPGDEVIVFAPYFMEYGAYVRNYDGVLVTVSPDISSFQPNIEEFKEKITGRTKAVI INTPNNPTGVVYSAKTLERIGAVMREKEAEYGTSIVLLSDEPYRELAYDGVEVPYVTPFY HNTVVCYSYSKSLSLPGERIGYLVIPDELEDSKAVIAAASIANRVLGCVNAPSLMQRVIM RCIDEKVNVEAYDRNRNLLYNGLRGFGFECIRPEGAFYLFVKSPEADDRAFSEACKKHRL LVVPGTSFGCPGYV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:53 2011 Seq name: gi|229783552|gb|GG668183.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld585, whole genome shotgun sequence Length of sequence - 700 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 698 376 ## COG4646 DNA methylase Predicted protein(s) >gi|229783552|gb|GG668183.1| GENE 1 2 - 698 376 232 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 183 1015 1191 1315 97 32.0 2e-20 PGDLAQRKGRIERQGNQNPLVHVYRYVTEGTFDAYLWQTVENKQKFISQIMTSKSPVRSC DDVDETALSFAEIKALCAGDPRIKERMDLDVEVSRLKLMKADHQSKQYRLEDQLLKYFPE EIEKHKGFIKGFESDLEVLAAHPHPEDGFAGMEIRGDLLTDKENAGAALLDACKEVKTSD PVQIGNYRGYAMSVEFSAWKQEYTLLLKGQMTHRATLGTDPRGNLTRIDNAL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:54 2011 Seq name: gi|229783551|gb|GG668184.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld586, whole genome shotgun sequence Length of sequence - 783 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 29 - 88 5.1 1 1 Tu 1 . 
+ CDS 189 - 782 298 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 Predicted protein(s) >gi|229783551|gb|GG668184.1| GENE 1 189 - 782 298 198 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 15 198 1 182 245 119 39 7e-28 MAEENREEKSKKIPMIQVKNLYKVYKVGDTKVYALGGVDFTVYKGEFCAIVGPSGSGKST LLNMMAGLEKPSKGEIVIAGKHIEKLTENQLVAFRRKHVGFIFQSYNLLQTMNAVENVAL PLSFRGVPKKKRNEKAVKYIKLVGLEKQMKHMANEMSGGQQQRVGIARALAVDPQIIFAD EPTGNLDSKTTKEILGLM Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:58:55 2011 Seq name: gi|229783550|gb|GG668185.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld587, whole genome shotgun sequence Length of sequence - 799 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 797 651 ## Ethha_1897 hypothetical protein Predicted protein(s) >gi|229783550|gb|GG668185.1| GENE 1 2 - 797 651 265 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1897 NR:ns ## KEGG: Ethha_1897 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 265 521 785 794 487 85.0 1e-136 SPLSLKSDFILSLCELIVGGKEGLQPVQKTIIDRCVRLVYQTYLNDPRPENMPILEDLYN LLREQEEKEAQYIATALEIYVTGSLNVFNHQSNVDINNRIVCYDIKELGKQLKKIGMLVV QDQVWNRVTINRAAHKSTRYYIDEMHLLLKEEQTAAYTVEIWKRFRKWGGIPTGITQNVK DLLSSREVENIFENSDFVYMLNQAGGDRQILAKQLGISTHQLSYVTHSGEGEGLLFYGST ILPFVDHFPKNTELYRIMTTKPQEL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:00 2011 Seq name: gi|229783549|gb|GG668186.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld591, whole genome shotgun sequence Length of sequence - 890 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 713 828 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 740 - 799 5.6 Predicted protein(s) >gi|229783549|gb|GG668186.1| GENE 1 2 - 713 828 237 aa, chain - ## HITS:1 COG:ECs3042 KEGG:ns NR:ns ## COG: ECs3042 COG1879 # Protein_GI_number: 15832296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 50 234 18 199 332 116 34.0 4e-26 MKKRECRIFSIALLLALAVCFLAACVSGQDGTQESTEAAKGETESTEDSGGSTEAAEKKL RIGVTIYRYDDNFMKLYREELRQYLEETYQAEVVVRNARGEQEEQISQINDFIEAGYDGI IVNLVDTDHAGVIADSCHEAGIPLVFINREPKADEQARWKAEKMAVSCIGTDSKQAGTYQ GDIILETDTRGDLNRDGIVSYVMIMGEKDNEDSVSRTEYSIKALEAGGMKTEKLYSG Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:00 2011 Seq name: gi|229783548|gb|GG668187.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld592, whole genome shotgun sequence Length of sequence - 511 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 510 151 ## PROTEIN SUPPORTED gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 Predicted protein(s) >gi|229783548|gb|GG668187.1| GENE 1 1 - 510 151 170 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 [Roseobacter sp. 
AzwK-3b] # 1 146 2 148 305 62 28 4e-11 NYELYKVFYHVAATLSFSEASKQLFISQSAVSQSIKVLEKKLNQTLFIRSTKKVQLTPEG EILLKHVEPAINLIQKGENQLLEANTLNGGQLRIGASDTICRYYLIPYLNRFHKAYPNVH IKVTNQTSIECAHLLENGQVDFIITNYPNSGLSNSQNVRVINEFKPFGSN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:01 2011 Seq name: gi|229783547|gb|GG668188.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld593, whole genome shotgun sequence Length of sequence - 938 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 25 - 708 604 ## gi|266626023|ref|ZP_06118958.1| TraG/TraD family protein Predicted protein(s) >gi|229783547|gb|GG668188.1| GENE 1 25 - 708 604 227 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626023|ref|ZP_06118958.1| ## NR: gi|266626023|ref|ZP_06118958.1| TraG/TraD family protein [Clostridium hathewayi DSM 13479] TraG/TraD family protein [Clostridium hathewayi DSM 13479] # 1 227 1 227 227 434 100.0 1e-120 MSILITFIKLLAVFDLLVIAAYFGRDILSTVWSKFRKKPRREEPFDFSLDAAAVPLSIDS IRQLYSGDAADFVEERLLPNAAQTMAVLTDREQTAARLLLSALVGFLAEEAPMDERSFPM VMELLNCMECEKEDGCQDPVDILFEETVRRTCRHEEYYSNYQRYRLMQVDKTRVILACRI LINDLLGKLYRYDYRFGYDLLLDEENSIEKKLHTSVREEWEDEDDGM Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:13 2011 Seq name: gi|229783546|gb|GG668189.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld594, whole genome shotgun sequence Length of sequence - 1012 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 88 - 828 925 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit + Term 895 - 944 8.1 Predicted protein(s) >gi|229783546|gb|GG668189.1| GENE 1 88 - 828 925 246 aa, chain + ## HITS:1 COG:BH2556 KEGG:ns NR:ns ## COG: BH2556 COG1191 # Protein_GI_number: 15615119 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus halodurans # 26 241 20 232 237 272 68.0 5e-73 MILKVSVPNRFQFKAVPTFKTLMMPKQGEIHYIGGADILPAPLDNESEARVIAMLGGEDD QSAKSELIEHNLRLVVYIAKKFDNTSVGVEDLISIGTIGLIKAINTFNPGKNIKLATYAS RCIENEILMYLRRNNKTKMEVSIDEPLNVDWDGNELLLSDILGTDEDVIYKDLETEVERN LLNSAISRLSPRERKIVELRFGLSDEEGEEMTQKEVADLLGISQSYISRLEKKIMKRLKK EIVRFE Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:13 2011 Seq name: gi|229783545|gb|GG668190.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld595, whole genome shotgun sequence Length of sequence - 1031 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 338 206 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 2 1 Op 2 . + CDS 335 - 1031 558 ## Amet_4420 sensor diguanylate cyclase Predicted protein(s) >gi|229783545|gb|GG668190.1| GENE 1 3 - 338 206 111 aa, chain + ## HITS:1 COG:alr3260 KEGG:ns NR:ns ## COG: alr3260 COG0745 # Protein_GI_number: 17230752 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Nostoc sp. 
PCC 7120 # 2 110 127 235 238 68 32.0 4e-12 YAQVRRISMKRFAIDHSNQLLLVDDQIIPLPQKEFELFLFLFGNPGQIFPAAELYQQVWR TGETDSANTVAVHIKRLRHKLEDAGAVIGRIETVRGEGYRFILGSEARIKI >gi|229783545|gb|GG668190.1| GENE 2 335 - 1031 558 232 aa, chain + ## HITS:1 COG:no KEGG:Amet_4420 NR:ns ## KEGG: Amet_4420 # Name: not_defined # Def: sensor diguanylate cyclase # Organism: A.metalliredigens # Pathway: not_defined # 2 231 3 238 650 116 31.0 6e-25 MKRFLAIFLLILALCAALFVTMSVSFVSKGPEAVDGVLDYRGTDFTSSVYQLDGQWEFYY GSLYAPEEFKQGTPKGRELITLPGSWAGLGYPVLGHATYRLTLQTDPGEIYLLFIPEIIS SAVIWNNGTEIYRAGQVGDSAANTVTGVRNELLAVSSEDGTLELVVWAANYHLTDSGLFY PILFGRDTVMLHYLLWQRAAAAAAMGGILLIGVYHLFLYLFRRMERLYLVFS Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:19 2011 Seq name: gi|229783544|gb|GG668191.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld597, whole genome shotgun sequence Length of sequence - 798 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 141 - 797 541 ## Closa_0875 Stage II sporulation P family protein Predicted protein(s) >gi|229783544|gb|GG668191.1| GENE 1 141 - 797 541 218 aa, chain - ## HITS:1 COG:no KEGG:Closa_0875 NR:ns ## KEGG: Closa_0875 # Name: not_defined # Def: Stage II sporulation P family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 218 251 468 468 393 88.0 1e-108 LAKDFTLEGKNDKPQILIYHTHSQEEFSDYGPNNKEATVIGIGTYLTELLTAKGYNVIHD KSVYDLQNGKLDRNKAYTYALDGVTRILQENPSIEVILDLHRDGVNESLHMVNEVNGKPT APIMFFNGVSQTPEGPIEYLPNPYRSDNLAFSFQMQLDAAAYFPGLTRKIYIKGLRYNLH LRPRSVLIEVGAQTNTYQEARNAMEPLAEVLNMVLQGN Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:23 2011 Seq name: gi|229783543|gb|GG668192.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld598, whole genome shotgun sequence Length of sequence - 693 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 530 308 ## Ethha_1908 DEAD-like helicase 2 1 Op 2 . 
+ CDS 532 - 691 262 ## gi|291549641|emb|CBL25903.1| hypothetical protein RTO_12680 Predicted protein(s) >gi|229783543|gb|GG668192.1| GENE 1 3 - 530 308 175 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1908 NR:ns ## KEGG: Ethha_1908 # Name: not_defined # Def: DEAD-like helicase # Organism: E.harbinense # Pathway: not_defined # 2 175 2289 2462 2462 179 56.0 4e-44 GDLLTDKENAGAALLDACKEVKTSDPVQIGNYRGYAMSVEFSAWKQEYTLLLKGQMTHRA TLGTDPRGNLTRIDNALAQMPQRLESAKAQLDNLCQQQAAAKEEVGKPFLYEEELRCKNA RLVELDTLLNIDGKGQAHTESVVAKSTRPSVLDNLKRPVQPRSTDKKPKQHEEVR >gi|229783543|gb|GG668192.1| GENE 2 532 - 691 262 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291549641|emb|CBL25903.1| ## NR: gi|291549641|emb|CBL25903.1| hypothetical protein RTO_12680 [Ruminococcus torques L2-14] # 1 53 1 53 268 100 100.0 3e-20 MNTNDLNTALYEKMAAEQDKFRDWLKRQPPEEILHHTYEYTVREDIVMAMEEL Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:32 2011 Seq name: gi|229783542|gb|GG668193.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld599, whole genome shotgun sequence Length of sequence - 1080 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 31 - 849 815 ## Ethha_1910 hypothetical protein 2 1 Op 2 . 
+ CDS 852 - 1080 156 ## Ethha_1911 hypothetical protein Predicted protein(s) >gi|229783542|gb|GG668193.1| GENE 1 31 - 849 815 272 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1910 NR:ns ## KEGG: Ethha_1910 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 106 231 1 132 254 104 42.0 5e-21 MNTNDLNTALYEKMTAEQDKFRDWLKSQPPEEVLNHAYEYTIREDIVMAMEELELTDTQA QALLESPSPLADVYRYFEKLETGYMDMIRDSIENRADDVCRAKEELRTTPVYPHSAAYAR EHGELEQYRASNNVNLQCKESIEAAVREHFDGMYLSHDAAKGVIETYGMERVSMVLSNTV QLQDWDGRYSRRNKEWAKTIPNDNPETVRCGYALNSHPAVLNGFIDLVREEQQHSRTQRE KLEPSRPSVRDKLKQELRAHKSAAPKKREPER >gi|229783542|gb|GG668193.1| GENE 2 852 - 1080 156 76 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1911 NR:ns ## KEGG: Ethha_1911 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 76 1 76 112 87 63.0 1e-16 MAKRKRDVPVLFWVSAEELELIHQKMQQYGTENLSAYLRKMALDGYVVKLELPELKELVS LMRRSSNNLNQLTRKV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:39 2011 Seq name: gi|229783541|gb|GG668194.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld600, whole genome shotgun sequence Length of sequence - 870 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 39 - 290 435 ## CD1107A hypothetical protein 2 1 Op 2 . 
+ CDS 280 - 868 386 ## CD1107 hypothetical protein Predicted protein(s) >gi|229783541|gb|GG668194.1| GENE 1 39 - 290 435 83 aa, chain + ## HITS:1 COG:no KEGG:CD1107A NR:ns ## KEGG: CD1107A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 67 1 67 85 78 73.0 7e-14 MAKNKIERIDQEITKVREKIAEYQEKLKALEAQKTEAENLEIVQMVRALRMTPTQLSAML SGGTVPGSLADDNNEQEENSYEE >gi|229783541|gb|GG668194.1| GENE 2 280 - 868 386 196 aa, chain + ## HITS:1 COG:no KEGG:CD1107 NR:ns ## KEGG: CD1107 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 182 3 178 244 198 61.0 1e-49 MRNKRLLRTLSAFCLTMVVAFGFTIPAFAQGSEQAPAAPAEDSTNDSNVIVEETEPAPAL TPEGNAALVDDFGGNKQLITVTTKAGNYFYILIDRANEDKKTAVHFLNQVDEADLMALME DENVKEKPSAVCSCTTKCEAGAVNTACPVCATDKSKCTGKAPEPPAETPEPEKEKPAGLN PAAIVLLLALLGGGGV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:45 2011 Seq name: gi|229783540|gb|GG668195.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld602, whole genome shotgun sequence Length of sequence - 709 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 708 326 ## gi|266626035|ref|ZP_06118970.1| conserved hypothetical protein Predicted protein(s) >gi|229783540|gb|GG668195.1| GENE 1 3 - 708 326 235 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626035|ref|ZP_06118970.1| ## NR: gi|266626035|ref|ZP_06118970.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 235 1 235 235 505 100.0 1e-141 DEFSWKVVRVGINQLLLENKDSRIPLRQQNPICGMQVRMLDNAKNVRVMGRDLPVSRRDE NGWYYRIPDDAAREAFERKELCISIRPDDEKEPFVFLEGSFLVKCSSPYTEKDGRQMITD GGFYLEEIGGDISCHDLLTAGFPFCGSAVTVECMAAACEDGRIKLGHIHGDCARIWVDNR EYGVIWGPAWVITGLTVGIHRITAEIVPSTFNSYGPHHHMEGDRHVISPAQYEGV Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:58 2011 Seq name: gi|229783539|gb|GG668196.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld605, whole genome shotgun sequence Length of sequence - 936 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 153 - 932 209 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 Predicted protein(s) >gi|229783539|gb|GG668196.1| GENE 1 153 - 932 209 259 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 22 239 278 502 563 85 29 2e-17 MNSQITGNKTQITKKSPTFNEMICVEDLRFSYDKENEVISGISLSIKKGKKYAFVGKSGC GKSTLIKLIAGYYSDFKGNVLYDEDNLSIVDVDKITALSSIIHQNVYMFDESILDNICLH EDYSKEELQSALSDSGLLHFIEQVPNGLEYHVGENGNNLSGGQKQRIAVARALIRKKPLL ILDEGTSAIDMQTAYDIESRLLEISDLTLLTITHNMSKDILELYDEIIFMADGKIIEHGT FETLIDKHAAFYDFYQLKK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:59 2011 Seq name: gi|229783538|gb|GG668197.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld606, whole genome shotgun sequence Length of sequence - 651 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 651 680 ## COG3505 Type IV secretory pathway, VirD4 components Predicted protein(s) >gi|229783538|gb|GG668197.1| GENE 1 3 - 651 680 216 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 16 192 151 307 591 95 34.0 9e-20 NVILTKTESLTMNSRPKDPKTARNKNVLVIGGSGSGKTRFWLKPNLMQMHSSYVVTDPKG TILVECGKMLQRGTSKLGKDGKPMKDKHGKVIYEPYRIKVLNTINFKKSMHYNPFAYIHS EKDILKLVTTLIANTKGEGKAGDDFWVKAETLLYCALIGYIHYEAPVEEQNFSTLIEFIN AMEVREDDEEFKNPVDLMFDALEAEKPNHFAVRQYK Prediction of potential genes in microbial genomes Time: Fri Jul 1 03:59:59 2011 Seq name: gi|229783537|gb|GG668198.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld607, whole genome shotgun sequence Length of sequence - 996 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 996 763 ## LA2_07230 acyl-CoA thioesterase Predicted protein(s) >gi|229783537|gb|GG668198.1| GENE 1 3 - 996 763 331 aa, chain - ## HITS:1 COG:no KEGG:LA2_07230 NR:ns ## KEGG: LA2_07230 # Name: not_defined # Def: acyl-CoA thioesterase # Organism: L.amylovorus # Pathway: not_defined # 120 294 59 224 262 65 29.0 4e-09 VRDSYGRLLFAICAAADKNDLRPSVTGGAVDDSKAAAAEEPSWNHIDMNKDTTYTIRVHA DFEEKTVSYSITEKDGKVAAQEIAIPTEASNLAKMIACNWWVGQPQYIDNFRLTAPEAAL DLPLEGKTLYAFGDSIVAGHQYTKHSFADFIADQEGMILQKYAVNGATIMEAGYSGGQIL SQLNSAPDEAPDYIIFDGGTNDAEYLLNKDSESLGTIVEECDPERFDTDTFAGAFEKTVY EMKKKWPDAQLVYVAVHKLGSRDESMQDKLHELELAACAKWEVAVANLYDDSELDTRKEP HKNRYTFNSLDSNGLPGTNGSGTHPNLAAIE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:05 2011 Seq name: gi|229783536|gb|GG668199.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld608, whole genome shotgun sequence Length of sequence - 696 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 586 454 ## CDR20291_2110 TetR family transcriptional regulator Predicted protein(s) >gi|229783536|gb|GG668199.1| GENE 1 2 - 586 454 194 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2110 NR:ns ## KEGG: CDR20291_2110 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: C.difficile_R20291 # Pathway: not_defined # 1 193 22 215 217 95 32.0 1e-18 FYEHGYKATTARMIAGQANINLGLIDYYFKGKEEIATLIYRDVRNSFESLFLACEPQTTP LDMFFISSALELRLCLECLPFGRFYDEIILFPAIHQRLLTFNASRIKEYGAVGKDTDYPM LAAISIASIKPALVDHALHSGHALDTGKYLSYYLEQQLHYFGLHKEMAARYQALLSGYYV NAAANFTPILTKLL Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:09 2011 Seq name: gi|229783535|gb|GG668200.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld609, whole genome shotgun sequence Length of sequence - 503 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 105 - 320 204 ## gi|266626040|ref|ZP_06118975.1| conserved hypothetical protein 2 1 Op 2 . 
+ CDS 313 - 502 105 ## Cphy_3715 hypothetical protein Predicted protein(s) >gi|229783535|gb|GG668200.1| GENE 1 105 - 320 204 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626040|ref|ZP_06118975.1| ## NR: gi|266626040|ref|ZP_06118975.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 71 35 105 105 125 100.0 8e-28 MQAIITAAMWHVSYHTTYCNALNMWIAGCQTAEEVQDISYGADVPVEYQSEVLQAYLVDI AAQAGGGTRHA >gi|229783535|gb|GG668200.1| GENE 2 313 - 502 105 63 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3715 NR:ns ## KEGG: Cphy_3715 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 3 63 14 74 146 62 52.0 6e-09 MRKYLALMATGGLLYVILELTWRGRSHWTMFLLGGICFAALGLINEILPWSMALWKQILI GTG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:17 2011 Seq name: gi|229783534|gb|GG668201.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld610, whole genome shotgun sequence Length of sequence - 676 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 141 123 ## - Prom 177 - 236 6.9 + Prom 398 - 457 3.6 2 2 Tu 1 . 
+ CDS 511 - 676 210 ## COG0534 Na+-driven multidrug efflux pump Predicted protein(s) >gi|229783534|gb|GG668201.1| GENE 1 3 - 141 123 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTEIIKQEETNGEKAAVLSFLLISAAFLMAEIGLMDISLLKAPEL >gi|229783534|gb|GG668201.1| GENE 2 511 - 676 210 55 aa, chain + ## HITS:1 COG:FN0944 KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 55 1 54 455 57 50.0 7e-09 MTETKRDANASLASEPIGRLLLKFSVPCVLSMLVSALYNIVDQIFIGQSVGYLGN Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:23 2011 Seq name: gi|229783533|gb|GG668202.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld611, whole genome shotgun sequence Length of sequence - 643 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - SSU_RRNA 1 - 494 99.0 # EF400800 [D:1..1493] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:24 2011 Seq name: gi|229783532|gb|GG668203.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld612, whole genome shotgun sequence Length of sequence - 962 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 594 630 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 2 1 Op 2 . 
+ CDS 614 - 962 203 ## COG0006 Xaa-Pro aminopeptidase Predicted protein(s) >gi|229783532|gb|GG668203.1| GENE 1 1 - 594 630 197 aa, chain + ## HITS:1 COG:BS_appC KEGG:ns NR:ns ## COG: BS_appC COG1173 # Protein_GI_number: 16078205 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 10 197 116 303 303 184 51.0 9e-47 VVAVGISTGIGILLGALAGYFGGWVDTFIMRVVDVVYCFPVLFLVIIIATILKPSIYNIM IIIGLCNWTGTARFVRGEILRVRELEYVQASISLGATDGRTIFHHIIPNIMAPIIVEATL QMARAILTEASLSYLGVGVQMPTASWGNMLMSANNLSTLTLRPWQWVPPGVCIFLAVLSI NFIGDGLRDAFDARQKK >gi|229783532|gb|GG668203.1| GENE 2 614 - 962 203 116 aa, chain + ## HITS:1 COG:BS_yqhT KEGG:ns NR:ns ## COG: BS_yqhT COG0006 # Protein_GI_number: 16079502 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Bacillus subtilis # 9 110 2 104 353 72 42.0 2e-13 MKKTQYEKRLSGLRAYLEDHGLAGALITSYENRRYFCGFTGSSGYLIVTRTHVVLITDKR YTTQAKEQTVDCEIVEHSQDRLRLVADTMKRLGITSSVMESSMTAGEYFSLKEYLG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:25 2011 Seq name: gi|229783531|gb|GG668204.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld613, whole genome shotgun sequence Length of sequence - 664 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 659 711 ## Clocel_3608 hypothetical protein Predicted protein(s) >gi|229783531|gb|GG668204.1| GENE 1 2 - 659 711 219 aa, chain - ## HITS:1 COG:no KEGG:Clocel_3608 NR:ns ## KEGG: Clocel_3608 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulovorans # Pathway: not_defined # 11 219 2222 2396 3534 69 27.0 9e-11 MNRLTKVKRADGSTSSYTYNARDQIVEAENLCSCGFLISDYQYTYDDAGLIVSETAKECL FVSNKDYGHKGGPDGECVHVSDNPWQNQNPVWEITKRTFKYDNNGELIECKEDKGMFDKT VYTYEYDSVGNRTRVKKQEVFEYRTDQTTYTYNADNQMVGAVVCEGNLTKRYTFKYDANG NLTQECLMNRAEVTYQYDTENRLKAVKDQQKLLMAAAYD Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:29 2011 Seq name: gi|229783530|gb|GG668205.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld614, whole genome shotgun sequence Length of sequence - 521 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 492 541 ## COG4869 Propanediol utilization protein Predicted protein(s) >gi|229783530|gb|GG668205.1| GENE 1 3 - 492 541 163 aa, chain - ## HITS:1 COG:lin1146 KEGG:ns NR:ns ## COG: lin1146 COG4869 # Protein_GI_number: 16800215 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Propanediol utilization protein # Organism: Listeria innocua # 1 163 32 193 213 169 53.0 2e-42 MHLTQEHLEILFGKGYELTRKKDLSQPGQYACEERVTIVGSKKELAGVSILGPVRKATQV ELSLTDARAIGVAAPIRESGDVAGSGACKIVGPCGEIEITEGVIAAKRHIHATSADAEAL GVKNGEIVSVKVDTDGRSLVFGDVVVRVSDSYALAMHIDTDES Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:30 2011 Seq name: gi|229783529|gb|GG668206.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld615, whole genome shotgun sequence Length of sequence - 587 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 2 - 389 272 ## gi|266626048|ref|ZP_06118983.1| hypothetical protein CLOSTHATH_07481 2 1 Op 2 . - CDS 405 - 587 86 ## gi|266626049|ref|ZP_06118984.1| heat shock 70-related protein 5 Predicted protein(s) >gi|229783529|gb|GG668206.1| GENE 1 2 - 389 272 129 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626048|ref|ZP_06118983.1| ## NR: gi|266626048|ref|ZP_06118983.1| hypothetical protein CLOSTHATH_07481 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07481 [Clostridium hathewayi DSM 13479] # 1 129 1 129 130 217 100.0 2e-55 MAYKLIGRKSGTVTFKPGTGTITQETMTKSSKMTSNAIEGGSSIEDHVHLNSEQFQIVGV VVKNYSSYKSRLESMWQNRDLVTYVGKFRVNNYVIINLQMKNSSANKKGFSFTATLQKAN IVSGQYVEM >gi|229783529|gb|GG668206.1| GENE 2 405 - 587 86 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626049|ref|ZP_06118984.1| ## NR: gi|266626049|ref|ZP_06118984.1| heat shock 70-related protein 5 [Clostridium hathewayi DSM 13479] heat shock 70-related protein 5 [Clostridium hathewayi DSM 13479] # 1 60 1 60 60 102 98.0 1e-20 EQNVSVSAPFTPVVNIDIHGNVDPGTAASLKDEIKQTMRELYQEFKNEDAMNMAIQQGNA Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:42 2011 Seq name: gi|229783528|gb|GG668207.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld616, whole genome shotgun sequence Length of sequence - 903 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 181 229 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit - Prom 300 - 359 4.0 2 2 Tu 1 . 
- CDS 616 - 903 235 ## gi|266626051|ref|ZP_06118986.1| conserved hypothetical protein Predicted protein(s) >gi|229783528|gb|GG668207.1| GENE 1 1 - 181 229 60 aa, chain - ## HITS:1 COG:BH3111 KEGG:ns NR:ns ## COG: BH3111 COG0016 # Protein_GI_number: 15615673 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Bacillus halodurans # 1 60 1 60 344 64 50.0 5e-11 MKDQLDKIREDALKQIEASDALEKLNEIRVAYLGKKGELTSVLKSMKDVPPEERPKVGQM >gi|229783528|gb|GG668207.1| GENE 2 616 - 903 235 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626051|ref|ZP_06118986.1| ## NR: gi|266626051|ref|ZP_06118986.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 95 1 95 95 197 98.0 2e-49 VCLLSGLKHREGEGRIYGDNWNDPEGIQYLYYSEDGIHFEPCAPFPNRASGIFIPEGEDQ KDITKYWGVSVATADAHKKRYIERFDFIKKNPQTT Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:50 2011 Seq name: gi|229783527|gb|GG668208.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld617, whole genome shotgun sequence Length of sequence - 548 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 294 - 381 54.0 # Ser GCT 0 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:51 2011 Seq name: gi|229783526|gb|GG668209.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld618, whole genome shotgun sequence Length of sequence - 614 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 42 - 612 581 ## COG1472 Beta-glucosidase-related glycosidases Predicted protein(s) >gi|229783526|gb|GG668209.1| GENE 1 42 - 612 581 190 aa, chain + ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 3 190 2 201 721 172 43.0 3e-43 MTDKIKDLVTAMTLEEKALLCSGKNFWQMEGIERLGIPSVMVTDGPHGLRKQAGEADHLG LHQSVKATCFPPAVTSASSWDKAALYDMGQAIGEECVQEEVAVVLGPGTNIKRSPLCGRN FEYFSEDPYLAGEMAAAWISGVQSKGIGTSLKHFAANNQEKARLVSNSVVDERALREIYL APFEKAVKQA Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:51 2011 Seq name: gi|229783525|gb|GG668210.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld620, whole genome shotgun sequence Length of sequence - 502 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 497 333 ## COG0148 Enolase Predicted protein(s) >gi|229783525|gb|GG668210.1| GENE 1 3 - 497 333 164 aa, chain + ## HITS:1 COG:all3538 KEGG:ns NR:ns ## COG: all3538 COG0148 # Protein_GI_number: 17231030 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Nostoc sp. PCC 7120 # 4 147 277 412 429 93 37.0 1e-19 FYIRLMDGFPVCAVFDGLAESDAEGWGLLTKILGNRAVFIGQHQYREPGGINPYERDWTG LSDWNGVEENAAGISLCGAKSVTAAMEMAELAKKNGLKLAVSQGYGETEDPYAADFAAAV NADWIRAGAPCRSERTSKYNELLRIEEWMSGYHSLPASICCDKM Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:52 2011 Seq name: gi|229783524|gb|GG668211.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld623, whole genome shotgun sequence Length of sequence - 690 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 253 - 688 362 ## COG3505 Type IV secretory pathway, VirD4 components Predicted protein(s) >gi|229783524|gb|GG668211.1| GENE 1 253 - 688 362 145 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 33 129 211 307 591 65 36.0 3e-11 MECGKMLQRGAPKLGKDGKPMKDKHGKVIYEPYRIKVLNTINFRKSMHYNPFAYIHSEKD ILKLVTTLIANTKGEGKAGDDFWVKAETLLYCALIGYIHYEAPVEEQNFSTLIEFINAME VREDDEEFKNPVDLMFDALEAEKPN Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:52 2011 Seq name: gi|229783523|gb|GG668212.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld624, whole genome shotgun sequence Length of sequence - 672 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 653 767 ## COG0466 ATP-dependent Lon protease, bacterial type Predicted protein(s) >gi|229783523|gb|GG668212.1| GENE 1 2 - 653 767 217 aa, chain - ## HITS:1 COG:CAC2637 KEGG:ns NR:ns ## COG: CAC2637 COG0466 # Protein_GI_number: 15895895 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Clostridium acetobutylicum # 1 217 8 223 778 127 33.0 2e-29 MPVIALRGMTVLPKMMIHFDISRSKSIAAVEKAMIGDQKVCLVTQKNSEEADPGIDELYQ VGCVALIKQLVKIPNNVVRVMVEGLERVELLGLDSEEPMLVGEIEGLTESDDSLDCVTRQ AMVRILKEKLEEYGRENPRMLKEVFPNLMMVTDLGELLDQIAIQLPWDYKSRQQVLECVL LEERYETVMGNLLTEIEITRVKREIQGRVKENVDKNQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:53 2011 Seq name: gi|229783522|gb|GG668213.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld625, whole genome shotgun sequence Length of sequence - 790 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 - CDS 1 - 637 695 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 2 1 Op 2 . 
- CDS 628 - 789 211 ## COG1132 ABC-type multidrug transport system, ATPase and permease components Predicted protein(s) >gi|229783522|gb|GG668213.1| GENE 1 1 - 637 695 212 aa, chain - ## HITS:1 COG:lin0155 KEGG:ns NR:ns ## COG: lin0155 COG1132 # Protein_GI_number: 16799232 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Listeria innocua # 1 212 1 212 593 102 27.0 6e-22 MSVNTSRVDEEQKEVLKKDTIIRLFRYLLIYKKQIAAVLLIMAGTIAVSMATPIMMEYAI NVCVATGDVNGLLRLGALAVVMFLFFLAGTKVRMYIMSDVSNKVLLNIRDELYQHIQTLS FGFFDSRPTGKILARIIGDVNSLKDVLSDSVTQLIPDLITVVCVAVIMLIKNYKLAMAAL LTLPILVVGMLVIETTAHKRWQIYRKKTSNLN >gi|229783522|gb|GG668213.1| GENE 2 628 - 789 211 53 aa, chain - ## HITS:1 COG:BS_yheH KEGG:ns NR:ns ## COG: BS_yheH COG1132 # Protein_GI_number: 16078037 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Bacillus subtilis # 1 48 622 669 673 66 58.0 9e-12 AHRISAVRHADEILILEGGRIIERGTHEELMAKKGQYYRTYQVQYGEEVQLCQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:54 2011 Seq name: gi|229783521|gb|GG668214.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld626, whole genome shotgun sequence Length of sequence - 696 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 253 268 ## Closa_2132 peptidase S14 ClpP - Prom 286 - 345 4.1 - Term 344 - 404 10.2 2 2 Tu 1 . 
- CDS 407 - 694 257 ## Clole_4125 major facilitator superfamily MFS_1 Predicted protein(s) >gi|229783521|gb|GG668214.1| GENE 1 1 - 253 268 84 aa, chain - ## HITS:1 COG:no KEGG:Closa_2132 NR:ns ## KEGG: Closa_2132 # Name: not_defined # Def: peptidase S14 ClpP # Organism: C.saccharolyticum # Pathway: not_defined # 1 84 1 73 260 75 60.0 7e-13 MREYKKDSEMNCDADNRTENQNGNGRSDENNTSSGDTRKDRLDEKKTEKDIEQKEDEKLE EYGQMTLDDNSKKRKIHLLSIIGE >gi|229783521|gb|GG668214.1| GENE 2 407 - 694 257 95 aa, chain - ## HITS:1 COG:no KEGG:Clole_4125 NR:ns ## KEGG: Clole_4125 # Name: not_defined # Def: major facilitator superfamily MFS_1 # Organism: C.lentocellum # Pathway: not_defined # 1 82 331 412 427 113 67.0 2e-24 NVILYRTIPQNMQGRIFAVRNAIQYWTIPTGILLGGFLADYVFEPFMCADGGIQTVLHRL TGYGAGSGMAVMFLCTGILGSLSCFFCYKKMKKNG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:00:59 2011 Seq name: gi|229783520|gb|GG668215.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld627, whole genome shotgun sequence Length of sequence - 897 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 717 763 ## COG0747 ABC-type dipeptide transport system, periplasmic component Predicted protein(s) >gi|229783520|gb|GG668215.1| GENE 1 1 - 717 763 238 aa, chain + ## HITS:1 COG:PM0592 KEGG:ns NR:ns ## COG: PM0592 COG0747 # Protein_GI_number: 15602457 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Pasteurella multocida # 1 229 314 523 531 92 27.0 8e-19 DKAKIVATVYGESGYVQDSIFPSNHWTYSDDVTKYPYDPAKAKSLLEEAGYTMNASTGFY EKNGKTLHLTYDLVTSTDGNSVAQLIQQQWKEIGVEMEIIEQDFSTLAYTKLFPSDDSGS PRRVTADDFACYTLGFGMEADPDEYRMYLTTADAPGTWNFVTYSNPEVDQLFEKQLYSTK PEERAECYHEIGKLISEDVPWIPLYGKKSLAGVSEKVQNFAADFRGITFQIEKWSVAQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:00 2011 Seq name: gi|229783519|gb|GG668216.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld628, whole genome shotgun sequence Length of sequence - 504 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 243 286 ## gi|266626062|ref|ZP_06118997.1| transcriptional repressor 2 1 Op 2 . 
- CDS 245 - 502 224 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|229783519|gb|GG668216.1| GENE 1 3 - 243 286 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626062|ref|ZP_06118997.1| ## NR: gi|266626062|ref|ZP_06118997.1| transcriptional repressor [Clostridium hathewayi DSM 13479] transcriptional repressor [Clostridium hathewayi DSM 13479] # 1 80 1 80 81 152 100.0 6e-36 MQIQFQPFTHSTDTGRLPEDSERLAIPELSDGFDRELSSLADGVAGSMTPVSYYSDRVPC GLYAIHISLTAVEEAEQIFL >gi|229783519|gb|GG668216.1| GENE 2 245 - 502 224 85 aa, chain - ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 5 68 295 358 368 65 45.0 3e-11 RGFEFQKETGIRFSDYLTNERIQKAKEYIETDGMDRISDIAERVGFGNNPQYFSQLFKKK TGMAPSAYITGLRGPSGMSGQKEEF Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:07 2011 Seq name: gi|229783518|gb|GG668217.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld629, whole genome shotgun sequence Length of sequence - 512 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 69 - 138 29.4 # Pseudo GCA 0 0 + TRNA 143 - 214 74.5 # Lys TTT 0 0 + TRNA 221 - 292 59.3 # Glu TTC 0 0 + TRNA 352 - 425 68.5 # Gln CTG 0 0 + TRNA 430 - 501 34.4 # Gln TTG 0 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:07 2011 Seq name: gi|229783517|gb|GG668218.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld631, whole genome shotgun sequence Length of sequence - 1463 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 3 - 450 587 ## Closa_2187 AAA ATPase 2 1 Op 2 . - CDS 512 - 1321 970 ## COG0327 Uncharacterized conserved protein 3 1 Op 3 . - CDS 1290 - 1463 191 ## gi|266626066|ref|ZP_06119001.1| conserved hypothetical protein Predicted protein(s) >gi|229783517|gb|GG668218.1| GENE 1 3 - 450 587 149 aa, chain - ## HITS:1 COG:no KEGG:Closa_2187 NR:ns ## KEGG: Closa_2187 # Name: not_defined # Def: AAA ATPase # Organism: C.saccharolyticum # Pathway: Pyrimidine metabolism [PATH:csh00240]; Metabolic pathways [PATH:csh01100] # 1 149 1 149 553 221 73.0 8e-57 MAFVTIGDVTKEYPNGTTYLEIAEEYQPQYEDDILLVRINGKLRELHKKVKFDCRLEFYT GRDQPGIQTYHRSATFLMLKAFYDVVGVEKIEKVTVDFSLGKGYYIEPHGSFTLTEELIG RVKARMHEYVEEKIPIMKRSENTDDAIEL >gi|229783517|gb|GG668218.1| GENE 2 512 - 1321 970 269 aa, chain - ## HITS:1 COG:FN1316 KEGG:ns NR:ns ## COG: FN1316 COG0327 # Protein_GI_number: 19704651 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 248 1 240 258 142 30.0 5e-34 MKCLELIQKLETLAPADCACEWDNVGLLVGWREREIHRILIALDATDEVVEEAVRMRADL LLTHHPLIFKPLRKVNDDDFIARRVMELIQYNVNYYAMHTNFDAAPGCMADLAAERLGLS ETRVLEVAGTMETDGRPVEYGIGKVGLLPSAMTVKELALLVKERFHLPFITVYGEHAAGE TVTRVAIAPGSGKSSIAFAEKAGAEVLVTGDIGHHEGIDAAANHLTVLDAGHYGLEHLFI GFMADYLEREFGGRLEIHKAAAAFPAAVL >gi|229783517|gb|GG668218.1| GENE 3 1290 - 1463 191 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626066|ref|ZP_06119001.1| ## NR: gi|266626066|ref|ZP_06119001.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 2 57 2 57 57 70 100.0 5e-11 AVLAEFLKKEQEQLEQILDTLKDSDTPAALKRRVEMMEKLAWNKEAQDEMLGTDTEA Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:16 2011 Seq name: gi|229783516|gb|GG668219.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld632, whole genome shotgun sequence Length of sequence - 800 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 
average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 792 426 ## COG0550 Topoisomerase IA Predicted protein(s) >gi|229783516|gb|GG668219.1| GENE 1 1 - 792 426 263 aa, chain + ## HITS:1 COG:CAC2947_1 KEGG:ns NR:ns ## COG: CAC2947_1 COG0550 # Protein_GI_number: 15896200 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 259 361 612 618 145 36.0 9e-35 VTDHHAVIPTRNLKDADLSALPAGEKAVLELVALRLLCAVAQPHIYSETVVIAACAGGEF TAKGKTVKHPGWKALEDAYRAKMKDAEPKKDGAEKALPELTEGQTLSVAAAIVKEGKSSP PQHFTEDTLLSAMETAGKEDMPEDAERKGLGTPATRAGILEKLVSAGFLERKKSRKTVQL LPSHDAVSLITVLPEQLQSPLLTAEWEYRLGEIERGQLAPEEFLDGISTMLKDLVGTYQV IKGTEYLFSPPRDGGRQMPSLRR Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:16 2011 Seq name: gi|229783515|gb|GG668220.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld633, whole genome shotgun sequence Length of sequence - 582 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:17 2011 Seq name: gi|229783514|gb|GG668221.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld635, whole genome shotgun sequence Length of sequence - 859 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:17 2011 Seq name: gi|229783513|gb|GG668222.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld636, whole genome shotgun sequence Length of sequence - 903 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 901 815 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|229783513|gb|GG668222.1| GENE 1 1 - 901 815 300 aa, chain - ## HITS:1 COG:SP0661 KEGG:ns NR:ns ## COG: SP0661 COG4753 # Protein_GI_number: 15900562 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 1 89 57 145 245 68 32.0 1e-11 IRMPKMDGLAFITESKKRYPGINYIIMSAYSDFSYAKTAIQLGVEAYLLKPVNKEELEKM LGQLLHKANEEKLDRMLRSISVKEPEKNVIFQYRYVTALAIYAPSDEENGVEIRMKVEDH LSQYSDCSAYYLRDCSRSSCFMFLINTETQSQDIAKKCGSDLLELMGDQKMRAAVSTVFE RGEFKVAVAQSLCFLKRKMFCPQKKIISYSSCENRDSSEKAEKKRKMRGKLGQLYSLIRK EEFDRLEGELLGIINELISETNSITLIEDGVGELLVLLGHLPREAGGDMDFEILFHDLKS Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:18 2011 Seq name: gi|229783512|gb|GG668223.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld637, whole genome shotgun sequence Length of sequence - 1034 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 1032 837 ## COG0840 Methyl-accepting chemotaxis protein Predicted protein(s) >gi|229783512|gb|GG668223.1| GENE 1 3 - 1032 837 343 aa, chain + ## HITS:1 COG:CAC3352 KEGG:ns NR:ns ## COG: CAC3352 COG0840 # Protein_GI_number: 15896595 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 1 335 11 347 703 90 22.0 6e-18 IQTKFITLVLSCVFVCSAVIGGAGIISAKRVVDEDSAQMMNYRCSELACEVDAMLSRIEQ SVKTLAVYTDENLESVELLKSDDTRKAFTEQIEAVAVNAANNTEGAVAVYVRYNPDFTPP TSGVFWSKTNLNGTFQKQVPTDFSRYNPTDTEHVGWYYIPVRNGRAIWMSPYTNENIDIQ MISYVIPIYKNNETVGVVGMDIDFSVIEEMINRVRIYESGSAFLTDEKGTVMCHKVYPFG ISMGNVDESLIPLVAELENGTSGSSLFSYTNENVKRQMAFRTLRNGMRLAVTAPLSEIDK NKNLLLLQIVAAFLVIAPVSVWVTVLITRRMVRPLKELNEAAK Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:19 2011 Seq name: gi|229783511|gb|GG668224.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld638, whole genome shotgun sequence Length of sequence - 623 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 463 479 ## Ethha_1915 relaxase/mobilization nuclease family protein - Prom 487 - 546 3.9 Predicted protein(s) >gi|229783511|gb|GG668224.1| GENE 1 1 - 463 479 154 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1915 NR:ns ## KEGG: Ethha_1915 # Name: not_defined # Def: relaxase/mobilization nuclease family protein # Organism: E.harbinense # Pathway: not_defined # 1 147 306 452 468 174 71.0 1e-42 MAQTMNYLSEHNLLEYAVLEEKATAATAHHNELSAQIKAAEKRMAEIAVLRTHIVNYAKT REVYVAYRKAGYSKKFREEHEEEILLHQAAKNAFDEMGVKKLPKVKELQTEYAKLLEEKK KTYAEYRRSREEMRELLTAKANVDRVLKMEVEQD Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:22 2011 Seq name: gi|229783510|gb|GG668225.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld641, whole genome shotgun sequence Length of sequence - 637 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 105 148 ## gi|266621114|ref|ZP_06114049.1| ABC-type dipeptide/oligopeptide/nickel transport system, periplasmic component + Term 138 - 174 6.2 2 2 Tu 1 . 
+ CDS 229 - 637 461 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components Predicted protein(s) >gi|229783510|gb|GG668225.1| GENE 1 1 - 105 148 34 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621114|ref|ZP_06114049.1| ## NR: gi|266621114|ref|ZP_06114049.1| ABC-type dipeptide/oligopeptide/nickel transport system, periplasmic component [Clostridium hathewayi DSM 13479] ABC-type dipeptide/oligopeptide/nickel transport system, periplasmic component [Clostridium hathewayi DSM 13479] # 1 34 540 573 573 67 97.0 4e-10 TFTYDKNVAKSSAKGFYINGAGGPAIELKSAYVE >gi|229783510|gb|GG668225.1| GENE 2 229 - 637 461 136 aa, chain + ## HITS:1 COG:CAC3631 KEGG:ns NR:ns ## COG: CAC3631 COG0601 # Protein_GI_number: 15896865 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Clostridium acetobutylicum # 1 136 1 136 310 113 48.0 9e-26 MVKYILKKIGYMIVTLWVILTITFFLVSVIPGDPMQADTKVLPEAVVENLKARWGLDKPI GERYVIYLKNLLHGELGESYKTPGLTANQIIKDRFPASARLGIQAVVLGLVLGLLLGILA AFHRGTWIDFITIFIA Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:27 2011 Seq name: gi|229783509|gb|GG668226.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld643, whole genome shotgun sequence Length of sequence - 575 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 271 301 ## PROTEIN SUPPORTED gi|239626258|ref|ZP_04669289.1| ribosomal protein L7Ae/L30e/S12e/Gadd45 2 1 Op 2 . 
+ CDS 258 - 573 247 ## Closa_2982 translation initiation factor IF-2 Predicted protein(s) >gi|229783509|gb|GG668226.1| GENE 1 2 - 271 301 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239626258|ref|ZP_04669289.1| ribosomal protein L7Ae/L30e/S12e/Gadd45 [Clostridiales bacterium 1_7_47_FAA] # 1 72 23 94 105 120 80 2e-28 GEFMTEKSVKSGKAFLVIVSEEASDNTRKMFTNMCTYYKVPIYFFGKKSELGHAMGKEMR ASLVMMDAGFAKAVVKLLNTNGGSKYESI >gi|229783509|gb|GG668226.1| GENE 2 258 - 573 247 105 aa, chain + ## HITS:1 COG:no KEGG:Closa_2982 NR:ns ## KEGG: Closa_2982 # Name: not_defined # Def: translation initiation factor IF-2 # Organism: C.saccharolyticum # Pathway: not_defined # 1 104 1 93 1038 99 62.0 3e-20 MRVYDLAKELGKDSSKEILDILEKHDINLKSSSNITDDQASIVRKAMGGAAKSSEGRQNA PAPAKPAGEDSAKTEGSGEPHKKKIAAVFRPQNAQMKPQRPQGQG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:30 2011 Seq name: gi|229783508|gb|GG668227.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld644, whole genome shotgun sequence Length of sequence - 704 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 704 594 ## Clole_4025 acyl-CoA reductase Predicted protein(s) >gi|229783508|gb|GG668227.1| GENE 1 2 - 704 594 234 aa, chain + ## HITS:1 COG:no KEGG:Clole_4025 NR:ns ## KEGG: Clole_4025 # Name: not_defined # Def: acyl-CoA reductase # Organism: C.lentocellum # Pathway: not_defined # 2 234 181 413 436 186 41.0 1e-45 TFRIPSGDRERILKLLTLSDGVCTWGGDDAVRGIRSLAPAGTKLIEWGHRISFACVTRAC TEREEALQAIADHMIRTNQLLCSSCQGIYVEAEGREAEEFCRRFLKLLEYSCEKYGNKEA GFRGLSTIRMYRKELETAMTGEHMFRGDGVSVTLAEDPKLQLSLMYGSCWVKPVNRERLV EILRKDRGKLQTARLVCLEEEEQEMVRILVRAGVNRILTGREPDSEVYFDSHDG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:35 2011 Seq name: gi|229783507|gb|GG668228.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld645, whole genome shotgun sequence Length of sequence - 540 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 86 - 539 578 ## COG0577 ABC-type antimicrobial peptide transport system, permease component Predicted protein(s) >gi|229783507|gb|GG668228.1| GENE 1 86 - 539 578 151 aa, chain + ## HITS:1 COG:SP0913 KEGG:ns NR:ns ## COG: SP0913 COG0577 # Protein_GI_number: 15900794 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 8 151 7 150 662 80 32.0 1e-15 MSKLFYTRLAVQNLKKNSAVTIPWLLTCIGSVMMYYIISSLTVNPGFAHMQGGGSMVFIL LLGCVVMAVFSFLFLIYTNSFLVKRRKQEFGLFQVLGMGKKHLARIMALETLITAAVSLV AGVALGFLLYRVFALLLYRLMKLDISWDFSF Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:35 2011 Seq name: gi|229783506|gb|GG668229.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld648, whole genome shotgun sequence Length of sequence - 702 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 
1 1 Tu 1 . + CDS 78 - 702 458 ## COG5484 Uncharacterized conserved protein Predicted protein(s) >gi|229783506|gb|GG668229.1| GENE 1 78 - 702 458 208 aa, chain + ## HITS:1 COG:lin1733 KEGG:ns NR:ns ## COG: lin1733 COG5484 # Protein_GI_number: 16800801 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 186 1 247 294 90 34.0 2e-18 MARPRSPNRDKACELWLESGKKRPLKDIAAELKVSEEQIRKWKNQDKWDKVTLPNAKSNV TNRKGGQPGNKNAVGHGGTGPPGNKNAVKHGAYEQIYYEALPEEERSLFDSIPDTAVLDG EIKLLRLKLARLIGRSEIKTYDMFGGEHKREITEAEREKGILEVTAELRKLIKTKKQIEI AELKAGGGDPDESEDDGFMAALEGTAAE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:36 2011 Seq name: gi|229783505|gb|GG668230.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld649, whole genome shotgun sequence Length of sequence - 691 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 690 346 ## COG0753 Catalase Predicted protein(s) >gi|229783505|gb|GG668230.1| GENE 1 3 - 690 346 229 aa, chain + ## HITS:1 COG:NMA0050 KEGG:ns NR:ns ## COG: NMA0050 COG0753 # Protein_GI_number: 15793081 # Func_class: P Inorganic ion transport and metabolism # Function: Catalase # Organism: Neisseria meningitidis Z2491 # 1 226 191 416 504 298 60.0 4e-81 ASFRHMNLYGEHTFSFYNKDNQRFWCKFHFITQQGIKNLTDEEAEALIAKDRESHGRDLY ESIVKGEYPRWTMYVQIMTEEQARNHYENPFDITKIWRHREFPLQKVGVLELNRNPENYF AEVEQSAFTPAHVVPGIGFSPDKFLQGRLFVYGDAQRYRLGINYNQIPVNRPKVEVHDYH RDGLMRTDGNYGGAPAYSPNSMGDWAAQPEVMEPPLDLSGSMYAYDPQD Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:37 2011 Seq name: gi|229783504|gb|GG668231.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld650, whole genome shotgun sequence Length of sequence - 772 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 627 863 ## COG0263 Glutamate 5-kinase 2 1 Op 2 . 
+ CDS 639 - 771 191 ## Predicted protein(s) >gi|229783504|gb|GG668231.1| GENE 1 1 - 627 863 208 aa, chain + ## HITS:1 COG:lin1228 KEGG:ns NR:ns ## COG: lin1228 COG0263 # Protein_GI_number: 16800297 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Listeria innocua # 1 190 75 263 276 179 48.0 4e-45 EKQAFAAVGQARLMMVYQKLFAEYNQVAAQVLLTKDTMINDSSRYNAQNTFDELLKLGTI PIVNENDTVSTSEIPYVDNFGDNDRLSAIVAALIGADLLILLSDIDGLYSDDPRQNPKAQ FIRQVDEITPELMDMGKATSGSDVGTGGMAAKLAAARIATDSGSDMVIANGDDVEVVSQI MTGADKGTLFLAHPNFDFDLMHYINNEY >gi|229783504|gb|GG668231.1| GENE 2 639 - 771 191 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQQEKIDRINTLYHKAQAVGLSEEEKAEQAALRKEYIEAIRMSL Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:42 2011 Seq name: gi|229783503|gb|GG668232.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld651, whole genome shotgun sequence Length of sequence - 621 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 620 672 ## Closa_2028 cytidylate kinase Predicted protein(s) >gi|229783503|gb|GG668232.1| GENE 1 2 - 620 672 206 aa, chain + ## HITS:1 COG:no KEGG:Closa_2028 NR:ns ## KEGG: Closa_2028 # Name: not_defined # Def: cytidylate kinase # Organism: C.saccharolyticum # Pathway: not_defined # 1 206 5 210 213 360 84.0 2e-98 LVITIGRQCGSSGKIIGQKLAEEMGVKCYDKELLALAAKNSGLCEELFETHDEKPTNSFL YSLVMDTYSMGYTTSGYMDMPINHKIFLAQFDTIKQLADQESCVIVGRCADYALADYPKV VSVFITASDEDRIASLKKLYNVEEAKAKDIMVKTDKKRASYYNYYSSKKWGDTRSYDLCI NRSAVGVDGAVKMIRAFAETKLEWLK Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:46 2011 Seq name: gi|229783502|gb|GG668233.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld652, whole genome shotgun sequence Length of sequence - 655 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 83 - 565 602 ## COG3544 Uncharacterized protein conserved in bacteria + Term 583 - 624 7.2 Predicted protein(s) >gi|229783502|gb|GG668233.1| GENE 1 83 - 565 602 160 aa, chain + ## HITS:1 COG:all7633 KEGG:ns NR:ns ## COG: all7633 COG3544 # Protein_GI_number: 17158769 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 8 158 80 235 242 69 32.0 2e-12 MRDMENIPKTGNASIDFLNGMIPHHESAVEMADSYLEYGGNNETLKKLAETIKTTQTEEI RQMRDLIRKYEAEGHNDADQESAYLEEYDKMLDHDHSMNKAGTDSLDHAFAEGMIMHHQM AVDMAKSILEYTNYEEIRTLAQSIIDAQEKEIEEMKEFTS Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:01:46 2011 Seq name: gi|229783501|gb|GG668234.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld653, whole genome shotgun sequence Length of sequence - 738 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 737 754 ## gi|266626086|ref|ZP_06119021.1| conserved hypothetical protein Predicted protein(s) >gi|229783501|gb|GG668234.1| GENE 1 2 - 737 754 245 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626086|ref|ZP_06119021.1| ## NR: gi|266626086|ref|ZP_06119021.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 245 1 245 246 239 99.0 2e-61 LEAAKRALDFGYQDLEEIKRDSEEKLADLKRDYNQSMTRSEEEMEEETRRQYRTAERAYD AAVARKESAVRKAEREVADCEEKLARLEEEGASDEETEKAQKALDRANEDLEEVREEQSL NVAEAKEALNSAEEDYGDVSAGRRTAAEALKSSYEASVQAVEAQIKAGEKAVRDLEEALS QAELAVTNARARDAVTAAENGKEQTAASLDRQGKQLDITKAETQLKELEALEAAGGAVTA PVAGV Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:01 2011 Seq name: gi|229783500|gb|GG668235.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld654, whole genome shotgun sequence Length of sequence - 581 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 454 151 ## PROTEIN SUPPORTED gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 - Prom 508 - 567 10.1 Predicted protein(s) >gi|229783500|gb|GG668235.1| GENE 1 2 - 454 151 151 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 [Roseobacter sp. 
AzwK-3b] # 4 149 2 137 305 62 28 5e-11 MEQNLSQYKIFYEVAKAGNISKAAKELYISQPAISKSISKLEDSLGVSLFTRNSRGVQLT SEGELLFHHTESAFEALSRGENELKRIKDFNIGHLRIGVSNTLCKYILLPYLKGFIEKYP HVKITIESQSTTHTIAMLEQQHIDLGLIAEP Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:02 2011 Seq name: gi|229783499|gb|GG668236.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld655, whole genome shotgun sequence Length of sequence - 555 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 473 599 ## COG1175 ABC-type sugar transport systems, permease components Predicted protein(s) >gi|229783499|gb|GG668236.1| GENE 1 3 - 473 599 156 aa, chain + ## HITS:1 COG:BH1077 KEGG:ns NR:ns ## COG: BH1077 COG1175 # Protein_GI_number: 15613640 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 1 156 155 314 315 148 50.0 4e-36 GIVNKLTGLNINWISDPNYALLCVALVTAWLNSGINFLYFSAGLSNIDESIYERASVDGA GAWAKFFHLTLPGLSPILFYTLVVNIIQAFQSFGQVKVLTQGGPGESTNLIVYSIYRDAF FNYRFGGAAAQSVLLFGIIMILTLCMFRLEKKGVNY Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:02 2011 Seq name: gi|229783498|gb|GG668237.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld657, whole genome shotgun sequence Length of sequence - 800 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 800 873 ## COG0845 Membrane-fusion protein Predicted protein(s) >gi|229783498|gb|GG668237.1| GENE 1 2 - 800 873 266 aa, chain + ## HITS:1 COG:CAC0318 KEGG:ns NR:ns ## COG: CAC0318 COG0845 # Protein_GI_number: 15893610 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Clostridium acetobutylicum # 17 254 57 289 392 85 30.0 7e-17 IVRYVTIEAANPGGLTTDMAATATIGDFICSSEGTFEPTLETVMNADISSSVEVGKLLVN EGDYVGRGTPVFSMESKSAEKLLRTYKDSVEKAEETLESARNKVDSTQDNYDNYTITAPI SGQVITKNVKAGDKVAKSNSGSTTLAVIYDMSGYTFEMSVDELDVKEVAVGQSVVITADA VSGKTFSGTVTNVSLQSSYSNGVTNYPVTVTLNDGMDELLPGMNVDGVIILDQASDVLTV PADALMRGNKVYVKDGTVTEAQGNIP Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:03 2011 Seq name: gi|229783497|gb|GG668238.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld658, whole genome shotgun sequence Length of sequence - 604 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 221 193 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 318 - 377 3.7 - Term 262 - 308 12.1 2 2 Tu 1 . 
- CDS 449 - 604 79 ## Predicted protein(s) >gi|229783497|gb|GG668238.1| GENE 1 2 - 221 193 73 aa, chain - ## HITS:1 COG:BH3426 KEGG:ns NR:ns ## COG: BH3426 COG0745 # Protein_GI_number: 15615988 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 7 73 6 73 241 60 48.0 8e-10 MDYAYRILAVDDEPDILRTNRRYLEAQGYQVDTAACASEALEHLKNQNYDAILLDVLLPD RNGFALCEAIRAL >gi|229783497|gb|GG668238.1| GENE 2 449 - 604 79 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no VLKRQKIREAKAAAVRKHRREQWMQESGCSAEEFEALLARRRGSSGKKHRK Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:08 2011 Seq name: gi|229783496|gb|GG668239.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld659, whole genome shotgun sequence Length of sequence - 711 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 148 129 ## gi|266626092|ref|ZP_06119027.1| carbamate kinase + Term 164 - 227 6.1 + Prom 173 - 232 8.1 2 2 Tu 1 . 
+ CDS 292 - 709 221 ## COG2233 Xanthine/uracil permeases Predicted protein(s) >gi|229783496|gb|GG668239.1| GENE 1 2 - 148 129 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626092|ref|ZP_06119027.1| ## NR: gi|266626092|ref|ZP_06119027.1| carbamate kinase [Clostridium hathewayi DSM 13479] carbamate kinase [Clostridium hathewayi DSM 13479] # 1 48 1 48 48 85 100.0 2e-15 KIQTLSVPTLEPKGFEFADSKDGRNALITLLEKAKEGILGKTGTRIHQ >gi|229783496|gb|GG668239.1| GENE 2 292 - 709 221 139 aa, chain + ## HITS:1 COG:PAB1838 KEGG:ns NR:ns ## COG: PAB1838 COG2233 # Protein_GI_number: 14520997 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Pyrococcus abyssi # 22 132 17 118 427 60 36.0 1e-09 MKQSNKEYASIFQLDGIPKFSQALPLALQHVVAMIVGCVTPAIIVSNVANLSGADRVILI QAALVVSALSTLLQLFPIGKKGSFRLGAALPVIMGISFAYVPSMQSIAADFGIPAILGAQ IVGGVVAFIVGAFVMQIQT Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:14 2011 Seq name: gi|229783495|gb|GG668240.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld660, whole genome shotgun sequence Length of sequence - 739 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 306 170 ## COG1396 Predicted transcriptional regulators - Prom 332 - 391 3.5 2 1 Op 2 . 
- CDS 403 - 738 270 ## COG0789 Predicted transcriptional regulators Predicted protein(s) >gi|229783495|gb|GG668240.1| GENE 1 3 - 306 170 101 aa, chain - ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 4 98 3 95 195 67 34.0 7e-12 MDTLGKRIAYYRKLQGLSQEKVAEHIGISRQAVTKWENDNSRPNTDNLLQLSALFEIPLN ELVSSYSEDHSPEEKATRIDENGLLRNRMRIIIPIFLFTCG >gi|229783495|gb|GG668240.1| GENE 2 403 - 738 270 111 aa, chain - ## HITS:1 COG:CAP0178 KEGG:ns NR:ns ## COG: CAP0178 COG0789 # Protein_GI_number: 15004881 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 111 9 119 123 113 54.0 8e-26 SRLTGLGIHTLRYYEQEGLIAPERNSGNRRRYSDRDIAWISFIKRLKDTGMPIKEIKRYA QLRAEGNPTLQARLEMLVQHRQALNEQIMQLQEHRDRLDEKIDFYRNEIGK Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:14 2011 Seq name: gi|229783494|gb|GG668241.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld661, whole genome shotgun sequence Length of sequence - 592 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 109 - 255 62 ## gi|288871780|ref|ZP_06119031.2| conserved hypothetical protein Predicted protein(s) >gi|229783494|gb|GG668241.1| GENE 1 109 - 255 62 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871780|ref|ZP_06119031.2| ## NR: gi|288871780|ref|ZP_06119031.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 48 1 48 48 84 100.0 3e-15 MIEEINKKISTKEKFICVSRSRRFGKTMALEMLASYYIKEGDCGLFIQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:19 2011 Seq name: gi|229783493|gb|GG668242.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld662, whole genome shotgun sequence Length of sequence - 602 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 504 476 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 544 - 591 6.5 Predicted protein(s) >gi|229783493|gb|GG668242.1| GENE 1 1 - 504 476 167 aa, chain + ## HITS:1 COG:aq_158 KEGG:ns NR:ns ## COG: aq_158 COG0494 # Protein_GI_number: 15605731 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Aquifex aeolicus # 18 148 1 125 134 72 37.0 4e-13 KTGIPDRDARFQSGGIAMIEATSCGGVVIFRGKILVLYKNYKNKYEGWVLPKGTVEAGEE YKETALREVKEETGVSASIIKYIGKSQYSFNTPQDMVEKDVHWYLMMADSYYSKPQREEY FIDSGYYKFHEAYHLLKFSNEKQILEKAYNEYLDLKKSNLWGNKKYF Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:20 2011 Seq name: gi|229783492|gb|GG668243.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld663, whole genome shotgun sequence Length of sequence - 701 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 
1 . + CDS 11 - 701 736 ## COG0282 Acetate kinase Predicted protein(s) >gi|229783492|gb|GG668243.1| GENE 1 11 - 701 736 230 aa, chain + ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 1 230 128 357 403 289 61.0 3e-78 MGIEACEEAMPGTPNVAVFDTAFGMKMPEKASLYAIPYEYYEKYSIRRYGFHGTSHSYVS KEAIKYCELDPEKAKVIVCHLGNGASVSASIGGKCVDTSMGLTPLEGLIMGTRSGDIDPA VVQFICNKEGKDVNEVLNILNKKSGILGMSGGVSSDFRDVQKAQGEGNHKADVAIQAFIY RVAKYIGAYVAAMNGVDAIVFTAGVGENDKPIRGAVCEYLGYLGIEIDPE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:21 2011 Seq name: gi|229783491|gb|GG668244.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld664, whole genome shotgun sequence Length of sequence - 945 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 72 - 776 763 ## COG0765 ABC-type amino acid transport system, permease component - Prom 819 - 878 6.8 Predicted protein(s) >gi|229783491|gb|GG668244.1| GENE 1 72 - 776 763 234 aa, chain - ## HITS:1 COG:BS_ytmL KEGG:ns NR:ns ## COG: BS_ytmL COG0765 # Protein_GI_number: 16079988 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Bacillus subtilis # 1 229 1 227 239 155 38.0 6e-38 MENTYHLSQVPGYIPRILKAFPMTVEILSLSLLFSLVIGTLVAAGALSSHRILGRISKGY ISFMRGIPTLVLIFLLYLGLPQVMAALGVDLSGVSKTSYIIASLSLSTSANMAEMMRAAY LAVEKGQREAAYSVGMKGSSAFFRIIFPQAFGIAIPMLGNNIILLFKETSLAFTVGVIDI LGKARAISSASYGSNRLEVYLAAGIIFWAVCALLEQLSKWAEKLYTKGRRQAAG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:21 2011 Seq name: gi|229783490|gb|GG668245.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld665, whole genome shotgun sequence Length of sequence - 578 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 576 520 ## COG0173 Aspartyl-tRNA synthetase Predicted protein(s) >gi|229783490|gb|GG668245.1| GENE 1 3 - 576 520 191 aa, chain + ## HITS:1 COG:CAC2269 KEGG:ns NR:ns ## COG: CAC2269 COG0173 # Protein_GI_number: 15895537 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 189 121 309 595 263 63.0 1e-70 ENSKTKEELRLKYRYLDLRRPDIQRNIMVRSQAAIITRAFLAEEGFFEIETPTLIKSTPE GARDYLVPSRVHPGSFYALPQSPQLFKQLLMCSGYDRYFQLARCYRDEDLRADRQPEFTQ IDLELSFVDVEDVLDVNERLLKRLFKEICNFDLELPIPRMTWHEAMDRFGSDKPDLRFGM ELKNVSDVVRG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:22 2011 Seq name: gi|229783489|gb|GG668246.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld666, whole genome shotgun sequence Length of sequence - 593 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 61 - 592 422 ## COG0591 Na+/proline symporter Predicted protein(s) >gi|229783489|gb|GG668246.1| GENE 1 61 - 592 422 177 aa, chain + ## HITS:1 COG:PAB1841 KEGG:ns NR:ns ## COG: PAB1841 COG0591 # Protein_GI_number: 14520992 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Pyrococcus abyssi # 2 157 129 287 537 59 29.0 4e-09 MISVVYLMGQIAGQFVACGTMAHLLGLCSFQTGIVAGGIIMILLSVSGGLSSVTITDSIQ QIFITIMCVIVVPILAFSNAGGLKAVAAATDPVKMSLFQGAPSGYVIGTFLSLVLAYSCE PSFAQRIFAAKDEKSVVKETMFACLLGFLFTVPMWGAALTMPILFPEESSLVFLPLV Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:23 2011 Seq name: gi|229783488|gb|GG668247.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld668, whole genome shotgun sequence Length of sequence - 671 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 8 - 607 580 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component Predicted protein(s) >gi|229783488|gb|GG668247.1| GENE 1 8 - 607 580 199 aa, chain - ## HITS:1 COG:CAC0618 KEGG:ns NR:ns ## COG: CAC0618 COG0600 # Protein_GI_number: 15893906 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Clostridium acetobutylicum # 4 199 66 261 264 162 46.0 3e-40 MVKDGTIFLHTGVTVMETLVSFALVVVVGLATAILLWSSRSVSEVLEPYLVMLNSLPKSA LAPILIVWLGNNIKTIIVAAISVAVFGCIMTLHTGFMQTDPDKIKLIYSLGGTKKDVLTK VLLPGSVPLIISNMKVNIGLCLVGVIIGEFLAANKGLGYLIIYGSQVFKMDLVVMSIVIL CIVSAILYQGITILEKKMK Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:23 2011 Seq name: gi|229783487|gb|GG668248.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld670, whole genome shotgun sequence Length of sequence - 706 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 705 731 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|229783487|gb|GG668248.1| GENE 1 3 - 705 731 234 aa, chain - ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 105 52 152 257 76 45.0 4e-14 RPDILITDINMPLIDGLDVLKRLEEAGEKLPKTIVISGYDEFEYARRAIELEVMAYLLKP LERSAVYEAVEKAVAALEKEQERKAQTRDGFTAGVENILTEYLRSPGDETKRRLCRLMDS EQGRSGFYELALFQMRRLEKSALTREEIRSRLEEAAAGKAIYLCPLDRFTWGVLITDIVK PAGFRLDESVLVKLLHNYEYGLSEAHGDPGEIDEAFAEARESIILRLGREGKTE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:24 2011 Seq name: gi|229783486|gb|GG668249.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld671, whole genome shotgun sequence Length of sequence - 919 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 867 720 ## Ethha_1905 hypothetical protein Predicted protein(s) >gi|229783486|gb|GG668249.1| GENE 1 3 - 867 720 288 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1905 NR:ns ## KEGG: Ethha_1905 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 274 48 273 817 290 58.0 3e-77 MLIYMQKPDATLVAGYNKWKDQFERHVKKGEHGITIIAPTPYKKKIEEQKLDPDTKAPIL DKDGKIVTEEKEIEIPMFRPVKVFDVSQTDGKPLPELASSLSGNVPNYEAFMEALRRSAP VPITFEAMAADTDGYFSADHQKIAIRQGMSEVQTVSATVHEIAHSKLHDPKKYEMLPSWK VVLESEGGTKHDFKLDFATEREAEQFASDMDWRYVDENQFEWRLAVEEDPTAEKQAIKNR HTEEVEAESISYAVCKYFGIETGENSFGYIASWSSGQGTERAESQFGD Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:29 2011 Seq name: gi|229783485|gb|GG668250.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld672, whole genome shotgun sequence Length of sequence - 717 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 717 679 ## gi|266626105|ref|ZP_06119040.1| conserved hypothetical protein Predicted protein(s) >gi|229783485|gb|GG668250.1| GENE 1 3 - 717 679 238 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626105|ref|ZP_06119040.1| ## NR: gi|266626105|ref|ZP_06119040.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 238 1 238 238 437 100.0 1e-121 FFINYGPVKKAFTGIFIVSLSLCTIFTVIACAQAAANYTDTGRKVRDVLHDLWKALLTYI LVQVLVIATIGLTNNLLTVVDKSIQVASMDNPSEIKSDGIRVANVIFISTTLSAAKNTTA TIHDPLNADPWKSYYTGAAKYTDDNALKNDFDATKVDYVSGYLCCIFMVFVMFSAVFLFI QRFFEVVLLYIISPFFVASMPLDGGSRFSAWREMFIAKCLASYGIVVIMRLFMIFLPI Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:42 2011 Seq name: gi|229783484|gb|GG668251.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld673, whole genome shotgun sequence Length of sequence - 775 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 138 109 ## COG1032 Fe-S oxidoreductase - Prom 299 - 358 7.8 - Term 288 - 335 1.1 2 2 Tu 1 . 
- CDS 383 - 547 90 ## gi|266626107|ref|ZP_06119042.1| conserved hypothetical protein - Prom 605 - 664 2.8 Predicted protein(s) >gi|229783484|gb|GG668251.1| GENE 1 3 - 138 109 45 aa, chain - ## HITS:1 COG:CAC1254 KEGG:ns NR:ns ## COG: CAC1254 COG1032 # Protein_GI_number: 15894536 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 8 45 6 43 622 61 60.0 4e-10 MRKLALNDEILLSIQQPARYIGGEVNTVMKDSAKADIRFAMCFPD >gi|229783484|gb|GG668251.1| GENE 2 383 - 547 90 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626107|ref|ZP_06119042.1| ## NR: gi|266626107|ref|ZP_06119042.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 54 5 58 58 93 98.0 4e-18 MGTVSDVNNAEYPSEEFHGSRLETGQEVYASQAIPEKLYVKYDRGFGLFERSAE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:47 2011 Seq name: gi|229783483|gb|GG668252.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld674, whole genome shotgun sequence Length of sequence - 722 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 720 653 ## Closa_2131 cell division protein FtsK/SpoIIIE Predicted protein(s) >gi|229783483|gb|GG668252.1| GENE 1 3 - 720 653 239 aa, chain - ## HITS:1 COG:no KEGG:Closa_2131 NR:ns ## KEGG: Closa_2131 # Name: not_defined # Def: cell division protein FtsK/SpoIIIE # Organism: C.saccharolyticum # Pathway: not_defined # 15 237 161 351 902 155 43.0 1e-36 GGGLIGGLLAEGLISIIGTVGAYLVILVLIIISAVCITEKSFVNLIKSGSGKAYHHAREN MDIQREIHAERREERRRIREEQKLRGVNLDVTKLEAPEAYEMEDEAVNSAEAEDDIDVHA GRLAAVSAVEVPESAKQAVPEVQPNPADIFRGSIFPKPGEEEDLPRVYSASEDDVPFDAD DAVPFVTEELWESSPDLRSAECDAGGFAESDDYGTSGTLFVKKEMKTLEEMEVAREELY Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:52 2011 Seq name: gi|229783482|gb|GG668253.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld676, whole genome shotgun sequence Length of sequence - 534 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 424 416 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 469 - 528 5.8 Predicted protein(s) >gi|229783482|gb|GG668253.1| GENE 1 1 - 424 416 141 aa, chain - ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 134 1 134 526 112 39.0 2e-25 MYRVVIIDDEPIIVEGISKVVPWADYGCEVTATACSGLEGLEIIRKLRPDIIFTDISMPG MDGLSMIAALRVEFPEMMIAILTGYRDFDYAQKAIRLGVNRFLLKPSNMDEIKEALQFMA DTLNKRKEQEHALKTQNTAEV Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:53 2011 Seq name: gi|229783481|gb|GG668254.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld677, whole genome shotgun sequence Length of sequence - 632 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 631 617 ## COG4177 ABC-type branched-chain amino acid transport system, permease component Predicted protein(s) >gi|229783481|gb|GG668254.1| GENE 1 1 - 631 617 210 aa, chain - ## HITS:1 COG:BH0249 KEGG:ns NR:ns ## COG: BH0249 COG4177 # Protein_GI_number: 15612812 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Bacillus halodurans # 28 209 32 214 367 191 53.0 6e-49 NQRKKIKGTTVLALLIFLVLAVMPFCMSMFRVGLFGKYMCFAIIAIGLDMIWGYTGILSL GHGVYFGLGAYCMAMYLKMEASGGLPDFMTWSGVTQLPFVWQLFGNPAFAITMALLVPVA LAVLVGYLTFNNKIKGVYFSILSQAMALIMSTLLVGSQAFTGGSNGLTDFKTIFGRNINE PLTKITMFYVTLACLIAVYGLCRFLTGRRI Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:02:53 2011 Seq name: gi|229783480|gb|GG668255.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld678, whole genome shotgun sequence Length of sequence - 613 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 612 453 ## gi|266626112|ref|ZP_06119047.1| hypothetical protein CLOSTHATH_07552 Predicted protein(s) >gi|229783480|gb|GG668255.1| GENE 1 3 - 612 453 203 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626112|ref|ZP_06119047.1| ## NR: gi|266626112|ref|ZP_06119047.1| hypothetical protein CLOSTHATH_07552 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07552 [Clostridium hathewayi DSM 13479] # 1 203 1 203 204 320 100.0 5e-86 TSLTLKDSAGGGVLSVDNNLNGSSLEIGNAAVTIDGAISFNRGVIIVNNGTLDGSNIEFT VNSSGMRPFQIENSTVKNLKARVSGGLHGMSISGSSNVTITGGEYYGSKAGLIITGSDST VKLTGGVFKVGQSDSVGAAIRCDDMSLGDLLEEGCNYYMGGDVVDLSTLTDDHKLGDHEN PVTVKSLTHTITFNANGGSVSPA Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:04 2011 Seq name: gi|229783479|gb|GG668256.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld679, whole genome shotgun sequence Length of sequence - 767 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 765 661 ## COG3505 Type IV secretory pathway, VirD4 components Predicted protein(s) >gi|229783479|gb|GG668256.1| GENE 1 3 - 765 661 254 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 3 242 275 503 591 97 28.0 2e-20 IGYIHYEAPVEEQNFPTLIEFINAMEVREDDEEFKNPVDLMFDALEAEKPNHFAVRQYKK YKLAAGKTAKSILISCGARLAVFDIAELREVTSYDELELDTLGDRKTALFLIMSDTDDSF NFLISMCYTQLFNLLCEKADDVYGGRLPVHVRCLIDECANIGQIPKLEKLVATIRSREIS ACLVLQAQSQLKAIYKDNADTIIGNMDTSIFLGGKEPTTLKELAAVLGKETIDTYNTGES RGRETSHSLNYQKL Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:05 2011 Seq name: gi|229783478|gb|GG668257.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld680, whole genome shotgun sequence Length of sequence - 532 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:06 2011 Seq name: gi|229783477|gb|GG668258.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld681, whole genome shotgun sequence Length of sequence - 743 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 738 178 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 Predicted protein(s) >gi|229783477|gb|GG668258.1| GENE 1 1 - 738 178 246 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 1 214 10 234 413 73 26 5e-14 KMAMKSIMGNKMRSFLTMLGIIIGVASVIILVSLVNGQMSYMTESFASMGTNQITVNVTN LSTRSVSVDEMYEFYEDNRDLFAQMTPNVTVSTTVKNGTESSTSTTVSGVSEEYLEMKDM ELESGRFIQYSDIVSRQKVCVVGYYVAWDLYGGVEKAIDQTIKIGGNAFRIIGVAARQDD DELESGGSDDFVWMPYSCAAKMTRNANISSYTFATADMNHTEEAKTAIDDFLMEIFKDDD LYRITA Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:06 2011 Seq name: gi|229783476|gb|GG668259.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld682, whole genome shotgun sequence Length of sequence - 809 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 4 - 285 101 ## gi|288871782|ref|ZP_06119050.2| conserved hypothetical protein + Term 413 - 447 1.1 - Term 194 - 240 6.7 2 2 Tu 1 . 
- CDS 288 - 644 455 ## CDR20291_1745 hypothetical protein Predicted protein(s) >gi|229783476|gb|GG668259.1| GENE 1 4 - 285 101 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871782|ref|ZP_06119050.2| ## NR: gi|288871782|ref|ZP_06119050.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 93 2 94 94 179 98.0 7e-44 MGKMDTILSIPCAHTTKAVAAMVQRFQYLGNLSGGFLLFKGCNYTLPLAVCIAAQAQYMV NVVLGKRKSRGCMRYILCFIYGLDLFRFGEEQV >gi|229783476|gb|GG668259.1| GENE 2 288 - 644 455 118 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1745 NR:ns ## KEGG: CDR20291_1745 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 5 118 4 117 117 158 71.0 7e-38 MNNPMTYIQNGDYLIPDLKLSQQPEKPLGKYGRMRKAYLKEHRPILYNQMLLSEKLYPHL LEIDETAQNRLEQMIPQLAKEAGATEELKASDPMKWVGLMNTCKAQAEEILMAELINS Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:15 2011 Seq name: gi|229783475|gb|GG668260.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld683, whole genome shotgun sequence Length of sequence - 833 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 698 377 ## gi|266626118|ref|ZP_06119053.1| hypothetical protein CLOSTHATH_07558 + Term 766 - 823 12.5 Predicted protein(s) >gi|229783475|gb|GG668260.1| GENE 1 3 - 698 377 231 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626118|ref|ZP_06119053.1| ## NR: gi|266626118|ref|ZP_06119053.1| hypothetical protein CLOSTHATH_07558 [Clostridium hathewayi DSM 13479] hypothetical protein CLOSTHATH_07558 [Clostridium hathewayi DSM 13479] # 1 231 1 231 231 459 100.0 1e-128 VSECTGRFLKSTNGDGILVIDDYGPVCFSYASDDRSVLNTLEDGDAVSMKTGLIMETWPG QTTVYGCELLRKGSRASIDPHTLASLMEMGQIEKAPLSPMFYARDTLFISTDRAAHPTCG TEDGSFSSIIDPLIVPSENGQANFGGKGDGYISLWDGAMAVRTGDSYTLFLSDGTVEYEG HFFQESDLSEDTLEWLEFYNGLPEEDRLSISYVPHELLPKGDPGNRVGGSD Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:28 2011 Seq name: gi|229783474|gb|GG668261.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld684, whole genome shotgun sequence Length of sequence - 582 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 44 - 580 602 ## COG2014 Uncharacterized conserved protein Predicted protein(s) >gi|229783474|gb|GG668261.1| GENE 1 44 - 580 602 178 aa, chain - ## HITS:1 COG:MTH925 KEGG:ns NR:ns ## COG: MTH925 COG2014 # Protein_GI_number: 15678945 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanothermobacter thermautotrophicus # 36 159 86 217 236 85 35.0 5e-17 LEAGIGAAAINAYYNRKDVLMNGIAAYEGVKILESCDTFAAPGDGFAGKKVATIGHFHYA ERYLKTAGELFVLEREPREGDYPDTACEYILPDMDYIYITGFTLVNKTLPRLLELGKNAR VVLVGPSVPMAPVLFEFGVRELAGTLITDTAKTERLVRFGSHRAVVRSGIPVRAGYEE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:29 2011 Seq name: gi|229783473|gb|GG668262.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld685, whole genome shotgun sequence Length of sequence - 671 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 15 - 669 611 ## COG0569 K+ transport systems, NAD-binding component Predicted protein(s) >gi|229783473|gb|GG668262.1| GENE 1 15 - 669 611 218 aa, chain + ## HITS:1 COG:FN0242 KEGG:ns NR:ns ## COG: FN0242 COG0569 # Protein_GI_number: 19703587 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Fusobacterium nucleatum # 1 218 46 263 452 164 39.0 1e-40 MGVVGNGAVYKVQMEAGIQDTDLLIATTNSDELNMLCCLIAKKAGNCHTIARIRNPEYST ETRYIREELGLSMAINPEMAAAMEISRLLRFPSAIKIDTFAKGRIEILKFLVPDHSKLHA MKVRDVLDKLHCSVLICAVERGEEVIIPSGDFQMQSGDKISIIAPPAESTEFFKQAGIIN NTIKTAMFVGGGKITYYVAKLLENTKIQIKIIEQNVER Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:30 2011 Seq name: gi|229783472|gb|GG668263.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld686, whole genome shotgun sequence Length of sequence - 621 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average 
op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 442 497 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 2 1 Op 2 . - CDS 439 - 621 192 ## gi|266626123|ref|ZP_06119058.1| transporter Predicted protein(s) >gi|229783472|gb|GG668263.1| GENE 1 1 - 442 497 147 aa, chain - ## HITS:1 COG:PH1431 KEGG:ns NR:ns ## COG: PH1431 COG0651 # Protein_GI_number: 14591225 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Pyrococcus horikoshii # 19 145 78 205 510 61 31.0 4e-10 MKTVFDVVIPGICGLGLHFKLDGFRLIYSCIAILMWAVSGAFSLEYMAHYVKRRRYYVFL WITFFATAGVFLSADLYTTFIFFEIMSFTSYVWVAFDEKKESLRAAETYLAVAVIGGLAM LMGLFLLYDLTGTLNMDLLGPAAREAL >gi|229783472|gb|GG668263.1| GENE 2 439 - 621 192 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626123|ref|ZP_06119058.1| ## NR: gi|266626123|ref|ZP_06119058.1| transporter [Clostridium hathewayi DSM 13479] transporter [Clostridium hathewayi DSM 13479] # 1 60 1 60 60 94 98.0 3e-18 VVRAYFPKADAAAAVKEEAYTDPNWMMMVPLLVFAAAVLVLGLHSGPLMELLTAIGGEAG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:35 2011 Seq name: gi|229783471|gb|GG668264.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld688, whole genome shotgun sequence Length of sequence - 642 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 353 - 412 8.4 1 1 Tu 1 . 
+ CDS 473 - 641 125 ## gi|288871783|ref|ZP_06119060.2| conserved hypothetical protein Predicted protein(s) >gi|229783471|gb|GG668264.1| GENE 1 473 - 641 125 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871783|ref|ZP_06119060.2| ## NR: gi|288871783|ref|ZP_06119060.2| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 56 12 67 68 100 100.0 3e-20 MQLTIGFITTCSGRWPREVPEQRNREYGEWLEAQMPEVKVVRAPELGSGIESVEAA Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:41 2011 Seq name: gi|229783470|gb|GG668265.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld689, whole genome shotgun sequence Length of sequence - 548 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 497 402 ## EUBREC_2131 hypothetical protein Predicted protein(s) >gi|229783470|gb|GG668265.1| GENE 1 2 - 497 402 165 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2131 NR:ns ## KEGG: EUBREC_2131 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 162 1 162 265 211 61.0 7e-54 MILTAENYFSKEADREYLSVSQYKNFMGTIGRPACEAEAMAKLNGEWETKKTPALMVGSY VDAHFEGSLNLFKAQNPEIFTKQGALKADYKKAEEIINRIERDKVFMQFMSGEKQVIMTA DMFGSPWKIKIDSYLPGKAIVDLKVMRELHKAEYTKDYGYMNSNP Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:44 2011 Seq name: gi|229783469|gb|GG668266.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld691, whole genome shotgun sequence Length of sequence - 629 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 628 580 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit Predicted protein(s) >gi|229783469|gb|GG668266.1| GENE 1 1 - 628 580 209 aa, chain + ## HITS:1 COG:SMa1541 KEGG:ns NR:ns ## COG: SMa1541 COG0651 # Protein_GI_number: 16263292 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Sinorhizobium meliloti # 1 184 233 416 487 142 44.0 5e-34 GWLPKATVAPTPVTALLHAVAVVKSGAFAILRFTYFSYGTEFLRGSAAQWVVMAAAMVTI IFGSTMAVKEIHWKRRLAYSTISNLSYILLGASMMSPLGLTAALSHMVFHAFMKICSFFC AGAVMHQTGKTYVCELDGLAKKMPLTFGCLTVASLSLMGIPLFAGFISKWNIAEAAFACG SLALAGGSRMGILPYLGTAVILYSALMTG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:45 2011 Seq name: gi|229783468|gb|GG668267.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld693, whole genome shotgun sequence Length of sequence - 599 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 202 - 247 9.1 1 1 Tu 1 . 
- CDS 319 - 597 357 ## COG5012 Predicted cobalamin binding protein Predicted protein(s) >gi|229783468|gb|GG668267.1| GENE 1 319 - 597 357 92 aa, chain - ## HITS:1 COG:MA0859_1 KEGG:ns NR:ns ## COG: MA0859_1 COG5012 # Protein_GI_number: 20089743 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Methanosarcina acetivorans str.C2A # 1 87 127 215 270 69 49.0 2e-12 VIDLGKDVPPELVVETAVEQAVKLVGLSALMTTTVPSMEETIRQLQKTVPGIRVMVGGAV LTEEYAKTIGADRYCRDAMASVNYAEKVFAGE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:45 2011 Seq name: gi|229783467|gb|GG668268.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld694, whole genome shotgun sequence Length of sequence - 726 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:46 2011 Seq name: gi|229783466|gb|GG668269.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld695, whole genome shotgun sequence Length of sequence - 585 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 50 - 487 153 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 - Prom 520 - 579 2.4 Predicted protein(s) >gi|229783466|gb|GG668269.1| GENE 1 50 - 487 153 145 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 43 136 128 221 318 63 34 3e-11 MEQSMSRAGHCIDNGPLEPKGLNLQYLAGIRGVAGEKEIRSAMQKTGLDPEDKTKVEHYS MGMKQKLGIAQAIMEDQDILILDEPFNALDYKTYNDTKEIIRILQAEGRTILMTSHNYDD LETLCTHIYAINEGKLEFPGITGRN Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:47 2011 Seq name: gi|229783465|gb|GG668270.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld698, whole genome shotgun sequence Length of sequence - 706 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 50 - 109 2.3 1 1 Tu 1 . + CDS 129 - 705 386 ## COG0582 Integrase Predicted protein(s) >gi|229783465|gb|GG668270.1| GENE 1 129 - 705 386 192 aa, chain + ## HITS:1 COG:SP0890 KEGG:ns NR:ns ## COG: SP0890 COG0582 # Protein_GI_number: 15900773 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 9 192 8 182 321 78 25.0 9e-15 MVEKILENVVNEMAPHLSQEQLEHLSNVLYVNFHGLEVQEQCTELAATGEDGDEAKIRMF VASKKAVNRQNNTLKQYTREICNMLDFLGKRLEDITGMDLRYYYGVMRERRGIKMSTMQT RLHYLSSFWDFMITEDLVSSNPVKKVGLLKIEKTIKKPFSAAEMEALRTSCSELRDRALV EFLYSTGVRVSE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:47 2011 Seq name: gi|229783464|gb|GG668271.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld699, whole genome shotgun sequence Length of sequence - 523 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:48 2011 Seq name: gi|229783463|gb|GG668272.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld700, whole genome shotgun sequence Length of sequence - 573 bp Number 
of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 572 434 ## Ethha_1905 hypothetical protein Predicted protein(s) >gi|229783463|gb|GG668272.1| GENE 1 2 - 572 434 190 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1905 NR:ns ## KEGG: Ethha_1905 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 14 110 247 332 817 101 54.0 1e-20 GNQKYATRRKWRLKDISYAVCKYFGIETGENSFGYIASWSQGKELKELRASLETINKTSG TLICDIERHYKEICKERGIDPNTKAEPETAPIEQPTGNLTYYVAECMEFPNLGEYHDNLS LEEAVRIYQEIPAERMNGIKGIGFELKDGSDYEGPFPILTGQTIDLDTIQAIDYYRDNPL VQKAVKETGC Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:53 2011 Seq name: gi|229783462|gb|GG668273.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld703, whole genome shotgun sequence Length of sequence - 566 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 32/0.000 + CDS 7 - 195 172 ## PROTEIN SUPPORTED gi|90961933|ref|YP_535849.1| 50S ribosomal protein L21P 2 1 Op 2 . 
+ CDS 212 - 502 449 ## PROTEIN SUPPORTED gi|239623849|ref|ZP_04666880.1| ribosomal protein L27 + Term 513 - 545 2.3 Predicted protein(s) >gi|229783462|gb|GG668273.1| GENE 1 7 - 195 172 62 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90961933|ref|YP_535849.1| 50S ribosomal protein L21P [Lactobacillus salivarius UCC118] # 1 62 41 102 102 70 53 2e-13 MNNGELAVGCPTVAGATVTGTVVKEGKAKKVIVYKYKRKSGYHKKNGHRQAYTQIKIDKI NA >gi|229783462|gb|GG668273.1| GENE 2 212 - 502 449 96 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239623849|ref|ZP_04666880.1| ribosomal protein L27 [Clostridiales bacterium 1_7_47_FAA] # 1 96 1 96 96 177 89 1e-45 MLNMNLQFFAHKKGVGSTKNGRDSESKRLGAKRADGQFVLAGNILYRQRGTHIHPGINVG RGGDDTLFATVDGVVRFERKGRDKKQVSVYPRTISE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:53 2011 Seq name: gi|229783461|gb|GG668274.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld705, whole genome shotgun sequence Length of sequence - 628 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 172 - 585 426 ## Ethha_1930 sigma-70 region 4 type 2 Predicted protein(s) >gi|229783461|gb|GG668274.1| GENE 1 172 - 585 426 137 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1930 NR:ns ## KEGG: Ethha_1930 # Name: not_defined # Def: sigma-70 region 4 type 2 # Organism: E.harbinense # Pathway: not_defined # 1 137 1 137 141 120 52.0 1e-26 MKEVNLRDLYPDVYKTDHFVEVTEDVLETIRSAERAEAAYNRRMYRYKAHYSLDCDNGIE NAILMKPQTPEMLLEEKQLREQLYAAVMALPEKQAKRIYARYYLGMRVSEIAAAEGVDPS RVRDSIRRGLKQLAKYF Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:57 2011 Seq name: gi|229783460|gb|GG668275.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld707, whole genome shotgun sequence Length of sequence - 551 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 549 655 ## COG2217 Cation transport ATPase Predicted protein(s) >gi|229783460|gb|GG668275.1| GENE 1 3 - 549 655 182 aa, chain + ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 1 179 346 523 818 227 65.0 1e-59 GSSSVDESMLTGESIPVEKQAGDALIGGSMNYNGAMEMEVTHTGGDTTLSKIIKMIEDAQ GKKAPISKLADKVAGYFVPTVMGIAVAAALLWWLLGGKELSFVLTIFVAVLVIACPCALG LATPTAIMVGTGLGAGHGILIKSGEALETAHKVDTVVFDKTGTITEGKPKVTDIVVLDQA ET Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:57 2011 Seq name: gi|229783459|gb|GG668276.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld709, whole genome shotgun sequence Length of sequence - 617 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 12 - 615 444 ## COG0524 Sugar kinases, ribokinase family Predicted protein(s) >gi|229783459|gb|GG668276.1| GENE 1 12 - 615 444 201 aa, chain + ## HITS:1 COG:VNG0158G KEGG:ns NR:ns ## COG: VNG0158G COG0524 # Protein_GI_number: 15789472 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Halobacterium sp. NRC-1 # 1 199 3 200 318 138 37.0 5e-33 MLSLGELLLRLSPAGVERLVRGDSFQKQVGGAELNVAVGAALLGLHTGVVTQLPAHDMGN FVKNKVRSYGISDDFFVYDHSSDARLGLYYYEYGAHPRKPKVIYDRKNSSFYSICIDSFP EDMFRAATCFHTTGITLALCENTRRTAIDMIRRFKSNGTIISFDVNFRGNLWTGEEAREC IEQILPYVDIFFCSEDTARLT Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:03:58 2011 Seq name: gi|229783458|gb|GG668277.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld710, whole genome shotgun sequence Length of sequence - 574 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 488 693 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 2 1 Op 2 . 
+ CDS 505 - 574 83 ## Predicted protein(s) >gi|229783458|gb|GG668277.1| GENE 1 3 - 488 693 161 aa, chain + ## HITS:1 COG:BS_sigV KEGG:ns NR:ns ## COG: BS_sigV COG1595 # Protein_GI_number: 16079766 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 20 156 20 160 166 108 46.0 5e-24 KYLKKNIEAEAEQILLTNYEKYYRLAYSYVNNEQDALDIVQESAYKVMKDISKVREPDYL STWIYRVVMNTAVDFLRKRQKESIGLEGVEIPHEDVYHEDDPMELLKSLEEKDRTIVVLK VIEELKLEEIAVVLDLNINTVKARLYRALKKLRIELEPKMS >gi|229783458|gb|GG668277.1| GENE 2 505 - 574 83 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MENDGRSKLEQLKEAYHTILVPA Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:03 2011 Seq name: gi|229783457|gb|GG668278.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld711, whole genome shotgun sequence Length of sequence - 507 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 507 634 ## COG0516 IMP dehydrogenase/GMP reductase Predicted protein(s) >gi|229783457|gb|GG668278.1| GENE 1 3 - 507 634 168 aa, chain + ## HITS:1 COG:lin0179_3 KEGG:ns NR:ns ## COG: lin0179_3 COG0516 # Protein_GI_number: 16799256 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Listeria innocua # 1 168 79 246 276 306 85.0 2e-83 GADFVKVGIGGGAICITREQKGIGRGQATSLIEVAKARDAYYEETGVYVPICSDGGIVHD YHVTLALAMGADFVMLGRYFARFDESPTKRVNVNGSYMKEYWGEGSARARNWQRYDMGGE KKLSFEEGVDSFVPYAGSLKDNVNLTLSKVRSTMCNCGALTIPELQSK Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:03 2011 Seq name: gi|229783456|gb|GG668279.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld713, whole genome shotgun sequence Length of sequence - 588 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 488 372 ## SpiBuddy_0341 extracellular solute-binding protein family 1 - Prom 520 - 579 11.4 Predicted protein(s) >gi|229783456|gb|GG668279.1| GENE 1 2 - 488 372 162 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_0341 NR:ns ## KEGG: SpiBuddy_0341 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: Spirochaeta_Buddy # Pathway: ABC transporters [PATH:sbu02010] # 70 162 43 135 441 82 44.0 6e-15 MRKKWFGLLAAGVLLVSCMMTGCGSKTSEPPAASGAETAAGAGEKKEETEKKDVEKGEEA KSDTESSGSEKVKLVGMTWGSTATIENFTKEFFEQNPEMAAKYEVEWVVGGKGDDDVTER IRLALSSGEYAADFVQLNYTQVPEFAREGVLTDVSDALKKYE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:07 2011 Seq name: gi|229783455|gb|GG668280.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld714, whole genome shotgun sequence Length of sequence - 653 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 
. - CDS 1 - 379 435 ## Ethha_1894 hypothetical protein 2 1 Op 2 . - CDS 449 - 652 164 ## Closa_3718 conjugative transfer protein Predicted protein(s) >gi|229783455|gb|GG668280.1| GENE 1 1 - 379 435 126 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1894 NR:ns ## KEGG: Ethha_1894 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 126 1 126 289 158 65.0 7e-38 MDFLLEALTNWLKEMLVGGIMSNLSGMFDSVNQQVADISVQVGQTPQGWNGSIFSMIENL SNSIMVPIAGVILAIVMTVDLIQMIADKNNLHDVGTWMIFKWVFKSAAAILIVTNTWNIV MGVFDM >gi|229783455|gb|GG668280.1| GENE 2 449 - 652 164 67 aa, chain - ## HITS:1 COG:no KEGG:Closa_3718 NR:ns ## KEGG: Closa_3718 # Name: not_defined # Def: conjugative transfer protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 65 7 70 71 84 89.0 2e-15 QAITVLQTLVIALGAGLGIWGVINLLEGYGNDNPGAKSQGMKQLMAGAGVAVVGMVLVPL LSGLFSV Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:12 2011 Seq name: gi|229783454|gb|GG668281.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld716, whole genome shotgun sequence Length of sequence - 502 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 48 - 501 439 ## CLH_2431 hypothetical protein Predicted protein(s) >gi|229783454|gb|GG668281.1| GENE 1 48 - 501 439 151 aa, chain + ## HITS:1 COG:no KEGG:CLH_2431 NR:ns ## KEGG: CLH_2431 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 1 151 1 151 262 214 68.0 8e-55 MNRRIAECINILGDFCGKRDVDELTKEELKRIYGIDQADVMVLFGGSILCGGDVLARAIQ QQAAKHYVIAGGAGHTTATLRAKVHQECPEIETEGLPEAMVFAAYLKARYGLEADYLECC STNCGNNITCLLKLLKEHQISFRSIILAQDA Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:16 2011 Seq name: gi|229783453|gb|GG668282.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld722, whole genome shotgun sequence Length of sequence - 558 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 167 - 226 2.1 1 1 Tu 1 . + CDS 256 - 556 340 ## Closa_2496 hypothetical protein Predicted protein(s) >gi|229783453|gb|GG668282.1| GENE 1 256 - 556 340 100 aa, chain + ## HITS:1 COG:no KEGG:Closa_2496 NR:ns ## KEGG: Closa_2496 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 20 100 1 81 146 121 82.0 9e-27 MKDMLGSFAYDWLRKGIDKMAAIYWLIGFVVLLGIEAATMALTTIWFAGGALAAFILALL GAGVEVQLAVFVIVSFALLFFTRPFALKYVNRNTVKTNSE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:19 2011 Seq name: gi|229783452|gb|GG668283.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld729, whole genome shotgun sequence Length of sequence - 507 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 467 379 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 Predicted protein(s) >gi|229783452|gb|GG668283.1| GENE 1 3 - 467 379 154 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 2 153 153 304 308 150 52 1e-37 AHEETTGPEIWRDTDGTVDIFVAGVGSGGTITGVGEYLKSQNPNVKIVAVEPSGSPILSG GNPGPHGLQGIGAGFVPTILNRDIYDEIILVKEEDAYSTGQEIARNEGILVGISSSAAVW AAKELAERPENEGKKIVVLLPDTGDRYLSTPLFS Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:19 2011 Seq name: gi|229783451|gb|GG668284.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld732, whole genome shotgun sequence Length of sequence - 502 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 501 486 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 Predicted protein(s) >gi|229783451|gb|GG668284.1| GENE 1 1 - 501 486 167 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 1 166 219 385 730 191 57 6e-50 EEFPDEVMQQVEKISDDVTDREREGRLDLRGLQTVTIDGEDAKDLDDAISISKENGVYTL GVHIADVSHYVTENSALDEEALKRGTSVYLVDRVIPMLPHKLSNGICSLNQGEDRLALSC IMEIDEAGNVTGHRIAETLINVDRRMTYTAVNAVVTDRDEAVMEEYR Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:20 2011 Seq name: gi|229783450|gb|GG668285.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld736, whole genome shotgun sequence Length of sequence - 649 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 423 454 ## COG0366 Glycosidases - Prom 508 - 567 4.6 Predicted protein(s) >gi|229783450|gb|GG668285.1| GENE 1 3 - 423 454 140 aa, chain - ## HITS:1 COG:BS_yvdL KEGG:ns NR:ns ## COG: BS_yvdL COG0366 # Protein_GI_number: 16080509 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus subtilis # 1 127 1 127 561 158 51.0 2e-39 MPQWLKDAVFYEIYPQSFKDTNGDGIGDFQGIIEKLDYIRDLGCNALWINPCFDSPFKDA GYDVRDYKKAASRYGTNEDLCRLFAEAHKKGIRVLLDLVPGHTSEEHPWLTMSQKAETNE YSNRYIWTSHCFEGAKGFPY Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:21 2011 Seq name: gi|229783449|gb|GG668286.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld737, whole genome shotgun sequence Length of sequence - 567 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 235 197 ## COG0673 Predicted dehydrogenases and related proteins 2 1 Op 2 . 
- CDS 301 - 567 275 ## gi|266626150|ref|ZP_06119085.1| conserved hypothetical protein Predicted protein(s) >gi|229783449|gb|GG668286.1| GENE 1 1 - 235 197 78 aa, chain - ## HITS:1 COG:BH0708 KEGG:ns NR:ns ## COG: BH0708 COG0673 # Protein_GI_number: 15613271 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 1 77 1 77 369 68 47.0 3e-12 MKPVKAGLIGSGIISWTYLDNMVNRFDSVEVVGCSDLIPERSAARAEEFGIRQMTNQEIY DDPEIEIVVNTTNWQSHT >gi|229783449|gb|GG668286.1| GENE 2 301 - 567 275 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626150|ref|ZP_06119085.1| ## NR: gi|266626150|ref|ZP_06119085.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 88 1 88 88 182 98.0 1e-44 VEGRIACIHVKDFYDLQKPWKMVEKDPSSKEGFTAVGTGVVNIPGVLKELDRQSVEYACV EQDIMRNLNPREALTMSYLAMKESGYVW Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:28 2011 Seq name: gi|229783448|gb|GG668287.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld742, whole genome shotgun sequence Length of sequence - 634 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 272 - 303 2.5 1 1 Tu 1 . 
- CDS 344 - 634 255 ## Ethha_1891 hypothetical protein Predicted protein(s) >gi|229783448|gb|GG668287.1| GENE 1 344 - 634 255 96 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1891 NR:ns ## KEGG: Ethha_1891 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 96 69 161 163 108 66.0 5e-23 VEITDPSIKEFEKIARKYGVDYAVKKDRSSSPPKYLIFFKGRDADALTAAFTEYTSKKVK KAEKTERPSVLAKLSQFKEMVKNAVVDRTKRKELER Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:31 2011 Seq name: gi|229783447|gb|GG668288.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld743, whole genome shotgun sequence Length of sequence - 597 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 571 444 ## Clole_0795 hypothetical protein Predicted protein(s) >gi|229783447|gb|GG668288.1| GENE 1 1 - 571 444 190 aa, chain - ## HITS:1 COG:no KEGG:Clole_0795 NR:ns ## KEGG: Clole_0795 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 183 1 182 540 134 41.0 2e-30 MKLTKIKIKNLFGIKEYEADGQSVELSGRNGAGKTSVIDAIRLALTNRSDREYIVRDGET EGEILIETDNGLRIDRKIRTNQADYKSVKKDGHEVGSPETFLKDIFTPLQLSPVEFMAMD RKKQNAIILDMIDYPWDMNKIREWFGEIPGWVSYDQNILQVLHDIQSENGEYFQTRQDIN RDIRNKRAFI Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:35 2011 Seq name: gi|229783446|gb|GG668289.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld744, whole genome shotgun sequence Length of sequence - 506 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 504 472 ## COG0747 ABC-type dipeptide transport system, periplasmic component Predicted protein(s) >gi|229783446|gb|GG668289.1| GENE 1 3 - 504 472 167 aa, chain - ## HITS:1 COG:AGl2786 KEGG:ns NR:ns ## COG: AGl2786 COG0747 # Protein_GI_number: 15891502 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 48 158 223 336 585 62 33.0 4e-10 NKLAELGYVTDIGTYYAGSEATGDYTLQIKLKNTMDGAIEKVLTSCSIASRTWYEGASEN DINLSPATTGAYTVTDMQTGSSVTMEARGDYWKKEDRADAELQNVNKIILRCIAEASSRS IALENKEVDMAEVSASDVGRFEGNDAFNVTKYNNAMSQYLIFNTSEN Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:36 2011 Seq name: gi|229783445|gb|GG668290.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld748, whole genome shotgun sequence Length of sequence - 542 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 98 - 157 1.9 1 1 Tu 1 . 
+ CDS 203 - 542 517 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily Predicted protein(s) >gi|229783445|gb|GG668290.1| GENE 1 203 - 542 517 113 aa, chain + ## HITS:1 COG:CAC2424 KEGG:ns NR:ns ## COG: CAC2424 COG4667 # Protein_GI_number: 15895690 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Clostridium acetobutylicum # 1 112 1 112 283 112 46.0 2e-25 MTEGALVLEGGSLRGVFSAGVLDVFMEQGIEMSYVNGVSAGSMCGMSYISKQIGRTIRVD LDYVNDKRFMSFRSMVKNRSIFNFDFLFGELSETLIPFDFETFEASKQRFEVV Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:36 2011 Seq name: gi|229783444|gb|GG668291.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld752, whole genome shotgun sequence Length of sequence - 610 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 41 - 608 319 ## COG3451 Type IV secretory pathway, VirB4 components Predicted protein(s) >gi|229783444|gb|GG668291.1| GENE 1 41 - 608 319 189 aa, chain + ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 15 127 201 314 617 63 34.0 2e-10 MSCLPLGLNQIEIQRGLTTSSTAIFVPFTTQELFQNGKEALYYGINALSNNLIMVDRKLL KNPNGLILGTPGSGKSFSAKREIANCFLLTNDDVIICDPEAEYAPLVERLHGQVIKISPT STNYINPMDLNLDYSDDESPLSLKSDFILSLCELIVGGKEGLQPVQKTIIDRCVRLVYQT YLNDPRPEN Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:37 2011 Seq name: gi|229783443|gb|GG668292.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld753, whole genome shotgun sequence Length of sequence - 622 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N 
Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 379 - 612 137 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|229783443|gb|GG668292.1| GENE 1 379 - 612 137 77 aa, chain - ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 1 72 376 449 450 65 40.0 2e-11 MTLGRKNWLFSDSQDGANASMIVYTMVEMAKAHGLHPYNYLKYLLDSRPGTDTSDAEFKD LAPWSEKARIECNKKSE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:38 2011 Seq name: gi|229783442|gb|GG668293.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld754, whole genome shotgun sequence Length of sequence - 631 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 629 525 ## Daes_0609 diguanylate cyclase Predicted protein(s) >gi|229783442|gb|GG668293.1| GENE 1 2 - 629 525 209 aa, chain - ## HITS:1 COG:no KEGG:Daes_0609 NR:ns ## KEGG: Daes_0609 # Name: not_defined # Def: diguanylate cyclase # Organism: D.aespoeensis # Pathway: not_defined # 3 141 173 308 478 68 33.0 2e-10 EDFLGVVGVGMRVSYLKEFLKEYEEKYHLNACLVDGDGKIEISSEHTGYNKTDWFEICGQ EEIRGRVLEWKEDSSNLELWTKTGVGGQERSYIVSRYIPELTWHLIVEQNNGMMVREIKA RLYQSGFIIAAIIITVLLVITTVLRNFNRQITQLIEERQDAFKRATEQLYDNIYELNITK NRSANKLTEQYFESLGAGNLPYDQGLKVI Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:42 2011 Seq name: gi|229783441|gb|GG668294.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld757, whole genome shotgun sequence Length of sequence - 653 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 14 - 565 372 ## COG1192 ATPases involved in chromosome partitioning Predicted protein(s) >gi|229783441|gb|GG668294.1| GENE 1 14 - 565 372 183 aa, chain - ## HITS:1 COG:lin2923 KEGG:ns NR:ns ## COG: lin2923 COG1192 # Protein_GI_number: 16801982 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Listeria innocua # 1 168 85 250 253 150 46.0 2e-36 MPADIQLSGMEVSLVNAMSRETILRQYLDTLKGQYSHILIDCQPSLGMLTVNALAAANRI IIPVQAEYLPAKGLEQLLSTVNKVKRQINPKLQIDGILLTMVDSRTNFAKEISALLRETY GSKIKVFGTEIPHSVRAKEISAEGKSIFAHDPGGKVAEGYKNLTKEVLKLEKQREKNRAG IGR Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:43 2011 Seq name: gi|229783440|gb|GG668295.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld758, whole genome shotgun sequence Length of sequence - 638 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 222 - 638 557 ## COG0740 Protease subunit of ATP-dependent Clp proteases Predicted protein(s) >gi|229783440|gb|GG668295.1| GENE 1 222 - 638 557 138 aa, chain - ## HITS:1 COG:BS_ymfB KEGG:ns NR:ns ## COG: BS_ymfB COG0740 # Protein_GI_number: 16078742 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Bacillus subtilis # 1 137 97 233 241 157 55.0 6e-39 VAIAEMIASLSIPTVSLVLGGSHSIGVPLAVSTDYSFIVPTGTMMVHPVRMTGMVIGASQ TYEYFEMIQDRILSFVSGHATIAYDQLKRLMLNTEMLTRDLGTVLVGEETVKEGLIDEVG GIKDALRKLYEMMDEVKG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:43 2011 Seq name: gi|229783439|gb|GG668296.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld763, whole genome shotgun sequence Length of sequence - 597 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 597 564 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain Predicted protein(s) >gi|229783439|gb|GG668296.1| GENE 1 3 - 597 564 198 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 52 198 247 397 577 75 32.0 6e-14 GRYSGFVTPQSLLTCLTDGKGTVLSSTDKEMLGTVLPVFNTVSKAWRQQGSGYVTAGLEG QRVVAFYVKSSVNDWYFINLVPRSAFLTGSRTSMLVIVCSFLLCIFFGITFALIQKRFVI GPIRELVVKIDRVKEGDFTEEDTVYPGDEIGALNQEFDEMSRRLKRLIEEVYTTKIKEQE AELNALIAQINPHFLYNT Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:44 2011 Seq name: gi|229783438|gb|GG668297.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld764, whole genome shotgun sequence Length of sequence - 507 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 448 361 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase Predicted protein(s) >gi|229783438|gb|GG668297.1| GENE 1 1 - 448 361 149 aa, chain - ## HITS:1 COG:aq_036 KEGG:ns NR:ns ## COG: aq_036 COG1057 # Protein_GI_number: 15605637 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Aquifex aeolicus # 7 122 53 168 168 75 37.0 4e-14 MTAGRIRLDMTRLAMEGHEGFTCSDFEVMRSGNTYTSQTLEMLHGLYPGHTFYFIIGADS LYEIEHWHEPEKVLAQAVILAAVREYESAGRSMEKQIAYLKETYQADVRMLHCREIDISS AELRRMTALGEPIDAFVPGPVARYIQKHH Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:44 2011 Seq name: gi|229783437|gb|GG668298.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld767, whole genome shotgun sequence Length of sequence - 597 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 556 409 ## COG0550 Topoisomerase IA Predicted protein(s) >gi|229783437|gb|GG668298.1| GENE 1 1 - 556 409 185 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 14 185 237 412 709 124 40.0 1e-28 MALELPGLTVSGERMADKAAAEQLKEACQGAAATIKKVECKEKSEKPPALYDLTTLQRDA NRLLGFTAQQTLDYLQSLYEKKLCTYPRTDNRYLTGDMVDSLPVLVNLVANAMPFRKGIA ITCDPQTVINDKKVTDHHAVIPTRNLKDADLSALPAGEKAVLELVALRLLCAVAQPHIYS ETVVI Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:45 2011 Seq name: gi|229783436|gb|GG668299.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld769, whole genome shotgun sequence Length of sequence - 645 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 18 - 645 429 ## Ethha_1908 DEAD-like helicase Predicted protein(s) >gi|229783436|gb|GG668299.1| GENE 1 18 - 645 429 209 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1908 NR:ns ## KEGG: Ethha_1908 # Name: not_defined # Def: DEAD-like helicase # Organism: E.harbinense # Pathway: not_defined # 34 209 411 559 2462 110 43.0 4e-23 MELIRLYSDVPEFRQRLHREVIDETYPKLHELLRPLSQEDIDTALCAWNGNIESKHAVVR YMKDHAREKDTAAWLAQEYGGSNSLFVVRAGSPEETQLPWPKVQRRIAQLIQEDRFYTEE EQDRFDNIDPIAIREALEERGIVNGQVADSEKLDNAPFIQQVMSDAETEQTSEVSISDEE YDAVRRPIPQRTSYDPAAPVYAVGDTVYI Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:50 2011 Seq name: gi|229783435|gb|GG668300.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld770, whole genome shotgun sequence Length of sequence - 588 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 586 568 ## COG3119 Arylsulfatase A and related enzymes Predicted protein(s) >gi|229783435|gb|GG668300.1| GENE 1 1 - 586 568 195 aa, chain - ## HITS:1 COG:PA0031 KEGG:ns NR:ns ## COG: PA0031 COG3119 # Protein_GI_number: 15595229 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 1 176 250 428 503 84 32.0 1e-16 IRLMKKAYYANVTLIDRKIGQVIEVLKKKGLYENTLILFTSDHGDFMGEFGIATKAQYCS EALMRIPFLLKPPFSGFKGRREDSFISSVEIAATCLTAAGIAVPESVMGRSLTQFYEDGE KERWQDVYMEARDIRAIRDEHYKLIYYQNRSYGEFYDLQNDPYEKYNLWDDPLLQGRKYE LMKRLIDRLIDLGEG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:50 2011 Seq name: gi|229783434|gb|GG668301.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld771, whole genome shotgun sequence Length of sequence - 823 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 94 - 789 513 ## COG2199 FOG: GGDEF domain Predicted protein(s) >gi|229783434|gb|GG668301.1| GENE 1 94 - 789 513 231 aa, chain - ## HITS:1 COG:mlr8123 KEGG:ns NR:ns ## COG: mlr8123 COG2199 # Protein_GI_number: 13476725 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Mesorhizobium loti # 8 165 7 170 196 84 31.0 2e-16 MKQDLAALEEENKKLSKLANHDWLTSIYNRGATETKINQLLSEKKTGVLFVLDMDHFKQI NDRYGHIAGDCVLQEVVRILNLMTFKQDILGRIGGDEFVIYMPLEQNQNFIDERCYQIRI RLLGIQMTNPLINGISATVCGSLYQPGDDYKSLFDRADQLLLAEKRKKDRKPVVAAAEQR RSAARKSIDIDMALIRSELSEQELISGAYCQDYRDLQKNIPLCRTQTTPKQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:51 2011 Seq name: gi|229783433|gb|GG668302.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld774, whole genome shotgun sequence Length of sequence - 520 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 35 - 91 9.8 1 1 Tu 1 . - CDS 123 - 518 466 ## Closa_2205 LysR family transcriptional regulator Predicted protein(s) >gi|229783433|gb|GG668302.1| GENE 1 123 - 518 466 131 aa, chain - ## HITS:1 COG:no KEGG:Closa_2205 NR:ns ## KEGG: Closa_2205 # Name: not_defined # Def: LysR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 123 168 290 299 199 75.0 3e-50 IFVSTKPYLDNLYLREGRNTDIFQTGNILLLDRNNMTRKYIDEYLSEHQIVPSQVLEVNT MDILIEFAKIGLGIGCVIREFVQEELDKQMLVTIPMKWPIKKRTIGFAYNPGGTSRALED FLSFCGSFPIS Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:54 2011 Seq name: gi|229783432|gb|GG668303.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld775, whole genome shotgun sequence Length of sequence - 601 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 3 - 317 358 ## Closa_0864 N-acetylmuramyl-L-alanine amidase, negative regulator of AmpC, AmpD + Term 323 - 372 12.7 + Prom 319 - 378 1.6 2 1 Op 2 . + CDS 410 - 599 198 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair Predicted protein(s) >gi|229783432|gb|GG668303.1| GENE 1 3 - 317 358 104 aa, chain + ## HITS:1 COG:no KEGG:Closa_0864 NR:ns ## KEGG: Closa_0864 # Name: not_defined # Def: N-acetylmuramyl-L-alanine amidase, negative regulator of AmpC, AmpD # Organism: C.saccharolyticum # Pathway: not_defined # 3 102 174 273 276 155 77.0 3e-37 EKKSGWLEEDGGWRFYLGDSGDYVANDWYEDGDQWYWFDGAGMMVHDTWKTGSDGKWYYL KSDGAMAKDQWIIWKGELYRVREDGAMFEGTLCLKTDEKGALKE >gi|229783432|gb|GG668303.1| GENE 2 410 - 599 198 63 aa, chain + ## HITS:1 COG:CAC0285 KEGG:ns NR:ns ## COG: CAC0285 COG0389 # Protein_GI_number: 15893577 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Clostridium acetobutylicum # 1 62 1 61 396 81 59.0 4e-16 MERTIFHVDVNSAFLSWEAVYRLKHLGGRLDLRTVSAAVGGDVTRRHGIILAKSIPARTY GIK Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:04:58 2011 Seq name: gi|229783431|gb|GG668304.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld778, whole genome shotgun sequence Length of sequence - 532 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 82 - 525 537 ## Cphy_2144 ABC transporter related Predicted protein(s) >gi|229783431|gb|GG668304.1| GENE 1 82 - 525 537 147 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2144 NR:ns ## KEGG: Cphy_2144 # Name: not_defined # Def: ABC transporter related # Organism: C.phytofermentans # Pathway: not_defined # 3 143 406 545 548 143 52.0 2e-33 MERNQADEEAEKKKQEILKDKKAVEKGNKDETFKYHIAKVDESDYAVEDDPVITDEDFVI GVRPECIKLSPAGQGSLNATIYGAMPTGMESTVKLRVGDYLLTGVIFGGVTYQIGEKTGV DIEGNDILLFDRKSGKCVTAGKIQFVK Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:01 2011 Seq name: gi|229783430|gb|GG668305.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld780, whole genome shotgun sequence Length of sequence - 638 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:02 2011 Seq name: gi|229783429|gb|GG668306.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld783, whole genome shotgun sequence Length of sequence - 523 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 56 - 521 542 ## gi|266626172|ref|ZP_06119107.1| conserved hypothetical protein Predicted protein(s) >gi|229783429|gb|GG668306.1| GENE 1 56 - 521 542 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626172|ref|ZP_06119107.1| ## NR: gi|266626172|ref|ZP_06119107.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 155 19 173 174 288 100.0 8e-77 MRKCYDAGIWHTSQQGNSEAYWTAFNAGKIAAFPSVAAHAAYYVTNVDPNGQGGYGHLAI AKPMKFAEDGRETYINNTEYYAINKNTEHLEAAKDVVRYLALTKDAAEKFSNVNEDGVMA QYATGCMEGIRAIAASRENGWEAFGGEPVVSELAE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:11 2011 Seq name: gi|229783428|gb|GG668307.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld791, whole genome shotgun sequence Length of sequence - 570 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 94 - 183 88 ## + Term 413 - 454 -0.4 Predicted protein(s) >gi|229783428|gb|GG668307.1| GENE 1 94 - 183 88 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDIYAKVKYNKPEELFGVVNGAFHQPVAE Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:16 2011 Seq name: gi|229783427|gb|GG668308.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld794, whole genome shotgun sequence Length of sequence - 517 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 9 - 516 444 ## Ethha_1892 TraG family protein Predicted protein(s) >gi|229783427|gb|GG668308.1| GENE 1 9 - 516 444 169 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1892 NR:ns ## KEGG: Ethha_1892 # Name: not_defined # Def: TraG family protein # Organism: E.harbinense # Pathway: Bacterial secretion system [PATH:eha03070] # 1 169 251 419 596 313 87.0 2e-84 MLYCALIGYIHYEAPVEEQNFSTLIEFINAMEVREDDEAFKNPVDLMFDALEAEKPNHFA VRQYKKYKLAAGKTAKSILISCGARLAVFDIAELREVTAYDELELDTLGDRKTALFLIMS DTDDSFNFLISMCYTQLFNLLCEKADDVYDGRLPVHVRCLIDECANIGQ Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:19 2011 Seq name: gi|229783426|gb|GG668309.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld798, whole genome shotgun sequence Length of sequence - 797 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 795 436 ## COG0827 Adenine-specific DNA methylase Predicted protein(s) >gi|229783426|gb|GG668309.1| GENE 1 3 - 795 436 264 aa, chain + ## HITS:1 COG:pli0004 KEGG:ns NR:ns ## COG: pli0004 COG0827 # Protein_GI_number: 18450290 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Listeria innocua # 8 229 415 637 756 237 52.0 2e-62 DEHLGEGGPKAKFQANVNAVRLLKELESAGQQASPEQQEVLSRYVGWGGLSDAFDPEKPA WALEYAQLKELLTRSEYAAARSSTLNAHYTSPTVIQAIYEAVGRMGFETGNILEPSMGVG NFFGMLPEEMRNSRLYGVELDPVSGRIAKQLYPKADITVGGFETTDRRDFFDLAIGNVPF GQYQVNDKAYNKLNFSIHNYFFAKSLDQVRPGGVVAFVTSRYTMDAKDSTVAPLSCPACR AAGSLSVCPMTRSKRIAGAEVVSD Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:20 2011 Seq name: gi|229783425|gb|GG668310.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld818, whole genome shotgun sequence Length of sequence - 574 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End 
Score pairs(N/Pv) + Prom 168 - 227 1.9 1 1 Tu 1 . + CDS 315 - 545 206 ## Closa_3674 hypothetical protein Predicted protein(s) >gi|229783425|gb|GG668310.1| GENE 1 315 - 545 206 76 aa, chain + ## HITS:1 COG:no KEGG:Closa_3674 NR:ns ## KEGG: Closa_3674 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 76 2 77 82 102 67.0 3e-21 MEKLITRKEAAEILGISIATLDAARNNGLISYVQYVQNGCVYFTAAGLQEYIAKCTHRAK PVERSTTYRKPRSGRS Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:22 2011 Seq name: gi|229783424|gb|GG668311.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld822, whole genome shotgun sequence Length of sequence - 595 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 47 - 593 600 ## Ccel_3035 Phage-related protein-like protein Predicted protein(s) >gi|229783424|gb|GG668311.1| GENE 1 47 - 593 600 182 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3035 NR:ns ## KEGG: Ccel_3035 # Name: not_defined # Def: Phage-related protein-like protein # Organism: C.cellulolyticum # Pathway: not_defined # 43 182 32 171 841 136 52.0 3e-31 MGEGMTLEKLRVIIEAYTKPYRDELEKIKQQTTKATNHVERQTAKMKKSFGGLGRVVASV LGVGAIVAFGKSCVKLGSDLAEVQNVVDVTFGKMSGAVNAFSKNAITQFGLSELTAKKYM GTYGAMAKSFGIVGEAGYQMSAALMGLTGDVASFYNLSTDEAYTELKSIFTGETESLKEL GV Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:26 2011 Seq name: gi|229783423|gb|GG668312.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld824, whole genome shotgun sequence Length of sequence - 507 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 164 - 201 0.4 1 1 Tu 1 . 
- CDS 290 - 427 61 ## Predicted protein(s) >gi|229783423|gb|GG668312.1| GENE 1 290 - 427 61 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNFEPLFTYINMNIISSLCNISSPQFLNYSLVFPDFSRAKAWKPG Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:31 2011 Seq name: gi|229783422|gb|GG668313.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld830, whole genome shotgun sequence Length of sequence - 524 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 214 241 ## gi|239625816|ref|ZP_04668847.1| two component transcriptional regulator - Prom 245 - 304 4.8 2 2 Tu 1 . - CDS 372 - 524 84 ## EUBELI_01515 hypothetical protein Predicted protein(s) >gi|229783422|gb|GG668313.1| GENE 1 1 - 214 241 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|239625816|ref|ZP_04668847.1| ## NR: gi|239625816|ref|ZP_04668847.1| two component transcriptional regulator [Clostridiales bacterium 1_7_47_FAA] two component transcriptional regulator [Clostridiales bacterium 1_7_47FAA] # 1 71 1 71 241 123 92.0 5e-27 MGLILLVEDEPGAVALMRRYMENSSMEHKLAVFEKAAEALFYAVKNKVDLFILDIQLLDY RGTELARQLRS >gi|229783422|gb|GG668313.1| GENE 2 372 - 524 84 50 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01515 NR:ns ## KEGG: EUBELI_01515 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 6 49 359 402 403 81 77.0 8e-15 ETLPTIFRTNTFSVFGKVIWSPYLLITVPVLIGMVCMPFAVNNWSKRMKV Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:39 2011 Seq name: gi|229783421|gb|GG668314.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld834, whole genome shotgun sequence Length of sequence - 519 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 40 - 517 357 ## gi|223985013|ref|ZP_03635111.1| hypothetical protein HOLDEFILI_02415 Predicted protein(s) >gi|229783421|gb|GG668314.1| GENE 1 40 - 517 357 159 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|223985013|ref|ZP_03635111.1| ## NR: gi|223985013|ref|ZP_03635111.1| hypothetical protein HOLDEFILI_02415 [Holdemania filiformis DSM 12042] hypothetical protein HOLDEFILI_02415 [Holdemania filiformis DSM 12042] # 1 159 1144 1302 2945 302 96.0 6e-81 MSRAYSGEIETLNLETGDIADYRTTAQGIELEVMDAEEKRLAMLYFRWDEVAPLLRGMYA RQLDGFGQEQPQPSAESPAFHSETVAVYPGDKNHLPYDVVVERLHIEEPEPPAPATEPEK TFEEVLDEHPVSIQVNGQWQTFPNAKAAEEASYEEYKAN Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:48 2011 Seq name: gi|229783420|gb|GG668315.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld835, whole genome shotgun sequence Length of sequence - 572 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 160 - 357 143 ## gi|266626184|ref|ZP_06119119.1| conserved hypothetical protein 2 1 Op 2 . 
- CDS 350 - 511 117 ## COG3645 Uncharacterized phage-encoded protein Predicted protein(s) >gi|229783420|gb|GG668315.1| GENE 1 160 - 357 143 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266626184|ref|ZP_06119119.1| ## NR: gi|266626184|ref|ZP_06119119.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 65 1 65 65 116 100.0 4e-25 MHKIILNIIYRLETLRNQAPACEKSAYTQAIAEVMDIYDMACFSKAEKRNKKKSPAGANC HGSKN >gi|229783420|gb|GG668315.1| GENE 2 350 - 511 117 53 aa, chain - ## HITS:1 COG:SA1801_2 KEGG:ns NR:ns ## COG: SA1801_2 COG3645 # Protein_GI_number: 15927569 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Staphylococcus aureus N315 # 1 53 74 126 126 64 60.0 4e-11 MPTQRSMEMGLFEIKESTHLDGNGCNVTTRTPKVTGKGQQYFINKFLGGEQSA Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:05:54 2011 Seq name: gi|229783419|gb|GG668316.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld869, whole genome shotgun sequence Length of sequence - 566 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 340 304 ## gi|160914391|ref|ZP_02076606.1| hypothetical protein EUBDOL_00395 + Term 387 - 428 10.2 Predicted protein(s) >gi|229783419|gb|GG668316.1| GENE 1 2 - 340 304 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160914391|ref|ZP_02076606.1| ## NR: gi|160914391|ref|ZP_02076606.1| hypothetical protein EUBDOL_00395 [Eubacterium dolichum DSM 3991] hypothetical protein EUBDOL_00395 [Eubacterium dolichum DSM 3991] # 1 112 143 254 254 139 97.0 9e-32 VNTACPVCATDKSKCTGKAPEPPAETPEPEKEKPAGLNPAAIVLLLALLGGGGVFAYLKL VKNKPKTKGNDSLDDYDYGEEDSEEWETEDEESDEPDADGGSTEEDDEDSVK Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:06:03 2011 Seq name: gi|229783418|gb|GG668317.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld887, whole genome shotgun sequence Length of sequence - 520 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 197 76 ## gi|266626187|ref|ZP_06119122.1| conserved hypothetical protein Predicted protein(s) >gi|229783418|gb|GG668317.1| GENE 1 3 - 197 76 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266626187|ref|ZP_06119122.1| ## NR: gi|266626187|ref|ZP_06119122.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 1 64 1 64 64 99 100.0 8e-20 VTIVRINSGGSNNRSSEISADILNGDIRRAEIGFGTDIKTIRVILINLIFKFKERRTEFM RKFF Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:06:08 2011 Seq name: gi|229783417|gb|GG668318.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld894, whole genome shotgun sequence Length of sequence - 517 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 515 432 ## COG3505 Type IV secretory pathway, VirD4 components Predicted protein(s) >gi|229783417|gb|GG668318.1| GENE 1 2 - 515 432 171 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 7 159 351 503 591 81 32.0 8e-16 DIAELREVTAYDELELDTLGDQKTALFLIMSDTDDSFNFLISMCYTQLFNLLCEKADDVY GGRLPVHVRCLIDECANIGQIPKLEKLVATIRSREISACLVLQAQSQLKAIYKDNADTII GNMDTSIFLGGKEPTTLKELAAVLGKETIDTYNTGESRGRETSHSLNYQKL Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:06:09 2011 Seq name: gi|229783416|gb|GG668319.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld904, whole genome shotgun sequence Length of sequence - 557 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 22 - 555 563 ## gi|266619785|ref|ZP_06112720.1| putative LPXTG-motif protein cell wall anchor domain protein Predicted protein(s) >gi|229783416|gb|GG668319.1| GENE 1 22 - 555 563 177 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266619785|ref|ZP_06112720.1| ## NR: gi|266619785|ref|ZP_06112720.1| putative LPXTG-motif protein cell wall anchor domain protein [Clostridium hathewayi DSM 13479] putative LPXTG-motif protein cell wall anchor domain protein [Clostridium hathewayi DSM 13479] # 1 177 884 1060 1490 301 98.0 1e-80 TVDAESLSHVYDGAGHGIKAAEASDAKGTTIEYSTDGESWSGTIPQFTEYREEGYLVHVR AKNDNYSNVAEADVVFRITKRPIKVAAGILETEYDGSEKAVTEFTYTVSNKENTAGALKD HKVSAVLKNNKRTEAGEQTVSVEENSVRILSGEADVTKNYAVSLEDGKLTVKRKDGL Prediction of potential genes in microbial genomes Time: Fri Jul 1 04:06:18 2011 Seq name: gi|229783415|gb|GG668320.1| Clostridium hathewayi DSM 13479 genomic scaffold Scfld996, whole genome shotgun sequence Length of sequence - 610 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 519 217 ## COG3344 Retron-type reverse transcriptase Predicted protein(s) >gi|229783415|gb|GG668320.1| GENE 1 3 - 519 217 172 aa, chain - ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 13 166 408 548 834 85 36.0 5e-17 MCWLNYCGKGFQDEAFIGLIWKFLKAGYVEQWQYNCTYSGVPQGSGISPICANIYLSELD NYMQEYKEKYDYEPECRRTTREYERASRRYRKARKALMGAEKSTPELVKEFKDSRRKKMN QHYYNPFEEGFKKIQYNRYADDFVIGVIGSKKDAEKIQGRCKNTFFQEKLAF