Prediction of potential genes in microbial genomes
Time: Thu May 26 08:58:25 2011
Seq name: gi|223714214|gb|ACDT01000001.1| Coprobacillus sp. D7 cont1.1, whole genome shotgun sequence
Length of sequence - 85072 bp
Number of predicted genes - 84, with homology - 84
Number of transcription units - 30, operones - 17 average op.length - 4.2

   N Tu/Op     Conserved S       Start      End  Score
               pairs(N/Pv)

   1   1 Tu  1 .         + CDS      68 -    760    765  ## gi|237735213|ref|ZP_04565694.1| predicted protein
                         + Prom    842 -    901    9.6
   2   2 Tu  1 .         + CDS    1023 -   1328    360  ## gi|167755060|ref|ZP_02427187.1| hypothetical protein CLORAM_00564
                         + Prom   1528 -   1587    5.0
   3   3 Tu  1 .         + CDS    1659 -   1904    207  ## gi|167755060|ref|ZP_02427187.1| hypothetical protein CLORAM_00564
                         + Term   1946 -   1983    5.0
                         + Prom   1941 -   2000    6.7
   4   4 Op  1 2/0.000   + CDS    2088 -   2921    853  ## COG1737 Transcriptional regulators
                         + Prom   2934 -   2993    7.5
   5   4 Op  2 .         + CDS    3028 -   3936    853  ## COG2103 Predicted sugar phosphate isomerase
   6   4 Op  3 .         + CDS    3941 -   4273    319  ## gi|167755057|ref|ZP_02427184.1| hypothetical protein CLORAM_00561
   7   4 Op  4 .         + CDS    4286 -   5638   1564  ## COG1455 Phosphotransferase system cellobiose-specific component IIC
   8   4 Op  5 .         + CDS    5656 -   6534    935  ## COG2971 Predicted N-acetylglucosamine kinase
   9   4 Op  6 .         + CDS    6577 -   8022   1371  ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase
                         + Term   8028 -   8075   -0.5
                         + Prom   8083 -   8142    6.5
  10   5 Tu  1 .         + CDS    8169 -   9119    942  ## gi|167755053|ref|ZP_02427180.1| hypothetical protein CLORAM_00557
                         + Term   9120 -   9168    5.2
                         - Term   9156 -   9194    6.5
  11   6 Tu  1 .         - CDS    9197 -   9631    427  ## gi|237735223|ref|ZP_04565704.1| predicted protein
                         - Prom   9653 -   9712   12.9
                         + Prom   9627 -   9686   14.1
  12   7 Tu  1 .         + CDS    9752 -  10051    380  ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase
                         + Prom  10070 -  10129    8.2
  13   8 Op  1 2/0.000   + CDS   10231 -  10836    636  ## COG0009 Putative translation factor (SUA5)
  14   8 Op  2 .         + CDS   10833 -  11276    626  ## COG0698 Ribose 5-phosphate isomerase RpiB
  15   8 Op  3 5/0.000   + CDS   11278 -  12516   1559  ## COG0112 Glycine/serine hydroxymethyltransferase
  16   8 Op  4 .         + CDS   12530 -  13153    929  ## COG0035 Uracil phosphoribosyltransferase
                         + Prom  13197 -  13256    9.4
  17   9 Op  1 .         + CDS   13325 -  13531    173  ## gi|167755046|ref|ZP_02427173.1| hypothetical protein CLORAM_00550
  18   9 Op  2 .         + CDS   13528 -  13881    395  ## gi|167755045|ref|ZP_02427172.1| hypothetical protein CLORAM_00549
  19   9 Op  3 40/0.000  + CDS   13904 -  14563    478  ## COG0356 F0F1-type ATP synthase, subunit a
  20   9 Op  4 37/0.000  + CDS   14638 -  14865    430  ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K
  21   9 Op  5 38/0.000  + CDS   14888 -  15388    733  ## COG0711 F0F1-type ATP synthase, subunit b
  22   9 Op  6 41/0.000  + CDS   15388 -  15921    659  ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein)
  23   9 Op  7 42/0.000  + CDS   15931 -  17454   1872  ## COG0056 F0F1-type ATP synthase, alpha subunit
  24   9 Op  8 42/0.000  + CDS   17461 -  18318   1090  ## COG0224 F0F1-type ATP synthase, gamma subunit
  25   9 Op  9 42/0.000  + CDS   18330 -  19736   1710  ## COG0055 F0F1-type ATP synthase, beta subunit
  26   9 Op 10 .         + CDS   19729 -  20127    457  ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit)
                         + Term  20136 -  20165   -0.2
                         + Prom  20848 -  20907    4.7
  27  10 Op  1 .         + CDS   20994 -  21779    818  ## COG1586 S-adenosylmethionine decarboxylase
  28  10 Op  2 .         + CDS   21788 -  23239   1534  ## COG1982 Arginine/lysine/ornithine decarboxylases
  29  10 Op  3 2/0.000   + CDS   23245 -  24096    977  ## COG0421 Spermidine synthase
  30  10 Op  4 .         + CDS   24086 -  24949   1013  ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family
  31  10 Op  5 7/0.000   + CDS   24946 -  26148   1401  ## COG1748 Saccharopine dehydrogenase and related proteins
  32  10 Op  6 .         + CDS   26148 -  27257   1107  ## COG0019 Diaminopimelate decarboxylase
                         + Term  27367 -  27402   -0.5
  33  11 Tu  1 .         - CDS   27275 -  28141    614  ## COG0561 Predicted hydrolases of the HAD superfamily
                         - Prom  28167 -  28226    7.0
  34  12 Tu  1 .         + CDS   28367 -  29188    958  ## EUBELI_00615 hypothetical protein
                         + Term  29194 -  29229    1.0
                         + Prom  29356 -  29415   11.0
  35  13 Op  1 .         + CDS   29494 -  30459   1252  ## COG0039 Malate/lactate dehydrogenases
  36  13 Op  2 .         + CDS   30493 -  31302    826  ## COG0860 N-acetylmuramoyl-L-alanine amidase
                         + Term  31304 -  31350    1.2
  37  14 Op  1 2/0.000   + CDS   31672 -  33129   1785  ## COG0793 Periplasmic protease
                         + Prom  33145 -  33204    5.8
  38  14 Op  2 .         + CDS   33224 -  35197   2272  ## COG0556 Helicase subunit of the DNA excision repair complex
  39  14 Op  3 .         + CDS   35289 -  36542   1197  ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase
  40  14 Op  4 2/0.000   + CDS   36546 -  39368   3354  ## COG0178 Excinuclease ATPase subunit
  41  14 Op  5 10/0.000  + CDS   39382 -  40317   1242  ## COG1493 Serine kinase of the HPr protein, regulates carbohydrate metabolism
  42  14 Op  6 .         + CDS   40321 -  41121    783  ## COG0682 Prolipoprotein diacylglyceryltransferase
                         + Term  41251 -  41286   -0.1
                         + Prom  41153 -  41212    9.5
  43  15 Tu  1 .         + CDS   41319 -  42035    904  ## COG2188 Transcriptional regulators
                         + Term  42036 -  42070    3.0
                         + Prom  42054 -  42113   12.2
  44  16 Op  1 8/0.000   + CDS   42150 -  43436   1167  ## COG1455 Phosphotransferase system cellobiose-specific component IIC
  45  16 Op  2 .         + CDS   43438 -  44868   1620  ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase
  46  16 Op  3 .         + CDS   44893 -  46611   1601  ## COG1482 Phosphomannose isomerase
                         + Term  46643 -  46680   -0.6
                         + Prom  46681 -  46740   12.0
  47  17 Op  1 7/0.000   + CDS   46871 -  48976   1894  ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific
  48  17 Op  2 .         + CDS   48993 -  49817    763  ## COG3711 Transcriptional antiterminator
                         + Term  49951 -  49988    1.2
                         + Prom  49834 -  49893    6.4
  49  18 Op  1 13/0.000  + CDS   50040 -  51029   1434  ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB
  50  18 Op  2 13/0.000  + CDS   51055 -  51870   1117  ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC
  51  18 Op  3 4/0.000   + CDS   51894 -  52811    902  ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID
  52  18 Op  4 .         + CDS   52854 -  53237    402  ## COG4687 Uncharacterized protein conserved in bacteria
                         + Term  53244 -  53289    8.2
                         + Prom  53290 -  53349    9.0
  53  19 Op  1 40/0.000  + CDS   53405 -  54085    785  ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
  54  19 Op  2 1/0.143   + CDS   54075 -  55127    916  ## COG0642 Signal transduction histidine kinase
                         + Term  55140 -  55189    4.2
                         + Prom  55143 -  55202    4.4
  55  20 Op  1 35/0.000  + CDS   55237 -  56994    219  ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
  56  20 Op  2 .         + CDS   56994 -  58814    193  ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P
  57  21 Tu  1 .         - CDS   58888 -  59736   1307  ## COG0561 Predicted hydrolases of the HAD superfamily
                         - Prom  59879 -  59938    8.8
                         - Term  59933 -  59971    2.2
  58  22 Op  1 .         - CDS   60010 -  60630    646  ## COG4684 Predicted membrane protein
                         - Prom  60719 -  60778    2.8
  59  22 Op  2 .         - CDS   60799 -  61767   1268  ## COG3191 L-aminopeptidase/D-esterase
                         - Prom  61884 -  61943    8.3
                         + Prom  61703 -  61762    8.0
  60  23 Tu  1 .         + CDS   61918 -  62583    867  ## COG1396 Predicted transcriptional regulators
                         + Term  62602 -  62658    2.1
                         - Term  62864 -  62917    7.6
  61  24 Op  1 .         - CDS   62926 -  64248   1499  ## COG4099 Predicted peptidase
  62  24 Op  2 .         - CDS   64328 -  65200    653  ## COG2207 AraC-type DNA-binding domain-containing proteins
                         - Prom  65230 -  65289    7.7
  63  24 Op  3 .         - CDS   65291 -  65797    587  ## COG0716 Flavodoxins
                         - Prom  65954 -  66013    9.8
                         + Prom  65799 -  65858    7.0
  64  25 Op  1 .         + CDS   65883 -  66545    743  ## COG0546 Predicted phosphatases
  65  25 Op  2 .         + CDS   66547 -  67455    495  ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9
                         + Term  67463 -  67498    3.5
  66  26 Tu  1 .         - CDS   67452 -  68324    705  ## COG0583 Transcriptional regulator
                         - Prom  68377 -  68436    5.9
                         + Prom  68356 -  68415    6.5
  67  27 Op  1 6/0.000   + CDS   68443 -  69483   1066  ## COG1145 Ferredoxin
  68  27 Op  2 2/0.000   + CDS   69476 -  70270    922  ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases
  69  27 Op  3 .         + CDS   70283 -  71320   1117  ## COG2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits
  70  27 Op  4 4/0.000   + CDS   71351 -  72541    912  ## COG0373 Glutamyl-tRNA reductase
  71  27 Op  5 6/0.000   + CDS   72538 -  73431    865  ## COG0181 Porphobilinogen deaminase
  72  27 Op  6 2/0.000   + CDS   73421 -  74857   1319  ## COG0007 Uroporphyrinogen-III methylase
  73  27 Op  7 7/0.000   + CDS   74847 -  75824   1019  ## COG0113 Delta-aminolevulinic acid dehydratase
  74  27 Op  8 1/0.143   + CDS   75817 -  77112   1351  ## COG0001 Glutamate-1-semialdehyde aminotransferase
  75  27 Op  9 .         + CDS   77093 -  77653    518  ## COG1648 Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain)
  76  27 Op 10 .         + CDS   77692 -  78216    369  ## gi|167754986|ref|ZP_02427113.1| hypothetical protein CLORAM_00490
  77  27 Op 11 .         + CDS   78276 -  78851    198  ## PROTEIN SUPPORTED gi|148988990|ref|ZP_01820390.1| hypothetical protein CGSSp6BS73_02415
                         + Term  78856 -  78890    1.5
  78  28 Tu  1 .         + CDS   78924 -  79859    996  ## gi|237735290|ref|ZP_04565771.1| predicted protein
                         - Term  79883 -  79926    9.4
  79  29 Op  1 .         - CDS   79938 -  81362   1863  ## COG2268 Uncharacterized protein conserved in bacteria
  80  29 Op  2 .         - CDS   81378 -  82496   1025  ## gi|237735292|ref|ZP_04565773.1| predicted protein
                         - Prom  82599 -  82658    7.3
                         + Prom  82517 -  82576   12.1
  81  30 Op  1 .         + CDS   82718 -  83566    932  ## COG1660 Predicted P-loop-containing kinase
  82  30 Op  2 .         + CDS   83575 -  84522    893  ## COG1481 Uncharacterized protein conserved in bacteria
  83  30 Op  3 .         + CDS   84524 -  84736    278  ## COG1396 Predicted transcriptional regulators
  84  30 Op  4 .         + CDS   84776 -  85063    368  ## gi|167754978|ref|ZP_02427105.1| hypothetical protein CLORAM_00482

Predicted protein(s)
>gi|223714214|gb|ACDT01000001.1| GENE 1 68 - 760 765 230 aa, chain +
## HITS:1 COG:no KEGG:no NR:gi|237735213|ref|ZP_04565694.1|
## NR: gi|237735213|ref|ZP_04565694.1| predicted protein [Mollicutes bacterium D7] # 1 230 1 230 230 429 100.0 1e-119
MLKKLLTVGLALTMMFAVTACSSKDSDSDKPNDNNTSDNTKQERPFTESNGLIKYFTIND
EKICLPETVGEYVKYLEKIGTKVELGDTGKSVDEAPKVDAGGISSMVAYLKVYLDDSDWQ
WFGIRYKNDTDKKQSVADCKVTQITLDYDTITEEENHYGIKTTVFITKDDQEIPMNGKTT
SSNLLKWIGSPQQNTDGHWHYTDDQGYKYEFVTENQKGILTRLAITYPTN
>gi|223714214|gb|ACDT01000001.1| GENE 2 1023 - 1328 360 101 aa, chain +
## HITS:1 COG:no KEGG:no NR:gi|167755060|ref|ZP_02427187.1|
## NR: gi|167755060|ref|ZP_02427187.1| hypothetical protein CLORAM_00564 [Clostridium ramosum DSM 1402] # 1 101 1 101 293 198 100.0 1e-49
MKKLIKILMVSMICLGLTACGEKKAAKAETTDDVAKIAEDNDLNDEGFDNSGLFWKFSFA
GMEFSVAFNVGDDPKFYYVTNTLTLANIDRIKINPDKDIGS
>gi|223714214|gb|ACDT01000001.1| GENE 3 1659 - 1904 207 81 aa, chain +
## HITS:1 COG:no KEGG:no NR:gi|167755060|ref|ZP_02427187.1|
## NR: gi|167755060|ref|ZP_02427187.1| hypothetical protein CLORAM_00564 [Clostridium ramosum DSM 1402] # 1 81 213 293 293 160 100.0 3e-38
MVIDAAFDLEAKTGYMYLPEQGTCGYSINGATQFIYQYSDNTFLKGEPTLEQYAEMKNIK
NWYDEFLNQFSTKTEILQLIK
>gi|223714214|gb|ACDT01000001.1| GENE 4 2088 - 2921 853 277 aa, chain +
## HITS:1 COG:BH2675 KEGG:ns NR:ns
## COG: BH2675 COG1737 # Protein_GI_number: 15615238 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 6 255 14 263 287 153 34.0 3e-37
MNSYAKIKEAQSSYTSTEKVIAKYILENAADVLKCSAQSLGEKTGTSAAAIIRFSKKLGY
NGFSELKMSLAQSKRAPEEKIDFIIDENDDIVTLADKCCRLNMNTVLKTYQLIDTNQLDN
AIKKLIAANTIYLFGVGGSAIVAQDVEQKLTRIGKKVIYNKDLHVQLTFSESMNKEDAAL
FISYSGTTKGLVEIAKMIKNKNVPIISITQFKPNPLSKLSDIILQVPNEEKEIRMGAISS
RISSLVMTDLLYYGVFKNDLTGNKDKLIETKRIVSKI
>gi|223714214|gb|ACDT01000001.1| GENE 5 3028 - 3936 853 302 aa, chain +
## HITS:1 COG:ECs3299 KEGG:ns NR:ns
## COG: ECs3299 COG2103 # Protein_GI_number: 15832553 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Escherichia coli O157:H7 # 1 295 1 295 298 294 55.0 1e-79
MKLSQLDTEQINSNSLHIDEMTTEEILTVINQEDQKVAIAVKDVIPQIAKAVDCIYEKMC
MGGRLIYMGAGTSGRLGVLDASECPPTYGVDPTLVQGLIAGGKEAFTTAIEGAEDSKEIA
VDDLKKIELSPLDVVCGIAASGRTPYVVGALAYARKLGCSTISICCVKNGAISQHAHFPI
EAVVGPEVITGSTRMKAGTAQKLILNMLSTSVMIKKGKVYKNLMVDVQPTNEKLRIRAIN
IVQQAISVDEGTAKELLIEADNNVKIAILKGLTNMSKEECLKALKDSDENISQTIRTITK
GE
>gi|223714214|gb|ACDT01000001.1| GENE 6 3941 - 4273 319 110 aa, chain +
## HITS:1 COG:no KEGG:no NR:gi|167755057|ref|ZP_02427184.1|
## NR: gi|167755057|ref|ZP_02427184.1| hypothetical protein CLORAM_00561 [Clostridium ramosum DSM 1402] # 1 110 1 110 110 182 100.0 5e-45
MNKYVVKNNTKKEYKNIFGKTDYDIPMFGFKSALFILMTTMLSLLICINILNYLKIDFRI
PNALISGIVCGFSVAYSQFFIERKKGVCKNFWIVGTIFSLIAFMVIFMIK
>gi|223714214|gb|ACDT01000001.1| GENE 7 4286 - 5638 1564 450 aa, chain +
## HITS:1 COG:BS_ywbA KEGG:ns NR:ns
## COG: BS_ywbA COG1455 # Protein_GI_number: 16080890 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 3 447 2 438 444 261 38.0 3e-69
MFTKISNLIEGKLAGPLEKLSNQRHLRAIRDGIIATLPLIIVSSLLMVVAFSYNQMPADW
GITQFIKNNAVAILLPYRMSMYIMTLYAVFGIGYSLAKSYNLDGLSGAILAELTFLLTIV
PVSMPDITESISALASKHAELNAFLQGLPSGYVIPAENLGSAGMFIGIISAFIGVEIYRF
TQVKGLKISMPAAVPPAVARSFESLTPTVIIILGVSTITMWLGINVHTIVGNLIKPLINA
TDTLPSTLLIIFLIMFFWSFGIHGDSIVSSLARPIWLILLDQNTAAIAAGKAATHIGVEP
FLQWFVHIGGSGATLGLAILFCVKAKSKYGKTLGRTTILPSIFNINEPMIFGAPIVLNPM
LLIPFIIIPLICATIAWIATSMHLVNCAVTIAPWTLPGPIGAYLACGGDWRAAVLNIILI
IISVIIYYPFFKMYDNELLAEEKSKEAVSA
>gi|223714214|gb|ACDT01000001.1| GENE 8 5656 - 6534 935 292 aa, chain +
## HITS:1 COG:SSO3218 KEGG:ns NR:ns
## COG: SSO3218 COG2971 # Protein_GI_number: 15899921 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted N-acetylglucosamine kinase # Organism: Sulfolobus solfataricus # 1 292 1 287 298 104 29.0 2e-22
MLYIGIDGGGTKTKMALFDDAGRKLKEIIKPSVHVLTQPREICIQILKDGVMELDANFQA
KVIAGLAGYGEQKEVRNKIATICKEAFGNREFSLYSDVRIAITGALGGGDGIVVVAGTGS
IALSSKNNHITRCGGWGYQLGDEGSAYWIAKRMLALYCQQVDGRLEKTQLYYLIKEKCKL
ENDYDIITFINKLNHDRTSIASLAKLNGIAAKDNDKYALQIYKEAAYEIAVLIKTLAKNF
TSPFKVSYIGGVFEYGEDYVLKPLNNYLQPLSCQLVAPLYSPEYGAYLLGKK
>gi|223714214|gb|ACDT01000001.1| GENE 9 6577 - 8022 1371 481 aa, chain +
## HITS:1 COG:BS_bglH KEGG:ns NR:ns
## COG: BS_bglH COG2723 # Protein_GI_number: 16080977 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase # Organism: Bacillus subtilis # 2 475 16 474 481 442 47.0 1e-124
MNKKRKILWGGATASSQYEGGYNLGGKGLDTQDCRPYLKRTSNATTSTRLLTQDIIDEAK
SCKEVGNYPFRQGSDGYHHIDEDIQLLKELGIDIYRFSISWARLYPHGDEEMPNEEGIRF
YDHIFKKVHQAGMKIFLTMNHYAVPLYLVETYGGWTNNILVDFYLRFAKTVFEHWGEYID
YYLPFNEINAGYFSPYNGVGLVKEKDRPYDQSLVFQSLHHQFVASAKVIELGHKISSQSQ
SGCMVACFCYYPLTPRPEDNLKAVRDEQIHQWFAVDVLANGYYPSYMNRFFQENKINIII
TEEERKLLRDNTCDFVSFSYYSSAVATVEESSEQTAGNLVVSTKNPYLKASDWGWQIDPI
GLRIMLNKMYDRTRKPVFISENGFGAHDKFEEDGSIHDQYRIDYLNQHFQQIDEAIRDGV
DVIGYIMWGIIDIVSAGSCEMEKRYGVIYVDGDNEGKGSYQRYKKDSFAWYQNYIKQYKN
N
>gi|223714214|gb|ACDT01000001.1| GENE 10 8169 - 9119 942 316 aa, chain +
## HITS:1 COG:no KEGG:no NR:gi|167755053|ref|ZP_02427180.1|
## NR: gi|167755053|ref|ZP_02427180.1| hypothetical protein CLORAM_00557 [Clostridium ramosum DSM 1402] # 1 316 10 325 325 519 100.0 1e-146
MKKIKDYILAHKIKVTIIIIILLGIIGALGKEQPKLILKQNTVNLEYGENFNYLDYVDKN
KFDEEDISYSCDGFDEDNPAVGKYTITYKYGNSKKKLTLNIKDTTAPQLIQTKEIDIIEN
DEIDYRDYLEIKELSKYELEIDDSGVSYLAAGNYNASIKASDKYGNTSTLIMPVSIKALE
LELSTTTLNLEENQNNTINIKTNSTQPVVYTSSDETIATVDQSGNIKALKSGTVKITAAI
KNKSVTCNVTVKGKPVETSSIRTPVADDNISTTVYITKTGDCYHSSGCGYLSRSKIAISK
STAINTGYRPCSRCHP
>gi|223714214|gb|ACDT01000001.1| GENE 11 9197 - 9631 427 144 aa, chain -
## HITS:1 COG:no KEGG:no NR:gi|237735223|ref|ZP_04565704.1|
## NR: gi|237735223|ref|ZP_04565704.1| predicted protein [Mollicutes bacterium D7] # 1 144 1 144 144 243 100.0 4e-63
MKQFKKSLLIIGLCFLMIGCTNDAMGKVTKKLQDAGYDISYLTDDFTAVNINKTEKDKDR
IQFCAYLEKKVVTSISYIVLPADNSNIDKTIIGFIYVDKNDDNIISESAQKEAKKILKKL
DLSIDDLVNYALQVHEDKGKSLNS
>gi|223714214|gb|ACDT01000001.1| GENE 12 9752 - 10051 380 99 aa, chain +
## HITS:1 COG:lin0580 KEGG:ns NR:ns
## COG: lin0580 COG3695 # Protein_GI_number: 16799655 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Listeria innocua # 8 97 9 98 98 111 58.0 3e-25
MDEDLIFQVLAIVSEIPWGKVASYKQIASLAGRPKNARLVGKILSHASMYGDYPCHRVVN
SAGRLVPEWYEQRDLLLEEDVCFKDNGNVDMKKCKWQLD
>gi|223714214|gb|ACDT01000001.1| GENE 13 10231 - 10836 636 201 aa, chain +
## HITS:1 COG:VNG2312C KEGG:ns NR:ns
## COG: VNG2312C COG0009 # Protein_GI_number: 15791118 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Halobacterium sp. NRC-1 # 23 198 19 196 198 115 36.0 5e-26
MNTVSVEKNAIEEVAKLVSDSRVVAFPTETVFGLGVKFGSHQALDALYELKHRDKGKAIS
MMISKAEDIEKYAYVDETAKKLIAAFMPGMITIILKKRPHIDDYFTASLDTIGIRIPDDP
FVLSLLDKTGPMLVTSANLSGENSLVDDKAVMKVFHGKIPLIVKGESISKKASTIVDLSK
GKVEILRLGDISKEQIAEVIE
>gi|223714214|gb|ACDT01000001.1| GENE 14 10833 - 11276 626 147 aa, chain +
## HITS:1 COG:BH3767 KEGG:ns NR:ns
## COG: BH3767 COG0698 # Protein_GI_number: 15616329 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Bacillus halodurans # 1 145 1 145 145 172 60.0 1e-43
MKIAMACDHGGLRLKNVLKDYLEANGYIVEDFGTYSEDSCDYPDYAGKAAKAVASGSCDR
GVVVCGTGIGVSITANKVKGIRCALCHDVFSAKATRAHNDTNMIAMGQRVIGEGLALEIL
TAWLEGEYEGGRHDQRIAKMMAYEGEE
>gi|223714214|gb|ACDT01000001.1| GENE 15 11278 - 12516 1559 412 aa, chain +
## HITS:1 COG:BS_glyA KEGG:ns NR:ns
## COG: BS_glyA COG0112 # Protein_GI_number: 16080743 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Bacillus subtilis # 2 411 7 415 415 515 61.0 1e-146
MKDIAVFESVERELNRQRNNIELIASENFVSPEILELAGTVLTNKYAEGYPGKRYYGGCK
FVDEVETLAKERLCKIYGAEYANVQPHSGAQANTAVYMALLNHGDKVLGMSLADGGHLTH
GHPLNYSGINYEFHSYGVTKETETIDYEDFKKKCQEVKPKLVVAGASAYSRTIDFEYMAK
CAHEVGAMFMVDMAHIAGLVAAGLHPSPFPHADIVTTTTHKTLRGPRGGVIMCKEKYAAD
IDRAVFPGMQGGPLMHIIAAKAACFYEAMQPEFKEYAAQVIKNAKALETALKEEGFRLVA
DGTDNHLILIDVKASCGISGKKAERLLDEINITANKNAIPFDTEKPFKASGIRVGTPAMT
TKGFTEEDFREVGKIIAYRLKNEETDEIKEECLKRVRALTDKVEMYHDIKYI
>gi|223714214|gb|ACDT01000001.1| GENE 16 12530 - 13153 929 207 aa, chain +
## HITS:1 COG:SP0745 KEGG:ns NR:ns
## COG: SP0745 COG0035 # Protein_GI_number: 15900640 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 5 206 6 208 209 237 59.0 1e-62
MATTILNHALIDHKLTIMRDKDTKTIVFKDNLDEIAMLMAYEVTKELPLVDKEIVIPICP
MIGKELKKQIILVPILRAGIGLVDGFRRIIPTAKIGHIGMARNEETLIPEEYYAKFPSGL
EDSIVIIVDPMLATGGSASAAITNIKARGAKDIRLVCLVGAPEGVKVIEEDHPDVELILA
TLDEKLNEKGYIVPGLGDAGDRLFGTD
>gi|223714214|gb|ACDT01000001.1| GENE 17 13325 - 13531 173 68 aa, chain +
## HITS:1 COG:no KEGG:no NR:gi|167755046|ref|ZP_02427173.1|
## NR: gi|167755046|ref|ZP_02427173.1| hypothetical protein CLORAM_00550 [Clostridium ramosum DSM 1402] # 1 68 1 68 68 97 100.0 3e-19
MDQDKKKMLKDLFFCLQLGFQVIGAFLLAVIVGLRLDKYFNTHPTILLILLFLAFGYVIK
ILLGAGKE
>gi|223714214|gb|ACDT01000001.1| GENE 18 13528 - 13881 395 117 aa, chain +
## HITS:1 COG:no KEGG:no NR:gi|167755045|ref|ZP_02427172.1|
## NR: gi|167755045|ref|ZP_02427172.1| hypothetical protein CLORAM_00549 [Clostridium ramosum DSM 1402] # 1 117 1 117 117 140 100.0 2e-32
MNEKDVLKQSVKVFIGGLVIFSIVGFVLKQISFPLGFILGYAVSVLSFYIIIVMSDVILK
MGQAIRFVVMMFIAKMLLYIAGFMLAIKFDDIFNLISVFFGYFVTKITINILGYIKR
>gi|223714214|gb|ACDT01000001.1| GENE 19 13904 - 14563 478 219 aa, chain +
## HITS:1 COG:MG405 KEGG:ns NR:ns
## COG: MG405 COG0356 # Protein_GI_number: 12045267 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Mycoplasma genitalium # 11 207 44 260 292 86 34.0 3e-17
MTIQVELIYSLVIMVCLGVFFVYAGKKVALADPTVKPKGIVLVVETGVKAIDDYMKSIMP
KKFAKNYYPYFAMVFCYILVSNLSSLIGFESPTSNYSITFAMTFITFILIQYNALKKKGF
FKYVWDTVWPPTNILSVLSPLISISMRLFGNILSGSILMTLIYQFTAWISFHVINFNFLG
PIIAPVFHCYFDIFAGCIQTLVFVTLSSILIMMENEDDE
>gi|223714214|gb|ACDT01000001.1| GENE 20 14638 - 14865 430 75 aa, chain +
## HITS:1 COG:MYPU_2710 KEGG:ns NR:ns
## COG: MYPU_2710 COG0636 # Protein_GI_number: 15828742 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Mycoplasma pulmonis # 2 75 19 92 92 68 51.0 3e-12
MTDVGLIAIGAGIAVCAGLGTGIGEGICASKAVEALGRNPEMEGKIRTLMILGIALTETA
AIYGLLISLILLFVY
>gi|223714214|gb|ACDT01000001.1| GENE 21 14888 - 15388 733 166 aa, chain +
## HITS:1 COG:MG403 KEGG:ns NR:ns
## COG: MG403 COG0711 # Protein_GI_number: 12045265 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Mycoplasma genitalium # 4 164 40 200 208 82 30.0 3e-16
MPDIASKLFPNVTTIIIQLLSTGVLLLVFKKYLWVPVQNYFAKRAEFIEGTVNEAKDMNE
KARALMEESEEQARQAAVQYREIVNLAKEDALKTKATIQEQANQEYKAKLDQARREIEAE
KAQAKAAMKQEIVEVAIDVATKVMNKEMDTKTNKALVEDFVEEVVN
>gi|223714214|gb|ACDT01000001.1| GENE 22 15388 - 15921 659 177 aa, chain +
## HITS:1 COG:BH3757 KEGG:ns NR:ns
## COG: BH3757 COG0712 # Protein_GI_number: 15616319 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Bacillus halodurans # 3 175 5 178 183 113 36.0 2e-25
MDAVAVRYAESLFDLAKEMNQVEAYSKDIDLIRTVFESDPSFVPFFSHVLIEDEAKCALL
DKSFKGQVNDYVVNFLKLLVRKRRMRYVMEIIEAFKALTNEHFGILEGILYANYDISVQE
VKEVEDALSKKENKTIRLRVVNDPSLIGGIKVEINNRVFDGSIKNKVALLKKELLRK
>gi|223714214|gb|ACDT01000001.1| GENE 23 15931 - 17454 1872 507 aa, chain +
## HITS:1 COG:FN0360 KEGG:ns NR:ns
## COG: FN0360 COG0056 # Protein_GI_number: 19703702 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Fusobacterium nucleatum # 1 500 1 499 500 616 61.0 1e-176
MGLRPDEISALIKEQIKHFDDVIELKDVGTVMTVGDGVSLIHGLDNAMLGELLAFPNDVY
GMVMNLDEDCVGAVLLGSESTIKEGDEVKRTGKIMEIPVGDEMLGRVVNPLGQAIDGNGE
IITAHTRPIERVASGVMTRKSVDQPLQTGITSIDAIIPIGRGQRELIIGDRQTGKTAIAI
DTILNQKDQDIYCVYVAIGQKNSTVAQIVEKLRKGGAMEYTTVVSASASELAPVQYIAPY
AGCAIAEEWLDQGKDVLIVYDDLSKHAVAYRTVSLLLKRPPGREAYPGDVFYLHSRLLER
ACRLNEENGGGSITALPIIETQAGDISAYIPTNVISITDGQIFLQQELFNAGFRPAIDTG
LSVSRVGSAAQIKAMKQVSGSLKLELAQYAEMQAFAQFGSDLDAATKATLDHGAKVREVL
KQAQYSPRTVSTQVITLFALKYGFTKTLEVEQVSAFMDQLVEHINMTQRDIIDEINEQKA
ISADLEKRMKDVMSAFVTQFEKTQTKG
>gi|223714214|gb|ACDT01000001.1| GENE 24 17461 - 18318 1090 285 aa, chain +
## HITS:1 COG:HI0480 KEGG:ns NR:ns
## COG: HI0480 COG0224 # Protein_GI_number: 16272427 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Haemophilus influenzae # 3 284 2 288 289 170 36.0 3e-42
MPGGMQEIKRRIKSVESTKKITKAMELVATSKLRKTRNQLEQSKPYYTNVAQTVAEILAN
CKGNNDSIYLVENKDIEKEVFIVIASSLGLCGGYNANIFKEIKGAIKPGDYVYSIGSKAT
SYLLKNHQGVTDHKFDDLNTTFDFKDVTKLVAELTKMYREKEISKIKIVYTEFVNNLTFR
PRIVTLLPVDPSDFDHIEISKKSTLFEPSPEEVLDSLIPMYLQAVIYGYIIESATSENAA
RRTSMENANDNADELTEQLLLKYNQARQTAITNEISEIVAGANAQ
>gi|223714214|gb|ACDT01000001.1| GENE 25 18330 - 19736 1710 468 aa, chain +
## HITS:1 COG:CAC2865 KEGG:ns NR:ns
## COG: CAC2865 COG0055 # Protein_GI_number: 15896119 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Clostridium acetobutylicum # 1 465 1 463 466 645 71.0 0
MSQNIGKIISAKGPVIDIQFDGDNNLPALNTAIEIQNGDELLVVEVAQHIGDDVVRCVAM
GPTDGVKRGMDALNTGNPISVPVGNETLGRMFNVLGQPIDGKEALGDSTKKMPIHRSAPT
YAEQRTETEILETGIKVVDLICPFIKGGKIGLFGGAGVGKTVLIQEFINNIATEHGGLSV
FAGVGERSREGNDLYYEMKESGVLSKTTLVFGQMNEPPGARLRVALTGLTMAEEFRDEQG
QDVLLFIDNIFRFTQAGSEVSALLGRVPSQAGYQPTLATEMGALQERITSTKKGSITSVQ
AVYVPADDLTDPAPATTFAHLDAKVVLDRSIAALGIYPAVDPLNSSSRALDPLVVGTEHY
EVAHGVQQILQRFQELQDIIAILGMDELGEEDKLTVARARRVRNYLSQPFTVASQFTGMD
GKYVRVADTIKGFKEILEGKHDDLPEQAFHNVGTIEEAIEKAKTLAND
>gi|223714214|gb|ACDT01000001.1| GENE 26 19729 - 20127 457 132 aa, chain +
## HITS:1 COG:BS_atpC KEGG:ns NR:ns
## COG: BS_atpC COG0355 # Protein_GI_number: 16080733 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Bacillus subtilis # 1 129 1 129 132 89 34.0 2e-18
MTKFKLKIVTPSGIYQEVEVNQLNLRTTAGQVGILAHHMPLASGIEISEMSYIIDKQRHI
FAIGGGFIYVNDDETKIIANSIESKEEIDLNRAKEAKRRAEQRIKEVTEKTDLLRAEIAL
KKAITRINVKEL
>gi|223714214|gb|ACDT01000001.1| GENE 27 20994 - 21779 818 261 aa, chain +
## HITS:1 COG:CAC2601 KEGG:ns NR:ns
## COG: CAC2601 COG1586 # Protein_GI_number: 15895860 # Func_class: E Amino acid transport and metabolism # Function: S-adenosylmethionine decarboxylase # Organism: Clostridium acetobutylicum # 1 259 5 267 274 294 56.0 1e-79
MNEKLKLYGFNNLTKSLSFNIYDVCYAKGQREQKDYIAYIDEQYNSKRLTDILYEVANMI
GAHVLNVSKQDYDPQGASVTMLLAEEEMLESRQGKMEEVDDRVILGHLDKSHITVHTYPE
YHPETSIATFRVDIDVATCGEISPLSTLDFLISSFDSDIITMDYRVRGFTRDIKGKKLFM
DSPMTSIQQFIDVKTLLNYDATDINVYQANLFHTKMLIKEIDLQNYLFKTDIYELPPKVR
LEITNNLRQEMIEIYSGSNIF
>gi|223714214|gb|ACDT01000001.1| GENE 28 21788 - 23239 1534 483 aa, chain +
## HITS:1 COG:SP0916 KEGG:ns NR:ns
## COG: SP0916 COG1982 # Protein_GI_number: 15900796 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Streptococcus pneumoniae TIGR4 # 2 481 3 482 491 674 67.0 0
MKLDQSRMPIYEALQQMKRERLVPFDVPGHKHGKGNPELTDFLGEKCMSIDVNSMKPLDN
LCHPVSVIKEAEELAADAFSAAHAFFMVGGTTSAVQSMIMASVKAGEKIIMPRNVHRSAI
NAMILTGAVPIYVNPDVDKRLGISLGMSLDQVEQVIKDNPDAKAILVNNPTYYGICSNLK
AIVALAHKHNMLALVDEAHGTHFYFNDDLPMSAMEAGADMASVSMHKSGGSLTQSSFLLL
GENVNADYVRQVVNLTQTTSGSYLLMASLDMSRKNLAIHGNEIFNRVRRLAGYARTEINQ
IGDYYAYCKELINGDSVYDFDVTKLSIFTLDIGLAGIEVYDLLRDEYGIQIEFGDIGNIL
AYISVGDSEANIERLIGALLEIRRRFKKDRTGMFDHEYINPEVVISPQAAFYGEKESLPL
YQSIDRVCSEFVMCYPPGIPILAPGERITKEILDYIQYAKEKGCFMTGPEDMAINKLNVL
KGR
>gi|223714214|gb|ACDT01000001.1| GENE 29 23245 - 24096 977 283 aa, chain +
## HITS:1 COG:CAC2602 KEGG:ns NR:ns
## COG: CAC2602 COG0421 # Protein_GI_number: 15895861 # Func_class: E Amino acid transport and metabolism # Function: Spermidine synthase # Organism: Clostridium acetobutylicum # 1 283 2 286 286 360 58.0 1e-99
MELWFTERHTNGVNFSIKVDCQLFSGQSEFQKIDIFDSKEFGRFLALDGYMMLTEKDEFI
YHEMIVHVPMAVHPEVKKVLVIGAGDGGVIRELCRYETIEKIDMVEIDELVVEVSKKYLP
MTACCFDDPRLQIYYQDGLKFVRGKENEYDLIIVDSTDPFGPGEGLFTKEFYGNCYKALN
DTGIMVNQHESPFYQEDAVAMQRAHKRIVESFPISRVYQAHIPTYPSGHWLFGFASKKYH
PIKDFQKTKWNARGMKTKYYNTGIHVGSFALPNYVEELLRDVE
>gi|223714214|gb|ACDT01000001.1| GENE 30 24086 - 24949 1013 287 aa, chain +
## HITS:1 COG:BS_ywhG KEGG:ns NR:ns
## COG: BS_ywhG COG0010 # Protein_GI_number: 16080801 # Func_class: E Amino acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Bacillus subtilis # 7 283 10 290 290 275 48.0 5e-74
MWNKNIQTFIGLEASFDEAECVIFGAPMDSTTSYRPGTRFASSSMRQESFGLETYSPYQD
KDLEDIKVFDGGDLELPFGNPRKALDIIKVTTKTIIKANKLPCMIGGEHLVTLGAFEAVF
EKYPEIRVIHFDAHTDLRDEYLGEKLSHASVIRRIYDLIGDNKIYQFGIRSGEREEFCFA
REHTNLNKFNFTGLSKAIEACRGYPVYFTIDLDVLDPSVFPGTGTPEAGGVTFMELLEAM
IVVSELNVVAMDINELSPVYDQSGGSTAVACKVLRELLLAMNKGENK
>gi|223714214|gb|ACDT01000001.1| GENE 31 24946 - 26148 1401 400 aa, chain +
## HITS:1 COG:BH3957 KEGG:ns NR:ns
## COG: BH3957 COG1748 # Protein_GI_number: 15616519 # Func_class: E Amino acid transport and metabolism # Function: Saccharopine dehydrogenase and related proteins # Organism: Bacillus halodurans # 1 400 1 399 410 695 81.0 0
MKRTLIIGCGGVATVAIHKCCQNSEIFEEIMIASRTKSKCDKLKAQLDGKTKTIIHTAQV
DADNVEELVALMKDFQPEVVLNLALPYQDLKIMDACLEAGCHYVDTANYEPEDTAKFEYS
WQWAYREKFEQAGLTAILGCGFDPGVTGVFSAYALKHEFDEINYIDILDCNGGDHGYPFA
TNFNPEINIREVSANGSYWEDGHWVETKPMEIKREYDFAQVGKKDMYLLHHEELESLGLN
IPGIKRIRFFMTFGESYLTHLKCLENVGMTSIEPIEYEGKQIIPLQFLKAVLPDPASLGP
RTVGKTNIGCIYQGKKDGQEKKYYLYNVCDHQECYKEVGSQAVSYTTGVPAMIGTMLLLQ
GQWKRAGVYNVEEFDPDPFMEALNKWGLPWIEDHNPVLVD
>gi|223714214|gb|ACDT01000001.1| GENE 32 26148 - 27257 1107 369 aa, chain +
## HITS:1 COG:BH3958 KEGG:ns NR:ns
## COG: BH3958 COG0019 # Protein_GI_number: 15616520 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Bacillus halodurans # 1 369 10 379 379 525 63.0 1e-149
MKTPYYLIDEQLLEKNLKILHTVKERTGCKILLAQKCFSMFSVYPLIAEYLDGTTASGLY
EAKLGYEEMHKENHVFSTAYRDDEIEEIMNICDHVVFNSFNQWQKYQNLAKQTSTSCGIR
INPECSTQIGHEIYDPCAKFSRMGVTLENFKEELLEGIEGLHFHTLCEQNSDDLITTIRA
VEEKFGKYLGQMKWLNFGGGHHITRKDYDLEKLIKIINEIKEKYQVQVYLEPGEAVALNA
GFLVSSVLDITHNGMAIAIMDTSAACHMPDVLEMPYRPEIIDSALPNVKTYTYRLGGPTC
LAGDIIGDYSFDQPLQVGQQLWFKDMAIYSMVKNNTFNGMNLPSIVLKQKEKLKIIKEFN
YCDFKNRLS
>gi|223714214|gb|ACDT01000001.1| GENE 33 27275 - 28141 614 288 aa, chain -
## HITS:1 COG:YPO4093 KEGG:ns NR:ns
## COG: YPO4093 COG0561 # Protein_GI_number: 16124202 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Yersinia pestis # 11 277 3 265 269 87 26.0 4e-17
MKITKKQLNKIKMIVCDLDGTLFNHKKEISSATITYLIKLQENGYTLVLATGRFFYELKP
YIEQLQLEKYHGYVICCNGIEIYDLTHNKCRSFSFLEQIEINELLQLALYYRLNIRANFD
HKYQMILNNWLYFLIPLIRIFTKRYPDLTFYKNTAVVPWNKLGKICLLSSPRKLKLFEAA
AQTKFPGLYQYYYTNPYCVEVVKAEVNKCYAVKYLCQLNNLNLENVLAFGDSGNDEELLN
NAGIGIIMKNGLSSLKKKATYLTFKNNRHEGVLNMLEYLFDDLIKDKN
>gi|223714214|gb|ACDT01000001.1| GENE 34 28367 - 29188 958 273 aa, chain +
## HITS:1 COG:no KEGG:EUBELI_00615 NR:ns
## KEGG: EUBELI_00615 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 267 2 276 285 194 41.0 2e-48
MEILLEPIQESLMTVPVLFLACLLVEYISNRDVVNKVMEFGKVGPFIGAILGSIPQCGFS
VIAARLYSMRYLTMGTLLSIFIATSDEALAILVVHPDLWEMLVVLIVLKIVLGTVTGYVV
DRLNHRSHDDYEYLQVEPCDCGCQNGIWLPALRHTLNIFIFILLTNIVFTLIIAGIGEET
VTNLLKTNWIFQPIIAGVIGFIPNCAGSVILTQLYVSGGLSFGALLTGLTTSAGVGTLAL
IKYNENKKDTFKILFISYVIALVVGYIASLVML
>gi|223714214|gb|ACDT01000001.1| GENE 35 29494 - 30459 1252 321 aa, chain +
## HITS:1 COG:L0018 KEGG:ns NR:ns
## COG: L0018 COG0039 # Protein_GI_number: 15672358 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Lactococcus lactis # 8 312 6 311 314 314 50.0 2e-85
MRQDDDLKKVVLVGTGFVGMSMAYSILNTGGIDELVLIDVDQEKAIGEAMDISHGLPYSK
SSLKVKAGGYDECKDADIVVITAGAAQKPNQTRLELASVNAKIMKSITEQIMASGFDGII
IVASNPVDLMSYVVQKVSGLPTSRVIGSGTILDTARLRYLLSEYLNISSTNIHAYILGEH
GDSSFVPWMNTYIGCKSMMEYIVEMNIDMNEMHKIYKEVQQAAYEIIKRKNATYYGIGLS
LNRLITAILSDENAVLTVSAYQQGEYKQEGLYIGVPAIINRQGISKIMTLHLNNVDQHKF
DRSCETLKEMIDGELEAIINS
>gi|223714214|gb|ACDT01000001.1| GENE 36 30493 - 31302 826 269 aa, chain +
## HITS:1 COG:BH3665_2 KEGG:ns NR:ns
## COG: BH3665_2 COG0860 # Protein_GI_number: 15616227 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 63 267 4 179 180 76 31.0 5e-14
MKKLLSIFLIMLCFSGCKQNNDSELIKTEAVEAQQTAFVEKEGEVLVPESLRTVQLPIVG
DSRVIVIDAGHQRRGDSNKEPIGPGASQTKAKVTTGATGISTGNLESAINLEVALKLQKK
LEDSGYQVIMVRTSQDVNISNQQRAEIANKNNAGAFIRLHCNSDDSSSIHGTLTMAPSES
NPYCSQIATASQRLSKTVVNSICNQTGSKNRGVIITDVMSGINWCQVPVTIVEMGFMSNP
DEDRLLGDSTYQDKIVTGIVLGINEYFKN
>gi|223714214|gb|ACDT01000001.1| GENE 37 31672 - 33129 1785 485 aa, chain +
## HITS:1 COG:lin1965 KEGG:ns NR:ns
## COG: lin1965 COG0793 # Protein_GI_number: 16801031 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Listeria innocua # 27 485 38 496 496 224 32.0 2e-58
MEDFIRPNHNPEPKLKPKKTKHIRETLFIVCMIVCLAVGFVSGYVAKKTVPTNSTSKNAE
SIIDEAYEILDEAWLNPNDKDVDIKGNTITALVESLGDMHSSYFTYEESKTYNQSVDGNY
EGIGVAQRTVSEGTMVMQVYKNSPAEKSGLQVGDIITGVDGNSVAGKSADEISDLIRGEA
NTKVKLTIIRNTEQQEVEVERANVDSAVTSEIRDNDGKKFGYVKINTFGSTTADDIEAAL
QTFTAEKIDTLVLDLRDNGGGYLTAATDVLSLFMKEDKLLFQMETKNGAIEKYKAKDCQK
YNFINGYILVNGNTASASEVVAGALQEKMNYKLVGDQTYGKGTAQTQKQLSDGSVLKYTY
AKWLLPSGTWINGKGLTPDYSVSNTDTSGIYTKALETDMGYDSVGTAIASMQKMLSILGY
DCGRNDGYFSQQSVEALKQFEQANNLTVDGIYTNSDRQKLEAAVIMYANSENNDYQYKKL
MELIK
>gi|223714214|gb|ACDT01000001.1| GENE 38 33224 - 35197 2272 657 aa, chain +
## HITS:1 COG:BS_uvrB KEGG:ns NR:ns
## COG: BS_uvrB COG0556 # Protein_GI_number: 16080570 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Bacillus subtilis # 3 656 4 660 661 809 63.0 0
MQKFKLVSPFSPMGDQEEAIKQLVAGIKEGKKEQVLLGGTGTGKTFTVSNVIAAVNKPTL
VLAHNKTLAGQLYSELKEFFPENRVEYFISNFDFYQPEAYIPGRDLYIDKNAKTNYEIEM
LRSAAMNSLIEREDVIVVASVASIYGLGNPEQYKEMIFSLRVDQDIDRKELLTYLVDRQY
QRNDIEQTKGTFRVRGDVIEIVPGHTESWLIRIELFGDTVERISEVDPLTGHVLGVYNTY
TIYPAYGYVTKKEQILKACDTITAELENRLETFKAETKLLEHERLEQRTRHDVEMLREVG
MCPGIENYSRHIDGRDEGQRPYTLIDYFPKDFLMIVDESHVMLPQVRGMYNGDRSRKETL
VEYGFRLPSALDNRPLRFEEFEKIINQVIYVSATPGDYELEHVENKVVEQIIRPTGLLDP
VIEVRPTKDQIDDIISEIKIRQDRNERVLITTLTKRMAEDLSAYLKELGIKVAYLHSDTK
TLERTEILRDLRLGKYDVLVGINLLREGLDLPEVSLVCILDADKEGFLRSNRSLIQTIGR
AARNSNGEVIMYGDKITDSMAYAIEETNRRRKIQDAYNKEHNITPTTIHKEIRDAIRGQE
VIDDAASLVKKGKKASKKDKQVMIQELEKQMKDAAKVLDFERAMELRDIIMELQGEK >gi|223714214|gb|ACDT01000001.1| GENE 39 35289 - 36542 1197 417 aa, chain + ## HITS:1 COG:PM0247 KEGG:ns NR:ns ## COG: PM0247 COG0617 # Protein_GI_number: 15602112 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Pasteurella multocida # 11 382 21 393 420 150 32.0 4e-36 MLIYNRVDYHDVDVEVYGLSVDELEELLANYGNVNSIGKSFGILKLDVLPNFDFALPRTE IKTGESHQDFDVTVNQNLDLKVAASRRDLTINALMYEIKTGKIYDFFHGQEDIEKRTLRM VSETTFIEDPLRVLRTAQFASRLDFCIETATKLMCKKMVQNKSLDKLSKERVFQEYSKLL LSQQPSIGLTFLKEIKALWPCLDVLSKTMQRLDYHPEGDVWRHTLLVTDLAALCCHKTSN PLGFMWSALLHDIGKATVTTKDGHAPGHNEAGVKIFNQEVKAFIPDKQLQKYIKTMIFYH MHLMNMVRNEAKDYSYFKILKGIDGIVTIEDLILITKCDKLGRYKDEHENINRFDYVMKE KMARLSTKAQLPLVDGYDLKALGIEPSSKYSELLDWAYDLQLRGHSKAAILKMVEGR >gi|223714214|gb|ACDT01000001.1| GENE 40 36546 - 39368 3354 940 aa, chain + ## HITS:1 COG:BH3594 KEGG:ns NR:ns ## COG: BH3594 COG0178 # Protein_GI_number: 15616156 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Bacillus halodurans # 1 940 1 938 957 1217 63.0 0 MEHDKLVIKGARENNLKNIDIEIPKNKLVIMTGLSGSGKTSLAFDTIYAEGQRRYVESLS AYARQFLGNMEKPDVDSIEGLSPAIAIDQKTTSNNPRSTVGTVTEIFDYLRLLYARIGKA YCPEHGIVIESQTIKQMADTIDKYPDGSKLQVLARVVKNQKGTFKDLFEDLLKDGYIRVQ VDGEVKLLDEGIELDKNKKHNIDVVIDRIIKKEGYRSRLIDSLEVGLKLTGGEVVVANLS DGSEALFSEHLACPECGFSVPKLEPRLFSFNNPLGACPDCRGLGIKNEVDDDLLIPNWDL SINQGGIRYFKTSVGTDRIEWQRFLILCKTYKIDLDKPLKDFTKKELRIILEGSDKPISY EIVSSSGNVSKTNRPIEGVKTLIKRRFEETTSSWSKEWYASFMAEHTCPTCGGRRLNDQV LSVRVGGLNISEFTDMSIEKALKFVDELKLNEYEAKIANLVLKEIKDRLGFLNDVGLGYL TLSRKAGGLSGGEAQRIRLATQIGSRLTGVLYVLDEPSIGLHQRDNDKLIATLQNIRDLG NSVLVVEHDEDTMRASDFIVDIGPGAGVHGGEVIVAGTPQEVMNCKKSITGQYLSGRLKI DVPKKRRKGNGNFIQVKGAAANNLKNINVKFPLGCMSVVTGVSGSGKSTLVNEIMGKAMM ARIYRSKEKPGKHKDILGLENIDKVIEVSQDPIGRTPRSNPVTYTGVFDDIRDLFAQTNE AKMRGYDKGRFSFNVKGGRCEACQGDGLKRISMHFLPDVYVPCEQCGGKRYNEETLQVTY 
KDKNIYDVLEMTIEDAVEFFGNLPKIKTKLQTLNDVGLGYIKLGQSATTLSGGEAQRVKL ASELQKKATGKTVYILDEPTTGLHSNDVAKLIEVLQRIVDQGDTVIIIEHNLDVIKVADY IVDLGPEGGDGGGTIVVAGTPEKVAKCEASYTGSFLKKML >gi|223714214|gb|ACDT01000001.1| GENE 41 39382 - 40317 1242 311 aa, chain + ## HITS:1 COG:lin2626 KEGG:ns NR:ns ## COG: lin2626 COG1493 # Protein_GI_number: 16801688 # Func_class: T Signal transduction mechanisms # Function: Serine kinase of the HPr protein, regulates carbohydrate metabolism # Organism: Listeria innocua # 1 309 1 307 312 228 38.0 1e-59 MPKSITVGELLNNEHYVQVTGDEKSLERPIYVAEINRPGFELAGFFKHSDFRRIIIFGDK ESAFINEMSEERQREIFPFLINDEVPCIIVAKGNECPAILIEIARKKDFPIFITESATGR VSIELTNILDEALAPETLIHGVFLNIYGKGVIIKGDSGIGKSEIALELVKRGHLLVADDA VELYHLGQSIVGKAPEVLTNLLEIRGIGVIDVSKMFGISSILDKDKVDLIIQLDRWVPSR EYTRVGVEENDSTEEILGVKIPKIVVPVSSGRSMSVIIEAAVMNMRLREYGVDSSKEFVD RILKNIDKNKG >gi|223714214|gb|ACDT01000001.1| GENE 42 40321 - 41121 783 266 aa, chain + ## HITS:1 COG:BH3589 KEGG:ns NR:ns ## COG: BH3589 COG0682 # Protein_GI_number: 15616151 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Bacillus halodurans # 5 260 7 261 289 191 43.0 1e-48 MTFFPDGKTFVQIGSFSIAWYAICIITGAFIAYKLGQYNFKKIGYNKEILSDYFFGLMIT GVLGARLWYVIFMWNELYAENPLEIIMFRHGGLAIQGGIFVGLIFSWWYFKKHKIDFMVA ADAIMPGVLIAQALGRWGNFFNQEAYGGMVSLDFLKSLHLPNFIIEGMHINGYYYHPTFL YESIANIVGFLLIYFVIRKIQTKQGEQFFSYFIWYGIFRFFIEGLRTDSLYVMGLRTAQL VSIVFVIVGIAGYIYCKKYGKPAIKA >gi|223714214|gb|ACDT01000001.1| GENE 43 41319 - 42035 904 238 aa, chain + ## HITS:1 COG:CAP0006 KEGG:ns NR:ns ## COG: CAP0006 COG2188 # Protein_GI_number: 15004711 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 236 1 236 237 168 36.0 1e-41 MKLKYDVGDKPLWAQLFDILLDRIVSKEYNAGDKMPTDLEIMQEFDVSRMTVRQAMNKLI NEGYIERRRGKGTIVLEKENKVETILKSSFNGVLEKNDIKNRRVLNVEMVNAGQEIADFF NITTKDKVLRLVREIRINNMVVSIHETFLNPIVPLDEQGDYSGSLYEKLMGAGYGISNVK EKISASLMNQKQKELFEIKGDEAMINRQRRGFCHDFPVEFTNSMYLSQGYELIIDLCE 
>gi|223714214|gb|ACDT01000001.1| GENE 44 42150 - 43436 1167 428 aa, chain + ## HITS:1 COG:BH3919 KEGG:ns NR:ns ## COG: BH3919 COG1455 # Protein_GI_number: 15616481 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus halodurans # 8 426 5 427 430 283 38.0 6e-76 MDKFSKFLEEKLMPIGQKINNIRFLQVIRMAMMPLIPFIIAGSFTLIVLNFPFIDKVLPA GFLEMLGNVLSPLSGTTLSLVAMFLAFLMGYNYAKTEEQKCEEVYAGITTLVAFLIVSPL SITVGEEVIGGVIPTTYMGSQGMFVALILCYLVAKVYCKILSSKLKIKLPDSVPPMVSSS FESLIPVIITLVLVCSINYCFTLTSFGNIHALVNEIIQKPLLLIGTGLPALLISQGLAQL LWFFGLHGDQIVGSVMDPILQTAGMENLSAYTAGDAVPYIITDQFRALFVMIAFMSLVIA ILIVSKSNRLKGVGKVAVLPATFCISEPVVFGAPVVMNAMLFIPWVLSRPVFGLITWLFM KFGLCPYPTGVAIPWTTPPVLSGFLATNSIMGAVVQIICLAIGVLMFIPFVKMIDKTYHQ EEIKTKEQ >gi|223714214|gb|ACDT01000001.1| GENE 45 43438 - 44868 1620 476 aa, chain + ## HITS:1 COG:BH3918 KEGG:ns NR:ns ## COG: BH3918 COG2723 # Protein_GI_number: 15616480 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Bacillus halodurans # 5 466 8 466 466 396 45.0 1e-110 MKKLYRLPEGFKFGSSASAWQTEGWTGKKEHQRNFVDMMYLAEPERWFNGVGTIKSTDFY NRYEEDIKLMKELGIQVYRTSIDWSRFITDYETNEVDQDALRFYEKVIDCLIENGIEPML CLEHWEIPEYVINKYGTWDNRKVMELFVGYSKKVMAAFHDKIKYWWTINEPAVVPNEAYL FGNIWPYEENTKKSVQMNYHRVLATSLLVEHHAKMGYQGKVGIILNPSPSYPKSMNNPKD IEAAKICDMFNYKQYTDPLINGKFDEAYFELLKKHECMFEYVKGDFEKIDDNKIEVVGIN YYQPLRVRQRETAWNPDKPFIPTAYYETYSPRGIKMNFSRGWEIYPKGIYDMLKIIQNDY RNIECWITENGMGVMDEDKFKDASGQIQDSYRIEFLSDHMAWMLKAIDEGCDCRGFLNWT FTDNLSPVNAFRNRYGFVEIDMENNRNRRIKKSGDWVKELMRTRCFEAEDTEPEYK >gi|223714214|gb|ACDT01000001.1| GENE 46 44893 - 46611 1601 572 aa, chain + ## HITS:1 COG:BS_ydhS KEGG:ns NR:ns ## COG: BS_ydhS COG1482 # Protein_GI_number: 16077654 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Bacillus subtilis # 226 556 3 305 315 83 24.0 1e-15 
MSNYDNSPYIKINGYDNDAYSGYAQIDKKIKESLGNKKIVVVDCYLGINDRELLNVLIKK LTPAHVILSEDIFYDGKKLTEMMQVNLTEDRVRGVMYYGTIRDYVDEAKLAKIQKFVKNK EGIVLVYGFGASLITKGDLLIYADLTRWEIQLRYRAGLPNFKQSNYDEDPLIKNKRGYFI EWRIADKHKREIFEDIDLYLDTNCSNKPKMITGIAFRNALKSVCNQPFRLVPYFDPGVWG GQWMKEVCNLDKSKSNYAWSFDGVPEENSINLKFGDIVIATPALNVVMYQAIELLGAKNY ARFGAEFPIRFDLLDTMGGGNLSLQVHPTTEYIKSHFGMPYTQDESYYILDCDGGGVYLG LKENINSDEFINDLERANEGNQSFDADKYINFFPAKKHDHFLIPAGTIHCSSANCVVLEV SATPYIFTFKLWDWDRVGLDGKPRPIHVEDGKKVIDYQRTTTWVESNLVNNFTEIEKSDE YTEVKTGLHELEFIETRVITTKQKIYQQTNKSVVMMNLVAGESAVIKSPKNKFEPYTVHY AETFIVPAQVGEFTIEPFDKNEIKVLKAYVRN >gi|223714214|gb|ACDT01000001.1| GENE 47 46871 - 48976 1894 701 aa, chain + ## HITS:1 COG:BBB29_1 KEGG:ns NR:ns ## COG: BBB29_1 COG1263 # Protein_GI_number: 11497021 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Borrelia burgdorferi # 5 446 10 464 469 435 51.0 1e-121 MKTRVFGILQRIGRSFMLPIAVLPIAGLFLGVGSSLTNETTISSLHLEGVLGNGTFLHDF LIILTKVGSGLFNNLPLIFAAAVALGMAKKAKEVAVLSSIIAFFVMHTTISGMLSLNGSI LDNQIINPDVLDGTITAVCGIMSLEMGVFGGIIVGLGVSYIHNRFYQIELPRALSFFEGE RFIPIISTITFLFVGIVMFYVWPFFQNGIYELGKLISTLGYFGTFVYGVIKRLLVPFGLH HVFYLPFWQTAIGGSMVVNGAVIYGGQNIFFAQLADPNIVHFSSEATKYFTGEFIFMIFG LPGAALAMYQTAKPENKKAIGSLLFSAALTSILTGITEPIEFTFLFAAPLLFGVQVILAG SAYMVAHIFNIAVGLTFSGGLFDFIVFGVLQGNSKTSWVLLIPIGIIYFMLYYFSFKYLI KKFDLKTPGREIDNMKLSIFKNPKASHRKLLNKGIEIDKQSQLIVRGLGGRDNFTDLDCC ITRLRATVSDNQLVNEGLLKQSGAAAVVMQGNGVQIIYGPKASSIKSKLDEYLINIPDEL DDYQLEKETDHTNLELGNIVDGEVLPIEQAPDQIFSHKLLGDGVMIKPFHGLIVAPCDGE ITMIYPTKHAIGMKLDNGYELLIHFGTDTVNLNGTGFEILVKLNQRVQKGDLIWNADLEY IKENALDESILLIFTNLSNDLKIEKEYGKIKSGSTIMKICN >gi|223714214|gb|ACDT01000001.1| GENE 48 48993 - 49817 763 274 aa, chain + ## HITS:1 COG:BH0845 KEGG:ns NR:ns ## COG: BH0845 COG3711 # Protein_GI_number: 15613408 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus halodurans # 1 270 1 273 276 144 31.0 2e-34 
MYKIIKVLNNNTILACNETEEVIIMYKGIGFSKKVDDYFEVPRNAKKFLMQKGYKAKSGS NDIINYIDPVYLEIASEIIRLIIEKFGKIDNDILLPLADHIYFAIKRMDENIMPLNPFIN DIRLLFPDEYEVALEGREVINKFISRMINDDEVGFITLHIHSAISSNQVAESMEATRIIH ESIIKLENDLNIVIDIQSVSYARLMNHIKFLILRLNKNEELQMDISEFTKDKFPFAYEQA SNMCEALSKVLKKKLPETEVGYLALHLERILSSI >gi|223714214|gb|ACDT01000001.1| GENE 49 50040 - 51029 1434 329 aa, chain + ## HITS:1 COG:lin0143_2 KEGG:ns NR:ns ## COG: lin0143_2 COG3444 # Protein_GI_number: 16799220 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Listeria innocua # 165 327 1 163 165 217 69.0 2e-56 MVGIILASHGEFANGILQSGAMIFGDQVDVKAVTLQPSEGPDDLKAKMEAAIATFENQDE VLFLVDLWGGTPFNQANGLIAGHEDKWAIVTGLNLPMLIETYASRMSMETAHEIAKHVTE VAREGVKVKPEELEPETKVAAQAINTQSQGAIPEGTVLGDGHIKYVLARIDTRLLHGQVA TTWTKTTQPNRIIVVSDAVSKDKLRKQMIEQAAPPGVKANVVPIDKMIQVAKDPRFGNTK AMLLFETPQDALRAIEGGVDIKELNIGSMAHSIGKVVVNKAIAMGPEDIETIEKIKAKGI TFNVRKVPSDSNENIDTLLKKAKAELKNV >gi|223714214|gb|ACDT01000001.1| GENE 50 51055 - 51870 1117 271 aa, chain + ## HITS:1 COG:lin0144 KEGG:ns NR:ns ## COG: lin0144 COG3715 # Protein_GI_number: 16799221 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Listeria innocua # 1 271 1 268 268 260 68.0 2e-69 MSLITIILIIIIALLAGMEGILDEFQFHQPLVACTLIGLVSGHLTEGIILGGSLQMIALG WANVGAAVAPDAALASVASSIIMVLGLEGGATDVQTAISTAIAVAIPLSVAGLFLTMICR TLTIPIVHFMDGAAEQGNMRAIDMWQILAILLQGIRIAIPAAALCVVPAAVVTDALNQMP PWLSGGMTVGGGMVAAVGYAMVINMMSTKETWPFFALGFVLAAIGELTLIALGAIGVALA IIYLGLKENSGSNGGGSTGSGDPLGDILNDY >gi|223714214|gb|ACDT01000001.1| GENE 51 51894 - 52811 902 305 aa, chain + ## HITS:1 COG:CAP0068 KEGG:ns NR:ns ## COG: CAP0068 COG3716 # Protein_GI_number: 15004772 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: 
Clostridium acetobutylicum # 1 305 1 303 303 423 71.0 1e-118 MAEKIKLSKKDRMSVAWRHQFLQGSWNYERMQNGGWCYSIIPAIKKLYPNKEDKVAALKR HMEFYNTHPYVSAPVMGVTLALEEERANGAEINDTAIQGVKVGMMGPLAGVGDPVFWFTL RPILGALGASLALSGNIIGPLIFFFAWNIIRIAFLWYTQEFGYKVGTSIAQDLSGGLLGK ITQGASILGMFIIGSLVQRWVSITFTPVVSTVTQSKGAYIDWNSLPSGVDGIKSALEQFA SLGSNGLNVEKVTTLQQNLDQLIPGLSALLLTLLCCWLLKKKVSPIVIIVALFIIGILAR LGGIM >gi|223714214|gb|ACDT01000001.1| GENE 52 52854 - 53237 402 127 aa, chain + ## HITS:1 COG:CAP0069 KEGG:ns NR:ns ## COG: CAP0069 COG4687 # Protein_GI_number: 15004773 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 123 12 134 135 155 60.0 2e-38 MVQSLNTKVDLTVKATSYLGLANYGEVMVGDKAFEFYNEKNIRDYIQIPWEEVNCVMASV MFKGKWIPRFAVVTKNNGNFTFSSRDNKALLRAINKYIPSENLVRSLSFFQVIKRGLKSL FRKKSRQ >gi|223714214|gb|ACDT01000001.1| GENE 53 53405 - 54085 785 226 aa, chain + ## HITS:1 COG:lin2728 KEGG:ns NR:ns ## COG: lin2728 COG0745 # Protein_GI_number: 16801789 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 1 219 1 220 225 188 44.0 7e-48 MFHILVVDDDKNTRQLIKAVLETENYIVIGATDGEDALKVLDRSHIDLVILDVMMPKMDG YEFTEILRRVQNELPILMVSAKQLPADRNKGFIVGTDDFMTKPFDEEELLLRVKALLRRA KIVNEHRIIIGEVILDYDSLTVKRKNTCQELPQKEFMLLYKLLSYPGKIFTRIQLMDEIW GVDSDTGWETVTVHVARLRKRFEDWPEFEIISVRGLGYKAVRNDES >gi|223714214|gb|ACDT01000001.1| GENE 54 54075 - 55127 916 350 aa, chain + ## HITS:1 COG:lin2727 KEGG:ns NR:ns ## COG: lin2727 COG0642 # Protein_GI_number: 16801788 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 56 349 164 459 459 184 32.0 3e-46 MNRKKAPPRALLTLIFTIVVFIILVVTMIIVGCVIFFLTQIGVLGISQTASAPIHFQIII FAMTSIVIGTIVAALGSHFPLRPLNTLIEAMNQLAHGKYDIRINLGKHRISQELTDSFNT MAEELENTEMLRTNFINDFSHEFKTPIVSIKGFAKLLQKDNLDQKTHDKYLQIIETESSR 
LADMATNVLNLNKIEKQTILSNITSFNLSEQIRNCLLLLEKKWSNKNLNLIIDFDEHQIA GNQELLYQVWINLLDNAIKFAPRDGKLGIKIIHNQSDYQILISNNGPKITDEEKKYLYNK FYQGDTSHATEGTGIGLAIVKKIVSLHSGTIDVDSNKNETTFIVTLPLNL >gi|223714214|gb|ACDT01000001.1| GENE 55 55237 - 56994 219 585 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 339 569 2 231 245 89 28 7e-17 MIKIFKNFTKKDWSLIIICLAFIILQVWLDLKLPDYMSKITMLVQTTGSTMDEIIESGVK MLGCALGSLASSVVVAIFAARVATNFAANLRERLFDKVQAFSMEEIGSFSTASLITRSTN DISQIQTLVVMGLQAIIKAPIMAVWAIFKIAGKSWQWTASTGVAVIVLLVIVGSCMIFAL PKFKKLQQLTDDLNRVTRENLTGISVVRAYNAENYQENKFENANQELTSANLFANRVMAI MSPGIQGIMSGLTLAIYWIGAILIENAAMMDKLNLFSDMVVFSTYAMQVVMAFMMLVMIF IMFPRASVSSKRIMEVLETVPIINDGTVTVAKPGIKGQIEFKNVSFKYPDAEDYVLQNIT FTVAQGETVAFIGSTGCGKSTVINLVPRFYDVTEGEVLVDGINVKAYTQYALHNKIGYVS QKAVLFSGTIESNVAYGDNGRNNNLQNNVIDAIYTAQASDFVEASPRRYNSYVAQGGSNY SGGQKQRISIARAICRQPEIMIFDDSFSALDYKTDQKLREALKKDCCDATKLIVAQRIGT IRDADKIIVLDEGVIVGMGTHEDLMKNCQVYQQIARSQLSKEELE >gi|223714214|gb|ACDT01000001.1| GENE 56 56994 - 58814 193 606 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 369 589 132 353 398 79 25 8e-14 MAEGKYLEKNDLKAPGVNMGSKEKSKDFKETWLKLLKYCRKYWIVMVMAIISAIFGTILT LIGPDRLSELTDLITNGIMTGIDMTRIGKIGLTLVGFYVTSAFFSLFQGFVMATVTQWVT KNLRRDISRKINRLPISFYNHTSTGDILSRVTNDIDMIGQSLNQSVGTLVSSVTLLAGSL IMMLKTNLIMTFTAVLATVIGFALMLVIMSKSQKYFARQQKHLGEINGYVEEIYTGHTIV KAYNGEVKAKETFTRQNNNLRNSGFKAQCLSGLMMPIMSFIGNFGYVAVCIIGAALAMSG DISFGVIVAFMMYIRYFTQPLSQLAQAAQSLQSAAAAGERVFEFLEAQEMADESHKVKRL SKAKGGVEFRHVHFGYDSNKIIINDFSARAEPGQKIAIVGPTGAGKTTIINLLMRFYEVD RGEIRIDNIPTKDLRREDVHGQFCMVLQDTWLFEGTVRENLVYNTKNISNQIIEEACKAV GLDHFIRTLPYGYDTVLNDQVNLSQGQKQQLTIARAMIADKPMLILDEATSSVDTRTELQ IQKAMDELMEGRTSFVIAHRLSTIKNADLILVMKDGDIIESGNHEDLLSRGGFYAELYNS QFDKAV >gi|223714214|gb|ACDT01000001.1| GENE 57 58888 - 59736 1307 282 aa, chain - ## HITS:1 COG:Cgl2568 KEGG:ns NR:ns ## COG: Cgl2568 COG0561 # 
Protein_GI_number: 19553818 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Corynebacterium glutamicum # 5 281 3 271 271 252 47.0 5e-67 MADSKIIFIDVDGTLVDYENKLPASADKAIKQARKNGHRVYICTGRSKAEVYPYLWDIGL DGMIGGNGSYVEDGDTVVMHQLITAEQCKHIVDWLKSRGLEFYLESNNGLFASENFETRG EPVIQEYSKRKGKEHSDQIKVRDVFPEMIFDGELYRDDLNKVSFILESYQDHLDSIEEFP DLKAGTWGGAGEIALFGDLGVKDITKAHAIDALLEYLGADIEDTYAFGDAKIDIPMLDYC HIGVAMGSGGDEIKAIADYITDDVDQDGLYNAFVRFGLIEAE >gi|223714214|gb|ACDT01000001.1| GENE 58 60010 - 60630 646 206 aa, chain - ## HITS:1 COG:CAC0331 KEGG:ns NR:ns ## COG: CAC0331 COG4684 # Protein_GI_number: 15893623 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 5 201 8 189 192 89 34.0 5e-18 MRSKKTQYMTFMAMFLAIEIILVVTPLGYIPIGPLSATTMHIPVIIAGITLGKKAGAQLG LVFGLTSLIRATIQPGITSFCFSPFVTVGNISGDWRSVIIALVPRILLGYLAGVIFEFIK NKFNNENAAAVIGALIGTITNTVLVLGGIYFFFGTAYADAVNIAYSSLLAMLFGVVTTNG IVEALIGAVVTLLAYKAIKPMATNIK >gi|223714214|gb|ACDT01000001.1| GENE 59 60799 - 61767 1268 322 aa, chain - ## HITS:1 COG:BMEII0898 KEGG:ns NR:ns ## COG: BMEII0898 COG3191 # Protein_GI_number: 17989243 # Func_class: E Amino acid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: L-aminopeptidase/D-esterase # Organism: Brucella melitensis # 5 310 8 316 335 264 50.0 2e-70 MKKINITEIKGIQIGQVENQKAGTGCTVIICKDGATAGVDVRGGGPATRETDLLNPINMV QQINAVMLSGGSAFGLDAASGVMQYLEEHNCGFDMKVAHVPIVCGASLFDLTVGDPKIRP DKAMGYQACLNSEKNEPLKEGNYGAGCGASVGKIMGPQYAMKGGLGTAAIQIDNLQVGAI VAVNACGNIVDYKTNEQLAGIYDSEHNSIIPADEVIFQQIEKLRQLPSGNTTIGCIITNA KLDKAQCTKIAGIAHNGYARAIQPVHTMSDGDTIFVLSTNEVEVMPDAIGILATDLMAKA VNRAVKKADSAYGLKSYKELHK >gi|223714214|gb|ACDT01000001.1| GENE 60 61918 - 62583 867 221 aa, chain + ## HITS:1 COG:L12334 KEGG:ns NR:ns ## COG: L12334 COG1396 # Protein_GI_number: 15671989 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus 
lactis # 10 74 11 79 107 62 46.0 6e-10 MIGDKIILYRKRKGLTQEELADLLEVSRQTVTKWESGSVLPNLDYIMGLSVIFGITIDNL VKDNDCAKQEIESKISNYNWIDFMLKAKKATYAKKEGKTVSSRPNSHDYQYQENEYLYID TFVGSEFFGGEECVYKDNIPLYVMNYYGKVLDEAFSGDFLKEALLLVDRISPFRGPALYT NGNYTYHCSYDGDYEFFNGKEEIYYNNIKVYECLFHGGSVK >gi|223714214|gb|ACDT01000001.1| GENE 61 62926 - 64248 1499 440 aa, chain - ## HITS:1 COG:TM0033 KEGG:ns NR:ns ## COG: TM0033 COG4099 # Protein_GI_number: 15642808 # Func_class: R General function prediction only # Function: Predicted peptidase # Organism: Thermotoga maritima # 88 440 71 395 395 77 24.0 4e-14 MKRLSKLLLALFIACSITGCSQSTEVVDATYEIYIAGYDWGCGVNKTILTLDKAVDDVDK NDFMVSETKQVTDWEDEALPVVEKTLERVVDDAYSCDKDGQKIDGESKYIAIELYVSPND GSPLLYSATTHYNTWSDPYYLNISLAKNGEITVDDKKVTKLDVSTEYTKKITAADALELE KFKASDGIELNYGHFNPKEPSNTLFVWLHGSGEGGTEDTNPQVTSLSNKVSAYFNDDFQN AVGNAYVLVPQCPTFWMDADGNGGEWNDDLLVTHGPSYYTNALMELIKDYQSRCGATKVV LAGCSAGGYMTMRLILDYPDYFAAAIPICEALPDDVISDQQLSEIKDLPLYFIYSNDDPS IIPEKFEIPTINRLKNLGATNLHVATFDSVIDTSGKYNDLNDGDPYQYNGHWSWIYFDNN EADCDEHGEKVWQWIGEQVK >gi|223714214|gb|ACDT01000001.1| GENE 62 64328 - 65200 653 290 aa, chain - ## HITS:1 COG:CAC1451 KEGG:ns NR:ns ## COG: CAC1451 COG2207 # Protein_GI_number: 15894730 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 44 285 35 284 295 82 28.0 8e-16 MLNGKELLDNNTHIIYEKITTDELPLKVIEVTLAKDDPNLIPPKHWHRSLEFIIPLTASI ELWSNGETYSIARNGLAIINSQAIHTTKIIESEDCFKSIVIQIKYDFLKKCYPAFDNIYF SNHIVPEIEQKLVVLLTNLSLEYQQTTEFKLLTINGYLYLIISILLKYQKKIRKVGYFLQ SDQKLQEFMKILFYLDKHYSQALDVSSIANQFNFSYGYLARLFKKHLNITVKQYLTAQRL EHCTNDLIHTNLTITEIAMKNGFSSTKSFNREFKLKYHETPQVYRNKVRK >gi|223714214|gb|ACDT01000001.1| GENE 63 65291 - 65797 587 168 aa, chain - ## HITS:1 COG:FN0772 KEGG:ns NR:ns ## COG: FN0772 COG0716 # Protein_GI_number: 19704107 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 160 1 162 169 58 26.0 7e-09 
MKIAIVFDSHTGNTKKVANVIKEACINEEVVYFGEPQTFSDADLIFIGSWTDKGNCSQKI QNFLTTLSNEKIAFFATAGFGGSTEYYDKLAERFDNVVNTNNKILGHFFCPGKMPLSIRD RYVKMIQEHPEDKQLQVSIDNFDAALSHPDTNDLENAKKYALEMIAKV >gi|223714214|gb|ACDT01000001.1| GENE 64 65883 - 66545 743 220 aa, chain + ## HITS:1 COG:BS_yvoE KEGG:ns NR:ns ## COG: BS_yvoE COG0546 # Protein_GI_number: 16080550 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Bacillus subtilis # 10 213 7 210 216 169 44.0 3e-42 MMCKGDFMDTAIIFDLDGTVLYTDELIKRTFIKVFEKYQPGYTLSEDELLSFLGPSLKET FSKYFPDEMFNELLNYYHSYNHSHHEDFVYVYPTVVETLEYLKNRGYPLGIVTTKLKVAA DVGLNTFDLKKYFDVVIGLDDVKVTKPDPEGIIKAMELLGVKKAVYIGDNITDIQAGKNA GIKTIGVKWSPKGYQHLLELEPDLMIDEMKEIIPYIEGAC >gi|223714214|gb|ACDT01000001.1| GENE 65 66547 - 67455 495 302 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 1 291 1 294 306 195 38 7e-49 MKDLIIIGAGPAGLSACLYATRAGAEVLMLDAGAPGGKLNVSAEIENWPGQKKKTGPELA YEMYEHALSFGGEQAYGEVTKVIDYGDYKEVITTDQSYEAKNVLIATGTKERKMGIPMEE ELTGHGVSYCAVCDGPFFKGDEVAVVGGGNSALEEAIYLTKFAKKVHLIVRRDVFRADKI IQDHVQANDKIEIHFLKKPHHLIAKDNKVAAIALEDSKTGEVSELAVKGVFPFVGLDPIT DFVKDLGITDERGYIITNELMETSVKGIYAAGDVRQKVLRQVVTATNDGAIVGQQVGQGL NN >gi|223714214|gb|ACDT01000001.1| GENE 66 67452 - 68324 705 290 aa, chain - ## HITS:1 COG:BS_ywfK KEGG:ns NR:ns ## COG: BS_ywfK COG0583 # Protein_GI_number: 16080817 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 7 289 6 291 299 147 30.0 3e-35 MITRKHLAIFKTVAKTKSMSKAAKLLYISQPTISQKIQEIEDEYQVKLFERYSKTLFISE EGQTLLVKANQILNLYDEIEHIFTKTKQFPLKIGATLTIGSTLISPILKKLKDKYQDITF QVYIDNTNVIEEKLLNNTLDIALVEGTIINQDLIVEPVIKDRVVFVCAQDYPLNNVKVMS LTELSLHPFIDREQGSGTRNQLDECLKNNNIFLEPAWQCHSWESVKQAVLNGHGIALMPM KIIENEVIEKSIQVINVNNISIERDFSICYHKNKSINKKMQAFIKTCKNY >gi|223714214|gb|ACDT01000001.1| GENE 67 68443 - 69483 1066 346 aa, chain + ## HITS:1 COG:CAC1513 KEGG:ns NR:ns ## COG: CAC1513 COG1145 # Protein_GI_number: 15894791 
# Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 339 1 335 338 324 45.0 2e-88 MGYKVGFKDVNDILATLSQEYDIYGPKRFEKQGRYSDTDIIRYDKISKIEEVVFDQKSDF PAKEVITPITQSLFYFTEDEYRESKVTDKKILIFMRPCDVNAMEHQNKIYLGNGGYSDLY YQRLQEKVKIILMECTSGWDTCFCVSMQANKTDNYSLAIRHLDNEVLIEVKESEFEKYFM EAPKDDFEVSFIEENEVKVTVPEIKDKETLLKLKQHSMWQTFNSRCISCGACTIACATCT CFTTTDLIYNENASIGERRRTAASCQVDGFSDMAGGHAFRPTAGDRMRYKILHKFHDYKA RFNDYHMCVGCGRCTSRCPEYISVTATVEKMATAVAEIEVKGDNHE >gi|223714214|gb|ACDT01000001.1| GENE 68 69476 - 70270 922 264 aa, chain + ## HITS:1 COG:STM2549 KEGG:ns NR:ns ## COG: STM2549 COG0543 # Protein_GI_number: 16765869 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Salmonella typhimurium LT2 # 7 264 16 272 272 243 43.0 2e-64 MNNPIKPIPSTILEIKRESNLEYTFKVATNIKPEHGQFLQLSIPKIGEAPISVSAQGEGW LDFTIRSVGKVTDEIFNKKPGDVLFIRGPYGKGWPLEETFKGKHLVVITGGTGLAPVRSM LNECFENDGYVKSLTLIVGFKNEAGIIFKEELNKWQEKFNTFYTLDKDQIEGWNVGFVTD LVEKIPFDSFMGNYEVIIVGPPMMMKFTGEKVAGLGVPDEKIWVSFERKMSCAVGKCGHC RIDETYVCLDGPVFNYTKAKYLVD >gi|223714214|gb|ACDT01000001.1| GENE 69 70283 - 71320 1117 345 aa, chain + ## HITS:1 COG:STM2550 KEGG:ns NR:ns ## COG: STM2550 COG2221 # Protein_GI_number: 16765870 # Func_class: C Energy production and conversion # Function: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits # Organism: Salmonella typhimurium LT2 # 1 334 1 334 337 383 51.0 1e-106 MNHDIDIKKLRINCFRQSKVAGEFMLQMRVPGSMIAAKYLQVVQDIAENWGNGTFHIGMR QTLNIPGIKYEYIGEVNKYIKDYIKEVEVEMCDVDMDVTDYGYPTIGARNVMSCIGNTHC IKANVNTYQLARKIEKIIFPSHYHIKVSVSGCPNDCGKGHFNDFGIMGIAKMEYHPERCI GCGACVRACQHHATRVLSLNADNKVDKDTCCCVGCGECVIACPTGAWTRKPTKFYRVTLG GRSGKQYPRMGKIFLNWITEDALLQVFSNWQKFSAWVMDNKPEYIHGGHLIDIAGYPKFK ELILDGVELNPECLVAEELYWTESEQRANIHLKPLEQHKKAGPQN >gi|223714214|gb|ACDT01000001.1| GENE 70 71351 - 72541 912 396 aa, chain + ## HITS:1 COG:lin1592 
KEGG:ns NR:ns ## COG: lin1592 COG0373 # Protein_GI_number: 16800660 # Func_class: H Coenzyme transport and metabolism # Function: Glutamyl-tRNA reductase # Organism: Listeria innocua # 1 394 1 422 435 127 27.0 5e-29 MEFAVIGVSHKQLALDKRSLFSFTDTQKLEFSSLLLTYGIEQVLILSTCNRSEVYLMYQT DIDFLPSIYLNYFNQAEAPLYVKTGDEAFRHLLKVTCGLESMLIREDQILGQVKQAYDFT CRMSLGGKELSLIFQETLNFVKKMKHKYQQPSISLTHLAVEHLKKVSNLYNKKIMICGAG EIATSFIPYLYDNNELIFSNRNSDKLVKLKQQYPKIKIIPFNQRKILLDQVNILISATAS PHLIFNRDDFISSQLIAIDLTIPRDIDKTASIQCIDLDTLNHEVAVNNQLRSKDVEKINS EIDIKIREVRTKLDSIKYDYVIQSLQAKSMQLANQTYEILVNKLSLTARERHILEKTLKA SFMQIMKDPIHCVKTNQINDLEVINKLFSLKEDNEE >gi|223714214|gb|ACDT01000001.1| GENE 71 72538 - 73431 865 297 aa, chain + ## HITS:1 COG:lin1591 KEGG:ns NR:ns ## COG: lin1591 COG0181 # Protein_GI_number: 16800659 # Func_class: H Coenzyme transport and metabolism # Function: Porphobilinogen deaminase # Organism: Listeria innocua # 2 293 4 297 309 227 43.0 2e-59 MKIIVGSRGSSLALKQTEYVIEKLQRAYPENEYEIKVIHTIGDKMQHIALDKINDKGVFV KEIEKELLEHTIDLAVHSLKDMPSDNPAGLVYAKTLAPSDYRDCLVLKNCSSLARLPQHA TIATGSKRRKYQLLKLRPDLKIVDIRGNVNTRLQKMANQEIDGLVLASAGLKRLGLEDLI SEYLDETKMIPACGQGILAIQVRQDSELLTMINNISDEITTTRMELERLFLKTVNGSCHI PVGGYAKIIGDQVHFYGLLGNEDGTVLKNIDKKFSLKDASEEVTALAEMLMREVYER >gi|223714214|gb|ACDT01000001.1| GENE 72 73421 - 74857 1319 478 aa, chain + ## HITS:1 COG:lin1164_1 KEGG:ns NR:ns ## COG: lin1164_1 COG0007 # Protein_GI_number: 16800233 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: Listeria innocua # 3 254 2 252 252 230 46.0 4e-60 MNGRVYLVGGGPGAIDLYTVKALECIKSADCIIYDRLIDPAILNYCKETCEKIYVGKASN NHTLAQDNINLLLIEKARQYAQVVRLKGGDVYVFGRGGEEALALFKAGIDFEVVPGLTSA VAGLCYAGIPITHRGVSSGFMVVTAHHKNNQAYEWDYQQFLDDSLTYIFMMGQQKLEQIV TNLLVAGKDKTTPIALISNATRTEQRSIYGTLENIIDKFKDSYLCSPMLIVVGAVVALHH ELDFFQRKILLNKKILVACVSDQELPISKYFKEQGAAVKQVQLGKVNYLNSQQKLDLNHK LIVFTSQNGIRGFFNYLKMQRVDHRSLTTVRFACIGEKTANLLQLYGYYSDLISPKANSE DFNNYLKNKRMDSEEMIIVRAKEHSKIKKRAEDHELIVYENKELIIVEEESYHLACFTCA 
SSVERLAKYPSTFKIALSIGPMTTQALKKYYPHVKIIEAKDNSYEGMIAIIKENINVL >gi|223714214|gb|ACDT01000001.1| GENE 73 74847 - 75824 1019 325 aa, chain + ## HITS:1 COG:CAC0100 KEGG:ns NR:ns ## COG: CAC0100 COG0113 # Protein_GI_number: 15893396 # Func_class: H Coenzyme transport and metabolism # Function: Delta-aminolevulinic acid dehydratase # Organism: Clostridium acetobutylicum # 1 319 1 318 320 389 60.0 1e-108 MFYRGRRLRNKKVPRNLIKENRLSVEQLISPIFVVAGNNIKKEISSLEGIYHYSIDRLNE IIKELQRAGINACILFGIPEHKDELGTEAYNDDGIIQRAIREIKKIAPEMYVIGDVCMCE YTDHGHCGILDRDGCVINDETLLYLGKIAVSYAKSGVDMVAPSAMMDGEILAIRKALDET GFKDLPIMGYSAKYASSYYGPFRAAANSAPSFGNRSSYQMDIANGNEAMREIQADIDEGA DIIMVKPALAFLDVIKEAKLSFNMPICAYNVSGEYAMLKMAVKQGIMKEAVIEESLLAIK RAGADMIITYFALELAQKWQGDHHD >gi|223714214|gb|ACDT01000001.1| GENE 74 75817 - 77112 1351 431 aa, chain + ## HITS:1 COG:STM0202 KEGG:ns NR:ns ## COG: STM0202 COG0001 # Protein_GI_number: 16763592 # Func_class: H Coenzyme transport and metabolism # Function: Glutamate-1-semialdehyde aminotransferase # Organism: Salmonella typhimurium LT2 # 3 424 4 425 426 470 52.0 1e-132 MTNQEIFEEACQYIPGGVNSPVRAFKSVGEIPVFAKKGDKAHLFGEDGVEYIDYICSWGP LILGHNHPVVQNAIIEASQLGTSFGLPTKLEVAMAKLIVDSYQGIEMVRMVNSGTEATMS ALRVARGFTGRNKIIKFEGNYHGHSDGLLVKTGSGALTFNLPTSPGIPQEIISQTIICKY NDLSSVIAAVEQNKKDIAAIILEPVAANMGIVPATKEFLTGLRTLCDQNGIVLIFDEVIT GFRVSYHSCPDYLGVIPDMVCFGKIIGGGLPVGAYGGKKEIMSLVSPQGPVYQAGTLSGN PLAMSVGLAQLSYLNNHPDVYEYINKAANYLENGIKTILSTLSLDYQVHRCQSLLTLFFT NKPITTFEDVQTCNTEMYALFFREMLRQGILIAPSQFEAWFVSAAHSQIDLDNTLNAIKK ALVKCHDALVD >gi|223714214|gb|ACDT01000001.1| GENE 75 77093 - 77653 518 186 aa, chain + ## HITS:1 COG:lin1105 KEGG:ns NR:ns ## COG: lin1105 COG1648 # Protein_GI_number: 16800174 # Func_class: H Coenzyme transport and metabolism # Function: Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) # Organism: Listeria innocua # 3 138 4 139 159 72 30.0 3e-13 MMPWLINFKDKTVVIVGGGAVASRKTLQFLKEGAKVKIIASRIDRKLEALEVEKYLTDYQ PKYLQNAFFVYIATDDRELNNQIIRDADELGVLCACATSSMASLKAMKVVETENLLLGIS 
TKGQYPAFTKKLAYELEQYDDYLSALSNIRRLVLLNNLVNKHDRKQFFKQLMYFEKTELD MLLMFI >gi|223714214|gb|ACDT01000001.1| GENE 76 77692 - 78216 369 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754986|ref|ZP_02427113.1| ## NR: gi|167754986|ref|ZP_02427113.1| hypothetical protein CLORAM_00490 [Clostridium ramosum DSM 1402] # 1 174 201 374 374 332 100.0 5e-90 MNVIDYFAEQVSAMSLCLKEIDFQDKLKILMMLKIKWQIQPMVISYGKIYDKISKMCSPK EVAAPLFNSEILLEQLYSNKVKRLFIIHPRSNSKLFDILSRYGSVCAFNTKIIGEYEQII PFVLLPGYHYQNDILKAIENVKGQLTPVLLEDKQVVALLTKEISRRFGLEKVAI >gi|223714214|gb|ACDT01000001.1| GENE 77 78276 - 78851 198 191 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988990|ref|ZP_01820390.1| hypothetical protein CGSSp6BS73_02415 [Streptococcus pneumoniae SP6-BS73] # 1 191 1 192 192 80 33 2e-14 MKDFGELKQKIQECRYCQRQFGYEPHPVVWGKPNAKIMQISQAPSLNVHNTLKPFNDLSG KKLREEWYDIDDETFYDQNNFYIASLAHCYPGKSKSGGDRLPPKCCSEKWLRQEIDIVDN QIYIIIGSYAAKKIFPNQNFVDLIFNDQVFNGKPAYVLPHPSPLNRRWFKTYPEFEAKRV KEIRAVIKSLV >gi|223714214|gb|ACDT01000001.1| GENE 78 78924 - 79859 996 311 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735290|ref|ZP_04565771.1| ## NR: gi|237735290|ref|ZP_04565771.1| predicted protein [Mollicutes bacterium D7] # 1 311 1 311 311 548 100.0 1e-154 MSQEEILSYINDSLVINGSFYSNQNLPNQISNAIYTFGDYNVFEILAFIDATKEQDGSRG MIITPAQIYFKFGQAGIIEYQKIISLGLEKHRNDSIIKAIIKLEEITYTFSNQIIDPEVF VTMLSKITGIEIEMIMDTHEKVEYYTRIVLNDLENDEYEDIVLTPTQNNSIKEFYKELEM VQQLADEDYQYELENICNRALEFFDVLGLDSDEIDGLLECQNQFNQKNIEEEQKIDNAQQ FYNDMMNKYQQGDSEMFNRVKSIMSSMGIDENDLAGKSPEEIEDFLCGKFGISKSMLEKL AGRFNKSTVKH >gi|223714214|gb|ACDT01000001.1| GENE 79 79938 - 81362 1863 474 aa, chain - ## HITS:1 COG:BS_yuaG KEGG:ns NR:ns ## COG: BS_yuaG COG2268 # Protein_GI_number: 16080153 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 11 467 7 487 509 213 33.0 6e-55 MISDNFFTEHMTIVIAVAIVLLLIIILLTCWKRVPADKALVITGLKKRVLTGKGGIQVPF LETSCIISLEAVSMTTDITEAPSKQGIFVDIAGTAVVKVDNNPEKVLIAVEQFCSGNADR 
TTQNIKTVVEQILEGKLRGIVSTLTVEQINEDRVAFENSIEDSITRELDNMGLRLLSYTV LKIATQGGYLENRAIPQIAQSKADADIASAERARDTEVKTAAAVREGQKAKLEAEAEIAQ SDRDKTIRMEQYRAEQDKIKANADVSYKLQEIENNKIVADRNVELAKKEAQVVEEQLVAS VKKPADAKKYETEVQAEANKIKSIKEAEARAQALKIEAQARADAKLLEAKAEAEAIRAQG EAEAEALKAKGIAEAEAKDRLAEAMEKYGEAAMMSMVVERLPEIMAQIAKPMEQIDKITV IDNGSGDGGSKVSKIVSDVAANGFEVLKDLTGADLGEVVRTIANKNSDNKPKTK >gi|223714214|gb|ACDT01000001.1| GENE 80 81378 - 82496 1025 372 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237735292|ref|ZP_04565773.1| ## NR: gi|237735292|ref|ZP_04565773.1| predicted protein [Mollicutes bacterium D7] # 1 372 1 372 372 548 100.0 1e-154 MDKKDKIDTLIDQYLNFYQHGSQAIEPQQSTDFDSLMNQYKSKFDKQIANEQKAIREKEE ARVLQKKQEAKLQSQKRKTQEQQNKLRKEKLRNHSKKTTAEVKQAAKKYQHPNRDYDKSS NHKGLIVLIIIVVVFGLIYFIASNQDNESTNNTYLYNNLNYLVNDAVLQEPSSNDLSYLD NIENNENYNVAITYLNYVEDGLNYLDLISVTRGNDDLLEIRNDSEYYLYIKYLDNGYTNN VLIVPNDTYYCSIIDEEELTIETIAFYNFEDPALDVFTNLSDIDIYMNVDSIDFLETNAE ILFKHYYYQAYLKLPYDDIMVIKERDSNKGYYGYLDYTNRQIKVYYSDNYTHDVTIENIQ NYQSDYLKTITL >gi|223714214|gb|ACDT01000001.1| GENE 81 82718 - 83566 932 282 aa, chain + ## HITS:1 COG:SA0720 KEGG:ns NR:ns ## COG: SA0720 COG1660 # Protein_GI_number: 15926442 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Staphylococcus aureus N315 # 5 280 12 295 303 243 46.0 2e-64 MDQVEIVIVTGMSGAGKTSAMACFENLAYRCIDNYPVALLTEFAELVQDNPKYQRVAMAV SLDDALKAIRLLSNLDWIHLTVVFLDCDDEVLVKRYKETRRSHPLLISNKASSLIEAIEF ERRLAEPISRLAHLVIDTTKLKGARFQNLLEDYFSKGKIDPFRITFESFGYKHGVPKDAD LLFDVRFLPNPFYIEELRHLTGNDQAVYDYVIDKEETKVFIEKMTVLFDYLLKEYEKEGK MHLIIGIGCTGGQHRSVSLTNYFADHYSKVYQVHRLHRDADH >gi|223714214|gb|ACDT01000001.1| GENE 82 83575 - 84522 893 315 aa, chain + ## HITS:1 COG:BS_yvcL KEGG:ns NR:ns ## COG: BS_yvcL COG1481 # Protein_GI_number: 16080528 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 2 309 1 307 316 242 43.0 7e-64 MVSFSRVVKEEVVFNDFESCCQRAMLCAVIKINGTLSLSNHGLSLTIRTENAKIASKIHK 
MLKEEYDPQIEFLVSRKMKLKKNNVYILKVTKAREILDDLSLMDGLGFNQIPNPKIIEKE CCTRAYLAGAFLSCGSVNNPETSNYHLEMSFNEEEFAQFIADLINRFELNAKIIKRRNKY VVYLKSSEKIGDFLRAIGASQSVMNFESTRIDRSMSNTVNRWNNCDIANEMKSMATANKQ LEDIETVDMFLGLDMLDEKTRAVALIRKKYPELTLNELTEVYYEETGQTISKSGLHHRFK KISEEAKRLIQMESD >gi|223714214|gb|ACDT01000001.1| GENE 83 84524 - 84736 278 70 aa, chain + ## HITS:1 COG:ECs5252 KEGG:ns NR:ns ## COG: ECs5252 COG1396 # Protein_GI_number: 15834506 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 8 61 9 62 154 62 50.0 1e-10 MDKIEVKFGHRIKELRLKQNISQEELAFRCGLSKNYISDVERGTRNVSLKSIEKIANGFA VNLKELFDFK >gi|223714214|gb|ACDT01000001.1| GENE 84 84776 - 85063 368 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754978|ref|ZP_02427105.1| ## NR: gi|167754978|ref|ZP_02427105.1| hypothetical protein CLORAM_00482 [Clostridium ramosum DSM 1402] # 25 95 25 95 95 121 100.0 1e-26 MPLYYVILIILIIIVILSLILNLLTNPIVWIIIAILVVWSAIKRYLYTKQLEEYNKEFQR KTEEKKKAYYSQEEYRRGSDDIIDVDYQEFDEDDK Prediction of potential genes in microbial genomes Time: Thu May 26 09:00:29 2011 Seq name: gi|223714213|gb|ACDT01000002.1| Coprobacillus sp. D7 cont1.2, whole genome shotgun sequence Length of sequence - 61351 bp Number of predicted genes - 61, with homology - 60 Number of transcription units - 32, operones - 14 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 63 - 93 5.0 1 1 Tu 1 . - CDS 117 - 1460 1324 ## COG0534 Na+-driven multidrug efflux pump - Prom 1560 - 1619 7.1 + Prom 2239 - 2298 6.7 2 2 Op 1 . + CDS 2420 - 2950 597 ## COG1658 Small primase-like proteins (Toprim domain) 3 2 Op 2 . + CDS 2937 - 3476 600 ## Cbei_3254 hypothetical protein 4 2 Op 3 . + CDS 3476 - 4282 909 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 5 2 Op 4 . + CDS 4309 - 5052 795 ## BH0058 hypothetical protein 6 2 Op 5 . 
+ CDS 5057 - 5914 895 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase + Term 5961 - 6004 -0.8 + Prom 5935 - 5994 8.8 7 3 Op 1 11/0.000 + CDS 6014 - 7393 1550 ## COG1207 N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) 8 3 Op 2 . + CDS 7403 - 8359 1271 ## COG0462 Phosphoribosylpyrophosphate synthetase + Term 8407 - 8454 7.0 - Term 8282 - 8337 1.1 9 4 Tu 1 . - CDS 8464 - 9612 885 ## COG0477 Permeases of the major facilitator superfamily - Prom 9833 - 9892 7.0 + Prom 9674 - 9733 7.5 10 5 Tu 1 . + CDS 9779 - 10666 861 ## COG1737 Transcriptional regulators + Term 10789 - 10830 6.7 11 6 Tu 1 . - CDS 10678 - 10878 114 ## - Prom 10931 - 10990 7.8 + Prom 10811 - 10870 11.5 12 7 Op 1 9/0.000 + CDS 10929 - 12128 1356 ## COG0477 Permeases of the major facilitator superfamily 13 7 Op 2 . + CDS 12175 - 13059 945 ## COG0583 Transcriptional regulator + Term 13063 - 13113 11.0 + Prom 13147 - 13206 7.7 14 8 Tu 1 . + CDS 13273 - 16620 3961 ## Athe_0724 PKD domain containing protein - Term 16602 - 16648 -0.2 15 9 Tu 1 . - CDS 16721 - 17707 633 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 17738 - 17797 7.4 + Prom 17685 - 17744 5.4 16 10 Op 1 33/0.000 + CDS 17842 - 18750 1085 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 17 10 Op 2 20/0.000 + CDS 18743 - 19738 915 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 18 10 Op 3 35/0.000 + CDS 19732 - 20706 734 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 19 10 Op 4 . + CDS 20706 - 21482 195 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 + Term 21495 - 21544 11.5 + Prom 21511 - 21570 4.4 20 11 Tu 1 . + CDS 21593 - 22405 642 ## gi|167754958|ref|ZP_02427085.1| hypothetical protein CLORAM_00462 + Term 22532 - 22585 0.4 + Prom 22476 - 22535 6.5 21 12 Op 1 . 
+ CDS 22634 - 24154 1758 ## gi|237735315|ref|ZP_04565796.1| predicted protein 22 12 Op 2 . + CDS 24145 - 25047 781 ## gi|237735316|ref|ZP_04565797.1| predicted protein 23 12 Op 3 33/0.000 + CDS 25025 - 25912 1052 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 24 12 Op 4 35/0.000 + CDS 25899 - 26882 948 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 25 12 Op 5 . + CDS 26869 - 27633 241 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Term 27841 - 27898 -0.9 26 13 Op 1 1/0.250 - CDS 27927 - 28592 819 ## COG0569 K+ transport systems, NAD-binding component 27 13 Op 2 . - CDS 28605 - 29252 659 ## COG0569 K+ transport systems, NAD-binding component 28 13 Op 3 . - CDS 29267 - 32077 2683 ## COG0474 Cation transport ATPase - Prom 32122 - 32181 9.3 + Prom 32783 - 32842 3.0 29 14 Op 1 . + CDS 32885 - 33643 644 ## COG0084 Mg-dependent DNase 30 14 Op 2 . + CDS 33655 - 34584 1001 ## BL00531 hypothetical protein 31 15 Op 1 . - CDS 34866 - 35513 519 ## COG0679 Predicted permeases 32 15 Op 2 . - CDS 35510 - 35779 277 ## SDEG_1575 predicted permease - Prom 35911 - 35970 5.8 + Prom 35718 - 35777 7.9 33 16 Tu 1 . + CDS 35949 - 36278 370 ## gi|167754945|ref|ZP_02427072.1| hypothetical protein CLORAM_00449 + Term 36287 - 36334 6.3 - Term 36275 - 36322 -0.5 34 17 Tu 1 . - CDS 36394 - 36642 149 ## gi|167754944|ref|ZP_02427071.1| hypothetical protein CLORAM_00448 - Prom 36666 - 36725 5.1 35 18 Tu 1 . - CDS 37252 - 37878 360 ## CD196_2121 hypothetical protein - Prom 38061 - 38120 7.2 + Prom 37822 - 37881 14.9 36 19 Op 1 . + CDS 37980 - 38531 626 ## COG1971 Predicted membrane protein 37 19 Op 2 24/0.000 + CDS 38596 - 40458 2235 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 38 19 Op 3 4/0.000 + CDS 40461 - 41162 753 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division 39 19 Op 4 . 
+ CDS 41171 - 41938 901 ## COG1475 Predicted transcriptional regulators 40 19 Op 5 . + CDS 41938 - 42429 604 ## EUBREC_2155 hypothetical protein 41 19 Op 6 . + CDS 42463 - 42825 477 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family + Prom 42832 - 42891 9.0 42 20 Tu 1 . + CDS 43066 - 43545 416 ## BL1738 hypothetical protein + Term 43594 - 43626 -0.1 - Term 43570 - 43626 16.9 43 21 Op 1 . - CDS 43652 - 44281 581 ## gi|237735337|ref|ZP_04565818.1| predicted protein 44 21 Op 2 . - CDS 44278 - 44907 638 ## gi|167754934|ref|ZP_02427061.1| hypothetical protein CLORAM_00438 - Prom 44937 - 44996 5.8 45 22 Tu 1 . - CDS 45035 - 45958 926 ## COG0260 Leucyl aminopeptidase - Prom 46064 - 46123 4.7 46 23 Tu 1 . - CDS 46191 - 46451 340 ## gi|167754933|ref|ZP_02427060.1| hypothetical protein CLORAM_00437 - Prom 46637 - 46696 5.5 + Prom 46431 - 46490 10.7 47 24 Op 1 . + CDS 46637 - 46930 357 ## gi|167754932|ref|ZP_02427059.1| hypothetical protein CLORAM_00436 + Prom 46932 - 46991 5.7 48 24 Op 2 1/0.250 + CDS 47020 - 48303 1500 ## COG1455 Phosphotransferase system cellobiose-specific component IIC + Term 48309 - 48337 1.0 49 25 Op 1 3/0.000 + CDS 48369 - 50207 1815 ## COG3711 Transcriptional antiterminator 50 25 Op 2 . + CDS 50204 - 50890 664 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 51 25 Op 3 . + CDS 50868 - 51620 766 ## COG3394 Uncharacterized protein conserved in bacteria 52 26 Tu 1 . - CDS 51686 - 52495 839 ## COG0789 Predicted transcriptional regulators - Prom 52527 - 52586 7.6 53 27 Op 1 35/0.000 + CDS 52578 - 53285 667 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 54 27 Op 2 35/0.000 + CDS 53282 - 54307 228 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 55 27 Op 3 . 
+ CDS 54300 - 56078 204 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 + Term 56083 - 56119 1.7 + Prom 56126 - 56185 4.6 56 28 Tu 1 1/0.250 + CDS 56212 - 56892 652 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 56897 - 56956 6.6 57 29 Op 1 . + CDS 56981 - 57658 799 ## COG1131 ABC-type multidrug transport system, ATPase component 58 29 Op 2 . + CDS 57660 - 58472 608 ## gi|237735350|ref|ZP_04565831.1| predicted protein + Term 58492 - 58535 1.3 + Prom 58492 - 58551 8.6 59 30 Tu 1 . + CDS 58679 - 59305 576 ## COG0642 Signal transduction histidine kinase - Term 59348 - 59386 6.5 60 31 Tu 1 . - CDS 59397 - 60926 1368 ## gi|167754919|ref|ZP_02427046.1| hypothetical protein CLORAM_00423 - Prom 60963 - 61022 7.1 + Prom 60984 - 61043 7.6 61 32 Tu 1 . + CDS 61088 - 61324 196 ## gi|167754918|ref|ZP_02427045.1| hypothetical protein CLORAM_00422 Predicted protein(s) >gi|223714213|gb|ACDT01000002.1| GENE 1 117 - 1460 1324 447 aa, chain - ## HITS:1 COG:MA0334 KEGG:ns NR:ns ## COG: MA0334 COG0534 # Protein_GI_number: 20089232 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 1 441 1 443 466 211 32.0 2e-54 MNQEYDMMAHKPIFPLLMKMAIPPMISMLIQSMYNIIDSIFVAKLGEEALTALSLAFPLQ NLSLAFSVGLGVAINALIAKSLGASDEKQANYISDHGIFLAILHSLLFVFIGIFLMKPFF LMFTTNQTVLDYAITYGSIVITFTFGSIIHITIEKMFQATGNMMIPMFLQGIGAIVNIIL DPILIFGINGYLEFGVAGAAIATIIGQMTACLLAIILFRKTSRIKVSLKNFKPNAQIIKN IYSIAIPSGVMTSLPSILVALLNSLLATVSQTAIAFFGIYFKLQSFIYMPANGLIQGMRP LISYNYGARHFDRVKKIIKVSIISTAVILCCGTIIFMGLPELVLSWFNATEQLLEIGIIG LRVISPCFILSTMGVVISGVFESLGKGRQSLTISLLRQFIITLPLAYILLKVIGLNGIWL SFVIAEGIASVIAVILIKKELHNFKVD >gi|223714213|gb|ACDT01000002.1| GENE 2 2420 - 2950 597 176 aa, chain + ## HITS:1 COG:BS_yabF KEGG:ns NR:ns ## COG: BS_yabF COG1658 # Protein_GI_number: 16077109 # Func_class: L Replication, recombination and repair # Function: Small primase-like proteins (Toprim 
domain) # Organism: Bacillus subtilis # 3 172 2 177 186 121 42.0 7e-28 MKKIKEIVVVEGKTDTALLKELFEVDTIETHGLALDQQTLELIQEASKSRGIIVLTDPDF PGKKIRDQIQAVVPNCQHAFVAKKDARGKKKLGIAEANKEAVIEALENVVSFDVNRESIT WSEFVALDIIGNKQRRLMVYEAFNLGYGNVKTLFKRLNMVGITKEQVLGRLNDGSR >gi|223714213|gb|ACDT01000002.1| GENE 3 2937 - 3476 600 179 aa, chain + ## HITS:1 COG:no KEGG:Cbei_3254 NR:ns ## KEGG: Cbei_3254 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 3 177 4 181 181 164 48.0 1e-39 MDLGKHNGYTFKVSNEFKGEYLVPIIDNLKVKYRDFYDDYDEVYHCNQLVHILKDDPNVN VDYHVMYKYDQCVGLAMVTTGNIDSSKYFDKPVFEQGANAIVLNYFHISPKARGNGNYWL RNVIIPHYQDKEDLYLKSSHSKAFSLYQRLGEEIGSYQTLSDNGEFKRTGKIFKIKVNQ >gi|223714213|gb|ACDT01000002.1| GENE 4 3476 - 4282 909 268 aa, chain + ## HITS:1 COG:lin0227 KEGG:ns NR:ns ## COG: lin0227 COG0030 # Protein_GI_number: 16799304 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Listeria innocua # 2 268 3 288 295 176 38.0 3e-44 MKEIATPSTTKYILEKYHLNALKKYGQNFLIDVNIINKIVTSAKIDQTTAVIEVGPGIGA LTQVLGRYSGKVTSFEIDERFMPVYQEFLNQDNIEIIFGDFMEQDIKPIVDQLKQRYQKV CLVANLPYYITTAIIEKVVLGNFGIDEMIVMVQKEVALKMTGIYKNPLLLMIKDMGTIEY LFTVNKNVFMPAPHVDSAILKIELKKAPDLKLYEVLNICFKQRRKTILNNLKQSYEEAEE ILLQTGIENRKRSEELELADFKNITKMI >gi|223714213|gb|ACDT01000002.1| GENE 5 4309 - 5052 795 247 aa, chain + ## HITS:1 COG:no KEGG:BH0058 NR:ns ## KEGG: BH0058 # Name: not_defined # Def: hypothetical protein # Organism: B.halodurans # Pathway: not_defined # 2 244 4 281 291 195 39.0 1e-48 MKIGDIVCRKKYGKDICFKITDIQDDVYYLTGIEYRLVADSDESDLELSDFSSEKSDIIV ESRPCLKGSVLHIDGDKHYLDMCLKKYEEFNINVHGYFMKENEIKDQIIPLLEKHRPNLL VITGHDAMKKNGNRKDINSYLHTKDFVEAIRRARLYEDDKDSLIIFAGACQSYYELLLAS GANFASSPSRKNIHALDPVFVVSQIANASIKNYVDLEKIVAKTSNKHLGVGGIDTRGVAR KIYPTSR >gi|223714213|gb|ACDT01000002.1| GENE 6 5057 - 5914 895 285 aa, chain + ## HITS:1 COG:BS_yabH KEGG:ns NR:ns ## COG: BS_yabH COG1947 # Protein_GI_number: 16077114 # Func_class: I 
Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Bacillus subtilis # 4 274 6 280 289 236 45.0 3e-62 MKVKAYAKINLALDVIGKREDGYHELEMIMAPITLHDLIYINTIASGIEIDSNSKIMPTD ERNIMYKVVALMKERYNIKKGVKIFVYKHIPTQAGLAGGSADGAAVIKAMNKLFYLNLSN EEMAALGKEVGADIPFCIYQKIALVSGVGEKLKFIKNSFECKVLLVKPKRGVSTKKSFTS LDLSKASHQDCRLMATGIEDGNYQIVIDNLQNTLEEPSIKMVPEIAKIKEEMMKIGFDGA LMSGSGSCVFGLTRNNIILEKGMKYFKGKYYFVRKTEILNDEEID >gi|223714213|gb|ACDT01000002.1| GENE 7 6014 - 7393 1550 459 aa, chain + ## HITS:1 COG:BH0065 KEGG:ns NR:ns ## COG: BH0065 COG1207 # Protein_GI_number: 15612628 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) # Organism: Bacillus halodurans # 4 446 5 449 455 433 49.0 1e-121 MKTYAVVMAAGKGTRMKSDKPKVVHEVLYKPMINHIVDELKQVGVDEIYVIVGHKAEEVE KLLDGVNIIYQKEQLGTGHALMQCKDALAGKAGTTVVLNGDAPLITSETLKDLIAYHNDN QLKGTIMTCDCDLDKKFGRVIRNNDQVTGIVEFKDCTPEQVQISEMNCGEYCFDNIAVFE ALEKVTNDNAQNEYYITDVIEIMNNDKLKVGGYKIADLAEVGGINDRVELAEATKSLQLK INKQHLLNGVNIIDINNTYIGVDVTIGADTTIEPGCIIKGKSSIGSNCHIGPYCEFDNVE IKDNVEIKFSVISDSIIENGVDIGPFARLRTNCHILEDAHMGNFVEMKKAVFGKGSKASH LTYVGDATVGSNVNMGCGTITSNYDGKNKFQTIIGDNAFIGCNSNLVAPVTVGANAYVAA GSTITDQVEDSAFAIARARQVNKDGYAKVLEEKRNQKGK >gi|223714213|gb|ACDT01000002.1| GENE 8 7403 - 8359 1271 318 aa, chain + ## HITS:1 COG:FN1992 KEGG:ns NR:ns ## COG: FN1992 COG0462 # Protein_GI_number: 19705288 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Fusobacterium nucleatum # 3 311 5 315 316 362 58.0 1e-100 MKNKITVFALSASGELANDIAEKLGTKVGKSKVHHFADGEILVEIDESVRGKDVFIVQST SNPVTENLMEILVLADALKRASAKEITAIIPYFGYARQDRKAKPRQPITSKLVADLLTVA GVTRVVTVDLHAAQIQGFFDIPVDEMQALPLISNYFKKKDMEDICVVSPDHGGATRARKL AVALDAPVAIIDKRRPKPNVAEIMGVLGDVSGKNCIMVDDMIDTGGTIVAGIEMLKEKGA 
KSVHVACTHPVFSGPAVERLQNSSADEVVVTDTIKLPEDKMFPKLKTVSVAGLLAKTIEN IENCLPVSDVFEMFDFDK >gi|223714213|gb|ACDT01000002.1| GENE 9 8464 - 9612 885 382 aa, chain - ## HITS:1 COG:BS_ywbF KEGG:ns NR:ns ## COG: BS_ywbF COG0477 # Protein_GI_number: 16080885 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 1 381 4 390 399 87 23.0 3e-17 MQYRDSYLSYLLMYLFYFLSLALLSGLISVYLLDRGYSASQVSLVVASSYIVSVILQPLV GYLNDHFNLKWVNSVVLIIAAILGIALIFAKHLFSITLLYSLTLGLFNSTKPVIERMATL SKYQYGKIRIWGTIGYAIGSQIGGLIYQYISPESMYFFFSISLGICLLGIIGTRNEHKNS DIHGETTTQHLGLWNKNFIVYLIIVCLFYAITNLNTTYLPAMFQHQGISVYKVSTIILLI TISELPIIYYSHRFMNNISNRHMLIIVFILLIIQFGTYCFIPYNIIHVIISIGTKAVSTV IFIMLNMKIVVSIVDSDLQMSALAWVSTFNSFSSIVGQGLGGKILDTYSYTDFYLILFLI AVIGLGISYFYHLPSGRKYHLF >gi|223714213|gb|ACDT01000002.1| GENE 10 9779 - 10666 861 295 aa, chain + ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 12 280 13 280 283 90 27.0 2e-18 MLLKEQMKAYPFSHNERVIIDYILDKQINIKDYSVKMIADATFTSPSTLIRISKKLGFQG WVEFKDAYLEEANYLNSHFCNIDSNLPFSNQDSIMTIASKIVNLHIESAKDTLSLFQHDS LQKAVRILHQSKEIRVFTMSNLNFAGEEFVFKLNRIHKKAYIHPIQDNLFHDAAMSSPDE CAICISYSGESSNIVKTAQILKENNCPIIAITSIGNNSLSDLATVTLRVTTREKSFSKIA GFSSLESISIVLNTLYACLFSLNYHDNLTYKLDIAKKIENAPIINNEIIAEKDDE >gi|223714213|gb|ACDT01000002.1| GENE 11 10678 - 10878 114 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSNTYILFILFVLCIVLFEIKNTSLKECVFNITYSPTLNYLSGICIRRLFDIITDILRMS IRELDI >gi|223714213|gb|ACDT01000002.1| GENE 12 10929 - 12128 1356 399 aa, chain + ## HITS:1 COG:SP1600 KEGG:ns NR:ns ## COG: SP1600 COG0477 # Protein_GI_number: 15901440 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R 
General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Streptococcus pneumoniae TIGR4 # 1 347 1 345 383 111 26.0 3e-24 MDKKILRFALLSASLLVGSAAAINANIPAMAQHFDQVPLSMVEMLTTVPSLFLMISVLTS SLIAKRVGYKQTITIGLGIVMIAGIVPLLIDNFMIILISRAMLGFGVGLFNSLLVSMINY FYDAKERSSMYGLQSAFEGAGGIAITFIAGQLLKINWQAPFIAYMIAIPVFFIYFKFVPQ VKTADVIKANGGDQIKKESHKSAGFLPVVYYVGLIFMAAMLYMIMGIKIASLMTGEGYGT ASDASLVIILLSLGGISAGLLFGKILKVFNQLTTSIGLIILALAMVILGLSQNLVITFVG GYLTGFGFKIFMPSLIDKINNSNIPNTTLATSLLLVGFNLGVFISPYGSIVIQSLMRTEA LPALFIANAIGFISLSTITLIITLIKNKKTVNIVQKVLE >gi|223714213|gb|ACDT01000002.1| GENE 13 12175 - 13059 945 294 aa, chain + ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 283 1 282 291 166 33.0 7e-41 MDLRVLEYFLAVVESGNVSKAAQKLHLTQPTLTRQLQQLEEIYGTRLFIRGSRQITLTNS GIILKRRAKEMLDLQDKTAAEIRKSEYDISGEITIGCGESSGNKFLPVVLKEFSKRYPKI KYNIYTAGSNQNKEKINEGLIDVAIVLEPIDKDYYQYLELPYLDKWCLVMKKDDLLKEYE TIENHVVSNLVLGIANRLEVKETFKNWLGKEINAQTILNYDINSNLALLVESGLCYGISI DGVFKSNAYPNLTYRLLEPEIVSQSCLIWKKGVVTNQALEKFIVCAKELISSNQ >gi|223714213|gb|ACDT01000002.1| GENE 14 13273 - 16620 3961 1115 aa, chain + ## HITS:1 COG:no KEGG:Athe_0724 NR:ns ## KEGG: Athe_0724 # Name: not_defined # Def: PKD domain containing protein # Organism: A.thermophilum # Pathway: not_defined # 102 469 31 359 432 147 33.0 2e-33 MDSKMGKNLNKALISVLSLGMIVSMCSITVNAVETDNVANWVTMYGDNGNNSITPDTVND AERLINGDIIATGTFDGNKVTGIEQAKGKSDGAVMLYDKNGQIKWETLIGGSAADSFNAV TEANDGSYVAVGVSQSNDGNLANLNKGGKDGIIAKLDSNGTLIKVVTVGGSDADELKAIQ PAGDGGFIVAGYSHSTDLDMNGLNKIETDRDAIIVKIDQDLNIEWINRAGGTGGDKAVKK MDEFTSVVANYGDGSYLAVGYSSSSDGDLEGFAHGGKDAFVVKFNENGTKEWVKAFGGTG DDVFNHIIRAHNKKVSTDKTEISSSDIDNGYVLTGTTNSNSGIVGNEEENNVAFTLKIDS NGQVEWTTSLANNSKTTGDYLLAISDGFLAVGTFDTNDQDFTGIKTYGKEDIYIAHYSKD GTCLNISSYGGDGKDNPEGIIPGANDDYIVYGNTNSNSGLFEGKLKGKYDGFLASLAGTT LEQYAEEKYLVPVEAWKANEDSLSMMAPLLYKDAYVEKIGFKYTVTIYFTNATIMGTQVS 
ASTLGEVSYEKNGEMIAVLKDEYDEVTQVKTVTLEVDDLDTPVKLHIDKAMGDIRLSFSS KDMVETTNPPYFAPVEVEQPDFKTSWKVNIGGSDYDYTDDMTVLKNGNIVAVGQSYSNDG DFQNLLKGGSIAYINQYDPKGELLKTLTLGGTEYDSIAYAAKVCGLDDGGFVVSGGYQEG VYVAPTGDFAQLDTAGTVHGQMDTFIAKYDSNSKLIWMSNFSGSANDQVKQIKATDDGGI AVLIETNSNDGDMENQNRGLYDLVIVKYDAHGKKVWQRVLGGKNIETSSSGLDILSDGSF IVAGSLSSKSGDFADVDWYGDIFDCFAAKISATGELLWTKAYGGDRNDYCNSVLATSDGG FILMGNTKSTTDTFAPIGTGYDNAYVLKCNETGETEWVNVIKSSEASEMISAIELEDQYV FIGQSRGTDYDVTGLNKGSMDVFIANYEKNGTLKTIENIGGSKADYASEIMVLNDYQISL LVYSESNDGDFKNLNRGKFDGMLMTYDYRQKADNSTTDKPNQDNVNNQNNSNVIETTNSS IKNSINTGDDSKIIGYSLLGGTMLIILLLTRKKYN >gi|223714213|gb|ACDT01000002.1| GENE 15 16721 - 17707 633 328 aa, chain - ## HITS:1 COG:all2613 KEGG:ns NR:ns ## COG: all2613 COG2207 # Protein_GI_number: 17230105 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Nostoc sp. PCC 7120 # 107 324 109 324 326 75 24.0 1e-13 MICRNLTEYYNIFFNKLAFQYYQYDNFSLYINPQQPEEFIIHVHIADMYELFVGDFTAPY DIEISFEISETYLQLGMLLTGEIEYDLKESHNNHLEPSSFYLNGSKRSGLQVWKKNQHYH STDIIIKKDFFEKKLSAYGYSYTTLEKIEKNKIFRPLPSKLLQLMQEISDTLRNKAISQL YLDAQLMKIVYFLLPQNNILYPQSKTITIKLKCGRNILLNNDDQEIIQQIHNDLTVNYQH PPTIKELAKIHYISEQKLTAGFKYLYHCSIHDYISNLRLNMANSMLLSEQSTIIQIAKTL GFSNANSFIYFYRKKTGITPKQFQLTQR >gi|223714213|gb|ACDT01000002.1| GENE 16 17842 - 18750 1085 302 aa, chain + ## HITS:1 COG:lin2073 KEGG:ns NR:ns ## COG: lin2073 COG0614 # Protein_GI_number: 16801139 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Listeria innocua # 24 298 31 310 313 142 30.0 1e-33 MKKVLSTILVLSLFILTGCSAATKDTIRTVETVKGKIEIPNEPKRVVIDYFVGDALALGV EPVGTTYVYKDSVFEEDLKDVPSINGDDAYGQYSMEAVTKLKPDLIITYSEADYDNLSKI APTVLVDYLNMSTAERITWMADVLNKKEEGKKLLADFNQEVAGYKQQLQDAKISNKTITL METYSKELFVYGNKQGRGGEVLYDLLELKAPEVVQKEIINGEQYRSISLEAINEYAGDMI MIGGWQEDPMDLVGNNSVWNSTPAVKNQKVITYDSSAYIYQDILSTKEQLKNITQSILAL YE >gi|223714213|gb|ACDT01000002.1| GENE 17 18743 - 19738 915 331 aa, chain + ## HITS:1 
COG:lin2072 KEGG:ns NR:ns ## COG: lin2072 COG0609 # Protein_GI_number: 16801138 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Listeria innocua # 28 325 38 335 341 229 43.0 8e-60 MNNQQVRKWLIVGLCLLPIIIFFSLKFGAVTIGWEDFFKIILGKKSNGNDYEVIINLRFP RILATGIVGSAFAMAGAAMQGLTKNHLADSGILGINAGASFMLAICFALVSRSNFLLMMM YSFVGAGLGLFLTMTMGKGKYTNNSKRLLLAGVAVSLFFTSLSQFIAIYFGIGQELTYWN VGGTANIGYQELILATLLFVSGLFLLLRQGKNITIISMSEETAVGLGVNVPRCKILVCLA ALLLAGMAVSIVGSIGFIGLIVPYLVRRLVKKEDYQIILPMSAIYGAVFLILIDLIARNL QPPYETPLGIIMALIGVPFFLFMERSINKKC >gi|223714213|gb|ACDT01000002.1| GENE 18 19732 - 20706 734 324 aa, chain + ## HITS:1 COG:BS_fhuG KEGG:ns NR:ns ## COG: BS_fhuG COG0609 # Protein_GI_number: 16080384 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Bacillus subtilis # 8 321 14 334 336 181 37.0 2e-45 MLKKKRIIIFLIIILLVSSFLSFTIGQMKIELSELFDPYTQLVIFQFRLPRLLCAILVGG GMALAGHILQTVLQNDLADPGVIGINSGISFFVLLYILVSGTDIFLDYVWMPLAGMIGAV ITIGLLMFLIYEKERGIDPIRMTICGVALNLGFQGLILFFSAKLNRQKFSYVQLWNAGIL TGTDWLKIALLITILVIIGVILWKYQDVLNIFSLGKESVITLGVDLKKQTIILISLAGFL AAFTIAFGGSIPFVGLIAPHIARRLVGSNHRYSLWVSMLMGSLMVIIGDSLSRSLMSTGE IPTGIILSLIGVPYFIYLLKKERA >gi|223714213|gb|ACDT01000002.1| GENE 19 20706 - 21482 195 258 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 20 227 28 231 312 79 29 4e-14 MKGIIEARQLSTGYGKKIVLKDQNIVIPKNKITVILGPNGCGKSTMLKTMARVLPLKDGK IILDGKDLKTMKTAEIAQKIGFLSQILNTPEGISVYELVSYGRYPYKRISSTKEDQKIIE WALKETQTFEIKDMLVHDLSGGQKQRVWIAMVLAQQTNTILLDEPTTYLDIGHQLEVLEL LSKLNKEEQQTIVMVLHDINHASRYSDWIIAMKEGKVFKEGSPKDIITSEVIQELYGVKT QIIFDKEHNCPVCFSYDL >gi|223714213|gb|ACDT01000002.1| GENE 20 21593 - 22405 642 270 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754958|ref|ZP_02427085.1| ## NR: gi|167754958|ref|ZP_02427085.1| hypothetical protein CLORAM_00462 
[Clostridium ramosum DSM 1402] # 1 270 1 270 270 511 100.0 1e-143 MQLTKEKLSYLLFLYKQREMDYTVTRLAQEVGVSKSTFSRVLNTFYQEGLTTEKGKGILS CQGCRMAREYLKDINKLSDWLKYTANFDDEEAYQEALSLVLTLSSKARAKLVNNTSKERL FELIDNVKEISGDMLSANLDDGQYLFAFTVYKADKMEISMANDGFFHPGILDIKMGHGGL VLKPKEVERESMMGKMVLKGKLSSLKYQVGEDYLECRMIKDDYIIPISKLRFHYSKEERI LQTSVKIKVKPTVGKIHMPESIAIMNVIIK >gi|223714213|gb|ACDT01000002.1| GENE 21 22634 - 24154 1758 506 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735315|ref|ZP_04565796.1| ## NR: gi|237735315|ref|ZP_04565796.1| predicted protein [Mollicutes bacterium D7] # 14 506 14 506 506 824 100.0 0 MKKLLKVVLVLMLAFGIQIATLSNVDAAQVSDLPDGKYTIPTKLKNASNIANDSMAAGAL AENGELSVEEGKWYLTAEFKTLNLMGLVGNAGNIQYYETDTKSEKHAAEVISYREDDQGK QQVEKVKIPVAVNSEGVYIEMFVDAMDTTVDAYIQYSTADIVIPEEPEAPEYTLADGTYT VTSDVLKANEDVQSMAAQAVKSATVTSKDGTLLVTLQMGAVTVYGQTAYVDKMEVEQAAG AYKVADITGRDSNGNVSEMEFTLAKNTKLTNVKFYYGGSTHGSEARLSLGLDNPTLVVPE STSKFAKDGTYTVDVALWNATQDKASMAAGAIDSQATVVVKNGVATMYITTKEMTMGTIK AWLEELYIGSSTDDYKSNPAVIVSKNADGKATMWSFVLPNEEELFDVVVNPHVAMMGNSD IPARMKVDYSTLKFVSDSIEAPKVDGESNNNTNDTPNTTTPTTNQTTGTSSSSVKTGDNA NMELMGGLLVSSLVAAAYLTRKRLCK >gi|223714213|gb|ACDT01000002.1| GENE 22 24145 - 25047 781 300 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735316|ref|ZP_04565797.1| ## NR: gi|237735316|ref|ZP_04565797.1| predicted protein [Mollicutes bacterium D7] # 1 300 1 300 300 453 100.0 1e-126 MQIKKVLLILVAIITMTINIKPAFALDDGAYLIGRSTSYVNPLTGNTEDGGTNITLGDSM VSSIVEGNLLLEQTGGKYYITVGLGLASNVSNVRFKIMNSGGSFNSVSATKTGSSSANGD TVNHYRIQINSLNDYISPIMYVAPMGRDVQFFIKLNQGSITPGTGVYTSTMVPATADNGN SVNNNSTTNNSTSNNTSTNSNQTSDTTTNEEQPVVETPVTTVGEVSRESLFDGISGLSSH VIDTNGKVNEKKKLTVENLLKQSEQKKDSSNNTLLIIGIVAVIAVVAGGGAYYYVKKVKK >gi|223714213|gb|ACDT01000002.1| GENE 23 25025 - 25912 1052 295 aa, chain + ## HITS:1 COG:SPy1795 KEGG:ns NR:ns ## COG: SPy1795 COG0614 # Protein_GI_number: 15675633 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Streptococcus pyogenes 
M1 GAS # 1 287 1 283 294 256 46.0 5e-68 MLKKLKNSFLVLVLALSMIGCVDQSGKSDNNNLTIAATSVAVTEILEALDVPADQVIGIP TTESYTVPKKYKKATELGTAMSPDMEVLSTLKPSLVLSPNSLEGDLASKYEKIGVSSSFL NLKSVAGMFKSIEELGSLLGKEKKANKLVDEFVNYMVEFRAKYQNTASPRVLILMGLPGG SYLVATESSYVGDLVKLAGGTNVYGNGNGQDFVNVNAEDMLQQNPDIILRTSHAMPEEVM KNFETEFAQNDIWSKFSAVQNGKVYNLDNNYFGMSANFNYQKGLENLEGILYGTN >gi|223714213|gb|ACDT01000002.1| GENE 24 25899 - 26882 948 327 aa, chain + ## HITS:1 COG:SPy1794 KEGG:ns NR:ns ## COG: SPy1794 COG0609 # Protein_GI_number: 15675632 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 25 321 37 337 340 227 48.0 2e-59 MELIKKHPRIFIIFLILVLFICFFIAINLGSIQVSIVQLFKGLFIEYDVDVASVYSIRFP RIIISMLVGGALALSGLLFQVVLKNPLADPGIIGIANGASLVSVLVGLFLPQLSAIAPLL SFFGGLITFAIIYSLSWKAGFKTTRIILIGVAINYTISALVTLANSATASITSGATGTIS LYTWQDVTTLLIYLVPVIIITLFMAKACNLLGLEDRTLTSLGVNVNVYRFGLSLLAVLLC SISVAVVGVIGFIGLLVPHISRLLVGTEHKHLIPISILIGAIVLLVADTVGRIIMAPYEI SAAIIMAIIGGPLFIILLKRSIDVDGS >gi|223714213|gb|ACDT01000002.1| GENE 25 26869 - 27633 241 254 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 225 1 226 245 97 27 2e-19 MMEVKKIKFSYQDKPFIEELSVKFKKNKITSIIGPNGSGKSTLLMLLSRIYKAKDGTITL NDKNVWDYKIKEFAKNVAVVHQKNQIYGDLDVKTIVGYGRLPYLSYHQNLSEEDYQIIDW AIATTNLKEYENRLLANLSGGQQQRVWLAMALAQKTPILLLDEPTTYLDVKYQIEILNLI KEINKKYQMTIIMVHHDINQAINYSDEIIAMKDGKILFQGVPEEVITSASLKSLYDYDLS VIDYNHQKIVLNYQ >gi|223714213|gb|ACDT01000002.1| GENE 26 27927 - 28592 819 221 aa, chain - ## HITS:1 COG:MT2766 KEGG:ns NR:ns ## COG: MT2766 COG0569 # Protein_GI_number: 15842230 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Mycobacterium tuberculosis CDC1551 # 15 220 15 217 220 117 28.0 2e-26 MKIIIVGGGQIGSYITSLLLKNDHEVRVIENRTKALENYKNAGLPEECLVIGDGTDIEVL EKAGVRTCQALVCVTGLDEVNLTTAMVSKFEYDVPRIIARVNNPKNAWLFNSGMGVDAKI 
NQADIIGHMIADEMNYQSIMTLMKLSKGDFSIVRIRVDYQSRYAGKQISEISLPNNALLI AIYNDDELMIPHGDTLIEAGQDILAFGDEQAIGKLNEIFVS >gi|223714213|gb|ACDT01000002.1| GENE 27 28605 - 29252 659 215 aa, chain - ## HITS:1 COG:Rv2691 KEGG:ns NR:ns ## COG: Rv2691 COG0569 # Protein_GI_number: 15609828 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Mycobacterium tuberculosis H37Rv # 1 128 1 128 227 108 39.0 8e-24 MKIIIVGCGKFGTRLANYLNNQNHDVTIIDNREEAFNNLGGNFTGRTVLGMGYDKDVLES AGIKVADALISCSSSDAINAVVSNIAHNIYHVNNVIARMYDKSKARIFRSMGIHTISITD LGVERILDYFDSNRLQVVKKIGEDSEIKIIKIKASLSLIGKTVDEISLHGEFEVVAIERH DITMMPLKDTKIKDNDTIYFSVLEESLLKFKEILY >gi|223714213|gb|ACDT01000002.1| GENE 28 29267 - 32077 2683 936 aa, chain - ## HITS:1 COG:MTH1516 KEGG:ns NR:ns ## COG: MTH1516 COG0474 # Protein_GI_number: 15679513 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Methanothermobacter thermautotrophicus # 21 932 15 907 910 768 44.0 0 MNINEDNELNNYQIMYVTPVEAYTLMESSMDGLSEEQVSHRLKKYGKNEISKKKQSSMFK KLISNFTSLMALLLWGGGLLAILSGTVELGISIFCVNLINGFFSFFQEFKAEKATSALQK MMPSYARVVRDGKEVKIFAEDIVPGDIMILEEGDRICGDARILRCSDFQVDQSTLTGESN PIRKNYEALQEKVSYLEAENMIFTGTTVASGTCHCVVVATGMDSEFGKIANLTQNTEKSL SPLQKELNVLTKQIAIIALSVGIVFMLIAVFVIKDPLLESFIFSLGMIVAFIPEGLLPAV TLSLALSVQRMAKDNALVKKLSAVETLGCTNVICSDKTGTLTQNEMTVNHLWTLDSQMDV SGEGYVPNGKIYVDEQEITAKKSNVLRLLLSGAVLCSNAKLVPPQNKSVNPRYTVLGDPT EACLEVVAKKAEIDLDKLNSQYPRILELPFESRRKRMTTIHQLKDSFEGNQRIAFVKGSP KEVMELCNRCFKGSKACPISEEDRINIMKANDMYAREGLRVLAVAYRTIAHNDKKLPSSI REYTPELIEQDLTFLGLIAMQDPPRSEVKEAVELCHSAGIKIVMITGDYGLTAESIARKI GIIKSDTARIVSGTELSKMNDQELKNVLEGEVVFARMAPDQKYRIVCALQEMGNIVAVTG DGVNDSPALKKADIGVAMGITGTDVAKEAADMILTDDNFASIVRAIEEGRAVYNNIRKFL RYIFDSNTPEAAAPALYLLSGGLVPMPLTIMQILTIDIGTDMIPALGLGAEHPEDNIMNQ PPRRIDERLLNKKVILVGFIWYGLIITLFALGGYFLVNYLNGWPHNPLAPEGTALYNEAT TMMLVGVVFSQMGMVMNNRTEKESVFKRGIFSNHYINLGLVIEFVILLAVVYIPFLNGIF NTAPIGLIEWLYALPIPFIVFGIEELRKKILRNKDK >gi|223714213|gb|ACDT01000002.1| 
GENE 29 32885 - 33643 644 252 aa, chain + ## HITS:1 COG:BS_yabD KEGG:ns NR:ns ## COG: BS_yabD COG0084 # Protein_GI_number: 16077107 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Bacillus subtilis # 1 251 1 254 255 248 48.0 6e-66 MYFDTHCHLNSDELYEDHDVFVQKALDNNVTNMVVVGYDLKSSQIAIELAKQYEFVYAAV GIGPNDCLNTTKDQMNIIDKYLEEPCVVALGEIGLDYYWDSVPKEKQMEVFQWQMDLAKK HQKPVVIHCRDAYEDTYEVLKRNGHPGVMHCYSGSVEMAERFVKLGYYISLAGPVTFKNA RVPKDVAATINLENLLIETDCPYLTPHPFRGKLNEPANVVYIAQEIAKLKNMEIENVASA TTFNAKKLFGIK >gi|223714213|gb|ACDT01000002.1| GENE 30 33655 - 34584 1001 309 aa, chain + ## HITS:1 COG:no KEGG:BL00531 NR:ns ## KEGG: BL00531 # Name: yabE # Def: hypothetical protein # Organism: B.licheniformis # Pathway: not_defined # 50 300 197 441 441 89 28.0 2e-16 MKKIKEAIINADPRMKGLLVVMIAYIGIVATTLNAGATNQIDNYVETIDVKVQDGNQDQK DYLIRQASVSSVLDDLKISVNPQDILNLDLNYIVNKGDLIQITRVNQADIDEMITVESNT VNTTGLELFTTKVAQQGQNGQVKNTYRVTYENGNEVGRELIGSQVVSQATDTIIETGAVQ EGAFFTGRLTTYGGDCAGGNGTSSTGIKLSPISGVQGSNSPKLTYNGRSYYCLAADPSIP FGTIIEITNHNLSIESTAYGIVVDRGGAIKGNKIDIFNGTEAGKYFTGGTSKNTQFKIIS VGSGKNFWK >gi|223714213|gb|ACDT01000002.1| GENE 31 34866 - 35513 519 215 aa, chain - ## HITS:1 COG:L181807 KEGG:ns NR:ns ## COG: L181807 COG0679 # Protein_GI_number: 15673902 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Lactococcus lactis # 3 214 27 238 238 100 32.0 2e-21 MTRALSMINSSGYNIGTFTLPFVQSFFPSNLIGYVCLFDTGNALMCLGGTYSAASTVVAN EEKQSFKTVAKKLFSSIPFCTYIILFFLSLFHIAIPTQILTVTSIAGNANAFLAMLMIGI LLEIKLDLSQIRLIKKILLNRYAVTLGLSLFVYFILPIDLTVKKMIILCLCSPISAVAPV FSNRLGSRSPVPSAINSLSIIISIFIMTILILFMV >gi|223714213|gb|ACDT01000002.1| GENE 32 35510 - 35779 277 89 aa, chain - ## HITS:1 COG:no KEGG:SDEG_1575 NR:ns ## KEGG: SDEG_1575 # Name: not_defined # Def: predicted permease # Organism: S.dysgalactiae # Pathway: not_defined # 1 81 1 81 302 72 48.0 5e-12 MIDILIKAGGFILIIMLGFALKTKGVCTREHGSFLSTIIMNITLPCSLLSSINNLEITPI LLVALACGFLGNVITNLSGYLIQKKRNHP 
>gi|223714213|gb|ACDT01000002.1| GENE 33 35949 - 36278 370 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754945|ref|ZP_02427072.1| ## NR: gi|167754945|ref|ZP_02427072.1| hypothetical protein CLORAM_00449 [Clostridium ramosum DSM 1402] # 1 109 1 109 109 189 100.0 4e-47 MEKIKNWFYKPAFTLICYAIALLIICYFGYTVNISYQSVVSYVEAGSISWGANFGEIVAY FMSNTFSYFFYAMTFVFFGKVISIIKPRAPKQEIVEEINDNVDDELIEA >gi|223714213|gb|ACDT01000002.1| GENE 34 36394 - 36642 149 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754944|ref|ZP_02427071.1| ## NR: gi|167754944|ref|ZP_02427071.1| hypothetical protein CLORAM_00448 [Clostridium ramosum DSM 1402] # 1 82 1 82 82 129 100.0 5e-29 MNQIKFHKRLGLILDEIYDIFNSSTRQEQLNALLLKKLTDSKLKIDSCYTSLDKSNSLLN YDYKDIPTLKHYFNNYLVSSVS >gi|223714213|gb|ACDT01000002.1| GENE 35 37252 - 37878 360 208 aa, chain - ## HITS:1 COG:no KEGG:CD196_2121 NR:ns ## KEGG: CD196_2121 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_CD196 # Pathway: not_defined # 14 175 10 174 185 67 29.0 3e-10 MKNKKTNFIYKSCILIYLISLVYAFYLNWGGKYFSMTIVACLTPFIMPLLMKLLKIEVPQ EFYIINIIFVYFASLWGSCLGGYHLPYFDKFTHFFSGIIFCEIAYIFYKHFLPNEKRKFL MFIFINALNAMIALFWEFYEYALLIFFNYDAINHYSTGIHDSITDMLVAVIGSFILSLYL THYDQRNDNHFFIKLAKNIKFNKTSHPN >gi|223714213|gb|ACDT01000002.1| GENE 36 37980 - 38531 626 183 aa, chain + ## HITS:1 COG:CAC0130 KEGG:ns NR:ns ## COG: CAC0130 COG1971 # Protein_GI_number: 15893426 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 181 1 198 200 79 32.0 3e-15 MHIFSAVLYAVSANIDALSIGIAYGIKKTHINHQKNFMIAFITMIGTYGAMRFGHFLTNY ISLSLADFLGSSLLLLIGIYTVVKSLKECDEIIVTNHLSAVKNISLKETLVLSSVLTINN VALGIGASITGMPELITSLTTFVCSYVFIGTGQKFKRLNFKGKYADIISGIILIILGLYE LTI >gi|223714213|gb|ACDT01000002.1| GENE 37 38596 - 40458 2235 620 aa, chain + ## HITS:1 COG:SA2500 KEGG:ns NR:ns ## COG: SA2500 COG0445 # Protein_GI_number: 15928296 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently 
involved in cell division # Organism: Staphylococcus aureus N315 # 2 615 5 617 625 720 58.0 0 MYDIIVVGGGHAGIEAALAPARMNQKTVLVTSNFDNVGSLPCNTSIGGPAKGIIVREIDA LGGQMPKTADKTYLQMKMLNTAKGPGVQSLRAQADKKVYPRYMQEVLKKQENLDIIEGMV EDLIVEDNCVKGVILDNGQTIEGRMVILTTGTYLKAEILCGDQKHASGPDQQKESKYLSE KLANLGMRIQRLKTGTPPRVEINSVDYSKTSLQPGTDANLAFSYQTNEFIPIEEQTPCYL TYTNEKTHQIIRDNLHRSSMYSGIVKGIGPRYCPSIEDKIVKFSDKPQHQIFLEPESAEM DTIYVQGFSTSLPHDVQEEMIRTIPGLENCKILKYAYAIEYDAIDPLQLWPSLETKIIKN LFTAGQINGTSGYEEAAGQGLIAGINATLKNQGKEPLILKRDEAYIGVMIDDLVTKGTDE PYRMLTSRAEYRLLIRHDNADERLMKYGHDVGLVSDQIYQAYLDKMSEIFAEIDRLDTIR FTPKHEINDALEHFGSARLTEGISAKELIKRPELTYEKILPYVEAPKLSEEQRKRVTILI KYKGYIDKARRQADKQVKMEEKKIPADIDYEDISNLALEAKQKLSKIRPLTIGQASRISG INPADISVLLIYLKQKYNEE >gi|223714213|gb|ACDT01000002.1| GENE 38 40461 - 41162 753 233 aa, chain + ## HITS:1 COG:BH4060 KEGG:ns NR:ns ## COG: BH4060 COG0357 # Protein_GI_number: 15616622 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Bacillus halodurans # 1 233 1 237 238 256 57.0 3e-68 MNELEFIKQLETKGITLTPQQVNQFKQYFKILVEWNEKMNLTAITDEEGVYLKHFYDSLT IAFDFDFTDQSIVDVGAGAGFPSVPLKIVYPELKITIVDSLTKRITFLNHLFKELNLSNC QAISARAEDYAKDHRQKCDIVMARAVARLNILDELCLPLVKVGGYFLALKGLKATEELEE AGKGIVLLGGQVEKSIDFTLTNDNHRSNIIIKKVRDTPAKYPRMFGKIKKQPL >gi|223714213|gb|ACDT01000002.1| GENE 39 41171 - 41938 901 255 aa, chain + ## HITS:1 COG:BH4059 KEGG:ns NR:ns ## COG: BH4059 COG1475 # Protein_GI_number: 15616621 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 8 255 29 281 283 203 47.0 3e-52 MLEKVYKQIDIEKISANENQPRTVFDDEKIEELAASIKENGLIQPIIVRKYNRGYQIVAG ERRYRASKLAGLKTVPCVIKDIDDKQVDTLAIIENIQRENLSPIEEANAYKTLIDTYDMN QTELANKVGKKQSTIANKLRLLKLSDDVKHALKSKQITERHARAMIGLEADKQQTVLQEV LKKSLNVKQTESLISKPVKTKPKNKKGPTKVSRNFKIAINTINQATELIQKSGIEVISET EEADDEYIITLRVKK >gi|223714213|gb|ACDT01000002.1| GENE 40 41938 - 42429 604 163 aa, chain + ## 
HITS:1 COG:no KEGG:EUBREC_2155 NR:ns ## KEGG: EUBREC_2155 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 160 10 168 174 173 54.0 2e-42 MKINDLILAMIDFYQGHPKQIQHLIKVHSFARVIGIDEGLSTQEQERLEVAAIVHDIGIK PAWEKYNSSNGKYQEELGPAEAIKLLNRLNYDEALIERVAYLVGHHHTYSEIDGLDYQIL VEADFLVNLFESESEEMAINSAYQKIFKTNMGKRICRTMFMKG >gi|223714213|gb|ACDT01000002.1| GENE 41 42463 - 42825 477 120 aa, chain + ## HITS:1 COG:lin2295 KEGG:ns NR:ns ## COG: lin2295 COG1393 # Protein_GI_number: 16801359 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Listeria innocua # 1 119 1 119 131 124 47.0 4e-29 MIRVYTAPGSQSCRKVIAWLKEHNLSFIEKNIFSSDLHANELREILERCENGTDDILSKN SKIIKSNKIDFDNMKMDELIAFIKANPSILKRPILMDEHRFQVGYNEEEIRTFIPRHQRG >gi|223714213|gb|ACDT01000002.1| GENE 42 43066 - 43545 416 159 aa, chain + ## HITS:1 COG:no KEGG:BL1738 NR:ns ## KEGG: BL1738 # Name: not_defined # Def: hypothetical protein # Organism: B.longum # Pathway: not_defined # 1 156 28 199 199 200 55.0 2e-50 MLIRRAENRDITQINNLLKQVCLVHHKGRPDLFKKGARKYSDDQLMQIIKDDNRPIFVAV DNNEVVLGYVFCIFEQYLDSNILTDVKTLYIDDLCVDENSRGQHIGKQLYEYSKDFAQKS GCYNLTLNVWSCNASAMKFYESCGLKPQKVHMETIFDKS >gi|223714213|gb|ACDT01000002.1| GENE 43 43652 - 44281 581 209 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237735337|ref|ZP_04565818.1| ## NR: gi|237735337|ref|ZP_04565818.1| predicted protein [Mollicutes bacterium D7] # 1 209 1 209 209 377 100.0 1e-103 MKVILECSHQYINTIKRKLNAHLEQITYLDSFTGNDLSLFIKEVKKLEDIENIKNIKSIS NANIVCLIDSGLFMFELLEFNPLAFIRSANINADLDELINRIEYLNKGLGVMLEFQCGYQ TVRMNVKNITYVESYAHYLLIHTLNSTVKVREKISVALQKLEPLGFIQVHRSYVINHDRI KKIDTDNITLSDETIIPIGKKYKQALKNY >gi|223714213|gb|ACDT01000002.1| GENE 44 44278 - 44907 638 209 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754934|ref|ZP_02427061.1| ## NR: gi|167754934|ref|ZP_02427061.1| hypothetical protein CLORAM_00438 [Clostridium ramosum DSM 1402] # 1 209 1 209 209 372 100.0 1e-102 
MKDKILIGEKMTKKGVFLNYLGLVFFGIIAVVGYNGLISKYLHLNNQLNILIMIFIFIAV LLFLTPIIGSSEIIEFNHNEVRYFHTKGYFQQFAEVIRILTGKQAVPDITLKTKDIEQVN LSYVPFLMMYAQQGYQIKITFLMKDGTTLAFLPSTIDQMEKGDYESAFKILETNGVEIVD KLMLRKALKMNKNDFYQYVNTIEKGRNKK >gi|223714213|gb|ACDT01000002.1| GENE 45 45035 - 45958 926 307 aa, chain - ## HITS:1 COG:VC2501 KEGG:ns NR:ns ## COG: VC2501 COG0260 # Protein_GI_number: 15642497 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase # Organism: Vibrio cholerae # 2 299 190 492 503 230 41.0 3e-60 MLSDTPSNLMTPDTLVDEAIKLSQKYPQLECTILNKSNLEKMEAGGILSVNQGSNIPAYM ICLKYNNSDDPYTAVIGKGLTFDSGGYNLKSDSYGMKYDMCGGADVLGVMQILAASQAKA NVYGIIPTTENLVSDKAYKPQDVITTLSKKTVEIVSTDAEGRLILCDAITYVQQLGVKKI IDLATLTGACVSALGDVYTGVFSNCDEYYDEFVQALQISDEKGWRLPLDPEYFEKLKSTS ADFKNSAGKPGGGASVAANFLEAFIEEGTQWLHLDIAGSADNDGTGATGAMIRSVVNMIN LKTIVKK >gi|223714213|gb|ACDT01000002.1| GENE 46 46191 - 46451 340 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754933|ref|ZP_02427060.1| ## NR: gi|167754933|ref|ZP_02427060.1| hypothetical protein CLORAM_00437 [Clostridium ramosum DSM 1402] # 1 77 27 103 501 144 96.0 1e-33 MKQIKNINIETLENNLIYPIYEDQLIDPQLNKVLDYNLDKLVNKKFKELTVIHTLGKYKF ETITFIGLGSSKKINKQTNKRYCCFN >gi|223714213|gb|ACDT01000002.1| GENE 47 46637 - 46930 357 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754932|ref|ZP_02427059.1| ## NR: gi|167754932|ref|ZP_02427059.1| hypothetical protein CLORAM_00436 [Clostridium ramosum DSM 1402] # 1 97 9 105 105 155 100.0 1e-36 MINNTEQETISIIQLTQQSSLAYQKVLAYVKDQNFVKANEYLIKGNETLLEASKIHAGCL TVSSDLDLLLVHAEDMLISVQLYKSLIKEFIDIYKKI >gi|223714213|gb|ACDT01000002.1| GENE 48 47020 - 48303 1500 427 aa, chain + ## HITS:1 COG:SP0474 KEGG:ns NR:ns ## COG: SP0474 COG1455 # Protein_GI_number: 15900389 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Streptococcus pneumoniae TIGR4 # 1 422 1 433 440 228 35.0 2e-59 MQKLMDKMENVLAPLATKIGSNKILKAISTAFNLIMPLIILGAIFTLLSTLQLDAYQQFL 
ADSGVGTVLSLVGKFTTDMLAVYVSFTAAYAFIRNEGMNTDAIPAGLLSILAFFIMTPLA NIVVEETTTSYIAFDYLGSKGLFTALIVGILVGYIYTFVVKRGWIIKMPEGVPPTVAKSF NALIPGFVITTVFLIINGAFASLTGVTFSEWFYGIIATPLSALSGSLITYMVLTLLASVF WFFGIHGGQVTIPFSMMLFMQAGVENQAAFASGAPMQNIITVGLLYFLMLGGIGNTLGLS IDMLLAKSNRYKTLGRLAILPSCCSINEPIVFGLPLILNPIMALPYFLVPQINILITYFA MKSGLVSLPRIAMGATGTPVLLDGWLICGVSGIILEIVLIIVSAILYYPFFKTQDNIALK EEAQIKE >gi|223714213|gb|ACDT01000002.1| GENE 49 48369 - 50207 1815 612 aa, chain + ## HITS:1 COG:BS_licR_1 KEGG:ns NR:ns ## COG: BS_licR_1 COG3711 # Protein_GI_number: 16080911 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 18 460 20 490 499 206 27.0 1e-52 MDKNQKMIILELMNHETVSGSNLSSVIKMSTRSVRTIIKNINEDICGAKIESGSFGYRLT IETPETFLAYLQRDQNGKEESRLAYLFNRFIDCNNYLKIDDLCDELYLSRTQLKQSLKEL REYLHDFDLTIATKAYYGMYLEGDEINKRRAIAHFEEYQMDFDILQRIRDIVISSIANAD YVISDDVLDNLVAHLYIAYYRVMKKEYANIDSEWLEEIKEEKEYSLGCAIMELMNKIMAM EYRIEEVAYLTMHLCGKNSKQLSNNYINQEILDIVKEMLMIIEKVANIPFQADLNLQLAL SLHLIPLVKRIQYGTFMHNPLKDEIKSKLIMAYELAVKACVVINQRFNCTLSEDEIAYFA LHINLSLEQKKYNFHRNNILVVCSSGVGSARLLEYFFKENFNDYIEHLEVCSLHELENIS LTKFDCIFTTVPLAIKVNIPIFLINNLINQRDTIKITNNLKQLNQANILDYFPEQLFFTY ESFSSKEEAIHEIINECKKSYDLPADFEQYVLQREALATTEFNDLIAFPHSNKPVSNATF VAVTILKKPLLWKKHKIRIILLSAIENKAIKELDDFYKIISNIISDSTIQWNLINNPNYQ YFKEIIERLERL >gi|223714213|gb|ACDT01000002.1| GENE 50 50204 - 50890 664 228 aa, chain + ## HITS:1 COG:lin2905 KEGG:ns NR:ns ## COG: lin2905 COG1440 # Protein_GI_number: 16801964 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 1 94 1 95 100 64 43.0 1e-10 MKKILLVCGGGMSTSLLVKKMEAADIYNEYQIKCSDVVSGQVLLVDYDIILLAPHIAYLQ NEFVPLCQKVNIPLMLIDSYDYTKMAGKIVLDKAIMLLNEFTKEHPFKVLLLHSVAGAMS DLIVLDMKKKVIGDEKDWQIESLTTEQFEDKNYDIILLEPQIRYEEVNLERRLKENLTII TVPPISLYASFDGRKVLDYIKRVYNQEIDKKRKKVKESIEKYEINNES >gi|223714213|gb|ACDT01000002.1| GENE 51 50868 - 51620 766 250 aa, chain + ## HITS:1 COG:lin0339 KEGG:ns NR:ns 
## COG: lin0339 COG3394 # Protein_GI_number: 16799416 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 2 246 3 256 261 143 34.0 2e-34 MKLIMRADDLGFSEGVNYGIYKSVVDGVITSVGMMPNMASAKHGYELVKDLDIALGQHSN ICVGKPLSDPKLIPSLVQKNGNFCSSKEIRARQVDTINVAECEIELEAQLAKFRAITGRD PDYFEGHAVFSNNFFIALKNVAQRNNLFFENPSLDKEWEKEFGITGLGFMSLDEHGLYDP RTYMEEHLDIIKNNPCAVAIFHPGFLDQYILEHSSFTLIRPMECDFLCSDWLSEWLNKNK IELVNFKNYK >gi|223714213|gb|ACDT01000002.1| GENE 52 51686 - 52495 839 269 aa, chain - ## HITS:1 COG:BS_bltR_1 KEGG:ns NR:ns ## COG: BS_bltR_1 COG0789 # Protein_GI_number: 16079711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 3 120 6 123 124 95 40.0 8e-20 MKEKLLTCGTFAKICGVEKHVLFHYDEINLFKPAYVNDKGYRYYSYRQYDTFKVIIALKK LGMSLKDIQIYLDQRNPQLFLELLNQQEAKLLEYINEMQQIHEIIKNFQTFTTDALNANY DEIKVTYLKETKLLLSANLENTTSKGFADFMNDYTSFIENNHIITGEFVGVMMNVENIRK NQLSNYSYLFTTTNNLSEQIFVKKGGNYLCGYHHGSYDGLRDSYQRILDYAELHKIKLGK YAFEEYIIFDICEASKDEYLTRITIEIAE >gi|223714213|gb|ACDT01000002.1| GENE 53 52578 - 53285 667 235 aa, chain + ## HITS:1 COG:TM0287 KEGG:ns NR:ns ## COG: TM0287 COG1132 # Protein_GI_number: 15643056 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Thermotoga maritima # 1 227 1 227 577 138 33.0 9e-33 MKLILKYLKNYKLLFFFNVISVFGFILVELGIPTIVATMIDVGVTNKDVSFIYMMGGCIA LVSLIGVGGTIALGYCCSKISTAITRDIRNDIFAKVQQFTANEFNQIGTSSMITRTNNDA FQIQQFVNVLLRTALMTPIMFIFSFIMTARASLPLSYIIAATIPLIILGVVVVAKITKPI SENQQSSLDDLNRISRENLSGIRVIRAFNNDQYEQKRFKETNHRFTKYSKNYLKL >gi|223714213|gb|ACDT01000002.1| GENE 54 53282 - 54307 228 341 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 80 318 114 351 398 92 25 5e-18 MTMTQPIFFMLMNVAGLSIYWVAAHLISTGSLEIGQLVAFMDYLFHAMFSIMLFCTVFMM YPRAEVSAKRIEEVFNLDPIIKNKPADSDFDEKVSIEFDHVTFVYPDGEEPVLQDVSFKA 
NKGEMIAFIGSTGSGKSTLVNLVPRFYDVSSGSIKINGKDIRDYDVLELRDKLGVIPQKA VLFSGTIADNIRFGKKDASDEEVEYAAKVAQAYPFIMEKENGFDEEISEGATNVSGGQKQ RLSIARALVRKAQIYIFDDSFSALDFKTDAILRKELKKEMTESIMLVVAQRISSIMEADQ IIVLNEGKVVGKGTHHQLLKECQIYHEIATSQLSEEELANA >gi|223714213|gb|ACDT01000002.1| GENE 55 54300 - 56078 204 592 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 368 574 28 231 318 83 27 3e-15 MHKKYSFKKSLKGLIPFIKPYKWQFIIAILMIFAFNISMVLAPTFEGMITTQLASDVAKS SSLTSVDISFGAIIKIVLSLVVIYIIKTVAQMISVVYLTNAIQQAMEDLRNALQNKIRKL PVRYFDNHHFGDVLSRITNDVDAISNALQQSFTQIVSGVLTVTLALTMMYMINPKMAIIG TMIIPLSLLVTKLVVGKSQKLFKNQQDALGELNSTVTEMYTGYNEILLYNQQVESVEKFK KINENLKENAFKAQFVSSTIAPLNALVTYLAIGAVAVVGTADVIAGTFLVGQLQAFIRYI WQINDPLSQISNLSSQIQSAFAALGRVIELLEEPEEVPEANPPKHLSEVAGNVDFEHVKF GYYEENLMKDLNVNVKSGQMVAIVGPTGAGKTTIINLLLRFYDVKGGSIKIDGVDIRDLP REELRSMFGMVLQDTWLYSGTIYDNIRYGRLDARKDEIINAAKMANVHHFIRTLPDGYNS HINEEANNISQGEKQLLTIARAILKDPQILILDEATSSVDTRLEKMLQEAMQRVMKGRTS FVIAHRLSTIKSADLILVINNGDIVEQGTHEELLAKQGEYEKLYNSQFAHNS >gi|223714213|gb|ACDT01000002.1| GENE 56 56212 - 56892 652 226 aa, chain + ## HITS:1 COG:CAC0860 KEGG:ns NR:ns ## COG: CAC0860 COG0745 # Protein_GI_number: 15894147 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 225 5 230 231 185 46.0 5e-47 MVKILIVEDDNDINNMIHDLLRLNCIESVSAYSGTEALLLLDEHVDLVLLDLMLPGLSGQ EVIKEIKKKKDLPVIILTAIDKMDTKLDLFALGADDYLIKPFDNNEPLARVKIQLKHRQV AFSSALAALITYKDIILDENSYKVTCNEIRINLSKIEFKLLKILLESPTRVFTKDILFEM VWDHEDSGDDNTLNVHISKIRSKLKQANPNQEYIETVWGIGYKIKG >gi|223714213|gb|ACDT01000002.1| GENE 57 56981 - 57658 799 225 aa, chain + ## HITS:1 COG:CAC2734 KEGG:ns NR:ns ## COG: CAC2734 COG1131 # Protein_GI_number: 15895991 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Clostridium acetobutylicum # 1 
217 76 293 301 197 49.0 2e-50 MGCMIESPAFYPNLSAYDNLKYYCILKGIINRHKIEEVLEKVKLNNTGKKKFKQFSLGMK QRLGIALAILNNLDFIILDEPINGLDPLGISEIRETLLQLQQENITILISSHILSELYLV ANKFGILEGGKIVKELSKEQLDNECSHCLLLKVNDIKKASVLLEEELKTANYKVLNDQEI RLYDYLDAQHIVSKAIVEKGIELYAISEIGVSLEEYFKNIIRDGE >gi|223714213|gb|ACDT01000002.1| GENE 58 57660 - 58472 608 270 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735350|ref|ZP_04565831.1| ## NR: gi|237735350|ref|ZP_04565831.1| predicted protein [Mollicutes bacterium D7] # 1 270 5 274 274 464 100.0 1e-129 MLNIIKADLFRIFKGKGIYVCLLAIIALCSVSIYLRSPGHIGISGGLTSNSEVTKEDDLA TPEDVRKVAGESNELLDEGMMKVNGNLYYVLIFVVFAVICVDLANHTAKNVISTDVSRTT YYFAKLLLTWGLGIAIIAISTYLGYFGNIIFNAPSHYSSFLDITIIMLRQLPIFCGIMSV LVMIAAITQKTSRYNAIAIVLVMVSQMLLMTIITVFNIDGSIIMQFEFETILRDMAVIGQ IEIKTLLTGIGLIIASSMIGVTYFKRCNIK >gi|223714213|gb|ACDT01000002.1| GENE 59 58679 - 59305 576 208 aa, chain + ## HITS:1 COG:CAC2730 KEGG:ns NR:ns ## COG: CAC2730 COG0642 # Protein_GI_number: 15895987 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 5 207 74 279 303 116 36.0 2e-26 MLTFRKLEIDIEKKNNSLQKTIVNISHDLKTPLTSALGYVELLQKYELSGEEQHKYESII IDKLKRLSQLIDDFFTFTKIISNEEKFELKLLDINRIFEEELVCYYQDFNSQGRMIYIEG NKKIEMLSNERILRRIFDNLIINALKHSRGDIYISINTGAEICFQFKNRLDEPIIVEQIF DEFYTSDISRTKQNTGLGLAIVKEFTEM >gi|223714213|gb|ACDT01000002.1| GENE 60 59397 - 60926 1368 509 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754919|ref|ZP_02427046.1| ## NR: gi|167754919|ref|ZP_02427046.1| hypothetical protein CLORAM_00423 [Clostridium ramosum DSM 1402] # 1 509 1 509 509 893 99.0 0 MDILMKYAKKKTVFILSTVLSGIVTLLYLISKFGPTTSSNVNINDINELAGTIGSMVTII QIIFYLFIILAIVTAILSGVYFFKKNKQEYVMLGEFIVSCINSILLLLSMSGINAICKIL RVAISGDYSSLMTMDSARLLSSVTSAASNLKYFMYLSIFVFIINIVLLLIVKKIINIDGF HFNFDEPVVTAVPAQEPVHEDSTTVEDDSTIADNELTQENGETETNTVEPNASETKPATI NTQKIKDFFKTKNGKITIGVAAALIVCFGGYKIYDTFFNYTSIDLGKNITVEFTGKDGSG YIKDVESNIDYDKNNSDLSNFVSSTYTDYDFSGDLSNGDKITVTIKYSEELAKANKIKVT 
NDSKTFTVKGLIEKFKDSSKIPEKVITQLKKEADKNIKNRYKDGYSFTYNHEFNSLWFAK GEDDDNDSVIAIYKIDETYTSSFSGQQDVETYYAAIYVDDVDSAYLDEKSHYWYGASLYG SDGKELSDLNALETALKEKFDDQTLELIK >gi|223714213|gb|ACDT01000002.1| GENE 61 61088 - 61324 196 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754918|ref|ZP_02427045.1| ## NR: gi|167754918|ref|ZP_02427045.1| hypothetical protein CLORAM_00422 [Clostridium ramosum DSM 1402] # 1 78 6 83 83 124 100.0 2e-27 MKFYQYMYYSFLGLTLFLSHLLCIVVAYQYCLIEHHKTTSFPPYVAFFSAIPFLIGIIVC LIIAIKFKQHYQENKNNL Prediction of potential genes in microbial genomes Time: Thu May 26 09:03:16 2011 Seq name: gi|223714212|gb|ACDT01000003.1| Coprobacillus sp. D7 cont1.3, whole genome shotgun sequence Length of sequence - 39170 bp Number of predicted genes - 38, with homology - 38 Number of transcription units - 21, operones - 10 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 515 552 ## COG0756 dUTPase 2 1 Op 2 40/0.000 - CDS 559 - 1887 1184 ## COG0642 Signal transduction histidine kinase 3 1 Op 3 . - CDS 1880 - 2581 757 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 2605 - 2664 6.0 4 2 Op 1 . + CDS 2722 - 5163 1932 ## COG4485 Predicted membrane protein 5 2 Op 2 7/0.000 + CDS 5135 - 5542 526 ## COG2246 Predicted membrane protein 6 2 Op 3 . + CDS 5529 - 6473 958 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 6475 - 6534 3.5 7 3 Tu 1 . + CDS 6590 - 7435 1140 ## gi|167754911|ref|ZP_02427038.1| hypothetical protein CLORAM_00415 + Term 7456 - 7489 1.5 + Prom 7580 - 7639 6.0 8 4 Op 1 . + CDS 7674 - 7991 464 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 9 4 Op 2 . + CDS 8003 - 8305 355 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 10 4 Op 3 . + CDS 8328 - 8972 535 ## gi|237735363|ref|ZP_04565844.1| predicted protein + Prom 8977 - 9036 6.5 11 5 Tu 1 . 
+ CDS 9075 - 9257 173 ## gi|167754907|ref|ZP_02427034.1| hypothetical protein CLORAM_00411 12 6 Tu 1 . + CDS 9783 - 10640 622 ## COG1737 Transcriptional regulators - Term 10477 - 10516 -0.2 13 7 Tu 1 . - CDS 10650 - 11330 558 ## Phep_1535 LmbE family protein - Prom 11495 - 11554 7.6 + Prom 11390 - 11449 13.4 14 8 Tu 1 . + CDS 11532 - 12863 1448 ## COG0534 Na+-driven multidrug efflux pump 15 9 Tu 1 . - CDS 13162 - 13416 129 ## gi|237735368|ref|ZP_04565849.1| conserved hypothetical protein - Prom 13463 - 13522 5.0 16 10 Tu 1 . - CDS 13617 - 14036 416 ## CLL_A1558 hypothetical protein - Prom 14129 - 14188 8.5 + Prom 14008 - 14067 5.5 17 11 Tu 1 . + CDS 14120 - 14779 573 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 14783 - 14818 5.8 18 12 Op 1 12/0.000 - CDS 14803 - 15297 492 ## COG0602 Organic radical activating enzymes 19 12 Op 2 . - CDS 15297 - 17414 1972 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase + Prom 17574 - 17633 8.6 20 13 Op 1 12/0.000 + CDS 17757 - 19916 2408 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 21 13 Op 2 . + CDS 19919 - 20446 417 ## COG0602 Organic radical activating enzymes + Prom 20531 - 20590 5.7 22 14 Tu 1 . + CDS 20610 - 21392 696 ## COG1737 Transcriptional regulators + Term 21451 - 21480 1.4 + Prom 21632 - 21691 9.9 23 15 Op 1 8/0.000 + CDS 21750 - 23219 1154 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 24 15 Op 2 3/0.000 + CDS 23222 - 24787 1394 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 25 15 Op 3 . + CDS 24741 - 25007 195 ## COG2190 Phosphotransferase system IIA components + Prom 25412 - 25471 6.4 26 16 Op 1 14/0.000 + CDS 25499 - 26113 677 ## COG1183 Phosphatidylserine synthase 27 16 Op 2 . 
+ CDS 26110 - 26970 936 ## COG0688 Phosphatidylserine decarboxylase 28 16 Op 3 25/0.000 + CDS 26987 - 27757 850 ## COG1192 ATPases involved in chromosome partitioning 29 16 Op 4 . + CDS 27747 - 28724 1229 ## COG1475 Predicted transcriptional regulators + Term 28762 - 28801 1.3 + Prom 28789 - 28848 9.4 30 17 Op 1 1/0.000 + CDS 28888 - 29661 692 ## COG1145 Ferredoxin + Prom 29753 - 29812 4.0 31 17 Op 2 . + CDS 29838 - 30530 192 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 32 17 Op 3 . + CDS 30532 - 32091 1427 ## Ccel_1181 hypothetical protein + Term 32100 - 32138 5.2 + Prom 32123 - 32182 9.0 33 18 Tu 1 . + CDS 32339 - 33691 1153 ## COG0534 Na+-driven multidrug efflux pump + Term 33702 - 33768 -1.0 + Prom 33700 - 33759 9.8 34 19 Op 1 . + CDS 33828 - 34256 464 ## COG4702 Uncharacterized conserved protein 35 19 Op 2 . + CDS 34292 - 35323 953 ## COG0673 Predicted dehydrogenases and related proteins + Term 35345 - 35385 3.5 + Prom 35354 - 35413 9.4 36 20 Op 1 . + CDS 35640 - 36944 1518 ## COG1455 Phosphotransferase system cellobiose-specific component IIC + Prom 36958 - 37017 4.4 37 20 Op 2 . + CDS 37055 - 37495 531 ## COG0716 Flavodoxins + Term 37496 - 37537 6.5 + Prom 37527 - 37586 6.2 38 21 Tu 1 . 
+ CDS 37608 - 39086 1720 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase Predicted protein(s) >gi|223714212|gb|ACDT01000003.1| GENE 1 3 - 515 552 170 aa, chain - ## HITS:1 COG:SPy0235 KEGG:ns NR:ns ## COG: SPy0235 COG0756 # Protein_GI_number: 15674420 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Streptococcus pyogenes M1 GAS # 40 168 19 146 148 94 41.0 8e-20 MQAIAKFHKVSKEQFIIDFKDSFPKYDEAAINDIYASIKLPKRATIGSAGYDFYTPIDFI LKPHETIKIPTGIRVSINDGWVLAIFPRSGLGFKYRLQLNNTVGIIDSDYFNSDNEGHIF IKITNDSNEDKTVELKAGQGFGQGIFLQYGIVEDDNTTDERNGGFGSTTK >gi|223714212|gb|ACDT01000003.1| GENE 2 559 - 1887 1184 442 aa, chain - ## HITS:1 COG:BS_yrkQ KEGG:ns NR:ns ## COG: BS_yrkQ COG0642 # Protein_GI_number: 16079695 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 120 434 99 423 432 129 27.0 2e-29 MIKNFFNKLSIQISAIIIICGTLGVATFCLLYAGKESFFEFTQNLGVISENTEAYANNIE NQIIDQNVSITDVDTLDKILGTSDIYSVNLYNKADNLAITGSFATMLDNFIIGSTVYDTE AIYNGDDYSTEIELKDGTIEMYVYSYALAKLVVPYIVASIVIALFTFLLPTFLFIRRKVN YMTTLKNEVTIMSQGDLDHTIKINSNDEIAELSNQIDNLRLTLKDNFATEEANRKANYEL VTALSHDLRTPLTSLMGYLDIIRLKKFKNEDQYNLYLKNSIDKVNQINELANKMFEYFLV FSKDQDTELSKMSLGVIYEYILENIGVLEENGFEVIKNIKHSDYFIRGNINLVKRIINNV FSNVLKYADIKQPVYITLLIDEDIENLGLTIKNTKKHSANYIESNQIGLKSVQQMIKIHD GTFTVLDEESTFTVSITIPLMQ >gi|223714212|gb|ACDT01000003.1| GENE 3 1880 - 2581 757 233 aa, chain - ## HITS:1 COG:BS_yrkP KEGG:ns NR:ns ## COG: BS_yrkP COG0745 # Protein_GI_number: 16079696 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 5 231 4 227 231 232 51.0 5e-61 MTNNKILIVDDNPEIREVLNVLLSSEGYDVIEAKDGQEAIDMVSDDIDLYILDIMMPIIN GYQACVEIRKKSNAPILFLTAKGQESDKTLGFSSGGDDYLSKPFSYNELTTRVKALLRRY YVYQGKMEKEENNDDNIVYNNITINPNSEAVYLNGEQIELTYLEYQILYLLLSNRKRIFS 
TQTLYESIWNEPYYYSANNTIMVHIRNLRKKIESDPQNPKIIKTIWGKGYRCD >gi|223714212|gb|ACDT01000003.1| GENE 4 2722 - 5163 1932 813 aa, chain + ## HITS:1 COG:BS_yfhO KEGG:ns NR:ns ## COG: BS_yfhO COG4485 # Protein_GI_number: 16077927 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 56 804 16 815 819 79 21.0 3e-14 MKKKQFYPIIFLTCLTLVMIIVITPSSSIFGANTDWLSQHVNLADYIRKTMLETKSFFPD FAFNIGGGQNIYNISYYGLFRPDVLIGCLIPGIAMKDIIITYMIINLVLSVNLTYIWLKR KQFSTTLCIVGAVLLLSSSVLFQSHRQIMFVDYMPGVLLALIAVDTYIKNQKQGLLIGAV VLIIANSYFFAISALLTIFTYYCFEMYRLKNLITFKKLCLFIQPLLIGVLICGILLIPTA YVMLENHQSSGGSLDLMSLVVPRGDFKALLYNAYGCGLGFVAWAGLVLSLKLKETRKLSI WLLLLLFIPIFSFVLNGFLYPRAKILLPFTPLIIYVVIQTINEYKQSQIKLDLKLIVLLI LPIVLFYKEPLVILDIIICLVGILLYLRITERTLYLLLVMPLIISYVNNQKESFVTKDTY QQVSNLKNIKVEKDGRYDIFKQSLNTVNQVSNNELRSSIYSSVSNRLYNHFYYDIIKNPI SIKNRVACLSNSNIFFQGMMGVKTLYSENVVPMGYQAIGNNLYQNDKVLPLVYATSNSYD VVQFDRLDYPQTLDTIYNNVVVAGGQSNYQSKAATVALETVVKEKSDNLQITKLNDGYRI NTKKTGKLELGINQDLTNKILIVEFDLAKVKLIKRKDTSITINGVKNKLSSSSAAYPNHN THFTYVLSQNELLDQLSISFSNGRYDLKNIKVSTIDYDVIKNRNQEIDALVGSYNQDGNL VEGTINVSNDGYLVTSLPYQNGYTVLIDGKEVAKECVNKAFLGAKISKGQHQIRIIFKAP MKNVGYVCSGVGFIWLVFQGRRKKNEKGFERIN >gi|223714212|gb|ACDT01000003.1| GENE 5 5135 - 5542 526 135 aa, chain + ## HITS:1 COG:lin2694 KEGG:ns NR:ns ## COG: lin2694 COG2246 # Protein_GI_number: 16801755 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 4 127 17 144 145 72 32.0 1e-13 MKKDLKELISYVFVGGCTTLINFVIYWFVITMMDQGWLFANVISWVGAVIFAFWANKHFV FKSANETGKEAYQFFILRLGTLLVESGLLFIFIQLLSANEMLSKVVVSVITVVSNYGLCK FKIFAAKGGHEYGQN >gi|223714212|gb|ACDT01000003.1| GENE 6 5529 - 6473 958 314 aa, chain + ## HITS:1 COG:SP1606 KEGG:ns NR:ns ## COG: SP1606 COG0463 # Protein_GI_number: 15901446 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 4 306 2 304 320 296 47.0 3e-80 
MDKISVIVPCYNEEEVLPLFHKEVTKELETIENIDYEILFIDDGSHDHTIDLLKMICAKD HHCDYYSFSRNFGKEAAMFAGLEKSTGDYCVIMDADLQHPPKLLAPMYQAVSQEGYDCCA GKRMDREGEGRLRNFLSKSFYKVMQKLSKLDMSDGAGDFRMMSRLMVDSILEIREYNRYM KGLFSYVGFETKWLPFNNVERAAGSTKWNFKSLFSYALEGIFSFSTAPLKIAGIFGVILF VGSIILSLYTAISTLMFGNSVGGYTTIVCLILFLSGTQMLLIAILGEYVSKDYMENKARP IYIVKDTNRRKRYN >gi|223714212|gb|ACDT01000003.1| GENE 7 6590 - 7435 1140 281 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754911|ref|ZP_02427038.1| ## NR: gi|167754911|ref|ZP_02427038.1| hypothetical protein CLORAM_00415 [Clostridium ramosum DSM 1402] # 1 281 1 281 281 489 100.0 1e-137 MKKFGNVVLTIVMLCLIMLLCVNLSIKTMSTKAITNAVVVQEASNGIEEVLNQAFPDVSN ENIKKVEEAVKNNSALNDVTSDLLDQITAAVANGSDVDTAAIAAQLSKAVDENIPAIEEA IGKKITTEQREQIQSKITDENGALQNKIVSTVEKVQKTTPGTQKFIKTYKTLSDTPTRII CVVGIILTAVLLGLINKSFYKWTLFSGIAAVISGIVVGLFMPLVVSAMEFTIGNRLLGMS IDIPVGSLRLDGAICAGAGIILIVAYIILNKKYATFERHYY >gi|223714212|gb|ACDT01000003.1| GENE 8 7674 - 7991 464 105 aa, chain + ## HITS:1 COG:lin2472 KEGG:ns NR:ns ## COG: lin2472 COG1440 # Protein_GI_number: 16801534 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 1 102 1 102 104 97 51.0 6e-21 MIRIMLACAGGMSTSLLMNKMKEEADNRGLEVSVEAIGEKTLEQHLGEFDVLLLGPQVRY VIPNVKKILAGKIPFDVIDMRDYGLMNGEKVLAAALKMYDDFYNK >gi|223714212|gb|ACDT01000003.1| GENE 9 8003 - 8305 355 100 aa, chain + ## HITS:1 COG:SP2023 KEGG:ns NR:ns ## COG: SP2023 COG1440 # Protein_GI_number: 15901844 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Streptococcus pneumoniae TIGR4 # 2 100 3 100 102 93 49.0 8e-20 MKILLVCNAGMSTSILVREMEKAAKEQQLELEVTAMGFTQAEKVLLDWDIVMLGPQVRHQ LTGLQKAAEGKVPVEVINMRDYGMMNGANVLKRALEIINE >gi|223714212|gb|ACDT01000003.1| GENE 10 8328 - 8972 535 214 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735363|ref|ZP_04565844.1| ## NR: gi|237735363|ref|ZP_04565844.1| predicted protein [Mollicutes 
bacterium D7] # 1 214 8 221 221 387 100.0 1e-106 MDEFLSELYVEYFKKWILLQTSKDYKLYFNDNDSNIILIEAENSLSRVTFNRFNIIELCV MNNNDNKLEFYIHFQMQNLGHAKRLFEEMIECIYNTIKQPVLKILLSCSSGLTTSYFAEK LSQTAELLELNYQFQAVGWEKVLAAALDFDVLLLAPQISYQCARIQKILPNKLVLKIPTL IFASYDCLKLFEFIKESLLLDKVNNNKTIIKLSP >gi|223714212|gb|ACDT01000003.1| GENE 11 9075 - 9257 173 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754907|ref|ZP_02427034.1| ## NR: gi|167754907|ref|ZP_02427034.1| hypothetical protein CLORAM_00411 [Clostridium ramosum DSM 1402] # 1 60 1 60 60 95 100.0 1e-18 MIYEKRIIKDSIIIQGIYDVLNIVLLHFLEIDCIAISIFGVFDNGYVYSSFIEGDKWYKL >gi|223714212|gb|ACDT01000003.1| GENE 12 9783 - 10640 622 285 aa, chain + ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 1 277 1 276 283 209 43.0 6e-54 MNILMKLKNSKVLSNNEKIIADYILEHPDRILKMTAKDLSTVCYVSTATIYRLCDKLELS GFADLKLKITSSLEEYLKSTENFDFNFPVKQFQTHYEIIHKLKEDYEQTLNSTANLFSLD QLKRVVSTMKKAQVIDVYTSAGNIYFAQNFQFQMQEIGVSINVPIDEYQQRLQAASSDNS HLAIIVSYSGRGIISDILFRILKERKTPIILISSYNYLIKEVKPDHHLYISSGENHYKKI SSFSTRLSILYIFDILYTCYFELNYEENLERKLDYYRLIDKTNIR >gi|223714212|gb|ACDT01000003.1| GENE 13 10650 - 11330 558 226 aa, chain - ## HITS:1 COG:no KEGG:Phep_1535 NR:ns ## KEGG: Phep_1535 # Name: not_defined # Def: LmbE family protein # Organism: P.heparinus # Pathway: not_defined # 35 152 3 121 213 63 31.0 4e-09 MKKLLIICLVLLTGCTNANAKVNKIEHQDIFDQVQLSDYQNVMIVAHPDDETIWGGMHLL KDKYLVVCLTNGDNEIRSKEFKEVMKKTQNTGLILNYPDKTNGKRDNWKSSYQKIEDDLN YLLSKQHFNLVVTHNPKGEYGHQHHKMTSAIVTSIAKDLAITNNLKYFGLYYRKTNPILP MKPTFDDDLTKQKLELADCYSSQKKVCHNLGHMFPYENWISYDEWH >gi|223714212|gb|ACDT01000003.1| GENE 14 11532 - 12863 1448 443 aa, chain + ## HITS:1 COG:BH0886 KEGG:ns NR:ns ## COG: BH0886 COG0534 # Protein_GI_number: 15613449 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 1 409 1 410 454 197 31.0 3e-50 
MNDELLNGSVFKSLMLFAIPFMLSNLLQVLYGAADLFVVGHFATTSDVSAVSIGSQTMAM LTNLILGFTTGITVLLGQFFGAKKEKELAATVGSSVVLFTMIAVTISIILFLTNHQIVGL MHTPNDAISATRSYLYICSLGTVFIVGYNAVSAILRGLGDSKTPLLFVAIACVINVIVDF ILVDGYQMGAAGAAIATVLAQAGAFIFSLIYLKYKGLGFKFNRQDISFNLPMIAKIIKVG LPIGLQSALVGISFLLITVIVNGMGLVASASVGVVEKLIEFLMLPAIALGNAVATMTAQN FGAEQYSRAYKSMRYGIAFCLSISLIVTVVCQFDGSIFTRIFSSDQAVIKNAALYLQTYS FDCLCTSFIFCYNGYLNGGGYTVFTMLHSLLATFILRIPLTLVISKLSGITLFHMGIASP IASMASIIMCVIYIRYLNKKLLF >gi|223714212|gb|ACDT01000003.1| GENE 15 13162 - 13416 129 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237735368|ref|ZP_04565849.1| ## NR: gi|237735368|ref|ZP_04565849.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 84 58 141 141 96 100.0 5e-19 MLVGGIIGVFNVLILNIVVSKVTPIKLTLITFISQLLSGMILDYYIYHVFTLNKLIGCII VIIGLIIYQSADIKVISNEEINSI >gi|223714212|gb|ACDT01000003.1| GENE 16 13617 - 14036 416 139 aa, chain - ## HITS:1 COG:no KEGG:CLL_A1558 NR:ns ## KEGG: CLL_A1558 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 1 137 1 137 140 78 43.0 9e-14 MYLLLSFLCGLIVSPMNIFNGQLSAHCGVYLATVIIHLIGLITFTLIMFLKKQEISFKHH LPFILYTGGVIGVLTVIFNIIAVNNIGAALLTALGLLGQMIISIILESKGWLGSLKRKLT PLKWLSLIIVTIGIGVMVK >gi|223714212|gb|ACDT01000003.1| GENE 17 14120 - 14779 573 219 aa, chain + ## HITS:1 COG:ECs3055 KEGG:ns NR:ns ## COG: ECs3055 COG0664 # Protein_GI_number: 15832309 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Escherichia coli O157:H7 # 27 216 20 208 219 91 30.0 1e-18 MKNREYYINLLKVEQMMSDDNYQYLELCEFSKAEHLLHQGQILEYLYILVSGRIKSCRTT ANGTTVLSAFSNPITVTGEVEFLNHHEVTNDVYALKNTVCFRISVAQYEDILLHDLIFMR YLARTLSNLLYHANHNTAISINYPVENRLASYLISSAQQLIIKDNFVQVAEMIGCSYRQL QRVLNDFCQCGYLCKVKRGNYLITDESALKALGQDLYYI >gi|223714212|gb|ACDT01000003.1| GENE 18 14803 - 15297 492 164 aa, chain - ## HITS:1 COG:PM0941 KEGG:ns NR:ns ## COG: PM0941 COG0602 # Protein_GI_number: 15602806 # 
Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Pasteurella multocida # 1 144 1 150 158 138 41.0 4e-33 MNYHNITKDDMLNGEGLRVVLWLSGCTHRCPGCHNPQTWDVTSGIPFDAEAKAEIFTELD KDYISGITFSGGDPLHPSSINEVGQLIKEIKERYPDKTIWLYTGFQFDDIKTIDFIKNID VIVDGRFEITQLDPQLHWKGSANQRVIAVGPTLAFNTIILHDDN >gi|223714212|gb|ACDT01000003.1| GENE 19 15297 - 17414 1972 705 aa, chain - ## HITS:1 COG:VCA0511 KEGG:ns NR:ns ## COG: VCA0511 COG1328 # Protein_GI_number: 15601271 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Vibrio cholerae # 3 702 5 702 706 288 29.0 4e-77 MYVIKKDNTKELFNIQKVVTAINKSATRCLYKFKEGEEAQICDFVEARVKDFNSNEITIS QMHNVVEGALDAINPKVAKSYRDYRNYKQDFVAMLDEVYKKSQSIMYIGDKENSNTDSAL VSTKRSLIFNELNKELYKKFFLTTEELQAIRDGYIYIHDMSARRDTMNCCLFDVKSVLEG GFEMGNLWYNEPKTLDVAFDVIGDITLSAASQQYGGFTVPSVDLLLEPYAIKTFDLFYEK YLSLGLDPETASKIAEGDVLNEFRQGFQGWEYKFNSVSSSRGDYPFITMTTGTGTGKFAK MASIEMLHVRRRGQGKKGNKKPVLFPKIVFLYDEELHGEGKELEELFEAAVLCSSKTMYP DWLSLSGEGYVASIYKKYGEIISPMGCRAFLSPWYRRGGMHPADENDTPVFVGRFNIGAI SLHLPLIYAKAKQESKPFFDVLDYYLNLIRKLHLRTYDYLGEMKASVNPLAYCEGGFYGG HLGLHDKIKPILKSATASFGITALNELEQLAHKKSLVEDGSFALKTMEHINQMVEKFKEE DGRLYAIYGTPAENLCGLQVQQFRKKHGVIENVSDRDYVSNSFHCHVSEEISPIEKQDKE KRFWDLCNGGKIQYVKYPIDYNMQAFKTLLKRAMRLGFYEGVNLSLSYCDDCDHEELNMD VCPKCGSKNLTKIDRMNGYLSYSRVKGDTRLNDAKMAEIADRKSM >gi|223714212|gb|ACDT01000003.1| GENE 20 17757 - 19916 2408 719 aa, chain + ## HITS:1 COG:ECs5215 KEGG:ns NR:ns ## COG: ECs5215 COG1328 # Protein_GI_number: 15834469 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Escherichia coli O157:H7 # 3 718 5 712 712 471 39.0 1e-132 MKVIKRDGQSVEFDASKIEKAILKAMQCGSGIVDSKCAHDIAVEIENWHQFSDNNISIYR IEGQVFEKLIEKGQILTAKAYEGYRKVREFQRETNTIDKEIYGIVNQTNKEAMEENSNKD ATVLATQRDLIAGEFSKDYCRRLLLPPKIVQAHDDGIIHFHDMDYYIQKMHNCDLVNLKD 
MFAKGTVINDKLIETPKSLQTACTVATQIVQQVANGQYGGQTISLAHLSPYVRLSYEKHQ RNVRNEGRLIGIDYSEEQIIKIAKSRLQDEIKAGIQTIQYQINTFSTTNGQAPFLSVFMY LREEPEYIEETAMLIEETLKQRFIGMKNPVGAYVTPAFPKLLYVLDDNNVPQDSKYRYLT DLAVKCVSKRMMPDFISAKIMRENYEGQVFPCMGCRSFLSPWKDENGEYKWYGRFNQGVV TLNLADAGLSANHKLDNFWKILDERLELCKEALMLRHESLKGTLADVSPIHWRYGALARL ESGEPIDKLLENGYSTISLGYIGLYECVVALIGQTHTSKEGNELATKIMERLRSACDSWK EETGLGFGLYGTPAESTTYTFARALKKRFGIVEGITDKDYLTNSYHVNVKEHIDAFDKLQ VEAKFQKVSSGGAISYVEVTNMSENLDALSALIDYMYNTIQYAEINTKSDYCQECGFDGE ILLDENREWYCPNCGNRNHKTLNVCRRTCGYLGDNFWNQGRTQEIAERFVHLDNHLYGE >gi|223714212|gb|ACDT01000003.1| GENE 21 19919 - 20446 417 175 aa, chain + ## HITS:1 COG:FN0312 KEGG:ns NR:ns ## COG: FN0312 COG0602 # Protein_GI_number: 19703657 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Fusobacterium nucleatum # 1 163 1 167 168 138 44.0 4e-33 MRYAQIRKTDIANGEGIRVSLYVQGCKRHCPNCFNPETWDFTGGNKFDLKTENILIKLVN QPHIVGITILGGEPLELENRQDVSILLKHLKEQCPNKTIWLYSSFLYEEIKDFDVEILSY LDVLIDGPFVEALKDRKLRFRGSSNQRLIDVKKSLANDEVILYQDSRYYKNDKEV >gi|223714212|gb|ACDT01000003.1| GENE 22 20610 - 21392 696 260 aa, chain + ## HITS:1 COG:BH3576 KEGG:ns NR:ns ## COG: BH3576 COG1737 # Protein_GI_number: 15616138 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 24 214 24 216 284 61 25.0 2e-09 MSNIRIQLLNVINRESSDSMNFIIAQYILENLDRRKVITTKELADNCNVSKSSISRFCRK IGYEDFMELQYAIATYNSFIAERFSHIDGVTTKDYVVNYFERSKEIINYLNQNMDTDTLE ELVKDIHDYKRVVLMGHVQSSFPAISLQYYLTILHKFIYSTQDPNEQREMLETLDDQNLI IIFSAGGRFLERVLDRLSVMDRENGPKIYMITANKLKHFPFVHKYIELSEEFSYSSSVIL EMYSNLISLIYHKKYTDNYY >gi|223714212|gb|ACDT01000003.1| GENE 23 21750 - 23219 1154 489 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 9 483 9 473 473 
467 50.0 1e-131 MNYPFKMYWGSASADFQYEGGFNEGKRGLITHDIVTDGSYTTPRRLTYRLPDGTLGSVAY RESFPEGAVGCIHDHVYYPSHNAVDFYHHYKEDIKLLAEMGLSMMRFGICWTRIYPTGEE KTPNEEGLQFYEDVIDECLKYNIEPMITICHDELPLYLANKYDGWSNRIMIDLYEKLCNA LFERFKGKVKYWLTFNELNVLQGFSHLGTRNSDAQTTWQAIHHLFIASARAKILAKKIMP KAMLGAMYATSPSYPKTCHPDDQLAWMKQRRRLFYFSDVMLRGYYPSFARSFWDEYKVTI RMEENDEEILKEGTLDFYSFSCYRSTTIGKDDKLGIIALPFGENPYLKSTPWGWPIDPVS IRYVLNEVYDRYQKPIFIVENGLGEVDKPVENNFVNDTYRINYLNDHFLEIKKAVEIDRV PVLGYTMWGGIDLVSLSTGEMKKRYGWVYVDMDDKGNGSKKRYPKASFYWMKEFIKSNGN ILKENNQGE >gi|223714212|gb|ACDT01000003.1| GENE 24 23222 - 24787 1394 521 aa, chain + ## HITS:1 COG:BS_bglP_2 KEGG:ns NR:ns ## COG: BS_bglP_2 COG1263 # Protein_GI_number: 16080978 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus subtilis # 92 440 3 351 364 222 37.0 1e-57 MTYTELGKQIIAYVGGKDNVKNIMHCMTRLRMNLKDNTKVEIEKLEALKDVLKVQFKNEQ LQVVIGPQVSEIYQTIEKDFNFSEADNLEEKEKQGIVKSFLNILSSVFVPVIPAIASAGM LKAIIALIKAFEIIPQDNGVFIVFNMMADVAFYFLPILLAASASKIFNTNRMISIVLAAT LIHPTFTTLVADTEASLSFFGLPVPLINYASSVVPVILSVWILSYIYRYVDKIMPNALKV IFTPTISLLIMVPLMLVVLGPLGNYVGVLISYCVGGLFTFNRFIGGFILSFIRPLLVITG MHQAFTPVIFQNLAERGGDFLLPTMMMSTMGQFGAVAAMIFKTRNKEKRTIRTSASISAV LGITEPALYTVLIHNRKALISACLGGALGGAFISMTGFELPAFASSSIVSLPIYLQVNVT NVIIAFLISIISSFVIAMLLVKTEKDEIVETNGIISPIAGKLISLSDVNDETFSKEVMGK GFAIIPAEGKVIAPFNGTVNAVFPTNHAIGIRSRAFNSCRY >gi|223714212|gb|ACDT01000003.1| GENE 25 24741 - 25007 195 88 aa, chain + ## HITS:1 COG:bglF_3 KEGG:ns NR:ns ## COG: bglF_3 COG2190 # Protein_GI_number: 16131590 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Escherichia coli K12 # 1 66 85 151 174 68 52.0 4e-12 MQSESGVELLIHVGIDTVELGGKYFEAMVKQGAKVNKGDVLLAFDLNEISKHYDSTTSVI FLNKPDARLNSDINKDVMLGQTLNIDFN >gi|223714212|gb|ACDT01000003.1| GENE 26 25499 - 26113 677 204 aa, chain + ## HITS:1 COG:CAC0798 KEGG:ns NR:ns ## COG: CAC0798 COG1183 # 
Protein_GI_number: 15894085 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Clostridium acetobutylicum # 2 198 3 195 205 113 36.0 2e-25 MIGYYSYTVILTYLSLIFAMAGIHLSFNGMYQWAFICLIMCGICDTFDGMVARSKKNRTD EEKRFGIQIDSLCDLVAFGVFPAILGYNVGLSSIGWLAIEIFYVLAAVIRLAYFNVKEET RQKETTEKRKYYQGLPVTTSSFILPMAYALRYVIFQLDYLYGALMLITGILFIVDFKVPK VQSKGLAALGVLVIVELIQIVVTK >gi|223714212|gb|ACDT01000003.1| GENE 27 26110 - 26970 936 286 aa, chain + ## HITS:1 COG:CAC0799 KEGG:ns NR:ns ## COG: CAC0799 COG0688 # Protein_GI_number: 15894086 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Clostridium acetobutylicum # 17 284 20 288 291 264 51.0 2e-70 MIVVDRLGNVQTNGEGQNNLLKKLYGTFLGRCALKILVCKFVSDLGGWYMNSSLSKRRIA PFIKENKIDMSQYEQREFKSYNDFFTRKIVDGKRPFLADDNVLISPADSKLSCYKIDQDS RFMIKDTRYSLGELLEDDELAKEYMNGYWMIFRLTVDDYHRYSFIDDGKIIGNKYIKGRF HTVNPIANDYYPIYKQNSRSYTIIESKNFGKMIQMEVGAMMVGRIVNHDKKQCFKGEEKG YFEFGGSTVIILLKENQVVIDNDIIENSMNDKETVVKLGETIGKKY >gi|223714212|gb|ACDT01000003.1| GENE 28 26987 - 27757 850 256 aa, chain + ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 1 250 1 249 253 299 62.0 4e-81 MTKIIAITNQKGGVGKTTTSINLAAALANAKNRVLLVDMDPQANATQGIGIDRDHIELST YNIIVEECNINDVIVPSYIAKLDVAPGSIDLAGADLELANVKKGREQRLKKALDKIKDRY DYIIIDCPPALGLLNTNALTACNSVLIPVQCEYYALEGLTQLLNTVLLTQSVFNPQLTIE GVLLTMLDQRTNLGVEVSQEVRKYFKEKVYKTAIPRNIKLSEAPSEGLAIFDYDNNSEGA RAYRDFAKEVCKRNAK >gi|223714212|gb|ACDT01000003.1| GENE 29 27747 - 28724 1229 325 aa, chain + ## HITS:1 COG:BH4057 KEGG:ns NR:ns ## COG: BH4057 COG1475 # Protein_GI_number: 15616619 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 41 319 3 279 288 205 46.0 1e-52 
MPNNTTKSAPAKMAAKKTATKKTVTTTKKPAARKPATTKNKKLGKGLDAIFGGDISTLID DIEKNTPESKQITVSLEEIRPNPYQPRKLFDEEKLQELAISIKEHGVFQPVILKKSIQGY EIVAGERRCRAAKIARLVEIPAIIVDFTDQQMMEIALLENIQRENLNSIEEAKAYQMMME RLNLKQDELAKRIGKSRSYIANTLRLLQLPEMIQNYVLEGKITMGHARCLITLPQEKAES LAARCIEEGLSVRDVENIVKGIELGNSRKDRPKVEKPKEYVYVEGLLRKKFRTKIKVDEK AVTIKYTDTKDLNRILELMGVIEES >gi|223714212|gb|ACDT01000003.1| GENE 30 28888 - 29661 692 257 aa, chain + ## HITS:1 COG:CAC2657 KEGG:ns NR:ns ## COG: CAC2657 COG1145 # Protein_GI_number: 15895915 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 233 1 248 249 67 25.0 3e-11 MKTGIYYFSATGNSLTTAKLLAASLDGQCDVISLAALHNKQDIEVDYERVGFVFPIYYGD MPYLIRDTIRKMKFKQNTYIFIFTTYRGHPGDVAKRFDNLLQEKNLSLALSKGIPMPGNS YLSTVEQIKDTLANQKTNIKKLVKSIIEQDKIDYSLLPEVENSAVYKACNMRGIKADEKC IGCQTCIKVCPMNNIELIEGKIKIKDNCMTCLACFHWCPTAAIYMSKEKEIERREKYHHP DVRLTDIIKQKYNEFVE >gi|223714212|gb|ACDT01000003.1| GENE 31 29838 - 30530 192 230 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 222 1 238 245 78 25 6e-14 MIEVNGLTKIYDGFRAVDNVDLVAKGGKVTILLGPNGAGKSTTIKSIANLLKFDGEIKIC GYPNDSIEAKRCFGYVPETPILYDLLTIDEHIDFIGNAYRVDNYHEIANKYLELFKLTGK RKSMAKELSKGMTQKLSMLLALLIQPQALLVDEPMVGLDPASIEDVLKIFTLLKEEGCAV FVSTHIIDIIKDIYDEAYIMNKGKIIKHVLRDELEDESLKQYFFELTDGE >gi|223714212|gb|ACDT01000003.1| GENE 32 30532 - 32091 1427 519 aa, chain + ## HITS:1 COG:no KEGG:Ccel_1181 NR:ns ## KEGG: Ccel_1181 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 16 516 16 516 518 129 24.0 3e-28 MKPLLKLWLLKIKGTIRNLFKRKASGVFVIIMILFYGAMIVSLFNVDAGQIMAVNNIDLH MGILLLIGFQAIMLFATLMQSKKALFTGEDAFYLFTGPFTRRQVMSYLTFQTIIQAFLLA LISLVFLAAFSGGAGFNFIFIVLAYLASVITVLFFLLLTDYLYVLSIGDKKYRKYSKIIP GIIIAFVVIIVLVLYLQTGNYHTLFMDFVQSNLFYVVPIFGWMKLALIAYVEHNYLLVTL GYLLLCGAVILVYVLFIGYRGNFYEQALQDSLDLSKRMKAAKAGDQEALRNKKVKLGIKG EFRQGAYAVMSKNILLMRKTNSFISVSDLISIGIYIAVTIAVDVGFGMFIYMMVIWIFSS 
LQNSDLSKELKNYQIYLIPDKPFSKLIAVIIPTFIKIFVVAAVSFIAMGLYYHQSLMMII VYLLNVIGYTSIFISGSVLSIRLLKSRTSPMMENFMRMIVMLIGAIPSAVITTAILLNSG TTVAMMAASYVALFVNFAISFLILYGCRNMMNGRELKSE >gi|223714212|gb|ACDT01000003.1| GENE 33 32339 - 33691 1153 450 aa, chain + ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 3 407 7 411 426 244 37.0 2e-64 MEKKRIDILNGSIWDKILAFALPLAATSVLQQLFNSADVAVVGHFSGSNALAAVGSTSPI TNLFITIFVGLSIGANVVISRFLGAQNEKDTSKAVHTSILLSIISGIVIALIGEVIAVWL LKIMSTPKEVLDQAALFLRITFIGMIFLTIYNFEAAILRAGGDTKRPLYCLLISGVVNVT LGLFFVVVCKLDVAGVALATLIADATSALLLFYILTKEQGPLKLSLEKLKIDKTITKDIL FTGIPAAIQGMLFNVSNIIIQSGINSLGADVVAASTVGLNFEIYVYYLITGFSQASITFN SQNYGAGNYKRCIKSTRGCMILGTIFTVSLSMIFIVFDRFFAGIFTSNPKIVELATIRMT YILIFEVLNMTIEIMSGSLRGLGSPMISTLLCVIFTCGVRLGYMFLVFPHFNTYNMLLII YPISWALTASSIIIAYFVTKKKKFNNLVGV >gi|223714212|gb|ACDT01000003.1| GENE 34 33828 - 34256 464 142 aa, chain + ## HITS:1 COG:YPO2534 KEGG:ns NR:ns ## COG: YPO2534 COG4702 # Protein_GI_number: 16122752 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Yersinia pestis # 10 135 22 152 171 74 35.0 7e-14 MEKLIAFNKFDNEDAFLLGCNLVEKVKKENLKNIRIRIVLNQDIVFQYLMNGKKGDQWLN RKQNTVELFKLPTYQIWQENERSHCYQQYTNDERYVICGGAYPIIVGKGMIGSVIVSGLA HNEDHQIIVDVLSKYHDEKNKI >gi|223714212|gb|ACDT01000003.1| GENE 35 34292 - 35323 953 343 aa, chain + ## HITS:1 COG:lin0375 KEGG:ns NR:ns ## COG: lin0375 COG0673 # Protein_GI_number: 16799452 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 6 343 2 338 338 449 64.0 1e-126 MINNKLVLAYIGNGKSANRYHLPFVLQRSDKFIVKKIFDIRIRHDLWKTIDGVEYVEDVN KILNDPKIDMVVICTAHHLHYDYAKMVLNAGKHCLVEKPFMENSAQAKEIFALADEKGLY CSAYQNRRYDSDFLTVQKVIESGKLGDLLELEMHFDYYRPEVPEKINHFDPAMSYLYGHG CHTLDQVISYFGKPDTINYDVRQLLGEGRFNDYFDLDMYYGTLKVSVKSSYFRVKERPSF 
VVYGKKGMFIKQSKDRQEEHLKLFYMPTNKDFGVDTPEHYGTLIYYDQDGFYHEEKVVSV IGDYARVYDGIYDCIVKHQPQIITHEQTLLQMEILETGISKLK >gi|223714212|gb|ACDT01000003.1| GENE 36 35640 - 36944 1518 434 aa, chain + ## HITS:1 COG:lin0033 KEGG:ns NR:ns ## COG: lin0033 COG1455 # Protein_GI_number: 16799112 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 16 420 23 425 452 219 33.0 8e-57 MDKFQDILMKVGVFAAENRYLSSIKNAFQTFVPFTIIGAIGVLWSNVICNDTTGLGALVP AVMNLSFLNPAFNALNFATIGCISVAITFLVGGEIGTSRKSSPMFCGLLAVVSLLTVTQT SLDIKAGGELIQTVSGIFTSSLGSQGLFTGMIVAIVAVELFCGLFKLDKLKIKLPDQVPP QIAKSFEYLVPAFIEILIISLVGLGVNAVSGVYINDVIFNVIQKPLLYIGGSLPGVLTFM FISLVFWSIGLHGDNMIGGVFNPILTTLAVENLDAIKAGLEPSNIVNNTFHRAFFATGGT GCMLGLTIAMLIVCKRPENKSIARIALVPELFNIGEVSMFGVPIVMNPTLIIPFILAPLV TVIFGYVLTMLHICPIMYVDLPWTMPPLLIAFLGSGGNFMAPICQLAGIILSALIYLPFV KLYEKQQAQIEQTA >gi|223714212|gb|ACDT01000003.1| GENE 37 37055 - 37495 531 146 aa, chain + ## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 3 138 8 153 179 73 32.0 1e-13 MSKKLVAYFSASGVTKKVALRLAQEELAELFEICPKIEYTSADLNWMDDNSRSSLEMKAQ SCRPEILNKLDNLDQYEIIYLGFPIWWYVEPRIIDTFLESYNFSKKQIVPFATSGGSGIS NVVENLKAHYPELNFADGRLLNNYID >gi|223714212|gb|ACDT01000003.1| GENE 38 37608 - 39086 1720 492 aa, chain + ## HITS:1 COG:lin0288 KEGG:ns NR:ns ## COG: lin0288 COG2723 # Protein_GI_number: 16799365 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 1 489 4 485 486 527 55.0 1e-149 MTKQFPEGFLWGGATAANQFEGGWKEGGKGISVSDVALFTDPKSLKELTDVHGLCDITDE MIDQALATDDEVYYPKRHASDFYHHWKEDIKLLGEMGFKVYRLSIAWSRIFPNGDELVPN EAGLKFYDDIFDECAKYGIEPLVTMSHYEPPLEFARKYNGWYDRRAIDFFVRYVDVITKR YKNKVKYWLTFNEIDSIIRHPFMTGGLIESRFKPEEFEEVCFQAMHHQFVASALATKVTH 
DNIPDAKVGCMLTKLTYYPYTCKPEDVLEAQQRMRSIYCFSDTQVHGEYPAYLLSMYKNK GFNINMTEEDLRIMKEYPVDFISFSYYSSSCVAKDDTGLNKTAGNTVTAIKNPHIPSSDW GWQIDPIGLRVSLVDLYDRYRKPLFIVENGLGAKDELIDGKVHDDYRIDYLKQHCQAMYE AIHEDGVELMGYTTWGCIDLVSNSTNQMSKRYGFVYVDVDDYGNGSYKRYKKDSFDWYKQ VIATNGASILED Prediction of potential genes in microbial genomes Time: Thu May 26 09:04:01 2011 Seq name: gi|223714211|gb|ACDT01000004.1| Coprobacillus sp. D7 cont1.4, whole genome shotgun sequence Length of sequence - 4518 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 87 - 146 7.8 1 1 Op 1 . + CDS 323 - 1630 1438 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 2 1 Op 2 . + CDS 1646 - 1921 232 ## gi|167754878|ref|ZP_02427005.1| hypothetical protein CLORAM_00382 + Term 1923 - 1955 2.0 + Prom 1953 - 2012 6.9 3 2 Op 1 1/0.000 + CDS 2037 - 2888 934 ## COG1737 Transcriptional regulators + Prom 2891 - 2950 4.7 4 2 Op 2 . 
+ CDS 3014 - 4483 1546 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase Predicted protein(s) >gi|223714211|gb|ACDT01000004.1| GENE 1 323 - 1630 1438 435 aa, chain + ## HITS:1 COG:lin0033 KEGG:ns NR:ns ## COG: lin0033 COG1455 # Protein_GI_number: 16799112 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 2 432 6 430 452 244 36.0 2e-64 MSMTDRLTEGLLKVASKISNQRHMTAIKNAFTALLPIIITGAFCTLFSNVVCSTTTTGIS LAKIDGFAWLEELTPLFTAANYATLNFFTIGTVVLIAIEMAKQLGHKETVAPVVAISAYV SLCDSFTKVAVEGVTDLVEVVNVLPREFTNAQGLFLGMIVAIVSMEIYCRLADSGKLAIK MPESVPSNVSSAFNALFPAVLTILIISAFGLLFTKVTGNSIYNMISTWIQAPLRGILTGL PGYLLIFFLSTCFWVIGIHGTQVLKPVYQATMLEAVIANTDAAANGQTPQFILNETFISC FTTMGGAGCTIGLLFAMLLASKRSDHRTIAKLSLAPGLFNINETMTFGLPIVLNPIFMIP FILTPVITATFAYFMTVIGFCGKMIYAVPWTTPPLLIAWLGSGGSIGAVITQALCVVLSF IVYLPFVFAANKQEN >gi|223714211|gb|ACDT01000004.1| GENE 2 1646 - 1921 232 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754878|ref|ZP_02427005.1| ## NR: gi|167754878|ref|ZP_02427005.1| hypothetical protein CLORAM_00382 [Clostridium ramosum DSM 1402] # 1 91 1 91 91 168 100.0 9e-41 MATKVIKQNNRGLTLRQQNILRMKEELNKPDEKALHPFTKYKIITYFLVILFPPIAMYRV WKKDSTFDITEKIGQTLTCVLYVCYLIQLIF >gi|223714211|gb|ACDT01000004.1| GENE 3 2037 - 2888 934 283 aa, chain + ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 2 265 3 266 283 96 27.0 5e-20 MIIDKMNTLENLTNQEKAVVDYIIANPKALLEMSVNELANASYTSASTITRLCKKLGTKG YADMKFIYVSEYSEMMKLKDSLKVVPFSRNSGIDDIIHTMPLIYSRAIDHTRSMLDRNTV IRVSNLMKRAQVIDLYGDGINFEIARNICYKLDEIGISANAYNSIQWNHCKRLQQDKIPS FSILLSHTGKNPSMVDAAKRLKQYNIPSLSITGNVDKRLLNLTDYNFQIMITENTFEFST VIFTMSSLYILDILVASLIVHNYDKIERHLEELIGQRYDWQND >gi|223714211|gb|ACDT01000004.1| GENE 4 3014 - 4483 1546 489 aa, chain + ## HITS:1 COG:lin0288 KEGG:ns NR:ns ## COG: lin0288 COG2723 # 
Protein_GI_number: 16799365 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 2 466 6 465 486 526 56.0 1e-149 MRKFPEGFLWGGATAANQVEGGWKEGGKGISVSDCARHHLDVDVENYKAHNHITSKDIEE ALASDDDSLYPKRHGSQFYYHYKEDIKMFAEMGFKVYRMSIAWSRIFPNGDDEQPNEAGL KFYDDVFDECAKYGIEPLVTMSHYEPPLNLVLKYNGWYNRKLVGFFETFVKTICERYKNK VKYWLTFNEVDSMIRHPYTTGGLIEDRFPGIKWNQVIFQAMHHQFVASALATKICHEIIS DSMVGCMLTKLTYYPYSCKPEDVLQAQQDMRGTYCYSDTQVFGEYPAYLLAKFKNEGVEI VKEPGDDEVMKKYPVDFVSFSYYMSSCSAASSEGLDTAVGNTVTAVKNPYLPSSEWGWQI DPIGLRISMVDLYDRYRKPLFIVENGLGAKDVVLPDGTIDDQYRIDYFDCHFKEMLNAIE IDGVECLGYTSWGCIDIVSESTKQMSKRYGFIYVDADDYGKGTYKRIQTLSVPTQLSTTL LLILKLYWK Prediction of potential genes in microbial genomes Time: Thu May 26 09:04:35 2011 Seq name: gi|223714210|gb|ACDT01000005.1| Coprobacillus sp. D7 cont1.5, whole genome shotgun sequence Length of sequence - 102216 bp Number of predicted genes - 100, with homology - 100 Number of transcription units - 55, operones - 19 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 87 - 146 5.5 1 1 Op 1 . + CDS 171 - 989 790 ## COG0789 Predicted transcriptional regulators 2 1 Op 2 2/0.000 + CDS 1010 - 1981 875 ## COG1680 Beta-lactamase class C and other penicillin binding proteins + Term 1990 - 2029 7.9 3 1 Op 3 . + CDS 2036 - 2449 409 ## COG0178 Excinuclease ATPase subunit 4 1 Op 4 . + CDS 2458 - 2646 189 ## COG0178 Excinuclease ATPase subunit 5 1 Op 5 . + CDS 2687 - 2860 182 ## COG0178 Excinuclease ATPase subunit + Term 2900 - 2958 9.0 + Prom 2915 - 2974 8.8 6 2 Op 1 . + CDS 3107 - 4816 1918 ## COG2199 FOG: GGDEF domain 7 2 Op 2 . 
+ CDS 4856 - 5422 415 ## gi|167754869|ref|ZP_02426996.1| hypothetical protein CLORAM_00373
8 2 Op 3 9/0.000 + CDS 5464 - 7425 693 ## PROTEIN SUPPORTED gi|194246575|ref|YP_002004214.1| 50S ribosomal protein L9
9 2 Op 4 16/0.000 + CDS 7425 - 7871 732 ## PROTEIN SUPPORTED gi|167754867|ref|ZP_02426994.1| hypothetical protein CLORAM_00371
10 2 Op 5 . + CDS 7881 - 9254 1423 ## COG0305 Replicative DNA helicase
11 2 Op 6 3/0.000 + CDS 9266 - 9856 752 ## COG1435 Thymidine kinase
+ Term 9911 - 9960 -0.4
+ Prom 10078 - 10137 7.9
12 2 Op 7 32/0.000 + CDS 10220 - 11296 1300 ## COG0216 Protein chain release factor A
13 2 Op 8 . + CDS 11296 - 12153 210 ## PROTEIN SUPPORTED gi|241760258|ref|ZP_04758353.1| protein-(glutamine-N5) methyltransferase, ribosomal protein L3-specific
+ Term 12154 - 12180 -1.0
14 2 Op 9 . + CDS 12201 - 12950 577 ## gi|167754862|ref|ZP_02426989.1| hypothetical protein CLORAM_00366
+ Term 12952 - 12997 4.1
+ Prom 12952 - 13011 8.5
15 3 Tu 1 . + CDS 13038 - 14462 1795 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase
+ Term 14550 - 14591 2.5
- Term 14458 - 14491 2.4
16 4 Op 1 . - CDS 14495 - 14941 230 ## Ccel_3010 hypothetical protein
17 4 Op 2 . - CDS 14951 - 15451 586 ## Ccel_3009 transcriptional regulator, CarD family
- Prom 15523 - 15582 12.5
+ Prom 15544 - 15603 8.9
18 5 Tu 1 . + CDS 15709 - 17259 1918 ## COG4099 Predicted peptidase
+ Term 17265 - 17303 2.2
+ Prom 17276 - 17335 7.2
19 6 Op 1 . + CDS 17389 - 17763 399 ## COG2832 Uncharacterized protein conserved in bacteria
20 6 Op 2 8/0.000 + CDS 17756 - 19492 1574 ## COG4988 ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components
21 6 Op 3 . + CDS 19461 - 21086 204 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc)
22 6 Op 4 . + CDS 21123 - 21416 271 ## PC1_1895 hypothetical protein
23 6 Op 5 . + CDS 21481 - 21975 467 ## COG1670 Acetyltransferases, including N-acetylases of ribosomal proteins
+ Term 21976 - 22009 1.1
+ Prom 21978 - 22037 15.9
24 7 Tu 1 . + CDS 22104 - 24296 1317 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit
+ Term 24382 - 24427 7.2
+ Prom 24881 - 24940 9.5
25 8 Tu 1 . + CDS 24976 - 26286 1351 ## COG0513 Superfamily II DNA and RNA helicases
+ Prom 26292 - 26351 7.1
26 9 Op 1 11/0.000 + CDS 26387 - 28186 1706 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase
27 9 Op 2 1/0.000 + CDS 28196 - 30160 1369 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9
+ Prom 30215 - 30274 6.6
28 9 Op 3 . + CDS 30300 - 31169 966 ## COG1281 Disulfide bond chaperones of the HSP33 family
+ Prom 31185 - 31244 7.4
29 10 Tu 1 . + CDS 31303 - 31767 685 ## gi|167754847|ref|ZP_02426974.1| hypothetical protein CLORAM_00351
+ Term 31771 - 31804 1.0
+ Prom 31865 - 31924 4.0
30 11 Op 1 . + CDS 31946 - 32956 604 ## PROTEIN SUPPORTED gi|145640649|ref|ZP_01796232.1| ribosomal protein L11 methyltransferase
31 11 Op 2 . + CDS 33018 - 33866 818 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase
32 11 Op 3 . + CDS 33868 - 34047 229 ## COG4481 Uncharacterized protein conserved in bacteria
+ Term 34099 - 34130 3.1
+ Prom 34102 - 34161 5.5
33 12 Tu 1 . + CDS 34212 - 35477 1341 ## COG3538 Uncharacterized conserved protein
+ Term 35505 - 35541 0.8
+ Prom 35545 - 35604 9.0
34 13 Op 1 . + CDS 35624 - 36964 1556 ## COG1455 Phosphotransferase system cellobiose-specific component IIC
35 13 Op 2 . + CDS 36942 - 39602 2602 ## COG0383 Alpha-mannosidase
36 13 Op 3 . + CDS 39643 - 40404 616 ## COG1737 Transcriptional regulators
37 13 Op 4 . + CDS 40422 - 42623 2077 ## COG1472 Beta-glucosidase-related glycosidases
+ Prom 42629 - 42688 9.8
38 14 Tu 1 . + CDS 42792 - 44321 1250 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily
+ Term 44325 - 44358 4.0
+ Prom 44352 - 44411 8.7
39 15 Op 1 . + CDS 44461 - 45780 912 ## gi|167754837|ref|ZP_02426964.1| hypothetical protein CLORAM_00341
40 15 Op 2 . + CDS 45768 - 46517 283 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc)
- Term 46513 - 46545 3.2
41 16 Tu 1 . - CDS 46554 - 47045 441 ## gi|167754835|ref|ZP_02426962.1| hypothetical protein CLORAM_00339
- Prom 47112 - 47171 6.5
42 17 Op 1 . - CDS 47177 - 48043 802 ## COG1876 D-alanyl-D-alanine carboxypeptidase
43 17 Op 2 . - CDS 48048 - 49238 1090 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation)
- Prom 49278 - 49337 10.0
+ Prom 49237 - 49296 9.6
44 18 Tu 1 . + CDS 49330 - 50067 812 ## COG3884 Acyl-ACP thioesterase
+ Term 50193 - 50237 4.4
45 19 Tu 1 . - CDS 50125 - 50433 229 ## gi|237735437|ref|ZP_04565918.1| predicted protein
- Prom 50481 - 50540 5.7
+ Prom 50728 - 50787 6.1
46 20 Tu 1 . + CDS 50807 - 51370 710 ## gi|167754830|ref|ZP_02426957.1| hypothetical protein CLORAM_00334
+ Term 51447 - 51495 3.1
+ Prom 51373 - 51432 4.2
47 21 Tu 1 . + CDS 51550 - 51774 109 ## COG3326 Predicted membrane protein
+ Term 51842 - 51869 0.1
48 22 Tu 1 . - CDS 51771 - 52586 885 ## COG0500 SAM-dependent methyltransferases
- Prom 52713 - 52772 7.7
49 23 Op 1 1/0.000 + CDS 52659 - 53702 870 ## COG1408 Predicted phosphohydrolases
+ Prom 53778 - 53837 9.1
50 23 Op 2 . + CDS 53869 - 54933 895 ## COG1408 Predicted phosphohydrolases
+ Term 55031 - 55084 1.7
51 24 Tu 1 . - CDS 55038 - 55664 452 ## COG3467 Predicted flavin-nucleotide-binding protein
- Prom 55687 - 55746 9.2
+ Prom 55639 - 55698 8.3
52 25 Op 1 . + CDS 55743 - 56303 654 ## COG1247 Sortase and related acyltransferases
53 25 Op 2 . + CDS 56357 - 56632 385 ## gi|167754823|ref|ZP_02426950.1| hypothetical protein CLORAM_00327
54 25 Op 3 . + CDS 56634 - 57374 850 ## COG0219 Predicted rRNA methylase (SpoU class)
55 25 Op 4 . + CDS 57402 - 58586 1146 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities
+ Term 58588 - 58618 1.1
+ Prom 58949 - 59008 7.1
56 26 Tu 1 . + CDS 59031 - 59897 1222 ## COG0191 Fructose/tagatose bisphosphate aldolase
+ Term 59917 - 59950 4.0
57 27 Tu 1 . + CDS 60405 - 61673 1597 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase
+ Prom 61676 - 61735 7.2
58 28 Tu 1 . + CDS 61780 - 61983 370 ## PROTEIN SUPPORTED gi|167754815|ref|ZP_02426942.1| hypothetical protein CLORAM_00319
+ Term 61997 - 62023 -1.0
- Term 62015 - 62049 0.4
59 29 Op 1 . - CDS 62050 - 63372 1183 ## COG0534 Na+-driven multidrug efflux pump
60 29 Op 2 . - CDS 63390 - 63974 785 ## BDI_1616 hypothetical protein
- Prom 64083 - 64142 7.1
+ Prom 63957 - 64016 7.3
61 30 Op 1 . + CDS 64117 - 64866 808 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I
62 30 Op 2 . + CDS 64835 - 65335 482 ## COG1576 Uncharacterized conserved protein
+ Term 65358 - 65407 4.8
63 31 Tu 1 . - CDS 65433 - 67565 1469 ## COG2199 FOG: GGDEF domain
- Prom 67704 - 67763 13.0
+ Prom 68009 - 68068 8.6
64 32 Tu 1 . + CDS 68317 - 68505 126 ## Ccur_10260 cupin domain-containing protein
+ Term 68537 - 68573 4.2
+ Prom 68679 - 68738 9.3
65 33 Op 1 . + CDS 68762 - 70072 1067 ## COG1075 Predicted acetyltransferases and hydrolases with the alpha/beta hydrolase fold
66 33 Op 2 . + CDS 70144 - 70422 355 ## gi|167754807|ref|ZP_02426934.1| hypothetical protein CLORAM_00311
67 33 Op 3 . + CDS 70424 - 70906 527 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family
68 33 Op 4 . + CDS 70911 - 71165 366 ## HMPREF0868_0924 glutaredoxin-like protein
+ Term 71187 - 71242 -0.4
69 34 Tu 1 . - CDS 71263 - 72453 862 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs
70 35 Tu 1 . - CDS 72809 - 73075 181 ## gi|237735462|ref|ZP_04565943.1| predicted protein
- Prom 73102 - 73161 4.4
+ Prom 73102 - 73161 9.6
71 36 Op 1 . + CDS 73407 - 74021 540 ## COG4832 Uncharacterized conserved protein
72 36 Op 2 . + CDS 74018 - 74569 716 ## COG0622 Predicted phosphoesterase
+ Term 74577 - 74612 -0.8
+ Prom 74618 - 74677 11.0
73 37 Tu 1 . + CDS 74711 - 75013 534 ## COG1440 Phosphotransferase system cellobiose-specific component IIB
+ Term 75021 - 75058 -0.2
+ Prom 75059 - 75118 6.9
74 38 Op 1 1/0.000 + CDS 75162 - 75878 685 ## COG2188 Transcriptional regulators
+ Term 75881 - 75910 -0.2
+ Prom 76009 - 76068 8.2
75 38 Op 2 . + CDS 76099 - 77457 1676 ## COG1455 Phosphotransferase system cellobiose-specific component IIC
+ Term 77481 - 77516 7.1
+ Prom 77531 - 77590 9.6
76 39 Tu 1 . + CDS 77613 - 78119 528 ## gi|167757465|ref|ZP_02429592.1| hypothetical protein CLORAM_03015
+ Term 78281 - 78325 3.2
+ Prom 78223 - 78282 9.9
77 40 Tu 1 . + CDS 78345 - 80423 2452 ## COG3968 Uncharacterized protein related to glutamine synthetase
+ Term 80431 - 80470 6.1
+ Prom 80451 - 80510 5.9
78 41 Tu 1 . + CDS 80699 - 80884 133 ## gi|167757461|ref|ZP_02429588.1| hypothetical protein CLORAM_03011
+ Prom 80886 - 80945 7.8
79 42 Tu 1 . + CDS 81032 - 81589 632 ## gi|237735472|ref|ZP_04565953.1| predicted protein
+ Term 81657 - 81701 5.3
+ Prom 81651 - 81710 12.0
80 43 Tu 1 . + CDS 81753 - 82619 759 ## COG1737 Transcriptional regulators
+ Prom 82664 - 82723 3.2
81 44 Op 1 17/0.000 + CDS 82744 - 83493 350 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
82 44 Op 2 21/0.000 + CDS 83498 - 85129 1222 ## COG1178 ABC-type Fe3+ transport system, permease component
83 44 Op 3 . + CDS 85126 - 86106 1207 ## COG1840 ABC-type Fe3+ transport system, periplasmic component
84 44 Op 4 2/0.000 + CDS 86116 - 87210 1384 ## COG0075 Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase
85 44 Op 5 . + CDS 87210 - 87995 995 ## COG0637 Predicted phosphatase/phosphohexomutase
+ Prom 88019 - 88078 8.6
86 45 Tu 1 . + CDS 88107 - 88748 712 ## COG2364 Predicted membrane protein
+ Term 88754 - 88793 -0.7
87 46 Tu 1 . - CDS 88858 - 89286 484 ## DSY1202 hypothetical protein
- Prom 89353 - 89412 9.5
+ Prom 89307 - 89366 7.9
88 47 Tu 1 . + CDS 89386 - 89685 487 ## gi|167757451|ref|ZP_02429578.1| hypothetical protein CLORAM_03001
89 48 Tu 1 . - CDS 89706 - 90131 340 ## COG5652 Predicted integral membrane protein
+ Prom 90170 - 90229 10.1
90 49 Tu 1 . + CDS 90456 - 91826 1360 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21
91 50 Tu 1 . - CDS 92047 - 93576 240 ## SPH_0571 transcriptional regulator, putative
- Prom 93691 - 93750 6.4
+ Prom 93767 - 93826 6.2
92 51 Tu 1 . + CDS 93853 - 94062 184 ## gi|167757447|ref|ZP_02429574.1| hypothetical protein CLORAM_02997
+ Prom 94938 - 94997 10.9
93 52 Op 1 24/0.000 + CDS 95106 - 96347 1485 ## COG0004 Ammonia permease
94 52 Op 2 . + CDS 96361 - 96705 470 ## COG0347 Nitrogen regulatory protein PII
+ Term 96708 - 96743 1.1
95 53 Tu 1 . - CDS 96745 - 98418 1403 ## COG1283 Na+/phosphate symporter
- Prom 98447 - 98506 9.9
+ Prom 98441 - 98500 8.3
96 54 Tu 1 . + CDS 98560 - 98862 391 ## gi|167757442|ref|ZP_02429569.1| hypothetical protein CLORAM_02992
+ Term 98863 - 98893 1.3
+ Prom 98873 - 98932 6.6
97 55 Op 1 . + CDS 98985 - 99287 234 ## gi|167757441|ref|ZP_02429568.1| hypothetical protein CLORAM_02991
98 55 Op 2 40/0.000 + CDS 99293 - 99946 949 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
99 55 Op 3 . + CDS 99943 - 100995 868 ## COG0642 Signal transduction histidine kinase
100 55 Op 4 .
+ CDS 101001 - 101519 795 ## Lebu_1349 propeptide PepSY amd peptidase M4 + Term 101522 - 101553 0.2 Predicted protein(s) >gi|223714210|gb|ACDT01000005.1| GENE 1 171 - 989 790 272 aa, chain + ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 4 87 7 90 117 72 48.0 1e-12 MLSIGEFSKICQVSTKTLRYYAEIGLILPDEINLENGYRYYSIDQLETMLFINRLKSYNF SLEEILETEEAQDEKLCKALMKKKKEIDLKIQNYQNSLAQLHEDITTLQQGKSIMSYLDK IVVQLVEIPPMNLLNIRKMIQEDEFAEAYSNSFGQLFREIMTDKLTVTAPPMVLFHSSEF TLQGLDTEFAIPVQETNNRTREFVPGTCLKTVVHGSYSNLSSIYIKQCEWAEKIGYKNDA PLYEVYVNDPSQVTSDKELITEVYYPVSKLKV >gi|223714210|gb|ACDT01000005.1| GENE 2 1010 - 1981 875 323 aa, chain + ## HITS:1 COG:lin1811 KEGG:ns NR:ns ## COG: lin1811 COG1680 # Protein_GI_number: 16800879 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Listeria innocua # 4 320 10 323 323 234 40.0 1e-61 MNQEKIMQLEEIINTKYDNITGIIGLKNGVTKYENYFNGGNVNSTVHVASVTKSIISILI GIALDKGYIEKIDQKIIDFFPNYVIKNNNLAIQKITIRDMMTMTVPYKYQEDSYIKYFTS KDWVTFSLNLLGDCGEIGKFQYAPLIGPDILSGILVNTTGRSVFDFATEYLFTPLGIEGK DKITFSSEEEQVAFLKAKDINGWVIDSCGINAAGWGLTLSTRAMAKIGQLYLNQGRWNKQ QIVSKQWVIDSIKKQSYNDELNLAYGYLWWLHQDGFMAMGDGGNVIYVNIIKKIVISITA TFKPDVSDRIEFIKEYLEPLFDE >gi|223714210|gb|ACDT01000005.1| GENE 3 2036 - 2449 409 137 aa, chain + ## HITS:1 COG:lin2156 KEGG:ns NR:ns ## COG: lin2156 COG0178 # Protein_GI_number: 16801222 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Listeria innocua # 4 137 6 139 746 176 64.0 1e-44 MDKIKIRGLIQNNLKNIDLDIPKHKIVIFTGVSGPRKSSIVFDTIATESGRQLNETFAAF IRGKLPQYAKPNLQSIENLSPAVIIDQSSLGGNIRSTAGTISDLYTDLRILFSRIGKPYF GSAACFSFNDPAQYVRD >gi|223714210|gb|ACDT01000005.1| GENE 4 2458 - 2646 189 62 aa, chain + ## HITS:1 COG:CAC1464 KEGG:ns NR:ns ## COG: CAC1464 COG0178 # Protein_GI_number: 15894743 # Func_class: L 
Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Clostridium acetobutylicum # 1 61 609 669 755 77 59.0 7e-15 MTIEEVLDFFDNAKITKKLQQLVDVGLSYMTLGQSLTALSGGEIQRIKLAQALNKKGNIY IR >gi|223714210|gb|ACDT01000005.1| GENE 5 2687 - 2860 182 57 aa, chain + ## HITS:1 COG:aq_686 KEGG:ns NR:ns ## COG: aq_686 COG0178 # Protein_GI_number: 15606094 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Aquifex aeolicus # 1 53 851 903 926 80 66.0 8e-16 MKLFNSLVDSGNTVIIIEHNLDVIKQADWIIDIGPEGGKNGGKVVFQGTPKEMITTS >gi|223714210|gb|ACDT01000005.1| GENE 6 3107 - 4816 1918 569 aa, chain + ## HITS:1 COG:sll0267_5 KEGG:ns NR:ns ## COG: sll0267_5 COG2199 # Protein_GI_number: 16331091 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Synechocystis # 400 569 2 179 187 106 34.0 1e-22 MEYQDRLFNQKNRASMDKIINNLPGGIIVFEMQENKIKTLFSNDGTYQLFGYSKEAFKTV FHSDVFEIVYQPDLAPLQEEFYEATKNRESFEHGCRCFRNDGTLMWTYFRVNYIGTEDGR NVFVGLITDSSKERLLEEKSRINEEMYHIAFNQSNVFLLEYDHRSKQVSCSSESSAAYFD SAAFSNIPESLIDSDLIHERSFDDFYHFYHQIIGGVKTGEVKLKMRALNSDKYIWIHVKF TNIFDDDNQPIKAVGVYQNIDEQVRLESRYLQEIKYRQKLMQRSITNLEVNLTTDEVIKA HNVTYNLLGLDEKASYSEFIKKMVTLVADEDKHLFLDTFGLDNVRQAYHDGKDDLRISFL FYRPTKKRYGWVSTHMMFLYNEETNQLMGFFYANDIDDDKRKTIELTRQARTDALTGLIN RRELERVITSEIQLVEPGNYGALFMIDVDNFKLVNDANGHSAGDETLRFVANRLKSLFRG DDVVGRFGGDEFMVYMKEVGAKKDVISKAKQVSEALATTCPGSKIDISCSVGVTFVETPR VIFDEIYHQADSAAYEAKKNGKKSYHIFE >gi|223714210|gb|ACDT01000005.1| GENE 7 4856 - 5422 415 188 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754869|ref|ZP_02426996.1| ## NR: gi|167754869|ref|ZP_02426996.1| hypothetical protein CLORAM_00373 [Clostridium ramosum DSM 1402] # 1 188 1 188 188 263 100.0 4e-69 MNGNKVKVMAQGAMLASLFGVLGVINLYTGSIFDIVLAYVMVIGLVYYTYLYDYRAGLSV LAVTFVILFLVGELFFTFYVTFTLVMGIFYGYCLKHEKQKCFSKYGLMIISAIKNFLIFF LLGGLLGINVYQEGLEMYREIISLIPMLKNILTPEVSFALLWIFLFISESYIVRVYSNII VSKLMKRK >gi|223714210|gb|ACDT01000005.1| GENE 8 5464 - 7425 693 653 aa, 
chain + ## PROTEIN SUPPORTED ## NR: gi|194246575|ref|YP_002004214.1| 50S ribosomal protein L9 [Candidatus Phytoplasma mali] # 34 643 57 678 837 271 29 3e-91 MTVKMTRIKNVVIALLLIEFIFALMGYVFLSSNTSLVLATYIFIKNVIVLGFIFYSSSLA NENNLSVSEALNNEAKNAFIFGGIGLIKYDENRNISWISDLFTEMKLNIVGKKLLEWQPL LASLFEDDDIKVIDINSRKFEVYNSKESRLLYLKDVSDYVGISKEFEDQQVCVAYITVDN YEESIEQADEQTAASIQSTTRQIVLDWAKENGIVLKRYKSDGYIAMFNERTYRKQVEDKF KILDYFKEQAEQLGQMMTLSIGIGRGSNILRELDELAFSALSLAYSRGGDQATVKSNDEP IRFFGGNSESYEKSNKIRARVIAQSLAGLIRQANNVLIMGHKQSDFDSFGASIAMYSICK AYGKKAHIIIDYDSLEEKTGVIARSLRDDERYRGVFITPARINEFNHSKTLLVNVDNHKP SLAIDANALDIIKNKVVIDHHRRGEEFIELPLLTYLEPAASSTVELIVELFDYQKENVCV TEREATIMYAGMLIDTNYFRTRVGTRTFQAAAKLKEMQANVSEAYKYLEDDYDTTLTKLS ITQTAYRYGENILIAFGRQDKIYSRTLLAKAGNELLGISGVKAVFTVGRTGKEEVSISAR STRDVNVQLIMEKLGGGGHFSMAACQLKYEDVTIAINLLEEAINEYLDERTNE >gi|223714210|gb|ACDT01000005.1| GENE 9 7425 - 7871 732 148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167754867|ref|ZP_02426994.1| hypothetical protein CLORAM_00371 [Clostridium ramosum DSM 1402] # 1 148 1 148 148 286 100 3e-76 MKVILLQDVKKVGKKGEVVKVADGYGQNFLIKNKLAVLETNTSRKIVESQKEAEHQQDLE NQAKAKELAKEIESITLEFTLKSGKDGKTFGSVSTKQVVEQLREKYGIKIDKRKFIDAHP IGALGYTKLKVDLYKGIIATINVHLKER >gi|223714210|gb|ACDT01000005.1| GENE 10 7881 - 9254 1423 457 aa, chain + ## HITS:1 COG:lin0047 KEGG:ns NR:ns ## COG: lin0047 COG0305 # Protein_GI_number: 16799126 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Listeria innocua # 3 440 7 441 450 437 52.0 1e-122 MANREYPHDIEAERSLLGSMLISKDKCIDILNKTVEDDFYDDSHQAIFKAMAAINEEGTP VDVTTVTSYLMDHSQLDKIGGVDYLLRLSESVPTVAHSEYYLKILHNKATLRRIIRETTQ IAENAYGDVENIDAFIDETEKTILKVTQDRSAGEFRDIRGVIKSVTDRLNLLQKIDGNIS GVKSGFRDLDKITSGFQKGDLIILAARPAMGKTAFALNLAHNAAYKAEEPVAIFSLEMPA EQLVQRVICSMGGIEGSSMRTGEILKTNANKYYAAAERVSKCNMYIDDSPGIKINDIVAK SRKLKSEHGLRMIVIDYLQLITTASKNKENRQQEVSEISRTLKALARELEVPVISLSQLS RSVEQRPNKRPMMSDLRESGAIEQDADIVSFIYREDYYKDPGEESEDNGLTEIIIAKHRN GATGEVNLAFEKNYSRFSDLAQMGPDGTSEGVRDLRS 
>gi|223714210|gb|ACDT01000005.1| GENE 11 9266 - 9856 752 196 aa, chain + ## HITS:1 COG:BH3779 KEGG:ns NR:ns ## COG: BH3779 COG1435 # Protein_GI_number: 15616341 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine kinase # Organism: Bacillus halodurans # 5 196 5 196 204 254 60.0 7e-68 MYHQYREGWIEVISGCMFAGKTEELIRRINVLSYAKKNIIVFKPKIDNRYSDSEIVSHSG AKVPCLVVEKAQDILKKIEADTEVVAIDEVQFFDKDIVEVCEYLADKGIRVMVAGLDKDF RGESFGVMPELLTRAEFVTKLTAVCAKCGAPATRTQRLVNGKPAGFEDPIVMVGADESYE PRCRHCHQVPNKPHKF >gi|223714210|gb|ACDT01000005.1| GENE 12 10220 - 11296 1300 358 aa, chain + ## HITS:1 COG:BS_prfA KEGG:ns NR:ns ## COG: BS_prfA COG0216 # Protein_GI_number: 16080754 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Bacillus subtilis # 5 355 1 352 356 374 56.0 1e-103 MNESMLDRLKTMENRYEELGHMLMDPDIGSDIKKMTEVTKEQSSLQAAYDLYQEYKEVEA GISDAKELAKESDPEIKEMAKMELAELETRLPEIISKLEIELIPKDPNDNKDVIMEIRGA AGGDEGNIFAGDLYRMYVKYAESQGWKVEVMEAVDAEAGGYSLISFMVKGEDVYGKLKFE SGSHRVQRVPKTETQGRVHTSTATVLVMPEMEEVDVEINKSDLRIDTYRASGAGGQHINK TDSAVRITHLPTGIVAASQDGRSQHDNKDKAMKALVSRIYDFYQQQHDEQVGSERKSKVG SGDRAEKIRTYNYPQNRVTDHRIGLTIQQLDRIVDGKLDDIITALINEDQRLKMEGQH >gi|223714210|gb|ACDT01000005.1| GENE 13 11296 - 12153 210 285 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|241760258|ref|ZP_04758353.1| protein-(glutamine-N5) methyltransferase, ribosomal protein L3-specific [Neisseria flavescens SK114] # 1 258 9 272 299 85 30 1e-15 MATVKELIKLAESRLDDASKDVNVAKVLFYHLADKQPHELYLMYDEEVSSELEAKFLAGM EEYYQGKPIQYIKGVENFFGRDFKVNEDVLIPRYETEELVENILYRIDDYFAEYQSITLC DVGTGSGAIATSLALEEPRLKVFATDISLKAVTVAKDNAKNLGANIEFMVGDMLEPLLEN EIKVDIFVSNPPYIPQEQEIEAMVKDNEPHVALFGGNDGLYFYRKIFQGVEPLLQERALL AFEMGFDQRELMEAALQEYFPNDPHEIIKDINGKDRMLFIYRNLK >gi|223714210|gb|ACDT01000005.1| GENE 14 12201 - 12950 577 249 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754862|ref|ZP_02426989.1| ## NR: gi|167754862|ref|ZP_02426989.1| hypothetical protein CLORAM_00366 [Clostridium ramosum DSM 
1402] # 1 249 2 250 250 464 100.0 1e-129 MIINKLNDIINNSQTSDPRYMISSFIKQNILIINSLTIKEVADGSHVSKAMVSKFVKELG YDNFAELKESCSMYVGSLGLKDRYFQLDSDFRSNSSDLVVRMNNLLTNTVHQINYNDLDM LVEDIRNCSCLFLLGHGEAKGLCSMIQIELDALQIPIIVVDVDFRKEYRIGDHAVFLIIS VNGNTLMYNHRTITKILKQKQKTWLITCNQEIAFAGKKIYVPSVDSTLNKINIRFVTDLI IARLQQNQE >gi|223714210|gb|ACDT01000005.1| GENE 15 13038 - 14462 1795 474 aa, chain + ## HITS:1 COG:L121426 KEGG:ns NR:ns ## COG: L121426 COG2723 # Protein_GI_number: 15673653 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Lactococcus lactis # 1 473 1 474 477 700 68.0 0 MAFSKDFLWGGATAANQCEGGYNEGGRGLANVDVVPHGKDRFPVCLGMKEMLDFEEGYYY PAQVGIDFYHHYQEDIKLFAEMGFKTFRLSIGWTRIFPNGDENEPNEEGLKFYENVFNEC HKYGIEPLVTITHFDMPIHLIKKYGGWKNRELIEMYKKLVTVLFTRYKGLVKYWLTFNEI NMILHMPFMGAGLMFKEGEDQKEAKYIAAHNELVASAWATKIAHEIDSNNMVGCMLAGGE YYAYSCHPADVWASINKNRENIMFIDVQARGYYPNYALKMFEREGINIGITTEDKEILKN NPVDFISFSYYSSRCISTQGNVEKTSGNAFEGTKNPYLKTSEWGWAIDPLGLRITLNTLY DRYQKPLFIVENGLGAKDTIGADGSVNDDYRIEYLREHIIEMDKAINEDGVELLGYTPWG CIDLVSASTGEMSKRYGFIYVDRDDQGNGSLKRSKKKSFAWYKKVIASNGTDLD >gi|223714210|gb|ACDT01000005.1| GENE 16 14495 - 14941 230 148 aa, chain - ## HITS:1 COG:no KEGG:Ccel_3010 NR:ns ## KEGG: Ccel_3010 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 145 1 148 162 83 29.0 3e-15 MYKINDYIIHKNNHLYQIIAIHNHTCILTTWQTNETIITNITELVRKVITASEMEEIVER IPYIRTLNITSERYRQELYQKSLDKYDEVEWIKLIKTIYIRHQKKQTQNYELKYFKAAKN IFHEEISLLLNIPLPKIETYIKQKILEN >gi|223714210|gb|ACDT01000005.1| GENE 17 14951 - 15451 586 166 aa, chain - ## HITS:1 COG:no KEGG:Ccel_3009 NR:ns ## KEGG: Ccel_3009 # Name: not_defined # Def: transcriptional regulator, CarD family # Organism: C.cellulolyticum # Pathway: not_defined # 1 166 1 167 175 136 42.0 3e-31 MYKIGDTVLYGREGVCKIKDIVTRKLNNIDKQYYFLTPLDDHITILVPTDNEEALSKMRK VLSKKDIYELIKTMPDNETIWINDKNIRKQKYNDIINHGNHEQLVKLTKTLYLNKQKQEK AGKKFHVQDQHFLQTAEKMLYDEFCHTLNLKPEQILPFILETLKEN 
>gi|223714210|gb|ACDT01000005.1| GENE 18 15709 - 17259 1918 516 aa, chain + ## HITS:1 COG:TM0033 KEGG:ns NR:ns ## COG: TM0033 COG4099 # Protein_GI_number: 15642808 # Func_class: R General function prediction only # Function: Predicted peptidase # Organism: Thermotoga maritima # 25 359 13 344 395 70 25.0 9e-12 MKRLKRLLAIVVSFALVATTYCTPLFAVDAPDTYSCQLYGEVTDAGEVVSKMVIDYGAAK KVTGVGLDTFTVHAKASTEDIRQGTQDTSYGDYDLDRKIVKVEVEGQYVNIYFNQSEGAT LAYLSSGRNYPAELTYTVTQNKPVTLTAADGTVINDSYTGNYTCDNTVINAETAKFESVK VKDGINYQYYDAGSADSLIVWFHGNGEGDYLSSGNNVAQMLANRGTVAWATDETQNIFGG AHVMAFQAPDTWYYAQNDGLLEKAYNEINEVIKAKNIDPDKVFVSGCSAGGYMTTRMLIA YPDLFAAAMINCPALDVADARGGETPTDAELASIKNSDTAIWLVQGKTDSSVKPEDCSIR LFNALTDGQELITSEHKQDLNSDFTTSETKDGKYKLSLYETVAVTDPSTGASSNKLEFAE DYDQDGVATLVQYSDHWSWIYTLNNNPKDAKGVSIMNWAAEYGKTAEQPTPAPVPETPAD TPKTSVKTGDSVSLGLYAITTLAGLCGIALLKKRYN >gi|223714210|gb|ACDT01000005.1| GENE 19 17389 - 17763 399 124 aa, chain + ## HITS:1 COG:PM0679 KEGG:ns NR:ns ## COG: PM0679 COG2832 # Protein_GI_number: 15602544 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 1 120 1 120 120 89 47.0 2e-18 MKYLYFMLGIIFTGIGMIGVVLPVLPTTPFLLVAMFFFTKSSSRAKVWFEGTKIYQNHLN DFVTKRAMTLKTKVMLLSFASTMLLIAFLMMDNIYGRITIVLLVIFKYYYFAFRIKTIKE VSND >gi|223714210|gb|ACDT01000005.1| GENE 20 17756 - 19492 1574 578 aa, chain + ## HITS:1 COG:PM1474 KEGG:ns NR:ns ## COG: PM1474 COG4988 # Protein_GI_number: 15603339 # Func_class: C Energy production and conversion; O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components # Organism: Pasteurella multocida # 21 572 1 559 561 476 45.0 1e-134 MINKRLVSYLKEDKKYIYLQVLMQWLALISQVVIIAIIANMINELYQGHQVNNFSIKVII IIVLILIKGYFGKLVSSFSFKAAKNIKGKLRCDIYQKVLLLNNHYSDVISTASLTQLMSE GVEQLEIYFGKYLPQFFYSMLAPLTLFIILGRVEFKAALVLFVCVPLIPVSIVLVQKFAK RLLNKYWGLYGNLAERFLDNVRGLTTLKGYQGDQAKHLEMNEEAQRFRNITMKVLVMQLN 
SISIMDLVAYGGAALGIIISLYSYQGGAIDLGQTFMMIMLSAEFFIPLRLLGSFFHIAMN GNAASEKIFRLLDTPVNDHKELEITEIEKIEITRLSFGYDEEIVLKDVSLEIDEPGIYGV VGSSGSGKSTIAKLLMGYYDNYQGVLSYNGNQVNQVKHQSLMKQITMVEHNPYIFAGTVR SNLSDGNDNCDDSIMIEVLKKVNLWNYFDGVNGLDSEIEERGNNLSGGQKQRLSIARALL HDTSVYIFDEATSNIDIESEEIIMKVIEQMRDEKIVILISHRLASVENCKCNYVFSQGRL IGWGNHKELMRNNPEYIELVNTQKEIENYGGQANAKTE >gi|223714210|gb|ACDT01000005.1| GENE 21 19461 - 21086 204 541 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 342 540 5 207 223 83 29 5e-15 MEARQMQKRSNLKVVISLIALVKSMLLIMLGAIILGVIGYLAAIFLTVIAGEIMIKVIKH ETFTSLLYLLIACGILRGFLRYGEQWCNHFIAFKLLAMIRDIVFVKLRKLAPAKLDGKDQ GNLVAMITSDVELLEVFYAHTISPVMIALIVSIIMIIYIGYYHFALGLLASTAYLFIGVI IPLIVGKKSGNMGLNYRNDYGKYTSYVLESVHGLRIIDQFSRGNKRLKEIDKKSELLNEQ NKELKDLEGQYRMIGDICVSLFSLAMFILGYYLYRQGSIEFNGVLISTIALMASFGPVLA LSSLAHNLVITFASGNRVLDLLAEEPQVETVINGYQGNLGTIEINNLSFKYEAETILDKI NLSIKPGEIIGIEGRSGSGKTTLLKLLMRFYDPTDGEILINGHDIKIWQSDSLAKLISYV TQDTYVFNDSIINNIKLGMETSDEAVIEACKKANLHEFIISLKDGYETMIGSLYQSLSGG QLQRLSLARAFLHDAPLILLDEPTSNLDSLNEALVLKSLSDNHQGKTIIMVSHRPSSLKI L >gi|223714210|gb|ACDT01000005.1| GENE 22 21123 - 21416 271 97 aa, chain + ## HITS:1 COG:no KEGG:PC1_1895 NR:ns ## KEGG: PC1_1895 # Name: not_defined # Def: hypothetical protein # Organism: P.carotovorum # Pathway: not_defined # 5 89 12 101 143 78 38.0 9e-14 MTIDQEKELISIMIKIYEDGNKVDLSDLKNYAFKRIEFCPRKEEKTFCSSCPIHCYQKTY RQQIREVMKYSGKRIIFKHPVIAFKHVINTLKYKIMS >gi|223714210|gb|ACDT01000005.1| GENE 23 21481 - 21975 467 164 aa, chain + ## HITS:1 COG:L1015 KEGG:ns NR:ns ## COG: L1015 COG1670 # Protein_GI_number: 15672772 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Acetyltransferases, including N-acetylases of ribosomal proteins # Organism: Lactococcus lactis # 2 145 14 153 187 62 27.0 3e-10 MENLETQRLIIRELEIKDAMRLSEYRDKREVAFYQSWWRYPYQKALKRVEYCVKHPFDGS RGNYQLGVVLKENNILIGDYFLEVLESNSITIGYTFDSDYWQHGYAIESMRALLLELKNR 
YNFKIVFAHVYDDNIRSIRLLKNLGFVQYETSKIMGDIGFKLRL >gi|223714210|gb|ACDT01000005.1| GENE 24 22104 - 24296 1317 730 aa, chain + ## HITS:1 COG:BS_spoIIE KEGG:ns NR:ns ## COG: BS_spoIIE COG2208 # Protein_GI_number: 16077132 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Bacillus subtilis # 119 730 154 807 827 130 22.0 1e-29 MDLTVAKTKTKIKNTGMTYVSIFGLTLLLGLVKIYDTSLCFVLPMLVVSFMCGYRHIIMY MTALLLVSYISYDNYMLLLVSITSVIIMQVMMYLKFVKSKYLALVVTLVSLLFLYIYQYN YIEVLVILSFTLLHSLLYLEVVPLFIHNTIEVYTNKRMMILSTMIMLAIISLVELNQVYM MILLRFYLLLSVYYLGINSTMPTILYISIILMFINPLLKEEILSLILPFSIFFMYKPENK LVCSTVYLVSHLVLPFFITYDYYYHSFIIVVSATLFLFVPTFKHKPKLLSDDFKNITSRN KLIQRATTFASLFKQLTDIFQEANRNVNVGEFVGYVYEDVCSKCPSRDLCFYRDGSVSRL GKLINKGFKTNYTKEDINYINSNCISPNRFLKSISEYKDSYEKIKRVNQENSHLKKDLFY EFSLLSEVFDNFSNSLEQTPLNDDSLKEHLLGYQFNITYLYRHQIGNNVYTLELGLMDIS EEEVVNELVPIVESYLNETLEIVSLKDSMHHLGYTSLVLKHQQNYSLQYGFQQYALEPLN CGDSFSAFHQDNLHYLALSDGMGQGKMAANESKLTLEVLSKLILNGIGLKDTIDSINALL KIKNRNDMFTTLDLCNINLANARMKLIKYGANPSYHIRSGMVEKIATNSLPVGVVSKLKM VSYEMPLKNNDLIIMSSDGTGNDFEQIVNDNVHLYCDQHPQEIATFLMDQVLEKNNLDDI SIIVIKVVNN >gi|223714210|gb|ACDT01000005.1| GENE 25 24976 - 26286 1351 436 aa, chain + ## HITS:1 COG:RSc0539 KEGG:ns NR:ns ## COG: RSc0539 COG0513 # Protein_GI_number: 17545258 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Ralstonia solanacearum # 1 369 1 374 540 363 50.0 1e-100 MKFSELELIEPLLNAVKEMKYDIPSPIQEQAIPAIISGRDIFGCAKTGTGKTAAFALPIL QKLYLRDESEKYPRTIKALILAPTRELAIQINETFEAMNPQVNLKSAVIFGGVRQGSQVT KINRGIDVLIATPGRLIDLYNQGLVDLKHVEYLVLDEADRMLDMGFIKDIRKILRFIPRR HQTMLFSATLPDEIKHLVSDLLNDPLKIMISSGNVTVEKINQSLYFVDKVNKAKLLIKLL ENPQIYNAIVFVRTKRNVDTLCKKLIKAQITCEGIHGDKSQNARVRALNNFKNDKVRVLV ASDIAARGIDIDELTHVINFDLPDQAENYVHRIGRTARAGASGEAITFCSFQEKALLKDI QKFINQDIPVVDNPYYPMKDLSIPEKKHKSKKKGETQKSRDKAKHLKQGQKKLSVRTNNI 
KKNSKTKRSNFKRRRP >gi|223714210|gb|ACDT01000005.1| GENE 26 26387 - 28186 1706 599 aa, chain + ## HITS:1 COG:SP0012 KEGG:ns NR:ns ## COG: SP0012 COG0634 # Protein_GI_number: 15899961 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 423 598 1 178 180 201 55.0 3e-51 MDSFKDLLKMQELYILGVSGGCDSMALLDMMNQAGYQIIVCHVNYHLREDSDLDQQTVEA YCHRYDLPCYVREIDKQVYGPDNFQDQARRLRYQFYLEIGMKYQTQKVVLAHHQNDVIEN IVMQLQRHNTKGYLGIQEISEVFGVTVIRPCLAVRKQFLRDYCHGHNVEYRDDYTNFQTE FTRDYVRNVTLKDYDEEKIEKLLKWAQEHNQRYASKLKQLQIYLDLYHQKNKIDYTCIPK ELLDGFIYEILKEVVYPPLISNALIKEIIKQIKSNKPNINMDLPVNIRFIKEYDNICVSN LKNNHSYCLKYEHLVYDKHEHFYLSEVGHLNEGVYVSKEDFPITIRSARPGDVIVTAGGT KKVSRLFIDNKIPKSKRDTWPIVENSQGMIILVPHLAKNIGYLYSKPNIYVVKLETYTTR SEIMHKDIKEILISGDQISAKCKELGAIIDKDYEGKEVLLVGLLKGSVPFMAELSKYLNT DVTFDYMNVSSYEGVESKTLVVKQDLKEDVSGKNVLIVEDILDTGKTLFNVKEMLLKRKA NSVKIVTMLDKEEGRVFEMKADYVGFKIPNAFVVGYGLDFNERYRQLPYVGILKEDCYK >gi|223714210|gb|ACDT01000005.1| GENE 27 28196 - 30160 1369 654 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 9 618 6 603 636 531 49 1e-150 MSNKNGLLKSILPWVVLLFLIGGLVTYMNQGNSKEIKYSEFVEVVQDEKIKEVEIVPSSL VVDVSGKYTKKVDGKSTEYEFTTTIPNTEAEMQSLTGLLNEKGIETTIVDANDRGVFMKV IISILPYVLLLGGMFFIFKMMGQGGGNAKAFDFGNSRAKLEKNQTTRFADVAGADEEKEE LKELVAFLKNPKKFAQMGARIPKGVLLVGPPGTGKTLLARAVSGEASVPYYSISGSEFVE MFVGVGAGRVRDMFKKAKQTAPCIIFIDEIDAVGRQRGTGMGGGHDEREQTLNQLLVEMD GFSGNEGIIILAATNRADVLDPALLRPGRFDRQIQVANPDKNARTEILKVHARNKKFAPD VDFSNIAQRTPGFSGAELENVLNEAALLAVREDHKVISMDDIDEAVDRVMGGPAKKSRKY SEKERRLVAYHEAGHAVLGLTLEAANKVQKVTIVPRGQAGGYNLMTPKEETYFQTKTQLE ANIAGFMGGRVAEEIFFGDVSSGAHNDIEQATRIARMMVTELGMSELGPIKYDSEQGNVF LGRDYTQHNNSHSGQIAYEIDVQVRKIIDECYAQAKEIIEANKDKLVIIADALLEYETLA GEQIEALFNTGKMLDRHDGTFDSSSNDDSSNDSNASSTTTPSFDDADDLLDDMK >gi|223714210|gb|ACDT01000005.1| GENE 28 30300 - 31169 966 289 aa, chain + ## HITS:1 COG:BS_yacC KEGG:ns NR:ns ## COG: BS_yacC COG1281 # Protein_GI_number: 16077139 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond chaperones of the HSP33 family # Organism: Bacillus subtilis # 3 288 2 288 291 288 48.0 6e-78 MKDYLVRGLVNSKNCRVFACKTTNLVDEARKHHNLWPTASAAIGRMMSATLMMAKMNKNQ EKMTVVINGGGPIGTMMTVTNGDGNIKGFVANPEVHYTYNDTGKLAVGVAVGHEGTLQVI RDMGLKEPFTGSVPLQTGEIGDDFGYYFAVSEQTPSVVSVGVLVNDTNEVLASGGFIIQL LPEAIEEDIDYIEKVVAQCPPVSKLINEGKSPEEILKQLFDDVEITEKQDLFFKCDCSKE KMGKALITVGKDELQAMIDEDHGCELSCQFCNKKYQFSEEELKEIKNRL >gi|223714210|gb|ACDT01000005.1| GENE 29 31303 - 31767 685 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754847|ref|ZP_02426974.1| ## NR: gi|167754847|ref|ZP_02426974.1| hypothetical protein CLORAM_00351 [Clostridium ramosum DSM 1402] # 1 154 6 159 159 226 100.0 5e-58 MKKLLSVMLLALLLTGCGGEEKTTTICKGNIDEMTAAEVTIEATGDKTDVMKSTVTYDFT GYVTESTPIDTYWLPKVKSINVDYDSLKGGSAKYTVDGEKIILKIELDYGEADFDELKKA KLVTTTDSDKKIVYISLEETIKEQEKGGLTCKEK >gi|223714210|gb|ACDT01000005.1| GENE 30 31946 - 32956 604 336 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145640649|ref|ZP_01796232.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae R3021] # 
1 328 17 343 353 237 39 2e-61 IDKILLRSDIMFKIGNVEIENPVVMAPMAGITNMAFRKIIKDFGAGLVYSEMVSDKALCY GNAKTIEMLQVDDGEHPVSIQLFGGEVETMVQAAKYIDKHSNCDIIDINMGCPVNKVLKA DAGSKLLLYPDKIYEIVKGVVDNVTKPVTVKIRSGFDKEHINAIEVAQLIEKAGASAIAI HGRTRSQMYEGHADWEIIRQVKEAVKIPVIGNGDIKSVEDAKRMLEETKVDAVMIGRAAL GNPWLIKQVVESLKTGETISEPTYQEKITQCLAHARRLMEIEPEKVAMFQMRGHAPWYIK GLKSSARVKNELSKINTFEELEIILKEYQEYLDQLD >gi|223714210|gb|ACDT01000005.1| GENE 31 33018 - 33866 818 282 aa, chain + ## HITS:1 COG:lin1397 KEGG:ns NR:ns ## COG: lin1397 COG0190 # Protein_GI_number: 16800465 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Listeria innocua # 2 275 4 276 284 301 55.0 8e-82 MIISGKEISVKIKDQLKEEVSKIKETYPRLPKLVVILVGDNQASQTYVRNKERGCQYIGI ESEILRHDASFSEIELLQEINDLNNDDTVDGILVQLPLPQHINEEKVLDAIVPSKDVDGF HPENVAKLFLGQHSLVPCTPKGMMVLLEEINYDLAGKEVVIVGRSNIVGKPVALLCLQKN ATVTIAHSQTKDLKAVCSRADVLIAAIGKPKFFNHEYVKDGAVVLDVGINRDENNKLCGD VDFDDVKDKVSAITPVPGGIGPMTITMLMKNTIEAFYHRNGD >gi|223714210|gb|ACDT01000005.1| GENE 32 33868 - 34047 229 59 aa, chain + ## HITS:1 COG:BH4052 KEGG:ns NR:ns ## COG: BH4052 COG4481 # Protein_GI_number: 15616614 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 2 59 4 60 65 71 60.0 4e-13 MEYKLNDIVEMKKAHPCKKSTQWQIIRMGADIKIKCLGCGAIVMFPRREFEKKLKKIIE >gi|223714210|gb|ACDT01000005.1| GENE 33 34212 - 35477 1341 421 aa, chain + ## HITS:1 COG:lin0759 KEGG:ns NR:ns ## COG: lin0759 COG3538 # Protein_GI_number: 16799833 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 3 416 9 419 432 404 48.0 1e-112 MTIKELLKQADTVLKDEKLREMFKRCFLSTLETSVKREGDKTYIITGDINAMWLRDSSCQ VNHYLKYTHNDSELQSIIKGLIKAQVEYVLWDPYANAFLNTEDVEREYHDTTIMKPYVWE RKYELDSLCYPVKLSYQYYKETGDKSIFDQQYLHFIRTLLNVFECEQNHSQNSSYTFERE EAKKWNLRKTETLQNKGRGFETRYTGMIWSAFRPSDDACTFNYNIPGNMFCSVVLGYLKE 
IVELVYQDEYLQERIVDLKFQIDYGIELFGIVRHPKYGKIYAYETDGYGNHVLMDDANVP SLLSIPYLGFADANDEIYKNTRAFILSKENPYYFEGNRAKGIGSPHTWSEYIWPIALSMQ GLTSLLQHEREALIQTIIDNTGGTGYCHESFDVNDDTQFTRPWFCWADSLFAELVIKTYF E >gi|223714210|gb|ACDT01000005.1| GENE 34 35624 - 36964 1556 446 aa, chain + ## HITS:1 COG:lin2856 KEGG:ns NR:ns ## COG: lin2856 COG1455 # Protein_GI_number: 16801916 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 1 427 1 439 454 172 31.0 1e-42 MKLIMSLLEKYLMPFSNKITSNKLLNAIKDACILTSPFTVVGSLAGLVSQQANYWFGEWG IPLDGILGSIINAFTNINTVAMGLTGLVVVIASSYNYANQLKGLNKGDKCVPLVACIVSL IAYIATIPNAINVGDSTITAFQINFFNYEGMFAGMIIGLASSYMYYKLVNSKFTIKLPGQ VPPMVLSSFLSIIPFFVIITVFSVIKEIVVLAGFDSIQQLITQFIITPLNGIGTGLPAVI LVIIIMQLLWFFGVHGFSIMWGLISVLWMPIFYEHIQIFVETGSFDKITQVAPNTLCNVY AMIGGSGSTLALVVLLLVLGKKGSAERSIGKVSIIPGLFGINEPVTFGLPIVLNPIMFIP FVFVPVINAIIGYFATSWGLVNHLVVLNSGVEPIFVNAWVLGAFTLSPVILCIVLFILDL VLYYPFVKLQIKQNTLEAQNEGAAVE >gi|223714210|gb|ACDT01000005.1| GENE 35 36942 - 39602 2602 886 aa, chain + ## HITS:1 COG:ybgG KEGG:ns NR:ns ## COG: ybgG COG0383 # Protein_GI_number: 16128707 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Escherichia coli K12 # 6 839 4 829 877 354 29.0 4e-97 MKVLPLSKIHFIPHVHWDREWLRSTDSSRIKLVYFYDRLIEMLENDPQMKYFTFDGQTAA LEDYLAIKPFQKDRISQLVKNRKLFIGPWYVQPDMIIPSGEALLRNLLVGSKYAAGLGHC MNVGWVPDAFGQNKITPYLFKEIGMKGIFAWRGFDYDVLDDSLFMWQGNGDASLPTVHFA LGYGHYRGFPEDYEEVKKDMTDFIPKLEARYQDQEVLFMLGSDYAFPRKHSSKTIEQLQK DGYDCIMNNPEDYLDTLLNAAKKNHHQLKIYQGEARSAALGRIHAGITSTRIDIKNAMRH YETMMAKVIEPMISISRFQGGYCDQELINYFWKIIFKNQFHDSIYGSSPDSINHTVENRL LNLRHGLNELIWMNFRYLAESVDLSVLKENEDILVLFNTLPYQRNDYAFTSMIVKDKNFV LKRQDGTIVPYEIMNEVKATSHDIEYYNGMENFHDAGEVLEGTKFKVQIKIAASFLPPMG YEVLKVCFHERTSKRPEGDVVILKNGAENKYLVMHIQEDGTLSVTNKQTQETYHQLNTFV EKGDDGDEYNYSPCIDDTEIRINDIHPMITCIESSSIEVKYRLEYQIQVPEKVVDHHRVQ ALKPLKICSDVSLKANSQTIDFVTKINNHSCDHIVRVCFEDIYKAAENCSQDQFGTIIRQ 
NVIKNQKSLKDGATEYVLPIYAMQRFVKLDHQKSIMAVLSKGPLEYEIENNQKICLTLLR SVGKFGKADLLIRPGRSSGYRMDAPSSQLLNQTITSEYSLFFGTKEQMSKMIQTAEVINT PVQTRYLNDISRRENLKLDWSYSAITLDERVEFMALKKAENSEASIFRILNNREVDVADV ELFIPINKRCYLCDVQENKQTEVENQNGKVIIKNIVSNTFVTVIIE >gi|223714210|gb|ACDT01000005.1| GENE 36 39643 - 40404 616 253 aa, chain + ## HITS:1 COG:TM1228 KEGG:ns NR:ns ## COG: TM1228 COG1737 # Protein_GI_number: 15643984 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 36 224 36 223 266 79 28.0 9e-15 MSLIKNFQKYGNNLTELEIECFNKLLTFRDLEPSLTISSLAETLNVSTTTIFRMVKKLDY KTFMDFRYDLLYHRRDEYEYQSPVESTCDSLEKEIKDTIEMLRNLDIDDVIKDIVKAKSV LICSSGMNKYVAKILAVKLSLYGIRTIYPDDHWFLYLEANNLTQDDFVIVLSRGGATEAI IDVMKNAKLSGCKILLVTETRNSPMTKLSNYVMNVSVTKDEGYDIDSRLHIHLAIEYLLR ELMNQYLYNKKYL >gi|223714210|gb|ACDT01000005.1| GENE 37 40422 - 42623 2077 733 aa, chain + ## HITS:1 COG:SSO3032 KEGG:ns NR:ns ## COG: SSO3032 COG1472 # Protein_GI_number: 15899739 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Sulfolobus solfataricus # 2 723 4 726 754 414 36.0 1e-115 MIEKIVKEMTLKEKVGQLNQHLYGWQCYQKVNGKYELTDLFKEHVKEYGGVGAIYGILRA DAWSQINRENGISREDSKIVITMIQEYIKKHSRFEIPALISEECVHGHMALGAPVFPTNL AMGMTWNPDLMERITHNVSQELAAKGGNLALFTGFDVLRDPRWGRSEECFSEDAYLTSCM TSAAVHGFQKEKNGVAVVVKHLCAQGGCVGGHNSDAALIGPRELREIHLPPVKAAIDAGA KAVMAAYNEIDGIPCHINKALLFETLRKEYDFSGIVMADGCALDRLLLLDDDILKVGSLA LKAGVDLSLWDNVYLRLDEAIKQGYLTEEELDQAVLRVLRLKEELGLWEELNTYSLPEAS DLLLQAARECQVLLKNNDSLLPLSSKQKIAVIGPNANHYLNQLGDYTAYQNKSDIVTVYD GICQKTEISPIYVQGCSIRGRKFDIEPALVAAKAADIVILVLGGNSTRLYQDTFENNGAV KANQENQMNCGENIDLASLELEGYQNELLLALKEVNPHIVTVLIQGRPHVINTVLKHSQA VLASFYPGSGGGEGIADVLFGDYNPSGHLSVSIPQHVGQLPCYYNHKHNGAQKDYVDMPG EALLPFGYGLSYSQFIYRDIKVPKAIKITDLLENGIDLQLEIYNNSTYDGEDVIQVYLKD HQASVVSRVIELKAFNKVKIQAYQSKQISIHLTSDAFAIWNYEMKYLVEPGDVSICIGVD SKHYQEFKLTLVK >gi|223714210|gb|ACDT01000005.1| GENE 38 42792 - 44321 1250 509 aa, chain + ## HITS:1 COG:ECs5319 KEGG:ns NR:ns ## COG: ECs5319 
COG1368 # Protein_GI_number: 15834573 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Escherichia coli O157:H7 # 137 464 133 448 750 150 30.0 4e-36 MVVMIERIIQLITFKEVIGSLMIFIAVLIVASIIWGNRTFSVKTLNQMIFHLKVPMDGTD DGIYTDWFLHTVPQSFVIVAFTEIIVFNLPLPAFHIYLIAHIFAIGCFAIIGSLLFALYN YQIFGYVFDMLRKTQLYEEHYVDPKGVTLTFPSCKRNIIHIYLESIENAYLAKTSGGGQD VSYMPELDALANQNINFSHHNQIGGSLTLEGTQWTIASMVGQEAGIPLLVPFNSKSYNDK SNFFSGVYTLGEILEQHGYINEIMMGSDSNFGCTSNFYKQHGNYYISDYNTAVEEGRIAK DYFVFWGYEDKKLFEFAKADITKLAKSGKPFNMELVTIDTHTPDGYVCEDCQHKYKSQYA NVIACQSKQVNNFINWCKSQPWYENTTIVITGDHNSMSEKFFTNLDHDYVRTPYNCFINS AVTTKFNKNRKFSIIDMYPTILAAMGVKIDGNKLGLGVNLFSGEKTLIEQYGYRKINQEV KKKSRYYRHKLIGDDIKECEQKELSKRSD >gi|223714210|gb|ACDT01000005.1| GENE 39 44461 - 45780 912 439 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754837|ref|ZP_02426964.1| ## NR: gi|167754837|ref|ZP_02426964.1| hypothetical protein CLORAM_00341 [Clostridium ramosum DSM 1402] # 1 439 1 439 439 689 100.0 0 MLQISLQMLWKERKKALNLGITITITLCVCIIFLQFFNNPIINYLVNILKDTTKFYGYEA ITEYEVMRFYDGMYKGTMVIILLVIFISLIMYSCNYYNKTNSRIVGLLKIMGCRDREILF FQMIQLIVITLISYLIAVGISFLLIPLCQALAYLYMGVSENIFYYALETYVDSLPLVLLL LVFMTFMQISYSIRGVIPDLLKNDEIVSFKKRRRITIVANVNYIVLFIIGTFTIYISDLN QGLIFPAGLYVIGAYNLGVMCFPNLLEKWIKLKNIDAKESLIFNNFILNLQQMKSVILMF VLSSAILLTMMCTNLDDLQYMILFQLGYVLTNIVLSFTLINRFRINQINKKQYYRNLSRL GLNYEEIRYMTKKEKTLMYEMIAVLSLVYLSNLNIAFVYRDKMGVLSALFLILELIIPLF ISYRVAVYQEEMRIRKWKK >gi|223714210|gb|ACDT01000005.1| GENE 40 45768 - 46517 283 249 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 203 1 199 223 113 32 3e-24 MEEIIKTANLKKVYGLDTPYPKTALNGVSISVNKGTFACIMGTSGSGKTTLINILSTIDE ATSGKLLIFDQNIVGLSDKEKANIRKRYMGFIFQDYNLIDSLKVIDNILFSLKLNKKNII NEAEIKQIISSLGIEELLDKYPFECSGGQQQRIAIARALVCKPKILFADEPTGNVDSIRA KQLMEYFTEINRKYGITIVMVTHDCLVASYASEMYYVEDGKIINHIFKGNDSFEKFYNRI 
ARISMQIKL >gi|223714210|gb|ACDT01000005.1| GENE 41 46554 - 47045 441 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754835|ref|ZP_02426962.1| ## NR: gi|167754835|ref|ZP_02426962.1| hypothetical protein CLORAM_00339 [Clostridium ramosum DSM 1402] # 1 163 1 163 163 285 100.0 5e-76 MDIITNVLPFLVFLALLWFTYTNISLAVAMYRAKKDDSTLSSCLSNTGKYFYIGLTIFYI ALFVGSVALMVLALINNKLDDLYLPLNVLTIGTLAIGFLLQQIILVGHRQMMIGKIKLDY RKIKRVSYPKAKKLRFVYGQRTYETPLRFIDDFKLKKALQKAR >gi|223714210|gb|ACDT01000005.1| GENE 42 47177 - 48043 802 288 aa, chain - ## HITS:1 COG:lin1969 KEGG:ns NR:ns ## COG: lin1969 COG1876 # Protein_GI_number: 16801035 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Listeria innocua # 55 282 18 260 274 120 34.0 4e-27 MNKKRLNKKKVLLVFIPLIIAGTIIYWFLPSNSTKTEEKQPKKTVNNDPQYYQTSLKERY ENYQKTNPDLSTEEIITRVNMNLDYPFYEVIIPQNKPLELNTIVNKYYKLDDNFTPDDLI YINDTYANTSDPAYKYRKHQMRKVVYDDFIALKEACKTKGFNLYVVSGYRSTTWQTEIYN HMVNTYSVAKADQTCSRPGHSEHTTGLACDIALDNYSFEDVIKHPQYQWFLGQLTNYGFI IRYPENKDTLTGYSYESWHLRYLGKDLAKKVTASNLTYDEYYAQNYAK >gi|223714210|gb|ACDT01000005.1| GENE 43 48048 - 49238 1090 396 aa, chain - ## HITS:1 COG:SPy0818 KEGG:ns NR:ns ## COG: SPy0818 COG2843 # Protein_GI_number: 15674859 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Streptococcus pyogenes M1 GAS # 22 341 47 368 430 161 34.0 2e-39 MKKINKKRVAFASLVIICLIIIIALIINLTILPLFNNSNGITKNKQAKAKKESYDTVSLV AVGDNIIHERVFQYASTNGTYDFTPCYKHIKKYIQDADLAFINQETILGGDTLKITGYPA FNSPSELAKNLIDTGFNMINGATNHSFDRDFEGVKAASQTWRQYQDIIYTGTYDSQSDRD TIRIIEKNGIKFALLSYTQSLNEYNTNPYKLLKQTAYAVPLLEDTAAIKADVQKAKEQAD VIIVSAHWGDENEAEVTAKQQEYAQLFADLGVDLVIGTHPHIIQPVTWVNDQSGNKTLIA YSLGNFLSTMETQDTQLEGMLSLNFIKKDAKIFIDDITWTPLINHFGDNTVEVYPLAKYP DDKLAKHFVLHDKPNIIQQYKAKTRDVIGDKITIKD >gi|223714210|gb|ACDT01000005.1| GENE 44 49330 - 50067 812 245 aa, chain + ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # 
Protein_GI_number: 15896825 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Clostridium acetobutylicum # 1 213 1 213 248 72 25.0 7e-13 MENIVSYCDLTIECQEADYQGNYRISSLLSKLSDLATKNAVEVGIWRPELGERFGFVLAK ETLILKRPIKIDEKIRLKTRAAACKRIQFTRNYWVEDENGDEIASVYSLWTLIDLEKRRI TKPDKAGIIMPEIKSYDYTIEEYHEIIKELPLDYVMERQVLYSDVDVNQHMNNSRYIEWA FDAVGLRIFEQHYFKEVSILYKQEMAPGTIAKIYRYFDDKYVKVVFKSIDDSVIYFEMGG YLDNF >gi|223714210|gb|ACDT01000005.1| GENE 45 50125 - 50433 229 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237735437|ref|ZP_04565918.1| ## NR: gi|237735437|ref|ZP_04565918.1| predicted protein [Mollicutes bacterium D7] # 1 102 1 102 102 199 100.0 4e-50 MLSPQAELYSAYDSKCLKIELTYCLGSEPYIHKIHYLPKKLVLDEKIEAENFSIYRFLHQ NNCKICFVKIKKPFGTRAFFNICILYGKTNTIIDIWDTFIIP >gi|223714210|gb|ACDT01000005.1| GENE 46 50807 - 51370 710 187 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754830|ref|ZP_02426957.1| ## NR: gi|167754830|ref|ZP_02426957.1| hypothetical protein CLORAM_00334 [Clostridium ramosum DSM 1402] # 1 187 10 196 196 373 100.0 1e-102 MILDYPKIYKEKELIGQGGNQDWFADSWARKAGCASVSATDIFIYYTKREREFDKSTYLD YMNEMFKLMEPGKRGFPYVFLYARRLSKLLNNCEYRIFRRPRLYDASEIVKESIEHGNPL GLLILTHRRRKIRDDLWHWVTIVGYEETKKGLDIIFLDCGDEKRIPARILFEKSRFNVVK MVRFFYR >gi|223714210|gb|ACDT01000005.1| GENE 47 51550 - 51774 109 74 aa, chain + ## HITS:1 COG:BS_ysdA KEGG:ns NR:ns ## COG: BS_ysdA COG3326 # Protein_GI_number: 16079936 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 7 68 19 80 89 58 54.0 4e-09 MMILLYGIDKYKAIHHHWRVSEKILLIGALFGGSLGAILAMYGFNHKTRKNIFKYGIPLL LGIQIILIIKISIG >gi|223714210|gb|ACDT01000005.1| GENE 48 51771 - 52586 885 271 aa, chain - ## HITS:1 COG:rrmA KEGG:ns NR:ns ## COG: rrmA COG0500 # Protein_GI_number: 16129776 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 4 269 3 268 269 187 
36.0 2e-47 MHKLACPKCHESIVQEGNSYKCANNHCYDLAKTKYLNLLLNPDKATNNPGDSKESLVARK AYLTKGYYDVILNSVIDCIKKYRDDTPLDILDLGCGEGYYTRGLKEIFSQDTIYGLDISK EAINMATKYTKDVYWLVGNSKNLPIVDHSLDFITALFTVVNQDELKRTLKKGGYIIHVTA NPNHLIEIKELIYDEIKVKSDKYIRLDFETIESYDLVHQVKIDNREDALNLLKMTPHYYH IKKERRGVLDELERLDITIDIKITIYQTSID >gi|223714210|gb|ACDT01000005.1| GENE 49 52659 - 53702 870 347 aa, chain + ## HITS:1 COG:CAC3027 KEGG:ns NR:ns ## COG: CAC3027 COG1408 # Protein_GI_number: 15896279 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 97 344 143 388 392 155 34.0 1e-37 MILTVMMVVIAILINIFIAYHFKKYLNKHRVVAAVIIVLVAVSVSIEMIILGFLINLCIC LIIADLIYPLIFKTKLKRPYQKVYLIGVLICSLSLSVYGIYNANNIVITNYQVSINKEFK DKKIMALSDIHLGTAVKTVDLKQLVTQAEKIKPDMIFLVGDVYDENTSSDEIDDSMKIFT QLAKSYPVYYVIGNHEVGYSSSPLKEYNILERLQLAGVNTLNDEYVEFEDINIVGRQDYK IKKRKPVEQIINGMNQNKPVILLDHQPRSLEENKKLGIDLMISGHTHAGQVFPMLPLWNL LGINEMNYGYRHDDNFNAIVSSGMGTWGFAMRTSKNCEIVVIDLISK >gi|223714210|gb|ACDT01000005.1| GENE 50 53869 - 54933 895 354 aa, chain + ## HITS:1 COG:CAC3027 KEGG:ns NR:ns ## COG: CAC3027 COG1408 # Protein_GI_number: 15896279 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 36 351 75 388 392 161 35.0 2e-39 MSWFVLMEMTLVAVMTVYLAYHTHLVFAKAKYRKIITTVIIGILIGLCLYNLIFFGFLAN ACICFVIFDIINLILYKTRFNRYFKFIYQRGMVALAASVILSFYGIYNAKNTVITTYDVI INKSFVDKSLMVVSDIHLGTVVTKADLTELSEHAEAIAPSGIILLGDIYDEGTTQDEFDY SLQVFKILASKYPVYYIEGNHEIGFQGGSPLREFNIVMKLKEIGIKVLLDDVTKLDDIYL IGRKDYVVKKREALKDLTEPLDTSKPLILLDHQPHDYELNEQLGIDLQLSGHTHAGQIFP LNFLFSFIRVNDLNYGIEVNNNFHGIVTSGMGGWGYAMRTAKHSEIVVVNLKSS >gi|223714210|gb|ACDT01000005.1| GENE 51 55038 - 55664 452 208 aa, chain - ## HITS:1 COG:CAC2475 KEGG:ns NR:ns ## COG: CAC2475 COG3467 # Protein_GI_number: 15895740 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Clostridium acetobutylicum # 66 205 6 152 154 71 34.0 1e-12 
MDKIKDLYQDDLIYIFTRSQLRDLNTDYLAIIKNECSIQTINNCITMHYQDDFESLLAFS LPTKNRKNKQLISLNQCYKIMDKIHYGVLSFTHEGLPYSVALNHIIIDNRIFFHCAKSGY KLNSIEQRATYLIVEDLGINLKAGTHNHNSVAVFGTVHEVTEFETKKAALLEIVSHLAPE HPYNDKMVDTTNIIELEIDYINGKTHIR >gi|223714210|gb|ACDT01000005.1| GENE 52 55743 - 56303 654 186 aa, chain + ## HITS:1 COG:L19745 KEGG:ns NR:ns ## COG: L19745 COG1247 # Protein_GI_number: 15673759 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase and related acyltransferases # Organism: Lactococcus lactis # 8 164 5 161 187 164 51.0 7e-41 MTKIRLAVATADDAARLLEIYRPYVLTTAITFEYEVPTLEEFRARIVSTLEKYPYLVAKL DDKIVGYAYTSAFKSRAAYQWAVETSIYIDLDYKGGGIGSMLYHKLEEITKQQNIINLNA CITAGNPESIVFHEHFGYQKVAYFTKCGYKFNQWHDMIWMEKMLGEHPGKPAPVIPFGKL YKHLSI >gi|223714210|gb|ACDT01000005.1| GENE 53 56357 - 56632 385 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754823|ref|ZP_02426950.1| ## NR: gi|167754823|ref|ZP_02426950.1| hypothetical protein CLORAM_00327 [Clostridium ramosum DSM 1402] # 1 91 1 91 91 127 100.0 2e-28 MEVVEMGLFGKKKKTVDLQQVFKDKYKDINQIVNDANNELDLEIQISLLELAYDKYNDLL DLIDQGVDFDKAHFLSMQQDLKKQIDLLKGL >gi|223714210|gb|ACDT01000005.1| GENE 54 56634 - 57374 850 246 aa, chain + ## HITS:1 COG:BH1023 KEGG:ns NR:ns ## COG: BH1023 COG0219 # Protein_GI_number: 15613586 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Bacillus halodurans # 95 231 4 144 157 79 35.0 7e-15 MKIKTYHKDFDYTYTLGVFETIELLENKPELVMRVYLKSNSYQNKGVKIIENLCQQNKIE LIESDNTINKLAKKDNCFAIAIIKKYTMDLEDANHVVLVNPGDMGNMGTIIRTMLGFNYT NLAIIRPGVDVFDPRVIRASMGALFNINFEYFNSFDEYLTKYPEREVYPLMLKGAKNIHQ ITAANNKHSLVFGNESSGLPDDYLKYGQSVFIPHSDKIDSLNLSMALGITLCHFSMMEFK DKQVLR >gi|223714210|gb|ACDT01000005.1| GENE 55 57402 - 58586 1146 394 aa, chain + ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon 
repressor activities # Organism: Clostridium acetobutylicum # 1 391 1 384 384 399 47.0 1e-111 MYDFDKIVKRRNTKCIKWDTWKKEGKPKDILPLWVADMDFETLPEVTEAIIERAKHAVYG YSMAGDDYYEAVCNWMKRRHQLDINADNIVTTTGVVTALKIAVNAFTAPGDAIIINKPVY YPFDFSIDENQRKKIECPMMFIGQTYELDFDLFEQLIIDNDVKMFILCNPYNPIGKVWNK EELFKLGNICKKHNVLVVSDEIHQDFIYKGNKHLPFVNVDASFKEFTIICTAPSKTFNLA GLQTSNILFFNHKLKEKFIKVKSSLGFPVEPTIFGIEACKAAYNHGDKWVDELVAYLDGN IKYLDGFLKKKLPMLKMIKPQGLYLVWVDFSALQMTHEELEAFMVNEAKLWLDEGYIFGV GGAGFERFNLAMPRCLLVQALDNLYQALKNRQLI >gi|223714210|gb|ACDT01000005.1| GENE 56 59031 - 59897 1222 288 aa, chain + ## HITS:1 COG:CAC0827 KEGG:ns NR:ns ## COG: CAC0827 COG0191 # Protein_GI_number: 15894114 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Clostridium acetobutylicum # 3 288 2 287 287 402 73.0 1e-112 MGLVSATEMLNKAKEGHYAVGQFNINNLEWTKSILLTAEELKAPVILGVSEGAAKYMTGF KTVSAMVSAMVDSLGITVPVALHLDHGSYEGAKAALEAGFSSIMFDGSHYGIEENIEKTK EIVELCHSKGVSVEAEVGSIGGEEDGVVGKGEVADPKECKMIADLGVDFLAAGIGNIHGK YPENWEGLDFDALDAIQKETGKLPLVLHGGTGIPEDMIKKAITLGVSKINVNTECQLYFQ EATRKYIEAGKDLEGKGFDPRKLLAPGAAAIQECVKEKMEIFGCIGKA >gi|223714210|gb|ACDT01000005.1| GENE 57 60405 - 61673 1597 422 aa, chain + ## HITS:1 COG:BS_murZ KEGG:ns NR:ns ## COG: BS_murZ COG0766 # Protein_GI_number: 16080763 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Bacillus subtilis # 3 422 2 420 429 464 57.0 1e-130 MSEVIAIEGGHKLNGTVVVSGAKNATVALIPAIVLADSPVTVYGVPNISDVDALGVLLRE LNCEVDKNDDVLVVDPSNLKNIPLVSDAVDKLRASYYLMGALLGKCKKVTMKMPGGCFLG PRPIDLHIKGFEALGAKVSYSDNGTYTIEADRLVGNKIYLDFASVGATINILLAAVKAEG KTTIENSAKEPEIIDVVNLLTKMGAKIRGAGTDTITIEGVETLCGCDHEIIPDRIEAGTF LVMAAAAGEKVIIQNVIPQHLEAVTSKLKEIGVKMDIVGDSIIVHGGLDNLKPIDIRTQV YPGFATDLQQPLTALLTQCQGASQVVETIYPERFGHCYQLNSMGAKIDQEEAMCYINGPT PLHGARVYATDLRCGAALIVAALMADGVTEIGNVYHIDRGYENIDHKLISLGAVITRRQV ED >gi|223714210|gb|ACDT01000005.1| GENE 58 61780 - 61983 370 67 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|167754815|ref|ZP_02426942.1| hypothetical protein CLORAM_00319 [Clostridium ramosum DSM 1402] # 1 67 1 67 67 147 100 3e-34 MKKGIHPDYKKVKIVCTSCGNEFEGGSVLDEVRVDTCSNCHPFYTGKQRFASADGRVDKF NKKYGLK >gi|223714210|gb|ACDT01000005.1| GENE 59 62050 - 63372 1183 440 aa, chain - ## HITS:1 COG:MA2050 KEGG:ns NR:ns ## COG: MA2050 COG0534 # Protein_GI_number: 20090897 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 1 436 8 444 468 210 31.0 5e-54 MGTMPVNKLLISMSLPMIISMLVQAMYNIVDSVFVAQISENALTAVSLAFPLQNLMIAFA GGTAVGVNALLSRSLGEKNQDHVNQTAVNSVFIFLVTAVIFMIAGLTLSNLFFNVQTTNT EIVNAGTQYSMIVVGCSIGLFCQFLFERLLQATGRTLFTMVTQGLGAIINIILDPIFIFG LCGFPKMGVAGAALATITGQIIACLLALFFNLKFNHDIHFKFKRFRPNAHIVKQIYSVGI PSIIMQSIGSVMTFGMNTILITFSTTATAVFGVYFKLQSFVFMPVFGLNNGMIPIIAYNL GAKQKKRMFDTIKLAMIYATGMMIIGVIFFETIPQTLLGFFNASEAMIKIGTPALRIIAI HFIFAGFSIVCSATFQAVGKGTYSLLTSLIRQLLVLLPCAYVLSLTGNLDLIWLCFPIAE IFSAVTSFILMKRTRRHLEF >gi|223714210|gb|ACDT01000005.1| GENE 60 63390 - 63974 785 194 aa, chain - ## HITS:1 COG:no KEGG:BDI_1616 NR:ns ## KEGG: BDI_1616 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 180 6 194 215 150 43.0 2e-35 MPRIITIAREYGSGGRLIAQKVAQKVGLVYYDNEVIDLAAREMGMDVDAIRKVAEQKSSS FMYTMSSSAFSLPLNDQVFVMQSKIIRHLANHDSCIIVNGCADYILEDYDDVLSIFIHAP LESRIRRVKEDYQEVHDDYKKYVTKRDKGRSNYYNYYTTKKWGHLKNFDLTINSDLGIDE VATIIADLFLKGEK >gi|223714210|gb|ACDT01000005.1| GENE 61 64117 - 64866 808 249 aa, chain + ## HITS:1 COG:lin0319 KEGG:ns NR:ns ## COG: lin0319 COG1235 # Protein_GI_number: 16799396 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Listeria innocua # 1 215 15 241 276 143 35.0 4e-34 MEFHILASGSKGNSTFIYDNGVGILIDCGIARKQLLYKLGNLGFQESNINYVLLTHDHYD HNKNISIFDQRICFCGKGCIKGIDSSHEIKPYKSFQLAHFEIMPLSISHDATSPLAFIIT GVNESILYMTDTGYVSQKNRKYLNNLDYYIIESNHDVEMLMATKRPYFLKQRIHGDLGHL 
NNEYSAKMMVELIGDKTKEIVLAHLSEEANTKEKALETYRKIFNQNNLEFDNIKVASQID VVSGGNYED >gi|223714210|gb|ACDT01000005.1| GENE 62 64835 - 65335 482 166 aa, chain + ## HITS:1 COG:lin0321 KEGG:ns NR:ns ## COG: lin0321 COG1576 # Protein_GI_number: 16799398 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 8 166 1 159 159 158 54.0 5e-39 MWSAEVIMKIKIVCVGRLKEKYLVAGMQEYLKRLQAYCKVEVYEVADKSIPDNCSLANET LIKAKEGRKMLDKIKQDDYVILLDVSGTQIDSVELSNKMEKAMISGKSTIAFVIGGSLGH GEEMLDRANFRLSFSKMTLPHQLMRLVLVEQIYRAFKIMKNETYHK >gi|223714210|gb|ACDT01000005.1| GENE 63 65433 - 67565 1469 710 aa, chain - ## HITS:1 COG:slr1305_3 KEGG:ns NR:ns ## COG: slr1305_3 COG2199 # Protein_GI_number: 16329450 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Synechocystis # 537 710 2 177 188 122 35.0 3e-27 MENLNIREDVSDIVNKTELLADLQIGFWIIEYPNDGIPKMYGDTIMYSLLGGNSNTLTPE ELYSFCMERIDSEYIGLVFDAIENIKKNKRTEIRYPWRHEEKGWTWVRCGGFLDKSYKNG IRFKGYYHDITEELEMDIKNNEHKIIDFKKLRLYSPYFIENCEGLFEIDIDSLEIKTIFY KKKKYHKIDDGDNVFSVIRKWTHPNDITRLDNIFVSASLEQIINKKKIKQIEFKTKTIFG DYTWVEARIFCAQVMGIKKLLFYIYDISDRKTISSLVKEKEEILDAFFNLYSTIVEVDLA TEKLHILKSVLPTNNDHDQKIFSLKQFLLLLEKRLIINFEKKELENFLTIDNLRLLSKTQ KTSHLDLRFQKELENYEWVQLKTLYLPKNNEKIYLVFSNVDREHIINSITEKFVYKKGDY LYYLDTKNGYFSNFSITDNGTVIPPQEGNNYTQAMIKYTRQNTVLEDQDRVIEQLTPEYI LNRLQEQSVYKIELGMLDENKEYRRKLITVQYYDEENKIVLIRRSDITKEYLKQKNQENK LKAIKKAADTDALTGIYNRIGAQRLIVEYLHKSNNEMSAFIIIDLDNFKKINDSLGHLQG DEVLKQVANILKENFRKTDIIARLGGDEFIVFMKNIVQKENAIISLNNLLKKLRLSYQWH EETILITGSIGVAVAPIDGATFEVLYKNADQALYNSKYSGKNGFSFYNEH >gi|223714210|gb|ACDT01000005.1| GENE 64 68317 - 68505 126 62 aa, chain + ## HITS:1 COG:no KEGG:Ccur_10260 NR:ns ## KEGG: Ccur_10260 # Name: not_defined # Def: cupin domain-containing protein # Organism: C.curtum # Pathway: not_defined # 1 62 33 94 94 72 56.0 5e-12 MHKHETGNNINYIVSGMGKAICDDDEESLFAGICHICKEDSKHSIINIGTEDFAMITIVV EQ >gi|223714210|gb|ACDT01000005.1| GENE 65 68762 - 70072 1067 436 aa, chain + ## HITS:1 
COG:CAC1028 KEGG:ns NR:ns ## COG: CAC1028 COG1075 # Protein_GI_number: 15894315 # Func_class: R General function prediction only # Function: Predicted acetyltransferases and hydrolases with the alpha/beta hydrolase fold # Organism: Clostridium acetobutylicum # 70 436 115 479 479 273 43.0 4e-73 MKYIIRLVSATIVFVLANRLLLGFSNVLLIGLLVGIFIVINAIPYVPYPRPLLKRYRICK AGCELLKIFLISLIMTIVYMLYSFFKVDIGAWLTNLLVVIIVEAVVFWNGIIRIYLTANQ LGAKWRIIGVVWGWFPLVNLIILRKLIKIADDEIIFENEKTILNGVRKEQQLCKTKYPLL LVHGVFFRDFKYFNYWGRIPEELEQNGAVIYYGNHQSAASVIASGDELAKRIKEIVQESG CEKLNIIAHSKGGLDCRYAIAKLNIAPYVASLTTINTPHRGCIFADYLLEKIPEKIKQKV AQGYNTALKKFGDENPDFIAAVNDLTASTCQKFNADVPDVAEVFYQSVGSKLNVASGGRF PLNFSHHLVKYFDGPNDGLVAENSFPWGCDYTFLTTNTKRGISHGDMIDLNRENIAEFDV REFYVQLVNMLKVHGY >gi|223714210|gb|ACDT01000005.1| GENE 66 70144 - 70422 355 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754807|ref|ZP_02426934.1| ## NR: gi|167754807|ref|ZP_02426934.1| hypothetical protein CLORAM_00311 [Clostridium ramosum DSM 1402] # 1 92 11 102 102 145 100.0 7e-34 MEVALVIGMFVCTALALNFLVNGRRKLSGTFEIINLCLIVATAYQFKLVSSIIWVLVILA ILLIAGLIFNKKRKDKKVANEIIEARIVDKGE >gi|223714210|gb|ACDT01000005.1| GENE 67 70424 - 70906 527 160 aa, chain + ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Clostridium acetobutylicum # 1 158 1 158 174 172 56.0 2e-43 METVNIIIEMEDGGVMKGELYPEIAPISVENFLKLIDQDFFAGLIFHRVIPGFMIQGGGY DKDFNHHEADSIKGEFRSNGVANDLLHTRGVLSMARTMVPDSASSQFFIMHADAPHLDGD YAAFGKITEGLEVVDQIANAPRNIQDCPNQPIVIKTIKRG >gi|223714210|gb|ACDT01000005.1| GENE 68 70911 - 71165 366 84 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0868_0924 NR:ns ## KEGG: HMPREF0868_0924 # Name: not_defined # Def: glutaredoxin-like protein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 3 82 6 85 85 78 43.0 8e-14 
MKLFILENCPHCRNARRWIKELQDENPEYNQIEIELIDEQINSELADQYDYYYVPALFDG NTKLHEGVASKDGLKKIFDQYLGH >gi|223714210|gb|ACDT01000005.1| GENE 69 71263 - 72453 862 396 aa, chain - ## HITS:1 COG:PAB2227 KEGG:ns NR:ns ## COG: PAB2227 COG1167 # Protein_GI_number: 14520410 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Pyrococcus abyssi # 23 391 31 405 410 282 39.0 8e-76 MKTFNYSTRSTRPDFQGDFLDMMLQATNKPNLISFAGGLPNPISFPTQEMAAATQKVLDQ HGTIAMQYSIPEGYLPLRQFIAHQYLKQGIDISANDIIITNGSQQALDILSAVLIDEGDK ILIEKPSYLAALQVFHLYNPKINSVTLNHHGIDLQELTAYLEQKPKFFYAIPTFQNPTGL TYDNETRIALANLLKKTNTIFIEDNPYGDLRFNDETFLPIYPLLKDQTILLGTFSKTVSP GMRIGWIACSNSYLKQKLLAYKQIVDLHTNIFGQMVLHQYLCDNSLEQHLEKIKKLYKEQ AEQMITSIQKYFPAEVKHTIPQGGMFLWVTLPQKLTAVALANEAIKHDIAITAGDPFYEE ERNVSTLRLNYTNCDLKTIDQGIKILGQIIKNLLAS >gi|223714210|gb|ACDT01000005.1| GENE 70 72809 - 73075 181 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237735462|ref|ZP_04565943.1| ## NR: gi|237735462|ref|ZP_04565943.1| predicted protein [Mollicutes bacterium D7] # 1 88 1 88 88 157 100.0 2e-37 MGIVVPLSKPNVYANDNICPAAIENNNYSSTRETIYYNDNDFSSRYKVLKNGKKYYMKIL DIWEIVINSKGYLCNSFMIRFQKRHSYQ >gi|223714210|gb|ACDT01000005.1| GENE 71 73407 - 74021 540 204 aa, chain + ## HITS:1 COG:lin2189 KEGG:ns NR:ns ## COG: lin2189 COG4832 # Protein_GI_number: 16801254 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 3 199 5 207 208 101 30.0 1e-21 MAKFDIKKEYPKLYRATTKKISSLTIPMIKYIAIDGIGNPTVPEFKNKSKLLFEFNKALK EYYCQQEIVFSGSKLEGIWDTYDNSHFDVTRKKMIKYTLMMPQPPILTDHILEEVKNEVL TKTGNCLALDVYLKEFEEGKCIQMLHIGPYNTEINSTKQIMEYITVANLKLSGFHHEIYL NKPEKVAPEDLKTIVRYPVEEVFL >gi|223714210|gb|ACDT01000005.1| GENE 72 74018 - 74569 716 183 aa, chain + ## HITS:1 COG:CAC2749 KEGG:ns NR:ns ## COG: CAC2749 COG0622 # Protein_GI_number: 15896006 # Func_class: R General function prediction 
only # Function: Predicted phosphoesterase # Organism: Clostridium acetobutylicum # 1 178 1 179 180 197 54.0 1e-50 MKLMFISDIHGSSLYAQKAIDTYKQEKADKLIILGDILYHGPRNDLPEEYAPKKVISLLN AYKKDIIAVRGNCDAEVDQMVLDFPIRSDFATVDVDGHHFFLTHGHLFDEDNLPGLNDGD IFVYGHIHKPVAKEINGIYIINPSSISLPKEGKNSYGIYEKDTFMIKDFDQTVVKKIDFR KSQ >gi|223714210|gb|ACDT01000005.1| GENE 73 74711 - 75013 534 100 aa, chain + ## HITS:1 COG:SP2023 KEGG:ns NR:ns ## COG: SP2023 COG1440 # Protein_GI_number: 15901844 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Streptococcus pneumoniae TIGR4 # 1 97 1 98 102 93 53.0 8e-20 MYKILLVCNAGMSTSMLVQRMEKAATEKGIEAEIIALPITDALSKMDDWDVVMLGPQVRH ELKGLRTKTETPIEVIEMRDYGMMNGEKVLEAAIKVIDAK >gi|223714210|gb|ACDT01000005.1| GENE 74 75162 - 75878 685 238 aa, chain + ## HITS:1 COG:lin0390 KEGG:ns NR:ns ## COG: lin0390 COG2188 # Protein_GI_number: 16799467 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 1 236 4 235 237 177 44.0 2e-44 MVKYIDIADDIRSKIIEEKYTYGQKLPYEYVLCVTYHCNKETMKKALDILVKEGLIVRRR GAGTFVKDYNPSMESPNKSNYFTKGLTERFAGIKKIDSEIIAFEVIPSDETISKKLQIEE GSFVYHIIRFRKLDDKPYSLEIIYMPISIIPNLKVDHLKTSIYQYIENDLKLKIQSAHKT VRGHLSSQLEQDYLGLKETEPYFEVEQVAYLSSGVIFEYSFSRFHYNDFELQTVTVAM >gi|223714210|gb|ACDT01000005.1| GENE 75 76099 - 77457 1676 452 aa, chain + ## HITS:1 COG:lin2906 KEGG:ns NR:ns ## COG: lin2906 COG1455 # Protein_GI_number: 16801965 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 4 448 11 439 450 207 32.0 3e-53 MDKMADVLGRVGAWCGQNKYLSAIKNSFQTFMPLTIAGAIGVLWCNVLVNADSGLGMFWE PIMALEVINPAFAAMQFATISCITVGITFGIAQEIGESNGETGYFAGLLGLACWLSVTQS GWANYALVDTAKQTLSLTADGALQTFTGVAGGALGATGLFTGMIVGVVSVEIFCALRKVE GLKLKMPETVPPGVARAFEVLIPAVITLAIIALIGRGCELATGLYLNDVISTYIQGPLGA IGATVPGVIIIYIIIMLFWLVGIHGNNMLSAVKEALFTPLMLENIETFSKTNDAKSDELH 
IFAMAWLQMFGEFGGSGVTIGLVIAIMIFSKREDNRTIAGISLVPGLFNINETVTFGIPM VLNPILGIPFVLAPIATLAVGYILTVIGFCPKAVINTPWTTPPILHGFLTTGANIMGAVS QAIAIVVSILVYVPFLIAYERYQNKQAAEAAE >gi|223714210|gb|ACDT01000005.1| GENE 76 77613 - 78119 528 168 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757465|ref|ZP_02429592.1| ## NR: gi|167757465|ref|ZP_02429592.1| hypothetical protein CLORAM_03015 [Clostridium ramosum DSM 1402] # 1 168 1 168 168 311 100.0 1e-83 MKEKEFKIKLNPAILYFEIFYMVYVVYLFIIGQMVPAIVCAICGLLIIAYFSLWRPYKYT INRRTLIINRRVGKDKEINIMTCETICDPVPKMTKLITSPRALEIYAENKKRFVVTPKDR MGFIEAIVAANKRIHVQCTEYAATHRSYEKKRKRALKEEAKKANMDAE >gi|223714210|gb|ACDT01000005.1| GENE 77 78345 - 80423 2452 692 aa, chain + ## HITS:1 COG:CAC2658 KEGG:ns NR:ns ## COG: CAC2658 COG3968 # Protein_GI_number: 15895916 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Clostridium acetobutylicum # 3 692 1 696 696 766 55.0 0 MEMNKLLEDYGCLAFSDDVMKERIPKSIYKAFHESLDNGEELSKECATVIANAMKIWALE NGATHFTHWFMPMTGLTAEKHDAFLEPDGCKAVLEFSGKTLRKGEPDASSFPSGGLRATF EARGYTAWDCTSPAFVKDGSLYIPTLFCSYTGEALDKKTPLLRSCDALSKAACRLLPLLG EKGITKVTASVGAEQEYFLVDDKYYQERMDLKLTGRTLFGAMAPKGQELEDHYFGSLKRK VSAFMKDLDHELWKYGIPSKTKHNEVAPAQHEVACVYSKVNITTDNNHLLMQIMQDIAKK HGLRCLLHEKPFAGVNGSGKHDNWSVITNTGINLFNPGANPAENKPFIATLACTIKAVDD YADLLRMSIASAGNDHRLGANEAPPAIISMFLGEELDALLAEICEGKKTSKSDAARFATG VSVVPTFSKDNTDRNRTSPFAFTGNKFEFRAVGSSQSVAGPNTILNAILADAMEKMADEI ESGKSFEDVVKEFVLAHKRIIFNGDGYSAEWEEEAAKRGLPNNKNTVDALKCLKEEKNLE MLDRLGVYSRVELGSRYEILLENYIKTIQVEGLTALKMAKSQIYPAVCDYLSKVSSEVIA AKEAGLDVDFLVDDANALAKLVKTMKEQMTTLETNIAAAQASEEEIFEQAVAWRDDVFAM MQALRETVDQLEESIDAEYWPMPTYLDLLFGI >gi|223714210|gb|ACDT01000005.1| GENE 78 80699 - 80884 133 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757461|ref|ZP_02429588.1| ## NR: gi|167757461|ref|ZP_02429588.1| hypothetical protein CLORAM_03011 [Clostridium ramosum DSM 1402] # 1 61 1 61 61 88 100.0 2e-16 MNRLKVLKLSQFGLLMLKTIDPAKNAISTIKRLIATGDDIIIYRLVLVPSLAILVSMKRK H 
>gi|223714210|gb|ACDT01000005.1| GENE 79 81032 - 81589 632 185 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735472|ref|ZP_04565953.1| ## NR: gi|237735472|ref|ZP_04565953.1| predicted protein [Mollicutes bacterium D7] # 1 185 1 185 185 323 100.0 2e-87 MKRDVLNKISRVINYEESERARLNVYVDQLGDQQEEYLLIGEELILRLDNKIRKITFDEI KEINVSMCSRLYNPGIVESNLEKLIIKRWTYGSTLAVGQHLNYYVDLDIILDDEKIMIEA YSLKNIVSIIEILSNKGITINDPLKIKDILTKGLDDTALEKYLDNHFPKLAKEYNLDNPR GIIIP >gi|223714210|gb|ACDT01000005.1| GENE 80 81753 - 82619 759 288 aa, chain + ## HITS:1 COG:CAC0191 KEGG:ns NR:ns ## COG: CAC0191 COG1737 # Protein_GI_number: 15893484 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 8 259 7 258 283 123 33.0 5e-28 MLSKVNFRIKTIYNILRPSEKKVADYILNYCGKLEDLSMTLLAKEVAVSQPTVMRFVKAI GYDSFKQFKLELAKNYDKENNIDILYGFSINTDDKIMDLPGKIVATATSMLENALKSLSL VNYQKTVALINQSQHISIYAVENSMGVAHDLMTKLIYLGKSVTCHSDYYLQSIDASNLTR DSLAIGISYSGNSKNTVEVLKIAKDAGAQTIAIVNFENTMLTRYGDIVLSTSNDQFLYGD AIFSRTAQIALVDMIYMGVIISDYDNYTKKLDHYSRLIKHRGYQKEEL >gi|223714210|gb|ACDT01000005.1| GENE 81 82744 - 83493 350 249 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 237 1 245 245 139 34 6e-32 MLKLERIVKSFDGINILNNLSLEIPKGQIVSILGPSGSGKTTLLNLILGISEVDSGRIIF ENEDLTYVPMEKRGFNIVFQDYALFPNLNAYENITYGLKNKPDISTKEEVDELIHLLGLE KHLDKKIDQLSGGQKQRVALARTMVMKPKILLLDEPLSALDGVIKESIKERIKIIAKEYN LTTIIVTHDPEEALTLSDQVLIINEGKISQFGKPEEIINHPSCDFVQDFILNQLEIKRRN IMTLFSPVV >gi|223714210|gb|ACDT01000005.1| GENE 82 83498 - 85129 1222 543 aa, chain + ## HITS:1 COG:SMb21542 KEGG:ns NR:ns ## COG: SMb21542 COG1178 # Protein_GI_number: 16264731 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Sinorhizobium meliloti # 42 534 164 658 680 205 28.0 2e-52 MNKNREIKFIYIIVIAVFGIFLLLPIGSLLIQSFYNNGGLTLNNYLQTYQTTGFMHALKN SFVVSGVSALVTVILAFIVAYTLNYTNMFKGLKRIINNVSMLPMLLPTITYGFAIIYSFG 
KQGLWTKLFGFQLFDIYGFKGLMLGYVIYTFPIAFMLINNAISYIDKKFIIVSKLMKDSE FRTLMITLIRPLLGTLVAAFIQSFFLSFTDFGIPASVGGKYDVVASVLYNKMLGSIPNFA GGAVVAMTMLIPSIISIILLHRLEKYNIRYNKISTIEIKQNRFRDWFWGTLSITINVIII AVFIVIIIVPFVGEWPYRISFTFQNVINVFSDATLLGVIKNSLITAIFTALLGTLVVYGA ALASARSTLNKFLKKIIESIALATNTIPGMVLGIAYLLMFSGTVLQNTYIIIILCNIVHF FSSPYLMMKNSLTKMNGSFETTAKLMGDSWFKTVIRVITPNAISSILEVFSYYFINAMVT VSAVIFIAGARTMVMTTKIKELQHFAKFNEIFVLSILILLINLLAKGVFSYFSKQRRERK TLK >gi|223714210|gb|ACDT01000005.1| GENE 83 85126 - 86106 1207 326 aa, chain + ## HITS:1 COG:ECs0415 KEGG:ns NR:ns ## COG: ECs0415 COG1840 # Protein_GI_number: 15829669 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 243 1 249 343 92 28.0 9e-19 MKKKLLKGLVVSSLLCASLLTGCSSSNDQVVIYSNADDEAVTAMKNALDKNGYEGQYLFQ TFGTSELGGKLLAEGTNIEADLVTMSSFYLDSAQEQKKMFVDLGFDTQALDEYPSYYTPI TSQEGAIIINTEELKANSLETPTCLKDLAKPEYKDMISVTDIASSSTAWLLIQALVSEYG EDGAKEVLKGIYDNAGAHIEDSGSGPIKKVRAGEVAIGFGLRHQAVRDKADGLPIDFVDP TEGNFSLTESVAVVNKDGSEHQKLAMKMAQCIIESGRKELLETYPNPLYQGETSDSANKS AYPKTFSEKLTAELLEKHQKLSEECK >gi|223714210|gb|ACDT01000005.1| GENE 84 86116 - 87210 1384 364 aa, chain + ## HITS:1 COG:STM0431 KEGG:ns NR:ns ## COG: STM0431 COG0075 # Protein_GI_number: 16763811 # Func_class: E Amino acid transport and metabolism # Function: Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase # Organism: Salmonella typhimurium LT2 # 2 357 4 360 367 383 51.0 1e-106 MKDYKLLTPGPLTTTATVKQEMLFDHCTWDDDYKKITLKIRDQLLELAHVRAEQYTVVLM QGSGTFGVESVITSVIGENEKLLIVANGAYGKRMKDICEHARINHEIIEFAENENPSAVA VAEKLDEEKEITHVAIVHSETTSGILNDIESVAKVVKERNKVFIVDAMSSFGGVDIEVGK LGIDFIISSANKCIQGVPGFSFVIANKNLLLASRGKARSLSLDLYDQWETMDKDGKWRFT SPTHVVLAFSKALDEMLEEGGIAARSKRYYDNNRLLIEKMAEMGMKSYVDLAYQGPIITT FYYPEGKEFAFSEMYDYIKARGYAIYPGKLTTAQTFRIGNIGEIYEADILNLCSIIKDFL AEVA >gi|223714210|gb|ACDT01000005.1| GENE 85 87210 - 87995 995 261 aa, chain + ## HITS:1 COG:PA1311 KEGG:ns NR:ns ## COG: PA1311 COG0637 # 
Protein_GI_number: 15596508 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Pseudomonas aeruginosa # 4 258 9 262 275 176 35.0 4e-44 MSRIEAVIFDWAGTTVDYGCFAPVQAFKDAFNNYGLEPTNDQIREPMGMLKIDHIKTMLE MPVLNEAFKARYDREFNEQDIHKIYDLFEASLMTGITKHTDVKPYVLETVAKLRELGIKI GSTTGYTDMMMEPVLKSAKEQGYQPDCWYSPDATNHFGRPYPYMIFKNMIELHISSVKNV IKVGDTISDIQEGVNAGVIVVGVIEGSSTLGLNEEEFNALTPKERNRAIERVKEAYLDAG ADYVINNLSELIALIQEVELF >gi|223714210|gb|ACDT01000005.1| GENE 86 88107 - 88748 712 213 aa, chain + ## HITS:1 COG:CAC0198 KEGG:ns NR:ns ## COG: CAC0198 COG2364 # Protein_GI_number: 15893491 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 198 6 203 227 64 27.0 1e-10 MFKRVIVLVFGVVVVACSSALTLKAAIGVGAWDALAQTGSLVTGIEVGTIGMFCNFLCIF VQVAILRRKFKPIQLLQIPVCILLGLIVNFMLYEVYSNFTINTYWMNVGLLILGYIVSAF AVGMVMVLDIVTFALEGACMAVSGVTGKKFHVLRQGVDVVSILLVIIIVIITDVPLAVRE GTIIGMFLFGPMMGIFMKLQKPVFKKFGLIDYN >gi|223714210|gb|ACDT01000005.1| GENE 87 88858 - 89286 484 142 aa, chain - ## HITS:1 COG:no KEGG:DSY1202 NR:ns ## KEGG: DSY1202 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 142 4 145 145 152 57.0 4e-36 MNDNVKLLNFIYQNSQMGVETIEQLEKIVEDKDFKRYLKEKYEGYCKIHKDAKEKLNSHG YDERGIGSFEKIRTYLMINMQTLTDKSTSHIAEMMMIGSTMGIINAIRNIADYNHAKKDI IKLMETLKAFEEKSYGDLQKFI >gi|223714210|gb|ACDT01000005.1| GENE 88 89386 - 89685 487 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757451|ref|ZP_02429578.1| ## NR: gi|167757451|ref|ZP_02429578.1| hypothetical protein CLORAM_03001 [Clostridium ramosum DSM 1402] # 1 99 1 99 99 152 100.0 4e-36 MYIPNLDPYSFTGTAVILGYLLTNDFTTSEQAALGAWFNVVGDILASNSSWSAVLEERST PPSDDDNDDHQNDLDVLNDAIDKLKESIEKLQNEKKSND >gi|223714210|gb|ACDT01000005.1| GENE 89 89706 - 90131 340 141 aa, chain - ## HITS:1 COG:BH3707 KEGG:ns NR:ns ## COG: BH3707 COG5652 # Protein_GI_number: 15616269 # Func_class: S Function unknown # Function: Predicted integral membrane 
protein # Organism: Bacillus halodurans # 1 133 1 141 146 87 39.0 8e-18 MKKLKYFIPALIWMIFIFIMSHTNGNDSSNQSNFIAEIILKVINIDRDTLTFIIRKAAHM SEYAILLLLLYYALSNVISKHTLSLSLLVTFIYACSDEFHQLFIPGRSGQFKDVLIDTSG ALIMLMIIFLWQRKKKSKLIK >gi|223714210|gb|ACDT01000005.1| GENE 90 90456 - 91826 1360 456 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 4 454 3 455 456 528 55 1e-149 MQFITNLLTKIDELVWGLPLICLLLGTGIYFTCKLKLLQLTKLKLAFSCIFKKQDNDEGD VSSFQALCTALSSTIGTGNIVGVATAIAAGGPGALFWMWVSAFFGMATKYAEGLLAIRYR IKDENNQMAGGPMYYLERGLGSKWLAKIFAFFGVAVALLGIGTFTQVKSISDAMELSFNV PPIVTAVLLTITVGFITIGGIKRIANVAEKVIPMMCILYIGGVVIILITHFNLIPQTVVL VVHSAFNPQAALGGGIGITMTMAMQKGISRGIFSNESGLGSAPIAAAAAKTDSCVEQGLV SMTGTFIDTVVICTMTGIAILLTNSHTSGLEGAAMTTQAFSNGLFIPMIGKYIVNIGLIF FAFTTIIGWNYYGERCMYYLKGLKAIPYYKFIYIVFIAIGPFMSLEFIFILADIVNGCMA IPNLIGLIGLRKEVIVQTQAYFTDEEVINQDELVYD >gi|223714210|gb|ACDT01000005.1| GENE 91 92047 - 93576 240 509 aa, chain - ## HITS:1 COG:no KEGG:SPH_0571 NR:ns ## KEGG: SPH_0571 # Name: not_defined # Def: transcriptional regulator, putative # Organism: S.pneumoniae_Hungary19A_6 # Pathway: not_defined # 1 427 1 425 509 182 26.0 4e-44 MLDYFLENTTKKKLYLFSILYLKRVTSVKECKYILTFSSTSIISIINELNFDFNGIAEIN ITNSLELKLIVYEDITFSDLLHAIYQSSNVLHCIKFMITNESNHSFLVFAEDNYLAKSSA YRIRQKCIKYIRSIGLDIKKNKVIGEEYRIRFLIALLYYKCGIDCCDIDEYSIKLIKEFI ISTNQSITFEFLNNATNEYDYFECLMSLLWKRKDHNDNLSIPKELEKFKEIFIYKDLKKY LHNVIGNQLNIDFSKKDYDYMYLVYFCTNNFLFANQWTEERKEHIYKIFLSNQKFYDLYI RLSHKYGKFLEQSHIFKTILIYFFKDHLFQLQCLIPNQRLYIDTTPTTTYITNNDVSGII NSWKKANNILYPTDKRSIFYLTTQIEFILRQLVPPVTIILLSDLMSEIEMINLFLEKNFS RNRINITPLLLNSYNLEYLNAEKNRIIIITRGIEYLLKSYNFFQNNTIVIIDSRTTLYEK EKISNAIANYEEKQFLKICAGTSLTDTEY >gi|223714210|gb|ACDT01000005.1| GENE 92 93853 - 94062 184 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757447|ref|ZP_02429574.1| ## NR: gi|167757447|ref|ZP_02429574.1| hypothetical protein CLORAM_02997 [Clostridium ramosum DSM 1402] # 1 69 1 69 69 107 100.0 2e-22 
MHRDIFKDVVDLVCCNYISDLRFLVKEVYQKIHKINFKEYDICELQKFFSYVFNIEISNY EDIEKFLNN >gi|223714210|gb|ACDT01000005.1| GENE 93 95106 - 96347 1485 413 aa, chain + ## HITS:1 COG:L0236 KEGG:ns NR:ns ## COG: L0236 COG0004 # Protein_GI_number: 15673574 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Lactococcus lactis # 2 413 1 412 413 374 55.0 1e-103 MIDSGDTGFMIICTALVLLMTPGLAFFYGGMVRRKNVVNTMMTSIFVIGIAMVMWILFGY SLAFGNDHLGIIGDLSFLGLSNVGTTPGNYASTIPDLGFAAFQMMFAIITPALITGAVAG RMKFKALFIFIIVWSTIVYYPMAHMVWGLGGFLAEIGSVDFAGGNVVHISSGVSALVLCI ILGKRKDYDHAKYRIHNIPFVVLGASLLLFGWFGFNAGSSLKADGLAIHAFMTTAVAAAS GMLSWMLMDVWKINKPTLVGSVTGLVVGLVAITPGAGFVPIWSAVIIGFVGSPICYFMIS KAKQYFGYDDALDAFGCHGIGGIWGGIATGLFAQTSINDVARWNGLFFGDTNLLIAQIIS ILLTIVVAVIGTLICIGVVRLFTPLRVDKRDEQMGLDMSEHNETAYPSFNGLE >gi|223714210|gb|ACDT01000005.1| GENE 94 96361 - 96705 470 114 aa, chain + ## HITS:1 COG:AGc3252 KEGG:ns NR:ns ## COG: AGc3252 COG0347 # Protein_GI_number: 15889072 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 113 46 157 157 94 47.0 5e-20 MIKIEAFVREDKFEDVKEALAKIEVHGITVYQVMGCGIQKGYREVIRGNEVEINMLPKIK FEIIVSDEVWEKKTIKAIQTAAFTGNPGDGKIISYDLKSALRIRTGETDKDAIN >gi|223714210|gb|ACDT01000005.1| GENE 95 96745 - 98418 1403 557 aa, chain - ## HITS:1 COG:SP0496 KEGG:ns NR:ns ## COG: SP0496 COG1283 # Protein_GI_number: 15900410 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Streptococcus pneumoniae TIGR4 # 8 540 8 535 543 323 36.0 7e-88 MDESLITIIGFIIGGLSIFIYGINLMSDGLKSIAGYHIREYIEKYTSNLFTSILVGTLIS AILHSSSAVTVISISLVRAGLMKLEQAIGITIGANIGTCVTSIMIGLNVEQFAYYLVFIG VVIMFLAKKKTIGYIGKVVLGFGLIFVGLQIMSDQMIAFAKEPWFTSIMTTLGKKPWWAL LGGTIATGIMQSSTAVIGVVQKLFTTGSITPAAGAAFIFGANIGTCITAVIAALGGSIST KRAAWFHVVYNLAGAIIGMLVLKPFVNLTNWVNLLLNGSQEMYIAQAHFIFNIASTILVI PFVRYCVILLEKLIPGNDHHDTLIENIDELDDTLIDKLPVGALAVAKKNTLRMGRNVIAN 
IKLSYTYLISKNSEDYDEIIEIEALIDKYDSRLSKYLLKIAQQPTLAKKQTNEYFKNYQI IKNLERIGDIVSNLANYYKLVYDEKGKFSNEAIDDLNKMYKLALEMVQDALDIYEHDNAN KLLSSLNKKEHTLNALEIECRQNHLVRMRDGICEDNIAASIIIDIISSIERIGDIALNTA NSTITVYKDHKVKYLNV >gi|223714210|gb|ACDT01000005.1| GENE 96 98560 - 98862 391 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757442|ref|ZP_02429569.1| ## NR: gi|167757442|ref|ZP_02429569.1| hypothetical protein CLORAM_02992 [Clostridium ramosum DSM 1402] # 1 100 1 100 100 192 100.0 4e-48 MLARLYVIIESEDIKDYQMIKDLLLQINPNFSISPSREYAGKKDCSEFFVTVNLDVTEVP LLLERLNNDWDGEIDDCSCYGFNTRMFHDLVYYLEFALFE >gi|223714210|gb|ACDT01000005.1| GENE 97 98985 - 99287 234 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757441|ref|ZP_02429568.1| ## NR: gi|167757441|ref|ZP_02429568.1| hypothetical protein CLORAM_02991 [Clostridium ramosum DSM 1402] # 1 100 21 120 120 142 100.0 6e-33 MEKLIIYKKQIITTIIIIVISVICFGVYILHTTLNNINYSKSEAHVVALKQFPGTIVSSQ IEYEHMQIFYHLEIENQQQELIEINISAKSGKIIGFEYKE >gi|223714210|gb|ACDT01000005.1| GENE 98 99293 - 99946 949 217 aa, chain + ## HITS:1 COG:L0131 KEGG:ns NR:ns ## COG: L0131 COG0745 # Protein_GI_number: 15673576 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Lactococcus lactis # 3 214 6 224 230 205 52.0 4e-53 MAILIIEDEKNMSDLLKLELTHAGYCCDQAYDGEVGLKKALEQEYELILLDLMLPRINGI EVCRRLREIKQTPVIMLTARDEVMDKVNGLQVGADDYLAKPFAMEELLARINALLRRMNF QKALLEYSYGEVKIDPSKHKVFFKDNEVNLTTTEFDLLSLLVQNGGDVVSRNKILDQVWG YDEDVSTNVVEVYIRYLRNKIPGIKIETVRGVGYRLS >gi|223714210|gb|ACDT01000005.1| GENE 99 99943 - 100995 868 350 aa, chain + ## HITS:1 COG:lin1415 KEGG:ns NR:ns ## COG: lin1415 COG0642 # Protein_GI_number: 16800483 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 60 345 179 473 483 186 38.0 4e-47 MKIKSLRHQLVSTTSIIIMTVIVVFSLILLLYMGYYLLIIGRIYAMDEDLITPIASIVAV 
IFLILIIVALITSIACGFMISKRFLQTVDQFTKSIKQIKNEGLSHRLVIEGNDELALLGK EFNETIDQAERSLLQQNQFVSDASHELKTPLAIIKGNLDMLERWGKDDPAILSNSLNVTS NEVERLIQLCNELLHLTREMDIHCEEPTDLNLVVDEVITNFKEVHPEFEFIIKITVTSKI WMRIEHLKQLLIILIDNAIKYSREEEKKIELLYVDQKLMVKDHGIGIEADKLDYIFNRFY RADESRAQNNNNFGLGLAIAKRICSYYDYAITVESIVDQYTIFTIDFERR >gi|223714210|gb|ACDT01000005.1| GENE 100 101001 - 101519 795 172 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1349 NR:ns ## KEGG: Lebu_1349 # Name: not_defined # Def: propeptide PepSY amd peptidase M4 # Organism: L.buccalis # Pathway: not_defined # 24 168 42 199 203 64 31.0 1e-09 MKKIILFVLITIALVGCTEKVSALTLEEAQEIALKEVEGKILKAKEDKDDGVTYYDFTII TDTEKYEIEVDANSGKVLKREKDDDYIGTTINPVDGTVTPITPVNTAVSLEEAQKIAFDR VGGGYLIKTELDYDDDDGIKKYEIEIKNGNKEYELEINADTGEIIKYEEDVE Prediction of potential genes in microbial genomes Time: Thu May 26 09:07:22 2011 Seq name: gi|223714209|gb|ACDT01000006.1| Coprobacillus sp. D7 cont1.6, whole genome shotgun sequence Length of sequence - 62994 bp Number of predicted genes - 79, with homology - 75 Number of transcription units - 29, operones - 15 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 188 - 247 13.6 1 1 Tu 1 . + CDS 441 - 587 93 ## + Term 601 - 663 9.2 + Prom 622 - 681 10.1 2 2 Op 1 . + CDS 849 - 1070 249 ## gi|237735495|ref|ZP_04565976.1| predicted protein 3 2 Op 2 . + CDS 1084 - 1308 277 ## gi|167757433|ref|ZP_02429560.1| hypothetical protein CLORAM_02983 4 3 Tu 1 . - CDS 1316 - 3169 1733 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 3246 - 3305 8.5 + Prom 3361 - 3420 9.9 5 4 Op 1 1/0.500 + CDS 3449 - 5374 2205 ## COG1048 Aconitase A 6 4 Op 2 1/0.500 + CDS 5374 - 6369 1291 ## COG0473 Isocitrate/isopropylmalate dehydrogenase 7 4 Op 3 . + CDS 6373 - 7056 589 ## COG1802 Transcriptional regulators + Term 7058 - 7097 1.3 8 5 Op 1 9/0.000 + CDS 7119 - 7802 580 ## COG3279 Response regulator of the LytR/AlgR family 9 5 Op 2 . 
+ CDS 7802 - 9064 855 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Term 9260 - 9298 1.8 + Prom 9166 - 9225 9.9 10 6 Tu 1 . + CDS 9441 - 9995 444 ## gi|237735503|ref|ZP_04565984.1| predicted protein + Term 10062 - 10109 9.1 - Term 10050 - 10097 9.1 11 7 Tu 1 . - CDS 10098 - 10355 144 ## Cphy_3214 hypothetical protein 12 8 Tu 1 . - CDS 10982 - 12364 1865 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases - Prom 12387 - 12446 3.8 13 9 Tu 1 . - CDS 12540 - 13649 509 ## Sterm_3933 integrase family protein - Prom 13890 - 13949 6.1 - Term 13798 - 13841 10.1 14 10 Op 1 . - CDS 13969 - 14799 482 ## gi|237735507|ref|ZP_04565988.1| predicted protein 15 10 Op 2 . - CDS 14877 - 15929 888 ## COG4748 Uncharacterized conserved protein 16 10 Op 3 . - CDS 15974 - 16390 251 ## gi|237735509|ref|ZP_04565990.1| predicted protein 17 10 Op 4 . - CDS 16393 - 16803 410 ## EUBREC_1258 SOS-response transcriptional repressor, LexA - Prom 16860 - 16919 8.7 + Prom 16784 - 16843 8.7 18 11 Op 1 . + CDS 16937 - 17158 141 ## gi|237735511|ref|ZP_04565992.1| predicted protein 19 11 Op 2 . + CDS 17224 - 17412 168 ## gi|237735512|ref|ZP_04565993.1| predicted protein 20 11 Op 3 . + CDS 17499 - 17663 169 ## gi|237735513|ref|ZP_04565994.1| predicted protein - Term 17508 - 17560 8.7 21 12 Tu 1 . - CDS 17634 - 18035 347 ## gi|237735514|ref|ZP_04565995.1| predicted protein - Prom 18071 - 18130 7.3 + Prom 17989 - 18048 4.1 22 13 Op 1 . + CDS 18122 - 18406 459 ## gi|237735515|ref|ZP_04565996.1| predicted protein 23 13 Op 2 . + CDS 18445 - 18717 305 ## gi|237735516|ref|ZP_04565997.1| predicted protein + Term 18752 - 18793 -0.7 + Prom 18813 - 18872 5.2 24 14 Op 1 . + CDS 18973 - 19362 488 ## gi|237735517|ref|ZP_04565998.1| predicted protein 25 14 Op 2 . + CDS 19368 - 19574 94 ## gi|237735518|ref|ZP_04565999.1| predicted protein 26 14 Op 3 . + CDS 19564 - 20898 1428 ## Ccel_3310 SMC domain protein 27 14 Op 4 . 
+ CDS 20898 - 22007 899 ## GALLO_0439 hypothetical protein 28 14 Op 5 . + CDS 22021 - 22500 475 ## Ccel_3308 phage protein 29 14 Op 6 . + CDS 22500 - 24107 1172 ## COG1061 DNA or RNA helicases of superfamily II + Prom 24116 - 24175 2.9 30 14 Op 7 . + CDS 24195 - 26327 1358 ## COG3598 RecA-family ATPase + Prom 26382 - 26441 4.9 31 15 Op 1 . + CDS 26576 - 26818 157 ## gi|237735524|ref|ZP_04566005.1| predicted protein 32 15 Op 2 . + CDS 26808 - 27206 381 ## COG4570 Holliday junction resolvase 33 15 Op 3 . + CDS 27220 - 27399 202 ## gi|237735526|ref|ZP_04566007.1| predicted protein 34 15 Op 4 . + CDS 27396 - 28007 450 ## gi|237735527|ref|ZP_04566008.1| predicted protein 35 16 Op 1 . + CDS 28194 - 28367 148 ## 36 16 Op 2 . + CDS 28368 - 28685 472 ## gi|237735528|ref|ZP_04566009.1| predicted protein + Term 28727 - 28761 1.9 + Prom 28770 - 28829 4.5 37 17 Tu 1 . + CDS 28850 - 29197 342 ## gi|237735530|ref|ZP_04566011.1| predicted protein + Term 29338 - 29382 4.4 + Prom 29450 - 29509 10.8 38 18 Op 1 . + CDS 29739 - 30044 250 ## gi|237735531|ref|ZP_04566012.1| predicted protein 39 18 Op 2 . + CDS 30037 - 30537 441 ## gi|237735532|ref|ZP_04566013.1| predicted protein + Term 30545 - 30573 -0.1 + Prom 30787 - 30846 6.2 40 19 Op 1 . + CDS 30869 - 31222 278 ## Cphy_2967 hypothetical protein 41 19 Op 2 3/0.000 + CDS 31304 - 32125 787 ## COG3728 Phage terminase, small subunit 42 19 Op 3 . + CDS 32118 - 33458 1018 ## COG1783 Phage terminase large subunit 43 19 Op 4 . + CDS 33458 - 34966 1322 ## SDEG_1632 portal protein 44 19 Op 5 . + CDS 34966 - 36459 1094 ## COG5585 NAD+--asparagine ADP-ribosyltransferase 45 19 Op 6 . + CDS 36452 - 36637 300 ## gi|237735538|ref|ZP_04566019.1| predicted protein 46 19 Op 7 . + CDS 36697 - 36999 213 ## gi|237735539|ref|ZP_04566020.1| predicted protein + Prom 37020 - 37079 6.6 47 20 Tu 1 . + CDS 37103 - 37408 281 ## gi|237735540|ref|ZP_04566021.1| predicted protein + Term 37421 - 37454 3.1 + Prom 37422 - 37481 3.6 48 21 Op 1 . 
+ CDS 37560 - 38204 787 ## SDEG_1629 phage scaffold protein 49 21 Op 2 . + CDS 38221 - 39135 1144 ## MGAS10750_Spy0865 phage protein 50 21 Op 3 . + CDS 39153 - 39425 445 ## SH2353 hypothetical protein 51 21 Op 4 . + CDS 39437 - 39760 316 ## FI9785_834 hypothetical protein 52 21 Op 5 . + CDS 39757 - 40059 389 ## gi|237735545|ref|ZP_04566026.1| predicted protein 53 21 Op 6 . + CDS 40059 - 40409 479 ## Spy49_1475c hypothetical P protein 54 21 Op 7 . + CDS 40421 - 40807 336 ## gi|237735547|ref|ZP_04566028.1| predicted protein 55 21 Op 8 . + CDS 40821 - 41357 683 ## LACR_1143 hypothetical protein 56 21 Op 9 . + CDS 41402 - 41779 509 ## gi|237735549|ref|ZP_04566030.1| predicted protein + Prom 41801 - 41860 5.0 57 22 Op 1 . + CDS 41899 - 42147 142 ## gi|237735550|ref|ZP_04566031.1| predicted protein 58 22 Op 2 . + CDS 42131 - 45115 3427 ## Spy49_1468c putative minor tail protein 59 22 Op 3 . + CDS 45119 - 46654 1487 ## COG4722 Phage-related protein 60 22 Op 4 . + CDS 46654 - 48408 1223 ## L22437 prophage pi3 protein 12 61 22 Op 5 . + CDS 48422 - 49453 1084 ## Shel_15430 hypothetical protein 62 22 Op 6 . + CDS 49457 - 50149 555 ## gi|237735555|ref|ZP_04566036.1| conserved hypothetical protein 63 22 Op 7 . + CDS 50162 - 50320 123 ## 64 22 Op 8 . + CDS 50274 - 50627 259 ## gi|237735556|ref|ZP_04566037.1| predicted protein 65 22 Op 9 . + CDS 50630 - 50893 291 ## gi|167757367|ref|ZP_02429494.1| hypothetical protein CLORAM_02917 66 22 Op 10 . + CDS 50910 - 51383 486 ## gi|167757366|ref|ZP_02429493.1| hypothetical protein CLORAM_02916 67 22 Op 11 . + CDS 51385 - 51642 378 ## gi|167757365|ref|ZP_02429492.1| hypothetical protein CLORAM_02915 68 22 Op 12 . + CDS 51644 - 52222 648 ## Cphy_2944 hypothetical protein + Prom 52249 - 52308 4.0 69 23 Tu 1 . + CDS 52395 - 52577 60 ## + Prom 53034 - 53093 8.6 70 24 Tu 1 . + CDS 53150 - 53326 318 ## gi|237735562|ref|ZP_04566043.1| predicted protein + Term 53351 - 53396 5.5 + Prom 54023 - 54082 9.5 71 25 Tu 1 . 
+ CDS 54136 - 54597 492 ## gi|167757361|ref|ZP_02429488.1| hypothetical protein CLORAM_02911 + Term 54618 - 54670 7.3 72 26 Tu 1 . - CDS 54681 - 55493 1099 ## COG0005 Purine nucleoside phosphorylase - Prom 55533 - 55592 8.6 73 27 Op 1 . + CDS 55536 - 55724 170 ## gi|237735566|ref|ZP_04566047.1| predicted protein + Prom 55726 - 55785 2.2 74 27 Op 2 . + CDS 55810 - 57042 1446 ## COG1078 HD superfamily phosphohydrolases + Term 57102 - 57141 0.6 - Term 56934 - 56964 -0.9 75 28 Tu 1 . - CDS 57052 - 58929 1461 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) - Prom 59078 - 59137 9.4 + Prom 58669 - 58728 8.8 76 29 Op 1 . + CDS 58970 - 59365 388 ## gi|167757357|ref|ZP_02429484.1| hypothetical protein CLORAM_02907 77 29 Op 2 . + CDS 59367 - 60887 1316 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 78 29 Op 3 . + CDS 60959 - 62620 2085 ## COG0018 Arginyl-tRNA synthetase 79 29 Op 4 . + CDS 62635 - 62976 483 ## gi|167757353|ref|ZP_02429480.1| hypothetical protein CLORAM_02903 Predicted protein(s) >gi|223714209|gb|ACDT01000006.1| GENE 1 441 - 587 93 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MESLENIAERISTTKKSLEEHPRQNLEIAIKELLEISKELDLVKKSIK >gi|223714209|gb|ACDT01000006.1| GENE 2 849 - 1070 249 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735495|ref|ZP_04565976.1| ## NR: gi|237735495|ref|ZP_04565976.1| predicted protein [Mollicutes bacterium D7] # 1 73 1 73 73 123 100.0 4e-27 MLLKPLILGVSIDFGLLSIALTLRFIIYPLYQSWISQKNLVNEKTIVEDLEKTIEIRIEE IDGSNHYKQYLKI >gi|223714209|gb|ACDT01000006.1| GENE 3 1084 - 1308 277 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757433|ref|ZP_02429560.1| ## NR: gi|167757433|ref|ZP_02429560.1| hypothetical protein CLORAM_02983 [Clostridium ramosum DSM 1402] # 1 74 1 74 74 124 100.0 1e-27 MCSLQVDSQYFDKICNGVINHLVVCKEEGIEQGDCVSLRKLGKTNISCVVKVEYVDCEGS GVDENYCILGVKRI >gi|223714209|gb|ACDT01000006.1| GENE 4 1316 - 3169 1733 617 aa, chain - ## HITS:1 COG:CAC2714 KEGG:ns NR:ns ## COG:
CAC2714 COG0488 # Protein_GI_number: 15895971 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 616 2 637 643 481 44.0 1e-135 MLIECSNITKSFQGIPLLKDITFKIDDHDKVAIIGVNGAGKTTILKIIAGEENYDSGDLF CNKELTLGYLKQQHDLAMDKTVYDVGLDVFAPLITIENRLRQLENEMTTDHSERILNEYD RLTQSFHDKDGYSYPSKLTGVLKGLGFSEEEFNLKVSMLSGGQKTRLALAKMLLQEPKLL LLDEPTNHLDNSAINFLEGYLKNYPHAIITVSHDRYFIDQVSNKIVEVENGKSKSYKCKY NEYSILKKKQRAVDLKHYLNQQKEIKRIQESIDTLKSYNREKQVKRAESKEKQLAKIERI DKPENLPDTITINFKPKRESGFDVLKIENLAVKFDEILFKNIDIDIKKQERIALVGDNGV GKTTLFKTILDQLNAYQGKIKFGSKVDLAYYDQEHSTLAMDKTIFNEISDNFPKMTNTEI RNSLALFNFKGDDVFKEISLLSGGEKGRVVLTEILLKQANLLILDEPTNHLDIASKEVLE DALNQFEGTIFFISHDRYFINKVATRVIELTSTKAISYDGNYDTYLNHKSVAKPLKKENV DYLDMKQQNALYRKQQNKIKKIETNISILETKINTLTEQLHSEEVLNDYQKYNQINDEIS ELENQLNTLMEEWEALQ >gi|223714209|gb|ACDT01000006.1| GENE 5 3449 - 5374 2205 641 aa, chain + ## HITS:1 COG:CAC0971 KEGG:ns NR:ns ## COG: CAC0971 COG1048 # Protein_GI_number: 15894258 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Clostridium acetobutylicum # 1 636 1 639 642 737 58.0 0 MGMNLAYKILSSKLKDGNLVPGEQIGIQIDQTLTQDSTGTMAYLQLEAMNIKHVAVEKAV AYIDHNMLQTGFENMDDHEFIRSVAKKHGIVFSKPGNGVCHQLQLENFSKPGKTLVGSDS HTPTCGAMGMIAIGAGGLDVAVAMATGKYYLQCPSVVKVNLTGKKAPWVSAKDIILYILQ QLTVKGGVNKIIEYTGDGVASLSLTDRATICNMGAELGATTSVFPTDERTKEYLAQQGRV DDYIEMKADEDATYDQELDVDLSALVPMTAKPHSPDAVVPVKELERMKVNQVVIGSCTNS SFADMMKAAKILKGRKVADHVSLVIAPGSSSILAMLSQNGALADMVQAGARILECGCGPC IGMGQAPLSKGISLRTINRNFKGRSGTNDASVYLVSPEIAALSAIKGYMSEEFEDDMYLE EVPNTPFIKNGNFFIDEYDENNEVYMGPNIKPVPRGEKITDEISGKVVLKVGDNISTDHI VPSDSKLLPYRSNVPHLAKFSFSKVDPEFYDRAIANNGGFIVGGDNYGQGSSREHAALVP NYLKIKAIFAVSFARIHRSNLINNGILPLVIEAKDQDFFNDQDSYKIVNIKDVVEHNGKV KVINETTNDSIEAELTLSPREKVMINYGGLLNAIKELGGEF >gi|223714209|gb|ACDT01000006.1| GENE 6 5374 - 6369 1291 331 aa, chain + ## HITS:1 COG:CAC0972 KEGG:ns NR:ns ## COG: CAC0972 COG0473 # Protein_GI_number: 15894259 # 
Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Clostridium acetobutylicum # 5 330 9 334 334 350 54.0 2e-96 MKAVLIPGDGIGKEISKSVVDITKAMKLDIEWVEYQAGAEYAATTKEVFEPGLVDAIKEY KWALKGPTATPIGTGFRSVNVALRQQFATYANVRPIKSFKGINSRYEDIDLVMIRENTED LYKGIEYMINPNMANGIKLITREASEKICRYAFEYAKNNRRKKVTAVHKANIMKYTDGLF LEAFRDVAKDYPDIEPQEVIVDNMCMQLVIRPETFDVLVAPNLYGDIVSDLCAGLVGGLG FAPSGNIGDEFRIYEAVHGSAPDIAGKGIANPSALLLAFALMLEALGKLDDANKLREALA AVVEEGTIVTPDIGGSASTDEFTAAIIRKLG >gi|223714209|gb|ACDT01000006.1| GENE 7 6373 - 7056 589 227 aa, chain + ## HITS:1 COG:SMb20773 KEGG:ns NR:ns ## COG: SMb20773 COG1802 # Protein_GI_number: 16265213 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 13 217 22 226 228 105 31.0 8e-23 MTKNLLENTGHGSLGNKIFTVLRDKILNEEYTQGQKLNEVALSKELNISRTPIREALKQL ELEGLVKSIPNKGVYVLGFSHRDIDDMLEIRYALEGLAIQLAIERINDEELEKIKEVYDL MEFYAEKGDQEKFNEINIAFHEAIYRCTQSKYFEQLLTDINYYIHVTSRHSIRQPDRLIS AAQEHREIYEAILARDKDLAKEKIQNHIRKTQVLVRNYYEKKQQDKA >gi|223714209|gb|ACDT01000006.1| GENE 8 7119 - 7802 580 227 aa, chain + ## HITS:1 COG:BS_yccH KEGG:ns NR:ns ## COG: BS_yccH COG3279 # Protein_GI_number: 16077343 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Bacillus subtilis # 1 196 2 202 233 71 28.0 2e-12 MKIGILDDERSMLLRIRGYVEEARIKEPYTVELFQDVEEFFNCGKNFDILLLDIDMPTMS GIQVAKKLLKQNTIIIYITNYENKMQEAFDINVAGYLLKSELETKLTPTLLKAINRRQQN DDIWIKKKGEIFHFRSRNIIYIESVKRYLHFYTNDNNFYIPYMTLEEVNQYLPDNFVQIN KSQIININMVVSMNGYILKLKGIDESLEISRRRKKDVFNLLMQEIKE >gi|223714209|gb|ACDT01000006.1| GENE 9 7802 - 9064 855 420 aa, chain + ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 237 418 252 
452 452 63 30.0 7e-10 MINFFIEYLYTLIYVVALSYVISLTYPYIRKENMLRNIMVILSYTFVLNIVTYTSPHNGW LYVIGNLMCISLDFIYCGVLVKRFRLNNLFITTIYYSIQIIVGSVMIAVALEIFDNTYLD IVWGSMRVVFPLGNYISSALICKYAVQKRDFVEHQLPKKFLYSFCLINIVEVIIVLILNV LSVEQGDIYIFFILILAGVQMILVNNYIINSSRMYLRNKELELANYSYATTRRHISELEK EQERLYKFKHDINNHLQVLEEVVETESVANQYLKAIKEDLNAPVKLIRTGNIFVDACLNA KIRNNDEVNYVVDAVIDRNVNIAQDKLCSLLFNLIDNATEAALKTAEKIVEIKIFSKGNM LLISIINSTNRPLNYESKKGRDHGKGLIIIKEVVETYHGTMEYFYENNKVHCDILLNVDN >gi|223714209|gb|ACDT01000006.1| GENE 10 9441 - 9995 444 184 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735503|ref|ZP_04565984.1| ## NR: gi|237735503|ref|ZP_04565984.1| predicted protein [Mollicutes bacterium D7] # 1 184 1 184 184 306 100.0 3e-82 MKRFCEKISRFLLDVDNVTEDEIDEVRFGIELILTQMILFFLLLAIGIFRNMVLETIIFI IALVGLRTFVDGYHADSFYRCMFLTTLLYLGTLYFADFENSYLLIMAILIAAKAFMWQEN HEARNVKISKLIFWGCLAVIGLIYPIEPRLSMIIGFTLFFTGVSMEVKKHGYKEKSSFIS EKSI >gi|223714209|gb|ACDT01000006.1| GENE 11 10098 - 10355 144 85 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3214 NR:ns ## KEGG: Cphy_3214 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 75 1 78 101 75 53.0 6e-13 MSRVGKCIDNGSAKRFWRIIKSEMYYLNKFNSFNELKKAIETHIDFYNNERFNNKVPMKV RLEALNGTENPAQYPTNNSLNYAMV >gi|223714209|gb|ACDT01000006.1| GENE 12 10982 - 12364 1865 460 aa, chain - ## HITS:1 COG:lin0540 KEGG:ns NR:ns ## COG: lin0540 COG1486 # Protein_GI_number: 16799615 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Listeria innocua # 3 438 2 439 440 607 65.0 1e-173 MEKKPVKIVTIGGGSSYTPELMEGFIKRYDELPIKEIWLVDIEAGKEKLEIVGAMAQRMW DASPYDVKVHLTLDRREALKDADFVTTQFRVGLLNARIKDERIPFSYGMLGQETNGAGGI FKAFRTIPVILSIVEDMKELCPNAWLVNFTNPSGMVTEAVMKYGKWDRVIGLCNVPVGAM MAEPETIGTTLDKLIYKFAGLNHFHWHRVFDLDGTDVTGKIIDAMYEGKDSGIPANIHDI PFFKEQLKQMNLIPCGYHRYYYRQQEMLAHGLEEYNGIGTRGQQVKQTEAELFELYKDPN LNYKPEQLTKRGGTHYSDAACETIASIYANKNTHLVVTTKNNGAVPDLPADCAVEVSAHI 
GSAGALAIAFGPLEPAQRGWLQCMKNMELCVEEAAVTGDYGLALQAFILNPQIPSGENAK RVLDELLIAHKKHLPQFADKIAELEAAGVTVKDEIARDLD >gi|223714209|gb|ACDT01000006.1| GENE 13 12540 - 13649 509 369 aa, chain - ## HITS:1 COG:no KEGG:Sterm_3933 NR:ns ## KEGG: Sterm_3933 # Name: not_defined # Def: integrase family protein # Organism: S.termitidis # Pathway: not_defined # 9 361 2 340 344 134 31.0 5e-30 MPRKQTFKRRPNKAGTVIKLSGKRRRPWCAKVTTGKDIITGRQIQTVLGTFETWDEADDA LTLYRLGQKNKITDQEAEALAPDTFQKLVDQREKNMPTFKEIFDIIYQEELCTLSKSAAQ GYRSWIKHFNSIYHHKISQISLADLQEIFDRDKAGYGTKVHMKVLVSKIFEYAVIHKYIN RDDDYTEYIKCGKAKESTKHYAFSNDEIRALMSDNSDTAKIILIYIFTGLRANELLNIPR KNICLNTEFPYLVSGSKTDAGKNRVIPIHTFIEPFVKQLLMKRKKRIIDCTYYQFSSMFS SFLTDHNMKHTIHDTRDTFATLCQSNNVDLFIRKRILGHKMKDITFDTYTSTVIETLYKE INKIKVPKP >gi|223714209|gb|ACDT01000006.1| GENE 14 13969 - 14799 482 276 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237735507|ref|ZP_04565988.1| ## NR: gi|237735507|ref|ZP_04565988.1| predicted protein [Mollicutes bacterium D7] # 1 276 1 276 276 508 100.0 1e-142 MKPNQEKELKLYHGNIYKYRMSESIDSLGIYMCDTNLKYKVCIIPLSEDDETGISYKIGC LDKVAYPFEFLEIDRRNIVDVLRIKGSVAKVNYKEYIELSELLLNKLSLKVLDTYSAFQA KRLSLSKENYILTEDYYKYYTWFEHKSNLEFNKNINRNPIIKKYCLYYVEIGENIGTELH KMRPAVIFKRCMASNPNDSSYIVIPITSQSTSGNYPYNTPIMVNGKVNYVRTNDVRRISV KRIVGPLYKSGTNEVLKLNEAEIQSVKENFKNYFIN >gi|223714209|gb|ACDT01000006.1| GENE 15 14877 - 15929 888 350 aa, chain - ## HITS:1 COG:lin1828 KEGG:ns NR:ns ## COG: lin1828 COG4748 # Protein_GI_number: 16800895 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 10 350 15 355 355 287 46.0 3e-77 MELQEKMYNLSERIKQLKENIQTEEATKQSFILPFFQALGYDVFNPLEFIPEFTADVGIK KHEKVDYAILQEGKPLILIEAKSCNEKLDKHDSQLFRYFGTTESKFAILTNGIIYKFYSD LDQPNVMDSQPFYVLDMMDLSDQAIQYLANFNKCNLDIDSIMNTASDLKYLSLTKTAFKE LIENPTDEFIKLLLNSGVYDGLKNQKVIDKFKPIVKRGINQYINDKMSSKFKETLSSNDD EVIEENNEPDEEVSKINTTIDELNGFAIVKAILRTEVEAKRITYKDTESYFGILLDNNIR KWICRIYINTKAKYVIISDENKKGIKHDLGTLDDLYNLSNELKDSLKKYL >gi|223714209|gb|ACDT01000006.1| GENE 16 15974 - 16390 
251 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237735509|ref|ZP_04565990.1| ## NR: gi|237735509|ref|ZP_04565990.1| predicted protein [Mollicutes bacterium D7] # 1 138 2 139 139 258 100.0 8e-68 MEVHEVIEYYNQNGLNETLDYFDIDIIHKELRGKTVESRLVIDFYGKATIFIQPDLDENY EQFLKAHELGHYLLHYQCDISFNYLTRVYKTKIEKEANSFACKLLMSDINIKEQENIDFI AMEKGIPLKVWHSVMNLI >gi|223714209|gb|ACDT01000006.1| GENE 17 16393 - 16803 410 136 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1258 NR:ns ## KEGG: EUBREC_1258 # Name: not_defined # Def: SOS-response transcriptional repressor, LexA # Organism: E.rectale # Pathway: not_defined # 5 74 4 73 213 61 41.0 8e-09 MSDSEKTKKIFAKNLLKYMERHNLNQTDISEITGVSQQSVSNWLNAKLMPRMGIIEILAE YFKILKSDLLEEKQNSTYIPDHFESVTDAMEFILRVPVVANNCGYDLDTMSEEEIIEMAE DLSEMLQIMAKKHRKK >gi|223714209|gb|ACDT01000006.1| GENE 18 16937 - 17158 141 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735511|ref|ZP_04565992.1| ## NR: gi|237735511|ref|ZP_04565992.1| predicted protein [Mollicutes bacterium D7] # 1 73 1 73 73 145 100.0 7e-34 MTKDINIPKISLKAARVNAELNQQEAAELLGIDVSTLIRWEKDPRIVKSGYHEIISQVYH YPTDYIFFGHNTS >gi|223714209|gb|ACDT01000006.1| GENE 19 17224 - 17412 168 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735512|ref|ZP_04565993.1| ## NR: gi|237735512|ref|ZP_04565993.1| predicted protein [Mollicutes bacterium D7] # 1 62 1 62 62 67 100.0 2e-10 MKIFLCMLFGIGFVLSSGVGVFLLLGIAIMEDYKSLIKPTLFAFGLSAVLVIMSFILILS TN >gi|223714209|gb|ACDT01000006.1| GENE 20 17499 - 17663 169 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735513|ref|ZP_04565994.1| ## NR: gi|237735513|ref|ZP_04565994.1| predicted protein [Mollicutes bacterium D7] # 1 54 1 54 54 74 100.0 2e-12 MKKEDIKKILEEHLQMLSKISGDESLMIENPILLQVINDGIKSISLTLLFFDKI >gi|223714209|gb|ACDT01000006.1| GENE 21 17634 - 18035 347 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237735514|ref|ZP_04565995.1| ## NR: gi|237735514|ref|ZP_04565995.1| predicted protein [Mollicutes bacterium D7] # 1 133 1 133 133 171 100.0 1e-41 
MNFENILTVLLSATIPSIITYLVTKKTCDSKIEQIKISKDSEINQLKLQHQHEIDKLESE HKHNIEQLELQHQLKKESKSNDMSSKLTEMFLKGELDMDNINNSIPKLKKMDRNVTRMKN KQETSNFIKKQKR >gi|223714209|gb|ACDT01000006.1| GENE 22 18122 - 18406 459 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735515|ref|ZP_04565996.1| ## NR: gi|237735515|ref|ZP_04565996.1| predicted protein [Mollicutes bacterium D7] # 1 94 1 94 94 181 100.0 2e-44 MEKVEALGVTFEEIKERTGLSKDFVISAVVNGSFPGSYKITECGKRYIYVPRGAFEDYMT KWHREPSEKLIDALIIEYNKSTKKGTAPTVPNKN >gi|223714209|gb|ACDT01000006.1| GENE 23 18445 - 18717 305 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735516|ref|ZP_04565997.1| ## NR: gi|237735516|ref|ZP_04565997.1| predicted protein [Mollicutes bacterium D7] # 1 90 1 90 90 164 100.0 2e-39 MESITLRQLLNIMPSDELVKPSIRLEGKNEFDNQDIFKVDELRESNILLLDYGIAISYPA EDVSVITAKDGALSINKQLYTMVVLKKGNC >gi|223714209|gb|ACDT01000006.1| GENE 24 18973 - 19362 488 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735517|ref|ZP_04565998.1| ## NR: gi|237735517|ref|ZP_04565998.1| predicted protein [Mollicutes bacterium D7] # 1 129 1 129 129 192 100.0 5e-48 MEDKIKELIESAMESGADVKVVKINSNKSESIKKLLDEIEEDMEPETLIQFEFKISPIEN SIFGGVGEAMVSYVEDISTLTKEDIIEIFKPVEGVFEKCGQEFVEKLNANKRVPSSEEVR KIMDELLGG >gi|223714209|gb|ACDT01000006.1| GENE 25 19368 - 19574 94 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735518|ref|ZP_04565999.1| ## NR: gi|237735518|ref|ZP_04565999.1| predicted protein [Mollicutes bacterium D7] # 9 68 1 60 60 67 100.0 2e-10 MSTNFNHNMFSKNLNDAYFEIIEQKREDLNIRCRVETNVNDETVKVYVIKKNKIIKIITF KGGNRNDN >gi|223714209|gb|ACDT01000006.1| GENE 26 19564 - 20898 1428 444 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3310 NR:ns ## KEGG: Ccel_3310 # Name: not_defined # Def: SMC domain protein # Organism: C.cellulolyticum # Pathway: not_defined # 2 444 3 433 433 500 63.0 1e-140 MTIKINSLELENVKRIKAVKVEPNQNGLTVIGGRNNQGKTSVLDSIAWALGGNKFKPSNA AREGSTVPPNLNITLSNGLVVERKGKNSALKITDPNGNKAGQQILNGFIEELALDLPKFM EASNKEKANILLRIIGVGEQLAKLNYEESEIYNNRLAIGRIADQKKKYAKEQVFYSEAPK 
DLISPQELINQQQAILAKNGENQRKRDKVTQIEYSVSILTEEVAALQKQLLAKQTELNKA TNDLTIAKTDALDLIDQSTEELEKNLAEIEEINRKVRANLDKDKAEEDANNYASQYNEMT VKIEEIRKQRIDLLKGADLPLPGLSVEDNELTYNGKKWDGMSGSDQLRVATAIVRKLNPD CGFVLIDKLEQMDIETMNEFGAWLEQEGLQAIATRVSTGDECSIVIEDGYVKGQMLEENP SVLSSVNGGIKEQAETPKWKAGEF >gi|223714209|gb|ACDT01000006.1| GENE 27 20898 - 22007 899 369 aa, chain + ## HITS:1 COG:no KEGG:GALLO_0439 NR:ns ## KEGG: GALLO_0439 # Name: not_defined # Def: hypothetical protein # Organism: S.gallolyticus # Pathway: not_defined # 6 369 3 369 369 392 56.0 1e-108 MNGFVITDGVINGAKKVVFYGPEGIGKSTFASKFPDPLFIDTEGSTKELDVKRLPKPTSW QMIIQEVQWIIQTKPCKTLVIDTADWAERLCVEAVCSRHGKSGVEEFGYGNGYTYVAEEW GRFLNLLQDVIDVANINVLLTAHAVIRKFEQPNEMGAYDRYELKLGKKTTAQTAPLTKEW ADIVLFANYKTFSVAADKEGKKHKAQGGQRVMYTTHHPCWDAKNRFGLPEEMPLDYTGIA HIFNGIVQNTEINTRSMESVQSQAVQPQVESNQNISKKIDQLGSELEPVIKTAEAKVMPQ SNSALPRALIDLMNKDLVTEDELKKAVASKGYYPYETPIENYDSSFIDGVLIAAWPQVFK IIDEQIRQF >gi|223714209|gb|ACDT01000006.1| GENE 28 22021 - 22500 475 159 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3308 NR:ns ## KEGG: Ccel_3308 # Name: not_defined # Def: phage protein # Organism: C.cellulolyticum # Pathway: not_defined # 12 150 4 148 160 118 45.0 7e-26 MDNNYQQQNGMERELGWDDTIQQEQEYITLPAGDYDFRVERFERGRYEGGKKIPPCNQAN LTIVIVDPASGRDVKIQHNLLLHSKLETMLSEFFRGIGQKKKDEPLRMNWQMVPGATGRC KVVPEEYNGNMYNKIKKFYPKDEVQQSFNQAPQYNPGQF >gi|223714209|gb|ACDT01000006.1| GENE 29 22500 - 24107 1172 535 aa, chain + ## HITS:1 COG:SPy0669 KEGG:ns NR:ns ## COG: SPy0669 COG1061 # Protein_GI_number: 15674735 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Streptococcus pyogenes M1 GAS # 1 535 1 527 527 761 67.0 0 MQLRPYQQEAHDSIFNEWNKGVQKTLLVLPTGCGKTIVFAEVAKDCVKDGDRVLIMAHRG ELLEQASDKIAKSTGLGCAMEKASETCIGSWFRIVVGSVQTLQRPKRMEQFPRNYFDKII IDEAHHCLSDGYQRVLEYFNTAKVLGVTATPDRGDMRNLGSYFESLAYQYTLPKAIKEGF LAPIKALTLPLKMDLSGVGVQAGDFKVSDIGTALDPYLHQITEEMKKYCMDRKTVVFLPL VKTSQKFRDILNENGFRAAEVNGDSKDRSEILKDFENDKYNVLCNSMLLTEGWDCPSVDC 
IIVLRPTKVRSLYSQMVGRGTRLCEGKDHLLLLDFLWHTERHELCHPANLICEDEEVAKT MTKNLEDKANACLPEDVLEAIDIEDAEEQASNDVVAQREESLAKLLSEMKKRKRKLVDPL QFEMSIMDQDLSGYKPLFGWEMAPASDKQIKALEKFGIFPDEIDNAGKANLLLDRLDKRR QEGLTTPKQIRFLESRGFQHVGTWSFDAASSLINRIAASGWKIPKGIDPKTYKGE >gi|223714209|gb|ACDT01000006.1| GENE 30 24195 - 26327 1358 710 aa, chain + ## HITS:1 COG:SPy0671 KEGG:ns NR:ns ## COG: SPy0671 COG3598 # Protein_GI_number: 15674737 # Func_class: L Replication, recombination and repair # Function: RecA-family ATPase # Organism: Streptococcus pyogenes M1 GAS # 2 708 31 744 757 856 60.0 0 MALKYEGYSVSEWDSWSRRDSKRYHDKECLKKWDTFTGSGVTGGTIVQYAKDQGWTPPVK DGAGHELDWDDVINAKDEKVIVDRNWLEVKEVREPRGWDPSAELITYLETLFDSTENVAY VTKSWFNEEKQKHLPTKGCCDRTAGKLIELLAKSNGDVGEVIGDYNPEIGAWIRFNPVDG NGVKNENVTDYRYALVESDSMSVDEQNAIIRELELPVACLVHSGGKSLHAIVKIEAADYR EYRKRVDYLYNICKKNGLEIDTQNRNPSRLSRMPGVVRNGKKQFLVGTNIGKSSWDEWFE WIEGVNDDLPDPESLNEFWDNMPDLAPPLIDGVLRQGHKMLIAGPSKAGKSFALIEMCIA IAEGTKWFDFNCAQGKIMYVNLELDRASCLHRFKDVYNALHITPNNLSNIDIWNLRGKSI PMDKLAPKLIRRAAKKDYIAIIIDPIYKVITGDENSADQMANFCNQFDKICNELGTAVVY CHHHSKGSQGGKRSMDRASGSGVFARDPDALMDLIELEPGENVYKQLKNNAACQFCISYL DKNYPGWQEDVSQDDMLSQREMVDYCKKRIPKGKYGVFDTMLNHAREEVEKISAWRIEGT LREFSKFSPVNLWFDYPVHKVDKSGVLNDIQPDDMKPQWQKAKEARKTPEDKKKERMKSL EFAYEALKIEGEVTIKALEEYFTLSTNAIRKRVDEHPDFTRKGGAVFKNE >gi|223714209|gb|ACDT01000006.1| GENE 31 26576 - 26818 157 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735524|ref|ZP_04566005.1| ## NR: gi|237735524|ref|ZP_04566005.1| predicted protein [Mollicutes bacterium D7] # 1 80 1 80 80 149 100.0 6e-35 MRKRKKVSKELDVARNMPPLYHKLPGEEYDVNKSEVARWLIQQPDILNYVVNRIKASGTS EPLIKYNPSTGKWQGVDYDN >gi|223714209|gb|ACDT01000006.1| GENE 32 26808 - 27206 381 132 aa, chain + ## HITS:1 COG:SPy0673 KEGG:ns NR:ns ## COG: SPy0673 COG4570 # Protein_GI_number: 15674739 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvase # Organism: Streptococcus pyogenes M1 GAS # 2 131 1 130 131 132 54.0 1e-31 
MIIEFFMPMIPPTVTAQEKDVTVVNGKPVFYDPPDLVKAKNKLMVNLLPHRPEKPLDGAL RLVVKWCFPLNGGKHYDGEYKYTKPDTDNLNKALKDIMEKLGFYVNDARVASETIEKFWA EIPGIWIHLEKN >gi|223714209|gb|ACDT01000006.1| GENE 33 27220 - 27399 202 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735526|ref|ZP_04566007.1| ## NR: gi|237735526|ref|ZP_04566007.1| predicted protein [Mollicutes bacterium D7] # 1 59 1 59 59 96 100.0 4e-19 MAAICKSCRKNINNFCKVSGEPISRTRSKCKKFKMVYEQTQLFYAVGSSCDLNRGGNKK >gi|223714209|gb|ACDT01000006.1| GENE 34 27396 - 28007 450 203 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735527|ref|ZP_04566008.1| ## NR: gi|237735527|ref|ZP_04566008.1| predicted protein [Mollicutes bacterium D7] # 1 203 1 203 203 370 100.0 1e-101 MRRIVVETETANKFIEDIEALAAKNDLNIELNMDELGVYVGFASLLKAEIADIDMPEVPA KSKGRGGRTKEPKVKLVVQDFKDALKKLDKGTLAQNITEAIDITGLSKSFLNRLYYCEGE TTMVTKSYHEKLKPLLEPNSLKPVKPTRLNDNMIMIRRVARAKNTDDPYLIANTIIGSIN MQKQTTFKNVSDVIEFIKKNGTR >gi|223714209|gb|ACDT01000006.1| GENE 35 28194 - 28367 148 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKCNECKYLDYKDSKIEKSGYYCYHPDNILGPRLVKECARYQLGYRTPRWCPLKGAK >gi|223714209|gb|ACDT01000006.1| GENE 36 28368 - 28685 472 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735528|ref|ZP_04566009.1| ## NR: gi|237735528|ref|ZP_04566009.1| predicted protein [Mollicutes bacterium D7] # 1 105 1 105 105 167 100.0 2e-40 MKVELEKIQEAVASLGGTDASDEYSKGWDAAVGAVYSEINKLAEAESRKPAHNYEHECIE LKKRLDDLTCENKMLRADRENEKEYSSTLESVLKAVNLLTDTVTK >gi|223714209|gb|ACDT01000006.1| GENE 37 28850 - 29197 342 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735530|ref|ZP_04566011.1| ## NR: gi|237735530|ref|ZP_04566011.1| predicted protein [Mollicutes bacterium D7] # 1 115 1 115 115 194 100.0 2e-48 MDYNEKVQYLKSYRDKLDQLTYVDGQIMGIKAISYGPALGTRQSIEQLYAKKEAIFNEME KIEHTIDTLKNIQERLVLKYAYIHLMQYDEIAKKMGFSERNVYRYRRNAINNLEI >gi|223714209|gb|ACDT01000006.1| GENE 38 29739 - 30044 250 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735531|ref|ZP_04566012.1| ## NR: gi|237735531|ref|ZP_04566012.1| predicted protein [Mollicutes 
bacterium D7] # 1 101 68 168 168 157 100.0 2e-37 MNSWVGIVLGLVATIIGVISMILSFYNLDQSINTQKETVDMINNLKNEIIKKIDKSFKET QDIVKENTTDKNDLTTSKVDGSFKKTDLNLNLKRDEGDLND >gi|223714209|gb|ACDT01000006.1| GENE 39 30037 - 30537 441 166 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735532|ref|ZP_04566013.1| ## NR: gi|237735532|ref|ZP_04566013.1| predicted protein [Mollicutes bacterium D7] # 1 166 1 166 166 299 100.0 4e-80 MTKFVNDIAANMLVCNEFDDQNGKISMTDTIYTDIDLSSNFTVLVLVNIYIQPDSIAKDI KRVKIHAFLREDKSKDFNVIYLGEFIVPPKNIDIKDNILSHGHRHAFNFEDFKFPNTGDY VVELFVDTSTDLANSDITDSGKIYESFYQKCDILGMIGFDVQIRSS >gi|223714209|gb|ACDT01000006.1| GENE 40 30869 - 31222 278 117 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2967 NR:ns ## KEGG: Cphy_2967 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 6 113 9 119 123 93 48.0 2e-18 MLFYSINILGTEYRIFKRKKDKDNRLSNKDGFCDHHAKEIVVLKCEENKDDIDQMRNLQD YEKKILRHEIIHAFLYESGLDINSHDIDQWARDEEMVDWMAIQFPKMYKIFAELDIL >gi|223714209|gb|ACDT01000006.1| GENE 41 31304 - 32125 787 273 aa, chain + ## HITS:1 COG:BS_xtmA KEGG:ns NR:ns ## COG: BS_xtmA COG3728 # Protein_GI_number: 16078322 # Func_class: L Replication, recombination and repair # Function: Phage terminase, small subunit # Organism: Bacillus subtilis # 4 230 23 233 265 114 33.0 3e-25 MKYKDIAAKYGVSVSAVKSWKSRYWKDKKLQPKKAKVATKKVAREIAKNIADDYGELDED KQLFCIYCLKYHNHVKAYQKVKPKVTYGSAAVMASRWYKLPEVQEEIKRLKAEMYADALL DPQDIVQKYIDIAFADLNDYLEYGRELVPVMGPFGPVTVKDELTGETVELKKEINVVKLK DSAFADGTILSEVKQGRDGASIKLSDRMKALDWLGRHMNLVTEEQRAKIDLIKAQTQNIT GGNDEEIEDADDGFIEALSGTAISDWNEGQQDG >gi|223714209|gb|ACDT01000006.1| GENE 42 32118 - 33458 1018 446 aa, chain + ## HITS:1 COG:SPy0972 KEGG:ns NR:ns ## COG: SPy0972 COG1783 # Protein_GI_number: 15674984 # Func_class: R General function prediction only # Function: Phage terminase large subunit # Organism: Streptococcus pyogenes M1 GAS # 171 439 172 418 429 60 25.0 8e-09 MAKIKQVIFKFKPFSKKQRMILNWWTKDSPVKDNDGIIADGAIRSGKTIVMSLSYVLWAM 
TTFSGQNFGMAGKTIGSFRRNVLFWLKLMLKARGYKVNDHRADNLVIVRKKNIENYFYIF GGKDERSQDLIQGITLAGMFFDEVALMPESFVNQATARCSVKGSKWWFNCNPQGPFHWFK VNWIDKSIGYLNAEQIKELERKKETVKNILYVHFTMNDNLSLDKEVKRRYASAYNGVFYD RYIRGLWAVAEGVIYDMFNRDKHIVNKRPKIDENEKKYISCDYGTQNPMVFLLWEKGVNE TWYATKEYYYSGREERKQKTDSQYADDLIEFIGSLNIEYIIVDPSAASFITELKSRGLSV KRARNDVSNGIRSVGTMLNLGRIGFLDTCKMALKEFSIYVWDSKATSRGIDAPIKENDHC MDAIRYFVNTILINKNKLNTDLKGGI >gi|223714209|gb|ACDT01000006.1| GENE 43 33458 - 34966 1322 502 aa, chain + ## HITS:1 COG:no KEGG:SDEG_1632 NR:ns ## KEGG: SDEG_1632 # Name: not_defined # Def: portal protein # Organism: S.dysgalactiae # Pathway: not_defined # 2 421 3 420 441 458 56.0 1e-127 MEIFRLPKDTVMTPDLLAEYISKHKMLVNGHYQKLHDAYENNYDIYNQPDKEKWKPDNRI SVNFAKYIVDTFNGFFIGNPIKINGKDKGTNDYIAFLDSYNDQDDNNAELSKICSIYGHG YEMYYLDDDMQQCITYLSPLEAFIIYDDSIIEKPLFFIRYYKDYKNVERGSWSDDTVIQY FHQNGSYVFDDDEHLHGFDGVPVTEYVENAERTGIFESAMPMINAYNKAISEKANDVDYF ADAYLKVLGAKLDTNGVKQIRDNRIVNFEGDPEVNMVVEFMDKPNSDGTQENLIERLERL IFQISMVANISDENFGTSSGIALKYKLLSMTNLAKAKERKFTSGMNRRYKLLFSHPLSKV KSDAWVGLEYKFTFNIPANITDEAQVASSLEGIISKETQLKVLSIVDDVQSEIDRLKNEE QDAENDIVAKTMFENGTDLDYKDDAVTEVQGKTLNGAQTQSLLAIMAQFTSGTITEGQAV KLISTAIGIDTNEAKKILSGEL >gi|223714209|gb|ACDT01000006.1| GENE 44 34966 - 36459 1094 497 aa, chain + ## HITS:1 COG:SPy0975 KEGG:ns NR:ns ## COG: SPy0975 COG5585 # Protein_GI_number: 15674985 # Func_class: T Signal transduction mechanisms # Function: NAD+--asparagine ADP-ribosyltransferase # Organism: Streptococcus pyogenes M1 GAS # 6 328 8 352 541 164 32.0 3e-40 MGSYDYWRNRENEQHKHNITEEKKYNQELNKIYKDMMDECKRSINNFYAKYASENGITMA EAKKRASKLDIEEYARKAAKYVKTKDFTKEANDAMKIYNLTMKVNRLELLKADLGLELAK GHSKIYQLFYKALKKSSIDEFKRQSGILGKTVQNNTKLANSIVNASFHNATFSDRIWMHQ DLLKNDLNKLLQIGLIQGKNPKTLATELRKRFNVKQSDAERLMQTELARVQVEAQKKSYI ENGLEEYEYIACGGSDVCDVCKALDGKHFKIKDMMPGLNAPPMHPRCHCSTAPSVDRKDY DEWLNYLEKGGTTEEWEAQKRFKSNMNKQLNKLSESEKSILNRYTGIFAFQLNSSLNHGK YDKYQNEINILDNALSKGVILEDITLRRKVDLGFFVEKSKYNIEDLKKLIGIKIIEKGYS STSLNLFDDIDLKGRNGFIEFDVPKGYKGAQYIKDLAYPKFKNQEEVLFNRGLCYIINEV 
REENGIYYIKAEVLKND >gi|223714209|gb|ACDT01000006.1| GENE 45 36452 - 36637 300 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735538|ref|ZP_04566019.1| ## NR: gi|237735538|ref|ZP_04566019.1| predicted protein [Mollicutes bacterium D7] # 1 61 1 61 61 107 100.0 2e-22 MIDPKDYPFHVKELGGEEEFKEYIQLWKKDAPWSDDEHIKNILNGEEEAKVSISMWCNPW M >gi|223714209|gb|ACDT01000006.1| GENE 46 36697 - 36999 213 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735539|ref|ZP_04566020.1| ## NR: gi|237735539|ref|ZP_04566020.1| predicted protein [Mollicutes bacterium D7] # 1 100 1 100 100 171 100.0 2e-41 MARIESGVTGVVSKIDINSLEIDEIKEKLDTHIKEFNGCVEHIDTGINYLKEDIEYLRKE NKTNKLEMMILKDSLKKLSYWLCCVIVINFILLIAIMLFF >gi|223714209|gb|ACDT01000006.1| GENE 47 37103 - 37408 281 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735540|ref|ZP_04566021.1| ## NR: gi|237735540|ref|ZP_04566021.1| predicted protein [Mollicutes bacterium D7] # 1 101 32 132 132 189 100.0 5e-47 MIKIDVKRSRDHIAVSCVGHANYNTVGQDIVCAAISSLLQTLCYSLEELTQDNVNVCLKS GNSLIAIYKPTSKSQLLVDSFFIGCREIANVYSDYVEISKN >gi|223714209|gb|ACDT01000006.1| GENE 48 37560 - 38204 787 214 aa, chain + ## HITS:1 COG:no KEGG:SDEG_1629 NR:ns ## KEGG: SDEG_1629 # Name: not_defined # Def: phage scaffold protein # Organism: S.dysgalactiae # Pathway: not_defined # 50 214 41 203 203 79 41.0 8e-14 MKDLEKLLKLPLLKNKFDLQLFAEDSDNGEDGEDPDNEPNTSLKDGEGENPSDKKHSDED VDKLISKKFAEWEKKRQKEEAKFKEAQKLKNMTEQEKKDLEFKQLQEKIAKYEKQATLGE MSKVARSILADEEISVNDELLANLVSEDADTTKANVENFAKIFKVAVQKEVAAKLRHEPP KKGSKTKMTKEEIFKVENTAERQKLISENMELFQ >gi|223714209|gb|ACDT01000006.1| GENE 49 38221 - 39135 1144 304 aa, chain + ## HITS:1 COG:no KEGG:MGAS10750_Spy0865 NR:ns ## KEGG: MGAS10750_Spy0865 # Name: not_defined # Def: phage protein # Organism: S.pyogenes_MGAS10750 # Pathway: not_defined # 15 291 9 288 296 197 42.0 3e-49 MKNKNKFNLQLHAAETNLTAGKDLEPAISIDYTSRLNKNINELQRLLGVTEMIPMSAGTT IKIYKMEQVNTPDQVGEGETIPLTEINRKLARTVELKLNKYRKSTSAEAIQRSGRSLAVN QTDEKLISGVQTSIKKSFYTLIKTGTGTAKGTNLQSALSAAWGALQKFYVDMTVTPIYFV 
SSEDLADYLGNAQITLQTAFGMSYIENFLGLGTVIVSPELEKGKVIASAKENINGAYVPA NSGDVAQTFNLTSDATGLIGMTHNIDGKTATFETLLFSGVIFFPEFLDGVIVSSIQASEV SVGA >gi|223714209|gb|ACDT01000006.1| GENE 50 39153 - 39425 445 90 aa, chain + ## HITS:1 COG:no KEGG:SH2353 NR:ns ## KEGG: SH2353 # Name: not_defined # Def: hypothetical protein # Organism: S.haemolyticus # Pathway: not_defined # 2 86 3 87 97 70 49.0 2e-11 MYKVIKYFTDLQDNEHPYNAGDTFPRDGLTVSRERIIELATASNKQSTPLITFIEDKSNQ AQDENEAVEPEKQKKDETKKSEPKNSASKK >gi|223714209|gb|ACDT01000006.1| GENE 51 39437 - 39760 316 107 aa, chain + ## HITS:1 COG:no KEGG:FI9785_834 NR:ns ## KEGG: FI9785_834 # Name: ps131 # Def: hypothetical protein # Organism: L.johnsonii_FI9785 # Pathway: not_defined # 1 107 1 110 118 85 46.0 4e-16 MTILENVKELLGNPKNIDDKLNVIIELTQKRLGNLLSVKEVPEELEYIVIEVSVIRFNRI GSEGVSSHSVEGESMSFNDDDFDSYDKDIRSWLNNQSDLKKGSVCFL >gi|223714209|gb|ACDT01000006.1| GENE 52 39757 - 40059 389 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735545|ref|ZP_04566026.1| ## NR: gi|237735545|ref|ZP_04566026.1| predicted protein [Mollicutes bacterium D7] # 1 100 1 100 100 175 100.0 8e-43 MRYDTPVYFQTVKSGQYDQNTGNYGDDTIVEKELMASVMDTSTKTMQLIYGTIKQGSLTI HIQNHWNEVFNFIRIDKKQYKVDYSRKLKTKHIFVLSEVT >gi|223714209|gb|ACDT01000006.1| GENE 53 40059 - 40409 479 116 aa, chain + ## HITS:1 COG:no KEGG:Spy49_1475c NR:ns ## KEGG: Spy49_1475c # Name: not_defined # Def: hypothetical P protein # Organism: S.pyogenes_NZ131 # Pathway: not_defined # 9 116 11 118 118 95 50.0 4e-19 MAKVFYLEGLEKLSNKLKKNIKMADVKRVVSTNGAELTNKMTRNANFVKGYQTGTTKRSI QLSKEDSGFTAIVEPGTEYSPYLEYGTRKMEAQPFVGPAFNEQKEIFKKDMKKLVE >gi|223714209|gb|ACDT01000006.1| GENE 54 40421 - 40807 336 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735547|ref|ZP_04566028.1| ## NR: gi|237735547|ref|ZP_04566028.1| predicted protein [Mollicutes bacterium D7] # 1 128 1 128 128 246 100.0 3e-64 MDPQQELFSYLLVELKKLYPDNVYDTFLPPDNTPCPFIYVGNSQLIDDANKSAVFGNVYQ IIHVFHNNPKQRGTVSKMLLDIKKVSRELNHTTNFAWSLKNVSQDIMPDTSTSIPLLHGV LSLEFKFN 
>gi|223714209|gb|ACDT01000006.1| GENE 55 40821 - 41357 683 178 aa, chain + ## HITS:1 COG:no KEGG:LACR_1143 NR:ns ## KEGG: LACR_1143 # Name: not_defined # Def: hypothetical protein # Organism: L.lactis_SK11 # Pathway: not_defined # 14 173 6 166 169 131 46.0 9e-30 MRRFDLQLCAAPEAVQGKKIVYLYRILSSATTKDGATLAFTTENGRTKSKDADSTATKDG SIRTPGVAEVEITATSILAVGDTLIDELEKALDNDELVEIWEVNLAEKGTEDNVGKFKAK YFQGYLTECEITSNAEDMVEVSLTFGINGSGVDGYATVSQEQQEMANYVFADTKKTGA >gi|223714209|gb|ACDT01000006.1| GENE 56 41402 - 41779 509 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735549|ref|ZP_04566030.1| ## NR: gi|237735549|ref|ZP_04566030.1| predicted protein [Mollicutes bacterium D7] # 1 125 1 125 125 205 100.0 7e-52 MELTINGIVYKFKASIGFVRKVNKNVTQKDELGVEKQVGLTYLVAGLVDGEIEELINALD YLNDGMTPRVTREQIEEYIDNETTDVEKLFEVVIDFLSSANASKVAIKKLFERVEEAKKR EKEQH >gi|223714209|gb|ACDT01000006.1| GENE 57 41899 - 42147 142 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735550|ref|ZP_04566031.1| ## NR: gi|237735550|ref|ZP_04566031.1| predicted protein [Mollicutes bacterium D7] # 1 82 1 82 82 110 100.0 3e-23 MKAVELKELDLNYHIHLLAFNNFKVKARKKAGKNKTRPVFDTFKKFFDYEYELNKVLGKK EDKFSKVKEFMRKRGEKNGREF >gi|223714209|gb|ACDT01000006.1| GENE 58 42131 - 45115 3427 994 aa, chain + ## HITS:1 COG:no KEGG:Spy49_1468c NR:ns ## KEGG: Spy49_1468c # Name: not_defined # Def: putative minor tail protein # Organism: S.pyogenes_NZ131 # Pathway: not_defined # 1 923 1 1130 1211 261 28.0 9e-68 MAESFSVKAILSAADKGFTSTMEKADSKLSSLGSKIKSGLGFGILTGIGQQAFSSITSGI SGVISELGASSASWKTFNGNMGMLGKSSDEIISTKKELQKFATQTIYSASDMATTYSQLA AVGTKNCTQLVKGFGGLAAAAENPTQAMKTLSTQATQMAAKPKVAWQDFKLMLEQTPAGV AAVAKEMGMSTSKLVSKVQDGTVKTEDFFNAIAKVGTNDAFTKLATEYKTVDQAMDGLTE TLGVKLAPAFDYVSKIGIDAISGLVDKLDGFNADSLVNTISGAITTIQPYWDAFSKAAGK VAGALFDVGGVIVDVGASIATNETVIKTFSDVMSSAGDVIAFVGNIIADNSDIIVAATPW VAGFFLAWKGYKKISSAVTALQKFGDKLMGISDTVSSGVTKKITNVADGINDTGNAAKTN AKNMLASAKSFMMMGAGILMVSAGFALLAYSAIQLANAGPLAIGVMAGLVVALAAMGAGM TLMLNSIKPGAAKLNAISLAMLAMGTALVLVSAGFAILTASAINLANAGPLAIGVMVGMI 
ATIALLAAGAAILGPALTAGAVGFVAFGAAIVLVGVGALLAATALTLVAGVLPTVCEYGL LGAGNIALLGASMIAFGAGAVVAGAGALILAAGLIAVGVGALGAAVGVLALAVASVALGA GINLCALGAVILGPALLTLSAGALAAGASLLVLTAGVLAFTASGVASLAGTISLTAGFVA FGASLLVVTAGMIALAAGLLGVLGSMKSIASSAKTTEKSLKAMKSSISFVNSALEGLGSL AKSAIKSLISSFSNAEGKAKTAGQNIGNNISSGVQTGATKMVLIASLTTMQTIGVFQNGQ AGAYGAGVYIGQGLGNGMSSQLGYVRSVASQLASAAEKAIRAKAQIHSPSRVSTKLGNFW GKGLGNGIVEMKNFVKKAADKLFSIPVLNNPKIAFAGDFDSNLSEDYEYYQNTKYTINVP VIMDGKEVARVTAPFTQEEIEKNEKLKNMIKGVK >gi|223714209|gb|ACDT01000006.1| GENE 59 45119 - 46654 1487 511 aa, chain + ## HITS:1 COG:SPy0995 KEGG:ns NR:ns ## COG: SPy0995 COG4722 # Protein_GI_number: 15675003 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Streptococcus pyogenes M1 GAS # 1 226 1 230 259 62 27.0 2e-09 MYEFVDTDEMYTKTILPAEAMSYNGVFIENEIPGYRTLYVSGRELMESEVQDETVNLLDG TNYLGKRYPSRTITVTYQLIASTCLEFRDSFNKLNRLLKDEQVKIIFNDEPDKYFIGTKI GNSIPGPGSNSITGNIEIYCSDPFKYSDVLKEFIAEPNDNGVLEVTVINDGSVSVPIDYE ITHNAESGFIGVVSDKGTMQFGKIDEADKEPYEQNERLGTLWDFINLPNDTNGTDYMHPS HSVKGTLGTSTWFDQTFLTLGVSGPISSSSNGGLRTLILPSDSEGRKGAKNFYSYFHLIF YAGLMGQTGEMCINWLTEDNKLIAGVCWYKTDTTGNTGNYELWANGKVLHTYSYTTSHLG NQNPWYWDWGHCDLRKEGNRLTFYYWGGYPSYIIPEVENMECAKIQIAIKQYGNRGGSSF MTYMGVNDFVFDKMNVEKWKDIPNRYQPNDVCVIDGESSKFYVNGMYRPNDEILGSQYFK ADSGETKIQFVVSEWTKTKPTVKVRVREAWI >gi|223714209|gb|ACDT01000006.1| GENE 60 46654 - 48408 1223 584 aa, chain + ## HITS:1 COG:no KEGG:L22437 NR:ns ## KEGG: L22437 # Name: pi312 # Def: prophage pi3 protein 12 # Organism: L.lactis # Pathway: not_defined # 4 580 1 589 595 228 31.0 7e-58 MDNVRIAVLDAYDNVCIFLDNTIDEAMHYYKDELHTYLSGSAYTYSFKTLSNHDDSKFLT VGNKLSFVYKNKGYYCNIVNNERNEKYTKVTAYGLSLELSNEETGPYKASNALSFDEYIR AFNFENQVFEIGINEVSDKRITHEWEGTETILARLFSLANVFDAEIEFITELNSDYSLGG IVLNVYKKHDTNVQGMGTDRRSEIIRYGINIRGISKISDITELYTAIRPTGTDGLTLAGI DKKEYDSNGNLEYYSPSGTIEILAPQARDRFPSTLTTSENDRYIAKVWSYETSNVNTLYG QALAQLKKNCIPQVSYDVDGFIDANIGDTFTIEDKEYKPTLYLEARITEQIISFTDQTTC KTTFDNFVERQSQIDESLIKQMNDLIEANKSYSANIISSNGIIFKKDDEKTILEALVSDG 
INDITEKFTIKWYKDSFFLIDSKTTEVSATDLENDRSVFRFEALTDNGVIKASAEVTVLK LVDGKSAIVLKIDSVNGFSFKNTGVNTTMTVQIFVDDKIIDTSQKMYDVFGEQAKIIWEI KNIGETEYTPINQNDKRLSDNGFIFTLTNKDINNKATFRCFLDF >gi|223714209|gb|ACDT01000006.1| GENE 61 48422 - 49453 1084 343 aa, chain + ## HITS:1 COG:no KEGG:Shel_15430 NR:ns ## KEGG: Shel_15430 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 1 343 1 334 334 114 40.0 4e-24 MAIKASAQVDLIDLTDGYSVNLSNDNHTFQGTTSAVNGTQSITSKITAMCGSEIVACTLG AISTPSGLSVVSDNKTPEPTITITATSALTTSGSFTIPVIIGDITIGKVFSYAIAFKGTN GSNGTSVTVKDTSVTYQVGSSGTTVPTGSWVASPPSTSAGQYLWTKTVVTYSDGKSTTAY SVSRNGTNGSNGSSVTVTSSAVSYQASSNGTTPPTGTWSTTPVTGSAGQYVWTRTVVTYS DGKTTTSYSVSRNGTNGANGADAFNIAIISSNGTIFKNTEIATTLTAKVFKGATELTGSA LTSAGTIKWYKDGSSTATATGSTLTIAAGDITNRGSYVAQLEG >gi|223714209|gb|ACDT01000006.1| GENE 62 49457 - 50149 555 230 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735555|ref|ZP_04566036.1| ## NR: gi|237735555|ref|ZP_04566036.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 230 1 230 230 377 100.0 1e-103 MAVKARNEITLVKVVDGTDGDSGIIVSSTAPSKPTVGQLWQTATGQPIKRWTGNSWVIHY ISVDNLNVDTLSAIAANLGNITGGSLNINGKFIVDTTGKITSLTGSIGGVNISDGGLSSS KSNQNGLTTSYNIKPDGTISSKQTGGEMDYLLNMDYGMIDLSATPNDGTSGSWSRHKGIN ISGGIINFYSGSAETVGSIEVDTSDNCIRITNAGHEQPIFEMVGTVNVEI >gi|223714209|gb|ACDT01000006.1| GENE 63 50162 - 50320 123 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLIKSIKSRKFGGGVLLEIYLKILLIVSYTLRKMGVRYAKNIRTKKRYKITA >gi|223714209|gb|ACDT01000006.1| GENE 64 50274 - 50627 259 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735556|ref|ZP_04566037.1| ## NR: gi|237735556|ref|ZP_04566037.1| predicted protein [Mollicutes bacterium D7] # 1 117 1 117 117 200 100.0 3e-50 MLKIFGLKNGTKLQRKDTFFQYSTEEQFTGEYWLDGKKIYQKSYNLGTINAFKKIENIAN FDRNIRYEFSMRANDKISGMNGNSSTDLFVTTGGDVYINTNGNTRYDVVLTLWYTKN >gi|223714209|gb|ACDT01000006.1| GENE 65 50630 - 50893 291 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757367|ref|ZP_02429494.1| ## NR: gi|167757367|ref|ZP_02429494.1| 
hypothetical protein CLORAM_02917 [Clostridium ramosum DSM 1402] # 1 87 8 94 94 130 100.0 3e-29 MKGRHEIMIKTHELDVTASKFSELLESNYKIIKQNDYEQNDYILFREIETVEEEVSYTSK SQLTQIKQIINDEGIKEGYVLAVLNKI >gi|223714209|gb|ACDT01000006.1| GENE 66 50910 - 51383 486 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757366|ref|ZP_02429493.1| ## NR: gi|167757366|ref|ZP_02429493.1| hypothetical protein CLORAM_02916 [Clostridium ramosum DSM 1402] # 1 157 1 157 157 276 100.0 3e-73 MKKMNVLNNMNYMDTYNAITGAVVAFLSFIFGEHWILFALFLLFNVIDWITGCMKSKLAN KTNSQKGWLGVLKKLGYWIMILVAFAASVLFIEIGTTLGIDLGITTLIGWFVLGSLAVNE IRSIIENLVEAGYNVPSILTRGLEVADKLINKESGDN >gi|223714209|gb|ACDT01000006.1| GENE 67 51385 - 51642 378 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757365|ref|ZP_02429492.1| ## NR: gi|167757365|ref|ZP_02429492.1| hypothetical protein CLORAM_02915 [Clostridium ramosum DSM 1402] # 1 85 1 85 85 151 100.0 1e-35 MELNETIEMMNSNGYKERFRGEYFQAKIRYDKLDAMTVKYEAGTLNFRPSCSLELLKEQK GYMGNYIRCLKIRAEIEGIDLKEDK >gi|223714209|gb|ACDT01000006.1| GENE 68 51644 - 52222 648 192 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2944 NR:ns ## KEGG: Cphy_2944 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 3 145 36 173 225 91 35.0 2e-17 MAVLTASQLVDKVKAVANTATAYKLGTFGNKTSGGKRQWDCSGLLKGILWGYPDNGKYLK NGVLDQNADTIISKCSGVSTDFSNIVSGEIVWMKGHMGIYIGNGKVVEATPKWDNGVQVS TCANVSNGSKSRKWTKHGKSPYIDYGTTVSTPAPEPSQPVDDWLNRLNAEIARQGFSSYP TVKKVRRVVLLD >gi|223714209|gb|ACDT01000006.1| GENE 69 52395 - 52577 60 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEKVYLKINAEDIQGNCLNTRIEYTLLYMGLSHSIINNGYRDIHVNNKYIKFKPRSHTLI >gi|223714209|gb|ACDT01000006.1| GENE 70 53150 - 53326 318 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735562|ref|ZP_04566043.1| ## NR: gi|237735562|ref|ZP_04566043.1| predicted protein [Mollicutes bacterium D7] # 1 58 1 58 58 84 100.0 2e-15 MAKITVNESCIGCGTCVGVAPDVFEMNDEGLSSVIGDDVDSAKEAAESCPVEAIEVED >gi|223714209|gb|ACDT01000006.1| GENE 71 54136 - 54597 492 153 aa, chain + ## HITS:1 COG:no 
KEGG:no NR:gi|167757361|ref|ZP_02429488.1| ## NR: gi|167757361|ref|ZP_02429488.1| hypothetical protein CLORAM_02911 [Clostridium ramosum DSM 1402] # 1 153 1 153 153 261 100.0 1e-68 MQQLKKFICLLLFLFIMTACQEKIAYDKEWAANFAQQQVKAMDGYRLEGVYFYQGNHIRV NDQKVGKAVVVKAVVRYYDVFKDPNKYELVSVFDESNIDLAGQGYIYGEDGDRLLGVDQI DVYDSNQPIKYLRAMDDKDISEVNRKLNTVQIK >gi|223714209|gb|ACDT01000006.1| GENE 72 54681 - 55493 1099 270 aa, chain - ## HITS:1 COG:BH1531 KEGG:ns NR:ns ## COG: BH1531 COG0005 # Protein_GI_number: 15614094 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Bacillus halodurans # 4 269 6 271 272 293 52.0 3e-79 MYIKINEATNYIKTQYHGKIDLAIILGSGLGPLADEIENPIELDYRDIPHFPISNLIGHA GKLIIGTLENKTVIAMKGRFHYYEGNDMDIVTLPIRVFKRLGIDNLILTNACGGIREDLN PGQIMLIKDHIGLFAPSPLRGPNLDEFGPRFKDMSEVYNRKLGKLAHQVADDNNISLKDG VYAFFKGPMYETPAEILAYRALGADAVGMSTVPEAIVARHCDMKTLGISLITNKAAGLSN QELNHQEVVEAANKAEKDLVTLVKGIIKNW >gi|223714209|gb|ACDT01000006.1| GENE 73 55536 - 55724 170 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735566|ref|ZP_04566047.1| ## NR: gi|237735566|ref|ZP_04566047.1| predicted protein [Mollicutes bacterium D7] # 1 62 1 62 62 98 100.0 1e-19 MYYNENRKSGENVERKLKTRHLYRHFKGKLYYVMNIGLDSETLEEVVIYQAMYDDKKHLF VL >gi|223714209|gb|ACDT01000006.1| GENE 74 55810 - 57042 1446 410 aa, chain + ## HITS:1 COG:BH3818 KEGG:ns NR:ns ## COG: BH3818 COG1078 # Protein_GI_number: 15616380 # Func_class: R General function prediction only # Function: HD superfamily phosphohydrolases # Organism: Bacillus halodurans # 1 408 1 409 432 360 46.0 2e-99 MNQAAYRLEQTKLTDSKVFRDAVHNYIHVDQPLILDLINSHEMQRLRRIKQLGGTHQVYQ SAEHSRFCHSLGVYFIARKMIFNSAIGAYLNDYDKLTVMCAALLHDIGHGPFSHCFEDAF DLNHEAYTIKIINGKTEVHDLLEKFDHGFSHRVSSVIEKTHPNKILVQMVSSQLDADRMD YLLRDSYFSGTTYGQFDLSRILRVMAVCDGKIVFKNSGVQAIENYILARYHMYWQVYYHP TARSYEQVLISIFRRMKDLYAAGYDFGDIRYLKPFLDGHVDENDYTKLDEGIVFYYFTVL KEGNDEILKDLCTRFLDRRLFIYHDLLDQHEKQLAESFYEKKGYDPRYYVVSDDQSQVPY RYYGNTEELSEIEILIDEELRFLPEVSEIVGAIVNSKKNKNDHKIFYPEV 
>gi|223714209|gb|ACDT01000006.1| GENE 75 57052 - 58929 1461 625 aa, chain - ## HITS:1 COG:BH3812 KEGG:ns NR:ns ## COG: BH3812 COG0744 # Protein_GI_number: 15616374 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Bacillus halodurans # 16 611 31 631 687 419 38.0 1e-116 MKKLVKIFVTILASLAGILLSLYVIAYCMGEPDLNKNRYLKLYDDNQEIYYQSINDYTGH YVTLNNVSNYFKESIVAIEDQRFYDHRGFDPIGIMRALKVNFTNQKKSQGASTITQQYAR LLYLTNEKSWSRKIKEAFLTMQLETRLSKDEILEGYINNVYFGHGIYGIENAAQYFYGKS AKELDLNESSMLAGVVNGPTYYSPLNDAKAARKRQSIVLNALIDCKEITVTQKQQVLDSD LNLAETHKVDDNMANNYYKDTVIDELEELGYYNNTYLNQGLNVYTSFNQDYQKVVEKSAS EYTKDSKVETSSIVVEPYTSKILAIIGGKDYASSQFNRATQANRQIGSTIKPLLYYLALE NGFTPATTFLCEPTTFKLDDNSTYSPSNFNDKYAYKDVTLAQAIAVSDNIFAVKTHLFLG TDNLASLIKKFGIKDVKANASLALGTLNTNIYNLANMYNCLASEGKYNHLYTIEKITNDD GKVLYQHEQEDKQLLEQDSCLELSQLLTSTFNSTFSTYLQATMASYQLDNINACKTGSTD VDNLAVAYNPQVLVASWVGYDDNRKMETADDKTVAKKTVIDVLNYTNENKDVTWYQPTED LQQIAINPLTGDFDENGTVYWFKKD >gi|223714209|gb|ACDT01000006.1| GENE 76 58970 - 59365 388 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757357|ref|ZP_02429484.1| ## NR: gi|167757357|ref|ZP_02429484.1| hypothetical protein CLORAM_02907 [Clostridium ramosum DSM 1402] # 1 131 32 162 162 236 100.0 3e-61 MHGDRMKKIIKVNYRSIFKYEEHQETVKYDGSGHLEIGDDKIVVSYQDENKIKIELSENE VKLHNGASVLHLVRDRDILNQYETPYGAIALKTRLISYDNGDNVKIKYELYDGTNLISQV YVMLNYLILEN >gi|223714209|gb|ACDT01000006.1| GENE 77 59367 - 60887 1316 506 aa, chain + ## HITS:1 COG:CAC3316 KEGG:ns NR:ns ## COG: CAC3316 COG1502 # Protein_GI_number: 15896559 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Clostridium acetobutylicum # 4 506 5 510 510 454 46.0 1e-127 MKILKFLSNRIVVSIILICVQILWLALFMRYIFTDATLLKGVFIFLSVIVTLYIINNDDD DPSYKIIWLIPILSFPVFGGLLYVVFGNKKPAKKLQEAFANQEEAIRPFIYPNKMVEQIE DLIAKGQANYLVQENFPIYNNSEIKYYPLGDDTYPDLLKELAKAKHFIFMEYFIVEEGEM 
FNTVLTILKQKVKEGVEVRFMYDDMGSLTMLPFRYYQKLESYGIKCIAFNHFVPFISAVM NTRDHRKITVIDGNVGFSGGFNLADEYINAKVKYGHWKDTGVMIKGEAVWNLTLMFLTTW NASLNTFEDYDKYHPRHYICPEISPNGYILPYGDSPLDNKPVGKNVYLNMINQAQKYIYI DTPYLIINDEIKNALCLAVRRGVDVRIITPGIPDKKMVFKVTRSYYEPLVNGGVRIYEYT PGFIHAKNFVCDDKIATVGTINLDYRSLYLHFECGVYMYDVPAIKDIKKDFLKTIAVSKE MTPEDVVKGRFRGWLEAILRLFAPLL >gi|223714209|gb|ACDT01000006.1| GENE 78 60959 - 62620 2085 553 aa, chain + ## HITS:1 COG:BS_argS KEGG:ns NR:ns ## COG: BS_argS COG0018 # Protein_GI_number: 16080786 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Bacillus subtilis # 5 553 6 556 556 624 56.0 1e-178 MAINKIEATLKGALKQAIIACGFVEEYDQESITIEIPKDKSHGDYSTNLAMQLTKLLKRN PRQIAEAIIEALDKENANIEKVEIAGPGFINLFLAKDAMTSIIKEVLEEKEAYGTTTYGQ GTKYNVEFVSANPTGDLHLGHAKGAAVGDSICRIMSAAGYDVTREYYINDAGNQIHNLAL SLYARYKQAFGQDVTMPEDGYHGKDIIDIATKIKEIDGDKYLEMDEGKAIAFFRNKGTEY ELQKIKDILNEFRVSFDVWFSETSLYENDRVVPTIEKLKAAGYTYEEEGALWFKSTEFGD DKDRVLIKSDGSYTYLTPDIAYHLNKLDRGYEYLVDLLGADHHGYINRMKAAIQALGYNA DQLNIDIIQMVRMMNNGEPVKMSKRTGNAVTIKDLIEEIGVDATRYFFVSKAANTPFDFD IGLAKSKSNENPVYYAQYAHARMCSIKAQAAKANIDYSDKYDLLVNPKEIELVKHINEFR NVIIDSAINRTPHKITNYVQRLAQLFHSFYNECYVIDEDNLELSGQRLALVEATRITMAN ALNLIGVSAPEKM >gi|223714209|gb|ACDT01000006.1| GENE 79 62635 - 62976 483 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757353|ref|ZP_02429480.1| ## NR: gi|167757353|ref|ZP_02429480.1| hypothetical protein CLORAM_02903 [Clostridium ramosum DSM 1402] # 1 113 1 113 113 162 100.0 8e-39 MDSMDKYYNMSMVDVAYELMSKKKSAVNFFKLWEEVCQMKKFDDEQKEDKESLFYTNITL DGRFITVGENAWDLRSRHKFSDVHIDMNDIYADEEETEELEEDVDSTIEDDYN Prediction of potential genes in microbial genomes Time: Thu May 26 09:12:37 2011 Seq name: gi|223714208|gb|ACDT01000007.1| Coprobacillus sp. 
D7 cont1.7, whole genome shotgun sequence Length of sequence - 5195 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 77 - 733 543 ## COG0398 Uncharacterized conserved protein 2 1 Op 2 40/0.000 + CDS 750 - 1460 847 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 3 1 Op 3 . + CDS 1464 - 3560 1965 ## COG0642 Signal transduction histidine kinase 4 1 Op 4 . + CDS 3615 - 4298 526 ## Amuc_1270 hypothetical protein 5 1 Op 5 . + CDS 4291 - 4725 426 ## CPR_0856 acetyltransferase + Term 4854 - 4894 -0.3 Predicted protein(s) >gi|223714208|gb|ACDT01000007.1| GENE 1 77 - 733 543 218 aa, chain + ## HITS:1 COG:CAC2706 KEGG:ns NR:ns ## COG: CAC2706 COG0398 # Protein_GI_number: 15895963 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 51 203 60 212 236 99 33.0 6e-21 MLKSIKKTNLLMMSSIILFMAFITIYLFRNCDFNDVIYEIETLPYFTKVSAMIALIALQI FLAFLPGEPLELASGYIFGSFQGTIVCLIGSMIGTIIVYYLARIFQHSIIDKMFDQSKVA EVKKLFSSKKSKFWLFIIFLIPGSPKDIMTYLVSLTDIDIKQWLLLTTIGRIPSIVTSTY LTGALRDGNIVLAVSIFAVTVILVITGAVYYKKIVNSN >gi|223714208|gb|ACDT01000007.1| GENE 2 750 - 1460 847 236 aa, chain + ## HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 4 236 5 233 233 231 51.0 8e-61 MEHTILIVEDEKGIRETVTVFLKNQGYNVLQASNGEEGLALIEANEIHLAVVDIMMPVMD GITMVLKLRENYDFPVIFLSAKSEDIDKITGLNIGGDDYITKPFEAMELIARVNSNLRRY NQILALKENRNLNNNQQRLVVGGLELDKFTKEVFVNDRGVRLTAKEFQILELLMSYPGRV YSAEEIYEAVWKEEAINTETIMVHVRKLREKIEANPKKPEYLKVVWGIGYKIEKGV >gi|223714208|gb|ACDT01000007.1| GENE 3 1464 - 3560 1965 698 aa, chain + ## HITS:1 COG:BH1154_2 
KEGG:ns NR:ns ## COG: BH1154_2 COG0642 # Protein_GI_number: 15613717 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 452 691 31 268 274 196 42.0 2e-49 MTKAKRIFGVIFASILLLMSVVVISLYDNVRENTPRDDNLTGILEDDLQLFTIAMAKTLD SNFEAVSFDDSVDEQTRQIFLDVFDDYLKGITSLFNNDGDFIYKAVNTNTNQMISQHSEK ITTNDDQSKYSFYTEASYDQNGLVSCNGDLSSDKLSYFNIANLLSSFAHYDGENTIYIDN WMFDTNAIKINVPKNLQITYIIPETPASHGYASQYVNSWENYNYFSAIALLSCSAILVLF ILFYPIKIVEDVNPFLSVKRWKAEINLAVLTLMITFGVMGCMIVTGYTLNNSLLELLKRY DIMYVDYIVIVINFVIWILTLLVISLGIFQLKYIFARGFWRYLKEDTLIGSGVRSIKNKL NQIADIDLSKSLNQTIMKYILINTAIIMIMITFWGFGYFLAIIYAFLVFFWVKDKVLKVQ DDYDKLLTATKELSKGNFDVEIDGDLGIFNALNDEFKNIRIGFETAVKEETKSQNMKNEL VSNVSHDLKTPLTCIKNYIVLLQDDNLPIETRHEYLDNLNQYSNRLTNLIEDLFEVSKVN SGNIKLNLMELNIVALIEQAHAECSEILDAHQLTVITNYPHNDIILKLDGDKTYRIFENL FTNISKYALFNSRVYVDLTEETEDITIIFKNISQEQMNFSPQEITERFVRGDKSRHESGS GLGLAIAKSFVEAQGGTFNIAIDGDLFKAIITFKKIAK >gi|223714208|gb|ACDT01000007.1| GENE 4 3615 - 4298 526 227 aa, chain + ## HITS:1 COG:no KEGG:Amuc_1270 NR:ns ## KEGG: Amuc_1270 # Name: not_defined # Def: hypothetical protein # Organism: A.muciniphila # Pathway: not_defined # 2 180 45 234 307 82 28.0 1e-14 MKILFVGNSHTYMNDMPEMVRINSSEKLEVTMLARPAITFHDHLESMELQFALKQGYDFV IFQQAAHEPCPSKEATLHDAKALIELARSCGVMPYIMIPWSQRNYDDDFKTTKDIYHQVM MDNLVDGIPVGYVINRLSHQNPELELFQSDNQHLTSLGSYLESITILNTIFFETKFPGKL IYPNQSSFEEHQLDERLIDFLTKEVVHTVERFKSNYCVCGKREILDD >gi|223714208|gb|ACDT01000007.1| GENE 5 4291 - 4725 426 144 aa, chain + ## HITS:1 COG:no KEGG:CPR_0856 NR:ns ## KEGG: CPR_0856 # Name: not_defined # Def: acetyltransferase # Organism: C.perfringens_SM101 # Pathway: not_defined # 2 143 4 147 147 82 34.0 4e-15 MIRSVTKNDYEEIAKLMVEAFKNPPWNEVWSYERSYQRIEQLDDGKYTRCFVYMLDNKIA GVVCGKLITYVNDLDFMIEDFYIDPHCQRRGLGKKMMKALEECLPEVDNLILLTGREFYS ADFYQKNGFKINDDMVFMVKKLKS Prediction of potential genes in microbial genomes Time: Thu May 26 09:12:55 2011 Seq name: gi|223714207|gb|ACDT01000008.1| Coprobacillus sp. 
D7 cont1.8, whole genome shotgun sequence Length of sequence - 41249 bp Number of predicted genes - 40, with homology - 40 Number of transcription units - 20, operones - 7 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 34 - 606 718 ## COG0740 Protease subunit of ATP-dependent Clp proteases - Prom 634 - 693 9.0 + Prom 590 - 649 6.2 2 2 Op 1 . + CDS 883 - 1899 1422 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase 3 2 Op 2 . + CDS 1914 - 2870 486 ## COG1893 Ketopantoate reductase 4 3 Tu 1 . + CDS 2922 - 3671 793 ## gi|237735582|ref|ZP_04566063.1| predicted protein + Term 3697 - 3734 -0.4 5 4 Op 1 . - CDS 3748 - 4440 842 ## COG1346 Putative effector of murein hydrolase 6 4 Op 2 . - CDS 4433 - 4765 229 ## EUBREC_2893 putative effector of murein hydrolase LrgA - Prom 4830 - 4889 9.2 - Term 5262 - 5313 9.5 7 5 Op 1 . - CDS 5325 - 7544 1733 ## COG2200 FOG: EAL domain 8 5 Op 2 . - CDS 7556 - 8818 1046 ## GYMC10_5081 extracellular solute-binding protein family 1 - Prom 8875 - 8934 8.4 - Term 8924 - 8969 3.1 9 6 Op 1 7/0.000 - CDS 8983 - 10335 1112 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 10429 - 10488 4.2 - Term 10466 - 10499 3.1 10 6 Op 2 . - CDS 10508 - 11710 236 ## PROTEIN SUPPORTED gi|145223395|ref|YP_001134073.1| NLP/P60 protein - Prom 11784 - 11843 7.7 + Prom 11700 - 11759 5.6 11 7 Tu 1 . + CDS 11893 - 12903 1066 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) - Term 12765 - 12806 -0.6 12 8 Tu 1 . - CDS 12905 - 13405 650 ## COG1633 Uncharacterized conserved protein - Prom 13630 - 13689 8.9 + Prom 13874 - 13933 6.8 13 9 Tu 1 . + CDS 13977 - 14804 771 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) + Term 14939 - 14972 1.5 - Term 14794 - 14826 2.0 14 10 Tu 1 . 
- CDS 14827 - 15321 403 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 15341 - 15400 3.5 + Prom 15199 - 15258 8.6 15 11 Tu 1 . + CDS 15438 - 16220 755 ## EUBELI_01807 beta-phosphoglucomutase + Prom 16329 - 16388 6.8 16 12 Tu 1 . + CDS 16592 - 16762 101 ## gi|167757329|ref|ZP_02429456.1| hypothetical protein CLORAM_02879 + Term 16810 - 16845 5.3 - Term 16798 - 16833 5.3 17 13 Tu 1 . - CDS 16838 - 17731 660 ## COG1533 DNA repair photolyase - Prom 17752 - 17811 13.8 + Prom 17858 - 17917 7.4 18 14 Op 1 . + CDS 17943 - 18170 396 ## COG0236 Acyl carrier protein + Term 18184 - 18222 6.4 + Prom 18174 - 18233 2.8 19 14 Op 2 27/0.000 + CDS 18257 - 18700 530 ## COG0511 Biotin carboxyl carrier protein 20 14 Op 3 4/0.000 + CDS 18706 - 20058 1730 ## COG0439 Biotin carboxylase 21 14 Op 4 10/0.000 + CDS 20062 - 20934 907 ## COG0777 Acetyl-CoA carboxylase beta subunit 22 14 Op 5 . + CDS 20934 - 21884 1108 ## COG0825 Acetyl-CoA carboxylase alpha subunit 23 14 Op 6 4/0.000 + CDS 21877 - 22338 481 ## COG1846 Transcriptional regulators 24 14 Op 7 . + CDS 22347 - 23288 892 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 25 14 Op 8 . + CDS 23292 - 24224 1200 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 26 14 Op 9 3/0.000 + CDS 24227 - 25291 1255 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 27 14 Op 10 26/0.000 + CDS 25291 - 26223 1275 ## COG0331 (acyl-carrier-protein) S-malonyltransferase 28 14 Op 11 11/0.000 + CDS 26225 - 26977 197 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 29 14 Op 12 8/0.000 + CDS 26989 - 28230 1710 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 30 14 Op 13 . + CDS 28235 - 28660 606 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases + Prom 28677 - 28736 7.8 31 15 Tu 1 . 
+ CDS 28762 - 29622 735 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 29793 - 29842 -0.4 + Prom 29625 - 29684 9.9 32 16 Op 1 . + CDS 29913 - 31067 1421 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 33 16 Op 2 32/0.000 + CDS 31060 - 32082 988 ## COG1135 ABC-type metal ion transport system, ATPase component 34 16 Op 3 22/0.000 + CDS 32075 - 32743 660 ## COG2011 ABC-type metal ion transport system, permease component 35 16 Op 4 . + CDS 32754 - 33566 889 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen 36 17 Tu 1 . - CDS 33900 - 34487 203 ## COG3547 Transposase and inactivated derivatives + Prom 34710 - 34769 8.1 37 18 Op 1 2/0.000 + CDS 34928 - 36085 872 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases 38 18 Op 2 . + CDS 36075 - 37256 905 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities + Term 37362 - 37406 -0.6 + Prom 38181 - 38240 8.5 39 19 Tu 1 . + CDS 38289 - 39107 728 ## Pcar_2183 transcriptional regulator + Prom 39165 - 39224 8.9 40 20 Tu 1 . 
+ CDS 39248 - 41188 2490 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family Predicted protein(s) >gi|223714207|gb|ACDT01000008.1| GENE 1 34 - 606 718 190 aa, chain - ## HITS:1 COG:BH3564 KEGG:ns NR:ns ## COG: BH3564 COG0740 # Protein_GI_number: 15616126 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Bacillus halodurans # 1 190 1 190 194 249 66.0 3e-66 MHLVPTVIEKTNQREYAYDIYSRLLEDRIILLTGTIDDKMSSSIVGQLLYLESLDNNADI FMYINSPGGSINAGMAIYDTMNFIKCDVSTIVIGMAASMAAFLLSAGAKGKRCSLPNSEI MIHQPLGAFEGQASDIEISAKRILKQKEKLNLILSKNTNQPIDKIVIDTDRDHFLEPDEA LEYGLIDEVI >gi|223714207|gb|ACDT01000008.1| GENE 2 883 - 1899 1422 338 aa, chain + ## HITS:1 COG:NMA0246 KEGG:ns NR:ns ## COG: NMA0246 COG0057 # Protein_GI_number: 15793264 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Neisseria meningitidis Z2491 # 1 334 1 331 334 432 67.0 1e-121 MAVKVAINGFGRIGRLAFRQMFGAEGYEVVAINDLTDPKMLAHLLKYDSAQGRYALADKV EAGENSITVDGKEIKIYSEADASKLPWGEIGVDVVLECTGFYVSKAKSQAHIDAGAKKVV ISAPAGNDLPTVVFGVNENILTADDTIISAASCTTNCLAPMANALNNLAKIKSGIMLTVH AYTGDQMVLDGPHRKGDLRRARAAAVNIVPNSTGAAKAIGLVIPELNGKLIGSAQRVPVP TGSTTILTSVVEGEVTVEEVNAAMKAAATPSFGYTEEQLVSSDIIGINYGSLFDATQTMV KPMDNGTTEVQTVAWYDNENSYTSNMVRTIKYFAELSK >gi|223714207|gb|ACDT01000008.1| GENE 3 1914 - 2870 486 318 aa, chain + ## HITS:1 COG:CAC1605 KEGG:ns NR:ns ## COG: CAC1605 COG1893 # Protein_GI_number: 15894883 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Clostridium acetobutylicum # 49 315 34 301 301 163 36.0 4e-40 MQKLVIDQLFSYGGYYCDQTSWNYWDGCFRNVIWRFYCKNITREAVTFIVDEKRKKAYQE TVFTINGEKVLFNLVEYFNAVPFDLIIIAVKGTALEEVIKEIKKCVDKNTIIISVLNGID SEKVISEYYGREKVIYAVAQGMDAMKFGTSLKYTKEGQLLLGITDFKQKDKLDKLTSFFD 
QAKINYEVKGDILHSMWAKFMLNVGINQTCMAYNTNYSGALMPGEANDTLIAAMKEIIEL SHYEGVDLTETDLKHYIEIIRTLAPLGLPSMAQDAKVRRYSEVEMFAGTVISLASKHKLK VPTNEFLYKRIRGIENSY >gi|223714207|gb|ACDT01000008.1| GENE 4 2922 - 3671 793 249 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735582|ref|ZP_04566063.1| ## NR: gi|237735582|ref|ZP_04566063.1| predicted protein [Mollicutes bacterium D7] # 1 249 14 262 262 474 100.0 1e-132 MAKFGYVKIVSGQAARRIGRFIGWDEKGVKAKVAFGYDTDILPYTNYHFFSVNSITNNIT KQDLVDRYFELSQALKAIDLRAHPKIKKYTAEHTSIITECNLVRHLLQVFFDLKNLTLKN KEKNITLLTNTSNIMWINEFALELEIRGFQVGILNHEIIKDNFNDELKYALEVCDHYIFI EEYNHQELKYIKENLSSVYQSLTVVVETAVDEDYENGYLYFDEPFTDRFEQSLQRLVERL ESPETYSQI >gi|223714207|gb|ACDT01000008.1| GENE 5 3748 - 4440 842 230 aa, chain - ## HITS:1 COG:MA3262 KEGG:ns NR:ns ## COG: MA3262 COG1346 # Protein_GI_number: 20092078 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Methanosarcina acetivorans str.C2A # 3 230 7 234 238 174 48.0 9e-44 MDSFLLETVYFGIVISLLSYWIAVQIRKLLPYPIFNPLLISAVISIGILIIFDIDFDTYN KGAQFITFLLTPATVCLAVPLYKQVQILIKHLDAILISLFSGCLAGIVSIFIMCLIMKAD PVIYYSLLPKSITTAIAIGVSDKLGGNSTITVGIVIITGILGAIIAKSICKLFKIKHPVA IGLALGNSAHAIGTSKALEFGEIEGAMSSLSIVIAGLLTVIIAPLMANLL >gi|223714207|gb|ACDT01000008.1| GENE 6 4433 - 4765 229 110 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2893 NR:ns ## KEGG: EUBREC_2893 # Name: not_defined # Def: putative effector of murein hydrolase LrgA # Organism: E.rectale # Pathway: not_defined # 1 105 11 115 125 71 40.0 1e-11 MTISFIGELLGLLLPLPIPASVYGLAIMLICLFTRVIKLNQIEEVADWLILIMPVLFVPS AVSLINVGNAIIKDLLVIGIVTLISTIVVMIVTGKVAQVIIERKEDNDNG >gi|223714207|gb|ACDT01000008.1| GENE 7 5325 - 7544 1733 739 aa, chain - ## HITS:1 COG:RSp1097_2 KEGG:ns NR:ns ## COG: RSp1097_2 COG2200 # Protein_GI_number: 17549318 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Ralstonia solanacearum # 470 738 1 270 272 182 35.0 2e-45 MKNKKISLMIVGGISAVIFVVFSMFIYVNYVQDSLWDKSINDILETTSQEEKAFNVYMTK 
EMTNSQNILTNIAYLQTKIPQYLQHLSNREEAKYLYIDYTNDCYYDENGNYPLGDEQLFQ QIKADNNSQGFLEPYLNSKTGIKTFATYVKTSDSQSVIVKETQLSFIIENFSLSFYNDLG FSYIVKDNGEILIRSKHRNSNRTFKNLFDLIDLSGNDAKTIESFKASLSKRDKGVALFNY QDSSNVFCYVPLQQNNHWFIISIIPNNVIMKQANNIIISSIILCSGIAGAISCVVLLYIR DSRKHKRMLENLAFYDHLTMLYNYQKFKTEGEKLFLERNHTLLAVLYLDINDFKIINELY GYKYGDKILTQLANILKKLVIAPGLTCRVNADNFLIMQSYHDRTELENLCQEINQKFCNL LESLDNKNNTIVKIGICCYEDDQTISGIDGLIDCSHLALNAYIPGQTNNYYFYNSKMHDQ MIRKVEIENNMESALLNNEFTFYLQPKYAPNGQVMLGAEALVRWLEPSGNLIMPGEFIPI FEQNGFILKMDEYIFESVCKFLYQRIIENLPNVPISINISRLHLYQDDFIERYSKIKNKY SLPDKLVELEITENILLDNIERIRNIIIELQNNGFTCSIDDFGSGYSSLNSLKDLPFEVI KLDRLFLINSYDIQRSQEIIKAIVEMAKTINIKTVAEGVETPSQLEFLKMIDCDMIQGYI FSKPRPIKEFEQLLINQEK >gi|223714207|gb|ACDT01000008.1| GENE 8 7556 - 8818 1046 420 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_5081 NR:ns ## KEGG: GYMC10_5081 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 47 374 45 389 446 89 25.0 2e-16 MKAKIISLFHYLLVFLLIIFNTSCQAKSINVVDTNNNDQIELNFYGYKTEAINVVAIEEI LQAYMDKNPNITITYESVKGTEYYEILKKRLASNNYDDIFMIDEDNLQQLTNYNYFEDLS KLKTIKNFNTNSLEQMVQPDNTIPYIPTSISAFGLYCNLDLLEKHHQSVPKNNQEFMAVC QYFIEQGITPIIANNDISLKTIALAKGLYPIYQSPEKDRIIASLNQDPSILCNYLKEGYQ YVKQLIDNKFIDAVTTLKTEKTADDLIQFEENKNPFMLTGAWASVRVEKSSPDLNFKVYP YPILDDGAILVSNIDTRVCINAKGPHITEAKKFIEYLSQEDVMWKFVNSQSSFSPLKEQR IADDPKIQELSPYLTSEKTILGSDSNINLPLWSLTHEGITQLLTGNSIDSALEIIKNYEQ >gi|223714207|gb|ACDT01000008.1| GENE 9 8983 - 10335 1112 450 aa, chain - ## HITS:1 COG:Cgl2140 KEGG:ns NR:ns ## COG: Cgl2140 COG0791 # Protein_GI_number: 19553390 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Corynebacterium glutamicum # 318 411 98 186 209 67 40.0 6e-11 MYRGDCMKIKKVISTALTLTIVTSACYYAAPVAADEFDGNESKYMTLCSSSNLSSSNKST CEAFNKYLKTKNSELTKKLAEQKEAASDTKTTLESVQKQLDEVNKQISDKEAEIEYLNTT IANLQASIEKNTQLLKDRMYAMQSYANENTYINFIFGASNFTEMFSRIDGYNELTQSDKE 
LIEELNKQRKEVEQQQTILEEAKATLVIQQEEQKTLKTKYTALLEEQNKSIASTQNTIYD YGEMTETLDAAIKKFNEEAYETPTPTPPPPKPDNGGNNNNSGGNNNNNNNNNNNNSDSGN NNNGGNNNNDSSTSDNLGEKIYAAAYAKLGSRYWWGAQGPTYFDCSGLVYWSLKQAGVSG GRDTAAGYSRKWSAVSFANSKTGDLVCFGSPAYHIGIIVVNSDGSRSMVHAGGGDSSTHG DNPKAYVKVSSIEPGSYYYSRISTIRRVSK >gi|223714207|gb|ACDT01000008.1| GENE 10 10508 - 11710 236 400 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145223395|ref|YP_001134073.1| NLP/P60 protein [Mycobacterium gilvum PYR-GCK] # 154 398 124 347 348 95 32 5e-19 MKKVFVTALGVCTMLSLVTIPTPVQATDFSGQEDKYMKLCSSSNLSTNNLNTCKEFNTYL KNKNKELKSQVSDSKSKVSDTQNSLNSISSEISALNDQIAEKQKEIEYLQTSISNLEASI AKKEEEVKERMYSMQSYNNNNSYIDFIFGASSFTDMFARIDSVNEITSYDDELVAQLADE KEQVETQKATVVTAKANIESQKSSKLALQQEYQALFEKQNADLIAQEKAAAQAADSSQKI TDNIAALLAATEKSQVSGGSVVSGDSAVGNAIAQKALTRVGYMYVWGGCHSMSEIANPNH TAFDCSGLVNWAYYQSGVNIGSNNTKSLASKGVSVSRNNMQAGDIILFSSNGSTSGIHHV GIYIGGGNMVHAPQTGKPVQVADLGYSYWQKEWYDVRRLY >gi|223714207|gb|ACDT01000008.1| GENE 11 11893 - 12903 1066 336 aa, chain + ## HITS:1 COG:SMc01846 KEGG:ns NR:ns ## COG: SMc01846 COG3757 # Protein_GI_number: 15965949 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Sinorhizobium meliloti # 2 188 52 241 261 68 27.0 2e-11 MTTHGIDVSQYQGVIDWEIVQDRVDFAILRCGFGQDRTDQDDRLFKRNADECTRLGIPFG VYLYSYAKNSSDARGEAQHVLRLVKNYKMAYPVYYDLEDNNTTGKQSNDVIANIAKTFAD ELEANGYYVGMYASLYWWNTKLTDPIFDNYTRWVANYAAELNYDKPYDMWQYSSTGWVQG VPTIVDMNYCYADFPAIIKRAGKNNFDIDTAVEQYKLGDTVRFSYVFLTSESSNPLRPYR NVGKITRIVKNTRNPYLIGNDQGWVNDQVIEGRVSYLSNPNYVGDSLVGALQQINVDTSF ENRQHLAKLNGIINYVGSAAQNLQLLQLLKEGKLIS >gi|223714207|gb|ACDT01000008.1| GENE 12 12905 - 13405 650 166 aa, chain - ## HITS:1 COG:CAC1633 KEGG:ns NR:ns ## COG: CAC1633 COG1633 # Protein_GI_number: 15894911 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 8 163 65 229 236 70 32.0 2e-12 MDTKVNKPYPEIKVTVPNETYGLMILDNVGGMNSETSAICQYIYNHSIAGEDFLELKKTF 
LNISMVEMHHLDIFMTLALELGLDPRWWSCLNDQCTYWSPSYLNYPNSLEEVLKTAIDAE YQAIDKYMYQASVIKDPYIVAILNRIIEDEELHIKVLKEWETKLVR >gi|223714207|gb|ACDT01000008.1| GENE 13 13977 - 14804 771 275 aa, chain + ## HITS:1 COG:CAC2688 KEGG:ns NR:ns ## COG: CAC2688 COG0596 # Protein_GI_number: 15895946 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Clostridium acetobutylicum # 8 275 3 271 271 223 42.0 3e-58 MMRGDYVFYLKSSDNAKLAVEDINPAHQKTVVLVHGWPICQEMYEYQKDILNDCRYRIIS YDIRGLGMSQVMGRGYDYDQLAIDLHSVLITLNVHNVTLVGFSMGGAICVRYMSKFGKER VSKLVLAGAAVPSYTRTIHNPEGQSIDAVNQLIEQCYTNRPKMVHDFGQNVFALNHGSEF MNWFTSICLKGSGIGTIQTAISLRDEDVYQDLFEIKVTTLIMHGVLDKICPFSFALIMHE CIEDSILCKFEYSGHGIFYDERERFNQVLIDFIEQ >gi|223714207|gb|ACDT01000008.1| GENE 14 14827 - 15321 403 164 aa, chain - ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 164 1 167 167 147 46.0 1e-35 MKLRLTTTNDLTAVMTIINQAKVYFKEQGINQWQDGYPDELTIINDISRHEAYVLEDNGE IVATAMISKELEPSYNYIEGKWLQDNAYIVVHRIAIRNDQKGKGLAKIIIDEGLKLYPKK PSIRMDTHDNNLSMQRFLIKYGFEYCGTIYLENKETRRAYEKIL >gi|223714207|gb|ACDT01000008.1| GENE 15 15438 - 16220 755 260 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01807 NR:ns ## KEGG: EUBELI_01807 # Name: not_defined # Def: beta-phosphoglucomutase # Organism: E.eligens # Pathway: not_defined # 2 255 242 522 528 168 33.0 1e-40 MIDLKNAKRLFDEYVANYDKDNPKVALKIEHTYRVMEVSKNVAVSLGLDQDEIDLASLIG LLHDIGRFEQLKRYNCFIDSKTIDHALLGVQILFDDNLISKFDIDQKDYPLIYKAIFNHN KYKIAEGLSEHEMLHCKIIRDADKIDIFKTGLLETFEAFLDAGQEILENDIITESIYETF MNSESILSTTRKTDLDRWVSFLALIFDLNYQYSCNYVYQQDYITKLVKRLNYQNSDTFKK MLNIMNHANEYLKEKSDLNN >gi|223714207|gb|ACDT01000008.1| GENE 16 16592 - 16762 101 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757329|ref|ZP_02429456.1| ## NR: 
gi|167757329|ref|ZP_02429456.1| hypothetical protein CLORAM_02879 [Clostridium ramosum DSM 1402] # 1 56 1 56 56 95 98.0 1e-18 MILSSCRQNDLDIWLPHLVFVFDLNFKYSFQEVLKHQYIDNLYKRLDYQNIDTKKR >gi|223714207|gb|ACDT01000008.1| GENE 17 16838 - 17731 660 297 aa, chain - ## HITS:1 COG:CAC3492 KEGG:ns NR:ns ## COG: CAC3492 COG1533 # Protein_GI_number: 15896729 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Clostridium acetobutylicum # 29 283 16 272 290 221 45.0 2e-57 MKIPEIKAKTIIQNKADKNHFWFGFDYNMNLYQGCHHGCIYCDSRSDCYQISNFDQIKVK HNVLTLLSKELCSKKRKGVIAIGAMSDPYNHYEQQLKITHQALKIIKQTNFGVMITTKSH TLINDLELIKQINDKQSVLICMTITCGDDSLAKLIEPNVSISSKRFETIKQLRQHNIYAG VLMTPILPFINDTPENIREIIYRAHLANASFVYPMFGVTLRDHQRDYFYTQLDHLFPGCK ERYIKQFGNTYLCNSPRLYELKNIFIEECHKYNIIYKMEDIINGYKQTASSEQLSLF >gi|223714207|gb|ACDT01000008.1| GENE 18 17943 - 18170 396 75 aa, chain + ## HITS:1 COG:DR1942 KEGG:ns NR:ns ## COG: DR1942 COG0236 # Protein_GI_number: 15806940 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Deinococcus radiodurans # 3 74 38 109 110 70 55.0 9e-13 MEFEKVKEIIVDSLSCDEDAVTLEANLKEDLDADSLDAVELIMAVEEEFDIEIPDDKAAE IKTVQDIVDYIKENA >gi|223714207|gb|ACDT01000008.1| GENE 19 18257 - 18700 530 147 aa, chain + ## HITS:1 COG:DR0118 KEGG:ns NR:ns ## COG: DR0118 COG0511 # Protein_GI_number: 15805158 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Deinococcus radiodurans # 70 146 111 187 187 90 55.0 1e-18 MKIDLVKQLINEFENSEVFKMKVEIDDIKLELEKTPVAPVNVVSTPVVSAPTPTLAPVEQ VAANEPAITGTPVKSPIVGVYYASSSPTAKPYVEVGTKVKAGQVLCIVEAMKVMNEIKAP IDGTVTSIMANTEDLVEFDQVLMIIEG >gi|223714207|gb|ACDT01000008.1| GENE 20 18706 - 20058 1730 450 aa, chain + ## HITS:1 COG:CAC3570 KEGG:ns NR:ns ## COG: CAC3570 COG0439 # Protein_GI_number: 15896804 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: 
Clostridium acetobutylicum # 1 445 1 445 447 552 60.0 1e-157 MFKRILIANRGEIAVRIIRTCQELGIEAVAVYSNVDSTSLHVQLADHAVCIGPAKAADSY LNMKNILSVATALGCDAIHPGFGFLSENSTFARLVEECGITFIGPSGDVIDMMGNKSMAR QKMIEAGVPVVPGSDGSVNTLQEAKEVANQIGYPVLIKASAGGGGRGMRKAFSEEEFDDA YLTAKAEAKACFGDDDMYLEKLILNPKHIEFQILADNYGNVIHLGERDCSIQRRNQKMIE EAPSKALTPLLRQKMGEDAVKAAKGAGYRNAGTIEYVLDQDGNYYFIEMNTRIQVEHPIT EMVTGVDLIREQIRIAANQKLTYKQSDIHLNGHAIECRINAENPREGFRPCPGTVNSVHL PGGLGVRIDTTLYQGYKVSSHYDSMIAKVIVHGSNRLEAIRRMRRVLAELVIDGIDTNQE LQYLILHTGEYVKGNFDTSFIENNLDKLVG >gi|223714207|gb|ACDT01000008.1| GENE 21 20062 - 20934 907 290 aa, chain + ## HITS:1 COG:CAC3569 KEGG:ns NR:ns ## COG: CAC3569 COG0777 # Protein_GI_number: 15896803 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase beta subunit # Organism: Clostridium acetobutylicum # 4 286 3 285 285 323 56.0 2e-88 MEELFKARKEKLNLFKSIRNKMQDKKRIDVPDGLYTKCDSCGESILSEDLKESYYVCPKC GAHLKMRAYTRLNLLYDGGKYKELYKSIKSNDPLMFPGYKEKLVKLEETTNLDEAVVCAT GRIDGRKVVVCVMDSRFLMGSMSGAVGEKITRAIEHATKRKCPIIIFTTSGGARMQEGII SLMQMAKTSAALAKHHEAGLLYISYITHPTTGGVTASFAMLGDIIIGEPKALIGFAGPRV IESTIKQKLPEGFQRTEFMQDQGFIDMIVERSKMRETIIKLLKMHSRGAQ >gi|223714207|gb|ACDT01000008.1| GENE 22 20934 - 21884 1108 316 aa, chain + ## HITS:1 COG:CAC3568 KEGG:ns NR:ns ## COG: CAC3568 COG0825 # Protein_GI_number: 15896802 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase alpha subunit # Organism: Clostridium acetobutylicum # 45 316 2 271 274 310 56.0 3e-84 MSLQEKEIKIKEIELELLKLKDDQEIDHTLQIQDLENTKAKIEAEAYQDISAYDRVYLAR KADRPNVREYIDNLFDDFIELHGDRLYKDDGSIVGGLAMFNNIPCTVIGHLKGRTLEENL HCNFGMSSPEGYRKAMRLMKQAEKFNRPVVTFVDTPGAYPGLKAEKHGIGEAIARNLMEM SQLTVPIIVIVIGEGGSGGALALSVGDRMVMLENSVYSVLSPEGFASILWKDKDGSRVHE AAELMKLTANDLYEMKIIDKIIKEPRGGITKNREYVYKRLRIYLRNTLEELMKLSKTSLV NKRYNKFREMGRITND >gi|223714207|gb|ACDT01000008.1| GENE 23 21877 - 22338 481 153 aa, chain + ## HITS:1 COG:CAC3579 KEGG:ns NR:ns ## COG: CAC3579 COG1846 # Protein_GI_number: 15896813 # Func_class: K Transcription # 
Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 11 148 7 144 154 97 38.0 7e-21 MTNRDILNRELINQLFVRLFNQILDIETQYMVGHGVEDLSLSELHIIDAISSLSNPTMST IASQATLTNGTITTAIKKLEAKGYVKRRKDDNDRRIIRVELTAKGNRVCKVHRDFHEEMV SRVCEDSHVLDDELLIKSLQQLLYFFEDIKEKY >gi|223714207|gb|ACDT01000008.1| GENE 24 22347 - 23288 892 313 aa, chain + ## HITS:1 COG:CAC3578 KEGG:ns NR:ns ## COG: CAC3578 COG0332 # Protein_GI_number: 15896812 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Clostridium acetobutylicum # 1 313 1 324 325 318 49.0 1e-86 MSGLKILSTGYYAPQKVLDNFDLEKVVETSDEWIVSRTGIKRRHIAENESCVELGYQAAL KAVEKIDKSKIGLIICATMTPDYFTPSTACLIQERLGLNDQEVMCFDLNAACSGFVYALT VAQALLQNMEEKYALVIGSEEISKIMDFKDRNTCVLFGDGAGALVVGNGDGIFASYSNSA GNLEALKAPAISKSNQDHYLTMAGQEVFKFAIKVIPESINAILEKTNLTLDEIDYVVCHQ ANYRIIKNVYKKMKSSEDKFYMNLQEYGNTSAASIPLALGEMNEKGMLRPGDKIICVGFG GGLTWGATLMEWS >gi|223714207|gb|ACDT01000008.1| GENE 25 23292 - 24224 1200 310 aa, chain + ## HITS:1 COG:lin0810 KEGG:ns NR:ns ## COG: lin0810 COG2070 # Protein_GI_number: 16799884 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Listeria innocua # 1 307 1 307 309 319 57.0 3e-87 MKLNKLLNIEYPLIQGGMANIATGEFAASVSNAGALGLIGAGGMDTATLKKNIEICRNLT DKPFGVNIMLINPCADEMAQLVIDEKVPVVTTGAGNPGKYVAAWKAAGIKVLPVVPSVAL AKRLEKYNVDAIIVEGTEAGGHIGELTTMALVPQVVEAVSVPVIAAGGIASGKQVLAAYA LGACGVQVGTCLLASEECPIHENYKQAVIKAKDTSTTVTGRIAGTPVRVIKNKMAKEYVK REKEGADMMELEKYTLGSLRRAVLEGDADTGSLMAGQVVGMINEIRPLKVIIKELFDDCD KAFKKLESEF >gi|223714207|gb|ACDT01000008.1| GENE 26 24227 - 25291 1255 354 aa, chain + ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 4 349 6 347 355 356 54.0 4e-98 MKKVNIGDMTLDVPIIQGGMGVGISLGNLAGNVALNGGMGVISTAHPGYRADDFEKNPLE 
ANKRELANEIKKAKEIAKGKGMVAINAMVAITDYAALVEVAVKSKIDAIISGAGLPMNLP SFVQGTKVKIAPIVSSGKAAKLICKTWDRKFKVAPDFVVIEGSEAGGHLGFHKEDVLNKT TAKLVDIFKEVKETVQPFVEKYQKDIPIFVAGGVYDSEDIQKYLDLGADGVQMATRFICT EECDADIKYKQAFINAKKGDIEIVKSPVGMPGRAIMTKLTKRLKQDERIPVKRCYNCLIP CDVKTTPYCISGALINAAKGNLDEGLVFSGSNGYRNDKIISVKELMDELKRGLK >gi|223714207|gb|ACDT01000008.1| GENE 27 25291 - 26223 1275 310 aa, chain + ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 1 302 1 305 308 272 48.0 5e-73 MGKIGFVYAGQGSQVVGMGQSFYDNYQIAKDVFDNIDLDIDVKKLCFEGPLEELSLTRNT QPCMVAVAVVATKLLKENGIAPDYVAGLSLGEYSALNAAGVLDDQTAINLVRFRGQAMER AAAGIESKMIAIIGLDRELLNEAVKEARALGVVSIANYNCPGQLVIGGEAEAVTKASELA LEKGARRAIPLNTSGPFHTELLAPASNELKEKFVSVNFNEMQVPVIFNSSAKELESGVSI AQMLEKQVMSSVYFEDSIRYMLENGVDTIIEIGPGKVLSGFIRKIDKTIKTYQVEDQASL EKTLAGLKGE >gi|223714207|gb|ACDT01000008.1| GENE 28 26225 - 26977 197 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 3 246 1 238 242 80 27 2e-14 MKLKDKVVLVTGGAQGIGKEICLTCAREGAKIIVNYVNFGDNKKIAEATKSELEAIGATV MLAQANVASFEETETMFKEIIKEFGRIDVLVNNAGITKDGLLMRMKENDFDAVINVNLKG TWNCMKHATKIMMKQRYGRIISMSSVVGVMGNAGQVNYSASKAGIIGMTMSLAREVGSRG ITVNAVAPGFIQTAMTDVLPEDIKESMAKQIPLGTFGQVSDIANTVVFLASDDARYITGQ TIHVDGGMAM >gi|223714207|gb|ACDT01000008.1| GENE 29 26989 - 28230 1710 413 aa, chain + ## HITS:1 COG:CAC3573 KEGG:ns NR:ns ## COG: CAC3573 COG0304 # Protein_GI_number: 15896807 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Clostridium acetobutylicum # 4 411 3 410 411 476 60.0 1e-134 MERRRVVVTGLGAINAIGHNVEETWKSIKEGVCGIDEITNYDKELTKVKIAGEVKNFNAA DYLDKKAVRKMDRFTQFGVIAAKEAFKDAGLKIEEIDADRFGVVVSSGIGGLATIEKDHT 
KGLEKGYDRVSPFFVPMSITNLAAGNIAIELGAKGLCTCPVTACAGGTNAIGDAFRNIRD GYQEIMVAGGAEASITPLGIGGFSSMKALCDNNDPKRASIPFDEERCGFVMGEGAGIVVL EELEHAKARGAHIYCEMVGYGVSCDAHHITAPLEDGSGGAKAMINAIKDAGLVAEDVTYI NAHGTSTPLNDKGETMAIKSALGEAAKKVAISSTKGNTGHCLGAAGGIEGVICVKAIEEG FIPATINYQKPDPLCDLDIVPNTGRNQDVNVAMSNSLGFGGHNATIIFKKYGE >gi|223714207|gb|ACDT01000008.1| GENE 30 28235 - 28660 606 141 aa, chain + ## HITS:1 COG:SA1901 KEGG:ns NR:ns ## COG: SA1901 COG0764 # Protein_GI_number: 15927673 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Staphylococcus aureus N315 # 3 141 5 143 146 167 58.0 4e-42 MELNSNQIQEINPHRYPFALVDRITDYQPGQWAKGIKCVSVNEMHFCGHFPEHHVMPGVL IVEALAQVGCIAILSKEENKGKIAFFGGINKCKFKGQVTPGDVLEMECTLTKQKGPIGIG DAVAKVNGKVVCKAELIFAVQ >gi|223714207|gb|ACDT01000008.1| GENE 31 28762 - 29622 735 286 aa, chain + ## HITS:1 COG:SA0684 KEGG:ns NR:ns ## COG: SA0684 COG0697 # Protein_GI_number: 15926406 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Staphylococcus aureus N315 # 1 285 1 283 288 206 43.0 4e-53 MNNIRKGMMCIILAAFFFAAMNVFVKLAGDLPSIQKSFFRNLVAALFAFFILLKSKQGFT YKQKDLPMLLLRSVFGTLGILCNFYAVDHLLVSDASMLNKLSPFFVIICSSLFLKEKATR IQKISIIVAFIGSLFVIKPSFNLLNNVDSIIGVLGALGAGIAYTCVRKLGKQGVNGAKIV FFFSCFSCLSVLPYLIINYQPMSLEQFLILLGAGLMAAGGQFAITAAYNNAAGKDISIYD YSQILFAAILGFIFLNQIPDIWSIVGYIIIIGAALGVYLYQRRIAN >gi|223714207|gb|ACDT01000008.1| GENE 32 29913 - 31067 1421 384 aa, chain + ## HITS:1 COG:BS_yxeP KEGG:ns NR:ns ## COG: BS_yxeP COG1473 # Protein_GI_number: 16080998 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Bacillus subtilis # 1 371 1 372 380 336 45.0 4e-92 MEDDSLQKELEYQFEWFHRNPELSYEEYETTKRIKTLLQEHDIEVLDLPLKTGLVAVIRG 
GYPGKVIALRSDIDALPVNEETTLSYKSEIEGKMHACGHDFHLTTIYGVALLLNENAAQL HGTVKLLFQPAEESSLGALKIIETGVLDDVNAIFGIHSTSQFEVGTIGIKAGTVSAAVDR FKINLKGFGSHAAHPQMAKDPIVAAAALVNSLQTIVSRNMDPFNSSVLSITHLQAGNTWN VIPEQALIEGTVRTLTSEERELFGKRLKEITYGISQAYDLDTEIEWIAGPPATVNDEEWS NFAKAIAQAEKIAIGIPEATLGGEDFAFYLEKIRGTFIKIGTGKTYPNHHPKFKVNPAAL SLAAKYLSQLAKQALLKLEADEND >gi|223714207|gb|ACDT01000008.1| GENE 33 31060 - 32082 988 340 aa, chain + ## HITS:1 COG:lin0312 KEGG:ns NR:ns ## COG: lin0312 COG1135 # Protein_GI_number: 16799389 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Listeria innocua # 1 327 1 327 338 307 49.0 2e-83 MIKIEHVNKTFTTKSGEVPALQDVSLEVEAGDIYGVIGYSGAGKSTLLRMVNLLERPENG AIYINNQNIIDLKERELNILRKDIGMIFQQFNLLESQTVYQNIAIPLILSKLDKKAIEVR VKQLLSFVELEDKRDTYVSQLSGGQKQRIGIARALATEPKILLCDEATSALDPQTTEAIL LLLKKINCELGITILLITHEMNVIKKICNKVAVMKSGKIIEKGDTLEVFSNPQQEMTKRF VETVISNVIPTTLLKELDYRYPVLKLTFFGESTKHDVISKINKHFKIITTILFASVNEVG EDILGILTIQLKGEQDEVIRAINYIQKQDVKIERVELNYD >gi|223714207|gb|ACDT01000008.1| GENE 34 32075 - 32743 660 222 aa, chain + ## HITS:1 COG:SA0421 KEGG:ns NR:ns ## COG: SA0421 COG2011 # Protein_GI_number: 15926140 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Staphylococcus aureus N315 # 7 221 5 219 219 189 61.0 4e-48 MIENYLNIPADKLLEAAGQTIYMLGISLFFGTLIGIPLGLLLVLTRKNGVKANHTIHFLV SGLVNIVRSVPFVILLVFIAPLTKVLVGTRIGTKAAIVPLVFYIAPYLARLIESSILEVQ PGILEAAKAMGANTLEIIRYFLLPEAKASLVLALTTGTIGLLGATAMAGTIGGGGVGDLA LTYGYQRFNNTLMFVTVIILIIFVQLIQTLGNHLSKKIRTHE >gi|223714207|gb|ACDT01000008.1| GENE 35 32754 - 33566 889 270 aa, chain + ## HITS:1 COG:NMA0506 KEGG:ns NR:ns ## COG: NMA0506 COG1464 # Protein_GI_number: 15793505 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Neisseria meningitidis Z2491 # 29 262 42 274 287 154 33.0 1e-37 
MKKRILSLLLVMVFVLAGCQSSSDKSEDKTTIIYGKAAGPYTVLFEDAIIPILKKEGYQF KCIEFSDLLQNDTALNEGEIDVNVEQHTAYMKNFNESQDGDLVALTAIPTVPAGIFSNTH KSLEEIKKGAKIAVPNDASNTSRAYVLLQKAKYITLDPDVDISSVTKDDIIKNPYEIEFT EMDSTMIPQALDEFDYAVITGSIVYNAGIDASSALLQEDVLEHLLLQVVVKEKNKDTKWA KAIVEAYRSKEFKEYLDKNNNGLWYVPSYN >gi|223714207|gb|ACDT01000008.1| GENE 36 33900 - 34487 203 195 aa, chain - ## HITS:1 COG:MA2762 KEGG:ns NR:ns ## COG: MA2762 COG3547 # Protein_GI_number: 20091585 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 2 176 59 230 414 79 30.0 5e-15 MESTSKYWIPVFNILESELNIFLTHPKYVKVIRSKKTDKKDSKWIANIFKQDLLKYSFIP PKNIRELRKISHYRIKLVNKRSSERNRYQNCMTVSNIALASVSTDHLGKNCKAAMDEILK SDIITEDNLKKILKGSVSKKSDQIFQAIQNSHIESDQRFKINCTIKHMNNLDEYIQNWLV FETTSCSMCSFGYQE >gi|223714207|gb|ACDT01000008.1| GENE 37 34928 - 36085 872 385 aa, chain + ## HITS:1 COG:CAC0390 KEGG:ns NR:ns ## COG: CAC0390 COG0626 # Protein_GI_number: 15893681 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Clostridium acetobutylicum # 5 379 9 382 384 402 50.0 1e-112 MKYEFETRCIHPANLELKQHPYGAVSIPIYQTAIFSHPGIGESRGHNYSRESNPTRAYLE EIISNLEGAVDTVACSSGMAAISLVLELFSSGDHLICSNDLYGGSTRLFTLQQNKGLKFS YVDTTNLKSFKQYIRPQTKAVFIETPSNPTMQISDLKKISALCQAHKLLLIVDNTFLSPY FQNPLLLGADIVIHSATKFLSGHNDIIAGSISTGDHKLAKRIREVSKTVGNSLSPLDCYL LIRGIKTLAIRQKSQQENALKISNWLNRHPRVSKVYYPGLIEHPGYEINRNQARGFGSMI AFSVENHFLAKKILENVQLITYAESLGGVESLITYPMIQTHGDVPAEIRAKLGITDNFLR LSVGIENIDDLLADLRQAFGDNDEI >gi|223714207|gb|ACDT01000008.1| GENE 38 36075 - 37256 905 393 aa, chain + ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 4 389 3 382 384 399 48.0 1e-111 
MKSNFDSIINRTNTNSLKYDFTSERERPDDILPLWVADMDFIVAPEIQQRLYDVVNHGIY GYSDSKDGYYQAIYSWYRSKFNYEIKKEWIVKTPGVVFALAQAVRAFTKLGDGVIIQTPV YYPFKEVIIDNGRRVITNSLVLKNAHYEIDFDDFEEKIKKEKVKLFILCSPHNPVGRVWK KDELLKLGQICLKNDVKIISDEIHSDFIYPGHQHHVFASLKTELQNITITCTAPTKTFNL AGLQISNIFIADEKLRLQFKKAIASTGYSQCNLFGLAACQTAYESCSSWLEDLKQYLFEN TRFVDEYLKEHIPLIKLIIPEGTYLLWLDFRRLQISDKELNDLLIEHARLWLDSGTMFGK EGKGFQRINIACPRAYLRQALNQLKNALEEIEY >gi|223714207|gb|ACDT01000008.1| GENE 39 38289 - 39107 728 272 aa, chain + ## HITS:1 COG:no KEGG:Pcar_2183 NR:ns ## KEGG: Pcar_2183 # Name: not_defined # Def: transcriptional regulator # Organism: P.carbinolicus # Pathway: not_defined # 4 264 2 265 268 81 23.0 3e-14 MNSQYKIGDISRLFNLSSEMIRYYEKMGIIDPIRDEKNGYRYYSIFDIYILLECLQYQNL GMKIKEIPNFINFNYREKLQNQLHIFQRRLNKEIDYKVMLKKRVTELAMQLEVADLNLSN YWVKKINKKYAFDFVDGIGDDYDKVNTSEDNIRFLLNDKYINFIDSYVCFNKDKDEWYFA IDEEYFNGLSISYKKKYTIIPAHYCLCTVIEMGPLGQFSHECYEAAVMYAQDKGYHVQLP IRGIIRGRGYEGDHFKRYLEIQIPIIKNSQIA >gi|223714207|gb|ACDT01000008.1| GENE 40 39248 - 41188 2490 646 aa, chain + ## HITS:1 COG:CAC1044_1 KEGG:ns NR:ns ## COG: CAC1044_1 COG1902 # Protein_GI_number: 15894331 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Clostridium acetobutylicum # 4 374 3 382 413 246 36.0 9e-65 MALKKLFTPYQIGTVKIPNRLVVPAMVTNYCTIEGEITERYMKYIEEKAKGGWGLIITED YAVQQYGKGYQRIPGLYKDELIEGNKQLTTMVHKYESKIFCQMYHPGRQTTRIANEDHMP VAPSAIECPLCQEQPRDITVAEINQLVKDFGSAAKRAKEAGFDGIELHCAHGYLLAEFLS PFVNKRVDNYGGCLQNRVRIVAEIYLEMRKQVGDDFPIIVRLSGNEYVHGGRTEAETYEL CTIFEELGFDGIHISNGSYASPGNRAVIAPMFTEHALNMEISAQVKKLVDLPVIVTNRIN DPQMADTLINMDKADFIGMGRGSLTDPDLPNKAKAEKYGNIKMCIGCLQGCEMPLFFNQE VTCLVNPRVGREYENSMDIVEKAKKVMIVGGGPAGLQAAETAAMIGHNVTVYEAQEEVGG QFRSAAYPIGKGELTTLISVFKKNLEDLHVKVHLNTLVTKEMIEQENPDAIILATGARPL VPAIKGIDKENVVNAEDVLLGKIATKVEHIVVCGGGEVGCETATFIAQTHRHVTVLEMKP AVLTDMDPINMSCLLPIMNESGVTARANCTVTEILDDGVAFINEQGEKEVISADLVVLAF GYKAYNPLEKIANELCEDVQVVGGAIKAGNALPAVKEGYEAALNIK Prediction of potential genes in 
microbial genomes Time: Thu May 26 09:13:38 2011 Seq name: gi|223714206|gb|ACDT01000009.1| Coprobacillus sp. D7 cont1.9, whole genome shotgun sequence Length of sequence - 39961 bp Number of predicted genes - 37, with homology - 36 Number of transcription units - 15, operones - 11 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 129 - 188 8.8 1 1 Tu 1 . + CDS 278 - 1477 1078 ## COG2508 Regulator of polyketide synthase expression + Prom 1500 - 1559 9.4 2 2 Op 1 . + CDS 1602 - 2495 834 ## COG1940 Transcriptional regulator/sugar kinase 3 2 Op 2 . + CDS 2551 - 3315 660 ## gi|237735620|ref|ZP_04566101.1| predicted protein 4 2 Op 3 . + CDS 3320 - 4006 421 ## gi|237735621|ref|ZP_04566102.1| predicted protein + Term 4007 - 4051 -0.7 + Prom 4013 - 4072 10.8 5 3 Op 1 . + CDS 4149 - 5474 1314 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 6 3 Op 2 . + CDS 5490 - 6425 1083 ## lmo0737 hypothetical protein 7 4 Op 1 . - CDS 6504 - 8429 1983 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 8 4 Op 2 . - CDS 8446 - 10401 1719 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Term 10453 - 10486 0.8 9 4 Op 3 . - CDS 10487 - 11365 760 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 11415 - 11474 6.5 10 5 Op 1 . + CDS 11496 - 13430 1635 ## COG1061 DNA or RNA helicases of superfamily II 11 5 Op 2 . + CDS 13397 - 14167 594 ## Swoo_3249 type III restriction protein res subunit 12 5 Op 3 . + CDS 14239 - 14934 658 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase + Term 15149 - 15195 0.7 13 6 Tu 1 . - CDS 14967 - 16868 961 ## COG3711 Transcriptional antiterminator - Prom 17074 - 17133 5.2 + Prom 16893 - 16952 2.9 14 7 Op 1 . + CDS 17015 - 19009 1868 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 15 7 Op 2 . + CDS 19022 - 20254 1083 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 16 7 Op 3 . 
+ CDS 20314 - 20556 285 ## BCAH820_3049 PTS system, lactose/cellobiose-specific IIB subunit + Term 20557 - 20609 13.1 - Term 20545 - 20596 8.3 17 8 Op 1 . - CDS 20600 - 21541 994 ## Cbei_2464 G3E family GTPase-like protein 18 8 Op 2 . - CDS 21541 - 22581 934 ## COG0523 Putative GTPases (G3E family) - Prom 22601 - 22660 7.3 + Prom 22514 - 22573 11.0 19 9 Tu 1 . + CDS 22791 - 23519 802 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump + Term 23522 - 23571 3.1 + Prom 23555 - 23614 12.9 20 10 Op 1 9/0.000 + CDS 23655 - 24596 1034 ## COG0673 Predicted dehydrogenases and related proteins 21 10 Op 2 1/0.000 + CDS 24597 - 25577 1125 ## COG0673 Predicted dehydrogenases and related proteins 22 10 Op 3 . + CDS 25590 - 26084 590 ## COG0346 Lactoylglutathione lyase and related lyases + Term 26090 - 26127 6.6 + Prom 26116 - 26175 4.2 23 11 Op 1 . + CDS 26204 - 27340 1041 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase + Term 27350 - 27378 -0.9 + Prom 27347 - 27406 7.8 24 11 Op 2 . + CDS 27426 - 27584 89 ## + Prom 27651 - 27710 6.0 25 12 Op 1 . + CDS 27757 - 27927 91 ## gi|167757277|ref|ZP_02429404.1| hypothetical protein CLORAM_02827 26 12 Op 2 . + CDS 27978 - 28175 221 ## gi|237735643|ref|ZP_04566124.1| predicted protein + Prom 28426 - 28485 9.6 27 13 Op 1 2/0.000 + CDS 28605 - 30053 1510 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 28 13 Op 2 7/0.000 + CDS 30073 - 30912 722 ## COG3711 Transcriptional antiterminator 29 13 Op 3 3/0.000 + CDS 30924 - 32333 1432 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 30 13 Op 4 . + CDS 32354 - 32842 509 ## COG2190 Phosphotransferase system IIA components + Term 32850 - 32889 6.1 + Prom 32911 - 32970 13.2 31 14 Tu 1 . + CDS 33096 - 34493 1259 ## BCG9842_B1876 ROK family protein + Term 34561 - 34606 4.2 + Prom 34603 - 34662 4.5 32 15 Op 1 . 
+ CDS 34768 - 36036 1551 ## LMOf2365_0663 hypothetical protein 33 15 Op 2 8/0.000 + CDS 36042 - 36509 516 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 34 15 Op 3 7/0.000 + CDS 36522 - 36836 651 ## COG1445 Phosphotransferase system fructose-specific component IIB 35 15 Op 4 . + CDS 36848 - 37903 1398 ## COG1299 Phosphotransferase system, fructose-specific IIC component 36 15 Op 5 . + CDS 37919 - 38914 971 ## CDR20291_0206 putative transcription antiterminator 37 15 Op 6 . + CDS 38853 - 39947 869 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) Predicted protein(s) >gi|223714206|gb|ACDT01000009.1| GENE 1 278 - 1477 1078 399 aa, chain + ## HITS:1 COG:CAC1426 KEGG:ns NR:ns ## COG: CAC1426 COG2508 # Protein_GI_number: 15894705 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 18 384 17 391 397 118 22.0 2e-26 MGFTIEDALVQTQEQYHLKLLAGREGCSNAMSWVHMIEDTTIIQQLWGKELAVTTGLGFQ SHDSLFDFIKCLVKYHSVGLVINTGKYIFEIPQDIIDYCNEQDFPLLTTPWEVHMADLIK DFSMRCLYSQREDNQISKLFQNVFTTPQVVEEVRQQLLGAFDVDGFFQVVLIGIEDSDQF DAIERRRVSFQLELCFEKIQSPYTFFWFDGYFVLIVNNLDPDSLEKITDKMYKRTKKRMP SRFIHLGIGTPMTDFRQVILSYKRARAAVAMAEQFKYPMILFEEMGVYQILFSIEDKQIL SGMYQHLLKPLIDYDQKHHSELEKTLFYYLIHDGSQVAMAKNLYMHRNTINYRMTKVKEL LNCQLDTFEEKMPYMLALYIKKVIQEDINKDKDLFAAIQ >gi|223714206|gb|ACDT01000009.1| GENE 2 1602 - 2495 834 297 aa, chain + ## HITS:1 COG:SPy1596 KEGG:ns NR:ns ## COG: SPy1596 COG1940 # Protein_GI_number: 15675481 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Streptococcus pyogenes M1 GAS # 1 288 16 296 307 154 32.0 2e-37 MKYLCLDFGGTGVKYAIIDENFQLYDVNKENKIFQSHDEMITWVVDLFKTIHMKLSGIAI SYCGEMNPFTGLIKNGGSYRFNDNKNIKQILWSKCHVPVSIENDGNCAALAELYRGSLKE TSNGAVLVLGTGVGCAIIINGELYRGCNYFAGAATFSLIDSKHDFDWQNTFGLIGGVGYL 
TRNYEKRFALERNSIDGLAFFNKANHNDIASLEILRDYTKNLARYIFNMQMLLDLENISI GGGISTQPLLLTMLKKEVSDLFDCIPVPVTPPEIKVCHYYNDANLIGAFCWYKKANF >gi|223714206|gb|ACDT01000009.1| GENE 3 2551 - 3315 660 254 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735620|ref|ZP_04566101.1| ## NR: gi|237735620|ref|ZP_04566101.1| predicted protein [Mollicutes bacterium D7] # 1 254 6 259 259 443 100.0 1e-123 MFMCLAYFLSIVNNSNNKDLNYYIAKYIIENINHIGSITLQDVADGCFVSVPTVKQFLKK FGYSNYSIFKERLESELDVRKDQIYRGYKIFNKKRLAVAVSHLVDYPLEFNDKMKLKLII KEIAQSQRVIIVASPTITPILFNFQIDMISMGKTVIMSSLLKDNSIEIEDNDLIILISGT GRLFLSDDNLVSILGNKANKVIIFSGETKINFHYNVFEVINILSNNEMFETEYLLLYYFD LIKFYYHEIIYQEQ >gi|223714206|gb|ACDT01000009.1| GENE 4 3320 - 4006 421 228 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735621|ref|ZP_04566102.1| ## NR: gi|237735621|ref|ZP_04566102.1| predicted protein [Mollicutes bacterium D7] # 1 228 1 228 228 375 100.0 1e-102 MIDILLDLSMNAKINNTDQILVKYFLNQREKLIELSLEDVIYDTGLSKSSIIRFCKSIKT SHFTDFKKTFVREFNAVICDFIYLLDDRDKLDDLVIACNHCHQIILFGEREAMLIWKMYL KYFYVLGYDIYIINEEDNNLAINMDEHTLYFYSHLQISLEKYYDYNISNALIRNIYRHLN DENNYIISLKSSYTHKRFLEIHEQPEIKKKFELMKIIESLLARLNNQG >gi|223714206|gb|ACDT01000009.1| GENE 5 4149 - 5474 1314 441 aa, chain + ## HITS:1 COG:BS_ywbA KEGG:ns NR:ns ## COG: BS_ywbA COG1455 # Protein_GI_number: 16080890 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 5 427 8 433 444 224 32.0 2e-58 MNDFIQNKVIPFAQKIENNFILSAISGAFMKVMPALLGGAVFSLFQGLPLGDWYTNFLSS TGISDALAVGVAVCNLTALYFIFALGYNLGEKFNVNPFQTAIVSLLSLLLVTPFSYNNVD NTNGAVTLIKNVIPVDNLGASGIFTAMLCGIIGAYVFAFAVNHNWKIKLPDSVPEMVSKP FEAVVPAFLVSALFLLVRSGFAVTSYGNIQNFIYTMVQAPLVSLGNSFPAMLISMLILLL LWWVGIHGTSVVLSVMMVIWMEPAITNLNNYMAGEPVTLVTTYMFFFVFCQFIGGPGCLF GLTCDMAIFAKSERFKAFGKVCFVPGMFNIVEPVIFGFPIVLNPIMFIPFILTPLVFMII GYLLMVTGVVNIPALMLSVMTIPGPIAGFILGGGISLGIMIVIMCLLSCIIYYPFFKICD NQALEAEKKSNQEKTLEENKI >gi|223714206|gb|ACDT01000009.1| GENE 6 5490 - 6425 1083 311 aa, chain + ## HITS:1 
COG:no KEGG:lmo0737 NR:ns ## KEGG: lmo0737 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes # Pathway: not_defined # 4 311 5 310 310 425 65.0 1e-118 MSNVKRLINASKSEINKMGPLELKESIYKSEGRVIMGQHLIFAGPGTVMGVTNAELLCSF GADMVMLNTIDLDNFENNPGLCGLTIKELKEKCNCPIGVYLGCPKKGEKRSDEKTLYRLA GMLATKEHILKAVALGVDFIVLGGNPGSGTSIEDVIEATKLVKEICGDDIFLFAGKWEDG IEEKVLGDPLAAYDTKDVIKRLIDAGADCIDLPAPGTRAGITVDMIRELVEYIHRYKPGT LAMSFLNSSVECADEDTIRQVALLMKQTGCDIHAIGDGGFGGCTWPENIYQLSVSIKGKN YTWFRMASNSR >gi|223714206|gb|ACDT01000009.1| GENE 7 6504 - 8429 1983 641 aa, chain - ## HITS:1 COG:CAC3371_1 KEGG:ns NR:ns ## COG: CAC3371_1 COG1902 # Protein_GI_number: 15896613 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Clostridium acetobutylicum # 3 376 7 399 401 171 30.0 6e-42 MNLLKPIKVGKITLKNRIMFPPMTTGYEERDGSIGEQSFNFYKRLAQGDVAYIVLGDVAP VNTVSPTPKLFHDGQIEAFKKLSDALHEYDCKLGIQIFHPEYDVDALAELFRKGDMQAAR AKLHHDMLHYIDEVTEAQLTAIIDKIGACVKRAAAAGVDVIEVHGDRLVGSLCSTILNHR NDNYGGSFENRIRFALEVVKVIKENAPDICIDYKLPIITENPLRGKGGLKIDEAIKLATI LEAHGVDMIHVGQANHTGNMNDTIPAMGTQPYCFMNKYTKQIKDVVTIPVSSVGRILTPQ NGESLIDNGICDIIGLGRSLLADPDYVKKLKNNEAARIRHCMMCNKGCTDAIQNRQFLSC VLNAENGYEYKRIITPAATKKKVVVVGGGPAGLEAARVAKTKGHDVILFEQDTRLGGQLN IAAIPPRKAEMNRAINYLSNEMKILNVDLRLGKKATSTDILNEHPDSVIIAVGANNASLP IPGSDLTHVLDAWKVLNYEQFPSGRVAIIGGGLVGAETAEFLAQMGLNVTVIEMLEEIAK EESSTVKPVMFEDFEKHQVKLLTKTKVIEIKNDCIKAANGDGELTIPCDYVILATGAKPN PFDVSELTAHNIDIHLVGDCNEKAADINNAITQGYLAANSI >gi|223714206|gb|ACDT01000009.1| GENE 8 8446 - 10401 1719 651 aa, chain - ## HITS:1 COG:AF0455_1 KEGG:ns NR:ns ## COG: AF0455_1 COG1902 # Protein_GI_number: 11498067 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Archaeoglobus fulgidus # 9 364 2 354 354 238 37.0 3e-62 MNLENKYQKLFQPIKINQLEVKNRIFMSPMSTNFATKDGYVTDEMIYYYSRRAKGGVGLI VTEVTMIEPTYKYIAHTLSIQDDSYLEGWQKLANEIHKYDSKVVAQLLHPAYMAIPFPET 
PQLVGPSEVGPYYAKTPPRPLTKEEISIIVEQFGDGALRLKKAGIDGVEIHAAHAHALLG GFLSPLYNKRIDEYGGDITHRIRLLLEVIANVRKKCGRGFAIIVRISGDDYEAGGQSLHE GCYIAKRLEAAQVDMIHVSGGTTIHRGSSIAPPGTPQASHIDSAREIKKCINIPVSTVGR LNEPWILEEVLERDLADVCMVGRSLLCDPDFVNKIKNGQEDDIRPCIGCLGCLSSTMLKD HVECGINPSLTSENEETVTPATISQNILVIGGGPAGLEAAYILSKRGHKVTLAERHHTLG GAMIVAGYPIAKQHFAQTTKYFIKRVIDCNIKIELNTIVDQAYLSQHHYDHIIVATGAKP LRLDIFKNHPHTGTGQDVLLGKMWTGKNIVVVGGGSVGCEVADFVAPLVNDRHPNNKKVT VIEMKSNIIMDDTSPSRSILVQRMADKGVNIITDSRVVKVTPTTLEYQKDGNTITIENVD TIIEATGYYADHELEKTLDQMQLSYNVLGDAATPGKIKNAISDAYHLCKNI >gi|223714206|gb|ACDT01000009.1| GENE 9 10487 - 11365 760 292 aa, chain - ## HITS:1 COG:TM1005 KEGG:ns NR:ns ## COG: TM1005 COG2207 # Protein_GI_number: 15643765 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Thermotoga maritima # 133 290 142 290 299 65 27.0 1e-10 MTMYIQTQNQYTQFQSDELTDLLGKISFTINKFGLWQSNHDTRANYITDDIEIVYYREGG SITTIGNKKYTCPPHSFLIIEPYKLNTSINTGNTNYSYYFFHFDIEPLSLRQQFITLLTK HGHLIYKEEIKDFKEMLERLLIEASEKEIGYSSIITSALIRVVVEIIRAQLKRNNDHISQ ITDLPHIELVNDAIKYIHDHLHEPIKLTAMALQLGVSTSMLYKSFIAILAIPPLTYIHQQ KVLYAQRMLAHGKSVTMIANDLGYSSAYHLSKTFKQITGVSPREYKKNIKLL >gi|223714206|gb|ACDT01000009.1| GENE 10 11496 - 13430 1635 644 aa, chain + ## HITS:1 COG:Rv1179c_1 KEGG:ns NR:ns ## COG: Rv1179c_1 COG1061 # Protein_GI_number: 15608319 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Mycobacterium tuberculosis H37Rv # 8 236 12 246 624 95 28.0 3e-19 MLEQILHFKGTWRSYQQRVLDKYDRYSQDRKIHIVAAPGSGKTTLGIELIKRIDYSALIL VPSITIREQWVERICEAFLVKQENRDQYLSQDLKKPKLITVVTYQALHSAMSHYCGELVE TNDEFKTVEEVDYHNFDVISNFKECQLGTLCLDECHHLRSEWWKSLEEFKKAFNNIFTVA LTATPPYDSNLSMWTRYMDMCGDIDEEITVPELVKDGTLCPHQDYVYFNYPASQEKQKLS IFEENSQSILEQLIQDEYFCKAIMSHRFFTEEVSDDELLDDPSYLSAMIIFLNTKGVCSA NKYQKLLGYKSLEPLSLKWLEILLQGFLYDDIASYRVEEQYHDELIRELKSKGLIEKRKV SLCLNQAIEKMLINSVGKCESIKEIVNYEYKTMKQELRLLILTDYIRKEYERALGDESKD 
VNNLGVLPFFEQLRRDSAKNKTAIKFGVLCGTMIIIPKDAQTALIELVEEPSKISFHQIG KLDDYVKVEISGNRNFITGVISEVFAQGYMQVLIGTKSLLGESWDSPCVNSLILASFVGS YMLSNQMRGRAIRVFEKTPNKTSNIWHLVCVRPKEKLTGHYDDGGSEDFQTLARRMEHFL GLHYQQDIIENGILRLSAIKLPFSPGNIKKLIKICLNYLQSVVN >gi|223714206|gb|ACDT01000009.1| GENE 11 13397 - 14167 594 256 aa, chain + ## HITS:1 COG:no KEGG:Swoo_3249 NR:ns ## KEGG: Swoo_3249 # Name: not_defined # Def: type III restriction protein res subunit # Organism: S.woodyi # Pathway: not_defined # 1 244 624 882 884 79 26.0 2e-13 MLKLSSKRSQLKKRWEDSLAVYDKIEIVEETEVKEEFITVVMFMDALRTLLIIVFSGIIG AVIGIPFLKSLTLIDILEYLVVIIYVGIFVMSILLSIKKLYTLANPLGRLEIFGKGIRNA LEKTNQLDSLNSRVETDSFNAIHAIYLLGGTGHDKALFAKCINEFFASIDNQRYILYNPK RKNKLDCYFAIPDSFAKRKEDAHIFAGYMKPFIGNYQVIYTRNESGRKILLEARVSALAN RQDRCFTRKKVKGALE >gi|223714206|gb|ACDT01000009.1| GENE 12 14239 - 14934 658 231 aa, chain + ## HITS:1 COG:FN1921 KEGG:ns NR:ns ## COG: FN1921 COG0340 # Protein_GI_number: 19705226 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Fusobacterium nucleatum # 1 190 1 189 234 108 32.0 1e-23 MSYIELDEVDSTNDYVKRNLNKLPNLSVVRCNYQTNGRGRNGHVWQSKNGDDLLMSILVK DFKKPQDLHKMTQLVACSVAGLLDRYGIKAKIKWPNDIYVDDLKICGILVEAIYQTDLEG VVVGVGLNVNSINGDYASMKMKTGQTYQVKSLMIAMLTYFKIYYSLYQQGSYDKILDYAN DIAYLKDKQVEFQDYGLVTFTKLNENGTVNFVDSNHREHNILINEISLHKD >gi|223714206|gb|ACDT01000009.1| GENE 13 14967 - 16868 961 633 aa, chain - ## HITS:1 COG:lin0919_1 KEGG:ns NR:ns ## COG: lin0919_1 COG3711 # Protein_GI_number: 16799990 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Listeria innocua # 22 505 19 498 505 132 23.0 2e-30 MLELTSRQFLIIRLLQKNTVPLSANTLSRLLDISPRTLRNDIFQINKNIEDFHISSSKNG YQLTLLSENAHMLINDIKISEQSKSTNIILQYLLNHPSCHLLELSENCYMSESSVARCIK TIKPLLNKYHLTVERKNDIFSLHGSEYDKRALFAHLIGIEASQPITSLQHFQQYFQDFHL SEVEAIIDSVLKKYNIRVDDIHYQNLVMNTSITLQRIFVGSDIEPLPFSYSLSNDEIICQ LSKDLCHNLEEYFNLHFAQTDNEYMQVLCIGSIKINEENYETQILYNDYSFVYKIKEILD 
DLVEHFSLQLNYQMSLNHFALHIHRLFFRNHSSLYFQNDFHNNLKNTHPFIYELAVYFAY KFQESFNITVNINEIGLLAIHLGLMIQSNEENKQYLKALCICPEYNDLRKHFANQFLLNF GDNVQLISFISSKKNIQDYHFDFLISTINDYESPDCMVISPLLLSQDIDKLNLKINLLKE KKKKDLLKKELSKYIDPKLFFRNVDFTCKKDAILFLCSQMLKHGDAKTDFVDSVFEREKI STTSFFNKFAVPHAIHFGDVKTRITFLLNDKAIPWGDTSVHLVMLISINEKDIMTFNTIY SSIIDLLLDDESFDEMLKIKTFDELSVYLENHV >gi|223714206|gb|ACDT01000009.1| GENE 14 17015 - 19009 1868 664 aa, chain + ## HITS:1 COG:lin0492_1 KEGG:ns NR:ns ## COG: lin0492_1 COG1902 # Protein_GI_number: 16799567 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Listeria innocua # 5 381 4 364 364 152 31.0 3e-36 MNQYYPYLFQPLRVNTMMLKNRIIASTMGIPKSHELLSTTHYGNVSIIDKSVGGAAMTFV SIESAANANGEFPKHDRDGIRESISVARQYGAKVGTWCVPRLKQDAANDIRAHYDGKQVL APSPYLTRSGARAVELTLEDIQGIMKQARNDALAIKRFGFDFIYLYVGYEELTTQFLSPV FNKRTDEYGGSLENRMRFTIEHVKTIRNAVGTDFPIVILLGASDYLKGSYQFEEMLELLK RIEDDVDLINVSAGMDMIPGYFPEDQIIEPELGLEAWYSVNGKHCQSIFEPHLTNVHWAK LVKQNFPNKLVGVIGSIMTPKEAEQLLQEGVVDIISMGRPLVADPFLPRKAMQGQSDDIV PCIRCLQCYHSATEHTNVQCSVNPRYRREHRVPLKLEKSEIIKKVIVVGGGPAGCKAAIT AHDRGHQVILIEQRDKLGGQLNLAQHEMHKQELKAYRDYLENQINKRDIQVMYKTTATKN LLEQLNGDVVLVAIGASPIRLHFPGEQLEYVDYFENVYPKLDSLKDNVCIVGGGQVGIEL AVELLERGKLVTVIEMTDQIASQGHILYRAGLRRLLKRFEKQLTILINSQCLGFSESGVK IINKNGESIIKCDNAIIAVGMKPNREEAFKLYGIADETMMFGDCEKLGQVVGATNDAYFI AANI >gi|223714206|gb|ACDT01000009.1| GENE 15 19022 - 20254 1083 410 aa, chain + ## HITS:1 COG:lin0874 KEGG:ns NR:ns ## COG: lin0874 COG1455 # Protein_GI_number: 16799948 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 11 404 24 424 441 209 34.0 1e-53 MKKLQDKLIEVGNKLATQRHLQSISFGLMAIMPLTIFGSIFQLISALPDIFPFIPSYSET IKDAILFPYNMVFGLFGVIAVCAIAYYHARTYKLDLLQAPIVSLLSFIVLAAPIVDNNMD ATYLGSQGIFLAIFVALVTVEIMQLLKRHNIQIKLPDSVPPAVAASFDAIIPMCLIVAGF YGLSLLCQNFTGMLLPSWIMSLLTPAIQGSESLWFCMLMAFIIAFLQFLGVHGFNVVSGI 
ILPLLIANTGANAAAYAAGETATKIFTLPMFQMSGVFCYIIPLLFLKCKSERLRSMGKIS IVPAIFNIAEPIQYGAPIMFNPILGIPWIAFYTINMGIIWCCMNFGIMGKAVIAASSNIP MPFFQYLCTLDWRAVLIFVVMFIVSGLIWYPFIKVYDKKCLDEENNLQAE >gi|223714206|gb|ACDT01000009.1| GENE 16 20314 - 20556 285 80 aa, chain + ## HITS:1 COG:no KEGG:BCAH820_3049 NR:ns ## KEGG: BCAH820_3049 # Name: not_defined # Def: PTS system, lactose/cellobiose-specific IIB subunit # Organism: B.cereus_AH820 # Pathway: Phosphotransferase system (PTS) [PATH:bcu02060] # 1 79 21 99 100 65 40.0 5e-10 MKRYISQEGYNYEVYAYPILKINELGKYADYIFIAPQIKHDLYRVKELLPEKPIEVIDPF VYGQMNGQELVVQIVEKLGK >gi|223714206|gb|ACDT01000009.1| GENE 17 20600 - 21541 994 313 aa, chain - ## HITS:1 COG:no KEGG:Cbei_2464 NR:ns ## KEGG: Cbei_2464 # Name: not_defined # Def: G3E family GTPase-like protein # Organism: C.beijerinckii # Pathway: not_defined # 5 312 6 312 312 228 38.0 2e-58 METRVYIFAGFLDSGKTSFINDTLMNPDFYSEEDTLLISFEEGEVAYNDIYLKKSNTSLV TLDFENFTTYDLKHLELEYRPTQIMIEANGMMDLNIFVKETIPENWKVVQVLTTFDTTTF SMYLNNMRSLVYSQVVFSDLVIFNRYDPNIRKSMLRNNIKAINSNAQIIYELANGTIDTM NPEEELPFDINKEHLDIQDHDFGIFCMDAMDHPDKYNGKTVKIKGKFIGLDRVIEDGFVL GRQAMVCCEEDTSLIGLVCISKLANKLIPEEWIVVEGKITVDYDPEYQMNVPILTVEHLD VVPPLKNQYVTFD >gi|223714206|gb|ACDT01000009.1| GENE 18 21541 - 22581 934 346 aa, chain - ## HITS:1 COG:FN0779 KEGG:ns NR:ns ## COG: FN0779 COG0523 # Protein_GI_number: 19704114 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Fusobacterium nucleatum # 3 188 2 185 294 137 34.0 3e-32 MTKVDIISGFLGAGKTTFIKKLISDVYPGEQIVLIENEFGEIGIDGTFMNNAGIEVTEIN SGCICCSLVGDFEASLKEVLETYHPDRIIIEPSGVGKLSDVIKAVSDVHSHELELDNYIA VIDAKKCRLYTKNFGEFFNNQIEYASLIILSRTQDITEFQLDECLKIIKEHNNQANVITT PWDQLTKEILINAFSNSDNLRDELLEQLDICPICAHHHDHHHNHEHHHDHEHHHNHEHHH DHNCECGHEHHHADEVFTSWGKQTPKKYSKSQLEKILKELSDSSAYGNILRAKGYVDSIE GDWWYFDLVPGEYEIRIGKPDFTGRICVIGEKLDKIKLDRLFDEVA >gi|223714206|gb|ACDT01000009.1| GENE 19 22791 - 23519 802 242 aa, chain + ## HITS:1 COG:CAC1482 KEGG:ns NR:ns ## COG: CAC1482 COG1811 # 
Protein_GI_number: 15894761 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Clostridium acetobutylicum # 1 242 1 242 242 293 70.0 2e-79 MPTGVIINSLSIVLGGIFGALLGHKLSHRLKTEITLIFGICAMAMGINAIGLMENMPAVI FSIIIGTIVGLSLHFSDWINKGAILMQKPIAKIFKNNGSDLSSEEFQSTLITIIVLFCAS GTGIYGSLTSGMTGDNGVLISKSILDFFTAVIFACNLGYVVSVIAIPQFVIFFILFLSAK FIFPLTTPAMINDFKACGGFLMIATGFRMIKVKEFPIADMIPAMVIVMPISWLWVNYIMT VI >gi|223714206|gb|ACDT01000009.1| GENE 20 23655 - 24596 1034 313 aa, chain + ## HITS:1 COG:CAC1480 KEGG:ns NR:ns ## COG: CAC1480 COG0673 # Protein_GI_number: 15894759 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Clostridium acetobutylicum # 1 311 4 318 320 115 28.0 1e-25 MNFGIIGFGNIARKFVKSIEYTDEGKIYAIASHSLTSDDEYVKAHPEVKIYQDYDELLQD RELDAVYIALPHKYHKEWILKALDNHIAVLSEKPMVLTVADIEEIKTKVYTEKGYCLEAL KTKFNIGLEHLKQDLSLIGKIKQIEANFCFDATAHQTTSYLFNVDQGGALNDVGSYLLGF VLALVDSDIKNVNSQIDIVAGVEWYFKAKINFNNECTAIVEGAIDRNKERLAIIEGEFGK IEIPMFNRIIDYKIIKNDGTVVERNYPIIGDDMTMEIQCFIDDVRAHKTESALHSLEDTK RILELTEIIREAN >gi|223714206|gb|ACDT01000009.1| GENE 21 24597 - 25577 1125 326 aa, chain + ## HITS:1 COG:SPy0441 KEGG:ns NR:ns ## COG: SPy0441 COG0673 # Protein_GI_number: 15674565 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Streptococcus pyogenes M1 GAS # 1 314 1 316 319 275 46.0 7e-74 MKLGILGTGMIVKDLLTTINKLKIESISILGTEQTKKETTILAKQYHLDHCYYDYDEMLR SEIDTVYVALPNHLHFSFAKKALLKGKHVIMEKPITSNVKELEELKEIAQENDLIILEAV NIHYLPAFKKLKEDIKRIGNIKIMNFNYSQYSSRYDAFKQGTILPAFDYHKSGGALMDIN VYNINAIISLFGQPVSIGYMANIENQIDTSGILMYDYGSFKGISIGAKDCKAPIISTIQG DKGVIKIDKPVNQMTGYQIIFNDGTEEIYRVDDCEHRLYYEFIEFIRIIDNNDYDFAKKM LELSLIVAKVMNEARNQEKIVFTNDN >gi|223714206|gb|ACDT01000009.1| GENE 22 25590 - 26084 590 164 aa, chain + ## HITS:1 COG:mlr5077 KEGG:ns NR:ns ## COG: mlr5077 COG0346 # Protein_GI_number: 13474234 # Func_class: E 
Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Mesorhizobium loti # 4 155 6 133 140 62 29.0 4e-10 MEFKSLMHVSFFTDQMDVMRDFYENKLGLKVKIVTRAGLYKGLNRGKYSEVAEVDPNRII IVYIEIAPGQFIELFPKDEGQAPHDEWNQRLGYSHFALLVEDIEKTYRELVECGVAIDTP ISKGPSHTYQMWVHDPDGNKFEIMQYTDKSFQVVGHIEEEISED >gi|223714206|gb|ACDT01000009.1| GENE 23 26204 - 27340 1041 378 aa, chain + ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 3 375 78 454 458 295 37.0 1e-79 MKCLIEKKCGSCKYINTDYQRQLKIKTDYCKKLLKDNKLDMYEVKDTIGMGHPYEYRNKI IVAFNHRYEFGFYEEDSHKIIPYDRCLLHEELSDMIIKKIQSLLKRYRVSIYDENRNRGL LRHVLIRRALVTDQTMVVLVCNDNVFKGSKNFCNELIKNFPSIKTVVLNVNKRKTSVVLG NEEKILYGKGFIVDELCGLKFKISPKSFYQINHQQCELLYSKALDLLNLTGQERIIDAYC GIGTIGMIVANRTKEVTGVELNKDAVKDAINNAKMNKIENIKFINDDASAFMIKLAKQKQ KVDCVIMDPPRSGSTQEFMDAVKILNPKQVVYISCDPSTQVRDIKYFAKIGYRGEVMYPV DMFPHTSHVETIVLLSKI >gi|223714206|gb|ACDT01000009.1| GENE 24 27426 - 27584 89 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKIILLLGLIIRLRRIVMIYNIANKMIKYANGNNYNIKHFLKVHAYAKTLVN >gi|223714206|gb|ACDT01000009.1| GENE 25 27757 - 27927 91 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757277|ref|ZP_02429404.1| ## NR: gi|167757277|ref|ZP_02429404.1| hypothetical protein CLORAM_02827 [Clostridium ramosum DSM 1402] # 1 56 1 56 56 90 100.0 3e-17 MTSVIQQAMVFILIIFLACFIKLRGYIKESDSYVLALLITNITCPWTLLAGTVRFD >gi|223714206|gb|ACDT01000009.1| GENE 26 27978 - 28175 221 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735643|ref|ZP_04566124.1| ## NR: gi|237735643|ref|ZP_04566124.1| predicted protein [Mollicutes bacterium D7] # 1 65 14 78 78 95 100.0 7e-19 MAVVGYHVKLRDSKTMKGVSMLNASVYNIGRFVLPFIQAIMPSLTVTYLSTFNIGAIFMV LVASM >gi|223714206|gb|ACDT01000009.1| GENE 27 28605 - 30053 1510 482 aa, chain + ## HITS:1 COG:CAC1408 KEGG:ns NR:ns 
## COG: CAC1408 COG2723 # Protein_GI_number: 15894687 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 4 480 5 476 477 462 47.0 1e-130 MIRFPKDFLWGGATACFQYEGGYKEKGRGLSSHDFETAGDIDKERQITLKLNNGTRGSID YRDSIPDGAIPYIYDDIYYPSHQATDFYHHWKEDIALFAEMGFKVYRFSISWSRIFPNGD EEKANEDGLKFYEMIIDELLKYNIEPMITICHDEVPFHLCELYDGWAGRETIGCYVNYAT TLFNRFKDKVRYWITFNEINNLGGYAQMGTHRQDSQTRYQAVHHMFVASAKAISQGKKIM GDIKFGAMFALSEIYPATCNPEDVFATYCKRREALFFVDVMARGSYPNYTKDIFNKKDVK LKVSEEDRNLIKKHTLDFISFSYYRSSIVSAGSKVDHMGGEPNPYLKKTQWGAGIDPLGI RYCLNEIYDRYQKPLFVIENGMGAVDILENGKIHDYYRIKYVAEHLKQLRDAIVIDKVPC FGYTMWGCTDLISLASGQMKKRYGFIYVDMNDYGQGSLQRYRKDSFYWYKKVIKTNGSDL TF >gi|223714206|gb|ACDT01000009.1| GENE 28 30073 - 30912 722 279 aa, chain + ## HITS:1 COG:BS_licT KEGG:ns NR:ns ## COG: BS_licT COG3711 # Protein_GI_number: 16080959 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 7 279 2 274 277 162 34.0 7e-40 MKNKYYRINKILNNNAIEILENGTEIIIIGNGVGFNHKVNDYITMKENYKLYTLQNNLLK IQFKALLNEIPFKCIELTQEIIDMAKTDLNRNFNQGLLVSLSDHINFVSKNYLKGYGSYS LVSEEIKRFYHEEYEVGLKALDLINRYYQINLNKKEASAIAFHLINAEFNNNVSKTTSIL KSIDDILNIIEANFGLELPEDSLYYSRLVIHLKFFMQRVIKGESDDENFEKLIISAKSDI NKKIGITLDSIERYLNEEFDYVLSEAERFYLLVHISRII >gi|223714206|gb|ACDT01000009.1| GENE 29 30924 - 32333 1432 469 aa, chain + ## HITS:1 COG:SPy0572_2 KEGG:ns NR:ns ## COG: SPy0572_2 COG1263 # Protein_GI_number: 15674662 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Streptococcus pyogenes M1 GAS # 93 455 2 364 364 206 32.0 1e-52 MDYKEIADQIIIAIGGQDNVKYVTHCVTRLRFVLKNKNLVNQDKINEIDGVLGSTFGAGQ FQIIMGKNLSATFTQVVKNYNFEVAEIIDESLDEDIKEKDDSPIWKRGIKNIFNFLSACV TPMIPGLTVAGLIKVFLVLTSLAWPEIVDNQTYLVINGIVNTSFYFMPVFVAYGAAIRLG STPIYSMIVACLLIHPDIIGMLSAEEPLKILGFTAYSASYASSFLPAILSTVAVANIEKL 
LNKYLPGMFKGIFVGGLTLVLASLLTLTVLGPIGFFAGEYFINFLVWMQSTIGPFALGAL GGVLPFVIMSGMHTVFGPVMMQSIASLGYEGFLRPTQFVHNVAEGGACFGVALKTKNPEL RSQAISSGVGAILAGVSEPAIYGINLRLKKPMYGVMAGGAVGGCIAGFFGVKAFAYGNPS ILALPIYGETIIGVIIAIIAAFVVSSVVSYFSGFEDVPVKNKNFQQNSH >gi|223714206|gb|ACDT01000009.1| GENE 30 32354 - 32842 509 162 aa, chain + ## HITS:1 COG:SP0758_3 KEGG:ns NR:ns ## COG: SP0758_3 COG2190 # Protein_GI_number: 15900652 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Streptococcus pneumoniae TIGR4 # 15 161 2 149 151 118 41.0 4e-27 MIKIFKSLKNNQKNTNNVYAVVNGVSFNLEYVDDNVFSKKVMGEGVAIVPTSSEVCAPCA GVLTTLFPTGHAFGITRDDGVEILVHIGLDTVNLKGKGFTILKKKNQKVGAGEKIIKLDL NYLRNENIDYTIMIVFLDTKGKTLKLEKYGKVTAGKSIITVM >gi|223714206|gb|ACDT01000009.1| GENE 31 33096 - 34493 1259 465 aa, chain + ## HITS:1 COG:no KEGG:BCG9842_B1876 NR:ns ## KEGG: BCG9842_B1876 # Name: not_defined # Def: ROK family protein # Organism: B.cereus_G9842 # Pathway: not_defined # 268 454 127 323 342 79 27.0 3e-13 MDEFLSEIYIEVFKQWILMQSNKNLAISLSDDYKTIAIKNKYSTSQVNFYNMNIVEFSVE NLIKKEVEFYLHFQIKNIKHAIQLFDEMLETIKQLTNKPALKVLLCCTGGLTTGYFADRI NQITKLLNIDVQVSAVSYNNLYKASEDYDMIMLAPQISYMYAKVCEILKNKIVLNIPPAI FAKYDVKKMLSIIEEERNYKRTLAKGMHKPLKAKNMQPSGVKTLCLALIRNSNRVHIVYR LYDENNSILENNEIIKPTLSIQDYYDVIDSMLAKYVDIKVIGLSTPGIIRGDQIISMSIK GLEKISITHDFTLKYSQKFVFSNDANTIALGYYSSKENCSSLSFVFQPVSSNSGAGHIVN DRLIAGHHHVAGEIQYLPLSYSKDVLELHQTPEGAIEAMAKTVTSIISFLDPQIIVICCF LITNIEELKKEVAKYIPKDYIPEIVLIDYLQEYMLLGTLLLCHQD >gi|223714206|gb|ACDT01000009.1| GENE 32 34768 - 36036 1551 422 aa, chain + ## HITS:1 COG:no KEGG:LMOf2365_0663 NR:ns ## KEGG: LMOf2365_0663 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_F2365 # Pathway: not_defined # 4 416 2 417 422 430 50.0 1e-119 MNKKVTIREAVNYVINSKGEKHTLLGIGPMSENFLRASLEISKEKDFPLMYIASRNQVDA YKFGGGYVFNSDQKLFKEKIEEIAQEINYDNVYYLCRDHGGPWQRDKERADHLPEDEAMA IAKESYKEDILNGFDLLHIDPTKDPDEYCKVVDIDVVFNRTIELIEYCEQVRKEYGIERE 
IAYEVGTEETSGGLTSMDRYENFIQRIESYTQEHDLPMPIFIVGQTGTLTRLTKNVGNFS YENSLELSAISTKYGVGLKEHNGDYLSEAKLLAHLPLNITAMNVAPAFGTIETMALLELI EVEKQFAKFNMIKEPSNLENVIKYESIHSMKWKKWLTDEVDMSDLDNLDEQTTLQITELC GHYTYSKPKVALEINKLYENLASIKIDGKRFVIEKLKEEMQKHVRCFNMEGLTSKMLACK KG >gi|223714206|gb|ACDT01000009.1| GENE 33 36042 - 36509 516 155 aa, chain + ## HITS:1 COG:lin0446 KEGG:ns NR:ns ## COG: lin0446 COG1762 # Protein_GI_number: 16799523 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Listeria innocua # 10 150 9 149 154 80 31.0 1e-15 MIMQEKAIHKEMVFLKQALSTKDDVIELLSDQAIKLDLINSKKDFKEAVYKREEMASTSI GYQIAIPHGISHTVNRAFIGFVQIKDAFHWNHDEKQSVQLIFLIGVPQDNQNNIHLKFIS LLSRRLLDDNFRKQLIEVEDVEDAYRALNAINEQL >gi|223714206|gb|ACDT01000009.1| GENE 34 36522 - 36836 651 104 aa, chain + ## HITS:1 COG:STM4113 KEGG:ns NR:ns ## COG: STM4113 COG1445 # Protein_GI_number: 16767378 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Salmonella typhimurium LT2 # 2 102 3 99 106 86 48.0 1e-17 MKIVAVTACPTGIAHTYMAQEAIEKECRNRGYEVKVETQGSMGIENELEQEEIDEADAVI LAIAVGIDGEERFEEKQDAGKVIQVEPGDVIKDAVRVIDQAENL >gi|223714206|gb|ACDT01000009.1| GENE 35 36848 - 37903 1398 351 aa, chain + ## HITS:1 COG:STM4112 KEGG:ns NR:ns ## COG: STM4112 COG1299 # Protein_GI_number: 16767377 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Salmonella typhimurium LT2 # 5 321 6 316 359 223 42.0 4e-58 MKNNKVWKDIQKAFNTGVSYMLPSVVVGGILLAVALATGTATDTGMEVTNPFMKNLNDLG SAGFAMMIPILSGYIAYSLAGKPGIAPGMILGYVANNPIGESEVKTGFLGAMILGLATGY LVRWCKTWKVSNTIKTIMPILIVPIFTTFILGIIYIYVISVPIGACMDWLVKFLSGMQGG NAVLLGAIIGAMTAVDMGGPINKTATAFTLALMAEGVYAPNGAHRIACAIPPLALALSTF IDRKKYSADDKNLGISALFMGTIGITEAAIPFAVKDMKRVLPAIIIGSAVGGALGMMNGV EALVPHGGLIILPVVNGKLWYVLAMLVGVLVSAAILHFTKPNLDIAEKEEK 
>gi|223714206|gb|ACDT01000009.1| GENE 36 37919 - 38914 971 331 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_0206 NR:ns ## KEGG: CDR20291_0206 # Name: not_defined # Def: putative transcription antiterminator # Organism: C.difficile_R20291 # Pathway: not_defined # 1 308 1 311 676 190 37.0 5e-47 MISARMFKIIRFLNQKKESSYKEIGNALDLKERNVRYDIDCINDALSLKGLSEIEKRSKG QLIVPIDLDLNILLEDSEFIFSADERMQLLRFIIMFDVHSLNIKKLSEKFQVSRRSIQND LDTLIKEFNQYGLSLVYNRHFHLIEDDHKGYPLRVNELKKYIYLFSNKLSEYNTFELELM KLLYRIYNFDILHIYDWITNKMKMMSWTFSDDSFDWYVANILTTCWYIIKEKQLPSCPKK IDVEDYSLQELEMIINKQLDDNQKKLIMSYSSYTNKYGEVDINLDLIMTEDIIIQLVTMM SDELGVIFFTGYDFNKRIIESCGTDVGTNKK >gi|223714206|gb|ACDT01000009.1| GENE 37 38853 - 39947 869 364 aa, chain + ## HITS:1 COG:SPy1952_2 KEGG:ns NR:ns ## COG: SPy1952_2 COG1762 # Protein_GI_number: 15675752 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Streptococcus pyogenes M1 GAS # 222 353 5 143 154 77 31.0 3e-14 MILIKGLLNHVAPMLERIKNNIRVYELSQTVIPQSYEYVFEVLKSIVLSIQPLQEISQEE LIYITIHFIASIQRLRANDYKNILLICGLGFGATALLKDTLRNEYQVQVIDSIPAYDLEH YQKWEQIDIVISTSKLILPVKKTLIVVNAIFDNHDHNKLEAVGIRKKNVLTNYIAIEKRL GFLNSPEREKVLEIIKEELGYKDVRIPKKYYNISDLLGENCIRYVKSFSNWKDAIKLSTD ILIQHGCIKSEYYEGLMSVIKESGFYAVTDSKFALFHSGNTSSVKISSMSLIVTDEPVWF DNKQVRVIFCLASKDKKEHIPAIVKLMRMVNDAKLIEKLESCNNQNNIIKVINECEKEVL SQYI Prediction of potential genes in microbial genomes Time: Thu May 26 09:14:49 2011 Seq name: gi|223714205|gb|ACDT01000010.1| Coprobacillus sp. D7 cont1.10, whole genome shotgun sequence Length of sequence - 29464 bp Number of predicted genes - 33, with homology - 33 Number of transcription units - 18, operones - 9 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 1931 1544 ## COG0021 Transketolase 2 1 Op 2 . 
+ CDS 1943 - 2638 657 ## COG1011 Predicted hydrolase (HAD superfamily) + Term 2675 - 2733 14.9 + Prom 3085 - 3144 6.8 3 2 Tu 1 . + CDS 3167 - 3727 428 ## HMPREF0868_0096 CarD-like protein + Term 3749 - 3797 5.5 + Prom 3835 - 3894 10.2 4 3 Op 1 . + CDS 3930 - 4130 276 ## gi|237735656|ref|ZP_04566137.1| predicted protein 5 3 Op 2 . + CDS 4123 - 4443 384 ## gi|167757259|ref|ZP_02429386.1| hypothetical protein CLORAM_02809 + Term 4453 - 4483 2.0 6 4 Op 1 . + CDS 4510 - 4992 435 ## COG1846 Transcriptional regulators + Prom 4997 - 5056 2.7 7 4 Op 2 . + CDS 5080 - 6273 1310 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities + Term 6304 - 6344 2.3 + Prom 6275 - 6334 6.8 8 5 Op 1 . + CDS 6392 - 6736 432 ## DET1117 hypothetical protein 9 5 Op 2 2/0.333 + CDS 6738 - 7043 398 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 10 5 Op 3 1/0.333 + CDS 7046 - 8473 1360 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 8487 - 8526 1.0 + Prom 8477 - 8536 8.6 11 6 Tu 1 . + CDS 8600 - 9436 979 ## COG1737 Transcriptional regulators + Term 9643 - 9676 1.5 12 7 Tu 1 1/0.333 + CDS 9789 - 10646 782 ## COG1737 Transcriptional regulators + Prom 10691 - 10750 5.3 13 8 Tu 1 . + CDS 10773 - 12131 1769 ## COG1455 Phosphotransferase system cellobiose-specific component IIC + Term 12146 - 12179 2.4 14 9 Tu 1 . + CDS 13221 - 13520 330 ## gi|167757249|ref|ZP_02429376.1| hypothetical protein CLORAM_02799 + Term 13527 - 13568 7.2 - Term 13515 - 13556 7.2 15 10 Tu 1 . - CDS 13559 - 14386 1117 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 14420 - 14479 13.7 + Prom 14553 - 14612 6.9 16 11 Tu 1 . + CDS 14634 - 15623 805 ## COG1940 Transcriptional regulator/sugar kinase + Term 15798 - 15833 -0.5 + Prom 16019 - 16078 11.7 17 12 Op 1 . + CDS 16107 - 16577 538 ## COG1869 ABC-type ribose transport system, auxiliary component 18 12 Op 2 . 
+ CDS 16580 - 17140 734 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 17154 - 17195 5.1 + Prom 17195 - 17254 13.4 19 13 Op 1 . + CDS 17284 - 17781 552 ## CDR20291_2115 MarR-family transcriptional regulator 20 13 Op 2 . + CDS 17778 - 18188 459 ## COG0716 Flavodoxins 21 13 Op 3 . + CDS 18179 - 18886 538 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases + Term 18905 - 18934 -0.2 + Prom 18902 - 18961 4.8 22 14 Tu 1 . + CDS 19004 - 20332 1342 ## COG0534 Na+-driven multidrug efflux pump + Term 20389 - 20458 3.4 23 15 Op 1 3/0.000 - CDS 20521 - 21009 503 ## COG4769 Predicted membrane protein 24 15 Op 2 . - CDS 21019 - 21378 320 ## COG5341 Uncharacterized protein conserved in bacteria - Prom 21612 - 21671 5.8 + Prom 21247 - 21306 4.8 25 16 Op 1 . + CDS 21407 - 22381 776 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 26 16 Op 2 . + CDS 22453 - 23253 1010 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 23261 - 23289 -0.0 + Prom 23316 - 23375 11.4 27 17 Op 1 . + CDS 23406 - 24455 999 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 28 17 Op 2 . + CDS 24433 - 25185 711 ## COG1876 D-alanyl-D-alanine carboxypeptidase 29 17 Op 3 . + CDS 25148 - 26212 665 ## CDR20291_1526 serine/alanine racemase 30 17 Op 4 . + CDS 26172 - 27302 1191 ## COG0787 Alanine racemase 31 17 Op 5 . + CDS 27325 - 28515 1295 ## COG1876 D-alanyl-D-alanine carboxypeptidase + Term 28516 - 28542 0.3 32 17 Op 6 . + CDS 28595 - 29056 515 ## COG3760 Uncharacterized conserved protein + Prom 29146 - 29205 5.2 33 18 Tu 1 . 
+ CDS 29228 - 29462 161 ## gi|167757228|ref|ZP_02429355.1| hypothetical protein CLORAM_02778 Predicted protein(s) >gi|223714205|gb|ACDT01000010.1| GENE 1 3 - 1931 1544 642 aa, chain + ## HITS:1 COG:CAC1348 KEGG:ns NR:ns ## COG: CAC1348 COG0021 # Protein_GI_number: 15894627 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Clostridium acetobutylicum # 1 642 14 658 663 712 54.0 0 LCADTVEKANSGHPGLPLGSAAAAYILWGKVMKHSGTHRKWVNRDRFILSGGHGSSLLYS LLHLYNYGLTIEDLRQFRQLDSLTPGHPEYGHTVGVEATTGPLGAGMAMATGMAMAQAHL AAIFNRPGYPVFDSFTYVLGGDGCLMEGISSEAFSLAGTLKLEKLIVLYDSNNISIEGST DIAFREDVQGRMKAFGFDTFTVEDGDDLDEILSAINQARVSQKPAFIEVKTRIGKGCPAK EGKASAHGEPLGEENVRELKRTLGLDEDRTFAIDEEVYLNTAKQQERSEVIYKIWKDMFD DYLVTYPEMKKLWKQYFEVDYNKVLENDVEFWKFENKPESTRNLSGTAMNRIKELFPNFM GGSADLAPSTKTYLNKVGDFSAENYRGRNLHYGVREQAMTGIGNGIMLYGGLKTFVSTFF VFSDYVKPMARLSALMNLPLIYVLTHDSIGVGEDGPTHEPVEQLTMLRAMPNFNVFRPCD AQETNAGWYLAMTSKTTPTALVLSRQNLPQIKGSSREALKGGYVIDECEGTPEIIIIASG SEVTLAVEAKKRIRGKAIRVVSMPSMDIFKQQSREYRESILPPGIEKRIAIEAGSRMSWG EFVGLKGKYITMDSFGASAPADELFKRYGFTVDNVVKMINKL >gi|223714205|gb|ACDT01000010.1| GENE 2 1943 - 2638 657 231 aa, chain + ## HITS:1 COG:lin0639 KEGG:ns NR:ns ## COG: lin0639 COG1011 # Protein_GI_number: 16799714 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Listeria innocua # 5 224 6 234 234 160 39.0 2e-39 MKFYFDVDDTLYDQFIPLKKAFIEIFPKIKTLPIYQIFIKFRKYSDKIFEASQSGQVTIN EMYIYRIQAALKEYYLEIDEAHALAFQTKYYVYQQDITTSNAIKDILDLLKDNHIGIGII TNGPACHQMEKLRRLKMSNWVALEDIYISSVVGYSKPDPQIFNLITDKDCIYVGDSYEND VIGAKNANWKCIWLNKKGLKAMDIKPDYEVNDEEALLNLISKIVKESAQKM >gi|223714205|gb|ACDT01000010.1| GENE 3 3167 - 3727 428 186 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0868_0096 NR:ns ## KEGG: HMPREF0868_0096 # Name: not_defined # Def: CarD-like protein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 19 183 1 165 210 112 37.0 4e-24 MRYNRYIKHILNRKLGGYVYEKGDLIIYGNQSVCRIENIGVISIGKQPNSRKYYTLNPIF 
MDGKTYVPIDTQVYMRHLISIEELERLLTRHPKVQKEIIENQNLRQLTDYYKETISQYTC DGLIQLICNLQAKQKNLLLQNKRLSQTDERYKKEAEDLLHQEFAAVLNIPKEAVMAYIEN KIKVAG >gi|223714205|gb|ACDT01000010.1| GENE 4 3930 - 4130 276 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735656|ref|ZP_04566137.1| ## NR: gi|237735656|ref|ZP_04566137.1| predicted protein [Mollicutes bacterium D7] # 1 66 1 66 66 116 100.0 5e-25 MNQPPKMISTKDSGSFNDQLNSLYVLSKKLKAYEESVEDNDIKMALGRVNSTIKNHYSEL LGCLNG >gi|223714205|gb|ACDT01000010.1| GENE 5 4123 - 4443 384 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757259|ref|ZP_02429386.1| ## NR: gi|167757259|ref|ZP_02429386.1| hypothetical protein CLORAM_02809 [Clostridium ramosum DSM 1402] # 1 106 1 106 106 189 100.0 4e-47 MDNQNYSNPTKQPILPNENTFNDYDRINDVLLCLKSLLSDYTTFIIECSHQQLANKLIDI QKEVYQIQREVFDLMYSKGWYPLEPETPQKIQKVVQEYTTKESHLM >gi|223714205|gb|ACDT01000010.1| GENE 6 4510 - 4992 435 160 aa, chain + ## HITS:1 COG:MA0743 KEGG:ns NR:ns ## COG: MA0743 COG1846 # Protein_GI_number: 20089628 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 40 150 76 186 196 65 31.0 4e-11 MIPEVIFVDQERFDNMIKVIDESYDLIQEYDSKPRQFGKVMLYPVETHTLAIIGNHPGIN ASMIAKELKKTLSASSQILKKLEQKGLISRTKNPHNNREYNLYLTVGGRQIFDLHGQFDK KIMKRYFKNLISFSNEEIDIYIKVQRALNKEYAQDNEESL >gi|223714205|gb|ACDT01000010.1| GENE 7 5080 - 6273 1310 397 aa, chain + ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 3 390 2 381 384 325 40.0 1e-88 MKYDFETLVDRSQNGSAKWNGMKDHNPAVAKNIAPLSVADLDLKLAPEIAEGMLEFMQNN PVFGYTNGTAAYYDAVINWMKDKHNYQVEKEWIVLSNGVVPALSDGVTAFTEENDGVIIF TPVYYPFYRAIELNNRRVQRCPLINHENSYQIDFDKFEELAKQANTKLLILCNPHNPVGR VWTKEELEKIADICLNNGIIILSDEIHEDLTMPGYTHYPIAVLSEVIGDITVTCTAPSKS FNIAGLQGSNIIIKNKELRAKFIAQQEKKGFFTLNTLSFEATRLAYTKGDAWLAEFKTLI 
HHNYDILVDFIKKNLPSVTVYPLEGTYLAWLDFRNLGYDYLELEKKMIAADLYLDEGYVF GQEGEGFERINIACPTWVLKEALERLKNAFSGPQVSK >gi|223714205|gb|ACDT01000010.1| GENE 8 6392 - 6736 432 114 aa, chain + ## HITS:1 COG:no KEGG:DET1117 NR:ns ## KEGG: DET1117 # Name: not_defined # Def: hypothetical protein # Organism: D.ethenogenes # Pathway: not_defined # 3 109 4 110 136 163 66.0 2e-39 MSKCKITVLRREYYQDLADQYLADPQVGKCPLFKEGQEFILERGNGKDDFWNMMNGKFCS EAWDAVSRYIYAALQGGSIMKGWTNDEKMMIACCSDGTRPVIFKIERIDEMEEN >gi|223714205|gb|ACDT01000010.1| GENE 9 6738 - 7043 398 101 aa, chain + ## HITS:1 COG:SP2023 KEGG:ns NR:ns ## COG: SP2023 COG1440 # Protein_GI_number: 15901844 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Streptococcus pneumoniae TIGR4 # 1 101 1 101 102 92 50.0 2e-19 MIRILLCCAGGMSTSLLVTKMEQAAQEKGIETKIWAVGATEVKNHSQDADVILLGPQVRY LEKSIADDANGVPAYLIDMRDYGKMDGKAVLTFALNKLQEV >gi|223714205|gb|ACDT01000010.1| GENE 10 7046 - 8473 1360 475 aa, chain + ## HITS:1 COG:CAC0743 KEGG:ns NR:ns ## COG: CAC0743 COG2723 # Protein_GI_number: 15894030 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 8 472 5 469 471 637 64.0 0 MKMVEKNFPKGFLWGGAIAANQCEGAVLEDGKGYSTADALPEGVFGNEAIPPLDNYLKKN AIDFYHRYKEDIALFAQMGFKVLRLSIAWSRIFPHGDETVPNEKGLQFYDDVFDELAKYG IQPLVTLSHYEMPLYLATNYGGWVNRKVIDFYLNFAQVCFERYCNKVKYWLTFNEINVIL HAPFNGGGIKGKADEINPEVLYQAIHHQFVASAQATKIAHKINPNFKIGCMIAGSPIYPL TPQPEDIQETVRRDRESLFFADVHVRGKYPGYMLRFFEENHIHLEITKEDEEILKNTVDF ISFSYYMSYCATADEKLNQQARGNVMSGVKNPYLKESEWGWQIDPKGLRYILNQFYDRYQ LPLFIVENGLGAKDILVVDEQYGYTVNDDYRIDYLKEHLKQVLEAIKDGVEVLGYTSWGP IDIVANTKCQVSKRYGYIYVDVHDDFSGTLARYPKKSFWWYQKVIATNGESLNDI >gi|223714205|gb|ACDT01000010.1| GENE 11 8600 - 9436 979 278 aa, chain + ## HITS:1 COG:YPO3017 KEGG:ns NR:ns ## COG: YPO3017 COG1737 # Protein_GI_number: 16123196 # Func_class: K Transcription # Function: Transcriptional 
regulators # Organism: Yersinia pestis # 9 259 21 271 292 97 31.0 3e-20 MILEKLRERKRFTNSEKLIAEYVLKHPEQLYQLSVEELGKETYTSKATVIRLCKKLEVNS YQEFKRQVEFEYNELSRISSLLKDEPVDENSTYQDIVKVIPTIYDKSIVQTRLDLDQNKM IQVINRLSQAKRISIYGTGISYTIASQAAFKIMTLGKECDAHSGINEHFILSQKNSRQCV AILISFTGNNSEMIKIAKYLKKAGIYVIAIGGKKGELQNYCDIYLNVYSSQDILSLEVIT SFTATMYVLDVIFVSLLVKDYQSNVTTALKVIDFENKK >gi|223714205|gb|ACDT01000010.1| GENE 12 9789 - 10646 782 285 aa, chain + ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 1 276 1 277 283 124 31.0 2e-28 MSLLSKLEYKKGFSDIEKGIANYIIDHKEEVANMRLVELAEATFTSTATISRFCKKLGEK NYNSFKINFASSVLTSYQTDVDYNRPFKENDSIQEVTNQLGELYKDTIEATKALLDYDVL NQVIEKLLKTSVIDIFAVGASYLSGLLFEHRMISIDHFVNFKSSPNDQDKRSLFVNKNTV AIVISYSGESHEIKTIVDRIVDNKGTIIAITSINDSYLRKRANYCLTMCSKENIVSKIET YSSKLSSDYLMDLIFSILFQKDYYPNLIRKINCEQKYEKKTDHDM >gi|223714205|gb|ACDT01000010.1| GENE 13 10773 - 12131 1769 452 aa, chain + ## HITS:1 COG:lin2906 KEGG:ns NR:ns ## COG: lin2906 COG1455 # Protein_GI_number: 16801965 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 4 448 11 439 450 215 32.0 2e-55 MDKMADVLGRVGAWCGQNKYLSAIKNAFQTFMPLTIAGAIGVLWCNVLVNADSGLGMFWK PIMALEVINPAFAAMQFATISCITVGITFGIAQEIGEANGETGYFAGLLGLACWLAVTQS GWANYALVDTAKQTLELTADGALQTFTGVAGGALGATGLFTGMIVGVVSVEIFCALRKVE GLKLKMPETVPPGVARAFEVLIPAVITLAIIALIGRGCELVTGLYLNDVISTYIQGPLGA IGSTIPGVIIIYIIIGLFWLVGIHGNNMLAAVKEALFTPLMLENIETFSKTNNAKGDDLH IFAMAWLQMFGEFGGSGVTIGLVISILVFSKREDNRTIAGISLVPGLFNINETVTFGIPM VLNPILGIPFVLAPIVTLMLGYVLTVIGFCPKAVINTPWTTPPILHGFLTTGANIMGAVS QAIAIVASILIYAPFLIAYERYQNKQAAEAAE >gi|223714205|gb|ACDT01000010.1| GENE 14 13221 - 13520 330 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757249|ref|ZP_02429376.1| ## NR: gi|167757249|ref|ZP_02429376.1| hypothetical protein CLORAM_02799 [Clostridium ramosum DSM 1402] # 1 99 
1 99 99 184 100.0 1e-45 MRLLSIIKVIVVMIDARSEWNVKDGQTKFYEYYLKRVKPEYQDAAQHLLKESFKRQADGN FDDNFINKFVKETQVYIETDKLTEVLEIIEHFCTGHFNN >gi|223714205|gb|ACDT01000010.1| GENE 15 13559 - 14386 1117 275 aa, chain - ## HITS:1 COG:SMa0326 KEGG:ns NR:ns ## COG: SMa0326 COG1028 # Protein_GI_number: 16262630 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Sinorhizobium meliloti # 2 274 7 276 279 193 39.0 3e-49 MKTWLITGCSSGIGRGIAKAVLKKGDNAVVTARDTSKIADLVDQYPNTALGVALDITKKE SISAAVKQAQERFGTIDVLINNAGYGYRSSVEEGDIDDVNLLFNTNFFGPIELIKEVLPQ MRANKNGAIVNVSSIAAVRSGVGSGYYAASKAALELMTDGLAKELKPLGIKVMIAQPGSF RTNFYDTSLKGTKNKIDDYNETAGKTRKENVVNHKNQPGDPDKGGQVIVDTIEKDNYPFR LLLGSDATKIVASALEERIQEIETWKNISSQSDFE >gi|223714205|gb|ACDT01000010.1| GENE 16 14634 - 15623 805 329 aa, chain + ## HITS:1 COG:lin0520 KEGG:ns NR:ns ## COG: lin0520 COG1940 # Protein_GI_number: 16799595 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 9 316 7 317 334 95 23.0 2e-19 MNRIGTGKLRDLNDQIILKLLLRHQEVTKKELANYSRLTVATVGTILNDFLDNGTVVEKE IIYLKKGRPTKKYGLNPEYFHSLCLFVQRKRGRDYLCWQIIDALASVLKQGKELVNDLKL EDILKFIKYLLSQDSRIQIIGLGVPAIISKEIVIESDIPNLKNVNLKAEIEKATGLKTVV KNDMNYTAYGCYLHNNKKDLCYVTFPLNSGPGCGSVINGKLIEGENSIAGEILYLPFFKY LKQEQLCFDYSPEDVALSLCCVASIINPSIVILTGEAIEADDILKIKKICLEYLPSEFMP ELTYCANYENDYLLGIQEVIRESFIKQDK >gi|223714205|gb|ACDT01000010.1| GENE 17 16107 - 16577 538 156 aa, chain + ## HITS:1 COG:BH3729 KEGG:ns NR:ns ## COG: BH3729 COG1869 # Protein_GI_number: 15616291 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type ribose transport system, auxiliary component # Organism: Bacillus halodurans # 4 138 3 126 129 58 30.0 5e-09 MKPKKRILHERLAAVVASLRHGEMLFIADAGSGTSSKALYPLDSSVEYIDVEVVTGSPSF 
EDIVTTLVACGDFEGAIVTEDMIEVDQKDYQLLVDLLGKKNIHSMHYIPEYYEMRDRCKA VVQTGDYGVHAQAILIAGYPSAEISMKWLKEGIKGK >gi|223714205|gb|ACDT01000010.1| GENE 18 16580 - 17140 734 186 aa, chain + ## HITS:1 COG:SPy1065 KEGG:ns NR:ns ## COG: SPy1065 COG0110 # Protein_GI_number: 15675057 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Streptococcus pyogenes M1 GAS # 2 185 3 186 188 245 61.0 3e-65 MTEREKMELGMWYDANNDEELLKQRVLAKDLCFDLDHIKPSSYEKRLEVITSLLGYQPIN IELISPFMCDYGNKIKIGKNAFINSNCYFMDGGGIELGDNVYIGPSCGLYTAIHPTEYKI RNTGLEQALPIKIGNNVWLGGNVVILPGVTIGDGCVIGAGSVVTKDIAPNSVACGNPCKV IRSIEN >gi|223714205|gb|ACDT01000010.1| GENE 19 17284 - 17781 552 165 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2115 NR:ns ## KEGG: CDR20291_2115 # Name: not_defined # Def: MarR-family transcriptional regulator # Organism: C.difficile_R20291 # Pathway: not_defined # 17 162 18 164 165 104 37.0 1e-21 MKKSSYELLGQDPKWVKKLLFAGVFIQENKLHAIFDRYNEMTSKQWLLMATMTSFTDAPD LSMLAKVMGCSRQNVKKLALSLEKQGYVILEKSPKDARSLCVVMSKQGLRFRKDMEELTD EVHQALFSEFSNEEITQYYQLSIKLMHGIDHLEVFFKNRQKGAEG >gi|223714205|gb|ACDT01000010.1| GENE 20 17778 - 18188 459 136 aa, chain + ## HITS:1 COG:MJ0298 KEGG:ns NR:ns ## COG: MJ0298 COG0716 # Protein_GI_number: 15668473 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanococcus jannaschii # 4 134 5 149 150 59 36.0 2e-09 MIEVIYKSKKNNKKIAKAIADELQVDAKNIKQVQDVKADILFLGCPIIGGNIPEQVTRFV AQLDANKVKKVILFSANGFGTDQFTNVKAQLKEKGIIYGPVFSCKGSAFIFKNFGHPDKK DIQAAKKFAKEVLKWR >gi|223714205|gb|ACDT01000010.1| GENE 21 18179 - 18886 538 235 aa, chain + ## HITS:1 COG:lin2436 KEGG:ns NR:ns ## COG: lin2436 COG1187 # Protein_GI_number: 16801498 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Listeria innocua # 12 231 6 230 233 157 39.0 1e-38 MEVSRIKDLYYVLMKNKLGNHKICKTLIRHGEIRVNDQVITDVRYPVKADDIIMYHHIQL 
NAQPFVYYMMNKPADYLCANHDKYHQCVIDLIGREDCYCLGRLDIDTTGLLIITNDTSLS KKLLLPQNHVNKKYYVTTKFPLKDELIKKFSDGVVIDHKVQCLPSLLEILDEYHCLVTIN EGRYHQIKKMFKSCQNEVCTLKRIAFAGIELDQKLALGEYRDLTSTELITLFKHC >gi|223714205|gb|ACDT01000010.1| GENE 22 19004 - 20332 1342 442 aa, chain + ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 6 437 92 528 547 204 29.0 4e-52 MLFDNKALKKLIIPLVVEQLLVITVGFADTVMVASVGEAAVSAVSLVDAINILLINIFAA LGTGGAVVAAQFIGQGSKDKAKYSAKQLILITALLSVIIMVIVIAFNVPLLHMVFGNVEV DVMKNAEIYFLFSAMSYPFIALYNSGAALFRAIGNSKISMINSAVMNVINIILNAIFIFV FKWGVFGAVLATLIARAVACIVILKMLSHRDNDVCVNDYLHWKFDFMYIKKILAIGIPSG LENGMFQLGKILVQSLIATFGTYSIAANAVSNNLAQMQIIPGMAMSLAMVTVVGQCVGAN DYKQAKYYVKKLLKITYLSMAGLIVVLIIATPSILTFYSLSKETTDLAYQCIFLHAIIAA LIWPTSFTFPNALRAANDAKFTMIVSAASMLIFRLCFSYVLGQYMNLGLIGVWIAMFIDW GVRCIFFIWRYFSGKWMNRQLV >gi|223714205|gb|ACDT01000010.1| GENE 23 20521 - 21009 503 162 aa, chain - ## HITS:1 COG:L179010 KEGG:ns NR:ns ## COG: L179010 COG4769 # Protein_GI_number: 15673323 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 3 157 4 161 176 81 38.0 5e-16 MTKRITYLSLLIAIAIILGYLETFIPINFGIPGAKLGLANLIGLIALYRFGFRDALIITL LRVFIIASVFTNYYMFLYSLSGALLSLVIMVLLKKTKLFSTLIISISGAIFHNLGQILVA LIFYGFNIIYYLPYLIILSLITGSVIGILGQIILAKLPKQLS >gi|223714205|gb|ACDT01000010.1| GENE 24 21019 - 21378 320 119 aa, chain - ## HITS:1 COG:L178600 KEGG:ns NR:ns ## COG: L178600 COG5341 # Protein_GI_number: 15673322 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 3 113 17 129 135 67 33.0 6e-12 MFKKKDFIIIFIIFVVIIGALLFNHFYFSQKAAYVKITVNNQVYRTVPLEIDQTIKISDK NMIVIDHGSVHMKDATCPDKLCIKQGTISKNGEQIICLPNQVVIAIVSNQYNDVDSSSK >gi|223714205|gb|ACDT01000010.1| GENE 25 21407 - 22381 776 324 aa, chain + ## HITS:1 COG:lin2785 KEGG:ns NR:ns ## COG: lin2785 COG1477 # Protein_GI_number: 
16801846 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Listeria innocua # 3 319 2 343 360 218 38.0 2e-56 MKKRWFIKILIILIIFCGCTNKQPLSQTQLLLDTQVTITLYDCASSNILDECFAICKDYE LLFSRTNPKSELYQLNHQDKTKPIKISKELAKVINIGLEYSKLSNGTFDITVGQLIDLWD FKADTPKLPETSAIAGALTSIGYRGITLNDSTISFSNPNTIIDLGAVAKGYIADKIKKYL IEQGVDSAIINLGGNVLCVGKKNSDDFTIGITDPKGSSDILKLKINDQSVVTSGIYQRYF EVEGKYYHHILNPKTGYSYDNGLASVTIISNHSVDGDALSTVCFTLGKEQGLALVNQLAG IEAVFIDTNNALYYSDHAKDYVIQ >gi|223714205|gb|ACDT01000010.1| GENE 26 22453 - 23253 1010 266 aa, chain + ## HITS:1 COG:BS_ykrA KEGG:ns NR:ns ## COG: BS_ykrA COG0561 # Protein_GI_number: 16078519 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Bacillus subtilis # 1 262 1 252 257 101 27.0 1e-21 MNTKIAFFDIDGTLVNVPNGMLHPTDETIRVLNEFKNQGNKIVIATARGEVPESVANIEF DGYICNDGHYIRFNNEILIDEQFDNGMVQKQLDVYAKYNGRSMFNGRGGAWCSFLDDELV IKHRAMFQGTTERPTDVNEVFKTSDVKAVSCCVLFDSAAQLWAAYHELEDEFTMIPYDTG LIRMDVYCKGFTKGTACEYLYQKLGIDYENTYAFGDGINDVEMLQLVKHGIAMGNAIPKL KSVASEITDSVDNDGIAQSFKKHFNI >gi|223714205|gb|ACDT01000010.1| GENE 27 23406 - 24455 999 349 aa, chain + ## HITS:1 COG:ECs0431 KEGG:ns NR:ns ## COG: ECs0431 COG1181 # Protein_GI_number: 15829685 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Escherichia coli O157:H7 # 1 342 1 352 364 249 37.0 6e-66 MKRKTIAILFGGKSSEYEVSLQSAAAIIKNFPQEKYDLYLIGITRAGQWFHYDGEIEKIS NDTWNQELLNPVTVCLEPERQAVMEIVDNKIIYQQIDAVLPVLHGKNGEDGTVQGLMELA KIPVLGCGLLASALCMDKYRAHQLVKAAGIKVTRSVLVHNINDFNVKALNFPLFVKPLKA GSSYGISKVVKMEELTKALEYAFNYDNEVIVEEAVEGFEVGCAIIGNDELTVGEVDEIEL SGGFFDYDEKYTLKTSTIHMPARIDEVTARKIQETAKIIYQTLACQNFARVDMFLTPNNE IVFNEVNTIPGFTDHSRFPQMMKGVGFEFSELIDKIVAIGLKYADGNFK >gi|223714205|gb|ACDT01000010.1| GENE 28 24433 - 25185 711 250 aa, chain + ## HITS:1 COG:BH1810 KEGG:ns NR:ns ## COG: BH1810 COG1876 # 
Protein_GI_number: 15614373 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 2 244 50 287 290 98 27.0 1e-20 MQMVTLNKDDIYQGPLVLVNHNYPIANNYQITNCSTAKDSTISLEKTAMEMLTKALQEIK AEDKIVLVSGLRSGQEQAMIYNQAIQNHGKEFVDKFVARVNCSEHQTGLAIDLAVKNETI DIFCPDFTSQEICQQFRDIASDYGFIQRYSKNKIDFTKISEEPWHFRYVGIVHAKIMEEN QWCFEEYHLYIKNHSIFNPYVLDCQGKLVEVFYVPAVNEQTKLNLSDHGSYQISGNNCDG FIITSWRPYV >gi|223714205|gb|ACDT01000010.1| GENE 29 25148 - 26212 665 354 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1526 NR:ns ## KEGG: CDR20291_1526 # Name: vanTG # Def: serine/alanine racemase # Organism: C.difficile_R20291 # Pathway: not_defined # 11 324 1 313 712 212 42.0 2e-53 MVLLLHHGGHMSKVKGYPLIDCFRIIAAILVVTIHTSPLDTISPTLDFYLTRGLGRIAVP FFFITTAYFYFLNPSKKRLYKIIKQTLLIYLAAIIIYLPLNLYNHYFNQSNLIYEIIKDV LFDGTFYHLWYLPATVLGLITVAVLDKRVSLLTSFSIVVLLYLIGLGGDSYYGLIHPYCK WGYDLIFLISDYTRNGIFFAPLFIWIGKLLAVKQFNFSLRVIVISTILAISLMFIEISLL KGYQDAADYVFVRHDSMNILLPVVMFWLYQGLLRISGKRISKCKDLSLAIYLFHPLMIVV VRGIGKLLKCDHLVVDPLIQFISVLIITVLFSEILLKGKEKYGEYSHKSQLDRS >gi|223714205|gb|ACDT01000010.1| GENE 30 26172 - 27302 1191 376 aa, chain + ## HITS:1 COG:alr2458 KEGG:ns NR:ns ## COG: alr2458 COG0787 # Protein_GI_number: 17229950 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Nostoc sp. 
PCC 7120 # 6 372 29 398 401 232 34.0 7e-61 MVNIRTSRSWIEVDLDAITHNLDELSSLLNKNSSLMVVVKAEGYGHGLVEVAKHCQKMGI EAFAVATIDEGIKLREHGIYGEILVLGYSHPIRAIELHNYRLIQTIVNREHARQLNESGH MINVHIKIDTGMHRLGFDLSDYESIKEIYDYQFIHIQGYFTHLCVCDSKRKVDQEYTQKQ IAAFYDLINELKLRHYDIGKVHLQSSYGLVNYPEINADYARVGILMYGVYSTFNDYSKVQ LDLKPALTIKAQIAMLHYLTKGASLGYGLTYKVNRDSVIAVIPIGYGDGLPRILSSNGSV LIKGQKCPIVGKICMDQMLVDVSEISDISDNDLVTVIGQDGTEEIRVEEIALEANTISNE ILSRLGSRLPRIYRGG >gi|223714205|gb|ACDT01000010.1| GENE 31 27325 - 28515 1295 396 aa, chain + ## HITS:1 COG:BS_yodJ KEGG:ns NR:ns ## COG: BS_yodJ COG1876 # Protein_GI_number: 16079020 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus subtilis # 198 385 80 269 273 169 50.0 6e-42 MSYNEKKKIPYKVIISVLIAMLILGTLIAFNFTRIRLMTKGYGWSEQSIILQLNDNEIER YLDAEKVSDLSSWNEYKNKKHYLEYQTYQKEHQDYSKKEVIKYIDTFYQDYYQDLKDLKY SYKQVLSLMKVAELSDFKVIVDNKYTYSQIKSYLKINGMVFEDLPKYLASNQEPITAVLT VTYPFIDANNAVGSEYEVLDPSNTLLLIKKGFVLPKDYVPADLVVPDIPIAPDNNHNKLR KDAAKALEDMNKDALKEDYHLVLNSGYRSYDEQVEIYNDYFNRYDEVTASGLVAKPGSSE HQLGLGVDLTSQSVIDKKRMVFGDTDEYKWVAKNAYKYGFILRYPKNRSDITGTANEPWH LRYVGKKAAKIIYDNNWTLEEYILKYGFDYELKKID >gi|223714205|gb|ACDT01000010.1| GENE 32 28595 - 29056 515 153 aa, chain + ## HITS:1 COG:SMc03800 KEGG:ns NR:ns ## COG: SMc03800 COG3760 # Protein_GI_number: 15966936 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 2 153 66 220 224 102 38.0 4e-22 MDIYKFLTQLGIDYQKINHQSVSTVEEAMFINEQIDGVACKNLFLKSSDKQFFIYMMTVE KRADLKYIAKQIGSKRLSFASDEELVELLKLIPGSVTPLGIINDNKRVKILIDRELRAQR LLIHPNINTATISITYDDLLLFINACGNEYLII >gi|223714205|gb|ACDT01000010.1| GENE 33 29228 - 29462 161 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757228|ref|ZP_02429355.1| ## NR: gi|167757228|ref|ZP_02429355.1| hypothetical protein CLORAM_02778 [Clostridium ramosum DSM 1402] # 1 78 1 78 1078 157 100.0 1e-37 MLVTDKELHRIVSSVRNLRTGRIYYQKDYVREITITYDRLFDDYTIEGIVDNGSYENVCK IVIDQDNKITDYSCNCYW Prediction of 
potential genes in microbial genomes Time: Thu May 26 09:15:27 2011 Seq name: gi|223714204|gb|ACDT01000011.1| Coprobacillus sp. D7 cont1.11, whole genome shotgun sequence Length of sequence - 12500 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 2819 2606 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family + Term 2822 - 2866 6.1 + Prom 2848 - 2907 6.8 2 2 Op 1 . + CDS 3107 - 3763 570 ## COG0122 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase 3 2 Op 2 . + CDS 3811 - 4542 728 ## COG1737 Transcriptional regulators + Prom 4560 - 4619 7.0 4 3 Op 1 2/0.000 + CDS 4689 - 6575 2276 ## COG1299 Phosphotransferase system, fructose-specific IIC component 5 3 Op 2 . + CDS 6586 - 9153 2296 ## COG0383 Alpha-mannosidase + Term 9160 - 9192 3.2 + Prom 9158 - 9217 3.8 6 4 Op 1 . + CDS 9258 - 10508 1440 ## COG2015 Alkyl sulfatase and related hydrolases 7 4 Op 2 . + CDS 10523 - 11428 1019 ## COG0451 Nucleoside-diphosphate-sugar epimerases 8 4 Op 3 . + CDS 11467 - 11790 288 ## gi|237735694|ref|ZP_04566175.1| predicted protein + Term 11925 - 11967 -0.8 + Prom 11913 - 11972 7.5 9 5 Tu 1 . 
+ CDS 12014 - 12454 654 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases Predicted protein(s) >gi|223714204|gb|ACDT01000011.1| GENE 1 3 - 2819 2606 938 aa, chain + ## HITS:1 COG:FN1160 KEGG:ns NR:ns ## COG: FN1160 COG0553 # Protein_GI_number: 19704495 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Fusobacterium nucleatum # 36 938 176 1088 1089 546 38.0 1e-155 NLIASFKTAKAIDFNQVKLNQKYDLEVQIKINYGHPCLKFKVGSDKKYIIKNIPNFLSAL ANNEYVEYGKQLAFVHNQKVFSDEALKIIELMKFCMITYERYQQEYYYDISHVTNEIKII PESIEYIYEVLKDLEKTNQHLSIYQYNYQPTLHIRKLEKYYSLELKDFDEFCGYKNLYKI TENEIGMVKLDDEGKAIKFIHNCESKDKLLLAKEDVNDFVKYILNDIKQYFNYTGDLLTS EVIETEILTLYGDINELEQIKVYLESKQGLETIYSFQQPRKQTSINFDLIEDYLKEIGDV VDFEKHCMYLNLESEKTYSFLNEGLPFLANYCEVMVSDALKKIGKKSQFSITVGVTLEND LLAIDIESIDIPKDELALVLNSYQKKKKFHRLKNGQLLYLDSDELKELNEFMTDYQIRPK MLEDGHLEMDVYRAFSLDNKAETSNYLVYDRSTVFKEIIDNFKNIAKQSYPLAPNYQEIL RDYQKFGYQWLQSISSYGFGGILADDMGLGKTLQMIVLLDQNRDDKKTSLVVCPSSLLLN WQDEIHKFSNSLSCTCIHGSLKRRKEAIRKFNEVDVLITTYDYMRRDANLYDGYQFENII LDEAQYIKNPKTKNAITVKSLKAKHRFALTGTPIENSLAELWSIFDFLMPEYLYSYHYFK TNYETPIVKNHDENKQQELKKLITPFILRRNKKDVLTELPEKIEKTLSIEFNEEENKLYL ANLIQVNKELQEQLNYDNVDRIAILAMLTRLRQICCEPRIIYDNISNISSKLSGCLDLIR NFLGNNQKVLLFSSFTSVLDLIAKELEKESITYYQLTGDTKKEERHRLVNQFQNDDTTVF LISLKAGGTGLNLTAAEAVIHFDPWWNMSAQNQATDRAYRIGQENVVTVYKLVMKDSVEE KILELQNKKKNLADSFVENNEGSITTMSTNDIIELFKV >gi|223714204|gb|ACDT01000011.1| GENE 2 3107 - 3763 570 218 aa, chain + ## HITS:1 COG:MYPU_0950_1 KEGG:ns NR:ns ## COG: MYPU_0950_1 COG0122 # Protein_GI_number: 15828566 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase # Organism: Mycoplasma pulmonis # 3 200 2 199 217 189 44.0 4e-48 MEYFEYSDIEIEYLKSRDPVLKQVIEQIGIIKRAVIPDLFAALVNSIVGQQISTKAHQTI WQRINNSFSPITPANFKSLSAEELQKCGITMKKAHYIKEITRRIISDQLDLERLADLSNE EVIEQLCELPGIGRWTAEMILTFSLQRKDIISYSDLAILRGMRMVYHHRRITPQLFTKYK 
RRYGPYATIASLYLWAVAAGAIPELKDYAPKSKNNVRI >gi|223714204|gb|ACDT01000011.1| GENE 3 3811 - 4542 728 243 aa, chain + ## HITS:1 COG:L54944 KEGG:ns NR:ns ## COG: L54944 COG1737 # Protein_GI_number: 15674168 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 14 233 16 253 282 82 27.0 1e-15 MKFEERVLQQEFTLNDTDDLIVDYIRKNKANLEHLSIQKVAEDLYLVPNSIMRLCKKLGY KGFAEMKFLVNNEVSENLDIQKLLSKNLYKSMELINYKTLEEVAAKIKNARTCHFIGVGE SLYFCEMMVDNLRCYDYRAEYYQTYREIEYRIKHCEEKDILFVISASGENGRLNNWVEEA KQNGIFVISVTHFNENRLAGLAHIPLYYWGDEQKLNGYNVADRTGMMMLLRELTEVFWRM YCV >gi|223714204|gb|ACDT01000011.1| GENE 4 4689 - 6575 2276 628 aa, chain + ## HITS:1 COG:hrsA_3 KEGG:ns NR:ns ## COG: hrsA_3 COG1299 # Protein_GI_number: 16128706 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli K12 # 286 624 8 348 360 340 55.0 5e-93 MELSKLTSEQLITLNSDLSSKDEIIKFLVSKLYQAGKISNEEDFYQAVLERESLTPTGID NGLAIPHGKDGVVREAAFAVVTLKKPVKDWESVVEGNKVQYVFLLAIPQNDKNSVQMQLL AEMMTKMANHTYTEKLYASKTVKEFYQNLDNGVNSDEIKSFDRSIVAVTACAAGIAHTYM AAEALTKAGQELGVNVYVEKQGANGIEDRHTNEMLKNASAAIFAVDVAVKEEERFSHLPT IKTKVSAPLKDAKKIIETALVKAEQTARGEYVEHSRHQEAGFLETVKEAVMTGISHVIPL IVAGGMIAAICVIFARTFGFTDLMNTEGSWLYLIKSMGSGLLGTLMVPVLSAYMAYSIGD KPALASGFAAGFCANTVGGGFLLGMLGGIIAGYSIRYLKKWIPAKGTFAGFVSFVIYPVL SCLIVGVLLLVVLGKPVAMINTALVDFLGSMAGTNAALLGAVLGIMVSFDLGGPVNKAAY AFCVAAMAEGVLMPYCAFASIKMVSAFGVTFATKFRKDLFTPEEREVGNSTWLLGLAGIT EGAIPFMMADPIRVIISLCTGSAVCGAIVAMFNIGLDVPGAGIFSIFVLTADNIPLAMIV WFGAAVIGAIISAVLLVITRKQKLAKAK >gi|223714204|gb|ACDT01000011.1| GENE 5 6586 - 9153 2296 855 aa, chain + ## HITS:1 COG:ybgG KEGG:ns NR:ns ## COG: ybgG COG0383 # Protein_GI_number: 16128707 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Escherichia coli K12 # 3 812 5 821 877 866 51.0 0 MANQVHIVPHMHWDREWYFSAEESKLLLLNNMEEILEMLENNEDYPYYVLDGQTAVLEDY LALKPKNLSRLVNLVKKGKLIIGPWYTQTDEMVVGGESIVRNLLYGHKDSQTFGEPMNIG 
YLPDSFGQSGQMPMILNGFGITRSIFWRGTSERMGSNKTEFYWTSDDGSKVLTQLLPLGY AIGKYLPTDLDELKKRCDKYMPVLQKGATSEHILIPNGHDQMPLQKNIFDVIEQLKRLYP DKNFFLSNYDQIFKELEKKDNYDILHGELLDGKYMRVHRSIFSSRADIKTMNTVIENKIT NLLEPLASIAFKLGFEYFDGMMESIWKEIMKNHAHDSIGCCCSDKVHQEISSRFTIANEK VERMIDFYKRKITDAMPKDIALDKLTVFNLLPYPRNRVVRSEVITRMKSFKIVTSEGHEI DYQVINKEIIDAGLIDRQIVHYGNYDPFIKYTIEFKDNFPAMGYQAYLIQEAVEPKEILL DKVDFLENEYYQIKINSNGTLKIYDLVNDHTYDSVLLIEDGSDEGDGYDYSPLLEDYIIT SKDAKADYQINRSRYSTTAHIIVEMKVPKDLTSRKQKKCDTSLKIEFQIELKQGERNLDI TTNIDNTAKDHRLRVLLPTDIWAQASVSDNQFGIIERPVKDMAIESWQEDNWSERPDSIY PMLSYVAIKGCPLALLTNSVREYEVIGDNYNTLAITLFRSVGYLGKEELVRRPGRPSGIK MPTPDSQLLKPLEFRFALVIENDHAKLSRLAKDYLTPLISYNKMPYNAMKLNEVDFNTPY TFSLLHENNELTTLSVLKKAESSNKLIYRTFNTNNKEVLISLNHDLKDFVDLKEEIVKTN NKLKKNQVKSFILDD >gi|223714204|gb|ACDT01000011.1| GENE 6 9258 - 10508 1440 416 aa, chain + ## HITS:1 COG:yjcS KEGG:ns NR:ns ## COG: yjcS COG2015 # Protein_GI_number: 16131909 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Alkyl sulfatase and related hydrolases # Organism: Escherichia coli K12 # 21 416 124 531 665 145 27.0 2e-34 MLGEEKLINFTKTNYQKKITKVNERVYHFLGYGHSNAIAIIGETSIILIDTLDSDYRAKV MKDELAKISDKPVKTIIYTHGHPDHLGGAGAFRDTVVEVIAFKPRKPILKYYDRISKLLN QRGIYQHGYGLTDQEAISQGIGIREGKEIKEGKYDFIKPTTYYEQKVVERVIDGIKLKMV SAVGEADDEIYVWLEDDQIMCCGDNYYGCWPNLYAIRGTMYRDVATWIDTLNEILNYPAV ALLPGHTRVLWQNEIKAIVGTFRDAIEDILFKTLDCMNQGMSLEETVKNVKLAPIYQDKE YLGEYYGTVEWSVKSIYNGYVGWFDGDPVKLLPHPTEAYNQAVLALIGDNDKLILAIKNY MAQEDYQMALQLIELLMVQDNSLQIIELKKQALLMRAKQVTSANARHYLIACAKSL >gi|223714204|gb|ACDT01000011.1| GENE 7 10523 - 11428 1019 301 aa, chain + ## HITS:1 COG:FN1299 KEGG:ns NR:ns ## COG: FN1299 COG0451 # Protein_GI_number: 19704634 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Fusobacterium nucleatum # 3 289 2 299 309 157 37.0 3e-38 MTKKILVTGGTVFVSKFVATYFKNHDYDVYVLNRGNRKQIEGVKLIKGDRHHLGNLLKGY 
DFDVVFDVAAYTKQDVKDLLEGLNGVKDYILISSSAVYPESLSQPFKEEQKVGRNSIWGD YGSNKIEAENYLLSHIPQAYILRPPYLYGPMQNVYREPFVFECALKKRSFYLPNDGKMLL QFFHVEDLCKLMETIIKKHPHDHIMNVGNSEIVDINKFVELCYQVVGVPLNKVYVTKHDN QRDYFSFHNYNYVLDVTKQNQLLPSQKSLFEGLEESYQWYLDHPDEVVKKNYIEFIDQEL I >gi|223714204|gb|ACDT01000011.1| GENE 8 11467 - 11790 288 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735694|ref|ZP_04566175.1| ## NR: gi|237735694|ref|ZP_04566175.1| predicted protein [Mollicutes bacterium D7] # 1 107 1 107 107 174 100.0 2e-42 MWIIFGSLSIICTLMSCLSIFKSQEKLRAYWFSILGLVFTVLTLLAGYQLVSNWVKHEDW SALLDVVPSMFTILTIYVITMLTANIIALILIRGKSKNTNDNLMKKR >gi|223714204|gb|ACDT01000011.1| GENE 9 12014 - 12454 654 146 aa, chain + ## HITS:1 COG:L170990 KEGG:ns NR:ns ## COG: L170990 COG0454 # Protein_GI_number: 15672558 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Lactococcus lactis # 9 144 9 145 147 142 50.0 1e-34 MEIKRVFERDEQLISSFVTVWESSIRATHDFFSETNIREIKKYVFQAIEEIPILVVAVDN KETLAFMGIAEDKLEMLFITAQRRGQGIGRNLLEIGINDYEITEVCVNEQNPQALAFYEY MGFTIYRRSELDEQGNPFPILFLHRQ Prediction of potential genes in microbial genomes Time: Thu May 26 09:15:37 2011 Seq name: gi|223714203|gb|ACDT01000012.1| Coprobacillus sp. D7 cont1.12, whole genome shotgun sequence Length of sequence - 13440 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 9, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 11 - 964 944 ## COG3706 Response regulator containing a CheY-like receiver domain and a GGDEF domain + Term 1023 - 1053 -0.4 2 2 Tu 1 . + CDS 1363 - 2208 919 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase + Prom 2210 - 2269 3.3 3 3 Op 1 . + CDS 2289 - 3134 834 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 4 3 Op 2 . + CDS 3151 - 4020 992 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 5 3 Op 3 . 
+ CDS 4049 - 4465 510 ## SSA_0903 NAD(P)H dehydrogenase (quinone), putative (EC:1.6.5.2) 6 3 Op 4 . + CDS 4513 - 5457 1039 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 7 3 Op 5 . + CDS 5460 - 6332 707 ## Tmath_1399 esterase/lipase-like protein + Prom 6390 - 6449 8.1 8 4 Tu 1 . + CDS 6471 - 7877 1490 ## COG1940 Transcriptional regulator/sugar kinase + Term 7897 - 7946 13.1 - Term 7881 - 7939 15.9 9 5 Tu 1 . - CDS 7965 - 8363 463 ## COG0789 Predicted transcriptional regulators - Prom 8429 - 8488 7.5 + Prom 8380 - 8439 11.9 10 6 Tu 1 . + CDS 8478 - 9095 713 ## COG0655 Multimeric flavodoxin WrbA + Term 9103 - 9154 6.5 11 7 Tu 1 . - CDS 9144 - 9575 464 ## COG1321 Mn-dependent transcriptional regulator - Prom 9601 - 9660 6.5 + Prom 9565 - 9624 5.9 12 8 Op 1 . + CDS 9675 - 9932 323 ## Dtox_3837 FeoA family protein 13 8 Op 2 . + CDS 9919 - 12060 1533 ## COG0370 Fe2+ transport system protein B + Prom 12122 - 12181 6.5 14 9 Op 1 . + CDS 12202 - 12822 552 ## Ccur_08110 hypothetical protein 15 9 Op 2 . 
+ CDS 12819 - 13232 487 ## COG0242 N-formylmethionyl-tRNA deformylase + Term 13322 - 13359 -0.9 Predicted protein(s) >gi|223714203|gb|ACDT01000012.1| GENE 1 11 - 964 944 317 aa, chain + ## HITS:1 COG:PA3702 KEGG:ns NR:ns ## COG: PA3702 COG3706 # Protein_GI_number: 15598897 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and a GGDEF domain # Organism: Pseudomonas aeruginosa # 121 311 150 333 347 114 35.0 3e-25 MFIICVLGLILLHKKNVKKGLLLKVQNYYIVFLIMWSLAVTIYDNRVSNNISVHIITMLS IAILVYLPPKLFVPLYLFSDIILITGIELVKHGEPLGSHYSIYVNSTWLTIMALFVGYYH YQSMRQTFLNQSIILEKNRMIEEKSAELDFIAHHDSLTGLQNRRYLANCLNNIYLESRDS QLKTGVFIIDIDDFKSYNDTFGHIKGDECLKRVSNALNLCLSEGFLIRYGGEEFVYLLKN TDLEEMQRIGTELCHTVERLHLLMPNTSSREYLTVSIGGAIGVLNGEESWQAVLEQADQA LYKAKNTNKNKCICAVE >gi|223714203|gb|ACDT01000012.1| GENE 2 1363 - 2208 919 281 aa, chain + ## HITS:1 COG:YPO2805 KEGG:ns NR:ns ## COG: YPO2805 COG0656 # Protein_GI_number: 16123003 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Yersinia pestis # 1 280 15 294 297 351 58.0 1e-96 MEYITLNNGVKMPMAGFGVFQIKDKEECVRVVLEAIETGYRLIDTAQSYGNEEAVGEAIL KTNVPREELFVTTKVWITNYGYEKAKASVEESLEKMKLDYIDLVLLHQPFKDYHGAYRAL VDLYKEGKIKAVGVSNFYPDRLVDLCLDTDVVPAVNQVEVNLFHQQNQALEYNQKYGVQL EAWAPFAEGKKGIFTNETLMGIGEKYNKSVGQVILRWLVQRGIVPLAKTVRKERMLENID IFDFQLSEEDMKTITNMNKDTSSFFSHYDPATVEMICGLKR >gi|223714203|gb|ACDT01000012.1| GENE 3 2289 - 3134 834 281 aa, chain + ## HITS:1 COG:L126956 KEGG:ns NR:ns ## COG: L126956 COG0656 # Protein_GI_number: 15673856 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Lactococcus lactis # 2 267 3 270 281 216 42.0 4e-56 MRIIELSNGVKMPIIGFGTIGQFGKQVEDNVSFALQNGYRLIDTANRYTNEKSVGRGISK SGLDRKEIFIETKLAPTFYEKSNAVEQTLERLGVEYIDLMLLHHPLNNYIAGYQMMEKAY KEGKIKALGLSNFKVEQIQEILDICEIKPVVMQIECHPYYPAEKVKDFCFEKGIILQSWY 
PLGHGSNELLNEPLFKALAIKYHKTTSQIILKWHTQMGFSAVPGSRSANHILENIELFDF VLTNEEMYEIAKLNKHCPFYQITEKSQHILATTIPDVDGEE >gi|223714203|gb|ACDT01000012.1| GENE 4 3151 - 4020 992 289 aa, chain + ## HITS:1 COG:STM1676 KEGG:ns NR:ns ## COG: STM1676 COG0656 # Protein_GI_number: 16765019 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Salmonella typhimurium LT2 # 1 289 1 289 289 425 66.0 1e-119 MEYIILNNQIEMPILGLGVFQVPDKKECQETVFQAIKAGYRLIDTAASYMNEDAVGNAVK QAIEAGICTREELFITSKLWVQDMRTYETAKQGIENSLKKSGLEYFDLYLLHQAMGDYFA AWRAMEDAYHKGKLRAIGVSNFYPNILTNFCEIVEIKPMVNQIELHPYFTQEQAIETMKY YDVVPEAWAPLGGGRYNPFEDEMLKGIAEKYHKTVGQVLLRWNIQRDVVVIPKSTHIERI IENINVFDFELNNDEMKKISSLDMGYSGSRAKHFEPDFVRMVVHRKIHE >gi|223714203|gb|ACDT01000012.1| GENE 5 4049 - 4465 510 138 aa, chain + ## HITS:1 COG:no KEGG:SSA_0903 NR:ns ## KEGG: SSA_0903 # Name: not_defined # Def: NAD(P)H dehydrogenase (quinone), putative (EC:1.6.5.2) # Organism: S.sanguinis # Pathway: not_defined # 3 135 2 134 163 115 43.0 6e-25 MSILFINGSPNKNGNTVALSEQFLKGKNYQTLHLVDYKIYSYGQHFTDDQFDEVVLAMSK ADTIVIGSPLYWHSMSGAVRNLLDRFYQHVDESLLKGKNMYFIFQGYAPTKQQLEAGEYT MQRFANLYGMNYRGMITE >gi|223714203|gb|ACDT01000012.1| GENE 6 4513 - 5457 1039 314 aa, chain + ## HITS:1 COG:YPO2806 KEGG:ns NR:ns ## COG: YPO2806 COG0667 # Protein_GI_number: 16123004 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Yersinia pestis # 3 306 13 312 329 276 47.0 4e-74 MYVSPIGLGCMGFSHAYGLAEDKERAIQTIKQAYKMGYNFFDTAECYTGENTDGTISYNE ELVGEALKEVREQVVIATKFGVEHKGDHLELDSSPEKIYASIEGSLKKLQTDYIDLYYQH RIDPKVEPEVVAGVMKELIAAGKIKAWGISETTEEYLRRAHAVCPVTAIQNRYSMMARWY EDLFVVCKELGITYVAFSPMANGLLSGKFTPDSTFEKGDFRNNMPQYQKEGYEKAKKLLN LLKVLSEKKNCTMAQLSIAWMLRKKDFIVPIPGSRKLERLEENFRAGEIEVTIDEIKEID ALLDTIDFDVFGGH >gi|223714203|gb|ACDT01000012.1| GENE 7 5460 - 6332 707 290 aa, chain + ## HITS:1 COG:no KEGG:Tmath_1399 NR:ns ## KEGG: Tmath_1399 # Name: 
not_defined # Def: esterase/lipase-like protein # Organism: T.mathranii # Pathway: not_defined # 4 289 349 632 635 303 54.0 5e-81 MNISENTTIQEVLELPAFKTWGRLLFPINRPIALNMTIKELSSSKIYLWYNYIDVNKTIE IINTLNHDAQNHSIFYNIYSKEELLNDASKRDVGLFFFKGSKDAPFAICNAGGGFVYVGA MHDSFPHALELSKRGYNAFALIYRVEHPYEDLAKAISFIYNNANMLKIDKRHYSLWGGSA GARMAAVLGNKKFLSSYSEIAVPSAEAVIMQYTGYDYVSNEDAPTYACIGTNDYIADHIL MKKRLEHLNKLNIPTEFHCYKGLSHGFGLGTGTIAEGWIDDAIYFWQRQS >gi|223714203|gb|ACDT01000012.1| GENE 8 6471 - 7877 1490 468 aa, chain + ## HITS:1 COG:BS_xylR KEGG:ns NR:ns ## COG: BS_xylR COG1940 # Protein_GI_number: 16078822 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Bacillus subtilis # 278 371 110 204 350 63 38.0 1e-09 MTEKKAVFDFKGWIREHHNTPYEIRLENDDLIKLVTEYGEASIQFTVIEEYTIVEFSIVS NKDHSVKFYLHFELNDENHAKQLYDEMVETLIGLKEEKTLRVLLSCSAGLTTSMFADNLN SVAGMLGLDYHFDAVSYMSIYEEAEKYDVILIAPQIGYMLKRLKESITEKPVLQIPTSVF ASYDALAALKFIQSELEIFRQEKSNEQAHECTHCAEYENRILSIVILISREQTRIYYRLN DKCEIIDSNLVIKPTMNIYDLYDIIDTILLKHSYIDMIGIATPGIVKDEKQLKEPTDGRT IDIKADFEDKYGIDVFVYNNANAAVVGFSLEHPEYDNIIFHSQPFGFGVGGQGIISNGKV IRGKNGIAGEIRYFIRRMQLSDDVHKLAWTQHGAVELVTKSLLPTISLIGPEAVVISSPM TPDMDEVKNVLSSFIDKEFLPEFYYIKEASSYMLSGITKLCVDFIEEE >gi|223714203|gb|ACDT01000012.1| GENE 9 7965 - 8363 463 132 aa, chain - ## HITS:1 COG:CAP0107 KEGG:ns NR:ns ## COG: CAP0107 COG0789 # Protein_GI_number: 15004810 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 3 114 4 115 152 90 41.0 5e-19 MCTMKEVCKEIGMNYETLKFYCKEGLVPNVKRDKNNYRNFDEKNIAWLKGIQCLRKCGLG VSEIKIYMNYCLQGPTSIPQRKDMLSQTKEHLLKRIAEINECIEYIDNKQRFYDDVLEGK CRYTSNLIEINE >gi|223714203|gb|ACDT01000012.1| GENE 10 8478 - 9095 713 205 aa, chain + ## HITS:1 COG:CAC3341 KEGG:ns NR:ns ## COG: CAC3341 COG0655 # Protein_GI_number: 15896584 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 205 1 
208 208 268 64.0 4e-72 MKVILINGSPNAKGCTYTALEEVSKTLKSEGIETEIIHVGHKDIRGCIGCRQCKTKGKCV FNDIVNDIAPKFKECDGIVIGSPVYFASANGTLVSFIDRLFYSMTADKTMKVGAAVVSCR RGGNSATFDELNKYFTISQMPIASSQYWNMVHGNSPEEVQQDLEGLQTMRTLGKNMAFLI KSIQLGKKEFGLPEKEEHKFTNFIR >gi|223714203|gb|ACDT01000012.1| GENE 11 9144 - 9575 464 143 aa, chain - ## HITS:1 COG:CAC2616 KEGG:ns NR:ns ## COG: CAC2616 COG1321 # Protein_GI_number: 15895874 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 5 129 3 127 157 96 41.0 2e-20 MNNDENAYFTFKGYQLNNDSNLTSSMEDYLEMIIRLKEKQPFVRINELANNLNVRPSSAS KMVTKLSQEGFLEYQKYGIITLTELGKQWGEYFIYRHQVLNNFLCLLNNSHNELEQVEKI EHFLNKKTIDNLALLTKELHNLQ >gi|223714203|gb|ACDT01000012.1| GENE 12 9675 - 9932 323 85 aa, chain + ## HITS:1 COG:no KEGG:Dtox_3837 NR:ns ## KEGG: Dtox_3837 # Name: not_defined # Def: FeoA family protein # Organism: D.acetoxidans # Pathway: not_defined # 7 76 16 85 86 70 47.0 1e-11 MIENRCLNDLQMGEVGVITSLMAQGSIKRRLLDIGLVDDTEVQCILNSPSQDPKAFLIRG AVIALRNEDCQNVLIRKDDVNYGLN >gi|223714203|gb|ACDT01000012.1| GENE 13 9919 - 12060 1533 713 aa, chain + ## HITS:1 COG:CAC1031 KEGG:ns NR:ns ## COG: CAC1031 COG0370 # Protein_GI_number: 15894318 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Clostridium acetobutylicum # 28 712 7 680 683 313 29.0 6e-85 MGLTNKSTGINASETKLVIKKQSRNDKVVAIAGNPNVGKSTVFNGLTGLHQHTGNWPGKT VTNAQGYCRGKENSYVLVDIPGTYSLLAHSAEEEVARNFICFEHPDIIVVVCDATCLERN LNLVIQTIEIHTNVIVCVNLLDEAKKKGIDIDLEQLSSCLGVPVIGITARKKTDLHKVID SLDQAIDRPVNYKSLKIAYSDILLQAIRIVEEVLQKKVTTSLSYQWLSLRLIENDPSLMI EMSNQLGLDYDDQELKEAIDQANELLAQHDINKDNLQDKIVVSFIHSAHLICQKVVHYQK QDYNQRDRQIDKILTNRWTGYPIMILLLALIFWITITGANYPSQWLSDFLFGLEVPFNQF FHFIGTPLWLSDMLVSGVYRVLAWVVSVMLPPMAIFFPLFTLMEDLGYLPRIAFNLDHAF KKCSACGKQALTMCMGFGCNAAGIVGCRIIDSPRERLIAMITNNFVPCNGRFPTLIAILT MFFIGTEATMQNSLLSAIFLTMLIVLGITMTLIISKLLSKTILKGVPSSFTLELPPYRKP QVGKVIVRSIFDRTLFVLSRAVVVAAPAGLIIWLMANISVDGMTILNITSNFLDPFAKCI 
GLDGVILLAFILGFPANEIVIPIMIMAYLASGSLIEMDNLVMLKSLLVENGWTWITAINV MLFSLMHWPCSTTLLTIKKEANSWKWAIVSFLVPAVTGIIICSIFTMCANLFI >gi|223714203|gb|ACDT01000012.1| GENE 14 12202 - 12822 552 206 aa, chain + ## HITS:1 COG:no KEGG:Ccur_08110 NR:ns ## KEGG: Ccur_08110 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 2 201 4 203 205 217 56.0 2e-55 MTRPNFNDIQSFDEFKQYYWYRDELKQICKEHGITHTGTKTELNDNIKAYFNGEIIKSVK RMKIRKQDVPLTLETKLLECGFVLNGRFREFFGEITKTPNFKFTADMATALRRVRSEGNV HFTLKDLLDVYYGKSDYAKYDNSACQWNQFLKDFCRDENNKIYYNKLKAASILWKVIRDQ PGKKVYSASLLKKHQALLRQYKGENK >gi|223714203|gb|ACDT01000012.1| GENE 15 12819 - 13232 487 137 aa, chain + ## HITS:1 COG:SP1549 KEGG:ns NR:ns ## COG: SP1549 COG0242 # Protein_GI_number: 15901392 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Streptococcus pneumoniae TIGR4 # 1 137 1 136 136 129 48.0 2e-30 MIREIVSDQFRLSQKSLPATKEDLPVVQDLLDTIIAHATGCVGMAANMIGINKNIIIVND DGNYLVMLNPEIIKTMGRLYETEEGCLSHIGVKKTKRYEKIKVAYYDVDFKKKIKTFSNY TAQIIQHEIDHCNGILI Prediction of potential genes in microbial genomes Time: Thu May 26 09:15:50 2011 Seq name: gi|223714202|gb|ACDT01000013.1| Coprobacillus sp. D7 cont1.13, whole genome shotgun sequence Length of sequence - 2437 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 35 - 94 4.1 1 1 Tu 1 . + CDS 195 - 650 420 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 669 - 696 0.1 2 2 Tu 1 . + CDS 1173 - 1349 95 ## gi|237735712|ref|ZP_04566193.1| predicted protein + Prom 1543 - 1602 8.1 3 3 Tu 1 . 
+ CDS 1753 - 2427 692 ## COG1272 Predicted membrane protein, hemolysin III homolog Predicted protein(s) >gi|223714202|gb|ACDT01000013.1| GENE 1 195 - 650 420 151 aa, chain + ## HITS:1 COG:L50174 KEGG:ns NR:ns ## COG: L50174 COG0454 # Protein_GI_number: 15672030 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Lactococcus lactis # 1 149 1 146 149 94 38.0 8e-20 MEIRKYQSSDLKVICELFYETINTINRYDYNNDQIKVWSNRSNFLLTQDDFFNSMYTLVA VENEKVLGYGNIDKRGYLDHLYVHKDYQGKQIATKLCDKLEQYCKDVKSLTVHASISAKP FFEKRGYKVIKEQSVKVDNVYLTNYLMEKTR >gi|223714202|gb|ACDT01000013.1| GENE 2 1173 - 1349 95 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735712|ref|ZP_04566193.1| ## NR: gi|237735712|ref|ZP_04566193.1| predicted protein [Mollicutes bacterium D7] # 1 58 1 58 58 83 100.0 4e-15 MIYLLVKLENLKESYSINNFNKEIKIVENLQNSYIFEDKTGIWNKNRGTFILVLIIEI >gi|223714202|gb|ACDT01000013.1| GENE 3 1753 - 2427 692 224 aa, chain + ## HITS:1 COG:BH2865 KEGG:ns NR:ns ## COG: BH2865 COG1272 # Protein_GI_number: 15615428 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Bacillus halodurans # 14 224 7 212 215 146 39.0 3e-35 MKRTKLDDRVLPEYTRGEEIFNMVTHIVGGVFGIAVLVLCIVFGVLRHNNYGIASGIIYG ISMILLYTMSSIYHGLSPNTKSKKVFQIMDHCSIFVLIAGTYTPIVLCSIRPVDSFWAWM IFAVVWGCTVLGIILNAIDIESNEKFSMICYLAMGWCIIFKFNLLPEAIGYNGIWLLVAG GVAYTIGTVFYGLQNKYRYMHSIWHLWILLGSVLHFLCIILYVI Prediction of potential genes in microbial genomes Time: Thu May 26 09:15:56 2011 Seq name: gi|223714201|gb|ACDT01000014.1| Coprobacillus sp. D7 cont1.14, whole genome shotgun sequence Length of sequence - 2704 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 455 238 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 2 2 Op 1 9/0.000 - CDS 583 - 1185 661 ## COG0673 Predicted dehydrogenases and related proteins 3 2 Op 2 . - CDS 1185 - 1535 236 ## COG0673 Predicted dehydrogenases and related proteins - Prom 1608 - 1667 7.4 + Prom 1539 - 1598 7.6 4 3 Tu 1 . + CDS 1638 - 2228 630 ## gi|167757196|ref|ZP_02429323.1| hypothetical protein CLORAM_02746 + Term 2240 - 2300 8.7 + Prom 2265 - 2324 8.4 5 4 Tu 1 . + CDS 2436 - 2703 263 ## gi|167757195|ref|ZP_02429322.1| hypothetical protein CLORAM_02745 Predicted protein(s) >gi|223714201|gb|ACDT01000014.1| GENE 1 3 - 455 238 150 aa, chain + ## HITS:1 COG:PA3222 KEGG:ns NR:ns ## COG: PA3222 COG0697 # Protein_GI_number: 15598418 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pseudomonas aeruginosa # 2 136 137 271 299 81 37.0 4e-16 IPEFSLNNQITLGIFIGLIGSLSYAVLCLMNRYFAGKYTGRTICLYEQGIAAIILLPSLL IVKASITKLDLFALIFLGVICTAVAHSIYVSSLKKVKVQTAGIISGMESVYSIIFAIFLL HEIPGVKELLGGVIILGVAFYATVTNNKKF >gi|223714201|gb|ACDT01000014.1| GENE 2 583 - 1185 661 200 aa, chain - ## HITS:1 COG:L194226 KEGG:ns NR:ns ## COG: L194226 COG0673 # Protein_GI_number: 15673536 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Lactococcus lactis # 1 199 118 315 324 122 35.0 6e-28 MEAQKALYLPTTSKVKELINSEIIGKIKYLEFKAGFPGRFTYDHWMYDLSLGGGALYGSA TYTIEYLQYLFDHPNLTIDGTCIKCSTGADEICNFQLKINNSILASSTIAMNVALKNEAV FYGELGYIVVPNYWKSNQLDLYLDDGSHRHYDFPYQSEFVYEIEHIHQCINNGRLESPIM NKEKTIETIKLVEQLYQKWK >gi|223714201|gb|ACDT01000014.1| GENE 3 1185 - 1535 236 116 aa, chain - ## HITS:1 COG:L194226 KEGG:ns NR:ns ## COG: L194226 COG0673 # Protein_GI_number: 15673536 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: 
Lactococcus lactis # 1 116 1 116 324 104 45.0 3e-23 MFRYGILSTASIIDRFIAGIRESQDSYVQAIASRNLETAKKAAKRLNIENYYGSYGELLC ANNIDIVYIPTINSQHYENCKKALKHHKHVIVEKTFALTVLEAKELFTLAKKIIVF >gi|223714201|gb|ACDT01000014.1| GENE 4 1638 - 2228 630 196 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757196|ref|ZP_02429323.1| ## NR: gi|167757196|ref|ZP_02429323.1| hypothetical protein CLORAM_02746 [Clostridium ramosum DSM 1402] # 1 196 1 196 196 303 100.0 3e-81 MLRFNDDFDFRSEFVGKWNKKIHSLRVAGIIVAILMIICAILCMIYPVKSVTIIGTIATG LILILGIYQIIDYIAAPPLFRGPGGLVSAICNLLIGLLLICSPIEMTINTFAFIFGFILM VYGLNKLSFAHQLSFFGIESYGWVIFTGIINVLAALAFIIAPLMSTVVLNYIVAAYLLVG GIALLVEVIAMHDLKL >gi|223714201|gb|ACDT01000014.1| GENE 5 2436 - 2703 263 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757195|ref|ZP_02429322.1| ## NR: gi|167757195|ref|ZP_02429322.1| hypothetical protein CLORAM_02745 [Clostridium ramosum DSM 1402] # 1 89 20 108 288 176 100.0 5e-43 MNKEMFLSTMKVITDYTQRDVIKIELSNDECSIFDSKFGGLPYLGVDDEVPLSTDNKQLR LLGQINFADLPMNSYQLEAGLLQFWAMDN Prediction of potential genes in microbial genomes Time: Thu May 26 09:16:14 2011 Seq name: gi|223714200|gb|ACDT01000015.1| Coprobacillus sp. D7 cont1.15, whole genome shotgun sequence Length of sequence - 8973 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 455 487 ## COG3878 Uncharacterized protein conserved in bacteria + Term 701 - 745 -0.8 2 2 Op 1 . - CDS 452 - 865 336 ## CLH_2436 hypothetical protein 3 2 Op 2 . - CDS 877 - 1719 699 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 1782 - 1841 10.4 + Prom 1746 - 1805 14.6 4 3 Op 1 . + CDS 1981 - 2328 467 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain 5 3 Op 2 . + CDS 2371 - 2934 611 ## COG0406 Fructose-2,6-bisphosphatase + Prom 2936 - 2995 5.3 6 3 Op 3 . 
+ CDS 3018 - 4028 966 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Term 4039 - 4074 3.7 + Prom 4045 - 4104 6.0 7 4 Tu 1 . + CDS 4130 - 5578 1349 ## COG2199 FOG: GGDEF domain + Prom 5637 - 5696 5.4 8 5 Op 1 . + CDS 5746 - 7299 1557 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 9 5 Op 2 . + CDS 7384 - 8463 785 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 8479 - 8530 0.7 + Prom 8629 - 8688 8.3 10 6 Tu 1 . + CDS 8720 - 8972 604 ## wcw_1811 hypothetical protein Predicted protein(s) >gi|223714200|gb|ACDT01000015.1| GENE 1 3 - 455 487 150 aa, chain + ## HITS:1 COG:DR2173_2 KEGG:ns NR:ns ## COG: DR2173_2 COG3878 # Protein_GI_number: 15807167 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 69 149 184 259 260 83 49.0 1e-16 SQILNKYHPEGEGYFPIVDNFKLHFIGGEEEISIGDYHFDGLFTQEWNNLYPNNLISSYY DLPDEILYEDEFNEFSGFGHKMFGYPAFTQEDPRSSEKYDDYILLLQIDSVGIGDKEIMW GDSGICNFFITKKDLENKNFSKVLYNWDCY >gi|223714200|gb|ACDT01000015.1| GENE 2 452 - 865 336 137 aa, chain - ## HITS:1 COG:no KEGG:CLH_2436 NR:ns ## KEGG: CLH_2436 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 1 126 1 125 131 162 65.0 3e-39 MVITNIKASFPNDIKTIWNIVTSLDNYTWRSDLSKLEIIKSQEQFIEYTKNGYPTTFTIT CFRPYQRYEFSMENSNMKGHWIGLFSSHNGITSIDFTENITPKKFFMKPFVKRYLKKQQA AYIRDLTAVLEAIEHRF >gi|223714200|gb|ACDT01000015.1| GENE 3 877 - 1719 699 280 aa, chain - ## HITS:1 COG:BH0594_1 KEGG:ns NR:ns ## COG: BH0594_1 COG2207 # Protein_GI_number: 15613157 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 1 127 1 127 127 150 49.0 3e-36 MEWIERLNTTIEYIENHLTEKIDYEQLAKIAGCPSYHFQKTFLYMTNISLSEYIRKRRIS LAAVDLQTGEDKIIDIALKYGYESPTAFNRAFQAVHGIAPSLVRKKNVQIKSFPALKFTF SIQGLDELSFRIEKKDSFKILGISSLLSKELLENFVQIPKQWDKALNDGTLNQLYRLNNQ 
TPLGLLGVNIHQHEDWRYFIAVSSTNHNQNFEEYHIGSATWAIFSGHGTNRTLQELQCRV ITEWLPTSGYEYANIPDIEVYLKADPNDAIYEYWLPIIKK >gi|223714200|gb|ACDT01000015.1| GENE 4 1981 - 2328 467 115 aa, chain + ## HITS:1 COG:SPy1581 KEGG:ns NR:ns ## COG: SPy1581 COG1917 # Protein_GI_number: 15675470 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Streptococcus pyogenes M1 GAS # 5 113 3 111 112 110 49.0 6e-25 MSDIFIKNIPHEQITTLASQVEVQAGQVVSKTLAQNDALSITIFSFDKGEEIGTHGSSGD AMVTVLEGVGKFTVDGQEYVLKRGETLVMPANKPHSVYGQEAFKMLLVVVFPQIK >gi|223714200|gb|ACDT01000015.1| GENE 5 2371 - 2934 611 187 aa, chain + ## HITS:1 COG:lin1208 KEGG:ns NR:ns ## COG: lin1208 COG0406 # Protein_GI_number: 16800277 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Listeria innocua # 1 183 8 190 199 279 69.0 2e-75 MRHGQTLFNARKKIQGWCDAPLTELGIKQAKIAGKYFKDNNIVFDHAYSSTSERACDTLE LVCDLPYHRVKGLKEWNFGIYEGESEDLNPPLPYGDFFVNYGGEGELQMRQRIADTLLEI MQKEDHKVVLAVSHGGACRGFMRYWQETSQVDQKGGLKNCCILKFEFEDNEFRLLEIINH NFDETKG >gi|223714200|gb|ACDT01000015.1| GENE 6 3018 - 4028 966 336 aa, chain + ## HITS:1 COG:TM1297 KEGG:ns NR:ns ## COG: TM1297 COG0667 # Protein_GI_number: 15644052 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Thermotoga maritima # 1 191 1 197 285 125 37.0 9e-29 METVRLGRTNIVVNRNGFGALPVQRVNMEEAKVILKKAYDNGITFFDSARAYSDSEEKIG RALSEVRQKIYISTKTMATTVTDFWKDLETSLSLLKTDYIDIYQFHNPSFCPKPGDGSGL YEAMLEAKKQGKIRYIGLTNHRLHVAREAVECGLYDTLQFPFSYLASKKEEELVDLCKEK DVGFICMKALSGGLIDRSDAAYAYLAQFDNTLPIWGIQKESELDEFIEYNDNPPIMSEEI KAIIAHDQKELSGEFCRGCGYCMPCPMGIEINQCARMSLMLRRAPSAGWLSEHWQEEMKK IENCINCGKCMSHCPYGLNTPELLKKNYEDYKTFFK >gi|223714200|gb|ACDT01000015.1| GENE 7 4130 - 5578 1349 482 aa, chain + ## HITS:1 COG:VC2697 KEGG:ns NR:ns ## COG: VC2697 COG2199 # Protein_GI_number: 15642692 # Func_class: T Signal transduction mechanisms # 
Function: FOG: GGDEF domain # Organism: Vibrio cholerae # 334 454 144 257 312 81 38.0 3e-15 MREIDRLVEEIRHLQFDDIEKMYQKCLQLKEIALEQKDDYILCLASNYIIDYYYSCKSQQ ETVKLANEMLALNEEKGYPDLLMQAYNLYAIAVCNNDYSLATGFYLKGLKLAEQLNDYIM KAKFNCNLGDVFVNLGQFDLALPYFLESLEQIKKISPKLPEYKIKRFVLCYLIIIYCEKR RLDQAIALMEENKDLFNDTSFDPIDRLWQALKALICYGHGDVAKALEFIDNILGNEIHGF RANEAIYFIHHILLYITFAIKDKERTKCLYLLLKENDFGKTGMRYQIEMLEMKIKYCIVF EEKEQLPKLYEQYYYLMHENQKESIDFCLNSVLYKIELFKAMEEKKDIVKESRLDDLTKI YNRRYFHYKYSEFKNKSRLLGIIIFDLDHFKEYNDSLGHLTGDQILKDFADSLQQDDDRI ISCRFGGDEFICICVDCNEQMIISFIEQVYSKFELYGYDKITISSGYYNSFSNCLSKEEL IN >gi|223714200|gb|ACDT01000015.1| GENE 8 5746 - 7299 1557 517 aa, chain + ## HITS:1 COG:CAC3339 KEGG:ns NR:ns ## COG: CAC3339 COG0488 # Protein_GI_number: 15896582 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 516 1 517 518 635 59.0 0 MSILSVENVSHDFGGRTILENASFRLLNGEHIGLVGANGEGKSTFLNIITGKIAPDSGKI EWCKRISTGYLDQHTALHPGKTIYETLQDAFKYYYDLEAEMLSMYEQMADCDDETMNQLM EEVGEIQEILDHGGFYTIDVKIKEVAGGLGLNDIGLDKPVDALSGGQRSKVLLTKLLLEN PMILILDEPTNYLDEQHINWLINFLKNYDNAFILVSHDVSFLNQVINIIYHLEDGVLTRY KGDYDYYLQQVELKKRQLAADYQKQQKEIADLEDFIARNKARVATRNMASSRQKKLDKMN LISKPKEKIKPTFSFIEARTPGKVLFDCKDLVIGYDSPLTKEMNLVFERNKKVAIKGVNG LGKTTLLKTLIGMQKPYSGTVEKDPFVEIGFFKQEESAVSQTALDYLWEEFPDRNNGEIR AMLAKCGLTTDHIESLMKVLSGGENAKVRLCKIMNREANVLVLDEPTNHLDVDAKDSLQS AIKAFKGAVILVSHEPEFYLPIVDEVINLEECSTKII >gi|223714200|gb|ACDT01000015.1| GENE 9 7384 - 8463 785 359 aa, chain + ## HITS:1 COG:lin0696 KEGG:ns NR:ns ## COG: lin0696 COG0463 # Protein_GI_number: 16799771 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Listeria innocua # 4 183 5 192 637 95 33.0 2e-19 MITISVCMIVKDEEMVLERALKSVERIADEIIIVDTGSTDRTKKIALKYTDKVYDFVWCD DFAKARNYSFSKATMDYCMWLDADDIISDSDQEGLINLKKTLSKDTSVVMMKYNAEFDQN NKPTFTYYRERLLRRKDNFYWEGFIHEVISPRGKVLYSDLAITHSKVKNSDVNRNLRIFR 
EKLNEGVMFSPREIFYYARELFYHQLYQEAVAEFKRFLNTKKGWKENNISACQFLGYCFY NLGDMQEAITWLLNSLCYDLPRAEICCDLGKYFFELKQYEQSIYWSAVALSCERDDTSGA FISPDCYDYIPYMQLCLCYDKLSNYKTAKYYNELAGKCKPNDPSYLHNKLYFNRIENDE >gi|223714200|gb|ACDT01000015.1| GENE 10 8720 - 8972 604 84 aa, chain + ## HITS:1 COG:no KEGG:wcw_1811 NR:ns ## KEGG: wcw_1811 # Name: not_defined # Def: hypothetical protein # Organism: W.chondrophila # Pathway: not_defined # 26 84 290 348 603 68 81.0 5e-11 MKDCNYNGCIWCPCCRCRPPIIIRCPTGPTGATGATGPTGPTGVTGVIGPTGATGATGPT GPTGATGEDGATGATGPTGPTGAT Prediction of potential genes in microbial genomes Time: Thu May 26 09:16:27 2011 Seq name: gi|223714199|gb|ACDT01000016.1| Coprobacillus sp. D7 cont1.16, whole genome shotgun sequence Length of sequence - 25964 bp Number of predicted genes - 30, with homology - 30 Number of transcription units - 16, operones - 7 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1276 2385 ## CD3349 putative exosporium glycoprotein - Term 1569 - 1614 3.2 2 2 Tu 1 . - CDS 1813 - 2451 803 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase - Prom 2477 - 2536 7.8 + Prom 2585 - 2644 8.0 3 3 Tu 1 . + CDS 2666 - 3259 632 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Prom 3339 - 3398 5.1 4 4 Tu 1 . + CDS 3582 - 3854 143 ## gi|237735730|ref|ZP_04566211.1| predicted protein 5 5 Tu 1 . + CDS 4300 - 4521 179 ## gi|167757180|ref|ZP_02429307.1| hypothetical protein CLORAM_02730 + Term 4523 - 4572 13.5 - Term 4510 - 4560 14.0 6 6 Tu 1 . - CDS 4562 - 5905 1213 ## COG0531 Amino acid transporters - Prom 6021 - 6080 8.4 + Prom 6008 - 6067 7.6 7 7 Op 1 . + CDS 6111 - 6662 447 ## PROTEIN SUPPORTED gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) 8 7 Op 2 . + CDS 6715 - 7443 636 ## BVU_3509 putative arginase 9 7 Op 3 . + CDS 7453 - 8025 757 ## COG2135 Uncharacterized conserved protein + Term 8058 - 8110 1.0 + Prom 8031 - 8090 4.8 10 8 Op 1 . 
+ CDS 8247 - 8855 591 ## COG5015 Uncharacterized conserved protein 11 8 Op 2 12/0.000 + CDS 8859 - 10400 1560 ## COG1732 Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) 12 8 Op 3 . + CDS 10410 - 11153 208 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P + Term 11154 - 11180 -0.6 + Prom 11162 - 11221 8.4 13 9 Op 1 . + CDS 11287 - 11754 402 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 14 9 Op 2 . + CDS 11747 - 12946 1197 ## gi|167757171|ref|ZP_02429298.1| hypothetical protein CLORAM_02721 15 9 Op 3 . + CDS 12976 - 13617 858 ## COG0546 Predicted phosphatases 16 9 Op 4 . + CDS 13656 - 14180 463 ## BcerKBAB4_2674 GCN5-related N-acetyltransferase + Term 14182 - 14226 9.4 - Term 14172 - 14209 3.0 17 10 Op 1 . - CDS 14219 - 15385 1188 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases - Prom 15412 - 15471 1.9 18 10 Op 2 . - CDS 15485 - 16510 1095 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 16558 - 16617 9.7 - Term 16640 - 16677 6.6 19 11 Op 1 . - CDS 16687 - 17163 459 ## COG0780 Enzyme related to GTP cyclohydrolase I 20 11 Op 2 1/0.250 - CDS 17166 - 17849 712 ## COG0603 Predicted PP-loop superfamily ATPase 21 11 Op 3 1/0.250 - CDS 17861 - 18427 724 ## COG0302 GTP cyclohydrolase I 22 11 Op 4 22/0.000 - CDS 18429 - 19091 582 ## COG0602 Organic radical activating enzymes 23 11 Op 5 . - CDS 19088 - 19576 406 ## COG0720 6-pyruvoyl-tetrahydropterin synthase - Prom 19670 - 19729 6.1 + Prom 19697 - 19756 6.8 24 12 Tu 1 . + CDS 19785 - 21437 1686 ## COG1283 Na+/phosphate symporter + Term 21438 - 21469 3.4 25 13 Op 1 1/0.250 - CDS 21500 - 21799 268 ## COG3910 Predicted ATPase 26 13 Op 2 . - CDS 21884 - 22135 81 ## COG3910 Predicted ATPase - Prom 22241 - 22300 9.3 + Prom 22260 - 22319 13.2 27 14 Tu 1 . 
+ CDS 22344 - 23111 793 ## Dalk_2238 4Fe-4S ferredoxin iron-sulfur binding domain protein + Term 23173 - 23219 9.0 + Prom 23178 - 23237 10.7 28 15 Tu 1 . + CDS 23263 - 24234 1128 ## COG0673 Predicted dehydrogenases and related proteins 29 16 Op 1 . - CDS 24931 - 25569 419 ## gi|237735755|ref|ZP_04566236.1| predicted protein 30 16 Op 2 . - CDS 25566 - 25835 125 ## gi|167757155|ref|ZP_02429282.1| hypothetical protein CLORAM_02705 - Prom 25897 - 25956 2.1 Predicted protein(s) >gi|223714199|gb|ACDT01000016.1| GENE 1 2 - 1276 2385 424 aa, chain + ## HITS:1 COG:no KEGG:CD3349 NR:ns ## KEGG: CD3349 # Name: bclA3 # Def: putative exosporium glycoprotein # Organism: C.difficile # Pathway: not_defined # 1 394 182 654 661 242 60.0 2e-62 ATGEDGATGPTGPTGATGEGGATGATGPTGATGATGPTGLTGATGEDGSTGAIGPTGPTG STGATGSTGPTGATGEDGATGATGSTGPTGATGEDGATGATGSTGPTGSTGATGPTGPTG ATGATGPTGATGEDGATGPTGPTGATGEDGATGPTGATGEDGATGPTGATGPTGPTGATG EDGATGATGSTGPTGPTGATGEDGATGATGSTGPTGTNGANGDRGPTGPTGITGATGATG STGPTGSTGAAGASAIIPFASGLPVSLTTIAGGLVGTPAFIGFGSSAPGISIIGNTIDLT NPSGTLTNFAFTMPRDGVITSIDVFFSTTAALSLVGSTITIEAKLYESIAPNNTMTVVPG TTVTLTPSLTGVISIGTISKGILTGLNINVPAGTRLMLVLTARASGLSLVNTVAGYVSAG VAID >gi|223714199|gb|ACDT01000016.1| GENE 2 1813 - 2451 803 212 aa, chain - ## HITS:1 COG:SP0317 KEGG:ns NR:ns ## COG: SP0317 COG0800 # Protein_GI_number: 15900249 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Streptococcus pneumoniae TIGR4 # 1 209 1 209 209 263 64.0 1e-70 MTKSNTIIELKKQGVVAVIRGTSYEEGHQTATACIKGNLKAIEIAYTNNNADMIIKQLSS DYQNNNTVLIGAGTVLDAPTAKNAIMAGAKYIVSPAFNQETAIICNRYGIPYIPGCMTIK EIITAMEYGCEIIKLFPGSAFGPNYINAIKGPLPYVSLMVTGGVNLNNAAEWFNAGIDAI GIGGELNKLAAKGQFETIKEIAQKYVNTKTRK >gi|223714199|gb|ACDT01000016.1| GENE 3 2666 - 3259 632 197 aa, chain + ## HITS:1 COG:pli0059 KEGG:ns NR:ns ## COG: pli0059 COG1961 # Protein_GI_number: 18450341 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase 
Pin homologs # Organism: Listeria innocua # 6 159 4 154 199 89 37.0 4e-18 MTNIYGYARVSSKDQNEARQIIALSQFPVKKENIYIDKFSGKDFDRPKYSELIKILKEQD ILVIKEIDRLGRNYEEILEQWRVITKEIKADIVVLDMPLLDTRTRKENLTGTFIADLVLQ ILSYVAETERQSIKQRQREGIEAAKKRGVKFGRPCIPIPEEFYDLKEKWLNKKITSREAA TTINVSQDTFLRWVHLK >gi|223714199|gb|ACDT01000016.1| GENE 4 3582 - 3854 143 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735730|ref|ZP_04566211.1| ## NR: gi|237735730|ref|ZP_04566211.1| predicted protein [Mollicutes bacterium D7] # 1 90 1 90 90 136 100.0 4e-31 MKKTKSLVGRKYGKLLVLAETNKLEARYKVWECRCECGKITFVNTKKLKRGTITNCGCIP KNKAKKGQKFGDLWQFLQQKKRTQKLVKLG >gi|223714199|gb|ACDT01000016.1| GENE 5 4300 - 4521 179 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757180|ref|ZP_02429307.1| ## NR: gi|167757180|ref|ZP_02429307.1| hypothetical protein CLORAM_02730 [Clostridium ramosum DSM 1402] # 1 73 1 73 73 115 100.0 8e-25 MQKDIFEDAVKLMRCYFISDLRILKKEVYEMLYIVDFKQYDIPNLQQFFAYVFDREISSY EDIEEFLWNESTL >gi|223714199|gb|ACDT01000016.1| GENE 6 4562 - 5905 1213 447 aa, chain - ## HITS:1 COG:BS_ykbA KEGG:ns NR:ns ## COG: BS_ykbA COG0531 # Protein_GI_number: 16078351 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Bacillus subtilis # 1 415 1 404 438 160 30.0 4e-39 MNNNNNDVTKKYGLFTAITMIVGICIGSGIFFKSDNVLIATNGSIALGVLSFILAAISII FGCLTIGELASRTDKVGGLISYAEMFLTRKVACAMGWFQTFVYYPTITSVVAWVIGVYIN ILFNLQASLEFEILIGYIFLLLCFVYNILLPKFGAFIQNSTTLIKLLPLFILGILGIIFG DPINGLANTSINTLSGTGWISALGPIAYSFDGWIVTTSISHELKDAKKNMPKALMLGPLI VLSIYVLYFIGISCYVTPEVVMTLGDRHVSLAAQNLFGPAFSKMIIIFIIISVIGTVNGL VIGYIRMPYSLSLRKNMFPFSKHLRKIDKRFQMPLNSAICSLIICTIWMFVHYLCTKYNL LYNSDVSEGAIGISYIFYTLLYIRVFKMYLDKEIKSKFKGLVCPILATIGSLIILSGGIQ NKLFIFYIIFCILLFLYSLYYYQKNNH >gi|223714199|gb|ACDT01000016.1| GENE 7 6111 - 6662 447 183 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) [Campylobacter concisus 13826] # 2 180 1 181 185 176 47 1e-43 
MVKRCSWVDEKSEIYKNYHDHEWGVPVYDDEKLYEMFLLETFQAGLSWITILKKREAFRV AFDNFDVIKIANYDDKKVMELLANEKIIRSKRKIGAAINNAKIFINIQTEFGSFSDYLWG FSNHQIIKNKDNNFKTTTKLSDDISMDLKKRGMSFVGSITIYSYLQAIGIVDDHELECFC YKH >gi|223714199|gb|ACDT01000016.1| GENE 8 6715 - 7443 636 242 aa, chain + ## HITS:1 COG:no KEGG:BVU_3509 NR:ns ## KEGG: BVU_3509 # Name: not_defined # Def: putative arginase # Organism: B.vulgatus # Pathway: not_defined # 1 237 24 263 269 197 38.0 3e-49 MDFTGIYELESFYKHKNINWIDCLDIKGTRGYCSEEAKAKIIKRIGDYLPAGIHFIDSGN FHYVSEFWIRKINYDFILIVFDHHSDMIKPMFGDILSCGSWILNALKNNQYLKQIILIGI AQEQVDLIDPKYRERVIYLCDNDLGDLKVWKIIDELLKQYPVYFSIDKDVLSEQVVTTDW QQGQMRLIEFKLILADLIKRGDVIGIDICGECCNEVAMLSEIKNDDSLNAQILEFLKCEL VK >gi|223714199|gb|ACDT01000016.1| GENE 9 7453 - 8025 757 190 aa, chain + ## HITS:1 COG:BS_yoqW KEGG:ns NR:ns ## COG: BS_yoqW COG2135 # Protein_GI_number: 16079108 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 168 1 185 224 91 34.0 9e-19 MCGRYFLELKALMELEERIDYTFERDLLIAKDYYPSNIVPIIIDRDNKLELKKAKWGFST FDNKLIINARSETLLEKPFFKKEVITHRCLIPASGFYEWDGHKHKFTFENEKRSLLLMTG IYRNVNGQTEVTIVTTKANKSMCEIHDRMPLILEEHLNHDWLENRHIEALLRTVPQSLTI TSGFLQSSLF >gi|223714199|gb|ACDT01000016.1| GENE 10 8247 - 8855 591 202 aa, chain + ## HITS:1 COG:MA0739_1 KEGG:ns NR:ns ## COG: MA0739_1 COG5015 # Protein_GI_number: 20089624 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 3 148 21 163 163 125 44.0 6e-29 MKVNDYLNELRKIKDVAMATVDSDGYPQIRIIDVMAVENNNLYFLTARGKNFYQELLDKN FVSLVALTKDYASIRLKGKVKKVSNQKEWLEKIFIDNPSMNEVYPGDSRNILEVFCIFDG EIEIFDLSKVPIERSQYDLCDNRIFNKGYLISNRCIACDRCKRECPQQCIKSGSKYKIMQ DHCLHCGLCYENCPVRAIEKVG >gi|223714199|gb|ACDT01000016.1| GENE 11 8859 - 10400 1560 513 aa, chain + ## HITS:1 COG:FN2009_2 KEGG:ns NR:ns ## COG: FN2009_2 COG1732 # Protein_GI_number: 19705305 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic glycine betaine/choline-binding 
(lipo)protein of an ABC-type transport system (osmoprotectant binding protein) # Organism: Fusobacterium nucleatum # 214 506 5 298 307 326 56.0 6e-89 MLRDMLSLLQEKGDFFLKLLIEHMQISLIAIIIATVIGLILGVIISEYKKSSKTILGIIN FMYTIPSISLLGFLIPFSGIGNSTAIIALSVYALLPMVRNTYTGIINIDDNIIEAATGMG STRWQILYKIKLPLALPVIMSGFRNMVTMTIALAGIASFIGAGGLGVAIYRGITTNNTAM TLTGSLLIAILALITDIVLGFIEKLFTIRKATLKSKRKYGLVVIVTTCAVLIISMISGNL NQTVTLNIATKPMTEQYIIGAMLKEMIEQDTDIKVKITQGVGGGTSNIQPGMVSGEFDLY PEYTSTGWNTVLKHEGFYNETMYEQLVKEYQEKYQFTWTGLLGFSDAYGLAVRKEIAEQY NLKTYLDLAGISNQLVFGAEYDFYEIPEGYDALCQKYNFTFKNTMDLDIGLKYQAINEGK VDVMDVFTTDGQLSTANVVVLEDDQQFFSTSMGALVVRNEILEQYPELNNVFDKLTGILN ETKMAQLNYLVETKGQDAEDVAHEFLVSINLVK >gi|223714199|gb|ACDT01000016.1| GENE 12 10410 - 11153 208 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 1 226 126 351 398 84 30 5e-16 MDTAIEFKHVKKVYGEKVIIDDFNLKITPGEFLTVVGSSGCGKTTILKMINGLIIPDEGQ VLVHDQCTQAVDLIELRRGIGYAIQGSVLFPHMTVAQNIAYVPNLLNKNDKKRTYEALSK WMKIVGLDEELIHRYPSELSGGQQQRVGIARALAASPDILLMDEPFGAVDEITRSTLQDE ILRIHHQENITIIFVTHDINEALKLGSRVMVMDQGKVVQLASPREILEHPKTEFVRRLVQ RKDDFLQ >gi|223714199|gb|ACDT01000016.1| GENE 13 11287 - 11754 402 155 aa, chain + ## HITS:1 COG:lin0443 KEGG:ns NR:ns ## COG: lin0443 COG1595 # Protein_GI_number: 16799520 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Listeria innocua # 4 149 15 178 182 80 33.0 1e-15 MEKERYEEVVYQYSDMVTRIAVMNVKNYDDAKDCYQNVFMKLYCYNKKFESEEHLKAWLI RVTINECKDYQKQFWKREIDIDKLILGQEDQRLIILPIVMNMPSKYRNVLYLYYYEGYST NEIASILKENHNTIKSRLIRGRKLLKKKLGDDFYE >gi|223714199|gb|ACDT01000016.1| GENE 14 11747 - 12946 1197 399 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757171|ref|ZP_02429298.1| ## NR: gi|167757171|ref|ZP_02429298.1| hypothetical protein CLORAM_02721 [Clostridium ramosum DSM 1402] # 1 399 4 402 402 750 100.0 0 MNKFNEFDLIKTPRNWIDEVTATNFEPTKRAKVIRIKYIFAIVLIVMTIGISSLGITYAF 
SDSFRTWLSQRFSSNLKISSALEFSSINKNTPTLKLDYGHWRAENEFFGIIDEDYNFLKV FTLENEQIRECPINEYYGSIAGHEFSFKYADYQERIVAFDFNGCIYSVLPKIINNDIYVC VDINDGEQRALNIAKINLESKQISFITNDNISVNPITSPQQTNILINKSDQGWENYNIET GETNLIKNIDPYMHSNCITFIDEQTVVTYDENGHSYLINILTNEVRTLDKYPLEGTIVNI EYSDETIKLTNIITGAVSRIDYHFYGTNYMGTYCSIDYLVLFNNEKLYLYDIESNQLVDL DPQKELDEQLKEVMIIDQKHLLITTDKRAYIVANVGNNG >gi|223714199|gb|ACDT01000016.1| GENE 15 12976 - 13617 858 213 aa, chain + ## HITS:1 COG:CAC0418 KEGG:ns NR:ns ## COG: CAC0418 COG0546 # Protein_GI_number: 15893709 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Clostridium acetobutylicum # 5 210 4 210 216 210 53.0 1e-54 MNKKYILFDLDGTLTDPMKGITKSVRYALNYYGIEVNDLNDLLPFIGPPLRDSFQEFYGF DALKAEEAVVKYREYFSTKGIFDNKVYPGIEVCLQTLKDQGKVLLVATSKPEKFAKEIIE HFGLAKYFDFVGGSEFNGREKKAEVIDYVLTANMIDKDEAIMVGDRKHDVIGAHENDLPC IGVLYGYGTKEELMACNSDYLVADINALQELLG >gi|223714199|gb|ACDT01000016.1| GENE 16 13656 - 14180 463 174 aa, chain + ## HITS:1 COG:no KEGG:BcerKBAB4_2674 NR:ns ## KEGG: BcerKBAB4_2674 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: B.weihenstephanensis # Pathway: not_defined # 2 174 15 195 195 148 43.0 6e-35 MSKEIKREIVHLLHVVWPSDDCNDAHNEELTVQSFYIMKDNQVVSYGAVLRMETTIDNEK YQIGGLSCIATLPNYRRQGLSSKIVQTATKWIEDNLDFGIFTCKRELEDFYACNGKWQVA NNVVLIANRDNNALSSNKLAVIVLIRLFSNKAQVNGQKILNARIYLDLPQGQFI >gi|223714199|gb|ACDT01000016.1| GENE 17 14219 - 15385 1188 388 aa, chain - ## HITS:1 COG:BH0799 KEGG:ns NR:ns ## COG: BH0799 COG0626 # Protein_GI_number: 15613362 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Bacillus halodurans # 17 385 20 392 393 248 37.0 2e-65 MKDIENVCLEVDELDFDNYSPISPDIVLTSSFKFKNFDHYVKVNAKEEFAYTYTRDGNPT LNLLETKLARLEKGEAAQMFASGMGAISASILTLAKAGDHIIIVNTVYGSSVKLIKQLSK FGIESTKIDVSDTLEIFDYVKNNTSIIYFESPSSQKFEMLDLELISKFSKDKGIFTIIDN TWSTPLLQNPLVHGIDVVIHSCSKYIGGHSDIVGGVVISSKKIIDEIVEFGQVLLGATMS 
PMNAWLALRGLRTLPVRLKSQQETLQQVINFLQEDPRIERIYHPLCNGEQQNELAHKYLK GYGSLLGVVLKDANPEIIKRFIDSLEHFTLAYSWGGFESLVMSVYKGNNINEIKERGLSL GQLRMYIGLEDSELLISDLKNALDQAYQ >gi|223714199|gb|ACDT01000016.1| GENE 18 15485 - 16510 1095 341 aa, chain - ## HITS:1 COG:SMc01214 KEGG:ns NR:ns ## COG: SMc01214 COG1063 # Protein_GI_number: 15965330 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Sinorhizobium meliloti # 1 339 1 339 347 260 41.0 3e-69 MKSAVFYGKHDIKVEEVEMPQLGNQDVLIKVMACGICGTDVHIYEGDKGAANTTPPTILG HEFAGIVEAVGHEVKHVKVGDRVCVDPNQLCGTCYYCRSGIGHFCEDMIGIGTTRDGGFA QYCAVNESQVYKLADTTSFEEGAMTEPVACCLHGLDMCNITPASTVLVIGGGMIGLLMVQ LAKLAGAHEIILSEPVAVKREMGLKMGATLTVDPINDNIPALLADKGIHRINTVIECAGL PATIKQAISLAGNKSVVMMFGLTKPDDEVAIKPFEIFQKEIEIKASFINPYTQQRALDLI NSKKIDVSSMVCDICSLDKLADILSKPELRNKGKYIINPWQ >gi|223714199|gb|ACDT01000016.1| GENE 19 16687 - 17163 459 158 aa, chain - ## HITS:1 COG:SA0683 KEGG:ns NR:ns ## COG: SA0683 COG0780 # Protein_GI_number: 15926405 # Func_class: R General function prediction only # Function: Enzyme related to GTP cyclohydrolase I # Organism: Staphylococcus aureus N315 # 3 158 11 166 166 265 81.0 4e-71 MTKNLTLLGNQNTVYKDDYAPEVLETFDNKHPENDYFVKFNCPEFTSLCPITGQPDFATI YISYVPNQKMVESKSLKLYLFSFRNHGDFHEDCMNIIMKDLIKLMDPKYIEVWGKFTPRG GISIDPYCNYGKPETKWETIAFNRLANHDLYPEKIDNR >gi|223714199|gb|ACDT01000016.1| GENE 20 17166 - 17849 712 227 aa, chain - ## HITS:1 COG:AF0442 KEGG:ns NR:ns ## COG: AF0442 COG0603 # Protein_GI_number: 11498054 # Func_class: R General function prediction only # Function: Predicted PP-loop superfamily ATPase # Organism: Archaeoglobus fulgidus # 1 227 1 217 239 201 46.0 1e-51 MKVLVLFSGGVDSTTALALAIQEHGKDNVVALSISYGQKHTKEIEVANKIAQFYQVEHLY LDLAKIFTYSNCSLLQHSDNEIPHESYNEQLNKTDGNPVSTYVPFRNGLFLSSAASIALS KGCNIIYYGAHSDDAAGSAYPDCSKVFNNAMNQAIYEGSGHQLEIKAPFVEMTKADIVKI GLSLGVPYQLTWSCYEGNDTPCGTCGTCIDRQNAFLKNGIKDPLLGE >gi|223714199|gb|ACDT01000016.1| GENE 21 
17861 - 18427 724 188 aa, chain - ## HITS:1 COG:CAC3626 KEGG:ns NR:ns ## COG: CAC3626 COG0302 # Protein_GI_number: 15896860 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Clostridium acetobutylicum # 2 183 3 185 195 238 69.0 6e-63 MIDTKKIEEHIYGILKALGDDPEREGLKDTPKRVAKMYGEVFAGMNYSNLEIATMFDKTF IDDLDFDNQEVVVIKDIDIFSYCEHHLALMYDMKVTVAYIPCGKVIGLSKIARIADMAAR RLQLQERIGTDIAEIISLVTNSKDIAVIIEGKHSCMSSRGIKKVNSTTVTSTLTGRFKTD SKLQIYLH >gi|223714199|gb|ACDT01000016.1| GENE 22 18429 - 19091 582 220 aa, chain - ## HITS:1 COG:CAC3625 KEGG:ns NR:ns ## COG: CAC3625 COG0602 # Protein_GI_number: 15896859 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Clostridium acetobutylicum # 3 220 2 221 221 234 54.0 9e-62 MNNYKVVEKFISINGEGSRAGQLAAFIRFHYCNLNCSYCDTRYANDSNSNYELLSAQNIL DYLKTNKVVNVTLTGGEPLLQQNIDYLIDLLLKNGFSVEIETNGSIDIKPFIKETRPIFT LDYKVPSSTMENEMCLNNYQYLTKNDVVKFVVSNLSDLNKAKEIIDTYDLVNRTKVYFSP VFGKIEPRMIVDYMVKHHLNGINMQLQMHKFIWDVNQRGV >gi|223714199|gb|ACDT01000016.1| GENE 23 19088 - 19576 406 162 aa, chain - ## HITS:1 COG:CAC3624 KEGG:ns NR:ns ## COG: CAC3624 COG0720 # Protein_GI_number: 15896858 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Clostridium acetobutylicum # 23 158 1 132 136 146 56.0 2e-35 MLFLSRLFSEPAFLFDGIEGGKMYFLKTEQSFDSAHFLAGYHGKCANIHGHRWKVIATIK SEKLLEDPQNKGMVTDFGDLKKDLKIIADSFDHALIIETGSLSEKLYQALIDENFKIINL PFRPTAENLAKYIYEALSKNYLVDCLDVYETPNNCASYRGKL >gi|223714199|gb|ACDT01000016.1| GENE 24 19785 - 21437 1686 550 aa, chain + ## HITS:1 COG:BH1407 KEGG:ns NR:ns ## COG: BH1407 COG1283 # Protein_GI_number: 15613970 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Bacillus halodurans # 4 538 7 537 543 299 37.0 8e-81 MELTDFFAMFGGLALFLYGMTMMSNGLELAAGNKMKTILEKLTTNRFLGVGVGAVITAVI QSSSATTVMTVGFVNAGLMKLENAVWVIMGANIGTTITGQLIAIDITALAPVIAFVGVAL 
IAFFKSKKLDAIGGIIAGLGILFMGMEMMSSAMVPLRSSPEFVNLVTTFENPLIGILVGA GFTAIIQSSSASVGILQALAMSGVITLPSAIYVLFGQNIGTCITAVLASIGTGRNAKRTT IIHLSFNIIGTIVFVFISMLTPFASFMQSLTPTNIPAQIANVHTVFNVVTTLLLLPFGAQ LVKLSYLVLPEKEGFEDKLSVKFLDNSIFTNDYHIGTSAIANTQLFNETQNMLNIVQKNV QRAFDLIIKYDEKTHEKLLKDEQYIDYLNKEIIQFTTNAISNEFPIDDSKSIGLFLKTAG DLERVGDHAVNIAERAEKLYSEDEHFSDEAMREIKIMNDLTRNILEELNVLNRDELHNIV EKVDVIEDSIDITTHEFSLNQLRRLRDKKCTPEHSALYTETLIDFERIGDHGLNIAVAFD EIKDDLTEMA >gi|223714199|gb|ACDT01000016.1| GENE 25 21500 - 21799 268 99 aa, chain - ## HITS:1 COG:BH0315 KEGG:ns NR:ns ## COG: BH0315 COG3910 # Protein_GI_number: 15612878 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Bacillus halodurans # 1 99 138 237 237 107 52.0 7e-24 MIQNRFNQNSLYILDEPEASLSPQRQLTLLITIYELAKQGSQFIIASHSPILLGIPGAKI LNFDSKEIIPIDYQDTESYQITELFINNRAGLLNKLLNK >gi|223714199|gb|ACDT01000016.1| GENE 26 21884 - 22135 81 83 aa, chain - ## HITS:1 COG:MA0995 KEGG:ns NR:ns ## COG: MA0995 COG3910 # Protein_GI_number: 20089872 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Methanosarcina acetivorans str.C2A # 3 83 42 122 251 96 56.0 1e-20 MIFDNPITFFVGENGTGKSTLLEAIAIKYGFNPEGGTLNFNFHTNNTHSSLHHAIILDRG YRRPDEGFFLRAESFYNVATQLE >gi|223714199|gb|ACDT01000016.1| GENE 27 22344 - 23111 793 255 aa, chain + ## HITS:1 COG:no KEGG:Dalk_2238 NR:ns ## KEGG: Dalk_2238 # Name: not_defined # Def: 4Fe-4S ferredoxin iron-sulfur binding domain protein # Organism: D.alkenivorans # Pathway: not_defined # 10 230 10 245 269 180 43.0 5e-44 MKIDRTYAIYFSPTYTSKKSAVSIARGLEGELSEIDLTLEETIPEMTFSRHDIVVFGFPV YGGRILHEALERLKSFRGDHTSCVITVTYGNRHYDDALLELFNTVKEQGFIPIAGAALVG EHTYGQIQVGRPNRDDLYRDELFGSLVRLKIKDDNFSFVSVPGKYPYKEGGTGGKFRPET NEQCSGCGVCVNMCPTNAIDIEDCKTINNDCIACFRCIRICPVHAKSMNNNQEYQTFAES FSKKLATPRENEYFI >gi|223714199|gb|ACDT01000016.1| GENE 28 23263 - 24234 1128 323 aa, chain + ## HITS:1 COG:CAC3400 KEGG:ns NR:ns ## COG: CAC3400 COG0673 # Protein_GI_number: 15896641 # Func_class: R General 
function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Clostridium acetobutylicum # 1 323 1 322 322 379 57.0 1e-105 MKKLNWGIIGSGVIANEMAQALLDVNGEIYAVGHRDMNKAIDFAMKYKIKNAYGSVEELL NDPDVDVVYIATPHNSHYEIMKQAVAVKKHVLCEKAITVNDRQLEEIVALAKKNNVVVQE AMTIFHMPLYKKLKEMVAAGVIGNVKMIQVNFGSCKEYDVNNRFFSKELAGGALLDIGVY ATSFARYFLSSKPNVVITTADYFETGVDEQSGIIMKNQDGQMVVMALTMRAKQPKRGVIS GELGYIEINNYPRSNQATITYTKDGHQEVIECGDDKQGLQYEVIDMQDYIINNLGEYELQ LTRDVSSLLSQIRTQWGMIYPFE >gi|223714199|gb|ACDT01000016.1| GENE 29 24931 - 25569 419 212 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237735755|ref|ZP_04566236.1| ## NR: gi|237735755|ref|ZP_04566236.1| predicted protein [Mollicutes bacterium D7] # 1 212 1 212 212 333 100.0 3e-90 MKTLSTKASKQKILSLLLLAGLWSIMLIIVLLAAKSTWAKNLNSPYWAPAIITIGIIFFI TYFIFTVRALFYRQHGYDNQYIYYYKSTSYFNILKLIINTLFNKEMYQDKIAFKDIKKCE IGWYNQTFRIHSFGSEVAHPIYFKINEQMVIKTDLTSNNEHIIELANLLKSHVKDFKDPY HLVLAMNDRSITVYEYIERIIHNKIIKPNLYK >gi|223714199|gb|ACDT01000016.1| GENE 30 25566 - 25835 125 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167757155|ref|ZP_02429282.1| ## NR: gi|167757155|ref|ZP_02429282.1| hypothetical protein CLORAM_02705 [Clostridium ramosum DSM 1402] # 1 89 147 235 235 144 98.0 2e-33 MTYLNTLDKTYRIFIQNFEYLQKMLPKTFFKISRSAIINIFRLITLNSENVIMKNYKNKD IAISRRRYQDLLVVLKKHAISFTERKLKK Prediction of potential genes in microbial genomes Time: Thu May 26 09:17:43 2011 Seq name: gi|223714198|gb|ACDT01000017.1| Coprobacillus sp. D7 cont1.17, whole genome shotgun sequence Length of sequence - 51972 bp Number of predicted genes - 43, with homology - 42 Number of transcription units - 27, operones - 11 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 98 - 157 7.6 1 1 Tu 1 . + CDS 193 - 780 809 ## EUBELI_20196 cytidylate kinase + Term 783 - 825 2.2 + Prom 834 - 893 6.8 2 2 Op 1 . + CDS 984 - 1670 570 ## COG0561 Predicted hydrolases of the HAD superfamily 3 2 Op 2 . 
+ CDS 1689 - 2354 688 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 2359 - 2394 0.1 + Prom 2378 - 2437 8.3 4 3 Op 1 1/0.000 + CDS 2529 - 3872 1388 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 5 3 Op 2 . + CDS 3915 - 4937 1042 ## COG1609 Transcriptional regulators + Prom 4984 - 5043 4.6 6 4 Tu 1 . + CDS 5095 - 10608 5374 ## COG0823 Periplasmic component of the Tol biopolymer transport system + Term 10612 - 10651 8.2 - Term 10592 - 10643 10.5 7 5 Tu 1 . - CDS 10644 - 13070 2456 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases - Prom 13151 - 13210 10.1 8 6 Tu 1 . - CDS 13711 - 14088 264 ## Clos_0828 chromate transporter - Prom 14127 - 14186 2.6 9 7 Tu 1 . - CDS 14232 - 14801 449 ## COG2059 Chromate transport protein ChrA - Prom 14822 - 14881 9.1 + Prom 14812 - 14871 10.1 10 8 Op 1 . + CDS 14906 - 15772 844 ## COG0583 Transcriptional regulator + Term 15775 - 15834 -0.1 + Prom 15913 - 15972 8.3 11 8 Op 2 . + CDS 15996 - 17624 1856 ## COG0659 Sulfate permease and related transporters (MFS superfamily) 12 9 Tu 1 . - CDS 17637 - 18506 890 ## COG3173 Predicted aminoglycoside phosphotransferase - Prom 18531 - 18590 3.9 - Term 18516 - 18565 -0.6 13 10 Tu 1 . - CDS 18719 - 18910 121 ## - Prom 18931 - 18990 7.2 - TRNA 19146 - 19230 51.4 # Leu CAG 0 0 14 11 Tu 1 . - CDS 19235 - 19630 170 ## gi|169350090|ref|ZP_02867028.1| hypothetical protein CLOSPI_00832 - Prom 19711 - 19770 11.2 + Prom 19722 - 19781 11.8 15 12 Op 1 . + CDS 19804 - 21243 1738 ## COG2195 Di- and tripeptidases + Prom 21265 - 21324 5.9 16 12 Op 2 . + CDS 21352 - 22722 1356 ## COG0534 Na+-driven multidrug efflux pump + Term 22784 - 22830 5.5 + Prom 23931 - 23990 9.1 17 13 Tu 1 . + CDS 24037 - 24324 229 ## gi|237735771|ref|ZP_04566252.1| predicted protein + Term 24423 - 24458 -0.2 + Prom 24667 - 24726 8.4 18 14 Tu 1 . 
+ CDS 24862 - 25014 93 ## gi|167757139|ref|ZP_02429266.1| hypothetical protein CLORAM_02688 19 15 Op 1 . - CDS 25239 - 25859 351 ## CPF_1045 hypothetical protein 20 15 Op 2 1/0.000 - CDS 25847 - 26359 455 ## COG0655 Multimeric flavodoxin WrbA - Prom 26489 - 26548 10.2 21 15 Op 3 . - CDS 27198 - 28193 924 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Prom 28285 - 28344 8.0 + Prom 28942 - 29001 10.4 22 16 Tu 1 . + CDS 29072 - 29605 534 ## AM1_4351 aminoglycoside N(6')-acetyltransferase (AAC(6')), putative + Term 29612 - 29657 4.3 + Prom 29661 - 29720 8.8 23 17 Op 1 3/0.000 + CDS 29747 - 30358 503 ## COG1309 Transcriptional regulator 24 17 Op 2 . + CDS 30340 - 31248 311 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 25 17 Op 3 . + CDS 31238 - 32038 920 ## Cphy_1635 hypothetical protein + Prom 32051 - 32110 6.5 26 18 Tu 1 . + CDS 32140 - 32949 756 ## gi|167757131|ref|ZP_02429258.1| hypothetical protein CLORAM_02680 + Term 33083 - 33139 3.1 - Term 32932 - 32990 8.0 27 19 Tu 1 . - CDS 33003 - 34454 1298 ## COG2382 Enterochelin esterase and related enzymes - Prom 34506 - 34565 7.4 + Prom 34466 - 34525 8.5 28 20 Tu 1 . + CDS 34666 - 36516 1481 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain + Prom 36583 - 36642 16.2 29 21 Op 1 . + CDS 36695 - 38029 841 ## gi|167757128|ref|ZP_02429255.1| hypothetical protein CLORAM_02677 30 21 Op 2 . + CDS 38001 - 39383 1089 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component + Term 39449 - 39489 3.5 + Prom 39456 - 39515 10.5 31 22 Tu 1 . + CDS 39652 - 39939 363 ## gi|237735785|ref|ZP_04566266.1| predicted protein + Prom 39948 - 40007 3.6 32 23 Op 1 . + CDS 40071 - 41489 1557 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase 33 23 Op 2 1/0.000 + CDS 41515 - 42300 553 ## COG2207 AraC-type DNA-binding domain-containing proteins 34 23 Op 3 . + CDS 42378 - 44021 1767 ## COG0366 Glycosidases 35 23 Op 4 . 
+ CDS 44089 - 44457 456 ## COG0239 Integral membrane protein possibly involved in chromosome condensation 36 23 Op 5 . + CDS 44484 - 45338 635 ## COG1737 Transcriptional regulators + Term 45376 - 45418 4.1 + Prom 45340 - 45399 4.8 37 24 Tu 1 . + CDS 45488 - 46270 824 ## gi|167757120|ref|ZP_02429247.1| hypothetical protein CLORAM_02669 38 25 Op 1 . + CDS 46602 - 47675 1204 ## COG0180 Tryptophanyl-tRNA synthetase + Prom 47689 - 47748 5.4 39 25 Op 2 . + CDS 47770 - 48834 868 ## COG3192 Ethanolamine utilization protein + Prom 48901 - 48960 3.1 40 26 Op 1 . + CDS 49068 - 49451 287 ## Elen_0387 metal dependent phosphohydrolase 41 26 Op 2 . + CDS 49470 - 50159 578 ## COG2206 HD-GYP domain + Term 50161 - 50201 4.4 + Prom 50176 - 50235 6.2 42 27 Op 1 . + CDS 50260 - 50895 760 ## COG2364 Predicted membrane protein + Prom 50913 - 50972 6.8 43 27 Op 2 . + CDS 50994 - 51972 1125 ## ebA6332 hypothetical protein Predicted protein(s) >gi|223714198|gb|ACDT01000017.1| GENE 1 193 - 780 809 195 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20196 NR:ns ## KEGG: EUBELI_20196 # Name: not_defined # Def: cytidylate kinase # Organism: E.eligens # Pathway: not_defined # 1 195 1 195 196 212 53.0 7e-54 MRKNIITISREFGSGGRTIGKEVAKRLNIPFYDKELIEKVARESGLNVNYIEEHGEYAPS SNPFAYAFLGHYIDGMSMNDYIWMMQRKIILELVQEGPCVIVGRCADYILRDRDDCFNIF ICSDLDKKVERIVRLYGETDEKPEKRLADKDKKRRANYKYYTDQVWGLAQNYHLCLNSGE IGIDKCTELIVDLVK >gi|223714198|gb|ACDT01000017.1| GENE 2 984 - 1670 570 228 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 227 20 253 256 130 32.0 2e-30 MPSSTFEALKKLRKNGIRLFVATGRPPNNLKVIQDCFEFDGFLTSNGQYCFNHEVVIHEK YIEREDIRNLLPYINKNQIPVLFVEINGNYSNIQNYRLDEAARSLNKEGYPIKAVNEIIE SKIIQLMAYIDETKDSELLSYMPNCKLARWTSLFADIIPIDGGKNKGIDQMIKHYQINLG EVMAFGDGNNDIDMLKHVGVGVAMGNANDFVTDTINDDGIFKALKNLN >gi|223714198|gb|ACDT01000017.1| GENE 3 1689 - 2354 
688 221 aa, chain + ## HITS:1 COG:FN0217 KEGG:ns NR:ns ## COG: FN0217 COG0664 # Protein_GI_number: 19703562 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 3 212 8 213 217 85 27.0 9e-17 MEQELTKFLTFWDHLNENEKELLKQNATLQTYAKGITMHRGSDDCLGVVLVRKGQLRTYM LSPDGRDITLYRLFAGDVCILSASCVLETITFDVFIDIEENSEIITITATLFQQLAKQNI YVETFGYKMATNRFSDVMWAMQQILFMSVDKRLAIFLKEEAEKNNSLELKITHEQIAKYM ATAREVVTRMLKYFSQEGIVKISRGKITIVDKHKLDKLTID >gi|223714198|gb|ACDT01000017.1| GENE 4 2529 - 3872 1388 447 aa, chain + ## HITS:1 COG:BH0596 KEGG:ns NR:ns ## COG: BH0596 COG2723 # Protein_GI_number: 15613159 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Bacillus halodurans # 6 445 10 475 477 372 42.0 1e-103 MLRKDFLWGGAVTAHQSEGGYTLDGKVPAVCDLTVTGEYSDFKDGIDSYHRYEEDFDLFQ EMGFNAYRFSLDWSRLMSDEGVYNEKGFVFYEHFIDALLKRGIQPIPTLYHFEMPAFLYE KYNGFYSRKVVDIFVELCKKIVDRYHDKVENWIIFNEQNGILQKGPKMFFGAVCPDGVDT QTFDNQIMHNTLIAHSLINEYIHQKGGKVMGMATVVQSYPETCHPLDTLESMKAQSEAYV FLDVFARGHYNSYYFANMKNEGTMPEILDGDLEILKKGITDSLSISYYMSTISHYGEESL TNINDVVIKKNPYLEMSEFGWTIDPTGLRITLRQLYDRYEMPIYIVENGFGYNDQINADG QIIDDYRIDYMRKHLEEMKLAIQEGVDCRGYLSWGPVDILSSRAQMKKRYGFIYVNREND DLKDMRRIRKKSFTWYQQVIASNGEVL >gi|223714198|gb|ACDT01000017.1| GENE 5 3915 - 4937 1042 340 aa, chain + ## HITS:1 COG:lin0851 KEGG:ns NR:ns ## COG: lin0851 COG1609 # Protein_GI_number: 16799925 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 2 338 3 329 341 152 30.0 1e-36 MKVTMKDIANKLGISINAVSIALNDKPGVSDEMRLEILKMADKMGYINQKRQYLSVYSKS NICILMQSYYADTGHFYSIVLRSIVEQARVFGYFSILNYFEEGHFVVPECIEERKVAGIL VVGKISDSNLLLLKNIGVPVVLVDFTSLCTPCDCVLTHNKQGSYMMTTHVINKGYQRIGF FGDLDYSFSFEDRFIGFKQSLLKNNIVSYSKVDEYIQKYSFLQGIEEYVLTNQIDKIIEI LNTKKLPEVLVCANDSNAFAVITALKDMGLKVPKDISVVGFDDTPLCVKSVPQITTVQVQ 
KELMGEVAVSNLIDRIHRKDNIPMTQLLSVKIVERTSLKI >gi|223714198|gb|ACDT01000017.1| GENE 6 5095 - 10608 5374 1837 aa, chain + ## HITS:1 COG:RSp0279 KEGG:ns NR:ns ## COG: RSp0279 COG0823 # Protein_GI_number: 17548500 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic component of the Tol biopolymer transport system # Organism: Ralstonia solanacearum # 1302 1419 367 474 475 66 38.0 6e-10 MIKKALRIILSLVMSFGLVLSSMNVTYASESDFATIKTRLKDYFLTLDTIDDGSKVETCY VSKAKDYLDLIQEDGAFGDVDYKATSSAANGTAWSPYLALDRLQAITVAYHKEGNALYHD EEVINKLNKAIVYWGKMNPSSSNWWENQIGVQLRFSRIALFMESIVSKDAMDIMLNKLLE KTPVKYGTGQNNLWFDQNYVYYAIITENGTKHTNSTGFKKLDLKELVDDYLSYCLVVQTD DNTAEAVQVDNSFYMHGRQFYSNGYGMSMFRDMSFWIYMLRETSYAFEQDVIDLMADYMI DGTSWTIRGDIMELYLGYRPYDYDVGYDNYAAEYIDPLKRMIESDPDRADEYQAILDNIQ GKNTVNGKNGNYYMWRSGYASHMRDGYGVNIKMDSNQIIGGEWRGSWSGYDKDGGQLIYW TSSAASTITVDGDEYTNVYPTYDWAHCPGTTTAARIVQDYANAGRFTNGTEHTIGVSNGK YGNTAYDMNKKGTQVKKGYFFFDDEFVALGSGINSTEGVNIHTTLNQCEAEDVNVGGQSV AEGTKEQIYNTNWLYNGKVGYVFLENTDVVVSNSVQTNNPSLWDEAKKNETPATFTAYLD HGLKPSNDSYAYIVVPNTTAGAVSQYAGNTPVTVIANNEKVQAVRHDGLKQTQINFYQAG SLEYKDGYIITVDQPCSIIIDESEDTRKISVAVNDTAANQTVNVNLNYDNQETQTTFVSG ALPYAGQTMTLSEGSDNRYHTNSSMIGHEVEKAFDGDENTYWQSEETQNEWISMFTGNNK HLASMNILWGDNYASAYDVMVSQDGKNYELLKSVSNGDGKTDTIELKGVYPYIKIIMKDG NGNCFEIKEITFKASDSIALNKSVEASSTSTNDPGNTKELVVDGNTSTRWSSLRNEDENW IMVDLGQYSEINAMSIQWESACSDDYDIQVSSDKNNWITVKDSMATGSSLLDEYNFDEPV YGRYVRVSSHKSRTLKYGISIYEISVYGSYKDEDIAANKNAFSSTIKDLNLPTHAIDNKA NTSWISSAAGDQWIYVELKGIYKISSMQVDWGNNYAPNYEIQISDDAQMWQTVKTIIDGQ GGNENISDLGNQEARYVRLKLNGSSGSSYQIIQWSVYGELVEAENLENIALNKTATASSV YNNVYTANRAIDGSFENNGGNDQSRWVSNRNSNDEYIQIDLEANYDLTGVKLYWEGNGAK KYQILVSEDGKTWQKVYLEENGQPGIANIPFNETVTARYVKMLGIECASKYGYSLWEFEV YGKLHQEPVVPEVNIALNKTSKASSEFTDPNDGNKTYQSLLAFDGKGTNETVDGKQSRWV SLRTKDDPTATSQWIYVDLEADYDISKVVLNWEGNGAKEYKVQISNDGQNWTDISHITDG KGGIDELTYKNTTGRYVRMLGIEPGGIYGYSLWEFEVYGEAVLEPENPNVNLALNKSSNA SSEYVDSKDGNKTYESSLAFDGKGTNEIVDGKQSRWVSNRKSNDEWIYVDLQDIYNISKV 
ILNWEGSGAKKYKVQVSNDGQVWTDISDINDGDGGLDELSYKDVTGRYVRMLGIEVGSDY GYSLWEFEVYGTTLKSKLQEIYKKYKDLDISSYTPNSIARYQEALNNAVKVYNDDDVTSE DILKAIDQLEDSVKGLTKIVDKMELEETINSASKQLDTAKYTPQSIELLTVTLQKAKEVF NDDNATKHEVDEAINKLNEVVAALVEKADKNNLISIFNTALKLDKEKYTSESLTKVEALL EKVESIIDDENAAQAEVDEMYDQLHQALDQLVIKEDVNTNDKVEGDVTVIPTPEKEIPKD DSTSAAKTGDSVSIQILAGTLLISILGIGILRKKKQS >gi|223714198|gb|ACDT01000017.1| GENE 7 10644 - 13070 2456 808 aa, chain - ## HITS:1 COG:VCA0644_1 KEGG:ns NR:ns ## COG: VCA0644_1 COG0446 # Protein_GI_number: 15601402 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Vibrio cholerae # 6 466 3 483 484 453 52.0 1e-127 MNTNQKILIVGGVAGGASAAARLRRLDESANIIMFEKDEYISFANCGLPYYIGGEITDQE ALTLQTPDSFKARFNVDVRNFNEVTAITPETKTVTVYNHQLQKEYQESYDKLLLSMGAKP IKPNIPGINSNKVFTLRNIPDTYAIKEYVDTHKPKHAIVVGGGFIGIEMAENLHSTGINV TIVEMANQVISPIDYEMACEVHQHLISKGINLVLETELQAINEAGNKLTVTLNNQTVDTD MVIMAIGVVPETKIVKNTEIATNSRGAIIVNDKMETSIKDIYAVGDAIEIKNFVTNKASY VPLAGPANKQGRIAADNICGFDRHYQGTQGSSILKVFDLTVASSGINEKTARELNLNYDK VYTYSANHAGYYPGAVNMSIKVLFDKSTGTILGAQIVGYDGVDKRMDVLAAAIRAKMTGF DLTELELCYAPPYGSAKDPVNMAGFVIENILTDKIKQYNWDDVASLPRDGSVILLDTRTE LEYANGHIDGYINIPLDSLRTRLHELNLNKPIYVTCQIGLRGYIASRILSQNGFDTYNLN GGYRLYNTIFNQEHDEPKIKTMHPACPIENPETIKINACGLQCPGPIVKLSASLETAKDG DIIEIQTTDPAFATDLDGYCRRTGNELIELSCNKGISSAKIKKGQNSISNGNKNNKNMIV FSGDLDKAIASFIIANGAAAMGRKVTMFFTFWGLNIIRRPEKVKIKKNFISKMFAMMMPR GSKKLSLSKMNMGGMGAKMIRTIMKDKNIDSLEDLIKLAQDNGVELIACSMSMDVMGIKQ EELIDGVTLSGVATMLANGEESDMSLFI >gi|223714198|gb|ACDT01000017.1| GENE 8 13711 - 14088 264 125 aa, chain - ## HITS:1 COG:no KEGG:Clos_0828 NR:ns ## KEGG: Clos_0828 # Name: not_defined # Def: chromate transporter # Organism: A.oremlandii # Pathway: not_defined # 1 125 55 186 186 100 49.0 1e-20 MTPGPLAVNTSTFVGIQLAGISGAIVATIGCIFAGATISIILYLLFTKYQKLEIITNILN TLKATSVGLIMSAGATILILTFVENNSINFLAIIIFVISLLWLQKKKPNPIALMIITGII GYFVY >gi|223714198|gb|ACDT01000017.1| GENE 9 14232 - 14801 449 189 aa, chain - ## HITS:1 COG:FN0712 KEGG:ns 
NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 176 1 171 186 78 28.0 8e-15 MKANKLKDLTWLFFINIFISSFTFGGGYVVVSMVRKYFVEKRKIFNESDLITMSAISQTT PGAIAINLAALAGYKVAGTIGTIVSCIGAIIPPITILAVVSLWYQVFSTNHIIMAILKGM QAGIAAIIVDILIGMTRAINEQHSKLLTAMIPITFIASYIFKVNIVVIILITIIVATAQL YYRRRYCQY >gi|223714198|gb|ACDT01000017.1| GENE 10 14906 - 15772 844 288 aa, chain + ## HITS:1 COG:ECs3049 KEGG:ns NR:ns ## COG: ECs3049 COG0583 # Protein_GI_number: 15832303 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 282 3 284 293 120 27.0 3e-27 MNLRHLLIFKTVVDTGSFTKAAKQLFITQSGVSHAIRELEQQTNAVLFDRLSKAIILTPA GKLLLEKVIPILSLYHDLEKQIDVLEMSAPLKVVSSITIATFWLPKILKQFASSYPNIKV EVQVVSAKEALAVLECGEADLALVEGVVPPGPFIVIDFSSYQLNVLCAPDYFADNEITLK ELCTHDLLLREKGSATRDIIDSCLYLRGLAAYPKWTSVNSKVLIEACKAGFGFAVLPTLL VTEELKHGTLKTVETDELLYNKTKLLYHQEKYISEPLAKLIDIIGEAE >gi|223714198|gb|ACDT01000017.1| GENE 11 15996 - 17624 1856 542 aa, chain + ## HITS:1 COG:L1004 KEGG:ns NR:ns ## COG: L1004 COG0659 # Protein_GI_number: 15672032 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Lactococcus lactis # 81 521 1 438 460 382 46.0 1e-106 MFKDYVLSLKKEFAGYNGQKLVMDILAGLTVAAVALPLALAFGVSSGADAGAGLITAIIA GLLIGGLSGASYQISGPTGAMSAILIGLSTTYGLQGVFVASFISGVMLLIASLFKFGKIV SFIPASVITGFTSGIAIIIATGQIDNFFGVTSKGSNPIEKILSYRMLGFNVNLEALFFGI LVIAIIILWPKKWGNIFPASLAGIIIALIINLVFKFNVAEVGSIPSTLLPKARLSLNSLN LTAITNLIIPAFSIAMLGMIESLLCGASAGKMKNEKLNADQELFAQGIGNMVIPFFGGVP ATAAIARTSVAIKAGGQTRLVSIFHAAALLISMFVLGPFMSRIPLSALAGVLIMTAWKMN EWHEIKQFFSKRIKTSMTQFLVTMLATVVFDLTVAIVIGVFISMILFVINNSDLDIETSD IEPQRLDKELNYDHQRTKVIYMAGPLFFGNQEQIITKVNEILNECNNIIFSMRGVPSIDD SGIREFIDVVELCRQNNINVLFAGVQKNVMKQFKRHHFVELAQPNNFCWDVIKALDKIEQ KG >gi|223714198|gb|ACDT01000017.1| GENE 12 17637 - 18506 890 289 aa, chain - ## 
HITS:1 COG:SMc04360 KEGG:ns NR:ns ## COG: SMc04360 COG3173 # Protein_GI_number: 15965828 # Func_class: R General function prediction only # Function: Predicted aminoglycoside phosphotransferase # Organism: Sinorhizobium meliloti # 3 289 8 299 299 276 48.0 5e-74 MIEINKKLVQNLINEQFPQWKDLVIEPVAKSGHDNRTFHLGSKMTVRLPSGKGYAAQVEK ELTWLPYLQKHLTMTISSPIAKGYPSCGYPFSWSINKYIEGDTLTKQNINNLNEFADDLA KFLKEFQKIDTTNGPQAGLHNYYRGGDLAVYHNETIEALENLKTVLPTELLLKIWQRALN ASVSDLNVWVHGDIAPGNLLVKNGKLAAVIDFGVLGVGDPSCDYAMAWTFFEEESRQRFL RKLDQGMIDRACGWALWKALITYNSDEAERAENAQYTINEIIKDEKKLG >gi|223714198|gb|ACDT01000017.1| GENE 13 18719 - 18910 121 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNPSYTIFHFDDETFSILWSEYNYKFKYFYPNYLNNLKTYGEIKKYFLNKNLLIHLTFPV HPR >gi|223714198|gb|ACDT01000017.1| GENE 14 19235 - 19630 170 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|169350090|ref|ZP_02867028.1| ## NR: gi|169350090|ref|ZP_02867028.1| hypothetical protein CLOSPI_00832 [Clostridium spiroforme DSM 1552] # 1 125 1 125 127 86 43.0 6e-16 MKRQALILSSIYLIIMCLGYIWCYPFFKIETILFDLIFRTVLWSISSYGLYIVLLILKKF SLLKNIAISKPFLITCLPYIYLIIFLVEGFIGLVMVFVFKTYVFAYSFFSILTILHAIKL SQDLLNNYCTY >gi|223714198|gb|ACDT01000017.1| GENE 15 19804 - 21243 1738 479 aa, chain + ## HITS:1 COG:pepD KEGG:ns NR:ns ## COG: pepD COG2195 # Protein_GI_number: 16128223 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Escherichia coli K12 # 14 476 16 479 485 327 39.0 4e-89 MAVLDTEILVNYYFEEICKIPHGSYNEEKIADFVEAFAKEKGFRYHRDHLNNIVIYKEAS QGYENHETLMLEAHMDMVNEKNKDSNHDFEHDPLDLYLEDGYVSANGTTLGADDGYGVAY MLAILSDPNVKNPPLECVFTVAEEVGLDGALGFDASLLKATRMIGLDSENEGEICTTSSG GCDVMITKELYFTSNENPTYTLLIKGLSGGHSGGEIHRGKGNANKLAARVMYGMIKANLD IQLVDLNGGLKNNAIPRECAIVFASTSDFKLLQEVADEYQDYFSEEFEISDPGVKVELSL NNDIAQQTLSIKESKDIIKLMMAAKSGFVERSLVIEDLTTVSLNMGVVSIKDKQLKIDYL LRSPMKSAVMNMVDELDIIADAFGGTITPANYYPGWNYDQHSKLRDLFKAFYFKRTGAEV KEVATHGGLETGIFKGKMPALDIITMGPNMADIHTPDERMEVASFVNCYEILKDFIATL >gi|223714198|gb|ACDT01000017.1| GENE 16 21352 - 22722 1356 456 
aa, chain + ## HITS:1 COG:SP1939 KEGG:ns NR:ns ## COG: SP1939 COG0534 # Protein_GI_number: 15901763 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Streptococcus pneumoniae TIGR4 # 3 448 4 449 456 430 52.0 1e-120 MFKMKVDLTKDPILKSLLIFAFPLFVANVFQQLYNTMDTMIVGNFLGDTSLAAIGACGAI YELLVGFALGVGNGLSIVTARSFGAKDENLLKKSVAGSIVIGILLTIILMLFSQFCLYPL LELLNTPANIINEAYDYIFMITIFVGVMFAYNLCAGLLRAIGNSVMPLVFLLISSVLNVG LDLLFITQFNMGIQGAAIATVIAQGVSAILCFYYIYKKCPILLPHKEHFKISRELYQELA GQGFSMGLMMSIVSTGTVILQTAINKFGYLIIAGHVTARKLNSFCMMPAATLGLALSTFV SQNKGANQGLRIRQGVRYANLIAVGWSVIATVVLFFIAPTLVQLLSGSSSAVVIDNGSLY LMLNAPFYCMLGILLNLRNSLQGLGRKIIPLVSSIIEFIGKIVFVWLFIPLLGYFGVIIC EPVIWCCMCLQLAYSFYRDPYIKEYRNIKKEAVTGK >gi|223714198|gb|ACDT01000017.1| GENE 17 24037 - 24324 229 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735771|ref|ZP_04566252.1| ## NR: gi|237735771|ref|ZP_04566252.1| predicted protein [Mollicutes bacterium D7] # 1 95 8 102 102 155 100.0 6e-37 MNDTIKTINHKIQEMSFEDLRLICTKHSIDISDGNLNAILSLIKNNPSTIMFADYHPIIY IQILNKLDDNILNIFKPLIEKDYLMHDIKKLCKIN >gi|223714198|gb|ACDT01000017.1| GENE 18 24862 - 25014 93 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757139|ref|ZP_02429266.1| ## NR: gi|167757139|ref|ZP_02429266.1| hypothetical protein CLORAM_02688 [Clostridium ramosum DSM 1402] # 1 50 1 50 50 72 100.0 9e-12 MHKAIKKNITEYNDGIYNFIVDLIDENNDYFVQKVQIDAKDDAPRFSNKD >gi|223714198|gb|ACDT01000017.1| GENE 19 25239 - 25859 351 206 aa, chain - ## HITS:1 COG:no KEGG:CPF_1045 NR:ns ## KEGG: CPF_1045 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 1 200 1 200 200 239 58.0 7e-62 MERIVILNGSPRAPKSNSKCYADIFSKKYNKDIEYYNITKQNHDKLCSKMNDFSDMLLVF PLYADGIPATLLNFLKSLENNLPDKKPVISVIINCGFLEYTQNDLAVKMIELFCKKTNCE FGSVLKIGSGEAILKTPFKIFVTRKVKKLANSILSKNYKTLHITMPLTPKLFIKASTKYW IEYGNRNGITKEQMDTMSIEDDNTLV >gi|223714198|gb|ACDT01000017.1| GENE 20 25847 - 26359 455 170 aa, chain - ## HITS:1 COG:CAC0756 KEGG:ns NR:ns ## COG: CAC0756 
COG0655 # Protein_GI_number: 15894043 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 20 162 34 177 180 92 32.0 4e-19 MKLILSDKYLNLDVSDHSNTKFIDLSSLNIANCTGCFGCWTKTPGKCVIRDDATKVYPYI AKSDNLIYISKILYGGYDTPMKTMLERAIPIQKAFIRILNGETHHVQRSVELKNATIIAY GEIGTEEKEIFKALVERNAKNMNFKHYKILFTNEQNLEKTVKTEMIKWSE >gi|223714198|gb|ACDT01000017.1| GENE 21 27198 - 28193 924 331 aa, chain - ## HITS:1 COG:lin0492_1 KEGG:ns NR:ns ## COG: lin0492_1 COG1902 # Protein_GI_number: 16799567 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Listeria innocua # 8 323 11 328 364 202 37.0 5e-52 MRNLDKSISINGLTLKNRLVMPPMATSSANDGEVSQRILDYYDKKTKGGYIGLVITEHSY IDIQGMANPKQMSVAKDSDIKGLKQLVNIIHNNGSKAFAQINHAGSMARGTGLPTVSASN TIPITMKESNIEIPEELSKEQIQYIVKRFADAARRVKLAGFDGVEIHSAHAYLLNQFYSP ITNHRTDEYTGTTLEGRFRIHKEVIEAVRSEVGEDFPIALRLGGCDYMAGGSTIKDSIKA SQMLESYGVDILDITGGINRYMIPWNKEPGYFSDMTEHIMEKVSIPVILTGGITNAMDAE KLLQLNKADLIGVGRAILKDDSWAKNAMEQK >gi|223714198|gb|ACDT01000017.1| GENE 22 29072 - 29605 534 177 aa, chain + ## HITS:1 COG:no KEGG:AM1_4351 NR:ns ## KEGG: AM1_4351 # Name: not_defined # Def: aminoglycoside N(6')-acetyltransferase (AAC(6')), putative # Organism: A.marina # Pathway: not_defined # 1 175 1 182 186 136 37.0 3e-31 MEIIKLTSNLTYYRQQAATLLIEAFPYAYKDCAELEITKCLSTNRIMLGAVENDILMGLV GAIPQYGITAWELHPLAVSKKWQSQGIGTKLCTALENELKNCGCCTIYLGSDDEFDKTTL ANTNLFDDLYSKIKNIRNLERHPYEFYQKIGYQIVGVIPDANGLGKPDIWLAKSLVR >gi|223714198|gb|ACDT01000017.1| GENE 23 29747 - 30358 503 203 aa, chain + ## HITS:1 COG:lin2076 KEGG:ns NR:ns ## COG: lin2076 COG1309 # Protein_GI_number: 16801142 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 10 116 2 107 206 63 34.0 3e-10 MNQKFFELNEEKRLAIINAGLEVFSKNDYKHALTDDIAAKAGISKGLLFYYFHNKLELYQ YLAQYSARLVMKLFEKMNIIDGKDFFDAIKLAVLGKMEMMSKYPYIYGFSLRYYQNRKEI 
VGNSYEQLYAEVIASYNLIKRADISKFKAGVDPEEVWNIIYWMSQGYMDRYQDLENVNMK EIQDEYFRYLDIIKENFYREEFI >gi|223714198|gb|ACDT01000017.1| GENE 24 30340 - 31248 311 302 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 297 4 310 318 124 28 1e-27 SGGIYMSVVEINGLTKDYGNDKGIFDVSFKIEKGEVFGFLGPNGAGKTTTIRHLLGFIIP EEGTCLIEGMDCSKKIAEIQKKIGYIPGEINLMEEVTGIQFIKFMAEYRGMKDLGRAEEI IERFDLDPHVKIKKMSKGMKQKIGLVVAFMHDPAVFILDEPTSGLDPLMQSEFINLIIEE KKRGKTILMSSHMFEEVEKTCDRVGIIRQGRMVSIEDIDTLKKSKQKIYIITFKNHQDAY KFQKESFTIRNADDNVVEVVIKHEINELISCLNNYELMNFDIAHQSLEEIFMQFYGGNQH VQ >gi|223714198|gb|ACDT01000017.1| GENE 25 31238 - 32038 920 266 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1635 NR:ns ## KEGG: Cphy_1635 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 266 1 266 266 202 42.0 1e-50 MFNKALFKLEWKSNYKILVIFCLILTMYTTIMLAMYDPKLGTALETFAKSMPEIMAMVGM SGTPTTLIDFLSTYLYGLIMIVFPLIFGILLALRLVVRKVDNGVMSYLLCSGVERRSVWF TQMLVIITNLFVLIAFCTGLGLGCSALMFPGDLDIGAYLVLNLGVFILQLTLTGICYMCS CIFNEYRLASLFGAGIPIVFIMIQMLSNMQGSMEGLKFATLLTLFDPQKLINGNNEGYLM LGGLLVVMIVCYGVGGIIFDRKSMSL >gi|223714198|gb|ACDT01000017.1| GENE 26 32140 - 32949 756 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757131|ref|ZP_02429258.1| ## NR: gi|167757131|ref|ZP_02429258.1| hypothetical protein CLORAM_02680 [Clostridium ramosum DSM 1402] # 1 269 1 269 269 473 100.0 1e-132 MKRHEQLGFNFFRYCNENLSNTNEYNIARIIIEHIGDIKTVSLEQIAQEANISIASVSRF VQKIGYSSFQDFKDGLDYFIRNLNMVRTVSNMQQFMRTSLDNLADSLYVEAISNLRQTKL NLDMEKLVAITKLLLNSRSVTFIGDTHEMIDFYTVQLDLVANEVPAYLFDLQEFQDIHSD FFKDGDTLVLLNVSNDFYSEIQKRVVEKASQKNLKLVVFAQDDLAEQKIFDYIYQYGIPK SINDGYYSLFYLSQIISELIYKVRVIKKS >gi|223714198|gb|ACDT01000017.1| GENE 27 33003 - 34454 1298 483 aa, chain - ## HITS:1 COG:CAP0071_1 KEGG:ns NR:ns ## COG: CAP0071_1 COG2382 # Protein_GI_number: 15004775 # Func_class: P Inorganic ion transport and metabolism # Function: Enterochelin esterase and related enzymes # Organism: Clostridium 
acetobutylicum # 197 420 71 292 297 65 26.0 3e-10 MKTRNLIATSSLALTMMLGIVATPLSAQQTYTPGVTVEKNTNPEWEADYTATFIYKDKDA RDATNVSVSGGFQFYKPEEVQSFANTGDGANIPCYNAYQYQDGMFAGGYNLNGDAANYSM TEIEDEIFEVTIPLPANEYFYAFNIEYSDGSSETNVKDPANPALANDGSDAGWSLFYVGN GETKGQEYINPRTDGKTGTYSFVNYLAEDGTTQPLGIYLPNGYDANKTYKTIYVSHGGGG NEVEWMTIGAVPNIMDNLIADSLTKEAIVVTMDNTYFNWDYDRIINNLFNNILPYIETHY NVSKEVNDRAFCGLSMGSMTTTTLYQTHPDQFGYFGCFSGANVPVDVSKVAHLDQPNLYI TAGCIDMALMNDSYNTASDRTTTGFVDKLNTLNLTNYKLEILNGAHDWYVWRESFTNFVK DHLWTKDVKIETPSETPVIKPETPASSTNTTTAPKTGDDLNLAVAGLFIIISVTTGVVIK KHY >gi|223714198|gb|ACDT01000017.1| GENE 28 34666 - 36516 1481 616 aa, chain + ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 183 603 336 754 776 174 29.0 4e-43 MRLKRLGLPAFLLTLILLCSISVHTLYKIRDYGTLINYVGIVRGASQRLVKLELTDQPND ELKDYIDEILTELDMGVGKYSLINLDDEDYRTDLNKLKNIWEAIQMDIVLVREGNGHEKL LSDSEDLFEIANDTVFAVDYYTTTQSNYLLMMLMILSGISLVTWSIIIIFYLRSMITLQR KNNTLSIVAYRDNLTGVNNRLKFVLEARNIIKKTKGQFAFLYIDIEHFKYINDVFGYQFG DKILKKYAVVLANSIDKNETFARNNADSFIILRHYESKEMLKQKQIEIDQILIDYVKKAK NGYQIRLCCGICCYEDVIEKLTIDEYIDRAIFAQDTIKRTSEEHYAFYTEAIRNKLHMEK MIENRMQTALDNQEFLVYLQPKVNIKTGKIACAEALVRWKSEDRLIYPDLFIPIFEKNMH IKEIDKYVYREVCSWIRKRIDEELPCYPVSVNVSKVQFYNSKFVNEYYQIKQEFGIPDRM IEIEFTESVAFERTDLLIKIINDLRNYGFICSMDDFGKAYSSLNLLKDLPIDVLKLDSAF FEDSDNPDKVKLIIKDIIKMVNNLGIKTVAEGVEQVSQVEFLREVGCDLVQGYVYYKPMP INEYEKLLIKIVSPYE >gi|223714198|gb|ACDT01000017.1| GENE 29 36695 - 38029 841 444 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757128|ref|ZP_02429255.1| ## NR: gi|167757128|ref|ZP_02429255.1| hypothetical protein CLORAM_02677 [Clostridium ramosum DSM 1402] # 1 444 1 444 444 680 100.0 0 MITFIKKIISGQKKLYLGILAVLSLLSSFEFISVVMYSTSNIEVDFANGYILQLVTGIVI MIAFALTLFVNNYIINNKNEEFSLLLLCGRNMKQIIRYILIQFGLLFLISYALGFLIGWG 
WVYMYSAIYDAVITVDLTNILIIYGALFLTKMIYVVVIDMGKFMRIKTDIAGYMNHVPRK TTNFNPFKLMSSGLALKGEKPKTIDMLPKGKILSTILGILLIYIGFTPMLRSIENNNLPT YFAFSLLGEVLIIDNTFPLLFDLLHNKVLLKSKNWIISLSNLKDLTDVMTAMINISAVVV PIIISFLFLQSVSIDVQNAMMMSFFALMASMFLCFIIRFMIYLPTRASVIATMRALGYNK KSIFKIHYQEMMLFIILIVIFPIVMYGSLLYQSYLVNMITYNTFITMIAIYIILYTFMSI YMIVSYRKLIKEVYDDVRYLNHGE >gi|223714198|gb|ACDT01000017.1| GENE 30 38001 - 39383 1089 460 aa, chain + ## HITS:1 COG:CAC0164 KEGG:ns NR:ns ## COG: CAC0164 COG1136 # Protein_GI_number: 15893458 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Clostridium acetobutylicum # 5 253 10 253 260 223 50.0 7e-58 MSDILITANNISKIYDQDILLKRGANFYALRNVDFILEEGDFISVMGPSGSGKSTLLNCL STLDEVSSGAVKIFNRFVGEMNDAELSDYRNRYLGFIFQNHNLVPSLSVFDNIATPLILN EVSPQEIKERVNEIGERLNISHTLYKKPNECSGGERQRVAIARAIVTKPKIVVCDEPTGN LDSKNSHEVLEILRELNQQGTSIILVTHDNMIASYAKKFLYLRDGQIINRIDCQNGSQVD FFNEIVKITTQDSLLKLFNPQNNLKKESKKEFQEREIKNSVGEKTEDRKNENTTSQVKVE TFEPKKSTTARLIVYAIFDGLEYDEASALRNRVLNFDKDAVRFNNTNYEEIEIPYGEIKS TKISMCSRSIISWPMPEFRFYLDLDIITKDNNICRFEVLNQRNALSLFDTLEKHQIVMED PYGIVTLYRSKPDDLVRHRYLLYNFKKIAKEYNLDNPRNL >gi|223714198|gb|ACDT01000017.1| GENE 31 39652 - 39939 363 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735785|ref|ZP_04566266.1| ## NR: gi|237735785|ref|ZP_04566266.1| predicted protein [Mollicutes bacterium D7] # 1 95 32 126 126 160 100.0 2e-38 MRTEYLKEDDKIKETNDEVARNLVYQLFGNLEIADFEGLNDLETNSIGSLKVYYGSHDYL ERHFDQETLKKSNELKEVIISLLEFVDNSIDKQMF >gi|223714198|gb|ACDT01000017.1| GENE 32 40071 - 41489 1557 472 aa, chain + ## HITS:1 COG:BH0630 KEGG:ns NR:ns ## COG: BH0630 COG0034 # Protein_GI_number: 15613193 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Bacillus halodurans # 3 467 13 454 473 187 29.0 5e-47 MSGFFGVASKESCVLDLFFGIDYHSHLGTRRGGMAVHGNNGFQRSIHNIQNSPFRTKFEK DIEEFEGNLGIGCISDNEAQPLLVRSHVGSFAITTVGRINNLEELKDDCFAKGATHFLEM 
SSGEVNPTELVAALISQKDSIVEGIKYVQDVVKGSMSIMILTSEGIYASRDKLGRTPVII GEKPEGYCATFESSAFLNLGYHHYRDLGPGEIVFMTPSKIEVKQGPGDKMKICSFLWVYY GYPTSTYEGINVECMRNRCGNLLAKRDDVQPDSVAGVPDSGIAHAVGYSNVSGIPFARPF IKYTPTWPRSFMPSSQSQRNMIAKMKLIPVKELIKNKSLLLIDDSIVRGTQLGETTEFLY ESGAKEVHVRPACPPIMFGCKYLNFSRSSSELDLVTRRVIQKLEGDNVSDEILKEYADPD SERYKNMCEEIRRRSKFTSLRYHRLDDMIESIGLDKCKLCTYCWDGKEDEDC >gi|223714198|gb|ACDT01000017.1| GENE 33 41515 - 42300 553 261 aa, chain + ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 1 260 1 286 292 93 25.0 3e-19 MQHLYEVVETSFKYPIKIWNHQFNRSKVMKPHWHDEIELIYCNQGSFELSVNGEKKFVNT KQLVLINSNSIHRLRFLEDQTKIITLLLSTELFDEFKLVQCEFDLSLIKDQEQLEQLIIY LGDHLDQETLVIHNRMMQIYELLLEECSVYKLAITKKKLPIIKQIVTYIEQHYDDELSLT ILANHFHFSEVYLSRMFKEKTGITLLSYINKIRLNHAYNDLMNTSLTINEIALNHGFKNV KAFNKIFKEYYHDIPSKYRKG >gi|223714198|gb|ACDT01000017.1| GENE 34 42378 - 44021 1767 547 aa, chain + ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 1 545 1 558 561 584 54.0 1e-166 MQKSWWQEAAVYQIYPRSFQDSNNDGIGDLQGIISRLDYIKNLGVDVIWLCPVYQSPNYD NGYDISDYQDIMADFGTMADFEELLKQAHQKGLKIIMDLVVNHTSFKHRWFVESRKSKDN EYRDYYIWREGKNDQPPNLQQSVFEGSAWQYDEDTEMYYLHLYTKEQPDLNWENEKVRNE VYKMMEWWLDKGIDGFRMDVINQISKDFEKMDKAIINDPHMYEIISNGPRVHEFLQEMHD RVLAKYDTMTVGETADVSVEDALLYAGFDRRELKMIFQFEHMSLDKGPNLTYQRPKLADL KVVFERWQTGLNGKAWNALYWDNHDRPRAVSKYGDDSTPFYLEKSAKMLALFMFWMQGTP YIYQGEEIGMTNAYYQSMTQYRDVDALNKYEVFKQNYDEETIIAYFGKRSRDNARMPIPW DNSRFGGFSIVQPWLAPSQKYTNITVENALKDPNSVYYFYRDLLRLRKEYEVLIYGDYQQ LLKEHLQVYAYRRTLNEQQVIVICNYSKDEVEIDLEFIQGTLLISNYEDDLKQILRSYEA KAYLAGK >gi|223714198|gb|ACDT01000017.1| GENE 35 44089 - 44457 456 122 aa, chain + ## HITS:1 COG:YPO2596 KEGG:ns NR:ns ## COG: YPO2596 COG0239 # Protein_GI_number: 16122809 # Func_class: D Cell cycle 
control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Yersinia pestis # 5 118 6 121 127 67 41.0 4e-12 MLECLAVGIGGFIGAVSRYLMGLIIIKDTTFPFMTMMINIIGAFVIGLVVALASKYNLES RWILFLKTGVCGGFTTFSTFSLESMNLLSDGKFLMAGGYILLSVSLCLLFVYLGNLCVKV LL >gi|223714198|gb|ACDT01000017.1| GENE 36 44484 - 45338 635 284 aa, chain + ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 1 284 1 283 283 232 44.0 5e-61 MNIFSRLDNLTDLTTNEQTLVDYMKNNPERFINMSADEISAACFISIPTIYRLCKKLELN GLAQLKVMVSTSIRDYLKEKKTIDYNYPFSQNETQYQITMKMKELYEQTLIASNNLIDLD QLRLIASALKNAQFIDMYTSAGNLYFVENFKFQMSEIGRFVNVPVEEYQQLLAAASSDKE HIAIVVSFEGRGMIVDKIVKLLKKNNTPIILISSTTLKSLVSLCDYNLYLSPYENHYNKI SSFSTRLSLLYLLDCIYTCYFKLDYDKNVKYKTETYLKMTNHDE >gi|223714198|gb|ACDT01000017.1| GENE 37 45488 - 46270 824 260 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757120|ref|ZP_02429247.1| ## NR: gi|167757120|ref|ZP_02429247.1| hypothetical protein CLORAM_02669 [Clostridium ramosum DSM 1402] # 1 260 1 260 260 484 100.0 1e-135 MGDVLLSRIIKYLNGTLFYDGYYQFCTFVIANYHRMIEYTFAQVCKESDLDPRTILGFLN YLGFDNYEDFHRKLVADDILRNDQIRARLLSMSYRDLFRLIEVNDYDIFMDNIVSICHDI YDADRVIIVGGLFPTSIAVEFQTDLITLGKNVVQYHAYDPNIKFDENDFVVFFSSTGRSL RDFLAKKSLGHLSKSKTLLITQNQLLENENLSCIKYFFSLPGKFDGISFNYQIMAILDLI RIIYYQEYFVEYTEEALESR >gi|223714198|gb|ACDT01000017.1| GENE 38 46602 - 47675 1204 357 aa, chain + ## HITS:1 COG:SPy2207 KEGG:ns NR:ns ## COG: SPy2207 COG0180 # Protein_GI_number: 15675940 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Streptococcus pyogenes M1 GAS # 2 346 3 340 340 407 59.0 1e-113 MKNIILTGDRPTGRLHLGHYVGSLRRRVELQNSGEFDEIFIMIADSQALTDNADNPEKVR RNILEVALDYLSVGLDPEKTTMFIQSQVPALFEFTAYYMNLVTVSRLQRNPTIKEEIKQR GFETSIPVGFLNYPVSQAADITAFDATCVPVGEDQMPLIEQTKEIVHSFNRIYGDVLVDP 
KIMLAENEVCQRLPGTDGKAKMSKSIGNCIYLADTSEDVRKKVMSMYTDPNHLKVEDPGT VEGNTVFTYLDCFCKDEYFEKYLPDYKNLDELKEHYRRGGLGDVKIKKFLFNVLEETLNP IRERRAVLQADIAAVMKVLEDGCVKANEKANATLKRAKDAMKINYFDDKDNITEYFK >gi|223714198|gb|ACDT01000017.1| GENE 39 47770 - 48834 868 354 aa, chain + ## HITS:1 COG:lin1150 KEGG:ns NR:ns ## COG: lin1150 COG3192 # Protein_GI_number: 16800219 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 1 339 11 351 373 138 29.0 2e-32 MLILCLIGLIDKILNNKLGLVAAFDKGMDSMGGIAMSMIGFYCIAITLIQNNVDVITKIS ADSSLDPSIVIGSILAPDMGGFSIVSGLGNSTFLIFSGVILTATLGQTISFQLPIFLASL KKEDLNPFISGLVYGILSLPIILVAVAWYLQIPNLLINLLPIIILCVILVIALYVSYDKT IFVLTLFGYLIRIISILLFGMVVLQLFFDTLPFTTTALISDAMVIVLRMCIVVCGSMILS DLIIKRFSRMIFMIGQKLGVNSASVMGLLLSLGTSIAMIPLFSQMDRKGKMMNAAFSVSG AYVLGGQLGFISSVVDGNGVIIYMISKIIAGLLAIVFVLIFYREEKVSVDAMEK >gi|223714198|gb|ACDT01000017.1| GENE 40 49068 - 49451 287 127 aa, chain + ## HITS:1 COG:no KEGG:Elen_0387 NR:ns ## KEGG: Elen_0387 # Name: not_defined # Def: metal dependent phosphohydrolase # Organism: E.lenta # Pathway: not_defined # 1 67 99 165 454 71 43.0 1e-11 MFIFETKNIWQHAIYGYLFIKNFSPLDKLAPAILYHHLDYQKMIYIDDEIKNISQIINLC DRIDVFLQTSNDYSMLKDYLDKESGTRFVENLVTTLYWQVDQVYQFTTNPFSKWQKVTSL NISLSLE >gi|223714198|gb|ACDT01000017.1| GENE 41 49470 - 50159 578 229 aa, chain + ## HITS:1 COG:CAC2837 KEGG:ns NR:ns ## COG: CAC2837 COG2206 # Protein_GI_number: 15896092 # Func_class: T Signal transduction mechanisms # Function: HD-GYP domain # Organism: Clostridium acetobutylicum # 1 170 233 403 417 114 35.0 2e-25 MLVSLIDFRSYFTVTHTITTCTISLELARLLSLSSEQTKKIYYGAMMHDLGKIGIPIEIL EYPGQLTQEQMIIMRSHVLKTRELLEEKIDQEILEIACRHHEKLNGSGYPHSLWENQLTQ EQRIVAISDIVSALCGKRSYKTSFSKEKTCEILNQMAQAGEIDSMTVAQVIDNYDLIEGK VWKEAAAVLEKYNTIISDLLNRKQELIKAKWLYLVFRYSKTGIVNLKID >gi|223714198|gb|ACDT01000017.1| GENE 42 50260 - 50895 760 211 aa, chain + ## HITS:1 COG:CAC0198 KEGG:ns NR:ns ## COG: CAC0198 COG2364 # Protein_GI_number: 15893491 # Func_class: S Function 
unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 196 6 203 227 75 31.0 7e-14 MAKRLGLYFLGILILGFGIVLNTKTGLGVAAINSVPFGISEMTNLSLGMATTILYIIFVG VQLLIYQKLDFKVLLQIPFSYFMGYVLDFYNNLLNFTVTSLPVALVLLAIAILATALGAY VVVTLDLIPNPADGMVKALSQVIDKEFGKTKLLFDCGMMLVTIITTYCLAGKIIGIGIGT ILSAGFIGQIIVIYNRYFTKHLTLIVENSRA >gi|223714198|gb|ACDT01000017.1| GENE 43 50994 - 51972 1125 326 aa, chain + ## HITS:1 COG:no KEGG:ebA6332 NR:ns ## KEGG: ebA6332 # Name: not_defined # Def: hypothetical protein # Organism: Azoarcus_EbN1 # Pathway: not_defined # 6 153 476 623 654 85 33.0 2e-15 MNGYQLRIDVEGCADPIWRRIKIPAIISFEDLHQLIQTLFGFEDYHLYEFRIKELGIWIP GGETDIENQINDDVEIVDNLERIADYLKKEMVIYYHYDFGDDWQLLITVEQELEGVEHYP VVVEFCGNNLIEDCGGIERYQEIILSRDDLDINFDLEDINSCLSLLGVDEILNNSPLKNE FIEILEQLKELVKKREFTDNQVIKLVSNQTTYWVILKTLEGYVIELFETYNDLLEGFYNL ANEGINNAFCNCWTILLSEQELDFDVALDEDNYLAAFRNEAGYIPCLLEVEEARVILELL KEFANGIKEDQNRSESDEIIEIYVED Prediction of potential genes in microbial genomes Time: Thu May 26 09:19:13 2011 Seq name: gi|223714197|gb|ACDT01000018.1| Coprobacillus sp. D7 cont1.18, whole genome shotgun sequence Length of sequence - 17201 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 311 221 ## gi|167757114|ref|ZP_02429241.1| hypothetical protein CLORAM_02663 + Prom 317 - 376 6.3 2 2 Op 1 . + CDS 415 - 1611 1287 ## COG0426 Uncharacterized flavoproteins + Term 1614 - 1656 4.9 + Prom 1616 - 1675 10.0 3 2 Op 2 . + CDS 1700 - 2158 394 ## gi|237735799|ref|ZP_04566280.1| predicted protein + Term 2162 - 2216 14.3 - Term 2148 - 2204 3.1 4 3 Tu 1 . 
- CDS 2225 - 5908 2715 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain
- Prom 6008 - 6067 9.4
+ Prom 5888 - 5947 7.1
5 4 Op 1 1/0.000 + CDS 6070 - 6969 679 ## COG1737 Transcriptional regulators
+ Term 6983 - 7041 -0.2
+ Prom 6974 - 7033 5.4
6 4 Op 2 . + CDS 7092 - 8525 1502 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase
7 4 Op 3 . + CDS 8512 - 9042 363 ## gi|237735803|ref|ZP_04566284.1| predicted protein
8 4 Op 4 . + CDS 9052 - 10347 1412 ## COG1455 Phosphotransferase system cellobiose-specific component IIC
9 4 Op 5 . + CDS 10362 - 11276 749 ## COG0679 Predicted permeases
+ Term 11456 - 11495 -0.2
10 5 Tu 1 . - CDS 11281 - 12267 765 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Prom 12291 - 12350 6.8
+ Prom 12316 - 12375 6.0
11 6 Op 1 . + CDS 12419 - 13738 1576 ## COG1455 Phosphotransferase system cellobiose-specific component IIC
12 6 Op 2 . + CDS 13764 - 16031 2217 ## LCRIS_01599 putative protein without homology
+ Term 16187 - 16234 2.1
13 7 Tu 1 .
- CDS 16141 - 17121 633 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 17141 - 17200 2.2 Predicted protein(s) >gi|223714197|gb|ACDT01000018.1| GENE 1 3 - 311 221 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757114|ref|ZP_02429241.1| ## NR: gi|167757114|ref|ZP_02429241.1| hypothetical protein CLORAM_02663 [Clostridium ramosum DSM 1402] # 1 102 412 513 513 155 100.0 6e-37 ALIDGLLEYANRYGKPQRIVVNNLNVLFLVLNFISQYDIDYFEDEITLEIEGAIMDAFGL ETADEAFNDPFIQELLEQLDGKSEEEIEEKINEILANMELLN >gi|223714197|gb|ACDT01000018.1| GENE 2 415 - 1611 1287 398 aa, chain + ## HITS:1 COG:FN1423 KEGG:ns NR:ns ## COG: FN1423 COG0426 # Protein_GI_number: 19704755 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 1 397 1 403 405 452 50.0 1e-127 MNCIREVNENVYWIGGNDRRLSLFENIFPIPRGISYNAYVVLDEKTILLDTVDWSIGHLF FDNLETALQGRTLDYIVINHMEPDHCACLKEVINRYPEVVIVGNAKTFTMIDQFFGIEIN KLVVKENDTLTTGKHTFTFVMAPLVHWPEVMVTYDSYDKTLYSADAFGTFGALDGALFND EVDFEHEWLDDARRYYANIVGKYGPQVQMLLKKASSLEIATICPLHGPVWRNNIDYIVSK YDCWSKYEPEEKGVVLAYGSIYGNTENVMEIIASKLKQAGVKNVRMYDVSKVHVSNLISE VFHYSHLILAAPTYNSGIFPPMENFLSDMIALSLKKRSVALLENGTWGALCAKHMRTKLE TMKDMEIINEPITIKSTLKTEQVDDIDNLVTTIVASIG >gi|223714197|gb|ACDT01000018.1| GENE 3 1700 - 2158 394 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735799|ref|ZP_04566280.1| ## NR: gi|237735799|ref|ZP_04566280.1| predicted protein [Mollicutes bacterium D7] # 1 152 1 152 152 293 100.0 2e-78 MDIDGLRNMMFVFKTKSKRNIIQLFNFRSSKANNIDETQIKEVNDYLYIPIDLKNWLDID TNKCLERVLTTLLHLTDPKSGRPGASIATIVAGYDENHQSNFIFTTDYIDGKHTVIGYYE NGDEVIYRDSIVLRGKNCLDKYNDLSSKWQIK >gi|223714197|gb|ACDT01000018.1| GENE 4 2225 - 5908 2715 1227 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 243 653 343 
754 776 181 27.0 6e-45 MLKGDGDTYIYIIDKNYKILYCNDNALQKFPCVKVGELCYHAVRDEKKPCHDCPLQLNSN STRPIFYNRKINQWQEINSAIIDWLNYKDVNLIMIRDIQEEDKNLFYNFTNLNAYDELFE LNITRNLYKILYHVPNKYIIPPTEGKINTMINQIIDEEMIHPDDHDNFLNFWDLPKLHET FSSNQHQILKGYFRKKKTDGKYCWVLQTVVAIENSQMNDEIIMCFIQEINEPPTFQQLPD KKQFNELTGLYQTNSFFQIADNLLKTDTSHQYCMVAIDIEHFKLFNEWYGIAAGDLFLRA IAKELKTISQQLRGIAGYLLGDDFALLIPYHNNHTKMILDKLMHYIEEFQDKIGFYPALG VCLVDDRNMPSQTIYDRAVIALTSIKGNYAKRIAYYQDKMQLQMEEEHKLLLDFQNAIEN NEFTFYAQPKCNIQTGKIVGAEALVRWQHPKKGIIPPNEFIPILEKNGLIGKMDYYIWDC VYKHLRKWIDNGHKAIPISVNVSRIDMFTLDVVKCFKDLVNKYQINHNLIEIEITESAYV EEYDKVKTIIKELRQEGFLVSMDDFGSGYSSLNMLKDINVDVLKIDMNFLNMNEQSSDKG IGILEAIINMAKLMGLRVIAEGIESSQQVELLLDLGCLYAQGYYFYKPMTSKQFEAIIKN DDKVDYRGIQAKQINYISIKTLVDHHVLSDAMLNNILGGIAFFEFDGHNIELLNVNQNYC KIVDESGVDVEEKRLKILENIYEPDQTIFLDLFEQATVHLINGTKGQIRYLTAKGTYKWL NIYLFLLKKLDHSNLYYAQVSDITEQKKKEKQLESSQKALAAAVHISENDDSFMNLAEEN KALASSIFSQMSPGGMIGGYCEDGFPLFFANDALVKLLGYETYEEFAIAIKEQVINTIHP DDREQVAKDIGSNYYSGLEYTTTYRMPKKDGTWFWTLDKGKVIETEDGRLAIVSACIDIS ETIAAQQKLAKHNETLQKVNQELYYLNNKLPGGYHRCADTPDYDFIHISNRFLEFFNYTR QEIKELFNDKFINMIHPDDRAKAVRTTENLSQQDESFDLEYRMLAKDGYIWVIDQTSVLE YNGTRFFQGVVTNVTKNIELRNQMQLLEKYSPVDIVLITCRKNNVKHTIITNGLISKYGY NKEQYQRYLDNKEFEYNFNRDAFKIFEQNIIKAFKQQTDYYEILSININSKTVWFKTSFE FVKADAEEIQYLYISSDITSFKEKEYH >gi|223714197|gb|ACDT01000018.1| GENE 5 6070 - 6969 679 299 aa, chain + ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 4 275 1 271 283 102 25.0 6e-22 MIEMLIIEKLNLKEKMSDGEESIAAFILTLGKELHKYSTRNIAEATLTSPATVIRLCKKL DFKGFDDFKEQFLKEINYLDQQYGKVDANFPFNRNDTMMKTAHKISHLYEDTVTDTMTLL HHDDLQKALRLLKNSNSIHIFSTGTALNLAESFKEKMLKIGKNVVISNNLNYQLYEVSCI PKGDIAIIISYSGETINTIKIAQTCKNNKIPIIAITSFGENTLAKLASCKLTISTKESLY HNLGDFSTHLATHLMLDILYSVFFLEDYDQNYDNKIKITKDLEALRSSTNPIINDGIEK >gi|223714197|gb|ACDT01000018.1| GENE 6 7092 - 8525 1502 477 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 
COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 463 1 469 473 535 57.0 1e-152 MKIKDNFLWGGATAANQYEGAYREDGKGLSIADVEMGSSHGVPREIHDRVQSGCYYPSHE GIDFYHHYQEDIALFAKMGFKCFRMSINWPRIFPNGDEYEPNQAGLEFYDRVFAELAKYQ IEPIVTLSHYETPLYLVQKYGSWRNRKLIDFFERFCEVIFTRYREQVRYWMTFNEINEVM NQEMPYHQAGIIYQPGENHGDVKLQVSHHLMIASAKAVILGHRINSEFKIGCMLQYPMTY GATCRPQDELAKRLSMLPNFYYSDVMVRGHYTNTCRAQAKRLNGNFITIKGDEEILQAGK VDYIAISYYFSSIASYSKDSEIKVTRDNPYLSVNDWNWPIDPLGLRLSLNELYDRYQVPL FVVENGLGAIDEIEEDGTINDDYRIAYLASHIDALRDAIEIDEVDVIGYTCWGPIDIISV GTGEMKKRYGFIYVDKDDQGMGTLKRSKKKSFDWYKQVIASNGYDTSYKGMKNNETK >gi|223714197|gb|ACDT01000018.1| GENE 7 8512 - 9042 363 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735803|ref|ZP_04566284.1| ## NR: gi|237735803|ref|ZP_04566284.1| predicted protein [Mollicutes bacterium D7] # 1 176 1 176 176 234 100.0 1e-60 MKQNSLLLKRGIAFLVDLYIGALLGSLPISIISLITIKQMTQNIFLLNHQIAIIAALLSL LLLGFYYLYIPCWIYSGQTLGKRLMDLKINNKGKKFLVKRQLFVLLILTSGGRLVAQLLS LLSGYSLIEISNDITMYLSLISIGMLLLKKETLQDRLFKTTVKDISNKNILIKNHI >gi|223714197|gb|ACDT01000018.1| GENE 8 9052 - 10347 1412 431 aa, chain + ## HITS:1 COG:BS_ywbA KEGG:ns NR:ns ## COG: BS_ywbA COG1455 # Protein_GI_number: 16080890 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 1 424 1 435 444 322 42.0 1e-87 MDRFMDKIGDAIAPIAQKMTANRYLSAIKEGFFGSTPILIAGSIFLLFTSLPFNGYTEFM ESTFGAGWMDFFYLPYQVSFKLMAIFVVIGMAKSLANYYKVDSKLAIALSFVGILLLTPV IVTVENVKGLPLDNFSAQGLFVCMIAAALAVEIYRWCVQKGFTIKMPDSVPQNVSSAFAA VIPAFLIILLFNILRMGFAMTDFGSAQTFVFTILQQPLQSLGGTLPATILVLLVEAVIWC FGLHGSSIVSSVMNPIWFAQSAENLAAFEAGLAMPHIVNYQFISFFVKLGGVGATLSLTL LCLFKAKSDQYRALGKLGIGASLFNINEPIIFGFPIVLNPMMMIPFILANVSVGIVTYLA IYLGLVPMINGINLPYTIPAVISGFMISGWQGALLQVVLLLLTGLIYYPFFKAQDKQAFI EEQAKKEALLD >gi|223714197|gb|ACDT01000018.1| GENE 9 10362 - 11276 
749 304 aa, chain + ## HITS:1 COG:CAC0366 KEGG:ns NR:ns ## COG: CAC0366 COG0679 # Protein_GI_number: 15893657 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 6 296 3 293 301 112 31.0 8e-25 MDITVVIIQLIQLFILIGIGYLLGKTSLFRGMFVQQLTNLVLNLTMPCMILSSVMNSIDA PALPLKDIIIAMFILVIILPVAAFLMIRRIKTNQGLYLFMIMYPNVGFIGFPLMQSIFGS EAILATAIINMGFNLSLFTLGIVAINYGENKLTSFDLKKIFSPGVISSILAVLIYTFKLT FPYVIVEPINSIGMMTTPLAMLVIGATLSVYHLKDIFSDYTVYLFTLLIDLIIPILFYPV ILLFIKDSMIRGITLIILAMPVANGAVLFARSNGQDEFLAAKTVFISTMLAIFTIPSLVY MFLL >gi|223714197|gb|ACDT01000018.1| GENE 10 11281 - 12267 765 328 aa, chain - ## HITS:1 COG:lin2983 KEGG:ns NR:ns ## COG: lin2983 COG2207 # Protein_GI_number: 16802041 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 51 326 48 325 326 105 26.0 8e-23 MNYQELEKVLQGPLTYTNTSDKKLIEHFQEKFKMVPYNNDFLFIFHHEREFENKNHIISL HQRNSGRVPMHIFHYIVITYIYSGTMIITVENDTVTLNAGDVIIFDKHVPHSVAPTSAND LGVNIVLNENYFSKKFINHLPNDQLISKFMIELMNSQTNHNHYLLFYTKKDHLITNCIQN ILCEHFEPAVCSDDLIDNFIMVLITHLVRKFQYNTNLTVSMFKNEQLMDDILNYIHSHYN EGSLNKMCHDFGYDPSYTSKLIKQFSGKTFKQLVNEERMKKAAILLQNHELPIYEIAHQI GINNLTSFYRRFQAYYQCTPQQYRDRYD >gi|223714197|gb|ACDT01000018.1| GENE 11 12419 - 13738 1576 439 aa, chain + ## HITS:1 COG:BS_licC KEGG:ns NR:ns ## COG: BS_licC COG1455 # Protein_GI_number: 16080909 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 11 429 8 437 452 264 36.0 3e-70 MKGELTFMDKLQEILGKVSIKLSGNLYINAIKDGMLAYMPFAFIASIFLIIAFFPIPAFT DFVSSITGLEAAVWQGKLALVNDASLGIGGLLVLLSISRSLADKLGINGIQVTMTSVVAF MLLVPFGANDTGNFISVTYLGAQTIFLSIVISIISAKVYKFINDKGIKIKMPAAVPPAVA APFESIIPSFVVIFIFWLLRLTIDAFDGGSALAVFNFILGVPLQKVGGSLIGVVIVKMFS QLLWFFGIHGDSIVNGVMTPIFQVLQDANKTVSMAGGTPVNIINQSFWDSFAGIGIVGAI IAIVIIAKSKRYKEMKKIAGVPYIFNVGEPTLFGIPLMMNVIYFIPFIISNVVSILISYV 
AFATGLVPVCTGLAQVPWTTPLVISGYLATGSIAGSILQIVCLIVVVLIWLPFVRIADNQ LIKEEAALENKGGSVQETI >gi|223714197|gb|ACDT01000018.1| GENE 12 13764 - 16031 2217 755 aa, chain + ## HITS:1 COG:no KEGG:LCRIS_01599 NR:ns ## KEGG: LCRIS_01599 # Name: not_defined # Def: putative protein without homology # Organism: L.crispatus # Pathway: not_defined # 8 705 4 703 732 553 43.0 1e-155 MEKSNYQFHGYWIKGTEIDYGQEDNEYYLKQPNPLMKKEFVVNEINSSYIYIGILGYAIV YLNGRRISQDELNCDWTNFKKCVYYDVYDVSQFLKIGTNILEVELGNGMYNPAPLKLFGK YNLRNNLAEVGQPRFILDLVNDDKVILTSDSSWILGNGKRLFNNLYLGETVDNNLEANYY AQVQIDDCKRNLVLSEIPKIKRCGDILPQSFINQSDGIIVDFGKMISGFINFACTAHKNQ EIVLQYSEAMVDNHLVYSTCLAGSVGERIGEHVIAGGTGAPAIAIQTDRLICKEGYNQFT NKFTYHSFRFVRITGIELTQLQQIYATYVHTDLKKIAKITVDNQELQQLYDAATRTKLNN VHGTFEDCARERLGYGGDMVALATSNLYTFELDKLYKKIIKDFRFEQTAMGGIPETAPYM GIQSNGTAKGEGPILWQLVYPYLVYKHYQYYGDLSLLKQEYPYLKKQLDYLLDYNLNKLV ECCLGDHGSILIAGQFRKPTPDKLLLGYCTVLLFLRYNILIYQAIKKDTVFYQQRYQELK ILTINKFKNSDGSFGEGSQSGYAFAIELGLDDPKKLCALFVEKVKKDDYVFNSGIFAMAL SYEILSKYGYDEVIENWLLRKEAPSYQQMLKSGNQVLAELFVGEHLSLNHAMFASYQQWY YQGLAGIKITDQAVGFDHIVFNPYFSKKVNDFTCQLDTKQGLITSSWHRCGHEINWELVV PVKKVNYQIKINNRYQEIKRISKNNQIIIKLIDSV >gi|223714197|gb|ACDT01000018.1| GENE 13 16141 - 17121 633 326 aa, chain - ## HITS:1 COG:lin2983 KEGG:ns NR:ns ## COG: lin2983 COG2207 # Protein_GI_number: 16802041 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 9 322 8 325 326 135 28.0 8e-32 MDSFKLDHFLRTYSRTEIKHKNGFATQYATKGKYRLINNIYQFNTDNVPDSDTIIVYKNQ RFDDVPTHVHDHIEINYMYSGSCSQIIDGHEVILKKGQMTLIDTRTPHAIGYTDENDILI NFIIKKDYLNNSFFSKLTDNNLITSFFINAINQDNTDLNYLIFNTENNQRLQMFILEFIY EYFYPTLNSTEIKKSLFVLIILEMINTLDTSINLESITKSSTIIILALQYLEKNFLTCSL DSTAKYLNINPCYLTTLLKKNFNYSYKELIIKLKMQYASKLLLNSSYSIDQIAHECGYQN LSFFYKKFKETYYCLPKEYRNRNKKR Prediction of potential genes in microbial genomes Time: Thu May 26 09:19:48 2011 Seq name: gi|223714196|gb|ACDT01000019.1| Coprobacillus sp. 
D7 cont1.19, whole genome shotgun sequence
Length of sequence - 16926 bp
Number of predicted genes - 14, with homology - 14
Number of transcription units - 7, operones - 3 average op.length - 3.3
N Tu/Op Conserved S Start End Score pairs(N/Pv)
+ Prom 61 - 120 5.5
1 1 Op 1 1/0.000 + CDS 141 - 1139 889 ## COG2207 AraC-type DNA-binding domain-containing proteins
+ Prom 1151 - 1210 5.7
2 1 Op 2 3/0.000 + CDS 1256 - 2626 1502 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific
+ Term 2631 - 2669 3.2
3 1 Op 3 . + CDS 2675 - 3157 573 ## COG2190 Phosphotransferase system IIA components
4 1 Op 4 . + CDS 3168 - 5741 2329 ## Pjdr2_3683 glycoside hydrolase family 2 sugar binding
+ Term 5783 - 5818 2.0
+ Prom 5806 - 5865 7.4
5 2 Op 1 6/0.000 + CDS 5909 - 7315 1368 ## COG1070 Sugar (pentulose and hexulose) kinases
6 2 Op 2 5/0.000 + CDS 7318 - 8571 1390 ## COG4806 L-rhamnose isomerase
7 2 Op 3 . + CDS 8582 - 9406 943 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases
+ Term 9466 - 9520 -1.0
+ Prom 9533 - 9592 9.3
8 3 Tu 1 . + CDS 9623 - 10588 833 ## COG2207 AraC-type DNA-binding domain-containing proteins
+ Term 10604 - 10658 1.5
- Term 10589 - 10646 5.1
9 4 Tu 1 . - CDS 10649 - 11206 427 ## Fisuc_1679 hypothetical protein
- Prom 11331 - 11390 9.1
+ Prom 11262 - 11321 10.5
10 5 Op 1 19/0.000 + CDS 11455 - 12366 1076 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB)
11 5 Op 2 . + CDS 12366 - 14252 1961 ## COG1299 Phosphotransferase system, fructose-specific IIC component
12 5 Op 3 . + CDS 14254 - 15024 978 ## COG0561 Predicted hydrolases of the HAD superfamily
+ Term 15209 - 15239 -1.0
+ Prom 15219 - 15278 13.0
13 6 Tu 1 . + CDS 15298 - 16086 987 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family
+ Term 16107 - 16163 11.2
+ Prom 16152 - 16211 7.3
14 7 Tu 1 .
+ CDS 16289 - 16894 626 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs Predicted protein(s) >gi|223714196|gb|ACDT01000019.1| GENE 1 141 - 1139 889 332 aa, chain + ## HITS:1 COG:lin2983 KEGG:ns NR:ns ## COG: lin2983 COG2207 # Protein_GI_number: 16802041 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 42 328 33 324 326 129 28.0 8e-30 MIAFTNRGDEMIREDIMERLVCFSDEEINNLNGFVGIDKSIFISEQSNVVDADKLLKANQ QFAVRKHARFCEYPKHRHNYLEFMYVYGGEMVTIIDNQEIVIKQGELLLLNQNIEHAIKY TNENDIIFNFIIKPEFLEFLSGMAEEQNEVFSFIFDALYSYDNKGEYLIFKVSNNEIVRN HIEAIITNIYQQQLNHSFTLKLLVGLLLTELMNNPHLIETYESNNYNKLVVISILKYITL NYQEGSLSVLAKQIHQPDYKICKLIKEHTGSTFKQLIQEERLKAAANLLKTTSLPIVEIM QEVGYENITYFYKIFKEKFKITPSIYRNHNLR >gi|223714196|gb|ACDT01000019.1| GENE 2 1256 - 2626 1502 456 aa, chain + ## HITS:1 COG:CAC1407_2 KEGG:ns NR:ns ## COG: CAC1407_2 COG1263 # Protein_GI_number: 15894686 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Clostridium acetobutylicum # 102 452 1 365 379 239 39.0 8e-63 MAKKVDYTQLAKEVIRAVGGKENINGVTNCMTRLRFVLKDDSIPNSEEVKLIKDVKGVMN KGGQYQVIIGTHVNEVIKFVNQELGFKGDEKATEDKVEEKGNLFNRFFKVISGCIMPMIG PMVAGGIIKGILTICVTLGWMTKTDGNYLTLYAAADALLYFMPIIVGFSAGKVFKCNPYV TATIGAALLYPDLVNALAGDTAHHFMGLTITNMSYSQTLLPIILASFIAAKIEYFAKKII PTMLQLMIVPVIVLMITVPLSWLAIGPVMNTVSSWLSTAVVSIFGFSPILGGILFGAFWQ LMVLLGLHSAFIPVLLNNLFTMGYDPINAILGLTVWALAGVSLGYAIKMKDPEKRSLGFG NMASCLCGVTEPTIYSIALPQIKCFVAAWIGGGIAGGILGALGGKMYSLGGDGLFRIPAM INPNGIDVSFYGFIITALIALVVSAIITYFAADPEK >gi|223714196|gb|ACDT01000019.1| GENE 3 2675 - 3157 573 160 aa, chain + ## HITS:1 COG:lin1016 KEGG:ns NR:ns ## COG: lin1016 COG2190 # Protein_GI_number: 16800085 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Listeria innocua # 3 136 4 137 163 117 47.0 1e-26 
MFRFKKKKQQIEIKAPVVGKSVSIESVPDDTFATKLMGEGVAFKFSGSDICAPVTGELIL VADTLHAFGIRDTHGVEILVHIGLETVTLNGAGFIKIKEIGQKVFEGEPIIKVDQTYMLK SGLNMITMMVVTNSSDYQISLDNLESNVDLDSVMIRCIAN >gi|223714196|gb|ACDT01000019.1| GENE 4 3168 - 5741 2329 857 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_3683 NR:ns ## KEGG: Pjdr2_3683 # Name: not_defined # Def: glycoside hydrolase family 2 sugar binding # Organism: Paenibacillus # Pathway: not_defined # 8 835 9 879 896 600 38.0 1e-170 MRVVEKNLNSSSYHNLFPFFWQHGESNNKIDEYITKMKEQGINDFCIESRPHPAFLEKGW WQSLDFIIEKAKENDMKIWILDDARFPTGYANGKVPEALRKRYLNYRRYDIAGNQKFAQL NLKPIVDMRTFMNNKRHQQDQIFRVVLAKNDISVKDGFNETDLIDITNQIKDNVLTVSLE NNVNYSIFVLYVTMVGDEAVTEEYLDPMKKEATQVLIDEVYQKHYDHYYHEFGKTIVGFF SDEPRFGNTKGPNASIGRYNMPLPWNENVLMKLKEIDFDLNNLVLLFQGMSSTANMVRYA YMNIISNLYSENFSQVIGNWCKEKEIDYVGHVIEDNNAHSRLGYGAGHYFRSIAGQTMAG IDIIGGQIVPGMDYHHTAFSTGGSDGEFYHYALCKLGASAAKLDKNKNGRLMCEAFGAYG WVEGLKMMKWITDHMISHGVNLIVPHAFDPKEFPDWDCPPHFYAHGNNPQYPYFHYFANY ANRLCNLMSDGHQICKVGVLYHAFAEWSGDYMLIQKVLKVLQQNQIDCNVISEDYLMEAM IKEDVYQINGYDFEVLVVPYAKRLPDCLLQTIKRLKSKVIFIDAFPEDEEVEGALVLSLQ ELPRELNDYSEILIDHMEEKLVFYHYQHDDGDIYMFNNESIYSDINSQIMLKTDQSLMIY DAFSNQTYKFASKIEGQQQIFNLHLAPYQSLILVSGKSNDLIPAKKSELGNVNDVEISLR AFNETKYRVNFKADLNTNLMNRYPCFSGAVKYHFRYSFITKDVLLEISEAYEIVEVIVNG KLAGVKIAPEYLFDISEYLEIGENSFEIIVINTLARNQHDAMSQFLALEPMGITGTLKFY TKNVDDEVIVNRDKNLI >gi|223714196|gb|ACDT01000019.1| GENE 5 5909 - 7315 1368 468 aa, chain + ## HITS:1 COG:BH1551 KEGG:ns NR:ns ## COG: BH1551 COG1070 # Protein_GI_number: 15614114 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus halodurans # 1 464 1 461 467 463 51.0 1e-130 MEYYLAVDIGASSGRHILGSYQNGKLVLEEIHRFENNIRVIDEHLSWDVDYLFHEIIAGM KKCKELDKIPLSMAIDTWAVDYIFLDRHGKRIGNAISYRDHRTDNIDQEVYKIVSAEELY ERTGIQKQLFNTLYQLMAVKKYTPEILTQVSSMLMIPDYFSYLLTGRQVCEYTNASTTQL VSSNTRNWDRELIEQLGYPQTMFKEIIEPGTIIGNLTDIVQESVGFDTKVIATASHDTAS AVVSVPALREDFIYISSGTWSLMGMERNIADCSLESMQANFTNEGGYNHRFRYLKNIMGL 
WMIQSLRRELEETLSFNELCQMAKANQNFPSRVDVNDCCFLSPDSMIKAIQDYCQNSNQP IPKTAGELANVIYQSLAISYAATIEEIEMLARKKYEAIYIVGGGGNAEYLNELTAKATGK VIYTGPVEATATGNIVVQMLNNQVFASLVEARKCIKNSFEIKKYKEEK >gi|223714196|gb|ACDT01000019.1| GENE 6 7318 - 8571 1390 417 aa, chain + ## HITS:1 COG:BS_yulE KEGG:ns NR:ns ## COG: BS_yulE COG4806 # Protein_GI_number: 16080170 # Func_class: G Carbohydrate transport and metabolism # Function: L-rhamnose isomerase # Organism: Bacillus subtilis # 1 417 1 418 424 531 60.0 1e-150 MNVVERYKSAREIYKKVGVDTDLAIRKLLNFPISMHCWQGDDVVGFDGAGALSGGIQTTG NYPGKARTPEELMADIDKVLSLVPGKHRINLHASYAIFENGEVVDRDQIEPKHFIKWVEF AKERGLGLDFNPTIFSHPKAEGLTLSNPDKEIREFWIRHCKACIRISEYFAKELNSPCLM NIWIPDGLKDIPADRLGPRQHFMESLDEILSIDYDKEKVLVCLESKVFGIGMESYTVGSS EFTINYAQTRGILPLMDNGHYHPTEVVSDKLSAMLLFNQKVALHVTRPVRWDSDHVVLFD DETKEIAKEIVRNDALDRVLVGLDFFDASINRISAWTVGIRNMQKALLNALLIPFDDLKK MQSEGDFTKIMALQEELKLYPLGDVWNYICKITQVPEGLEWFKEIKKYEHEVLVNRK >gi|223714196|gb|ACDT01000019.1| GENE 7 8582 - 9406 943 274 aa, chain + ## HITS:1 COG:rhaD KEGG:ns NR:ns ## COG: rhaD COG0235 # Protein_GI_number: 16131742 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 3 264 4 265 274 266 49.0 3e-71 MEILKTQFIQSFMRMCSDGYKLGWHERNGGNLTYRIKVDEIALIREGLDETTEDYPIGTS VPALANEYFLVTGSGKFFRNVDLDPLDSIGIIKIDTRGENYKICWGLINGGKPTSELPSH LMNHEVKKLKNENYRVIYHCHATNVIALTFILPLKDEIFTRELWEMATECPVVFPDGVGV VPWMVPGGRDIAIATSELMKKYDVAIWAHHGIFCAGSDFDITFGLAHTVEKSAAILVKML SMSSTKLQTISPANFRLLAKDFNVVLPEKFLFDK >gi|223714196|gb|ACDT01000019.1| GENE 8 9623 - 10588 833 321 aa, chain + ## HITS:1 COG:lin2983 KEGG:ns NR:ns ## COG: lin2983 COG2207 # Protein_GI_number: 16802041 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 20 318 11 325 326 112 28.0 9e-25 MNQSLLEQLKIITAEEQAILSNKVNIQKSIYTGNDQFIVESKKMLDKQTLISVRTHTRFA DFPLHRHNYIEMMYVCQGSITHIIDNKKVVLRQGEILLLNQHSWHEIKKASADDIAINLM 
ILPEFFDMVYTMIGYNNIIGDFLINILKQDECRGEYLLFKVADILQIQNLMENIIYSLLQ SESEFRHEHQITMGLLFLYLTKYIARTKKGTTKEFEELLVETVVDYITDNYKHATLNEIA QILNQPVYALSKLIKQQTQRNFKELLQSRKFYRAEELLRDTKLSINDIITAVGYENNSYF FKRFKAKYKMTPTAYRKKMKL >gi|223714196|gb|ACDT01000019.1| GENE 9 10649 - 11206 427 185 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_1679 NR:ns ## KEGG: Fisuc_1679 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 8 181 5 182 186 129 42.0 4e-29 MPSTALLFESKLVTFCSPTLLKMKAGNIFNISNEFNELEECLNYYNQLCNQRGIYISIIH ENNNSKMIYVYQKEKLNNLFNNHKFRAVLNNYHYPYQDIDSLINWLKIRMNSSEFPHEIG LFLGYPLTDVKGFINGADYKYIGYWKVYSHVPQTLCTFDRYKKCTQELKRRFCQGERIEQ LLAHV >gi|223714196|gb|ACDT01000019.1| GENE 10 11455 - 12366 1076 303 aa, chain + ## HITS:1 COG:BS_fruB KEGG:ns NR:ns ## COG: BS_fruB COG1105 # Protein_GI_number: 16078503 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Bacillus subtilis # 1 299 1 298 303 278 49.0 1e-74 MIYTVTLNPAIDYVIKVDNFETGIVNRTKQENMFFGGKGINVSNVLKTLGEKSTALGFIA GFTGKAIKEGLEAKGLKTDFVEVAGLSRINVKMKSDNETEINGMGPQIKDAQVQALLGKL DNLKDGDILVLSGSIPGSMPDDIYEKIMEYVKSKNVLIVVDATNNLLMNCLKHHPFLIKP NNHELAEIFNVEIKNKEDVVVYAKKLQTMGAKNVLVSMAGDGAVLVDQNNDVHITSAPKG EVKNSVGAGDSMVAGFIYGYIKTNDFKTALKQATATGSATAFSEDLASKEKIDELLKTIS EDE >gi|223714196|gb|ACDT01000019.1| GENE 11 12366 - 14252 1961 628 aa, chain + ## HITS:1 COG:BH0828_3 KEGG:ns NR:ns ## COG: BH0828_3 COG1299 # Protein_GI_number: 15613391 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Bacillus halodurans # 288 624 1 332 336 242 41.0 2e-63 MKITDLLCLQGIDMNGEVANKEEAIDHLVDLMVGTGNISDKLAFKKGILAREAQSTTGIG EGIAIPHAQVAAVKKPGLAVMIVKDGVDYQSLDSQPARLFFMIAAPVDGGNIHLETLARL SGMLMDNDFKENLINASDAVEFLRLIDDKENEIVKVKEETDGTNYEVLAVTACPTGIAHT FMAAESLEAKAKEMNILFKVETNGATGPKNVLTAEEISKARCIIVAADKQVEMSRFDGRP 
VIITKVADGINKAEELLSQAINGNVEIYHHQGDKVVSSSANEGIWRKLYTQLTDALSKIL PIIIGSGALISIASLAASYNLFGIKQIYDSIWLSNSYILGITVNGMIVALFAGFIGQAIA SQQGFAVALAGGAAMLLNNMYLQNMPSPGILGAIIAGFIGGYVVILLQKICRKLPECLDG IKPTVIYPIFGVMITGVLSYLISPYVGSLNQTISVFIGSMDIVYKLILGVIVGGMMSVDM GGPVNKIAYLFGIAQIVEGNFDVMAAVMAGGMVPPLAIAISTTFFRNKFTLTERKLGCQN YLKGLMFISEGTIPFVQKDPRLVTLACIVASAVAGGFSMLYNCGIRIPHGGIFVLPLIAH PFRYIVALLAGSLCGAVIYGFFKEIGEE >gi|223714196|gb|ACDT01000019.1| GENE 12 14254 - 15024 978 256 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 2 248 3 249 256 104 30.0 1e-22 MKYIFFDIDGTLTDNTTGKIVPSAQEALDKLQENGNFVAIATGRAYYKAKNFLKEVGLNN MVCNGGNGLVINHQLVKNAPLDRQKALAVIDEAENLGYGILIAPFDSIDVYSKNTLFLKQ AGYRKEPTRYTIDSEINYHNLENIYKIYISIPASEETRLTLKDTLGSLRFEQEYLMFQPD DKKQGIVDMIAMIKGNIDDVVVFGDDYNDLVMFDERFYRIAMGNACDELKAKADYITDRN TSDGIYNACRVHGWIK >gi|223714196|gb|ACDT01000019.1| GENE 13 15298 - 16086 987 262 aa, chain + ## HITS:1 COG:DR0470 KEGG:ns NR:ns ## COG: DR0470 COG1387 # Protein_GI_number: 15805497 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Deinococcus radiodurans # 1 255 4 238 260 75 25.0 8e-14 MIDGHMHLEYGDLNKEYVLKFVDAAVKKGLTKIQILDHTHRFKEFEEIYTELKEEPLQKK WLENKAMKFKDTLDDYDKLIEEVKALDLPIEVTFGLEVCYVPKHEKKIGEVLANHNYDFV VGAIHSINGRLYDMNFSKDILWNKFDVDDIYRDYYELIFSLVKSDLFTQLAHPDTIKMFN YYPTYDLTPTYHQLADLLVEHNVKAENNTGCYYRYNHKDMGLSEELLKILKEHGVSMITA SDAHQPDHVGTNIADIYEKTML >gi|223714196|gb|ACDT01000019.1| GENE 14 16289 - 16894 626 201 aa, chain + ## HITS:1 COG:pli0059 KEGG:ns NR:ns ## COG: pli0059 COG1961 # Protein_GI_number: 18450341 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 5 185 2 182 199 107 36.0 2e-23 
MEYNKYGYIRVSSKDQNVDRQITALKEVGLTGSQLYIDYQSGKDFNRPHYQKLLQVLKQG DILYIKSIDRLGRDYDEIIEQWRYIVKKIGVDIVVIDFPLLDTREKHDGITGKFIADLVL QVLSYVAQIERENTRQRQAEGIREAKRRGVRFGRPKLTIPDDFEEIYYLWRKNVISKAEA SRRLSTNNHTFTRWVKDYESI Prediction of potential genes in microbial genomes Time: Thu May 26 09:20:08 2011 Seq name: gi|223714195|gb|ACDT01000020.1| Coprobacillus sp. D7 cont1.20, whole genome shotgun sequence Length of sequence - 21283 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 10, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 238 - 297 7.8 1 1 Tu 1 . + CDS 388 - 624 172 ## gi|167757106|ref|ZP_02429233.1| hypothetical protein CLORAM_02655 + Term 678 - 708 2.0 - Term 664 - 694 2.0 2 2 Op 1 . - CDS 713 - 979 226 ## gi|167757105|ref|ZP_02429232.1| hypothetical protein CLORAM_02654 3 2 Op 2 . - CDS 921 - 2336 1256 ## COG2720 Uncharacterized vancomycin resistance protein - Prom 2495 - 2554 7.6 + Prom 2451 - 2510 7.4 4 3 Op 1 6/0.000 + CDS 2591 - 3676 1292 ## COG1932 Phosphoserine aminotransferase 5 3 Op 2 1/0.000 + CDS 3687 - 4850 1491 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 4862 - 4899 2.1 + Prom 4906 - 4965 8.4 6 3 Op 3 . + CDS 4992 - 6416 1300 ## COG0726 Predicted xylanase/chitin deacetylase + Prom 6420 - 6479 6.4 7 4 Op 1 7/0.000 + CDS 6578 - 7405 773 ## COG3711 Transcriptional antiterminator + Term 7428 - 7487 11.3 + Prom 7417 - 7476 3.4 8 4 Op 2 8/0.000 + CDS 7497 - 9329 1786 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 9 4 Op 3 . + CDS 9363 - 10814 1288 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 10827 - 10863 0.4 - Term 10815 - 10851 4.2 10 5 Op 1 3/0.000 - CDS 10856 - 11671 685 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 11706 - 11765 6.1 - Term 11720 - 11761 7.2 11 5 Op 2 . 
- CDS 11767 - 12954 1035 ## COG0477 Permeases of the major facilitator superfamily
- Prom 12979 - 13038 12.8
+ Prom 13021 - 13080 10.6
12 6 Tu 1 . + CDS 13102 - 13866 941 ## COG3394 Uncharacterized protein conserved in bacteria
+ Term 13989 - 14047 -0.0
13 7 Tu 1 . + CDS 14230 - 15612 1497 ## COG0534 Na+-driven multidrug efflux pump
+ Term 15616 - 15671 1.2
14 8 Op 1 . - CDS 15643 - 16167 180 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32
15 8 Op 2 . - CDS 16208 - 17485 1275 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily
16 8 Op 3 . - CDS 17546 - 18736 937 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs
- Prom 18761 - 18820 8.1
+ Prom 18810 - 18869 8.2
17 9 Tu 1 . + CDS 18895 - 19269 630 ## COG0251 Putative translation initiation inhibitor, yjgF family
+ Prom 19334 - 19393 9.1
18 10 Op 1 40/0.000 + CDS 19413 - 20084 877 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
19 10 Op 2 .
+ CDS 20102 - 21281 984 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|223714195|gb|ACDT01000020.1| GENE 1 388 - 624 172 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757106|ref|ZP_02429233.1| ## NR: gi|167757106|ref|ZP_02429233.1| hypothetical protein CLORAM_02655 [Clostridium ramosum DSM 1402] # 1 78 86 163 163 145 100.0 1e-33 MKKFVIDKDGITSYRLWKKVILWQKVAEIQLHHEPFYFLIILEKNGKNMEINLASASLMI SEQEIYQYCLKKWKIAQK >gi|223714195|gb|ACDT01000020.1| GENE 2 713 - 979 226 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167757105|ref|ZP_02429232.1| ## NR: gi|167757105|ref|ZP_02429232.1| hypothetical protein CLORAM_02654 [Clostridium ramosum DSM 1402] # 8 88 462 542 542 124 100.0 2e-27 MKTILHYGQTKVKQKGTPGSKAKSWRYVYDANGNLISSNEEAYSVYKGHEEIILRGTMKI TEPATPAPTEPTTPAQPETPTIPNNETS >gi|223714195|gb|ACDT01000020.1| GENE 3 921 - 2336 1256 471 aa, chain - ## HITS:1 COG:BH1596 KEGG:ns NR:ns ## COG: BH1596 COG2720 # Protein_GI_number: 15614159 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Bacillus halodurans # 295 427 168 300 305 141 51.0 3e-33 MKRKKLKLKKQAYFVIGGLVILAALILYLILCLVTGGDNFVSNTTINGIKVGNMNKEEAR KAIEQQYQNDLNPPTLTLLLDQQEYPVDLTDNLTFDSKKAVDKISKQSNSSFFRRGYNYL FNHDYTVGVKIKNKDLLNEKITNSQILNYSTLIPTNYELKDDKIIFTKGKSGKTAELESV FNTINKALNNYDFKGKIKYSPVEHKLNDEEIKLIHENLSKEAKNATLDKNNNYEIIDSQV GAKFDLEDAVAKYNKTTEGKQFTLNATIIKPEITKEMLEQNLFKDVLGEYATNVSGTSVR KNNVKLSGDKCNGVILLPGEEFSYNNVVGKRTKENGFGEAAAYLNGETVQEVGGGICQTS STLYNAVLYANLKITERTNHTFVSGYVPIGRDATVSWGGPDFKFKNDQAYPIKIIASYEN SRLTTKILGTNVNNIRVELKSQKLASTNYNTKYEDDPTLRTDKSKTKRNSR >gi|223714195|gb|ACDT01000020.1| GENE 4 2591 - 3676 1292 361 aa, chain + ## HITS:1 COG:PA3167 KEGG:ns NR:ns ## COG: PA3167 COG1932 # Protein_GI_number: 15598363 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Pseudomonas aeruginosa # 4 357 3 357 361 375 50.0 1e-104 
MTEKRVLNFSAGPSMLPEPVLEKAAKQMLNYENSGMSVMEMSHRSSSYLDIFEKTKGLLK KVMNIPDDYKIVFIQGGATQQFSMVPLNLLKNGKADYVVTGAFSKKAAAEAKKFGEINIA YDGSENNFKHIPTQDELKLDPEASYVHLCANNTIYGTEWKYIPETNGVPVIADMSSNILS KPVDVSKFGMIYAGAQKNMGIAGLGVAIIKEDLLQKVAATTPVLLDYKLMIENDSMYNTP PAYAIYVLGLVLEWIDSMGGLEVMQERNIKKANLLYDYLDSSDFYIAHSDKDNRSLMNVT FTTPNKDLDAKFVKESIAAGMTNLKGHRSVGGIRASIYNAMPLEGVEKLVAFMRDFEAAN K >gi|223714195|gb|ACDT01000020.1| GENE 5 3687 - 4850 1491 387 aa, chain + ## HITS:1 COG:lin2956 KEGG:ns NR:ns ## COG: lin2956 COG0111 # Protein_GI_number: 16802015 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Listeria innocua # 1 385 1 388 395 369 48.0 1e-102 MYNIKLLNKISKVGLDGFDENYAYSEEMTNEDAILVRSASLHEYDFGENLKAIARAGAGV NNIPLDDCSEKGIVVFNTPGANANAVKELVLCGLFLSSRKIVESIRWIDGLKDDQDIAKT AEKGKSNFVGPEIEGKTLGVIGLGAIGVNVANAAIKLGMKVKGYDPYIGVNAAWALSKHA RHAASLEEIYQECDYITIHVPSTKETKGFMNKEAFAQMKTGVRILNFARGDLVNNEDLLA NVASGKINKYISDFAAPELIGQENIIILPHLGASTPESEDNCAKMAVEEVCEYLENGNII NSVNFPGVNQARMSKTRLCIINKNVPNILANISKLFADHNLNIENMVNRSRGEYAYTLID TNDEVRPDIIERIESANGIINVRAIID >gi|223714195|gb|ACDT01000020.1| GENE 6 4992 - 6416 1300 474 aa, chain + ## HITS:1 COG:CAC3017 KEGG:ns NR:ns ## COG: CAC3017 COG0726 # Protein_GI_number: 15896269 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Clostridium acetobutylicum # 268 462 92 282 282 136 40.0 8e-32 MKKIKNLTIGNKIVIVSVLLVIIVGGVATYFLTRSPIVFKADIIKVEINDSYDVMKNIAS VRNGNIKDVKADISKIDYNKLGRYPVIYIYKDKKYEATIEIVDTKKPQFDVVDLDIDLGM KVDPASMATNINDATKTEVSFKEDYDFSKEGTVEVIIQVKDKGNNVTEKKGKVKVTKDSE PPQITGLEPLTIVIGAKTDYNSGVSISDNRDPEPKLTVDSSNVNIQKEGTYTLIYIGIDR SGNKIEKQREIKVVEKKAIGSNNQTSDKIVYLTFDDGPSANTQKILDILDVYGAKATFFV TGNNKPYNHLIKTAHDKGHTIALHTYSHDYKTVYASPEAYFDDLTKVGNMVKDIIGFVPK YVRFPGGSSNTVSRKYCPGIMTVLSRELINRGYQYYDWNGDSTDASGNNVAVSKLIANAT SSKANNINILFHDTAAKSTTVQALPAIIENYKARGYRFEAITDSSFVPHQGINN 
>gi|223714195|gb|ACDT01000020.1| GENE 7 6578 - 7405 773 275 aa, chain + ## HITS:1 COG:BS_licT KEGG:ns NR:ns ## COG: BS_licT COG3711 # Protein_GI_number: 16080959 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 273 1 273 277 256 49.0 3e-68 MIIHKIINNNIVSSLDENHNEIILMGRGIAYQKSRGEVVLDEMIEKQFYLKNDKQNNRFL ELIKDIPVDVLNVTDEIIDFARNGLDKELNDGIYITLTDHINYAIERYLQGINVQNPLLW EIRQFYQEEYKIAAEALKIINQKLMVNLSEDEVGFIALHFVNAELNSEMEEVTQITKIIQ EILNIVRYHLKISFDEDSLDYHRFVTHLKFFAKRLMFGEKEAKRDDILFDIVIERYPEAF ECVKNIEKHVQKIYKKDLSQAEKLYLTLHIARLKY >gi|223714195|gb|ACDT01000020.1| GENE 8 7497 - 9329 1786 610 aa, chain + ## HITS:1 COG:BH0296_2 KEGG:ns NR:ns ## COG: BH0296_2 COG1263 # Protein_GI_number: 15612859 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus halodurans # 96 460 1 368 368 266 38.0 6e-71 MKYEKLARFIIKNVGGEKNINIVQHCATRLRFELKDEEKVNKVNLENSSEILQVLFSGGQ CQVVIGTHVADVYKAILATAKISTVKEQSNKNKKLLDKIFALSSGIFTPFVPVMAGAGVL RGLLTIFTSMGILTTESGTYMILYAAADAILYFMPFYLAISASRYFKIDEFIGVTLAGTM LYPTLLTAFSEGAKLSFLGLDVILIKYASSVVPIIVAVYAVSLLDRLIRDRLPSIIRSFV TPLIDLVIMAPLSLLAIGPVVTFLTNIFTDGLLAVYSFNPIIFGFIFCACWQPLVIFGLH RGFIPVNLNNLATTGRDALLAMTMPSAFAQTGAALGIAFKSKNKNFKSVAAANVIPSVLG ITEPLIYGTTLKAKKAFFMACLMSGFGGAVVGGAGCYTTGLPAGGILSLPLFAEHGFVWY IAGLLISFIGTLVLVLLFWKDEFVEKNDDVIKETDQLLEVDTEILGAVGTGKVLPLEIVN DSTFSSLALGNGVAIIPDEGKLYAPCDAKIVAVFPTNHAIGLRSILGAEILLHVGIDTVK MEGKGFYAFVKEGDMVKKGQLLLDFNIDEINHAGYDPTIIVIVTNTNHYNKVELLNMNTV VNNDDILRLE >gi|223714195|gb|ACDT01000020.1| GENE 9 9363 - 10814 1288 483 aa, chain + ## HITS:1 COG:lin0288 KEGG:ns NR:ns ## COG: lin0288 COG2723 # Protein_GI_number: 16799365 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 6 483 7 485 486 495 50.0 1e-140 
MFGVCELDKDFLWGGASAANQMEGGYDLDGRGMSIADVFMFYEKKERHLAKKELTIEEIK KLRENKSNNFPKRRGNDFYHHYQEDIALLAKMGIKAFRMSFSWSRIFPRGDEKEPNKAGL KFYDNIINELLKYKIEPIITLSHFETPLVIATEYGGWCNKEVIKFFQTYCYTVFTHFKGR IKYWMSFNEINAALEIPFKGSAIPFSQDRLYETQVHQGLYNQFLASALVTRELKRIDKTA KMGCMIASFTTYPASCDPDDILKAMRANQEYYLYADVQANGNYPNWYLKNLEKRDIHLNF SKEELDIIKNNTVDYISFSYYLSLVASHDPARKVGEGNLKGGVENPYLEKSDWGWAIDPI GLRITMNEIYYRYNLPILISENGFGANDILTVDNQIHDEYRIKYLQKHINELKKAIELDG VICLGYLSWSPIDMISAGTSEMNKRYGYIYVDYDDYGNGSGKRYLKDSFYWYQKVIASNG KVL >gi|223714195|gb|ACDT01000020.1| GENE 10 10856 - 11671 685 271 aa, chain - ## HITS:1 COG:STM0179 KEGG:ns NR:ns ## COG: STM0179 COG0726 # Protein_GI_number: 16763569 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Salmonella typhimurium LT2 # 36 234 165 368 409 93 29.0 5e-19 MEKFMKNITKMFLFILSFYLVLGSLWIGKDVTIHYQNIDGLPILGYHGVLEDKDKEKYFA NYPYCMSLSEFKAQMKYLNQNNYHTLTMDEINDYYQNHTPLPKKSVALTFDDGLLNFKTV VKPILEKYNFKATCFVIGYKTTVKNSQNPYKHQYLRKSDLVNDEYVEYYSHSYNLHHNTK YPNTKLIETLSTQEIINDFKQNENIVSSKYFAFPYGRTCDNANEALIKANVSLAFGYNQN RTMTYHDDKYLLPRYLMFSKMPMFYFKWIVE >gi|223714195|gb|ACDT01000020.1| GENE 11 11767 - 12954 1035 395 aa, chain - ## HITS:1 COG:BS_ywbF KEGG:ns NR:ns ## COG: BS_ywbF COG0477 # Protein_GI_number: 16080885 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 4 369 8 371 399 117 26.0 6e-26 MKDRFSLSFKAFYLIYLGAIGSFIPYINTYLEKNAGLSGSQIGLITAISLVLGVCVIPIW GVVGDKTRKYSALLKLSIMAALVVLYFYYKAAVYPAIVVCAIALEVSRLGTMPMADTLAT NYCHKTGGNYGSIRGMGSLGYMLAGMAVGFLADLFGLDGAMFATYAVLLILAFSISFGFP KDDGKKEDGEEVKVKKGSFKELLTNKNFLFILFLQLLMSTVVDSAMTYGGNHLTVTLNSG ASAISWMTFATVLPEVAFLMIAIKVLNKTGFKKFYLIACVSMMLRFGVYAFIPNQYAFLA ISIVHCIGVAIATVASLTYIRNSVDPAVLGTAITLLNATLSIGKAIYGYIFGVVYEIWGS FIMFGICLIPFVIAFLMISRSHCFDEIDKHKGHIA 
>gi|223714195|gb|ACDT01000020.1| GENE 12 13102 - 13866 941 254 aa, chain + ## HITS:1 COG:lin2456 KEGG:ns NR:ns ## COG: lin2456 COG3394 # Protein_GI_number: 16801518 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 4 254 2 247 248 108 27.0 1e-23 MVKQLLIRGDDLGYSEAVTLGMISAHRKGLVNSVALMVNMPYAKEAIKLAKQCPNLCLGL HVNVTNGYALAPKEMIPSLVDEHGIFLSSTIRRAQIKNNEPLFSEEDAYIEAKYQIEKYI ELVGQLPEYIDLHVLEVEPLINAVVRIYHEYNVPVCYYGMDLEQGIIDQTMKQYDYYEEH DHDFENMFIDGYYQIKEGLNILVTHPGFIDYDVATTSSMLKERLYDYSLVTSEKLKQWLR DNEIEVISFRDVKK >gi|223714195|gb|ACDT01000020.1| GENE 13 14230 - 15612 1497 460 aa, chain + ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 18 456 11 447 457 273 37.0 5e-73 MKGVIKVKKKKEADLGKDSLGPLLLKLALPAILAQIINVLYNMVDRMYIGHIPKVGPSAL TGVGVTMPVIMAISAFAALVSMGGAPRASIMLGRGEHPKAEKILGNCTVMLVIMAIILTA VFLIWGEPILMVFGASEATIGYALDYMRIYALGTIFVQLALGLNAFINAQGYAKIGMITV AIGALCNIVLDPIFIFSMSMGVKGAALATIISQAISSIFVVYFLTSKRSGLRIKLDNLKL DFQVILPCLALGLSPFIMQFTESVISVCFNTSLLKYGGDIAVGSMTILTSVMQFSMLPLQ GLTQGAQPIISFNYGAENIDRVKRAFKLLLKISLSYSMLLWAVAMFIPDTFIYIFTSHGE LTTYTRWAIRIYMAASGIFGIQIACQQTFIAIGNAKTSVFLAVLRKVLVLIPLIFILPMF IENQAFAVFLAEPIADTIAVSVTATLFYQTYKRLGKETKA >gi|223714195|gb|ACDT01000020.1| GENE 14 15643 - 16167 180 174 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 4 171 4 182 190 73 29 8e-13 MNLKVKNMIYCSMFACITAILAQIRFTLPSLVPITLQTLGIYLIGLVLKPKIAFISSLIY ILMGAIGLPVYSGFSAGLSTILGPTGGYIFSFPITALIISYLVHYRSTTIFKFGALILGT IICYLIGTLWFMYIMKMSFSGSLIICVLPFLPGDVLKIFIAAALTNKFKNIIKD >gi|223714195|gb|ACDT01000020.1| GENE 15 16208 - 17485 1275 425 aa, chain - ## HITS:1 COG:ECs1399 KEGG:ns NR:ns ## COG: ECs1399 COG1752 # Protein_GI_number: 15830653 # Func_class: R General function prediction only # Function: Predicted 
esterase of the alpha-beta hydrolase superfamily # Organism: Escherichia coli O157:H7 # 1 68 5 77 356 67 47.0 6e-11 MKRGLVLAGGGSKGAYQAGCIKALKELGYDFDLVTGTSIGALNGLLVVQEDYNTLYRLWD EITISEVLEDPVNFNFSIDSMLSQTNLIKPFFKSYINEKGADITPLKNILHALYNEEKAF NNKIDYGVVAVKYPKLTPIEITKEEMKPVPAIEYAIASASCFPAFPVHHIDKQGYIDGGY YDNLPISLALKMGADKIIAIELNQDATHDYFLDRPDIIFIRPSYDLGGFLDFGREILDWR IKLGYHDTLKKFNQLKGYRYNFKNFEINHELAHKFYRLILNYEDSINKNVVSKTITSSST PLSDLLKNNTYLKELTTEDYFVRAVEITMDHYHYQSDLLYDLNEVCHEIFNTFKEEYQER YDILEKRFVDIPIKELFSKIKKLSSKDAICAFYHSLNNNEEIETTLISNIFLEEYLIALL LYTLV >gi|223714195|gb|ACDT01000020.1| GENE 16 17546 - 18736 937 396 aa, chain - ## HITS:1 COG:Ta1193 KEGG:ns NR:ns ## COG: Ta1193 COG1167 # Protein_GI_number: 16082202 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Thermoplasma acidophilum # 1 393 1 396 402 270 34.0 2e-72 MKYTFAVRFNNITPSGIRAVLEKAAGPDMINFSPGFPDNDAFPANDIQKISQDVLKEDIY TILQYSRGTTYPPLKKALKAFFNRQEKIFTENDDLMVTSGSGEGLEMAAKVFLNPGESII VEDPTFVGALNGFISNDAKLLGVPVESDGMNLGLLEEAMQSTPAPKLLYIIPTFQNPTCI TTSLEKRRAIYDLCLKYNIIILEDNPYGTLRFKGKTIPTLKSIDTEGIVVYCASLSKIIS PGIRLGTIIANKEIIDKFNILKGASAGAVTNWSQHVIARFLETVDMDKHLAHLQSVYGKK STFMVEMMKKTFHPDVKFTTPEGGMFVWFELPKYADARTFLNRATTMHIAIVNEETFAVN RRDKMNGFRLSFTSATMEQIETGIAKLGKLTYELCK >gi|223714195|gb|ACDT01000020.1| GENE 17 18895 - 19269 630 124 aa, chain + ## HITS:1 COG:ECs3993 KEGG:ns NR:ns ## COG: ECs3993 COG0251 # Protein_GI_number: 15833247 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli O157:H7 # 1 124 22 148 150 131 46.0 3e-31 MKNVVKTDKVPGAIGPYSQGINVGDMYFFSGQIPLVPETGEMPEGIEAQAHQSLKNVQGL LESQGLTFANVIKTTVFLDNMDDFVTVNDIYAQYFVEPFPARSAVEVAKLPKGALIEVEV IATK >gi|223714195|gb|ACDT01000020.1| GENE 18 19413 - 20084 877 223 aa, chain + ## HITS:1 COG:L0131 
KEGG:ns NR:ns ## COG: L0131 COG0745 # Protein_GI_number: 15673576 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Lactococcus lactis # 3 219 5 224 230 155 37.0 5e-38 MYKILMIDDERDILEMLALQFQSEGYLVYLANDANEALKQLSVLPDLILLDINMPEMNGL ELCTMIREHVNCPIIFLTARISSRDMINGLMLGGDDYITKPFSMDELFARVQAHLRREKR SSHKSRGHFTRELIIDYSQRTVIIKNKKIDFSNKEFEIIKLLSMNAGQIFDREKIYEVVW GFEANGDSAVIKEHIRKIRMKLARYSENEYLETVWGVGYRWKK >gi|223714195|gb|ACDT01000020.1| GENE 19 20102 - 21281 984 393 aa, chain + ## HITS:1 COG:CAC0317 KEGG:ns NR:ns ## COG: CAC0317 COG0642 # Protein_GI_number: 15893609 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 112 368 210 473 498 98 28.0 2e-20 MSLKKSLLLILSVALFLIIILSSTTVLLTSSLRQMILDNRTIHIAVDTAIISENITGTYE ISPSNDEYQYSELANQDKFIYLSATVAMIALPTLYLVLGTIFIVKIYYRLKLALPIEQLN IGIANLTNNNLDFQITYNSNDELGRLCSTLEVMRNELIENNQKVWELLDQRKALTASVSH DLRTPITVLKGYLDYLLKNLPEKRVSDDVLLTTLKSMAHSSARLERYVECIQDIQKIEDI EISKTMVVGKDFLEEINNDFMIIAKNNHKDLVLEAKIVSKQLYLDKQIIFKILENLLNNA FRFANNGVKLTIMETVGYLEFVIQDDGPGFSKKDLENATTLFYSSQTNKGSFGIGLSISK ILCEKHGGILKLLNNENGACAIVQIKKSEMEFK Prediction of potential genes in microbial genomes Time: Thu May 26 09:20:34 2011 Seq name: gi|223714194|gb|ACDT01000021.1| Coprobacillus sp. D7 cont1.21, whole genome shotgun sequence Length of sequence - 45774 bp Number of predicted genes - 51, with homology - 51 Number of transcription units - 22, operones - 10 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 107 - 166 8.8 1 1 Tu 1 . + CDS 390 - 551 98 ## gi|167757087|ref|ZP_02429214.1| hypothetical protein CLORAM_02636 2 2 Tu 1 . - CDS 586 - 1428 522 ## COG1968 Uncharacterized bacitracin resistance protein - Prom 1462 - 1521 13.3 + Prom 1446 - 1505 9.4 3 3 Tu 1 . 
+ CDS 1526 - 2104 599 ## gi|167757085|ref|ZP_02429212.1| hypothetical protein CLORAM_02634
+ Prom 2120 - 2179 9.3
4 4 Op 1 2/0.167 + CDS 2237 - 2632 438 ## COG1131 ABC-type multidrug transport system, ATPase component
5 4 Op 2 . + CDS 2631 - 2933 104 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16
6 4 Op 3 . + CDS 2927 - 3121 133 ## gi|237735845|ref|ZP_04566326.1| predicted protein
7 4 Op 4 . + CDS 3072 - 3656 556 ## gi|167757083|ref|ZP_02429210.1| hypothetical protein CLORAM_02632
8 4 Op 5 . + CDS 3665 - 4021 516 ## COG1725 Predicted transcriptional regulators
+ Prom 4043 - 4102 6.2
9 5 Tu 1 . + CDS 4122 - 4568 415 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family
+ Term 4569 - 4598 -0.2
10 6 Tu 1 . - CDS 4618 - 5196 605 ## COG4116 Uncharacterized protein conserved in bacteria
+ Prom 5155 - 5214 6.3
11 7 Op 1 7/0.000 + CDS 5238 - 6020 833 ## COG0061 Predicted sugar kinase
12 7 Op 2 . + CDS 6017 - 6853 332 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit
- Term 6911 - 6949 1.4
13 8 Tu 1 . - CDS 6966 - 8201 1018 ## COG0772 Bacterial cell division membrane protein
- Prom 8257 - 8316 5.3
+ Prom 8167 - 8226 9.2
14 9 Op 1 . + CDS 8250 - 8594 436 ## COG1264 Phosphotransferase system IIB components
15 9 Op 2 . + CDS 8640 - 10868 3032 ## COG0790 FOG: TPR repeat, SEL1 subfamily
16 9 Op 3 . + CDS 10917 - 12149 1361 ## COG2309 Leucyl aminopeptidase (aminopeptidase T)
+ Term 12155 - 12194 4.6
+ Prom 12160 - 12219 2.3
17 10 Op 1 . + CDS 12240 - 13175 845 ## CPE0962 hypothetical protein
18 10 Op 2 . + CDS 13236 - 14822 1898 ## COG4108 Peptide chain release factor RF-3
19 10 Op 3 . + CDS 14867 - 15127 330 ## gi|167757071|ref|ZP_02429198.1| hypothetical protein CLORAM_02620
20 10 Op 4 . + CDS 15124 - 15648 679 ## CLH_2057 acetyltransferase, GNAT family
21 10 Op 5 . + CDS 15707 - 17656 2246 ## COG2217 Cation transport ATPase
22 10 Op 6 . + CDS 17696 - 17995 523 ## gi|167757068|ref|ZP_02429195.1| hypothetical protein CLORAM_02617
23 10 Op 7 . + CDS 17988 - 18827 229 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit
+ Term 18828 - 18866 3.0
+ Prom 19187 - 19246 10.4
24 11 Op 1 . + CDS 19387 - 19941 383 ## PROTEIN SUPPORTED gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35
25 11 Op 2 . + CDS 19959 - 20153 325 ## PROTEIN SUPPORTED gi|167757065|ref|ZP_02429192.1| hypothetical protein CLORAM_02614
26 11 Op 3 . + CDS 20179 - 20538 603 ## PROTEIN SUPPORTED gi|167757064|ref|ZP_02429191.1| hypothetical protein CLORAM_02613
27 11 Op 4 . + CDS 20606 - 20998 462 ## CLB_1581 hypothetical protein
+ Term 21023 - 21078 16.1
+ Prom 21036 - 21095 6.6
28 12 Op 1 . + CDS 21126 - 21551 396 ## LBUL_0272 transcriptional regulator
29 12 Op 2 36/0.000 + CDS 21565 - 22875 1225 ## COG0577 ABC-type antimicrobial peptide transport system, permease component
30 12 Op 3 36/0.000 + CDS 22865 - 23632 801 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component
31 12 Op 4 . + CDS 23632 - 24894 880 ## COG0577 ABC-type antimicrobial peptide transport system, permease component
32 13 Tu 1 . - CDS 24961 - 25425 477 ## COG0622 Predicted phosphoesterase
- Prom 25455 - 25514 9.2
33 14 Op 1 36/0.000 + CDS 25589 - 26293 295 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc)
34 14 Op 2 . + CDS 26295 - 29534 3222 ## COG0577 ABC-type antimicrobial peptide transport system, permease component
+ Prom 29693 - 29752 5.8
35 15 Tu 1 . + CDS 29780 - 31222 1646 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins)
36 16 Tu 1 . + CDS 31649 - 32017 408 ## gi|167757054|ref|ZP_02429181.1| hypothetical protein CLORAM_02603
+ Prom 32085 - 32144 5.9
37 17 Tu 1 . + CDS 32188 - 32871 681 ## EUBREC_2521 hypothetical protein
+ Prom 32873 - 32932 4.5
38 18 Op 1 . + CDS 33028 - 33675 812 ## EUBREC_1860 hypothetical protein
39 18 Op 2 . + CDS 33696 - 34385 924 ## gi|167757052|ref|ZP_02429179.1| hypothetical protein CLORAM_02601
40 18 Op 3 . + CDS 34372 - 35205 1119 ## gi|237735879|ref|ZP_04566360.1| predicted protein
+ Term 35217 - 35259 3.1
+ Prom 35207 - 35266 2.6
41 19 Op 1 . + CDS 35327 - 35713 292 ## STH1180 signal peptidase, type I
42 19 Op 2 . + CDS 35710 - 38607 3139 ## MCCL_0364 hypothetical protein
43 19 Op 3 . + CDS 38631 - 39251 720 ## CPR_0481 hypothetical protein
44 19 Op 4 . + CDS 39269 - 39910 523 ## CPR_0481 hypothetical protein
45 19 Op 5 . + CDS 39933 - 40427 423 ## COG0681 Signal peptidase I
46 19 Op 6 . + CDS 40446 - 41087 575 ## gi|167757045|ref|ZP_02429172.1| hypothetical protein CLORAM_02594
+ Term 41093 - 41134 7.6
+ Prom 41238 - 41297 10.6
47 20 Tu 1 . + CDS 41320 - 42093 914 ## gi|167757044|ref|ZP_02429171.1| hypothetical protein CLORAM_02593
+ Term 42103 - 42142 7.0
+ Prom 42133 - 42192 6.1
48 21 Op 1 . + CDS 42326 - 43054 891 ## gi|237735887|ref|ZP_04566368.1| predicted protein
49 21 Op 2 . + CDS 43075 - 43851 809 ## gi|237735888|ref|ZP_04566369.1| predicted protein
50 21 Op 3 . + CDS 43874 - 44587 888 ## gi|167757041|ref|ZP_02429168.1| hypothetical protein CLORAM_02590
+ Term 44600 - 44653 8.2
+ Prom 44649 - 44708 6.4
51 22 Tu 1 .
+ CDS 44739 - 45761 934 ## gi|167757040|ref|ZP_02429167.1| hypothetical protein CLORAM_02589 Predicted protein(s) >gi|223714194|gb|ACDT01000021.1| GENE 1 390 - 551 98 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757087|ref|ZP_02429214.1| ## NR: gi|167757087|ref|ZP_02429214.1| hypothetical protein CLORAM_02636 [Clostridium ramosum DSM 1402] # 1 53 1 53 53 74 100.0 2e-12 MIETAYYAQDNYAILKDILTAFDLYKKIIKIDGKEMNITDIIKINSKKNSFLK >gi|223714194|gb|ACDT01000021.1| GENE 2 586 - 1428 522 280 aa, chain - ## HITS:1 COG:BH0464 KEGG:ns NR:ns ## COG: BH0464 COG1968 # Protein_GI_number: 15613027 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Bacillus halodurans # 17 280 20 273 274 160 37.0 4e-39 MELIKNIIIYGILGAIQGFSEPIPISSSGHLVIFQSIFEKMGLMVPQMNDVTFEVIVNTG SLLAIMYYYRHDIVRLFTAFFGYIRKPKKRNYYEGDFRYCILLIVATIPAAIGGYLFNDK IEAAFSNPKLVGCMLLVTAIFLLSIHKFGYKGKRTAKKLNLFDALRMGIFQLFALLPGIS RSGSTLTGGMLGGLSQKAARDFSFFMFMPVSFGAIILKLKDFLTSSTLGALWLPYLVAFI ISGIVTYLALHLLFKLLDKKKLNVFSLYCLVVGILAVLFL >gi|223714194|gb|ACDT01000021.1| GENE 3 1526 - 2104 599 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757085|ref|ZP_02429212.1| ## NR: gi|167757085|ref|ZP_02429212.1| hypothetical protein CLORAM_02634 [Clostridium ramosum DSM 1402] # 1 192 1 192 192 327 100.0 3e-88 MKDIIVNWVKEDEELLEPYEWDEDDDLEIISRVDIYHVDEKTLHDFIYGCINIYDSQYFN NIFIVGDGKYSVVVEINNNGKLVYRSLTMMKQRTLINHELKKLPIVKINYHMYDEGLIKE YGLTRNERIKKQYVENMVDQLYVEDYDKFLNVCKQLEINDDKSIGKYLHLKKKLEKGYSF IHELLYNEFVKK >gi|223714194|gb|ACDT01000021.1| GENE 4 2237 - 2632 438 131 aa, chain + ## HITS:1 COG:SPy1286 KEGG:ns NR:ns ## COG: SPy1286 COG1131 # Protein_GI_number: 15675239 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Streptococcus pyogenes M1 GAS # 1 126 1 126 232 127 50.0 4e-30 MSELLKITNVTKKYHHFKALNNVSMTLESGKIIGLLGPNGSGKTTLIKIINGLLKDYEGE VLVDGKNVGIDSRKIISYLPDENYFQDWMYIKDVLSIFSDLYEDFDKENCLTLMNRFKLD KGMKLKRCLKE 
>gi|223714194|gb|ACDT01000021.1| GENE 5 2631 - 2933 104 100 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 1 87 149 237 312 44 26 2e-15 EEKFQLSLVMSRKAKLYILDEPIAGVDPAAREVILDVILNNYEEDALVLISTHLISDLET IFDDVVFLKDGEIVLHQSTEDLRLERKQSIDEAFREVFRC >gi|223714194|gb|ACDT01000021.1| GENE 6 2927 - 3121 133 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735845|ref|ZP_04566326.1| ## NR: gi|237735845|ref|ZP_04566326.1| predicted protein [Mollicutes bacterium D7] # 1 64 1 64 64 93 100.0 5e-18 MLKLLKYEMIQSYRQYFLTLGIFLILCVLAPLLPDFISQVLSSLMIFCNVRYINCSFGQC YYKF >gi|223714194|gb|ACDT01000021.1| GENE 7 3072 - 3656 556 194 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757083|ref|ZP_02429210.1| ## NR: gi|167757083|ref|ZP_02429210.1| hypothetical protein CLORAM_02632 [Clostridium ramosum DSM 1402] # 1 194 49 242 242 277 100.0 2e-73 MLGISIAVLVNVITNFNRSMYKRPGYLTLTLPVSTEKLVGAKFIGSLIWVFVSSIVLSLG ILIIVFLIGNVPLNTLFDLFGELLKALGNNFGLVVINLIDSIAMVSSMILSFYAIITLTK TKYIPKHKTVIGIIVYVALLILGSSLLMWQPIESFVMSLDSTASVWFSIVLNLVLATAFY FFTVYLIDHKIEVE >gi|223714194|gb|ACDT01000021.1| GENE 8 3665 - 4021 516 118 aa, chain + ## HITS:1 COG:SP1714 KEGG:ns NR:ns ## COG: SP1714 COG1725 # Protein_GI_number: 15901548 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 1 116 1 116 121 100 44.0 7e-22 MKNDFDPNLPIYIQVMEEIKKEIFASEYLPGSKLASVRELALEYGVNPNTIQKALSELER TGIIYSRRALGRFVSEDSNLISELKKDVSLDKVKVFIEEMRKLGFSKIEVIKMIEELD >gi|223714194|gb|ACDT01000021.1| GENE 9 4122 - 4568 415 148 aa, chain + ## HITS:1 COG:SA0856 KEGG:ns NR:ns ## COG: SA0856 COG1393 # Protein_GI_number: 15926586 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Staphylococcus aureus N315 # 1 119 1 119 131 137 52.0 5e-33 MIRIYTAPSCASCRKVKSWLKEHNIPYVEKNIFSTLLREIELKELLERSENGTDDIISKR 
SKIIKENDIDIDSMSISELIKFIQENPSVLKRPIMIDERRFQVGYNAEEIRVFIPRELRK LAECPSSEVCPQFISEPCSIKEAYFKDR >gi|223714194|gb|ACDT01000021.1| GENE 10 4618 - 5196 605 192 aa, chain - ## HITS:1 COG:SP1096 KEGG:ns NR:ns ## COG: SP1096 COG4116 # Protein_GI_number: 15900964 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 7 191 2 188 189 99 38.0 3e-21 MEATMEENLEIEFKVLINKEIYQQIYNDHHIDHQYSQTNYYLIHPKLTELKYMLRIRQKQ ENYELTLKQPQSYGNLETNLKIDKITKEKIITHDFVTNEIFDLLKPLGLDSTMFKTDCVL TTTRCEIKTTDGLICLDQSLYNGITDYELEYEVFDYHHGKQVFLDFISQYNLKYSRNCPS KVKRLMDSLKDD >gi|223714194|gb|ACDT01000021.1| GENE 11 5238 - 6020 833 260 aa, chain + ## HITS:1 COG:BS_yjbN KEGG:ns NR:ns ## COG: BS_yjbN COG0061 # Protein_GI_number: 16078226 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Bacillus subtilis # 3 260 2 264 266 188 40.0 8e-48 MKQYALVVKQDEMSANIAEKIKKGLTGIMEYNPDDPDLVISVGGDGTMLLSVHQYMEQKV SFVGVHTGTLGFFTDYQKDEITELIAAIKADHYQMTPRHLLEVDVYHKAGKETYLALNEM RIDHGYTTQVIDVYIDDELLEVFRGNGLCVSTPSGSTAYNKSIGGAVIYPGSPLMQLTEV AAIQHNAYRSLGASLILDENKVIKLKGQHFNRVYLGIDHLSYHLDDVEKIEIRISKKVVK FIEYKEMSFIQRIRRAFISE >gi|223714194|gb|ACDT01000021.1| GENE 12 6017 - 6853 332 278 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 14 269 18 277 285 132 34 4e-30 MKFIVAMKESGMLLREFLSQQELSKKAVKLIKMHGKILVNGQRQTVRYCLRYGDVVELLW PEEVSTMEPYPLALKISYEDENFLVIDKPAGLPSIPTKRYPRNTLANAVVYYYQSQGIKA TVHLVNRLDKDTQGLLLIAKNSYAHYLLSRDIKQVRRVYHCVVEGVLTGQGTITAPIIKD KKSVKRLIHPAGKFAVTHYRGLEVSANQSKIECILETGRTHQIRVHLSSINHPLVGDTLY DSIYQESYYLDSIALSFIDPFTQRLIEVKKDSKQFEKN >gi|223714194|gb|ACDT01000021.1| GENE 13 6966 - 8201 1018 411 aa, chain - ## HITS:1 COG:BH3275 KEGG:ns NR:ns ## COG: BH3275 COG0772 # Protein_GI_number: 15615837 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane 
protein # Organism: Bacillus halodurans # 22 403 11 388 398 172 32.0 1e-42 MIEKTIIFIRRDFVEEKEKSVICWPLIGLLLLLCFISCLGIKSATPLITKGNPSGYWIKQ LMFYGISFILMFIVYKVSNDRIYSSMWIIYGILMVLLVGLAIDHFAFTRFGIHIVPLAKW AGGATSWYNLKVFDLQPSEFMKIIMVVVMADTVDKHNNRYLVHNIHNDCLLIGKLLAISL PPCILVYLQNDAGVTMIMLASVVFVIFMSGIQAGWFIIGGIVVAIILGIGVYLFLYQHDI FASLMGGDHKLNRFYGWVDPEGTYNDQGFQLFNAMLSYGTAGLWGHGMGTAIINLPEAQT DFIFAVIALGFGFVGGGFTIAVVCVLDALLIRIGFKSKNNRDKYLTAGIFGLLIFQQVWN IGMVLGLFPITGITLPFLSYGGSSLLSYMIAMGIFLDMERQTRILEGKKRY >gi|223714194|gb|ACDT01000021.1| GENE 14 8250 - 8594 436 114 aa, chain + ## HITS:1 COG:SA0183_2 KEGG:ns NR:ns ## COG: SA0183_2 COG1264 # Protein_GI_number: 15925893 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIB components # Organism: Staphylococcus aureus N315 # 45 114 2 75 110 57 43.0 8e-09 MNEYLLYGIIALIVIVILAFIFLKVMKNKQPKQVSAADIQVDVNSIIEAVGGKTNIKETT ATSSKVTFFVNDDSLVDLDKLKALGASGIVQTSNKVSAILGKYSKEISHMINGQ >gi|223714194|gb|ACDT01000021.1| GENE 15 8640 - 10868 3032 742 aa, chain + ## HITS:1 COG:ECU11g0430 KEGG:ns NR:ns ## COG: ECU11g0430 COG0790 # Protein_GI_number: 19074843 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Encephalitozoon_cuniculi # 139 545 154 560 590 220 33.0 8e-57 MDIKDRVKSYEPLFNQWLFEELVEIHDQESLIKIKHQAIGVEKTALLKVITLYDQDNNIE FLEKRVEEIKNRLGLQVINESEYLDYEQYLIEDNQGKIIGIDLCLLMVDEVDRLQNINNV SGKLLYELGLKLFDKHEYLEALEYFKKGAEVDNSDCVCIIGYLYERGLGVEQDHYQAFRY YQSATSLGNVVASCNLAYFYELGIGVKQDYQKAFELYEFGARAGFARAICNLGYCYEYGH GTAVDLTRAVNYYYQAAKLGYSEALYSLGTCLEFGEGIEQDIERAFKCYEEAANQGHERS QHRLGYCYENGLGTIQDFVKAFYWYYQASQVNYAPALIALASCYELGQGTDIDLKAARQN YLKAARQGYSRAQFWLGYFYENHPEIKNAPYRCSYWYRQASKQNDVQAIVALGYCYESGF GVKQNLIKAIELYNKAANQGYAPAQCNLAYCYEMGIGIDVDLNEAVRYYRLAGDAGYPRA LCNLGYLYDHGEGVDVDHHQAFKLYQKAAEANYAPGLYHLALAYEEGNGVDVDIDKAIEY YELASHQDYGIALYNLGLIYEHNEQYHDDLKAIKYYEAAIDQNDVRAMYRMALYLDEGKV IAKNPDKAFTYLQIAANQGYGPAMNMYGIYLENGIGGYKNLDEAFKYYLASTSDEYVPGI 
YNLARCYFYGIGTTVDKASAFKLFLKASERGYYDASFMIGYMYSYGDGINQDKQKAKEYF KQAANKGMKEAIEELNKLDKEV >gi|223714194|gb|ACDT01000021.1| GENE 16 10917 - 12149 1361 410 aa, chain + ## HITS:1 COG:BS_ampS KEGG:ns NR:ns ## COG: BS_ampS COG2309 # Protein_GI_number: 16078509 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase (aminopeptidase T) # Organism: Bacillus subtilis # 2 410 3 410 410 378 45.0 1e-104 MSFKEKLSKYAQLIVKAGLNVQPAQIVVIDAPVESVELVRLVTKEAYAAGAREVVVKYND EVVSRYKYKYLEKEAFAHVPTWFKEFRNDYAAKKAAFLTIISDDPEGFKGIDPAKIALWS KSVSTACKPFYDSLDLGINTWCIVAGSSIKWANKVYPDMSDSEAVEALWNAIFKAVKVVD DDPLTAWQEHRKSFEARVAYLNKLNLNTLTYTNSLGTNLTVTLNEGYLFAGGGSYTTKGQ YFFPNMPTEEIFTSPYRNGTNGVVYSSLPLNYNGNLIDEFKMEFKDGRIIDFDAKMGKEV LKEMLSIDEGSLYLGEIALVPYDSPISNMKTLFYNTLFDENASCHLAIGKGFSECIQGGL TMTKEQLLEKGVNDSLTHVDFMIGTSDLKIVGETKDGQVVRIFENGNFVF >gi|223714194|gb|ACDT01000021.1| GENE 17 12240 - 13175 845 311 aa, chain + ## HITS:1 COG:no KEGG:CPE0962 NR:ns ## KEGG: CPE0962 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens # Pathway: not_defined # 12 303 8 308 319 159 34.0 1e-37 MEVKKSIVPQIMTILLVALMILMSEVFHEKEFIFPEITALTIGAWLAPKQVWKTNKIKLV FLIAAYASLGIILVKYVDIDIYFKILIGFVVCVTGLFLSKTTFAPLISATILPIIINSES WLYPLFATAMSILIVIGQQYLEKGGYRSVQEYHPVTAAVKETYSLNLKRLCALALTAFIA LQLKLPFLIAPPLIVAFVELSSNHPKLRHNSIKLAFVTFICAFSGAYGRILISEIGDLPL TISAIIIVMTMLSIMKITKLYFPPAGALAILPLLIESGKLIVYPFVIIGGFVIFTVFAFI ITKTPLNSLSK >gi|223714194|gb|ACDT01000021.1| GENE 18 13236 - 14822 1898 528 aa, chain + ## HITS:1 COG:CAC0630 KEGG:ns NR:ns ## COG: CAC0630 COG4108 # Protein_GI_number: 15893918 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Clostridium acetobutylicum # 1 527 1 526 526 718 65.0 0 MSDFIKEVEKRRTFAIISHPDAGKTTLTEKFLLYGGAIQQAGTVKGKKNSRHATSDWMEI EKQRGISVTSSVLQFNYDGFCINILDTPGHQDFSEDTYRTLMAADCAVMVIDASKGVEDQ TRKLFKVCTMRNIPIFTFINKMDREAQDPYNLMEEIEQELGIDTCPVNWPIGSGKQFKGV YERHDQEVIRFIPVDGGRKEVDTKIMKYDDPELINELDEELYNKLQEDIELLDMAGNDFD 
LEKVRQGKLSPVCFGSALTNFGIEPFLKHFLEMTTPPLPRSTGEKTIDPFSEDFSAFVFK IQANMNKAHRDRMAFFRICSGKFTKGMEVYHYQGGRKIKLNQSKQLMADERQEVDEAYAG DIIGVFDPGIFSIGDTITTGKAKFAFEGIPTFAPEHFSLIRNKDTMKRKQFVKGVEQIAK EGAIQIFTELGGGMEEIIVGVVGVLQFEVLEYRLKNEYNVEIIKQPLPYQYVRWVKEGCD LKALNLSSDTKKVQDLKGQHLLLFTNDWGVRWATEKNPDLELLEFSKN >gi|223714194|gb|ACDT01000021.1| GENE 19 14867 - 15127 330 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757071|ref|ZP_02429198.1| ## NR: gi|167757071|ref|ZP_02429198.1| hypothetical protein CLORAM_02620 [Clostridium ramosum DSM 1402] # 1 86 1 86 86 149 100.0 5e-35 MRKEVITIANTGKRRVCRCIKTMNIVIGKEQRDLFTKGQVYDCVIRDSGQLQVNYKIYGK EFDLSCTKEEFEENFVLINKKTGVRK >gi|223714194|gb|ACDT01000021.1| GENE 20 15124 - 15648 679 174 aa, chain + ## HITS:1 COG:no KEGG:CLH_2057 NR:ns ## KEGG: CLH_2057 # Name: not_defined # Def: acetyltransferase, GNAT family # Organism: C.botulinum_E3 # Pathway: not_defined # 3 173 5 175 182 93 29.0 3e-18 MIFEDLKLAEFNDFYKVILGNFPSKEIKEYDYMKETFINGDFKVLTLKEDDEIKGILSHY DGGEFAFVDYFAIDGNQKGKGLGSKMLKHFMETAGKPVILEVEHPEDEQSRRRIAFYQRN GLVLNDQHDYFVPPVRNLKHRLYFHLMSYPTEIDALQFERYYPQILQLVYGVNE >gi|223714194|gb|ACDT01000021.1| GENE 21 15707 - 17656 2246 649 aa, chain + ## HITS:1 COG:CAC2241 KEGG:ns NR:ns ## COG: CAC2241 COG2217 # Protein_GI_number: 15895509 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 95 476 21 401 699 128 25.0 3e-29 MARNFYIDNINDNDKLNEVSLLLNEMEEVSRIKIGKTGISFYCEDPENVQAVLNNHDENL VLKEEINSRKREFVAPEQKVEHIFMFTNLETEEEAKEIEEVISRYSAYENVSLDFTNKLL KVTTSQKNILVRLNRLVDKVNPKIDVEQWKKPFKSQDIFQQKYLNTMIRIAGLFVALALG LVTKDDPNFITNIAWLIALIIFNEKTLVQAYKDLKVKQFLSENITINLACLFGWVYGAYI EAIVVSLIYQVGERLLMRLINLTMEKIDDEINPIQLGRREIGENEYEMVSLEDFDIGDVI VVLPGETIPLGGKIASGESELDMFAINGSDILEPVKAGSEVQSGSVNVKDTLRIKILYTY DRSAMSRVLEIATMAPVGSSRTHRLVELISKIDTVLLVVAGIFCATLVPILNFEANFKYI YLGAILITISGSFAYKQASSFAVLSGVAKAFSKNIVIKENSGLDALNLCRTVIYDRFDGV EVTEEEMELFAKLSKLHVGLIIFNDGPVDLENDQYRIYNNLTVDEKLEVMDKASIAGPVA 
YIGDNSKDIALLQKAYVGISRGGIKDKKVIENSDIMLMNSDLNTVIETFMISKKQKYITI ENIFVGLFINVILMILAVLFIIPWWLALVIYLIEVVVVLFNTHRIIDMK >gi|223714194|gb|ACDT01000021.1| GENE 22 17696 - 17995 523 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757068|ref|ZP_02429195.1| ## NR: gi|167757068|ref|ZP_02429195.1| hypothetical protein CLORAM_02617 [Clostridium ramosum DSM 1402] # 1 99 1 99 99 163 100.0 4e-39 MDMIEVSTIDYLPGYRITKSLGIVFSSKIMNQAIASDNTNIEEMLDTARQEAMAETVAQA SRRHASAIINVRFTTTNLGNNMIECQVSGMAVKAEKIDE >gi|223714194|gb|ACDT01000021.1| GENE 23 17988 - 18827 229 279 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 60 277 75 285 285 92 32 3e-18 MNKYLVKTDSTLMEFLLNHYNKKNVKNLLKFKQVSVNGQVISQFDYSLKRDDEIVVDRNP NNNSSLDIIYEDRELIVINKPAGLLSMAGGNEKEKTAYHLVGEYLKSKNKNARVFIVHRL DKETSGVLLFAKNEIIKNKLQNNWNDIVYKRGYLAIIEGKLKKKHGTIKNHLDESKTQMV YIANNGKGKLAITNYKVLKESRYNSLIEVFLETGRKNQIRVHMQSLGHSIVGDKKYGATT NPIKRMGLHSHVFAFVHPDTKARMEFKAVVPEEFKKMFK >gi|223714194|gb|ACDT01000021.1| GENE 24 19387 - 19941 383 184 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 [Haemophilus parasuis 29755] # 28 184 2 159 159 152 47 4e-36 MEVAFISRFDNRKPKQPDELVNEKIRFKEVLVIDQNGDQLGVKSRNQAIDIAYGAGLDLV CVAPNGKPPVCRIMDFGKYRFEQQKKAKEMKRNSKVVALKETQLSVTIDVHDKNVKLKRT LKWLEEGNKVKIAIRFRGRQLAHMDLGKKVIDDFVAECAEVGQIEKPAKLEGRTLTAIIA PKKK >gi|223714194|gb|ACDT01000021.1| GENE 25 19959 - 20153 325 64 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167757065|ref|ZP_02429192.1| hypothetical protein CLORAM_02614 [Clostridium ramosum DSM 1402] # 1 64 1 64 64 129 100 2e-29 MPKMKSHSGLKKRLKRTGSGKLKRSHAYVSHLSHNKTHKQKKHLAKATLVSASDYKRIKS RLAK >gi|223714194|gb|ACDT01000021.1| GENE 26 20179 - 20538 603 119 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167757064|ref|ZP_02429191.1| hypothetical protein CLORAM_02613 [Clostridium ramosum DSM 1402] # 1 119 1 119 119 236 100 1e-61 MARVKGGFTTRRRRKKVLKLAKGYFGSKHTLYKTAHEQVMHSLAYSYRDRRQVKRDMRKL 
WIARINAAARLNDISYSKLMHGLKLANVEVNRKMLSEIAICDPKGFTAIVNTAKKALAK >gi|223714194|gb|ACDT01000021.1| GENE 27 20606 - 20998 462 130 aa, chain + ## HITS:1 COG:no KEGG:CLB_1581 NR:ns ## KEGG: CLB_1581 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 5 129 8 138 139 71 32.0 8e-12 MKRIDMTVDFKSEVAAVFKVITNLKDCSWRSDLTRVEQIGDNRFIEYDRKQRETKIFVTD LRENIQFDYDVENNSYRGHWSGQFAPLPDGGCRMYLNFDFEPQSILGKFVRVDKFEERYI EDLKKELNEY >gi|223714194|gb|ACDT01000021.1| GENE 28 21126 - 21551 396 141 aa, chain + ## HITS:1 COG:no KEGG:LBUL_0272 NR:ns ## KEGG: LBUL_0272 # Name: not_defined # Def: transcriptional regulator # Organism: L.delbrueckii_BAA-365 # Pathway: not_defined # 10 139 8 137 149 67 28.0 2e-10 MKDYEVMFSLNKLAGNIANYANARLKPYGLTFTQLSVLIFLADNQDRTINQKDISMEFDV SHATTVGIVARMHGRSLVKVKACESDRRITNVIITDHGKDMIVQTRKVQEELVQLFSKCL DETEMTNFFKLINKINQVFKT >gi|223714194|gb|ACDT01000021.1| GENE 29 21565 - 22875 1225 436 aa, chain + ## HITS:1 COG:CAC0165 KEGG:ns NR:ns ## COG: CAC0165 COG0577 # Protein_GI_number: 15893459 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 1 425 1 443 451 124 25.0 4e-28 MGNFKFAMTLLVKEYKKSAFYALTMVFAIAVCFIFFNIMNNDLLADPGAVVGGATWNQVN VPLSTTLSFIIICFCCFMIFFANSFFISRKTNEIAIMTLSGNSSFKSTKYLIYQTFVLLL IAAPIGLALGMAVTPISNYWMFHYLGIEASIYHIPSAAFMQTIVVVAMILVALCVFASGY IYRNDISSLLQQEKAMDFSDKRKLVIPSAVYLFLYVFGIVMMFMNEHSATAYIAPTGVGI VGAIGLIRYALPEYIKKFKEKKLLEKRYALIYVSNLGYSIQRAILLISLMMISVTGMIAA IAANQGAPREYITGVIGYIVIIILLITSIVYKFCMEAQTRKTLFFNLWKIGYTRKEITKI IKYEVFYFYLVLLLIPMIYIIIISGCFIYHGEMSVAFALGVIGVYFIPVVCSGLITYYNY KKAVVAPIHGGKNDGK >gi|223714194|gb|ACDT01000021.1| GENE 30 22865 - 23632 801 255 aa, chain + ## HITS:1 COG:CAC0164 KEGG:ns NR:ns ## COG: CAC0164 COG1136 # Protein_GI_number: 15893458 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Clostridium acetobutylicum # 5 
255 10 258 260 284 59.0 9e-77 MENKILVAKNVTKIYGVGTKNPYTALSDVSLEMYEGEFVCVMGPSGAGKSTFINNLSTID IPTKGFVYINGKEVRQMSEGEIGKFRYENLGFIFQEFNLLDSLTIFENIAVPLTLAGKSK VEIKAAVAKIAKRLDVEQILNKYPSECSGGQRQRAAICRALVTNPKIIVADEPTGNLDSK NSHELLSLFRDLNINDGVSILMVTHDSKIASYSSKLLYIKDGVIDETIERKDMSQKEYFY KIVDINSTESQALFD >gi|223714194|gb|ACDT01000021.1| GENE 31 23632 - 24894 880 420 aa, chain + ## HITS:1 COG:CAC0165 KEGG:ns NR:ns ## COG: CAC0165 COG0577 # Protein_GI_number: 15893459 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 1 415 1 442 451 107 24.0 4e-23 MGLIKLAIRSLKTDFLKSLFYFLSFVLTTIFIFLFFNLTLNPSTGINLGGNDAKLITTIA AFVILIAMVCVFMANDFYVLAKTKDVSIVLMSGASVYQVGIYLLLQSTIIMFLAIPIGFV ISYPLVPLVNNLLFMVFDYQGSLNYISSNTFMATAIILFCEIGWCTLLNMGYCYRTSINK MVTANVKIEKFGIGVKKLSNRIYLVLFILPMIVFPFLNDTASHLLIALIGIIGVYGLVKN IIPEFIENRQTNESLEDRCLLIALGNVRYDLEKVRMLVLVITIAAIILMCTTIYTLDTPL ISMVTLMSYFSVMVLLSITTIFKVGMELQGRKRSFLNLYHMGYDLKDLKKIIDLEMIIFY GLIIVIPLLYQIIILIKLYSLGLINFYLVGGLLLIQIIPMLVCMIICTLMYQKVLPEPII >gi|223714194|gb|ACDT01000021.1| GENE 32 24961 - 25425 477 154 aa, chain - ## HITS:1 COG:BH3066 KEGG:ns NR:ns ## COG: BH3066 COG0622 # Protein_GI_number: 15615628 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Bacillus halodurans # 1 139 1 140 169 100 41.0 7e-22 MKIMILSDSHSVSKTDLLTLLKNNSVNYYIHCGDIYMTYDGINLNSFYLVRGNNDFGNIP DELFITIDDLKFYIVHGHRYDVDYNLDYLTHTAKEKGADIVCFGHTHRPYYDFHEGITFI NPGSVCYPRGQYRNPTYCIFDTKTKKVLFMMSRH >gi|223714194|gb|ACDT01000021.1| GENE 33 25589 - 26293 295 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 199 1 200 223 118 34 7e-26 MKSFITFNNVKKTYKVGDVEINASDGVDFEVNKGEFVVIVGPSGAGKTTILNLLGGMDKA TSGQILVDGQDVAKYSERQLTQYRRDDIGFVFQFYNLVQNLTALENVELATQISKNPLDV RMVLERVGLNKRLDNFPAQLSGGEQQRVAIARAIAKNPKLLLCDEPTGALDYQTGKAILG 
LLREMCDKYTMTVIVITHNSALAPMADRIIHLKNGQVASMNINEHPKSIAEIEW >gi|223714194|gb|ACDT01000021.1| GENE 34 26295 - 29534 3222 1079 aa, chain + ## HITS:1 COG:lin1187 KEGG:ns NR:ns ## COG: lin1187 COG0577 # Protein_GI_number: 16800256 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Listeria innocua # 1 1079 1 1136 1136 527 33.0 1e-149 MKKRAYVKNQIQIIKKTKARFLSIFCIVFLGAAFFAGLRHTASIMEITMDSYLQEHAFND LNYVATLGFTNEDIEAAKKIEQVAQVEYGNRFDALIQVGDNVKGTTVYTNESFDNKVNKI ELVDGTVPVADDECLVDNNYASRNKIKVGSKITLTNDNGEKKFKVVGLINDPRYFSTIER GTNTLGDGTNEAFVEILAQGNESLALPQDLYDLRNGTVFNEIRVTLKNSDDNNIFSDDYL SYVKDVNKEIKKVLSGKVAQVNEELAGDANRELEKGEQEYNDGINEYQDGLAQYNNGLSQ YESGLKQYQDGLNKYNDGYRQYQNGLKQYQDGKAQYDQGVSDYQLGKAQYDQGISAYNEG INQYNNGLNQYNVGIEQYNQSVVQLQYLIENNLITPEDAAIMQEKLNETKIQLDATKKTL DASKLELDNNGAVLAATANQLESARITLANTKVTLDSSKAQLDETAITLTNSKKQLDDSK ITLDKTKVLLDETKIKLDEAKTSLDDAKIKLDEARSAIDDIPTGKLISLTREESASILSY DSACQSMKAIAVVFPLIFFLVAALVSLTTMTRMVEEQRVQSGTFRALGYDKKDVINQYLI YAFLATFVASVLGIIAGVYFFPSIIYFLYRKMLFNVGAPIKISFDTFICIQTFLISVAIT LLVTYIVTRQELSEMPASLLRPKAPKMGKRIVLERITFIWKRLSFNQKVTMRNIFRYKKR FFMSIIGIAGCAALIVTGFGIKNSVSTLADKQYGDIFTYDGMVVFDRNLSNDQLKKEKDE FESLSSVKDCSSFYRKTITVVGSKDHYGTLEVFKNNEELSKYTNLENYQTGEKIKLNNQG VVISAKLSELLGVTIGDKINITIDEKDYQVKITGVMLLYFQHHIYMSEEFYRGLTGETPV NNYAYFNLEDDGNRKTVTNYCDKDSNIDSLNYVKGISQAFRDQMGSIDSVVVILIACAGA LAFIVLYNLTNINIQERKSEIATIKVLGFYPREVYDYVFRENLILSAIGSIVGLGIGKVL HMFIINAVEVEVAMFIRSVNLMSYVYAIIITMIFTYLIDFGMRKVLKNIDMVESLKSIE >gi|223714194|gb|ACDT01000021.1| GENE 35 29780 - 31222 1646 480 aa, chain + ## HITS:1 COG:MT0027 KEGG:ns NR:ns ## COG: MT0027 COG0791 # Protein_GI_number: 15839398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Mycobacterium tuberculosis CDC1551 # 377 468 176 268 281 99 48.0 1e-20 MDRNIKTSFKNVVDEYIVNRKAMFIPIAAVASFMLVGYAAADKEKPEIKSNQIDLAYGEK FDVDSIDITDNKDSRDLIEVSADTSSLNVTQLGSYKVSVTATDSGSNVATKTVQVNVVDN 
EGPKFESLGSNQGYTIDVPVKGSTDFASYVKANDNVDGDVTPFIEANTPLNTEVKGQQDI TLKATDSSGNETVKTFTFAVSDLEAPVINLTQGENVTVDYGSGFDLNNFVNVIDNLDGTL VPTVEGSIDTSKIDETQTLKISATDSSGNVSEANLNVVVKDLSAPVISLTKSSITVNAGE NVDFNAYIASAIDNKDGDVKANVKVDAPSTAKAGTKTATYTVTDAAGNTGTASLSVKVNA VYGGAASNNYGNSVLSAAYSRLGCPYVWGADGPNSFDCSGFVKWCYSRVGISLPHSSSAQ KNAGTQISISQAQPGDILWKSGHVGIYIGNGQYIHAPRTGDVVKISSVSGSGFVCAVRVK >gi|223714194|gb|ACDT01000021.1| GENE 36 31649 - 32017 408 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757054|ref|ZP_02429181.1| ## NR: gi|167757054|ref|ZP_02429181.1| hypothetical protein CLORAM_02603 [Clostridium ramosum DSM 1402] # 1 122 6 127 410 207 100.0 2e-52 MANDKINVNMFGTFSLSRGNVTVDLTKLLGKQLINLFQLLLLQKGVVISKNNIIDILYPD SENPNSVVKFSIFRLRKDLKASGIFNEDEEVILTVKGGYQINPKLEWVMDTEEFYYNWDK IK >gi|223714194|gb|ACDT01000021.1| GENE 37 32188 - 32871 681 227 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2521 NR:ns ## KEGG: EUBREC_2521 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 198 199 399 416 63 24.0 8e-09 MKRKQFEKMMSINYHAILLEPFYEGLHYYYIKGLIELKDYHNALKYYDDLNERFYKELGT GLSPRFKELYDVITEDIEEDYIINTENVNEDLIINNADYRGFYCGYDMFKHMYEILLKNA LRDNKKYFLIIFDLNAKTTIENQVHVMNQLKEMIASSLRVNDLFAKINKKQYIILVACQE MDNAYTIIQRINKKFYAKYNSTTYRLNYDVAKAKLLYKKPIESSVQS >gi|223714194|gb|ACDT01000021.1| GENE 38 33028 - 33675 812 215 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1860 NR:ns ## KEGG: EUBREC_1860 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 75 207 95 243 249 64 33.0 2e-09 MENKKNKLLILMIVLCIMVPIGGAIAYFRADTSSSNKLSSGNIGIELRDATTNEDGQITD EGINFNFGYPGAIKDKQTFVKNTSENDLYTRIAVTKYWEDESGNKVIDANPGFIEIITND KDNWIIQDSDENKEVVYFYYKKLLEPGAVTDLFVNQIKLSGNIDNTNAMRYSGLSAHLSF EAEAIQKVGAESAILSEWGLEVKIDNNDVIQYVEE >gi|223714194|gb|ACDT01000021.1| GENE 39 33696 - 34385 924 229 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757052|ref|ZP_02429179.1| ## NR: gi|167757052|ref|ZP_02429179.1| hypothetical protein CLORAM_02601 [Clostridium ramosum DSM 1402] # 1 229 7 235 235 
402 100.0 1e-111 MRGKNILKTLVVTTVLALSMTTIHAADTNVSFDGDAEDFIFLDGSQDLFDNFKNIMPGEV RTQNIVLKNDDHRELRFYLSADVLDSLDTDTSGLSVYNITITNEGEEFYNGTLDDLAALS SGRMSEDTLLASLQKGETTTVTMTLEVNGDSMDNTYQNREGLIKFNFSVEELDDQSTIVE VVRNVYESIKTGDATVIAPLLGLVAVSGFAVIFLLKKKKHKEDCSHEEN >gi|223714194|gb|ACDT01000021.1| GENE 40 34372 - 35205 1119 277 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735879|ref|ZP_04566360.1| ## NR: gi|237735879|ref|ZP_04566360.1| predicted protein [Mollicutes bacterium D7] # 1 277 1 277 277 477 100.0 1e-133 MKKIKKVIIVLAMMIAMVGFVNPMTVDASRTYKVTFRAGAHGTFAANGENKITIDVAANE RFPDIPDINVEDGYYLMGWNKTLPASGEAVTGAMTFVAKYQVLTNGREYSINYVNQDNVQ IATTKVMTAPLGSTVNERAKTIAGYELIGAADLSITIGETDNSITYVYRDLNVTQEYVES VVVIPVGLPANTAVSTATGGAGTGGTTTNQGDDTENVTDSETPLSPGDGTEEVPDDQTPL KKGEEGINYTYLYITGGIVLLLIAAGIFILKRKQASE >gi|223714194|gb|ACDT01000021.1| GENE 41 35327 - 35713 292 128 aa, chain + ## HITS:1 COG:no KEGG:STH1180 NR:ns ## KEGG: STH1180 # Name: not_defined # Def: signal peptidase, type I # Organism: S.thermophilum # Pathway: Protein export [PATH:sth03060] # 1 126 53 180 196 75 35.0 7e-13 MEPVYSTGELLLVKPAKADEIKVNDIISFKGSGVSGNVITHRVIKIDQEKQVFITKGDAN SSQDSNPVAFSRLNGIVKINIPYIGYIYGMIQSMIAKIILAGLVLIYIIVNLVTKKKLNS NVSEEEAK >gi|223714194|gb|ACDT01000021.1| GENE 42 35710 - 38607 3139 965 aa, chain + ## HITS:1 COG:no KEGG:MCCL_0364 NR:ns ## KEGG: MCCL_0364 # Name: not_defined # Def: hypothetical protein # Organism: M.caseolyticus # Pathway: not_defined # 63 525 485 930 1102 67 24.0 4e-09 MKISKKRKQNKFTRKIKTLVLALAMIIPAAGTNLIVEAENNPGSTTSSDGLVTVNKTSER VGDNRYKINLDLATGKAQDSVEAIGNDIVLVFDISNSMAEDEHGNSTSSNDKKRLTKAKN AAIEFLNNSKISGNKKNRYSIVTFNYYGTVEQNLTSNLETAKQAIRDVELGNNSDGGTNI QAGLYKARTVLKNAKSENGIIILLSDGGATGSYKLNNERNNGYLVNDYSEATATDKALGY SGRYTFGENAINYDSVIKGGRNDFTLDLYLNNSHYSLNNAAATLAENEALLAKKSGNTIF TIGYTTGSSVNSFLKNVATQGEGYAYSSSSDLSGIYENIANEIVTRYETIIKNGIVNDPM SQYVDLLDLPDGRYDNSVADKLIFDRGYVIVSTENGRKKITWYVGDVEKNSQAHLEYYVA VKEEYKDGTDYPANDPTQLVYKNYIDKDSTLDFNVPTVNEELPITSATLTVKKVVNNSSE 
DKTFIIHVVGRDSENAVIYKTDLILKNNEEVLLENLPFGKYTVTETVPMEYKLESTESGY AQTNLSSGNIIELSQNKDAANGYVTLTNSFGHVGFFKATDEKTNYLPSGSETINNYEYNV SGNVKSTTVKAPQDVVLLLDKSGSMDESMNGSSRLTHLKNNVIKFITKLYEHNPDSRVSV ITFAYSADGSITNNNFVKLSDIKSGNETWYTYLTKNNGGIKNIKASGGTQIDLGLYEVRN QLSSATGENNRSVIVFTDGQPGNKGFNTSYNDYDDNGYRVGAEALNQADFIKFSGNLTGI NNYIESSNGSKYYGHKNDDITKNRSNNNSNDAGNRTNRSGKGLGKTIFTIGLNSNNSSLF DSFLTRLASEGHYTKANNSSAMENAFNSIFTSITTMDTDVDIRVKYDANKFEVIDAQGGK LGGSGTNAYIEWKDIIDETTGKFVIEGIKFRNIVADAGEPQLTVQGYKDGVTTDLEAAEI KKVGN >gi|223714194|gb|ACDT01000021.1| GENE 43 38631 - 39251 720 206 aa, chain + ## HITS:1 COG:no KEGG:CPR_0481 NR:ns ## KEGG: CPR_0481 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 10 178 3 186 211 67 27.0 3e-10 MKYLGRIKMKRPLMVLMVVVTVCLTAVGGIYAWSKLSASVTNNIKTPTVDTEVVEKFDKD STVDWFDQVEKEVQFKNTGTAPVFLRVSYAEFWKDADGKVISGLVDGNEVVNKNWTEAWQ NDWFDGGDGWYYYKKVLPANEMTAKVLTSLKFDSNYAEIYQTADYTLKFVSEAVQYSDDD SVNHRAVQANFGRDFKVISNNVITWN >gi|223714194|gb|ACDT01000021.1| GENE 44 39269 - 39910 523 213 aa, chain + ## HITS:1 COG:no KEGG:CPR_0481 NR:ns ## KEGG: CPR_0481 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 7 163 3 177 211 65 27.0 2e-09 MKFLKIKKTKVALIFASVCLIVAIGYTSAYYNSTVNVENKMATQEPEIELIEKFNQDSQF LPGETVDKKVKFANSGEIDALIRVSYSQSWINQNGDFVNGDVDSVEKKWTSAWQDEWVDG NDGYFYYTKVLKANTTTNIILDSLVLSEQVSNDTHAVDYSGIIYQITFNLESCKATTEGA QSTFGKTAAISGNNVTWSDYKGNSNQQKSGEDF >gi|223714194|gb|ACDT01000021.1| GENE 45 39933 - 40427 423 164 aa, chain + ## HITS:1 COG:BH2130 KEGG:ns NR:ns ## COG: BH2130 COG0681 # Protein_GI_number: 15614693 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Bacillus halodurans # 1 147 3 164 191 79 35.0 4e-15 MKIFKKVVSYLSILCYVVIIALVLILAPMVVGYKPVVVLSGSMEPTYPVGSLIYYKAASF EDIKENDAITFRVDDDTLVTHRVIVKNEISQTFVTKGDANPTNDTNPVEYQNVAGKTLEF CLPVVGTIFASSAKYIAVAIIGGILLLNIVLSNLVADEKEKVTV >gi|223714194|gb|ACDT01000021.1| GENE 46 
40446 - 41087 575 213 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757045|ref|ZP_02429172.1| ## NR: gi|167757045|ref|ZP_02429172.1| hypothetical protein CLORAM_02594 [Clostridium ramosum DSM 1402] # 1 213 1 213 213 381 100.0 1e-104 MSKKKRTIMSLIVLLGCVLAVSATLAWFTDSDETKNRIRIGEFGIDVIESSTNPDASVSE DGIKYDDPVVIGQEKSKIVDIKNTEELDAFIRVTVEKVWMDEDGNILSDKDPDKIVIRGI DEDKWIYKATGEKGKSYYYLKMPLKSNETVNLFKSFIIENNFTSHEETMSYANLKSKIII TANGVQADDGANAVATQNWDSIIVENGQLVSID >gi|223714194|gb|ACDT01000021.1| GENE 47 41320 - 42093 914 257 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757044|ref|ZP_02429171.1| ## NR: gi|167757044|ref|ZP_02429171.1| hypothetical protein CLORAM_02593 [Clostridium ramosum DSM 1402] # 1 257 1 257 257 433 100.0 1e-120 MKNKKSVVASIVLSGALVVVGTLAYFTQTHTVDNKLKTKGFGSDIVEKFTPKEFNPGATV TKEVRVDNTGDYALVARAKWEESWTRNGEEFKAVAYPDTNNESVVDKNGMSDKWVDGNDG WAYYNEMIGVNGHTENFLTSITLKNSADVVGTDIKNFYYTTAATEPDKTSIGTDSKTQWV KISEEEFKALDDENNNIKATFKRAEVKSNGLYDNAEYTLTITAQVSQANKEAAATWITDA TNQTVKNFLNGLPTVNN >gi|223714194|gb|ACDT01000021.1| GENE 48 42326 - 43054 891 242 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735887|ref|ZP_04566368.1| ## NR: gi|237735887|ref|ZP_04566368.1| predicted protein [Mollicutes bacterium D7] # 1 242 1 242 242 422 100.0 1e-116 MKRKKSLIAIISLSAFLVIGGAVAYYNSTTNVDNKLKTGQFKSEVVEKFTPKDNWEPGDK VTKEVGVENTGTGNIVSRAMWSETWKTADGEDITVTSADPRASVVEKEYGNKWIYSNSDG YFYYDGIISSGGRAEDKFLKSIMLSGDVDFAETKKTVVYYSTMTTGEPSKITNNPETGWA VAPNTGIPNNATYNKTVVTGNGKYASANYTLTITVQVYQANKQAVAGTAFETVVPETYAL LD >gi|223714194|gb|ACDT01000021.1| GENE 49 43075 - 43851 809 258 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735888|ref|ZP_04566369.1| ## NR: gi|237735888|ref|ZP_04566369.1| predicted protein [Mollicutes bacterium D7] # 1 258 4 261 261 483 100.0 1e-135 MKNKKSLSALISLAAIVLVVGTLAYFSRDLTVENRLKTAKYDTTIEEEFIPTDKWAPGVE INKKVTVKNTGNVDIVVRAKLTELWKRSENLMDPNDPDKILSEEGEILPNSFVDENGISH DAAIKNFVQNMVYEYEDVKDNLADYHNKWIHYGDYYYYLGVIGEDEASNGLLNSVKMNPL LDATVSGSHTVVEADEQGTTTVTKKYQYGKYGYDSADYTLTVEARTVQASKAAIRNAFGD 
NVMAVYLADHFATIAEQQ >gi|223714194|gb|ACDT01000021.1| GENE 50 43874 - 44587 888 237 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757041|ref|ZP_02429168.1| ## NR: gi|167757041|ref|ZP_02429168.1| hypothetical protein CLORAM_02590 [Clostridium ramosum DSM 1402] # 1 237 1 237 237 392 100.0 1e-108 MKNKKSLGGILLLALTVGVVGSLAYFSQELTKTNEFKTAKYDTTIEEEFTPPTDWLPGVE VNKDVTVKNSGNVDVVVKATLTESWTRGDETLSNTFTDENGLTQNAALLALPNVVKYTDD IELAAQAGKWVEYEGTYYYMGAIKGGNSSALLLDNVTLNPLLDVTDKSVTTVVTTDENGT KKEITTTEKGKYGYDDANYKLTVTANTIQATASAIKTWGSNPVVDYIVANYATIAEK >gi|223714194|gb|ACDT01000021.1| GENE 51 44739 - 45761 934 340 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757040|ref|ZP_02429167.1| ## NR: gi|167757040|ref|ZP_02429167.1| hypothetical protein CLORAM_02589 [Clostridium ramosum DSM 1402] # 1 340 1 340 340 620 100.0 1e-176 MKLFKRLLKYFSRLEKEEIIDLTPIVEDKYNEVAIGNLVWALMPLSNDELERIEESHRIR PYLVVAKDENYFYGYYCSTKQKSRYLATFKLDKNIYDHRKNTYVYLTNTYKIPRTNFRDI YNRIGINNLIMIERKLIVNSRYLEGLIHFDVPIIYCVGDIIRVGDQLYYIYQTDNSNLYV NKTKFVREAALRYEFDYSETIIFDCKKSYELVNMLSPKHIEIVYENKRKYKHEQKRKQVK QYKNVFNYPRGSVFEDCNGDKIIYLYSRSNHHYGINTKINEFFPYICDIQNIEKANIIGT INDGKMMIMLERLLDQNINPHNIVTMIYQEIMDERPRSRV Prediction of potential genes in microbial genomes Time: Thu May 26 09:23:33 2011 Seq name: gi|223714193|gb|ACDT01000022.1| Coprobacillus sp. D7 cont1.22, whole genome shotgun sequence Length of sequence - 38726 bp Number of predicted genes - 24, with homology - 24 Number of transcription units - 14, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 53 - 1528 427 ## PROTEIN SUPPORTED gi|126666946|ref|ZP_01737922.1| Ribosomal protein S15 + Term 1577 - 1621 4.4 - Term 1563 - 1606 0.4 2 2 Tu 1 . - CDS 1781 - 7348 3451 ## COG2200 FOG: EAL domain - Prom 7383 - 7442 16.1 3 3 Tu 1 . - CDS 7686 - 8432 679 ## COG1496 Uncharacterized conserved protein - Prom 8482 - 8541 4.3 + Prom 8383 - 8442 5.1 4 4 Tu 1 . 
+ CDS 8510 - 8899 422 ## gi|237735894|ref|ZP_04566375.1| predicted protein + Term 8904 - 8937 2.4 5 5 Tu 1 . - CDS 8921 - 10564 1614 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP - Prom 10585 - 10644 9.2 + Prom 10606 - 10665 8.3 6 6 Op 1 . + CDS 10700 - 11260 724 ## COG0194 Guanylate kinase 7 6 Op 2 . + CDS 11289 - 11699 486 ## gi|167757033|ref|ZP_02429160.1| hypothetical protein CLORAM_02582 + Term 11727 - 11765 -0.5 + Prom 11707 - 11766 1.9 8 7 Op 1 . + CDS 11825 - 12241 481 ## CKR_3111 hypothetical protein 9 7 Op 2 . + CDS 12291 - 12863 506 ## COG1434 Uncharacterized conserved protein + Term 13093 - 13142 1.1 10 8 Tu 1 . - CDS 12864 - 13685 568 ## Clos_0765 membrane associated protein - Prom 13738 - 13797 8.1 + Prom 13723 - 13782 9.8 11 9 Tu 1 . + CDS 13802 - 14320 702 ## COG0778 Nitroreductase + Term 14394 - 14451 2.0 + Prom 14564 - 14623 4.9 12 10 Op 1 . + CDS 14681 - 16879 2029 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 13 10 Op 2 4/0.000 + CDS 16892 - 18145 1126 ## COG0144 tRNA and rRNA cytosine-C5-methylases 14 10 Op 3 5/0.000 + CDS 18147 - 19175 1064 ## COG0820 Predicted Fe-S-cluster redox enzyme 15 10 Op 4 17/0.000 + CDS 19175 - 19912 882 ## COG0631 Serine/threonine protein phosphatase 16 10 Op 5 7/0.000 + CDS 19914 - 21860 2362 ## COG0515 Serine/threonine protein kinase 17 10 Op 6 10/0.000 + CDS 21870 - 22745 924 ## COG1162 Predicted GTPases 18 10 Op 7 6/0.000 + CDS 22738 - 23379 687 ## COG0036 Pentose-5-phosphate-3-epimerase 19 10 Op 8 . + CDS 23376 - 23966 558 ## COG1564 Thiamine pyrophosphokinase + Term 24054 - 24108 11.2 + Prom 23969 - 24028 7.9 20 11 Tu 1 . + CDS 24139 - 29295 5718 ## CPE0191 hyaluronidase + Term 29366 - 29426 19.1 - Term 29343 - 29379 1.2 21 12 Tu 1 . 
- CDS 29426 - 29908 566 ## COG2190 Phosphotransferase system IIA components - Prom 30046 - 30105 9.7 + Prom 30259 - 30318 4.5 22 13 Op 1 7/0.000 + CDS 30406 - 31830 1422 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 23 13 Op 2 . + CDS 31835 - 32680 722 ## COG3711 Transcriptional antiterminator + Term 32756 - 32810 12.4 + Prom 32785 - 32844 9.7 24 14 Tu 1 . + CDS 32910 - 38549 6036 ## CPF_0184 hyaluronidase (EC:3.2.1.35) + Term 38577 - 38620 7.1 Predicted protein(s) >gi|223714193|gb|ACDT01000022.1| GENE 1 53 - 1528 427 491 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126666946|ref|ZP_01737922.1| Ribosomal protein S15 [Marinobacter sp. ELB17] # 1 485 1 498 503 169 29 3e-41 MYVATSRQMKAYDQALLNEGYSIEELVDKASNAILPHCLGYNNIVIVCGPGNNGADGLSL GIKLHIRARNVKLYCFGNPNKFSQANNFYIEQAQEMEVPITFMDEEDISLFISDAQKADV VIDAMFGFGLNGEVRGVARILIEEINNLYDIDIIAIDIPTGLNPDTGIPYGNVICASKTI TLTAIKQAFLNEECHMYTGEIAVEILDAKDLRQEQGLAKLVSPSWIKYHLKPRAYYGHKG IYGKILHLTGCDYYRGAALLASKASVYSGSGVVCVCSSEKVIDALATVTPECTSKLRNSK LEQELFYDRDAILIGSGLGLNEQSEQYVIDTLQYANCPIVIDADGLTIAAKHLDLLKECP VPIILTPHFGEFKRLCAYDDELDMIDKVGGFARKYGVTVVLKGPNTLITDGRETYRNITA NKAMATAGMGDVLAGMIVSFVGQGYTPKNAAILGTYLHGSCGDVIGDNSYTVLPSKLIDL IPQVMHEIINE >gi|223714193|gb|ACDT01000022.1| GENE 2 1781 - 7348 3451 1855 aa, chain - ## HITS:1 COG:RSp1097_2 KEGG:ns NR:ns ## COG: RSp1097_2 COG2200 # Protein_GI_number: 17549318 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Ralstonia solanacearum # 1591 1848 7 265 272 164 32.0 2e-39 MNENFNSAFYKKILDQINSNIYVTDVETDEIVYMNDYMKNTFQVSNTEGKKCWQILQSGM NKRCEFCKIKQLQESNSQKICVWYEKNTVTEKTYLNHDTLELWDNKLYHIQNSIDVTEQL QLSVEATIDELTGIFNRNAGKKRIEELLTAEYLNEDFVVVLYDMNGLKWVNDTYGHLEGD RLLIFVAQNIKKRLKEPNFIFRLSGDEFIIVFLDTTVKGVEVWMDNILLTLNQERIKENI DYDTSFSYGIANIHVDENLTVSDVLSIADSQMYIQKRDYHILMNKKRLTNSQPKYNSIPL FEYNRDHLFESLSETIDDYLFVGNLKTSKFMYSRQMMIDFRLPNQILDNAAAFWGEKIHP DDKAGFLRSNQEIADGRAEQHTIYYRAKDASNRWIPLLCKGKMIRDKNGNPDLFAGTIRN 
LDKNREQASLEPNSLDTYLDTSYYIYEKNQSNKNNFLFQKSFYFVNIFDQQNDLKTKTDL LDFVNKNIPGGILAVQAHSGFPLYCFNQAILEYTQYSYEELCQITNGKFSNIIYEKDRQF VEREIFQQLERNDIYEIRYRIVCHGGHLLWVYERGKYVIDGHGQKLILSFFVDISHEMNN EQELRFINENSLDGVFKAALIEHFPILYANDGYYRIHGYTKKQFHHDLNDCAENLIYESD KARIRAQITDLIKNNQSQAILEYRIKKRDNSIAWIHISASFTSLLDQTIIMIGMVMDITK RKKLEDKLYRSQQLFKVAQTKTGLNIWEFDIQNKQVIGNPDSAIIRSYQNLTANIPESII AQGLIHPQSINELKNLYQRLEKGAAPLSATIRVKQKDAKDKYLWEKITYTIVQHIDEKPA WAIGISEDVTEQKEIETRIYNEESLRKVLNEDIILNIRLNFSRDFIEDISLHSNEQYEKQ LSDYTYDDIYKYLLQTIANEDDQKKFEKKYSKNQINDYIRTFTDIPTFEFRQKYLNGMIL WVSLNMKTTISPQNNDVILFIYTKNIDIQKRRELALQNKAEIDEMSSLYNESTSQLMIET LLKKEYDPKKVNAFLLLDIDNFKQINHSGNFLFGDQIIRQMGQMIKKSISSSCIAARING DTFSIFYHNAESKKVIYQAALQLQKKLSSIYHFQDIETQLTLTGGLIYLFSENMSYSQIH QCALHALDTAKRRGKNQILSYQEIEKVEFNFKIEMIINPNDFTILSMSPTGQIIFNCIFP VKKPIKCYELLHNRKSPCPFCSQQIDFEQTNMRECFVSKIDKLMYVKEQLSHYEGDTVRK IEMQDIPFNLAYENQDSDLQQFLEISWKNIDKNIQRENAFDNILRQLGTFFNAQNVLLFH QSDNQQKFSLTSKWNYNNQISNIKNTHFSKEHFENIVSAVAPQKYFVILDKNHFCYKNIT ELYFEKDVPLPLLFIGIYNKNKLVSCLLVETVKQHINASKSIEVIADVICKITTIYNLQT KYTYALTHDQKTGLSNYDSYIKTIRNINTDIYSTFGMIEITIINLKKYNQRYGIVQGDEI LQFIAQTITKFFDKKSCYRISNAHFIVLCPDITYENFIHIYDTFYQELIEFYDQWIICEK VWEKNSLVLETMQEQLEEKTRHAVNKKIHSKLAKPDQSIAKILERLNTAFVNGKFLTFLQ PKAHTKTNQICGAEALIRYNDPQKGIIPPGRFLPEIERAGLIRHIDLFILKDVCRILSNW LKTTWKPFPISINYSRATILEPGILEETNKIVESYNIPKELIEIEVTESIGSIDSASLKN IVNQFLAEGYKIALDDFGAEYSNIYILYSLHLSSLKLDRKIVSDIYHDHRAKLVIKNIIN TCQQLDITCVAEGVETKEQLKVLKDLSCDVIQGYYLNKPLSEDDFKKISGITNPH >gi|223714193|gb|ACDT01000022.1| GENE 3 7686 - 8432 679 248 aa, chain - ## HITS:1 COG:BS_ylmD KEGG:ns NR:ns ## COG: BS_ylmD COG1496 # Protein_GI_number: 16078601 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 34 244 62 273 278 139 35.0 4e-33 MKLIKWNTVENIEAYTITKELGDMSYNNEDKQLILNNRKQLAKLLKTDLEHMVAPHQTHS ANFKEVSLKDGGRGMYCQDDAFNNTDALYTKDTDLFLLSFHADCTPVLLYCPKTKIIAAI HSGWLGTTKQIVSKVTKHLIENEHCDPKEMLAYIGPCICQDCLEVMDNVIDLVKQMDFDT 
TPYYKQTDTTHYLLDNKGLNRQMLLNLGLLKKNITVSPYCTVENDKLFYSYRKYKDSGRN ITIIKRQA >gi|223714193|gb|ACDT01000022.1| GENE 4 8510 - 8899 422 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735894|ref|ZP_04566375.1| ## NR: gi|237735894|ref|ZP_04566375.1| predicted protein [Mollicutes bacterium D7] # 1 129 2 130 130 166 100.0 3e-40 MFCNSGCGCNNSCGCGGCNSGMNCCRATPIVAPRKVCTSCSNQYVEQPIICPIECRHINN IVYYPRYYPRYERTFMTQSNGNPFASAANATPFNGAGTGFNAANNTGDSVINGMYNNANN GGSGCGCNR >gi|223714193|gb|ACDT01000022.1| GENE 5 8921 - 10564 1614 547 aa, chain - ## HITS:1 COG:BS_yloA KEGG:ns NR:ns ## COG: BS_yloA COG1293 # Protein_GI_number: 16078628 # Func_class: K Transcription # Function: Predicted RNA-binding protein homologous to eukaryotic snRNP # Organism: Bacillus subtilis # 1 546 3 571 572 414 42.0 1e-115 MAYDGIMMHQVIDDLNKTITGGRINKIYQISKYELLIQVRNNRTNYKLLLSCHPMYARIQ LTNLDYPTPESPNPLTMLFRKHLEGGYIKKIEQIELDRICHFLFGCHNEFGDYVEYHCYI EIMGKHSNIILCNQDKKILDCLKRITPNINSERFVQPGAMYQLPPMNQNKVDPFTSDFIE NNNLTKIYQGMSPILSKEILYRHDNGIDFKEIMDEIKNSDTLYISRVDEKEYFHLIPLTH LNQQAQSYSLFDGLDKHFNLIDQKERIKQQTSNLLKFINNEYQKNVTKLAKLESTLEDSN NSDEYRIIGDLLYSNLHLLKKGMRHVELDNYYDGSKIMVDLDEKLDPKSNAQKYYNKYQK AKNSINVLHEQIDLTKKEIEYFDSLLTLMDNASYYDALEIKEELENLGYLKKKKKTNTIR KNKKPSFETYYTKDGIEICIGKNNLQNDYLTFKHAHRYDTWFHVKDMPGSHVVVKGDQLD EYTIRLASNIAAYYSKGKNSSSVPVNYTLIKTLKKPHGAKPGQVILDNYKTIYIDPDQHC LDELQKK >gi|223714193|gb|ACDT01000022.1| GENE 6 10700 - 11260 724 186 aa, chain + ## HITS:1 COG:BS_yloD KEGG:ns NR:ns ## COG: BS_yloD COG0194 # Protein_GI_number: 16078631 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Bacillus subtilis # 1 178 46 223 244 197 56.0 1e-50 MLIILSGPSGVGKGTVREELFKDDSLNLAYSISMTTRKPRPNERDGIDYFFVEEEEFKSK IEEGKLLEWAQFVGNYYGTPKDYVDQLLNEGKNVVLEIEVQGALQVMEKCPDATTIFLVP PSLEELERRIRGRRTEEEEIVQQRLSKARKEIATKDEYKYVVENDDVMAAKDKIAEIIKN HQTNQG >gi|223714193|gb|ACDT01000022.1| GENE 7 11289 - 11699 486 136 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757033|ref|ZP_02429160.1| ## NR: 
gi|167757033|ref|ZP_02429160.1| hypothetical protein CLORAM_02582 [Clostridium ramosum DSM 1402] # 1 136 1 136 136 197 100.0 2e-49 MEKFKEYFKVFAYIFYPGMVLILYLAQGLEGKLGLDLIFCPTDIIVLLAILGIYMLLDAF KVFDAKVFDFMKKNDGRLKVPFLLAFLLIIAVLMGLYDDLKVYFYLNNDVSINLFIIIAL GLGLAGIAFLLNKSEK >gi|223714193|gb|ACDT01000022.1| GENE 8 11825 - 12241 481 138 aa, chain + ## HITS:1 COG:no KEGG:CKR_3111 NR:ns ## KEGG: CKR_3111 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 2 138 25 162 162 92 42.0 6e-18 MLDVGLELNVLFDDPFWIGVFYLTNGDKCYVERVVFGQEPSDGEIYVYFLNNYHKLNFVA EFKAKHKDKPKNPKRLQRMIRKQSMNLQTGTKSMQALKRQYEQNKAKNKIIRREQRTAEK EHLFVLKQQKRKAKHRGH >gi|223714193|gb|ACDT01000022.1| GENE 9 12291 - 12863 506 190 aa, chain + ## HITS:1 COG:CAC0352 KEGG:ns NR:ns ## COG: CAC0352 COG1434 # Protein_GI_number: 15893643 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 18 181 33 196 225 113 37.0 3e-25 MVKRVIIGFGILNLLAFLLVFKKPYKKQRYKKYDCAIICGYPANIDGEPSTIMKSRVEMG VTLYKKDKIKFLILSGGAVKNEYQEAKIMASYAISLGVKKEDILIEGQSRSTYHNLMYSR DIMQINGLKNALVVTNSWHLRKADHYARKFKLDYAMVSASAPKSYCWIKVVVLHLGTNIK MYYNLFKGYY >gi|223714193|gb|ACDT01000022.1| GENE 10 12864 - 13685 568 273 aa, chain - ## HITS:1 COG:no KEGG:Clos_0765 NR:ns ## KEGG: Clos_0765 # Name: not_defined # Def: membrane associated protein # Organism: A.oremlandii # Pathway: not_defined # 5 271 8 288 303 112 28.0 2e-23 MCKKLSIILLCLLMITGCSKDKPILYSNLGNQASQNKLTNILNEADLPKENIKQFFSYVN TFNQHATPLIGDFETLEREQPDYQYFKYNSPVEISDQNDSNSLIPSFILIKNLIYTNNSG HADDSYIMFNLNLIDTIDQYSMSQEDRLKFITTFNSISVTGIKNNEISHINQIEKTFSDR DFSVKQNQKASLITLWLHSSADNRRFVSHCGVLIDSNDGLYFIEKYGCFYPYQVTKFNSR SELKTYLLTRNDLKGDENDGLPLVFENNKYLNK >gi|223714193|gb|ACDT01000022.1| GENE 11 13802 - 14320 702 172 aa, chain + ## HITS:1 COG:FN1223 KEGG:ns NR:ns ## COG: FN1223 COG0778 # Protein_GI_number: 19704558 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium 
nucleatum # 1 172 1 174 175 152 43.0 4e-37 MNETINNLITRRSVRSFQKEQISEEQLNEILTAGTYAASGMGRQSAIMVVIQNPEIIKRL SKLNAAIMDSENDPFYGAPTVVVVLGDSTIGTYLEDGALVMGNLLNAANAVGVDSCWIHR AKEVFMSTEGKELLKQWGIDEKYVGIGNCILGYGKEPKPEAKPRKDNYIYRV >gi|223714193|gb|ACDT01000022.1| GENE 12 14681 - 16879 2029 732 aa, chain + ## HITS:1 COG:BH2509 KEGG:ns NR:ns ## COG: BH2509 COG1198 # Protein_GI_number: 15615072 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Bacillus halodurans # 4 729 2 801 804 604 40.0 1e-172 MVYIVQVLVEHPTHALDTTFDYLANTEIASGVRVTIAFGYQRIIGYVTGCRYSNSTKEEL ETAAGFKYRYISEVIDERPLLNQELQELSLTLAKLTLSPRISCLQAMLPPQLKPSTNKSV GIKYQKVIKVINEDATLKTVKQQEAYQYLKTHPDTPLKAFPYSRGLLDKLIEQRVTVIIE QELYRDPYLDDQKEEHEFSLTVDQQKVVNGIMSKIDTYHTALIHGVTGSGKTEVYLHLSK YVIRNNKTVLMLVPEISLTPMMVNAFKHKFGKEVAILHSKLSAGERYDEYRRILNSEVKI VVGARSAVFAPLEKIGLIILDEEHDPSYKQESKPRYQTTQIARIRGQYHNCPVILGSATP SLESYSRGQKGIYDLYELPKRINQFPLPKIELIDMADEIRNKNYSLFSKAMKEQIQTCLD HDEQVILLLNKRGYATYVRCLDCGEVIKCPHCDVTLTYHKADHKLRCHYCEHQIEMPRLC PHCSSERLKLVGAGTQKVEEQLETIFDGAKVIRYDVDTTKQKDGHQKLLDKFARKEGNIL LGTQMIAKGLDFENVTFVGVLNADISLNIPDFRANERTFQLLEQVSGRSGRGQKEGTVMI QTYNPEHFVLQCVKNHDYLRFYQEEMKTRKLAAYPPYVHLVSILIQGKDEEVVNQSAVQI KEYLQKQLDKMAILGPANSLIYRMQDIYRKRIMIKFTNSKQLYPVLEKMSDFYNKKGNKV HVVCDFNPYNQI >gi|223714193|gb|ACDT01000022.1| GENE 13 16892 - 18145 1126 417 aa, chain + ## HITS:1 COG:BS_yloM KEGG:ns NR:ns ## COG: BS_yloM COG0144 # Protein_GI_number: 16078637 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Bacillus subtilis # 3 415 7 445 447 313 40.0 6e-85 MARTTALDVLIKYQNEQSYLNITLNDYLEKSLLSRNDKDLATRIVYGTIQNKLYLEYQLA PYIQGKKIKNREKMILLMSLYQLIFLDKVPDYAVIDEAVRLAKKRDNFAGKFVNAILRNF LRNGKCEIDEKDELKRISIETSHPLWIVKMLSKQYDLNIAKKICQHDNTPPTRAARVNIL KTTKAQLLTDENFEVGNLSPDGLLYRAGNIANTDYFKNGLVTIQDESSQLIAPLLDPQPT DLVLDMCCAPGSKTTHLAAIMNNQGKIIAYDLFEHKIKLVEANLKRLGVNNVELHVGDAT 
LLKEKYSEESFDKILLDAPCSGLGVMKRKPEIKYHDSGAMDTIIPLQAKLLDNAYYLLKK NGKMVYSTCTINKKENEQMIKQFLDKYPDMKIIEEKKILPYVYDSDGFYMCKLEKGK >gi|223714193|gb|ACDT01000022.1| GENE 14 18147 - 19175 1064 342 aa, chain + ## HITS:1 COG:lin0484 KEGG:ns NR:ns ## COG: lin0484 COG0820 # Protein_GI_number: 16799559 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Listeria innocua # 3 342 5 355 367 370 55.0 1e-102 MKNIYDYSLEQLTEYFASIKQKPFRAKQVFSWLYQKDARSFDDMSDLSKDLRNQLKVEFS LDVLKIKEKQVSRDGTIKYLFELLDGSLIESVLMIHDYGKSLCVTSQIGCNMKCTFCASG LLRKQRDLTPGEIVAQIIKVQQDIDQRVSHVVVMGTGEPFDNYDNVMEFVRIINHPHGLA IGARHITISTCGLIKGIKRYSEEGIQTNLAISLHAANDEIRDELMPINKVHPMDDLREAI SEYIDKTNRRVTFEYIMLKGVNDDIVYARQLAHYLRGLNAYVNLIPYNSVDEHGYQPSDK ETVEIFKNELLRLHINVTLRKEHGRDIDGACGQLRAKRSGVK >gi|223714193|gb|ACDT01000022.1| GENE 15 19175 - 19912 882 245 aa, chain + ## HITS:1 COG:lin1935 KEGG:ns NR:ns ## COG: lin1935 COG0631 # Protein_GI_number: 16801001 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Listeria innocua # 7 240 7 241 252 169 39.0 3e-42 MNYYALTDIGKVRNKNQDQATVIANVKDQVIAIVCDGMGGHRAGEIASRVVMDQFVNCFD SIPPFDNVDELKKWVNETIYEADAIVKRMAKQNVEHHGMGTTIVVAILMDNVIYISHVGD SRAYVLKNDQLMQLTKDHTLVNALVDRGAISEEEAVNHSQKNVLTQAIGADTELTPSFIE LEFADSLLLLCSDGLYNCLNDEIIKEILKKNIKVSQKVTELIDRANENGGRDNIGVAIID NREEK >gi|223714193|gb|ACDT01000022.1| GENE 16 19914 - 21860 2362 648 aa, chain + ## HITS:1 COG:L138452_1 KEGG:ns NR:ns ## COG: L138452_1 COG0515 # Protein_GI_number: 15673869 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Lactococcus lactis # 1 430 4 429 431 319 42.0 1e-86 MAKIIAERYELLELIGQGGMADVYLAQDIILNRTIAIKILRTSLAKDPIYVTRFQREASA AAALSHKNIVEIYDVGEDEDKYYIVMEYVPGMTLKELILKRGAVHVVEAIDIMKQVISGI SKAHQLGIIHRDLKPQNILVTDSGVAKIADFGIASMQSLAQVTQTDVIMGSLHYLAPELA 
RGEKATAQSDVYALGIVFYELLRGEVPFNGESPVNIALKHMQEDLPSLLEFNPSIPQSVE NIVIKATAKNLNDRYKSATEMLDDIKTCMERQDEEKLVFSHDQDTDPTIVIDPRSAFTSG NTAPIVDPVEEKEVAAPKKEGFFSKLVNKFKGLDTKAKVAVGVVTALVIAGIAFAIYLGV KPDTSLMPDLTEKTVEQAKEILKEYNVTISDDITEELSDEYEKGEIIATDPKKGTTIKEG DVVKVTVSKGKYIVLEDYTGKKEDVAKKALEKLKFEVEIEYEISSKTKGTVIDQSIEKGT KVDPTEKDRKIILTVSKGDYVVLGNYVGMDQNKAKEALTKLGFDVTIKETSSEQAVGTVV EQSLKEGHKVDPDEKDRTITLTVSSGIKIEVPNVKGMDIDAAATLLTNKGFSVKRETLPT PTDPNEIEKITVNQVVRQSLDAFTTVTKKNESITLYYYNYKPEIPTDD >gi|223714193|gb|ACDT01000022.1| GENE 17 21870 - 22745 924 291 aa, chain + ## HITS:1 COG:BS_yloQ KEGG:ns NR:ns ## COG: BS_yloQ COG1162 # Protein_GI_number: 16078641 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Bacillus subtilis # 1 288 1 297 298 316 53.0 4e-86 MSQGRIIKALAGFYYVEDDHQIIQCRARGKFRKDEIKPLVGDFVEYEVEGDNDGYVMNVL PRHNCLVRPPICNVDQALIVSSCKEPDFSSILLDKFLLVIEHLGIEPIIIISKMDLDEDE SVKQYVKDYRQAGYRVYEISSKDNHGLEELKTVFKDKVSVITGQSGVGKSSFLNALDINL KLETNEISKSLGRGKHTTRHVELLKMYGGYLGDTPGFSSLELEMTPEELAVAYHDFAQFS HECKFRGCLHDSEPNCGVKKAVEDGKISKERYEHYLMNLQDVKKKEERKYG >gi|223714193|gb|ACDT01000022.1| GENE 18 22738 - 23379 687 213 aa, chain + ## HITS:1 COG:Cgl1560 KEGG:ns NR:ns ## COG: Cgl1560 COG0036 # Protein_GI_number: 19552810 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Corynebacterium glutamicum # 4 213 8 215 219 199 49.0 4e-51 MVKVAPSLLSANFAYLKEEINAIKAADWIHYDVMDGHFVPNISFGYSILKDVSKVTDMYL DVHLMITDPAKYVDNFINSNASLIVFHYEAVAEDKINELIAHIKEHDVEVGLSIKPDTPV EVLKPFLNELDVVLVMSVEPGFGGQKFNSAAVDKIAQLATLRKENNYHYLIEVDGGINES TSKLCNDAGVDVLVAGSYVFGSDDYTKAIESLK >gi|223714193|gb|ACDT01000022.1| GENE 19 23376 - 23966 558 196 aa, chain + ## HITS:1 COG:SA1066 KEGG:ns NR:ns ## COG: SA1066 COG1564 # Protein_GI_number: 15926806 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Staphylococcus aureus N315 # 18 195 24 211 213 110 38.0 1e-24 
MKIGICSALACKLDDSLKYIGVDHGVEILLKQGIKPVFAIGDFDSIENENVLSELDIERL PTRKDVTDTHAALEYALENGYDEVDIYGVTGGRLDHFLSVMCLLEKYADKKIRIIDQQNV IQLLLPGTHKVYQDEYKYFSLYALDDAYIDIDGAEYPLNQYFLQRQDPLCVSNQVSCEIA TITNSKPIVLVKSKDA >gi|223714193|gb|ACDT01000022.1| GENE 20 24139 - 29295 5718 1718 aa, chain + ## HITS:1 COG:no KEGG:CPE0191 NR:ns ## KEGG: CPE0191 # Name: nagH # Def: hyaluronidase # Organism: C.perfringens # Pathway: Metabolic pathways [PATH:cpe01100] # 35 1457 36 1491 1628 780 35.0 0 MARFWDKALKSSLAMLVVFSTLFFMSPNIKVEAKETNYEIYPTPHEITYQDKDYVIRSQV NVVYEDGIDAATKKRMTEVLESKNKQISTSKQKVDGKTNILVGTYKSGGYVDGYVKSNYS VEESLFSKFGAHFVASNNGEIVILGKDTDGAFYGITSLKHIFNQMDGSTIRNFVIKDYAD TDIRGFIEGYYGIPWSNEDRMSLMKFGGDFKMTSYVFAPKDDPYHKNLWRELYPEEELAA IKEMVQVGNDSKCRFVWTAHPFMGGFNASKVDEEIASLLKKFDQLYDAGVRQFGVLGDDV GSLDRSVVIKMMNSVSAWAKEKGDVYDSVFCPAGYNHSWQGDYSELSDYDAGFPEDVKIF WTGEAVCQPVEQKTLDHFKRNRLPEGADERRSPLFWLNWPVNDINGSRLMMGKGSLLHTD INIDDLSGVVTNPMQEAEASKVAIFAVADYAWNVKSFNDDQSWADSFKYIDQEASEELHT LAKHMSNPQPNGHGLVLAESEELQPLINEFKTKLANGDSIIESGKQLIAEMNVIIDACDG FHTNSKNENLKKELLSFTGSLKDLTTAIKHFTESAIAIEENDMVTAFNQFSNASSELINS QNHIRKLLNGTAKVSPGSTHLIPLAKALQDKLSGPINDYIASGETEQPLEISASSSFTTW HSGKIADIVDGNNDTAAWHNGYEKAGQYFQVDLSKPTTIYGVHILNGANNSDKKEDTFGY AKLQYSTDGTTFKDLNKEVYGEYASQVDVTNIEIEDVVAVRYVCSEVGGGNKWPAMREFT VVTQPVEEEQFTKEVIRTTDGWSIYGGSDANLIDGDLTTKVHYNVRQKGEPANTTIPGDY VGVKLSKPITLGKINILQGNTDSDGDYFKDADLQYSLDGKTWTTVETYKNTINIVTDLSD QNITAQYVRLVNKVNQNTWIAMREFDVAAKVYHNGKVFTNVSEYKSLTADYLDESAKITP AKGITLANGDYVGLKLDRIHELKDIVSELTNDSLTIQVSKNSYEWQNVNAGNVSADARYV RIINNNEAEVTFDLNKFIVNTVEIKEKSITSSNFSIDKPLNVFDGDLTTATAYQGSQNVG KYFTYDLGQEITLNSFKAICTDSEWDYPRHGKFSVSTDGETWTEIMLLGNQNENNPGEAE NTDEIGFVLPSHEISYNAKKVDGLNLQARYLKFEITRTKVGADKWVRFQELEINNGEFIP SQNNPTFTSSSQETRDGLFSYMVDGDISTMFIPKDANGYVNYSLSSNNQVNTIKIIQNSA VISNAVVKARLFTNASAEQWVTLGTLSQAINEFVLPENTILLDIKVEWGDVIPNITELST YKSEVTTLNVDALKALIDNKEDLSSWTAAAAATYGDAYNAGKQVLESEYASQTTVDNAVR AINKAIENKVLKGDLSKLEIIIANAHTDQNNYTAASWLAYSKAIEAINKSIANGDNTSVA DVDKLIANYDLANTNLVFNPSNQEEAIITIEGENDFVASVTNPEKLYTINSWKMYLEAKA 
KVEQLIIDNQSTPVHPDEFAKAISELAAAKEKLVVVGDLKDILDTANKVNQELYTTSSYK GLADAIAEATARLENGTAEEIDASVKALDNALKALVVRAKADEVKEYINSITLVDLSKYT ESSAKIYQEAYNVLKAMLDNLSDISAKEFIEAKNNFETAVAKLEEKATVAPIPTPTPNEL PASDSAKTGDDVNIFAYVTGLGVVAIVGIYWLLRKKEE >gi|223714193|gb|ACDT01000022.1| GENE 21 29426 - 29908 566 160 aa, chain - ## HITS:1 COG:CAC1354 KEGG:ns NR:ns ## COG: CAC1354 COG2190 # Protein_GI_number: 15894633 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Clostridium acetobutylicum # 11 160 6 157 159 108 38.0 5e-24 MNFFTRLLKNKKNTVCAPACGKVIRLENIPDKAFANKMVGDGLAIEITDNKLVAPCEGII SVIANTKHAFVMTLPNGLHLLVHIGINRAKPDANNFQYHVKAGDYVSLGADIVTLSDNLL KVLNNKVITPIIVCNYESHPIKTFTTASSVETGKTIFTYK >gi|223714193|gb|ACDT01000022.1| GENE 22 30406 - 31830 1422 474 aa, chain + ## HITS:1 COG:YPO2628_1 KEGG:ns NR:ns ## COG: YPO2628_1 COG1263 # Protein_GI_number: 16122841 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Yersinia pestis # 1 392 15 409 410 432 55.0 1e-121 MLPVSVMPIAALLKGIGYWIDPTGWGSNNAIAAFLIESGGAIIDNLPILFAVGIAIGMSQ ERELTVALSAVVSYMIVSRLLSPPAVALIKGIPEAEVSAAFENSANAFVGIISGLVVAYN YKKFSKVKLPDALSFFGGKRFVPIISTASMLVISLLLLIVWPFFYECFVSFGEMISKLGP VGAGLYGFFNRLLIPTGLHHALNSVFWFDVARINDIGKFWGTVSGGIKGVTGMYQAGFFP VMMFGLPAAAIAIYQSARPERKKQVGTILLSAAFASFVTGVTEPLEFSFMFVAPVLYFIH ALLTGIMLFVAAVFKWLAGFGFSAGLIDYILSIKAPFSNDIFMLIPLGLICAFIYYCIFR FMIVRYNLMTPGRETDEDIIEIEEEEYNITLENQDYDEIAIVLIEALGGRNNVQFADSCI TRLRVELGNPDLVDEQAILATGATGIIKIGKNNLQIIIGTEVQFIVDAMNDILM >gi|223714193|gb|ACDT01000022.1| GENE 23 31835 - 32680 722 281 aa, chain + ## HITS:1 COG:BS_licT KEGG:ns NR:ns ## COG: BS_licT COG3711 # Protein_GI_number: 16080959 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 6 280 1 275 277 173 35.0 4e-43 MEAITLKILKVFNNNSVAAISDELGDIILTGSGIGFQKRIGDEVDESRIEKTYFFKDDQQ 
KRFEQSIETVPAIYFEITNKIVNQANKELDTDFSGEIFLAISDHISFAVKRKKEEIYLPN VVLSETKVLYKKEYKVGLWALDYIEEKTGIRLDDDEAGYIALHLVNFSLDNKANNATKIV TLTKEVLNVIKLSMKVDLEEDSLGYARISTHLKYLAERIFRDEIDELQDTTADIREMLKE DLRLSLCINRIVKLIRDRYDYELSPDEQTYLCIHIKKNAKL >gi|223714193|gb|ACDT01000022.1| GENE 24 32910 - 38549 6036 1879 aa, chain + ## HITS:1 COG:no KEGG:CPF_0184 NR:ns ## KEGG: CPF_0184 # Name: nagH # Def: hyaluronidase (EC:3.2.1.35) # Organism: C.perfringens_ATCC13124 # Pathway: Metabolic pathways [PATH:cpf01100] # 36 1073 36 1058 1627 858 46.0 0 MRKIKKNTHKLMSMFLAVLMVFGCFSAWPTQVSAAESDYEIYPKPHLMEYQSGDYVIHQN INIVYEDGIDEYTKARMNEVLAIKKINATISNEIKEDQTNILVGIKGSDGYVDKHVSKNY TLKTSGLYDKLDSYLLASDNGTITVLGKDTDAAFYGITSLYHIFNQMDSYSIRNFVMEDY ADVASRGFIEGYYGNPWSTEDRSELMKWGGYYKLNSYFYAPKDDPKHNSKWRELYSDEEI ETKIKPLAQAGNESKCRFVFALHPYMNNAIRYNSEENYQADLKVMQAKFAQVIEAGVRQI AILADDAGNVGGDNYIKTLKDMTAWIKEMQKTYPDLKLTLPFCTQEYMGNGQGYYTNFPE NVQIVMTGGRVWGEVSNNFTTTFTNNVGRGPYMWINWPCTDNSKKHLIMGGYSTFLHPGV DPNKIQGIVLNPMQQSEPSKVAIFGNACYSWNIWETSEEADEIWEDSFKYVDHNSAEVTA ASDALKELSKHMINQNMDSRVTPLQESVELKDRLNQFKTALNSGNTISDDLFTDLINEFT KLQTAAKTYRNEAGDTRIKDQIIYWLDCWDDTTEAAIALINGVKAVQDGAENDQIWDLYA QGQAAFERSKTHNFHYVDHDEYAEVGVQHIVPFIKTMEQYLSDVASSIIDPSKQVTKFIT NRDDTPTGAISNVFDNKANTEIVYKNPNSINEGTYVGVSYTKAIDVDKVIFRLGTNSNSK DTFAKSKVQFTNDGKEWKDLDGTIYNLPNEVVLEDLELQNVKGIRMIATEAKSNTWLGIR DIVVNPTDEPIVDNDMGTLSIDKLSLQGGTLAKVTDGDNSTFAHFAEDPYKGGTIKDYIP IDASLILTFKNPKKLGTINFVQDSKTDKITKYALEYTIDGKNWQEIAKYDGQAAVNEDVS ALNLTVKAVRIRNLELNLQNDSAGFWWKVYDFSVSNPAAIEKTFMYTDTWEVYKGTESNL TDGDNTTALDFNTKPGDTSRVGDFIGWDFGKTIQIGKVHAVIGGDRNAGDKWLKYSLQYS ADGQDWTTYKSYEGVTNGKDVIDENLRGVEARYVRLVNNEQKNSWVIFSEFDVAQFDPIK DYDDTNVYTNTEYRLITESKEDLTKLMYDEVITLTQGQYIGVDLLRIKDISEIVVDMNKD NLTIETSKNALEWTTVKTKTSELPDARYVRIINNTDSAITFNLNRFEVHSNELYAPSLYE TTMGINSSWGVAEDSRNNGAAFDGNIDTTTEFADLPQKGQYIIYDLGQERNISKIQMFCQ DSAVNYIRDADILISNDLENWTKVITIGDGVENKNDASVTCINSDAGYKASSTYPNKVLV EGTADNVKARYLKILMTATNNNRAVLFNEIVINDGEYVPVSNDPTFESNAVEVQGFVPQN MFDGDLTTAYKPNTTDAGYITYTLSDNLDVKKINIVQKGTVSNAKVMALVMVGEEKQWVQ 
VGTLSKSLNEIYLPFDMTYELKIVWEANNIPTISEIVRLNDDEFLPELEALQKYVNSLNV DENNYTASSYAKYVSKLNQANEVLSTASGDKKLIIKAYSELQIAYTNLVTRGNAQLIKDE LNKIGALVADDYTDSTWEALQDKVNEANTLLAKGEEELNVKEVADMVNILQVAKSNLITK VSVSKEVLQNYIDTNELDNLDTSKYLTSTATPFVDALKVAHDLIASDDATVKQLEDALTA LQETRAALVLKATADEINAINGLIDSYKENNYTASTWKEFVKVLTEVKDALNDENSSEDI AELTKTLKAAAEKLVERGDLTDLNILLETAEALDANRYTEESYNKLLDIIKTIKVELKDS SEMLQTDVDKLQAKLQAAIDALEIKPSVTPNEDGIINGNTASDTTTSSKTGDDVVIGSFL VLGMLSIAGLWLYRKKENC Prediction of potential genes in microbial genomes Time: Thu May 26 09:24:40 2011 Seq name: gi|223714192|gb|ACDT01000023.1| Coprobacillus sp. D7 cont1.23, whole genome shotgun sequence Length of sequence - 46492 bp Number of predicted genes - 46, with homology - 46 Number of transcription units - 20, operones - 12 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 110 - 169 8.3 1 1 Op 1 . + CDS 207 - 2774 1812 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 2775 - 2815 10.1 + Prom 2781 - 2840 5.9 2 1 Op 2 . + CDS 2860 - 3675 777 ## gi|167754796|ref|ZP_02426923.1| hypothetical protein CLORAM_00300 + Term 3789 - 3844 6.8 + Prom 4152 - 4211 6.1 3 2 Op 1 . + CDS 4311 - 5048 886 ## COG1385 Uncharacterized protein conserved in bacteria 4 2 Op 2 . + CDS 5048 - 5833 703 ## gi|237735919|ref|ZP_04566400.1| predicted protein 5 2 Op 3 . + CDS 5830 - 7116 1240 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 6 2 Op 4 . + CDS 7116 - 8582 1617 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 7 2 Op 5 . + CDS 8572 - 9222 905 ## COG0274 Deoxyribose-phosphate aldolase + Term 9229 - 9285 -0.6 - Term 9140 - 9172 1.4 8 3 Tu 1 . - CDS 9340 - 10020 706 ## COG3711 Transcriptional antiterminator + Prom 10412 - 10471 4.1 9 4 Op 1 11/0.000 + CDS 10526 - 12778 2832 ## COG1882 Pyruvate-formate lyase + Term 12805 - 12840 4.3 10 4 Op 2 . + CDS 12852 - 13601 843 ## COG1180 Pyruvate-formate lyase-activating enzyme + Prom 13688 - 13747 7.1 11 5 Tu 1 . 
+ CDS 13770 - 14648 1277 ## COG1940 Transcriptional regulator/sugar kinase + Term 14728 - 14769 4.1 - Term 14634 - 14677 8.5 12 6 Tu 1 . - CDS 14739 - 16379 1222 ## BDP_1333 arylsulfate sulfotransferase - Prom 16418 - 16477 11.4 + Prom 16950 - 17009 15.4 13 7 Op 1 5/0.000 + CDS 17065 - 17400 297 ## COG1695 Predicted transcriptional regulators 14 7 Op 2 . + CDS 17403 - 18134 869 ## COG4709 Predicted membrane protein 15 7 Op 3 . + CDS 18131 - 18916 924 ## gi|237735931|ref|ZP_04566412.1| predicted protein 16 7 Op 4 . + CDS 18920 - 19099 230 ## COG1983 Putative stress-responsive transcriptional regulator + Term 19102 - 19147 5.3 - Term 19090 - 19134 5.1 17 8 Tu 1 . - CDS 19358 - 20656 1216 ## COG1621 Beta-fructosidases (levanase/invertase) - Prom 20677 - 20736 11.5 - Term 20709 - 20739 1.3 18 9 Op 1 . - CDS 20740 - 21636 943 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase 19 9 Op 2 . - CDS 21641 - 22849 1396 ## COG2195 Di- and tripeptidases - Prom 22871 - 22930 6.5 + Prom 22935 - 22994 7.3 20 10 Op 1 . + CDS 23016 - 23603 764 ## COG3341 Predicted double-stranded RNA/RNA-DNA hybrid binding protein 21 10 Op 2 . + CDS 23596 - 24129 516 ## COG1525 Micrococcal nuclease (thermonuclease) homologs + Term 24246 - 24287 5.1 + Prom 24141 - 24200 9.4 22 11 Op 1 49/0.000 + CDS 24389 - 25306 855 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 23 11 Op 2 44/0.000 + CDS 25318 - 26253 1009 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 24 11 Op 3 44/0.000 + CDS 26269 - 27279 1296 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 25 11 Op 4 2/0.000 + CDS 27283 - 28257 994 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 26 11 Op 5 . + CDS 28275 - 29933 1958 ## COG4166 ABC-type oligopeptide transport system, periplasmic component + Term 29943 - 29977 4.6 27 12 Op 1 . 
+ CDS 29984 - 31216 1417 ## MXAN_7220 putative lipoprotein 28 12 Op 2 . + CDS 31225 - 32439 1469 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 29 13 Tu 1 . - CDS 32464 - 32754 348 ## gi|167754769|ref|ZP_02426896.1| hypothetical protein CLORAM_00273 - Prom 32777 - 32836 7.9 + Prom 32726 - 32785 10.7 30 14 Tu 1 . + CDS 32865 - 33869 926 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 34093 - 34139 11.6 + Prom 34116 - 34175 10.1 31 15 Op 1 . + CDS 34226 - 34768 706 ## COG1045 Serine acetyltransferase 32 15 Op 2 . + CDS 34824 - 35342 407 ## COG0671 Membrane-associated phospholipid phosphatase 33 15 Op 3 . + CDS 35366 - 35551 105 ## gi|237735949|ref|ZP_04566430.1| predicted protein 34 15 Op 4 . + CDS 35564 - 36661 815 ## COG0628 Predicted permease + Prom 36681 - 36740 9.3 35 16 Op 1 . + CDS 36767 - 37369 535 ## gi|237735951|ref|ZP_04566432.1| predicted protein 36 16 Op 2 . + CDS 37362 - 38012 495 ## CPE1501 DNA-binding response regulator, LytTr family + Prom 38078 - 38137 5.0 37 17 Op 1 . + CDS 38273 - 39001 501 ## gi|237735954|ref|ZP_04566435.1| predicted protein 38 17 Op 2 . + CDS 38994 - 40265 649 ## gi|237735955|ref|ZP_04566436.1| predicted protein 39 17 Op 3 2/0.000 + CDS 40246 - 40503 428 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component 40 17 Op 4 . + CDS 40478 - 41005 184 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 41 17 Op 5 . + CDS 41020 - 41565 398 ## gi|167754758|ref|ZP_02426885.1| hypothetical protein CLORAM_00262 - Term 41540 - 41577 6.0 42 18 Tu 1 . 
- CDS 41583 - 42842 1473 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Prom 42881 - 42940 8.3 + Prom 42918 - 42977 5.2 43 19 Op 1 3/0.000 + CDS 43102 - 44115 1274 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 44 19 Op 2 2/0.000 + CDS 44118 - 45272 934 ## COG0477 Permeases of the major facilitator superfamily 45 19 Op 3 . + CDS 45326 - 45715 483 ## COG0789 Predicted transcriptional regulators + Term 45793 - 45836 7.8 + Prom 45795 - 45854 8.4 46 20 Tu 1 . + CDS 45945 - 46244 407 ## gi|167754753|ref|ZP_02426880.1| hypothetical protein CLORAM_00257 + Term 46394 - 46443 2.1 Predicted protein(s) >gi|223714192|gb|ACDT01000023.1| GENE 1 207 - 2774 1812 855 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 848 1 804 815 702 44 0.0 MNIEKWTAKMQEAIQKAITMASEMGHQVVDVEHFLLALLEDNAGILYRVLSKANVNIADL QNKLNVRLQGKPIVSNVDINSIRISYDLNQLLSLADKQMHNFKDEYLSVEHLIMALFELN SSWLKEILNGYNLNKKNVKKIIDEMRGGNMVDNQNPENQYEVLEKYGRDLTKDVADGKLD PVIGRDEEIRRVIQILSRKTKNNPILIGEPGVGKTAIVEGLAWRIFKNDVPVSLQNKTLY ELDLGALVAGAKYRGEFEERLKAVLNEIKKAEGDIILFIDEIHQLVGAGKTDGAMDAANL LKPMLARGELHCIGATTLDEYRMYIEKDAALERRFQKVQVDEPDQDDTIAILRGLKDSFE SHHGVQITDSAIIGAVNMSQRYITDRFLPDKAIDLIDEACASIRMEIDSLPEELDVITRE KNRLEMERISIEKEDKNEDNEKRLNEIKSRIASLDEQVKGLTDKWQEEKKSLDHIKELKD QKVRLEALTDKYQTEGNLEEASRIKYQTLPQIEKEIKEFEASKKEDDLLQEKVTVDTVSE VIARWTGIPMNKLMESEREKLLKLDTALKVRVIGQDEAIEKVSDAILRSRAGINDENRPI GSFLFLGPTGVGKTEVAKSLAEQLFDTERNIVRIDMSEYMEKFSVSRLVGAPPGYVGYEE GGQLTEAVRRHPYSIVLLDEIEKAHPEVFNILLQVLDDGRITDSKGNLVSFKNTIIIMTS NIGSNYLLEGNNSDTRAAVDNELKAHFKPEFLNRIDEIVYFNSLDQAVVNKIIDKFIKQL SDRLSDKKITITVSDRAKAVISEQGYDVTFGARPLKRFIQSRIETLVAREMIKGDIKNGS HVLVDFDDDFKLAIQ >gi|223714192|gb|ACDT01000023.1| GENE 2 2860 - 3675 777 271 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754796|ref|ZP_02426923.1| ## NR: gi|167754796|ref|ZP_02426923.1| 
hypothetical protein CLORAM_00300 [Clostridium ramosum DSM 1402] # 1 271 16 286 286 421 100.0 1e-116 MERFQRVVMKFISPQGNLFRYSLSFCVLLALLPALIIFLKVFQTDILSAPNLINLLYDYV PETILAPFVEYVMSQEYTTFVSLIISMGFSIYLASNAFYSFMLISMTDEGFDTYAILVRI KAIVVFVLLVIGLVGLTLFNYLVPINSTIVMLAGLFIVFYCFYRYLSFEKKPLSYGLIGA AFTSACIVLIGVFFFYIVNHYTRYNALYGPLASLVIMLISIYLISSVIYFGYCLNHEYGK SFSKRTYKHQKFYDYGNNFLDKIVEKFGKLI >gi|223714192|gb|ACDT01000023.1| GENE 3 4311 - 5048 886 245 aa, chain + ## HITS:1 COG:BH1350 KEGG:ns NR:ns ## COG: BH1350 COG1385 # Protein_GI_number: 15613913 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 243 1 247 250 178 41.0 7e-45 MQRYFINEDLTSKQQLILPDDDLHHVKKVMRNINGDEIICIDNAGQVYYGVIEDIEIGLV KIDKRLEENNELDVEITLVYALPKGDKFELVLQKATELGVKRIVPIQTKRCVVKMDEQKF AKKITRYRKILKEAAEQSWRNFIPEITNVIKLEQLDQYLGDHNLVAFEELAKQGEHMVLK QTLDQLSSGDKITIVVGSEGGFELDEIEMMNKLGIKACSLGKRILRSETAPLYFLSVIGY AREIG >gi|223714192|gb|ACDT01000023.1| GENE 4 5048 - 5833 703 261 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735919|ref|ZP_04566400.1| ## NR: gi|237735919|ref|ZP_04566400.1| predicted protein [Mollicutes bacterium D7] # 1 246 1 246 261 422 100.0 1e-116 MSILDLLNKYRYEIKLNKTGLYNFVTGGNKELAGEGFNYKLKTPEDLFAYEVILNGIRTK IAIDECYNKFMAANYDIFEYVNYEEKRKMFNHDEEKIINNFPSFQDHQSGEIIYIPYLEP FINRYYANDYQLVTLKQHQVYLNSYAKTIDKVIELYGIQPYNSQFSSLQLVGQDDESYYF YHDDFKTVYQFNGQNYRIANEINLIDRYTKEYPNLNLIKEAMVKLANSQDDEEILEFLYE NKFVGDKTYKKLIKKLKKVSK >gi|223714192|gb|ACDT01000023.1| GENE 5 5830 - 7116 1240 428 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 
168] # 1 418 1 425 451 482 55 1e-135 MKTVAFHTLGCKVNTYESNAMLKIFNEAGYQEVDFKQVADVYVINTCTVTNTGDSKSRQM IRKAIRKNPKATICVVGCYSQTAPEEIEKIEGVGVVLGTQYRSDIVKYVDEHLETGEMVI KVDNVMNLRKFEDLNIDRFKNTRAFLKIQDGCNNFCTYCIIPYARGRVRSRQKESVLNQA QKLVDNGYVEIVLTGIHTAGYGEDLDDYSFYELLVDLVKIKGLKRLRISSIETSQISDEI IDLIGSNEIIVDHLHVPLQAGSDATLKRMNRKYTTAEYLEKINKIRSYLPNIAFTTDVIV GFPGETDEEFEETYNFIKQVNYSELHVFPYSPRKNTPAAKMKDQVNDQIKHERANRLLQL SKELNHEFALKQIGKTLKVLFEKRDGEYLIGHAGDYLKVKVKTADNLIGEIVTIKIDKYD EILEGRVV >gi|223714192|gb|ACDT01000023.1| GENE 6 7116 - 8582 1617 488 aa, chain + ## HITS:1 COG:BS_murE KEGG:ns NR:ns ## COG: BS_murE COG0769 # Protein_GI_number: 16078582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Bacillus subtilis # 13 475 21 484 494 262 34.0 1e-69 MKRLNELFDTDNDMKIYSIHSDSRYVKPYSIFFCIEGLSVDGHRYVEDAVFQGAKVIVHS KNLNYYHDKIIYFRVENTLVELNRVSNLFYDCPSNKMKIIGVTGSSGKTVVASMIKDAMA KYCSSGYIGTISLEYNGRKEECPYTTPEALYLQRKLFEMNRAGVKVVAMEASSHGLALGR VDGINFNIAVMTNIGAEHLDFHGTKEQYVLAKQKLFEMIKPTGWAILNSDDLNFMTLKNN TQGRILTYGIEIESDIMAKNIQLHLDHSEFDISFKGNTFHVNSPVIAMFNIYNVLALTGV LIAMGCDDKMVIDAVENVKPIEGRMELIQTEQQFAVIIDYCQHISNYEAIFEYVDGVRQG HGRIIAVLGAPGKRNYKLRKELGQLANRYLDHVILTQLDDRGEDVYEICKTIQKEIVDIN SVIIESRQIAIEQAIEIACKDDIILILGKGHEKFISLDVGQVDYPGDSQIVKEALERIYY KGEDEDEL >gi|223714192|gb|ACDT01000023.1| GENE 7 8572 - 9222 905 216 aa, chain + ## HITS:1 COG:SP0843 KEGG:ns NR:ns ## COG: SP0843 COG0274 # Protein_GI_number: 15900730 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Streptococcus pneumoniae TIGR4 # 1 212 1 212 220 243 65.0 1e-64 MNYNKMIDHTVLKADATKAMVAKIIDEAKEYNFASVCVNPTWVAYCAQALADSDVKVCTV IGFPLGANTSAVKAFETKDAIANGADEIDMVINIGALKDGNTDLVFNDIKAVVDAAAGKC VKVIIETCLLTDEEIVTVCKLAKEAQATFVKTSTGFSTGGATPEAVSLMKQTVGDDLEVK ASGGVRTIEDMEKVVAAGATRIGTSAGCKLVKNNNS >gi|223714192|gb|ACDT01000023.1| GENE 8 9340 - 10020 706 226 aa, chain - ## HITS:1 COG:SA0321_1 KEGG:ns NR:ns ## COG: SA0321_1 COG3711 # 
Protein_GI_number: 15926034 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Staphylococcus aureus N315 # 13 224 183 393 504 78 24.0 1e-14 MDLTFNTNTDFDQTKVLITNIIKNTADDYDLFLSENSINNLSVHLTIAIIRIKSNNYIPM SNSQIASYKQDHSYPYAKMLCDRLAREFEIDFPEAEISLVSMYLSKNQKLDLEINSGFDL LDDQVFKILRETMYTIYQDYHKDFRNDDKLFVAIGLHLEPALERLANGQLVENPLKEKII ERHQEEFNYSKILNDVVRHDLNLSFDDDELAFIALHFVVANNRIDE >gi|223714192|gb|ACDT01000023.1| GENE 9 10526 - 12778 2832 750 aa, chain + ## HITS:1 COG:lin1443 KEGG:ns NR:ns ## COG: lin1443 COG1882 # Protein_GI_number: 16800511 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Listeria innocua # 5 750 3 743 743 1028 64.0 0 MLEKDQWNGFKGRLWKEEVNVRDFIQNNYKPYDGDESFLAGPTEATDKLWGKLQELQKEE RAKGGVLDMETHVVSGLTAYGAGYIDEELKDLEAVVGLQTDKPLKRAFMPYGGIKMAQQA CETYGYTPDPELGRIFTEYHKTHNQGVFDVYTPEIRLARRNKIITGLPDTYGRGRIVGDY RRVALYGIDFLIEEKQNDLAHCGSGSMRDDTIRQREELAEQIKALNGMKKMAEAYGFDIS EPAKNAKEAVQWLYFGYLAAIKTQNGAAMSVGRVSTFLDIYLERDLEAGIITEAEAQELI DHLVMKCRMVKFARIPSYNQLFSGDPVWATLEVAGLGTDGRSMVTKNDFRFLHTLENMGP SPEPNLTVLYTAALPENFKKYAAKISIDTSSVQYENDDVMRPVWGDDYSICCCVSATQTG KEMQFFGARANLAKCLLYAINGGIDEKTKTQVAPKYRPITSEYLDYEEVMERYDQMMEWL ADIYVNTLNLIQYMHDKYYYEAAEMALIDTDVRRTFATGIAGFSHVVDSLSAIKYAKVKT VRDEDGIAIDYEIEGDFPRYGNDDDRADDIAVWLLKEFLNKLKKHHTYRDSEPTTSILTI TSNVVYGKATGSLPDGRKAGEPLSPGANPSYGAEQSGLLASLNSVAKLPYEWALDGISNT QTISPDTLGHDEEERKTNLVQVMDGYFAQGAHHLNVNVFGTEKLIDAMEHPEKEEYANFT IRVSGYAVKFIDLTREQQMDVISRTCHKSM >gi|223714192|gb|ACDT01000023.1| GENE 10 12852 - 13601 843 249 aa, chain + ## HITS:1 COG:SP1976 KEGG:ns NR:ns ## COG: SP1976 COG1180 # Protein_GI_number: 15901799 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pneumoniae TIGR4 # 5 249 11 256 264 309 56.0 3e-84 METAIKGFVHSIESFGSVDGPGIRYIIFLHGCPLRCKFCHNPDTWANAKSTMEMTPQEAI AKALKYKSYWGNDGGITVSGGEPLLQIDFLIELFKLAKKEGINTCIDTSGGNFTREEPFF 
SKFNELMKYTDLLLVDLKHIDSTQHQELTGKGNDNILDMAKYLSTINKPVWIRHVLVPGI SDKDEYLTELDKFISTLNNVKKVEVLPYHTLGVFKWEELGIPYQLDGINPPNQERIDNAN KLLHTDKYR >gi|223714192|gb|ACDT01000023.1| GENE 11 13770 - 14648 1277 292 aa, chain + ## HITS:1 COG:BS_ydhR KEGG:ns NR:ns ## COG: BS_ydhR COG1940 # Protein_GI_number: 16077653 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Bacillus subtilis # 3 291 2 289 299 341 57.0 9e-94 MKLGAIEAGGTKFVVCIGDEFGNVIERDSFPTETPEETMANIFKFFDGKDIEALGVGCFG PIDPDLNSPTYGYITTTPKPGWGNFNIMGALKERYDIPMGFDTDVNGAALGEAYFGAAKG LDSALYMTIGTGIGCGAIVEGNLVHGLLHPEMGHMNMIVREDDTYAGKCPFHGTCFEGLA AGPAIEARWGKKGFELPADHPAWDLEAYYIGQALATYVLVISPKKIILGGGVSKQKQMFP LIHKYLREFLNGYIQKDEILTDKIDDYIVSPALGDNAGVCGALALAKQALEK >gi|223714192|gb|ACDT01000023.1| GENE 12 14739 - 16379 1222 546 aa, chain - ## HITS:1 COG:no KEGG:BDP_1333 NR:ns ## KEGG: BDP_1333 # Name: not_defined # Def: arylsulfate sulfotransferase # Organism: B.dentium # Pathway: not_defined # 52 545 56 554 555 326 40.0 2e-87 MKKVKNILIGLLVFSIIALCFIFNGPLVTEASSDDENSRPYTQLKNATSDPETIYSDKYQ KNIYQQIKKIKQINNYTFNNPLLIENPYGTNTTGIYMYFTTDYQCVATYTISCEGYEDFT RTLNTNSLSGFSKEHEYLLIGSIPDAKNTITVTLTDTNGKQVDSITWSYQAPKLSGGDQY LSVATESYDTTSALSNGLYTVLGNDVTEENDEQTYMRLYDNDGVIRSEIPIISYRSHRIL FEDNTMYLSVSSTKIVGVDQTGYVSTIYSTGDYKLHHDYIFDSQNNLIVLASKKNAVTSE DKIIMIDHNTKAITELVDLIELFPDYYKTTAKPDSADDLDWMHINSLELVGKDSLIISSR ETSTIMKLDNIYSNPTVDYMIGSNNFWQESGYDDLLLTKTSDFSLQAGQHCVTYVEDATL ASGQYYLYLYNNNLAVSTTRSDYNWKEDDNYQNVSYDTKNGISYYYKYLIDENNRTVSLV NTIPVDYSGYVSSVQELDNNIIIDSGMAMSWGEYDQFGTLIKKFTTTGGKFVYRVFKYDY NDYWFQ >gi|223714192|gb|ACDT01000023.1| GENE 13 17065 - 17400 297 111 aa, chain + ## HITS:1 COG:SPy2172 KEGG:ns NR:ns ## COG: SPy2172 COG1695 # Protein_GI_number: 15675909 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 7 109 1 101 108 90 43.0 9e-19 MKGAGLMIFNTGAALLDAIVLSVVSKEAEGTYGYKITQDVRIALDVSESTLYPVLRRLLK 
DNCLETYDQEYGGRNRRYYKITSQGQAQLEMYKGEWHEYVRKINKIIEGGK >gi|223714192|gb|ACDT01000023.1| GENE 14 17403 - 18134 869 243 aa, chain + ## HITS:1 COG:SPy2173 KEGG:ns NR:ns ## COG: SPy2173 COG4709 # Protein_GI_number: 15675910 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 1 58 1 58 195 60 46.0 3e-09 MNRKKFIEELAFLLQDIEDAEREEAMQYYEDYFDEAGPENEQQVINDLGSPERVAAIIKA GLDNQFDQDIEYSEKGMDNSNYKQSREIIDAKIISEEETAADDNNDYQNNKRHKNNFRGN SDRNRILLIFIIIGAVFLALPVGGGILGLGLGFFGAVFGLGVGILCGGAACLIGAIVCFV KAFMIIAAYPGAGLITMAAGCALIALAFVFFWLAKGLIKIIPAVIRGIVDFCQNIFNRVG DRR >gi|223714192|gb|ACDT01000023.1| GENE 15 18131 - 18916 924 261 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735931|ref|ZP_04566412.1| ## NR: gi|237735931|ref|ZP_04566412.1| predicted protein [Mollicutes bacterium D7] # 1 261 1 261 261 474 100.0 1e-132 MKKIGYVALGSLLLAIVLFISGTMIGGFSELETLYDKGDFTISLPITKTMDVTKEFTDIK NLEIQAEAGTVELIEYEGSTIKVEAKNVSKKIKLYQEQNTLIVKDSFRFWHLINVTDITT RIKIYIPNSYEFNKVELEVDAGELIVPNLKANDVEIDVDAGSFKAENIIASYTKVDVDAG DARINLLNSYRSEFNCDAGDIDATMVGSESDYSYEVDSDVGDISIGSYRSDGLSDEYSHS GGQRKIEADCNVGSIRIKMEV >gi|223714192|gb|ACDT01000023.1| GENE 16 18920 - 19099 230 59 aa, chain + ## HITS:1 COG:MA4106 KEGG:ns NR:ns ## COG: MA4106 COG1983 # Protein_GI_number: 20092899 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 1 57 1 56 59 68 56.0 3e-12 MKRIYRSRQDRMVCGVCGGIAEYFDLDPSLVRLGWIIFSAMGGSGFIAYIIAAIVIPSR >gi|223714192|gb|ACDT01000023.1| GENE 17 19358 - 20656 1216 432 aa, chain - ## HITS:1 COG:SP1795 KEGG:ns NR:ns ## COG: SP1795 COG1621 # Protein_GI_number: 15901624 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Streptococcus pneumoniae TIGR4 # 3 415 28 428 439 207 32.0 4e-53 MHPKYHLTAPKHWINDPIGFIYYQNNYHLFYQHFPYENKWGTMHWGHATSKDLVNWVDQG 
IALYPSKDFDANGCFSGSALEVDKKMYLYYTSVVYQKVNPENIHKTAPGYDFYSSQSMII SDDGFNFDNLNQKQLIIPVFKDGEVGHPVHTRDPKVWKFQDTYYMVLASKYLDDNNDYNG QLLFYTSKNATEWHYANNYRGIKCGDMWECPDIFTVAHQDILIMSPERTNDSGYPSHARI TTANFDHQNCQLEITGELNYLDYGLDIYAPQTTIDEAGRRIYVGWMRMPVADEANWIGLI TYPRVITYQNDAIFTNIHPSVDSLFKKPATEFKATQACKIVTNLKTGDFINIGGYLIKYD DCLCIDRSNVFKSDAALKELRSPKIDHCNLNIYYDQDIIEIYINNGQYVLSNIVYEMADT IKSNTDFEIFTD >gi|223714192|gb|ACDT01000023.1| GENE 18 20740 - 21636 943 298 aa, chain - ## HITS:1 COG:BH0676 KEGG:ns NR:ns ## COG: BH0676 COG1597 # Protein_GI_number: 15613239 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Bacillus halodurans # 1 298 1 293 295 200 35.0 2e-51 MKRCLFIINPSSGQRTIQNTLDKLIGQLTLRQLINHIDVFYTLKKDDAYFKALNADEQDY DFITVVGGDGTINEVVSGMVASDKKIPLCILAAGTVNDFANYLNLPSNIEGVCNLINNFK TVCCDVGKINERYFMNVAAGGMFSDVSFTVSKADKKRLGPLAYYLNGIANLPSQLNTNIN LKIVLDDANILETEAKMFMVTNTNRVGGFENIIPYADIQDGMLDLIVIKKCSVTDLVALS KDYLLKKHANSPFISYVQAKKIEIYSQQKVVIDIDGEEGSPLPVTIEAISQAINILVP >gi|223714192|gb|ACDT01000023.1| GENE 19 21641 - 22849 1396 402 aa, chain - ## HITS:1 COG:CAC0476 KEGG:ns NR:ns ## COG: CAC0476 COG2195 # Protein_GI_number: 15893767 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Clostridium acetobutylicum # 2 401 3 403 408 388 49.0 1e-107 MDIEERFLKYVSIDTQSDEYSDTTPSTLKQLDLGKELVKEMLELGITDAHLDEYGIVYGT IKGNGGTGDIIGFIAHMDTSPDASGTNIKPQKINNYDGSIIEINKELGLSLDPEEFVSLK KMVGHDLITTDGTTLLGADDKAGVAIIMDLANYLYEHPEVKHNDIKIAFTPDEEVGRGAD NFDVNKFGAKYAYTLDGGDIEEYNYENFNAYSAQVEITGKSIHPGNAKDKMVSAINVAIE FENMLPAQQKAYFTDGYDGFNHLHHLEGGCEKATLEYIIRNHDLTLAKKQINDFKRIKKY LDEKYGYELIKLDIKESYLNMAEVIKENYYIIERLEKAMNSVGIQGFASPIRGGTDGARL TFMGLPCPNIGTGGDNFHGPFEFVSLTMMKQSVEILKELIKE >gi|223714192|gb|ACDT01000023.1| GENE 20 23016 - 23603 764 195 aa, chain + ## HITS:1 COG:TM1296 KEGG:ns NR:ns ## COG: TM1296 COG3341 # Protein_GI_number: 15644051 # Func_class: R 
General function prediction only # Function: Predicted double-stranded RNA/RNA-DNA hybrid binding protein # Organism: Thermotoga maritima # 3 190 6 195 223 124 43.0 1e-28 MAKYYAVKNGRKPGIYSSWDECKKQVEKFKGAIYKSFTSLEDAKVFIKEEKIEFDGGLIA YVDGSYNIKTHEYGFGCVIIEEQKVIKEMYGKGADENYASMRNVAGEILGSICAMEFAKN NGYKQICIYFDYEGIEKWANGMWKANKPGTQEYQRKVKEYRQDLKIAFVKVLAHSGDFYN ERADVLAKKAVGING >gi|223714192|gb|ACDT01000023.1| GENE 21 23596 - 24129 516 177 aa, chain + ## HITS:1 COG:BS_yokF KEGG:ns NR:ns ## COG: BS_yokF COG1525 # Protein_GI_number: 16079220 # Func_class: L Replication, recombination and repair # Function: Micrococcal nuclease (thermonuclease) homologs # Organism: Bacillus subtilis # 47 177 69 200 296 98 44.0 7e-21 MAKKIKLTKTQQKKLLRSWLGIIVILAVIGYNFYQQNKSIPTGERFEVTLDRCVDGDTAW FNVDGESTKVRFLYIDTPESTKEIEPYGKEASDYTKTQLTNAAKIELELNVDGDSKDKYG RLLAWVFVDGELLQEQIAREGLVEKFYDYGYDYTYKNEIIEAANSAKSMRKGIYSEN >gi|223714192|gb|ACDT01000023.1| GENE 22 24389 - 25306 855 305 aa, chain + ## HITS:1 COG:CAC3638 KEGG:ns NR:ns ## COG: CAC3638 COG0601 # Protein_GI_number: 15896872 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Clostridium acetobutylicum # 1 305 1 305 306 268 45.0 1e-71 MFRYVLKRVLIGIVTLFLLSSATFFLMKATPGSPISGEKYKNEAAREIAIKKYNLDKPVF EQYTLYMNDLIHGDLGESIVRPGRDVGTTISQSFPVTARLGAIAFASALVIGVALGTAAA LSKRKWLNNLCMFIATIGVSVPSFLIGVLLMIVLGVKLKLFPFVGLTSPAHYVLPALSLA FYPIAMVCRLTRSSMLEVMRQDYIILARSKGTPYKKVVIKHALKNALIPVITYAGPAFAY MLTGSFVIETLFSVPGIGREMVSSIQTRDYSMIMGLTIFLGFLVISFNIITDLLSAVVDP RIKLK >gi|223714192|gb|ACDT01000023.1| GENE 23 25318 - 26253 1009 311 aa, chain + ## HITS:1 COG:CAC3637 KEGG:ns NR:ns ## COG: CAC3637 COG1173 # Protein_GI_number: 15896871 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Clostridium 
acetobutylicum # 11 311 4 304 305 280 47.0 3e-75 MEENKVMNIELTPEMFEKLDDSKKNSEKIAYESKTYIADAWNRFKKNKLALIGLCFLLVM AIGCIFVPIFSSYTYDGQNMANTFAGPSMDHIFGTDRFGRDVLVRIMYGGRISLAVGFSA AIISLCVGVTYGAIAGYAGGKVDMIMMRIVDALYSIPDMLYLIMITVVLGSNFQSIIIGI CISSWMGMARQVRAQVMTLKEQEFSLAAFVLGASKKRILLKHLIINSMGPIIVSFTMLVP SAIFYEATLGFLGIGLSAPQASWGTLANDARAMISSQPLQVVWPILAICLTMLALNFIGD GLGDALDPKKK >gi|223714192|gb|ACDT01000023.1| GENE 24 26269 - 27279 1296 336 aa, chain + ## HITS:1 COG:CAC3642 KEGG:ns NR:ns ## COG: CAC3642 COG0444 # Protein_GI_number: 15896875 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Clostridium acetobutylicum # 2 329 3 322 340 364 54.0 1e-100 MKMLEVNDLTTEFSTENGTVKAVRDVSFYVDKGEVLGVVGESGSGKSQTMFSIMGLLAGN GSVTNGSIKIDGEEIAPTHFDSRKKYEETMDRIRGNDMAMIFQDPMTFLNPTLRIETQLI EPIINHNPKISKKEARDKAIDLMRKVGIPSPEDRIRQYPHQFSGGMRQRIIIAIALACDP KVIIADEPTTALDVTIQAQVLDLIGSLKDEIDSSIIMITHDLGVVASICDRIAIMYGGKI VETGTTDEIFYNPQHPYTKGLLSCIANPEDLEKKELHPIPGSPPDLLNISEGCPFLDRCE NAMNICVDHMDEYRNFSSTHCSSCWLNNPYLKRKGE >gi|223714192|gb|ACDT01000023.1| GENE 25 27283 - 28257 994 324 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 324 8 329 329 387 59 1e-107 MSEPILKVKGLKVHFPVSGGFLAKKQVVKAVDGVDFEIAPGETFGLVGESGCGKSTTGRA LVKIYDPTEGQVIFEGEDITKIKGAKLKEFRRDMQMIFQDPYASLNPRMTVGEIIREPMD IHGVYDTKEEREKRVRELLEIVGLKPDHIRRYPHEFSGGQRQRIGIARTLALNPKFIVCD EPISALDVSIQAQVINLLEKIQNEMGIAYLFIAHDLSMVKHISDRIGVMYLGNMVEIGDA DDIYHEPLHPYTQALLSSVPIPDPKVARNKKRIVLEGELPSPINPPSGCVFRTRCPKATE RCAQEKPALKTVGNRQVACFLYEK >gi|223714192|gb|ACDT01000023.1| GENE 26 28275 - 29933 1958 552 aa, chain + ## HITS:1 COG:lin0200 KEGG:ns NR:ns ## COG: lin0200 COG4166 # Protein_GI_number: 16799277 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Listeria innocua 
# 1 537 1 534 549 179 29.0 9e-45 MKKLLAYGLSFLMALTCLTGCGGGSEADLRVAVGKDTADLDPAIVDDSVTANILAQVYQG LYTLDTDGNVITNLATDMPEISEDGLTYTIKISGDTKWSDGEKVKASDFVYAWKRAVTTG GYYTQFIWQYIDGTSHMVKDKKTGKETDTPYTTMDELKDFGATALDDTTIQIKLKSKCAY FTSLLTNTVFYPVREDYLKENAEDITKSSWGNNSDVPYNGAFKVTSVNTKDEVLLEKNDE YYDAKNVTLNSISFKVMSDMDAQTSAFEKGDLDFATSCNIDTINSKDELKKQSWKIDPFV CNYYVLINAGDENTREELKDADIRNAIGLSINRANVLKALGYGDNAYELNGLIPKGIPGA TGDFREEQDAVEKLATYNLDEAKAIMTSKGYSESNMLKLTYTYNDNTMHKNVAQAMQASM KEAYIDLTLSATEGEAFFDARDKGDFEICRHAMTADFIDPMAYLSMYVGSTTPGNTVDDA KYEELVAAANAIDDKTERMNALHAAENYLVGEQHYVIPLFGYAEPYLMSSKVTGVTHSPE GHYQLAYAKVEK >gi|223714192|gb|ACDT01000023.1| GENE 27 29984 - 31216 1417 410 aa, chain + ## HITS:1 COG:no KEGG:MXAN_7220 NR:ns ## KEGG: MXAN_7220 # Name: not_defined # Def: putative lipoprotein # Organism: M.xanthus # Pathway: not_defined # 6 398 45 455 468 289 35.0 1e-76 MKEIRSVKEQYQIKDACLKERLELILPKVMKKYDVDMWISASKEYNEDPLFHAITPANYP TARRISLFAFVKEGDNIHRFSLCMPSEELAPYYTSYWTDFNNEDQMACLNRLCKEYDPQT IAVNVSNNFAFSDGLTQGLYEMITAKMAPEYAERIVRNDMLAIKLMELRTPLELELHPEV MEVAFSVIEEAFSSKSIIPGVTTCEDLQWLMMQRVKDLGLDYWFEPTVDLQRPGLDNPRY FGVIEKGDLLHCDFGIKYLNICTDTQRLAYIAKDDEESIPQELLDGMKVNDRFQDIVAEN MADGKSGNDVLIDSLKQGKDEGIEAVLYSHPCNIYGHGPGTTIGLWNNQKAIPVKGDVLM SYDTTYALELNTKSKAFGQDYYFYTEETVAFTRDGLIYLHPGRKNIYFIK >gi|223714192|gb|ACDT01000023.1| GENE 28 31225 - 32439 1469 404 aa, chain + ## HITS:1 COG:YPO0623 KEGG:ns NR:ns ## COG: YPO0623 COG0436 # Protein_GI_number: 16120949 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Yersinia pestis # 1 403 4 407 410 412 48.0 1e-115 MKNRMKKEFLGFEGGLFSEVEKADVGDSFAKLAENGVALMGWADPFMPDFSIPEHIMEKT LEAIKSPVAAHYTAPTGNMELRAMICQKAKKMYGVDLNPARNVIITPGSDSALFFAMYPF LEKGDEVIIPCPSYPNNMQNITMMQATPVVLELNEAEDYQINHQALESLVSNKTKMIVLT HPNNPTTTVYNHESLMAIREVVLKHDLVLVCDQAFEDFTFENEFIAPMALDGMFEHTVTV CSISKGMGLSGYRVGYIMASDVIMDVMYGCAVSVIGATNTVSQIAAIEAFKHPEFMDEFN RAYDIRRHQAYNILNTVPGVSMELPASGFLAWVDVSALGDSSAICKRLISEAKVAVNDGI 
NYGPGGAGHLRIVLGVYRDDQQVIEALTRMAATLDAIAKEKGIK >gi|223714192|gb|ACDT01000023.1| GENE 29 32464 - 32754 348 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754769|ref|ZP_02426896.1| ## NR: gi|167754769|ref|ZP_02426896.1| hypothetical protein CLORAM_00273 [Clostridium ramosum DSM 1402] # 1 96 1 96 96 145 100.0 1e-33 MYFVINCFISAVCIISLILFTEVFPKWTKLLVGAVQVVTTSYAIAILIKKIGEWTSVWSG DYKYINSNFLWILPIVAVITIAVAYICNKKSKKAAK >gi|223714192|gb|ACDT01000023.1| GENE 30 32865 - 33869 926 334 aa, chain + ## HITS:1 COG:CAC1488 KEGG:ns NR:ns ## COG: CAC1488 COG0463 # Protein_GI_number: 15894767 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 5 333 1 336 338 214 35.0 2e-55 MEAIMKLLTICIPCYNAIEYMHKALGSCLLLKDELEVIIIDKNSTDETYEVAMEYQNEYP DTFKVIKNTENIDDIKCAYQYTTGLYFKLLNSYDWFDQASLVRVIETLKDIIRVQANLDV LVTDYFCCYGKRPRKVSYRSLLPSDKIFGWHEIKHFKKQQYLLTPALIIKTKIIKEVIDN FPLNATFYMEMFAYAPLPYIRSFCYIDMPLYCFGRNPFENVINNDIFIPALLDTAREYID YYDIYTLRSRKQRHYMIKYLSMITLITCALLSHDGSPESIQDKDDLWSYLKYTKPKLYKE LRKTTYGRLFSIEGRLSKNIIDKAYNVILKLYGI >gi|223714192|gb|ACDT01000023.1| GENE 31 34226 - 34768 706 180 aa, chain + ## HITS:1 COG:BH0110 KEGG:ns NR:ns ## COG: BH0110 COG1045 # Protein_GI_number: 15612673 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Bacillus halodurans # 7 180 12 182 229 202 58.0 2e-52 MENRSLIKKVLETDPAARYALNVIINYPGVHAMFCYRINSFLWKKLHLKFLARMFSQIAR FFTGIEIHPGATIGKRLFIDHGMGVVIGETTIIGDDCVLYQGVTLGGVGTGEHKVKRHPT LLNNVMVSAGAKVIGDVTIGNNSIIGAQTVVLKDVPDNCTVVGVPAFIVKENGVKVKKEL >gi|223714192|gb|ACDT01000023.1| GENE 32 34824 - 35342 407 172 aa, chain + ## HITS:1 COG:SA1250 KEGG:ns NR:ns ## COG: SA1250 COG0671 # Protein_GI_number: 15926998 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Staphylococcus aureus N315 # 6 155 30 185 204 73 34.0 2e-13 
MTGIIKPIDNFVYQLLQILINERVTPIVIMVTHLGSFIGIIGAICVAFIISKRIALVCLA ASFIQQLLNRIIKFIVKRPRPSVVHLVNETNYSFPSGHAMAITCLYGLFIYYLYHSKLKY RKLLISGCIIIILFVTLSRIYLGVHYFSDVFGGVMLSLSLIMYMSNIPSFNA >gi|223714192|gb|ACDT01000023.1| GENE 33 35366 - 35551 105 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735949|ref|ZP_04566430.1| ## NR: gi|237735949|ref|ZP_04566430.1| predicted protein [Mollicutes bacterium D7] # 1 61 1 61 61 107 100.0 2e-22 MEKLSEYGLTLVVINDDFHLYDLVSKEVYEVMTLSYIEQLLDKWNKKGKCLHNYGIIRIT I >gi|223714192|gb|ACDT01000023.1| GENE 34 35564 - 36661 815 365 aa, chain + ## HITS:1 COG:lin0908 KEGG:ns NR:ns ## COG: lin0908 COG0628 # Protein_GI_number: 16799980 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Listeria innocua # 12 349 19 347 378 109 27.0 8e-24 MQKDKLKQYSGLLIIGILLIVVYKVFEPSWFSVLLNACFPIIIGGVLAYFLEPLVQMVTK LFENCQSSFLKKHRRVLSVLLVSLAVILMIILLLTWLIPMVMDYAVEFAKNIDTYVKSFE ASINSSFEDPNLAGTIIQVEQTFVDSIKSLSANDFMEVIAFAGKTGSTLLTILMGLIFCP YILIEAEKLARIFDRFMLCFISQENLNLIHGYAIKSHRIFGNFIYGKFIDSVIIGLIALV GFGLMGLPFFPLLAFVVFITNMIPYFGPFIGGVPVVFIVLLTNGIMPAIWTALFIFALQQ FDGLILGPRILGDSVGISPFWIICSITIFGSLFGFLGMFLGVPLICVIRMFFNDFLEYRK NRIKD >gi|223714192|gb|ACDT01000023.1| GENE 35 36767 - 37369 535 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735951|ref|ZP_04566432.1| ## NR: gi|237735951|ref|ZP_04566432.1| predicted protein [Mollicutes bacterium D7] # 1 200 3 202 202 393 99.0 1e-108 MKKYVIGKPRRILAIVNMLIWVPFATMLIVFEVQYHASIVNGIFLGLVILIINFFIAGPI IYSSGMWWSIDEEVFKYSSFRKFTARIKAFYLPKQQNSYFLALKVQQMDRIDLGWHDVPI FPFGLIGHPITFTITMKDGTTVELEALYTKESQKFVMACKHIESLGIMLNDPYNLLEVIA NPKRMVSQYIDEIESGVHHD >gi|223714192|gb|ACDT01000023.1| GENE 36 37362 - 38012 495 216 aa, chain + ## HITS:1 COG:no KEGG:CPE1501 NR:ns ## KEGG: CPE1501 # Name: virR # Def: DNA-binding response regulator, LytTr family # Organism: C.perfringens # Pathway: not_defined # 30 210 44 223 236 66 26.0 6e-10 MIRIAMAIKNQDIKNILKIELLKLCDDLQIMDYQEAVNYDVYMMEIKNIAEIENLKRIRL 
EDYQQIIVLIGLEDLELIKKGYELEAFNYIRTNKFIEDIDNLMSQLGDVMVKRFKTYQIQ NSGTISKVRISSINYVESFRHYIHIHANSGEYIERKNISEFVCQMKNENFIQIHKSYAVN LHAVIKIAANCVELVDGSILPIGNSFKKQLIELFDK >gi|223714192|gb|ACDT01000023.1| GENE 37 38273 - 39001 501 242 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735954|ref|ZP_04566435.1| ## NR: gi|237735954|ref|ZP_04566435.1| predicted protein [Mollicutes bacterium D7] # 1 242 38 279 279 439 100.0 1e-122 MNSELSIRYVLERKQEYIEQLQIELKDIDEKIKAYIQRHKVEVIFDNSINHEYEGLLINQ NEICFNEVKINCNDIMQIDLSMCSSKGCDGKAYFFLNYYVDLDIHTTSDTYSFQIMNNSQ INNLFNYIVSLSVPINDPLGLIRIYRDRNDPVMLNKYLNKHFSSWSQEYNLDNPRENYIS IMKESYIYPLKELKDNEHSIDKLVNEQFKALKTPYTEIFKRASKSINKCFAKLIKRGDCK DD >gi|223714192|gb|ACDT01000023.1| GENE 38 38994 - 40265 649 423 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735955|ref|ZP_04566436.1| ## NR: gi|237735955|ref|ZP_04566436.1| predicted protein [Mollicutes bacterium D7] # 1 423 1 423 423 714 100.0 0 MIKMALHLLKLEKKSAVDLCICIASTVTVCLLFMEILKSPEISGNLDFLNIQFLYNALLT FFMIFMCVVLMVYSSNYFIRLKSREIGLTVLSGLSFFKRTIYSLVQVLCIVLFSSMIGII VFLGISPIVKMVIYEILSVDANIFVLSFDALMQGLAVILAMILVIMLFNSGFIMRTGITE LLENHNIISYKKDTRAIKPPDWIYPLIYIFGLICMYTGNNQVGGYILFSFFGAVGAYGVF RHYLPHHYNRNALKYCKTAKDLLIKEDVSLIMQQSKSFILILMVAMIGIVPFICGTTDNS LFNFEMHLAFVITNILLSLTLINRFKIDHIQRREHFHAIYQLGFARAEIDFVYMQEIKYY FRHLWILSIIYIFNIFIVFYFNQGMGIMTTFIIMAEYLIPYMLAEMIILYERKEEIKNDK YYA >gi|223714192|gb|ACDT01000023.1| GENE 39 40246 - 40503 428 85 aa, chain + ## HITS:1 COG:BH2699 KEGG:ns NR:ns ## COG: BH2699 COG1136 # Protein_GI_number: 15615262 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Bacillus halodurans # 4 79 2 77 251 74 50.0 5e-14 MTNIMLEVKNLKKIYDELGPSPKIALDDLSFEIKNNEFVCIMGPSGSGKTTLVNILSTID KATSGIVNISGASIVGMSGAAKAKI >gi|223714192|gb|ACDT01000023.1| GENE 40 40478 - 41005 184 175 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 10 147 79 217 245 75 28 5e-13 
MGQQRXKFRKKKLGFIFQNYNLLYSLTIRENILFPLIINSIGEKEWQENLEQVTQILGIK DILDKHVYECSGGQQQRAAIARALISKPEIIIADEPTGNLDSNNSRELMELFSKINHDSQ TTIIMVTHDAFVASYSTKMLYMKDGKIDKILNRHDLTRDEYFNEIVKVNAILSNN >gi|223714192|gb|ACDT01000023.1| GENE 41 41020 - 41565 398 181 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754758|ref|ZP_02426885.1| ## NR: gi|167754758|ref|ZP_02426885.1| hypothetical protein CLORAM_00262 [Clostridium ramosum DSM 1402] # 1 181 7 187 187 324 99.0 1e-87 MMRIKLYLGSQELEKYLNKKVQHNYILETIIPFGLFQPLRLDVLKFKKKKDLQRIYIVDS NDLVLIKRKYQNLGWQYFENNYANDEFNNDAIFYYDSIDEKELMKMNIKESCHQNDVSAS LVKGLFFGTVFLIFSVISPNLMIHVNNNGLDFILGNLYIITALTLIVMAVIGKLRNKKRK E >gi|223714192|gb|ACDT01000023.1| GENE 42 41583 - 42842 1473 419 aa, chain - ## HITS:1 COG:ML2336 KEGG:ns NR:ns ## COG: ML2336 COG1167 # Protein_GI_number: 15828259 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Mycobacterium leprae # 3 418 39 459 463 373 46.0 1e-103 MKAFSQMNNNELCELKKELEKRYLDFKARDLKLDMSRGKPASEQLDLSLGLLKQEDYMID GIDCRNYGGLEGLPAMRRFFGELLDVNEKNVIVGGTSSLTLMYDYLSQAMLFGVMGNTPW SKLDKVKFLCPVPGYDRHFAFCEHFGIEMINIPMNSDGPDIETIKKHLQDESVKGMFCVP KYSNPQGITYSDDIIKAIASLTPAAKDFRIIYDNAYCVHDLNEHGDTLLNIFNELPKYQH EDMVIMVASTSKISFAGAGVSCIVASENNIVDIKKRLTIQSISQDKMNQLRHLNFFKDVA GLKAHMKKQAVLLKPKFDCVIEHLNNELGGKGIASWIEPNGGYFISLDVMPGCAKRIGEL CKDAGVVLTTIGATYPYGIDPQDTNIRIAPSYPSVDELSKAAELLCICVQLACLEKLLK >gi|223714192|gb|ACDT01000023.1| GENE 43 43102 - 44115 1274 337 aa, chain + ## HITS:1 COG:YPO2806 KEGG:ns NR:ns ## COG: YPO2806 COG0667 # Protein_GI_number: 16123004 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Yersinia pestis # 1 325 1 319 329 328 52.0 1e-89 METRKLGSNFEISAIGLGCMGMTHGFGDASDQKEMIEVIRGAYQAGITMFDTAECYQGVS ETGEILYNEDLVGQALSIYPRDSYQIATKCGIKVIDGKQVLDARPEVIRTSLSESLKRLK 
TDYVDLYYLHRVDPKTPIELVAQTMKELMAEGKIKHWGLSEAGVETIRKAHAICPLTVVE SEYSMIFREPEKNGLLDTLKELKIGFIPFAPLGKGFLTATIDPNQTFSKNDTRSKQPRFK KENMAINQVLVELIKKLAKEKQVTPAQIALAWVMAQGDWIVPIPGSRKLSRIEENIKASK VILDEVDLKNIKRALDNMDLKAERWDLNSDNAKRVGK >gi|223714192|gb|ACDT01000023.1| GENE 44 44118 - 45272 934 384 aa, chain + ## HITS:1 COG:ECs3121 KEGG:ns NR:ns ## COG: ECs3121 COG0477 # Protein_GI_number: 15832375 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 379 1 382 396 123 27.0 5e-28 MKEKLWTKKYLLSVIILFAICLCSNIVLSVLTIFAKNLTGLDTYAGLMTSIFTLAALCVR FIAGILLDKFGCKKVILGGITLMMAAAFFFIDCKQIELAIIYRIMQGIGFGIASTGASTY VAKMCHPNRLLEGVSYASIANSLTGVVGPSIAYSILGANYDRFKLLFIVSLLICILTFGL MLLGKDVSIVSIENQNKEKETINWLILIVPILVLFFNSLTQSAITSFVSLYAISLGFAGA GMFFSVNAIGMISSRFIMNRLVIHFGEFKMLLLNSLLFFISVYLIGQVTRMYQLLFLALP AGFATGAVAPIVNTFLIKRMPESKNGIANATYYAAMDIGYAIGSVVWGIIAAFNGYRMIF YLGALMQIVGVILCLAQMKIYRLK >gi|223714192|gb|ACDT01000023.1| GENE 45 45326 - 45715 483 129 aa, chain + ## HITS:1 COG:CAC0766 KEGG:ns NR:ns ## COG: CAC0766 COG0789 # Protein_GI_number: 15894053 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 126 1 126 126 102 46.0 1e-22 MTIAEVSKKFGISSTTLRYYEKIGLMNPVAKNISGHRDYQEPDLRRINFIKCMRAAGMTI EQIKLYVDLFNEGEHTISQRKDIMIEQLGNLEAQVEELQSIISYLKHKIDNYESTLVKRE MEQRNKLKG >gi|223714192|gb|ACDT01000023.1| GENE 46 45945 - 46244 407 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754753|ref|ZP_02426880.1| ## NR: gi|167754753|ref|ZP_02426880.1| hypothetical protein CLORAM_00257 [Clostridium ramosum DSM 1402] # 1 99 1 99 99 174 100.0 2e-42 MNIYYIIGAVAIILILLFVYWYSHRSNKEPHEYILVVSFDDGYYYDLCEDERFKKLAQFE KNNNEFIAKTKLNEVYLKKEIGKILEMDLAKLHVIIKRW Prediction of potential genes in microbial genomes Time: Thu May 26 09:26:29 2011 Seq name: 
gi|223714191|gb|ACDT01000024.1| Coprobacillus sp. D7 cont1.24, whole genome shotgun sequence Length of sequence - 6663 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 184 236 ## gi|167755182|ref|ZP_02427309.1| hypothetical protein CLORAM_00687 + Term 364 - 411 1.0 - Term 431 - 478 4.5 2 2 Tu 1 . - CDS 493 - 2373 1590 ## COG2199 FOG: GGDEF domain - Prom 2426 - 2485 9.0 3 3 Tu 1 . - CDS 2492 - 3484 1087 ## COG0673 Predicted dehydrogenases and related proteins - Prom 3504 - 3563 7.8 + Prom 3280 - 3339 9.5 4 4 Op 1 . + CDS 3584 - 4918 1604 ## COG0486 Predicted GTPase 5 4 Op 2 . + CDS 4920 - 5147 195 ## gi|237734936|ref|ZP_04565417.1| predicted protein 6 4 Op 3 . + CDS 5156 - 6028 950 ## COG0668 Small-conductance mechanosensitive channel 7 4 Op 4 . + CDS 6012 - 6632 655 ## COG1051 ADP-ribose pyrophosphatase Predicted protein(s) >gi|223714191|gb|ACDT01000024.1| GENE 1 2 - 184 236 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755182|ref|ZP_02427309.1| ## NR: gi|167755182|ref|ZP_02427309.1| hypothetical protein CLORAM_00687 [Clostridium ramosum DSM 1402] # 1 60 375 434 434 95 100.0 1e-18 KYFDINIIKENKTTKEDKENHGLGLITINNIILKYNGSIDYSVTNQSLVIDITLLNVLKI >gi|223714191|gb|ACDT01000024.1| GENE 2 493 - 2373 1590 626 aa, chain - ## HITS:1 COG:CAC0631_1 KEGG:ns NR:ns ## COG: CAC0631_1 COG2199 # Protein_GI_number: 15893919 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Clostridium acetobutylicum # 451 620 337 504 525 82 31.0 3e-15 MSISLASVLLINFFLSTDSLQARTLETFNLKINQVIQTIENNRTELVSLKSSLDEDYLTR AKAFAYVIEKNPNIIESVTELQNLATLLDVDELHVSDSEGLIAYSSVPKYVGLDFHNGEQ MRGFLPILESDDPDEYVIQDAQPNTAEGKIMKYVGVARKDKKGLVQVGLEPTRLLEAQQR NTYSYIFSRFPTNEGEQLFAINKTTNELIASTNDLDDKEASYYTYDNLKDCQSGTFKDIG NKTENFFVTREYDNILIGATVPKDILYKNRSSDLLLIGAYLIAIEIVIVVSINVILNKKV LKGIHSVLNDLNRIKNGDLNTVVKADDNQELIDLSSGINSMVNSVVNSADRISKIISIID 
IPLAAFEYQNDTKQLFATARLKELLHLSDEEANQMYLDPKQLLKKLQIIISNPIEGESDV YCLEKDTFIKIRLVKDENGFYGTVNDVSSDILKKKQIQYEKDHDHLTGLLLYPSFKREVT LVINNNNYNHLYAAIMVDLDSFKHINDTYGHDFGDHYLKRLTYALNKLSKSNCLIARRSG DEFCLFIYNYQDKNTIINQLNKLWSYFKEELIELPDHSYQSIKVSGGFVCSSGPNNTIEE LMKKSDKALYEAKNHYKGHFIEYKGK >gi|223714191|gb|ACDT01000024.1| GENE 3 2492 - 3484 1087 330 aa, chain - ## HITS:1 COG:SPy0441 KEGG:ns NR:ns ## COG: SPy0441 COG0673 # Protein_GI_number: 15674565 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Streptococcus pyogenes M1 GAS # 1 318 1 319 319 342 52.0 4e-94 MKLGIIGSGMIVYDLLTFIDIIDEIELTAILGRKESQEKIETLINKHHIKKAYYDYDELL NDDEIDTIYVALPNHLHYEYTKKALLHNKHVICEKPFTSNVNELDELIALAKQQNCLLFE AITNQYLPTFKEIKEKISEIGKISIISSNYSQYSSRYNAFKRGEILPAFDVNKSGGALMD LNVYNIHFVVGLFGAPNKVEYFANIEKGIDTSGVAILEYPTFKAICIGAKDCGAPIISTI QGDQGCVKIDGPTSVLTDVQILRNGRDSEIIPTGNYHRMYSEFKVFAECIDNNDFTTCNK MLEHSRIVMDVLTKARQSANIIFGCERDTK >gi|223714191|gb|ACDT01000024.1| GENE 4 3584 - 4918 1604 444 aa, chain + ## HITS:1 COG:L0157 KEGG:ns NR:ns ## COG: L0157 COG0486 # Protein_GI_number: 15674224 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Lactococcus lactis # 4 444 8 455 455 397 47.0 1e-110 MNEDIIVAIATSRLEAAISMIRVSGPDCIAFVQKFFTGKIIEKPSHTINYGYIIDDGKRI DEVLVNIYRGTRTFTGEEMVEINCHGGVYITSRVLEVCIKNGARMAERGEFSKRAFLNGR IDLSQAEAISDIITAKNSYATDLALKGISGSISGFIEDLKEDLIQIITQIEVNIDYPEYD DVEELTASSLLPRSANLLTKMNKILDDSKNIKLVKEGIKTVIIGRPNVGKSSLLNALLRE DKAIVTNIAGTTRDIVEGSISIDGVVLNMIDTAGIRETDDIIESMGVEKSKELIHQADLV LLVIDGSQSLSSEDMQLLELTEDATRIIVLNKADQGTKVDLDGIVISAKDNQISTLTEEI KKMFELGKIIDNNDHILTNARQTMLLQRASQALKQAVEAMEMMIPTDLIVTDLYECWNNL KEILGEKAKEDLLDELFKRFCIGK >gi|223714191|gb|ACDT01000024.1| GENE 5 4920 - 5147 195 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734936|ref|ZP_04565417.1| ## NR: gi|237734936|ref|ZP_04565417.1| predicted protein [Mollicutes bacterium D7] # 1 75 1 75 75 118 100.0 9e-26 
MDKNRILSARFLGFSKYLGIVAAISFVVFLIINAFNTGNDILFWISYVLLMLSIIGAIQC VCLYFIGKYYGSKSK >gi|223714191|gb|ACDT01000024.1| GENE 6 5156 - 6028 950 290 aa, chain + ## HITS:1 COG:MA1724 KEGG:ns NR:ns ## COG: MA1724 COG0668 # Protein_GI_number: 20090576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Methanosarcina acetivorans str.C2A # 26 284 54 345 379 112 26.0 1e-24 MDKTYERINSLFEHGLIDSGISILISVVIVLVINKIINKIIKKKINDPLKLVFPMRIKKI VLVTIVLAIIMSEITAMQSIIKALLASGGILAVVVGLASQEAASNMINGFMIMTYKPYKI GDFVNVREYNVIGTVIDISMRHSVIETLERTQVIIPNTIMNKAIIENVSNVKTQKANYLY LEVSYESDIQKAIEIIQEEGSKHSLCLDGRTKAQKKKHEPMVKVHCVEFNDSGIQLRATL LSKDSGAGFQLLSDLRLIIKKRFDENGIDIPYPHRVVINETREKNDEQTK >gi|223714191|gb|ACDT01000024.1| GENE 7 6012 - 6632 655 206 aa, chain + ## HITS:1 COG:BH3089 KEGG:ns NR:ns ## COG: BH3089 COG1051 # Protein_GI_number: 15615651 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Bacillus halodurans # 1 194 1 195 207 163 43.0 3e-40 MNKQNRWLDLAQELQFLAQGGLAYTKDKFDQERFERIRQISAEMVALQSDLPIATVTSLF CNETGFQTPKLDSRGAVFKGDKILLVQESDGRWSIPGGWVDALASVKENTIRELQEEAGI EAQIVKVIAILDRNIHNTPRYAYGITKIFIECSYLGGAFKPNIETLDSGYFSLDELPELA EEKVTLEQIKMCFKAHFDDNWQCLCD Prediction of potential genes in microbial genomes Time: Thu May 26 09:27:09 2011 Seq name: gi|223714190|gb|ACDT01000025.1| Coprobacillus sp. D7 cont1.25, whole genome shotgun sequence Length of sequence - 101317 bp Number of predicted genes - 98, with homology - 98 Number of transcription units - 45, operones - 25 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 57 - 869 772 ## COG1737 Transcriptional regulators 2 2 Op 1 . - CDS 881 - 1192 398 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 3 2 Op 2 . - CDS 1219 - 2613 1089 ## COG1940 Transcriptional regulator/sugar kinase - Prom 2696 - 2755 8.1 + Prom 2630 - 2689 11.0 4 3 Tu 1 . 
+ CDS 2837 - 4144 1216 ## COG0534 Na+-driven multidrug efflux pump + Term 4253 - 4286 0.8 5 4 Tu 1 . - CDS 4131 - 5477 1560 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 5662 - 5721 9.8 + Prom 5459 - 5518 7.1 6 5 Op 1 . + CDS 5656 - 6462 764 ## COG2207 AraC-type DNA-binding domain-containing proteins 7 5 Op 2 . + CDS 6492 - 6755 287 ## CDR20291_1621 hypothetical protein 8 5 Op 3 . + CDS 6752 - 7219 473 ## COG0346 Lactoylglutathione lyase and related lyases + Prom 7233 - 7292 7.2 9 6 Tu 1 . + CDS 7331 - 8125 800 ## Cphy_0275 XRE family transcriptional regulator + Term 8364 - 8399 2.1 + Prom 8363 - 8422 8.4 10 7 Tu 1 . + CDS 8503 - 11952 3630 ## COG1409 Predicted phosphohydrolases + Term 11981 - 12015 2.4 + Prom 12111 - 12170 6.8 11 8 Op 1 31/0.000 + CDS 12291 - 13112 1079 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 12 8 Op 2 34/0.000 + CDS 13115 - 13819 430 ## COG0765 ABC-type amino acid transport system, permease component 13 8 Op 3 . + CDS 13836 - 14570 240 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein + Term 14712 - 14764 -0.4 + Prom 14614 - 14673 5.7 14 9 Op 1 20/0.000 + CDS 14779 - 16512 1893 ## COG2060 K+-transporting ATPase, A chain 15 9 Op 2 18/0.000 + CDS 16528 - 18600 2601 ## COG2216 High-affinity K+ transport system, ATPase chain B 16 9 Op 3 15/0.000 + CDS 18614 - 19219 703 ## COG2156 K+-transporting ATPase, c chain + Term 19223 - 19263 7.6 + Prom 19248 - 19307 9.5 17 10 Op 1 16/0.000 + CDS 19328 - 22018 2492 ## COG2205 Osmosensitive K+ channel histidine kinase 18 10 Op 2 16/0.000 + CDS 22011 - 22721 907 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 19 10 Op 3 16/0.000 + CDS 22766 - 24226 1233 ## COG2205 Osmosensitive K+ channel histidine kinase 20 10 Op 4 . 
+ CDS 24219 - 24902 770 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 24918 - 24977 2.8 21 10 Op 5 . + CDS 24999 - 25205 248 ## gi|167755209|ref|ZP_02427336.1| hypothetical protein CLORAM_00714 + Term 25216 - 25266 5.2 22 11 Op 1 . - CDS 25211 - 25474 164 ## gi|167755210|ref|ZP_02427337.1| hypothetical protein CLORAM_00715 23 11 Op 2 . - CDS 25455 - 25655 141 ## gi|237734959|ref|ZP_04565440.1| conserved hypothetical protein - Prom 25781 - 25840 7.5 24 12 Op 1 . - CDS 25873 - 27258 1164 ## gi|167755213|ref|ZP_02427340.1| hypothetical protein CLORAM_00718 25 12 Op 2 . - CDS 27239 - 28666 1062 ## gi|237734961|ref|ZP_04565442.1| predicted protein - Prom 28725 - 28784 6.6 26 13 Op 1 . - CDS 28788 - 30011 1004 ## COG2270 Permeases of the major facilitator superfamily 27 13 Op 2 . - CDS 30004 - 30786 473 ## TDE1263 hypothetical protein - Prom 30880 - 30939 11.0 - Term 30922 - 30960 6.5 28 14 Op 1 . - CDS 30989 - 31774 616 ## TDE1263 hypothetical protein 29 14 Op 2 . - CDS 31774 - 32124 392 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family - Prom 32261 - 32320 8.1 + Prom 32147 - 32206 8.9 30 15 Op 1 . + CDS 32292 - 34199 1828 ## COG3711 Transcriptional antiterminator 31 15 Op 2 . + CDS 34210 - 34788 532 ## SGO_1646 hypothetical protein + Prom 34792 - 34851 4.8 32 16 Op 1 8/0.000 + CDS 34875 - 35198 524 ## COG1447 Phosphotransferase system cellobiose-specific component IIA 33 16 Op 2 10/0.000 + CDS 35223 - 35543 437 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 34 16 Op 3 . + CDS 35560 - 36927 1526 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 35 16 Op 4 . + CDS 36939 - 37460 758 ## SGO_1642 hypothetical protein + Term 37469 - 37500 2.1 36 17 Op 1 . + CDS 37519 - 38475 1324 ## COG1446 Asparaginase 37 17 Op 2 . 
+ CDS 38477 - 39577 1198 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 38 17 Op 3 . + CDS 39593 - 40327 675 ## COG3142 Uncharacterized protein involved in copper resistance 39 17 Op 4 2/0.067 + CDS 40324 - 40704 214 ## COG1284 Uncharacterized conserved protein 40 17 Op 5 . + CDS 40692 - 41183 359 ## COG1284 Uncharacterized conserved protein + Term 41188 - 41221 2.3 - Term 41176 - 41209 2.3 41 18 Op 1 . - CDS 41232 - 42200 746 ## COG1078 HD superfamily phosphohydrolases 42 18 Op 2 . - CDS 42222 - 42593 377 ## Cbei_1499 hypothetical protein + Prom 42692 - 42751 13.2 43 19 Tu 1 . + CDS 42825 - 45359 2374 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain + Term 45361 - 45404 8.5 44 20 Tu 1 . - CDS 45389 - 46162 720 ## COG1737 Transcriptional regulators - Prom 46182 - 46241 9.8 + Prom 46138 - 46197 9.1 45 21 Op 1 . + CDS 46352 - 47761 1297 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 46 21 Op 2 . + CDS 47742 - 49217 1314 ## BDP_0122 sialic acidspecific 9-O-acetylesterase (EC:3.1.1.53) + Term 49299 - 49345 9.1 + Prom 49332 - 49391 6.5 47 22 Op 1 . + CDS 49524 - 50867 1623 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) 48 22 Op 2 . + CDS 50879 - 52633 1659 ## gi|237734984|ref|ZP_04565465.1| predicted protein 49 22 Op 3 . + CDS 52636 - 52812 321 ## gi|167755237|ref|ZP_02427364.1| hypothetical protein CLORAM_00742 + Term 52820 - 52859 1.0 - Term 52898 - 52934 3.5 50 23 Tu 1 . 
- CDS 52972 - 55842 3655 ## COG0491 Zn-dependent hydrolases, including glyoxylases - Prom 55863 - 55922 5.0 + Prom 56133 - 56192 8.1 51 24 Op 1 40/0.000 + CDS 56217 - 56876 734 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 52 24 Op 2 4/0.000 + CDS 56873 - 57883 1046 ## COG0642 Signal transduction histidine kinase 53 24 Op 3 36/0.000 + CDS 57951 - 58718 1012 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component 54 24 Op 4 . + CDS 58708 - 60681 1553 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Prom 60690 - 60749 6.1 55 24 Op 5 1/0.067 + CDS 60775 - 61323 422 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 61333 - 61367 1.2 + Prom 61451 - 61510 9.2 56 25 Tu 1 . + CDS 61550 - 62884 1229 ## COG1301 Na+/H+-dicarboxylate symporters + Term 62890 - 62929 4.6 57 26 Op 1 6/0.000 + CDS 63365 - 63835 582 ## COG0527 Aspartokinases 58 26 Op 2 6/0.000 + CDS 63845 - 64738 1238 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 59 26 Op 3 . + CDS 64740 - 65483 828 ## COG0289 Dihydrodipicolinate reductase 60 26 Op 4 . + CDS 65496 - 66803 1410 ## COG0019 Diaminopimelate decarboxylase + Term 66809 - 66834 -0.5 61 26 Op 5 . + CDS 66846 - 67250 503 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 67251 - 67294 6.2 62 27 Tu 1 . - CDS 67289 - 68119 584 ## COG0500 SAM-dependent methyltransferases - Prom 68366 - 68425 9.5 - Term 68386 - 68440 14.1 63 28 Tu 1 . - CDS 68631 - 69173 574 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Prom 69655 - 69714 3.7 64 29 Tu 1 . + CDS 69741 - 69935 216 ## gi|237735000|ref|ZP_04565481.1| LOW QUALITY PROTEIN: lysR-family transcriptional regulator - Term 69884 - 69924 1.3 65 30 Tu 1 . - CDS 69961 - 70452 538 ## COG4720 Predicted membrane protein - Prom 70473 - 70532 8.0 66 31 Tu 1 . 
- CDS 70979 - 71671 752 ## COG2357 Uncharacterized protein conserved in bacteria - Prom 71702 - 71761 4.8 + Prom 71625 - 71684 6.5 67 32 Tu 1 . + CDS 71745 - 72107 470 ## gi|167755255|ref|ZP_02427382.1| hypothetical protein CLORAM_00760 + Prom 72132 - 72191 3.7 68 33 Op 1 . + CDS 72247 - 73245 936 ## COG2199 FOG: GGDEF domain + Term 73247 - 73279 2.0 + Prom 73249 - 73308 5.7 69 33 Op 2 . + CDS 73339 - 75336 931 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 75365 - 75420 1.1 70 34 Op 1 . - CDS 75333 - 75911 249 ## PROTEIN SUPPORTED gi|223039927|ref|ZP_03610210.1| ribosomal protein L22 71 34 Op 2 . - CDS 75883 - 76788 643 ## COG1283 Na+/phosphate symporter - Prom 76954 - 77013 5.3 + Prom 76831 - 76890 8.4 72 35 Op 1 . + CDS 76957 - 77592 377 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 73 35 Op 2 . + CDS 77593 - 78978 1261 ## COG1472 Beta-glucosidase-related glycosidases + Term 78987 - 79040 -0.9 + Prom 78985 - 79044 7.1 74 36 Op 1 24/0.000 + CDS 79113 - 79394 479 ## PROTEIN SUPPORTED gi|167755262|ref|ZP_02427389.1| hypothetical protein CLORAM_00767 75 36 Op 2 21/0.000 + CDS 79410 - 79892 582 ## COG0629 Single-stranded DNA-binding protein 76 36 Op 3 . + CDS 79910 - 80140 385 ## PROTEIN SUPPORTED gi|167755264|ref|ZP_02427391.1| hypothetical protein CLORAM_00769 + Term 80153 - 80183 1.2 + Prom 80401 - 80460 10.4 77 37 Op 1 3/0.000 + CDS 80504 - 81271 902 ## COG1349 Transcriptional regulators of sugar metabolism + Term 81277 - 81318 3.4 + Prom 81287 - 81346 9.5 78 37 Op 2 . + CDS 81445 - 82155 763 ## COG0274 Deoxyribose-phosphate aldolase 79 37 Op 3 12/0.000 + CDS 82207 - 83007 775 ## COG3959 Transketolase, N-terminal subunit 80 37 Op 4 . + CDS 83009 - 83932 773 ## COG3958 Transketolase, C-terminal subunit 81 37 Op 5 . + CDS 83935 - 85356 1302 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 82 37 Op 6 . + CDS 85379 - 86572 1177 ## COG0205 6-phosphofructokinase 83 37 Op 7 . 
+ CDS 86562 - 87284 549 ## COG1434 Uncharacterized conserved protein + Term 87306 - 87341 1.1 + Prom 87286 - 87345 5.4 84 38 Op 1 4/0.000 + CDS 87419 - 87979 591 ## COG0693 Putative intracellular protease/amidase 85 38 Op 2 . + CDS 87981 - 88532 651 ## COG0693 Putative intracellular protease/amidase 86 38 Op 3 . + CDS 88552 - 89325 769 ## COG1737 Transcriptional regulators + Term 89328 - 89374 10.8 87 39 Tu 1 . - CDS 89352 - 90245 665 ## COG1737 Transcriptional regulators - Prom 90278 - 90337 5.8 + Prom 90215 - 90274 6.2 88 40 Op 1 . + CDS 90456 - 91406 1042 ## COG0451 Nucleoside-diphosphate-sugar epimerases 89 40 Op 2 . + CDS 91470 - 91904 479 ## EUBREC_0368 hypothetical protein + Prom 92385 - 92444 8.1 90 41 Tu 1 . + CDS 92585 - 92857 442 ## gi|167755279|ref|ZP_02427406.1| hypothetical protein CLORAM_00784 + Term 92860 - 92897 5.1 91 42 Tu 1 . - CDS 92871 - 94310 735 ## PROTEIN SUPPORTED gi|145632256|ref|ZP_01787991.1| 50S ribosomal protein L27 - Prom 94355 - 94414 5.7 - Term 94393 - 94426 3.1 92 43 Tu 1 . - CDS 94438 - 97368 2585 ## COG2200 FOG: EAL domain - Prom 97408 - 97467 3.5 93 44 Op 1 . - CDS 97472 - 97885 334 ## gi|237735030|ref|ZP_04565511.1| conserved hypothetical protein 94 44 Op 2 . - CDS 97961 - 98755 860 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase 95 44 Op 3 . - CDS 98758 - 99396 521 ## COG0637 Predicted phosphatase/phosphohexomutase 96 44 Op 4 6/0.000 - CDS 99389 - 100009 612 ## COG0352 Thiamine monophosphate synthase 97 44 Op 5 . - CDS 100014 - 100838 963 ## COG2145 Hydroxyethylthiazole kinase, sugar kinase family - Prom 100999 - 101058 6.5 + Prom 101052 - 101111 7.9 98 45 Tu 1 . 
+ CDS 101144 - 101315 96 ## COG2509 Uncharacterized FAD-dependent dehydrogenases Predicted protein(s) >gi|223714190|gb|ACDT01000025.1| GENE 1 57 - 869 772 270 aa, chain + ## HITS:1 COG:yfeT KEGG:ns NR:ns ## COG: yfeT COG1737 # Protein_GI_number: 16130352 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 10 223 12 228 285 67 25.0 2e-11 MIIEHLLDDNNLTNTEKSIAQFLLNKDQKINNLTSSELGKQSYTSQAAVTRLYKKLGFNN FREFLSTLILERNDYLKYNDLPTEHPEQYFTSLEDTEKVIASLYEKAMIQTNRLLDKNVV NRVCNRILSASFIDIYGIGLSDTIAKEMCFKLQSLGLPCSYQNGINIKYIDNMQQHQSNV SILITTTGDNNLIKEIAKILKNRNVYTVGILGKKGKDLIQLCHDYLLFDTSLFEDIDSLC ATFSAEYVINILYATLLYRLELSNYMHYLK >gi|223714190|gb|ACDT01000025.1| GENE 2 881 - 1192 398 103 aa, chain - ## HITS:1 COG:SPy1324 KEGG:ns NR:ns ## COG: SPy1324 COG1440 # Protein_GI_number: 15675267 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Streptococcus pyogenes M1 GAS # 1 103 1 103 103 89 43.0 1e-18 MKKILLVCCAGMSTSLLVNRMREVAKTKEIDCKIWSTSEKELPNEFAQEPADVILVGPQI RYLLNNIKKEVNNTVPVELIDMRIYGMMDGEKVLNQALNALEK >gi|223714190|gb|ACDT01000025.1| GENE 3 1219 - 2613 1089 464 aa, chain - ## HITS:1 COG:lin0520 KEGG:ns NR:ns ## COG: lin0520 COG1940 # Protein_GI_number: 16799595 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 188 446 43 304 334 63 20.0 1e-09 MEDFLKDVYTSIYKKWILFQQIDNCQIMLSSKDQNKIILKTKYGVANVIFYKFNIIELNV ISKIDQESCFFLHFQMNNINHAINLFYEMVECLKTLIKKPKIKILLCCSGGLTTTYFAYK IDEAIQLFALDYEIAATGYNELFKKGEQYDVILLAPQVSFMYAKVKKIFKDKYLLNIPAQ VFAKYDVKEILNLVDQELIKKRNKNGQVQLLSIRNKTITFHRKILCISLFRNRNRIHIAY RLYQSQSDIIVNNETIKQRITIQDIYDVIDTVLLNYPGIEVIGFSTPGIVNNGFATTASI NGFDDMNYKKLFTSKYSQKFIITNDVNTAAIGYHATQNQYSSIVLLFQPMSTKAGAGIII DNKLINGKHNVAGEMKYLPVNLLEKGANVYKTPEDIIKIVKYISLSIISVIGPEAIVIFC SLLPNIEDLENELKTVLPQEYIPRLIKIDDIQEYIFLGQTIICT >gi|223714190|gb|ACDT01000025.1| GENE 4 2837 - 
4144 1216 435 aa, chain + ## HITS:1 COG:lin2192 KEGG:ns NR:ns ## COG: lin2192 COG0534 # Protein_GI_number: 16801257 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 7 429 15 437 443 179 28.0 1e-44 MKNYIFKDFLKYTSLNVLGMIGLSCYILADTFFIANGVGEIGLSALNLALPMYCLINGLG LMIGIGGGTRYSLLKVNSNRTSQNIIFSNALYFMLILAIIFVVLGLSASTSLANLLGANQ DTTIYTATYLKTIMFFAPMFMLNNLLIGFVRNDDNPRLAMMGMLIGSLTNVVLDYIFIFP FGLGMFGAALATGVAPVISISIMSPYVIKKANFKFLKTKLCLNHCKDIFLLGLSALIGEL ATGIVMLLFNFTLLKYSGNIGVAAYGIIANIAFVIVSMFTGVAQGIQPIISKNLSFKEYR NVNLAYHYGLWTVIIFAVIVYLSSYFAAAQIVALFNSSSNPELLDLAVRGIHIYFGGFIF AGVNMISATFFSASDKPRQAFIISSLRGFILIAPVIFILSTIFKVDGVWLTYVVTELITS IITIILKRKSDFSLH >gi|223714190|gb|ACDT01000025.1| GENE 5 4131 - 5477 1560 448 aa, chain - ## HITS:1 COG:CAC0460 KEGG:ns NR:ns ## COG: CAC0460 COG1253 # Protein_GI_number: 15893751 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Clostridium acetobutylicum # 1 437 1 439 443 422 52.0 1e-118 MDAGPESSITLQLILIVVLTLINAYFAASEMAIVSVNKNKIRRLSEEGNKKAKLVEKLLD QPTNFLSTIQVAITLAGFFNSASAATGISVSLANLLKVWSIPYADTIAVVLITILISFIT LIFGELVPKRIALQKAEWYSMFCAKPILIISKIAGPFIKILSWSTKFVLRLFGMDDENVE ESLSREEIRSMVESGQETGVFNEIETDMITNIFEFDDSLALNVMTPRTDVYCIDINDPLS ENIDQMMSMQYTRIPVYDDSIDNIIGILNMKDFAIEARQVGFDNVDIRKLLRKPYFVLET KNIDDLFKELQEDRQHIAILVDEYGGFSGIVTVEDLIEEIMGDIEDEYDHDDEPKLQKID DYNFIVDGNYLIDDLDDELDLKLDNVNHDTISGFVLHLLGEIPDDNQERSVTYENLTFKI TGVKGNRVTKIKLTIKKSEEKESDSSED >gi|223714190|gb|ACDT01000025.1| GENE 6 5656 - 6462 764 268 aa, chain + ## HITS:1 COG:CC3506 KEGG:ns NR:ns ## COG: CC3506 COG2207 # Protein_GI_number: 16127736 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Caulobacter vibrioides # 91 256 139 306 321 62 25.0 7e-10 MKNFDYKGSYINKTNHHIYIAPTEELKDIIAHYTITFPSEVPASSKLFHIIPDASGCFIF QGRSRRDFWGPMSEIMVLENDLQKAPARFFVEFRPGGLYQISGLHQKKLVNQRQQLRYLD 
VQLDEELALLYQSCKDYEQLIKAFNKYFTGKRKVNLLPARLIRAKEIIDQDHGNVSLEKV ASDCHLSSRQLLRDFHNYIGLSGKEYAKVVRFNYLLKQIEEDDFLSLALQGGYFDQAHFN KVFKQITKTTPKKYLSNLADFYNEIYKF >gi|223714190|gb|ACDT01000025.1| GENE 7 6492 - 6755 287 87 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1621 NR:ns ## KEGG: CDR20291_1621 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 84 1 84 88 95 60.0 6e-19 MIESRCGIICSECEFKSKIGCAGCINIEHPFWGEICPVKSCVETKELNHCGECEDFPCKL AKSYAYDEKQGDQGKRLEQCRCWGKIK >gi|223714190|gb|ACDT01000025.1| GENE 8 6752 - 7219 473 155 aa, chain + ## HITS:1 COG:MA0108 KEGG:ns NR:ns ## COG: MA0108 COG0346 # Protein_GI_number: 20089007 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Methanosarcina acetivorans str.C2A # 1 152 10 163 163 122 38.0 2e-28 MKLEATVLAVKNLEVSKQFYMQLFDQKIILDHGLNITFDGGFALQKDFAWLVDVPSDWVC DKACNMELYFEVDDFDKFIKKLETYPNIKYVHPIKKYNWQQRVVRIFDPDNHMIEIGEAM DVVITNLLLAKKDIDEVSRITQYPIDYVRNIFNKL >gi|223714190|gb|ACDT01000025.1| GENE 9 7331 - 8125 800 264 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0275 NR:ns ## KEGG: Cphy_0275 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.phytofermentans # Pathway: not_defined # 1 263 1 273 276 103 29.0 6e-21 MDNKEIGIRILSARKNLGYTQKQLGNLINVSDKAVSKWERGIGCPDISLLLPLSEALKMS IDELIGGTVVDKSEKKTVQNLINYTKIKAIENKERIIKISYLFITIVMMIGIFVPCLCNY YLNHSFTWSIISSAAIIYAWIILTVLVQAKENVVIKAIIVASIGVIPLLYVVSVQVYSLD WFLHEALVIALFTNLYVYIVIWIWVKTTLNIWYKIGLCIYLSNITNIPANYISGVNLFSM IMNLALNMIVGTIIIMVGYTRKHN >gi|223714190|gb|ACDT01000025.1| GENE 10 8503 - 11952 3630 1149 aa, chain + ## HITS:1 COG:CAC0205 KEGG:ns NR:ns ## COG: CAC0205 COG1409 # Protein_GI_number: 15893498 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 41 485 43 409 652 122 27.0 3e-27 MKKVNNRIIKGVLCLVTLSTMLIGQNNILTKGYAQDTYSEVAGESHTGSFSVDSLTLQPG 
ATTTSINLNWYAPAGTTVAKIQFGDQTYDVTAKPLTSPTEIKSDKYTDTGKLVCKTTISN LKPDTKYTYYISNDGGTTWSKEYNYTTPSSNEFTFGFTSDPQIKENKEINSEGWNPSDET NQTGWATMMETLEKQGVDLVVSAGDQVEDQNWGKSSEYEAFFAPEEMTSIPYAPAVGNHD RHYMFADHFNLPNQMESLTEVKTTFRGQNSGTSQSHGNYIQATENEIKNNAASNGVTPNS DGQYDFSERRDMETQGNYYYLYNNVLFVTLNTSAYPGGNDEENANNPNVSSASRDNSEAE AIINNFAQTLKSATSEYQNQYQWIIVTHHKSTQTVAKHAADSDIENYVDAGFEKLMDEYN VDFVLGGHDHVYSRSYVLKDGQRNSERLDTLNNPQGTIYITGNCCSDMQYYTPFETLDKN NNADYPILANGKSGSAAYLEGDSLPVGNQEWNQEYSPSYAVFNVENNRISVKVYNLAGDQ TNPDSKEIDAFTVTKNSDGGEKTTGYENNSALLALEQSARYDSGTTNSDGGVTEIVDYNT VTGWAYAVNGQSGNLTAIALKNIENKDKIDLLDGNDIDISSLVQSADFTYGDMTSVAVSH DGTSLAVAIQGKETNANGRVALFDCNQDGTITLKQLFEVGVQPDMVVFTNDDGQILTANE GEPRDGYTGAVDPKGSISVINTKTNKAKTVDFTAFDSQRDALIEKGIVLKKGVLPSTDLE PEYIAVSDSKAYITLQEANAIAKLDLNSLEIEDIYSVGFEDYSKIAIDIDKKDEKYNAQT YDSLKGIRMPDAISVYTVDGIDYLITANEGDSRDWNGYSNEIEVNFGKGKTSPTGKITAD NSGLTGKVTFFDTSDYDGLNNENDYLFGGRSSTIFKADEQSLQEIYTTGNDFEVKTAAYL PNNFNCSNDDATIDDRSGKKGPEAEAVTIGQIEDKTFAFIGLERIGGIMVYDITNPEKTE FVNYINSRDFSTDIGGDDSPEGIHFIAGNDSITGKPQLVVAYEVSGTVGVYDLTLQKNSD QSGDDKTDNPSQEPTVEETNNSSNTNTNNKNHSVNTGDASNIAITIISLISTIVLITILM KKDKVLAED >gi|223714190|gb|ACDT01000025.1| GENE 11 12291 - 13112 1079 273 aa, chain + ## HITS:1 COG:SP0453_1 KEGG:ns NR:ns ## COG: SP0453_1 COG0834 # Protein_GI_number: 15900370 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Streptococcus pneumoniae TIGR4 # 1 270 1 265 325 147 34.0 3e-35 MKKILAFLLIALLISGCESNSGNKSIDEGDVFVVGMECNYPPFNWQTNTQSDTAVELEGS GYADGYDVYIAKEIAESLDKKLVIKKLAWNGLQPALESGEIDAIIAGMTADEEREKGIDF TTPYYQSEMVMIVRGDDASKNYTDIQQFSGKTIVGQMSTSYDTVIDQINGVDHATPKQSY PEMLVALQSGEVDGITAELPVAQGILETNQDLAIVRFEQNKGFKVDTAVSIGLKEGSRNS TLFKKVQKCIDNISSEKRNEMMGKYSSSQPKGE >gi|223714190|gb|ACDT01000025.1| GENE 12 13115 - 13819 430 234 aa, chain + ## HITS:1 COG:SPy0277_2 KEGG:ns NR:ns ## COG: SPy0277_2 COG0765 # Protein_GI_number: 15674455 # Func_class: 
E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 10 229 1 220 239 174 47.0 2e-43 MFDFDKSFTIFLDNFDLFWYGTKITIILALVGTIAGLIIGLLFGAIRATTIDIKDKKIIV LIKKALKIISEIYIWFFRGTPMLVQAMFFYHGLRPFFQWNALTAGIMIISINTGAYMAEI VRSGIQSVDRGQSEGALSLGMTRVQTMKYIILPQAIRNSFPAIGNQFIINIKDSSMLNVI GVVELFFQSSSIAGSTMSYSATFLITCLVYLCLTSIATILLNIIEKRINNPILR >gi|223714190|gb|ACDT01000025.1| GENE 13 13836 - 14570 240 244 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 220 1 216 311 97 26 3e-19 MSDNIITIKHLKKHYQALPVLLDINLEVNRGDVISIIGASGSGKSTFLRCLNLLEIPDHG EILYCNKNILKEELNLNKYRSDVGMIFQSFNLFNNLTVLENCVIGQMKVLNRNREEAEEK ALLYLQKVGMAEFAKKSSTQLSGGQKQRVAIARALCMEPQVLLFDEPTSALDPQMVGDIL KIIKHLALKGMTMIIVTHEMQFARDVSNKVIFMHNGIIEEAGTPEQIFEHPRSMGMIQFL SRQY >gi|223714190|gb|ACDT01000025.1| GENE 14 14779 - 16512 1893 577 aa, chain + ## HITS:1 COG:lin2830 KEGG:ns NR:ns ## COG: lin2830 COG2060 # Protein_GI_number: 16801890 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Listeria innocua # 22 577 23 560 561 560 58.0 1e-159 MDIFVKYLGYLIVLVVLAVPLGFYINKVMNGKKVFLSRILEPCENFIYRVLHVKKDEEMS WKKYLVSVLIFSGFGLVFVFLLQMLQGVLPGNPAKVEGTTWDLALNTAISFVTNTNWQAY SGESGLSYLTQSLGLTVQNFVSAGTGLAVLYALIRGFTRIKAKGLGNFWVDLTRSVIYVL MPLSVVLSIILVSQGVTQSIDEYQTVKLAEPIVLEDGTEITEQTVPLGPAASQIAIKQLG TNGGGFYGVNSAHPLENPTGLSNLVEVVSILLIPAALCFSFGKGIKDSRQGIAIFVAMGI MLVAALGSIAVSEQMATPQLEQNGIVDISNHDQAGGNMEGKESRFGIVGSSTWAAFTTAA SNGSVNSMHDSYTPLGGMVTMLLMQLGEVIFGGVGCGLYGMLGFVILAVFMAGLMVGRTP EYLGKKIEPYEMKWAVVVCLATPVAILVGSGIASLVPQVADSLNNSGAHGFSELLYAYSS AGGNNGSAFAGFAANTPFINVTLGLVMLFVRFIPILGILAIAGSMVQKKKVAVTAGTLST CSPLFVFLLVFVVLLVGALSFFPALALGPIAEFFGMF >gi|223714190|gb|ACDT01000025.1| GENE 15 16528 - 18600 2601 690 aa, chain + ## HITS:1 COG:lin2829 KEGG:ns NR:ns ## COG: lin2829 COG2216 # Protein_GI_number: 16801889 # Func_class: 
P Inorganic ion transport and metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Listeria innocua # 9 689 1 681 681 914 73.0 0 MKTETKSALMDKQMVIRAIKDSFYKLAPKTQAKNPVMLLVYISAILTTALFVISLFGIKD ANSGFTLGIAVILWFTVLFANFAEAIAEGRGKAQADSLRAAKKDVEAYKIPSIDQKDIVE IVSSATLTKGDIVLVKAGQQIPGDGEVIDGVASVDESAITGESAPVIRESGGDRSAVTGG TTVLSDWIVVQISSEPGESFLDKMISMVEGANRKKTPNEMALQIFLVALSIIFVLVTMAL YAYSAFGAKQAGMDNPSSVTVLVALLVCLAPTTIGALLSAIGIAGMSRLNQANVLAMSGR AIEAAGDVDTLLLDKTGTITLGNRQASAFIPVDGVTSEELADAAQLASLADETPEGRSIV VLAKEQFNLRGRSINQMHMEFVPFSAKTRMSGVNYQGNEIRKGAADTIKKYVTENGHIYS DECEKVVTEIANLGGTPLVVTKNKKVLGVIHLKDIIKQGVKEKFADLRKMGIKTVMITGD NPLTAAAIAAEAGVDDFLAEATPEGKLEMIRELQAKGHLVAMTGDGTNDAPALAQADVAV AMNTGTQAAKEAGNMVDLDSSPTKLIDIVRIGKQLLMTRGSLTTFSIANDVAKYFAIIPV LFITLYPGLDALNIMGLHSKDSAIFSAIIYNALIIIALIPLALKGVKYREVSAGKLLSRN LLVYGLGGIIAPFICIKIIDVIIVALHIVS >gi|223714190|gb|ACDT01000025.1| GENE 16 18614 - 19219 703 201 aa, chain + ## HITS:1 COG:CAC3680 KEGG:ns NR:ns ## COG: CAC3680 COG2156 # Protein_GI_number: 15896912 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Clostridium acetobutylicum # 1 201 2 199 205 145 40.0 6e-35 MKHLKTSVLPALKIFLIFTVVCGVIYTAAITGFAQIVFPDKANGSIIEVDGKKYGSELLG QQFIDDTHMWGRIMIIDGETFTNKDGEKTMYGLASNSSPASEDYEKVIAERVAMIEAANP EQKGKQIPVDLVTVSGSGLDPHISLAAAEYQIPRLVRTTGKSEAEIRKIIDKYTDHGFLG YFGETTVNVLKVNLALDGILK >gi|223714190|gb|ACDT01000025.1| GENE 17 19328 - 22018 2492 896 aa, chain + ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 22 895 14 887 888 900 53.0 0 MDERRPDPEELLKQIKQEEIAKQRGKLKIFFGYAAGVGKTYAMLEAAHVAYHAGVDVVAG YVEPHQRPETSKLLDGLEVLPPLKVTHNGIMLNEFDLDGALKRNPDLILVDELAHTNDEQ CRHLKRYQDINELLDHGIDVYTTINVQHIESLNDIIASITAVVVKERIPDYIFDNADQVE LVDIEPEDLIKRLEAGKIYQKNQVQRALGNFFMLDNLIALREIALRRTADRVNKKFEQIK 
PKNGEHHYTNEHILICLSPAPSNQKVIRTAARMANAFFGEFTAVFVETPNFEKMPTKVKQ ALRANTKLASQLGANIVNLYGDNVPEQISEYARFAGVSKIVLGRTNTKRRFSHNSFADQL IALTPTIDIYIIPDKIRNNYKHTSLKPRQFFSLSIKDTILSIVIILIVTLLGIWFRDLGF GEANIIMVHILGVLATSLLTENLIYTLLSSLVSVLSFNFFFTIPMHSFTAYDKGYPITFV IMFLTAFITGTLTKRVKEQARLSAIKAFRTEVLLKSSQKLQRAKTKKEIIDEISKQLFKL LDKTIIFYPVENHELSEPLVFKSNEQEDEKIYLNKDEEAVASWVFNNNKHAGASTTTLPG AKCLYLAVRCEDSVLGVIGIYLDKTAIDDFENNLLMAILNEGAMALEKEASNSKKREVEI KANQEELRANLLRSISHDLRTPLTSISGNAGVLLDSADKLSNERKIEIYSDIYEDSMWLI DLVENLLSITRIENGNIQINKEAQLINEIVLEAMHHISKDSADHIIQLDLSDEFIFVKID ARLIMQVIINIVNNAIKYTPLGSTIKITTKKLNQVLSLEISDDGEGVPDDQKEKLFEMFY TRNNLNGDSRRGLGLGLALCKSIVEAHDGKIKVIDNYPRGSIFVINLPLEEVDIHG >gi|223714190|gb|ACDT01000025.1| GENE 18 22011 - 22721 907 236 aa, chain + ## HITS:1 COG:pli0051 KEGG:ns NR:ns ## COG: pli0051 COG0745 # Protein_GI_number: 18450333 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 6 236 7 236 240 281 61.0 6e-76 MDKPLILVVEDDRAVRSLVTTTLDTHDYKYITAQNGQQALLEIVSRRPNIILLDLGLPDM DGIEIIKKVRSWSIVPIIVISARSEDKSKIEALDTGADDYLTKPFSIDELLARLRATMRR LKYLEHQEIEDVVFTNGDLKVNFDSGCAYLNDQELHLTPIEYKLLCLLAKNIDKVLTHNY ILKEVWGDVIQSDVPSLRVFMATLRKKIESDTTNPKYIQTHVGVGYRMLRVEREKE >gi|223714190|gb|ACDT01000025.1| GENE 19 22766 - 24226 1233 486 aa, chain + ## HITS:1 COG:CAC3678 KEGG:ns NR:ns ## COG: CAC3678 COG2205 # Protein_GI_number: 15896910 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Clostridium acetobutylicum # 1 481 397 899 900 285 33.0 1e-76 MKNKDNYVNLTKIIVIWILATISSLILDYFKIRVENILLIYVVGVLISIIETSNIAWGII SAIIYIMTFNYLYTDPRYTFLINDPNYLISVFIFIIVAIIVSTLTNRLKKQREIALYQEE VTSKINQISSGFLNLSGYEEIRTYCQDSLYNLTKIKNEVFLYQNKEFQDLMAWWCYCHGE PCGKDQKKFTYLKEVYLPIKKDNYTYGTIKFDCNQRTITDEDLIYIKTIIAELILVLQRD LLSHEKEEARLQVEREKLKSTLLRSISHDLRTPLTSIAGGANFLVNNLDTVESDTSLNII 
QDISKEAMRLNGMVENLLNMTRIQEGNFKINKKLEVVDDIISTAVSAITNRKENHELVVK ETKDIILFPCDAQLIVQVLVNLLDNAFKHTPDASKVMLKAFVRNNNIIFQVIDNGKGIEI KQLNHIFDDFFTTSLDNGDHKRGIGLGLAICKAIVEAHKGEIKAFNNDLGGATFELSLPL EEENNE >gi|223714190|gb|ACDT01000025.1| GENE 20 24219 - 24902 770 227 aa, chain + ## HITS:1 COG:CAC3677 KEGG:ns NR:ns ## COG: CAC3677 COG0745 # Protein_GI_number: 15896909 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 227 3 231 232 245 54.0 4e-65 MNNTTILIVEDDKTIQNFLKVTLKTQNYNYIIAETGLSGLSLFYANRPDLVLLDLGLPDI EGIEVLKQIRQNSSIPIIVVSARSSETEKVMALDYGSDDYVTKPFNAAELLARIRAALRH CLKEKVSEPIFELDYLKVDFERRHVWVKDQEIHLTPIEYKMLVLLITNRGKVLTHHFIQE NVWGYETTDDYQSLRVFMANIRRKIEIDSSSPHFIITEVGVGYRFVE >gi|223714190|gb|ACDT01000025.1| GENE 21 24999 - 25205 248 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755209|ref|ZP_02427336.1| ## NR: gi|167755209|ref|ZP_02427336.1| hypothetical protein CLORAM_00714 [Clostridium ramosum DSM 1402] # 1 68 1 68 68 100 100.0 3e-20 MLKTAYIYPHNTPSIVNRGKIINYTKYFITDVFCVFKLALLMLLTGALMVGGLFLVVGMP YFVVSLFI >gi|223714190|gb|ACDT01000025.1| GENE 22 25211 - 25474 164 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755210|ref|ZP_02427337.1| ## NR: gi|167755210|ref|ZP_02427337.1| hypothetical protein CLORAM_00715 [Clostridium ramosum DSM 1402] # 1 87 1 87 87 134 100.0 2e-30 MIIIIVEQLFNFIIDHSLADNTIARQTLSFGLPITSTIPFNKNNLVILNELKLFLKTLKY KSKFVNNTDSYQILIKKHRVFTLCLFI >gi|223714190|gb|ACDT01000025.1| GENE 23 25455 - 25655 141 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734959|ref|ZP_04565440.1| ## NR: gi|237734959|ref|ZP_04565440.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 66 14 79 79 90 100.0 2e-17 MAIIILIIKDNTNKVVFISSKQGLSNKIKNFYLDQYINKLIDKNINLLLYLASIIITEKN YDNYYC >gi|223714190|gb|ACDT01000025.1| GENE 24 25873 - 27258 1164 461 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755213|ref|ZP_02427340.1| ## 
NR: gi|167755213|ref|ZP_02427340.1| hypothetical protein CLORAM_00718 [Clostridium ramosum DSM 1402] # 1 461 1 461 461 801 100.0 0 MKRIHFNKLAKIMILITFSIAIIISFFTYQDYQKAKQIYKDDSPVKIVADSINDDLKISS YHVEDAVNAQFERILTNTYIVYDEELNPYLPASETWSMTINSNNKESVISLKQLNYDQLK ILSQYLLSQVSENNIETINVTLNYDQNNNDFIEVSKLSYLAFDEQVILGQANSSTITCQL VSLDCPLITIYEYDSDYAYNYTLQSYENLEKCFKKNYQNLDQNGYYQNIDSNQKQILETT STITSSICIQELSGKYSQHTDFNELYGINDFNEKGYLVWYQFDGLFEEYTFFDYLATHYY NYLLALILLGMIYLGVGIIFAEKKQIPVSPDSKASSFDSIPENIEEIDISQLITQLVHNS ANVLNFKKIKLSYKPQMMIIKGNREQIIDLVTRLYNFSLRHSNQNDELIIELNHHKITFS NNNFNYLDDDLKTLNDCIEIIKRHQYHYSFTNRSLSIKIGA >gi|223714190|gb|ACDT01000025.1| GENE 25 27239 - 28666 1062 475 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734961|ref|ZP_04565442.1| ## NR: gi|237734961|ref|ZP_04565442.1| predicted protein [Mollicutes bacterium D7] # 20 475 1 456 456 800 100.0 0 MLRLNKFTKITSIYIITMMMITTIFTYFGYYAHKATIKDDWPTKYVADDINDDLKYNYSQ FTDVVIANIDRNVTSVYLAFDEDFQPFLAPHERWHFAAKNGDNIAEVSLSEISYQKLKSI CEKLIALMDEDKTKTINVTFNYSAITDNHIIVSNLVYFAINDEVYISNDSTATITCELTN LYTPFLDIQIPSDEDWNGYYYNSNIKIYNYLENFFKQNYRKTQDDGYYKLSTPDDDPFKI NNEAEENAYNFSYCIQKLSREYSHHDSFSAKEFDYPDDRLGYLVWNQVDDYYQKYTIFNY FTDYYYIYLLIGITSILFILILKRNIITLPQYETIVLNPVPAVTPKNVEDVDIEPIITEL IDNSSNLLQFKKLILSYQPQTTIIKGNRKQITKLITTLYNFAIRYSASNDNLSIIFIENG LSFTNDNFTYTTKNLTALNDCLEIIKIHNYEYTFDHNNLLFKQIIKGATDETNPL >gi|223714190|gb|ACDT01000025.1| GENE 26 28788 - 30011 1004 407 aa, chain - ## HITS:1 COG:lin1456 KEGG:ns NR:ns ## COG: lin1456 COG2270 # Protein_GI_number: 16800524 # Func_class: R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Listeria innocua # 5 398 4 403 414 313 45.0 4e-85 MNKLTKLEKYWVLYDVGNSAFVLLVSTIIPIYFKNLATEAGISLSDSTAYYSYAISISTL IVALAGPMLGSIGDRKGVRKPIFTFFMMMGVLSLAALSIPVGWIVFLAIFIVAKTGFATS LIFYDSMLVDITTYERVDQVSSHGYAWGYIGSCIPFTISLLLILGCDTIGLSTVMATAIA FIMNALWWFFITIPLLKNYQQINYTTAKEPIINNLKRVLTSIKNNKKVLFFLIAFFFYID GVYTIIEMATSYGKDVGIDDNSLLLALLLTQVVAFPFSLIFGKLAKKFPVKSLILSCIIG 
YFFIAVFALWLDTAWKFWVLAVFVAVFQGAIQALSRSYYAQIIPESQASEYFGIFDIFGK GASFMGTLLMGITTQITDNSKYGVVVIACMFVIGGIIFKSKCHNQAD >gi|223714190|gb|ACDT01000025.1| GENE 27 30004 - 30786 473 260 aa, chain - ## HITS:1 COG:no KEGG:TDE1263 NR:ns ## KEGG: TDE1263 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 30 259 11 249 250 103 30.0 5e-21 MPASYTHQQFGNQVLNKLENTTIKKIINDNLNLYNIGLQGPDILFFYQPLKKNPVNTLGN QLHQDIARAFFENAMKLLKEQPNEKSLAYILGFINHFILDSECHGYIDKTMINREIGHFE IERDLDQRFMILNHESFNYSTPKHYQINLENAKIIAPFFNLAPATILASLKSFKRFNHLF ACRSVLKRSLILTGMKIVNAKLYCGMVMTAKPEKRIEENIDILVNLFNQSINIATKEIIE YYYSYQNNTAISKRFNYNYE >gi|223714190|gb|ACDT01000025.1| GENE 28 30989 - 31774 616 261 aa, chain - ## HITS:1 COG:no KEGG:TDE1263 NR:ns ## KEGG: TDE1263 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 28 260 10 249 250 110 27.0 7e-23 MPAAYSHYRLGQLVLDELNQPLKHVLETNQDLFDIGLHGPDILFYNRPYIHSKINRLGSA MHKERADIFFYQALEILKETKSAPQFAYLCGFICHYILDSNCHPYINTIIKETGVTHFEI ETELDRYFMVKDRLDPLRTKLTDHIKVNDHTLNNIEPYFKATKKELYKSLKGMKFYDRLL LAPQFYKRGLIYLVLKITFTYKRFQGFVVNYRPNKLVDPYLEKLESLFNQSISESYEAIY NFIDVIKYDRSLSYRFEHNFK >gi|223714190|gb|ACDT01000025.1| GENE 29 31774 - 32124 392 116 aa, chain - ## HITS:1 COG:BS_yusI KEGG:ns NR:ns ## COG: BS_yusI COG1393 # Protein_GI_number: 16080333 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Bacillus subtilis # 2 113 3 114 118 127 59.0 5e-30 MINFIEYPKCSTCKKAKKYLDELGIDHHDRHIVEERLSKEELTALYQKSGLPLKRFFNTS GLKYKELQLKDKLPTMTEDEQLELLATDGMLVKRPIVETKDSVLVGFKANEYDNLK >gi|223714190|gb|ACDT01000025.1| GENE 30 32292 - 34199 1828 635 aa, chain + ## HITS:1 COG:BS_licR_1 KEGG:ns NR:ns ## COG: BS_licR_1 COG3711 # Protein_GI_number: 16080911 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 480 1 491 499 164 25.0 7e-40 MLNKRQTNIIDLLNDYGKWITGKEIAKVLNVSDRTIRSDIENINNYYDCILIQANKRLGY 
RLDETLLHKQDIETKDIIPQTSQERCIYIIQELLFKSHEINLIALQDQVFVSGYSIDNDI KKIRKMINDYPSLKLVRSKNYISLEGNETDKRKLYKQLLTAETQGNFMNLNSIAGLWNSF DLLEVKDILEEICEKYDYQIHEMTFPMIMIHAGVAIERIINHNYIKNQTISEKLESSREY QISYDFFTQVSTMINIELVTDEVILFALLLMGKRANDYRKDIIKEKLAVDVEHLVEAVIV EVKEYFDIDFSKDKDLKVGLAMHLQSLLERQKNNVQVTNVYLQEIKRKYPLVFEMAVRAG EVLAETCEDNINENELAFLALHLGAAYDRVNSLKRYRAIMIIPNNQMLSKMCRDKLKIRF EERMMIVQSYGFFEESMVKEEEPDLIITTVPLKHNLNIPTVQVTLFINYEDESKVFQTLN ILDKHRYHDDFVSLIKELMREDLFHVKNTMNNPKEIIDYLCDELIEKGLATQSYKEDVFR REAVSATSFVNGFAVPHSIEVSANESCISTLILDKPVIWGGFEVKLIILLAIRETDNHLL KIFFDWLSSIVSDSNKFTQLLEVREHQEFMEQVIM >gi|223714190|gb|ACDT01000025.1| GENE 31 34210 - 34788 532 192 aa, chain + ## HITS:1 COG:no KEGG:SGO_1646 NR:ns ## KEGG: SGO_1646 # Name: not_defined # Def: hypothetical protein # Organism: S.gordonii # Pathway: not_defined # 21 189 20 188 190 165 49.0 9e-40 MATKVVKAKKERKIRYEDAPLVKRFISYLIDWYVGALCTAIPIAIISQKLTNTMLNQNIV EFKQPYGIIAGILAVLFAIFYFVIVPAYIYPGQTLGKKICKIKIVKVNNEQVTIKNMLLR QLLGVIVIEGVLYTASAIWHELVTIITQTNFVTPLMYAGFIISGISILLYLFKGEHRTLH DYLGNTKVVLCK >gi|223714190|gb|ACDT01000025.1| GENE 32 34875 - 35198 524 107 aa, chain + ## HITS:1 COG:BH0910 KEGG:ns NR:ns ## COG: BH0910 COG1447 # Protein_GI_number: 15613473 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Bacillus halodurans # 4 101 8 105 110 80 45.0 5e-16 MDGLELVSFKIISAVGMAKSSFIEAMKVAANGEFDAARAKIKEGEESFNGGHLAHSELIQ QEASGNHVVPSILLMHAEDQLMSAETIKVMALEIIRLNERVKALEKY >gi|223714190|gb|ACDT01000025.1| GENE 33 35223 - 35543 437 106 aa, chain + ## HITS:1 COG:CAC0384 KEGG:ns NR:ns ## COG: CAC0384 COG1440 # Protein_GI_number: 15893675 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Clostridium acetobutylicum # 1 102 1 100 102 89 42.0 2e-18 MKKVYLFCSAGMSTSMLASNMQDVANQHNLPIKVAAFPHNKLEEIISEDRPDCILLGPQV KYMYEETVEQFGSQGIPIAVIDQGDYGMMNGEKVLKSAIRLIKANK 
>gi|223714190|gb|ACDT01000025.1| GENE 34 35560 - 36927 1526 455 aa, chain + ## HITS:1 COG:lin0900 KEGG:ns NR:ns ## COG: lin0900 COG1455 # Protein_GI_number: 16799973 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 9 436 12 425 428 297 39.0 3e-80 MIKQLEKILMPLAEAIGRNKYLVSIRDGFLVSTPLLIAGSIFLLIANFPIPAWTEWLSSV VINSNTGETLAAFIEKPSGATFTIMAVFAVVGIAYSFARQMKTDKIFGAAVAIMGWFLIM PYTVEGTVEMAGKEIPVSLTGIPTGWVGAKGIFVGIICAFVAVHIYSWVEKKGWVIKMPA GVPPTVVQSFAALIPATVVMTFFFAINLLFGFLGTNVFQIIFEFLQTPLLNLGDTLGAMI IAYIFLHLFWFFGVNGGSVVGAVFNPILQTLSVENIEFFKTGVGEGHIICQQFQDLFATF GGCGSTLSLIIAMLLFCKSKRITELGKLSLVPGIFGINEPIVFGLPIVLNPTMIIPFILV PTINIIISYLAMSAGLVPICSGINIPWTTPVIISGFLATNWVGAILQAVLLVLGIFIYMP FIKIMDKQYINDEANAVHEAEEDDFSLDDLSFDDL >gi|223714190|gb|ACDT01000025.1| GENE 35 36939 - 37460 758 173 aa, chain + ## HITS:1 COG:no KEGG:SGO_1642 NR:ns ## KEGG: SGO_1642 # Name: not_defined # Def: hypothetical protein # Organism: S.gordonii # Pathway: not_defined # 5 173 1 169 169 201 60.0 1e-50 MKVIMIYDQIQSGAGIKDDHMIPLGAKKEPVGPAIMMEQYLKTVDGRVMACLYCGDGYYE ANPEEVSRKLCAMVNKLKPDVVMCGPAFNYLGYGKMAANIAYDINQTTDIPAFAAMSKEN EETINEFKDKIHIIETPKKGGIGLNESLDGMCKLAKALVDHEDLNPITSKYCF >gi|223714190|gb|ACDT01000025.1| GENE 36 37519 - 38475 1324 318 aa, chain + ## HITS:1 COG:CC2359 KEGG:ns NR:ns ## COG: CC2359 COG1446 # Protein_GI_number: 16126598 # Func_class: E Amino acid transport and metabolism # Function: Asparaginase # Organism: Caulobacter vibrioides # 3 284 20 316 327 174 36.0 2e-43 MWGIIATWRMAVEGISKASQVLAEGGDAGDAIEIAVREVEDFPYYKSVGYGGLPNEEMEV ELDAAFMDGDTLDIGAVAAIKDYANPVSIARRLSKEKVNNLLVGEGAEKFAHKEGFERKN MLTDRAKIHYRNRVKEVQALEIKPYSGHDTVGMVCLDTHGKMTSATSTSGLFMKKAGRVG DSPISGSGFYVDSKVGGASATGLGEDVMKGCVAYEIVRLMKDGMHPQAACEKAVNMFDLE LKERRGQAGDMSLVAMNNKGEWGVATNIEGFSFAVATADQEPIVYLTENKDGKCIHTVAS QEWLDNYMATRTAPLEEK >gi|223714190|gb|ACDT01000025.1| GENE 37 38477 - 39577 1198 366 aa, chain + ## HITS:1 COG:CAC2723 KEGG:ns NR:ns ## 
COG: CAC2723 COG0624 # Protein_GI_number: 15895980 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Clostridium acetobutylicum # 3 221 2 216 465 196 43.0 6e-50 MEDIILKKVDDIAEQMIKSIIEIVKIDSVETAPLENAPFGSGVKAALDATLDLAASLGFV TTNIDNYIGYASYGESKNYICAIGHLDVVPVGTGWKQPPFSGYVEDGVIYSRGILDNKGP ILSCLFALYALKELKLELAHEIRIIFGCDEESGFEDLKYYLSKERPPLMGFTPDCKYPVV YGERGRAIIKICGKKDRLKEFFDLVNNYILNAKNNGERFNIDFQDQEFGLLEVRNYQLEL DGDPSIQFSVSYPASVTVETIVNNIKMVVSNFEVNLVGNYDPVKFPLKCKLVKTLVHTYE KVTGDDGTPVTTTGGTYAKLMPNIVPFGPSFPGQKGIGHQPNEWMTIEDIITNAKIYALS LYNLAK >gi|223714190|gb|ACDT01000025.1| GENE 38 39593 - 40327 675 244 aa, chain + ## HITS:1 COG:PM0526 KEGG:ns NR:ns ## COG: PM0526 COG3142 # Protein_GI_number: 15602391 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Pasteurella multocida # 2 205 3 204 244 136 37.0 3e-32 MIEICCGSYEDALNAYYGGAKRIELNSALHLGGLTPSIGSLKLTKNTTNLKVICMVRPRG AGFCYTDIEFKQMMIEAKDLLENGADGIAFGFLLKNHEIDIERTKEMVSLIKSYQKEAVF HRAYDCVKDPYASIEILINLGIDRLLTSGLRAKAADGKDLIKKLQTKYGDKIEILAGSGI NSTNGQAIMEYTGIAQIHSSCKDWQNDPTTSGEFVNYCYGPAEHKDDFDMVSKELVEKLV KLGL >gi|223714190|gb|ACDT01000025.1| GENE 39 40324 - 40704 214 126 aa, chain + ## HITS:1 COG:BS_yitT KEGG:ns NR:ns ## COG: BS_yitT COG1284 # Protein_GI_number: 16078176 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 13 114 9 111 280 61 35.0 4e-10 MKNKLWFRYLITIVAVIASSFLQTYAIKVFVEPANLLSSGFTGVAILIDRITSLYGFNFS TSLGLIILNVPVAILCFKSIGKKFVISSLCQVFLTSFLLRICTFPPLFNDVILNVFLVDL FMGCRQ >gi|223714190|gb|ACDT01000025.1| GENE 40 40692 - 41183 359 163 aa, chain + ## HITS:1 COG:lin2365 KEGG:ns NR:ns ## COG: lin2365 COG1284 # Protein_GI_number: 16801428 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 5 157 130 282 300 108 32.0 5e-24 
MSTVIALRGNTSSAGTDFIALYVSNKMGKSIWEYVFIFNALILCIFGYMFGWIYAGYSIV FQFISTKTISSFYQRYKRVTLQITTTHPEQIIERYVKDYRHGVSVVNGYGGYSKNPMSLL HTVVSAYEVQDIVQLMHESDEKAIINVIPTENFFGGFYQRPIE >gi|223714190|gb|ACDT01000025.1| GENE 41 41232 - 42200 746 322 aa, chain - ## HITS:1 COG:BH1811 KEGG:ns NR:ns ## COG: BH1811 COG1078 # Protein_GI_number: 15614374 # Func_class: R General function prediction only # Function: HD superfamily phosphohydrolases # Organism: Bacillus halodurans # 9 291 12 287 325 84 25.0 2e-16 MFYKIYHDTIPPYLLEIMNTNIFTRLKDIGMNCGMEYTQFQTFKNSFGVSRYDHSIGVSL ITYHFTHDKHATVAALLHDIATPVFAHVIDFLNNDYEKQESTEAETKKMIFNSRELQKIF LKYEIDINLIYDYHKYPIADNESPKLSADRLEYTLANAYYYNFLSVEQIKSIYYDLIIND KKDEIIFKSKDKAIMLTKVMFQCSRVYTSDENRYCMEYLANLLKTALTKKLITQDDLYTN ESSVIKNICKDKELSDKWEAFCHFHQVDISHNKKAGYYKINAKHRYFNPMIGNQRIINQS PKFKSELNSFLGDHFDRYVKVT >gi|223714190|gb|ACDT01000025.1| GENE 42 42222 - 42593 377 123 aa, chain - ## HITS:1 COG:no KEGG:Cbei_1499 NR:ns ## KEGG: Cbei_1499 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 120 3 122 124 100 43.0 1e-20 MKEDLIIEWFQSWFDSHWNNFEVIFESDAYYSESWGPEYKGIIEIKKWFNDWHEHFKLDK WEIKQFIHTNKHSIVEWHFSCIDLDGVHEFDGISLIEWSPNSLIKSLKEFGSSLPKYDPF EQL >gi|223714190|gb|ACDT01000025.1| GENE 43 42825 - 45359 2374 844 aa, chain + ## HITS:1 COG:CC0896 KEGG:ns NR:ns ## COG: CC0896 COG5001 # Protein_GI_number: 16125149 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Caulobacter vibrioides # 420 843 226 647 677 198 29.0 4e-50 MGFDDFKQLVDNMQINLYLTDIQTNEIIFMNNKMKETYNLKDPEGKICWKTLQKNQSGPC PFCPLIKLKQINNLYQSISWRENSDLTNRIFENYDSLIKWENGKIVHMQQSIDITDSEIL SHQASIDDLCGLLNRRAGKEKLLEFMVKKKKAKESFTIALLDINNLKRVNDSFGHQEGDF LLKQIAYAVKQDLDKEDLIFRLGGDEFVLVFGHKNTKDSAKFLIENQKNIQKLQALYNKP YGFSYAFGMYTVLPTNKLTVNDVITRADEQMYQEKLRYRRTQLAELEDVSVGSTNVACFQ YDPTQLYEALIKSTDDFIYICNMKTGIFRYSPAQVKVFNLPGEIIDHPLPFWKAIVHPQD 
WERFYKSNMAIGENKMDYHSVEFRAMNSDGEYIWLKCRGQLMRDEFGEPNLFAGIMTQLD RQNKIDPLTHLYNRQEFAKAFELKTKDKAIDNLGIMVIDIDDFKNINEIYDRSFGDFIIK TVAQLTQAVLPGNTSIYKLDSDQMGLLIENSTELEITKLYNEIQKQLLHEQLLKRYKCPI QISAGCAIYPKDGLTYNELNKYADYSLQYAKDNGKNKLTFFSNYILEHKMRSLEILKYIR ESVSDDYRGFELNYQPQVEVSSKQIKGVEALLRWQCQELGKVSPVEFIPILEESGLIIPV GLWVIKKAIQACSKWVVYDPEFSVSINVSALQMLNSDFVKDVKKVLKKEKLAPKNIVLEL TESYMVRNMDLLRTIFEQLRELGFKIAMDDFGTGYASLEILKTVPADIVKIDQTFVRDIK ISKFDKTFIRFIAQICHDVDIEVLLEGIETEEEFNVVEPMELDYIQGFLFGRPQSEEEIT NKLI >gi|223714190|gb|ACDT01000025.1| GENE 44 45389 - 46162 720 257 aa, chain - ## HITS:1 COG:SP1674 KEGG:ns NR:ns ## COG: SP1674 COG1737 # Protein_GI_number: 15901509 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 2 257 5 257 283 60 22.0 2e-09 MDFKKTVAIHYTEFTPSERKICARILEDPQIVNNSIVVLGDLCETSKSAILRFYQKLGYR GYSEFKYSVEESFKNNDEKKLSNESLLSQISQEYSNTILELGALDYDHQLKELASLIDKY PHVLSIGINNTYFAASQLEYSLYSHNRFIQSVREDIQLTYLINSINQDYLCIIYSVSGST YTYENFIKEASAKGAKIVLITIDGNSELLPLVDLAFVLPSISLPISSIDSIIQIDNRIML YYFSEIISYYYGLNTQN >gi|223714190|gb|ACDT01000025.1| GENE 45 46352 - 47761 1297 469 aa, chain + ## HITS:1 COG:BS_ydhO KEGG:ns NR:ns ## COG: BS_ydhO COG1455 # Protein_GI_number: 16077650 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 11 444 15 425 442 190 30.0 6e-48 MKDKFTALICKVSGNRFLNVLRDSFVLVASLTLIAGFAIMISSVFIDPNGIIFGSSGLGL GEMIFGSEKEFLASGFAQMLVQGQNIFNFFSKGAMSINALLIVVIFSYNVSRKYFSDNRE HMVSVMYALAAFFICLPWDFNYTSAKGIEVDINGIINSNFLGQQGIFFGLIVAGSATFIF NKLSKLNISIKMPDSVPPAVAQSFESLVPGMLTLFVFIIATVISNTFAGESLPEFILTTL QAPALAISKTAAFALVSQITWPLFQWLGIHPTAIWGAIFGMTWDIAGNENVLGVAKHLYT TMFMNYSTIAAGTFALAPVLSILFFSKLKQNKKITKVALAPAIFNISEPITFGLPIILNP IYFIPFIIVQPLCFYIAVFFTKVGFIPVVCNNVPWTVPPIISGILFTGTIAGGLVQLINL VISMAIYYPFVKIADKMELDKMTETEKLEAVETKRVSLLRGRKNETKIS >gi|223714190|gb|ACDT01000025.1| GENE 46 47742 - 49217 1314 491 aa, chain + ## HITS:1 COG:no 
KEGG:BDP_0122 NR:ns ## KEGG: BDP_0122 # Name: not_defined # Def: sialic acidspecific 9-O-acetylesterase (EC:3.1.1.53) # Organism: B.dentium # Pathway: not_defined # 3 386 29 431 600 240 33.0 1e-61 MKLRLAEIFTSRMVLQHGKEINIFGTGETNQQVTITIQGQTATTTIINNNWLITLPALEI STDETLIVSTGLETITLSNILIGDVYFLAGQSNIEFKFKDCDSVKVDQEVCYDRYLRYYE VPQIEYEDENIKIPPIRNQGWQICDNQTINDFSAIGYYLAYYLRHDIDIPIGLIAVNKGG TSGSCWINETYLQKNQEIKKVYYDEYYQAIMNQTEAQEDLEIAKYKERVKQYRQKVALYQ QTYPERNMSQLKKDVGHTPWPGPRGKKDFCRPAGLYYTMFKKICQYSGKAVIWYQGEEDT KNAYLYHQLLQLVIENWREDMKAQIPFIIVQLPEYDDDKNDNWPILRDAQQQVCREVPAC YLVVTMNCGEEFNIHPQNKKTVGHRIFWRIKELFYDNQFNGHSPKIINIENNSKIIIEFD QKLHSRGINNFILELPDHQVETVGIIKENKLLVERPAQVRTISYGYQNYCKISIFGENNL PIAPFKIQLTK >gi|223714190|gb|ACDT01000025.1| GENE 47 49524 - 50867 1623 447 aa, chain + ## HITS:1 COG:mll1318 KEGG:ns NR:ns ## COG: mll1318 COG2239 # Protein_GI_number: 13471369 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Mesorhizobium loti # 21 440 45 464 470 269 36.0 7e-72 MIDTITSDELKDILLNGTPETIKKVITNIHPADILDILHEDEDSIKALMDNLPNDVVASI IEEEDDEDDQYDLLKLFSDAKQKEILDEMSNDEITDLIGELEEDEKQAILDKMDKEDKED VERLLTFEPDTAGGIMTTEYISIRARNTVEKTLKFLQENTEEDTTYYLYVVDPQNILKGV VSLRDIVTSSFDTAMLDITNTNVKTVLYNEDQEEVAKKFQKYGFIMMPVVDEMDHLLGVI EFDDIIDIIQEESTEDINLLGGVNSEERLDSSVGESVKSRIPWLIVNLFTAVMAASVVSF FEGTIAQVVTLATVMPIVTGMGGNAGTQSLTIVVRGLSLGEMSKENATWIMLKEVAVGFC SGVIIGIIVALGSMLFEGNPVFGLVTGLAMFLNMILANIAGYFIPVILEKFHIDPALASG VFVTTVTDVLGFFFFLGLATVFLPYLI >gi|223714190|gb|ACDT01000025.1| GENE 48 50879 - 52633 1659 584 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734984|ref|ZP_04565465.1| ## NR: gi|237734984|ref|ZP_04565465.1| predicted protein [Mollicutes bacterium D7] # 1 584 1 584 584 1026 100.0 0 MEVKYVKKNIAKIIISLFFIIMMLFGAAFALIWFKHIFNYMTCLIVAIVVFIVIIILNKK GQICRNVEIGEQGLVDHLGTLEYREIKKVIYDNNRIEMNYGFANIVTKYHEVLLDKLQEQ GVTMIEYNGRLINFERVSLLIIYGLGIYFFYNLFVLIFGIISCLNYNYEIINSLGIVVRV LKLISILLALFVYNRLKNKKATVIIIIVIIAGFIVWGQSVKIKTYHDYYGKFACIMKERE 
LNIYKDIIAEYGILDKTINNLDSYKVADLTNYEKVIVYQGTNISGQYIVHRDVKNNNSKE QIFAYYNGKTFTHGALHITFNDGKLAIIDYNGAVYDYDKLKLVNGYILEIYKDNEVKKCL IFNYFKGYDNDEYRRSWELVISNLNDGKQNLYILKDSDDEKVRPQPNNKSENKIDAQTET DNVVSDEIKVSNGEKLIKKMDSINITAFESNEDFVKIKADSSDYNEIVLEVAKQFTIINN TDKKIDTQILGITIMSGSLEEFGVATNDRQDVEDIGKVKNTYYYRIRKVGNYYLAARVGD DCSVNVGLTMLEPVIATDTSQTTDFLYRIDGKKYLGNRWGDIGG >gi|223714190|gb|ACDT01000025.1| GENE 49 52636 - 52812 321 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755237|ref|ZP_02427364.1| ## NR: gi|167755237|ref|ZP_02427364.1| hypothetical protein CLORAM_00742 [Clostridium ramosum DSM 1402] # 1 58 1 58 58 66 100.0 6e-10 MEEKLDEMIVLLKNIDKNLQILVDEKNAVKVKKPSNDAILKQFEETEKLFNKVMNKGK >gi|223714190|gb|ACDT01000025.1| GENE 50 52972 - 55842 3655 956 aa, chain - ## HITS:1 COG:TM1295 KEGG:ns NR:ns ## COG: TM1295 COG0491 # Protein_GI_number: 15644050 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Thermotoga maritima # 98 317 25 218 218 68 28.0 6e-11 MFKKKFFIGSLSLMFALSGIGAVQAEGDQGSKSTASKVVELDPNGNAFLTDLTDDGYAEL GRYKVEKIKDNIYHWDEGTKKLPGGATDESGVTNNPSSMYFVLEEDGVLLVDLGNGSEND EDIQNAKTIVNSMTGNKPLSIIITHSHGDHTGFGRSEAVFADVDVEKVYISEPDYAAAAE AITQFDSKITKVNHGDTVNIYGAPYVFNVVSAHTEGSLMITDTTHGALFTGDTFGSGFIW ALFETNGGNPIAALNEGCALARTILNENPSLSILAGHRWQQFWEDNAQRPNEMSIQYFND MAQVISGLLDGTTISQPYPEVAWAPDAIELSSNGAKAKIDTLPKYVEAYKASVNEMSEAY VYSGSDKLSIESVNAAAAATFVVYPDGNLTDEEAQKYINESGIKDIVDRAASKVYVARPA NGISFTDDDVIGFETIVGKIGVSNNFKLIGIGDGATFINQNLTSYMNFVSGLALINPAQG QEVKVSVPTYLVTDDQAVINSYVKANNAGKVKDGYYENPESHYEIVVVNSDTKTSAADAT KDAWNTVLEKFGRIGNYSEVYKQTATWYSRPLLSGDKEADQARKYQYFDSVDAIDNIERH VVTQDLDKDGTDSLWYEYIPEQSKDAQPGSVPVVILFHGNTNDPRTQYDTSGWAQIASEE GVILICPEWQGHTYQGYTYDPMTDDSNETPDSDVITMLKIIEEKYPQIDQSRIYISGLSA GSRNTTNAGLSDAKYFAAGAGHSGPFGASDLNKEAVAANKDKYDMPIIFFTGDGDEYCKD AFDTTELNAGLQVAQLYQELNDMEVTQVEDIKDEDAYLYGVPWTKRYTIEPTAENIAKID VGAIENAKGVEISMARIYGWGHWNYTPDAKLMWEFMSKYARDLETGETIRLDLQEPDTPT DNPSDTPNDTEKPATQPSTSTSVKTGDSTMIAPFVIVASLSLFALVTVIKKSKAIR 
>gi|223714190|gb|ACDT01000025.1| GENE 51 56217 - 56876 734 219 aa, chain + ## HITS:1 COG:CAC1516 KEGG:ns NR:ns ## COG: CAC1516 COG0745 # Protein_GI_number: 15894794 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 3 218 4 225 229 165 38.0 6e-41 MKILIIEDDLELREELKVLLDNNGYQGIILKDFKNALEEILVTDPDLILLDIKIPFLNGQ QLLKKLREKSTIPVIMVTSKDSEIDEILALSYGADDYITKPYNPTILLLHIEAVFKRLHK AEPKLTYQNIRVNLMKSTLEKDDQELLLSKNEMGIFYYLLMNQGKIVTRDDIMNYLWGTD KFIDDNTLTVNMTRLRKRLEKIGLFDVIETRRDQGYMLI >gi|223714190|gb|ACDT01000025.1| GENE 52 56873 - 57883 1046 336 aa, chain + ## HITS:1 COG:lin1852 KEGG:ns NR:ns ## COG: lin1852 COG0642 # Protein_GI_number: 16800919 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 38 335 44 343 346 159 33.0 5e-39 MRIKEYLLDRLAYLVLLIIMIVSVLFFLLMFKVSNALIYAIVLVMLLFSGLLIGYDYYQK HAFYNNFRIKLNQLDKKCLITELIAQPDFLEGKILYDALYEIDKSMYEEIETYQDNLKNF KEYIELWIHEVKLPISSVKLMIHNHEHNERKLKDQVSRIENYVEQVLYYARQDSSEKDYL IKQCELGTIIRNVIKKNQDSLLYQNIKIEIGQVNCSVLSDSKWLEFIINQIVSNSIKYRR TDKCKIAFNVIRRNNIALEIIDNGLGIKTSDLTRVFEKSFTGENGRLVSSSTGMGLYIVK RLCDKLGHKIIIESKYSEYTKVTISFNDEQYYKVVR >gi|223714190|gb|ACDT01000025.1| GENE 53 57951 - 58718 1012 255 aa, chain + ## HITS:1 COG:lin2219 KEGG:ns NR:ns ## COG: lin2219 COG1136 # Protein_GI_number: 16801284 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Listeria innocua # 1 255 1 255 255 275 54.0 7e-74 MENILTIENISKYYGNKSNLTKAISNISMNVDKGEFIAIMGASGSGKTTLLNVISTIDKV TSGHIYIEGQDITKLKGNDLNRFRREELGFIFQDFNLLDTLTAYENIALALQIQNFKAKA IEQQIKVVARKLDIEDILNKYPYEMSGGQKQRVASARAIVTNPKLVLADEPTGALDSKSS KMLLERLQRLNQDYQTTILMVTHDAFSASYASKVIFIKDGKVFNQFNRGEATRKQFFDKI IDVVSLLGGDVSDVI >gi|223714190|gb|ACDT01000025.1| GENE 54 58708 - 60681 1553 657 aa, chain + ## HITS:1 
COG:lin2220 KEGG:ns NR:ns ## COG: lin2220 COG0577 # Protein_GI_number: 16801285 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Listeria innocua # 2 657 3 645 646 155 23.0 2e-37 MLFNLSLKNIRKSFKDYAIYFFTLILGVAIFYIFNSLESQTVMLKLNSSTKDIIKLMNNV LSGVSVFVSFILGFLIVYANQFLMKRRKKEFGIYMILGMSKKQISKILLVETITIGLISL VIGLVLGVALSQVMSIVVANMFEADMTKFTFVFSLSAWIKTMIYFGIMYLLVMVFNTIQV NRQQLIKLLTANRQNEEVKLKNTYLCIGIFIGAVVLLSYAYYNVTAGAGNLTNSSSVLFQ MIYGAVATFLIYWSISGLILKFVMASKNYYFKNLNSFTVKQISSKINTTVLSTTVICLML FLTICIFSSAFALNNSATEEINELAPVDIQMQKDVSVDNLSITKYLSQKQIGLDDLKNIY TFMTYRTSQLTYRDTFGSAFKDVSESYLNNCEDIIKLSDYNKLAKIYNLPTYDLKDEYVV MGNYENLMFSRNQFLKEKPAIELNGKIYQSKFDKCQNGFLSMQSSHMCMGVYIVPDEAVS NFIPADSYLVGNYAANDQDTRNEVDEKMHSYSSDVLLINTRIQISSSSVSMGAMVIFIGL YIGIVFLISCAAILALKELSQSIDNKGKYQILRNVGVDEKMINRSLFKQIAVYFAFPLIL ALIHSIFGIQVCNIMLQTFNQANVFDAIITTGIFLVIIYGGYFIITYQCSKSIIKDK >gi|223714190|gb|ACDT01000025.1| GENE 55 60775 - 61323 422 182 aa, chain + ## HITS:1 COG:ylaD KEGG:ns NR:ns ## COG: ylaD COG0110 # Protein_GI_number: 16128443 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli K12 # 2 182 3 183 183 180 48.0 2e-45 MSEFEKMRAGEIYNPRDLKLIYMYDKTARRLHRYNKRCFHVYNMRSRLMKKIINTSGNFW IHPPFQCDYGCNIYLGKDVMINYGCVFLDVCEIKIGDNTLIGPHTQIYTACHSIDPQERL KEIEFGKAVTIGNNVWIGGNCTILPGVTIGDNSVIGAGSVVTKDVPANVLAYGNPCQLKK KI >gi|223714190|gb|ACDT01000025.1| GENE 56 61550 - 62884 1229 444 aa, chain + ## HITS:1 COG:XF0656 KEGG:ns NR:ns ## COG: XF0656 COG1301 # Protein_GI_number: 15837258 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Xylella fastidiosa 9a5c # 37 431 17 425 437 176 31.0 8e-44 MNINIFELGAVLLAGVFFAIIYFLGKVKKYDFGLLTILGLVFGVAVGLIFKGHYLYLEAI GTIYSHLILAIVIPLLLFSIISSITNLGTSIKLKKISAKAIIFLLLNTLLASTITLTLAV VTKVGSGINYQLASDYKAVEVPAFIDTIINLFPSNLANSWVNGEVVPIVIFAVIVAIAYN 
KIARNQQKAVLPFKHFIDAGNRVMGEVVNFVISFTPYAVLALIARAVSKSALSDLIPLLS VLGLAYLLSIIQIFGVTSVLLKVVGKLNPLNFFKGIWPAGVVAFTSQSSIGTIPVTVRQL TKKLGVNEDVASFVASLGANLGMPGCAGIWPVLLAVFAINVLGIDYSIGQYIFLIVLAVV VSIGTVGVPGTATITATALFASAGLPIEIIVLLAPISSIVDMARTATNVVGAASAAMLVA RSENELDLAVYNNETIGQPDLEII >gi|223714190|gb|ACDT01000025.1| GENE 57 63365 - 63835 582 156 aa, chain + ## HITS:1 COG:SA1225 KEGG:ns NR:ns ## COG: SA1225 COG0527 # Protein_GI_number: 15926973 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Staphylococcus aureus N315 # 5 154 252 401 401 80 30.0 1e-15 MENIIDAVSYEENIIQLQLRNVPKHPMIIAKIFTILSECGVNVDMISQVMIEDAMQIEIT LDEKYQKNLNDAIMRLKDEVKQLEIATNRKYFKIAVGGKLLETTPGAAAKVFTILGDNNI HFYQVTTSKRTISFIVDKKHKELAMKKLDEAFGLNI >gi|223714190|gb|ACDT01000025.1| GENE 58 63845 - 64738 1238 297 aa, chain + ## HITS:1 COG:aq_1143 KEGG:ns NR:ns ## COG: aq_1143 COG0329 # Protein_GI_number: 15606400 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Aquifex aeolicus # 3 292 1 288 294 226 42.0 4e-59 MAIFEGSAVALVLPMHEDGSIDYDGFRCQVQRMLDGGVQALLVNGTTGETATIHIDDEFK LLDITLEMAAGTGVKVIAGAGSNDTATALKKAHYAKEKGADAILVVTPYYNKTSQRGLIK HYTTVADAVDIPMILYNVPGRTGININVATVVELAKHKNIVAMKDATDNIAYAMEVLAKT QGMDFDMYSGCDDNILPFICAGGKGVISVLSNIYPRQTELLAQLALKGDLPKAQELAYAL EPVCRYLFIDVNPIMPKAALARMNVCKPTLRLPLIETTEENKKLLFDAMDAFEKLGF >gi|223714190|gb|ACDT01000025.1| GENE 59 64740 - 65483 828 247 aa, chain + ## HITS:1 COG:SA1228 KEGG:ns NR:ns ## COG: SA1228 COG0289 # Protein_GI_number: 15926976 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Staphylococcus aureus N315 # 1 246 1 240 240 182 41.0 7e-46 MKVLVYGYGLMGKKIAHKVRSQNDMELMGIVSYEFDEKVPEKTYSSLKECDEKADVIIDF SHPNNLNDILTYALANQTKLVIATTGYSQEQLDQIKEASKEIAIFQSYNTSFGVQMVTKM LRQFAKEFYDAGYDIEILEKHHNQKIDAPSGTAELLYEVMAEEIDGVEACYDRSSRHEKR 
KKEEIGIQSLRGGTIFGEHTIMFAGVDEIIEIKHTALSKEVFVQGAISAAYALNDKDNGL FTLKSLY >gi|223714190|gb|ACDT01000025.1| GENE 60 65496 - 66803 1410 435 aa, chain + ## HITS:1 COG:BS_lysA KEGG:ns NR:ns ## COG: BS_lysA COG0019 # Protein_GI_number: 16079395 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Bacillus subtilis # 1 434 1 434 439 372 44.0 1e-103 MVLQTEFAQIKNNNLYIDDVKATELAKKYGTPLYVMSEGHIRHQFNKLKTKMIEKYENTL PLFASKSFSCIAIYKIAKEYGVGIDCVSAGEISVALKSGFDPKKIYFHGNNKLPSEIEYA LENGVENFVIDNFYEIDLVQEIASRLGVKINGIVRITPGVYAGGHDYIRVGAKDTKFGFS SHDNTYLKAIKKVIDAPNINFDGIHCHVGSQIEDIQAYVLAMNKFVEIAYEIYDIFGIVI NKLNAGGGFGIAYTAADNPLEFETVVDTIMEIITKGFDARGLKRPMVLVEPGRYCVGNAG ITLYTVGSSKYIKDIRDYITVDGGMTDNIRSSLYAAKYDAIIANKAEMPTDHLITVAGKN CESGDILIKDIMLQDPEPGDILAMFSTGAYHYSMSSNYNQLPKPAVVFTYEGKDREVIRR QTFDDLVYYDIDSKY >gi|223714190|gb|ACDT01000025.1| GENE 61 66846 - 67250 503 134 aa, chain + ## HITS:1 COG:CAP0110 KEGG:ns NR:ns ## COG: CAP0110 COG0454 # Protein_GI_number: 15004813 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 133 1 134 137 103 41.0 1e-22 MELTYKNIKEISQERLVDLFKSVEWESANYPAQLVQAIKNYGSVFSAWDNDKCVGLVASM DDSIMVAYVHYVLVNPQYQKYGIGKKLMQMMLEHYKDYHKVCLIGVNTAVGFYEHLGFEV NEKAKPMFYLNKNY >gi|223714190|gb|ACDT01000025.1| GENE 62 67289 - 68119 584 276 aa, chain - ## HITS:1 COG:MA4656 KEGG:ns NR:ns ## COG: MA4656 COG0500 # Protein_GI_number: 20093435 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 22 259 1 238 257 187 40.0 3e-47 MKDILKQLNTRPKLFEQSTASIWDDPHISKGMLKAHLNENQESATRKLDFVKKSVAWINT VLPNHHYNNLLDLGCGPGIYAELFYQYGYQVTGIDLSKRSISYAQASAKQKGFDIIYLRS DYTKSNFRQHYDLVTLIYCDFGVLPPATRKKLLQKIYNTLSPKGALLFDVFTPLQYDGAV EHKNWQINENGFWHNDWHLVLNAFYRYDNVHTFLNQYTVINEDRITTYNIWEHTFSLKEL 
ERDLKNAGFTKLDFYKDVIGQKYDKTSKTICVIAQK >gi|223714190|gb|ACDT01000025.1| GENE 63 68631 - 69173 574 180 aa, chain - ## HITS:1 COG:lin0876 KEGG:ns NR:ns ## COG: lin0876 COG0667 # Protein_GI_number: 16799950 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Listeria innocua # 5 180 11 173 317 102 35.0 3e-22 MEYINIKGLDKPISKLIMGTAWFNPAFEEEIFTMLDQYVAAGGTVIDTGRFYGANKDGEH ACESERVLKKWFDSRNNRDQLVIMDKACHPIITPEGCHHPEYWRVKPDIITDDLHYSLLH TGCDHFDIYLLHRDDPSVPVNEIMDRLEQHRQEGLITTYGVSNWELDRVQAAVEYCQQMG >gi|223714190|gb|ACDT01000025.1| GENE 64 69741 - 69935 216 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735000|ref|ZP_04565481.1| ## NR: gi|237735000|ref|ZP_04565481.1| LOW QUALITY PROTEIN: lysR-family transcriptional regulator [Mollicutes bacterium D7] # 1 64 143 206 206 120 100.0 3e-26 MINQGFGVGIAANTSFLKEFDLKVIPLKLKKDYRVIYLVYNKVDYISAATENFINYIAIN KINL >gi|223714190|gb|ACDT01000025.1| GENE 65 69961 - 70452 538 163 aa, chain - ## HITS:1 COG:PH1832 KEGG:ns NR:ns ## COG: PH1832 COG4720 # Protein_GI_number: 14591582 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pyrococcus horikoshii # 1 146 32 183 202 65 35.0 4e-11 MINEKTNQLAKTALMIALIFVGTFSIRIPNPATGGYFHMGDSMIFLSVLILGKRDGAFAG ALGGALADLLCGAAIWIGPTLIIKFIMAWIMGIIWEQYPSKIISAIGGGVFQIIAYTAAE TFLFTWPAALGALPGLTMQTAGGIIIYVILARALQTTKLLNSD >gi|223714190|gb|ACDT01000025.1| GENE 66 70979 - 71671 752 230 aa, chain - ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 23 224 14 215 217 213 56.0 2e-55 MTNRIMQKEIVSDPFTALVPNTDITDTLQEFMALQQLYDAGIKEVRTKLEILDDEFKIKH DHNPIHHMEYRLKSVNSILGKLEKRGLEVSLDSIVTNLTDIAGVRVICNYVSDVYKIADL LIKQSDIKLIAKKDYIKHPKENGYRSLHLVVEVPIFLAEKVQPTTVEIQIRTIAMDFWAS LEHHLRYKADNEVPDGVRDELIECAKTISNLDYKMQGIHEELNKPKKKVI >gi|223714190|gb|ACDT01000025.1| 
GENE 67 71745 - 72107 470 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755255|ref|ZP_02427382.1| ## NR: gi|167755255|ref|ZP_02427382.1| hypothetical protein CLORAM_00760 [Clostridium ramosum DSM 1402] # 1 120 1 120 120 197 100.0 2e-49 MKKTIVVLLIMMNIVSVNASSNDPKYKIIANSNSDKDIKTMYETKDDLIKDYKDWIKSVD DVDQALADHQKDYQAKYYNGEYTIVLGAGKGKELTGTLKASYCVSSKEIKKKSFFAELFS >gi|223714190|gb|ACDT01000025.1| GENE 68 72247 - 73245 936 332 aa, chain + ## HITS:1 COG:VC1370 KEGG:ns NR:ns ## COG: VC1370 COG2199 # Protein_GI_number: 15641382 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Vibrio cholerae # 150 327 247 428 443 101 35.0 2e-21 MKTVKDAEIFCQYIAHEYLGKRNYKLLEDIIDKRITIIGTGAHEVSRNIQELMTALNKEA ELWDGTFIIEDEWYQGTELADNLFVVIGQIEGRQNSNDQLVYSFSSRVTFIIEYHDMQWK IIHVHQSVPDYSQGDDEFFPQRIVEESNAQLKKEIAEKTKELEMSNQQVIYNLRHDYLTG ILNRPSLEKEVTEAMSNHQYGVILVLDIDYFKEVNDNNGHPYGDEVLIKLANTMQESFKS GICGRIGGDEFIIYLALDDANYDAIEKTIKEFKQNWQKNITTIDRSSTITLSIGGAYYPK HGKNYQELWSNADKALYLSKNNGRNRISIYQL >gi|223714190|gb|ACDT01000025.1| GENE 69 73339 - 75336 931 665 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 639 2 751 815 363 32 2e-99 MLTKFDEQAQKAIVVGESIAFDLGHNNVGSEHLLLSLLKISDSKLRELVKKYDVDDKNIY EDIKRLFGTNDDQPFYMEYSDAVKSILEAAMEITNQQNKSKVTLNILTIALLQSEESVAH ELLKKYKVDFEEIIYQLNESSELETKLDQIQSLVNLNKKVKQGEHLIVGRQKELEKLCMI LSKKEKNNALIIGEAGVGKSALVEKLAYLINQKQVNEGLKKKIIYELSLSSIVAGTKYRG EFEEKLRKIIDKVKEMDNVIIFIDEIHNIIGAGGAEGAIDAANILKPYLARKDLTIIGAT TTEEYFQHFEKDQAMNRRFSVITLKENTKEETLEILQKTKLFYEQFHQIEVDNEVLTYLV DNVDRYIKNRTFPDKAIDIFDLACVKARFKQQHIITKQIVKDVIEEYTSVKISEDYDYDE IKAKLNHHIIGQSKAIEQIINQLKLTKQSKQPSAVMLFVGNSGVGKSESAKQLSKLLGRK LIRLDMSEYRDSSSVQKIIGAAPGYVGYDKPSLLLGQLQTYPKSIILLDEIDKASQDVIN LFLQVFDEGYLEDSHKRKVYFNNTIIIMTSNKGTAKNTLGFKKNNHSSKVKNFFSDELLS RIDEIINFKNLTKMDLKKIIRKNCPHEVKEEDIELILKEYDMKLQGRGIVKAANKYFQNK AKAQS >gi|223714190|gb|ACDT01000025.1| GENE 70 75333 - 75911 249 192 aa, chain - ## PROTEIN SUPPORTED 
## NR: gi|223039927|ref|ZP_03610210.1| ribosomal protein L22 [Campylobacter rectus RM3267] # 5 190 9 196 208 100 32 3e-20 MLTNLLAYINTYGYKAIFLLIAIENLFPPIPCEIILTFGGFMTTITNLTPLGVITVASLG SYLGAVILYWFGYLINLERLEKLLVRFYFKRHDLSRSLNWFEQYGKISVLIGRLIPIVRS LISLPAGITKMNFFSFSFYTLIGTIIWNSILVVLGIILGNNWILISKYVKQYALIIAIIT LLIILWKKRRLN >gi|223714190|gb|ACDT01000025.1| GENE 71 75883 - 76788 643 301 aa, chain - ## HITS:1 COG:SA0100 KEGG:ns NR:ns ## COG: SA0100 COG1283 # Protein_GI_number: 15925808 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Staphylococcus aureus N315 # 1 300 26 328 555 207 41.0 3e-53 MSTSLDTLANDKLENILYKLSSNKYLGVITGTAITAIIHSSSATTIILIGLLNSHLISLN QATWIILGANIGTTVTGLMIALDLGQSAIYLCVLGLLLMFLKNKIAYLGRVLMALGLIFL AMDSMAVALAPLQTSPYFINMFGHLDNPLTAILVGTVFTALIQSSTASVGVLQTIYQKGL ISFAMATNVIYGQNIGTCITAVLASLNGDRSSKRLSAIHILINVLGTIIFVILAKFIPLV SFIESLTPNYMMQIAYMHTFFNIISTIILLPFDNLLIYLANKVIPLSKQEVTYAHKPSRL Y >gi|223714190|gb|ACDT01000025.1| GENE 72 76957 - 77592 377 211 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 3 211 4 216 234 149 42 4e-35 MNYSAIILCAGKGSRSGLDYNKMLYRFGNQTVYQMTMKTFLNDPRCKQLVVVTKADEITD LKQLVTDKRIVYVYGGKERQDSVYNGLQVVDQDHVLIHDGARPYITTEKIDDLLLCLKEY DACLLMVPVKDTIKLCCDDTVEVTLPRARLMQAQTPQAFKTSLIKVCYQQAQDSGFIATD DASLVEEFSTAPVKVVIGSYENIKITTPEDL >gi|223714190|gb|ACDT01000025.1| GENE 73 77593 - 78978 1261 461 aa, chain + ## HITS:1 COG:BH0675 KEGG:ns NR:ns ## COG: BH0675 COG1472 # Protein_GI_number: 15613238 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus halodurans # 54 461 64 480 686 135 27.0 1e-31 MKERNNNFITHKIIIIIVLLGLIMGALVYQLRLAGEGETTITAKELKGQIVDITHETISL RDDNNIVYTVDCQKAKIKGDELQYGNLVTIKYTGKLEQTTAIQAIDVLGLNVQAVQVRNG GTGNTDATIASHKIAVMVEKMTLEQKIAQLFLARCPESQAVELLSQYQLGGYMLYNRDFH NRTREEVIENIQSYQKAVTIPMLIAVDEEGGTVVRVSNNLRSNKFRSPQDVFKAGGMDAI 
ISDATEKSEFLKEFGINVNIGPVADVAMSKDDFIYQRSFGTDPNETAEFVKNVVKAMNDI KMGSVLKHFPGYGNVADNHTAICHDSRDYDSLVNNDFLPFKAGISAGANSILISHIVVDS IDDQNLASLSPRVSKILRDDLNYHGVIIADDISMASAKAFGSEGEVALKAIKAGNDLIMT SNPQEHISALITAAKNDEICLNSLDRSVMRILTWKSQLGIL >gi|223714190|gb|ACDT01000025.1| GENE 74 79113 - 79394 479 93 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755262|ref|ZP_02427389.1| hypothetical protein CLORAM_00767 [Clostridium ramosum DSM 1402] # 1 93 1 93 93 189 100 6e-47 MNKYEIMFIVKPDVEDDARNALIESFKGILTANGGTVDNVDEWGLRDLAYEIKDYKKGYY VVVDATCTAADVAEFERLSNINASMLRHLTLRK >gi|223714190|gb|ACDT01000025.1| GENE 75 79410 - 79892 582 160 aa, chain + ## HITS:1 COG:BH4049 KEGG:ns NR:ns ## COG: BH4049 COG0629 # Protein_GI_number: 15616611 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Bacillus halodurans # 1 111 1 111 168 151 66.0 6e-37 MINRVVLVGRMTRDPELRRTPQGDAVTSFTLAVNRNFTSRDGQQQADFINCVVWRKPAEN VERYCSKGSLVGVEGRIQTRSYDNTQGQKVYVVEVICDSVQFLETRAARERQPQPQQQMQ QPYQQPQNNDNFYDMKTVELEKEFDNSINTYDIMEDDIQF >gi|223714190|gb|ACDT01000025.1| GENE 76 79910 - 80140 385 76 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755264|ref|ZP_02427391.1| hypothetical protein CLORAM_00769 [Clostridium ramosum DSM 1402] # 1 76 1 76 76 152 100 5e-36 MAFRKQRMGRKKVCYFTKNKITYIDYKDVELLKRFVSANGKIIPRRVTGTSAKYQRMLAT AIKRARQMALLPYVGE >gi|223714190|gb|ACDT01000025.1| GENE 77 80504 - 81271 902 255 aa, chain + ## HITS:1 COG:SPy1712 KEGG:ns NR:ns ## COG: SPy1712 COG1349 # Protein_GI_number: 15675565 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Streptococcus pyogenes M1 GAS # 1 235 1 231 256 128 32.0 1e-29 MIASERFLRIVNTVNERGFISTKELSENLDVTETTIRRDCEELEKQGLLIRVHGGAKSIE QKTILSNKDEKQMSERITNNKEKDLVCKKAASFIRDGDCIFLDGGTSIVPMLKYIKGKNV KIVTHSTLIANGFNDFESELFMIGGKYIPEYNMSVGPITLSDLEKFNFDYAFLSCAGIDI DRKLVYTAEMDTMAVKQKAMDLAVKKYLLIDSSKLSVKGFCSFISSDDFDAVICNKDAHM NIDDIPVNYILTNED 
>gi|223714190|gb|ACDT01000025.1| GENE 78 81445 - 82155 763 236 aa, chain + ## HITS:1 COG:MYPU_3140 KEGG:ns NR:ns ## COG: MYPU_3140 COG0274 # Protein_GI_number: 15828785 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Mycoplasma pulmonis # 15 223 11 212 229 134 38.0 1e-31 MKKITKKANEMNLKELASYIDYSVLKPEFTEQEIIDLTKDGVKLNCATICINPGYMDLCE PYVKGTDTMLCPVCDFPFGTSSTESKVKQIEIVAKYDSVKEIDIVANFGMIKSGKWEEVL ADIKACTEAAHKYGREIKVIFETDALNELQIRKMCHICIEAGADFVKTSTGFLTGHEAHG ASLEIIKVMMEECGDKIKIKGSGCIRTREHFLQLIDMGIDRMGVGYRSVSVVLGLD >gi|223714190|gb|ACDT01000025.1| GENE 79 82207 - 83007 775 266 aa, chain + ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 2 264 7 270 270 273 53.0 2e-73 MLNKMASEARKSIIKMIYEAKSGHPGGSLSCIDILMYLYQKELKLTKDNINSNERNKLVL SKGHGAPALYAVLKEMKVLDEKELMTFRKINSSLQGHPNMNDTKGVDMSTGSLGQGISAA VGITLANKYKNNNYYTYVICGDGEFEEGQVYEALMAASHYQLSHFILFLDYNGLQIDGKI VDVIGPQPFLEKFLAFGFEVININGHDFDEIEAAVEMAKNSQKPTAIIAHTVKGKGISFM ENEIEWHGKAPNRKEMEQALRELGGH >gi|223714190|gb|ACDT01000025.1| GENE 80 83009 - 83932 773 307 aa, chain + ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 2 306 3 309 309 276 48.0 3e-74 MKIATREAFGEALKELVIKNNDVIVLDADLSPATKTCYAKEARPQQFYSTGIAEANMVGI GAGLAVQGLKPFVSSFAMFLAGRAFEQIRNSIAYPHLNVKLCATHAGLTVGEDGATHQCN EDIALMRVLPGMVVLQPCDANETKAMMQFMVDYDGPVYLRTSRYAVDNILDSNYKFLLGE IVPIVKGCKIAFLATGIMVNEARKAIEILAKNNIYPSLYNVSSLKPINTMQLNSIINSYD LIFTLEEHNIIGGLYSLVCEKLDKPKKIYPIAIQDTFGESGTPEELMSKYKIDADYLVKS VLETVEE >gi|223714190|gb|ACDT01000025.1| GENE 81 83935 - 85356 1302 473 aa, chain + ## HITS:1 COG:lin0017 KEGG:ns NR:ns ## COG: lin0017 COG2723 # Protein_GI_number: 16799096 # Func_class: 
G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 2 471 4 477 477 545 53.0 1e-155 MTKFKDDFLWGGSLAANQCEGAFDADGRGLDLMDVIPGGLEARQAAYVSPLDYMNENIEY CPNRIAIDFYHRYKEDIALFGEMGFKCLRLSVNWTRLFPYGDEEQPNEKGLTYYDSLIDE LLKHGIQPLITLNHFNVPLYLAKHYGGWANRKMIDFYCRLVRLLMTRYKGKVTKWLTFCE INVGLFQPYSTLGLLYRKGDNIEQIKYQAMHHQLLASALATKIGHEIDKNNQVGCMIAGG VVYPYTCHPSDVLKAFEENNKELMLTDVQVFGEYPYHMKAKMKKEGITIHMEKDDEEILK NNTVDFVAFSYYASKVVTTDHRLKINNYGKLQNPYCDASQWGWIIDPQGIRITLNFLQER FHKPLFIVENGLGAPDVVETDGSIKDDYRIEYLKAHIEQVKLALDDGIDLMGYLTWGPID VISATEGQMTKRYGFIYVDRDDEGKGTLKRSKKKSFNWYKKVIQSNGEELENI >gi|223714190|gb|ACDT01000025.1| GENE 82 85379 - 86572 1177 397 aa, chain + ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 1 389 11 402 427 209 36.0 6e-54 MSRKCIIGQSGGPTVVINATLAGVIQGALKADYEKIYGMINGIDGLLDEKMVDLSSFKDQ KNLFKLIHTPAMYLGSCRYKIFEKDQSIKQRIITILNKYEITDFFYIGGNDSMDTIEKLS TYAKVCNCPINFIGIPKTIDNDLMNIDYTPGYPSACHYVATSLCEIAFDSVIYPIKSVTI VEIMGRDAGWLTASAQLVNEVYPDSIDLIYMPENIFDENDFLEQIKDKLKKKDHCIIAVS EGIRDQYGMCVSLDSGIQKDAFGHLSNNGCSNYLKKLIGDHLGCKVRAIQLSTLQRCAAH LASYIDIENALRIGSYAINCALDKRTGIMVTISKNRNEFLLSHVDIHKVANHVKPFPLEW IDEKNKKIKKEYIDYVVPLIYKSGQIELPQYLQLNER >gi|223714190|gb|ACDT01000025.1| GENE 83 86562 - 87284 549 240 aa, chain + ## HITS:1 COG:ECs2020 KEGG:ns NR:ns ## COG: ECs2020 COG1434 # Protein_GI_number: 15831274 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 6 228 16 248 266 126 33.0 4e-29 MKDSQAFNLLNQFLAINTYDEKKKYDLILVLGNAIAETGLIAYQLYKTGHAYHFMIAGGK GHTTDILINTIKEKYDIKDITKRSEAELFKYLIEENYGVDNTILLEKESTNCGENIQFAF KLLKKEDIIVKNILLVHDPLMQRRIDATARHYAPHINFDNYRCFLPVVENIGFELKNNIW GLWSKERYISLLLGEMKRVIDDKDGYGPNGKQYIEHVEVPQKILAAYRYIFFRYGKYQRK >gi|223714190|gb|ACDT01000025.1| GENE 84 87419 - 87979 
591 186 aa, chain + ## HITS:1 COG:SP0804 KEGG:ns NR:ns ## COG: SP0804 COG0693 # Protein_GI_number: 15900697 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Streptococcus pneumoniae TIGR4 # 1 180 1 180 184 198 51.0 6e-51 MKKAAVIVAPGFEEGETLSIIDIIRRANFQCDMIGFEKDVAGAHDIIVKCDSVLNDEVID YDMIILPGGYPGAANLRDNTRLIEILNIMAQKNKYICAICAAPIVLEKAGLLKGKNYTAY VGYEQKIKQGNYLYDKVVIDGKIVTSRGPATVYAFAYKLVDLLGGNSLALKKRMVYFNAF DVKEDE >gi|223714190|gb|ACDT01000025.1| GENE 85 87981 - 88532 651 183 aa, chain + ## HITS:1 COG:SP0804 KEGG:ns NR:ns ## COG: SP0804 COG0693 # Protein_GI_number: 15900697 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Streptococcus pneumoniae TIGR4 # 1 183 1 182 184 149 38.0 4e-36 MRKAAVLVVDGYEESETVTIVDLLRRAGIECHTFGFAEQYVRGMQGMMIKVDKIFSDEIK NYDMLVLPGGRPGGVNLGANPLVIEMVQYYNENGKYLAAICSGTIVLSKARVIDGKNVTG YTGYADKLVGGEFIDKVVVFDQNIITSQGPATSYPFAFKIIEVLGQDVSEMKERLLYNFA GGR >gi|223714190|gb|ACDT01000025.1| GENE 86 88552 - 89325 769 257 aa, chain + ## HITS:1 COG:BH3576 KEGG:ns NR:ns ## COG: BH3576 COG1737 # Protein_GI_number: 15616138 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 21 253 19 264 284 63 23.0 4e-10 MIYDKLPIVFLSTLVSEKNGSTNSQIAAYILNHLEEVQNLGIKEIAKECNVAVSSISRFC KEVGLRDFAELKELLLSTDLSFEDHSHATSKQARLHDYSHKVRESIIMVEKSIDMDAVID LCKDINEYQKVAIFGLLKAGAVAFNLQGDLLMLNKQVYTNISYSQQIQYIVAADEDDLII IFSYTGAYFDYQDLRALKKRLTAPKIWMISSDDREYPECIDRTILFKSLQDQNSHPYQLQ FIAGLIAQEYSRLHQLK >gi|223714190|gb|ACDT01000025.1| GENE 87 89352 - 90245 665 297 aa, chain - ## HITS:1 COG:L147291 KEGG:ns NR:ns ## COG: L147291 COG1737 # Protein_GI_number: 15673115 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 2 266 4 269 283 98 27.0 2e-20 MLLQEKMKNTKFSPSEEIVIDFILTKQERIKDYSTTMIAGETYTSPSLLVRISKKLGFSG YNNFKNAFLEEVQYLHKSHLNIDANQPFLKTDSIMNIANKITQLKTESLNDTLSLIHHDT 
LQKAVRALQKSNTVKVFGVSNLSFPAEEFVFKLRHIGKNAEVFVLNSNIYQEAMMANSND LGICLSYSGESGEIIKIANILKKKNVPIIAITSIGENSLTRLADIVLRVTTREKSYSKIG AYSSLESISLILDVLYSCFFTTAYDQHYTFKTELAKNTESREITNLIIQENEQNEKE >gi|223714190|gb|ACDT01000025.1| GENE 88 90456 - 91406 1042 316 aa, chain + ## HITS:1 COG:mll6957 KEGG:ns NR:ns ## COG: mll6957 COG0451 # Protein_GI_number: 13475790 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Mesorhizobium loti # 2 315 3 317 318 285 43.0 6e-77 MKVVVIGACGHIGSYLVPKLVKGGFKVVAISRGKSKPYINDPAWKMVEQVVLDRNKDEDF AYKVAKMNADIVVDLINFNIEDTKKMVTALKEIKISHYLYCSSIWAHGRAEFLPADPNSL KEPLDDYGVDKYQSELYLKEQYRKNGFPATIIMPGQISGPGWTIINPLGNTDFSVFQKIA NGQEIYLPNLGMETLHHVHGDDVAQMFYQAITHRNQALGESFHAVERESMTLYGYAKAMY RYFNQEPKIKFLSWEKWCKYVNDDELIDHTYYHIARSGEYSIENAQKLLEYSPKYTTLET VEQAVASYIERGIITL >gi|223714190|gb|ACDT01000025.1| GENE 89 91470 - 91904 479 144 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0368 NR:ns ## KEGG: EUBREC_0368 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 5 144 13 152 152 87 36.0 1e-16 MIEPDTIKLLRECDAGIKMGAASIDDVLDYAHDETLRQHLIDCKREHEQLKEEIQTLLDK FHDEGKEPNPMAKGMSWVKTNAKLVMHESDQTIADLITDGCNMGVKSLNKYLNQYEAADE KTKDITKRLINLEEKLVIDIRQFL >gi|223714190|gb|ACDT01000025.1| GENE 90 92585 - 92857 442 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755279|ref|ZP_02427406.1| ## NR: gi|167755279|ref|ZP_02427406.1| hypothetical protein CLORAM_00784 [Clostridium ramosum DSM 1402] # 1 90 1 90 90 120 100.0 3e-26 MALFDDLSKKAAKLTEKTIEKSSELADVAKTRVSIKSAQADLDEKFIELGKLYYEIISND NIFDEKTAAIVQEIDSINTRIKELEANLDK >gi|223714190|gb|ACDT01000025.1| GENE 91 92871 - 94310 735 479 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145632256|ref|ZP_01787991.1| 50S ribosomal protein L27 [Haemophilus influenzae 3655] # 8 441 1 440 456 287 36 1e-76 MKGSECVISIQELLGLINEFLYSNILIALLVATGVYFTIRTKFVQFRMLPEGIRLLKEKS HHDDGVSSFQALMISTASRVGTGNIAGVATALAAGGAGSIFWMWIIALIGGASAFIESTL 
AQVYKEKDGDSFRGGPAYYIEKAIGKRWLGIIFSCLLIACFIFGFNPLQAYNVSSAVEYY FSNNELVAFVIGAVLALATAAVIFGGVHRIGIISSKVVPVMAILYILLGLYITFSNLNQL PEIFSDIFNQAFDFKAIAGGFAGSCVMHGIKRGLFSNEAGMGSAPNAGATADVSHPVKQG LVQTISVFIDTMLICSTTAFMLLNYGTESGLTGMPYVQQAIFAEVGEFGIHFITISIFLF AFSSLIGNYCYAESNFKFIIDNKKALFIFRIITVIIIFFGAQASFNTIWDLADVLMGFMA IMNIVVILLLGKIAFKCLKDYSIQKKEGKDPIFHPDNLGIKNAEFWHDIEKEYEKPVEV >gi|223714190|gb|ACDT01000025.1| GENE 92 94438 - 97368 2585 976 aa, chain - ## HITS:1 COG:sll0267_6 KEGG:ns NR:ns ## COG: sll0267_6 COG2200 # Protein_GI_number: 16331091 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 730 973 8 249 253 160 36.0 1e-38 MLKLVDLNTNCSYSEFLLRWTKKCVHPDDKNKFYQELHPHRLKKLFEQGITEVYCEYRSL NATQKQGWVSTTIHLLHVGETNKLFGFVYVKDINDKKIHELELLRQSQSDPLTKLYNRTA FGQIVNEYLNKKRESSSALLLIDIDNFKNINDNLGHSFGDTVLCEIAHKLTNIFNNKAII GRYGGDEFIIFIKDIPSKKYVYHKASIILEELHLYYSSNYQEYTISSSIGITFSPDDGKT LHELFEQTDSALYRAKKLGKSQYFAFNDSYQDITPVTNYISKGWLIDELDEIVYVSSLDT YELLYLNRKGREITGIEAGEYNHIKCYEALQGRTSPCPFCTNAKLNLNEFYIWEFSNQHL NKDYIVKDKLVLWEGTPVRMEIAVDVSDTHNFNLQRVPTEFAIEKTILDCLQALTIPDTL EEAINNVLEIIGNFYQATRAYIVEIDLNTKIGSNTYEWCRENYPHYRDQLQRIDLNEIPY IYEAFEHHNNLIINDCEAIKKEHPREYAHFISREAHSLVTIPYEETGIFAGYIGVDNPSI NQNTIALLDSINFSIVNEIKKRRLYEKTQYNLYHDNLSGLLNRNSFTQFLSYENSAVYSQ GVILADINGLKEINRDFGHYHGDKIITIISSIMNSYFPSEKIFRLSGDEFIIIVNNLEYK QFIEATKQMEDTLLGSTPNGVSLGYTWSEDNMDINDLIHQAEELMMINKQIYYERADTYK KHYSPKKLENLLKCFKAKQFVVYLQPKFDIDQNKVVSAEALVRLEYPGHGLIMPNKFIPT LEKERMTRYLDFYMFEQICEILERWQKEGKELIPISVNISRLTLLESDFTNSLKRIKNKY NIPNNFITLEITESIGNIDRSIIATISKRIKDLGFNISLDDFGAKYANMSLLSTLNFDEL KIDKSMIDTLVNNDKCQTILHHIIEMCKKINVACVAEGVETEKQIELLVCLGCNIIQGFY YSKPIPLQEFEDKYHQ >gi|223714190|gb|ACDT01000025.1| GENE 93 97472 - 97885 334 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237735030|ref|ZP_04565511.1| ## NR: gi|237735030|ref|ZP_04565511.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 137 1 137 137 230 100.0 2e-59 MPVNPYNLDMITKICNHLNTACAIITYRHNFHFEYANDLYYQLFKYKRNDGFNYSLPYSN 
DYQKVKNTITSIIQKKEEFLEIETQSFNKDNDLIWTRSRLSFIYNNQSVYIICLVENITE NKKVLKKSRNQSSKISS >gi|223714190|gb|ACDT01000025.1| GENE 94 97961 - 98755 860 264 aa, chain - ## HITS:1 COG:CAC3095 KEGG:ns NR:ns ## COG: CAC3095 COG0351 # Protein_GI_number: 15896346 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Clostridium acetobutylicum # 1 259 4 262 265 304 60.0 1e-82 MKKVLTIAGTDPTGGAGVQADLKTMTAHKVYGMSIITALVAQNTLGVRDIMEVKPDFLAE QFDCVFEDIYPDAIKIGMVSSPVLIEMIVNKLTSQKDCPIVVDPVMVSTSGSRLLADNAL RLLKEKLIPLATIITPNIPEAQVLTNLKINTKDDMITAAKMISEWYHGYILIKGGHFEER ADDLLYYRGNITWLTGEKINNPNTHGTGCTLSSAIASNLALEYSIEESVTRAKVYITGAL KANLNLGHGSGPLDHCWNINQTLK >gi|223714190|gb|ACDT01000025.1| GENE 95 98758 - 99396 521 212 aa, chain - ## HITS:1 COG:CAC3231 KEGG:ns NR:ns ## COG: CAC3231 COG0637 # Protein_GI_number: 15896477 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 2 189 5 192 215 103 37.0 3e-22 MIKGYIIDMDGTLLDSMHIWNELGSRFLELKGITPEANLKDILAPLSINQAIKYIAETYQ LKEPLDVLINEVNSLLNHIYLSEIPLKPGALEFITNCFNHHKKLCLLTANNYQATINILD KYNLTSKFDEIITCDHTTLDKRSGEAYNYAISALHLHKDECIVIEDALHAIIAAKKQGFT VWAVADQSNQDDWDEICKISDLNLKNLSEMER >gi|223714190|gb|ACDT01000025.1| GENE 96 99389 - 100009 612 206 aa, chain - ## HITS:1 COG:CAC0495 KEGG:ns NR:ns ## COG: CAC0495 COG0352 # Protein_GI_number: 15893786 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Clostridium acetobutylicum # 3 194 7 198 211 189 52.0 2e-48 MLELYLVSDRSWLNDRSLEEDIEQAILGGVTMVQLREKNLTDEEFTIQAKKVKTICSKYH IPFIINDNVAVALAVDSDGIHIGQDDQPVKRVRKIIGPHKIIGVSAHNLKEALAAKEDGA DYLGVGAMFNTSTKDDATAVSFTQLHEITTKIGLPVVAIGGINQDNCLLLKGTKIDGIAV VSAIMSAPDIKEAAAKLKAHARGIYD >gi|223714190|gb|ACDT01000025.1| GENE 97 100014 - 100838 963 274 aa, chain - ## HITS:1 COG:CAC3096 KEGG:ns NR:ns ## COG: CAC3096 COG2145 # Protein_GI_number: 15896347 # Func_class: H Coenzyme transport 
and metabolism # Function: Hydroxyethylthiazole kinase, sugar kinase family # Organism: Clostridium acetobutylicum # 1 272 1 271 273 285 58.0 7e-77 MFEKVLEEIKLRNPIVHCITNYVTVNDCANAILAVNGSPIMADDIHEVEEITTICNALVI NIGTLNERTVASMIKAGKMANHLNHPVVLDPVGAGASKLRTNTAKKLLEEINFSVIRGNI SEIKALAMNMTSTQGVDANINDIVTTENLNEVISFAKKFSQETGAVIAITGAIDIVANQE KTYVITNGCAMMSRITGTGCMLSAILGATTAIGQNDLLETTAYTIAMMGYCGELADQRVR NDNSGTSSFRMHLIDALSTINYYQLKAGVKIELH >gi|223714190|gb|ACDT01000025.1| GENE 98 101144 - 101315 96 57 aa, chain + ## HITS:1 COG:BH1470 KEGG:ns NR:ns ## COG: BH1470 COG2509 # Protein_GI_number: 15614033 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Bacillus halodurans # 1 57 1 57 480 83 61.0 9e-17 MKNNYDVVVVGAGPAGIMACYELYLKQPELNVLLIDKGQDVMKRHCPIKEKKIKSCP Prediction of potential genes in microbial genomes Time: Thu May 26 09:29:31 2011 Seq name: gi|223714189|gb|ACDT01000026.1| Coprobacillus sp. D7 cont1.26, whole genome shotgun sequence Length of sequence - 47422 bp Number of predicted genes - 48, with homology - 47 Number of transcription units - 29, operones - 14 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 157 - 216 5.0 1 1 Op 1 . + CDS 243 - 1004 695 ## gi|237735035|ref|ZP_04565516.1| predicted protein 2 1 Op 2 . + CDS 991 - 1680 484 ## gi|237735036|ref|ZP_04565517.1| predicted protein 3 2 Tu 1 . + CDS 2021 - 2707 608 ## EUBREC_1039 putative esterase + Prom 2970 - 3029 9.8 4 3 Tu 1 . + CDS 3166 - 4449 1663 ## COG2873 O-acetylhomoserine sulfhydrylase + Term 4510 - 4552 3.2 + Prom 4451 - 4510 3.5 5 4 Tu 1 . + CDS 4576 - 4821 345 ## CDR20291_0586 hypothetical protein + Prom 4896 - 4955 2.8 6 5 Tu 1 . 
+ CDS 5135 - 6619 1907 ## COG1190 Lysyl-tRNA synthetase (class II) + Term 6655 - 6696 1.9 + Prom 6732 - 6791 6.1 7 6 Op 1 4/0.000 + CDS 6817 - 7092 315 ## COG0640 Predicted transcriptional regulators + Term 7119 - 7162 2.4 + Prom 7111 - 7170 1.9 8 6 Op 2 5/0.000 + CDS 7193 - 8197 699 ## COG0701 Predicted permeases + Term 8344 - 8391 -0.8 + Prom 8271 - 8330 4.9 9 6 Op 3 . + CDS 8479 - 8814 470 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Prom 8838 - 8897 5.6 10 7 Tu 1 . + CDS 8917 - 9318 305 ## COG0394 Protein-tyrosine-phosphatase + Term 9487 - 9528 -0.8 + Prom 9619 - 9678 10.2 11 8 Tu 1 . + CDS 9920 - 11098 1410 ## COG1404 Subtilisin-like serine proteases + Prom 11100 - 11159 7.1 12 9 Op 1 3/0.000 + CDS 11201 - 12094 770 ## COG0583 Transcriptional regulator + Term 12095 - 12139 0.8 + Prom 12115 - 12174 7.3 13 9 Op 2 . + CDS 12207 - 13175 1296 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Term 13177 - 13224 2.1 + Prom 13263 - 13322 4.2 14 10 Op 1 . + CDS 13345 - 14397 886 ## COG1434 Uncharacterized conserved protein 15 10 Op 2 . + CDS 14422 - 14784 236 ## COG2315 Uncharacterized protein conserved in bacteria + Term 14933 - 14980 0.6 16 11 Tu 1 . - CDS 14965 - 15627 460 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 15734 - 15793 7.3 + Prom 15599 - 15658 10.3 17 12 Op 1 . + CDS 15761 - 17596 2300 ## COG1151 6Fe-6S prismane cluster-containing protein 18 12 Op 2 . + CDS 17607 - 18179 504 ## Cphy_1508 hypothetical protein 19 12 Op 3 . + CDS 18181 - 18669 572 ## COG3945 Uncharacterized conserved protein + Term 18683 - 18727 6.2 + Prom 18709 - 18768 13.9 20 13 Tu 1 . + CDS 18794 - 20110 1094 ## COG2199 FOG: GGDEF domain 21 14 Tu 1 . - CDS 20221 - 20940 764 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Prom 20960 - 21019 11.5 22 15 Tu 1 . - CDS 21384 - 21554 129 ## - Prom 21796 - 21855 7.7 23 16 Tu 1 . 
+ CDS 21841 - 22398 380 ## COG0406 Fructose-2,6-bisphosphatase + Term 22403 - 22446 7.2 24 17 Op 1 . - CDS 22438 - 23505 1438 ## COG0205 6-phosphofructokinase 25 17 Op 2 . - CDS 23540 - 25570 1703 ## COG0855 Polyphosphate kinase - Prom 25618 - 25677 8.2 26 18 Op 1 2/0.000 - CDS 25796 - 26563 488 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 27 18 Op 2 . - CDS 26560 - 27411 464 ## COG0846 NAD-dependent protein deacetylases, SIR2 family - Prom 27525 - 27584 6.9 + Prom 27482 - 27541 9.3 28 19 Op 1 . + CDS 27573 - 27896 390 ## COG1733 Predicted transcriptional regulators + Prom 27985 - 28044 4.4 29 19 Op 2 . + CDS 28082 - 29149 927 ## COG2357 Uncharacterized protein conserved in bacteria + Term 29307 - 29339 -0.9 + Prom 29160 - 29219 12.6 30 20 Op 1 . + CDS 29426 - 31078 1834 ## COG0366 Glycosidases 31 20 Op 2 . + CDS 31122 - 32261 1054 ## gi|237735066|ref|ZP_04565547.1| predicted protein + Term 32311 - 32354 -0.1 32 21 Op 1 3/0.000 - CDS 32402 - 32998 483 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 33 21 Op 2 . - CDS 32999 - 33625 732 ## COG0546 Predicted phosphatases 34 21 Op 3 . - CDS 33612 - 34607 724 ## COG1940 Transcriptional regulator/sugar kinase - Prom 34650 - 34709 8.2 + Prom 34662 - 34721 7.7 35 22 Tu 1 . + CDS 34780 - 35334 493 ## CDR20291_3456 hypothetical protein + Term 35337 - 35386 6.5 - Term 35324 - 35374 8.4 36 23 Op 1 . - CDS 35378 - 35770 493 ## COG1725 Predicted transcriptional regulators 37 23 Op 2 . - CDS 35767 - 37140 928 ## EUBELI_01889 hypothetical protein - Prom 37200 - 37259 9.5 + Prom 37657 - 37716 6.6 38 24 Tu 1 . + CDS 37779 - 38000 159 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains + Prom 38109 - 38168 5.5 39 25 Op 1 . + CDS 38266 - 38727 449 ## COG1396 Predicted transcriptional regulators 40 25 Op 2 . 
+ CDS 38727 - 39056 415 ## gi|167757514|ref|ZP_02429641.1| hypothetical protein CLORAM_03064 + Term 39072 - 39131 9.1 + Prom 39059 - 39118 8.4 41 26 Op 1 1/0.000 + CDS 39151 - 40113 1014 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 42 26 Op 2 2/0.000 + CDS 40100 - 40672 180 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 43 26 Op 3 4/0.000 + CDS 40666 - 41625 999 ## COG0502 Biotin synthase and related enzymes 44 26 Op 4 . + CDS 41622 - 42284 672 ## COG0132 Dethiobiotin synthetase + Term 42428 - 42469 4.5 + Prom 42299 - 42358 6.0 45 27 Tu 1 . + CDS 42513 - 43616 1087 ## COG1396 Predicted transcriptional regulators - Term 43496 - 43531 -1.0 46 28 Op 1 . - CDS 43642 - 44757 1304 ## COG2768 Uncharacterized Fe-S center protein - Prom 44778 - 44837 9.3 - Term 44877 - 44911 0.5 47 28 Op 2 . - CDS 44929 - 46197 1027 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Prom 46428 - 46487 9.0 + Prom 46279 - 46338 1.9 48 29 Tu 1 . + CDS 46363 - 47121 542 ## COG1737 Transcriptional regulators + Term 47147 - 47191 7.7 Predicted protein(s) >gi|223714189|gb|ACDT01000026.1| GENE 1 243 - 1004 695 253 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735035|ref|ZP_04565516.1| ## NR: gi|237735035|ref|ZP_04565516.1| predicted protein [Mollicutes bacterium D7] # 1 253 1 253 253 475 100.0 1e-132 MNVFVTSLLKIINNPDIQDTNYNIAYYLLLHYYDVFEMTLQEIADACYVSVSTLNRFFRI FGFKKFSIIKDLMQVHAAARISQLEERSIRKNNQQVNQLLKPLLDNDDYWTVSDQGLINQ CCEMIKKCSRVILIGSNEMMDSLLRFQGDMVMMQKLVIQNTIYNNNYIEPQADDLIILLS MTGRIAELKPTLIEHLLVGKNNLICVGYKNFLCEKALFLKIPSYLDETLENMILDNYFQN IVYCYYGGYYDNR >gi|223714189|gb|ACDT01000026.1| GENE 2 991 - 1680 484 229 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735036|ref|ZP_04565517.1| ## NR: gi|237735036|ref|ZP_04565517.1| predicted protein [Mollicutes bacterium D7] # 1 229 5 233 233 385 100.0 1e-105 MIIGKLNLILCTSSYNSNDYLLAAYLIKKSNQIESIKMVDILKDTKVSKSTLLRFLKKLG YKNYTDVQYLITKEKKHQEIFKERIIEKNLRDKLAKKQRLIVIGDSYSVSSLIVYKQKFY 
EIGIDLDIQLSLKSYGQMVVEKKFTGSDLIIVVSLYKSDLDLMAEFFSGYLELKHFIKEQ KIDHLYVGKIATGNRNLDECYRIDDNINFSEAIYRLCSLFEKIYNFYAL >gi|223714189|gb|ACDT01000026.1| GENE 3 2021 - 2707 608 228 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1039 NR:ns ## KEGG: EUBREC_1039 # Name: not_defined # Def: putative esterase # Organism: E.rectale # Pathway: not_defined # 1 169 170 338 626 67 29.0 4e-10 MDNLIADGEVADAIVVTKDNTYFNFEKGDSLKNLTECIMLYIEKNYSVSANPQDKVLAGL SSGATVTVQAMFYSNETFGYYGVFSPSRTLDFVEETLTLEMVAAFPKVSQYYVSVGIFDT FVRRNVNVQINEELIKAGANVVFEWKNGAHDWGVWRSQFSEFVKDYLWDRETVVSTPSNP DADVAPQSNSTAAVKTSDSVDMTGLLIISGLSLLGISIQYYYRHRKTH >gi|223714189|gb|ACDT01000026.1| GENE 4 3166 - 4449 1663 427 aa, chain + ## HITS:1 COG:L75975 KEGG:ns NR:ns ## COG: L75975 COG2873 # Protein_GI_number: 15672055 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Lactococcus lactis # 1 427 1 426 426 507 58.0 1e-143 MSKDNLRFETKQLHVGQEITDPTGARAVPIYQTAAYVFDNCDHAAARFGLSDAGNIYGRL TNPTEDVFEQRMAALEGGVAALAVASGAAAVTYAIESITRSGDHIVAAKQIYGGTYNLLA HTLTNYGVNTTFVDSDDPENFKLALQENTKAIFIESLGNPNSSLVDVEAVSKIAHEHGIP LIIDNTFGTPYLFRPIEHGADIVVHSATKFIGGHGTTLGGVIVDSGKFDWQASGKFPQLS KPDPSYHGVIFTEAAGAAAFVTRIRAVVLRDKGATISPFNAFLLLQGLETLSLRVERHVE NALKVVDYLSKHPKVAKVNHPSLNNSPYHELYQRYFSHGAGSIFTFEIAGDENDAKKFID NLEIFSLLANVADAKSLVIHPASTTHSQMNEAELAASGIKANTIRLSIGLEHIDDLLEDI ENAFDHV >gi|223714189|gb|ACDT01000026.1| GENE 5 4576 - 4821 345 81 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_0586 NR:ns ## KEGG: CDR20291_0586 # Name: d1 # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 81 1 81 81 110 65.0 2e-23 MSELTKIPYVGKATAASLNLIGYETIASLKGADPEEMYQKECDVRGQMVDRCQLYMYRMV VYYASSKNPDPKKLKWWIWKD >gi|223714189|gb|ACDT01000026.1| GENE 6 5135 - 6619 1907 494 aa, chain + ## HITS:1 COG:lin0260 KEGG:ns NR:ns ## COG: lin0260 COG1190 # Protein_GI_number: 16799337 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Listeria 
innocua # 5 494 8 497 498 609 61.0 1e-174 MEERKFSEQELVRREKLQELVDKGIDPFGQRFDVTAYSKEIKETYGDKTHEELEEMAVEV KVAGRIMTKRRKGKVCFMHIQDRDGQIQLYVRKDAVGEDVYEIIKKGDIGDIVGVKGTVF MTNTGECSIKVEEYTHLTKALRPLPEKYHGLSDVEERYRRRYVDLIMNEEARKIAFARPK IIRSIQHYLDNQGYVEVETPVLNPILGGASARPFVTHHNALDMNFYLRIATELGLKRLIV GGMDAVYEIGRLFRNEGMDRTHNPEFTTIEVYKAFSDLEGMMDLTEGIISNAANEVCGTY ELEYKGNKISLAPGFKRISMTDAIKEKTGIDFADQLSFDDAKKLAEEHHIEVEPHFGYGH IINEFFEKYVEETIIEPTFVYGHPIEISPLAKKNLDDPRFTQRFELFICGNEYANAFTEL NDPIDQYERFANQLKEKELGNDEANEMDLDYVEALEYGLPPTGGMGMGIDRLVMLLTGQE SIREVILFPHMKNK >gi|223714189|gb|ACDT01000026.1| GENE 7 6817 - 7092 315 91 aa, chain + ## HITS:1 COG:pli0034 KEGG:ns NR:ns ## COG: pli0034 COG0640 # Protein_GI_number: 18450316 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Listeria innocua # 9 89 7 87 97 89 49.0 1e-18 MEKQFNQDAKVFKAFCDPNRLRILDILKSGEHCACKLLDILDVSQSTLSHHMKLLTDAKI VNVRKDGKWSHYSLSKDGIQFAIEYLQQLKV >gi|223714189|gb|ACDT01000026.1| GENE 8 7193 - 8197 699 334 aa, chain + ## HITS:1 COG:MTH894 KEGG:ns NR:ns ## COG: MTH894 COG0701 # Protein_GI_number: 15678914 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Methanothermobacter thermautotrophicus # 29 328 19 325 327 251 43.0 1e-66 MSQILLFFQDQILGMRWLNELIGSILTSIGLNIDEKIGGMLHFFIYDSIKIFILLSVLIY IISYIQSYFPPEKTKKILGRFYGIWANILGALLGTVTPFCSCSSIPIFMGFTSAGLPLGV TFSFLISSPMVDLGALILLMSIFGWKVAVLYVIAGLIISVVGGRLIEKLGMDDQVADFIK QTDNVDVGFVSLTVKDRRNFAKEQVIETVKKVSIYIFVGVATGSVIHNVIPEHWIQYILG DHNFYSVPLATVVGVPMYADIFGTIPIAESLLLKGAGLGTVLSFMMAVTTLSLPSMIMLS KAVKRKLMIVFVAIVTVGIISVGFMFNIFSFLLM >gi|223714189|gb|ACDT01000026.1| GENE 9 8479 - 8814 470 111 aa, chain + ## HITS:1 COG:asl1510 KEGG:ns NR:ns ## COG: asl1510 COG0526 # Protein_GI_number: 17229003 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Nostoc sp. 
PCC 7120 # 33 106 5 78 80 69 47.0 1e-12 MKLFKKKEKNCCNVQCGDGSMKRGLEKKDKGSRIKVLGSGCAKCATLEKNVKEALDELGQ EMSIEHITDFTEIASYGIMSTPALVIDEKVVSFGKVLTIEEVKNIILTIDG >gi|223714189|gb|ACDT01000026.1| GENE 10 8917 - 9318 305 133 aa, chain + ## HITS:1 COG:CAP0105 KEGG:ns NR:ns ## COG: CAP0105 COG0394 # Protein_GI_number: 15004808 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Clostridium acetobutylicum # 2 128 3 129 136 168 62.0 3e-42 MPRVAFICVHNSCRSQIAEAIAKLFASDVFESYSAGTEVKPQINQDAVRIMKKRYDMDME KTQHPKLLVDIPEVDIVITMGCNVNCPFLPCRYKEDWNLDDPTGKSDEEFGRVIDNIFAN INELSTRISKMNI >gi|223714189|gb|ACDT01000026.1| GENE 11 9920 - 11098 1410 392 aa, chain + ## HITS:1 COG:aq_1950 KEGG:ns NR:ns ## COG: aq_1950 COG1404 # Protein_GI_number: 15606951 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Aquifex aeolicus # 98 372 72 392 560 171 35.0 3e-42 MNNNYELGKVIVALNINITSKDVLDYYIDLVLDGIDYEKAEIIFHSKNTESLNEGTNIVL IYLKDKNPHAVTNAVAKLSSSPYIVYAEPDYIEEMHIISNDPLYNQLWGIQKINAPLAWD YTTGDSSISVGVIDTGIDQNHPDIRENMWTTWNGRLIYGWNFADNSSDSMDLDGHGSHVA GTIGAVGNNRIGITGVCWQVRVAALKFGLDVASAIAAIDFANYYKISILNASWGGRAYSQ ALKDAIDQYDGLFVASAGNDGTNNDVDPMYPASYDCKNIISVAAVDPYDTLARFSNYGLK TVDIAAPGTNILSLDLAGEYSPLNGTSMAAPHVAGAAALLKSSMPNISTITLKRIILSSA MENPELKGKILTGGILDMETMFKLANCWHGRE >gi|223714189|gb|ACDT01000026.1| GENE 12 11201 - 12094 770 297 aa, chain + ## HITS:1 COG:VCA0082 KEGG:ns NR:ns ## COG: VCA0082 COG0583 # Protein_GI_number: 15600853 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Vibrio cholerae # 7 141 8 140 295 76 34.0 5e-14 MYNPQLETFIKVVDRGSFSKAAEDLFISPSAVIKQINILETNLGLKLLKRTHQGVSLTPA GASLYRDAKYLIKYAKDSIVRAHKAMEKEEMVIRIGTSNTTPSQFLINLWPRIHELCPNL RIQLIPFENTPENAREILANLGDNIDLVAGVFEKDNFKQRGCDSLEIKKESICLALSIYD ELSNKDKLSLQDIHGRELMLIKRGWNKYVDEMRDDLIKNDPTIIIKEFEFFDISVFNECE NNKSLLMSVKSWKNIHPLIKTIPVEWDYYMPYGLLFSLEPSRQVIEFVNAVKRVLDI >gi|223714189|gb|ACDT01000026.1| GENE 13 
12207 - 13175 1296 322 aa, chain + ## HITS:1 COG:YPO2806 KEGG:ns NR:ns ## COG: YPO2806 COG0667 # Protein_GI_number: 16123004 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Yersinia pestis # 1 317 1 314 329 320 51.0 3e-87 MRKTRILGQGLEVSAIGLGCMGMDHAYGAPADREEMIKLIRHAVTLGCNFFDTAVVYGEA NEVLLGKALEIFPRDEVIIATKFGIYGQEIVDGKPQNILNSKPDSIREQLEGSLKRLGVD YIDLYYQHRVDPEVEPEIVASVMKELIAKGKIKHWGLSNAPLDYLKRAHAVCPVTCIENQ YSMVFRQPEKEVFKVCEELNVGFVAYSPLGNGFLSGKYTPATKYAEGDFRNNMGRFNPEV MKRNQALLDLVQEIAERKNATSAQIVLAWEINQKDWIVPIPGTTKIHRLEENLGAMEVEL TEQEMAAINQALDNLDIDETHF >gi|223714189|gb|ACDT01000026.1| GENE 14 13345 - 14397 886 350 aa, chain + ## HITS:1 COG:lin1003 KEGG:ns NR:ns ## COG: lin1003 COG1434 # Protein_GI_number: 16800072 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 80 344 76 339 344 153 38.0 4e-37 MPYILILVILTLLWLGAFWRIKRKQNINLLAGFLFFMFLSTFASFLVMLGINYSNRLVYL LVGVIFLFALFVLLFGVYILIAFLLINSVIILRKERFSLSHILTLLVAVGIILLITLSSF LGRINPPLPIMALWSGLIVMILFLTFHIFLFIETLILTNLFHPRKNLNYIIVLGSGLING NVSPLLAKRIQAALKFAKKQEKKKGFAPYLLMSGGQGSDETRAEALAMKEYAIEQGYSEE LIITEEKSKNTLENMQFSKAVMEEKSHGENYKCAYASSSYHLMRAGIYARQVGLIMSGLG GKTAFYYLANAVLREYIAYLAMNKKLYLVVLGSIFMFGTLMYILFSALIG >gi|223714189|gb|ACDT01000026.1| GENE 15 14422 - 14784 236 120 aa, chain + ## HITS:1 COG:DR2400 KEGG:ns NR:ns ## COG: DR2400 COG2315 # Protein_GI_number: 15807390 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 42 117 49 126 132 59 35.0 1e-09 MKYEWIDSYLLAKAGISKNLQTEWNWIRYVIDGKMCVAVCLDEQDNPYYITLKLRPENGM DLQEQHEDIIPGYYMNKKHWNSIKVNGSISDQLLKDMLDEAYELVLSSFSKKRQLEILSL >gi|223714189|gb|ACDT01000026.1| GENE 16 14965 - 15627 460 220 aa, chain - ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - 
catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 2 217 4 215 229 98 29.0 9e-21 MKKLQILKQCSLFKGIDPAEIEKLLNCLKSHEKTYQKNETIIRQGDIIHEIGLVLDGQAH IYHIDFWGNKTIISEIKASEFFLESYACALQQEIDLNVVATEITTILFLDISHLIRSCQH SCSFHHKIIQNLLSIVSNKNVLLTQKIHHLTRRTTQEKILSYLNSQALKNNSLTFEIPFN RQQLADYLAVERSALSSELSKLQKNGIIAYKKNTFQLKKS >gi|223714189|gb|ACDT01000026.1| GENE 17 15761 - 17596 2300 611 aa, chain + ## HITS:1 COG:CAC2750 KEGG:ns NR:ns ## COG: CAC2750 COG1151 # Protein_GI_number: 15896007 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 5 543 3 527 530 691 62.0 0 MENKMFCYQCQETVGNQGCSQVGVCGKTPETAGLQDLLIDVTKGLAEVINEVRCKGKEIN NVYDDQVSENLFITITNANFDDQMIIEAVRKTLTLKKELLLQLDDQAVLSNRALWMEMEI EAIEKRMQMVGVLETADEDIRSLKQLITYGLKGMAAYNKHAHALGFRKDEIDRFIETTLV KIETSTMSLNDYVALTMETGKYGVQVMELLDQANTSTYGNPEITEVNIGVRNNPGILVSG HDLKDLEMLLEQTKDTGIDVYTHSEMLPAHYYPFFKKYPNFVGNYGNAWWKQREEFKAFN GPVLLTTNCLVPPLASYQERVYTTGAVGFEGCVHIDKDEHGYKDFSQIIEHAKKCLAPTQ IETGKIVGGFAHHQVLALADQIVTAVKNGDIKKFVVMAGCDGRHPTRQYYTDFAQSLPTD SVILTAGCAKYRYNKLDLGTINGIPRVLDAGQCNDSYSLAMIALKLKEVFALEDINELPI IYNIAWYEQKAVIVLLALLSLGVKNIHLGPTLPAFLSPNIVDFLVDNFQISGIQSVQEDL NLFFPQTKKEDKFHRDMLVGSIIGMDPQAAQILSDSGMGCLGCPASQSETLADACLVHGL DVEEILKQLNQ >gi|223714189|gb|ACDT01000026.1| GENE 18 17607 - 18179 504 190 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1508 NR:ns ## KEGG: Cphy_1508 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 150 1 162 207 90 32.0 3e-17 MSAFLGPVHYWLYRKIQYQEQLNQKILKRICPQLNEIVAQECGTIQDGSLEEIIDHQAIH QWLSMELMIVEKRFAFIVEHIEKSDFEEVREVLFEAGKEISINENYHNCIELFKVINNYL IDGMPCDKGIKIMSQEENQIIYEYNEMVHQYLDFEIFQKYRKAWLDGVLSDSHIVFSRLN YNTYMLKMEE >gi|223714189|gb|ACDT01000026.1| GENE 19 18181 - 18669 572 162 aa, chain + ## HITS:1 COG:CAC0760 KEGG:ns NR:ns ## COG: CAC0760 COG3945 # Protein_GI_number: 15894047 # Func_class: S Function unknown # 
Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 12 158 10 155 184 100 41.0 1e-21 MTNNIIKDLEADHQKILAFVNELEQRLIEFMEANIFDYEQMKRDILFIREFADKEHHQRE EKILFRYMIEYLGKVAENLVRHGMIVEHDLARLYVKQWDEALQRYHRNHLLIDKLTIISN GHAYCMLLRRHIEKEDTVVYPFAKKHLSEEIFAEMIKENQNY >gi|223714189|gb|ACDT01000026.1| GENE 20 18794 - 20110 1094 438 aa, chain + ## HITS:1 COG:aq_429 KEGG:ns NR:ns ## COG: aq_429 COG2199 # Protein_GI_number: 15605926 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 273 437 76 238 246 105 38.0 1e-22 MIKKHKPFIAAILIFMIMSILSFFILYGVINDKVAAAKEKIHYFTEAQVSQLDRVLSSYI QTAETLKLLIVDSNGNINDFDRVAHQLYNDDNTFRSIQLAPQGDVQYVYPLEGNEGAFGN MFEDPERATEAKKARDTGETTLAGPFELYQSGKGIVVRQPIYLEENGENNFWGFAIVVLN VPEIFDSVHLDSFENMGYEYQLWRIDPDNNKKQIILKSDHELLDDTINLSFEVPGNTWTL SVSPINGWVRSSDLMPVIILAICLSLLIPLLVYTLLKINEQRRRMIDISNRDYLTTLYNG RKMNFVLNDLIARQTSFIFVYLDVDKFKVVNDTYGHLAGDVLLKEIALRIMRYLSREDYA FRVGGDEFVIIIKNNSSTKKTLQIIAEQITEKINLDGFEYYPEVSMGYAVYPNDGKSLEE VIKCADSRMYEIKKHKNI >gi|223714189|gb|ACDT01000026.1| GENE 21 20221 - 20940 764 239 aa, chain - ## HITS:1 COG:lin2869 KEGG:ns NR:ns ## COG: lin2869 COG0363 # Protein_GI_number: 16801929 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Listeria innocua # 1 229 1 235 239 192 40.0 5e-49 MKLIIEENEQKMSESAMFILLGAMMQDKRVNISLTSGRSPKTMYDMMIPYVKNQAKFKDI EYYLFDENPYIDEPYGPNWKDMQELFFKAANIPDERIHIMTSNDWQDYDNKIRNAGGIDV MVIGLGYDGHFCGNCPRCTPFDSYTYCIDFKKKQAVNPDYGDRPRQPHTLTMGPKSLMRV KHLVMIVNGKEKAEIFKRFLDEPVNQDVPATILKLHPNFTVICDQAAASLIDPQQYSNL >gi|223714189|gb|ACDT01000026.1| GENE 22 21384 - 21554 129 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCFPYFTLISMLTNLDETSDIHNFSSTVKVIEFVGIDPGTYQSISRTAQSKRGSRY >gi|223714189|gb|ACDT01000026.1| GENE 23 21841 - 22398 380 185 aa, chain + ## HITS:1 COG:lin1208 KEGG:ns NR:ns ## COG: lin1208 COG0406 # Protein_GI_number: 16800277 # Func_class: G Carbohydrate 
transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Listeria innocua # 1 179 8 186 199 210 50.0 2e-54 MRHGQTLFNQKRLMQGWCDSPLTELGKRQARCVKEYLENNEIVFDGAYASSSERACDTLE IVTSLPYQRLKGLKEWNFGIYEGESDDLDPPFPFGNFFKKFGGESENEVSERVSTTVLEL MKNTDQETVLIVSHGLSCYCFAKRWEQNQQVSLPNDMPNCVVLKYQFNDEQFSLVEVIDV TSELK >gi|223714189|gb|ACDT01000026.1| GENE 24 22438 - 23505 1438 355 aa, chain - ## HITS:1 COG:Cgl1221 KEGG:ns NR:ns ## COG: Cgl1221 COG0205 # Protein_GI_number: 19552471 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Corynebacterium glutamicum # 1 336 4 333 346 241 38.0 2e-63 MRVGILTSGGDCQALNVTMRGLAKTLYRNVNDIEIIGFLQGYKGLMYEDYKIMKPKDFSG IINIGGTILGTSRCPFKKMRVIEDDFDKVAAMKNTYAKLKLDCLVVLGGNGSIKSANMLS QEGLNVIALPKTIDNDTWGTDYTFGYQSAIDIATTYLDQIHTTAASHNRVFVIEVMGHKV GHICLSAGIASGADVILLPEIPYDIKEVAKAIKTREKNGKKFSIIACAEGALSKEEATLS KKEYKAKVKARNGQSVVYEIAAELEKYIDSEIRVSCIGHAQRGGQPCPYDRMISTQFGVA GARLVMGGDYGKLVILKNNGVTAIPLVESAGKLKYVDAVGAKVKDAKLLGISFGD >gi|223714189|gb|ACDT01000026.1| GENE 25 23540 - 25570 1703 676 aa, chain - ## HITS:1 COG:BH1392 KEGG:ns NR:ns ## COG: BH1392 COG0855 # Protein_GI_number: 15613955 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Bacillus halodurans # 7 675 22 693 705 480 40.0 1e-135 MNKCKENRELSWLKFNDRVLMQAKDNKVPLGEKLSFISIFQSNLDEFFMVRIGSLYDQML FYPSSKDNKTGMTGEEQLKACLRRITYLNKKKDRIYQSIMDELKTHGWGIVKYRDLKNKE DRKYFESYFEREILPLISPQVISKRQPFPFLNNREVYVVVQLESKKGKRKMGIVSCANAM DERMIAIPSMQGKFILIEDIILHFISNIFSKYTIKNKGFIRVTRSADIDEDDHSLEGHED YREMMETLIKQRRKLCPIRLEVSPGLEELEILMLMNFLNLKKHQVFECNAPLDLKFTSEL RDHLRYIHPEMFYKKLEAKNSPLVENTVPMIKQILKKDILLSYPYESMSPFLRLLDEASR DTNVASIKMTLYRVAKKSKIIKSLIEAAENGKEVVVLVELRARFDEENNIDWSKRLEESG CRVIYGLDGLKVHSKLCLITYKNNQGVHYISQVGTGNYNENTSKLYTDLSLMTSNYEIGE EINSIFNHLCLGETENEVNLLMVAPNCMITKIFEHIDNQIALAKAGKEAYIGFKCNSVTS KEMIDKLIEASQAGVKIDMIVRGICCILPGVEGLTDNIRIISIVGRYLEHSRIYIFGKGS AQKVYISSADLMTRNLSRRVEVAAPILDEKLRKRIITMFNTMLKDNVKACRLLEDGTYKR 
VKNKEPELNSQEYFFK >gi|223714189|gb|ACDT01000026.1| GENE 26 25796 - 26563 488 255 aa, chain - ## HITS:1 COG:SA0314 KEGG:ns NR:ns ## COG: SA0314 COG2110 # Protein_GI_number: 15926027 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Staphylococcus aureus N315 # 6 255 10 266 266 214 43.0 1e-55 MNQITRLQFLINYLIQENKLEIELPTEQAQLFSLYRSLVNIREAKTATDKFIMIQNKMLK NEIKRKGIIDSSNFKKSMNIWRGDITQLKVDAIVNAANNQMEGCFIPGHNCIDNAIHTFA GVQLRNECHQIMSKQRYLEPTGKAKITNAYNLPCRYIVHTVGPIVHNVLTDEKRFLLAEC YRNCLKKAEVYGLKSIAFCCISTGVFNFPKKEAAQIAVNTVSTFLKGSQIEKVIFNVFKE DDEMIYQQLLNNKSK >gi|223714189|gb|ACDT01000026.1| GENE 27 26560 - 27411 464 283 aa, chain - ## HITS:1 COG:SPy1215 KEGG:ns NR:ns ## COG: SPy1215 COG0846 # Protein_GI_number: 15675179 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Streptococcus pyogenes M1 GAS # 11 282 17 289 293 233 39.0 3e-61 MFQRESLDILKLSQAIKKCDAILIGAGAGLSSAAGLSYSGNRFKKYFNDFITKYHLQDMY SAGFYPYPSLEKYWAYWSRHIYYNRYLEAPKDTYKKLLNLLINKDYFVITTNVDHQFQIA GFIKEKLFYTQGDYGLWQCSKPCHQKTYDNYETVVTMIKQQHDLKIPSSLIPYCPICNAP MTMNLRCDKTFVQDSGWYHAQERYYHFLNKYHYSKIVYLELGVGYNTPGIIKYPFWQLTI ENPKAIYACINQDIIDLPSELKKQTIMINDNIHNVLSSLEKLL >gi|223714189|gb|ACDT01000026.1| GENE 28 27573 - 27896 390 107 aa, chain + ## HITS:1 COG:FN0589 KEGG:ns NR:ns ## COG: FN0589 COG1733 # Protein_GI_number: 19703924 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 2 97 4 99 107 139 75.0 2e-33 MKEFPKCPVETTLLLISDKWKVLIIRDLLPGKKRFGELKKSIGTISQKVLTSNLRAMEES GLISRKVYPEVPPKVEYSLTELGYSLKPILDAMYSWGEDFKKNHQSI >gi|223714189|gb|ACDT01000026.1| GENE 29 28082 - 29149 927 355 aa, chain + ## HITS:1 COG:DR1631 KEGG:ns NR:ns ## COG: DR1631 COG2357 # Protein_GI_number: 15806636 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 6 195 26 220 394 
171 44.0 3e-42 MDNEDLVREYSNNLTTYEQFTNLMETYICNLLNREQISFHSITSRTKSIESLSKKIELKN KYQKLDEITDLSGIRVITYYTDTVDQISKLIENEFIIDRDNSIDKRKSLDPKRFGYRSLH YVVQIDPKYVKAQEYLKYHNLKLEIQISSILQYTWAEIEHDLGYKSQEEIPYDIKRSFSR LAGLLELADEEFLRIKNEIFAYQNHLLLSYLSADIDKESLAVFKVKSPEYNELTEFLIKE MKAKKVSQGNFENIIRMFEYLNISKLQEVDQKLKQYQQLIMEHIDVLYRDATRTKTNIIS GDMPLLCLCYFILVVEKGEVEFEKFTEQFISSVNFKNRLIELKRILKDYKLEAAD >gi|223714189|gb|ACDT01000026.1| GENE 30 29426 - 31078 1834 550 aa, chain + ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 4 547 5 555 561 563 51.0 1e-160 MKNWWQKSVVYEIYVRSFKDTNHDGIGDINGITEQLDYLATLGVDVLWITPIYESPNDDN GYDISDYYEIMQCFGNMTDFENLLTQAHKRNIKIVMDLVVNHTSDEHRWFIESKKSKDNP YRDYYIWKDGKEDGSVPNNWTSCFLGSAWQYDETTKQYYLHLFSKKQPDLNWDNTDVRNE IQKMIAWWLDKGIDGFRMDVINLISKDQDNIYQDSPIKGHSVSANGPRVHEYIRELTDNV FSKYDVMTVGEAPAVTTKEAIQYASNDGKEMSMVFQFELMDVDGGEKQKWSDERFKLTDV KAILRKWQIDMHGRAWNSLFWNNHDQPRVVSRFGDTSSKETWEKSAKMLATAQYFLQGTP YIYQGEELGMTNVEFKNIGELRDIESINSYHQYVEVEKMFTPEEMLRFINKSSRDHARTP MQWNDLENAGFSDAEPWIKVNQNYKWLNAQAQITDENSIFNYYKKMISILKENEVVQFGD YQEYYEDSNEVYVYKREYDGKKLFVLSNFTAKEVTYDKTLFEAEAKVLLGNYNDLIRGKL RAYEAVVLVV >gi|223714189|gb|ACDT01000026.1| GENE 31 31122 - 32261 1054 379 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735066|ref|ZP_04565547.1| ## NR: gi|237735066|ref|ZP_04565547.1| predicted protein [Mollicutes bacterium D7] # 1 379 1 379 379 735 100.0 0 MPKKYFKIGCTVIIVTLIGLFLKNNAPANNYKNPDTLNIQDKQSMMMADKAVEKIARGTS GKFETLYLGADGQFQWNILNPEETSFFDEDGKAIKSILLISTKLQTSTPFDQESNRYETS TLRQAEIALASQTLSPKEAEVIFNTTQPGGDVVTVTSYSFIESQLFGDKFFAPTAKIMNS NSCGLEDRHYRQEGNYYWLSAAAKDDPKLAGYISKCGVFAQLDITVEHLSARATANLDPH AVLFFAAAQGGKVSGVVGSKAMQDVEHQELLSDYRLTLKDASQTLVVSDQVKENNKISFN YNAQGDGDLLSAVIIDKTQNEIVSYGQLADISVTKMGVALIELPKKYDHDNYQIKVFSEQ YNGDYKTDYASELVDLKIE >gi|223714189|gb|ACDT01000026.1| GENE 32 32402 - 32998 483 198 aa, chain - ## HITS:1 COG:all1011 KEGG:ns NR:ns ## COG: 
all1011 COG0110 # Protein_GI_number: 17228506 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Nostoc sp. PCC 7120 # 2 188 10 187 192 203 54.0 2e-52 MTEKEKMLAGELYDCGDAELLALWHRAKDLARDYNLTNSSDTKRKSELLNELLGKVGNQL WITPPFHVDYGCNIYFGNNCEVNMNCTFLDDNKIIIGDNVLIAPNVQIYTAYHPTHYLDR FTISENETFNFCKTQTAPVIIGKNVWIGGGTIILPGVTIGDNTVIGAGSVVTKDIPADTI AYGNPCKVHKANERSKSI >gi|223714189|gb|ACDT01000026.1| GENE 33 32999 - 33625 732 208 aa, chain - ## HITS:1 COG:lin2878 KEGG:ns NR:ns ## COG: lin2878 COG0546 # Protein_GI_number: 16801938 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Listeria innocua # 4 208 2 202 203 117 31.0 2e-26 MKKYKAIIYDLDGTVLNTINMNMYPLIKIIKEETGEDWTFEQVLKFLPYPGMKVMEELQV ADKEKAYARWVKYVNEYEEGASLYPGFEEIFEAFDGSIIQAVASAKTTAQYQIDFVAKGL DKYMKTAVLANDTVKHKPDPEPLLECLKRLSLQPEDVIYIGDAHSDYLASKNAGIDFGYA KWGSVSAKGIDKPDHVFEQPLDLLKLLS >gi|223714189|gb|ACDT01000026.1| GENE 34 33612 - 34607 724 331 aa, chain - ## HITS:1 COG:BH1094 KEGG:ns NR:ns ## COG: BH1094 COG1940 # Protein_GI_number: 15613657 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Bacillus halodurans # 9 224 10 242 407 65 25.0 2e-10 MTKKILKDVKKHHLHLLLQSLFSRDSATMSELSNDTQLSQPSVRNMIRLLQQHDIVKETG NDHSTGGRCPTRFALVEKHFNIICLYIQNPKITYQIHSYKKILTTDILEFEDEQQLITLI KVLLTKFNCQCIVIAVEGIVRDNEYLTDHNNHFETHTWITELEKSINIPILVENDVKAMQ LGTCYHHHKNSSIYLHLNKKGIGSSYMYNGQLVHGKYGIAGEIGLIPYHDISLNQAVRTC NKPKDFQNIIAYLLIIVLTSLDPEHVDLSLELDWDLDWHLINETIYTYLHSQFIYDIKVY KQYLDNLFHGLNYLGIEKILKQLVEESNEKI >gi|223714189|gb|ACDT01000026.1| GENE 35 34780 - 35334 493 184 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_3456 NR:ns ## KEGG: CDR20291_3456 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 181 1 179 180 180 58.0 2e-44 MKGFKRNNQLLSLCGLNCGLCPMHIDGYCPGCGGGAGNQSCRIARCSLEHDNLTYCYECS 
SYPCDKYMKFAKYDSFITHQHYHQDLMKAKDGGIDTYNQEQLEKIDILKIFLEKYNDGRK KSFYCLAVNLLELEELKSLIKQIETNSKLSALTKKEQAKYVFDLFQDLAKRHSLILKLNK KSKK >gi|223714189|gb|ACDT01000026.1| GENE 36 35378 - 35770 493 130 aa, chain - ## HITS:1 COG:CAC0599 KEGG:ns NR:ns ## COG: CAC0599 COG1725 # Protein_GI_number: 15893888 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 118 1 119 125 106 44.0 1e-23 MIINLDYTSEIPIYLQLRNEIVKGIANGQFKYGESLPTVRTLAQDLQVNNMTVNKAYGLL KQEGYICIDRRHGAKVQPTIDNSHEHQEKLLDELELLASEAIVKGMNKDEFINTCKALLQ TIQYDASIVQ >gi|223714189|gb|ACDT01000026.1| GENE 37 35767 - 37140 928 457 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01889 NR:ns ## KEGG: EUBELI_01889 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 440 8 454 467 90 24.0 2e-16 MWILILCLVFINVVMYFSSKRESRYNEGRLFLVTIPDWAYESVEIKEIEKQFKKEHLWVF IISFITFIPLFIFSTKWLLLYFMIVIWFNVSVYYVPYRKARAKLLALKKLRNWPDEKIEK IKIDLSLSAYMEKHPFNLRRYFFVLIIDLSVLGNMIYFHAENAMYLYMVLQFMVLVLGIV FIKKLPNKTFCKNSEVNITLNLLRRDSFHHCFFFLITGDSIFNLALQFFLLEKLPFVILF LVALIMILCVIIIVIKANHYREKKAKILAHYNECEYTISNDDCWKIGWFGPTYYNKADPR TLISAPNGTQMTFNTAKPAYRIFIIGIWTFVIALLLWLFGYPYYLDITNNLVNLSLTDQA VVVDSPFYDVSIDLQKVNKAELADDLGKGIRTNGTDTFVYGKGNYTFDRYGKCKVYMASL HPCYIILYTDDITYIVNDDDIQNTKLIYQEIQEVLSQ >gi|223714189|gb|ACDT01000026.1| GENE 38 37779 - 38000 159 73 aa, chain + ## HITS:1 COG:lin0034 KEGG:ns NR:ns ## COG: lin0034 COG0449 # Protein_GI_number: 16799113 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Listeria innocua # 2 65 231 294 361 81 64.0 3e-16 METKFSETIRVASQGYKLEAYMHGPYLEVNPEHRIFFIETPSPRLSKAKLLKAYESQYTE HVFAITKAQSIRP >gi|223714189|gb|ACDT01000026.1| GENE 39 38266 - 38727 449 153 aa, chain + ## HITS:1 COG:SA2495 KEGG:ns NR:ns ## COG: SA2495 COG1396 # Protein_GI_number: 15928290 # Func_class: K Transcription # Function: Predicted transcriptional 
regulators # Organism: Staphylococcus aureus N315 # 2 83 3 85 189 70 39.0 8e-13 MLGEKLMRLRKKQGYSQQEVADKLSVSRQTISNWECDQALPAVDKAMELAQLYNISLDDL MENEIEIVSNNKTKDLHLLQYLIGKTCTLECTRDAYLLDISTSDGKVLIVDVNDDWVKVQ YHRTKKGSFIKKETVTKLIDLSCISGFKVEGEL >gi|223714189|gb|ACDT01000026.1| GENE 40 38727 - 39056 415 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757514|ref|ZP_02429641.1| ## NR: gi|167757514|ref|ZP_02429641.1| hypothetical protein CLORAM_03064 [Clostridium ramosum DSM 1402] # 1 109 1 109 109 154 100.0 1e-36 MTTIIAIIALVISICAIYEIEKLKKNDVKNIEKEIKETDRIKNILAGLKQCDCELIVKET MFLIDVPYSVEGKIIDLDDDWVLIVSMGRKTISRMIRITMIKDVIELTK >gi|223714189|gb|ACDT01000026.1| GENE 41 39151 - 40113 1014 320 aa, chain + ## HITS:1 COG:SP1900_2 KEGG:ns NR:ns ## COG: SP1900_2 COG0340 # Protein_GI_number: 15901727 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Streptococcus pneumoniae TIGR4 # 89 312 26 248 252 141 38.0 2e-33 MSVKQNVIALLEENRSKVISGQELANQLHVSRAAIWKAIKTLKEEGYNIEATPNKGYVLL ENSDVLSKQGIAYYLTEEIDIFSYKTIDSTNTQMKKLAINGGKNHSVIVSEEQSAGRGRF GRSFYSPAQKGVYMSVLLKTGDSLQNATMITIKTAVAVRRAIAKLYDIEVAIKWVNDLYY RGKKVCGILSEAISDFESGMIEAIIIGIGINVSTDNFPLEIASIATSLGLQEANRNQFIA EILNQLFAIIDEDFKLVLNEYRMASCVLHKQITFNQKGEQFTGLVREINDLGNLVVSSNG AEMVLTAGEVSIIGGNHGAE >gi|223714189|gb|ACDT01000026.1| GENE 42 40100 - 40672 180 190 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 9 188 5 188 190 73 29 2e-12 MEQNKLLTTKKITYCAIFTALITIGAFIQIPVPFMDYFTLQFFFVLLAGILLGSKLGALA VLLYVVIGLLGLPIFAAGGGLAYIVRPSFGYLIGFIAGAYVTGIICEKTNEIAAKKYVLA VLSGLLATYMIGLGYKYIILNYYTGTPITWKLVLLSCFPLDLPGDLFLSFLAVGTGIRFE KIFKKRRIQC >gi|223714189|gb|ACDT01000026.1| GENE 43 40666 - 41625 999 319 aa, chain + ## HITS:1 COG:FN1000 KEGG:ns NR:ns ## COG: FN1000 COG0502 # Protein_GI_number: 19704335 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Fusobacterium 
nucleatum # 13 316 48 360 360 271 46.0 1e-72 MLEKLIAQIMAGYEISKAEAKMLIDYDLGTLKQGAELIRRSYCGSTFDLCSIINGKSGRC SENCKYCAQSVYYQTGIDIYNLLKISEIKAIALHNQKQGVERFSIVTSGRKLNVQEFNKI LKIYQELNDTTTISLCASLGLLNFEELKKLKETGVKRYHNNLETSRNYFNNVCTTHSYEQ KIKTIKAAKLAGLEVCSGGIVGMGESWLDRLDLAFELKDLGIKSVPINVLNPIRGTPLEH LKPLSKEDVERIFAIFRFILPDATIRMAGGRGLFDDKGINVFKSGANGAITGDMLTTQGI TIAEDQRIIKQLGFRVAKP >gi|223714189|gb|ACDT01000026.1| GENE 44 41622 - 42284 672 220 aa, chain + ## HITS:1 COG:CAC1361 KEGG:ns NR:ns ## COG: CAC1361 COG0132 # Protein_GI_number: 15894640 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Clostridium acetobutylicum # 2 206 1 211 240 142 39.0 3e-34 MIKGLFVTGTGTDVGKTYVSARIVKALKSQYKVGYYKAALSGAVVENDVLIPGDLEVVKQ YASLPNESCKVSFVYEEAFAPHLAAKKTNTPVDLNTIKADLTDLENKHDFVVIEGSGGII CPLRDDKLIMLSDVMLLANYPLIIVTSSGLGSINGAVLTAQYAKQLNLNVLGFIMNNFDS NNLLHQDNRIMIERLSSYKVLGYLNKKADKIEWFEQIIES >gi|223714189|gb|ACDT01000026.1| GENE 45 42513 - 43616 1087 367 aa, chain + ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 76 1 76 195 60 34.0 3e-09 MSIGKKLLSLRQEKGISQEALGRELNVSRQTVSKWESDLSLPDMKMMITISQFYEISITQ LLDLDDETEADSINKIYEQTSLVLENLQKENKKKIIRDWIIIIICTLCLCVALAILVKKG TNEVVFYPNNSTTPITYNNIIDYSNTTFKVISYDFDKMTMSINYKCVLNNYTDITQVNIC IVDVFNNQSIYPMTKENAYTFVLTETIPLVNYANFEILVKNGDKGEKINADSEIEHHLDN LLSHLIYIYIPGDKNYKLIRNNMIYKLDYSYLENEKIDYSGTLSNKVEIIINKIKNNNYE KIGQTNTTLDKQKKIKLSKDLSNGSEINVIVNIITAEQSYELVNINRYVQTGGISYSEYP IYRYGDI >gi|223714189|gb|ACDT01000026.1| GENE 46 43642 - 44757 1304 371 aa, chain - ## HITS:1 COG:TM0034 KEGG:ns NR:ns ## COG: TM0034 COG2768 # Protein_GI_number: 15642809 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S center protein # Organism: Thermotoga maritima # 8 370 3 352 357 351 51.0 1e-96 MEVIMEKAKVYYCDMHTGRMNLPEKLKFLMKKAGFEEIDFKNKYTAIKVHFGEPGNLAFL 
RPNYAKAVADYVKELGGKAFVTDCNTLYVGGRKNALDHLDSAYSNGYNPFQTGVHTIIAD GLKGTDEEIVPINGEYVKEAKIGQAIMDADIIISLNHFKGHELTGFGGALKNLGMGCGSR AGKMEMHSAGKPVVEQDKCIGCGQCIKICAHNGTSITDHKASIDHDKCVGCGRCIGVCPK DAIVASMDEANDILNYKIAEYTKAVVQDRPCFHISLVIDVSPYCDCHSENDIAIVPDVGM FASFDPVALDMACADAVNKQPAIANSLLDKHGHHHHDHFTDVSPETNWKSCLEHGEKIGI GTREYELIEIK >gi|223714189|gb|ACDT01000026.1| GENE 47 44929 - 46197 1027 422 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 4 421 2 433 447 400 48 1e-110 MSNKRIIQVEDKVPGKLLIPLSLQHMFAMFGASVLVPFLFGINPAIVLFMNGIGTLLFIV ITKGKAPAYLGSSFAFIAPANIVISKFGYPYALGGFVAVGFCGCLLALIIKKCGTKWIDI VLPPAAMGPVVALIGLELSATAANNAGLIGDNIQMANVWVFLITLGVAIFGNICFRKFLS VIPILIAIICGYIAAICFGLVDFTTVANAPLFAIPNFSTPKFDINAILMILPVLLVITSE HIGHQIVTGKIIGKNLLEDPGLHRSLFGDNFSTMISGFIGSVPTTTYGENIGVMAITGVY SVQVIAGAAVLSIICSFVGPLSALIQTIPGPVIGGISFLLYGMIGTSGLRILVDQKVDYA TNKNLILTSVVFVTGLSSITLSFGGVELTGMVLACIVAMILSLTFYLLDKFNLTNDTAEE NN >gi|223714189|gb|ACDT01000026.1| GENE 48 46363 - 47121 542 252 aa, chain + ## HITS:1 COG:CAC0531 KEGG:ns NR:ns ## COG: CAC0531 COG1737 # Protein_GI_number: 15893821 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 252 1 252 257 202 46.0 6e-52 MQLEELVNQYYDDLNENDLMIWKYILAHKKECCEISIDELSKRCNISRATISRFSQKISF EGFREFKMRLKLECQQVQVKNDGMLDEICNNYLKSMELTKDSDLDDLFEHIHHARRLFVF GTGETQNTVAEMIKRVFLQTKVFFVTLYGKSELKMTINGLGEQDVMIFISVSGENEMAIE AMKTLKNKGTYIVSITKLNNNTLARLADKSLYVITDQYKRFNNRTYETTSAYFNTVEILF LKYLDYLDNLDE Prediction of potential genes in microbial genomes Time: Thu May 26 09:30:51 2011 Seq name: gi|223714188|gb|ACDT01000027.1| Coprobacillus sp. D7 cont1.27, whole genome shotgun sequence Length of sequence - 62318 bp Number of predicted genes - 56, with homology - 56 Number of transcription units - 27, operones - 15 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 71 - 1987 1113 ## COG3711 Transcriptional antiterminator - Prom 2034 - 2093 8.4 + Prom 1892 - 1951 7.5 2 2 Op 1 10/0.000 + CDS 2182 - 2505 485 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 3 2 Op 2 13/0.000 + CDS 2529 - 3815 1130 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 4 2 Op 3 2/0.000 + CDS 3857 - 4198 321 ## COG1447 Phosphotransferase system cellobiose-specific component IIA 5 2 Op 4 8/0.000 + CDS 4201 - 5652 1295 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 5659 - 5705 10.1 + Prom 6446 - 6505 6.5 6 2 Op 5 . + CDS 6705 - 6980 140 ## COG2190 Phosphotransferase system IIA components + Term 6984 - 7023 4.0 - Term 6971 - 7010 1.0 7 3 Tu 1 . - CDS 7014 - 7751 578 ## COG1433 Uncharacterized conserved protein - Prom 7795 - 7854 8.2 + Prom 7806 - 7865 8.7 8 4 Op 1 . + CDS 7897 - 8388 645 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 9 4 Op 2 . + CDS 8381 - 8926 611 ## COG1859 RNA:NAD 2'-phosphotransferase 10 4 Op 3 . + CDS 8987 - 10612 1778 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 10614 - 10657 4.2 - Term 10602 - 10645 4.2 11 5 Op 1 . - CDS 10649 - 10822 65 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family - Prom 10961 - 11020 2.6 12 5 Op 2 . - CDS 11048 - 11185 139 ## CLD_1152 putative flavoredoxin - Prom 11211 - 11270 9.7 + Prom 11187 - 11246 7.1 13 6 Op 1 . + CDS 11290 - 11622 299 ## COG1733 Predicted transcriptional regulators 14 6 Op 2 . + CDS 11615 - 12433 800 ## COG0789 Predicted transcriptional regulators + Term 12435 - 12463 -0.0 15 6 Op 3 . + CDS 12478 - 12933 448 ## Sterm_3755 hypothetical protein + Prom 12950 - 13009 8.3 16 6 Op 4 . 
+ CDS 13033 - 15147 1546 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Prom 15149 - 15208 4.5 17 7 Op 1 40/0.000 + CDS 15254 - 15925 458 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 18 7 Op 2 4/0.000 + CDS 15918 - 16916 869 ## COG0642 Signal transduction histidine kinase + Prom 16931 - 16990 3.4 19 7 Op 3 36/0.000 + CDS 17011 - 17766 345 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 20 7 Op 4 . + CDS 17763 - 19670 1706 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 19678 - 19713 1.1 + Prom 19827 - 19886 8.0 21 8 Op 1 . + CDS 19911 - 20363 546 ## COG1970 Large-conductance mechanosensitive channel 22 8 Op 2 . + CDS 20376 - 22772 2122 ## COG0178 Excinuclease ATPase subunit + Prom 22785 - 22844 5.5 23 9 Op 1 . + CDS 22877 - 23389 593 ## DSY4740 hypothetical protein 24 9 Op 2 . + CDS 23391 - 24164 1001 ## COG0428 Predicted divalent heavy-metal cations transporter 25 9 Op 3 . + CDS 24242 - 24418 297 ## gi|237735107|ref|ZP_04565588.1| predicted protein + Term 24605 - 24650 11.6 + Prom 24633 - 24692 7.3 26 10 Tu 1 . + CDS 24753 - 24926 178 ## gi|167757551|ref|ZP_02429678.1| hypothetical protein CLORAM_03101 27 11 Tu 1 . - CDS 24953 - 26119 907 ## COG3307 Lipid A core - O-antigen ligase and related enzymes - Prom 26196 - 26255 5.8 + Prom 25948 - 26007 10.5 28 12 Tu 1 . + CDS 26184 - 27119 837 ## Cbei_5041 putative RNA methylase + Term 27120 - 27156 3.3 29 13 Tu 1 . - CDS 27142 - 28725 1131 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid - Prom 28831 - 28890 12.5 + Prom 28845 - 28904 10.9 30 14 Tu 1 . + CDS 28970 - 29797 758 ## COG0789 Predicted transcriptional regulators + Term 29804 - 29841 3.3 + Prom 29803 - 29862 9.6 31 15 Op 1 35/0.000 + CDS 29883 - 31601 200 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 32 15 Op 2 . 
+ CDS 31594 - 33330 178 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P + Term 33389 - 33432 3.5 + Prom 33484 - 33543 11.8 33 16 Op 1 . + CDS 33688 - 36321 2424 ## COG1221 Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain 34 16 Op 2 13/0.000 + CDS 36341 - 36649 533 ## COG1447 Phosphotransferase system cellobiose-specific component IIA 35 16 Op 3 8/0.000 + CDS 36671 - 38317 1758 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 36 16 Op 4 . + CDS 38320 - 39732 1650 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 37 16 Op 5 . + CDS 39722 - 40384 588 ## RBAM_012200 Pgm + Term 40385 - 40418 1.1 + Prom 40456 - 40515 11.4 38 17 Tu 1 . + CDS 40558 - 42267 1771 ## COG1966 Carbon starvation protein, predicted membrane protein + Term 42490 - 42526 -0.1 39 18 Tu 1 . - CDS 42303 - 44189 1585 ## Deide_1p01580 putative esterase - Prom 44243 - 44302 8.5 40 19 Tu 1 . - CDS 44323 - 45117 683 ## gi|167757566|ref|ZP_02429693.1| hypothetical protein CLORAM_03116 - Prom 45170 - 45229 7.4 41 20 Op 1 . - CDS 45621 - 46094 433 ## COG0671 Membrane-associated phospholipid phosphatase 42 20 Op 2 . - CDS 46104 - 46328 286 ## gi|167757569|ref|ZP_02429696.1| hypothetical protein CLORAM_03119 - Prom 46351 - 46410 10.1 + Prom 46365 - 46424 6.3 43 21 Op 1 . + CDS 46446 - 47546 1591 ## COG0012 Predicted GTPase, probable translation factor 44 21 Op 2 . + CDS 47546 - 48349 938 ## COG4509 Uncharacterized protein conserved in bacteria + Term 48354 - 48387 3.1 + Prom 48444 - 48503 8.9 45 22 Tu 1 . + CDS 48596 - 50092 1677 ## COG1316 Transcriptional regulator + Term 50094 - 50134 3.1 + Prom 50099 - 50158 12.6 46 23 Op 1 . + CDS 50180 - 51547 1326 ## gi|237735129|ref|ZP_04565610.1| predicted protein + Prom 51562 - 51621 4.0 47 23 Op 2 . + CDS 51641 - 51802 172 ## gi|167757574|ref|ZP_02429701.1| hypothetical protein CLORAM_03124 48 23 Op 3 . 
+ CDS 51836 - 52012 85 ## gi|167757575|ref|ZP_02429702.1| hypothetical protein CLORAM_03125 + Prom 52042 - 52101 9.7 49 24 Op 1 . + CDS 52183 - 52614 360 ## COG1846 Transcriptional regulators 50 24 Op 2 . + CDS 52624 - 56859 4452 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) + Prom 56897 - 56956 8.7 51 25 Op 1 . + CDS 56994 - 58313 1409 ## COG1455 Phosphotransferase system cellobiose-specific component IIC + Term 58323 - 58348 -0.5 52 25 Op 2 . + CDS 58369 - 59160 822 ## COG1414 Transcriptional regulator 53 25 Op 3 . + CDS 59150 - 59446 312 ## gi|237735135|ref|ZP_04565616.1| predicted protein + Prom 59505 - 59564 5.6 54 26 Op 1 . + CDS 59589 - 60374 778 ## COG1414 Transcriptional regulator 55 26 Op 2 . + CDS 60392 - 61435 1188 ## COG1312 D-mannonate dehydratase + Prom 61442 - 61501 4.3 56 27 Tu 1 . + CDS 61603 - 62317 676 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase Predicted protein(s) >gi|223714188|gb|ACDT01000027.1| GENE 1 71 - 1987 1113 638 aa, chain - ## HITS:1 COG:lin0919_1 KEGG:ns NR:ns ## COG: lin0919_1 COG3711 # Protein_GI_number: 16799990 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Listeria innocua # 5 490 4 479 505 165 26.0 3e-40 MRRKQEELINYLYTHNEKVTANILSKALNLSIRTIKSYIAELNMNYPSLISSSNRGYVID KVKANSLLQYKDDIPQDYESRCIYIIKKTLLEKQDYIDIFDLCEELFISYSTLKKDIYKM NTSFANFKITFSSENNKLHVGGSEQNKRKLISHVMSEEVSGNFLNLTLLQESFPDYDLDD ACTLIKDICKQHHYYLNDFSCVNFILHVTIMVSRINHGNHIINNNELIQVTNKNDEKIAK ELCLALEQVFNVSFNSSEILEIYILFKNNANYINNENENVSLLVSDEIIQITKNIIKNVD EHFFINLDSDNFITPFMLHLKNLKNRLIKNNLLKNPMLDSIKISCPTIYDISTFIAYQLT LSFHENVNEDEIAFIALHVGTEIERQKKEETKVSCLLLCPEYLNITSTLHKKIMMDFGDQ LTIQKSISFENEILGNNFDLLITTVPVLESTNYFTVLLPPFPMSYEKNKILDAIIRIENT KKSQILTNNLNFYFNEKLFYSMNEDISKSAVINELAERMINLGYVEENFKEEIWKRETAS STAFMNIAIPHPMKMSAYKTSIGVVISHKGIDWGNQHFVNVVFMIAFNKIDNKHFHALYE SLVLLFNEPIVISEIKKCKNFNDFKDIVIKNYLKFNER >gi|223714188|gb|ACDT01000027.1| GENE 2 2182 - 2505 485 107 aa, chain + ## 
HITS:1 COG:SP0249 KEGG:ns NR:ns ## COG: SP0249 COG1440 # Protein_GI_number: 15900184 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Streptococcus pneumoniae TIGR4 # 6 99 4 97 102 72 38.0 2e-13 MAQAHIYLCCGGGLSSGFLAQKARAAAKKGKIDATIEAKSESEVTQYLPKMDILCIGPHY EFRLNAFKDMAAPYNIPVIVIPEEIYSMLDGKALLELALLEIEEFYK >gi|223714188|gb|ACDT01000027.1| GENE 3 2529 - 3815 1130 428 aa, chain + ## HITS:1 COG:lin2459 KEGG:ns NR:ns ## COG: lin2459 COG1455 # Protein_GI_number: 16801521 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 1 419 1 420 433 135 28.0 1e-31 MKGIMNWFTNKLAPGMQKVFSNPWIAAVASAMQKILPFILVGSVISIYNVFVRYIPSLPD LSFVNTFSFGMMSLIVAFMVTYFGMVELDHPKYTITAGLTSVTVFLMALCPTMATLLKNA TTGKTELTFTDINFLGGSGLFIAIIVGLVVMLIFHLYAKLHILEDSATMPDFVCEWINNI VPMTIIYLIFGVTVFTIGFDLVEFINLIFQPIVNLGQSLLGFVVICFLYVLLYSVGISAW SLNAIVKPILMAGIAANAEMALAGGSPIYIVTTETIFTTALIAMGGLGGTLPLNFLMLFK CKSKKLKTMGKICIGPSIFNINEPLVYGAPIVFNPLLMIPMWINSIIGPLIIWFVMKAGM LNIPAGVNNISRIPAPICTWLTTDDFRAFIWFLVLFIIYGLIWYPFLMRYDNQLVKEEHE KELLKKEF >gi|223714188|gb|ACDT01000027.1| GENE 4 3857 - 4198 321 113 aa, chain + ## HITS:1 COG:BS_licA KEGG:ns NR:ns ## COG: BS_licA COG1447 # Protein_GI_number: 16080908 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Bacillus subtilis # 4 109 2 107 110 63 38.0 1e-10 MDENSLLIPVAMKIIMNSGNARTKANEALEALSIFDFSNAHKKIVEAREDIKKAHQEQTE IIQKEAAGEHYKTCLLFTHAQDTLMTIMSEVNLTEKMIILFESFYQIQGHKEN >gi|223714188|gb|ACDT01000027.1| GENE 5 4201 - 5652 1295 483 aa, chain + ## HITS:1 COG:lin0918 KEGG:ns NR:ns ## COG: lin0918 COG2723 # Protein_GI_number: 16799989 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 13 483 5 482 483 461 47.0 
1e-129 MLLTEKYKKQIYYQFPKDFLWGVAMSAKQAEGADDRALTVADLQNYDPNDKAKVKGDLSK AEILDRLNYPERYNFPKKIGIEFYHRYKEDLILLKEMGITCFRFSISWARVFPNGIDGEP SEKGLKYYDKIFNLLKEMGIVPIVTLYHDDMPIDLALKYNGFLNEKVVTAFIKYATLIMQ RYKDKVTYWIPVNQINLTRVGLSSLGIVKDTVTQLEQKKYQAIHNKFVVCAKIKEIGSKI NNNFKFGCMLADFLVTPMTCKPKDVVFSTEKNQMTMYFYADVQLRGEYPGYAISYFSHNH IAVEIKETDLVLIKENTLDFLAISYYNSNVVSHEKNTLAIGDSQLNPYLETNPWGWTINP LGLYDCFLKYWDRYQKPLMIAENGIGQIERLDNDTIHDDYRIDYIREHIIALNKAITRGV KVFAYCAWSPIDMVSSGTSEMKKRYGFIYNDQDDYGVGSHKRYKKDSYIWYQNVIKSNGL KLE >gi|223714188|gb|ACDT01000027.1| GENE 6 6705 - 6980 140 91 aa, chain + ## HITS:1 COG:BH0296_3 KEGG:ns NR:ns ## COG: BH0296_3 COG2190 # Protein_GI_number: 15612859 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Bacillus halodurans # 13 89 77 159 161 65 44.0 3e-11 MSAFFTLRYPKGGVELLIYVGINAVQLNGEGFIARIGQGQLLLEFDMDKIKAAGYSLETP VLITNHTDLKEIKNTNEAVVSNDVELIKVEF >gi|223714188|gb|ACDT01000027.1| GENE 7 7014 - 7751 578 245 aa, chain - ## HITS:1 COG:CAC3167 KEGG:ns NR:ns ## COG: CAC3167 COG1433 # Protein_GI_number: 15896415 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 119 234 1 117 118 77 38.0 2e-14 MPRPCRKRKVCRLPRYAYFGPNPSATKTVVLSIDEYETIRLMDLEGLTQQECAKQMDVAR TTVQSIYEQARYKLAQSIVRGYSLKIEGGNYSLCDKAHLVNCTSSCTIKNKRKMEDNRMK IAVTYEDGNIFQHFGKSKEFKVYDIENQEIISSQIESTNGQGHSALAEILKNLNIDVLIC GGIGGGARNILTSLGIEIIPGVIGSSDEAVIDYLKGELHYDPNTACNHHDDGHHACHEHD NKCCH >gi|223714188|gb|ACDT01000027.1| GENE 8 7897 - 8388 645 163 aa, chain + ## HITS:1 COG:CAP0111 KEGG:ns NR:ns ## COG: CAP0111 COG0454 # Protein_GI_number: 15004814 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 3 160 5 162 162 197 56.0 1e-50 MIIRKATIKDLKAITAVEATCFPPAEAASRSNFKKRLKTYPNHFWLLEDEGKLISFINGM VTNETTINDIMFEDASLHDEAGEWQAIFGVNTIPEYRQQGLAAKVMQVVIDDARMAGRKG 
CILTCKDKLLHYYEKFGFKNCGISQSMHGGAIWYDMRLEFEHE >gi|223714188|gb|ACDT01000027.1| GENE 9 8381 - 8926 611 181 aa, chain + ## HITS:1 COG:FN1102 KEGG:ns NR:ns ## COG: FN1102 COG1859 # Protein_GI_number: 19704437 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNA:NAD 2'-phosphotransferase # Organism: Fusobacterium nucleatum # 1 176 1 177 179 194 59.0 9e-50 MNDLVNLGKFISLILRHKPELIGLKLDYHGWAKVDELLLGINNSGRFINRTLLDEIVMTN NKQRYQYNEDHTKIRANQGHSIKVDIELIEKIPPEYLYHGTAFKYLNKIEQEGIKKMKRL YVHLSKDIETAFKVGSRHGKAIVLVIDTKAMCEDGCKFYYSQNGVWLTEDIDYKYVMEVI K >gi|223714188|gb|ACDT01000027.1| GENE 10 8987 - 10612 1778 541 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 466 3 470 473 535 56.0 1e-151 MKENFLWGGALAASQCEGAYNLDGKGLSVSDVMLAGSKTKKRKRTDGIVAGEYYPSHEAI NFYHYYKDDLQYFKEMGFNCLRVSIAWSRIYPNGDELEPNEKGLQFYDQLFDEMLKLDIE PVVTLSHYEMPYYLSRQYNGFFDRRCVDYFERFAITCFKRYSAKVKYWLTFNEINGMITN PYTGGGVIAKIDNNYLQTVLTACHHMFLAGAKAVKACHEIIPSAQIGCMIAFLCGYSATC KPDDVMFNHNFMDVNMFFTDVQVRGRYSNKALTWMSRYGIELPVIDGDEKILEQGKVDYI GFSYYQSITMSSDILDNLGSGGNLFSGAKNSYIKVSEWGWPIDPKGLRVSLNYLYDRYQI PLFIVENGLGAVDEISDDHQIHDNYRIDYLTQHVREMKKAVDLDGVELLGYTWWSPIDIV SYSTGEMKKRYGFIYVDKDNDGNGTLKRYIKDSFYVYQEIIRTNGECVDEKKRYTLNSQL KDVLEIKGLREIIKLVSEGKVTDLQLKLGGRLKINQLLDKFDVNDNNQRLIIDLLNRLEA K >gi|223714188|gb|ACDT01000027.1| GENE 11 10649 - 10822 65 57 aa, chain - ## HITS:1 COG:FN0320 KEGG:ns NR:ns ## COG: FN0320 COG1853 # Protein_GI_number: 19703665 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 1 57 123 179 180 67 52.0 7e-12 MTLECKIVYQQVQDKNAITPNNLERFYPQDVDSSFYGANKDLHTAYYGQIINAYIIE >gi|223714188|gb|ACDT01000027.1| GENE 12 11048 - 11185 139 45 aa, chain - 
## HITS:1 COG:no KEGG:CLD_1152 NR:ns ## KEGG: CLD_1152 # Name: not_defined # Def: putative flavoredoxin # Organism: C.botulinum_B1 # Pathway: not_defined # 1 45 1 45 178 79 88.0 3e-14 MKKQINVFDYAKEIMEAVQTGVLLTTKVDDKVNSMTISWGTLGIE >gi|223714188|gb|ACDT01000027.1| GENE 13 11290 - 11622 299 110 aa, chain + ## HITS:1 COG:BH0655 KEGG:ns NR:ns ## COG: BH0655 COG1733 # Protein_GI_number: 15613218 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 9 105 17 113 113 122 59.0 1e-28 MNDEVNVRPFAYAMSLIDGKWKMHILFWLWKKQVLRYSELKRALGTITHKMLSTQLKELE KDDLIIRNEYPQVPPKVEYSLSPRGLTLMPVLECLCKWGHEHINDGDNHG >gi|223714188|gb|ACDT01000027.1| GENE 14 11615 - 12433 800 272 aa, chain + ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 3 118 2 117 117 72 36.0 1e-12 MDKEYYSIGEIAEICNIPIRTLRYYDEIGLLVPEKRDIESSYRYYARHQILQANIINQFK IQGYSIKEIKHQLANNSVSLSEETLNQKHAELKKTIAELNKLEKRLGFFLECLKYQDHQL HLQLKEIPEIYIAYVRQQGLADQAAFMRRFSELNTLCRKNNLEPIGNIMARYYDDYQQYD PDCADIEVSIQIDLNHEIGGVVRKEPGYLCVSALHYGSYRDEYKTYTLMMEWMKKNNLVM CGPALEYYLIDPIFTNDENEFVTELRIPVKSI >gi|223714188|gb|ACDT01000027.1| GENE 15 12478 - 12933 448 151 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3755 NR:ns ## KEGG: Sterm_3755 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 151 1 151 151 224 74.0 5e-58 MIHLVYLDNKEKVLEKIMSGKKTMIIRGAAGRKIPHSRVFNDEVLYFMEKGTKKITAKAR VVNVLNFIKLSEDEINEILLKYQDKLDLTPKQQVRWHKKCLCLVEFTDVEAIKPLDFDHQ GNMDDWLILDRIEDVVVGTSIPYNYNNARFK >gi|223714188|gb|ACDT01000027.1| GENE 16 13033 - 15147 1546 704 aa, chain + ## HITS:1 COG:lin0625_2 KEGG:ns NR:ns ## COG: lin0625_2 COG0584 # Protein_GI_number: 16799700 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Listeria innocua # 455 690 1 231 243 142 34.0 3e-33 
MKDKLKNFKHLFFKILKFEIFYKFAIFLIISPLINKILQIYLNNNSSGAAFNQYILFNFL NLEGITVVLIIMMMAVIAIGYEFSVLINMITLNKQNKDFRIYEVMKTSLLNLSCLKHPSS ILTGIFFLFILPLVHLGYINSLIPSLKIPNFIFGELSLTFQGNILICLIYGLYFGIYGLM FFVPVLMILKKENIIKAFKENIKFHKLLTLKERISLIGIMGIWIVLENGLIQILPDALIK NADFNRYFLKNLIISSRFRIYLLEYIIYTLIIMVLMIVFYQYLIGVMGKHEPELLKVKVD LEFNRLFDRTLLKAQIHGNRFFQSLDKYFFETHFYQKYRGMINLVFWPCMFILITYLFPN SIYILSAILIVVLLMYFAASIIIRYEKRKNGQEFEDDSGLLFLPYRLIKRTLNNSRLYNH HPRMVTALLAFISVCLVGMYLEQPLMLHHPWVIGHRGSIYGVENTDGAIMTAADKGADYA EIDVQLSKDGVPVVIHDADLSRLAHKDEKVKNMTAKQLSETLVYHNEYTDKIPTLDHLIK KLKKNSTKMGLLIELKVEGGNGEKLAKKIIDVIEKNDYQKQAIYMSLDLDTVQYLQSQRP EWWIGYCIYGSAGDIEGSLWNQGIDFLAIEENRATVSFVEKANRNWVPVYVWTVDDESRM IQYLELGVSGIITNYPNRGRKAVDKFKENNYQYYYHHGSGYPDS >gi|223714188|gb|ACDT01000027.1| GENE 17 15254 - 15925 458 223 aa, chain + ## HITS:1 COG:CAC0224 KEGG:ns NR:ns ## COG: CAC0224 COG0745 # Protein_GI_number: 15893516 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 222 3 222 223 188 45.0 6e-48 MKIMIVEDDLILARELQLLCVKWNFQAKYLEDFKHVDQVYCTYQPDLILLDINLPFYDGF YWCQKIRENSTVPILFISSREQNADKILALSAGGDDYIEKPFDLELLLVKIKAILRRTYE YKHLEKEYLDEDTSYDIVRGRFVYQNKVLELTKSEGKIMSTLLGQRGNWVTREQLMMTLW NTDEFVTDATLSVHISRLRNKLKELTNGKDIIKTKKGVGYYID >gi|223714188|gb|ACDT01000027.1| GENE 18 15918 - 16916 869 332 aa, chain + ## HITS:1 COG:CAC0225 KEGG:ns NR:ns ## COG: CAC0225 COG0642 # Protein_GI_number: 15893517 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 2 328 4 334 339 137 30.0 3e-32 MINYLKEKLRPIVGGMLVLVFINIYFLYLCGSRLYLSDLVYLDVILIFIGITIFIYGNVK YRAVEKIMKTNNFLTTKELKSYLSLQALSVIEQNDLYYHEEINHLSTQIQELSDYISRWS HEAKLPIASMRLMNERNQDLVLKKQMRLTIEQIQLLLNTMLMSSKLRNPVNDIKIEKVFL SQVIKEAIKHQSYFLINDHFKINLKVENEYVYSDRRWLTYMLDQFITNSIKYKQTEPNLT FYIREDHDGLELIVEDNGIGIAPEDAPYIFDRGFIGHNLRDGDYRSTGMGLYFVKEIAEK 
LGIKVIYDNTFCSGSRFKLQFEDNAEYFMLDY >gi|223714188|gb|ACDT01000027.1| GENE 19 17011 - 17766 345 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 203 1 199 223 137 36 1e-31 MRKIVEIKNLVKSYGGKDYQTKVLKNINLDIYENDFIAIMGPSGAGKSTLLNMLSTLDKP TRGEIIIDGEDITKVNNKRLSKIRQEKIGFIFQDYNLLDNMTLQDNIALPLSLNGKRSRE IIEKTKQMASLFGLSEHLNKYPYQLSGGQKQRGASARALITNPRIIFADEPTGALDSKSS KDLLISLKKANESGNATILMVTHDAYSASYAKQVYMLSDGAIKCRLNHSGERQKFYDEIL SLLASMGSEQE >gi|223714188|gb|ACDT01000027.1| GENE 20 17763 - 19670 1706 635 aa, chain + ## HITS:1 COG:CAC0227 KEGG:ns NR:ns ## COG: CAC0227 COG0577 # Protein_GI_number: 15893519 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 1 631 1 632 634 162 24.0 1e-39 MSLFKLAYSNFKRSVKNYVALIISLSFSIFTFFNFQNIVFSNAMDVLKEKNSDYIAIIIQ VISIVLIIFVFFFIWYATNVFLSQRKKEIGIFTFMGLDNVKIGKMYVIEVAMIGLLSLII GLGSGALFSKLFTMLLLRLSDISVDVSFSFSIIPVLLTIALFVTIFAIMIIKGYINILRS SVLDMLSASKQNELREEKIIITLLKVVIGVVLICGGYYAAMMVGDMSSFIYIFYAVVLVV AGIYFLYNGVIPFVIKKLAQNKSYLYQKERSLWVNNLAFRLKKNYRAYAMVTILMICSVT VLATSFAMKQRYDNISHFRGTYTYTVMANKALDGKQIAQKISSNNELNYHNSLNYIALNA DMVKSKYLYNQYGVVSYSQLKKAAKEAKIKFDIPKLDDQQVVNLTRLYLLDISDPEINPV IEIAGEKYQIVDETATSYLGLLQENISTYIVSDNTYEKLRNFGQEFYMYNYQIKDPNNYQ ASIDYLNSLVKDGPVDYVSYLANDPNGGDIAWVRVMYSICIFLFLVFILASGSIIFMKLS NEAFEDRERYNVLKKLGISKRTLSRSIRNEIRFAYYCPFVLMVLSSYFSVHALANVMKTE LYTVNIVSAIVILVIFYIIYTISVLMFKKKVLSDH >gi|223714188|gb|ACDT01000027.1| GENE 21 19911 - 20363 546 150 aa, chain + ## HITS:1 COG:AGc934 KEGG:ns NR:ns ## COG: AGc934 COG1970 # Protein_GI_number: 15887877 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 109 1 114 142 107 52.0 5e-24 MKKKSGIIEEFKKFITRGNVMDMAVGVIVGGAFTTIVTSLNNDIISPILGIFGGVDFSDL 
KLVMGSGENAPVLKYGSFLTAVINFLIIAMIIFLIVRAMNKINESIKAKLTDKEEVLVET TKVCPYCREKVAKEATRCPHCTSILEEKVD >gi|223714188|gb|ACDT01000027.1| GENE 22 20376 - 22772 2122 798 aa, chain + ## HITS:1 COG:BH0714 KEGG:ns NR:ns ## COG: BH0714 COG0178 # Protein_GI_number: 15613277 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Bacillus halodurans # 3 798 4 818 820 735 48.0 0 MKDIIIKNAYHGNLKNLDLTIPRNKLVVITGLSGSGKSTLAIDVLYQECQRQYLEAISFQ GINKPGVEVIRNVSPAVVITQDEKNNNPRSSLGTVTDIYTDIRMIYEKLGVRKCPHCGKM IDASYCHEELVRADNDFTVYMYCNYCHYKMEKLTRSHFSFNSEKGACPTCFGLGKTLILN LDKVLEPNLSLREGAVAFWHHRYKEYQIEQLEKVYRELGLSVTPETKVIDFNQQQRVILL YGTESKQFKELFGKIKISKFEGIITNLWRRVEEKKNQSKEISSYFDEQVCPDCHGEKLNS LSRQITVNKTVITATTKMPLTELLHWLEQLEATVSGPQKEAVQQYIIDLKTKIQRIINVG LGYLHLDRQTMTLSGGEKQRIKLAAALQSQLTGIIYIFDEPTMGLHPKDTAGIINVLKEL RDQENTVIVIEHDLDVIKAADYLIDLGPGAGKHGGQILALGTYHELQNNPSSITGQYFVN KKKCKRVYRTSNRYFEVKNAHVHNLKNIDVTFLVDCLNVVTGVSGSGKSTLVFAVLAKQH YRYFDDIIAVRQESITTTRRSNIATYTGIYDEIRKLFGSLEATKEMGFTAKHFSFNSQGG RCENCEGLGTVTSNMLFFKDVEVVCPVCQGKRFIPEILALKYHEHSIDDVLHLSVDEGVG FFSDCPKIIKTLKMLQDVGLGYLELGQVLTTLSGGEKQRLKLATTLLTNINKHNLYLIDE PTIGLHPLDIEHLLLLLDRIVDSGNTVVIVEHNQQIINAADWIVDLGPAGGNDGGYVIAA GTPNDIKGNENSIIGEFL >gi|223714188|gb|ACDT01000027.1| GENE 23 22877 - 23389 593 170 aa, chain + ## HITS:1 COG:no KEGG:DSY4740 NR:ns ## KEGG: DSY4740 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 2 165 4 200 203 118 35.0 1e-25 MTTQEKIINVATQEFLNSGYLGASLRKIAKAAHVTTGAMYGYYKNKEALFNDIVEEVSEE FKNDYLVEGINLETIDYIYQNFEVFKLIICCSKGTRYEEYLDCLVNEKTKQFNEHGFDDK LSHIINHSYLFGVFEIVRHQMTEKQAEDFVSDLQEFYQAGWNKLLERKEI >gi|223714188|gb|ACDT01000027.1| GENE 24 23391 - 24164 1001 257 aa, chain + ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 14 257 25 269 269 228 55.0 1e-59 
METLIGILIPFIGTTAGAACVYFMKNKMNDLVQKVLLGFASGVMIAASVWSLLIPAMDMS SDLGKMAFVPAAVGFLLGIAFLLLLDRNVPHMHLDNEEEGPKSQLKKSTMLVLAVTLHNI PEGMAVGVIFAGLASGSQGVTYAGALALSLGIAIQNFPEGAIISMPLKSSGLSKNKSFIY GMLSGIVEPIGAGLTILMASLVVPILPYLLAFAAGAMVYVVVEELIPEASQGHHSNIATI GFAIGFVVMMMLDVALG >gi|223714188|gb|ACDT01000027.1| GENE 25 24242 - 24418 297 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735107|ref|ZP_04565588.1| ## NR: gi|237735107|ref|ZP_04565588.1| predicted protein [Mollicutes bacterium D7] # 1 58 4 61 61 83 100.0 4e-15 MKKRLGLGLGMFLITILSNLSGVMAYSGNSNLPGSAIIIVFVPVMLGIIYLASKGEVK >gi|223714188|gb|ACDT01000027.1| GENE 26 24753 - 24926 178 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757551|ref|ZP_02429678.1| ## NR: gi|167757551|ref|ZP_02429678.1| hypothetical protein CLORAM_03101 [Clostridium ramosum DSM 1402] # 1 57 1 57 57 79 100.0 8e-14 MNKLALVAISLFVVVEILVGDCVVQGIPLWVVGLIGFSLIWMGIICVKEKVQIKDWY >gi|223714188|gb|ACDT01000027.1| GENE 27 24953 - 26119 907 388 aa, chain - ## HITS:1 COG:SP1893 KEGG:ns NR:ns ## COG: SP1893 COG3307 # Protein_GI_number: 15901720 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A core - O-antigen ligase and related enzymes # Organism: Streptococcus pneumoniae TIGR4 # 3 384 6 386 397 161 30.0 2e-39 MKFINSLKERFTKEQLMIIATALSLSTPFYICVPFLLLETLYLLYTKKIINAFKSTPKSK YLIIFMLLTLAISLIYQNWIGVGCVALIFIFVSLMLYYRQYINEDTFEFILDMLIALSIL WAIYGLFEQIQILDRLGYDHFTLKVFSRRENRLNSVFFNANYYAMMIEFTIMMIGYKIFG TKNVKKQLYYFIVACLNFFLLYMTGCRTAFVATAGAMLVFLIINKNYRICSLIGLLCIIG GIYFIFNPEQFPRIEKLVDNFTVRTKIWHAAIEGIKAHPLFGQGPMTYMMIFKQYGGHVT QHAHSVYLDPILSFGVVGVATLVPYMFDNCKRLFKVYQEKLNLRYVALVISCIAVILLHG VLDYTIFWVHTGMLFLLIASSFEMFKHK >gi|223714188|gb|ACDT01000027.1| GENE 28 26184 - 27119 837 311 aa, chain + ## HITS:1 COG:no KEGG:Cbei_5041 NR:ns ## KEGG: Cbei_5041 # Name: not_defined # Def: putative RNA methylase # Organism: C.beijerinckii # Pathway: not_defined # 11 310 23 318 319 221 37.0 2e-56 MKKYLYVFNYPPEDKELCALEFRTLFKDEFKSKYYLSNKDYSVDKSVFVKAKLDLWGIDA 
DFNKLIEKVAALHKDYQNFKVIYLKNEVTHVDYQESLLRCKDISWGIAGSVNMSRPQHTI AITKLEDQWLCGYYHHGVPSWQKHDDKPNTFSNSLDIRLARTLVNLAAGDNDQVTIVDPC CGMGTVVLEGLALDLDIEGFDISREISWQARKNLKYYGYDEYLINKVSIHDLQKHYDVAI MDIPYNLYTPITYEEQCRMIQSSRRICDKMIMVTFEEMSKEINQAGFLIIDGCLRKKTEY VKFGRYIYICI >gi|223714188|gb|ACDT01000027.1| GENE 29 27142 - 28725 1131 527 aa, chain - ## HITS:1 COG:CAC3213 KEGG:ns NR:ns ## COG: CAC3213 COG2244 # Protein_GI_number: 15896460 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Clostridium acetobutylicum # 1 500 1 490 512 160 26.0 5e-39 MKKNSIFKQAAFLAAAGILVRIIGLLYRSPLTKLIGSQGMGYYSTAYNVYALILLVSSYS IPTAISKLLSEKLAVNQYNNVKKILLCSFIYIVAVGGGAAIIAFVIAPYIVPDKAVSALR ILCPTIFLSGLLGIFRGYFQAFKTTAFTGISQIIEQVFNAGVSIGAAYLFIQPYLNNQSL VASHGATGSALGTGAGVLISLCYMLFMFKRTKQSYLNPPNIEAADPHTDSFKDIFKMISN IVTPIILATCVYNLISTIDMYMFYLVCGDGAKSISAFGAYGGEYIILQNVPVALASAMST ASIPNISSAWLFKDTAEVKKQIAQGTKVIMLILIPSAVGMSILAVPIIQAIFPQKETVVL ASTLLTFGSPAIVFYGLSTYTNGLLQALGYSSIPLKNAIYALIIHCFITLFLLLATNLNI YSLLIGNCLYGLQLCFANQRALKHITTYYQEKLHTFIYPLISSAIMAFTVALCYYGLIKV IDRLIITLFIAIVIGVIIYFGTLLFLYKDNVEELSQIPYLQKIIKHK >gi|223714188|gb|ACDT01000027.1| GENE 30 28970 - 29797 758 275 aa, chain + ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 9 107 5 106 117 73 40.0 3e-13 MFMEGDKMFSIGEFSKMAKTTVKTLRHYDEIGLLAPAYVDTYSKYRFYTSNQLIIIHQIQ SLRQIGLSLDEIRMIINGVDGVAILESRKKAIKESISEAEDQLSRIEFILSGKEEDTFMN YQASIKELPACTVYSKRMVVPGYDSYFELIPAIGQKVMNKYPDLKCTVPEYCFIIYRDGE YKEKDIDIEFCEAVDQVKVDFDDIKFKKMPAVKAVSIMHKGDYAGLPKAYAYAFKWIEEN GYQVLDNPRESYIDGIWNKESKDEWLTELQIPVTK >gi|223714188|gb|ACDT01000027.1| GENE 31 29883 - 31601 200 572 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 346 560 28 239 312 81 27 9e-15 
MKLLKYVKEYRFPAIIGFVFKIAEAALELMVPLVMADIIDVGIKNNDQNYILVRGLFLVG LAVAGYLFALVCQYYASLTSQSVGTKLREDMYHQINRYDHHNLDKLSAPTLVTRLINDVV QIQLAVAMTIRLTSRAPFIMIGSLFLAFLISGPLASIFVVGAIVLAIVMLMITIISMPYF NNVQKKLDKISLIVRENLNGIRVIRAFASQDKEINKFKTETRQQKDIQVKVGRIQALLNP FTYLIVNVAIVLIVYFGGTEVNVGGLSQGEVIALVNYMNSILLALIVFANVLSIYNKAGA SYTRIFEVLETEPAVVNDGQIQIYRESEDCIEFRHVSFAYEQKNVLNDLNFTIKRGQTIG IIGGTGAGKTSLVNLIGRFYDVSSGEILINGEAIKNYDLHALRSFIGFVPQYAALISGTI RENLQLGNQTANDQQLLKALEIAQGKEMLEDKAAGLDTVIEQGGKNLSGGQKQRLTIARA LVKQPEILILDDSSSALDYATDFKLRQALKQLNMTKIIISQRTASIEQADKIMVLYHGDL VGFDSHEQLMKDCKIYQEIYASQHSKDGDDHE >gi|223714188|gb|ACDT01000027.1| GENE 32 31594 - 33330 178 578 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 339 563 131 355 398 73 25 3e-12 MNSKTIKRLLTYCRPYRVILGLIFILSFISVCLTLITPILFGQAIDLLIGIGLVNFSRLF WQLLVIVGVVLAGALVQWILGQLTNKITYNITNDLRDRVFEKIHLLPLKYIDSTPHGDII GRVINDIDLIAAGLLQSFTSLFTGVATIVGTIIIMCLINLSIAIVVIVLTPLSLLVASLI VKRTHIYFKEQLELRGEMNGYIEEMIGNQRIIKAFNYEQMNEERFKEVNQRMHVSGVKSQ FYGALINPTTRIVNSLVYGAVGVFGAISVLNGSFTVGLLSSFLTYANQYTKPFNEISSVM TEMQTALAASQRVFNLLDEPVEKPVSNPQNVEIMEGNVSLKNVYFSYDPKVSLIENLSLE ALPGQTIAIVGKTGCGKTTLINLLMRFYDQNSGSIMVDNVNTLDMERDYLRRMYGMVLQE SWIFKGTIKENIAYGKSDATDEQIIAAAKKARVHKFIMKQPEGYDTMIDEDGGNISQGQK QLICIARIMLTKPPMLILDEATSSIDTRTELQIQEAFETMMKGRTTFIVAHRLSTIKNAD MILVMDKGHILEQGTHQELLEKQGYYYNLHNSQFNLAK >gi|223714188|gb|ACDT01000027.1| GENE 33 33688 - 36321 2424 877 aa, chain + ## HITS:1 COG:CAC0382_1 KEGG:ns NR:ns ## COG: CAC0382_1 COG1221 # Protein_GI_number: 15893673 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Clostridium acetobutylicum # 23 416 16 423 423 157 28.0 7e-38 MKSNKVKVFEFIKEYSITQSTDEYPKLTTQYLSEKLDMQRTNLSSILNQLVKEGKITKTT TRPVLYQLANFQLTNQKDFENLIGYNQSLNEAVMLAKAAILYPQGSPHILLTAESGSGVK 
YFAKNVYDFAVKSKVLKNNAPFTVFDCKTFIENPTIINEMLYGDEVNTGLVHQTNQGMLL IKHVELLSGYQRTMLFSIITSNKMPSNQYVELPRNYKCIMLCAIASDASKDLYDLYRNKM DFVIELLPLSQRPLKERFALLELFLKQEARKLDRVLEVSTNILHSLLLYEVKDNIRGLKN DIHTGVANCYVREHDVHHHHIELLLSDFPNYVRKGMIYYKTFKEEIDEIIPGDCKFAFTK NEVLKNRAKENNANIYRTIDVRKKELKKQSVSEEQINTLVSLQLHQDFQEYLNELSNRVS DKEQLSKIVSMKLISLVERFITRVSSELAINYQENILYGLCLHINASLIKVSSKQRLANE EIKRMIDLYPKHYHLAKEFVHEIEEEFRVKMNIDEIIFTMLFILNESKPVTNKHVVTLIA MHGDSSASSIVNVVNALAMHNNTYAYDLPLDKSMDDAYEDLKEQIIKIDQGKGIILIYDM GSIRTMAESIAFETKIEIKYLEMPVTLIGVTSSNKASNNDSVDEIYEYLQTKFKDIKYFR KQSDKKILVIISKEQSEVTRLKSYLNEHFDLSNVTVHVIENSEANHLYNEINIIANEGKI IGIIGNNRPNLAQYPFAEVRWLEQKNAKTLEEIFIEEDEARDDINEIFEYLKDQFNEVDV DGIKDYLLDFTKSLEVTLDATLDEDQQIGLIVHLVCLIDRISRHQAPIVNFIASNIILNH GMLVSKVKELLVPLEMALNISISDAEIATIISIVKKR >gi|223714188|gb|ACDT01000027.1| GENE 34 36341 - 36649 533 102 aa, chain + ## HITS:1 COG:SA1993 KEGG:ns NR:ns ## COG: SA1993 COG1447 # Protein_GI_number: 15927771 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Staphylococcus aureus N315 # 1 102 1 102 103 107 52.0 4e-24 MTKEEATMVAFEIVAYSGDARSKLLMAVEKAKAGDLVTADRLVAEANECLVDAHKAQTDL LQQEARGDNVEVGFIMVHGQDHLMTTLLLKDIVGTLMDVYKK >gi|223714188|gb|ACDT01000027.1| GENE 35 36671 - 38317 1758 548 aa, chain + ## HITS:1 COG:SA1992 KEGG:ns NR:ns ## COG: SA1992 COG1455 # Protein_GI_number: 15927770 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Staphylococcus aureus N315 # 1 546 2 569 570 658 58.0 0 MNKLVSNIEKLKPFFEKVSRNRYLRAIRDGFISAMPIVLFSSIFMLLAYVPNIFGFYWSK EIEAMILKPYNYSMGILALVVAGTTAKNLADNFNRDMPVNRQINNISALMAAIVGFLIVG VDSIEGGFANGYMGSKGLLTAFLVAFVVCNVYRFCVKRNVTIKMPDAVPPNISQTFADVI PFAVSAIIFTIFDIIFRNVTGICFAQGVIEFFQPLFTAADGYIGLAIIYGAMSLFWFVGI QGPSIVEPAVSAIYYVNIANNLDLFQAGQHANNILTPGVQQFVATIGGTGATLVITLMFA FMSKSKELKAIGRASSIPVIFGVNEPILFGAPLILNPIFFIPFIFAPILNVWLFKAFVDF 
LGMNSFMYVLPWTTPGPIGIVLGCGLGLLTILFAAVILLVDFAIYYPFFKVYDNEKCEEE KNKNLDAVEKEEEMIEVDGNTLKSKRILVLCAGGGTSGLLANALAKGAKEQNIPLITAAG SYGAHMDIMKDYDLVVLAPQVANYYEDLKKDTDRLGIKCVACEGKQYINLTRDPDGALKF VFKIMEGE >gi|223714188|gb|ACDT01000027.1| GENE 36 38320 - 39732 1650 470 aa, chain + ## HITS:1 COG:SA1991 KEGG:ns NR:ns ## COG: SA1991 COG2723 # Protein_GI_number: 15927769 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Staphylococcus aureus N315 # 3 465 5 469 470 576 58.0 1e-164 MKFPDNFVFGGATAAYQCEGSTLKYGKGKVAWDDYLKKEGRFSGDPASDFYHQYPIDLKL CRDYGIKGIRISIAWSRIFPNGVGQINQEGVDFYHRVFQECKKQGVEPYVTLHHFDTPDA LHKIGDFLNHEIIDQYTNYAKFCFQEYHDEVKYWFTFNEVWPIAVNQYIVGSFPPAIKYD IPKAVYSMHHMMVAHAKAVLAFKEGKYPGQIGIIHSLETKYPYNDNEADYQATKNEDVLA NQFVLDATFLGEYTDETLTIIKRLVALNNGTFRVDLEDLEIMKKAAKLNDYLGINYYQSH FIQAYDGENDIFHNGTGDKGTSRFRLKGVGERMFKEGIERTDWDWLIYPQGLYDMIMRIK NQYPNYKAIYITENGMGYKDEFVGGKIDDTPRIDYIRKHLSWILKAIDDGADVKGYFVWS LMDVFSWSNGYNKRYGLFYVDYDSQVRYPKKSASWFKQISETKELDKIEL >gi|223714188|gb|ACDT01000027.1| GENE 37 39722 - 40384 588 220 aa, chain + ## HITS:1 COG:no KEGG:RBAM_012200 NR:ns ## KEGG: RBAM_012200 # Name: pgm # Def: Pgm # Organism: B.amyloliquefaciens # Pathway: not_defined # 4 196 2 196 227 142 40.0 8e-33 MNYKAIIFDFNGTLFFDNDKHVLAWGKISKLIRGRGISDEELHEQFNGTPNAKNIQYMMN NQATSEELKKYSLLKEEYYREFCKQDKASFHLVDGAVAFFDKLKELGIPFTIASASIKEN IDFFVESFNLDKWIDPEMIIYDDGSYSNKVNMFKDAAQKLKHNVKDILVFEDSFSGIANA YKAGIEKIVVICPQEKESEYRDLPGVIMTIQDFSDFNKLI >gi|223714188|gb|ACDT01000027.1| GENE 38 40558 - 42267 1771 569 aa, chain + ## HITS:1 COG:STM0600 KEGG:ns NR:ns ## COG: STM0600 COG1966 # Protein_GI_number: 16763977 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Salmonella typhimurium LT2 # 1 515 32 596 701 241 32.0 4e-63 MNGLLLLALSMVALAAAYVIYGRYLEKTWGIDPKAKTPAVANEDGVDFVPSSKWEVFAHQ FSSIAGAGPVTGPVMALMFGWVPTVLWIIVGGIFFGAVQDFGALYASVKSNGKSMGGIIE 
EYIGKTGKKLFFLFCWLFTLLVIAAFADMVAGTFNGFSAVDGSKLQPNAAAASISMLYIV VAIAFGFFLKKRHVSGKVQAMLGIGLIVLMLIGGIAFPIYFSKTTWLYVVFVYIFFASVT PMWLLKQPRDYLTTFLFVGMIVAAVVGVLFYNPTISLPAFTGFTSETGSYMFPTLFITIA CGAVSGFHSLVSSETSSKQIKNEKDMLQVGYGSMLLESLLAILVVVVVGSLTQLVSDGVL TDQLASLVTAEGATPFTKFSVGVTAFVSKLGLPQEWGICIMTMFVSALALTSLDAVARIG RLSFQELFEVENSQEASGLNKILTNKYFATLITLFFGYLLSLGGYNNIWPLFGSANQLLA AMVLISLSVFLKVTGRKGFMLYIPMCTMLVVTLSSLGLSVYNIVNAWMTTGTFDFLTSGL QLIFAILLIALGVLVAFFDFRKLFKTKKV >gi|223714188|gb|ACDT01000027.1| GENE 39 42303 - 44189 1585 628 aa, chain - ## HITS:1 COG:no KEGG:Deide_1p01580 NR:ns ## KEGG: Deide_1p01580 # Name: not_defined # Def: putative esterase # Organism: D.deserti # Pathway: not_defined # 316 433 31 155 297 68 36.0 7e-10 MKKITKALLALTLSLLMLAAPISVNQLYANENILTEVTSDTINNYTYYEYDSEADGYTSE RSNILTPIYYIFAGKQDLTSADKLIEEIGLLDNVHEWAGKVYIINPISTQYNNYDVTAFK KLAGTGVSNIKVIGIDEGATFVNNYISQNCYFIAGMMVYGGTMNSDLTYNVPIPAYLSST ATSAVSYYKQANQTDQSQSFNNYTIYQNSTNPLQIVVNSKTDETLKNAFDNAWETVFSKN YRQHNETTEFYNMPVTDTNLANAEQPYKLIETPIFDRLGIIHNQEINQTVSNMPGKYTWF EYLPNQVIDTKKDSVPLVLTLHGNGNDPRVQADSSGWIELAAKENFIVVSPEWQDASVNF SKCDGLGDEGIINLIDDLKIKYPQIDRSRVYVTGLSAGGAESLLLGVKNSETFAGVGAVS GVNLYSEAITELTNDYKGHETPLLYICGDHDFFQMIPVDGSSQYGTSQLYGFSIWAEDSN THIYSALQAYQKINDLTVTDMNMDLNPYYGIKLDNQQWTKLGEKDMYTGTLSNNNGVVME LAAIKDHAHWNYKPEAQYIWNFFKNYQRDLLTGELIFVNNGSNTTTVIDKKDDLTTSVKT GDEVEFEYLGILSVITITTFIYFKKKIA >gi|223714188|gb|ACDT01000027.1| GENE 40 44323 - 45117 683 264 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167757566|ref|ZP_02429693.1| ## NR: gi|167757566|ref|ZP_02429693.1| hypothetical protein CLORAM_03116 [Clostridium ramosum DSM 1402] # 1 264 1 264 264 457 100.0 1e-127 MKREGIHGLIQYLNKTTNNATNTQVAKAIFTNRDKINEISLEKLAGDNYLSQASVSRFIK NQGYKNINEYRWDFIAGQQMLRLNAFNNKKVIISKNNEEIKNDVKQSLQNAFNDIDKLDM TALERLVKIINNYKQVLFIGSEFSLANIYLVQLEMVQYGINAYSYNDPIIMAENLRSLKE DTLIICISTSGQWYNAPSTKEIRDILFSLNNPKILLTCITQHTDEAKFDYIYKFGNQRND EISGYIQLTYFIPIFRNMYIRYID >gi|223714188|gb|ACDT01000027.1| GENE 41 45621 - 46094 433 157 aa, chain - ## HITS:1 
COG:SP1916 KEGG:ns NR:ns ## COG: SP1916 COG0671 # Protein_GI_number: 15901740 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Streptococcus pneumoniae TIGR4 # 2 157 5 163 167 82 34.0 3e-16 MEKFYQTMLSNIRKHPLLQKIIMGFTRYIPIITFIVYSILLVYLLYTQNTLLAKTLYKPL ASFLIVTLLRKVINRKRPYEAMAIDPLIEHKQGESFPSRHTVSAFAIALACLQVNSLLGT IMLILAFVVSCSRILSGVHYISDVLSAVIIALIISFL >gi|223714188|gb|ACDT01000027.1| GENE 42 46104 - 46328 286 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167757569|ref|ZP_02429696.1| ## NR: gi|167757569|ref|ZP_02429696.1| hypothetical protein CLORAM_03119 [Clostridium ramosum DSM 1402] # 1 74 1 74 74 135 100.0 8e-31 MFNQIDIHGCTTIEAKIRLDNYLNSLSPNTKEITVVHGYSSKILQQFIRKQYKHKRAGRR ILTMNAGETIIQLK >gi|223714188|gb|ACDT01000027.1| GENE 43 46446 - 47546 1591 366 aa, chain + ## HITS:1 COG:BS_yyaF KEGG:ns NR:ns ## COG: BS_yyaF COG0012 # Protein_GI_number: 16081144 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Bacillus subtilis # 1 366 1 366 366 482 66.0 1e-136 MALTAGIVGLPNVGKSTLFNAITNAQVEAANYPFATIDPNVGVVEVPDYRLDKLTELVEP KKTVPTTFGFTDIAGLVKGASRGEGLGNKFLGNIRETDAICEVVRCFRDKDVTHVDGDVD PIRDIETINLELIFADLDTVEKRIGRIGKKAQSGDKEAKLEVAILEKLKSTLEANKPARV IEFSKEEMDVVKQYTLLTMKPIIYVANLGEEDLEDPTTNPHYNKVVEFAAGEGADVVPIC AKIESELVGMDKEEKDLFLQDLGIEESGLDKLIKEAYKLLGLRTYFTAGVQEVRAWTFKE GMTAPEMAGIIHSDFQRGFIKAETYSFDDLVEYGSEHALKEAGKIRQEGKQYVGQDGDIM LFKFNV >gi|223714188|gb|ACDT01000027.1| GENE 44 47546 - 48349 938 267 aa, chain + ## HITS:1 COG:BH3294 KEGG:ns NR:ns ## COG: BH3294 COG4509 # Protein_GI_number: 15615856 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 18 254 11 250 254 144 36.0 2e-34 MAKTKKSFIERIKPTGAKDLIRKIILLVCICVFCYSAYNLASIFLEYKSMDDSNKEVEET YVTENTEQKNSYKTIDFEALLARNSDVKGWIDIPDTKVSYPIVQGETNDTYIHSDIDKKE FRAGSIFIASENKNPFTDLNTVIYGHNMKNGSMFNNIKSYTEQDFADKHPYVYIYLPDGT 
VSRYKVVAAHIIPEESLLYNTGITDIQAFYQEMLKTSDIKVDFEQAAGNPVITLSTCTSA GSESGKRNVVHAVLDRAGIDPKTETMD >gi|223714188|gb|ACDT01000027.1| GENE 45 48596 - 50092 1677 498 aa, chain + ## HITS:1 COG:SP0346 KEGG:ns NR:ns ## COG: SP0346 COG1316 # Protein_GI_number: 15900275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 83 486 84 481 481 184 31.0 4e-46 MKEKILKFITSKFFVLGIQLLATIAVVYFTFKLDLVPTKYLIAGIVVLALLLAGFFGIIY SSEQKIKKGLSSKRGIVTKIISLLTSIILIAGCTYISRGNNFIENVSNATGQEYVVSVIS LKNGKITKLKDLDSSKKIGVSYEKDTVTIAEALKDLDTEIGDHEYTKYDNYASLADALYE GKVDAIVVGEQYRTMLQTNHEEFNDVTKVLKSYAYDAKMEVTTKQTDVTENAFSIYVTGI DVYGSLKTVSRSDVNLIVTVNPKTKQILMTSIPRDCQINLHKNGKMDKLTHTGIYGTSET INTIQDLLEMKINYFARTNFSGMTNIVDALGGITVNSSEAFETLHGNYQITEGLNEMDGD KALCFVRERKRLKRGDFARGENQQKVLKAMLDKAMSPKIITNFNNILSAVEGCFETDMSD KEIKSLINMQLNDMADWKIINVQIEGEYQLMDDTFSMKGTNSDVMIPFESHIERVQELIN KVEEGKEIKDSELKGLTH >gi|223714188|gb|ACDT01000027.1| GENE 46 50180 - 51547 1326 455 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735129|ref|ZP_04565610.1| ## NR: gi|237735129|ref|ZP_04565610.1| predicted protein [Mollicutes bacterium D7] # 1 455 1 455 455 776 100.0 0 MVFNRKYYKLCFIIYMFINTLALGYLGIETNYLFVPLLIWALVIIVHDIYKKKFKLKKNY SLLMIVQGLILLLATARNDYSDLNSYVIAVMQLVIYLLIFNNPVSMSKDDIEDEVRMIIP LVNVLVGGASLISIGMYLVHFSSLANGWTLGMVGNRLFGIYFNSNPAAFLACITIVLALV AARQKFKGKYWYLANIGIQLVYILLTRCRAALIILAIIIVMVGYYFLIRRKPYSNFKRLG LVISLIVVIAGASLVGQRVVEIVPQMQGIASKETSRFQMDKVVKAGHLLIAGNWQDFNQG LTIIDEVSNGRVSLTKAALEIWHTEPVIGIGANNFKKIGSQETDALEYWAVQVVHSHNVF LEALVTTGVIGFILFVVFFFKTLLMIFNVLKKSHGKEIYFIVQMFAMIVLSEFIGSLSDY GVFYIYSLSATLAWCFLGYLYTYQNITEISNIDKI >gi|223714188|gb|ACDT01000027.1| GENE 47 51641 - 51802 172 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757574|ref|ZP_02429701.1| ## NR: gi|167757574|ref|ZP_02429701.1| hypothetical protein CLORAM_03124 [Clostridium ramosum DSM 1402] # 1 53 1 53 53 80 100.0 4e-14 MNNEEQINKVIDYIEEHLQDENLTLTTIAQEIGYLKYHLYRMFTMIAGITIYQ >gi|223714188|gb|ACDT01000027.1| GENE 48 51836 - 52012 85 
58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757575|ref|ZP_02429702.1| ## NR: gi|167757575|ref|ZP_02429702.1| hypothetical protein CLORAM_03125 [Clostridium ramosum DSM 1402] # 1 58 1 58 58 70 94.0 4e-11 MLIYEDIALISGYDTQRSFSKSFKALFKMSPSSYRKQQKLLLVQLKLGLRTVGFQLFN >gi|223714188|gb|ACDT01000027.1| GENE 49 52183 - 52614 360 143 aa, chain + ## HITS:1 COG:CAC3413 KEGG:ns NR:ns ## COG: CAC3413 COG1846 # Protein_GI_number: 15896654 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 5 142 6 141 143 86 43.0 2e-17 MKEKRIGIEIRILANLICRCLNELGFNEEHNGLTGPQGLVLGYLYDHQDKDIFQKDIEAT FNIRRSTATGLLQCLEGNGFVKRVSVDYDARLKKIVLTTKAHEFKELLESHIQKMEEILV KDLEPQEVDDLIRIIGKIKKNLE >gi|223714188|gb|ACDT01000027.1| GENE 50 52624 - 56859 4452 1411 aa, chain + ## HITS:1 COG:CAC2401_1 KEGG:ns NR:ns ## COG: CAC2401_1 COG1924 # Protein_GI_number: 15895667 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Clostridium acetobutylicum # 1 662 1 663 663 862 64.0 0 MEKKYYMGIDVGSTTIKIYITDQDDNCIYSKYERHYSDIQATIKTLINDVKEQFGNLEVT AVMTGSGGLSLSSCLDVKFVQEVIACTKTIETYIPECDVAIELGGEDAKITYFQGSLEQR MNGTCAGGTGAFIDQMASLLQTDAGGLNELAENYQMIYPIASRCGVFAKTDVQPLINEGA NKEDIAISIFQAVVNQTISGLACGKPIRGKVAFLGGPLYFLDQLRQRFIETLNLQDDEII FPDNSQLFVAMGATLLAKEEEKSTTIEALIEKLDNIDETMLASEDTLDPLFADENDRNGF RHRHYINKVRKNDIKGYVGNVYLGLDVGSTTTKAVLIDDNDSLLYSFYDSNEGNPLDVVV KIVKDIYDFIPPNVKLVRSGVTGYGEALIKAALKVDMSEVETMAHYKAATYFKEDVSFIL DIGGQDMKAIKIKDGIIQDILLNEACSSGCGSFIETFAKSLGYEVSDFAKLALESKAPID LGSRCTVFMNSKVKQAQKEGAKVEDISAGLSYSVIKNALYKVIKLRSKEELGDSILVQGG TFYNEAVLRAFEKEAGVNAIRPDIAGLMGAFGIARLARESYQGEPTTLLTKDELENFTCQ SEIRYCEKCTNHCMLTVSTFNDNSEYISGNRCERGANIPVSSKKLPNLFDYKYRRVFNYR SLTEDQAVRGTVGIPRVLNMYENYPFWHTFFTKLGFRVILSPRSSKAIYEKGIESIPSES VCYPAKLAHGHIEALIEKGIKYIFYPSLAYERKEFNNANNHYNCPVVTSYPEVIRNNVDN LENRSIIYRNPFMDLSNRKTLFNNLKDELRAFNVNDKELLSAINDAYDELAKCRQDIQDQ 
GEMVLQYLKDNKMTGIVLSGRPYHVDPEINHGLADLITAEGMAVLSEDSVCHLDKDLDQL RVVDQWTYHSRLYRAASFVATQPNLELIQLTSFGCGLDAVTSDQVADILKARHKIYTLIK IDEGSNLGAIRIRIRSLKATIEKQAKNKKQLYPKYQPLKVPFTKEMRDQGYTILCPQMSP LHFQFVETAMQESGYNLVVLPSVDKGAVDAGLKYVNNDACYPSILVTGQIMEALLSGKYD LEKTAVIISQTGGGCRATNYIAFIRKALQDAGMPQVPVISANLQGLENNPGFKLTLPLIK KVVIGAMYGDIFMRVLYRVRPYEVIPGSANDLYQSWVERCQENVKNGSIKQFRKNVYQIV KEFDELPLLDIKKPRVGLVGEILVKFHPTANNEIVEIIEREGGEAVMPDLIDFFQYCFYN TDFENEHFHASKNSARICNLAIKFVDLLRHDMIKALKKSNRFDPPASIQHLAKKASSIVS IGNQTGEGWFLTGEMIELIESGAPNIVCMQPFGCLPNHVTGKGAIKALRKAYPESNIVAI DYDPGASEVNQLNRIKLMLSTAFKNMNKKPE >gi|223714188|gb|ACDT01000027.1| GENE 51 56994 - 58313 1409 439 aa, chain + ## HITS:1 COG:lin0033 KEGG:ns NR:ns ## COG: lin0033 COG1455 # Protein_GI_number: 16799112 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 3 432 7 430 452 243 36.0 7e-64 MKISDSLTEKLLVVASKISNQKHMYAIKTAFTTLMPVIITGAFCTLIVNVVCSTETTGIS LAKVQGFSWLEMFTDLFNAANYATLNFFTIAAVVLIGLELGVKNGIKGFMTGIVAVCSFV ACLSTNIVATVGEESITVAGIAKDYTASKGLFLGMIIALLSVELFTKLCKSKYLKINMPD SVPSNVTSSFNNLFPFALTIIIFAAINYTVVQLAGLSLYDIIYTFIQKPLQAVVQGLPGV LLLMLVAQIFWCVGIHGNQIIKPVRDPLLNAAILANTDIVANPNYNPADLNIISMSFWDT YGNIGGSGCVVGLLIACLVFSKREDYKQVAKLSIAPNIFNISETLHFGLPIMLNPLLMIP FIITPLATMAFGYFMTIIGFSDILVYAFPWTTPPFINAWIASGGSIGTVITQALCIVISI AIYTPFVIIANKQKDIVSQ >gi|223714188|gb|ACDT01000027.1| GENE 52 58369 - 59160 822 263 aa, chain + ## HITS:1 COG:YPO1714 KEGG:ns NR:ns ## COG: YPO1714 COG1414 # Protein_GI_number: 16121974 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Yersinia pestis # 6 253 15 260 263 111 28.0 1e-24 MKLNRTALRTIKILEYIANQKDGCTLLEITEALDIPKSSAFDIVKTLLYKKMIVEDQNHG KLKYKMGINAFIIGSGNIERLDLVEAAKNQLIETANHFNATAFLAILDNKMVTYLYKYEA PHRIVTNANIGTRKPIYSTALGKCLLAFQRKPEVIKDILKEIDFKPLTEYTITSPQKYLK ELKRVKNIGYAVDFQEDSIYQICIAAPIFNHNKNVVAAISCTFLYDTNLRIKEIGEEMKR IALTISKKLGYIDDTGRRNIHGK >gi|223714188|gb|ACDT01000027.1| GENE 53 59150 - 59446 
312 98 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735135|ref|ZP_04565616.1| ## NR: gi|237735135|ref|ZP_04565616.1| predicted protein [Mollicutes bacterium D7] # 1 98 9 106 106 174 100.0 1e-42 MASKVKKKQNLQGLTEQQKHIIKLRNELNKPDPHQVKAFTLYKIITYVFNVLFPPYALYR IWCKKSEFTKIERYAQSVVAVTILCMFVLLQLERYKII >gi|223714188|gb|ACDT01000027.1| GENE 54 59589 - 60374 778 261 aa, chain + ## HITS:1 COG:BH2137 KEGG:ns NR:ns ## COG: BH2137 COG1414 # Protein_GI_number: 15614700 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 13 258 3 248 251 122 30.0 6e-28 MAMEMEEKMKLNRTLLRAIEILELLSRSKEGYTLTELSLIFDYPKSSVFDIIKTLVYKNM VVEDNQTGITKYKIGLASFLIGSSYLNNIDIVNIAKSNLIDFANKMNATTFMAVLDENMV TYIYKYESENSIITTANVGTRKSLHCAALGKAMLAYKSEEEINKIIDKIDFISYTYFTIK TKEKLIEELAEVRQRGYAKDDRENTLQQIAVAAPLFDHEGHVVAAISCVGFYESSIDLDD LGLLIKDVGKQISYKLGYNPQ >gi|223714188|gb|ACDT01000027.1| GENE 55 60392 - 61435 1188 347 aa, chain + ## HITS:1 COG:CAC1332 KEGG:ns NR:ns ## COG: CAC1332 COG1312 # Protein_GI_number: 15894611 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Clostridium acetobutylicum # 1 342 1 344 351 477 63.0 1e-134 MKMTFRWYGTDDPVSLEYIKQIPNITGIVSAVYDIPVGEVWPIEKIEKLKAEVNLVGLEL EVIESVPVHEDIKLKRNDYQRYIDNYKQTIRNLAKCGIKCICYNFMPVFDWTRSSVDHQL NDGSKALVYYKNEVAQLDPTKLSLPGWDTSYSVMEVSELITAYKELGEEGLWENLKYFIK EILPVAVECDVNMAIHPDDPPWSIFEIPRIITNEKKLDRFLSLYDDSHHGLTLCSGSLGC ASFNDIPLLIRKYGKMRRIHFAHIRNVKILDDGSFEESAHYSRCGSLDMVEIMRAYIEVG FEGYIRPDHGRMIWGETGKPGYGLYDRGLGAMYLGGIIDALKDDGDL >gi|223714188|gb|ACDT01000027.1| GENE 56 61603 - 62317 676 238 aa, chain + ## HITS:1 COG:SPy1599 KEGG:ns NR:ns ## COG: SPy1599 COG2723 # Protein_GI_number: 15675482 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Streptococcus pyogenes M1 GAS # 5 235 2 230 480 169 35.0 3e-42 MKKTVKLPENFFLGAAASAWQTEGWSEKKESQDSYIDLWYKENKNVWHNGYGPAVATDYY 
HRYKEDIAYMKEIGMNCYRTSLNWSRFLTDYENIVVDEEYANYYDKMLDELIAQGIEPMV CLEHYEIPAELFKKYDGFASKRVVELFVKYAQEAFKRYSHKVKYWFAFNEPVVVQTRIHL DALRYPFYQDSKAWMQWNYNKALATNMIMKVYKEGGYRIAGGKFGTIINVETAYPRGN Prediction of potential genes in microbial genomes Time: Thu May 26 09:32:12 2011 Seq name: gi|223714187|gb|ACDT01000028.1| Coprobacillus sp. D7 cont1.28, whole genome shotgun sequence Length of sequence - 6492 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 721 564 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 2 1 Op 2 . + CDS 781 - 2202 1571 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 2209 - 2250 -0.2 + Prom 2214 - 2273 4.2 3 2 Tu 1 . + CDS 2439 - 2591 73 ## gi|167757585|ref|ZP_02429712.1| hypothetical protein CLORAM_03135 - Term 2696 - 2753 8.3 4 3 Tu 1 . - CDS 2785 - 3804 908 ## COG1609 Transcriptional regulators - Prom 3839 - 3898 5.1 + Prom 3777 - 3836 8.8 5 4 Op 1 . + CDS 4005 - 4658 648 ## COG0036 Pentose-5-phosphate-3-epimerase 6 4 Op 2 2/0.000 + CDS 4674 - 4973 534 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 7 4 Op 3 . 
+ CDS 5010 - 6476 1806 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase Predicted protein(s) >gi|223714187|gb|ACDT01000028.1| GENE 1 2 - 721 564 239 aa, chain + ## HITS:1 COG:lin0583 KEGG:ns NR:ns ## COG: lin0583 COG2723 # Protein_GI_number: 16799658 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 1 230 234 463 464 243 48.0 2e-64 PRDLEAADKYDLFYNRIFLDPAILGHYEEGFFDLLKKHDILMEYTQEELEIIQNNTLDWV GINLYHPNRVKGRTTIVHPEAPFHPDFYYEEFNMPGKKMNPHRGWEIYPQIMYDMAMRMK NDYHNFEWFVAESGMGVENEKQYKNESGMIQDDYRIEFISMHLDWLLRGVTEGSNCKGYM LWAFTDNVSPMNAFKNRYGLVEIDLEDNRNRRLKKSAYFYKEIIEKRSFEIETDEEVYK >gi|223714187|gb|ACDT01000028.1| GENE 2 781 - 2202 1571 473 aa, chain + ## HITS:1 COG:ECs3572 KEGG:ns NR:ns ## COG: ECs3572 COG2723 # Protein_GI_number: 15832826 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Escherichia coli O157:H7 # 3 473 4 474 474 728 71.0 0 MGFPKGFLWGGAIAANQAEGAYLEDGKGLTTVDMIPHGEKRMYVKLGDMYPVKLIDGENY PSHEAIDFYHRYQEDIKLFAEMGFKTFRTSIAWARIFPHGDDEQPNEAGISFYLDLFKEC QKYGIEPLVTLCHFDVPMGLVEKYGSWRSREMIECFIRYARVCFERFDGLVKYWLTFNEI NILLHSPFSGAGIAFKPDENHEQVKYQAAHHELVASALATKIAHEINPLNRVGCMLAGGQ FYPYSCDPKDVWLAMNKDRENLMFIDVQARGYYPSYAKKVFKEKDINIEIQTDDLEILKK YPVDFISFSYYQSRCASADPSRGMTDGNVVKSVKNPYLETSDWGWQIDPLGLRITLNYLY DRYQKPLFIVENGLGAKDTIEEDGSILDDYRIDYLRKHIKAMKDAIDDGVDLIGYTTWGC IDLVAASTGEMSKRYGFIYVDKDDQGNGTLERRKKKSFNWYKQVIASNGERLD >gi|223714187|gb|ACDT01000028.1| GENE 3 2439 - 2591 73 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757585|ref|ZP_02429712.1| ## NR: gi|167757585|ref|ZP_02429712.1| hypothetical protein CLORAM_03135 [Clostridium ramosum DSM 1402] # 1 50 1 50 70 94 98.0 3e-18 MRCNEQSDFKYIPNYAPSVIAEFNDNHLIIKDFNTRKDFYEFCQCKIDLW >gi|223714187|gb|ACDT01000028.1| GENE 4 2785 - 3804 908 339 aa, chain - ## HITS:1 COG:RSc1014 KEGG:ns NR:ns ## COG: RSc1014 COG1609 
# Protein_GI_number: 17545733 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 8 338 3 337 347 181 34.0 2e-45 MVKKKNISIKEIAKLADVSVATVSRVINNNGRFSDKTKEKVETIIKEYGYTANIAAKSLR TSKSKTIGLIVPNIDNEWFSQLALDIENYFFKHNYSVFICNTSQNEEKEVAYFKSLDSKL VGGIICISGIETIPSDSLTRDIPIVCIDRKPKDHSNAYYVESNHYLGGYLATEELIQQGC KNIAIVSRNKSLSVNKQRLMGYRQALKDYQLQENKQLELLLDIEQANYEGAKEAINQLIK NNIPFDGVFATNDWRAYGVLVGLLENNVKVPDEVKIIGFDDIFISASSHPSLSTIKQNIP ALTKTACSLLLDLMNDIAINDEKKQFILPVEVIRRDSTS >gi|223714187|gb|ACDT01000028.1| GENE 5 4005 - 4658 648 217 aa, chain + ## HITS:1 COG:lin2808 KEGG:ns NR:ns ## COG: lin2808 COG0036 # Protein_GI_number: 16801869 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Listeria innocua # 4 204 3 203 214 167 42.0 1e-41 MEKLLCPSMMCANFGNLEKEIKELEEGGIDFFHLDVMDGSYVPNFGMGLQDIEYICKQAT KPCDVHLMVVNPSDYIEKFAALGVKIIYIHPETDKHACRTLQKIKDAGAKAGIALNPGTS FETVKELLYLCDYVMLMSVNPGFAGQKYLDFVTPKFRQFVDAASNYGGYQVMIDGACSPE KIAMLSKVGVKGFILGTSALFGKEKSYKEIITELRKL >gi|223714187|gb|ACDT01000028.1| GENE 6 4674 - 4973 534 99 aa, chain + ## HITS:1 COG:lin2472 KEGG:ns NR:ns ## COG: lin2472 COG1440 # Protein_GI_number: 16801534 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 2 99 3 101 104 95 55.0 2e-20 MNILLVCAAGMSTSLLVNRMNEAAAAKGIGINIEAHPVGSIDQFGDAADVILLGPQVRYE LKNVKAKYPNKPVEVINMQDYGMMNGNKVLDTALKLIEG >gi|223714187|gb|ACDT01000028.1| GENE 7 5010 - 6476 1806 488 aa, chain + ## HITS:1 COG:lin0017 KEGG:ns NR:ns ## COG: lin0017 COG2723 # Protein_GI_number: 16799096 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 3 488 6 477 477 434 45.0 1e-121 MSFPKGFLWGGSISAAQIEGGWNEGGKSPVLVDYCTAGSAKERRQIWYLDQEGKKVHKDW EAVDELPEGCKFQLFDDLHYTNHVASDFYHRYKEDIALLAEMGYTTFNTSISWARIYPHG 
IKGGVNQEGVEFYRSVFEECRKHGMDPVITLYKYDEPVSLLEQHGGWRNRAMIDEFVEFA RVCFTEYKDLVNKWMTFNEINIIMPLNDTPKDKAKRDLAYLHNQLVAAARATVVAHEIDK DLKVGAMICGNMNYPLTPDPKDAFAQYKRFQDFFGYSADTQIRGEYPPYAKRIWEEWNFT PEITEQDKKDLKEGKADFLGFSYYASGVITTHNEEQLDITGGNILGSVKNPYLEANAWGW QIDPLGFKHFLHILNDRYQVPLFDVENGIGLIETEGEDGICHDSARIDYHRRHIQCMKEA VEEGVNLFGYTTWGCIDLVSAGTGQMDKKYGFIYVDMDDQGNGDLHRSRKDSFYWYKKVI ASNGEDLD Prediction of potential genes in microbial genomes Time: Thu May 26 09:32:23 2011 Seq name: gi|223714186|gb|ACDT01000029.1| Coprobacillus sp. D7 cont1.29, whole genome shotgun sequence Length of sequence - 25044 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 11, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 161 - 2065 1393 ## COG3711 Transcriptional antiterminator - Prom 2152 - 2211 8.2 + Prom 2180 - 2239 9.9 2 2 Op 1 . + CDS 2316 - 3647 1542 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 3 2 Op 2 . + CDS 3710 - 4129 457 ## gi|167757593|ref|ZP_02429720.1| hypothetical protein CLORAM_03143 4 2 Op 3 . + CDS 4146 - 4610 634 ## gi|237735148|ref|ZP_04565629.1| predicted protein 5 2 Op 4 . + CDS 4647 - 5573 1007 ## COG1482 Phosphomannose isomerase 6 2 Op 5 2/0.000 + CDS 5585 - 7003 1676 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 7 2 Op 6 . + CDS 7026 - 7322 438 ## COG1440 Phosphotransferase system cellobiose-specific component IIB + Prom 7340 - 7399 6.6 8 3 Tu 1 . + CDS 7429 - 7908 587 ## COG0346 Lactoylglutathione lyase and related lyases + Term 7910 - 7949 3.0 - Term 7900 - 7934 4.0 9 4 Tu 1 . - CDS 7942 - 8718 813 ## COG1737 Transcriptional regulators - Prom 8806 - 8865 10.4 + Prom 8699 - 8758 10.3 10 5 Op 1 . 
+ CDS 8937 - 9881 1041 ## lmo0737 hypothetical protein 11 5 Op 2 1/0.000 + CDS 9914 - 11365 1597 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase + Prom 11827 - 11886 9.1 12 5 Op 3 9/0.000 + CDS 12087 - 13103 1309 ## COG0673 Predicted dehydrogenases and related proteins 13 5 Op 4 . + CDS 13103 - 14110 1293 ## COG0673 Predicted dehydrogenases and related proteins + Term 14164 - 14204 4.3 - Term 14205 - 14258 -0.1 14 6 Tu 1 . - CDS 14260 - 14913 621 ## Cphy_1046 hypothetical protein + Prom 15130 - 15189 10.6 15 7 Tu 1 . + CDS 15270 - 16352 1065 ## COG1408 Predicted phosphohydrolases - Term 16363 - 16409 4.3 16 8 Op 1 . - CDS 16434 - 17306 994 ## COG0024 Methionine aminopeptidase 17 8 Op 2 . - CDS 17381 - 17626 304 ## COG2827 Predicted endonuclease containing a URI domain - Prom 17739 - 17798 9.9 + Prom 17626 - 17685 10.5 18 9 Op 1 1/0.000 + CDS 17760 - 18659 772 ## COG0679 Predicted permeases + Prom 18663 - 18722 4.0 19 9 Op 2 . + CDS 18745 - 19761 1276 ## COG0673 Predicted dehydrogenases and related proteins + Prom 19792 - 19851 7.5 20 10 Tu 1 . + CDS 19871 - 21217 1058 ## gi|167757611|ref|ZP_02429738.1| hypothetical protein CLORAM_03161 + Prom 21238 - 21297 6.8 21 11 Op 1 . + CDS 21318 - 22553 1236 ## gi|237735166|ref|ZP_04565647.1| predicted protein 22 11 Op 2 6/0.000 + CDS 22563 - 23915 1358 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 23 11 Op 3 . 
+ CDS 23912 - 24952 1335 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes Predicted protein(s) >gi|223714186|gb|ACDT01000029.1| GENE 1 161 - 2065 1393 634 aa, chain - ## HITS:1 COG:BS_licR_1 KEGG:ns NR:ns ## COG: BS_licR_1 COG3711 # Protein_GI_number: 16080911 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 10 483 9 491 499 153 24.0 9e-37 MKLKKNELKIIQLLLASPDYISTYDIATSTGIPRRLVRDEISNVRVILSSLNLNLLSKPS KGYFIEGKSSKDLSRLQNIINDNERDENNVFPTLPKDRRNYIAKRLIEEASYIKLETFAN ELLVSRPTVANDVLLLKKDIHKYGLTLIQKPNYGISIQGSEISKRKVLADWVFENLTQSD MFGDFLDAYFNSPDYQIIEIINQHQIIMSDISLIDFLICFSIAIARNTINHTIDSPIANF EQFKDRFEYATAKELGKYVQERFKIDFNEYEIQNITILLICKRSSRGLSASHDDNVITIA NEILARIKAETLITFDTDKLFRILSLYIENTLIRQAYVEKIRTPVYENIQYEYPLPYHLA SIASATIQQHSKIKLSRSELSAFTILFNNAINNQNKKKKKVLLINCMSGSMFTLHRYRIE TELAAQLVITKYTQYYRINEEDLSNYDLIISSVPIRKQLSIPVINTSYMITNDDIIRIKS YLSYLFNDEDLVYYFHPYLYSSNVKVKSKKGVASTFYHLLTCLYPNLKDTFKYELNKQHR YTLNTFNNVIGLIKLNKPINANNNIVAITLNEPILFDQQQMQVIILFSCLDNNNIMYNTL FNTLKNVANNELDVKKLLSHLSYPEFLSVIKNNK >gi|223714186|gb|ACDT01000029.1| GENE 2 2316 - 3647 1542 443 aa, chain + ## HITS:1 COG:BS_licC KEGG:ns NR:ns ## COG: BS_licC COG1455 # Protein_GI_number: 16080909 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 14 442 10 437 452 178 30.0 2e-44 MSTEKKSFMDKVLEKVDVIAGPMTRFGQIPFVRAIVNGMVAALPVTMVGSIFLVVYLFCS DGGLTTKALIPFLKPWADDLALVNSLSMGIMAIYIIIAFGGEYAEIKGFNKTTGAVGAFF AFMLLNYDSVGNLVITAKDGTLSAGGSAFESTYWGGSGLITAMIAGAIAINIINICLKKN IVIKLPDSVPPAISDSFSAIIPYFFITIVCWGIRTLAGINIPVLVGEMLLPVLGAADNVF VYSLQQFLSALLWMCGLHGDNITGAVTNVFTNQWLADNNTAFTAGTLVKDLPYVWTPNLC RLSQWVSSCWPILIYMFMSSKKLPHLKPLATICLPPAIFCIIEPIMFGLPVVMNGFLLIP FILTHTLTGALTYWLTSIGFVGKMYMSLPWATPSPILGYLSAGGSVGGIIIVFINLAIGL VIFYPFWKAYEKAEVAKMNTAEA >gi|223714186|gb|ACDT01000029.1| GENE 3 3710 - 4129 457 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757593|ref|ZP_02429720.1| 
## NR: gi|167757593|ref|ZP_02429720.1| hypothetical protein CLORAM_03143 [Clostridium ramosum DSM 1402] # 1 139 1 139 139 257 100.0 2e-67 MECPYCHKEIPQDSAFCYHCGKEISADALKQKNKSKLKKNPRENSWAKLGILLFFIGLIG LDFIAGTIFSAVGGNVKIPYILSSFAYLGAIVCGVLSLRVDKQDRKKGFEPNGNKNYAWV SIVISGFVSLVNFSQVILK >gi|223714186|gb|ACDT01000029.1| GENE 4 4146 - 4610 634 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735148|ref|ZP_04565629.1| ## NR: gi|237735148|ref|ZP_04565629.1| predicted protein [Mollicutes bacterium D7] # 1 154 4 157 157 306 100.0 3e-82 MYCRKCGAVLKDSAKFCDSCGSEVIKVEQRSYAQKYNDNKIKQKMSKKDIERMEKHRDEK NPYIGAALFASVLALILAIVPWNYFGDGIGTSLPMRIVIVVFALLGDYHVTKAKQVNNLI YSKYGFRIKANIVSLANCLSIFVTVIGLFALFTL >gi|223714186|gb|ACDT01000029.1| GENE 5 4647 - 5573 1007 308 aa, chain + ## HITS:1 COG:lin2215 KEGG:ns NR:ns ## COG: lin2215 COG1482 # Protein_GI_number: 16801280 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Listeria innocua # 4 290 5 296 318 219 39.0 4e-57 MGVLFFKPIPRLAIWGHTLVKDYFGYHDFPDGVGQAWSFSCQKGASNICETAPYQGKTLL ELWQEHQELFGHPNEQFPVIISLVGPEDDLSIQVHPDVKAALKLGYQAGKNEAWYFIEAE PGAKIVYGHNAKDEADLTHYIKEEQWDKLIRHLEVHPDDFVYLPAGLLHALKKGSVVYEI QQATDVTFRFYDYHRKDANGNERELHLKEAIECLSYDQKEMENKLTTVMTSLENGEQTVF IDNDSFTVTKLELTGENHYYHDNYQLATVVRGSGTVDGVPIKVGDNFLIPQGNKIVFDGH MTIMMTTR >gi|223714186|gb|ACDT01000029.1| GENE 6 5585 - 7003 1676 472 aa, chain + ## HITS:1 COG:BH0596 KEGG:ns NR:ns ## COG: BH0596 COG2723 # Protein_GI_number: 15613159 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Bacillus halodurans # 3 464 8 475 477 366 40.0 1e-101 MKEEFLWGGALSNVQAEGAYLEDGKGLNVYDTLIVTPEPGIEPMFCDTKVATDHYHHYQE DIDYMAQMGYKAYRFSVVWSRIHPLGNEESPNEAGLAYYEKMVDYLLSKGIEPVVSLVHF DMPDYLLKNYNGFMNKEVINFYARHVEAVVSRLKGKVKYWLTYNEINLAYCQSDLVAGAY LPDGMSKAEMFVQLNINVQIAHARAVEVIKRVDPQAKVGGMIGHAPFYPLTCKSSDIIAA DFKNKLHNYFAFDTMCQGELPDYFKQYALNRNITISLNKADEEVIKSASTKLDYLAFSYY 
RSSVQASFDEIDDVITLEDAILFDQRNLKNPYYQANEWGWQIDSQGLRYSLIDFYHRYHK PLFIVENGIGIDEQLIDQKVYDDQRIDYYQQHIKAIKQAVEHDGVDLIGYLAWSPIDFLS SHKEIRKRYGFVYVDRDFEDLKDLKRYPKKSFYWYKKCIATNGDDLENNIEY >gi|223714186|gb|ACDT01000029.1| GENE 7 7026 - 7322 438 98 aa, chain + ## HITS:1 COG:CAC0384 KEGG:ns NR:ns ## COG: CAC0384 COG1440 # Protein_GI_number: 15893675 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Clostridium acetobutylicum # 3 98 5 102 102 97 54.0 5e-21 MFLLVCATGMSTSLLVNRMKEAAETKEIEFQIEAHPVGQIEKYGEAADVILLGPQVRYEL KNVKKMFLDKPAEIINMQDYGTMNGAKVLDTALKLGGK >gi|223714186|gb|ACDT01000029.1| GENE 8 7429 - 7908 587 159 aa, chain + ## HITS:1 COG:BMEII0064 KEGG:ns NR:ns ## COG: BMEII0064 COG0346 # Protein_GI_number: 17988408 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Brucella melitensis # 16 151 17 150 160 58 30.0 5e-09 MKFNEVMHLSFYTDQMDKMRDFYENKLGLKAKIIMRYGAYLGQKSRGAWAKKAITDPDGI AYIFIELAPGQYLELFNKADNQLEHEKPDVRLGYSHFALMVDDIFEARKELIEAGIEIDI EPNKGQSETWQMWIHDPDGNKFEIMQYTDLSLQHQGNVG >gi|223714186|gb|ACDT01000029.1| GENE 9 7942 - 8718 813 258 aa, chain - ## HITS:1 COG:PM1577 KEGG:ns NR:ns ## COG: PM1577 COG1737 # Protein_GI_number: 15603442 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pasteurella multocida # 11 255 14 265 286 73 24.0 3e-13 MIRELIDSIKLTKKEQVISEYLEQEPDCVLQYNAKELAALIKVSAPTLIRFVQKLGFKGY SDFQVTYTQEKTMYDQMKNIRIDSKSSIRDVIETLPAIYDHVFKETRKLTRFESFVRTIN YMLQAKQIDFYANDNNYSEVQSACLKLNTIGIKAQCFNTLNTAYIDNIDPRDILAFVVSH SGKNQTMVDAAYELRKKRIRVIAITGKIDPTLELVCNEALYIDSSSHHLPHHIMLYGLSI HYILDILVTSLYYKKYKE >gi|223714186|gb|ACDT01000029.1| GENE 10 8937 - 9881 1041 314 aa, chain + ## HITS:1 COG:no KEGG:lmo0737 NR:ns ## KEGG: lmo0737 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes # Pathway: not_defined # 3 314 1 310 310 353 57.0 3e-96 
MELHNQVTRLISAKASKISNYTGRELKEAIFKSEGRVLMGQTYLKNPILFPNCTSTELMF AFGADMVLLNGFDFFHPADCPGMQGFDYQELKDLVGGRPVGIYLGCPKEGLDFNNQENSQ LYDLAGMVCTKENLLKAKQWGVSFLILGGNPGSGTSIKDVIKWTKEARAICGDDMLIFAG KWEDGISEKVLGDPLADYDAKNIIKQLIEAGADIIDFPAPGSRHGITVEMIRELIEFTHR NGALAMSFLNSNVEAADVDTIRQITLMMKQTGCDVHAIGDGGFGGGTWPENIYQMAITLK GKAYTWAQMATPRR >gi|223714186|gb|ACDT01000029.1| GENE 11 9914 - 11365 1597 483 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 483 1 473 473 435 46.0 1e-121 MSFPDKFLWGGSISAAQCEGAWDEDGKSPVQVDFGDPGTTTNNRYIHYLNADGTRGKMRQ FDHLPKGAKYELFDDVRYTNHVGIDFYHRYKEDIALFAELGFTTFNTTISWARIFPHGVE GGVNQAGVEFYRNVFKECRKYGIDPVITLYKYDEPVYLEETYGDWTNREMIHQFVEFAKV CFIEYKDLVNKWLTFNEINILLHFNVKEGKQFAFEELHNQMVAAAEAVKAAHEIDEEIKV GCMIAGFCCYPYTCDPKDVLGSYKLFQEKFAYCADTMVRGYYPAYAKRIWKDNNVSLEIS EEDKKILMEGKSDFLAYSYYMSNVFTTHDVEGNLATAGGQGSLANPYLEASDWGWQIDPT GYEYFLHVLNDRYQVPLFDVENGLGAHDQVEDDGSIHDDYRINYHRSHIKSLMKAREEGV NIFGYTSWAPIDLVSFTTGQMDKRYGFIYVDMNDDGVGDLHRIKKDSFYWYQKAIKSDGK DLD >gi|223714186|gb|ACDT01000029.1| GENE 12 12087 - 13103 1309 338 aa, chain + ## HITS:1 COG:ECs4289 KEGG:ns NR:ns ## COG: ECs4289 COG0673 # Protein_GI_number: 15833543 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli O157:H7 # 1 334 2 336 345 280 45.0 2e-75 MLKVAYIGFGNSVCRYHLPYVEKRKDFIDVKYIYRLETEINEEKESWYPGIKFTSSINEI MDDPDVNLVVVNTPDRYHVEYTMQALDHGKHVLCEKPFAMTAKEAKDVFDYAKEKGLVAM ANQNRRYDADMRTVRKVIESGVLGDIVEIESHYDYFRPSIANNKGMGILYGLAVHPIDQI IGEFGKPNRVVYDCRSVDNPGVSDDYYDFDFFYDGFKAIVKTSYYVKLDYPRFIVHGKKG SFLMPALGHNSSEKPKPGAIHVSFDPLPEDKWGTLSYIDDNGNDITKKVPTEIGDYGIIY DNLNDVIVNGAHKLVADEEVIEVLKIIEEATKVAKEAK >gi|223714186|gb|ACDT01000029.1| GENE 13 13103 - 14110 1293 335 aa, chain + ## HITS:1 COG:SPy0441 KEGG:ns NR:ns ## COG: SPy0441 
COG0673 # Protein_GI_number: 15674565 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Streptococcus pyogenes M1 GAS # 1 326 1 319 319 300 45.0 3e-81 MKVGIVGAGMIVHDFLTFAHEVTGMELVALCATPAEKEKIVEMCQANNIKAHYTNIDVML EDSEVEVVYVAVPNHLHYEMCKKAILAGKHVICEKPFTSNLKELEELVALADTNKVIMVE AVSTQYLPNTLKIKELLPTLGQIKIVSANYSQYSSRYDAFKAGEVLPAFNPAMSGGALMD LNIYNINLVVALFGKPLNVNYEANIQRGIDTSGILTLDYDSFKCVCIGAKDCKAPVATNI QGDAGCITISTPANSLSGFKVLMNKGSAKQMNNEGDEVSYNNDKHRMYHEFVEFVKMIDE KDFTRAKKMQEISLITIEIATKARQSAGIEFAADK >gi|223714186|gb|ACDT01000029.1| GENE 14 14260 - 14913 621 217 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1046 NR:ns ## KEGG: Cphy_1046 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 27 216 90 296 298 78 28.0 1e-13 MKKRFLIFLLIIAPLCINAVHASEDNAKLQLLKEKHYETEGTIVTVKIPHVVNVKDDKVK KVINKLITQAIDDFTNEFKEFDKEPNTEHKLIADITFQNYYSDDKIISFSINATQIMADS YLQKKFYTVDLKTGEVYNIEHFLGSDYQNIVKKSVQQQIAENKEKYPNLMYFDEAVNNLK ITNEQPFYINKDNQVVVVFNQFEIAPGYMSLPEFIIK >gi|223714186|gb|ACDT01000029.1| GENE 15 15270 - 16352 1065 360 aa, chain + ## HITS:1 COG:CAC3027 KEGG:ns NR:ns ## COG: CAC3027 COG1408 # Protein_GI_number: 15896279 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 106 360 144 387 392 119 32.0 9e-27 MYLYMGLLAVIPGIIYFTFFINRLLRTILNIHTNWFLIIIIGLVIMVLSIPAMNTLELYG VIYYHLLVVMLLFELLNLGLKRFPIYRISFTTGILGIVVTALFLGYGYYNIKHVVATTYD LKSDKVEDLKILEIADLHMSTSLSVTELQKYCDEMSQLNADLVVLTGDIFDENTPLDDMV NASKALASINNQQGIYYVYGNHDNGSHAFSDSEFGPEDVRVTLEKNGIVVLEDAVVSLDN INIIGRKDASFWGTNPRLSTSQLLEMIPENKRGNYTIMLEHQPLNLDENAALGIDLQLSG HTHGGQLFPMGIVQSLTSDTLIRGQRDIGDFTAITTTGIAGWRYPIKTGAPSEYVIINIK >gi|223714186|gb|ACDT01000029.1| GENE 16 16434 - 17306 994 290 aa, chain - ## HITS:1 COG:CT851 KEGG:ns NR:ns ## COG: CT851 COG0024 # Protein_GI_number: 15605587 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine 
aminopeptidase # Organism: Chlamydia trachomatis # 3 289 1 287 291 240 41.0 2e-63 MKLNRNDLCWCGSQKKYKKCHLAFDEKLEQLKQQGKKIPTHKMIKTPEQIEGIKKAAVIN SGLLDYIEANIKVGMTTEEIDVMAKEYTAKYGAVCADYQYMGYPKHICVSINDVVCHGIP NEQTILQEGDIVNVDATTMLNGYYADASRMFMIGKVSNEAKKLVSDTKKALELGMASVIP YQSCVGDIGKAIETFAKAMGYSVVREFCGHGVGLAIHEDPYVFHFEPNVPTVTLVPGMVF TIEPMLNLGGREVYVDESDEWTVYTDDESLSAQWEHTLLVTEDGIEIISK >gi|223714186|gb|ACDT01000029.1| GENE 17 17381 - 17626 304 81 aa, chain - ## HITS:1 COG:BS_yazA KEGG:ns NR:ns ## COG: BS_yazA COG2827 # Protein_GI_number: 16077103 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease containing a URI domain # Organism: Bacillus subtilis # 1 79 1 81 99 80 55.0 1e-15 MKTNYVYIVKCQDNTFYTGWTTDLVKRINAHNHGKGAKYTKARRPVELVYFEEYTEKSLA LKREYAIKQMTRKEKELLINS >gi|223714186|gb|ACDT01000029.1| GENE 18 17760 - 18659 772 299 aa, chain + ## HITS:1 COG:CAC2949 KEGG:ns NR:ns ## COG: CAC2949 COG0679 # Protein_GI_number: 15896202 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 14 299 19 305 305 132 33.0 7e-31 MDLQFEIFILMIIGYILRKTNIISKEHRKSLTDLVIYIVLPANIIYSFMIKMDTQIIKSG LTILIVSIIIQFACQVFGKYFFIKATKRQQSVLQYGTICSNAGFMGSPLIQGLYGLDGLL FASIYLIPQRIVMWSGGVACFTNAKGKDVIKKVITHPCIIAVFIGLFIMISQIQLPSFLK VSIQSLSNCTMALSMIVIGGILAEIKIRDVINRLTLYYSFIRLILIPLLVLFSCAIVNLP PLVTAVATVLAGMPAGSTTAILAEKYDGDSNLAVEIVFLSTALSLFTIPLLCLVINMVV >gi|223714186|gb|ACDT01000029.1| GENE 19 18745 - 19761 1276 338 aa, chain + ## HITS:1 COG:lin0375 KEGG:ns NR:ns ## COG: lin0375 COG0673 # Protein_GI_number: 16799452 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 1 338 1 338 338 516 74.0 1e-146 MLTMGFIGNGKSTNRYQLPFVLTRKNIKVKTIYARNLNKHDWSRVAGINYTDDLDSLLND PEIQLISVCTRHDSHYEYARMVLEHGKNCLVEKPFMKNSAEAKEIFALAKEKGLLVQCYQ NRRYDSDFLTVQKVIESGKLGNLLEVEMHYDYYRPEIPLNVHEYVGYNGYLYGHGCHTLD QVISYFGKPEKIHYDVRQLLGSGRMNDYFDLDLYYGTLKVSVKSSYFRVKERPSFVVYGD 
KGCFVKATKDRQEEHLKLFYMPGSPNFGIDRPEDYGTLSYYDDAGVFHEEKVISEVGDYG NVFDGLYESIIEGKEPRVKDEQTLLQMEILETGVKMCK >gi|223714186|gb|ACDT01000029.1| GENE 20 19871 - 21217 1058 448 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757611|ref|ZP_02429738.1| ## NR: gi|167757611|ref|ZP_02429738.1| hypothetical protein CLORAM_03161 [Clostridium ramosum DSM 1402] # 1 448 1 448 448 793 100.0 0 MRCDNCGNEYKNSLKCPLCGHRQGKISHCSVCGAAIHFGQPRCPNCGNPTKYEKKTDVTK KYASSFNINNSTNNKDCDKHSEKSHVYHQQEMYDYKSSNNEIKQRLEEARQRINSMGFPI IKKKAANNEEKLRNIIVGIVVILIAGVTIFGQIFKNSNDQRDYQELSTLQLFDSNSSITQ EGNLKNGGYAFLTDSNIYFGMDYQIYQTERNFSNFKSLVDDGERYIYSTDDYLYYEQYGR YARYDMKSGELTALFEMDNVLPIDKNKFLYTKYDEEGLFIYDEVSATSIKIISDEISDYS YDMQDSLVFYTTIEHDYIQAIDLAGNKLSQFNLSSTGKIYVSGDLLYYQDYQGVHCYSIA DENDELLVEGEVNNYIVTNNTIVYTNYDDDLMTSDGHIVSIDYDVTVFNVIGNYIVYSTG NGDEYLKQWYINDFYDTAIAKLNNNEEE >gi|223714186|gb|ACDT01000029.1| GENE 21 21318 - 22553 1236 411 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735166|ref|ZP_04565647.1| ## NR: gi|237735166|ref|ZP_04565647.1| predicted protein [Mollicutes bacterium D7] # 1 411 1 411 411 728 100.0 0 MEKIKRCRKCGARLLEHEKRCPVCGTPVGLDKEVDIIKEEQIIEPEKVDEFIAADKTEQT LLKSNESAQNYWRSKKIWAVFIVLIVVTTVMRQYVINNPIKLAENDNSNTIINDYSDSKI SINKNTGKYSQATNINYLGISYVNDDAVYLMMNSELLKYDRSFNNRELVLEQAVTVFSED EQWYYYLDENNDYIRMDKKTKAEDILLKNVYYVHNLGDKVYYQNDSDGETIHCLELETNQ DHKISDEVSYSIVVDEEKGRIFYINKNNELVSIALDGSDKKNLANNTNVYTYDGEYLYYI NNDGLVKSDLEGQSKVIYESNNLSLVNLVEKKLVIQDKNIIYTMDLDGKDKKKLYTMDIG GSLTFEVVGDKLLVLTKGNSDSVIGYEIVGLDGKRHILDDENQPTIKGNEF >gi|223714186|gb|ACDT01000029.1| GENE 22 22563 - 23915 1358 450 aa, chain + ## HITS:1 COG:SA1886 KEGG:ns NR:ns ## COG: SA1886 COG0770 # Protein_GI_number: 15927657 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Staphylococcus aureus N315 # 22 445 25 450 452 298 42.0 2e-80 MKVSEIVIATGGKLISGDETMEITGFSQDSRQAEVGMMYIPIIGERYDGHDFIESAFTNG ASAIITDRLIEDDKHVVILVDNTLKALQKMAHYLRKHRKVKVVGITGSVGKTSTRDMVYS 
VVKQQYKTLKTEGNYNNNIGLPLTILRLKDEEVMILEMGMNHLNEMEELSRIACPDISAI TNVGTAHIGELGSRENILKAKLEIVAGMSDGSTLVINGDNDMLSTVRFDNFKVVKVGVDC AAVFKAEKVILMDDHSEFTINYNGKVYQVVVPVPGNHFVLNALVAIAIGINLEIPMEKCI QGISQFELTKKRMDVIELKNNITLIDGTYNASEDSMKSSIDVLATYSRRKIAVLADMLEL GEFSEQLHRSVGKYVAEKQIDVLVAVGREAKFMSDSAALSGMAEIYYCNNNQEVVNYLKN NLQNDDVVLLKGSNGMKLKDVVTKIEEKFS >gi|223714186|gb|ACDT01000029.1| GENE 23 23912 - 24952 1335 346 aa, chain + ## HITS:1 COG:CAC2895 KEGG:ns NR:ns ## COG: CAC2895 COG1181 # Protein_GI_number: 15896148 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Clostridium acetobutylicum # 1 346 1 340 343 276 44.0 3e-74 MKQKLLVLCGGQSSEHIVSRMSCTSVLNNLDANKYEITLVGIDLDGGWHYLDAKQTDLAK NTWLDNSSMVEDVYGLLKNQDVAFPVLHGMYGEDGTIQGLFELAKLPYVGCRVMGSSVAM DKIYTKKILDTVGVPQVKSVYVKKRYDEKLVVVTNTFDEIEEIEDYIVRELGMPCFIKAS RSGSSVGCYRCDNQTELIGKLSEAAKYDRHVVVEECIDCIELETAVLGNDDVIVSRVGQI MPHGEFYTFESKYEDEESKTCIPALVDEQIQEQIRQYAIKVFKAVDGHGLSRVDFFLDKK TNKIYLNEINTMPGFTKISMYPQLMNDFGITYPELLDRLIVLALQK Prediction of potential genes in microbial genomes Time: Thu May 26 09:33:38 2011 Seq name: gi|223714185|gb|ACDT01000030.1| Coprobacillus sp. D7 cont1.30, whole genome shotgun sequence Length of sequence - 49528 bp Number of predicted genes - 50, with homology - 49 Number of transcription units - 25, operones - 12 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 77 - 853 690 ## COG0789 Predicted transcriptional regulators 2 1 Op 2 . + CDS 855 - 1328 493 ## COG1131 ABC-type multidrug transport system, ATPase component 3 1 Op 3 . + CDS 1383 - 1706 274 ## gi|167757617|ref|ZP_02429744.1| hypothetical protein CLORAM_03167 4 1 Op 4 . + CDS 1703 - 2068 225 ## gi|167757618|ref|ZP_02429745.1| hypothetical protein CLORAM_03168 5 1 Op 5 . + CDS 2145 - 2423 61 ## gi|167757618|ref|ZP_02429745.1| hypothetical protein CLORAM_03168 + Term 2428 - 2468 3.3 6 2 Tu 1 . 
- CDS 2460 - 3665 878 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) - Prom 3744 - 3803 5.5 + Prom 3639 - 3698 6.3 7 3 Tu 1 . + CDS 3825 - 4448 539 ## gi|237735173|ref|ZP_04565654.1| conserved hypothetical protein + Prom 4855 - 4914 7.2 8 4 Op 1 . + CDS 4938 - 6614 2160 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases 9 4 Op 2 15/0.000 + CDS 6632 - 7141 576 ## COG0440 Acetolactate synthase, small (regulatory) subunit 10 4 Op 3 . + CDS 7162 - 8178 1296 ## COG0059 Ketol-acid reductoisomerase + Term 8179 - 8232 0.2 + Prom 8186 - 8245 2.0 11 5 Op 1 30/0.000 + CDS 8277 - 9551 1553 ## COG0065 3-isopropylmalate dehydratase large subunit 12 5 Op 2 10/0.000 + CDS 9551 - 10030 597 ## COG0066 3-isopropylmalate dehydratase small subunit 13 5 Op 3 1/0.400 + CDS 10036 - 11118 1367 ## COG0473 Isocitrate/isopropylmalate dehydrogenase 14 5 Op 4 6/0.000 + CDS 11131 - 12804 1830 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 15 5 Op 5 . + CDS 12819 - 14501 2202 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] + Prom 14535 - 14594 8.8 16 6 Tu 1 . + CDS 14623 - 15126 503 ## gi|237735182|ref|ZP_04565663.1| predicted protein + Term 15192 - 15229 2.1 - Term 15171 - 15207 -0.7 17 7 Tu 1 . - CDS 15277 - 16968 1606 ## COG4716 Myosin-crossreactive antigen - Prom 16994 - 17053 8.1 - Term 17011 - 17057 9.1 18 8 Tu 1 . - CDS 17079 - 17636 567 ## COG1309 Transcriptional regulator - Prom 17685 - 17744 9.5 + Prom 17659 - 17718 9.6 19 9 Op 1 . + CDS 17768 - 18754 835 ## COG1940 Transcriptional regulator/sugar kinase 20 9 Op 2 . + CDS 18755 - 19645 1135 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) + Term 19652 - 19682 -1.0 + Prom 19676 - 19735 10.3 21 10 Tu 1 . 
+ CDS 19756 - 20610 661 ## COG1284 Uncharacterized conserved protein + Term 20617 - 20651 4.4 22 11 Op 1 9/0.000 - CDS 20627 - 21895 413 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 23 11 Op 2 . - CDS 21892 - 22596 514 ## COG3279 Response regulator of the LytR/AlgR family - Prom 22725 - 22784 8.2 + Prom 22522 - 22581 5.7 24 12 Tu 1 . + CDS 22722 - 23294 342 ## CLK_0453 putative AIP processing-secretion protein + Term 23422 - 23471 7.5 + Prom 23304 - 23363 3.9 25 13 Tu 1 . + CDS 23567 - 23737 82 ## gi|167757639|ref|ZP_02429766.1| hypothetical protein CLORAM_03189 26 14 Op 1 . + CDS 23857 - 24003 125 ## 27 14 Op 2 2/0.000 + CDS 23990 - 24271 219 ## COG3344 Retron-type reverse transcriptase + Prom 24328 - 24387 3.7 28 14 Op 3 . + CDS 24413 - 24787 213 ## COG3344 Retron-type reverse transcriptase - Term 25141 - 25172 -0.1 29 15 Op 1 12/0.000 - CDS 25225 - 26787 1674 ## COG1732 Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) 30 15 Op 2 . - CDS 26789 - 27913 951 ## COG1125 ABC-type proline/glycine betaine transport systems, ATPase components - Prom 27933 - 27992 5.5 31 16 Tu 1 . - CDS 28020 - 28331 219 ## gi|167757646|ref|ZP_02429773.1| hypothetical protein CLORAM_03196 - Prom 28400 - 28459 7.4 + Prom 28223 - 28282 10.8 32 17 Op 1 13/0.000 + CDS 28416 - 29600 1600 ## COG0126 3-phosphoglycerate kinase 33 17 Op 2 . + CDS 29616 - 30359 1104 ## COG0149 Triosephosphate isomerase + Term 30365 - 30401 4.2 + Prom 30423 - 30482 8.0 34 18 Tu 1 . + CDS 30536 - 30919 541 ## COG2033 Desulfoferrodoxin 35 19 Tu 1 . 
- CDS 31065 - 31826 956 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 32031 - 32090 10.1 + Prom 32026 - 32085 10.4 36 20 Op 1 55/0.000 + CDS 32183 - 32596 682 ## PROTEIN SUPPORTED gi|167757651|ref|ZP_02429778.1| hypothetical protein CLORAM_03201 37 20 Op 2 43/0.000 + CDS 32664 - 33350 1133 ## PROTEIN SUPPORTED gi|167757652|ref|ZP_02429779.1| hypothetical protein CLORAM_03202 + Term 33452 - 33485 3.1 + Prom 33437 - 33496 5.8 38 20 Op 3 47/0.000 + CDS 33524 - 34120 948 ## PROTEIN SUPPORTED gi|237735200|ref|ZP_04565681.1| 50S ribosomal protein L10 39 20 Op 4 3/0.000 + CDS 34150 - 34521 585 ## PROTEIN SUPPORTED gi|167757654|ref|ZP_02429781.1| hypothetical protein CLORAM_03204 + Term 34542 - 34570 1.0 40 20 Op 5 3/0.000 + CDS 34576 - 35169 767 ## COG2813 16S RNA G1207 methylase RsmC + Term 35233 - 35271 1.2 + Prom 35196 - 35255 12.5 41 21 Op 1 58/0.000 + CDS 35331 - 39002 4162 ## COG0085 DNA-directed RNA polymerase, beta subunit/140 kD subunit 42 21 Op 2 . + CDS 39014 - 42682 3970 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit + Term 42910 - 42952 -1.0 + Prom 43127 - 43186 4.2 43 22 Op 1 . + CDS 43213 - 43617 176 ## gi|237735205|ref|ZP_04565686.1| predicted protein 44 22 Op 2 . + CDS 43611 - 43982 473 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 45 23 Tu 1 . + CDS 44039 - 44491 376 ## Cbei_2571 hypothetical protein + Term 44495 - 44535 -0.5 + Prom 44503 - 44562 10.9 46 24 Op 1 56/0.000 + CDS 44705 - 45118 714 ## PROTEIN SUPPORTED gi|167757662|ref|ZP_02429789.1| hypothetical protein CLORAM_03212 47 24 Op 2 51/0.000 + CDS 45147 - 45617 797 ## PROTEIN SUPPORTED gi|167757663|ref|ZP_02429790.1| hypothetical protein CLORAM_03213 48 24 Op 3 30/0.000 + CDS 45651 - 47726 2547 ## COG0480 Translation elongation factors (GTPases) + Prom 47728 - 47787 4.0 49 24 Op 4 . 
+ CDS 47811 - 48995 1444 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 + Term 49012 - 49045 3.1 + Prom 49042 - 49101 9.4 50 25 Tu 1 . + CDS 49145 - 49453 461 ## COG1447 Phosphotransferase system cellobiose-specific component IIA + Term 49476 - 49515 6.1 Predicted protein(s) >gi|223714185|gb|ACDT01000030.1| GENE 1 77 - 853 690 258 aa, chain + ## HITS:1 COG:DR1628 KEGG:ns NR:ns ## COG: DR1628 COG0789 # Protein_GI_number: 15807649 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Deinococcus radiodurans # 2 158 9 153 185 79 34.0 9e-15 MNEDYLTIGELAQKMDVTVRTLQYYDREGLLKPAAISKGGRRLYSTKDIVKLHQILSFKY LGFSLVEIKTKLFNLDTPQEVAAILNQQKSVIQEQIANLSEALEATIALSREVQEMNTVD FKKYAEIIELLRLGNKEYWVWKHFDNTITDHIKERFGDDPGAGLRIFSTYQEVLDKLYVL KKQGVSPDSPECFMIAKQWWEMILEFTGGNLELLPELQKFNDKKDDWNNDLAVKQKEVDN YLTAALEYYFKRIQKEQE >gi|223714185|gb|ACDT01000030.1| GENE 2 855 - 1328 493 157 aa, chain + ## HITS:1 COG:Cgl1530 KEGG:ns NR:ns ## COG: Cgl1530 COG1131 # Protein_GI_number: 19552780 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Corynebacterium glutamicum # 2 156 29 196 340 114 38.0 6e-26 MDIALQINNLSKAYGKHQVLENISFTVVKGEIFALLGTNGAGKTTTLECLEGIRRYDTGK ILINGKLGVQLQSSSLPADITAKEAIMLFAKWQDLEVTDDYFIYLGIKAFLKKQYHQLST GQKRRLHLAIALLGHPDIIVLDEPTAGLDVEGRNSIH >gi|223714185|gb|ACDT01000030.1| GENE 3 1383 - 1706 274 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757617|ref|ZP_02429744.1| ## NR: gi|167757617|ref|ZP_02429744.1| hypothetical protein CLORAM_03167 [Clostridium ramosum DSM 1402] # 1 107 177 283 283 193 100.0 3e-48 MTEVEELCDRIGVLNHGKIVFIGAPDQLHQTMQSKFKLKVRFSKVPRLNEQFEIVSQEQE YYIFETTNLEITLKAIIQLTEEQSIKIMEINTVQPKLEERFLKEVQS >gi|223714185|gb|ACDT01000030.1| GENE 4 1703 - 2068 225 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757618|ref|ZP_02429745.1| ## NR: gi|167757618|ref|ZP_02429745.1| hypothetical protein CLORAM_03168 [Clostridium ramosum DSM 1402] # 1 121 1 121 
239 198 96.0 1e-49 MRAFIYQIGLNWKLNFRSKELLVHYYVVPLVFYLFIGGVFISILPDADKTIIQVMSVFAI TMGGVLGSPYPLVEFYHSDIKKAYQVGKIPLWTIAASNFISGIMHLFVMSLIILISAPDY F >gi|223714185|gb|ACDT01000030.1| GENE 5 2145 - 2423 61 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757618|ref|ZP_02429745.1| ## NR: gi|167757618|ref|ZP_02429745.1| hypothetical protein CLORAM_03168 [Clostridium ramosum DSM 1402] # 1 92 148 239 239 150 100.0 3e-35 MVFGVFFKSAAKMGMATQLVFLPSIMLSGIMFPVAMLPNVLQVIGRILPATWGFELMCTN DFLWLNVAVQLIIFMIMLIVASFKIKQIRKED >gi|223714185|gb|ACDT01000030.1| GENE 6 2460 - 3665 878 401 aa, chain - ## HITS:1 COG:SPy0818 KEGG:ns NR:ns ## COG: SPy0818 COG2843 # Protein_GI_number: 15674859 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Streptococcus pyogenes M1 GAS # 4 349 36 370 430 123 27.0 5e-28 MSKKKLKLKRPIKIFLNFLLLLSLVTGTYLFINRKETSIKSPNKSTSTKRPRIVNASFIG DLLYEQPYYDWIGTSYNDKGYYDLVKPYFLNDDLTLANMEVPIGGKELGVSGTGYSFNAP EEIGNQVIAMGVDAVNLANNHANDAGPQGRINTLSFFKKHQILTTGIYESKEVQTQIPTK TINDITFSFLGYTYKTNKPDNQNRELIGYYRNLDTMKLDNAHKEIIKQEVAQAKQLSDIV IVSVHWGNEFTYAVNSEQKELANYLNELGVDVIIGHHSHCIQPIEWLETTNHKTLVVYSL GSFISADNQVTRATPEFANAYNVSMILQITFEKNNNSTIIKNINSLPVINYYDQKFENFK LVPIDKYNEKLEKSHNRYSKGLTKDFITNSFNNVIDQQFKQ >gi|223714185|gb|ACDT01000030.1| GENE 7 3825 - 4448 539 207 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735173|ref|ZP_04565654.1| ## NR: gi|237735173|ref|ZP_04565654.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 207 1 207 207 392 100.0 1e-108 MTNEIKLEVTCVHCLGVGEKQLYPIINRQDLEAKQNIFEEDLFLYRCPHCGHYQRISYEC MYYDEKLKYAVVLSHDGHRFLKKVKIDLTAYQLRFVKNVSELKEKIVIKENGLDDRIVEI MKHNIQTTLRSKASYQLVLRDGQGLEFVLLYPDNEVIKIFMFNKQDYLTLTKKYNRCLAN DYIVDRLFASRAVLRLNKFNNLKLNCH >gi|223714185|gb|ACDT01000030.1| GENE 8 4938 - 6614 2160 558 aa, chain + ## HITS:1 COG:CAC0273 KEGG:ns NR:ns ## COG: CAC0273 COG0119 # Protein_GI_number: 15893565 # Func_class: E Amino acid transport and metabolism # Function: 
Isopropylmalate/homocitrate/citramalate synthases # Organism: Clostridium acetobutylicum # 2 552 1 554 558 649 57.0 0 MMNYQKYKRFEPINFKERTWPDKQIEKAPTWVSVDLRDGNQALITPMKIEEKVEMFQLLV DMGFKEIEVGFPSSSQIEYDFLRILIDENLIPDDVTVQVLTQAREHLIRKTFESLKGLKR AIVHVYNSTSVLQRDVVFNRSKEEIKQIAVDGVKLVKELSREFDGEVILEYSPESFTGTE LEYALEVCEAVMDEWGATKDNKVIINLPSTVEMATPNVYADQIEWMCTHFSDRDRVIVSL HTHNDRGCGVAATELGIMAGADRVEGTLFGNGERTGNLDIVVVALNMFTQGIDPKLDMSN INNIKHVYEKVTKMEIPPRQPYVGKLVFTAFSGSHQDAINKGTHAMHERDSQIWEVPYLP IDPSDLGRQYEPIVRINSQSGKGGVAYVMESMYGFHLPKGLQVDFAKVIQDISEEEGEVS PERVYDTFIEEYVDIDEPYRFIKQKLIDISEDDSEFERRAEITIEAHGEVTTLTGYGNGP IDAVKNALNSLPEMHSHLLDYSEHALTSGSSSKAAAYVYLRAKGSARQEYGVGIHPNITT ATVKAIISGMNRLYKTLK >gi|223714185|gb|ACDT01000030.1| GENE 9 6632 - 7141 576 169 aa, chain + ## HITS:1 COG:MTH1443 KEGG:ns NR:ns ## COG: MTH1443 COG0440 # Protein_GI_number: 15679440 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanothermobacter thermautotrophicus # 5 158 10 164 168 152 49.0 2e-37 MNRLVLSLLVENNPGVLSRVAGLFSRRGYGIESLSVGKTNEPNVSRMTVVAVGDELILNQ IEKQLGKLVEVIEIFPLKPEESVYRELVLVKVEADEKQRSSLVSIADIFRARIIDVAPAS LVIEVTGDQSKIDGLLAMLEGFNVVEMVRTGLSGIKRGLGEIDIESHNK >gi|223714185|gb|ACDT01000030.1| GENE 10 7162 - 8178 1296 338 aa, chain + ## HITS:1 COG:Cj0632 KEGG:ns NR:ns ## COG: Cj0632 COG0059 # Protein_GI_number: 15791992 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Campylobacter jejuni # 4 337 5 336 340 434 62.0 1e-121 MAKIFYESDCDLSLLDGKTVAVIGYGSQGHAHALNMKESGVNVIIGLYEGSKSWKKAEEQ GFEVYTSAEAAKKADIIMILINDELQAKLYKESIEPNLEEGNMLMFAHGFNIHFGQIVPP KNVDVTMVAPKGPGHTVRSEYQVGKGVPCLVAVEQDATGRAQDVALAYALAIGGARAGVL ETDFRTETETDLFGEQAVLCGGVCALMQAGFETLVEAGYDPRNAYFECIHEMKLIVDLIY QSGFAGMRYSISNTAEYGDYITGPKIITEDTKKAMKKVLQDIQDGTFAKDFLLDMSEAGG QAHFKAMRKLASEHISEKTGEDVRKLYAWSDEDKLINN >gi|223714185|gb|ACDT01000030.1| GENE 11 8277 - 9551 1553 424 aa, chain + ## 
HITS:1 COG:CAC3173 KEGG:ns NR:ns ## COG: CAC3173 COG0065 # Protein_GI_number: 15896421 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Clostridium acetobutylicum # 3 421 5 421 422 519 64.0 1e-147 MAMTMTQKILAKHAGLDAVVAGQLIEAKLDVVMANDITGPMALPIFHQMADKVFDRDKVI LVPDHFTPNKDIKSAENSKAILDFSNDQCLTHHMEQGKCGVEHAILPEKGIVVAGECIIG ADSHTCTYGALGAFSTGVGTTDIATGMATGELWFKVPSAIKFVLTGKPSQFVSGKDIILD IITRIGVDGARYKSMEFVGEGIQHLTMDDRFTICNMAIEAGAKNGIFPVDEQTIAYMEKH SVKEYAVYEADEDAEYDEVIEIDLGKVRPTVAFPHLPGNGHTIDEIEAMDKIMIDQVVIG SCTNGRLSDLRKAAAILKGKQVAKNVRVMVVPATQKIFVQCIHEGLAEIFVEAGCAFNTP SCGPCMGGHMGVMAKGEKCVSTTNRNFVGRMGDTEALIYLASPEVAAASAIAGYIANPEK VGEE >gi|223714185|gb|ACDT01000030.1| GENE 12 9551 - 10030 597 159 aa, chain + ## HITS:1 COG:PAB0892 KEGG:ns NR:ns ## COG: PAB0892 COG0066 # Protein_GI_number: 14521550 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Pyrococcus abyssi # 2 157 6 161 164 216 61.0 1e-56 MKIYKYGDNVDTDVIIPARYLNSFDAKELASHAMVDIDPDFAKTVESGDIIVAGKNFGCG SSREHAPLCLKTAGTKCVIAESFARIFYRNSINIGFPIMECPDAVAGINQGDEVTVDFTT GLITNITKGETYQSQPFPEFLQKMIELDGLVNYVNSKKG >gi|223714185|gb|ACDT01000030.1| GENE 13 10036 - 11118 1367 360 aa, chain + ## HITS:1 COG:PA3118 KEGG:ns NR:ns ## COG: PA3118 COG0473 # Protein_GI_number: 15598314 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Pseudomonas aeruginosa # 1 360 1 357 360 412 59.0 1e-115 MEKNIAVIKGDGIGPEIVTEAMKVLDAIADKYNHKFNYTQILMGGASIDVHGVPLTGEAL AIAKNSDSVLLGSIGGDTTTSPWYKLEPNLRPEAGLLKIRKELGLFANLRPAVLYDELKG ACPLKEELTEGGFDMMIMRELTGGLYFGERSTKEENGELVARDAMSYSETEIRRIAKRAF DIAMKRNKKVTSVDKANVIDTSRLWRKVVSEVAKDYPEVVLEHMLVDNCAMQLVKDPKQF DVILTENMFGDILSDEASMVTGSIGMLASASLRVDKFGMYEPSHGSAPDIAGKNIANPIA TILSAAMMLRYSFDLDDEAKVIENAIEKVLQQGYRTSDIMSDNKTLVGTKEMGDLIVANL >gi|223714185|gb|ACDT01000030.1| GENE 14 11131 - 12804 1830 
557 aa, chain + ## HITS:1 COG:CAC3170 KEGG:ns NR:ns ## COG: CAC3170 COG0129 # Protein_GI_number: 15896418 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Clostridium acetobutylicum # 1 555 1 551 552 683 64.0 0 MKSDNVKKGMQQAPHRSLFNALGLTEEEMDKPLIGIVSSYNEIVPGHMNLDKIVEAVKMG VALAGGVPREFPAIAVCDGIAMGHIGMKYSLVTRDLIADSTEAMALAHQFDALVMVPNCD KNVPGLLMAAARINIPTIFVSGGPMLAGRVHGKKTSLSSMFEAVGSYSAGKIDEEEVRYY ENHACPTCGSCSGMYTANSMNCLTEVLGMGLRGNGTIPAVYSERIKLAKHAGMQIMELLK KDIRPRDILTKESMMNALTMDMALGCSTNSMLHLPAIAHEIGFDFNIKFANEISEKTPNL CHLAPAGPTYMEDLNEAGGIYAVMKEISKLNLLNLDCMTVTGKTVGENIKDAVNFDPEVI RTIENPYSKTGGLAVLSGNLAPDGSVVKRSAVVPEMMEHSGPARVFDCEEDAIEAIKGGK IVAGDVVVIRYEGPKGGPGMREMLNPTSAIAGMGLGSTVALITDGRFSGASRGASIGHVS PEAAVGGPIALVEEGDIIEINIPEYKINLKISDEEMAKRKAKWQPREPKVTTGYLARYAA MVTSGDRGAILEVQKNK >gi|223714185|gb|ACDT01000030.1| GENE 15 12819 - 14501 2202 560 aa, chain + ## HITS:1 COG:CAC3169 KEGG:ns NR:ns ## COG: CAC3169 COG0028 # Protein_GI_number: 15896417 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Clostridium acetobutylicum # 1 547 1 547 554 613 55.0 1e-175 MLLTGSQIVAECLIEQGVDTVFGYPGGTILNIYDALYQYQDKIHHILTSHEQGAAHAADG YARSTGKVGVCMATSGPGATNLVTGIATAYMDSTPLVAITANVAVSLLGRDSFQEIDIAG VTMPITKHNFIVKDIKDLAPTIRKAFHIASEGRPGPVLVDITKDVTAAKFDFEPIVPEVI EPRKYTYGAQELDQAIKMIEASEKPFIFVGGGAIASDAASELKEFAEKIDAPVTDSLMGK GAYDNTRPRYTGMLGMHGTKASNFGVSQCDLLIVVGARFSDRVTGDTNRFAKDAKIIHID VDAAEINKNVVVDLGIVGDAKNVLKELNEKLTRQNHPQWLKEIRDLKERYPLSYDDEGLT GPYVIEQIDYLTNSEAIICTDVGQHQMWAAQYFNYRRPRQFISSGGAGTMGFGLGAAMGA KLANPHQTVFNIAGDGCFRMNLNELATLSRYNIPVIQVVMNNQVLGMVRQWQTLFYGQRY SNTILEDKVDFCKVAEGLGCKAIKVTTKEEVASAIKIAMEHDGPVVIECMIGKDDKVFPM VAPGGAIAEAFDDTDLKNKQ 
>gi|223714185|gb|ACDT01000030.1| GENE 16 14623 - 15126 503 167 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735182|ref|ZP_04565663.1| ## NR: gi|237735182|ref|ZP_04565663.1| predicted protein [Mollicutes bacterium D7] # 1 167 1 167 167 311 100.0 9e-84 MQNVYKCVYDYDDQQIDRLTKNGSSILGLFKVIVENERFVLMPVNHLNAKDRSVYFNEIE KVKMTLSTGAATRITINRATIYQIDLNIYFHDGSLFTLEFRSFRGVVSLIKKFAELHIYV YDPMNLLEIISMDDRWAIEKYFATNLEKIKKQFQVEEMRKKDRYVKA >gi|223714185|gb|ACDT01000030.1| GENE 17 15277 - 16968 1606 563 aa, chain - ## HITS:1 COG:lin0483 KEGG:ns NR:ns ## COG: lin0483 COG4716 # Protein_GI_number: 16799558 # Func_class: S Function unknown # Function: Myosin-crossreactive antigen # Organism: Listeria innocua # 1 563 1 566 566 813 69.0 0 MKTKTKLLTGAALLAGAGAVAAKKYADNKDTVRKPVEITNQQFYLIGGGLASMAAAAYLI QDGHVDGKNIHIFEGMKILGGSNDGAGSIKDGFVCRGGRMLNEETYENFWELFDRIPSLD HPGQSVTKEILDFDHLHPTEARARLIDRHGKILDVKSMGFDNNDRLTLGKLMITPESKLD DITIEQWFKDAPHFFTTNFWYMWQTTFAFQKWSSVFELKRYMNRMIFEFPRIETLAGVTR TPYNQFESVILPIKKYLDSHHVNFVTNTTVTDIDFKDDDTITVKALYLNKDGKDEKIILN DNDICIMTNACMTDSATLGDYKTPAPKPVEKPISGELWYKVAQKKPNLGNPEPFFGNIKE TNWESITVTFKGNKFLKIIEEFSTNIPGSGALMTFKDSSWLMSMVVAAQPHFKAQDANTT IFWAYGLYTDRLGDYIKKPMKDCTGEEIFDELLYHLHLIDRKEEIKKDIINVIPCMMPYV DAQFQPRKMSDRPHVVPKGSTNFAMISQFVEIPEDMVFTEEYSVRAARIAVYTLLGINKP ICKVTPYNKDPKVLKKALETAYR >gi|223714185|gb|ACDT01000030.1| GENE 18 17079 - 17636 567 185 aa, chain - ## HITS:1 COG:lin0482 KEGG:ns NR:ns ## COG: lin0482 COG1309 # Protein_GI_number: 16799557 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 182 1 182 186 158 43.0 6e-39 MSDSLITKRALAKTLKELCQYRNFEKISINDLTNKCGLNRQTFYYHFQDKYDLLQWLYYD ELFADIENIITFDNWDQCLLTVLEKIYQEKDFYISTINTNEQYFYHDLYNLAQKCFYDAI TKLDVNKTVSPQEKNFFSEFYAYGIAGTVLRWIKTNMKNEPAKLAHGLKKIATQSETFAA SLLNN >gi|223714185|gb|ACDT01000030.1| GENE 19 17768 - 18754 835 328 aa, chain + ## HITS:1 COG:lin0520 KEGG:ns NR:ns ## COG: lin0520 COG1940 # Protein_GI_number: 16799595 # Func_class: K Transcription; G Carbohydrate transport 
and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 6 323 6 329 334 102 23.0 7e-22 MVKITELKTNNIHQVRLCFYNGEIWTKNDLAKHTGLSLAATTNILQLLLKDDEIKLVGEA QSTGGRKSKQYILNKDYYHLGNVSLKRDDQYYYFLVKITDLLGNILKEEKLISEKGSIDE LLEAISNLICDDYKVTVLALSIPGVCKDGKVGICDFEALRNCELLKLLKEHFNLDIIMDN DVNAASIGFGIHYPNANNLALMYQPKIKYVGCGILINHRLCSGYSNFAGELSYLPFMSHH EQDEMLFKAPNDLLLKQLATICCVINPEIIGVCSDVFKEFDSSQLINYLPAEHWPKIIDI DNLDQLIKDGLYSLGIEVLKNKMRKRDR >gi|223714185|gb|ACDT01000030.1| GENE 20 18755 - 19645 1135 296 aa, chain + ## HITS:1 COG:BH2283 KEGG:ns NR:ns ## COG: BH2283 COG0613 # Protein_GI_number: 15614846 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Bacillus halodurans # 6 278 8 270 290 121 30.0 2e-27 MEKYIDLHMHSKYSDDGEFTPTQLVEQCHKAGIKVMAIADHNSVKAIDEAKEAAAKLNIK YIPSVEIDCTYKGINLHVLGYGVDYHHADFVTLEGNVLNQELACSKEKVKLTNMLGFDVD QAALDALSDNGVYTGEMFGEVLLNDQRYNENSLLKPYRSGGERSDNPYVNFYWDFYAQGK PCYTEIVYPTLKETIDLIKRHGGTVVLAHPGNNLKGKFEIFDEMVELGVEGVEAFSNYHS PETVEYFYQAGKKHQILITCGSDYHGKTKPAIELGECRCTIDEHDIENQLKEYKLI >gi|223714185|gb|ACDT01000030.1| GENE 21 19756 - 20610 661 284 aa, chain + ## HITS:1 COG:lin2365 KEGG:ns NR:ns ## COG: lin2365 COG1284 # Protein_GI_number: 16801428 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 6 279 12 282 300 147 32.0 3e-35 MDIKKKKCLISFGAVFLSSLIMAFATKNLVRPAGILSGGFMGIAIMIDMIAELIGHNIPT SIGLLCLNIPVALFCAKKISPRFVFFSLLQVGLTSLFLPLMPQVPLFDDKILNVIFGGFM FGGSIVIALKGNASSGGTDFIALYVSNKSGKEIWNQVFIFNTIMLCVFGYIFGFEAAGYS ILFQFMSTKTVSNFHTRYKRVMMQIFTKNKDDVMAIYCKKFHHGITALDGMGGYSKTPVS MLTAIVSSYEVDDVVTALKEVDPKIIINVSKSEKYVGRFYNAPL >gi|223714185|gb|ACDT01000030.1| GENE 22 20627 - 21895 413 422 aa, chain - ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain 
# Organism: Clostridium acetobutylicum # 200 416 234 450 452 62 25.0 1e-09 MSYFDLIINMTESIISILFIYIIFNTRKRLLSYFIFVGINFLSISLCNYFSVPEITLILL SIVILFTYSNYLNHHKHLQNIFIALFINTLLNVATTLSICFSTLLAKFPIDSVDGYHLMA MISKPILLVLVLATASHIKKYYFLESQVLKYITISIVLLNFIYSVTVDYIFYGDNLNSTL TLLLLLIDALTICLCFVFFEAQREQYRILELQRKSMKIENLETIQSINENNYQQLRNWKH DIIHVFNAIKYQLTNKNYNEAFKIINEYNHNLNSNNFVQSSGNELLDYLLLQKVNQIKDK EIHLITSCHNTIGPLEDAHFFIIVGNLIDNAIENCDSNFNKQLWVSIGTKPEYYFISIKN SIKDSVLAANPDLNTTKVDSEHHGIGINNVRLLVNHYQGLIKFEEDEGYFIVKILIPNNP AS >gi|223714185|gb|ACDT01000030.1| GENE 23 21892 - 22596 514 234 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 232 1 232 234 66 25.0 4e-11 MYKFAIIDDNHFDQKLILSTLEKECQLHNIEYKLDVYNNSLNFDLSIFYDAIFLDIEMPK EDGFSFAERLNNYYETKIIVVTNNNDYRANAYNIHPFQFINKQNIKIEIINVCNQLFKSL KKNDKFITGYIYDRKKNIRIKDIYYIIVEDHLCYIFLYENKKSICIRDTIGNIEKSHLSN GLVRINRSTIINMLHIKDIKKNKICLKNNIELIISLGYNKHFEKHYQSYIWGLL >gi|223714185|gb|ACDT01000030.1| GENE 24 22722 - 23294 342 190 aa, chain + ## HITS:1 COG:no KEGG:CLK_0453 NR:ns ## KEGG: CLK_0453 # Name: not_defined # Def: putative AIP processing-secretion protein # Organism: C.botulinum_A3_LochMaree # Pathway: Two-component system [PATH:cbl02020] # 1 180 1 183 194 63 27.0 4e-09 MIDSLSNKLVNYLDKGNYLDEDKEIYLHGAKLIISEVIGTLLLLVLGLMTNHFIEAVVYE IVLSSTRSILGGYHCKSYAACICTYAGFFLAGVIFLNYYHFTLMTILIVCLIGSINILAL GPVDNVNKVASKKKKDVFKRYSKIVLIIYLTIIAALFHIHNGYLDIMVYIFIVINLLMLG GKIDYEKSKK >gi|223714185|gb|ACDT01000030.1| GENE 25 23567 - 23737 82 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757639|ref|ZP_02429766.1| ## NR: gi|167757639|ref|ZP_02429766.1| hypothetical protein CLORAM_03189 [Clostridium ramosum DSM 1402] # 1 56 1 56 56 97 100.0 3e-19 MEAKSWSEEHESHQAYNGDECAKQHEVIRGCGMKLLYVNVADIWKERDMTLSWEVS >gi|223714185|gb|ACDT01000030.1| GENE 26 23857 - 24003 125 48 aa, 
chain + ## HITS:0 COG:no KEGG:no NR:no MDSERNGGNGNMKTDNALLEQMLSDTNLGLAFTQVKRNKSASGADGSR >gi|223714185|gb|ACDT01000030.1| GENE 27 23990 - 24271 219 93 aa, chain + ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 1 92 83 178 470 115 59.0 3e-26 MGVDELKGYLEEIKDSIRNKTYKPELVRRVEIPKPDGGVQNLGVPTVLDRFVQQAIAQVL IPIYESIFSDNSFGFRPNRCCEMAIIKALEYMN >gi|223714185|gb|ACDT01000030.1| GENE 28 24413 - 24787 213 124 aa, chain + ## HITS:1 COG:BH0039 KEGG:ns NR:ns ## COG: BH0039 COG3344 # Protein_GI_number: 15612602 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Bacillus halodurans # 12 98 186 272 418 107 59.0 6e-24 MIDNEYRESVVGTPQGGNLSPLLSNIVLNELDKEMEARGLRFTRYADDCIILVGSSKAAD RVMENISKFIEKKRRFKVNMTKSKILKPNDIKYLGFGFYYDSFSSMWKAKPHEKTANKIE ETNK >gi|223714185|gb|ACDT01000030.1| GENE 29 25225 - 26787 1674 520 aa, chain - ## HITS:1 COG:CAC2849_2 KEGG:ns NR:ns ## COG: CAC2849_2 COG1732 # Protein_GI_number: 15896103 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) # Organism: Clostridium acetobutylicum # 247 520 1 273 273 232 47.0 2e-60 MNEFINYFSQNFSTIFGLFIDHIQLTILAIIISILIGVPLGIIITYFKPSKKPVMAIANI IQAIPSMALLGFMIPLLGIGTKPAIVMVILYSLLPIIKNTVAGLDSINSDTLEAAKGIGL TPMQVLYKVQIPLAAPVIMAGVRISAVSSVGLMTLAAFIGAGGLGYLVYAGIRTVNNAQI LAGAIPACILALLIDYIFSILEVLVTPKCNQLASPQSKGKKLIDKIVIIATCICLAGSFV YTNLGKTSDKITINIGSMDFSEQEILNYMLKYLIEKNTDVEVNQSLSLGSSSIVLDAMKT GDVDMYVDYTGTIYGSVLGLEPNSDVEAVYNTVKDEMKKQYNFTVLEPLGFNNTYTLAMS KQTADKYNIETISDLCKVSNQLIFSPTLTFVERKDCLVGLQETYPLQFKEVIPIDGSPRY TALVNNECDLVDAFSTDGLLKKFDLKVLTDDKNFFLPYNAIPIINQRIKDECPEIIELVN QLQNYLNEEVMIDLNYKVDEEKQKTKDVAYNFLLENNLIK >gi|223714185|gb|ACDT01000030.1| GENE 30 26789 - 27913 951 
374 aa, chain - ## HITS:1 COG:CAC2850 KEGG:ns NR:ns ## COG: CAC2850 COG1125 # Protein_GI_number: 15896104 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, ATPase components # Organism: Clostridium acetobutylicum # 1 372 1 372 377 379 52.0 1e-105 MIEFKNIYKSFKDKHVLEDVSFSIDKGEFVCIIGPSGCGKTTALKMINRLIKPTRGAIYV DGKDISKEDEIDLRRNIGYVIQQTGLFPHMTVKENIELIPKLKNKKDPSLATKVVELLDM VGLDPEGYMNLYPTQMSGGQQQRVGVARAFASDPEIILMDEPFSALDPITREQLQDELLT LQNKLHKTIIFVTHDMAEAIKMADRICIMSDGRVQQFDTPEQILKHPANDFVHNFVGKKR IWDSPELIKVADIMIDKPITCNVNLKCIKAVNIMYNYKVDSLMIVDNHQNFLGILDANQA AREKNRDKKVDEVMHTECLSVKPDESIVDVINLANSSNIYTLPVVDDKNKLVGLITKSTL VTTLSKKYDESEDD >gi|223714185|gb|ACDT01000030.1| GENE 31 28020 - 28331 219 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167757646|ref|ZP_02429773.1| ## NR: gi|167757646|ref|ZP_02429773.1| hypothetical protein CLORAM_03196 [Clostridium ramosum DSM 1402] # 1 103 1 103 103 200 100.0 2e-50 MENSQLISVDKLPELLKHDYTFIDLRDPLQFNKIHLRKFINIPYDTFIATPPQFPKDKPI YLICYSGKRSLDLAQKLTRCGYHAYSFNGGFYAVEHPINKQFY >gi|223714185|gb|ACDT01000030.1| GENE 32 28416 - 29600 1600 394 aa, chain + ## HITS:1 COG:lin2552 KEGG:ns NR:ns ## COG: lin2552 COG0126 # Protein_GI_number: 16801614 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Listeria innocua # 1 394 1 396 396 464 63.0 1e-131 MDKKTIKDIEVAGKVVLCRVDFNVPRNKETGEITNDNRVVAALPTINYLLEQAPKCVVLF SHLGKVKTEEDKAKNDLAVVAPCLEKHLGKPVTFVNATRGPVLEDAIKNAADGAVILVQN TRYEAGESKNDPELGKYWASLGDVFVEDAFGSVHRAHASTAGIPAHLPSAAGFLVEKEIA YIGKAVNDPERPMVAILGGAKVSDKILVIENLLKVADKVIVGGGMCYTFAKAMGHNIGNS LVEDDRIEIAKELIAKAGDKLILPIDSICSDKFAVDGDIKECGEDVPDGYMGLDIGPKSV ELFKEALQGAKTVVWNGPMGVFEMEPFAKGTIAVCEALANLDGANTIIGGGDSAAAVMQL GYADKVSHISTGGGASLEYMEGKVLPGIAAIDDK >gi|223714185|gb|ACDT01000030.1| GENE 33 29616 - 30359 1104 247 aa, chain + ## HITS:1 COG:SP1574 KEGG:ns NR:ns ## COG: SP1574 COG0149 # Protein_GI_number: 15901416 # Func_class: G Carbohydrate transport 
and metabolism # Function: Triosephosphate isomerase # Organism: Streptococcus pneumoniae TIGR4 # 2 244 3 249 252 252 53.0 4e-67 MRKPIIVGNWKMNKTIAETKAFVEAVDAKVSDSADWGIATPYLALQAAKEGTKKLLVAAE NCHFKDSGAYTGEVSVEMLKEIGVEWVILGHSERRQYFGETDETVNAKMLQVLKNDMTPI VCVGETLEEYEAGTTKNVVKTQTVAAFKDVCPKCAGRTVIAYEPVWAIGTGKTATNEIAQ DVCGYIRSVVAELYGQEVADQVRIQYGGSVKPEGLKTLLEQPDIDGALVGGASLQADSYI AMIENLG >gi|223714185|gb|ACDT01000030.1| GENE 34 30536 - 30919 541 127 aa, chain + ## HITS:1 COG:AF0833 KEGG:ns NR:ns ## COG: AF0833 COG2033 # Protein_GI_number: 11498439 # Func_class: C Energy production and conversion # Function: Desulfoferrodoxin # Organism: Archaeoglobus fulgidus # 1 123 5 124 125 93 41.0 1e-19 MKLLKCPICGNVVEMVEDHGVPLMCCGKKMEEVEAGAVDAALEKHVPVLKVEGDCLTAVV GDVLHPMTPEHLISNIWIEFADGSNKKVTLTSDDEPIAKFNIAGKSGKATVYEYCNLHGL WKTEIEL >gi|223714185|gb|ACDT01000030.1| GENE 35 31065 - 31826 956 253 aa, chain - ## HITS:1 COG:lin1921 KEGG:ns NR:ns ## COG: lin1921 COG1028 # Protein_GI_number: 16800987 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Listeria innocua # 3 248 2 244 247 140 35.0 3e-33 MKTLQNKVAIITGGGRGIGFGLAKAFAKAGANLVITGRNRNTLHNAKEKLESQYQINVLC IPADGANEDDVNNVINKTIKYYGKINIVVNNAQASKSGVMLKDHTKEEFDLAINSGLYAT FFYMRAAYPYLKESHGSVINFASGAGLFGKLGQSSYAAAKEGIRGLSRVAAAEWGPDNIN VNVICPLAMTEGLTKWKEEYPDLYEKTIQGIPMQRFADPENDIGSAAVFFASEAGHYITG ETMTIQGGSGLRP >gi|223714185|gb|ACDT01000030.1| GENE 36 32183 - 32596 682 137 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167757651|ref|ZP_02429778.1| hypothetical protein CLORAM_03201 [Clostridium ramosum DSM 1402] # 1 137 1 137 137 267 99 1e-70 MSKKVARVMKIQFPAGGAKPGPALAGAGIQMPKFCTAFNDQTRDRMGETVPVVLTIYEDK DFSFVLKTAPAAEMIKKACGIKKGSSNAGTTEVATLSAEKLKEIAEYKMPDLNAIDLESA MKIIAGTARNMGVKVEA >gi|223714185|gb|ACDT01000030.1| GENE 37 32664 - 33350 1133 228 aa, 
chain + ## PROTEIN SUPPORTED ## NR: gi|167757652|ref|ZP_02429779.1| hypothetical protein CLORAM_03202 [Clostridium ramosum DSM 1402] # 1 228 1 228 228 441 100 1e-123 MAKKSKRYQEAAKLIEKGKAYSIEEAVALVKETSKVKFDAAVDVAFRLGVDPRQADQQLR GALVLPNGTGKSKKVLVVTEGPKAQEAKDAGADVVGGKEILEDIKKGWLDFEVMIATPDM MAELGKLGRILGPKGLMPNPKTGTVTMDVAKAVKETKAGKVTYRTDKEGNVQMTIGRVSF DNDKLVENFTAIYDLLVKIKPSTSKGVYMKNIVVSSTMGPSIKIAAAK >gi|223714185|gb|ACDT01000030.1| GENE 38 33524 - 34120 948 198 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237735200|ref|ZP_04565681.1| 50S ribosomal protein L10 [Mollicutes bacterium D7] # 1 198 1 198 198 369 100 1e-101 MSAEAIKAKSALVEEIAAKLKGAQSAVVVEYRGLSVAEMTELRNNLRAEDVELKVYKNSL VQRATVATEYEGLNAELTGPNAIALGNSDAVAPARVIAKFAKDHEALVIKAAVVEGKLLT VDEVKEISKLPNREGMYSMLLGMLQAPVSKFARVVKAVAEAKPEDGSAAEEAAPVEEAAP ETPVVEETAEETKEETAE >gi|223714185|gb|ACDT01000030.1| GENE 39 34150 - 34521 585 123 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167757654|ref|ZP_02429781.1| hypothetical protein CLORAM_03204 [Clostridium ramosum DSM 1402] # 1 123 1 123 123 229 100 2e-59 MAKLTHEDILAYLEEATILELNDLVKAIEEKFDVTAAAPVAVAAAGAEEAAGEPANVTVT LTEAGGTKVAVIKAVREITGLGLVDAKGLVDKAPSVIKENIPAAEGAEIKEKLEAAGASV EVK >gi|223714185|gb|ACDT01000030.1| GENE 40 34576 - 35169 767 197 aa, chain + ## HITS:1 COG:lin0284 KEGG:ns NR:ns ## COG: lin0284 COG2813 # Protein_GI_number: 16799361 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S RNA G1207 methylase RsmC # Organism: Listeria innocua # 3 195 5 199 201 192 49.0 3e-49 MKHYYTNNEDLISEPEQFIFTYRGKELIFTSDHGVFSKKMIDFGSRVLLDAIELDEGKST LLDVGCGYGTFGVALKSAYPALEIDMIDVNERALLLAKQNLAANNLEANVYLSSVYENVT NKYDVIVTNPPIRAGKETVTKILVEAKEHLNLHGEIWVVIQKKQGAPSAKKNLESVFGNA NVVKKDKGYYILKAINK >gi|223714185|gb|ACDT01000030.1| GENE 41 35331 - 39002 4162 1223 aa, chain + ## HITS:1 COG:lin0285 KEGG:ns NR:ns ## COG: lin0285 COG0085 # Protein_GI_number: 16799362 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta subunit/140 kD subunit # Organism: Listeria 
innocua # 13 1164 16 1162 1184 1478 63.0 0 MENKITEAKSKITRRNYSRISGSLELPNLVEIQTNSYKWFKEVGIKEVFEDIYPITNFNE TLSLEFVDCTFDEAKYSVEESKDRDANYAAPIRASLRLVNSGTGEIKEQEVFMGDFPLMT DSGTFIINGAERVIVSQLVRSPGAYFADALDKSGKTVFNGSIIPSRGTWLEFETDAKDVL NVRIDRNRKMPGTVLLRALGLSSNDEIIDVFGEHEYIINTLAKDNTTNTEEALIEIYNKL RPGEPATLEGANNLLYTRFFDCKRYDLAKAGRFKFRKKLSLLDRIAGRILAEDLVDVDGN VVYPEGTVVTQEVSDTIKPILEAGAHTRELDTNPRLESNGVIQVLDVYVDDTKTKKMRVI GTDLSLDSKFVTISDMMAAYSYMFNLVDIYDALDLTAEDRVNLMARIGLLDDIDHLGNRR VRSVGELIQNQFRIGLSRMERVVKERMSLSEVDSITPQSLTNIRPLTAAIKEFFSSSQLS QFMDQINPLAELTNKRRLSALGPGGLSRDRAGYEVRDVHASHYGRICPIETPEGPNIGLI STLASYAKINQYGFIETPYRKVNNCVIDEHDVRYLTADEEKNYIIAQANVRTDKDGTILD AQVIARHLGENIMAKREEVDYIDISPKQIVSVATSCIPFLENDDATRALMGANMQRQAVP LLNPHTPFVGTGMEHQAARDSGAAVVSREDGIVTYVDAKKVIVEDNEGVEHRYRLSKFKI SNNGTCINHKPIVKTGEKVLKGQVLADGPAMEQGELALGQNVLVGFMEWNGYNYEDAVIM SERLVKDDVYTSVHIDEYAIECRDTKLGPEEITRDIPNVGDEARKNLNSDGIIMIGAEVK EGDILVGKVTPKGQAELSAEEKLLLAIFGEKSREVKDNSLRVPHGGAGIVHDIKVFERKN GDELQPGVNKVVKVYIVQKRKISEGDKMAGRHGNKGVISKILPIEDMPHLEDGTPLDIML NPLGVPSRMNIGQVLELHLGYAARQLGLYIATPAFDGLHPSDLEDIMAEAGMSKDGKQPV ISGRTGEYFDNNISVGIMYMIKLAHMVDDKLHARSVGPYSLVTQQPLGGKAQNGGQRFGE MEVWALEAYGAAYTLREILTVKSDDVVGRVKTYEAIVKGQPLPEPGLPESFRVLKKELQA LALDIRLLDENDNEVDMRNIEEEEHRFPRSIDKDEVIETPETEDELEEEILEDDLDVEEE EDFEDIDEDFDEELEEIEESESL >gi|223714185|gb|ACDT01000030.1| GENE 42 39014 - 42682 3970 1222 aa, chain + ## HITS:1 COG:BH0127 KEGG:ns NR:ns ## COG: BH0127 COG0086 # Protein_GI_number: 15612690 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Bacillus halodurans # 1 1180 1 1175 1206 1577 65.0 0 MANTNKFSAIQIGLASPEKIREWSYGEVKKPETINYRSQKPEKDGLFCEKIFGPSKDWEC SCGKYKKVRYKGVVCDRCGVEVTKSAVRRERMGHIELATPIAHIWYLKGIPSRMGLILDM SPKQLEEIIYFVSYVVIDKGSTPLEYKQVLSERDYRKCFEQFGHTFEAQIGAEAIQTLLQ QVDLDAEFAKVTKELKEAQGQKRTKLLKRLEAIEAFRTSTNEPEWMILEALPVIPPDLRP MLQLDGGRFATSDLNDLYRRVITRNNRLKKLLELGTPSIIVQNEKRMLQEAVDALIDNGR RSKPITGAGGRALKSLSHTLKGKQGRFRQNLLGKRVDYSGRSVIAVGPDLKMYQCGIPRE 
MALNLFKPFVINGLVRDQLATNIKAAERLIDKMDDRIWPIVEEVIQQHPVLLNRAPTLHR LGIQAFEPKLVEGRAIRLHPLVTPAFNADFDGDQMAVHVPLGEEAIQEARQLMLGSNNIL GPKDGKPIVTPSQDMVLGNYYLTLEDEGGLGEGTVFADRNEVDHAYFAKTVELHTRVAIK ASALKNETFTKAQNDCYLITSVGKILFNDIFDGKFPFINDPSSENLIATPDKYFVPMGTD IKEHIRNQKIIKPLNKKSLGKIIDEVFKHSAMSDTSLMLDKLKDQGFYYSTIAGTTVSVY DIQVPEAKYEIFEKADEKLEQIKKFYNKGKLTESERYQNVIKLWTDVKDEVQEVVRLEFE ADDRNPIFIMSDSGARGSLSNFTQLVGMRGLMSNPKGETIELPIKSSFREGLTASEFFIS THGARKGSTDTALKTADSGYLTRRLVDVAQEVIISEEDCGTDRGFVVTELYNNDDKSVIV PLHDRLVGRYSQKDIFHPETKELIIAGGELITEALADEIVNAGITAVEIRSVLGCNAKGG VCRKCYGRNLATGDVVELGEAVGIMAAQSIGEPGTQLTMRTFHDGGVAGGADITQGLPRI QELFEARNPKAKSIISEIEGEVTNIIDNAGRMEVVITNDLETRSYLAPYGAKIRVNVGDH VNIGAKITKGSIDPKELLSVADVEAVENYIIKEVQKVYRIQGIEISDKHIEIIVKQMLRK MKVVEGGETGCLPGTNINVNAFTELNKQVLKDGKHPAVARPVLLGITKASLETESFLSAA SFQETTKILTDASIKGKKDHLLGLKENVLIGKLLPAGTGLRGALKSPERLAREAEEAMLA SQEDMDDLIDNDDEIEMMEQAG >gi|223714185|gb|ACDT01000030.1| GENE 43 43213 - 43617 176 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237735205|ref|ZP_04565686.1| ## NR: gi|237735205|ref|ZP_04565686.1| predicted protein [Mollicutes bacterium D7] # 1 134 1 134 134 249 100.0 5e-65 MKSFYANKVCSTDKVPTAARLAPEEFFSLSSTQEFVFSSLNTSNSLQHITAMSDWFYCYY FGFLFDNLIVLTLIMQEKENPICAYQNLDQQIELIKNELIRYLLFADCGDAVFEDQVDVF AEEINKFIEEKIKC >gi|223714185|gb|ACDT01000030.1| GENE 44 43611 - 43982 473 123 aa, chain + ## HITS:1 COG:lin1080 KEGG:ns NR:ns ## COG: lin1080 COG1440 # Protein_GI_number: 16800149 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 5 123 6 127 134 63 32.0 6e-11 MLVYICCAGGATSSLFCKKIGDASKVPTTVEDIFTVLKNYDEYDNKYEIILAYGPAEFLK ERCIREYNLGEKISSIWIAPQERFMLPTIQKIFAKYNTPVAAIDMRTFGTMNGAKALADI LAL >gi|223714185|gb|ACDT01000030.1| GENE 45 44039 - 44491 376 150 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2571 NR:ns ## KEGG: Cbei_2571 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 3 84 86 163 260 74 50.0 
9e-13 MIDLNQVMTFTEAADKWGFANGNTIRKAVERNKFLPAEIRKSGDVWLTTYAAMLRVFGQP RKLDEVITYQEIAELITDAVYLHKNVDLEMNSIFRRIAGAIEKRQTITVVESRNKSERIL MVVKTRDDLEAFMNTLKRYLDSVDIKLKKE >gi|223714185|gb|ACDT01000030.1| GENE 46 44705 - 45118 714 137 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167757662|ref|ZP_02429789.1| hypothetical protein CLORAM_03212 [Clostridium ramosum DSM 1402] # 1 137 1 137 137 279 100 2e-74 MPTINQLVRQGRNDKTTKSKSPALNRGFNSLAKKPTTTNSPQKRGVCTRVATMTPKKPNS ALRKYARVRLSNGMEVTAYIPGIGHNLQEHSVVLIRGGRVKDLPGVRYHIVRGTMDCAGV NDRKQGRSRYGAKKPKA >gi|223714185|gb|ACDT01000030.1| GENE 47 45147 - 45617 797 156 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167757663|ref|ZP_02429790.1| hypothetical protein CLORAM_03213 [Clostridium ramosum DSM 1402] # 1 156 1 156 156 311 100 5e-84 MARKGQVAKRDVLPDPVYNSKTVSKLINNIMLDGKKGVAQNILYDAFKKVEEKTGNPAME VFDQAINNIMPVLELKVRRIGGANYQVPVEVSGERRMTLGLRWLVNYSRLRNENTMVDRL ANEIIDASNGAGASVKKKEDTHKMAEANKAFAHFRW >gi|223714185|gb|ACDT01000030.1| GENE 48 45651 - 47726 2547 691 aa, chain + ## HITS:1 COG:BH0131 KEGG:ns NR:ns ## COG: BH0131 COG0480 # Protein_GI_number: 15612694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Bacillus halodurans # 1 689 1 691 692 1025 72.0 0 MSREFSLEKTRNIGIMAHIDAGKTTTTERVLYYTGKIHKIGETHEGASQMDWMEQEQERG ITITSAATTAQWNGYRVNIIDTPGHVDFTVEVERSLRVLDGAVTVLDSKAGVEPQTETVW RQATTYGVPRIVFCNKMDATGADFIMSLESLEKRLGVHGVALQLPIGAEDTFEGIIDLIK MKAVYFEGTKGENVIYKDIPEEYLDQANEYHAKMLDAAASYDDDLMMKVLEEEEPTEEEV KAAIRKGVLAVELFPVLCGSAYKDKGVQPMLDAVIDFLPAPTDIPSIKGIDEDGNEIEKH ASDEEPFAALAFKIMADPFVGRLTFFRVYSGTVDSGSYVLNSTKDKKERLGRILQMHANK RNEIQTVYAGDIAAAVGFKNTTTGDTICDEKNFVILEKMEFPEPVIELAIEPKTKQDQDK LGVGLSKLAEEDPTFRTFTNPETGDTVIAGMGELHLDVIVDRLRREYKVEANVGAPQVAY RETIKTAAECEGKYVKQSGGRGQYGHVWIKFEPNEGKGFEFVDAIVGGSVPREYINSVKV GLEDALATGMIAGYPVLDVKATLFDGSYHDVDSSEMAYKVAASMALKAAGKKCDPVILEP IMAVEVTAPAEYLGSVMGDVSSRRGMIEGQEERGNAVTVQASIPLSEMFGYATDLRSFTQ GRGNYTMIFDRYEPVPKSIREEIIKKNGGNN 
>gi|223714185|gb|ACDT01000030.1| GENE 49 47811 - 48995 1444 394 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 394 1 407 407 560 67 1e-159 MAKEKFDRSKAHVNIGTIGHVDHGKTTLTAAITTVLSKDGQAQAMDYAAIDAAPEEKERG ITINTAHVEYQTATRHYAHVDCPGHADYIKNMITGAAQMDGAILVVAATDGPMPQTREHI LLSRQVGVPYIIVFLNKCDMVDDEELLDLVEMEVRELLNEYDFPGDDTPVIRGSALKALE GDPKWVPAIHELMEAVDSYIPTPTRDTDKPFLMPVEDVFTITGRGTVATGRVERGQLNLN DPLEIVGIHETKNTVATGIEMFRKLLDYAESGDNVGVLLRGVNREEIQRGQVLAKPGSVN PHKKFKSQVYILSKDEGGRHTPFFANYRPQFYFRTTDVTGVIELPEGVEMVMPGDNVELT VELIAPIAIEKGTKFSIREGGRTVGSGNISDIIE >gi|223714185|gb|ACDT01000030.1| GENE 50 49145 - 49453 461 102 aa, chain + ## HITS:1 COG:lin2833 KEGG:ns NR:ns ## COG: lin2833 COG1447 # Protein_GI_number: 16801893 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Listeria innocua # 5 99 4 98 100 90 56.0 5e-19 MDEQEQIVINLIVNSGSARSSAIEAIQYAKAGDLAKADESLQQAKETVNEAHHSQTELIQ AEIRGEKAPLNLLMVHAQDHLMTALVVIDLAQEFIDVYKKIG Prediction of potential genes in microbial genomes Time: Thu May 26 09:34:40 2011 Seq name: gi|223714184|gb|ACDT01000031.1| Coprobacillus sp. D7 cont1.31, whole genome shotgun sequence Length of sequence - 1303 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 266 130 ## gi|167755694|ref|ZP_02427821.1| hypothetical protein CLORAM_01209 2 1 Op 2 . 
+ CDS 332 - 1240 825 ## COG0726 Predicted xylanase/chitin deacetylase Predicted protein(s) >gi|223714184|gb|ACDT01000031.1| GENE 1 3 - 266 130 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755694|ref|ZP_02427821.1| ## NR: gi|167755694|ref|ZP_02427821.1| hypothetical protein CLORAM_01209 [Clostridium ramosum DSM 1402] # 1 81 54 134 465 135 97.0 6e-31 GGSISDVNVDTSKIDFSKVGKYPITYIYNNIKTTITIELQDTLKPNVEVQEVTVDLGMKI TARDVVKDIYDNSRTTVNSKKIINLIM >gi|223714184|gb|ACDT01000031.1| GENE 2 332 - 1240 825 302 aa, chain + ## HITS:1 COG:CAC3377 KEGG:ns NR:ns ## COG: CAC3377 COG0726 # Protein_GI_number: 15896619 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Clostridium acetobutylicum # 96 289 88 281 282 127 39.0 2e-29 MHVLPKDTTPPEILGIRNLSVLKDSDIDLLSSVSVKDNQDDNPTLTIDSSNLDISKVGDY QIVYFAKDRSGNMTKETCIVSVVENKTIGSFEPSDEKVVYLTFDDGPSMNTQKVLDILAV YDAKATFFVTGTNENYYDLIKKAHDQGHTIGLHTYIHEYDQIYNSSSAYFSDLKKIEDLV YSQIGSVPKYIRFPGGSSNQVSKKYCHKIMSKLTKEVVNRGYQYYDWNEDSEDGSGQLSV KQLIKNATASTEKNIMLLFHDANGKENSLKAIGPVIQYYQNKGYVFKGIDDSSFVVHHSV NN Prediction of potential genes in microbial genomes Time: Thu May 26 09:35:01 2011 Seq name: gi|223714183|gb|ACDT01000032.1| Coprobacillus sp. D7 cont1.32, whole genome shotgun sequence Length of sequence - 54497 bp Number of predicted genes - 54, with homology - 54 Number of transcription units - 19, operones - 13 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 87 - 146 6.0 1 1 Op 1 . + CDS 176 - 577 333 ## Amet_3459 hypothetical protein 2 1 Op 2 . + CDS 547 - 1113 468 ## Ccel_1994 hypothetical protein + Prom 1115 - 1174 10.3 3 2 Op 1 . + CDS 1197 - 1721 546 ## COG1670 Acetyltransferases, including N-acetylases of ribosomal proteins + Term 1751 - 1781 0.3 4 2 Op 2 . + CDS 1797 - 2648 1060 ## COG1091 dTDP-4-dehydrorhamnose reductase 5 2 Op 3 . + CDS 2650 - 3945 1330 ## COG1301 Na+/H+-dicarboxylate symporters + Term 3971 - 4005 3.0 6 3 Op 1 . 
+ CDS 4011 - 4787 687 ## COG1357 Uncharacterized low-complexity proteins 7 3 Op 2 . + CDS 4815 - 6383 1481 ## EAT1b_0787 protein of unknown function DUF975 + Term 6449 - 6486 0.0 8 4 Tu 1 . - CDS 6380 - 7033 462 ## BHWA1_00599 acetyltransferase, GT family - Prom 7055 - 7114 10.5 + Prom 7008 - 7067 8.5 9 5 Tu 1 . + CDS 7104 - 7913 773 ## COG0789 Predicted transcriptional regulators + Prom 7919 - 7978 8.2 10 6 Op 1 6/0.000 + CDS 8021 - 9133 1177 ## COG0620 Methionine synthase II (cobalamin-independent) 11 6 Op 2 . + CDS 9133 - 11400 2419 ## COG0620 Methionine synthase II (cobalamin-independent) 12 6 Op 3 3/0.000 + CDS 11444 - 12361 956 ## COG0583 Transcriptional regulator + Prom 12365 - 12424 5.0 13 6 Op 4 3/0.000 + CDS 12445 - 13263 559 ## COG2207 AraC-type DNA-binding domain-containing proteins 14 6 Op 5 . + CDS 13256 - 14146 849 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 15 6 Op 6 . + CDS 14139 - 14549 490 ## LEUM_1139 hypothetical protein 16 6 Op 7 . + CDS 14607 - 15329 636 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family - Term 15298 - 15344 9.0 17 7 Tu 1 . - CDS 15413 - 15664 319 ## gi|167755711|ref|ZP_02427838.1| hypothetical protein CLORAM_01226 - Prom 15759 - 15818 11.8 + Prom 16092 - 16151 4.6 18 8 Op 1 35/0.000 + CDS 16186 - 17649 1691 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 19 8 Op 2 13/0.000 + CDS 17649 - 18224 634 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 20 8 Op 3 21/0.000 + CDS 18217 - 19224 1144 ## COG0547 Anthranilate phosphoribosyltransferase 21 8 Op 4 9/0.000 + CDS 19238 - 20038 880 ## COG0134 Indole-3-glycerol phosphate synthase 22 8 Op 5 23/0.000 + CDS 20028 - 20624 567 ## COG0135 Phosphoribosylanthranilate isomerase 23 8 Op 6 37/0.000 + CDS 20611 - 21795 1572 ## COG0133 Tryptophan synthase beta chain 24 8 Op 7 . 
+ CDS 21788 - 22555 423 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc + Term 22669 - 22697 -0.0 + Prom 22960 - 23019 9.3 25 9 Op 1 17/0.000 + CDS 23099 - 24385 937 ## COG0168 Trk-type K+ transport systems, membrane components 26 9 Op 2 . + CDS 24396 - 25040 748 ## COG0569 K+ transport systems, NAD-binding component 27 9 Op 3 16/0.000 + CDS 25080 - 26999 1904 ## COG2205 Osmosensitive K+ channel histidine kinase 28 9 Op 4 1/0.333 + CDS 26992 - 27690 861 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 27694 - 27728 3.9 + Prom 27692 - 27751 7.8 29 10 Op 1 . + CDS 27775 - 28632 666 ## COG2510 Predicted membrane protein 30 10 Op 2 . + CDS 28645 - 29127 430 ## COG3663 G:T/U mismatch-specific DNA glycosylase - Term 28907 - 28953 3.1 31 11 Op 1 . - CDS 29165 - 29896 777 ## CKR_2202 hypothetical protein 32 11 Op 2 1/0.333 - CDS 29874 - 30623 235 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 33 11 Op 3 . - CDS 30629 - 31057 558 ## COG2002 Regulators of stationary/sporulation gene expression - Prom 31079 - 31138 5.3 - Term 31143 - 31175 2.0 34 12 Tu 1 . - CDS 31184 - 32617 1501 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase - Prom 32756 - 32815 7.0 + Prom 32690 - 32749 7.2 35 13 Tu 1 . + CDS 32769 - 34109 1300 ## COG0534 Na+-driven multidrug efflux pump + Term 34116 - 34151 -1.0 - Term 34201 - 34231 1.3 36 14 Tu 1 . - CDS 34395 - 35831 1626 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase - Prom 35918 - 35977 10.8 + Prom 35866 - 35925 8.8 37 15 Op 1 . + CDS 35968 - 36492 564 ## COG0500 SAM-dependent methyltransferases 38 15 Op 2 . 
+ CDS 36556 - 36825 369 ## COG1925 Phosphotransferase system, HPr-related proteins + Term 36841 - 36867 -1.0 + Prom 36838 - 36897 7.6 39 16 Op 1 28/0.000 + CDS 36924 - 37901 1149 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 40 16 Op 2 . + CDS 37905 - 39245 1606 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase + Term 39262 - 39305 5.5 + Prom 39310 - 39369 7.0 41 17 Op 1 . + CDS 39487 - 40269 630 ## gi|237734702|ref|ZP_04565183.1| predicted protein 42 17 Op 2 . + CDS 40272 - 41075 595 ## gi|167755737|ref|ZP_02427864.1| hypothetical protein CLORAM_01252 + Term 41152 - 41189 -0.9 + Prom 41082 - 41141 8.7 43 18 Op 1 . + CDS 41263 - 43956 3467 ## COG1472 Beta-glucosidase-related glycosidases + Term 43971 - 44013 7.7 44 18 Op 2 . + CDS 44019 - 45398 1398 ## EUBELI_01288 hypothetical protein + Term 45404 - 45451 -0.9 45 18 Op 3 31/0.000 + CDS 45475 - 46551 1125 ## COG0772 Bacterial cell division membrane protein 46 18 Op 4 2/0.000 + CDS 46580 - 47656 1266 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 47 18 Op 5 3/0.000 + CDS 47666 - 48580 1213 ## COG0812 UDP-N-acetylmuramate dehydrogenase 48 18 Op 6 25/0.000 + CDS 48584 - 49351 783 ## COG1589 Cell division septal protein + Prom 49355 - 49414 8.8 49 18 Op 7 35/0.000 + CDS 49438 - 50697 1312 ## COG0849 Actin-like ATPase involved in cell division 50 18 Op 8 . + CDS 50722 - 51816 1503 ## COG0206 Cell division GTPase 51 18 Op 9 . + CDS 51866 - 52642 1028 ## COG0345 Pyrroline-5-carboxylate reductase + Prom 52644 - 52703 2.0 52 19 Op 1 . + CDS 52725 - 53462 671 ## gi|237734713|ref|ZP_04565194.1| predicted protein 53 19 Op 2 2/0.000 + CDS 53462 - 54157 759 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 54 19 Op 3 . 
+ CDS 54206 - 54497 220 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit Predicted protein(s) >gi|223714183|gb|ACDT01000032.1| GENE 1 176 - 577 333 133 aa, chain + ## HITS:1 COG:no KEGG:Amet_3459 NR:ns ## KEGG: Amet_3459 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 20 128 23 135 326 66 33.0 3e-10 MKIENTELNLHLKDSDQRLFEGWYFKIVDCKISLAIIVGISKTIEKSCAFIQTLDTYTNQ SQMIEYSLDDFQWGKDPFYIRIKNNFFTKEQIILDLDNGLVDIQGNLKNSQYTKLETTCY APTLWDLFIIYHF >gi|223714183|gb|ACDT01000032.1| GENE 2 547 - 1113 468 188 aa, chain + ## HITS:1 COG:no KEGG:Ccel_1994 NR:ns ## KEGG: Ccel_1994 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 187 130 318 325 130 39.0 2e-29 MGPFHYLPFLECNHAIISLRHHITGSLKVNNQKFQIIGDGYIEKDWGRSFPQDYLWLQSN SCKEKEASLFLSIAKIPLLACSFQGLIMNLLVDDQQIRVATYYGARVKDMFTREGYHYLI ISQHPHTFYLKIKAGHRFELKSPQSGKMNGYVEESLNALAVLLVYKKNKKVAKFNFINCG FELFGNWL >gi|223714183|gb|ACDT01000032.1| GENE 3 1197 - 1721 546 174 aa, chain + ## HITS:1 COG:L1015 KEGG:ns NR:ns ## COG: L1015 COG1670 # Protein_GI_number: 15672772 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Acetyltransferases, including N-acetylases of ribosomal proteins # Organism: Lactococcus lactis # 2 172 14 185 187 61 26.0 8e-10 METIITERLILRSPDNNDAQTLLKIHNDMATLKFNAMTKYEFAEMIRDIEREHDMMYCIE LKNSNNVIGAVFINPDRLRYRVNALSLSYYLDCRYYRNGYMYEALQAIIIKLFEQEIDII SARVFKENTASARLLEKLGFQYEGCLRMGVTGYQEIVHDDFFYSILKDEVNLTV >gi|223714183|gb|ACDT01000032.1| GENE 4 1797 - 2648 1060 283 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 2 277 1 275 280 249 44.0 3e-66 MLKVWVVGSNGQIGTAINEVIEPLEIEVLNTDQDELDITNTDDVISFGEINRPDIIINCA AITDVYKCERNRDQAFRVNALGPRNLCIVARKIGAKVVQISTDDVFDGKSDTPYNEFDVA 
KPKTVYGCSKKAGEDYVKEFTHKHFIIRSNWIYGQNGTNFVNEFIAKSKKEDFLNVANDQ FGSPTSAKDLARAILYLMETSEYGTYHITCKGVCSRYEFAKEISKQINSHVMITPVSTKE MADEVVRPSYVVLDNFILRLVNGYQMPTWQESLQEYIKEIKED >gi|223714183|gb|ACDT01000032.1| GENE 5 2650 - 3945 1330 431 aa, chain + ## HITS:1 COG:VCA0088 KEGG:ns NR:ns ## COG: VCA0088 COG1301 # Protein_GI_number: 15600859 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Vibrio cholerae # 7 421 7 418 424 339 48.0 6e-93 MALKKAKKKIGLTTKIFIALIAGAILGIILFNLPAGSVRDDIVIDGILYVIGQGFIKLMK MLVVPLVFCSLVCGSMAIGDTKKLGSVGVRTIVFYLFTTALAITVALLIGNLINPGLGLD MSAVQSTAANVGTTESTTLVETLLNIIPDNPISALSNGNMLQIIVFALLVGVILAKLGER AEVVGNFFSQFNDIMMEMTMMIMAIAPLGVFCLIARTFAGIGFGAFLPLGKYMIAVLIAL AVQCLFVYLGLLKLFTGLNPIKFIKNFFPVMAFAFSTATSNATIPLSIDTLAKKMGVSKR ISSFTIPLGATINMDGTAIMQGVAVVFAAQAFGINLDLTDYITVIGTATLASIGTAGIPS VGLVTLTMVFNSVGLPVEAIGMIMGIDRILDMTRTAVNITGDAVCTTIVAHQNKAINRNI FDNSDIKRVSE >gi|223714183|gb|ACDT01000032.1| GENE 6 4011 - 4787 687 258 aa, chain + ## HITS:1 COG:CAC0073 KEGG:ns NR:ns ## COG: CAC0073 COG1357 # Protein_GI_number: 15893370 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Clostridium acetobutylicum # 7 256 9 262 270 169 39.0 5e-42 MNKLSGLKTDCCGCRGFCCRALFFFKSDGFPYDKPAKRNCKNLLADYRCKIHDQLEIVNY YGCINYDCLGAGQKTALFKEDDQLLFEVFEVMVQLHEILWYLYVGKNFVQSQRLNEQLET YYQMIESLTYLDAEEILALDLNKYRQLVNPLLQEVYCQMNKDIQIKPLQARKTVLNRADF MGCDLHEKDLRGQDFKGAFLIGANLSNCDLCGVNFLGSDVRKMDIRNSDLSKSLFLTPAQ ISACVINRYTKLPEYIGY >gi|223714183|gb|ACDT01000032.1| GENE 7 4815 - 6383 1481 522 aa, chain + ## HITS:1 COG:no KEGG:EAT1b_0787 NR:ns ## KEGG: EAT1b_0787 # Name: not_defined # Def: protein of unknown function DUF975 # Organism: Exiguobacterium_AT1b # Pathway: not_defined # 90 210 81 202 223 96 40.0 3e-18 MDRVGIKLWAKEKTRSNKWNIWKGFLAIFAASLGLSFIALLLFSVMIDIGGSSYYDTFTF MDLAVTVGVFALYFLVIGFSVNIYRYIKKIVQEDIADLNELRAPIGQYFKQGFGVLAVGL ICVLGTLAFVVPGIILGLGLSMTPYLLANYPSLSIFEAITTSWKMMQGKKMKCFVLFFSF 
YGWILLSTVTLGILFIWLLPYMTLTFNKFFLENENEFYGVVNSNNNVSSNDIEEFALLNN LKLDTRNNVYGMFNDYPVVVMFGSANNDLIITIDTIGEDRTALDEYFNNLRTILTNIKTI NYSNGTITLSALRSEELHEVYEILNRLTLKLRDLGFLPSCGTCHINKPTSFYEYHGQLMN MCDDCRDAISTQAEEIVEDSSRGIIGAVAGALIGGVLWALVYQIGFIVAFLGYLIVFLAI NGYQRMAGKISKKGLIISIICSVLVLFFAESISLGLKIKDLMQLQSIFTAFSYIPYFLSF SEIFAIVVRDIVLGLIFMGLGSWQYIYKIKKSLDEENLNKLD >gi|223714183|gb|ACDT01000032.1| GENE 8 6380 - 7033 462 217 aa, chain - ## HITS:1 COG:no KEGG:BHWA1_00599 NR:ns ## KEGG: BHWA1_00599 # Name: not_defined # Def: acetyltransferase, GT family # Organism: B.hyodysenteriae # Pathway: not_defined # 1 205 1 203 206 114 34.0 2e-24 MKNQIIFRPIIKKDYLAIEKIIREAWHYDEFCTSKIAKLLSKIFLSSCLSNQTFTLVALL KEQPIGIIMGKNIKTHKCPFKYRVKQGFNIFNLLIRKEGRKTAKIFKAVNNIDKVLLQQT NQDYQGELSFFAIDAQYRGLGLGKELFNLLISYMQEQQINRFYLYTDTSCNYHFYEHLGM MRRVEQTHCFKINNEKNVMHFFIYDYLCKKTVSTETV >gi|223714183|gb|ACDT01000032.1| GENE 9 7104 - 7913 773 269 aa, chain + ## HITS:1 COG:BS_bltR_1 KEGG:ns NR:ns ## COG: BS_bltR_1 COG0789 # Protein_GI_number: 16079711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 9 124 9 123 124 72 36.0 7e-13 MLKNESLKLTTAQFAKLHKINKRTLHYYDEIDLFKPKFKGENNYRYYDYFQSIELENILM LKQLDMSISEIKSYLNNPNVNDFVNIADDKILKIEQDIRRLKQTKKVLEIKKNQLLKSSR VTDFEIEIVERQDEYLLVSNEPFVQYDVKEILEYLQQAWNIEQYKVGCGSYISIDKIKNN DFEHYDGLFISLQNKRYGKNVLLQSKGKYLCGYVKGDWDKIAVLYRSMLNFAKKENLKLI GYAFERGLNEFAINSIDEYFTEISIKISE >gi|223714183|gb|ACDT01000032.1| GENE 10 8021 - 9133 1177 370 aa, chain + ## HITS:1 COG:lin0838 KEGG:ns NR:ns ## COG: lin0838 COG0620 # Protein_GI_number: 16799912 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Listeria innocua # 1 370 1 367 367 387 53.0 1e-107 MNKIPYRYDIVGSFLRSERLKHARENFNNNLISKNELKKIENEEILKLVKKEEAAGLKAV SDGEFRRSYWHLDFLAGLTGVKKIEAEKWSVDFKGVQPKAATLKIIDKIDFEDHEFLEHF EYLNSIIENGVTAKMTIPAPTMLHLIACVRTKEYQPIARYQDNEQLIVDLAMAYQKIIQA 
FYDRGCRYLQLDDTSWGEFCSKEKREEYANCGIDVGALVKKYVYLINLSIVNKPDDMNIT IHICRGNFRSTWFSSGGYEPVAKELFGNCKVDEFFLEYDSDRSGDFTPLRFIKDQVVVLG LITSKFSELEDKEMIKARINEASQFVALDQLCLSTQCGFASTEEGNNLTEDQQWAKIALV KEIAEEVWGK >gi|223714183|gb|ACDT01000032.1| GENE 11 9133 - 11400 2419 755 aa, chain + ## HITS:1 COG:Cj1201 KEGG:ns NR:ns ## COG: Cj1201 COG0620 # Protein_GI_number: 15792525 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Campylobacter jejuni # 1 754 1 754 754 943 61.0 0 MKTAVIGYPRIGSNRELKFATERYLKGQIEQSELLQIAQNLRKEHWVEQDNQKIDYISSN DFSFYDNVLDTAYLLNIIPKRYQELQLNSLDQYFAMARGYQGENGDVKALAMKKWFNTNY HYLVPEVDDQVEISLNGTKIFDEFQEALDLGIKTKPVIVGPYTLMKLLRFTGTLTYQDIL DDLINGYLEVIKKLRDLGASWIEVDEPALVFDLSDKDIILFKEIYTKILDNKGSVKILLQ TYFGDIRDCYQTVLNLDFDGIGLDFIEGKENLTLLDRYGFDQNKILFAGVVNGKNIWKNE YEKTVSLIEKIKNNVNKIVLNTSCSLIHVPYTIENERNLEQIYKNNFSFAQEKLKELFQL KVIMRIGKGHSYYVDNLAFFKEPNNRTNEAVQARIAALKKEDFIRLPEFEVRRQIQKEKF KLPILPTTTIGSFPQTKEVRANRRAYRSGDIDEQTYAEFNHQMIKEWIKYQEEIDLDVLV HGEFERNDMVEYFGENLDGYLFTENGWVQSYGTRCVKPPVIWGDISRVRPLTVDYAVYTQ ALTKKPVKGMLTGPVTMLNWSFPRIDISLAETTLQLALAIKDEVLDLENHGIKIIQIDEA ALREKLPLRKRDWQSEYLDWAIPAYRLVHSSVQADTQIHSHMCYSEFDDIIQAIEDMDSD VISFEASRSDLTLIDTLNRVNFKTEIGPGVYDIHSPRIPSVEELERVLTSMLEKLSASKL WINPDCGLKTRAYPETLASLKNMVEATKNIRAKLQ >gi|223714183|gb|ACDT01000032.1| GENE 12 11444 - 12361 956 305 aa, chain + ## HITS:1 COG:SP0676 KEGG:ns NR:ns ## COG: SP0676 COG0583 # Protein_GI_number: 15900577 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 1 299 21 319 322 330 56.0 2e-90 MTLQQLRYVITVAQKGSISEAAKELFISQPSLSNAIKELEKEMKIIIFSRTNKGIIISDE GLEFLGYARQVLEQAELLENRYLETKSTKKRFNISTQHYSFAVNAFVDLVKKYSDNEYEF TIRETRTYEIIEDVKTLKSEIGILYINDFNQKVINKLLKENNLIFNQLFIAKPHIFVSNT NPLAQKDIVTLEDIEPFPYLSFEQGQYNSFYFSEEILSTLVHQKIIKVSDRATLFNLLIG LDGYTISTGIISEELNGKNIIAIPLAVDETIKIGYIVRNDTARSYLGQIFIEALKQNTQE LMQGV >gi|223714183|gb|ACDT01000032.1| GENE 13 12445 - 13263 559 272 aa, chain + ## 
HITS:1 COG:BS_ybfI KEGG:ns NR:ns ## COG: BS_ybfI COG2207 # Protein_GI_number: 16077291 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 2 267 4 270 275 261 51.0 1e-69 METRTIVYDEQLQIEAYQFVGIMQKFPNHFHDCYVVGFVEKGNRHLTCRGDEYDMEPGDM LLLNPRDNHTCESRDGQPLDYRCLNIDADVMVRAVKEVIGCSYLPNFEKPVAFQSEMVES LRTLHQSVMKKETEFLKEELFYILIEQLVKEYTNNQPLLKSRFSEPIKLVKKHLDDNFIN QISLDSLSELAQMNKYTLIRNFARSFGITPYQYLETIRVNHAKELLEQGLSPLEAAMLSG FSDQSHFTRFFKSLIGLTPKQYQNIFKGKEDE >gi|223714183|gb|ACDT01000032.1| GENE 14 13256 - 14146 849 296 aa, chain + ## HITS:1 COG:BS_ybfH KEGG:ns NR:ns ## COG: BS_ybfH COG0697 # Protein_GI_number: 16077290 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 1 288 3 290 306 268 54.0 8e-72 MNKDFSGHLAALLTIFIWGTTFISTKILLTDFKPIEILFFRFLIGLFILILIYPKRLRKT TKKQELTFALTGLCGVTLYYLLENIALTYSMASNIGVIISIAPFFTAILSHFILKEETLN QNFIIGFMAAMVGIMLISFNGSSNFKLNPLGDILALLAALVWAVYSILTKRISDYGYSTI QVTRRTFMYGVTFMFLTLIPLGFKLDLGRFSNPIYLGNILFLGLGASALCFVTWNMAVKI LGAVKTSIYIYIVPVVTVVTSIIVLHEQITLMAFIGTVLTLIGLFLSQKREVQKNG >gi|223714183|gb|ACDT01000032.1| GENE 15 14139 - 14549 490 136 aa, chain + ## HITS:1 COG:no KEGG:LEUM_1139 NR:ns ## KEGG: LEUM_1139 # Name: not_defined # Def: hypothetical protein # Organism: L.mesenteroides # Pathway: not_defined # 1 136 1 135 135 132 47.0 4e-30 MDRKCVMIVDKNLPLGLIANTTAILGTALGKLEGEIVGTDVYDMKEHIHRGIVTISIPVL KGDEFLIRELLKQANYYPNEVLVIDFCDLAQSCRDYSEYIDKMKLASEASLKYFGICLYG TRTRINKLTGNLSLLK >gi|223714183|gb|ACDT01000032.1| GENE 16 14607 - 15329 636 240 aa, chain + ## HITS:1 COG:CAC0509 KEGG:ns NR:ns ## COG: CAC0509 COG1387 # Protein_GI_number: 15893800 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: 
Clostridium acetobutylicum # 6 237 5 235 244 187 42.0 2e-47 MENIELDVHTHTIASGHAYSSLSEMVAGAKEKDIKLLGIVEHDQGIPGTCAPIYFKNLDV IPREIEGIKLLLGIEINILDHQGRLSITEDLYDYVDYCMAGIHLHCYEAGTIEENTRAVI ETIKNPHISVIVHPDDGKCPLDYEKVVLAAKKYHTLLEVNNNALRSPSRLNSRENTLKML QLCKQHRVKIILGSDAHIHFDIKNYDQIEELLKEVEFPKELIVNYHIEDFLEYIKKQVRY >gi|223714183|gb|ACDT01000032.1| GENE 17 15413 - 15664 319 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755711|ref|ZP_02427838.1| ## NR: gi|167755711|ref|ZP_02427838.1| hypothetical protein CLORAM_01226 [Clostridium ramosum DSM 1402] # 1 83 1 83 83 117 100.0 2e-25 MNEMYVISVLMLYTLIILINTLLNTYRLLTTTEMVEEKEFIYPKSSIYHRVHTYVAPMIL RINMKKDYVLNKKRRPPIILMAH >gi|223714183|gb|ACDT01000032.1| GENE 18 16186 - 17649 1691 487 aa, chain + ## HITS:1 COG:PA0609 KEGG:ns NR:ns ## COG: PA0609 COG0147 # Protein_GI_number: 15595806 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Pseudomonas aeruginosa # 18 487 14 492 492 359 41.0 8e-99 MYYPTLEQIKEIIKTGDYNLVPVCKEMYGDSITPIEVMRILKKHSKHCYMLESVDDTKRW SRYTFMGYDPKSEITCLNNLMNVDGKEFMVTHPKDYLKKLLKQYQSPKIAGLPTFTGGLV GYFAYDYFKYSEPVLNLKSDDKEGFNDVDLMLFDQVICFDNYKQKIVLIANVAVDDLEKS YHEALVKLEKMQALIISGVKAKIEPLRIKSDFKALFEQNEYCNMVERAKKYIKEGDIFQV VLSNCLEAEVTGSLFDTYRVLRTTNPSPYMFYFFSDDQEFAGASPETLVKLEDKNLYTYP LAGTRPRGMNENEDVNFEVSLLSDQKELAEHNMLVDLGRNDIGRISKFGTVKVEKYQEIL RFSHVMHIGSTVSGVILENKDAIDAIDALLPAGTLSGAPKIRACQIINELENNRRGIYGG AIGYIDFTGNLDTCIAIRLVYKKNEKVYVRSGAGIVYDSVPEIEYQECLNKAQAVISALN TANGGID >gi|223714183|gb|ACDT01000032.1| GENE 19 17649 - 18224 634 191 aa, chain + ## HITS:1 COG:TM0141_1 KEGG:ns NR:ns ## COG: TM0141_1 COG0512 # Protein_GI_number: 15642915 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Thermotoga maritima # 2 189 47 236 246 207 49.0 7e-54 MILLVDNYDSFAYNLYQLIGSIDPNIKVIRNDEMTVDEIEKIDPNTIVISPGPGKPSEAG 
NIEEIIKYFYNKKPILGVCLGHQAICEAFGSTITYAKELMHGKSSIIKMTDDLIFQGLKQ QTKVARYHSLAVKRETLAKVLKAIALSEDGEVMAVKHIDYPVYGLQFHPESILTDDGKKM IENFLGVKEND >gi|223714183|gb|ACDT01000032.1| GENE 20 18217 - 19224 1144 335 aa, chain + ## HITS:1 COG:CAC3161 KEGG:ns NR:ns ## COG: CAC3161 COG0547 # Protein_GI_number: 15896409 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 330 1 331 331 306 50.0 3e-83 MIKEAILKLSKKVDLTAGEARSCMNEIMNGEASDVQMSSYLTALSLKGETIDEITGSAAG MREHCIKLLNDKDVLEIVGTGGDGSNSFNISTTSSLVIAASGVKVAKHGNRAASSKCGAA DVLEALGVKIDVTPEKSATILNDINICFLFAQNYHIAMKYVAPVRRELGIRTVFNILGPL TNPAGANMQVMGVYEEALVEPLAEVLTNLGVKRAMVVYGQDSLDEISLSAPTSVCEVKDG TYTRYTITPEQFGLTRCCKEELVGGTPKDNAKITRDILSGVKGPKRDAVLLNAGAAIYIA GRTKTMQEGIELAARLIDDGKALAKLEEFINKTNM >gi|223714183|gb|ACDT01000032.1| GENE 21 19238 - 20038 880 266 aa, chain + ## HITS:1 COG:XF0213 KEGG:ns NR:ns ## COG: XF0213 COG0134 # Protein_GI_number: 15836818 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Xylella fastidiosa 9a5c # 2 246 4 250 264 178 39.0 9e-45 MIIDDIVERTLERVAENKKRNSIEKLAKLAFDKPINKDYPFERALRRAGISYIMEIKRAS PSKGVIAPNFDYKSIAKEYEDIGAAAISVVTEPDFFKGDDDFLAEIAKIVKIPVLRKDFV VDEYMIYEAKLLGADAVLLICSILDEITLMRCLNLAERLGMSALVEAHSSMQVKKALRVG AKIIGVNNRDLRNFEVDLNNSIELRSMVPENIIFVSESGIKTYDDIKTLEENNVDAVLVG ETLMRSHDKRKTFEILQGLRKPDNAN >gi|223714183|gb|ACDT01000032.1| GENE 22 20028 - 20624 567 198 aa, chain + ## HITS:1 COG:CAC3159 KEGG:ns NR:ns ## COG: CAC3159 COG0135 # Protein_GI_number: 15896407 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylanthranilate isomerase # Organism: Clostridium acetobutylicum # 1 194 2 202 205 145 43.0 4e-35 MQIKICGLFQVEDIDYVNEAKPDYVGFVFAKSKRQVDIHQAEKLKNKLDTNIKAVGVFVD EQISEITAIVKMGIIDLIQLHGHEDNAYIKQLKQSVQMPIIKAIKVIEKDDLNNLDYECD YYLLDSKISGSGKSFDWSLIKDLDKPFFLAGGIDLDNLDEAMSKADYGIDVSSGVETNGI KDRNKIIEIVRRTKNGNR >gi|223714183|gb|ACDT01000032.1| GENE 
23 20611 - 21795 1572 394 aa, chain + ## HITS:1 COG:CAC3158 KEGG:ns NR:ns ## COG: CAC3158 COG0133 # Protein_GI_number: 15896406 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Clostridium acetobutylicum # 4 386 3 385 394 502 63.0 1e-142 METGRYGIHGGQYIPETLMNEIHNVEKAYEFYKNDPEFNRELEKLLKEYAGRPSLLYYAK KMTEDLGGAKVYLKREDLNHTGSHKINNVLGQVLMAKKMGKTRVIAETGAGQHGVATATA AALLGLECEIFMGKEDTDRQALNVYRMELLGATVHSVTSGTMTLKDAVNETMREWTKRVD DTLYVLGSVMGPHPFPMIVRDFQSIISKEAREQILEDEGKLPTAVMACVGGGSNAMGMFY NFINDQEVQLIGCEAAGKGVDTALTAATINTGSLGVFHGMKSYFCQDEYGQIAPVYSISA GLDYPGIGPEHANLHDIGRAQYVPVSDDEAVAAFEYLARTEGIICAIESAHAVAHARKIV PSMDKEDIVIICLSGRGDKDVAAIARYQGVDIHE >gi|223714183|gb|ACDT01000032.1| GENE 24 21788 - 22555 423 255 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. AzwK-3b] # 1 235 1 242 263 167 37 1e-40 MSRISEAFNNKKAFIGFLTAGDPYQKMTVEYILEMEKAGASLIEIGIPFSDPIAEGPVIQ EANIRALSKGMTTDQVFEIVAEVRKHSNIPLCLMTYLNPVFHYGYDEFFKKCQETGVDGI IIPDCPYEESKEVSDICYKYDVDLISMIAPTSKERTLEIAAQAKGFIYLVSSMGVTGMRS NIVTDIEGIVESIKTVTDTPVAVGFGINSPAQVKHFGNIADGVIVGSAIVNLIKEHGDRA HQPLFNYIQTLIKEL >gi|223714183|gb|ACDT01000032.1| GENE 25 23099 - 24385 937 428 aa, chain + ## HITS:1 COG:FN1725 KEGG:ns NR:ns ## COG: FN1725 COG0168 # Protein_GI_number: 19705046 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 4 428 19 448 448 242 39.0 1e-63 MKIIFYGFCIIILMGTILLSLPISLKENQEVSIMTSFFTATSATCVTGLIRVDTYSHWSW FGQFIIILLIQIGGIGFMTFCIYALTLAKKKIGIISRSIMQNSISSPHVGGIVKMTKFII LATFFIEALGAFFLSFIFCPKLGLVKGLWFSIFHSISAFCNAGFDLMGNFEPYSSLVSFQ DNWYLNMIIMLLIIIGGLGFLVWKDIIDNRGHFSKMRLHTKIVITTTIILIFGGALFIYF CEQGNATILSSLFQSVTARTAGFNTVDLSKIRETTQLIIIILMFVGGSSGSTAGGIKTTT IAVMLVNIISMFKQKKGVEVFKRRISDEIVKMASCVLMAYLVLTLIVSLIICQLENISYI TVLFECVSAIATVGLTIGITSQLGVISQCLLALLMLFGRVGSITFLLAFASNRVTPLAKA PAEKIQIG 
>gi|223714183|gb|ACDT01000032.1| GENE 26 24396 - 25040 748 214 aa, chain + ## HITS:1 COG:lin1022 KEGG:ns NR:ns ## COG: lin1022 COG0569 # Protein_GI_number: 16800091 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Listeria innocua # 6 212 7 214 219 143 36.0 3e-34 MKSVLVIGLGRFGRHLSRKFIEEGNVVLAIEKDETRADWAVNIVNEIQIGDATNEDFIKS LGVNNFDICVVAVGDNFQTALEITVLLKDFGAEYIIARASRDVHRKLLLRNGADHVVYAE REIAEKIAIKYGAKNVFDYLELTPDIAIYEIKIPESWLNKTIIEKAVRTRYHVSILATKK NGRIFPLPPTNHVFAADETLIVMGTAKSIKEITR >gi|223714183|gb|ACDT01000032.1| GENE 27 25080 - 26999 1904 639 aa, chain + ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 5 639 241 888 888 511 46.0 1e-144 MINQQEHILLCLSSSISNPDVIKMAAAVAHENKAIFTALYVETPENSKISKDDRERLQYN TNLARKLGANIETAFGLDIAYQINEFCKNAKVTKVFFGHSQKDRKHLFSSRLDKKLLNLL KDLDINIVYHQNTGKDKSKQRDFNFVVKGKDLAISIFILLIATLIGILFHHLGFSEANTI TVYILSVLITAITTSNRIYSLVSSILSVLIFNFMFTNPRFTLAAYGSGYPVTFAIMLIAA FITSTLAIRIKQQAKQSAQHAYRTKVLLETNRLLQKEKSIAGIKETAAKQIVKLLNKSVM IYQITENDTILHEVFHHNNMTVHDCDEIETNVHWVYQNNKTYQAKNYYLPLASTNDIYGV VVIILEDVILDAFENNLVLSIVGECALALEKEFFNQKKEEAAIEAKNERLRANLLRSISH DLRTPLTSISGNASVLINNSGSLDENKKMQLYEVIYDDSLWLINLVENLLSVTRIEDGKM NLHLETELIEEVINEALNHISHKKDEHRIEVKANDEFILAKIDAGLIIQVIINIVDNAIK YTPKGSMISIETFKHKDFVEIQIADDGAGISDKDKEKLFEMFYTVKHEVIDGRRGLGLGL ALCKAIIVAHGGNIAVKDNIPHGTIFSFTLPIKEVEIHE >gi|223714183|gb|ACDT01000032.1| GENE 28 26992 - 27690 861 232 aa, chain + ## HITS:1 COG:pli0051 KEGG:ns NR:ns ## COG: pli0051 COG0745 # Protein_GI_number: 18450333 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 1 231 1 231 240 279 61.0 3e-75 
MNKPFVLVVEDDNAVRNLIATTLETHDYKYHTSQTGKQAIIETASQKPDIMLLDLGLPDL DGVEIIKKVRTWTNMPIIVISARSEDRDKIEALDAGADDYLTKPFSVEELLARLRVILRR LNYNQSQNIESSDFVNGDLKIDYAGGCAYLDQQELHLTPIEYKLLCLLSRNVGKVLTHTY ITKEIWGSTLESDVASLRVFMATLRKKIELDPTHPKYIQTHVGVGYRMLRIE >gi|223714183|gb|ACDT01000032.1| GENE 29 27775 - 28632 666 285 aa, chain + ## HITS:1 COG:alr2616 KEGG:ns NR:ns ## COG: alr2616 COG2510 # Protein_GI_number: 17230108 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Nostoc sp. PCC 7120 # 151 284 9 142 143 105 50.0 1e-22 MWIGLACLSALFAGVTAVLAKIGLKNVDSSLATALRTIIVLIFSWIMVLFAGTQEQLKYI DSNTVLVLCFSGIATGLSWLCYFKALQLGDVNKVTPIDKSSTILTMLLAFIFFDEKITML KIISMIIMSLGTYMMIEKKQVVNKVTNNHWLLFALASALFASLTAIIAKIAMLNIDSNLG TFIRTAVVLIMAWGIVFYQRSYHTIKEITKKNWLFLFSSGITTGLSWICYYKALQLGEAS IVVPIDKLSIIVTIIFSYFVLKESLSKKAVAGLILIVAGTMLLLI >gi|223714183|gb|ACDT01000032.1| GENE 30 28645 - 29127 430 160 aa, chain + ## HITS:1 COG:Cj1254 KEGG:ns NR:ns ## COG: Cj1254 COG3663 # Protein_GI_number: 15792578 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Campylobacter jejuni # 5 160 7 158 160 137 45.0 6e-33 MREYHLIEPVYNKSSKILILGSFPSVKSREANFFYHHPQNRFWKILANIYNTELPETIIE KKEFLINQHIAVWDVIASCEIKGSSDSSITNVEVNDLKKIIDQSQITHIYTNGNLADRLY HRYFDEIIDLPVTKLPSSSPANASYSLTKLISFWKEIKSL >gi|223714183|gb|ACDT01000032.1| GENE 31 29165 - 29896 777 243 aa, chain - ## HITS:1 COG:no KEGG:CKR_2202 NR:ns ## KEGG: CKR_2202 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 243 1 243 243 275 59.0 1e-72 MRTWAFAKRTTKEILRDPINIIFGLGFPIVILLLLTTIQKNIPATPFSLKQLTPGIAVFG LSFLSLFSATLISRDRMSSLLARLFTTPMTAKDYILGYTLPLIPMALIQTLLCYLAAFCL GLKITPDVIIAILCTIPISIIFIAIGLFCGTIFNDRQVGGICGALLTNLTAWLSGTWFSL DLIGGLFKDIAYCFPFVHAVDMARDGLNANYHAMPNNLFWVLSYAGLLLIAAVSLFKYKM KHN >gi|223714183|gb|ACDT01000032.1| GENE 32 29874 - 30623 235 249 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 
[Vibrio campbellii AND4] # 3 220 1 226 245 95 26 7e-19 MEMIEIKNLTKKYNDKLAVDNLNLSIKQGKLLALLGVNGAGKTTTLKMLSGLIQPTSGDA RINNYSIVNESFFVKELIAVSPQETAIAPNLTVRENLELMAEIHGFKKTLINNKVIEMIT AFNLKDVAAKKSKKLSGGYQRRLSIAMALISEPQILFLDEPTLGLDILAREQLWRTIKAL KGKITIILTTHYLEEAEALSDYVCIMKDGKIKALGTTNELIKFAKTTSFEDAFIKLATGG TENENMGLC >gi|223714183|gb|ACDT01000032.1| GENE 33 30629 - 31057 558 142 aa, chain - ## HITS:1 COG:CAP0091 KEGG:ns NR:ns ## COG: CAP0091 COG2002 # Protein_GI_number: 15004795 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Clostridium acetobutylicum # 74 142 6 73 76 58 43.0 3e-09 MLSTNLVWLRKHYQYTQEEIAQQVGVSRQSVAKWESGESLPDIDSCMALAKIYNVTIDNL INHDEDDAGIVIPPKGKHFFGAVTVGERGQIVIPQEARRIFKISAGDKILILGDEERGLA IVHQRDVINFVSELGVPKNEKD >gi|223714183|gb|ACDT01000032.1| GENE 34 31184 - 32617 1501 477 aa, chain - ## HITS:1 COG:L22116 KEGG:ns NR:ns ## COG: L22116 COG2723 # Protein_GI_number: 15672399 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Lactococcus lactis # 1 475 1 478 478 600 61.0 1e-171 MTFNKDFLWGGAIAAHQAEGAWQEGGKGISCSDVETAGNNVTGAPRRLTDGVIEGEDYPN HVGVDFYHRYKEDIALFAEMGFKAFRTSIAWTRIFPRGDEETPNEEGLQFYDDLFDECHK YGIEPIITLSHFEMPWTLAKEYGGFRNREAIDMFVKFAKVCFERYQHKVKYWMTFNEINN QADVSQHNLIQEGAVLLKKDDDAEYLMYLSAHHELVASALAVKAAHEINPDLQVGCMIGM NAVYPASPNPTDVMNALGAMHQKYWYVEVHARGHYPNHILKKFERKGYDFITEEDKKILS QGKVDYIGFSYYMSFASKYQGRNEKTYDYFSEDFVRNTHLKASDWGWQIDPLGLRWCLNW FHDRFELPMMIVENGFGAYDKVEADGTIDDQYRIDYLAAHIEAMRDAVDYDGVELLGYTM WSPIDIVSASTGEYDKRYGFIYVNYNNNHEGDFSRCKKKSFDWYKKVIATNGEDLSE >gi|223714183|gb|ACDT01000032.1| GENE 35 32769 - 34109 1300 446 aa, chain + ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 5 424 8 427 452 262 35.0 8e-70 
MIGKLTEGNITRALIKFSIPMILGNLLQQLYNVADTFIVGHYIGTDALAAVGSSFTIMTF LTSIILGLCMGSGILFSMFYGAKQLDKMKTSFFVSFVGIGIFSIGLEIVCLLAIDLILNF MNIPRDIFTDTHQYLFIIFLGLVFTFIYNYFSSLLRALGNSKIPLIFLALASIINIGLDI YLVAEVAMGVAGAAVATLIAQAFSAIGIMLYVFLSQKELLPQRKHWHFEREIFEKIKAYS LLTCIQQSVMNFGILMIQGLVNSFGLVTMSAFAAAVKIDSFAYMPVQDFGNAFSTYIAQN KGAGLEERIHKGFKVAVVMASIFCIFISALVFIFADKLMLIFIESSKSEIIYQGAQYLRI EGACYLGIGCLFLLYGYYRGVGKPGISVVLTVISLGTRVVLAYLLAPLFGTLAIWWAIPI GWFLADLIGIMYGLKKERWTNRYIDK >gi|223714183|gb|ACDT01000032.1| GENE 36 34395 - 35831 1626 478 aa, chain - ## HITS:1 COG:BS_bglA KEGG:ns NR:ns ## COG: BS_bglA COG2723 # Protein_GI_number: 16081063 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Bacillus subtilis # 3 477 4 479 479 687 69.0 0 MALSKDFLWGGALAAHQFEGGVLNTSKGYSVADVMTAGAHGVPRRITDQIMENEYYPNHV GIDFYSHYKEDIALFAEMGFKCFRTSIAWTRIFPNGDETEPNEEGLQFYDDVFDELLKYG IEPVITLSHFEMPYHLAKEYGGFMNRKTIDFFVKFAEVCFKRYKDKVKYWMTFNEINNQM NFKNDIFGWTNSGAHFGDYDNPEEAMYICGHHTLLASALAVKIGKQINPNFLIGNMIAMV PIYPFSCRPADMVLSNQMMHDRWFFCDVQCRGHYPAYALKMFERKGFNINITEEDKKILA EGTVDYIGFSYYMTNTVDSTAHQDVSKATDGSSEHSVKNPFIKESDWGWAIDPEGLRYAL NIFYERYEKPLFIVENGFGAIDVKEADGSCHDDYRINYLKAHIKEMKKAVELDGIDLIGY TPWGCIDCVSFTTGEMKKRYGFIYVDRDNEGNGTLERSKKDSFEWYKKVIASNGENLD >gi|223714183|gb|ACDT01000032.1| GENE 37 35968 - 36492 564 174 aa, chain + ## HITS:1 COG:SP0652 KEGG:ns NR:ns ## COG: SP0652 COG0500 # Protein_GI_number: 15900553 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Streptococcus pneumoniae TIGR4 # 1 173 5 178 185 127 36.0 1e-29 METMIEYVHNRIKRKDYQIAIDFTMGNGHDTLFLSKVAKQVYSFDIQQEALNNTKKLIED TDNVNLILASHENFDCYVKNFDVGIFNLGYLPNGDHQITTMADSSLQAIKKAVEYLNRKG ELFLVVYIGHDEGKKESLLIEEYVASLDHISYNVALFKMMNKLSAPYVIQIEKR >gi|223714183|gb|ACDT01000032.1| GENE 38 36556 - 36825 369 89 aa, chain + ## HITS:1 COG:SA0934 KEGG:ns NR:ns ## COG: SA0934 
COG1925 # Protein_GI_number: 15926669 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Staphylococcus aureus N315 # 9 88 8 87 88 64 45.0 4e-11 MVKRIDVTVTDPVGLYATPATELVDTMKMFNSDIKLVYASKTVNMKSLMGVLSLGIPTKA KIEIIADGEDEDAVIKRALELMKDLGISG >gi|223714183|gb|ACDT01000032.1| GENE 39 36924 - 37901 1149 325 aa, chain + ## HITS:1 COG:BS_mraY KEGG:ns NR:ns ## COG: BS_mraY COG0472 # Protein_GI_number: 16078583 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Bacillus subtilis # 5 324 2 322 324 258 46.0 1e-68 MENFLKLIIAFGIGFILVVLAMPKVIPFLHKMKFGQVEREEGLESHKSKNGTPTMGGIVF VIAAILGAFIVNFNNLLDPELILATIVLVGYSAIGFVDDALIIVKHSNKGLPPLAKLLAQ IALAIICYFFAMNFIPDFTSVITIPLLDINIDMGYLYPALILVMFAGESNGVNLSDGLDG LATGLSMVAIAPFIIFSIMTKDYTLASYATAMVGALLGFMMFNYHPAKIFMGDVGSLGLG GFLAILAILTKQELLLILVGGVFLMETLSVIIQVVSFKTRGKRVFKMAPIHHHFEMLGWS EQQVTISFWFIGFICGILSIVIGVL >gi|223714183|gb|ACDT01000032.1| GENE 40 37905 - 39245 1606 446 aa, chain + ## HITS:1 COG:SP0688 KEGG:ns NR:ns ## COG: SP0688 COG0771 # Protein_GI_number: 15900589 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Streptococcus pneumoniae TIGR4 # 2 446 7 448 450 309 38.0 6e-84 MFTGKKVLVIGLAKSGKAAIRLLHKLNATITVNEAKEIKDIPEYQQYLDMGIEMVTGSHP TELFERDFDFVIKNPGINYHKPFILRLKERKIPVYTEIELAFQVAKKQHYIGVTGTNGKT TTVTLIDKILKGQYQHVHLAGNVGTPLCDVVLDNDLLNEENHYVVIEMSNFQLLDIERFK PYVSTIINLTPDHLDYMASLDEYYASKTNIYKNTDLDDYFVVNLDDKTVMEYLEKYPVPC NQVTMSLIHEADCMIKEQAIYYRDELIINLADIKVVGRHNIQNIMTAICIAKKCDVPVTV INREIASFIGVEHRIEFVREINGVKIYNDSKATNTDATIIALKAFEQPVILLMGGFDKGL DLTEMATYGNKISHLVTFGAAGERFKNDMHVANSSSVENLEAATLLAIEKAKAGDIILLS PSTSSFDEFSGYEERGRVFKNIVNKL >gi|223714183|gb|ACDT01000032.1| GENE 41 39487 - 40269 630 260 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734702|ref|ZP_04565183.1| ## NR: 
gi|237734702|ref|ZP_04565183.1| predicted protein [Mollicutes bacterium D7] # 1 260 1 260 260 468 100.0 1e-130 MNTLLSRLIVYINSCVKKDVIYEIARYIIKNYSYCQNITLKDIMDECYVSRASVTRFCEY FGFHSWNNFQSFLVKTKKVKEQQIESRFSNIDIESIYDHLCYIAKNDSLEFKNNLKKEIN ALVEIINQSSRIYLFGAVYPLSLAVDFQINMISIGKTVYSDFQSESDYLEPMNEDDLAII LTASGRYVGECKSKFNLICNNAGKKAVISCSNRYSSLQYINHYLYIPTTNKNSFVDFDYY TILILDLIYIEYNLKYCRGK >gi|223714183|gb|ACDT01000032.1| GENE 42 40272 - 41075 595 267 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755737|ref|ZP_02427864.1| ## NR: gi|167755737|ref|ZP_02427864.1| hypothetical protein CLORAM_01252 [Clostridium ramosum DSM 1402] # 1 267 1 267 267 493 100.0 1e-138 MGSILIKLHQLINYNQNENIHYSLAKYLLANCDTINRLSISKVVEDTLISKTSVLKFCRY LGYDSWKVFCSDFQEECLDEKIRLDRLKMNANLMFSKDNLNEYVRLNDEYFSEVQKIVTF QKIKIFAKQINEANNIFVFGEPRELPLFYDFQELLGLYHKELIFPKSLNKKDFEEQLALI DQESLIIITNGVHSFDGFIEKETIEPVYGLERIMQTNCKVIFIGQKSVQNYDNVTTLTIP FSFNEYFIRLAIADLIYKMITYYFYRY >gi|223714183|gb|ACDT01000032.1| GENE 43 41263 - 43956 3467 897 aa, chain + ## HITS:1 COG:Cgl0317 KEGG:ns NR:ns ## COG: Cgl0317 COG1472 # Protein_GI_number: 19551567 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Corynebacterium glutamicum # 100 582 12 505 548 176 28.0 2e-43 MSKKKKLTKTLAALSAAGCIIGSATPIALNAVELSKYSVVEAKYQAAKTEGSGENAQFSD YDIDIMVVTNPDYKGEAAPTLTYSKAFDEKAKEAGKTALLSENIDGKEYAFKDMNLNGEI DDYERWYLDAETRAQDLADALVAEGEDGIAKIAGLMLFSSHEGNSAAGLTDSQKDYLTNS YVRNITDAAGNDVEDSVRWVNAMQEYVEISDFNVPVDFSSDPRSTAGSGDLYSDNVSSGA TDISKWPSNIGLAATFDTEHMYNFAKASSAEYRALGIVTALGPQIDLATEPRWLRVGGTF GENTKLSTDMAQAYVDGSQSTYVNGEDQGWGKDSINCMIKHFPGDGAGEGGRESHTRAGK YAVYPGGNFEEHLTPFAQGGLKLKGKTGAATAVMTSYSIGINADGSALGGERVGSAYSKS KMSILRDDLGFDGVVCTDWGVTSTPPEQLALGGLGMAWGMENATDNERHLAILDAGGDMF GGNNDNKPIVWSWQQMVDRDGEEAANTRFATSAKRLLKLTMEPGLFENTYLNLDESLAEA GSKDKKDAGYQAQLESVVMLKNKDNTIKAAKDQAKDPKDMTVYIPKTYTAEQKGVFGDTP ASWSDSMNIETAKSIYGTVLTDEIVDDKVVRPDLSNVDKVIVGMRSPNNGSLFNQSGMIV EDGKQKFYPLSLQYRPYTADGENVRKVSIAGDLNSDGTQQNRSYYGATSKIANEYDLTAC 
LEAVEAAKAVNKDIPVVVALNANGPVIVSEFEDKVDAIVCGFSVSDSAVFDIINGKSEPK GLLPLQFPANMDTVEANKEDVAGDLVCYKDTQGNTYDVAYGLNWSGIIDDERVQTYNPDR PTTDTTTVTPTPTPDKPTTSVKTGDEINLNTTMLAGAVALVAGLTAYVTAKKRKAVK >gi|223714183|gb|ACDT01000032.1| GENE 44 44019 - 45398 1398 459 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01288 NR:ns ## KEGG: EUBELI_01288 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 275 41 317 395 77 27.0 9e-13 MKILKKLMILLVLAALIIGCSKEPENNYLGNLKVGYTMKLDDKEVPKAKFEFFYGRALTK YLNQYYSAFSDDYASIVDFNIEKDINDLKNQDCTVTGYTDKSWEYYFADQAKKDILEVEA LVRGAKTAGYQINADDKKKARSTFGELESYCKENKIDFEDYLHSTFGLEATKDNLISYYE EANLASRYAAVLLKEPTDEEVEQYYLANKNDIDTINIRYFAFAKDDKAKADEFAQKIRSE ADFKQMAVDYSSDDKKIYYQNNDLSLRQDLRKSDLPEYLQSVLFDQALPINSVKVVEGDS SIDVVMLVAREQPVYRQASISTVYLDARETDTDNLTTEKMNTCKEFADNLLQDFMNNTDR SIDKFHEYNSKYADDKNNQGDYDNISKGDSTIEIANWVFDESRQVGDLEVLMSNYGYTIV YFRGFGGIDYFERAKVLAKEATYDKQLNDLKRNIKVMIK >gi|223714183|gb|ACDT01000032.1| GENE 45 45475 - 46551 1125 358 aa, chain + ## HITS:1 COG:BH2566 KEGG:ns NR:ns ## COG: BH2566 COG0772 # Protein_GI_number: 15615129 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus halodurans # 6 357 12 364 366 305 51.0 1e-82 MKRRRVVVTVLIIVIIGLMMVYSASNIWAGYKFNDSLYYIKRQGIFAVIGIIAMFVFSKI DYHIYQKNANKLLIGSFILMILVLIPGIGAVRGGSRSWFNLGIISLQPSELFKIAIIIYS ANYINNHYHELKKLKASLKLLLILGLGFGLIMLQPDFGSGVVMACSIVVMLIVSPFPFKY FVMLGILGVIGIVIMIISAPYRLARIVAFLNPFADPLGSGFQIIQSLYAIAPGGILGVGF NNSVQKHFYLPEPQTDFIFAIFLEEFGLIGGVLLVGMYGYMFVTVFNQATKVKDLFGSFL MIGIISMIGIQTLINLGVVVGLFPVTGVTLPLMSYGGSSLTITLIAIGIVLNISKSTY >gi|223714183|gb|ACDT01000032.1| GENE 46 46580 - 47656 1266 358 aa, chain + ## HITS:1 COG:lin2141 KEGG:ns NR:ns ## COG: lin2141 COG0707 # Protein_GI_number: 16801207 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Listeria innocua # 1 347 1 350 363 294 43.0 1e-79 
MKVIVGAGGTGGHLYPALALVEYIKEVEPDSEFLFVGTKDRIESEVVPQQGYEYIGLNVR GLVGNPIKKGIAAAIFVKSIFTAKKIVKKFKPDIVIGFGGYPSASVVEAANRLGYKTMIH EQNSIIGLTNKILIKNVDKIVCCYDKAYENFPKDKTYKLGNPRASVIASIKPDDIFKKYH LNKNLPLVTIVMGSLGSKSVNEMMLKSLKTFEQKNYQVLYVTGKPYFEEMKTKLGKLNKN IKLVPYIDDMPSVLKNTTLVVSRAGASTLAEITAVGIPAILIPSPYVAANHQEYNARELA DRNAAMMILEENLNSKDFVEKVDYILENKIVQESMQKSAKALGKPNACRDIYKLIKEM >gi|223714183|gb|ACDT01000032.1| GENE 47 47666 - 48580 1213 304 aa, chain + ## HITS:1 COG:BH2564 KEGG:ns NR:ns ## COG: BH2564 COG0812 # Protein_GI_number: 15615127 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Bacillus halodurans # 5 301 3 298 301 244 41.0 1e-64 MKFSQVKEDLEKLDVGEMIEDEPMYKHTTYKVGGPARIYLKVKDVDSLIKTIKYCGKHRV KYLVIGRGSNLLFSDREYEGLIISLNECFNEIKVNGSTMIAQAGVPMISLSYQAAKIGLS GFEFMGGIPGSIGGGIYMNAGAYKYDLASVVKTVTLLNEKHEVVTFNNEQMDFSYRHSIC QDNRKLIVLEVTFELTAKSPDEIKAVLDKRKERRMSSQPWNMPSAGSVFRNPQDKPAWQY IDECGLRGYEIGGAQVSPKHSNFIVNNGYASAKDIYDLIMLVQEKVNEKFGVKLRREVGL INWE >gi|223714183|gb|ACDT01000032.1| GENE 48 48584 - 49351 783 255 aa, chain + ## HITS:1 COG:BS_divIB KEGG:ns NR:ns ## COG: BS_divIB COG1589 # Protein_GI_number: 16078588 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division septal protein # Organism: Bacillus subtilis # 1 227 1 222 263 75 28.0 7e-14 MAKRKQQQNIYNDYDRDQVLAYLKKQKAKKRKRRRRVIFVILVIGLIIAFFVSDYSRLQT ITVSGNNRVSSEEIITASKIKLHQDYTFFKSMDAAENAIKKTSLIKDAKVTKDLFGHVKI KVVEADPIGQCTIDNILYVVDETGRVTKDEAGVLTTYVQRCPKLNGFDYDRFAAFAKEFA KIPAQVVNQISDINYAPENLDDKRCEFIMDDGKILYLRYDDMAVQLKGDNYALKMEEFPD YKYYDFVGKYVYVHN >gi|223714183|gb|ACDT01000032.1| GENE 49 49438 - 50697 1312 419 aa, chain + ## HITS:1 COG:BS_ftsA KEGG:ns NR:ns ## COG: BS_ftsA COG0849 # Protein_GI_number: 16078592 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Bacillus subtilis # 3 416 5 439 440 146 24.0 6e-35 
MKDIYAVLDIGSATLKFLVAEVNNINVNVLFVKTIPSHGVKKGIIEDDRILIRDIRKLID EAEAFLESKITSVALTIPTIKAKLYQSDSSISLSDAGSKVSTDDIVRVLRLSSKFKRSKD EEVVSIIPVRYHSDKGATVEAPLGELSRNLIVDSLVITTSKELLYPYISVVEKCGVEVLD ITINAFACAKEAFDAVYLQEGALLIDMGYKTSTVSFYKDGYLQYLTVCQVGGYDFTRKIA QNMQISMNQAEAYKIKYGSLDVTQGQNDIIHTTFVDEQKRDYTQQDLADLLNETAYEVMN KIKEKISVIDDISKYETLIVGGGGELEMLDTIATEVLECPVRIYRPDTIGTRDMSLVAAI GMVYYLMERKQVVGDYTPSLVLPDVTNTMAIRFKGLTKSAPSKQGSKMSKLLDSFFSED >gi|223714183|gb|ACDT01000032.1| GENE 50 50722 - 51816 1503 364 aa, chain + ## HITS:1 COG:BH2558 KEGG:ns NR:ns ## COG: BH2558 COG0206 # Protein_GI_number: 15615121 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Bacillus halodurans # 4 363 5 378 382 311 53.0 1e-84 MSEELGFEQVARIKVIGVGGGGNNAVNRMVEEGVAGVEFYVANTDLQVLKRSPVTNKIEL GRDLTKGLGAGGEPEIGKKAALESEAEIRQVLEGADMVFIAAGMGGGTGTGAAPVFAKIA RELGALTVGVITKPFTFEGMKRKKQAISGIEELRANVDSIITVSNDRLLQLIGGRPMQEA FREADNVLRQGVQTITDLIAIPAFINLDFADVSAVMKNRGNALIGIGMSSGDDKAKEAAK RAISSPLLEVSVAGAKDAIINVTGGPNISLFDANIALETISQEVGDDINTYLGIAINENL DDDIIVTVIATGFEEENDDDFEPRPNVLKTRSVEDVVPLSARFEDEEDDDDGDFPPAFIK NRRV >gi|223714183|gb|ACDT01000032.1| GENE 51 51866 - 52642 1028 258 aa, chain + ## HITS:1 COG:proC KEGG:ns NR:ns ## COG: proC COG0345 # Protein_GI_number: 16128371 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Escherichia coli K12 # 2 256 3 258 269 187 41.0 1e-47 MKKIGFIGMGNMAGAIAGGIIKSGFVEGENVYAFDIDNDKLTKMHTDFSINVCTSEKELV AMADIVIIAVKPNVVEDVVAKITDELDNKAIISIVAGYDNEMYNELLLDSTRHLTIMPNT PALVMNGMTLFEQENTLTADELNYAVEMFSSIGEVVILPSYQMKAGGSISGCGPAFVYMF IEAMADGGVRLGLPRDVAYRLASQTLIGAGMMQKETQLHPGILKDQVCSPGGITIKGVET LEENGFRNAVLKAIKESN >gi|223714183|gb|ACDT01000032.1| GENE 52 52725 - 53462 671 245 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734713|ref|ZP_04565194.1| ## NR: gi|237734713|ref|ZP_04565194.1| predicted protein [Mollicutes bacterium D7] # 1 245 8 252 252 350 100.0 6e-95 
MEVYIELTYLTNYLIILVALEMMAILISKEMSYLMVIKHSFYLSGVILLLYLDSYSWLII LIWAVVFLCLYQRQIFLYYPIFIFVYFSLLLFASSIIPEAFIYNGILISPVNVSSIGLFI VSLLVVLMQIMFIVYLKRKIRIDDYLYVMKLDYQQHSYTIKGFLDSGNEVYYEGFPLILI NQKIIDEYEVIDVLELNDLREDIIEIIKVDQVIINNQRLQDIYVGVIAGIQYDCLLNKSL MGGIL >gi|223714183|gb|ACDT01000032.1| GENE 53 53462 - 54157 759 231 aa, chain + ## HITS:1 COG:BH2556 KEGG:ns NR:ns ## COG: BH2556 COG1191 # Protein_GI_number: 15615119 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus halodurans # 3 226 12 234 237 241 57.0 6e-64 MIKFILKILNKTKYLYYIGRHDILPPPLKGIEEKEALEKLGAGDNDARDLLIEHNLRLVV YVAKRYDTTQNGGIEDLISIGTIGLVKAINTFKPDKNIKLATYASRCIENEILMFLRKNN KLRHEISLDEPLNIDYDGNELLLSDIIGTDSDIVKNELEQSDQKAMFYEAFKDLSKREKE ILMFRYGLMNYDELTQKDVAKMMGISQSYISRLEKKIIKKLRNKLNYNEIK >gi|223714183|gb|ACDT01000032.1| GENE 54 54206 - 54497 220 97 aa, chain + ## HITS:1 COG:BS_sigG KEGG:ns NR:ns ## COG: BS_sigG COG1191 # Protein_GI_number: 16078597 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus subtilis # 1 97 1 98 260 78 45.0 3e-15 MSNYQVIINGYQEPKKRLTTDETYKLIDEYQATKNEAIKDRLVNDNTKLVLSMARRFYGR EDSMDDLFQVGMIGLIKAIENFNTSFGLKFSTYAVPL Prediction of potential genes in microbial genomes Time: Thu May 26 09:36:07 2011 Seq name: gi|223714182|gb|ACDT01000033.1| Coprobacillus sp. D7 cont1.33, whole genome shotgun sequence Length of sequence - 5589 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 388 410 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 2 1 Op 2 . + CDS 412 - 651 203 ## gi|167755750|ref|ZP_02427877.1| hypothetical protein CLORAM_01265 + Term 652 - 690 5.2 + Prom 1094 - 1153 4.1 3 2 Op 1 . + CDS 1255 - 2013 336 ## PROTEIN SUPPORTED gi|227891037|ref|ZP_04008842.1| ribosomal protein S4e 4 2 Op 2 . 
+ CDS 2026 - 2457 576 ## gi|167755753|ref|ZP_02427880.1| hypothetical protein CLORAM_01268 + Term 2647 - 2683 5.0 + Prom 2609 - 2668 5.0 5 3 Tu 1 . + CDS 2697 - 5426 3319 ## COG0060 Isoleucyl-tRNA synthetase + Term 5527 - 5584 1.1 Predicted protein(s) >gi|223714182|gb|ACDT01000033.1| GENE 1 2 - 388 410 128 aa, chain + ## HITS:1 COG:CAC1696 KEGG:ns NR:ns ## COG: CAC1696 COG1191 # Protein_GI_number: 15894973 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Clostridium acetobutylicum # 3 127 133 257 257 85 37.0 3e-17 SYITKLNREPTINELANELEVEPSAVIEALLSTNSVSSLQEEVKNDDGNNLKMIDSITDD KTVVSRTNETIDLYDALKSLNQKEHRVIKQRYFEGLSQSEIAKELFISQAQVSRIERKAL DNLHNYLK >gi|223714182|gb|ACDT01000033.1| GENE 2 412 - 651 203 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755750|ref|ZP_02427877.1| ## NR: gi|167755750|ref|ZP_02427877.1| hypothetical protein CLORAM_01265 [Clostridium ramosum DSM 1402] # 1 79 1 79 79 130 100.0 3e-29 MRFVKLQSKDVINVVDGCKIGFISDIEIDWCGKCIQAIVVEKYSFFKLLCFFKEAPCIVI PIECVVSIGGDVILVSIEP >gi|223714182|gb|ACDT01000033.1| GENE 3 1255 - 2013 336 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227891037|ref|ZP_04008842.1| ribosomal protein S4e [Lactobacillus salivarius ATCC 11741] # 1 242 5 249 262 134 30 2e-31 MIEHFKGEEVFVKKVLEFKDQALYKQRLVLTKFLNPYHQSIVYSIVGNQNDLIVIEDGGM VDSEMKRLIIAPSFYQIEKEDFEIVLAKISYAKPFGTLNHRDILGALMSLGVKRELFGDI YEYEDNFYVAMDAKIYEYVKNNLIKIKRSKVKILESEEIITIKHQYISKTFIVSSFRLDK VVSTLYGVPRSKAVSYIQSGFVKVNHKEVEEINYLCNNSDIISLRRHGRVKFVDTKRRTK QDNYVVEGYFYK >gi|223714182|gb|ACDT01000033.1| GENE 4 2026 - 2457 576 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755753|ref|ZP_02427880.1| ## NR: gi|167755753|ref|ZP_02427880.1| hypothetical protein CLORAM_01268 [Clostridium ramosum DSM 1402] # 1 143 1 143 143 191 100.0 2e-47 MSEFKKQFRGYSVQQVDSKIDDYQAELASLKQQVASLTDELDHVKEQNSLLQHRVNITEK TNEEIARLALKEASELIDKAKRNANMILKESLDYVRSLSSEMNDFKDQAIKFRSSVQKMS QDILDSIDNSEVFNLINEEDEDN >gi|223714182|gb|ACDT01000033.1| GENE 5 2697 - 
5426 3319 909 aa, chain + ## HITS:1 COG:BS_ileS KEGG:ns NR:ns ## COG: BS_ileS COG0060 # Protein_GI_number: 16078607 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Bacillus subtilis # 1 885 1 890 921 1013 55.0 0 MNYKDTLLMPKTEFEMRGNLPKKEPKYVERWQKENMYERVIEQNKGKEDFVFHDGPPYAN GNMHMGHMLNKVIKDVICRYNNMEGRYTPYIPGWDTHGLPIENAIQKLGVNRKKMSTAAF REKCDEYAHEQVNKQMEQLIRMGTFADYKHPYLTLMPEFEANQIDVFAKMAMDGLIFKGL KPVYWSPSSESALAEAEIEYHDIKSPTIFVKFAVKDGKGILDTDTSFVIWTTTPWTIPAN LAICLNERYDYALIETTKGKLIVLNELVDSLMEKFEITEYKVIKNFKGSELELITCQHPL YDRESLVILGDHVTAEAGTGCVHTAPGFGADDFFVGQKYGLPAYCNVDEHGCMMKDCGEW LEGQYVDDANKTVTMKLDELGVLLKLEFITHSYPHDWRTKKPIIFRATDQWFCSVDKIRE KLLSEIDKVNWLNEWGHIRIYNMIKDRGDWCISRQRSWGVPIPIIYCEDGTPIMEEEVFK HISNLFRKYGSNVWFERDEKDLLPEGYINEHSPNGIFKKEKDIMDVWFDSGSSHTGCMIE RGYRYPVDLYFEGSDQYRGWFNSSLIVGTAAHGCAPYKTVLSHGFVLDGKGNKMSKSLGN TVDPIKLVNQYGADIVRLWATSVAYQQDVRISNDIMKQISENYRKIRNTMRFVLGNLNDF KQADLVAVEALADVDKYMLAQLNDLIKGYHKAYDNYDFAEANQLILNYFTNLLSSFYMDF TKDILYIEKADSLRRRQVQTVLYYHAKAMMKLISPVLVFTAEELHDHFHCDDNKADSIFL EPNVEMINMPNSDQLKAHFDRFLELRKDVMKALEGLRNEKIIKSNMEAKVTISLKDEFKE MASMLADLKQLFIVAKVELVKDSSLEEFDSAYIKVEKFEGHQCPRCWNYFDEDEMEGELC PRCHAVING Prediction of potential genes in microbial genomes Time: Thu May 26 09:36:33 2011 Seq name: gi|223714181|gb|ACDT01000034.1| Coprobacillus sp. D7 cont1.34, whole genome shotgun sequence Length of sequence - 49715 bp Number of predicted genes - 55, with homology - 55 Number of transcription units - 23, operones - 10 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 16 - 918 686 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 2 1 Op 2 . + CDS 911 - 1969 1142 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Term 2041 - 2079 1.2 - Term 2221 - 2268 13.6 3 2 Tu 1 . - CDS 2275 - 4722 2460 ## COG1511 Predicted membrane protein + Prom 4732 - 4791 8.4 4 3 Tu 1 . 
+ CDS 4926 - 5366 572 ## COG1959 Predicted transcriptional regulator 5 4 Tu 1 . - CDS 5382 - 5588 395 ## gi|167755759|ref|ZP_02427886.1| hypothetical protein CLORAM_01274 - Prom 5645 - 5704 5.9 + Prom 5602 - 5661 5.9 6 5 Op 1 . + CDS 5683 - 6336 627 ## COG0546 Predicted phosphatases 7 5 Op 2 . + CDS 6320 - 6763 475 ## COG1683 Uncharacterized conserved protein 8 5 Op 3 . + CDS 6768 - 7703 1057 ## COG1482 Phosphomannose isomerase 9 5 Op 4 . + CDS 7703 - 7993 157 ## gi|167755763|ref|ZP_02427890.1| hypothetical protein CLORAM_01278 + Prom 8015 - 8074 4.8 10 6 Op 1 . + CDS 8098 - 9219 1384 ## COG2055 Malate/L-lactate dehydrogenases 11 6 Op 2 . + CDS 9236 - 9604 186 ## PROTEIN SUPPORTED gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 + Term 9787 - 9855 9.4 12 7 Op 1 . - CDS 9834 - 10247 444 ## COG0824 Predicted thioesterase 13 7 Op 2 1/0.091 - CDS 10237 - 10914 547 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 14 7 Op 3 . - CDS 10944 - 11678 992 ## COG0217 Uncharacterized conserved protein - Prom 11707 - 11766 8.0 + Prom 11879 - 11938 8.0 15 8 Op 1 . + CDS 11958 - 13082 1030 ## COG2081 Predicted flavoproteins 16 8 Op 2 . + CDS 13073 - 14644 1649 ## COG2509 Uncharacterized FAD-dependent dehydrogenases 17 8 Op 3 . + CDS 14652 - 15212 673 ## COG1971 Predicted membrane protein 18 8 Op 4 17/0.000 + CDS 15224 - 16627 1212 ## COG0168 Trk-type K+ transport systems, membrane components 19 8 Op 5 1/0.091 + CDS 16643 - 17308 1013 ## COG0569 K+ transport systems, NAD-binding component + Prom 17376 - 17435 8.2 20 9 Op 1 17/0.000 + CDS 17651 - 19012 1647 ## COG0569 K+ transport systems, NAD-binding component 21 9 Op 2 . + CDS 19018 - 20457 1134 ## COG0168 Trk-type K+ transport systems, membrane components 22 9 Op 3 . + CDS 20501 - 21679 1149 ## COG0232 dGTP triphosphohydrolase 23 9 Op 4 . + CDS 21752 - 22078 405 ## COG3870 Uncharacterized protein conserved in bacteria + Term 22084 - 22116 4.0 24 10 Tu 1 . 
+ CDS 22134 - 23795 1715 ## COG0285 Folylpolyglutamate synthase + Prom 23821 - 23880 9.5 25 11 Tu 1 . + CDS 23902 - 25512 1531 ## COG3263 NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain + Prom 25526 - 25585 5.9 26 12 Op 1 . + CDS 25703 - 26497 684 ## gi|237734747|ref|ZP_04565228.1| predicted protein 27 12 Op 2 . + CDS 26565 - 27008 559 ## BPUM_2073 stage V sporulation protein AC 28 12 Op 3 . + CDS 26986 - 27990 1074 ## BL00782 stage V sporulation protein AD 29 12 Op 4 . + CDS 27990 - 28340 441 ## RBAM_021510 SpoVAE2 30 12 Op 5 . + CDS 28337 - 29620 1140 ## BPUM_2069 stage V sporulation protein AF + Term 29623 - 29663 -0.3 + Prom 29658 - 29717 8.1 31 13 Op 1 21/0.000 + CDS 29822 - 30475 770 ## COG1354 Uncharacterized conserved protein 32 13 Op 2 . + CDS 30475 - 31089 964 ## COG1386 Predicted transcriptional regulator containing the HTH domain 33 13 Op 3 . + CDS 31076 - 31600 484 ## Cbei_2682 hypothetical protein + Term 31656 - 31696 -0.9 + Prom 31666 - 31725 5.8 34 14 Tu 1 . + CDS 31752 - 32795 821 ## COG3706 Response regulator containing a CheY-like receiver domain and a GGDEF domain 35 15 Op 1 . - CDS 32839 - 33561 599 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 36 15 Op 2 . - CDS 33558 - 34451 931 ## CLH_2307 ABC transporter, substrate-binding lipoprotein 37 15 Op 3 1/0.091 - CDS 34472 - 36220 2100 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 38 15 Op 4 1/0.091 - CDS 36230 - 38029 1922 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 39 15 Op 5 . - CDS 38041 - 38409 525 ## COG3411 Ferredoxin 40 15 Op 6 . - CDS 38406 - 38939 669 ## COG0642 Signal transduction histidine kinase 41 15 Op 7 . - CDS 38939 - 39643 598 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) 42 15 Op 8 . - CDS 39645 - 39971 332 ## HRM2_16550 iron-sulfur binding hydrogenase 43 15 Op 9 . 
- CDS 39976 - 41139 1158 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 44 15 Op 10 . - CDS 41139 - 41543 491 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 45 15 Op 11 . - CDS 41540 - 41893 398 ## Cphy_3800 DRTGG domain-containing protein 46 15 Op 12 . - CDS 41887 - 42378 650 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit + Prom 42352 - 42411 8.6 47 16 Op 1 1/0.091 + CDS 42576 - 43505 927 ## COG0583 Transcriptional regulator + Term 43511 - 43554 7.2 + Prom 43508 - 43567 6.2 48 16 Op 2 . + CDS 43595 - 44812 1529 ## COG1454 Alcohol dehydrogenase, class IV + Term 44838 - 44872 0.4 + Prom 44843 - 44902 6.6 49 17 Tu 1 . + CDS 44990 - 45526 494 ## COG0406 Fructose-2,6-bisphosphatase + Term 45646 - 45689 -1.0 50 18 Tu 1 . - CDS 45511 - 46122 593 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 46170 - 46229 6.2 + Prom 46112 - 46171 9.7 51 19 Tu 1 . + CDS 46208 - 47056 835 ## COG1284 Uncharacterized conserved protein + Term 47124 - 47163 8.1 - Term 46880 - 46918 3.1 52 20 Tu 1 . - CDS 47039 - 47446 354 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 47531 - 47590 8.9 + Prom 47417 - 47476 8.6 53 21 Tu 1 . + CDS 47560 - 48384 833 ## COG1968 Uncharacterized bacitracin resistance protein + Term 48388 - 48432 6.2 - Term 48375 - 48420 2.6 54 22 Tu 1 . - CDS 48518 - 48991 433 ## COG3467 Predicted flavin-nucleotide-binding protein - Prom 49074 - 49133 7.5 - TRNA 49141 - 49215 60.8 # Arg CCG 0 0 - Term 49094 - 49133 2.3 55 23 Tu 1 . 
- CDS 49266 - 49583 162 ## gi|237734776|ref|ZP_04565257.1| hypothetical protein MBAG_01147 - Prom 49604 - 49663 2.6 Predicted protein(s) >gi|223714181|gb|ACDT01000034.1| GENE 1 16 - 918 686 300 aa, chain + ## HITS:1 COG:FN1038 KEGG:ns NR:ns ## COG: FN1038 COG0697 # Protein_GI_number: 19704373 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 1 293 2 299 303 131 31.0 2e-30 MKSDKFKGSFLLLLAALIWGSSFIVMKSAVDFLTPNVLLFVRFTLATLVMIIMFYKYIKD TCIRDLKGGAITGTCLFLAYLIQTLGLTMTTPGKNAFLTAIYCAIVPFLVWLFYHKRPDN YNFVAALLCVSGVALVSLDGDLTMNTGDLLTICGGFFYALHILAIKKYSQEMHPIKLTTL QFGMTAILALFGSLLFEDITVIKQIDSSVILQIGYLVFFATALTLLCQNIGQNLVSECNA AILLSLESVFGVIFSVLLYGEVLTLKVIAGFVIIFVAIIVSETKLSFLKRSNVREEELNV >gi|223714181|gb|ACDT01000034.1| GENE 2 911 - 1969 1142 352 aa, chain + ## HITS:1 COG:BH1535 KEGG:ns NR:ns ## COG: BH1535 COG1686 # Protein_GI_number: 15614098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 10 348 8 342 387 141 28.0 2e-33 MCKKILSFLLVGICLIATLTPIRALENSEDIGLTCDYAVAIDAKSGLVLYNKNMDERMYP ASMTKVMTVILALELMDDMKKTTVITQSDIDTVWETGASSANFEVGETVTYEDLLYGAIL PSGADATRALANNLCGSQEAFVDKMNELAQKLNLKDTHFVNTTGIHDENHYTTVHDMALI VQYAIQNEDFKNIYAQRYKTSSNGLHQWVNKSMYNAKRAKINVDDILGCKSGYTNEAKSC LSSLNRVNDNEIITIVGHSVNNDVKTHAAVSDTLDIMNYVGQHYSLQSILTKGAKVRTLD VTLAKDEQKIAITNLNDVSAFLPNDFNKDDLEYKYSFKKLEAPVKKGEKLVI >gi|223714181|gb|ACDT01000034.1| GENE 3 2275 - 4722 2460 815 aa, chain - ## HITS:1 COG:lin2460 KEGG:ns NR:ns ## COG: lin2460 COG1511 # Protein_GI_number: 16801522 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 1 801 3 908 927 464 35.0 1e-130 MIKKEWSAFRKNWWLKIVAVAIIAIPSIYAGVFLGSIWDPYGNTKEIPVAVVNEDKKVEY NDSTLNVGKELAKSLQNNDSMDFKLVDSKTADQGLKDGDYYMVITIPSNFSKNATTLLDA 
QPQKMILNYTTNPGSSYIASKMDDSAIAKIKAEVSSTVTKTYAETIFNQLGTVGSGMSEA ADGSNQINDGANQLISGNKTISDNLQVLATSSITFKDGASTLTNGLQDYTDGVVTVNNGV YSLKDGINSLNNATPALSSGINQLDNGATELQTGISQYTGGVTSAYQGSQALVSNNSTLS NGVDTLAAGASQLKAGNQQISDGLTQMASQVKTSSISLSNYTTLINTLQNSGNADYKKLA QTILDTGLSKEEVEQYGLTKFGVTDALPSSYQLVQQMSSSLSTMNTALNGDTTTPGLATA SRTIQAGLNNLDASISGGEYTNTDGSKTIIPTEKSLKTGIKSYLAGTSQINSGLAQLNDN SSSLVDGANKLKTGTSQLASQTPTLTNGIGALDQGAKQLADGTSTLVSNNPTLLSGADQL ADGANQISDGAGQLAAGSTTLGTGLTTLQDGANTLATSLHDGAEQVNSINSNDSTFDMLA SPVDTSHKEISTVENNGHGMAPYMMSVGLYVACMAFTLMYPLFNDVEKAESGFKYWLSKA SVWFTVSTIASIVMIASLMFFCDFAPQQLLMTFIFAVIVGAASMALVTLLSIVCGKIGEF VLLVFMVINLGGSAGTYPLETSSAIFKAIHPFMPFTYSVDGFRKVISMSNVSLNTEIIVF VGIIIICSLLTILVYNHRIKKPTPLIPQAFENVNE >gi|223714181|gb|ACDT01000034.1| GENE 4 4926 - 5366 572 146 aa, chain + ## HITS:1 COG:lin2461 KEGG:ns NR:ns ## COG: lin2461 COG1959 # Protein_GI_number: 16801523 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Listeria innocua # 1 142 1 141 156 76 34.0 2e-14 MKVKNSVEQAICLLIIIAHSNDEKPLKSYNISDSLGVSDSYLKKIIRQLVVAGLITSEAG KKGGVCLSRKPDKITLLDIFEAIEGKEPFAKATGLVERVFVNELKEIMDQKQAMILDAFN AAEASYKEKLKTITLKMAMVERKKDS >gi|223714181|gb|ACDT01000034.1| GENE 5 5382 - 5588 395 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755759|ref|ZP_02427886.1| ## NR: gi|167755759|ref|ZP_02427886.1| hypothetical protein CLORAM_01274 [Clostridium ramosum DSM 1402] # 1 68 1 68 68 118 100.0 1e-25 MKEKSSVFMTLGTAVGLTIAVIIFVIGFVNKETSSWSALWFIVFAFIGRVIGYGLDRIVD KNNQKKHG >gi|223714181|gb|ACDT01000034.1| GENE 6 5683 - 6336 627 217 aa, chain + ## HITS:1 COG:BB0676 KEGG:ns NR:ns ## COG: BB0676 COG0546 # Protein_GI_number: 15595021 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Borrelia burgdorferi # 2 212 3 219 220 120 36.0 1e-27 MINTIIFDLDGTLIDSLVDLANTVNVILTEKGYPTHTLDEYRYFVGNGVLKLLERALPAD HQGDITAVKKRFDEIYGEICLENTKPYPGITKLINTLADQGYNLAVVTNKPQDHAVKIVK TLFPGCFKYIFGSSIRHPKKPDPCLTNLVINLFDVRKNEVVYIGDSDVDILTAKNTKVRS 
IGVSWGFRGRQELLENGADLVVDHADEIKEAINDWCK >gi|223714181|gb|ACDT01000034.1| GENE 7 6320 - 6763 475 147 aa, chain + ## HITS:1 COG:lin2433 KEGG:ns NR:ns ## COG: lin2433 COG1683 # Protein_GI_number: 16801495 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 137 1 147 147 123 45.0 1e-28 MIGVSSCLAGCKCTYSGDDKLIKLVKEMVMRNEAIMICPEVLGGLSIPRSPCEIRDGKVV DINGVDWTKEYRLGAQKSLEILQEHDVDVVLLKAKSPSCGKNYIYDGTFSHTIINGDGIT CQLLEKHGIIVFNDDNIDEFLKYIERR >gi|223714181|gb|ACDT01000034.1| GENE 8 6768 - 7703 1057 311 aa, chain + ## HITS:1 COG:lin2215 KEGG:ns NR:ns ## COG: lin2215 COG1482 # Protein_GI_number: 16801280 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Listeria innocua # 1 310 1 314 318 321 51.0 1e-87 MGHVLKINPVFKEMIWGGHKLRDVYGYDIPSDNTGECWAISAHKNGDCTISDGEFAGKTL SSLWDEHRELFANIEGDQFPLLVKIIDACNDLSVQVHPNDEYAKAHENSLGKTECWYVLN TDEGTKMVMGHHAKTKEELVKAIENDDYDNLLNKFDIKEGDFFYIPSGTIHAICSGSLIY EAQQSSDITYRVYDYHRKDAQGNERQLHVQQSIDVATVPYVPLAADSMVETAIENGTRTK LVSSEFFTVNKYEMTGKNTIVNDKPFQLVTVIKGNGVINGNSVKMGDNFVVCADQDAVEY DGTMTVMICTL >gi|223714181|gb|ACDT01000034.1| GENE 9 7703 - 7993 157 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755763|ref|ZP_02427890.1| ## NR: gi|167755763|ref|ZP_02427890.1| hypothetical protein CLORAM_01278 [Clostridium ramosum DSM 1402] # 1 96 1 96 96 179 100.0 7e-44 MYCSNCNHEFKAASFAYKARQRCPECGTLMQIQFPSGMGFIPIIVAILPAVYMVSVLQFD LLVGLSVFILVYWPIEVIMNILLIYFDKYRLYEVDQ >gi|223714181|gb|ACDT01000034.1| GENE 10 8098 - 9219 1384 373 aa, chain + ## HITS:1 COG:CAC0566 KEGG:ns NR:ns ## COG: CAC0566 COG2055 # Protein_GI_number: 15893856 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Clostridium acetobutylicum # 1 361 8 368 369 378 50.0 1e-105 MAHLKFKFETLNAFCNEAFEKFGFTSDESKTITDVLLLSDVYGIESHGMQRLSRYHKGIE KGMIKVDAKPEIVFETPVSAVIDGHAGMGQLISKFAMEKCIEKAKTVGMAIVTVRNSNHY 
GIAGYYAKMACDEGLMGMSFTNSEAIMVPTFARKAMLGSNPIALAMPAKPYPFFFDASTT VVTRGKLEMYRKAEKELPNGWALDKDGNPSIDAPDVLDNIANKIGGGIMPLGGSTEQLGS HKGYGYGMFCEIFCSILSQGLTSNHTHTNGIGGTCHGFIAIDPKIFGNPEDIETHFSTFL QELRDAPKAEGQPRIYTHGEKEVEATKDRMENGIDVDVKTVLEMKDLSNFVGVDFEKYFG KLDIEDNDYKTVY >gi|223714181|gb|ACDT01000034.1| GENE 11 9236 - 9604 186 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 [Streptococcus pneumoniae SP3-BS71] # 1 122 1 126 126 76 33 3e-13 MKFKCIHCNINITDINKSVAFYQKALGLKVERTKEANDGSFILTYLGDGLSNFQIELTWL RDHPQAYELGENESHICFEVDDFDQAYQLHKEMGCICFENKAMGLYFINDPDDYWIEIIP KK >gi|223714181|gb|ACDT01000034.1| GENE 12 9834 - 10247 444 137 aa, chain - ## HITS:1 COG:SA1185 KEGG:ns NR:ns ## COG: SA1185 COG0824 # Protein_GI_number: 15926931 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Staphylococcus aureus N315 # 10 135 10 136 155 108 39.0 2e-24 MMIRPYLHNAKYYETDQMGIIHHSNYIRWFEEARIDYMNQIGLTYKKMEEEGIISPVLSI NCNYQKMMYFDDLAIIEVKITKYTGVRFACEYKIYNQKHTLCTTGNSNHCFTDRSGRPIN LKKIKPDFDRLFKKIIE >gi|223714181|gb|ACDT01000034.1| GENE 13 10237 - 10914 547 225 aa, chain - ## HITS:1 COG:FN1387 KEGG:ns NR:ns ## COG: FN1387 COG2220 # Protein_GI_number: 19704722 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Fusobacterium nucleatum # 3 210 2 225 237 113 33.0 3e-25 MQVTYIYNSGFLVELDKHILLFDYYQGTIPPLNQNKPLYVFVSHFHHDHYNPAIYQINHP KITYIIDRKINNTGIKVRPSEIYEIDDLYIQTLLSTDAGVAFVVKVENKQIYHAGDLHWW HWIGEPEADNKYQAGTFKKEISKIKDIHFDLMMIPLDPRLEESSWWGMEHILKNIKTKYV LPMHFTDNPKMMLKYLNHEPLKQYNNILKINNEGEIFIIGDNNDD >gi|223714181|gb|ACDT01000034.1| GENE 14 10944 - 11678 992 244 aa, chain - ## HITS:1 COG:VCA0006 KEGG:ns NR:ns ## COG: VCA0006 COG0217 # Protein_GI_number: 15600777 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Vibrio cholerae # 1 236 1 235 239 226 48.0 2e-59 
MGRAHEVRKVAMAKTAAKKAKLYSRYGKEIYLAAKAGGPEPDGNLALRRLIEKAKKEQVP ADVIKRAIDKVKSGVTENYEEKVFEGYANGGATLIVKCLTDNINRTISSVRPAFTKSKAK LGNEGSVSYLYDVISVVAFKGLDEETVLDALVMAEVDAQDIVSEEDGTIKIIGEPTDLYK IKEAIEGIKSDIEFEIDEIQTLPQDTVELEGEDMELFERLMNNLNDCDDVDQIYHNVANY DEGE >gi|223714181|gb|ACDT01000034.1| GENE 15 11958 - 13082 1030 374 aa, chain + ## HITS:1 COG:CAC3590 KEGG:ns NR:ns ## COG: CAC3590 COG2081 # Protein_GI_number: 15896824 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 5 365 4 405 405 178 31.0 2e-44 METTVIIGAGASGLICGHELAKSNHNVIILEQSNKAGRKILASGNGRCNLSNTNLDMYFY NTDESKIKRIIHAFDARKYFNQIGVMTRQVGTLLYPRSNQSLTVKNALMNNFDQVALIEE CQAKNIIVKKDGYIVKTNKGEYFCNNIVIATGSPASLLSGKNSYDLVKLLDLKVTKLYPS LVQIKTKPAYRSLKGVRTKCKASLLVDGKLIESQNGEVLFSDNGLSGICIMQLSRLLSRY CGNKEISLDLMEEYTSQELERFLKERHDRFGHYYLEGIFNDKLAKILKDIDNLKDIRFKI IDTYDYSKAQVMSGGVSLDEIDENLEVIKYPGIYLAGELLDVDGDCGGYNLHFAFASGHH IAKTIMNKQVKKCY >gi|223714181|gb|ACDT01000034.1| GENE 16 13073 - 14644 1649 523 aa, chain + ## HITS:1 COG:L195271 KEGG:ns NR:ns ## COG: L195271 COG2509 # Protein_GI_number: 15673161 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Lactococcus lactis # 1 520 1 525 535 461 45.0 1e-129 MLLINNVKVPLELADYRKIISQQLNISKNKIFDVKLVKQAVDARRKNKVHFVCSFNFTVE NEDLMIKKYPKLNLSKVVAYDYPVLKSTDEHIVVVGSGPAGLFCAYNLARAHQKVTLIER GNAVEQRKEDIDNFFKTGKLSPDSNVQFGEGGAGTFSDGKLTTGVKDKRKKFILETFVKH GAESDILYVNKPHVGTDYLIKVVKSMRETIIANGGEVLFETKLVDVNLADDQLVNIVVEK NNIKTTMELDKLVLAIGHSARDTYEMLYQKGIKMEQKSFAVGLRIEHLQSFINEHQYGKY ANHPSLKAADYKLAVKTSNGRGVYTFCMCPGGKVINSSSEAGGIVVNGMSNQARDEANAN SAVLVTVGPEDFASSHPLAGITYQRELEQKAFELGGKDYSVPVMRVEDYLNDTLDLKMEE VSCSVQPNVRYAKLSQIFSNEVNLALKEGLQLMNHKFTGFTEKAMLSGVESRSSAPVRLY RDENFQSNIKGIMPIGEGAGYAGGIMSSAIDGLKCSEFILKGE >gi|223714181|gb|ACDT01000034.1| GENE 17 14652 - 15212 673 186 aa, chain + ## HITS:1 COG:Cj0167c KEGG:ns NR:ns ## COG: Cj0167c COG1971 # Protein_GI_number: 15791554 # 
Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Campylobacter jejuni # 1 186 1 186 187 162 54.0 3e-40 MSIIEIALIGVGLAMDAFAVSICKGLAMRRMNYKKAIIIAAFFGVFQALMPALGYVLGTT FANKIAAIDHWIAFILLALIGANMIKEALSSDDDECQDDSLRLGDLIMLSIATSIDALAV GITFAFFNVSLLLSVSMIGIITFIICVIGVKVGNVFGEKYKSKAELAGGLILIVMGAKIL IDHLFF >gi|223714181|gb|ACDT01000034.1| GENE 18 15224 - 16627 1212 467 aa, chain + ## HITS:1 COG:BS_yubG KEGG:ns NR:ns ## COG: BS_yubG COG0168 # Protein_GI_number: 16080162 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus subtilis # 17 467 16 445 445 316 43.0 5e-86 MKVSTLKMHHEHHPMRPTRQIAISFLAVILIGSVLLMLPICNNTTPTAYLNNLFIATSAT CVTGLVPVVTSEQFNILGQIVIIILIQIGGLGFLTFLNLLLIMVKKKISLTNKLVLQEAL NQPSLNDLPKFVKNVIKYTFIVEGLGAIILAFVFIPDFGIVKGIYYSIFHSISAFCNAGF DVLGANSLIGYQNNLIINLVIPGLIIMGGLGFVVWFDVAHCIKKEYGKRSKFSKKHLFKS FSLHTKLVIIATVVLLLAGAVLFYLCEFNNIKTIGGLPLFDQIQISFFQSATLRTAGFAS VDMASLHPSTKLMMCVLMFIGGSPAGTAGGIKTVTFAIGVLEVYNIYHGRKEVTAFSRRI PKRLIVRSFAIISMALAIVFLSIFALSITEQAPFIDICFEVVSAGATVGLSASLTPYLSV FGKIVIIMLMYIGRIGPITMMISFARKSYMQASKKEVRYPDGNILLG >gi|223714181|gb|ACDT01000034.1| GENE 19 16643 - 17308 1013 221 aa, chain + ## HITS:1 COG:SPy0326 KEGG:ns NR:ns ## COG: SPy0326 COG0569 # Protein_GI_number: 15674488 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Streptococcus pyogenes M1 GAS # 1 219 2 220 224 170 42.0 2e-42 MNSKQYAVLGLGIFGSTVATTLAEYGCEVIAVDQDESCVERVADEVTKAVVANVTDQEEL RAIGIEDIDVAIVAIGTHLEEAVLATMNLKELGVPYVIAKAKNKQFMKILEKVGANKVIR AEKDMGLKVAKSLLRKSIVDLVELDEDYSVVEIKAPLDWVGKNFIDLNIRRVYNMNIIGI KHGDEDHLSLDVAPEYVIQNGDHFLVIGKTKELERFDYMTK >gi|223714181|gb|ACDT01000034.1| GENE 20 17651 - 19012 1647 453 aa, chain + ## HITS:1 COG:FN0242 KEGG:ns NR:ns ## COG: FN0242 COG0569 # Protein_GI_number: 19703587 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding 
component # Organism: Fusobacterium nucleatum # 1 451 1 450 452 315 41.0 8e-86 MNVIILGAGKVGKTLIKHMSNEDHDIIVIDQDAQKIEDVVNLYDVIGVVGNGGSYDILME AGATTANLIICVTASDELNILAGLMAKQMGTRHTIARVRNPDYSKQRDFMRNQLGFSMII NPELEAANEIRRSLSFPSAVKVDTFARGKVELAEFYVEEDSYLRDIPLSQLHSITKTNIL VCAVSHNEEVIIPDGNYIIKPGDHLFITGSHRDLSRFCLDIGIVTNRIKNVLIVGGGKIA YYLAKQLGVQGIKVKIIEKDIEHCRVLAKKLPFVTVINADGSDEEVLLEEGLETTDAFLA LTGLDEENLILSLYAKNIYHKKTIAKVTRMSFTGLADSLKVDSIVAPKKIIAGQIIRYVR AKMNKDDDSSVKTLYKIVDGEVEASEFIATPKITFLGMTLNDLNLKNHVLVAAISRENET IVPKGDTTIELGDHLVIISRGETMKSLNDIIRR >gi|223714181|gb|ACDT01000034.1| GENE 21 19018 - 20457 1134 479 aa, chain + ## HITS:1 COG:FN0993 KEGG:ns NR:ns ## COG: FN0993 COG0168 # Protein_GI_number: 19704328 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 1 479 1 481 483 410 50.0 1e-114 MNKKMIVFTIGKLLTITGLLMFMPVIVSLIYQETEGGYYAILGIIMLALGILISRKAPRK KNIYAREGFAIVALSWLLISLLGAIPFCLTGEIPSYVDAVFETVSGFTTTGSSILSNVEA LSHTSLFWRSFTHWVGGMGILVFVIAFIPIASGRSLHILRAEVPGPVVGKLVSKVRLTAR ILYVIYAIMTVIEIILLLFGGMPLFDCILNAFGTAGTGGFGIKNASIAAYDSAYIDGVIT VFMILFGINFNLIYFFILGRIKEVLKSEELRWYLIIIVAAIALITINILPLYDSILSAFR YSSFQVGSIITTTGFVTTDYGQWPVFSQVILLSLMFVGACSGSTGGGIKVSRIVIYFKNA RNELHKLLHPHSVRAVEFENQPVEDNVIRTIHAYLVVYITIFALSLLILTFLNLDFKSAF SAIASCLNNIGPGLDVVGPVSNFGSLPDVSKIVLTFDMLAGRLELFPMLLLFSPSLWRK >gi|223714181|gb|ACDT01000034.1| GENE 22 20501 - 21679 1149 392 aa, chain + ## HITS:1 COG:CC2008 KEGG:ns NR:ns ## COG: CC2008 COG0232 # Protein_GI_number: 16126251 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Caulobacter vibrioides # 43 383 28 388 394 127 25.0 4e-29 MKKEEIVIKNALRMTKEADDQLSIYATKNRDCIKVKQTIRNREEFDIRWPFEEDIDRILY CKSYQRYVDKTQALSFFNNAHISKRSIHVQWVSRIARQIGRGLALNQDLIEAAALGHDLG HAPYGHVGEKALDKILQIKGFGYFAHNANSVRNILFIERDGIGYNVSLQVLDAILCHNGE MLSKKYMPDYDKTIAQFWQEYEDCWHQKNTSLIIRPMTLEGCVVRVSDVISYIGKDIEDA 
IKVGIIKQSDLPKEVTDILGFDNKSMINRLIGDIVIHSYGKPYLRFSNEVFSALKTLLNF LSNRVHQHPVLVKENTKLVRMINQLYHVYYDELTDPNNHDCKIKAFVNKMAPAYSKNDPA LIVADYLSMMTDSYVLNEYESIFLPIQHNEIL >gi|223714181|gb|ACDT01000034.1| GENE 23 21752 - 22078 405 108 aa, chain + ## HITS:1 COG:BH0043 KEGG:ns NR:ns ## COG: BH0043 COG3870 # Protein_GI_number: 15612606 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 108 1 109 109 99 49.0 1e-21 MKLIIAIVNSDDSSSVQGALTEGGYFVTKLSTTGGFLKKGNTTFFVGTNDDKVEDCISII KAHSKKRVEKEPTVPPTEMGEFFTPIMVDVLVGGATVFVLNIDQFEKF >gi|223714181|gb|ACDT01000034.1| GENE 24 22134 - 23795 1715 553 aa, chain + ## HITS:1 COG:lin1586 KEGG:ns NR:ns ## COG: lin1586 COG0285 # Protein_GI_number: 16800654 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Listeria innocua # 169 550 25 425 429 226 35.0 1e-58 MEYRILKENEIKEAIELVARVAKKDIYKDFDQEGINSFNQVNCDTFYRNKQNLTYICLEN KKIIGIITLTQGKHLSLLFVDNDYQGNGIGTKLFELIDNLALADLEVNSGIGAKGFYLNL GFKMVGDLTKKNGILYYPMLKQRVVNKRFNTYDEVVNFINSQKDRVYSLNNFKHYMEDIG NPQLFLDCVHIGGTNGKGSTTNYIKEVLKQAGYRIATFTSPALYSRLDIIRINDQFIDEE TMVRYANRYVELWLKYEISMFEIEVFIAIMYFIEQNVDFALFEVGLGGLLDATNIIMPKL AINTNIGLDHVDYLGHDYQSIALNKAGIVKEGIDYLTGETKEECLAVFRDVCLKHHSELI TLKPITKIIDGNNVSYHYRDYDIILDTPALYQINNSALALEALIYLKEHQFVDFSDDDLL QGMYNARWAGRFEIINIEPLIIIDGAHNKEGIDAFYECAKKYDNIKIIFSALRDKDYKHM IEKLLQLTKDITICEFEHVRASDAKTLANGFDVKIEPNFKTAIDKAFTHEGTVFVTGSLY FISKVREYIIDRS >gi|223714181|gb|ACDT01000034.1| GENE 25 23902 - 25512 1531 536 aa, chain + ## HITS:1 COG:FN1559 KEGG:ns NR:ns ## COG: FN1559 COG3263 # Protein_GI_number: 19704891 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain # Organism: Fusobacterium nucleatum # 1 528 1 527 527 363 41.0 1e-100 MTLFLLLIAVVVIVCIFATRLSSKFGVPTLLIFILLGMVFGSDGLFKISFDDFIISEQIC TFALIYIMFYGGFCLNFNSAKPVLIRATIMSTLGVILTCGFVGLFCYWVLHTSLIEGMLI 
GAVMASTDAASVFSILRMKRLNLKAGIAPLLEMESGSNDPTAYMLTIIILTLLSSGKTNI YMLIFSQIIFGIISGVVIGLLAKNYLDKIQLNEVGLKSIFVAALALLAYALPVALNGNGF LSVYLFGIILGNSRLKKKVSLVHFFDSISHMMQILLFFLLGLLSFPSQIPDIIGSAILIT LFLTFVARPLTVFICLTPFRVPFKQQLFISWCGLRGAASIVFAIMTVVSPAYTNNDIFHI VFFVALLSITFQGSLLPLVAKWLKVEDDQNDTMKTFNDYSDDSSFELIRVYLSDDHPWIN QTLIDIVMPTNMLVATIIRDNQMILPKGTTRVLKGDLLIMCAPGYEGNDIYLDEEYIEEH HHWIDCTLAMINPRNKFLVVLIKRDNKMIIPDGKTRIKRDDMLVICKKEYLGFVDE >gi|223714181|gb|ACDT01000034.1| GENE 26 25703 - 26497 684 264 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734747|ref|ZP_04565228.1| ## NR: gi|237734747|ref|ZP_04565228.1| predicted protein [Mollicutes bacterium D7] # 1 264 1 264 264 369 100.0 1e-101 MIQIICGIILVLIGIFVFWKFPIKSDTKSLTLAALFIVLAAILKRLSIMIPLFGFESLKI SIEVIPMLLAGIMLAPGYCYIIGLAVDLVGLIVTPTGFPFLGFTLSAVLQCLIPSIVVAT IKENYMNYLEKAIQVILIFLGIGACFYVFSLDQVVISKNVVDITLNFKIIISVVCIVMIS VLFTVMRYYKKKLNDQEYHLFNLWLISVVLVEMVITFMLTPYWLQVMYGIPFTLSLFIRV IKECIMIPVDIILGYSVLRVLKRL >gi|223714181|gb|ACDT01000034.1| GENE 27 26565 - 27008 559 147 aa, chain + ## HITS:1 COG:no KEGG:BPUM_2073 NR:ns ## KEGG: BPUM_2073 # Name: spoVAC # Def: stage V sporulation protein AC # Organism: B.pumilus # Pathway: not_defined # 6 142 8 144 150 138 51.0 5e-32 MNTSQYNEIAEDLKPQKQVGKHCLSAFIYGGIIGAVGQGILEFYMYIFDVSEKEATPMMI ITIVLAAAILTGLGIYDRIGKKAGAGTFIPITGFANSMTSSALEAKSEGLVTGIGANMFK LGGTVITFGIVASFVLGGIRYVISLIW >gi|223714181|gb|ACDT01000034.1| GENE 28 26986 - 27990 1074 334 aa, chain + ## HITS:1 COG:no KEGG:BL00782 NR:ns ## KEGG: BL00782 # Name: spoVAD # Def: stage V sporulation protein AD # Organism: B.licheniformis # Pathway: not_defined # 13 334 13 335 339 315 47.0 2e-84 MSSVLSGSSMQLNNVYVRATGTTCGIDEFNGPLGQYFDRHYPDFHFGHKSYELAEMAMLQ DAISHALKKGKLKKDDINVFLGGDLNNQVTASNYTAKDFQRPFMAMYGACATMGLVINAA SMLIENHCIHNANCFVCSHNATAERQFRYPIEYGVQRKETMTTTATGAVSVILDNNPSAV RIEALTIGRVIDMNQSDPNDMGRAMAPAAFDTLVNHLKDLNRQGTDYDLILTGDLSTYGK SIMKELMNENSVTYNEYDDCGCLLYDSDQDVNQGGSGPSCSALVAFGYIYQKMLKHKYKR VLVITTGALLSPVMTNQKQSIPCIAHAFSLEVVE >gi|223714181|gb|ACDT01000034.1| 
GENE 29 27990 - 28340 441 116 aa, chain + ## HITS:1 COG:no KEGG:RBAM_021510 NR:ns ## KEGG: RBAM_021510 # Name: not_defined # Def: SpoVAE2 # Organism: B.amyloliquefaciens # Pathway: not_defined # 1 115 1 115 116 117 61.0 1e-25 MNYLLAFIFCGFVCVIAQLIYEYSKLTPGHITSLFVVIGAFLDLFHIYDKLVEIFHAGAL LPITSFGHSLMHGALAATKEFGVFGLAMGIFDLTAAGISSAILFAFLVAICTKPKS >gi|223714181|gb|ACDT01000034.1| GENE 30 28337 - 29620 1140 427 aa, chain + ## HITS:1 COG:no KEGG:BPUM_2069 NR:ns ## KEGG: BPUM_2069 # Name: spoVAF # Def: stage V sporulation protein AF # Organism: B.pumilus # Pathway: not_defined # 7 419 28 465 490 337 37.0 7e-91 MINNNPSFDLNTRTLDFENEKVTINYVSSLCSDDLIAYLVEGITNHRGETLKDCLNNGDV KDEPDYQKAMYSMLTGCAMISYRDKTYVLDTRHFPSRSVEEPETEKSVRGSKDGFNESML TCAGLIRRRIRTIDLVMNKQTIGKNNPLDVCLCYLDSTVDQTMLETVLQRIKEIKNEDLV MSDRAVEEMILDQGYNPFPLVRYSERPDVVATHILHGHIAIICDTSSSVMMLPTTLFEIL EHVEEHRQTPIIGTFIRLIRCSAVLISIYLVPLWLLLTSKGTIDLVFLGQVLLVELAIEL LRIATIHTPTSLSNAMGMIAAVLLGQFAIDLGIFSEEILLFCAIGDVGGFATPNYELSLT NKYLKIFLIIFVGFLGWIGFVIYHVILIGYLVSLKPFGVSYLYPLYPFDGKEMLNFIIRK PKTKRGS >gi|223714181|gb|ACDT01000034.1| GENE 31 29822 - 30475 770 217 aa, chain + ## HITS:1 COG:lin2065 KEGG:ns NR:ns ## COG: lin2065 COG1354 # Protein_GI_number: 16801131 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 216 27 244 249 135 37.0 8e-32 MDLEKLELSLIADQYLQYIHAMDPSKLEIMSEYLVMAADLIEMKSKMLIPKEKVTINDEY QEDPREALIKRLIEYKRYKDVLDEIRDKYEQRQTVIIKPAESMDEYVVDTSSMIPEGLEV YDLMKAMQKMFQRKAMMKPLDTHIAKKDISIDERTAQIRNYFKTRVNKKVKFEELFDRHD RFYFIVTFISILVLAKDKEVEIIQDGLFEEIYVEGKV >gi|223714181|gb|ACDT01000034.1| GENE 32 30475 - 31089 964 204 aa, chain + ## HITS:1 COG:BS_ypuH KEGG:ns NR:ns ## COG: BS_ypuH COG1386 # Protein_GI_number: 16079378 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing the HTH domain # Organism: Bacillus subtilis # 5 191 7 195 197 143 43.0 2e-34 MEQNNYLDIIEGMLYLAGDDGVDIKQVAGVLEISKKEATLLMDQFTEIYGNKALKGIVLV 
NFGGRYKLATNADYFIYYQKMVEQSSASLSNAALETLAIIAYNQPITRAGVEDIRGVGCD AMIRKLVAKALIKEVGREDTPGMPILYGVTDEFMDTFGLTTLEELPDLADIVEVDEQEDI FATRYREDTSDDISKAESELNETV >gi|223714181|gb|ACDT01000034.1| GENE 33 31076 - 31600 484 174 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2682 NR:ns ## KEGG: Cbei_2682 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 173 1 176 185 178 53.0 6e-44 MKLFKHFKTITKHKFYVMKLCFRFGLYKQGLKHDLSKYSWTELVTGAKYYLGYKSPNSNE RDTIGYSSAWLHHKGRNKHHWEYWIDFTSKGIISIEMPINYVVEMFCDRVAATMVYQGTQ FNFKAPLDYYNKTHHYYVINENTDRILRDMLERLANSNLDETIEYIKEKYLRNH >gi|223714181|gb|ACDT01000034.1| GENE 34 31752 - 32795 821 347 aa, chain + ## HITS:1 COG:BH2234 KEGG:ns NR:ns ## COG: BH2234 COG3706 # Protein_GI_number: 15614797 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and a GGDEF domain # Organism: Bacillus halodurans # 177 343 132 311 314 114 36.0 4e-25 MNEKGKAAHSNYVRSVYLQIVALALIMHLIYLCIFYLLNFNFLVVYNLFSVIFYGIMLVI INKGYYRTAVLSVHLEVITFVGITVFTLGWESGFPLYLLAMVSLVYFWPFKNSRWAYLCA IIEVFIYIIIRIISSTQDAPIYIISDKNIILTLSIFNAVCCFIVMIYSVFISDVSSKAID KLNENLQEIADNDQLTGLRTRHYLFDSLNSSPNLDCNIVIGDIDDYKIINDTYGHLCGDY VLHSLANLMKDKIPDDIDICRWGGEEFVFLCFNTNKEELTYVIERFNNLLRQHSFIYEGN VIKITMTFGISDTMKAKHLNTMIKCADDRLYKGKRCGKDQIIVDDKK >gi|223714181|gb|ACDT01000034.1| GENE 35 32839 - 33561 599 240 aa, chain - ## HITS:1 COG:AGl927 KEGG:ns NR:ns ## COG: AGl927 COG0600 # Protein_GI_number: 15890580 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 12 229 29 245 269 83 28.0 3e-16 MRRKIITLILIIGLWQGFALSIDKAVILPLPLVVFNQMFNLATSQSFYIAIGATLSRVAL SFFLALIVGTILGVFSGLFKSVNEYLAPIFSFLQTIPQIAYILILLVWFKSLTALIIIVL LMILPVFYNNAVNGIKNISNDLNDVTILYHHPFKFNLIHVYLPLIKGYIVSAVETALPQS LKVGVMAEIFVSSNQGIGKQLYFARAQIDMVSIFAWTIWMVIIIMLITYFTNKIINKDKK >gi|223714181|gb|ACDT01000034.1| GENE 36 33558 - 
34451 931 297 aa, chain - ## HITS:1 COG:no KEGG:CLH_2307 NR:ns ## KEGG: CLH_2307 # Name: not_defined # Def: ABC transporter, substrate-binding lipoprotein # Organism: C.botulinum_E3 # Pathway: ABC transporters [PATH:cbt02010] # 1 289 1 309 326 85 27.0 3e-15 MKKIIKYLLCLMMIFTMIGCSNTETKDKSNTVKILCPAGAPSLAFVSEYENISKKGKIDF VDGSDQLVAELSKNSSDYDIIVAPINLGAKLIASEQTDYRIQGVITWGNLYYIGTSNEAL NQTGELGLFGEGAVPQKIVDTVKINTSLTPKYYQSATLVQQQLLSGNIQVGMLAEPLASA TIAKAKQAGIELSIIADLQKEYGKDGYPQAAIFVKENNNYDELFEAIDQFTNNDYNGLKD YLEKIGIETLSLPSVEITVKSIDRQNVHYKKASECSEQIKDFLKLFNIDYNKNMLSS >gi|223714181|gb|ACDT01000034.1| GENE 37 34472 - 36220 2100 582 aa, chain - ## HITS:1 COG:CAC0028_2 KEGG:ns NR:ns ## COG: CAC0028_2 COG4624 # Protein_GI_number: 15893326 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 215 579 2 365 371 355 49.0 1e-97 MSKIKMTINNREVEAYEGQTVLEAAKNNGIHIPTLCYLKDVTGTGACRVCQVEIEGAKTL CAACVYPVREGLVVKTNSQRALDARRRVVELIVSNHSKDCLSCIRNTNCELQRLCQELGV REDAFAGEKTAPTFDEVSPGIVRNTSKCVLCGRCVETCAKTQGLGILGFMNRGFKTKVGP VYDKSMNDVNCMQCGQCINVCPVGALQEKEEVHNVIAALNDDSKHVVVQTAPAVRASLGE EFGMPIGTRVTGKMVHALKLMGFDRVYDTNFGADLTIMEEGYEFIHRISNDGVLPMITSC SPGWVNYIEHEYPELLDHLSSCKSPHMMLGSMIKSYYAKENNLDPKDIYVVSIMPCVAKK GEKEREENLTDGLKDVDAVLTTRELGKLIKMFGINFRDLKDEDFDQDMFGEYTGAGVIFG ASGGVMEAALRTVTDVLTKEDLTNLDYHAVRGEEGVKEASLKIGDMTVNVAVAHSMVLAK PLLEEIKNGTSKYHFIEIMGCPGGCVNGGGQSYVNALTRNSGFDWKQARAKALYDEDLAL PVRKSHKNSQIQKLYADFLGEPNSEKAHHLLHTHYTRKERFK >gi|223714181|gb|ACDT01000034.1| GENE 38 36230 - 38029 1922 599 aa, chain - ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 1 527 1 525 527 664 59.0 0 MPTKRTQVLVCAGTGCTIGNSGELITEFEKEIKALGLEKEVEVLRTGCLGLCGVGPNISI YPDNIIYKTVQVSDVKEIVMEHFYKGRPVHRLMLNESDEETKEIHDINDTKFYNKQKRIA 
LHNCGVIDPENINEYIGKDGYFALAKVVSEMTPQEVVDVIKASGLRGRGGGGFPAGVKWQ FALNEPGKEKYVICNADEGDPGAFMDRSILEGNPHSVIEAMAIAAYAIGANHGYIYIRAE YPIAVERLNTALEQARELGLIGKNIFESGFDFDLELRLGAGAFVCGEETALIQSIEGERG MPNPKPPFPAHKGVWGKPTIINNVETYANIAQIIQHGAEWFRSIGTETSPGTKVFALGGK IVNTGLVEVPMGTTLREVIYEIGGGCPNHKRFKAVQTGGPSGGCLTEEQLDTPIGFDELV KLGSMMGSGGMIVLDEDNCMVDVARFYMDFIVDESCGKCTPCRVGTKRMLELLEQICDGK GTMETLDELELLASTIQDTALCGLGQTAPNPVLSTIHQFRDEYIAHIVDKKCPAGVCKEL LQYEIDEDKCRKCGLCAKQCPVGAISGELGKVPYVIDQEKCIKCGQCIKACHFNVIERK >gi|223714181|gb|ACDT01000034.1| GENE 39 38041 - 38409 525 122 aa, chain - ## HITS:1 COG:TM0011 KEGG:ns NR:ns ## COG: TM0011 COG3411 # Protein_GI_number: 15642786 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Thermotoga maritima # 1 117 6 123 128 72 33.0 2e-13 MKSLEDLKKLRDAAQSNMSMRTDEQQKYRVVVGMATCGIAAGARPVLNTLVETVAQEKLP ATVLQTGCIGMCTLEPIVEVFDRDDNKTTYVLVDSAKAKEIAEKHLKHDEIIDEYTVGHY KK >gi|223714181|gb|ACDT01000034.1| GENE 40 38406 - 38939 669 177 aa, chain - ## HITS:1 COG:TM1665 KEGG:ns NR:ns ## COG: TM1665 COG0642 # Protein_GI_number: 15644413 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Thermotoga maritima # 1 151 5 153 186 86 33.0 2e-17 MQEIAMNILDIAYNSIKAQANLIKIIIHDSSKLNIINIQIIDNGIGMDKTTIKNVIDPFY TTRTTRKVGLGVPMFKENIEATGGTFKIESEVGKGTHVTGEFIKNHLDTPPMGNIVETII TLIQADEKINYLFKYTSDKLEFILDTQEIKKILEGVPINQPEIIVWLKEYIKEGLQL >gi|223714181|gb|ACDT01000034.1| GENE 41 38939 - 39643 598 234 aa, chain - ## HITS:1 COG:TM1352 KEGG:ns NR:ns ## COG: TM1352 COG0613 # Protein_GI_number: 15644104 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Thermotoga maritima # 1 210 1 193 232 73 31.0 3e-13 MYYDLHIHSALSPCGDDTMTINNIFNMSYIKGLELIAITDHNSLKQQNYLEEIINHEILK GKIDYIHGVELQSKEEIHILAYFLKDTDLKPIQAWIDSYLIKVPNNPDYYGNQYIFNVND EIIDHEDNLLISSLDLDIYQIIETIHSFNGIAVLAHVLAKKFGIYEIYQGIPDDLAYDGI EVGNLRELKELKKRCPSVRCEHVFYNSDAHNLEAINEPVNQINKDDFYQLWRKI 
>gi|223714181|gb|ACDT01000034.1| GENE 42 39645 - 39971 332 108 aa, chain - ## HITS:1 COG:no KEGG:HRM2_16550 NR:ns ## KEGG: HRM2_16550 # Name: not_defined # Def: iron-sulfur binding hydrogenase # Organism: D.autotrophicum # Pathway: not_defined # 27 106 466 545 548 77 51.0 2e-13 MNIKEILQLEIKQINKAPLTNIVENVYIGDLLSFVMANGKEGALWLTVQKHLNVLAVAEL NDFAGIIFVQNSFPDGDTIAKADELEIPLFISQLDAYHLCKQLISLGL >gi|223714181|gb|ACDT01000034.1| GENE 43 39976 - 41139 1158 387 aa, chain - ## HITS:1 COG:TM1421 KEGG:ns NR:ns ## COG: TM1421 COG4624 # Protein_GI_number: 15644172 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 1 188 1 182 301 62 25.0 1e-09 MKQVISYLGSGCKNCIKCVKSCPMDAISIVNEQVIIDEDKCINCDICIQACDQKVLRIKN IDLESTLKKHDYNIALIPTAILSDLKTYDEIKNIAHAIKEFGFDEVVQYSDIEGLLYKQG LKDSKNGEEVMLTSFCPTINKLIINDYPTLIDHLLPYDYPVEIAAKKLRQKYAQKDIAIY SLCECVGKLTLAKQPFGNEDSNIDYALSVSQMFPRINQFKNDSQEELEINEYGVKSIVGD LYGDKRLSAISVEGLSQIRKALDLIEFDQLKHVDLMALFACFQGCIGGYYLWSNPFEGCF NIESMIDDCNGNLASLDKKDYIKAHEITSGNEKNFKERLAWFNKVNAILETLPQYDCGSC GFANCRGLASRIASGEVDDSLCRVKRR >gi|223714181|gb|ACDT01000034.1| GENE 44 41139 - 41543 491 134 aa, chain - ## HITS:1 COG:TM1354_2 KEGG:ns NR:ns ## COG: TM1354_2 COG2172 # Protein_GI_number: 15644106 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Thermotoga maritima # 16 134 45 164 181 87 42.0 6e-18 MIKTYQVEKDNYQDAGKASSDIKRTLKALGIDRKILKAVAIACYEAEINIAIHSDGGTVT FEIDDDGIVHLAFDDIGPGIEDLNLAQTPGYSTASPKARELGFGAGMGLYNMKSVASTFE ITSSNEGTHIKMTF >gi|223714181|gb|ACDT01000034.1| GENE 45 41540 - 41893 398 117 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3800 NR:ns ## KEGG: Cphy_3800 # Name: not_defined # Def: DRTGG domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 1 116 1 111 116 83 38.0 2e-15 MLVKEIVDILKAKEIYIDDQAIYQKDYKKGFSSDLMSDALALLKNETEEVLFITGLANVQ 
SLRTAEVLDIDTILFVRGKPLDMTIVDMAKNLHINLFQTDETMFEACGKLYEAGMRR >gi|223714181|gb|ACDT01000034.1| GENE 46 41887 - 42378 650 163 aa, chain - ## HITS:1 COG:TM0012 KEGG:ns NR:ns ## COG: TM0012 COG1905 # Protein_GI_number: 15642787 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 3 136 17 152 176 123 44.0 1e-28 MKKLKQEYLDKIDEIVAKHKDEKGPMKLMLHEIQNELGYIPFEAMEKISETIGVPVSKVY GVVTFYSQFTTEPKGKHVIAVCLGTACYVNGSQTILDLLCEMTGCEVNSTSPDGLFSIDA TRCVGACGLAPVVSVDGIVFGCTKQLEDLKMLVLDYLKEEAPC >gi|223714181|gb|ACDT01000034.1| GENE 47 42576 - 43505 927 309 aa, chain + ## HITS:1 COG:SPy0898 KEGG:ns NR:ns ## COG: SPy0898 COG0583 # Protein_GI_number: 15674920 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 242 1 238 301 99 27.0 8e-21 MNLLHLKYVIEVAKAGSISKAASNLYMNQPHLSKTIKDLEENMQIIIFERTSKGVIPTKK GAEFIERAKSIIMQVEEMEAMYQGYDDKSVHLDICVPRASYVANAFTNYLSMINFHDKNI KVNYRETNSFDTIKDVYEGEANIGIIRFPIDEESYFLHLLETKELNFEKLYQFNYQLLFS NENPLSFKEIIYLSDLETQIEITHGDTALPSLPIVATNRNDETRHSKKEISVYERASQFE LLNHLPHTFMWVSPIPQEVLRRFNLVTKECQENKMSYQDLLITRKGYRLSKEDEGFIRAI KCSIKEINI >gi|223714181|gb|ACDT01000034.1| GENE 48 43595 - 44812 1529 405 aa, chain + ## HITS:1 COG:VC2033_2 KEGG:ns NR:ns ## COG: VC2033_2 COG1454 # Protein_GI_number: 15642035 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Vibrio cholerae # 6 399 6 411 443 319 42.0 6e-87 MQRFTLPRDLYHGKGSLEALKSLEGKKAIICVGGGSMKRNGFLDRAIAYLEEAGMQVKLF EGIEPDPSVETVMKGAAVMEEFQPDWIVAMGGGSPIDAAKAMWIKYEYPEITFEEMCKVF GIPKLRKKAHFCAIPSTSGTATEVTAFSIITDYEKGIKYPIADFEITPDVAIVDPDLAET MPIKLVAHTGMDAMTHAIEAYVSTANCNYTDPLAIHAIEMIQANLVKSYNGDMGARDDMH DAQCLAGQAFSNALLGIVHSMAHKTGAAFVDQGGHIIHGAANAMYLPKVIAFNAKDETAK KRYGVIADYMHLGGNSDDEKVALLIAYLRKMNDELNIPHSINHYGADGLPADTGFVSEDV FLARVDNIAALAIEDACTGSNPRIPTQEEMVELLKACYYDSEVDF >gi|223714181|gb|ACDT01000034.1| GENE 49 44990 - 45526 494 178 aa, chain + ## HITS:1 
COG:FN0808 KEGG:ns NR:ns ## COG: FN0808 COG0406 # Protein_GI_number: 19704143 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Fusobacterium nucleatum # 1 153 1 164 206 83 37.0 2e-16 MSLYVTRHGQTNYNVNDLVCGISPAALTTDGIEQAKELGRQLKSIKYDFLYVSPLQRAID TADYANVEGLEVIIEPRISEINFGIYEGVHRDDPGFIANKHNLAIRYPNGESFIELCKRV YEFLDEIKEQATKSNVLLVCHGAVCRAINTYFNEMSNDDIFYYQTENCQLLKYDYLSR >gi|223714181|gb|ACDT01000034.1| GENE 50 45511 - 46122 593 203 aa, chain - ## HITS:1 COG:CAP0082 KEGG:ns NR:ns ## COG: CAP0082 COG0664 # Protein_GI_number: 15004786 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 20 202 2 187 188 99 32.0 4e-21 MKIAENLFNYFKNTGTLIKYQPNDIIYMQEDSSNNLYLILKGRVRVYLISKDGQEITLDI IDKGRIFGESSFLLNTSRPVCVSAIDEVKLISCNLENLYPAILESKELTIAIMQLLSSTN DHLTNQVKKAYFYNRHEKIAAFLLEQKKKTISYTHEEIASLTGLNRVTVTKILNDFYQKG WIDLAYRKIIIVDHDELSNYLDK >gi|223714181|gb|ACDT01000034.1| GENE 51 46208 - 47056 835 282 aa, chain + ## HITS:1 COG:lin2652 KEGG:ns NR:ns ## COG: lin2652 COG1284 # Protein_GI_number: 16801713 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 277 13 286 287 130 33.0 3e-30 MKGKLLRIGTIILGNVFVAFAVSTLVLENQLISGGVTGLGIVTNHYTGINISLIVGIINV LLFLLGLFFIGKKFALSTLISTFAFPVLLEFFNTNAIFHNYCQDTLLAVVLAGCLVGIGV GLMIRSDASSGGMDIIAIILNRKLGIPVFIMVNVFDFIILCMQATFSNPTKILYSIVLVF VTSFMLNKTLTKGSKMVQLVVISDCHEMIKKMIIEEADAGVTSLYSQKGFNETDTKTLLT IIPPVKLTKIKEQIKLIDPVAFMVVATVDEVSGRGYTLERHH >gi|223714181|gb|ACDT01000034.1| GENE 52 47039 - 47446 354 135 aa, chain - ## HITS:1 COG:VC2392 KEGG:ns NR:ns ## COG: VC2392 COG0494 # Protein_GI_number: 15642389 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Vibrio cholerae # 1 108 1 110 
132 58 33.0 3e-09 MKRNVNIIVVFNQNKSKVLMCKRKKAPFKGLYNFVGGKIEPNEDHLQAAYRELAEETNIS KTDIELIHFMDFTYYLDDTLLETYIGTLSTNVDIHGEENQLAWIKLTDDFKDLNRFAGNG NIYHILKEISLMMPF >gi|223714181|gb|ACDT01000034.1| GENE 53 47560 - 48384 833 274 aa, chain + ## HITS:1 COG:CAC0501 KEGG:ns NR:ns ## COG: CAC0501 COG1968 # Protein_GI_number: 15893792 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Clostridium acetobutylicum # 1 273 1 273 274 324 72.0 1e-88 MEIIELIKVIILGIVEGITEWLPISSTGHMILVDEFIKLNVSADFLEMFLVVIQLGAILA VVVLYWKKLIPFDYKNRWRIKKDTFSMWFKIIFACLPAAVVGLLFDDQLNELFYNPLTVA IALIVFGVLFIFIENYNKGKETKINSLSEITYNTALIIGIFQLIAAVFPGTSRSGATIVG ALLIGVSRTVAAEFTFFLAIPVMFGASLLKLVKFGFAFTTAELLLLVIGMLVAFVVSMLT IKFLMSYIKKHDFKAFGWYRIILGIIVLIYFMVF >gi|223714181|gb|ACDT01000034.1| GENE 54 48518 - 48991 433 157 aa, chain - ## HITS:1 COG:FN1023 KEGG:ns NR:ns ## COG: FN1023 COG3467 # Protein_GI_number: 19704358 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Fusobacterium nucleatum # 1 153 3 156 156 147 49.0 7e-36 MRRKDREITDFNEIINIIKKCDVCRIALNDKDFPYIVPLNFGLDIQGKEVYLYFHCAMEG KKLDLIAKDNRVTFEMDCDHNFILYEERMSCTMGYESVIGHGVIETVPDENKYESLKILM RQYHAEDFKFNTDMMRVTTVLKMTVIDMVGKRRNNIH >gi|223714181|gb|ACDT01000034.1| GENE 55 49266 - 49583 162 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734776|ref|ZP_04565257.1| ## NR: gi|237734776|ref|ZP_04565257.1| hypothetical protein MBAG_01147 [Mollicutes bacterium D7] # 1 105 45 149 149 161 100.0 1e-38 MKLLDQYTFFPFVAILLWPVNPLIALASNYALLSLHNLYYSFKSRYQSKHQYYLFIAIFL LLCCDFFVALTNINLPVPAVFRILIWILYLPSQLFFSASQIISEK Prediction of potential genes in microbial genomes Time: Thu May 26 09:37:30 2011 Seq name: gi|223714180|gb|ACDT01000035.1| Coprobacillus sp. 
D7 cont1.35, whole genome shotgun sequence
Length of sequence - 2255 bp
Number of predicted genes - 2, with homology - 2
Number of transcription units - 1, operones - 1 average op.length - 2.0

N Tu/Op Conserved S Start End Score pairs(N/Pv)
+ Prom 109 - 168 6.6
1 1 Op 1 . + CDS 210 - 1787 1605 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain
2 1 Op 2 . + CDS 1820 - 2255 378 ## COG2200 FOG: EAL domain

Predicted protein(s)
>gi|223714180|gb|ACDT01000035.1| GENE 1 210 - 1787 1605 525 aa, chain + ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 187 514 306 640 776 105 26.0 3e-22
MVSLSAKAVCIIDCHNKLVYMNDFYQNLFNEEKIGYKTKLGDAVINEINKVSYHTIQKTD
FKVIKEDFIINDQASYIYTLERVNINSVSPVLESALYDELVEINLTRDTYNILFYRKDKY
IIPALSGSFSKRIREIAQKIIHPHDQNDFLQLFNKMKNNHGDFKPVMGRFRRLLNNGEYS
WINGILIAISKGQFSDTVYMCLFQDGGLTPINVESTNYKEYDQTTGLLNEHSFLKAARFL
LEEGGNDNYCLIAIDIEHFKLFNDWYGRQMGNKILGQIGLELQKIQNEFKTVAGYFGGDD
FVVIMPNERPLIDKLLTDLIAHIKVVTVNPGFFPKFGIYIVIDGEYVAAMYDRALIALSS
IKENYTKQIAYYDSNIRQDFEYSQSILMDTQQALYNHEFMVYFQPKCNMLNGKIVGVEAL
VRWNHPTKGIISPGDFVFILEKNGFIVNLDLFVWQETCRLLSNWLDQGNQGVPVSVNVSR
TDIYAIDVVKTFKELVAEYKLPVNLIEIEITESSYGVIMKLFQKW
>gi|223714180|gb|ACDT01000035.1| GENE 2 1820 - 2255 378 145 aa, chain + ## HITS:1 COG:slr1305_4 KEGG:ns NR:ns ## COG: slr1305_4 COG2200 # Protein_GI_number: 16329450 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 1 96 153 247 255 94 47.0 5e-20
MDDFGSGYSSLNMLKDVNVDILKIDMKFLELREDSIGRGIGILEAIVKMAKLMGLRVIAE
GVETKQQIDLLNELGCIYGQGYYFYPPLSVQEFEAIISNSDLVDYRGIMLKQVDQVRLQD
IFINDISSEAMLNNILGGIAFYQVS

Prediction of potential genes in microbial genomes
Time: Thu May 26 09:37:31 2011
Seq name: gi|223714179|gb|ACDT01000036.1| Coprobacillus sp.
D7 cont1.36, whole genome shotgun sequence
Length of sequence - 931 bp
Number of predicted genes - 2, with homology - 2
Number of transcription units - 1, operones - 1 average op.length - 2.0

N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 . + CDS 1 - 225 131 ## Elen_0467 diguanylate cyclase/phosphodiesterase with PAS/PAC sensor(s)
2 1 Op 2 . + CDS 188 - 929 576 ## Elen_0189 diguanylate cyclase/phosphodiesterase with PAS/PAC sensor(s)

Predicted protein(s)
>gi|223714179|gb|ACDT01000036.1| GENE 1 1 - 225 131 74 aa, chain + ## HITS:1 COG:no KEGG:Elen_0467 NR:ns ## KEGG: Elen_0467 # Name: not_defined # Def: diguanylate cyclase/phosphodiesterase with PAS/PAC sensor(s) # Organism: E.lenta # Pathway: not_defined # 2 68 751 817 1263 72 43.0 5e-12
FVHPDDQNEFISLFDQAYQNPISGSEATIRYYSLKGKVSWISLRVFFLRKQDKSLIFYGA
MNDATKNKFARTRA
>gi|223714179|gb|ACDT01000036.1| GENE 2 188 - 929 576 247 aa, chain + ## HITS:1 COG:no KEGG:Elen_0189 NR:ns ## KEGG: Elen_0189 # Name: not_defined # Def: diguanylate cyclase/phosphodiesterase with PAS/PAC sensor(s) # Organism: E.lenta # Pathway: not_defined # 8 247 679 918 1435 169 34.0 7e-41
MLQKINLQEQELKASKQTINQLLGFPEDTAVEDYLNYQDSQTAVSMFSKVVPAGILGCYR
TKELPIYFINREMLSLLNLTTMEEFNLFCKGQIINIIHPEDHQTVYAAIGLEEDEGFEYT
VRYRIMKKDQTWLWVQEKGRIVKAKDGLLAYICAIVDINETMSSKIELEKINQQLIKQKQ
QLSFLNECTLGGYFHCKNNSQLEFDYLSESFLNIVGFSQVEIITSYNNQLVQLIHPDDRH
KITDKLT

Prediction of potential genes in microbial genomes
Time: Thu May 26 09:37:50 2011
Seq name: gi|223714178|gb|ACDT01000037.1| Coprobacillus sp. D7 cont1.37, whole genome shotgun sequence
Length of sequence - 49316 bp
Number of predicted genes - 48, with homology - 47
Number of transcription units - 23, operones - 12 average op.length - 3.1

N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 49 - 1611 1351 ## COG2199 FOG: GGDEF domain
+ Term 1841 - 1894 -0.9
- Term 1414 - 1456 1.4
2 2 Tu 1 .
- CDS 1617 - 2132 423 ## gi|167755811|ref|ZP_02427938.1| hypothetical protein CLORAM_01327 - Prom 2187 - 2246 5.7 + Prom 2053 - 2112 1.7 3 3 Tu 1 . + CDS 2133 - 3086 983 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase + Term 3200 - 3256 12.0 + Prom 3168 - 3227 2.4 4 4 Tu 1 . + CDS 3273 - 5870 2868 ## COG0474 Cation transport ATPase + Term 5906 - 5962 2.1 + Prom 5921 - 5980 10.3 5 5 Op 1 . + CDS 6002 - 7378 1338 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 6 5 Op 2 . + CDS 7407 - 8360 901 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 7 5 Op 3 . + CDS 8378 - 8794 530 ## COG0456 Acetyltransferases 8 5 Op 4 . + CDS 8794 - 9075 259 ## gi|237734786|ref|ZP_04565267.1| predicted protein + Term 9103 - 9145 -0.4 + Prom 9110 - 9169 9.7 9 6 Tu 1 . + CDS 9324 - 11933 2746 ## COG0474 Cation transport ATPase + Term 12064 - 12103 -0.9 - Term 11685 - 11726 6.1 10 7 Op 1 . - CDS 11940 - 14294 2260 ## COG1410 Methionine synthase I, cobalamin-binding domain 11 7 Op 2 . - CDS 14276 - 14908 398 ## Amet_3374 vitamin B12 dependent methionine synthase, activation region 12 7 Op 3 . - CDS 14901 - 15791 931 ## COG0685 5,10-methylenetetrahydrofolate reductase - Term 15822 - 15863 6.1 13 7 Op 4 . - CDS 15864 - 16412 483 ## CDR20291_1394 hypothetical protein - Prom 16577 - 16636 6.0 + Prom 16387 - 16446 9.1 14 8 Tu 1 . + CDS 16538 - 17194 499 ## COG1272 Predicted membrane protein, hemolysin III homolog + Prom 17203 - 17262 10.5 15 9 Tu 1 . + CDS 17334 - 17702 425 ## Blon_0286 lactoylglutathione lyase (LGUL) family protein, diverged + Prom 18051 - 18110 5.0 16 10 Op 1 15/0.000 + CDS 18134 - 18619 494 ## COG0597 Lipoprotein signal peptidase 17 10 Op 2 . + CDS 18606 - 19526 1020 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 18 10 Op 3 . + CDS 19519 - 19983 668 ## COG2131 Deoxycytidylate deaminase + Term 19984 - 20014 2.0 - Term 19972 - 20002 1.2 19 11 Tu 1 . 
- CDS 20058 - 20846 794 ## COG1039 Ribonuclease HIII - Prom 20867 - 20926 4.1 + Prom 20873 - 20932 8.0 20 12 Op 1 . + CDS 20954 - 21775 840 ## gi|167755829|ref|ZP_02427956.1| hypothetical protein CLORAM_01345 21 12 Op 2 . + CDS 21777 - 24089 2317 ## COG1193 Mismatch repair ATPase (MutS family) 22 12 Op 3 . + CDS 24089 - 25861 1678 ## COG0322 Nuclease subunit of the excinuclease complex 23 12 Op 4 . + CDS 25898 - 26101 251 ## COG2771 DNA-binding HTH domain-containing proteins 24 12 Op 5 . + CDS 26164 - 27321 885 ## COG1876 D-alanyl-D-alanine carboxypeptidase 25 12 Op 6 2/0.250 + CDS 27324 - 28109 813 ## COG0796 Glutamate racemase + Term 28110 - 28150 4.2 + Prom 28127 - 28186 8.4 26 13 Op 1 . + CDS 28234 - 29151 903 ## COG5401 Spore germination protein 27 13 Op 2 . + CDS 29198 - 29665 457 ## COG0622 Predicted phosphoesterase + Term 29807 - 29857 12.2 + TRNA 29725 - 29801 88.4 # Arg TCT 0 0 + Prom 29728 - 29787 79.8 28 14 Op 1 . + CDS 29905 - 30474 512 ## COG3544 Uncharacterized protein conserved in bacteria 29 14 Op 2 1/0.250 + CDS 30509 - 31117 495 ## COG1713 Predicted HD superfamily hydrolase involved in NAD metabolism 30 14 Op 3 . + CDS 31186 - 31617 533 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 31825 - 31874 3.6 31 15 Tu 1 . + CDS 32180 - 32698 570 ## COG3231 Aminoglycoside phosphotransferase 32 16 Tu 1 . - CDS 32750 - 33754 852 ## COG2855 Predicted membrane protein - Prom 33774 - 33833 11.5 + Prom 33732 - 33791 13.4 33 17 Op 1 . + CDS 33864 - 34757 924 ## COG0583 Transcriptional regulator 34 17 Op 2 . + CDS 34782 - 34913 98 ## 35 17 Op 3 . + CDS 34951 - 35874 1052 ## Bsel_1414 hypothetical protein + Prom 35886 - 35945 6.3 36 18 Op 1 3/0.250 + CDS 35968 - 37248 1936 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) + Term 37255 - 37281 -1.0 37 18 Op 2 4/0.250 + CDS 37303 - 39624 2616 ## COG0466 ATP-dependent Lon protease, bacterial type 38 18 Op 3 . 
+ CDS 39681 - 40280 655 ## COG0218 Predicted GTPase + Term 40309 - 40361 -0.0 + Prom 40384 - 40443 8.9 39 19 Op 1 . + CDS 40474 - 41451 1011 ## gi|237734816|ref|ZP_04565297.1| predicted protein 40 19 Op 2 . + CDS 41448 - 42362 824 ## gi|167755849|ref|ZP_02427976.1| hypothetical protein CLORAM_01366 + Term 42572 - 42609 -0.4 + Prom 42538 - 42597 4.6 41 20 Op 1 . + CDS 42619 - 43290 729 ## gi|237734818|ref|ZP_04565299.1| predicted protein 42 20 Op 2 . + CDS 43262 - 45883 2806 ## COG0525 Valyl-tRNA synthetase + Prom 45886 - 45945 2.9 43 21 Tu 1 . + CDS 46005 - 46163 187 ## gi|167755852|ref|ZP_02427979.1| hypothetical protein CLORAM_01369 + Term 46240 - 46273 1.0 + Prom 46219 - 46278 1.9 44 22 Op 1 . + CDS 46352 - 46738 365 ## gi|237734822|ref|ZP_04565303.1| predicted protein 45 22 Op 2 . + CDS 46735 - 47412 657 ## COG2003 DNA repair proteins + Prom 47426 - 47485 5.5 46 23 Op 1 . + CDS 47587 - 48315 891 ## COG1792 Cell shape-determining protein 47 23 Op 2 . + CDS 48308 - 48841 461 ## gi|167755857|ref|ZP_02427984.1| hypothetical protein CLORAM_01374 48 23 Op 3 . 
+ CDS 48816 - 49202 454 ## gi|169350146|ref|ZP_02867084.1| hypothetical protein CLOSPI_00888 Predicted protein(s) >gi|223714178|gb|ACDT01000037.1| GENE 1 49 - 1611 1351 520 aa, chain + ## HITS:1 COG:PA4929_2 KEGG:ns NR:ns ## COG: PA4929_2 COG2199 # Protein_GI_number: 15600122 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 345 519 4 181 196 105 37.0 2e-22 MKKGGNCGLINKNDYAAIKDDYIKAVDNKEDYNAIYRFELEQTTLWMDSKTKYLNTENGY RTYLCVLNDVTSMKQKEEEIWLSSQILDRVIKMANLNIWQYDYQTKQLILMNLERDSFIY QILDLPRNEDIIIDNYLNKINNIEVMNEQTKEKIREQLIAQTVQNKFSYELPLTLNSKII WIKVVGETIYDYGDRPLKLVGYFKDVTLEKSSFQKYLENAKTLEALQEKSFYELSINLSQ NRIINTNIQEDDLSFINSSYSSYINSITEKIIQKEYQDKVKEVLGLEAILTRYQQGTTTN SVVYKRYVDGEYYWMEATYHILDLQENKDIFCYLYVIDIDKQKRQELALKNKAEHDGLTG LYNRSKAINEINQILLSNEMTCGALLMLDMNDFKEINDNYGHAIGDKIITKTAKRLKEMF RNDDILCRLGGDEFMILCRNIDEISIRMKLEDVTRQMKIPYLIEEYQIMAPLSIGFVMIP LYGTTFNNLYKKADIALYKAKQDGLASYRMYHDDMNKKAL >gi|223714178|gb|ACDT01000037.1| GENE 2 1617 - 2132 423 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755811|ref|ZP_02427938.1| ## NR: gi|167755811|ref|ZP_02427938.1| hypothetical protein CLORAM_01327 [Clostridium ramosum DSM 1402] # 1 171 1 171 171 273 100.0 2e-72 MLKEIAMKKILLITVVTSLIGVTGCSNKTIAEDKAKKIALNDAEVSEANITFTIQGKDQN GYHYLFEDDNYTYEYEIDIHDGTIENKEILVKEFSPLNDYDKIISSDEAKTISLNYFNFT ESDISNLNIELDNDSSNIYYNISFTKDKHHYSINIEAVNGIPTNAKTTKSS >gi|223714178|gb|ACDT01000037.1| GENE 3 2133 - 3086 983 317 aa, chain + ## HITS:1 COG:BH1953 KEGG:ns NR:ns ## COG: BH1953 COG1597 # Protein_GI_number: 15614516 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Bacillus halodurans # 57 311 33 295 295 95 28.0 1e-19 MLSHLKKDDNSYDVFYQNLLEYTKVGERMKHIFIIKPDQNNKNIEAMIVKVMQGYRYEIK YTHYPRHATLLAKSYSGCNYRIYAVGGDGMIHEIVQGLVGSDNELVVIPTGTGNDFVRTI ADENDPEKLLKKSLSLAASEIDLIKANDIYCINVLCCAFDSDIANNVHKYSQIKYLPRSL 
QYASVLVRRITQYCLFPTALFQDGKKFYEGNLIVGAFCNGKYFGGGFKIGKEAQIDDGMI DINLVSSLHKRYIPYYLTLLLAGKLEQGKLYYHQKLPYLTLKTRQQVNIDGETYPSGTYN LKIVKNSLKIVLYRQKD >gi|223714178|gb|ACDT01000037.1| GENE 4 3273 - 5870 2868 865 aa, chain + ## HITS:1 COG:FN1022 KEGG:ns NR:ns ## COG: FN1022 COG0474 # Protein_GI_number: 19704357 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 3 865 2 862 862 781 51.0 0 MKEYYNEAKTEVRKRVNGSLKPLTNEQVKANQEKYGLNELIETKGKSIPVIFLEQFKDFL VIILIFAAIISGVLGDIESTLVILIVIMINAILGTVQTVKAEQSLNSLKELSSPSAKVLR DGKVIEIPSKEVTIGDEVYLEAGDFIPADGRILENASMKVDESALTGESVAVEKSSDLIT GEVALGDRTNMVYSGSFVTYGRGNFLVTGIGMETEVGKIAQLLKSTSEKKTPLQVNLDNF GKKLSIIIMVFCALLFGINILQGGNVGDAFMFAVALAVAAIPEALSSIVTIVLSFGTQKM AREGAIIRKLQAVEGLGSVSIICSDKTGTLTQNKMTVEDYYIEGKRIDASEIDPSIPLHK DLMRLSILCNDSSNVDGQEIGDPTETALINLSAHLGVPASRVRAVYPRLSEIPFDSDRKM MSTLHLLKDGYTMITKGAVDVLIERIKYVRKNNQIVPITAQDREDILAMNMEFSQNGLRV LAITYKKLTAEKSLDYDDENDLIFLGLISMMDPPRVESAPAVTECLQAGITPIMITGDHK ITAAAIAKRIGILTDISQAVEGSEIDGLSDEELKTYVEDKRVYARVSPEHKIRIVRAWQE KGNIVAMTGDGVNDAPALKQADIGVAMGITGSEVSKDAAAMVLTDDNFATIIKAVENGRN VYANIKDAIQFLLSGNFGGILAVLYASIMALPVPFAPVHLLFINLLTDSLPAIALGLEPH NAGVMKEKPRPMNESILTKPFLASVGIEGFVIAVMTMVGFMIGYQESALLASTMAFGTLC LSRLVHGYNCKSKSPVLFKKSFFNNKYLQGAFLVGFILITLVLTMPLLQSMFKVQTLNIK QLMIVYGLALANLPVIQFIKYLRNR >gi|223714178|gb|ACDT01000037.1| GENE 5 6002 - 7378 1338 458 aa, chain + ## HITS:1 COG:FN1225 KEGG:ns NR:ns ## COG: FN1225 COG0769 # Protein_GI_number: 19704560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Fusobacterium nucleatum # 4 455 23 478 485 310 41.0 4e-84 MEVALKTNSRAVKPGDTFIAIPNVARDGHDYIEEAIANGATKIIAEHGSYKVETIIVEST REYLKTYLYENYYPLIKDIKLIGLTGTNGKTTTCLMTYQILKMLKKNVAYMGTIGFYYGD AKKPMVNTTPDVDVLYNMLLEAKENGVEYFVMEVSSHALDKDRIHGLEFDEVAFTNLTQD HLDYHKTLENYANAKRILFTKTRNDKIAIINGDDEHYQHFVLESNNNIIIGQHDSDVKIL EMSFSHLGTIFKFEYLNHEYQARLNMVGRYNIYNYLIALLLVNKLGVKIEDILALNDKLK 
APAGRMELLKYGTNSIFVDYAHTPDAVINVLKSAEEFKNGRIITIIGCGGDRDATKRPIM GKAALEHSDYVIFTSDNPRSEDPQMILDDITNGLSGSNFEIEVDRQKAIIKGMQQLKHND ILMILGKGHEDYQITKTGKHHFSDQEEVMKYITKQGTL >gi|223714178|gb|ACDT01000037.1| GENE 6 7407 - 8360 901 317 aa, chain + ## HITS:1 COG:all1225 KEGG:ns NR:ns ## COG: all1225 COG0667 # Protein_GI_number: 17228720 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Nostoc sp. PCC 7120 # 10 302 12 311 315 157 32.0 3e-38 MKQIEMQDLTIPPIAIGTWAWGKGPFGSKFIFGVSHGIEELKPIYHYAVKNGLTLFDTAP VYSLGSSEKIIGDLSNGEDILISTKFMPTSWLPRKSMSWTLNSSLSRLKRADTDIYWIHR PANVTKWAKEIIPLMKANKIRYCGISNHNLKQIKEVELILKNAGLKLAAIQNHFSLLYRN SEENGIMKYCEEQGITFFAYMVLEQGALTGYYNYNHPFPKRTRRARAFSKQRLKQIEPLI LEMKNIGLKYNVGVAEIAIAYALAKNTVPIIGVTKVKHIDSALAALKITMSEEDIHKLEY FSNNFQLKIKGFWEKKM >gi|223714178|gb|ACDT01000037.1| GENE 7 8378 - 8794 530 138 aa, chain + ## HITS:1 COG:Cj0225 KEGG:ns NR:ns ## COG: Cj0225 COG0456 # Protein_GI_number: 15791597 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Campylobacter jejuni # 1 137 6 142 148 141 49.0 3e-34 MTIKDYDDVYELWMNTKGMGMRNLDDSRSGIAKFLERNPTTNFIASDGNKIVGVILAGND GRRAYIYHTTVRSDYRGQGIATQLVKECLSAVKAEGINKTALVVFADNQLGNDFWQSQGF KEREDLTYRDFSLNEENI >gi|223714178|gb|ACDT01000037.1| GENE 8 8794 - 9075 259 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734786|ref|ZP_04565267.1| ## NR: gi|237734786|ref|ZP_04565267.1| predicted protein [Mollicutes bacterium D7] # 1 93 1 93 93 159 100.0 8e-38 MRIQQVDITEVFFQTKEYTQGMYDSIVRIGLSFPVKVSLEQGKFYCQDGHKRLSAIASME KLEVYKKRRKINIIVINNGNSRSNDCWRGRNMH >gi|223714178|gb|ACDT01000037.1| GENE 9 9324 - 11933 2746 869 aa, chain + ## HITS:1 COG:SP1551 KEGG:ns NR:ns ## COG: SP1551 COG0474 # Protein_GI_number: 15901394 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 2 864 26 906 914 693 45.0 0 
MYYNREIKELAQEFSTDLKSGLTEQQVTENLAKYGPNELEQTKKQSIIVRFFLQFKDALT IILIIAAVVSIIVEPTEWIDSAIIVFVVVVNAILGLVQENNAERALEALQKMASPKAKVI REGNIITINASQVVPGDLLVLEAGDCIASDGRLVECFNLKVDESALTGESVPVEKISSIL DKEEIPLAERNNMVYASCNVTYGRGIAIVTTTGEENEVGKIATMIQQTKRENTPLQDQLD QIGRTIGVICIIICIVVFGMELLEGLSILEAFKTSIALAVAAIPEGLATVVTVVLALSVQ RMVKKNAIVKSLPAVETLGSTSIVCSDKTGTLTQNKMTVVKTYLYNRKIELLETCSEETN QLLNYFTLCSDGSISTVAGKEVAVGDPTETALVKASLEKGFTKDGLAELYPRFNELAFDS ERKMMTVFVKHEDKIISITKGGPDVIFARCNNLDIDEVSVVNETMSNEALRVLALGIRYW DEEPQEFTSEMIENNLTFIGMVGMIDPPREEAKQAVAEAKSAGVRTIMITGDHVITASAI ARSLGILESGQKAIMGSELAKMSDQELAEHIEEYSVYARVAPEHKVRIVNAWKSKGKVVA MTGDGVNDSPALKAADIGCAMGITGTDVAKGAASMILTDDNFATIITSIREGRGIFDNIK KDVQFLLSSNIGEVLTIFGASLISLLTPHNFGVPLLPIHLLWVNLITDSLPAFALGMEPT EPTVMHRKPRQKDEGFFSHGLGFTIAWQGLMIGTLTLIAYAIGNNQDHTVGMTMAFITLC GCQLVHSFNVKSSASILNRTLFNNPYLWGALAAGLLLQVIILAIPELSAIFKLQPLNIVQ WGICVGLCLSTTVICELVKFFDRRKNNPS >gi|223714178|gb|ACDT01000037.1| GENE 10 11940 - 14294 2260 784 aa, chain - ## HITS:1 COG:TM0268_2 KEGG:ns NR:ns ## COG: TM0268_2 COG1410 # Protein_GI_number: 15643038 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Thermotoga maritima # 282 782 2 481 483 288 36.0 4e-77 MLQERLKNDILVFDGAMGTQLQDAGLKAGDIPECLNITDPKLIQTIHLNYLNAGADFITT NTFGANPLKMAEAPYSYEEIINAAIDNATIARKTADRQNDSYIVLDIGPIGQLLEPMGTL TFDEAYEIIKKQVIIAKDKVDAVLLETMTDIYEVKAGILAVKENSDLPVFVTMTYENNLR TLSGCDPLTMVNVLEGLNVDVLGVNCSLGPIELTPIIDQILAAATIPVLLQPNAGLPCLV EGKTCYNMDKETFVQESLKHVKNGVAIIGGCCGTTPDFIASLKNNLPVRKKITPKRATRV SSGTKTVEFGHHVVVCGERLNPTGKKKLKLALKEERYDELVVEAIKQDQAGAHVLDVNVG LPGINEVATMKHVIKLLQEVISLPLQIDSSVPGAIEQACRYYNGKPLINSVNGKDETMDA IFPIVKKYGGVVIGLTLDENGIPPLAKDRYKIAKKIINKAASYGITKENIIIDCLVLTVS AQQKEVMETVKAVAMVKELGVHTVLGVSNVSFGLPNRPLLNKTFLAMAMSAGLDLPIINP MDQELMATIDAFNVLYNYDHDAAVYIERRANQETITKKDTSTFTLNDIVLHGLKDEVTNA TKELLKTTPGLEIINNILIPALDTVGKQYEKNIIFLPQLIQSAETSKIAFGIIKDTFKDT AATKGPIIMATVHGDIHDIGKNIVKVVLESYGYKVIDLGKDVPPETVVEAFHKHHPKAIG 
LSALMTTTVVSMAKTIELLKQIDNICPIFVGGAVLTADYAKEINADYYSKDAMEAVELLN KIIK >gi|223714178|gb|ACDT01000037.1| GENE 11 14276 - 14908 398 210 aa, chain - ## HITS:1 COG:no KEGG:Amet_3374 NR:ns ## KEGG: Amet_3374 # Name: not_defined # Def: vitamin B12 dependent methionine synthase, activation region # Organism: A.metalliredigens # Pathway: not_defined # 1 206 3 220 226 126 33.0 5e-28 MIKENALKYLGYLDNQVDSNTEILLNECLKELEQVTPKFMYQIYTLTHHPLTIKELNLTI NYPDLIDLFDSCDRIVIIACTLGLQLDQQLRYYSKINLTKMTVMDALASSYIEIKCDEYE AKQNFGKRTFRFCPGYGNVPLELNKNLANALNCSKHIGLTVQESNLLLPQKSMIGLIGLG DEKLTKHCFSCVNKENCMYRKRGQRCYKKD >gi|223714178|gb|ACDT01000037.1| GENE 12 14901 - 15791 931 296 aa, chain - ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 1 282 1 283 296 243 45.0 4e-64 MKIIDLLHRRPTLSFEIFPPKNHDGDISSIYQTIDELAKLKPDFISVTYGAGGSTTENTV EIASKIKNEYSIEAVAHLSCIDATPEQLIRVLDSLKANNIENVLALRGDYPRGYDPMKAP HYYKYASELNDFISKNYPDTFCLSGACYPEVHQEAASLEEDLIALKKKVDAGAEYLITQI FFDNNYYYRLVREARIRGINVPIIAGIMPATNSKSLLNIAKLSGCNIPYNLSASIERFKS NPQAMKEVGMNYATNQIIDLITNGVDGIHLYTMNKPETVHEILKRTSNILAEFKND >gi|223714178|gb|ACDT01000037.1| GENE 13 15864 - 16412 483 182 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1394 NR:ns ## KEGG: CDR20291_1394 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 5 178 5 182 188 145 48.0 8e-34 MNQLELSEIIDKVVNKSDLTTKDIPSLDLYMDQIMTLFDDHLQDNKRFVDDKLLTKTMIN NYSKAGVIKPVKGKKYTKEQIIGMLLVYNLKNTITIQEIKQVLAPVYANDESLENIYDQF IEIKKFQSDQLKPLVLKTVENFNLDIDNDNQRLISIMALSSLSNQLTNIVQGIIDNYYIE SE >gi|223714178|gb|ACDT01000037.1| GENE 14 16538 - 17194 499 218 aa, chain + ## HITS:1 COG:CAC0882 KEGG:ns NR:ns ## COG: CAC0882 COG1272 # Protein_GI_number: 15894169 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Clostridium acetobutylicum # 4 
217 2 214 214 147 47.0 1e-35 MRFFKKAMDPVSSETHFIGAVLSLATLVMMIIIAIIEGSDHLTIIGIIVFGLSSIALYSA SCLYHYYNANEDNKIKLILRKLDHSMIYVLIAGTYTPICLSYLEYPHSIYFLAAIWAIAC VGIIIKLFWMNAPRFISTIFYLLMGWALIFDLPAFSKVPLGCLGLIACGGISYTIGAIVY IVKKPNWFKTFGFHELFHIFVMIGSAFHFLAVIIYILL >gi|223714178|gb|ACDT01000037.1| GENE 15 17334 - 17702 425 122 aa, chain + ## HITS:1 COG:no KEGG:Blon_0286 NR:ns ## KEGG: Blon_0286 # Name: not_defined # Def: lactoylglutathione lyase (LGUL) family protein, diverged # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 2 116 1 115 115 165 67.0 5e-40 MVNCRIESMYLCVNDMDRAVSFYEQFFEQRVTKKDEIYSIFDINGFRLGLFAYKKMNEEH IFGSNCLPSIEFLNKEVLKAKISSYKLCFPLTPIGTNWVVEIIDSEGNHIELTAPIEIVK ES >gi|223714178|gb|ACDT01000037.1| GENE 16 18134 - 18619 494 161 aa, chain + ## HITS:1 COG:BS_lsp KEGG:ns NR:ns ## COG: BS_lsp COG0597 # Protein_GI_number: 16078609 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Bacillus subtilis # 9 154 1 146 154 102 39.0 4e-22 MNLTKKNKILYLVTLILVVGGDLFTKHLVSSSMLLGQSHEIINNFFYFTYAHNTGVAWGM FAGKLGLFIVVAIIAAVVMIVFFRKTKSEEVLTRFGLVLTFGGMIGNLVDRIFLGYVRDF IDVIIFNYNFPIFNIADMAVVIGVALIIVEIVFEEYIHGKN >gi|223714178|gb|ACDT01000037.1| GENE 17 18606 - 19526 1020 306 aa, chain + ## HITS:1 COG:CAC2114 KEGG:ns NR:ns ## COG: CAC2114 COG0564 # Protein_GI_number: 15895383 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Clostridium acetobutylicum # 1 304 1 305 305 328 56.0 6e-90 MEKINLIIEEDNNLRLDKVIAMQLQELSRTQIQDMINQGLVLVNQKKEKASYKTKLNDEI EITILDNVDLDIEPEDISLKIVYEDEDVIVVDKPTGMIVHPSAGIMHGTLVNALLFHCKD LSGINGVNRPGIVHRIDKETSGLLMVAKNDNAHRLLSKQLKDHTVVRRYVALVHGLIPHE HGKVDAPIGRNPKDRQSMAVTRTNSKEAVTNFTVLKRYNAMSLIECRLETGRTHQIRVHM SYIGYPVYGDPKYGYRKDDFSHGQFLHAKKLGFIHPTTQKYMEFESPLPDYFQAKLDELQ EELKDE >gi|223714178|gb|ACDT01000037.1| GENE 18 19519 - 19983 668 154 aa, chain + ## HITS:1 
COG:FN1902 KEGG:ns NR:ns ## COG: FN1902 COG2131 # Protein_GI_number: 19705207 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Fusobacterium nucleatum # 5 148 19 161 174 226 72.0 1e-59 MSKVINWTQYFMGVAKLSAFRSKDPNTQVGACIVNEANKIVGVGYNGLPWGCEDNEFPWE VREGDLYETKYPYVVHAELNAILNSTGQLKGCRIYVSLFPCHECVKAIIQSGISEIVYED DKYKGTDSDRAAKRMLDAAGVKYTKVEPFVIEVK >gi|223714178|gb|ACDT01000037.1| GENE 19 20058 - 20846 794 262 aa, chain - ## HITS:1 COG:lin1191 KEGG:ns NR:ns ## COG: lin1191 COG1039 # Protein_GI_number: 16800260 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HIII # Organism: Listeria innocua # 52 257 94 301 308 151 37.0 1e-36 MNMNVVRNNSSKRINIINMHAYDKLVAFSDSSMADSSSALNDASKYTNISHIGCDETGSG DFFGPLCVVACYVDERDFDWLVSIGVRDPKDMDNKELVRVAKEIKDRLIYSLLILDNSHY NAMAKAGNNLANIKAKLYNQAVTNVMQKVSMPIKNKLVNQFVSPKTYYNYLKSEVIVVKD LTFVQKGEEKYLAIICSMILSKYAYLQYFSNMSRSLKMKLPRGNSNSVDAIAIEVANKYG AKMLNKVTKTNMTNFKRIKDLI >gi|223714178|gb|ACDT01000037.1| GENE 20 20954 - 21775 840 273 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755829|ref|ZP_02427956.1| ## NR: gi|167755829|ref|ZP_02427956.1| hypothetical protein CLORAM_01345 [Clostridium ramosum DSM 1402] # 1 273 1 273 273 420 100.0 1e-116 MNFALNSMAIDTVIVFILLLMLIFGYFKGFVYRVYDLMATVVSLLAALYASSPLSAIYQI YKVEGIGEIVGKTINRFIIFIILFIALKVILLIIGKFIKPILKNIIYTISIFEHLDHLLG ALVSLIEGTIIIYLALIFVITPIIPGGKENVEKTVVASKILELVPSVTEEMKALSDGFDV FTNIINDGINYDSFDARNVAALAASLNSAYKHGLINQEDLESALIKYYDEIDRVNEPISL NQDQYNEVVDVLSKLDSAKFDQTKILNKIIVSE >gi|223714178|gb|ACDT01000037.1| GENE 21 21777 - 24089 2317 770 aa, chain + ## HITS:1 COG:BH3106 KEGG:ns NR:ns ## COG: BH3106 COG1193 # Protein_GI_number: 15615668 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Bacillus halodurans # 1 767 1 782 785 449 36.0 1e-125 MKNIYDTLEFNQIKNKISTYCVSSLGKTRIAELMPLKDTDDLKLEQKYLDQAMKLIFKYG KMPIGHFIDIEPLLLKTTKDGTLFGEDFVQIVYLLNNVKEVQSYLSEKELVDNELLQLCN 
ALVLPKQLLADINRCIDSSGNVLDGASSELRRIRRQILSIESNIRTKIDQVKVANKDYLS QEAISSRNNHLVLPVKAGNKNLIKGIVHAVSSSGQTMFIEPDIIVQMNNQLVHAKDDERR EVNRILTVLSNNVKDHYDILHEDQELMIDIDVIFAKAAYGVKIDGIVPEVAEDYQHFSLI RARHPLIEPHDVVANDIVLNAPKRMLLISGSNTGGKTVVLKTAGLLSVMALCAMALPCTK ATVPMFDQICVDLGDEQSIEQSLSTFSSHMKRLVEITNEVTEKSLVLLDEIGSGTDPREG QSIAEAILRYLHQINPLIVASTHYSGLKEFAKNEDYILVAAVEFDQELMQPTYRLIEGSV GNSYAIEISSRLGLNQEIVDMAYQIKEESLTDTDKLLEKLQNELAQVQIEKDRLEALTNE TEVKMNKYERLISSFEKHKDEMMEKAKAEANKLLEESKQEIDLVVEDLKQQAVLKQHVVI DAKRNLDLLKHEPKKVINEEVHEYQVGDVVKVLSANRQGEILSINKKGILTINMSGLKLN AKPEEVSFISKKVKPKKVKSNLKSLRKTTNQSYELNIIGKRYEEAMAIVDKFLDDAIVNN YTMVRIIHGMGTGVLRNGVRKMLDKNKNVVSYRDGGPNEGGLGATLVYFE >gi|223714178|gb|ACDT01000037.1| GENE 22 24089 - 25861 1678 590 aa, chain + ## HITS:1 COG:lin1197 KEGG:ns NR:ns ## COG: lin1197 COG0322 # Protein_GI_number: 16800266 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Listeria innocua # 1 590 1 596 603 624 54.0 1e-178 MNRQMIKDKLSLLPLQPGCYLMKDKDNTVIYVGKAKKLKNRVSSYFVGAHNYKTTKLVQE IVDFDYIVTDSEKEALLLEINLIKDYSPKYNISFMDNKYYPYIQLTKEKHPRLKIVRNAQ DKKHKHFGPFPDGTAARETFKLLNRLYPLRKCNHIPKKPCLYYSLNQCVAPCIQEVSDEV YKEMTSSITKFIQGDTKEIINDLQNKMMSASEVQNYELAKECRDLITHIQHVTSKQHVQF NDLVDRDIVGYYSDKGYLCLQLFFMRNGKLLARDLNLVPAQDDYQEQIISFLVQFYQENT EPKELLVPQELDIELLKEIVNCKIIKPQKGNKANLVAMANENAKEQLEKKFLLIQKNEAS TIGAIKQLGEILNIQLPRRIELFDNSNIQGAYAVAGMVCFKDGTPSKKDYRKFKIKTVEG PDDYASMKEVIYRRYYRVLMEGLEKPDLIIVDGGKGQIKVAKEVIDALNLNIMVCGLAKD DRHSTSVLLDSSGEVVEINRKSELFFLLTRMQDEVHRYAISFHKNVRSKSLFQSILDSVE GIGPKRRKMLLKEFGSVKQLKEAQLEQLERVLPHEVALNLFNVLKADTEK >gi|223714178|gb|ACDT01000037.1| GENE 23 25898 - 26101 251 67 aa, chain + ## HITS:1 COG:BH3075 KEGG:ns NR:ns ## COG: BH3075 COG2771 # Protein_GI_number: 15615637 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Bacillus halodurans # 4 67 11 74 74 70 60.0 8e-13 MNIILTPREKEIFNLLIKNQSTRDIAKTLGISEKTVRNHISNVIQKLGVDSRIQAVFELI KFKELEL 
>gi|223714178|gb|ACDT01000037.1| GENE 24 26164 - 27321 885 385 aa, chain + ## HITS:1 COG:BS_yodJ KEGG:ns NR:ns ## COG: BS_yodJ COG1876 # Protein_GI_number: 16079020 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus subtilis # 198 380 62 266 273 83 30.0 7e-16 MKLKRWHLYLIVTLCFTIAFISINRKYDRFYRVNGINNDNRALIEMYLDDEEQDYLVENA IAVDQFIKYIEFPEFKLQYYEFYNALNRTNKYSNYSDLINVGNQLASKLEVSFASNALSR CNTLIKNDLVNAYINQEGFSFDNIDYYQLLRSLYDEGDYTYIADTNTYLSIMKEFDGLEG KKLYDNVKLLSNNFTKGSLASLFNHDLQPNAKRIYNPSDLALVVNNETYIGGYEPKNQVE IRGIPRVKYSMYLQEDANNSLMEMYRALNDEGYNDMVLTAAYISYDVASLGSSGILPGYN EYQLGTTINLQKREISIADFDQTDIYKWLINHCHEYGFILRYPSDKVDVTNHEYSSTTFR YVGKEIASKLHVQNLALEEYNANEE >gi|223714178|gb|ACDT01000037.1| GENE 25 27324 - 28109 813 261 aa, chain + ## HITS:1 COG:L0120 KEGG:ns NR:ns ## COG: L0120 COG0796 # Protein_GI_number: 15673264 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Lactococcus lactis # 3 261 4 265 271 155 35.0 5e-38 MERAIGVFDSGVGGLTVLNSIRTLLPNENIIYIGDNYHCPYGEKTREQLFSYASEIVEYF IKENVKLIVLACNTTSATVLNELQAVYKEVLIIGVIDATVEDFISRGVDNTLVIATAATI NSHKYPDTIDCYQTGIEVFTLPTPKLVPLVEAGMDKKDIYDVLHEYLDSYAGKIKSIILG CTHYPILENQIKDILPDIEYISSSDAVCKNVRDVLIKNNLLNLNKNKFIKIYTTGRVDEF LKSSTGFFDYTDLVVEHIIIK >gi|223714178|gb|ACDT01000037.1| GENE 26 28234 - 29151 903 305 aa, chain + ## HITS:1 COG:BS_gerM KEGG:ns NR:ns ## COG: BS_gerM COG5401 # Protein_GI_number: 16079890 # Func_class: R General function prediction only # Function: Spore germination protein # Organism: Bacillus subtilis # 86 287 124 340 366 66 25.0 6e-11 MKRKLRNGIIICACLMVCFFCIKQDNNRNEKKNTATMTANYQTVVFRDDKNTLVPIEVDF GAEVEDDTKYRNMIEVMKSNDYEYLGLHPILDSNLQVNAMAINDKSLTFDLSDNLYVNSN QEALDIFEMFSYVFCNGDIEKVNLKIDGNDISTLPNSTVPASCITNQLGINNFEASTNYI YKTTPVVVYNTETINNKEYYVPVTKRIETNENDIDTKVSIMLNEMDYDKPLSLVDQCSLQ DGTLSIHLAANILNDNESIDNTLYNRIVKSASHLENVKKVSLFVDNQEIDPVQDVNGEVD NRIKM >gi|223714178|gb|ACDT01000037.1| GENE 27 29198 - 29665 457 155 aa, 
chain + ## HITS:1 COG:BS_ysnB KEGG:ns NR:ns ## COG: BS_ysnB COG0622 # Protein_GI_number: 16079887 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Bacillus subtilis # 4 150 5 154 171 94 34.0 9e-20 MKKVVVMSDSHGYHKMIDEVQRLEPDGDYYVHCGDSEAREEQLKGWICVRGNNDWMAPFD DEVVFEVEGVRFLVTHGHRYGYYKREEAMVDDLLRHGCDVLLSGHTHVPQCDEVTGFYLI NPGSTTLPRRGSGKSYCIILVDQGKIEVKFKQIFC >gi|223714178|gb|ACDT01000037.1| GENE 28 29905 - 30474 512 189 aa, chain + ## HITS:1 COG:all4988 KEGG:ns NR:ns ## COG: all4988 COG3544 # Protein_GI_number: 17232480 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 42 181 72 216 225 70 29.0 1e-12 MENLYCYEQKLQSYLDEYHCILNQMINKMTSVELSCSISYNFMVQMIPHHRAAILMSCNL LKYTKNEALQNIAFNIIKEQTRSIENMRRILCDCQNQTNSQEQLYDYQNRLDIIVGTMFT KMAEPSCSNDIDVNFINEMIPHHEGAIAMAKTTLEYPICSELKTILQAIITSQSNGVRQL KTVLEDITC >gi|223714178|gb|ACDT01000037.1| GENE 29 30509 - 31117 495 202 aa, chain + ## HITS:1 COG:L132126 KEGG:ns NR:ns ## COG: L132126 COG1713 # Protein_GI_number: 15673861 # Func_class: H Coenzyme transport and metabolism # Function: Predicted HD superfamily hydrolase involved in NAD metabolism # Organism: Lactococcus lactis # 10 197 4 193 197 110 32.0 2e-24 MDLNYIQTPLRNQTIEEQIKDFYLLNNKERTYQHVLGVAMAAEKLALQYQEDIRACILAA LLHDISAVITPQDMMKIANELQWQLDSAEKIYPFLLHQKISALIAYEYFQIRNLSILKAI ESHTTLRSNPSKLDMIIFIADKLAWDQPGEPPFKKNIEMALNDSLEAACYQYIKYQFDHD LLLFPHCWIIEAYRWLEKSHCK >gi|223714178|gb|ACDT01000037.1| GENE 30 31186 - 31617 533 143 aa, chain + ## HITS:1 COG:phnO KEGG:ns NR:ns ## COG: phnO COG0454 # Protein_GI_number: 16131919 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli K12 # 3 140 6 142 144 60 28.0 9e-10 MLIRKIEEVDYQDVYQLVKYELGYQNLEFNKFCWRLDQMLQDTNYQIYVAIIDYRVVGII GLMLGWGLEIEGKIMRIIALAVEHRYQGQKIGSSLIAYSEQYALSKQVSAITVNSGLERR RAHEFYKKNDYYKKGYSFIKKCE 
>gi|223714178|gb|ACDT01000037.1| GENE 31 32180 - 32698 570 172 aa, chain + ## HITS:1 COG:L33782 KEGG:ns NR:ns ## COG: L33782 COG3231 # Protein_GI_number: 15673193 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aminoglycoside phosphotransferase # Organism: Lactococcus lactis # 8 165 100 256 262 136 43.0 2e-32 MSGMCSCAPKLISNPEILAVTLGKLLRKFHDTKFTEAIYINHNEEILKRVFLNYQQANLD EQMTQGIGCLELDEVFSYINAKKDILVEDTVIHGDYCLPNIFLDQDYRFCGFLDMGRAGV GDRHYDLFWGRWSLAYNLGNDKYGDLFYEAYGNEVIDPERLKLIAYIACLDG >gi|223714178|gb|ACDT01000037.1| GENE 32 32750 - 33754 852 334 aa, chain - ## HITS:1 COG:FN0533 KEGG:ns NR:ns ## COG: FN0533 COG2855 # Protein_GI_number: 19703868 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 4 334 1 346 346 349 61.0 4e-96 MSKIKNNALGVILCLAIAMPAYYLGKQLPLIGGPVFAILMGMIITFLVKDKSKVQSGITY TSKKILQYAVILLGFGMNLSDIVKTGSSSLPIIISTIATSLVVAYVLCKVLKMPSKISTL IGVGSSICGGSAIAATAPVIEADDEEIAQAISVIFLFNILAALIFPTLGGMLGLSNDGFG LFAGSAVNDTSSVTATATAWDGIHGSNTLDQATIVKLTRTLAIIPITLVLAFKRTRDAEK SESTFNLKKIFPFFILFFILASVITTVFNLPGNITAPIKDLSKFFIVMAMAAIGLNTNIV KLVKSGAKPIFMGFCCWVAITVVCLSMQALLGLF >gi|223714178|gb|ACDT01000037.1| GENE 33 33864 - 34757 924 297 aa, chain + ## HITS:1 COG:PA3398 KEGG:ns NR:ns ## COG: PA3398 COG0583 # Protein_GI_number: 15598594 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 5 289 7 290 308 113 26.0 4e-25 MLDFRIDTFLAVCKYMNFTKASQYLNITQPAVSGHIRYLEEYYDVKLFMYIGKKMQLTKA GKILLDVATTLKHDDIFLKNKLNNLSLKQELVFGATLTVGEYVMPKVIDQLLNENQNTMI KMQVANTSELLKMINEGQLDFALVEGYFNKLEYDYRKFCQDEYICVGSIDFPLSIVHDIS ELFKYNLIIRENGSGSREIIERWLKERNLDIDDFSNIIEIGNINMIKRLVKNNHGITFIY KLAVEKELAERRLKQVKVNALTIKHDINFIWRKNSVFNDYYEELFEVFRLQNDKVLL >gi|223714178|gb|ACDT01000037.1| GENE 34 34782 - 34913 98 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPLTDDDFLKDSKLELYNEVIESLQEQEETTDVLDYEFDLEEE >gi|223714178|gb|ACDT01000037.1| GENE 35 34951 - 35874 1052 307 aa, chain + ## HITS:1 COG:no 
KEGG:Bsel_1414 NR:ns ## KEGG: Bsel_1414 # Name: not_defined # Def: hypothetical protein # Organism: B.selenitireducens # Pathway: not_defined # 13 306 35 335 348 82 26.0 2e-14 MIEEYIANNQFQDALDLLNDMDDEMTRYQRLVCLYGLGEYQMAKAEGMLAKAKASNTYYD VVSIYVATLKELEEFEEAINIMVEELSMPYIPYEYEAIFNAAHDELLLAKREANEGMERH NNAFSLDDMENILMKDLPNEDLLYMAIEQMEGINIRRLLPAIRNFLKDENKPSFAKSLLI ELMIDQEIDEEMTLVKKGIEYGINPSYAPLVLNQEVGGTILNLLSEGIEDDNPSLYSLCE QFLNFYLYLVYPKYIDDYDYRPIAAAIHYHIASMQYIDIELDDIEYLYNCDKEEILEKLN EIKEIEY >gi|223714178|gb|ACDT01000037.1| GENE 36 35968 - 37248 1936 426 aa, chain + ## HITS:1 COG:BS_tig KEGG:ns NR:ns ## COG: BS_tig COG0544 # Protein_GI_number: 16079875 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Bacillus subtilis # 1 424 1 423 424 280 44.0 4e-75 MKINNKKLENAIVELTVAFDSEEWKATQEKALDKLAKNVKIDGFRPGKAPAAMVRARVSK ASVLEEATDMILQTKFVEILTEANVEPVAQPALSVQKVDADELEVQILVPVKPQVELGEY KGLEVKKGRVTVTKKEIEEQLANYQTQFAELTVKEGGKVAKGDTAVIDFEGFVDGVAFEG GKGENYPLEIGSGSFIPGFEDQVIGMTVDKEQDIVVTFPEDYGAADLAGKEATFKVTVHE IKEKHLPEIDDELAKDVNIDGVETLDQLKDHIKANIKTRKESENENKFMDDLYKAIVASS KVEDSDALLEQEQGLMLQEIEQNLQRQGLNFEVYQQFTGKSKDDIKEDIKPQAEERVKLN AILAAIIEEEKLAVSDEELETELKTIAEYYQKELDEVKKIFEGNMSRIENDLLTRKAVDL VKDNLK >gi|223714178|gb|ACDT01000037.1| GENE 37 37303 - 39624 2616 773 aa, chain + ## HITS:1 COG:BS_lonA KEGG:ns NR:ns ## COG: BS_lonA COG0466 # Protein_GI_number: 16079872 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Bacillus subtilis # 1 771 1 769 774 760 51.0 0 MNEQTVLTLPVVCTRGMIVFPENRLTLDVGRPVSLKALELSANEHDNNIIFVSQINPLVD NPSFDDVFHIGTLCKIDRKVRRDSSGTIKLTVLGAKRVRLTNFEEQQGSIYSTVEIIEDE FGDRNEEVALVRKTTSYFEQAKRSMPNMPLDSINRLTSGVSASVLADTIGQYLPIDLNQK QKILETININERLLLVASSIESEKVIGEIEETINRKVRESIDENQREYYLREKLRAIKEE LGDSVPKEDDAETIREELEKNPYPQHVKDKIEEELRRFETMPAASSEANVVRTYIDWMMK 
TPWYQETKDVEDIQAIEDVLNEDHYGLEKVKERIVEYLAVKQMTKSLKAPIICLAGPPGV GKTSIAKSIARALQREFIKASLGGVKDEAEVRGHRRTYLGSMPGRIIQGMKKAGVINPVF LLDEIDKMSSDYKGDPTSAMLEVLDPEQNSQFSDNYLEEPYDLSKVLFIATANDLGNIPA PLRDRLEIIELSSYTEQEKLMIAKNHLIKKQLALHGLKEDQLVIEDDAIMAIIRHYTREA GVRDLERLLAKICRKAVLIVLKEKRDNLTVNKESLEKHLGKAPFEHTKKLDHSQIGVVTG MAYTQFGGDILPIEVNHFQGSGKFIITGQLGDVMKESASIALDYMKANKEKYGLEKIAFD KEDIHIHVPEGAVKKDGPSAGVTLTTAIYSAFKNQPVRNDIAMTGEITLRGNVLPIGGLK EKSISAHRSGIKKIIIPKDNAKDIDDIPKSVQDELEIVLADHIDTVLDHALEK >gi|223714178|gb|ACDT01000037.1| GENE 38 39681 - 40280 655 199 aa, chain + ## HITS:1 COG:lin1593 KEGG:ns NR:ns ## COG: lin1593 COG0218 # Protein_GI_number: 16800661 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Listeria innocua # 4 192 3 191 194 218 55.0 6e-57 MVKIKKAEYVLSAAWKSQWPEESFPEMCLAGRSNVGKSSFINAMLNHNGLAKVSGTPGKT RTLNFFNVNDALYFVDVPGYGYAKVNDSITKQFGKTMDEYITLRSTLRGFILLVDYRHKP TKDDVLMYEFVKHHDVPVMVVATKEDKLKRNDLKKNEKIIKDTLGFHPEDKFVRFSSLRK KGIEEAWNFIYELCDINFN >gi|223714178|gb|ACDT01000037.1| GENE 39 40474 - 41451 1011 325 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734816|ref|ZP_04565297.1| ## NR: gi|237734816|ref|ZP_04565297.1| predicted protein [Mollicutes bacterium D7] # 1 325 1 325 325 557 100.0 1e-157 MQKIYFEKWIDLNHQLNELLSLSVDESINYKIESVGVRAVGSLIVKGEYNGNHKFDENIE LDVLATFDKIVDQRDFNIKVEDFDYFIKDGNIQIKIEVGIHGVVEGEDRYVRDEHLGHDE ALEEIENLIKDTEPANASIEQLVRAKPVEVQETGPVTPAYVAQAKQPESKPMAAETTTST VKEIKAVAKEMDNHPKNLETSKVAVIPQKSESSVHAAKEPAVHMTKEPMAHVAKEAAVME VSKKETAMAEEISEQLQEKVYQSKSRPIFQDTSDSVGTYYLYIVKENDSYSEVATRYSVD EEIIRNYNQDKALEAGSVLIIPYVP >gi|223714178|gb|ACDT01000037.1| GENE 40 41448 - 42362 824 304 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755849|ref|ZP_02427976.1| ## NR: gi|167755849|ref|ZP_02427976.1| hypothetical protein CLORAM_01366 [Clostridium ramosum DSM 1402] # 1 304 1 304 304 517 100.0 1e-145 MNELIKEAYSLEVIGYIKVTKKVYKVKCREGFFCLKFVEDNGLTVVIDHIESLHLKCFLP VIRNYKKQVLTSYEDKTFYLSPWLASDNAIVKELKLKFYYECLSYLHSTSFFNYSVSKSF 
FKRQIEDIANIIHERQAYYNEIMSNFEVLSYRSPTGWMFVLNYHRIEGCLKKALELLECY EQYVCNLDTIRLCLTYNNFNYSHVMMKESKLISIDQIKINLCIYDIYNMYQRIPEFIFDF DVILDSYFCKIKLLKEERLLLQCLLHVVPIIELGHDEINNIIMMSRLLYYLDSISSLNKK LAID >gi|223714178|gb|ACDT01000037.1| GENE 41 42619 - 43290 729 223 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734818|ref|ZP_04565299.1| ## NR: gi|237734818|ref|ZP_04565299.1| predicted protein [Mollicutes bacterium D7] # 1 223 1 223 223 379 100.0 1e-104 MSYSVVWILFLGGCLFLMYFYSGKMRQVKKYQKQQRAAYEENRVKYSHFTSEVFDQTPDE ELTHAVLFHLLAKEDKLYEGEEITGTLIDVMTPSELLIYTIYQVELSMEGGRGSLHSFFI KEPYCLYRPYAIKAFEAVDCHEIAELMTAAEKLAVMIENEEEIELDEDSDYGKYNFSDFT SSLLSMLKSSGIVNKAAKFIRENKNDFIDLEVTTDEQTTGTEV >gi|223714178|gb|ACDT01000037.1| GENE 42 43262 - 45883 2806 873 aa, chain + ## HITS:1 COG:SA1488 KEGG:ns NR:ns ## COG: SA1488 COG0525 # Protein_GI_number: 15927242 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Staphylococcus aureus N315 # 4 873 2 874 876 1082 59.0 0 MSKQLEPKYNHHKVEEGKYRHWIDKKYFEAGDTSKKPYSIVIPPPNVTGKLHLGHAWDTT LQDMIIRYKRMQGYDALWLPGMDHAGIATQAKVEARMREEGISRYDLGRDGFLEKAWSWK EEYASIIRAQWEKLGLSLDYSKERFTLDDGLSEAVKEVFVKLYNEGLIYQGKRIINWDPV QRTALSNIEVIHKEIEGAMYYFKYQIVDSDEQLIIATTRPETMFADQAIFVHPDDERYTH LVGKKAINPANGEALPIMADSYIDMSFGTAVMKCTPAHDPNDFALAKKYNLEMPICMNDD GTMNELAHKYAGMDRFACREALVADFKAAGVVDHIEKHMHQVGHSERSNAIVEPYLSKQW FVKMEPLAKAALENQLKDSKVNFVPERFEKTFNQWMENIEDWCISRQLWWGHQVPAWYHK ETGEVYVGKNPPADLENWKQDEDVLDTWFSSALWPFSTLGWPNTDSELFKRYFPTNTLVT GYDIIFFWVSRMIFQSLHFTEERPFEHVLIHGLIRDEQGRKMSKSLGNGVDPMDVIDEYG ADTLRFFLTTNSAPGMDLRYIPEKLEASWNFINKIWNSARFVLMNIDDEMKFEELSFDNL NLCDKWILNRLNEVIREVDINMDKFEFVNVGSELYKFIWDDFCSWYIELTKVHLNSTNDT EKQASLNTLVYVLNAIVKMLHPFMPFVTEEIFQAIPHLEESICIAIWPEVNDHFTDESIN DQFTYLIDIVKGIREIRTQYTIKNAIEVPYVINTKNDDLEGLLNKCLPYIKKLCNAVCSG YNLNAAGEVANITIKGGNSLLVELGDYIDKDAEKEKLANQLKKLEGEIKRCQNMLANEKF TSKAPKEKVELERNKLADYQSKYDAVKEKLEQM >gi|223714178|gb|ACDT01000037.1| GENE 43 46005 - 46163 187 52 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|167755852|ref|ZP_02427979.1| ## NR: gi|167755852|ref|ZP_02427979.1| hypothetical protein CLORAM_01369 [Clostridium ramosum DSM 1402] # 1 52 15 66 66 97 100.0 3e-19 MNELQQCMQMEVMGGGVTGFFFCVVIGTAIYKMFRSGAGRVSIPKIISIEWR >gi|223714178|gb|ACDT01000037.1| GENE 44 46352 - 46738 365 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734822|ref|ZP_04565303.1| ## NR: gi|237734822|ref|ZP_04565303.1| predicted protein [Mollicutes bacterium D7] # 1 128 1 128 128 205 100.0 7e-52 MKIVRLVVVLAVALSICFCGIYLKFFSTINSYINTYIVQVGIYKQDENANEMIKKLLDLG KPNYKYQKDNNFVVITGVFLDEKEAKDMGNTISESGITCVIKQTKFESSFKEVIEKQDYE AIIKELGS >gi|223714178|gb|ACDT01000037.1| GENE 45 46735 - 47412 657 225 aa, chain + ## HITS:1 COG:BS_ysxA KEGG:ns NR:ns ## COG: BS_ysxA COG2003 # Protein_GI_number: 16079856 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Bacillus subtilis # 1 217 8 225 231 172 42.0 6e-43 MKLKNYPLEERPREKAFHHGIESLNNIELLALVLRTGNKQESAIELAQRIINEIGGFRYL HDINYYQLIQIKGIKQAKAIEVLAIIEIAKRLDKQPVAMSAIKEPRDGYELLKNQLMFEQ QEKVIVLCLNSRLEVIKEKTVFIGGNNISIISGRELFKEALICGSNRVMVVHNHPSGNPE PSIEDIEATERLYSMAKELDIDVVDHLIIGRSRFYSFASNKIIEV >gi|223714178|gb|ACDT01000037.1| GENE 46 47587 - 48315 891 242 aa, chain + ## HITS:1 COG:lin1582 KEGG:ns NR:ns ## COG: lin1582 COG1792 # Protein_GI_number: 16800650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Listeria innocua # 13 238 51 276 295 149 39.0 6e-36 MVSDSIAAIEYYVVKKPVEFVSNLFSEYNELKDVYKENKILKAKLSSYASVEVNTDVLSK EIDELKKMLNIEYLPTDYNVKTTSFVRESDDWNNEITIDLGSLAGVSKDMVVISSKGMIG KVTSVTEVTARVQLLTAENPTSALPIQVINGDQNVYGLLNRYDIESKCFEITLFSDVEKF EDNAKVITSGLGGKAPKGIYIGTVESSIVSEDGTSKTIRVKPAADFNDLSYVAVVFRSDS NE >gi|223714178|gb|ACDT01000037.1| GENE 47 48308 - 48841 461 177 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755857|ref|ZP_02427984.1| ## NR: gi|167755857|ref|ZP_02427984.1| hypothetical protein CLORAM_01374 [Clostridium ramosum DSM 1402] # 1 177 1 177 177 274 100.0 2e-72 
MNKQLYRYYLVMALSFVVDSIISYYLPYNFDKLGITIVPCVSLMMFTLLNNTIGDEHRYI FATVTGLYYAIVYANSLLIYVLLYCVYAFFGKKYTKLATYTLLEMFIAVIVTIIAQEVVI YWLVWITNVTQLSIVAFLTNRLLPTLGANLLLIAPVYFIHKKLGFEGKVNAYQSKRS >gi|223714178|gb|ACDT01000037.1| GENE 48 48816 - 49202 454 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|169350146|ref|ZP_02867084.1| ## NR: gi|169350146|ref|ZP_02867084.1| hypothetical protein CLOSPI_00888 [Clostridium spiroforme DSM 1552] # 1 123 1 123 191 190 77.0 2e-47 MHIKVKGVNNHLVFVLDDNQDFNSLMNELESLLESPLLKSDGYYPRAFFDFKSRILSIHE LLRLLDLLFSKQVLLFDGINMAKVEKKNKIKVINKTVHAGEILELDQDALIIGQVNPGAI VRLKESCM Prediction of potential genes in microbial genomes Time: Thu May 26 09:39:29 2011 Seq name: gi|223714177|gb|ACDT01000038.1| Coprobacillus sp. D7 cont1.38, whole genome shotgun sequence Length of sequence - 1587 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 64 - 843 785 ## COG2894 Septum formation inhibitor-activating ATPase 2 1 Op 2 . 
+ CDS 892 - 1470 600 ## gi|167755860|ref|ZP_02427987.1| hypothetical protein CLORAM_01377 Predicted protein(s) >gi|223714177|gb|ACDT01000038.1| GENE 1 64 - 843 785 259 aa, chain + ## HITS:1 COG:BH3027 KEGG:ns NR:ns ## COG: BH3027 COG2894 # Protein_GI_number: 15615589 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Bacillus halodurans # 1 257 1 261 264 227 43.0 2e-59 MSRVIVVTSGKGGVGKSSVSVNLASALAFSKFKVCLIDGDFGLKNLDVMMGLENRVVYDL NDVVEGRCTIEQVLVKDKRIDGLSLLPSCKSLSFENLDTEIMNSLIERLNKDYDFIIVDS PAGVEKGFQYSASLANEAIVVVNLDVSSLRDADRVVGLLMKKGINTINMIINKVNVDDIE GARSLTVEDAQEILSLPLLGIVYDSHDMIEANNRGVPIFLNNQHLLHSCFVNISKRILGQ QVPYAKYKKKSLIRRFFYS >gi|223714177|gb|ACDT01000038.1| GENE 2 892 - 1470 600 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755860|ref|ZP_02427987.1| ## NR: gi|167755860|ref|ZP_02427987.1| hypothetical protein CLORAM_01377 [Clostridium ramosum DSM 1402] # 1 192 1 192 192 337 100.0 3e-91 MSDLDDVRKRMQKRRGTNKKPLNDDNFKRFYNFMVKFMVVILVVLVSMSYIKINPDSNIK KMILNDTNYKAVTSWISDNLFSFLPSDDVSVSSGVEYQHVKDDYYKNNSNEVLSIEAGRV IATGNDENLSGYVVILGKNDVEITYSNIDNVMVTLYDEVDQGMVLGSYQDQFVLQFEHLG KEISYEEYQRME Prediction of potential genes in microbial genomes Time: Thu May 26 09:39:46 2011 Seq name: gi|223714176|gb|ACDT01000039.1| Coprobacillus sp. D7 cont1.39, whole genome shotgun sequence Length of sequence - 21150 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 8, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 2.5 1 1 Tu 1 . 
+ CDS 90 - 575 256 ## gi|167755861|ref|ZP_02427988.1| hypothetical protein CLORAM_01378 + Prom 588 - 647 1.5 2 2 Op 1 14/0.000 + CDS 698 - 1042 576 ## PROTEIN SUPPORTED gi|167755862|ref|ZP_02427989.1| hypothetical protein CLORAM_01379 3 2 Op 2 14/0.000 + CDS 1047 - 1370 174 ## PROTEIN SUPPORTED gi|81428286|ref|YP_395286.1| putative ribosomal protein 4 2 Op 3 14/0.000 + CDS 1374 - 1658 482 ## PROTEIN SUPPORTED gi|237734832|ref|ZP_04565313.1| 50S ribosomal protein L27 + Prom 1662 - 1721 3.5 5 2 Op 4 . + CDS 1745 - 3028 1531 ## COG0536 Predicted GTPase + Term 3029 - 3061 3.3 + Prom 3068 - 3127 7.0 6 3 Op 1 29/0.000 + CDS 3155 - 3721 647 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 7 3 Op 2 . + CDS 3732 - 4721 1038 ## COG2255 Holliday junction resolvasome, helicase subunit 8 3 Op 3 4/1.000 + CDS 4714 - 5214 482 ## COG1555 DNA uptake protein and related DNA-binding proteins 9 3 Op 4 4/1.000 + CDS 5205 - 7211 1306 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) 10 3 Op 5 . + CDS 7247 - 8194 944 ## COG1466 DNA polymerase III, delta subunit 11 3 Op 6 . + CDS 8203 - 9417 1444 ## COG1760 L-serine deaminase 12 3 Op 7 . + CDS 9407 - 10501 871 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases + Prom 10534 - 10593 8.5 13 4 Tu 1 . + CDS 10622 - 11245 727 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Term 11246 - 11272 -1.0 + Prom 11277 - 11336 12.8 14 5 Op 1 21/0.000 + CDS 11431 - 12471 976 ## COG1420 Transcriptional regulator of heat shock gene 15 5 Op 2 29/0.000 + CDS 12483 - 13025 542 ## COG0576 Molecular chaperone GrpE (heat shock protein) 16 5 Op 3 31/0.000 + CDS 13043 - 14860 2549 ## COG0443 Molecular chaperone + Term 14870 - 14903 2.1 17 5 Op 4 . + CDS 14921 - 16045 1373 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain + Term 16049 - 16080 -0.6 - Term 15816 - 15850 -1.0 18 6 Op 1 . 
- CDS 16030 - 16899 420 ## gi|237734846|ref|ZP_04565327.1| predicted protein - Prom 16923 - 16982 9.4 - Term 16946 - 16988 1.3 19 6 Op 2 . - CDS 16991 - 18349 1281 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase - Prom 18488 - 18547 8.9 - Term 18505 - 18543 1.1 20 7 Tu 1 . - CDS 18549 - 18734 308 ## PROTEIN SUPPORTED gi|167755880|ref|ZP_02428007.1| hypothetical protein CLORAM_01397 - Prom 18875 - 18934 8.8 + Prom 18780 - 18839 12.7 21 8 Op 1 9/0.000 + CDS 18964 - 19320 553 ## COG1302 Uncharacterized protein conserved in bacteria 22 8 Op 2 . + CDS 19333 - 20988 2341 ## COG1461 Predicted kinase related to dihydroxyacetone kinase + Term 21019 - 21054 3.2 Predicted protein(s) >gi|223714176|gb|ACDT01000039.1| GENE 1 90 - 575 256 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755861|ref|ZP_02427988.1| ## NR: gi|167755861|ref|ZP_02427988.1| hypothetical protein CLORAM_01378 [Clostridium ramosum DSM 1402] # 1 161 87 247 247 249 100.0 6e-65 MAGLSTHLGMFIVIKVFFDSEYLLAVNQLVFVFNLLPIYPLDGSKILLLVFSLFKDYYRA VKLQIKVSIFSLSVLIVLYNQMGYILVYFYLLYINYQYIKEFRYIIIRLYLKRMHDNQYH RLKINHDYRFYRPYENYYLIGGNGYSEKEVLQYLIKNLKSN >gi|223714176|gb|ACDT01000039.1| GENE 2 698 - 1042 576 114 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755862|ref|ZP_02427989.1| hypothetical protein CLORAM_01379 [Clostridium ramosum DSM 1402] # 1 114 1 114 114 226 100 1e-58 MYAIIETGGKQLKVEAGDTIFVEKLDVEAGEKVVFDKVLAVGDKTMKLGAPYVKGATVEA TVEKQGKEKKVVIYKYNAKKHYHKKQGHRQPYTKLVINTINKTAKKAAEETAEA >gi|223714176|gb|ACDT01000039.1| GENE 3 1047 - 1370 174 107 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|81428286|ref|YP_395286.1| putative ribosomal protein [Lactobacillus sakei subsp. 
sakei 23K] # 1 102 1 108 113 71 36 4e-12 MIKVLVKQNNNQIVNLSITGHADSGEYGKDLVCAGVSTVGIGAMNMLAKKGFLAKGLGTI EINEGYINVVVNHTDEVCQVVLETLVVTLETMVESYGRFIKISKVEV >gi|223714176|gb|ACDT01000039.1| GENE 4 1374 - 1658 482 94 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237734832|ref|ZP_04565313.1| 50S ribosomal protein L27 [Mollicutes bacterium D7] # 1 94 1 94 94 190 100 8e-48 MMFKLDLQLFASKKGVGSTKNGRDSKSKRLGAKLSDGQFCTAGSIIYRQRGTKVHPGTNV GKGGDDTLFATVDGVVKFERLGRDRKQVSVYPAA >gi|223714176|gb|ACDT01000039.1| GENE 5 1745 - 3028 1531 427 aa, chain + ## HITS:1 COG:lin1572 KEGG:ns NR:ns ## COG: lin1572 COG0536 # Protein_GI_number: 16800640 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Listeria innocua # 3 427 2 429 429 462 58.0 1e-130 MQFIDKAKIRVEAGKGGDGTVAFRREAHVPKGGPAGGDGGRGGSVIFQATTSLSTLLDLK YNRLYKAPSGQNGMAKKMHGKDAIDTVIKVPVGTMILNEETGQIMADLTEDKQRVVIAKG GRGGRGNARFATSRNPAPQICERGEPGENFDLICELKLLADVGLVGFPSVGKSTLLSVVS RARPEIADYHFTTIVPNLGVVQVKDGRSFVMADLPGLIEGAAQGKGLGHQFLRHIERCRV IVHIVDMGAVDGRDPYEDYVTINKELGEYQYRLLERPQIVVANKMDEEGAEENLVRFKKQ VGEDVKIFPISAIIHDGVDQVLYAVADALATAPTFTMEDEVEHTVLYTMGDEEDKPFELH NLGNGNWQITGKKIERMVAMTSLVSDDSLKRLLIKMRNMGVDDALRNAGAQDGDNVAIGE FEFDFYE >gi|223714176|gb|ACDT01000039.1| GENE 6 3155 - 3721 647 188 aa, chain + ## HITS:1 COG:SPy2119 KEGG:ns NR:ns ## COG: SPy2119 COG0632 # Protein_GI_number: 15675869 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Streptococcus pyogenes M1 GAS # 1 188 1 197 198 147 43.0 9e-36 MYSYLIGTVVEMNIDHIVVENSGIGYLIYVNNPYEYTRGKEIKIFLYQQVKEDALLLYGF KTSEEKEMFLKLILVKGIGCKTAIGILATGDVTSIISAIETGNVAYLKKIPGIGPKAAQQ IILDLQGKFKNTPTATTLINNNDLDEAAEVLIALGYKKSEVDKALAVLLNEKLDTNGYVK RALSLLVK >gi|223714176|gb|ACDT01000039.1| GENE 7 3732 - 4721 1038 329 aa, chain + ## HITS:1 COG:BS_ruvBm KEGG:ns NR:ns ## COG: BS_ruvBm COG2255 # Protein_GI_number: 16081161 # Func_class: L Replication, recombination and repair # Function: Holliday junction 
resolvasome, helicase subunit # Organism: Bacillus subtilis # 15 325 20 330 336 412 63.0 1e-115 MTRNEILDANVIDDDSSLRPSSFDEYVGQTNLKENLKVFVGAAKLRDESLDHVLLYGPPG LGKTTMSMIIANEMGTNIKITTGPSIEKTGDLVAILTALEPGDVLFIDEIHRLNKVVEEI LYPAMEDFCVDVVIGKEASTRSVRIDLPPFTLVGATTRAGDLSAPLRDRFGIISKLEYYD ETDLKTIIDRTSRVYSMPMDDDAKSALAMRSRGTPRIANRLFRRVRDFAQFNGEEIISKE RTIEALDRLKVDQLGLDDVDHKYLLGIIHRFKGGPVGLESLAASIGEEPQTLEDVYEPYL LQIGLIKRTPRGRVATSEAYKHLNINIDE >gi|223714176|gb|ACDT01000039.1| GENE 8 4714 - 5214 482 166 aa, chain + ## HITS:1 COG:TM1052 KEGG:ns NR:ns ## COG: TM1052 COG1555 # Protein_GI_number: 15643810 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Thermotoga maritima # 94 160 39 101 181 60 52.0 1e-09 MNKAVFTVFFCGGIVKKHEIIGLLIILIISTVYSFTLPETSLNEIESDPQVKIIVEGKYN ETLVFNSSPTIEDVFKALNTDNVYGFDQKTVLDSQTVFYIPINKNLISLNHASKEQLMTI KGIGPKTADKIIDYRNEHPFATIEEIKEVSGIGEKTYLRIRELLCL >gi|223714176|gb|ACDT01000039.1| GENE 9 5205 - 7211 1306 668 aa, chain + ## HITS:1 COG:L0317_2 KEGG:ns NR:ns ## COG: L0317_2 COG2333 # Protein_GI_number: 15673754 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Lactococcus lactis # 422 654 2 270 282 164 33.0 6e-40 MSLRYHLIYYAVACLLGVLVMLIHPVFLVLLIIYAFFIIRHFGWGRLVTIGLVMIFFLVF IRWPQPTDEPIISGYIISRDEKSIVLKTTKTKIKVYGEFIGYEVGDELEIEVNYFEISRA TNDNAFDYRNYLYSQGITNNASLLRLINSQKHNTLFQKLQKRIDGKELVNSYASMFILGI RDEMINDYYHQLTELSIVHLFALSGLHIHILRRLIKKVLIFLLPEYLINYLSLIIIGVYM YIIPYNISFMRAYLVMLLMTLFKKYLNQLDCFSLVAMFFVFMNPYIIYNLSFVFSYFMYL IIILINHHRYLNEIVYGASVPIVISIQYRINILSLFLGIILTPLISVLYQLLWLYVIFGN FFKPVISLVIEVLDNIVVFSTDFSFFINFSKPSLFFILGYYYIYFKLIVKINIKQRIHRE ILLLLSLVIMFYFKPYYQTFGQVVMIDVGQGDCFFIQQPYNQGNILIDTGGLRNKDLAAL TLVPYLRSVGVFKLDHVFISHDDFDHSGAYQSLADQIEIGHTITSYQDKFKIGQVEIEIL KTPESTDNNDSSLVLLVTINKLKYLFTGDISNAVERQLINDYPELKIDVLKVSHHGSNTG TSAAFLNAIKPKIALISCGKDNYYGHPHDDVITRLNDYGVKVYRSDEMGMVKIVYYGNDN YIFNDFND 
>gi|223714176|gb|ACDT01000039.1| GENE 10 7247 - 8194 944 315 aa, chain + ## HITS:1 COG:BS_yqeN KEGG:ns NR:ns ## COG: BS_yqeN COG1466 # Protein_GI_number: 16079610 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Bacillus subtilis # 2 304 17 325 347 162 32.0 1e-39 MIFVIYGEESFLMEQKLQSLKKEYDCNEDNMNISTYRANEDSLEEVYEDLITPPFLTDKK MVVLKNPYFLTTKKVKKDNNELEYFEKCLINSEEVIFVIYHVGKDFDERKKIVKNLRKQA QFFEIDKVNHYKLSDSTRQAIRRRDATIDDDALELLLSRLGDSLNNVALEVEKLCLYSKH ITYDIVDELVSKPLDENVFALTSAILQKDRRKMFSIYHDLMILNEEPIKLIVLIAGQMRL IYQVKLLDRKGYNDKEIGKILGVNPYRLKYVRQEGKDFDLNELLQCLDDLSRLDVEIKTG KMDKKMGLELFMVRI >gi|223714176|gb|ACDT01000039.1| GENE 11 8203 - 9417 1444 404 aa, chain + ## HITS:1 COG:FN1106 KEGG:ns NR:ns ## COG: FN1106 COG1760 # Protein_GI_number: 19704441 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Fusobacterium nucleatum # 1 398 1 399 408 345 47.0 1e-94 MESLRELYKIGVGPSSSHTMGPQRAALKVKETYPQATHFEVYLHGSLALTGKGHLTDYII EETLGKDSVKIHFVNTALPKHPNGMIFEIYRENQLLDKITVYSVGGGALMYDDSSLQPSK QVYPHGNLTEILEYCDRKGINLYDYTIEYEDEHFKAYLFEVLNAMFACVEAGLSTTGVIP GKLGLKRVAKSMYQQAINTRRSGERDRLLVSSYAYAVSEENASGHRIVTAPTCGASGVLP AVLYYCYKQLEIPRKEIIKALAVAGIFGNVIKTNATISGADGGCQAEIGSACAMAAAAYG WILELNNNLIQYAGEMGLEHNLGLTCDPVGGYVQIPCIERNGFGALRAMDAAMYAKQLGY LRKNKVSFDAVVRVMKETGKDLNSAYKETSLGGLAKEFGFENED >gi|223714176|gb|ACDT01000039.1| GENE 12 9407 - 10501 871 364 aa, chain + ## HITS:1 COG:SP1409 KEGG:ns NR:ns ## COG: SP1409 COG0635 # Protein_GI_number: 15901263 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Streptococcus pneumoniae TIGR4 # 2 364 4 374 376 253 40.0 4e-67 MKTNSLYLHIPFCEQICAYCDFCKVFYNEHQADDYLAVLKHELQALNITEPLKTIYIGGG TPSSLNDEQLEWLMDIIKPYISSETKEVSIEVNPESIDYYKLDILKRGGVSRLSIGVETF NDILLKKINRQHTSIQVERIIDYARKIGFNNISIDLMYGLPNQTITDIKNDLAKVRQLPI EHISYYSLILEDHTVLKNLNYQPLDEEVESKITQLIEESLEQIDFHKYEISNFAKTGYES 
LHNLAYWQYDNYYGIGVGASGKIDDCLIEHNRNLNAYLRRQNTITKMINSKEETMFNHLM MSLRLVKGLDLKEFEKRYGLRAVDVYQTAIDKHLKMKTLVIENDYLHATSESIKLLNEIL IDFL >gi|223714176|gb|ACDT01000039.1| GENE 13 10622 - 11245 727 207 aa, chain + ## HITS:1 COG:AF0830 KEGG:ns NR:ns ## COG: AF0830 COG1853 # Protein_GI_number: 11498436 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Archaeoglobus fulgidus # 1 165 1 165 169 144 41.0 1e-34 MNNKAFFKLSYGLYVVSSSCDGKDSACIANTFVQVTSEPARVCITLNKNNYTTSLIENSC VYNVGVLLDDIDMDVIRRFGFQSGKDVNKFDGIDYEVDCQNIKQITEGIAASFSVKVISM TDVGTHIMFVGDVIDCKVINEGEVLTYANYHNKKNGTTPKNASSYQADTSKHGWRCTVCG FILEADELPEDFICPVCKQPASVFEKI >gi|223714176|gb|ACDT01000039.1| GENE 14 11431 - 12471 976 346 aa, chain + ## HITS:1 COG:lin1512 KEGG:ns NR:ns ## COG: lin1512 COG1420 # Protein_GI_number: 16800580 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: Listeria innocua # 1 336 1 336 345 259 39.0 5e-69 MLTARQLLVFKCIVDEFIETAEPVGSKALMTKYQLPYSSATIRNEMSFLEEYGYLEKTHT SSGRVPSVEGYRFYVDNLAKRDLDEGIKNQVAQIFSDRHRGLNEIIHESCEMLSELTHLT SVVLGPDSSEDTLQQINIVPLNDNRVTAIIITNQGYVENKVFDLNRNHKIDDLVSCVAVM NDLLIGTPIDQVAFRLERDVKPVLSAKIKEHEELFNAFLEAFVRFASSNVYFSGKENMLY QPEYNDVNKLRRIVSAFENSQVWNALQPLSDEEGVTVRIGSDSPINEIEDVSVISASFRT GERTKGSISVIGPTRMPYEKVVSLVEYISRNIEEAFFDESDDSNDE >gi|223714176|gb|ACDT01000039.1| GENE 15 12483 - 13025 542 180 aa, chain + ## HITS:1 COG:BH1345 KEGG:ns NR:ns ## COG: BH1345 COG0576 # Protein_GI_number: 15613908 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Bacillus halodurans # 8 179 22 193 194 121 47.0 7e-28 MAKEKETVVDEESTEKTAEETVETKEDEITVEDQLKNLEDEVNTWKTDYYKVFADMENLK KRLQNEHANAMKFMMQSFIEELLPVVDNFERSLAVVDPSDEIKNFLKGYEMIYNQLMEVL KSQGVEVIKTEGEEFDPNFHQAVMTVKDDNFKTNMIVEELQKGYKLKDRVIRASLVKVSE >gi|223714176|gb|ACDT01000039.1| GENE 16 13043 - 14860 2549 605 
aa, chain + ## HITS:1 COG:BS_dnaK KEGG:ns NR:ns ## COG: BS_dnaK COG0443 # Protein_GI_number: 16079601 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Bacillus subtilis # 1 605 1 605 611 735 68.0 0 MGKIIGIDLGTTNSCVSVMENGEVKVIANPEGNRTTPSVVSFKNGEIIVGEAAKRQAVTN PDTIMSVKRHMGTDHKEHANGKDYTPQEISAMILQNLKATAEAYLGETVTQAVITVPAYF NDAQRQATKDAGKIAGLEVERIINEPTAAALAFGLDKLDQEQKVLVYDLGGGTFDVSILD LADGTFEVLATAGDNKLGGDDFDEKIMNWVVEEFKKDQGVDLSNDKMAMQRVKEAAEKAK KDLSGTMQTQISLPFISAGAAGPLHLELTLTRAKFDELTRDLVLRTETPVRQALKDAGMD PTEIHQVLLVGGSTRIPAVQESVKKLLGKDPNKSVNPDEVVSMGAAIQGGVIAGDVKDVL LLDVTPLSLGIETMGGVMTVLIERNTTIPTTKSQVFSTAADNQPAVDINVLQGERSMAKD NKQLGLFKLDGIEPAPRGVPQIEVTFSIDVNGIVNVKAKDLKSQKEQSIVIQNSTGLSDE EIDRMVKEAEANKAEDEQKRKDIETRNKAEQMINEIDKALAEQGDKIDATQKESAEKLKD ELKAALDNNDMATLEAKMSELEQMAQQMASYAYQQQGGAGASDANAGTANAQDDNVVDAD FEEKN >gi|223714176|gb|ACDT01000039.1| GENE 17 14921 - 16045 1373 374 aa, chain + ## HITS:1 COG:lin1509 KEGG:ns NR:ns ## COG: lin1509 COG0484 # Protein_GI_number: 16800577 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Listeria innocua # 4 373 3 374 376 329 51.0 5e-90 MADKRDYYEVLGVSKQASADEIKRAYRKKAKQYHPDVNKEPGAEEKFKEVQEAYEVLSDA NKKATYDRFGHAAFEQGAGNGGAGGFGGFGFEDVDLGDIFGSFFGGGGRSRQRNTGPRRG DDRLMRLNISFMEAVNGVKKDIKVTYDAPCTHCNGSGAKNPNDVQTCSRCGGRGTVQQQQ SSPFGTFVTETTCPDCNGTGKVVKEKCPHCHGRGYETKTVTVQLDIPAGINSNQQLRVAG KGGRGANGGPNGDLFVEIQVGSHEHFKREGRNIHITIPVSNVDATLGCEVDVPTVQGDVT VKIPAGTQSGTILRLKGKGVKDLRSTNYGDEMVRIEVKIPTKLSSEEKELYTKLSKLSKK KESIFDAFKRQFKK >gi|223714176|gb|ACDT01000039.1| GENE 18 16030 - 16899 420 289 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734846|ref|ZP_04565327.1| ## NR: gi|237734846|ref|ZP_04565327.1| predicted protein [Mollicutes bacterium D7] # 1 289 4 292 292 370 100.0 1e-101 MKKLIQRLYYRINNRVLDYRYKVNKNAGLCFAYYLIMSIIPICSLFAFFASILNVDLGTL 
EQLLKNYLTPEFSNIIIASLKSSHITLSSIIILFISLFVVSRGINQLYGISKNLFPSAHQ RNIIIEQLLMLLKTIAVFVLLLLIISILTIIPLINYFINFKDILVLGDLYLFLVFFIILF LLYKIIPDVHVHIFDIVKGAFCSSILMLILLSALEFYFSIADYTSVYGPLASVVVIMISF SLIAETIFIGMYIMFEAHMKRLIIEMKKNIILKKNKKRRIFFLRFYFLN >gi|223714176|gb|ACDT01000039.1| GENE 19 16991 - 18349 1281 452 aa, chain - ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 450 1 473 473 355 41.0 8e-98 MTFPKDFLWGGAVTAHQSEGAYNAGGKSPAVCDLFPKPEHSDFKDGIDSYHRYDDDFSLF EEMGFTSYRFSIDWSRVCPDGINFSEESMQFYDDFIDSMIRHGMEPMCSLYHFEMPQLLM NKYNGFYSREVVDMFVTYANKMIDRYGDRVKKWISFNEQNAIALPGSSKVAYGAVCPKDI DEQTFINQLVHNTFVAHAKVVEKVHTIADAKVLGMVIYIPAYAATCNPLDELESRNQMAL TDMYFDMFTYGEYSSYMMAKMKNENNLPKMLDGDLELLKKNKVDWLSLSYYFSTVASHGK LSIEMNGNAKAATNPYLKASEWGWQIDPLGLRIGLRDIYAKYRLPIMVVENGFGMRDILE NETVIDDTRIDYMKDHLEQIALAINEGVDCRGYLMWGPIDILSSQGEMSKRYGTIYVNRD DKDLKDMKRYKKKSFYWYKKVISTNGDDIKND >gi|223714176|gb|ACDT01000039.1| GENE 20 18549 - 18734 308 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167755880|ref|ZP_02428007.1| hypothetical protein CLORAM_01397 [Clostridium ramosum DSM 1402] # 1 61 1 61 61 123 100 1e-27 MSRKCQISGKGPMSGNTRSHALNSSRRKWNVNLQKATILVDGKPTTVRISARELRTLRKS A >gi|223714176|gb|ACDT01000039.1| GENE 21 18964 - 19320 553 118 aa, chain + ## HITS:1 COG:BH2499 KEGG:ns NR:ns ## COG: BH2499 COG1302 # Protein_GI_number: 15615062 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 2 115 3 116 120 119 54.0 9e-28 MIEKNNNMGTVNISLDVVATIAGGAAIECYGVVGMASQKKLKDGYYDLLRKENYSKGIVA RDGETGLILDLYVLLGYGIKMTEVLREVQKKVKYVVESTLDVNVESVNVYVQGVRGIE >gi|223714176|gb|ACDT01000039.1| GENE 22 19333 - 20988 2341 551 aa, chain + ## HITS:1 COG:BS_yloV KEGG:ns NR:ns ## COG: BS_yloV COG1461 # Protein_GI_number: 16078647 # Func_class: R General function 
prediction only # Function: Predicted kinase related to dihydroxyacetone kinase # Organism: Bacillus subtilis # 1 551 3 553 553 484 47.0 1e-136 MKVVTGKLFKEMVLCGANTLHNNHPEIDALNVFPVPDGDTGTNMSLTFSAGAKEIEGMDS NNIGEISKKLSKGLLMGARGNSGVILSQIFRGVSMALQGHEEADAVLWAQALESGAKVAY KAVMRPVEGTILTVIRESSEAVVKYAKPGMEIEDVFSYFVKEAEASLERTPELLPVLKEV GVVDSGGAGLLLVFTGFLAGLAGETVDYVEIKSSDSAMNAVADIEGGEEGYGYCTEFIVR LEPSLVDKFKEEQLKKELARIPGESIVVVQDEDIVKVHVHTLKPGNALNIAQRFGEFVKL KIENMQEQADTIQSNAGTIVGVDDNAKPKREAKETAVISVCAGDGLKDAFLELHCDYVVS GGQTMNPSAEDMVQAVRDVNAKNVIILPNNSNIVMTAQQTATILEDEVNVIVIPTKTIPQ GLSACIMFNPDASLDDNVAEMTEAVGNVKTGQVTFAIKDTNIDGVEIKANDYMALVEKDI VACKDNKLKALKVVLEKLVDEDAELITLIYGEDVNDDDIEEIESFVEDNFEAELEVVNGK QPVYSFIVGVE Prediction of potential genes in microbial genomes Time: Thu May 26 09:40:14 2011 Seq name: gi|223714175|gb|ACDT01000040.1| Coprobacillus sp. D7 cont1.40, whole genome shotgun sequence Length of sequence - 25765 bp Number of predicted genes - 23, with homology - 22 Number of transcription units - 10, operones - 4 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1143 - 2072 615 ## COG0582 Integrase - Prom 2136 - 2195 8.2 + Prom 2145 - 2204 9.1 2 2 Op 1 1/0.000 + CDS 2404 - 3714 1371 ## COG0534 Na+-driven multidrug efflux pump + Prom 3739 - 3798 8.0 3 2 Op 2 . + CDS 3874 - 4758 787 ## COG0583 Transcriptional regulator + Prom 4834 - 4893 6.1 4 3 Tu 1 . + CDS 4940 - 6046 1460 ## COG0686 Alanine dehydrogenase + Term 6048 - 6083 4.4 + Prom 6077 - 6136 7.0 5 4 Tu 1 . + CDS 6163 - 6852 556 ## Ccel_1416 hypothetical protein + Term 6996 - 7033 1.0 + Prom 6936 - 6995 9.0 6 5 Tu 1 . + CDS 7079 - 9820 3080 ## CPR_0360 cell wall surface anchor family protein + Term 9838 - 9870 1.2 - Term 9826 - 9857 1.0 7 6 Tu 1 . - CDS 9896 - 10177 126 ## + Prom 10048 - 10107 5.4 8 7 Tu 1 . 
+ CDS 10152 - 10355 187 ## EUBREC_1425 PTS system, glucose subfamily, IIA subunit
+ Prom 10398 - 10457 6.3
9 8 Op 1 7/0.000 + CDS 10600 - 10905 112 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific
+ Term 11137 - 11174 -0.9
+ Prom 11545 - 11604 6.8
10 8 Op 2 3/0.000 + CDS 11668 - 12498 827 ## COG3711 Transcriptional antiterminator
+ Prom 12510 - 12569 6.6
11 8 Op 3 . + CDS 12649 - 12954 410 ## COG1440 Phosphotransferase system cellobiose-specific component IIB
+ Prom 12975 - 13034 11.6
12 9 Op 1 . + CDS 13057 - 13968 970 ## COG0679 Predicted permeases
13 9 Op 2 . + CDS 14016 - 16034 1975 ## COG1200 RecG-like helicase
14 9 Op 3 4/0.000 + CDS 16072 - 17046 1166 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme
15 9 Op 4 . + CDS 17075 - 17728 636 ## COG0571 dsRNA-specific ribonuclease
+ Term 17729 - 17761 -0.0
16 9 Op 5 . + CDS 17807 - 18109 349 ## gi|167755901|ref|ZP_02428028.1| hypothetical protein CLORAM_01418
17 9 Op 6 . + CDS 18175 - 19635 1578 ## COG1488 Nicotinic acid phosphoribosyltransferase
18 9 Op 7 . + CDS 19653 - 20036 227 ## gi|167755903|ref|ZP_02428030.1| hypothetical protein CLORAM_01420
+ Prom 20038 - 20097 6.5
19 9 Op 8 . + CDS 20135 - 20965 997 ## COG0287 Prephenate dehydrogenase
+ Term 21017 - 21059 -0.3
+ Prom 21006 - 21065 8.5
20 10 Op 1 4/0.000 + CDS 21095 - 21628 536 ## COG1396 Predicted transcriptional regulators
21 10 Op 2 30/0.000 + CDS 21639 - 22682 1288 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components
22 10 Op 3 8/0.000 + CDS 22682 - 23539 700 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I
23 10 Op 4 .
+ CDS 23517 - 25367 1608 ## COG0687 Spermidine/putrescine-binding periplasmic protein + Term 25374 - 25406 2.6 + TRNA 25581 - 25655 78.5 # Cys GCA 0 0 + TRNA 25666 - 25754 81.3 # Leu TAA 0 0 Predicted protein(s) >gi|223714175|gb|ACDT01000040.1| GENE 1 1143 - 2072 615 309 aa, chain - ## HITS:1 COG:SP1129 KEGG:ns NR:ns ## COG: SP1129 COG0582 # Protein_GI_number: 15900995 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 6 304 70 378 387 72 26.0 1e-12 MKTKKFKEISNEWIELKKLSVKYPTIAKYRVVIDTQLAHQFNDYTMEQFNEDIIIAYFNK LSTKENYANSTLLSIRYVLRAIINYSHQKYKTNNCSFELIKIAKKQRQINTLSNSQKINL SNYCFNNFQPISIAVAISLYTGLRIGEICALKWEDINFEDNYIYVYKTVERLKSKEDTGN KTALMILNPKTSSSKRIVPIPLFLNEFLVNYKSRYQIEDNNVFIITGNNKIPDPRTTQYR FNKLCKQFDFNTNFHTLRHSYATNCVMNEVDTKSLSEMLGHSNVGTTLNLYVHSSLEFKK KQINKISRL >gi|223714175|gb|ACDT01000040.1| GENE 2 2404 - 3714 1371 436 aa, chain + ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 5 435 92 527 547 168 27.0 2e-41 MFSNKALKKLILPLIIEQILIMAVGVADTVMVSYAGEVAISGVGLVDMFNNLIITVLAAI DAGGAIIVSQYIGNKDRKNANKASSQLLTITIVIATVIMLGCLVFHRILLSTFFGAIEMD VMKAATTYFLISAISFPFLGVYNSAAALYRSMEKTRTTMYVSILMNIINVVGNYIGVFIL HAGVAGVAVPTLISRIVAAIIMFALSLNSSNLVYVKIKNVFAWNQEMISRILKIAVPNGI ENGLFTLGRVLVTSIVALFGTSQIAANSVAGSIDQIAVVVVNAINLGIVVVVGQCIGAND YEQAKYYIKKLMKISYIVTGIIGSAVILLLPWILNLYSLSSEARNLTFILVIMHNIMATA LHPTAFVLPNGLRAAGDVKFSMVVGIVSMILFRLGAAVLFGIIFNLGIIGVWIAMGSDWL CRSVCFVIRFIKGKWR >gi|223714175|gb|ACDT01000040.1| GENE 3 3874 - 4758 787 294 aa, chain + ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 293 1 291 291 209 38.0 7e-54 MEIRVLKYFLTVAREGNITRAAELLHITQPTLSRQLMQLEDELGAVLFIRGKRKMVLTEE 
GMVLKRRAEEIIILSEKAELEVGSQSHDISGEIAIGCGVTEATQTMGKLIKKFSEIYPHV SFRIRNGNSDFILENIDNGLLDIGVVLEPVELEKLNFIHFNAEEKWGILMKKNASLAKKE YITSKDLIDLPLIASARSEAQTIFSNWYGTGYENLSFCATSDLTTTAAILVKNDVGYAVV VEGSICDAASDELCFRPLYPSLNSHSLFVWKKYQSFSLTVTKFIDFISKEIKKL >gi|223714175|gb|ACDT01000040.1| GENE 4 4940 - 6046 1460 368 aa, chain + ## HITS:1 COG:SA1531 KEGG:ns NR:ns ## COG: SA1531 COG0686 # Protein_GI_number: 15927286 # Func_class: E Amino acid transport and metabolism # Function: Alanine dehydrogenase # Organism: Staphylococcus aureus N315 # 1 368 1 370 372 442 61.0 1e-124 MIIGIPKEIKNNENRVSMTPAGVFALCKAGHQVYVETMAGTGSGFNDDEYKEAGAIICSC DEVFAKADLIVKVKEYLESEYKYLREDQMIFTYLHIANDQPFAQALIDSKTTAIAYETVE LNHNLPLLTPMSEVAGRMAVQIGANMLQKANGGSGLLLGGVPGVMPAKVVVIGGGKVGLN AAKIACGLGASVRVFDINAERMAYIDDISNGVIHTIYNNEYNLRQALKTADLVIGAVLIP GAKAPKLVTEDMVKTMKEGSAIVDVAIDQGGCIETCDHITTHDDPVFIKHGVLHYSVANM PGAVPRTSTIALSNATLPYILKLADKGIEALKEDAGFMKGLNTYKGYFTCKPVAQALNGK YKEVEELL >gi|223714175|gb|ACDT01000040.1| GENE 5 6163 - 6852 556 229 aa, chain + ## HITS:1 COG:no KEGG:Ccel_1416 NR:ns ## KEGG: Ccel_1416 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 5 181 2 180 230 97 35.0 3e-19 MDKTKNLYLMVSKTPTRFGYMIRKVGRIRYNHSAIALDENLQALYSFARLQHNSLFLAGI VLETIPRYTLRKETFVDVAIIKIPVSLEQYKLANEIIGELYGNEEYLYNLFSVLTYPVTK GFATYQSYTCVEFVIHVLSQIGFKFSAPGYYYKPDDLLTIFKDNIYFEGNLLDYCPDERI DEQYFAPMSFEVAKKSAKCFGKIVGRSIFCRGDDYFKNKEMFWDKNNFH >gi|223714175|gb|ACDT01000040.1| GENE 6 7079 - 9820 3080 913 aa, chain + ## HITS:1 COG:no KEGG:CPR_0360 NR:ns ## KEGG: CPR_0360 # Name: not_defined # Def: cell wall surface anchor family protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 38 796 37 767 852 823 57.0 0 MRMKKRGIMKKGLILGLSCMMLGSTITITRAAENTKPLTDFGNVLDVTADPKEKIYGTYS TNEYNNFSDMGAWHGYYLHQKTATDLYGGFAGPVIIAEEYPINLSDSINKINLEKVTAQG NEKIDLTKATTNEVYYPGRLEQTYDLAELTLKLKLIFGTNRSALIETKIINKTNNDLKLN LSWDGHLFTWYTSENKSMGTTLSESADGVKVNFIEKRGTWEYMTTTENSFDVVLSEDDIK 
TTVSDDRLSYTITKNESITIGADSEYNTYQTQSFTFTNEERVSEKTKVTNLLKNPQKYFE KNNSRWQGYVDTIFENGVDANVNYQRAAVKSIETLMTNWFSAAGAIKHDGIVPSMSYKWF IGMWAWDSWKQVVATTYFNEELAKDNVRALFDYQIKSDDAVRPQDAGAIIDCIFYNQNED RGGDGGNWNERNSKPALAAWAVENVYRQTSDKEFLKEMYPKLAAYHNWWYTNRDIDKNGI AEYGGMVHETCYDWRNYEYTVGQYVEGFGTVNEDGYILDDNGERIVCPEAGIEAAAWESG MDNATRFDREGNGADDKGIEIYTVRNNQHAPIGYVINQESVDLNAYLYAEKGFLKSMAEE LGYQNDVVKYTQEAQYIQEYINDRMYDEETGFYYDVQTNEDGSEKKLLVNRGKGTEGWIP LWAKAATKDKAARVVESMTDANKFDTFVPFPTASKDNNKYAPEKYWRGPVWLDQALYAVE ALQNYGYYDEAKVATTKLFDHAKGLLGTGPIHENYNPETGEGLHTKNFSWSASAFYLLYQ NTLTSTNTTSQTGFDIPNVNIEVNINKELLLEAIKKAELIKESEYTKDSYQGLVIALENG RAVYDDKNATQDMVDIAAAKLNEALNALVKIKITDNEGTSPKTGDSIETIGYAVLLGLTG GALALAGKRKKHN >gi|223714175|gb|ACDT01000040.1| GENE 7 9896 - 10177 126 93 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIPKPAFFIITNRAFIIVTSPKIFYKLYEVIIVSFAKAIANPNTDLVTGRTAIDNIKLRS ILCNSAKIRSFILSSESNETHFLYGIIESAFGQ >gi|223714175|gb|ACDT01000040.1| GENE 8 10152 - 10355 187 67 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1425 NR:ns ## KEGG: EUBREC_1425 # Name: not_defined # Def: PTS system, glucose subfamily, IIA subunit # Organism: E.rectale # Pathway: Amino sugar and nucleotide sugar metabolism [PATH:ere00520]; Phosphotransferase system (PTS) [PATH:ere02060] # 1 63 64 126 750 68 57.0 8e-11 MKKAGLGIIDNLSFIFAAGMALGMAKRERAVTVLSSVIAFFVMYALINVLLVINGQILAD NSIVIMF >gi|223714175|gb|ACDT01000040.1| GENE 9 10600 - 10905 112 101 aa, chain + ## HITS:1 COG:BBB29_1 KEGG:ns NR:ns ## COG: BBB29_1 COG1263 # Protein_GI_number: 11497021 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Borrelia burgdorferi # 1 101 231 332 469 119 58.0 1e-27 MVNETGYLGTFIFGVIKRLLVPFGLHHVFCLPFWQTAVGGSMLIDGLLVQGGQNIFFAQL ADPNILYFSVAATRYFSGGFIFMIFGLPDVALAIYQCADSQ >gi|223714175|gb|ACDT01000040.1| GENE 10 11668 - 12498 827 276 aa, chain + ## HITS:1 COG:BS_sacT KEGG:ns NR:ns ## COG: BS_sacT COG3711 # Protein_GI_number: 16080858 # Func_class: K 
Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 3 270 2 271 276 149 36.0 4e-36 MYRIEKVLNNNSILASKDNQEVIFLGKGIGFGKKINDIYIPGDGVKKYKMETKEQEKRLP HEIIRNVDPVFIEIASDIIRFAQEQFDHVDTKILLPLADHIDFAIKRIQENVNMSNPFTK DIELLFPEEYKVALKGKQLINSITGFEITEDEVGYITLHVHSAISTDHVGESMQAMEIIH ESIDKLRKELNIGIDRNSISYIRLMNHIKYLLLRLNTEEKLQMDISDFTQERFPFAYERS KEICIRLSKVMKKEIPQSEIGYLALHLERILSIELK >gi|223714175|gb|ACDT01000040.1| GENE 11 12649 - 12954 410 101 aa, chain + ## HITS:1 COG:lin0393 KEGG:ns NR:ns ## COG: lin0393 COG1440 # Protein_GI_number: 16799470 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 1 99 1 99 100 72 39.0 2e-13 MYKILLCCASGLTTSMLVNAMKKDAKEKNIDVMIWAVAESAIDLSWADADCILVAPQNAG DLEKVKGIVNSSIPVASIDGVNFSKMDGQAVLEQAIEMINK >gi|223714175|gb|ACDT01000040.1| GENE 12 13057 - 13968 970 303 aa, chain + ## HITS:1 COG:L181807 KEGG:ns NR:ns ## COG: L181807 COG0679 # Protein_GI_number: 15673902 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Lactococcus lactis # 65 302 1 238 238 112 31.0 1e-24 MFDVLIKALGFVIVIVIGFLLKQFRILKKEDGYTLATIIMNVTLPCALFSNANGITINGA MIVLILMGIILNVLMVAIGYFVSNGKSAPTRAAYMINCSGYNIGNFVLPFVQAFFPGMGV AYLCMFDVGNALMGLGGTFAIASSVVSSEQKLSVSNVIKKLFSSIPFDVYIVIFFLALFK IKIPAPILSITDFIGAGNGFLAMLMIGLLLEIKISHDDFKDLISILSYRLLGNMLLMALC FFLLPLPLLAKKILIIALAAPISTVSAVFTRQCKYEGEVAAVANSLSILIGIGVLVVLLM LFV >gi|223714175|gb|ACDT01000040.1| GENE 13 14016 - 16034 1975 672 aa, chain + ## HITS:1 COG:BH2495 KEGG:ns NR:ns ## COG: BH2495 COG1200 # Protein_GI_number: 15615058 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Bacillus halodurans # 20 672 14 673 673 547 42.0 1e-155 MTKQIPLQQLKINTNRIELLNNMNIYSVRDLVLHFPYRYESIEETPLIDNEKVIIEGVLI DEPKIFYKGRLSRLSFQILYKQEVYKVTLFNRHFLKKNMTKEMPLTIIGKYNSKTKSITA SDLRLKPLDEISGITPIYSLKEGITQKSFQGYVKKALNFYHGHIQDEVPTNLLIKHHLIH 
KELALNLIHFPSNNDDVKEALRYLKYEEFLRFQLTMQYIKLSRKDNLGIKKQFNRQTLDE FIAQLPFELTFDQEQAALEVINDLQKETMMYRFVQGDVGSGKTVVGAIGLYANYLAGYQG AMMAPTEILATQHYRSLIKLFKKIDINIALLTGHLSNKEKQRIYNDLENGTIDIVVGTHA LFQEKVIYQKLGLVITDEQHRFGVNQRKALKEKGKQVDFMVMSATPIPRTLAISIYGDMD VSTIKTMPSGRKPVITEVFKTHSMKPILNRLKEYLASGGQCYVVCPLVEESEAIDSRDAT GIYNGMKAYFKEQYQIGLLHGKMDDETKDQIMAAFKANEIQILVSTTVIEVGVDVSNANW IVIYNAERFGLSQIHQLRGRVGRSDQQGYCFLLSNSSSQEALERLEFLRNCHDGFEVSYY DLKLRGPGDILGNQQSGLPVFSVGNIFEDANILEISRKDALELLESKSNDLIYLKLIKEI EEQLINNNKYID >gi|223714175|gb|ACDT01000040.1| GENE 14 16072 - 17046 1166 324 aa, chain + ## HITS:1 COG:CAC1746 KEGG:ns NR:ns ## COG: CAC1746 COG0416 # Protein_GI_number: 15895023 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Clostridium acetobutylicum # 1 323 1 327 331 221 40.0 1e-57 MKLGIDAMSGDLGSRIVVEACLSFLEKNKTDELYVVGKIEELEALKPYDQVTLIDAREVL EMTENILAIRRKKESSMVKTMMLARKGEVDAVLSCGNTGAYYASAMLFLKRIEGVEKSCL MAMMPTYSDNKVAMLDVGANSENTAEQLKSFAIMGNAYAKNVLKITNPKIALLNIGSEHH KGDEIHQETYKLLEDMQEINFVGNIEGKEILDGEVDVVVTDGFTGNVALKTIEGVAKVLV KSLKDGFMSSTRTKAGAVLAKPALKQLLTKFDTKAAGGALMMGFVKPVIKAHGSSDAIAF ENAINLAFEMVSSDVVEKMKEGLN >gi|223714175|gb|ACDT01000040.1| GENE 15 17075 - 17728 636 217 aa, chain + ## HITS:1 COG:BS_rncS KEGG:ns NR:ns ## COG: BS_rncS COG0571 # Protein_GI_number: 16078656 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Bacillus subtilis # 1 217 27 244 249 185 46.0 7e-47 MEIPYQNIEIFKEAFTHPSYANENKMKNHDYERLEFLGDAVLQYHVSRHIFDLYPELPEG RLTKLRAKLVREESLARFARELDLGPLIYLGAGELNNGGRDRDSVLADIFEAFMGAICHD CGPEYVEKMLDITIYKHVEDVNYDEITDFKTKLQELIQADQRKTVTYDLISSSGPSNNPV FEMAVKMDEMVLGVGVGSSKKRAEQQAAKDALNKLAR >gi|223714175|gb|ACDT01000040.1| GENE 16 17807 - 18109 349 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755901|ref|ZP_02428028.1| ## NR: gi|167755901|ref|ZP_02428028.1| hypothetical protein CLORAM_01418 [Clostridium ramosum DSM 1402] # 1 100 7 106 106 144 100.0 2e-33 
MYEDYMDNLFDRRPMPTVAPKVGDIFLYTVNRGDNVYAIARRFNTRVDYIKNMNSLDDKM MIYPGQQLLVPVLFEKMPPMKPQPLPQPQPQPRQSYELYF >gi|223714175|gb|ACDT01000040.1| GENE 17 18175 - 19635 1578 486 aa, chain + ## HITS:1 COG:CAC1780 KEGG:ns NR:ns ## COG: CAC1780 COG1488 # Protein_GI_number: 15895056 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 9 485 13 487 489 603 61.0 1e-172 MDLHAGSKRNLTLVMDLYELTMSYNYFKQGNKDEYVYFDMYYRKNPDNGGFSIFAGLQQL IECIEHLHFSEGDIKYLRSLNKFDEEFFDYLRDFHFTGSIYAVKEGTPVFPNEPLITIRA KFIEAQLIETLLLVTVNHQSLIATKANRIVREAKGRPVLEFGARRAQGYDSATYGARAAY IGGVAGTATVSAGMMFNIPVVGTMAHSFVQSFESEFEAFKAYALTYPDDCVLLVDTYDTL KSGVPNAIKVANEVLAPLGKKLKGIRLDSGDIAYLSKRARVMLDVAGLVDAKITASNSLD EYLIRSLLDQGAQLDSFGVGENLIVSRSAPVFGGVYKLVALEKNGKIIPKIKISENTEKI TNPGYKKVYRLFENETGKAIGDVIAFHDEEISSSHDLTIYHQSNIWKFKTIAANTYSVEE LQVPVFINGKRVYPEYTVEEIREYSNQQKARLWDEVFRLEYPHDYYVDLTKKLLDYKIKM LEEKRK >gi|223714175|gb|ACDT01000040.1| GENE 18 19653 - 20036 227 127 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755903|ref|ZP_02428030.1| ## NR: gi|167755903|ref|ZP_02428030.1| hypothetical protein CLORAM_01420 [Clostridium ramosum DSM 1402] # 1 127 1 127 127 213 100.0 4e-54 MARFDRKVERQKKEFDFYHKEKTKKSKMTEFKENFSFRWIKINLRTVIYIVLDFLAVSLA FIPLLMKYYDAKTAFILGHGVLTSLLVVLTFYFINKEEKPPLSALFIRYCFMALLLGATS LIAVFLV >gi|223714175|gb|ACDT01000040.1| GENE 19 20135 - 20965 997 276 aa, chain + ## HITS:1 COG:CAC0893 KEGG:ns NR:ns ## COG: CAC0893 COG0287 # Protein_GI_number: 15894180 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Clostridium acetobutylicum # 3 272 10 283 286 213 42.0 3e-55 MIITVVGLGVVGGSFVKALKGQGHEVYGIDIDEKTLQMAKNEGTIIEGFTDGKEIIAQSD LTIICLYPSLVLKFIKENKFKKGSIITDAVGIKSYFLEEAMTIIDPEVEFVSGHPMAGRE KKGYGYASKEVFKNANYILIEHPVNQKECISFMERFVGTLGFKSVKIMSPQAHDEIISFT SQLPHAIAVSLINSDNEKYETGKYIGDSYRDLTRIANINENLWSELFLRNSDYLLASIEA FEEQLDLIKVALKDNDERLLKDLFIKSSLRREKLEK >gi|223714175|gb|ACDT01000040.1| GENE 20 21095 - 
21628 536 177 aa, chain + ## HITS:1 COG:CAC0841 KEGG:ns NR:ns ## COG: CAC0841 COG1396 # Protein_GI_number: 15894128 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 177 1 179 179 182 51.0 3e-46 MDIGSKVKRLRQANGLTLEELANRSELTKGFLSQLERDLTSPSVATLEDILEALGTNLQE FFSEKEAEQLVFKQDDFFENIQDDYKISYIIPNAQKNDMEPILIELEKDKQSMIIAPHEG QEFGYVVQGRVKLIYGDNEFVLKKGETFYLKGLVSHYLFNPGETKAKVIWVSTPPLF >gi|223714175|gb|ACDT01000040.1| GENE 21 21639 - 22682 1288 347 aa, chain + ## HITS:1 COG:CAC0840 KEGG:ns NR:ns ## COG: CAC0840 COG3842 # Protein_GI_number: 15894127 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Clostridium acetobutylicum # 4 343 6 345 352 415 60.0 1e-116 MTKLIELDNLTKEYNGQVVLKGIHLEINEKEFVTLLGPSGCGKTTTLRIMGGFEEANGGT VLFDGQDISGLPPYKRELNTVFQKYALFPHMDVFDNIAFGLKIKKMDKETIKHKVNRMLD LVGLAGFGQRNINQLSGGQQQRVAIARALVNEPKVLLLDEPLGALDLKMRKTMQIELKNI QQEVGITFVYVTHDQEEALTMSDTIVVMNEGQIQQIGTPIDIYNEPENRFVAQFIGESNI IEGIFVKDYLVEFDGKQFECVDKGFDDGQDVDIVLRPEDLDIVEVGKGKIEGVVTSIVFK GVHYEIIVETKDRDYMVHTTDISEVGKKVNLDFWPEDIHVMDKMGSY >gi|223714175|gb|ACDT01000040.1| GENE 22 22682 - 23539 700 285 aa, chain + ## HITS:1 COG:CAC0839 KEGG:ns NR:ns ## COG: CAC0839 COG1176 # Protein_GI_number: 15894126 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Clostridium acetobutylicum # 1 268 1 265 277 233 45.0 3e-61 MKQFRKLVGPYCFWLFVLTVVPMLLIMIYAFVQKGNTITTFNFTLENFSKFFDPIFVSVL VKSFVLGAITTVLCLAIGYPVAYAISRCKEKTQTLLILLITFPTWINMLMRTYAWINILS SKGIISNFVQFLGFGEVSFLYTDFAVVLGMVYDFLPFMILPIHTALTKMDKSLVEASNDL GANPITTFWKITFRLSLGGVLTGITMVFLPAISSFVIPKLLGGGQYSLIGNFIEQQFINV GDWHFGSAVSLILAVLVIVLMGLINRVEKYAGYQEEAKREKTKKL >gi|223714175|gb|ACDT01000040.1| GENE 23 23517 - 25367 1608 616 aa, chain + ## HITS:1 COG:CAC0837 KEGG:ns NR:ns ## COG: CAC0837 COG0687 # Protein_GI_number: 15894124 # 
Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Clostridium acetobutylicum # 273 616 11 353 354 264 43.0 3e-70 MKKLKNFKFNFNSSLWIGFALAFLYLPLVIMAIFSFNDSKSLSSWSGFSLRWYQELFVNQ QMIDAIVVSVSIAILSTVISTVLGTITAIGVSKSKPLLRKLVLQVNNLPIMNPDIVTGIS LMLLFSFIKVEKGYLTMLLAHVAFCTPFVITNVLPKVRQLDVNLADAAMDLGATPFQALT KVIIPQIKPGIISGALLAFTMSFDDFIISYFVSGNGIENISIVIYNMSKRTNPSIYALAT IILVVVLLFVAIGTLVPKFFPKATNKLLSSKTAKIVIAVCLVLSVVWSISTGVSRKTLRV YNWGEYIDKTVISDFEDKFDCRIIYETFDSNEIMYTKYMSGNSYDVLVPSEYMIERMIKE DLLQPVDKSLIPNLDNINEGILGQSFDLSNNYWVPYFCGNVGILYDKTVVDEKDLEAGWS ILRNTKYKNQIYMYDSERDSFMVALKALGYSMNTTKEREINEAYQWLLEQREEMNPVYVG DESIDTMISGVKAMAIMYSGDAAAVMSENPNMGYYMPKEGTNVWFDGFVISKECKQTKLA NEFINYMISDKVSYKNTVEVGYLTANVNAAAQAEKEEFKDISAYGLRTGPNDEVFAYQDN AVKEMYNSRWTKVKAK Prediction of potential genes in microbial genomes Time: Thu May 26 09:40:47 2011 Seq name: gi|223714174|gb|ACDT01000041.1| Coprobacillus sp. D7 cont1.41, whole genome shotgun sequence Length of sequence - 940 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 62 - 138 76.3 # Arg ACG 0 0 + TRNA 162 - 238 90.8 # Pro TGG 0 0 + TRNA 254 - 329 94.0 # Ala TGC 0 0 + TRNA 367 - 443 89.5 # Met CAT 0 0 + TRNA 480 - 556 94.0 # Met CAT 0 0 + TRNA 569 - 661 68.3 # Ser TGA 0 0 + TRNA 705 - 781 77.3 # Met CAT 0 0 + TRNA 784 - 860 91.2 # Asp GTC 0 0 + TRNA 863 - 938 88.6 # Phe GAA 0 0 Prediction of potential genes in microbial genomes Time: Thu May 26 09:40:52 2011 Seq name: gi|223714173|gb|ACDT01000042.1| Coprobacillus sp. D7 cont1.42, whole genome shotgun sequence Length of sequence - 17469 bp Number of predicted genes - 24, with homology - 23 Number of transcription units - 13, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 6 - 81 76.6 # Glu TTC 0 0 + Prom 9 - 68 79.8 1 1 Tu 1 . 
+ CDS 158 - 550 420 ## gi|237734873|ref|ZP_04565354.1| predicted protein
- Term 512 - 560 2.6
2 2 Tu 1 . - CDS 568 - 1737 1545 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase
- Prom 1955 - 2014 12.8
3 3 Tu 1 . - CDS 2051 - 2326 100 ##
- Prom 2363 - 2422 4.8
+ Prom 2120 - 2179 10.9
4 4 Op 1 . + CDS 2309 - 3829 1342 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair
5 4 Op 2 . + CDS 3826 - 4233 339 ## Cphy_2149 hypothetical protein
+ Term 4344 - 4402 8.0
+ Prom 4342 - 4401 9.9
6 5 Op 1 . + CDS 4437 - 4682 232 ## gi|237734877|ref|ZP_04565358.1| predicted protein
7 5 Op 2 . + CDS 4717 - 5055 210 ## gi|167756240|ref|ZP_02428367.1| hypothetical protein CLORAM_01771
8 6 Tu 1 . - CDS 5126 - 5833 697 ## gi|237734879|ref|ZP_04565360.1| predicted protein
- Prom 6011 - 6070 10.1
+ Prom 5826 - 5885 6.5
9 7 Tu 1 . + CDS 5979 - 6209 163 ## CKR_2554 hypothetical protein
+ Prom 6298 - 6357 6.5
10 8 Tu 1 . + CDS 6377 - 6922 392 ## gi|237734881|ref|ZP_04565362.1| predicted protein
+ Prom 6924 - 6983 8.3
11 9 Tu 1 . + CDS 7167 - 7589 341 ## BpOF4_01460 two-component response regulator central for the initiation of sporulation
+ Term 7654 - 7704 6.1
12 10 Tu 1 . - CDS 7622 - 8119 378 ## CTC01569 stage 0 sporulation protein A
- Prom 8144 - 8203 9.1
13 11 Op 1 11/0.000 + CDS 8644 - 9624 1127 ## COG3705 ATP phosphoribosyltransferase involved in histidine biosynthesis
14 11 Op 2 18/0.000 + CDS 9617 - 10246 798 ## COG0040 ATP phosphoribosyltransferase
15 11 Op 3 6/0.000 + CDS 10246 - 11544 1527 ## COG0141 Histidinol dehydrogenase
16 11 Op 4 18/0.000 + CDS 11547 - 12134 712 ## COG0131 Imidazoleglycerol-phosphate dehydratase
17 11 Op 5 25/0.000 + CDS 12154 - 12762 723 ## COG0118 Glutamine amidotransferase
18 11 Op 6 23/0.000 + CDS 12756 - 13481 900 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase
19 11 Op 7 24/0.000 + CDS 13463 - 14227 1077 ## COG0107 Imidazoleglycerol-phosphate synthase
20 11 Op 8 1/0.000 + CDS 14227 - 14865 680 ## COG0139 Phosphoribosyl-AMP cyclohydrolase
21 11 Op 9 . + CDS 14867 - 15652 863 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family
+ Prom 15657 - 15716 6.7
22 12 Op 1 . + CDS 15737 - 16084 489 ## gi|167756255|ref|ZP_02428382.1| hypothetical protein CLORAM_01786
23 12 Op 2 . + CDS 16084 - 17250 1186 ## COG0053 Predicted Co/Zn/Cd cation transporters
+ Term 17255 - 17295 5.8
- Term 17243 - 17283 9.6
24 13 Tu 1 .
- CDS 17285 - 17464 187 ## gi|167756257|ref|ZP_02428384.1| hypothetical protein CLORAM_01788 Predicted protein(s) >gi|223714173|gb|ACDT01000042.1| GENE 1 158 - 550 420 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734873|ref|ZP_04565354.1| ## NR: gi|237734873|ref|ZP_04565354.1| predicted protein [Mollicutes bacterium D7] # 1 130 1 130 130 198 100.0 7e-50 MKYKEFRMLIFSIIFFFVIGGVCVYFALDHNRSKVIIGVILSFVVLMMLTTKYNMKIFND SMLVYEFKGIGILPALIDYKDVKEIKLVSKHKVKIKHKGISTLYILDAESFYEEVMENIE EYKKSVNQED >gi|223714173|gb|ACDT01000042.1| GENE 2 568 - 1737 1545 389 aa, chain - ## HITS:1 COG:PA4588 KEGG:ns NR:ns ## COG: PA4588 COG0334 # Protein_GI_number: 15599784 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Pseudomonas aeruginosa # 1 388 60 444 445 462 57.0 1e-130 MFKVPWMDDNGQVQVNRGYRVQFSSAIGPYKGGLRLHPSVNLGIIKFLGFEQIFKNALTT LPIGGGKGGSDFDPQGKSDNEIMRFCQSFMTELYRHIGPDVDVPAGDIGTGAREIGYMYG QYRRIRGAFENGVLTGKPLPYGGSLIRPEATGFGACYYGKEVLEHFNDSYEGKTIACSGY GNVAWGVCLKAREFGAKVVSISGRDGYVYDPEGITTDEKIDFLCKIRESNDVKLKDYAEK FGCEFHAGEKPWGLKVDMAFPCATQNEIGIEEAKQLTANGVKYIIEGANMPTTPEAMEYF ISNGGTLGPAKAANAGGVAVSALEMAQNSMRYNWTREEVDAKLKQIMKDIHDHSKAAAEK YGLGYDLVKGANIAGFEKVVAAMISQGIY >gi|223714173|gb|ACDT01000042.1| GENE 3 2051 - 2326 100 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRCFHFNTPFLKRIHCLFYYSLPFFTKKNLLLSNIVCPFSQRKIYYYQIINFIFNPHAK YTKNKFINLLAHYIGNSLYSSYVFIDIQIVF >gi|223714173|gb|ACDT01000042.1| GENE 4 2309 - 3829 1342 506 aa, chain + ## HITS:1 COG:SA1196 KEGG:ns NR:ns ## COG: SA1196 COG0389 # Protein_GI_number: 15926944 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Staphylococcus aureus N315 # 9 505 13 419 420 196 31.0 9e-50 MEAAGHVYMAIDLKSFYASVECIERNLDPLTTNLVVADTSKTEKTICLAVSPSLKQYGIS GRARLFEVIQKVKEVNFQRKRAINDRNFIDKSYFDSQLKKNPYLELSYIAASPRMALYME YSTRIYNIYLKYVSPEDLHVYSIDEVFIDATSYLKASRKTPGEFAKMIILDILETIGITA 
TAGIGTNLYLAKVAMDIVAKHVKADEDGVRIAYLDEMRYRKYLWHHQPITDFWRIGRGYE KKLKKIGLFTMGDIAKCSVGSEDEYYNEDLLYSIFGINAELLIDHAWGCKPCTMQHIKAY KPENNSIGIGQVLSSPYSFEKGKLILKEMLDALALSLVDKKLVTNQIIITIGYDIENVTT GSLYREEIITDRYGRKIPKHAHGTINLDSYTASAKLIILAVLKWYEEHVKRFLTIRRFNI SANHIIDESSIKIKPTIQQMDLFTDYDQQKKDEEKQRRDLEKEKRLQKTTLELKKKYGKN AVLKGMNLAEGATGRERNKTIGGHKA >gi|223714173|gb|ACDT01000042.1| GENE 5 3826 - 4233 339 135 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2149 NR:ns ## KEGG: Cphy_2149 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 8 135 5 132 142 121 51.0 9e-27 MTKNIHDYQDIINLPHHISIKHPQMSLEDRAAQFAPFAALTGHKDSIKETERLTAQRKIL DEDRIAIINFRLQQLLKHLTEAPLIKITYFEADQKKSGGQYITIINHLKKINEYENLLVL ENGVKINIDDIYEIE >gi|223714173|gb|ACDT01000042.1| GENE 6 4437 - 4682 232 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734877|ref|ZP_04565358.1| ## NR: gi|237734877|ref|ZP_04565358.1| predicted protein [Mollicutes bacterium D7] # 1 81 1 81 81 152 100.0 7e-36 MKVEKVSMRSPYCASKMILGYIYSGKNDICWTPDNEKNSWIINHPNKQQVILAKLNLIKG CRIKVFRCASCQIEIINEKEL >gi|223714173|gb|ACDT01000042.1| GENE 7 4717 - 5055 210 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756240|ref|ZP_02428367.1| ## NR: gi|167756240|ref|ZP_02428367.1| hypothetical protein CLORAM_01771 [Clostridium ramosum DSM 1402] # 1 112 1 112 112 108 100.0 1e-22 MVNIKKIIKIISWIIIALLGIALLLNWQTIESPVIIHADLSGNFSYGNKSTLIVFYLIIV LINLLFTFKYEIPLMKELHKMIKSSLVIDIISIFFQFLIIVVIGAFIVKAII >gi|223714173|gb|ACDT01000042.1| GENE 8 5126 - 5833 697 235 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734879|ref|ZP_04565360.1| ## NR: gi|237734879|ref|ZP_04565360.1| predicted protein [Mollicutes bacterium D7] # 1 235 1 235 235 415 100.0 1e-114 MYMYDFFNSLDLLQQVPNINDLPRGNYLYFGICKKDELIQRGYKVSCDKLYLTYARYDDL SNLSYYPIDKFYNYMNQLTSNLIDLNELDNNELKASLFEAIWLINEIAYLEEIPFFNAKL NIEVSTLCDMIDHNGDEFNHSIDYFDNIGLLKKIHIAQIRYFISQYLRAKLKINKTYSNI DLAKFDSFVLDSMNRFIEVAPIKYKVEIYTNLDNPEFDSIFEQIVVLNERQSNKT >gi|223714173|gb|ACDT01000042.1| GENE 9 5979 - 6209 163 76 aa, 
chain + ## HITS:1 COG:no KEGG:CKR_2554 NR:ns ## KEGG: CKR_2554 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 3 74 2 73 73 62 36.0 4e-09 MNEYGHIIMNIEQILKNKGISKNKICKELDIPRPNFNRYCKNSFQRIDANLICKLCYYLE CNIEELVTYIPPQETK >gi|223714173|gb|ACDT01000042.1| GENE 10 6377 - 6922 392 181 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734881|ref|ZP_04565362.1| ## NR: gi|237734881|ref|ZP_04565362.1| predicted protein [Mollicutes bacterium D7] # 1 181 1 181 181 306 100.0 3e-82 MIIISNDTVYLKKEFTQETYDQALIRVDMKGLECDCGSNGKLVKIGYYQRYYKTSTRKIC IQIQRVMCIHCGRTHALFVECMVPSSMLLVTTQIELLRSYYNHRLEEFLMVYPTIERSNA FYVVKNYEKKWSKILKLTGLSLMDEEKKIIKVFIKKYQMQFMQMRSYSKIVSQLRLSEKL S >gi|223714173|gb|ACDT01000042.1| GENE 11 7167 - 7589 341 140 aa, chain + ## HITS:1 COG:no KEGG:BpOF4_01460 NR:ns ## KEGG: BpOF4_01460 # Name: not_defined # Def: two-component response regulator central for the initiation of sporulation # Organism: B.pseudofirmus # Pathway: Two-component system [PATH:bpf02020] # 19 137 144 265 266 108 45.0 4e-23 MNREFSIVILPKDQMIEKNSRDKNIFIMKRLYEIFLSIGAPNHLNGFGYLVSAIKRSCSD IEILTEITKELYPSVAREYSTTSSRVEQSIRHIIKVIWSNGNLVELEKMFGYRAKHRPCN SEFISMVTDRLLSEYKIYNE >gi|223714173|gb|ACDT01000042.1| GENE 12 7622 - 8119 378 165 aa, chain - ## HITS:1 COG:no KEGG:CTC01569 NR:ns ## KEGG: CTC01569 # Name: not_defined # Def: stage 0 sporulation protein A # Organism: C.tetani # Pathway: Two-component system [PATH:ctc02020] # 12 159 119 276 278 144 51.0 1e-33 MKQYYIYILEEDEEISAELIDHLSNNNNIWIEAVNNLQSTNHSEVKLEHDITTILKRVGI PAHLKGYMYLRTAILMVYSDIELLGQITKRLYPNIARQYSTTASRVEKAIRHAIENSWNQ GDKRKINTIIRAAPDSLKGKPTNSEFIALIADKLKIKNRYAFESR >gi|223714173|gb|ACDT01000042.1| GENE 13 8644 - 9624 1127 326 aa, chain + ## HITS:1 COG:L0341 KEGG:ns NR:ns ## COG: L0341 COG3705 # Protein_GI_number: 15673189 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase involved in histidine biosynthesis # Organism: Lactococcus lactis # 45 326 3 286 286 226 44.0 
5e-59 MEEFKYLIPEESIDAIITNVSKLREIENSLRAIFAKNNYNEVLMPSFEYVDLYTQLDCGF EQEKMFQYINHEGKNVAMRCDFTIPLARFYSTNNYGDEEARYCYFGKVYRKETMHKGRSS EFFQGGVELINKPGIAGDKECLNMIQESLPHLGLKNILVEIGSARFFNRLCTLVGDKAVE LTEILKYRDISEMKKFVADNEFDGQLNELLLKLPTAFGDITLLDTAIKGINDEILLKALE RLKELYNSLDNKESIIFDLGMVPTMKYYTGLMIKGYSDKSAQPIISGGRYDDLLPRFNKN VGAIGFCYHMNHILKAIDKQGEGNND >gi|223714173|gb|ACDT01000042.1| GENE 14 9617 - 10246 798 209 aa, chain + ## HITS:1 COG:L0066 KEGG:ns NR:ns ## COG: L0066 COG0040 # Protein_GI_number: 15673190 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Lactococcus lactis # 1 173 1 173 180 249 74.0 2e-66 MIKIAITKGRIEKQVCKLLQEAGFDMEPIFNKDRELLIATKDGIEMIFGKANDVVTFLEH GIVDIGFVGKDTLDDVDFEDYYELLDLNIGKCYFAVAAYPEYRNKKFERRKKIASKYTKV AKKYFATKQEDVEIIKLEGSVELGPIVGISDAIVDIVETGSTLKANGLEVIEKISDISTR LVANKVSFKFKKDEIMNLVNKLEAVTEGK >gi|223714173|gb|ACDT01000042.1| GENE 15 10246 - 11544 1527 432 aa, chain + ## HITS:1 COG:L0067 KEGG:ns NR:ns ## COG: L0067 COG0141 # Protein_GI_number: 15673191 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Lactococcus lactis # 1 431 1 430 431 603 70.0 1e-172 MLKIIDYKGNLEAVATKLDSRKESVNKEVNEAVLKIIEDINERGNKALYEYCLKFDGYQI NDEKDLIVSEIEKEEALKQIDADYLRILERTKEQITEFHKNQIDKSWSLFKDNGVIMGQM VRPIERVALYVPGGTASYPSTVLMNAVPAKLAGVKDLVIITPVKEDGKVNPIIIAAAKVS GVDTIYKFGGAQGVAAIAHGTETIKKADKIVGPGNIFVATAKKLSYGLVDIDMVAGPSEV LVIADENANPKYIAADLMSQAEHDKLASALLVTTSRDLVAKVNGELVRQMAYLSRRDIIE ESLVNYGGAIIVDNLNEAFDVSNYLAPEHLEVLVDNPVNMLPKIKNAGSIFLGEYSPEPL GDYMSGTNHVLPTGGTAKFYSALGVYDFVKYSSYSYYPKNVLGEFKEDVIKFAKSEGLDA HANSVAVRFEEE >gi|223714173|gb|ACDT01000042.1| GENE 16 11547 - 12134 712 195 aa, chain + ## HITS:1 COG:L0068 KEGG:ns NR:ns ## COG: L0068 COG0131 # Protein_GI_number: 15673192 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: Lactococcus lactis # 2 195 3 200 200 242 62.0 4e-64 
MRIGKIERETKETKILVQLDLDGEGKSEISTGIGFFDHMLTLFAFHGGFDLIVKCEGDLE VDTHHTVEDLGIALGTCLKEALGNKLGIKRYGAFTIPMDETLVTTNLDISGRPFLVYNVN LTCERIGTFETEMTEEFFRALAFNSLITLHINEQYGTNNHHIVEAIFKSLGRALKEAVSI DEANKDKVVSSKGVL >gi|223714173|gb|ACDT01000042.1| GENE 17 12154 - 12762 723 202 aa, chain + ## HITS:1 COG:L0069 KEGG:ns NR:ns ## COG: L0069 COG0118 # Protein_GI_number: 15673194 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Lactococcus lactis # 2 200 4 202 202 272 63.0 3e-73 MVVIIDYNIGNLSSVISALKRVGIEAVITRDKDIIRQAKAIVLPGVGAFPVAMNNLKKFD LIDVLNERKDAGIPILGICLGMQILFEKGYEVEETKGLGFLDGEVVFMDIDEKVPHMGWN QLHFNQNHPILKNIHENDDVYFVHSFMATCPNEQLIAYSDYGKTNITAIAAKDNVIGCQF HPEKSGAVGQKILLAFKEMIEC >gi|223714173|gb|ACDT01000042.1| GENE 18 12756 - 13481 900 241 aa, chain + ## HITS:1 COG:L0070 KEGG:ns NR:ns ## COG: L0070 COG0106 # Protein_GI_number: 15673195 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Lactococcus lactis # 1 233 1 233 239 277 54.0 1e-74 MLVIPAIDLKDGQAVRLFKGDYNQKTVYSNEPEKLAENFEKMGAKLLHVVDLDGAKDGEC INLETIKKIKQNTSMQVELGGGIRNIETVALYLDEVGIDRVILGTAAINDPEFLKSAINT YGPEKIVVGVDVKDGFVSTSGWLKISNVPYLEFIKELEKLGVKYIVATDISKDGTLQGPN FDMYEQIARTSTINFVVSGGIKDAQNIKDVASKNYYACIVGKAYYEGKVDLKEVITCLQN G >gi|223714173|gb|ACDT01000042.1| GENE 19 13463 - 14227 1077 254 aa, chain + ## HITS:1 COG:L0071 KEGG:ns NR:ns ## COG: L0071 COG0107 # Protein_GI_number: 15673196 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Lactococcus lactis # 1 251 1 252 259 358 73.0 5e-99 MLTKRIIPCLDIKNGKVVKGINFVELKDVGDPIELAKRYDQQCADEVVFLDITASYEERD IIKDIIERGASELTIPLAVGGGIRTLDDFRTILASGADKVSVNSAAIANPDLIKVAADEF GVQCVVVAIDAKKRDDDGYDVYVKGGRENTGIDLIEWVTKCEELGAGEILLTSMDADGTK AGYDIDMINAVCNAVDIPVIASGGCGSIQDIVDVFKQTNCDAALVASLFHFGEATVEDVR KELRKHDINVRRAM >gi|223714173|gb|ACDT01000042.1| GENE 20 14227 - 14865 680 212 aa, 
chain + ## HITS:1 COG:L0072_1 KEGG:ns NR:ns ## COG: L0072_1 COG0139 # Protein_GI_number: 15673197 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Lactococcus lactis # 1 110 1 110 110 173 70.0 3e-43 MKPDFTKMELIPAIVQDYKTNEVLMLAYVNEEAYQRMLETNQTCFFSRSRNELWHKGETS GHFQNIKGMYLDCDLDTLLIFVEQIGAACHTGAYSCFFNEIMAYDSTNIFRSLSNLIENR KQNPVEKSYTNYLLDQGVDKICKKVGEEASETIIAAKNDDKEELIGEISDLFYHVFVLMN NQGVTLDDIENKLKDRHKITGNKKDFHTKGDY >gi|223714173|gb|ACDT01000042.1| GENE 21 14867 - 15652 863 261 aa, chain + ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 1 257 1 257 269 202 42.0 5e-52 MRSIDYHMHTRFSGDSEADPREHVIQAIKMGLDEICFTDHRDFDYPIDVFDLDVEGYYQT IMALKKEFADKIIIKWGIEIGLDMNHQLEINNLINQYPFDFVIGSIHVIDHTEFYYGDFF KNLNKEQAHQAFFEETLRCVKNFDCFNVLGHLDYIMRYGPYEDKKVEHKKYQNIIDEIFK NLIIKNKGIEVNTSGYAVNQTCGFPNFEQVQRYYDLGGRIITIGTDSHTSDRVGQYVNDV KENLIKIGFEDVSTFTGRLKD >gi|223714173|gb|ACDT01000042.1| GENE 22 15737 - 16084 489 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756255|ref|ZP_02428382.1| ## NR: gi|167756255|ref|ZP_02428382.1| hypothetical protein CLORAM_01786 [Clostridium ramosum DSM 1402] # 1 115 10 124 124 196 100.0 4e-49 MELYNKETLKALKESVDDSTYYLPKALFQGSKFGNIRLETKLAYAALLDTLLRKPVFNQE NIALLKIDNPVIAQTLAAMANKEVNQAKVAKYLDELIEANLIEINKQDIYIYHID >gi|223714173|gb|ACDT01000042.1| GENE 23 16084 - 17250 1186 388 aa, chain + ## HITS:1 COG:CAC0606 KEGG:ns NR:ns ## COG: CAC0606 COG0053 # Protein_GI_number: 15893895 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Clostridium acetobutylicum # 1 387 11 400 403 302 38.0 6e-82 MFKFISKKFIKNYNDYNNPKVREQYGMLCSIVSIACNIVLVIFKLTMGAITHSIAIQADG LNNLSDVGSNLASLFGFKLANRHPDSEHPYGHGRIEYVAGLIIAFLILLVGFQALKDSIF 
KIIAPEKVTFTIVAVVILIVSILIKLWMAMFNRTVSKNINSATLMAASQDSLNDVMATSA TLISLILSLYTDLPVDGIMGAVVSIIVLKAGIDIFKDTVDPLLGMAPDKELINEIEEYIL SYPEALGIHDLMMHDYGPGRKFLTLHVEVDCNDNIMAVHDAMDLIERSMLEKYHILTTIH MDPVDTNDALTNELKQVVLGVVKGINKEYSIHDFRIVTGPTHTNLIFDVMIPSNDEIKHK VLKEEINTKLRAINPNYYTVMQIEHSFI >gi|223714173|gb|ACDT01000042.1| GENE 24 17285 - 17464 187 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756257|ref|ZP_02428384.1| ## NR: gi|167756257|ref|ZP_02428384.1| hypothetical protein CLORAM_01788 [Clostridium ramosum DSM 1402] # 1 59 332 390 390 96 100.0 5e-19 MAMMNIPILNVVIVIKEIIFNQLNYVHLYMTIGWSIVYVIVSMIAAKMMYEKEEVIFRA Prediction of potential genes in microbial genomes Time: Thu May 26 09:41:52 2011 Seq name: gi|223714172|gb|ACDT01000043.1| Coprobacillus sp. D7 cont1.43, whole genome shotgun sequence Length of sequence - 1066 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 1 - 790 824 ## COG1668 ABC-type Na+ efflux pump, permease component 2 1 Op 2 . 
- CDS 783 - 1052 249 ## COG4555 ABC-type Na+ transport system, ATPase component Predicted protein(s) >gi|223714172|gb|ACDT01000043.1| GENE 1 1 - 790 824 263 aa, chain - ## HITS:1 COG:CAC3550 KEGG:ns NR:ns ## COG: CAC3550 COG1668 # Protein_GI_number: 15896786 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: ABC-type Na+ efflux pump, permease component # Organism: Clostridium acetobutylicum # 2 255 3 256 392 123 34.0 4e-28 MNNIMTIVKKEFKDILRDRKTLLMTFVIPIIVMPLLFTFIFSSIGDMASPSEDNKYKIVL ETNNPEITALFKEAGIYELLNSKDPINEAYDGKIVAYIIANDINESLTQGSTPEVDLYYD TTSQRAMTAISTVQTMFSTYQNSYLASYLEANHLSSKVLQPFTYNEHAKDEEADSMSLMM LGMLIPMMIIGYSGSGIVPIATDLGAGEKERGTLEPLLSTSISRSSILIGKLIVTATFGS ITSILSAVGLLLAFKFGMSDTMA >gi|223714172|gb|ACDT01000043.1| GENE 2 783 - 1052 249 89 aa, chain - ## HITS:1 COG:CAC3551 KEGG:ns NR:ns ## COG: CAC3551 COG4555 # Protein_GI_number: 15896787 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: ABC-type Na+ transport system, ATPase component # Organism: Clostridium acetobutylicum # 1 83 155 237 238 117 68.0 4e-27 MLFDEPTTGLDVLSSKLIHDFILKCKQENKAIVFSSHNMYETEKLCDRVIIIHKGKIVAS GTIEQLKKEYQQDNLEDLFIECIGGNDHE Prediction of potential genes in microbial genomes Time: Thu May 26 09:42:01 2011 Seq name: gi|223714171|gb|ACDT01000044.1| Coprobacillus sp. D7 cont1.44, whole genome shotgun sequence Length of sequence - 28498 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 15, operones - 9 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 242 - 577 293 ## gi|167756259|ref|ZP_02428386.1| hypothetical protein CLORAM_01790 2 1 Op 2 . - CDS 564 - 779 299 ## COG1476 Predicted transcriptional regulators 3 1 Op 3 . - CDS 846 - 1313 433 ## gi|237734899|ref|ZP_04565380.1| predicted protein 4 1 Op 4 . - CDS 1306 - 1515 298 ## COG1476 Predicted transcriptional regulators - Prom 1549 - 1608 9.4 5 2 Tu 1 . 
- CDS 1617 - 1901 356 ## COG1694 Predicted pyrophosphatase - Prom 2009 - 2068 7.8 + Prom 2223 - 2282 2.0 6 3 Op 1 32/0.000 + CDS 2351 - 3355 1058 ## COG1135 ABC-type metal ion transport system, ATPase component 7 3 Op 2 22/0.000 + CDS 3356 - 4021 766 ## COG2011 ABC-type metal ion transport system, permease component 8 3 Op 3 . + CDS 4031 - 4852 936 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen + Prom 4858 - 4917 7.4 9 4 Op 1 . + CDS 4938 - 5273 435 ## COG0221 Inorganic pyrophosphatase 10 4 Op 2 . + CDS 5285 - 5629 333 ## COG5646 Uncharacterized conserved protein 11 4 Op 3 . + CDS 5626 - 6078 418 ## gi|167756269|ref|ZP_02428396.1| hypothetical protein CLORAM_01800 + Prom 6080 - 6139 11.1 12 5 Op 1 . + CDS 6159 - 6935 1019 ## COG0561 Predicted hydrolases of the HAD superfamily 13 5 Op 2 . + CDS 6938 - 7852 716 ## COG2378 Predicted transcriptional regulator 14 5 Op 3 . + CDS 7907 - 8650 646 ## DSY0090 hypothetical protein 15 5 Op 4 . + CDS 8647 - 9279 694 ## COG4832 Uncharacterized conserved protein 16 5 Op 5 . + CDS 9343 - 10023 750 ## COG0274 Deoxyribose-phosphate aldolase 17 5 Op 6 . + CDS 10073 - 10948 989 ## COG0053 Predicted Co/Zn/Cd cation transporters 18 5 Op 7 . + CDS 11015 - 11998 1191 ## COG1052 Lactate dehydrogenase and related dehydrogenases - TRNA 12406 - 12481 69.6 # Gln CTG 0 0 + Prom 12321 - 12380 9.0 19 6 Op 1 . + CDS 12546 - 13415 864 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 20 6 Op 2 . + CDS 13480 - 13974 428 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases + Term 14017 - 14066 3.6 + TRNA 14308 - 14383 66.4 # Glu CTC 0 0 - Term 14290 - 14359 20.2 21 7 Tu 1 . - CDS 14441 - 15325 958 ## COG1737 Transcriptional regulators - Prom 15364 - 15423 6.0 + Prom 15429 - 15488 5.1 22 8 Op 1 . + CDS 15535 - 15780 386 ## Ccel_3105 hypothetical protein 23 8 Op 2 . + CDS 15782 - 16840 1023 ## COG0502 Biotin synthase and related enzymes 24 8 Op 3 . 
+ CDS 16872 - 18293 1440 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 25 8 Op 4 . + CDS 18296 - 19483 1044 ## COG1160 Predicted GTPases + Term 19518 - 19556 5.2 - Term 19496 - 19553 7.6 26 9 Op 1 . - CDS 19580 - 20674 988 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family - Prom 20707 - 20766 9.7 27 9 Op 2 . - CDS 20791 - 21291 553 ## COG0778 Nitroreductase - Prom 21374 - 21433 9.3 + Prom 21311 - 21370 7.6 28 10 Tu 1 . + CDS 21405 - 22517 1020 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases + Term 22615 - 22679 2.0 + Prom 22526 - 22585 10.6 29 11 Op 1 . + CDS 22730 - 24151 1437 ## COG0531 Amino acid transporters + Term 24165 - 24198 -0.7 30 11 Op 2 . + CDS 24225 - 25544 1480 ## COG1362 Aspartyl aminopeptidase 31 11 Op 3 . + CDS 25544 - 25945 400 ## gi|167756291|ref|ZP_02428418.1| hypothetical protein CLORAM_01824 + Term 26007 - 26042 2.7 - Term 25988 - 26038 6.5 32 12 Tu 1 . - CDS 26050 - 26400 554 ## gi|167756292|ref|ZP_02428419.1| hypothetical protein CLORAM_01825 - Prom 26546 - 26605 6.9 33 13 Tu 1 . - CDS 26703 - 26903 265 ## gi|237734929|ref|ZP_04565410.1| predicted protein - Prom 27086 - 27145 7.1 + Prom 27072 - 27131 8.0 34 14 Op 1 . + CDS 27190 - 27612 451 ## COG3773 Cell wall hydrolyses involved in spore germination 35 14 Op 2 . + CDS 27626 - 27979 292 ## Pjdr2_4878 spore coat protein GerQ + Term 27984 - 28046 3.7 + Prom 27989 - 28048 6.1 36 15 Tu 1 . 
+ CDS 28119 - 28400 367 ## gi|167756296|ref|ZP_02428423.1| hypothetical protein CLORAM_01829 Predicted protein(s) >gi|223714171|gb|ACDT01000044.1| GENE 1 242 - 577 293 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756259|ref|ZP_02428386.1| ## NR: gi|167756259|ref|ZP_02428386.1| hypothetical protein CLORAM_01790 [Clostridium ramosum DSM 1402] # 1 111 1 111 153 124 100.0 2e-27 MKKFKSLFKCRLDERQLKIRGNIYFKSFSVLSFIIIAIFFIKEIFNIDLMIGDWEYLIAL FISITYCFILMIYYEIYPLTKARYRLLFIFFGLYGFGFFGLYLFLIINGKP >gi|223714171|gb|ACDT01000044.1| GENE 2 564 - 779 299 71 aa, chain - ## HITS:1 COG:SPy1934 KEGG:ns NR:ns ## COG: SPy1934 COG1476 # Protein_GI_number: 15675737 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 3 64 2 63 68 87 69.0 6e-18 MAKNLRIKAARAALDMTQKDLADAVGVSRQTMNAIEKGDYNPTVKLCIKICKVLNKSLDE LFWEDILDEEI >gi|223714171|gb|ACDT01000044.1| GENE 3 846 - 1313 433 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734899|ref|ZP_04565380.1| ## NR: gi|237734899|ref|ZP_04565380.1| predicted protein [Mollicutes bacterium D7] # 1 155 1 155 155 263 100.0 2e-69 MDKRIILAREDLKKTNYWMITLMISALTLIILSLTNVLTINSNNEHYIDFVHGYQTGLSF ALFVFPCINLISNFFLAKNEVKLIEKYINDHDERKLFIADKVGGNFSFTFEIITMALVSV IAPLYSFDLLLGIVACIFIIIIIRGCLYLYYNHKY >gi|223714171|gb|ACDT01000044.1| GENE 4 1306 - 1515 298 69 aa, chain - ## HITS:1 COG:CAC3324 KEGG:ns NR:ns ## COG: CAC3324 COG1476 # Protein_GI_number: 15896567 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 64 1 64 69 80 78.0 5e-16 MKTKIPELRKAYKISQEELARSVGVTRQTITSIEVGKYTASLILAYKIAKYFNLTIEEIF DFEEELNNG >gi|223714171|gb|ACDT01000044.1| GENE 5 1617 - 1901 356 94 aa, chain - ## HITS:1 COG:BH3997 KEGG:ns NR:ns ## COG: BH3997 COG1694 # Protein_GI_number: 15616559 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Bacillus halodurans # 1 91 4 99 101 73 47.0 8e-14 
MKELENKIIEFVQKRGWDQLEHPECLIKSISIEAGELLECIQWDNDYKTENISEELADIM IYCFQLAYSLNLDVSTIIEKKLIKNAEKYPVKEF >gi|223714171|gb|ACDT01000044.1| GENE 6 2351 - 3355 1058 334 aa, chain + ## HITS:1 COG:SA0769 KEGG:ns NR:ns ## COG: SA0769 COG1135 # Protein_GI_number: 15926497 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Staphylococcus aureus N315 # 1 313 1 313 341 328 55.0 8e-90 MIEIKRVSKVYRLKDREVVAVDDVSLTIEDGDIYGIIGYSGAGKSTLVRLINQLEVQDSG KIFIDGQDLGNLSQIELRKLRTKVGMIFQHFNLLWSRTVSENIELALEIAGNKDKMIRQQ KVQELVELVGLTGRENAYPSELSGGQKQRVGIARALANDPRILLCDEATSALDPDTTKSI LELLLKINKKLNITIIMITHQMEVVQRICNHIAVMSEGKVIEEGTVKEIFTSPQHSVTKS FIQEGKNHNEFDETVLKKIYSKGRLLKVVFDENVSRLPILTKVIRECDSDINVIEANLSN TIDSSFGIMILQVIGDYEKVITLFEKYLAKVEVM >gi|223714171|gb|ACDT01000044.1| GENE 7 3356 - 4021 766 221 aa, chain + ## HITS:1 COG:BH3480 KEGG:ns NR:ns ## COG: BH3480 COG2011 # Protein_GI_number: 15616042 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Bacillus halodurans # 4 221 1 218 218 202 53.0 5e-52 MRSILENIDINAMIAALNETLFMTLISLIFAVILGMVLGIAIYLTQDDGLYPNLIVNKIL NLIVNVFRAVPYIILVFLLIPLTTFLVGSMLGAKAALPSLILSSAPFYGRMVMIALNEVD GGTIEASKAMGASNVQIITKVLIPEAKPALISSITVMSISLVGYTAMAGAIGAGGLGNLA YLYGFVRTNNYVMYTATFLILVIVFIIQFIGDYFVRKIDKR >gi|223714171|gb|ACDT01000044.1| GENE 8 4031 - 4852 936 273 aa, chain + ## HITS:1 COG:BS_yusA KEGG:ns NR:ns ## COG: BS_yusA COG1464 # Protein_GI_number: 16080325 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Bacillus subtilis # 1 272 1 272 274 251 50.0 1e-66 MKKLFIGIISLLLAVTLTACGGKTERDDKTLLVAATSKPHGDILEEAKNILKDDYDIDLE IKILDDYYIFNRSLNDGDVDANFFQHVPFFEKDIKDNKYDLVNAGGVHIEPFGFYSKQIK NINELKKNDIVVISNSVADHGRILAILDEAGIIELDDKVKVQDATIEDIKTNRLNLEFKE IKPELLTNAYKNDEGALIAINGNYAIDAGLNPTKDAILLESADESNPYVNIVACQKGHEN 
DKKIKALVAVLQSQKIKDFITNTYSDGSVIPVK >gi|223714171|gb|ACDT01000044.1| GENE 9 4938 - 5273 435 111 aa, chain + ## HITS:1 COG:CAP0096 KEGG:ns NR:ns ## COG: CAP0096 COG0221 # Protein_GI_number: 15004799 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Clostridium acetobutylicum # 6 110 4 108 110 126 56.0 1e-29 MLCRDYLYQEVEVLIDRPLGSQHPQYGFIYPLNYGFIENTVSGDGEELDAYLLGVFEPVE RYRGVVIAVIQRTNDNDDKLIVVPHNVDYSDEQIRALTEFQERYFNSVILR >gi|223714171|gb|ACDT01000044.1| GENE 10 5285 - 5629 333 114 aa, chain + ## HITS:1 COG:lin0899 KEGG:ns NR:ns ## COG: lin0899 COG5646 # Protein_GI_number: 16799972 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 4 112 9 116 130 87 46.0 8e-18 MTITVDEYINSFYQQIQINLIKLRKIIKETAPEAQEKISWGVPTYFQDGFLVQFAAYKKH IGFYTSPQTIEAFKTELANYKTNNKNTVQFMFEQELPVELIKKMVLFKIEENTK >gi|223714171|gb|ACDT01000044.1| GENE 11 5626 - 6078 418 150 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756269|ref|ZP_02428396.1| ## NR: gi|167756269|ref|ZP_02428396.1| hypothetical protein CLORAM_01800 [Clostridium ramosum DSM 1402] # 1 150 1 150 150 282 100.0 4e-75 MRLGTTYISVNDFAKSLAFYQLLLEQEPLYCNDDRWATFDCGNSLSLYNRQFDIDFLVNN DYRGHYNHSYLTTLKNENSDKINTSVIFNFEVEDLISEYERIKKLKIGAVSELLFVNIHM PYWYFTIKDPDGNEIEITGNYNGSDKLTDE >gi|223714171|gb|ACDT01000044.1| GENE 12 6159 - 6935 1019 258 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 4 257 3 256 256 109 33.0 4e-24 MKRKFFFFDIDGTLAVGTPGNQYVPESTKIAISKLKEAGHFVAIATGRSYAMAVEHMKEL GFENMVSDGGNGITIHNKLVEIKPLDYDKCINLINECKEKGYIWAISPDNATRRLAPDNR FYDFTHDIYMDTIVQEGLDPKDFDKIYKVYVACYAPEEEKLETLKELPWCRFHQEYLFVE PGDKSVGIKMMVDHFDGDYSDVVVFGDEKNDLSMFTDEWTSIAMGNAIDALKAKAAYITD DCDKDGIYNACKHFGWID >gi|223714171|gb|ACDT01000044.1| GENE 13 6938 - 7852 716 304 aa, chain + ## HITS:1 COG:FN1249 
KEGG:ns NR:ns ## COG: FN1249 COG2378 # Protein_GI_number: 19704584 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Fusobacterium nucleatum # 1 301 16 312 314 201 39.0 1e-51 MSINRLFETIYYLIEHKQTTAKELAEHFEVSTRTIYRDLDRLIVAGFPIYANQGAKGGIY IDSEFVLDKMTFSDDEQNQILMALQCIERLQDRDEAGLIDKMAALFNKNKLDWIEVDFTT WHHNNSQNEKFETIKKAIFEQKEIKFKYINSYGEKSQRCVFPNKLFFKANTWYLQGYCLQ KNSYRVFRLTRIIELLTTSNYFQLEEIALPPKINKIPNLNTPRIKVILKFDKSIGSVVFD EFGDGVICEDTAGNYIVSSVVPDDYWLISFILSFGSKVEVIEPQDLKDKVIIEINKIKAV YKQT >gi|223714171|gb|ACDT01000044.1| GENE 14 7907 - 8650 646 247 aa, chain + ## HITS:1 COG:no KEGG:DSY0090 NR:ns ## KEGG: DSY0090 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 243 1 247 271 229 46.0 5e-59 MEIITVDQNNLEKEHICCAISSNNDIQVKAKKAWLEEQFKCGLIFKKMDVRGKCFIEYLP LEEAWVPIVGDDLMHINCLWVSGKYQGQGLAKKLLEACIEDCRKQKKHGITVISAKRKMP FVMDYKFLIKHGFISIMSLDKYELMYLSLDSQAVQPSFTIKEMTCEEGGLVLYYSHQCPF TAKYAPMIEAYCQEKGLIIKLKLLNSSEEAKNAGILFTTYSLFYNHKFITREILSVKKFE KILEELI >gi|223714171|gb|ACDT01000044.1| GENE 15 8647 - 9279 694 210 aa, chain + ## HITS:1 COG:lin2189 KEGG:ns NR:ns ## COG: lin2189 COG4832 # Protein_GI_number: 16801254 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 3 209 5 207 208 201 48.0 1e-51 MTKLDYKKEYKDLYLPKRSPMLINVDKITYVTVDGKGNPNTSAEYKEALELLYSISFTIK MSKMTDNKIDGYFEYVVPPLEGLWWLDKKPTEKVIGDKSKFYWKSMIRLPEFVTLEVFEH AKELLKVKKPYLNLDKVNYEIIEEGKCVQIMHLGSYDEENTSIEKIYQFIETHGLQIDLN EVRRHHEIYLSDPRKCKPENLKTVIRYPVK >gi|223714171|gb|ACDT01000044.1| GENE 16 9343 - 10023 750 226 aa, chain + ## HITS:1 COG:SA0133 KEGG:ns NR:ns ## COG: SA0133 COG0274 # Protein_GI_number: 15925842 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Staphylococcus aureus N315 # 10 226 5 219 220 166 42.0 3e-41 MQRTLEEITQYIDHTLLKPYASKEAMQAFCNEAKELKVKMVAINSYYTKFCKELLKDTTI 
HVGAAISFPLGQTTIAVKAFETIEAIKDGADEIDYVLNLAKVKDGDFTYIKEEMETIVKI CREAGIISKVIFENCYLTKDEIRKCAQIAKEVKPDFIKTSTGFGPGGALIEDVKIMLETV DGVCKVKAAGGIRDYKTFNEFINLGVERIGTSSTKTIIKEFKENDE >gi|223714171|gb|ACDT01000044.1| GENE 17 10073 - 10948 989 291 aa, chain + ## HITS:1 COG:MA0617 KEGG:ns NR:ns ## COG: MA0617 COG0053 # Protein_GI_number: 20089506 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Methanosarcina acetivorans str.C2A # 11 288 29 311 331 165 34.0 8e-41 MEEYQKTVMRVSTVTLLGNLILAFFKILIGFISFSNAIISDGIHTASDVLSTVVVMVGVK LSSKESDLEHPYGHERFECVAAIILAVMLFLTALLIGYSGIIEIINPSNVKLKYGSLAIV IAVISIVAKEAMYWYTIIEAKRINSNAMKADAWHHRSDAFSSIGSLVGVIGFYLGYYILD AIASILICGFILKVAIEIFNDAIDQMIDHACSREFQEDLISLITEQSEIINIDDLKTRLF GNKVYVDLEVQVDGNDNLRHAHAIAHQVHDLIENNFPTVKHCNVHVNPSDS >gi|223714171|gb|ACDT01000044.1| GENE 18 11015 - 11998 1191 327 aa, chain + ## HITS:1 COG:CAC1543 KEGG:ns NR:ns ## COG: CAC1543 COG1052 # Protein_GI_number: 15894821 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 326 1 325 326 347 57.0 2e-95 MKIAVYNYREFDEGQYFEKFSQHYCVEIIKIYDSPNKENAVLAEGCNGVSVITTAITKEI IEIWHEQGVKHISTRTIGYDHIDLKAAKANKMIVSNVTYSTASVANYTVMIMLMALRKMK MIMRRAIGFDYSLNGSIGLELENMVVGVIGTGAIGQKVIKNLSGFGCQILAYDPFEKEEV RKYADYVAMEEIITKSDIITFHVPALEDTYHLVCQETIEKMKDGVIIINTARGSIIDTSD LISALESGKIAACALDVIENELGLYYNDYKYKVIGNHELSILRDMPNVLLTPHMAFYTEQ AVSDMVEHSIESIVADRDGKENKFRVC >gi|223714171|gb|ACDT01000044.1| GENE 19 12546 - 13415 864 289 aa, chain + ## HITS:1 COG:CAC0990 KEGG:ns NR:ns ## COG: CAC0990 COG0008 # Protein_GI_number: 15894277 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 2 283 5 294 485 363 61.0 1e-100 MKIRTRFAPSPTGYMHIGNLRTALYGYLIAKKDDGDFILRIEDTDQAREVAGAIDVIYQT 
LSDTGLEYDEGPDKDGGYGPYIQSQRLEIYQKYAHELVKLNGAHYCFCNQENVKGNKEDQ IFKDPCHQLSDEEIEKLLSENKPYVIRQTIKQGQTTFNDEVYGEITVDNDTLDEGVLLKS DGYPTYNFANIIDDHLMSITHVVRGNEYLASTPKYNIIYQTFNWEIPTYIHVPPVMKDEQ HKLSKRNGDASYQDLIKQGYLNEAVINYIALLGWAPEGEEEIFFFIRTD >gi|223714171|gb|ACDT01000044.1| GENE 20 13480 - 13974 428 164 aa, chain + ## HITS:1 COG:CAC0990 KEGG:ns NR:ns ## COG: CAC0990 COG0008 # Protein_GI_number: 15894277 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 1 161 323 483 485 138 42.0 4e-33 MNGLYLRNLTLENFHQLAMPYYQKIITRDVDLLELSQVLQLRISYLNEIPEMIDFINEPC NADETLFKNKKMKTNPENSLEALIWVRDALSTFDNFNDDLALHDLFINLAQEKEVKNGRI MYPVRVALTFKSFTPGGAVEIAHILGKNESLKRIDLAIKLLSMK >gi|223714171|gb|ACDT01000044.1| GENE 21 14441 - 15325 958 294 aa, chain - ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 2 270 3 271 283 120 29.0 3e-27 MLIIHKLEHTHFSSSEAIIIDYILEQGLKIKNMSANSIAKATFTSAPLLVRIAKKLGYSG WNAFKEDFIAELEYLFKEQKVDASIPFVVSDDIMTISNNIGQLQIDAIQDTMSMLNHDDL QTALRYLRNAKELDLYGVSNNLLLAEGFRSKLFYIHRNVNICRLPGNPKVQAAMSDETHC AILISYSGETEFIIEVAKVLKKKETPIIAITCIANNRLSKIADVTLRISSREMLHTKIGD FATTTSIKYLLDTLYAGIFSFDYQKNLDYKIQIAKAVDDRHSGYEFIDEDEFPQ >gi|223714171|gb|ACDT01000044.1| GENE 22 15535 - 15780 386 81 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3105 NR:ns ## KEGG: Ccel_3105 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 80 1 80 81 98 60.0 7e-20 MESRIAILGIFVYELDSANKINDLLHKYSRYIIGRMGIPYKDKQVSVMSVIVDGPNDVIG ALSGKIGMLKGVSVKALYSKG >gi|223714171|gb|ACDT01000044.1| GENE 23 15782 - 16840 1023 352 aa, chain + ## HITS:1 COG:CAC1631 KEGG:ns NR:ns ## COG: CAC1631 COG0502 # Protein_GI_number: 15894909 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Clostridium acetobutylicum # 
4 313 5 309 350 252 43.0 8e-67 MSNIDLINKLHQEGALDFDEFKRLLKTFDQNDFDYAKKIAAKITEQIFKNKIYIRGLIEI SSYCKNDCYYCGLRHSNKLALRYRLNLEEILLCCKEGYRLGFRTFVLQGGEDEYYQDERM VEIIEKIKELYPDCAITLSLGEKSKETYQKYFNAGADRYLLRHETYNREHYYQLHPHEMS FDNRIDCLLNLKAIGFQTGCGFMVGSYHQDIDCLVNDLLFIKELKPEMVGIGPYLVHQNT PFKNQPNGDLKLTLFLLSLIRIMTKDVLLPATTALATLDSNGRLEGINHGCNVVMPNLSP QEVRKKYSLYDGKAASGQEASEGLKELIDTLKTAGYEVVYERGDYCKFDYEG >gi|223714171|gb|ACDT01000044.1| GENE 24 16872 - 18293 1440 473 aa, chain + ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 1 473 1 472 472 725 76.0 0 MYNVKSNKAEEFIDDQEILETIAYAKENKDNRELIFSLIERAKDCKGLTHREAMLLLECE IPEAIDEMEKVAKMIKQKLYGNRIVMFAPLYLSNYCVNGCVYCPYHLKNKHIARKKLTQE EIKKEVIALQNMGHKRLALETGEDPINNPIEYVLESIKTIYGIKHINGAIRRVNVNIAAT TVENYKKLKEAGIGTYILFQETYNKKNYEALHPTGPKHDYAYHTEAMDRAMDGGIDDVGI GVLFGLEMYRYDFVGLLMHAEHLEAAKGVGPHTISVPRICPADDIDKSDFSNAISDKIFE KIVMIIRMAVPYTGMIISTRESKRVRERVLELGISQISGGSKTSVGGYDEPETEEDTTSA QFDISDNRTLDEVVKWLCELNYIPSFCTACYREGRTGDRFMQLCKSGQISNCCHPNALMT LKEYSQDYAGDDTKLKADNLIMKELDNITSDKVRKICQERLTMIEDGKRDFRF >gi|223714171|gb|ACDT01000044.1| GENE 25 18296 - 19483 1044 395 aa, chain + ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 1 394 1 395 411 310 43.0 3e-84 MAGLNKTPGANRLHIGFFGKRNSGKSALINAFVNQEVSIVSSEPGTTTDPVYKAMEIDGL GPCLLVDTAGFDDIGTLGQLRNDKTKKASEKIDIAIILFNDRKICQELDWYHYFKEKHIP TILVISKADLQRNDNLLKKVTEATSETPLLISSLTKQGIKELQEKLLEKVPADFEKESIT GELAKKGDLILLVMPQDIQAPKGRLILPQVQVMRELLDKGCQIMAVTTEQLKSALQKLNQ APDLVITDSQVFKYVYENISSESKLTSFSVLMAGYKGDLNYYAKSVEILERLNGDSTVLI AECCSHAPLSEDIGRVKIPRMLKKKFPGIGVEIVSGTDFPDDLTKYDLIITCGGCMFNRR YIMTRVNQAKGQMIPMTNYGIFIAYVNGILDKVIY 
>gi|223714171|gb|ACDT01000044.1| GENE 26 19580 - 20674 988 364 aa, chain - ## HITS:1 COG:CAC0767 KEGG:ns NR:ns ## COG: CAC0767 COG1453 # Protein_GI_number: 15894054 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Clostridium acetobutylicum # 4 364 5 372 376 344 47.0 2e-94 MENKFGFGCMRLPMKNDEVDYDEFNQMIDLYMQEGFNYFDTAHGYLGGKSEIALRDCLVS RYPRESYVLTNKLTKSFFNSKEDILPLFNEQLQKTGVSYFDYYLMHAQDRENYKHFQKCQ AYQAAQELKATGKVKHIGISFHDTANILEMILSEHPEIEIVQIQFNYADFNNPSVESKKV YEVCRKFNKPILVMEPIKGGGLANLPEAASRILTDLNKNASNASYALRFAASFEGVYKVL SGVSNLEQMKDNLKVMKNFIPFNDEEYEAVAKVRKILDSLGGIPCTACRYCTDGCPMKIS IPDLFGCYNDKKIFNDWNSDYYYGVHTKTSGKASTCIGCQQCEHICPQHLPIIKYLKEVA ETFE >gi|223714171|gb|ACDT01000044.1| GENE 27 20791 - 21291 553 166 aa, chain - ## HITS:1 COG:MA0330 KEGG:ns NR:ns ## COG: MA0330 COG0778 # Protein_GI_number: 20089228 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 2 162 9 169 179 147 44.0 6e-36 MLKEIEQRQSIRKYLNKPVEPEKILELLKAAMNAPTARNTQEWKFKVITNSQALNDMTKL SPYTTMMKEAPCAILVIADLNKAISPEYGLINCSAAIENLLIEAVNQGLGCCWCGIAPVS ERIEGFKNYYNLADNEYPVGVVAVGYSNETKPLIDRFDSKNISYFK >gi|223714171|gb|ACDT01000044.1| GENE 28 21405 - 22517 1020 370 aa, chain + ## HITS:1 COG:lin0658 KEGG:ns NR:ns ## COG: lin0658 COG0639 # Protein_GI_number: 16799733 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Listeria innocua # 20 229 7 210 235 73 25.0 5e-13 MLEAQHIKIKVVGEYRCIVISDIHSHLDRFKALLKKVDYRYDDYLIILGDFIEKGTQALE TVRFLQELQAKSERVYVILGNCEYALEEMVDNPKYANQIVHYLNRIGKGGMIRQALEKLN IDIKKENPEVMQVKIKQFLRPYFQYFKTLPTILEFNNFIFVHAGIENRKDWQNSSTSSLI EMRTFYQTGHCLDKYVVVGHLPTSNQYANAINNDIIIDKQKRIISIDGGTGVKSISQLNA LIITGDGDRFKLSKEYIQPLPLYQAVVDVNVEKQVVNKVAWPNFEVEVLKQGKLFSTCKQ IKTGKIFKIKNEFLYLNNQKYYCLDDYVDYLIPLEKGEVIKLVGIYGPYAYVIKDNEIGW VKYRYLKKVN >gi|223714171|gb|ACDT01000044.1| GENE 
29 22730 - 24151 1437 473 aa, chain + ## HITS:1 COG:STM0969 KEGG:ns NR:ns ## COG: STM0969 COG0531 # Protein_GI_number: 16764329 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Salmonella typhimurium LT2 # 5 468 7 471 473 432 48.0 1e-121 MSNSKKIVWYNLAFMAFSTVWGFGNVVNGFVYFNGVQVIFSWIMMFALYFVPYALMVGEL GSAFKNAGGGVSSWIQETLGPKLAYYAGWTYWVVHMPYIASKASGGLKALSWVLFQNSSF YDSLDIRLVQIITLVVFLVFAWVASRGLNPLKSLATIAGSSMFVMSILFIILMMAAPALN PNGGYQTIDWSFKNLIPSFRLEYFTSLSILVFAVGGCEKISPYVNKVENPSKGFPKGMIA LAIMVIICAILGTISMGMMFNVAEINANFDSYVANGAYWSFQKLGEYYGIGNTFLIVYAV CNMVGQLSTLVISIDAPLRMLLDNENARQFIPAKLLKKNDKGAYTNGILLVVAIVTPLIL IPALGIGSVTAFLKFLTQLNSVCMPLRYLWVFIAYIALRKAIDKFPAEYRFTKSQFFAKF FGWWCFIVTAACCALGMFKGSAFEIAMNIITPIILVGLGAIMPALAKRGKTSQ >gi|223714171|gb|ACDT01000044.1| GENE 30 24225 - 25544 1480 439 aa, chain + ## HITS:1 COG:CAC0607 KEGG:ns NR:ns ## COG: CAC0607 COG1362 # Protein_GI_number: 15893896 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 4 424 6 432 433 397 47.0 1e-110 MDKQISQELVTFIKQSPTAFHAVANMQNILIEHGYEELLEGQTWQIKKGGHYFVTRNNSS IIAFNLGENLDNYSFNVAASHSDSPTFKVKENAEIEIKGKYTQLNTEGYGGMLCATWFDR PLSIAGRVLVQEGDNYVTKLVNIDRDLVMIPNVAIHMNRTVNDGYAYNKQVDMLPLFGGS ETKAGDLKKLIADELGVDVETIYGTDLYLYNRMEPSIWGANEEFISCPQLDDLQCAYTSL QGFLKGANKQSINVFACFDNEEVGSGTKQGAGSTFLYDAMRRINNALGKGEEEYYRALAS SFMLSADNAHAVHPNHPAKTDVNNCVYMNEGIVVKSHAGQKYTSDAVSIAVFKGLCKKAG VPVQFFSNRSDTAGGSTLGNIAMAQVSMNSVDIGLPQLAMHSSYETAGVKDTAYMIKVME EFFNSHIEEMSGHILKVTK >gi|223714171|gb|ACDT01000044.1| GENE 31 25544 - 25945 400 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756291|ref|ZP_02428418.1| ## NR: gi|167756291|ref|ZP_02428418.1| hypothetical protein CLORAM_01824 [Clostridium ramosum DSM 1402] # 1 133 1 133 133 200 100.0 2e-50 MKKEKTSKKRLKEIKKEVLEKYIIAGLWQTMCGYIVLLFIKELLTDNYLVSFSVDVLIAI IAFYVTLHNLVNQYKLIKENRLSLKPFSFQIFGIIVGLFIVILTLKSPFDISFAILVIAF LTSKKMFEKELMK >gi|223714171|gb|ACDT01000044.1| GENE 32 26050 
- 26400 554 116 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756292|ref|ZP_02428419.1| ## NR: gi|167756292|ref|ZP_02428419.1| hypothetical protein CLORAM_01825 [Clostridium ramosum DSM 1402] # 1 116 1 116 116 131 100.0 1e-29 MNTKKINEQAIESLKTYFNEGKDLIKENVTPLTKDQIKAIVNEEIAHHEAKLHELNDLKQ KIDLEAEEAFEKVFGKAKEEPKTHKLTTDEMAVKLQEEKEHQMDVANKLKEEKLSH >gi|223714171|gb|ACDT01000044.1| GENE 33 26703 - 26903 265 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734929|ref|ZP_04565410.1| ## NR: gi|237734929|ref|ZP_04565410.1| predicted protein [Mollicutes bacterium D7] # 1 66 1 66 66 88 100.0 1e-16 MNKREIEKQALNSLNTYFKDNPDKVLNDFERKPLDKEDITKIVNEEIQAQEKKVESLEVL RKKYCY >gi|223714171|gb|ACDT01000044.1| GENE 34 27190 - 27612 451 140 aa, chain + ## HITS:1 COG:BS_cwlJ KEGG:ns NR:ns ## COG: BS_cwlJ COG3773 # Protein_GI_number: 16077329 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall hydrolyses involved in spore germination # Organism: Bacillus subtilis # 9 132 8 130 142 115 47.0 2e-26 MSSRIAYNSKEVELLARLIKAEALGEGQEGMLLVGNVVVNRVAAVCDVFKNVVTITEAIY QKNAFAGVGTDLFNGPVNTKEKELALKTIKGYRNAPATYALWFKNPGDNVTCPKTFYGEF SGRFKNHCFYNPGPNLVCEI >gi|223714171|gb|ACDT01000044.1| GENE 35 27626 - 27979 292 117 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_4878 NR:ns ## KEGG: Pjdr2_4878 # Name: not_defined # Def: spore coat protein GerQ # Organism: Paenibacillus # Pathway: not_defined # 22 117 72 170 185 81 41.0 7e-15 MNYFEDLNPYFVRQDPDAEVNPDGTLIPPKNPNPAAVPQIYMGNIIRLNIGKLGTFYFTY SDSEKWRDEVFVGIVEDAGRDYFIIKDVNSEDRFLLPLVYFLWARFKGAISFTVPPQ >gi|223714171|gb|ACDT01000044.1| GENE 36 28119 - 28400 367 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756296|ref|ZP_02428423.1| ## NR: gi|167756296|ref|ZP_02428423.1| hypothetical protein CLORAM_01829 [Clostridium ramosum DSM 1402] # 1 93 1 93 93 165 100.0 7e-40 MATGDVIDFLEREGYDLDFIKTANRYLSYDSEHLILQCKNDILEFGEKELVKMYLDNDIF YSYTLSRFNSKVGLAYFECTLKDAFSIFTLQNI Prediction of potential genes in microbial genomes Time: Thu May 26 09:42:54 2011 Seq name: 
gi|223714170|gb|ACDT01000045.1| Coprobacillus sp. D7 cont1.45, whole genome shotgun sequence Length of sequence - 3501 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 226 259 ## gi|237734441|ref|ZP_04564922.1| hypothetical protein MBAG_01304 2 1 Op 2 . - CDS 226 - 1224 1133 ## gi|237734442|ref|ZP_04564923.1| conserved hypothetical protein - Prom 1250 - 1309 8.3 - Term 1535 - 1565 1.2 3 2 Op 1 . - CDS 1568 - 2257 470 ## COG1272 Predicted membrane protein, hemolysin III homolog 4 2 Op 2 . - CDS 2311 - 3393 860 ## COG0628 Predicted permease - Prom 3430 - 3489 6.9 Predicted protein(s) >gi|223714170|gb|ACDT01000045.1| GENE 1 1 - 226 259 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734441|ref|ZP_04564922.1| ## NR: gi|237734441|ref|ZP_04564922.1| hypothetical protein MBAG_01304 [Mollicutes bacterium D7] # 1 75 1 75 75 110 100.0 2e-23 MKKKIFPLAIVATLLVGCSSSSATTYTSEIKDGDKTIATADDVKISKNDVYHYLLKEYGS SEVLSLALTYIADQE >gi|223714170|gb|ACDT01000045.1| GENE 2 226 - 1224 1133 332 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734442|ref|ZP_04564923.1| ## NR: gi|237734442|ref|ZP_04564923.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 332 1 332 332 553 100.0 1e-156 MKLGKCAAALVVTSFLLTGCQNSSKTVKEDGKYIVASLNSGKKDKNIFADDIFNDVISTA SGKSSYFNAVLQQLMDQKFPIDEDMETDANETVDQIQTYYENQYGDNAETQLQTALSSSG FNSLDEYRENMVRVYQRCNFLLAYVEKNFDEVFDDYYTQASPREASLIKVSMTDVENPTA DETAKLSEVTALLSSSKSFGDIAKDYSDDTNTNKNKGKLGIVDTTSGLSQTFGSDVETKI LALASGETSEAIKGTDGYYFVKVTSTDKDSIKKEIKKDLSIDTPLIAYDKYMSYIVYQSY NVKYDDKNTEKIINNVIDEALKERETSRGGTN >gi|223714170|gb|ACDT01000045.1| GENE 3 1568 - 2257 470 229 aa, chain - ## HITS:1 COG:FN1885 KEGG:ns NR:ns ## COG: FN1885 COG1272 # Protein_GI_number: 19705190 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Fusobacterium nucleatum # 26 229 9 212 215 214 
57.0 1e-55 MYWVIKLEKVNNKITKKIKRIFPPYFKEELFNCITHGIMAFIMLLLIPACAVYAYVKGGP IQSFGVSVFTICIFLMFLVSTLYHAMDHDSPHKQVFRILDHIFIYFAIAGSYTPVALCLI KGYQGIIILVIQWAMVIVGILYKSISIKSLPKLSLTIYLVMGWTAILFMPSIIQNSSAVF LWLIVIGGLMYSIGAYFYANKKIPYNHVIWHIFISIASILHFIAIVFYI >gi|223714170|gb|ACDT01000045.1| GENE 4 2311 - 3393 860 360 aa, chain - ## HITS:1 COG:SP1505 KEGG:ns NR:ns ## COG: SP1505 COG0628 # Protein_GI_number: 15901352 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Streptococcus pneumoniae TIGR4 # 39 344 45 361 388 82 21.0 1e-15 MEFLKALFKIISSKKITHTLLNVMIVLVILLLIIATGDLWYGVLSAIWLVSKPFIVGFTI AFVLNPLINYIEKYVTKRSIAVTMVYLAAFAILTLLISLAVPMIYESISEMFPAFYSGLG EIGVFVKENFNYDISSLMRHIQNIVNAFFKDSVVLDTTIDVLNQVMINVTNFLIYIILAI YMSSNFKNIRSHIRNITYRIDKTLPIYLREIDSSLVQYVKAFFIGAIAQAITTMLMYLAI GHPNWLILGFVSGASSIIPYVGPIVANCLGLITSLGMGTTTIVILFILIFIQSTIMSYVI TPRIYSSRIDLSIMWVLFGILSGSSLFGIWGMVIAMPLLVSAKITFQVYKENHQNEKLME Prediction of potential genes in microbial genomes Time: Thu May 26 09:43:15 2011 Seq name: gi|223714169|gb|ACDT01000046.1| Coprobacillus sp. D7 cont1.46, whole genome shotgun sequence Length of sequence - 11278 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 2, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 58 - 117 12.0 1 1 Op 1 41/0.000 + CDS 178 - 993 169 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 2 1 Op 2 24/0.000 + CDS 993 - 1886 919 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 3 1 Op 3 19/0.000 + CDS 1886 - 3112 1421 ## COG0520 Selenocysteine lyase 4 1 Op 4 6/0.000 + CDS 3096 - 3524 621 ## COG0822 NifU homolog involved in Fe-S cluster formation 5 1 Op 5 . + CDS 3535 - 4941 1440 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 6 1 Op 6 . + CDS 4934 - 5464 541 ## COG0212 5-formyltetrahydrofolate cyclo-ligase 7 1 Op 7 . 
+ CDS 5508 - 6917 1331 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid + Term 6927 - 6983 9.1 + Prom 6950 - 7009 12.8 8 2 Op 1 . + CDS 7030 - 9606 2073 ## COG0474 Cation transport ATPase 9 2 Op 2 . + CDS 9640 - 10293 517 ## Aflv_2429 hypothetical protein 10 2 Op 3 . + CDS 10278 - 11114 679 ## gi|237734454|ref|ZP_04564935.1| predicted protein + Term 11205 - 11235 0.4 - TRNA 11173 - 11246 79.1 # Gly TCC 0 0 Predicted protein(s) >gi|223714169|gb|ACDT01000046.1| GENE 1 178 - 993 169 271 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 226 2 221 245 69 28 9e-12 MSTLKIENLHVSIGEKEILKGIDLVVNTGETHAIMGPNGNGKSTLLSVIMGHPKYIVTQG SIYIDDQNVLEMSVDERSRAGIFLGMQYPQEIPGVTTSDFLRAALNAHQEKPVSLFKFVK ALDKNIADLKMDENLAHRYLNEGFSGGEKKRNEILQMKLLQPKFALLDEIDSGLDVDALK IVSQAINAMKSDDFGCIMVSHYERLFELVPPSHVHVLVNGKIILSGGIEVVEKIDQEGYD WVKELGVEIATDEKKPILLESCANKERMKAK >gi|223714169|gb|ACDT01000046.1| GENE 2 993 - 1886 919 297 aa, chain + ## HITS:1 COG:BH3470 KEGG:ns NR:ns ## COG: BH3470 COG0719 # Protein_GI_number: 15616032 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Bacillus halodurans # 103 295 242 434 435 123 34.0 3e-28 MKQLKEIKNYLVFHNGTIECHPENVGIAINDGKIDVKDAASLQIIYLIDQEGKYCFDLKI KGQLELIETYDFKVPASLVKNIEVLQNSNILRYNDNQSSVNGTVEVTENVSVERDSNVRC AYVELSDSSIEMNVNHALNGSNANSTVRLASLAKGQEKKHMTFSLTHFAPHTTGIMDNYG VVKDASNLVIDGIGTIKQGNHQSSSHQTNKIIVFDEKCNAKANPYLYIDEYDVKASHGAS VGKIDEDHLYYLQSRGLSKKDAMHLVTYGYFIPVMEFIDNADLREMFNETLKEKVGI >gi|223714169|gb|ACDT01000046.1| GENE 3 1886 - 3112 1421 408 aa, chain + ## HITS:1 COG:BS_yurW KEGG:ns NR:ns ## COG: BS_yurW COG0520 # Protein_GI_number: 16080321 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Bacillus subtilis # 2 407 1 406 406 358 43.0 1e-98 
MLDVEKIRQDFPMLNGTTMHGQPLIYFDNGATTLKPQCVIDAVCNYLTNYSGNAHRGDYD LSHDVDTQFEKTRDLVAKLINCDRKEVVYTYGSTDGLNMIAFGYGMTHLDEGDEILLTVA EHASNTLPWFEVCDTVGSTVKYIDLDDEGRLTVANVLKAISDKTKIISIAQVSNVLGFDG PIKEICQIAHERGIIVCVDGAQSIPHEKVDVQTLDIDFLVFSGHKMCGPTGVGILYGKYD LLQETKPTRWGGGSNARYKSCGRIKLKNAPAKFEAGTPNIEGVIGLGAAIEYLIEIGMDN IRDYELELRQYAVKKMLELDNLEVYNPNGHGAIAFNIKGVFSQDGASLFNTYGIAIRAGQ HCAKILDEFLDVSQTLRASFYFYNTFEEIDRFIEVCKKGDDFLDAFFG >gi|223714169|gb|ACDT01000046.1| GENE 4 3096 - 3524 621 142 aa, chain + ## HITS:1 COG:BH3468 KEGG:ns NR:ns ## COG: BH3468 COG0822 # Protein_GI_number: 15616030 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Bacillus halodurans # 8 141 9 143 146 100 38.0 1e-21 MPSLDSMMLRQIIMDHYENPRNHGLVDDDNYQSVNMDSETCIDDIDVQALIEDGVIKDIR FDGEACAICTASTSIMSELLIGKTIDEANVIIENYNNMIYEKDYDPEILEEAIAFMNTHK QANRIKCATLGWTGIKQILDKE >gi|223714169|gb|ACDT01000046.1| GENE 5 3535 - 4941 1440 468 aa, chain + ## HITS:1 COG:SA0778 KEGG:ns NR:ns ## COG: SA0778 COG0719 # Protein_GI_number: 15926506 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Staphylococcus aureus N315 # 12 461 10 459 465 630 66.0 1e-180 MADIKNEIPDQEYAYGFNDGDVSVFKTPKGINEEIVTEISKIKNEPEWMLEYRLKSYRCF MEKPMPTWGVNLDRIDFDEYTYYVRPSDKQTNKWEEVPETIKQTFNKLGIPEAEQKYLSG VSTQYESEVVYHNMLKEVTEKGVIFLDIDSGLREYPEIFRKYFDTVIPYNDNKFSALNGA VWSGGSFIYVPPGVKLDKPLQSYFRINSEQMAQFERTLIIVDKGADIHYVEGCTAPSYSR DSLHAGVVEIIVGEGAKCRYTTIQNWSNNILNLVTQRAKVFKNGSMEWVDGNIGSAVTMK YPTCVLMEEGAKGSCITIAVAGENQIMDSGSKMIHLAPNTSSSIVSKSVSRRGGKVNYRG MVQHGAKAYNSKSKVECDTLILDDISTSDTVPVNWMRNNDSIIEHEATVSKISEEQLFYL MSRGLTKKEAMEMIVMGFIEPFARELPMEYAVELNQLIKLDFGDQGIG >gi|223714169|gb|ACDT01000046.1| GENE 6 4934 - 5464 541 176 aa, chain + ## HITS:1 COG:aq_1731 KEGG:ns NR:ns ## COG: aq_1731 COG0212 # Protein_GI_number: 15606807 # Func_class: H Coenzyme transport and metabolism # Function: 
5-formyltetrahydrofolate cyclo-ligase # Organism: Aquifex aeolicus # 1 176 1 176 186 124 34.0 9e-29 MDKKVLRKQLIQARLDLDSETYASKSNFIVSKLKQQPEFIEARAIGIYVSFRHEVETISL IKEIINNKIVCVPKISGKQMDFYQINSINELKTSNFGILEPNNSHPVTKDNLDLLIVPMV GYDQSGNRLGYGGGYYDRYLSDYCGNVIGLAFSFQEVANLPVEPFDLPIKKIINEK >gi|223714169|gb|ACDT01000046.1| GENE 7 5508 - 6917 1331 469 aa, chain + ## HITS:1 COG:MA4461 KEGG:ns NR:ns ## COG: MA4461 COG2244 # Protein_GI_number: 20093247 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Methanosarcina acetivorans str.C2A # 12 452 19 456 490 83 24.0 9e-16 MSREKTLVKNTLIMALGTFLPRIINLLTTPIITSATTDAQYGQLDLVTTTILSFIVPLCT LQIEQALFRFLVDAKSEKEQRRVITNGYVMIFGLMVIAAIICLFVPISMFEGSFKLLIIG YIWIEIIATTSRFVLRAFSKYKEYSILAALVVIVNFVVVSVCLLWLKTGYIGVLIALVAA DVVGFIYVLCVCNIFKFFNFKYFSSEYALKMLGYALPFVPNMVSWYINQLSDRWIISIFL GAGPNGVYALANKIPSIVNILYPAFNLAWTDSATRSVNDPDSGRYYNRMFRMLFCIVSAG SVLLVAGSPIIFGILCRNKELYSAFDYTPTLILATYFYCFSQFFGSIYVAVKSAKNMSVT TTIAAIINIIINLGLINIIGVQAAVLSTLAANLFLAGYRFFDLNRRYVRLKINKRLTVLT IVIFIISMSLAWSGDKILWILNIVIALGYAYFISGDIFIGMFKAVLKKR >gi|223714169|gb|ACDT01000046.1| GENE 8 7030 - 9606 2073 858 aa, chain + ## HITS:1 COG:FN1022 KEGG:ns NR:ns ## COG: FN1022 COG0474 # Protein_GI_number: 19704357 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 1 856 1 862 862 758 50.0 0 MEAYQKSFKEVQQALAVSDNGLSQKDVIKRQTRYGRNEIIGEKRQSVWGIFFDQFKDLLV IILICAGIISAISGEFESTLVILVVITLNALLGTFQTVKAQKSLDSLKKLSMPKIKVLRE GKLVEIKSNELTVGDLVFIQAGDVIAGDGRLKEANNLQVNESALTGESNSVEKQLVAINH QCTIGDMTNMVFSGSLVTNGTGQYIVTNIGMKTQLGKIAEMLNATKQRRTPLQKTLDEFS VKLSIGIISICIIVLFMDVFIAKEQLLDALMIAVALAVAAIPEALGSIVTIVLSISTQKM AKENAIIKNLNAVESLGCVSVICSDKTGTLTQNKMTAVDIYTNRMLIQANHLNHHEYCHN ILLKAFVLCNNATISKAQSIGDPTEIALLELYQDYNYKNAYRLQTKRQQELPFDSTRKLM SVTSKNHLYTKGAPDVLLKRCNRILIDGYKEIFTNKDKQRIMEQNDALARQGKRVLGFAY KEFYRNTLEKRDENDLVFVGLVSLIDPPRIESAQAVKDCILAGIKPIMITGDHVVTACSI 
AKKIGIFKQGDKCLEGTQLDKLTDQQLRNYLPHVSVYARVAPEHKIRIVQAWQARGAIVA MTGDGVNDAPALKQSDIGVAMGITGSEVSKDAASMILTDDNFATIVKAIITGRNVYTNIK NSIIYLLSGNLSAIICVLVTSFAILPTPFLPVHLLFINLITDSLPAIAIGMEKGSDEILK QKPRGSGDSLLNKETILQVSFEGIIICIFTMLAYFIGLRSDELTASSMAFGTLCLARLLH GFNCRSNLPLYKIPVNYYSVGAFVIGVSLLTWILISPNFHSLFSISELSLKQLAIIFVSA AFPSIIIQIYKMVSSRYN >gi|223714169|gb|ACDT01000046.1| GENE 9 9640 - 10293 517 217 aa, chain + ## HITS:1 COG:no KEGG:Aflv_2429 NR:ns ## KEGG: Aflv_2429 # Name: not_defined # Def: hypothetical protein # Organism: A.flavithermus # Pathway: not_defined # 4 215 16 233 249 98 29.0 2e-19 MKKRYIFIVITIILAIVIQVLSVNITPALKKIADKEINRFCQMVINNTPFPVTLDHQELI KINRNGDEIATINFNTSYASSIGAKMVNKLDELFVAIEEGTYKKTDNSFYQRRFQKMSDE GGVIASIPIGALTQNPFLAGVGPKIKLKYETISAITCSVEKDVKSYGVNHVMVSLKLVIK IKMMVLLPFYNEEFNKDYDYPLVMEIIEGEVPNWYQN >gi|223714169|gb|ACDT01000046.1| GENE 10 10278 - 11114 679 278 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734454|ref|ZP_04564935.1| ## NR: gi|237734454|ref|ZP_04564935.1| predicted protein [Mollicutes bacterium D7] # 1 278 1 278 278 522 100.0 1e-146 MVSELEILQRFYQIYLEQESDFFSFQGNFYYLCKVNNFTSDYPYYVNALGLNGFMVVNNC FNRPISMDYILYTYQIEDYHFEMFLQYSLRPLDIQVEVLKIKESWCAILDEAKCKIGNYA SRISHFEHFVVLSYYYQGLGEAAISVLNEIRETKLVAGIEHFNMIDNYEMLCCPANLVIA SRVKDLASSYKNNLISVEQLEEYIQISALTVDEIIYLYSRLLFPSEFMQLAINDDCNDAQ IKKTLLNMYQNIDNQKASLVIAWQMLNKYTRLPKIAWL Prediction of potential genes in microbial genomes Time: Thu May 26 09:43:30 2011 Seq name: gi|223714168|gb|ACDT01000047.1| Coprobacillus sp. D7 cont1.47, whole genome shotgun sequence Length of sequence - 5468 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 132 - 162 2.0 1 1 Op 1 . - CDS 170 - 1450 1679 ## COG0166 Glucose-6-phosphate isomerase 2 1 Op 2 . - CDS 1452 - 1829 219 ## PROTEIN SUPPORTED gi|30264941|ref|NP_847318.1| general stress protein 13 - Prom 1850 - 1909 8.8 - Term 2153 - 2189 4.1 3 2 Tu 1 . 
- CDS 2272 - 4536 1470 ## COG2199 FOG: GGDEF domain - Prom 4581 - 4640 7.3 4 3 Tu 1 . + CDS 4998 - 5192 76 ## + Term 5288 - 5338 4.2 Predicted protein(s) >gi|223714168|gb|ACDT01000047.1| GENE 1 170 - 1450 1679 426 aa, chain - ## HITS:1 COG:BS_pgi KEGG:ns NR:ns ## COG: BS_pgi COG0166 # Protein_GI_number: 16080187 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus subtilis # 14 426 20 438 451 506 58.0 1e-143 MITLDLSKAKLSEDLESYKDKVKEIHDMIHNKTGAGNDFLGWVELPENYDKAEVELIKET AASLREKTDVLLVCGIGGSYLGARAAIEAINGLYPTNDVEIIYVGNTFSSNYIKQVADYI KDKDFAINVISKSGTTTETSIAFRIFKEMCETKYGKVGAKERIVATTDKAKGALKTLATE EGYTTFVIPDDVGGRYSVLTPVGLFPIAMAGIDVDAMLKGAKDAMEKYGNADLDQNDAYK YGVARQILHKAGYPAEMFVTYELQLAMVAEWWKQLYGESEGKEGKGILPTSATFSTDLHS LGQFIQEGTKVLYETIMQIKEPAGDITIPSDKDDLDGLNYLAGKSVDYVNKKACEGTVDA HVNTGNVPNILITLDKMDAYGFGYMVYFFEMSCAMSVYLLGVNPFNQPGVEVYKANMFKL LGKPGY >gi|223714168|gb|ACDT01000047.1| GENE 2 1452 - 1829 219 125 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|30264941|ref|NP_847318.1| general stress protein 13 [Bacillus anthracis str. 
Ames] # 1 119 1 109 114 89 38 8e-18 MSQKIKRGDIIDVKITGIQPYGAFASLPDNSTGLIHISEISDKFVKSIDSFVKVGESIKV KVIDFDQETNHAKLSLKAIDNRYRRRNKRVYYKNPRRLIVETPNGFKPLALAMEKWLKQG ITEES >gi|223714168|gb|ACDT01000047.1| GENE 3 2272 - 4536 1470 754 aa, chain - ## HITS:1 COG:DRA0297_2 KEGG:ns NR:ns ## COG: DRA0297_2 COG2199 # Protein_GI_number: 15807957 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Deinococcus radiodurans # 386 547 200 361 373 127 41.0 6e-29 MNNKNTAKKMSINENAILEKYLNTLATPISKHKPDKKLSFIWANDAFYKMTGHEKKDFKM IWQDLKTYYKPYTDDFIILENQLAKLQEHRDLHQECIIRIPFRHDEIKWIKLSISLSFEV ITIAYNDISNLINTQNTLISTKNVYIDNFQWMIAENAGNVYISDIDTYELLYLNKVACET LGTRAEQAIGKKCYEIIQGRSSPCPFCTNKYLKKDQTYEWEFYNPLLKRNFIIKDRMLNW HGHSARIELSHDTYSTEYKLAKKDQEREAILKTISAAMVRVDARNYSSILWCNDIFLKMI GYTKEQFKSERYELKSYMHPEDLKRAKQIVHKLRNSKESVVFEGRVYTRNKEERIWTLTL CYVDERDSWDGIPCFYSMGLDITDERQKIKTLQHKAEKDALTGIYNRSETENQIKNFIIE HHDSKGALFMIDTDNFKQINDTLGHIAGDMVLGEMATAMKKKVRDSDVVGRIGGDEFIIF MKNITSHLDAEKKAEELQKTFAHLFDKEKSKIQVTSSIGIALYPKDGHNFLELYANADKA LYQAKIQGKNTYIVYNEKSCNQQSITAYSSLGATIDSQQSYSETSVSLADYAFRILFQNK DLKTAINLILEITGKRYDVSHAYIFETMTDKEHCSNTYEWCNEGITAVKEQLQNISYHDM DEYERVFDENSIFYCHDVTSLSKKQRQLFEAQGIHSTLQCAYYEQEMIAGFVGFDECTGK RIWTKEELRMLTLISQIISIFLKHKGLSDQVSKL >gi|223714168|gb|ACDT01000047.1| GENE 4 4998 - 5192 76 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYVVRTFVRQIKNLSLVGLFICVTCPFACLRNESFGTYTHFPDILYQNKKKNEKIMHYFL SRNS Prediction of potential genes in microbial genomes Time: Thu May 26 09:43:44 2011 Seq name: gi|223714167|gb|ACDT01000048.1| Coprobacillus sp. D7 cont1.48, whole genome shotgun sequence Length of sequence - 27990 bp Number of predicted genes - 26, with homology - 26 Number of transcription units - 14, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 230 199 ## COG0694 Thioredoxin-like proteins and domains + Prom 584 - 643 6.4 2 2 Tu 1 . 
+ CDS 680 - 1822 1356 ## COG0192 S-adenosylmethionine synthetase + Term 1824 - 1869 3.2 + Prom 1870 - 1929 8.8 3 3 Tu 1 . + CDS 1949 - 2479 422 ## PROTEIN SUPPORTED gi|148378176|ref|YP_001252717.1| ribosomal subunit interface protein + Term 2493 - 2529 4.2 4 4 Tu 1 . - CDS 2502 - 3446 645 ## COG1242 Predicted Fe-S oxidoreductase + Prom 3376 - 3435 6.1 5 5 Op 1 2/0.000 + CDS 3484 - 5328 1523 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component 6 5 Op 2 . + CDS 5330 - 7201 1469 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component + Term 7450 - 7480 -0.4 + Prom 7414 - 7473 6.3 7 6 Tu 1 . + CDS 7497 - 9893 2869 ## COG0495 Leucyl-tRNA synthetase + Prom 10154 - 10213 5.2 8 7 Tu 1 . + CDS 10417 - 10644 137 ## gi|167756748|ref|ZP_02428875.1| hypothetical protein CLORAM_02295 + Term 10774 - 10823 -0.6 - Term 10636 - 10672 7.3 9 8 Tu 1 . - CDS 10808 - 12553 1926 ## COG1154 Deoxyxylulose-5-phosphate synthase - Prom 12596 - 12655 9.5 - Term 12598 - 12637 -0.0 10 9 Op 1 12/0.000 - CDS 12854 - 13645 988 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 11 9 Op 2 3/0.000 - CDS 13661 - 14233 637 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA 12 9 Op 3 13/0.000 - CDS 14234 - 14965 705 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 13 9 Op 4 12/0.000 - CDS 14958 - 15590 733 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 14 9 Op 5 12/0.000 - CDS 15590 - 16516 794 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 15 9 Op 6 . - CDS 16516 - 17844 1139 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC - Prom 17871 - 17930 8.5 - Term 17920 - 17969 4.4 16 10 Tu 1 . - CDS 17972 - 18556 711 ## COG1592 Rubrerythrin - Prom 18591 - 18650 8.7 - Term 18595 - 18648 -1.0 17 11 Tu 1 . - CDS 18693 - 19085 518 ## gi|237734475|ref|ZP_04564956.1| predicted protein + Prom 19016 - 19075 8.0 18 12 Op 1 . 
+ CDS 19185 - 20129 851 ## COG0598 Mg2+ and Co2+ transporters + Prom 20204 - 20263 7.4 19 12 Op 2 . + CDS 20292 - 20897 1045 ## PROTEIN SUPPORTED gi|237734477|ref|ZP_04564958.1| 30S ribosomal protein S4 + Term 20917 - 20962 7.2 + Prom 21054 - 21113 10.6 20 13 Op 1 9/0.000 + CDS 21240 - 21947 753 ## COG0284 Orotidine-5'-phosphate decarboxylase 21 13 Op 2 . + CDS 21960 - 22592 893 ## COG0461 Orotate phosphoribosyltransferase + Term 22645 - 22695 3.3 + Prom 22625 - 22684 10.6 22 14 Op 1 26/0.000 + CDS 22715 - 23536 598 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 23 14 Op 2 . + CDS 23546 - 24373 888 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component 24 14 Op 3 1/0.000 + CDS 24383 - 24769 531 ## COG0615 Cytidylyltransferase 25 14 Op 4 3/0.000 + CDS 24778 - 26994 2211 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC 26 14 Op 5 . + CDS 27030 - 27990 796 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC Predicted protein(s) >gi|223714167|gb|ACDT01000048.1| GENE 1 3 - 230 199 75 aa, chain + ## HITS:1 COG:BH3419 KEGG:ns NR:ns ## COG: BH3419 COG0694 # Protein_GI_number: 15615981 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin-like proteins and domains # Organism: Bacillus halodurans # 1 70 10 79 79 76 52.0 1e-14 EVEKVINKLRPYLNRDGGDIELIDFKDGIVYVKMLGACAGCSMLDETLKDGVEQILMEEV PGVLGVQNILEGTEY >gi|223714167|gb|ACDT01000048.1| GENE 2 680 - 1822 1356 380 aa, chain + ## HITS:1 COG:BS_metK KEGG:ns NR:ns ## COG: BS_metK COG0192 # Protein_GI_number: 16080107 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Bacillus subtilis # 2 378 3 395 400 426 54.0 1e-119 MKEKVLFTSESVSKGHPDKVCDQISDAILDACLKEDPNSRVACEVFATTDLVVIGGEITT 
TAEVNYEQVARDVLKDIGYDDSDKGIDYRTCKVQVVMDLQSPDIALGTNDEVGGAGDQGI MFGYACKETKGYMPLPISIAHHLVRYATEKKDTGEFKSARPDMKAQVTIDYTESTPKIDT ILMSIQHDPDFDETEFKRYIKEEIMDAVVRKYNLNTDYKVLINPTGRFVIGGPHGDTGLT GRKIIVDTYGGAARHGGGAFSGKDPSKVDRSAAYMLRYIAKNIVAANLCDKLEIQVSYAI GVKEPTSIFIETYGTEHVDHDIILKAIRDNFDLTPGGIIKTLDLRQPLYLKTAAYGHFGR GDNNLPWEELDKVEILKKYL >gi|223714167|gb|ACDT01000048.1| GENE 3 1949 - 2479 422 176 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148378176|ref|YP_001252717.1| ribosomal subunit interface protein [Clostridium botulinum A str. ATCC 3502] # 1 173 1 172 175 167 47 9e-41 MKISVRGKNIEITEAIEAKISEKLSKLDKYFIVSDNVEAKVLCRVYPYGQKLEVTIPTEY VLLRAEVVDDDLYNAMDLVVDKLEGQIRKYKTRLSRKSKDNKLAFNLSSIEDVESEEDVL VKVKSITPKPMDMEEAIMQMELIGHSFFVYRDVESDSISIVYRRNDGDYGLIETSN >gi|223714167|gb|ACDT01000048.1| GENE 4 2502 - 3446 645 314 aa, chain - ## HITS:1 COG:SA1581 KEGG:ns NR:ns ## COG: SA1581 COG1242 # Protein_GI_number: 15927337 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Staphylococcus aureus N315 # 2 314 3 315 317 355 51.0 5e-98 MNLFKYSDDNKRYHTFNYYLKNKYQSKVAKVSLNAGFTCPNRDGTKGTGGCIFCSSTGSG DFAGNIQDSLEIQFSNVSLLLQNKWPNCKYIAYFQANTNTYGPLKKIKACIEPFINKKNV VAISIATRPDCLETDVLNYLAEVNNRCDLWVELGLQTIHDQTGELINRGHNYQDFLDGLN KLRKLKINVCVHIINGLPGETYEMMLETAKAVGQLDIQALKIHMLYVIEKTKLHQLYLQQ PFKILSRDEYIDLVVKQLSYIPENIVIERLTGDGNISDLVAPLWSIRKVTILNDIDKLMV AKDYYQGCKIKKPI >gi|223714167|gb|ACDT01000048.1| GENE 5 3484 - 5328 1523 614 aa, chain + ## HITS:1 COG:MT1014 KEGG:ns NR:ns ## COG: MT1014 COG1136 # Protein_GI_number: 15840411 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Mycobacterium tuberculosis CDC1551 # 3 211 7 219 248 137 34.0 9e-32 MALQLINITKFYKYGKKRQRVIDDLTIKFPSRGMIGIVGQSGTGKSTLLNIIAGIERPCN GQVLIDGHRLNYQQIRMYQRDYISYVYQFYNLVNALTIKENLILITKIKGIDLKQVSGQL EEYCRTLKVVELLERYPDELSGGQKQRIGLVRAFLCNTPILLADEPTGALNDKLSIEVMR LLKRYAKEHLVIVISHNSRLIKQYTKMIVDLDKKKNYYNFSNERYHKYSPIIIKKQPGRL 
LFYCKRQLKYQFKKIIMMFVSQIFIILAFVLLLSAFNGGWQYLEKKFEEDPLKRVIEVSK NDYINNIFERKEIKKFEKNNKIANVAYKLVFSNGILSSYDEELDVEVYQIFKTKKLDYIS GKYPSKENQILINQTTAKKYKLKIGERINFEINDDEYEFTISGIINDHINIGVNLYLEQQ YLEEELSDDCLDKTVLIFHSDQVDAVIKEYQKDYLLINLHGDYIDNYRSIFDITRFVIFL FIVISFAISLILISVILKTIFIERKRDTSLLLANGLGKGKAINLFCREAILIGGLIGSAG AILAMGVLRLIELFDLSDYLFNIPRLFVLPKYIFTKYDLYVVLIVIYIIACYLAGLQASL KINRMDISILLKEN >gi|223714167|gb|ACDT01000048.1| GENE 6 5330 - 7201 1469 623 aa, chain + ## HITS:1 COG:L119891_1 KEGG:ns NR:ns ## COG: L119891_1 COG1136 # Protein_GI_number: 15672696 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Lactococcus lactis # 1 210 3 225 290 144 39.0 6e-34 MLEIKNLTKRYDETIIKNLNIILPSTGLIIIVGNSGCGKTTLLNLIGGIDQDYEGEILFD HQDIRKIKKYCRRHIGFIFQDFNLINWLNVKENYLLAKFFTKISYKRAVEDQEEKLELRK LNKKRVKILSGGQKQRVALLRAMIKNVDILLCDEPTGSLDDKNAKMVFELLHAEAKERLV IVITHNEQLAYQYADQCFSMQNGQLIGKYRRDKNNHFYSRLVKQHSPLKLYQLALLQYRA NLGRNLKISSGITVALICIMITFTLSGSLQTQIKRQLDSIFPNQLISIQNVHKKLLSYHE LLSLSNLDMNEFIYGEPADYEFMGVSLQSDYEVEKTIYISDMTKPIKSRNLEIGRETKNS NEIVLSKTTATHLNPDYQKLLNQQLYGYYLKDDQIKGVSLTVVGIGKDVTAFDTIYINEL ANLDHISQAFAIDKKQLLFQLAMINLNSKADLDDLLMDLEDQYSQFEFKVAGENINERID TFMLQIQRVLLLFSLLAIVAACFLIGEVLYLSVIEKTKDIGIFKCMGASKLQIMNLVLLE SFTLISGAFICSYVFFYQLVNLINQLVENELQLDLSGAFIQIDYQLVIVIYLGALCFGLC SSYIPAFLAGRLDPIKALKQPNY >gi|223714167|gb|ACDT01000048.1| GENE 7 7497 - 9893 2869 798 aa, chain + ## HITS:1 COG:BH3281 KEGG:ns NR:ns ## COG: BH3281 COG0495 # Protein_GI_number: 15615843 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Bacillus halodurans # 1 796 1 804 806 1150 67.0 0 MPFDHKQIEPKWQKYWDEHKTFKTDCYDDSKPKYYCVDMFPYPSGNGLHVGHPEGYTATD IVSRMKRMQGYNVLHPMGFDSFGLPAEQFAIQTGHHPAEFTKKNIEVFKGQIKSLGFSYD WDREIATSDPEYYKWTQWIFTKLYDAGLAYVDEIPVNWCPELKAVLANEEVIDGKSERGG YPVIRKPMRQWVLKITEYAERLLEDLDDLDWPEATKQMQRNWIGKSVGANVDFRIDGTDK IFTVFTTRCDTLFGATYCVMAPEHPYVAEITTLEQKEAIEAYKESCASKSDLERTELNKD 
KTGVFTGAYAINPVNGKKIPIWISDYVLASYGTGAIMAVPAHDDRDWEFAKKFGIEIIPV LEGGNIEEAAYTEDGLHINSQWLDGLGKQEAIDKMIAWLEENKCGEKKISYKLRDWLFSR QRYWGEPIPIVHMEDGTMRTVPVEELPLELPATKNFQPHDSGESPLANCEDWLEVEIDGQ KGRRETNTMPQWAGSCWYYIRYIDPHNSEVICDPKLLEKWLPVDLYVGGAEHAVLHLLYS RFWHKVLYDAGVVKCKEPWQRLFHQGMILGDNNEKMSKSRGNVVNPDDIVASHGADSLRL YEMFMGPLEAALPWSTNGLDGSRKWLDRVYRLFIEQDKLSDENDHSLDRVYHQTVKKVTD DFETLGFNTAISQMMIFINECYKAETVYKEYAFNFIKMLSCIAPHICEEMWQMLGHDSSI AYETWPTYDENMLVSETVEMGVQVNGKLRAKIQVTKDTDDEAVKEIAFEQENVKAHTEGK NIVKVIVVKNKIVNIVVK >gi|223714167|gb|ACDT01000048.1| GENE 8 10417 - 10644 137 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756748|ref|ZP_02428875.1| ## NR: gi|167756748|ref|ZP_02428875.1| hypothetical protein CLORAM_02295 [Clostridium ramosum DSM 1402] # 1 75 4 78 78 130 100.0 3e-29 MLMINEYYVLLNIKSTIKYSTNGCTGQAVNPQYIAKNLLNREFTTDALNEKRLTDVTEFH YYIRIEKHKIYLQIA >gi|223714167|gb|ACDT01000048.1| GENE 9 10808 - 12553 1926 581 aa, chain - ## HITS:1 COG:CAP0106 KEGG:ns NR:ns ## COG: CAP0106 COG1154 # Protein_GI_number: 15004809 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Clostridium acetobutylicum # 1 580 1 583 586 632 54.0 0 MFLEKIDDTNDLKKLNRKELKELANEVRAALINKISKAGGHSGPNLGMVEMTVALHYVFD SPNDKFVFDVSHQSYPHKILTGRKEAFLDEEHFHDVTGYTNPLESEHDMFTIGHTSTSVS LALGLAIGRDLNKKAENIIAIIGDGSLSGGEALEAIDYAGEYNKNLIIIVNDNDQSIAEN HGGIYQTLKELRDTNGTSNNNIFTSFGLDYKYLDDGHDIEKLIELFESVKNIDHPIVLHI HTIKGKGLPYAEVNKEAWHAGGPFNVADGTYKFVDEDDQTLFNSLTNLLNANKKAIVVNA GTPMGLGFIEPVRKEYVERGQFIDVGIAEENAMAMISGIAKNGGIPVFGTFAPFLQRTYD QLSHDLCLNNNPATILVLSAGVYGMNSNTHIALCDIAMFAHIPNLIYLAPTTKEEYLQMF KYATTQKDHPIAIRVPVQMIESGQEDTTDYSIHNKSQIVQQGNDIAIIGVSNLLPLALKT AQKYKEDTGKNITVINPKFLTGLDEELLNDLTNNHRLIITLEDGELEGGYGHRISSFYGM TEMKVKNYGISKAFHTDFNADELLAQHGISVDQLVTYMKNS >gi|223714167|gb|ACDT01000048.1| GENE 10 12854 - 13645 988 263 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy 
production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 1 260 1 261 264 172 40.0 7e-43 MNEIIIAAIIVGAIGLLIGILLTIASEKFAITVNEKEIQIREELPGNNCGGCGYPGCDGL AKAIVAGEAKINQCPVGGKSVADAIAMIMGVESEETMRMTAFVRCGGTCENATIQYLYHG IKDCNNAAIVPGGGDKQCSHGCMGYGSCVRECEYNAISIIDGVAVINPDLCQACGKCLRV CPNNVITLIPYQKQAVVKCNSNEPALRARKNCAVSCIGCGICARNCPNNAIIIENSLARI DYTKCTNCGLCKEKCPRKCIITA >gi|223714167|gb|ACDT01000048.1| GENE 11 13661 - 14233 637 190 aa, chain - ## HITS:1 COG:VC1017 KEGG:ns NR:ns ## COG: VC1017 COG4657 # Protein_GI_number: 15641032 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Vibrio cholerae # 1 190 21 211 213 164 53.0 1e-40 MTGMILLLISVALVNNVVLSQFLGLCPFLGVSKKIETAAGMGGAVIFVMSIASIATSLVY KILVLFHLEYLNTVTFILVIAALVQLVEMFLKKYSSGLYSSLGVYLPLITTNCAVLGVAI TNVQNNYDILTSLVCSFGTAVGFTIAIVIMAGIREKIADNDIPVSFQGSPIVLITAGLMA IAFIGFSGLI >gi|223714167|gb|ACDT01000048.1| GENE 12 14234 - 14965 705 243 aa, chain - ## HITS:1 COG:FN1593 KEGG:ns NR:ns ## COG: FN1593 COG4660 # Protein_GI_number: 19704914 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Fusobacterium nucleatum # 8 205 7 190 205 187 48.0 1e-47 MNKYVERLYNGIIKENPTLILMLGMCPTLAVTTSSVNGLGMGLTTMIVLAASNFMISFLR KIIPERMRIPSYIVIVASMVTIVQLLLQAYLPSLYDTLGIYIPLIVVNCIILGRAEAYAS KHSCVLSLFDGVGMGLGFTLALTCIGLIREMLGSGTFFGIEFAPLFKDLVGTNIYSFFAH LEPVTIFVMAPGAFFVLAGLTAIQNRLKLPSATNKNKDGNCNHDCLHCRSSDEKALCDLK EEK >gi|223714167|gb|ACDT01000048.1| GENE 13 14958 - 15590 733 210 aa, chain - ## HITS:1 COG:FN1594 KEGG:ns NR:ns ## COG: FN1594 COG4659 # Protein_GI_number: 19704915 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Fusobacterium nucleatum # 9 202 10 172 177 64 30.0 2e-10 MKKNLLKDILVFVVITVAASLSLAMVNSITKDPIARQNEKIIQDSYRSVFKEGQNFDNNK 
QINQLVKNFPNVIKDKKYNFGENGIVIDDVLVVTNDKKELIGHVVKVTTEDGYGGDITLV VGIDTNDEITGIEVLSIDETVGLGMNAKNDNFKNQYYHKSVTDFKITKTGKQNEDEIDAL SGATITSSAFNNAINGALAINQEIKEANHE >gi|223714167|gb|ACDT01000048.1| GENE 14 15590 - 16516 794 308 aa, chain - ## HITS:1 COG:TM0245 KEGG:ns NR:ns ## COG: TM0245 COG4658 # Protein_GI_number: 15643017 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Thermotoga maritima # 10 303 8 314 318 204 42.0 2e-52 MDNLFNVSTAPHIRSSITTKNIMYDVIISLLPATFFGIYQFGFNAFLVITTAIITAVLSE YLYQRLMKLPITIKDGSAAITGLLLALCLSSSLPLWMVALGSIFAVIIVKQLFGGLGQNF MNPALAARCFLLISFAGKMNSFVSIDASTSATPLAMIQNGQNVNLLKMFIGNTSGTIGET SVIALLLGAIYLIYRKVISPKIPLCILFSFSLIILINNNFNSLFLLQQICGGGLMIGAFF MATDYVTSPITPTGKIIYGLVIGILAAVFRLYGNTPEGMSFAIIITNLLVPLIEKVTVPK IKGSGGKA >gi|223714167|gb|ACDT01000048.1| GENE 15 16516 - 17844 1139 442 aa, chain - ## HITS:1 COG:FN1596 KEGG:ns NR:ns ## COG: FN1596 COG4656 # Protein_GI_number: 19704917 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Fusobacterium nucleatum # 1 438 7 441 441 323 40.0 3e-88 MRILSFKGGIHPHDNKSLSANHPIKLLNPGDELIYPLSQHIGAPAKAIVAKGDHVLVGQI IANGNGMISANVASSVSGTVKTIDLRLAPTGFFVDCIVITNDHQYQSIPEYNTTNDYHNY SSMQIRELIAHAGIVGLGGAGFPSAIKNTPKDDDSIKYVIINGAECEPYLTSDYRLMLEE TDKLIAGIKIQLQLFKNAKAIIAIEDNKPEALAHLKKVVVDEDKIEIAALPTKYPQGSER VLIKVLTNQTIVNGMLPADVGCIVSNVASIIAIYEAVALNIPLIEKIVTVTGDAIFQPQN FKVRLGTKYQTLVDACGGFKVHPQKIISGGPMMGTALFNLDVPVSKTSSALVCLSEDEVS KTAPTACIHCGRCVDVCPIGLIPQLLYRYARTGDKENFLKVHGNDCMECGCCTFTCPAKR NMTQSFKKIKKAINDDRKKGAK >gi|223714167|gb|ACDT01000048.1| GENE 16 17972 - 18556 711 194 aa, chain - ## HITS:1 COG:MTH756 KEGG:ns NR:ns ## COG: MTH756 COG1592 # Protein_GI_number: 15678781 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Methanothermobacter thermautotrophicus # 4 183 6 189 197 124 36.0 8e-29 METKTKTNLLRAFAGECQARMRYEMAAKVAGKNGMAILQDMLYFTANQEKEHAIVFYNYL 
KDVFKDANVSMEANYPVDLDQDLSSLLEQAAQHEYDEATSIYKQFGDEAKEEGYVEIATT FYMICEIEHYHHERFKKYHNLLTSGKLFNDQSTVKWMCMNCGFIYEGTDALTICPVCKHP QGYSLRLDESQFHL >gi|223714167|gb|ACDT01000048.1| GENE 17 18693 - 19085 518 130 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734475|ref|ZP_04564956.1| ## NR: gi|237734475|ref|ZP_04564956.1| predicted protein [Mollicutes bacterium D7] # 1 130 1 130 130 225 100.0 6e-58 MQETVLLYNIDKTDAGKAIISILEKLNVEVIIVKSSDLMSPIGYILGADNFERGTEALTE IPQDDMMVMAGFEDKQVDLLLQIFKEANIPFIPLKAIVTQTNVNWTFMQLLKNVKTEYME LTGMNKDMIN >gi|223714167|gb|ACDT01000048.1| GENE 18 19185 - 20129 851 314 aa, chain + ## HITS:1 COG:sll0507 KEGG:ns NR:ns ## COG: sll0507 COG0598 # Protein_GI_number: 16332029 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Synechocystis # 133 277 211 356 387 74 32.0 2e-13 MRYYLEKEITNSYKEGSLVIDILSKEDFINQYQKENHANHVIRHIGDIMYCKVELFYDVI CGAFVIPDKKDLKIKHCFSFYLTEQKMIFIDDNDYVMKLIEKMKDTYTNNRLSLNKFFHD LIFLITVDDGMYLQEYADRLQVIEDKIAFNFNSQINNEIILLRKELLILNSYYNQLGDMI DILSDNESDFLNDYECHLFSMQARRIDRLDNQLNALKDYSLQIKEMYQNKIDTHQNKIMT VLTVATTIFFPLSIITGWYGMNFKNMPELNNPYGYIIVIGASILVVIIEMLILKKKKILL EKNVDVVNKKRYSK >gi|223714167|gb|ACDT01000048.1| GENE 19 20292 - 20897 1045 201 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237734477|ref|ZP_04564958.1| 30S ribosomal protein S4 [Mollicutes bacterium D7] # 1 201 1 201 201 407 100 1e-113 MSRYTGPQWKLSRRLGFSTLENGKELNKRAYAPGQHGQKRKKQTEYGLQLAEKQKVRHMY GVNEKQFHNTFKQAGKMEGIHGYNFFCLLERRLDNVVYRLGFAPTRRAARQLVNHGHFLV NGIKTDIPSFRVKVGDVIEVKEKSKNLAIIKASLEARVHTPAFVEVDDTKLTGKLARLPE RSELNQEINESLIVEYYNRMG >gi|223714167|gb|ACDT01000048.1| GENE 20 21240 - 21947 753 235 aa, chain + ## HITS:1 COG:BS_pyrF KEGG:ns NR:ns ## COG: BS_pyrF COG0284 # Protein_GI_number: 16078619 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Bacillus subtilis # 1 235 1 236 239 238 53.0 7e-63 MTDSRICIALDFQNKAEVKEFLEKFNDEKLYVKVGMELFYGEGIEIIKMIKEMGHNIFLD 
LKLHDIPNTVKSAMKQLAKLEVDMVNVHASGSIAMMKAAIEGLEAGKTGDKRPLCIAVTC LTSLDQEVLDNELLINDTLENVVLKWATNAKEAGLDGVVCSPLESKVIHDNLGMEFITVT PGIRLADDSVNDQKRVTTPAMARELTSSYIVVGRTITGSADPYATYKKVYQDFQG >gi|223714167|gb|ACDT01000048.1| GENE 21 21960 - 22592 893 210 aa, chain + ## HITS:1 COG:BS_pyrE KEGG:ns NR:ns ## COG: BS_pyrE COG0461 # Protein_GI_number: 16078620 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Bacillus subtilis # 1 210 7 216 216 283 64.0 2e-76 MKELIAKDLLDIQAVFLRPNEPFTWASGIKSPIYCDNRLTLSYPKVRKDVETGLAKLVQE NFPEAECLMGTATAGIAHAALVADILELPMGYVRGGAKSHGRNNRIEGKVEPGMKVVVVE DLISTGGSSLECVEALKEAGCEVIGMIAIFTYGLPKATINFEAAECKYVTLTDYDTLIGV AKENDYIKEADMEKLKAWKKDPSDESWMSK >gi|223714167|gb|ACDT01000048.1| GENE 22 22715 - 23536 598 273 aa, chain + ## HITS:1 COG:lin1062 KEGG:ns NR:ns ## COG: lin1062 COG1682 # Protein_GI_number: 16800131 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Listeria innocua # 2 273 1 267 267 167 35.0 1e-41 MLKSIKYVLKENFTNLFRIYSISKYELLSDIRDSRLGIFWNFANPLIQILTYYFVFGLVM NRKDVDGIPFIQWMLAGMVVWFFINPCITQGANAIFSKTGVITKMKFPVSVLPATVVLKE LFNHACILLLMIIFYCTQGVYPSIEWLGIIYYCFAACCFAVSLAMITSVLNMIARDTRKL ILACMRLILYLTPILWPISRFSEQTLLFDIIRFVMKINPIYYIVCGYRDCFLYHLGFMHY WKQMLAFWGITGILFVVGCYMMYKFKHKFIDMI >gi|223714167|gb|ACDT01000048.1| GENE 23 23546 - 24373 888 275 aa, chain + ## HITS:1 COG:lin1063 KEGG:ns NR:ns ## COG: lin1063 COG1134 # Protein_GI_number: 16800132 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate transport system, ATPase component # Organism: Listeria innocua # 2 247 3 254 335 238 47.0 1e-62 MENYAIKFENVSKIYKLKKKNDGNTKSSETKRFYALKNISFEIPQGEVVGILGTNGSGKS TLSLILAGISEIDEGVMHINGEQSLVAINTGLNKQLTGLENINVKGALLGLSKKQIQEII 
DGVIEFAELGDFLYQPVKKYSSGMKSRLGFSISLYLNPDIIIVDEALSVGDKGFAQKCIN KMNELKDEGKTIIFISHSLPQVRNFCQTAMWIEGGMLKEYGEVGEVCDHYGEYVDYYNSL TNKEKAKIRDNKFKKRIVQNPQIGFWEKVLDRIKG >gi|223714167|gb|ACDT01000048.1| GENE 24 24383 - 24769 531 128 aa, chain + ## HITS:1 COG:lin1076 KEGG:ns NR:ns ## COG: lin1076 COG0615 # Protein_GI_number: 16800145 # Func_class: M Cell wall/membrane/envelope biogenesis; I Lipid transport and metabolism # Function: Cytidylyltransferase # Organism: Listeria innocua # 1 126 1 126 127 183 74.0 6e-47 MKKVITYGTFDLFHVGHLNIIKRAKALGDYLIVAVSSDAFNAQKGKKAYHSDHDRKLILE AIRYVDEVIFEESWDQKIKDVQEHDVDVFVMGDDWEGKFDFLKDYCEVVYLPRTDGISTT KIKDDLHK >gi|223714167|gb|ACDT01000048.1| GENE 25 24778 - 26994 2211 738 aa, chain + ## HITS:1 COG:BS_tagF_2 KEGG:ns NR:ns ## COG: BS_tagF_2 COG1887 # Protein_GI_number: 16080625 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Bacillus subtilis # 376 738 19 385 387 347 47.0 5e-95 MKLSIIIPYDRYKQYLHDCLESVYEQRLNDYETLLVVNNKKEIDQDIKAYDINLKILEAG ENSSVAKKRNIGLEQASGEYIYFIDCDDYLMPNTLKLLITCANQNDLDLVAGIRRFTWFK KKVFETMGDEKNNELNLKDKDHDYDDKFIEKVYDQNNVDEYMVDVLIRSRHAIRNISVLN VLIKKKLIDEHQIRFNEDFFYYTDVPFVIQLVNYAKSFDRVEQAIFVKRKHNDPINTPAL SQIKDSENKFDEFLMAYRYSASLVAPDSYIRYYLDSKMLRYYTKFFARKIRRSKDDIWRN ERFKAMGEILKNIHPRLLNKSSRYSKKMIKAGLNGDLELAKKTVSRHLAKIKFKKILKNK NEMNKYLYRHKYINKPLEENTVMFETFMGKSYADSPKYIYEYLAKNYPGKYKFIWVLNDP KEKLPYEGKIVKRFTREYAYYLGVSKYFVFNIRQPLWFRKREEQVFLETWHGTPLKRLAF DQEEVTAASPTYKAQFYRQKQEWDYLIAPNKFSSDIFKSCFMYDGNMLETGYPRNDLLSL PNRDAIALELKKKLGIPLDKKTILYAPTWRDDEYYGNGKYKFKLKLDLDLLKQQLGDEYV VLLRTHHYIADALDVTGLEEFAYNLSKYDDITEIYLISDICITDYSSVFFDYANLKRPML FYTYDLDKYRDVLRGFYIDMETELPGPLVYTTEEVIDKIKNLNSLNQEYQQRYEQFYERF CSWEDGNAAMRVVEAVFK >gi|223714167|gb|ACDT01000048.1| GENE 26 27030 - 27990 796 320 aa, chain + ## HITS:1 COG:L151480 KEGG:ns NR:ns ## COG: L151480 COG1887 # Protein_GI_number: 15672908 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Lactococcus lactis # 3 320 12 321 395 281 45.0 1e-75 MKILKKLLGKLIRILYRFVYRFIPCQEKTILFISFHGRGYSDNPMAIHQYLSKHSQYADY RCIYAIKNHKQKKLKIENARIIEYFSIAYFFYLARSKYWIANCKLPKYVLKKDSQVYLQT WHGTPLKKLAHDIEVPEGTTFYRSEMSVEEMRSTYDNDVSKYNYMISPSAFTTEVFQSAF AIERERLIETGYPRNDILSNYNSDDIKKIKDKLNLPEGKKIILYAPTWRDNSYNLKGYTF KLEVDFKKWQKILGTDYIVIFKPHYLIVNDFDLEAVKEFVYYIDPKEDISSLYLIADVLV TDYSSVFFDYAILKRPIYFF Prediction of potential genes in microbial genomes Time: Thu May 26 09:44:16 2011 Seq name: gi|223714166|gb|ACDT01000049.1| Coprobacillus sp. D7 cont1.49, whole genome shotgun sequence Length of sequence - 68231 bp Number of predicted genes - 66, with homology - 66 Number of transcription units - 35, operones - 16 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1236 - 1277 9.2 2 2 Tu 1 . - CDS 1279 - 1935 696 ## COG0637 Predicted phosphatase/phosphohexomutase - Prom 1960 - 2019 6.8 + Prom 1962 - 2021 5.5 3 3 Op 1 . + CDS 2044 - 2379 236 ## COG2337 Growth inhibitor 4 3 Op 2 . + CDS 2433 - 3281 957 ## gi|167756770|ref|ZP_02428897.1| hypothetical protein CLORAM_02317 5 3 Op 3 . + CDS 3294 - 4025 370 ## COG1266 Predicted metal-dependent membrane protease + Prom 4086 - 4145 5.8 6 4 Tu 1 . + CDS 4181 - 4693 405 ## COG1943 Transposase and inactivated derivatives + Term 4770 - 4811 6.2 - TRNA 4856 - 4931 81.3 # Lys CTT 0 0 + Prom 4780 - 4839 7.7 7 5 Op 1 . + CDS 5038 - 6612 1425 ## Cbei_1560 HAD family hydrolase 8 5 Op 2 . + CDS 6590 - 7108 178 ## PROTEIN SUPPORTED gi|52081538|ref|YP_080329.1| ribosomal protein S2 + Term 7324 - 7391 30.2 + TRNA 7290 - 7380 65.9 # Ser GCT 0 0 + TRNA 7388 - 7464 91.2 # Asp GTC 0 0 - Term 7277 - 7346 30.7 9 6 Tu 1 . - CDS 7517 - 8050 658 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 8167 - 8226 9.4 + Prom 8062 - 8121 7.2 10 7 Op 1 . 
+ CDS 8266 - 9234 212 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family 11 7 Op 2 . + CDS 9297 - 10217 388 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 + Term 10237 - 10291 1.2 + Prom 10220 - 10279 5.3 12 8 Op 1 . + CDS 10305 - 11438 1286 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain 13 8 Op 2 1/0.167 + CDS 11438 - 13633 1834 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member + Prom 13768 - 13827 9.2 14 9 Op 1 4/0.000 + CDS 14055 - 16664 2939 ## COG0013 Alanyl-tRNA synthetase 15 9 Op 2 6/0.000 + CDS 16725 - 16976 354 ## COG4472 Uncharacterized protein conserved in bacteria 16 9 Op 3 7/0.000 + CDS 16978 - 17397 479 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 17 9 Op 4 . + CDS 17410 - 17757 427 ## COG3906 Uncharacterized protein conserved in bacteria 18 9 Op 5 . + CDS 17812 - 18903 1114 ## COG2706 3-carboxymuconate cyclase + Prom 18908 - 18967 3.2 19 10 Op 1 4/0.000 + CDS 19020 - 20069 970 ## COG1559 Predicted periplasmic solute-binding protein 20 10 Op 2 4/0.000 + CDS 20071 - 20670 608 ## COG4122 Predicted O-methyltransferase 21 10 Op 3 . + CDS 20667 - 21101 475 ## COG0826 Collagenase and related proteases 22 10 Op 4 . + CDS 21128 - 21601 360 ## gi|167756787|ref|ZP_02428914.1| hypothetical protein CLORAM_02336 23 10 Op 5 3/0.000 + CDS 21585 - 22829 1232 ## COG0826 Collagenase and related proteases 24 10 Op 6 4/0.000 + CDS 22829 - 23449 625 ## COG0572 Uridine kinase 25 10 Op 7 . + CDS 23514 - 23999 661 ## COG0782 Transcription elongation factor + Term 24012 - 24060 3.0 - Term 24075 - 24129 4.4 26 11 Tu 1 . - CDS 24276 - 24632 202 ## gi|237734509|ref|ZP_04564990.1| predicted protein - Prom 24692 - 24751 7.0 + Prom 24599 - 24658 8.5 27 12 Tu 1 . 
+ CDS 24682 - 25314 610 ## COG2323 Predicted membrane protein 28 13 Op 1 13/0.000 - CDS 25329 - 26123 270 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 29 13 Op 2 9/0.000 - CDS 26116 - 27036 1067 ## COG4120 ABC-type uncharacterized transport system, permease component 30 13 Op 3 . - CDS 27036 - 28004 1161 ## COG2984 ABC-type uncharacterized transport system, periplasmic component 31 14 Tu 1 . - CDS 28424 - 29218 675 ## COG0428 Predicted divalent heavy-metal cations transporter + Prom 29155 - 29214 6.1 32 15 Tu 1 . + CDS 29325 - 29699 475 ## COG1321 Mn-dependent transcriptional regulator + Term 29700 - 29740 8.1 - Term 29683 - 29730 12.2 33 16 Tu 1 . - CDS 29732 - 30988 1266 ## gi|167756798|ref|ZP_02428925.1| hypothetical protein CLORAM_02347 - Prom 31085 - 31144 7.2 + Prom 31032 - 31091 10.2 34 17 Op 1 . + CDS 31112 - 33256 1676 ## PROTEIN SUPPORTED gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase 35 17 Op 2 . + CDS 33256 - 33813 621 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation + Term 33841 - 33881 6.1 36 18 Tu 1 . - CDS 33853 - 34899 814 ## COG0789 Predicted transcriptional regulators - Prom 34953 - 35012 10.1 + Prom 35014 - 35073 11.4 37 19 Tu 1 . + CDS 35171 - 36286 1490 ## COG0205 6-phosphofructokinase + Prom 36637 - 36696 7.2 38 20 Op 1 36/0.000 + CDS 36719 - 38158 1381 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 39 20 Op 2 . + CDS 38161 - 38832 226 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) - Term 38823 - 38850 -0.8 40 21 Tu 1 . - CDS 38853 - 39728 685 ## COG1737 Transcriptional regulators - Prom 39776 - 39835 9.2 + Prom 39742 - 39801 8.0 41 22 Tu 1 . + CDS 39880 - 41292 1651 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Prom 41332 - 41391 2.8 42 23 Op 1 . 
+ CDS 41432 - 41644 353 ## EUBELI_00572 hypothetical protein 43 23 Op 2 22/0.000 + CDS 41667 - 41888 252 ## COG1918 Fe2+ transport system protein A 44 23 Op 3 . + CDS 41915 - 44098 2762 ## COG0370 Fe2+ transport system protein B 45 23 Op 4 . + CDS 44113 - 44319 148 ## gi|237734527|ref|ZP_04565008.1| predicted protein + Prom 44323 - 44382 11.1 46 24 Op 1 . + CDS 44410 - 45243 1032 ## COG0561 Predicted hydrolases of the HAD superfamily + Prom 45351 - 45410 11.2 47 24 Op 2 . + CDS 45439 - 45639 333 ## COG1278 Cold shock proteins + Prom 45672 - 45731 5.8 48 25 Tu 1 . + CDS 45778 - 47667 2312 ## COG0326 Molecular chaperone, HSP90 family + Term 47684 - 47714 -0.4 + Prom 47717 - 47776 6.5 49 26 Op 1 . + CDS 47803 - 48369 521 ## DhcVS_1159 hypothetical protein 50 26 Op 2 . + CDS 48366 - 50000 1812 ## cbdb_A1333 auxin-responsive GH3 protein 51 26 Op 3 . + CDS 49993 - 52527 2109 ## COG2206 HD-GYP domain 52 26 Op 4 . + CDS 52540 - 53076 592 ## COG0778 Nitroreductase + Prom 53089 - 53148 8.2 53 27 Op 1 4/0.000 + CDS 53178 - 55616 2482 ## COG2217 Cation transport ATPase 54 27 Op 2 . + CDS 55609 - 55860 367 ## COG1937 Uncharacterized protein conserved in bacteria + Term 56003 - 56045 8.9 55 28 Op 1 . - CDS 55882 - 56934 558 ## gi|237734537|ref|ZP_04565018.1| predicted protein - Term 56984 - 57010 0.3 56 28 Op 2 . - CDS 57012 - 58682 1759 ## COG1283 Na+/phosphate symporter - Prom 58711 - 58770 8.0 + Prom 58728 - 58787 16.0 57 29 Tu 1 . + CDS 58847 - 60562 228 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 61327 - 61386 7.1 58 30 Op 1 . + CDS 61435 - 62235 620 ## COG3741 N-formylglutamate amidohydrolase 59 30 Op 2 . + CDS 62241 - 63119 653 ## DSY0046 hypothetical protein 60 30 Op 3 . + CDS 63128 - 63289 99 ## Blon_0991 hypothetical protein 61 31 Tu 1 . + CDS 63767 - 63934 152 ## gi|167756827|ref|ZP_02428954.1| hypothetical protein CLORAM_02376 - Term 64175 - 64218 1.1 62 32 Tu 1 . 
- CDS 64271 - 65539 1466 ## COG0148 Enolase - Prom 65613 - 65672 10.5 + Prom 65490 - 65549 8.2 63 33 Tu 1 . + CDS 65762 - 66556 802 ## COG0796 Glutamate racemase + Term 66702 - 66739 -0.9 + Prom 66634 - 66693 7.4 64 34 Op 1 . + CDS 66801 - 67001 255 ## gi|239623914|ref|ZP_04666945.1| conserved hypothetical protein 65 34 Op 2 . + CDS 67087 - 67587 380 ## Dred_1525 signal transduction histidine kinase, nitrogen specific, NtrB + Prom 67736 - 67795 4.4 66 35 Tu 1 . + CDS 67895 - 68101 217 ## gi|167756834|ref|ZP_02428961.1| hypothetical protein CLORAM_02383 Predicted protein(s) >gi|223714166|gb|ACDT01000049.1| GENE 1 161 - 1264 935 367 aa, chain + ## HITS:1 COG:L152603 KEGG:ns NR:ns ## COG: L152603 COG1887 # Protein_GI_number: 15672909 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Lactococcus lactis # 1 363 1 369 371 266 39.0 4e-71 MGIKKIIINIILKIANIFVLRCSIKQGQIAFISLESNQLESDLELIYQELKKENKYSLKM VLINYNKKSLINNFLYMLNCIKQIYVINTSKVVLITDNNYVISNFKRPGVKVIQVWHATG AIKKFGNAIKREYPIKNYDYVIANSDYWKQPYQEAFNVSEQGVIITGMPRVDHLVDPYYL TQAKARMYSRYPILKGKKVILYAPTFRGNIYQGFKTVDFDAQAVLNALGEEYVIIFKFHP LLLKTTLSDDSRVINMNHENTHDLFAVCDVLISDFSSIIFDYSLLNKPMCFFVPDLDEYI ETLGCFVDYRRVMPGAICYNETQVIAALQGNKKYDIKAFRDMFFKYQDGHNTQRIVMFID KLINTKN >gi|223714166|gb|ACDT01000049.1| GENE 2 1279 - 1935 696 218 aa, chain - ## HITS:1 COG:VCA0102 KEGG:ns NR:ns ## COG: VCA0102 COG0637 # Protein_GI_number: 15600873 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Vibrio cholerae # 4 218 8 219 219 131 34.0 9e-31 MITAVIFDMDGLMIDSERVTFEGYKHVLAKHNLTLSLEAYKTLLGKPVKAVYELFHKDYG DDFDVEETIKAVHQYMADLFENEGVPLKEGLIELLKYLKENDYKTIVATSSQRHRVDHIL ELSGLQKYFDDSICGDEVTKGKPDPEVFLKSCQKLGITPDEALVLEDSESGINAAYSAGI KVICIPDLKYPDHKFAIMTNKIMDNLSNVRDYLANENH >gi|223714166|gb|ACDT01000049.1| GENE 3 2044 - 2379 236 111 aa, chain + ## HITS:1 COG:lin0887 
KEGG:ns NR:ns ## COG: lin0887 COG2337 # Protein_GI_number: 16799960 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Listeria innocua # 1 110 2 112 115 122 54.0 1e-28 MIHRGEIYYADLSPVVGSEQGGYRPVIVLQNNKGNRYSTTVIIAPISSRLTKNPLPTHVM VDCSSLEKKSVVLLEQIRTIDKQRIKEKVGMIDSQVMNLINQAIKTSLDIK >gi|223714166|gb|ACDT01000049.1| GENE 4 2433 - 3281 957 282 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756770|ref|ZP_02428897.1| ## NR: gi|167756770|ref|ZP_02428897.1| hypothetical protein CLORAM_02317 [Clostridium ramosum DSM 1402] # 1 282 1 282 282 413 100.0 1e-114 MNVRDIKLKAKSVLVNRNNIVTVFVFISVITTLANYIGSEIGGVIPFLSLIVTIIMLPFG HGNVVTALKTVNEQGDEVTIEQDGVVGLRRFKDLFGTYFIREILLLVIMMLLGLIIFVIA RFTVDSNVFSQLGAIIEQAAVYSTNVSAYLNDPTVIEALGTLGVIFFIGIIIIAIVAFIY SLVFALTPFVLEKYNIRGAKAMSESARLMKGHKRTLFMLYLSYLGWAILVLVLSAVVEMI LPIPLILNIIVAALTTYLFGAELNTSLAVFFEEIDLEDKNNI >gi|223714166|gb|ACDT01000049.1| GENE 5 3294 - 4025 370 243 aa, chain + ## HITS:1 COG:SP0181 KEGG:ns NR:ns ## COG: SP0181 COG1266 # Protein_GI_number: 15900118 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Streptococcus pneumoniae TIGR4 # 54 240 34 220 225 61 26.0 2e-09 MNLIKKIKLNSRQKAIGVILIFPWYLYFAPIVINYFLKLYTVYIANDFSTNSLNAFFNLF IGIATAIPLIIVFKGFIKENWLVFKKDFLENIIWVLTIGIGLAYLFSIAGELIVNLVAPQ SGEAANQTLVETLVQSNFLLMFFQSVIIAPFIEELLFRGLIFNSLRQKNMVWAHLISAFL FGLLHVYSYILAGDMSEWIKLIPYMMAGLSFSIVYEKRQTIIAPIILHAAKNLIAVLLMA TMF >gi|223714166|gb|ACDT01000049.1| GENE 6 4181 - 4693 405 170 aa, chain + ## HITS:1 COG:SP1064 KEGG:ns NR:ns ## COG: SP1064 COG1943 # Protein_GI_number: 15900934 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pneumoniae TIGR4 # 2 152 3 152 157 200 60.0 1e-51 MKDTKSLSHTSYRCKYHIVIVPKFRRMAIYNKLRHDIRLIIKTLIERKPGVELIEGELCP DHIHLLLEIPPKYSVAEFMGYLKSKSTLMIFDRHASMEYKYGNRKFWARGYFVDTVGKNE KTVREYIQNQLAEDKLSDQMSMEEFIDPFTGEPVEGYKKKKKSKRPFEGQ 
>gi|223714166|gb|ACDT01000049.1| GENE 7 5038 - 6612 1425 524 aa, chain + ## HITS:1 COG:no KEGG:Cbei_1560 NR:ns ## KEGG: Cbei_1560 # Name: not_defined # Def: HAD family hydrolase # Organism: C.beijerinckii # Pathway: not_defined # 7 219 1 215 221 72 22.0 4e-11 MFYKQTINNIKVVILSLDGGLLDLNRLRFNYFKRICKTHNFELTKDIFEEALGNMKTMYN DFPISKDINPDDINKLIERDLYEYAKLKPETIKKEGTDELLQFFKQKEIKIAVVSTHKTK RAIQYLQLTRLYDKVDFVIGGDSDNLPLPDTAVLKTTLEQMNCKPSEAIVIANYPNLLYA ANQCLINVIYLSDLCLVQESIGPRVFKIARNNLEIINIFLFARYDTMEIFSPLLGMSSDM DIETLDKTYAKLLEEYQNDQQLIELVRNTYHYFLGEIANREADTVNTSLFEDEDVIVAPV DDEHFSNPQSIATLEKSIEEVLESEEVSETDNEIADTVFKRENLYQDDEFSKTRTAIGCD PERVNELMDIINGNSGTKLEEGELFEKSNDEILLEKRGLLSYFINFVYTVIVVGLISFLG LIIFIGFEDFIQRPGIVAGIIKSIIDFYVNLVLSIYAVIFNSLNALLSFIPDYSSLIAGN GLLSTMAVKLVLFIIFNVIIVYIVKMIYLLITEAEENDANFAED >gi|223714166|gb|ACDT01000049.1| GENE 8 6590 - 7108 178 172 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|52081538|ref|YP_080329.1| ribosomal protein S2 [Bacillus licheniformis ATCC 14580] # 66 172 67 174 174 73 37 4e-12 MRILQKIKEIDVKTSHFFPQFYHHRFFSGAMKFIAGIGDFGMIWLVLILLLSLNDKTQLL SQKMLAALILATVIGQVTIKSIVRRKRPCHTYHDVEMLIPVPSDYSFPSGHTTSSFACAT TVCFFFPRVGFICMIFAALMAFSRLYLFVHYLSDVTFAIVLGISVGIIVMLF >gi|223714166|gb|ACDT01000049.1| GENE 9 7517 - 8050 658 177 aa, chain - ## HITS:1 COG:BS_yqkG KEGG:ns NR:ns ## COG: BS_yqkG COG0494 # Protein_GI_number: 16079418 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Bacillus subtilis # 2 167 6 172 185 136 46.0 2e-32 MEKKISSEIIYDGKIITVYKDKVECENGNLATREVVRHHGGVGVLAIVDGKILLVKQYRY PNAVDTLEIPAGKLELNEDTAICALRELEEETGYSAKKISLISKFLPTPGYSDEWLYVYE AHDVYKVENPLECDEDEVIELIKMDIDTAYHKVVNGEIFDSKTMIAILHAYINKNKS >gi|223714166|gb|ACDT01000049.1| GENE 10 8266 - 9234 212 322 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 15 295 18 269 287 86 26 4e-16 
MKKLKITENDANQRIDKYLKKLLVNAPNNFIYKMFRKKDIKVNGKKVDEKYILSLDDEVE MFLYDDKFKEFTETKSIYEVNRTFSVLYEDKHVLIVFKPAGLLVHEDANESINTLTNQVL SYLASKNELDLSRENTFMPGPVHRLDRNTSGIVIFGKTLAALQNLNEMIKQRHCIEKKYL TICRGQLSHQRNLHGYMVKLENQAQVKLVKKDYPGALTMETIVRPLKYNCDYSLVEVTLI TGRMHQIRVHLSSIEHPIIGDRKYGDFELNKYIKKTFGLNNQLLHAYKIKFVKTFGVLNY LQDKEIICPVPALFEKIEKSLL >gi|223714166|gb|ACDT01000049.1| GENE 11 9297 - 10217 388 306 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 4 305 5 317 323 154 32 2e-36 MRYYIGIDLGGTNVRTLLVDENGKTYSEVKDNTERENGPDYVCAKIIRQIESLDTSICGG LQGVSGIGIGVPGPVDTVKGTMIMATNLPGFENYPICDKLADRFNLPTFIDNDANVAGLA EALLGAGKDYPTCYYVTISTGIGGAFIVDGKLVSGGRGHAGEIGNIIVKNNGYKFGGLNP GAAEGETSGTAITRKGKELLGEDRVNHAGDVFRLASEGDLKAQSIVDECISELATMFSNI AHTVDPHCFVIGGGVMKSREYFYDRLVEQFNSKIHVGMRGYIPLLGTKLEDCGAIGAAML PMSRLG >gi|223714166|gb|ACDT01000049.1| GENE 12 10305 - 11438 1286 377 aa, chain + ## HITS:1 COG:lin1547 KEGG:ns NR:ns ## COG: lin1547 COG0482 # Protein_GI_number: 16800615 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Listeria innocua # 4 377 9 370 371 441 57.0 1e-123 MSKRVVLGLSGGVDSAVAAYLLKEQGYDVIGVFMRNWDSQLNNDILGNPTNDNDICPQEE DYNDAKAVAECLGIPIKRVDFIKEYWDHVFTYFLDEYRKGRTPNPDILCNKHIKFKAFLD FAKTLNADYIATGHYAQVKHYQGADSVMLKGLDNNKDQTYFLCQLNQQQLQNSLFPLGEI DKTEVRRIARELNLPVADKKDSTGICFIGERDFKEFLQNYIPAQAGKMVDIETGNVIGDH SGIMYYTIGQRKGLGIGGPGDAWFVVGKDYDKNVLYICQGDQKDWLYSTGALITDVNWIA STKPIDKIACNAKFRYRQPDNPIQLEFIDEDTVYLTFDKPIKAVTPGQAAVFYDGDICLG GGTIETVYKDGQPIKYL >gi|223714166|gb|ACDT01000049.1| GENE 13 11438 - 13633 1834 731 aa, chain + ## HITS:1 COG:BS_yrrC KEGG:ns NR:ns ## COG: BS_yrrC COG0507 # Protein_GI_number: 16079801 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Bacillus 
subtilis # 6 718 18 755 798 443 38.0 1e-124 MIEYTGYIKKIRFYSESSNYIVALIEVEQEDKLITMNGYMNNFNDYEKYAFIGDYEIHPK YGKQFKLSEYRIIYAKESEEIIKYLSSPLFKGIGKALATQIVNTLGEECLEKIKEDKHNL DLVRGMTEKRREIIYEALTNGDYDQEVMQFFMGHGISLKNLGLIQAYYQEKTLEILQNNP YQLVEDIDGIGFKTADELALKTGGTLDNPNRIKAGIIYSIKQYGFNTGSTYCYLDEIKIM FSKIIYNIEEVSFNEYLDELIDEGLIIQRGDKYYYFEMDEAEKNIAEYLKIRINKPDELF DEKEVERLLTNYEKTQGICYAAKQKEALNYFLKSSCMILTGGPGTGKTTIVQALLKVYSV LYPDDRIGLVAPTGRAAKRLTELTGIYACTIHRLLKWDLHSNTFAMNKSNPLDLDVLIID EFSMVDCLLLSKLFEAGRGINKVLFIGDYHQLPSVAPGNILQDLMEAGVKTIELDEIFRQ AKDSGIIQLAHHIIHNEIENMDLFEQYRDINFFPCINYDVVKNVKIIVKKAIDEGYDTND IQVLVPMYQGVAGIDALNDALQDVFNPIDEFNDSYQIGRKEYRVGDKILQLKNRPDDDVF NGDIGTLVEICRKDNFEYLQDTLIVDFDGNFVEYTSNTFNTITHAYCMSIHKSQGNEFKI VIMAVLSDYYIMLRRNLLYTAITRAKQSLFILGSSKAFMHGLANYQDSRRKTSLKSRFKT IETLNVYDFLE >gi|223714166|gb|ACDT01000049.1| GENE 14 14055 - 16664 2939 869 aa, chain + ## HITS:1 COG:L0343 KEGG:ns NR:ns ## COG: L0343 COG0013 # Protein_GI_number: 15673706 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Lactococcus lactis # 1 851 1 853 872 837 49.0 0 MKQLTGNQVRKMFLDFFESKGHKVEPSHSLIPNDDPTLLWINAGVAAIKKYFDGTITPDN PRITNAQKSIRTNDIENVGKTARHHTFFEMLGNFSIGDYFKKDAIDFAWELLTDEKWFAF EKDKLYITVHDHDEEAYDYWVNVKGVAVDHMLKTPGNFWEIGEGPCGPDSEIFYDRGEKY DPDGLGTKLFYEELENDRYIEIWNLVFSQYNSQEGVAREDYKELPQKNIDTGMGLERITS IIQGGETNFDTDFFLPIIHEVEKLANVSYQENKMAYRVIADHIRTVTFALADGALFDNAG RGYVLRRILRRAVRYGKQIGIEKAFMYDLVDVVSEIMKEFYDYLPEKVSYISDLVKKEEE AFHKTLTNGEKLLSSIIAKNTDGVISGKDAFKLYDTYGFPFELTLEIAEESDLKVDEEAF RAELKVQQERSRGARTDSESMASQKPDLMAFDLPSNFEYDPTNINSVVIGLFKDGVKTDR IDDYGEVIFDTTTFYAEMGGQCADTGYVYNENCQGEVINVLKAPNQQHLHFVKLKTGTIR VGDILTLDVDKAKRNKIIANHSATHILQAALKEIIGKHINQAGSYVDDHRLRFDFTHFEK ITNEQLKLIENQVNNVIFDGVAVDIAHMTKDEAINSGAMALFDEKYGDKVRVVSIGDYSM ELCGGCHVSNSANIGLFKIEAEESVGSGVRRIEAVTGKTAYQALVQEKETVDIISNILKL KNRKEVVAKVTALTEELSNTKKEIEILSGQLNALNAASKANDIQEINGVKVLFVEEMIDA AKAKQLAFDFRDKIEEGIVILVTQFEDKCSYFVGVTKNYVATGYKAGDIIKKINAVVDGR GGGKPDFAQGGCPINDKISNIKGELKNFF 
>gi|223714166|gb|ACDT01000049.1| GENE 15 16725 - 16976 354 83 aa, chain + ## HITS:1 COG:lin1538 KEGG:ns NR:ns ## COG: lin1538 COG4472 # Protein_GI_number: 16800606 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 13 82 14 83 90 75 57.0 3e-14 MQDLTKTKVFSSEDMRREMIRQNLKTVIDALNERGYNAVHQIAGYLISNDPAYISSHKNA RNIIQQIERDEIIEELVKSYLEK >gi|223714166|gb|ACDT01000049.1| GENE 16 16978 - 17397 479 139 aa, chain + ## HITS:1 COG:BH1269 KEGG:ns NR:ns ## COG: BH1269 COG0816 # Protein_GI_number: 15613832 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) # Organism: Bacillus halodurans # 3 139 2 137 140 137 53.0 4e-33 MERILGLDLGSRTCGIAISDTLGMLAHGIETYRFRDDDYDRAAKHVVALINEHQIKTVVL GLPKHMNGDLGERAQISIMFKEMLEKEVPGLNVVLIDERLTTKVAQDQLIFADVSRKKRK QVIDKMAAVAILQGYLDAQ >gi|223714166|gb|ACDT01000049.1| GENE 17 17410 - 17757 427 115 aa, chain + ## HITS:1 COG:BS_yrzB KEGG:ns NR:ns ## COG: BS_yrzB COG3906 # Protein_GI_number: 16079792 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 6 82 8 92 93 63 50.0 8e-11 MEANKIQVIDDQGNEKEFEVLFTFNNEELGKQYVLYYDTTVEEPSVFASIYDDAGQLFPI ETPEEWEMVEEVFQSFMAESEEGHECCGNHGNGCCHDNDEEHECCGNHGDCNCDN >gi|223714166|gb|ACDT01000049.1| GENE 18 17812 - 18903 1114 363 aa, chain + ## HITS:1 COG:BS_ykgB KEGG:ns NR:ns ## COG: BS_ykgB COG2706 # Protein_GI_number: 16078366 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Bacillus subtilis # 17 356 7 343 349 125 27.0 2e-28 MALMGLFGKKRVTESFFVNSYGNDQTRGIYTFQVDIENGEILYKKHFKTPSDPVYSFNYG RFVCVTYKNRTGTSGDGGICSYAATADILALVSRISDKGKTYMHACANGDHETADKLYAV DYYNGEIMVAKIDKKKLVAPIFNYKLEGHSIDPKRQNQPHPHFVDFTPDGKRLIVVDLGL DKVLLFKMEAKEIILDEEHSFDLEPGSGPKKIMFNQAGTIAYVLNELSNTICVYKYNDLK 
FELIQTIDTYPKDEYDEPSLAGQMLFSEKEERIFVTNRGHDSLALFLVDQETGLLTYKDF VDTSPNARDIAIFKDRWIVAVCQKGGVVESYEYRDERGGMLFETKYSYLVSEPVCITKFE TIY >gi|223714166|gb|ACDT01000049.1| GENE 19 19020 - 20069 970 349 aa, chain + ## HITS:1 COG:BH1271 KEGG:ns NR:ns ## COG: BH1271 COG1559 # Protein_GI_number: 15613834 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Bacillus halodurans # 39 341 65 363 382 169 33.0 5e-42 MKKTQKIIIGAIAGITIVVLALVFFYFNGQGAVSSKSEEVVVEISGSTSSVLNQLDKAGL LKSKTVASIYTKFNSYSFKANVYVLNKNMDLKKILTILEGDKDYISAAKITILDGYRIPE CAQQVAKGLEIDSTEVLEKWTNKEYLQTLVEKYWFLDESILSADIMFPLEGYFGPETYVI TSKKTSIEDVTKMMLDQMDRNLSTYKDKISNFMISGNKVSMHQFLSLASVVQCESSGQKE DQAKIAGVFMNRLEKPMRLQSDVTVNYANQIKTVAVTYNHLSVDSKYNTYKYEGLPVGPI STVSTNIIEACLNYQKTDNLFFFALKDGSVIYSKTYEEHQQVVKENKWY >gi|223714166|gb|ACDT01000049.1| GENE 20 20071 - 20670 608 199 aa, chain + ## HITS:1 COG:BS_yrrM KEGG:ns NR:ns ## COG: BS_yrrM COG4122 # Protein_GI_number: 16079790 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Bacillus subtilis # 5 196 24 216 217 130 37.0 2e-30 MRYYEKIEADALARNIPVMQREGLEFMIEIFKHHNCHCCLEIGSAIGYSAMMLVSNINNF KVETIELNEERYLEAVKNIEENNLKSQIIIHHGDALSFDLEHLKIKKYDCLFIDAAKAQY QKFFEKYMPLVADEGICIVDNLDFHGMIFDIDNIKNRNTKQLVKKIKRFKDWIFDNELYD VEYHHVGDGICVIRKRVAQ >gi|223714166|gb|ACDT01000049.1| GENE 21 20667 - 21101 475 144 aa, chain + ## HITS:1 COG:SP0801 KEGG:ns NR:ns ## COG: SP0801 COG0826 # Protein_GI_number: 15900694 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Streptococcus pneumoniae TIGR4 # 17 142 67 195 356 65 32.0 4e-11 MKLVLSLNSKKFIYDYVDIGVEYFVVGAKYFSCRQALSLEYDEIANLKKVLGNKKVWVLV NALVEEKYIDFLEKHLIRLSHIGVDGILFQDFGVLQICNEHNFDFEMIYHPDTLNTNQAT LNYLGTQGINGAFLAREIPLEEKK >gi|223714166|gb|ACDT01000049.1| GENE 22 21128 - 21601 360 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756787|ref|ZP_02428914.1| ## NR: 
gi|167756787|ref|ZP_02428914.1| hypothetical protein CLORAM_02336 [Clostridium ramosum DSM 1402] # 1 157 155 311 311 301 100.0 7e-81 MIQVHGVEYMAYSKRKLLTNYFKEINQDIPIGISDDLTIQANGVNYSCHIYEDQYGCHIL SKQQMCGLDIMSSFQDFDYLYIESLYVDELKLVEIVNLYQDALISVGNRTYGKVAKELIS QLYQLDPNIEYHHSFMFDATVYKIDDVRKREENEKCK >gi|223714166|gb|ACDT01000049.1| GENE 23 21585 - 22829 1232 414 aa, chain + ## HITS:1 COG:BS_yrrO KEGG:ns NR:ns ## COG: BS_yrrO COG0826 # Protein_GI_number: 16079788 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Bacillus subtilis # 4 412 8 420 422 443 50.0 1e-124 MRNVSKVINGKRVIIKKPELLAPAGNLEKLKVAIQYGADAVFVGGKEFSLRSGASNFTLD DIKEAVTFADQYGAAVHVTCNIILHQDNLDGIEEYLRALDQAGVRAIIVADPYIMSIAKK LNLKLEVHVSTQLSTLNTKAIKFYRNLKMDRVVLGREVCYDDLKTILDKTDVDIEYFIHG AMCIHYSGRCMLSNYFSRRDANRGGCSQSCRWYYDLYQGNQKINQDGVIPFSMSSKDMAL INHIPELIELGVDSFKIEGRMKSLHYIATIVSTYRKLIDDYCNDPDNFELTEKYYREIQK AANRSLCTGFFDDQADNDKQLYNQRDEHPTQEFCARVISYDQVNQIATIEQRNYFKIGDQ IEFFSPYHENVVVPVTKIINADNEEVEVANHPMEILQIPVSVELQKDDMGRKVI >gi|223714166|gb|ACDT01000049.1| GENE 24 22829 - 23449 625 206 aa, chain + ## HITS:1 COG:BH1275 KEGG:ns NR:ns ## COG: BH1275 COG0572 # Protein_GI_number: 15613838 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Bacillus halodurans # 2 206 3 206 211 231 56.0 6e-61 MKKPVIIGIAGGSASGKTSIAQELYDCFKGRHTIRIIKLDDYYKDQTHLSMDKRVLTNYD HPLAFDMDLLIEHLDLLKEGKSIQKPTYDFEQHNRSKIVEIVDCRDVFILEGLFVLNEVR IRERCDILVYVDTDADIRFIRRLRRDLEERGRSLDSVCTQYLTTVRPMHEQFVEPSKKYA HIIIPEGSSNTVAIDLLLTKISSIVD >gi|223714166|gb|ACDT01000049.1| GENE 25 23514 - 23999 661 161 aa, chain + ## HITS:1 COG:MG282 KEGG:ns NR:ns ## COG: MG282 COG0782 # Protein_GI_number: 12045138 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Mycoplasma genitalium # 1 157 1 158 161 119 51.0 2e-27 MEHEKVLLTQSGVEKLEQERDNLINVERPKVIEELQLARSQGDLSENADYDAAREKQAHL 
ESRIKEIDYMLQNAEVISEEQMDLKVVKPGTTVTILDLSEKDAEPESYQIVGYTETDPLN GKISNESPLAKAVLGHGVNEIVTVGVADPYDVKIVNIEFKN >gi|223714166|gb|ACDT01000049.1| GENE 26 24276 - 24632 202 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734509|ref|ZP_04564990.1| ## NR: gi|237734509|ref|ZP_04564990.1| predicted protein [Mollicutes bacterium D7] # 1 118 1 118 118 121 100.0 1e-26 MKSYLKIYLKFALFILITFTITSLIMAGIISFIHLSNFIYHSIINIIAGIIMIVWAFWLI KIFQNKAIIHALLCGLIFGIIALMVNIEDINLINILSRPIILIITTLILQLYTKKLDA >gi|223714166|gb|ACDT01000049.1| GENE 27 24682 - 25314 610 210 aa, chain + ## HITS:1 COG:BS_yrbG KEGG:ns NR:ns ## COG: BS_yrbG COG2323 # Protein_GI_number: 16079821 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 10 210 11 213 218 134 35.0 1e-31 MNIFDVFLISVCIYIYLSILLRIFGKKEFSQLNVFDFVVFLIIAEIMTDTIGNSDFTFYH GVVATVTLIVVDRLVSMITMKSKKLRDIFEGRPTYIIFKGKLDQEIMKKQRYTVDDLSHH LRVNDIDSISKVEFALLETNGSLSIIPKDQCVVELPDALICDGIIDEDNLKLLNRDINWL KKELKKQGVDRIEDVFYCVPEKGHLLVIKK >gi|223714166|gb|ACDT01000049.1| GENE 28 25329 - 26123 270 264 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 255 1 245 245 108 32 8e-23 MLKLEHVSKTFNLGTVNEKKALQDLNLTVNDGDFITIIGGNGAGKSTTLNMIAGVYPIDC GYITIDNQDISLADEYKRAKFIGRVFQDPMMGTAANMEIQENLAMAFRRGKRRGLSWGIS KEEKKQYHEALKRLDLGLETRMSSKVGLLSGGQRQALTLLMATLQKPKILLLDEHTAALD PRTAKKVLDLTEEIVTEQKLTALMVTHNMKDAINIGNRLIMMDKGRIIYDVNGEEKAKLT VDDLLKKFEEASGSEFDNDRMLLG >gi|223714166|gb|ACDT01000049.1| GENE 29 26116 - 27036 1067 306 aa, chain - ## HITS:1 COG:Cgl2197 KEGG:ns NR:ns ## COG: Cgl2197 COG4120 # Protein_GI_number: 19553447 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Corynebacterium glutamicum # 9 284 3 276 296 212 47.0 8e-55 MSFILAIQGAASQGILWGIMALGVYITFRLLDFADLTVDGSFATGGAVCAVAIVNGINPI LAVLLAIIAGFVAGAITGLLHTKCQIPAILAGILTQIGLYSINLRIMGKSNTPLLQSDTI 
FKGLSNTFNLSQAWITLIIGIICAIIVILICYWFFGTEIGSAIRATGNNEHMVRALGANT NTTKLLGLMISNGLIAMSGALVTQSQGYADIKMGIGAIVIGLASIVIGEVILGRKPGFMF TLTAIIVGSILYRIIVAVVLQLGLSTDDLKLLTALLVGIALTVPVMVSKRKQIATYKKLT KEKENA >gi|223714166|gb|ACDT01000049.1| GENE 30 27036 - 28004 1161 322 aa, chain - ## HITS:1 COG:Cgl2198 KEGG:ns NR:ns ## COG: Cgl2198 COG2984 # Protein_GI_number: 19553448 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Corynebacterium glutamicum # 22 320 38 330 330 213 46.0 3e-55 MKKLLKASLTVMMALTLSACGGSKDVNIGIVQYAEHPALDNAKKGFLKALEDNGYGEDNV AFDDQNAQGDGSNCTTIADKFVNDNVDLIFAIATPTAQAAANKTTEIPIVLSAVTDPASA KLVKTNEKPGGNVTGTSDLTPVADQFDLLQQLLPDAKTIGIMYCNAEDNSIFQADIAKEE CAKRNLTVVDKSVTDSNQIQQVTESLIGKVDAIYIPTDNLLAEGMATVAQVANENNLPCI VGESGMVENGGLATYGIDYYNLGYRAGLQAVKILKGEAKPADMAIEYLPAEECELTINEK VAKKLNITIPDDLKSKAKMVNK >gi|223714166|gb|ACDT01000049.1| GENE 31 28424 - 29218 675 264 aa, chain - ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 1 263 2 268 269 193 45.0 2e-49 MEYLLSLNPFIIVLIVATFNWLMTFFGASLVLFVRKASQKLICIALGSSAGIMVAASFFS LLLPAKDQLEAGGKLDLLIIPFGFICGVALLMLIDKLLPHEHMMSHEQEGINPGRFSKNK LLMLAMTLHNIPEGLAVGVAFAGCHDGNYLPALILALGIGIQNFPEGTAISLPMHQCGKS RFIAMMYGQFSAIVEIPAALLGFIFATLVNGVLPFALCFAAGAMFFVCIEELIPEANATE GIDLGTISFMIGFVIMMSLDILLS >gi|223714166|gb|ACDT01000049.1| GENE 32 29325 - 29699 475 124 aa, chain + ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 4 120 1 117 122 86 41.0 1e-17 MEEMSIAMQNYLELIYELSLDGKKARVSDIAKHLGVSKPSVNNAVVVLAKDGYVIYEKYA DVKLTPKGKETAEFICGKHQTIKQLFVEVLNIDEGIADKDACLIEHVISNESIKAMQEFL DRQK >gi|223714166|gb|ACDT01000049.1| GENE 33 29732 - 30988 
1266 418 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756798|ref|ZP_02428925.1| ## NR: gi|167756798|ref|ZP_02428925.1| hypothetical protein CLORAM_02347 [Clostridium ramosum DSM 1402] # 1 418 1 418 418 718 100.0 0 MNENELKSLEVYNEKFKEDPTSFNDFQANDYIKLLKKSDNNEEAIEVGKTFMSLCPNLRG YLNQYGYALYNKFINIDDEKIRDNEDLFFSVLKDILAVCKQERYSPMEPSVNRAIKYLQR EKPEDPNKLVEMLDLLDPNLLDDKPFTNDEGKEFESKKERYYRLKVKALYEAKRSKECVD TANNALSLSLKWHFNTLQWINYQRACSLVELEQYDEAKKVFLSLHNRIRNVNFYEVLYKT NAQTGNIKEANAYLLYEFFENGYSINHLGIYTRLMEATQRTEDPMLIEIVDAFLTKLTAE NNREYTPVADYGEKYNDQTSEELYDELYEKVMNNLGKYVERVEGTVVHYNNQRGLGTISR YDEDGIFFRQADYVYDEDVQRRDKVEYTPIETFDNKKNEITNKAILIITTEEYLDFGY >gi|223714166|gb|ACDT01000049.1| GENE 34 31112 - 33256 1676 714 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase [Brucella abortus bv. 1 str. 9-941] # 2 714 3 714 714 650 48 0.0 MNKQVFSMDFYGKNLSVEVGELAKQANGAVLVRYNDTVILSTVVAGKEPKNVDFFPLTVT YEEKLYSVGKIPGGFLKREGRPSEHGTLTARMIDRPIRPLFADGFRNEVQSVNTVLSVDQ DATPEMAAMLGASLALCVSDIPFNGPIAGVNVGLVDGEFIVNGTPEQLENSLINLEVAGT KDAINMVEADAKEVSEEVMLEALLFGHEKIKELIAFQEKVVEACGKEKIEIPLFELNNDL VEEVAVRAKAEMAAAVSIPGKLERYGAIDDLIAQVTEEYDNREYENEEIKADVMKQVGII LHDLEKDEVRRLITEDKIRPDGRKIDEIRPLDSQIDLLPRVHGSALFTRGETQVLSVATL GAIGEHQKIDGLGLEDQKRFMHHYNFPPYSVGETGRMGAPGRREIGHGALGERALAQVIP SEEAFPYTIRVVSEVLESNGSSSQASICASTMALMAAGVPIKAPVAGVAMGLVKKGDAYT ILTDIQGMEDHLGDMDFKVAGTDAGICALQMDIKIDGITKEILQEALAQAKVARKQIMAN MMDAIAAPRDELSPYAPKVQMMKIEPDQIKAVIGQGGKTINEIIEQSDGVKIDIEQDGTV VIYHYDQAAINKAVELIEKIVKKAQVGEVYDGKVVRVEDNYAFINLFDGTDGFLHISDYA YERTKKMGEVIKLGDIIKVKVTKVDDKGKVNVSRKALLPKPIKKEEPKEEPKAE >gi|223714166|gb|ACDT01000049.1| GENE 35 33256 - 33813 621 185 aa, chain + ## HITS:1 COG:BS_maf KEGG:ns NR:ns ## COG: BS_maf COG0424 # Protein_GI_number: 16079857 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus subtilis # 2 174 3 175 189 140 44.0 1e-33 
MKRLVLASSSPRRKELLELHKFDFIIDFQEIEEVLDESLALPLRLEKLAYQKAAPIALKY PSDIVIGADTMVCLENQMLGKAADRQAAYEMLKLLSDQTQTVYSAVAIIDNGKVSTYHDG TKVTFKKLSDEEINAYLDLNEWPGKAGAYAIQGEGKALVAKVEGNLETVIGMPVWIIEEY LNNHR >gi|223714166|gb|ACDT01000049.1| GENE 36 33853 - 34899 814 348 aa, chain - ## HITS:1 COG:mll5312 KEGG:ns NR:ns ## COG: mll5312 COG0789 # Protein_GI_number: 13474432 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mesorhizobium loti # 1 97 1 104 152 65 32.0 1e-10 MRTNEVEKITGLTKHAILYYEKVGIIRPKRTNNGYRNYSHDDLQTLKLVKFLRNLDISID DIKAILNNKCSLDDCLKIHKRNIDQTISELEDLQNTISTFKDMHIPLIPALENIKLVPEK KGLGYRKTSKTIAYNRPLTRAMAIRQVISAGIIASIFTYGIYIWIWSEKWYYLENLYFVA VVFLINILIFIFANFQFMMHLYDKSRNQSIEFLEEGIRIFKRKNSFHHFKYVLSALINQH YKYQKFYFYQDIEKLVVQTRQRFIPLYSLGTGGPSTDMYEVDITLYFSDGETYFLLGPET FNNDSQLIGHILTEKIPNIVDPDNILTAYQNKINLTDYMTSQKSMISR >gi|223714166|gb|ACDT01000049.1| GENE 37 35171 - 36286 1490 371 aa, chain + ## HITS:1 COG:Cgl1221 KEGG:ns NR:ns ## COG: Cgl1221 COG0205 # Protein_GI_number: 19552471 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Corynebacterium glutamicum # 6 344 5 331 346 257 42.0 3e-68 MADVKRIALLSGGGDCPGLNAVIRAVTKAAINEYGYEVIGYVYGYRGLYNNDYIELTVDK VENIYKEGGTILYSSNKDNLFDYLVDDGHGGKVKKDVSDVGVENLKKAGVDVLVVLGGDG TLTSARDFSRKGVNVIGIPKTMDNDLPATDLTYGFISASSVGTEFIDRLNTTAKSHHRVI CCELMGRDAGWIALYSGMAGNASCILIPEIPFTIDNIVKHVAKRDAEGYPYTVIAVAEGA KYADGTKVIGKIVEDSPDPVRYSGLAAKVADDLEQAIANHEVRSVNPGHIIRGGDIQAYD RILSIRFGVKALELIKEGKFGNVVTLKGEEMSYTSLEEVIGDAKFGKQKHVDPNGELVKA AKAIGICFGDE >gi|223714166|gb|ACDT01000049.1| GENE 38 36719 - 38158 1381 479 aa, chain + ## HITS:1 COG:BS_yclI KEGG:ns NR:ns ## COG: BS_yclI COG0577 # Protein_GI_number: 16077442 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Bacillus subtilis # 2 478 4 486 486 222 34.0 1e-57 MLKNAFLSIKKNIGKTVLLFVIMTVIANLIIAGLSIQSASKKSMEQIRTSLGNDVTLTTN 
MQNMMGQREKGQAVSEVAASVTIAMADQLKDLKYVKNYNYAISTTASSDDITAVELTTSQ DNQTMERPDGQENFQSANQGDFSISANTTMEYLDSFTEESSTLKQGRLLSSDDAGTNNCV IETNLATDNDLNVGDTFTVYTTINEETVTQQLTIVGIYEVTDAKTMGGPGQSNPFNTIYT DLSIGQTLSGSETNITSATYYLDDPENIEAFQTLAQKKSDIDFETYTLDANDRLYQQNVS SLENTQSFATMFLIVVIGAGSAILCLILILTIRSRYYEIGVFLSLGQSKVKIILQQLFEM LLIAAVAFVISLGTGKLVSNVVGNMLESGTNNNQVQMQMPEIRDQNNDSNNTNSTSGNDK MFDQAFNGPENTELDVSLTTTTVMQLAGITIAICLVSIAIPSAYVLRLSPREILVKKEG >gi|223714166|gb|ACDT01000049.1| GENE 39 38161 - 38832 226 223 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 223 1 222 223 91 31 1e-17 METILNFEDVSYHYLDGSSNVSILNKANYSFEKGKIYAIVGASGSGKTTTIVLAGGLDKP KGGKVIFKGQDTARIGLNKYRRNDISIVFQSYNLIHYMNAYENVANAIEIAGRKVPNKKE YCLDILKKLGLSKEQSLRDIRKLSGGQQQRIAIARAIAKDVDLILADEPTGNLDEKNSKE ILKTFIDLAHQANKCIIIVTHSPSLAQKCDVQLKIEDGQIVEI >gi|223714166|gb|ACDT01000049.1| GENE 40 38853 - 39728 685 291 aa, chain - ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 1 272 1 271 283 126 29.0 6e-29 MSIMTQLEFELDFSSSEKTIAKYILDNGEDILNLSVKELAKQTYTSPATIVRLCRKLGLN GYGDFKIKYSAELQFDKKNKKRVDVNFPFGNTDSNSQIAYRIANLHQEAIEDTLNLVDFK NLDKIINLLDQARRIYLFGNGNSLLAGFDFQHKMMRIGKMVEMRAHAGEQGFLSYTCSPD DVAILISYSGETNEMVELAKFLKKMHVPLLGITSIGDNQLSKYCTYIMNTGSREKIFSKI APYSSKTSISYLLDLIFSCIFRLNYDHYINEKINRDKLFDHRHPYKSPIND >gi|223714166|gb|ACDT01000049.1| GENE 41 39880 - 41292 1651 470 aa, chain + ## HITS:1 COG:L32812 KEGG:ns NR:ns ## COG: L32812 COG2723 # Protein_GI_number: 15672801 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Lactococcus lactis # 1 470 1 453 453 379 44.0 1e-105 MEKKLPDNFYWGGSVSSFQTEGARNEGGKGVCIYDVRPTDPEFSDWSVGIDEYHRYDEDI ALMKDMGFNFYRFSICWSRIIPDGTLDEPVNEEGIKFYSDLIDKLIAAGIEPMITLVHFD 
MPYKLVKEHNGFASRYVVDCFERYARVCIERFGNRVKHWMSFNEQNLHGMMLRVSNAEEI PEGVDPAKHLYQVNHNVFIAHCKAVKALRELQPDAQFCGMNAVTNIYPYSNTPKNNLFAW KAYQYMNGFHCDVFAKGKYPDYMIAYLENRGWMPTFEEGDDELLKYTVDYIAFSYYRSNT VTEGEFDYTRPYHEVVGEHTVKNPHCEANEWGWEIDPVGFRWTLNDLATRSDLPVFVLEN GIGWREDMTQEEVDTMLAANKTIEDDYRINYHRDHIKEMKNAMFEEGVKCLGYITWGPID ILASMCNMDKRYGFVYVNRTNKDIRDLKRIPKKSYHWVKKAFESNGEDLD >gi|223714166|gb|ACDT01000049.1| GENE 42 41432 - 41644 353 70 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00572 NR:ns ## KEGG: EUBELI_00572 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 70 14 83 83 75 60.0 5e-13 MMPLTMATIGEVQTIKKVGGKQEVKLFLESLGFTTGATVTIVSKNQGNLITKIKESRVAI SQEMANKIMI >gi|223714166|gb|ACDT01000049.1| GENE 43 41667 - 41888 252 73 aa, chain + ## HITS:1 COG:L192240 KEGG:ns NR:ns ## COG: L192240 COG1918 # Protein_GI_number: 15672170 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Lactococcus lactis # 4 73 80 149 152 68 48.0 3e-12 MKTLRDVKCGTSARVMKLHGDGAVKRRIMDMGITKGTEVLVRKVAPLGDPIEVKVRGYEL SLRKADAAMIEVL >gi|223714166|gb|ACDT01000049.1| GENE 44 41915 - 44098 2762 727 aa, chain + ## HITS:1 COG:L190009 KEGG:ns NR:ns ## COG: L190009 COG0370 # Protein_GI_number: 15672169 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Lactococcus lactis # 5 711 4 702 709 701 49.0 0 MSITIALAGNPNSGKTTLFNALTGSNQFVGNWPGVTVEKKEGKLKGHKDVKLMDLPGIYS LSPYTLEEVVARNYLIQERPDAIINIVDGTNLERNLYLTTQIMELGIPVIMAINMMDIVA KNNDKIDVKKLSKELGCQVVEISALKGKGIKEAADRAVTLANSKRINAPIHKFDSAVEEK LDQIEAMLPSDIAVEQRRFYAIKLFERDDKISGLMSSVPNVETIIDQSEKELDDDAESII TNERYQFITSIIDDCYKKHSHDQLSVSDKIDRIVTNRWLALPIFAVVMFVVYYISVTTVG TWATDWANEGVFGDGWSLFGLAVPSIPSLIEGLLESLGTAAWLNGLILDGIVAGVGAVLG FVPQMLVLFFFLAVLESCGYMARVAFIMDRIFRKFGLSGKSFIPMLIGTGCGVPGIMASR TIENDRDRKMTIITTTFIPCGAKLPIIALIAGALFGGASWVAPSAYFIGIAAIITSGIIL KKTRRFAGEPAPFVMELPAYHMPTITNVLRSMWERGWSFIKKAGTIILLSSILVWFTSYF 
GFVDGQFTMLEDTQLDHSILASIGNAIAWIFTPLGWGDWKAAVAAITGLVAKENVVGTFG ILYGFAEVADDGAEIWGTLAASYTQIAAFSFLIFNLLCAPCFAAMGAIKREMNNGRWTAF AIVYQTVFAYLVAFSVYQIGNLVITGTFTFATVIAILVVIGFVYLLVRPYKEDNRLNIDL NKTVTEK >gi|223714166|gb|ACDT01000049.1| GENE 45 44113 - 44319 148 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734527|ref|ZP_04565008.1| ## NR: gi|237734527|ref|ZP_04565008.1| predicted protein [Mollicutes bacterium D7] # 1 68 4 71 71 117 98.0 2e-25 MNWLIENLSTIIVSVILLFAVLLAINSIRKDKANGKSSCGGNCGSCGTGCHSRTNSLVDS YHHDHGKC >gi|223714166|gb|ACDT01000049.1| GENE 46 44410 - 45243 1032 277 aa, chain + ## HITS:1 COG:BS_ykrA KEGG:ns NR:ns ## COG: BS_ykrA COG0561 # Protein_GI_number: 16078519 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Bacillus subtilis # 1 277 1 256 257 112 26.0 8e-25 MEKKAIFLDVDGTLVANHGKMTEQVKKAITLARNQGHKVFICTGRNKIGIRNELEEVNFD GFIASAGSYIEVADEVVHSRYYDRKLIEKACDVFNRNNIVYNYECTDITYMSPKMMEFFV GGNNEEATNSEMERLKAEQSDKFNVRDMNEYDGRGIHKISFIAMKQEDFDHAFKELKENF NFVVHDMFGSKNINGEIMSKLDNKGTGIKRIVDYLNIDMEDTIGFGDSMNDYEMIDAVKC GVVMDNGSARLKEIADRICKSVDEDGVYYEFIDLGII >gi|223714166|gb|ACDT01000049.1| GENE 47 45439 - 45639 333 66 aa, chain + ## HITS:1 COG:SA0747 KEGG:ns NR:ns ## COG: SA0747 COG1278 # Protein_GI_number: 15926469 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Staphylococcus aureus N315 # 1 66 1 66 66 89 72.0 1e-18 MNTGTVKWFNSEKGFGFITKDTGGDLFVHFSAIQGSGFKSLEEGAKVSFDIVESDRGEQA ANVAAL >gi|223714166|gb|ACDT01000049.1| GENE 48 45778 - 47667 2312 629 aa, chain + ## HITS:1 COG:BS_htpG KEGG:ns NR:ns ## COG: BS_htpG COG0326 # Protein_GI_number: 16081033 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Bacillus subtilis # 4 625 3 626 626 660 56.0 0 MAEKKQFKAESKRLLDLMINSIYTHKEIFLRELISNASDASDKLYYKALTENISSINRND LKIMIEIDKENRVLTIKDNGIGMDKDELETHLGTIANSGSFEFKNENEHDDIDIIGQFGV 
GFYAAFMVAKKVEVISKKYGTEQGYTWVSEVSDGYEIFETNVADHGTTIKLYLKDNTDDD NYDEYLNQYHIRDLVKKYSDYVHYPINMDMTVSKKKDDSDEYEDVIENQTLNSMVPLWKR NKKDITDEEYNDFYKDKFNDFNDPLKVIHSRVEGNVSYDSLLFIPSKPPMNYYSNEYEKG LQLYSRGVFIMDKASELVPEHFRFVKGLVDSEDLSLNISREMLQHDRQLKVIADKIEKKI QSELETMLKKDRVKYEEFFNSFGLQLKFGIYNSYGMLKEKLQDLLLYYSSKEEKLITLAE YIEHMPEGQKEIYFASGETREKIATLPQVEVVKDKGYDVLYLTDNVDEFCFQMMRDYKEK PFKSVAQGDLDIDSEEEKKELEKTNEENKDLLTAIKDSLGDKVVDVKVSSRLKSHPVCLV SSDGVSFEMEKVLNQMPEGQEVKAGRILEINPNHDIFNAMKKVYENNPDKVNDYASILYD QALLIEGFSIDNPVDFSNKICDLMIELNK >gi|223714166|gb|ACDT01000049.1| GENE 49 47803 - 48369 521 188 aa, chain + ## HITS:1 COG:no KEGG:DhcVS_1159 NR:ns ## KEGG: DhcVS_1159 # Name: not_defined # Def: hypothetical protein # Organism: Dehalococcoides_VS # Pathway: not_defined # 73 181 75 183 196 64 35.0 1e-09 MLNICGIIAFFMMYIYDLNLIYLGNKIFKKFFAIGSIILVFATLANIIIYFPKKIDAHFI IFLGLALLAAIGLIYTLFFALKFDDAYKNTENKQPCVKTGVYALCRHPGVHMMILMYLCL YLAFQNETMLFMLIGFNLLNIGYVILQDIIIFPKQFVDYDEYKKIVPFLVPTIKSIRSCI RTFGEVSK >gi|223714166|gb|ACDT01000049.1| GENE 50 48366 - 50000 1812 544 aa, chain + ## HITS:1 COG:no KEGG:cbdb_A1333 NR:ns ## KEGG: cbdb_A1333 # Name: cbdbA1333 # Def: auxin-responsive GH3 protein # Organism: Dehalococcoides_CBDB1 # Pathway: not_defined # 13 521 14 516 543 348 38.0 3e-94 MKFEDKLKARSREDVWNEYCGFLDLSMDEFVQIQNRLMEEQIALWSRSPLGKKILKGSTP KNIKEFRGLVPLTSYDDYASDLLQKKSELLPDEPVLWIQTTWEGGKHPIKVAPYTESMLD TFKRNVCACLLLGTSTKRGDFKARSTDTILYGLAPLPYATGLLPLAFEEEIGIEFLPPVK DAVKMSFSTRNKVGFKLGLSKGIDYFFGLGSVTYYVSLSLGKMSSGGGGAISLLKKSPLR FAKIVLAKCKCIKEGRELKPKDIFKLKGFMVAGTDNACYKDDLEELWGIRPMEIFAGTEP TCIGTETWSRNGLYFFPDACFYEFIPSDEMEKNLADSSYQPRTVLINEVEEGMSYELVIS VLKGGAFMRYRVGDMYQCIDLKNKDENIKLPRFKYLDRVPNVIDIGGFTRITENSIDQVV KLSGLKITNYIAKKEFNHNNRPYLHLYVEMDPHAQITQAISIEILREQLSIYFKYVDQDY QDLKKILGIDPLKITIIKAGTFAYYEKNHSHKIKKINPPTLEINELLTIQDQDYRVEMGG RLYE >gi|223714166|gb|ACDT01000049.1| GENE 51 49993 - 52527 2109 844 aa, chain + ## HITS:1 COG:TM1170_3 KEGG:ns NR:ns ## COG: TM1170_3 COG2206 # Protein_GI_number: 15643926 # Func_class: T 
Signal transduction mechanisms # Function: HD-GYP domain # Organism: Thermotoga maritima # 661 833 4 176 185 149 41.0 3e-35 MNNYTYVSIIALYCYIFLFISLSASKKTKLIKAFMTLLMAMMFWTGGSFFMRSMFGPSVK FWYDISLLGLCLVPYAFISFVHEFLFNSPHKKDTIWFFLFVGLFIINATTELLLPAPEYT IRASGIPAFYYTSNVYTYIFYCFFLASIIYCVSLIIKGFKENRNQVIRLMPMFAGIACLI FGNTCVMLGWFAGFPIDIVAGVINAICMFYTLYKRRLFRLNLLVSRRSSYVIAGMLSIIL FANLVGTFQNILFEHFGLITKNYYVLVIAVAFTIVTAIIYTMLKKFIDAIFIKEEIQQSD SLKEFSNYVSKTLQVNEIVGAMGEIILKCVETKQVYICLLDDNGDYSIEYSSSPLRKNNY IINHENPVVIKLMNNDDMIFYSEFRHSINYRSMWDSEKQLFENLNIEGIIPVKQEQLIGM ILLAPKPRRSGYSYGEISFLSSVASISAIALINSKMYETVYNEARSDELTGLLNRKYFYE ALNNEYEKLGERQLSLILVSVDDVRLYNQLYGNKEGDIVLINVARIMDDYVNTQGYTARL GGKEFGIILPNYGPLEAKDLAENIREQIMNMNKRDSDYTLKVVTASFGISSIPVSANTIK QLLDYADQALYQSKRHGKNRVSVYNAGVVESTSGVTIVDAHREKQNIYLEYAPTIYALTA AIDAKDHYTFTHSENVAYYATKLAIGCGYDSELVEIINEAALLHDIGKIGISENILKKTS KLSDEEYQIMQSHVEASVNIIRHLPSLDYVIPIVLTHHERYDGRGYPRRISGEDIPAGGR IMAIADSFDAMTSRRSYKEPYSLEYALNELKRNRGLQFDPDLVDKFIELINSGIIEVRGN QNNE >gi|223714166|gb|ACDT01000049.1| GENE 52 52540 - 53076 592 178 aa, chain + ## HITS:1 COG:TM0386_2 KEGG:ns NR:ns ## COG: TM0386_2 COG0778 # Protein_GI_number: 15643152 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Thermotoga maritima # 1 164 1 155 181 70 33.0 2e-12 MSLDQIIKQRRSIRKYKKKPVSNEQIQLLIQAAVEAPSWKNSQTARYHVITSQPLLNEFK ERCLPEFNQENCKDAPVLIVTSFIPNRSGFERDGTPSNELANGWGCYDLGLHNQNILLKA TELGLGSLVMGIRDIDKIKGLLDISKEEIIVSVIAIGISDIEPERPKRKTVEDITRFY >gi|223714166|gb|ACDT01000049.1| GENE 53 53178 - 55616 2482 812 aa, chain + ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 3 738 74 816 818 578 44.0 1e-164 MKRKYSVKGMTCSACENHVHNSVCKLPGVEHVEVNLLTNSMNVEFDETKVDDSMIIKAVK DGGYQASSYDDALVDNDDLSNLKRKLIYCFIALGLLMYVSMSSMLSYPIPNIIRDNVYFN IILQIIFLIPILILKKDYFVNGFKNLIHFDPTMDSLIALGAGAAIVYSIYSVVLALSGSL 
MEMHLVHNVYFESAGMIVTLISFGKYLEAKSKKKTTDAIGKLLELAPDVACRFNDGKEEI VKISELKIDDLVLVRANETVPVDGKIVQGFSSIDESMITGESLPIEKEVSSLVIGGTNNL QGTFIYTVTRTVEDSTLAKIIELVEEASSSKAPMTRTIDKVVKYFVPTVIVIALITFVGW LIMGEDLNFAITSAIAVLVISCPCALGLATPVAIMVSTGVGATNGILIKSAEVLENENNI DVVVFDKTGTLTKGMAQVSDIYSHKLELAQVIAIIASLEKGSSHVLANAFIQKAAEMNLE LKEITDFESLSGLGIRGTIDNVEYAVGNLRMMKESGIDLSRYQEKIDLYLIQGKTLVFLA HDQELIGLVSIFDDIKDTSRQAIQRLKEMKIKTVMLTGDLKGTANAINKQLGLDEVIAEV LPQDKENVIQQLQAQGNSVLMVGDGINDAPALVRSDVGVAIGKGNDIAIDAADVILMKDD IRDIVASIELSKRTIINIKENLFWAFIYNVIGIPIAAGIFYYSFGLRLDAMVGSLCMSLS SVCVVTNALRLKRFKPYYQKENKKMKKEIVIEGMMCQHCKKHVEEALNGLADTTAAVDLE NNLARVETVQDDSILKNAIEKAGYKAVGIKNV >gi|223714166|gb|ACDT01000049.1| GENE 54 55609 - 55860 367 83 aa, chain + ## HITS:1 COG:CAC3656 KEGG:ns NR:ns ## COG: CAC3656 COG1937 # Protein_GI_number: 15896889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 83 6 85 88 79 52.0 2e-15 MSNEKVLLRLKTARGQIDAVIKMIEDERYCIDISNQLLAIQSLIKNANNEVLSNHLNHCV KNAINEHDADQKIDEVIKLLTKL >gi|223714166|gb|ACDT01000049.1| GENE 55 55882 - 56934 558 350 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734537|ref|ZP_04565018.1| ## NR: gi|237734537|ref|ZP_04565018.1| predicted protein [Mollicutes bacterium D7] # 1 350 1 350 350 609 100.0 1e-173 MKLRLTNYDTSEYKQFEDMLNNLSKSGYNCKSVDMFTVFKKDEQRLYYKADIFVPQKKSS KNNREQRDQWLLNYVNHGYEFIGKSRKIYVFKAAQAANIKGTDQALLLTYFKRNKTISNI IFIFVAMLLSFLLIPGVFANQNPMEFITNGSIILHYIPLLFCPALLIRFFNHHLTTEKIK LTLSNKKANNKISKYPFIISNWLLIVSIVLIIAGFSIDFIGRETLPLNDRIITLSALGLS SNNDEYNTFNKSSSLMIKEAISYNEENDNDALIVNYYYYNSEKKAKNALNDYLNSVNFKN KKQITNGYLLSNDSIYNCIAFVKNKRLIIVQTTVDLLENNTYQKITSFNY >gi|223714166|gb|ACDT01000049.1| GENE 56 57012 - 58682 1759 556 aa, chain - ## HITS:1 COG:BH1407 KEGG:ns NR:ns ## COG: BH1407 COG1283 # Protein_GI_number: 15613970 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Bacillus halodurans # 10 539 11 539 543 401 42.0 1e-111 
MTLADIDWPMILAGLGLFLFGIEYMGDGLKGYSGDKMKDIIDKYTSSPFKGIVIGAFVTC LIQSSSGTTALAIGLIRSGLMTLEQSIGIIMGANIGTVITSVFVGLKVSQYAVYFIILGA AFLMFSKNKKTKYMGQIIFGFGCLFYGLELMGDNLANISKVPEFTQVANYLSQNPWLGLL GGTLLTTAVQSSAAVIAIAQQMYGAGAIGLSIALPFLFGSNIGTTITAVLASFGGTVPAK RAAFFHVLFNVVGSLLFMIVLSPFTLLIEWLAGALNILPELQLSVAHGIFNVVTTAFFFP LIPAIVKLIKKILPSSKKEINMDLSELDQNVVQLFPSHALAIAKNKIIEMGHITIEAVEG IRAYFETKNPLAKDGVYEMENAINTLDSKITEYLVLISHETLNDHDSNDYLANMKTIKDF ERIGDLCINIVKYYEAIYDEKEDFSPEAREDLEAMMDMVIDMLNHAVKAFDTHDLDDIVY VDDKEADLDYFNKKAKQRHIKRVGRKIENSALVNSTYVDILANLERMGDHCQNISESYLL DESAYLNEEPESAFSK >gi|223714166|gb|ACDT01000049.1| GENE 57 58847 - 60562 228 571 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 354 560 16 226 245 92 33 6e-18 MSERSKKFLSYYRPYLKLFLADMFCAMIAAGITLVFPMIIRYITGTVLIADNFEMGIIYK LGIFMVVLVIVEYLCNYFVAYQGHVMGVYMERDLRNELFQHYQKLSFSFYDEQKTGQLMS RLTNDLFSLTELYHHGPEDIVISFIKFFGAFIILATINLKLTLIVFAFIPVMGAFIYYYN RKMKRAFKRNKQRVGDINARIEDNLSGIRVVKSFGNESHEIIKFHDENSRYVSSKKNSYF YMGKFHSGLGAFTSMVTVAAVFFGAIFISNDGLNTADLIAFLLYINNLIDPVKKFINFTE QFQDGITGFERFMEILEIEPDIKDKKDAHNLLDVKGAIEYRHVGFRYNQKSDYVLKDIDL KVAPGEYIALVGSSGAGKTTICSLLPRFYEVSEGDIFIDGQNIKDIKLNSLRQNIGIVQQ DVYLFAGTILDNIRYGRFDATDEEVIEAAKKANAHDFIMELPEGYDTDCGQRGVKLSGGQ KQRLSIARVFLKNPPILIFDEATSALDNESEHIVQQSLESLAKNRTTLVIAHRLSTIKNA KRICVLSTKGIEEEGTHDELLAKKWPICNLL >gi|223714166|gb|ACDT01000049.1| GENE 58 61435 - 62235 620 266 aa, chain + ## HITS:1 COG:PA5091 KEGG:ns NR:ns ## COG: PA5091 COG3741 # Protein_GI_number: 15600284 # Func_class: E Amino acid transport and metabolism # Function: N-formylglutamate amidohydrolase # Organism: Pseudomonas aeruginosa # 21 231 12 234 266 105 29.0 7e-23 MIQILKDKQSMYTINTSLHELPVVISLPHSGTYITSEMAENLMDDVIFPNTDWYLPKLYG FLKELGFTIIINNMNRHLIDVNRDINDKKGSSYKTNLIYTKTTQGALMYNHELSIQEIER RIADYYLPYHEAIRQALLEKQKYFKKVYLIDLHSFGLNYGADIILGNDNGRACTSKTTNF FKKMMKKQNFKVTENNPFAGGYITKYYGASVENCEAIQIELWYQTYIDKRSFGNEELPVI NNELFGETSRRMENVFIELKKWLKDE >gi|223714166|gb|ACDT01000049.1| 
GENE 59 62241 - 63119 653 292 aa, chain + ## HITS:1 COG:no KEGG:DSY0046 NR:ns ## KEGG: DSY0046 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 6 288 1 283 290 325 54.0 9e-88 MKEQAIDSALILRKSFEHGEAFSEIEISELLKESKLVEKLTRDYEDSSFFNIFRLICLSE IPFIEQLPYTQKIIDFISNNLAADEGFSYNGQGDCIVPCYNAMLLEAYTRLQMAKSNEAQ NALDWIKRYQVFERNQRTSWRYGKICKHGGCMKATPCYIGIGKTVRALITYAKYIKNADS HVEQLIEQGIVYMLKHNMYQSLSNQQCISKYIDEIMFPQAYMLSLTDLVYIVNERNLWWD QRTIPLKNLLEEKQVGSDQWKIEYIYSHKGYKAFETKRNSSKWINYLYNLNK >gi|223714166|gb|ACDT01000049.1| GENE 60 63128 - 63289 99 53 aa, chain + ## HITS:1 COG:no KEGG:Blon_0991 NR:ns ## KEGG: Blon_0991 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 1 48 1 48 145 66 70.0 4e-10 MCQECYVDERRCTSLLNSLVCLEKHTQYICGTCGRNICIECDLNEGYKVDFSI >gi|223714166|gb|ACDT01000049.1| GENE 61 63767 - 63934 152 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756827|ref|ZP_02428954.1| ## NR: gi|167756827|ref|ZP_02428954.1| hypothetical protein CLORAM_02376 [Clostridium ramosum DSM 1402] # 1 55 1 55 55 91 92.0 2e-17 MGWNTYHQIVTELSPEMWVYDNLITYVFTHKYYENRAKIKFTNDDAYEFLNLLKS >gi|223714166|gb|ACDT01000049.1| GENE 62 64271 - 65539 1466 422 aa, chain - ## HITS:1 COG:CAC0713 KEGG:ns NR:ns ## COG: CAC0713 COG0148 # Protein_GI_number: 15894001 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Clostridium acetobutylicum # 4 409 7 414 431 483 60.0 1e-136 MPYISNVYARQVLDSRGFPTIQVEVTTESGFSGSAIVPSGASTGKYEALELRDNDDARYR GKSVFKAINNVNETIAKLIKEKSVLEQREIDLAMIRFDNTENKEKLGANAMLAVSLAVAN CAANYLEIPLYRYLGGCNAHVLPTPMINIVNGGSHATNSLDFQEFMIMPISANSFYQAME MATNVFHTLKDILKRSNLATSVGDEGGFAPNLESNDDALELIIKAIKECNYEPGKDISIA LDVAASELYQNGIYTINGKQYTSDELIKYYEQLVSKYPIISIEDGLDQEDYTGWKKLTSR LGAKIQLVGDDLFVTNTNRLKAGIAGNYSNSILIKLNQIGTLSETIDCIEMAVKNQIAPI ISHRSGESEDTFISDLAVGLNIGQIKTGSMSRSERICKYNRLLKIEDDLVGNSCYQGKIN NK >gi|223714166|gb|ACDT01000049.1| GENE 63 65762 - 66556 802 264 aa, chain + ## HITS:1 
COG:CAC3250 KEGG:ns NR:ns ## COG: CAC3250 COG0796 # Protein_GI_number: 15896495 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Clostridium acetobutylicum # 7 261 4 256 256 254 50.0 1e-67 MKINEERNLPIGFIDSGLGGLSVLKEAIKIMPHEDFIYYGDSLNAPYGTKSVDEIRDLTF AIVDKLLKLGIKGLAVACNTATSAAVRQLRIMYPDLTIVGIEPAIKPAVESNHGGEILVM ATPMTIKQEKFNRLLDIYKDRAKIIPVSCKGLMEFVEHGNLNGSFLEAYFNETLVPYLND TTETIVLGCTHYPFLRPYLKEFLGERRIQIIDGSHGTSCELKRQLAGKKLLQKENHEGKV VIKNSMDEPEMIDLSWKLLNLPID >gi|223714166|gb|ACDT01000049.1| GENE 64 66801 - 67001 255 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|239623914|ref|ZP_04666945.1| ## NR: gi|239623914|ref|ZP_04666945.1| conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] # 1 62 1 62 904 96 67.0 6e-19 MRRKLFLIVLTTSLLLSSIISIYVIHNLQGNARVINYTGVVRGATQRLIKQELSNKPNDK LVKKNR >gi|223714166|gb|ACDT01000049.1| GENE 65 67087 - 67587 380 166 aa, chain + ## HITS:1 COG:no KEGG:Dred_1525 NR:ns ## KEGG: Dred_1525 # Name: not_defined # Def: signal transduction histidine kinase, nitrogen specific, NtrB # Organism: D.reducens # Pathway: not_defined # 72 161 8 104 653 68 39.0 7e-11 MKFDWEGLKKEIYLVREGRDNGNLYFYSEAFFELADDTVHSAELYSEKSVYSAEGTLIIL NLALAILVFFIYKYNTKQEKIRKKLEKEEENNKQRKVQLARLAENMRAPLNDISELLYIV DLETYDLLFINETGMKIFEVDSIYGKKCYKVLQGRDAPCNFVLTAI >gi|223714166|gb|ACDT01000049.1| GENE 66 67895 - 68101 217 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756834|ref|ZP_02428961.1| ## NR: gi|167756834|ref|ZP_02428961.1| hypothetical protein CLORAM_02383 [Clostridium ramosum DSM 1402] # 1 68 70 137 137 123 100.0 3e-27 MYNDYEWCAENIISQKNTLQGIPLAMIDRWIPYFKNDKCVIIENLEEIKEISLEEYEILD SQSITSLV Prediction of potential genes in microbial genomes Time: Thu May 26 09:45:59 2011 Seq name: gi|223714165|gb|ACDT01000050.1| Coprobacillus sp. 
D7 cont1.50, whole genome shotgun sequence
Length of sequence - 12379 bp
Number of predicted genes - 11, with homology - 11
Number of transcription units - 7, operones - 2 average op.length - 3.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 46 - 453 430 ## Ddes_1762 diguanylate cyclase/phosphodiesterase
+ Term 517 - 562 -0.8
+ Prom 473 - 532 5.1
2 2 Tu 1 . + CDS 597 - 1100 422 ## COG2200 FOG: EAL domain
+ Term 1101 - 1145 -0.6
3 3 Tu 1 . + CDS 1500 - 4778 3703 ## COG1472 Beta-glucosidase-related glycosidases
+ Term 4785 - 4840 6.0
+ Prom 4877 - 4936 7.8
4 4 Tu 1 . + CDS 4974 - 5765 932 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family
+ Prom 6381 - 6440 8.0
5 5 Op 1 . + CDS 6543 - 7307 826 ## COG0561 Predicted hydrolases of the HAD superfamily
6 5 Op 2 . + CDS 7342 - 7596 158 ## gi|167756841|ref|ZP_02428968.1| hypothetical protein CLORAM_02390
7 6 Op 1 31/0.000 + CDS 7918 - 8661 1036 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain
8 6 Op 2 34/0.000 + CDS 8664 - 9344 550 ## COG0765 ABC-type amino acid transport system, permease component
9 6 Op 3 1/0.000 + CDS 9337 - 10065 552 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
10 6 Op 4 . + CDS 10065 - 11174 989 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase
11 7 Tu 1 .
- CDS 11184 - 12377 680 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid Predicted protein(s) >gi|223714165|gb|ACDT01000050.1| GENE 1 46 - 453 430 135 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1762 NR:ns ## KEGG: Ddes_1762 # Name: not_defined # Def: diguanylate cyclase/phosphodiesterase # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 130 680 809 1032 110 39.0 1e-23 MKMVFKKADFYRIGGDEFIIICQSIKKESFEKRVKELSESFSKKPVCQVAIGTQWTNAVG NINEMIAEADARMYENKKEFYHKHMISRRYRHHSDEMLHLTNIDYLESEIENGHFVVYLQ PKILCEDRSVLEQKH >gi|223714165|gb|ACDT01000050.1| GENE 2 597 - 1100 422 167 aa, chain + ## HITS:1 COG:BH2971_2 KEGG:ns NR:ns ## COG: BH2971_2 COG2200 # Protein_GI_number: 15615533 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Bacillus halodurans # 4 167 89 251 251 110 35.0 1e-24 MKIIPLSINFSIESLRGKSFVERILETCKKYQIPTKYIEIEITERVHDEKNFEIKTVISK LRSAGFIVAIDDFGTEYANLALLSDAEFDILKLDKSLISNVALNPRTKIIMEYISKICHR LGVDMIAEGIESEEQFFTLCSYGVETVQGYLFSKPLAINVFEEKYLS >gi|223714165|gb|ACDT01000050.1| GENE 3 1500 - 4778 3703 1092 aa, chain + ## HITS:1 COG:BS_ybbD KEGG:ns NR:ns ## COG: BS_ybbD COG1472 # Protein_GI_number: 16077234 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus subtilis # 7 680 2 634 642 369 36.0 1e-101 MKRLGKRRIISLIMALSMAVTTVFSANISNVRALTNAEKARELVSKMTLEEKIGQKLMLS FRSGWTMRDGTKISSVQTINDEIHEIIGEYDIGSVILFAANFNSDAKVNVELTDGLQKAA MDKDLGKNSIPLLIATDQEGGIVYRLTGGTALPGNMALGASGNTENAVKAGNIIGSELNA VGVNVNFAPDADVNNNPNNPVIGLRSFSSNPQLAAKFVSAYIEGVQSNNVATAAKHFPGH GNVATDSHTGLPSVPATKEELYKTELVPFQAAIDAGTDMVMTAHIQFPNIVTEEIYSDKK DELMSPPATLSREILTDLLRDEMKFDGVIVTDSMTMQGVANYFDTNERNLLAVKAGVDIL DIPFTDISSMKDMESKLIPLINAFVDAYTKEDGYNGIKLSLEELDKSVERILVCKYNRDV MDLANDTTTLDDKKAIASKVVGSVENRETERLISANAVTVVKNENNVLPLKLTKDSKVVF ASTYSRNNNRFILAWQRAKQAGIIPEGADYKILQHYNWTGLNDKVNSAINSDGTNFVGTN KDVLDWGTVLVHASEISSASGIDGYLVACPQLFTKYCKENGKQTVVISLNHPYDVQAFPD 
ADGILAVYGTTSLGLDITESFGGGTVGATAAFSPNLTAGTEVALGTFGASGKLPVDIPKY VPKSGVYSTDENVYELGYGIEYDAIAKDPDKKGLADAIAQAKAIDKSVYTTESWEAAAEA LNTLKVVLQTAENVNLTHKLAQSVVDENETDLRDKYNTVLSLLVYKSDKKALSDLIAEAK KLDKDHYTIETWQNFEKALENAESIYDGNVPQDEVDLAKEALEKAKNELLEKNKKELSAA LEMAKQVTPEQLDKVVPAVVTEFKAALENAQIIFDKKDANELEINKAFERLSNVMQMLEF YKGDKTQLSALVEKIEKLNKNDYLSATWKNLESVLTTAKDVLNNPDALEIEVSETHDQLV RTFLQLRLKPNKDLLSELINKAENTDQTKYTAVSLLSLKSALVEAKAVLDNNEATQDEVT AVEILLKTALDNLVVKEITLDDPNAQTPPDAKPGNGAITATKTGDDSMMLTFALAGLLAL AGSVIVLKKKEE >gi|223714165|gb|ACDT01000050.1| GENE 4 4974 - 5765 932 263 aa, chain + ## HITS:1 COG:DR0470 KEGG:ns NR:ns ## COG: DR0470 COG1387 # Protein_GI_number: 15805497 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Deinococcus radiodurans # 9 88 8 87 260 75 40.0 1e-13 MKQQKWNIHTHTTRCGHATGLDIQYIQSAIEAGFTMLGFSEHLPYPEIRISGARMFYEQK DEYIATIRKLKQDYQDKIDIKVGYEVEYLKDHFNDLMNLKKECDYMILGQHFKYLIYDYD SYCSDEDVYVYAQQIEEALSKNFITYVAHPDYFMMGRRSFSNACVDAAHRIAKASIRYDT PLEINLNGFGYGKKQYYINNELKACYPYPFREFWEIIAMYGCKVTFGYDAHSPLTLLERE RECWALDILSDLPLNFIDHIELK >gi|223714165|gb|ACDT01000050.1| GENE 5 6543 - 7307 826 254 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 253 1 256 256 112 30.0 5e-25 MRKYFFFDIDGTLTNSNPGGIILPSTFETLDKLRKNGHFVAIATGRAHWMAIDFSHESKI DNLVCDGGNGLVINGELLGIEPLDKNICLEIIDECIEKKFPFGVSLGDVPELYTCNEWLS NFKMHTKIIVDPQIDFHRVDNIYKIFIMATHKQEAELTAIHKLGYMRYHGDQLIVEPLEK YRGILKMIEIQGGKPEDIVVFGDGHNDISMMRQAPISIAMGNAIDEVKEVATYITKSNQE DGIEYACKHFGWID >gi|223714165|gb|ACDT01000050.1| GENE 6 7342 - 7596 158 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756841|ref|ZP_02428968.1| ## NR: gi|167756841|ref|ZP_02428968.1| hypothetical protein CLORAM_02390 [Clostridium ramosum DSM 1402] # 1 84 1 84 84 
148 100.0 1e-34 MELIKWILMIAGILAFLIIGCMILLRSDEKKRSRTKQYQENIDMLNETNEMLDTTLMELE SYNEPLKKICEHIDNFNHKFLGQK >gi|223714165|gb|ACDT01000050.1| GENE 7 7918 - 8661 1036 247 aa, chain + ## HITS:1 COG:TM0593 KEGG:ns NR:ns ## COG: TM0593 COG0834 # Protein_GI_number: 15643359 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Thermotoga maritima # 1 245 1 243 246 99 33.0 6e-21 MKKLLKVLLVCTMAFTLAACGGNSGSDGTKEILIGISPDYPPYESLEGKDMVGFDIDMTK ELFKIMNDNGGNYEYKFKQMSFDTISTSLISDQIDLGISGFTMHKDWDVLWSDKYNDSRQ VALVANDSTITTKTDLEGKNIGAQLAATGESVANDIKDAKVKAVKDVKVLIETLNSGGID AIILDEAVAKNYVEQGGYKMLDETLLEEENLIIANKGSEDLIKDINKALAEFIKSDKYQE LKTKWGA >gi|223714165|gb|ACDT01000050.1| GENE 8 8664 - 9344 550 226 aa, chain + ## HITS:1 COG:sll1270_2 KEGG:ns NR:ns ## COG: sll1270_2 COG0765 # Protein_GI_number: 16330176 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Synechocystis # 16 219 35 237 255 163 45.0 3e-40 MDALVAFCTADNLIFLLQGAGKSLLLAACALFFGLILGVLGAAAKISKHKILRLIGNVYV ELIRGTPMLLQILFLYIAFPLLMRSITGERFIPDPYVCGAIAMSINSGAYSTELIRSGIM GVDKGQWEACEVLGLNYTQTMKLIILPQAFKRIIPPIVSEFITLIKDSSLISVLGATELL YSAQILGANTFNLIPPLLASAVLYLIMTLTTSYFARKIERRLSVSD >gi|223714165|gb|ACDT01000050.1| GENE 9 9337 - 10065 552 242 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 240 1 242 245 217 46 4e-56 MIKVEHITKKFNDLKAVNDVSLEIRKGEIVCLIGPSGSGKSTVLRCINGLEIPEEGTVFI NEQPLNAKNPQAFKELRSKMGFVFQHFNLFPHKTVLENLTLAPIQVLGMEQEAANQKALE LLKRVGLSDKKDVYPNKLSGGQKQRVAIARALCLGPEVMLFDEPTSALDPEMVIEVLEVM QELAKEGMTMVVVTHEMGFARTVGDRVIFLENGKIIEEAKSEDFFTHPKSERAKDFLSKV MH >gi|223714165|gb|ACDT01000050.1| GENE 10 10065 - 11174 989 369 aa, chain + ## HITS:1 COG:L91456 KEGG:ns NR:ns ## COG: L91456 COG0436 # Protein_GI_number: 15674014 # Func_class: E Amino acid 
transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Lactococcus lactis # 1 367 1 373 376 252 35.0 7e-67 MKIDLFDVEAWMTEYETNYRYNLAETCVASMSINDLLELVENKKGVIDNLLNTKLDYGPI VGSQRLREGIARLYQNGDADNITISHGCINANEMVLISLLEAGDHLITITPTYQQFYSFP ESLGVETTLIPLLEENDWLPRLSDFENAIKDNTKMICLVNPNNPTSTKFSREFLENLTDL AKEYHLYILCDEVYQGLGDGEVAISDLYDKGISTASLSKVTSFAGLRLGWVKANYEVIKL INDRRDYHIISTGYLNDYLGTLVIENYHKILERSRKIINTNRQILIDWLAQEPLVDCVVP EAGTIAFLKYHLSIKSRELCAKLQQDTGVFFVPGACFDQEYHLRFGFANNSDEIKTGLQL FSKWLKENA >gi|223714165|gb|ACDT01000050.1| GENE 11 11184 - 12377 680 397 aa, chain - ## HITS:1 COG:BH1233 KEGG:ns NR:ns ## COG: BH1233 COG2244 # Protein_GI_number: 15613796 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Bacillus halodurans # 1 388 40 438 522 149 28.0 7e-36 MALYILVMPTVSLCITLGQLGIPSAVFRLVANPKYNNKKIVISGLTICLFTCIIVMSALL FSSKFISHNLLKNEGAFYPLLSLALFIPLVGISGIIKNYFLAKQNVFLVAKAGFVEEVAR LSFSYLMIRQFSFLTDTYLVSFATLAMSVGELASILYLTTRLKDHHITTFNKQSIVNNLQ IKDMLNIAMPLTGSRLYQCLVSFLEPIILVYVLTKFGLKESMIHQQYAIISGYVISLLVT PTFFNGVIYRLILPIITNDVVYHKISAARYHILIAMIGSFLISLPFTLIFYLYPEVCLKI LYDTTSGATYLKYMCIPFTIFYLETPLSALLQALNKNRLMFAISIIQCTLEIVLIYFLSQ SLGVFSIAVSMLIGIVVALFINMVACYKYLFIDNKKA Prediction of potential genes in microbial genomes Time: Thu May 26 09:46:11 2011 Seq name: gi|223714164|gb|ACDT01000051.1| Coprobacillus sp. D7 cont1.51, whole genome shotgun sequence Length of sequence - 12413 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 3, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 239 269 ## gi|237734557|ref|ZP_04565038.1| conserved hypothetical protein 2 1 Op 2 7/0.000 + CDS 277 - 1914 1847 ## COG0608 Single-stranded DNA-specific exonuclease 3 1 Op 3 9/0.000 + CDS 1978 - 2496 713 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 4 1 Op 4 . 
+ CDS 2541 - 4766 2355 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases + Term 4994 - 5024 2.0 + Prom 4953 - 5012 3.9 5 2 Op 1 13/0.000 + CDS 5042 - 6319 1567 ## COG0124 Histidyl-tRNA synthetase 6 2 Op 2 . + CDS 6312 - 8069 2156 ## COG0173 Aspartyl-tRNA synthetase 7 2 Op 3 . + CDS 8056 - 9129 1137 ## COG0006 Xaa-Pro aminopeptidase + Term 9130 - 9167 4.1 + Prom 9164 - 9223 4.6 8 3 Op 1 . + CDS 9250 - 10560 1328 ## COG0773 UDP-N-acetylmuramate-alanine ligase 9 3 Op 2 . + CDS 10609 - 10899 236 ## gi|167756855|ref|ZP_02428982.1| hypothetical protein CLORAM_02404 10 3 Op 3 . + CDS 10915 - 11391 733 ## gi|167756856|ref|ZP_02428983.1| hypothetical protein CLORAM_02405 11 3 Op 4 . + CDS 11391 - 12395 1204 ## COG1609 Transcriptional regulators Predicted protein(s) >gi|223714164|gb|ACDT01000051.1| GENE 1 3 - 239 269 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734557|ref|ZP_04565038.1| ## NR: gi|237734557|ref|ZP_04565038.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 78 1 78 78 127 100.0 2e-28 KLKARDFKQQNIHNISEQQIKDYLFERKWKQRDAVPMCEIIDDIMNLDFSEIFDYLSLQV VKEASRLSINDFSDFISK >gi|223714164|gb|ACDT01000051.1| GENE 2 277 - 1914 1847 545 aa, chain + ## HITS:1 COG:SA1462_1 KEGG:ns NR:ns ## COG: SA1462_1 COG0608 # Protein_GI_number: 15927216 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Staphylococcus aureus N315 # 3 544 8 557 557 309 33.0 1e-83 MDWRIINDLDYNHYMQEYKINALLAKVFAYKQYSSQEIETMLSSRLVYHDFSLFSEAEMT LERIHEAIGNKEKICIYGDYDCDGILATSILVEAFRQLGVEVGYHIPSRLEDGYGLNANR VEQMANKGYTLIITVDNGIKAYEAIEKANELGVDVIVTDHHNYDDELPDAFSIIHTKISP DYPYKEISGGFVAYKLASALLKKHDKYLFSLAAITTISDMMPLLDENRALVKRALEFMKE NKYPQLELLLGSNQTYSVTSIGFIIAPKINSFGRLSEIINPNHLVKYFRSDASIGILKAV SEKAIEVNAKRQELTNKQYQSVLTMIDPNEQFLYAYQNEIHEGIIGLVAGKYTRQYNRPS FVMTFDSEKNLYKGSARGIEGIPLNKIFENVQDLLESYGGHALAGGFSVGSDKVEELKEK LELYLNKQLKDYTPPLVDVIEVTGNEISKVSVKQLELLEPLGNGNEEMRFFIKDLPVSKV 
TSLSNGKHIRFDLSLPQVKAQALFFNHGDLFEQLKDTKMINAVGKFNINVFNNIESVNLI IEDIK >gi|223714164|gb|ACDT01000051.1| GENE 3 1978 - 2496 713 172 aa, chain + ## HITS:1 COG:FN1483 KEGG:ns NR:ns ## COG: FN1483 COG0503 # Protein_GI_number: 19704815 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Fusobacterium nucleatum # 1 172 1 170 170 230 64.0 9e-61 MDLTKYVASIENFPEDGIIFRDVTPLMADKEAFKETIDRFVEWTKSLDTNVDIVAGPEAR GFLFGCPVAYNLNAGFVPVRKPGKLPRETVSETYALEYGTNEVHMHADSIKPGQNVIIVD DLLATGGTVEATVKLVEKLGGKVVGIAFLIELEGLKGRDKLQGYNIYSILKY >gi|223714164|gb|ACDT01000051.1| GENE 4 2541 - 4766 2355 741 aa, chain + ## HITS:1 COG:lin1558 KEGG:ns NR:ns ## COG: lin1558 COG0317 # Protein_GI_number: 16800626 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Listeria innocua # 11 741 7 738 738 635 44.0 0 MAEYKGAGDTVTINDVLEVCGKYITNQESIDLINQAYDFIMEKHEGQKRKSGEPYTIHLI WVAYILATLQTGPATIAAGLLHDVMEDCGVSREEMVERFGEEITTLVEGVTKIGKMPFKD ESDVYAENHRKIYIAMAKDIRVILIKFADRLHNMRTLQFMPPHKQKRISRETLEVYTPIA HRLGINDIKTELEDLSLSYLDPVAYSEISKLLATKQEERKESVAKMMASVKQILDDNSYQ YRILGRAKSIYSIYKKMFVKKKRFDELYDLYALRIITQDKMDCYGILGLIHDQYRPIPGR FKDYIAMPKPNMYQSLHTSVIGEDGNTFEIQIRTEEMDELAELGVAAHWRYKENKGYSSK KEQQEIGEKLQWLREFISLSDDIKDGDAKEYYDSLKRDIFEANVYVLTPQGKIVELPNGS TPIDFAYRIHTEVGHKAVGAIVNNVMVPIDTKLNTGDVVEIKTNKQSLGPSEDWLKFVRT AGARNKIRQFIANREAETKKETIEDGRKMLRDELKKRQLDEKKYMDPDTYKTYLGSFGAR SFDDILYFIGKKSLPAMNLLDRVVPKKSGFFDSISKMLQRNNQVNEKQQSSRNNSGVVVK GLDGLRIQLSKCCNPIPGDDIMGFVSQGQGIKVHRRDCPNIQQPEIKARLIDVYWDFASI STMKFQADLEIVGLDRPNLLNDVVTSLGQMKINILNIHADVVDMKAIIKLKISVEDASRL QQAIDNIDRIQGIYEIKRVIH >gi|223714164|gb|ACDT01000051.1| GENE 5 5042 - 6319 1567 425 aa, chain + ## HITS:1 COG:BS_hisS KEGG:ns NR:ns ## COG: BS_hisS COG0124 # Protein_GI_number: 16079810 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: 
Bacillus subtilis # 1 421 1 424 424 475 51.0 1e-134 MMYKAPRGTVDILPEKSAKWQTLEQLIRTICANYNVKEVRTPIFEHTELFTRAVGDTTDV VSKEMYTFADKKGRSITLRPEGTAGVARAYVENKMFSNPDKISKLYYIGPMFRYERPQNG RQRQFHQFGVEMIGYESPYLDVECMTMAVTLVEALGLQGVKLHINTLGDGDSRDAYREAL KAHFAPYLDDLCQDCKNRYEANPLRILDCKVDHNHPAMANVPKTVDYLSEAAKAHFTKVC ELLDDLEIEYVVDTNLVRGLDYYCHTVFEVISDDPRLGAGATVGGGGRYNGLVQELGGPE APGVGFAFGMERLLIAMDEELEEPEGLDVYVMPMGDEARDLAVQITAMLRANGFSVDMDY QGRSLKAQFKTVDRLNSHFAMIIGDQEIADEVVNIKCTHSKTQDIVPIENIVAYIENHME GHVHE >gi|223714164|gb|ACDT01000051.1| GENE 6 6312 - 8069 2156 585 aa, chain + ## HITS:1 COG:BH1252 KEGG:ns NR:ns ## COG: BH1252 COG0173 # Protein_GI_number: 15613815 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Bacillus halodurans # 1 580 2 586 595 689 57.0 0 MNRTHNNGELRISDVNKIVELKGWVAKKRNLGSLVFIDLRDRYGITQIVLNEEMEALGKD IKNEYVLHVVGKVIERSSKNPNMPTGDIEIEAAQIDIINTAKVTPIYTTDDTDASEDARM QYRYLDLRRPVMQNKIIMRHKITSSIRNFLDNHEFIDIETPYLNRSTPEGARDFLVPSRV HNGSFYALPQSPQLFKQLLMVSGFERYYQIARCFRDEDLRLDRQPEFTQVDVEMSFMTTD EIISIGEQLLAQVMKDVKGIELQLPLPRMTWHEAMEKYGIDKPDIRFGLELVNLNEVVKD VEFMVFKSALEADGHVKGINVKGEADNFSRKAIDKLTDVVKTYKAKGLAWLKVKAGKAEG PIAKFFNEEQMTKLLKAMDASDNDLLLFVSDAKYNVVCDALAALRNHLGKELKLFNPDEF AFLWVVDFPMFEYDDETERYYAVHHPFTRPKDSDIDKIENDPANCLADAYDIVLNGFELG GGSQRIYEQELQERAFRALGFTQERIDSQFGWFVEAFQYGTPPHGGFALGLDRLAMLLTG SENIREVIAFPKNASAVCPMSKAPSEVDNAQLEELGIAVIEDEQE >gi|223714164|gb|ACDT01000051.1| GENE 7 8056 - 9129 1137 357 aa, chain + ## HITS:1 COG:CAC2788 KEGG:ns NR:ns ## COG: CAC2788 COG0006 # Protein_GI_number: 15896043 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Clostridium acetobutylicum # 1 357 1 357 358 315 45.0 1e-85 MNKNRIDAVVNNMKEAGLDYLLISEPSSIDYLIDYVNNPGERMYVLMLAANGAHKLFFNK LFFVENDLGIEIVWHSDTDDATKTIADYVENSGTIGVDKHWSANFLLSLMEKLPDVKFVN GSFCVDFVRMVKDENEQVLMIEASRINDQAIHEVIHQVSLGLSELEVAGKLSGIYSKFGG DGNSFDAIIAYGANGANPHHENDDSHLKPGDSIIIDMGCKYNGYCSDMTRTVFYQEVSEE 
AKEVYGLVRLANETAEAMIKPGVRLCDIDKAARDIITDAGYGKEFNHRLGHFIGKDVHEF GDVSVNFDLEVKEGMIFSIEPGIYLPGKFGVRIEDLVMVTKDGCKVLNSYPKDLFVI >gi|223714164|gb|ACDT01000051.1| GENE 8 9250 - 10560 1328 436 aa, chain + ## HITS:1 COG:lin1646 KEGG:ns NR:ns ## COG: lin1646 COG0773 # Protein_GI_number: 16800714 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Listeria innocua # 1 430 3 431 447 365 43.0 1e-101 MYYFIGIKGSGMAPLAEILHELGNEVCGSDIDKYIFIEEELRKRNIPIYSFDANNIKDGY TVIIGNAFGDDHTEVKAALANPNVKCYRYFEFLGEFMENFISISIAGTHGKTTTTGMAYH LFEEFDKTTVLIGDGTGHAVKDSKYFIAESCEFQDHFLHYYPKYAIINNIELDHVDYFGN LERYIESFEKFANQVKDTVIVWGDDPNIKKINFKKRVMRFGLNDYNDVRAVNVLENSNGL SFDVYIHNELFGHFNLPYFGMHMLYNSLAIITLGYLENMTNDYIQNRMATFEGTKRRYSV TEVGNNVYVDDYAHHPTAIKYVIEATRVRYPSKKIVAIFKPDRFSRGARFAVDFAKSMDL ADYPYFCPFPENAVKEEGIDIDIYDIANNLPRAKVIGEDEEAAKELAKFDNVVFLFMSSK DIYKLEEKVIAIKKSI >gi|223714164|gb|ACDT01000051.1| GENE 9 10609 - 10899 236 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756855|ref|ZP_02428982.1| ## NR: gi|167756855|ref|ZP_02428982.1| hypothetical protein CLORAM_02404 [Clostridium ramosum DSM 1402] # 1 96 1 96 96 178 100.0 8e-44 MTAIDVCYCVLILSGAFALVSLGILLLRSSTTVKQVGNTVEMAQSTINKADKIMDDITYK LDLLNAPVETIARFFDPNRPKFNPISAIIGLFKKKF >gi|223714164|gb|ACDT01000051.1| GENE 10 10915 - 11391 733 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756856|ref|ZP_02428983.1| ## NR: gi|167756856|ref|ZP_02428983.1| hypothetical protein CLORAM_02405 [Clostridium ramosum DSM 1402] # 1 158 2 159 159 224 100.0 1e-57 MKLGKFIAGVGIGAVIGMLCAPKKGSELRGELKEKSQDLYDKAQNMTKDDVESLINNTIE EIKLAIDEFDVDEFKDTAGEKLGDIKTKLEQLATSVKSSDEYASFKESVAKVSDEVTTKF KEIKTKVQDKDFNVLQELDDAMDDIEDELDVIIEDLKD >gi|223714164|gb|ACDT01000051.1| GENE 11 11391 - 12395 1204 334 aa, chain + ## HITS:1 COG:BS_ccpA KEGG:ns NR:ns ## COG: BS_ccpA COG1609 # Protein_GI_number: 16080026 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 1 332 1 332 334 179 33.0 8e-45 
MKKVTIYDVAREAGVSLATVSRVINGSNVVREKTKQKVLDVIDRLDFKPNDIARGLATSK TTTIAIVFPQSLFAHVKDMIGGIGDTGRHLDYNINMYTTDDIGDENTVADVTERLVKSRV DGVILFNNDNIDETVESIVKYNLPIVVIGTKMSGENIGSIYIDVKKAAEEIVDKYLAKGK DDIVYILPKQNLIKSDEIIEGIKDTYKKYNKEFTTDQIVTSSSHYENTYPNFVEYFKNHK HDLVFCGYDKDGVAIINAAQENGIKIPEEMEVVGMLNTSYSIMCKPTLSSMNVPVYDMGA LAVRLLTKFLQDEEITSKEIAVQHMFIKRNSTND Prediction of potential genes in microbial genomes Time: Thu May 26 09:46:32 2011 Seq name: gi|223714163|gb|ACDT01000052.1| Coprobacillus sp. D7 cont1.52, whole genome shotgun sequence Length of sequence - 7826 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 51 - 512 461 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 2 1 Op 2 . + CDS 509 - 1486 790 ## gi|237734569|ref|ZP_04565050.1| predicted protein + Prom 1540 - 1599 4.5 3 2 Tu 1 . + CDS 1834 - 3060 581 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 4 3 Tu 1 . - CDS 3080 - 3820 803 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Prom 3870 - 3929 8.6 5 4 Op 1 12/0.000 + CDS 3919 - 4761 876 ## COG0130 Pseudouridine synthase 6 4 Op 2 . + CDS 4761 - 5675 520 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 + Prom 5678 - 5737 7.5 7 4 Op 3 . + CDS 5763 - 6602 542 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 6716 - 6760 4.2 8 5 Op 1 . - CDS 6614 - 7072 526 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 9 5 Op 2 . - CDS 7059 - 7478 455 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 10 5 Op 3 . 
- CDS 7475 - 7825 383 ## COG2071 Predicted glutamine amidotransferases Predicted protein(s) >gi|223714163|gb|ACDT01000052.1| GENE 1 51 - 512 461 153 aa, chain + ## HITS:1 COG:lin0443 KEGG:ns NR:ns ## COG: lin0443 COG1595 # Protein_GI_number: 16799520 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Listeria innocua # 3 152 18 181 182 74 32.0 6e-14 MEEAMFEHLFQKYKDMIYRVAFLYLKNEADALDIVQNTFIKLLKKNDFNDEEHLKRWLLR VCINLCKNNLKSYWKRNVSVFEEEFYQLENSDLKLQELVFKLAPKYKGVIHLYYYEGYSV KEISVILKISEAAVKQRLKRARTKLKIELEAES >gi|223714163|gb|ACDT01000052.1| GENE 2 509 - 1486 790 325 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734569|ref|ZP_04565050.1| ## NR: gi|237734569|ref|ZP_04565050.1| predicted protein [Mollicutes bacterium D7] # 1 325 1 325 325 589 100.0 1e-166 MKNNDYKKIVNGIKIPEQKLDLVKNSILQKKSICKFKYATALIVIAAMITGTIYFRTYPR GDNETKTLIDVDYFITNIYANDETYELTDDKVSIDMSSDMFRTTWRCNAKRSCLPFDIRI VGENIKSISYSVIKGTSLDFYRVKSLDLFHDDLSELNKKNDFSIQFKMFNELRDQDKELL KERYQLTEENLESYVNDHIIDIIKDYRAALEQAGYSQESLNMYGGLFWIEINQGDRMTVP YDQQDPLNYRNILYTELNLENKDISVIDNEQIIYQTVKQELLSYEISMVIEYNNGVIKEK IITFEEGTCEERNNECRDTIYMKIK >gi|223714163|gb|ACDT01000052.1| GENE 3 1834 - 3060 581 408 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 10 385 16 396 418 228 35 1e-59 MKIYDELVWRGLIKDVSSPDLEEKLNNGGMTFYIGTDPTGDSLHIGHFSSFLISKRLKEA GHNPILLVGGATGLIGDPKPDTERPMITKETVEHNFACLKKQAQDLFGFEVVNNYDWSKD LNFIDFLRDYGKYFNINYMLNKDIVKRRLDAGITYTEFSYMIMQAMDFEWLYNNKNCVMQ VAGQDQWGNITAGIELIRKKDGKEAYGFTMPLLTKSDGTKFGKTNGKAIWLDREKTSPYE MYQFFINSEDDKVIDYLKFLTFLTPEEIMELEEKNKTQPHLREAHQALAREVITFLHGAE AYEEAVNISKMLFSGQIQSLTLAQVKVCFEGVPSIEVNEDLNILDALTTCGAAKSKREAR EFVNGGSILINGERIKDIEFIVTKANAFGNEATVVRRGKKNYFVIKHI >gi|223714163|gb|ACDT01000052.1| GENE 4 3080 - 3820 803 246 aa, chain - ## HITS:1 COG:VC1432 KEGG:ns NR:ns ## COG: VC1432 COG0037 # Protein_GI_number: 15641443 # Func_class: D Cell cycle control, 
cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Vibrio cholerae # 12 232 29 249 310 137 32.0 2e-32 MTMKKVLGCIRKADEEFNLIQDGDKVCVGVSGGKDSMLLLYCLSLYKKFATVKFDVIGVH IEMGFPNMDFSEADAFCKKNGIELYHEPSDVYEILKLNKTDDGRLQCSLCSKFKKALVID GAKKYRCNKVAFAHHGDDAVETLFMNMIYGGKIATFTPKMYLTRTEMNFIRPLVYAYESD IVSAVEEAHVPIVASTCPADKHTKREEFKHLLNDLYLKYPQAKSNLLTSLTNEENTMLWK KTPRKK >gi|223714163|gb|ACDT01000052.1| GENE 5 3919 - 4761 876 280 aa, chain + ## HITS:1 COG:BS_truB KEGG:ns NR:ns ## COG: BS_truB COG0130 # Protein_GI_number: 16078729 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Bacillus subtilis # 1 267 2 289 309 238 43.0 8e-63 MDGIILVNKPSGMTSHDVVNKLRRILKTKKVGHCGTLDPDATGVLVVCVNKATKVLQFLT SESKEYVATLSLGTSTDTYDASGKIIETKEFHALDNNEIVACFNNFIGSQEQKPPIYSAI KVNGKKLYEYARAGEQVEVPTRSVTVNHLEILQIENNLIKFKVGCSKGTYIRSLCYDLAK ALGYPGHMKDLIRTKSGNFSLENCFTLEQIENGEYTTVSLEEALNSYQQLVVDDEKIIFH GKKIKSDLNEQVVILNRQGKVLAMYGPDGNGYLKSIRGLW >gi|223714163|gb|ACDT01000052.1| GENE 6 4761 - 5675 520 304 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 19 299 20 308 317 204 37 1e-52 MKVIYLNEENMIALEPSCVALGFFDGMHLGHQKLIDEVLKVSKIKNLKKGLLTFDVHPKS YLLDSSFKYLMSLEDKIEFLEKLNFDYLFVLRFNHQLASKEPRQFIDEFIIKPKIKHVVC GFDFHFGNHGSGDSVYLKNNRNNDYEISIIDKLEYEQHKISSSYLRQVLSNGQVELASQL LGRQYQVTGKVVHGRENGRKIGFPTINVAAIDYVLPKNGVYGAKVIIDGKEYIGMANLGY NPTFTALKQASLEVNIFDFDQNVYGKQVNVMFIKHIRSEKKFPSINDLIEQLNKDKQQII NEMI >gi|223714163|gb|ACDT01000052.1| GENE 7 5763 - 6602 542 279 aa, chain + ## HITS:1 COG:lin0496 KEGG:ns NR:ns ## COG: lin0496 COG0697 # Protein_GI_number: 16799571 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Listeria innocua # 2 276 10 293 319 65 
22.0 1e-10 MLYGWLIFSSFLWGTNVLVMKYMLENTTTYFLAALKVLLSVIAIFVIMKYKKIPFKWYKG SLGIKVSLLSITINFILTFEGLNLITGSSNAIVNSLAPLVTILLTWLFYHTKVNRYQLIA MIIACIGFLISLDFNISQISLGHLMMIGGIVLYSYGNLLMQHQCKKEDSLPFTFQYLCLS FFQLALITLFIPSSNHLENISITLWVLFIIFSGIGFAIIQLTYFRAVHEIGSVKTSFLLG LNPVFTYIGSLLLQENFNFNKFMAMILMVVAMVIANKKR >gi|223714163|gb|ACDT01000052.1| GENE 8 6614 - 7072 526 152 aa, chain - ## HITS:1 COG:lin1004 KEGG:ns NR:ns ## COG: lin1004 COG2084 # Protein_GI_number: 16800073 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Listeria innocua # 1 151 136 286 286 159 51.0 2e-39 MVGGDENVFKQIEPVLEVLGSNINYVGSAGNGQHTKMANQIALAGALAGVCEALTYAKAV NLDPQIMLDCISSGAAGSWQMTNNAPRIIADDFAPGFYIKHYIKDMKIAKAVNDTNNLDL EVLNTVLQQFEQLQTDGFENLGTQALIKYYQR >gi|223714163|gb|ACDT01000052.1| GENE 9 7059 - 7478 455 139 aa, chain - ## HITS:1 COG:BS_ykwC KEGG:ns NR:ns ## COG: BS_ykwC COG2084 # Protein_GI_number: 16078460 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Bacillus subtilis # 2 129 3 130 288 147 55.0 5e-36 MKKIAFIGVGVMGKAMVQNLSQAGYPVSIYTRTKNKVEDLISDKIIWHDSVAECVKDQDI VMTMVGFPQDVEEIYFSSKGIIANAKKDCILIDFTTSSPLLAKKIFEESSKHSLASLDAP VSGGDIGAKKRDIVNYGWR >gi|223714163|gb|ACDT01000052.1| GENE 10 7475 - 7825 383 116 aa, chain - ## HITS:1 COG:ML1573 KEGG:ns NR:ns ## COG: ML1573 COG2071 # Protein_GI_number: 15827825 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Mycobacterium leprae # 1 95 125 222 249 80 40.0 6e-16 INVAFNGTLIQDIPETKKYDNHLQENLEGYHHLVKTIPHTKLNKYLGNEFMTNSFHHQAI DDIAPGFVVSAITADGIIEGIEKDNIIAVQWHPEKNHDKIQAGLIKLYKELLEEEK Prediction of potential genes in microbial genomes Time: Thu May 26 09:47:01 2011 Seq name: gi|223714162|gb|ACDT01000053.1| Coprobacillus sp. 
D7 cont1.53, whole genome shotgun sequence Length of sequence - 47096 bp Number of predicted genes - 48, with homology - 48 Number of transcription units - 11, operones - 7 average op.length - 6.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1189 1396 ## COG4477 Negative regulator of septation ring formation + Prom 1194 - 1253 4.9 2 2 Tu 1 . + CDS 1370 - 1933 575 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 2061 - 2116 0.3 + Prom 2032 - 2091 6.4 3 3 Op 1 . + CDS 2167 - 2364 192 ## gi|167756869|ref|ZP_02428996.1| hypothetical protein CLORAM_02418 4 3 Op 2 7/0.000 + CDS 2387 - 3301 864 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 5 3 Op 3 . + CDS 3305 - 4501 1216 ## COG0301 Thiamine biosynthesis ATP pyrophosphatase 6 3 Op 4 . + CDS 4570 - 4791 361 ## Aflv_0473 small acid-soluble spore protein (alpha/beta-type SASP) + Term 4804 - 4831 0.1 + Prom 4817 - 4876 10.5 7 4 Tu 1 . + CDS 4915 - 5136 371 ## Aflv_0473 small acid-soluble spore protein (alpha/beta-type SASP) + Term 5141 - 5186 9.2 + Prom 5230 - 5289 8.4 8 5 Op 1 . + CDS 5317 - 6513 1529 ## COG0282 Acetate kinase + Term 6514 - 6546 2.5 + Prom 6518 - 6577 4.3 9 5 Op 2 . + CDS 6598 - 7533 458 ## PROTEIN SUPPORTED gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B 10 5 Op 3 . + CDS 7526 - 9190 1666 ## DSY4330 hypothetical protein + Term 9206 - 9249 7.1 + Prom 9213 - 9272 9.3 11 6 Op 1 40/0.000 + CDS 9343 - 10014 847 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 12 6 Op 2 . + CDS 10011 - 11402 1313 ## COG0642 Signal transduction histidine kinase 13 6 Op 3 . 
+ CDS 11415 - 11987 787 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 14 6 Op 4 10/0.000 + CDS 12004 - 14949 3434 ## COG1196 Chromosome segregation ATPases 15 6 Op 5 7/0.000 + CDS 14960 - 15943 936 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 16 6 Op 6 8/0.000 + CDS 15945 - 16274 362 ## COG2739 Uncharacterized protein conserved in bacteria 17 6 Op 7 . + CDS 16287 - 17681 1585 ## COG0541 Signal recognition particle GTPase 18 6 Op 8 . + CDS 17723 - 18457 504 ## PROTEIN SUPPORTED gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase + Term 18462 - 18497 1.1 + Prom 18459 - 18518 3.2 19 7 Op 1 . + CDS 18538 - 18810 455 ## PROTEIN SUPPORTED gi|167756884|ref|ZP_02429011.1| hypothetical protein CLORAM_02433 20 7 Op 2 . + CDS 18813 - 19064 370 ## gi|167756885|ref|ZP_02429012.1| hypothetical protein CLORAM_02434 + Term 19067 - 19097 2.0 21 7 Op 3 30/0.000 + CDS 19104 - 19613 497 ## COG0806 RimM protein, required for 16S rRNA processing 22 7 Op 4 33/0.000 + CDS 19603 - 20322 712 ## COG0336 tRNA-(guanine-N1)-methyltransferase + Prom 20335 - 20394 2.0 23 7 Op 5 5/0.000 + CDS 20429 - 20779 580 ## PROTEIN SUPPORTED gi|167756888|ref|ZP_02429015.1| hypothetical protein CLORAM_02437 24 7 Op 6 2/0.000 + CDS 20847 - 21392 578 ## COG0681 Signal peptidase I 25 7 Op 7 8/0.000 + CDS 21403 - 22269 850 ## COG1161 Predicted GTPases 26 7 Op 8 2/0.000 + CDS 22256 - 22876 674 ## COG0164 Ribonuclease HII 27 7 Op 9 13/0.000 + CDS 22946 - 23698 704 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake 28 7 Op 10 6/0.000 + CDS 23775 - 25847 2107 ## COG0550 Topoisomerase IA 29 7 Op 11 5/0.000 + CDS 25907 - 27211 1195 ## COG1206 NAD(FAD)-utilizing enzyme possibly involved in translation 30 7 Op 12 . 
+ CDS 27192 - 28100 744 ## COG4974 Site-specific recombinase XerD + Prom 28105 - 28164 6.6 31 8 Op 1 38/0.000 + CDS 28262 - 29125 1446 ## PROTEIN SUPPORTED gi|237734605|ref|ZP_04565086.1| 30S ribosomal protein S2 + Term 29139 - 29169 2.0 32 8 Op 2 24/0.000 + CDS 29196 - 30086 539 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts + Prom 30199 - 30258 5.4 33 8 Op 3 33/0.000 + CDS 30278 - 30994 933 ## COG0528 Uridylate kinase 34 8 Op 4 . + CDS 30996 - 31541 736 ## COG0233 Ribosome recycling factor + Term 31545 - 31579 5.3 + Prom 31544 - 31603 1.9 35 9 Tu 1 . + CDS 31707 - 31988 114 ## BCA_3719 hypothetical protein + Term 32207 - 32240 -0.9 + Prom 32204 - 32263 7.8 36 10 Op 1 32/0.000 + CDS 32285 - 33040 798 ## COG0020 Undecaprenyl pyrophosphate synthase 37 10 Op 2 15/0.000 + CDS 33040 - 33822 739 ## COG0575 CDP-diglyceride synthetase 38 10 Op 3 17/0.000 + CDS 33824 - 34975 1337 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase 39 10 Op 4 . + CDS 34975 - 36054 1086 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 40 10 Op 5 4/0.000 + CDS 36130 - 40452 3929 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) + Term 40508 - 40537 2.1 41 10 Op 6 32/0.000 + CDS 40551 - 41018 448 ## COG0779 Uncharacterized protein conserved in bacteria 42 10 Op 7 22/0.000 + CDS 41031 - 42362 742 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 43 10 Op 8 8/0.000 + CDS 42366 - 42632 175 ## PROTEIN SUPPORTED gi|206900953|ref|YP_002250931.1| ribosomal protein L7Ae family protein 44 10 Op 9 10/0.000 + CDS 42625 - 42921 484 ## PROTEIN SUPPORTED gi|237734618|ref|ZP_04565099.1| 50S ribosomal protein L7 45 10 Op 10 32/0.000 + CDS 42926 - 44779 2637 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 46 10 Op 11 . + CDS 44789 - 45136 538 ## COG0858 Ribosome-binding factor A + Term 45204 - 45247 4.2 + Prom 45410 - 45469 10.6 47 11 Op 1 . 
+ CDS 45615 - 46754 1615 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 48 11 Op 2 . + CDS 46756 - 47095 402 ## COG2017 Galactose mutarotase and related enzymes Predicted protein(s) >gi|223714162|gb|ACDT01000053.1| GENE 1 2 - 1189 1396 395 aa, chain + ## HITS:1 COG:L7722 KEGG:ns NR:ns ## COG: L7722 COG4477 # Protein_GI_number: 15674133 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Negative regulator of septation ring formation # Organism: Lactococcus lactis # 1 383 173 558 576 94 22.0 5e-19 IPQKFADIEERFVQLEALMNSQRFDEAKVSADKIDKDIDLLSAYLRDLPTYISIVRKYIP KRLDELYRVITEMKERDFSIERLNTTVRYNKINADLENTIQAIKELNLENVGASIEVMTE DLNSLTADFEKEEAAYSRYEESRNACYKHIGHLDEGLRNTINSLGELQRNYLLSDYEITV KEDYEAFKGILDELDQLTVIIESNDFSYSVLIERFEELIDRCKPFDESLNKYIELENSLR LQEKRALDELDNINIVLLEIKSEIKNKHLPMINESYKDYIDDSYQKADEILKFIRRRPID LERLSVQVDAARDVIYKLYDNVHNLIVTAEMVEDAIIYGNRYRSSFLEVNTELTKAELLF RNGEYTKALTTAVDIIEKINPGSYEMLINKNSAKS >gi|223714162|gb|ACDT01000053.1| GENE 2 1370 - 1933 575 187 aa, chain + ## HITS:1 COG:NMB1820_2 KEGG:ns NR:ns ## COG: NMB1820_2 COG0110 # Protein_GI_number: 15677656 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Neisseria meningitidis MC58 # 4 183 1 180 190 117 36.0 1e-26 MNAYEQIDFLEIKEKGSRKIINHTVFKESEFDKVVKLYDEVIVATGSNDLREEKIKMLLA FGIPLATIVHPTALISSSAKIGSGTTILANVIININTKVGIGCIVNNGAIIEHDCMVGNY VNICPKFAMAGHSSIGYKSYLGIGSTVIDDIRIGNRVTVGAGAVVVSNISDNIVAIGVPA KKMLVDK >gi|223714162|gb|ACDT01000053.1| GENE 3 2167 - 2364 192 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756869|ref|ZP_02428996.1| ## NR: gi|167756869|ref|ZP_02428996.1| hypothetical protein CLORAM_02418 [Clostridium ramosum DSM 1402] # 1 60 1 60 378 115 96.0 1e-24 MIYLDYVATTPLNSEVLNTYHNLLSNYFYNADSMYDKGIEVNRLMEHSRKLISESFKLKK MKYFY >gi|223714162|gb|ACDT01000053.1| GENE 4 2387 - 3301 864 304 aa, chain + ## HITS:1 COG:BH3204 KEGG:ns NR:ns ## COG: BH3204 COG1104 # Protein_GI_number: 15615766 # Func_class: E 
Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Bacillus halodurans # 1 300 75 375 380 237 40.0 2e-62 MAIKGTAFQYQNRGKHIITTAIEHSSVYETCKELEKFFGFEVTYLGVDHKGRISLEELEK SIRDDTILVSIMYVNNEIGVINPIDEIKKIVKKYDKVKLHFDMVQALGKLPIDLNDVDLA SFSAHKIYGLKGSGLLFKRRSTTIVPLISGGQQEFSLRGGTSNACTNIVFAKTLRLALDN FEQKNQHIKDINDYCRKCLQEIEGIVINSDENCCLPSILNFSCLGYKPEVILHDLETKEI YLSTRSACSSKTSNVSRVMAQLHLDEAISSSALRISFGEHTTREEIDRFCYYLQESMRKL KKQR >gi|223714162|gb|ACDT01000053.1| GENE 5 3305 - 4501 1216 398 aa, chain + ## HITS:1 COG:lin1634 KEGG:ns NR:ns ## COG: lin1634 COG0301 # Protein_GI_number: 16800702 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis ATP pyrophosphatase # Organism: Listeria innocua # 1 397 1 395 403 372 46.0 1e-103 MEANYILVRFGELTTKGKNRKLFTNRLLKNTKEILAEFNHLTYDLQHDRMYVVLNGTDHE AVCTKLKTVFGIYSFSVAYKMEKNLELAKQAVLQIIENNDGNTFKINTKRSDKNFPGSSQ EINREIAGYVFHHTTKEIKVDVHAPAILVTVEIRYDAIYVMDNIIKGAGGYPVGVGGKAL LMMSGGIDSPVAGYLTLKRGVDIECIHFAAPPYTNELAREKVFDLVDKLRHYTHGSIRVH VINFTKLQLAVYDNCDESYAMTVMRRMMYRISEKIANKNNCLALVNGESIGQVASQTLNS MQVINEVVKIPVLRPVLCLDKLEIIDIADRIDTYEISIRPHEDCCTIFTPKAPATKPKLY KAEIFESGFDFETLIDECVETAEIITVDHNYKKNDDIF >gi|223714162|gb|ACDT01000053.1| GENE 6 4570 - 4791 361 73 aa, chain + ## HITS:1 COG:no KEGG:Aflv_0473 NR:ns ## KEGG: Aflv_0473 # Name: sspD # Def: small acid-soluble spore protein (alpha/beta-type SASP) # Organism: A.flavithermus # Pathway: not_defined # 6 65 2 61 63 82 75.0 5e-15 MSQSNSSRNKLVVPGAKNAIDQMKYEIANEFGVNLGPDTTARDNGSVGGEITKRLVAMGQ AQMSSSNKYNQSK >gi|223714162|gb|ACDT01000053.1| GENE 7 4915 - 5136 371 73 aa, chain + ## HITS:1 COG:no KEGG:Aflv_0473 NR:ns ## KEGG: Aflv_0473 # Name: sspD # Def: small acid-soluble spore protein (alpha/beta-type SASP) # Organism: A.flavithermus # Pathway: not_defined # 6 65 2 61 63 84 78.0 9e-16 MSQNNSSRNKLVVPGAQNAIDQMKYEIANEFGVNLGPDTTARANGSVGGEITKRLVEMGQ SQMSSSNNYNQSK >gi|223714162|gb|ACDT01000053.1| GENE 8 5317 - 6513 
1529 398 aa, chain + ## HITS:1 COG:BH3192 KEGG:ns NR:ns ## COG: BH3192 COG0282 # Protein_GI_number: 15615754 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Bacillus halodurans # 5 395 1 389 391 432 55.0 1e-121 MVKVMAINAGSSSLKFQLINMPSEEVITSGLVERIGLDQGNFEMKYNGEKFIKECPIKDH SVAVQLLLDALVDHHVVESLGEIEACGHRVVHGGEYYNDAVKVDEEVVARVEELAELAPL HNPAHIVGYNAFKAALPEVEHVFVFDTAFHQTLDRERYLYPLPYEYYTDLKVRKYGAHGT SHKYVSQVAIDMLGNPKHSRVIVCHLGNGASISAVQDGICIDTSMGFTPLAGVMMGTRCG DVDPSIMPYLCKKLNKTPDEILDIYNKKSGMLGISGISSDSRDIENALFKNGDERALLTG LLYARIVSKYIGSYFVEMGGVDAIAFTAGVGENASYLRRLIIDNVSRALGVFLNEEENER RSKENRLISHQYSKVDVYVIPTNEEVMIARDTVRILGL >gi|223714162|gb|ACDT01000053.1| GENE 9 6598 - 7533 458 311 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B [Streptococcus pneumoniae SP18-BS74] # 3 309 5 309 311 181 34 9e-45 MEEKIIAKIEEYDYIAIYRHVNPDFDAFGSQLGMYDMIKATYPDKKVFLCGDFSSELVKK YTVVFEHQEVDYTKDILGIVLDTANIERIDDQRYQLCKEIIKIDHHLVVDSYGTLNYEDS SASSASQLVGTIFKDTAMKITKSGAEALYLGIVGDTARFMYRNTDERTFAVAGALVQEGI DIVEIYNRIYMKKAKDLQINKFILNNHQFDGGVAYYVLSDADLKMLEISRERGSDFVNLL SGVEEYKIWMALTENVADHNWRVSLRSRDYAVNKVAEKYNGGGHMLASGAKLASLEQLGQ LLQDLKEIINE >gi|223714162|gb|ACDT01000053.1| GENE 10 7526 - 9190 1666 554 aa, chain + ## HITS:1 COG:no KEGG:DSY4330 NR:ns ## KEGG: DSY4330 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 29 539 5 520 533 437 44.0 1e-121 MNKFTDIFANFIEVFSFRQGSQDVKQVKKGAFLKAFTITLIIGLIGEYAFLIPLNLRSPQ FIIYFCILLLIFNGLYFMLSQRFSKITKYSLIVIGILIAYVAIGTFVSSPIFNAGSYQKQ LKLDKKADFYADNKTISYQSIPVVDRDSAIKLGDRKMGQMVDYVSQFEVDESYEQINYQD TPYRVTPLEYSDLIKWFTNRSDGLPAYIRVNMVTQESEVVKLKEGMKYSKSEHFGRKIER HLRANYPTLMFDTLAFEIDEKGNPYWIAPVYNYKIGLFGGKDITGAVLVNAINGDHEYYD IDKVPEWVDRAYPAELVLSQLENYGKYTNGYFNTLFSQKGVLQPTSGYNYVAINDDVYLY TGLTSVSNDASNVGFAMINLRTKDGKYYNISGAEEYSAMSSAEGQVQNLKYTATFPILIN AGGQPTYFLSLKDDAKLVKKYAFVSVENYQIVATGDSVAQAEQAYYALLEANGKKTDSSE 
YKTNELTGAITAINEAVVDGNSTYYFKLEGSDTIFIGDISLSNQFPLAKVGDVVTIEYVN SKDNSEVITSIKFD >gi|223714162|gb|ACDT01000053.1| GENE 11 9343 - 10014 847 223 aa, chain + ## HITS:1 COG:lin2728 KEGG:ns NR:ns ## COG: lin2728 COG0745 # Protein_GI_number: 16801789 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 4 221 3 222 225 213 51.0 2e-55 MKAKVLVVDDEKDIRELINFYLNKEGFEVLEASNGEEALEIFENEYIDLGIIDVMMPVMD GFELVENLKEFKDVPVIMLTAKGESKDKLRGFSVGADDYVVKPFDPTELMARVRAVLKRY SVNTSNIIKLNEVEFDGDKYEIRYQSEIIHLPLKQFELVFELAKHPDQIFSREQLIEKIW GMDYDGFDRTVDVHIKRIRENLGHLPGFKVVTVRGLGYKVEVE >gi|223714162|gb|ACDT01000053.1| GENE 12 10011 - 11402 1313 463 aa, chain + ## HITS:1 COG:lin2727 KEGG:ns NR:ns ## COG: lin2727 COG0642 # Protein_GI_number: 16801788 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 2 447 1 457 459 208 33.0 2e-53 MMKSIYGKLICGFLITIAFSFSVAGYVALRNNYDQIEDMAKSELSATSEYVVGILNSLDD ANINDIMNKYAISSEVYITLYSPELDYYTYGENKGYFPSRETMIQYYHDENKNGRFDERK SVQTYGSKAKINGSDVYIFIQKDTSYEKGIFANSAVLILGCVFLSGNLVFLAIADIIVKP ITRLTNATKELSKGNYSVRVNYVGDDEISRLNQGFNQMAQQLAKQEETRQKFISDISHEF QTPLTAIQGFANILKEEDLPKEQRQKYADIILFHSKRLSTLSKNMLQLTLLEREEVELEF TTYSIVEQLSRVISTQENQAILKDIEIQFEKPRKDIMVYGDEQRLEQVWINLISNAIKYT GEGGLITVTVKKASREVEVSIEDTGYGMSKEVVSHIFERFYREEKARSVEGNGLGLSIVK TIVDLHHGNIDVISQVDVGSTFVVKLPSERKVFDIKEKLSFNK >gi|223714162|gb|ACDT01000053.1| GENE 13 11415 - 11987 787 190 aa, chain + ## HITS:1 COG:PA5298 KEGG:ns NR:ns ## COG: PA5298 COG0503 # Protein_GI_number: 15600491 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Pseudomonas aeruginosa # 1 187 1 186 190 180 51.0 1e-45 MELLKERILKDGIVIMPDILKVDSFLNHQIDPTLYRQIAQEFKARFKDTKITKILTIEAS GIGIALATAFEFDDAPVVFAKKGTAKTTSTHYYSSKVHSFTKGIDYDAIVSKKYLDKDDN 
ILIIDDFLANGQAVLGLVDIVKQAGANIAGVGIVVEKGFQTGRQLIEEKGYRVESLAVVE AFRDGAVILK >gi|223714162|gb|ACDT01000053.1| GENE 14 12004 - 14949 3434 981 aa, chain + ## HITS:1 COG:MYPU_7140 KEGG:ns NR:ns ## COG: MYPU_7140 COG1196 # Protein_GI_number: 15829185 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Mycoplasma pulmonis # 1 977 1 971 979 591 37.0 1e-168 MYLKRIELHGFKSFADKVNVEFQPGITGIVGPNGCGKSNISDAVRWVLGEQSVKSLRGAN MSDVIFAGSEDRRAQNLAEVTLVFDNSDRFMKYDYNEVEITRRLYRMNNEAEYLINKQSC RLKDIIDLIMDTGLGKDSLSIISQGNISSFADNKPEERRGIFEDAAGVSKYKKRKLESIR KLERTNENLERIGDIVVELEKQVGPLKRQKDKAEKYLALKEKLTAIEVNVLINEITEAKK SLDELSKVIKDLNERQASLEADILLKESSNDEIKKKMFTLDQEINALQSKLLEAVSNVSK LETAKVEVDQKRKHALETLSKENLKENIANMKAILSDVVNEYNDRVERLDNTQKELTTII GQQEERNKKLLLLKNEIDQLSNNINKNKSRKEILIDAIENKSNYHHGIKTVLSLAKSNKN IIGVLGDLITTDEGYELALSSALAGAIEFIVTTDDLTARDAIKFLRDNKAGRATFLPISA MKPRHVREEHLVVCQTMEGYLGLASDFVTYEDEIAAVVLNQLGNIIVAKDIDSANEISKA TFSRYKVVSLEGDVVNVGGSLTGGSFNRQKSSIVQKRELEQVAVTLEQQEKELGLKRNEH NGLDNEIKEVSHALLQKQMAYAKLEVVVQSKKEELIKAKSEYESLADQSVELEEFASGKT ESKLVNQLNEAIKYRDNLTEEIKSKRELRMAYVNQNEALDVELREYRNDLKEAQSEVNTS AINATKLEAMLNNHLARLNDEYRMTYEYAVEHYLDEIDVEQAKLEVYELRTNINRLGNVN VDAIEEYQIISERYENMNTQRIDLIQAQDSILEAIKEMDEIMVERFSETFEKINEEFNHV FRSLFGGGKARIKYTDPTNILETGIDIDVQPPGKAVQNITLFSGGEKALIAISCLFAILR VRPIPMCILDEVEAALDIANVERFAKYLREFSGTTQFIVVTHREGTMEECDLLYGATMQQ KGVTKLVSVKLEEAIDLTDQS >gi|223714162|gb|ACDT01000053.1| GENE 15 14960 - 15943 936 327 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 1 327 1 326 336 365 55 1e-100 MGFFSQIKEKFVGKSAKQNEKYVAGLDRSNSTFSDRINELAARFREINDEYFEELENILI MSDVGVSMVMKIVEEIKNEVRLQNITDPKEINEIIVDKMFVIYANDSVMTTKINYAEEGL TVILMVGVNGAGKTTTIGKLANRIVNDEGKKVMVAAGDTFRAGAIDQLAVWAQRVGVDIV KGKEGGDPSAVVFDALKQAKEKNVDVLICDTAGRLQNKVNLMNELEKMNRIIKREVPDAP HETLLVIDATTGQNGVSQAVEFSKITDVSGLVLTKMDGTAKGGIVLSIKDQLNIPVKFIG LGESVDDLQEFDLDQYIYGLCKNLVEE 
>gi|223714162|gb|ACDT01000053.1| GENE 16 15945 - 16274 362 109 aa, chain + ## HITS:1 COG:lin1916 KEGG:ns NR:ns ## COG: lin1916 COG2739 # Protein_GI_number: 16800982 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 5 109 2 106 110 68 43.0 2e-12 MESSLEKKQRVNLLMDCYSDLLTEKQQTYLEYYYQEDYSLSEIAQILDVSRNAVFDNLKK AVHSLENYEEKLQLLKKHQERLDLIQRIEDDISSDHKSLEEYLELLKKI >gi|223714162|gb|ACDT01000053.1| GENE 17 16287 - 17681 1585 464 aa, chain + ## HITS:1 COG:BH2484 KEGG:ns NR:ns ## COG: BH2484 COG0541 # Protein_GI_number: 15615047 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Bacillus halodurans # 1 431 1 433 451 454 56.0 1e-127 MAFDSLSERLQGTLKKVSGQGRLTEKNMEEMLSEIRLALLEADVNFQVVKEFIAATKEKA LGQDVLGSLKPGQVVVKIVHDELVSLLGTSVAEVDFDKKPTVIMMVGLQGSGKTTTSGKI AKLITKKYGKKPLLVAADIYRPAAVDQLKTLGEQLNVPVYEKGTSQSAESIVEEAMTFAR DHNNDVIIIDTAGRLHIDEPLMNELANIKKIAHPSEILLVVDALTGQDIVNVASSFNEQL SITGAVLTKLDGDSRGGGALSIRHITNVPIKFIGTGEKLDAIDLFYPDRMADRILGMGDV VSLVEKVQDVYDEKDTMKAYKKMQSGQFGLDDMLSQMQQLRKLGPLSGIMKMIPGMPKLP KMNDEDSEKKLKMTESIIFSMTKEERRDPSIITLSRKERIAKGCGKDVAAVNRLLKQFEE SKKMMKMLGNVDPNTGMPMPGGRAKNPNIGNPNRKKIRHKKKKK >gi|223714162|gb|ACDT01000053.1| GENE 18 17723 - 18457 504 244 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase [Ochrobactrum intermedium LMG 3301] # 1 244 1 243 245 198 42 4e-50 MTQNKYDDPAFFEKYSQMSRSKNGLEGAGEWHVLQRVLPDFTNKTVLDLGCGYGWHSIYA IEHGAKKVIASDISSKMIETARIKNNDPRIKYECISFEESNFGEAVFDIVICSLMIHYLP SYQDFVKKVNKWLKTGGYLVFNVEHPVFTAKGDQDWYYNEAGEIEHFPVDNYYLEGKREA VFLGEKVIKYHRTLTTYLNELLNGGFEIIQVKEPTVSSNALIAHPEFKDELRRPMMLIVS ARKK >gi|223714162|gb|ACDT01000053.1| GENE 19 18538 - 18810 455 90 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167756884|ref|ZP_02429011.1| hypothetical protein CLORAM_02433 [Clostridium ramosum DSM 1402] # 1 90 1 90 90 179 100 2e-44 
MAVKLRLLRMGAKKAPFYRIVAADSRAPRDGRFIELLGTYDPRTNPAKVTIKEEEVLKWL NNGAQPSDTVKNLLSKEGIIKKFADSKSGK >gi|223714162|gb|ACDT01000053.1| GENE 20 18813 - 19064 370 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756885|ref|ZP_02429012.1| ## NR: gi|167756885|ref|ZP_02429012.1| hypothetical protein CLORAM_02434 [Clostridium ramosum DSM 1402] # 1 83 1 83 83 135 100.0 1e-30 MDYAKILKDIAIELVEDKDHLEVREMPSLEEDVVVLHVYSAKSDIARLIGRKGMMANSIR QLMSVAGRMANKKLDIKFESYEQ >gi|223714162|gb|ACDT01000053.1| GENE 21 19104 - 19613 497 169 aa, chain + ## HITS:1 COG:SA1082 KEGG:ns NR:ns ## COG: SA1082 COG0806 # Protein_GI_number: 15926822 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RimM protein, required for 16S rRNA processing # Organism: Staphylococcus aureus N315 # 3 167 2 167 167 117 38.0 1e-26 MEKLKIGKIVGTHGLKGELKIRSNSDFADKRFKKGNEIIIRYQNQDLVYKIITSRIHKGN YLVSFRDNQDINLVEKYIGSFVYGYKDDELLDADEYFYTDLIGMQVVSTEGTKIGPVTSI YDNTRHDILNIDHNGKNVAIPYVDAFIKDVDVEKKIIVVMLIKGLIDED >gi|223714162|gb|ACDT01000053.1| GENE 22 19603 - 20322 712 239 aa, chain + ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 237 1 242 246 269 56.0 3e-72 MKIDILTLFPEMFTGFLNTSIIKRAIDKGLVTIELHDFREFSIDKHKHVDDYPYGGGQGM VLMCAPIVECLKTIATDDSFIILMSPQGITLNHGCAVELSAKKHLIIICGHYEGFDERIR DYVDMELSIGDYVLTGGELGSMVVSDAVIRLLDGAIKEDSHMDDSFSHGLLEYPQYTRPQ CYDGNEVPEVLMSGHHENIRKWRKFQSLKKTYLKRPDLLESYQFDAESIEMMKKIKEND >gi|223714162|gb|ACDT01000053.1| GENE 23 20429 - 20779 580 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167756888|ref|ZP_02429015.1| hypothetical protein CLORAM_02437 [Clostridium ramosum DSM 1402] # 1 116 1 116 116 228 100 7e-59 MNLQLVEQITKKQMRNDIPEFKAGDTLKVFVKIKEGEKFRIQLFEGVCIARKGSGISESF TVRKISYQVGVERTFPVHTPIIDHIEVVKVGKVRRAKLGYLRGLSGKAARIKEIRK >gi|223714162|gb|ACDT01000053.1| GENE 24 20847 - 21392 578 181 aa, chain + ## HITS:1 
COG:lin1310 KEGG:ns NR:ns ## COG: lin1310 COG0681 # Protein_GI_number: 16800378 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Listeria innocua # 28 180 30 179 180 107 41.0 1e-23 MSKKKEVIKFSLQLLAIVAVTTVTFTKIIIPVRVDGQSMYPTLHDEDIAIVNALSLERSD IKRFDIVVLKCEKLDKDIVKRVIGLPGDTLVYRDDKLYINGTYYDEKYLNKDYIAKAKIK YQTELFTNDFEITLNDDEIFVLGDNRLQSADSRTLGTFKYSDIIGKKGLVIFPLKNMNLI K >gi|223714162|gb|ACDT01000053.1| GENE 25 21403 - 22269 850 288 aa, chain + ## HITS:1 COG:BS_ylqF KEGG:ns NR:ns ## COG: BS_ylqF COG1161 # Protein_GI_number: 16078668 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Bacillus subtilis # 5 284 3 280 282 267 46.0 2e-71 MVKQIQWFPGHMAKARREISEKMKLIDIVIELVDARAPLSSKNPMFNEICNNKPRLIVMT KKDLADSKATSKWIEYFKGKGIYAICVNLKNFNEYQLVIDVCKEILKEKMEREAKRGLKP RAMRAMVLGIPNVGKSTFINRLAKRKATVTGNRPGVTKAQQIIRVDKDFELFDTPGVLWP KFDDLNVARNIALIGSIKQDILPLDELFIYAVNYLETHYANIVCNRYNIIIDLNTDWVEK AYDDIAKNRKIKPVRGYTDYDRVMEVFFNDIFDGNMGKITWELPNGTL >gi|223714162|gb|ACDT01000053.1| GENE 26 22256 - 22876 674 206 aa, chain + ## HITS:1 COG:BH2475 KEGG:ns NR:ns ## COG: BH2475 COG0164 # Protein_GI_number: 15615038 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Bacillus halodurans # 2 203 60 262 263 179 46.0 4e-45 MERYEYEEKYYQAGYDYIIGLDEVGRGPMAGELVVAGVVFPKGFYDERINDSKQLSAKKR EVLYDLIIENALYYDIEIISVADVDRLNVYQASKQGMEKCLELLKKEKMFALTDAMPIDY PDHLSIIKGDAKSISIAGASILAKVTRDRLMEAYALEYPEYGFEKHKGYVTKAHKEALGK YGVCPIHRKSFGPVQKILSKQMSFDF >gi|223714162|gb|ACDT01000053.1| GENE 27 22946 - 23698 704 250 aa, chain + ## HITS:1 COG:all1325 KEGG:ns NR:ns ## COG: all1325 COG0758 # Protein_GI_number: 17228820 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Nostoc sp. 
PCC 7120 # 44 248 79 289 372 196 45.0 3e-50 MEEILLYFSLKYAGDFDSILKALECKEKIDEKLKKELFKDITANYTTLISPDYPTALKEI ACPPFVLFYYGNLALVKNKCISVIGKRHPSTYGVECTAALAAQLVEAGYTIISGMAMGID TIAHQSTITANGQTIAVLGSGIDYCYPHRNQQLYQVLKDHHLVISEYPGKFIPQKINFPR RNRIISGLSESILVTEANQQSGTMITVGHGLEQGKDIYCVPSRINDALGCNYLIQQGAKL VINVSDILNG >gi|223714162|gb|ACDT01000053.1| GENE 28 23775 - 25847 2107 690 aa, chain + ## HITS:1 COG:BS_topA_1 KEGG:ns NR:ns ## COG: BS_topA_1 COG0550 # Protein_GI_number: 16078675 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Bacillus subtilis # 1 545 1 550 550 520 52.0 1e-147 MSKKLVIVESPSKSKTIEKYLGSDYVVTSSKGHVRDLATSGKEGLGVDIENQFEPKYVIN KDKKDVVKELKQLIKESDDVYLATDPDREGEAISWHLAQVLNVDMDKENRVVFNEVTKDA VVDALSHPRKIDQNLVKSQETRRVLDRIIGFKLSKLLQKKIKSKSAGRVQSVALRLIVER EREIEAFVPEEYWKIKAEFEKDKIEFSGELAKYNNKKIEIKNGEEATAIYEALNKEFEIA SVKKTTKRRESKPPFITSTLQQEASSKLGFKARRTMSIAQKLYEGITLENETVGLITYMR TDSTRLSDTFVSAAQDYIGEKYGKEYIGKVKVGKKKENVQDAHEGIRPTSALRTPESVKE YLKPEELKLYSLIYARAMASLMAPAKFDATSVSLMNNGYEFKTSGSVIKFDGYLRVYGDY EKQSNEILPELKEKEMLLSKNIEKTQHFTKPPARYSEAKLIKEMEELGIGRPSTYAMIID TIQTRQYVELVDKAFKPTESGILTSDRLTEYFNDIINVEYTAKMEHELDDIAEGQDEYAH ALQKFLDLFQPLLDNAYDKMEVIAPKKTGEKCPECGHDLVERKGRYGTFVACENYPECKY VKKDPVEIEYTGEECPKCGSKMIFKNGRFGRFEACSNYPECKYIKNSKKKEPVMTDEECP NCGSPIVIKQGRWGEFKACSNYPKCKTIIK >gi|223714162|gb|ACDT01000053.1| GENE 29 25907 - 27211 1195 434 aa, chain + ## HITS:1 COG:lin1315 KEGG:ns NR:ns ## COG: lin1315 COG1206 # Protein_GI_number: 16800383 # Func_class: J Translation, ribosomal structure and biogenesis # Function: NAD(FAD)-utilizing enzyme possibly involved in translation # Organism: Listeria innocua # 1 430 1 433 434 543 62.0 1e-154 MEKIVNVIGAGLAGVEACHQLVKRGYKVRLYEMRPKKMTPAHHSGNFAELVCSNSLRADG TGNAVGVLKAEMEMMDSLIIKYARKHQVPAGGALAVDRNNFSQAITEYIQSHPLIEVIHE EAKEFPAGYTIIASGPLTSDALATAIKEKLGEDYFYFFDAAAPIIAKESIDFAIAYYKSR YDKGDNEYINCPMNEAQFNAFYDALVNAEVVKPKDFEEKFFEGCMPFEEMARRGKQTLLF GPMKPVGLTAPDGTRPYAVVQLRQDNVQASLYNIVGFQTHLTWPEQKRIIQMIPGLENAS 
FVRYGVMHRNSFICSPKHLLKTYQLKNYSNIFMAGQITGVEGYVESAQSGMAAGINMVRL LEDKEPLIFPENTVMGALANYITNASKEDFQPMKANFGILPDFPIRIKKKERKAAYASRA LETMKGFVDENNLG >gi|223714162|gb|ACDT01000053.1| GENE 30 27192 - 28100 744 302 aa, chain + ## HITS:1 COG:BH2465 KEGG:ns NR:ns ## COG: BH2465 COG4974 # Protein_GI_number: 15615028 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus halodurans # 1 299 1 302 303 263 43.0 3e-70 MKTTLDNYFRQYLDYLQYQRHYADKTIESYKRQIDHFKQFLIEESIDDYNDVSYAMLRGY LTKLYEKNLSKTTINHKLSALRSFFNYLLKEELINDNPFLLIESQKVAKRNPDFLFPEEI LGLLDSIETKDDLGIRNKAMMELMYASGLRCSEVVNLQLSNIDFNQMVLFIHGKGNKDRY VPFHDYAGEWLIKYIQEARENLMIKNEGHNFVFVNKFGNPLTNRGVENIVDRIMRLYDST KKIHPHTIRHSFATHLLNAGADIRTVQELLGHENLSTTQIYTHISRDHLKEVYLKAHPRN IE >gi|223714162|gb|ACDT01000053.1| GENE 31 28262 - 29125 1446 287 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237734605|ref|ZP_04565086.1| 30S ribosomal protein S2 [Mollicutes bacterium D7] # 6 287 1 282 282 561 100 1e-159 MSVISMKKLLEVGVHFGHQTKRWNPKMAPYIFTSRNGIYIIDLQKSSKKIDDAYKAMNEI AAKGGKVLFVGTKKQAQEAVKEEAIRSESFYVNSRWLGGTLTNFKTIQKRIRRLIELEKM EADGTFDLLPKKEVILLKKEAAKLEKNLGGIKEMRRLPNALFVVDPKAEHNAVAEAKILG IPVFGIVDTNCDPDEVDYVIPANDDAIRAVKLIVAAMADAICEAKNEPLTVAYVKDEDDK EVSMNDAITSVENNQRRAPRNKGGRPNPRRNNAPRPNKDNTAGTEGK >gi|223714162|gb|ACDT01000053.1| GENE 32 29196 - 30086 539 296 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 3 290 4 276 283 212 44 4e-54 MAITASLVKELREKTGAGMMDCKKALEACDGDIEKSFDWLREKGIAKAAKKADRIAAEGL TAFVLDGDTAAIVEVNSETDFVAKNAEFQGLVKNIAEVVAANKPVDLEAALNTEVDGKKL ETVIAEASGKIGEKLSFRRFEVLTKTADEVFGAYSHMGGKMTAIVKVANSTEDKARDVAM HVAASDPKYIDRTAIPAEVLDHELSVLKAQAMEENAAAAKPKPENIIEKMVEGRLNKNLK EMCLVDQEFIKNPDETVAKFLGEGKVINMVRFQVGEGIEKKEENFAEEVAAQMNAK >gi|223714162|gb|ACDT01000053.1| GENE 33 30278 - 30994 933 238 aa, chain + ## HITS:1 COG:L70624 KEGG:ns NR:ns ## COG: L70624 COG0528 # Protein_GI_number: 15673992 # Func_class: F 
Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Lactococcus lactis # 1 234 4 237 238 301 64.0 1e-81 MKYNRVLLKLSGEALAGDDKTGINAHTVADIARQIKDAKDLGVEIAIVCGGGNLWRGKTG ADMGMDRSSADYMGMLATVMNGLAVQNALEAIGVPTRVLSAIEMRQVAEPYIRRRAIRHL EKGRVVIFGAGTGSPFFTTDTTAALRAAEINADVILMAKNGVDGVYSADPKVDPNAIRFD TISYFDVLQKDLKIMDQTAITLCKDNNIDLCVFNMSVDGNIAKACNGDDIGTTISGGK >gi|223714162|gb|ACDT01000053.1| GENE 34 30996 - 31541 736 181 aa, chain + ## HITS:1 COG:BS_frr KEGG:ns NR:ns ## COG: BS_frr COG0233 # Protein_GI_number: 16078715 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Bacillus subtilis # 1 181 1 185 185 183 56.0 1e-46 MINMILDDTNERMTKTIESFQRDLSSVRTGRANPNMLDRVMVNYYGSPTPVNQIAGISVV EGRQLVIKPYDKSSIKDIEHGIYEADLGLTPQNDGEIIRIMVPALTEERRKEFAKNVWKF AENAKVSIRNIRRDSNDEIKKTDGSEDEIKAGQERVQKLTDKFVKEIDEIAKVKEKDIMT V >gi|223714162|gb|ACDT01000053.1| GENE 35 31707 - 31988 114 93 aa, chain + ## HITS:1 COG:no KEGG:BCA_3719 NR:ns ## KEGG: BCA_3719 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_03BB102 # Pathway: not_defined # 2 83 36 120 193 63 43.0 2e-09 MSGGKVEITERIGYFLDTGNISDLMSSNKSKCQIGIITEDSKIEENFVCSQKHRVFFKEK IDKTFSFNVPFQKWLKTNTGKIYLMQLKHIMRF >gi|223714162|gb|ACDT01000053.1| GENE 36 32285 - 33040 798 251 aa, chain + ## HITS:1 COG:SA1103 KEGG:ns NR:ns ## COG: SA1103 COG0020 # Protein_GI_number: 15926843 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Staphylococcus aureus N315 # 3 250 2 253 256 273 56.0 2e-73 MFFKKKKNEEFNYHVLDKDNIPQHIAFIMDGNGRWAKKRKMPRTYGHHEGTKTIRDVALH CNKLGVKAMTVYAFSTENFARPEQEVQYIFKLPKDFFELYMKELIENNVKICTIGHLEMA PKETQDIINSAIDKTKNNTGLKLCFAFIYGGRDEILEATKKLAMKVKAGELGVNEINETV FNDELMTKDLPEVDLMIRTSGEQRLSNFLLWQLAYAEFIFTDVLWPDFNENELDKAIWMY QNRDRRFGGLK >gi|223714162|gb|ACDT01000053.1| GENE 37 33040 - 33822 739 260 aa, chain + ## HITS:1 COG:BH2422 KEGG:ns NR:ns ## COG: BH2422 COG0575 # Protein_GI_number: 15614985 # 
Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Bacillus halodurans # 1 246 1 253 264 115 33.0 6e-26 MKERIITAICLMLVAVPCVIFGGYYFKGLIGVILALAVYEMLHICTRPKIKMYLYPIVCG FFIYGFLFDQDDLFLASYGILIYLIVLFGATIFDDTLTIERTSYIFTMGVLICAGLHALM SLRDVYGFQYILLLALATYGSDTGAYFAGVFFGKHKLIPRLSPKKTIEGSIGGVLLGTLL SVGYASYLGLLENNVILIAAFFVLTFTSQIGDLVFSAVKRHFGVKDYSNLLPGHGGILDR IDSILFNAIVFSFFLVMVRL >gi|223714162|gb|ACDT01000053.1| GENE 38 33824 - 34975 1337 383 aa, chain + ## HITS:1 COG:BH2421 KEGG:ns NR:ns ## COG: BH2421 COG0743 # Protein_GI_number: 15614984 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Bacillus halodurans # 18 376 1 360 365 412 56.0 1e-115 MKKITVLGVTGSIGMQTVDVVMNHPEQFKITAMAAGYNVAKVEEILAMIDVEYICMVKKE DALYLQEKYPDLKVVYGESGLIEIATLPEIEIVLNAIVGFAGLVPTIEAIKAKKDIALAN KETLVVAGHIITELVKEYGVKLLPVDSEHSAIFQSLNGEEHNKIKKIILTASGGSFRDKQ RDELAGVTVKEALNHPNWSMGAKITIDSATLFNKGLEVMEAKWLFDVDYDQIEVLIHPES IIHSMVEFVDTSIIGQLGNPDMRLPIQYALTYPERDYLIGGESLDLAKIASLTFKKPDFE RFRALALAYQAGKSGGSMPCVLNGANEQANELFRNGKIEFLEIENLVEKALNKHQLVKNP TLEKLIEIDAWARNFVLKEIGED >gi|223714162|gb|ACDT01000053.1| GENE 39 34975 - 36054 1086 359 aa, chain + ## HITS:1 COG:CAC1796 KEGG:ns NR:ns ## COG: CAC1796 COG0750 # Protein_GI_number: 15895072 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Clostridium acetobutylicum # 3 354 2 333 339 159 33.0 7e-39 MQTLINIVVFILILGIVVLIHELGHFITAKSFGVYCSEFSIGMGPKIFSRKKGETEYEIR ALPIGGFVSMAGEADNDIEEFKDVPIERTLKGISCWKKCVVFLAGVFMNFVLSLVILIGV YCVIDVQTNTPEIGKVTSDSPAMIAGLEAGDTISKITYDGHENIIASFADIREVLNNDNL KSKSATIMLQVELVRDGKTITKEVNAKYNSDSNSYTMGLTPATRNLSFFEAINYGVTKFV EMALLIFTTLGKLFTDSANTIGQLSGPAGIYNVTAQITETGSISQLLTLLALLSTNIGMF NLLPIPGLDGCQVIFAVVERVIGRELPLKVKYGLQIAGLALVFGLMIFVTFNDISRIFG >gi|223714162|gb|ACDT01000053.1| GENE 40 36130 - 40452 3929 1440 aa, chain + ## HITS:1 COG:BH2418 KEGG:ns NR:ns ## COG: BH2418 
COG2176 # Protein_GI_number: 15614981 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Bacillus halodurans # 2 1440 9 1433 1433 1252 45.0 0 MENKLLILLKNLKIESQANILKEGKISKVVVARDNSYTFHLVFNQILPFEEYQLLINNRD NFPYPTKYKISYETEFFNQNELLLYTGYLLEKLKKDYPVCATLTVESFKIEDKVIRVETC NEIQLEQLRQLRAQIEDMFNDVGIDKNFDFYIDEENDVFKDIKEEMEAYEPVEIDLSLIQ KPDKPASEQNNYKNNYRQKNAAIDMKIEEITNQTMDNNIVIKGFVFKTEMIKTRAGKHIQ SLWVTDYTDSIIVKRFENNSNNSLEELKVIGKGGVWVKVRGEARFDSFARETVMMAREVE VIKSPAPRKDTSEAKRVELHTHSKMSAMDGVGTITQYINAVASWGHKAIAVTDHGNVQSF PEAQMAAGKAGIKMIYGVEFNMIEPILNIVYNEIDTSIEHATYVSFDLETTGLSVIHDGI TEFGAVKIKNGEVIDRLQMFVNPGKSISSRITNLTSITNDMVRNEPTIDALLPRIIEFFD DCILVAHNANFDIGFLNENLRRNNMPEITNPIIDSLALARAILKPMKSYRLGNVCRSYRV NYDDEVAHRADYDAEVLGDVFNMMLHQIMQSGKYNLLDLCELTGDDVYKIVYPYHMTALA LNKAGLKNMFKLVSEANTKYFHNGSRIPKERLEHYREGLLYGSSCYNGDVFEAALNLSDE KLERAMEFYDYIEIQPLEDYYHLVDRGKLQDTDELIKSLHRIIDCAKKLDKLIVATGDVH FLEVRDKIFRDVFISNPTIGIGHRAHPLCDRRNPKAKNPCQYLRTTNEMLEGYPYLPQDE VFEYVVTNTNKIADMVEEIKPVHDKLFTPKIDGADENLKKICYDTAHKTYGNPLPQIVEK RLEKELSNIIKHGFGVIYYISHLLVKKSNDDGYLVGSRGSVGSSFVATMSGITEVNPLPP HYVCLHCSHSEFLEEGIVADGYDLEDKVCPKCGKIMKGEGHNIPFETFLGFNADKVPDID LNFSGEYQANAHAFTKEIFGEDHVFRAGTISTVAEKTAYGYAKGYAELMGTDQTIRSAEL ERIAAGCGGVKRTTGQHPGGIIVIPGDMDVFDFTPYQFPADDLNAAWKTTHFDFHAIHDN VLKFDILGHVDPTVTRFLQDLTGVDPKDIPTNDKKVMSLFTSSEALGCNLDFIGCKNGAL GLPEFGTSFVRGMLDQTQPKTFNDLVIISGLSHGTDVYLGNAETLIKSGTCTLSEVIGCR DDIMVYLIEKGLPNKDAFDIMECVRKGKSPVVFPEKKYEELMKEYNVPQWYIDSCKKIKY MFPKAHAAAYVLSAIRVAWWKLYYPREYYAVYFTTRCDFYDIETLVQGKDAIMARRAEIT QLRAERSSSNKDEGLWDIFEIALEMIERGFHFNPVSLEYSQASKFILDPNDEKGLIPPFS AIDALGESVAKTVVDARADGPFLSKEDVIKRTKLNNSHIKTLSKMGVFNGMQERNQLSLF >gi|223714162|gb|ACDT01000053.1| GENE 41 40551 - 41018 448 155 aa, chain + ## HITS:1 COG:BS_ylxS KEGG:ns NR:ns ## COG: BS_ylxS COG0779 # Protein_GI_number: 16078722 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 2 155 4 156 156 115 
41.0 3e-26 MELLDTIKTLIKPILENNDVYLDDIEYLQENGEWYLRIFVEKNEGSLDMDTCVAVSEAIS LKMDEEDPIKGEYYLEVSSPGVEKPLKTFEQVKASVGKYVYAKFINPTAGMDEVEGFIKT IEDETIEFEYLVKNIKKRIKIDYSNIKFIRLAVKF >gi|223714162|gb|ACDT01000053.1| GENE 42 41031 - 42362 742 443 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 5 404 9 437 537 290 38 1e-77 MASKKFMDALNLLIEEKGIEKDVFLEMLKESIGKAYKKNYLNPDANVRVEINEKTGKFRL FELRTVVDDLDDEDIELSLEEAQAINPNYQIGDVVETEADIEHIGRLAAIQTKQLFRQKI RETEKETLYNEFADKKDDIITGIVDRVEDKFAIVNIGKTGAFLASNQQIPGEKLNEGQHL KVYVSDVDRGTKGTHIVVSRTEPSFVKRLFELEVPEVYDGTVEIKAVSREPGERSKVAVY TSNENIDPIGSCVGPKGSRVKNVVDELNGEMIDIILWSSDPVVFISNALSPSDVKWVSIN EENHSALVVVPDDQLSLAIGKRGQNARLAVRLTGWKIDIKSVSEAVELGLIDLQTVNNTE ESSPVDASFEEEFAQEMLDEAVEEAVEVAEEPEIEEEVVEVEEEPTQSKKVIEYEDFEDL DDEYSKYDEEIDYDEYDEYYDKD >gi|223714162|gb|ACDT01000053.1| GENE 43 42366 - 42632 175 88 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|206900953|ref|YP_002250931.1| ribosomal protein L7Ae family protein [Dictyoglomus thermophilum H-6-12] # 3 88 3 85 98 72 43 6e-12 MIKLRKIPLRKCLATGEQLPKQQLIRIVRNKEGQVAVDPTGKMNGRGAYLKRSHEAFVLA KKKKVLARALQVEIPEEIFVELEKFADE >gi|223714162|gb|ACDT01000053.1| GENE 44 42625 - 42921 484 98 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237734618|ref|ZP_04565099.1| 50S ribosomal protein L7 [Mollicutes bacterium D7] # 1 98 1 98 98 191 100 9e-48 MNNYLNTLGLAARARKIVTGETLITKIRSNEVEFVIIASDASDNTKKKITDKCTSYKVEY VIACTIDELSSAIGKKNRVALGIQDTGFAKILKEKIGG >gi|223714162|gb|ACDT01000053.1| GENE 45 42926 - 44779 2637 617 aa, chain + ## HITS:1 COG:lin1362 KEGG:ns NR:ns ## COG: lin1362 COG0532 # Protein_GI_number: 16800430 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Listeria innocua # 10 617 167 782 782 682 58.0 0 MAKQNKQTKNKKGAPRKKQLKSNFPTKKEEIVAKDGVIVYEEGITVGQLADKIGQTPANV IKVLFLLGTMVTINSSLNDEQVELICLEYGFEVEKHVEVSEVNFEEIDIQDDEKDLQPRC 
PVVTIMGHVDHGKTTLLDTIRKSAVVEGEFGGITQHIGAYQVEVNGKKVTFLDTPGHEAF TAMRARGAQVTDIVIIVVAADDGVMPQTKEAIDHAKAAGVPIVVAINKIDKEGADPERIK GEMAEHGLLPEDWGGDTVYCEISAKKRIGIEELLETLTVVAELADLKANPNRYAYGSVVE GKLDKGRGPVATLLVENGTLRAGDPIVVGTSFGRVRQMLDDRGKIIKEALPATPVEITGL NDVPVAGDKFMAFENEKQARSVGETRLKAKQDKERSSGAALSLDDLYSQIKEGEMIDLNI IVKADVQGTAEAVKASLEKIDVDGVRVNVIRSTAGGISESDVLLASASQAIIYGFNVRPN AKVRQKAEEEGIEIRLHNIIYKMVEEIETAMKGMLAPEIKEVVTGQAEIRQVIKVSKVGN IAGCYVTDGFIRRNCGIRLLRDSVVVYEGKLGSLKRFQDDAKEVAAGFECGLSIENFNDI KEGDIVEGYIMEEVEVK >gi|223714162|gb|ACDT01000053.1| GENE 46 44789 - 45136 538 115 aa, chain + ## HITS:1 COG:BH2411 KEGG:ns NR:ns ## COG: BH2411 COG0858 # Protein_GI_number: 15614974 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Bacillus halodurans # 3 111 4 112 116 97 44.0 5e-21 MILKKDKMNGIIQRELSQIIQVEVRDPKIGFCTITAVDTTTDLSIAKVYVTFLGKDFDTR KGMEALNRSKGFIRSLLAKRLTIRKVPELIFVNDTSLEYGNKIEKIIDDLNHHDK >gi|223714162|gb|ACDT01000053.1| GENE 47 45615 - 46754 1615 379 aa, chain + ## HITS:1 COG:CAC0188 KEGG:ns NR:ns ## COG: CAC0188 COG1820 # Protein_GI_number: 15893481 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Clostridium acetobutylicum # 5 375 4 375 381 281 44.0 2e-75 MKYCITNGKVILKNQVVDANVYVENTKITEISNRQPDDETVIDAKGRYVSPGFIDVHTHG RGGSDTMYNTFEDLDTITSTAVKTGVTGILPTTMTMSKEDTYAAIKNVGDNMDKVGGSKI LGVHMEGPFFNTKYKGAQPEEFMIKPTAENYSSLVGEYGKIVKKLSLAPELKDSDKLIEY LVKEGVVVSIGHTNATYDEAVVGIKAGATSGTHTYNAMTPLTHRNPGVVGAIMEHDEVYA ELILDGIHVSYPAAKVLLRAKGLDKVILITDSIEASGLEDGQYKLGNQAVFVKDNSARLE DGTLAGSILAMNNAVKNAYQHLGLSINEAVNLASYNPAKNLNLINLGEIAVNKTADIIMF DEEINVDFVMIDGNVKIGG >gi|223714162|gb|ACDT01000053.1| GENE 48 46756 - 47095 402 113 aa, chain + ## HITS:1 COG:L183012 KEGG:ns NR:ns ## COG: L183012 COG2017 # Protein_GI_number: 15673903 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Lactococcus lactis # 3 113 4 
112 235 92 43.0 2e-19 MITLKNEVLEVTLANKGAEIIKIVGQDDQINYMWRRDPIQWANSAPILFPIVGALQNNEC RIDGKTYTMTQHGFSRHSEYEANQINDQEVVFTLISNEEIAKQYPYLFKLDVT Prediction of potential genes in microbial genomes Time: Thu May 26 09:47:24 2011 Seq name: gi|223714161|gb|ACDT01000054.1| Coprobacillus sp. D7 cont1.54, whole genome shotgun sequence Length of sequence - 2188 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 196 176 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 2 1 Op 2 . + CDS 229 - 882 695 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase 3 1 Op 3 2/0.000 + CDS 875 - 1195 409 ## COG0526 Thiol-disulfide isomerase and thioredoxins 4 1 Op 4 . + CDS 1207 - 1806 732 ## COG0073 EMAP domain 5 1 Op 5 . + CDS 1824 - 2187 432 ## gi|167756922|ref|ZP_02429049.1| hypothetical protein CLORAM_02471 Predicted protein(s) >gi|223714161|gb|ACDT01000054.1| GENE 1 2 - 196 176 64 aa, chain + ## HITS:1 COG:SPy1337 KEGG:ns NR:ns ## COG: SPy1337 COG1187 # Protein_GI_number: 15675276 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Streptococcus pyogenes M1 GAS # 4 60 180 236 244 68 63.0 2e-12 SCDVRIEIFEGKFHQVKKMVEVCGKEVIYLKRLSIRNLELDRSLALGDFRELSNEELVDL MEDL >gi|223714161|gb|ACDT01000054.1| GENE 2 229 - 882 695 217 aa, chain + ## HITS:1 COG:BS_ytmQ KEGG:ns NR:ns ## COG: BS_ytmQ COG0220 # Protein_GI_number: 16080042 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Bacillus subtilis # 1 214 1 212 213 204 48.0 7e-53 MRLRNNPKAYDIMKENKSFVILEPENCKNKWKDIFGNNHPIYIEIGMGKGDFIYENARQF PEINFVGIEKYPSVLAAAINKINAREEKVNNLRLMHYDAIELHQVFEKDEVDKIFLNFSD PWPKSKHAKRRLTSSKFLDVYKDILIDEGNIEFKSDNRGLFEYSIISLNQYPMDIEYISL 
DLHNSPESEINIMTEYERKFCEKGPIYKLVARYRKNG >gi|223714161|gb|ACDT01000054.1| GENE 3 875 - 1195 409 106 aa, chain + ## HITS:1 COG:BH3253 KEGG:ns NR:ns ## COG: BH3253 COG0526 # Protein_GI_number: 15615815 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus halodurans # 6 106 3 104 106 105 48.0 1e-23 MDKLVEIKTIEEFDNIRVGKTIIMFTADWCPDCMFVKPFIGDIVDANPEYKFYMINRDNM IDLCKDQDILGIPSFVAYNDKEESGRFVSKLRKTKEEIQAFIDTLK >gi|223714161|gb|ACDT01000054.1| GENE 4 1207 - 1806 732 199 aa, chain + ## HITS:1 COG:lin1648 KEGG:ns NR:ns ## COG: lin1648 COG0073 # Protein_GI_number: 16800716 # Func_class: R General function prediction only # Function: EMAP domain # Organism: Listeria innocua # 1 198 1 201 205 178 45.0 7e-45 MIFHGFYNLKHVGDILLARCGEGRTFDYDKYDDLVVLKDSKNRILGFNLLNASNYLGELN TGLVNFSDNQIEKFNELLTTHGLDPVVLDKEPRFIVGEVTAMEEHPDSDHLHICQVDLKN TTTQIVCGAPNVEVGQRVVVATIGAVMPSGLVIKPSKLRKIDSNGMICSARELGLPNAPQ IRGILVLDKDKYSIGDSFF >gi|223714161|gb|ACDT01000054.1| GENE 5 1824 - 2187 432 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756922|ref|ZP_02429049.1| ## NR: gi|167756922|ref|ZP_02429049.1| hypothetical protein CLORAM_02471 [Clostridium ramosum DSM 1402] # 1 121 1 121 222 169 100.0 4e-41 MFRGLFKKHEEDEDIFEVDPVVEQRRKEKFSTPLIYDEEFKEEPEVTVTPTPKTKEKTKV VEAKPKVEEKPRQIYQMSEVISPMMGITKGKNDKKAVKPTTSSKPKKRKNVDQLVPVISP F Prediction of potential genes in microbial genomes Time: Thu May 26 09:47:31 2011 Seq name: gi|223714160|gb|ACDT01000055.1| Coprobacillus sp. D7 cont1.55, whole genome shotgun sequence Length of sequence - 5487 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 266 374 ## gi|167756922|ref|ZP_02429049.1| hypothetical protein CLORAM_02471 2 1 Op 2 . 
+ CDS 269 - 604 376 ## COG0736 Phosphopantetheinyl transferase (holo-ACP synthase) + Term 664 - 716 -0.4 + Prom 757 - 816 13.2 3 2 Op 1 . + CDS 894 - 3158 1740 ## COG2200 FOG: EAL domain 4 2 Op 2 . + CDS 3179 - 4234 789 ## LHK_00948 GGDEF domain protein 5 2 Op 3 . + CDS 4278 - 5261 599 ## COG2199 FOG: GGDEF domain + Term 5410 - 5449 -0.8 Predicted protein(s) >gi|223714160|gb|ACDT01000055.1| GENE 1 3 - 266 374 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756922|ref|ZP_02429049.1| ## NR: gi|167756922|ref|ZP_02429049.1| hypothetical protein CLORAM_02471 [Clostridium ramosum DSM 1402] # 1 87 136 222 222 119 100.0 6e-26 KLTTINDEPKEKKVIEDLPTVEDNLRNIAKIVEEEQDQLKIIEERTGEFKLDFGTNDVEK QNSLIDEIDDDMSLDELMSLYEKKFKD >gi|223714160|gb|ACDT01000055.1| GENE 2 269 - 604 376 111 aa, chain + ## HITS:1 COG:BS_ydcB KEGG:ns NR:ns ## COG: BS_ydcB COG0736 # Protein_GI_number: 16077529 # Func_class: I Lipid transport and metabolism # Function: Phosphopantetheinyl transferase (holo-ACP synthase) # Organism: Bacillus subtilis # 1 109 1 117 121 82 45.0 2e-16 MISGIGCDIVDLNRLNLDNECFVLKLLTKNEFLIFKNKKSLKQKKEFLGGRFAAKEAFFK AHGIEHGMLSFHDIEILNDKNGKPKINYPNTFISIAHENDYAIAYVVVEEG >gi|223714160|gb|ACDT01000055.1| GENE 3 894 - 3158 1740 754 aa, chain + ## HITS:1 COG:slr1692 KEGG:ns NR:ns ## COG: slr1692 COG2200 # Protein_GI_number: 16330979 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 494 746 63 316 332 167 35.0 9e-41 MYEDKVKKSLRKIIIFAVIIGIFLASIGAGFFYVIRTAFDRSTDERMIEETDNYKKRLDK QINNNFQMLNTVASIIGNSNLDESADFDSILERAYIENDFLTVAFFYNDEMGTLSTGDHI ISSDIHLSSLQPEVQEVVRKALRGKENVSEPFYGDFSEEEVFTFGVPVYRNKQIVGAMIA SSVVDIFSEIIDGEKVLNGSAYIHLLDVNGKFLIRSSHAVVKEKKTDIFKEPYLSGDELT KIKKSMKNSEIVRFTFTYEGKEYHSILEPLDVNEWYLFCINSVQNSNSGIYSVAYAVACF FVIIVLLVGYLLIYGYRTMKKNNQQLMDFAYLDRLTGIYNLNRFGELAHQHIEEHSEYAI AALNVRKFKFINEIFGKEQADRLLCYIGRILNDNIKDGELVCRDSADVFYMFLLETEKNI LEVRLKEILEKIRTSSNDSNSNYRITMSCGIATATDKQELQVSMTHAMFALEISKNNFKI 
PLWFFDAALHEQEKMNDYIERHMYEAIENGEFKLYLQPKIDLKTNSLASAEALVRWIRND GTTIFPSQFIPIFEQNGFCVNLDMYMVEKVCQLLRQWIDEGYQAIPIAINQSKLVLYEIG YVKNLCGILDRYNIPASLITLEILEEIAIENVDDLNKKLTELRDIGFKISMDDFGTGYSS LNTLGKLNIDELKLDRSFLVEIKDTRNQNARLIMEEIVQLSKKLSIFSVIEGVETSEDDL LVKEIGCDYGQGYYYSRPIDSNDFTSKMLNKFNK >gi|223714160|gb|ACDT01000055.1| GENE 4 3179 - 4234 789 351 aa, chain + ## HITS:1 COG:no KEGG:LHK_00948 NR:ns ## KEGG: LHK_00948 # Name: not_defined # Def: GGDEF domain protein # Organism: L.hongkongensis # Pathway: not_defined # 6 124 16 133 333 107 44.0 9e-22 MENLLIDNMLENAKIGIWVIELSHNKKNDPKMFGNTHMKLLLGVADSYSPEEVYQFWVER IHPAYVSYVEEAVERLISGNKAEVEYPWKHPENGWMYIRCGGYLNTNFTQCYRLEGYHQD ISNLIFRFEPFSFDYQIQDQLLFKEYSRYYMDIYDELCEIDPVTNTMEIVFQRKDKYLSI FSGINFFEFIKKYIYSMDHLIICETYNELSSAKKQKVTIDVRIHNSNDGYSWIRLVFVLA QMNYSPKVLMGIMDIQEQKRNMEFFIDSDAILSTIIEENAYIFDIDIMTQNISAIKGENL SIHDMSYNKFLSLFMQRCSDISELEYLNSFLSFNHLKELVFQKKKCTYRYP >gi|223714160|gb|ACDT01000055.1| GENE 5 4278 - 5261 599 327 aa, chain + ## HITS:1 COG:PA0575_3 KEGG:ns NR:ns ## COG: PA0575_3 COG2199 # Protein_GI_number: 15595772 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 160 325 7 173 182 114 37.0 2e-25 MVPKALEDRILILIKKIDKDDVFQTVITKYVQKNFESLCYIDCDSGKYLQFVTEGMKELI FNENEFDYKDALDTFIESQVVIEDRKVAQTKMNLDYIFLTLQKDGDYFCNIRILDSCGEF HQKTIHYYIYDEDTKIILMMVADTTEQYLLNQSEKEQIERLKLKANTDELTGLYNRYYCK THINEYLQEVPQGQQAVFLLIDLDNFKNLNDSLGHLAGDQALVDVANVLKNSFCDDAIIS RFGGDEFVVFVKNIQDIKLFNRLISRLLQQLDLYYQSEKDQINIQASIGIALAPIDGRDM NTLYLKADKALYQAKEGGKNRYKYYNN Prediction of potential genes in microbial genomes Time: Thu May 26 09:47:45 2011 Seq name: gi|223714159|gb|ACDT01000056.1| Coprobacillus sp. D7 cont1.56, whole genome shotgun sequence Length of sequence - 12357 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 6, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 70 - 129 8.9 1 1 Op 1 . 
+ CDS 159 - 494 535 ## COG0629 Single-stranded DNA-binding protein 2 1 Op 2 . + CDS 555 - 2363 1740 ## COG0481 Membrane GTPase LepA + Prom 2370 - 2429 10.0 3 1 Op 3 . + CDS 2453 - 3853 1386 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 4083 - 4131 1.2 + Prom 3875 - 3934 10.5 4 2 Op 1 . + CDS 4160 - 6271 2169 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 + Term 6278 - 6331 0.8 5 2 Op 2 . + CDS 6344 - 6493 281 ## PROTEIN SUPPORTED gi|237734635|ref|ZP_04565116.1| 50S ribosomal protein L33 6 2 Op 3 . + CDS 6530 - 7147 520 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Term 7201 - 7247 7.0 7 3 Op 1 11/0.000 - CDS 7186 - 7353 108 ## COG2801 Transposase and inactivated derivatives - Prom 7374 - 7433 2.1 8 3 Op 2 . - CDS 7443 - 7820 312 ## COG2801 Transposase and inactivated derivatives 9 4 Tu 1 . + CDS 8224 - 10917 3089 ## COG2382 Enterochelin esterase and related enzymes + Term 10938 - 10987 -0.2 + Prom 11119 - 11178 12.2 10 5 Tu 1 . + CDS 11409 - 11624 178 ## gi|237734639|ref|ZP_04565120.1| conserved hypothetical protein + Prom 11811 - 11870 14.1 11 6 Tu 1 . 
+ CDS 11930 - 12356 299 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB Predicted protein(s) >gi|223714159|gb|ACDT01000056.1| GENE 1 159 - 494 535 111 aa, chain + ## HITS:1 COG:BH4049 KEGG:ns NR:ns ## COG: BH4049 COG0629 # Protein_GI_number: 15616611 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Bacillus halodurans # 1 104 1 104 168 80 38.0 6e-16 MLNTVALVGRLVDIPEMKTTIGGVKVANATIQIEQGFKNSLGAFENDYIVVSLWRGIAEM IIDCARPGCMIAVKGRLHSRTFECAESKQITSIEVIAERVSLLDKYFKNYD >gi|223714159|gb|ACDT01000056.1| GENE 2 555 - 2363 1740 602 aa, chain + ## HITS:1 COG:BS_lepA KEGG:ns NR:ns ## COG: BS_lepA COG0481 # Protein_GI_number: 16079605 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Bacillus subtilis # 5 601 11 607 612 871 70.0 0 MTIDQSKIRNFSIIAHIDHGKSTLADRILQITGAVADREMKAQLLDSMDLERERGITIKL NAVQLNYTAKDGQTYLLHLIDTPGHVDFTYEVSRSLAACEGAILVVDSVQGVEAQTLANV YLALDNNLEILPVINKIDLPSADPQRVIREVEEVIGLPADHAPLISAKSGLNVEEVLEKV VQEFPAPSGDVNNPTQALIFDSYYDSYRGVIVFVRVKEGTINVGDHVKFMATGATYEVTE LGVRTPREVKKEQLVVGEVGWFAASIKSIQDIHVGDTVTTVENESNIQLPGYRQLNPMVY CGLYPVDNARYKDLREALEKMKLSDSSLMFDPETSQALGFGFRCGFLGLLHMDVIQERLE REFDLDLIATAPSVIYHCYLTDRSEVIVDNPAMMPEPQVIDHIEEPYVKASIMTPNEYVG VIMELCQSKRGEYQDIEYIDDTRRNVIYEIPLSEIVYDFFDKLKSGTKGYASLDYELIGY RTSKLQKMDILLNGEVVDALSTIVHKDFAYGRGKIICEKLKEIIPKQMFEVPIQAALQGK IIARTTIKAMRKNVLAKCYGGDISRKKKLLEKQKEGKKRMKAVGNVEIPQEAFMAILSVD DE >gi|223714159|gb|ACDT01000056.1| GENE 3 2453 - 3853 1386 466 aa, chain + ## HITS:1 COG:lin0583 KEGG:ns NR:ns ## COG: lin0583 COG2723 # Protein_GI_number: 16799658 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 1 466 1 463 464 404 45.0 1e-112 MTIKLPDNFFFGAALSGPQTEGAYNVGGRLRSYWDMWSDQEINAFHNNVGSYVGNDMYHR YEEDIKLFKSMNFKSFRTSMQWTRLLDQDGNINPEGAAWYHKLIDCANENGIDIFMNMYH 
FDMPEYLYKRGGWANREVVEAYANFVKKAMEEFGGKVKYWFTFNEPIVEPEQQYLHGVWF PYEKNIKRSLDVQYNITLAHCLAVMNFRELQARGKMVEGAKIGMINCFAPPYTKEDPSPE DLEAVRMTDGLHNRWWLDVVAHGHLPKDVIDTVENEWKIKMNRRPGDEAILAGGKVDWLG FNYYQPTRIQAPDQKFDENGLPCIAKPYIWPERRMNEHRGWEIYPKGIYDFGMKIKNECP DLPYFISENGMGVEGEEKYMDENGSVQDDYRIEFVRDHLEWIAKAIEDGSNCLGYHYWGV IDNWSWANAFKNRYGFIRVALDQGYKRIEKKSANWIREVAKNNEFE >gi|223714159|gb|ACDT01000056.1| GENE 4 4160 - 6271 2169 703 aa, chain + ## HITS:1 COG:BS_pbpA KEGG:ns NR:ns ## COG: BS_pbpA COG0768 # Protein_GI_number: 16079555 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Bacillus subtilis # 35 697 21 686 716 347 34.0 3e-95 MAFNFRNKIPKVRFYGASEQEREQKNLQDNIVSRRLIGLLCFVIAVTSVMAIRLGIIQFK QADELAVKLEQYGTTTYTSDAPRGEIVDREYVKLVENTSVICATYYAPKKIDNDVLKESA AFLAKNINIDVNDKKIISDRAKKDYFILAFKDVADGLISDKEKEEIKAQENADKRLYSLK LERITDELISQYMDEATLKYTHFLYLMQNCTSGSSILAEGLTDEEASIIGENADILPGVT VTTDWARQYSNSTEFASVLGKVTTKKQGLPVELKNELLALGYQNDSRVGTSGLEEQYEDI LRGNDSTYTLNYDSSGNPIITSTKAGTAGSNIRISIDWELQQLANQKIEEELKACNSSNK YFNKMFFILMDPYTGEIIVMAGKTIDKATGEVSDYASGNYLDANKIGSTIKGGSIYTGFK EGVIGPNTYFVDEPIKIKGTKAKKSWKSFGTINEVDALAYSSNVYMFRIAMLLGGANYVY DGPLKINEEAFDTLRNDLGELGLGVKTGLDVSNEALGYRGKQRTGGLLLDAMIGQYDTYT NIQLAQYACTLANGGKRIQPHLLLDSYTTDEDGEIQVNYEANTNVLDDVSNQATAFSQIK QGMRACVTRTEGTAHSWNSKPYVTYAKTGTAEDYTTDDGKTGTTDYPNHLQIGYVQTTED SRPEVAFACMSYRQTTATSGSSSAPIVAQAVIDRYWEKYHSTN >gi|223714159|gb|ACDT01000056.1| GENE 5 6344 - 6493 281 49 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237734635|ref|ZP_04565116.1| 50S ribosomal protein L33 [Mollicutes bacterium D7] # 1 49 1 49 49 112 100 1e-24 MRENIIYKCTECGEENYINTKNKRNHPDRMEINKYCPRCNKKTVHKEKK >gi|223714159|gb|ACDT01000056.1| GENE 6 6530 - 7147 520 205 aa, chain + ## HITS:1 COG:BH2820 KEGG:ns NR:ns ## COG: BH2820 COG0491 # Protein_GI_number: 15615383 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Bacillus halodurans # 6 202 6 208 211 
150 38.0 1e-36 MKIKTLLLGNMQTNGYVVSDENHHCLIIDPGANGKKVVHYLTENELVPEAVLLTHGHFDH IGAVDYLYEHYHCPIYLHQDDLEMLDNPQLNLSVYENPFTVKAPVQSSHEEMKFGDFDVQ WLHLPGHCPGSSMIYLKDENIIFSGDVLFKGSIGRFDFPNSSKYETIESINKIKEYDFDA VIYPGHGPNSTLSEERLNNPYLKKS >gi|223714159|gb|ACDT01000056.1| GENE 7 7186 - 7353 108 55 aa, chain - ## HITS:1 COG:SPy1336 KEGG:ns NR:ns ## COG: SPy1336 COG2801 # Protein_GI_number: 15675275 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 1 53 216 268 268 60 50.0 8e-10 MKNEMFYGFEDTFKSLEELKQAMIDYIEYHNNSIITVKRKGLTPIQIRNQALQLI >gi|223714159|gb|ACDT01000056.1| GENE 8 7443 - 7820 312 125 aa, chain - ## HITS:1 COG:FN0841 KEGG:ns NR:ns ## COG: FN0841 COG2801 # Protein_GI_number: 19704176 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 14 125 17 121 207 90 44.0 6e-19 MSHMGLYALNPTPKQRYNSYKGEMNGTCKNLLLDKRENKYIRKTYFNRNLKTTSVNQKWT TDVSEFKTAMSKLYLSPILDMHSRKIVGYDISTTPSLFQTYRMLDMAFSKFFHSNQGWQY QHFSY >gi|223714159|gb|ACDT01000056.1| GENE 9 8224 - 10917 3089 897 aa, chain + ## HITS:1 COG:yieL KEGG:ns NR:ns ## COG: yieL COG2382 # Protein_GI_number: 16131587 # Func_class: P Inorganic ion transport and metabolism # Function: Enterochelin esterase and related enzymes # Organism: Escherichia coli K12 # 653 836 188 396 400 67 27.0 2e-10 MNKKNLLLSVALSGTMLMSNGITALTAQNAVYSPGVTVEKNTNPDWDADYTATFVYEDKD ARDAERVTVSGNFQFYSLNDPVVKNYEKTADPTGAKVYSAYDYDKDNAMFNVSGGQNNDT PIYELEETGDERFEITLPLPGNQYFYDYTVEYSDGSKVTIQDPVNPSKKNSLTDHDSGHS VFYVGNSENTTAGQEYIYPRADGQTGKYEFIEYKDAHNTDEETQSLGVYLPYNYDPSKTY KTIYVSHGGGGNEAEWMEIGSLPNIMDNILAEKEAAESVVVTMDNSHFGWDYDLIAKNFK ENIIPLIEERYSVSTKVEDRALCGLSMGALTTGTIMFSYTDMFGAYGNFSGTADPLICKD WELLKTKTVYLTGGNLDMAIQATQESGDSAFNRGKTVRAHEYLEELGVEHFYDLKYGSHD WGVWRDAFTTFVKDILWGYEYPGESTYQAGVTVEENINPAYDGDYITTFVYEDQDIKDAV KVTVSGNLQFYSKDDEAVKNYTQTLDFSQAKVYNAYEYKDGMFNTGYGLNNDTAIYELTE 
TKDERFEITLPLPGNLYYYDYTVTYADGTTVTIQDPANPSLKNDYNNHDAGHSLVYVGSS KNTIDGQEYVYARDDNQKGTYSFVNYDAIDGTKQPLGIYLPYNYDATKSYKTIYVSHGGG GNEVEWMTIGAVPNIMNNLLADKEAAEAIVVTMDNTYFGWDYDQIKNNLMNHIIPFIEAN YSVSTISNDRAFCGLSMGGLTTTSIYTTLADKFGYLGIWSATDPNTDISAIKNADKPTLL LAAGIVDYGKVGFTGNKDFAGLLANLDEAGIKYDYIETYGAHDWGTWRSLFTTFVKDYLW DIKEVKDEPTVNPGDTTKPVVSVKTGDNVDINGLLLISGLSLLGISIQYYCKAHRGH >gi|223714159|gb|ACDT01000056.1| GENE 10 11409 - 11624 178 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734639|ref|ZP_04565120.1| ## NR: gi|237734639|ref|ZP_04565120.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 71 1 71 71 125 100.0 1e-27 MKISLEEIKKSIYQCIADWEKDGRYNHVAFVVASGSKISSGYDDVTIAQHTSNCCARVTS SVTGWGNTSAV >gi|223714159|gb|ACDT01000056.1| GENE 11 11930 - 12356 299 142 aa, chain + ## HITS:1 COG:SA1374 KEGG:ns NR:ns ## COG: SA1374 COG2804 # Protein_GI_number: 15927124 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Staphylococcus aureus N315 # 1 142 1 138 324 71 35.0 5e-13 MDELFEKILNHAINYKTSDIHILMKQKCSVSFRQHGDLVLYDTFESHIGLKLFNYIRFKS NIDINYRLRPQTGHLNYIYQNQIYYLRISSLPGRELDSIVIRILNNHQQIALDKLTIFKE TTVFLKHITTLEAGLFIVSGAT Prediction of potential genes in microbial genomes Time: Thu May 26 09:47:56 2011 Seq name: gi|223714158|gb|ACDT01000057.1| Coprobacillus sp. D7 cont1.57, whole genome shotgun sequence Length of sequence - 3469 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 2, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 5 - 64 5.5 1 1 Op 1 24/0.000 + CDS 101 - 466 296 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 2 1 Op 2 6/0.000 + CDS 459 - 1460 528 ## COG1459 Type II secretory pathway, component PulF 3 1 Op 3 . + CDS 1473 - 1778 429 ## COG4537 Competence protein ComGC 4 1 Op 4 . 
+ CDS 1756 - 2175 177 ## gi|167756939|ref|ZP_02429066.1| hypothetical protein CLORAM_02488 5 1 Op 5 . + CDS 2165 - 2386 178 ## gi|237734645|ref|ZP_04565126.1| predicted protein 6 1 Op 6 . + CDS 2338 - 2673 223 ## gi|237734646|ref|ZP_04565127.1| predicted protein + Prom 2675 - 2734 5.9 7 2 Op 1 . + CDS 2781 - 3161 190 ## gi|167756942|ref|ZP_02429069.1| hypothetical protein CLORAM_02491 8 2 Op 2 . + CDS 3237 - 3468 312 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) Predicted protein(s) >gi|223714158|gb|ACDT01000057.1| GENE 1 101 - 466 296 121 aa, chain + ## HITS:1 COG:CAC2105 KEGG:ns NR:ns ## COG: CAC2105 COG2804 # Protein_GI_number: 15895375 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Clostridium acetobutylicum # 1 63 320 382 491 69 50.0 2e-12 MIGEIRDEQTARLAVTCALTGHLVLTTIHSSNCLTTIRRLLNLGLLKIDIEDVLIGIICQ KMLYHKYTHQPLILPEYLNRQKILTFFNNEPVKYQTFEMNAKYLLKNNLVDYTQVAGIIY E >gi|223714158|gb|ACDT01000057.1| GENE 2 459 - 1460 528 333 aa, chain + ## HITS:1 COG:L0314 KEGG:ns NR:ns ## COG: L0314 COG1459 # Protein_GI_number: 15674104 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Lactococcus lactis # 6 326 39 349 357 65 21.0 1e-10 MNDDLQLLENIANFLKQGYLIKDVLNLCLIIHQTEKIKLIEQNLVAGKAFDEVIIEIDFD NTFIEYFKFFRFKNNISTAIEHSLKICRKKEQIFTQLKKELTYPVLLIIFLMFFSLFIVY GLLPSIMQLFVEFNISPSIITRIIFKLFEIIPIIVIFIILSFTVFFTISIYAIKRKYFKL IDLLIEKIVLIRRLIQKYYSLKFALYYNELLINGYDSTDIIVMLYEQIDDSDIKMIIYEI YRQVLEGEALEDIINDFEYFEPLFIAYFKLLIHDNQKDKSLDNYLRVSIDTLHMQVTRLI KLFVPIIYCFVAGFVILVYVAIVIPMMNVVSNL >gi|223714158|gb|ACDT01000057.1| GENE 3 1473 - 1778 429 101 aa, chain + ## HITS:1 COG:BH2827 KEGG:ns NR:ns ## COG: BH2827 COG4537 # Protein_GI_number: 15615390 # Func_class: U Intracellular trafficking, secretion, 
and vesicular transport # Function: Competence protein ComGC # Organism: Bacillus halodurans # 6 96 10 96 102 64 31.0 5e-11 MKRNNGFTLIEMIFCISVILVILLLVIPNVTSKNRVVKEKSCDGQIEVVNSQIVLYEIEN GELPTSISDLTSGEHPYLTEKQATCPSGLRISISDGQAYVR >gi|223714158|gb|ACDT01000057.1| GENE 4 1756 - 2175 177 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756939|ref|ZP_02429066.1| ## NR: gi|167756939|ref|ZP_02429066.1| hypothetical protein CLORAM_02488 [Clostridium ramosum DSM 1402] # 5 139 1 135 135 221 100.0 2e-56 MDKHMYDDSGFTLIEVVFTISIILILSTITLHFVINSKPVISLQQQCQQVISLLEEGKSR AMINHEQINIVIQTKQISYNGQKDQRTLTINDNYYIDDYYEFHFNHNGNISSGGHLKICS SNGCKSIILNVGSGAFYVK >gi|223714158|gb|ACDT01000057.1| GENE 5 2165 - 2386 178 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734645|ref|ZP_04565126.1| ## NR: gi|237734645|ref|ZP_04565126.1| predicted protein [Mollicutes bacterium D7] # 1 73 8 80 80 117 100.0 2e-25 MLNKQGMTLIETLMAFSIFISIIVLFLSCYNNAINHHYQINQDYTNYLKQQQEKEVELWQ TSGLNESVNEVLH >gi|223714158|gb|ACDT01000057.1| GENE 6 2338 - 2673 223 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734646|ref|ZP_04565127.1| ## NR: gi|237734646|ref|ZP_04565127.1| predicted protein [Mollicutes bacterium D7] # 1 111 1 111 111 196 100.0 3e-49 MADKWFERISQRGFTLIEVLFSLSICLLIILNSVPILRIVTAKDKLSFNLSSYALGVKQI SSILHTAKDIEINDDLTYTNANNEIFTISLNHHRIVKEPGFDIIIHNVDEI >gi|223714158|gb|ACDT01000057.1| GENE 7 2781 - 3161 190 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756942|ref|ZP_02429069.1| ## NR: gi|167756942|ref|ZP_02429069.1| hypothetical protein CLORAM_02491 [Clostridium ramosum DSM 1402] # 1 126 4 129 129 189 100.0 4e-47 MKLKQYCNNRGTILQVVLVIFIVLILNISLVFNNIIENSRSLERIRELDRARLLEITILR YYKETILNDLLFSDEISIDDYFINYTVDDMGSYYYIVTTIKQNDSSYSFNLEINIETLVI SSFDYQ >gi|223714158|gb|ACDT01000057.1| GENE 8 3237 - 3468 312 77 aa, chain + ## HITS:1 COG:MYPU_5130 KEGG:ns NR:ns ## COG: MYPU_5130 COG0231 # Protein_GI_number: 15828984 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P 
(EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Mycoplasma pulmonis # 1 77 1 77 187 93 57.0 1e-19 MISVGDLRPGITFEYEGNLYVVLDYSHNKTARAAANIKVKMKNMRSGATTEITFGGNDKV KKAHIDKRKMQYLYNSG Prediction of potential genes in microbial genomes Time: Thu May 26 09:48:21 2011 Seq name: gi|223714157|gb|ACDT01000058.1| Coprobacillus sp. D7 cont1.58, whole genome shotgun sequence Length of sequence - 6851 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 1, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 41 - 100 9.1 1 1 Op 1 1/0.000 + CDS 164 - 568 471 ## COG0781 Transcription termination factor 2 1 Op 2 . + CDS 568 - 1884 1090 ## COG1570 Exonuclease VII, large subunit 3 1 Op 3 . + CDS 1895 - 2110 331 ## gi|237734651|ref|ZP_04565132.1| predicted protein 4 1 Op 4 13/0.000 + CDS 2112 - 2960 1009 ## COG0142 Geranylgeranyl pyrophosphate synthase 5 1 Op 5 6/0.000 + CDS 2960 - 4831 2059 ## COG1154 Deoxyxylulose-5-phosphate synthase 6 1 Op 6 . + CDS 4812 - 5612 775 ## COG1189 Predicted rRNA methylase 7 1 Op 7 . 
+ CDS 5621 - 6832 1356 ## COG0497 ATPase involved in DNA repair Predicted protein(s) >gi|223714157|gb|ACDT01000058.1| GENE 1 164 - 568 471 134 aa, chain + ## HITS:1 COG:L92686 KEGG:ns NR:ns ## COG: L92686 COG0781 # Protein_GI_number: 15672676 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Lactococcus lactis # 19 130 207 318 323 66 36.0 1e-11 MKKYRKKIIREKAVIATYQKLLVDTNEDEIREYLNSDKDLSSNQDDFDYCFMFIISIASN IERYKAEVAKYLKPGWTLDRLSKMELAILLVGCYELLETEQSKEVIINEAVELSKKYCDS DAYKFINGLLNRIK >gi|223714157|gb|ACDT01000058.1| GENE 2 568 - 1884 1090 438 aa, chain + ## HITS:1 COG:SA1354 KEGG:ns NR:ns ## COG: SA1354 COG1570 # Protein_GI_number: 15927104 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Staphylococcus aureus N315 # 5 435 4 435 445 361 45.0 2e-99 MEKRYLTVSALNRYLKAKIDSDSQLQRILIKGEVSNFKHHSSGHFYFTLKDEHSRINAVM FSSKASKVPFDLTNGMKVLVQASVSVYDVAGTYQLYVDTIEQDGLGNLFLKYEQLKKQLA SEGLFNPENKLVIPKFPSKIAVLSAYPSAALADIIRTIHLRFPVVRVIVFPIPVQGKDAY LEIIRTLRYVDTLGFSEIIIARGGGSLEDLWNFNEEGLARAIYQCKTPIISGVGHEVDFT ICDFVADYRAATPTAAAIKATPDLFELQQAVDNIKYTLNNLMKQKIILNKENLNRLKSFY LFKNPQKMFEDKTAKIDYLYDQLNNVFNHDLIEKQNKASNLIQTFNHQANLFTLNQRNRL DTINRTMSLEIKRKLQHNQEKFYYSLSKLNTLSPLKTLERGYAIVLKEDHVISSVDDLNS GDKIELKLHNGIKKAIIE >gi|223714157|gb|ACDT01000058.1| GENE 3 1895 - 2110 331 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734651|ref|ZP_04565132.1| ## NR: gi|237734651|ref|ZP_04565132.1| predicted protein [Mollicutes bacterium D7] # 1 71 1 71 71 100 100.0 3e-20 MDNITFEQAMSRLEEIIAALENNQISLEKSVDLFQEGIKLSKICSDKLAGIEDKVAKILV DGKLEDLKIEE >gi|223714157|gb|ACDT01000058.1| GENE 4 2112 - 2960 1009 282 aa, chain + ## HITS:1 COG:CAC2080 KEGG:ns NR:ns ## COG: CAC2080 COG0142 # Protein_GI_number: 15895350 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Clostridium acetobutylicum # 21 280 25 287 289 218 47.0 9e-57 
MENLIEQINERLLEIIEEFGDSKVKEAMKYSLMAGGKRIRPVMMLQVIRSYDKNYQDYLD IACAIEMIHTYSLIHDDLPGMDNDDLRRGRLTCHKQFDEATAILAGDALLNEAVNVIIKT KVPDELKISLLATLYQASGINGMILGQALDMEYETKLATRDQLDLIHHHKTGDLISAAMK MGALIANPDDQKTWVEIGYKIGLAFQIQDDVLDVVGDSSLLGKKVGSDQVNHKSTYVTLM GVEQSQMIVEQYFKEAMELIYKLKINHGLILEIMEKLKKRVK >gi|223714157|gb|ACDT01000058.1| GENE 5 2960 - 4831 2059 623 aa, chain + ## HITS:1 COG:lin1402 KEGG:ns NR:ns ## COG: lin1402 COG1154 # Protein_GI_number: 16800470 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Listeria innocua # 1 619 6 630 632 557 44.0 1e-158 MDLEKITDPSFLKELDIRQLNQLADDIRKFLINNISKTGGHLSSNLGVVELSIALHYVFN SPKDKIFFDVGHQSYVHKILTGRAGQFSTLRQYKGLSGFQKRCESEHDPWEAGHSSTALS GALGMAVARDLNHENYHILPVIGDAAMVGGESLEALNHLGSINNKVIIILNDNQMAIGKS VGGFGDFLSSIRISGTYNNLKEDYRNITSHNKIGKMIFNVSKRVKDFVKHGLIDDTIFED FGVDYLGPVNGHDFDDLIRVLNLAKSAKTSVVVHVVTKKGRGYKYAENDIAGKWHGIAPF NIEDGTMKGAGTSSKISWSKMVADHIERQMKSDQDIVAITPAMIHGSAMDNIFMHYPERS FDVGIAEEHALTFTAGLAISQKKPFISIYSSFLQRAYDQINHDIARMDLGCLICVDRCGF VGADGPTHHGVFDLGILTPLPNVIICTPSNSYDAKRFINTYLKNNDHPYILRIPRGDIED MNVGDELLTIGKWQVVNRKDYDVTIICYGQNVNLIREFFKDKEIKVRIIDALFIKPMDED MLNEIIDEKPLIVYETELKTGSLASNIAYYYSQNNILKRIHSFGVDDHYSVQGTVAQILQ DEGLDMDTFYQKVKEILNEEGKN >gi|223714157|gb|ACDT01000058.1| GENE 6 4812 - 5612 775 266 aa, chain + ## HITS:1 COG:BS_yqxC KEGG:ns NR:ns ## COG: BS_yqxC COG1189 # Protein_GI_number: 16079482 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Bacillus subtilis # 2 246 4 248 269 316 62.0 3e-86 MKKERIDVLLVEQGFFDSREKAKRAIMAGIVHDDYDIIDKPGTKIPIDSNLHVKGNIMPY VSRGGLKLERALKEFELDITDRIMVDIGSSTGGFTDCALQNGVKLVYAIDVGTNQLVWKL RNDPRVIVKEQTNFRYATSELFEHGIPTFASIDVSFISLKYIFDALKNILHSGNQVVALI KPQFEAGREEVGKKGIVKEPSVHQKVIQHVIEYANHDGFVLEKLTYSPITGGEGNIEFLG LFVKDGNNVAIDIERVVETAHNCFKG >gi|223714157|gb|ACDT01000058.1| GENE 7 5621 - 6832 1356 403 aa, chain + ## HITS:1 COG:BS_recN KEGG:ns NR:ns ## COG: BS_recN 
COG0497 # Protein_GI_number: 16079480 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Bacillus subtilis # 1 397 1 400 576 204 33.0 3e-52 MLESIYIENFAIIDRLEVDFHNHMTVLTGETGAGKSIIIDAIGQLMGNRSQSSFIKADCD ECFIEGVFTIGAKSPVLNKLKEYRIDYEDKLVVSKSFNRDNKSIIKINYRNVSKMVLQSI MADLIDIHSQFETHSLFDAENHLIILDEFINQPLKKLFQTYSLAYRTYREINRDYQKALN EELSDEQLEFYQAQLAEINSLDLEELDEDELEREKKLLQSYEKTNEQISKYRQYMNGDRG ALATLSNALSELEELNDNPQYQNTYERMYDLYYNLIDLDDEIINEFNSTNFDEYRLNEIQ EVFFKLNRLKRKYGQSIEAIKEAKEDLEMKVAAFNNRETYLNDLKKQLDIAYQETKSIAE QITRLRQSKAKEFTELVTKELKSLYLDKVVFKVDFKLVDFQKK Prediction of potential genes in microbial genomes Time: Thu May 26 09:48:26 2011 Seq name: gi|223714156|gb|ACDT01000059.1| Coprobacillus sp. D7 cont1.59, whole genome shotgun sequence Length of sequence - 2676 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 + CDS 82 - 432 184 ## COG0497 ATPase involved in DNA repair 2 1 Op 2 . + CDS 486 - 1475 931 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 + Term 1476 - 1525 13.6 + Prom 1483 - 1542 8.8 3 2 Op 1 8/0.000 + CDS 1562 - 2233 556 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) 4 2 Op 2 . 
+ CDS 2226 - 2549 280 ## COG1687 Predicted branched-chain amino acid permeases (azaleucine resistance) Predicted protein(s) >gi|223714156|gb|ACDT01000059.1| GENE 1 82 - 432 184 116 aa, chain + ## HITS:1 COG:BS_recN KEGG:ns NR:ns ## COG: BS_recN COG0497 # Protein_GI_number: 16079480 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Bacillus subtilis # 1 111 455 565 576 90 48.0 6e-19 MLAIKILSLSSSSVETIIFDEADTGVSGKVAESIGAKMKYISKQHQVLCITHLAQVAAFA KNHYLIQKSSNDNYTNVKIKELSYDQSINEIAKLISGKEVSQESINHAKKLKISSE >gi|223714156|gb|ACDT01000059.1| GENE 2 486 - 1475 931 329 aa, chain + ## HITS:1 COG:BS_spoIVB KEGG:ns NR:ns ## COG: BS_spoIVB COG0750 # Protein_GI_number: 16079479 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Bacillus subtilis # 25 325 104 411 425 203 39.0 4e-52 MLFKKLILSFILALAIINPIAIYAISLVPGGDSIGIELDYQGVVITGGYKIKVGNESYDP LAKDFKVGDIIVAINNQKVTSIEELSNVIKEGDIANPRYDLTIKRGKETLHHDLQVVYEN QQFSTGLYVKDAISGVGTLTFYNPATSTFGALGHAMSDSKLDSEELIQNGNIFESTVTSI KKATTQSSGNKIADISNIEIGSINSHSQFGIYGTYNYDISKREMMETASIEETKLGKAYF LTVLDGDKIQKCEIEITKLNDQDSIKEKGIEFTVTDQEVINKANGIVQGMSGSPIIQNNK IVGCVTHVSGNNPIIGYGLYIDWMLEMDK >gi|223714156|gb|ACDT01000059.1| GENE 3 1562 - 2233 556 223 aa, chain + ## HITS:1 COG:jhp1251 KEGG:ns NR:ns ## COG: jhp1251 COG1296 # Protein_GI_number: 15612316 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Helicobacter pylori J99 # 3 221 5 227 228 168 43.0 8e-42 MNLLTIKSAFKESIPVMMGYLVLGFAFGMLLVSKGFPIYYAFIMSCFIYAGSMQFVTISL LAGQASFISSFIMTLMVNARHLVYGLSMLKKFNFLGKLKPYMIFSLTDETFSLLVKNDFK SKNEVFLISFLDQCYWIIGSLVGATIGNNVSFNTQGLEFSMTALFIVIVINQIKNNSNHL ATLIGFFVSIICLIIFGSDNFVIFSMILIMIILILVKPRLKNE >gi|223714156|gb|ACDT01000059.1| GENE 4 2226 - 2549 280 107 aa, chain + ## HITS:1 COG:BS_azlD KEGG:ns NR:ns ## COG: BS_azlD COG1687 # Protein_GI_number: 
16079723 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permeases (azaleucine resistance) # Organism: Bacillus subtilis # 1 106 5 110 110 90 44.0 5e-19 MNNDVLIIIAVALGTILTRVLPFLIFNDAENLPPAISYLSKVLPYSIMAMLVVYCLRDTT FISGNHGFPEIIAVSITIIIHLLKENTLLSILVGTITYMICIQVIFA Prediction of potential genes in microbial genomes Time: Thu May 26 09:48:27 2011 Seq name: gi|223714155|gb|ACDT01000060.1| Coprobacillus sp. D7 cont1.60, whole genome shotgun sequence Length of sequence - 814 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 38 - 790 666 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair Predicted protein(s) >gi|223714155|gb|ACDT01000060.1| GENE 1 38 - 790 666 250 aa, chain + ## HITS:1 COG:BS_yqjH KEGG:ns NR:ns ## COG: BS_yqjH COG0389 # Protein_GI_number: 16079444 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Bacillus subtilis # 1 247 155 399 414 176 38.0 3e-44 MASDMKKPMGITILTRSNLKEIMWPLDIKDMFGIGKKTQPKLKAVGINTIGDIANYDNYN KLRQIIGKNALLLYRKANGIDNSAVDAKQNELKSVGNSTTLPYDTNDEEILRDTLKSLAR QVSARANKRNLISNSISITIKYTRFESVTRQTTVNSFINDYETILSTAKMLFDANYSGRP VRLLGISLNNTINKKNYKEQLNIFEMAQEDETNDDSLEQLLNAINNKFDKKLVTKASSFA KKTPQKKYLK Prediction of potential genes in microbial genomes Time: Thu May 26 09:48:31 2011 Seq name: gi|223714154|gb|ACDT01000061.1| Coprobacillus sp. D7 cont1.61, whole genome shotgun sequence Length of sequence - 12486 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 3, operones - 2 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 55 - 441 251 ## gi|167755499|ref|ZP_02427626.1| hypothetical protein CLORAM_01013 2 1 Op 2 . 
+ CDS 434 - 1066 721 ## gi|237734252|ref|ZP_04564733.1| predicted protein 3 1 Op 3 . + CDS 1084 - 1548 413 ## gi|167755501|ref|ZP_02427628.1| hypothetical protein CLORAM_01015 4 1 Op 4 . + CDS 1545 - 2477 933 ## gi|167755502|ref|ZP_02427629.1| hypothetical protein CLORAM_01016 5 1 Op 5 . + CDS 2515 - 3939 1438 ## gi|167755503|ref|ZP_02427630.1| hypothetical protein CLORAM_01017 6 1 Op 6 2/0.000 + CDS 3957 - 5627 1779 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 7 1 Op 7 . + CDS 5637 - 6680 1025 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT 8 1 Op 8 . + CDS 6701 - 7558 797 ## gi|237734258|ref|ZP_04564739.1| predicted protein + Term 7564 - 7599 -0.8 + Prom 7598 - 7657 8.4 9 2 Op 1 . + CDS 7698 - 8888 1243 ## Lebu_1198 protein of unknown function DUF201 10 2 Op 2 . + CDS 8888 - 9625 716 ## COG4947 Uncharacterized protein conserved in bacteria 11 2 Op 3 . + CDS 9638 - 10843 1240 ## Lebu_1198 protein of unknown function DUF201 + Term 11036 - 11070 2.5 + Prom 11054 - 11113 9.5 12 3 Tu 1 . 
+ CDS 11249 - 12485 1463 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase Predicted protein(s) >gi|223714154|gb|ACDT01000061.1| GENE 1 55 - 441 251 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755499|ref|ZP_02427626.1| ## NR: gi|167755499|ref|ZP_02427626.1| hypothetical protein CLORAM_01013 [Clostridium ramosum DSM 1402] # 1 128 87 214 214 224 99.0 1e-57 MQEIQKTIDNLKKVNANIASYPNLSEDIYNACYKAALQGKITSSNYEQTTGELTLVIETS QVPYTKQIVKNLMDLNIFTKVRYSGYKEVEKEDESSNSGSIYMPSTTPEKVVVYQMTVIC SLGGGINE >gi|223714154|gb|ACDT01000061.1| GENE 2 434 - 1066 721 210 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734252|ref|ZP_04564733.1| ## NR: gi|237734252|ref|ZP_04564733.1| predicted protein [Mollicutes bacterium D7] # 1 210 1 210 210 322 100.0 7e-87 MNKLSRREQILIYILVLLMVVVVGWYFMISPAMAKNTTLKQQRDNLEFELESKKAVYETN IDVNAAIKKAKDELVDYHNKFMSIMNDYEIDVYFSELAAEYNLTPVDLTINEASETEIKS FSVALQEAKAESNSKSETTSNKSEEKEENPKTLVANVQQKVTGSPTNIARYIGKISENTG VALTQLQYTYNRDSTKEYTILTYNLYMIEK >gi|223714154|gb|ACDT01000061.1| GENE 3 1084 - 1548 413 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755501|ref|ZP_02427628.1| ## NR: gi|167755501|ref|ZP_02427628.1| hypothetical protein CLORAM_01015 [Clostridium ramosum DSM 1402] # 1 154 1 154 154 287 100.0 2e-76 MIKNNKGSGLTWAVMIVMILLLIIGASLTFAYSSYNRSIDNRNQLQCDLYAKSALDSLIK AIGDNETDLIPKSSAFEDNIYPEISLPNAKGEVSQAVIHIVDKEQEKDNDKKIIPLIYIE ITYEYNDRTSNVQAVMQKIGGIWKVINYDGGEAA >gi|223714154|gb|ACDT01000061.1| GENE 4 1545 - 2477 933 310 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755502|ref|ZP_02427629.1| ## NR: gi|167755502|ref|ZP_02427629.1| hypothetical protein CLORAM_01016 [Clostridium ramosum DSM 1402] # 1 310 1 310 310 557 100.0 1e-157 MINNKGMTLVEEIVALAIISIASLIMLVGFSTAANVFADSTRYKDITNKQYAALLGNENT DKDINVSNEDAKVIIKVADKVITVKTNQSNATSKKDSQTMLSKLNFNNINLSNTPQTVAN CNDFLDSVLSFTTIKQFDEWIAEQGIDVASYNGWYKNNDCYRYYYYNRYGAAHLTLEQDI INECNRIFDEVHQTATNQNEKKARIGDKTLYIKPYFCGGIWQVGIDSEDGYFLMASELPG ISGGTQWRTNLIYARGSWYYKVFAFGTSDQNSYVDVAGFSNAKATVDSLFDGSISEKINL SDQSQWVKIK 
>gi|223714154|gb|ACDT01000061.1| GENE 5 2515 - 3939 1438 474 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755503|ref|ZP_02427630.1| ## NR: gi|167755503|ref|ZP_02427630.1| hypothetical protein CLORAM_01017 [Clostridium ramosum DSM 1402] # 1 474 1 474 474 879 100.0 0 MKQFKNNNGMTLIEVTVVLLISTVLLTIVGSILVTSLNFFRDQKLSDQNKQKLDGIAEYV DNEINYVTDLAIDNEAPYEREGWHYLSIIDGKLYRDGIKVFDDKYYDNYTVSIKVKGYDG YRLDVKYDFNKKSELVYGTAKSYSFDNLKLKVEAGGEDYLVSASNQVEVNDTYRIYYIKG INIIDTNDKPDVEEPDESNDNITVADQVHCINTYNNRGVFDGVQRHYMIGDFVYYKGYWW QLITQENNYGFGAEPDREQKWKKIDKNYDRFSAYTVNDVIYYPENGHYYKCLKQLDNKGG DSGYGPTGWNGIHENYWEDLGTTNPTTDSGHDCLELSTKNRIKTVMNKLDSLTKEQINKI PAYLNTVVYPVGSWVYEDVDGRKQYYLKVFDGDGSAPGLSASSGWQIISRDWYLESAYIK DDVVYVTNGGRSIFITFNKTIDLTIDLVNGTNINGNRIDIDNPYSYFNKKNSMY >gi|223714154|gb|ACDT01000061.1| GENE 6 3957 - 5627 1779 556 aa, chain + ## HITS:1 COG:aq_1971 KEGG:ns NR:ns ## COG: aq_1971 COG2804 # Protein_GI_number: 15606969 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Aquifex aeolicus # 14 551 17 563 566 376 41.0 1e-104 MRNIPIGEVLKEYGYINDEQLNVALEAQKSNRSKRLGQHLIDLGFVSEYQMLEALSDKLA EPLIELSEIKVDIDAVQKIPRAMADKYNIIAIDLTDQQLTIVTSDPLNFYGIEDVRLVTG MHLNVCLATKAEVSKAIDRYYNDVAALDIADDIKLNTIVVEDTLDLFNESEDDTPVVKLV NTLLSRGYVNNASDIHIEPFEDKVIIRMRVDGMLVDYLTLQKNIQNSLIVRIKILSNLDI AEKRLPQDGHFVGRVEGLELNMRVSVIPTVFGEKIVMRYLNSNTPITRSDTYGMTLDNYN KIESMINMPHGIIYVTGPTGSGKTTTLYMLLESISKRQINISTIEDPVEKNLPRINQTQV NNMAGLTFEVGLRSLMRQDPDIIMVGETRDAETAEISVRAAITGHLVLSTLHTNDAVSAI VRLEDMGVEPYLVANSLVGVVAQRLVRTICPKCKEEVPAKASDKIAVGEDIKKVSIGKGC PYCNNTGYKGRIAVHEIILIDGTVRRMISRKAEIDEIKEYLNLEQGLETLQDQAVQLVKD GITTVAELNKIKVYSD >gi|223714154|gb|ACDT01000061.1| GENE 7 5637 - 6680 1025 347 aa, chain + ## HITS:1 COG:DR1963 KEGG:ns NR:ns ## COG: DR1963 COG2805 # Protein_GI_number: 15806961 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: 
Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: Deinococcus radiodurans # 2 344 9 353 420 297 45.0 2e-80 MINEILKASRNNKCSDIHISANTGIKVRQDGLLADYPIEFPQNELIKMIMELIPKRLIDN VDRHEDADFVYHVWEERYRVNIYYEQTNICAAIRVINDEILTLEQLEMPTVLNQIAMEPR GLVLVTGPTGSGKSTTLAAMIDLVNKQRNCHILTIEDPIEYVYKQKQALIHQREIGEDVK DFDMALRSAMREDPDVILVGEMRDYETISAVITLAETGHLVFSTLHTIGAAKTIDRIIDI FPQHKQAQVRTQLSGVLNAVITQTLLPHASGIGRVAAVEIMRANDAVRNLIRDNKGHQIN SVIQTGKKEGMISLNHALANLVREGKITLDTAKKCASDISEFKQYLQ >gi|223714154|gb|ACDT01000061.1| GENE 8 6701 - 7558 797 285 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734258|ref|ZP_04564739.1| ## NR: gi|237734258|ref|ZP_04564739.1| predicted protein [Mollicutes bacterium D7] # 1 285 1 285 285 553 100.0 1e-156 MKGKRGFSLIEIVVVLMIFAILAALAWPALTNYYRDSNEEIYLAEGDKVLTAAQVEAKKL CSEVNGATKLDDIALKDSDGKILKRTALKGELVSIYPNDTRDDVGFFCYKVEDGSCYVIY ENGKLYISKDEVYYMDNIADRVRRGFLILFGDMWEEYFSKSGKVVMDSNGPNFGIKYEAK LKEMGIDISLCSFRIYVNDHGKNGDGSDATFTLTVSSKRITNEMAETKEEFQITRYIFTG GIKEGNYSKYTGTAKAVLKSENDTSGIRHNYAVIEANANSLKPVK >gi|223714154|gb|ACDT01000061.1| GENE 9 7698 - 8888 1243 396 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1198 NR:ns ## KEGG: Lebu_1198 # Name: not_defined # Def: protein of unknown function DUF201 # Organism: L.buccalis # Pathway: not_defined # 1 391 1 395 401 379 51.0 1e-103 MNFLFISPMFPKNYWNFCDRMKNHGVNVLAIGDMPNESISEELKNSVTEYCYVNNMENYD NVYRCVAYLASKYGKIDWLESNNEYWLELDAKLRTDFNVNTGMKEDITKIFKTKSGMKAY YQKAGVKTARYHLVNTLENGRKFISEVGYPVVVKPDNGVGAAATYKINNDQELEEFYNID HITQYIMEEFVNGLILSYDGIANNEHQVLFETSHVFPNSIMDVVNNHSNIHYYSLRKIPE ELQKLGRSVVKKFPVNARFFHCEFFQLLEDRPGLGNKGDFIGLEVNMRPPGGYTPDMMNW ANDIDVYNIYADMVTYNHSEYYTNRSYHCIYCGRRDGKKYYYNDDALMAMYGNHIVMNER MPDILSGAMGNYTYTARFETMDEVNDFIDKVLKEVC >gi|223714154|gb|ACDT01000061.1| GENE 10 8888 - 9625 716 245 aa, chain + ## HITS:1 COG:mll2788 KEGG:ns NR:ns ## COG: mll2788 COG4947 # Protein_GI_number: 13472478 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mesorhizobium loti # 1 245 10 258 267 
215 42.0 8e-56 MKVEYFKEYSNCLNREMEFKIYGHTGIPLLVFPAQDGRFYDFENFGMVSVIADKIEAGSV QVFCLDSIDGECWSDEGGNGPHRTYMMEQWYYYVMNELIPRIFDINGTGQKIYTTGCSMG ATHAANFMFRRPDIFQGCICLSGYYDSDLFFGNYHDERLYNNSPIQYLEGMSYDHPYVEM YRHCDIILCCGQGAWEDEMIYSTHRMQELLENKDIPAWIDYWGYDVNHDWDWWQIQLPYF INHIL >gi|223714154|gb|ACDT01000061.1| GENE 11 9638 - 10843 1240 401 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1198 NR:ns ## KEGG: Lebu_1198 # Name: not_defined # Def: protein of unknown function DUF201 # Organism: L.buccalis # Pathway: not_defined # 1 389 1 393 401 323 45.0 6e-87 MNVVFISPHFPLYFHNFCSRLKNRGVNVLGIGDAQYSEISNETKESLSEYYRVNSLENYD EVYKAVDYYISKYGRIDFVESQNEYWLETDARIRSDFNICSGTKFEDLAVMKYKSKMKAV YESVGLNVARYCLIDNFENALNFIDAVGYPVVVKPDNGVGATSTYKLNNQAELEYFFATK DERIYIMEEYVNGHVETFDGITDSNKNVLIANSTIMLNSIMDNVNEHCDTAFCNRFVAGS DIEEIGTKVVKAFDTRSRFFHFEFFRLDSDKEGLGKKGDLVGLEVNMRAPGAYMPDMINF SYESDVYTIWADMVIYDKCFIELKQKYLVAYIGRRNDIGYMFDHDQLISQFGGNIMLDVD VPEVLSAAMGNHVYMVRTENQEHLDYLINEFLKRCDGSNWR >gi|223714154|gb|ACDT01000061.1| GENE 12 11249 - 12485 1463 412 aa, chain + ## HITS:1 COG:BH0596 KEGG:ns NR:ns ## COG: BH0596 COG2723 # Protein_GI_number: 15613159 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Bacillus halodurans # 2 412 5 429 477 482 53.0 1e-136 MNFPKNFLWGGAIAANQAEGAYDEDGKGLNVTDVSTGLINVPDYRVIPGKYYPSHEAIDF YHRYKEDLQLMSEMGFNCFRTSISWGRIFPNGDDEEPNEAGLNYYEEMFNYMLELGMEPV ITISHYETPIHLVEQYGGWKSRELICFFEKYCRTLFNRYKDLVKYWMTFNEINNVHTIPF AAGAIRHDSNLQDKFQAAHNMFVASSRANKLCHEIIPDAKIGCMLSLSGVYPATCKPEDV MGAYNLRRRSLFFSDVMIRGKYPSYIKRIFEENNIQLDTRPEDFELIKDYPSQYLGFSYY RTTTYEDGMPILGNTGGVIGKANPYLKETPWGWQIDPLGLRYVCNELYDRYQVPLFIVEN GMGNIDQIEVDGSINDDYRIAYIRDHLEALKEAIKDGVDLIGYTYWGPIDIV Prediction of potential genes in microbial genomes Time: Thu May 26 09:49:47 2011 Seq name: gi|223714153|gb|ACDT01000062.1| Coprobacillus sp. 
D7 cont1.62, whole genome shotgun sequence Length of sequence - 16261 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 5, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 84 - 746 813 ## COG0637 Predicted phosphatase/phosphohexomutase 2 1 Op 2 . + CDS 778 - 1434 643 ## COG0637 Predicted phosphatase/phosphohexomutase + Term 1554 - 1604 8.2 + Prom 1437 - 1496 11.2 3 2 Op 1 . + CDS 1734 - 4217 2491 ## COG1409 Predicted phosphohydrolases + Term 4219 - 4257 7.2 4 2 Op 2 . + CDS 4264 - 5427 1062 ## COG5438 Predicted multitransmembrane protein 5 2 Op 3 . + CDS 5492 - 6934 1692 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 6935 - 6967 4.2 6 2 Op 4 . + CDS 6975 - 7844 670 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 7852 - 7884 -0.9 + Prom 7852 - 7911 8.7 7 3 Op 1 6/0.000 + CDS 8014 - 9453 1243 ## COG0579 Predicted dehydrogenase 8 3 Op 2 4/0.000 + CDS 9446 - 10690 1459 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 9 3 Op 3 . + CDS 10692 - 11063 400 ## COG3862 Uncharacterized protein with conserved CXXC pairs 10 3 Op 4 1/0.000 + CDS 11056 - 12549 1585 ## COG0554 Glycerol kinase + Term 12550 - 12576 0.3 + Prom 12590 - 12649 8.8 11 3 Op 5 . + CDS 12673 - 14001 1603 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases + Term 14029 - 14068 1.2 + Prom 14161 - 14220 11.7 12 4 Tu 1 . + CDS 14247 - 14852 620 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 14975 - 15025 -0.5 + Prom 14865 - 14924 9.4 13 5 Op 1 . + CDS 15111 - 15506 186 ## COG1959 Predicted transcriptional regulator 14 5 Op 2 . 
+ CDS 15523 - 16261 637 ## gi|167755524|ref|ZP_02427651.1| hypothetical protein CLORAM_01038 Predicted protein(s) >gi|223714153|gb|ACDT01000062.1| GENE 1 84 - 746 813 220 aa, chain + ## HITS:1 COG:VCA0102 KEGG:ns NR:ns ## COG: VCA0102 COG0637 # Protein_GI_number: 15600873 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Vibrio cholerae # 1 210 4 211 219 124 35.0 2e-28 MNLKLVIFDMDGLMYDTEQIGMDCLINAAQKFGYVIDQEFGLSSIGMNANDYQKLVKEKF GADYPYDLISKESRKTRMAYLRKNGMIIKPGLCELINYLQKKEIKLALASSSSKETIDEY NHLAGFDNVFDYIIAGNMVEHSKPDPEIFLKVLEHFELKKNEALILEDSRNGIMAAHNAN IPVICVPDLVKHGQDITKLTYATLPSLNEVKLEIEKIITK >gi|223714153|gb|ACDT01000062.1| GENE 2 778 - 1434 643 218 aa, chain + ## HITS:1 COG:CAC0153 KEGG:ns NR:ns ## COG: CAC0153 COG0637 # Protein_GI_number: 15893448 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 1 175 10 186 222 104 36.0 1e-22 MDGLMFDTEPLGAVCFARAAKQFGYIIEEEFRYKLIGINANDHYALMKSKFGQDCPAKEI HELSKKLRSDYLYKHGIVIKPGLFELITYLKNKGIKIAVASSSAYSKINEYLALAGLKNI FDLIVGGDDLEHGKPDPEIFLKVLKYFKIAADHALVLEDSTNGILAANAANIPVVCVPDY LPNCKEVLARTSAVLPSLVEVKNEIMKIFEENRVCSFL >gi|223714153|gb|ACDT01000062.1| GENE 3 1734 - 4217 2491 827 aa, chain + ## HITS:1 COG:CAC0205 KEGG:ns NR:ns ## COG: CAC0205 COG1409 # Protein_GI_number: 15893498 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 234 618 57 452 652 259 39.0 2e-68 MATLALLMSAGYVAQDYQVTPTYAETTKIGDSAQLISTATTWKYLDNNVDPGTETDRYAW TKADYNDSEWKSEAGKFGAKKGKLEDLGDGFVPTVLLNQYINGVNGDDIPAFFFRTTVNI SNLDDFSSLSGKLYYDDAAIVYINGVKVASFDEPEGGFESNMSYGGSNASNPKEGVISLT KEQLKDVIKTGQNTIAVELHQGRASSSDIYFEFNNLQVDYGQEETTVEQKALNLTIGEDE TKMNLTWYANTNTSGTVQLAKAGAMINGEFPSQFTTVEATNNQANDKGFYYNQATLANLE ENTKYVYRVVNGDQVSKIYDFTTKDFDGSYNFIFAGDPQIGASGSASKDTEGWDKTLSDS INKFNPNFILSAGDQVNTASDENQYSGYLDHEELTSVPQATTIGNHDSSSNAYTQHFNLP 
NETAKGETAAGTDYWYVYNNTLFMNINTNNTSTAEHKAFMKEAIKENQDVRWKVVVFHHS VYSVASHSVESSILKRREELTPVFDDLGIDVVLMGHDHVYVRSNMMKGMKVSQETKDLTS VTDPEGILYLTANSASGSKYYDIKTNISTDFVAKMDQSKQRSISNIEVSENSFKVTTYLY NSNDNQWSTLDEFTINKSVETNNQEITLVPEETANDIRVVAPVGTVEKNSTLSAVEVNEG DLYQGIKNTLNTILGSDKDFKVFDLSLIKNGNIINPLGKVQLSLALPEGYDASKLLIYQV QDSQNNDITLNPITYTIKDGRIIIETASLGQFVLVNNVENKKPDTGNGSNTNLPTVKPSN TNNSTTTTNNKVVTGDNSNLIAWGTLLFTALGCSLVAYKSKKEEKLK >gi|223714153|gb|ACDT01000062.1| GENE 4 4264 - 5427 1062 387 aa, chain + ## HITS:1 COG:CAC0206 KEGG:ns NR:ns ## COG: CAC0206 COG5438 # Protein_GI_number: 15893499 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Clostridium acetobutylicum # 48 372 49 374 397 192 38.0 8e-49 MKIFNELKENRVVIIIMLITIIAAGFFNQYLAKDYSRVNNDSTDFVSGKIVEITSSNLEY DQDLKINLGKQVVVVEILEGKSTGKRVEIDNYLTAAHNVEVAIGSKVIISADEPDGIDSY YTVYNFDRGLGMIIFACVLLLVIIAIGRGKGVKAILGLAYTLYLVIFLLLPTVFSGYSPV LMSIICVALSTIVTLMLLNGASKKTYSAIVATVLGVVLSAGGFYLMSLVLKVNGFSVDEA ESLVLINQATGLSIKDILFAGILISSLGAIMDVGMSIVSALSELFHHQPNLTQKQIFDSG IEIGKDMIGTMTNTLILAFTGSAFVSLLVLFSYNVDIKQLLSSNYIAIEFAQGIAGTLGI VLTVPIASFISAWALTNKKSNNNFFKS >gi|223714153|gb|ACDT01000062.1| GENE 5 5492 - 6934 1692 480 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 479 1 472 473 660 65.0 0 MSFPKGFLWGGATAANQLEGGWNEGGKGISCPDICTGGSHTQSKRITPTLEEGTFYPSHM AIDHYHRFKEDIALFAKMGFKVYRFSIAWTRIFPNGDETTPNEEGLKFYEDLIDECLKYN IEPLITISHYEVPFGLTKKCNAWASREMIDYYMNYCKTIFERYKGKVKYWLTFNEINSAT MPMGAFLSQGILNEQEKSTDFTNQVDDPQLRFQGLHHQFIASAKAVQLAHEIDPDYKVGN MMIYATSYPLTCDPDDIIKNQQQNHIMNYFCSDVQCRGYYPTYMKRYFKENNIEVTMEPG DEEILKAGPVDFYTFSYYMSVCQTADPTKGAGDGNILGGVANPYLKASDWGWQIDPKGLR YSLNEIYDRYQIPVMVVENGLGAYDEIEADGSINDDYRIDYLRSHIEQMHEAILDGVDLI GYTPWGCIDLVSASTGEYAKRYGFIYVERYDDGTGDFSRREKKSFNWYKKVIETNGEDLG 
>gi|223714153|gb|ACDT01000062.1| GENE 6 6975 - 7844 670 289 aa, chain + ## HITS:1 COG:BS_ycbK KEGG:ns NR:ns ## COG: BS_ycbK COG0697 # Protein_GI_number: 16077323 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 5 280 15 301 312 69 22.0 9e-12 MKTKGYFLTILSAVIFGFTPILAKITYSMGSNGITMAFFRHLFVIPILFLIMKCQRLPYK ITLHQLKNICLVGIIGNALTVAALYTSYSFIQVGSATVLHFLYPMFVSLICFFYYKERLS KAVKICLGTASFGILFFIELGGSSFLGLFLALFSGITFAYYMVGIEKLGLQSLNPYVLNF YFAIVISISLLMIGLISGQLELILPVEAYGYSFLIAVLTSIVGIICLQQGVRYLGAATAS ILSMFEPVTSVIFGIIILHEQLTIVKLIGCFIILSAITGLIVFNGKQTE >gi|223714153|gb|ACDT01000062.1| GENE 7 8014 - 9453 1243 479 aa, chain + ## HITS:1 COG:CAC1322 KEGG:ns NR:ns ## COG: CAC1322 COG0579 # Protein_GI_number: 15894602 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Clostridium acetobutylicum # 1 474 1 475 475 436 49.0 1e-122 MEDIIIIGAGVTGCAIARELSRDQRKILVLEKGSDVCVGTSKANSGIVHAGHDAKPGTLK AKLNLEGNLMMPQLAKDLDIPFVQNGSLVLCFDESQFEALEELYQQGLINRVKGLKVLDT KELLEMEPNLNPNVLKGLYAPTGGIICPFDLVLALAENAYANGVKFKFTQEVVGIEKNID HYIVKTMDKCYQSKIVINAAGVNCDLIHNLICNEKMKIVPRKGQYVLFDKTVGSMVSKTI FQLPTKLGKGVLVTPTVHGNLMIGPDAIDCNREEINTTMEGQKDIVDRASLSIKQVPYAN QITSFAGLRAHNTTGDFIIKEDLDNPGYFDVAGIESPGLTSAPAIGKYVYDLVGEKYPAK AKENFIKYRKGITKVMELPLEKRNQLIAKNPLYGQIVCFCENVSAGEIIEAINRPLSPTT IDGLKRRIRVGMGKCQSGFCLNKSMKLLSQELNKDIFTITKNGKHSTYLVGLNKEVNHE >gi|223714153|gb|ACDT01000062.1| GENE 8 9446 - 10690 1459 414 aa, chain + ## HITS:1 COG:CAC1323 KEGG:ns NR:ns ## COG: CAC1323 COG0446 # Protein_GI_number: 15894603 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Clostridium acetobutylicum # 1 413 1 415 417 422 54.0 1e-118 MNKDLVIIGGGPAGMAAALGAYEAGVKDLIIIERDECLGGILNQCIHNGFGLHTFNDELT 
GPEYAERYRRQIDELKIPYLLNTVVIDLNDKVITAMNGTSGTFMIHAKAIILAMGCRERS RGALNIPGYRPSGIYSAGTAQYFVNIDGYMPGKEVVILGSGDIGLIMARRMTLEGANVKM VIELMPYSSGLKRNIVQCLDDYNIPLKLSHTIIDIEGKERLEAVIVAKVDENLNVIEGSQ ERVCCDTLLLSVGLIPANEISTKANVMLSSLTKGAIVDDYLQTNQPGIFACGNVLHVHDL VDYVSNEAKLAGINAAKYLRDHLVLCQTITISYECGIRYVVPSVIHPNNDDNVIVRFRVD KVYQNKYLSLYLDGDRKIHQRKLVLAPGEMEQIVIKKDWLNEHIKKILITLEDE >gi|223714153|gb|ACDT01000062.1| GENE 9 10692 - 11063 400 123 aa, chain + ## HITS:1 COG:TM1434 KEGG:ns NR:ns ## COG: TM1434 COG3862 # Protein_GI_number: 15644185 # Func_class: S Function unknown # Function: Uncharacterized protein with conserved CXXC pairs # Organism: Thermotoga maritima # 6 116 5 118 138 84 35.0 4e-17 MEVIKLTCINCPLGCALEVRYIDEEYVVTGNRCIRGKKYALAEISNPTRILTTTIRVKNS PYMLSVKSNGGIAKDKIFECIQILKEIEVQAPISMNEIIVANILDSGIDIVATKSIGGDE LYG >gi|223714153|gb|ACDT01000062.1| GENE 10 11056 - 12549 1585 497 aa, chain + ## HITS:1 COG:FN1839 KEGG:ns NR:ns ## COG: FN1839 COG0554 # Protein_GI_number: 19705144 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Fusobacterium nucleatum # 3 496 2 497 497 640 62.0 0 MAKYVMALDQGTTSSRCILFDQKGNISAMAQREFKQIFPQPGWVEHNPMEIWSSQLAVAS EAMALNNTKADEIAGIGITNQRETTIVWNKDTGEPIYNAIVWQCRRTAAMIEQLKKDGLE EMVIAKTGLIPDAYFSASKIAWILENVDGARELANQNKLLFGTVDTWLIYNLTGGKIHVT DYTNASRTMLFDIHQLCWDDELLAYFKIPKSMLPEVKPSSCVYGYTKEGILGGHIPVSGA AGDQQAALFGQCCYEPGEAKNTYGTGCFLLMNTGNEIYHSKNGLITTIAASDNNEVSYAL EGSVFVGGAVMQWLRDGLRMLKSTPQSSDYANRVEDTNGVYLVPAFTGMGAPYWDPYAQG TIVGLTRGCSKEHFIRAALESIAYQTNDVLKAMQEDTGIELRCLKVDGGASQNDFLMQFQ SDISNCKVHRPQVVETTALGAAYLAGLATGFWASKEEIKNNWKLNYEFVPRINKSRREKS LEGWAKAVRFALMWKDA >gi|223714153|gb|ACDT01000062.1| GENE 11 12673 - 14001 1603 442 aa, chain + ## HITS:1 COG:SPy1150 KEGG:ns NR:ns ## COG: SPy1150 COG0446 # Protein_GI_number: 15675127 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Streptococcus pyogenes M1 GAS # 1 442 1 455 456 575 64.0 1e-164 
MSKIIVIGANHAGTAAINTILDNYEDDVVVFDGNSNISFLGCGMALWIGKQIDGPEGLFY SSKELLEEKGAIIYMETIVEHVDYEKKIVYAKGKDGKEYQESYDKLILACGSLPMRPTIP GSDLENVQMVKLYQNAQDVIEKLNNPELKNIVVVGGGYIGVELAEAFRRCDKEVTLIDCA DTCLGAYYDHSFTDLMNQNLTEHGIKLAFNQSVLEIKGNQKVESVVTDKGEYAADMVILA VGFKPNNELGKDVLKLYSNGAYLVNKKQETSLKDVYAIGDCATVYDNTIDDVNYIALATN AVRSGIVAAHNVCGQPLESIGVQGSNGICIYDLKMVSTGLTLSKAKKLGFKATSVSYRDL QKPAFIKENQEVMIEIIYDEETRRILGCQMASKYDISMGIHMFSLAIQEHLTIDKLALLD IFFLPHFNQPYNYITMAGLTAK >gi|223714153|gb|ACDT01000062.1| GENE 12 14247 - 14852 620 201 aa, chain + ## HITS:1 COG:pli0059 KEGG:ns NR:ns ## COG: pli0059 COG1961 # Protein_GI_number: 18450341 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 3 158 2 154 199 108 41.0 7e-24 MNKYAYIRVSTKDQNIDRQKFAMKEIGLGTRQMFIDKQSGKDFNRKGYKKLVKKLKKGDE LYIKSIDRLGRDYDEIIEQWRYLVKKKEVEIIVLDFPLLDTRNQVNGITGKFIADLVLQV LSYVAQIERENTKQRQAEGIKAAKEKGVRFGRPQLPYPENFEEIYLCYKEKQISKRECAR RLSTNHTTFSNWVNRYELKKE >gi|223714153|gb|ACDT01000062.1| GENE 13 15111 - 15506 186 131 aa, chain + ## HITS:1 COG:RSc3397 KEGG:ns NR:ns ## COG: RSc3397 COG1959 # Protein_GI_number: 17548114 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Ralstonia solanacearum # 1 131 1 131 172 62 27.0 2e-10 MQLKVSTDYAIRIVLYIAIKKDIVQSKELSETLGIPQSTVFKIGKKLSDNEIISIATGIQ GGFKLKKQPKDISIFSIIDIFEPTIRINSCLEEDKYCSRFATETCPVRKVYCTMQKHFEE YLKKIKILDLI >gi|223714153|gb|ACDT01000062.1| GENE 14 15523 - 16261 637 246 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755524|ref|ZP_02427651.1| ## NR: gi|167755524|ref|ZP_02427651.1| hypothetical protein CLORAM_01038 [Clostridium ramosum DSM 1402] # 1 246 1 246 590 482 100.0 1e-134 MLYLKRILIVGLSFVVGLINIIGNNYTVFALEKNTLVHQNVMSMIQSQRGGLFDPIYTES LEDNWQGDYIYYGEDTDGSALKWQLLDSDNKDFSKNANSTMFLFSDKTIGTSFWYRENDY GFDYSNVKPGETPIYWQKGYEWDKSDLKTNMEDEMKNKFSSIENEGIIFSSKSSEDAQNV AVGLEKLGTTEMTASKWFPLSIQELTNKCYGFSSVPFNRNQAPLGLHTGDKSRVSSDDKN YLIRSY Prediction of potential 
genes in microbial genomes Time: Thu May 26 09:49:58 2011 Seq name: gi|223714152|gb|ACDT01000063.1| Coprobacillus sp. D7 cont1.63, whole genome shotgun sequence Length of sequence - 1211 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 5 - 1006 927 ## CPF_1087 pullulanase family protein Predicted protein(s) >gi|223714152|gb|ACDT01000063.1| GENE 1 5 - 1006 927 333 aa, chain + ## HITS:1 COG:no KEGG:CPF_1087 NR:ns ## KEGG: CPF_1087 # Name: not_defined # Def: pullulanase family protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 183 270 2416 2502 2638 77 50.0 9e-13 MRYHPVFVLNSNSVYGANCAGGLNYGNMNTVQSMRPAMNIDTSKVMLVKSHNQKEIIGIK AVNDSKTNEFKLTLLDDKQKLNIENIKTERDKITFDFEAIGNGKYLSVVVQDENNNVMYY GNIKDNLDKVNVGTMTIDLNQIGLNSKSFSSGEYKLSIFTEQLNFDKKTDYSSKFVDVEV VQNNVPPAVINKVPTIVAEDKVLTVGDTFDPLKDVTAYDNEDGIIKLTEANIAANDVNTN KEGTYNVTYKVTDKQGASSTKTITIDVIERIVIPENKPVDKTNNDTNSSNSSKVPQTGDI NNIGVLAVTLMMSGSIVIGCNRKKSKANSLSEK Prediction of potential genes in microbial genomes Time: Thu May 26 09:50:10 2011 Seq name: gi|223714151|gb|ACDT01000064.1| Coprobacillus sp. D7 cont1.64, whole genome shotgun sequence Length of sequence - 24616 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 7, operones - 4 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 27 - 69 10.2 1 1 Tu 1 . - CDS 132 - 1454 1575 ## COG0527 Aspartokinases - Prom 1479 - 1538 12.9 + Prom 1575 - 1634 6.9 2 2 Tu 1 . + CDS 1658 - 3136 1773 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 3145 - 3195 10.4 + Prom 3439 - 3498 10.9 3 3 Tu 1 . 
+ CDS 3642 - 4862 1490 ## COG1434 Uncharacterized conserved protein + Term 4873 - 4915 7.2 + Prom 4933 - 4992 8.2 4 4 Op 1 7/0.000 + CDS 5041 - 6039 1199 ## COG1609 Transcriptional regulators + Term 6044 - 6104 5.1 + Prom 6093 - 6152 7.0 5 4 Op 2 1/0.000 + CDS 6183 - 8078 2424 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 6 4 Op 3 . + CDS 8090 - 9490 1366 ## COG1621 Beta-fructosidases (levanase/invertase) + Term 9496 - 9540 8.4 + Prom 9508 - 9567 6.5 7 5 Op 1 1/0.000 + CDS 9622 - 10587 1292 ## COG1879 ABC-type sugar transport system, periplasmic component 8 5 Op 2 19/0.000 + CDS 10587 - 11897 1283 ## COG4585 Signal transduction histidine kinase 9 5 Op 3 1/0.000 + CDS 11890 - 12564 737 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 10 5 Op 4 1/0.000 + CDS 12557 - 13852 1179 ## COG1653 ABC-type sugar transport system, periplasmic component + Prom 13864 - 13923 3.6 11 5 Op 5 9/0.000 + CDS 13994 - 14419 624 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA 12 5 Op 6 13/0.000 + CDS 14432 - 14923 710 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 13 5 Op 7 13/0.000 + CDS 14951 - 15787 1177 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 14 5 Op 8 . + CDS 15789 - 16625 883 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 15 5 Op 9 . + CDS 16641 - 16916 270 ## gi|167755543|ref|ZP_02427670.1| hypothetical protein CLORAM_01057 + Term 16993 - 17052 4.1 + Prom 16968 - 17027 4.9 16 6 Op 1 8/0.000 + CDS 17103 - 18893 231 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 17 6 Op 2 1/0.000 + CDS 18897 - 20585 220 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 18 6 Op 3 . 
+ CDS 20645 - 21289 688 ## COG2755 Lysophospholipase L1 and related esterases + Prom 21293 - 21352 5.8 19 7 Op 1 . + CDS 21378 - 23735 2317 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain 20 7 Op 2 . + CDS 23722 - 24336 336 ## gi|167755547|ref|ZP_02427674.1| hypothetical protein CLORAM_01061 21 7 Op 3 . + CDS 24386 - 24592 363 ## gi|167755548|ref|ZP_02427675.1| hypothetical protein CLORAM_01062 Predicted protein(s) >gi|223714151|gb|ACDT01000064.1| GENE 1 132 - 1454 1575 440 aa, chain - ## HITS:1 COG:CAC0278 KEGG:ns NR:ns ## COG: CAC0278 COG0527 # Protein_GI_number: 15893570 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Clostridium acetobutylicum # 4 435 5 436 437 458 54.0 1e-129 MNKIVKFGGSSLADATQFKKVANIIKSDKSRRFVVPSAPGKRFKDDIKVTDLLYQAYNAK DNKTFEETFKLIKARYQDIIDELKLNCSLEKEFEIIKKAFQDKIGIDYAASRGEYLNGIV LANYLGFEFIDASEVIFIDEHGNYDQEMTDPVLSKRLSNVKNAVIPGFYGSANDGSKTIK TFSRGGSDVTGSIVARNGHVDLYENWTDVSGFLIADPRIVKDPDTIGTITYKELRELSYM GASVLHENSIFPIRNEGIPIQIKNTNRPEDEGTLIVETTCHKPDHVITGIAGKKGFATIM IEKDMMNSEIGFGRKVLQVLEDNDLSFEHTPSGIDTMTIVVETESFIDKEQEILAGIHRA VQPDSIELESDLALIAIVGRGMKDGRGTAAKIFTALAQENINIKMIDQGSSELNIIVGVK NIHFKNAIRAIYKMFITEEK >gi|223714151|gb|ACDT01000064.1| GENE 2 1658 - 3136 1773 492 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 489 1 472 473 603 60.0 1e-172 MSFPKGFLWGGATAANQFEGAYNEDGKGLSTADVMTNGTHQEARRITYTMPDGSKHSQAV MPFKSLPDGAVLECHEGEYYPSHQAIDFYHHYQEDIKLFAEMGFNCFRLSINWGRIFPNG DDETPNEKGLEFYDRVFDECLKYGIEPVVTISHYETPLNLANKYGGWLDRRLVGFFEKYC ETIFKRYKDKVKYWMTFNEINIMDFMPPFAGGVLKNDPQSRAQAVYHQFIASAKVVIMGH KINPDFKIGMMTAYGSTYALTCNPDDELKLMLSDQQRHFFSDVQCRGFYPAYKLKEFERD GVKLIKEPGDDEILKEGTVDYIGFSYYSSSVVTTDESKKVTDGNMSTSVLNPYLKASDWG 
WQIDPTGLRLALNRLQERYNLPLFIVENGLGAIDKVEADGSINDDYRIDYLAKHIAAMKD AVEYDGVNLMGFTPWGCIDLVSAGTGEMRKRYGFIYVDKDDDGKGPLTRMKKKSFYWYQK VIKSNGEDLSNQ >gi|223714151|gb|ACDT01000064.1| GENE 3 3642 - 4862 1490 406 aa, chain + ## HITS:1 COG:YPO2511_2 KEGG:ns NR:ns ## COG: YPO2511_2 COG1434 # Protein_GI_number: 16122732 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Yersinia pestis # 100 218 20 137 192 90 42.0 6e-18 MKKLGKAILGLALALSMCFSMFIPVYAEGETVDTLVGSLIGYYRDGARTDVLRTLDTLKE VSKDDYDQWQSIIDYWDWIENDMVENIGVAPDGLPNDNTHAFVVLGFALKSDGTMQPELV GRLETALASAQKYPNSYILVTGGVKKNGWTEGDRMHDWLVEHGVADNRILVENKSANTAQ NAEFSFDILYNHPTIKSISMITSQYHLKRGTILYYAESLIRARELGKTPIKLNGNAGWYR EDKTSETLSQKASSLSQIAGVSIPYGEKLPKSKLTGIEVTGKNTYTVGDALDIIVTSTYD TEYSRDVTGLAKIIGFDANKVGKQNVVVVYTETLHHSTSSSEDVTMTAEFEVEVKTAPIT VPVNPQPTPDKPVNAVNTGDNSPLAIMTMVAGLAAAGMFLAKRKEN >gi|223714151|gb|ACDT01000064.1| GENE 4 5041 - 6039 1199 332 aa, chain + ## HITS:1 COG:BH1855 KEGG:ns NR:ns ## COG: BH1855 COG1609 # Protein_GI_number: 15614418 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 7 330 3 326 326 259 42.0 4e-69 MKDKKLTMSDIAKMAGVGKSTVSRYFNGGYIKEETRLKLKKVIDENNYEPSTLAQSLKAK YTKVIGIVVPCLDSITTSRVLMTMDQYLKDHGYTTLIINTNHDEMRELTSIEQLWRMNVD GIILMATAVTMAHQNIAAKLDIPLLFVAQRYGAGVSIINDDYSAGYEVGKYAAYMGHRKI CYIGVSGKDEAVGIYRKDGVINGLRDNGVSSVDLLETDFSLEKAHLIALDYLKKKQPTLF IGATDNIALGCLKAINELKLKMPDDISLIGFGGYETSQFINPSLSTVRFNNEETGIKAGQ TIIDLIEGNVVDNLQLIGYTLIKGQSVKDLNK >gi|223714151|gb|ACDT01000064.1| GENE 5 6183 - 8078 2424 631 aa, chain + ## HITS:1 COG:VCA0653_2 KEGG:ns NR:ns ## COG: VCA0653_2 COG1263 # Protein_GI_number: 15601411 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Vibrio cholerae # 96 473 4 383 386 440 60.0 1e-123 MDYNKIAKEIINCVGGKENITGAMHCATRLRLNLVDESKVDEESLTDIDVVKGTFLAKGQ YQIILGPGLVNLVCDEVTNILGIKTDSTLQVEKVEEKGNLLQRAVQLLSDIFVPIIPAIV 
AGGLLMGINNILTAQGLFFANKSLIEAYPQWADLASMINLFSNAAFTFLPVLIGFSATKK FGGNPYLGAALGMIMVHPDLLNAYSYGKAGVEVPVWNIFGLSIEAVGYQGTVLPVLGVSW ILANIEKRLHKVTPTWLDNLTTPLLATLITGFITFILVGPLLREAGVLLSDGISWMYNTL GLFGGAIFGFFYAPICLTGMHHSFIAVETQLLASIKTTGGSFIFPTASMSNVAQGAAVIA ILLVTKDKKLKSICSASGISALLGITEPAMFGVTLKLKYPFIAAMVGSAVGSAYLAATKT LANALGAAGIPGFLSIDPKNYLNFAIGMVLSIAVSFVLTIILYKRDMAKQPTQDSELPTE AKEVKTNIITAPLAGKVASITEAPDEIFAQKMMGDGVVIFPTENILAAPIDGEITMIFPT KHALGIKSNDGIEILIHVGLDTVKLEGKPFNLFVTEGQKVKQGDKLMEIDFAMVEAAGCP IATPLVITSQNQFEIINLGTCSLNQDIIKLD >gi|223714151|gb|ACDT01000064.1| GENE 6 8090 - 9490 1366 466 aa, chain + ## HITS:1 COG:BH1858 KEGG:ns NR:ns ## COG: BH1858 COG1621 # Protein_GI_number: 15614421 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus halodurans # 7 451 14 467 487 318 37.0 1e-86 MTCLNNEQIKKLEKQASNDEYRLNYHLMPPTGWLNDPNGLCQFEGIYHLYYQYAPDDCFG GDKYWGHYSTKDFITFKNEPVALFPDCSLDAGGAYSGSTIIKDNKMYVYYTGNVKYPGKH DYIHTGRDHNTIMVISDDGINFGEKVCLMKNDDYPDDLTLHVRDPQIIKNGDHYYMILGA RTNQDVGCCLLYRSDDLVNYKLVNRIISNEPFGYMWECPNLVKFDEQMILFCCPQGVETA GYDYENLYQNGYYLINGDLEKDYRLSEFIEFDHGFDYYAPQLFIDEQGRTIIIGWMGMPD VPYTNPTTKYNWQHAFTLPRELTFKNNKVYQLPISETKSLRKAKEIINLSNHEKISCKGN IYELYLPINNRDFTLTLRNDAVIDYHSNLLTFSMGASGYGRDSRHIVIDDLSNMTIYSDT SSLEIFFNDGEYALTTRIYDHSSNLEISIDTEIQCTYYELNSYQIV >gi|223714151|gb|ACDT01000064.1| GENE 7 9622 - 10587 1292 321 aa, chain + ## HITS:1 COG:CAC1453 KEGG:ns NR:ns ## COG: CAC1453 COG1879 # Protein_GI_number: 15894732 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 40 321 43 325 325 281 48.0 2e-75 MEIKNNQKTIVKMMVFLCLCITLMIFGFYSYPDFKATVGQQRFGATYMTMNNPFFEVINN EIKKAVEANGDVLITLDPILDVDKQNEQILDLIDQRVDAIFVNPIDAKKIEIGLKAAKKA KIPVIVVDAPIYDESLVNCSIASNNYDAGVLCAKDMMSKRDHAKIILLEHATAKSAVERI QGFLDTIASNANYQVINRADCDGQIEIAMPTMKQMLKETPDVDVVMALNDRSALGALAAI ESMEIDNVLVYGVDGSPDVKELINTGLIQATAAQSPVKMGQLAYQKAKALLENKKIEKKI 
EVPVELISRDNISNFELTGWQ >gi|223714151|gb|ACDT01000064.1| GENE 8 10587 - 11897 1283 436 aa, chain + ## HITS:1 COG:CAC1454 KEGG:ns NR:ns ## COG: CAC1454 COG4585 # Protein_GI_number: 15894733 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 7 433 5 439 441 296 39.0 4e-80 MQYSEYRSVSSIKNMMIIINFIIILFEASIILFSTKYVCNNLMGRDFLDTLAYLPKNPTK VFIYSIIGFALLVMIMFIRKSENFQVRNGRVICNGLEIILCFWIIYNLYMGYNGIALLVF ADIIFNTKNGRNTMVIIGFILIIFLLSNYDIISNIIPMVSLDSYIQVYDAATKTAILIAK NILESTNLVLFIMFLIVYIANQIRENENISKELSMINEVNKQLKDYAAVTEKIGESNERK RLAREIHDTLGHALTGIAAGIDACIAMIDIDPNVTKQQLLVVSKVVREGISDVRRSLNKL RPGALEEHTLKEAIQKMIKEFSDVSEVEIMLDYQLDKVDFENTKEDIIFRIVQESITNAL RHGRAKKVEINIYQRVSDLEIVIADNGVGCDDLKLGYGLKQMQERAAILNGRLEYYSKEG FTVKVTIPMKEGERYD >gi|223714151|gb|ACDT01000064.1| GENE 9 11890 - 12564 737 224 aa, chain + ## HITS:1 COG:CAC1455 KEGG:ns NR:ns ## COG: CAC1455 COG2197 # Protein_GI_number: 15894734 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Clostridium acetobutylicum # 1 219 1 220 225 279 65.0 3e-75 MIKVLIADDQELIRESLKIVLNTHEDLQVIDTVEDGFGVLDSLKRNIPDVILMDIRMPRM DGVYCTKMVKEAYPDVKIIILTTFDDDDFVFSALKFGASGYLLKGVSMDGLYQAILTVVS GGAMINPDIATKVFRLFSKMANTNYAINVNDDNVRELNKPEWKVIQQVGFGLSNKEIAQK LFLSEGTVRNYLSSILSKLELRDRTQLAIWAVQTGQTTKDLDDD >gi|223714151|gb|ACDT01000064.1| GENE 10 12557 - 13852 1179 431 aa, chain + ## HITS:1 COG:CAC1456 KEGG:ns NR:ns ## COG: CAC1456 COG1653 # Protein_GI_number: 15894735 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 22 429 30 437 439 420 47.0 1e-117 MIKKNKIMIIVLVLVMGIGLIYYHNQKKVLTIGIFAGSNWDVPTFDYYEMIDKIIVRFEK EHPNVEVQYKSGILKEDYSLWLSNQLLIGAEPDLYMILNEDFNSLSALGALKNLDSIITD DQDFNSDEFYQSTYQAGQYQGKQYALPYEVNPTLMFVNKTLLQQEGISLPKDNWSLEEFY 
NVCRQLTKDSNNDGIIDQYGCYNYEWINAIYASGIELFNETGTTCDFNNHEVKSAIQFIE KLNALNRGFNITSNDFDQGKVAFAPMQFSQYRTYKPYPYRVSKYSTFEWDVIEMPQSSEQ HLTQLSSLMMGMSMRTSAPELAWQLLKEFTYSQASQQDIYEYSQGVSPLKEVTESKQVLK LLEEDTLGDSKIDMRVLSQIMEYPTTNTRFRKYEAAMKIADNQINQALQNNLDLDTSLTA LQKEITNYLNE >gi|223714151|gb|ACDT01000064.1| GENE 11 13994 - 14419 624 141 aa, chain + ## HITS:1 COG:CAC1457 KEGG:ns NR:ns ## COG: CAC1457 COG2893 # Protein_GI_number: 15894736 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: Clostridium acetobutylicum # 1 141 1 142 142 119 48.0 2e-27 MKYVILVSHGNFAEGLLDSLKMLTGNHDDVIAIGLKDGITADQFAIDFTEMITKIPKESE IIVLADIIGGSPLSTAMNVLAQSGLIDKTTVLGGMNLPLALTTILMKDVLDTDALITNVT SEATGAIKQFVLETNDEDDDI >gi|223714151|gb|ACDT01000064.1| GENE 12 14432 - 14923 710 163 aa, chain + ## HITS:1 COG:CAC1458 KEGG:ns NR:ns ## COG: CAC1458 COG3444 # Protein_GI_number: 15894737 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Clostridium acetobutylicum # 1 163 1 163 164 237 68.0 8e-63 MSVSFVRIDDRMIHGQTCTRWAREYPCDGLIAVNDKAATNDVLKAAYKAASGKKTFVWTL ADFAKKSQKVLDSDTKYFLIAKNPIDMCTILVDQGFDPGVKTVIVGPCNDREGATKLGNN QSITQAEADAFERIHKAGYTIKFALLPDVSIGTWDDFKSKFGY >gi|223714151|gb|ACDT01000064.1| GENE 13 14951 - 15787 1177 278 aa, chain + ## HITS:1 COG:CAC1459 KEGG:ns NR:ns ## COG: CAC1459 COG3715 # Protein_GI_number: 15894738 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Clostridium acetobutylicum # 1 278 1 281 281 342 69.0 7e-94 MEISWIQAALLGLFACLSSMPGLGGTSFGNYTLGRPLVGGLVCGIILGDIQTGILVGIAM QVVYIALVTPGGTVSADVRAVSYIGIPLAMIAIKSYGLDAASSDGAALATSFGTMVGTLG TVLFYGTATINLAWQHIGWRAVENGDYKKLYIVDMVLPWISHILCSFIPAMIMCKMGAPM VDLIKTYLPMDGVAMKTLFTVGSLLPCVGIAILLKQIVTKAIDFIPFFFGFTLAAALGIN 
LVSATVVAGMFALINYRIKMLTLGKVAVAVVDDDEEDI >gi|223714151|gb|ACDT01000064.1| GENE 14 15789 - 16625 883 278 aa, chain + ## HITS:1 COG:CAC1460 KEGG:ns NR:ns ## COG: CAC1460 COG3716 # Protein_GI_number: 15894739 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Clostridium acetobutylicum # 2 278 3 277 277 362 64.0 1e-100 MEKKLSKKTLTKSFHNWYYGNLTCFSQEHMQTFGYLCSMLPIVEELYDKKDARAKAIKTY TAFFNTEPQVGSVIVGMTAGLEEARANGAEDVDDETINGLRAGLMGPLAGIGDSLVVGTI IPILLGVAMGMSTGGSPLGAIFYIIVWNLFAYFGMKLLYFKGYELGGKAVDFLVGPQGEA LRESITMLGGIVIGAVAATWVSVKTSFSMTAAGAKEPFLDLQKTLDGVYPGFLTALFVLL CWYLLSKKKMSPIKVMLLLVVIALIGVLVGFFNPGLTY >gi|223714151|gb|ACDT01000064.1| GENE 15 16641 - 16916 270 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755543|ref|ZP_02427670.1| ## NR: gi|167755543|ref|ZP_02427670.1| hypothetical protein CLORAM_01057 [Clostridium ramosum DSM 1402] # 1 91 1 91 91 81 100.0 2e-14 MDQRVKKAYFEEIEYQTKQINRLKIWLKNLLIISSLIIAIIFFVDKISMIITVISYIALI IIIISLIIINLAIRNGSMNVNNIIIKMEHIK >gi|223714151|gb|ACDT01000064.1| GENE 16 17103 - 18893 231 596 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 351 567 279 502 563 93 29 1e-18 MLLINKTLIRMSEGIRGWIAIITGLKILTLVGTVMFAKTISSFLSDLYQPKMSQEQLLAA IISALIASCLILIADLLTGEAEYRCGAKARINLRKEIFEKMLQLDVGKIERIGASNTISS VGDGVELMQVYYTKYLPSLLYCFFAPVYLFFQLKDTSLLVATILLIISLSLPLVNNYFRK MIDQLKKEYWTSFQDLTSYYLESLKSLVTLKLFNQDRRRHDNLRRKADTFNQSTMSIMKM NFSSFLVSDGLIYLGVILAVVIGCSQLAKGTIDLAQAVLLLMLSYSFFSSIRQLMSATHD ALSGIAATIKVDEILNLDASRIHKSEAETAHEIAVKYSQDTLAYLQDYSGIIVKDIVFNY HQRRQVLKQLSIEIPKGKTTAIVGKSGCGKSTLASMLMRFIDPDQGYLFLDGQSYFSLTP EEVRKNIVMVPQRVSIFSGSLEENLKLAAPNASTSELLNVLKQVHLYQWLQSLPEGLSTD LGDSGAKLSGGQRQKIGIARALLSGAPYLIFDEATSSVDRDSEDEIWQCINELGLTRTLI IISHRLSTIQRADNIIVLNQGRLVEQGNHQALLKYHGEYYQLVKEQEALEAAGKVG >gi|223714151|gb|ACDT01000064.1| GENE 17 18897 - 20585 220 562 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 331 561 270 504 563 89 25 2e-17 MEHKQSLLTLTKRLLKIAATLKKFFVISTIVSIIGNIAQMGLMGFGAAFILSVAGKLKYA NSVTYCILMIISGILIVTCRYLEGYFSHAGSYELLAKMRVDMFGTLRKLAPGSLIGRNNG DIMAIAIADIESIEFFFAHTIGPLFTVILLPLVTLIIAGSIDMLFVYALLPIYLIISVII PILAIKLGRNIGIGYRQKLGELKIFLLDSVYGLSEIQIFDYGKRRNEELEAVNRNINRSI HQLAYHRQLVVSTPTFFIYLARIAVIAVASYLALKGNIDTTGIIILSFLVSASFSSTQSL TTVVSSLLETYAAAQRYFDLEDMVPVVNEIVEPKELKNIDKIEFINVSFSYPEINRKIIE NMNLTINFKDKIGLVGESGIGKSTLIRLLLRFYDVTSGQILINGIDIKEYSLQDLRQRIG TLEQDTFLFNDSIAANIALGKPKATKEEIVKAAQMAGIHELIISLPEQYDTMMGELQNRL SGGEKQRIGIARVLLVDPDFLVMDEPTSSLDVINEKGLLKTLAEQFENKTWLIVSHRPSS LTGCDRVIKLENKQIYELGGKA >gi|223714151|gb|ACDT01000064.1| GENE 18 20645 - 21289 688 214 aa, chain + ## HITS:1 COG:SPy1115 KEGG:ns NR:ns ## COG: SPy1115 COG2755 # Protein_GI_number: 15675096 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Streptococcus pyogenes M1 GAS # 9 209 8 201 204 80 31.0 2e-15 MKYLTNHYQLREFTLIRMLEIINQNKNITPGGTVFYGDSITEYCDLDKYYPEIETKYNCG IAGITSGMLLNFIDEGVIKYQPKNVVLMIGTNDLGNTVMESPRDIALNVKETIEIIHYNC LDTKIYLVAPLPCLEMLHGYKATKQGLRSNDTLKMVFKEYKNIIPYDYVTLINPFMALCN 
KKGQPVENYYLDGLHINDDGYRAYTGVIKEKLID >gi|223714151|gb|ACDT01000064.1| GENE 19 21378 - 23735 2317 785 aa, chain + ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 120 537 335 754 776 192 28.0 2e-48 MDTIIIVIKDSDDCLRLKEFLSSDYHIVICSELENIIDNLNDNRIVAIIMDLDDATEVNV AKYSSYKNIDNIQIPLIVGGNFTGNLLNLLNLGVSEVIVKPYNEAIIKLRINNAIRRAED NEYRIMADFDTLTKIYNRTAFYREAKKLITQNQDINYDLACFDIDRFKIINDIYGSSVGD ELLIYIANTGVKRMKKLGGLIGRLSGDLFAIVVPHQANIEDLLLQQMNEDIGNFDLGIKV VVSFGYYNIDDLSLPVNNMCDRAMMAIKKVKDKYNTSYATYDDELRDQVLEEQRIIDEMD RAFENDEFIPYYQPKFNMVTRTYIGYEALMRWQSPTRGMVLPSTFIPVFEKNGFISKADR VIWRKVCQDMVIARQKGHVLLPVSVNVSRIELYDPELGNTILKLLAEYELSIELFQLEIT ETAYMQDSNQMIEAVVKLKELGFTILMDDFGSGFSSLNILKELPFDIIKIDLAFLEHFDK NNKAEKILKSVIQMAKRLNMEVIAEGVETKRQEDFLVELGCNRAQGYRFAKPVSASKIAY MVENGVIGVGDTKDEDSAIVNIDDILTTVYQQGEVDWYRQALLELNAQVFEYDFKRDNMI IFDTPTKENGSNLAKMEIPNYLYNVSIGRIVHSKDVKRYQTVFNGNQEFKVIYRRFMINH GSGYTWVKDTGRIIYDEDNKPKICIGVSRVISDEKLNEQMMNVLSVMEQSTDFDSSINHI LAEIGDDFLLDRITLLLEEGNNYRSVYAWQDESIELEISDTFPIVRTELQEIVTYFKEHQ IIVVK >gi|223714151|gb|ACDT01000064.1| GENE 20 23722 - 24336 336 204 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755547|ref|ZP_02427674.1| ## NR: gi|167755547|ref|ZP_02427674.1| hypothetical protein CLORAM_01061 [Clostridium ramosum DSM 1402] # 1 204 782 985 985 386 98.0 1e-106 MLLNKNEGNQFSKQINKNFFDNQVKTLVIISLNDDIGEFIGCIVFTMINDYRQWRDDELW ALQELVKGINVYLNKEKIFNRLDSILLTYNRVMEKLNDGILMFTYESQPQLLFINEAYRK ILNIDDINVTNFLSTYYRSVDLDVQKQIHQAIEKCVETGEEQTLYHPLKTCEKQIIKAKT IYNLSPVKENGKLVIISITTKQCT >gi|223714151|gb|ACDT01000064.1| GENE 21 24386 - 24592 363 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755548|ref|ZP_02427675.1| ## NR: gi|167755548|ref|ZP_02427675.1| hypothetical protein CLORAM_01062 [Clostridium ramosum DSM 1402] # 1 63 1 63 221 110 100.0 3e-23 
MAEILIIEDEVKLRQELEVYLNNNGYLTKVIESFENTLEQMNQFDGDLILLDINLPNVNG QFLLQRIS Prediction of potential genes in microbial genomes Time: Thu May 26 09:50:44 2011 Seq name: gi|223714150|gb|ACDT01000065.1| Coprobacillus sp. D7 cont1.65, whole genome shotgun sequence Length of sequence - 53466 bp Number of predicted genes - 47, with homology - 47 Number of transcription units - 19, operones - 14 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 + CDS 42 - 185 84 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 221 - 255 -0.8 + Prom 225 - 284 2.7 2 1 Op 2 4/0.250 + CDS 340 - 1188 923 ## COG0642 Signal transduction histidine kinase 3 1 Op 3 36/0.000 + CDS 1262 - 2029 961 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component 4 1 Op 4 . + CDS 2016 - 4016 1822 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 5 2 Tu 1 . - CDS 4036 - 5520 1196 ## gi|167755552|ref|ZP_02427679.1| hypothetical protein CLORAM_01066 6 3 Tu 1 . + CDS 5637 - 6254 777 ## COG1357 Uncharacterized low-complexity proteins + Term 6280 - 6308 -1.0 + Prom 6295 - 6354 8.0 7 4 Op 1 . + CDS 6375 - 8198 2245 ## COG1217 Predicted membrane GTPase involved in stress response + Term 8200 - 8233 0.8 + Prom 8223 - 8282 6.1 8 4 Op 2 . + CDS 8392 - 8610 222 ## gi|167755555|ref|ZP_02427682.1| hypothetical protein CLORAM_01069 + Term 8636 - 8691 10.8 + Prom 8670 - 8729 8.3 9 5 Op 1 . + CDS 8752 - 10176 1441 ## gi|237734307|ref|ZP_04564788.1| predicted protein 10 5 Op 2 . + CDS 10181 - 11224 1022 ## COG0500 SAM-dependent methyltransferases + Term 11227 - 11275 6.1 + Prom 12033 - 12092 7.7 11 6 Op 1 . + CDS 12339 - 13058 838 ## Pjdr2_2685 hypothetical protein 12 6 Op 2 . + CDS 13059 - 13931 523 ## DSY3852 hypothetical protein 13 6 Op 3 . + CDS 13924 - 15441 210 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 14 6 Op 4 . 
+ CDS 15438 - 16127 528 ## DSY3851 hypothetical protein
15 6 Op 5 . + CDS 16124 - 17245 1430 ## CTC00776 putative surface/cell-adhesion protein
16 7 Tu 1 . - CDS 17874 - 18716 735 ## COG2207 AraC-type DNA-binding domain-containing proteins
    - Prom 18774 - 18833 11.0
    + Prom 18733 - 18792 15.4
17 8 Op 1 35/0.000 + CDS 18900 - 20249 1574 ## COG1653 ABC-type sugar transport system, periplasmic component
    + Term 20267 - 20299 2.2
18 8 Op 2 38/0.000 + CDS 20311 - 21207 726 ## COG1175 ABC-type sugar transport systems, permease components
19 8 Op 3 . + CDS 21222 - 22097 1035 ## COG0395 ABC-type sugar transport system, permease component
20 8 Op 4 . + CDS 22109 - 22276 244 ## gi|237734317|ref|ZP_04564798.1| predicted protein
21 8 Op 5 . + CDS 22289 - 23407 1441 ## COG3839 ABC-type sugar transport systems, ATPase components
22 8 Op 6 . + CDS 23410 - 25563 2320 ## CPR_0537 hypothetical protein
    + Term 25572 - 25604 1.7
23 9 Op 1 . + CDS 25614 - 26537 1141 ## EUBREC_0139 hypothetical protein
24 9 Op 2 . + CDS 26544 - 27635 1107 ## EUBREC_0139 hypothetical protein
    + Prom 27686 - 27745 8.7
25 10 Op 1 . + CDS 27772 - 28041 527 ## COG1925 Phosphotransferase system, HPr-related proteins
    + Prom 28046 - 28105 7.0
26 10 Op 2 . + CDS 28125 - 28604 590 ## COG2606 Uncharacterized conserved protein
27 10 Op 3 . + CDS 28607 - 29104 571 ## COG2109 ATP:corrinoid adenosyltransferase
28 10 Op 4 . + CDS 29182 - 30876 2225 ## COG1109 Phosphomannomutase
    + Term 30900 - 30941 1.9
    + Prom 31000 - 31059 10.4
29 11 Op 1 . + CDS 31187 - 32548 1646 ## COG1455 Phosphotransferase system cellobiose-specific component IIC
30 11 Op 2 1/0.500 + CDS 32568 - 34322 1727 ## COG1874 Beta-galactosidase
    + Term 34347 - 34393 4.1
    + Prom 34402 - 34461 5.3
31 12 Tu 1 . + CDS 34494 - 35498 1014 ## COG1609 Transcriptional regulators
    + Term 35530 - 35580 3.2
    + Prom 35527 - 35586 10.6
32 13 Op 1 4/0.250 + CDS 35707 - 37569 1743 ## COG0296 1,4-alpha-glucan branching enzyme
33 13 Op 2 . + CDS 37594 - 39990 2449 ## COG0058 Glucan phosphorylase
    + Prom 40079 - 40138 4.9
34 14 Op 1 . + CDS 40158 - 42242 2047 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases
35 14 Op 2 7/0.000 + CDS 42253 - 42807 550 ## COG0193 Peptidyl-tRNA hydrolase
36 14 Op 3 . + CDS 42816 - 46250 3618 ## COG1197 Transcription-repair coupling factor (superfamily II helicase)
    + Term 46274 - 46310 5.2
    + Prom 46262 - 46321 5.9
37 15 Op 1 . + CDS 46392 - 46970 524 ## Cbei_3144 TetR family transcriptional regulator
38 15 Op 2 . + CDS 47028 - 47633 562 ## COG0500 SAM-dependent methyltransferases
39 15 Op 3 . + CDS 47633 - 47995 342 ## COG1321 Mn-dependent transcriptional regulator
    + Term 48083 - 48123 6.5
40 16 Op 1 . - CDS 48000 - 48464 335 ## Bsel_1376 dUTPase
41 16 Op 2 . - CDS 48519 - 49118 762 ## COG1739 Uncharacterized conserved protein
    - Prom 49188 - 49247 6.0
    + Prom 49133 - 49192 6.9
42 17 Op 1 . + CDS 49215 - 50096 1053 ## COG1307 Uncharacterized protein conserved in bacteria
43 17 Op 2 . + CDS 50084 - 50530 536 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis
    + Term 50542 - 50577 4.1
44 18 Tu 1 . - CDS 50612 - 51301 500 ## COG1737 Transcriptional regulators
    - Prom 51322 - 51381 7.6
    + Prom 51165 - 51224 5.9
45 19 Op 1 . + CDS 51352 - 51963 685 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase
46 19 Op 2 . + CDS 51979 - 53256 1288 ## COG1455 Phosphotransferase system cellobiose-specific component IIC
    + Term 53261 - 53287 -1.0
47 19 Op 3 .
+ CDS 53293 - 53464 62 ## COG2509 Uncharacterized FAD-dependent dehydrogenases Predicted protein(s) >gi|223714150|gb|ACDT01000065.1| GENE 1 42 - 185 84 47 aa, chain + ## HITS:1 COG:BS_yvcP KEGG:ns NR:ns ## COG: BS_yvcP COG0745 # Protein_GI_number: 16080525 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 4 44 185 225 237 62 73.0 1e-10 MAYLWDSDMFVDDNTLTVNINRLRKKLEDIGLNDVIHTRRGQGYIIL >gi|223714150|gb|ACDT01000065.1| GENE 2 340 - 1188 923 282 aa, chain + ## HITS:1 COG:lin1852 KEGG:ns NR:ns ## COG: lin1852 COG0642 # Protein_GI_number: 16800919 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 63 273 118 334 346 165 42.0 7e-41 MYDFFRRRNFYNNLIGILDHLDEKYLITELIGNPHFLDGKIMVNCIYEIDKSMKEHLNDA VYRQSELKEYIEMWCHEVKTPVATSQMIIENNLNPVNESIKEELIKIDNYVEQVLFYARS ENVEKDYLIKDVKLSELVNSVIKRNKKDLINKHVKIELENLEIDVASDKKWLEFILNQII NNAIKYVDETPIIKLFAKQYKDRVELMIKDNGIGILENELNRVFDKGFTGTTGRTKQKST GIGLYLCKKLCYRLNHEIKIDSNSHGTVVTLVFPYSSHITLQ >gi|223714150|gb|ACDT01000065.1| GENE 3 1262 - 2029 961 255 aa, chain + ## HITS:1 COG:lin2219 KEGG:ns NR:ns ## COG: lin2219 COG1136 # Protein_GI_number: 16801284 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Listeria innocua # 1 255 1 255 255 272 54.0 4e-73 MEKILRIEKIEKYYGNKNNITKAVDNISFDVNGGEFVAIMGASGSGKSTLLNCISTIDKV TSGHIYLKDADITKLKGYKLTKFRRDSLGFIFQDFNLLDTLTSFENIALALTIQKSDYKK VDEKVNKVAAALGIKEILSKYPYELSGGQKQRVACARAIVTNPDLILADEPTGALDSKSA RMLLDSLNALNEKLNATILMVTHDSFTASYADRVIFIKDGKIFNEIVKGDNSRKMFFNSI MDVQTLLGGELNEVI >gi|223714150|gb|ACDT01000065.1| GENE 4 2016 - 4016 1822 666 aa, chain + ## HITS:1 COG:lin2220 KEGG:ns NR:ns ## COG: lin2220 COG0577 # Protein_GI_number: 16801285 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide 
transport system, permease component # Organism: Listeria innocua # 1 649 1 631 646 150 23.0 6e-36 MKLFSLALRNIKKSIKDYSIYFITLVIAVAIFYIFNSIDSQTAMMSISKSTMEIVQSLVN VLSYISVFVSVVLGFLIVYANNFLIKRRKKEIGIYMTLGMSKLKVSTILVLETVIVGVIS LGIGLLLGIGLSQLLSIFTAKLFEADMTKFTFVFSSGALLSAIINFGFIYIIVMIFNIIT LNRFKLIDLLYANRKNEKVKIKNGWLIFIIFIISIVLIGYAYKLLYDGALYVVGSDFLIM IILGTIGTFLFFLSVSGFLLKVVQMRKKTYYNGLNMFVLKQVNNKINTHVISTTVISLML LLTIGILSSSLAMASAMNTTYEGNSPSDVTVFSSNTEIIERFKQEPVYQQIIKEDYQFNY LNLQGLKQGIFIDSKDNAYTNIEAVKNQPMRVIKESDFNKIMELNGKEDQKIQLANNQYE LVATYPVALEYYNKFLAGDSTIKIGDTTLQSNTDQAVELAITNDSGNEGFVVVNDQIANK YVEEQESYLTYLVADYQGDKEECEEQFQDLLKSFNEQLRTESSPTVISFTRIGLGQSGIG TSALFTFVGLYLGIVFALASGTVLAIEQLSESSDNKERYRILGQLGASKPMVNRSLLVQI GITFMFPLIVALIHSFVALKEINYIISLMVSINIADNILVTTIFIVIVYGGYYLATYLAS NRIINE >gi|223714150|gb|ACDT01000065.1| GENE 5 4036 - 5520 1196 494 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755552|ref|ZP_02427679.1| ## NR: gi|167755552|ref|ZP_02427679.1| hypothetical protein CLORAM_01066 [Clostridium ramosum DSM 1402] # 1 494 1 494 494 898 100.0 0 MKLVELITTPECQHIYQRYFKIVATPKPPHQITSKQIYQEIVAYYNGSIQNVLDILCYDE IVFLQNFEKTKSYRKDIPTINTLINKCLLIKNPNNNNYLEIPEDIKINVINAVEQADIDK VIQIDQINELLIGILAIRGIIEPQELVNIYHSYDPSLSRNEVNTHLDTNHYLFQHYFIYQ GETELLLAYEPYRPIIQKIEQLQTLVKKTPAAYTAKQLRTIAKYGFDLSEPAITNLYEAV EQIDNLFVRELFRDSIMLFCQIHGDLNDLIYMINELKITNNQVIEHLEKNLRAAVPLIHS AAFYGLSPYDYYLTINKDSYFSEQESSTFYQLYLSLLEYTNQVFKITDISMKNLDYIEQD DFSQIRTLLFDNPNIIDHYISENPEHLTNYELTIINDFKAGFIDDFLILTNLDDYTLVSN ETGVYAIYGLVNHLKEIYPNSTLPQVCNLALLPYRHKIVFDGLLEDRQSILPHKQIRTYS HDDIIYELPHTLLN >gi|223714150|gb|ACDT01000065.1| GENE 6 5637 - 6254 777 205 aa, chain + ## HITS:1 COG:CAC1657 KEGG:ns NR:ns ## COG: CAC1657 COG1357 # Protein_GI_number: 15894934 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Clostridium acetobutylicum # 39 204 50 215 216 94 33.0 1e-19 MKRREPVIHDLNKCSFHEIEDNYYQCLFNENPQQVIPIKNLTIEECKFIKIDFNMIELVN THITDCIFENCDLSNLEFNKVSIHRCHFINCKMTGLNFIDCSLQDLLFEGVQGRYMNISL 
GNIRCVEFNDSTLDESSFMEVNVKNLVFKQVSFIAGEVFKTSLKGMDFKTTNIEGIRIDD YSLKGIHVDMYQAIALASLLGIIVD >gi|223714150|gb|ACDT01000065.1| GENE 7 6375 - 8198 2245 607 aa, chain + ## HITS:1 COG:BS_ylaG KEGG:ns NR:ns ## COG: BS_ylaG COG1217 # Protein_GI_number: 16078541 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Bacillus subtilis # 3 597 7 600 612 761 62.0 0 MKIRNIAIIAHVDHGKTTLVDQLLKLSGTFRDNEQIAERAMDSNAIERERGITILAKNTA INYKDYRINIMDTPGHADFGGEVERIMNMVDGVLLVVDAYEGTMPQTRFVLKKALAAKVK PIVVINKVDRPVVRIQEVMDEVLELFMELGADDDQLEFPTVYASALQGTSSLDPDLSTQE PSMDCLFDMIIDEIPEPLVDEEGPLQFQPALLDYNDYVGRIGVGRIQRGKIRVNESVTCV RADGSHSQFRIQKLFGFIGLHRIEIEEAGAGDIVAIAGLADIGVGETVCTTGKEEALPLL KVDEPTIQMVFGTNTSPFAGQDGKFVTARQIEERLFKETNRDVSLKIERIPNSEEWIVSG RGELHLSILIENMRREGYELQVSKPKVIIKEIDGVLCEPYEEVNIEAPDDCIGNVIESLG YRRGVLENMVSNEGQTSVTYTIPSRGMIGFMTNFLTMTKGYGIISHSFLEYRPMEGESVG ERSLGVLVSIDNGQTTAYALGGVEDRGVMFIGPGVDVYEGMIVGEHSRDNDLVVNVTKGK QLTNTRSSSKDSTVVLKRPRTFNLEACLDYINDDELVEVTPENIRLRKRYLTEQARKQQN RLKQNNS >gi|223714150|gb|ACDT01000065.1| GENE 8 8392 - 8610 222 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755555|ref|ZP_02427682.1| ## NR: gi|167755555|ref|ZP_02427682.1| hypothetical protein CLORAM_01069 [Clostridium ramosum DSM 1402] # 1 72 1 72 72 105 100.0 8e-22 MKAIVEVQYHGKNVTATDIEKLVKEDVKSQGVKISTVDTLQIYYTPETSSVYYVATTKDG KSVNNEEPLVIE >gi|223714150|gb|ACDT01000065.1| GENE 9 8752 - 10176 1441 474 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734307|ref|ZP_04564788.1| ## NR: gi|237734307|ref|ZP_04564788.1| predicted protein [Mollicutes bacterium D7] # 1 474 1 474 474 795 100.0 0 MKKITAGLLAFLMLTGCSNIDNDQTLFVANNKEKYALMDYEGDKQTEFIYDKYEEVGSSG YIVIKDKKYGYLLRDGEEAIKLGKYDKLESIGNMIVGYDKNEKISILDGEGKELYKEDKK TEISLFGLPVIHQGKEYIVLYNDGEVLKKSKEKIISAYTVDSDYAVINFEKTSSIYNLVN EKHIEGIKIGGNNQLMDHSSKKGYLLYNRKTHEINALDLEGKIIFTTTLELDDLYYDNSK NIIGVKNQTTYLFNKDGETVTSNSYYRNYKNYVVKNKGMIYGPHKFVNDGKEIEVNGIQL DPMASFTKSKIFPVYVRDEGYQYYGFDGKEAIKESYKSAEAFDENNLAVVSKKEDKYYLI 
NNKGKKVSEQYVRIMYLGEKYYAGYTTGSKYEVFDVEGNKVINDYFMDEGTTFVYNDVVY GIFNKSGSSYIYDMNENEVVFSVEGDLEFNEKGYFVTVDGDAYYSLKGEKIYKR >gi|223714150|gb|ACDT01000065.1| GENE 10 10181 - 11224 1022 347 aa, chain + ## HITS:1 COG:FN0736 KEGG:ns NR:ns ## COG: FN0736 COG0500 # Protein_GI_number: 19704071 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 116 344 3 251 251 75 26.0 2e-13 MEYYIYMLISLAVMLTAIKLIFCELEGRTAQKKSILSWYTKLNEPFKYTRTKSIIFMCVI CYAIASVQAMFTTEWFVEMLGFIAVGVICDGISQYIGFYYNKIRFRKKINEALLMKSEIA KAMNETANALVQQSLPTYSSHEIAARYFNDDTHLATISFDGGEYVAAFEHLPPLTYVVEA QYEKAEEKLADRNVKVTKLTNEGKLPFKDERIDVIVNELSNYDKYDLYRVVKPGGYIIVN QMGSDNYKELINIFLPFKLHGRWDREASCQTLSDIGLEIIDSVEEHGYVRFDTLASFIQF MKGITKADITQDRFMNFYSHVLKQIKDKKYFELTTHRFMVVARKKEL >gi|223714150|gb|ACDT01000065.1| GENE 11 12339 - 13058 838 239 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_2685 NR:ns ## KEGG: Pjdr2_2685 # Name: not_defined # Def: hypothetical protein # Organism: Paenibacillus # Pathway: not_defined # 41 237 106 310 324 99 30.0 7e-20 MSKYLKISLVFILVLLTGCGTVTKGNNIVKTKTESTEQQWANEVTATIDAAGNVTVISEN SQTDSATDAGDEHNKGTNSSALDSNQTSNDGANTDAGNKDNSPNNKPADNDNQMINVSIS IDCKTILNNMNDLKDGYEQFIPTNGMILSTEQFKVKKNTTVIDVLKKATSENEIKLVTVS SGFGLYVKGIGQIEEKICGSSSGWMYKVNGNFPEYGASSYRLKDGDKIEWRFTCKQGDV >gi|223714150|gb|ACDT01000065.1| GENE 12 13059 - 13931 523 290 aa, chain + ## HITS:1 COG:no KEGG:DSY3852 NR:ns ## KEGG: DSY3852 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: ABC transporters [PATH:dsy02010] # 17 284 9 285 302 99 27.0 2e-19 MLSKINIIKNKLVIKPFKNSHPVVLFAYFLSVITLTIIQQNYYLIILSIVSVIIIDYYFN YLTFFKDMKYTVVLVVIIMITNPLFVTEGFDIWFQNDYVTITKQALFYGLVFGLLLSCML LWFRIMKTCLTDSHIVYLFGSILPTLGLVISMCLNMISKLKNQYQKIREANINMPSQNKL GYYRNMIVVLVTYAFESSLDMMNSMQARGYGQGKRTSFHLYSFRKDDALKLIVIIGLFMI SLLGFLIRYCSFYYFPLIQEFSLQWQDGLFMLVYIGLMLLPIYLGGKKNV >gi|223714150|gb|ACDT01000065.1| GENE 13 13924 - 15441 
210 505 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 277 505 5 244 305 85 27 6e-16 MYKVKNVCFTYPDNEKVIDNLSFEINKGDFVVICGESGCGKTTLLRLLKSSLQPTGTLEG TIELDEQIKKDIQIGFVFQNPDDQIVMDKVWHEIAFGLENQAIPLKQMKRRVGEIVNYFN LQNIINNDTQSLSGGEKQLVNLASVMVMNPSVILLDEATAQLDPVNRDEFIRILKQLNDD FDITIILVEHQLEGLLDVANRLILLDHGQKVIDDQIEKAVKKMLTEDVFVAALPNYVRVS TLANALCLTVKQAREALKDYHDFDIKEQEVVDTGLLLKIRDLSFAYDSTVLENLDLDVRQ NEILSVVGANGSGKSSFLKCIAGLVKYQGNITKVGVVEKIGYLPQDPTTLFITDKVINEL LVVENNLKTVEIEMENIGISNLRDMHPFDLSGGQKQLLALGKVLLTRPQLLLLDEPTKGI DAVSKDNLASLIRSLSKHMTIIVVSHDLEFVAKISDRVAMIFNGQMESVDSMRNFFSNNL FYTTTINKIVRNNNSEVVLLEDLGL >gi|223714150|gb|ACDT01000065.1| GENE 14 15438 - 16127 528 229 aa, chain + ## HITS:1 COG:no KEGG:DSY3851 NR:ns ## KEGG: DSY3851 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: ABC transporters [PATH:dsy02010] # 3 225 660 885 885 141 32.0 2e-32 MKIKKSIFILMVIVLITILGTVLYKGEKYNLIIIASFALSCLPFYFKYENSKPKIREVVV LATMIALTVVSRTIFMLTPSFKPVTAMVIICGIVFGPASGFMCGSLSATLSDFVFGIGPW TPFQMIAWGIIGYGAGIFSKNLFKNKFLLYGYSAICGLAFSMVMDLWSVLAFETSFNLTR YLAIVLTSLPTTLIYIVSNIIFMFLLSKTMFQILQRVKIKYGITEGKIE >gi|223714150|gb|ACDT01000065.1| GENE 15 16124 - 17245 1430 373 aa, chain + ## HITS:1 COG:no KEGG:CTC00776 NR:ns ## KEGG: CTC00776 # Name: not_defined # Def: putative surface/cell-adhesion protein # Organism: C.tetani # Pathway: not_defined # 154 299 174 332 355 64 30.0 7e-09 MKKLMRFTVALALVCACIVPINAQSSTLAKAQSYWLNNNELSGLDAVLAVESLGLDVEDE KNNFTLNYSFEAPTNSYGDEITYEEMDAGYLAKNIMALAAIKSDPAKLKLKDGSTINLID LLKSKIDENGNVDYGSANPESSTAYTMFALAIVDESYNLDKLGLNLTEMQLSDGSWGYNG AWGGPDITGWALAALSLCNNTYQASIDKALAYLASIQIDNGGYSGFGVNCNSQACAVWGI LEYDIQGVKNGTYNKGTGNPYDLLLQFMQEDGSFGASLDVDYGNAYATVQAALTVGVYEN GSLIDNIKAQYDEMLNPKNDQPEPTPAPAPVTPSTPADKSITSVKTGDDAVIALTVSSLI ISGGMYLFIRKES >gi|223714150|gb|ACDT01000065.1| GENE 16 17874 - 18716 735 280 aa, chain - ## HITS:1 COG:STM4423 KEGG:ns NR:ns ## COG: STM4423 COG2207 # 
Protein_GI_number: 16767669 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 21 279 1 268 274 107 33.0 2e-23 MSNERIEFIEKPALINNQARLLYVSSSKYEGDWHSQLHSHHFTELFYVTKGCGGFKIGAD EFNVQENDFVIINPNTMHTEVSRDRNPLEYIVIAIDGISFEIPEDSLKDYISCNYANYKN EILFYLNLMVKESQAKNELYDDMCQQLMQVLLINILRLNRINVSLTQGKTIRKEIFMIKN YIDRNYHKEITLDTLAEKTHMNKFYLAHEFKNDVGVSPISYLLQRRIYESKYLLRDTDLS ISQISTILGFSSLSYFAQAFKKSTNFSPLQYRKNHQKYNK >gi|223714150|gb|ACDT01000065.1| GENE 17 18900 - 20249 1574 449 aa, chain + ## HITS:1 COG:mlr7000 KEGG:ns NR:ns ## COG: mlr7000 COG1653 # Protein_GI_number: 13475830 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 42 348 41 332 414 72 26.0 1e-12 MKKILSTALVLSLAAGMLTGCGGGKSDDTNTLKISGLNGGYGTKGWEAVVKAFEKKEGVK VELNLEKNIAETLRPVITSGKDVPDLIYLSVGSEGGLVDTMISDKAIAEISDVLDMKVYG EDVKVKDKIIPGFTNSFATKPYGDGKLYLAPFTYDPCGLFYNAGLFEQKGWTVPTTWDEM WELGEKAKAEGIALFTYPTTGYFDAFFSALLNETVGADAYTKLMEYDTKTWQSAETKKAF EIVGKLAQYTEANTVANANKDNFTKNQQLILDNKALFCPNGTWLPDEMSEAPRAEGFKWG FMALPKVSAEGDSYSSTFSEQVYIPSKAKNADLAKKFITFMYSDEAAKLFAEESGCVLPT TTASSYLPEKVKDKDGNEIDNQKTLFYQIYDNGAKSCTVGFKSVDAIEGVDLTSAEGILY GTVNSVVTGDKTVDEWYNAVIDAVKKYDK >gi|223714150|gb|ACDT01000065.1| GENE 18 20311 - 21207 726 298 aa, chain + ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 13 286 11 281 292 151 35.0 1e-36 MNKKKERRRFIILCLAPAVILFGVFMILPTLNVFWMSTLKWGGLSADKTFVGFNNFVLLM QDMNFIRALQNTILIIAVVTVITMAVAILFASILVREKIKGQNFFRVIFYIPNILSVVVI SSIFSAIYDPSENGMINSVLMLFKPESWETVKFLGDQSIVIYAIIGTMIWQAIGYYMVMY MASMSSIPEHLYEASALDGAGKIKQFFSITLPLIWDNIRTTLTFFIISTINLSFLFVKVM TSGGPDGASETVLSYLYKQAYNNASYGYGMAIAAVVFIFSFILSFIINRATERDTLEL >gi|223714150|gb|ACDT01000065.1| GENE 19 21222 - 22097 1035 
291 aa, chain + ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 15 290 37 299 300 147 30.0 2e-35 MKKLFSSDKLYKIFIYVALITLAVSIIVPVAWVFMASIKENSEFYGNPWSIPAGIYLQNF VNAFEQANMGEYLLNSILTTAMALVILLVVSLPAAYVLARYNFKGRRFFNTLFMAGLFIN VNYIVVPIFLMLLDWDDFIYELLGGNFFLNNIFVLAVVYAATAIPFTVYLLSGYFKTLPK AYEEAALIDGCGYYKTMVRVMIPMAKPSIITVILFNFLAFWNEYIIALTLLPSTSKTLPV GLLNLMQATRGKAEYGMMYAGLVVVMLPTLILYIIVQKKLTQGMTLGGLKD >gi|223714150|gb|ACDT01000065.1| GENE 20 22109 - 22276 244 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734317|ref|ZP_04564798.1| ## NR: gi|237734317|ref|ZP_04564798.1| predicted protein [Mollicutes bacterium D7] # 1 55 1 55 55 75 100.0 1e-12 MDGKTKITIMVVVFIVSMALIFIGQKNVGYTGLATELVGLAGLIGVIYTYNKGYK >gi|223714150|gb|ACDT01000065.1| GENE 21 22289 - 23407 1441 372 aa, chain + ## HITS:1 COG:CAC3237 KEGG:ns NR:ns ## COG: CAC3237 COG3839 # Protein_GI_number: 15896483 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Clostridium acetobutylicum # 1 369 1 369 369 471 63.0 1e-132 MANLSLKHIDKIYDNNVQAVFDFNLEIADKEFIVFVGPSGCGKSTTLRMIAGLEEISAGE LYIDGKLMNDVAPKNRDIAMVFQSYALYPHMSVYDNMAFGLKIAKKPKDEIDQKVRKAAK ALDIEQYLDRKPKALSGGQRQRVALGRAIVREPKVFLMDEPLSNLDAKLRVQMRTEISKL YQQLGTTFIYVTHDQTEAMTMGTRIVVMKDGRIQQVASPAYLYNHPVNKFVAGFIGSPQM NFLNGKINEKDGAIIIELSDVSLKVESTRAKQLREKGYLGKTVIMGIRPEDISVDNDYIN EHQEDCFEGKAEVVEFMGSESYIHMVKDEKPFIVKAPGGTTLTNGVIGRFAYKMDKMHFF DIDSEEAIIEGE >gi|223714150|gb|ACDT01000065.1| GENE 22 23410 - 25563 2320 717 aa, chain + ## HITS:1 COG:no KEGG:CPR_0537 NR:ns ## KEGG: CPR_0537 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 2 717 3 732 734 937 60.0 0 MEKYGRLTLPTDLDVIDQTIELKNKLGADAIRDCDGTDMPQELMNLDAKIYATYYTTRKD 
NEWAEANPDEIQQEYLITDRYTARDTSLTIELMKGFHTQQLKVNDIDDPKRWWEVIDRTT GEVVAVDQWEFNKEVGAVTISTIPYHEYTVSFLAFLIWDPVHMYNFITNDWKDAPHQLTF DVRQPKTKEFVKDKLRKFCEENKHVDVIRFTTFFHQFTLTFDDQKREKFVEWFGYSASVS PYILEQFEKWAGYKWRAEYIVDQGYHNSLFRVPSKQFKDFIEFQQIEVCKLAKELVDIVH GYGKEAMMFLGDHWIGTEPYGKYFETIGLDAVVGSVGNGVTMRMISDIKGVRYTEGRFLP YFFPDVFCEGGDPIKEAKDNWLQARRAILRSPLDRIGYGGYLKLAVEWPGFVDTITQVVD EFRQIHDLTKETKAYTSPFKVAILNCWGSIRRWMNNQVHHAIWYREIYSYVGIIECLSGM PIDVEFINFDDIKKGINPDIKVIINAGTAYTSWSGAENWIDEEVLTTIRKWIDNGGGFIG VGEPTAYQYQGKFFQLSDVLGVDKEVGFTLSHDKYNTVDPNHFIVEDIEGEIDFGEGMAS IYGHGENYQIIRQHKEFAQLVTNTYGAGRSVYLAGLPYSPQNCRLLLRAIYWAAAKEDEM KKYYVDNVNAEVAAFEAVGKIVVINNSLDALDTHLYVEGNLREKLRLAPMEMRWLDI >gi|223714150|gb|ACDT01000065.1| GENE 23 25614 - 26537 1141 307 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0139 NR:ns ## KEGG: EUBREC_0139 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 304 1 304 668 422 65.0 1e-117 MNKPVLVIMAAGMGSRYGGLKQIDPIDEDGHIIIDFSIYDARRAGFETVIFIIKKENEEE FKEVIGDRMEQIMNVKYVYQDLANIPAEFSLPAGRVKPWGTAHAIWSCKDLIDGPFAVIN ADDYYGVTAYKQIYDFLLNHPDGVKYNYAMVGYYIENTLTENGHVARGVCQVNDKQLLVD IHERTQIVRNGAGAKYTEDDGQSWVELPKGTIVSMNLWGYNKSILLEIEKGINSFFEIGL KENPLKCEYFIPAVVSKLLDQNKVEVTVLESQERWYGVTYREDKPIIQMAIKELKDAGVY PKHLWKE >gi|223714150|gb|ACDT01000065.1| GENE 24 26544 - 27635 1107 363 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0139 NR:ns ## KEGG: EUBREC_0139 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 8 363 309 668 668 461 61.0 1e-128 MSEELLNEVVEAFSFEGVLKKKKPYGSGHINDTFLLTFDIERMGMIDIILQRMNTAVFTK PIELMENIANVTRFLRDKIIASEGDPERETLNIIYTLDHKPYYVDSAGGYWRAYKFITCA TSYDQVESAEDFYQSALAFGHFQSLLADYPAATLHETIKGFHDTKARFERFKEVVANDIC NRAKDVQAEIEFVLAREDVANVLTNCNLPIRVTHNDTKLNNVMIDDKTHKGICVIDLDTV MPGLAVNDFGDSIRFGASTGAEDEIDLDKIECDMALFEIYAKGFIKGCDGKLTKEEIKAL PIGAKVMTFECGMRFLTDYLEGDIYFKIHRKNHNLDRCRTQFKLVKDMESKWEMMNKIVE KYM >gi|223714150|gb|ACDT01000065.1| GENE 25 27772 - 28041 527 89 aa, chain + ## HITS:1 COG:SA0934 KEGG:ns NR:ns ## COG: 
SA0934 COG1925 # Protein_GI_number: 15926669 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Staphylococcus aureus N315 # 3 85 2 84 88 87 57.0 5e-18 MAEKLSFVVSDPVGLHARPATILVNQASKFTSNIKLVYNGKEVNLKSIMGVMSLGVPTKA TVEIVAEGDDEKDVIASIAKVIKEQKVAE >gi|223714150|gb|ACDT01000065.1| GENE 26 28125 - 28604 590 159 aa, chain + ## HITS:1 COG:lin0783 KEGG:ns NR:ns ## COG: lin0783 COG2606 # Protein_GI_number: 16799857 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 6 138 4 137 158 127 50.0 8e-30 MAKKIKTNALRLLDQAKIDYEIKEYEYDEDHLSGEHIVAQVEIPAKDIFKTLVLKGNHGF LVCCIPVLEEIDLKKLAKLSNNKSVAMIHMKDLLATTGYIRGGCSPVGMKKKFDTYFDCS INECDKIALSAGKRGYQMIVEAKQLLAYLDAVVGDVIRR >gi|223714150|gb|ACDT01000065.1| GENE 27 28607 - 29104 571 165 aa, chain + ## HITS:1 COG:PAB2289 KEGG:ns NR:ns ## COG: PAB2289 COG2109 # Protein_GI_number: 14520300 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Pyrococcus abyssi # 1 165 9 175 175 104 37.0 6e-23 MIQCYYGNGKGKTTAAVGQALRMAGADKKVLFLQFLKDGDSSEIKMLKKCGIKVLYAKMP QMFIDMHDPEMIKLVSRLEDELFEQIDESYEGIVLDEILDAIALNLLNEGKVYDCLVSLK ETHEVILTGRQPSHKLKPILDYSSEIKKHKHPYDKGIKARKGIEF >gi|223714150|gb|ACDT01000065.1| GENE 28 29182 - 30876 2225 564 aa, chain + ## HITS:1 COG:BS_yhxB KEGG:ns NR:ns ## COG: BS_yhxB COG1109 # Protein_GI_number: 16077996 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Bacillus subtilis # 1 564 1 558 565 529 47.0 1e-150 MNYNEEYQRWVNCENLDPSLKAELASMNEKEKEDAFYMNLEFGTAGMRGILGAGTNRMNI YTIRKANVGFAKYVLGLPEGKERGVAIGYDNRHMSYKFAIESAKVLATYGIKSYIFESLR PTPELSYAVRYLKCAGGIMITASHNPKEYNGYKVYDDTGCQLIPEWGDQVVAYVNEVKDE LAVEVISDEEAYPYITWIGEEVDEAYYQEVMAIEINPGMDKKDFKIVFSPQHGTSNLPVR NCLSRLGYNVIPVLAQCAPDPDFSNTKSPNPEVDCSYDLAINKAKEVDADVVVICDPDGD RLGVVAKHDGEYVLMSGNQSAAVYLEYILSELKKQGKLPANAVMYNTIVTSDLGELVSKS 
YGVEVEKTLTGFKFIGDKIRKYEKTKEKEFIFGYEESYGCVVKDFVRDKDAVQAVVMAAE AGNFYKHQGKDLIDVLNELYAKHGTFKESQIALSKAGAEGAKRIKEIMDNLRKDAPSVIG GYKVLAVEDYQSSQRIENETTTVIDLPKSNVLKYYLEDGSWIAARPSGTEPKCKFYFSIK GNDADDASVKTEVFQKDILSIIGE >gi|223714150|gb|ACDT01000065.1| GENE 29 31187 - 32548 1646 453 aa, chain + ## HITS:1 COG:lin2856 KEGG:ns NR:ns ## COG: lin2856 COG1455 # Protein_GI_number: 16801916 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 5 446 4 443 454 389 49.0 1e-108 MYQRLMDIMSERLLPIATKIGSQKHLVALRDSFIATMPVVMTGSIALLLNAFFVDFPAEF GWTGITDAFQWLISINNLIFNGSLAIVSLVFIFALGYNIAKVYETDKLSGGLVALSSFVI SLGTSITQTFQLEGVSADIGKIINQVTGLNFSDGNLSVTINGLLPGNQINSRGYFTAIVI GFLAVVIYSKFMKKNWTIKLPDSVPPAISVPFTSIIPAFIAMYAVAAITYGFNLVTGQLI IDWIYDILQTPLLGFSQNPLSIILVAFLTQLFWFFGIHGGNVMAPIMEGVFGVALLANLE AFQKGTEIPYLWTSVSYGAFVWYATLGLLIAIFWQSKNKHYREVAKLGIAPVMFNIGEPV MYGLPTVLNPLMFIPFLLAPIVLSAVAYGATALGLVAPVTQNVTWVMPPVLYGFFATGFD WRAIILSLVNLALAVIIYLPFVKLANNPKFEEK >gi|223714150|gb|ACDT01000065.1| GENE 30 32568 - 34322 1727 584 aa, chain + ## HITS:1 COG:SP0060 KEGG:ns NR:ns ## COG: SP0060 COG1874 # Protein_GI_number: 15900005 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Streptococcus pneumoniae TIGR4 # 1 581 1 585 595 589 49.0 1e-168 MGSIEVNKEFFINGNKVKIISGAVHYFRIVPEYWRDTLLDLKAMGCNTVETYVPWNLHEP YQGKYDFSGIKDIETFLKLAEELELFVILRASPYICAEWEMGGLPAWLLKYPRIRLRTND KQYLKCLDQYFSILLPKLSKYQITQNGPIILAQLENEYGSYGEDKEYLLAVYQMMRKYGI EVPLFTADGTWHEALNAGSLLEKKVFPTGNFGSQAKENITVLKKFMESHQITAPLMCMEF WDGWFNRWNQEIIKRDPQEFVNSAQEMLSLGSVNFYMFQGGTNFGWMNGCSARKEHDLPQ ITSYDYDAILTEYGAKTEKYHLLREVITGKKERLPERRQTKNYGQIIKNRSVSLFSTLDC IAACHQSDWPLTMEELDHYYGYVVYQHTFKSYTDDLRMRIIDGRDRAKIYLDDQEIATQY QEEIGDEINLPTHSNDTHDLKILMENMGRVNYGSKLQAETQQKGIRNGVILDIHFTKKWK HYCLNFEHLDLLNWENGYQSGPGFHEYIFEADEVKETFIDLEGFGKGVVFVNGHHCGRFY EAGPTLSLYIPGPFLKKGINQIIIFETEGCYRDKIELIGQPKYL >gi|223714150|gb|ACDT01000065.1| GENE 31 34494 - 35498 
1014 334 aa, chain + ## HITS:1 COG:SP1854 KEGG:ns NR:ns ## COG: SP1854 COG1609 # Protein_GI_number: 15901682 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 1 329 1 330 335 198 35.0 2e-50 MATLKEVAEFANVSIATVSRILNDDNSLTVLPETKEKVIKAAEQLNYIPKSKRKGNHDEA LTVAVIQWYSFEQEMRDTYYLSLIQGVENYLKTNGIIAKRIRKDDMDLNKVLENVQGIIC IGKFEFEVVKELKEYTENIVLLDMDISPVTECCVTLDFDNAMFQVVDYFNRLGHKHIGFL GRGEYEELSLHISTRKSSFIKYCQYFGLKYTELLLENELTSEAGYNLIHNLAKTEELPTA IFAVNDPIAIGAMRALQDQGIRVPEQVSIIGFNNIEAGNFTTPSLTTVYAPSKEMGNLGA MLLHEMILKKKALFPMRVQLPCSLLERKSCKKII >gi|223714150|gb|ACDT01000065.1| GENE 32 35707 - 37569 1743 620 aa, chain + ## HITS:1 COG:BS_glgB KEGG:ns NR:ns ## COG: BS_glgB COG0296 # Protein_GI_number: 16080150 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Bacillus subtilis # 7 614 11 616 627 536 45.0 1e-152 MLDDFLIHVFHKGNTLEAYKVFGAHLETNDHKKGVRFTVYAPNARSVQVVGDFNDWDGQN HYMERYTDGGIWTLFIPGVKEFALYKYRIETQNFATVDRADPYAFYSELRPGTASRVYNI ERFRWADKKWLNSRTKNFDKPMNIYEVNIGSWKMKKDFTAEEDGEFYSYEEMIELLIPYL VENNYTHLELMPLTEFPFDGSWGYQATGYFSITSRYGEPKGLMKFINACHKAGIGVIMDF VPAHYVKDGHGLYKFDGGFVYDYPDINRRYTEWDSVYFDVGREEVRSFLMSSVGFLAEYF HIDGIRFDAVSNLIYWKGNKEAGVNDGALEFMKRMNNHMKGYYPGVMLIAEDSSDFNGVT KAPDIGGLGFDYKWDLGWMNDTLKYMKLDPVYRQGDHNLLTFSMAYFYSENFILPFSHDE VVHSKGTIIDKIWGNNEQKFAQLKTLYTYMMIHPGKKLNFMGNELGEYKEWDEKVALGWN ILKYPIHDSFHKFMIRLNEVYKTYPCMYEQDYGMDGFEWLVVDDRAQSVFAIERKGTDGS SLIAVMNFTGNKHEGYMVPVNQPGSYKEILNSDTDIYTGSNFVNKRAIKAKEGQTLNKEY YIPVNIAPFGSMIFEYKGSK >gi|223714150|gb|ACDT01000065.1| GENE 33 37594 - 39990 2449 798 aa, chain + ## HITS:1 COG:CAC1664 KEGG:ns NR:ns ## COG: CAC1664 COG0058 # Protein_GI_number: 15894941 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Clostridium acetobutylicum # 1 796 1 805 812 827 53.0 0 MFKNKEEFKFEFSRRIIESYGRTVEEAHITERFLVLETMVRDYASVNWATSKAIIRNNYQ KQMHYFSMEFLMGRLLVNNMMNLGIYQIAKDGLEEYGINIHDLEELETDAGLGNGGLGRL 
AACFMDSLASLAYPGHGHTIRYEYGLFKQKIENGYQIELPDQWMQTGFNWEVRKPKHRVP VKFFGKIIFNETTGKYEHVDAEEVYAVPYDVPIVGNDTTNTNTLRLWSAEASDNIPANQD FRQYIQNVRDICQMLYPDDSTPAGKMLRLKQEYFFTCAGLNSIVRAHLRQYPNLNNFHQK NVIQLNDTHPVLAIPELMRILIDEQGYEWEEAWDITTQTFAYTNHTILAEALETWPISLM QSLLPRVYEIIEEIHRRFIGYVKMASDRDEELLNRVMILRDGTVYMARLAIVGSFSVNGV AKLHSDILAERELKDFAQLYPSKFNNKTNGITHRRWLAYCNPELSSLISEYIGDEWIKHP ERLEDLMGHLDDLTLQDRFLAVKKERKQILADYIKEHNGITVDVDSIFDIQVKRLHAYKR QLLNILHVIDLYLRMKENPDFRIEPRTFIFGAKAASGYYFAKKVIKLINSVGDVVNNDSE TNKYLKVVFLENYGVTLAEKIMPAADVSEQISTAGKEASGTGNMKFMMNGAITLGTLDGA NVEIAELVGEENCVIFGMKDHEVKELQMSGAYKAWDYYNNNPRLKRIIDSLMDGTFHENR EEFRVIFEELMNKNDEYFVLADYEAYCQAQSSVRDLYKDRSRWAKVCLTNIAKSGFFSSD RTIQQYVDDIWHLDKVKF >gi|223714150|gb|ACDT01000065.1| GENE 34 40158 - 42242 2047 694 aa, chain + ## HITS:1 COG:BS_amyX KEGG:ns NR:ns ## COG: BS_amyX COG1523 # Protein_GI_number: 16080045 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Bacillus subtilis # 54 662 53 664 718 447 37.0 1e-125 MEKSAIRMIAKMVSVNKIEVQLIRRYYEGNCPRFYLRDLNNNTLIEIEISAKFVEEEYVH YYFDNVEIDFCHVYQIVDAYGLAETLQYTQLIYDAEFLKNNYYDGNDLGNSYHEDYTIFK VWAPTALAVKVAITKDHTTYSYEMKRIGNGVFCSYVAGNFDNCEYVYLVRHHDSYIKALD PYAYGSSSNGKSSYIVDLNKIKIDLNRECLKPLKNKTDAIIYEASVRDFTMYENSLSKYK GKFAGIRETGLLTKHGNSAGLDYLVELGITHIQLLPIYDFATVDENHQEVLYNWGYDPAQ YNVPEGSYCTNPNDGYSRIVECKQMIADLHAKGIRVVMDVVYNHMYDVNASAFERIVPGY YFRKNPDGSLSNGSWCGNDLDSGEHMVRKYIIDMSRRWQSFYGVDGYRFDLMGIIDIETI NRVYQICSAYDASFMVYGEGWNMPTNLPDEQKAMQDNHAKMPKISFFNDEYREVIKGGSS DNVLVNKGFITGNLYETEKARNVITATNRYTSPDQSINYVECHDNATVFDKLYISNVEEG LDGIITRQKNLTIMVLLSQGIPFIHAGQEFYRTKGGIGNTYNSPDNINCINWDFRDIYKD DINEIRKIIQLRKDNKCFRYATRKEIEANVMVENIDFKMLKYTLKQDEGEYREFVIYINP SSFTFDYEETAYQHLYGEITDQQIGARTIVVLAR >gi|223714150|gb|ACDT01000065.1| GENE 35 42253 - 42807 550 184 aa, chain + ## HITS:1 COG:BH0068 KEGG:ns NR:ns ## COG: BH0068 COG0193 # Protein_GI_number: 15612631 # Func_class: J Translation, ribosomal structure and biogenesis # 
Function: Peptidyl-tRNA hydrolase # Organism: Bacillus halodurans # 1 184 1 185 185 204 55.0 5e-53 MKLIVGLGNPGKEYEGTRHNCGFMVVDELANKLNTEINQNKFKGLYTKVKYHGEDVILLK PQTYMNLSGESVIAAMNFFKLDKEDIIVIYDDLDMPVGKLRLRKTGSAGGHNGIKNIIAH LSSQDFKRIRVGIDRHKYMKVVDYVLSRFAKEETEAINQGIDKASDAVLDYLDHDFDYIM NRYN >gi|223714150|gb|ACDT01000065.1| GENE 36 42816 - 46250 3618 1144 aa, chain + ## HITS:1 COG:BS_mfd KEGG:ns NR:ns ## COG: BS_mfd COG1197 # Protein_GI_number: 16077123 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Bacillus subtilis # 1 1135 1 1159 1177 849 42.0 0 MKKIDEILKQNPAFKTLLKGEGNIIVNDLNDEALLVTSAFLTLQKDIIVIKPNQYEANLL YQQISLINEKDSLFFPVDESYRIEALASSPELLGQRIDALYQLTTDQPKILITHGQALVR YLPSRQLFLDNCLNLKTGMQIDIYDLQKLLIKAGYTSAPRVDQPFYFSKRGGVIDVFSIQ YDNPLRIEFFDDEIDNIRFYNQNSQRTIEKVKEVTIIPASDILYDEQEVPAVLSKINDLR DRQIEELDELYQEDYLSKVSIDQENLRNHDTSFTMYGYFNLFNQTASLLDYLDTPLIIQA NNHDINFAYKNYLEENHYYYQELASIGKTIKGLNLFRDLYEVIDRKSVNFKPFAQSDKDV LFNARAIMINNDDEAMIINQIRAYLKLSKVVVALDDDHQLKLMTELFDRHEMAYTLIGIK DEIYPGLNIAVNKIGFGIELVDEKIVIISANELFKTRNIKKPKYFKYKNAKVLKDYQELN IGDYVVHDNHGIGQYLGIKTLEVQGFHKDYLYVAYAGDDTLYIPVEQFKMIRKYSSNEGK VPKINKLGGSQWQKTKAKARSKVDDIADKLIEIYSARINQPGYAFPSDSEIQLEFERSFG YELTVDQLRSVEEIKADMEKSQPMDRLLCGDVGFGKTEVALRAAFKAILGNKQVAFLCPT TILSMQHYKTMIARFKDFPVKIALLNRFTSTKEKKQILSDLKLGNIDLLVGTHRILSKDI VFKDIGLLCIDEEQRFGVKQKEKIKEYRKTIDVLTLTATPIPRTLQMSLMGIRGLSQIET PPKNRQPVQTYVIEKNNVLIKQIIERELARDGQVFYLYNRTSQIANVAYNITLSVPGARV AVGHGQMDKNELEDVMMRFVNKEFNVLVCTTIIETGIDIPNANTIIVEDADKFGLSQLYQ IKGRVGRSNRGAYAYLLYNPTKVLNEEASKRLKAIKEFTELGSGYKIAMRDLAIRGSGDI LGGTQSGFIDSIGFEMYMKILQDAINEKMGKEDVEAEKEIKSVNVKVDGYIPHDYVSSDI EKLELYQRLDNAKTISGVDHLKSEFIDYYGKLPEEVSTLVEKRKLDILASTEIIENLAEV KGKMEITFTKGYSQNVKGDQLFELVNRLFTKPVFRQLGGKIVIVLPKGDQWLERINQLIT TLNS >gi|223714150|gb|ACDT01000065.1| GENE 37 46392 - 46970 524 192 aa, chain + ## HITS:1 COG:no KEGG:Cbei_3144 NR:ns ## KEGG: Cbei_3144 # Name: not_defined # Def: TetR family 
transcriptional regulator # Organism: C.beijerinckii # Pathway: not_defined # 12 190 11 189 192 150 42.0 2e-35 MTSQRFNKTIERKKEIIKAAMQLFSEKGYAQTSMRDIARTMGVSLGLCYRYFDSKQILFN TAIDLYIEECCNSYLAILHDSTITIKDKIDALFTSIGDEHSNMQYYDFFHRVENEELHEQ LSIKLCKYMYPHLLEAVKKAIAAKEIYIENPEALISFIIYGQVGLLSKSNIDHHEVAKLL NQYVNQLMKFNP >gi|223714150|gb|ACDT01000065.1| GENE 38 47028 - 47633 562 201 aa, chain + ## HITS:1 COG:CAC3419 KEGG:ns NR:ns ## COG: CAC3419 COG0500 # Protein_GI_number: 15896660 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 4 197 9 202 207 169 42.0 2e-42 MNKKEQSRQTFDLQASKYDTTFYGKHARKIYPYLLNEIIRCYGEEVLDLGCGTGALMKQV ISEDSHRHLTGIDLSSQMIEKAKHQLKNKATLVVGDSENLPFFDQTFDIVYCNDSFHHYP NPQKAIAEIYRVLKIGGTLIIGDTYLPVIARQVMNYFIRFNKDGDVRIYSKKEMISFLEE LFHEIKWIKVSNSAYMIKGVK >gi|223714150|gb|ACDT01000065.1| GENE 39 47633 - 47995 342 120 aa, chain + ## HITS:1 COG:STM0835 KEGG:ns NR:ns ## COG: STM0835 COG1321 # Protein_GI_number: 16764197 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Salmonella typhimurium LT2 # 9 113 41 146 157 59 33.0 2e-09 MIQATNRKYLRYIYQLLLEGKKVRQIDLALSLGYARSSVSIAIKQLKKAGYIDLIKNNIT LTERGSMLAQESLKSYQQVYRWILALGLTSYEARLYADKLESDFDQKFIEMLLKDKRLNN >gi|223714150|gb|ACDT01000065.1| GENE 40 48000 - 48464 335 154 aa, chain - ## HITS:1 COG:no KEGG:Bsel_1376 NR:ns ## KEGG: Bsel_1376 # Name: not_defined # Def: dUTPase # Organism: B.selenitireducens # Pathway: not_defined # 3 154 2 162 164 79 32.0 4e-14 MNEINKIYQAHMSHELKLVVDPNHAVLNKDKVEAKVLAFLCELGKVGEEAKVFSFWDNSS ANNKEILNYYTDALHMLMSIGFELHVDKLKNYQEINNSNNIANQLIKVYQSALKVNETYS FEAFQNCIDDYFTLGFKIGLDFDTILANYSSQKD >gi|223714150|gb|ACDT01000065.1| GENE 41 48519 - 49118 762 199 aa, chain - ## HITS:1 COG:BH3630 KEGG:ns NR:ns ## COG: BH3630 COG1739 # Protein_GI_number: 15616192 # Func_class: S Function unknown # Function: Uncharacterized conserved protein 
# Organism: Bacillus halodurans # 3 194 7 209 213 137 38.0 1e-32 MKSVREITTYKWIIKKSEFICTLIPCNDETAIDNIIKEYQEKYHDATHNCIAYIVGTQKR ANDNGEPSGTAGLPMLDVLEKNKLTNIIAIVTRYFGGIKLGAGGLTRAYRQSVADALKGA EIVEKFSVPLYKITVDYSYTKKFEHLIKVHHIKCINTEYLDRVSYTCYLENESFLTIIQD LTNNTYTKEYLRHDYIELS >gi|223714150|gb|ACDT01000065.1| GENE 42 49215 - 50096 1053 293 aa, chain + ## HITS:1 COG:SP1557 KEGG:ns NR:ns ## COG: SP1557 COG1307 # Protein_GI_number: 15901400 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 3 283 2 279 282 126 27.0 4e-29 MKKIAVLADSGCQLPIGSLEDQGIYIVPLTITMGNKTYLDLEEISANEVFERMNKTGEMV MTSQPSTGSIQNAVHRIKDAGYDHIIALPIATGLSSTLNGMKLACDMVGVPVTLIDTKGT ASNHRYLIRVAKKLIDDGKSVAEIETILTAMVEDSATVIMAPNLDHLKKGGRITPAVALL GNLLKIVPVMKLNYELGGKIDTLDKVRTVKKANLKIVDHLVNECGLNNKDFIIAIEHVLV DDLAHQMKQAIIDRIGECEITVRELPAVVGAHMGVGGVGYQYIKKYEGLTWNE >gi|223714150|gb|ACDT01000065.1| GENE 43 50084 - 50530 536 148 aa, chain + ## HITS:1 COG:BB0377 KEGG:ns NR:ns ## COG: BB0377 COG1854 # Protein_GI_number: 15594722 # Func_class: T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: Borrelia burgdorferi # 1 137 17 159 173 143 51.0 1e-34 MERITSFSVDHTKLTKGVYVSRIDGELTTFDIRMTTPNLEPALDPRGAHTIEHIGATLLR NGENKDKIIYFGPMGCMTGFYLITKELDLARVIKIVRELFEQIANWQEPIPGAKMEECGN YSFMDLEKAKAAAKHFVTSEWQHEYHYL >gi|223714150|gb|ACDT01000065.1| GENE 44 50612 - 51301 500 229 aa, chain - ## HITS:1 COG:CAC3424 KEGG:ns NR:ns ## COG: CAC3424 COG1737 # Protein_GI_number: 15896665 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 12 224 7 225 235 75 27.0 5e-14 MNIEHLRDKFDLNELELALVKYLEDNQNNLKDITIRQMASANFTSTSAIYRLCNKFGFSG YSDMIYQLSSSNNTQLFSPDDFEQYIPAFIELLNKHRQHNIIVFGMGFSAPIADYLQQRL TLNGFYAMNVVHTEMLDIKYQDNSLFIVISNSGITPRLVEIVDNAFKNNIDIISFIANEN SNLYQNVTLPIIVGSYNSFTHNNLVPNTFFGQVLIVFESLLYTYLSSIE >gi|223714150|gb|ACDT01000065.1| GENE 45 51352 - 51963 685 203 aa, chain 
+ ## HITS:1 COG:BS_yrvN KEGG:ns NR:ns ## COG: BS_yrvN COG2256 # Protein_GI_number: 16079807 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Bacillus subtilis # 15 153 234 368 421 58 30.0 9e-09 MAKTMYDPWSKIITRNGYAGDEVISALQKSIRRSLEEQACMFAYEMYISSPELLEKMWRR LLTISVEDIGMGDPMAAVLVNNLYQMSKHFDYADGDQPMYFIHAIRYLCSCQKDRSSDLL KNICIKSFAMGKLPEIPDVALDKHTVRGKAMGRDSFHFLNEASKVIPQMEVDNDYKERYA KILEEYDPENVSETAFTFNGWQF >gi|223714150|gb|ACDT01000065.1| GENE 46 51979 - 53256 1288 425 aa, chain + ## HITS:1 COG:VC1282 KEGG:ns NR:ns ## COG: VC1282 COG1455 # Protein_GI_number: 15641295 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Vibrio cholerae # 1 423 10 427 446 230 34.0 3e-60 MISKLEKLLGPIANKLAQQRHLQAISSGMMMSLGLIVVGSVFLIIANPPINLDLVDLNTG NIFLKFLISWKQFALANYDVLTLPYNYTMGLVGLISAFGVAYCLAESYKMKASVYGIISM CTFLMVAAPLKDGAITMSYLGADGLFVALIISLISVEISRVVSHRIVFTFPDSVPSAVTN FVNSLMPLALNIIVIYGANLILVALSGKSMPNLIMSFLTPAISGVDNVWMFMGIYLFSNI LWMFGINGSSIVFPIVFALGIANTGLNGELVNLGKDPTAIMNLQMFRYALLGGAGNTLGL VLLMCRSKSKHISSIGRLSIVPGICGINEPIMFGAPIVLNPILSIPFLVMPCISIGLGYL VQKIGLVSMGYIVDPSFTPFFAQGFLSALDIRNVIFMIVLIIISVGVYYPFFKVYEKNTL ANEAE >gi|223714150|gb|ACDT01000065.1| GENE 47 53293 - 53464 62 57 aa, chain + ## HITS:1 COG:BH1470 KEGG:ns NR:ns ## COG: BH1470 COG2509 # Protein_GI_number: 15614033 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Bacillus halodurans # 1 57 1 57 480 83 61.0 9e-17 MKNNYDVVVVGAGPAGIMACYELYLKQPELNVLLIDKGQDVMKRHCPIKEKKIKSCP Prediction of potential genes in microbial genomes Time: Thu May 26 09:52:27 2011 Seq name: gi|223714149|gb|ACDT01000066.1| Coprobacillus sp. 
D7 cont1.66, whole genome shotgun sequence Length of sequence - 57097 bp Number of predicted genes - 51, with homology - 51 Number of transcription units - 25, operones - 12 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 162 128 ## gi|167755594|ref|ZP_02427721.1| hypothetical protein CLORAM_01108 - Term 144 - 186 5.0 2 2 Tu 1 . - CDS 213 - 1772 1370 ## COG0038 Chloride channel protein EriC - Prom 1959 - 2018 4.3 + Prom 1705 - 1764 9.2 3 3 Tu 1 . + CDS 1898 - 2353 490 ## COG1846 Transcriptional regulators + Term 2411 - 2446 4.3 - Term 2394 - 2439 6.1 4 4 Op 1 6/0.000 - CDS 2527 - 3156 207 ## COG1040 Predicted amidophosphoribosyltransferases 5 4 Op 2 . - CDS 3123 - 4226 790 ## COG4098 Superfamily II DNA/RNA helicase required for DNA uptake (late competence protein) - Prom 4348 - 4407 5.1 + Prom 4242 - 4301 8.4 6 5 Tu 1 . + CDS 4497 - 4694 288 ## COG1278 Cold shock proteins + Term 4695 - 4728 1.0 + Prom 4705 - 4764 9.3 7 6 Op 1 6/0.000 + CDS 4794 - 7427 3433 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) 8 6 Op 2 3/0.000 + CDS 7427 - 8524 1027 ## COG1186 Protein chain release factor B 9 6 Op 3 . + CDS 8538 - 9395 885 ## COG1284 Uncharacterized conserved protein 10 7 Tu 1 . - CDS 9440 - 9655 240 ## gi|167755603|ref|ZP_02427730.1| hypothetical protein CLORAM_01117 - Prom 9799 - 9858 8.4 + Prom 9730 - 9789 8.6 11 8 Op 1 28/0.000 + CDS 9842 - 10525 357 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 12 8 Op 2 . + CDS 10525 - 11436 1016 ## COG2177 Cell division protein + Term 11440 - 11464 -1.0 13 8 Op 3 . + CDS 11476 - 12165 828 ## Cbei_2690 GCN5-related N-acetyltransferase 14 8 Op 4 . + CDS 12181 - 13551 1363 ## COG1305 Transglutaminase-like enzymes, putative cysteine proteases 15 8 Op 5 . + CDS 13614 - 13865 175 ## gi|237734356|ref|ZP_04564837.1| predicted protein + Term 13977 - 14027 8.6 - Term 13976 - 14004 -0.1 16 9 Tu 1 . 
- CDS 14007 - 15764 1586 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 15784 - 15843 7.0 + Prom 15712 - 15771 4.5 17 10 Op 1 . + CDS 15882 - 16475 535 ## COG0789 Predicted transcriptional regulators 18 10 Op 2 . + CDS 16462 - 16713 69 ## gi|167755610|ref|ZP_02427737.1| hypothetical protein CLORAM_01124 19 10 Op 3 . + CDS 16786 - 17823 1491 ## COG3804 Uncharacterized conserved protein related to dihydrodipicolinate reductase + Prom 17833 - 17892 8.3 20 11 Tu 1 . + CDS 17929 - 18246 396 ## COG1695 Predicted transcriptional regulators + Term 18434 - 18480 6.1 + Prom 18330 - 18389 8.7 21 12 Tu 1 . + CDS 18620 - 19894 605 ## gi|237734361|ref|ZP_04564842.1| predicted protein + Term 19981 - 20023 3.4 - TRNA 20017 - 20090 63.4 # Trp CCA 0 0 + Prom 20105 - 20164 11.3 22 13 Op 1 . + CDS 20233 - 20367 64 ## gi|167755614|ref|ZP_02427741.1| hypothetical protein CLORAM_01129 23 13 Op 2 . + CDS 20364 - 20645 304 ## gi|237734363|ref|ZP_04564844.1| predicted protein 24 13 Op 3 . + CDS 20609 - 21223 804 ## COG3546 Mn-containing catalase + Term 21231 - 21270 6.3 + Prom 21237 - 21296 3.5 25 14 Tu 1 . + CDS 21417 - 21614 87 ## LCRIS_00926 RNA-directed DNA polymerase + Term 21665 - 21715 -0.8 + Prom 21821 - 21880 10.7 26 15 Op 1 . + CDS 21967 - 23562 1869 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 27 15 Op 2 1/0.250 + CDS 23626 - 24387 957 ## COG0778 Nitroreductase + Term 24462 - 24503 -0.3 + Prom 24429 - 24488 11.8 28 16 Op 1 4/0.000 + CDS 24515 - 24943 501 ## COG1846 Transcriptional regulators 29 16 Op 2 35/0.000 + CDS 24999 - 26732 1868 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 30 16 Op 3 . + CDS 26749 - 28587 185 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Term 28592 - 28624 3.1 - Term 28579 - 28610 2.1 31 17 Tu 1 . - CDS 28615 - 28893 408 ## Elen_2980 hypothetical protein + Prom 29041 - 29100 5.6 32 18 Tu 1 . 
+ CDS 29152 - 29937 762 ## COG0561 Predicted hydrolases of the HAD superfamily + Prom 29955 - 30014 7.5 33 19 Op 1 1/0.250 + CDS 30037 - 30870 898 ## COG0253 Diaminopimelate epimerase 34 19 Op 2 6/0.000 + CDS 30885 - 31589 1088 ## COG2171 Tetrahydrodipicolinate N-succinyltransferase 35 19 Op 3 . + CDS 31600 - 32730 1091 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 36 19 Op 4 . + CDS 32733 - 33920 1327 ## COG4992 Ornithine/acetylornithine aminotransferase + Term 33959 - 33997 5.2 + Prom 33934 - 33993 5.7 37 20 Op 1 7/0.000 + CDS 34037 - 34597 679 ## COG2059 Chromate transport protein ChrA 38 20 Op 2 . + CDS 34594 - 35154 435 ## COG2059 Chromate transport protein ChrA + Term 35155 - 35191 2.5 + Prom 35160 - 35219 9.0 39 21 Op 1 . + CDS 35239 - 36144 1107 ## EUBREC_1535 hypothetical protein 40 21 Op 2 . + CDS 36156 - 37184 942 ## COG1459 Type II secretory pathway, component PulF 41 21 Op 3 . + CDS 37195 - 37494 226 ## gi|237734381|ref|ZP_04564862.1| predicted protein 42 21 Op 4 . + CDS 37487 - 37936 449 ## gi|167755635|ref|ZP_02427762.1| hypothetical protein CLORAM_01150 43 21 Op 5 . + CDS 37946 - 38314 421 ## gi|237734383|ref|ZP_04564864.1| predicted protein 44 21 Op 6 . + CDS 38307 - 38687 520 ## gi|167755637|ref|ZP_02427764.1| hypothetical protein CLORAM_01152 45 21 Op 7 . + CDS 38689 - 39744 1095 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT + Prom 39767 - 39826 8.2 46 22 Op 1 . + CDS 39889 - 41070 1232 ## EUBREC_1848 hypothetical protein 47 22 Op 2 . + CDS 41093 - 53380 13595 ## COG4932 Predicted outer membrane protein + Term 53393 - 53426 1.3 + Prom 53446 - 53505 6.4 48 23 Tu 1 . + CDS 53548 - 55371 2217 ## gi|237734388|ref|ZP_04564869.1| predicted protein + Term 55396 - 55440 3.5 + Prom 55440 - 55499 8.2 49 24 Tu 1 . + CDS 55534 - 56502 903 ## EUBREC_1848 hypothetical protein + Prom 56505 - 56564 7.0 50 25 Op 1 . + CDS 56642 - 56824 195 ## gi|167755642|ref|ZP_02427769.1| hypothetical protein CLORAM_01157 51 25 Op 2 . 
+ CDS 56879 - 57095 255 ## gi|167755643|ref|ZP_02427770.1| hypothetical protein CLORAM_01158 Predicted protein(s) >gi|223714149|gb|ACDT01000066.1| GENE 1 1 - 162 128 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755594|ref|ZP_02427721.1| ## NR: gi|167755594|ref|ZP_02427721.1| hypothetical protein CLORAM_01108 [Clostridium ramosum DSM 1402] # 2 53 428 479 479 106 100.0 4e-22 SKVREGFECEIDDLYVAGDGAGLTRGLAQAGANGIIVARHIIGTIKNRDSKLA >gi|223714149|gb|ACDT01000066.1| GENE 2 213 - 1772 1370 519 aa, chain - ## HITS:1 COG:L113400 KEGG:ns NR:ns ## COG: L113400 COG0038 # Protein_GI_number: 15673646 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Lactococcus lactis # 16 518 14 512 512 352 42.0 7e-97 MKNRLIKTLNIKRFKLVLLLEGLLVGILAGLVIVGYRICLTNGALWLQRILAYCKQSIFT IIIWFVILFVIALVVTKLLDWESLISGSGIPQLEGELAGKIDAKWWRVLIAKFFGGFLAN FSGLALGREGPSIQLGAMVGKGIGKVLKRGKTEERYLLTCGASAGLAAAFHAPLAGVMFA LEEVHKHFSAPLLISVMTSSIAADYLMSSILGMAPVFSFNIHGTLPTQYYWMVIILGIIL GLLGAFYNKALLFVQSLYNHSKHLNNYTKLLIPFILAGILGFTVPQLLGSGDTLVDLLIE GKLTMSLILLLLAGKFLFAITCFGSGAPGGIFFPLLVIGCLIGGAFANISVDYLGLDPIY INNFILLAMAGYFTAIVRAPVTGIILIFEMTGSLNHLLSIAIVTIVAYVVADLLKSKPIY ESLLENLLKKRSLPTPQGVGEKVLLDFMILHNSPLDNHLVKEIQWPNHCLIVSLRRDGQE FIPHGDTYLKASDSIIVMCDKIDESYIYDTLSALTTEQI >gi|223714149|gb|ACDT01000066.1| GENE 3 1898 - 2353 490 151 aa, chain + ## HITS:1 COG:CAC3579 KEGG:ns NR:ns ## COG: CAC3579 COG1846 # Protein_GI_number: 15896813 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 9 114 10 115 154 84 40.0 8e-17 MDKRSDAIETVVDVFNEAMVIQELYLKKSKFKELSMSETHVLDAVDKVEFPSMTNVAKSL SVTMGTLTTAVKKIVEKGFLVKERSSKDQRVYYLKLTDKGHEALAIHEQFHHELADLYKS AIPDDRVDWVFNTLKKIKLDLDNYKKALEEK >gi|223714149|gb|ACDT01000066.1| GENE 4 2527 - 3156 207 209 aa, chain - ## HITS:1 COG:lin2656 KEGG:ns NR:ns ## COG: lin2656 COG1040 # Protein_GI_number: 16801717 # Func_class: R General function prediction only # Function: Predicted 
amidophosphoribosyltransferases # Organism: Listeria innocua # 57 199 73 216 218 62 26.0 5e-10 MSQNNQAGQCLICFNDLNKSPSLYHLYYHATLCFHCLNQFSIYNRTHDYHGYKLTILYYY NDFFKQLLFQYKGQGDYALKDAFLNAYPHFKTKYRRHLIALVPSSQQDDLRRGFNPNEMI VRSFSNHIFTGLYKNSAYKQTSQTDRSQVSQIIKIKDGQRLYNQNVIIFDDVITSGNTIM TCAKVISSYQPKTISIIVMASNQLDKLFK >gi|223714149|gb|ACDT01000066.1| GENE 5 3123 - 4226 790 367 aa, chain - ## HITS:1 COG:BS_comFA KEGG:ns NR:ns ## COG: BS_comFA COG4098 # Protein_GI_number: 16080600 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicase required for DNA uptake (late competence protein) # Organism: Bacillus subtilis # 3 366 60 453 463 206 33.0 5e-53 MICPRCKNQDSQYFYTFKNITYCRKCVKIGSTCFSPSQLPSIIKTVDYQLDYELTPLQNN ISRQLLQRYQQHLNTSLKAVCGAGKTEITYEVIKYALNQGQRVCFTTPRKELVIELAKRL QSQFKNISITTVYGGHSELVDGQFIICTTHQLYRYPQYFDLLILDELDAFPYVNNEVLIG LLQNSIKGNYIYMSATLTNQPDLLMTKRYHGYLLDVPKCYLTSSLVMYLWAIKKIRDFVR KKKPVLVYVPTIQLTNTVARIFKLFKLKSHAISSNTKDIQKFIERLRHHQLDVLVTTTIL ERGITIDNVQVIILYGNNRIYTTATLIQICGRVGRKTAHPSGSISIFTPYKTRAIKECLK TIKQDNA >gi|223714149|gb|ACDT01000066.1| GENE 6 4497 - 4694 288 65 aa, chain + ## HITS:1 COG:BH3610 KEGG:ns NR:ns ## COG: BH3610 COG1278 # Protein_GI_number: 15616172 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Bacillus halodurans # 1 65 1 65 65 83 64.0 8e-17 MQGKVKWFNAEKGFGFIDRGEGKDVFVHYSQITQDGYKTLNEGELVEFELYQSDRGMQAK HVVKI >gi|223714149|gb|ACDT01000066.1| GENE 7 4794 - 7427 3433 877 aa, chain + ## HITS:1 COG:BS_secA KEGG:ns NR:ns ## COG: BS_secA COG0653 # Protein_GI_number: 16080583 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Bacillus subtilis # 68 867 2 790 841 826 54.0 0 MASRKEQIRKELKKKNEKEGLNPEAVTNIVDLQDQTGEDIKVEDFQGIYNGEVIPFNTTP HSTKKSFIGRIKGVFDDERKQLKKLDKMADEVIALESKYAAMSDEELAGQTSIFKEALAN GKTLDDIIIDAYATVREAAYRQLGLKAFKVQLMGGITLHEGDIAEMKTGEGKTLTSIFPV 
YLNALTGNGVHVVTVNDYLAGRDAVNNGKVFNFLGLTVGLNKRELTPEQKQEAHGCDVTY TTNAELGFDYLRDNMVTRLEDKVLVKGLNYALVDEVDSILIDESRTPLIISGGRKNTAAL YLQADRFVKSLKQDKDYEVDIESKTVALTPDGIAKAEKGFKLDNLYDPQHTALVHHINQA LKANYTMTRDVEYMVATEDGTRDIRNAKIMIIDQFTGRVMPGRAYSDGLHQAIEAKEGVP IKEETETRATITYQNFFRLFNKLAGMTGTAKTEEEEFRLIYNMRVIEIPTNRPVIRDDRN DKIYSTRANKFKALCEEVEARNSYGQPILIGTVSVETSEVLSKMLDRRKIRHNVLNAKNH AKEAEIIEKAGQRGAVTIATNMAGRGTDIKLGEGVAEIGGLAVIGSERHESRRIDNQLRG RSGRQGDPGYSVFYVSFEDDLMERFAGERLKSFTDYLEDDQAIENKMVTKAIEGAQKRVE GQNFDSRKHILEYDDVMRQQREIMYKERDDIMSEENLDAIVKGMFNQAIEMTVRQFTKHD GKDDIVDVAGVVDFVAKNYMLLVEVEASNCEALQKDPQKLIETLTDLVFNQYISRFNKEL EPEKKLQYERSILLGVIDYTWINHIDAMTKLRNGIYLRAYAQKDPLAEYTEEAFYMFEQM TSSIADAISRNIVHMGIRPGSEVEQTIPHLKMELTFK >gi|223714149|gb|ACDT01000066.1| GENE 8 7427 - 8524 1027 365 aa, chain + ## HITS:1 COG:BS_prfB KEGG:ns NR:ns ## COG: BS_prfB COG1186 # Protein_GI_number: 16080582 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Bacillus subtilis # 1 365 1 365 366 400 56.0 1e-111 MELYEIKNGLTKAHLLMTEFYQSIDIEAYRREIEGLTVITLQEGFWDDANKAKVTYDKLN KMKKTTDQYDLLETTLTSLDETYELVKNTEDQEFKEILESDYTVFEKELSKFETMMLLSG EHDDLNAIVEIHPGAGGTESQDWAEMLFRMYQRYAANKGWKVEVLDYLDGDVAGIKSVTM LIKGDNVYGHLKAEKGVHRLVRLSPFDSAKRRHTSFASVDVMPEFNNEIEIEIQSTDLKI DTYRASGAGGQHINKTDSAVRITHLPTNIVVTCQSQRSQIQNREQAMVMLKSKLYQLMLE KQASELKELKGEQKEIAWGSQIRSYVLHPYSLVKDNRSGYESNNPKAVLDGDLDGFIYAY LKSAL >gi|223714149|gb|ACDT01000066.1| GENE 9 8538 - 9395 885 285 aa, chain + ## HITS:1 COG:CAC0848 KEGG:ns NR:ns ## COG: CAC0848 COG1284 # Protein_GI_number: 15894135 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 5 283 17 290 292 142 33.0 1e-33 MKKHIYETLIIIVGNFILALGICAFITPVGLITGGASGIGIAVKSLTGINISYTVYAINI VMFVVGYFYFGKKFAAGTLLSTFLYPTFLAILERVPALSTITSDALLSTLYAGLCIGLGL GLVLRVGASTGGMDIPPLIVNKKTGFSVAWLINIFDCAILLFQVIFCPITIEQVLYGITV VIITTIVMDQVMMLGETKVQVTVISPKWQEIRKIVFEDINRGCTLLNVTTGYHQKNQYAV 
MAVVSKRELHLLNDMILAIDPTAFIISNATHSVRGRGFTLPPIDL >gi|223714149|gb|ACDT01000066.1| GENE 10 9440 - 9655 240 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755603|ref|ZP_02427730.1| ## NR: gi|167755603|ref|ZP_02427730.1| hypothetical protein CLORAM_01117 [Clostridium ramosum DSM 1402] # 1 71 1 71 71 122 100.0 9e-27 MKKVCPVCQTTLKTNCFIKDNGISTLSYLELIIKDDDFKKTNYELKSCYCPNCGHVEFFV DLNNPEKNKND >gi|223714149|gb|ACDT01000066.1| GENE 11 9842 - 10525 357 227 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 217 4 220 223 142 35 6e-33 MIKLTNVTKIYKTGVRALNDMNLTIEPGEFVYVIGTTGAGKSTFIKLLYREEKATSGKVE VVGRDVSKIKNRKVPYFRRNIGVVFQNFRLLPKKTVFENVAFALEVIDTPRVEIRRRVRA TLELVGLEDKVNAFPHELSGGQQQRVAIARAIVNKPKVLIADEPTGNLDPETSEEIINLL ERINEQQNTTILVVTHDSKIVQEHKKRTILIENGCVNADTSLGGYDI >gi|223714149|gb|ACDT01000066.1| GENE 12 10525 - 11436 1016 303 aa, chain + ## HITS:1 COG:lin2649 KEGG:ns NR:ns ## COG: lin2649 COG2177 # Protein_GI_number: 16801711 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Listeria innocua # 9 302 4 293 294 217 37.0 2e-56 MNELISCVKNLPKHFKTAVQNIWRNGVMSFSSIFAVTITLVLIGVIGVLALNVQDISSNI EEGVSIYVKLDRYIDEAAEQAVGPQIEAISGVKKITYYTKDQELDKLIETQGDEGAALFE SYRADNPLGGAYEVEVDDAANIAKIAEKIQEIPNVNKTSYGGQSTQDMVKTLKTIQTGGS VFIVGLAIIALFMIANTIKITITARQTEISIMRMVGASNWYIRIPFMLEGMLIGLIGSII PIIVLVYGYGMVYDYANGALMSAMLALKPPMPFIRDFSLVIAAVGAGVGLVGSFVSIRRF LKF >gi|223714149|gb|ACDT01000066.1| GENE 13 11476 - 12165 828 229 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2690 NR:ns ## KEGG: Cbei_2690 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.beijerinckii # Pathway: not_defined # 1 227 1 228 230 95 29.0 1e-18 MKNEILKFLKQDYLNNLDLIYALEHGAKIKYYNQAGIMLKFEDIYMLAFKDEKIAEQLLK QIDKCSMLAIHNDKYFNIINQKWRLNKEIVAYQYGYLKKYVRIINNPIVEIKEIGREYFE FIKENYSTPIEEAYLLERIDANVFVGAFIEKQIVGFAGRHIEGTIGFVEVIKEYRRLGIA 
QALEQYLMQKLIAENEIIYLQVEVDNYPSMRLHEKLGYERSDDIITWYM >gi|223714149|gb|ACDT01000066.1| GENE 14 12181 - 13551 1363 456 aa, chain + ## HITS:1 COG:TM0007 KEGG:ns NR:ns ## COG: TM0007 COG1305 # Protein_GI_number: 15642782 # Func_class: E Amino acid transport and metabolism # Function: Transglutaminase-like enzymes, putative cysteine proteases # Organism: Thermotoga maritima # 9 456 1 438 438 307 38.0 3e-83 MGNYNFDDLQYLQIPLPEELLKLKWHGSFTRMKRIIDAKLANDIPIALKKRLLHEKEIIR RIPQEYPLSYEEALKICHNNFNDFSDEELEKLQDENAVEWIFVEGQPYFRADFFDNILKT RNDYAARLFDKSKVESREANFALLNNAIHEIKTKGQLTYHFRLKTGIKIKASDSNKKIKI HLPLPLEDCQVKNFKLIATMPQYKYLNDGSHPQRTVYFEENIDQVKDFYVEYEYDNEMKY VELKPEDVLESQPNFYLHEQAPHIVFTPYLKSLAKEIVKDETNNLIKAKLIYEYITTHIT YSFVRNYYTIPNIPEYAALGGKGDCGVQALLFITLCRCVGIPARWQAGLYVTPQDIGNHD WARFYIAPYGWLYADCSFGGSAYRNGDQERWQYYFGHLDPFRMPANSEYQYDLYPPKQFI RQDPYDNQTGEAEFEDRGLYLDEYDLHLEVLEIKKV >gi|223714149|gb|ACDT01000066.1| GENE 15 13614 - 13865 175 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734356|ref|ZP_04564837.1| ## NR: gi|237734356|ref|ZP_04564837.1| predicted protein [Mollicutes bacterium D7] # 1 83 1 83 83 164 100.0 1e-39 MIDLYGVTNFLTFNQGFSKYNEKEMTNIYDLLGCWCEDDEVLYREASPVSYLEAKMNLLP ILILHGSKEHVVPFEQSVELYQK >gi|223714149|gb|ACDT01000066.1| GENE 16 14007 - 15764 1586 585 aa, chain - ## HITS:1 COG:CAC3012 KEGG:ns NR:ns ## COG: CAC3012 COG0488 # Protein_GI_number: 15896264 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 2 582 3 628 632 446 43.0 1e-125 MLISANNLSKTNGIKNIVDNVTFSIEETDKVALIGVNGTGKSTLLKVIAGNENYQGDIIR KKDMTISYLPQNPDFDPNNSVIKQVYKLIDASEVNEYEIKAILNKFGITNYEQLIKELSG GQQKRIALAITLLKPCDLLILDEPTNHLDNEMIEYLEKYLIKFNKAILMVTHDRYFLERV TNKIMEIDRSKIYEYTANYSKFLDLKAQREEADLASQRKRKLFLKKELEWVRAGVQARTT KSKERLQRFEQLNNISDIQTINQVELINTASRLGKKTIELKDLSMCYDDHTLFNNFSYLF KRTDRIGILGINGCGKSTLLKIIAKELEPSSGSVTHGDTIKIGYFKQMNDGLDENMRVID YIKQTSNDLKTLEGSFTAKQMCERFLFDSSLQHTYIARLSGGEKRRLYLLNILMQAPNVL 
LFDEPTNDLDISTLAILEDYLDSFNGIVITVSHDRYFLDRICDGLFVFKNQQITYCNGGY SSYIDISEQQNKNKSDGALKYKEQKKLQSAIRLSFKEKQELENMEKVILEFEQQIKIINE QMNEYQSNYNKLSELSNQRDYLNEQLEIKNERWLELLEKQEQSQK >gi|223714149|gb|ACDT01000066.1| GENE 17 15882 - 16475 535 197 aa, chain + ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 3 97 4 98 117 74 43.0 1e-13 MKRYTIGKMAKLNNVTEQTLRLYDKMGLFKPAYVDEHNGYRLYDIGQSARLDIIQYLKNL GMTLKEIKEIFDSKNLELLQEKLEESKLQINNQIIQLNEQKAAIQKTIDSFELFKAAPKV GMITLEFINKRTMLYVDEKVNFYDYDLDVYEELLREFKDDLIKQALPPTIFFNPGTILRK EYVLKRKFCATEIFVFY >gi|223714149|gb|ACDT01000066.1| GENE 18 16462 - 16713 69 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755610|ref|ZP_02427737.1| ## NR: gi|167755610|ref|ZP_02427737.1| hypothetical protein CLORAM_01124 [Clostridium ramosum DSM 1402] # 3 83 196 276 276 153 100.0 3e-36 MFFIDDKKEIPLKKETIPANMYLCIYCDDFTKEKEYAHRLFDEIESKNYTIVGDYLCESL NDILIFNDKQRNMYLRLQVPIKI >gi|223714149|gb|ACDT01000066.1| GENE 19 16786 - 17823 1491 345 aa, chain + ## HITS:1 COG:mll2179 KEGG:ns NR:ns ## COG: mll2179 COG3804 # Protein_GI_number: 13472019 # Func_class: S Function unknown # Function: Uncharacterized conserved protein related to dihydrodipicolinate reductase # Organism: Mesorhizobium loti # 11 332 3 324 329 124 25.0 3e-28 MNKEKIRVVQYGCGKMAKYISRYLYEKGADIVGAIDVNPDVVGKDIGEFAELGFNLGVTI SDDADAVLDECDPDIAVVTLFSFMEDCYPHFEKCVRRGINVVTTCEEAIYPWTTSSKLTN KLDELAKETGCTIVGSGMQDIYWINMIGTVAAGCQRIDKIEGAVSYNVEDYGLALAKAHG AGFSPEQFEKELAHPETLEPAYVWNSNEALCNKMGWTIKSQTQKCVPYFYDTDLYSETLG ETIPEGNCIGMSAVVTTETFQGPIIETQCIGKVYGPDDGDMCDWKIKGEPDVSFYVQKPA TVEHTCATVVNRIPSILLAPPGYMTVEKLDELEYLSYPMHLYCDE >gi|223714149|gb|ACDT01000066.1| GENE 20 17929 - 18246 396 105 aa, chain + ## HITS:1 COG:lin1176 KEGG:ns NR:ns ## COG: lin1176 COG1695 # Protein_GI_number: 16800245 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: 
Listeria innocua # 3 100 6 103 103 69 40.0 1e-12 MARKEFETLTPQMFYILVVLNVPRHGYEIMNEIMRITNDEIRVGAGTLYTLLSRFQNDGY IELVSEIDRKKIYQITKIGKYKLQEEKRKLEMQIKALNEVNQDKN >gi|223714149|gb|ACDT01000066.1| GENE 21 18620 - 19894 605 424 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734361|ref|ZP_04564842.1| ## NR: gi|237734361|ref|ZP_04564842.1| predicted protein [Mollicutes bacterium D7] # 1 424 7 430 430 738 99.0 0 MLNDLLLYEDECITYLEKMAQDGWYLVKVGTTFFKFERRSPRKVKYQMDYNLLTPDYLEM ITLEGYRFVDNYREISFFYNEDINAANLHTDEAARLMALGNIYKYFHVFLIIGCALILLI INSSIWKMNLLVIRYTLGSFFLNTKLFFGHYLYIFIALMFVIDGIALFLMKLSINRQQEN KKSLKKWIKLCFIIEKLIGLILLFILTLVIIDIVVYNPFVLIGLIIVIGGFEIYNYYINK KAYHEINSTLRRGKTAIAMVLVVVFVIAMQNIDFDINIKTIQPLQKGINVESTSTRSILV SNYVNTSYEGEQLIFLESKYECLNNYIANKVFEEIVCGVERDSRIPDETEIDAIVETTGE WSTNDVKYLNYLQAIKKFKYIQTEYADKCYYFDNTFVAIKNKQILLIVKKEDTKIDEILK YYLS >gi|223714149|gb|ACDT01000066.1| GENE 22 20233 - 20367 64 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755614|ref|ZP_02427741.1| ## NR: gi|167755614|ref|ZP_02427741.1| hypothetical protein CLORAM_01129 [Clostridium ramosum DSM 1402] # 1 44 1 44 44 91 100.0 2e-17 MNKLELGISSIKCQEFRKIYPIVKAFHSGTIFEELDLPWGELCR >gi|223714149|gb|ACDT01000066.1| GENE 23 20364 - 20645 304 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734363|ref|ZP_04564844.1| ## NR: gi|237734363|ref|ZP_04564844.1| predicted protein [Mollicutes bacterium D7] # 1 93 3 95 95 190 100.0 2e-47 MKNDDLLMQIMMLDFAVQDSALFIDTHPCDKEAMNYFNEAATRLKAAKKEYQKQGHALVN REVGAYQNDYLSAPWPWVGDHKCGCTKNVCNTL >gi|223714149|gb|ACDT01000066.1| GENE 24 20609 - 21223 804 204 aa, chain + ## HITS:1 COG:CAC1338 KEGG:ns NR:ns ## COG: CAC1338 COG3546 # Protein_GI_number: 15894617 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: Clostridium acetobutylicum # 1 185 1 185 200 233 57.0 2e-61 MWLYEKRLQYPINITKPNAKAAAVIIDQLGGPDGELGAALRYLSQRYTMPYPEIQALLTD IGTEELAHVEMISAILYQLTTDLTIDQIKAQGYDKYFVDHTLGIYPQGASNVPFTAAYFQ SKGDPITDLTEDMAAEQKARTTYDNILALVDDEEIKKPIRFLREREIIHYQRFAEALEIV 
KGHLNSDNYYAYNPAFRNGKCNKK >gi|223714149|gb|ACDT01000066.1| GENE 25 21417 - 21614 87 65 aa, chain + ## HITS:1 COG:no KEGG:LCRIS_00926 NR:ns ## KEGG: LCRIS_00926 # Name: not_defined # Def: RNA-directed DNA polymerase # Organism: L.crispatus # Pathway: not_defined # 1 62 201 262 265 89 62.0 4e-17 MKLNKACRCGFSEEEIYKCVNTRLGWYRRSAMNVVNFTISSKVLGIKKRDKLGLVNPLDY YLKSL >gi|223714149|gb|ACDT01000066.1| GENE 26 21967 - 23562 1869 531 aa, chain + ## HITS:1 COG:BS_ykpA KEGG:ns NR:ns ## COG: BS_ykpA COG0488 # Protein_GI_number: 16078507 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 531 1 532 540 742 68.0 0 MISTQNVSLQYGDRALFENVSVKFTDGNCYGIIGANGAGKSTFLKILAGEIEPNKGEVIL EPHKRLSVLKQDHFAYDENGVVETVIMGNKRLYEIMQEKEVLYSKPDFNDEDGIKLAELE GEFAEMDGWNAEVDAEIMLNGLGIGEELHNLKMKELDGNQKVKVLLAQALFGNPDVLLLD EPTNHLDLDSIRWLENFLLNFKNTVIVVSHDRYFLNKVCTHIADIDYSRIQLYVGNYDFW YEYSQLQLQQAKEVNKKAEAKKKELEEFIARFSANASKAKQATSRKKLLDNLEMVDIKPS LRRYPFIAFKPGREVGNIVLKVEGLSKTVDGIKLLNNVSFSINPKEKVAFVGDVKATETF FKIIMGEIEPDEGTFQWGVTTSQSYFPKDNSEFFNNCDLTLVDWLRQFSVKDEYEQDLRG WLGRMLFSGEEALKKASVLSGGEKVRCMLAKMMMAEANVLVLDEPTNHLDLESIQALNQG LINFKENILFTSHDHQFVQTIADRIIEFNENGILDRLSTYDEYLEYKENNK >gi|223714149|gb|ACDT01000066.1| GENE 27 23626 - 24387 957 253 aa, chain + ## HITS:1 COG:lin0935 KEGG:ns NR:ns ## COG: lin0935 COG0778 # Protein_GI_number: 16800005 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Listeria innocua # 1 241 1 232 246 107 30.0 3e-23 MNNTIKQLFERKSVRVYQDKPIEAKEKQLIFNAAIQAPTAGNMTLYSIIDIQDQAIKEKL AKTCDNQPFIAKAPLVLIFVADYQKWYDAFKYYHGEEQMRSPSYGDFLLAFSDTLIAAQN AVVAAESLGIGSCFIGDIIEQYEVHRELLNLPQYTAPVGMLVMGYPTEQQQHRTKPARFD SKYVVSVDRYEKFDDEKLVKMFDDRAVLANRSSGGRSFVDDIYNRKWTDPFIEEMSRSSK LWIEQWGKEDEES >gi|223714149|gb|ACDT01000066.1| GENE 28 24515 - 24943 501 142 aa, chain + ## HITS:1 COG:SA0322 KEGG:ns NR:ns ## COG: SA0322 COG1846 # Protein_GI_number: 15926035 # Func_class: K 
Transcription # Function: Transcriptional regulators # Organism: Staphylococcus aureus N315 # 8 140 7 137 139 66 32.0 2e-11 MPNKKIGFLIKQVFYLHEQHLNKQFSKFGLTASQTFTLIYLFKSHDKGISVNQKDIEKEL DISNPTVTGILNRLEVKGLITRVPCRHDARAKNIEVTEKALELDKQLRIVFQQSDEKLVE SLSKEEIDNLQSYLIKILRSNS >gi|223714149|gb|ACDT01000066.1| GENE 29 24999 - 26732 1868 577 aa, chain + ## HITS:1 COG:CAC3414 KEGG:ns NR:ns ## COG: CAC3414 COG1132 # Protein_GI_number: 15896655 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 1 572 1 570 577 759 68.0 0 MLKTLLAQIKEYKKDTILTPVLVVVEVILEVVIPLLMAMIIDKGIEVRDMNMVVKLGIIT LLASFISLAAGGLAGKYAAKASTGFAKNLRKAMYYNIQDFSFANIDKYSTAGLVTRMMTD VTNVQNAFQMLIRACIRAPLMMVSAMIMAFTINAQIAMVFLVAIIFLGVLLVFFMTRAHP YFKRVFNTYDDLNASVQENVNGIRVVKAYVREEHEDEKFKKTSTLIYKLFVKAENYLVFN MPLMQLTVYGCIIGISWFGAHLIVGGNLTTGELTSLFTYVMMILMSLMMFSMVFVMVVMS IASAQRISEVINEKSTLHNPAEPDYEIKDGSIDFNNVNFSYFDDQEEINLRDINVHIKSG QTIGIIGGTGSAKSTFVQLIPRLYDVTKGEVLVGGKDVRKYDLETLRNEVAMVLQKNVLF SGTIKENLRWGNKEASDEEIVEACKLAQADEFIQRFPDKYDTYIEQGGTNVSGGQKQRLC IARALLKKPKILILDDSTSAVDTKTDALIRSAFKKVIPGTTKLIIAQRISSVEDADLIIV LDDGQISAMGTNDELLESSAIYKEIYETQKKGGTLSE >gi|223714149|gb|ACDT01000066.1| GENE 30 26749 - 28587 185 612 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 391 594 40 249 329 75 27 5e-13 MKRLFKFVVKRHKKTCFAILLLIIVSSIASVMGTIFIKSLIDDYITPYINMANPDFGPLT NAILKMIAIYAVGVIATFSYNKLLIKVTQGSLKEIRDTMFEHMEKLPIRYFDTHNHGDIM SIYTNDTDTLRQMISQSVPQVIVSATTIVCVLVSMIVMSIPMTIISVLMVCVMLLVSKHV TNRSGRYFFAQQTNLGKVNGFIEEMMEGQKVVKVFTHEEEAKFDFDKVNEELFESAYQAN KYANILMPLIGNLGYVSYVLVAVVGGVLAINGYMDLTVGTLAAFLQLNRSFNQPIGQISQ QINMVLMALAGAERIFALMDEEVEDDLGHVSLVNAKENADGTLSPVKERTGLWAWEHKRP DGSIEYVRLAGDVRFHDVTFGYNENKTILYDMNLFAKPGEKLAFVGATGAGKTTITNLIN RFYDIQKGSITYDGIDIKLIKKADLRRSLGIVLQDTHLFTGTVMDNIRYGKLDATDEECI EAAKLANAHEFIMHLEHGYQTILSGDGSSLSQGQCQLLAIARAAVANPPVLILDEATSSI 
DTRTESIVQSGMDKLMEGRTVFVIAHRLSTIKNSDAIMVLDQGRIIERGDHDKLIKEKGT YYQLYTGGLELD >gi|223714149|gb|ACDT01000066.1| GENE 31 28615 - 28893 408 92 aa, chain - ## HITS:1 COG:no KEGG:Elen_2980 NR:ns ## KEGG: Elen_2980 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 18 92 41 115 129 84 53.0 1e-15 MEDINNVAVEVKNGEMDEEEIIAYIKYLEKKFPDETLKSLSIKIDGDEVDLDYTFYPLGF ERIRRITGYLVGTLDRFNDAKAAEEKDRVKHA >gi|223714149|gb|ACDT01000066.1| GENE 32 29152 - 29937 762 261 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 5 249 3 247 256 85 28.0 1e-16 MKKKKLILFDMDGTLIDWRQGIDKPTKQMFEAFQKLREHGFLVMIASGRLLPLITVPLRG FEFDGYLLSDGAHVILNGEEIIKEPLDHGDVERTISLARSLTLEYGLLFKDGAYLQKDGV IAPFLKKANFDMNKINYEKYDDQVFKLYIHCQKKKQNEIIEQFGCFNLAFEDDYDLIEIR NKYHSKASGLKAILTRLEIETENTYFFGDGFNDVEIFNMVGHPYVMENAAPELYQYGTIC QPVEADGAYLKVMEILAEENL >gi|223714149|gb|ACDT01000066.1| GENE 33 30037 - 30870 898 277 aa, chain + ## HITS:1 COG:slr1665 KEGG:ns NR:ns ## COG: slr1665 COG0253 # Protein_GI_number: 16332245 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Synechocystis # 1 277 1 279 279 261 47.0 8e-70 MQLHFTKMEGIGNDYIYFDGINQEIPMDKEFIMKISDRHFGIGSDGMIVILPSDQYDFKM RMFNLDGSEAMMCGNGIRCFAKFIYDHKLSDKHVLKIETKSGLRTVELLFDGDECIGAKV NMGKPVLTCQDIPCTYAKEKMIDEPVEIGGREYRLTTISMGNPHTVTFVSDLDTLDLKTI GPKFEHHQLFPASVNTEFVQIINKSYVKMRVWERGSGETMACGTGACAVMYACYLNHYTD SKVTVELLGGCLEIEYVDETIMMSGPAANVFNGTIKI >gi|223714149|gb|ACDT01000066.1| GENE 34 30885 - 31589 1088 234 aa, chain + ## HITS:1 COG:SA1229 KEGG:ns NR:ns ## COG: SA1229 COG2171 # Protein_GI_number: 15926977 # Func_class: E Amino acid transport and metabolism # Function: Tetrahydrodipicolinate N-succinyltransferase # Organism: Staphylococcus aureus N315 # 4 234 6 237 239 252 61.0 5e-67 
MLNSAEEIIKYIGDAKKQTPVKVYLKGKNLPEATEFKMFGGEDSKVCIGDLEAIKAYMEV NKEVIVDSYLEQDRRNSAIPMLDMTNINARIEPGCFIREHVTIGDNAVIMMGAVINIGVK IGEGTMIDMGAVLGGRVEVGKRCHVGAGAVLAGVIEPPSASPVILEDDVLIGANAVVIEG VHIGKGAVVGAGSIVTSDVPAGAVVVGNPARIIKEQKDETTEGKTQLMDDLRKI >gi|223714149|gb|ACDT01000066.1| GENE 35 31600 - 32730 1091 376 aa, chain + ## HITS:1 COG:alr4934 KEGG:ns NR:ns ## COG: alr4934 COG1473 # Protein_GI_number: 17232426 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Nostoc sp. PCC 7120 # 6 371 26 401 405 269 38.0 6e-72 MDALNEIKRWRQELHQIPELGLQEFKTSKYLKDELTKMGYEPISILDTGVLVYLDNHQDK TLAFRSDMDALKISEQTNCSFKSCHNGYMHACGHDGHMAALLGLAKKLKEQPLKWKHNIL LIFQPAEESPGGAKLIVEAGILKQYNVRAIFGLHLMPTIEAGKIACRPGPLMAQNGELDV TITGKSAHAGLYHLGIDSIMIASQIICQYQSIISRVIAPVESCVINIGEIQGGTVRNIVA DQTTFKGTVRTYSETVFKKITDTMEAINHGVEQTYGCTIEFSCPPMYPPVLNDYDLYRQF VRLTDENYEELKEPLMLAEDFSFYQKEVPGIFFYVGTKTPKYFSGLHTETFNFDEEVLMQ AVELYYRLADKIKLGD >gi|223714149|gb|ACDT01000066.1| GENE 36 32733 - 33920 1327 395 aa, chain + ## HITS:1 COG:CAC2388 KEGG:ns NR:ns ## COG: CAC2388 COG4992 # Protein_GI_number: 15895654 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Clostridium acetobutylicum # 11 390 6 386 387 361 45.0 1e-99 MDYFDRANNAFMKAYGRFDVTFDHGNGVYLYDTNGKKYLDFYSGIGVNSFGYDYQLYTDA MCQQMHRLMHISNYFNSVETIEAAEAVIKATQLDQVFFTNSGTEATEGALKLARKYYYEK HGKADSEIISLNHSFHGRSTGAVTLTGTPAYQTAFGPLITGVKYGDINDLDSVAKLITPR TAAIILEPVQGEGGIHVCTQEFMKGIRQLCDKHEIVMILDEVQCGMGRTGTIMTYFQYGI MPDIVCLAKGIGAGFPMGAFVANKKIGDALKPGDHGSTYGGNPMAGKACKTVFKILEETQ MLDHVQDVSEYLITKLDELVEKYDFIKERRGLGLMLGLEFDHPVKPYIQKALDKGLVLIT AGTNVIRMLPPFIIEEKDVDEMISIFTEVIDEINS >gi|223714149|gb|ACDT01000066.1| GENE 37 34037 - 34597 679 186 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: 
Fusobacterium nucleatum # 1 184 1 182 186 82 27.0 4e-16 MKKGMLLKLFVTNLYLSAFTFGGGYVIVTLMKKKYVDEFNWIDEKEMLDLIAIAQSAPGA IAVNGAIVVGFKLAGIKGAIVSILATILPPFVILSLVSVFYEIFKSNEIIALMLSGMQAG VGAVIASVAYDMAVGVVKEKEILPIIIMVCAFIATYLFNVNVIYVILSCGFIGIIQNMVS RWRERQ >gi|223714149|gb|ACDT01000066.1| GENE 38 34594 - 35154 435 186 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 183 1 175 176 105 40.0 3e-23 MKLLTLFFSFLQIGAFSFGGGYAALPLIQHQVVDLYHWLTMNELTDLITISQMTPGPIAI NAATFVGLKIDSFYGAIVATLGCILPSCIIVSLIAYIYLKYQQMSIIQNILKYIRPAVVS LIAVSGLLIIVSCFFGDTVNFANLKFGSVAIFIVALVLLRKQKMNPITVMVIAGIIQIGL YYIGIS >gi|223714149|gb|ACDT01000066.1| GENE 39 35239 - 36144 1107 301 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1535 NR:ns ## KEGG: EUBREC_1535 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 20 296 25 311 312 160 34.0 5e-38 MYKRILVVVTVLSCLLTGCSSDNNSNQEYPSDYPTSKSYTVDEILYPEASGALVLDSDVA KVDYSNTNQGYVMAFLKPGVSKRIKIQISKDEQKLNYDLTDEQGVSYPLQLGNGKYLIKI LENIEGTQYAIKKSVEVDVILDNELLPFLYPNILVNYKPGDAITTLAIDEVKDEDNDLKR IKKVYEFVANYINYDDDKVALAKQQYLIPNLDEIINNKKGICFDYASMMTAMLRINHIPA RLICGGTDKDEYHSWLEVYVKGQGWVNPDIFMDKDTWTIMDPTFASTKYDYEGKYVETAR Y >gi|223714149|gb|ACDT01000066.1| GENE 40 36156 - 37184 942 342 aa, chain + ## HITS:1 COG:aq_747 KEGG:ns NR:ns ## COG: aq_747 COG1459 # Protein_GI_number: 15606135 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Aquifex aeolicus # 10 340 71 404 408 139 25.0 9e-33 MKLTMSEKYMFCNQMAMILESGFSLNQGVTMVYEEMDDKNIKGVLQEVAKYLDEQVSFSE AIDLTKAFDDYMVNLVKVGETSGNLDDVMQSLSEYYARIDDITNKLKQALTYPIILIIMM VVVVGIIVFKVLPIFKDVLNGLGSDLSSYANSFMEFGQIFSLICFAVLLVLVIVIIAGYL YQRITHVNVLSNVVQKSFLTRKLSRALNKAQITYALSLFISSGYDLQEAMKFVPKLVDDK 
QLRANLEKCNEDLINGDSFVEVIKKYQIYQGMQLNMIQVGFKTGQVDIIMKQLSNSFQEE VSRAIDQFLNIIEPTIVTLLSLVVGIVLMSVMLPLISIMSSL >gi|223714149|gb|ACDT01000066.1| GENE 41 37195 - 37494 226 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734381|ref|ZP_04564862.1| ## NR: gi|237734381|ref|ZP_04564862.1| predicted protein [Mollicutes bacterium D7] # 17 99 17 99 99 137 100.0 3e-31 MKNMLKIILLIVIIMVSIYAFNSTSTYSSNEDLKRVSDTINQLSLKCYSIEGKYPKDIEY LKENYGLLLNDEDYQVIYYYEGDNLQPRIEVFKKEKNYE >gi|223714149|gb|ACDT01000066.1| GENE 42 37487 - 37936 449 149 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755635|ref|ZP_02427762.1| ## NR: gi|167755635|ref|ZP_02427762.1| hypothetical protein CLORAM_01150 [Clostridium ramosum DSM 1402] # 1 149 1 149 149 255 100.0 7e-67 MNKHRNIHVLFSLSLFLLFVIGSFFIVTYEIKGYQVINDTCQQEDDLIVPLAYLNTKLKA NDSDDSTKIVEIDNTQCLEIKTAKTVTYIYCQDGYLKELYTSNDYHAGLQEGSKLFALDD FKIEQKDRLFKFTVTRDQVSKSISIYLHG >gi|223714149|gb|ACDT01000066.1| GENE 43 37946 - 38314 421 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734383|ref|ZP_04564864.1| ## NR: gi|237734383|ref|ZP_04564864.1| predicted protein [Mollicutes bacterium D7] # 1 122 1 122 122 206 100.0 4e-52 MKSKVQSFSFLMELIIVILFFAASTTVCASFIVQAKNKQVQGTNLQNALIEAQSMIETMQ AYPQADLEQLLEVEKIDENHYQKDNIFIEIDRDMITQGKIMIKNKNEVISELPFVLGGNH DE >gi|223714149|gb|ACDT01000066.1| GENE 44 38307 - 38687 520 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755637|ref|ZP_02427764.1| ## NR: gi|167755637|ref|ZP_02427764.1| hypothetical protein CLORAM_01152 [Clostridium ramosum DSM 1402] # 1 126 1 126 126 197 100.0 2e-49 MNKKHTSMRLGIGAPSIFMIFVILVMCVLAILSYLRANSYYESSVRQANITSQYYESESR LLTKYYQLDSQNLEYSLENLQIEYQKEDDLYMLEDKINDSQVLQLSFVQENDSLKIISLK TINLEE >gi|223714149|gb|ACDT01000066.1| GENE 45 38689 - 39744 1095 351 aa, chain + ## HITS:1 COG:aq_745 KEGG:ns NR:ns ## COG: aq_745 COG2805 # Protein_GI_number: 15606134 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: 
Aquifex aeolicus # 1 349 13 360 366 319 47.0 6e-87 MEIKKLLEEVVTKGASDIFIVAGSPCAIKIDGRISKYSDERLTPHDTERMIRDIYEIANN AGIEEFMRSGDDDFSFSIPNVGRFRVNAYRQRNSCAAVLRVVSFQLPDPEVMHIPKVITD LANQKKGMVLVTGPAGSGKSTTLACIIDQINRNRNTHIITIEDPIEYLHSHDKSIVSQRE VYHDTHDYVSALRAALREAPEVILVGEMRDLETIDIAVTAAETGHLVFSSLHTVGAANTI DRMIDVFPPSQQQQIRVQLAMVLNAVISQQLVPGIDGKMIPVFEIMICNQAIRTLIRDGK THQIDNAISTNRQIGMVTMDDAIVDLYKQNKITKETAVMYASNPNLIERKF >gi|223714149|gb|ACDT01000066.1| GENE 46 39889 - 41070 1232 393 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1848 NR:ns ## KEGG: EUBREC_1848 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 6 375 11 387 415 139 26.0 2e-31 MDKKLVLKIKTLGRLHITDGENEFPQEKKRSAQVELLIVYLILNRNANITNPQLIDFLWP DGNADKPEGALRNLIYRARKEMKQFFKEVDCIKSKGRSYYWNPEVECRVDYEEMLKLCRR VEQEDDPYRKHERCLEMINKYHGEVLPEFNYNDWILEINNLLERNCLEAILKTLTILANN NLYEQVLQICNHQNSQKIMDTRLYEMKLYAYYKTGKTDLALSFYRQIVDYYYSKYGIEVS QRLKEIYEMILDTSPATQIDVEELEKSLTVDDNNDNTFYCDFDVFKNIYQINVRSARRSM KARILTLLTIVDTSGCLDEKAIRQEADILREVISNSLRKNDVYSKFNMTQYSLILASPDL DGAKIAVDRITNRYNEKKKHDEIILTNDLKKIR >gi|223714149|gb|ACDT01000066.1| GENE 47 41093 - 53380 13595 4095 aa, chain + ## HITS:1 COG:BH0361 KEGG:ns NR:ns ## COG: BH0361 COG4932 # Protein_GI_number: 15612924 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Bacillus halodurans # 2112 2609 1086 1534 1661 119 28.0 2e-25 MGNLMKKLKKTKMVKQFGILSLVLAMLIGVMSNGTIINAVDGNADANITFNSRIVEKVNG QDVDVSQADSGDSFFLAIYYAVSASTDASKYTECTLSITLPNDIEFDETTDISNTGFDSI EQRTSLGKNILSIKSKELAAGQSGTLYLKMHFKNMTTKNGTTAKFVDMKMTGSMVNENGS YTINPVSVPDASVTAKASQEWTVGKNVIQQADGNNYTIKSIDGKDYYSVDYKVTAAPGSN LTDLGNNYGRLNCTTFELVDTLPVAPAGREDGGAANISIKVGDKVLEEGADKDYELVLNE NGTVKAIKIYYISTYESNPDDTNYVPDGAAMLTTYTINANYSKDAYRVLPNEKDDIPFIL DNKVELKYLPLGENMPKYVSANAPVNVGWVDENAPTTNLDITKYVTVNGDNSIVGDNKFI FDQEKQDCYYQNTDSRIAFGLYTDEACQNAATDIDGNTVGNSKVIDKDGKVNFAKISEGT YYLKETVGNLPFIGDGVDSANGETIYKIEVKNENNKAVVYLDGKKIENGKLEINNLTTAQ 
GFGYVAFNKVGTSATNSQHGPLAGVKFILTNKVTGDKYETVSNDKGLVLFEGIIAGDYTV EEDFSSAEYDKPDVYKWDVKVEANKVNYPSLMLQDHNTGVPYVVNSSNKGKLIINKVDAL TETKKLNGAEFTVYGPFESKETANIAIESEDFSSVDSFTLNGNEESKALVKGFYVYQETK APNHYKLDENYYVTELKTQTLNEVTVKNEELGYLQILKKGSLKDYPTITVNLAGAKFHVY TNKNATDDSLVKDDAGNPIVIESNNSVSSSSNSNKVELAAGTYYLKEFEAPSGYNQLTDV ITVEIQSGKTATKEIMNESTTLGYLEVTKIDSKTKEPLANVTFDVYRKDDDTKVDTIKTD SDGKAKTKFLEAGEYYLKETSKLTGYAVNTKNIDFKIENNKETSKTVSNDPLVGYMIRKV SSLDHNTGLYKAEFKLLDADKKVIDEGIESNGQGYVIFTGLTPGATYYYVETKAPDGYTL NTTEYSFTAPQVNDTNNSLVYESKEIVENVPKGKFTITKTSFNIDETIAKPAGNITFEYF PKLSTNIDEDRKIADTNKTLLTVTTNNDGIATSKIVDAGEYWVVEKNVKSEFEVESGNAK VVTVKPGSNTDDKLVSNVDVVNRLIKGKVAVKKVSSVDSDTGVAATFNVYKWNETGNYTG DPVYVLTTKADGNVVTSEFLAPGKYVLIEKAVVGNYTIDPTPHEFEIVEGKTNKVYVENP IENVPYSSLGLVKTAKWLDKDNKDVSEPLAGIKFSVYKAKEVSSDEDGAISKDDKWYVKD GNALATIESDVVEKTVSGLEPGFYIVEENLTAEQQNDYEQQPPKVVELEAGKTTHVTFEN TPKKGKIKLTKTNEDKTVLLDGAEFKIYYVDSNGSQTITANDGKKYTVSDSGISTKIISG TAVLYDANGKSSKDPGVGYSVFLDPGKEVVLQEVRAPDYYGIIHEWWYVGEIKAQHISEI EVQNYLLTNPIGYKYDEAEKLLAGATMGLFSTETGADRLNKLSQTEIDEVIADSEKWAEY DLVSTAVSTDKYFKFENIDSTQTYYVTELKQPAGYNKNTDIRKVTVEVNGKVYFLKDADT NETVTFINYKYRQIWVSKVLKFAGEQSALTGINFKVYPVVESNESVADAFKIGDKWYIKT GEEVTYTTGTISSGKTGSFMTAPSPAGLYVVEEDITSLPEHVVAPDTTKYLVNLGFDTDN TDLYSDQADASNVIVNTSNYGKLALNKVSSTNSNTYIKATFNIYAKNDEAGHDYSKDTPI STITTNGNQNAVLTDYLKPGNYVLVESLISTDGYVLNPTPREFTIEADKVTGLDGSVYTT VSDAVKNPLVVTNVPKGSVELTKQGTYLDANGLTNPVALSGVNFNVYKSVKLADGTLDLS STNLVGTAVSQSNGKLQFKVNGTDVTSKKWLEAGDYILQETTVGTTNKNNGFADYAYLGK FTITPGDITKEIVKVDENGNESGKADNVISNTSSYGKLEITKVDHYDKTKKLADVVFEVF NESGILVDTIKTNKNGVARTKLLPAGEYTLKEKSTNNDYFLNDTLYGPYTVEALKVTGTD GKIEITNMMKQSLKVIKKDSHTGKEIDKLDGTTFSLYDAQTDGNLVATATYQKDKGIVFD NLRPNTTYYLQEDISPNGYVIKDSNRIKVTTKTNGEVTTIEVENEPLGSIKIEKISEWSI YGANDSKTTRFPLAGAEFTLYKDVNGDSKLQDDEKVNGIVKNSDTLGIVIFDNLEKGKYL FVETKVPDGYDETVMNDVYAVEVTEGEQNIVYTEAGKTDDAQNHGPIINTTTAGKFTFTK TDPEGKGLSGATFQLMKKNTDGDYENYLDSFTIENTDGKFESSVIETGKYQLVEITAPDN FEIMQPIEFTIEAGKVTTVTSKDANTVVNQALGKVVITKYDDRSNYSNVGNKVLEGVEFG 
LYQESGDLVQTVKTTGKDGKIIWTDVPAGNYYVKEVATLDGFELDTTVYPVTVESGQSTV KEYLPNTDGKIINHSNMGKIVVKKTSDANENLAGAVFKITNPNDSGFVPIIITTNEDGIA KTELLPANKEGTTYVVTEIKAPDGYTLDDKYHDIEQSVKVYPVQDETVILSEQSKNYLEF KNKKQSDIMDIDGKIIKYIDGDNNTLLTAKTVETPLLDAADTETFLLREYAQAKNDVPVR SVEVEDSLIKMYYYDANGNSVEYMPDAPYVVNTVTVHSATDKNGNKIQARVYYQTAGSDE WTTNNSLVADVSTNSQGISLEGLNALKIKVVYTNVVSKDLIAENFNATGIDVNVTFNKRP SDATVHEVRRVSNQTSVTYNYTLKDNEGNDQNAQYTKVSDIVNLYYPLKESVTPRVSLGI VSGEKDKTYKPGETIESVITVKNESTKENEIIEQPIISFDLPIGMSLNNNSYSIGDLQTR FQVIVFEHDGDTSGKALEPDEYTVTYTNDVPARVVENNQLVETTSKTKKITIDLGEDFKF KPGMKIQISYKGTSSINDTSTTLWAPAYFTSGKIIALSAENPYGNSFTVESATGSSYNTL VADSVLDEITNKPDETGDGLKYPNSNGIIKINEQNYLSIYKQVKGIYDDDYLNNNQIAKT APGEDLDYKIVFKNGEKSDRAVSKARVVDILPFEGDSLVNRTNDNYTARVTNLDKSPILN YVDCLTPGVSYKVYYCVGESEDTKWDEWKQDARTSKTASEELPIVYGNMDDDDWTSGAHQ WIEASNDIDLRLVSAIAVEFDFSNAPLEPNQSIELHVNMSAPEYSTSDLEKVSGKLMSNS ALVAVKRTGLDTVNDSDIVENLEVKAEIALPKGTIGDYAFYDMNRDGLQDSNDLPVQGLN VTLHKFVTTINQNEGKVTKELDSETTLTDASGKYEFTNQDCNVLKDGKTDSSSTNPNDYV GNKYYQYQVEFAIPEDESLYKYEPTQRYAQDVDGKPQIEADSNIGNDLNKDDYCKTELFT LTATKQADGSLVGETNLTLDAGYVALGSLGDFVWFDENKNGIQDPEETGVKDVIVNLYTV TNGEVSGTPEKTTRTDENGYYLFTDLEDGSYVVEFDIRGVKPTMGGGHAERFYFTKNGGT TDSAIDSNPTITDENKNTLIARSEVIDLAYHTSDMTIDAGLTVYSAISGVAFEDRDYSDI QNYQDSDGKDVDIKLPGTIVELYRVDEEFGIENNIPTELVASTVVGEDGSYLFDYLDSGT YVVKFTYPEGYKVIEADVGDDDTLDSDVAYHIETNENRMSGYTGVINIPVNTRVDNVDGG ARLYSSLGDYVFKDVNADGLQDETDIVMPDVPVYLFARQKGEEVWTSIANTITDQDGRYR FDNLKGSTYTGIEYRVIFDLPLTTKLTVPYAGDDPELDSNALNEYVPGLGFPTQTIDLAY NTVDLTWDAGIVQSKGSVGDYVWYDTNVNGIQDENGTGIEGIKVILETNLTGNIADENDW QEVGTTYTNSQGYYIFNELSEGYYRVKFEIPSKYKVTLSTQGEDSAEDSDGIYSEDGVWY YTRSFYLDQDGYDMTWDCGVYDPTDTKLTKTGSTATGDSSDINGYVVLGFSSILALGAVY FVSNKKRKAKKAEKI >gi|223714149|gb|ACDT01000066.1| GENE 48 53548 - 55371 2217 607 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734388|ref|ZP_04564869.1| ## NR: gi|237734388|ref|ZP_04564869.1| predicted protein [Mollicutes bacterium D7] # 1 607 1 607 607 1048 100.0 0 MKKKLFRLLVVLAMIVMPVTGVMINAEAASNLMLGTAVSNNPDTRVFTVKNNGEATTFAW 
EATGTGETNGVSIARGETKRIEVPAVNGSSHLTVLAQDAAAVQEIASLNQYYINVTFSTV DGTELMSAKTVTCTYNNNGTYTAPSTITLGSSVYDINGNNYISVPYGTRSIDFKYTKRAQ TPYTSTVALIDQDGVEHKILSYQVTEANGGSVNTPETFVSTINGRTYKRIAGQKATVSQT YAQGQMRSIVRYQVQDDVASKPYNIYIQYVDKATGSAILTKKLLVEKDKTVDHTPSKTFM KGGKQYEVVDGAKITHTFGDATKTYQVEYKQVITDENTPQPIQVNYVDLATGEDLQIHQY TVDPGKTKTIEVDTTVEIDGKTYVLSPNQESTITHKFGEETTEYNVYFNEKGLEVDKYEV SVTYMNVSNTYVGEDTLYTTKLEANVGQELNIEVPAQYEANGTTYVLMSGQATSYSHDFY STRRNYVMVYRDVNDTQNEIEFIPGETGTDLAATTPGGTNFTIDGGTGNPMITNPDGTVT TVDQDGQIVPYEEPQTEVVDENETPLAKGDKSTSNNTAYVIGGGFATVAIIAGIVAFVIK KKKAQQA >gi|223714149|gb|ACDT01000066.1| GENE 49 55534 - 56502 903 322 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1848 NR:ns ## KEGG: EUBREC_1848 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 18 317 4 308 415 120 27.0 7e-26 MLFMLILIILEIINYNIISGGDLVKLKVDFFNAYVIKLDEDHWLDFNIINSPKLAKLLAY IIYHHRRKLSSNDLQELMFASGESNNPANALKALIYRLRTILKNNLGEYDYILSSQGTYY WNPEIEMIFDVDDFTLYHRLGRNEIHDEEIRAKYYMKAYYRYSGEFLPMIENVEKLSIIR GFLNSQFLNVSRFLIEFYLKKEEYGIIEEICMKNLADNYLDENINAILVLALVKQDKITL AKEHYNKVTSALKDDLGNATIRKMKYYLNYNVNNAEERNIFSIQDDLIETKTRGAYSCDY DFFKKSYRLEARKSIRNRTKRR >gi|223714149|gb|ACDT01000066.1| GENE 50 56642 - 56824 195 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755642|ref|ZP_02427769.1| ## NR: gi|167755642|ref|ZP_02427769.1| hypothetical protein CLORAM_01157 [Clostridium ramosum DSM 1402] # 1 60 371 430 430 111 100.0 2e-23 MKQFLILLDCDENGVVKAINRIKKDFLAQDKYERVSIDYSYAAIAVAKMHIDDEDILKLQ >gi|223714149|gb|ACDT01000066.1| GENE 51 56879 - 57095 255 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755643|ref|ZP_02427770.1| ## NR: gi|167755643|ref|ZP_02427770.1| hypothetical protein CLORAM_01158 [Clostridium ramosum DSM 1402] # 1 72 1 72 680 143 100.0 3e-33 MFKYPIYKQDNNYSCGAYCIKMILKYYHLDIEIKEIKERCKVTNEGISVYGIIKCFESYH LDAKAYQSDLNT Prediction of potential genes in microbial genomes Time: Thu May 26 09:54:34 2011 Seq name: gi|223714148|gb|ACDT01000067.1| Coprobacillus sp. 
D7 cont1.67, whole genome shotgun sequence Length of sequence - 34719 bp Number of predicted genes - 37, with homology - 37 Number of transcription units - 14, operones - 11 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 1 - 804 396 ## COG2274 ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain 2 1 Op 2 . + CDS 801 - 1775 602 ## COG2274 ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain 3 1 Op 3 1/0.000 + CDS 1789 - 2250 593 ## COG0319 Predicted metal-dependent hydrolase 4 1 Op 4 6/0.000 + CDS 2328 - 2744 488 ## COG0295 Cytidine deaminase 5 1 Op 5 16/0.000 + CDS 2723 - 3613 1031 ## COG1159 GTPase 6 1 Op 6 . + CDS 3603 - 4349 773 ## COG1381 Recombinational DNA repair protein (RecF pathway) + Prom 4353 - 4412 11.8 7 2 Op 1 . + CDS 4444 - 6288 1710 ## COG1032 Fe-S oxidoreductase 8 2 Op 2 . + CDS 6291 - 7004 764 ## COG2267 Lysophospholipase + Term 7203 - 7258 12.0 + Prom 7196 - 7255 5.2 9 3 Op 1 3/0.000 + CDS 7363 - 8736 1570 ## COG0423 Glycyl-tRNA synthetase (class II) 10 3 Op 2 31/0.000 + CDS 8752 - 10506 1708 ## COG0358 DNA primase (bacterial type) 11 3 Op 3 5/0.000 + CDS 10510 - 11763 1650 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 12 3 Op 4 9/0.000 + CDS 11765 - 12457 700 ## COG2384 Predicted SAM-dependent methyltransferase 13 3 Op 5 . + CDS 12444 - 13196 704 ## COG0327 Uncharacterized conserved protein + Term 13243 - 13285 7.2 - Term 13528 - 13571 1.0 14 4 Tu 1 . - CDS 13616 - 14467 927 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 14639 - 14698 6.1 + Prom 14463 - 14522 6.8 15 5 Tu 1 . + CDS 14609 - 15946 1347 ## COG0534 Na+-driven multidrug efflux pump + Term 15949 - 15987 1.5 - Term 15936 - 15974 1.5 16 6 Op 1 . - CDS 15978 - 16889 368 ## PROTEIN SUPPORTED gi|34762725|ref|ZP_00143715.1| LytB protein; SSU ribosomal protein S1P 17 6 Op 2 . 
- CDS 16905 - 17258 395 ## gi|237734406|ref|ZP_04564887.1| predicted protein
+ Prom 17479 - 17538 10.5
18 7 Op 1 4/0.000 + CDS 17563 - 18933 1537 ## COG0513 Superfamily II DNA and RNA helicases
19 7 Op 2 . + CDS 18933 - 19793 1181 ## COG0648 Endonuclease IV
+ Term 19828 - 19859 1.1
- Term 19677 - 19714 4.1
20 8 Op 1 . - CDS 19766 - 20035 241 ## gi|167755662|ref|ZP_02427789.1| hypothetical protein CLORAM_01177
21 8 Op 2 . - CDS 20096 - 21151 1129 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis
- Prom 21171 - 21230 5.7
+ Prom 21225 - 21284 8.8
22 9 Tu 1 . + CDS 21361 - 22767 359 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18
+ Term 22771 - 22816 6.2
23 10 Op 1 . - CDS 22796 - 24571 1679 ## COG1164 Oligoendopeptidase F
24 10 Op 2 . - CDS 24605 - 24865 422 ## PROTEIN SUPPORTED gi|167755666|ref|ZP_02427793.1| hypothetical protein CLORAM_01181
- Prom 24892 - 24951 7.4
+ Prom 24902 - 24961 8.5
25 11 Op 1 . + CDS 25010 - 25969 1164 ## GTNG_2447 germination protease (EC:3.4.24.78)
26 11 Op 2 . + CDS 26022 - 26771 791 ## Bsph_3835 stage II sporulation protein P
27 11 Op 3 . + CDS 26768 - 27052 379 ## gi|167755669|ref|ZP_02427796.1| hypothetical protein CLORAM_01184
28 11 Op 4 . + CDS 27075 - 28028 1143 ## COG0223 Methionyl-tRNA formyltransferase
+ Term 28029 - 28056 -0.8
29 11 Op 5 . + CDS 28099 - 28359 452 ## gi|167755671|ref|ZP_02427798.1| hypothetical protein CLORAM_01186
+ Prom 28363 - 28422 8.7
30 12 Op 1 . + CDS 28442 - 29743 1601 ## COG0213 Thymidine phosphorylase
31 12 Op 2 . + CDS 29767 - 30606 851 ## COG0561 Predicted hydrolases of the HAD superfamily
32 12 Op 3 . + CDS 30634 - 30954 142 ## gi|237734421|ref|ZP_04564902.1| predicted protein
+ Term 31012 - 31045 2.7
- Term 30999 - 31033 3.1
33 13 Op 1 . - CDS 31035 - 31862 867 ## gi|167755675|ref|ZP_02427802.1| hypothetical protein CLORAM_01190
34 13 Op 2 . - CDS 31864 - 32502 549 ## COG3764 Sortase (surface protein transpeptidase)
35 13 Op 3 .
- CDS 32535 - 33083 620 ## gi|167755677|ref|ZP_02427804.1| hypothetical protein CLORAM_01192 - Prom 33232 - 33291 15.8 36 14 Op 1 . - CDS 33375 - 34352 855 ## gi|237734425|ref|ZP_04564906.1| predicted protein 37 14 Op 2 . - CDS 34355 - 34717 281 ## COG0489 ATPases involved in chromosome partitioning Predicted protein(s) >gi|223714148|gb|ACDT01000067.1| GENE 1 1 - 804 396 267 aa, chain + ## HITS:1 COG:SP0042 KEGG:ns NR:ns ## COG: SP0042 COG2274 # Protein_GI_number: 15899987 # Func_class: V Defense mechanisms # Function: ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain # Organism: Streptococcus pneumoniae TIGR4 # 7 260 99 354 717 67 21.0 2e-11 TNTLCGVISGNKKYLLIGDPAHGLTKRTKESIEKDYTGICICIHHVGRYRVSSRHQEQSF ISFIILHLKNNYCYIIRLVAKAILISCCSVLGSYYFQGLVDQINGMNYIMILIFTGVFMI ISGIRIMINFQRKQLEIEIQRYLNQEYVNKSVVNMLYLPFSYYYRNQEGVLLTKVQNLFQ LSDFFIHFYMALFVDLILIIGLLSALILFSMNLALVVIAFLSIITIVTIKWLKIVNGLNK QIVVNQEIMNQGHLEYLKNIYNSHQFF >gi|223714148|gb|ACDT01000067.1| GENE 2 801 - 1775 602 324 aa, chain + ## HITS:1 COG:L82520 KEGG:ns NR:ns ## COG: L82520 COG2274 # Protein_GI_number: 15672061 # Func_class: V Defense mechanisms # Function: ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain # Organism: Lactococcus lactis # 7 322 368 693 715 115 30.0 1e-25 MKKFIKEKLNYLFEEYNYSLYQRDSELNKLNFISETLIQLLSFGVVLLASFHYKKGSISV GDIIFFYMLVSYMIEPMFNVIAFIIKKDEILILYERYKEIIPNRSEKKQKIRGRISKISF DHISYSYGYSKPILQHLDLQIERSLWLKGDTGVGKSTLLKLLMKYDELLKGHIYINGIDL MNIDTNSLYRKIIYLDRKPVFYHESLRFNLILNSKEEELMEKLLKIFGLNYYLDRLDLII ETDGQPLSSGQAQIVMLIRAIIKKPEVLILDEALCNIDDYKAKKIISYIHDNLPNTIMII VAHQTKLVNELFDCAIIRDGKIYK >gi|223714148|gb|ACDT01000067.1| GENE 3 1789 - 2250 593 153 aa, chain + ## HITS:1 COG:BS_yqfG KEGG:ns NR:ns ## COG: BS_yqfG COG0319 # Protein_GI_number: 16079586 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Bacillus subtilis # 1 152 1 156 157 112 41.0 2e-25 
MEIDFALINEAAGFNEDYTDDYRSIINEAAKQLGIEEDLELSCILVDDSKIHEINKEYRK IDRATDVISFALEDNEQFYVPGMPRSIGDIFISVDHAKVQAEEYGHDLKREMCFLFTHGL LHLLGFDHMIEEDEIQMIAMQKSILDALNITRR >gi|223714148|gb|ACDT01000067.1| GENE 4 2328 - 2744 488 138 aa, chain + ## HITS:1 COG:BH1366 KEGG:ns NR:ns ## COG: BH1366 COG0295 # Protein_GI_number: 15613929 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Bacillus halodurans # 1 131 1 130 132 117 46.0 5e-27 MDVKTIIEKAFEAAKNSYSPYSKFKVGACIEMRDGNYFLGTNIENAAFGSTMCAERNAVY GAYSNGYGKADIKQLAIVAGSDEIVSPCGACRQVLMELLPQDCPVILANRESYEITTMKE LLPRAFIGESLTCSNQDL >gi|223714148|gb|ACDT01000067.1| GENE 5 2723 - 3613 1031 296 aa, chain + ## HITS:1 COG:lin1499 KEGG:ns NR:ns ## COG: lin1499 COG1159 # Protein_GI_number: 16800567 # Func_class: R General function prediction only # Function: GTPase # Organism: Listeria innocua # 2 296 5 301 301 323 55.0 3e-88 MFKSGFVSIVGRPNVGKSTLLNSILETKLAIMSDVAQTTRNTIQGIHTDDEAQIIFMDTP GIHKPQDRLGTFMNTTALNSIFGVDLVLFLAPANEKIGRGDKFIIERLKEADGPVFLVLS KIDTVSKEELIKKLQEWQELFDFKEIIPISATTNDNIDLLLKTVKAYLPEGNMYYPQDHL TDHPERFVMAEFIREKILYFTKEEVPHSVAIVIERMLEDEAGVEIIATIVCDRKSQKGII VGKQGTMIKKIRQNAQREMKRFLQVPVHLELFVKVENNWRNKQKYLKEFGYNEDDY >gi|223714148|gb|ACDT01000067.1| GENE 6 3603 - 4349 773 248 aa, chain + ## HITS:1 COG:lin1497 KEGG:ns NR:ns ## COG: lin1497 COG1381 # Protein_GI_number: 16800565 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Listeria innocua # 7 246 2 243 255 138 30.0 1e-32 MIINGEENITGLVLKNKPYKENDMLLWIYTHDYGKLALIAKGVKKLKSKNAPSCQTITLS EFTFIPRSGLSNLIKGAPVEYFRYIKEDIELEAYASYFCEFVYKFTKDNDPDETIYNTLL LALRSLQTGYNPKLVYLLFNAFIMEKTGSSLEVDQCVSCGRLDHISGVSIHSGGFVCQEC IGMYDQKLTVDVLKGFRYINKWQLENIDMLHLEENIIDELMPIMEAYIDEFTGITFQSRK FIRQFNNL >gi|223714148|gb|ACDT01000067.1| GENE 7 4444 - 6288 1710 614 aa, chain + ## HITS:1 COG:MA4618 KEGG:ns NR:ns ## COG: MA4618 COG1032 # Protein_GI_number: 20093399 # Func_class: C Energy production and 
conversion # Function: Fe-S oxidoreductase # Organism: Methanosarcina acetivorans str.C2A # 2 592 124 737 742 610 48.0 1e-174 MSEFLPINREEMEARGWQQPDFVYICGDAYVDHPSFGAAIICRTLESHGFKVCFLAQPDW HNVEEFRQFGKPRLGFLISSGNIDSMVNHYTVAKKRRHKDLYTPGSEGFKRPDRAVIVYS QMARQAYKDANIIIGGIEASLRRLGHYDYWDNKVRKSIIIDANADLLLYGMGENSIIEVA EALDSGLEIKYLTYLDGTVFKTKNLDDAHEPVMLPSYEEICNDKKAYAKSFHIQYLNTDH FNAKTLVEPYQGWYVVQNRPNRPLKQREFDRAYSYDYNRDSHPSYKQHVPAIDEVKFSLI SNRGCFGSCNFCALAFHQGRIVQARSHESIISEASKMVWDPEFKGYIHDVGGPTANFRGP ACSKQNKHGACPTKQCLWPKACNNLEVSHQDYLELLRKLRVLPNVKKVFVRSGIRYDYLM YDKDDTFFKELVQNHISGQLKVAPEHISDQVLDKMGKPRRGLYEKFVDKYYQLNEEYNKK QFLVPYLMSSHPGCTLDDAIELAEYLRDIHHQPEQVQDFYPTPGTLSTTMYYTGLDPRDM SEVYVPKTAKEKAMQRALIQYRNPKNYDLVYEALVQANRKDLIGFGKNCLIRPRRNNNQR SYQAKNNYQKKRRN >gi|223714148|gb|ACDT01000067.1| GENE 8 6291 - 7004 764 237 aa, chain + ## HITS:1 COG:AGc3857 KEGG:ns NR:ns ## COG: AGc3857 COG2267 # Protein_GI_number: 15889407 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 22 218 34 236 328 89 26.0 6e-18 MKYSEKFTSATATHVNFDIYLPTVMIRYKAIIQIHHAMGEHSGRYERFAEYLAHDGFVVV VSDFPGHGTSLYNYEQGYFGIGDATKTLVEDMHRLRNIMASRYPDLPYFMIGNQLGSLVL RQYMAQYGDFIQGAILMGTCGKPHFALIGKLIIKGDAMLKGHMHRSKTVRKNVINQLISR TKNATYVTGDARELEQYQQDPFTDFTYTNNAYEEVFGLIKKVSTIQNIKKYLNIYQF >gi|223714148|gb|ACDT01000067.1| GENE 9 7363 - 8736 1570 457 aa, chain + ## HITS:1 COG:CAC3195 KEGG:ns NR:ns ## COG: CAC3195 COG0423 # Protein_GI_number: 15896443 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 4 444 5 450 462 669 68.0 0 MASKDMDKLVSHAKTSGFVYQGSEIYDGLANTWDYGPLGVEMKNNIKQLWWKRFIQESPY NVGLDSAIFMNPRVWEASGHVGGFSDPLIDCKECKTRHRADKLIEAYDPSVHSEGWSSEE TVKYIKDHNIVCPNCGKSNFTDVRQFKLMFETQMGVVEDAKDVVYLRPETAQGIFVNFKN IQRTSRKKVPFGIGQIGKAFRNEITPGNFIFRMREFEQMELEFFCKPDTDLEWFDYWKSY CQKFLFDLGLKEENLRFRDHEKEELSFYSKATCDVEYNFPFGWGELWGIADRTNYDLTQH 
QNHSKKSLEYLDPTTNEKYIPYVIEPSVGADRLFLSVLCDSYEEEILEDGETRVVMKLHP SVAPFKVAILPLTKKQSDKATEIYSQLAKYFNCEYDVAGQIGKRYRRQDAIGTPYCVTVD FDTMEDETVTVRDRDTMEQIRLPINQLVDYISKKITF >gi|223714148|gb|ACDT01000067.1| GENE 10 8752 - 10506 1708 584 aa, chain + ## HITS:1 COG:BH1375 KEGG:ns NR:ns ## COG: BH1375 COG0358 # Protein_GI_number: 15613938 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Bacillus halodurans # 3 583 4 596 599 352 35.0 1e-96 MPRLSSEKINEIRQSVDIVDVIGTYLPLEKKGRNYVAICPFHDDSHPSMSISPERQIFMC FVCHHGGNVFTFLKDYLKIPYIEAVKMVANIGNVDISKYNLESRVKPVDQKLEPLYRMHD EANKIYNHYLNTKLAIQAKEYLNNRKITDEIIETFEIGYAPNNHVLLKAFEKMNFNKVSM FESGLIIEASNGYDRFTDRIMFPLHDASGRVVGFSGRIYKQSQNESKYMNSPESSIFIKG DTLYNYHRVGEEARQAGYIIITEGFMDVIALYKAGIKNAVAIMGTALTHGHLNLLKRLSK TVYLCLDGDQAGRNATIKSIDILLSAGFIVKVVDLPDNLDPDEILDKRGIEELNAVIKRP LSSLDFKMNYYYEMTNMDNYEDRKSYLETIAREISRLDDIIDQDHYIQQLEKKSGFSRNI IDQLINQNTAVRIETAEPPKIPKFQNQRLLDKYIRAERDLLYYMMNDKNVAMMYEAKAGF MFNDIYRIIASYIIDYYRQEVVLEVADLISSIEDENLVQNIIEISQLGLPKLKDTKAIDD YIETIKEKTVMVKKEELTKALADTFDPKQKAQILKEIIALKDKE >gi|223714148|gb|ACDT01000067.1| GENE 11 10510 - 11763 1650 417 aa, chain + ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 10 416 14 372 372 419 63.0 1e-117 MKKKKNVTAVSFDDLKQIVLNDAKKEDGVLTQDQIDNYLSTYDLNDDLAEELLEFIGNND IVISDDGLDDLDVEDEVLLAGVPAELDDDLGLDDDLGLDGDTPDLDFAGDFDMMTGDTIH MYQVADADAELDQNQLGSNVKINDPVKMYLKEIGRVELLTHDEEIDLAKKILEGDEDAKK ELAAANLRLVVSIAKRYVGRGMLFLDLIQEGNMGLIKAVEKFDYTKGFKFSTYATWWIRQ AITRAIADQARTIRIPVHMVETINKLTRIQRQLIQELGREPSAEEIAEKMDGMTPEKVRE IQKISLEPVSLETPIGEEDDSHLGDFIEDEGAMSPDDYAANELLKDELNEVLLELTDREE KVLRLRFGLDDGRTRTLEEVGKEFSVTRERIRQIEAKALRKLKHPSRSKRLKDFLDR >gi|223714148|gb|ACDT01000067.1| GENE 12 11765 - 12457 700 230 aa, chain + ## HITS:1 COG:lin1490 KEGG:ns NR:ns ## COG: lin1490 COG2384 # Protein_GI_number: 
16800558 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Listeria innocua # 2 226 5 233 234 151 37.0 9e-37 MKLSKRLQLIADVISKYKQGSVLADIGSDHGYLPCYLVKNKIITCAYACDVAQGPLDSAK ETIKQYGLEDKVFALLGNGLNPILDREVDMISIAGMGSYLISEILEEHREYLRNVKLMFL QANANNDHLRKYLFANDWIIIDEQMVKDAGHIYEVMVVTARQNKAITYNRKDEEFGPILI NNQTPLFKEKWQKQYQVYEKIQNTLPHDHPRYHEIADKMKMIEEVLHECR >gi|223714148|gb|ACDT01000067.1| GENE 13 12444 - 13196 704 250 aa, chain + ## HITS:1 COG:CAC1303 KEGG:ns NR:ns ## COG: CAC1303 COG0327 # Protein_GI_number: 15894585 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 6 245 8 260 268 112 34.0 5e-25 MNAVKIINYLESIFPLNLQMSWDKCGIQVGDCNQQVTSIMVALNADIESLQKAIDQNCQM LVTHHPFLLEEIMNLDFHNHHGKFIEMAIKNNILVYSLHTCLDRGKDGISMNDWLINALG VHDVSCYDEFQVGKMALLNQPCMTSELVKKVKDTFNVPVRLAGKEVEIKTIAICGGSGAD DLEQLAGRVDAYITGDSKHRHAKYALDHDIVLIDVPHHLEVIMEKRLASLLSNLGITVKE ANSQDYYSYY >gi|223714148|gb|ACDT01000067.1| GENE 14 13616 - 14467 927 283 aa, chain - ## HITS:1 COG:lin2453 KEGG:ns NR:ns ## COG: lin2453 COG0561 # Protein_GI_number: 16801515 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 283 1 281 281 288 56.0 6e-78 MEKKIIFFDVDGTLVSDTGGIEHVPESAKRAIALTRAKGNLVYLCTGRSKAEIYDFILEC GIDGVIGAGGGYIEIGNQMLYHKKVTNKAVNHMVDYFDTNNFDYYVESNGGLFASKNLVK RLERIIYGDYENDPQAKARYEKKDSHFINALIEGQEMRRDDINKACFLENPNIPFNNIIK EFDREFNVIHCTVPAFGDNSGELSVPNIHKANAIETLINYLGIERKNTYAFGDGMNDKEM LEYVNIGIAVGNAKEGLKAVADEITDNIDNNGIYNSMKKHNLI >gi|223714148|gb|ACDT01000067.1| GENE 15 14609 - 15946 1347 445 aa, chain + ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 4 438 6 442 445 265 35.0 2e-70 MADQSTDLIKGSIYKSILWFSIPLLIGNLFQQLYNTVDSYVVGNYVNSNALAAVGASTPV 
INMLVGFFMGLATGAGVVISQYFGARRIEKMQRAVHSSLALTGVLCVVFTVVGLLCTQPL LKAIGVPQEVLPHSSMYLMIYFCGISFGLVYNMGSGILRAIGDSKRPLIYLIVASVVNIA LDFLFVCSFNWGIAGVGIATVIAQAISAVMVMYQLIHTKEDYRVQINEISFDKHIVRKIV QVGFPAAFQQSLTSFSNVIVQSYINSFGTAAMAGYSSTIRIDGFLQLPLQSFNMAITTFV GQNMGAKQYQRVRQGVKAAWLMCSIVILAGSITMGLFGSELVGIFTNDKAVIAAGVTMIN VFAPCYIILPIVQILNGTLRGAGLSKVPMYFMVGSFVILRQIYLMITVPLTHNLAFVFAG WPVTWAICAIGLLIYYRRVNWLPDE >gi|223714148|gb|ACDT01000067.1| GENE 16 15978 - 16889 368 303 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|34762725|ref|ZP_00143715.1| LytB protein; SSU ribosomal protein S1P [Fusobacterium nucleatum subsp. vincentii ATCC 49256] # 1 270 1 270 827 146 31 2e-34 MKVYPITPRGYCKGVVRAIELAKKQAHQDDVYILGMIVHNQYIVDALNNLGVKTIDKKGA SREELLDQVSQGTVIITAHGASHNVIKKAHSKNIQVINATCPDVIKTHDLIKNYLNQGIE ILYIGKAGHPESEGALSIDPAHIHLIENKTDFENLDPNLNYVITNQTTMSLYDVYDLCEY AKTKLNNLIVAKETCQATTVRQEAIAKINDEVDVIFIVGDPHSNNTKKLASIAHEKANKD TYMIESVNDIDISMLKEKHAAAVSSGASTPTYLTNQVIEYLRQFDYQNVSTHPKPKIDLA KII >gi|223714148|gb|ACDT01000067.1| GENE 17 16905 - 17258 395 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734406|ref|ZP_04564887.1| ## NR: gi|237734406|ref|ZP_04564887.1| predicted protein [Mollicutes bacterium D7] # 1 117 1 117 117 186 100.0 4e-46 MNPFYDENFYYNNFNPQMNPMNQGFPVNRKLKTRANLQKTLHYASKTVYTINQVIPLVYQ IRPIINNTKNAFRVIQAVNHINDIDFDEVEQNITPIDQEKKETKEASDDVKFENMVQ >gi|223714148|gb|ACDT01000067.1| GENE 18 17563 - 18933 1537 456 aa, chain + ## HITS:1 COG:L0340 KEGG:ns NR:ns ## COG: L0340 COG0513 # Protein_GI_number: 15672392 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Lactococcus lactis # 4 445 3 445 446 398 46.0 1e-110 MSTFTKFKMQEFCLQTIEELGFQKPTSIQEQVIPLVYKNSDIIGISQTGTGKTHAFLLPV MDRIDPKNNSVQAVITAPTRELATQIYNNAKLFTKYNSEIKVSLIVGGNDRQKTVNKLAV QPHVVIGTPGRIKDLSLDEQALKITTASTFIVDEADMTLEFGYLEDIDAVAGKMRDDLRM MVFSATIPQMLRPFLKKYMKSPVLVEIDDDKVTTENVEHILIATKHKNRYEVLKKIIESI 
DPYVCIIFANTRSDVASTTKKLREDGFKVGEIHGDLEPRERKQMMRRIQNNEYQFIVATD IAARGIDIDGVSHVINMEFPKEPDFYIHRSGRCGRGNYTGICYSMYDTTDEANLKVLERK GILFKNMTIKNGQFVDLGDRLKRNKRPKQQTELEKKVQSIIRKPKKVKPGYKKKRKREVE QVVRKQKRAMIQNDIRRQKKERARAAQAAKRNGEEN >gi|223714148|gb|ACDT01000067.1| GENE 19 18933 - 19793 1181 286 aa, chain + ## HITS:1 COG:SA1386 KEGG:ns NR:ns ## COG: SA1386 COG0648 # Protein_GI_number: 15927137 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Staphylococcus aureus N315 # 1 286 1 290 296 364 63.0 1e-101 MIIGSHVSMSGKEMLLGSVKEALSYGANTFMFYTGAPQNTARKPIDQLRIEEAKELMKEN GIEIDNVVVHAPYIINLANTTKPETFELAVSFLKQEIERCKAIGATRLVLHPGAHVGAGS QIGLERIVEGLNEATKDCGNVKIALETMAGKGSECGRTFDELQYIIENVDNNECLGVCMD TCHLHDAGFDLTKFDDILTEFDQKIGLERLLVVHVNDSKNECGASKDRHENIGYGHIGFD TLNMIVHHDKLKSVPKILETPYIEKIAPYKEEIEMFRTQTFNPRLK >gi|223714148|gb|ACDT01000067.1| GENE 20 19766 - 20035 241 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755662|ref|ZP_02427789.1| ## NR: gi|167755662|ref|ZP_02427789.1| hypothetical protein CLORAM_01177 [Clostridium ramosum DSM 1402] # 1 89 2 90 90 144 100.0 2e-33 MSNTSYNELFNLDFKMFKKLVYEQDIDIDEEQLLLIYNLIQNNRYALIDSHYNEVLYNYI SEKTSISTCLKIKSFLSNCPHYFNLGLKV >gi|223714148|gb|ACDT01000067.1| GENE 21 20096 - 21151 1129 351 aa, chain - ## HITS:1 COG:BS_yqfY KEGG:ns NR:ns ## COG: BS_yqfY COG0821 # Protein_GI_number: 16079562 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Bacillus subtilis # 3 346 9 352 377 419 63.0 1e-117 MKRNETRTIKVGNLTIGGNNEVIIQSMTNTKTKNIEATVKQINALAATGCQLVRLAVLNN DDALAIKEIKKHVSIPLVADIHFDYRLALQAIESGIDKIRINPGNIGSPEKVEMVVNACK KHHVPIRIGVNSGSLEKDILAKYGKPTAEGMIESAKKHVEILESLGFYDICISLKSSNTL LTIAAYRLASETFDYPLHIGVTEAGTKLGGTIKSSLGIGTILYQGIGNTIRVSLSDSPLE EIKVAKILLKELELIDNVPTLVSCPTCGRIQYDLIPVAQEIEDFLNTINANITVAIMGCA VNGPGEAKHADIGIAGGFKEGLLIKKGEIIRKVKQEDMVEELKKEILLMIK >gi|223714148|gb|ACDT01000067.1| GENE 22 21361 - 22767 359 468 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 20 468 7 426 447 142 25 2e-33 MENSKEEKVTSVSTENIYKLNGRVPLKKAIPFGLQHVLAMFVSNLAPVLIVCGAALVRGT GEHLTSAEITQLLQCAMFVAGIGTCMQLYPVWKIGSKLPIVMGVSFTFLGSLLVICTNPD LGYEGMIGAVILGGIFEGIVGLSAKYWKRFLTPVVSACVVIAIGLSLLSVGMDSWGGGNG AADFGSWYHLLVGTFTLVVCLVSRYLLKGVYKNLNILVGLIFGYLLSVIFTVANIAPMVD FSGITKTIQEVGFFSIPKLVFFTSHTPVFDLGAFFTIVIVFLVSAAETTGATTAVCTGAL ERDIKMEELQGSLAVDGLSSAISGCFGCIPLTSFSQNVGLVTMTKVINRFTILVGALILI LASLFPPLGAFFNSLPQAVLGGCTVMMFGSIMYEGIKMLKENEFDERTMIIVSLAFCIGV GLTQTSGNFFAAFPPVVGDIFNGNAVAGVFVVSLLLSLVLPKKKEAKK >gi|223714148|gb|ACDT01000067.1| GENE 23 22796 - 24571 1679 591 aa, chain - ## HITS:1 COG:BS_yjbG KEGG:ns NR:ns ## COG: BS_yjbG COG1164 # Protein_GI_number: 16078219 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Bacillus subtilis # 2 587 12 602 609 382 37.0 1e-105 MERKDIDIKDTWDLSTIYNSHEAFYQDLEKTKSLINDLITFKGKICNSIDSFYNFLTTKD TVERLMNKLYCYAHLNCDVEPKNQEYQTMMATIMGVLEASSVQISFVDNEIIEAKDKVLE YLKDPKLEKYTYKINSLLSYEAHILPKEQEELLAKVESIADTSSQVFNALRLEYEDVEED GKLKTLNNATLNQFLKNKDEKVRKQAYTNFFKEYQRYENVFAQTLAGVMKKDAFYADVRK FNSALEASVFDDDVPSELFFKVLESANVKYRHLFHRYNRLKKELLKKDVIYNYDLNIPLV SSINKKYTIDQCFEIINECLRPFGQDYLDIINKARNERWIDYYPTPGKRIGAYSSGSYDT NPFILMNFIGDYNSLSTMIHELGHSVHSYLSNHHQDAVNANYRIFVAEVASTVNETLLIN YMIKNAKNDQEKAYFIYEQLENCVGLIFRQPMFADFEYKLHTMAENNEPLSSKVITDLYA RLNQEYYGKDVTMDELVGWSCYYVPHFYYDYYVYKYTLGMTVALAIVNRILKGNQQQVTD YLNFLKSGGSKAPVELLKQAGVDPLDDQIYEDAFKYFEELLNQFETIKKNA >gi|223714148|gb|ACDT01000067.1| GENE 24 24605 - 24865 422 86 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167755666|ref|ZP_02427793.1| hypothetical protein CLORAM_01181 [Clostridium ramosum DSM 1402] # 1 86 1 86 86 167 100 1e-40 MANMKQQIKRYKNDNQKRAKNTSYKSALKTAIKKVLAEVEAGNKDEAVKAYNEVASKLDA SVVKGIHHRNYASRQKSRLAKLVNSI >gi|223714148|gb|ACDT01000067.1| GENE 25 25010 - 25969 1164 319 aa, chain + ## HITS:1 COG:no KEGG:GTNG_2447 NR:ns ## KEGG: GTNG_2447 # Name: not_defined # 
Def: germination protease (EC:3.4.24.78) # Organism: G.thermodenitrificans # Pathway: not_defined # 5 316 11 356 372 169 32.0 2e-40 MNFSNIRSDLATESLEQLEVGKHYQKEEYVNDGVKVEKVSILQAHQSINQGIGTYIEISF KDYLEQTAIIKEVVNNLKPLIDKIDYPKILMVGLGNQYLTNDAIGPRTLRDVRITHFLDD EDKLINHYYDILGIAPGVMVQTGMETGAIVKALVDKEDIDLVIVVDALCAKNYHKLAHVI QINDVGINPGSGIGNYRQAINEETVGAKVIAIGIPTVIYASSLVSDVLNYTLDYFGDSLN PISKLKVGKRERYHGSLNEQQKEMMLGQLGKLDDSELELLFNEILNPIDCNFVLSDKQID EQCEIMSKIISKSINALRY >gi|223714148|gb|ACDT01000067.1| GENE 26 26022 - 26771 791 249 aa, chain + ## HITS:1 COG:no KEGG:Bsph_3835 NR:ns ## KEGG: Bsph_3835 # Name: not_defined # Def: stage II sporulation protein P # Organism: L.sphaericus # Pathway: not_defined # 77 246 99 272 280 100 34.0 4e-20 MRLKNTFLILLKLAVIVGLIVYLPTKVSGSNAQVINAINNEQNNPTTPVATNPTNTSFVP TKDKKIHIYNTHQGEEYDGFNVVEGANYLKDCLNNKGYQCDVEGNDFEGYKAVHKIAYNK SYTVSKMYLEESLKLSGGYDLIIDFHRDSISKELSTISHDGKSYAKIMFVVGKSSGKFDA VNQLSQKLSDLANATVPKISRGVYVKKSHYNQGISDNMVLIEVGAQSNTKEEVQATVEII AKAIGEYLG >gi|223714148|gb|ACDT01000067.1| GENE 27 26768 - 27052 379 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755669|ref|ZP_02427796.1| ## NR: gi|167755669|ref|ZP_02427796.1| hypothetical protein CLORAM_01184 [Clostridium ramosum DSM 1402] # 1 94 1 94 94 154 100.0 2e-36 MKNFLIDLCLVGLLLFMINGLFGDYGIQKAVFDKNITQFEEDVKEQKPITTDYGTTNDNE ENTLSLVVKQISEVCIKVIETLVLIISNIISTLL >gi|223714148|gb|ACDT01000067.1| GENE 28 27075 - 28028 1143 317 aa, chain + ## HITS:1 COG:BS_fmt KEGG:ns NR:ns ## COG: BS_fmt COG0223 # Protein_GI_number: 16078636 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Bacillus subtilis # 5 309 3 304 317 314 53.0 1e-85 MKDAKIIFMGTASFSKQVLEKLLENNYNVVAVVTQPDRFVGRKKVLTMPEVKEVALASGI PVYQPLKIKEEYQEIIALEPDLIITAAYGQIVPEAVLNAPKIGCINVHASLLPKYRGGAP VHYAIMEGEEVTGVTIMYMVKKMDAGNIISQVEVPIGAEETTGELYERLSIAGAELLLET LPSVLAGKNESIAQDESLVTYSPTISHEQEKIDFSKSALRVYNQVRGMNPWPGAYTTYQG KVVKIWAGKIHVCENALKHHAHQENGTIVKIFKDAIGVKTGEGIYLITELQLAGKKRMLV 
KDYLNGNNIFEVDSKFE >gi|223714148|gb|ACDT01000067.1| GENE 29 28099 - 28359 452 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755671|ref|ZP_02427798.1| ## NR: gi|167755671|ref|ZP_02427798.1| hypothetical protein CLORAM_01186 [Clostridium ramosum DSM 1402] # 1 86 38 123 123 159 100.0 4e-38 MIISTLSSYPNKKVVKDLGIVVGYDDAIRAFRVTMGVEEYIVKAQENLAKAAEKLGANAV LGVCFDLRDSNKPVLMGTAVVLEDLE >gi|223714148|gb|ACDT01000067.1| GENE 30 28442 - 29743 1601 433 aa, chain + ## HITS:1 COG:BH1533 KEGG:ns NR:ns ## COG: BH1533 COG0213 # Protein_GI_number: 15614096 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine phosphorylase # Organism: Bacillus halodurans # 1 430 1 430 433 444 55.0 1e-124 MRMVDLIELKKNNHELTKEQIHFIIKGYNDGNIPDYQMSAFLMAICFNGLSKHETAILTD EMLHSGEIIDLSEIAGVKVDKHSTGGVGDKTSLVLGPMVAACGVKVAKMSGRGLGHTGGT LDKLESIPGLQIQISKDDFIKQVNEIGIAIIGQSATLVPADQKMYALRDVTATVDCIPLI ASSIMSKKLASGSDTILLDVKYGEGAFMQTKEEAKRLAQTMIEIGQHLGKDTRAAISNMN QPLGKAIGNSLEVIEAIDTLNGSGPEDLLNLCLEAGSHMLVQAKKCKNEESALKMLKRVI DTGQALDKFKEMVKYQHGNDEYIGNPELFTKAKLIIEVKAKQAGYIQTLEAKTLGIVSMK LGGGRQTKDDVIDHSVGIILNKKVGDQCKVGEVLAYIHANNEVSDELINELYGAYKIVDF FVEKPILIDEILV >gi|223714148|gb|ACDT01000067.1| GENE 31 29767 - 30606 851 279 aa, chain + ## HITS:1 COG:lin2453 KEGG:ns NR:ns ## COG: lin2453 COG0561 # Protein_GI_number: 16801515 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 3 279 4 281 281 111 28.0 1e-24 MGKYLFFDIDGTLIGPSKRVTKNTEEGIKKARANGHKTFLCTGRAPVSIMKSIRDIGFDG IISSAGGFVSIDGKYIFENFINQYVLSEVMLLFTNAKILFSLETKDALYQTPGVQDFFEK KQANILEGNLELARFLEERRNEEVRLPISEFNILQTKVTKVCFIAEDKLAFYDCVKYLSE FFNIVIFSKETDDFINGEIILKNCTKGDAMKRAVAYLGGDMKDTIAFGDSMNDFQMISEA AYGVVSYLAPDKLKAIADDTFEEPDDDGIFKCLQRLGLI >gi|223714148|gb|ACDT01000067.1| GENE 32 30634 - 30954 142 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734421|ref|ZP_04564902.1| ## NR: gi|237734421|ref|ZP_04564902.1| predicted protein [Mollicutes bacterium D7] # 1 106 1 106 106 194 100.0 
2e-48 MTLGKVLKRIRKTKQIPQKEIVKVLKISKSTVYRFEKGGNIDTNNYLKYCEYLGIDAGIP YFISNHDDFYRLFNRITRSSFDYEVFELAFSKALIKTAEQIFDNYT >gi|223714148|gb|ACDT01000067.1| GENE 33 31035 - 31862 867 275 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755675|ref|ZP_02427802.1| ## NR: gi|167755675|ref|ZP_02427802.1| hypothetical protein CLORAM_01190 [Clostridium ramosum DSM 1402] # 1 275 1 275 275 461 99.0 1e-128 MARKKSSSRTTLSYILSLLIAIFLSILVVLTVTRFTVMSPSFLISKMDGADFYKQSVVSL NEEIQQETQSTGFPIEMFENYVDEDTAKTAMEKYINNAFDGGETTINTTEFETKLTQDID NYLTTNNIIINSESQEAISVLKSNLIDLYKSYLSFPYLALVIDVINTYDGIFLIAAAALV VLIGLASFLLYRLYSHYQGRRRYFSYALSASGLMCFALPFFIYIGKFIDKISLSPAYFYN FFTSTLNSYMLTFVIVGLVMIVLANIVAYIKFERK >gi|223714148|gb|ACDT01000067.1| GENE 34 31864 - 32502 549 212 aa, chain - ## HITS:1 COG:BH3596 KEGG:ns NR:ns ## COG: BH3596 COG3764 # Protein_GI_number: 15616158 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase (surface protein transpeptidase) # Organism: Bacillus halodurans # 52 201 30 183 187 84 31.0 2e-16 MALVLVPLSFTIICYLVLFLIISPIIEPVMSIYSLVASEHAPNLNDNSNKDLYKNKLVSS ATIKASEVVMPSYGDLFGKITIDSVNKTINLYYGDGNDQLRKGAGLFNGSFLPGFNRTSL IAGHSLPYFECLGDVNVGDIIKISTHYGDFEYKITDTKIGKATDESNYDLAQSEKEQLIL YTCYPFELIGYKADRLFVYADKISGPTILEGQ >gi|223714148|gb|ACDT01000067.1| GENE 35 32535 - 33083 620 182 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755677|ref|ZP_02427804.1| ## NR: gi|167755677|ref|ZP_02427804.1| hypothetical protein CLORAM_01192 [Clostridium ramosum DSM 1402] # 1 182 1 182 182 274 100.0 2e-72 MKKLLSSLMVACLMFVATISTVGAVGSITADEQKIIDALEAGVTVDGVKVNLKAADINAA KNYLTNNDVTAAQVTTILNNIEGAKSYMQTNHIKDIASIKGAHATAILGFAQAAADVLGL TIKVGADGTVTVYKGSEVVYSTTNVIKKTGYDFTQTAVMAAGLVAVLFGAGYYAKRKQLL VK >gi|223714148|gb|ACDT01000067.1| GENE 36 33375 - 34352 855 325 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734425|ref|ZP_04564906.1| ## NR: gi|237734425|ref|ZP_04564906.1| predicted protein [Mollicutes bacterium D7] # 1 325 1 325 325 525 100.0 1e-147 MKNKKNSIFLIVVPILCILIVALFAYDRYQTNQLAKKFKQDNQTVDTSKNNNQTSLENKL 
PTIYCVGDSTTIGTSKNTSYPKYLENSINTTITTFGDDSINSLALAIKFGVSEIYVNNFT IPAKSEETTITLLDKNGQAVNAILTSKTNIDKCTINNIEGSINYDTANNRLVFKRSKAGE SSKISTLTKIEVTKPEINKDNILILFTGSYEESVQGSLAEYQKQIISAFNTDKYIVVSLT QDDRDATNNLLKTTHGDHYLDFKSYLLTSGLKDAGITETAQDKTNLANKNTPSSLLDDKI NGNSKYNELLAKQLTDKMTKLGYLK >gi|223714148|gb|ACDT01000067.1| GENE 37 34355 - 34717 281 120 aa, chain - ## HITS:1 COG:CAC3040 KEGG:ns NR:ns ## COG: CAC3040 COG0489 # Protein_GI_number: 15896291 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Clostridium acetobutylicum # 9 120 105 214 232 97 47.0 8e-21 FQFIEDESFVGKLSVLTSGVKVPNPSELLSSQIFEDFINELKGLYDFIIIDCPPIMLVSD AIPIGNVVDGTIFVCSSQLTGRKDAKASIEILQKNNVNILGTVLTQVEQDGSSNGYYYYY Prediction of potential genes in microbial genomes Time: Thu May 26 09:55:47 2011 Seq name: gi|223714147|gb|ACDT01000068.1| Coprobacillus sp. D7 cont1.68, whole genome shotgun sequence Length of sequence - 1933 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 198 161 ## gi|167755679|ref|ZP_02427806.1| hypothetical protein CLORAM_01194 2 1 Op 2 . - CDS 199 - 453 250 ## gi|167755680|ref|ZP_02427807.1| hypothetical protein CLORAM_01195 - Prom 475 - 534 1.7 3 2 Op 1 2/0.000 - CDS 542 - 976 548 ## COG4464 Capsular polysaccharide biosynthesis protein 4 2 Op 2 . 
- CDS 979 - 1662 791 ## COG3944 Capsular polysaccharide biosynthesis protein - Prom 1768 - 1827 9.2 Predicted protein(s) >gi|223714147|gb|ACDT01000068.1| GENE 1 3 - 198 161 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755679|ref|ZP_02427806.1| ## NR: gi|167755679|ref|ZP_02427806.1| hypothetical protein CLORAM_01194 [Clostridium ramosum DSM 1402] # 1 65 1 65 241 116 100.0 5e-25 MFKKNKKQSEKDTYLITSPDKKHGKIDFKEVYRHIRTNIEYSTVGKNVKAINVTSSIPSE SKSTT >gi|223714147|gb|ACDT01000068.1| GENE 2 199 - 453 250 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755680|ref|ZP_02427807.1| ## NR: gi|167755680|ref|ZP_02427807.1| hypothetical protein CLORAM_01195 [Clostridium ramosum DSM 1402] # 1 84 175 258 258 167 100.0 2e-40 MIQENAYQLLDNGMVHVIANDVHRPNGKRSVNLQETYEVLAKKYQAEDLTRLMYDNPLAI INGHAPELIKVRKISKLPWFKRRK >gi|223714147|gb|ACDT01000068.1| GENE 3 542 - 976 548 144 aa, chain - ## HITS:1 COG:SA2455 KEGG:ns NR:ns ## COG: SA2455 COG4464 # Protein_GI_number: 15928248 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Staphylococcus aureus N315 # 4 110 2 109 255 68 38.0 4e-12 MAFIDIHSHYAWNVDDGIETREDAQAALKKAQIQNITKIVATPHLTPGTTTAKEFEQIKE RIEELKKLAKEYQIEIYPGCEVMLNGDYLELLDNHQYLTINNGPYLLVEFNVTQKLPEDY DDRLYEYGIKRKNCYRSCRTLFSS >gi|223714147|gb|ACDT01000068.1| GENE 4 979 - 1662 791 227 aa, chain - ## HITS:1 COG:SA2457 KEGG:ns NR:ns ## COG: SA2457 COG3944 # Protein_GI_number: 15928250 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Staphylococcus aureus N315 # 11 225 4 220 220 103 32.0 2e-22 MDKNELNDEVEIDLSQLLKLLKKNIRLIIILALVGIIIAASATTFLISKKYQSQGSVLLK ADVVNGSLDSTQVNTNKMMVNNYVKLLQGNNIQDQVAKNLNITSAEVRSSLSITNTTDTQ IIEISSTTVDPGLSKRIVDETISVFTTLIQEKLDVTNVTIVDQPEVNPNPVSPSMVKNVI IGAVAGIVISLGYLLLTYLLDTKIKNGEQAEQYLGVPLLGIVPFFEE Prediction of potential genes in microbial genomes Time: Thu May 26 09:55:57 2011 Seq 
name: gi|223714146|gb|ACDT01000069.1| Coprobacillus sp. D7 cont1.69, whole genome shotgun sequence Length of sequence - 3143 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 18 - 77 1.7 1 1 Tu 1 . + CDS 102 - 1763 1700 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases + Term 1774 - 1821 1.6 + Prom 1773 - 1832 3.4 2 2 Tu 1 . + CDS 1883 - 2404 445 ## Elen_1920 NusG antitermination factor + Term 2518 - 2567 -0.9 + Prom 2771 - 2830 5.8 3 3 Tu 1 . + CDS 2947 - 3142 148 ## gi|167755683|ref|ZP_02427810.1| hypothetical protein CLORAM_01198 Predicted protein(s) >gi|223714146|gb|ACDT01000069.1| GENE 1 102 - 1763 1700 553 aa, chain + ## HITS:1 COG:BH3718 KEGG:ns NR:ns ## COG: BH3718 COG1086 # Protein_GI_number: 15616280 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Bacillus halodurans # 23 492 103 566 608 454 48.0 1e-127 MSAIANILICELTGNPLPRSCYFIYFLLLVLMVGGSRFIYRFIRLYAARHNIRGRKEQRP LEKVMIIGAGVAGEKVYKEILGSKSIYKEVICFIDDEPSKWNRTIHGVSIYGGRDKIIEA VNKYKIEEIMVAMPSASKRDLIDIFNICKETKCKLKKLPGIYQFINEDVHISDLKEVEIQ DLLGRDPIKVNLADIMGYVTDKVVMVTGGGGSIGSELCRQIAANKPKQLIIVDIYENNAY DIQLELKHNYPELNLETLIASVRNEVKVNKLFEIYHPDIVYHAAAHKHVPLMEDSPNEAI KNNVFGTLNVARAADKYNAQKFILISTDKAVNPTNVMGATKRLCEMIVQTYNKRSQTEYV AVRFGNVLGSNGSVIPIFKRQIKEGGPVTVTHPDIIRYFMTIPEAVSLVLQAGAYAKGGE IFILDMGEPVRIADMAKNLIKLSGYEPDVDIKIEYTGLRPGEKLYEELLMEEEGLQDTPN HMIHIGKPIEMNEDTFVERLINLKEAAYGETDDIRSLIKELVPTYQYGITPKRRKTPEIK KDSHVQVSKTVNI >gi|223714146|gb|ACDT01000069.1| GENE 2 1883 - 2404 445 173 aa, chain + ## HITS:1 COG:no KEGG:Elen_1920 NR:ns ## KEGG: Elen_1920 # Name: not_defined # Def: NusG antitermination factor # Organism: E.lenta # Pathway: not_defined # 5 173 2 170 170 150 41.0 2e-35 MYENWYVIQVRTGKEEKIKNTCEKLISRDILEECFIPKCIRLKKYQGTWKEVDEVLFKGY 
VFMISDHIDELFNKLKLIPDLTKVLGNDGEFICPILKEEAIFLLKFGKEKHIVDLSQGYI EGDKINIISGPLVGYEGMIKKIDRHKRVAYIEVKLFDQITMVQVGLEIIGKSS >gi|223714146|gb|ACDT01000069.1| GENE 3 2947 - 3142 148 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755683|ref|ZP_02427810.1| ## NR: gi|167755683|ref|ZP_02427810.1| hypothetical protein CLORAM_01198 [Clostridium ramosum DSM 1402] # 1 65 1 65 241 113 100.0 5e-24 MLGRKRKKAVKSTHKIIDKSNKNNHVDYQEVYRNIRTNIEYSAVGKNVKAINITSSISNE GKSTT Prediction of potential genes in microbial genomes Time: Thu May 26 09:56:07 2011 Seq name: gi|223714145|gb|ACDT01000070.1| Coprobacillus sp. D7 cont1.70, whole genome shotgun sequence Length of sequence - 6621 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 3 - 365 281 ## COG0489 ATPases involved in chromosome partitioning 2 1 Op 2 . + CDS 375 - 827 225 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases + Prom 962 - 1021 7.9 3 2 Tu 1 . + CDS 1105 - 2175 1097 ## COG3210 Large exoproteins involved in heme utilization or adhesion + Prom 2269 - 2328 8.1 4 3 Tu 1 . + CDS 2435 - 2932 424 ## gi|237734434|ref|ZP_04564915.1| predicted protein + Term 2948 - 2984 2.1 5 4 Tu 1 . - CDS 2986 - 4125 984 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Prom 4206 - 4265 2.4 6 5 Op 1 . + CDS 4292 - 4615 369 ## CLJ_B3354 anti-sigma F factor antagonist 7 5 Op 2 6/0.000 + CDS 4615 - 5049 523 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 8 5 Op 3 . + CDS 5042 - 5764 738 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 9 5 Op 4 . + CDS 5795 - 6310 600 ## COG0681 Signal peptidase I + Term 6329 - 6369 3.4 + Prom 6318 - 6377 8.7 10 6 Tu 1 . 
+ CDS 6442 - 6619 59 ## gi|167755692|ref|ZP_02427819.1| hypothetical protein CLORAM_01207 Predicted protein(s) >gi|223714145|gb|ACDT01000070.1| GENE 1 3 - 365 281 120 aa, chain + ## HITS:1 COG:CAC3040 KEGG:ns NR:ns ## COG: CAC3040 COG0489 # Protein_GI_number: 15896291 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Clostridium acetobutylicum # 9 118 105 214 232 88 40.0 3e-18 FQFIEDESFEGKLSVLTAGIKVPNPSELISSDIFEEFINELMKLYDFIVIDCPPVMLVSD AIPIGNVVDGTVFVCSSQLTGRKDAKASIEILQKNNVNILGTVLSQVEVEKDKYNNYYYY >gi|223714145|gb|ACDT01000070.1| GENE 2 375 - 827 225 150 aa, chain + ## HITS:1 COG:Cj1120c KEGG:ns NR:ns ## COG: Cj1120c COG1086 # Protein_GI_number: 15792445 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Campylobacter jejuni # 5 126 6 130 590 58 28.0 6e-09 MRKFSVRRIILILIDLICIILASFLALFLRFNGEIPSYYFNILKNLMGLEIFISLVIFYL FKLYRSLWQFASINELKNILSAAVIDSIANVFLFEIADKSLPLSCYFIYFLLITMFLGIT RLIYRMTRLYRTEKGSDNYKERRSLEKVRD >gi|223714145|gb|ACDT01000070.1| GENE 3 1105 - 2175 1097 356 aa, chain + ## HITS:1 COG:PA4625 KEGG:ns NR:ns ## COG: PA4625 COG3210 # Protein_GI_number: 15599821 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Pseudomonas aeruginosa # 121 307 1762 1955 2154 75 34.0 2e-13 METYDGTSKSLIGKVYDKSGNAFIDAKLVYKGVDNSYNSTIPPTAAGRYRVVASYSGDDT HYAAIPKVASLVIKKADKAIVTVNDINNNYGEEYKYSYSLEGVVARDIDTLVSSIVCVTD NNENVGNHKIKAIINEDVAKNYKEVVVVKGKHTITPKAININIDNINTTYGDDEKVLTFS LAVDSTLAYGDTLADLGITLERESGNDVGEYKITGDASNENYDVTFTDGTYTIASKKVQL IIDDKMKVEGEIDPEFTYNIIGLVNGDIFTVELTREPGEKAGKYVINGKVEGANNYNLEI VTGTLTVNAKTALPVKPVDKGTPTGDNVDLNTLIALAGISALLLLIILRKIHSLNK >gi|223714145|gb|ACDT01000070.1| GENE 4 2435 - 2932 424 165 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|237734434|ref|ZP_04564915.1| ## NR: gi|237734434|ref|ZP_04564915.1| predicted protein [Mollicutes bacterium D7] # 1 165 1 165 165 265 100.0 1e-69 MSLVKNMDNWSDMSKLDLLYKFSIIKYLRYFIIPELFGLSILVIASIKIRGIFIFIIIVV IMMLYQINKISQSFIKEKAGKLIYRQKTSFFRSLRNFDECTYSDYCIRKIKKVKSFFNYY VINGEIVKRSFIAYMDDDEICIDEKYCTVLKIPKAYHGLDKLRDF >gi|223714145|gb|ACDT01000070.1| GENE 5 2986 - 4125 984 379 aa, chain - ## HITS:1 COG:BH1535 KEGG:ns NR:ns ## COG: BH1535 COG1686 # Protein_GI_number: 15614098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 1 367 1 372 387 311 44.0 1e-84 MKKLLITLLVVCLLPISRVQALDNDVTPNAKSAILIEANSRQILYDKNAEEKLYPASTTK IMTMILMFEALRNKKISFDDEITTSKYAASMGGSQVYLEVGESLSLKDMFKSIAIASAND ASVAVAEHIAGSINKFVEMMNDKAKKLGLKNTHFKNATGLHDPDHYTCAHDLAIMAAYLI EIGGEDLFAVTSLYDSYIREDTKQSFWLVNTNKLLKLYNGVDGLKTGYTKEAGYCLVTTA QRDGQRIIGVLMKETAPKTRNEEMCSLLDYGFNNYKQVTIFKKDTLIEQHLVDKMDNLTI DVKCKKDIVYTKAKSADAKVSTKVNYKENFLPVKQGDVIGTLDVIVDGKTIATYDVYSDT NASKSTYLSKTIKTLKYLF >gi|223714145|gb|ACDT01000070.1| GENE 6 4292 - 4615 369 107 aa, chain + ## HITS:1 COG:no KEGG:CLJ_B3354 NR:ns ## KEGG: CLJ_B3354 # Name: spoIIAA # Def: anti-sigma F factor antagonist # Organism: C.botulinum_Ba4 # Pathway: not_defined # 12 105 10 104 111 70 42.0 1e-11 MENKIEYKILDDKVIVYFYGELSCSNVVKYRSLLNSILDKGNGPVYFDFLHTNFIDSSGI GLVLGRYNQLKLDDRQLYLANLSKTSYKVFELSGMFDLMEYVEEVKG >gi|223714145|gb|ACDT01000070.1| GENE 7 4615 - 5049 523 144 aa, chain + ## HITS:1 COG:BS_spoIIAB KEGG:ns NR:ns ## COG: BS_spoIIAB COG2172 # Protein_GI_number: 16079403 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Bacillus subtilis # 2 141 3 141 146 138 51.0 4e-33 MNQMELSFNARIENEPFARTSVASFITSLNPTIDELVEIKTIISEGVSNAIIHGYQNNPE CKVIIKVTIEDNRQIKIIIQDYGKGIEDIDEAKIPMYTSLKELEHAGMGLTIIEALCDNL EIHSTPDLGTKLTIKKQLKDSNLG >gi|223714145|gb|ACDT01000070.1| GENE 8 5042 - 5764 738 240 aa, chain + ## HITS:1 COG:BS_sigF 
KEGG:ns NR:ns ## COG: BS_sigF COG1191 # Protein_GI_number: 16079402 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus subtilis # 8 238 21 251 255 229 51.0 3e-60 MAKSNTIELIEQAHRGDDRAKELVVSQNLGLVWSIVHRFKNNYYDKEDLFQIGCIGLMKA VNNFDTNYGVQFSTYAVPIIMGEIKRYFRDDGTIKVSRSLKELNLKINKARELLTNQTGQ DPTVEQIAAYLDVDVQDVVEAIDSSYYPTSLSEPIYEKDGSSISMEERIENKHNKMWFEK IALEMEIDKLDEKEKLILYMRYQLDFNQERVAQRLNISQVQVSRLEKKIIAKLRSHLNEQ >gi|223714145|gb|ACDT01000070.1| GENE 9 5795 - 6310 600 171 aa, chain + ## HITS:1 COG:alr2975 KEGG:ns NR:ns ## COG: alr2975 COG0681 # Protein_GI_number: 17230467 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Nostoc sp. PCC 7120 # 14 166 23 179 190 96 37.0 2e-20 MKEATQSKKSILLEYLKVIVLTLVITYGVLYFIQISRVQMTSMVPTFKEGNIVLVDKVLY KNSSPQRNDIVIVDYKDANQKEKHIIKRVIAIGGDHVEIKDNIVYLNGKKLDENYVNGVM ANNEDMSINIPEGKVFVMGDNRNNSLDSRRLGYFDFKEDVVGKVFFTVPFS >gi|223714145|gb|ACDT01000070.1| GENE 10 6442 - 6619 59 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755692|ref|ZP_02427819.1| ## NR: gi|167755692|ref|ZP_02427819.1| hypothetical protein CLORAM_01207 [Clostridium ramosum DSM 1402] # 1 59 7 65 177 105 100.0 9e-22 MEEAIQSKKSILLEYLKVIVITLIVTYGILYFVQISRVYGTSMIPTYHEGNIVLVDKVF Prediction of potential genes in microbial genomes Time: Thu May 26 09:56:21 2011 Seq name: gi|223714144|gb|ACDT01000071.1| Coprobacillus sp. D7 cont1.71, whole genome shotgun sequence Length of sequence - 1056 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 132 - 758 672 ## gi|237734112|ref|ZP_04564593.1| predicted protein + Term 877 - 926 1.2 Predicted protein(s) >gi|223714144|gb|ACDT01000071.1| GENE 1 132 - 758 672 208 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734112|ref|ZP_04564593.1| ## NR: gi|237734112|ref|ZP_04564593.1| predicted protein [Mollicutes bacterium D7] # 1 208 1 208 208 344 100.0 2e-93 MVLFETVRDIMFAEIMEATLTEISKELLRANELKEKELALKSKELELLNSGEFKYNYIER YKRAFFDACNRLEELDRILYEMDQTVEPMSNEEYKEAILSNIQNMDKNQYDCDKAKDYEV SSKEFIDYTNRLDSAKLEGKEYYKKINEKFGDSFMYYDIEERLYVCYDFLYGDVGLPNIE TIKDRKMAIKWLEGKIEMNEVIHNSYIR Prediction of potential genes in microbial genomes Time: Thu May 26 09:56:33 2011 Seq name: gi|223714143|gb|ACDT01000072.1| Coprobacillus sp. D7 cont1.72, whole genome shotgun sequence Length of sequence - 11685 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 190 - 249 8.8 1 1 Op 1 1/0.000 + CDS 367 - 2289 1272 ## COG3711 Transcriptional antiterminator + Term 2329 - 2371 3.3 + Prom 2368 - 2427 12.9 2 1 Op 2 8/0.000 + CDS 2585 - 3916 1350 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 3 1 Op 3 . + CDS 3919 - 5424 1362 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 5578 - 5618 -0.1 + Prom 5517 - 5576 7.2 4 2 Tu 1 . + CDS 5754 - 9224 4025 ## LSEI_0291 lacto-N-biosidase + Term 9227 - 9261 -0.5 - Term 9214 - 9248 2.2 5 3 Tu 1 . - CDS 9274 - 10419 361 ## COG0477 Permeases of the major facilitator superfamily - Prom 10456 - 10515 11.3 + Prom 10458 - 10517 9.2 6 4 Op 1 . + CDS 10573 - 11358 792 ## gi|167756170|ref|ZP_02428297.1| hypothetical protein CLORAM_01700 7 4 Op 2 . 
+ CDS 11434 - 11634 241 ## gi|237734119|ref|ZP_04564600.1| predicted protein Predicted protein(s) >gi|223714143|gb|ACDT01000072.1| GENE 1 367 - 2289 1272 640 aa, chain + ## HITS:1 COG:BS_licR_1 KEGG:ns NR:ns ## COG: BS_licR_1 COG3711 # Protein_GI_number: 16080911 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 492 2 494 499 194 28.0 3e-49 MNKRNYQILKYLKETEEYVTSETLSAMNNVSTKTILKDIKSLNCDMKATDNYIEVTPSRG IKLVINDIEEFINLCNSFNDTNDFSIYSVTEREEWIQKYLIENNGWVKAEALCEKLFISQ SVLSQNIKSVRKAFQKYDLILLQKPHYGMRVEGREFNKRLCLAQIYITHIDQREGFPGTQ FNNDELQLILKIEDIVDRILTRYQISMSEVSVQNFIIDIYISLKRIKKGIFLKSTDKMVI DIARWTDSIVAVELAKLINDELTIEMKDQEIVSLSIHLAAKRIICHFDESIHRIIEDFDV NKLVNNMINNINCKWGIDLTQDEELKSQLVLHLIPLEVRSRYNVVLHNPLIDKIKQQNIF AYQMAVTACDQFSDYHGNRLSEDEMGYIALHMNLALLRTQIKNKKNILVVSGLGRGTAHT LAYQIKEMYGKYINEVKTADYIELNNYDFTNINLLISSIPLRRDFSVPSIEVNYFFSDND KKRIETILCDQEVFKMRDYIDKRMVLTSINADTKEEAISFIINSTNYDKNIIQEIIENDK VANHELDNMISILSLNGISANNQTEVIIGILSKPILWNEKRVQLIIVPLIGEPINSKILN LYHELAYMIKNPLYVKRIIRKKTYEEIIAIFEEIETVLER >gi|223714143|gb|ACDT01000072.1| GENE 2 2585 - 3916 1350 443 aa, chain + ## HITS:1 COG:lin0033 KEGG:ns NR:ns ## COG: lin0033 COG1455 # Protein_GI_number: 16799112 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 3 443 7 451 452 223 33.0 6e-58 MKLSEKLQNGLLAFSAKIEANKYLGSIKDAFTMYVPFIIIGSFGSMLNILVSGTSGLAQW FPWLIKLSPAFTAMNFVTISCMALPIAFLIGTKLAQREGLPTLESGLIGLLAYLAVCPNA ISTAVEGLKDPIIISGLGSGVIGAQGLFVSMLISIMAIELLKGLSKIDAIKIKMPDSVPT GIARSFNILIPIFIIIALFAICGCLFESFTGNYLNVWIYNIIQIPLQALANTTIGIVVLS LVNQLFWFLGIHGGMVIEGIRGPLSAAGLADNIAAVSAGHVATNILTRGFWTSFVVVGGG GITLSLLIAILLFSRREDHKAIAKFSIVPGICGINEPVVFGLPLVLNPIFAIPFICNSAI AALIAIFATNIGFLTCGIVDCPPGLPIFITGFISYGLHGIIIQAIILVVTFFVWVPFVSV SNKQAAIEDATEKSEKDKNMNGE >gi|223714143|gb|ACDT01000072.1| GENE 3 3919 - 5424 1362 501 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # 
Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 500 1 472 473 511 51.0 1e-144 MKFPTNFLWGGATAANQYEGGWQEGGRGIAVHDMLTDGNQENPRRIYCKADNGNITSIQA GECIPFGFHPFIKESEYYPSHNATDFYHCYKEDIKLMAEMGFKVYRLSISWTRIFPHGDD KEPNEDGLRFYDAVIDELKKYNIEPLVTILHFDMPLNLAEKYGGWINRKLIEFYLNFAST LFERWKTKVKYWITVNEINVLGGYWTLGLASNNRKTKSNSNQGETPVEDAGIKLQAIHHL LVASAKANKIARKINPNFMLGTMCALSGIYPMTCKPEDVFGAYEFRRKALMFSDITMRGY YPNYAQSIFDEYKFTLKKELGDDEVLKENTSDYLAFSYYRTTAFDCDASNTTTTGGQQAS PNPYLQKTAWGWPIDPKGLRFVLNELYDRYQKPLFIVENGMGNYDIVEEDGAIHDDYRIE YLRSHIVEFKKAVEIDQIPLLGYTTWGCIDIVSAGTGEMAKRYGFVYVDLDDKGNGTFKR KKKDSFYWYQKVIESNGEILG >gi|223714143|gb|ACDT01000072.1| GENE 4 5754 - 9224 4025 1156 aa, chain + ## HITS:1 COG:no KEGG:LSEI_0291 NR:ns ## KEGG: LSEI_0291 # Name: not_defined # Def: lacto-N-biosidase # Organism: L.casei # Pathway: Other glycan degradation [PATH:lca00511]; Amino sugar and nucleotide sugar metabolism [PATH:lca00520]; Metabolic pathways [PATH:lca01100] # 994 1090 401 495 569 69 44.0 8e-10 MKKQLKKIMTIVLTAGMVISLLGTPSFAVEHVGTSDAAKEMYKNTYANLTNRITNRGYAP TSLTGAYEGMFIRDSSIQVMAQNAYGDYQKSRQVLNFLLSYQQGLGADYAQHIVPDYEDE EFGNSYLSQGETAQTQNLYTSQTFSDHALFGLKGPNNGGGVAFQTDSDKVTSISIFTNST GNDVLTGSLRTNITDASTELISKEITVTASGWQTFTFDQPITVKKGETYHFFVQSNNGIA MFGKADGKPADNAIMGYNYDVPVLGGFAESAYPAFKLNGNDDEKTDNSFMSNTDHSIALF LINATTNNSAFAFVPNSDSICSFRVYLTGSAPVTFKLTTEPGNEAKTIKTISQTVSSNGW QTVDFGENIPVTPGQKYYVHITTTSGQVIACGSSMFAGSVASMNWENNVWSQTGYVIAAE VFSEVNIAQSPYAQKISLNGDAIQAVSFTADAKVAGGNLIGELRESLNSKAIASGEIAIE KAGVQNYQIDFGKEVKVAADKDYYFVLWAENNDYVQWNYNPALKASAYSLNGDNWDSHDF DFELDILPLYSGDYRVPLMTIGDKNIGIQEIPSLDQYVTAVDVLLGKNNLEEGIAIATLY KGYGAEAIKVDDAAIDISKLSSSGQMVHLEFSLSLEQVDKTQSYYLEIKAPQNSDNTITW YGSKNVDNLATLYNGTAVAGEAGYVAYKSNLHTLSNHVQIDGNYMLIHAWAQFVNGCKKT VENQEFIKKSYPIIKRFANYYIDQGYISDEYNLMRNDSFEHSREGRYWKSYDLITNVFAS QALYELSALAKEFGDSDSAIKWADTSATLTKGINEYLTTEVDGVKIYAELYDIDNNMKFI 
KGMSWVNWAPMAAQWYAMDEKIMQNTYDIYAKYDSQDYDDYKMLDVVYDFDTQQSGNHII GKGLAWEIMFNRVKDDNEKLDYLASFILDHSTPGGVYPETYQPNGNFSDVGNQEHASWQH YAMSCAYPELTKTYSLQILEELINETQVLDAKLYTLDSYAKLSRALDAAISGYNVENITK EQCDALYEALLAAKNELTYLDADYQAVDAAIVAAEKLNKDDYKDFSKVEAAINAVVRGKN ITQQAEVDAMAKAINDAIDTLEKKEGSVTPINSIKPVDKKPVVPNTGDENLMSLYISLIG IMAGLFVIIKKRKNIY >gi|223714143|gb|ACDT01000072.1| GENE 5 9274 - 10419 361 381 aa, chain - ## HITS:1 COG:SP1600 KEGG:ns NR:ns ## COG: SP1600 COG0477 # Protein_GI_number: 15901440 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Streptococcus pneumoniae TIGR4 # 1 342 1 347 383 63 23.0 9e-10 MKNKLILLSILSVALVSSITGSIIPGMNLLYQEYSAYPMSTVSLIITIPSIAVIPTSYFC SKFVGTKVSYRTILIAMNAALLFGGTGPFFIDELYSRLFLRLLFGIGIGIAISLNKSIIL EYYTDNRQIKYLGYTTIISSLGAIFFQTLVGYLSKISTDFMFLGYLPLVITLILSFYIPD IPLKAPSKSTIKKEKNIKLLTISIFFLILNMLMSSIGINLSTLFATKSFADIAVITVHAN NINSICSISSGLLFNKIYNKFKNNFLPISLFLCALSLAIYIFGKSYISMYITSGIFGFTY NTIILYIFLIAAQTAKEKGNAAVGGYMGIAAGVGGFLPSFLVYLCQLFFNESIYSVLFIA VMFLVFVGFITIKFLPAKIKS >gi|223714143|gb|ACDT01000072.1| GENE 6 10573 - 11358 792 261 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756170|ref|ZP_02428297.1| ## NR: gi|167756170|ref|ZP_02428297.1| hypothetical protein CLORAM_01700 [Clostridium ramosum DSM 1402] # 1 261 1 261 261 476 100.0 1e-133 MKNDCFSFIKFCNERIANSNDYNIARTIINHIEDIKILSLEKIAEEANISPASVSRFINK AGFESFQAFKYEFELFTRDVKMRRIISHTQRFMRTTIENMSESLYVDGLANLRQTKLNLD IDKLKEIVKLLKTSKSVTFIGDSHELADFYTLQLEMLVNDIPAYLINFYEFEKMYLNQLD HNDTIVFIGVYKEWFSDAQKEILDYAKERKIKIIAFVQEEDYLKEYADLLYVYGIPDSYN DGYYSLPYLNRLLCEMIYFKL >gi|223714143|gb|ACDT01000072.1| GENE 7 11434 - 11634 241 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734119|ref|ZP_04564600.1| ## NR: gi|237734119|ref|ZP_04564600.1| predicted protein [Mollicutes bacterium D7] # 1 66 1 66 66 113 100.0 4e-24 
MSAGKYQIIETIMSYTSQYSFAELNKFSVQELYNILAIFNQEEVPEGCQPGYNQDSHHRK
NHNQPK

Prediction of potential genes in microbial genomes
Time: Thu May 26 09:57:06 2011
Seq name: gi|223714142|gb|ACDT01000073.1| Coprobacillus sp. D7 cont1.73, whole genome shotgun sequence
Length of sequence - 25029 bp
Number of predicted genes - 17, with homology - 17
Number of transcription units - 8, operones - 5 average op.length - 2.8

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 1 - 661 597 ## COG2188 Transcriptional regulators
    - Prom 681 - 740 7.1
    + Prom 653 - 712 8.0
2 2 Tu 1 . + CDS 953 - 5095 4101 ## BVU_3139 hypothetical protein
    + Term 5105 - 5135 2.0
    + Prom 5100 - 5159 9.2
3 3 Tu 1 . + CDS 5355 - 10049 5502 ## COG3250 Beta-galactosidase/beta-glucuronidase
    + Term 10061 - 10093 -0.8
    + Prom 10096 - 10155 7.2
4 4 Op 1 . + CDS 10178 - 10732 552 ## COG1704 Uncharacterized conserved protein
5 4 Op 2 . + CDS 10734 - 12479 616 ## PROTEIN SUPPORTED gi|219668460|ref|YP_002458895.1| ribosomal protein S14
    - Term 12489 - 12548 11.0
6 5 Op 1 . - CDS 12564 - 13982 1741 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases
    - Prom 14005 - 14064 6.2
7 5 Op 2 . - CDS 14126 - 14362 271 ## gi|167756178|ref|ZP_02428305.1| hypothetical protein CLORAM_01708
    - Prom 14393 - 14452 7.5
    + Prom 14345 - 14404 12.0
8 6 Op 1 . + CDS 14429 - 14689 393 ## gi|167756179|ref|ZP_02428306.1| hypothetical protein CLORAM_01709
9 6 Op 2 . + CDS 14704 - 15558 739 ## COG1307 Uncharacterized protein conserved in bacteria
10 6 Op 3 . + CDS 15570 - 16541 1175 ## COG2355 Zn-dependent dipeptidase, microsomal dipeptidase homolog
    + Term 16553 - 16590 5.1
    + Prom 16699 - 16758 7.6
11 7 Op 1 . + CDS 16839 - 17654 849 ## Sterm_2094 transcriptional regulator, RpiR family
12 7 Op 2 11/0.000 + CDS 17691 - 18533 512 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase
13 7 Op 3 .
+ CDS 18530 - 19084 353 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 14 7 Op 4 . + CDS 19114 - 20361 1595 ## COG1876 D-alanyl-D-alanine carboxypeptidase + Term 20362 - 20392 2.0 15 7 Op 5 . + CDS 20402 - 22027 1293 ## COG1032 Fe-S oxidoreductase + Term 22054 - 22097 4.7 + Prom 22038 - 22097 7.1 16 8 Op 1 12/0.000 + CDS 22181 - 24484 2632 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 17 8 Op 2 . + CDS 24492 - 25007 506 ## COG0602 Organic radical activating enzymes Predicted protein(s) >gi|223714142|gb|ACDT01000073.1| GENE 1 1 - 661 597 220 aa, chain - ## HITS:1 COG:BS_yydK KEGG:ns NR:ns ## COG: BS_yydK COG2188 # Protein_GI_number: 16081065 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 1 217 1 218 236 125 33.0 4e-29 MRKYETISIDLKKRIENGEFEETKKLPTGGQLMKEYDASKNTISNAINLLVNDGLLFAIH GSGFYIRKPNKGTIKLNNTIGFSAEHPGEKLERDILSFKLIDCDTDLAEHLECKVGTPVY AIKRLMYIDDIPFAVEYTYYNKDIIPYLNEDIAKTSIFSFIINDLKLTIGFSEKYIFAKP LSKEDQELLNLEPGAFGLIIDDNVFLSNGKKFNYSKICII >gi|223714142|gb|ACDT01000073.1| GENE 2 953 - 5095 4101 1380 aa, chain + ## HITS:1 COG:no KEGG:BVU_3139 NR:ns ## KEGG: BVU_3139 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 812 1231 10 400 417 177 31.0 2e-42 MKKRGKIVSLLVIVAMFFSYIVPIKAQGNWQQHLDQTILKVGYTEVYNDNKPVVPNANLT ITITGTGLSGKEYNTVLIDEEAREVEATKQNGVINGDTTQLVLQVPGTLSGGQYQVRTTI DSESVTNGLILLNNVNLSSKISTNVTGYSGVIGDLIDGIQWAMYTPVTNKPTRENPGIFT FDYGEDLIDVYSLDLYSTYGNDQGPKVAEILYWDDSSQTWENILNNGSREFEFRWDKNIE CEPFRIQFDQVYSTNKIMLKTLDSYTSWENTNDIVEVQVWGYKSGELPSIQISSSQGNYG VKNIVPMVIKNKNKSTTINLELVDFKSHQSLARPVKYQVEMKNGKLNYEFVIPKTVPVGT YSIHVTCEGLDLYSDEYRISLDQDINIENCSKYLDISLVNDQAYKITSITDGDIKSGWSN NGNTSLLFEVKPENNKFANIALKKMRIYTSNKNIETVTLKGIANTNYYRNVNVNENGMIN DLAGDFKPDWKSDKNGDYFEVDPGYLSLGNIFILELQSNENIVINEIETEGVYLKDNLLS NARVYHNGENIDAAGMVDNNVSSYYKTTYSDSDEYILDFSPRTITTNELVYTAHFPNSQG IGNLTVEIWDGNSWKNVDTFDFKHKTDNQIRESQQLIFNELTTSKIRLIVNKANLVWDNS 
YLFTDLNAIGTINNNAQYYAESIDTNIVVNQGESKFPIPHLKGYEVSINSSDHEEIIDLN GNINAPLTDTKVKITLKVSSEDGQDVGITDEFEVVVPRRVLSGNKLISTIVDNETIYNNP SMGWVQYYEFQDTDVDKYWAEMDELYAQGLKTNILYIRNPWSWYEPSEGNYAWNDPDSRL SKLISGARARGIQLAFRVLVDSSDAFQQATPEYVFKAGASWYKTDRTDDTAPEIDAKDPY INDPVFLEKLDKFIKAFGAEFNNDLDIAFIDGMGFGNWGEAHHVKFSTEWDDNVYDAVEK VIKIYDKYFPDVLLGAQEGQPENYGTQENDEIINTDKPYTGAFDKEYDFVVRRDTFGWMT DAIRNQTLEWFNQGIPVFAENCYHSFKVREYWYNNAGYPTLDGILRQVVADALTCRANTL DARVVMDCQKWLENDQQNGSGLLNKFGLNGGYRLALTSLEVPELFTTGEEITIKQAWRNL GLGMLPNKNKHWDNKYHVAFALLDPKTNQVIYQYNEDTDKVNPGDWLFEDGNNHYETTFK LPNSIKSGEYKLATAIVNDKNNNLPEIALALKDAEVTEQGWYVLDSINVKNINDSDTTFG IFIEDTINGKVIADRSTVNAGESVTLKVVPDQGYELDQLLINEKAVEIKDNSYMIKNVTA DIRVRATFKKAGEVTVKPDESKDSSVVATGDEVQLFGYIGMFIGALVLLAELKRKFKHQS >gi|223714142|gb|ACDT01000073.1| GENE 3 5355 - 10049 5502 1564 aa, chain + ## HITS:1 COG:SPy1586 KEGG:ns NR:ns ## COG: SPy1586 COG3250 # Protein_GI_number: 15675473 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pyogenes M1 GAS # 5 870 3 891 1168 588 37.0 1e-167 MKGRWNQKLCAVATVSMLLSSAPVGVLAQENARITQSDPEEVYVDIIGDGQRTTLFNENW KFHRGDINDAQNKDYNDSTWETVNLPHDYSIDQDFTTSGEAESGFLPGGVGWYRKTFVVP KKYQEKQLMIEFDGAYMNAEVYLNGTKLGEHPYGYTAFAFDLTEGLICDGETENVLVVKT NNKIPSSRWYSGSGIYRDVKLTVTDKVHVGYNGTKIIAKDLENTKDNNVKVDVTATIDND SDTNKTITVKNTLLDAQGNAAGSTVTNEVELAAKSTKDVKQEVSVIAPKLWSTEDTNLYT MKTEIVADNKVLDTYETEYGFRYFDFDNNTGFSLNGEKIKLKGVCLHHDQGALGAVANRD AIERQVKILKEMGCNAIRVTHNPAASVLLEVCNENGMLVINEAFDGWTEYKNGNVNDYTS HFNEVISTDNQIVNGKAGMKWGEFDVKAMVDGAKNDPSVIMWSLGNEIDEGVSGNTSHYL NLVDDLIKWVQEVDTTRPVTNGDNRKNTNPNAMLSQINQKIYEAGGVVGMNYANGDQTIA MHNTYSDWPLYGSETASAVHSRGVYNTTGKDDNTLQMSEYDNDEAKVGWGHSASDAWQFV IKNDFNAGEFVWTGFDYIGEPTPWNGTGAGSASGQGAKPKSSYFGIVDTAGFVKDTYYLY KSMWDEDTTTLHLMSTWNNDEIVKNNNGKVKVDVFTNAAKVELYLNGNKVGEDTATTHTT ALGYKYQTFSNGEFYPSFNVTWASGTLSAKAYDKKGNLITETEGRSSVTTNTKAAKLAIS ADKKEILADGSSLSYLTVDVKDMNGNIVAGADNRINFKIEGNGKIVGVDNGNAADTDSYK GTSRKAFSGKALVIVQSTKDEGSFTLSASSDGLSGSNISVKTVKNAVEGDAYLQSYTIAK 
NLYVNVGEEPDLPAKVVGTYNNGETRDLTIKWNEYDKESLNKIGEFTITGKLQDSEAVVT VTVHVIGNVVAMENYSTVTSAGITPVLPQTVRGLYENGNYSEAFPVKWNIPADAFKNEGM VTVKGTVNVLNEVKDVTATVRVAPKLADSQNIASKNYQDTPLFTNGKMVNGVPSEPTATP INDSLSELNDGITNESGRESARWTNWSLRNENPPVDTYVQLEWQKEYLMQNIKLWHFTDN QFSVLPGDNNVRFEYYDVPTNSWQEIESSHITQVPYTSGDTPYGFIHPITTNKLRIWMKS PQVNKCIGLTEIEVYDYILPVTANSSANLDDLKLDGVNIKEFNGYRGYDKLTKTYTVDLK TADTPSVIAQGNNNEAITVLPVYNNEVKITVRAEDGKTIENYTIKYIISVNKEFLENYIE SDEIKKVFETSEEYTVQSYQEFKEAYDQAIVILKDDQATQKQVDDSYEKLQTMYQALVKK TEEKVDKSQLEALLAIADKITEDILSNLDADLVKEFKTALQAAQDIYTGNDVTQAQVNAA ITRLNVAIENLNIRDFNKIKLQELVERTEKLSAKDYTRASWERLEKALSAAKIVLADEKA TLAAVDEAKTVLQNALEQLEKVAISEESSQGVKTGDAANFAGIFALMVSTASLVLWMKRR KENN >gi|223714142|gb|ACDT01000073.1| GENE 4 10178 - 10732 552 184 aa, chain + ## HITS:1 COG:SP1284 KEGG:ns NR:ns ## COG: SP1284 COG1704 # Protein_GI_number: 15901144 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 7 184 4 181 186 176 51.0 3e-44 MSIPIIIVLIVVALIVIYCIVAYNTLTKLKNRVKTNWAQIDVVLKRRADLIPNLVETVKG YASHEKETLENAIKARNTYVTANSPQDQLAASGELNQALSRLMMLGEAYPDLKANTNFME LQKELTATEDKITYARQFYNDSVNKYTNQTEVFPSVLIAKIFGFGTFDYFEVQETDKEVP KVKF >gi|223714142|gb|ACDT01000073.1| GENE 5 10734 - 12479 616 581 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|219668460|ref|YP_002458895.1| ribosomal protein S14 [Desulfitobacterium hafniense DCB-2] # 1 581 1 561 561 241 29 3e-63 MKLKKLFLALSIFLVSLMPLPVLASGNDLQAVNMNIYINQDGSAKVEETWKMDAIEGTEN YKAFNNLYGATISDFSVVDEKGIKYQYVDNWDIDASRQEKKNKCGIIEKSDGYELCYGIG DYGQHTYTMTYTITPFVNQYSDAQGFNWQLLNQEMNPKPAEFEATISSDYKFYDQDSDIW GFGYEGQVIFDDHGAIIISNRDLNNNRRGDINYVNILVRVPDGTFSNVITKNESFEAVLE DARDGSSYQEDSSDGSIFMFIGISVAVLAGIIAVIVGVVSKANKNKQLYFSDGNNEIPTM EHVNPFRDIPCHKDIYYFYYVATKAGIIQQDDRSGILCAMLLKWIRDGYVNFTMEPSTGW FKKDKYEIDFSGEISTENILEEKMLGYFREASGDNQILENKEFERWCKRNYEEIEDWFND LIEFEEKELKEKGLMKKQTTYKKFLGIDIANEKVVYEPSFRDDIIYTKGLKRFLLDFSSI EEKEVIEVKLWEEYLMFASILGIADEVEKQIGKLCPEFNQYSNIDYTYTMLATRTFMYGG VRSAANAYSAAHSSSYSGGGGFSSFGGGGGGFSGGGGGGSR 
>gi|223714142|gb|ACDT01000073.1| GENE 6 12564 - 13982 1741 472 aa, chain - ## HITS:1 COG:lin0540 KEGG:ns NR:ns ## COG: lin0540 COG1486 # Protein_GI_number: 16799615 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Listeria innocua # 3 462 2 439 440 657 69.0 0 MAKRPVKIATIGGGSSYTPELVEGFIKRYDELPIKELWLVDIEEGKEKLEIVGAMAQRMV KAAGVPMTIHLTLDREEALKDADFVTTQFRVGLLDARIKDERIPLSHGIIGQETNGAGGM FKAFRTVPVILEIVEDMKRLCPNAWLVNFTNPSGMVTEAVIKYGKWDKVVGLCNVPISCR KMVGKALDKNEEELFFKFAGLNHFHWHRVWDVDGTELTDKAIQKLYVENDGLRKFGANAT RAKEEEEKAAKRAADAGVANIKNISFVAEQIQDLGYLPCMYHRYYYITDDMLEDEIADFK ENGTRAEVVKRTEAELFELYKDPNLDYKPEQLTKRGGTYYSDAACELINSIYNDKRTTMV VSTQNKGALTDLPYDSVAEVSAIITAAGPMPISWGTLKPAARGMVQIMKAMEETTIEAAV TGDYGKALHAFTINPLVPSGQIAKTLLDEMLIAHKKHLPQFADKIAELEAQQ >gi|223714142|gb|ACDT01000073.1| GENE 7 14126 - 14362 271 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756178|ref|ZP_02428305.1| ## NR: gi|167756178|ref|ZP_02428305.1| hypothetical protein CLORAM_01708 [Clostridium ramosum DSM 1402] # 1 78 1 78 78 138 100.0 1e-31 MKTISIFQVQDINNLLKERQFDYILKLRDACGSQSLYLECTGEETDINILCNVINELLKN DYLQVIPGTINPYNLLLK >gi|223714142|gb|ACDT01000073.1| GENE 8 14429 - 14689 393 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756179|ref|ZP_02428306.1| ## NR: gi|167756179|ref|ZP_02428306.1| hypothetical protein CLORAM_01709 [Clostridium ramosum DSM 1402] # 1 86 1 86 86 118 100.0 9e-26 MAATEKEKATKLAHELVAYIESEQSDLKASDNEFDALWQSIYDVCMLVHFNILDELLSEE EYLEGVTWLKDYQHLTKDYQDKELEL >gi|223714142|gb|ACDT01000073.1| GENE 9 14704 - 15558 739 284 aa, chain + ## HITS:1 COG:BH3627 KEGG:ns NR:ns ## COG: BH3627 COG1307 # Protein_GI_number: 15616189 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 283 1 283 283 105 28.0 8e-23 MKNYLIVTDTTSAMNGVQAKESGVELVSLSVLIDGIEYKDQIDISTEELYKKLEAGCIPT TSQPNTGYLHELMANWKAKNYEAIIIITCSSDLSGTWSGFNLVKNQLEMNNMYIYDSRQV 
GAPVMDMALAAKQMADNDHDVNEIFEMLEMKTKNSFSFLAPFDFKQLARSGRLSPVAAKM ASILKVKALLYLKEDGSCVDKYAMSRTESKIVKTVIDKFKNAGVNAKDYTLYISHADNEE SALKAKRLLQDIFDNIEVIVNKLPAVLTCHGGMKCISVHYIHKM >gi|223714142|gb|ACDT01000073.1| GENE 10 15570 - 16541 1175 323 aa, chain + ## HITS:1 COG:SMc01524 KEGG:ns NR:ns ## COG: SMc01524 COG2355 # Protein_GI_number: 15966167 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent dipeptidase, microsomal dipeptidase homolog # Organism: Sinorhizobium meliloti # 51 321 82 345 351 164 34.0 2e-40 MKVVDMHCDTILRLYDEGGCLSKNDFNIDLKKMVKGDYLLQNFAMFVNLSDHLDPLTKAQ RLIDLYYTELEKSPELIKPVYSYEDILNNQKNGLMSAMLTLEEGAVVNNDLAILRNYYRL GVRMITLTWNHPNGIGYPNLISTKEFKDLYHLNTEDGLTDFGIAYVKEMERLGIIIDVSH LSDAGFYDVLKYTTKPFVASHSNARSVCGVARNMSDDMIKQLAKRNGVMGINFCGDFLKP SSHGGMRSCIEDMVKHILYIKELVGIDYVGLGSDFDGIDQNLELADASMMPQLAAALQEA GFSEIEIEKVFYKNVLRVYSQIL >gi|223714142|gb|ACDT01000073.1| GENE 11 16839 - 17654 849 271 aa, chain + ## HITS:1 COG:no KEGG:Sterm_2094 NR:ns ## KEGG: Sterm_2094 # Name: not_defined # Def: transcriptional regulator, RpiR family # Organism: S.termitidis # Pathway: not_defined # 17 217 22 222 278 86 27.0 9e-16 MKSLFYRLIIFLDTASEADTNYNIAWYMAHNFSKVAKMGISQLAKECYVSPATISRFCRA LGYENYAHLKQECSSFASDSKKFNNLINVPLDMMKDKPQECTKYYSDQVCDAISKLSQRL DWNVIDKVLQAIHDSDSVAFFGTQFSHSAALHFQTDLLMLEKFTMAYMETERQIDCARLM DENSVAIIVTVNGFYARSNSKVLQYLKKSKCKVVLMTNNPGIDIGINVNYTIVLGDSEQR KIGKHTLLTAVELMSLRYYSLYYPGIQEKII >gi|223714142|gb|ACDT01000073.1| GENE 12 17691 - 18533 512 280 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 4 280 5 287 508 201 40 8e-87 MREIKCEDITKTVKQLCMEAACNLPSDVFKALKDKTDTEPYPLAKKTLEVLIDNADLAKD NMMPICQDTGMAFVYITIGQEVHVDGDLKAAINEGVRQGYEEGYLRKSVVDDPLFERINT KDNTPAIIYFDLVCGDEFKIVVAPKGFGSENMSQIKMLKPSDGLQGVKDFVMKVVNDAGP NACPPMVIGVGIGGSFDKVTMLSKQAMMREIGTHHEDSRYAALETELLEMINATGIGPAG YGGKTTALSLNIETHPTHIAGMPVAVSICCHVARHKEASL >gi|223714142|gb|ACDT01000073.1| GENE 13 18530 - 19084 353 184 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 8 168 321 482 508 140 45 8e-87 MIKLTTPLTVEKIKQLHAGDEVLLSGTIYTGRDAAHKRLMALIEEGKELPFHLEDQVIFY VGPTPSKPGKVFGSGGPTTSGRMDAFAPTMIKLGLRSMIGKGYRQQAVKDAIIKYHGVYF GAIGGAGAMMSNCIKECTVIAFDDLGPEAIRRLEVEDMPLVVVIDSNGNDQYELGRNDYL STCK >gi|223714142|gb|ACDT01000073.1| GENE 14 19114 - 20361 1595 415 aa, chain + ## HITS:1 COG:BS_yodJ KEGG:ns NR:ns ## COG: BS_yodJ COG1876 # Protein_GI_number: 16079020 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus subtilis # 218 402 79 266 273 142 45.0 8e-34 MFKKKRDYGYYSSYGTSSRSRLKVGRVALVGAVGVVLVIGIVVVLNLTRIKLLIKGYSFS QASEILSLTKEQENEILSHDKLDHITNWIANSKKVSLYDEYEKYLGMNKKMEYADVVTYI NGLFKEEVPKLKSMGYSEKVIWNLISDKVSKEDLQYLINHNYTAKDLEPYRKVKGYQIQN LEAYINAYNNYKDYNYAVCITNYPFLISSNGEPKESYTITNPDDLLVLVKKGFYLPESYE PKLVDPEIPVAPDCQNPKMTKETSDALTKMYEAAKKEGLELVVNSAYRSYQTQVDTMADF ERRFGGQYANEYVAQPGASEHQTGLGIDLTSQSVVEGKKITFGDTDEYKWVIKNCAKYGF IIRFENGTDGITGIAHEPWHLRYVGKDVAKKIVEKEWTFEEYCLYNNVMPKVTKK >gi|223714142|gb|ACDT01000073.1| GENE 15 20402 - 22027 1293 541 aa, chain + ## HITS:1 COG:BH2952_1 KEGG:ns NR:ns ## COG: BH2952_1 COG1032 # Protein_GI_number: 15615514 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Bacillus halodurans # 1 454 1 456 457 376 42.0 1e-104 MKTLITTLNSKYIHKSLSLRLLYVATKEYHEVDFKEYTIKEDLDTIANDIIEQKVQVVAF SVYIWNVDLIKILIDKLKKMDPQLVIVIGGPEVSYEIDYFLDNFKIDYLISGEGEVAFKE LLDCLETQQPINIQGISRKDKRDYQQTKPVDLQYLETLESPYLLKRDLADMDKRILYFET SRGCPYQCQYCLSSLEKGLRFFSLDYLKAQLSLLLDSGVKTIKLLDRSFNAKTAHALAIL EFIFKHYRPGIQFQFEINGDVLDQRIIDYINDYAPRGLLRFEIGIQSTYEPTNEIVKRYQ NFERLSEVIKQLQQNHKIDFHLDLIAGLPLENLERFARSFDDVFAFYPKELQLGFLKLLR GTSLRKEAEKYGYVYDENPPYELLYSNDLSPEDIGKIHLAEEMLEKFWNSGKMPITMNTV IRTVASPFYFFLDLGNYYLKQQFKMFKFQNDELFSYLDEYLNRQYQDLLIEDYLRLSKIK PKRWWKARLDKSEARETMHYLIEKYDLDQEEVFRYGVVEKVTDGYLLAIYQNYQVSVKKY S >gi|223714142|gb|ACDT01000073.1| GENE 16 22181 - 24484 
2632 767 aa, chain + ## HITS:1 COG:CAC0480 KEGG:ns NR:ns ## COG: CAC0480 COG1328 # Protein_GI_number: 15893771 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Clostridium acetobutylicum # 4 767 3 698 702 520 40.0 1e-147 MIENIRKRDGRLVAFELDKIAGAIYKAFEASGGKYELKNAMNIAKTVETKLEKKKSELPT VELVQDTVEEVLIEQGYVRVAKAYILYRANRTRERDMKSRLMKTFEEITFADSKDSDLKR ENANVNADAPMGTMLKYGTEAAKQFNEMFVLNPKHARAHINGDIHIHDMDFLTLTTTCCQ IDLIKLFAGGFSTGHGVLREPNSISIYAALACIAIQSNQNDQHGGQAIPNFDYSMAPGVA KSYLKHFYHNLAKAITLLVKLDGEDIATKVKGEMRLKPTLDKNEAFEEQLKEILKANKVD KFVDEIVEFAITSAYEETDRETYQAMEAFIHNLNTMHSRAGAQVPFSSINYGMDTSSEGR MVMRNILLATEAGLGHGETPIFPIQIFRVKEGINYNEGEPNYDLFKLACRTSAKRLFPNF SFVDAPFNKQYYKGTPETEIAYMGCRTRVMGNVYAPDKEISFGRGNLSFTSINLPRLAIR SKGDVNEFFDKLDGMLDLCIEQLLERFEIQCRRKAKNYPFLMEQGVWLDSDELKPDDEVR EVLKHGTLTVGFIGLAETLKALIGVHHGESEKAQILGLQIVAHMRKAMDKATEKYNLNFS LIATPAEGLSGRFVKMDKKLFGELDGITDREYYTNSFHIPVYYSISAFKKIKLEGPYHEL TNGGHISYVEMDGDPTKNIAAFEKIVRAMHDNGIGYGAINHPVDRDPVCGYNGIIDDVCP LCGRSEEEHHGFERIRRITGYLVGTLDRFNDGKKAEESDRVKHETKE >gi|223714142|gb|ACDT01000073.1| GENE 17 24492 - 25007 506 171 aa, chain + ## HITS:1 COG:CAC0481 KEGG:ns NR:ns ## COG: CAC0481 COG0602 # Protein_GI_number: 15893772 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Clostridium acetobutylicum # 1 150 1 153 153 113 40.0 2e-25 MEIRLFGTVNDSIVDGPGLRYAIFTQGCFHNCLGCHNPKSHDINGGYIRDTSEIINEIDE NPLLDGITLSGGEPMLQIEPLKELCKAARLRKLNIVIYSGFTFEQIMDDPNKKSLLELCD MLIDGKFELDKKSLALLYRGSSNQRLINIQQSLSQGEVVEYLADENNEIKI Prediction of potential genes in microbial genomes Time: Thu May 26 09:57:47 2011 Seq name: gi|223714141|gb|ACDT01000074.1| Coprobacillus sp. 
D7 cont1.74, whole genome shotgun sequence
Length of sequence - 24895 bp
Number of predicted genes - 22, with homology - 22
Number of transcription units - 9, operones - 4 average op.length - 4.2

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 64 - 1809 1815 ## COG0006 Xaa-Pro aminopeptidase
    - Prom 1856 - 1915 17.6
    + Prom 1744 - 1803 7.1
2 2 Tu 1 . + CDS 2038 - 2499 481 ## MmarC7_1474 hypothetical protein
    + Prom 2639 - 2698 9.4
3 3 Op 1 . + CDS 2731 - 3039 379 ## gi|167756191|ref|ZP_02428318.1| hypothetical protein CLORAM_01721
4 3 Op 2 . + CDS 3052 - 4083 1069 ## ACL_1174 V-type H+-transporting ATPase, subunit C (EC:3.6.3.14)
5 3 Op 3 . + CDS 4097 - 5458 1358 ## ACL_1173 V-type H+-transporting ATPase, subunit I (EC:3.6.3.14)
6 3 Op 4 . + CDS 5471 - 7363 1924 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I
7 3 Op 5 . + CDS 7384 - 7815 653 ## ACL_1172 integral membrane protein
8 3 Op 6 . + CDS 7835 - 8143 280 ## ACL_1171 V-type H+-transporting ATPase, subunit F (EC:3.6.3.15)
9 3 Op 7 . + CDS 8156 - 8752 839 ## gi|167756197|ref|ZP_02428324.1| hypothetical protein CLORAM_01727
10 3 Op 8 16/0.000 + CDS 8768 - 10519 2026 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A
11 3 Op 9 16/0.000 + CDS 10533 - 11975 1866 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B
12 3 Op 10 . + CDS 11978 - 12586 823 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D
    + Term 12773 - 12806 2.3
    - Term 12435 - 12471 -0.9
13 4 Tu 1 . - CDS 12604 - 13032 547 ## Cagg_1118 hypothetical protein
    - Prom 13097 - 13156 7.0
    + Prom 13032 - 13091 8.0
14 5 Op 1 . + CDS 13323 - 16442 3231 ## gi|237734149|ref|ZP_04564630.1| predicted protein
15 5 Op 2 . + CDS 16435 - 17370 313 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16
16 5 Op 3 . + CDS 17351 - 18262 914 ## gi|237734151|ref|ZP_04564632.1| predicted protein
    + Term 18377 - 18439 4.1
    + Prom 18371 - 18430 7.7
17 6 Tu 1 .
+ CDS 18503 - 19387 249 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 + Term 19446 - 19483 -0.5 + Prom 19455 - 19514 6.1 18 7 Op 1 . + CDS 19534 - 20292 838 ## COG0710 3-dehydroquinate dehydratase 19 7 Op 2 . + CDS 20356 - 20733 398 ## gi|167756207|ref|ZP_02428334.1| hypothetical protein CLORAM_01737 20 8 Tu 1 . - CDS 20742 - 22544 1838 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) - Prom 22573 - 22632 7.2 + Prom 22613 - 22672 5.8 21 9 Op 1 . + CDS 22821 - 24380 1887 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 22 9 Op 2 . + CDS 24406 - 24895 369 ## gi|167756210|ref|ZP_02428337.1| hypothetical protein CLORAM_01740 Predicted protein(s) >gi|223714141|gb|ACDT01000074.1| GENE 1 64 - 1809 1815 581 aa, chain - ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 2 575 3 578 584 506 44.0 1e-143 MIKENIAKMQTLMKENGIDIYIIPTSDFHQSEYVGEYFRGRKFLSGFTGSAGTLVISLDE ARLWTDGRYFIQAEQQLAGSGIILMKMAMPGVPTIKEYLDQNTDKTVGFDGRVMSYQEVA RLSNKLITDVDLVDEVWSERPSISHEPAFIYDEEYCGESRASKLARLRSAMGDCQHHIIT SLDDIVWLFNIRGNDVDCNPVVLSYALINQDNAILYVQDNVVDVKTEAILKRDSIIIRAY NDIYEDVKKLTGKVLLDDQIVNYQIFNNLNCEIKTAPNPTQHFKAIKNETEIKATKNAHI KDGVAMTKFMYWLKNNVGKIELDEVTISDKLAAFRKEQNEFFDLSFDTICGYKANAALMH YKAEPRNCAKVTNEGMLLIDSGGQYLDGTIDTTRTFVLGPISDIERRDFTVALKAMFRLQ AAHFLAGTTGPNLDLLARGIVYEYNLDYRCGTGHGVGHFLNVHEGPNGFRPHDRPGFAKM CAFEPGMITTDEPGIYIENSHGVRHENELLCVEVETNEYGQFLKFEPITMVPFDLDGLDL ELLSNHEIKQINDYQQLVFDHVAPFLTNEERAWLQANLLIK >gi|223714141|gb|ACDT01000074.1| GENE 2 2038 - 2499 481 153 aa, chain + ## HITS:1 COG:no KEGG:MmarC7_1474 NR:ns ## KEGG: MmarC7_1474 # Name: not_defined # Def: hypothetical protein # Organism: M.maripaludis_C7 # Pathway: not_defined # 1 148 1 148 158 98 35.0 7e-20 MPTNRKESLVFTMFMCAFMVFWMSVYNVSLHFGFSSEAIKEAWLGFPLAYLFAILCDWVI 
ISKFAKSLAFKVVNPKSKPLIKIIMISFFMICGMVILMSLYGAVEQTGISEQTLMVWLTN IPKNFIAALPLQLLIAGPFIRSVFKILFTRQIA >gi|223714141|gb|ACDT01000074.1| GENE 3 2731 - 3039 379 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756191|ref|ZP_02428318.1| ## NR: gi|167756191|ref|ZP_02428318.1| hypothetical protein CLORAM_01721 [Clostridium ramosum DSM 1402] # 1 102 1 102 102 131 100.0 2e-29 MDDVIQRLVEIDRKCVERVEAAKAKKLDAQTNMNEKRKEIYDGFVTEQNKKIEAHKAELM NKNKEEAKQLDQDYNDTVANLENLYNQNKDKWINQIVERCLK >gi|223714141|gb|ACDT01000074.1| GENE 4 3052 - 4083 1069 343 aa, chain + ## HITS:1 COG:no KEGG:ACL_1174 NR:ns ## KEGG: ACL_1174 # Name: ntpC # Def: V-type H+-transporting ATPase, subunit C (EC:3.6.3.14) # Organism: A.laidlawii # Pathway: Oxidative phosphorylation [PATH:acl00190]; Metabolic pathways [PATH:acl01100] # 1 342 1 343 343 156 28.0 1e-36 MALSSNALCAKAKAMYGYRIGEEGYSDLCRKQSLSEMVTYLKSQTKYSGVLEDINVRNVH RRQVEAALNKEYFERCARLMKYAPKKNQDFYSQEVIGIEVQLIVDKVVSIKEKDQASFSL EIPDYLASKMSFNIYGLINVDNYKDLVTYLRDTRYYKILADFDFTAPIDINALERQMKSL FYSTYVAAIKKNFKGKEQKELLDILSTSIELINITKIYRFKKYFKESNETIKNSLYLEHC RMSSSMLDTLIEARSAKELMELLANSKYKLYLGDKDYAYIEYYVEEVRYNIAKRYMRFSS NAPLVYLTYSILQKVEVDNLKHIIEGIRYGRDASSIEEMLIFA >gi|223714141|gb|ACDT01000074.1| GENE 5 4097 - 5458 1358 453 aa, chain + ## HITS:1 COG:no KEGG:ACL_1173 NR:ns ## KEGG: ACL_1173 # Name: ntpI2 # Def: V-type H+-transporting ATPase, subunit I (EC:3.6.3.14) # Organism: A.laidlawii # Pathway: Oxidative phosphorylation [PATH:acl00190]; Metabolic pathways [PATH:acl01100] # 3 425 2 433 637 132 27.0 3e-29 MAIIKTHFFNISFEHKDLMKMLIKMTEYQDEMFPQDSKKIANNVKGVSVMDEANPYNEPL DNIYHILNRLNLESNVQDNEFKEINLHNANKLIDDINDKIDNIVKIKEDITKEKEENDEA IILLKNLEESKISVDDVKNTKYITCRFGKIPVNEFTKIQYYRDYEFIFRELNRSKQYVWI VYAGLTSSISEIDNAFSSMSFEPISLPEFAHGKVHEAIDELTEESTAMEQYIAKMDSKLE DIKKEYKEEVLETFTRLYNLKRLFDKCRYVVDFSQKASIYTFSSFDLKEAEAKFDDIDSV KVMELPVNIYENRNIVAPVLVRNNSLMQPFENVLSVTLGDTFDPTTIVALLTMLIGAVCV GDIGVGALLIVLGFLFTIKKPNNFGGMLKRLGAAVLIGGLFYGTAFYRIELYQPLASLPL HVIHTFLFGFSLWIIAMVILIIVKKVTRKSIKI 
>gi|223714141|gb|ACDT01000074.1| GENE 6 5471 - 7363 1924 630 aa, chain + ## HITS:1 COG:MJ0222 KEGG:ns NR:ns ## COG: MJ0222 COG1269 # Protein_GI_number: 15668395 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Methanococcus jannaschii # 122 630 148 689 695 125 24.0 4e-28 MAISRLNLVSINFDKYRYNEVLLKLFSHDDFHPELASKFIDSVTGLSVYNQDNLYEEILV RLNDASNKYHFKLEEVPVETNQINVLRVKEYLDDLLLDVERIDAVKVQLDRMIIENQEAI IQLQHVADIDVDFDDLFSCKYLQVRFGKIPVNNVSKLDYYESLPFVFKAFHNDNQYTWCM YITTPGDAPEVDNIFSSLYFERIHIPAFVHGSPELAIGEIQEEADVAKEQSEKLKERIDK MFANDRDKLNEIYTIAKHLNDVFIMQKYVVVIGDKLSVNGFVPTRSVKKFKEDFETLEKV TVDIKPANSDVRLSPPTLLKNNWFSRPFRMFVEMYGVPKYNDADPTLLVALSYTLLFGIM FGDVGQGVILSLVGFVAYKKFGLQLGAVGVRLGISSMLFGVVFGSVFGNEEILIPLFNAM SPDNTMTLLIAAICLGIILIIISMAFNVILSFKKKNFGEAVLSQNGICGLVFYISILGLV INMFLGLGFVNVIYVVGLIALPLFLIFLKEPLIRKFEEHEAMFPNGFGAFFVEGFFELFE VVLSFITNTMSFLRVGGFVLSHAGMMLVVYTLAEMVGGVGELAVIIFGNIFVMCLEGLIV GIQVLRLEFYEMFSRYYEGNGISFKTIKED >gi|223714141|gb|ACDT01000074.1| GENE 7 7384 - 7815 653 143 aa, chain + ## HITS:1 COG:no KEGG:ACL_1172 NR:ns ## KEGG: ACL_1172 # Name: not_defined # Def: integral membrane protein # Organism: A.laidlawii # Pathway: Oxidative phosphorylation [PATH:acl00190]; Metabolic pathways [PATH:acl01100] # 11 143 11 146 150 68 55.0 7e-11 MTVLEFILPALVVILLSLPLIKVFTGKVSVKSAKIRLATHVAGFAGAVCLALYLAAVTNG VYAEETAVITGTIAQGLGFVAAALATGLSALGAGIAVAAAAPAAIGAFSENEKNFGKSLI FVALGEGVAIYGLLISILIINKI >gi|223714141|gb|ACDT01000074.1| GENE 8 7835 - 8143 280 102 aa, chain + ## HITS:1 COG:no KEGG:ACL_1171 NR:ns ## KEGG: ACL_1171 # Name: ntpG # Def: V-type H+-transporting ATPase, subunit F (EC:3.6.3.15) # Organism: A.laidlawii # Pathway: Oxidative phosphorylation [PATH:acl00190]; Metabolic pathways [PATH:acl01100] # 1 102 1 102 104 125 61.0 5e-28 MRFYCISDNVDTQVGLRLVGIEGQVVHKRREFLELLETKLKDDSIGIILITTNLIELAPD VISEIKLKQQKPLLVEIPDRHGESKIGETIDKYVSEAIGVKL >gi|223714141|gb|ACDT01000074.1| GENE 9 8156 - 8752 839 198 aa, chain + ## HITS:1 COG:no 
KEGG:no NR:gi|167756197|ref|ZP_02428324.1| ## NR: gi|167756197|ref|ZP_02428324.1| hypothetical protein CLORAM_01727 [Clostridium ramosum DSM 1402] # 1 198 1 198 198 275 100.0 1e-72 MEKKEQVFLYMKEEIERLANLEEEKILDEAKRLEEEAYNQIKAEAKKDAEALLAKELVEI SSNASVEASSSQEEKTKKLVEKRDEYVANIFSEAKDKLVAFANSKDYQAYLVKHMEEIGK LYPMSDSTLELREADMKYKDELIKAYGTALEVEVSDKITIGGFIVKNKATNVVVDESLDF ALENQKDWFYKTSGLMIK >gi|223714141|gb|ACDT01000074.1| GENE 10 8768 - 10519 2026 583 aa, chain + ## HITS:1 COG:MK1017 KEGG:ns NR:ns ## COG: MK1017 COG1155 # Protein_GI_number: 20094453 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Methanopyrus kandleri AV19 # 1 582 1 587 592 647 55.0 0 MNNNVIYSINGPVVKVKDTVSFSMLEMVYVGNDSLMGEVISISKEFTTIQVYETTTGLKP GEPVVSTGSPICATLGPGILRNIFDGIERPLKALEKESGAFIQAGSNVDSLDIAKKWDVT MTVKVGDEVKGGTIYATTPETDLIVHKCMVSPLISGKVVEVVEDGQYTINDVVMKIEDAH GKIIECTLCQKWPIKQSRPVTKRLPIGTPLVTGQRVIDTLFPIAKGGTAAIPGGFGTGKT MTQHQIAKWCDADIIVYVGCGERGNEMTQVLEEFSELIDPKSGKPLTDRTVLIANTSNMP VAAREASIYTGVTLAEYYRDMGYHVAIMADSTSRWAEALREISGRLEEMPAEEGFPAYLP SRLSQFYERAGYMETLNGQEGSVSIIGAVSPQGADFSEPVTQNTKRFVRCFWALDKALAY ARHYFAINWTQSYSDYVYDLEKWYSKNVGPKFVSDRQEIAYILAEEAKLNEIVQLMGPDV LPEDQKLIIEVAKVVRVGYLQQNAFHKDDTYVPMEKQLKMMEVILYLNKRCKEAVNQGKV VRRIVDTGIFDQVIKMKYDIPNDNIAKLDTYYHDIDEAIKSVA >gi|223714141|gb|ACDT01000074.1| GENE 11 10533 - 11975 1866 480 aa, chain + ## HITS:1 COG:TP0528 KEGG:ns NR:ns ## COG: TP0528 COG1156 # Protein_GI_number: 15639518 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Treponema pallidum # 5 454 6 454 480 572 60.0 1e-163 MSLQYVGLNEINGPLVVLDHVKGASYDEMVDIQLKDGTTRAGRIVQIEGEKVVVQVFEGT NGLSLENTKTRLLGKPMELPVSKEILGRIFSGAGRPIDGLGEVYAEAEMDINGLPLNPVS RVYPRNYINTGISSIDCLTTLIRGQKLPIFSVSGLPHDRLAVQIVKQAKIADEAGKGFAV VFAAMGVTNDVAEYFKRSFKEAGVMERVVMFLNLSNDPIIERILTPRCALTAAEYLAFKQ NMHVLVIYTDVTSYCEALREFSSSKGEIPGRKGYPGYLYSDLASLYERAGIIKDAEGSVT 
QIPILTMPNNDITHPVPDLTGYITEGQITLDVDLNSAGVYPPVGVLPSLSRLMKDGIGEG FTRADHQDIANQLFACYAHVQDVRSLTSVIGEDELSDMDKKYISFGRLFEEYFINQGFDT NRSIEETLDLGWDLVSVLPREALDRLDNALLDQYYNHEKAVERFNIKEKKIIKELQEEGE >gi|223714141|gb|ACDT01000074.1| GENE 12 11978 - 12586 823 202 aa, chain + ## HITS:1 COG:PH1972 KEGG:ns NR:ns ## COG: PH1972 COG1394 # Protein_GI_number: 14591709 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Pyrococcus horikoshii # 3 200 5 207 214 107 35.0 1e-23 MALKVVPTKGNLIAMKKSLQLANLGYNLMDQKRNVLIKEMMTLLDDVKLIRDQITSSYQE AYDALQEANISMGLITDIVNSTPEDYGISIAYRSVMGVEIPKIAYDKQPLKMTYDIERSN SKVDYAYNCFYNVKQLTVLLAEVENSVYRLANTIRKTQKRANALRNISIPRFESTIKVIS EALEEKEREEFTRQKVIKEMKK >gi|223714141|gb|ACDT01000074.1| GENE 13 12604 - 13032 547 142 aa, chain - ## HITS:1 COG:no KEGG:Cagg_1118 NR:ns ## KEGG: Cagg_1118 # Name: not_defined # Def: hypothetical protein # Organism: C.aggregans # Pathway: not_defined # 17 130 22 141 156 65 31.0 7e-10 MNPLTAEIATAQNSDNKLILIIGGPGSGKSKLIHDYSNETGIPILNLDQIFKDDCSEIIT VMNDFIDNYDKEVLLLDNKRVLYAKDSNIDMLTFLKELAKKIIVVATWNGTIADNKLIHI RSKLPANLEYSLKNEQIKFIKC >gi|223714141|gb|ACDT01000074.1| GENE 14 13323 - 16442 3231 1039 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734149|ref|ZP_04564630.1| ## NR: gi|237734149|ref|ZP_04564630.1| predicted protein [Mollicutes bacterium D7] # 1 1039 4 1042 1042 1867 100.0 0 MKKMLTFLIATILILAMSLMPLQAITGINSLVSYGEGNEIEGGCNGKNVDLGEGLSGLMV SNQSELVFIGNDDFKIKVGKPVKEFEIIDDVDNDGLKDVAVYIDSEDDYDDFKIVSSKTS KVLYATKYSYNTINDNNEAVNKNATIRQILFNDNIVYLIYNYHLVGISTKDYQVVFDYEA KDNIWKMALVNNQVIFTTQLGELGSLNKTNGKLNYTVALTTPKEIKDPRYDTLGKVNLNI WDIVYIKDKLYVTTEDAKLYQIDQMNGKIIKTVNLEEGIDEKLGNLLKNNNKWSGQLVPT GIHNISFMSYRMDLLDNNLMLISAYLGSPQIELNEGDSEKEIPHLILIDLEKMEVKLDIP VEQYNLDYSRAVKGIFDNQPALVVPTYASTGKLRLVAYSLNDGTVLKQTNLNLNGINSKN VKIHFANQGERYLLQVDGGVSLLVDQDLKTIEYLGSAKVVNKVVDLSDGTIVSVLNNGKI TQIKKMGLGGCDDNLVTFNIPDGYHSNGFEAINYDEKHNHLLSLVSELNANGEVIASHIV IMNLNDGNTLADKKVLLEKGYDENNKYYEHYLIGETIRYFTDMNNDGKSEIIVDENILDG 
ASLTFRSVYNKSFEGATTIIDVGDVNNDGISDLVNIGESEMRLYYSLKNGYDITYQKTNI AKSYDKKLQNNIHVKVLGDLDHDGIKDFVINAYNDKGCQCFQVIKGKDLSVRYSLLKDGV IYENEEDSFSVTGIDYDNDGVDDLVYGIYDEMRTVLSGATGEKLFEYLITDYHNDWDNSM GDPVPLENIVSFNLIENGNSVVKINDLNDDGIQELAYLVRDYDNSDYGTKNILKILNGSN YEELKSKILGKGNYDSSQITTVIGQSKLIYNDGTLSQIYDYRNDSLVAGLKMAVTSARTL NNQLLQLEDNVGQLYSFADQRDFELVDFNEKDVNDGNLKISYKTEKNGLMYVYDQGNLIE KTTAKNFNVKLLAGEHDLIFSYNDGQGKVTHYATVITVEKSTISRYLIMLLVIGIVAGGF GLVFYPKYRLMKKAGVKHG >gi|223714141|gb|ACDT01000074.1| GENE 15 16435 - 17370 313 311 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 303 1 303 305 125 26 3e-28 MDKLIVIEHMDKVFGNYKAIDDLSFSLEGGKIYGMIGPNGAGKSTTMSLLMGLIFPTKGS GLIKGYPLGSNEAKAIMGYSPEFPNFYSDMSCLEYLVYMGMLSDLSYDEALKRTKELLEE FDLVEHQLKRVAKFSTGMKKKVGLIQAMMHKPEILLLDEPTANLDPTSRYEIIQTLKRLV KQRKMTVLISSHVLTELEMIIDHVIMINKGHVVLNQPIDVVQNEFNQEKLLVSCNHNEHL QTYLETKGYFYTLNNGVLRVSIDDKQKCKKEIVKFIYENDYELDLLKEDTISLEGLYQQL VEENDHESTIQ >gi|223714141|gb|ACDT01000074.1| GENE 16 17351 - 18262 914 303 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734151|ref|ZP_04564632.1| ## NR: gi|237734151|ref|ZP_04564632.1| predicted protein [Mollicutes bacterium D7] # 1 303 1 303 303 469 100.0 1e-131 MKVLFSKTFIKMFKLIPVVACVIISGFFGFILSNQYLGLMSQKPTYLSLSQDILMFYAVL AFVLLVGMLIWIICSNSGSGLFASEIHEGTMRLLLSKKISRLELVIGKVGGMLVGSFVYL VISFATFLMVFIILSGVEKDILRMVLAATLAFISYGIILIFILGGLATFLSSCFKKKVPA ILILVALAALIFGIIPFARIFLTNTGKYDTYHLYLFDINYHLSLIFNQFLMLVGDFNTML GDLSMFGMFTNLYSYGVTDFDLGNRAVYTLNQSLNSGLITAVYLALASGLYVLSYRRMDK KDI >gi|223714141|gb|ACDT01000074.1| GENE 17 18503 - 19387 249 294 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 6 285 9 308 323 100 27 9e-21 MYLVFDIGGTAIKYCWMNEAGEIFNKQEILSTAVDSLDKFIETVAKIYHESNRKVEGIAL SCPGVIDAANGTIKVVVAYPYLQGICLTELISKACDNIKVSLENDAKCAGLAEAWIGSAQ AYDDAIIVVLGTGIGGAIIKNKQIHHGAHLFAGEISTLIVDYDKETNQVLTWSDIASTTA 
LCKRAAEALAVTSIDGRRVFELANNADEVVLEVLKNFCLDIAIQLYNLQYSYDPGVICIG GGISKQPLLIKLIKEAVEIIANETNQLLKPNVTTCKFYNEANLIGALSYFLSIK >gi|223714141|gb|ACDT01000074.1| GENE 18 19534 - 20292 838 252 aa, chain + ## HITS:1 COG:lin0494 KEGG:ns NR:ns ## COG: lin0494 COG0710 # Protein_GI_number: 16799569 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase # Organism: Listeria innocua # 6 249 6 249 252 189 38.0 6e-48 MGVCKVKNIVIGEGMPKICVPVVSNNHQDIITDLIRLQSLDIDLIELRIDYFNELLDHQK LTELFKAIASMAIKQGIILTYRSVPEGGNGKLSNDEYMQLYSIALESGAFDIYDVELSSG TNTIINLSNLIHQNDKKIIMSSHDFTRTPSIDTMLGKIKQMDSLEADIIKIAVMPEDYRD VLLLLETTLKANELYDKPLVTMSMSSKGIATRILGEQFGSAITFGKDNNSSAPGQIEVHA LKEVLKVIHHNN >gi|223714141|gb|ACDT01000074.1| GENE 19 20356 - 20733 398 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756207|ref|ZP_02428334.1| ## NR: gi|167756207|ref|ZP_02428334.1| hypothetical protein CLORAM_01737 [Clostridium ramosum DSM 1402] # 1 125 1 125 125 189 100.0 4e-47 MLINQNELEKLCKYLYPKFFDYKDISLYDLNIKIDDYLHLSANLNYYNIETKLRAIARVR VEDKIIIDINGVIKYGFINLDLNKVLKETVKDLPYLTINDESIIIDNEYVREISLKDEYI NIELK >gi|223714141|gb|ACDT01000074.1| GENE 20 20742 - 22544 1838 600 aa, chain - ## HITS:1 COG:BS_asnO KEGG:ns NR:ns ## COG: BS_asnO COG0367 # Protein_GI_number: 16078143 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Bacillus subtilis # 18 598 1 589 591 541 45.0 1e-153 MYDYMNDLTNQVQNFNKMLTLLKYRGPDDFNIASSEHVLLGHCRLSIIDLNGGKQPFKYT YNDIEYTIVYNGEIYNMNTIKEQLIDEGYHFTSQSDTEVVVASFIAYGPRCLNLFDGIFS FVISYQNKLFVARDQLGVKPLYYYQKDDLFIFSSEIKCILMYLGRCVVDETGLKELLGLG PSVSPGKTIYKDIYSLRPAHYMNVFNGYKEINRYWALERRNHNRSYEETVHDIRCLVNHS IRQQLLSDVPVSCMLSGGLDSSIITGVTSQYISKLSTYSVDYQDQDKYFQPYEYQTTRDD HYIDEVKTLYNTKHKTVTLSQKQLVMSLKESLIARDAPGMADIDSSFLLFSKEISHNHKV VLSGECADEIFGGYPWFYRKELYSIDGFPWMRDLDKRMELFSDEIQALNIKNYVMDKYHQ TLNEIDYHDHTYEDANKRKMIYLNMEWFMQTLLTRSDSQTMRSSIELRVPFANKDIVSYL YNVPWEYMYHNDIEKGLLRDAFKDFLPSDIYNRKKNPYPKTHSPLYRDLIIELLKESLND 
ENNLLLRLFNRDKLIDLINSGGESFKYPWFGQLMTGPQLLAYFYQIYLWGRIYHVEIELE >gi|223714141|gb|ACDT01000074.1| GENE 21 22821 - 24380 1887 519 aa, chain + ## HITS:1 COG:BS_glvC_1 KEGG:ns NR:ns ## COG: BS_glvC_1 COG1263 # Protein_GI_number: 16077887 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus subtilis # 1 435 1 436 452 520 59.0 1e-147 MLQKIQRFGGAMFTPVLLFAFAGTTIGIGTLFTTEAIMGGLADPNGLWYQVWNVILQGGW TVFNQLPLIFVVGLPIGLAKKQQARCCMEALVLYLTFQYFLSTILSQWGGTFGVDFAAET GGTSGLTMIASIKTLDMGMIGALMISCIVVYLHNKFFDTELPEWLGTFSGSSFVVIVGFF ALLPVAFLSAWLWPIVQDGMRAFQGFVAGAGSIGVWIFIFMERILIPFGLHHLMYFPFFY DSVIINGGVYSAWATALPDLAASSDSLKSLAPWAWPTLTGLSKVFGCPGIALAFYATAKD KKKKQIAGLLIPITLTAIVCGITEPIEFTFLFIAPVLFVVHAVLAATLATVANALGVVGV LSGGLIEMSALNFIPLMASHWQTYLTLLVVGLVFTGIYFIVFRFLILKFDFKTPGREDDE EEIKFHTKADFKEKQAGGKVEKQSLAAAILEGLGGKDNIVDVTNCATRLRVNVKDEKLVQ SDSYFKSIGTHGLKATKTNIQVIVGLKVPSVREDFEALL >gi|223714141|gb|ACDT01000074.1| GENE 22 24406 - 24895 369 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756210|ref|ZP_02428337.1| ## NR: gi|167756210|ref|ZP_02428337.1| hypothetical protein CLORAM_01740 [Clostridium ramosum DSM 1402] # 1 163 8 170 170 231 100.0 1e-59 MAKRINEKKLKRLDRYYMIAKVFLMVTPFIAYLYLSLLAMMRSITLPEVLSSEPSVAVIF LIAMINPYIAYLLNIAQRKLKEGDIKFACINFVLLLLAQALTLNSLYFMIIAYLFYVTVK TYDIKVFKTLKEFTMKYTFQYGGGSFIVVAFSTVCLFATLRLM Prediction of potential genes in microbial genomes Time: Thu May 26 09:59:27 2011 Seq name: gi|223714140|gb|ACDT01000075.1| Coprobacillus sp. D7 cont1.75, whole genome shotgun sequence Length of sequence - 22975 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 10, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 72 - 509 522 ## COG2190 Phosphotransferase system IIA components + Term 512 - 542 1.3 + Prom 539 - 598 8.4 2 1 Op 2 . 
+ CDS 623 - 1378 716 ## COG1737 Transcriptional regulators 3 1 Op 3 . + CDS 1378 - 2136 748 ## COG1737 Transcriptional regulators 4 2 Op 1 26/0.000 - CDS 2486 - 3400 1249 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 5 2 Op 2 . - CDS 3414 - 3839 697 ## COG1585 Membrane protein implicated in regulation of membrane protease activity 6 2 Op 3 . - CDS 3855 - 4220 361 ## ACL_0835 hypothetical protein - Prom 4241 - 4300 12.8 - Term 4245 - 4297 -1.0 7 3 Tu 1 . - CDS 4308 - 5006 732 ## COG2188 Transcriptional regulators - Prom 5060 - 5119 10.0 + Prom 5077 - 5136 9.7 8 4 Tu 1 . + CDS 5162 - 11854 7910 ## COG1472 Beta-glucosidase-related glycosidases + Term 11875 - 11916 10.1 + Prom 11895 - 11954 8.2 9 5 Op 1 8/0.000 + CDS 11980 - 12351 478 ## COG1725 Predicted transcriptional regulators 10 5 Op 2 . + CDS 12355 - 13203 263 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 11 5 Op 3 . + CDS 13203 - 13859 675 ## gi|167756223|ref|ZP_02428350.1| hypothetical protein CLORAM_01753 12 5 Op 4 . + CDS 13872 - 14309 442 ## gi|237734169|ref|ZP_04564650.1| predicted protein + Term 14314 - 14352 3.0 + Prom 14315 - 14374 4.0 13 6 Op 1 . + CDS 14397 - 14864 580 ## gi|167756226|ref|ZP_02428353.1| hypothetical protein CLORAM_01756 14 6 Op 2 . + CDS 14867 - 15568 984 ## COG0813 Purine-nucleoside phosphorylase 15 6 Op 3 . + CDS 15568 - 16191 596 ## COG0560 Phosphoserine phosphatase - Term 16143 - 16176 0.7 16 7 Tu 1 . - CDS 16204 - 16404 263 ## COG2155 Uncharacterized conserved protein - Prom 16456 - 16515 9.8 + Prom 16445 - 16504 8.6 17 8 Op 1 3/0.000 + CDS 16566 - 17483 938 ## COG2834 Outer membrane lipoprotein-sorting protein + Prom 17486 - 17545 3.9 18 8 Op 2 . + CDS 17570 - 18643 1158 ## COG0787 Alanine racemase 19 9 Op 1 . - CDS 18719 - 19138 361 ## gi|237734176|ref|ZP_04564657.1| predicted protein - Prom 19158 - 19217 7.5 - Term 19193 - 19226 2.3 20 9 Op 2 . 
- CDS 19237 - 20235 1215 ## COG1087 UDP-glucose 4-epimerase - Prom 20256 - 20315 5.4 - Term 20294 - 20333 7.3 21 10 Tu 1 . - CDS 20338 - 22389 1804 ## COG0480 Translation elongation factors (GTPases) - Prom 22420 - 22479 9.5 Predicted protein(s) >gi|223714140|gb|ACDT01000075.1| GENE 1 72 - 509 522 145 aa, chain + ## HITS:1 COG:SA0183_3 KEGG:ns NR:ns ## COG: SA0183_3 COG2190 # Protein_GI_number: 15925893 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Staphylococcus aureus N315 # 4 119 3 117 143 111 49.0 4e-25 MANGKSVAIEDVPDEVFSTKMMGDGIAVIPDDGKIYAPCNGKISMVMDNTKHALGIETED GMELLIHVGLDTVNLMGEGFATHVSAGDSVETGELLISYDKTDFVTKGINDITMLVIVEN GGHEITKYNINENVHIIESPLIEYK >gi|223714140|gb|ACDT01000075.1| GENE 2 623 - 1378 716 251 aa, chain + ## HITS:1 COG:CAC0531 KEGG:ns NR:ns ## COG: CAC0531 COG1737 # Protein_GI_number: 15893821 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 244 1 246 257 154 36.0 1e-37 MKLEELVNRNYQQLNDNDLYIWNYIIHHRKECERLSIDELALRCSVSRSTILRFSKRLGL KGYAEFKVFLRIDNQKNDKNALTDNIFNHYIETLEQYRDYHYKEIVQSIYEAKNLYVYGT GVIQDNTAQHLKRSFSMVNKLFLDIDVLADFEAYINLFGSNDVFIAISYAGENQRLLDYV YRLKAKGVIVVAIIANNDCTLSHVADYSLHVKTLPVMTNQGRREDLVGNYFILIDFIIAN YIEYAKSRGNE >gi|223714140|gb|ACDT01000075.1| GENE 3 1378 - 2136 748 252 aa, chain + ## HITS:1 COG:CAC0531 KEGG:ns NR:ns ## COG: CAC0531 COG1737 # Protein_GI_number: 15893821 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 247 1 246 257 180 41.0 2e-45 MTLDKLVEDNYHQLNENDLYIWQYILHHKRECQRISIKDLARNCNVSHTSILRFTKKLGL EGFSELKVHLKWDLAQKPNFKPRIIDDTYHEFIETMERMKNRDLTNIMEMIEKAERVFVY GTGVVQANMAGELRRVFLYTNKVFHAVGNGTEIDTILNNVTKNDLFIIISLSGDNETAVT LARALRGLHIPRIGIAKVGNTLLSKYCDDMITFRYETFKVGMSDILYGSTAHFFIISDFL FLRYLEYSQEKG >gi|223714140|gb|ACDT01000075.1| GENE 4 2486 - 3400 1249 304 aa, chain - ## HITS:1 COG:FN1549 KEGG:ns NR:ns ## COG: FN1549 COG0330 # 
Protein_GI_number: 19704881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Fusobacterium nucleatum # 7 296 4 293 294 321 57.0 1e-87 MDGIVAIVLWVFLGIIVITIIASTIRIVPQSRAYVVERIGAYNRTCNVGLHILIPFFDRV ANKVSLKEQVVDFAPQPVITKDNVTMQIDTVVYYQITDPKLFTYGVDRPINAIENLTATT LRNIIGDLELDETLTSRDIINSRMRSILDEATDPWGIKVHRVEVKNIIPPRDIQEAMEKQ MRAERERREAILQAEGKKTAAILNAEGDKESMILRATADKEAKIAIAEGEAEALRLVYEA QAKGITYINQANPDSAYVTLQGFKALEELSKGEATKIIIPSEIQGIAGLASSLKELVTDK PKKD >gi|223714140|gb|ACDT01000075.1| GENE 5 3414 - 3839 697 141 aa, chain - ## HITS:1 COG:MTH693 KEGG:ns NR:ns ## COG: MTH693 COG1585 # Protein_GI_number: 15678720 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Methanothermobacter thermautotrophicus # 5 141 10 144 146 63 33.0 1e-10 MIPIWLGIIIVAIIVEIITIDLVSIWFAAGGVVALIADLLGASQAIQIALFVIVTTIAIF VTRPIAKKYLRTNIEKTNYDRVIGKHGLVTKTITADTKGEVKVMSTSWLAASVDNNTINE GEYCEIMAIEGAHLVVKKIEE >gi|223714140|gb|ACDT01000075.1| GENE 6 3855 - 4220 361 121 aa, chain - ## HITS:1 COG:no KEGG:ACL_0835 NR:ns ## KEGG: ACL_0835 # Name: not_defined # Def: hypothetical protein # Organism: A.laidlawii # Pathway: not_defined # 1 121 1 120 124 69 35.0 3e-11 MEFEWSTNDIAKKTVTIYSTNLTLNKAACRHFEEVDFVLLGIDRNKNVLGIKPVGKKEIN DALYPEDQLHRISIGKSYGRISNKNFISELSKEYNLDLDNNNCVKYTAKFDVIHQILLVD L >gi|223714140|gb|ACDT01000075.1| GENE 7 4308 - 5006 732 232 aa, chain - ## HITS:1 COG:BS_yydK KEGG:ns NR:ns ## COG: BS_yydK COG2188 # Protein_GI_number: 16081065 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 1 227 1 227 236 150 35.0 2e-36 MLKYEMIVNDLQKRIMNEEFEETRKLPTEEKLIEEYGVSRNTIRNAIKILMNLGIIYPVQ GSGMFVRAPKKKGTVYLNSTRGVTMDNPGNKIINKLLDIQIVEADEKLSEQMNCKVGTPV 
YYLKRLRIVDGIPYALERTYYNKEIVPYLGKEIAEGSIFNYLKDDLKISFGFADKYLTAI KLSKEDAALLELEENDPAIMINDNIHLSNGQLFNTSNIIYNYQTTNFYSVAK >gi|223714140|gb|ACDT01000075.1| GENE 8 5162 - 11854 7910 2230 aa, chain + ## HITS:1 COG:CAC1075 KEGG:ns NR:ns ## COG: CAC1075 COG1472 # Protein_GI_number: 15894360 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Clostridium acetobutylicum # 1 637 5 665 665 426 39.0 1e-118 MKRILKGALVALMVVSSTIGVKAESKISAEVQRIIDSMTPVEKVAQMIQADTRWITPAEV AEYKIGSILSGGGAAPTSGNELSNWVTSANSYQKAVIDSGGIPLLYGIDAVHGNNNLYGA TIYPHNIGLAAANNTQLVEQIGEATASEVRAMGANWTFTPTLGVPHNERWGRTYETFGDD VERVTNLGTAYIKGIESDGDTLSSAKHYLGEGLTTNGANQGNVELSDADYQDLINANMNN PMVKELLTPYKQAIKQGTKSIMVTYNSINGKRCHGNKDVVTTLLKENLGFDGIVISDYNG LDQIENQATYKDKAIACINAGVDVLMVAEKDGSTPRWKNLYNALVEAVNEGKISEERLND AVARILTAKEELGFLDNPSKAYADEADQAKFGGSEHRALAKQAVSESLVLLKNDEVNAGK TVMQALADMDNIVVAGSAGDDIGKQCGGWTITWQGATGNTTPGTTIFSGLKAAMDKKGGT ISYNANGVFTNSDNKVDAAIVVVGEDPYAESNGDRSAGQLKLPANDISTIKRIENSHPDL PIILVLTTGRPIAMADYVNDDHIKGIVNAWLPGSEGDGVADVLLGDKDFVGTNPITWIWY PQDITSKYDDSSKVLYPVGYGLKKAQKTGDFTAPADPNVIDLAKTNGKLEAEAFVDAHSE IKLENNGTTVGYLQDGRHMTYKINVPEQAAYKLTVQAAREYAATIDGAFELYLDDELVLE KKNTPIVSTGSWTKFTAQEMTALVSLKAGIHELKLVARDKDFNIDYYIFEKAGDYVGPII PDEVTNEGTGAMLQEGAVEVSMSSSENSQDMAWYKGEFEISNKNATKDPLDLRVADDSSQ TTIVVNDQKTYQSVLGMGTSIEESTIYNLLKMTDENRQAFLRRLLDPVNGMGMSLMRVTV GTSDFTAQDFYTYYDGTGKELDGKPDWNNVTGKGFSIQKDKDLGIIKVINEMQAIAKELG VEDNLKFFASSWTPPGWMKLPTSSSNSYEDNELLLKGGKLNDDYIDDLAKYMVRYVEEYQ KCGIPIYAMTIQNEPLLEINYPSCAMTGTQEAKIAKAIKAELAKSTVLKDNEKEVKLWAF DHNFDGADKFMADFFKEAGDQLNIDGIAFHPYGGNASTMGSFYDRYKDQLSMNLTERSVW GTSGANDIITWLRNGSESYNSWVTMLDSNVGTHHWVGTPDPTLFVQDANNPQRYWATPEV YIMSQFTKYVKPGYVRVDTNNGSSSTVTNVAFKDPETGKIVMIVANRSGTDQKFKVMMKG AQFNAVLPAGNVATYIWDGSDIEINALDVPGTFTADKCSSATNANINTTDHIIDYVNKNA SIDYLVDVKEAGTYNVMIKVASGEDWENKTVNVKSGDTVVGTTVAQRFNFWGGEWQDYRY IQVQANLEKGIQTLTIELPDGGMNLSEIKINKTKYSHDVPGYISATDYCYGERIIVEDGN IGFFGEYDRVNYRVNVQKAGTYQMKMNYASDGEAKFWMDLIQNGISASIGEATLSSTGND 
STYSTGVVEINLPEGESQISFVPQSGVSFNLKALSFGSYIQISSDELEEGKLAGKEITVN LKDGKFETDLNQANWVVEGLPAGITYSLKRINDTKAVIILKGNEVVDYDSNLTVTVKVDA GEIGDTGYSLSDSVMIAAVDDDETLLDIGDLAFGTTEFDLMIEGGKFNQDLKVNDLVLSE EAANYMTVKAVKVSADGNKVTVTLERKNTNYEDVAGTISIKSNGYSDGNVDLTANLKLLK TNNLPDSIDVGDVAVGLKESDAYRKKGTLVNGAKGDYIDFYLNIKEEGNYVLTYKIKDSE AITNGLKLSGGLGLATDNLGSVSFGKYWGNAQGYAQMLNLKAGEQTLRFEVNSAGFELTN LKIEKLTQAIEVSDTLDQTTTITADKVVDGSKEIGWGIEGSTTKNIGFGSAGAYQDYYLD VKTAGLYDVKVNYSHDCGGDTKAVIMTVDGKITATLGEVTLKNSGGWSNWKDSDTIQIKL EAGKQFIRIYDDLDGFNYRNFQLTFKGEQDVTAPEITGNDAVIYVNDATDIKDLLGLIVT DDVDGDLLDKVTISGEYNYNVAGTYTITVTVSDEAGNETSKEFTVKVVEPAKLIVEDKVL TVGDKFDPLAGIQVLDVDGTDITKNIIVISDGVDTSKAGSYEVIYRITDALGNEVEFRRT VEVKDIEPDVKPVDPDEQTPSDNSGQQTIPGQVVEAAKTSDDVNITGMFGILVLAGLGWI GSRKRKFCDC >gi|223714140|gb|ACDT01000075.1| GENE 9 11980 - 12351 478 123 aa, chain + ## HITS:1 COG:BH0651 KEGG:ns NR:ns ## COG: BH0651 COG1725 # Protein_GI_number: 15613214 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 1 123 1 123 123 89 43.0 2e-18 MKIIINNSSMQPIYEQLVEQIKAQIINGQLHENDVLPSVRALSKELKISALTVKKAYDHL ETEGFTATIHGKGTYVKGANQELLLEERRKEVETELESTIDKGRRYGMNDDEIKALFNLI MEE >gi|223714140|gb|ACDT01000075.1| GENE 10 12355 - 13203 263 282 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 2 246 12 256 318 105 31 2e-22 MLELINVQKDYGRFKLNCSLEVKAGCVTGLIGQNGAGKSTTFKAVLGLISVDSGTVKLFS KDIKRLSIEDKQNIGVVLSDSGFSEYLRIEDIIPIMESLYRHFDKQMFIERVRHFELPLN KQIKEFSTGMKVKLKVLIALSHQSKLLILDEPTAGLDVIVRDELLDMLRDYMEQNEDCAI IISSHISSDLEGLCDDLYMIDQGKIVMHEETDVLLSDYAILKVDEEQFYELDKQYLLCQK RESFGYRCLTNQKQFYLENYPDVVIEKSSIDEVMTLMIRGEK >gi|223714140|gb|ACDT01000075.1| GENE 11 13203 - 13859 675 218 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756223|ref|ZP_02428350.1| ## NR: gi|167756223|ref|ZP_02428350.1| hypothetical protein CLORAM_01753 [Clostridium ramosum DSM 1402] # 1 218 1 218 218 281 100.0 3e-74 
MKGLYIKDFKLMMNQKMFFIVIAAMAIFFAVTQTNIFFVISYATFIAAMFVISSISYDEF NNGNAFLFTLPVTRTGYVKEKYLFAATLATGAWLAATILSIVVVFIQGTEVIILEWFLTA AMILLVALMMVLIIIPVQLKFGQSKGNVAMLLVMGGAFAIGYLVVSVLSNFGIDVFAMID ALSTIGLSGLLMILLLMVAGAGIISYLISCKIMTEKEF >gi|223714140|gb|ACDT01000075.1| GENE 12 13872 - 14309 442 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734169|ref|ZP_04564650.1| ## NR: gi|237734169|ref|ZP_04564650.1| predicted protein [Mollicutes bacterium D7] # 1 145 1 145 145 267 100.0 1e-70 MKKILKILVIVIFVTSLLGCQDKYHETYTLRYYIEGCANCEFFTKKGIPLIEKEFGKHIK IVKYNMDDATSFDKVKKAYDHDLEQLQDFDYDQYGVGPFLVLENHYAQLGVYDIEKFLNN LIRAVSGEKLMAPEKIEKYYYFKDN >gi|223714140|gb|ACDT01000075.1| GENE 13 14397 - 14864 580 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756226|ref|ZP_02428353.1| ## NR: gi|167756226|ref|ZP_02428353.1| hypothetical protein CLORAM_01756 [Clostridium ramosum DSM 1402] # 1 155 1 155 155 284 100.0 1e-75 MKKIFKIMITLVLTVSLFGCQKKEKNVYTETYTLQYFYLEGCPNCENFTKNGLPLIKEEF GDHMKIIEYDMDDTETLTEVKAAYDEVINSIIDFNQDDYGFGPFLVLEGYYAQLGVSDVD DYLENLIAAIKGEELNEPGEIDTYYYLRDGKVKEE >gi|223714140|gb|ACDT01000075.1| GENE 14 14867 - 15568 984 233 aa, chain + ## HITS:1 COG:SA0131 KEGG:ns NR:ns ## COG: SA0131 COG0813 # Protein_GI_number: 15925840 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Staphylococcus aureus N315 # 2 232 3 234 235 295 61.0 6e-80 MATPHNQATKGEIAKTVLMPGDPLRAKFLAETYLENVKQFNTVRNMFGYTGTYKGKEVSI MGSGMGMPSIGIYSYELFSQYDVENIIRIGSCGSFKENVHLRDIIIVQGCCTDSNFAHQY ELPGTYSAISSYALLERAVNEAKEKDVVYHVGNVLASDIFYHADQGSVEKWASMGCLGVE MESYALFATAAYLNKRALTLLTVSDSLVSNEETSPEEREKTFTAMMEIALEIA >gi|223714140|gb|ACDT01000075.1| GENE 15 15568 - 16191 596 207 aa, chain + ## HITS:1 COG:CAC2227 KEGG:ns NR:ns ## COG: CAC2227 COG0560 # Protein_GI_number: 15895495 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Clostridium acetobutylicum # 3 196 2 201 213 99 31.0 6e-21 
MKQKFAFFDYDDTLIHGDSGRALLKYYLKKHPLAVFRLLKVAVAFPLSIIGLVKFQTAKS AWLFPMDNLSDEELNDFYQTCLIPKYYPNVVAELKDKKNDGYLVYICSASIEGYLRFCDL PVDGILGTKTEVIGGKYTSRMIGNNCKNEEKVTRLTAVIAKLGVEIDYENSYAYSDSMHD IPMLKMVKNRIRINKKNGEMTPFIIEE >gi|223714140|gb|ACDT01000075.1| GENE 16 16204 - 16404 263 66 aa, chain - ## HITS:1 COG:BS_yuzA KEGG:ns NR:ns ## COG: BS_yuzA COG2155 # Protein_GI_number: 16080190 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 60 1 61 78 63 63.0 1e-10 MNILQKISLVLTIIRAINWGLIGLFNFNLVDSLFGVDSFLSMLIYILVGIAGIINIMLLF TDLDTK >gi|223714140|gb|ACDT01000075.1| GENE 17 16566 - 17483 938 305 aa, chain + ## HITS:1 COG:BS_ydcC KEGG:ns NR:ns ## COG: BS_ydcC COG2834 # Protein_GI_number: 16077530 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane lipoprotein-sorting protein # Organism: Bacillus subtilis # 11 305 73 373 373 130 31.0 5e-30 MKDSSLDKVIEKVKAYDKYALTCNMEMVENDELKSYLVNVSYLKEKKNEYYKVELYDKSL NQSQIIVKNPDGVFVLTPTLNQIFKFQSEWPNNSPKPYIYQSLIELLEKGEVEKIKTGYQ VKCEVTYPNDSRVVAQEIIFDKSLAPKQVTVLDKDEAEIITADFTDFKTDVKLGKDNFNE KKVLEKSTNEYSNVSSELPLYPVALMGSTLDSEKVSTIDGTTNHILKFTGDKNFTVIETP VTASEEVAVETVSGEVIDLVDGVAFYNDGQLMMMKSGVLCKLYSQDLNKDEMVNVISSMQ TSSLK >gi|223714140|gb|ACDT01000075.1| GENE 18 17570 - 18643 1158 357 aa, chain + ## HITS:1 COG:SPy1802 KEGG:ns NR:ns ## COG: SPy1802 COG0787 # Protein_GI_number: 15675637 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Streptococcus pyogenes M1 GAS # 2 355 4 366 366 259 40.0 5e-69 MSLHRDTYAEINLKYLKENIETTYKKFKRPLMAVIKADAYGHGYREVATYIKDIEYLEMF AVATLPEAIELRELGITKGILILGAVPTSKEEIDLAIKYDISLTMISLDYMHHLETLIDQ QPLKIHIKLDTGMHRIGLTSKAQLDEMLNTIDYNKFILEGIFTHYATADGEQAAFDQQRE LFYDLVADHKFKYIHCCNSAAMAYHHDDRSNLGRIGIIMYGIDPAGNETKEFKQVMSLYS KVALIKKIKAGDRVGYGLTYTADEDEYLATIPIGYADGLIRKNQGRNVYINGKYFEIVGR VCMDQIMVRVDETIKEGDKVEIFGAHISLASMARELETIPYEIICLITKRVDRLYIK >gi|223714140|gb|ACDT01000075.1| GENE 19 18719 - 19138 361 139 aa, chain - ## 
HITS:1 COG:no KEGG:no NR:gi|237734176|ref|ZP_04564657.1| ## NR: gi|237734176|ref|ZP_04564657.1| predicted protein [Mollicutes bacterium D7] # 1 139 1 139 139 227 100.0 1e-58 MEKVKKLWKEISIFGIVLVVFIGLFVYRKITHVEYTTITQSALVEKVKDKDSFVVVIGNN SDNTTLSYQQTMTTFVEKNRSKSLYYVDVSGDSDYTTWLEEKLDITDGTVPQTLVYEKGK VKTAKTGVLSYYRLSQLYK >gi|223714140|gb|ACDT01000075.1| GENE 20 19237 - 20235 1215 332 aa, chain - ## HITS:1 COG:PM0286 KEGG:ns NR:ns ## COG: PM0286 COG1087 # Protein_GI_number: 15602151 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Pasteurella multocida # 1 331 1 332 338 446 62.0 1e-125 MNVLVCGGTGYIGSHICVELLNAGYEVTVIDDFSNSRPEVLGYIKQITGKEVKFYEFNIL DEEKTEAVFKENKLDAVIHCAAFKAVGESVEKPIEYYTNNLTTTLIVSKMMKKYHVNQIV FSSSATVYGDPETVPITEDCKLGETTNPYGTSKAMMERILTDVQHACPEMSVTLLRYFNP IGAHESGLIGEDPKGIPNNLMPYIMKVATGELECLGVFGDDYDTHDGTGVRDYIHVVDLA KGHVKAIEHYANPGVHICNLGTGTGYSVLDLVKAFERVNNVKVKYVIKDRRPGDIATCYA NPARAKEELDWVATKGIDEMCRDTWNYALKHK >gi|223714140|gb|ACDT01000075.1| GENE 21 20338 - 22389 1804 683 aa, chain - ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 683 3 685 690 550 41.0 1e-156 MRTYHANEIGNVAVLGHSGCGKTSIVEAMAYRSGIINRTGSISEGNTLSDYSEEEIKRQS SVNLAVIPVEWNNCKINLVDVPGIFDFVGECEAALSVCESALIVVPSSAGISAGTKQAMF RAKDKAKIIYINGLDNPSSDYATKLQQLKDTYGKAIAPIQVPIMDNNKMVGYVNVAKMEG RIFDGEQTKATDIPIELMEQVRPIKEMIDEAVANTSDELLEKYINEEPFTKEEISWALRK GVMERTLIPVLCGTDQIGIQIILNSIVAFFSAAGDMTNSLIVKNIDTDEEEIIGYDESLP ASIFIFKTIADPFIGRTTLFKVVTGTLYSGSSLYNVKEKEVERIAKLYQPRGKDLIEVDC LHAGDIGAVTKLSYSKTNDTLCIAGYQIIVPEIEFSTPYYGKAIVPLGKSNEEKIANAIN KLLEEDPTLKFESNPETKQQCIYGIGDIHLDMMLNKLKSKFKVEATLENIKIPYRETIKK SITQRTRFKKQSGGHGQFGEVEITFEPTYDYSQSYLFEEKVFGGAVPKSYFPAVEKGLIE SVKSGVLAGYPVLGIKATLIDGAYHNVDSSEMAFKTATAMCFKEAIPKASPALLEPYMKM KVVVDEQYTGDIMSNFNTKRARVIGSDVLNDGLIKIVAEAPMSEVMNYAVDLRSITQGQG 
VFEMEFLDYEFASDNIVKAVLNK Prediction of potential genes in microbial genomes Time: Thu May 26 10:00:16 2011 Seq name: gi|223714139|gb|ACDT01000076.1| Coprobacillus sp. D7 cont1.76, whole genome shotgun sequence Length of sequence - 49904 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 19, operones - 12 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 14 - 102 81.3 # Leu TAA 0 0 + TRNA 108 - 181 79.1 # Gly TCC 0 0 1 1 Op 1 22/0.000 - CDS 228 - 1178 515 ## COG0842 ABC-type multidrug transport system, permease component 2 1 Op 2 45/0.000 - CDS 1190 - 1936 294 ## COG0842 ABC-type multidrug transport system, permease component 3 1 Op 3 . - CDS 1939 - 2877 476 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein - Prom 3076 - 3135 79.3 + TRNA 3059 - 3132 79.1 # Gly TCC 0 0 + Prom 3061 - 3120 76.8 4 2 Op 1 40/0.000 + CDS 3233 - 3889 816 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 5 2 Op 2 . + CDS 3889 - 5328 1313 ## COG0642 Signal transduction histidine kinase 6 2 Op 3 . + CDS 5318 - 5980 790 ## gi|167755293|ref|ZP_02427420.1| hypothetical protein CLORAM_00803 + Term 6015 - 6064 9.2 + Prom 6008 - 6067 5.9 7 3 Op 1 . + CDS 6097 - 6801 720 ## COG1876 D-alanyl-D-alanine carboxypeptidase 8 3 Op 2 . + CDS 6869 - 7381 440 ## gi|167755295|ref|ZP_02427422.1| hypothetical protein CLORAM_00805 + Term 7554 - 7621 30.2 + TRNA 7536 - 7609 79.1 # Gly TCC 0 0 + TRNA 7612 - 7687 93.8 # Asn GTT 0 0 + TRNA 7689 - 7764 76.6 # Glu TTC 0 0 - Term 7823 - 7862 0.3 9 4 Op 1 36/0.000 - CDS 7876 - 10362 1919 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 10 4 Op 2 . - CDS 10364 - 11047 353 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) - Prom 11081 - 11140 8.2 - Term 11089 - 11132 -0.9 11 5 Tu 1 . 
- CDS 11153 - 11548 227 ## gi|167755298|ref|ZP_02427425.1| hypothetical protein CLORAM_00811 12 6 Op 1 . - CDS 11600 - 12127 345 ## PROTEIN SUPPORTED gi|229883981|ref|ZP_04503445.1| acetyltransferase, ribosomal protein N-acetylase 13 6 Op 2 . - CDS 12127 - 13041 820 ## COG4989 Predicted oxidoreductase - Prom 13064 - 13123 5.0 + Prom 13086 - 13145 6.9 14 7 Op 1 . + CDS 13257 - 13457 258 ## gi|167755302|ref|ZP_02427429.1| hypothetical protein CLORAM_00815 15 7 Op 2 . + CDS 13501 - 14418 1041 ## COG2385 Sporulation protein and related proteins 16 7 Op 3 . + CDS 14463 - 15104 729 ## COG0739 Membrane proteins related to metalloendopeptidases + Term 15109 - 15158 12.2 + Prom 15130 - 15189 6.5 17 8 Op 1 . + CDS 15211 - 16200 1386 ## COG1077 Actin-like ATPase involved in cell morphogenesis + Prom 16285 - 16344 7.6 18 8 Op 2 . + CDS 16369 - 19398 3309 ## Csac_2519 coagulation factor 5/8 type domain-containing protein + Term 19402 - 19445 10.1 + Prom 19489 - 19548 8.2 19 9 Tu 1 . + CDS 19586 - 24394 4786 ## Csac_2519 coagulation factor 5/8 type domain-containing protein 20 10 Tu 1 . - CDS 24288 - 25226 629 ## COG1737 Transcriptional regulators - Prom 25259 - 25318 11.6 21 11 Op 1 . + CDS 25609 - 30876 5922 ## Csac_2519 coagulation factor 5/8 type domain-containing protein 22 11 Op 2 . + CDS 30932 - 31810 844 ## COG1737 Transcriptional regulators + Term 31820 - 31860 -0.1 + Prom 32203 - 32262 3.3 23 12 Tu 1 . + CDS 32307 - 32531 295 ## gi|167755312|ref|ZP_02427439.1| hypothetical protein CLORAM_00825 + Term 32648 - 32695 9.4 + Prom 32705 - 32764 7.7 24 13 Op 1 . + CDS 32803 - 35040 2395 ## COG1409 Predicted phosphohydrolases + Term 35063 - 35098 4.4 25 13 Op 2 . + CDS 35107 - 36276 1175 ## COG5438 Predicted multitransmembrane protein + Term 36284 - 36313 1.2 26 13 Op 3 . + CDS 36333 - 37121 967 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 37127 - 37165 2.1 - Term 37113 - 37152 6.1 27 14 Tu 1 . 
- CDS 37165 - 38313 1246 ## COG3589 Uncharacterized conserved protein - Prom 38334 - 38393 5.4 + Prom 38241 - 38300 5.7 28 15 Op 1 . + CDS 38418 - 40106 1523 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 29 15 Op 2 . + CDS 40099 - 40578 484 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 40599 - 40658 5.4 30 16 Op 1 . + CDS 40692 - 40922 334 ## gi|167755319|ref|ZP_02427446.1| hypothetical protein CLORAM_00832 31 16 Op 2 10/0.000 + CDS 40972 - 43242 1598 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 32 16 Op 3 1/0.333 + CDS 43266 - 43712 580 ## COG0691 tmRNA-binding protein + Prom 45080 - 45139 6.1 33 17 Tu 1 . + CDS 45228 - 45968 608 ## COG0500 SAM-dependent methyltransferases + Term 46187 - 46229 0.6 + Prom 46209 - 46268 4.3 34 18 Tu 1 . + CDS 46294 - 46491 142 ## Coch_0559 hypothetical protein + Term 46573 - 46609 1.2 + Prom 46806 - 46865 9.8 35 19 Op 1 . + CDS 46925 - 49051 1253 ## COG2200 FOG: EAL domain 36 19 Op 2 . 
+ CDS 49083 - 49832 669 ## BT_1587 hypothetical protein + Term 49851 - 49901 3.9 Predicted protein(s) >gi|223714139|gb|ACDT01000076.1| GENE 1 228 - 1178 515 316 aa, chain - ## HITS:1 COG:CAC3610 KEGG:ns NR:ns ## COG: CAC3610 COG0842 # Protein_GI_number: 15896844 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Clostridium acetobutylicum # 9 313 11 325 330 101 29.0 2e-21 MYKLIKLILYRIKHNKAFLITYLVLIPIIIALAIYFTNSISHTIQIGIVGDIDTSQNSQI KYTYLDTLPSTSELVLNQYDAIIVQDDKNIDVLSTKGETYNQTITLLVNGQIDSLPVNDN QRGSASNILGFLMMVILLLGGQIYKYYFDERTGINKRILSTSVSCYQYLLSHFIVVYIFL FIPATIIICSALFVFKITLSIALWKFILILALLCFFATSFGLWSNVLSKSLEESMMFGNM FAIIGTIVSGGITQVTNNEIFNYVVQFFPQKQVMTTLSALENKNVLPTSGIIYSLLLSLV LIISAIVIEKRKLSVR >gi|223714139|gb|ACDT01000076.1| GENE 2 1190 - 1936 294 248 aa, chain - ## HITS:1 COG:CAC3404 KEGG:ns NR:ns ## COG: CAC3404 COG0842 # Protein_GI_number: 15896645 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Clostridium acetobutylicum # 2 248 1 247 247 143 36.0 3e-34 MIKFWTLFKHDLYNLISSPGTLSVFLIFPTILILLMGFLFDNLYHTTIISSYDFYGVTMI FFIAMIGATVPANAFLEKHIKNGNTRIFYSPVSRVSIYSSKILTCFLFMALALTINIIIF DTISLVNFGSDKIGYVILLIINFVLFLTILSSAICVTLHSEELTNVILSNSMSVLGFLSG IFFPIASLGAIFEKIASFSPIKWTVDCIFQLIYDGQSTNYWWIMFALLTLSSLLLLVVHK NYRPEDYI >gi|223714139|gb|ACDT01000076.1| GENE 3 1939 - 2877 476 312 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 310 1 306 311 187 35 8e-47 MKKILEVKNINKSYQHNKAINNLSFNVYDGEILGLLGPNGAGKSTTINILSTLLTSDTGT INIFNLPLTSHSNKIKAQLGIVPQEIALYEDISAYHNVAFFASLYGVKKKDLPKQVMNAL TFVGLSDKKNDDPRTFSGGMKRRLNIACAIAHNPKLLILDEPTVGIDPQSRNYILSSLKE LKEKGTTIIYTTHYMEEVEEIADRIVIMDRGTKIAEGTLKELLAKYKDTLIYTIYADDFP VDLAISLSTINGVEEIKTNADNFKITVSKNYDALDKIIMLFVHKKCHINKMETEEGNLEM VFLGLTGKKLRD >gi|223714139|gb|ACDT01000076.1| GENE 4 3233 - 3889 816 218 aa, chain + ## HITS:1 COG:CAC1506 
KEGG:ns NR:ns ## COG: CAC1506 COG0745 # Protein_GI_number: 15894784 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 216 1 216 217 211 52.0 8e-55 MNKILIVEDEQAICRLIKINLSDAGYSCKCAYDGKGAIELIEYNQFDLILLDIMLPEING YELMEYIRPLEIPVIFLTAKGDVKDRVKGLKLGAEDYIVKPFEIIELVARVETVLRRYHK TSTVLSVYDISVDTLSRVVKKNDQVINLTVKEYDLLLLFIQNKNIALFRDRIYEAVWGDY YMGDSRTVDLHVQRMRKKLGLEDKIVPVYKVGYRLEGD >gi|223714139|gb|ACDT01000076.1| GENE 5 3889 - 5328 1313 479 aa, chain + ## HITS:1 COG:CAC1507 KEGG:ns NR:ns ## COG: CAC1507 COG0642 # Protein_GI_number: 15894785 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 471 1 473 473 182 29.0 1e-45 MNFFWKLYFSIMAITLTCFSVGGYMLIQTSFDNSFEREVEGVYQENDILVNSLTLELLPH LEELTIKGGTFQERLDLLKNLVSNMTVETFNGNVSFCIRAENGDVVYQNDTFNDDHKLFD KISQNERGYIVQKDKEHYKLQSLRIFSLNGTKLYFENARDISELFLNRENQFGTLFYYTI ILLLASGAVIFAVTRWLVSPIKKLSKATKGVADGHLIVPVVVSSEDEIGQLTKDFNTMTE RLSTTMKELHEAVERQEVFVGNFAHELKTPLTSIIGYGDMLRSKRLNEEEVIGYSNLIVE EGRRLEAMSMKLLELIVLKKQDFKMYWVTAETFFQGIVDTVDLLMKEQQIEFIIEIEPGK LYIEPDLMKTVCLNLLDNARKAVGQNSKVYLRGKVLATGYQFVVKDNGCGIAATELSKIK EAFYMVDKSRSRSAGGAGLGLAICEQIIELHHASINFESVLGQGTTVTVVLEGVKVDEV >gi|223714139|gb|ACDT01000076.1| GENE 6 5318 - 5980 790 220 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755293|ref|ZP_02427420.1| ## NR: gi|167755293|ref|ZP_02427420.1| hypothetical protein CLORAM_00803 [Clostridium ramosum DSM 1402] # 1 220 1 220 220 318 100.0 1e-85 MRYSKNIIKFAAAIILLIIIPLVPLTVSKIQDDKLIEHLQVEKIKSDNNEIQTSKLTVAE KLELIGDYENKEKNIITTTQVQDMSDENITRIRTIINEQLVILKNLGILTDFNFDGNYVC YNYTLRRYSNVIDSSKSVSVYQVNFTNEEGIFNATIDVDTHLIYQYNYYNKKYIARNYEV IYTFGTAYLGLTEQETYKYLFGIIDNRTDSVSVSSYNDIY >gi|223714139|gb|ACDT01000076.1| GENE 7 6097 - 6801 720 234 aa, chain + ## HITS:1 COG:CAC3297 KEGG:ns NR:ns ## COG: CAC3297 COG1876 # 
Protein_GI_number: 15896541 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 16 233 17 234 241 116 35.0 3e-26 MKVLVVTLLMIGLVMSLFILVNDQVNQSNNQLIAANKVKTNQNKSANSVVKDELLTLVNF ENTIPKDWKVDLVQLNNGQSVDRRIYDDLIAMLQAAKSEGLNPLICSSYRTNEKQEQLYQ NKVSEYLSQGYSKVEASDKAAFWVARPGTSEHQLGLAVDIVSTKNQRLDRSQENTVEQRW LIQNSWKYGFVLRYPTNKNSITGVGYEPWHYRYVGKEHAKKINELGVCLEEYVK >gi|223714139|gb|ACDT01000076.1| GENE 8 6869 - 7381 440 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755295|ref|ZP_02427422.1| ## NR: gi|167755295|ref|ZP_02427422.1| hypothetical protein CLORAM_00805 [Clostridium ramosum DSM 1402] # 1 170 1 170 170 128 100.0 1e-28 MTKRFEFDDDLEELDIDEPEQDEAVEETGNKKQRRPKTKKKKVTKTKRKTKTKTKVKKEK QGRSKFSTFILSLLILLIITGACVGGYFAYEYYIDQQKQIDNLQEQLDDSKKENNSTKQK ENKNDTKKKDNQQSDDAGTDEPPVEDNTSNGNTSDQEGTNDTNNETTEMQ >gi|223714139|gb|ACDT01000076.1| GENE 9 7876 - 10362 1919 828 aa, chain - ## HITS:1 COG:CAC0527 KEGG:ns NR:ns ## COG: CAC0527 COG0577 # Protein_GI_number: 15893817 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 1 826 1 861 863 313 29.0 9e-85 MHIINKLTLKQLLANKKRTVVTIVGIIISVAMFTAVTTFVTSFMEMMRLEVSSYDGSWHV AYDNLTSQQIDSLKNDDAYQNSVLFQKIDTLLIDENNPIFIELNQLDNYSKEIKLLKGNY PANSNEIIISKYYQTQNGVKVGDEITVKTGATQISDKFIETEQLESTAALPVKTFKVVGI YQSTYLDQASYYNSFYTANSSSIISSKMYLTLDHVDNDIFDNGSNTAQSLGINSGDISYH QNLLYYYGISNNDGFNQAMMTAVGIVVVIIMIGSVGLIYNAFAISLNERSRYLGTLSSIG ATKRQKKQSVYFEGLVVGIIAIILGVIAGVGGMAVTFKVINPILANLGQEIGFPLAISFK GILTAVICAIITIFISSIIPAYHASKISPIAAITNQQDTKIKVRNIKTSFLTRRIFGFEG DLAMKNIKRNRNRYRITLFSLIISSILFLTASGFTYYLKESYDMTSININYDGYININSQ NQELINELAHLKNTTQIAVAKETSLNLETSLDNINPELLSFLTENEQNYGNTYFLDLNLV TYNQEYLDSIPNLNKELQNQILINKENNIKLNRKYKQTDILKENLDHLIGTTFNNNGDEI EITLNKIKYTNQYLLGQTPHETHYSLNLLVSNDTFNQLINQYQLNSNNTNIYYTSNNNEA VLKEIKEITDNYPEESLYSFNLSEDIKQANQLILIINIFTYGFIILIGLISIANIINTIS 
TSMALRTKEFSMLKSIGMTPRAFNKMIYFESLFYGIKTLIYSLPLGFIIMYFLYENLAAI FERSFSVPINSFIIITIAIFVIVFTTLAFSSQKIKKLNIIDGLKNDNN >gi|223714139|gb|ACDT01000076.1| GENE 10 10364 - 11047 353 227 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 3 200 4 199 223 140 39 1e-32 MNILSVKNLSKIYGDENNHVIALDNVSFDVEIGEFIAIIGASGSGKSTLLHQIGGVDHPS SGKVIINNTDIYTLNENDLAIFRRNEIGLIYQFYNLIPVLNVKENITLPLQLAHQKVDQK RFNTLIEQLGLSNRLNHLPNQLSGGQQQRVSIARALINQPSLVLADEPTGNLDSKNSDEI IRLLKEANEKYHQTIIIITHDNNIAKLANRVITIKDGKIISDLKRGG >gi|223714139|gb|ACDT01000076.1| GENE 11 11153 - 11548 227 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755298|ref|ZP_02427425.1| ## NR: gi|167755298|ref|ZP_02427425.1| hypothetical protein CLORAM_00811 [Clostridium ramosum DSM 1402] # 1 131 1 131 131 249 100.0 3e-65 MDNTVIKNLIYNQLFAAANYDLIATIAPDDPTKTRILNFSADCKNNANMLDRIYQEENTS SYHPIVQKPQFHGSFIESIHWMLNYEGDSFRLFHINSFYDVYTTAQRQLLTYIAGILNDH AIGLTHISLTK >gi|223714139|gb|ACDT01000076.1| GENE 12 11600 - 12127 345 175 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229883981|ref|ZP_04503445.1| acetyltransferase, ribosomal protein N-acetylase [Sebaldella termitidis ATCC 33386] # 3 168 1 166 166 137 42 1e-31 MDLRLRPYQLSDQSALAALANNPQVSANLKDIFPYPYTPQDAQNYLNFITQNSNNNLIEY AIIVDGLFAGAISITFGEDIYSHLAEIGYWLGEPYWHQGIMKKAVKMIIKYIFDNYDTKI IKAEIFSRNIGSRKVLLANNFEYLVTLKKHAYKRGEFLDLELFELTRDKYQDNQL >gi|223714139|gb|ACDT01000076.1| GENE 13 12127 - 13041 820 304 aa, chain - ## HITS:1 COG:BH3927 KEGG:ns NR:ns ## COG: BH3927 COG4989 # Protein_GI_number: 15616489 # Func_class: R General function prediction only # Function: Predicted oxidoreductase # Organism: Bacillus halodurans # 1 304 1 305 305 359 55.0 4e-99 MEYYSLPQTNLKVSKVALGCMRIASKTPEEVENLVLESLKAGINFFDHADIYGGGKSEEL FGKVLQKHPELRKEMIIQSKCGIRPGICFDFSKEYILASVDNILSRLQTDYLDILLLHRP DALMDPQEVSEAFDELYQAGKVRYFGVSNQNPGQIELLKKYCKQPIIINQLQFGPAHAQM IDSGIYANMDEGIDHDGDILNYCRLNDITIQPWSTIRSSLSEDTFIDNPKYPKLNEQLDK 
LADKYNVSKVAIVTAWILRHPAQMQPIAGTTSIKNMLDTIKGVEVKLTREEWYAIYTAED KPLP >gi|223714139|gb|ACDT01000076.1| GENE 14 13257 - 13457 258 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755302|ref|ZP_02427429.1| ## NR: gi|167755302|ref|ZP_02427429.1| hypothetical protein CLORAM_00815 [Clostridium ramosum DSM 1402] # 1 66 1 66 66 91 100.0 1e-17 MSYQVIKMIVYVVSVMLSMWSLSCFNFDNVIRKAKVREFYLFFLIASLCLGYLLGSFILE FTTIHF >gi|223714139|gb|ACDT01000076.1| GENE 15 13501 - 14418 1041 305 aa, chain + ## HITS:1 COG:BH3748 KEGG:ns NR:ns ## COG: BH3748 COG2385 # Protein_GI_number: 15616310 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Bacillus halodurans # 43 302 48 321 336 204 41.0 2e-52 MKEISVKLGVVFLITVIFLSINVGVKKDETIKGTETKEKKEEKEEKAVQVAVTIDNNVNY IDLDDYLLGVVAGEMPAKFETEALKAQVVASRTFVYNRNLSVDNTTNSQVYLTEDKMREN WGDKYDEYHQKIVAAINDTNDEVMKYEGKYITAMFFSSSNGYTENVEDYFDSSALPYLRS VDCHWDLSIDPTNSRSKTFTKQELKEKFNCDSLDFNIIAYKKSGRVGTLSVGGKNYSGRK VREILGLASSCFTIKYENGKYTFNTLGSGHGVGMSQYGAQGMALEGASYKEILNHFYTNV EIVNN >gi|223714139|gb|ACDT01000076.1| GENE 16 14463 - 15104 729 213 aa, chain + ## HITS:1 COG:BS_spoIIQ KEGG:ns NR:ns ## COG: BS_spoIIQ COG0739 # Protein_GI_number: 16080708 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Bacillus subtilis # 20 211 27 227 283 112 38.0 5e-25 MKLFMKRNRRVLVFTTVVMLFLVGVGVIQLVVNQYRPLKDTAVVKVEDNKDKDTDKEDPQ APAEKMVKPVGDDIKIVKKFYDSSLSDEELEKALVYFEGVYRPNLGIDFSKDNKAFEVTA AFSGTVTKKDNDALLGWIVTVTNETGVSATYQSLSEVNVEKDAVIKQGDKIGTSGENVYE SELKNHLHFILEKDNQALNPEKYFNQEVSKIAV >gi|223714139|gb|ACDT01000076.1| GENE 17 15211 - 16200 1386 329 aa, chain + ## HITS:1 COG:BH3739 KEGG:ns NR:ns ## COG: BH3739 COG1077 # Protein_GI_number: 15616301 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Bacillus halodurans # 3 324 2 323 334 393 60.0 
1e-109 MALSKEIGIDLGTANILIYVKGEGIVVNEPSVVAIDDETKKPLAVGQEAKDMLGKTPGRV KAIRPLKDGVIADFRITEILITHFINKLNLKGIFSRPTILICCPSNITSIEKSAIKDVAE RCGAKRVFIEEEPKVAAVGAGLEISKPTGNMVIDIGGGTTDIAVLSLGDIVTSSSLKVAG DVMDEDIIKFVKDKYKLLIGDRTAEQIKMEIGIAIKGISEETFEVRGRDLVTGLPHTITL TADETELALRETCQTIVRETKHVLEQTPPELAADIMSRGIFLTGGGALLTGLDHLLENEL GVPVFVADDALNCVANGCGVMLENTHYVK >gi|223714139|gb|ACDT01000076.1| GENE 18 16369 - 19398 3309 1009 aa, chain + ## HITS:1 COG:no KEGG:Csac_2519 NR:ns ## KEGG: Csac_2519 # Name: not_defined # Def: coagulation factor 5/8 type domain-containing protein # Organism: C.saccharolyticus # Pathway: not_defined # 506 954 46 493 628 233 32.0 3e-59 MVHRVKQVSIVVLSMLVLLSTLIVGTYAKATNLVSNGDFSQGTNYWSMNQSDGSLAMMSS DNHSLKVQIEKVNSDDYWWSLTLKQDNIALSANKKYQLSFDVEASMSGDISLDIESTSDY NKKYLDRQMIAISNQKKTYTFDFEKKDDDAVTLVYHLAKGDGISDFSNQIIKLSNVTIIE IGDIEIQKAGDWYLNEGDGANGKLSGEPNDLKVKVTSINQENYWWALSLKKENLVLSGNK IYKIVFDAKASKAGTIGIDIEKSADYNVKYLDKQNFDLTTTSKEYSFNFIKKDNDDVSIV FLLNEGDGISNFKDETINIRNVRIEEVNDLGDEYVVNGDFETNTTSGWNTYNSNGSNLKI NAVNQQLEVTFPNSNGSNYWDCQLFQGDILLDNGIYRLSYDLKGSKAGTIYFDVEDTGDY ATKYYPETMVDFTKDVQTYNFEFEINADNLLGTQKNAKIQFNLGPNEHCDSLAGETLYFD NIKIEKIGSGAGTGELQTTNVTFDGNDVLVNNFKGLGVQWDPYVVHPLTDEEWQTVTKRV DFLNPAFVRCMIYANTYCEGFDDEGNPIYDFDSLANQALIRELDYLESRDIEVVLGEWET PGRFGGEFEGITVDDPRWASIIGGFLDYLINQKGYTCIKYFNYVNEANSDWSYCGDFDKW QTGINYLHTELDKYGLNEKIKITGPDTVWDSDNTWLKEINSNQDLDAKIGLYDTHMYPTI DEITNGTIEEMVAEQRSCVTGKDFYMTEIGMVTGKSDGDSQPYTKEFSYGVIMADAASQV MRGGFSGLAIWDLDDAMHDQQNGFPITDIRSLKQWGFWNSVAGRVFNQPEEEEIRPFFYT WSLMANLFPRNSKIIGSTANKELNGLRTVGMEKDGQMTYMIVNDSNSPKEVTIDVKNLNA NNLKLFKYDYFDNDRKVDSNGYPVASKVLENVNLEEGYEVSLPSGGVVMLSTIDIEDAKK ELVDQNDSKQDNENKNEVAIKTGDDLKTAGFMISGLLATAFIYGMKKKG >gi|223714139|gb|ACDT01000076.1| GENE 19 19586 - 24394 4786 1602 aa, chain + ## HITS:1 COG:no KEGG:Csac_2519 NR:ns ## KEGG: Csac_2519 # Name: not_defined # Def: coagulation factor 5/8 type domain-containing protein # Organism: C.saccharolyticus # Pathway: not_defined # 345 919 69 608 628 159 24.0 8e-37 
MKGMHKHAIALLLSASLLQSNVMPLWANTAPSVRETNQVDELDDFSKVLNHSDGWQTVID EPNNFFGDSSRLIRTEDSAQYLVYQLDNLTDFKVRMHSYLRNLRYLVNIYQSNDLSSWEA VDYTITSAKTSDSAFWGTFDLMPKNILDGKNKYLKIEIKNNENADQTKTVKGSDEVNFLK AVQIGKVELTDSKAGLITNTNTDLKTIKLEGFDDSKTSTLKISENKTIKLINPTIYDVKI ESSNPKVVKYENNQIIGVSNGTAYLKGIVPGELEVFRYEIKIGTGENLDISPKYTKPADD TGDSRYIKQGNGEETDNLITVDPDNIIVKDYMGGAMELEMIEWFDMDDARWERYFQRIQY MELGFVRCMLAAYWYCTGFDDHNNPIYVWDCLNEDGSPKYYSRDGYPIDPLGNNIEPNIK RMKRLYQLFDFFKKNNITVMTGEWQYPEAAEWYYVPGENLKQYNLTLDDPRYAQIAADFI EHMVKDKNYTCIEQYDLGNEVNINPGSYQYDKWKQANLNLYDQLKQRGLEKTISLTPDLS YWLDLWYNQSIADLHDVFEQYEFHWYVNEVNLKNGVFENEMRMLVNNAAGQDNPDKKVYM GEVGMRDYLNNVDQVTNIDSFEYGLWMANILVQGTRAGMSGLTAWSLDDSVHINGPMTYD YSVDQKSLMKVWGLWNSLGEYQGVSEDIRPWYVPWALASKFMPKGSQIIYTSPTEQQVLV KIPEATKKVDLSVYAYFKDDLNQDENGYPVRSDFKKNLDLQEGIVLTLPADGNLVLTTME GENKFNSKDSGYFSDSMQSLQNTVDYQNIGFNSYEKHPNNAQNRINMNQDIFGDVTRINR AGKDKASLVYQRAEGIKNFRIHSYVKGAVGSNFKVYGSNDKNEWTEIKINEIEKYSLTKG WKFAVLTPESEIEYKYIKIELGNEVSSTDIQVGKVELTNQSEFSEPLKLEIPDDAQSSEY KEIMKDVQLATNDIMSIYSISGFENLPENCTLKSSDENIVKITDMSFKAVNSGSALVNVY RDGKIIGSIIVNVYAFHDSVEDGNGLVYQLNNMKYDLMADANDTTFSRTNDGGNVIYKLV GIKDFSIRGYVADSNDKLFEQIKIYASKDGNTWESIEIKVDGINALSNPYWHYANIIPQE AITKDYDYLKIAIEGTLTDHWKIQVGDVRIQDKPFNSVVNKIELANQDLKLAVGNSQIIG AKVYPEAVFPFKVIFDAADNEIVSVDKDGLVTGLKAGKTTVRVYSQDNPEIYQDCTIEVT EVIDSSKEIADSIKELTIINNQIVLPVVEGYEITIKSSSIPNIIDLDGKVTPSNQDEQVA LVLAVKNKATNVVSDTANIIVTVAKKDETVMIEFDITDVVLKVYQEFDPLADIKVFENDT DITKNVRIISNNVDLNHHGKYFVTYEVTDSKNITHSHTRNVEVRGIKSISCPEYQVDLTS LKDLDPDVFLNIVKVDKQVILDKLNSKLIDDQIIISCYDLAMMLNNKEYVLTDNQLTIVW NVTSKMQGFKVLMLDKDGNVVEVPMTVNGKEITFMVEQLGTFAIVGEQKVETKEPDENKP ENKVETAVKTGDEVYYAEYAVLGLLALSLVYGMKKRGYVDGR >gi|223714139|gb|ACDT01000076.1| GENE 20 24288 - 25226 629 312 aa, chain - ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 1 272 1 271 283 140 33.0 3e-33 MSLLRKLEMAVDFSTSEKLIATYILKNGPDVLNMSTSQLAKATYTSPATITRLCQKLEYK 
GYNDFKIALSAQLQYTFDRNSETNPNFPFKKRDTIHTIKHSIATLAKDSINLTVNNIDND ILAKIIIMLDQAKVIDIYGVSGPLRIASDFQYKMFRIGKNVQIAPMVNEQLFQAAQSNKD HCAILISYSGETNEVIAAAELLKRRKVPMIAITSFGESRLSKLCDYVLFLDSRERIYSKV STFGSTISIHIMLDIIYSCLFARHYDENLELKLETDHIIDHRHILFFSFHKLMIVPIILT QHIPHNKLHHQF >gi|223714139|gb|ACDT01000076.1| GENE 21 25609 - 30876 5922 1755 aa, chain + ## HITS:1 COG:no KEGG:Csac_2519 NR:ns ## KEGG: Csac_2519 # Name: not_defined # Def: coagulation factor 5/8 type domain-containing protein # Organism: C.saccharolyticus # Pathway: not_defined # 1 558 1 492 628 187 27.0 4e-45 MNRKFKRFLTSLLSLAVVASSGIGPVKAVDNTPKLDSKVYINSDNVLTKDFEGFGVQWDP SDLFDYTDEQWSSFYEKASFLSPNVMRVMLHDGDSYCIGFEDDGTPIYDWESVMMKRVYK ILDFAQQNDIPIMLGEWRSISERGYLSYDDHGKTVNWSSPTWARMIGDCLEHLIVDKGYT CIKYYNMINEPNYYKRDHGDVTNEYVYDQWKQAITNLRNEMDSSGIEKIENIKIVGPDVY DSQEAWINQATSDEMKDKIELTEVHRYAPQSEVESGLIEKKLKTWKEQAESLDPEVAEEG FALGEMGISGTGPGDSQLNARKYDYGVDIFDYGVQAIRAGLKFGSVWGFEDSMHVQHNDI VNNFKDQYGPAATTEEGRDYVVHTPTGDPSIDNDIKIWGFWNELGEEMAAQNAEHNVTGR ANTVKASDENLKPWYYTWSMLCRYFPSGTKILETTDSYVDQLRVTSGIIPVADDKGDISI AVVNNSSSQKTLEVNMPNAASKVDLNQYFYYDGEIDGKTRPVNEKGQLLQYDTIENANLK DGVNVTLPAKSCMILTSLGYEGESHPMSFTTGQTTDVEKVEIYETTNANQLEVGKSYQLA ANYIPSISKGEMEWSVVDYFGNASDKALIDENGLLTVKKAGQFKVIGNLKGKPEIQDTLV FKATSSSILVDELNDLENGVALSYQDIIKDDNSANFNGDKTVKRSDSNANGKPGIITYQA NNIYDFEFSAYSLNNNLDKSGNFVVEVSQNGDSWSPVECEFIQGSKLSSGWYPYTIKNKA VIKDDGYQYLRVTITSKSGYKTYDPQYAGGSIYYDYQGASQIDIQSHNEFIVKGQELQFK AEVLPSIASQEVSWKVLSLEGKPTELATISPDGILTAKAKGEVVVVATAKDTDVSAYSHI NIINGYFVDEIEDFSKMYQYGEFAFDDPKSSNFTDKTRIKRLSDSNQSIVYALSDIEKAT FEIYKNGSFVNDSVDIYASSDGLTYNKVEKNVVNAGKAASSTEYYLYEVSTKELGDNVNY IKLELKNDEIIYCPMLGKTKIIYNPIENVEVTNVTLDQKELSINVGQSKTLVKKLAPLYA DEQLTWSSRDEDIASVDQNGIVTAKKVGSTIIYAKYNDEIYATSVINVLGENMALNKDVS ASTETNQWNKNPTGKLVNDGDYDTRWVSKDGTSITKEQITIDLGEVTTVDNVKLYWESAR ATDFNIEVSTDGSKYDIVEKLRDEDKNKLTNEISFDPVEARYVRMQGLTPATKYGYSIYE FEIYNNSDLKLASSVEFKEMSSELYLGENTALDVVVTPDDATYQDASYTSSNEFVVVCKN NQMIAVGEGTATISANVNGQKITKEITVVKDNARKIAQELTSLTVENGRVDFPTVDGYSF 
SVFSSDLEKVIAKDGSVNLPIEDQEVALVVTVSKDGSEDSANTDEIKVMVKGSRDKYELL EKRIKEIEETDWTVYKPGTVKTFKEKLQAVKESLNAEVLLVSEVEKAGENLEAAFKGLEI KSDKKTLDALIKELEELDTKVYTEVSVARVEKALEKAETVSENEDASEAEVNDAVQALYE AKLGLVEQINYDNLKDRIEEIEKEDLSIYSEATAARLKEQIAEAKEALKNENITNEELQE ELAGLNEKYSSLKLNHEPGSVEELIKRIEEMDLSGYSEASVKELERVIKEIKEELLEGVD QVRYAKLAEKLQTAYEGLERKSEVKPVEPEQLETPKGDEQAAKTGDNMIIGTWLITMSAA VMVVIFLNKRRSLEK >gi|223714139|gb|ACDT01000076.1| GENE 22 30932 - 31810 844 292 aa, chain + ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 1 272 1 271 283 152 34.0 8e-37 MGLLKKIELANNFSSVERILGKYVLENGEKVLNMSTKELAKATFTSPASIVRFCRKLEYE GYNDFKIALAAQLQYSRLSNDDINANYPFKGTDSVYTVASNIANLSKESIDITMKNLDLE ELRLVVLMITKAKVIDIYGVSGPLRIASDFQYKMFRIGKDVRISQMINEQLFQAAQSTSE HLAILISYSGETEEVIEAAKILYRRKIPAIAITSFGENRLVKYTQRVLYLNSSEFIYSKI ATFSSTLSLHLLLDIIYGCVFSKNYEDNLNYKIETDNLIDHRQSSIAIDKFK >gi|223714139|gb|ACDT01000076.1| GENE 23 32307 - 32531 295 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755312|ref|ZP_02427439.1| ## NR: gi|167755312|ref|ZP_02427439.1| hypothetical protein CLORAM_00825 [Clostridium ramosum DSM 1402] # 1 74 17 90 90 142 100.0 5e-33 MRLRTGARADQEVYLPFFDFENDIICAKLAEKLQTAYEGLKKKSEVTLVEPEQPKTPKGN EQAAKTGDNMIIGT >gi|223714139|gb|ACDT01000076.1| GENE 24 32803 - 35040 2395 745 aa, chain + ## HITS:1 COG:CAC0205 KEGG:ns NR:ns ## COG: CAC0205 COG1409 # Protein_GI_number: 15893498 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 65 542 47 496 652 224 34.0 6e-58 MNENLKKRIIVSLATVSAFSSLAMVPLSAYGEYQGNDGTQKLNNGLTSTSDYNDWYNNQW NNQESGEMDTGKIVLTPGAKATDLNFAWYSEETGTPTVKISTNQDMSGAKTVTGSADKIN KNNSFKNYTASNKVALKDYLVENMTYYYQYSTNGVDWSDTYTYKTHSFSDYQAVLVGDPQ IGASGSNGQGTQDDTDIAVNTYAWNKTLQKALGAGGIAENASFILSAGDQIDYSSSGTNG SGEIIREQEYAGFLYPDVLRSTPLATTIGNHESMVDDYSLHYNNPNASTLGSTESGGDYY 
YSYGDTLFISLNSNSRNVEEHRQLMKEAVASHEDAKWKVVLFHHDIYGSGSPHSDVDGAN LRILFAPLMDEFNIDLCLTGHDHSYARTYQILDGKVIETDGVSENASKAYNPEGTLYIAA GSASGSKFYTLNTVKQYYIAERSNTPEPTFSTIDFSGDSLTIKTYDYNGQKYANDVTLSK DGNAKSIEEMKNEVAAIDTVNVTSGSKNRIDEALIAVNTALDTRDDSTAETELQNKWNTT SDPLNYYGYAQNGFANENSTALKRGYSSLLDKTLYENDSNKAVTTATIDEAYNKLATAKN EVVTKAEFAEVQTKFDQIGSTLAQISIGDKKDQYTRADVDAFKKSIAALKVDFNEATITK TALNELSTQLDTVTNEFLAKKNTEDITTAPIVTPSTTPSKTPVKTTSSKVKTGDDTSINL AGITAFVSLLGIAGTKLFKKRKIEE >gi|223714139|gb|ACDT01000076.1| GENE 25 35107 - 36276 1175 389 aa, chain + ## HITS:1 COG:CAC0206 KEGG:ns NR:ns ## COG: CAC0206 COG5438 # Protein_GI_number: 15893499 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Clostridium acetobutylicum # 30 376 21 374 397 224 40.0 2e-58 MKELLKEFKQLSKKEKIGHFGVWLAILAFLIFLYFFNNTVEKTPLMDQEGAVFEKARVTE IISENRTENGLQQGTQIVNTEILSGDYKGQIVETTNIDSYLYGADCKVGTRVIVQLSEYN GTLSASVYNYDRTNTLYTMVAIFLILLVVIGKRKGFTSALGLIFTFICIIFLYLPMLYLG FSPFFSAVAVVVLTTLVTMYFIGGFSMKTLCSVLGTIAGVVVAGIFASSFGALGHVSGYN VNDIETLLYIGQNSKLDISGLLFSGILIASLGAVMDVAMSISTTIEEIKYHNPSISRKDL FKSGIKIGGDMMGTMSNTLILAFTGGSLSTLVVFYAYDMPFLQMFNSYDMGIEIIQGIAG SLGVILTVPFVSIIAAILMTKKKKGDFPK >gi|223714139|gb|ACDT01000076.1| GENE 26 36333 - 37121 967 262 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 260 1 253 256 116 31.0 3e-26 MTKVLFFDIDGTLTDTHGGLPEIPAGAQRKMKELQERGYLLFLASGRPYAFIPEILTNFG FDGAVLCNGAHVEMNGQFIYHQPTDLKKTIDLVNHLDENNFEYIIETKRGAYLDPSFKVL EKFFISCNINEEFLIEDFNREDIIAECLKLEVSVPDEHQAKIEEIILNDFNYDKHGTDNA FEIYSNVISKATGVQKVLEYLNLNVQDSYAFGDGLNDLEMIQTVGTGVAMGNAVDELKAV SDLVCDSVHNNGLEKVLNDLFG >gi|223714139|gb|ACDT01000076.1| GENE 27 37165 - 38313 1246 382 aa, chain - ## HITS:1 COG:lin1829 KEGG:ns NR:ns ## COG: lin1829 COG3589 # Protein_GI_number: 16800896 # Func_class: S Function unknown # Function: Uncharacterized 
conserved protein # Organism: Listeria innocua # 1 380 1 357 361 246 39.0 4e-65 MHKLGISVYPDKSPKEEVYAYMEKAAKLGFSRIFTCFLSIPEDKRESYLVEFKEFMDKAH ELGFEVAADTNPEVFKIIGATPGNLKPFADLGLDIIRLDGNFGTQGDIQITRNPYGIKIE FNASMDMGVELLINHGGNKDQMIMCHNFFPERYSGLDFNLFMEYNRYWKELNLHTAAFVS SNNANTIGPWEVFCGLPTVEIMRGLPIDLQARTLLATGDVDDILIGNYPATDEELEALSK INYQAIEIRVDEVPEITDNEKLIMYDFAPHWDRYDHSSFMLRSSMPRVKFKEKATVQDSG FTSNTEIAGSKSIPHHDCGKKVFTRGDVLIVNDNLAHYRGELEIVLTEIPNDGERNLVAT VKEEEMILLDFVTPGHHFVLKK >gi|223714139|gb|ACDT01000076.1| GENE 28 38418 - 40106 1523 562 aa, chain + ## HITS:1 COG:BS_tagO KEGG:ns NR:ns ## COG: BS_tagO COG0472 # Protein_GI_number: 16080606 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Bacillus subtilis # 5 340 9 346 358 226 40.0 8e-59 MNPSLIMSFVVTMIVSAILIPLVMKAGKELGIVAHKNKRTVHKVEVPRIGGYAIYISSLI GAVIFLKTDPQINAILIAGFLVFFVGLIDDVHDLSPKTKLAVELIAALIVIVYGDIYLKG FDFMPSNWPPIIPGIITVLWIVGITNAINLIDGLDGLSSGISIIVLFTVSMTSLTSGRTD IASLSLVLAGAIMGFLFYNFHPAKIFLGDCGALYIGFMISVISLLGFGYNVSTFFTLGAP IVVLMVPIMDTLIAIIRRKVNHKKFSEADRAHLHHNLMFKLKLGHRKSVLVLYGVTFLFS LTSYIYLYDATLGTLMFVILMFIFELFIETTSMVSQKYKPVLTLANVFIQSDRLPKIKFL ERYRANRSEKRKVIDHVVMLCCLLIVIGGAGTYILYSREGASKNNPATVPVETPYIHEEG NELLNDIYVRLDKAYKNNLTSEECQLVASYFAVDYFTLKDKKAGEIGGLDYIYPSLQSEF SSFASKSFYTYRETYPKLEVKNYEIISFAPSKVVIDDLDDNEYYNVLISMEFNRKVDELP KSVNIVLVLDNDRFYVVGVDNA >gi|223714139|gb|ACDT01000076.1| GENE 29 40099 - 40578 484 159 aa, chain + ## HITS:1 COG:SP0950 KEGG:ns NR:ns ## COG: SP0950 COG0454 # Protein_GI_number: 15900828 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Streptococcus pneumoniae TIGR4 # 40 158 48 162 166 59 30.0 3e-09 MLSYLKVENIDNELIEFHLRQKNYYSLEGQVVTREYLINEMRCPDGFDEIDHYIEKCYLE GELVGLVDYQYGYRYSMIHDDECVWIGLFLIDQDKQNCGLGKRLFNDCLKKFKQRCERIQ LACLVRNEAGLVFWKKCGFNEIGNSKYGDLPVIVLEKRI 
>gi|223714139|gb|ACDT01000076.1| GENE 30 40692 - 40922 334 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755319|ref|ZP_02427446.1| ## NR: gi|167755319|ref|ZP_02427446.1| hypothetical protein CLORAM_00832 [Clostridium ramosum DSM 1402] # 1 76 1 76 76 104 100.0 2e-21 MFGTILNVLLVVISIALIILCLLQSGKSDGIVNALTGQSSNLFAQQKERGADLVMTRVTT GLAIAFFVIAIILRMS >gi|223714139|gb|ACDT01000076.1| GENE 31 40972 - 43242 1598 756 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 1 701 3 705 730 620 44 1e-177 MKEKILELLSDENYLERSIDLIAAKLNCNKSDKFVNLVKLMNELEDEGKVVRNKYNHYYL PNQFGMILGTLTINKRGFGFVVVEDQEKDIFISPDDLKDAFNKDTVLVELKKHSDEERPE GRVIRVIKRGQTRLVGEIKKGKRDCFVDVDDPMFDKPIFVDRAHLHGAMPGHKVQVEIKT YKPILKGDVVKILGHRNDPGIDILSIVYEHDAPVEFPQAVYDQIENIPDSLEGIDIGQRI DLRDEVVVTIDGDDAKDLDDGISLKKLDNGHYHLGVHIADVSYYVTEESPLDKEAFARGT SIYLADRVIPMLPHKLSNGICSLNPHVDRFTISCFMEIDENGEVVEHDIVPAVINSTERM TYTNVNKILDGDQTLQLQYSHVKDLFFLMQELAMILQAKKARRGAIDFDVKEAKVLVDSK GNTTDVVLRERGASDRIIEEFMLLANETVAEHFKWLELPFIYRVHENPKPKKLLQFSGIA KTLGYTIKGSLENVYPGELSNIIEVSKDTPEHTIIATLLLRCMQKARYDEQCLGHFGLAD EYYTHFTSPIRRYPDLIVHRLIRKYMFENRLDRRTIMHYQELMPEIALQTSAREREAIMI EREVDDMKIAEFMEKQIGEEFEGIISSVTNFGFFVELDNTIDGLVHVTDLTDDFYFFDEK NIRYIGQRTGKVFKMSDRVKVRVSSASKKDKSVDFEIVGMKSNKKTTKTVIINKRKDNDR KNNRKRNDRRKKRHEEPRFTNNHFKRSKKGKSKRRA >gi|223714139|gb|ACDT01000076.1| GENE 32 43266 - 43712 580 148 aa, chain + ## HITS:1 COG:TM0254 KEGG:ns NR:ns ## COG: TM0254 COG0691 # Protein_GI_number: 15644629 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Thermotoga maritima # 1 146 10 155 158 155 55.0 2e-38 MKIVSNNKKAYHDYFILDTYEAGIELKGTEIKSVRLGNVNLKDAFVRIKDNEAYIENMHI SPYSHGNQFNHEPLRNRKLLLHKKEILKISNKLKEGGLTVVPTKLYFNTGSKVKLEIGIA RGKKLYDKRQDLKERDSKREIEKALKNY >gi|223714139|gb|ACDT01000076.1| GENE 33 45228 - 45968 608 246 aa, chain + ## HITS:1 COG:TM0938 KEGG:ns NR:ns ## COG: TM0938 COG0500 # 
Protein_GI_number: 15643700 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Thermotoga maritima # 38 234 42 235 254 60 25.0 3e-09 MKNYYGSLCTEMYEILHKDAPEDELDFYLSYTRKDMSILEPLCGSGRFLVPFLERGYDIK GIDLSKEMLAKLKEKSPNAKVFQSDILEYDTVEKYDYIFISSGSVSLFTDMVLCRKILKK MKWLLKKGGKFVFAVDTVANRCPDDSDYRTAIAIKTKERYDLILKNKNYYDEKTHTQFSP SIYELYKGAELLQQEIMDFQTHLYELGELEQHLQDIGFTSITVYSSYEKIIAKNNHTEMF LYECSF >gi|223714139|gb|ACDT01000076.1| GENE 34 46294 - 46491 142 65 aa, chain + ## HITS:1 COG:no KEGG:Coch_0559 NR:ns ## KEGG: Coch_0559 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 65 106 170 170 85 67.0 5e-16 MSENYIRNHFDDIKKYGNVIENRLDDENCTLDSVLKDNAQFSKLAQKSKMNYILIDDPYK IDIDL >gi|223714139|gb|ACDT01000076.1| GENE 35 46925 - 49051 1253 708 aa, chain + ## HITS:1 COG:sll0267_6 KEGG:ns NR:ns ## COG: sll0267_6 COG2200 # Protein_GI_number: 16331091 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 447 690 5 248 253 184 36.0 5e-46 MKVKKTRRQFYDSIYILAIKHGLMLAIPFLILGSFALLFNSFPIPDYQVFLAAFLDGGLK DGLSIIYNCSLGSIALILIITISLSYGKLNGQDDVFFYPITAIISYLSFCGGLHEKTYIF QSDWVFTAMCITLLTCAMLKKGMHLVNHLEKLYTVGTNYIFNRAIQGIFPIVFIIIIFTS AGQIMKGLLGDINIVNFGSYFFIYLFNWFGNGIIGTLLYVFFVHFLWFFGIHGTNTLDMV AKQLFEPGVQINQALIQNGQLPTELFSKTFLDIFVFIGGCGTALCLILAIFIAAKKSNNK KLAKVAGISVFFNINEIVIFGFPVIFNPIMLIPFILTPLVLTIISSIAMSTGIVPYISQS VEWTVPIILSGYQATGSIAGSFLQIFNIIIGTMLYIPFVKYSEQIQSKEFEESILALEND MIEGEQNGSIPVFLSDHYHYHFPAKTLAMDLRNAMYLHQLQLYYQVQMTAEEQVYGVEAL LRWNHPVCGFVAPSLLISLAYQGGFLNELGLYIIEKACQDAEQIDRLAIKMNLSINISPK QLEDADFVSNVLKIINKYHLQYIELVFEVTERALLNTTNIITKRILELRDSGIKLSIDDF GMGHSSITYLQENIFDEVKLDGSLVQHLLENDRTREIIMSITQIAKRLNFSVVAEFVETE EQINVLKEAGCKIYQGYYYAKAVPLDEFLQYCLITRNGIIRSPINEKV >gi|223714139|gb|ACDT01000076.1| GENE 36 49083 - 49832 669 249 aa, chain + ## HITS:1 COG:no KEGG:BT_1587 NR:ns ## KEGG: BT_1587 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 249 1 250 250 407 75.0 1e-112 MDDEFINITTENVFNEHLCCIIRSKKFHPGIDAKRQWLSKRLNEGHVFRKLNERGTVFIE YAPLEKAWVPIVGDNYFYIYCLWVMGSYKGKGYGKSLMDYCLADAKAKGKSGVCLLGSKK QKHWLTDQSFAKSNGFEVVDTTDDGYELLALSFDGTFPRFTQKAKKQKIENKELTIYYDM QCPFVYQSLEIVKQYCEINNVPILMIQVDTLQKAKDLPCVFNNWAVFYHGEFETVNLLDK NTLTRLLKK Prediction of potential genes in microbial genomes Time: Thu May 26 10:01:47 2011 Seq name: gi|223714138|gb|ACDT01000077.1| Coprobacillus sp. D7 cont1.77, whole genome shotgun sequence Length of sequence - 9533 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 6.5 1 1 Op 1 . + CDS 93 - 770 797 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase 2 1 Op 2 . + CDS 786 - 2060 1591 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 3 1 Op 3 . + CDS 2050 - 2817 557 ## COG0561 Predicted hydrolases of the HAD superfamily 4 1 Op 4 . + CDS 2819 - 4309 1406 ## Cphy_3566 hypothetical protein 5 1 Op 5 . + CDS 4366 - 5187 810 ## COG1737 Transcriptional regulators + Term 5224 - 5260 5.0 - Term 5212 - 5248 0.4 6 2 Tu 1 . - CDS 5268 - 6146 741 ## COG0583 Transcriptional regulator - Prom 6197 - 6256 7.2 + Prom 6192 - 6251 11.9 7 3 Tu 1 . + CDS 6276 - 7751 1574 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 7783 - 7824 6.6 + Prom 7815 - 7874 9.4 8 4 Op 1 . + CDS 7919 - 8173 253 ## gi|167755334|ref|ZP_02427461.1| hypothetical protein CLORAM_00848 9 4 Op 2 . + CDS 8188 - 9048 1018 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 10 4 Op 3 . 
+ CDS 9088 - 9532 472 ## SUB0840 transporter protein Predicted protein(s) >gi|223714138|gb|ACDT01000077.1| GENE 1 93 - 770 797 225 aa, chain + ## HITS:1 COG:lin2933 KEGG:ns NR:ns ## COG: lin2933 COG3010 # Protein_GI_number: 16801992 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Listeria innocua # 1 225 6 230 231 258 54.0 5e-69 MKRIQNSLIVSCQALEDEPLHSSLIMSKMALAAKMGGAKGIRANSVQDIHEIKKEVDLPI IGIIKKDYEDTDIYITTTMKEVDALVAEGVDIIAMDATMQLRPHNQTITEFFKEVKAKNP HQLFMADCSTVEEAIHADNIGFDFIGTTLVGYTPQSQNSRIDANDFKIIREIVKNVKHPV IGEGNIDTPSKVKRALELGCYSVVVGSMITRPQLITKKFVDEISR >gi|223714138|gb|ACDT01000077.1| GENE 2 786 - 2060 1591 424 aa, chain + ## HITS:1 COG:lin0033 KEGG:ns NR:ns ## COG: lin0033 COG1455 # Protein_GI_number: 16799112 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 13 416 19 430 452 262 39.0 1e-69 MMDKLSNVLLPIAEKLSKNRYLTAIRDGFVSIMPLVITASLFTLINSVFVGEGHYFDQWF GAPCNDFAKIGSVISSASMSIMAILLVFTTAKALANEYKMDTSVAGATAVVCFLCLTPFV SDAKIGEYVTTYYLGAAGMFTAFISALISVELMRFLLGFKALVIKMPESVPTGIARSFNA IIPVALTVIIFAILRIITDAIGAPLNDLIFNWIQTPFTAIVSSPVGLIVIYVLYMLLWGF GIHSAYIFNPILEPIYLASLAINEAAITSGTAATEIITKPFIDSVAFMGGAGNMLALVIA IFIVSKRDDYKVIGKLGFVPALFNISEPLMFGLPVVMNPILIIPMILTTLAGLGIGALAT QIGIMAHTYVLIPWTTPPVISAFLATGGDIMSGVIGLLILVVSVLIYIPFVKVMNKEGIE SIEE >gi|223714138|gb|ACDT01000077.1| GENE 3 2050 - 2817 557 255 aa, chain + ## HITS:1 COG:HI0597 KEGG:ns NR:ns ## COG: HI0597 COG0561 # Protein_GI_number: 16272540 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Haemophilus influenzae # 4 254 6 258 272 72 27.0 7e-13 MKNKVIVFDLDGTLLGANNEIIGGTETLNCLDSLQQMGWTLAICTGRLDHDILKIEEQYK LKIEHRISQNGAVCMKGQQLIATLIDKQEAINIYKEIQKEDIRIELNTVSNRYWKSERDP AFPSELYDSHIICKDFDKLILYQPAVLFLVIGTEDKLRSIETYINTTYQKTKAVMTSSTS 
LEIMHKDASKGNAVKMLYPKSEVYAIGDSPNDFDMFPIAKKGYLVANKPCQFPCAQKESI LEALKDIIKINKEDR >gi|223714138|gb|ACDT01000077.1| GENE 4 2819 - 4309 1406 496 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3566 NR:ns ## KEGG: Cphy_3566 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 494 6 498 501 348 39.0 2e-94 MKTLFIPLDERPCNYNFPLMIAQSNQQIELVEPDKSFLGHKKKPAQVEYFKQFIDENIEG CDNCIISIDMIVYGGLIPSRLHYLSETDALQRLELIKYLKQLKPQIKIYAYHCIMRAPSY NSSEEEPDYYENYGFALFRRKYLLDLKKRHGLNQKEKDELLGIDIPQEIITDYEQRRLFN TKINLKVIDFLEAGYLDFLVIPQDDSSPYGYTAISQKIVINALKEKRLDQKVMIYPGADE VGLSLLARAYHEYYHLEPKVYPFYASVLGPMIIPKYEDRPMYESLKSHVRVCKAKLVNNP QDADVVLAINSPGKIMQEAFIDKNDLDVTYTSYRYLLAFAEQIQDYINSGYHVALCDSAF SNGGDLQLIEYLDELNILDQLVSYAGWNTNCNSLGTTLSQAFIGQENVINNLCYRLIEDV FYQAIVRKEVVENDLPHKGLSYYDFKDQQTDVENIIKKKLQTNFGHLHLSQRYPFNITKI YMPWKRMFEIGMEIKL >gi|223714138|gb|ACDT01000077.1| GENE 5 4366 - 5187 810 273 aa, chain + ## HITS:1 COG:SP1331 KEGG:ns NR:ns ## COG: SP1331 COG1737 # Protein_GI_number: 15901185 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 6 249 5 251 269 193 42.0 2e-49 MLSYNRSVIPIIESAYEELTNVEKNIANYFIDQVEDEDLSSKAVSQRLYVSEASLSRFAK KLGFSGYRQFLFAYQDSHQSSRHLDLLTKQVLNSYQKVLEKTYSLIDNDQMVRIAKMLDE YNRVYIYGIGSSSVVAREFKLRFMRLGLDVDYLAESHSIRMNITRVNQESLVIGISVSGK TEEVIEGLREAKEKGAKTIMLSAARVYEYRSYYDELILIGGLKNLAISDKISPQIPALIV VDILYSHYLNYNNDNKKEKLQMTLEHINYELEE >gi|223714138|gb|ACDT01000077.1| GENE 6 5268 - 6146 741 292 aa, chain - ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 289 1 287 291 235 42.0 6e-62 MEIRILKYFLAVAQEESITKAAEILHTTQPNLSRQLNMLEAEVGKKLFERGSRKVTLTEE GMFLRKRAKEIIDLTERTESELSTYGETTSGDIYIGAPETYVMHSIAEIFKRMHNQYPNI KYHIFSGSTLEVSEQLNKGLLDFAILIEPIDLEKYNYLKLPYTDTWGVLMRRDSPLAKLN AITPEDIKDEPIFLAHQQSSANVLSGWFKEYYRNLNVIGSFNLITTPAMIVESGLGYVFT 
FDKLINTTGNCNLCFRPLEPNFETGFYLVWKKYQIFSRSAKMFLEELQKVLF >gi|223714138|gb|ACDT01000077.1| GENE 7 6276 - 7751 1574 491 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 491 1 473 473 549 53.0 1e-156 MSFPKNFLWGGATAANQCEGAWNEDGRGPALTDVTTGGSVKESRQVTYIDKDGKHCKIVT SEGQVQLPEGAHFACLDECLYPNHDGIDFYHHYKEDIKMFAEMGFKTFRMSISWSRLFPH GDEETPNQKGIEFYRSVFKELRKYNIEPLVTIWHFDTPLYLIEKYGGWSNRKLIKFFDHY ARTVFTEFKGLVKYWLTFNEINNTIMGLGLFYEAKEEDYQRAYQHLHHQFVASAHAVKIG HEIDSENKIGCMICGITSYPATCDPKDVLANRHQWEKSIFYCGDVQCFGKYPTYAKRLWD EHNVKLDITEQDLIDLKEGTVDMYTFSYYMSSLITTHEIADAVGGNFTTGARNEYLQYSD WGWAFDPDGLQYYLEMIYDRYQRPLMVVENGLGAFDTVEEDGTIHDDYRIDYYRDHILAM DKAINNGVDLIAYTTWGCIDLVSAGTGEMRKRYGFIYVDKHDDGTGTMKRSPKDSFYWYQ KVIQSNGQDLD >gi|223714138|gb|ACDT01000077.1| GENE 8 7919 - 8173 253 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755334|ref|ZP_02427461.1| ## NR: gi|167755334|ref|ZP_02427461.1| hypothetical protein CLORAM_00848 [Clostridium ramosum DSM 1402] # 1 84 1 84 84 136 100.0 4e-31 MVKLNVQKISEKEIVIALLDAGMDTEKINEFIESINSEYFQKSKNILLSQRAELLLKIRD GQEKLYCLDYLTRTLNQAGIFDKK >gi|223714138|gb|ACDT01000077.1| GENE 9 8188 - 9048 1018 286 aa, chain + ## HITS:1 COG:Cgl1021 KEGG:ns NR:ns ## COG: Cgl1021 COG0656 # Protein_GI_number: 19552271 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Corynebacterium glutamicum # 25 283 8 264 267 303 55.0 3e-82 MILKENYKLNNGVEIPKLGLGTWEIADNKVADAVKKAVKLGYRHIDTAQGYENENGVGIG IKTCGVERENLFITTKIQADFKTYEEARRSIIDSLKRLELEYIDLMIIHAPQPWTKFREE NHYFKENIDVWKALEEFYKMGKIRAIGLSNFEIVDIENILNNCEIKPAVNQILAHISNTP FDIINYCQAHDILVEAYSPVGHGEMLKNNEVKLIADKYGVTIPQLCIKYCLQLDTLPLPK TADPGHMQNNADLDFIISDEDMMLLKNINKIKSYGEFSVFPVYGKI >gi|223714138|gb|ACDT01000077.1| GENE 10 9088 - 9532 472 148 aa, chain + ## HITS:1 
COG:no KEGG:SUB0840 NR:ns ## KEGG: SUB0840 # Name: not_defined # Def: transporter protein # Organism: S.uberis # Pathway: not_defined # 1 148 1 148 309 140 56.0 2e-32 MAALTTLFPVFFMLALGFIARVKGWITLEQKNGANAIIFKILFPILVLNLMCTATIETDH IKIIGYVFGVYLLALVAGKLLAPLAGKKYAHFSPYLLTVVEGGNVALPLYLSIVGKSSNT VIFDIAGTVIAFIAFPVLIAKEAATGSS Prediction of potential genes in microbial genomes Time: Thu May 26 10:02:06 2011 Seq name: gi|223714137|gb|ACDT01000078.1| Coprobacillus sp. D7 cont1.78, whole genome shotgun sequence Length of sequence - 18616 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 10, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 674 492 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase + Term 736 - 785 -0.6 - Term 723 - 773 4.2 2 2 Tu 1 . - CDS 799 - 2385 1498 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 2621 - 2680 10.2 + Prom 2697 - 2756 8.5 3 3 Op 1 . + CDS 2786 - 3514 347 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative 4 3 Op 2 . + CDS 3469 - 3987 331 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative + Term 4050 - 4096 2.2 5 4 Op 1 1/0.000 - CDS 4509 - 5384 665 ## COG0583 Transcriptional regulator - Prom 5404 - 5463 6.8 6 4 Op 2 . - CDS 5473 - 6420 873 ## COG0598 Mg2+ and Co2+ transporters - Prom 6555 - 6614 12.3 + Prom 6682 - 6741 2.5 7 5 Tu 1 . + CDS 6775 - 7626 797 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 7656 - 7700 -0.8 + Prom 7658 - 7717 6.1 8 6 Tu 1 . + CDS 7742 - 9112 1504 ## COG0733 Na+-dependent transporters of the SNF family + Term 9125 - 9156 1.0 + Prom 9170 - 9229 8.3 9 7 Op 1 . + CDS 9293 - 10225 1156 ## COG0714 MoxR-like ATPases 10 7 Op 2 . + CDS 10228 - 11133 787 ## CTC00362 hypothetical protein 11 7 Op 3 . 
+ CDS 11133 - 13163 1631 ## COG1305 Transglutaminase-like enzymes, putative cysteine proteases + Term 13170 - 13202 2.3 12 7 Op 4 . + CDS 13223 - 13426 254 ## gi|167755350|ref|ZP_02427477.1| hypothetical protein CLORAM_00864 + Term 13510 - 13559 4.5 + Prom 14384 - 14443 7.0 13 8 Op 1 9/0.000 + CDS 14477 - 15469 1403 ## COG2984 ABC-type uncharacterized transport system, periplasmic component 14 8 Op 2 13/0.000 + CDS 15474 - 16346 969 ## COG4120 ABC-type uncharacterized transport system, permease component 15 8 Op 3 . + CDS 16346 - 17131 210 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 17183 - 17237 13.2 16 9 Tu 1 . - CDS 17550 - 17720 161 ## gi|237734241|ref|ZP_04564722.1| predicted protein - Prom 17951 - 18010 10.3 + Prom 17741 - 17800 14.0 17 10 Tu 1 . + CDS 18047 - 18529 306 ## SPO0629 ISSpo3, transposase Predicted protein(s) >gi|223714137|gb|ACDT01000078.1| GENE 1 3 - 674 492 223 aa, chain + ## HITS:1 COG:TM1009 KEGG:ns NR:ns ## COG: TM1009 COG0656 # Protein_GI_number: 15643767 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Thermotoga maritima # 1 219 46 269 286 186 45.0 3e-47 TARNYGNEREVGEALIDSGISRAEIFLTTKIYGADSYRQACQYIDEALNKLQTKYIDLML FHWPSGNITETYRAMEDYYKAGKLKAIGLSNFFIHDFNTVISSCTIIPMVNQVEAHIFHQ RKVFQKDMNQYGIKLEAWSPLACGKNNIFGNLILEKIAKSHRKSTAQVALRFLNQKGIAI IPKSIHKDRIKENWEIFDFELTNDEMKEIELLDCDCSLFSWYE >gi|223714137|gb|ACDT01000078.1| GENE 2 799 - 2385 1498 528 aa, chain - ## HITS:1 COG:BH2025 KEGG:ns NR:ns ## COG: BH2025 COG0488 # Protein_GI_number: 15614588 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus halodurans # 1 508 1 511 539 404 41.0 1e-112 MIEIGMHKIYKNFGYKQVLKDINFEILTGDRVGIVGKNGSGKSTLFKMIMKKEFPDTGNI TIRKQATIGYLEQIPTISDNDITVKDMLMSSFFEIHAIEQRMRALEQDMMHNNVDEAILN KYAKIQDQYIALGGYEVEEKLNKVITGFKLHSILDKPFAILSGGQKTIINLAQLILTEPD 
ILLLDEPTNHLDITMLEWLENYLNKYRGTIVIISHDRYFLDRVTNKTILLDNSENRLFHG NYSYTLKEQERLLLQEFEQYKTQQKKIEAIKASIKRYRDWGNQGDNEKFFKKAKELEKRL EKIEILDKPQLEKKKLPLNFLGERSSKEVLKVIDFGISFNNLSLFSKVNFTLYYQEKTVL LGNNGSGKTSFIKALLGNLNNYQGELKMAETVLIGYIPQEINFINNNDSILQTFCHEYPC LEGEARGILAKYSFYQDQVFKRVGNLSGGEKVLLKMAILMQHKVNLLILDEPTNHIDIET REILEDALSNYSGTLLFVSHDRYFIDKLAQRKLVIENNTINSVYCDYK >gi|223714137|gb|ACDT01000078.1| GENE 3 2786 - 3514 347 242 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 7 228 3 217 396 138 36 2e-65 MKRKYPKVIISEEGEKWLDKGQMWMYRNNAVNIDETIENGALVDIITTTGRYLGTGFISK NSHITVRILSKDINEIFDRNFFKRRIQFAYDFRKTVEKENLTNCRLVFGEADELPGLTVD RYNDILVCQISSYGLDKIKDMIYEILLEVLTEDGQDVKGIYERNDIKVRAKEGLPLEKGY WRNINLPTTTIINENGINLSVDVENGQKTGYFLDQKANRVLLRNIAYGKKFWIALVILGA LH >gi|223714137|gb|ACDT01000078.1| GENE 4 3469 - 3987 331 172 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 1 165 219 386 396 132 39 2e-65 KKVLDCFSHTGGFALNAAYGKAKEVVAVDVSQTALDQGYANAKLNNLQGCISFVKDDVFD YLDKCKKGQFDIIVLDPPAFTKSRRTVDHAYNGYKKINMKAMKLLGRGGYLITCSCSRFM ETDNFEKMLRESAHEAGVTLKQVSVTQQNPDHPILWTMEETSYLKFYIFQII >gi|223714137|gb|ACDT01000078.1| GENE 5 4509 - 5384 665 291 aa, chain - ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 287 1 286 291 277 47.0 2e-74 MEIRTLQYFLTIAREESISGAAEYLHVTQPTLSRQMKELEEELGKQLFIRGKRRITLTDE GMILRKRAEEILGLVERAEAEVKANEELLTGDIYLGCGESEGMRPIAKTIATMLEKYPHV KFHLHSGKAEEVMEKIDAGVLDFGIVIEPVDLNKYDYLRLPYSNAWGILMKRDSTLASLD KITPDDLRGKPLLCSNQGMVKNELAGWIGGNQRKLNIVGTYNLLFNASLIVEEGNLYALC LDKIFNSNNSSLCFRPLEPKLEANMLIIWKKYQLFTKSAETFLNMIKDNIT >gi|223714137|gb|ACDT01000078.1| GENE 6 5473 - 6420 873 315 aa, chain - ## HITS:1 COG:lin1052 KEGG:ns NR:ns ## COG: lin1052 COG0598 # 
Protein_GI_number: 16800121 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Listeria innocua # 23 314 24 314 314 242 44.0 5e-64 MLEFYKTFGTETKKIDKPEPGSWISAIAPTEEEKNYLIEEMGILPEFVKSSLDAEESSHI DYDEDYNQTLVIVDYPSAEEVEDGYDKNMLQYTTLPLGIVIMKGYVVTISLYENLNIDDM AQGRIKGVNTDLKTRFLLLLLLRISQRYLIYLRQIDRISSRTEQRLHKSMQNKELIQMLG LEKSLVYFSTSLKTDEITLNKIMRGKAIKLYDEDQDLLDDVLIEIHQAIEMCNIYSNILS GTMDAFASVISNNLNIVMKVLTVITIVMAIPNIIFSFYGMNVAGLPFPQWWFPTGLAIVA CIIATIIFIKKDMFH >gi|223714137|gb|ACDT01000078.1| GENE 7 6775 - 7626 797 283 aa, chain + ## HITS:1 COG:CAC0426_1 KEGG:ns NR:ns ## COG: CAC0426_1 COG2207 # Protein_GI_number: 15893717 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 1 124 1 124 127 135 47.0 1e-31 MNWTDGISNALKYIEEHLKEDIKIVDVANEAYVSSFYFQKAFRILCGFSVAEYIRYRRLS LAGSEVITTNKKIIEIALEYGYDSPDSFTKAFTRFHGVTPTVARKEQLMIKSFAPLKIQF TLKGGSTMDYKIIEKEAFTVLGKARRFDFDCAFNEIPKFWGEHMQSADKKVCGVYGVCIE DEKDEGFKYLIADEYVLDSEIPDGYEVCEIPQLTWAVFACHGAVPKSLQEVNTKIFSEWL PNNDTYEVAEHYNIEMYTPCEDYPRGNHDENYYSEIWIPVKKK >gi|223714137|gb|ACDT01000078.1| GENE 8 7742 - 9112 1504 456 aa, chain + ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 1 454 7 456 459 452 55.0 1e-127 MNRERLSSRLGFILLSAGCAIGLGNIWRFPYMVGKYGGGAFVLVYLFFLIILGLPIIVME YSVGRASQKSIAKSFHVLEKKGQKWHIFSYVAMVGNYLLVMFYTTIAGWMLAYFVKMLNG DFIGLNPSEVNNVFVSLQADPQASIFWMILIVVIGCGICAVGLQRGVEKVTKVMMGLLLG VMLLLVVKSLSLDGAIKGVEFYLVPDFNLLMENGIFNAVYGAMGQAFFTLSIGMGGMAIF GSYIGREHSLTGEGLRVLALDTFVATMAGLIIFPACMSFGVDAGSGPGLVFVTLPNIFNV MENGQIWGTLFFVFMNFAALSTIIAVFENIVSFSMDLLNWSRKKSVILNFIIIVLGSIPC AVGYSVLSGFQPFGPGSAVLDLLDFLVSNVIMPLGSLVFLFFCTRRLGWGWKKFMTEANA GEGLHFPEKAKFYISWILPLIVIFIFLFGLWEKLFA >gi|223714137|gb|ACDT01000078.1| GENE 9 9293 - 10225 1156 310 
aa, chain + ## HITS:1 COG:BH0604 KEGG:ns NR:ns ## COG: BH0604 COG0714 # Protein_GI_number: 15613167 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Bacillus halodurans # 1 303 1 306 318 269 44.0 6e-72 MNEKLELLINEVKKAVLGKDDVLKKVVIALLANGHILLEDIPGVGKTNLAVALTKAIQLD YQRVQLTSDVLPSDLLGYSVYDFETKEMTFKKGPLFTNIFLADEINRTSSKTQSALLQVM EEGMISVDGLTYPTAKPFIVIATQNPFGSAGTQMLPDSQLDRFMMRLSLGYPDEMSEIEI INRKKSGNPINSINPVLSPQELLMMQEAVTKIHLDESLVKYIVQIVQQTRIHEKIELGAS PRASIALMKAAQASAYLNGRDYVIVEDINENLYETLNHRIFLMPSVKKNRENFQIIMEDI MMKIAPIAYK >gi|223714137|gb|ACDT01000078.1| GENE 10 10228 - 11133 787 301 aa, chain + ## HITS:1 COG:no KEGG:CTC00362 NR:ns ## KEGG: CTC00362 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 9 298 8 324 386 76 24.0 1e-12 MFRYRLAYLLLIGFGVFFYIAFVGYFSYYFLLLILILPVLSLFYLVMSFKFTKLDFAILN QKVIQDDFVKIKIIKDNLGLGAIRFVVEDQKYLIKGNQDGLTLVFKHCGGRNFIINTYYQ YDCLNLFCIKKRCHYEIPITIYPKRVELDFKEYAHQLPRVGDEVYAVNRKGDDPTEIYDI HKYQEGDQLKNIHWKLSARYQDILVKDNAMLVGEVINLYVSFDDNDDHNDLVFGYLDTFC GFLLKRQIGFLLSNKEIKSIQEYDEMFKYLLWNKEYQSDISKHNYEFVISYNGIMKVEGG R >gi|223714137|gb|ACDT01000078.1| GENE 11 11133 - 13163 1631 676 aa, chain + ## HITS:1 COG:DR0620_2 KEGG:ns NR:ns ## COG: DR0620_2 COG1305 # Protein_GI_number: 15805647 # Func_class: E Amino acid transport and metabolism # Function: Transglutaminase-like enzymes, putative cysteine proteases # Organism: Deinococcus radiodurans # 433 552 372 479 638 85 36.0 3e-16 MKLLIYLLGIIGSVGSLVYSFNFSSNNHFIFISCIIVGLLIYLGFNYIQGKKKFVIGSII ADGILLLIPSTMDCLTYITSIIIYKYREVSVYDFNFESTYIFYDDPQICILAFLLIFIPL FLSAVIAIDKQKYTLAILALLPGVFIELLFTITPPWYFLACYVLYVLILLIGALQKGAIL KIPMIIISVVAMAITYISFPISTYRPSKYSLFDNARTPISTPGNIKEEYNVNSQGDRHYR NSLDFTIAGEVTLNNFKIRGIAYDLYEDGKCGTSHSRVETEWFKNNLEKIANITKTSRQV IEVNQISGYSQRNYTPYFIINDDMTYYGDHYEGKNPQTYEMIIPNDDFNALLSTIDYKAK GELLKEIAERNGTQDYYDEYFGQGNEDDEKLTSVPEETKSIIENFLKQHNVIDNGNIFNY ITQCTNALAANTSYTLRPGNTPDNVDVVDYFLNTNKKGYCVHYASTLALMLRSRGYPARF 
VVGYQVPGSKNNAGKLIVRDSNAHAWVEIYDEYLGWIPIEATPTSSENPNTPTDTITPAP NQGDKTPQPVEPEKPDIQQISQNDSFQIPVYIYYLAGGMIFIFIVLFQARIRKKRMFKGA ANSNQKVCYYYYYLTKLKINCDVIKTIIDKARFSQHQISIEELEIVEQFYQDKTNAYFKQ ANLFKKIYLRYLIAVL >gi|223714137|gb|ACDT01000078.1| GENE 12 13223 - 13426 254 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755350|ref|ZP_02427477.1| ## NR: gi|167755350|ref|ZP_02427477.1| hypothetical protein CLORAM_00864 [Clostridium ramosum DSM 1402] # 1 67 1 67 67 123 100.0 3e-27 MAKLIKETTKSERENYLNKLFACKNGDCENCGVCKIFAGTSPQEVYYDYIIGKREFLEIS QTFNSRK >gi|223714137|gb|ACDT01000078.1| GENE 13 14477 - 15469 1403 330 aa, chain + ## HITS:1 COG:AGc4844 KEGG:ns NR:ns ## COG: AGc4844 COG2984 # Protein_GI_number: 15889928 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 35 319 40 323 332 196 38.0 7e-50 MRINKLFKLMAVSALTMTALTGCGNSSSNGDVKEVAIIQYVEHTSLNTIKDAFDEQMEAL GYKEGENIKYTFKNAQGDMNTAPSIIQDFKSSKKDCVVAIATPVAQTAAEMSTDTPVVFA AVTDPIKAGLTTSLEKPDKNITGTSDEIQVEMILERALQVNPDLKKIGVIYNKGEVNSVT NIAKAKAFCDDKGIEMIETTVTGVNEVQSAIDVLTSKCDAVFAPNDNTVANAMDMVGPAC AKAKVPLYVGADSMVQDGGFLSVGINYEDLGKETANMVDKILKGEKVENIPVKVFKENLS VYVNTKVLKQLGITLPNEIKNDKAYVEIKE >gi|223714137|gb|ACDT01000078.1| GENE 14 15474 - 16346 969 290 aa, chain + ## HITS:1 COG:FN2080 KEGG:ns NR:ns ## COG: FN2080 COG4120 # Protein_GI_number: 19705370 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Fusobacterium nucleatum # 5 275 1 266 278 194 43.0 2e-49 MNQTVILGALELGGIFAILSLGLYISYKVLNLPDLTVDGSFALGCAVSGMFTINSMPYVG LIGSFVIGAAAGVVTGLLITKFKIMPLLSGILTMTGLYSINLAIMNDTPNLSLFGNNTIF SSFTAFDPYGKIILIYLIVIVICVILDFFLRTQLGLSLRACGDNEDMVRASSIDTDKMKI LGLALANGLVAMSGAVFAQHQSFADISSGTGMMVIGLASIIVGTTFIKKEKIIFQLVAVV FGAIFYRAVLTVALQLGLPSGYLKLLSAVLVIVAIASTNGAFRSKKRGRN >gi|223714137|gb|ACDT01000078.1| GENE 15 16346 - 17131 210 261 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 227 1 218 245 85 27 2e-16 MLELKDVSVVFNEGTVNEKVALNNINLALDKGDFVTIIGSNGAGKSTLFQAISGAVDTKR GNIILNGRDITFEPEYKRSKGIGRLFQDPLKGTAPNMTIEENLHLSNQRGKHFSLSLMSH RHREEFKKALMELELGLEERLDSKVGLLSGGQRQALTLLMATLVTPDLLLLDEHTAALDP KTALKVLELSQKIVEEHQITTLMITHNMEDALKYGNKTMIMKDGQIIAMIEGKEREEMTV EGLIHLYSTSSQEYTDRVLLK >gi|223714137|gb|ACDT01000078.1| GENE 16 17550 - 17720 161 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237734241|ref|ZP_04564722.1| ## NR: gi|237734241|ref|ZP_04564722.1| predicted protein [Mollicutes bacterium D7] # 1 56 1 56 56 102 100.0 8e-21 MELTKEILFGILDAYAYEIVFVDRNHIVQYMNKTAKQRYGNRVKIGNSLFNCHNEN >gi|223714137|gb|ACDT01000078.1| GENE 17 18047 - 18529 306 160 aa, chain + ## HITS:1 COG:no KEGG:SPO0629 NR:ns ## KEGG: SPO0629 # Name: not_defined # Def: ISSpo3, transposase # Organism: S.pomeroyi # Pathway: not_defined # 12 159 12 158 297 84 34.0 1e-15 MSHIKRKPKKDSMIDLINKYSHDHLACVQYFFNIKWPTGFYCDKCGCTHYYFNEKRTLFE CADCGHQHYLFAGTIFQDNKLPLFKLILGLYLFFSANKGCSAVELASELDVNYKTALKLC KKCRVLMTLSNSERILDSMFYEADTLYIGAKTSNKPRNGN Prediction of potential genes in microbial genomes Time: Thu May 26 10:02:25 2011 Seq name: gi|223714136|gb|ACDT01000079.1| Coprobacillus sp. D7 cont1.79, whole genome shotgun sequence Length of sequence - 6378 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 240 232 ## EF2255 phage integrase family site specific recombinase - Prom 391 - 450 7.4 + Prom 320 - 379 8.0 2 2 Op 1 . + CDS 503 - 1009 595 ## Ccel_0849 NusG antitermination factor 3 2 Op 2 . 
+ CDS 1018 - 1518 547 ## Ccel_0204 NusG antitermination factor + Term 1555 - 1609 11.2 + Prom 1587 - 1646 8.8 4 3 Op 1 5/0.000 + CDS 1731 - 2999 1237 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis + Prom 3016 - 3075 2.2 5 3 Op 2 . + CDS 3096 - 4733 1337 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 6 3 Op 3 7/0.000 + CDS 4730 - 5641 444 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 5716 - 5775 3.2 7 3 Op 4 7/0.000 + CDS 5801 - 6199 239 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 8 3 Op 5 . + CDS 6199 - 6376 119 ## COG0463 Glycosyltransferases involved in cell wall biogenesis Predicted protein(s) >gi|223714136|gb|ACDT01000079.1| GENE 1 3 - 240 232 79 aa, chain - ## HITS:1 COG:no KEGG:EF2255 NR:ns ## KEGG: EF2255 # Name: not_defined # Def: phage integrase family site specific recombinase # Organism: E.faecalis # Pathway: not_defined # 1 77 1 72 369 63 46.0 2e-09 MPRRGENIFKRKDGRWEARYVKAISVDNKKIYGSVYASTYTEVKQKRNNICNSLNKEKVN KQNINYQVLGYYIVEWLNV >gi|223714136|gb|ACDT01000079.1| GENE 2 503 - 1009 595 168 aa, chain + ## HITS:1 COG:no KEGG:Ccel_0849 NR:ns ## KEGG: Ccel_0849 # Name: not_defined # Def: NusG antitermination factor # Organism: C.cellulolyticum # Pathway: not_defined # 1 167 1 172 173 83 30.0 3e-15 MNWYLIFANDNKINDLLLYFNNHPEMTAFVPKIEKLMKKDGKKVFAEVPMFPNYIFIESE FNSQEFYQIIESLEKDMDSTMRIMQSDEQKVLSLANNEKELLESLFNDDHLIIRSIGVIV DSKLIVQDGPLVGKEEMIKKIDRHKRVAFIGDVFGKTMKVPLEVTSKT >gi|223714136|gb|ACDT01000079.1| GENE 3 1018 - 1518 547 166 aa, chain + ## HITS:1 COG:no KEGG:Ccel_0204 NR:ns ## KEGG: Ccel_0204 # Name: not_defined # Def: NusG antitermination factor # Organism: C.cellulolyticum # Pathway: not_defined # 4 166 16 187 188 75 32.0 6e-13 MKDNWYVLFALVAKENKLCSVLRRKGLDAFIPKIEYYRRDIKGNTLKMLFPGYIFVRSEM KQSDFDNLLYELGEQRDGLIKQLKDDGVTALRDEEIEMFNKLLNSEGILEMSQAFIEDNK AKVIYGPLIHYQDHIVKIDKHNRIAILDIEFLNRHILAGLEIKSKI >gi|223714136|gb|ACDT01000079.1| 
GENE 4 1731 - 2999 1237 422 aa, chain + ## HITS:1 COG:BS_yvfE KEGG:ns NR:ns ## COG: BS_yvfE COG0399 # Protein_GI_number: 16080476 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Bacillus subtilis # 9 285 4 272 301 279 48.0 7e-75 MKFKPFDNKVWLSSPTMHGEELEYVKEAYETNWMSTVGVNINEVEKLTCEKIGSKYAVAL SAGTAALHLAMKLAGIKVYGMPKVGYGALEGKKVFCSDMTFAATVNPVVYEGGIPVFIDS EYETWNMDPKALEKAFEIYSEVKVVVIANLYGTPSKLDEIKDICDKHNAVLIEDAAESLG ATYKGQQTGTFGKYNIISFNGNKIITGSAGGMLLTDDLEAANKVRKWSTQSRENAPWYQH EEIGYNYRMSNVIAGVVRGQYKYLDEHIAQKKAIYERYKEGLKDLPVKMNPYIEDIMEPN FWLSCLLINKEAMCKQVSNDSGVLYISEPGKSCPTEILEAITSINAEGRPIWKPMHMQPI YRLNPFVVRDGNGRARSNAYIAGGVADVGMDIFTRGLCLPSDNKMIVEQQDRIIEVIRAC FE >gi|223714136|gb|ACDT01000079.1| GENE 5 3096 - 4733 1337 545 aa, chain + ## HITS:1 COG:Cj1124c KEGG:ns NR:ns ## COG: Cj1124c COG2148 # Protein_GI_number: 15792449 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Campylobacter jejuni # 26 215 2 189 200 189 50.0 1e-47 MKKKAKKTTYKAENIEAIPTRKMGFYEKYVKRAIDVVCASVAIICFSPLYIGVAILVKFK LGSPVLFTQDRPGLIGEDGKETIFKMYKFRTMTDERDENGELLPDEVRLTSFGKWLRSTS LDELPEAFNILNGTLSICGPRPQLVRDLTFMTKEQRMRHTAKPGLSGLAQVNGRNAIKWE DKLNWDLKYIENVTLLKDFSIILKTVKTAFIKQEGITDGDMATAEDLGDYLLKSGKVDRE EYIEKHTKAKKILKGEVDVEDNIEIDKIHHVPFSVAISVYKSDNPIFFDRALNSITENQT ITPNEIVLVVDGSVSDSLNEVISKYENKYDIFKIIRLKKNGGLGNALKIAVENATFELIA RMDSDDVSLPTRFEEQLRYFQVNPEIDIVGGNITEFIGEENNIIGQRLVPVSNEAIREYM KERCAMNHVSVMYKKTAVQNAGGYQDWFWNEDYYLWIRMWLNGAIFANTGSVLVNVRVGE EMYQRRGGSKYFESEKGLQDYMLKNKMINHSTYIKNVAKRLIIQKLMPNKLRGWVFRTFA RKKVS >gi|223714136|gb|ACDT01000079.1| GENE 6 4730 - 5641 444 303 aa, chain + ## HITS:1 COG:PAB0772 KEGG:ns NR:ns ## COG: PAB0772 COG0463 # Protein_GI_number: 14521365 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # 
Organism: Pyrococcus abyssi # 7 245 5 243 298 122 33.0 7e-28 MNYVEGLVSVIMPTYKRSEKLLRAIDSVLNQTYKNLELLLVNDNEPFDEYTELLKKRVEK YSNDERFRLILQDKHINGAVARNVGIRQARGQYIAFLDDDDWWELNKLEEQVKELKSLDD SWGAVSCKFTLYDQNGNVIGKTKKYRDGYIYKDILNLMSDVATGTLLMRHDYFDTTRYFD ENLLRHQDLQLLVDFTSKYKLKEVNQYLHCVDVSDTQNRPDAQKLIQYKKAFFRSIKPVM DSLSEKERKCIYAMHKYELGYVCLKNGEKCQGLGYCKAVLKSPKACGLALKKTFLKIIQI CKR >gi|223714136|gb|ACDT01000079.1| GENE 7 5801 - 6199 239 132 aa, chain + ## HITS:1 COG:SP0074 KEGG:ns NR:ns ## COG: SP0074 COG0110 # Protein_GI_number: 15900019 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Streptococcus pneumoniae TIGR4 # 8 94 93 179 185 57 37.0 6e-09 MTSEPYLISIGDNVTIAGNVTFVTHDNSISKIDTRYANLFGYVEIGNNCFLGQNVTVLYG VTLANNIIVASGSVVCNSFNEENIIIGGNPAKKIGNWNSLEVKGKCYAMKKREAFEAAEN KEYRKFIKRKKM >gi|223714136|gb|ACDT01000079.1| GENE 8 6199 - 6376 119 59 aa, chain + ## HITS:1 COG:SP1764 KEGG:ns NR:ns ## COG: SP1764 COG0463 # Protein_GI_number: 15901595 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 1 59 1 59 301 68 57.0 3e-12 MEELVSIVVPVYNMGNSIEICVNSLLKQDYANIEILLVDDGSKDNSLEKCNKLKDKDNR Prediction of potential genes in microbial genomes Time: Thu May 26 10:02:33 2011 Seq name: gi|223714135|gb|ACDT01000080.1| Coprobacillus sp. D7 cont1.80, whole genome shotgun sequence Length of sequence - 673 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 673 187 ## COG0463 Glycosyltransferases involved in cell wall biogenesis

Predicted protein(s)
>gi|223714135|gb|ACDT01000080.1| GENE 1 1 - 673 187 224 aa, chain + ## HITS:1 COG:BS_yveR KEGG:ns NR:ns ## COG: BS_yveR COG0463 # Protein_GI_number: 16080483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 1 204 62 265 344 72 25.0 6e-13
RVYHTENRGSGPARNTGIENAKGQYIYFPDADDQLEENAISMLVAGMQKGKYDLVVFGYK
SLDKNGNTVLLKQYPEMEKKGEDVRQDYSNYVASTRKYGIQGAPWNKFFDLSLIKMHSIE
YPPLRRHQDEGFISRYMCFCNNIHFIQPILYIHNLNDLKKEWDKYPVDYIDAVMGLYNTK
KETILTWNKKDKKTKTIIEHEYICNVIKSLEMSYSPKMHMKSID

Prediction of potential genes in microbial genomes
Time: Thu May 26 10:02:49 2011
Seq name: gi|223714134|gb|ACDT01000081.1| Coprobacillus sp. D7 cont1.81, whole genome shotgun sequence
Length of sequence - 62734 bp
Number of predicted genes - 53, with homology - 52
Number of transcription units - 24, operones - 13 average op.length - 3.2

 N    Tu/Op  Conserved  S      Start     End  Score
             pairs(N/Pv)
 1  1 Tu  1 .         + CDS      3 -   239   243  ## gi|167755999|ref|ZP_02428126.1| hypothetical protein CLORAM_01519
                      + Term   433 -   467   3.0
                      + Prom   512 -   571   5.7
 2  2 Op  1 .         + CDS    597 -  1262   905  ## COG2964 Uncharacterized protein conserved in bacteria
 3  2 Op  2 .         + CDS   1272 -  2912  2004  ## COG1227 Inorganic pyrophosphatase/exopolyphosphatase
 4  2 Op  3 .         + CDS   2914 -  3183   353  ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog)
                      + Term  3320 -  3370   1.1
                      + Prom  3330 -  3389   8.7
 5  3 Tu  1 .         + CDS   3438 -  5129  1912  ## COG3525 N-acetyl-beta-hexosaminidase
                      + Term  5167 -  5218  11.9
                      + Prom  5205 -  5264   6.8
 6  4 Tu  1 .         + CDS   5320 -  9381  4752  ## COG2730 Endoglucanase
                      + Term  9387 -  9429   7.2
                      + Prom  9409 -  9468   7.2
 7  5 Op  1 .         + CDS   9490 - 11271  1725  ## COG4805 Uncharacterized protein conserved in bacteria
                      + Prom 11287 - 11346   6.3
 8  5 Op  2 .         + CDS  11370 - 12737  1771  ## COG1362 Aspartyl aminopeptidase
                      + Prom 12841 - 12900   6.5
 9  6 Op  1 .         + CDS  12920 - 14245  1236  ## COG0534 Na+-driven multidrug efflux pump
                      + Prom 14247 - 14306   6.1
10  6 Op  2 .         + CDS  14328 - 14867   253  ## PROTEIN SUPPORTED gi|148360238|ref|YP_001251445.1| nucleotidyltransferase PLUS glutamate rich protein GrpB PLUS ribosomal protein alanine acetyltransferase
11  6 Op  3 .         + CDS  14917 - 15216   451  ## BAS0057 YabP protein
12  6 Op  4 .         + CDS  15216 - 15644   141  ## gi|167756010|ref|ZP_02428137.1| hypothetical protein CLORAM_01530
13  6 Op  5 .         + CDS  15596 - 15880   399  ## gi|237733996|ref|ZP_04564477.1| predicted protein
14  6 Op  6 .         + CDS  15925 - 16701   901  ## COG2819 Predicted hydrolase of the alpha/beta superfamily
15  6 Op  7 4/0.000   + CDS  16760 - 17581  1040  ## COG0561 Predicted hydrolases of the HAD superfamily
16  6 Op  8 .         + CDS  17581 - 18411   984  ## COG0561 Predicted hydrolases of the HAD superfamily
17  6 Op  9 .         + CDS  18401 - 19060   738  ## COG0692 Uracil DNA glycosylase
                      + Term 19063 - 19108   1.5
                      + Prom 19110 - 19169   9.0
18  7 Tu  1 .         + CDS  19190 - 19324    87  ##
                      + Term 19332 - 19369   1.5
                      + Prom 19350 - 19409   9.9
19  8 Tu  1 .         + CDS  19432 - 20601  1208  ## COG0285 Folylpolyglutamate synthase
20  9 Op  1 .         - CDS  20628 - 21020   249  ## gi|167756018|ref|ZP_02428145.1| hypothetical protein CLORAM_01538
                      - Prom 21044 - 21103   3.1
21  9 Op  2 .         - CDS  21105 - 22925  1345  ## COG0367 Asparagine synthase (glutamine-hydrolyzing)
                      - Prom 22990 - 23049   3.5
22 10 Op  1 .         - CDS  23055 - 23453   427  ## Aflv_2764 spore coat protein GerQ
23 10 Op  2 .         - CDS  23477 - 23809   313  ## COG3773 Cell wall hydrolyses involved in spore germination
                      - Prom 24005 - 24064   8.8
                      + Prom 23967 - 24026  12.7
24 11 Op  1 7/0.000   + CDS  24135 - 25259  1518  ## COG0448 ADP-glucose pyrophosphorylase
25 11 Op  2 17/0.000  + CDS  25274 - 26383  1135  ## COG0448 ADP-glucose pyrophosphorylase
26 11 Op  3 .         + CDS  26383 - 27813  1534  ## COG0297 Glycogen synthase
27 11 Op  4 .         + CDS  27855 - 28634   726  ## COG0561 Predicted hydrolases of the HAD superfamily
28 12 Tu  1 .         + CDS  28709 - 29230   600  ## gi|167756026|ref|ZP_02428153.1| hypothetical protein CLORAM_01546
                      + Term 29242 - 29287  -0.9
                      + Prom 29409 - 29468   4.6
29 13 Op  1 8/0.000   + CDS  29489 - 30826  1498  ## COG0215 Cysteinyl-tRNA synthetase
30 13 Op  2 7/0.000   + CDS  30826 - 31203   202  ## PROTEIN SUPPORTED gi|163764762|ref|ZP_02171816.1| ribosomal protein S13
31 13 Op  3 .         + CDS  31241 - 31966   552  ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11
32 13 Op  4 .         + CDS  31983 - 32588   759  ## gi|167756031|ref|ZP_02428158.1| hypothetical protein CLORAM_01551
33 13 Op  5 .         + CDS  32637 - 32786   263  ## PROTEIN SUPPORTED gi|237734017|ref|ZP_04564498.1| 50S ribosomal protein L33
34 13 Op  6 .         + CDS  32798 - 32986    86  ## gi|167756032|ref|ZP_02428159.1| hypothetical protein CLORAM_01552
35 13 Op  7 .         + CDS  32988 - 33545   773  ## COG0250 Transcription antiterminator
                      + Term 33705 - 33757  12.1
                      + Prom 33617 - 33676   7.0
36 14 Tu  1 .         + CDS  33792 - 38375  5104  ## COG3525 N-acetyl-beta-hexosaminidase
                      + Term 38391 - 38426   1.1
                      + Prom 38556 - 38615   7.9
37 15 Op  1 1/0.000   + CDS  38684 - 40114  1817  ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins)
38 15 Op  2 .         + CDS  40178 - 40651   689  ## COG2032 Cu/Zn superoxide dismutase
39 16 Tu  1 .         + CDS  40717 - 41508   944  ## COG0561 Predicted hydrolases of the HAD superfamily
                      + Term 41509 - 41535  -1.0
                      + Prom 41552 - 41611   8.6
40 17 Op  1 .         + CDS  41664 - 42605   947  ## COG0530 Ca2+/Na+ antiporter
41 17 Op  2 .         + CDS  42615 - 43304   888  ## COG0584 Glycerophosphoryl diester phosphodiesterase
                      + Term 43392 - 43442   1.5
                      + Prom 43328 - 43387   3.9
42 18 Op  1 .         + CDS  43494 - 46142  2695  ## COG1012 NAD-dependent aldehyde dehydrogenases
                      + Term 46157 - 46192   7.1
                      + Prom 46154 - 46213   8.0
43 18 Op  2 .         + CDS  46240 - 47499  1078  ## SSA_0231 hypothetical protein
                      + Term 47582 - 47632   9.1
44 19 Tu  1 .         + CDS  47899 - 49503  1788  ## COG0504 CTP synthase (UTP-ammonia lyase)
                      + Prom 49514 - 49573   3.5
45 20 Tu  1 .         + CDS  49656 - 51143  1634  ## COG4099 Predicted peptidase
                      + Term 51173 - 51213   6.3
                      + Prom 51181 - 51240   8.4
46 21 Op  1 .         + CDS  51297 - 52070   614  ## gi|167756044|ref|ZP_02428171.1| hypothetical protein CLORAM_01564
47 21 Op  2 .         + CDS  52075 - 52815   629  ## gi|167756045|ref|ZP_02428172.1| hypothetical protein CLORAM_01565
48 21 Op  3 .         + CDS  52833 - 53588   698  ## lse_0378 hypothetical protein
                      + Term 53600 - 53642   6.3
49 22 Op  1 3/0.000   + CDS  53661 - 54143   686  ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase
50 22 Op  2 .         + CDS  54146 - 55579  1799  ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases
                      + Term 55592 - 55623   1.0
                      + Prom 55613 - 55672   9.6
51 23 Op  1 .         + CDS  55728 - 55901   183  ## gi|167756049|ref|ZP_02428176.1| hypothetical protein CLORAM_01569
52 23 Op  2 .         + CDS  55954 - 56574   612  ## SUB0807 transcriptional regulator
                      + Term 56584 - 56618   2.8
                      + Prom 56576 - 56635   8.2
53 24 Tu  1 .
+ CDS 56844 - 62684 6658 ## COG3525 N-acetyl-beta-hexosaminidase Predicted protein(s) >gi|223714134|gb|ACDT01000081.1| GENE 1 3 - 239 243 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755999|ref|ZP_02428126.1| ## NR: gi|167755999|ref|ZP_02428126.1| hypothetical protein CLORAM_01519 [Clostridium ramosum DSM 1402] # 1 78 243 320 384 138 100.0 1e-31 TIGTDSKQAQLLKNVADYILPELKLNYGEVYEIIYQQDSLNITLNPDDTVNLNESGILSK YKKDNKLVKFDGFYYLND >gi|223714134|gb|ACDT01000081.1| GENE 2 597 - 1262 905 221 aa, chain + ## HITS:1 COG:Cj1387c KEGG:ns NR:ns ## COG: Cj1387c COG2964 # Protein_GI_number: 15792710 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Campylobacter jejuni # 6 221 7 218 218 113 33.0 3e-25 MYKYLEPYAAMIEFLGKALGDNVEIALHDLTSKEQEIVAIANNHNSGREVGAKLSNLSLH YLEEKQYLDNDYVMNYKTTGPDGKLMRSATYFIKEPGKEMPVGMLCINVNISDLEYLTST IKKILGIKDEKDIEFKMDNPVEILSSPLDEMIDLYIKECLEKMGFPSYFLAERLNVDEKI KVVKYLQEKGTFKVKGAIVLVAEKLAVSEPTVYRYLKKMEK >gi|223714134|gb|ACDT01000081.1| GENE 3 1272 - 2912 2004 546 aa, chain + ## HITS:1 COG:FN1824 KEGG:ns NR:ns ## COG: FN1824 COG1227 # Protein_GI_number: 19705129 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase/exopolyphosphatase # Organism: Fusobacterium nucleatum # 3 538 2 529 538 290 34.0 7e-78 MKELVYVSGHKNPDTDSICSAIGYAYLLKAIDRYNAVPVRLGEVSRETEYVLKRFNLEIP TLLKTVKQKVEDLNYDKVTVFSKELTLKTAWSLMKQQNLKSAPILDDHGQLLGLLSTSNI VEGYMEKWDSSLLKDAHTPIENVIDTLEASVIYLNHNLKTIEGDLHIAAMRGEEASKRVR PGDIVIIGGDRDDAVNSMIDAEVSLIILTGSLGVEAEVLNKLKEKGISVISTSFNTYLTS QQIIQAIPVEYIMQKGDLKLFSTDDTLDHVKEVMSETRYRSYPVLDLNNRCVGSISRFAL LKGLRKKVILVDHNERGQSIPGIEEADILEIVDHHRVADIQTIGPLLFRGEPLGSTATIV TKIFDENDVEIPQSIAGALLGAIISDTLLFKSPTCTPVDTKAARKLAKIAGVDIEEFAME MFKAGTSLVGKTVEEIFNQDFKKFPFEQGTVGVGQVNSMDIEGFMPYKADMLDYMDKFAE DNRLDFTLLLLTDIINANSEIFVAGPKPYLVEEAFNIKLNDHQATLKGVISRKKQVVPAI TAVMSK >gi|223714134|gb|ACDT01000081.1| GENE 4 2914 - 3183 353 89 aa, chain + ## HITS:1 COG:SP0007 KEGG:ns NR:ns ## COG: SP0007 COG1188 # 
Protein_GI_number: 15899956 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Streptococcus pneumoniae TIGR4 # 1 88 1 88 88 77 56.0 5e-15 MRLDKFLKVSRIIKRRTLSKEISESSRVKVNGKVAKPSTKLKVGDEIEIEFGRSRLTVKV KELREHVLKEDSTMLYEIINEERIERDLN >gi|223714134|gb|ACDT01000081.1| GENE 5 3438 - 5129 1912 563 aa, chain + ## HITS:1 COG:SP0057_2 KEGG:ns NR:ns ## COG: SP0057_2 COG3525 # Protein_GI_number: 15900002 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Streptococcus pneumoniae TIGR4 # 37 472 2 448 782 402 48.0 1e-111 MKGLKKLFSAMLVLTMLFGTIANVGMAKIYAAEEGMKRVFSIDAGRKYFSEEQLLQIIDK AYLNGYTDVQILLGNDALRFFLDDMSITVDGKTYASEAVKKAITAGNDHYYKDPNGNALN ETEMNRIVAYAKERGLHIIPVINSPGHMDSILVAMEELGMKNVRYSYNGKESERTVNIES DEAIAFTKELVKKYVTYFANANVSEIFNFGADEYANDVFSNPGWGELQKIGLYDEFVVYA NDLAKIIKDAGMKPMCFNDGIYYNKKDSSGTFDQDIIISYWTAGWWGFNVAKAEYLVNKG HKILNTNDAWYWVLGNIDAGGYNYNSTVNNINNKKFTDVTGASNELPIIGSMQCVWCDTP SKEHDMDRIIKLMDLYSQKHTDYLIRPADFTKVDEAIAKIPEDLSIYTTESVEKLNTAID NIDRSIRVTEQSIVDGYAAAIEQAIIDLTLKDADYSKVDEAIAKAEALNKDEYTDFSNVD AAVKAVKRGLDITKQKDVDAMAAAINEALAALEKKEAVTPDKPNSEKVDSPKTGDTTNTM VWLSFAAMSILASTLVLKKRKTY >gi|223714134|gb|ACDT01000081.1| GENE 6 5320 - 9381 4752 1353 aa, chain + ## HITS:1 COG:YOR190w KEGG:ns NR:ns ## COG: YOR190w COG2730 # Protein_GI_number: 6324764 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Saccharomyces cerevisiae # 854 1071 46 276 445 91 32.0 1e-17 MKRVIREIKKAGSIAIIIAMLLSFVPTGIFALNESNYKEPLVFADFEGNDSNVDGSNNAA VSFIEGFATGTGKLVAKLELANSGDPSISERSLVITKETSVDVSNYKYLTFWIKDNGTNS AKVHLIDASGNATSGNWTGNVTAGKWSQLSVSLDQFKNIDLSKITGVVIGEWNKGSYLID DVQFTDVLAKDLKLSASKNTGTYNDSFEVTLTAGEGQSIYYTVDGSQPTISSTKYSNSLT IDESMTLKAIVVDNQNISEVYEFDYIIDHQDNRIYTPVVVQTFEDQNNFSAANGASGTIV TNEKHSGEQSLKYVKTKSEAASTSKGSIKIDFNHAVNAADLKYLIFYIKDTQGSNTLQVS 
LIDTDGKESDFGWRSPSTTKDKWVQYCIKVSDFNKIDKTKIAGIRIGEWNAGTYYLDDIY FDNYLYSGLPSLVPTKPEANISDGYIFKDRLAVTLKNDNNAPMYYTMDGSTPNVDSTKYS GEIELSKTTTLKTVSFDNGKYSEVVELAYIKDTNIPSDAKADKVAGKYTKPIKVTLSNED NLAIYYTIDGSTPSKTSSKYTTPISIGESTVLKAVTYQGDSAGNVMTFKYQYPTVPSEVT ASIPETKFTSSKTVELISDIDANIYYTTDGSVPSLTSSRYDQPLTISKSMTVKAIAERDG KTSAVTTLDYIIAPVAVQADKPAGTYDGSVVVEFRVPNNDQVEIYYTTDGSVPTVASNHY TQPLRVSENTTFTVGATYKNSNDIGVVTNHTYIINPITEAKAPVITPGSGTYGQRQLVSM SSDTQDSKIYYTVDGSIPSRDSMEFKEPFYVKQDTVVKAITVTKNGISEITVNEIKVNQE ASNFLKTDGKVIRNNYGAGEKIQLKGTNVGGWLVMEEWQCPTSAPDQKTMLETFTKRFGE AKAWELINTYQDNWFTEADFITLKEEGVNCLRLPITYFEMANLDGTLKETAFDRLDWFIE EAAKHGIYTLIDMHGAFGSQNGKDHSGDITYPDQGDFFGKEENIQKTIKLWEAIAARYNG NEWVAGYDLLNEPGGALGTEQFEVYDRIYKAVRAIDQDHIIQIQAIWEPTHLPAPTLYGW ENVVYQYHFYGWDDINNLEYQKAFINSKIKYVNEDTNYNVPVFVGEFTFFTNMDSWEYGL SVFDEQGWSYTSWTYKVAGANSSWGMYTMPKNDSTNVNINTDDFETVKAKWSNFDFTRNT SIADVLSKHFKIVSSDLIAPVIEGNDAAVMVGVKATVSEILDLFIKDDQDGVIDIAKADI TTDFDCSKAGVYTVTVEASDKAGNISEAAFTITVKEETVIDPDVVEKPDSSKSEVSVNKP VKTGDNENIIGDLMILGLSMIAGVILLKRRKEI >gi|223714134|gb|ACDT01000081.1| GENE 7 9490 - 11271 1725 593 aa, chain + ## HITS:1 COG:CC1085 KEGG:ns NR:ns ## COG: CC1085 COG4805 # Protein_GI_number: 16125337 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Caulobacter vibrioides # 97 591 49 559 564 104 23.0 5e-22 MRILKRISLLLLSLVMIISLSGCTADTNNKDERASFDRFIERQFIETMESDYTTAHVFLE HPENYDVDISKLTVNLGVRLDEESMRNQEKKDADSWAEFKKFERSKLTKEQQDTYDIYEF QNNLAREMSDNKFDYYQQLFESISGIHYQIPTMLADWSLRNEQDVKDIISLVNDVLPYLQ GAIDYTKEQAKRDLLMIDVDAVREYCDNVIKSGENSLILSDINKNIDTLNLEESVGNAYK EQLKTAFNSSFIPAYQAVKDLMDEVSLTGNNEEGLVKFKNGKEYYELLLQNAVGSDKSVM EIKELMEDSLDKHISKLQENVIKNASVREFLTSGGEIPTTKYTSYDEILDDISAQLFEDF PNVSSLSYNIRDINEEIASSSGVAAYFNIPPLDGNGIKQLRVNPSTGEVSDLDTFSTVAH EGYPGHMYQYAYMYENIQSPYQKALANSSAYTEGYAVYAQFYAYKYLTGIDKNVLEALKE NELASYDIMILCDIGIHYEGWSLDDFSDFISEKGINLEEDSLKTQYKQLQANPAAFEPYY VGYEEFMLLKETAQNKLGDKFNDKKFHEAILKSGNAPFTVVERNVAAYIKSAK >gi|223714134|gb|ACDT01000081.1| GENE 8 11370 - 12737 1771 455 
aa, chain + ## HITS:1 COG:CAC1091 KEGG:ns NR:ns ## COG: CAC1091 COG1362 # Protein_GI_number: 15894376 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 4 452 10 463 465 568 62.0 1e-161 MYGKLAWEKYNDEQINDIMTFNEGYKNYITKGKTERLCVSETVKLAMAHGYKELNEVDIL KPGDKVYVTNMKKNIALFVIGKKPLEDGMRILGAHIDSPRMDLKQNPLYESEGFAMLDTH YYGGVKKYQWVTIPLSMVGVVVKKDGTVINVNIGEDENDPVVGISDLLVHLSADQLKKDG AKVIEGEDLDVTFGSIPLKDHEKDAVKANVLKILKDKYDFDEEDFLSAEIEIVPSGKARD YGIDRSMVAGYGHDDRVCAYTSLMAILDMEMPDYTSCCILVDKEEIGSVGATGAQSLFFE NTVSELLLKQGTDSFVKTRKAMANSKMLSSDVSAGVDPLYLSVNDKKNAAYLGKGIVFNK YTGARGKSGSNDANPEYMAEIRKILDDDNIYYQTAELGKVDQGGGGTIAYILGNYNMNVI DAGVAVLNMHAPMEIVSKVDVYEAYLAYRTFLKQI >gi|223714134|gb|ACDT01000081.1| GENE 9 12920 - 14245 1236 441 aa, chain + ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 435 2 434 447 253 36.0 6e-67 MKNMTQGNPLKLILMFSMPLMLGNILQEFYTIVDTIIVGQFLGVKALASMGAAGWIQWML LSVVMGFAQGFSIKVANLYGANNHEGISKTIGNIVIACLVIALLLTISGELIIIPILKLL QTPNDIIQGAIAYLRVMAGGVTITLMYNLLSCVLRAFGNSKTPLLAMVIAASLNVGLDLL FVCVFHLGIAGAAIATILAQLFASLFCLLVLYKQSIFSLKSEHFKIQRHLIFELVKLGLP LALQNGIISIGGMIVQFVVNGYGVIFVAGFTATNKLYGLLETAAISFGYALTTYNAQNYG ALEYQRVREGVNASALISLVTSLMIALIMIVFGRNILTLFVSGSQNEINAVLEVAYHYLF IMAVCLPILYILHTYRNALQGLGNTVIPMFSGIVELLMRVGIALFLPLFMGQEGIYYAEI VAWTGAALLLYISYKRTKLGD >gi|223714134|gb|ACDT01000081.1| GENE 10 14328 - 14867 253 179 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148360238|ref|YP_001251445.1| nucleotidyltransferase PLUS glutamate rich protein GrpB PLUS ribosomal protein alanine acetyltransferase [Legionella pneumophila str. 
Corby] # 8 168 175 342 601 102 36 7e-21 MVKIVVKVEVVPYNEDWPRKYALEANVIRNIVLDELITITHVGSTAVPGLVAKPIIDILI IVEDINKLDQLDPLFKKEDYECCGEDGISGRRFYYKSDGSIHVHAFDERSHDKIELLISF YDYLKDNPDIAREYGELKLSLAKKYPDNIKKYKKGKEAFTKEIQTRALKWYRHKDVEQI >gi|223714134|gb|ACDT01000081.1| GENE 11 14917 - 15216 451 99 aa, chain + ## HITS:1 COG:no KEGG:BAS0057 NR:ns ## KEGG: BAS0057 # Name: not_defined # Def: YabP protein # Organism: B.anthracis_Sterne # Pathway: not_defined # 15 99 25 108 108 71 45.0 1e-11 MDNNNSIRFEHTPYHNVYLKDRKNIELTGVKNIESFDSLEFLIETSLGFLNITGTELSLT RLDQEKCEVSIKGNIDAISYVSNKKNQKGKDSVFNKLLK >gi|223714134|gb|ACDT01000081.1| GENE 12 15216 - 15644 141 142 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756010|ref|ZP_02428137.1| ## NR: gi|167756010|ref|ZP_02428137.1| hypothetical protein CLORAM_01530 [Clostridium ramosum DSM 1402] # 1 142 1 142 142 176 100.0 3e-43 MDLVKQIQGISYSLVFGFTFTFIYSLINRLFYKYHKRIFRLILQIVIGIIFGYLYYLGLL VINNGVVRIYFIISMVIGYVLYLNYYSYYMFFLIEIIVSMLKYLLRPIIFIFRKISGIIK RMKRVVKWLKEKFIKPSKDLPT >gi|223714134|gb|ACDT01000081.1| GENE 13 15596 - 15880 399 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733996|ref|ZP_04564477.1| ## NR: gi|237733996|ref|ZP_04564477.1| predicted protein [Mollicutes bacterium D7] # 1 94 1 94 94 150 100.0 3e-35 MAKRKVYKTVKGFAYMIIGCGLMLTLFNSAITVFQRQDEIAVLEKEKKAVEKEKKALENE ISLLNDDDYVARYARENYVFTREGEQVAIIPGVE >gi|223714134|gb|ACDT01000081.1| GENE 14 15925 - 16701 901 258 aa, chain + ## HITS:1 COG:SP0882 KEGG:ns NR:ns ## COG: SP0882 COG2819 # Protein_GI_number: 15900765 # Func_class: R General function prediction only # Function: Predicted hydrolase of the alpha/beta superfamily # Organism: Streptococcus pneumoniae TIGR4 # 14 256 22 270 274 100 27.0 2e-21 MFIEENMYMRSLRRKRKITIYVPDDYETSNKRYPVLYINDGQNAFFDETSYMKISWGFYD YVQAQNLDVIMVAIPCNNRLNKREDEYGPWRIGKEILMMEYGDDSLEIGGEGDKYLRFII KQLKPYIDTHFPTIVEDSAMVGSSMGGIITTYAGLKYPHIFKKTASLSSAYWFYMDELID LIHRSDLSATQCFYLDLGGNEGNGDEEMSRIYYESNETIYNELIKKSNRIEGRFFEEAGH NEYEWRQRVPIFMDLFYK >gi|223714134|gb|ACDT01000081.1| GENE 15 
16760 - 17581 1040 273 aa, chain + ## HITS:1 COG:CAC0629 KEGG:ns NR:ns ## COG: CAC0629 COG0561 # Protein_GI_number: 15893917 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 1 270 1 266 268 119 33.0 6e-27 MYKLMLSDLDETLLVKHHVPEFNVEAIKAARKKGLKFVPATGRAYNMIPEILKEIDAYDQ EGEYSICFNGALIVENKNNRILNFEGISFETTKMLFEKAREFDVCVLIFTIDMCYIYNAD PDEVQRKTDQKAPFVVIDEYNMDDLKDEKIAKILYEKRDMPYLKSIEVEVSEMIEGKACA SYSSGRYLEFNAIGIDKGFGLRWLANYLGIDINETIAIGDNYNDVEMIKAAKLGVCVTCA TDDIKELAQYVTEVDYDQGAVKEVIEKFVLEAE >gi|223714134|gb|ACDT01000081.1| GENE 16 17581 - 18411 984 276 aa, chain + ## HITS:1 COG:CAC0629 KEGG:ns NR:ns ## COG: CAC0629 COG0561 # Protein_GI_number: 15893917 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 3 275 2 268 268 130 35.0 3e-30 MGYKLIACDLDETLLNDEHLVGAKNVAAIKRAREEYGVKFVPATGRGFMQIQPELKELQL YDLEGEYVISFNGGALTENKNNRIIEFKGLEFSKMKEIFEAGLNFDVCIHVYTPDNLYVF NLSESERQRIKNQKLEAVYPEVNSVDFLKDTPISKILYQNVDVPYLMSLEEGLKPITDGH CAVSYSSNRYMEFNALGVDKGQGLIDLANKLGIAIEETIAVGDNYNDMAMLKVAGLSVAA NNAVDDVKAACDYTTNANNNEGVVAELIEKFIFNEI >gi|223714134|gb|ACDT01000081.1| GENE 17 18401 - 19060 738 219 aa, chain + ## HITS:1 COG:BS_ung KEGG:ns NR:ns ## COG: BS_ung COG0692 # Protein_GI_number: 16080848 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Bacillus subtilis # 10 219 16 225 225 260 57.0 1e-69 MRFKDIVEIEFQQDYYKQLHQFVENEYLHKTIFPPKENIFRALNLCDYENVKVVILGQDP YHELHQANGLAFSVYPGVRIPPSLVNIYKELQDDLGTMIPNHGDLTKWAKQGVLLLNNVL TVEEGRANSHSGKGWETFTLNIVKALNKRQKPLVFILWGNNARAKKQYIDTSRHLVLESA HPSPLSAHRGFFGSHPFSKANEFLIRNGMPMIDWQIDNI >gi|223714134|gb|ACDT01000081.1| GENE 18 19190 - 19324 87 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNSGIGIIWAPVSAGTLVIWGLGIFVAVVGMVWFVKRLKKKIQL >gi|223714134|gb|ACDT01000081.1| GENE 19 19432 - 20601 1208 389 aa, chain + ## HITS:1 COG:SP0290 KEGG:ns 
NR:ns ## COG: SP0290 COG0285 # Protein_GI_number: 15900224 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Streptococcus pneumoniae TIGR4 # 11 380 9 428 440 198 35.0 2e-50 MFNDIKSALVYIESKRTKRTLEQFKETIKKYGFNIYQKNIIHIAGTNGKGSTTNFIKDIM VEHGYRVGTFTSPYLICHNDRICINGEMISDERLLMIINSLVEIIESEQLSMFEIDILIM LRYFDEEELDYRIIETGIGGLNDKTNVIDSVCSVITNIGYDHQFMLGDTLAQIAHHKAGI IKPEQTCFTSEINNDLIDIFKAAAEIKNSQVVSVEIDCDYRYPYHFSCLGYDYQLSNCGS YQVANATLAINVCDYLIELDHSLVQRALDGFSWPGRFEKFGKIYLDGAHNIDGIKALIKT LHDQQIKKALVIFSALGDKDIEQMEALLSEYPLIQATFADERLQLEGIDFKEAIKNNIDL YDHLIITGSLHFISTVRKYLQNNLIEKQV >gi|223714134|gb|ACDT01000081.1| GENE 20 20628 - 21020 249 130 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756018|ref|ZP_02428145.1| ## NR: gi|167756018|ref|ZP_02428145.1| hypothetical protein CLORAM_01538 [Clostridium ramosum DSM 1402] # 1 130 1 130 130 183 100.0 4e-45 MNCLKKQLLIISFALIVAIAIFLIFGSSIYINQIFFIINAFLGIFGLVSILISEALTPPN NAYTCCGNLATLGSSGLILISLLAIIIEAYAYTSVISLILAGSFFFLVLMLGGIWCYLNL RNHCHRNYYC >gi|223714134|gb|ACDT01000081.1| GENE 21 21105 - 22925 1345 606 aa, chain - ## HITS:1 COG:L0095 KEGG:ns NR:ns ## COG: L0095 COG0367 # Protein_GI_number: 15674210 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Lactococcus lactis # 1 605 1 613 625 488 39.0 1e-137 MCGIIGYTNNVSNNQSVIENMLQKISHRGPDDQGYYQDSKITLGMRRLSIIDLDSGNQPL FNEDKSLILVFNGEIYNYQVLRAKLISLGHTFTTNSDSEVIIHGYEEYGNDIVNHLRGMF AFAIYNIHTKALFIARDIFGIKPLFYTIVKNELVFASEIKALFEYPGVEKKFNEQCLSSY LAFQYNPLTETFYRGIFQLLPAHYLEFTDNLKLTHYFKFNYNLDDNLKETDALNMLESTL KDSISHHRLADVEMGSFLSGGIDSSYLAANSQVKKTFSVGFEDHRCNELPLARELSQYLD IENYQKVITKDEFWHTIPTILNILDEPVADPSIIPLYHLTKLASQHVKVVLSGEGADELF GGYNIYQTPLTLAPTKIIPFFLRKGLKNILIKTNYEFKGKQYLIRAGQKIEERFIGNAFI FQNHELKKLIKPQWPIYPPQVLTNAYYRDVARKDDITKMQYIDLNFWLRGDILRKADHIS MANSLEVRVPYLDKEVWAISSILPTNLRVNKKATKYIFRKMAAKSLPLQNSERKKLGFPV PLNEWIKEEQYYQPIKELFNSKSAQKYFNCDYLNYLLDEHYQGIHNNARKIWTVYIFLTW YNRNFD 
>gi|223714134|gb|ACDT01000081.1| GENE 22 23055 - 23453 427 132 aa, chain - ## HITS:1 COG:no KEGG:Aflv_2764 NR:ns ## KEGG: Aflv_2764 # Name: gerQ # Def: spore coat protein GerQ # Organism: A.flavithermus # Pathway: not_defined # 25 131 30 135 147 109 51.0 4e-23 MDYFNDLNPYLVRQDPAPAETPDTTPTPAPAPTPNQTPGTPSTPSVTPAPVIDQAYVENI LRINVGKLGTFYFTYTGSNEWRDRIFKGIIEQAGRDHFILHDPKTGKRYLLQLVYLEWAE FDEALNYQYPYQ >gi|223714134|gb|ACDT01000081.1| GENE 23 23477 - 23809 313 110 aa, chain - ## HITS:1 COG:BS_cwlJ KEGG:ns NR:ns ## COG: BS_cwlJ COG3773 # Protein_GI_number: 16077329 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall hydrolyses involved in spore germination # Organism: Bacillus subtilis # 1 102 30 130 142 85 43.0 2e-17 MLLVGNVVINRVVANCDVFRNTRTITEVVYQKNAFSGVGTPLFNEPVNELLRQIALRCIN GYRNEPATNALWFKNPGTNVACPEKFYGNLSGRFKQHCFYNPGLKLNCDL >gi|223714134|gb|ACDT01000081.1| GENE 24 24135 - 25259 1518 374 aa, chain + ## HITS:1 COG:BH1087 KEGG:ns NR:ns ## COG: BH1087 COG0448 # Protein_GI_number: 15613650 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Bacillus halodurans # 1 372 1 370 383 411 55.0 1e-114 MGREIVAMILAGGRGTRLEALTAKVAKPAVHFGGKYRIIDFPLSNCANSGIDIVGVLTQY ESVLLGTYVGAGTKWGLDGKQSLAAILPARERGEVGATWYAGTADAIYQNISFIDQYDPE YVLILSGDHIYKMDYDKMLTAHKQRKADATIAVLNVSLKEASRFGIMNTYEDGTIYEFDE KPEKPKSTLASMGIYIFTYKQLRKYLIADAKKEDSKHDFGMNIIPDMLNDNKKLYAYEFD GYWKDVGTVESLWQANMDLLKDKDLDLYNIKKDWKIYTEDTLGKPQIIGDQAKVKNSLIT QGCLVNGNVEGSVLFNNVNVGEGAKVIDSVLMPGVLVEEGAEVYKAIVDEGVVIRAKTKI NKEAKEVALVSDNK >gi|223714134|gb|ACDT01000081.1| GENE 25 25274 - 26383 1135 369 aa, chain + ## HITS:1 COG:CAC2238 KEGG:ns NR:ns ## COG: CAC2238 COG0448 # Protein_GI_number: 15895506 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Clostridium acetobutylicum # 12 368 14 368 372 231 35.0 2e-60 MVKAIGIVNLHSDVDFVGLTERRPVASVSFLGRYALIDFVLSNMSNSTIDAVGVLIQKKP 
RSLFKHLGNGDSWNFNSKAGGVSLLYNEKYANNPNYNHDINNLIENIAFLEANKADYVVI APAHIITTMDYEDVIENHEKTGAEITMVYKKVHDANEAYIGCDCLTIKDKVVTAVERNKG SRKDRSISLETYVINTKTLLKVMKQAQKISSFFSLKDVLGYLCDEKQINAYEYKGYARCI DSLEHYYKYSLEFLDLDVSSAVFKSNWPIYTITNDTPPAKYLTESDVTRSFVANGAMING TVENSIIGRDVVIGTGAVVKNCILFSGSVVNPGAHLENVIMDKTSKVQRQLNLKGDATTP LYIKEGDVV >gi|223714134|gb|ACDT01000081.1| GENE 26 26383 - 27813 1534 476 aa, chain + ## HITS:1 COG:SP1124 KEGG:ns NR:ns ## COG: SP1124 COG0297 # Protein_GI_number: 15900991 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Streptococcus pneumoniae TIGR4 # 1 475 1 476 477 444 44.0 1e-124 MRVLFATSEATPFIKSGGLADVLGSLPKEMVKKGVESIVVLPKYQDMKLADEIEYVTSFD IWVGWRKVYCGVFTYELDGVRFYFIDNEQYFRRPGLYGYDDDYERFAYFDFAVLELISHL DIKPDVLHLHDWQTAMIAMLYKERYCYYEHYDNIKIVFTIHNIAFQGKADPKLLSELFGL DNYLYYNGNCRNDGCLNMMKAAIFYSDIITTVSPTYAREILTPEYGEGLQNILEMRKYDL YGILNGVDYDVINPATDPQIVKNYDLETVFKDKIVNKLALQKEVGLPEDKDVALIGIVTR LTKQKGLDLIINQFSEMCSRKVQILILGAGDQVYEDALKGIAYHHQDTVSLQLKYDFGLS CRIYAGCDMFLMPSLFEPCGLSQMMSLRYGTIPIVRETGGLKDSVEPYDEYADTGTGFSF ANYNAHEMMKTIDYALKVYDEDPEAWKGIMTRAMQAKLDWDASADQYLKVFNSLIG >gi|223714134|gb|ACDT01000081.1| GENE 27 27855 - 28634 726 259 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 259 1 256 256 150 34.0 2e-36 MIKAIFFDVDGTLISKSRSVLSEGVIAALKQLQEKGIKLFIASGRHYLELDELGINAQFT FDGYLTLNGGYCFNQEGVIYKNPINSKDVERIVDHVTKHQLACSFVEGDDLYINLINDLV VAAQTAINTSLPPKKDVSRALINDVYQIDPFVTKEEIEELVALTQHCKYTQWHDGAYDII PKQGGKQEGISAILDHYKIKVEETMAFGDGHNDIDMLQFVGTGICMENGCAETKAVCDYI TDNVDNDGIVSALRYFGLL >gi|223714134|gb|ACDT01000081.1| GENE 28 28709 - 29230 600 173 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756026|ref|ZP_02428153.1| ## NR: gi|167756026|ref|ZP_02428153.1| hypothetical protein CLORAM_01546 [Clostridium ramosum DSM 1402] # 1 173 15 187 187 299 100.0 5e-80 
MKKIMKILVTTLCLCLVTACGARDKDVNLEKSGTKYSTDDGVSFYYPSNYEITANATDTD TVEFSKDNNTLYFKVIKDETDNVVEDKDELYTGEIEQSGASEIEVSKPVLDSGLDVYQYV FVHSDTGIRAKEIVYFSDDNTYIYGYRAAKDDFEDNDKDMTVYLQSFSMATGK >gi|223714134|gb|ACDT01000081.1| GENE 29 29489 - 30826 1498 445 aa, chain + ## HITS:1 COG:BH0111 KEGG:ns NR:ns ## COG: BH0111 COG0215 # Protein_GI_number: 15612674 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Bacillus halodurans # 1 445 3 452 466 380 44.0 1e-105 MKIFNTLTNKKEEFKPLREGEVSIYVCGPTVYNYVHIGNTRPMIVFDVLRRTFEYLGYKV TFVSNFTDVDDKIIKAAKAEGITEKELTDKYIAAYEDVRRNLNLLFPTYAPRVTNTMDAI IKFIDNLVKSGYAYEVDGDVYFRVSKIDEYGQLSGIKIEDLVAGASERIDENDKKEESTD FALWKKTDEGIRFDSPWSKGRPGWHTECVVMINDIFEGGKIDIHGGGQDLKFPHHENEIA QSMACHHHPIAHTWMHNQMINIDNQKMSKSLGNVIWAKDMVAELGCNVVKWFMLSSHYRN PLNLTEEVLNSVKKEVAKVDNVIKSVSLYLQVNHIANENYNKAAVDGMVGALEDDLNTSL ALTKILDQVKKLNLAFRQKEKNDKAIAIEYQTLLKMTAVIGFVFEPRKLNAAELEIYQAW LEAKQNKDFETADKLRTQLIEKGII >gi|223714134|gb|ACDT01000081.1| GENE 30 30826 - 31203 202 125 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 [Bacillus selenitireducens MLS10] # 3 125 10 135 141 82 35 6e-15 MRPELINASVLAYLGDSIFEVLVRDYLVKESGFVKPNDLQREAVKYVSASSHAAFMHDML DEEFFSADEVGTYKRGRNTKGSKNESLDHMHSTGFEAVIGTLYLEENFDRIKVIFERYKQ YINNK >gi|223714134|gb|ACDT01000081.1| GENE 31 31241 - 31966 552 241 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 2 241 6 249 255 217 44 1e-55 MKQYIYGKNTILEALKGDKSVYTVYVQNNAKDNGIIEQCKRKKIPFKIVDKSEFIKKLGN VSHQGVMAEIEEYRYYSIDEIINTIPDGKQPLLLMLDGLEDPHNLGAILRTCDAIGVDGV IIGKNRSVGLNGTVAKVSTGAIDHVKVAQVTNLTRTLEDLKKRSFWVVGCDLDKSQDYRQ VDYNMPLVIVIGSEGFGISRLVKKSCDINVVLPMVGHVNSLNASVATAVILYQVYNSRNP L >gi|223714134|gb|ACDT01000081.1| GENE 32 31983 - 32588 759 201 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756031|ref|ZP_02428158.1| ## NR: gi|167756031|ref|ZP_02428158.1| hypothetical protein CLORAM_01551 [Clostridium 
ramosum DSM 1402] # 1 201 1 201 201 375 100.0 1e-103 MNGYEIEELYYLYRQGCPIAQALLIEYCYWQIKMMMPAYCYTMTSYQSDYKDYLQVVIIR CLLALDNYRPDRGMQVRSFISMVIQNTISSLIVKNQGKVVRERQTLFSLDDYCSDDERIR YVDAIADNKSPDLLLIDQEQQKRVDAYILNVCSAWEQTVIAYHKCGFKDQDIALKLNKDI RSIYNANYRIQKKMENSNLFD >gi|223714134|gb|ACDT01000081.1| GENE 33 32637 - 32786 263 49 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237734017|ref|ZP_04564498.1| 50S ribosomal protein L33 [Mollicutes bacterium D7] # 1 49 1 49 49 105 100 5e-22 MKQKVILVCSECLSRNYTITKNKRVNVERLELNKYCKKCGKHTLHKETK >gi|223714134|gb|ACDT01000081.1| GENE 34 32798 - 32986 86 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756032|ref|ZP_02428159.1| ## NR: gi|167756032|ref|ZP_02428159.1| hypothetical protein CLORAM_01552 [Clostridium ramosum DSM 1402] # 1 62 1 62 62 107 100.0 3e-22 MKFRDWFSLKGIRKEIKNVSWLSKKELAQNSAIVLLFCFVMGLYFFAGDAIIAFILKTLG MN >gi|223714134|gb|ACDT01000081.1| GENE 35 32988 - 33545 773 185 aa, chain + ## HITS:1 COG:BS_nusG KEGG:ns NR:ns ## COG: BS_nusG COG0250 # Protein_GI_number: 16077169 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Bacillus subtilis # 9 182 3 176 177 194 55.0 1e-49 MTGMNDEGRQWYVVNTYSGHENKVKENLEKRVESMGLQDILFQIVIPEHVETEIKDGKKI NKTKNMFPGYVLVEMIMTDEAWYVVRNTSGVTGFIGSSGGGAKPFPLQKSELDPILKKMG LSTSSIEVDYAAGDEVNVISGPFAGKSGKVESIDLEKETAKVLVDFLGNLTPMEIELVQL EKADL >gi|223714134|gb|ACDT01000081.1| GENE 36 33792 - 38375 5104 1527 aa, chain + ## HITS:1 COG:VC0613 KEGG:ns NR:ns ## COG: VC0613 COG3525 # Protein_GI_number: 15640633 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 424 701 129 389 637 88 25.0 8e-17 MEKTKVKKLVTSAITLSMAFSTLAPSMVFATDDVKPRDIVTKVNHALNGTATANNSETSY WGPDKAIDGIVNRDAAKPDQSRWSTNMGTTPMVLTIDLKEEKAFSEFKIEWERKNIKGFN ISISNDNNEYTPVYTKPDDSNITSLTTTVTLENSVSARYVKLTVDNYDDTEAAGWASVSL YEFEVLGEESYENLAVGATAVASGSETTSFGPANVVDENMKTRWASTANHDDDKWISLDF GTAKDIASVTLKWERRNATKYKIQSSKNGTSWEDVKTLTKAPREFDDIINFDNTINTRYL 
RVVVSDFENLAEDRDGKSVNWPTVSLYEFEAYAVKQQVDSEQIVTIDDVINNLVIPAVAK GDTKMALPQVPEGFEIEFIGADYEQILDRDLTIHQPIVDTIVSVNYKVKKGDQEKITGAY NVTIPGKNSPDVSINAKPKVVPELAEWVGTEGSFTISDDSRIVINPAYKDDLAYLAKTFK ADYQAQTGKEIEVVYANTPGAHDFYFTLGSSDTGLKEEGYLMTVGDSVKVEAVDKTGAFW ATQSILQILKQNSNTIPNGQTRDYPKYEVRGFMLDVGRMPYSLDILKDIAKNMAYYKMND FQVHLNDNYIWVEDYGDNAFDAYSGFRLESDIKAGGNGGLNQADLTSKDVFYTKDQFRTF IQECRDMGVAVVPEFDTPAHSLALTKVRPDLAMKDKSVARHWDHLDLDSMYDESLAFTQS IFNEYMNGDDPVFDQNTTVNVGTDEYDGKYAENFRQYTDDMLKFIQDTGRDVRLWGSLSM RKGSTPVRSENVQMNIWNTSWANPNEMYKQGFDLINMVDGTLYMVPGAGYYNDYLNSQNI YNNWQPNNMGGTIIPAGDEQMLGSAYAIWNDMVDKKANGISEYDIYDRFEKALPAMSSKL WGDGQDLKYNELNEVVNSLGTAPNSNPRDVVTSKSDTVLNYDFNKSEIVDKSGNEYNVVN KKNVATVAGKFSKGLELSGGESYIETPLADMGPNNSVSFWVKMDKDATGEQVLFESDKGS IKMSQKDTGKVGFSRIGYDYSFNYELPKDEWVKLEIKGYANKAELYVNGELVDTLAKNAT GGKYATLTLPLERIGSKTTSFKGIIDNLEISTTGSQSDDTDYTKVDSSNFTVSTDNENPQ AGNEGPITYAFDNNEVTFWHSNYSPYQALPATVEIDMKEVHTINQFDYLPRQDGNTNGQI TKYELYIKENAADEYQKVSEGELAANASLKKITFDAANARYIKFVALEGTGNGGKSFACA SEFTVRQIDSKAELRKVVNLAANYEKEYYTTASWNDFETALTNAQMILDKADASSTEIDD AASALKTAIDSLVEIGDVVKTELERVIAEAQSIDTTKYTDKTVDVLKTALVNAISINGNE NATQEEVNAAITALTNAIDELELKTDVSVDKTDLEALLNICGTIDLNNYQENGKAEFKAA YEYALSIFNEADATEQEIRDAMNRLLEAKKNLKEIEVTPTDPDKNGQDKPIDNKVETGDN INVLYSASGLALASIVFYFSKKRKRSN >gi|223714134|gb|ACDT01000081.1| GENE 37 38684 - 40114 1817 476 aa, chain + ## HITS:1 COG:BH3007 KEGG:ns NR:ns ## COG: BH3007 COG0791 # Protein_GI_number: 15615569 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus halodurans # 335 476 180 334 336 109 39.0 1e-23 MKEQRVKQLNNFINENIIKRKAMFIPILGVSVFMLVGYAAVDKEAPKIVSNRIEVSYGDK VDLDAIDITDNQDSRPEIEVTANDLSSVNVNQLGTYDLSVTATDSFSNTASKVIKVDVVD DEAPKFKVAGVETGYVVQVPINGSQDISSYVTASDNVDGDVSPFIESNQELDTTKAGIQD IKLSVTDSSGNVNEKTFTFAVSDLTAPVVTLSQGNDIVIDYGSEFKLENFLTATDDQSAV TNTVTGEVDTKKENEVQTITVSTQDEAKNEVLTTLNFTVKDISGPQVNLSTNAVEVIKGD AFDPRQYLVSAIDNKDGDVTGNVVIGNIDTGSTGDKAVTYTVSDSSGNQTVATLNVKVYT 
PGSKILETAYTKLGSPYVWGATGPNSFDCSGFTSWVYRQHGISLSRTAQAQSQGGKAVDR ADLQPGDLVFFGSSTSRITHVGIYVGNGQMVHSPQTGDVVKVSSLNRNYVCARRYL >gi|223714134|gb|ACDT01000081.1| GENE 38 40178 - 40651 689 157 aa, chain + ## HITS:1 COG:CAC1363 KEGG:ns NR:ns ## COG: CAC1363 COG2032 # Protein_GI_number: 15894642 # Func_class: P Inorganic ion transport and metabolism # Function: Cu/Zn superoxide dismutase # Organism: Clostridium acetobutylicum # 16 153 31 179 182 102 39.0 2e-22 MNSFDLIRVLCNTSPKAYALITGATVTGTMLAYSFEEGTIVVVEAQGLPATGCGLGVHGL HIHEGSSCSGTPENPFGNAGGHYSTTNCPHPYHTGDLPPLFSEDGQAWMAVYISKFTPEQ IVGRTIIIHGNVDDFTTQPSGNSGPMIACGVIRELYQ >gi|223714134|gb|ACDT01000081.1| GENE 39 40717 - 41508 944 263 aa, chain + ## HITS:1 COG:CAC2423 KEGG:ns NR:ns ## COG: CAC2423 COG0561 # Protein_GI_number: 15895689 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 3 263 4 263 263 131 33.0 1e-30 MSKLLFFDIDGTLIECNLDIYSITENTRNALDRLKENGHDVFLATGRCKCFITEGVMNYP FSGYVTCNGAYVEYHGEPVYKAIIPSEAIKATMALSEQYDFNYYFESSDYIYVRDQNDER HQWFAKNWGMKPETVIDDFDPETIETYIGMIVVNDKKDISPMVNALSKYFDVQRHQSDYS FDLTLKGVSKAVGIEKLVERINRNIDDTIAFGDGRNDIEMLETVKIGVAMGNAVDEAKAV ANYETDRIENDGIVKALKHFELI >gi|223714134|gb|ACDT01000081.1| GENE 40 41664 - 42605 947 313 aa, chain + ## HITS:1 COG:BH0465 KEGG:ns NR:ns ## COG: BH0465 COG0530 # Protein_GI_number: 15613028 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Bacillus halodurans # 16 313 14 315 318 198 43.0 1e-50 MILQLFCLIAGFVLLIKGADIFVEGASKLASKLNIPPIVIGLTIVAFGTSAPEAAISITS ALGGNVDLAVGNIIGSNIMNVLLILGITGCIAKLKVNNNTYRYEIPFVMVITLVLLMLGK FGRSIDRFDGVLLWGLFLLFLYYLYRLVKKGEEVPLDEVEELDEKDTLLRLIIMIVLGMA AIVIGSNLTIDAATYIAEELGVSQRLIGLTIIAFGTSLPELVTSMTAAWKGKSDLAIGNI VGSNIFNILFVLGTTALISPKAVAFESGFIIDGIVAIGALFLLYTFIGSDGYLKKSGAII MLIGYLAYFVSIL >gi|223714134|gb|ACDT01000081.1| GENE 41 42615 - 43304 888 229 aa, chain + ## HITS:1 COG:CAC0430 KEGG:ns NR:ns ## COG: CAC0430 COG0584 # 
Protein_GI_number: 15893721 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Clostridium acetobutylicum # 2 226 6 233 249 141 43.0 8e-34 MIKLGHRGYSSKYPENTMLAFKAAIDGNFDGVETDVQLTKDNRLVLIHDEKIDRTSNGKG YVKDYTYDQLCQFNFNYRFDGIEAKIPLLEELLDYCIGKDVILNIEIKTDKIQYPNIERM TYMMIKEKGLLDRVMFSSFHLESLLNLRELDDSIYLGYLYEDNYQVNKAKVFEYRFNGAH PKYTFLNEKEINDYLRRGIDVNTWTVDSDDIKDFLVDEGVKTIITNKDI >gi|223714134|gb|ACDT01000081.1| GENE 42 43494 - 46142 2695 882 aa, chain + ## HITS:1 COG:SP2026_1 KEGG:ns NR:ns ## COG: SP2026_1 COG1012 # Protein_GI_number: 15901847 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Streptococcus pneumoniae TIGR4 # 1 465 1 463 464 657 70.0 0 MAEKKAVNKVVEPTNEEIDAHVDQLVAKAQVALSKFEEFTQEQVDYIVAKCSVAGLDAHG TLAEAAVNETGRGVFEDKAVKNLFACEYVTSNLRHLKTVGIINEDPLLGITEIAEPVGVV CGIVPTTNPTSTVIFKSLICLKTRNPIIFSFHPSAHACSVMAAVVIRDAAIAAGAPEDCI QWLDLKSMYATTALMKHPGVATILATGGNAMVAAAYSCGKPALGVGAGNVPAYVEKTCVL PRAVNDIVLSKSFDNGMICASEQAAIVDTEIYDAFMKEIKRFKVYFVNKEEKAKLETFMF GAVAYSDNVNAAKLNPNVVGKPATWIAEQAGFKVPEDTQIICAECKEVGPNEPLTREKLS PVLAVLKAKNTDDGILKATQMVEFNGLGHSAAIHTEDHEISKKFGHACKAIRIIENAPST FGGIGSVYNAFIPSLTLGCGSYGHNSVSNNVSAVNLINIKRIGRRNNNMQWVKLPPKIYF ERNSIKYLRDMKKMEKAMIVTDRGMYNLGYVEKIEDVIRRRRNQVDLELFFDVEPDPSFD TVQAGLELMNKFQPDTIIALGGGSSMDAAKVMWLLYENPEVCFDDIKQKFMDIRKRAFKF PELGKKANLICIPTTSGTGSEVTPFSVITDKSCNKKYPLTDYALTPTVAIVDPEFVMSLP AAIAADTGIDVLTHAVEAYVSILASDFTDGWAKQAVKLVFDYLEESVKEGTPLSREKMHN AATIAGMAFANAFLGMNHSLAHKIGGEFHVPHGRTNGILLPHVIRYNGSMPTKLNIWPKI ENFKADVKYMELAQLIGLNPKTPAEGVQMFADACEELCHKVNLASNIESQGIDKKDWDEA IHRMAMNAYEDQCTPANPRMPMVHDMEAILRTIWDYKNKFDK >gi|223714134|gb|ACDT01000081.1| GENE 43 46240 - 47499 1078 419 aa, chain + ## HITS:1 COG:no KEGG:SSA_0231 NR:ns ## KEGG: SSA_0231 # Name: not_defined # Def: hypothetical protein # Organism: S.sanguinis # Pathway: not_defined # 11 411 21 429 429 239 34.0 2e-61 MRRNIYNSIYYCVLGLILIVISSMAIIQREDFLMRVFDVLAWILIINGLHELSVFLRRRF 
RGDLINIIGNIGVGIFILSYTAIPIRLLFAIFAIYITLNGIIKFISYLNYKKDKVSKRFP VLCGALFLIIYGLALLLGRYADANAMMIFIGGYGLLLGINYIIDGIFTAIPQQHKDSLKR RIRIPVPIFISALVPKVMMDYINERLAVEPTEKFLDDQNHANVEIFIHVSPDGFGTIGHC DICIDNQVISYGNYDYDSIRLFETIGDGVLFIAPRESYIPFCIETDHKTIFSYGVRLTVK QLASVKREIEKLKENTYPWYPRSYKDRNDCNDYASRLYLRTGASFFKFKRGRYKTYFVLG SNCVKLAEAIMGKAGMDIIDLNGIISPGTYQNYLEKEYQRANGVIISKNVYNQLTIDHK >gi|223714134|gb|ACDT01000081.1| GENE 44 47899 - 49503 1788 534 aa, chain + ## HITS:1 COG:lin2704 KEGG:ns NR:ns ## COG: lin2704 COG0504 # Protein_GI_number: 16801765 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Listeria innocua # 1 531 1 532 532 711 62.0 0 MAKYIFVTGGVVSGLGKGLTAASLGRILKQRGLKVFMQKLDPYINVDPGTMSPYQHGEVF VTADGAETDLDLGHYERFIDEELNRNSSITTGRIYSNVISKERRGDYLGATVQVVPHITN EIKAKIYAAASSSNADIVITEIGGTTGDIESLPFLEAIRQVRLDLGYDNTLYIHTTLLPY IGASHEVKTKPTQHSVKELRGLGIQPDFIVCRSEKYIEQELKDKISLFCNVPTKNVISNY DVEVLYELPMMLLDQHMDDLVLEHLRIEAPQPNMSEWESLIERVKGLDHEITIALVGKYT QLPDAYLSVNEALRHAGYYENSVVNIEFVDSEEITKDNVAEKLKTADGIIVPGGFGNRGI EGMIDAIEYARINNLPFFGICLGMQLATIEYARNVCGLKDANSLEFDELTKNPIINLMSD QSLTDMGGTQRLGDYNCELAAGTHARELYGVDMIQERHRHRYEFNNEYKDALVEHGLTIA GINPERNLVEIVEIKEHPYFVACQFHPEFTSRPNRSQPLFQGLITAAHKRKYHG >gi|223714134|gb|ACDT01000081.1| GENE 45 49656 - 51143 1634 495 aa, chain + ## HITS:1 COG:TM0033 KEGG:ns NR:ns ## COG: TM0033 COG4099 # Protein_GI_number: 15642808 # Func_class: R General function prediction only # Function: Predicted peptidase # Organism: Thermotoga maritima # 146 434 149 394 395 93 27.0 8e-19 MRKIKKLFAMMMTLVMTLGIITTNISAIEELKVTSIVANTYVGDGQREVSSFEITVNDIN LVNDLKAEDFDITNNVSSVPYDVATNALADDYLDDGISLEIKGNTIILNVKAFDYTGRYN PDFSKNPWQVTCNKYGVLSFNKDNVTTLKTKTLDDAIRGTFTYAGLTREYALYLPKNSDG TNMKNVPLVVWNHGGGEYAGQLENTLVANRGLTAWVEAGYNTAVLQIQVSNPNYSYGTAF DEDKKSLIDQNNALQAALIKKLIAEGTVDQNRVYVTGASSGGGATMRFIMQYPELFAGAI ACCSMDPIVWVHYNYQDSYEQIVENFETAFQGQVYTWDETKEMMVSKQVDTASLVELPIY FTHAQNDTTCNVNSSKAMYEALNNLGAKNNHLTIWSNQEMTDEGIAEGLLHWSWVKVLNH 
NEEGSPMNWLFKQTNASENIPVEKEESNNSKTETTASEEAKKAVKTGDDSTVELMAMMTG LSLLAGTVVLSKKYY >gi|223714134|gb|ACDT01000081.1| GENE 46 51297 - 52070 614 257 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756044|ref|ZP_02428171.1| ## NR: gi|167756044|ref|ZP_02428171.1| hypothetical protein CLORAM_01564 [Clostridium ramosum DSM 1402] # 1 257 14 270 270 496 100.0 1e-139 MIKNITLLDSILDVANCVDINDNNTYIARKVIENCVWLKDMSLEDFLETTNINGSQFKRF YKLYDCNNFSILKERIGLWCDIRKEQVVKQCQWQKQEPLARVIFNLTNYNDFNDFFNVEL IDRICKQIYQSKRIIMYGAMGLLNLTHDFQIDMKLFGKDFIRSSMYEDKALVPQKDDFVW LFSMMGRTMNMVGTTMRLKIFNGPCKKLLITQHGALLESDFLISLNTDNDYYEGQYVFMF YLDMIKTRYYELYIKEN >gi|223714134|gb|ACDT01000081.1| GENE 47 52075 - 52815 629 246 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756045|ref|ZP_02428172.1| ## NR: gi|167756045|ref|ZP_02428172.1| hypothetical protein CLORAM_01565 [Clostridium ramosum DSM 1402] # 1 246 1 246 246 460 100.0 1e-128 MILGKVMMAQSANKRGSVEYQISKKTLMLGNRIKDYSLEMFIKEAGVSKTSLMRYLNSLG VNTFTSFREIITLESIKGITALETIKYNYKEEQLDKQAVKLGKIIKEAKRVIVLGDGNCF SLMIYMKAFIHLGIDFEIPIYLGKEEDIFDEYALSKDDLVIFISLHETFRSFLDHRTMFY KDVRYFQFNCPCRVAFVGEVDLGQEPLSEYQFIVDIDGLIYERKMRLDKLFEEAMSYLCL SYEFVY >gi|223714134|gb|ACDT01000081.1| GENE 48 52833 - 53588 698 251 aa, chain + ## HITS:1 COG:no KEGG:lse_0378 NR:ns ## KEGG: lse_0378 # Name: not_defined # Def: hypothetical protein # Organism: L.seeligeri # Pathway: not_defined # 29 236 14 218 243 104 32.0 3e-21 MGIFFCFCYNTVYYGGGNMLETKSTKKFKIIKGIICILTLAMLVAAVIFLFIVDDMKKKR RLIFSIVQLLLMIGIIILPEQLKERLDLKIPIMLETSLTVFAFCGFVLGDVFDFYKKIPI WDSILHAFSGVILAYAGFVLIDYFVKRESVNISMGHMFICTSVVLFSLALGALWEIGEYL VDDIFGTNNQQYMKSTRGTLYGQKDVPLEGHAALGDTMKDLMLDLAGATAIVTIEYCKED YRKKKAKKEDK >gi|223714134|gb|ACDT01000081.1| GENE 49 53661 - 54143 686 160 aa, chain + ## HITS:1 COG:BS_yacN KEGG:ns NR:ns ## COG: BS_yacN COG0245 # Protein_GI_number: 16077159 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Bacillus subtilis # 1 157 1 157 158 169 50.0 1e-42 
MYRIGQSIDIHQLVAGRKLILGGVEIEHERGLLGHSDADVLCHAVIESIIGALGLGDIGK HFSDTDAQYKGISSLVLLEKTYEMMDDRGYEISNLDAIILIEKPKMAPHIQKMKENISRI LKCDIADVNIKATRGEKLGFVGHEEGAVSQCVVMLKKKGI >gi|223714134|gb|ACDT01000081.1| GENE 50 54146 - 55579 1799 477 aa, chain + ## HITS:1 COG:BS_gltX KEGG:ns NR:ns ## COG: BS_gltX COG0008 # Protein_GI_number: 16077160 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Bacillus subtilis # 3 476 4 480 483 566 59.0 1e-161 MEKVRVRYAPSPTGHLHIGGARTALFNYLFAKHHGGDFIFRLEDTDIERNIEGGEASQLE NLAWLGIIPDESPLNPKPEYAPYRQMERLDIYRKYTQELLDRGLAYKCFCTPEELDAEHE RQVEAGVAPMYNRKCRDLTAEEVAAKEAAGIPYTIRLRVPANKTYEFDDMIRGHVSFESN DIGDWVIVKANGIPTYNYAVVIDDHTMEITHVFRGEEHLSNTPKQMMVFEMLGWKAPTYG HMTLIVNEQRKKLSKRDESIMQFVSQYKEEGYLPDAMFNFMALLGWSPEGEQEIFTKEEL IKEFSETRLSKSPSMFDKDKLTWVNNRYIKERSLEDVVALCRPFLEEAYDLSSKSEQWIN DLVATYHDQLSYGKEIVSLVDLFFTDELNLDEEAKEFMKDETIPNTLKVFKAQLENLEEF TKENIQACIKATQKEAKAKGKMLYMPLRIATTGIMHGPDLASSICLLGKEKVLSRLG >gi|223714134|gb|ACDT01000081.1| GENE 51 55728 - 55901 183 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756049|ref|ZP_02428176.1| ## NR: gi|167756049|ref|ZP_02428176.1| hypothetical protein CLORAM_01569 [Clostridium ramosum DSM 1402] # 1 57 1 57 281 94 100.0 2e-18 MSIIAKLTEKKDFSASESKIADYIIENKEEILHLTIRELAKTTYTGASTVMRVIKKI >gi|223714134|gb|ACDT01000081.1| GENE 52 55954 - 56574 612 206 aa, chain + ## HITS:1 COG:no KEGG:SUB0807 NR:ns ## KEGG: SUB0807 # Name: not_defined # Def: transcriptional regulator # Organism: S.uberis # Pathway: not_defined # 17 195 92 271 290 64 26.0 3e-09 MMSKGNNKILFQIKKQETAFSVMEKIASVEKDTIDRTKLLLNYQQIERITKLINKATIIY IFADGINEQIGHEFKYMMARIGKAVEIATDNFWVALNCLSESANPLAIYVNHREHNVQLL EKIRLNHKIAIPGIAVTGYINSTFETCCKEVITIPTGVSYSDLAPVIYTTAIRYVLNTLV GCVIASNHELSMDKLIDYTELSQTVK >gi|223714134|gb|ACDT01000081.1| GENE 53 56844 - 62684 6658 1946 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and 
metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 739 1003 42 325 757 77 26.0 3e-13 MRRTKRIIASLVAAVTVISQVMPMMKVTAATDNVRLATFNIAANKKPDITKLNELLKTNN VDIVGLQEVDINTSRNPYNMLEKFVEQGDYAYSSFQKAIETGGGDYGVALLSNLELINTN GGALNSEGIKEARAWQKGEIEVNGKVIAVYNTHLTYESVEARAKQLLELKATMDQDPAEY KVAFGDFNVDQNHNEIYPFLEDYNIANGKGGKWYDTFNGTDASMKTSAVDNIITSRNLEI SNITMVETDLSDHNLFYADAKLLDQPVASRDYLDLVISDANALDEGKYSNASYNTLDEKL VAGELLDQNATQEQINKAVKDIQTAQDNLKTRYPNLALNKDVSVSGLEVNDGRWTAPMAV DGIVSSDSRVSLAKDKDEQWLEVDLGKTETISNFVINYESQVPAFKIQVAGEDKVYRDIY EVSNLEGQISNIQKISIDPTEARYVKYVQLKRWTHSGNGKQYSGSIYEFEVYEQDPDAKD YSNIALNRPVSASSLEVGDGRFTAPMAVDGVVSKDSRVSFGKTADNQWLLVDLGSRKTVS EFVINYESCPPAFKIQVSTDGVNYQDVYEAADLPDGGPSEIKEIQIEATKARYVKYVQLK RWTHTNKNKYSGSIYEFEVYKERDHTVTTADEVLKSLNKEAPQINGNKIVLPAVPDGFEI SLYGCDNKQVVAMDGTITRPLEDMDINVLYQVKNLNDETDIAYSEADINFIVPGAYNKEE SINAKPNVMPGLREWKGNEGEFTLSKNSKIVVKDASLEATAKQIQFYLKEMIKHELEIVT SGEKSGDIVLVKDGSVDNLGKEGYTLAIDDMIEIKSNYETGILYGGISITQMFSQNDGMN NVAKGLARDYPKYEVRAGMLDVARTYIPMEYLQEMTIYMAYYKLNEVQVHVNDYWGATGY SAFRLECDTYPMITATDGSYTKEEYRQYQIDMKNYGINVITEIDTPYHAECFRNVPGVKM LKAGALDIREQSSYDVIEAVLDEYLDGPNPVIQSDKFHIGTDEYDKSYSEQMRKWTDHFI NYVNDKGYDTRLWGSLGSRGFNGTTPVSNDATVNLWAPYWADVKETYNAGYDIINTCGGW LYIVPGANAGYPDRLDIASLYDKFDVNNFSPSRNAGQGTAIMPIAHPQTKGGQFCVWNDM TSYGGGFTWFDIYDRFIDAVMITSEKTWYGEKTDGQTAAEFVERANKLNSVVPGANPGRV VESKGENITAIDFETIKNGKAVDTSGNNYNATLNNAIIEKANGNNIATVNGDGYVALPYK SIGYPYTVMMDIKLDSNTSENTEIFSGDDGVMYANIDGTGKIGYSRSGYNFTFDYELPKD TWVNLALTCDQKNTTLYINGKAITTGINMGTAINNRKDSSTFVLPVEKILNNAKGSVDNI KIYNKTFSATEISDELGYIQLENLALNKNATASSVNPLTPNLTPNLAVDGDNTKVTTNRW SSKRATGAGTNEGDTEYGTVEQDLTVDLGANYKVDKVFISWEGAYATKYTIQGSLDGVNF FDIKDITNGTGGEITHKDLGDVETRYVRIACHDAKNRNWGYSIYELEVYQSTNEGLRALV KEGQAELAKFDAGNKNGNIAKDDYEVFEQKFEDYLALADGTEISTEEIGKLLKEVTAIIN SIEDSVIYSKNELNELYQKVSLYQEKDFTKDSFVDFNTKLTAIKGLIDNADDYIKVNEAM GQLQTLDSSLVKLDRSKLKTAIESYDNFQETDYTTSSWQTMKTIVDAAKLVYENVSLNQV DVDNAIAALNEINLEKRGNTKELAKLLKEYDEADYTISTWAEFAGVYQAANAMVVDNSDV 
NQEQVDAMVALVREAAMKLLELGNSKDLEKVINKITEAIKDLDQANYNQDSWNKLQTMIK AAKELLDSKDVSQAELDAMLDSLNQVYDALKPIEVKPGEETETPTNKPVTDTSDKSVKTG DDVNLFGYAGLLGIALMGLICRKKFD Prediction of potential genes in microbial genomes Time: Thu May 26 10:04:39 2011 Seq name: gi|223714133|gb|ACDT01000082.1| Coprobacillus sp. D7 cont1.82, whole genome shotgun sequence Length of sequence - 65903 bp Number of predicted genes - 57, with homology - 57 Number of transcription units - 27, operones - 16 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 203 - 262 6.2 1 1 Tu 1 . + CDS 294 - 5480 6241 ## COG3525 N-acetyl-beta-hexosaminidase + Term 5485 - 5533 8.4 + Prom 5510 - 5569 8.2 2 2 Tu 1 . + CDS 5592 - 7823 1842 ## COG2200 FOG: EAL domain + Prom 7831 - 7890 7.5 3 3 Op 1 . + CDS 8001 - 8801 913 ## CPF_1087 pullulanase family protein 4 3 Op 2 . + CDS 8802 - 9383 542 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 5 3 Op 3 . + CDS 9371 - 9805 434 ## COG1671 Uncharacterized protein conserved in bacteria + Prom 9809 - 9868 10.5 6 3 Op 4 . + CDS 9900 - 12377 2192 ## COG2199 FOG: GGDEF domain - Term 12351 - 12388 4.0 7 4 Tu 1 . - CDS 12392 - 13978 1706 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) 8 5 Tu 1 . - CDS 14375 - 14839 558 ## COG2606 Uncharacterized conserved protein - Prom 14860 - 14919 7.7 + Prom 14808 - 14867 6.8 9 6 Op 1 . + CDS 15054 - 15872 964 ## COG0561 Predicted hydrolases of the HAD superfamily 10 6 Op 2 . + CDS 15872 - 16051 260 ## gi|167756060|ref|ZP_02428187.1| hypothetical protein CLORAM_01580 + Term 16052 - 16086 2.1 + Prom 16097 - 16156 11.7 11 7 Op 1 . + CDS 16177 - 17379 1397 ## COG1171 Threonine dehydratase + Term 17393 - 17425 1.1 + Prom 17425 - 17484 8.0 12 7 Op 2 . + CDS 17525 - 19180 1900 ## COG0366 Glycosidases + Term 19193 - 19228 4.1 + Prom 19226 - 19285 6.3 13 8 Tu 1 . + CDS 19309 - 21171 1679 ## COG4932 Predicted outer membrane protein + Term 21183 - 21213 1.0 - Term 21164 - 21204 1.9 14 9 Op 1 . 
- CDS 21228 - 22112 715 ## Ccur_08300 glycosyl hydrolase family 25 15 9 Op 2 . - CDS 22168 - 22638 482 ## COG4420 Predicted membrane protein - Prom 22690 - 22749 10.3 + Prom 22849 - 22908 11.2 16 10 Tu 1 . + CDS 22966 - 26184 3993 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) + Prom 26454 - 26513 10.8 17 11 Op 1 24/0.000 + CDS 26543 - 27613 1411 ## COG0505 Carbamoylphosphate synthase small subunit 18 11 Op 2 . + CDS 27615 - 30812 3763 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) + Term 30820 - 30862 6.4 + Prom 30941 - 31000 7.5 19 12 Op 1 . + CDS 31067 - 31276 92 ## gi|167756069|ref|ZP_02428196.1| hypothetical protein CLORAM_01589 20 12 Op 2 . + CDS 31222 - 31422 184 ## gi|167756069|ref|ZP_02428196.1| hypothetical protein CLORAM_01589 + Prom 31424 - 31483 3.5 21 13 Op 1 . + CDS 31530 - 31799 352 ## DSY1231 hypothetical protein 22 13 Op 2 2/0.000 + CDS 31827 - 33260 1516 ## COG2199 FOG: GGDEF domain 23 13 Op 3 . + CDS 33271 - 33735 420 ## COG0346 Lactoylglutathione lyase and related lyases 24 14 Tu 1 . - CDS 33772 - 34254 648 ## COG4708 Predicted membrane protein - Prom 34395 - 34454 9.6 + Prom 34353 - 34412 11.9 25 15 Op 1 . + CDS 34500 - 35330 962 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 35333 - 35392 3.5 26 15 Op 2 . + CDS 35413 - 36996 1747 ## Lebu_0189 phospholipase/carboxylesterase - Term 36819 - 36852 -0.1 27 16 Tu 1 . - CDS 37032 - 37802 713 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 37822 - 37881 6.2 - Term 37850 - 37883 1.1 28 17 Op 1 . - CDS 37893 - 38567 705 ## COG2013 Uncharacterized conserved protein 29 17 Op 2 . - CDS 38569 - 39609 1100 ## COG2706 3-carboxymuconate cyclase - Prom 39738 - 39797 10.2 + Prom 39787 - 39846 10.8 30 18 Tu 1 . 
+ CDS 39942 - 40403 385 ## gi|167756079|ref|ZP_02428206.1| hypothetical protein CLORAM_01599 + Term 40405 - 40448 1.3 + Prom 40733 - 40792 5.8 31 19 Op 1 3/0.000 + CDS 40813 - 41304 572 ## COG1522 Transcriptional regulators 32 19 Op 2 1/0.000 + CDS 41304 - 42473 1246 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 33 19 Op 3 . + CDS 42544 - 43464 733 ## COG0583 Transcriptional regulator + Prom 43482 - 43541 5.6 34 20 Op 1 . + CDS 43605 - 44117 791 ## gi|237734068|ref|ZP_04564549.1| predicted protein 35 20 Op 2 . + CDS 44171 - 44740 463 ## TSIB_1498 peptidase S26B, signal peptidase (EC:3.4.21.89) + Prom 44750 - 44809 4.4 36 21 Tu 1 . + CDS 44851 - 47169 2564 ## COG0474 Cation transport ATPase + Term 47225 - 47274 3.2 + Prom 47267 - 47326 10.5 37 22 Op 1 . + CDS 47385 - 48017 552 ## gi|167756086|ref|ZP_02428213.1| hypothetical protein CLORAM_01606 38 22 Op 2 . + CDS 48031 - 49017 987 ## EUBREC_1720 hypothetical protein + Term 49068 - 49105 3.2 + Prom 49055 - 49114 5.5 39 23 Op 1 39/0.000 + CDS 49149 - 49991 1172 ## COG0226 ABC-type phosphate transport system, periplasmic component 40 23 Op 2 38/0.000 + CDS 50001 - 50933 849 ## COG0573 ABC-type phosphate transport system, permease component 41 23 Op 3 41/0.000 + CDS 50926 - 51786 911 ## COG0581 ABC-type phosphate transport system, permease component 42 23 Op 4 32/0.000 + CDS 51788 - 52543 210 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 43 23 Op 5 7/0.000 + CDS 52553 - 53212 762 ## COG0704 Phosphate uptake regulator 44 23 Op 6 40/0.000 + CDS 53225 - 53908 967 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 45 23 Op 7 . + CDS 53895 - 55592 1870 ## COG0642 Signal transduction histidine kinase + Term 55593 - 55624 1.1 + Prom 55594 - 55653 6.1 46 24 Op 1 . + CDS 55680 - 56399 629 ## Mevan_0168 hypothetical protein 47 24 Op 2 . 
+ CDS 56452 - 57348 878 ## COG1307 Uncharacterized protein conserved in bacteria + Term 57363 - 57430 -0.6 48 25 Op 1 40/0.000 - CDS 57420 - 59507 1427 ## COG0642 Signal transduction histidine kinase 49 25 Op 2 . - CDS 59516 - 60226 750 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 50 25 Op 3 . - CDS 60302 - 60979 400 ## COG0398 Uncharacterized conserved protein - Prom 61114 - 61173 4.4 + Prom 60911 - 60970 8.8 51 26 Op 1 . + CDS 61163 - 61522 334 ## BCE_4519 ArsR family transcriptional regulator 52 26 Op 2 1/0.000 + CDS 61522 - 62187 565 ## COG0778 Nitroreductase + Prom 62203 - 62262 9.3 53 26 Op 3 . + CDS 62298 - 63617 467 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 54 26 Op 4 13/0.000 + CDS 63642 - 63950 476 ## COG0526 Thiol-disulfide isomerase and thioredoxins 55 26 Op 5 . + CDS 63952 - 65112 718 ## COG0785 Cytochrome c biogenesis protein + Prom 65118 - 65177 3.5 56 26 Op 6 . + CDS 65201 - 65569 373 ## COG1733 Predicted transcriptional regulators + Term 65573 - 65628 11.1 + Prom 65594 - 65653 8.1 57 27 Tu 1 . 
+ CDS 65679 - 65901 148 ## gi|167756106|ref|ZP_02428233.1| hypothetical protein CLORAM_01626 Predicted protein(s) >gi|223714133|gb|ACDT01000082.1| GENE 1 294 - 5480 6241 1728 aa, chain + ## HITS:1 COG:VC0613 KEGG:ns NR:ns ## COG: VC0613 COG3525 # Protein_GI_number: 15640633 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 403 642 129 373 637 70 25.0 2e-11 MEKILRKGFISLLSLLMVFTTFNQLITNVAAEQEYENIALDKVVVASSAIAGNEGSNAVD GKMDTKWEVGNISTNPSIIVDLGKEESFEKITLKWASNFAFRYKLYVAKTDQKYGQVLFH EEGTGGDEEWEMEENTIARYVKIELIEAKSGESTIGLKELEVLRVKKDTSNDPVNIALNK PAYASGNEANLDRLGPGGAFDGKYSNDYNERWSSGNITKSPQWIYVDLEAEMTFSSIHIF WENAFATKYVLQYSNDVGTDKNWITFVNVDDGKGSWEKFDFDPITARFVRLYATASNGVG SAKPISIWEMEIYEEVNNSLTVDDIAKDFTIGTITSNMDKMPVYHYDNDNIVAEIFCSDT DFVVEDNGMIHKPLVDKEVLVTYIIKDKRTNATKEMKNIPVIIPGQNSVALDSNPVPTTV PTIQEWNGATGTYQLTNNSRIIVQDESLYKAAQITKDDIKDLFGLELEISNTTVKKGDIL LTKENADTTIGDQGYTLEIGDSITIKSTTYQGIFFGTRSVLQALVASGDALTINKGEARD YAKYEYRKFMLDVARKYMPMWYLKDFVKYASWYKFTDIQVHLNDTSYNGHSRFRLESDIP GLTATDGYYTKGEYRDFQYYAEDYGIRIVSEFDTPAHSAVFIDNNSELGFDGTHLDIRLN SDKKETVYKFIADLYEEYMGGDNPVFVTDAFNVGLDEYNSAYKEDMTQYTQYVMDLVYNK YGKTPMAWASMGCVDSDKTKIPDYPIMDAWANYAISLKSLFNQDYKLINATNKYGYIVPG GNNGYPDFAKEEEMYNNLSAGKFRDKMGAGVDVAEGHPKIVGGSISLWNDRGIFNGISVY DVFARTQSILPIYAQTFWYGKDNDKSYDQFKAEVNTLGTGPNVEMDKEISSKTEKVYDFD MENTSKQGDNLVIKDNSGNGYDATAIKATVETTDEGQALKLNGDGYLQMQHSALKWPYTL AFDLKIDESQTGDIVLFEETMPIEECNQWIDTKYQTRKIYLKEIDGKYKLMYDRDSYHYE HNIELEKGKLYQLAFTSDKKYTNVYLDGVVKSTINGPILTNSGNKWYDSASINLPLQKIG QNLVGTLDNIKVYNRLLNNEEIKNLYDTGIDVIHENIALNKEATASSSYTDYQTPNKAFD GIINQSASGPEQSRWASARTHDQWIQVDLNKVYKVDQIKITWEDAYGVDYELKGSVNGKD WFTIKNVTGNVAKENNHTGLGDIEARYIKLIGTKAANNAKYGYSIYEIEVYENPKNDLLR TISQVKDKLSEMNVGNFHGQIKEEAYQTFTAYLTNLENEVVGKESISEEEMLKFKEELSK KYTNLKKQVVRVDKGALETVLEVAESKLAEGYSEANSTVSSWQSFIQNKKAAEAMADRMD ITQSDVDKMVSELQTALDNLTFRALESSFNELVALIDEVTKMETDYSAEMFKVMKGYLDQ ANALVALGAGEVVEKDVQANLTNLANEKAILISYWELKVSVEVAKEILDKEAVNLRPATL 
KVLQDAYEAGRKLIDDNSKELIVLQNAKQEINEAIEGLLEIVERKDLDKLIEQVKDLKEE DYTEESWKTFKNGYERAVTVNQDLDANEGAIQNAYDELLRAFNELKQVVNKDNLLLQIEL AKQVLANKDKYDIELVKELEELLSQAVALIANDVSSEDVNKMSDALLTKIEAVKASQKEE PKVEEQPGTEVSDKTKKNAKTGDETMLFEFAMISLVALLAVVRLRKKS >gi|223714133|gb|ACDT01000082.1| GENE 2 5592 - 7823 1842 743 aa, chain + ## HITS:1 COG:BS_ykoWm_4 KEGG:ns NR:ns ## COG: BS_ykoWm_4 COG2200 # Protein_GI_number: 16081166 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Bacillus subtilis # 493 739 8 253 259 184 36.0 5e-46 MKIKNFKRIYVDIVISCLIIAVVATFFAFKSQTISQEQVLNTLSEISSQSVNVIDKEIQK NVAVLANLSIYISQEDAFDPVKIINKIKKVNEINNFKRIGIIDERGQSYTTDDNNILLNE QQMTRFNKAMNGEVSITDTLPDLIDGEEVSVYTLPITFDNDQHCVLFGAYASRYYKETLS VSTFDGLGYSFIIKENGDKIVDSSNKTSIKFSNFFDDIGAISKENKEKLENLKHNIKNEK SGFLEYDRTDEARYLYYQKLDVNDWYLLSVIPASVVDNNINGMLFLGYMMLGCCLLGILY LITRIVLMYRNNQKRLEEVALVDEVTGCGSYTRFRLAGGDILKSTKSKYTMVYFNILKFQ YINDLYGYDEGNLILVKIAKILNNLLQEKEFCARIQADHFVALLQYDSVTQLKKRAEYMI KNMENAINSNDEAAYRIKMIVGIYLIENYNDRIESMVDRAATVIRDENRHELENCNFYND NIRQRMYKNKELEDLFEEALHNHEFEVYFQPKYSTKKKRLYGAEALVRWKSSKLGSISPG EFIPVLERSSNIIELDEYVFVITCKQIRKWLDDGLEVAPIAVNVSQLHLYRQDFVESYLK VIDAYQIPYNLIELEMTETSLFGNRDILKDILNKFRKLKITVSMDDFGSGYSSIMMLESM PIDCLKIDKTMIDDLEVNSKAKEILKSIISLAQSLGLVVVAEGVEQAGQYEELIRMKVDY IQGYYCARPLPLDEYEKILKLKK >gi|223714133|gb|ACDT01000082.1| GENE 3 8001 - 8801 913 266 aa, chain + ## HITS:1 COG:no KEGG:CPF_1087 NR:ns ## KEGG: CPF_1087 # Name: not_defined # Def: pullulanase family protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 183 263 2424 2502 2638 63 40.0 8e-09 MKNRIISVVLSFFVIVNIILIACAFNVTPLTIKRDTFVYEYGSEISTKPQDYINANEAIL SQVILNFSNLKNEIGEYKVSATYLGVEYPFYIKIVDTTKPVVTLKASTFNVHLNTEVYAI DLIEKVEDNSDIAAYFIDENGEKSTHKVFTEKGSYVERIIVEDQAGNQSASLRVKIVAGQ NGNNPTLTGIDDIEVLKNSKFNPLDGVKATDGSGNDITKNIKILKNNVNTDKVGDYEVIY SITNDKGHNLQRTRRVSVIKTEKAGE >gi|223714133|gb|ACDT01000082.1| GENE 4 8802 - 9383 542 193 aa, chain + ## HITS:1 COG:BH1421_1 KEGG:ns NR:ns ## COG: BH1421_1 COG0705 # 
Protein_GI_number: 15613984 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Bacillus halodurans # 3 171 189 348 349 70 28.0 2e-12 MITNAIIIISIIIFILINFIDKSNYDKTSLAIKYGAFYMPRIEVKKEYWRFITANFVHID IMHIFMNVYCIYNLGHFFEYLLGTGAYLYLIFISCLATSGACYYHGKKYAYAYNTVTLGA SGIFYGYLGAMIALGILVGGYFDYMLQQYIYVILINVAFTFFNRQVSKAGHFGGLAGGFL ATAMLIGCGVWHF >gi|223714133|gb|ACDT01000082.1| GENE 5 9371 - 9805 434 144 aa, chain + ## HITS:1 COG:CAC2825 KEGG:ns NR:ns ## COG: CAC2825 COG1671 # Protein_GI_number: 15896080 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 144 1 146 148 99 32.0 3e-21 MALLVDGDACPDLPAIRDLAWKYQVEMTVFVDYAHFIQDDYFRTILCEVGSDSVDLVLLK QVQANDLVITQDYGLASLVLSKGAKVLHISGKVIDDNNIEELLMSRYVSAKQRKSGRRTR GPAKRTDEVRNQFLKQLDKILIQA >gi|223714133|gb|ACDT01000082.1| GENE 6 9900 - 12377 2192 825 aa, chain + ## HITS:1 COG:PA4929_2 KEGG:ns NR:ns ## COG: PA4929_2 COG2199 # Protein_GI_number: 15600122 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 649 824 4 181 196 98 35.0 5e-20 MKTLKTILVLLIMSMLINIIPIKAASRTVKVAYPIQSGLTEIDEDGLYSGYTYEYLKEIS RFTNWNLEFVRLEGTVNEQLSSAMEKVKTGEIDLMGGMLYIDTLINDYDYSATNYGMGNM AIYVNSNNAEINDTSIYSLKNLNVGVVSTKKETNVNLKDFGDINGININQIFFDNTPDLL TSLERNEIDAMVLSDLSIEEGNYRVVARFSPRPFYFVMTKGNQELMSELNEAMTQLNKEQ PNYMSNLHERYFSLANRNFILTEAEKGFVANNSKIDVLILGGKAPIQYYDEKKGKVVGIT VDVLNYIGKLSGLKFNYIYANDIDEYNKQIETKKPMIIGGVYNSNPEKYITTKAYLDSAM TLVTNKNINANELDGKQAAMIYDSEVDSAAVGDSSNKVNYYGTQLECIKAVENGTADYTY INNHVALFYNSNYDFKNINIIPQGNLRNQGSYFAINTGVADELLDIINKGIDVVSYSKIE EIIFTNASTSENDVTFFDYVENNQLQVATFTLMAVAIIVLFFLIWRNYINKRNNEKILSE YKRYQQISEFSHDCFVEYNIATDTLVLMGGGAKLLSKETIIKDFLKRPISNKDTLEMSLR NLTECESEEFVKFIDGKKRWLKINLRPIFDLNNKPTHIIGKVIDIHSEKEEQLLWRELAQ KDSLTKIYNSAACREKAEEFLQNNTAAQVALIIIDIDNFKYINDTYGHLCGDIVLQKIAQ ALLEVSHPLDIFGRVGGDEFLTLIKYPKSIEKVAEYCQNMIKSTSKVESQGTRIITSISV 
GAVVSSVSLSYDELYRLADDALYRVKNNGRNNYCIIDYDKENTSS >gi|223714133|gb|ACDT01000082.1| GENE 7 12392 - 13978 1706 528 aa, chain - ## HITS:1 COG:L00396 KEGG:ns NR:ns ## COG: L00396 COG0367 # Protein_GI_number: 15672334 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Lactococcus lactis # 1 528 1 530 530 715 64.0 0 MCGFLVVGSSELELQVFNDALNQAAYRGPDHQHVSINNGIAWGFNRLSIMDLSSKGNQPF VYKDCVLVCNGEIYNYHSLKAMLETDYTFKSGSDCEVLIPLYHKYGIEILCKFLDAEFAM VIYDAKKDKILAARDPMGIRPMFYGYHKEDGKICFASEAKGLFPLCQEIMPFPPGHYYED GKFICYNDLSNPKTISNDDLETIATNLRLKLEKAVKKRLQSDAPIGYLLSGGLDSSLVCA IAARLNDQPIKTFAIGMDTDPIDLKYAKQVADYLGTDHTEVIMTKEDLLENLRDVIFHLE TYDITTIRASMAMYLLCKYIHEKTDLKVILTGEVSDELFGYKYTDFAPDPSQFQKEAQKR IKELYMYDVLRADRCISAHALEGRVPFADIDFVDYSMSIDPAKKMNIYNKGKYLLRHAFE NQDYLPDEILFREKAAFSDAVGHSMVDYLKEYAATLYTDEDVIKAKEKYAFATPFTKESL LYRDIFESFYPGQAKWVKDFWMPNKDWENCNVDDPSARVLKNYGDSGK >gi|223714133|gb|ACDT01000082.1| GENE 8 14375 - 14839 558 154 aa, chain - ## HITS:1 COG:BS_ywhH KEGG:ns NR:ns ## COG: BS_ywhH COG2606 # Protein_GI_number: 16080800 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 152 1 154 157 175 57.0 3e-44 MSLQQAKNYLKQFNREQDIIEFDVSSATVELAAKALNCAPERIAKTLSFKINERAILIVT AGDVKIDNRKYKETFKTKAKMLARDEVLPIIGHDIGGVCPFGVNPDVTIYLDESLKRFQT VFPACGNSNSAIELTIDELSSYIKAEWIDVCKTI >gi|223714133|gb|ACDT01000082.1| GENE 9 15054 - 15872 964 272 aa, chain + ## HITS:1 COG:lin2847 KEGG:ns NR:ns ## COG: lin2847 COG0561 # Protein_GI_number: 16801907 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 266 1 276 279 186 38.0 4e-47 MTYKLIALDLDGTLKSTDKQILPKTKAILQELAKRGVVIVLASGRPTAGLYAEANELELD KTGGYLLSFNGAKVVDYKTKEIIYQKVYDAKTAHQVYERAKQYNLAVMTYTDELILTEDE TDEYVCIEGDINHMPIKHIDNFKDAVNFSVNKVLLTAKPEYAETILDEFKAPYGDSLSIY RSAPFFIEVMAQGIDKAASLQALIERLGIKRDEVISFGDGYNDLSMIEFAKFGVAMANAV DEVKKRAAYITLSNDEEGIYECLKMLSEKGEI 
>gi|223714133|gb|ACDT01000082.1| GENE 10 15872 - 16051 260 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756060|ref|ZP_02428187.1| ## NR: gi|167756060|ref|ZP_02428187.1| hypothetical protein CLORAM_01580 [Clostridium ramosum DSM 1402] # 1 59 1 59 59 84 100.0 3e-15 MLPNDINMLVSFLNTKLRDDDMSLAQILEINDANEIEIMELLDVNGYYFDEKINQIKIK >gi|223714133|gb|ACDT01000082.1| GENE 11 16177 - 17379 1397 400 aa, chain + ## HITS:1 COG:FN1411 KEGG:ns NR:ns ## COG: FN1411 COG1171 # Protein_GI_number: 19704743 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Fusobacterium nucleatum # 1 399 1 399 404 465 67.0 1e-131 MLKLEDFKAAKQRVDEVILETHLIHSDAFSAECGNDVYIKPENLQKTGAFKIRGAYNKIV KLDDEAKKHGLIASSAGNHAQGVAYAANKLGVKATIVMPSHTPLIKVEATKAHGADVVLA GEVYDDAYAKAVELQESEGYTFVHPFDDEDVMEGQGTIALEILEELPDADMILVPIGGGG LISGIAAAAKLIKPDIQIIGVEPEGAASASAAIQEDQVVLLNEVNTIADGTAVQQIGSKT FKYIKEYVDDIVLVNDYELMEAFLLLVEKHKLVAEGSGILALAGLKKLKTKGKKVVSLIS GGNIDVLTISSMINKGLISRGRIFTFSVQLPDKPGQLELVSKVLNQCNANVIGVDHNQFK NFARFSEVELRVTVETNGNAHIQMIIDEFNKLGYTITKIN >gi|223714133|gb|ACDT01000082.1| GENE 12 17525 - 19180 1900 551 aa, chain + ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 3 551 4 561 561 445 42.0 1e-124 MEKWWQKEVVYQIYPKSFKDSNGDGIGDLQGIIEKLDYFSDLGVTSLWLCPIYASPMDDN GYDISDYYAINPMFGTMEDLDELIKKGKARGIKIIMDLVVNHTSDEHPWFKAAIADPTSK YRNYYIFKHGKDGRAPTNWRSIFGGSVWQKVTEDEYYFHAFSKKQPCLNWENPVLRQEIY QMINWWLDKGIAGFRVDAINFIKKDQRYQDGEPDGEDGLVSCIPFARNQPGIEVFFKELK ENTFAKHDCMTISEAYGVSYDKLGIYIGENGCFDSMFDFNYSNFDIGDNEEWFIRKDWTA KQFRDLLFTSQVEVNKVGWVGTFLENHDQPRSLDKLIVNKADHGYYSATMIASMYFFLRG VPYIYQGEELGMANCVRRSIDDFDDISSHGQYQRMLEEGFSESEALALVNKRSRDNSRTP MPFDDSQYAGFSDVEPWLALNESYPVINVKAQFNDPDSVYSFYKKMIYLRQQSVYSNTLT FGSFIPDECENDNLIVYQRSYQGQLIVNVCNFSNQEQPFICAKELILNNYKEYDGKVLKP YQTIMYLEENK >gi|223714133|gb|ACDT01000082.1| GENE 13 19309 - 21171 1679 620 aa, chain 
+ ## HITS:1 COG:BH2014 KEGG:ns NR:ns ## COG: BH2014 COG4932 # Protein_GI_number: 15614577 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Bacillus halodurans # 264 585 1343 1651 1816 73 26.0 1e-12 MKRKLNQYIRKSLLVAVTFILVFCNAVTYVNGQQYQISAGEKIKYPSWFDSSGTWSTRMY TVDGKIAYCLEASKHSPGNGTVGEGSYVDNENLQKVLYYGYGGPADTLSTDGVTSVEMAY LLTHIAASYYYAGDLHGVDLEKLHGWGWAEWIEAIPNRPAIPKGEIGFSKNSLKMYQKDG RQRSEEITVIGDADATLTMVLDKDMELHNLTTNQTVTGTVQLSSGQKFYLSSPLTKRDNY NWSSGSLMGTTNSYYRPLVLTIDDETQTIGTMVLGNGETKPTSLQVEYLPVGSLKLIKTN EDKQMLDGAVFNLKGKNNDYNQDHVVKNGVLQIDNLLVGDYILTEIKTPADHDSLNKEYT VTIKSNEVTTKTIVNELIPVGKLIINKKLEGHLDDLSGIEFKITAKEDIKDIITGTVLYR AGMPIGQNQGIYITDQNGQIIIDKLPMGSYLISEVKTLPGYVFDDTVYEVAFNKEDRVTK EYQYTLNVENKLTQTTISKIDALTKQLLPGATLELYLKDGESLSLIEQWVSTDDAHLIKG LLVDCEYVLKEVKAPAGYTLAPELTFKVENSESIKEIVFENQHIPVVPTGDISNYSGYAG IALISFIFFLIFTKQKRRIL >gi|223714133|gb|ACDT01000082.1| GENE 14 21228 - 22112 715 294 aa, chain - ## HITS:1 COG:no KEGG:Ccur_08300 NR:ns ## KEGG: Ccur_08300 # Name: not_defined # Def: glycosyl hydrolase family 25 # Organism: C.curtum # Pathway: not_defined # 101 287 186 369 592 116 35.0 1e-24 MKKKTKNKIFSQKNIIFLFIAGTLLFSCSLYYKQYLPKYLTHSIVILLIMIFCLLPFLFK RFKKQHNLIKIIVSALSFSLIVTGILVFNHNDRLAKDGQYIGIDISKWNNNVNLQLARQE IDFVIIRCGYTSLTDGTKTKVDPLFEQNIKQCQNLNIPYGVYYYSLATEAAQAKKEAEYV NHLLDGRVPELGVFIDLEDETFQGALTNDQLTIVATTFLDNIQNQNKKGIYANHHWWTTK LTDKKLDSYIKWKARYNDTPVLEEEYHILQYSETGQIRGINGNVDLNMTINKYW >gi|223714133|gb|ACDT01000082.1| GENE 15 22168 - 22638 482 156 aa, chain - ## HITS:1 COG:mlr0821 KEGG:ns NR:ns ## COG: mlr0821 COG4420 # Protein_GI_number: 13470973 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mesorhizobium loti # 43 146 23 128 161 122 55.0 3e-28 MKNLTHTKIINEIIENVRDDMQDEEIIHLVLEKTVALPDGKDETLGQKIADKIAHLGGSW TFIIIFIILLISWMIYNILAKDSLDPYPFILLNLILSCIAAIQAPIIMMSQNRQEQKDRQ RNENDYKVNLKTEIIIQDLHNKLDQILEELEKQNHQ >gi|223714133|gb|ACDT01000082.1| GENE 16 22966 - 26184 3993 1072 aa, chain 
+ ## HITS:1 COG:AF1274 KEGG:ns NR:ns ## COG: AF1274 COG0458 # Protein_GI_number: 11498873 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Archaeoglobus fulgidus # 1 1068 1 1073 1076 1114 53.0 0 MPKREDINTVLVIGSGPIIIGQACEFDYSGTQACKALRGLGYKIVLVNSNPATIMTDPET ADVTYIEPLNVERLTQIIEKERPDALLPNLGGQSALNLCSELNEAGVLEKYNVQVLGVQV DAIERGEDRLEFKKEMDKLGIEMARSEVSTTVEEALAIAERLGYPVVVRPAYTMGGAGGG LVYNVEELKTVVSRGLQASLVGQCLIEESILGWEELEVEVVRDANNNMITVCFIENIDPL GVHTGDSFCSAPMLTISQEVQDRLKDQAFRIVESIGVIGGTNVQFAHDPETDRIVVIEIN PRTSRSSALASKATGFPIALVSAMLAAGLTMEDIPCGKYGTLDKYVPDGDYVVIKFARWA FEKFKGAEDKLGTQMKAVGEVMSIGKTYKEAFQKAIRSLEIGRYGLGYAKNFNKLSKEEL LHMLTVPTSERQFIMYEALRKGATVEELFELTQIKHYFIEQMKELVEEEENIKSYEGKEL PAEVLKQAKQDGFSDKYLSQILNVEENVIREERLANNINQAWEPVHVSGTKDSAYYYSTY NAEDNSDIHHDKKKIMILGGGPNRIGQGIEFDYCCVHAALALKKLGFETIIVNCNPETVS TDYDTSDKLYFEPLTLEDVLSIHEKEKPVGVIAQFGGQTPLNLAAQLKKNGVNILGTTPE TIDMAEDRDLFNAMMEKLEIPMPESGMAVNVEEALEIAKRIGYPLMVRPSYVLGGRGMEV VYDDESLEQYMKAAVGVTPDRPILIDRFLNNAMECEADAISDGETVFVPAVMEHIELAGI HSGDSACILPSKHIPIRHLETIKEYTKKIAKEMNVRGLMNMQYAIADDKVYVLEANPRAS RTVPLVSKVCNINMVKIATDIVTRELTGRPSPVPTLTEKKIPHIGVKQAVFPFNMFPEVD PVLGPEMRSTGEVLGIASSYGAALYKAEEGAKTILPTEGKVLISVSDLDKPEVVELAQGY YDAGFTIVATGNTYNLIKESGIPVEKIKKIHEGRPNISDALTNGELAMIINTPHGKQSAH DDSYIRKAAIKMRIPYMTNIAAAKASLEGIMEMKIHGSHEVKSLQEYHLAIK >gi|223714133|gb|ACDT01000082.1| GENE 17 26543 - 27613 1411 356 aa, chain + ## HITS:1 COG:BS_pyrAA KEGG:ns NR:ns ## COG: BS_pyrAA COG0505 # Protein_GI_number: 16078615 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Bacillus subtilis # 1 354 1 351 364 322 47.0 6e-88 MKAYLILEDGSVFEGTSIGAVKEVVSEFVFNTSMTGYLEVLTDPSYAGQSVVMTYPLIGN YGVCLEDQESSKPHVEGFIVNELARLGSNFRKNIDLEEYLIQHDIPGIQGIDTRNLTKII RNIGCMNGMITTTKYENLDEVMKRIHAFKVTGVVEQTTCKESYTIGPGDKRVALLDYGAK 
ANIAKSLVKRGCQVTVYPASTPASVILKEKYDGIMLSNGPGDPEENVQIIEEIKLLASSQ TPIFAICLGHQLMALAHGFKTEKMKYGHHGANHPVKDLETGRVYISTQNHNFVIKEDSID SAVAKPWFINVNDKTIEGVTYLKENIKTVQFHPEACAGPLDTDALFEEFIKMMEGK >gi|223714133|gb|ACDT01000082.1| GENE 18 27615 - 30812 3763 1065 aa, chain + ## HITS:1 COG:BS_pyrAB KEGG:ns NR:ns ## COG: BS_pyrAB COG0458 # Protein_GI_number: 16078616 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus subtilis # 1 1049 1 1049 1071 1195 55.0 0 MPKDSRIKKVLVIGSGPIIIGQAAEFDYAGTQACRALKEEGIEVVLINSNPATIMTDKDI ADKVYIEPLTVPVVKSLIIKEQPDSILPTLGGQNALNIAMALADEGFLEEHNVQTIGTST NTIKLAEDRLEFKTLMENINEPCAASLVVNHVDDALEFAKKIGYPVVVRPAYTLGGTGGG IAYDETELHDICSNGLRLSRVSQCLIERCIAGWKEIEYEVMRDGAGNCITVCNMENLDPV GVHTGDSIVVAPSQTLSDKEYQMLRTSALNIITALDIQGGCNVQYALNPDSFEYCVIEVN PRVSRSSALASKATGYPIAKLAAKIALGYTLDEIINAVTGKTYASYEPTLDYVVCKIPKW PFDKFVDASRELGTQMKATGEVMSICNNFEGALMKALRSLEQNLYYLHLDELDGKDVEFL QKKLKDVDDQRIFAVAEALRQGISEKEIHDITKIDLWFIDKIKHLVEVEKELQNKELSGE LLKKAKRLEFPNQVIAKLTGKTSDEVKNMCQENKINAVYKVVDTCAAEFDATTPYYYSVY GGVNEVNPADDKKKIMVLGSGPIRIGQGIEFDYCSVHCVWALKEAGYDTIIVNNNPETVS TDFDIADRLYFEPLTPEDVENIVNIEHPDGAIVQFGGQTAIKLTKALMEMGVKILGTSAD DVDAAEDRERFDEILEKCEIDRPRGSTVFTIEEALEVANRLKYPVLVRPSYVLGGAGMEI CLNDGDVKKFMKIINRQYQEHPILIDKYLAGKEVEVDAICDENGILIPGIMEHVERAGVH SGDSISVYPTQKIKQHIKDTIVEYTGRLAKALNVIGLINIQFIVYEDQVYVIEVNPRSSR TIPYISKVTDIPVVDVATHVIMGKTIKEQGYEYGLAPEKETIAIKMPVFSFEKIKGAEIS LGPEMKSTGEVLGISTDYDEAIYKAFIGSGINLPKKKNIICTIKDSAKDEFLPIAKQYYM FGYDLYATEGTYNFLKENDVPVSLVNRISAKSNTLFDLMLTDTVDLIIDIPTRNDNLKDG FLIRRFAVEAGIPIFTSLDTAQALITSLEKRHKGDTSLVDITKLK >gi|223714133|gb|ACDT01000082.1| GENE 19 31067 - 31276 92 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756069|ref|ZP_02428196.1| ## NR: gi|167756069|ref|ZP_02428196.1| hypothetical protein CLORAM_01589 [Clostridium ramosum DSM 1402] # 1 52 1 52 121 93 98.0 4e-18 MHLNSLEQLTIKKVIDDVGISKASLHRFFSKGGYESFKDLITILDEEVKQKKCLILIINS IVPIWLGVL 
>gi|223714133|gb|ACDT01000082.1| GENE 20 31222 - 31422 184 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756069|ref|ZP_02428196.1| ## NR: gi|167756069|ref|ZP_02428196.1| hypothetical protein CLORAM_01589 [Clostridium ramosum DSM 1402] # 1 42 53 94 121 73 88.0 3e-12 MFNIDYQQYRSNMAWSALETEFEDHQIRILIKTIKKAKRVVFYGNSCELSCFKRLSFYLF NHGIDI >gi|223714133|gb|ACDT01000082.1| GENE 21 31530 - 31799 352 89 aa, chain + ## HITS:1 COG:no KEGG:DSY1231 NR:ns ## KEGG: DSY1231 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 4 89 3 88 99 116 67.0 2e-25 MNMKRYEFDAVIKKVPNIDGAYVDFPYDVKKEFNKGRVKVIATFDGIEYLGSLVKMKTPN HIIGLRKDIRKIINKQPGDIVHVTIQERE >gi|223714133|gb|ACDT01000082.1| GENE 22 31827 - 33260 1516 477 aa, chain + ## HITS:1 COG:aq_035_2 KEGG:ns NR:ns ## COG: aq_035_2 COG2199 # Protein_GI_number: 15605636 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 288 477 51 245 251 116 39.0 1e-25 MLKVESKYTKKYLFAILFFIIMSIICVTIFLTNINSGINEKVQEILLNEVISHRKSLEAT ISIQYQELEGISEYITSLNTRDMSQLINFTKAFVSKNGFDRIVVAFEDGIGYLSDGSSYD ISTRGYYQDTMNGGARNSEPVDSVIDGERRMALSVPIYSQHGQIIGMVMGSYDVRAISEI EFSNLYNNKGYTYLIDYSGNFLINDQKLHNYQGNFYDYCQESNILSNQQLQQIKTDISNY VSGCVLTKGGKETRYLVYVPLESSEWVICYSVLVSDAFQDYQFIMKYETYLLGAVAMVLT VTVLYIFIVNKKERRVLLKRADYDSLTGLYNRRKLTEEIKRYLKAQSEFGLAILDIDDFK KINDTYGHPIGDIILKEIGQILRDNLGEYLVGRLGGDEFVILIENVQIINIVIQKLELTC NQIAKIKIEEHPDIVVTCSCGLAVAPKDGETVAELYKAADSALYIAKSLGKNKLAKF >gi|223714133|gb|ACDT01000082.1| GENE 23 33271 - 33735 420 154 aa, chain + ## HITS:1 COG:CAC0547 KEGG:ns NR:ns ## COG: CAC0547 COG0346 # Protein_GI_number: 15893837 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Clostridium acetobutylicum # 1 153 1 155 159 118 41.0 5e-27 MSYTGFVISVKDINASKKFYEELFDLEVYQDYGINITFDCGLSLQQDFDWLINIEKEKII EKANDHEICFEEENFDAFLKRLSVYPQIEYLDDVIEHSWGQRVVRFYDLDKHIIEVGESM KMVINRFIDRGMTTEQISKKMDVSVADLKKLLND 
>gi|223714133|gb|ACDT01000082.1| GENE 24 33772 - 34254 648 160 aa, chain - ## HITS:1 COG:CAC2413 KEGG:ns NR:ns ## COG: CAC2413 COG4708 # Protein_GI_number: 15895679 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 157 1 163 166 92 39.0 2e-19 MNKGISVKLVVINAMIAGVYAVLTLAISPIAYSEIQFRLSEIIVFLAFYNRRYIPGLTIG CIIANLFSPMGMLDIVFGTLSTIIVCIAMYLIKNRYLAAGAGAIITGMIIGAELWYAFNI PYVINAIYVAVGELIVLIIGALVFKSLEKNDRLMNLLDLK >gi|223714133|gb|ACDT01000082.1| GENE 25 34500 - 35330 962 276 aa, chain + ## HITS:1 COG:lin2953_1 KEGG:ns NR:ns ## COG: lin2953_1 COG2207 # Protein_GI_number: 16802012 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 43 274 26 264 343 77 24.0 2e-14 MSHRIVVRKHEKIDYQRSTFFTLTHTHYDKDYYMENHWHNSIEITYVVKGLKVQQMENKE VIAPAGTLLLVNSGVIHDVDVKKGLEGIVLLIDRNYIDHVCPQCIERGFSLEKEPLAKKK IVDYLFKLVEAYENNNKVKANIIVLEIISVLAEQLMLEGHYIKEKHDDESYELVISITEY IDYHYAQKISLDDLARMTCYNKTYLSNIFKKKTGITIFEYLRNVRMQHCLYELKHSDDTI VSIALNNGFANIQIFNRVFKEVYQMTPKQYREKNKK >gi|223714133|gb|ACDT01000082.1| GENE 26 35413 - 36996 1747 527 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0189 NR:ns ## KEGG: Lebu_0189 # Name: not_defined # Def: phospholipase/carboxylesterase # Organism: L.buccalis # Pathway: not_defined # 47 316 172 401 424 95 33.0 4e-18 MKKIKKFLSSLIAIMIVATMVTSNLSALEVNSSFVQNTNTSNTYGYVNYETDQFETYSKV SGSAGNYTSIASKVYKPLSGDTLIVVFHGNGEGGVDGVSNNYSQIAANRLAVTYTTDELQ SAFKGAYVLAFQAPDYWYNDYTQQAKTIIDQAIAEFGIKEVFVSGLSAGGLMTQRMLAKY GDFFAGALMSCAAIAKNDQYVEGLGGNYTDSTEYLDAGDAYTSGDTFKKPVDFNQYLANY DEWLENIAVSNVPIFMVHCYYDTTIYYGWTQYAYNMIKSYRDDRGLDGDIYCGLFEDVSY PDETYSSAHWSWIKMLNGDVYASTNASLDTITWFKSLSTSTNDYQLKTVTNPVAGEAAGD NIYAYNLIATVTNSGEKITALEIDMNGKKVDASKLTTEMFKITGYNSDASGLVKGDVQSY GIFGSEDEPVDIEVARVSVNEKGNIILDLATQNGVLNYTSLARNLATKIRYKLASVALPM IVEDNSTTNRDDKQTAVKTGDDLNVLGAGSLMIILLMMVAVSKRKFD >gi|223714133|gb|ACDT01000082.1| GENE 27 37032 - 37802 713 256 aa, chain - ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## 
COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 254 1 255 256 109 29.0 5e-24 MKPMIFFDVDGTLMDNHDYQVTPSTKKALQKLQENGYKIGIATGRAVNSLKRTGVVDIAN WDGFVCNNGQTVLNKNFQIVEELVISPEIVHRCIKIAKDNNIPLALKMEHRIITQEPDEN VYTARKFFNSIIPPVGIYQNQKVEAMIAYGPKGYDYAPFKQLEGLHILPGVSTYCDITHA DATKATGIQVILKQYNLDKYICFGDSLNDVEMFNHAAISICMGQGDAYLKNIATFVTDSI DDDGIYNACINLGLFK >gi|223714133|gb|ACDT01000082.1| GENE 28 37893 - 38567 705 224 aa, chain - ## HITS:1 COG:CC1115 KEGG:ns NR:ns ## COG: CC1115 COG2013 # Protein_GI_number: 16125367 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Caulobacter vibrioides # 1 217 23 254 280 142 37.0 5e-34 MKYEIKGDNLPVVICHLEKGESMITDSGAMSWMDPVMEMETTSGGFGKAFGRMFSGETMF QNRYHAKEDGMIAFASSFPGSIEAVEINPNHEIIVQKSSFLAGTSGIDMSVFFNKKMATG IFGGEGFIMNKISGTGTVFLEIDGSAISYDLLPGQQMVIDTGYLAMMDATCKMEIQSVKG VKNKLLGGEGFFNTVVTGPGKVVLQSMPISAVAAGLIPFLPSGK >gi|223714133|gb|ACDT01000082.1| GENE 29 38569 - 39609 1100 346 aa, chain - ## HITS:1 COG:BS_ykgB KEGG:ns NR:ns ## COG: BS_ykgB COG2706 # Protein_GI_number: 16078366 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Bacillus subtilis # 5 340 7 343 349 187 33.0 2e-47 MKTFYVSSYGKSDNKGIYIIDLNEENQKLSLVQHIVTHDYPSYMITKNNILYVAYKNASR LNNGGGIGSFSIHKEELIPNNNYNSNGRSYTHLCVSDNSRYLFAANYHVGSTASYLLENN FIKEKICAIHHTGLGPDLLKRQTGPHCHYVGITPDKEFVYAVDLGADKVIMYTYQDGKLE EDVEHTLNVVPGSGPRQMIFSKDGRFAYLVNEISNNLMVYKYNDKYLNLIQVIHTTPRHF HGFSSASAIRLSATGNHLFISNRGHDSIVLYRVNQETGKVSLLYMVHTGKNPRDFNIIDD KYLIVGAQDDDELELLTFDEKSEQLVRTASTLAIPAPVCIAINKEK >gi|223714133|gb|ACDT01000082.1| GENE 30 39942 - 40403 385 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756079|ref|ZP_02428206.1| ## NR: gi|167756079|ref|ZP_02428206.1| hypothetical protein CLORAM_01599 [Clostridium ramosum DSM 1402] # 1 153 1 153 153 273 99.0 3e-72 
MICQMANVSRKTFYALYHDKYEVIERIVVDNVISDLKGMLELLGKVELNQPIILEKMYQH FYDEKEFYVKAINIEGNNSLEHCLLYCFEEINLEILKDIEISLVEKEYAAYFFAASQVML IKKWLINDLELTPREMVLSSHKWAVKAMVNNYF >gi|223714133|gb|ACDT01000082.1| GENE 31 40813 - 41304 572 163 aa, chain + ## HITS:1 COG:BH3351 KEGG:ns NR:ns ## COG: BH3351 COG1522 # Protein_GI_number: 15615913 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 6 163 8 164 164 127 40.0 7e-30 MDENLLLTLLHNNARMDVTDLAMALNETEDNVVDTISQLEKDKVICGYHTIINWDRTNVD NVMALIQINATPEREYGYDRIAKKIYAFPEVDTMYLLSGSFEFVAIVKGHTMQEAARFVA SKLAPLEGVTGTATMFVLKQYKNNGLSFDDDDKEAAERLLVTP >gi|223714133|gb|ACDT01000082.1| GENE 32 41304 - 42473 1246 389 aa, chain + ## HITS:1 COG:BH3350 KEGG:ns NR:ns ## COG: BH3350 COG0436 # Protein_GI_number: 15615912 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus halodurans # 7 389 11 393 393 422 51.0 1e-118 MDFNKVLNDTVREIPPSGIRKFFDLANQMEGVISLGVGEPDFDTPWKIREAAIYSIEQGK TFYTANQGLVELRKEICRYQKRRFGLDYCYDKECIVTVGGSEAIDLAFRAIINPGDEVIL LQPSYVAYTPGVALAGGKVVNIELKEDNEFKLTPELLEAAITPKTKAILLNYPSNPTGGF MTREDYEKIVSIIKKHEIIVITDEIYAELSYEQKFCSIAAFDEIKDQVILVSGYSKAYAM TGWRLGYVLANEVLTKAMNKIHQYVIMSAPTGAQYGAIEAMRHCDNEIEEMRKAYMLRRN YIVKAFNDLGLHTFTPQGAFYVFPCIKSTGMTSDQFCEELLKDQLVACVPGTAFGEAGEG FIRVSYAYSIEQIKEATSRIKKFLDKLKK >gi|223714133|gb|ACDT01000082.1| GENE 33 42544 - 43464 733 306 aa, chain + ## HITS:1 COG:SP0676 KEGG:ns NR:ns ## COG: SP0676 COG0583 # Protein_GI_number: 15900577 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 1 251 21 267 322 102 28.0 7e-22 MKNEQLRNFIKVVDCGSINKAAEQLYVSQPSLSRSIHSLEEEMGKELVIRSNRGVTLTPT GRLLYYYGRSILEQFQVLEKLKNLSEESIYSKLAVSVDSIFLRDDLILQFYNHIRSADTE IKMIETTAEEVLNNVSDMTSEIGIVILNNYQLAIFKKMTELKDVEMIVLGTGPLYVHINE QNPLAKGEIIDAKELVSSTYIHLPHDFFSNLNLSLTIDGSIQISSFHKTITMSNYHAIIN MLNHTDAFMLGHKWQIEELKHSRIKSMQFQNCNINKSFIIIKRKREILSDAAKIFLEIIN DNYADM 
>gi|223714133|gb|ACDT01000082.1| GENE 34 43605 - 44117 791 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734068|ref|ZP_04564549.1| ## NR: gi|237734068|ref|ZP_04564549.1| predicted protein [Mollicutes bacterium D7] # 1 170 1 170 170 228 100.0 1e-58 MKKSTLLTMASVGAVALTSAMTFAAWDNLSDTTTSNEVTFKQINVEKAADIVLTEPVASD LTSSYVPSSSGEVTFNITGIDAADLSGKELKLEPVVKANDTAISTGYTLEIYDGTEESAE KVAGNVDKSLTGTNTYKVVVTATDDKAGTDALANKKVTVELKATLQKTIV >gi|223714133|gb|ACDT01000082.1| GENE 35 44171 - 44740 463 189 aa, chain + ## HITS:1 COG:no KEGG:TSIB_1498 NR:ns ## KEGG: TSIB_1498 # Name: not_defined # Def: peptidase S26B, signal peptidase (EC:3.4.21.89) # Organism: T.sibiricus # Pathway: Protein export [PATH:tsi03060] # 5 167 1 160 162 70 27.0 2e-11 MKEKVRKFSKLAYWLILGMILMYIVGMQFYAQEITDFIGYRFYAVMTDSMEPRIPTYSMV LTKTIDENESIAPGEIITFKVQRGEQEIVLTHHFNKTQEENGQLYYRTNAEGRDELDLYY TKRSDIIGKYVWHIPLLGKFIMFIQSKFSWILYCEFLLIWLINKTIKAHWEEKGELDDEV KIPRKLDNA >gi|223714133|gb|ACDT01000082.1| GENE 36 44851 - 47169 2564 772 aa, chain + ## HITS:1 COG:SP1623 KEGG:ns NR:ns ## COG: SP1623 COG0474 # Protein_GI_number: 15901459 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 11 755 7 764 778 590 43.0 1e-168 MEKELPMYYKVGLNDSEVQERIAKGQVNGVSKPVSKSYGQIIKDNVCTLFNLLNFLIFVA LVFVQAWSNLVFMAIIVTNSVIGIWQEVKAKKLVDQLSILTKPTIMVVRNSRHVIVDINE VVLDDVLVLESGNQVCNDATVIHGMIEANESLLTGESDPIHKNIGGHLLSGSSVVSGKCY ARVKNVGDSNYATTIVKEAQKVKEKNSELMDSMNKVTKVTTFMIIPLGIILFVEAMYLRQ DTLFNAVVSSAAGLLGMLPKGLVLLTSVSLANGVTRLARRKVLIQDLYSLETLSHVDVLC LDKTGTITDGKMKVEKTYTLETTQNFKLDEVMGSYMRASDDNNATFQAMNSYFGENNLYE YVKKIPFSSLRKWSAIEFKGIGTIVVGAAEKIISGELPEDIHELMLQGMRAIAIGYTEKT VDDKEELPRLQPLMAIILSDTIRNNTKETLEYFHQEGIDAKIISGDNVNTVMAIAKKAGV LNYERCIDMSTINDDEIQEVVRNYTIFGRVTPSQKKMIVEALKNDGHHVAMTGDGVNDLL ALKEADCSIAIADGSDASKQISQVVLLNSDFTCLPDVLLEGRKVVNNVTRVAGVFFIKTI YTILLSLFCVISNTAFPFIPLQITLIDLLIEAMPSFMTIFEADTRKITGRFLPKVFSKAA GNALSIVILFIAIMIFGPMWKINDLELVTLMYLVLGTISMAAVIRSCYPFTMLRIIICTM 
MAGGFYGAVLLFSGLLHLAPITLNLVFIGLILSIFGLFIERIIHFVIKKRLA >gi|223714133|gb|ACDT01000082.1| GENE 37 47385 - 48017 552 210 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756086|ref|ZP_02428213.1| ## NR: gi|167756086|ref|ZP_02428213.1| hypothetical protein CLORAM_01606 [Clostridium ramosum DSM 1402] # 1 210 1 210 210 354 100.0 2e-96 MKKQSYQKVIDKDIIEVKQYLLDISEGYWMQDIHDLINISMDVKIIRKKLMRRKDLELAV FSKIKKLIDQAQGLNEMENHLIMMNLLLDKHYSPMLTYKYKLLNYIIENGGFSIETYCLL RHLIKFTNNNLNDFIMALATRLNFSNERYHYLASHILLLEKQYKKVYNHLEYITIDERLG RYLPALYNFSPRLYNKYARMMYIPLNLAIM >gi|223714133|gb|ACDT01000082.1| GENE 38 48031 - 49017 987 328 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1720 NR:ns ## KEGG: EUBREC_1720 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 4 328 25 346 349 100 26.0 5e-20 MGEKLYYVFKIIKEAREPISAKDILKRLENYEIFLDIKTVYSLIKKLNDFYYCLSNKQLI KTIRRRGYIIDEDFFEDGQLQLLIDSVLFNPNLDKKSANDLVNKLALISSVIQMERLNTE HQNDNELTYDLLLNLTTVIKAINNHKNIAFKYISYDIKDNALQEVYHTNGNLNPETYVIS PYKLILRNSNYYLIGYFDKRKDSLSVYRIDRMRIVRNHSSSYEDIQDRFDMEKEFENNVN MYVSNERIDLKIAFESSVLREVVNQFGQDINVNKCFDGRIEAFIKDVALSDGLIGWIMML QDKVEVVFPLSLKEIVKTRIRAMLRIYE >gi|223714133|gb|ACDT01000082.1| GENE 39 49149 - 49991 1172 280 aa, chain + ## HITS:1 COG:VCA0070 KEGG:ns NR:ns ## COG: VCA0070 COG0226 # Protein_GI_number: 15600841 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Vibrio cholerae # 38 275 52 296 299 165 40.0 1e-40 MKKVFGVLLTLVLCLGLTGCGGSGDDSSASGNDVSGNVSLNGSTSMEKFVNALKEAITEE YPNLTLEPQFTGSGAGIEAVTNGTADIGDSSRALKDEEKAKGIEENIVAIDGIAVVTHKD NKVSDLTTDDLKKIYTGEITNWKDLGGDDQAIVVVGREAGSGTRGAFEEILGIEDACKYA QELNETGAVMAKVSETAGAIGYVSLDVVDDTVKSLKLDGVKASEKTIKDGSYTLQRPFVM ATKGKISEQSKEVQAVFDFINSEAGQKVIEQVGLIIPDKK >gi|223714133|gb|ACDT01000082.1| GENE 40 50001 - 50933 849 310 aa, chain + ## HITS:1 COG:VCA0071 KEGG:ns NR:ns ## COG: VCA0071 COG0573 # Protein_GI_number: 15600842 # Func_class: P Inorganic ion transport and metabolism # 
Function: ABC-type phosphate transport system, permease component # Organism: Vibrio cholerae # 25 302 56 321 327 190 40.0 3e-48 MEKDMLSIISNRKTKSLVEKTAAIIFRGCAVMSIIAVVSITGYMIISGTPAIFKVGILDI LFGTVWAPTAADPQFGILLIILTSIIGVFLAICIGVPIGLFTAINLAELANPKVRKIVKS AIELLAGIPSVVYGLLGILIICPFMYRLELLIFAGSKTHQFTGGANLISAILVLAIMILP TLINITETALKAVPDDYRKSSLALGASQMQTIFKVVLPSAKSGIVSAIVLGVGRAIGEAM AILLVAGNSVNLPLPFNSVRFLTTGIVSEMGYASGTHRQVLFTIGLVLFVFIIIINLILT FIIKRGDQHE >gi|223714133|gb|ACDT01000082.1| GENE 41 50926 - 51786 911 286 aa, chain + ## HITS:1 COG:VCA0072 KEGG:ns NR:ns ## COG: VCA0072 COG0581 # Protein_GI_number: 15600843 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Vibrio cholerae # 13 281 16 285 289 207 47.0 2e-53 MNSIFERRHRSSDKILNFLIQLSSALSVLILVVCIGYILYRGLPYFNFSYLINTTSILKG TVGILPNIINTLYIIVITLLIACPIGIGGAIYLNEYAKNKKFVNIISFTTEVLAGIPSII YGLFGMLFFGNLLGFKFSILTGSFTLAIMILPIITRNTQVALEGVPKSYREAALGIGATK WYMIRTVLLPSAMPGILTGVILGMGRIVAESAALLFTAGSVSVLPKNIFTHLSSSGATLT IQLYLEMAKANYESAFVIALVLIVIVLGLNMLAKLITNKFDVNRVD >gi|223714133|gb|ACDT01000082.1| GENE 42 51788 - 52543 210 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 212 1 207 223 85 33 7e-16 MENKIEAKNLDLYYGEKHALKNVSLDIKTNKITAFIGPSGCGKSTFLKTLNRMNDYVKDI KITGSVVLDGEDIYDSRVDTTVLRKKVGMVFQQPNPFPMSIYDNIAYGPRIHGIKNKKQL DKIVKDSLEAAALYEEVKDRLHSSALGLSGGQQQRLCIARALAVEPDVILLDEPTSALDP ISTLKIEELLMELKEKYTIAIVTHNMQQASRIADYTAFFLVGEMVEYGKTKDVFAMPKDK RTEDYITGRFG >gi|223714133|gb|ACDT01000082.1| GENE 43 52553 - 53212 762 219 aa, chain + ## HITS:1 COG:CAC1709 KEGG:ns NR:ns ## COG: CAC1709 COG0704 # Protein_GI_number: 15894986 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Clostridium acetobutylicum # 1 211 1 212 219 129 39.0 5e-30 MLRTNFENDLNKLHVDLDKMCHLVILAIENCIVAFKSGDRELCRDILAGDKVINDMERTI 
EARCLSLILKQQPVASDLRNVSTALKVVTDLERIGDQGADIAEILLDTDVTCPYKMVEHI PNMAHLAKNMVKQSIEAFHQHDLKKASEVKKLEDDMDGLFEEVKVELIQIVNESKETIDL AINFLMIAKYFERIGDHAVNICEWVEFNQTGTVDNYRLI >gi|223714133|gb|ACDT01000082.1| GENE 44 53225 - 53908 967 227 aa, chain + ## HITS:1 COG:CAC1700 KEGG:ns NR:ns ## COG: CAC1700 COG0745 # Protein_GI_number: 15894977 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 224 3 232 232 183 45.0 2e-46 MGERIFVVEDDENIREIINLALVSNGYEVVQFDNAIDALAEIDKKAPSLAIFDLMLPKMS GIDAIKEIRETDSELPILILSAKDREIDKVNGLDSGSDDYMTKPFGILELQARVRSLLRR HVTSDIIRTKHLTVDKQTRLVKLDDQKLELTNKEYQLLVYLMDNKHRVVEREELLNEIWG YDFIGESRALDVHIRALRSKLNDDGHKYIKTIRSVGYRFYEEGDGSE >gi|223714133|gb|ACDT01000082.1| GENE 45 53895 - 55592 1870 565 aa, chain + ## HITS:1 COG:lin2643 KEGG:ns NR:ns ## COG: lin2643 COG0642 # Protein_GI_number: 16801705 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 6 559 11 588 591 217 29.0 6e-56 MVVNKTIFRYFLTVLLVTLVLSSSVSMVILSSQMLENTKHDMLYAVKLVDYQLDESHDLK AQVDALNPLAYNDQTRLTVIDTNGEVLADSGSEEIDENHKGREEVKQALSEGVGYATRYS STVKRNMLYVAVFNKGYIVRLALPYNGIFDNLPTLVRPLGVGAIMSLVIALFLSKRFANT LTAPIQDITTQVTKMKDYRELEFDSYKYDEFNIIASKLEEQAKTIHDTMKKLKSEQIKIN GILDQMKEGFVLLDSDLTVLMVNRKAQKLYGHTIKLNCSIKDFIFDFKIINALDHLSDEQ QVVEVEKEKEFYNCYVAKVDYGVTLLFVNITEQHNAMKMRQEFFSNVSHELKTPMTSIRG YSELLETGVINDKDASKKALDKIHDEVNNMSTLINDILMISRLENKDVDVIKHPVHLTPL VDEIIDTMQVEIDKKHLQVDKELEDITYTSNHQHMHQLLSNLITNAIKYNVDGGKIIIKS YQFGRNIIIEVSDTGRGISKIDQGRVFERFFRCDQGRDKETGGTGLGLAIVKHIVQYYQG NITLTSKLHEGTTFKVTLPMEEEVI >gi|223714133|gb|ACDT01000082.1| GENE 46 55680 - 56399 629 239 aa, chain + ## HITS:1 COG:no KEGG:Mevan_0168 NR:ns ## KEGG: Mevan_0168 # Name: not_defined # Def: hypothetical protein # Organism: M.vannielii # Pathway: not_defined # 1 233 1 239 244 162 41.0 7e-39 
MKTDTETLRKRGFLTNTEAEPYFLYSKEELLELLKDKTAVNRTAALFILRSFVDINELDE ILLRMLVKEKALYTKLEICDILTTGNELTIKRMIPYMGTIRGNQHRTIPEKVSKKKSYPL PRDIIARTMAKMNPDYFSTILEIINYPEDKVVAEAIDAIGWMVFYHQELATAKNYQTVIQ SFERYHDNELMKWKLIICLSAFNQSEAFLKQLDFQNPVVQAEIERSLSLINKRKSKVVY >gi|223714133|gb|ACDT01000082.1| GENE 47 56452 - 57348 878 298 aa, chain + ## HITS:1 COG:CAC0948 KEGG:ns NR:ns ## COG: CAC0948 COG1307 # Protein_GI_number: 15894235 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 295 6 281 281 109 30.0 7e-24 MYKIITDGSCDLSTERAQELDVSVVPFYVSTDGINHQKEIVELGIRDFYQFMIDNPKVFP KTSMPSIQDYLDVFGPLAKDGIDIICICITTKFSGSYNSAMNAKNMLTDEFPDVKIEVID ATVNTVLQGMLVEEAVAYKQTGATFEETVAYINKIKITGRIFFTIGGMDYLVHGGRVGKL SGIAAGALGIKPLILLKEGEIFNNGLTRGRKKSKQKIVEQIINYFKEDNLELSKYRFCIG FGYDYDEAVEFKTTFLAALKGFDPSFSAEVPIRQIGATIGVHTGPHPLGVGLLEKLQK >gi|223714133|gb|ACDT01000082.1| GENE 48 57420 - 59507 1427 695 aa, chain - ## HITS:1 COG:BH1154_2 KEGG:ns NR:ns ## COG: BH1154_2 COG0642 # Protein_GI_number: 15613717 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 452 691 31 268 274 223 48.0 1e-57 MNRKKIIFSILTIVIILLSSVGIYSCYEVVETSNGKSINFSSNQILYDIFDNIYALSFQL DNVSKKEGIPKYLTANEEASDEFYRNNILYCREMMSEKKNIKYYAEGNNNSLGNTNDDIK NIRNNKELKEKYQWYLMIHFDENGQFSYDTLGCLVAGNQNYELVWNNYKQTYFQYLQEYD EDSTLHNPVNTTIYFAVPHKIVSNSIDTIAWYSEKNNRTINVENIVPFAAIAVAAVGLFI LLAPYETIKEINIFKYPAKIKLEPLVIGITLAFAGIIVVIYGLILDTINGYYITKLSGIG FGESSELILGAMNVLSWAAFLFLCMFSVFYLKSSFKEGIINSIKKNTVCIWLLNVLRKII YKASRFDLNDDTNKTILKIVLANCLIISIICCFFTFGIFFALVYSAILFIILRKRFDELK HDYEILLDAVKRLSNGDFDVEINQDIGMFNSLGTEFSNIKDGFEKAVSEEVKSQKMKTEL ISNVSHDLKTPLTSIITYIDLLKDEQLDYEKRKEYLDTLDRNSLRLKNLIDDLFEVSKVN SGDVKLNLVDVDIIALIQQAKFELIDKFNEKSLIFKTAFPNEKIILSLDSLKTYRIFENL LMNIGKYALENTRVYIDIDNSDDEVTITFKNISADEIKVSEDELVERFVQGDTSRNTSGS GLGLAIAKSFTELQKGNFKISVDGDLFKASVTFKK >gi|223714133|gb|ACDT01000082.1| GENE 49 59516 - 60226 750 236 aa, chain - ## HITS:1 
COG:BH1153 KEGG:ns NR:ns ## COG: BH1153 COG0745 # Protein_GI_number: 15613716 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 5 235 6 232 232 243 56.0 2e-64 MNQDNILVVEDEIEIAKAIEIYLKSQGYNVYIANNGKAGLEMVERYDIHLAIVDIMMPVM DGIEMTMKIRENYDFPIIFLSAKSEDIDKITGLNIGGDDYVTKPFVPMELLARVSSQLRR YHKYLNMIGKIQEDQAANKYVVGGLELDLDTKEVSVDGKAVKVTPIELKILELLMEKPGR VFSSEQIYENVWHEAAINTETVMVHVRNLREKIEINPSNPQYLKVVWGVGYKIEKQ >gi|223714133|gb|ACDT01000082.1| GENE 50 60302 - 60979 400 225 aa, chain - ## HITS:1 COG:CAC2706 KEGG:ns NR:ns ## COG: CAC2706 COG0398 # Protein_GI_number: 15895963 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 9 222 10 226 236 99 30.0 7e-21 MDKLNEHKKNLSLLITGFIIITILCLVLYHFFENFLNDPIELKEKLTEFGSWGKLILVAA MAIQVIFVFLPGEIIEIAAGFSYGTIEGMAICLLGAIIGTIVIYAIVNRYGIKLVNHFYD TNKFNEISFLKKEKNLEIILFVIFFIPGTPKDILTYLAPFTRLKLSSFIFITSIARIPSV ITSTIGGNAISNQNYRFTIFIFVITGIISIFGLLSYKRYIKKKNL >gi|223714133|gb|ACDT01000082.1| GENE 51 61163 - 61522 334 119 aa, chain + ## HITS:1 COG:no KEGG:BCE_4519 NR:ns ## KEGG: BCE_4519 # Name: not_defined # Def: ArsR family transcriptional regulator # Organism: B.cereus_ATCC10987 # Pathway: not_defined # 1 106 1 106 120 80 38.0 2e-14 MKNTIQDSESLRKQFLASQKVFNALGDETRMYLLLIMLEGPCDGSRVIDLASKVNLSRPA VSHHLQILKQAGIVKTRKVKTCIYYYLEPQHCEIDKLINLFQHIKEIMNNVSRCEMEEE >gi|223714133|gb|ACDT01000082.1| GENE 52 61522 - 62187 565 221 aa, chain + ## HITS:1 COG:CAC0748 KEGG:ns NR:ns ## COG: CAC0748 COG0778 # Protein_GI_number: 15894035 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 220 1 231 241 153 38.0 3e-37 MELLEAVQKRHSVRMYQDREIPDSIKKELLDFIDQCNQTSGLNIQLILDEPKAFDNFMAH YGKFSGVKNYLALIGKKSVKLEEMCGYYGEKIVLKAQQLGLNSCWVALTYKKVKSAFVID 
DDERLCCLITLGYGIDNGATHKIKTIEQVSEVTGDMPSWFETGVKTALLAPTAMNQQKFK FILNDNTVKVKPGLGFYTKLDLGIVKYHFEIGAGTNNFIWQ >gi|223714133|gb|ACDT01000082.1| GENE 53 62298 - 63617 467 439 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 433 1 444 458 184 27 1e-45 MKQYDAIIIGFGKGGKTIAGKMAKLGKRVAIIEKSAQMYGGTCINEGCIPSKSLITQAEK YNYHDAIINKEDLITKLRNKNYAKLADLQNVDVITATAKFVDDHHVAICTENEEEIIFGD LIFVNTGATPVMPPIKGLATANKVYTSASLMKLEELPKTLAIIGGGYIGLEFASMYARYG SEVTIYESNDRLIAREDRDIAEEIQRILEKQGVKFVFNANIQALANEGEEVLVTYNDQVS QRFAGILMATGRKANTADLNLDAAGIEVNQRGEIVVNKYLQTTKKNIFALGDVKGGLQFT YISLDDSRIVMDYLFGDKKRSTLNRGNIAYSVFISPTFSRIGLSEVEAKAAGYEVIVTKL LTAAVPKANVLKKPEGLLKAIIDKKTERILGCVLLCEHSEELINLVTLVMNNDLSYKVLK NQIFTHPTMAEALNDLFDL >gi|223714133|gb|ACDT01000082.1| GENE 54 63642 - 63950 476 102 aa, chain + ## HITS:1 COG:AF1284 KEGG:ns NR:ns ## COG: AF1284 COG0526 # Protein_GI_number: 11498883 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Archaeoglobus fulgidus # 8 96 36 126 134 93 53.0 9e-20 MSNVLYPNKQEFDELIKNEKVLVDFFATWCGPCKMIGPNIEKLAEENADATVIKIDVDKH PEIAAVYGVQTIPTLIAFKKGQIVNQKIGFIPYPEILAMITS >gi|223714133|gb|ACDT01000082.1| GENE 55 63952 - 65112 718 386 aa, chain + ## HITS:1 COG:SPy1559 KEGG:ns NR:ns ## COG: SPy1559 COG0785 # Protein_GI_number: 15675452 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis protein # Organism: Streptococcus pyogenes M1 GAS # 10 225 11 232 236 132 36.0 9e-31 MATHISLALVFVEGLVSFLSPCVLPIIPIYMGYLAGNSNEKRNSNKKVLLFTISFIFGIL LAIFLMNASINLLQSFLKEHMTLFVRIGGILIVLLGIYQLGFIKINFLQRTFRFSLKTDN KMNVMVAFIMGFTFSFAWTPCIGPALSSILLLAASSGDFWYSNFLMIIYAFGFTLPFLVL GLFTNKALNWLNSHRDIVKYTTKIGAVILIIFGLMMFSGKLNTISNYMSPSQRQVSTNQA ENSDDSVNYGNALLNQDDKPISLADYHGKVVFLNFWATWCPPCQREMPEIQKLSEKYQNS 
EDIAILTVVMPGGQEMDAAGIKKFLKEKGFTMPVIFDDGRLSSSFQITSLPTTYMFDRDG NVYGSVVGQLSSDMMENIIDRTLKGK >gi|223714133|gb|ACDT01000082.1| GENE 56 65201 - 65569 373 122 aa, chain + ## HITS:1 COG:CC0895 KEGG:ns NR:ns ## COG: CC0895 COG1733 # Protein_GI_number: 16125148 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Caulobacter vibrioides # 25 119 48 145 167 69 33.0 1e-12 MDIIDSSCGKRKKHTYKEIHESLPGIPTNLLSNRLKELCQDDLLQCELYSKHPPRYRYSL TAKSIDLDDIYNALIIWGDRHLDKSYKCISHDGCQGEIEIVYRCKECGEIINKEDLKIAP KE >gi|223714133|gb|ACDT01000082.1| GENE 57 65679 - 65901 148 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756106|ref|ZP_02428233.1| ## NR: gi|167756106|ref|ZP_02428233.1| hypothetical protein CLORAM_01626 [Clostridium ramosum DSM 1402] # 1 74 18 91 746 112 100.0 5e-24 MRSSIMKKSIIVAIILLILFGTLATLYINNLSDNLYSETDEYLNDLTNQTAKAIKNRINE NLTQLTTISLIIQQ Prediction of potential genes in microbial genomes Time: Thu May 26 10:05:53 2011 Seq name: gi|223714132|gb|ACDT01000083.1| Coprobacillus sp. D7 cont1.83, whole genome shotgun sequence Length of sequence - 1029 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1027 674 ## COG2200 FOG: EAL domain Predicted protein(s) >gi|223714132|gb|ACDT01000083.1| GENE 1 1 - 1027 674 342 aa, chain + ## HITS:1 COG:alr2306_2 KEGG:ns NR:ns ## COG: alr2306_2 COG2200 # Protein_GI_number: 17229798 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Nostoc sp. 
PCC 7120 # 221 341 2 124 260 84 30.0 2e-16 LVASVFPKEAVTGKFQNFIQTAYLTWLIIGVGSAVVVAYFYFIQGKNKLQITKLAYEDEI TGHYNYYRFIEYCQNINHLSKYVLVNCDVKGFKWFNEIYGEEIANRLLKQIIICIDTQCQ NEEFCCRQSADHFVILLDSDDFEQIKNRLFVLANYIRGEFAKEYNTSPYYFHFGVYKILE DDVDINLAFKKTQYTSNDLKRLSKDDVSFYQEEFFQKELYALQIEKEFQDSLKANHFKAY IQPKVNLQTGKVCSGEILTRWHHPEYNIISPGDFIPVYEKNGMLEALDFHIFKKALEQID YWNKYYGIKISISVNVSRTYIFNEGYVDKLIHLVKACNVNPE Prediction of potential genes in microbial genomes Time: Thu May 26 10:05:59 2011 Seq name: gi|223714131|gb|ACDT01000084.1| Coprobacillus sp. D7 cont1.84, whole genome shotgun sequence Length of sequence - 16938 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 10, operones - 7 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 421 334 ## COG2200 FOG: EAL domain 2 2 Op 1 13/0.000 - CDS 531 - 1565 1172 ## COG0136 Aspartate-semialdehyde dehydrogenase 3 2 Op 2 . - CDS 1569 - 2810 1237 ## COG0527 Aspartokinases 4 2 Op 3 . - CDS 2812 - 3966 1411 ## COG0460 Homoserine dehydrogenase - Prom 3994 - 4053 7.6 + Prom 4216 - 4275 12.5 5 3 Op 1 19/0.000 + CDS 4296 - 5780 1622 ## COG0498 Threonine synthase 6 3 Op 2 . + CDS 5790 - 6683 1060 ## COG0083 Homoserine kinase + Term 6684 - 6727 8.9 7 4 Op 1 . - CDS 6713 - 6931 250 ## gi|167756112|ref|ZP_02428239.1| hypothetical protein CLORAM_01632 - Prom 6954 - 7013 4.0 - Term 6969 - 7004 4.1 8 4 Op 2 . - CDS 7023 - 7253 237 ## DSY0630 hypothetical protein + Prom 7219 - 7278 6.4 9 5 Op 1 . + CDS 7391 - 8077 783 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 10 5 Op 2 . + CDS 8077 - 8319 396 ## gi|167756115|ref|ZP_02428242.1| hypothetical protein CLORAM_01635 11 5 Op 3 . + CDS 8319 - 9059 759 ## COG1073 Hydrolases of the alpha/beta superfamily + Term 9060 - 9088 -0.0 + Prom 9119 - 9178 6.3 12 6 Tu 1 . 
+ CDS 9371 - 11914 3134 ## COG2273 Beta-glucanase/Beta-glucan synthetase + Term 11923 - 11966 2.1 + Prom 11995 - 12054 7.9 13 7 Op 1 59/0.000 + CDS 12181 - 12618 764 ## PROTEIN SUPPORTED gi|167756118|ref|ZP_02428245.1| hypothetical protein CLORAM_01638 14 7 Op 2 . + CDS 12636 - 13028 657 ## PROTEIN SUPPORTED gi|167756119|ref|ZP_02428246.1| hypothetical protein CLORAM_01639 + Prom 13057 - 13116 8.5 15 8 Tu 1 . + CDS 13193 - 13720 623 ## gi|167756132|ref|ZP_02428259.1| hypothetical protein CLORAM_01652 + Prom 13813 - 13872 13.4 16 9 Op 1 . + CDS 13938 - 14366 256 ## gi|237734106|ref|ZP_04564587.1| predicted protein 17 9 Op 2 . + CDS 14441 - 15181 607 ## gi|167756134|ref|ZP_02428261.1| hypothetical protein CLORAM_01654 + Term 15242 - 15282 4.2 + Prom 15253 - 15312 10.2 18 10 Op 1 . + CDS 15334 - 16035 135 ## gi|167756135|ref|ZP_02428262.1| hypothetical protein CLORAM_01655 19 10 Op 2 . + CDS 16010 - 16291 252 ## COG1846 Transcriptional regulators 20 10 Op 3 . + CDS 16321 - 16905 397 ## gi|167756137|ref|ZP_02428264.1| hypothetical protein CLORAM_01657 Predicted protein(s) >gi|223714131|gb|ACDT01000084.1| GENE 1 2 - 421 334 139 aa, chain + ## HITS:1 COG:slr1692 KEGG:ns NR:ns ## COG: slr1692 COG2200 # Protein_GI_number: 16330979 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 2 132 189 318 332 119 44.0 2e-27 PEQVEIEITETTALNHKEELIKILNQLKSYHFKVALDDFGSGYSSLNILKDLPIDVVKID QEFFRTNDYTHTRSHIIIEEIIQLCHKLNLEVVAEGVETIEQKEFLEAHDCDYIQGYYYY RPMLLKEFEALFVLENINK >gi|223714131|gb|ACDT01000084.1| GENE 2 531 - 1565 1172 344 aa, chain - ## HITS:1 COG:L66199 KEGG:ns NR:ns ## COG: L66199 COG0136 # Protein_GI_number: 15673604 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Lactococcus lactis # 1 344 1 347 358 394 58.0 1e-109 MKKYKVAVVGATGAVGREMLRCIFQFKFPFESIKLLASARSAGKVVEFEGHEFTVEELTE NSFEGIDVAFWSAGGSISEKYMPFAVKAGCVNIDNTSHFRMDPEVPLVVPEVNGEALRNH 
KGIIANPNCTTIQMVSALNRLNEYFDIERIIVSTYQAVSGAGVAGMEELYRQSQEILDGK EPVAKTLPCAGDKKHFPIAFNCIPQVDKFDLDNHFSKEEMKMVNETKKIFNKDIKVNATC VRVPVLRGHSESVYIETKKPIDIDKVFELLNNSTGVELYDDIENQIYPMANLFVGDELVH VGRVRKDLDNPNGLSLWVVADQLMKGAAYNSVDIGLKMIEMGLI >gi|223714131|gb|ACDT01000084.1| GENE 3 1569 - 2810 1237 413 aa, chain - ## HITS:1 COG:HP1229 KEGG:ns NR:ns ## COG: HP1229 COG0527 # Protein_GI_number: 15645843 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Helicobacter pylori 26695 # 3 407 2 401 405 236 37.0 6e-62 MNIIVQKFGGTSTRSRESRSTMYNNIIREINNGNKVVAVVSAMGRYDDPYATDTLLSIVN TQQLTDEEIDRLTSIGETISTLVCKGEVASLGYRAATITNSELGIVTDSTFNNATINKVE GKDIIEKLKYADVVFCPGFQGHSIHGKITTLGRGGSDLSAVAIGVAIDASEVEIYSDVNG IYTADPRLVPDALKLDCISYAEMLELAKNGAKVLNHRCVQLAASHNIAIHARSSFEDTKG TYVIGDERIMKENLNQLIISGITGSQNEARITLVKVDATNNSAGSLFEKIADAGVNVNVF NQALVGGGRMDISLVISDVDVPTVVDIINDLKPQLNADRIIVRGNIGKVSVIGVGIKNNR GMFQKAYNTLSSNGINVEMTSCSEINISCYIDRDDVKKAQILLHEAFLGKVGK >gi|223714131|gb|ACDT01000084.1| GENE 4 2812 - 3966 1411 384 aa, chain - ## HITS:1 COG:aq_1812 KEGG:ns NR:ns ## COG: aq_1812 COG0460 # Protein_GI_number: 15606863 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Aquifex aeolicus # 1 341 4 353 435 258 43.0 2e-68 MKIGVIGLGTVGYGVIEILTNERKRLEKVINDEVMIKYGCGLEEVDLPEGVIYTKDFNDV INDHEIDVVVELIGGVTVAKTIILEAIRNGKHVVTANKALLAHDGKEIFMAAKEKNVHIF YEASVGGGIPVLVSQKESLVANNTKEIIGILNGTTNFILTLMENENLDFDEALTIADGLG YVEANPALDLDGIDAAHKICILAQNGFKKYIDFDKISASGIRNVTKVDMDYAKKLGYRFK MIAQAKEVNNGIGIDVAATLVSKKELVANVMDAYNVVEIDNDYVENVIYYGKGAGRYVTA SAVVSDIIKTTQVERWDHDYEDCSNIYPITKSKYYLRANKPLDVDYEMYFTEKHDHIYIT HEIELADLTTDLGDTEYTIFKVRG >gi|223714131|gb|ACDT01000084.1| GENE 5 4296 - 5780 1622 494 aa, chain + ## HITS:1 COG:CAC0999 KEGG:ns NR:ns ## COG: CAC0999 COG0498 # Protein_GI_number: 15894286 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Clostridium acetobutylicum # 5 489 6 492 496 439 48.0 1e-123 
MEKKYISTRNQQKEINFYQAIVQGIGEDGGLLVPDFDFTKMDLDVLSKLNYVDLATEVLS TFVPEEGKELIHDACLNAYGKGLFPEIVVPVKKAGDVYVAELFHGQTAAFKDMALSLLPY LMTLSLKQLKEEREVMILAATSGDTGKAALEGFKDVEGTCIKVFYPLDGVSAIQQQQMVS QTGKNVKVVGIHGNFDDAQSAVKKAFASEELKVASDQHNVFLSSANSINVGRLIPQIVYY FHSYFELVRNNEIKLGDKINFCVPSGNFGNCLAGYFAKKMGLPIDKFVCASNKNNILTDF FTTGKYDANREFYKTNAPAMDILVSSNLERLVWFMSDGDGEKVRQYMDKLNSEGVYEVDD ATFAKIKDQFKAGCLSEDEILTTIKTCFNETGYLLDTHTAIGYGVLKQYQQETGDHTKTV LLSTASPYKFPESVYQAIYGEELDVYTAIDKLSEKTGVPVPQALAGIKEREVLHKEAIDK TEIISFIKSEIEAM >gi|223714131|gb|ACDT01000084.1| GENE 6 5790 - 6683 1060 297 aa, chain + ## HITS:1 COG:CAC1235 KEGG:ns NR:ns ## COG: CAC1235 COG0083 # Protein_GI_number: 15894518 # Func_class: E Amino acid transport and metabolism # Function: Homoserine kinase # Organism: Clostridium acetobutylicum # 3 291 2 292 296 210 38.0 2e-54 MKVKVKVPATSANLGPGFDVAGLALTLYNTFTFELADQGLNITGCPEQFCNEENMTYQAF KQAAEICGLEYQGVNIECSGDVPYTRGLGSSSTCIVAGIVGAFAFKDKVEDRQEILELAT AIEGHPDNVAPAIFGGLTVSVMEEDNVLTLNIPVKHDYRFVTLIPPFTLSTEQSRSVLPQ VLPRADAIKNVSHLALMVASLINGYDEGLKLGFKDRLHQPYRGDLIKGFNEIMGVLEQDE KVLGAYLSGAGPTIMAVIRGEDKMGVVRIKEELGALIKDWQVVKLELDNRGYTADYE >gi|223714131|gb|ACDT01000084.1| GENE 7 6713 - 6931 250 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756112|ref|ZP_02428239.1| ## NR: gi|167756112|ref|ZP_02428239.1| hypothetical protein CLORAM_01632 [Clostridium ramosum DSM 1402] # 1 72 2 73 73 108 100.0 9e-23 MQYKRLKELRTKYGYTQEKVAKYLHVDQKTYSRYELGQHEMTPDTLGKLATLYDTSVDYL IERTDNIKSFTK >gi|223714131|gb|ACDT01000084.1| GENE 8 7023 - 7253 237 76 aa, chain - ## HITS:1 COG:no KEGG:DSY0630 NR:ns ## KEGG: DSY0630 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 3 74 2 73 73 65 48.0 4e-10 MYYKRIRDLREDKDLTQKQLAEYLNVSQKSYSRYERGERTIDPEILSKLATFHETTVDYL IERTDDKTDYTKKKRR >gi|223714131|gb|ACDT01000084.1| GENE 9 7391 - 8077 783 228 aa, chain + ## HITS:1 COG:CAC0908 KEGG:ns NR:ns ## COG: CAC0908 COG1179 # Protein_GI_number: 15894195 # Func_class: H Coenzyme transport and metabolism 
# Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Clostridium acetobutylicum # 4 224 6 249 251 228 48.0 1e-59 MEQFARITRMVGSSWIECLQNKKVAVFGLGGVGSYVVEALVRNGIGELVLVDHDIIDITN LNRQLYALHSTIGMNKVDVARARCLDINPNLKITTYQQFYLPDQGMETIFEDCDFVVDAI DTVKAKLALIENCQHLGVRVISSMGTGNKLDPSRFEITDIYKTSVCPLARVMRRELKNRG IKKCPILYSTESPQEVDGSTPGSVSFVPSVAGLMIAGYVIKELVKEEE >gi|223714131|gb|ACDT01000084.1| GENE 10 8077 - 8319 396 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756115|ref|ZP_02428242.1| ## NR: gi|167756115|ref|ZP_02428242.1| hypothetical protein CLORAM_01635 [Clostridium ramosum DSM 1402] # 1 80 1 80 80 143 100.0 3e-33 MEGITEINKDDYIDNCLKIVKEMITEEDFSDEIWLALTGEIMDTCLFIGGDFEEANIRNI TNQYINNGGIKRFKKAHEVL >gi|223714131|gb|ACDT01000084.1| GENE 11 8319 - 9059 759 246 aa, chain + ## HITS:1 COG:CAC3665 KEGG:ns NR:ns ## COG: CAC3665 COG1073 # Protein_GI_number: 15896898 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Clostridium acetobutylicum # 1 245 7 259 265 162 37.0 6e-40 MQHYCEIPTPKGTMRGFFHKPDLDRHPVCIIFHGFTGQKTGTKFCYVQLARMLEAKGIAT FRFDFLGSGESDLNFKDMTFKDELACARVILEEALKMENCTEIYVLGHSMGGAVASELAK LYPQVISKLVLWAPAFNLPAALDYLTGKVEPSSNGLYDHGGYEISQTFVDDILARDFYAD LDTYKNELLVIHGTNDTTVPFDISKIYVPKFNQQLKFVPIEGANHNYDTVEHIKEVLKLS LDFLTK >gi|223714131|gb|ACDT01000084.1| GENE 12 9371 - 11914 3134 847 aa, chain + ## HITS:1 COG:TM0024 KEGG:ns NR:ns ## COG: TM0024 COG2273 # Protein_GI_number: 15642799 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Thermotoga maritima # 33 476 24 462 642 154 29.0 5e-37 MERRWSFRKVLSSVAVAGMLTSMAAVPTISPVNAAVDHQYDLVEMKKDSVLKNGNFDNGA TNWNIAGQNSGVVAEDGVTIGKSGKLGATTDLHANGHIWQEVTLKPNTKYTLKAKVKFSS ENSDDTITLDVKTGGLNAPKTFKDKPISPTNGWEEVELEFTTSSETAYVVGVARWVDGNA SDSLKNSTVYFDDVELIGGDSEEDSNYDILWADDFNESQLDTSDWGYELGSIRGVEQQHY VKSDENVFMRDGNLVLKATDRAKEDQYKNPRGNRQVIYNSGSVRTHGKQEFLYGRIEMRA 
KLPKGRGAFPAFWTLGADFTLDGKINGKQGYGWSRCGEIDIMELLGGYEGEARNKQVWGT PHFYDKNIGDQWDQDTTGSGGKAYTNPSGADFNDDYHVFGINWSPDKIEWYVDGVIYNTL DLTNPTWGESAKKCFNRPQYIQLNLAMGGNWPGDVETNLAGTEFAIDYVYYAQNEEQKAA AKEYYDAAPEISGLKDVTMQEGATPDLLAGVTSNRNSFVDFSIENEHMFKNEGGLTSVDL LCTGKDDLASLAKLPVGKYNIHYTAIPNDIEYDGNRPERESDYKFTRKTMTLTVAERTFP SDFKLNGIVGDKLANVALPEGWSWVDPETKITGTADEYDVKYVNGEYSKTVKVVVNAVVV DKEGLKARLAEATAEAAKIDVYKAATIEKLNTVIEEATKVLNDTSANKEAVEEAIRALNT AIANLEKYVTEAELDQVIAKGEEFLGKTDIYTKESLDILSTALAKVRSAITSGDKEAIES AYAQLTEAIDKLVKIDAPTPEVKPEVKPENKPNTSVKTGDNTYLGAIMTSLLLSAGGLVL FKKRKYN >gi|223714131|gb|ACDT01000084.1| GENE 13 12181 - 12618 764 145 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167756118|ref|ZP_02428245.1| hypothetical protein CLORAM_01638 [Clostridium ramosum DSM 1402] # 1 145 1 145 145 298 100 1e-80 MRQTTMANAATIERKWYVVDATDLTLGRLASEVATVLRGKNKPTYTPHVDTGDYVIIVNA DKVVMTGKKLDTVNYYRHSGWLGGLKVKTARQFKEQNPTGWVEAAVKGMLPHNTLGRKQG MKLFVYAGSEHPHAAQKPEELKIKG >gi|223714131|gb|ACDT01000084.1| GENE 14 12636 - 13028 657 130 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167756119|ref|ZP_02428246.1| hypothetical protein CLORAM_01639 [Clostridium ramosum DSM 1402] # 1 130 1 130 130 257 100 3e-68 MANVQYRGTGRRKSSVARVILTPGTGNIVINDKPAKEYLPSDVLLMIVNQPLELTNTTSQ FDVSVNVYGGGYSGQAGAIRHGISRALLQAGTDYRPTLKAAGFLTRDARVKERKKYGLKK ARRAPQFSKR >gi|223714131|gb|ACDT01000084.1| GENE 15 13193 - 13720 623 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756132|ref|ZP_02428259.1| ## NR: gi|167756132|ref|ZP_02428259.1| hypothetical protein CLORAM_01652 [Clostridium ramosum DSM 1402] # 1 175 1 175 175 325 100.0 5e-88 MVKLKRMLGDIDDPTVVVLMALIESQSAETLANWSLNYVEQNLLSIYQDKYPDDARLKHI ICETRKYLAGTKTKKEQRMLLKEAKELIKDVEETIVPLAIVRAILIACATKDSPTNALGY TFYAVAAIVYHQAGLCENKEIYDQLAKEEFKKVLKSLQLIAIDNENNPVKVKWYC >gi|223714131|gb|ACDT01000084.1| GENE 16 13938 - 14366 256 142 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237734106|ref|ZP_04564587.1| ## NR: gi|237734106|ref|ZP_04564587.1| predicted protein [Mollicutes bacterium D7] # 1 142 1 142 142 194 100.0 
2e-48 MKKVIISIFIILFLFLGISYFYYVHNSNARKYINNWNIELSSNLKSLYEKKTEIGFQGDG TIYEIFELEKTSKLPNNLFSNKNTDFEALFKEYLVNLKINTNQYPDFKNDYYWKYIEKNN SDKLLVIINKNKTRLYIIQSLS >gi|223714131|gb|ACDT01000084.1| GENE 17 14441 - 15181 607 246 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756134|ref|ZP_02428261.1| ## NR: gi|167756134|ref|ZP_02428261.1| hypothetical protein CLORAM_01654 [Clostridium ramosum DSM 1402] # 1 246 1 246 246 421 98.0 1e-116 MKNKIKLIRCISVVTCMCLPQTNVYAQSINQEEKTYELLEQQIESEHIDIIAKLDKLTNE YQEILVIETQNKNLTEINKIKDLISGLEKIKKEYMASIQNTTRANQPNTAVAAVIGYFSN KNYKLASELLIHATVNTNKNSTYSPTNGSRVKSHSVFVKIANGSKTNGSDIFTNTGGTAS KDCYYALHSFNYSKPTSSSKLVNISDYYDYASGDYNGMEGIAVNAMYLAQQSGAIVPYNV LISQRL >gi|223714131|gb|ACDT01000084.1| GENE 18 15334 - 16035 135 233 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756135|ref|ZP_02428262.1| ## NR: gi|167756135|ref|ZP_02428262.1| hypothetical protein CLORAM_01655 [Clostridium ramosum DSM 1402] # 1 233 1 233 233 352 100.0 1e-95 MSKNNDINKLKIINQAIKDTEYITTEYSPYRGIISVFCKWLICYSSMMLLIYVIDILNFK FGFYNYKYFYNLYNGGKVLFNICINLYIWKTICLKELSVKERRFLKLWIIFPILFSIEII IPILTNYLNTDAMISFYQTISLSYIIVLIELFYIYSYFRNKRTMIITLLFICYIVVSFIL KAYIYSSRAISNSFGVFMNIFYDFDTYGLVAIIMLFTIIFLKRDTDDKRKRNL >gi|223714131|gb|ACDT01000084.1| GENE 19 16010 - 16291 252 93 aa, chain + ## HITS:1 COG:mll7902 KEGG:ns NR:ns ## COG: mll7902 COG1846 # Protein_GI_number: 13476549 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 10 92 6 92 103 62 34.0 2e-10 MIKGRETFKIFESNIRLKMVAGLSKSNLTYNQLRELCNCSDGNIATHTKKLVSAGFIDVK KEFVNNKPKTTYNLTEYGQKEFKEYIKFLKKIN >gi|223714131|gb|ACDT01000084.1| GENE 20 16321 - 16905 397 194 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756137|ref|ZP_02428264.1| ## NR: gi|167756137|ref|ZP_02428264.1| hypothetical protein CLORAM_01657 [Clostridium ramosum DSM 1402] # 1 194 5 198 198 306 100.0 4e-82 MFTFLLTLSMIMLINIVPTKAFDNETELDLLIRRIDHINEIYNSNYYILSEEEFINSEIT DTFKSYDDYVKHLLNYDLNELENELIVNITVDSETINVNIETYFRSTSGSRTLSFYNGNN 
KMILKYKYSGSKFDTSYKPGVTVTKVNTKNFFEMSSHTGSFKNSNKTYSIIAKGRVITPS GVVSNKSFTVNFNL Prediction of potential genes in microbial genomes Time: Thu May 26 10:06:57 2011 Seq name: gi|223714130|gb|ACDT01000085.1| Coprobacillus sp. D7 cont1.85, whole genome shotgun sequence Length of sequence - 10543 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 306 146 ## Cphy_1856 TetR family transcriptional regulator - Prom 343 - 402 8.3 + Prom 860 - 919 4.8 2 2 Op 1 . + CDS 976 - 1776 762 ## gi|167756566|ref|ZP_02428693.1| hypothetical protein CLORAM_02103 3 2 Op 2 . + CDS 1827 - 3275 1745 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 3277 - 3308 -0.6 + Prom 3279 - 3338 10.6 4 2 Op 3 . + CDS 3373 - 4377 860 ## COG3049 Penicillin V acylase and related amidases + Term 4381 - 4412 -0.5 + Prom 4389 - 4448 5.4 5 3 Op 1 24/0.000 + CDS 4512 - 7019 2297 ## COG0209 Ribonucleotide reductase, alpha subunit 6 3 Op 2 . + CDS 7029 - 8069 1044 ## COG0208 Ribonucleotide reductase, beta subunit + Term 8075 - 8124 12.6 + Prom 8352 - 8411 9.6 7 4 Tu 1 . + CDS 8607 - 8813 73 ## gi|167756573|ref|ZP_02428700.1| hypothetical protein CLORAM_02110 + Term 8969 - 9014 1.3 + Prom 9281 - 9340 1.6 8 5 Op 1 . + CDS 9397 - 9495 73 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 9 5 Op 2 . + CDS 9483 - 10046 247 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 10 5 Op 3 . 
+ CDS 10059 - 10274 275 ## gi|167756576|ref|ZP_02428703.1| hypothetical protein CLORAM_02113 Predicted protein(s) >gi|223714130|gb|ACDT01000085.1| GENE 1 3 - 306 146 101 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1856 NR:ns ## KEGG: Cphy_1856 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: C.phytofermentans # Pathway: not_defined # 1 101 1 101 185 123 55.0 2e-27 MNLPSTSKEEIISICQRLAKEKGLSSINMRSVAIECSVSVGAIYNYFPSKSELLCSTIES IWKDIFHLSSEQFSFTNFIECLTWLFESIQEGSQEYPEFLS >gi|223714130|gb|ACDT01000085.1| GENE 2 976 - 1776 762 266 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756566|ref|ZP_02428693.1| ## NR: gi|167756566|ref|ZP_02428693.1| hypothetical protein CLORAM_02103 [Clostridium ramosum DSM 1402] # 1 266 1 266 266 424 100.0 1e-117 MGILLVRLIDVINEYSEDSTFYSIAYTMLLNFDNLQNLSINDVANLCHVSKSTISKFVRS LNFEDYSDFKAEAYFKENRFNSDYNYVANIQQYIANQDANTYIDKVIQDIEIIKNIDMTV IRKIAQIIYQYPKVTAFGTLFSQLGALDLQYKLAYNHKFIMSYVNDVKQDEYLKNNSEQG VVIIYSNSGNYLNKYQLSSFDEKKKYDYKNKKVILITANEMMVNHPDVDICLVYQHLSKL QTHSYIYPLINDLIVAEYRSFQNANF >gi|223714130|gb|ACDT01000085.1| GENE 3 1827 - 3275 1745 482 aa, chain + ## HITS:1 COG:BS_bglH KEGG:ns NR:ns ## COG: BS_bglH COG2723 # Protein_GI_number: 16080977 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Bacillus subtilis # 3 482 19 481 481 452 47.0 1e-127 MSRFPDNFLWGGASADFQYEGGFGEGNRGLITADFVTDGNVDTLRQVTYKCKDGTLGSSP LKAEIPEGATGYIDPERYYPSHDGVDFYHRYKEDIALMAEMGFNVYRFSICWTRIFPTGE EAEGNELGLKFYENVIDELLKYHIEPLITICHDEIPAYLADHYDGWSSRHTINCYLKLCK TLFERFKGKVKYWLTFNELNVVKGYAQMGTHKIDSQTHYQAMHHVFVASSLATKMAHEMM PGCMVGTMYAMSGLYPLTCKPEDMMAHMNTRRLSYFYADTMVTGGYPYYAQALFEKEGVV LVKEPGDDDLLKEYPLDFITFSYYRTTTVNANTKLNIIGLAMDLNPYLEATPWGWPIDPV GLRYVMNELYDRYKKPIMIVENGMGEIDHFENDTVIDDYRIRYLKDHFKNMKDAINIDHV DCIGYTMWGAIDLVSLSTGEMKKRYGFVYVDKNDDGSGTYNRYKKKSFDWMKEVIATNGE KI >gi|223714130|gb|ACDT01000085.1| GENE 4 3373 - 4377 860 334 aa, chain + ## HITS:1 COG:BS_yxeI KEGG:ns NR:ns ## COG: 
BS_yxeI COG3049 # Protein_GI_number: 16081005 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Bacillus subtilis # 1 309 1 307 328 236 36.0 4e-62 MCTAIKIYSDNGDIYFGRTMDFSFELNPELYYIPAKYQWRNVLNTHDIENKYNILGIGQD VSPVALSDGVNDHGFAVAALYFSGFAQYDGLKKDNTSIIPIAAFELVYFLLSQCASVKEA QELIKVIKIIGVEDSITHIAAPLHWIISDANNNCMVIEKTIDGLHAINNPIGVLANSPDF SWHMTNLRNYYNLSPYQFQEAIWNNIKLTPFGQGAGTLGLPGDYTPPSRFIRAAFQKTYT DIPIDRHQAVITCFRIMETVSIPKGVVITADQTSDYTQYTMFINLATREYYFKTYWNNQI TRVEFPQYDEKDMKMLSLGPLNQTIEFKTRSIFL >gi|223714130|gb|ACDT01000085.1| GENE 5 4512 - 7019 2297 835 aa, chain + ## HITS:1 COG:TP1008 KEGG:ns NR:ns ## COG: TP1008 COG0209 # Protein_GI_number: 15639992 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Treponema pallidum # 1 835 1 845 845 1026 56.0 0 MLIQKRNGLFQDYDNNKIETAIKKAFCSLSETIKDDELKKMIADIEKHIIADMSVEAIQD LVEETLMKHGYLQVAKAYILYRQKHTEQRLIIDELLALLGDQNLKKVLLKIHKDYPQEEY SLQLLLVKFKTFYKEGMSHFEALKMLMKASVELISKEAPKWEMISARFLTYDVNQQVDQK MNTAGRYTFYERINYLAQQGYYGQYILESYSKDEIDMLGQYISESRNELFTYSGLELIVK RYLIQNHDREIIEKPQEMFMSIAMHLAMNESNRVKWAKEIYDILSTLKVTMATPTMSNAR KPFHQMSSCFIDTVDDSLAGIYKSIDNFAKVSKNGGGMGLYFGKVRANGSSIRGFKGAAG GVIRWLKLVNDTATAVDQLGVRQGAAAVYLDAWHKDLPEFLQIRTNNGDDRMKAHDIFPA VCYPDLFWKLAKNDLEASWYMMCPHEIHEVKGYYLEDCYGEEWERKYYECVEDDRIPKRV MTVKEIVRLIIKSLVETGTPFTFNRDHVNNFNPNPHQGMIYCSNLCTEIAQNMQAMEILP SEEIEVNGETIIVEKTKPGDFVVCNLASLTLGHLDVLNDDELRHVISVVIRALDNVIDLN YYPIPFAKITNQKYRAIGLGTSGYHHMLVKLKMSFESDEHLQFIEQLYEKINYFALEASC DLAKEKGSYSLFNGSDFETGAYFIKRRYQSKAWQNLQAKIKENGLRNGYLLAIAPTSSTS IIAGTTAAVDPIMKKYFMEEKKGSMITRVAPDLDSETFWLYKNAHYIDQEWIVKAAGIRQ RHLDQSQSVNLYMTNEFTFRKLLNLYIKAWEYGVKTLYYVRSQSLEVEECESCSS >gi|223714130|gb|ACDT01000085.1| GENE 6 7029 - 8069 1044 346 aa, chain + ## HITS:1 COG:TP0053 KEGG:ns NR:ns ## COG: TP0053 COG0208 # Protein_GI_number: 15639047 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # 
Organism: Treponema pallidum # 3 346 8 351 351 510 72.0 1e-144 MELKQKALFNAKGDIDVRKRRMINGNTTNLNDFNNMKYQWVSDWYRQAMNNFWIPEEINL TQDIRDYRLLSEAERTAYDKILSFLVFLDSLQTANLPNIGEYITANEINLCLTIQAFQEA VHSQSYSYMLDSICSPTKRDEILYQWKFDEHLLKRNEFIGEQYNAFLEHKDSFHLMKTIM ANYILEGIYFYSGFMFFYNLGRMGKMSGSAQEIRYINRDENTHLWLFRNMIIELQKEEPT LFTPEKIEVYRNMLKRGVEEEIAWGQYVLGNRIEGLTMDMVSDYIHYLGNLRAFSLNFEP LYEGFETEPEAMKWVSIYSNANNIKTDFFEAKSTAYAKSSAIEDDL >gi|223714130|gb|ACDT01000085.1| GENE 7 8607 - 8813 73 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756573|ref|ZP_02428700.1| ## NR: gi|167756573|ref|ZP_02428700.1| hypothetical protein CLORAM_02110 [Clostridium ramosum DSM 1402] # 1 68 1 68 68 110 100.0 4e-23 MLLLRREHEAIVNHQLCITNTFEQVFKEKMLKLYKKVRYLCNTHGYPFTKNIHSKYLADL IYLKMKLL >gi|223714130|gb|ACDT01000085.1| GENE 8 9397 - 9495 73 32 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 28 16 43 311 32 46 5e-17 NRVQVLKGVNGSIERGEICVMLGPSGSGNQHY >gi|223714130|gb|ACDT01000085.1| GENE 9 9483 - 10046 247 187 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 171 41 218 245 99 37 6e-24 STLLNLIGGIEIIDYGNVIVGGIDLASLSKTKLGLYRRDTLGFVFQFYHLVPNLTVKENI ETGAYLSKTPLAVETLLDKLGLKEHCNRYPNQLSGGQQQRTAIGRALAKNPKLLLCDEPT GALDYHTSKDILELLEQINQDYGTTILIVTHNDAIAKMAHRVLRLRDGQIISNYLNTSRI NAKELEW >gi|223714130|gb|ACDT01000085.1| GENE 10 10059 - 10274 275 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756576|ref|ZP_02428703.1| ## NR: gi|167756576|ref|ZP_02428703.1| hypothetical protein CLORAM_02113 [Clostridium ramosum DSM 1402] # 1 71 1 71 748 134 100.0 2e-30 MKNPLNKSLKKEFTDNLARYLALALVMIVMIATVSGFFSVAYSVNNLLKQNQDECLVEDG QFTVLKPLTKE Prediction of potential genes in microbial genomes Time: Thu May 26 10:07:18 2011 Seq name: gi|223714129|gb|ACDT01000086.1| Coprobacillus sp. 
D7 cont1.86, whole genome shotgun sequence Length of sequence - 931 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 930 795 ## COG0577 ABC-type antimicrobial peptide transport system, permease component Predicted protein(s) >gi|223714129|gb|ACDT01000086.1| GENE 1 3 - 930 795 309 aa, chain + ## HITS:1 COG:CAC1534 KEGG:ns NR:ns ## COG: CAC1534 COG0577 # Protein_GI_number: 15894812 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 7 306 255 560 746 93 24.0 4e-19 NMLFYIILVILAFIFVVISQAIIEDQATAIGTLLANGYTKQELIHHYLTLVMIIVFVSGL VGNVIGYTIMPSFFSNMYYNSYCLPPLQLKFIPSVFVNTTVIPFLVILIVNYLMLWHKLN ISPLKFLRQDLHENKKSHYIHLKQTSFIKRYRQRVILQNKGSYLVLMIGIIFASFLLTFG FCITPSIERYLENMENSIKTNYQYLLKVPVEAAGEKITYTLLQTYFAAGEIDLDVSFYGL DDNSIYYADIDLPKEKDQIIISYDFAKKVGLEKGDKVTFTNQYTEQEYQLKVFDIYNSRT NISVYMSRE Prediction of potential genes in microbial genomes Time: Thu May 26 10:07:21 2011 Seq name: gi|223714128|gb|ACDT01000087.1| Coprobacillus sp. D7 cont1.87, whole genome shotgun sequence Length of sequence - 9018 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 6 - 401 421 ## Shel_01820 ABC-type transport system, involved in lipoprotein release, permease component 2 1 Op 2 3/0.000 + CDS 443 - 1396 971 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 1398 - 1457 2.8 3 1 Op 3 35/0.000 + CDS 1480 - 3243 233 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 4 1 Op 4 . + CDS 3240 - 4979 1623 ## COG1132 ABC-type multidrug transport system, ATPase and permease components + Term 4983 - 5012 -0.2 + Prom 5006 - 5065 3.9 5 2 Op 1 . 
+ CDS 5117 - 5344 346 ## gi|167756580|ref|ZP_02428707.1| hypothetical protein CLORAM_02117 6 2 Op 2 . + CDS 5352 - 7475 2007 ## COG2217 Cation transport ATPase + Term 7476 - 7513 6.4 + Prom 7492 - 7551 10.8 7 3 Tu 1 . + CDS 7579 - 7935 438 ## COG1321 Mn-dependent transcriptional regulator + Term 8033 - 8073 1.9 - Term 8015 - 8072 5.7 8 4 Tu 1 . - CDS 8117 - 8299 92 ## - Prom 8438 - 8497 10.8 Predicted protein(s) >gi|223714128|gb|ACDT01000087.1| GENE 1 6 - 401 421 131 aa, chain + ## HITS:1 COG:no KEGG:Shel_01820 NR:ns ## KEGG: Shel_01820 # Name: not_defined # Def: ABC-type transport system, involved in lipoprotein release, permease component # Organism: S.heliotrinireducens # Pathway: not_defined # 4 131 684 811 811 79 33.0 3e-14 MIPIMSGIAIVIYLVVMYILTKLVLDRNTNYMSFLKVIGYNSQEIAKIYLKATALVVVFS LLISLPICKYGLEILFIQAMMRFAGYIEIYIPTYLYVMIFTVGLITYFVVNWLLNKQIQR IDLGKSLKETE >gi|223714128|gb|ACDT01000087.1| GENE 2 443 - 1396 971 317 aa, chain + ## HITS:1 COG:RSc1813 KEGG:ns NR:ns ## COG: RSc1813 COG2207 # Protein_GI_number: 17546532 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Ralstonia solanacearum # 227 313 217 303 303 64 28.0 3e-10 MLEDIKLPFIDKKVINKHEREYQIGLIDGQGVMTSIDIYPGIQVIYNNFHCFTAPTDQSL NRHRYIEINHCLKGKFECQYNKNYYAYLCQGDLAISGGTLNKLTHSFPLGYYNGVEILIE VEKAKENKLLDEFNIDIDLISKRLKESHNVYIFRATAQIEHICLEMYEIEEELKRNYLKI KVLELLLFLSHHDFGILENEKHYYPKKQIEIIKAIKTELSNNISAKYDLEKLVKQYNINI HTFRKAFKEIHGKPIYQWYKEYRLEYSLGLLINTDIPIIEIANEIGYSNPSKYSAAFYQY TSMTPQQYRKSHLKMEQ >gi|223714128|gb|ACDT01000087.1| GENE 3 1480 - 3243 233 587 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 315 561 109 351 398 94 29 3e-19 MTVLKSLFAYAGKYKYLTILSLVFSMISAILLLMPFIMIWKVIEVILAEYPNFAQGSIAI QYGWYALIFAAVGILLYVSGLLCSHLAAFRIASNMRKKALHHAVQLPLGYFAKEGSGKLR KIIDEAAASTETYLAHQLPDMAQLITTVCAVIVCLFIFDWKFGIAALIPTILALSNMFKM 
IGPGLQESMKAYMDALEGMSNEAVEYIRGIPVVKTFQQTVFSFDRFHQSINNYEKFAIGY TNKMRFPMTLFTTFINSIFIFIIGTMMLLILNGFDLNSLMPDFLFYVIFTPIIAVTTNKI MFASENTMMAADALNRIEGIINQKPFTYKHSLQEITNSDIEFEHVSFTYPESENEVIHDV NLKIKSGSIVAFVGKSGGGKSTLVSLIPRFYDVTKGTIKIGGVDVKEISEKELMSKISFV FQDSRLLKKSFKENIMMGSNASVEDIQTAVHKAQCDDIIEKFDQGLETKIGSKGVYLSGG ETQRLTLARAIVKDAPILLLDEATAYADSDNEVLMQKAIMELAKDKTTIMIAHRLSTIVN VDCIYVVDNGEIIESGTHQELLIKNGLYAQMWHQYCQSVEWKVGEQE >gi|223714128|gb|ACDT01000087.1| GENE 4 3240 - 4979 1623 579 aa, chain + ## HITS:1 COG:SP1435 KEGG:ns NR:ns ## COG: SP1435 COG1132 # Protein_GI_number: 15901287 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Streptococcus pneumoniae TIGR4 # 6 577 14 579 581 414 41.0 1e-115 MIKKVQERFALTHLGAVNLIKACVSCTISYLAIAMSIGVLYYFTCDVLEMLYGSSNTILY SMYLIEFVIVVILIFIAHYIQYNMTFYNTYKESARLRIRVAEKLRKFPLMFFSKRDLSDL TTTILSDVTGMEQALSHFIPEFFGSIASTLLLSISMFFFDFRMALAAVWCVPVSFLLVVL AKRKLSNAGFKDRQKQLVRTEKIQETLETIRDLKANHYTQQYLNEVDQVIDDCEKSQIKT ELTNALFVVSSQLILKLGIATVVIYGVTSLINQTIDLKVFMLFLIVASRLYDPLSGTLQN LAAIISCDPKIARLNEIENYPLQTGEEKFEPSNYDIEFKNVSFEYQTKKKVLEDVSFVAK QNEITALIGNSGGGKTTCASLAARFYELNEGVIKIGGIDISTVDPEILLSKFSIVFQEVV LFNNTILENIRIGKKDATDEEVMEAAQKAFCDEFVEKLPDGYNTVIGENGSKLSGGQRQR ISIARAILKDAPIILLDEASASLDVESETFVQKALSHLIANRTVIMIAHRMRTIANASKL IVLEDGHVVEQGTPEQLLGKEGVYQRMVDLQKMSNEWKL >gi|223714128|gb|ACDT01000087.1| GENE 5 5117 - 5344 346 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756580|ref|ZP_02428707.1| ## NR: gi|167756580|ref|ZP_02428707.1| hypothetical protein CLORAM_02117 [Clostridium ramosum DSM 1402] # 1 75 1 75 75 110 100.0 3e-23 MKDLLKNKTVLSFVGGIAATIAGAKFLKSDKARTMAVNSLASGMKLKDDAMATYETIKED AKDICYEAKAKNEGE >gi|223714128|gb|ACDT01000087.1| GENE 6 5352 - 7475 2007 707 aa, chain + ## HITS:1 COG:SP2101 KEGG:ns NR:ns ## COG: SP2101 COG2217 # Protein_GI_number: 15901916 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 
112 691 100 675 687 363 35.0 1e-100 MKAYQIVHDMPGRMRIRYGKYTFSKTAAIGLSYELERWKMVNKVEANDITGSILFIYDKN RRRELLALIRQFEIEKFDVTPYLATIKEYNSYEITQKEVSQNYTSKFIKLLLRRYSIRWF LPTPIQYLSSIVLAVKYFKKGLASLGKGHIDVPVLDATSIAVSMLNRQFKTAGDIMFLLS VSQLLEDYTRKKTTLQLKESLSLHLDKVWILENESEIEIPSDTLKRGDIVIVHMGTMIPV DGEITDGQALINESSFTGEPLSKAVGKDDTVFAGTIVEEGKIYVKVRNLQKESRINKIVD MIDVNETLKAGVQSRAEHLADSIVPFSFFGFFGLLLWTRSLTRATSILLVDYSCAIKLTT SISIISAMQEAGKHSIMVKGGKYLEAMAQADTIVFDKTGTLTNATPFVQEVTPLDDYSRD EVLRIAACLEEHFPHSVANAIVKQAQDEELLHQEEHAEVEYIIAHGIATNLHGQRAIIGS EHFVFDDEQIEKPPEIVQLISKLHSKGASSLIYLAIGQRLAGIISIYDPLKPEAKHVINN LREIGFKKVIMLTGDCENAAGHIANELGIDDYKAGILPEDKAEYIRLLKAEGKKVVMVGD GVNDTPALSSADVSISMQDSSDIARELADVTLTSSRLDEIVEFKKISILLMQRIKHNYTN IVAFNSLLILSGLIGLLQPNTSAFLHNTSTFLFSAVSTKPLLYKKEN >gi|223714128|gb|ACDT01000087.1| GENE 7 7579 - 7935 438 118 aa, chain + ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 1 116 1 117 122 90 43.0 6e-19 MGEALEDYLEAILILSNQNKAVRSVDLAVYRGYSRASISYAVKELRNKNYLEVNKDGHLK LTSSGQVLANRIYERHCFFKKLLIAAGVSSNQAEIEACKMEHGISDDSFEKLKTLLDF >gi|223714128|gb|ACDT01000087.1| GENE 8 8117 - 8299 92 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MICLEVSDIKYQNLHELIQRSSSSREYFLSLPVDTQLRLHVLNDYIHSLHQLHIYAEISK Prediction of potential genes in microbial genomes Time: Thu May 26 10:07:35 2011 Seq name: gi|223714127|gb|ACDT01000088.1| Coprobacillus sp. D7 cont1.88, whole genome shotgun sequence Length of sequence - 6943 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 - CDS 2 - 502 127 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 2 1 Op 2 35/0.000 - CDS 531 - 1730 1105 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 3 1 Op 3 . 
- CDS 1733 - 3478 208 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 4 1 Op 4 . - CDS 3545 - 4159 391 ## Elen_0312 transcriptional regulator, TetR family - Prom 4204 - 4263 10.6 - Term 4294 - 4348 2.4 5 2 Tu 1 . - CDS 4488 - 5240 780 ## COG0588 Phosphoglycerate mutase 1 + Prom 5478 - 5537 11.6 6 3 Tu 1 . + CDS 5670 - 6860 1097 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|223714127|gb|ACDT01000088.1| GENE 1 2 - 502 127 166 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 4 139 359 502 563 53 31 3e-12 MVFQRVYLFQDTIYNNISIGRLEATEEEVIEAAKKARCYDFIMALPNGFQTIIGEGGTSL SGGEKQRISIARCILKDAPIIILDEATASIDGDNENYIQEAITQLCQGKTLLVIAHRLNT IRYADKILVIADGKIAQSGSHNQLIKQTGIYQDFINVRNNTKGWSQ >gi|223714127|gb|ACDT01000088.1| GENE 2 531 - 1730 1105 399 aa, chain - ## HITS:1 COG:SA2216 KEGG:ns NR:ns ## COG: SA2216 COG1132 # Protein_GI_number: 15928006 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Staphylococcus aureus N315 # 1 392 1 390 577 141 24.0 2e-33 MISLMSRILAIAGKYRRNIKIAFIFSFLKSMLAKSPIGLAFIAFNAFYNNKMSDRLCFLL GITLIVCVLFQAVFQNISDRLQGDSGYRIFSDMRMELGRHLRKLPMGYFTEGNIGKISSV LSTDMVFIEENCMTMIADMMSYIFAEIIMIIFMFYLNIWLGILSLLIAGAILWLGKAMEQ QTLEHSTIRQEQSEKLTNAVLDFIEGISIIKTYNLLGEKSAELSNNFKESCRTNLEFEET HAPWQRWLNIIYGLGIAAILALSLYLQSQNLLTVPYLIGVTLFVFDLFGPLKALYSQSTR LTVMSSCMDRVEDVLAQKELPDDGMEVIPEQSDLPEIKFKNVSFSYDKKEVLHNINFTLE KNKMLALVGPSGGGKSTIANLLTRFWDVDSGSWNKYQKR >gi|223714127|gb|ACDT01000088.1| GENE 3 1733 - 3478 208 581 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 320 555 118 354 398 84 27 2e-16 MLKKVLEYAGDYRKKTYQAIATMLIGVGMNILPFLFIYQIITPLIMHQQISTHYIIFRVI AIAISLILYAVFYVKGLALSHQSAYHTLKNLRISLQCKLEKQPLGVIQEKGVGAHKKTFI DDIEAIELLLAHALPEGISNLAIPLFIYIVMFIVDWKLALLSLGSLPIGLLAMGAMFKIG 
MKEMSNYYVAAQKMNNTIIEYVNGMEVVKVFNRDGESYHRFENDIRGYRDFTLAWYKACW PWMALYNSILPCVALLTLPFGSWLVLHNYSSLPDLVLILCLSFAIGTPLLRALAFVSTLP QISYKIESLEKMLSAPPLQQTSDCFKDRNHHISFDNVSFAYNNNVVLHNINLEIKEGSLT ALIGESGSGKSTLAKLLVHFYDVHSGSIKIGNQDLRQMSIETLNNQIAYVSQEQFLFNTT LMENIRMGRLDASDEEVYQAASKAQCDEFLSRLELGIHTLAGDGGKQLSGGERQRISLAR AILKDAPIIVLDEATAFMDPENEEKMNAAIAEVIKDKTVIVIAHHLHSIINAHQICVLKN GYLNAVGTHQELIDTCLEYQKLWTMANNSANWRVSHNQGGN >gi|223714127|gb|ACDT01000088.1| GENE 4 3545 - 4159 391 204 aa, chain - ## HITS:1 COG:no KEGG:Elen_0312 NR:ns ## KEGG: Elen_0312 # Name: not_defined # Def: transcriptional regulator, TetR family # Organism: E.lenta # Pathway: not_defined # 1 202 1 203 203 195 45.0 7e-49 MSKPDKSIDPRILKSAKEEFLSQGYEKASLKTICANAKVTTGALYKRYKSKEELFTAVVT PTLEALNEVADNRRINFETVTEFELIEAWNMNNDVMLWWFKFLYSHYDGFILLLSCSQGT SFSNFTHDWVEKMSIYTYEYYLEAKNRGIFTTEISKTEFHTLLSSFWTTIYEPFIHGFSW EQIENHCQIVCHFFNWYQVLGYTR >gi|223714127|gb|ACDT01000088.1| GENE 5 4488 - 5240 780 250 aa, chain - ## HITS:1 COG:TP0168 KEGG:ns NR:ns ## COG: TP0168 COG0588 # Protein_GI_number: 15639161 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Treponema pallidum # 1 245 1 249 251 316 57.0 2e-86 MKLVLVRHGESDWNKLNLFTGWTDVDLSQTGHREAIQAGTILKNEGYEFDVCYTSYLKRA IHTLNHILDEMDLCWLPVNKSWKLNERHYGALQGLNKAETAEKYDEEQVKIWRRSFDVLP PALNINDKRSAQKQAMYRNIDSALLPAGESLKTTIERVIPYFNETVKKDMQAGKRALIVA HGNSLRALVKYFDKLSNEAIMNINIPTGIPLVYEFDDEFKVIKHYYLGDETLLKEKIDAV ADQGKKLVTV >gi|223714127|gb|ACDT01000088.1| GENE 6 5670 - 6860 1097 396 aa, chain + ## HITS:1 COG:BS_ywbF KEGG:ns NR:ns ## COG: BS_ywbF COG0477 # Protein_GI_number: 16080885 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 11 395 10 385 399 83 25.0 9e-16 MSRVKSLNIKYVASQVFYFATFAAMMGYASVYLLHKGFSNSTIGIILSLCNILAVFMQPA 
LATFADKNQKIEIRKIITTIVAIAIALSAILLVVPSNQAIIFILIVSIFSLMTTIMPLMN TLAFVFEKYGIQVNYGLARGLGSVAYAVASMVLGHAVDAFSPDLLPVCYVVFNALLFIVV HGYVLPKSEQIEVAVETNDEDEVAVNNEGLLRFAGKYKKFIVFLLGFVFVYFAHTIINNF FIQIITNVGGNSSDMGNAVFLAAMLELPTMAYFTKLSKKVNCGTLIKISIILFLVKHVIT YFADGMTMIYIAQAFQMGAYALFIPASVYYVNCKIAPQDMVKGQSFVTTSMTVAGVFGNL IGGMLLDSVGVSQVLLISAVLSLIGAVIVVMSVEKV Prediction of potential genes in microbial genomes Time: Thu May 26 10:07:56 2011 Seq name: gi|223714126|gb|ACDT01000089.1| Coprobacillus sp. D7 cont1.89, whole genome shotgun sequence Length of sequence - 62319 bp Number of predicted genes - 68, with homology - 67 Number of transcription units - 35, operones - 17 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 212 - 271 11.7 1 1 Tu 1 . + CDS 504 - 1457 1288 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) + Term 1460 - 1514 12.6 + Prom 1500 - 1559 10.9 2 2 Op 1 . + CDS 1598 - 1780 185 ## gi|167756591|ref|ZP_02428718.1| hypothetical protein CLORAM_02128 + Prom 1826 - 1885 7.7 3 2 Op 2 . + CDS 1995 - 2738 201 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Term 2971 - 3014 -0.8 4 3 Tu 1 . - CDS 2763 - 3803 721 ## COG3835 Sugar diacid utilization regulator - Prom 3834 - 3893 11.2 + Prom 3824 - 3883 12.3 5 4 Tu 1 . + CDS 3931 - 5067 1333 ## COG1929 Glycerate kinase + Prom 5956 - 6015 9.5 6 5 Op 1 . + CDS 6053 - 6823 693 ## COG0561 Predicted hydrolases of the HAD superfamily 7 5 Op 2 . + CDS 6902 - 7375 311 ## CLL_A2607 hypothetical protein + Term 7427 - 7469 4.8 + Prom 7543 - 7602 7.9 8 6 Op 1 . + CDS 7651 - 8853 1213 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family 9 6 Op 2 . + CDS 8876 - 9631 644 ## COG2819 Predicted hydrolase of the alpha/beta superfamily 10 6 Op 3 . + CDS 9693 - 11084 1193 ## COG2199 FOG: GGDEF domain + Term 11097 - 11133 4.3 + Prom 11130 - 11189 7.3 11 7 Op 1 . + CDS 11221 - 12075 837 ## COG0561 Predicted hydrolases of the HAD superfamily 12 7 Op 2 . 
+ CDS 12121 - 12567 451 ## Fisuc_0321 GCN5-related N-acetyltransferase 13 7 Op 3 . + CDS 12567 - 13322 691 ## COG1145 Ferredoxin 14 7 Op 4 . + CDS 13319 - 14071 701 ## COG3022 Uncharacterized protein conserved in bacteria 15 7 Op 5 . + CDS 14043 - 15215 1184 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase + Prom 15232 - 15291 2.4 16 8 Tu 1 . + CDS 15346 - 15612 285 ## PROTEIN SUPPORTED gi|227371401|ref|ZP_03854885.1| SSU ribosomal protein S15P + Term 15616 - 15642 -1.0 - Term 15636 - 15669 3.1 17 9 Tu 1 . - CDS 15709 - 17034 861 ## EF2169 hypothetical protein - Prom 17134 - 17193 6.4 + Prom 17078 - 17137 7.2 18 10 Op 1 . + CDS 17182 - 18099 1104 ## Cphy_0247 hypothetical protein + Term 18108 - 18145 4.1 + Prom 18114 - 18173 7.3 19 10 Op 2 . + CDS 18198 - 19538 1315 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) + Term 19540 - 19575 5.1 - Term 19528 - 19561 4.7 20 11 Tu 1 . - CDS 19568 - 19903 449 ## gi|237733900|ref|ZP_04564381.1| predicted protein - Prom 19941 - 20000 3.5 - Term 19969 - 19999 -0.5 21 12 Tu 1 . - CDS 20008 - 21825 2161 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains - Prom 21857 - 21916 4.8 + Prom 22124 - 22183 10.5 22 13 Tu 1 . + CDS 22263 - 22781 618 ## gi|167756613|ref|ZP_02428740.1| hypothetical protein CLORAM_02150 23 14 Op 1 . - CDS 22801 - 23466 406 ## COG3619 Predicted membrane protein 24 14 Op 2 . - CDS 23505 - 24695 1428 ## COG0282 Acetate kinase - Prom 24937 - 24996 8.6 + Prom 24730 - 24789 9.6 25 15 Tu 1 . + CDS 24962 - 25558 658 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 25747 - 25790 5.5 + Prom 25606 - 25665 10.6 26 16 Tu 1 . + CDS 25804 - 26061 300 ## gi|167756617|ref|ZP_02428744.1| hypothetical protein CLORAM_02154 + Term 26203 - 26242 6.1 27 17 Tu 1 . - CDS 26091 - 26291 97 ## - Prom 26312 - 26371 3.0 + Prom 26185 - 26244 5.2 28 18 Tu 1 . 
+ CDS 26276 - 26824 739 ## COG3859 Predicted membrane protein + Term 26827 - 26861 1.1 + Prom 26837 - 26896 9.4 29 19 Op 1 . + CDS 27143 - 27451 359 ## COG3557 Uncharacterized domain/protein associated with RNAses G and E 30 19 Op 2 12/0.000 + CDS 27461 - 27910 623 ## COG0802 Predicted ATPase or kinase 31 19 Op 3 20/0.000 + CDS 27907 - 28509 672 ## COG1214 Inactive homolog of metal-dependent proteases, putative molecular chaperone 32 19 Op 4 9/0.000 + CDS 28503 - 28946 241 ## PROTEIN SUPPORTED gi|220931046|ref|YP_002507954.1| SSU ribosomal protein S18P alanine acetyltransferase 33 19 Op 5 . + CDS 28949 - 29983 672 ## PROTEIN SUPPORTED gi|229232313|ref|ZP_04356740.1| (SSU ribosomal protein S18P)-alanine acetyltransferase 34 19 Op 6 . + CDS 29976 - 30605 752 ## COG2344 AT-rich DNA-binding protein 35 19 Op 7 . + CDS 30661 - 32046 1708 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases + Term 32047 - 32090 4.0 36 20 Op 1 . + CDS 32120 - 33583 1415 ## COG1409 Predicted phosphohydrolases 37 20 Op 2 . + CDS 33593 - 33940 376 ## gi|167756627|ref|ZP_02428754.1| hypothetical protein CLORAM_02164 38 20 Op 3 . + CDS 33969 - 34982 863 ## COG1052 Lactate dehydrogenase and related dehydrogenases + Term 34997 - 35034 1.7 + Prom 35044 - 35103 9.1 39 21 Op 1 . + CDS 35165 - 35329 296 ## gi|167756628|ref|ZP_02428755.1| hypothetical protein CLORAM_02165 + Term 35332 - 35371 8.6 + Prom 35357 - 35416 8.4 40 21 Op 2 . + CDS 35443 - 36816 1169 ## COG0733 Na+-dependent transporters of the SNF family + Term 36819 - 36874 15.0 - Term 36806 - 36860 3.5 41 22 Op 1 4/0.000 - CDS 36895 - 37404 478 ## COG0700 Uncharacterized membrane protein 42 22 Op 2 . - CDS 37397 - 37969 521 ## COG2715 Uncharacterized membrane protein, required for spore maturation in B.subtilis. 43 22 Op 3 . - CDS 37962 - 38294 323 ## gi|167756632|ref|ZP_02428759.1| hypothetical protein CLORAM_02169 - Prom 38339 - 38398 2.3 44 23 Tu 1 . 
- CDS 38437 - 38970 634 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 39057 - 39116 7.8 - Term 40136 - 40184 11.4 45 24 Tu 1 . - CDS 40199 - 41722 1863 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain - Prom 41744 - 41803 8.9 + Prom 41772 - 41831 9.3 46 25 Op 1 . + CDS 41882 - 42607 858 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 47 25 Op 2 . + CDS 42679 - 42888 206 ## gi|237733924|ref|ZP_04564405.1| predicted protein + Prom 42901 - 42960 5.7 48 26 Tu 1 . + CDS 43200 - 44933 2059 ## COG0441 Threonyl-tRNA synthetase + Term 44938 - 44976 8.1 + Prom 44989 - 45048 9.0 49 27 Op 1 . + CDS 45097 - 45981 285 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 50 27 Op 2 . + CDS 46043 - 46720 755 ## CLD_0779 hypothetical protein + Term 46729 - 46760 1.1 + Prom 46722 - 46781 2.5 51 28 Op 1 16/0.000 + CDS 46803 - 47681 1004 ## COG0207 Thymidylate synthase 52 28 Op 2 2/0.000 + CDS 47681 - 48151 485 ## COG0262 Dihydrofolate reductase 53 28 Op 3 . + CDS 48156 - 48872 765 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 54 28 Op 4 2/0.000 + CDS 48932 - 49213 255 ## COG0640 Predicted transcriptional regulators 55 28 Op 5 . + CDS 49203 - 49826 532 ## COG5658 Predicted integral membrane protein + Term 49842 - 49892 7.7 + Prom 49829 - 49888 8.9 56 29 Op 1 . + CDS 49935 - 50192 334 ## COG3070 Regulator of competence-specific genes 57 29 Op 2 . + CDS 50189 - 50848 692 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) + Term 50943 - 50985 5.5 58 30 Op 1 8/0.000 - CDS 51344 - 52312 890 ## COG0524 Sugar kinases, ribokinase family 59 30 Op 2 8/0.000 - CDS 52314 - 53261 839 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase 60 30 Op 3 . - CDS 53266 - 54984 1776 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase - Prom 55018 - 55077 9.1 + Prom 55017 - 55076 8.7 61 31 Tu 1 . 
+ CDS 55117 - 55797 809 ## COG2186 Transcriptional regulators 62 32 Op 1 . - CDS 56234 - 56689 295 ## gi|167756651|ref|ZP_02428778.1| hypothetical protein CLORAM_02188 63 32 Op 2 . - CDS 56686 - 58008 995 ## COG0144 tRNA and rRNA cytosine-C5-methylases - Term 58036 - 58066 2.0 64 32 Op 3 . - CDS 58072 - 58305 359 ## gi|167756653|ref|ZP_02428780.1| hypothetical protein CLORAM_02190 - Prom 58328 - 58387 6.9 + TRNA 58631 - 58715 63.8 # Leu CAA 0 0 65 33 Op 1 1/0.286 + CDS 58972 - 59850 1059 ## COG0524 Sugar kinases, ribokinase family 66 33 Op 2 . + CDS 59893 - 61203 1184 ## COG2199 FOG: GGDEF domain + Term 61209 - 61253 5.1 67 34 Tu 1 . - CDS 61234 - 61653 407 ## gi|167756656|ref|ZP_02428783.1| hypothetical protein CLORAM_02194 - Prom 61721 - 61780 10.6 + Prom 62044 - 62103 7.5 68 35 Tu 1 . + CDS 62128 - 62317 284 ## gi|167756657|ref|ZP_02428784.1| hypothetical protein CLORAM_02195 Predicted protein(s) >gi|223714126|gb|ACDT01000089.1| GENE 1 504 - 1457 1288 317 aa, chain + ## HITS:1 COG:lin1306 KEGG:ns NR:ns ## COG: lin1306 COG0544 # Protein_GI_number: 16800374 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Listeria innocua # 5 313 116 425 427 169 34.0 7e-42 MSKVIKLGNYKGIEAEVTKQPVADTEVDNEIQRLVAQSTSLIEKDGDVTNGDVTTIDFEG FKDEVAFDGGKAEGFQLEIGSGQFIPGFEEQMIGMKKGETRELNLTFPENYGAADLAGAD VVFKVTVHKIEEKKEAELNDAFVASLNAPGIETIDQLRDNIKASLEAQHEQAFMAAKENA ALGKLIDDCEVEVEESDIEKALQQQLQHISMELASQGMQLEQYLQMMGMNQETLLQQLAP SAKQQATFEAIIDEIVAIENLETSDEEANQQVEAIAAHNQVSKEDVLNQIDIESLKRDLN RIKASRLIMDNTVFIEV >gi|223714126|gb|ACDT01000089.1| GENE 2 1598 - 1780 185 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756591|ref|ZP_02428718.1| ## NR: gi|167756591|ref|ZP_02428718.1| hypothetical protein CLORAM_02128 [Clostridium ramosum DSM 1402] # 1 60 1 60 113 103 100.0 3e-21 MKKILISIILCGCKTAVSREIKTETRYNQIEINAAMDEIEMEFANCTLLELSFDETNEQN >gi|223714126|gb|ACDT01000089.1| GENE 3 1995 - 2738 201 
247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 1 231 1 241 242 82 27 7e-15 MMQRGILITGGAHGIGKQICLDFLRQGDKVCFIDINEEASKRFESEYHDLYYYYGDVANP DTLKNYVDFAKKRLGRIDVLVNNACKGMRGILSDIDYNTFDYVLSIGLKAPYELSRLCKE ELIKNKGKIINIASSRAFQSEPDSEAYASAKGGIVALTHALAISLGPNVLVNCIAPGWIN VDEIENFGELDKASLPAGKVGTPKDISKMVLFLCGQDFITGETITIDGGMNKRMIYHGDW NWQYHEE >gi|223714126|gb|ACDT01000089.1| GENE 4 2763 - 3803 721 346 aa, chain - ## HITS:1 COG:BH2731 KEGG:ns NR:ns ## COG: BH2731 COG3835 # Protein_GI_number: 15615294 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sugar diacid utilization regulator # Organism: Bacillus halodurans # 3 343 2 362 371 133 30.0 5e-31 MYINNQLATQIVETVKDICDHDINFIDCQGIIYASTDIKRIGSFHEAGLLAIKEKRIIEV YDDTSFNGSLQGVNLPVYYHDQIIAVIGISGKPDKVKKYAYLAQKITNLLIREKELNEFN RSQAEKMHYLLQSLLDQENLDSQYFSDMLKYFQLDLKNNYRLLIILAKRLVNEQSIIQEL HSLAIPVYHYQYPNRFLAIIDADYFNRCEYKLKKLAGNQVSIAVGKQIPLIKLSESYHSA LIALRNIQDHQYINYDVLTLEIILQSISIYDKKDYLNKTIAMLSLEDLNLLHIYFEEDMS LLNTANRLFIHKNTLQYKLDRIHKLCGYNPRKFRDANILYLALKLK >gi|223714126|gb|ACDT01000089.1| GENE 5 3931 - 5067 1333 378 aa, chain + ## HITS:1 COG:STM2959 KEGG:ns NR:ns ## COG: STM2959 COG1929 # Protein_GI_number: 16766265 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Salmonella typhimurium LT2 # 1 371 1 369 380 327 51.0 2e-89 MKIVIAIDSLKGSLSSIEAGEAIKAGIEHVYDDAKVIVSPLADGGEGTVEALTEGMDGNM RSLEVTGPLGTRVVCQYGIITETKTAVIEMASAAGITLVAPDQRNPLLTTTYGVGEIIAD AITMGCRHFIIGIGGSATNDGGIGMLQALGYRFLDKYGNQVTYGAQGLEQLVEIDDSGVV SSLNECTFKIACDVNNPLCGKLGSSAVYGPQKGATAEMIKKMDKGLNDYAKLAKLKYPNA DPELPGTGAAGGLGFAFLTFLNAQLESGIKIVLEETRLEEYIKDTDIVITGEGRLDFQTA MGKAPIGVAKLAKKYHKPVIAFAGSVTADANECNHQGIDAYFPIVRGVTTLEEAMESSNA KNNLIATVEQVFRLWKIK >gi|223714126|gb|ACDT01000089.1| GENE 6 6053 - 6823 693 256 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General 
function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 256 1 256 256 173 38.0 2e-43 MIKAIFFDIDGTLVSFKNHQIPASTIESLKKLKEKGIKIFVATGRGKDGLDILDEIEFDG YITLNGQYCYVDNQIIYENTIKRDDLQKLLNYLEDHPFPCGFTEEHNKYFNLRDARVDEI HRITLNDDHPAGDCSDVVNKKIYQCMCFIDEQEEQKLMQIMPNCISARWHPLFCDVSPKG GTKQNGIDKFLDFYHIDRNETMAFGDGGNDIEMLQHVALSVAMENGNDKVKEIADYVTAD VDEDGILKALQYFSIL >gi|223714126|gb|ACDT01000089.1| GENE 7 6902 - 7375 311 157 aa, chain + ## HITS:1 COG:no KEGG:CLL_A2607 NR:ns ## KEGG: CLL_A2607 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 4 149 1 146 154 122 43.0 3e-27 MDEIKVPSTRYHKIVNLLCIIILTGTLVFLVTNWTMISDQIPGHYDSAGVINRWTSKSGL WIVYFVGIIMFAGLSVIEHFPKTWNTGVAITKENQKHVYQLIKGMIVTLKLCITTIFMFL TLYTTLMINLPKWFTPLFLIITFGSMIVYGIMIYRAR >gi|223714126|gb|ACDT01000089.1| GENE 8 7651 - 8853 1213 400 aa, chain + ## HITS:1 COG:MA0404 KEGG:ns NR:ns ## COG: MA0404 COG1453 # Protein_GI_number: 20089299 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Methanosarcina acetivorans str.C2A # 19 398 1 362 364 259 35.0 8e-69 MVDLKFTLGSTMYFEVVKVKYRELGRTGLKVSEIGLGCEGFMKMDTDQVKKFVAQASKMG INFIDFFSSNPHGRSAFGKAIAGNREKWIIEGHICTFWKNEQYLRTRKMDEVKAGFEDLL ARLQTDYIDIGMIHYIDAKKDFEEVFNGEVIEYVKDLKKRGIIKHIGISSHNPLIAIEAV KTGLVDVILFSINPCYDMLPPSENVDDLWDDERYAEPLFNIDPVRQELYELCQTKGVALT VMKAFGGGDLLDEKLSPFKVKMTPLQCIHYCLTRPGVASIMAGSHSIKEMMEALAYEDAS MVEKDFANVLANIPKHSFEGNCVYCGHCAPCVKCINIADVNKFADLCVAQGEVPETVREH YEVLAHHASECIECGVCIKNCPFNVKIIEKMKAAVAIFGY >gi|223714126|gb|ACDT01000089.1| GENE 9 8876 - 9631 644 251 aa, chain + ## HITS:1 COG:SP0882 KEGG:ns NR:ns ## COG: SP0882 COG2819 # Protein_GI_number: 15900765 # Func_class: R General function prediction only # Function: Predicted hydrolase of the alpha/beta superfamily # Organism: Streptococcus pneumoniae TIGR4 # 13 195 21 206 274 77 28.0 3e-14 
MVEKFDIVITPLGLERTIHVYLPEDYYDSDEQYPVMYMFDGHNLFYDHDATYGKSWGLKE FLDTYDKKLIIVGIECNHEGQKRLSEYCPYQIESKYFGHLNGQGKILMDWVVNELKILVD QKYRTYPFRECTGIAGSSMGGLMAFYTVIYYNKYFSKAACISPSISMCMEELKNEYTQAK IMEDTRVYFSFGTDEVKGKNGIQWMLNNILYFNDRLIESNASSYINVVEKGQHNEASWQL ENQIYLDYLWK >gi|223714126|gb|ACDT01000089.1| GENE 10 9693 - 11084 1193 463 aa, chain + ## HITS:1 COG:aq_265_2 KEGG:ns NR:ns ## COG: aq_265_2 COG2199 # Protein_GI_number: 15605806 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 295 459 2 168 168 121 36.0 4e-27 MVETMLLFLILPNMACVIQLLKIPTNQTEIIIYRDKKMKRKKRKFAINVMSLAVIPILIL GMFVIFITSSLIYTALKGEVANNLEDLARVSYQDFDMQYPGDYYLNNNHLYKGTEDIQND FNLVDLIKNNTGVDATLFYQDKRILTTIRQEDGKRVVDTIAPDEVIETVLNNDKEFFSDS VVINDSSYFGFYMPVKNTDQHVVGMLFMGRPRADVMNHIMHNIYLVSLSIVTIMIIAIIV SYYYSKKTIFALNTTKHFLGAVANGDLTTEIDPYVLQRHDEIGDMGRFAVMMKESLSDLV GKDPLTGLHNRHSCDVVLASLIQRVKQKNTSFAVAMGDVDFFKYVNDTYGHQAGDETLRQ LAKVISTHMEHLGFVFRWGGEEFVLIYEDMDRYQAFKHLEILQEQIDQESIYWKDDKVKI TMTFGLADSNEYNDLDELINLIDDNLYRGKKEGRNRIVFNTLK >gi|223714126|gb|ACDT01000089.1| GENE 11 11221 - 12075 837 284 aa, chain + ## HITS:1 COG:lin0668 KEGG:ns NR:ns ## COG: lin0668 COG0561 # Protein_GI_number: 16799743 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 284 1 284 288 152 32.0 6e-37 MIKLIVSDMDGTLLAHDSSISKGNIEAIRYAQSKGVQFAIATGRDYSSLKGILEAHDLKC FSILGNGAQFCNENGEILSSAYFPKKCFKQVLQIFDELKIHYMIFTANGFYSTAEPNVVR DAFIDRCVVQFKRKREDYLDDGCNQDMACMKLKKIGDLDDFINSSIDIIKVEAFNNDVSL IEKAKEKLQEIEGIAYLSSFDDNIEVTDKAAQKGLILENVIEELGYSKDEVMVLGDGLND ITLFERFKYSFAPGNANETIKAMAYQVVGACEEDGVSQAIYMML >gi|223714126|gb|ACDT01000089.1| GENE 12 12121 - 12567 451 148 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_0321 NR:ns ## KEGG: Fisuc_0321 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: F.succinogenes # Pathway: not_defined # 7 148 5 147 148 87 31.0 1e-16 MIEIKFLKSTDEYWDTMIDYVQNCSWRAGKSLANKMKTGYFTKWQCVVGIWYDNKIVGFS 
TFVESDSIDNSGYQPFIGYIYVNPNYRGQRLSEVMIKEIIVCARKMGFKKIYIHSSEFGL YEKFGFKIIDCGKTCSGRYENIYERKIA >gi|223714126|gb|ACDT01000089.1| GENE 13 12567 - 13322 691 251 aa, chain + ## HITS:1 COG:MA4170 KEGG:ns NR:ns ## COG: MA4170 COG1145 # Protein_GI_number: 20092963 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanosarcina acetivorans str.C2A # 2 251 12 264 294 72 29.0 1e-12 MILYFSGTGNSRYIAEVINSVIEDKLVSINECLKNNLIVSLEQQYIIVCPTYAWRIPRVV EQFIRTNHFVSGTKIYFIMTCGGDIGNAAKYINRLCNEKNMILMGVGEIVMPENFITMFK APENPEVIISHGVETALLLAAEIKAGKYFLETKPTFIGRFNSGIVNRLFYPMFVRAKGFH VEDTCIGCGQCIHACPLKNISLVDSRPIWDKHCTQCMACISICPKAAIEYKNKTKGKRRY YLKDSYNKEIK >gi|223714126|gb|ACDT01000089.1| GENE 14 13319 - 14071 701 250 aa, chain + ## HITS:1 COG:CC3385 KEGG:ns NR:ns ## COG: CC3385 COG3022 # Protein_GI_number: 16127615 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Caulobacter vibrioides # 1 248 1 254 255 153 34.0 3e-37 MKIVITPAKRMNDNIDYIDVESLPVFLPQTEELLGILKTLKVDEVKKMLGCNDQIAQIAY LNYQMMDLKKDTVPALLAYEGIQYTNTAAHVLTDDDYEYTKQHLRILSGFYGILRPFDGV VPYRLELNNKVKTEKFKSLYEFWNSRIYDELTKDDNQILDLGAKQYTKIIKKYLTSSIKY VKCHFKEESDEGYREIGVYVKMARGQMVRYLIENRIDSFEAVKQFNYLGYQYCEALSDVE TYVFIRKKGT >gi|223714126|gb|ACDT01000089.1| GENE 15 14043 - 15215 1184 390 aa, chain + ## HITS:1 COG:BH0421 KEGG:ns NR:ns ## COG: BH0421 COG1820 # Protein_GI_number: 15612984 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Bacillus halodurans # 15 388 1 389 397 219 34.0 5e-57 MYLYGKRGHDMKTLIKNGIVILDGIKRLNNGAVMIEGGKITGIYKDYEGLEADSVIDVQN NYIIPGLLDTHTHGAMGYDFNKYSSKQELEIISDSLLDEGVTGFNASIVCESHHDTLNLL QMYEGNTPDNLIGVHLEGPYLNIKNKGVMKEEHLRLVNFDEFNQYISTCRKVNAMTIAPE LPNALELINYASNHGYVMNVGHSSASAKEVKKAQAYGAKGITHLYNAMSQHLHRDPGVVT GAILSDLMCELIVDGFHIDEDVIRATYKAIGKERIILITDANPCKGLPDGQYHFSGKDIV IVGGHATVKETGRIAGSTLGLNEACANMMRYCDCSIDDAVLMAAVNPAKLYGLKQGKIEI GYQGDIVVIDKEFTILAVINRGIIKRNKFI 
>gi|223714126|gb|ACDT01000089.1| GENE 16 15346 - 15612 285 88 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227371401|ref|ZP_03854885.1| SSU ribosomal protein S15P [Veillonella parvula DSM 2008] # 1 88 1 88 88 114 62 1e-24 MLTKAEKTAIMQKYATKEGDTGSPEVQIAVLTADINKLNGHFKQHPKDNHSNRGLLKKVG RRRDLLKYLKNKDIERYSSLCESLGLRK >gi|223714126|gb|ACDT01000089.1| GENE 17 15709 - 17034 861 441 aa, chain - ## HITS:1 COG:no KEGG:EF2169 NR:ns ## KEGG: EF2169 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 54 438 64 452 467 85 25.0 4e-15 MQTYLLSRKYFKLAFIIYAFINTLALGYLGFNLNILLFIMIGWGGLIIGWDFYQKQINYH DIKFILIITYGLILFLATAINPYSNLKSYLIALLQLMIFFLIYGNQKSFSSNNIYDELKS IIPLTNLLTGLAGICSLIMYLFKFSKIQNGWPIGLVGNRLFGVYFNCNPAAFLSAITILL ALYALKNEYTHPRLYKLNIIVQLLYIILTQCRAALVILAVILTGLIYTYFFKDCYYSKLK KYLFSLGLSIIIFITSIITANGLSYLSGNHHETSSRFQINKVIESIELFFQGDFQPSLDQ INQISSGRIELFNTSIEIWQTNPILGIGAGNFQTIGRHLTNSNVVKQIQVVHSHNVFLET LVTTGIAGASLFIIFFIASFKSILKLFKVKQQSSNYLKIMIFTLIVVSEFIGGMFDYGVF YVYSLSATLCWLFLGYLHDYH >gi|223714126|gb|ACDT01000089.1| GENE 18 17182 - 18099 1104 305 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0247 NR:ns ## KEGG: Cphy_0247 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 305 1 305 306 331 55.0 2e-89 MKNITLVIMAAGIGSRFGGGIKQLAPVGPNGEIIMDYSIYDAKKAGFNKVVFVIRKDLEA EFDAVIGSRIKQIIDVEYVFQELDNIPDEYQETFKQRTKPWGTGQAILCCKDVINEPFLV INADDYYGKQAYEEAYKYLNENHDNTAKQQIAMVSFVLKNTLSDNGGVTRGVCTVDQNNH LCDIVETHNIEKGDNCAFVKKENQIIELDLEVPVSMNMWALQPEIFDILDSKFKLFLSQL ENDNFKDEYLLPTIIGSLLKENQVEVTVLKSLDNWFGVTYKEDKQTVIESIQKLTAVGVY PQKLF >gi|223714126|gb|ACDT01000089.1| GENE 19 18198 - 19538 1315 446 aa, chain + ## HITS:1 COG:TP0917 KEGG:ns NR:ns ## COG: TP0917 COG2239 # Protein_GI_number: 15639902 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Treponema pallidum # 2 444 3 445 449 404 48.0 1e-112 MEQIKLIEKLLEIRKYKEIKEILKEMNDVDVAEMLEGFSDENMIRIFRLLPKDDAADIFA 
YMSSDREHALIDSLTNKELENIINDLYSDDVMELLEELPANVVKRIIAASNPETRRDINH LLRYPEDSAGSNMNIDFVDLRADMTVKEAIARIRRIGVDKETINTCYVIDNFRHLLGIVT LRKLVLSSQSALIEEIMNDNLITVHTMEDQEDVAHDFQKYDLTSMPVVDNENRLVGIITV DDIVDIMQEETTEDIEKMAAMVPSDKPYIKNGPFETFKKRIPWLLLLMISASITGKIIQG FEHALAGSVILTAFIPMLMDTGGNSGSQASVSIIRALSLDEIKHSDIVKIVFKEFRVAIL VGVTLAACNFIKMMIIDHVSIMIAAVVCLTLIVTVIIAKIVGCTLPILADKLGFDPAVMA SPFITTIVDALSLLIYFTIATNLLNL >gi|223714126|gb|ACDT01000089.1| GENE 20 19568 - 19903 449 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733900|ref|ZP_04564381.1| ## NR: gi|237733900|ref|ZP_04564381.1| predicted protein [Mollicutes bacterium D7] # 1 111 1 111 111 139 100.0 5e-32 MNSCNRDFCIDDCCMDNCCIEGCCVDERCALAAVLQSVAMQEGALAAILCAESEKIKKAV CLAKCIDELIAINESAAQTIGTVKELENALKEKACCAIEALQDLRNNDSCK >gi|223714126|gb|ACDT01000089.1| GENE 21 20008 - 21825 2161 605 aa, chain - ## HITS:1 COG:CAC0158 KEGG:ns NR:ns ## COG: CAC0158 COG0449 # Protein_GI_number: 15893453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Clostridium acetobutylicum # 1 605 1 608 608 624 53.0 1e-178 MCGITAFSGKEEALPFLLQGLSKLEYRGYDSAGVTLVDKDKLFTIKTKGRLQNLIDRLDQ DTPIGCVGIGHTRWATHGVPSNLNSHPHTNNKNTISLVHNGIIENYRELKEQLVAKGYKF HSETDSEVVVHLLDSYYDGDMLKALKKVITHIDGSYALCIVSTLEPDVVYVTKKDSPLVL GTSDCASFGASDIPALLDYTKDVYFIDDFEIAKLCKNKITFYDAEGNEIQKEITHIPYDN EAAQKGGYDTFMLKEIHEQSYAISETLRGRVEGNDRIILPELEILKERFTTFNKVYFVAC GTAYHACLSGANIMERLTGIPTFTQAASEFRYGDPIIDEKTLCIFVSQSGETADTMAALR LAKNKGCTTIAVANVLGSTISREAEATIYTCAGPEIAVASTKAYTTQVIVLLLLAMYVAQ TLGKENDIYKDIINGIAKLPKQIENILKDEPLFEKYANYLKNQKDAYYIGRSLDYASVLE GALKLKEVSYIHADAYIAGELKHGPIALIEEGSVVIAVATQPHIASKTISNIQETIARGA KVILFTLTGEEVGNVDETYYMPDVNPILQAVLVAIPLQLISYYAAKLKGCDVDKPRNLAK SVTVE >gi|223714126|gb|ACDT01000089.1| GENE 22 22263 - 22781 618 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756613|ref|ZP_02428740.1| ## NR: gi|167756613|ref|ZP_02428740.1| hypothetical protein CLORAM_02150 [Clostridium 
ramosum DSM 1402] # 1 172 1 172 172 341 100.0 1e-92 MSVHENLIEISKSLENLEETGIYDVEKYSCSLKHLTNQLEYFYCHLVKENGRDIDSTYYH LSNRPQVHQLAYFNIGRGFPKELMDGHWCYILKDLGYKMLIIPCTSIKGTSANPEFEMDI EVMMSGVKTKSRLQLSEIRCVDMQRLDLRKTFCDVLSSHDDIIKYVKEHLLK >gi|223714126|gb|ACDT01000089.1| GENE 23 22801 - 23466 406 221 aa, chain - ## HITS:1 COG:MA1282 KEGG:ns NR:ns ## COG: MA1282 COG3619 # Protein_GI_number: 20090146 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 13 219 20 226 240 144 41.0 1e-34 MKIKNFLKATIHESLLTGILLAIVGGFLDIYTYLLKGNVFANAQTGNMVLMGLKIAEQNY LGALYYLLPISAFFLGIVISEYIKHRLSNVQYVEWQHLILIIEIIILTIIAFIPKEIPYS VCNVTIGLVCSLQVNTFRTTNGLPYASTMCTGNLRSAGQKLSAYLFAHDKDALHHCFRYL IIIFFFIIGAIIATGLINIFGQCALLFCCILLLITLLILVY >gi|223714126|gb|ACDT01000089.1| GENE 24 23505 - 24695 1428 396 aa, chain - ## HITS:1 COG:BS_ackA KEGG:ns NR:ns ## COG: BS_ackA COG0282 # Protein_GI_number: 16079999 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Bacillus subtilis # 1 393 1 393 395 424 53.0 1e-118 MAKIISVNAGSSSLKFQLFEMDDESVITSGVIERIGLEDSIFTIKYQGKKTVTTNPIKDH KVAVQLLLDTLIEKGIVKELNEIKGVGHRVVQGGSYFDSSAIIDEDVVSKIDELKSLAPL HNPAHLTGYYAFKEAIPEAGAVAVFDTAFHQTLDPKCYIYPIPYKYYTDYKVRKYGAHGT SHFYVSQRAIEMLGNPEHSKVIVAHLGAGGSLTAVKDGKSVNTSMGFTPLAGIMMGTRSG DVDPSVIDYLIEEVGMDMKEVITMLNKESGLLGISGVSSDFRDVQNAALEGNERAQLAID IFYRRVIAYIGRYFIALGGVDAICFTAGIGENSFFARKGICDLLKDALGIELDEDANVNG QGDRLISTPNSKVKVFVIPTNEELVIARDTKRLLNL >gi|223714126|gb|ACDT01000089.1| GENE 25 24962 - 25558 658 198 aa, chain + ## HITS:1 COG:pli0059 KEGG:ns NR:ns ## COG: pli0059 COG1961 # Protein_GI_number: 18450341 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 7 193 4 194 199 99 34.0 3e-21 MENKVYGYARVSTREQNEDRQMIALENYPMSRKQIYLDKLSGKDFNRPQYQKLLRKIRAG DTIVIKSIDRLGRDYEEIQNQWRKITKEKNVNIVVLDMPLLDTRSTGDNLTGTFVADLVL 
QILSYVAQTERENIRQRQREGIEAARMRGVRFGRPRKEIPENFEILKHQWQQNLIASRQA AKELGVSQDTFLRWAHGK >gi|223714126|gb|ACDT01000089.1| GENE 26 25804 - 26061 300 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756617|ref|ZP_02428744.1| ## NR: gi|167756617|ref|ZP_02428744.1| hypothetical protein CLORAM_02154 [Clostridium ramosum DSM 1402] # 1 85 1 85 85 155 100.0 8e-37 MGLLEKIADIMQCTYISDLRGRLSLDHEQLQFLNELRAETYALSEWQEAVRYITGNLDSF NSAAEGKNILLDYYIKKEATEISIK >gi|223714126|gb|ACDT01000089.1| GENE 27 26091 - 26291 97 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFCFSYLFFPFLMKEPNKIKKCFFANAKKQTILNLLSLRWHYPVQMIRDQAYALLSNFAP LARILL >gi|223714126|gb|ACDT01000089.1| GENE 28 26276 - 26824 739 182 aa, chain + ## HITS:1 COG:BS_yuaJ KEGG:ns NR:ns ## COG: BS_yuaJ COG3859 # Protein_GI_number: 16080151 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 16 181 13 191 192 79 31.0 3e-15 MKNKTFKITTKTIAYMAMFMALQVVLELAFKVVPGQPQGGSITLSLVPILLASYLLGGYY GIVVGACCCGLHFVLGLATYWGPWSLFLDYLIPLAVIGIASFFKNFEIKGHVVYPGIIVV MILKFISHYLSGAWLFGAYAPEGMNPWWYSFGYNLAYCLPTLIICYIGFALIYPRLNTSI KL >gi|223714126|gb|ACDT01000089.1| GENE 29 27143 - 27451 359 102 aa, chain + ## HITS:1 COG:BH0940 KEGG:ns NR:ns ## COG: BH0940 COG3557 # Protein_GI_number: 15613503 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Uncharacterized domain/protein associated with RNAses G and E # Organism: Bacillus halodurans # 1 96 76 171 175 111 50.0 3e-25 MIRKTGVYYYCNIASPTLYDGEALKYIDYDLDLKVFPDNNYRVLDEEEYKQHAKQMNYSK ELDRILKKQLELLIDLATKREGPFSEDFTKHWYHVYEDLTQG >gi|223714126|gb|ACDT01000089.1| GENE 30 27461 - 27910 623 149 aa, chain + ## HITS:1 COG:BS_ydiB KEGG:ns NR:ns ## COG: BS_ydiB COG0802 # Protein_GI_number: 16077658 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Bacillus subtilis # 9 149 10 151 158 138 49.0 3e-33 MEKVIKVNNLEETIALGNRLGLLLQPNMLLTLSGDLGAGKTTFTKGIGQGLGITKVINSP 
TFTILKQYQGRLNLSHFDAYRLEGQDDDLGFEEIFDSDDVCVVEWANFIEDILPVDRLTI EIKKIDENIREFVFKTNSEKYAQVVEALT >gi|223714126|gb|ACDT01000089.1| GENE 31 27907 - 28509 672 200 aa, chain + ## HITS:1 COG:SA1856 KEGG:ns NR:ns ## COG: SA1856 COG1214 # Protein_GI_number: 15927626 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Inactive homolog of metal-dependent proteases, putative molecular chaperone # Organism: Staphylococcus aureus N315 # 1 198 10 220 229 79 27.0 6e-15 MKTIVMDTSNAYLVIGLYEDDQCIDKYQADGNRRQSEYALTHLDEMLKKHHWEVLNVDEM IITIGPGSYTGQRVALTIAKTLAAISKIKIKAVSSLHGYAGASKAISVIDARSKKIFVGV YEHNQAIIDDQIMLIDDFANFKEQYPDFQVIGDSDLVGINKVKLDLSDCIYQAGKSVDYC QNIDNLVPHYLKDVEAKKIC >gi|223714126|gb|ACDT01000089.1| GENE 32 28503 - 28946 241 147 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|220931046|ref|YP_002507954.1| SSU ribosomal protein S18P alanine acetyltransferase [Halothermothrix orenii H 168] # 1 146 3 148 151 97 35 2e-19 MLIRRMDLGDIEEVVKLEHDLFSSPWNEEAFKYELEKNAFSSILILEDNNIIVGYIGMWT LGDQTQITTIGVRKEFQGKGYAKILMTKCDEITKHLGYSNINLEVRVSNTKAISLYQKCG FKIAATRKNYYQDNHEDAYLMVKEMEE >gi|223714126|gb|ACDT01000089.1| GENE 33 28949 - 29983 672 344 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229232313|ref|ZP_04356740.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Cryptobacterium curtum DSM 15641] # 3 333 518 857 860 263 46 1e-81 MSLILAIESSCDEMAMAILKDEREFLSSVVASQIDVHAMYGGVVPEIASRKHVECVSLVL KETLNKANVTIDEIDAIAVTKGPGLVGSLHIGLQAAKTIAMAYHKPLIGVHHIAGHIYAN NYVQNIIYPSLCLVVSGGHSELVLLKAPFKFEVIGQTLDDAVGEAYDKVGRVLNLPYPGG PIIDKMAAKGNHTYNLPVPLDDESYNFSFSGLKSAVINLNHKAMQRNEEINQEDLAASFQ DVVLSVLVNKTIRATKEYGIKQVLMAGGVSANRGLRNAMGEAVGQLDGVELLLPPMSCCT DNAMMIALAAKQMYDLKLFSDLSLGIKPNLDLENESVSGGNQNG >gi|223714126|gb|ACDT01000089.1| GENE 34 29976 - 30605 752 209 aa, chain + ## HITS:1 COG:BH0551 KEGG:ns NR:ns ## COG: BH0551 COG2344 # Protein_GI_number: 15613114 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Bacillus halodurans # 4 201 7 
203 211 135 38.0 5e-32 MDKKISNATMSRYPVYLKALRKMQHEGKENCLSSELSSLTGIQDTTIRRDFTYLSKTDNF GQRGKGYDVKHLIDGLSEVLGLGLDESIILIGIGNLGSAILKYNRWQYTVGKIVCGYDQD KNKEGERFGVKLYNIDDLEKTFPKDCKIAILAISENVQPTVDRLMDLGIKGIVDFTHTHF TVREGVEVQQVDVVVAIQELVIKMDSNRQ >gi|223714126|gb|ACDT01000089.1| GENE 35 30661 - 32046 1708 461 aa, chain + ## HITS:1 COG:CAC3260 KEGG:ns NR:ns ## COG: CAC3260 COG0017 # Protein_GI_number: 15896505 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 3 461 2 463 463 545 57.0 1e-155 MFEFITVRELYEMFLSNQMMDLEGYEYVELEGWVRTNRNNGKLGFIALNDGTYFKNLQVV YTEAEISNYEEIDKLSTGSSIRVVGKLKLTPEGKQPFEVEATEIEIEGKCDEDFPLQKKR HSFEYMREIPHLRPRANTFYAIFRLRSVLSMAIHEFFQSQGFVYVHTPIITGNDGEGAGE MFRATTIDGTNFEDDFFEKEAFLTVTGQLHVEAFAMAFRDVYTFGPAFRAENSNTSRHAS EFWMIEPEIAFADLEDDMDLIEDMVKYCIDYVLDNAPAEMEFFASMIDKDCINRITKVKN SDFKRMTYTEAIEILEKADVKFKNKVSWGMDLNSEHERYICEQVVKGPVFLTDYPKEIKA FYMRLNDDNKTVAACDLLVPGIGELVGGSQREERYDVLERIMDEKGMSKDGLQWYMDLRR YGGCKHAGFGLGFDRFLMYLTGMQNIRDVEPFPRTPRNLKF >gi|223714126|gb|ACDT01000089.1| GENE 36 32120 - 33583 1415 487 aa, chain + ## HITS:1 COG:MTH1722 KEGG:ns NR:ns ## COG: MTH1722 COG1409 # Protein_GI_number: 15679714 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Methanothermobacter thermautotrophicus # 206 437 7 203 262 69 27.0 2e-11 MNEYDLINAYLNLIFDDSITYFSRILDEYRGLCQLRKWMRKRGINLKAIENIGYDMFLAF SKAKNAEQIMEKMDPTHLEELKEYINTERNKYQIVVLSSFDYSEPINAYLTSNQIVDDLD SDNIRVLNCHNLLKKEDLSFYGPFKHFVTALYQIDRWPGFLIFKGDECAFVPIYTTQDIE EVIFKIEHDTIFSIRYLSPYDSFFVQLSDLHLGSKKKNIGRLALEDSLDQTVKQLRSYYP LKFLITGDLMNSPNRKNMYEASGFMKGLKNRYNADVSFILGNHDVIVNGFNIFKRQKSKV IAFLLNESIKVFEEEKIVLIKIDSTVEGNLARGKVGMKQLNNIDEELAAIKNLGDYTIVA MLHHHLFPITRDDFLKQKWREKIFVGKIMDSSKALVDSRDLVEWLRKHQIQYVLHGHKHL PFFSSQEGMYVIAAGSSCGSGAKESKSRYLSYNVLKYDHRYKRFKYCFIIYDDMTMQDRQ RIVINIF >gi|223714126|gb|ACDT01000089.1| GENE 37 33593 - 33940 376 115 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|167756627|ref|ZP_02428754.1| ## NR: gi|167756627|ref|ZP_02428754.1| hypothetical protein CLORAM_02164 [Clostridium ramosum DSM 1402] # 1 113 1 113 462 229 100.0 6e-59 MKLVDGMKDEKLGSALLYCFTKGYGAVPMDLYNLVLPLLFDDIFREEVKISKDFSSCIRA CIQKNSEFVNQILNSVEELEEVTSRSLGISLLNKDLDFQINDGVMSGNCRPSKNS >gi|223714126|gb|ACDT01000089.1| GENE 38 33969 - 34982 863 337 aa, chain + ## HITS:1 COG:Cj0373 KEGG:ns NR:ns ## COG: Cj0373 COG1052 # Protein_GI_number: 15791740 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Campylobacter jejuni # 19 328 2 309 311 257 45.0 2e-68 MIQGKGFDEVVTMMKSDFKIVFLDSETLGSDIDVSKLNRYGDVTIYHQSQVSEVPERIKD ASVVITNKHILKEEQLKDAHKLKLICVTATGVNNIDLEYCKAAGITVCNVKGYSTNAVAQ HTFALLLDLYNKNHYYHNYVESGNYASSSMFTHLGYTFHELTGKTWGIVGMGDIGRKVAE IATAFGCQVQYYSTSGKNNQQNYQQVDFDTLLESSDIISIHAPLNSQTENLFDQSAFAKM KDSAYLINVGRGKIVNEADLVEALKEHRLAGAGLDVFENEPFCCNSPLLEIKDATKLIMT PHIAWAPIETRNRVIEEVCLNIEGFKTNNLRNVCNML >gi|223714126|gb|ACDT01000089.1| GENE 39 35165 - 35329 296 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756628|ref|ZP_02428755.1| ## NR: gi|167756628|ref|ZP_02428755.1| hypothetical protein CLORAM_02165 [Clostridium ramosum DSM 1402] # 1 54 1 54 54 89 100.0 6e-17 MKYAVVDGKLTHVNKVPKGTIAREFGYSNYPVIACKGKYRSYWKYVSVNKANYA >gi|223714126|gb|ACDT01000089.1| GENE 40 35443 - 36816 1169 457 aa, chain + ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 3 456 8 457 459 464 57.0 1e-130 MKEREKFSSRLGFILISAGCAIGLGNVWRFPYVVGQYGGALFVLIYLFFLIALGAPIMTM EFAVGRASQKSAALAFHSLVPKRKFWGAIGKFQMAGNYMLMMFYTTVGGWMIAYFFKMIK GDFNGITPAQVGDIFNNSLQDPFMQVFWMIVVVVLGFGICSLGLQNGVEKITKIMMSSLL IIILILVIRAVTLPGGRDGLAFYLIPNLAAIKQYGLLEIIFAAMGQAFFTLSLGIGAIAI 
FGSYIGKERRLFGETISVCTLDTLVALLSGLIIFPACFAFGVNPESGPGLVFITLPSIFN EMWLGQVWGCLFFVFMSFAALSTIIAVFENIISFGMDLFNWSRKKAVVVNFIIILILSLP CALGFNILSDITPFGIGTTIQDLEDFLVTYNFLPLGSLAYLLFCTCRYGWGFDGFLQEAN TGTGIKFPKWTKNYLKFILPCVIIFIFIQGYYTFFIK >gi|223714126|gb|ACDT01000089.1| GENE 41 36895 - 37404 478 169 aa, chain - ## HITS:1 COG:CAC0470 KEGG:ns NR:ns ## COG: CAC0470 COG0700 # Protein_GI_number: 15893761 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Clostridium acetobutylicum # 3 162 7 166 173 87 31.0 1e-17 MDSVILIIILIALVEASFKHVDIFNEFIEGVKEGSKLLITLFPTMMAFTLWVTCFQYCGL IELLEIGFKSIFAILNIPIDIFMMMIVRPFSSNGSLTILNNIFLKYGVDHPYSILGSIIQ TGSDTTFYVVTLYFGSIKLQNNRYALKLGLWLDAIACLLATIAYLLFIG >gi|223714126|gb|ACDT01000089.1| GENE 42 37397 - 37969 521 190 aa, chain - ## HITS:1 COG:CAC0469 KEGG:ns NR:ns ## COG: CAC0469 COG2715 # Protein_GI_number: 15893760 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for spore maturation in B.subtilis. 
# Organism: Clostridium acetobutylicum # 14 189 15 190 191 82 25.0 6e-16 MAKVFKFGLIFFVIIAFCTSNQNKMFEAILSSPQQVFDLIKVIVLSACLWNGFLNIIKAS GLIKQLSFLLKPILKLIYGNTVEDDQVYLYLSTNFIANLLGVGSLASISGLKAMQSLTKY QRDPKTPCKEMMLLVILNTTGLSIIPTTMMTLRQSYGSHDILGFFGYSLTIGFVITIVGI IASKVIEHYG >gi|223714126|gb|ACDT01000089.1| GENE 43 37962 - 38294 323 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756632|ref|ZP_02428759.1| ## NR: gi|167756632|ref|ZP_02428759.1| hypothetical protein CLORAM_02169 [Clostridium ramosum DSM 1402] # 1 110 233 342 342 218 100.0 1e-55 MARKDDLDLIIVTFNCGNDFEFHQKKFEECYANLKATKIFSKGIVEFKQQQYYLDFDVYV NKKPTDKVEVSLNNNEIVILLNNQPIAKQKVVPYNYFLGFKMVLKDLFYG >gi|223714126|gb|ACDT01000089.1| GENE 44 38437 - 38970 634 177 aa, chain - ## HITS:1 COG:BS_dacB KEGG:ns NR:ns ## COG: BS_dacB COG1686 # Protein_GI_number: 16079376 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus subtilis # 5 170 11 177 382 132 42.0 5e-31 MKLGLIISLMMTLVNTNVSAVPFQGKSYIVMEASHQIVIEGSNQDYIQSVASISKIMTCI IAIENMELDTIITVDDTINKAWGSGVYIHIGDKISLRDLLYGLMLRSGNDAAVMIAKAVG KEIPKFVDMMNDKAKELQLHSTTFSNPTGLDEEDNGNQSTVYDMARLMAYCHQNPIF >gi|223714126|gb|ACDT01000089.1| GENE 45 40199 - 41722 1863 507 aa, chain - ## HITS:1 COG:CAC3230 KEGG:ns NR:ns ## COG: CAC3230 COG4624 # Protein_GI_number: 15896476 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 135 502 50 429 450 202 36.0 1e-51 MRGIYSSLTKIRRKVFVEVAKLAYEGNDYSRIENLPYKVIPGEVATYRDSIFLERAIVGE RIRLTMGLSLRPIDEPAPISKGIEESIIADKYYEPPLINIIKFACNKCPEKKFEVSSGCQ ACLAHPCIEVCPKNAISFKDGKAYIDQDKCIKCGLCKTNCPYDAILKRERPCAKACGMDA IETDEYGNAHINYDKCVSCGMCLVSCPFGAIADKSQIFQLIQAIKAGDEIIAAVAPAFIG QFGPKVTPEVLKKAMQQLGFKDVVEVAIGADLCTIDEAKDFLEKVPAQQPFMATSCCPSW SMMAKKEFPEFKSYISMALTPMVLTARLIKEKHPNSRVVFVGPCAAKKLEASRKSVRSEV DFVLTFEEIGAMFEAKGIDFASLKPDEADPFTQASSDGRGFAVSGGVAKAVVNCIKAKEP 
EREVLVESAEGLANCKKMLKLAKSGKYDGYLLEGMACPGGCVAGAGTLLPITRATTAVKK YTNSSDKLNANDSKYKDYLERLIEEYK >gi|223714126|gb|ACDT01000089.1| GENE 46 41882 - 42607 858 241 aa, chain + ## HITS:1 COG:lin2063 KEGG:ns NR:ns ## COG: lin2063 COG1187 # Protein_GI_number: 16801129 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Listeria innocua # 1 234 1 234 244 225 51.0 7e-59 MERLQKAIAASGYTSRRKAEDLIVQGKVEVNGKVVTELGTKVKQGDLITVEGKPLVGENK VYYVFYKPKGCVCTLDDEFDRPIITDYFTEVKERIYPVGRLDFDTTGLIIMSNDGEFANM IMHPRSHLEKIYEVSIKGLITGPTLNKLEKGVYLEGVKTLPCKIKVVDKDTEHKTTMLKI KLVEGKNRQVKKMFESVGHPVKRLHRVSIGGINLKGLTPGRYRILKPQEVKDLKKLVNTD K >gi|223714126|gb|ACDT01000089.1| GENE 47 42679 - 42888 206 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733924|ref|ZP_04564405.1| ## NR: gi|237733924|ref|ZP_04564405.1| predicted protein [Mollicutes bacterium D7] # 1 69 1 69 69 129 100.0 5e-29 MRIQIVEPQNKIECGICKAEGDWIKRINVRGIQALYCIKCDTVTMFNKMPSKFVYKAIKK RNRKYSDGL >gi|223714126|gb|ACDT01000089.1| GENE 48 43200 - 44933 2059 577 aa, chain + ## HITS:1 COG:FN0611 KEGG:ns NR:ns ## COG: FN0611 COG0441 # Protein_GI_number: 19703946 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 2 565 41 606 620 667 56.0 0 MIDFKEDEKLSVLNHSCAHVMAQAIKHLYPQAKFWVGPVVSEGFYYDVDLGNDVIKDEDI PKIEKEMKKICKDGKRITRQEISKEEALEMFKDDEYKIDLINNFENDTTISCYRQGDFVD LCRGPHVETVKACKNFKLTKHSGAYWKGDKENKVLQRIYGVCFETSEDLAKHLELLEEAK RRDHKKLGKELGLFMMSEYAPGMPFFLPKGMILRNTLEQFWYEEHAKEGYEFIKTPIMMS RELWETSGHWENYKEDMYTSMVDDREFAIKPMNCPGSLLVYKNSLHSYKDLPLRMGELGQ VHRHEASGALNGLFRVRTFTQDDAHIYMTPDQIEGEIIRLINFIDRVYGSLNLSYEIELS TRPEKKYIGDLAIWEKSEAALAAACKAAGKDYKVNPGDGAFYGPKLDFHVKDSLGRVWQC GTIQLDMNLPERFDITYIDDKGEKVRPVMLHRVIFGSIERFIGILIEHFAGVFPLWLAPV QVKVLPVNNEYHLDYAKEVTELLKDKGFKVELDAREEKLGYRIREGQMEKVPYLLVLGNN ERDEKTVTYRKHGEQKQITVPFDDFVAMLNQQIVDKK >gi|223714126|gb|ACDT01000089.1| GENE 49 
45097 - 45981 285 294 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 5 287 6 313 319 114 27 1e-24 MKYYLAMDVGGTSIKYGVVNDQGKIINTDKIVTPDSLEKMYQAMGEIYHNCNYEVTGIAL SMPGAVNSEVGNIEGASALDYIHGPNIKEDLQKRFNTKVSIENDANCAALAEVWKGSGSD VDDCMFIVSGTGIGGAVVKDRMIHKGKHLHGGEFGYMVALNDLDNDSYISWSTAGSTVAT VKGVAKELGVDYQTLDGKVIFDQAQDNPIYQKYVDRYYSVLAMGIYNLQYVYDPEKIIIG GAISVRPDLIEQIESRLEKIYDSIPVAKIHPKVVKCRFGNEANLIGAVYHFIHS >gi|223714126|gb|ACDT01000089.1| GENE 50 46043 - 46720 755 225 aa, chain + ## HITS:1 COG:no KEGG:CLD_0779 NR:ns ## KEGG: CLD_0779 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B1 # Pathway: not_defined # 3 219 10 225 240 228 57.0 1e-58 MKREAPKINTQLRSHTIMVPECIRNASGIVINGKRIKSLLFSTDVAVISNCNADAVIAVY PFTPTMQITNSIIDVAQRPVFAGVGGGTTAGPRVREIALDAELHGATAVVLNAPTKTEFI QELSDYVDIPVVLSIVSLDENLEERMLHSGATIVNVSGGKNTVAIVKALREISQDFPIIA TGGPNLETIQAVIEAGANAITYTPPSNGEIFKAMMEDYRQRCQHQ >gi|223714126|gb|ACDT01000089.1| GENE 51 46803 - 47681 1004 292 aa, chain + ## HITS:1 COG:MPN320 KEGG:ns NR:ns ## COG: MPN320 COG0207 # Protein_GI_number: 13508059 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Mycoplasma pneumoniae # 1 292 42 328 328 358 57.0 6e-99 MKQYLDMCKYVLENGENREDRTGTGTRSVFGYQTRYDLRDGFPLMTTKKMFLRPIAEELL WFIKGDTNIKYLVDRNVKIWNEWPYEDFKKSPDFNGETIEEFVEKIKTDDEFAKKHGDLG PVYGAQWRDFNYEGVDQLAKLVDSLTNNPFSRRHIICAWNPAQVDNMALPPCHAFLQFYV SADKKYLSCQLYQRSADIFLGVPFNIASYALMTEMLARTCGYEAKEFIHTIGDAHIYKDH FEVVKQQIAREPLPKCKLVLNPDVKSIFDYKIEDIKIEDYQSHGKLVGKVSV >gi|223714126|gb|ACDT01000089.1| GENE 52 47681 - 48151 485 156 aa, chain + ## HITS:1 COG:BS_dfrA KEGG:ns NR:ns ## COG: BS_dfrA COG0262 # Protein_GI_number: 16079240 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Bacillus subtilis # 1 143 1 140 168 115 43.0 3e-26 MINIIVATDEDLLIGKKDSRNGMPWNVPEDLQHFKATTLNKTILMGLTTYQAIGRPLPNR 
KTIVVSFEPFEDERVEVRSSLEEVIEEYRSSGEDLFISGGASIYKQCLPIADQLLISRIP GKHEGETYFPNFDEYGYKLVAQKPFETFTLETYKRG >gi|223714126|gb|ACDT01000089.1| GENE 53 48156 - 48872 765 238 aa, chain + ## HITS:1 COG:CAC0965 KEGG:ns NR:ns ## COG: CAC0965 COG0204 # Protein_GI_number: 15894252 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Clostridium acetobutylicum # 25 233 23 236 241 98 29.0 1e-20 MKRIALIVIRVIWRLPYWWGYKVNKYKHIDKYSADERYGFIHQVVAIVTRKARVELECYG VENLPKKSGYFVAPNHQGLFDPLAIFQTHSRPIRAIVKKELASVILVKDVIKMLEFIPMD RSNVRESAKIIKYVANEVAAGKNFFVFPEGTRSRDGNNILEFKGGTFKIATKAKAPIVPV ALIDCYKVFDNNTIKKTTAQIHYLKPIYYEEYKDLHTNDIAKLVHDRIEKCIKENDKG >gi|223714126|gb|ACDT01000089.1| GENE 54 48932 - 49213 255 93 aa, chain + ## HITS:1 COG:BS_yvbA KEGG:ns NR:ns ## COG: BS_yvbA COG0640 # Protein_GI_number: 16080432 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 9 86 5 82 90 82 44.0 2e-16 MIVGFAETFKALSDPVRREILVLLKDGKLSAGEIAGCFEMTQATVSYHLSKLKQADLIFE NKYKNYIYYEINTSVFEEVMLWLSQFGGFKDET >gi|223714126|gb|ACDT01000089.1| GENE 55 49203 - 49826 532 207 aa, chain + ## HITS:1 COG:MA3135 KEGG:ns NR:ns ## COG: MA3135 COG5658 # Protein_GI_number: 20091953 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Methanosarcina acetivorans str.C2A # 4 207 10 216 227 101 30.0 9e-22 MKLKTALITCGVCLLPILFGLIIYDCLPAQMPIHWNIAGEIDNYASKEIFVFLLPILMSG IQLLCLFMMQMDPKYKNYSSKMLKIVVWIIPTITIVLEAITYAIVFGYEISINIVVPIFL GILFVILGNYLPKCRQNFTMGYRLPWTLNDEDNWNKTNRLAGYVMVLGGILIIIMSFFDL AFWAVILFVGAILIIPSVYSFLLYRKK >gi|223714126|gb|ACDT01000089.1| GENE 56 49935 - 50192 334 85 aa, chain + ## HITS:1 COG:lin2889 KEGG:ns NR:ns ## COG: lin2889 COG3070 # Protein_GI_number: 16801949 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Listeria innocua # 1 80 1 80 83 72 50.0 2e-13 MGKLEDLPNIGKVVAQQLIQVGIDDPIKLKEVGAKEAWLKIQQIDESACIHRLYALEGAV 
EGIKKTQLDYETKAELKQFYNEHKL >gi|223714126|gb|ACDT01000089.1| GENE 57 50189 - 50848 692 219 aa, chain + ## HITS:1 COG:lin2547 KEGG:ns NR:ns ## COG: lin2547 COG0596 # Protein_GI_number: 16801609 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Listeria innocua # 5 198 5 212 239 95 29.0 8e-20 MKILVNGIELFYEISGQGRPLILVHGNGEDHHIFDELVAKLKNHYTCYCIDSRGHGLSSS VAVYSYQSMAEDIIEFIKSLKLSEVAYYGFSDGGIVGLLVASQSKLIKYLMISGANADPD GLKNWAYYLMKIQYYLKKDVKIKMMLEQPHINEYQLAKITAKTLILVGSDDMIKESHTLY LAHNISGSQLMILPNENHSSYIINSAKLAPIILAFLKVA >gi|223714126|gb|ACDT01000089.1| GENE 58 51344 - 52312 890 322 aa, chain - ## HITS:1 COG:CAC0395 KEGG:ns NR:ns ## COG: CAC0395 COG0524 # Protein_GI_number: 15893686 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Clostridium acetobutylicum # 1 311 1 315 315 290 45.0 2e-78 MKKILLFGEPMALLTTTSYDSLDEAEMFKKTLAGAEVNVAIGLTRLGHQATYLTVLGDDP WGHYIKKKLYQEGIDTRLVYYNSLFPTGMMMKNQVIEGDPDIYYYRQGSAFSNIDTNLID QINLNDFDQLHITGIPLALSSKTRETSLRLVKKAREAGLYISFDPNLRPSLWPNQTTMID TTNEMAKYCNMFLPGNNEGKILMGSDDPEKIAVYYHQLGCDELVIKLGSQGAYYSSKNET LTVPGFNVKKVIDTVGAGDGFAAGIISGHLEALCSKDMLIRANAIGAIQVQTLGDNEGLP NIEELNAYLKKLHCEANPESWI >gi|223714126|gb|ACDT01000089.1| GENE 59 52314 - 53261 839 315 aa, chain - ## HITS:1 COG:VC0285 KEGG:ns NR:ns ## COG: VC0285 COG0800 # Protein_GI_number: 15640313 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Vibrio cholerae # 13 201 12 199 201 215 56.0 7e-56 MNKIINDISLYGIVPVIKIECLEDAPYLAKALCEGGLPVAEVTFRTACAKEAIIAMKKAC PQMIIGAGTVLNPAQVNSAIEAGSEFIVSPGLNPKTVQYCLDKDIPILPGCANPSDMEQA IELGLDVVKFFPAQANGGLPAIKAMSAPYCNLKFMPTGGINTSNLTEYLAFDKIIACGGT WMVSDELIKEHKWDEITAITKQAVKTMLDIKFHHVGIGAGDRGAQLAALINSPHQKNLSI SRMVDEIEFMFDTPSGSLHHLCLSTKSIERAIFFLEKQGYTFDMQTARYNHKNKIEFIYF NELISGCKCHLCLEG >gi|223714126|gb|ACDT01000089.1| GENE 60 53266 - 54984 1776 572 aa, 
chain - ## HITS:1 COG:CAC3604 KEGG:ns NR:ns ## COG: CAC3604 COG0129 # Protein_GI_number: 15896838 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Clostridium acetobutylicum # 1 571 2 572 572 749 62.0 0 MKSQELRKIAPEIDPLRIGTGWSVADLSKPQIMIESTYGDSHPGSVGLLDLVYEVEKGIN DHGAKASKFFTTDICDGEAQGHDGINYSLASRDMIAGMIEIHNNATPFDGSVFVASCDKG MPAHLMGAGRINNPCIFVTGGVMEAGPDMLTLEQIGKYSAMYLRGEISKEQFEYYQHHAC PSCGACSFMGTASTMQIMTEALGLMLPGTALMPATCKDLKEQAYKAGKQIIKLAKMDLTA RDIVTMKSFENAIMVHAAISGSTNSLLHLPAIASEFGIDLNADLFDQMHQNAHYLLDIRP AGKYPAEYFYYAGGVPAVMEAIKENLNLDVMTVTGKTLGENLEELKNNGYYENCQTFLDK INLKSTDIIRPYDDPIGKDGTIAILKGNLAPDGAVIKHSACPQEMMQAVLRARPFDSEEA AIDAIIKKQIKPGDAVFIRYEGPKGSGMPEMFYTGEAISSDQELAKSIALITDGRFSGAS KGPVIGHVSPEASEGGPIALIEENDLIKIDVPQRRLEIIGIEGKQRTSQEITKILKERHA KWKPKEPKYKNGILSIYTKLATSPMNGGKMKG >gi|223714126|gb|ACDT01000089.1| GENE 61 55117 - 55797 809 226 aa, chain + ## HITS:1 COG:BMEII0858 KEGG:ns NR:ns ## COG: BMEII0858 COG2186 # Protein_GI_number: 17989203 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Brucella melitensis # 16 215 20 221 242 90 31.0 2e-18 MKKLGSKKTLPELAVEQIIGYINENGLQPGDQLPNEVDLMNKLKVGRGTIREAIKSLNSK GVVIIKRGIGTFVADQPGISDDPLGFTFETDKTKVLMDALEVRLLIEPQMIIKTIDNISA KQLKELSDLADEIESLINKNKDYTVQDIEFHTLLARVSGNRVIGKLIPIIAGAIEQAIDV TDKSLVQETIITHRLIVEAVANKDKEGASKAMARHIQDNIDILKTI >gi|223714126|gb|ACDT01000089.1| GENE 62 56234 - 56689 295 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756651|ref|ZP_02428778.1| ## NR: gi|167756651|ref|ZP_02428778.1| hypothetical protein CLORAM_02188 [Clostridium ramosum DSM 1402] # 1 151 1 151 151 254 100.0 8e-67 MRRYEIIKDEVYQILNTNCFGNKRKHGLEHLFSVAAMMKYLAIQNNLNIEIAATIGILHD LATYKLNSSFDHANRSSLIASELLKKDELFSANEIDTIVTAIKNHSNKERIDDKYSELIK NADLLIQYLNDPEALLTSEKQKRINRLIESK >gi|223714126|gb|ACDT01000089.1| GENE 63 56686 - 58008 995 440 aa, chain - ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 
COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 5 287 7 280 280 212 38.0 1e-54 MNKLFEQRMQKLLQNDYPAFESALNQKPIKSFYLNPLKHGSIEHLNSAFLTKHPYINDAY FYDYENYQLGKHPYFNCGLYYIQEPSAMAVANCLDFDHDDYVLDMCAAPGGKTCFTASKL SSAGLMIANDINKLRAGILSENVERFGLQNTIVTNSDPVKLEKYFNNFFDKIILDAPCSG EGMFRKLDQAVETWSIEKVNECAYIQRNLINSAYQMLKDEGILIYSTCTYSLEENEDQVK YMTNDLQMELLEIKKQPGMASGYQNDKVVRMYPHLNKGEGQFIALLKKHTNGSTKKINFL KPNVTKDQLNLVTKFYHNNLNLPVPQYLYNSNNHIYAILPQFPDLKGTKVLRNGLYLGEC KKGRFEPSHSLALSLKKEDVKRYYNFKADDLNISKYLHGETLIGNNQKGYGLILVDGYPL AFYKESNNQVKNLFPKGLRR >gi|223714126|gb|ACDT01000089.1| GENE 64 58072 - 58305 359 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756653|ref|ZP_02428780.1| ## NR: gi|167756653|ref|ZP_02428780.1| hypothetical protein CLORAM_02190 [Clostridium ramosum DSM 1402] # 1 77 5 81 81 112 100.0 5e-24 MDKGSNIMEYENIISELNEITTSLDATTETLVGLQENAKHEEAIKKIDVDLYLEAPIQEL SRLANRLAELSHIINEK >gi|223714126|gb|ACDT01000089.1| GENE 65 58972 - 59850 1059 292 aa, chain + ## HITS:1 COG:RSc1013 KEGG:ns NR:ns ## COG: RSc1013 COG0524 # Protein_GI_number: 17545732 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Ralstonia solanacearum # 3 289 15 310 315 157 34.0 3e-38 MKVLNVGSLNIDHTYQLERIVAPGETITSKQMSVYAGGKGLNQSIAIARAGIEVYHAGKI GIDGQFLIELCNDSGVNSTFIETGVIPTGNAIIQVTSQGQNSIILYPGANRSLDKAWIDK VLTEFSKGDILLLQNEVNLIDYLIIEGKKKGMNIFLNPSPYDEYIDKCDLSLVNTFLMNE VEGAQITGCYDPECILTRMKNLYPQAKVVLTLGQDGVYFQDKEQRVYQPARTVKAVDTTG AGDTFTGYYIAAVIEGKSAIEALTFATNAAALAVMKAGAANAIPKRHDVEKY >gi|223714126|gb|ACDT01000089.1| GENE 66 59893 - 61203 1184 436 aa, chain + ## HITS:1 COG:slr1305_3 KEGG:ns NR:ns ## COG: slr1305_3 COG2199 # Protein_GI_number: 16329450 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Synechocystis # 263 433 1 174 188 103 35.0 8e-22 
MNNSRTREELFLFRDLAKTNMTEFEQLISISIGAVGKIVYDPQFTVIYCSEGISKLIGVD FIGSGERYFSSSQFIHDDDREYVYQEIGKYVNSKEPFSLKYRLKCSKGKEAWVQAKGIFL DELYEDKDPIMYMVYTDITPLVEANERLEQEIQRYQIFTEFVQECFFDYEIGYDSLMFYG SDVNDKNDQENWYNIWRINKRLILEHFYQNNHHLDNDLEVDLIDKSKCWCHFRAKGIYNT YHDLVRIVGTMKNIQSEKERESRRKEYEEKLKNKAHYDHMTGLLNRFACETIVNQVLDEG LENVTGVVIDVDNFKEINDLHGHYIGDQVLIKIGELFQKYCRSDDIAARLGGDEFFLMFL GLMQQDTIRARLQTMHQEIHNIARELKLKVPVSVSMGVTKIYSTDKTFDDIYVRADETMY QAKLSGKNTLICFNEE >gi|223714126|gb|ACDT01000089.1| GENE 67 61234 - 61653 407 139 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756656|ref|ZP_02428783.1| ## NR: gi|167756656|ref|ZP_02428783.1| hypothetical protein CLORAM_02194 [Clostridium ramosum DSM 1402] # 1 139 1 139 139 246 100.0 3e-64 MENIDIALIGKYIKSTRLSKGWSQSKLEKITGLSASSISNIELGNRGVSYNISLPSLVKI AHAFEMSLVDLLSQSGYLSAIGDECYEAEKSQIYIHNYELINIYHRLTPIQQQEFVQEIK KLVSQSDLAKQQKMFRKKP >gi|223714126|gb|ACDT01000089.1| GENE 68 62128 - 62317 284 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756657|ref|ZP_02428784.1| ## NR: gi|167756657|ref|ZP_02428784.1| hypothetical protein CLORAM_02195 [Clostridium ramosum DSM 1402] # 1 63 1 63 338 114 100.0 2e-24 MIIVMKMTATEKDVEKVSKMVTDKGLNVSVVNGTGQSVIGIIGDTTQIDPKAIEVDEAVD HVM Prediction of potential genes in microbial genomes Time: Thu May 26 10:09:29 2011 Seq name: gi|223714125|gb|ACDT01000090.1| Coprobacillus sp. D7 cont1.90, whole genome shotgun sequence Length of sequence - 5234 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 + CDS 29 - 1087 1201 ## COG0337 3-dehydroquinate synthetase 2 1 Op 2 5/0.000 + CDS 1089 - 2369 1403 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase 3 1 Op 3 . + CDS 2347 - 3414 1298 ## COG0082 Chorismate synthase 4 1 Op 4 2/0.000 + CDS 3418 - 4536 1205 ## COG0077 Prephenate dehydratase 5 1 Op 5 . 
+ CDS 4546 - 5052 527 ## COG0703 Shikimate kinase Predicted protein(s) >gi|223714125|gb|ACDT01000090.1| GENE 1 29 - 1087 1201 352 aa, chain + ## HITS:1 COG:CAC0894 KEGG:ns NR:ns ## COG: CAC0894 COG0337 # Protein_GI_number: 15894181 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Clostridium acetobutylicum # 3 338 4 341 356 306 50.0 3e-83 MEMTVSLKENSYPIYIEKGILNQADKYIKEIYQGNKIMIISDDQVYHYYGDLLTNVLNKE YIVEHVTVPHGEQSKRFDILPSLYKALLDFKLTRTDLIIALGGGVIGDLAGFVAATFLRG VKLVQIPTSLLAQVDSSVGGKVAVDLPEGKNLVGAFYHPRMVLIDPNTLKTLPERFINDG MGEVIKYGCIKDKSLFDKLNSYDNFEQLYMDIDEIIYRCIGIKRDVVERDQFDFGDRLLL NFGHTLAHAIEQYYHYEKYSHGEAVAIGMVQLTKIAEENDLSPVGTSDIIKEICIKYNLP VYSGLKTTDLLEAISLDKKNINKKLSLVLLKEIGDSYVYQSNLSLIQAKDRV >gi|223714125|gb|ACDT01000090.1| GENE 2 1089 - 2369 1403 426 aa, chain + ## HITS:1 COG:CAC0895 KEGG:ns NR:ns ## COG: CAC0895 COG0128 # Protein_GI_number: 15894182 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Clostridium acetobutylicum # 1 417 1 419 428 404 48.0 1e-112 MTAVKITPTKLSGVVQVPPSKSLAHRAIICASLAKGISRIDNIEYSKDIQATIKAMQSLG TKIEKYDDYLIIDGTTTYTKQNSEIDCEESGSTLRFMVPIAIVEENKAHFVGRGNLGKRP LNTFYEIFERQNIGYLYKEDILDLYVIGKLKPDHYKVPGNISSQFITGLLFALPLLDGDS IIEITSPLESKGYIDLTLQMLNQYGIKIVNNDYKSFIVMGKQEYQAHDYRVEADFSQAAF YLVAGAIGNDVVLTDLNLDSLQGDKATLEFLEAMGAKISVVSDGIKVTGENLAATIVDAS QCPDVIPVVSVALALAQGKSDVINAKRLRIKECDRIIATRSQLNELGGTVTELPDGMTIR GVNEFTGGNCSSFADHRIAMMLAIAATRSSQPVVIDNMECVEKSYPSFWEDYQSLGGIID VIDVEK >gi|223714125|gb|ACDT01000090.1| GENE 3 2347 - 3414 1298 355 aa, chain + ## HITS:1 COG:CAC0896 KEGG:ns NR:ns ## COG: CAC0896 COG0082 # Protein_GI_number: 15894183 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Clostridium acetobutylicum # 1 354 1 356 356 404 56.0 1e-112 MSSTWKNNIEITIFGESHGKAIGIVLGNLPSGIKIDLEEVAKEMKRRAPGQDKMSTARKE ADKVEIMSGLQDDVTTGAPLCGIIYNSDQHSKDYSLLKEKMRPGHSDYPAFIKYRGFNDV 
RGGGHFSGRITAPVVFAGAIAKQILAKQGIQIGAHILSIKNEYDENFDMRLSTKTLEYLR RQHYPVINQEKYEKFVNIVDAARMDQDSVGGKVECAIIGLKPGLGEPFFDSIESHLSSLL FSIPAVKSVAFGNDKISELFGSEANDCYYYQDALVKTTSNNNGGITGGISNGMPVVFTVG IKPTPSISKEQKTIDVKKHENTTLGVHGRHDPCIVFRATVVVEAMAALAMLDLVR >gi|223714125|gb|ACDT01000090.1| GENE 4 3418 - 4536 1205 372 aa, chain + ## HITS:1 COG:aq_951_2 KEGG:ns NR:ns ## COG: aq_951_2 COG0077 # Protein_GI_number: 15606269 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Aquifex aeolicus # 102 372 4 273 277 181 38.0 3e-45 MKDLEQCRREIDEIDQQLIKLFEQRMNVSKDVVTYKLAHGLEIFQPEREKAVIEKNAARM MNPELADYARNFMQDVMDVGKSYQATFIPLNNLYNLAAPKRENIKVGYAGVPGAFAHQAM LEYFGNVENTNYVNFRDVFEALKNAEIDYGIVPLENSSTGAINDNYDLVRDYDFYIVGEH SVCISQHLLGIKGAKIENIKTVYSHPQGIQQSADFLRNNPQMLSQDFSNTAAAAKYVSEC NDLSKGAIASKVAAKLYDLEVLQENIHNEKTNNTRFIIFAKHLEDHPQTDRVSIVFTLQH KVGALYGVLKAIKDHQINLSRIESRPIKDKRWQYYFYIDFEGSLHDDNVKLALEQMKTNC LTLRVLGNYHHA >gi|223714125|gb|ACDT01000090.1| GENE 5 4546 - 5052 527 168 aa, chain + ## HITS:1 COG:CAC0898 KEGG:ns NR:ns ## COG: CAC0898 COG0703 # Protein_GI_number: 15894185 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Clostridium acetobutylicum # 3 157 2 156 165 127 38.0 1e-29 MKNIVLIGIMGCGKTTLSRMLGEKLNRPVIDIDEYIVEKYHQTIPEMFEVSETYFRNNET AGCKDVSDLNGHIISTGGGVVLRPENIKYLKQNGIIIYIDRPIDNILTDVQVTSRPLLKE GPQKLYELDKQRHQLYLEACDHRIVNDDTLENITDKIIELITKNKFED Prediction of potential genes in microbial genomes Time: Thu May 26 10:09:40 2011 Seq name: gi|223714124|gb|ACDT01000091.1| Coprobacillus sp. D7 cont1.91, whole genome shotgun sequence Length of sequence - 39155 bp Number of predicted genes - 32, with homology - 32 Number of transcription units - 17, operones - 7 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 918 879 ## COG2199 FOG: GGDEF domain - Prom 973 - 1032 6.3 + Prom 973 - 1032 4.1 2 2 Op 1 . 
+ CDS 1068 - 1472 439 ## Acid_1436 tagatose-bisphosphate aldolase (EC:4.1.2.40) 3 2 Op 2 28/0.000 + CDS 1484 - 2617 1244 ## COG0420 DNA repair exonuclease 4 2 Op 3 . + CDS 2607 - 3596 911 ## COG0419 ATPase involved in DNA repair 5 2 Op 4 . + CDS 3568 - 5640 2065 ## COG0419 ATPase involved in DNA repair 6 2 Op 5 . + CDS 5694 - 7133 1664 ## COG0442 Prolyl-tRNA synthetase + Prom 7142 - 7201 7.0 7 3 Op 1 7/0.000 + CDS 7225 - 7653 412 ## COG1846 Transcriptional regulators 8 3 Op 2 . + CDS 7646 - 8980 1337 ## COG0534 Na+-driven multidrug efflux pump + Term 9021 - 9057 1.1 - Term 9106 - 9139 3.1 9 4 Tu 1 . - CDS 9152 - 9763 786 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family + Prom 9822 - 9881 8.2 10 5 Op 1 . + CDS 9904 - 10716 851 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase 11 5 Op 2 . + CDS 10728 - 11372 685 ## COG0535 Predicted Fe-S oxidoreductases 12 5 Op 3 . + CDS 11430 - 12164 886 ## COG2365 Protein tyrosine/serine phosphatase 13 5 Op 4 . + CDS 12216 - 12620 474 ## gi|167756675|ref|ZP_02428802.1| hypothetical protein CLORAM_02213 + Prom 12622 - 12681 6.8 14 5 Op 5 . + CDS 12702 - 14018 1328 ## COG2252 Permeases + Term 14021 - 14081 13.7 15 6 Tu 1 . - CDS 14512 - 14973 275 ## COG3476 Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog) - Prom 15092 - 15151 3.8 + Prom 14903 - 14962 6.4 16 7 Tu 1 . + CDS 15123 - 15878 814 ## COG0708 Exonuclease III 17 8 Tu 1 . - CDS 16023 - 17387 1373 ## COG0733 Na+-dependent transporters of the SNF family - Prom 17416 - 17475 6.9 18 9 Op 1 . - CDS 17645 - 19207 1623 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold 19 9 Op 2 . - CDS 19287 - 19670 424 ## COG1832 Predicted CoA-binding protein - Prom 19693 - 19752 6.2 + Prom 19718 - 19777 7.7 20 10 Tu 1 . + CDS 19897 - 20229 324 ## COG1695 Predicted transcriptional regulators - Term 20559 - 20597 0.5 21 11 Tu 1 . 
- CDS 20674 - 22422 1410 ## COG1164 Oligoendopeptidase F - Prom 22638 - 22697 8.1 22 12 Tu 1 . - CDS 23014 - 24195 660 ## COG2508 Regulator of polyketide synthase expression - Prom 24292 - 24351 9.1 + Prom 24248 - 24307 7.3 23 13 Tu 1 . + CDS 24327 - 26258 2421 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family + Term 26269 - 26299 1.3 24 14 Op 1 3/0.000 - CDS 26745 - 29606 2084 ## COG1061 DNA or RNA helicases of superfamily II 25 14 Op 2 . - CDS 29590 - 29907 327 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 30008 - 30067 7.0 - Term 30547 - 30592 7.4 26 15 Op 1 5/0.000 - CDS 30602 - 30874 262 ## COG4668 Mannitol/fructose-specific phosphotransferase system, IIA domain 27 15 Op 2 . - CDS 30831 - 32537 1333 ## COG3711 Transcriptional antiterminator - Prom 32560 - 32619 7.4 + Prom 32566 - 32625 9.3 28 16 Op 1 8/0.000 + CDS 32733 - 33182 447 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 29 16 Op 2 7/0.000 + CDS 33183 - 33494 526 ## COG1445 Phosphotransferase system fructose-specific component IIB + Prom 33518 - 33577 2.1 30 16 Op 3 2/0.000 + CDS 33597 - 34694 1451 ## COG1299 Phosphotransferase system, fructose-specific IIC component 31 16 Op 4 . + CDS 34740 - 37460 2936 ## COG0383 Alpha-mannosidase + Prom 38271 - 38330 10.4 32 17 Tu 1 . 
+ CDS 38355 - 39047 688 ## CTC01469 hypothetical protein Predicted protein(s) >gi|223714124|gb|ACDT01000091.1| GENE 1 3 - 918 879 305 aa, chain - ## HITS:1 COG:aq_035_2 KEGG:ns NR:ns ## COG: aq_035_2 COG2199 # Protein_GI_number: 15605636 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 194 305 52 168 251 70 36.0 4e-12 MKFKRLDRVDYQVSIITIIIVCASFFCVYIFNYKITHDDMIYSLKERSNTIYHYVENYLD KDTFNHNFSLDMKNDTYKQIKQKLEDVKNATNVLYLYTAKMTDEGEYIYLVDGLPTASPD FRYPGDKIEKEIIPELKAALKNKIILPDQIKETTWGPIFVSYYPIHDEGKVVGVLGIEFD ASHQFEAFKQIRIITPLIAIMASLIATIIAVLSFRRISNPRFKDMANTDYLTNLHNRNAF EIDFENLSTHKHNTGIIITDLNDLKKINDTYGHRTGDEYIKKVAKILEKHVQKNTVYRFG GDEFI >gi|223714124|gb|ACDT01000091.1| GENE 2 1068 - 1472 439 134 aa, chain + ## HITS:1 COG:no KEGG:Acid_1436 NR:ns ## KEGG: Acid_1436 # Name: not_defined # Def: tagatose-bisphosphate aldolase (EC:4.1.2.40) # Organism: S.usitatus # Pathway: Galactose metabolism [PATH:sus00052] # 12 134 29 153 505 85 38.0 4e-16 MQLEKQIYIDKDLPAGWKPYYIFLMKVNNEIVGRMTLREGSCEERYYDGHIGYTVEPEFR GHYYAYQGVQLIKPIALKLGFKELIITCSPNNLASKKTILKLQAQYLETVEIPKKYRKDF EAGETIKEVYLIKL >gi|223714124|gb|ACDT01000091.1| GENE 3 1484 - 2617 1244 377 aa, chain + ## HITS:1 COG:VCA0520 KEGG:ns NR:ns ## COG: VCA0520 COG0420 # Protein_GI_number: 15601280 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Vibrio cholerae # 1 369 1 371 379 214 34.0 2e-55 MRFVHISDIHLGKLLYQQNLLEIQADLLNQVCDYLVDNEIDVLVMAGDIYDRSVPSNEAI DVLNDFLTKVILKYHKKVLMIAGNHDSASRLSFVSGLLRQEGLYIEAYPAKQMKPIEIEG VNFYLMPFFKPSYIRYLYEDEQIVTYQDAFRCYLEHQKIDFTKPNVLITHQFIAGNSEVM RSESEAVLSVGGTEIIDVSLVKQFDYVALGHIHAPQKISTETIRYSGSLMRYSFDEVNQT KSIVEVTIDNKKVSYQLVSLQPKQDLIKITGSFDEIMADSFAVNNNDFIAVELTDSMLVP NAIDYLRTKFSNVLQITYPKLISNETSNNTKADPGFEKLDSVALFEQFYQKIKGTELSAE AKQIVTEIMMGGKENVA >gi|223714124|gb|ACDT01000091.1| GENE 4 2607 - 3596 911 329 aa, chain + ## HITS:1 COG:VCA0521 KEGG:ns NR:ns ## COG: VCA0521 COG0419 # Protein_GI_number: 15601281 # 
Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Vibrio cholerae # 1 260 1 273 1013 110 32.0 3e-24 MLPNRLTMSAFGPYHQVVTIDFEPFIQDGLFLITGPTGAGKTMIFDAIMFALYGVSSGSE RSSEQFRSDQANHDTPTFVELDFILHNQHYLIKRSPRYLLEGKKTPKLPTALLTLPGGKM IEGIKEVNYKIKEILGIDDKQFKQIAMIAQGEFTKLIYAGSEEREKVLRNLFKTDNFRCL EEQLKLRVKEYKSKYDLLFKQREMLLKSLDVEDEKIDLDEYLDSLEKKIKIKAMEYQQNA QYYEQKAKELNVIEINNRRLEYLDDLKIQLSNYLSKEDYYRELEKMIKDLKRANQLQNIY SLMISSKTKLAKLREKEKVLNERLSFAKK >gi|223714124|gb|ACDT01000091.1| GENE 5 3568 - 5640 2065 690 aa, chain + ## HITS:1 COG:VCA0521 KEGG:ns NR:ns ## COG: VCA0521 COG0419 # Protein_GI_number: 15601281 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Vibrio cholerae # 92 672 423 994 1013 186 30.0 1e-46 MNAYHLQKNDLKKQEKQYALVSKYHDEVAQGQIELEKMVELKQNVLVYQKDLNQGNNLRK QIKVIESKNIGLNDRLQKSAKIIERDSNSIAHLDKLKGDYELINQNYQKIHAHKLEIHNL SSDYDKFLREEEHCYELREQYQKIEEKYLNEKNKYDQMEHHFRSSQAGILASSLKENEPC PVCGSLNHPQLADFDQKIVYQEDLKKAKKNFDKCSETRNDIYNALILKQQEVTLLKKQME NDCQRLGIEEELGKEVFIKVLGKINIDENKLLKQAKELDNEIIYLNNLKISLANHQKDLD LLEQNIQKNNAQVQAYKDQLNQVIGRMEQNKQLKDLTHHQVDQQIEEKQKELLGLTKIIK DIEEKYRSINEQIIQLEAQNRSYCQQIKDEENNYQQVLREYRGKLNSLFTDEKEFLEIIK NVSQLENKEKEYQAYLIAKDSLNKQIHELQLELKGIEVIDVEVVKEKLIILKQKLDSALN ILNNLNAQLMTVTSTIKNIKNIDQELSNSHDIYQCYLDLSEVTSGKNSYRISFERYVLAA YFENILVYANTLLKRMSQGRYQLYRRDNRSKGAGKQGLELDVLDLESGLLRDVKTLSGGE SFKAALSLALGLSKMIQGYAGGIELNTLFIDEGFGSLDSQSLDQAIDCLIDIQQDGKLIG IISHVSELKERIDHKIILSRKNKETKIAIE >gi|223714124|gb|ACDT01000091.1| GENE 6 5694 - 7133 1664 479 aa, chain + ## HITS:1 COG:BB0402 KEGG:ns NR:ns ## COG: BB0402 COG0442 # Protein_GI_number: 15594747 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Borrelia burgdorferi # 7 479 2 488 488 477 48.0 1e-134 MANNKKNDAITAMEDDFAQWYTDVCRKAELMDYSSVKGFIIYRPYSYAMWEAIQQYMDKR FKETGHENVYMPMLIPQSLLQKESDHVEGFAPECAIVTKGGLDDLEENLIIRPTSETLFC 
EHYAKIVNSYRDLPKLYNQWCSVVRWEKTTRPFLRGSEFLWQEGHTVHASETEARAETLQ MLEIYEDTAKNLLAIPMLTGRKTEREKFAGAEETYTIEALMHDGKALQSGTSHYFGQGFA KAFDMKFLSKENKLEYVYQTSWGVSTRLLGAIIMVHGDNNGLVLPPRVAPTQVMIIPIQQ QKAGVLDKAYEIEKQLKEAGIRVKVDASDKSPGWKFSEAEMRGVPLRLEIGPKDIENNAC VIAKRNDGTKEKYALDQELINTIKTLLDTIHDEMYQKALNHRNSNIRQVNTYDEFKTTLE TKGGYIKMMWCGDEACEVKIKEETGATSRCIKEEEEAFGDVCPICGKPAKKVVYFAKAY >gi|223714124|gb|ACDT01000091.1| GENE 7 7225 - 7653 412 142 aa, chain + ## HITS:1 COG:lin2874 KEGG:ns NR:ns ## COG: lin2874 COG1846 # Protein_GI_number: 16801934 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 11 134 12 138 149 57 27.0 6e-09 MERIGRIIHHLAILNRNDSNQRLKEYGLTSNEGSVLMFLSRHSKVYQEELIKELQIDKSA VTRLLQNMENKGLIKRIQSPVDKRFYLLEMTALGQEKQQIVDQTFAKKDFILVSGLSEKE QEELRRMLDVIQTNLKGGKINE >gi|223714124|gb|ACDT01000091.1| GENE 8 7646 - 8980 1337 444 aa, chain + ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 14 432 15 432 448 313 40.0 5e-85 MNDERMGNKKILPLLIEFSVPAIIGMLVNAIYNVVDRMFIGNAPQLGAVGIAGITISYPV TLVLMAISLMVGVGGATRFSISLGQKEKDKASIYQGNAIVLTVVFGLMFTVLGNIFITPM LEVLGASKTVLPYASDYLSIVLYGAVFQCVAMCGNNFSRAQGNPKNAMISQLIGAGFNIV FDYILIIQLHMGMSGAAIATIGGQFLSMIWQLCYLCSNRSLIKLDIEHMKLKLIYALDII KTGIPAFLMQMANSVLNFILNSTLGMYGGDIAISTVGIITSFQTICQMPLTGLMQGQQPL ISYNFGADNYHRVKETLKYAIIGGTMIAVVGFLAVQFFPETIIRMFNGEAEVVKLGTKAI RIWFLFLPLLGAQIMTANYFQCIGKIKVASILNLLRQVIILIPMILLLATFFGLNGIFWA VPIADLGAFVITIFCFIKAIKKLK >gi|223714124|gb|ACDT01000091.1| GENE 9 9152 - 9763 786 203 aa, chain - ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Clostridium acetobutylicum # 44 177 6 139 174 160 59.0 1e-39 
MKKLFKFTMVMLLALSLAGCGGKSNDEQDKKTASKDLLSGNYRIEIDVKDYGVIKVELQA DEAPITVTNFVKLVKEGFYDGLTFHRIIDGFMIQGGDPKGNGTGGSDETIKGEFSSNGVE NPLKHTRGAISMARSSDPDSASSQFFIMHQTTTSLDGDYAVFGYVYEGIEVVDKIATTVP VTDNNGTVEASNQPVITSIKMLK >gi|223714124|gb|ACDT01000091.1| GENE 10 9904 - 10716 851 270 aa, chain + ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 1 270 5 278 290 176 38.0 4e-44 MKRIALINDVTGYSRCSIAAQLPIISAMGIECVFVPTAILSVNTMHPEYYFDDYTDRMND YIETYKKMKVSFDGIATGFLGSERQIDIVIDFIKTFKKKHTFVLVDPVMGDHGKLYPTYT TAMQDKMRQLIPYATIMTPNLTELCALLEVPYPHDIINRDELIEMCRQLSNLGPQMVVVT GISIGDEIINFAYEQGKEPAIIKVKRIGDDRSGTGDVISGVIAGSYLNGLNFYQCVEKAA NFASRCIEYSQAVGAHNHLGLCFEPFLKEL >gi|223714124|gb|ACDT01000091.1| GENE 11 10728 - 11372 685 214 aa, chain + ## HITS:1 COG:aq_2060_2 KEGG:ns NR:ns ## COG: aq_2060_2 COG0535 # Protein_GI_number: 15607030 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Aquifex aeolicus # 27 213 9 191 194 122 35.0 6e-28 MVILYTLHGNGYIDLKDLDTYKGITCYVNSTNRCSCACTFCLRNTKEMLESNSLWLKREP TVEMIIAEFEKYDLTAFKEVVFCGFGEPLTRHDDLMKVAKYIKEKRSDLSIRINTNGLSS ITLGRDIAPDFKGVIDTVSISLNAPTKEEYYELTRACYGVDSFNYMLQFTKACKKYVPSV VLTVVDIIGVEKIKLCQKIADGLGVELRVRPFEE >gi|223714124|gb|ACDT01000091.1| GENE 12 11430 - 12164 886 244 aa, chain + ## HITS:1 COG:lin1914 KEGG:ns NR:ns ## COG: lin1914 COG2365 # Protein_GI_number: 16800980 # Func_class: T Signal transduction mechanisms # Function: Protein tyrosine/serine phosphatase # Organism: Listeria innocua # 11 234 46 288 298 87 29.0 2e-17 MMQLSREERMLDVKSMTNVRDLGGYETQTGFYTKSHKFIRSTNPSKLSDDEKEYLYDYGI RMQVDLRSDFEVEQLPSQLTGYRDIEYYNVNLMQSKDLNVLPSEVTNYQDLAGFYIFMLE ANKQQFKEVFELFYDHPYDAIMFNCSAGKDRTGVIAALLLDLAGCHDYDIVKDYSESYEN NLKIISELERLVDDANRAFLESAPQMMMKFLDYLREHYGSAKEYLVSCGLEEEKIIEIIE NFVI >gi|223714124|gb|ACDT01000091.1| GENE 13 12216 - 12620 474 134 
aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756675|ref|ZP_02428802.1| ## NR: gi|167756675|ref|ZP_02428802.1| hypothetical protein CLORAM_02213 [Clostridium ramosum DSM 1402] # 1 134 1 134 134 238 100.0 8e-62 MRKTNKLTFKQKQEAVAELFKQFHRAKLKLYCLENTNFYPQLNIGMLHEKKSGYNASIAE RLNQRIDDRDELERVVAAFEIVIQALSPESQLIITNEFVLQKNHEWWLEFYSRATYYRLK TRALEEILFYVNIS >gi|223714124|gb|ACDT01000091.1| GENE 14 12702 - 14018 1328 438 aa, chain + ## HITS:1 COG:FN1250 KEGG:ns NR:ns ## COG: FN1250 COG2252 # Protein_GI_number: 19704585 # Func_class: R General function prediction only # Function: Permeases # Organism: Fusobacterium nucleatum # 2 437 5 447 448 347 51.0 2e-95 MLEKIFKLKEKGTTVKTEVIAGITTFLAMAYILAVNPNMLSETGLSYDGVFLATALSSGI ATLIMGLLANYPVALAPGMGVNALFTYTLVFSMGYSPAAALAAVVVSGIIFLLISVTGVR KAIINAIPQQLKLAIGAGIGFFIAFIGLKNAGIIIGSESTFVALGKLTDPVVLLAVFGVL ITIILMARKTPAAVFYGLVITAIVGIVAGLCGIEGMPAMPSKIVSIDLDTSGFGIFMNGF GELFSHPDCIVALFSLLFVDFFDTAGTLISVANKTNLVDENGELDNIEQALVADSTGTVI GGILGTSTVTSFVESTAGVEAGGRTGLTACTTAILFFLSILFAPVLSVVTSAVTAPALVA VGISMATQLGGIDWDDIVFAASGFITVIMMILTYSISDGIAFGFIVYGITSVIAGRSKEI KPIVWALILIFVVYFALI >gi|223714124|gb|ACDT01000091.1| GENE 15 14512 - 14973 275 153 aa, chain - ## HITS:1 COG:CAC0262 KEGG:ns NR:ns ## COG: CAC0262 COG3476 # Protein_GI_number: 15893554 # Func_class: T Signal transduction mechanisms # Function: Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog) # Organism: Clostridium acetobutylicum # 5 153 14 165 170 87 38.0 7e-18 MKKYKYIILNLSISLGIGALSAIFTMNAMDVYQTVNLPKCSPPGYLFPIVWTFLYILIGL ASYLIHRSNSKNKETALIIYYFQLLINFAWPIAFFNYQSFLLALAILITLCILVAILIKL FYQIRPLAAFLLLPYMGWILFALYLNFWIFVNN >gi|223714124|gb|ACDT01000091.1| GENE 16 15123 - 15878 814 251 aa, chain + ## HITS:1 COG:CAC0222 KEGG:ns NR:ns ## COG: CAC0222 COG0708 # Protein_GI_number: 15893514 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Clostridium acetobutylicum # 1 249 1 249 250 361 68.0 1e-100 
MKLVSWNVNGIRACITKGFYDVLKESKADIFCVQETKMQEGQIELEKHGYYTYMNSAEKK GYSGTLVFSKEKPLNYSYGIGIEEHDHEGRVVTLEYEKFYLVNCYTPNSQNELKRLDYRM HWEEDFLAYLKSLEKSKPVILCGDLNVAHQEIDLKNPKTNRKNAGFSDEERAKMTNLLEN GFIDTYRYLYPDQEGVYSWWSYRFNARKNNAGWRIDYFIVSDCLKEGIKEAFICTDILGS DHCPVGLEIDL >gi|223714124|gb|ACDT01000091.1| GENE 17 16023 - 17387 1373 454 aa, chain - ## HITS:1 COG:BH1128 KEGG:ns NR:ns ## COG: BH1128 COG0733 # Protein_GI_number: 15613691 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus halodurans # 1 444 3 447 453 236 34.0 8e-62 METKHKRSSFSGRLGYVMAVAGSAVGLGNIWRFPYLAAKYGGGTFLFTYFILAITFGFAL LISETALGRKTKKSPIAAYKQLGAKKLQFGGWLNAIVPMLIVPYYCVVGGWVCKYLFEFI TNNSASLVSDTYFTGFSSSTLQPVIWLLVFAGMVFGVVMLGVEKGIEKCSKVLMPALVAM ALFIALYALFTPGAIEGVKYYLIPDFSRFSIMTVVAAMGQMFYSLSIGMGILFTYGSYMK REIDMEQSITQVEIMDTLIAFLAGLMIIPAVFAFSGGNPDTLNAGPSLMFITMPKVFASM SFGNIIGLVFFLLVLFAALTSAISLMECCVSIIQDRYDFSRKKCCVLIFGGIALLGIPCS LGFGVLDFITPLGLSILDFFDFMTNSIMMPISAACTCLLIIKVTGFKTVTDEVEYSSKFK RKKAYLFCMKYVVVPGLLIILISSVLSTMGIISI >gi|223714124|gb|ACDT01000091.1| GENE 18 17645 - 19207 1623 520 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 4 519 18 551 553 322 36.0 1e-87 MKKIYYNGDILTMENSDYEAIYVEDGVIKKCGNYSEIKNLVCADTEFIDLQGKCLMPSFI DSHSHVVSLASTLKLVDLSKATSIAQIIQCFKNYIEENNPTSEEMLIGFGYDHNFLKEQC HPTASDLDKISTTNPIMMAHASGHMGVVNTKTMQLLGLTDQINDPVGGKYGRDEDGHLNG YLEEQAFITIASQGSKENKNLKTQILDALKIYASYGITTVQEGYMKQHEFDLLNSLATEN KLFLDVVGYVDIKNNYSLYRDNQQYHHYLNHFKLGGYKLFLDGSPQGKTAWLSKPYENSG DYCGYPIYLDSDVKKFVDMAINDKAQLLTHCNGDAASNQLLGAFNDQYTPLRPVMIHSQT LRPDQLPRLKSIGMIPSFFIAHIYYWGDIHRRNLGQRAESISCVHSALKLDLPYTFHQDT PVIEPNMLETIWCAVKRQTKNGVTLGTDEQISVYNALKGVTINAAYQYFEEHLKGSIKEG KKADLLILDRNPCKVAIEEIRQIKIIETIKDGITIYKKDI >gi|223714124|gb|ACDT01000091.1| GENE 19 19287 - 
19670 424 127 aa, chain - ## HITS:1 COG:TVN0743 KEGG:ns NR:ns ## COG: TVN0743 COG1832 # Protein_GI_number: 13541574 # Func_class: R General function prediction only # Function: Predicted CoA-binding protein # Organism: Thermoplasma volcanium # 5 123 6 125 134 69 29.0 1e-12 MNAKDILTHYQNFAVIGVTTNPDKFGYKIYQRLNEIEKTVYGVSPIYKEIDGKSTFPNLT AIKNKIDVAVFVVSPKFGIDYIKECNQLNIKHIWLQPGTYNDDLISLINENNLNYYQNCV LIESQDL >gi|223714124|gb|ACDT01000091.1| GENE 20 19897 - 20229 324 110 aa, chain + ## HITS:1 COG:L128255 KEGG:ns NR:ns ## COG: L128255 COG1695 # Protein_GI_number: 15674041 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 1 104 1 104 120 122 63.0 2e-28 MKDITEILKGILEGCVLEIINQEETYGYEITRRLNQLGFTEVVEGTVYTILVRLEKSGLV KIQKKLSKMGPPRKFYSLNESGREKLRLFWTKWEFMTDKLNELKESGKNE >gi|223714124|gb|ACDT01000091.1| GENE 21 20674 - 22422 1410 582 aa, chain - ## HITS:1 COG:DR2055 KEGG:ns NR:ns ## COG: DR2055 COG1164 # Protein_GI_number: 15807049 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Deinococcus radiodurans # 113 573 5 464 467 341 37.0 2e-93 MKEWDLSKLYQSFDDSSFTTDFNQIIKLQKQINTYRNNNDEKSLENYLVDLIKLNQLIEK VANYISLTLSVDTTNSIALKYSDQLDNLLASFVEDEVLNQQWISTFKLDSLKSNLIKEHL FILKEIQQNQKYTLDKKSESIIAHMQNTGSNAFSKLKDQLISSIEVTIEDQTYPLTSVLN MAYDKDREIRKKAYQAEINSYQDKEQAIAACLNGIKGEVLYTTKLRGYENELERTLINAR MSKETLDVLLTVMQENLSVFRDYLKTKATALGYQNGLPWFELYAPVIEDHDSHPYEKSCQ FIINQFSTFSEHLGTFAKKAIENNWIDVYPQSGKVGGAFCCNLHSIKESRFLLNYGDSFS DAITMAHELGHGFHGECLNDESILNSDYPMPIAETASTFCETIVTKAALKKANDQTKLSL LEDELSGATQVIVDIYSRYLFESRFFEKRETGALSVEEIKELMIQAQKDAYGDGLDHQYL HPYMWTWKPHYYYADCSFYNFPYAFGLLLAKGLYTIYQKEGSKFSNTYEKLLSLTGKMSL EDVCSSVNINLKDKSFWQASINTFKEDLTLYKQLLNKDVISQ >gi|223714124|gb|ACDT01000091.1| GENE 22 23014 - 24195 660 393 aa, chain - ## HITS:1 COG:CAP0121 KEGG:ns NR:ns ## COG: CAP0121 COG2508 # Protein_GI_number: 15004824 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport 
and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 263 376 399 521 543 71 41.0 3e-12 MKDMYEFLFNGLINHNLHQFMKQLYEYFHHPMVLCDVNYVVLAQHPNQQIGDMLFDHMQE HQKVAVEMLPFIQLGNYQKDLDQNNNVIYVDYGVGQTIPRIIAAITDNDNIIGYLCILFA DGKPSSEIFSFISKLAKTIASIICHSKTGYNYNRYEYFAIMNYLFSNDNINKHDLEKLLP AQPNNFSRDYCIITTKVGLKSNEIVYLNQLHQHIITLPNTYSYLQDNCLYILITGLNRAT KNNIINKVKSLITAVNIYKLKFGISYIFDNLLNINIYKQQALKAYDLSKQQFTFFEDIAL ESFFTTSSYKIYLHPLITKLKQIDESTKNDYYHTLKSYICNFGDSHKTIEQLAIHRNTLL YRLEKIKELTNVDLHDEKLCTLLLISFFLAENN >gi|223714124|gb|ACDT01000091.1| GENE 23 24327 - 26258 2421 643 aa, chain + ## HITS:1 COG:CAC3371_1 KEGG:ns NR:ns ## COG: CAC3371_1 COG1902 # Protein_GI_number: 15896613 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Clostridium acetobutylicum # 2 376 3 399 401 256 37.0 1e-67 MKYDALFTPFKIGNMEVKNRFVMAPMGTNSSHIDGCIANDEIDYFEARAKGGVGLIIMGC QFLNPDLAQGSLEGVLQNSYVIPQLTTVVEATHRWGAKICCQISCGTGRNAFPNMYGEPP FSSSDTPSTFNPEMKCRPLSKEDIKKIMTEFANSAIIAKNAGFDAIEIHGHAGYLVDQFM SPVWNKRTDEYGGSAENRMRFATEIVQSIKQAVDLPVIFRIALDHLFAEGRTLDESMELI EILEKAGVDALDIDAGCYERIDYIFPTAYLGDACMDYVCSEARKHVNIPIMNAGNHTPES ALRLIESKDADFVMFGRQFIADPDTVNKLMNDHEEDVRPCIRCNEECVGRIVGRLTKLSC AVNPQACEEERFAIKPCSQVKNIVVIGAGPAGLEAARVAAAEGHNVEIYEKTNMIGGQLS VAATPKFKDQLKKLVKWFEVQLNKLDVIIHFNYEVKADDQILKNADQIIVACGADEIIPP INGINNENVISVIDAHMHKGMIKGENVIMCGGGLSGIDSAIELASEHGKNVTIIEMADSI AKDVLFINQASIFAKIDEYKINVITGAKVIEFNNDGITYQKDEQEVVCKADTIIRAFGMK PNRELAEEIRNIYPTKTRIVGDSEKIGKVANAIRDGFYAGMSL >gi|223714124|gb|ACDT01000091.1| GENE 24 26745 - 29606 2084 953 aa, chain - ## HITS:1 COG:MA1603_2 KEGG:ns NR:ns ## COG: MA1603_2 COG1061 # Protein_GI_number: 20090461 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Methanosarcina acetivorans str.C2A # 206 782 32 611 618 661 56.0 0 MTLTIDYHKNLHALNTVDKTYQPQLLFNDYKQGMKLSHELVQQLNSCDSFELSIAFISMS 
GLATIKQTLLDLEEKNIPGKIITSTYLGFNEPRTFQELLKFKNLDVRIYDDEKAGFHPKG YIFKKENTFNIIVGSSNLTQSALSINQEWNLKFSSTNNTNIVEQIKKEFNQQWENSISLT NDWIEEYKKNYVKPIKSVTNIIKKEIKPNKMQRAALDSLKSIRNENKNKALLISATGTGK TFLSAFDVKNVNPKRMLFVVHRENIARTAMLSFEKIINNHSFGVFTGNNKDINADYIFST IQTIHKKEYREMFNPNDFDYIIIDEVHRAGANSYQELVNYFTPKFMLGMSATPERSDDFD IYKMFDYNIAYEIRLQQAMEYDLLCPFHYYGITDIQVNGESLDDKTDFNNLVSKDRVKHI IEQINRYGYSGDRVRGLVFCSRKDEATELSNLFNTRGYKTISLTGENTEAQRRDAMNRLE SDDLVNCLDYIFTVDIFNEGIDIPKVNQVIMLRPTESAIVFVQQLGRGLRKDPSKEYVVI IDFIGNYEKNFLIPIALSGSLSYNKDTLRRFVSEGSLIIPGASTVNFDLISKKKIFESID KANFSDIKIIKESYQQLKQKLGKIPSLKDFDKYESIDVLRIFQNKNLGSYHKFLTKYEKE YNVKFSSVQEQYLTYISTKLASGKRIHELEAIKIAIEKHTNLLDNLKESLLNNYNLNLPK ITYQTIINILSQNFATGSSKATFKDAIFIDENLNTSAQFKMLLADQEFKKQIIELLDFGI NRYKQNYTNTYKDTSLCLYKKYTYEDVCRLLNWEKNLVPLNIGGYKYDKHTNTFPVFINY DKEEGISESIKYEDRLVDQNTIICFSKPRRTIESEDVVKIYSEKTNHVKIHLFIRKNKDD EASKEFYYLGLMHAYGEPIQTTMSNTNNNVVQFTYKLENEIRKDIYDYITNND >gi|223714124|gb|ACDT01000091.1| GENE 25 29590 - 29907 327 105 aa, chain - ## HITS:1 COG:SA2278 KEGG:ns NR:ns ## COG: SA2278 COG0494 # Protein_GI_number: 15928069 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Staphylococcus aureus N315 # 1 98 32 130 130 77 39.0 6e-15 MFEFPGGKIEPGESGEQALIREIQEELETTIIIEEFFMNVNYKYPTFILDMNCYLCTLKD NHIKLNDHNSIRWISLDEQNINWIPADIQIFDTLKKRGILHDTNN >gi|223714124|gb|ACDT01000091.1| GENE 26 30602 - 30874 262 90 aa, chain - ## HITS:1 COG:lin0425_2 KEGG:ns NR:ns ## COG: lin0425_2 COG4668 # Protein_GI_number: 16799502 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol/fructose-specific phosphotransferase system, IIA domain # Organism: Listeria innocua # 4 63 38 97 125 57 36.0 5e-09 MKYLRLPHPMRLCAKDTKVAVAIIDKPLTWYQQDTVQIIFLLAIKQGDQQDIEHLYDIFI EIVNNAKLQQSIIHSYNYDNFINNLLENME >gi|223714124|gb|ACDT01000091.1| GENE 27 30831 - 32537 1333 568 aa, chain - ## HITS:1 COG:lin0425_1 KEGG:ns NR:ns ## COG: 
lin0425_1 COG3711 # Protein_GI_number: 16799502 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Listeria innocua # 3 509 4 509 519 219 29.0 2e-56 MMFPYNRLNEIFDYVRQDNIVSASQLSVLLNITDRTIRSDIQAINEILEKNGAKIKLKRK AGYYIEINDQEKYNTFLCSIKQTRTSNLELDSSQDRIKYLLNLLLYSDEYMSLDDLADNI YVSKNTLQNYIKTLKAIFSKYNLEYISKTNVGVKIIGNEDDKRKCLVENVLSYNFQNYVT GFTKDEYTLFEGIDLDLLKQIISNKLKNAHIKTNDFNFKNLIIHFALMISRIQFDCYINT NNTIKIDDNYTDFIDDIANEIEYTFNITISEGEKKYIYSHLVANTQLNDLVDNDNKIKEL VEELLNNIYFDYNFDLRNDEILSHDLFLHFKSILNTKSFALNKRNPLLNTIKTNFPLAFD ITLTCTAKIFNKPPYILTEDEVGYVSLHIGAAIERCFSGSLQNKSVILVCGSGQATTRML EARLNVFFKDKITIVRKASYNEFINYTKRELLNIDFVISTIPLKSEHIPTITVDFALNNQ DIEAISKFLTSISLNKMKKSNKFFDKNLFIHLDGIDSKESLLKQMCQLMEKQNIVDSNYF DCVMERENLAKTNMNEVFAITTSYAPMC >gi|223714124|gb|ACDT01000091.1| GENE 28 32733 - 33182 447 149 aa, chain + ## HITS:1 COG:lin0421 KEGG:ns NR:ns ## COG: lin0421 COG1762 # Protein_GI_number: 16799498 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Listeria innocua # 1 147 2 148 152 99 34.0 3e-21 MLNEDYIFLDVKVTNKNQLLGFIADKAHEYNICDNREGLLEDLIKREAEFPTGLQDGFAI PHARSNHVKKAAILYLRTDEAIEWGTMDDKKVNYLFSLLVPEQNEGNLHLQMISKLATCL LEDEFKNTVKSATNKLELKDYILKNMEVD >gi|223714124|gb|ACDT01000091.1| GENE 29 33183 - 33494 526 103 aa, chain + ## HITS:1 COG:lin0422 KEGG:ns NR:ns ## COG: lin0422 COG1445 # Protein_GI_number: 16799499 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Listeria innocua # 1 102 1 102 110 100 56.0 7e-22 MKIVGIAACPAGLAHTPMAAKALEKAGAKLGYDLKMEQQGSMGQVNKITEEEAKEAAFVI VASDQKILGMERFEGKPVIRIDITTCIKAPEAVIKKCVQATQK >gi|223714124|gb|ACDT01000091.1| GENE 30 33597 - 34694 1451 365 aa, chain + ## HITS:1 COG:lin0423 KEGG:ns NR:ns ## COG: lin0423 COG1299 # Protein_GI_number: 16799500 # Func_class: G Carbohydrate transport and metabolism # 
Function: Phosphotransferase system, fructose-specific IIC component # Organism: Listeria innocua # 1 349 1 349 370 350 53.0 2e-96 MKKFLKDAKGHIMSGIGYMLPLIIGASLVVAIPKLIGVAMGINSLDAYATKDGFLHILYL LEQVGWTGIGLVNTVLAGFIAFSIADKPAIGAGLIGGALASNTKAGFLGAVIAAFIAGYI VKWGKEHIKLPDSMQQMMPLVILPFLATGAVAIIMGVILATPLASINDALVSWLRDMCSG GTSQLILSLVLGAMIASDMGGPINKSAWMAGNVLMTEGIYQPNVYINCAICIPPLAYAIA TVIKKNRFSPSFKEAGKGNWVMGFIGITEGAIPFTLVKASRLIPINMIGGALGAGICCLL GATADIPPVGGMYGFVSITGGWAYLVGIIVGALFIAIVAPMVVDFNDDKDEEETISVDDI EIVIE >gi|223714124|gb|ACDT01000091.1| GENE 31 34740 - 37460 2936 906 aa, chain + ## HITS:1 COG:lin0424 KEGG:ns NR:ns ## COG: lin0424 COG0383 # Protein_GI_number: 16799501 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Listeria innocua # 2 850 3 831 875 811 48.0 0 MKRKVHVIPHSHWDREWYFTTSRSKVYLMKDLKDVLDTLETNNDFKYFMVDAQGSLLEDY IKWMPQDKQRIKKLVKAKKLIIGPWYTQSDQLVISGESIVRNMYYGMKCCETFGEYMNVG YVPDSFGQSGNMPQIYRQFGIEDTLFWRGVSDDMVKHTDYNWRGDDGSVVFTTQIPFGYY IGGNIPEDDKASDEFWQKECFEKAGSRSATKHIYFPNGFDQAPIRKNLPDLIKERNEKDP DNEYIISCVEDYIKDVKSEKPDLEEVSGELVIAKHMRIHKSIFSSRSDLKVLNTQVQNYV TNILEPMLTMSYHLGNDYPHSAVSEVWKLLFENAAHDSIGSCISDTANEDVYMRYKQARD IAVNLVELHSRLIATKIKNKTNNEITLTLFNTLPKKRNDTIIFETYLPADNFAIKDNHGN HVKYTVIEKTDLTDYVLSQTIRLNPSKNIYLPSVVYQAKIAIEARDVPALGYVQYTIDLD DKSNDDLQEITSMENEYYQITVNKQGSLDIYDKLADYHYQNQAVLVENGDDGDSFNYSPP RQDLEIFSINSEFDYQITGSSLYQKTTINYKMDIPADLQERAKQQTSIKLPVTLEVVLRK GSNIIDLNVHVDNQGLSHRLCILFDSALTTKFNYADQQFGLIKRPNTYDKEMELYLEGLG NQEKAEKPKELANWVNDQSTWQEPPISIEPTQSYVALADKNRGIAVIPQGVREYEIIGEQ GNQIRLTLFRTYGFMGKENLVYRPGRASGERIIATPAAQLLKEMDFKLGFTTFNCDINNA NIDTLAKEYNTPIEVYEYADFLNGRLIFAQEDEEQILPVTNSLFESENNLVVSAIKKSED GEGYIIRLFNGKDHQDIGDKLTFNFDISEAYYTNLKEEKIEQINVVDNSVVIKPISHCKF ITLYIK >gi|223714124|gb|ACDT01000091.1| GENE 32 38355 - 39047 688 230 aa, chain + ## HITS:1 COG:no KEGG:CTC01469 NR:ns ## KEGG: CTC01469 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 227 1 227 230 192 47.0 6e-48 
MDVYEIKIKLYLLKDIKIEETQTYLAYFIDSVMVKDNMFLGIHETNQYKFYTFDSLYPLA KNGVYQKDNSYVFRIRTLDYQLAQYLYDTLAKNRTKEFQGLTAEVKIIKPKLIKKIYTLT PVILKTEQGYWKNSIKTEDFEKRLKTNLIKKYKDITGEEINEDFQLYYQINFKNKVPVSR KYKGIKLLGDMIELEIAENDNAQKLAMLAIGAGLLEMNARGFGFVNYIYY Prediction of potential genes in microbial genomes Time: Thu May 26 10:09:56 2011 Seq name: gi|223714123|gb|ACDT01000092.1| Coprobacillus sp. D7 cont1.92, whole genome shotgun sequence Length of sequence - 1736 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 34 - 1008 844 ## NT01CX_0205 hypothetical protein + Term 1256 - 1295 -0.9 + Prom 1144 - 1203 7.6 2 2 Tu 1 . + CDS 1342 - 1735 486 ## CTC01467 hypothetical protein Predicted protein(s) >gi|223714123|gb|ACDT01000092.1| GENE 1 34 - 1008 844 324 aa, chain + ## HITS:1 COG:no KEGG:NT01CX_0205 NR:ns ## KEGG: NT01CX_0205 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 1 317 139 457 594 202 41.0 1e-50 MYNFINDNISEIDQERLLHNKQWLKEHIYNLEGVDYNEKDYLKIFFEVDDQLYIDEGNRY LLTKIFNCNDYNVYDNGTVYGLPNYNLQLNAKKPFLENKTRMHKIPYMVSLEDALLQKQF FDYLLNRVSTGKSNVYINEDDDKRIYCLDNTENIDKGFNGFYLKTKKGKELEIHYMDVVT DYKQYLNPLFDFENVIGALDDECYREYKYRNDVEKLINNILFSKYLINNYFTAPDDIKGI KTDSVYKSNLLTCRNAIFAWTRAGRVDNIGYILPKAALGVVINSIRKEYIRLAQKQLNLY FALDKYFNKQENNMENVRESLELK >gi|223714123|gb|ACDT01000092.1| GENE 2 1342 - 1735 486 131 aa, chain + ## HITS:1 COG:no KEGG:CTC01467 NR:ns ## KEGG: CTC01467 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 131 3 133 319 181 69.0 1e-44 MDKRVYGVIGISSIMANWNADFSGYPKSLSDGTIFGSDKALKYPMKKMWDNEGKKVLYMK SMKFSPQKDGTVTLVPNSLKERYEKLFDVDDLGKCKDAKEVLSNLMSAVDVKNFGATFAE AKNNISITGAV Prediction of potential genes in microbial genomes Time: Thu May 26 10:10:05 2011 Seq name: gi|223714122|gb|ACDT01000093.1| Coprobacillus sp. 
D7 cont1.93, whole genome shotgun sequence Length of sequence - 8151 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 8, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 11 - 70 4.0 1 1 Tu 1 . + CDS 100 - 435 350 ## gi|237733728|ref|ZP_04564209.1| predicted protein - TRNA 2059 - 2134 77.0 # His GTG 0 0 2 2 Op 1 . - CDS 2261 - 2605 429 ## gi|237733729|ref|ZP_04564210.1| predicted protein 3 2 Op 2 . - CDS 2672 - 3439 710 ## COG1434 Uncharacterized conserved protein - Prom 3461 - 3520 9.2 + Prom 3395 - 3454 10.3 4 3 Tu 1 . + CDS 3572 - 4690 1266 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family + Term 4756 - 4826 -0.9 - Term 4776 - 4818 -0.3 5 4 Op 1 . - CDS 4823 - 5359 421 ## Sterm_2283 major facilitator superfamily MFS_1 6 4 Op 2 . - CDS 5371 - 5709 168 ## gi|167754634|ref|ZP_02426761.1| hypothetical protein CLORAM_00136 - Prom 5755 - 5814 3.6 7 5 Tu 1 . - CDS 5820 - 6002 57 ## gi|167754634|ref|ZP_02426761.1| hypothetical protein CLORAM_00136 - Prom 6096 - 6155 7.1 8 6 Tu 1 . + CDS 6025 - 6231 85 ## gi|167754635|ref|ZP_02426762.1| hypothetical protein CLORAM_00137 - Term 6023 - 6067 4.2 9 7 Tu 1 . - CDS 6241 - 7599 1385 ## COG0110 Acetyltransferase (isoleucine patch superfamily) - Prom 7641 - 7700 7.0 10 8 Tu 1 . 
- CDS 7714 - 8076 222 ## COG5265 ABC-type transport system involved in Fe-S cluster assembly, permease and ATPase components Predicted protein(s) >gi|223714122|gb|ACDT01000093.1| GENE 1 100 - 435 350 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733728|ref|ZP_04564209.1| ## NR: gi|237733728|ref|ZP_04564209.1| predicted protein [Mollicutes bacterium D7] # 1 111 34 144 144 204 100.0 1e-51 MIEIYDLNDKLITRVNNQDQLDEFMDNNEFEYPEVIIKENGKKRKYTFEDSEFESAKREL LKKSKYFHMSRICALANVDYYSFNLWKNKGIFVLTPKEIDALYDAISHVCK >gi|223714122|gb|ACDT01000093.1| GENE 2 2261 - 2605 429 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733729|ref|ZP_04564210.1| ## NR: gi|237733729|ref|ZP_04564210.1| predicted protein [Mollicutes bacterium D7] # 1 114 7 120 120 198 100.0 9e-50 MFSKLYITIKASDKFDDSIIDPISFILIDWGVDGNLIDWDIYPSRAKLEINTENDNYQTY SITYQGMQTIMELCYECEFVMTIIDNDEKSESYTTYFSRENSNNFETDISELLE >gi|223714122|gb|ACDT01000093.1| GENE 3 2672 - 3439 710 255 aa, chain - ## HITS:1 COG:CAC0441 KEGG:ns NR:ns ## COG: CAC0441 COG1434 # Protein_GI_number: 15893732 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 11 254 12 256 259 134 35.0 1e-31 MEKNCKLFDYILLFLALLSLFYFIIVVTNSRFINAYILYPIFTALLMCYAVYELHFRKSI LSSLPKILNYSIKGIIIISVIVFIIIEGLIIHEINNKYNKRSDFLIVLGARLYGDKPAPL LRYRLDAAVEYHQKFPDVPIIVSGGQGHGENITEAKAMKDYLINKGIDESLIIEENKSTN TNENIIYSKKIIETSTTEDYEVVIITNGFHCYRSKLLANKHGLEAHTYAARENLDTAPHY YLREFFGCLKDMILS >gi|223714122|gb|ACDT01000093.1| GENE 4 3572 - 4690 1266 372 aa, chain + ## HITS:1 COG:TM1183 KEGG:ns NR:ns ## COG: TM1183 COG1453 # Protein_GI_number: 15643939 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Thermotoga maritima # 1 366 1 371 379 323 42.0 2e-88 MKTKKFDNLGIEPSLLGFGCMRFPLDEDGKINEKEAEKMIDKAIESGVTYIDTAFPYHNG DSEPFVGKVLKKYKREDFFLATKLPIWNVSSKEEAKKVFEDQLKRLDVEYVDFYLLHALD ADKWRKVLEYDLIGMCEEFREEGKIRNIGFSFHDEYPVFKEILEYHDWDFCQLQLNYMDM 
DIQAGMKGYLLAEKYNIPIIVMEPIKGGSLANLPEDIENKFKAYNPDLSISSWALRYVAS LPNVKVVLSGMSTYGQVLDNLATFKNFEYLKAEEVALIQDVRDTLKARTQNGCTGCAYCM PCPFGVDIPNNFKYWNNAFVYDSHDQFKAKLEKMVSEAKAENCKQCGACEKMCPQQLPIR EDLKRVCEYMKK >gi|223714122|gb|ACDT01000093.1| GENE 5 4823 - 5359 421 178 aa, chain - ## HITS:1 COG:no KEGG:Sterm_2283 NR:ns ## KEGG: Sterm_2283 # Name: not_defined # Def: major facilitator superfamily MFS_1 # Organism: S.termitidis # Pathway: not_defined # 2 162 231 391 402 86 29.0 4e-16 MFAAATSLSTYLINVVKNLGGNTSLYGVAIFFMAASEMPVMAITPRLIRRYDSITLIMVA AFFYIIRNFTICLAPSLPVLFIGMMMQSLSYGLLTAVITYYVTYNLKSHDQMMGQTMIGI MTSGVGSTIGNLFGGILQDQYGLNMMFIFACLITIIGVIIIFSTGYFQKKQLNTPTKQ >gi|223714122|gb|ACDT01000093.1| GENE 6 5371 - 5709 168 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754634|ref|ZP_02426761.1| ## NR: gi|167754634|ref|ZP_02426761.1| hypothetical protein CLORAM_00136 [Clostridium ramosum DSM 1402] # 1 109 107 215 401 156 89.0 3e-37 MLLYVSMMSLIVSTVPFLSMIAMNYIQDGKEINFGLARGMGSISYAVSAVLLGQFIEFFN PTILAYVFVISAILDLVILYSLPNTNVKQATHKKRGIFLQLLKIIKSSFSFS >gi|223714122|gb|ACDT01000093.1| GENE 7 5820 - 6002 57 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754634|ref|ZP_02426761.1| ## NR: gi|167754634|ref|ZP_02426761.1| hypothetical protein CLORAM_00136 [Clostridium ramosum DSM 1402] # 1 53 9 61 401 99 100.0 6e-20 MKKLQMKYNLLHILYWMATCCIYGYVAVFLQYKGMSNTEIGIVSGSGCILTIFYLRLSHL >gi|223714122|gb|ACDT01000093.1| GENE 8 6025 - 6231 85 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754635|ref|ZP_02426762.1| ## NR: gi|167754635|ref|ZP_02426762.1| hypothetical protein CLORAM_00137 [Clostridium ramosum DSM 1402] # 1 68 1 68 68 89 100.0 7e-17 MKNKIALEMILKAIRYLVVNPSYFKINIAIFISSSKKNRGFSKIEKINKDIDFLRTNIFI IKLAILIF >gi|223714122|gb|ACDT01000093.1| GENE 9 6241 - 7599 1385 452 aa, chain - ## HITS:1 COG:SPy1065 KEGG:ns NR:ns ## COG: SPy1065 COG0110 # Protein_GI_number: 15675057 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Streptococcus 
pyogenes M1 GAS # 267 451 3 187 188 228 57.0 2e-59 MIKLVAVDMDGTFLDDQMHYNRTIFRKIYNYFKENDIYFVVASGNQYYQLKSFFDDYQDE ITYISENGAYIIENKKEIFHCEINKQDIYTITKKLQEDSRIQICLCGIKSAYLLNASNEF YNVYSKYYHRLEMIESLDEINDTIVKFALFVPAEDSKNILNRLTDDIGNIIKPVSSGHQS IDLIDPRYNKGTALELLCKRYNLSLEECAVFGDSLNDLEMLKKAKYSFVMENANQSIKNI AYKIIPSNNHYGVLKTLLDLFNVSNITEREKALLGMWYDANNDQKVLTDRLNTHHLCFKY NHTDPLDQDIRQKILNELFQNENSNLEIISPFFCDCGNLITFGHNVFINSNAYFMDGAKI NIGSNVYIGPSVGLYTAIHPLDYKRRNQGLEKAMPIEIGDNTWLGGNVVVLPGVKIGHGS VIGAGSVVTKDIPPNVLAFGNPCRVVKAIDQS >gi|223714122|gb|ACDT01000093.1| GENE 10 7714 - 8076 222 120 aa, chain - ## HITS:1 COG:RSc0478 KEGG:ns NR:ns ## COG: RSc0478 COG5265 # Protein_GI_number: 17545197 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease and ATPase components # Organism: Ralstonia solanacearum # 2 116 434 550 592 96 43.0 1e-20 MSIYENLALIDPDINNIKKACQQAEIDEYIMSLPKQYDTILSDGALNFSGGQQQRLAIAR ALLKNSKILLFDEITSALDEKTSYEIFQTLIKLKKTHTILMISHKPSEYQQCDQIINLSF Prediction of potential genes in microbial genomes Time: Thu May 26 10:10:36 2011 Seq name: gi|223714121|gb|ACDT01000094.1| Coprobacillus sp. D7 cont1.94, whole genome shotgun sequence Length of sequence - 3167 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 208 - 1131 421 ## gi|167754632|ref|ZP_02426759.1| hypothetical protein CLORAM_00134 2 1 Op 2 . - CDS 1128 - 1676 462 ## COG1309 Transcriptional regulator - Prom 1875 - 1934 12.2 + Prom 1993 - 2052 5.3 3 2 Tu 1 . + CDS 2093 - 2644 707 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 4 3 Tu 1 . 
+ CDS 2976 - 3165 307 ## gi|167754629|ref|ZP_02426756.1| hypothetical protein CLORAM_00131 Predicted protein(s) >gi|223714121|gb|ACDT01000094.1| GENE 1 208 - 1131 421 307 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754632|ref|ZP_02426759.1| ## NR: gi|167754632|ref|ZP_02426759.1| hypothetical protein CLORAM_00134 [Clostridium ramosum DSM 1402] # 1 303 1 303 541 448 99.0 1e-124 MKNLRIIKKYFQVTKANKKITFILVLASLLANGPYMFTSLLFSLTINYLTKQNSKMVIIT MILYFVLKIASKLFKIISYNIEKKLYNDVYLKLQNDMIKKLDIINMKYFTNHSKGEILNI VNGDIKLLAEFGTWLSQAILLFLSFIISIIILSQISISLMILGCIINSIVIYILNIYNDK FEILTKEGKLKADDEMQFYSQLLNGLRDIKIFSILDQLHAKYQLLNKAYLNIHDKQIKNK IISNIISPSITMCTEIILMFYACYNCLNGKFEIDTVLIIQSYFGTLFSSLSDFISTLGEL RIKKCIN >gi|223714121|gb|ACDT01000094.1| GENE 2 1128 - 1676 462 182 aa, chain - ## HITS:1 COG:FN1004 KEGG:ns NR:ns ## COG: FN1004 COG1309 # Protein_GI_number: 19704339 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 176 1 174 188 66 31.0 3e-11 MPPKTRITKELIIEKSFEITKSEGIENLNARYLAKQLNCSTMPIFKVFNDMNELKFELKK KIDAYYNDFILAHLDKNNYLYTMSFAYIDFAIKEKNLFGALFVNEFIKTRSIEEVVHSAW NRETIEYSAQQFGITIRESEILYRDIRFYTHGIATQIYGGNMALKEEEIQTLLKNAINKF LK >gi|223714121|gb|ACDT01000094.1| GENE 3 2093 - 2644 707 183 aa, chain + ## HITS:1 COG:AF0112 KEGG:ns NR:ns ## COG: AF0112 COG0846 # Protein_GI_number: 11497732 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Archaeoglobus fulgidus # 4 164 19 214 253 73 34.0 1e-13 MKIVAFTGAGISKQSNIPTFMERPDVREKLFRNYANNHHEEYNEVIKQLKANMNGAEPND AHIALAEYDIPIITMNIDGLHKQAGSDALELHGGLPEEDELSIAWSLFNKPVLYGDPAPN YQKAYEMVGALKKNDIFIVIGCSYHTAIACDLREVAKSRGAKIIEIQEDAAHNVRKVLKG LLK >gi|223714121|gb|ACDT01000094.1| GENE 4 2976 - 3165 307 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754629|ref|ZP_02426756.1| ## NR: gi|167754629|ref|ZP_02426756.1| hypothetical protein CLORAM_00131 [Clostridium ramosum DSM 1402] # 1 63 1 63 338 119 100.0 6e-26 
MIIVLKPHTSDESIKKIEDVIRARGAEPHVSKGEIQTIIGMVGDTTQIDPKAIEVEPCVE KVM Prediction of potential genes in microbial genomes Time: Thu May 26 10:10:52 2011 Seq name: gi|223714120|gb|ACDT01000095.1| Coprobacillus sp. D7 cont1.95, whole genome shotgun sequence Length of sequence - 1151 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 19 - 78 9.0 1 1 Tu 1 . + CDS 109 - 291 296 ## PROTEIN SUPPORTED gi|167754628|ref|ZP_02426755.1| hypothetical protein CLORAM_00130 - Term 370 - 413 6.0 2 2 Tu 1 . - CDS 473 - 754 167 ## CLM_2085 hypothetical protein - Prom 799 - 858 6.0 Predicted protein(s) >gi|223714120|gb|ACDT01000095.1| GENE 1 109 - 291 296 60 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167754628|ref|ZP_02426755.1| hypothetical protein CLORAM_00130 [Clostridium ramosum DSM 1402] # 1 60 1 60 60 118 100 2e-27 MAKTVVRENESLDDALRRFKRQVSRTGTLAEARKREFYVKPGLKRKLKSEAARKNNKGKR >gi|223714120|gb|ACDT01000095.1| GENE 2 473 - 754 167 93 aa, chain - ## HITS:1 COG:no KEGG:CLM_2085 NR:ns ## KEGG: CLM_2085 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A2 # Pathway: not_defined # 3 79 125 207 227 66 44.0 3e-10 MGVKGYNWWETELPNNQEVLLNLKAHNFKIDYILTHTCPKSTVYYMNLPPIRGELDLTEF LQSIENQVSHKHWYFGHFHCEKKNKQDDGLFIF Prediction of potential genes in microbial genomes Time: Thu May 26 10:11:06 2011 Seq name: gi|223714119|gb|ACDT01000096.1| Coprobacillus sp. D7 cont1.96, whole genome shotgun sequence Length of sequence - 25954 bp Number of predicted genes - 29, with homology - 29 Number of transcription units - 9, operones - 6 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 524 530 ## gi|167754625|ref|ZP_02426752.1| hypothetical protein CLORAM_00127 2 1 Op 2 . 
+ CDS 525 - 1481 1131 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 3 1 Op 3 40/0.000 + CDS 1474 - 2148 731 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 4 1 Op 4 7/0.000 + CDS 2141 - 3358 1006 ## COG0642 Signal transduction histidine kinase + Prom 3386 - 3445 6.8 5 1 Op 5 . + CDS 3466 - 4665 1641 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Term 4674 - 4712 4.5 + Prom 4678 - 4737 11.3 6 2 Tu 1 . + CDS 4792 - 5349 557 ## Balac_0438 hypothetical protein + Term 5556 - 5597 4.2 + Prom 5531 - 5590 7.2 7 3 Op 1 . + CDS 5610 - 6254 436 ## LBA0965 hypothetical protein 8 3 Op 2 . + CDS 6256 - 6615 453 ## COG1803 Methylglyoxal synthase + Prom 6622 - 6681 2.5 9 4 Op 1 . + CDS 6701 - 6925 225 ## COG1476 Predicted transcriptional regulators 10 4 Op 2 . + CDS 6918 - 7331 444 ## gi|237733748|ref|ZP_04564229.1| predicted protein + Term 7390 - 7434 10.8 + Prom 7350 - 7409 3.4 11 5 Tu 1 . + CDS 7442 - 7741 114 ## gi|169349480|ref|ZP_02866418.1| hypothetical protein CLOSPI_00198 - Term 8149 - 8198 3.2 12 6 Tu 1 . - CDS 8214 - 8390 331 ## gi|167754615|ref|ZP_02426742.1| hypothetical protein CLORAM_00117 - Prom 8411 - 8470 5.9 + Prom 8429 - 8488 6.5 13 7 Op 1 . + CDS 8508 - 9458 1005 ## gi|167754614|ref|ZP_02426741.1| hypothetical protein CLORAM_00116 14 7 Op 2 7/0.000 + CDS 9501 - 10169 227 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 15 7 Op 3 5/0.000 + CDS 10171 - 11475 1577 ## COG1160 Predicted GTPases 16 7 Op 4 . + CDS 11537 - 12541 1201 ## COG0240 Glycerol-3-phosphate dehydrogenase 17 7 Op 5 . + CDS 12581 - 12823 329 ## gi|167754610|ref|ZP_02426737.1| hypothetical protein CLORAM_00112 + Prom 12843 - 12902 8.1 18 8 Op 1 20/0.000 + CDS 12926 - 13429 520 ## COG1399 Predicted metal-binding, possibly nucleic acid-binding protein 19 8 Op 2 . 
+ CDS 13446 - 13613 235 ## PROTEIN SUPPORTED gi|56964136|ref|YP_175867.1| 50S ribosomal protein L32 + Term 13614 - 13651 1.7 + Prom 13646 - 13705 3.1 20 9 Op 1 29/0.000 + CDS 13812 - 14243 544 ## COG2001 Uncharacterized protein conserved in bacteria 21 9 Op 2 . + CDS 14253 - 15197 909 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 22 9 Op 3 . + CDS 15187 - 15486 377 ## gi|167754605|ref|ZP_02426732.1| hypothetical protein CLORAM_00107 23 9 Op 4 3/0.000 + CDS 15486 - 17624 2576 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 24 9 Op 5 . + CDS 17670 - 19571 2065 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 25 9 Op 6 1/0.000 + CDS 19614 - 20936 1297 ## COG1253 Hemolysins and related proteins containing CBS domains 26 9 Op 7 . + CDS 20969 - 22276 1356 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 27 9 Op 8 . + CDS 22338 - 22583 372 ## EUBELI_00054 hypothetical protein 28 9 Op 9 . + CDS 22658 - 22963 148 ## gi|167754598|ref|ZP_02426725.1| hypothetical protein CLORAM_00100 29 9 Op 10 . 
+ CDS 23023 - 25954 3109 ## COG0587 DNA polymerase III, alpha subunit Predicted protein(s) >gi|223714119|gb|ACDT01000096.1| GENE 1 3 - 524 530 173 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754625|ref|ZP_02426752.1| ## NR: gi|167754625|ref|ZP_02426752.1| hypothetical protein CLORAM_00127 [Clostridium ramosum DSM 1402] # 1 173 134 306 306 293 100.0 3e-78 NDIYDDLKDSFKGKIDYLNIYQEGGVLFVKYTNSVGAKKVEKNFENIYASKDGVIQKIDV SSGNILVKLNQYVKKGDLLVSNTITSTSEVDKIIETEGTIKAYTYTTYEAKIDKKKMDDG EAFSYLLHTIRSKLGSIDKIDREKVLSYGIIDNKRVLKVQYILIEDIASKEEN >gi|223714119|gb|ACDT01000096.1| GENE 2 525 - 1481 1131 318 aa, chain + ## HITS:1 COG:BS_phoH KEGG:ns NR:ns ## COG: BS_phoH COG1702 # Protein_GI_number: 16079588 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Bacillus subtilis # 20 317 23 319 319 377 64.0 1e-104 MTEQVKLEEYSIAQLNNICGNHDENFKMIEEALNVSISLRGDELSITGDDPEHFKEAKKV VAALLGLVAKGITITRRDVVYALKLTKEDNLGKISELYNIRITKTASGKMIYPKTMGQKD YYFALKNNDVVFGVGPAGTGKTYLAVVFAVDALKNNIVKKIVLTRPAVEAGENLGFLPGD LKEKVDPYLRPLYDALHDMLGVEQTERLIEKGVIEIAPLAYMRGRTLEDAYVILDEAQNT TDNQMKMFLTRLGFNSKMIITGDITQIDLPRGVESGLIKALKILKGVKGISFIHLTAMDV VRHPVVQRIIERYEGKDE >gi|223714119|gb|ACDT01000096.1| GENE 3 1474 - 2148 731 224 aa, chain + ## HITS:1 COG:BH0372 KEGG:ns NR:ns ## COG: BH0372 COG0745 # Protein_GI_number: 15612935 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 1 219 1 220 222 160 38.0 2e-39 MNKLLIIEDDVSINKILCHELTNKDYQVDSLYDGKDACKQILSNEYDLVLVDWMLPYKDG VQIIQECRNNGYIKPMILLTARSEQEDIIKGLESGADDYLMKPFQANILIARIEAHLRRY HKHFKGIIKYLDIKMDLSKHEVFVGDELINLTKVEYDLLQMFIDNKEEVLSRQELLMKIW NFNYDGDTRLVDIHVFKLKTKLKDSQACFQSVRGVGYKLVSKDE >gi|223714119|gb|ACDT01000096.1| GENE 4 2141 - 3358 1006 405 aa, chain + ## HITS:1 COG:BS_phoR_3 KEGG:ns NR:ns ## COG: BS_phoR_3 COG0642 # Protein_GI_number: 
16079962 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 146 402 12 271 279 181 38.0 2e-45 MNKNYQKVIFSAIIVIILMIWNFNYSYIYLIGFMIAMLFIVYDTNRYYQQQINFYKREKE NEIREVEGEKDFNHKILNSLIKTMNLPMIFVNKEGTIIFTNQSFRDAFKIKHLRGKYYKD IFTDQLLNIVDQSYIFERKFSTAVCIDERYYQIETTPVFRDDVAFDGSIILFTDVTQVKE IEKMQKQFFSDISHELKTPMSAIIGSVEILQKEGIKNKDTFNEFMGILLKESYRMQNIIG DILELSRLEQPQATLSPALVDIDSLIKDTVELFEPLAKDKQLSLVYQTKIKEELMLDYTT VKTILNNLVSNAIKYSNAGVISIKCNYKDDNLIIVVQDEGVGISKDNIPFIFDRFFQVDR SRSKKLGTGLGLSIVKKMVELNNGTIDVESTPGIGTTFTVTLPIL >gi|223714119|gb|ACDT01000096.1| GENE 5 3466 - 4665 1641 399 aa, chain + ## HITS:1 COG:BS_yyxA KEGG:ns NR:ns ## COG: BS_yyxA COG0265 # Protein_GI_number: 16081088 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Bacillus subtilis # 12 396 5 392 400 243 37.0 6e-64 MEERNPMEEQVEKEENSNQPESRMGKHNKRKVTVNTNGLFKIVITLLVVASLGMNVFLFT KVNKSSSSGSSSNKNATVENVNYDVKSSTTDVIEKVNSSVVGVLVYANGTASGSGSGVVY RVDGKTAYIITNAHVVSGATDVQVVFSNKESVNATIVGSDTYSDIAVLKLTADFDMTAIK CGDSSLLDQGETVLAIGSPLGIEYAGTVTQGIVSGIDRTVSVDLNDDGQEDWDMNVIQTD AAINPGNSGGALVNMAGELVGITSMKLSNTSVEGMGFALPINDVITSVEQIIENGKVTRP QIGISGVSLSGYSSYQLRYYRINTDLTDGIYVSRVTSGGAASKAGIQEGDIIVKFDGKEV TTYKSFLTELYSKEPGDKVSVVVNRNGTEKTIEVTLGEQ >gi|223714119|gb|ACDT01000096.1| GENE 6 4792 - 5349 557 185 aa, chain + ## HITS:1 COG:no KEGG:Balac_0438 NR:ns ## KEGG: Balac_0438 # Name: not_defined # Def: hypothetical protein # Organism: B.animalis_lactis_Bl-04 # Pathway: not_defined # 26 185 65 233 353 91 33.0 1e-17 MKCKKCGQELEIGVKFCPNCGEDCVSQDTGIPVKESVDKTKSAINDFGENMQKKEVEISG KKFNMLEVLTFGSGILAIISCFLPFASVSIFGYSQSVSLMDGGDGIICIALVVVALILSY LHKDIVALGLAGATAVLAVYEIFNAKSVLGGYGNISIGAYLLLLAAIGLVAGIFLVYNNT KKQQN >gi|223714119|gb|ACDT01000096.1| GENE 7 5610 - 6254 436 214 aa, chain + ## HITS:1 COG:no KEGG:LBA0965 NR:ns ## KEGG: LBA0965 # Name: not_defined # 
Def: hypothetical protein # Organism: L.acidophilus # Pathway: not_defined # 4 204 6 210 229 63 25.0 5e-09 METTKTRKLVLSAALAALGMILGLLEIPYPLAPWLNLDLSEIVVIMAISMLGFKSALFVC VCKFFVSILFKGPVGPIAIGQITALIASLTICCVYYYLSRHLKLQKEWQSYAVNMIITMF VFAMVMFILNYLFVTPTYLTQKPTWYTDLPFVLDINSFNQQYGTNMSVPGFLNFLSPYGQ AIFIIYFPFNFIKGIISAIVYYIVRPIESKFKEA >gi|223714119|gb|ACDT01000096.1| GENE 8 6256 - 6615 453 119 aa, chain + ## HITS:1 COG:BS_ypjF KEGG:ns NR:ns ## COG: BS_ypjF COG1803 # Protein_GI_number: 16079305 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Bacillus subtilis # 1 119 1 119 137 152 59.0 2e-37 MNIALIAHDQKKGELIDFVKDNEEIFARHNLYGTGTTGKKVMENTNLEVTRFLSGPYGGD QQVGNLIALGQMDMVIFFRDPLTAQPHEPDISALMRLCDVHYIPLASNKATALMLLKSL >gi|223714119|gb|ACDT01000096.1| GENE 9 6701 - 6925 225 74 aa, chain + ## HITS:1 COG:MA4668 KEGG:ns NR:ns ## COG: MA4668 COG1476 # Protein_GI_number: 20091114 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 1 59 12 70 79 63 55.0 6e-11 MKTRIKEMRLIKNMTQQQLADLVHVSSRTIISIEKEQYNPSLMLAYRIAEVFDTTIEELC CLKENKELEDQKNA >gi|223714119|gb|ACDT01000096.1| GENE 10 6918 - 7331 444 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733748|ref|ZP_04564229.1| ## NR: gi|237733748|ref|ZP_04564229.1| predicted protein [Mollicutes bacterium D7] # 1 137 1 137 137 221 100.0 1e-56 MHKINPRYIINGTLGIILVFATAYVIWQKGFVLKYGLALLVAFVLGVYNLYHCFDNDWED GLKNNTDERDLFIAMKSGQKTVQLMNLLLYAGSIITIVLYGITKKMMFMIAGTTLVSVVV VMFVIFFVINNYYEKHG >gi|223714119|gb|ACDT01000096.1| GENE 11 7442 - 7741 114 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|169349480|ref|ZP_02866418.1| ## NR: gi|169349480|ref|ZP_02866418.1| hypothetical protein CLOSPI_00198 [Clostridium spiroforme DSM 1552] # 1 82 1 83 528 124 66.0 1e-27 MKNIELLESNYPIVNGNYDTFQCRLPLDFFIVIPIDDPLTSFVEIMKGINTSKYFDCSHR GDKGSNPNMILQVIKSFLKGTHLILNRKKYGTPEVITYR >gi|223714119|gb|ACDT01000096.1| GENE 12 8214 - 8390 331 58 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|167754615|ref|ZP_02426742.1| ## NR: gi|167754615|ref|ZP_02426742.1| hypothetical protein CLORAM_00117 [Clostridium ramosum DSM 1402] # 1 58 1 58 58 80 100.0 3e-14 MAKITVNESCIGCGACTGVAPDVFEMNDEGLASVVGDDVASAKEAAESCPVEAIEVED >gi|223714119|gb|ACDT01000096.1| GENE 13 8508 - 9458 1005 316 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754614|ref|ZP_02426741.1| ## NR: gi|167754614|ref|ZP_02426741.1| hypothetical protein CLORAM_00116 [Clostridium ramosum DSM 1402] # 1 316 1 316 316 412 100.0 1e-113 MNSSRVNKYRDLRAGLKDEAGINRENIEDTIDIIDDIEDDDFLATINRSFKQEEKDLPDI NDTLTEAKTFEQMRQESSEEINRALRSAKVSVGKEAQYNTRMDILNKIREPEKQVVHIDN LDNINTSEFSKGYFVNQEESVVTKAEEEKKAAKKKMTLMERLASMSPEEDAKKAKLVMDE VQEEAEQKENELEEALKKDTGVVSSGTAEDTLEDEELAKIEQTRSLEEMLRQIKEKDQRE VEKVLKQKEDTADLKVIKNQTNDKTIDSKSKAKQSRSDIVDEKKSDRIATILNYVIIFLV VVFIGLCGMIGYQLFF >gi|223714119|gb|ACDT01000096.1| GENE 14 9501 - 10169 227 222 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. 
Nichols] # 1 213 32 281 863 92 25 3e-18 MKKISIAVDGPSAAGKSSIAKIVAKRLDYIYIDTGAMYRCVGYYCLENNIDLKDEQAVSQ ALKQIKIEMDSNNHIFLNGQDVSQVIRQDQVSMSASVVSSYQAVRTFLVEQQRQMANVGG VILDGRDIGTVVLPNAELKIYQIASVDTRAKRRYLENQERGLTADLELIKKDIEQRDYQD MNRDISPLKQAEDAIVLDTSAMTLQEVVDAVLKLVEEKIKEV >gi|223714119|gb|ACDT01000096.1| GENE 15 10171 - 11475 1577 434 aa, chain + ## HITS:1 COG:BS_yphC KEGG:ns NR:ns ## COG: BS_yphC COG1160 # Protein_GI_number: 16079341 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Bacillus subtilis # 4 433 5 435 436 565 62.0 1e-161 MAGVVAIVGRANVGKSTIFNRIVGERISIVEDVAGVTRDRIYATATWLTKEFRLIDTGGI ELQDASFTAQIKMQAEIAIEEADLIVFVVNGREGITREDEYVARLLQKTSKPVLLAVNKI DDNAFKDDIYEFYNLGIGDPIAVSGSHGIGIGDLLDEIIKNLDFVEEEFDEDEIRFSIIG RPNVGKSSLTNAILGEERVIVSNIEGTTRDAIDTAFVKDGQKYRVIDTAGMRKKGKVYEN IEKYSILRALSSIEKSDVIVVVIDGNQGVIEQDKHVAGYAHEAGKGVILVVNKWDLVEKD EKTMQKKEKELRSQFKYLDYARIIFLSAKTHQRVQQLFPLIQESFENSHKRVQTSVLNDV LVDAQAINPTTTFNGGRLKIFYANQVSICPPTFVLFVNDPQYMHFSYKRYLENRLRDSFG FEGTPIHIICRKRD >gi|223714119|gb|ACDT01000096.1| GENE 16 11537 - 12541 1201 334 aa, chain + ## HITS:1 COG:L0016 KEGG:ns NR:ns ## COG: L0016 COG0240 # Protein_GI_number: 15673320 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Lactococcus lactis # 3 333 5 334 341 279 45.0 6e-75 MSKIAVLGTGSWATALSRVLIDNQNEIMMYGIDQEQIDDINFNHQNEAFFHNISLESEIK ATNNLEEALMDMDYLLITIPTQFVKETLEQVKPLIKKKITVINAAKGFDLGTNMRMSDTI RSVLDENLINPVVSLIGPSHAEEVVVRMLTTVCAVSLDDDTARDVQTLFSNDYFRVYRLN DEVGAEYGVAIKNVIAVAAGVLSGLGYGDNTKAALITRGLAEMVRYGTKKGGKLETYMGL TGVGDLIVTCSSVHSRNFQAGLEIGKTNDARAFMKNNKKTCEGIRTCRVIYEDAKKYTDI ELPIINAIYNVLYNNHEPKREIQQLMRRELKIEG >gi|223714119|gb|ACDT01000096.1| GENE 17 12581 - 12823 329 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754610|ref|ZP_02426737.1| ## NR: gi|167754610|ref|ZP_02426737.1| hypothetical protein CLORAM_00112 [Clostridium ramosum DSM 1402] # 1 80 1 80 80 92 100.0 1e-17 
MKDEDVLLDKVVKKTNVNKGEILKLANDLQSKDLNNENDIKDFIMSVAKVTNKPITPSKV DKLVSMIKSNKIPKDIDKMV >gi|223714119|gb|ACDT01000096.1| GENE 18 12926 - 13429 520 167 aa, chain + ## HITS:1 COG:BS_ylbN KEGG:ns NR:ns ## COG: BS_ylbN COG1399 # Protein_GI_number: 16078571 # Func_class: R General function prediction only # Function: Predicted metal-binding, possibly nucleic acid-binding protein # Organism: Bacillus subtilis # 28 166 18 167 172 88 38.0 5e-18 MKYNLQWIVKHRGKFDFEEGLTFPSELFDQYAQINDLKDIIVSGTGDLDLKDKRLYVDLN IKGTMILPCAITLEDVEYPFEINSTEVFAFEKPDPLEDVHEVKKDIVDLTPVVFENIMLE VPMRVVKDDANIKSKGKGWRILDNSTSDKDDDYIDPRLAKLKDYFKD >gi|223714119|gb|ACDT01000096.1| GENE 19 13446 - 13613 235 55 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|56964136|ref|YP_175867.1| 50S ribosomal protein L32 [Bacillus clausii KSM-K16] # 1 54 1 54 57 95 75 4e-19 MAVPFRRVSSTRRNKRRTHDKLTAPAVVVCPECGEYKMSHKVCKHCGTYKGQKVL >gi|223714119|gb|ACDT01000096.1| GENE 20 13812 - 14243 544 143 aa, chain + ## HITS:1 COG:lin2148 KEGG:ns NR:ns ## COG: lin2148 COG2001 # Protein_GI_number: 16801214 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 1 143 1 143 143 158 51.0 3e-39 MFMGEFRHNIDAKGRLIIPSKLREQCGESVVITRGFDGCLALYTQEGWNDYYQKLQTLPK TKREARNFVRIITSRASECEFDKLGRVNIPNVLRIEGKLEKECIIVGVGDHVEIWNQNIW DDYYDANKDNFDEISESLEGFEL >gi|223714119|gb|ACDT01000096.1| GENE 21 14253 - 15197 909 314 aa, chain + ## HITS:1 COG:BS_ylxA KEGG:ns NR:ns ## COG: BS_ylxA COG0275 # Protein_GI_number: 16078578 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Bacillus subtilis # 1 308 1 309 311 386 62.0 1e-107 MFDHVTVLLKEAVEGLNIKADGIYVDCTLGGAGHSCAILKQLTTGHLYCFDQDQTAIEAA RLRLDAVGNNYTIIYSNFVNIKEKLNELGVTRVDGILFDLGVSSPQFDTGSRGFSYNYDA RLDMRMDTSNPLDAYAIVNTWPYERIVEILYKYADEKFAKQIARKIEQKRQIRPIETTFE LVDIIKDAIPAFARRKGGHPAKRTFQALRIAVNDELNVFDRALKDSLDLLNVGGRISVIT 
FHSLEDKICKYTFNDVTKLKDVPPGLPVIPDYLQPRFKLINKKAMVASKEELEVNHRAHS AKLRIIEREFENET >gi|223714119|gb|ACDT01000096.1| GENE 22 15187 - 15486 377 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754605|ref|ZP_02426732.1| ## NR: gi|167754605|ref|ZP_02426732.1| hypothetical protein CLORAM_00107 [Clostridium ramosum DSM 1402] # 1 99 1 99 99 144 100.0 1e-33 MRHKTKKTKFEAFSQRFLVISMVIFVFGIIGVKSMESSYNRTAQVLEKEIKTIKSDIDSL EMQKQELASFSRLSSVANAKGYTYSNDSVAASTQAQQNQ >gi|223714119|gb|ACDT01000096.1| GENE 23 15486 - 17624 2576 712 aa, chain + ## HITS:1 COG:BS_pbpB KEGG:ns NR:ns ## COG: BS_pbpB COG0768 # Protein_GI_number: 16078580 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Bacillus subtilis # 2 685 8 687 716 373 33.0 1e-103 MNQFKSAKIAYVVFALVIFSLVANIVYLGVTGKHLISGADIATFAKSRGKAKTIDYATRG EIYTSDNEVVASNVKKYKLIAIVSSSRINHGKDDAYVKDITATANAIAPIIGMDPTVMAQ KLQEKVDAGAYQVEFGTNGNDLSAGVKKQIEDTGLPGLEFIESNSRNYPYGDFCSYIIGY AKGTETDGVKGLVGEMGLEAIYDKTLSGENGYKVYQKDSKGYVLPDGILEQKDAVDGDDI YLTIDSSLQRDLDYQLATEAAAAQALKASCVIMEAKTGKILALSSYPSFNPNERNIEDYK NFFFDTAYECGSVFKSFVYASSIESGLYNGAATYESGKYDYGGSRPIRDHNNGAGWGSIS YDEGFYRSSNVAICNLLERGYTNRDELLDVYEKLGFFQSSNIDGFDSVAGTALYKTDKSR AAYLTTGFGQGSTVTALQLLRAYSVFANDGKTVEPYLIEKIVNKETNEVTYKGKTEYSEQ IFSSQTVQHVRDLLKGVVTESMGTAKKFNLDNGVQIIGKTGTGQVVVDGGYSSTIYTKSF SGMAPYEDPQIIVNIVFQGADNDTTQHQANVIKSVMPAALSIVNKYNAPETTTVSSDYKL DSYVNQSVSFVKSKLESKAVNVQVIGNGNTVLEQYPEAGSKVAKNDRVFIKTESNDIVLP NFTGWSRKDVLTFGSLSGVNITIDGGNGLVGAQSAPEGTVVHNGDALTITLQ >gi|223714119|gb|ACDT01000096.1| GENE 24 17670 - 19571 2065 633 aa, chain + ## HITS:1 COG:BH2572 KEGG:ns NR:ns ## COG: BH2572 COG0768 # Protein_GI_number: 15615135 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Bacillus halodurans # 7 632 9 636 644 620 48.0 1e-177 MIYEITKKKIKWVKVVTVLIVIAIILKVGYIQIIDRVNIYNKAVDLWQRSFPVEANRGLI VDSDNNVLATNLTTASLVVVPSQIKDVAATADKIANILNVDVKVMQEKLSKKVSIQRIQP 
EGRQLDDEVAAQIDRLKLPGVYLIKDTKRYYPNDNYLGQTLGFVGIDNQGLLGLELKYDN YLNGNNGSIDYYMDAKSNPLSLYPSVYSAPTSGFNLQLTIDGDIQDIVERELNNAYDTYN PDGIWALAMEPSTGKVLAMASKPDYNPNDYQNADKDVYNHNIPIWKSYEPGSTFKIITFS SALNENLFDMDKDTYFDKGYEIVGGARIKSWKKGGHGLQTFREVLQNSSNPGFVEIGRRL GKDKLYEYVKKFGLTEKTGIDLPGESKGIMFDYDAFNELEQATVAFGQGISITPMQLVRA VCACVNGGTLYKPYLVDKIIDSYSNDIVYEHKPEALRKVISEDASKKMRDALESVVTDGG GKNAYIDGYRIGGKTGTAQKAVNGSYVDGGYILSFIGIAPIDDPKIVLYVAMDNPKNCVQ YGGTTVAPIARKMLVDILPSMNVKKVSSQRQKAYTFMDTKTLKVENYIGKSKKEVSNPEL KFEFIGEGDKVIDQLPRVGESVEAGSTIVIMLG >gi|223714119|gb|ACDT01000096.1| GENE 25 19614 - 20936 1297 440 aa, chain + ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 31 436 17 426 426 250 38.0 3e-66 MKNINNKENRFIVDPTQITNVIIFVILIIFSGIFSCTETAFLSVNKIRMRNLAEEGNKRA LTVEKLLSNSDRLFSSILVGNNLVNIGASSLSTSFVISIFGDSATLVAAATGVVTFLILV FGEITPKSFATKNADTIALFLARFVALVCTLFTPVVFFLNIITSFFIKILGGARDSGPTM TEEDLKTIVTVGHEEGVLEEQEKEMIHNVFEFGETEIKEIMTPRIHVESIPDDCSYQELM EIYQRSQFSRYPVHSESFDEIVGVLNVKDLLFFNIDPDEFVVKDFMRDTFVVYEFNEVAD VFASMRKEHATLAIVLDEYGVMSGIVTFEDIVEEIVGEIDDEYDAEEDEMIIFLGENEYL IDGSLNLNEVNDRVGTDFDSEDFESIGGLVLGEVSGVPEIDDEVQIENVIFKIVKMHKNR IAQLKVTILEEKKDEEEEHH >gi|223714119|gb|ACDT01000096.1| GENE 26 20969 - 22276 1356 435 aa, chain + ## HITS:1 COG:BS_yfjO KEGG:ns NR:ns ## COG: BS_yfjO COG2265 # Protein_GI_number: 16077869 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus subtilis # 7 435 22 463 466 201 31.0 2e-51 MKKNVEIKKLGINGEGIGYIDRKIVFVPGALPQEEVIVEIIKQTRTYSEGKLLQVVKPSK DRVTPKCRSYDNCQGCTMLHLSYFKQLNAKKEAVRESIRKYTEYDLSRTIFKDVIAAPSQ EGFITAVNLPIVDFKGKISFGIYQRDSKYLTLMTRCFKQHPIINECLLKLEEILTTNKCK TYSDKFRTGLRFLKVKLIDNKLQLVFITGRDGLKEEIVKEISQLPHVASIFMSVNTSKHQ 
EFDELGYSKLYGATRLELHDDKNQYLVSVKSKLPENIEMMWKQNQTIKSLVQDSQKIISL NCGIGLLELNLDQEIVAIDEKRYQIDDAKLNAKYLKKENVTFIAGDLDSKIVTYAKKKVY DTLIIQNERYGLSDTIKDTIKIAKFKTIIYACQSHSTLAKDLADLEKNYKLERIIGLDTS CHNSYLTTIVKLVRK >gi|223714119|gb|ACDT01000096.1| GENE 27 22338 - 22583 372 81 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00054 NR:ns ## KEGG: EUBELI_00054 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 79 1 77 82 94 59.0 9e-19 MEIKFQPQGVCSKQIVIDVEDGIVNNVKFIGGCSGNTQGVGALAKGMQVNEVIERLQGIR CGARPTSCPDQLAKALVKHAM >gi|223714119|gb|ACDT01000096.1| GENE 28 22658 - 22963 148 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754598|ref|ZP_02426725.1| ## NR: gi|167754598|ref|ZP_02426725.1| hypothetical protein CLORAM_00100 [Clostridium ramosum DSM 1402] # 1 101 1 101 101 125 100.0 6e-28 MLIRIKKLQFLCGAILLMQVLCPMWIVPFHLIATLLSIVIIGWQRRFCVLQVQYHYYVTI LYCYRIWLLSCTSWAIFDTVYMCLCLYFSIMIILFSFRAIL >gi|223714119|gb|ACDT01000096.1| GENE 29 23023 - 25954 3109 977 aa, chain + ## HITS:1 COG:BS_dnaE KEGG:ns NR:ns ## COG: BS_dnaE COG0587 # Protein_GI_number: 16079975 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Bacillus subtilis # 4 977 5 991 1115 726 41.0 0 MLGHLQVYSSYSFQNSTILIEDLVKKASDLHYEALALTDKDNMYGAMELAKYAKKYGIKA VFGLEASVEINHEIYPLLLLARDVTGYFDLVKITSEICLSERGAISLEALSLYREHLFVI SACKEGIIERLILKELDSEALKYLELFKNTFNNFYVGLQNHGIAMQQKLNERLVALATLQ NIPICVTNEVRYLNKDEAFTLDLLQASAKGMMLDLEHQVQTDQLYLKSSFEMESLFDKKY IENTNDIIRLCNVTIPTDQMHLPKYPVPKDGDAGDYLRQLCLVGLKKRFKGKNIPNEYIE RLKKELTVINKMGFNDYFLIVFDYVRYAKMHKILVGPGRGSAAGSLVAYVLGITGVDPLR YDLLFERFLNEERISMPDIDIDFQDDRRDEVVNYVTEKYGHEHVAQIVTFSTYGPKVAIK DLGKVLGVPLPRLELLTKNIPTNYKNRKSAREVFETSYSFQSMVNKDPALRKIMPAVFIV EKLPRNISTHAAGVVLSNDVLDQVVPLVRGPNNGVVTQYSKDYIEEVGLLKMDFLGLKNL TIIDYIIKDIAKNHQVEININDIDLQDKKTYEMISRGDTFGIFQLESSGMKNLLVKMQCD CLDDVIAAIALFRPGPMANIPSYIARKKGKEPITYPLECLKPILKSTYGILLYQEQIMQA AQLVAGFSLAKADILRKAMSKKTASLMKAMKSEFIQGGIANGFSELEAVKVFELIERFAN 
YGFNKSHSVVYGYIAYWLAYLKANFPLEFFSALLSNEQSSDASKLSCIQEGKKYGVKLLA PSINYSTDRFKVEDGNIRYSLLAIKNVGYAGYKAIEAERQNGTFKDIFDFVSRMESSKLN SKMLDSLIMAGAFDEFKLNRTTIKQNLHKIMEYAELKNSIGIDEPPILTIVRDNRIKVLE EEKMVLGVYLTMHPIALIKNNLKEQIVNLNELSNYIDQPVRVIMALSRVKVIVDKKGQEM CFVDGYDETGEVDAVVF Prediction of potential genes in microbial genomes Time: Thu May 26 10:12:11 2011 Seq name: gi|223714118|gb|ACDT01000097.1| Coprobacillus sp. D7 cont1.97, whole genome shotgun sequence Length of sequence - 12262 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 5, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 45 - 221 219 ## gi|167754597|ref|ZP_02426724.1| hypothetical protein CLORAM_00099 2 1 Op 2 4/0.000 + CDS 233 - 2845 2875 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 3 1 Op 3 . + CDS 2855 - 3667 754 ## COG0266 Formamidopyrimidine-DNA glycosylase 4 1 Op 4 . + CDS 3669 - 4616 996 ## COG0673 Predicted dehydrogenases and related proteins 5 1 Op 5 . + CDS 4617 - 5186 492 ## COG0237 Dephospho-CoA kinase 6 1 Op 6 8/0.000 + CDS 5203 - 6432 1197 ## COG3611 Replication initiation/membrane attachment protein 7 1 Op 7 . + CDS 6442 - 7350 760 ## COG1484 DNA replication protein + Prom 7365 - 7424 9.4 8 2 Op 1 12/0.000 + CDS 7458 - 8417 1415 ## COG0205 6-phosphofructokinase 9 2 Op 2 . + CDS 8453 - 9889 1889 ## COG0469 Pyruvate kinase + Term 9955 - 10009 13.1 + Prom 10033 - 10092 6.9 10 3 Op 1 . + CDS 10156 - 11001 1018 ## gi|237733776|ref|ZP_04564257.1| predicted protein 11 3 Op 2 . + CDS 10994 - 11437 375 ## gi|167754586|ref|ZP_02426713.1| hypothetical protein CLORAM_00088 + Term 11535 - 11583 1.6 - Term 11264 - 11304 2.5 12 4 Tu 1 . - CDS 11442 - 11642 310 ## gi|167754585|ref|ZP_02426712.1| hypothetical protein CLORAM_00087 - Prom 11677 - 11736 5.5 + Prom 11457 - 11516 4.6 13 5 Tu 1 . 
+ CDS 11703 - 12261 530 ## Hneap_0977 protein of unknown function DUF45 Predicted protein(s) >gi|223714118|gb|ACDT01000097.1| GENE 1 45 - 221 219 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754597|ref|ZP_02426724.1| ## NR: gi|167754597|ref|ZP_02426724.1| hypothetical protein CLORAM_00099 [Clostridium ramosum DSM 1402] # 1 58 960 1017 1017 119 100.0 9e-26 MCFVDGYDETGEVDAVVFASSFQVLKKILVRGEICLIEGRVNYRDKLSLIINQAKNVR >gi|223714118|gb|ACDT01000097.1| GENE 2 233 - 2845 2875 870 aa, chain + ## HITS:1 COG:BH3153_2 KEGG:ns NR:ns ## COG: BH3153_2 COG0749 # Protein_GI_number: 15615715 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Bacillus halodurans # 314 870 9 569 569 520 48.0 1e-147 MKELVMIDGNSLLYRAYYATAAMGNLMVNKDGVPTNAVYGFANMLESILKGNPEYLVVAF DYGKKTFRNDLFEVYKGTRSATPDELACQFSMIREYLTAHGIRYQEIEGYEGDDIIGTVA VKASKQRFKVSIVTSDKDMLQLVDDNISVYLTKKGVGELEKITPAKFKEIYGLIPDQMRD LKGLMGDKSDNIPGIPGVGEKTALKLLKEYHTVENLSEHLDDLKGKMGEKIRDNIDQGLL SKRIATIIKDVPMEIDLEEYRYQGHDYNELAEFYRRYSMNSLLKRMSLNNEEVQAEKAVD VKIVTQLPIIKRDSSLVVGVYDTNYHRSMIVGFAVYNDSEAYFIGTEDALNCTNFKNFIS DKKIEKYGYDIKKSINAAKWHGLEIENYVFDLQLASYILNPSLKDEIRSVCEYYEYYDVF YDEEVYGKGAKKKVPETSRVAEHLIKQAKAIYILKDQAIEKLNNENQFELYKDVEMPVAK ILARMEFQGAKVDSVVLKQLEEQFGREINILEKEIHTLANREFNIASPKQLGEVLFEDLG LPNGKKTKTGYSTSVDILNKLKDIHPIIDKVLKYRTLSKLYSTYIIGLQDQVFIDGKIHT IYNQALTQTGRLSSTDPNLQNIPVKTPEGKLIRKAFVPEYDYLVSFDYSQIELRVLAHLA GVKSLIKAFNEDKDIHRHTASEIFRVPEEQVDSTMRRNAKAVNFGIIYGMSDFGLAEQVG VSVGEAREFIKRYFENYPEIKLYMDSNIDFCKKNGYVTTMLNRKRFIREINEKNYMRQEF GKRLAMNSPIQGSAADIIKVAMIKVDELLKKHQLKSKMILQVHDELIFDVYQEELQEVME IVAKGMTTAIKMDVELKAEGSYAVNWYELK >gi|223714118|gb|ACDT01000097.1| GENE 3 2855 - 3667 754 270 aa, chain + ## HITS:1 COG:BS_mutM KEGG:ns NR:ns ## COG: BS_mutM COG0266 # Protein_GI_number: 16079960 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Bacillus subtilis # 1 269 3 278 278 226 43.0 3e-59 
MPELPEVETVRRTLKNFVLNKKIISIDVLYPNIIEDDVEEFIEACTNQTINDIDRAGKFL IFKLDDIAFVSHLRMEGKYHYVEHDEPLNKHDHIIFNLDDNKQLRYNDTRKFGRMKLVSL DNYMNEIPLCKLGPEPFNAKLEDIYPKLHKSNLPIKHAILDQSIIAGIGNIYANEICFAM GLNPNTPACKLTKKSVQELIEVSSAILNEAIAQGGTTIHSFSANGIDGLFQVKLKAHLQK VCPICGGEITKVAIKGRGTYYCKHCQKRRR >gi|223714118|gb|ACDT01000097.1| GENE 4 3669 - 4616 996 315 aa, chain + ## HITS:1 COG:CAC1480 KEGG:ns NR:ns ## COG: CAC1480 COG0673 # Protein_GI_number: 15894759 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Clostridium acetobutylicum # 2 313 4 318 320 176 35.0 4e-44 MINWGIVGAGRIAHRFCEALAQDSRANLEAVSCRTLEKAKAFQEKHPCNKAYDSFQAVLD DPEIEAVYIALPHLYHFEWVKKAILANKKVLVEKPATMNTAEMEEIKDLVKKHHILFMEA MKTRFVPGYQEAKSLVSEGIIGELECLETSFCGKVDYNQNSYLFDIKQGGCLLDLGIYNI AYLDDYFKVPLDTVDVTCKYHDCGVDTYVKAILNFGKQTGIVECAMDRNKENQAVLTGTL GKMIIRPIHRPLDLEVILNSGEIFKYHIEYDHDDFYSEIVHFNDLIENGQVESTIMSLDH SINCAKILEAVREMM >gi|223714118|gb|ACDT01000097.1| GENE 5 4617 - 5186 492 189 aa, chain + ## HITS:1 COG:lin1598 KEGG:ns NR:ns ## COG: lin1598 COG0237 # Protein_GI_number: 16800666 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Listeria innocua # 2 187 3 194 200 122 38.0 5e-28 MKVLGITGSIATGKSTVTNYLKQRGYLVVDSDKLAYDALTIDEVCIKQTKNRFDLPAGPI DRKALGRIIFNDKQAKKDLEAIIHPYVIKKMQEIIVLNQHLDLIFLDIPLLFESNLEYLC DAVIVVYLKEDEEIKRLMKRDNIDEDYARLIIGNQMSIEEKKMRADIVLDNNQGLDELYQ QIETLLKGR >gi|223714118|gb|ACDT01000097.1| GENE 6 5203 - 6432 1197 409 aa, chain + ## HITS:1 COG:BS_dnaB KEGG:ns NR:ns ## COG: BS_dnaB COG3611 # Protein_GI_number: 16079951 # Func_class: L Replication, recombination and repair # Function: Replication initiation/membrane attachment protein # Organism: Bacillus subtilis # 3 375 12 407 472 181 33.0 2e-45 MSDLYTVNASYPLDSIHMRSIQLLYQPLIGNNACGLYLSLYAELDQLSLTKALSLHSRLT KITGLSLTELNEASLKLEAIGLVSKYIKDSDNKRVYLYALKTPLTPTKFFNHAILGPLLL QRLGQEELYRTKLCFKSSAFSIEGFNNVTAKFTDVFDINLLNGCDDVLVEKDYFNTNYGS 
IESDYDLTLFYQGLADYQIKRSCITPDDEAVIKQLGTLYRINAIDMQGLVKDCLQNDKLI HSALISKCRDYYDLNMPETFKEIYHKQAIVHKSVTGDDALSKHISYLESINPYQLLKDKQ GGREPLKHDLQIVESIMTSLYLEPGVMNVLIELTLSQCDNALSRAFMQARASQWKRKKIK TVKDAMDEANVYLKYRRNNNDDNEETEINVVDNDNVDMDELDEFLSQFE >gi|223714118|gb|ACDT01000097.1| GENE 7 6442 - 7350 760 302 aa, chain + ## HITS:1 COG:BH3144 KEGG:ns NR:ns ## COG: BH3144 COG1484 # Protein_GI_number: 15615706 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Bacillus halodurans # 1 302 1 307 311 113 25.0 5e-25 MKSIQDINLFNNDDSFNKNKENSINTLMNDHNIIKVLQEFNLSRRDIEENWIEFLDYKED LDVCIGCKSLKSCPKISKGMVRLFNYQDGEVKLSLQPCRYGQAHFDDQKILNDILLKNVN NKILLTKPSDLTIIDDPNGNGRVVIKMMMDYIKNPTNKGFYLYGEGGNGKSTVMGFLIRC LVTKGYKCGYIHFPTFLMDLKSSFGNEGVNSSIELMKNLDYLVIDDVGGESVTSWSRDEI LSAIIAYRLQNQRATFFTSEYSLEKLKKLYTLKAGDKERVERLISRMKAVSIPVELKGKD LR >gi|223714118|gb|ACDT01000097.1| GENE 8 7458 - 8417 1415 319 aa, chain + ## HITS:1 COG:ECs4841 KEGG:ns NR:ns ## COG: ECs4841 COG0205 # Protein_GI_number: 15834095 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Escherichia coli O157:H7 # 1 317 1 317 320 355 57.0 6e-98 MVKCIGVLTSGGDAPGMNAAVRAVARTCLNKGIEVYGVRLGYKGLYEGDFIKFDRHSTRN IINLGGTILKSARFPEFKDPEIRKVAIEQMKKVGMEALVVIGGDGSYNGALKLTEMGVNC IGIPGTIDNDIPDTDFTVGFDTALNTIVDALDRLRDTSSSHQRCTILEVMGRRCGDLAIN AGIADGAEMIITSEIGFDESKVIERLKASKESDKKHAIVVITEHITDVHELAKKVEAATG FETRANVLGHMQRGGRPSARDRVLASRMGVYAVELLEAGKGGLCVSQVNGEIKGLDIEET LSHKRKTDFGIYEDAMKLR >gi|223714118|gb|ACDT01000097.1| GENE 9 8453 - 9889 1889 478 aa, chain + ## HITS:1 COG:BH3163_1 KEGG:ns NR:ns ## COG: BH3163_1 COG0469 # Protein_GI_number: 15615725 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Bacillus halodurans # 10 474 2 468 473 461 53.0 1e-129 MVFELKERKKKTRVVCTIGPASENEKMLRKLILAGMNVMRLNFSHGDFEEHGGRIVTVRK LSKELNKNIAILLDTKGPEIRTGDFVGGKTEFKKGQTTVITTEDIEGTSDRFTITYKELY KDVKPGGFILVNDGQVELLVDHVEGEDIVCVCANDGVVKNKRGINVPGIKLGFDYLSPKD 
TADITFGCEQGVNFIAASFVRRAQDVLDVKKLLIENGHPEIQIIAKIENSEGVENMDEIL KVADGIMVARGDLGVEVPAEDVPLIQKQLIKKCRAAGKVVITATQMLDSMQENPRPTRAE VSDVANAIYDGTDAIMLSGESAQGKYPEEAVMTMTKIALKTEETLDYASLLRKAIRTAPE DPSEAICMSVAEIASKFKVSAIVVYTESGSTAKRVSRYRPESMVIAATPYEPVTRSLALN WGVKGVVCQPMHDRAAQLEYAQVLAKENGVEPGEQILITAGTPGVGGTTSYLELVTVK >gi|223714118|gb|ACDT01000097.1| GENE 10 10156 - 11001 1018 281 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733776|ref|ZP_04564257.1| ## NR: gi|237733776|ref|ZP_04564257.1| predicted protein [Mollicutes bacterium D7] # 1 281 1 281 281 485 100.0 1e-135 MDKKHKQHLLVTLIFTLIVTATLFFMYDDFVFQTYGEVVYYDYILKGENNQLKVENIEAY LDRQSFHLGEGRIIFKDVNLTNGAVPTVKLSLYGENQQKFDYEFVVEEYHSDNLIYSIQS ISKKYKEIDLDDVKSASLTIEANDQKLSEVDLKITPVEQLEGSNKEYRIENASISNSMMR LGTLKAASDDVIKEYHTVSLEYRYLKDKNGDKEDNDNYVVFKKITGKSKELVNGNDYGTY NLEDDSFKDKDLSVVIIFSNGKEKFAFAIDLKTREVGDYYG >gi|223714118|gb|ACDT01000097.1| GENE 11 10994 - 11437 375 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754586|ref|ZP_02426713.1| ## NR: gi|167754586|ref|ZP_02426713.1| hypothetical protein CLORAM_00088 [Clostridium ramosum DSM 1402] # 1 147 1 147 147 187 100.0 2e-46 MDKIKKDFIVIYLARNLLATFAITVFSFVYDITNYYNMLPLRAFIKIFADNFYTTAYFLL LWILNYLLFEIYKIVMDALKDRNKNHGKLVIKGKEMIAYGSIIPIVILVILLIIDFNQLF KLNFILLVIFMLLRSIKEEFKQRKNRL >gi|223714118|gb|ACDT01000097.1| GENE 12 11442 - 11642 310 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754585|ref|ZP_02426712.1| ## NR: gi|167754585|ref|ZP_02426712.1| hypothetical protein CLORAM_00087 [Clostridium ramosum DSM 1402] # 1 66 6 71 71 118 100.0 1e-25 MDIRKALLTKLHNSSPDELKEIINGGIASKEETILPGLGVLFEEYYNSLNSMQQDAMIHA ISSLLK >gi|223714118|gb|ACDT01000097.1| GENE 13 11703 - 12261 530 186 aa, chain + ## HITS:1 COG:no KEGG:Hneap_0977 NR:ns ## KEGG: Hneap_0977 # Name: not_defined # Def: protein of unknown function DUF45 # Organism: H.neapolitanus # Pathway: not_defined # 13 186 30 219 259 67 27.0 4e-10 MEIRQIELDGQIIEYQLNFKPIKRCYLKIVSGKVVVNSSSAFSITAIEKLIRDNQQVVLK 
QIKNYLPKYQYINNGYVYIFNQRYQIVVRDLNQRKVAFHENKLFVYHHQVQETIERELKQ ILNKYLEFKIKEYLKSNFSLNMPVIQIKKLKARWGACFSNQNKVCFNLVLVHLEKELIDY VIVHEL Prediction of potential genes in microbial genomes Time: Thu May 26 10:12:42 2011 Seq name: gi|223714117|gb|ACDT01000098.1| Coprobacillus sp. D7 cont1.98, whole genome shotgun sequence Length of sequence - 2022 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 795 170 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 + Term 796 - 826 2.0 + Prom 813 - 872 15.4 2 2 Tu 1 . + CDS 1038 - 1805 738 ## gi|167754582|ref|ZP_02426709.1| hypothetical protein CLORAM_00084 Predicted protein(s) >gi|223714117|gb|ACDT01000098.1| GENE 1 1 - 795 170 264 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 124 261 101 240 255 70 30 1e-12 KAILSRSTKEAPGLQRAREKVKGDWNMITSTTNSTIKTLMKLKQKKYRDEMKAYLVEGDH LVTEALKANQVELLISTKHIDSELEVLEVSEEVMAKLAFTKSPQSIMAKCRMQKEIQLKE GRRYLLLDDLQDPGNIGTLIRTALAFSIDQVILSNNCVDLYNDKLLRSMQGANFHLSCIY GNLTQLISQLQEKGVVVIGSALENGKNIAQINRYSKMAFVVGNEGNGMNPEVLAKCDDIG YIPINTIESLNVAIAGSIMMYHFK >gi|223714117|gb|ACDT01000098.1| GENE 2 1038 - 1805 738 255 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754582|ref|ZP_02426709.1| ## NR: gi|167754582|ref|ZP_02426709.1| hypothetical protein CLORAM_00084 [Clostridium ramosum DSM 1402] # 1 255 8 262 262 392 100.0 1e-107 MPVYSIKTIKEYANNYISNNGREITKTVLLITSAVIFVQAVNFFSQVLGLILGLITLTMP QGLVYASLKIVDQQELSAIDDTLYGIKKISKYFTTYFIYSLIAVFVLALVIVAFILLVSY CDIGIIGGRDIVSILAQYWIIVITLLVIVIGTYVFLDCNYGMFPYLMETSDVKNLRALKR SRQFMKHKKKAYLKLYLSFWKEYLGLIIITLLVSFVLPTMLDISAIIFIVGEAVLIKRKL IIYKALFFENSGLDE Prediction of potential genes in microbial genomes Time: Thu May 26 10:13:14 2011 Seq name: gi|223714116|gb|ACDT01000099.1| Coprobacillus sp. 
D7 cont1.99, whole genome shotgun sequence Length of sequence - 80979 bp Number of predicted genes - 78, with homology - 77 Number of transcription units - 39, operones - 24 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 53 - 112 2.8 1 1 Op 1 40/0.000 + CDS 166 - 1203 1252 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit 2 1 Op 2 . + CDS 1208 - 3598 2665 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit + Term 3614 - 3655 5.1 + Prom 3600 - 3659 7.6 3 2 Tu 1 . + CDS 3719 - 3976 247 ## Ccur_12930 copper chaperone + Term 3981 - 4016 4.0 - Term 3877 - 3927 1.6 4 3 Op 1 . - CDS 3985 - 4866 754 ## Cphy_0711 glycoside hydrolase family protein 5 3 Op 2 . - CDS 4814 - 5605 339 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 5625 - 5684 7.4 6 4 Op 1 . + CDS 5752 - 6357 487 ## AM1_1563 MerR family transcriptional regulator 7 4 Op 2 . + CDS 6354 - 7922 1105 ## COG0249 Mismatch repair ATPase (MutS family) 8 5 Op 1 . - CDS 7989 - 9125 937 ## COG1323 Predicted nucleotidyltransferase 9 5 Op 2 12/0.000 - CDS 9122 - 9577 430 ## COG3610 Uncharacterized conserved protein 10 5 Op 3 . - CDS 9579 - 10328 781 ## COG2966 Uncharacterized conserved protein - Prom 10448 - 10507 8.0 + Prom 10343 - 10402 7.9 11 6 Op 1 . + CDS 10439 - 10657 64 ## gi|167754572|ref|ZP_02426699.1| hypothetical protein CLORAM_00074 12 6 Op 2 . + CDS 10660 - 11580 851 ## CLK_1488 hypothetical protein + Term 11829 - 11868 1.2 + Prom 11936 - 11995 5.4 13 7 Tu 1 . + CDS 12037 - 12324 258 ## Cphy_2173 transposase IS3/IS911 family protein + Prom 12326 - 12385 6.1 14 8 Tu 1 . + CDS 12485 - 12676 204 ## CDR20291_1072 integrase, catalytic region - Term 12820 - 12856 5.1 15 9 Tu 1 . - CDS 12924 - 13895 883 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases - Prom 13987 - 14046 12.5 + Prom 13935 - 13994 5.2 16 10 Tu 1 . 
+ CDS 14078 - 16075 2499 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 + Term 16314 - 16348 2.0 17 11 Op 1 2/0.000 - CDS 16093 - 17700 1569 ## COG0492 Thioredoxin reductase 18 11 Op 2 . - CDS 17702 - 18265 684 ## COG0450 Peroxiredoxin 19 11 Op 3 . - CDS 18286 - 18717 544 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) - Prom 18859 - 18918 14.4 20 12 Op 1 . - CDS 18921 - 19919 844 ## COG1609 Transcriptional regulators 21 12 Op 2 . - CDS 19953 - 20927 1062 ## COG1087 UDP-glucose 4-epimerase - Prom 20953 - 21012 9.9 + Prom 21004 - 21063 10.8 22 13 Op 1 . + CDS 21087 - 22583 1424 ## COG4468 Galactose-1-phosphate uridyltransferase 23 13 Op 2 6/0.000 + CDS 22585 - 23595 464 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase 24 13 Op 3 . + CDS 23595 - 24872 1582 ## COG0153 Galactokinase + Prom 24979 - 25038 3.9 25 14 Tu 1 . + CDS 25069 - 25920 380 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 + Term 26015 - 26083 11.1 + Prom 26290 - 26349 8.8 26 15 Op 1 . + CDS 26408 - 26638 147 ## gi|237733805|ref|ZP_04564286.1| predicted protein 27 15 Op 2 . + CDS 26681 - 26965 352 ## gi|167754556|ref|ZP_02426683.1| hypothetical protein CLORAM_00058 + Term 26987 - 27031 11.1 28 16 Tu 1 . - CDS 27020 - 27277 188 ## gi|167754555|ref|ZP_02426682.1| hypothetical protein CLORAM_00057 - Prom 27322 - 27381 8.7 + Prom 27460 - 27519 6.6 29 17 Op 1 . + CDS 27549 - 27983 429 ## EUBREC_0506 hypothetical protein 30 17 Op 2 . + CDS 27976 - 29313 1124 ## COG0534 Na+-driven multidrug efflux pump + Prom 29605 - 29664 12.1 31 18 Op 1 . + CDS 29714 - 32725 3063 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 32739 - 32798 5.6 32 18 Op 2 . + CDS 32837 - 34267 1762 ## COG0469 Pyruvate kinase + Term 34282 - 34312 4.3 33 19 Op 1 . + CDS 34320 - 34955 754 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 34 19 Op 2 . 
+ CDS 35011 - 35235 325 ## + Term 35240 - 35288 3.1 + Prom 35258 - 35317 6.2 35 20 Tu 1 . + CDS 35338 - 36258 245 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase + Term 36265 - 36303 2.7 - Term 36324 - 36364 7.2 36 21 Tu 1 . - CDS 36387 - 37550 1259 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 37581 - 37640 12.0 + Prom 37727 - 37786 12.0 37 22 Op 1 . + CDS 37844 - 38311 373 ## COG2190 Phosphotransferase system IIA components 38 22 Op 2 . + CDS 38316 - 39767 1514 ## COG2211 Na+/melibiose symporter and related transporters 39 22 Op 3 . + CDS 39757 - 41985 2016 ## COG3345 Alpha-galactosidase + Term 42180 - 42223 4.9 - Term 42168 - 42209 4.5 40 23 Tu 1 . - CDS 42267 - 42944 964 ## COG1592 Rubrerythrin - Prom 42982 - 43041 13.4 - Term 43551 - 43598 -0.5 41 24 Tu 1 . - CDS 43758 - 44624 784 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 44637 - 44696 8.5 42 25 Op 1 . + CDS 44722 - 46197 1902 ## BMQ_4334 stage IV sporulation protein A 43 25 Op 2 . + CDS 46250 - 46534 371 ## COG0776 Bacterial nucleoid DNA-binding protein 44 25 Op 3 . + CDS 46608 - 47315 933 ## COG2738 Predicted Zn-dependent protease + Prom 47317 - 47376 8.4 45 26 Op 1 . + CDS 47473 - 47883 484 ## gi|237733824|ref|ZP_04564305.1| predicted protein 46 26 Op 2 6/0.000 + CDS 47876 - 48487 532 ## COG3935 Putative primosome component and related proteins 47 26 Op 3 . + CDS 48402 - 49121 615 ## COG0177 Predicted EndoIII-related endonuclease + Term 49127 - 49192 6.2 - Term 48982 - 49011 -0.2 48 27 Op 1 7/0.000 - CDS 49184 - 51844 2962 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 49 27 Op 2 . - CDS 51834 - 52433 494 ## COG3331 Penicillin-binding protein-related factor A, putative recombinase - Prom 52587 - 52646 7.4 + Prom 52378 - 52437 9.2 50 28 Tu 1 . + CDS 52640 - 52960 359 ## BAA_1649 hypothetical protein + Term 52971 - 53008 -0.0 + Prom 52980 - 53039 2.0 51 29 Op 1 . 
+ CDS 53103 - 53510 419 ## COG1396 Predicted transcriptional regulators 52 29 Op 2 . + CDS 53522 - 53842 346 ## SmuNN2025_1556 hypothetical protein + Term 53909 - 53953 4.4 - Term 53894 - 53942 9.1 53 30 Op 1 . - CDS 53962 - 55029 739 ## EUBREC_3483 hypothetical protein 54 30 Op 2 . - CDS 55029 - 55466 368 ## Amet_2040 MarR family transcriptional regulator - Prom 55487 - 55546 13.7 + Prom 56329 - 56388 10.9 55 31 Op 1 . + CDS 56410 - 56964 361 ## COG1309 Transcriptional regulator + Prom 57018 - 57077 8.3 56 31 Op 2 . + CDS 57105 - 58250 1575 ## COG1454 Alcohol dehydrogenase, class IV + Prom 58534 - 58593 11.8 57 32 Op 1 5/0.000 + CDS 58634 - 59626 1103 ## COG1426 Uncharacterized protein conserved in bacteria 58 32 Op 2 6/0.000 + CDS 59626 - 60219 233 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 59 32 Op 3 12/0.000 + CDS 60219 - 60677 231 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 60 32 Op 4 . + CDS 60716 - 61753 1388 ## COG0468 RecA/RadA recombinase + Term 61763 - 61802 5.3 61 33 Op 1 . - CDS 61852 - 62157 289 ## COG0551 Zn-finger domain associated with topoisomerase type I 62 33 Op 2 . - CDS 62067 - 64022 2019 ## COG0550 Topoisomerase IA - Prom 64080 - 64139 9.7 + Prom 64229 - 64288 8.0 63 34 Op 1 2/0.000 + CDS 64321 - 65877 2088 ## COG1418 Predicted HD superfamily hydrolase + Term 65878 - 65910 4.0 64 34 Op 2 2/0.000 + CDS 65919 - 66692 909 ## COG1692 Uncharacterized protein conserved in bacteria 65 34 Op 3 . + CDS 66738 - 66998 364 ## COG2359 Uncharacterized protein conserved in bacteria + Term 67043 - 67076 -0.4 + Prom 67028 - 67087 7.9 66 35 Tu 1 . + CDS 67110 - 67355 381 ## gi|167754517|ref|ZP_02426644.1| hypothetical protein CLORAM_00018 + Prom 67359 - 67418 9.3 67 36 Op 1 . + CDS 67482 - 67730 229 ## gi|167754516|ref|ZP_02426643.1| hypothetical protein CLORAM_00017 68 36 Op 2 . 
+ CDS 67628 - 68926 635 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 69 36 Op 3 . + CDS 68916 - 69245 153 ## gi|237733846|ref|ZP_04564327.1| predicted protein 70 36 Op 4 . + CDS 69321 - 70835 1898 ## COG0516 IMP dehydrogenase/GMP reductase + Term 70838 - 70866 -1.0 71 36 Op 5 . + CDS 70905 - 71399 530 ## Aflv_1521 outer spore coat protein 72 36 Op 6 6/0.000 + CDS 71475 - 73985 2780 ## COG0249 Mismatch repair ATPase (MutS family) 73 36 Op 7 12/0.000 + CDS 73995 - 75827 1650 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 74 36 Op 8 . + CDS 75827 - 76738 983 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase + Prom 76748 - 76807 4.3 75 37 Tu 1 . + CDS 76850 - 77578 766 ## COG1396 Predicted transcriptional regulators + Term 77583 - 77610 0.1 - Term 77569 - 77600 2.5 76 38 Op 1 . - CDS 77605 - 78912 1392 ## COG0422 Thiamine biosynthesis protein ThiC - Prom 78945 - 79004 2.6 - Term 78933 - 78971 1.1 77 38 Op 2 . - CDS 79126 - 80391 1271 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase - Prom 80420 - 80479 8.6 + Prom 80374 - 80433 9.5 78 39 Tu 1 . 
+ CDS 80468 - 80920 552 ## COG1490 D-Tyr-tRNAtyr deacylase Predicted protein(s) >gi|223714116|gb|ACDT01000099.1| GENE 1 166 - 1203 1252 345 aa, chain + ## HITS:1 COG:BS_pheS KEGG:ns NR:ns ## COG: BS_pheS COG0016 # Protein_GI_number: 16079916 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Bacillus subtilis # 1 341 1 341 344 462 62.0 1e-130 MENELLKIKEEALAKLADCSSLKELNEIRVLYLGKKGPIQEAMKSMKDMAPEERKTFGQT SNLVKQELSKAVEDKKEALENQAILDKINEERIDITLPSFKLDQGSLHPLSTIVAELEDL FLGMGYQIAEGPEVESDYFNFELMNLPKGHPARDMQDTFYIDENTLMRTHTSPVQAHTML AANGEGPIKVICPGKTYRRDDDDATHSHQFMQCEGLVVDKNITMADLKGTLEVFARKFFG EKREVRLRPSYFPFTEPSVEVDISCHNCGGKGCPMCKHTGWIEILGAGMVNPRVLEMCGF DSEVYQGFAFGIGIERVAMLKYGIDNIRNFYNDDIRFLSQFGRKE >gi|223714116|gb|ACDT01000099.1| GENE 2 1208 - 3598 2665 796 aa, chain + ## HITS:1 COG:BS_pheT_2 KEGG:ns NR:ns ## COG: BS_pheT_2 COG0072 # Protein_GI_number: 16079915 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Bacillus subtilis # 154 796 2 651 651 440 38.0 1e-123 MDISLKVLNRYVKVDDQDPKELADKITSIGLEVEGMHDLTRGNNMTIGYVKECMPHPDSD HLNVCQVEVRPGEIRQIVCGAPNVATGQKVIVANPGCDLGDGFVIKESKIRGVESNGMIC SIAELGLDQRLLKPEDKDGIHVLDTDAPVGAEPLEYMGLKDTILEIGLTPNRADCMAMTS LAYEVAAILGREVNLPEVSFIGAEGSGISVKVETDLCPFFGAKLVKGVTTKESPEWLRNA LIASGIKPINNIVDISNFVMLETGQPIHMYDYDKLNKKEFIIKTGFDCKEVMLDGEEYKI EPADLIVSTDAGIGCIAGVMGANSTKIDENTTNIVIEAATFDGATLRETARRLNLLTDAS QHYIKGALNTANSLMILERCANLLVELAEAKEVYQSVTTDLKIKDCFVRVTTSKVNGVLG TKITTDEIKDIFSALKFEYTIDGEEFNVKVPTYRNDITLAADLIEEVARMYGYDKIPSTL PEMSMTVGKRTDVQAKKHLIRNLLKDQGLHETLTYSLTSPVMVDDFNIFHTNESVKLAMP LGEERSVTRKSIIGSLLQVINYNQSHNIKDVNIFELSTTYSKGVELQNLAIACCGEYNGL PFKQISYKADYYLVKGFVETIFKNLGIEESRYKLVRASQDDKYYHPGRTAYIMIGKEVVG VVGNIHPLMEKKYNVKDVYIVELNLTTLLNLKTSKVKFTEIPMYPSVSRDIALVMDQDIP TFDICRKIVQASKQLVKETKIFDVYAGEHIEAGKKSVAINLLFQDPKGTLEEATVNAAME KILAAVEKDFGAVLRA >gi|223714116|gb|ACDT01000099.1| GENE 3 3719 - 3976 247 85 aa, 
chain + ## HITS:1 COG:no KEGG:Ccur_12930 NR:ns ## KEGG: Ccur_12930 # Name: not_defined # Def: copper chaperone # Organism: C.curtum # Pathway: not_defined # 1 68 1 68 82 88 61.0 9e-17 MQKITLEIDGMMCGMCESHINDAIRKAFSVKKVSSSHSKDRTEIITADELDEDKLKAVID ATGYKVTSITKAPYVKKGFFASFRK >gi|223714116|gb|ACDT01000099.1| GENE 4 3985 - 4866 754 293 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0711 NR:ns ## KEGG: Cphy_0711 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: C.phytofermentans # Pathway: not_defined # 4 278 333 614 626 229 44.0 9e-59 MILPFFLHGILDQGYYHDGLFLPEHPDFYKTEIMNIKKLGFNLIRKHIKIEPELFYYNCD KLGILVMQDMVNNGEYRFIHDTVLPTLGFINFKDTRYKNTLQQNIFKQHMIDTISHLYNH PSIISYTIFNEGWGQFNSDEMYQICKELDPTRFYDTTSGWFSQKNSDVVSRHIYFRNKIL STNLERPLLLSECGGYTRPINNHLNSNKKQYGYGKTNSEKELTDKIEKMYQEMVMPSINN GLCGCIYTQISDIENEINGLYTYDRKICKVNQQRMSELARKLYSYYEKKSGLS >gi|223714116|gb|ACDT01000099.1| GENE 5 4814 - 5605 339 263 aa, chain - ## HITS:1 COG:uidA KEGG:ns NR:ns ## COG: uidA COG3250 # Protein_GI_number: 16129575 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 35 258 58 295 603 72 24.0 8e-13 MEYPRPQLKRNSFFSLNGVWQLNDLSINIPYPPQAPLSNYHGIINDTLVYQKDFILPIDF YHANNQVLLHFGAVDQLTKVYVNNRYIGSHSGGYLPFYFDISSSLNDGVNKLVVKVTDSL SHDYPYGKQSRESHGMWYTQVSGIWQTVWVEAVSKNCIKDLLITPSLTGINLKVNTEASR YTVTIENNDFKKTYTAKDTDITIDNPHYWTPDDPYLYTFSITTDDDYIESYFALRTVEIK KHQILLNDTPIFFAWYLRSRILS >gi|223714116|gb|ACDT01000099.1| GENE 6 5752 - 6357 487 201 aa, chain + ## HITS:1 COG:no KEGG:AM1_1563 NR:ns ## KEGG: AM1_1563 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: A.marina # Pathway: not_defined # 1 118 12 131 256 63 32.0 5e-09 MNIGEVSDYLKITKKAINLYEEKGLLRPNKNSNGYRLYGEEEIKTLKQIKILRSLDFSLS EIKDIIINNEYCFFDNKLSELKIRDYDLQKKIQYLDLVKNDFIQNEIIENIEEYRELLIR ESKEEIKDKNIYVDFEKILIVVFTFFTTISICGRGRISGEIIIIIYLLILICSTALLHYP QSRKFFYQLYQKLQEYKRKDK >gi|223714116|gb|ACDT01000099.1| GENE 7 6354 - 7922 1105 522 aa, chain + ## HITS:1 
COG:CAC3563 KEGG:ns NR:ns ## COG: CAC3563 COG0249 # Protein_GI_number: 15896798 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 45 519 84 573 577 162 29.0 2e-39 MIWLIIISIVLIAGVLGFLKILVTKKYLIHQYNLGVGVQYRDFTKEYYHYNKERSIDEET FIDFEMPQILKKINFTYSEIGNEYLYNLFFRENNNLDTQESIIDKLNCQKKLAESIYQLN KLNKAYVPILSIKRSLLNISAKSCLLAGLFLILDVVSIIAAIIDIENVYFVVIMLLLSIL MNTQLSQKTGKISLQISLINDMLQITNKLLKINIYPDSNQTLVMNSFIHLKKMTKTTYYF NKIKQFDIFNILELMKSLFFIDIFQAVKLSRKVDQVQNDIMVLYENIGLLDVCIMIKVIR SLYDTCIPTILKDERVEIVEGYHLLIEKPIKNTVVISNDTIVTGSNASGKSTFLKMLGAN LLLAKTLNISFAKEFKYYPFALISSIHMKDDIMNGDSFYVKEIKRLKQITEFANKQKSLI LIDEILKGTNEKERIIIALALMKYLFKCNSMTIITTHDIELTEVFDQVDKYCFNDIKKDN KIIFDYLIKKGVCTVGNAIAIVKTLDFDQEILKEINDKIEVF >gi|223714116|gb|ACDT01000099.1| GENE 8 7989 - 9125 937 378 aa, chain - ## HITS:1 COG:BS_ylbM KEGG:ns NR:ns ## COG: BS_ylbM COG1323 # Protein_GI_number: 16078570 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Bacillus subtilis # 1 365 1 388 415 252 35.0 1e-66 MKVLGLIVEYNPFHQGHLYHINKAKQLIKPDVTIVIMSGHFVQRGEPAISNKWTRAGVAI KNGIDLVIELPFVYSVQSADYFAQGAIELLAKLKVTDIVFGSECGNINIFKDIAFTIKNN QKNYDNLVKKQMNQGLRYPDACNQALSILMNKTVTTPNDLLGLAYVKEVINHNYPIELHC IKRTNDFHSLDIESISSASAIRHALKNKIDISNQFCNYEDYQEFYFFDDFYPFLRYKILT TDAPTLKHLHLVDEGIENLLKEKVVITNYMDELITSLTSKRYTRSRIQRMLIHILMNNSK EDIAEAMQIDYIRVLKMNNVGQAYLNKIKKSCEYKLVTNFSSYHHPALELEFRATKLLSC LSKNPNQLISLEYKSIPK >gi|223714116|gb|ACDT01000099.1| GENE 9 9122 - 9577 430 151 aa, chain - ## HITS:1 COG:CAC2266 KEGG:ns NR:ns ## COG: CAC2266 COG3610 # Protein_GI_number: 15895534 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 4 140 2 137 152 80 34.0 9e-16 MQSIIECFSAFIACLGFAFIFRIHQHPRFAIIGSIGGALGWLVYLLASNFKSEISCYFLA MAFVALFSEISARIYKAPATIFLIIGCFPLVPGRGIYQTMLYCTQGNSELFLDSFLNTLA ISASLALAILIISTIFKVIKKLSCRKIKVNS >gi|223714116|gb|ACDT01000099.1| GENE 10 
9579 - 10328 781 249 aa, chain - ## HITS:1 COG:CAC2265 KEGG:ns NR:ns ## COG: CAC2265 COG2966 # Protein_GI_number: 15895533 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 246 1 247 257 134 34.0 2e-31 MNEQKLINLIGEIGRLLLKHGAEIYRVEESLLRMCKSYGYLNVEVFALPSYFTLTIEMHD NHTYTLSKRTQQNRVNLDMVYELNNLVRYTCQECPDYIHLKSKLEKIKNTPINMPLVFLG YGLGTGFFTTFFGGGIHETIIAVGIGFILYFFIWLMEILEINNLVSTLLSSMVLTAIAII SLKLGLITNLQSTIIGCLMILTPGVAITNSLRDIMGSDHISGMARMLEALLTATFIAIGV GTMMMVLGG >gi|223714116|gb|ACDT01000099.1| GENE 11 10439 - 10657 64 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754572|ref|ZP_02426699.1| ## NR: gi|167754572|ref|ZP_02426699.1| hypothetical protein CLORAM_00074 [Clostridium ramosum DSM 1402] # 1 72 1 72 72 79 100.0 7e-14 MFKKYYYNNSALIIYIIEEKVKFYAIVINGFYKYFKEMFQRFVIKVKKKQQKNLVKKTAN NQIIRLLLIKGE >gi|223714116|gb|ACDT01000099.1| GENE 12 10660 - 11580 851 306 aa, chain + ## HITS:1 COG:no KEGG:CLK_1488 NR:ns ## KEGG: CLK_1488 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 1 304 1 309 310 273 45.0 9e-72 MLLQVSGRTDICALYPKWFVNRLKAGYILVRNPYNEHQVSHVDVTPEVVDCICFCTKDPK AIVPYLTQIDSMGYNYYFMVTITAYDLDIEPGLRPKLEIMKTFIELSKMLGKKRVIWRYD PVLLNQRYTKVFHYKMFEKMCQLLFPYTETVIISFLDIYKNIIGKFDELTDHDVIELAQR FGKIAQKYQLHIQTCSEKYHLEEYGIAKGSCLDRGYIETLIGYPLDIKTNTNRANCSCLA SIDIGAYSCCNHGCSYCYACNHNLVAKNMAIHDDESPMLLGTLSYDDKVNKRIVASNKVR QIKLDI >gi|223714116|gb|ACDT01000099.1| GENE 13 12037 - 12324 258 95 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2173 NR:ns ## KEGG: Cphy_2173 # Name: not_defined # Def: transposase IS3/IS911 family protein # Organism: C.phytofermentans # Pathway: not_defined # 5 86 127 208 228 77 51.0 1e-13 MVKSRKVNKEERIEIIKYCLEHDMDYKITAKLFETTYANVFNWVKKYKEKGEDGLGDKRG CRKDDEKVDEVTLLKRQLKQKEYELEMVQLELKLS >gi|223714116|gb|ACDT01000099.1| GENE 14 12485 - 12676 204 63 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1072 NR:ns ## KEGG: CDR20291_1072 # Name: not_defined # Def: 
integrase, catalytic region # Organism: C.difficile_R20291 # Pathway: not_defined # 5 58 221 273 299 71 59.0 1e-11 MLKGHGMTQSMLRVHCCIDTGPTEGFWGIMKCEMYHYGKYYNTKDELVKAVDDWIHYYMY ERN >gi|223714116|gb|ACDT01000099.1| GENE 15 12924 - 13895 883 323 aa, chain - ## HITS:1 COG:L51032 KEGG:ns NR:ns ## COG: L51032 COG0111 # Protein_GI_number: 15673970 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Lactococcus lactis # 11 317 15 321 325 191 33.0 2e-48 MNILVLSRLKTDEINKLEKSFPDFSFTYSKEKEVTQTMIDNCDILVGNPGKHVELNRPNL KAILLNSAGSDYYIQEGVLHAATRLANASGSYGKAIAEHTIGMMLALNKNFKNYINNMHE HSWKSYRGGKEIYHSTVIIVGLGDLGYELAKRLKAFECKTIGIKRNVSPIPRDIDELYTI DKLEEILPHGDFIISCLPQSTETINLFNKKRLLMMKSDALFVNVGRGSAVNTQDLKEVLK AGHLYGAALDVIDPEPFKTDDDLWDFDNVLITPHVSGGFEWDSVREYFTELTIRNINHLI KNEPLENEVDFNTGYRKVVKYND >gi|223714116|gb|ACDT01000099.1| GENE 16 14078 - 16075 2499 665 aa, chain + ## HITS:1 COG:CAC3683 KEGG:ns NR:ns ## COG: CAC3683 COG0768 # Protein_GI_number: 15896915 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 28 657 17 664 671 478 44.0 1e-134 MLLKDRRKLIIIGVVAVALIIGAVFFIFSRGKSADDYVKEYYGYLENKEYGKMYNMLTKG SLAKTSQKVFEARYKNIYEGIEAKNIEIKIGEVEDGIVNYTLTMDTLAGKISSENKVEVK DGELVYNEAMILEGLKEDYKIKINTDSASRGKILDRNGKELATKGEAYSVGLVQGKLNGE ADYEAIGKIIGMSRDEIKKVMSASWIKDDSFVPLKEMAKDEEGKAKAQQLLTIKGVKLNT VSVRYYPYGEAASHVTGYLQQVNAEDLKKHKNEGYSETSLIGRSGIEAAYEADLKGTDGK EIIIIDKNNKLVETLATKKVENGKDIKLTIDADLQQSLYKEYQNDKSASVAMNPKTGEIL ALVSTPSFSSNDFIFGFSSAEWQALNDDANKPLTNRFRGTYIPGSSMKPITAAIGLESNK IDPDEDFGAKDKWQKDSSWGNYYVTTLHAPKPNNLNNAIIYSDNVYFARAASEIGKEKLI EGYEKLKIGSKIPFELSLNASQYQNKDSKFDDQQLADSGYGQGQLLLNPVQLASIYGAFV NEGTIAQPYLVIDEKPNDAWIKDVFSKDTVKRIKEALVGVISDSNGTGHSIYHQDIELAG KTGTAELKSSQNDTSGSEIGWFTVMTTNSDNPVLITTMVEDVKNRGGSGYVVDHMKAPLG SYLYR >gi|223714116|gb|ACDT01000099.1| GENE 17 16093 - 17700 1569 
535 aa, chain - ## HITS:1 COG:FN1984_1 KEGG:ns NR:ns ## COG: FN1984_1 COG0492 # Protein_GI_number: 19705280 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin reductase # Organism: Fusobacterium nucleatum # 1 310 1 310 332 346 57.0 6e-95 MSNLYDAIIVGGGPAGLSAAIYLARAKFSVLVIEKEKIGGQITITSEVVNYPGILQTSGQ ELTEAMRQQAQNFGAEFLLATVKGINLKEPVKKIITDKGEFSSIGIVLATGSHPRKLGFP GEKEFQGHGVAYCATCDGEFFTGKDIFVIGGGFAAAEEAVFLTKYARKVTIIVREEDFTC AKQVSDKAKNHPNIEVHYNTEIIEAKGNHQLEQATFKNNATGETWSYRAEANNTFGIFVF AGYEPATEIFKNQVELNNYGNLIVDQNQKTNLDGVYGAGDVCVKELRQVVTAVSDGATAA TSLEKYIPNIVTALKLSPKKIKIKEAVQTSKETSENDDSFITPKIKQQLAPIFAKLEKQL IIKCSLDHSKLAGEIKSFIEEFTTLSDKLSYQFEESNSSPAMRFYDENNNYLNISYHAVP GGHEFNSFVIAIYNAVGPKQPLDTQIINEIKQINKPTTIKVVVSLSCTMCPDVVMASQRI AIENNNVDAHVFDLAHFPELKEQYQIMSVPCMIINDQDVYFGKKDIQQVIEIIKK >gi|223714116|gb|ACDT01000099.1| GENE 18 17702 - 18265 684 187 aa, chain - ## HITS:1 COG:FN1983 KEGG:ns NR:ns ## COG: FN1983 COG0450 # Protein_GI_number: 19705279 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Fusobacterium nucleatum # 1 185 1 186 188 248 64.0 7e-66 MSLIGKEITEFEAQAFHENEFKTIKKEDLLGKWNVLFFYPADFTFVCPTELEDLANKYEE FKNINCEIYSVSCDTHFVHKAWHDTSKTIKKINYPMLADPTAKLAKDLDVYIEADGLAER GSFIINPEGKVVMYEVIAGNVGRNADELFRRVQASQFVYEHGDEVCPAKWKPGEETLKPS LDLVGLI >gi|223714116|gb|ACDT01000099.1| GENE 19 18286 - 18717 544 143 aa, chain - ## HITS:1 COG:PM0817 KEGG:ns NR:ns ## COG: PM0817 COG0783 # Protein_GI_number: 15602682 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Pasteurella multocida # 1 143 15 156 159 95 41.0 3e-20 MDKQLNKLLADLVVEYHKLQSYHWYIKGKDFFTVHAKLEELYNGVNKAIDEVAEAILMTG FKPAASMQEFLDISAIEEVKGEHITSKDIYKVVLNDFNYLLDRIKSLKNNADNENNYLIS SLMDDYIKEFTKSIWMISQVVEN >gi|223714116|gb|ACDT01000099.1| GENE 20 18921 - 19919 844 332 aa, chain - ## HITS:1 COG:BH2227 KEGG:ns NR:ns ## COG: BH2227 COG1609 # 
Protein_GI_number: 15614790 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 327 1 333 347 227 39.0 2e-59 MATIKDVAELANVSSATVSRILNNDKTLSVPEETRQAVFNAAKELNYIKKRKATKNEFTI GIVQWYSLQQEIDDPYYLQIRQGIENFCLQNNINVIRTFRADQNYLDVLKGVDGLVCVGK FNNNEISQFKKISNKVIFLDMHSPNSDDSTITLDFHKAVIDALDYLKSLNHHKIGYLGGK EYLEDNVVYQDARKEIFIDYCNKNKIIYKKYLLEGEFTTESGYSMMMDLINQKDLPSAIF CASDPIAIGALRALQEHNLSVPRDISLVGFDDIKAASFTNPPLTTIYAPASLMGEYGASM VYNILSKYNSPTPMRITLPCTLIERESCKKVD >gi|223714116|gb|ACDT01000099.1| GENE 21 19953 - 20927 1062 324 aa, chain - ## HITS:1 COG:lin2620 KEGG:ns NR:ns ## COG: lin2620 COG1087 # Protein_GI_number: 16801682 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Listeria innocua # 1 321 2 322 328 441 66.0 1e-124 MSILVLGGAGYIGSHTVYQLIENGKEVVIVDNLQTGFKELIHPKAKFYQGDLRDKTFLNN VFEQEKIDGVIHFAANSLVGVSMKEPLEYYDNNVYGMIVLLEVMKNHSVKHIVFSSTAAT YGEPKRIPIEEDDETYPTNPYGETKLAMEKLMKWCSSAYGMSFVALRYFNACGAHPNGKI GELHNPETHLIPLILQVPLGIRESIYVFGDDYDTKDGTCIRDYIHVMDLADAHIKALNYL KAGNPSNIFNLGNGEGYSVLEIINAAKKVTNLPIAVTKAARRAGDPAKLVANNTKAKEIL GWEPKYTDIEKIISTAWNFYIGRK >gi|223714116|gb|ACDT01000099.1| GENE 22 21087 - 22583 1424 498 aa, chain + ## HITS:1 COG:CAC2961 KEGG:ns NR:ns ## COG: CAC2961 COG4468 # Protein_GI_number: 15896214 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose-1-phosphate uridyltransferase # Organism: Clostridium acetobutylicum # 1 491 2 494 497 553 57.0 1e-157 MNHEINRLIQFALDKEMITKDECDYSVNLLLDLFDENDFEYEEINEHLPLATPILEKMLD HAVTKGLIEDNTTSRDLFDTKIMNCIMPRPHTVNEKFKQKYMVSPKVATDYYYKLSIASN YIRKSRTDKNIIWKKYVKYGNIEISINLSKPEKDPKEIAKAKLIKSSGYPKCLLCKENVG FSGDLNRAARQTHRIIPIELATGNYYLQYSPYVYYNEHCIVFNEEHKPMVINENTFKNLF SFLDIFPHYMIGSNADLPIVGGSILSHDHYQGGNYEFPIQGAEVLKTLISDKYPSTLIEV VKWPLSTVRLTSENKEELVALSVKMLDYWRHYNAPVLDIISHTGDIPHNTITPIARRKGD KYQIDLVLRNNRTSTKYPDGIFHPHQESHHIKKENIGLIEVMGLAILPARLKDELKLLSD CLLKKAKIEEYPELDKHHSWYQQLLAQHEFTETNIDEILKEAVAIKFVNVLEDAGVFKMD 
DYGIDALTSFVETVLKGE >gi|223714116|gb|ACDT01000099.1| GENE 23 22585 - 23595 464 336 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 23 334 29 344 345 183 34 3e-45 MIKVIEKIDDKIDLIQMKNDDLEVVVSNYGCTIVKVLMKDKDGNIDDVVLGYDDFRDYQT KDAYLGALVGRTANRIGKGKFTLNSETYTLPINNGPNCLHGGVKGFSYQIFDYQILEDSI EFTYLSKDGEEGYPGNLAFKAIYTLNKDTLIVRYQATSDKDTIINITNHSYFNLSGAKED IYNHQLLVHSDKYACVDSDGLPTGKFNLVKDTPFDFNSMTRIGDVIDSDDEQLKLGAGYD HPFIFNQDKDQAILYHEATGRKLTVSTSLPGAQIYSANYLDGRIGKYGIVYPRRFALCIE TQNLPDAINIEEDPTTILKKGEVYDEITSYKFEVIK >gi|223714116|gb|ACDT01000099.1| GENE 24 23595 - 24872 1582 425 aa, chain + ## HITS:1 COG:ECs0785 KEGG:ns NR:ns ## COG: ECs0785 COG0153 # Protein_GI_number: 15830039 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Escherichia coli O157:H7 # 44 402 11 356 382 121 27.0 2e-27 MIKATILVEELNNKKYDELLNDIYVDTNLLDYQRERYVKAINKYVSLYGDTDVEIYSAPG RSEVGGNHTDHQHGCVLAAAVNLDAIAVVGRVDNKIKVLSDDFDIAPINLEDLEIKKAEE GTSEALIRGVCARLKELGYNVGGFNAFITSDVLMGAGLSSSAAFETIIGTIISGLYNDMT IDPVVIAQVGQYAENVYFGKPCGLMDQCASSVGSLINIDFNDVAKPIVNKVDVDFSKFGH SLCIVDTKGSHADLTDEYAAIPMEMKKVANYFGKEFLREVDEEDFFNDIAGARKACQDRA VLRAIHLFEENKRVDQEVKALNNSDFETFKKVVKESGDSSYKFLQNVYANCDVQNQSVSI GLAMSEKIIGRNGVCRVHGGGFAGTIQAFVKDEFVTAYKTEIERVFGKGSCHVLKVRKYG GKKVI >gi|223714116|gb|ACDT01000099.1| GENE 25 25069 - 25920 380 283 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 20 281 55 317 323 150 31 7e-37 MGADIVLKRIEKCLEILVLKNNLSFEQIDALGIGVPGLLNRNDGISLFSPNFNNWHNVKI KEWFEHKWLIPTVIDNDVRMHLYGELYFGAGKGFKNIILIAIGTGLGSGIVVDGHVLYGA NDSVGEIGHMNMYRHGRACRCGSSGCLGRYVSAVGMINTFKEKDIDHMSIVNRWVNNCDE ITAKMICQAYDLNDSIVVATLKETGEILGYGITNLINLFNPERIIIGGGVSNAGERLLAS TREVAAIHALEIASNNCDLVVAELGEQAGMYGAAKYAKRKLIL >gi|223714116|gb|ACDT01000099.1| GENE 26 26408 - 26638 147 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733805|ref|ZP_04564286.1| ## NR: 
gi|237733805|ref|ZP_04564286.1| predicted protein [Mollicutes bacterium D7] # 1 76 1 76 76 94 100.0 2e-18 MVDYYLEMSIRELSYCMTMIWGKLNIFSIIDYDVIISTILPIILNENIEKMSIVCQREDN LMLIYIIQLLKILSMI >gi|223714116|gb|ACDT01000099.1| GENE 27 26681 - 26965 352 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754556|ref|ZP_02426683.1| ## NR: gi|167754556|ref|ZP_02426683.1| hypothetical protein CLORAM_00058 [Clostridium ramosum DSM 1402] # 1 94 1 94 94 164 100.0 2e-39 MKDRELIARIIINILDVKNCQQWELFTGEDMYEQVCNYILNISKGNNTAEEYARKMMEEN KPVIDRIVQGEDIPNEEYNVFTESFRKYNRKFRR >gi|223714116|gb|ACDT01000099.1| GENE 28 27020 - 27277 188 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754555|ref|ZP_02426682.1| ## NR: gi|167754555|ref|ZP_02426682.1| hypothetical protein CLORAM_00057 [Clostridium ramosum DSM 1402] # 1 85 18 102 102 146 100.0 5e-34 MTNWQIVNHRLQNLSLHNLKEICYAHNISMEERDLELILQIIKNNPYSIVNEEYTPILFI EISNVTNKATCDKFKPIIEKEYLIH >gi|223714116|gb|ACDT01000099.1| GENE 29 27549 - 27983 429 144 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0506 NR:ns ## KEGG: EUBREC_0506 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 29 138 27 136 146 71 36.0 1e-11 MKKTFLMAIYPYLLKQYNIMFETIQKKYQITQIEIDVLAFLANNPEYHHAQDIVNIRGIS KAYVSCSLDKLVKRGLVERQVDNENRRCNCLFVTPYANELIKEIRAVQDNYNEIAYQGLS DQDKEQFNRLISQIYENLGGNNNE >gi|223714116|gb|ACDT01000099.1| GENE 30 27976 - 29313 1124 445 aa, chain + ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 427 1 427 448 260 36.0 4e-69 MNERLANEKISKLLLSLAIPSILAQMVTLLYNLVDRIYIGRMEDGALAIAGIGLCSAIIT IITAFTNLFGRGGAPLASISLGADENDQANKIISNCFSSLVLSSIVIMIVLYIFGEEILL LFGASANTLSYAKDYLNIYLLGTIFVQLSVGMNYFINCQGYAKFGMLTLVIGGALNIILD PIFIFTLNMGVAGAALATIISQFVSFLWVMHFIFSKRSTIKIKKEYLFFDKIIMKRVLGL GISPFFMNSTEGILQVCFNRQLLFFGGDIAVSSMTIMASMAQIVFLPMEGIAQGSQPIIS 
FNYGARQKERVIEAIKMVIKVALTFSVIMVTLMELFPALFVSMFTNDPELMELGVKMLRV YIFGYIIIGANSSFQQIYTSLGEGKRSFFFAFYRKIILLIPLIYILPNFISNGVLAVMLA EPVSDLLTTGTNAIFFKRFINDKLK >gi|223714116|gb|ACDT01000099.1| GENE 31 29714 - 32725 3063 1003 aa, chain + ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 4 1000 3 1011 1014 1018 50.0 0 MKGTQADLNWLADPKVFAVNRLNAYSDHSYYLDIDKALNGDAMELKQSLNGRWYFNYAKN PNERFVDFYKEDIDCHYFDMIDVPGHIQMQGYDQMQYINTLYPWDGHEKLRPPHISSDDN PVGSYVCYFEVNEALKNKTTRLVFDGVETAYYVWLNGEFIGYSEDSFTPSSFDVTYALKD GENKLAVEVYKRSSASWLEDQDFWRFSGIFRDVSLYAIEDIHINDLFVKTTLKNNYKDVQ VSVNLDVIAKKEGYLNTALYDQNNNIIYQAKQLPLTNFSFGLTNVNLWSSENPYLYKLLL TVYDHDDNLVEVIPQKIGFREFKMDQGIMKLNGQRIVFRGVNRHEFAADKGRAITKEDML YDIKFMKMHNINAVRTSHYPNQSLWYDLCDEYGIYLIDEANLESHGSWQKLGACEPSWNI PGNLPQWHDVVVDRANTMLQRDKNHPSILIWSCGNESYAGTNIVAMANHFRENDPTRLVH YEGCVWNRDYCEATDMESRMYAKAKEIETYLQGNPAKPYISCEYMHAMGNSLGGMKHYTD LEDKYAQYQGGFIWDYIDQAVYYTNGYGEKVLGYGGDFKERYTDYNFCGDGIVFADRTIS PKAQEVKYLYQDIIIKPTDAGVIITNKMMFSNTSKYQFVYQLKQGKKVLQEGCFIADVKP GDSKEILIPWLTKLDKEAVKTVSALLKDDQIWAKAGFEVAFGQTIVGEDKPLSVSKQPLK VIIGDGNIGIRYNDFEVMFANSEGLISLKYGNHEYIARAPRPVFSRASTDNERGYKHEVN SSMWFGASTFYNCTKFNYEVASDYSSVKVFFEYTLPVIPATTVDIVFTVRAPGIIRVDYH YHGKAGLPELPLIGMCFKLYDQVDRFEYLGRGPLENYIDRKDGAKIDVYDCLVKDNLSPY LVPQECGNRCDLRYLAITDEKDQGICFKMIDQPFEATVLPYSLQTLEAAMHQEELPKSYF TFVTILAAQMGVGGDDSWGAPILEEYCIKGEEDVEYAFEICKR >gi|223714116|gb|ACDT01000099.1| GENE 32 32837 - 34267 1762 476 aa, chain + ## HITS:1 COG:CAC0518 KEGG:ns NR:ns ## COG: CAC0518 COG0469 # Protein_GI_number: 15893809 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Clostridium acetobutylicum # 7 472 1 469 473 447 49.0 1e-125 MLKNEVMSKTKIVCTIGPASDNKEMLTKLVKAGMNVMRLNFSHGTHPEHQAKIDLITEIN KELDTSVAILLDTKGPEIRTGDFIDGSTEFKKGQVVTICQEDIVGTSDRFTITYKELYKD 
VKPGGFILVNDGQVELLVDHVEGTDIVCVCANNGIVKNKRGINVPGIKLGFDYLSPKDID DLTYGCTQPFNYVAASFVRRAQDVFDVKKLLVENGRPDIQIIAKIENSEGVENIDEILKI ADGIMVARGDLGVEVPAEDVPLIQKEVITKCKDMGKLVITATQMLESMQQNPRPTRAEVS DVANAIFDGTDAIMLSGESASGLYPQEAVMTMSKIALKTENSLDYDALHRQAVRTAPQDT SEAICMSVAEIASKFQVAAIIAFTESGFTARKMSRYRPEARIIAATPEVATTRALAINWG VKPVKCKTMKTRSSMMDYAEIIAKENGVESGELILVTGGKPGLKGDTSYLELVRVK >gi|223714116|gb|ACDT01000099.1| GENE 33 34320 - 34955 754 211 aa, chain + ## HITS:1 COG:lacA KEGG:ns NR:ns ## COG: lacA COG0110 # Protein_GI_number: 16128327 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli K12 # 30 207 22 199 203 231 60.0 1e-60 MLLKDKMKNGQLYREFGHENIEDQEYEKEIERQRLNCKARMFEYNHCHPDNKKEKQDILR GLLGHAGENIWIEAPAYFAYGCNTYIGENFYANFNLVVVDDIEVHIGNNVMVAPNVTLSV TGHPVDPEYRRGGTQFSLPIVIGDDVWIGANSVILPGVTIGDNSVIGAGSVVTQDIPANS VAYGVPCRVIREINDYDKEYYRKGKKVNLDW >gi|223714116|gb|ACDT01000099.1| GENE 34 35011 - 35235 325 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDPLQDEYYMIDNQLDSRQPIIPFWFWWLFPPPWQPGPPWQPGPPGPGPRPPWQPGPPGP GPRPPGPPPRPPRW >gi|223714116|gb|ACDT01000099.1| GENE 35 35338 - 36258 245 306 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 11 295 8 313 319 99 27 7e-20 MKGDNMNYLTLDIGGSSIKYALLNETGEFIEKGSTIAPHQSIEQFVEVIGELYDKYAEQI VGMAISMPGAIDPQKGFAYTGGAYKYIKDMEIVKILQERCPLPITIGNDAKCAANAEVGF GCLKDVDDAAVIILGTGIGGCIVRDGKVHIGKHFSSGEFSWMRTNGEDGNNPECVWAATN GIPGLLKAVQESVGTKEQYNGKEIFEMANSGDKKVLAGLDKFCNRLAVQIYNLQALFDPE KIAIGGGISAQPLLLELVEKHIEEMYQTGLKANSPIARPVVVPCQYRNDANLLGAFYQHL HTCKKN >gi|223714116|gb|ACDT01000099.1| GENE 36 36387 - 37550 1259 387 aa, chain - ## HITS:1 COG:BH0936 KEGG:ns NR:ns ## COG: BH0936 COG0436 # Protein_GI_number: 15613499 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus halodurans # 4 375 5 377 385 326 41.0 5e-89 
MEFSKRMELFQESIFTILAKKAALRRENGKEVIDFSTGTPNIPPTKRIREVLANAALDPK NYVYAINDLPELIKAVIAWYKKRYEVILEDDEICSLLGSQEGLAHLAMTLINDSDIVLVP NPSYPIFKDGPLLANANIVDMPLKEENNYLIDFDKIDPTIAKKAKFMIVSYPNNPTCAIA NDKFYLRLIDFAKKYQIIVLHDNAYSELLFEGLGRSFLSYPGAKDVGIEFNSLSKTYGLA GARIGFAVGNKAIIEQLKTLKSNMDYGMFIPIQLAAIEAITGEQNCVIKTREAYQRRRDI FLKCAQEIGWNIKETKGTMFIWAKIPSLYQKSIDFVNDLFEQTGILFIPGSAFGSEGEGY IRIALIQDEEIIKKAFIKMKATNIFKN >gi|223714116|gb|ACDT01000099.1| GENE 37 37844 - 38311 373 155 aa, chain + ## HITS:1 COG:BS_ybfS_3 KEGG:ns NR:ns ## COG: BS_ybfS_3 COG2190 # Protein_GI_number: 16077304 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Bacillus subtilis # 13 135 8 131 137 112 47.0 2e-25 MGVFKKEIQVNCPIEGKVYPLEYCPDEVFSKGMTGEGVVIFPTGNRVVALIDGKINFIFS SKHAIVLKGQGIEFLIHVGLDTNQLNGDGFKIFVDPDQVVKQGDILLTFDSEILKKHRCI DATPFVFTNLNRKNITVNRYGMVNLNEPFIIVERG >gi|223714116|gb|ACDT01000099.1| GENE 38 38316 - 39767 1514 483 aa, chain + ## HITS:1 COG:melB KEGG:ns NR:ns ## COG: melB COG2211 # Protein_GI_number: 16131946 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 11 444 4 439 469 336 40.0 7e-92 MNNDRLSFKEKYSYGVGAIGKDMCCGIIFTYCMLYFTDVLKLSASFVGTLFFLAKFWDAV NDLGMGMIVDNTHSRWGKFRPWLAIGTIVNAIILVALFTDWGLSGTSLYIFAAIMYIVWG MTYTVMDIPYWSMLPNLTSNPEERDRVAVIPRIFASIGGSLLVGGFGLQIMDFLGNGDAQ VGYTNFAIVIAIIFIVTVGITVVNVKSADRVEAKKPEKTSFRKMFEIICKNDQLLVAIAT ILTFNFGMNCIAGVQTYYFIYVAGNKGLFSVFTMFAGFAEIFGLIIFPKLSQRLSKQQVY ALASGIPVVGLIILLVTGFIAPQNYILTAVAGICVKFGSGLQLGTVTVVLADVVDYGEYK LGTRNESVIFSIQTLLVKFASAMGALFTGFALDATGYVAGASQTMATQNGMRIIMVALPI ILVLISYIIYKKYYKLNGAYYQRIMNIIALRKEENQMILDDVTEDIKNSNEPIIDLKEKK YAN >gi|223714116|gb|ACDT01000099.1| GENE 39 39757 - 41985 2016 742 aa, chain + ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus 
halodurans # 1 739 1 741 748 901 58.0 0 MPIKFHKTTKIFHLYNQEISYIFKILKNNQLGQLYYGKAIKDRENFDHLFETMQRPMASC AFEGDLTFSLEHIKQEYPAYGSGDMRHPAIEIAQANGSRIISFNYQGHTIISGKPKLDKL PATYVESPDEATTLLITLYDNIIDSKIILSYTIFENMPVITRNAYIENCGEQKITLEQVM SMSLDLPDKEYEMIELTGAWARERSIKERKLEHGIQSIYSLRGCSSNNFNPFIALKRENC NEYAGEVLGFSFVYSGNFLAQVEVDTYDVTRVMMGIHPHCFSWSLEKGESFQTPEVVMVY SNDGLNKMSQTYHKLYQRRLGRGKYRDQVRPILVNNWEATYFDFDEEKIMDIAGTAQKLG IELFVLDDGWFGKRNNDVAGLGDWYPNLQKLPSGISGLSRKINNLGMKFGLWFEPEMVNK NSDLYRNHPQWILETPNRPSSHGRNQFVLDFSNPEVVIYVYNMMTEVIDKANISYIKWDM NRCMTEVYSSCHDSESQGRVMHEYILGVYQLYELLTSRFPEILFESCASGGARFDPGMMY YAPQCWTSDDTDALERLKIQYGTSMVYPVSSLGAHVSAAPNHQLLRNTPIETRANVAYFG TFGYELDLNKLSMDEQIKVKEEITFMKDYREIIQFGTFYRLKSPFEGNETVWMVVSQDQN TAIIGYYRTLQEVNVGYRRVKLLGLDPDKEYHVNLNNTVHYGDELMNLGLITTDSSCGEN KEKYDGTNGDYLSRIYVLKAKI >gi|223714116|gb|ACDT01000099.1| GENE 40 42267 - 42944 964 225 aa, chain - ## HITS:1 COG:FN0455 KEGG:ns NR:ns ## COG: FN0455 COG1592 # Protein_GI_number: 19703790 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Fusobacterium nucleatum # 53 225 5 179 179 178 57.0 7e-45 MKYVCKICGFIYDEATQDAKWNDLDTDWVCPVCKAPKEKFEAVAETSTSKYAGTKTEKNL MEAFAGESMARNKYTFFAEAARKEGLEQIAAIFEETADQEKQHAKMWFAAFHGIGTTDEN LEAAAAGENEEWTEMYARMAQEAREEGFEELAIKFENVAIVEKSHEKRYLKLLESYKNGT TFKGDAPKGWKCRQCGFIYEGDEAPEHCPTCGYPKAFFERMCENY >gi|223714116|gb|ACDT01000099.1| GENE 41 43758 - 44624 784 288 aa, chain - ## HITS:1 COG:SP1899 KEGG:ns NR:ns ## COG: SP1899 COG2207 # Protein_GI_number: 15901726 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Streptococcus pneumoniae TIGR4 # 14 272 13 275 286 168 35.0 1e-41 MYSYLELENKSFEDFYLKFCGMQECKPNYSYGPAVRPNYLLHYCLSGQGEYHVNNQVYQI KAGDAFLIMPNVVTYYQADQHEPWTYLWISFDGSKTRQYLKRCNLDEHNLVVHCDYINEL RETTIAILAHNKLSYSNELFIQGQLYTFFSFLAKSANITYNDNVAPNYNPYVDKAIEYIQ NNYQEMVTVNEIADYLSLNRSYLTTLFKKHLHLSPQEFLLKYRMMRAEDLLTNTDLTINQ IAFSCGYSNQLSFSKAFSNSHQMAPRDYRKQFKLVDNSRMNDPHKRKE >gi|223714116|gb|ACDT01000099.1| GENE 42 
44722 - 46197 1902 491 aa, chain + ## HITS:1 COG:no KEGG:BMQ_4334 NR:ns ## KEGG: BMQ_4334 # Name: spoIVA # Def: stage IV sporulation protein A # Organism: B.megaterium_QM_B1551 # Pathway: not_defined # 1 491 1 492 492 482 47.0 1e-134 METKNILHDIAKRCDGDIYLGVVGPVRSGKSSFIKRFMEMAVIPYIEDKDAKLRAIDELP QSGKGKMIMTVEPKFIPNQAVEMLMDENFKVNVRLVDCVGYVIEGAKGYQDDQGIRYVKT PWYLESIPFDQAAKVGTKKVIQDHSTIGIVITSDGSICDIPGANYNEATDSIVEELLDID KPFIIIINTKDVNSSACKKEYERLTAKYDVPVLSMDVTNMEEPQIVSLLKDALYEFKISE VRIEVPKWLAMMSNSHWLKQTLDTSLKESLQSIKKFKDVERIGDTISEYDFVDKAYLTGI DTTTSSATLKIEEREGLYNEILEEIIGDKGFDQATFLTFIQELVDIKKEYEGFSNAIRMV KQTGYGYAIPKLDEIELSDPEIIKQGPRYGMKLVSKAGTTYMIKVDIESTFEPIIGSREQ AEAFIEYLNSSGNDKQAIFDCDVFGRKLGDLIEEGMYIKLNAIPENASLRLHDILSKIVN KGKSNVIAIVL >gi|223714116|gb|ACDT01000099.1| GENE 43 46250 - 46534 371 94 aa, chain + ## HITS:1 COG:BH1309 KEGG:ns NR:ns ## COG: BH1309 COG0776 # Protein_GI_number: 15613872 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Bacillus halodurans # 5 94 1 90 90 83 51.0 1e-16 MEVVMNKKELVETVSKERNLTKKDAEILVDTVFNTITRSVVEGDKVLISGFGTFKVNDRK ARKGISPKTQEEMVIPASKTVTFKPSNRLKDAMN >gi|223714116|gb|ACDT01000099.1| GENE 44 46608 - 47315 933 235 aa, chain + ## HITS:1 COG:BH1677 KEGG:ns NR:ns ## COG: BH1677 COG2738 # Protein_GI_number: 15614240 # Func_class: R General function prediction only # Function: Predicted Zn-dependent protease # Organism: Bacillus halodurans # 12 235 3 224 224 190 44.0 2e-48 MPFYYFPMNSYYMFGYLLVIIGSLIMIYGQIKVNSAYKRYERIPNSRGITGAMVAREILD RNGLSDIQIHVVNGKLSDHYNPRNKTINLSREIHDGTSIAALAVASHECGHAIQHLVGYK PLVFRNAILPLCNVGQYLGWIAVFIGLIMGNTSVAWIGVFLMGGILLFQIVTLPVEFDAS SRALRILKANYLTTDEYSGAKSMLSAAAFTYVAAMLSTVLSLLRIVLIVIGNDRD >gi|223714116|gb|ACDT01000099.1| GENE 45 47473 - 47883 484 136 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733824|ref|ZP_04564305.1| ## NR: gi|237733824|ref|ZP_04564305.1| predicted protein [Mollicutes bacterium D7] # 1 136 27 162 162 229 100.0 4e-59 
MLWVHVPYSKHKNEITSIENVIVKENNYSNVKYFDRYNSDKSYYIIRATKGKKQQYAVFD EKYKLIKSYSGDVVDQQVCINAFKEKYKQEPSEVSVGYENEIFVYCLMYQGKDSLVYAFY GIDNGEFVKAYRIDNG >gi|223714116|gb|ACDT01000099.1| GENE 46 47876 - 48487 532 203 aa, chain + ## HITS:1 COG:BH1697 KEGG:ns NR:ns ## COG: BH1697 COG3935 # Protein_GI_number: 15614260 # Func_class: L Replication, recombination and repair # Function: Putative primosome component and related proteins # Organism: Bacillus halodurans # 5 178 5 193 233 80 32.0 1e-15 MDRMMLKSLVESKCINIERLLVLSAKELKIDGNECHILLLIYTLMEAGIKTVTPQMIQNY SMLTSHDLNKVLQSLLNKKLIYNRAGSISLNHLEDKLLQDHESASAKQEEESLNIVTIFE EQFGRPLSPIELNIIKDWKESDYSDEMIVKALKEAVKSQVLNFRYVEGILNNWAKNGIKQ RYVESEQPQRKVPISEYKWWENE >gi|223714116|gb|ACDT01000099.1| GENE 47 48402 - 49121 615 239 aa, chain + ## HITS:1 COG:BH1698 KEGG:ns NR:ns ## COG: BH1698 COG0177 # Protein_GI_number: 15614261 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Bacillus halodurans # 27 232 2 207 218 239 55.0 3e-63 MVLNNAMLKVNNHSEKFQLVNINGGKMNKEKTNRVLEYFDELFPDAYCELNHESDFQLLV AVMLSAQTTDKKVNQLTENLFKKYPTVEAVSQASLPELEQDIKTIGLYRNKAKNLLALSH VLIEQFDGIVPSDQKQLESLPGVGRKTANVVRSVAFDIPAFAVDTHVERISKRLGFAKRD DNVLTVEKKLCRSIPRNRWNKSHHQFIFFGRYFCKATNPSCTECKLFDMCKDPIKNKYL >gi|223714116|gb|ACDT01000099.1| GENE 48 49184 - 51844 2962 886 aa, chain - ## HITS:1 COG:BS_ponA KEGG:ns NR:ns ## COG: BS_ponA COG0744 # Protein_GI_number: 16079289 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Bacillus subtilis # 23 689 40 696 914 310 33.0 9e-84 MTDSPTRKRKTKKKLNWKKVATSLVVVILIVCLGMGCFFAWSVYKETDDFSAKKILSGEA SKMYDINGELMYTFGSDTNGKRENVTYEDLPQVLIDAVVAAEDSRFFEHNGFDLPRIAKT MITNLAHLSIRGGGSTITQQVIKKSYFPKEEQTYTRKLSEIILAIQATKEISKEEIITLY LNKIYFGRSTSSIGIAAASKYYFNKDVSQLTLPEAALLAGTLNSPSAYDPYYNLEKATQR RATVLNLMVDHGYITQEECDLAKQVKVENTLSQGITSSDDALAAYVDIVTKEVKEKTGLD PAETQMNIYTYCNQEVQKMATAMANGETYKYSDEDMQMGGSVQSTQDGRIIAVIGGHNYK 
SGDLNKATVKQQPGSSMKPLLDYGLAYKYLNWSTVNTVKDEPLTLGGKTIKNWDGQYHGS MSISDALLNSWNIPAITTFQEVKKAAGSDAIIKDMESVGIDMSDQDKELSDTYAIGGWKT GVSPVELAGAYATIANNGNYIESHTINYVEVVETNKTVKIDEELQKNVTQSIGEDVSFMI RETMLSYSSSTTGSYSSLSGLDNVGAKTGTSNWDSTSPYVKAGKSRDLWMTAYTPDYTCS VWMGFDTTGIKKGKNTSDYKAYPAKVVAELLKYLQKNTTDKKSYPEQPSGVEQIQVVAGT YDSPTANTPASQILTGWAKTGYGPTTAAKDPEINTLSAFEASINGNGKIDVSFTAYEPAN PDIVYVVEIYDESNNLLYSTKLTTNSGTIDYAPTGKVKVVGYYAYSSGSPTSNKIEKTIG SSAKLDKVNYSITSGGSSVTNGSTTATSSVAIKVNAQSSSNTITIQFMNSSGGTIGSPST FTGSATKEFALTPGAQYTVKITESDGTNTVSASISFTVGGNSNPGE >gi|223714116|gb|ACDT01000099.1| GENE 49 51834 - 52433 494 199 aa, chain - ## HITS:1 COG:BH1703 KEGG:ns NR:ns ## COG: BH1703 COG3331 # Protein_GI_number: 15614266 # Func_class: R General function prediction only # Function: Penicillin-binding protein-related factor A, putative recombinase # Organism: Bacillus halodurans # 2 192 3 198 199 185 53.0 5e-47 MVNYPNMKKTTVHQTKLIDGKLNTRHRGMNLEEDLNLTNKYYLARKIANIHKKPTPIQIV KVDYPKRSAARIVEAYFKTPSTTDYNGIYKGKYLDFEAKETKKQNFPFTNISVHQIEHLK SVIEHQGIAFVIIAFTHLNEVYLVNASYVIDAYYQPDQKSISYQTVKEKGHLIEQGFNPR LDYLKIIDQYYLGGHENDR >gi|223714116|gb|ACDT01000099.1| GENE 50 52640 - 52960 359 106 aa, chain + ## HITS:1 COG:no KEGG:BAA_1649 NR:ns ## KEGG: BAA_1649 # Name: not_defined # Def: hypothetical protein # Organism: B.anthracis_A0248 # Pathway: not_defined # 2 102 8 106 112 64 42.0 1e-09 MLSPKKIIAKEFKVDFKGYNAEEVDHFLDMVVNDYEAFAAMLNASYDKIDQLEARLSEQK IKIAKLEREKALQDDNIHAMEENLSTNVDILKRLSLLEKVVFNQNR >gi|223714116|gb|ACDT01000099.1| GENE 51 53103 - 53510 419 135 aa, chain + ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 119 1 122 195 75 36.0 3e-14 MEIAKKIKTARIKVELTQEHVADELQVSRQTISNWENAKTYPDIISIIKLSDLYQISLDE LLKGDDKMIEHLNESTNIVSSNKKLICVAIINIILFFGLVIFNTIISNNKYLLASVLVIA MCSIAALFYQVIKKF >gi|223714116|gb|ACDT01000099.1| GENE 52 53522 - 53842 346 106 aa, chain + ## 
HITS:1 COG:no KEGG:SmuNN2025_1556 NR:ns ## KEGG: SmuNN2025_1556 # Name: not_defined # Def: hypothetical protein # Organism: S.mutans_NN2025 # Pathway: not_defined # 6 100 3 98 101 68 39.0 9e-11 MKWYEIILVFIVLPGAVNTAYQVFKIAELDARSRGLKHPKFLGFIAIGGQNSSGLILYLI GRHKYPSTLSNQDREVMNSRKKKVGVGLIFIVSGAIGLIINKVLIG >gi|223714116|gb|ACDT01000099.1| GENE 53 53962 - 55029 739 355 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3483 NR:ns ## KEGG: EUBREC_3483 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 337 1 337 354 196 37.0 1e-48 MTIYQAMQLNVKNLKAAISDAPTKKDKHRYIAAMILRNIFCLAFCIIFITFFTTLFGNEN SSVGIIGLLAVLSLRFTDLDFNIKQSTLALIASFVICAIAPHLANIVPLGFGLLINFSAL LLILILTSHQVSYANHFTFILGYILLWGNDVSENSYTLRIIAMMVAAIFTALVYYHCHYQ QTMKLNFKDILKDFFSPSKRTFWYLKISSAIALVIFLGEMLNFSRTMWIAFACMSIINID HEQIIYKFKHRALFVVVGSIIFGILFTIVPKEYLGLAGILGGIMVGFSGSYHWQTVFNCF GALAITIDLFGFLNAIIIRIVANIAGSIFSFVYHHLFEKIIQLLINENLMFDNNL >gi|223714116|gb|ACDT01000099.1| GENE 54 55029 - 55466 368 145 aa, chain - ## HITS:1 COG:no KEGG:Amet_2040 NR:ns ## KEGG: Amet_2040 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: A.metalliredigens # Pathway: not_defined # 13 142 9 137 141 67 32.0 2e-10 MKEPEWIKMINYSQEIRLFSHLLNRRGKPKNVLTKDELDLLSILIIDDEIITPIIISKRM CVSKPLVSRLIEQLNKKELLLKSPHPFDKRSYCLKITAQGRKHLDDIYTYYLEPIYQLKK KLPSDDFTQLINLIKKSNITLKGGQ >gi|223714116|gb|ACDT01000099.1| GENE 55 56410 - 56964 361 184 aa, chain + ## HITS:1 COG:BH3394 KEGG:ns NR:ns ## COG: BH3394 COG1309 # Protein_GI_number: 15615956 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 7 167 7 163 186 70 26.0 2e-12 MNKHDKTRYVFAQAIKGLIKVHPLDKIAVTDIVTRSGMTRQTFYRYFKDKYDLVNWYFEK LVLKSFRQMGDGCSLQEALQLKFAFIKSEHSFFKEAFKSNDYNNLVNYDFNCIYEFYKNI IEKNLGQSVTADIDFLLKMYCKGSIDMTVEWVLGDMPISIDDIVKLLIEALPQRLEPFIL NIKK >gi|223714116|gb|ACDT01000099.1| GENE 56 57105 - 58250 1575 381 aa, chain + ## HITS:1 COG:ECs3659 KEGG:ns NR:ns ## COG: ECs3659 COG1454 # Protein_GI_number: 
15832913 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 1 379 2 381 383 461 61.0 1e-130 MANRFILNETSYHGSGAIREIVTEVKARGFKKAMVCSDPDLIKFGVTKKVTDLLDAANLD YVVYSDIKPNPTIENVQSGVKALQEAKADYMIAIGGGSSMDTAKAIGIIDKNPEFSDVRS LEGVAETKNPCTPILAVPTTAGTAAEVTINYVITDAQKDRKMVCVDPHDLPVVAFVDPDM MASMPKGLTAATGMDALTHAIEGYITAGAWEMSDMFHIKAIEIIARSLRGAVENTPEGRE GMALGQYIAGMGFSNVGLGIVHSMAHPLGALYDTPHGVANAIILPTVMEYNAPATGTKYK DIAEAMGVDTTGMDQEAYRKAAVDAVKKLSQDVGIPADLKEIVKVEDLDFLSQSAYDDAC RPGNPRETSVAEIKELYQSLL >gi|223714116|gb|ACDT01000099.1| GENE 57 58634 - 59626 1103 330 aa, chain + ## HITS:1 COG:lin1432 KEGG:ns NR:ns ## COG: lin1432 COG1426 # Protein_GI_number: 16800500 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 1 273 1 258 311 63 22.0 5e-10 MENISEHVRREREARGVTIEELAKGTFISVAVLKDIESGKFDKYKGDELYLKMYLRKIAK YLGLEEKELDAEFEALTQEIQLEEIRQEDLNKRLVEENKKNITISGKVSDTFKDLKNVKP KKSISNKRVYEDRYLLRYFKYALVGVVCVAIIFVVWYAIVAGRSNTEAPDFKDNNTPTVE GNNDAGKQDSDDNKNSNTDDKNEDDKKAETPTVEITKNGELDYSIKLDPSMTTFKFKMEF VGRTWSQLNVNGSDYSGFKSGIYNNANKSNATDAAPEIVELDIPVENFQNLELKLGYFMG HRFYINDQPLEIDASEYDGGNHTLKITRVQ >gi|223714116|gb|ACDT01000099.1| GENE 58 59626 - 60219 233 197 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 5 193 484 669 904 94 31 2e-18 MNLPNKLTVTRVLLVPFLILIYMFPYDSAGITVPVYHVLETNISLVNIIILFIFIIASIT DYFDGKIARKEKLITTFGKFADPIADKLLINTIFLLLASDQTINIIIPIIMISRDTIVDA IKMSAASKQVVVAASKLGKLKTVSQMIALGFLLVNNFPFTVLGVDVANALAWLATVISVI SGIDYFLKNREMLTETM >gi|223714116|gb|ACDT01000099.1| GENE 59 60219 - 60677 231 152 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 12 138 763 891 904 93 39 3e-18 MDELAKTLIQYNISISSVESFTVGNFAAMLGSIPGISKVYKGSLVTYQSTTKERLLGISH 
EIISKYGVVSKEVASLMCVNGKQILDSDVCVSFTGNAGPDAMEGKPVGLVYIGILYQGVN IYELNLKGTREEIQKQAIDFVIRKLNEKIKVN >gi|223714116|gb|ACDT01000099.1| GENE 60 60716 - 61753 1388 345 aa, chain + ## HITS:1 COG:lin1435 KEGG:ns NR:ns ## COG: lin1435 COG0468 # Protein_GI_number: 16800503 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Listeria innocua # 18 344 7 333 348 433 65.0 1e-121 MAEAKTKENKEEAKKVQALDDAIKQIEKKYGKGSVMKLGDRAAVSVDVIPTGSLTLDLAL GIGGYPKGRIIEIYGPESSGKTTLTLHAIAECQKQGGRAAFIDAEHAIDPVYAQNLGVNI DELILSQPDSGEQGLEIAETLVRSGAIDLVVVDSVAALVPQVELDGEMGDSQMGLQARLM SKALRKIAAELNKSECTIIFINQLREKIGIMFGNPETTTGGRALKFYSSVRVEIRRSEAI KLGTEIVGNKVNIKVVKNKVAPPFKTTQVDIIYGKGISRDGEVLDLAVEKDIVEKSGAWY AYKGEKIGQGRENAKTFLSTHSEIMEEITQAIKDSLESDNKSEQE >gi|223714116|gb|ACDT01000099.1| GENE 61 61852 - 62157 289 101 aa, chain - ## HITS:1 COG:CAC2947_2 KEGG:ns NR:ns ## COG: CAC2947_2 COG0551 # Protein_GI_number: 15896200 # Func_class: L Replication, recombination and repair # Function: Zn-finger domain associated with topoisomerase type I # Organism: Clostridium acetobutylicum # 1 96 10 105 110 76 39.0 1e-14 MLVCSNPDCKHRESVSIITNARCPNCHKKLELVGKGDKQMFVCKTCGYRQHMNAFKKERE NKNKLARKSDVKKYMNQQKQQQTAVEDSPFAALLKLKDELK >gi|223714116|gb|ACDT01000099.1| GENE 62 62067 - 64022 2019 651 aa, chain - ## HITS:1 COG:BS_topB_1 KEGG:ns NR:ns ## COG: BS_topB_1 COG0550 # Protein_GI_number: 16077493 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Bacillus subtilis # 1 565 1 572 575 670 60.0 0 MSKTLVLAEKPSVGRDIARVLQCNKNVNGGLEGNKYIVTWALGHLVTLADPEKYDKHYQK WDLNDLPIMPEKMQLVVIGKTAKQYNAVKNLMNRNDVKDIIIATDAGREGELVARWIINK AHIKKPMQRLWISSVTDKAIKEGFKKLKNAREYEALYHSAYSRSIADWIVGINATRALTC KYNAQLACGRVQTPTLAMIAAREEEIKDFVPRNYYGIEIISSQINWTWLSAKKEKHLFNE AKIDETIKKLQNQKLKIEKISTFTKKKYAPQLYDLTELQRDANKLYGFSAKETLATMQDL YEYHKVLTYPRTDSRYLTDDIVPTLKERLEASRGGDYDNIIDKILKSPIRKQSHFVNNSK VSDHHAIIPTEQPAMLGSFTDRELKIYELVLKRFLAVLLPPYLYEQTTINAIVNQEIFTV 
SGKIETQIGWKEVYGHEEDSEDQTLPSLKQNQFLPITEITKTTGQTTPPGYFNEATLLSA MENPVRYTKNTSKQLSSTLVNTGGLGTVATRADIIEKLFNTFLIEKSGNDIHVTNKGKQL LKLVPQDLKEPELTAKWEMELAKIAKKEQKSNIFINEIKKYTRSLIDEISSSEAKFKHDN LTTKKCPECNHFMLEVNGKKEKCLFALTPIVNIVNLFPSSQMLAVLIVIKN >gi|223714116|gb|ACDT01000099.1| GENE 63 64321 - 65877 2088 518 aa, chain + ## HITS:1 COG:BS_ymdA KEGG:ns NR:ns ## COG: BS_ymdA COG1418 # Protein_GI_number: 16078759 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Bacillus subtilis # 20 518 22 520 520 454 52.0 1e-127 MPGNIAFSILTYVLAVALGFFINYLINKLKISKANVSAAKIIDDATAKADNLVKEAILDA KTEAYELKLQAEKEAKEQKQEINELENKLLQREQTVDRRDIAVQGKEDVLAQKVVQLDER EAGLGKLEAELKEKINAKIGELEKIAAMSANEAKQELFKQVEQQTATEMTAYIKDQEEEA RSKASLLSRDIIANAINRYAQEETIERTVSVVALPSEEMKGRIIGREGRNIKAIEQCTGV DLIIDDTPEAITISCFDPVRREIARLSLETLIRDGRIQPGRIEEVVQKTKNELDEVIRKT GEDAVFELGISKIDKDIIMMLGKLKYRTSYGQNALQHSLEVAHLAGIMAAELGLNQQLAR RAGLLHDLGKAVDHEMEGSHVELGAKFAKKHGEHATVVNAIESHHGDVPATSVISILVAA ADTLSAARPGSRSETIENYIQRLEKLEEMAKSFDGVDRVFAIQAGREVRIVVKPDKVDDL MSHKIARDIKTKIEEELTYPGHIKVTVIREVRASEVAK >gi|223714116|gb|ACDT01000099.1| GENE 64 65919 - 66692 909 257 aa, chain + ## HITS:1 COG:BH2376 KEGG:ns NR:ns ## COG: BH2376 COG1692 # Protein_GI_number: 15614939 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 257 1 262 264 208 42.0 1e-53 MKILFIGDVFGSIGREMIEDYLHRIVKDNQIDFVIANVENTSHGKGLLKRHYDELSFQGI QAMTMGNHTFDKKELYDYIDEADKLIVPINQPKVLPGVKSRVFNVKGKLIRITSVLGVAF MDSRTSNPFEVIDEYLDLPHDIHIIDFHGEATSEKIAFAHYVKDKASAVLGTHTHVQTAD EKIIDSKLAFISDVGMTGPYMSAIGCDLDAIVTRLRGFGAPFIVAESSGQLSGVIITFEE NTPVAIERILINQDHPY >gi|223714116|gb|ACDT01000099.1| GENE 65 66738 - 66998 364 86 aa, chain + ## HITS:1 COG:BH2375 KEGG:ns NR:ns ## COG: BH2375 COG2359 # Protein_GI_number: 15614938 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 86 1 86 86 90 55.0 7e-19 
METIKVSSTSCPNNVAGAIASMIRNESKLQIQVIGAAALNQAIKAIAIARGYIIPTGNEI VCIPSFHDLIVDDKEITALRLLLELR >gi|223714116|gb|ACDT01000099.1| GENE 66 67110 - 67355 381 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754517|ref|ZP_02426644.1| ## NR: gi|167754517|ref|ZP_02426644.1| hypothetical protein CLORAM_00018 [Clostridium ramosum DSM 1402] # 1 81 1 81 81 149 100.0 6e-35 MAKCKLLMQTAQTQKLIDFFDGYEDDKVKCKFIKKTGIKAEIECETELTPDEAAGHCKSL FKKTPDGAVLYFSIQPDGFFG >gi|223714116|gb|ACDT01000099.1| GENE 67 67482 - 67730 229 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754516|ref|ZP_02426643.1| ## NR: gi|167754516|ref|ZP_02426643.1| hypothetical protein CLORAM_00017 [Clostridium ramosum DSM 1402] # 1 54 1 54 481 95 83.0 9e-19 MKDYSNYFKGPSLNEAKRRSRDEVGRYDDAYHVPGDMVKVGLGKNIIFKHMGVKQMNVTV KLYQEYWKVCHINRLQKLKRLT >gi|223714116|gb|ACDT01000099.1| GENE 68 67628 - 68926 635 432 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 168] # 1 420 8 426 451 249 32 4e-65 TYGCQANERDSETLSGILESMSYQPTTEIKEADVIVLNTCAIRENAEEKVFGKVGYVKNL KKTNPNLIFAMCGCMAQEEVVVNRILEKHPHVDLIFGTHNIHRLPELLKDALYSKEMIVE VWSKEGDVIENAPVRRDNKYKAWVNIMYGCNKFCTYCIVPYTRGKERSRLAKDIIKEVEE LVAEGYQEITLLGQNVNSYGKDLGEDYNFSNLLEDVAKTNIPRIRFTTSHPWDFSEDMIK IIAKYDNIMPAIHLPVQSGNNEVLKLMGRRYSREQYLELFHKIKEYIPDCTVTTDIIVGF PNETHEQYLDTLSLYQECEYDLAYTFVYSPRAGTPAAKMVDNVASDEKDQRLYKLNEIVN EKAYKQNQRFLNKIVEVLVEGTSKKDDSMLTGYTRHQKLVNFKGDPKDIGKIIKVKITEA KTWALKGESIES >gi|223714116|gb|ACDT01000099.1| GENE 69 68916 - 69245 153 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733846|ref|ZP_04564327.1| ## NR: gi|237733846|ref|ZP_04564327.1| predicted protein [Mollicutes bacterium D7] # 1 109 4 112 112 144 99.0 2e-33 MNRSLNCAKQLNKYLLNLDVIKEYQKYEQLIHQDDKIEKLEAKMKAYQKKIVNQKSKQDE TVVKTIEEYQKIKDEFENHPLVVNYLYLKEEVDSLLQSINTYINGQLLK >gi|223714116|gb|ACDT01000099.1| GENE 70 69321 - 70835 1898 504 aa, chain + ## HITS:1 COG:lin0179_3 KEGG:ns NR:ns ## COG: lin0179_3 COG0516 # Protein_GI_number: 
16799256 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Listeria innocua # 227 498 1 272 276 487 84.0 1e-137 MAYFFDEPSRTFNEYLLVPGYSSAECQAKNVSLKTPLVKFKKGEEPALSLNIPLVSAVMQ AVSDDKMAVALAREGGVSFIYGSQTIEDQAAMVKRAKTYKAGFVPSDSNLSISDTLADVI ALKEKTGHSTMAVTEDGSANGKLLGIVTGRDYRVSRMGLDTKVTEFMTPYDKLICGHIGI TLKEANDLIWDNKLNALPIIDDDQKLAYFVFRKDYETRKSNPNELLDDSKRYVVGAGINT RDFAERVPALVEAGADVLCIDSSEGFTEWQKITIDWIREHYGDSVKVGAGNVVDADGFRF LAESGADFVKIGIGGGSICITREQKGIGRGQATATIEVAKARDEYFKETGIYIPICSDGG IVHDYHMTLALAMGSDFIMLGRYFSRFDESPTNKVSINGQYMKEYWGEGSARARNWQRYD MGGDSKLSFEEGVDSYVPYAGSLKDNVGLTLSKIKSTMCNCGALTIPELQEKAKITLVSS TSIVEGGAHDVVLKDSTSGNGHQN >gi|223714116|gb|ACDT01000099.1| GENE 71 70905 - 71399 530 164 aa, chain + ## HITS:1 COG:no KEGG:Aflv_1521 NR:ns ## KEGG: Aflv_1521 # Name: cotE # Def: outer spore coat protein # Organism: A.flavithermus # Pathway: not_defined # 6 141 16 156 189 85 35.0 6e-16 MANNIREIITQAVIAKGKKRTLNKYPFSIEGYDKILGCWITNHRYNAAFKDGKPVVLGTF DVHLWYSINDDSSLLKQTVSYLNELDLVKKETRNFDEGDELMVTCNREPKCISVNKLEDK VVIEIEKEISLSVVGKTTMRVETKAENETWDELENLTVDENFIK >gi|223714116|gb|ACDT01000099.1| GENE 72 71475 - 73985 2780 836 aa, chain + ## HITS:1 COG:lin1440 KEGG:ns NR:ns ## COG: lin1440 COG0249 # Protein_GI_number: 16800508 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Listeria innocua # 4 831 3 856 860 778 47.0 0 MKPKYSPMMMQYLSIKEENQDSIVMFRLGDFYEMFFDDAIMVSKELEIALTGKNAGAPER VPMCGVPFHSASGYIQKLVDNGHRVAIVEQLTEPGKRGIVERGVVQIITPGTIFDESLTN NKNNYIAALEEFDFTYTLAFCDITTGEFSVVNIEKKEKLLLNQLETMAVKEIVTLPNQIR EFDDILSSPFAHDNYNEKYDKIFSKINDLKQIKTASLLLNYLLDTQKRELEHLQIIEEIN NEDYVTMDLYTKKALELTSSAKSNDKYGSLFWLLDQTKTAMGSRMLKQWIERPLINQEQI EERLDIVEIFTNHFIQRESIKEILKDIYDLERLSSRIAFGNINARDLKWISSSLKVVPEL KNQLLSLDEPLISALADHFTDLSHITNLIDQAIVDNPPLTVKEGNLIKEGFNEELDELRY IRDHGKQWLAQFEQNERDKTGIKGLKVGYNRVFGYYIEVTKGNLSLVKDEFNYTRKQSLS NAERFVTPELKEMESKLLSAQDKMIKLEYALFTEIRNYIKKDVHAIQDVAKIIAKIDVFQ 
SLAMISSENSYVRPTFNHNKVFKVVDGRHGVIERVMAQGTYVSNDVNIDAANPVMLITGP NMGGKSTYMRTIALIALMGQIGCFVPCSEACIPIFDQIFTRIGASDDLISGQSTFMVEML EANNALRFATENSLIIFDEIGRGTATFDGMAIAQAMIEYIAAKIKCITLFSTHYHELTFL EEKNLGIKNVHASASIENDDLVFLYRIKPGRSNKSYGVNVAKLAKLPDAVLNRANVLLEA LEENNIEHHLSDDTLKEAPPVTVSVVEKYLEGIDPMALSPIDALSTLIELKKLNEK >gi|223714116|gb|ACDT01000099.1| GENE 73 73995 - 75827 1650 610 aa, chain + ## HITS:1 COG:lin1441 KEGG:ns NR:ns ## COG: lin1441 COG0323 # Protein_GI_number: 16800509 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Listeria innocua # 2 610 3 603 603 510 45.0 1e-144 MQKIRQLDDVLANKIAAGEVIERPANVVKELVENSIDANSTKIDVIIEEGGLNLIQIIDN GEGMVKEDALLCFSRHATSKIKDDQDLFCIQTLGFRGEAIPSIASISNFELKTSTGGTGT TVTYEYGRFVECNESDSKKGTNIKVEKLFQNVPARLKYIKSTNAEFANIQTYLERLSLSH PHIAFMLVHNGRTIYKTNGNGNLLEVISNIYGLNVAKAMIPVDFEDDEFHVTGYVSKIDV NRASKNHMVTMVNSRVVKNKVSVDSINNAYRRYLADKRYPIAIVNIEIDPYLVDVNVHPS KLEVRFSKESQLRELIYQGVSDALAKVNLTYDATAEYKKTKAPLNLEQPSLDLTYESVPI KTIEQPRFEADFPQNDQEVVFKDASFDDFTFVKEETSEYIIPDQQIIETKVEPREKLMKK KLFVKGQVHGTYIICEDETGMYIVDQHAGQERINYEYFLEKYQNLDLSMRDLLVPITLEY PLSEFLIIEERKDLLTKVGINLEVFGNSGYVIKQLPLWMQNIDEQVFIEDMMTQLLNDNK IDVIKLQDHAIATLSCKASLKANTHLSTEGMQNIIDNLMRCDNPYVCPHGRPTIIYYSTY EIEKLFKRVV >gi|223714116|gb|ACDT01000099.1| GENE 74 75827 - 76738 983 303 aa, chain + ## HITS:1 COG:BS_miaA KEGG:ns NR:ns ## COG: BS_miaA COG0324 # Protein_GI_number: 16078796 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Bacillus subtilis # 2 285 6 291 314 288 52.0 1e-77 MEKVIVVVGPTGVGKTRMGVALAKHFNGEVISGDSMQIYKTMDIGTAKVTDDEMEGIVHH LIDVKEPTESYSVKDFQDEVRLKIKEIISRGKLPIIVGGTGLYIKAALYDYEFSESQTNH QEYVEKYQDYSNEELYNYLIEIDKASAKELHPNNRQRVLRAIAIYESTGIKKSETLAKQN HELIYNAKFIGLTLERESLYQRIDQRVDLMMEQGLLQEIDGLMKKNYTREMQSMKAIGYK EWFAYYQGTQTLDETLELIKKNSRNYAKRQYTWFNNQVPVIWFNVNLNNFNETINAVISE LEG >gi|223714116|gb|ACDT01000099.1| GENE 75 76850 - 77578 766 242 aa, chain 
+ ## HITS:1 COG:SAP028 KEGG:ns NR:ns ## COG: SAP028 COG1396 # Protein_GI_number: 16119228 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Staphylococcus aureus N315 # 1 62 1 62 79 64 53.0 1e-10 MMFNDNLNKYRKQKGLSQEELAFRLGVSRQSVSKWESGQSTPELERIIEIADLFGISLDE LIGHESNDYVTVDREELRSVVRHLFTYEYKSKFKIGNVPLIHINLGRGFRIAKGIIAIGN IAVGLFSFGALALGVFSLGGLAIGLLTLAGIAIGGISLGGLSIGYFAVGGMAIGIYAIGG MAIASKLAIGGMAHGYVAIGSSANGVHTLVSTNCSLEMISNFILRQHNLNAKIVEFLLLF IY >gi|223714116|gb|ACDT01000099.1| GENE 76 77605 - 78912 1392 435 aa, chain - ## HITS:1 COG:CAC3014 KEGG:ns NR:ns ## COG: CAC3014 COG0422 # Protein_GI_number: 15896266 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Clostridium acetobutylicum # 1 435 1 435 436 575 64.0 1e-164 MNYTTQMNAAKKGIITPEMKIVAAKEKMNIEELKTLIAQGKVVIPANKYHTCINVNGIGS MLKTKINVNLGTSRDWKDLDMELKKANDAVNMGVESIMDLSSYGDTQKFRRKLTKECPAI IGTVPIYDAVVYYHKPLKEITARQWIDIVEMHARDGVDFMTIHVGINKNTAQRFKENKRL TNIVSRGGSIIFAWMEMTGQENPFYEYYDEILDICQKYDVTMSLGDACRPGSIEDAGDIS QIEELVTLGELTKRAWQKDVQVIVEGPGHMALNQIEANIKIQQTICQGAPFYVLGPLVTD IAPGYDHITAAIGGALAAANGAAFLCYVTPAEHLRLPDLNDVKEGIIASKIAAHAADIAK GIKGAKDLDYQMSCARKNLDWEKMFELAIDPKKARTYRQQAKPEKEDTCSMCGNFCAIKN MNRILNGEIVNIYDE >gi|223714116|gb|ACDT01000099.1| GENE 77 79126 - 80391 1271 421 aa, chain - ## HITS:1 COG:BH1257 KEGG:ns NR:ns ## COG: BH1257 COG2256 # Protein_GI_number: 15613820 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Bacillus halodurans # 5 406 6 421 428 366 47.0 1e-101 MSSTLANRMRPQNLKDIIGQQHLVGPNAILTKFVQKSHPFSIILYGSPGCGKTTIAMALA NDLDIPYRIFNASTGNKKEMDAIIEEAKLSGSLFVIIDEVHRLNKARQDDLLSHMESGLL IVAGCTTANPFHSINPAIRSRCHILEVRPLTVDEIVAGLNKAMTSPNGLNSKYTIEKEAL HSIAKLCSGDIRYGYNCLEICTIVCENNHITQADLKTAAIKTNVVYDKDEDNYYDTLSGL QKSIRGSDPNGAMYYFAKLIESKDIESLERRLITTAYEDIGLANPNACMRTVIAMQAAKT LGFPEARIPIASAIIDLCLSPKSKSSENAIDAALTSLNERSLKTPSYLRLTPVGLEDDEK 
YDYSRPELWEYIQYLPKELGNTQFYVPWMTSNYEKALAENYRRILKHGRTSDIKKLNQQN K >gi|223714116|gb|ACDT01000099.1| GENE 78 80468 - 80920 552 150 aa, chain + ## HITS:1 COG:BH1243 KEGG:ns NR:ns ## COG: BH1243 COG1490 # Protein_GI_number: 15613806 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Bacillus halodurans # 1 145 1 145 146 176 55.0 1e-44 MRLVVQKVSQSSVKIEGEIVGAIDKGYMVLVGITNGDDELLVEKMVDKLVNLRIFEDEND KLNLSLLDVGGSVLSISQFTLYANCKKGRRPSFIDAAKPDISSPLYDFFNKKLEEKGINV ERGVFGAMMEVSLINDGPVTIILDSDELFY Prediction of potential genes in microbial genomes Time: Thu May 26 10:14:52 2011 Seq name: gi|223714115|gb|ACDT01000100.1| Coprobacillus sp. D7 cont1.100, whole genome shotgun sequence Length of sequence - 7435 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 + CDS 77 - 1081 804 ## COG0438 Glycosyltransferase 2 1 Op 2 4/0.000 + CDS 1096 - 2124 837 ## COG0392 Predicted integral membrane protein 3 1 Op 3 25/0.000 + CDS 2129 - 2713 778 ## COG0438 Glycosyltransferase 4 1 Op 4 . + CDS 2731 - 4122 1153 ## COG0438 Glycosyltransferase 5 1 Op 5 . + CDS 4115 - 5059 969 ## COG3481 Predicted HD-superfamily hydrolase + Term 5086 - 5127 4.5 6 2 Tu 1 . + CDS 5697 - 5990 122 ## LCRIS_00926 RNA-directed DNA polymerase + Prom 6127 - 6186 7.5 7 3 Op 1 2/0.000 + CDS 6220 - 6597 295 ## COG3328 Transposase and inactivated derivatives 8 3 Op 2 . 
+ CDS 6685 - 7308 347 ## COG3328 Transposase and inactivated derivatives Predicted protein(s) >gi|223714115|gb|ACDT01000100.1| GENE 1 77 - 1081 804 334 aa, chain + ## HITS:1 COG:SPy0515 KEGG:ns NR:ns ## COG: SPy0515 COG0438 # Protein_GI_number: 15674619 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Streptococcus pyogenes M1 GAS # 1 334 1 332 332 287 43.0 2e-77 MKIKLYFGQEDAIKKSGIGRAFVHQKKALELSGIPYTTDKDDLDYDILHINTVWPDSFKI INQARKYNKKIVYHAHSTEEDFKNSFMFSNTVSGVYKKWLISMYTKGDYIVTPTPYSKRI LESYGITLPIKAVSNGVDLTQFNPSQEQIESFIQQYKIDSKDVVIISVGWLFERKGFDTF VSVAKKLPEYKFMWFGDIKLSRPTKKIKDLLDDLPQNVILPGYVSGDIIKGAYGRSNIFF FPSREETEGIVVLEALASKCNVLLRDIPVFSDWLENGVNCYKGNHTDDFVEIIKKMINGE YPSLVEQGYKVAKERELSKIGKELLEVYEETLNL >gi|223714115|gb|ACDT01000100.1| GENE 2 1096 - 2124 837 342 aa, chain + ## HITS:1 COG:lin2698 KEGG:ns NR:ns ## COG: lin2698 COG0392 # Protein_GI_number: 16801759 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 7 329 6 327 357 130 28.0 6e-30 MDKKSNKKYFLNIILILVLGATVIYFTMKDDLHASLRALTTASPLWIVISFALMSIYFLL DGINLYTFGKLYKKDYSYKQGFVNSISGTFFNGVTPFSSGGQFAQVYIFNKQGIAPTNSA SILLMAFIVYQSVLVLFTAVVMIFRYQAYSSMYSEFFSLAILGFLINFFVITGLFLGAKS KRLQDFICNNIVKALSKIRIVKNYEDTSIKIARSLENFRTELNVLLKNKNVLIKSSLINL FKLLIMYSIPFFAAKALNLNVSFIQIFDFIGICSFVYMITAFVPIPGASGGSEGVYYMLF SPILGAVGTPTTLLVWRFVTYYLGLIIGGIIFATNREINRSE >gi|223714115|gb|ACDT01000100.1| GENE 3 2129 - 2713 778 194 aa, chain + ## HITS:1 COG:lin2700 KEGG:ns NR:ns ## COG: lin2700 COG0438 # Protein_GI_number: 16801761 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Listeria innocua # 1 192 1 185 427 115 32.0 3e-26 MRIAIFSDTYIPDINGVATSTKILKDELVKHGHDVVVVTSELPSDSDYMDDPHDNILRVP GLEIQALYGYRACNIYSFKGMREIKGMNIEVIHVQTEFGVGIFGRIVGETLNIPVVYTYH TMWADYSHYVNPVNSGAIDGLIKKAITRISKFYGDKSTELIVPSLKTKEALELYGLNKDV HIIPTGLELEKFDT >gi|223714115|gb|ACDT01000100.1| GENE 4 2731 - 4122 1153 463 aa, chain + ## 
HITS:1 COG:lin2700 KEGG:ns NR:ns ## COG: lin2700 COG0438 # Protein_GI_number: 16801761 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Listeria innocua # 2 204 193 388 427 122 38.0 2e-27 MIEQIKDKYGIKDQFIVTFLGRIAKEKSIEVLIDAMKEVVKENDNVLCLIVGGGPQLEEL KELVKDDHISNYVIFTGPKPSNEVPSYYHLSNVFVSASITETQGLTYIEAMASGIPAVAR YDKNLEDVIDDGVNGYFFKETSELVEILLKLINHDYSKMAEAAYLSAMKFSSEVFYEKVI AVYQQAINSKHYSYTVKSIYPVKRGINEVVFLFDESEIIVEVNDKLIKAYQLEPNKVIDK ELFDVLKDFEQVSRAYNKALKLLTVKDYTYNQMKKKLMDSGDYDDTQLDATLELLQEKNL INDEAYTLNYLKRCTRLGIGLNKAIYNLRSYGIDSQIIDRCLEEIDDDDDEYNAATTLIE SIYSRNTSFSYKAMVKKIRDKLYIKGFTSETIERALSDFDFEFDDQQEQEALEKEFSRQK KKYSKRYQGTHLKEKIIDTLLRKGYNYEHIKELLNREGALDDE >gi|223714115|gb|ACDT01000100.1| GENE 5 4115 - 5059 969 314 aa, chain + ## HITS:1 COG:BH1175 KEGG:ns NR:ns ## COG: BH1175 COG3481 # Protein_GI_number: 15613738 # Func_class: R General function prediction only # Function: Predicted HD-superfamily hydrolase # Organism: Bacillus halodurans # 27 314 25 313 320 280 47.0 3e-75 MNKINELKPGDEGVVIEALINRVVVGKTNGANRSTYLSITLQDATGTIDAKLWNATNEQV EKLVMGCVVQVKGDVIKYNEDRQMKIIKIVVASTEPQEQVKFLKSAPQTGEELVKEIYTF IERINNLKLNQLVKALFNEHVEKLTIYPAASKNHHEYVSGLAYHTCSMLRIADALARLYP SLNRDLLFAGITLHDLGKTVELSGPVVPEYTIEGKLLGHISISQAMIKEMADKMNIEGEE VTLLQHIILSHHGKNEFGSPILPQIKEAEVIYLIDNMDARINMLDKALETVEPGGFSKRV FALENRAFYKPKMN >gi|223714115|gb|ACDT01000100.1| GENE 6 5697 - 5990 122 97 aa, chain + ## HITS:1 COG:no KEGG:LCRIS_00926 NR:ns ## KEGG: LCRIS_00926 # Name: not_defined # Def: RNA-directed DNA polymerase # Organism: L.crispatus # Pathway: not_defined # 1 62 201 262 265 87 64.0 1e-16 MKLNKACKCRFSEDDIYKCANTRLGWYRRSGMNIVNFIISLKVLSIKKGDRPGLVYHLDY YLKSDINVEPYTRHVRTVQWENKLSNLFLYPINEIEV >gi|223714115|gb|ACDT01000100.1| GENE 7 6220 - 6597 295 125 aa, chain + ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: 
Yersinia pestis # 2 121 39 157 402 110 44.0 4e-25 MLNSEFDEHMGYDKYDQKTDKTNYRNGTSKKTVKTSQGNIDLDIPRDRNSSFDPAIIEKH NRDISDVDNKIINLYARGMSTRDISDTVKDIYGVEVSAAMISKITDKIIPKALQWQNRPL DMYIQ >gi|223714115|gb|ACDT01000100.1| GENE 8 6685 - 7308 347 207 aa, chain + ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 21 206 212 397 402 184 49.0 1e-46 MDIRKYLVFGSAKTKQLNSRLSVLTDLKNRGVKDILIICSDGLSGIKQAIESAFPNTVQQ RCIVHLIRNSCKYLSYKDRKEFCKDLRTVYTESTSDKALEALDKCKGKWGDKYPYAFKPW EENWNEVCSMFNYVPELRKIMYTTNAIESLNSAFRKFTKIRTVFPTDESLFKSLYLAQDK ITSKWNVPYGNWGIIYSSLQIIFEGRG Prediction of potential genes in microbial genomes Time: Thu May 26 10:15:00 2011 Seq name: gi|223714114|gb|ACDT01000101.1| Coprobacillus sp. D7 cont1.101, whole genome shotgun sequence Length of sequence - 19055 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 11, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 405 - 464 11.6 1 1 Tu 1 . + CDS 540 - 809 312 ## COG1254 Acylphosphatases + Prom 841 - 900 8.8 2 2 Tu 1 . + CDS 1127 - 1576 465 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 1783 - 1842 9.3 3 3 Tu 1 . + CDS 1918 - 2532 644 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) + Term 2544 - 2600 4.1 + Prom 2587 - 2646 12.1 4 4 Op 1 . + CDS 2734 - 3507 890 ## COG1606 ATP-utilizing enzymes of the PP-loop superfamily 5 4 Op 2 . + CDS 3497 - 3907 558 ## Cthe_1015 hypothetical protein 6 4 Op 3 8/0.000 + CDS 3965 - 4969 994 ## COG0310 ABC-type Co2+ transport system, permease component 7 4 Op 4 34/0.000 + CDS 4962 - 5729 652 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 8 4 Op 5 . 
+ CDS 5729 - 6445 318 ## PROTEIN SUPPORTED gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 9 4 Op 6 2/0.000 + CDS 6449 - 7189 824 ## COG1691 NCAIR mutase (PurE)-related proteins 10 4 Op 7 . + CDS 7186 - 8388 1371 ## COG1641 Uncharacterized conserved protein + Term 8397 - 8440 7.8 + Prom 8464 - 8523 6.7 11 5 Tu 1 . + CDS 8562 - 9917 1132 ## COG0534 Na+-driven multidrug efflux pump + Term 9924 - 9963 3.5 - Term 10041 - 10093 -0.7 12 6 Tu 1 . - CDS 10185 - 10421 299 ## gi|167755386|ref|ZP_02427513.1| hypothetical protein CLORAM_00900 - Prom 10545 - 10604 8.3 - Term 10591 - 10634 0.4 13 7 Op 1 . - CDS 10637 - 10870 176 ## gi|167755387|ref|ZP_02427514.1| hypothetical protein CLORAM_00901 - Prom 10894 - 10953 4.7 14 7 Op 2 . - CDS 10984 - 11235 310 ## gi|167755387|ref|ZP_02427514.1| hypothetical protein CLORAM_00901 - Prom 11309 - 11368 9.7 + Prom 11484 - 11543 2.7 15 8 Op 1 1/0.000 + CDS 11565 - 12548 973 ## COG1940 Transcriptional regulator/sugar kinase 16 8 Op 2 . + CDS 12559 - 13200 839 ## COG0637 Predicted phosphatase/phosphohexomutase + Term 13205 - 13253 6.1 + Prom 13235 - 13294 7.2 17 9 Tu 1 . + CDS 13328 - 14146 1128 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 14154 - 14223 9.1 + Prom 14188 - 14247 12.8 18 10 Op 1 4/0.000 + CDS 14462 - 15577 1001 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 19 10 Op 2 . + CDS 15552 - 17768 2316 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit + Term 17783 - 17822 1.0 20 11 Op 1 1/0.000 + CDS 17840 - 18502 633 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Term 18509 - 18561 4.4 + Prom 18524 - 18583 5.3 21 11 Op 2 . 
+ CDS 18611 - 19055 538 ## COG1349 Transcriptional regulators of sugar metabolism Predicted protein(s) >gi|223714114|gb|ACDT01000101.1| GENE 1 540 - 809 312 89 aa, chain + ## HITS:1 COG:PAB7421 KEGG:ns NR:ns ## COG: PAB7421 COG1254 # Protein_GI_number: 14521859 # Func_class: C Energy production and conversion # Function: Acylphosphatases # Organism: Pyrococcus abyssi # 1 57 3 59 91 70 54.0 6e-13 MIRRHYLFYGRVQGVGFRFTTYQKAKNLGLTGWVCNLSDGSVEACIQGEEKLIDYLINEL QHDRFIRIDSIKMEEIAVLKHETSFGMKN >gi|223714114|gb|ACDT01000101.1| GENE 2 1127 - 1576 465 149 aa, chain + ## HITS:1 COG:STM4287 KEGG:ns NR:ns ## COG: STM4287 COG0454 # Protein_GI_number: 16767537 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Salmonella typhimurium LT2 # 2 141 15 154 154 94 32.0 1e-19 MKIRYCQKEDLEAIYKLICELEECNLNHEQFALVYENKLNDEKSHYLVAVSDDMIVGFVS VNIDYQLHHENKVATIEELVIDNSYRGNGIGRLLVDFSTKLALEHQCEVIELTSNFKRIE AHQFYQKNDFIKSSYKLIKVLKDSDQVGI >gi|223714114|gb|ACDT01000101.1| GENE 3 1918 - 2532 644 204 aa, chain + ## HITS:1 COG:BMEI1831 KEGG:ns NR:ns ## COG: BMEI1831 COG0744 # Protein_GI_number: 17988114 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Brucella melitensis # 56 203 99 246 730 141 50.0 8e-34 MYMKKVISRLIIILFLAALTFLGYYGYRGYVMYQDTMEEKSLTVRVEELRSKDNYVKLEQ ISSIYQELVIESEDKRFYQHGPVDYIGLARAMVTNVVTMSFKEGGSTITQQLAKNLCLSF EKSLDRKIAEVFIAQKLEDDYTKDEILEMYLNITYLGEGNYGIKEASNYYYNIEPIELNE EQAKVLVKTLKAPSVYNPSKLAEN >gi|223714114|gb|ACDT01000101.1| GENE 4 2734 - 3507 890 257 aa, chain + ## HITS:1 COG:CAC0775 KEGG:ns NR:ns ## COG: CAC0775 COG1606 # Protein_GI_number: 15894062 # Func_class: R General function prediction only # Function: ATP-utilizing enzymes of the PP-loop superfamily # Organism: Clostridium acetobutylicum # 5 256 5 262 271 175 41.0 7e-44 MSKLDKLKELLKSYQKVAIAYSGGCDSNFLYQVALQTLSKENVLAVLCVGKMMSKEDIDS 
ARMMAGKGQFIEVPLDVFTIEQFTNNRKDRCYHCKKQVMAKVIEAAQSHDFKITLDGKNA DDEKVYRPGMKACEELGIVSPLALVGLTKAEIRQYSKELQIETYDKPANACLASRFPYDT HLTKEKLEVVDQAEALLHQKGIYYARVRVHDKLARIEVERNNFNLVNDEELILAIKELGF DYVTLDLEGITSGSYDR >gi|223714114|gb|ACDT01000101.1| GENE 5 3497 - 3907 558 136 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1015 NR:ns ## KEGG: Cthe_1015 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 6 136 1 132 140 117 50.0 2e-25 MIGSEMEVVVIDGQGGGIGRSVIEALKKEVSGTFIIAVGTNSTATNNMKKGGADAVATGE NAIIYNAKNAQIIIGPIGFVFANSMYGEVSPAMAAAISCSEAQKYFIPVSKCTGHVLGVE PKSIQEYIRDLVNILK >gi|223714114|gb|ACDT01000101.1| GENE 6 3965 - 4969 994 334 aa, chain + ## HITS:1 COG:alr3947 KEGG:ns NR:ns ## COG: alr3947 COG0310 # Protein_GI_number: 17231439 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Nostoc sp. PCC 7120 # 6 330 1 304 306 110 31.0 5e-24 MNSVLMHMADGLISVNIGVIFMIISFIMIGFSIKKINQEHDDRKIPLMGVMGAFIFAAQM INFTIPGTGSSGHIGGGILLCLILGQYPAFLSLCSVLIIQCLFFGDGGLLALGCNIFNMG ILPCFVAYPLIVKPLLKKNITSLRVLLASILGVVVGLEMGAFAVVLQTLASNITELPFTS FVVLMLPIHLAIGLVEGLITSTVVLYVLKDKREIVDHALNDKHFGTLNIRRQVAVFAVLT ILVGGVLSLYASSNPDGLEWSIEGVTGTDEIEASGSTKDTLAKVQSSTSFLPDYNFSGSD SKLGVSVSGIIGGAATLILIGGIGYIVVKKKKYE >gi|223714114|gb|ACDT01000101.1| GENE 7 4962 - 5729 652 255 aa, chain + ## HITS:1 COG:MA4019 KEGG:ns NR:ns ## COG: MA4019 COG0619 # Protein_GI_number: 20092813 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Methanosarcina acetivorans str.C2A # 14 221 7 233 268 65 26.0 9e-11 MNKLKGSLLKFGYIEKLANGDSLIHRLNPLLKIILALMYIIFVISTSCLDFFELPLYSII IGIMISLSKVSFMDLFKRSLIGLPISLCIGFSNLLFSTTIINFYGVNISTGVLLFITIII KNILCLMAVFLLMATTKFDSITCELVHLKIPSIFVLQLVMIYRYIFVLVEEALTMIQAYQ LKNPQSKGIAFKDMGSFVGSLLVRSFERSNEVYNAMKCRGFDVKQAYLNYVDFEIENYFL LMMAVGVLMMVKVVF >gi|223714114|gb|ACDT01000101.1| GENE 8 5729 - 6445 318 238 aa, 
chain + ## PROTEIN SUPPORTED ## NR: gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 [Haemophilus influenzae PittAA] # 1 204 1 205 205 127 38 7e-29 MIEINNLNVLYNNSIRALDNVSIKIEQQKCIAIVGENGSGKSTLLSSLLGLVETQGEIKI NGVMLSKETLIQIREVAGLVFQNPDHMIFLPTVRDNLAFGLINQKLDKIEIDKRIAEFAK LFKIEDLLDRMANHLSGGQKRMVGLASVVVMKPEILLLDEPSAFLDPKSRRIVINMLKEL PQQIIFATHDLDMVLDLCDEIILLNNGKIVSSGNPCEILANQELLEANGLELPQRYQR >gi|223714114|gb|ACDT01000101.1| GENE 9 6449 - 7189 824 246 aa, chain + ## HITS:1 COG:CAC0776 KEGG:ns NR:ns ## COG: CAC0776 COG1691 # Protein_GI_number: 15894063 # Func_class: R General function prediction only # Function: NCAIR mutase (PurE)-related proteins # Organism: Clostridium acetobutylicum # 2 242 5 248 248 228 53.0 1e-59 MKVKEILEMVENHEISVDEAAILIDNPIDYATIDYNRKRRTGTPEIIYGSGKTKEQIAGI IKNMLEHDQIDILATRVDATKAAYLKKLYPNFNYDKEAKTFILKQSETIQNKGMIVVVCA GTSDIPIAREAVLTAEFLGNEVNLISDVGVAGIHRLFNKMDVIKRANVIIVVAGMEGALA SVVGGLVDKPVIAVPTSIGYGANFNGLSALLSMLNSCASGVSVVNIDNGFGAGYMAHTIN CLGGKR >gi|223714114|gb|ACDT01000101.1| GENE 10 7186 - 8388 1371 400 aa, chain + ## HITS:1 COG:CAC0774 KEGG:ns NR:ns ## COG: CAC0774 COG1641 # Protein_GI_number: 15894061 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 399 1 414 420 293 42.0 6e-79 MKVLYFDCSSGISGNMTLGALMELIDEPDYLQKELAKLNIDGYHLHVTKTKKNGITGTYV DVHLEEHPHHHDHEHHHHEHHHHEHHHHVHRNLFDVNKIIDDSGINERAKALAKRIFLRV ALAESKVHNEELANVHFHEVGAIDSIVDIIGTAILITKIAPDKIYSSIVNDGHGFIECAH GTISVPVPATSEIFAASNVVARQIDIDTELVTPTGAAIIAELSSSFQPMPPMNVQKIGWG CGTKELNIPNVLKVSLGTIDKTNDEIIVMETNIDDCSSEILAYTAEKLFENGALDVFFTP IYMKKNRPVYRLSTACKEDKLELLQNIIFRETTTIGIRYRREERKILARKAIELDTPYGK IAAKEVTNNDETYVYPEYESIKRLAKEHDLAVKEIYKLIK >gi|223714114|gb|ACDT01000101.1| GENE 11 8562 - 9917 1132 451 aa, chain + ## HITS:1 COG:CAC3444 KEGG:ns NR:ns ## COG: CAC3444 COG0534 # Protein_GI_number: 15896685 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 450 8 
461 462 374 44.0 1e-103 MGKYRQQIKSICFLAWPAVVQEAMNVVVTYVDTAMVGALGAGASAAVGLTSTVGWLVSSI AVAFGIGILSVCAQAVGANNDEKVKRVGQQALFLTLIVGIALTIMCVLIAPYLPTWLNGD KAIRGEAASYFMIISIPLLFRTAALILASALRGVSDMKTPMLINLYMNIINIILNFLLIY PTRELFGIIIPGAGLGVNGAAIATAISFVAGGIMMFLRYYKNTLFDLKNSGFHFYKKEFK ECLNIGIPVVLERSVICLGHITFASLIAKLGVVRFAAHTIAIQAEQAFYIPGYGFQTAAA TLVGNAVGQKDEHKVKEVTYLISGITMFLMIICGIALFIFAEQLMGIFTPDHEVITLGAR VLRIVSISEPLYGILVILEGTFNGMGDTKAPFVFSLFTMWGIRVTGSWLMINVFHQSIEA VWIMMVFDNIARCLLLSRRFLKNGWKYRLNS >gi|223714114|gb|ACDT01000101.1| GENE 12 10185 - 10421 299 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755386|ref|ZP_02427513.1| ## NR: gi|167755386|ref|ZP_02427513.1| hypothetical protein CLORAM_00900 [Clostridium ramosum DSM 1402] # 1 78 1 78 78 152 100.0 6e-36 MLEKGTVVYFLMNGYVMSGKVFDISGNNDNYEFSIEGYAGCAGPHIISSSQIHLTVFLSQ EEAEKYKDNPQMYLPAYC >gi|223714114|gb|ACDT01000101.1| GENE 13 10637 - 10870 176 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755387|ref|ZP_02427514.1| ## NR: gi|167755387|ref|ZP_02427514.1| hypothetical protein CLORAM_00901 [Clostridium ramosum DSM 1402] # 1 77 123 199 199 145 100.0 8e-34 MLYDFLNQLNLANSFYQHIFVNCVDHLYYYRCQEQIIIKVDYSFQELSAMNQLAIDFEKH CHKERRKIELPKSNNDA >gi|223714114|gb|ACDT01000101.1| GENE 14 10984 - 11235 310 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755387|ref|ZP_02427514.1| ## NR: gi|167755387|ref|ZP_02427514.1| hypothetical protein CLORAM_00901 [Clostridium ramosum DSM 1402] # 1 81 1 81 199 153 98.0 4e-36 MKETYSKNYFEKYAALTLTKVLDVSADNIIQADRPDLRIPKRNFGIEVTQALTPAEAIAD IKKPLYSTLKLNPFDHDEKNLNS >gi|223714114|gb|ACDT01000101.1| GENE 15 11565 - 12548 973 327 aa, chain + ## HITS:1 COG:CAC3673 KEGG:ns NR:ns ## COG: CAC3673 COG1940 # Protein_GI_number: 15896905 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Clostridium acetobutylicum # 2 322 45 376 385 110 25.0 3e-24 MELEQADLIKKEDINESTGGRKPLAISIKEQSRIAIGVSILKEGITFVALDLYAQLIRKQ 
TVALVYDNSESYFKQYGQALNEFIDCLQIKKENFLGITIAIQGIVSQDGKKVIYGPILNN QGLTLEKFEKSSDLPVKMIHDSHAAGLAELNFSEDNNDAIYLSLNHNLGSAVFMAGKVFQ GNNYSSTFEHMTLVENGRQCYCGKKGCFEAYCNEDALLKNHTNMTLEEFYKQLRQKDQTA MQSWNDYFHYLALAIHNLTMVLDAPVIIGGKIASLFENEDLDLLTKLVSDLDPLKFTVPD IKIGQCLKETAAIGAALTDIKKFVAEI >gi|223714114|gb|ACDT01000101.1| GENE 16 12559 - 13200 839 213 aa, chain + ## HITS:1 COG:VCA0102 KEGG:ns NR:ns ## COG: VCA0102 COG0637 # Protein_GI_number: 15600873 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Vibrio cholerae # 3 212 7 215 219 121 33.0 9e-28 MKKAVIFDMDGLMIDSERVTYEEYCRKLEQLGYSFDEAVYRLCLGKNKQGICQVFYDHFG TAFPMTEVWDDVHIRLDERLAKEVPLKAGLLELLKYLKANNYKTIVATSSARVRVDVILK NAQIEQYFDDTICGDEVIHGKPNPEIFLTACKKLNVAPSEALVLEDSESGILAAYDGKID VLCIPDMKYPEADFASKATKIISSLKDVIEYLK >gi|223714114|gb|ACDT01000101.1| GENE 17 13328 - 14146 1128 272 aa, chain + ## HITS:1 COG:VC1364 KEGG:ns NR:ns ## COG: VC1364 COG0561 # Protein_GI_number: 15641376 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Vibrio cholerae # 8 272 3 267 273 103 29.0 3e-22 MNKLTNIKAVMCDVDGTLLNQDGIVSPATVEAIKKIREKGILFGLSTGRDVHSVKTLLTV WGISGLVDAIVGTGGAEIYDYQLNIDKSSYPLDGELIKDIIKHHEGMDLNFAIPYEGTLY APKDDDYIRDLARDDRVPYKVVDFDVFLNEPRAKLIIVCAPDYMDKVIEKSKTFSNPNYK SASLKTASILYEYMDPRVSKTHGLQEVMAMHKIEMTDLCTFGDADNDYDMTLNAGVGVVM ANGSEKTKSVADFITDDNNHDGIANFINKYIL >gi|223714114|gb|ACDT01000101.1| GENE 18 14462 - 15577 1001 371 aa, chain + ## HITS:1 COG:TM0015 KEGG:ns NR:ns ## COG: TM0015 COG1014 # Protein_GI_number: 15642790 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Thermotoga maritima # 14 197 7 188 192 91 32.0 2e-18 MNKIELSKVNEEGYYEIRLESVGGLGANLCGKLLGELGAAYLGFNASSFSSYGSEKRGSP VKAFIRYCDEDKEITVNTPVTRPHLLGLFHEAMIGKLQVTAGVDENSCIVINSGEQPDVI RDKLKLCAGTIYCIDAIYLAMKIKCKVNMIMLGALAKASGFIPLETVLELIEATLGKKYP 
QLLQKNKEGIIEGYNGVVSKTFNDDGKYPFIPYREIENDWGYQNAPLGGVNPCFGSTISN DLSASREGYIPLFLQEKCIQCAQCYITCPDMVFQFEPGIYKGRKMMVNKGLDYHHCKGCL RCVEICPTQALVSGVEREHSNLKWFIRNKDLIVEHMDYEDVGANSWITSDSFLEVEKITE VKNDGKTKDGI >gi|223714114|gb|ACDT01000101.1| GENE 19 15552 - 17768 2316 738 aa, chain + ## HITS:1 COG:SSO1207 KEGG:ns NR:ns ## COG: SSO1207 COG0674 # Protein_GI_number: 15898059 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Sulfolobus solfataricus # 3 361 5 360 385 233 38.0 8e-61 MEKQKMVFESGNELAAYAAKQINYHVMGYYPITPSTQIAENLDLMKAEGKHEIALIAAEG EHSAAGICYGASVAGARVFNATSANGLLYALEQFPVQSGTRFPMVMNVACRTVSGPLSIK GDHSDIMYMLNTGWIILFAHSPQAVYDMNICALKIAEKVKLPVVIAFDGFFTSHQKNKCQ VFEDDQVVQNFVGKYLPEYQILDFEHPVTVGSYMNEPDLMNNKYQLHLAMEEAREVIPAI FKEYETISNRAYQYVESYQNEDCDVLMFVLGSGFSSAKRAVDELRKDDKKVGVVTINVLR PFPSKELIKHFKVPKTVIVCDRQDSYGANGGNMSLEIKAAMQEAGITTRVITRIYGLGGR DFYKDDAKALLLMGFQKDVKLFDYLHIYPGKIEQPITQFFKPIKETPDDFKCVYNEEKQI MEVKPFTLNQIAKMPQRLSGGHGACPGCGIPVNVNLLLSAISGNVVLLFQTGCGMVVTTS YPKTSFKVPYVHNLFQNGAATLSGIVEAFNQKVKRHEYPEGEITFIMVSGDGGMDIGMGS ALGAALRNHHMIIFEYDNGGYMNTGYQLSYSTPLGAKSSTSHLGQDQFGKSFFNKDMPAI MAATNIPYIATVAESNPIDFVRKAIKAKAYAKEFGLAYLRTLSACPLNWGDQPNLEKSVI EAGVNSCYFPLFEIEQGKYNLTYDPQKAKKKIPLIDWFAMMQRTKHLSDPKYTKIVQAAQ DEVDRRFEILKKRADDQI >gi|223714114|gb|ACDT01000101.1| GENE 20 17840 - 18502 633 220 aa, chain + ## HITS:1 COG:TM1295 KEGG:ns NR:ns ## COG: TM1295 COG0491 # Protein_GI_number: 15644050 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Thermotoga maritima # 21 173 27 185 218 82 34.0 9e-16 MEVTEYAKDLYILDDGQVREFLIIGRNEAILIDTGFPETKIINEVQKLTSFPLKVLLTHG DGDHCGGLDSFQECYVHQGDKGLIKTTIKINCLKEKDLIEIGDYCFEVIEIPGHIYGSIA FLDIQKKLLIVGDIVQLGPIYMFGEHRNLDLYIHSLEKLNTYQNQIETILPSHNRYPLTK EYIGHCLSDARLLKAHKLLGVKHQFLPCREYRGKKISFYY >gi|223714114|gb|ACDT01000101.1| GENE 21 18611 - 19055 538 148 aa, chain + ## HITS:1 COG:L0152 
KEGG:ns NR:ns ## COG: L0152 COG1349 # Protein_GI_number: 15672771 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Lactococcus lactis # 5 148 1 144 247 163 60.0 1e-40 MVKMLARHTKILELITENKKMEVTKLSQLLSVSQVTIRKDLIQLENSGLIVREHGFATLN SSDDINNRLAYHYDIKQRIAKLAVESIEDGETVMIESGSCCALVALEIAQTKKDVTIITN SAFIADYIRKVAKIRIILLGGEYQNESQ Prediction of potential genes in microbial genomes Time: Thu May 26 10:15:19 2011 Seq name: gi|223714113|gb|ACDT01000102.1| Coprobacillus sp. D7 cont1.102, whole genome shotgun sequence Length of sequence - 3813 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 258 332 ## COG1349 Transcriptional regulators of sugar metabolism 2 2 Op 1 . - CDS 284 - 622 342 ## COG1180 Pyruvate-formate lyase-activating enzyme 3 2 Op 2 . - CDS 582 - 1184 476 ## COG1180 Pyruvate-formate lyase-activating enzyme - Prom 1208 - 1267 7.2 + Prom 1223 - 1282 13.1 4 3 Tu 1 . 
+ CDS 1396 - 3807 2390 ## COG1882 Pyruvate-formate lyase Predicted protein(s) >gi|223714113|gb|ACDT01000102.1| GENE 1 1 - 258 332 85 aa, chain + ## HITS:1 COG:SPy2054 KEGG:ns NR:ns ## COG: SPy2054 COG1349 # Protein_GI_number: 15675824 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Streptococcus pyogenes M1 GAS # 1 78 165 242 248 60 39.0 6e-10 IGTDGFTETSGFTGNDYMRSETVRDMAKQADKVIIVTDSFKFLQKGVVTLIPTAEIACIV TDTHIPIECENYLLDHGIEVKKVEN >gi|223714113|gb|ACDT01000102.1| GENE 2 284 - 622 342 112 aa, chain - ## HITS:1 COG:SPy2055 KEGG:ns NR:ns ## COG: SPy2055 COG1180 # Protein_GI_number: 15675825 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pyogenes M1 GAS # 10 112 155 257 257 124 55.0 4e-29 MIVSSYFLGTGVYNDLIIKNLKWALQNKIEVLPRIPVIPDFNDSLNDAKGLARLLKNIGV LKVQLLPFHQFGEKKYEMLNLEYSLKNKKALHKEDLKEYQNIFINEDIKCFF >gi|223714113|gb|ACDT01000102.1| GENE 3 582 - 1184 476 200 aa, chain - ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 4 192 5 192 302 164 40.0 1e-40 MENKGIIFNIQKFSIHDGPGIRTTIFFKGCPLRCKWCSNPESQLKAVQILWDHNKCTHCM TCINNCPTGAIKLIADKIIIDQNKCNGCLRCVNKCPQIALKNEGEYKTIDEIVTTCLQDK DFYEESNGGITISGGEGMSQPAFLFHLVNELKKHQLHLAIETTGYIEHELFTKLAPLFDL LLFDVKHYDREQLFFRYRCL >gi|223714113|gb|ACDT01000102.1| GENE 4 1396 - 3807 2390 803 aa, chain + ## HITS:1 COG:SPy2049 KEGG:ns NR:ns ## COG: SPy2049 COG1882 # Protein_GI_number: 15675819 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Streptococcus pyogenes M1 GAS # 6 803 8 805 805 1266 76.0 0 MKNIEHFGDLTPRMNDFREKVLDKKPYICAERALLATESYRLYQNQPPVMKRALMLKNIL 
EKMSIYIEDETLIVGNQAASNKDAPIFPEYTLEFVIDELDKFEKRDGDVFYITEETKAAL RSIAPFWENNNLRAKGEALLPEEVNVFMETGFFGMEGKLNSGDAHLAVDYEQLLKIGLVG YEKRVRQLKAELDLCVPENIDKYVFYKAVLIVIEAVKTYADRFSLLAQEMAENAQSHRKD ELLEISNICSKVPYEPASSFKEAIQSVWFIQLILQIESNGHSLSYGRFDQYMYPYLKADL EKGVITDEEAVELLTNLWIKTLTINKVRSQAHTFSSAGSPMYQNVTIGGQTPDKKDAVNK LSFLVLKSVAQTRLPQPNLTVRYYNGLNKEFLDECIEVMKLGTGMPAFNNDEIIIPSFID LGVKEEDAYNYSAIGCVETAVPGKWGYRCTGMSYQNFPRILLAVMNDGVDVTSGKRFVEG YGYFRDMKSFEELQDAWDKSIREITRLSVIVENAVDLASERDVPDILCSTLTQDCIGRGK TIKEGGAVYDFISGLQIGIANMADSLAAIKKLVFEEKKITPQQLWDALQDDFMSEENQKI QSMLINEAPKYGNDDDYVDQLVVEAYDSYINEIKKYPSTRYQRGPVGGIRYAGTSSISAN VGQGYGTMATPDGRKAHTPLAEGCSPAHAMDKNGPTAVFKTVSKLPTHEITGGVLLNQKV TPQMLATEENKEKLEMIIKTFFNRLHGYHVQYNVVSRETLIDAQKNPEKHRDLIVRVAGY SAFFNVLSKATQDDIIERTEQTL Prediction of potential genes in microbial genomes Time: Thu May 26 10:15:26 2011 Seq name: gi|223714112|gb|ACDT01000103.1| Coprobacillus sp. D7 cont1.103, whole genome shotgun sequence Length of sequence - 20606 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 11, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 128 - 187 8.6 1 1 Op 1 . + CDS 217 - 678 682 ## COG1225 Peroxiredoxin 2 1 Op 2 1/0.400 + CDS 686 - 1069 444 ## COG0607 Rhodanese-related sulfurtransferase + Prom 1117 - 1176 5.8 3 2 Tu 1 . + CDS 1197 - 1988 726 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 2021 - 2073 2.2 4 3 Tu 1 . - CDS 2027 - 2647 569 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) - Prom 2667 - 2726 6.9 + Prom 3162 - 3221 5.6 5 4 Op 1 . + CDS 3274 - 3717 577 ## COG3279 Response regulator of the LytR/AlgR family 6 4 Op 2 . + CDS 3721 - 4164 452 ## Bsph_2810 hypothetical protein 7 4 Op 3 . + CDS 4238 - 4690 455 ## EUBREC_0648 hypothetical protein 8 4 Op 4 . + CDS 4681 - 5067 392 ## EF1337 hypothetical protein 9 4 Op 5 . + CDS 5104 - 5649 729 ## Sterm_3839 cyclase family protein 10 4 Op 6 . 
+ CDS 5651 - 5860 154 ## CLD_3554 MarR family transcriptional regulator + Term 6110 - 6153 6.1 + Prom 6007 - 6066 6.6 11 5 Op 1 . + CDS 6235 - 6852 584 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 12 5 Op 2 . + CDS 6856 - 7866 862 ## COG1940 Transcriptional regulator/sugar kinase 13 5 Op 3 . + CDS 7918 - 8268 318 ## gi|167755411|ref|ZP_02427538.1| hypothetical protein CLORAM_00925 + Term 8434 - 8473 1.0 + Prom 8311 - 8370 6.5 14 6 Tu 1 . + CDS 8486 - 9961 1320 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 15 7 Tu 1 . - CDS 10102 - 11850 1635 ## COG2199 FOG: GGDEF domain - Prom 11871 - 11930 8.6 + Prom 12142 - 12201 11.0 16 8 Op 1 . + CDS 12330 - 13301 1141 ## gi|237733644|ref|ZP_04564125.1| predicted protein + Term 13302 - 13332 1.3 + Prom 13373 - 13432 3.3 17 8 Op 2 1/0.400 + CDS 13452 - 14798 1676 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases + Term 14839 - 14885 8.5 + Prom 14851 - 14910 10.2 18 9 Tu 1 . + CDS 14930 - 15721 701 ## COG1737 Transcriptional regulators 19 10 Tu 1 . - CDS 15824 - 16840 893 ## COG1609 Transcriptional regulators - Prom 16864 - 16923 7.9 20 11 Op 1 . + CDS 17129 - 18490 1555 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 21 11 Op 2 . + CDS 18514 - 19164 764 ## COG0036 Pentose-5-phosphate-3-epimerase 22 11 Op 3 . + CDS 19166 - 20101 1067 ## lmo0737 hypothetical protein 23 11 Op 4 . 
+ CDS 20111 - 20602 511 ## COG2190 Phosphotransferase system IIA components Predicted protein(s) >gi|223714112|gb|ACDT01000103.1| GENE 1 217 - 678 682 153 aa, chain + ## HITS:1 COG:BH0948 KEGG:ns NR:ns ## COG: BH0948 COG1225 # Protein_GI_number: 15613511 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Bacillus halodurans # 1 151 1 150 154 147 49.0 9e-36 MLEVGSIAPDFTLLDQNGDSVSLSNFVGKKIILYFYSKDNTPGCTKQACGYAQNYPRFKE KDTIIIGISKDTVVSHKKFEEKYQLPFILLANPELDVLQAYDVWKEKNMYGRMVMGVVRT TYLINEEGIIEKAFTKVKAADDAERMLREIESE >gi|223714112|gb|ACDT01000103.1| GENE 2 686 - 1069 444 127 aa, chain + ## HITS:1 COG:VC2654 KEGG:ns NR:ns ## COG: VC2654 COG0607 # Protein_GI_number: 15642649 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Vibrio cholerae # 3 122 12 133 144 65 34.0 2e-11 MQLIIGILVIIIIVLLVYRNHRQETVYQRISASEAQKIMDEESNIIIIDVRTVDEYKTGH IKNAICIPNELISNKEIAELPDKSQEILVYCRSGSRSRQAANKLIKLGYENVIDFGGIID WDGEVVK >gi|223714112|gb|ACDT01000103.1| GENE 3 1197 - 1988 726 263 aa, chain + ## HITS:1 COG:CAC0522 KEGG:ns NR:ns ## COG: CAC0522 COG0561 # Protein_GI_number: 15893812 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 1 263 1 264 265 173 38.0 2e-43 MIKLIATDMDGTFLDSDKRFDPEFIDIFYKLKEKGIKFVIASGNQYYRLYQKFLPLSEQM YFIAENGCYIAEGATELYCNTITHDNVELIKTALAPYDNLFMIMCGRKGAYVLSRDKHFE SLVRLHYCNYQFVDSFDHIDDEIMKISINDPEEQIEKYLQALEPHLPVAVKVVTAGNMWM DIHNRDINKGVGMRFLQAIYEIEPDECMAFGDQMNDYELLQQVKYGYAMDNAVLPIKEIA YGTTRSNDEQGVLIKIKEVLDLA >gi|223714112|gb|ACDT01000103.1| GENE 4 2027 - 2647 569 206 aa, chain - ## HITS:1 COG:lin1340 KEGG:ns NR:ns ## COG: lin1340 COG1974 # Protein_GI_number: 16800408 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Listeria innocua # 4 197 12 198 204 81 
30.0 9e-16 MTGERIKKLRKEKGLTQEQLGNLLGVKKSAIAKYENNRVENLKKDTIQKLSEIFDVPASY FLGIDESNQPIITDSITIPLYSDISCGTGLFVDDNVDEYISLPETLLSPSKEYFCQYADG DSMINENINQGDLIVFEKSNQIRNGEVGCFTIDDNVATCKKFYRDDNNHCIILQPANPEY TPIVINQESQAGFRVIGKLTLVINKR >gi|223714112|gb|ACDT01000103.1| GENE 5 3274 - 3717 577 147 aa, chain + ## HITS:1 COG:SP1915 KEGG:ns NR:ns ## COG: SP1915 COG3279 # Protein_GI_number: 15901739 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Streptococcus pneumoniae TIGR4 # 1 145 2 145 149 60 31.0 1e-09 MKVELKIDPNCQEPWAVINVAKLTPSLQAAITILEKECEEMIFTAQNNGKIYIIDPVKIE LIRTEGRELALYDLKKERYVLNKPLYEVQEQLGKDFVRISKFAIINIKRINHVEASFNGT MEIVMKNGLEEIITRSYRKLFKERLGV >gi|223714112|gb|ACDT01000103.1| GENE 6 3721 - 4164 452 147 aa, chain + ## HITS:1 COG:no KEGG:Bsph_2810 NR:ns ## KEGG: Bsph_2810 # Name: not_defined # Def: hypothetical protein # Organism: L.sphaericus # Pathway: not_defined # 17 138 2 123 133 99 40.0 5e-20 MRFKDLVSYFFGGIAWGCTFLVVINLIGYTVIGSTFLEPLMENFVMHAVGAMVVGVCCGS TSYVYKIESLSLRKQIAFHFTIGLGGYLLIAYKLGWMPISNIGYVITFILMAIIIFTSIW TGFYFYNRNEAKKYNAKIKEIEKENDN >gi|223714112|gb|ACDT01000103.1| GENE 7 4238 - 4690 455 150 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0648 NR:ns ## KEGG: EUBREC_0648 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 145 4 152 156 67 32.0 1e-10 MLNFRYADINDLDFLVELRIRDLRLFSNQEILPQTINKIRDFYKTGIINGTCFTLLGFDN ELLAASGTLYLYSLMPSNENPSGIMGQLTNIWVDEKYRHQGIASRIVSDLLTKGQGKCGM ICLNSSKDAIALYQKLGFKTKERYMIKQWD >gi|223714112|gb|ACDT01000103.1| GENE 8 4681 - 5067 392 128 aa, chain + ## HITS:1 COG:no KEGG:EF1337 NR:ns ## KEGG: EF1337 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 9 123 12 126 143 96 42.0 3e-19 MGLIIKKPDDAIIVNKKDGTHVAYYLFDEYEVHLNSLPPQTEQVWHFHQAIEEVILIIEG EIEIHYLDNDKRMRQRVQPNDLILVKDSIHTLINRSTKECKFVVFRLVLDGKNKRDLIKN DKKVVNML >gi|223714112|gb|ACDT01000103.1| GENE 9 5104 - 5649 729 181 aa, chain + ## 
HITS:1 COG:no KEGG:Sterm_3839 NR:ns ## KEGG: Sterm_3839 # Name: not_defined # Def: cyclase family protein # Organism: S.termitidis # Pathway: not_defined # 1 180 1 183 183 220 57.0 2e-56 MFIDLTLEVTPKLTQDAKGNESKANFGHLGTHFDVMNKEFPLSYLERNGVIFDVRGIDEI EVIDLEKIRSGDFVIFYTGFLEKAGYGTKEYFQAHPQLSNNLIEKLLDKKVSIIGIDCAG IRRGKEHTPMDQYCADHNAFVVENLDNLESLLMQNEFTINTYPMKFSEMTGLPCRVVAKV D >gi|223714112|gb|ACDT01000103.1| GENE 10 5651 - 5860 154 69 aa, chain + ## HITS:1 COG:no KEGG:CLD_3554 NR:ns ## KEGG: CLD_3554 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: C.botulinum_B1 # Pathway: not_defined # 4 68 10 74 156 67 46.0 1e-10 MQVILDLFTELYEKQDILSKLTQSSSLQGYGYSVIHCLDAIGILDGPNITKIAEHLKMTR GELVKLSKN >gi|223714112|gb|ACDT01000103.1| GENE 11 6235 - 6852 584 205 aa, chain + ## HITS:1 COG:ECs2444 KEGG:ns NR:ns ## COG: ECs2444 COG1440 # Protein_GI_number: 15831698 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Escherichia coli O157:H7 # 105 192 6 93 106 58 35.0 7e-09 MDEFLQGINASIYLKWILMQNNHYSLNIKVDDNDNNTIIISNEIVIGKIIYYGKGIFEEE LNDRQTKEKIFYLHFQLTHFNHAVELFKEMINCALEATNRPQIRVLLCCSGGLTTTLFAS KMQELAKLENLSYEIFATGYSRLFEIANDYDIILIAPQVAYMLPQAKRRLPDKEVVTIPT RIFATNDYSGALKIIYDWYQNKEEK >gi|223714112|gb|ACDT01000103.1| GENE 12 6856 - 7866 862 336 aa, chain + ## HITS:1 COG:lin0520 KEGG:ns NR:ns ## COG: lin0520 COG1940 # Protein_GI_number: 16799595 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 7 328 5 329 334 125 25.0 9e-29 MNISNTKNLRQINRDKVMDIFMEQETCTKNQLALATGLSLATITNILKELLASNEVLKDG ELESTGGRKSVVYSLNSDYSRMLTISVNKEYRRVHLIYRVYNLADEIVFSDEVIKKEVKV NDIYDIINIVLAKFDNIKVIGISMPGIINDGRVTSTGLTNFNDIDLYSLLTNKYSQVIIL GNDVNTAVIGFYLAEKNYENVCLLFQPGEGYGGVGTVINGQLVTGKTNCAGEIQYLPLSQ DDQLKLLQSPHGTIELLSKIVICLTAIVNPEIVAISCTNLERASDLDETLKQMVPAKYLP KIKKVTNLSEYILIGLQLICKDSLKTNLVIKKQSII 
>gi|223714112|gb|ACDT01000103.1| GENE 13 7918 - 8268 318 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755411|ref|ZP_02427538.1| ## NR: gi|167755411|ref|ZP_02427538.1| hypothetical protein CLORAM_00925 [Clostridium ramosum DSM 1402] # 1 116 1 116 116 201 100.0 2e-50 MSEKDIQLLVMCLKEYIKNIDGERFKQLFLGYAEYMARNYGQDCLIDEYDLMMEQVEKFV DEQHLDINLLPPYKRDVFYRFIKNHYSKQDILSLYYFNDEIGALECFIDMIDSKYQ >gi|223714112|gb|ACDT01000103.1| GENE 14 8486 - 9961 1320 491 aa, chain + ## HITS:1 COG:CAC0528 KEGG:ns NR:ns ## COG: CAC0528 COG0488 # Protein_GI_number: 15893818 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 491 1 492 492 494 53.0 1e-139 MGLLDVKNLSFKYDNQIEKVFDNVSFQIDTNWKLGLIGRNGKGKTTLLELLMSKYEYRGY ITPNYIFDYFPYHIENENIRTLKIVENILGEFELWELQRELSLLDVENEVLEQNFATLSK GQQTKVLLAILFCKPHDFILIDEPTNHLDYHGRLLLSKYLKRKKSFILVSHDRDFLDRCI DHVLAINRNSIEVIKGNFSLWYDLKMNNDKREIEKNEQLKKDIGRLKKAARQNEDWSNKV EATKSVKVAGLKPDKGYVGHKAAKMMQRSKNIERRQSKAIEEKSDLLKNIETIEKLKIFP LEYSKQQLCYFNHVTISYDKHIIVENLSFDINQGDRLNLAGKNGSGKTSIIKIIMKESNN YQGKVNIGNNLKIAYLSQDDSHLKGKLDDYANSLDIDLSLFKAILRKLDFSRELFTQEIS TYSKGQKKKVMLAGVLCQEAHLFILDEPLNYLDIFTRMQVQELLLIFKPTVLFVEHDKYF CDQIKTKTLNL >gi|223714112|gb|ACDT01000103.1| GENE 15 10102 - 11850 1635 582 aa, chain - ## HITS:1 COG:all1219_3 KEGG:ns NR:ns ## COG: all1219_3 COG2199 # Protein_GI_number: 17228714 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Nostoc sp. 
PCC 7120 # 391 571 3 189 199 109 33.0 1e-23 MIKDDFLTLTAQKFINTYFTSRNFDQVKNMFAPEGSWLGASNNTMANNLKQFDQYFKEGC NEGYVPEITFQDCYVIYNDDNIGIVYCNYHLHIESKGALFDMDQRVTIVLKQFNTTLKII HVHASTPNDAISQEEFYTNEIAKLGYKELQQALKKKNEQIEMIIRTTAGGMKGSLDDEFY TYFYVNEELCKMLGYTYDEFMEMSHGTAVGAVYPPDLPAALADCKRCFAIGPEYKTEYRI RKKDGSLLWVMDSGRKVTDENGQVVINSIITDISELKEMVNRLKIDQERYEIVSQLSDDI IFEYDVENNSLVYQQLQADTPVKNILTNFLTVDLKNTHLYHGDIERFKKDLKIILSNQVS NELYKLEYRFAIAPQSYTWYRLTFRRIFDNDNKLTKVVGKVVDISSELRLKHQSITDPLT GTYNRLYITSAIQEYCHILKNNLSYACILIDIDYFKKINDTYGHIIGDRFLIEIVKIIKC FFRASDLIGRIGGDEFLIFIKDIYDKEIIREKANALIKKIHEYVTANNYPKEISISMGIH IDNRPEISFTELYNKVDIALYNAKHNGRDRYVYYHEGMTYPK >gi|223714112|gb|ACDT01000103.1| GENE 16 12330 - 13301 1141 323 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733644|ref|ZP_04564125.1| ## NR: gi|237733644|ref|ZP_04564125.1| predicted protein [Mollicutes bacterium D7] # 1 323 1 323 323 577 100.0 1e-163 MKRLSKILLVFLMVLGLTACSGGSKENEPAKIVMGKENKFDDLVSYKIETISRPQQITPD VIGSFYTYYKPSKSTNVLIDVTMYMTNLQKKEMKLSSTLKGTFVIDKTDYVASTAMVSED GKTISQGGSLAAKGTNKVHFYAEVKPSLLKNNIEFKLTTVDEENPKEANMSFKLTDIAKN YESKNLNDTIVLDGRGEIALQAVNITKKLEPANPVGLYTYYQVQNDGNSFVVLTTSIKNI SESDITASNIAVAKLVDKDSNEYPANSFYEKDDRSNLASASTTVLTPGQSGMIHFVFEVS DAVANGEKSVRITYNGKVFIVNL >gi|223714112|gb|ACDT01000103.1| GENE 17 13452 - 14798 1676 448 aa, chain + ## HITS:1 COG:CAC3426 KEGG:ns NR:ns ## COG: CAC3426 COG1486 # Protein_GI_number: 15896667 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Clostridium acetobutylicum # 1 447 1 444 445 625 65.0 1e-179 MKKYNVVIVGGGSHRNPDLMAMLASKKDVFPLKRVVLYDNEAERQEIMGQYGEILMREYY PELEEFIYTTDPDVAFKDVDFALMQIRAGRLPMREKDEKIPLAHGCVGQETCGAGGFAYG LRSVPAIIELVKQIRKHSPEAWILNYSNPAAIVAEATKRIFPDDYRILNICDMPIAIMDA YAKLLGCQRKDFEPRYFGLNHFGWFTHLYDKKTGADLMPSVREQLMAGEIFKKGSEDEHV REKSWVDTYNFMSTMVKDFPEYLPNTYLKYYLYPRHVVEESDPNYTRANEVMDGKEKDTY NMMKEVIKLGALKGTKYELDPNRGVHATYIVDLATSIANNTNDIFLIITKNRGCIPNLDS 
EMMVEVACRVGANGVEPLAMDPVDTFYKGLLENQYAYEKLTVDGFLECNKLKLLQALVLN RTVVDTDLAKVILDDLIEANKAFWPEFK >gi|223714112|gb|ACDT01000103.1| GENE 18 14930 - 15721 701 263 aa, chain + ## HITS:1 COG:CAC0531 KEGG:ns NR:ns ## COG: CAC0531 COG1737 # Protein_GI_number: 15893821 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 248 1 246 257 162 37.0 8e-40 MQLNELVNLHFDDLNESDLYIWNYINNHREVCTKITIEDLGKKCNVSRTTILRFSKKIGL EGFAELKYYLRNEQTLADVGLGKTVNIYALFENHAKMIRELQNRDYNTCCRMIDHAKRIF VVSSGTIQRSVAKELSRQFLKLGIIMTLVNGDSEINMLARAITADDLLIIISLTGESESV IRIAKVAKTVGSKTISITRINANSLSLLADEPIYIFFGDFPIVFETAYTSPTMFFALTEL LFANYHNYQQKKLFQQGNQNYAE >gi|223714112|gb|ACDT01000103.1| GENE 19 15824 - 16840 893 338 aa, chain - ## HITS:1 COG:BH3727 KEGG:ns NR:ns ## COG: BH3727 COG1609 # Protein_GI_number: 15616289 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 9 338 3 323 331 186 34.0 5e-47 MDNHKKKLSIKDIAAISGVSIATVSRVLNNKGGYSKETEEKINALAKSYGYVSNMNAKSL RESKSQTIGLIVPDISNDFFATLALHIENYSAKHNYSVFICNSANNVQKEKDYFKSLASK CVDGIICISGLNKLTEDIIYNDIPIVCIDRYPENNKTIPRISSDDIHAAFLATEHLIKKG CRDIVFISSFTADYEKKERYLGYKKALDTYGIPLDKNYILQTQGKEPSQIEAEILITDFL KTQRRIDGIFASSDLAALGALYALKRANLKVPEQVKLIGFDNTLYSRLPTPSISTIERNP KMLAEKGCEVLLNMIQGKDIGSIDTIVPVVLVERESSK >gi|223714112|gb|ACDT01000103.1| GENE 20 17129 - 18490 1555 453 aa, chain + ## HITS:1 COG:BH0296_2 KEGG:ns NR:ns ## COG: BH0296_2 COG1263 # Protein_GI_number: 15612859 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus halodurans # 102 448 1 348 368 268 42.0 2e-71 MDFNKMAAQIIEHIGGKANIASLTHCATRLRFKLKDQGKASKDEVVKIEGVINVVESGGQ FQVIIGNEVAQAFDAIEALTGISGKEIDVVEKDDLKVKGKFIDQVIELVTGIFTPILPAL IGAGMIRALLMFATSILGMSAESGLYIVLNEVYNAVYSFLPIYLAYTSAKKFKCNPYIAV AISLALVSPTIQGLVQGETGMKMFGFDVLFPAQGYGSSVVPIIVTIYFMSKLEKLCTKYI HAVARNVLTPLITLLITVPLMFMIIGPVTSYLQSFIGEGYTWIYNLNPTICGILLGGLWQ 
VLVVFGLHWGIVPLGQINLAMYGRNTINAVTGPSNWAQAGAALGVALKSKNSQIRQNAMS AAVTGFFSITEPAIYGVNLKYKKPFYIAVGCAAVSGGIAGFVNAAALAGGPVGVLSFPLF FGEGFVGFCIAMVFAFVASAVLTYLFGYDRSND >gi|223714112|gb|ACDT01000103.1| GENE 21 18514 - 19164 764 216 aa, chain + ## HITS:1 COG:SA1065 KEGG:ns NR:ns ## COG: SA1065 COG0036 # Protein_GI_number: 15926805 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Staphylococcus aureus N315 # 5 215 4 214 214 164 38.0 1e-40 MKKLLCPSMMCANFGNLENEIRKLEESGIDIFHLDVMDGSFVPNFGMGLQDIEYICKTAN KPCDVHLMVVNPGAYVKKFAQLGVKIIYVHPESDVHITRTLQMIKDAGAQAGIVVNPGTS YESVKETLSLVDYVMVMSVNPGFAGQKYLNFVDDKFKQFCAKKEEFGYKVMIDGACSPER IAMLSSIGVEGFILGTSALFGKEKSYQEIIKELRGL >gi|223714112|gb|ACDT01000103.1| GENE 22 19166 - 20101 1067 311 aa, chain + ## HITS:1 COG:no KEGG:lmo0737 NR:ns ## KEGG: lmo0737 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes # Pathway: not_defined # 3 311 4 310 310 431 67.0 1e-119 MSDVIRLISASGSDFEKMTPMELKESIFKSEGRVIMGQHLLFAGEGLVRGITNSEVMFAF GADMVMLNTMDLDNMENNKGLQGLSYQTLRQRCKRPLGIYLGCPKAGFEDGGKKALYRRE GMLCTPEHVQKCVDMGVDFIVLGGNPGSGTSINDVIACTKWIKEKYGDQLFVFAGKWEDG INEKVLGDPLANRDAKEIIKELIDAGADCIDLPAPGSRAGISVTMIRELVEYIHSYKPGT LAMSFLNSSVEGADPDTVRLIALKMKETGADIHAIGDGGFSGCTSPENIHQLSISIKGRP YTYFRMASVNK >gi|223714112|gb|ACDT01000103.1| GENE 23 20111 - 20602 511 163 aa, chain + ## HITS:1 COG:BH0595_3 KEGG:ns NR:ns ## COG: BH0595_3 COG2190 # Protein_GI_number: 15613158 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Bacillus halodurans # 14 145 2 133 152 124 44.0 9e-29 MGLFNKWKKKVEVEPYRISAPVAGKVIDIKDTNDQVFNSEALGKGVGIIASEKQITAPIG GQIVSFFPTKHAVGIKTDKGVELLIHVGIDTVELNGKHFISMKKQGDTVVRGDILLKVDF DAIREAGYDPTVLMVVTNTAQYAQIKCNLGNKSNNDIIIEIEE Prediction of potential genes in microbial genomes Time: Thu May 26 10:16:07 2011 Seq name: gi|223714111|gb|ACDT01000104.1| Coprobacillus sp. 
D7 cont1.104, whole genome shotgun sequence Length of sequence - 21118 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 11, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 43 - 690 749 ## Tola_1977 hypothetical protein + Prom 692 - 751 10.3 2 2 Op 1 2/0.000 + CDS 870 - 2459 1916 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 3 2 Op 2 . + CDS 2476 - 3804 1628 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases 4 2 Op 3 . + CDS 3820 - 4311 581 ## gi|167755426|ref|ZP_02427553.1| hypothetical protein CLORAM_00940 5 2 Op 4 . + CDS 4324 - 4812 603 ## COG2190 Phosphotransferase system IIA components + Term 4827 - 4882 12.3 + Prom 4825 - 4884 2.7 6 3 Tu 1 . + CDS 4925 - 6259 1326 ## COG0534 Na+-driven multidrug efflux pump + Prom 6283 - 6342 7.6 7 4 Op 1 41/0.000 + CDS 6382 - 6642 460 ## COG0234 Co-chaperonin GroES (HSP10) 8 4 Op 2 . + CDS 6656 - 8272 1416 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 + Term 8273 - 8326 9.4 - Term 8267 - 8308 0.9 9 5 Op 1 . - CDS 8315 - 10219 1988 ## COG0171 NAD synthase - Prom 10240 - 10299 4.1 10 5 Op 2 . - CDS 10301 - 10480 106 ## gi|237733661|ref|ZP_04564142.1| predicted protein 11 5 Op 3 . - CDS 10498 - 11661 1017 ## COG0477 Permeases of the major facilitator superfamily - Prom 11865 - 11924 10.2 + Prom 11704 - 11763 4.0 12 6 Tu 1 . + CDS 11860 - 12618 809 ## COG2116 Formate/nitrite family of transporters + Term 12620 - 12671 9.0 + Prom 13111 - 13170 6.3 13 7 Op 1 . + CDS 13289 - 13828 563 ## STH2124 hypothetical protein 14 7 Op 2 . + CDS 13821 - 14870 977 ## Cbei_2856 hypothetical protein + Term 14871 - 14911 2.7 + Prom 14876 - 14935 5.3 15 8 Op 1 . + CDS 15012 - 15758 829 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases 16 8 Op 2 . 
+ CDS 15815 - 16567 829 ## gi|167755439|ref|ZP_02427566.1| hypothetical protein CLORAM_00953 + Term 16607 - 16643 5.0 - Term 16635 - 16673 4.0 17 9 Tu 1 . - CDS 16705 - 17445 747 ## COG3394 Uncharacterized protein conserved in bacteria - Prom 17476 - 17535 8.9 + Prom 17434 - 17493 15.0 18 10 Op 1 . + CDS 17569 - 18438 851 ## COG0169 Shikimate 5-dehydrogenase 19 10 Op 2 . + CDS 18438 - 19025 359 ## PROTEIN SUPPORTED gi|52842692|ref|YP_096491.1| ribosomal protein Ham1 20 10 Op 3 . + CDS 19026 - 19535 557 ## COG0219 Predicted rRNA methylase (SpoU class) 21 10 Op 4 . + CDS 19525 - 20175 485 ## COG4478 Predicted membrane protein + Term 20205 - 20240 0.2 22 11 Tu 1 . - CDS 20212 - 20868 193 ## COG1787 Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes - Prom 20898 - 20957 9.3 Predicted protein(s) >gi|223714111|gb|ACDT01000104.1| GENE 1 43 - 690 749 215 aa, chain + ## HITS:1 COG:no KEGG:Tola_1977 NR:ns ## KEGG: Tola_1977 # Name: not_defined # Def: hypothetical protein # Organism: T.auensis # Pathway: not_defined # 3 214 4 218 219 174 41.0 3e-42 MSLEIIQTLNPEYTIKSITDEAFRTYGKVIDNNIDEAIEFCIDFVQAAKQDNFYLPSVLE VEQLSSIIELSHRVYGYLEIIAGIVAGDNVELSGIEYHQGSETIIAVTDYILVVGHIWDM QDDTYNSSKCELFYVPKGTIVECYSATLHYTPIAVSKEGFITICLLLKGTGDILEKRKKI LKKKNKWFIAHQDNLEKIASGDYPGLLGRKIIIDH >gi|223714111|gb|ACDT01000104.1| GENE 2 870 - 2459 1916 529 aa, chain + ## HITS:1 COG:BS_glvC_1 KEGG:ns NR:ns ## COG: BS_glvC_1 COG1263 # Protein_GI_number: 16077887 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus subtilis # 1 435 1 436 452 484 53.0 1e-136 MMQKIQRFGGAMFTPVLLFAFSGIVIGVCTVFQSDVIMGSIAASDTTWFKVWYVVKEAAW TIFRQVNLLFVISLPIGLAKKQNARACMESFVLYCTFNYALAALLSNWGSFVGIDYAIEA GSGTGLASVASIKTLDTGMVGALIVAGVAVWLHNRYFDTKLPEWLGVFRGSSFVAMIGFP VMIGLAVIFFFIWPQVQHGIAGVQTFIISSGALGVWVHTFLERILIPTGLHHFIYMPFMY 
DSVAIEGGIKTAWVAAMPEIAKSTASLRTLFPAGGYALYGFSKMFAPLGISAAFYATAKE NKKKEVLGLMIPVTLTAMFAGITEPIEFTFLFIAPVMFAVHSVLAASLATAQYLIGISGD FGSGIISNAALNWIPLGSAHWQQYLLSVVIGLVFSGIWFIVFKFIIEKFDFKTPGREDDD EEVKLVSKAEYKASKEDTSNDPTGNIEVKGGDAGKAAAFLAALGGKDNIESVTNCATRLR VSVKDETLVQPVGVFKKAGAHGLVAKGKAFQVIVGLTVPYVREEFEKLL >gi|223714111|gb|ACDT01000104.1| GENE 3 2476 - 3804 1628 442 aa, chain + ## HITS:1 COG:CAC0533 KEGG:ns NR:ns ## COG: CAC0533 COG1486 # Protein_GI_number: 15893823 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Clostridium acetobutylicum # 3 441 2 440 441 705 74.0 0 MKKQSSICIAGGGSTFTPGIVLMLMENQERFPIRQIKLYDNNAERQNKLAEALGILMKER YPEVAFSWTTDPETAFTDIDFCMAHIRVGLYAMRELDEKIPLKHGVVGQETCGPGGLAYG MRSIGGVLEILDYMEKYSPNAWMLNYSNPAAIVAEATRRLRPNSKIINICDMPIGIERNF ATILGLESRKDMIVRYYGLNHFGWWTSITDKEGNDLMPKIKEHCRQFGYAVDTEDFQHRD QSWIDTYTKVKDVYAVDPDTLPNTYLKYYLYPDYVVEHSNKEYTRANEVMDGREKRVFGA AADIVAKQTAKDCGFSTDTHAAYIVDLACAIAFNTKERMLLIVPNDGAIINFDPTAMVEI PCIVGNEGAEPLKMGEIPQFQKGLMEQQVSVEKLCVEAWIEGSYQKMWQALTLSKTVPSA KVAKEILDDLIEANKGYWPELK >gi|223714111|gb|ACDT01000104.1| GENE 4 3820 - 4311 581 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755426|ref|ZP_02427553.1| ## NR: gi|167755426|ref|ZP_02427553.1| hypothetical protein CLORAM_00940 [Clostridium ramosum DSM 1402] # 1 163 1 163 163 263 100.0 4e-69 MVYGKLQSIFKFFKGMSALDIVKYLFVGIPILILGFLYLFMAQGDIEKMFNEPLIVGQLL LVLAMPFCYLAMRNICQDLKDRKKRDGLLIPLWILIVYQVSSFNMICSFLIIYGTFQEYG KGMFAIKKFTINNSTKAMLVGLIPLGMLYGLVLFVKIRLGILF >gi|223714111|gb|ACDT01000104.1| GENE 5 4324 - 4812 603 162 aa, chain + ## HITS:1 COG:CAC1354 KEGG:ns NR:ns ## COG: CAC1354 COG2190 # Protein_GI_number: 15894633 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Clostridium acetobutylicum # 1 161 1 157 159 123 42.0 1e-28 MLNLFKKKSKSVEVLAPLTGTTLSIEEVPDPVFSEKMMGEGIAIKPTTDTVVAPFKGTVK 
MLMPNSGHAVGLLSEDGLEILIHVGMDTVSLEGEGFEVLTEVEAKVEPGDPLIKFDEAFL KSKEMDTLTMVVVTNPGDFQPEQFLTNKEVKAANDPIMVYKK >gi|223714111|gb|ACDT01000104.1| GENE 6 4925 - 6259 1326 444 aa, chain + ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 4 435 6 437 445 249 36.0 5e-66 MEGKNGLMTHGNIAKQLIVFAIPLLVGNLFQQLYNTVDSIVVGNFIGDQALAAVNSSGPI IDMLVSFFMGLSLGAGVLISNYFGAQDCQGVKEGVHTAMALALVSSLITTIVGVIFTPII LKWVRVDSSVIGQSIIYLRIYFLGVIGLIIYNMISGILRAVGDSKYPLYFLVVSSIVNII LDLVFVVIFKMGIAGVAIATTIAQITSALLSIYILVRSKEMYKLELQQIHFYPDILKKTA KIGMPSAMQNAVVSFSNIIMQSNINVFGAYAMAGTGSYTKIDGFAILPVLSFAMALTTFV GQNKGAQEYERIKKGARIGTLISCGIILTLTIIIVFTTPYLLRIFSSNSQVIEYGRTMML CIAPGYLFLTLSQCICGVLRGVGRTNIPMFVLIGCWCIFRVIWVTVTTNIFHNIVFVNLG WPVSWIFSSLVLGVYYYKANWLYD >gi|223714111|gb|ACDT01000104.1| GENE 7 6382 - 6642 460 86 aa, chain + ## HITS:1 COG:slr2075 KEGG:ns NR:ns ## COG: slr2075 COG0234 # Protein_GI_number: 16330002 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Synechocystis # 2 86 14 106 106 62 39.0 2e-10 MIKPLHDNVILKKDEVENKTSSGIILTTETKKIPSVATVVALGPDCKSEIKENSKVVYKE YSGTNIKIDEVDYIVIEEKDILAYIA >gi|223714111|gb|ACDT01000104.1| GENE 8 6656 - 8272 1416 538 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 526 3 528 547 550 52 1e-156 MAKEVRFSSDARKSMLKGVNTLADAVCITLGPKGRNVVLEKSYGSPLITNDGVSIAKEIE LEDKFENMGAKLVYEVANNTNDVAGDGTTTATILARNMINSGMKAVDKGCNPVLMREGIE YASKEVAKTILKNSRKVETSGDVASVATISAGSKEIGDLIAEAMDKVGRDGIINVDESNG FDNELEISEGMQYDKGYVSPYMVSDREKMQVELENPYVLVTNHKINNLQEILPVLEQVLK TNKPLLLIAEDYENEVISTLVLNKLRGTFNVVATKAPGFGDNQKEMLQDIAALTGAKLYN KDLNMKLEELQLEELGTIKKVIVTKDNTTMISGNEENPELKARIEEIKTRVASSTSDYDK KQFQERLGKLTNGVATIKVGATTESELKEKKLRIEDALNATKAAVAEGIVIGGGAALVEA YKELKPVLKNDNVDVQKGINIVMEALLSPICQIAENAGYNSEDIVDMQKSAAKNQGFDAK 
NGEWVDMFDKGIIDPTKVTRSALLNAASISALFITTEAGVAEIKSETPEAPMMPNQMY >gi|223714111|gb|ACDT01000104.1| GENE 9 8315 - 10219 1988 634 aa, chain - ## HITS:1 COG:CAC1050_2 KEGG:ns NR:ns ## COG: CAC1050_2 COG0171 # Protein_GI_number: 15894337 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 320 627 1 309 310 395 61.0 1e-109 MKDGYIRVAAGSFETSIANVKNNSENICNLINEAYHNDARVLVLPELCLTGYTCEDLFNQ DRLLNEAKQQLQTIITATNNKDLITIVGLPYQHLNSLYNVAAVIHQGALLALVPKTHIPN YQEFYEARRFEQAPKENTLTNFNGQKIPFGTHYVFASTTNSDFKFGVEICEDLWLPDAPS TKLALNGANLILNPSASNEITTKSDYRRLLVSSQSARLVCGYVYCNAGNGESTTDVVFSG HHIISENGTMIKESRGFDSELIYGDLDLKKLSSERRKMTTFKSYHNYETIYFDSTNIDLN TTYYYDPHPFVPSNRDLRAKRCKEVFDIQTRGLMQRLKATGIKKVVIGISGGLDSTLALL VCTMAFKKLNYDTKDIIAITMPCFGTTSRTKNNALGLMEELAVTSIEVDITESVRIQFRD IEQDENIHDVTYENVQARTRTEILMNKANQVGGLVIGTGDLSEVALGWSTYNGDHMSMYA VNVSVPKTLVRYLVDYVASLYHGEKLETILKDILDTPVSPELLPQENDQIVQKTEDIVGP YELHDFFIYHMVRFGDEPRKLYRKTKLAFKDKYDKETIKKWLTKFYWRFFSQQFKRSCIP DGPKVGSVSLSPRGDWRMPSDANVSNWIDEIEKI >gi|223714111|gb|ACDT01000104.1| GENE 10 10301 - 10480 106 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733661|ref|ZP_04564142.1| ## NR: gi|237733661|ref|ZP_04564142.1| predicted protein [Mollicutes bacterium D7] # 1 59 1 59 59 87 100.0 3e-16 MIINIINMVENFDNHKKVDEQNRKIVLQLEAATSLYQMRGFQFTDELDLKNEKVMVLKK >gi|223714111|gb|ACDT01000104.1| GENE 11 10498 - 11661 1017 387 aa, chain - ## HITS:1 COG:BS_ywbF KEGG:ns NR:ns ## COG: BS_ywbF COG0477 # Protein_GI_number: 16080885 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 9 351 9 355 399 90 23.0 5e-18 MFKKGSYIYYALGNGMFYFSWAMFSCIISVYLAGINCSATEISLITSAAALFAMATQPIT GFLADKFKSPKLVAIITGALTIIFGLLFASTKSFIFLFLLNGFTQGCLNGITALTDRLAT ASPYPFGAIRVWGSILYAIAAQVSGIVYDYISPTANFYIFAAGLLLMLFSFYMMHDAKPL 
IVGKAAKVTTKEVLKHLWHNKPFKIFMLIYILFQGPSSAQMVYLPLVIKGLGGTTTIVGT TLLFSTLSEIPAVLFSDRYMKKISYKALMIFACVLSIIRFVWYSTCPAPYLIMSVFFFQG LTTIVFILVAVRIILDLVDEQYVNSAYGISSMLAKGFSALIFQIIGGRVLDIFPGNNGYT IMYLIFASSITIALILCFKFKFTKKVE >gi|223714111|gb|ACDT01000104.1| GENE 12 11860 - 12618 809 252 aa, chain + ## HITS:1 COG:CAC1512 KEGG:ns NR:ns ## COG: CAC1512 COG2116 # Protein_GI_number: 15894790 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Clostridium acetobutylicum # 3 252 6 254 256 132 32.0 9e-31 MSFEMLSDLGIKKYEMCRDDIGRFFARSVVAGLYLGLATILSTTLGTLLFKDNLIASKIA VAGSFGIGLVIIVILGSELFTGNCFTTMIPVYGKKLKFRQIIPMWIVCYFGNFVGIALVC YLFFISGSNHELLSEYVVSCANTKLSFDIIELLVKAVLCNFIVCVGAYVGMKMQDDTAKT IIMMIVVMAFVLPGFEHSIANMGSFTLTIGALGSGANFGLIAIHMIVVTFGNMIGGGVLL GLPLYLMIKPNK >gi|223714111|gb|ACDT01000104.1| GENE 13 13289 - 13828 563 179 aa, chain + ## HITS:1 COG:no KEGG:STH2124 NR:ns ## KEGG: STH2124 # Name: not_defined # Def: hypothetical protein # Organism: S.thermophilum # Pathway: not_defined # 3 168 2 163 179 146 51.0 4e-34 MKNLSVRNIVLAGFFLAVGIVLPFFTMQIPSIGNMLLPMHIPVLICGFVCGWPLGLIVGF ILPLLRSMLFTMPPMYPTALAMAFELAAYGALTGLFYNRLAKKPLNTYIALGIAMIGGRV IWGIISAILYGMAGQPFGFAIFIAGAFTNAIPGIILQLIVIPILIMALENAHLIKYNND >gi|223714111|gb|ACDT01000104.1| GENE 14 13821 - 14870 977 349 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2856 NR:ns ## KEGG: Cbei_2856 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: Pyrimidine metabolism [PATH:cbe00240]; Metabolic pathways [PATH:cbe01100] # 159 346 8 195 202 153 39.0 1e-35 MISALVTTQLEKYPQMKAQDIIKLIYQNEFGGGHMIDDPETSLKRLKDEAKQLKNNSFIE EDIGNDLVRVYLGNAGELELLTLNQLFVYSAKLMNGSIISFINKLEQLKSACQRGTISFD YRQICLEIENYEKLNYPPISHSSTYRELYQPHYRVINKQIYSYFPVILKINQLLQTTDRL NIAIDGKCGAGKSTLGEMIKAIFETNLFKMDDFFLQPFQRTAARLNEPGGNIDYERFKET VIVPLQHQESVKYQRFDCSKMALEVHVELIPYSRFNVIEGTYSMHPYFGMFYDFTIALNV SDDIQAKRILMRNGEMMYEKFKKIWIPLENKYFDQYNIFNVADLQYYSN >gi|223714111|gb|ACDT01000104.1| GENE 15 15012 - 15758 829 248 
aa, chain + ## HITS:1 COG:CAC2787 KEGG:ns NR:ns ## COG: CAC2787 COG0639 # Protein_GI_number: 15896042 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Clostridium acetobutylicum # 6 189 4 173 221 98 36.0 1e-20 MPKNIYVLSDLHGHYNIFIKMLEKINFSDDDVLYILGDCCDRGPDSLKIYLYIQKFDNIH LIKGNHEIMMRDAFKVDDPASSQGRMWAQNGGNKTFHSYHEYLHKKAFNDCDYKVIKAAF YKMMIDYIDRCPSFIELNCNGQDYVLIHAGINPEKGLYEQTEEECAWMREYFFMSKGLDN KIIIFGHTPTCYIHQASGCFDVWYDPVFKDKIGIDGGLGPFDKGQLNCLCLNTQEVFVIK KSELAIQE >gi|223714111|gb|ACDT01000104.1| GENE 16 15815 - 16567 829 250 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755439|ref|ZP_02427566.1| ## NR: gi|167755439|ref|ZP_02427566.1| hypothetical protein CLORAM_00953 [Clostridium ramosum DSM 1402] # 1 250 1 250 250 466 100.0 1e-130 MNLLISRILTYLNGTLLYDSHYRFCKFIVYHYLELEDLSFNEVSEKSQISKEDILRFCAL LGFDDYDTFKAELLRSHMIRLDQIRARMLGVNSEQLIDEMEKSCSNEVMSEYISTICEAI FKAKRVVLIGALYPMSIAVEFQTDLVSFGKPVIQYHNFDKDIQLDENDVTIFVSATGRSM NSFVEIKKELRVDLTTSILITQNKTYALDEYKISDYVIQVPGKFDGINFNYQIMTICDLL RVHYYQQYYL >gi|223714111|gb|ACDT01000104.1| GENE 17 16705 - 17445 747 246 aa, chain - ## HITS:1 COG:lin2456 KEGG:ns NR:ns ## COG: lin2456 COG3394 # Protein_GI_number: 16801518 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 4 245 2 247 248 158 38.0 1e-38 MIKKLIVNADDFGMTEGNSIGILMAHADGILTSTTCMMNMPFAKFALDQAKNYPDLGVGI HLVLTVGRPLVDGAKSYTDKDGNFIRPKDYPDHQPHADPEELYTEWKAQIEKFIEIAGKK PTHIDSHHHVHLLPQHQEVVIKLAREYDLPIRQRDQIIDNYEYVRCNDQMYDDLITYDFM SNSMKVDEETLEYMCHPAYVDQRLYDMTSYCLPRMKELALLRSEEMKNFIKDNNIQLINY SDLKKI >gi|223714111|gb|ACDT01000104.1| GENE 18 17569 - 18438 851 289 aa, chain + ## HITS:1 COG:lin0493 KEGG:ns NR:ns ## COG: lin0493 COG0169 # Protein_GI_number: 16799568 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Listeria innocua # 1 284 5 290 291 272 46.0 6e-73 
MKSQISATTSLYAFIASPAHHSKSPAMHNTAFEQLGLDSVYLAFDIKSEELKDTIAGFKA MKVRGANVSMPHKQNIIPYLDEISTASRLCNAVNTITFKDGKYYGTITDGIGFTRSLEEQ GWLIKDKKITMVGAGGASTAIMVQLALDGVKEIIVYNRTMRTEFQEIINNTIIETGCTIT LKSLSDLESLKKDMHSSYLFINTTGVGMEPMLERSVVPDASYFKPDLKVADIIYQPAVTK MLRLAKEAGCATMNGELMLLYQGVESFKIWTGQEMPINEVKKVLGIEVK >gi|223714111|gb|ACDT01000104.1| GENE 19 18438 - 19025 359 195 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|52842692|ref|YP_096491.1| ribosomal protein Ham1 [Legionella pneumophila subsp. pneumophila str. Philadelphia 1] # 1 193 1 194 194 142 41 1e-33 MKEIIVASTNQGKIKEIKAMLKDIDIEVLSMKDVLEQELEIEETGTTFKENALIKAQTIA NIVNKPVLADDSGLEVDALDKQPGIYSARFLGADTSYNIKNQYIIDALKDKERTARFVCA MALVIPGQEPILIEETMEGLINDKIEGANGFGYDPIFYFPPCQMTSAMMSMEEKNKYSHR AKALKKLYTILKEIL >gi|223714111|gb|ACDT01000104.1| GENE 20 19026 - 19535 557 169 aa, chain + ## HITS:1 COG:BH1023 KEGG:ns NR:ns ## COG: BH1023 COG0219 # Protein_GI_number: 15613586 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Bacillus halodurans # 3 155 4 157 157 169 50.0 2e-42 MNHIVLVHPAIPQNTGNIMRTCVATNTSLHIIKPMSFELDDKKMKRAGLDYVKDLNLYVY ENYEEFEQKNPGEYHFFTRYSHLCYTDQDFSDDKKEHYLFFGHEHDGIPKEILTKHLDRC LRIPMSDKVRSLNLSNCAAICIYEVLRQQSYPNLSKVEVQKGEDFLYEL >gi|223714111|gb|ACDT01000104.1| GENE 21 19525 - 20175 485 216 aa, chain + ## HITS:1 COG:CAC2705 KEGG:ns NR:ns ## COG: CAC2705 COG4478 # Protein_GI_number: 15895962 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 8 207 10 222 224 102 31.0 4e-22 MSYNKKDIFLATMLMLFIISLAIVFTVFFKQLYYLDINYLGIDLTSGMSVETIKKNYDVL IAYQSIFYRGTLNLPDFVMSTNGRIHFEEVKTIFEAIQVIMVVSGLISLPLVIRRFKEKE YRFLKLTGLITIIVPAMLGFVVALDFESAFITFHQIVFRNNYWVFDYRSDPVINILPETF FMHCFIMIVIIVITLAGLCLFYYYHKQKQIINDTIE >gi|223714111|gb|ACDT01000104.1| GENE 22 20212 - 20868 193 218 aa, chain - ## HITS:1 COG:sll1429 KEGG:ns NR:ns ## COG: sll1429 COG1787 # Protein_GI_number: 16330594 # Func_class: V Defense mechanisms # Function: 
Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes # Organism: Synechocystis # 41 158 186 302 304 85 36.0 9e-17 MKLIILWLINTIFVLPLKIITILFKLIYKIIKNIYFNNLDLEYINTMDGHDFEYFTKTLL EKNGFKQVNVSQSSSDYGIDVFAYKNKYTYAIQCKRYSKTVGIKAVQEAKSGCEYYQCDI PVVFTNNTFSSAAINLAKNTNVELWDQDTLYHYLKKSKILTKSLPLYYPLFTFITTAILS YYYLQHQDYLLIPLLINMFLFISIIVKILKDKKRLLLF

Prediction of potential genes in microbial genomes
Time: Thu May 26 10:16:57 2011
Seq name: gi|223714110|gb|ACDT01000105.1| Coprobacillus sp. D7 cont1.105, whole genome shotgun sequence
Length of sequence - 54118 bp
Number of predicted genes - 55, with homology - 55
Number of transcription units - 24, operones - 14 average op.length - 3.2

    N     Tu/Op    Conserved   S         Start      End   Score
                   pairs(N/Pv)
    1     1 Op  1  .         - CDS       3 -    555     752  ## COG1592 Rubrerythrin
    2     1 Op  2  .         - CDS     580 -    969     397  ## COG0735 Fe2+/Zn2+ uptake regulation proteins
                             - Prom    998 -   1057    11.0
    3     2 Tu  1  .         - CDS    1087 -   1944    1007  ## COG0561 Predicted hydrolases of the HAD superfamily
                             - Prom   1996 -   2055     5.7
    4     3 Tu  1  .         - CDS    2065 -   2979     295  ## BCG9842_B1209 hypothetical protein
                             - Prom   3176 -   3235     7.4
                             + Prom   2948 -   3007     7.1
    5     4 Op  1  .         + CDS    3037 -   4263     517  ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2
    6     4 Op  2  .         + CDS    4277 -   4468     292  ## gi|167755451|ref|ZP_02427578.1| hypothetical protein CLORAM_00965
    7     4 Op  3  .         + CDS    4528 -   5514    1193  ## COG0078 Ornithine carbamoyltransferase
                             + Term   5518 -   5547     0.5
    8     5 Tu  1  .         - CDS    5523 -   6167     479  ## gi|167755453|ref|ZP_02427580.1| hypothetical protein CLORAM_00967
                             - Prom   6393 -   6452     7.8
    9     6 Op  1  .         + CDS    6307 -   7053     705  ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases
   10     6 Op  2  .         + CDS    7057 -   7881     875  ## Sterm_2094 transcriptional regulator, RpiR family
                             + Term   8079 -   8127    -0.1
                             - Term   7876 -   7905     1.4
   11     7 Op  1  .         - CDS    7906 -   8634     540  ## EUBREC_2038 hypothetical protein
                             - Prom   8665 -   8724     6.0
   12     7 Op  2  .         - CDS    8747 -   9271     124  ## CTC01417 hypothetical protein
                             - Prom   9426 -   9485    16.6
                             + Prom   9369 -   9428     7.3
   13     8 Tu  1  .         + CDS    9503 -  10516    1018  ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase)
                             + Term  10565 -  10609     7.2
                             - Term  10552 -  10597     4.2
   14     9 Op  1  40/0.000  - CDS   10601 -  11686     703  ## COG0642 Signal transduction histidine kinase
   15     9 Op  2  .         - CDS   11679 -  12371     740  ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
                             - Prom  12529 -  12588    10.7
                             + Prom  12519 -  12578    12.5
   16    10 Tu  1  .         + CDS   12607 -  13854     571  ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase
                             + Term  13856 -  13887     1.1
   17    11 Tu  1  .         + CDS   14390 -  14653     187  ## gi|237733691|ref|ZP_04564172.1| predicted protein
                             + Term  14717 -  14752     5.3
   18    12 Op  1  16/0.000  - CDS   14738 -  16117    1589  ## COG0165 Argininosuccinate lyase
   19    12 Op  2  .         - CDS   16160 -  17371    1586  ## COG0137 Argininosuccinate synthase
                             - Prom  17554 -  17613     8.1
                             + Prom  17448 -  17507     7.4
   20    13 Op  1  2/0.000   + CDS   17531 -  18667    1244  ## COG0628 Predicted permease
                             + Term  18681 -  18740     5.2
                             + Prom  18784 -  18843    10.9
   21    13 Op  2  .         + CDS   18887 -  19600     264  ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
   22    13 Op  3  .         + CDS   19593 -  21203    1443  ## CD2104 putative ABC transporter, permease protein
                             + Term  21204 -  21236     3.3
   23    14 Op  1  11/0.000  + CDS   21545 -  22582    1157  ## COG0002 Acetylglutamate semialdehyde dehydrogenase
   24    14 Op  2  10/0.000  + CDS   22596 -  23819    1531  ## COG1364 N-acetylglutamate synthase (N-acetylornithine aminotransferase)
   25    14 Op  3  .         + CDS   23847 -  24710     928  ## COG0548 Acetylglutamate kinase
                             + Term  24802 -  24841     1.8
   26    15 Op  1  .         - CDS   24724 -  25542     631  ## COG1408 Predicted phosphohydrolases
   27    15 Op  2  .         - CDS   25539 -  27329    1562  ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily
                             - Term  27375 -  27407     2.6
   28    15 Op  3  .         - CDS   27408 -  28274    1008  ## COG1307 Uncharacterized protein conserved in bacteria
                             - Prom  28306 -  28365    10.0
                             + Prom  28295 -  28354    10.1
   29    16 Op  1  8/0.000   + CDS   28458 -  28823     341  ## COG1725 Predicted transcriptional regulators
   30    16 Op  2  .         + CDS   28823 -  29725     276  ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16
   31    16 Op  3  .         + CDS   29700 -  31088    1183  ## gi|167755477|ref|ZP_02427604.1| hypothetical protein CLORAM_00991
   32    16 Op  4  .         + CDS   31142 -  31588     372  ## gi|167755477|ref|ZP_02427604.1| hypothetical protein CLORAM_00991
   33    16 Op  5  .         + CDS   31591 -  33147    1311  ## gi|167755478|ref|ZP_02427605.1| hypothetical protein CLORAM_00992
   34    16 Op  6  4/0.000   + CDS   33194 -  33499     466  ## COG0526 Thiol-disulfide isomerase and thioredoxins
                             + Term  33509 -  33542     3.1
                             + Prom  33531 -  33590     6.2
   35    16 Op  7  .         + CDS   33619 -  34038     445  ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
   36    16 Op  8  .         + CDS   34039 -  34479     534  ## gi|167755481|ref|ZP_02427608.1| hypothetical protein CLORAM_00995
   37    16 Op  9  .         + CDS   34519 -  35082     695  ## gi|237733710|ref|ZP_04564191.1| predicted protein
                             + Term  35133 -  35182     7.1
                             - Term  35121 -  35169     6.9
   38    17 Tu  1  .         - CDS   35173 -  35508     361  ## Clos_2092 hypothetical protein
                             - Prom  35534 -  35593     7.6
                             + Prom  35539 -  35598     9.0
   39    18 Op  1  .         + CDS   35623 -  36804    1452  ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs
   40    18 Op  2  3/0.000   + CDS   36797 -  37402     589  ## COG1011 Predicted hydrolase (HAD superfamily)
                             + Prom  37415 -  37474     6.4
   41    18 Op  3  .         + CDS   37508 -  39013    1609  ## COG1376 Uncharacterized protein conserved in bacteria
                             + Prom  39100 -  39159     8.2
   42    19 Op  1  .         + CDS   39180 -  40199    1408  ## COG1363 Cellulase M and related proteins
                             + Prom  40201 -  40260     2.3
   43    19 Op  2  .         + CDS   40285 -  41043     759  ## COG0561 Predicted hydrolases of the HAD superfamily
   44    19 Op  3  .         + CDS   41078 -  41791     677  ## COG2188 Transcriptional regulators
   45    19 Op  4  .         + CDS   41791 -  43599    1264  ## COG0249 Mismatch repair ATPase (MutS family)
                             + Prom  43604 -  43663     9.7
   46    20 Op  1  9/0.000   + CDS   43701 -  45593    2089  ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific
   47    20 Op  2  .         + CDS   45610 -  47247    1532  ## COG0366 Glycosidases
                             + Prom  47347 -  47406     7.7
   48    21 Tu  1  .         + CDS   47430 -  48890    1606  ## COG4868 Uncharacterized protein conserved in bacteria
                             + Prom  49231 -  49290     3.2
   49    22 Tu  1  .         + CDS   49310 -  50137     839  ## COG1396 Predicted transcriptional regulators
                             + Term  50290 -  50332     0.0
                             + Prom  50198 -  50257     5.1
   50    23 Tu  1  .         + CDS   50378 -  50800     557  ## gi|237733723|ref|ZP_04564204.1| predicted protein
                             + Term  50834 -  50884     5.1
                             + Prom  50987 -  51046     6.4
   51    24 Op  1  7/0.000   + CDS   51070 -  51804     715  ## COG1989 Type II secretory pathway, prepilin signal peptidase PulO and related peptidases
   52    24 Op  2  .         + CDS   51818 -  53011     959  ## COG1459 Type II secretory pathway, component PulF
   53    24 Op  3  .         + CDS   53021 -  53200     199  ## gi|167755498|ref|ZP_02427625.1| hypothetical protein CLORAM_01012
   54    24 Op  4  .         + CDS   53247 -  53918     595  ## gi|167755498|ref|ZP_02427625.1| hypothetical protein CLORAM_01012
   55    24 Op  5  .         + CDS   53927 -  54116     115  ## gi|167755499|ref|ZP_02427626.1| hypothetical protein CLORAM_01013

Predicted protein(s)
>gi|223714110|gb|ACDT01000105.1| GENE 1 3 - 555 752 184 aa, chain - ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 2 184 3 186 195 202 58.0 3e-52 MTLKGTKTANNLMHAFAGESQARNRYTYYASIAKKEGFVQIQNIFLETANQEKEHAKRLM KLMNKDLAGETLYTDGNFPVLLGTTAENLKAAAAGENEEYTDMYPSFAKTAREEGFEDIA TVLDSIAIAEKHHEERYLKLLAALEAGNTFKRETPVVWKCNNCGFIFEGLEAPERCPACD HPQA
>gi|223714110|gb|ACDT01000105.1| GENE 2 580 - 969 397 129 aa, chain - ## HITS:1 COG:CAC2634 KEGG:ns NR:ns ## COG: CAC2634 COG0735 # Protein_GI_number: 15895892 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Clostridium acetobutylicum # 3 127 12 137 138 82 35.0 2e-16 MPKTRYSKQRELIYENLKARCDHPTAEAIYSDLKKDNPGLSLGTVYRNLNFLADAKMIRK LDVGQTIVHFDGDISDHYHFICNECNQVFDITYHDENISNVVQEHTKHHIEKTDIVFSGI CEECLKKKN
>gi|223714110|gb|ACDT01000105.1| GENE 3 1087 - 1944 1007 285 aa, chain - ## HITS:1 COG:CAP0070 KEGG:ns NR:ns ## COG: CAP0070 COG0561 # Protein_GI_number: 15004774 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 3 284 2 282 283 382 67.0 1e-106 MESKFKIIIMDVDGTLTNSQKIVTPKTKAALMKAQELGIKLILASGRPTSGLIGLGHELQ MDKYHGLFVCYNGSKVVDCQTKEELFNQPLSIEEGKAVLEHMKKFDRVRPMIDKGEYMFV NNVYDNYINFNGAPFNVIEYESRGGNYKLCEIDDLAAFADFEINKILTTSDPEYLQEHYQ EMMEPFKDSLSCMFTGPFYFEFTAQGIDKAKALDTVLKPMGYSQDEMIAFGDGHNDASMV KYAGTGVAMANAVQDLKDISQYITLSNDEDGIAEALYKYIPELKD
>gi|223714110|gb|ACDT01000105.1| GENE 4 2065 - 2979 295 304 aa, chain - ## HITS:1 COG:no KEGG:BCG9842_B1209 NR:ns ## KEGG: BCG9842_B1209 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_G9842 # Pathway: not_defined # 14 290 8 333 409 72 23.0 3e-11
MKKFNYYFTLLLVAFITLFSFFNMFQIVSLSNNVIKIIINNLLPSLLPFMILISLCLSLG LLNLFSYLIQWIFKPLFKLSPIMSSIYFISFFCGYPTNVKMIKEAYELNYINSQELQHLL SIASFASISFIFVSLHLEYPLIIYFCHLAPSIIKALFYHQHYEFQSLKESIIRLKKPHVS FVEALKQSILSSCYAFIFILGYMLVFQFVGYAIGHIIKDSFINAIIQGVLEFSSGSLQLL QFSHTPLVYSLICFNLSFSGLSVMMQTDNLLDGIDYSFKAYFKARLFHGLCSFTLCLLIL TYLL >gi|223714110|gb|ACDT01000105.1| GENE 5 3037 - 4263 517 408 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. AzwK-3b] # 33 403 44 424 425 203 33 1e-51 MKALIVGVRYKEMNYDLDNSLIELEELCKACNIEVIKRCVQNLDKINPSLYIGTGKVQEI KGQLDNLDLVIFNEELSPLQVKNLTDILDIEVTDRTDLILRIFEVRAKTKEAKLQVEIAK GQYLLPRLAGMKEHLYSQQGGSGFRGSGEKQIELDRRLVSNQIFNAKKQLANIVKQRQNQ RKRRKDNEMKVIALVGYTNSGKSTLMNAFCLNKNKQVLQKDMLFATLETATRHITINQHP CLLTDTVGFIERLPHHLIQAFRSTLEEVVEADLLIHVVDASNPNYEQYINTTNAVLKELG IKDTPMIYAYNKVDLNKYGFIVPLEPYAFISAKERIGLEVLEKSISDILFKDYAIYDLNI PYQDGEVFKYLHQHCLVLEFQYLENSIYMKIEAHPRFMVQYDQYLLKH >gi|223714110|gb|ACDT01000105.1| GENE 6 4277 - 4468 292 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755451|ref|ZP_02427578.1| ## NR: gi|167755451|ref|ZP_02427578.1| hypothetical protein CLORAM_00965 [Clostridium ramosum DSM 1402] # 1 63 1 63 63 85 100.0 1e-15 MDNEKKLFRLDLSIAVEATSAQEAFDILVTDETLKQIRELVIKSKDNIKEMFEKEDSEPA IIN >gi|223714110|gb|ACDT01000105.1| GENE 7 4528 - 5514 1193 328 aa, chain + ## HITS:1 COG:SA1012 KEGG:ns NR:ns ## COG: SA1012 COG0078 # Protein_GI_number: 15926752 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Staphylococcus aureus N315 # 2 328 3 332 333 377 57.0 1e-104 MDLYGRNFLTLKDFSKEEIRYFLDLAHQLKKDKKEGKTSNSLAGKNIVLLFEKTSTRTRC AFEVAALDEGAHVTFLSNSQMGKKESIEDTAKVLGRMYDGIEFRGFEQSTVEALAKYSGI PVWNGLTDVDHPTQTLADFLTIEEHLDKPLNEVKFVFTGDIRNNVCYGLMYGAAKMGMHF VALGPSELQMDPEVVAYCKAEAEKSGGSFMVTDNVDEAVRDADVIYTDIWVSMGEAEELY PQRVKLLSPYKVTQAMMAKTGNDKTLFMHCLPSFHDFETTLAKEKYEQGIDIREVEDEVF RSANSVVFDEAENRMHTIKAVMVATLGK >gi|223714110|gb|ACDT01000105.1| GENE 8 5523 - 6167 479 214 aa, 
chain - ## HITS:1 COG:no KEGG:no NR:gi|167755453|ref|ZP_02427580.1| ## NR: gi|167755453|ref|ZP_02427580.1| hypothetical protein CLORAM_00967 [Clostridium ramosum DSM 1402] # 1 214 1 214 214 417 100.0 1e-115 MVDEFYLDVNLELFVQWLFGNKNKYDHNILCTLKSNEPYGKTITFNYYNISGSIVIWANG IIEEQIFDDEENLLFYLHYKLVSLAQGQKLFYDFYDSLTKHTKCPPLKIILCCSGGLSSS FFANKLAELISLKHLNYEIIPLGFYQLNSSYLDCDAIYLAPQISYLEPQAMNIVKNTVPV HCVTPSVYATYNYRGLLDMITNENISKTKENGTI >gi|223714110|gb|ACDT01000105.1| GENE 9 6307 - 7053 705 248 aa, chain + ## HITS:1 COG:CAC2787 KEGG:ns NR:ns ## COG: CAC2787 COG0639 # Protein_GI_number: 15896042 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Clostridium acetobutylicum # 6 240 4 217 221 103 34.0 2e-22 MTGKTYVISDLHGHFDLLIKLLETISFSENDILYIIGDICDRGPDSLKILFYIQDHNNII LIKGNHEYMWQEALYYGVYYDDFDYPSRALKLWLANGGDTTMRNIREYLQKDKFRYEDYR VIRTAFFKSLYEYLINLPNYLEIEVNQQKYVLVHAGIDVRYPLKEQKLNSLLWIRDDFYH ARGDLTKIYIFGHTPTAMLNKDRSFDIWVDDVHQNKIGIDGGLACGVIGQLNCLCLDDGK IIIVNKEE >gi|223714110|gb|ACDT01000105.1| GENE 10 7057 - 7881 875 274 aa, chain + ## HITS:1 COG:no KEGG:Sterm_2094 NR:ns ## KEGG: Sterm_2094 # Name: not_defined # Def: transcriptional regulator, RpiR family # Organism: S.termitidis # Pathway: not_defined # 19 255 24 256 278 81 25.0 3e-14 MRGAFYNLVNFINTTSINDVYANAARKILQNIYVIPGSTITDVAELCFVSTATISRLCRK LNYESFADFKIDITMNLNYFNRDTLRLQFDHQLPQRQYLHKGKEVFKAHFDNILDNLKET YESVSYEDLEFIVDKIHDANKVCFAGNFFTQSVSMQLQIELSYLGKECSGMYPLEQQILT VEGLDENDVIIVSSIAGGFMTDHPDVMRVISKSPAYKIVISQLDEFVYSDSIDMILKVGT DHHSLIGKFSITYIFEVLEALYHLKYGVKEKREL >gi|223714110|gb|ACDT01000105.1| GENE 11 7906 - 8634 540 242 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2038 NR:ns ## KEGG: EUBREC_2038 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 33 238 47 260 263 167 40.0 3e-40 MKLKKLKKIPAFIAFIIIFIFGYLCLQIIPPLLYNDNSGALDKVDCPQYINEFVDKNPQA IELKENYQPNENTDPIILDSPNGIPLYIQWDKRWAYTTYGDEIIGTAGCGPTCLSMVAVG 
LTGKTTYNPRYVAKYAIKNDFLEGSMTRWALMESGCNEFGLIASAVPLSKNEMFKQLDAG HPIIASLRPGDFTTTGHFIVITKTINDKFLVNDPNSKENSHKQWSYSQLAPQIKAMWAYN TI >gi|223714110|gb|ACDT01000105.1| GENE 12 8747 - 9271 124 174 aa, chain - ## HITS:1 COG:no KEGG:CTC01417 NR:ns ## KEGG: CTC01417 # Name: vanZ # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 2 169 11 185 305 92 33.0 5e-18 MEKKNNYLINITIALCSIYYFWMLIRIIILKNGFIDMGYNSNLVLFDFINQYHQSRLTKV LLVNIIGNVSLFIPLSIILRHYFSFLNNYNIAFIGFFTSLSFELIQLGTGWGVFDIDDIF LNTLGCLIGIIIYHFINQHRQNNVSTSLFLLSFGSIGLISVYNYAPLLLTTFII >gi|223714110|gb|ACDT01000105.1| GENE 13 9503 - 10516 1018 337 aa, chain + ## HITS:1 COG:yegX KEGG:ns NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 5 186 72 255 275 73 29.0 4e-13 MVREGIDVSQYQGQIDWELVKNHIDFAILRCGYGQDIPGQDDPTFKRNADECTRLGIPFG VYLYSYATTESEALSEARHVMRLVEGYKMEYPVYLDLEDPRIGQLSNEQIERNSRVWAEE LVNNRYFPGFYASYYWWTEKLTGPLFDRYTKWVARYAEELGAEGFDMWQYTDKGFVEGIN APVDRNYAYRDFPAEIIAGGFNNFEGDETIPPITNYQVGDHVTFDHVFVSSDSTTPLIPY INHGTIRRVVKGARNPYLIGNGLGWVNNDSITGMMQYLSNPTYRGDSFVDALVQIGVDAS FASRRELARKNGIYDYTGSAKQNLELLRLLKEGRLRQ >gi|223714110|gb|ACDT01000105.1| GENE 14 10601 - 11686 703 361 aa, chain - ## HITS:1 COG:BH1809 KEGG:ns NR:ns ## COG: BH1809 COG0642 # Protein_GI_number: 15614372 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 67 360 52 346 351 144 31.0 3e-34 MNNTHFKQSLTANLLKKIGITISVTVVTFVILLLYSYEVYHQYIWQEYDLVYQVAQIIKD NLVTISITLIAFDTIIIFWLRYRESLQYIEKMLEAGKTLVADNNYLISLPYELKEIEDQM NQIKRESLQNKKAIKQAEQQRNDLLLYLAHDLKTPLTSIVGYLDLLMNQPNLTIEEKSNY TKIAYDKAIRLEELIEEFFSIAKYNLSDISLERKAVNLSMMLVQISYEFMPLYHEKNINC VTKIEDNLSLELDINQFERVFDNLIRNAINYSKENSDLIITAKKEVDCIFIQVANFVKHI SEHNLERIFQPFVRLDEARNSKTGGSGLGLAITKKIIELHGGTIEADLTDDLITFTIKLP I >gi|223714110|gb|ACDT01000105.1| GENE 15 11679 - 12371 740 230 aa, chain - ## 
HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 5 230 6 230 233 231 50.0 8e-61 MVYNILVVDDEQSIADLVELYLQNESYQVFKFYHGQEALDSLKKNHYDLAILDVMMPDLD GFSLLKKIREDYFFPVIMLTAKNNDMDKINGLTLGADDYLTKPFNPLELAARVKTQLRRY TRYNTNEPAVNQQIYDFNGLMINNDNHKCTLYDQPLNLTPIEFSILWYLCSKRGSVVTSE ELFEAVWQEKYLDNNNTVMAHVARLREKMNENARKPKFVKTVWGVGYTIE >gi|223714110|gb|ACDT01000105.1| GENE 16 12607 - 13854 571 415 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 74 407 11 345 356 224 38 8e-58 MINVNKVGEFITLKRKEKGLTQQQLANKLSVSFQAVSKWENGGLPNVEILSDLAALLEIT VDELLAGVEKELEGLSYSKAGVDIAYSDAIKREMKQYLKTSDPRVLNGLGPFASLYDISF SDIKHPVLVLKSEEPGSKQKLAMEYGYSESICHDMINHLVNDIVVMGAKPLAVLDTIVCG NAEKETIASLVKGISQACQNNECSLVGGETSIQPQVVEHGVYVLTSSIAGIVAKDKVIDG SKISAGDVILAIASNGLHTNGYSLVRMLMDKYPQIKLERIMNETFIEQIMKPHTPYYQAL KGLFTKVSIHGMAHITGGGIEGNLCRVIPDGLTAKIDLAKINILPIFKYIKHRGNIDDKE MLNTFNCGVGFNVVVSQQDKRQVMEHLSEYYPCYEIGVIENGQDKVEFENKLNWL >gi|223714110|gb|ACDT01000105.1| GENE 17 14390 - 14653 187 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733691|ref|ZP_04564172.1| ## NR: gi|237733691|ref|ZP_04564172.1| predicted protein [Mollicutes bacterium D7] # 1 87 45 131 131 145 100.0 1e-33 MYSVLYEITLLRVFLNYYIDIDVRTDHDVYSFQILNNSQVVKMFDYLQKKQIRLNDRYGL IELYRTKDPVALNKYLDINFKKWAKKR >gi|223714110|gb|ACDT01000105.1| GENE 18 14738 - 16117 1589 459 aa, chain - ## HITS:1 COG:BH3186 KEGG:ns NR:ns ## COG: BH3186 COG0165 # Protein_GI_number: 15615748 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Bacillus halodurans # 3 456 4 457 458 513 54.0 1e-145 MNLWGGRFTKPTDQLVYDFNASISFDQTFYTQDIQGSIAHVKMLAKQGILTSAEGTEIET 
ALNQILEDLNTGKLTIDNSQEDIHSFVESTLIERIGNTGKKLHTGRSRNDQVALDMRLYT RDQVVILNTLIHNLLITINDIMKTHIETIMPGFTHLQKAQPITLAHHLGAYFEMFKRDSS RLEDIYKRMNICPLGSGALAGTTYPLDREYTAELLGFDGPCLNSIDGVSDRDYLIELLSA MATMMMHMSRISEEIIIWNTNEYQFVDLDDTYSTGSSIMPQKKNPDIAELIRGKTGRVYG ALTSLLTTMKGIPLAYNKDMQEDKELVFDALTTTKGCLQLLDGMLKSTTFNKDKMRKSAT GGFTNATDAADYLVNKGVPFRDAHGIIGQLVLYAIDQNKDLDDLTIAEFKAISEVFDDDI YDAINVETCVNKRNTIGAPGITAMKQVIALNEKYLAQFK >gi|223714110|gb|ACDT01000105.1| GENE 19 16160 - 17371 1586 403 aa, chain - ## HITS:1 COG:CAC0973 KEGG:ns NR:ns ## COG: CAC0973 COG0137 # Protein_GI_number: 15894260 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Clostridium acetobutylicum # 1 398 1 395 400 436 52.0 1e-122 MKEKVILAYSGGLDTTAIIPWLKENYDYEVICCCIDVGQAEELNGLEERAKACGATKLYI KNVVDEYVKDYIMPTVQADAVFEYKYLLGTATARPLIAKILVDVARQEGATAICHGATGK GNDQIRFELGIKALAPDLKIIAAWRDPKWNMDSRESEIEYCQAHGINLPFSADSSYSRDR NIWHISHEGLELENPENAPNYDHLLVLSTSPEKAPEEPEHVVIEWEKGFPVKLNGKAKSL TAILEELNEIGGKHGIGIIDIVENRVVGMKSRGVYETPGGTILMEAHRQLEELVLDRETL KYKRLVSVEFAELVYAAKWFSPLREALSAFVDKTQEVVSGTTKFRLYKGNIIKEGTTSPY SLYDEDIASFATGDLYDHHDAEGFITLYGLSTKVRAMKLNNKD >gi|223714110|gb|ACDT01000105.1| GENE 20 17531 - 18667 1244 378 aa, chain + ## HITS:1 COG:SPy1117 KEGG:ns NR:ns ## COG: SPy1117 COG0628 # Protein_GI_number: 15675097 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Streptococcus pyogenes M1 GAS # 4 368 8 372 382 171 34.0 2e-42 MDGVKIDYKKILEIVLIGIGCYWALNNFQIILDIFNSILAVIMPFLLGIMIAFILNVLMI RIEKILSRFILDKKYTSFKRVISIIVSLLIVIGVIGIIITLIIPELTNAIKVIVKSFPET FEQLQVWINQNGNSFPQLETWINSIDLNSIASELSGLFKIGLTGMLGSTVDVISMFFTSI LNLVVGIVFALYILMSKETLKRQSHKLIDAYIPKRISVKLLEVGTLARTTFSNFVIGQTV EAFILGTLCAVGMAVLNLPYAPMVGSLVGITAFVPIVGAFIGGGIGAFMILTVDPMQALI FIIFLVVLQQLEGDLIYPRVVGSSVGLPSIWVLFAVTVGGGLWGITGMLFSVPVLSVVYA LIKGHVNKSVTKPKITNN >gi|223714110|gb|ACDT01000105.1| GENE 21 18887 - 19600 264 237 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S 
ribosomal protein L34 [Vibrio campbellii AND4] # 1 219 1 226 245 106 27 3e-22 MLKIENLTKTYGQKKAVDNLSLEIENGHIYGFIGHNGAGKTTTLKSIAGIMEFDQGNIYI DNKSIKEEPLACKKVMAYIPDNPDLYEYLTGIKYLNFIADVYGVSQAERTDRIKKYGDMF ELTDSLGEPISAYSHGMKQKLAVISALIHEPKLIIMDEPFVGLDPKASHLLKGLMRDLCD RGGAIFFSTHVLEVAEKLCDKIAIIKAGKLVVSGNTQDVIGDDSLEEVFLELEDDNA >gi|223714110|gb|ACDT01000105.1| GENE 22 19593 - 21203 1443 536 aa, chain + ## HITS:1 COG:no KEGG:CD2104 NR:ns ## KEGG: CD2104 # Name: not_defined # Def: putative ABC transporter, permease protein # Organism: C.difficile # Pathway: not_defined # 76 536 77 531 531 162 30.0 4e-38 MLKSLIKIRFQGIFMRSMKGSKNKNAGIGKLALMVLVFAYIAIVFGSMFGYMFNMVLEPF TAIGYEWLYFGIMAVLIVLLCFIGSVFTTQQEIYGAKDNEILLSMPIKTSDILLSRIFVV LIINYLYEALIAIPGGIVYFTQVPFDLLKLLFFLIVIITLPLLALALSCAFGWLMAMLMK RIRNKTIITMVLSLGFLGLYFYLINKLPEYLVALVANGKSIGEAIQNTLFPIYHLALAIS DVNVVSLLLYLVCSVGPFVIVMYLLSKNFISIATSKAPSKKVKLKDSDIKFSSQKGALFK RELKHFTSNPMVMMNAALGIAFTVVLAGAVLIKGPELLDMLGVIPPEYQALVNEFIVPLL CLAVIGTNSMNIISAALISLEGNRLWIIKSLPIKTKDILESKLMLHLALCVPPGLLFSIV GSVVFSLSLVDSLLVIIVPIVFTVFEALLGLLINLWKPKFDWINETVVVKQSAAVMITMF GTMALVVFIGGIYIGYLNEFIGAVTYVYIMLALFLIIDIAFYYLLNTWGIKRFEEL >gi|223714110|gb|ACDT01000105.1| GENE 23 21545 - 22582 1157 345 aa, chain + ## HITS:1 COG:Cj0224 KEGG:ns NR:ns ## COG: Cj0224 COG0002 # Protein_GI_number: 15791596 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Campylobacter jejuni # 2 338 3 336 342 377 54.0 1e-104 MIRVGIVGSTGYGGCELVRFLLTHKEVEIKWVSSRTYTDEKYSSVYSSYFKLLDQQCVDE DITKYLDDVDVVFFATPQGVCAKMLNEEVLSKVKVIDLSADYRIKDVKTYEQWYGIDHAS PQFIDEAVYGLCEINRDKVKQARLIANPGCYPTCSILSVYPLAKADLIDMDTLIIDAKSG VSGAGRGAKVANLYCEVNESIKSYGVASHRHTPEIEEQLGYAANQEVLLNFTPHLIPMNR GILITAYAKLKKGVNEVDVAKAYQCYDDEYFVRVLKSGVLPEVRSVKASNFVDINYKVDP RTNRVIMMGAMDNLVKGAAGQAIQNMNIMFGLDEKEGLMLAPAIL >gi|223714110|gb|ACDT01000105.1| GENE 24 22596 - 23819 1531 407 aa, chain + ## HITS:1 COG:TM1783 KEGG:ns NR:ns ## COG: TM1783 COG1364 # Protein_GI_number: 15644527 # 
Func_class: E Amino acid transport and metabolism # Function: N-acetylglutamate synthase (N-acetylornithine aminotransferase) # Organism: Thermotoga maritima # 11 407 3 397 397 365 49.0 1e-100 MKIIENGTVTSPKGYLGAGEHVGIKKAKKDLALIYSKVDAKAVGTFTQNIVKAAPVLWDK MIVDNYDHVRIIAVNSGVANACTGDEGLKCNEEFAQTVADNFGLEKEEVLICSTGVIGKQ LPMNKIINGVPIVKPLLSESQEAGVIVAESIMTTDTKPKHLAVEIEVGGKTVTLGACCKG SGMIHPNMATMLGFITTDCNISKELLNKALKEVIPDTFNMVSVDRDTSTNDTVLLLANGL AGNQEIVIEDQDYQIFKEALYYVNEYLAKAIAGDGEGATKLLEVQVHNADTVQQAKVIAK SVCTSPLVKTAVYGNDANWGRLLCAMGYSGEQFDPYNVDLSVASEFGELQLVAKGMATDY SEELATKILSSSEVKAIIDIHNGDCKAIAWGCDLTYDYVKINADYRS >gi|223714110|gb|ACDT01000105.1| GENE 25 23847 - 24710 928 287 aa, chain + ## HITS:1 COG:Cj0226 KEGG:ns NR:ns ## COG: Cj0226 COG0548 # Protein_GI_number: 15791598 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Campylobacter jejuni # 8 283 9 278 281 273 53.0 4e-73 MHTVEQNKAQVLVDALSYIQKFHGDIVVIKYGGSAMTNEVIKHSVLKDIAVLKSVGIKPV IVHGGGNDINGWLKKVDIKSEFKNGLRVTDKETLEVAEMVLSGKINKGLVQHMERIGTHA VGLSGKDGNMITVKKAMPKGDDIGYVGEIINVDVTLLETLLEKDYTPIISTVGLDSKYNA FNINADDVATGVARALKASKLVFLTDIEGVLRNPSDPSTRISVINTESAAELFEQGIITG GMIPKLKNCLEAVQEDVKKVHILDGRLEHSLLIEIFTTSGVGTEIKK >gi|223714110|gb|ACDT01000105.1| GENE 26 24724 - 25542 631 272 aa, chain - ## HITS:1 COG:BS_ykuE KEGG:ns NR:ns ## COG: BS_ykuE COG1408 # Protein_GI_number: 16078469 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Bacillus subtilis # 24 271 28 284 287 112 29.0 8e-25 MRKKIRTIKLLLLIIVIIGSGFGYYYSRYIAPDSYSIKKTTITNNSLPDEFKNFEIGFIS DINLQKSSDVTRLKKIVTSLNKENVDMVIFGGDLFSATPFDNDKVIELLSNIKSKYGKFA VLGEKDLASSNDVNAILNEGGFEVLHNEYRPIYFNGSTISLFGLEGTGELSGLINENNSE SYKIVAVHEPDYFNTTSNNEIALQLSGHTMGGYIRLPFIGGLFKKTNGNTYVSGEHTKSK SQLLISNGLGMEDGYEYRLFCPNQINIVTLKK >gi|223714110|gb|ACDT01000105.1| GENE 27 25539 - 27329 1562 596 aa, chain - ## HITS:1 COG:BH2662 KEGG:ns NR:ns ## COG: BH2662 COG0595 # Protein_GI_number: 15615225 # Func_class: R General 
function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Bacillus halodurans # 39 592 2 553 555 619 53.0 1e-177 MQEKPNTNNKRPGKSPQKNHNNKSKSRNNTPKKAVNKSKQTKNSIDTKVFALGGLNEVGK NMYCVEHDNELFIIDAGVKFAEEGLPGIDYVIPDYSYLKRNEKKIRGLMITHGHEDHIGG IPFLLQVVKIPFIYATPLASAMIKRKLDEKRLTNATKIIQIDDEYQIKSKYFNIGFFKTN HSIPESMGIIINSPNGRIVETGDFKFDLSPVGDPADYQKMSFLGETGVTLLMSDSTNSEV PTFSISEKKVANSVQEEFRKTEGRLIVATFASNVHRVQQIVEAAVKFNRKILVYGRSMEN NIQVSRAMGYIKCPDKFFIKSEQAKKLPDNEILILCTGSQGEALAALSRIANGTHKYVKI KPGDTVVFSSNPIPGNAYSVNVVVNKLFRAGAKVLTNTAFNNLHTSGHASQEEQKLMMLL TRPKYFFPVHGEYRMLKIHAELSQDVGIPPENSFVLSNGDTILLNNGEARLGPRIHVEDI YVDGNDISGLSTAVLRDRQMLSEDGMVSILISMDSRSGKLLTRPVIISRGFVYMKDSFHM IREAEKLVETNLSKLLQGKTTFGEIKNTTRDTLSSYFYRKTKRNPMIIPVIMNKKS >gi|223714110|gb|ACDT01000105.1| GENE 28 27408 - 28274 1008 288 aa, chain - ## HITS:1 COG:SA1258 KEGG:ns NR:ns ## COG: SA1258 COG1307 # Protein_GI_number: 15927006 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Staphylococcus aureus N315 # 1 286 1 279 279 125 28.0 7e-29 MNKVAIITDSNSGITQAEGTKLGINVLPMPFTIDGQTFYEDINLSQDEFYKKLMTDAEVF TSQPVVGDVSKLWDEVLKEYDEIVHIPMSSGLSGSCQTALMLAGEYDGRVQVVDSQRVSV TQKWDVLDALELSKAGKSAKEIKEILEANKLNASIYITVNTLDYLRKGGRITPAVAILGG MMKIKPILQIQGEKLDTFSKTRTMSKATKIMMEAIKKDIDERLDPEGKGKNVHVCIAYTY DEQPALELKKELEAIYPDSTIICDPLSLSVSCHIGPHALAIAACKKII >gi|223714110|gb|ACDT01000105.1| GENE 29 28458 - 28823 341 121 aa, chain + ## HITS:1 COG:BS_ytrA KEGG:ns NR:ns ## COG: BS_ytrA COG1725 # Protein_GI_number: 16080098 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 116 2 118 130 90 43.0 9e-19 MQIDNNSSVPIYEQIYKQLLEYITTGLMQADDKLPSVRELSSSIHINPNTVVKAYRLLEE QGFIYSLPGKGSFVSSKQNNQTKLFEQQIKVLTKEIVTKSKYIGLTFEQLCQFMEQTWEG K >gi|223714110|gb|ACDT01000105.1| GENE 30 28823 - 29725 276 300 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma 
proteobacterium HTCC2080] # 2 298 5 296 305 110 29 1e-23 MLLVKELSKKIDHKQILDNITLNIPKGSIYGIIGENGAGKTTLIRHLVGAYQGDTGFVEL EGMRIYENPEAKAKLVYIPDEFFDTFGHNIRDIKELYQGIYPSFNEARYLRLMKMFKHDS LENFSKFSKGMKKQVMFILALAIMPEYLIMDEPFDGLDPHIRKVIWDILIQDVSERNMTI FISSHHLNELDSMCDHIALINNGKIVFEEALDHLKEGYHKLQIVLECGDDMYALEKELKV LSHQMMGRVHTLIVAGEIEAINRTVEKYCPLINETLSLSLEEIFLFTLGGEYDELKEVMG >gi|223714110|gb|ACDT01000105.1| GENE 31 29700 - 31088 1183 462 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755477|ref|ZP_02427604.1| ## NR: gi|167755477|ref|ZP_02427604.1| hypothetical protein CLORAM_00991 [Clostridium ramosum DSM 1402] # 1 461 2 462 630 813 99.0 0 MNSRKSWVNIPLLKNIIKNNLFPAKVSLIIFVGVFILSLISGSNSGFSQICIVVYGISAV VLTTIYPCFIQSYQVNKTKSSMMCSLPLTTRCVWFTNYLAGYLIALVTLLIEGLGLMMFL GLNNAGTEGNIFIRFIMAIFLLLFIYYTLTYLVCSLSGNRLGQVVFSLAAYALPVILLLG LIYISPKLVPSNLDLGINDTYLYLTFPLAAGMQFIYSAYPHIFFHLFLGIVTLICSYYVY KNRENEYIGEPLVFRRIIIVLKAILIIAVTICGFGVILLVSKINITYGFKGQSLLFLIYI LIGMIVAIIVEIIFKGRHVYRNLLIYVPVLAIVFGANFMIANKQYLSGINNSRYEVTADL VAELAGNQEYLSLSLNNEVTKEFLNYLSKHRDDLHFEKYDDEQQDLAMMYTYQYEHGLDD DYYSYSIYYINKKAVIDFFNGPGNKYFDKLLNYTEGLKKEKV >gi|223714110|gb|ACDT01000105.1| GENE 32 31142 - 31588 372 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755477|ref|ZP_02427604.1| ## NR: gi|167755477|ref|ZP_02427604.1| hypothetical protein CLORAM_00991 [Clostridium ramosum DSM 1402] # 1 148 483 630 630 287 99.0 2e-76 MIKLLPNQEIKPKNIFSIKTYNLLNPDSVNYLISSNEAITNFITSDELIERAVFIENCRD VIDDINSDSTKFEEQIKAVVSTQFSHETIDMVYPNGDSTVIAFSNDEVIYQGTYEVVTKE GNNYTYPLEFTLKKGDDGVIVNDIKVGG >gi|223714110|gb|ACDT01000105.1| GENE 33 31591 - 33147 1311 518 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755478|ref|ZP_02427605.1| ## NR: gi|167755478|ref|ZP_02427605.1| hypothetical protein CLORAM_00992 [Clostridium ramosum DSM 1402] # 1 518 1 518 518 889 100.0 0 MMISLLKNDLKQLKNTVVIFSGFLALSLVISLISFHSDVPVGMFYLIILLYSLVVPMVNL SYLFDQTQQTHFSSLPYTKIQSFMIRYLSGLLCIIIPVIIYCIVDVILGQGLVLKNCSSA ILMIWIYYSLGNLAAYLTTSVIMDIILQIVINISPFIIYMSLFSVYQTFVRGIISADFSM 
EVVSIILPAFRLFIGGLSGLSSYYLLLYAGYGLVIFIMAFLASSNRECVNNYHGFTYKII GEVIKLICIISVSWLITSILDIPVQSLKSFIIINIIATFVATFIIQFIQSKRIRYVLCIV QGTLIVLVTIVIFIGSKGYLENYIPHDIKAVEINSNFGYNKINSKLTDQKSIDTVVAIHQ EILKQKSDYGSHNLKITYYKDNGDKVTRGYQISDQEYLMVIKKLDAALLKSWFDNYYYII NKLDTYQYLTYALPGESENEIRTENDILLFKNILQRKLSEFENNPNLLLKVDYDNPGQIT RFKRSNDGSESYQGDSISFDDNDPMSLALQEYDKIKAN >gi|223714110|gb|ACDT01000105.1| GENE 34 33194 - 33499 466 101 aa, chain + ## HITS:1 COG:SSO2232 KEGG:ns NR:ns ## COG: SSO2232 COG0526 # Protein_GI_number: 15899007 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Sulfolobus solfataricus # 1 101 33 134 135 102 51.0 1e-22 MKILNSNEFDNAIASGIVLVDFYADWCGPCKMLSPVIEGLAEKMEQVNFYKLNVDASSDI AGRYGVQAIPNLIIFKDGKAVDQITGFVPENEIVKHLQNLL >gi|223714110|gb|ACDT01000105.1| GENE 35 33619 - 34038 445 139 aa, chain + ## HITS:1 COG:CAC1006 KEGG:ns NR:ns ## COG: CAC1006 COG0494 # Protein_GI_number: 15894293 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Clostridium acetobutylicum # 1 134 1 138 146 131 47.0 5e-31 MKVSFYESVAEHDLKFAVIISRYLKQWVFCKHKQLKTFEIPGGHRENGESTEACARRELY EETGAVTYTLERFCDYGVVEKTGSKESFGTIFYAEIEKMVDLPDSEIEKIFLSKELPECW TYPEIQPYMIDEYKRRRGS >gi|223714110|gb|ACDT01000105.1| GENE 36 34039 - 34479 534 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755481|ref|ZP_02427608.1| ## NR: gi|167755481|ref|ZP_02427608.1| hypothetical protein CLORAM_00995 [Clostridium ramosum DSM 1402] # 1 146 1 146 146 241 100.0 1e-62 MKKIFIVMMIAGILTGCSDSSTKSITVCKGSFDYGDIENTYETKGDKVIRIVNVTINDYR DTDYDDETLVSTLESQLNVYKEIDGLEIKLTAKDQIVRQEFIIDIENGDLKKLNEVGLID LNGDGTVSFEKLTKNAKENKLTCKEK >gi|223714110|gb|ACDT01000105.1| GENE 37 34519 - 35082 695 187 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733710|ref|ZP_04564191.1| ## NR: gi|237733710|ref|ZP_04564191.1| 
predicted protein [Mollicutes bacterium D7] # 1 187 1 187 187 318 100.0 1e-85 MARKPLKLTKNALMLLGIIALLLITFLVLKFGGTKSEQPEKKVTETLSGLVVENQVLKVQ LLDFVSNKDFDDKYQEVSMGIKADEEVLNYKISNRQVFNKVMQLLPPGEGSPLLNNSSEV PTHEAYILVLTGDIVEYKDSEGKSSYQIANARLDYYKQSLLLENDYDSVYIASIDGKKEK NGQDYCI >gi|223714110|gb|ACDT01000105.1| GENE 38 35173 - 35508 361 111 aa, chain - ## HITS:1 COG:no KEGG:Clos_2092 NR:ns ## KEGG: Clos_2092 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 1 108 1 112 117 74 38.0 1e-12 MQALFLVLHKVEKLDDLLAALQKSGINGGTIIESKGMLNTLKSNDNFIIESLRIFLEDPR ETSKTLFFILKNEDVEKARTVIDKTLGGIDKPNTGIMFGIDLTFVAGLNIE >gi|223714110|gb|ACDT01000105.1| GENE 39 35623 - 36804 1452 393 aa, chain + ## HITS:1 COG:Ta1193 KEGG:ns NR:ns ## COG: Ta1193 COG1167 # Protein_GI_number: 16082202 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Thermoplasma acidophilum # 1 393 1 396 402 310 36.0 2e-84 MNFEFSNRISGVKASAIREILKMMNDPEIISFGGGNPASETFPVEQLKRISDDIFNTCPN SILEYGITEGDGSFKKAAIEFLSRHETVVKDYDGVMVTSGSQQIMDFLSKCLCNEGDLVV CENPSFLGALNAFKSNGAKLVGIDMEDDGINLEQLEAVLSGVQKPKFIYLIPNFQNPTGI TTSLEKRQAIYTLAKKYQVLILEDNPYGDLRFRNDAIPSIKSLDDEGLVVYAASFSKIIA PGMRVAVMIGHQDLLAKCTVAKQTNDVHTNAWAQSVMARFLNETDMEVHLKKLQAVYEQK CNLMLEEMKKQFSPDCKWTIPDGGMFIWVTLPEGIDMPEFVKRAVEHKVAVVPGNAFYDD DNKPCQSFRMNFSTPTNEQIIKGVAILGALMHD >gi|223714110|gb|ACDT01000105.1| GENE 40 36797 - 37402 589 201 aa, chain + ## HITS:1 COG:CAC3581 KEGG:ns NR:ns ## COG: CAC3581 COG1011 # Protein_GI_number: 15896815 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Clostridium acetobutylicum # 1 170 1 174 201 109 41.0 3e-24 MIKNIVFDMGNVILRWDPHYIASKLSNDSLEQDVIERELFASKQWQKLDQGLLTVEEALK ELKTDNPQLIRHALLHWYDYFEPFAEMVALVKELKEQGYHIYLLSNCSLQFDDYYQNIEA 
FKYFDDFYISARYQLLKPDLEIFKHFLSKFNLKAEECIFIDDIKANVEGAKIAKMHGYQH NGDIKKLRLYLKKQHDLPINY >gi|223714110|gb|ACDT01000105.1| GENE 41 37508 - 39013 1609 501 aa, chain + ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 31 501 7 466 466 268 33.0 2e-71 MDKQEENMSNTNDTIESKQKSEMTKKVRSGEKTGVQKKIIIIGILVSLLVIIYLIGSFYF NKKFLSGTYINDINVSGLTLTEANEKLAEQIDARSLKLIFNDEATETVSGSECGVRYNSN NNVKNILKKQNSFGWIGAYFNKSKNVVNDLAEVDETVLRAKVASLGHLQEEVQVAPVDAK VEYLNNEFTITNEVNGSTIDQEKLISEIKLAFSEGKESLNLTEKKCYVEPAVKANDAKLQ NLLDAARKYASAAITYKTRSGDVVLDGSTLVTWLSIDESGNYYRDDAVFKEKVTDFVGSL AKKINSVGGSRTFVGANNRTITVSGGNYGLRLQNSKEISGLLEDIYANKVGVRTPVTTGR EASSDNGGLGDTFVEIDLSGQHMWYHKNGQIVLESDIVSGTYNIADRRTPAGAYYLYNKE RNRVLRGTKLPDGTWPYETPVSYWMPFNKGIGLHDSSSWRSKWGGTIYMNNGSHGCINLP TGIASQLYNSIEVNCPVVCYY >gi|223714110|gb|ACDT01000105.1| GENE 42 39180 - 40199 1408 339 aa, chain + ## HITS:1 COG:PH0737 KEGG:ns NR:ns ## COG: PH0737 COG1363 # Protein_GI_number: 14590612 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Pyrococcus horikoshii # 5 338 4 333 336 301 47.0 1e-81 MENYVIDFATKILNIDSPTGYCKEVIEFVKNEVEQLGYRTGMNNKGNLYIYVDGKNDKTI GLCAHVDTLGLMVRSIRDNGELAFTNVGGPIVPTLDGEYCRVITRENKIYTGTILSNYPA AHVYEESKTAIRKCENMHIRLDEVVKNKEDVTALGIDNGDYIAIDPKTTYTNSGFLKSRF LDDKLSVACLVTVLKELKEKNIVPANNVIMIISTYEEVGHGSASIPENISELIAVDMGCI GDDLACSEYDVSICAKDSSGPYDYGITSKFIELAKAKGLNYAVDIYPFYGSDVSAALRGG NDIKGGLIGPGVHASHGMERSHQQAINNTIELILAYLQA >gi|223714110|gb|ACDT01000105.1| GENE 43 40285 - 41043 759 252 aa, chain + ## HITS:1 COG:BS_ywpJ KEGG:ns NR:ns ## COG: BS_ywpJ COG0561 # Protein_GI_number: 16080682 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Bacillus subtilis # 1 233 1 265 285 63 26.0 3e-10 MKALASDFDRTLFFYDESGSYYRLEDVQAIKHFQSLGNLFGVCTGRSYKGIEDFNPHQID 
YDFYILCSGAKILDKEGNLIYQKFIDKKIAQSIYEQYKAVDTSIVYQGDMYVLNRTFDLS PRVKLITSFNQVGEQFEAFSLHFKNNDEAGKEYTKMIDMFGDVIDVYQNERHLDFAAKGC SKGKGIKRIIEFYGLAYEDMAAIGDSWNDIPMLKSVENSFTFNRSHQTVKNVAKYLVDGI DGCIEELIKCHK >gi|223714110|gb|ACDT01000105.1| GENE 44 41078 - 41791 677 237 aa, chain + ## HITS:1 COG:BS_treR KEGG:ns NR:ns ## COG: BS_treR COG2188 # Protein_GI_number: 16077849 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 2 234 3 236 238 212 48.0 5e-55 MLSKYQRISQDLIDKIDKQEIKVHSYLPSENELMKLYNASRDTIRKALDILLKNGYIQKN KGKGSLVLDVDKVAFPVSGVMSFKELAKTINSEVVTKVIELSLNPVDVQMKKELYMDSGN YYKIRRIRSFDGERVILDTDYLNANIVPNITETIAADSLYNYIENELGLKISFARKEFTV VKATEEEKNLLDMHEYDLLVCVKSYTYLEDASLFQYTISKHRPDKFRFVDFARRTQL >gi|223714110|gb|ACDT01000105.1| GENE 45 41791 - 43599 1264 602 aa, chain + ## HITS:1 COG:CAC3034 KEGG:ns NR:ns ## COG: CAC3034 COG0249 # Protein_GI_number: 15896285 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 12 601 5 598 598 347 36.0 6e-95 MSSFLGGAMTQKDYLLFQDNIRKEKANLDSRSLKISIARLIIGLLIVGFLLGGNFTNNQL LVYSSLIWIAIFIILIVIHSKITERLSYLNAQAIVIQRYLDRYNNDWKAFDETGLDYVDN VTGVMKDLDLVGKNSLFQYFNVASTIRGKKCLLDKLTRTSFNEKGIIEEQKAVLELSNNN NFVLSLETYGRMLEKPKVNERIIEDFVNNLGKNKVKRSWGFAKYLIPILTIITIVMFLFE FYFKLSVILVPVLIFGQLIIAIINLNSHGIIFEQVSKLSHCLNNYQNISHLIKQSTFTSP FLVELQARLEASSIAFDELKSISSAVKQRNNLLAFILLNGLLLWDINCKERYDRWVKQYA SSIERWLDAIGEFEALTSLQVLLHTKDVTCFANFHGDQTTYLEFGEAYHPLMPNEQAVAN SFMMNHQVCVITGSNMSGKTTFLRTIGINLVLAYAGGPVMAKSFDCSLMQIFTSMRIEDD LNGISSFYAELLRIKQIVEANRQGNLMIALIDEIFKGTNSKDRIIGAQETVKQLSTNNIF TFITTHDFELCELENEISCSNYHFNEYYQGDKIKFDYLIKYGRSQTTNAQYLLKMVGITK EV >gi|223714110|gb|ACDT01000105.1| GENE 46 43701 - 45593 2089 630 aa, chain + ## HITS:1 COG:lin1223_2 KEGG:ns NR:ns ## COG: lin1223_2 COG1263 # Protein_GI_number: 16800292 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, 
glucose/maltose/N-acetylglucosamine-specific # Organism: Listeria innocua # 92 472 1 381 403 540 71.0 1e-153 MGKFSSDASKLLEFVGGRENIRAVSHCVTRMRFVLGDESKADVDKIETLGSAKGTFTQGG QFQVIIGNDVQEFYNEFIKIAGIEGVSKEQVKSDAKSNQTWLQRLMGNLGEIFAPLIPAL ICGGLILGFRNVIDSIAMFENGTKTLVQISQFWSGVDSFLWLIGEAVFHFLPVGICWSIT KKMGTSQILGIILGITLVSPQLLNAYSVASTAAKDIPVWDFGFAQVQMIGYQAQVIPAMM AGFTLVYLEKFFRKVSPEAISMIIVPFFSLVPAVIVAHTVVGPIGWVIGNFISDVVYGGL TSNFGWLFATLFGFVYAPLVITGLHHMTNAIDMQLVSMFKGTILWPMIALSNIAQGSSVL AMIVLQKKNEKAQQVSIPACISCYLGVTEPALFGVNLKYMFPFVCGMIGSAIAATFSVAT GTMATSVGVGGIPGILSIIPKYMGNFAIAMIIAIVIPFVLTYIVGKKKLTDKDLGIETDI IDNEKFVSPMTGKLIKIEDVEDQVFATKAMGDGFAIELTDSDVLAPVSGEIVMTFPTKHA YGLRGNNGVEILIHLGMDTVQLEGKGFESFVKVGEYIKQGDKLAKVDIAYIKEHGKSIVS PVIFTSGEKISILKENCNIKKLETEIIKID >gi|223714110|gb|ACDT01000105.1| GENE 47 45610 - 47247 1532 545 aa, chain + ## HITS:1 COG:BS_treA KEGG:ns NR:ns ## COG: BS_treA COG0366 # Protein_GI_number: 16077848 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus subtilis # 2 538 8 553 561 631 55.0 1e-180 MYYEDKVIYQIYPKSFKDTNGDGIGDIKGITEKLDYLQKLGIDMIWMCPIFASPQRDSGY DIKDYYSIDPIFGTMEDLELLIAEAKKRNIYLMFDMVFNHTSTEHIWFQEALKGNSKYKN YYFFKKGKNGNLPTNWDSKFGGKAWKYVPEFDEYYLHLYDEKQADLNWENEEVRNELIKI VNFWLKKGIRGFRFDVINNIDKRVFADSPDGLGKQFYCDQPKVHKYLHEVNTRSFGRVAG CITVGEMSATSVENCIGYTNPQNKELDMVFSFHHLKVDYVNKEKWTSMPFDFQELKELLN TWQVRMQEGQGWNALFWNCHDQPRSVSRFGDDKKFRVHSAKMLATAIHLMRGTPYIYQGE EIGMTNHYFTNINQYRDVESINAYHILKNQGLEEKEIHQILQDKSRDNARTPMQWDESVN GGFSDGTPWLQVNTNYQTINVNAALKDTDSIFYHYQKLIKLRKEYKIISTGEYIPILENH PQVFGYKRSLNNQELIVLNNFYGQDVLLNLDVINYQVLINNYNEIDLTNLCLKPYQSVAL YKKEV >gi|223714110|gb|ACDT01000105.1| GENE 48 47430 - 48890 1606 486 aa, chain + ## HITS:1 COG:SPy1343 KEGG:ns NR:ns ## COG: SPy1343 COG4868 # Protein_GI_number: 15675279 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 3 482 28 507 518 583 58.0 1e-166 
MKIGFDNNKYLTMQSSHIKERINQFGDKLYLEFGGKLFDDYHASRVLPGFAPDSKLQMLL QLKDQAEVVIAISASDIERNKVRGDLGITYDNDVLRLIEIYRSKGLYVGSVVITQYDGQK AADIFKSRLEKRNIAVYLHYLIDGYPTNTKRIISEEGFGKNDYIETTKPLVVVTAPGPGS GKMATCLSQLYHEHIRGIKAGYAKFETFPIWNLALNHPVNLAYEAATADLNDVNMIDPFH LEAYNKVTVNYNRDIEIFPVLKNMFEEIYGTSPYNSPTDMGVNMVGFCMSDDEACKEASR QEIIRRYYNTMNRYVREECSLDEVNKQELVMNQAKVVVEDRKCVAAANQKSKEANCYAGA IELDDGTIITGRTSDFMGSCSAILINAIKYLANISDDEHLISPTAFGPIQNLKTEYLGSV NPRLHSDEILIALSLSAVNNDKAKLALQQLPKLQGTQAHITSSIGPMDAKVFSNLGIQVT YDAKKL >gi|223714110|gb|ACDT01000105.1| GENE 49 49310 - 50137 839 275 aa, chain + ## HITS:1 COG:L12334 KEGG:ns NR:ns ## COG: L12334 COG1396 # Protein_GI_number: 15671989 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 3 77 1 75 107 67 46.0 2e-11 MNIEIANRLVRLRKEKNLSQEALANELGISRQAVSKWERAEASPDTDNLILLAKLYGMSL DDLLKTDQKEFESGNNQQASKEKTEEDKKQKKKESVHISLKHGIHVTGEDGEEVHVGWNG IHVEDPKEDSRIHIDGSGVFVDDKKYDQEEWRKIRKEKGWDYYYEDYTKFPFALLAIIAY IGIALYTELWHPLWIFLLIVPIIEGAISAVKHRNLNRFPYPVLVILYFLYEGFYENIWSP TWLIFLTIPVYYSLVNYFRQRKQHKNNNCTDNQED >gi|223714110|gb|ACDT01000105.1| GENE 50 50378 - 50800 557 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733723|ref|ZP_04564204.1| ## NR: gi|237733723|ref|ZP_04564204.1| predicted protein [Mollicutes bacterium D7] # 17 140 24 147 147 191 100.0 2e-47 MKELLLKRLKKSKNKKGFTLIEIIVVLVILGILAAIAVPAVMGYIDEAKDSKYIAEARSI YVVVQTEEAKAETLGNAADYTAIATEAAKKTGINNDTADKKISIELKEEGKYTIKWLSDD DKYVTATVTKNKDITIDGKE >gi|223714110|gb|ACDT01000105.1| GENE 51 51070 - 51804 715 244 aa, chain + ## HITS:1 COG:alr1315 KEGG:ns NR:ns ## COG: alr1315 COG1989 # Protein_GI_number: 17228810 # Func_class: N Cell motility; O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, prepilin signal peptidase PulO and related peptidases # Organism: Nostoc sp. 
PCC 7120 # 1 242 13 268 269 148 37.0 8e-36 MFILGSCIASFINVLIYRIPRDLDFVNGRSFCPSCHNTLKPYDMIPVFSWLFLKGKCRFC KEPISPRYPLIELCGGLLAMLCFYRYGFTWMTLVSFVLAMILLTITMIDFDTMTIPNGLV IALVAPVIVCAVLEPQISISSRVIGMASVSGFMLLMTLAIPDCFGGGDMKLMFVCGFMLG WINTLLAGFIGLLAGGIYASYLMVTKKSKEQSHMAFGPYLCLGIFTALLYGNEIIRVYLS FFNL >gi|223714110|gb|ACDT01000105.1| GENE 52 51818 - 53011 959 397 aa, chain + ## HITS:1 COG:DR1863 KEGG:ns NR:ns ## COG: DR1863 COG1459 # Protein_GI_number: 15806863 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Deinococcus radiodurans # 1 390 1 400 406 203 29.0 4e-52 MAIYKYVARDINAKKITGKTEARDEHDLSQLLRVKDMYLISHKDITKEETKNYKMKLKEL ANFCREIGTMMNSGLPLIRTVSILASREDNKKLKAIYNDIYVKLQQGQTLSDALKEQGKA FPDILIQMVRSGEASGNMQDTMMVLNNQFTNDNKIKNKVKSAMTYPVILGIVTIVVLLIV YTAVLPSFFSMFEGMELPLITEINIIISKFIMSYWYTLLIGILVLILIIVSLLQLPKVKY QFDRFKLKMPIIGKLMKIIYTSRFARTLCSLYSSGISIVNAMVIVKSTIGNKYIETQFDN SIKAVRNGEALSVAIGMIDGFDIKLTSSVYIGEESGNLEDLLSSLADDFDYEAMLASEKM VAILEPAMIIVLAVVICVIIISVLVPIYSMYQNVGSM >gi|223714110|gb|ACDT01000105.1| GENE 53 53021 - 53200 199 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755498|ref|ZP_02427625.1| ## NR: gi|167755498|ref|ZP_02427625.1| hypothetical protein CLORAM_01012 [Clostridium ramosum DSM 1402] # 1 58 10 67 307 110 100.0 2e-23 MKNIIYLSNTSAQLVSGVCNGSNQIDISDFQDYQLPEGTMLDGAIMDEDALKDVLRIIK >gi|223714110|gb|ACDT01000105.1| GENE 54 53247 - 53918 595 223 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755498|ref|ZP_02427625.1| ## NR: gi|167755498|ref|ZP_02427625.1| hypothetical protein CLORAM_01012 [Clostridium ramosum DSM 1402] # 1 223 85 307 307 410 99.0 1e-113 MHKNLIVPKIKDKEIHAIVKEEIDSLENGEHDLIYDYTILKDKLVDKPGIEILCCAMKKQ MIEDYLNIFDDVGIEITAIDISLNAIDKLIEDIIRLSQRNFVIAVINGNDIALYLFEEGK YVFSNRSRLFSERGSSSFTMEVSNILIKFKQFIKTADYNQNIERVYFCGLDDYEEKMLFE VVSDSVDIRAMRLANSNTIYYSGNSKEFLLHKYVYAIGSLCER >gi|223714110|gb|ACDT01000105.1| GENE 55 53927 - 54116 115 63 aa, chain + ## HITS:1 COG:no 
KEGG:no NR:gi|167755499|ref|ZP_02427626.1| ## NR: gi|167755499|ref|ZP_02427626.1| hypothetical protein CLORAM_01013 [Clostridium ramosum DSM 1402] # 1 63 1 63 214 114 100.0 1e-24 MFKRNKSKDIDLLEQYYLATKKKNIFQEYLKYLILPLVLVFIIGGAFGYLKVKQGNYKND IAE Prediction of potential genes in microbial genomes Time: Thu May 26 10:19:04 2011 Seq name: gi|223714109|gb|ACDT01000106.1| Coprobacillus sp. D7 cont1.106, whole genome shotgun sequence Length of sequence - 1502 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 366 - 404 5.0 1 1 Op 1 . - CDS 436 - 1281 957 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 2 1 Op 2 . - CDS 1297 - 1500 180 ## Acfer_0525 2,5-didehydrogluconate reductase (EC:1.1.1.274) Predicted protein(s) >gi|223714109|gb|ACDT01000106.1| GENE 1 436 - 1281 957 281 aa, chain - ## HITS:1 COG:YPO2805 KEGG:ns NR:ns ## COG: YPO2805 COG0656 # Protein_GI_number: 16123003 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Yersinia pestis # 1 280 15 294 297 355 59.0 4e-98 MDYITLNNGVKMPQVGFGVFQIKDKEECVRVILDAIDAGYRLIDTAQSYGNEEAVGEAVK KTAVPREELFITTKVWISNYGFENAKASIEDSLNKMQLDYLDLVLLHQPFKDYYGAYRAL VDLYKEGKIKAIGVSNFYPDRLVDLCLDMDVIPAVNQVEVNPFHQQNLALEYNQKYSVQL EAWAPFAEGKNGIFENKILSDIGKKYNKSIGQVILRWLVQRGIVPLAKTVRKERMEENIN IFDFELSQEDMNIIAQMNKDKSSFFSHYDPKTVEMICGLKR >gi|223714109|gb|ACDT01000106.1| GENE 2 1297 - 1500 180 67 aa, chain - ## HITS:1 COG:no KEGG:Acfer_0525 NR:ns ## KEGG: Acfer_0525 # Name: not_defined # Def: 2,5-didehydrogluconate reductase (EC:1.1.1.274) # Organism: A.fermentans # Pathway: not_defined # 1 67 215 282 282 63 42.0 2e-09 RWIYQEGVISLPKSTNIERMEGNIDIFDFELNNQEMQSIRDLDTGKPTHNPEDSANETRL MGLKIHD Prediction of potential genes in microbial genomes Time: Thu May 26 10:19:06 2011 Seq name: gi|223714108|gb|ACDT01000107.1| Coprobacillus sp. 
D7 cont1.107, whole genome shotgun sequence Length of sequence - 1469 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 100 - 438 116 ## PRU_1504 iron-sulfur cluster-binding protein 2 1 Op 2 . - CDS 419 - 865 554 ## Mlab_0618 hypothetical protein - Prom 909 - 968 11.9 + Prom 886 - 945 11.1 3 2 Tu 1 . + CDS 977 - 1469 288 ## COG0583 Transcriptional regulator Predicted protein(s) >gi|223714108|gb|ACDT01000107.1| GENE 1 100 - 438 116 112 aa, chain - ## HITS:1 COG:no KEGG:PRU_1504 NR:ns ## KEGG: PRU_1504 # Name: not_defined # Def: iron-sulfur cluster-binding protein # Organism: P.ruminicola # Pathway: not_defined # 15 112 191 292 292 75 35.0 4e-13 MKGYTIEGKTAWESVSKRNRESPEWTDGTALHCDIDKCNGCGICKKVCPVGRNIIKDGKR FEVTNKCDFCLGCIHACPRNAFYLTINDKNPNARYRNPNINIKEIMDANNQR >gi|223714108|gb|ACDT01000107.1| GENE 2 419 - 865 554 148 aa, chain - ## HITS:1 COG:no KEGG:Mlab_0618 NR:ns ## KEGG: Mlab_0618 # Name: not_defined # Def: hypothetical protein # Organism: M.labreanum # Pathway: not_defined # 2 141 3 142 255 134 46.0 1e-30 MILYYTSTGNSLYIAKQLGGELISIPQALKEDNLIFEDDVIGIVCPDYSAELPRTVRKFL EKATLKADYVFVIITYGKLDSIVAYWTKEFAQKHGIHLDYVNTVLMVDNWLPSFDMFEEM AIDKEIPKQLENIKKDIKLEKICERIYH >gi|223714108|gb|ACDT01000107.1| GENE 3 977 - 1469 288 164 aa, chain + ## HITS:1 COG:NMB0173 KEGG:ns NR:ns ## COG: NMB0173 COG0583 # Protein_GI_number: 15676100 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Neisseria meningitidis MC58 # 1 83 1 83 306 68 39.0 6e-12 MELYQLEYLVALEKYKTLLETANQLHVSQPSITKAIKKLEDEFGFALFDRVKNRLVLNMY GEEVIAHAKKILSEVELLKLSTNALFQQNFRITIGSNAPAPLWGIQSLLDHSIGEIINDN EVLIKSFRNHLYSMIILDYPMKLPNCECKEICKEQLYISVDKTH Prediction of potential genes in microbial genomes Time: Thu May 26 10:19:13 2011 Seq name: gi|223714107|gb|ACDT01000108.1| Coprobacillus sp. 
D7 cont1.108, whole genome shotgun sequence Length of sequence - 7982 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 125 - 394 291 ## gi|237733498|ref|ZP_04563979.1| predicted protein - Prom 442 - 501 4.8 - Term 877 - 922 5.7 2 2 Op 1 . - CDS 930 - 4334 3525 ## COG3525 N-acetyl-beta-hexosaminidase 3 2 Op 2 . - CDS 4388 - 7942 4299 ## COG3525 N-acetyl-beta-hexosaminidase Predicted protein(s) >gi|223714107|gb|ACDT01000108.1| GENE 1 125 - 394 291 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733498|ref|ZP_04563979.1| ## NR: gi|237733498|ref|ZP_04563979.1| predicted protein [Mollicutes bacterium D7] # 1 89 9 97 97 140 100.0 2e-32 MIGSSNQSLQTYYGGAKVLSADKGEADVFMFVDDSAVKEVIVNTDDKLLQDSIVISKSIK VSGNFENEREADEKFLKSILEDFLKNSLE >gi|223714107|gb|ACDT01000108.1| GENE 2 930 - 4334 3525 1134 aa, chain - ## HITS:1 COG:L126168 KEGG:ns NR:ns ## COG: L126168 COG3525 # Protein_GI_number: 15673476 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Lactococcus lactis # 189 517 3 324 325 107 26.0 1e-22 MFIYVSALMRWDDMKKIKKLLKIGLSIAIVMSGFLATAKVANASTSLMTYPNVQQYTKEA GQDFTLAENSRIFVVANEKTLNNTTLLKDLKLTSSNFEAAGVLSKAPIIVFGKEENAVVN DIVVRMEDVAELEGKAESYKLDITDKITVTAKDEIGIYYGLMSVIQMLKINDKILEKGTV IDYPDVELRSMHLDIARKPFSKEWIIRQIKDLSWQKYNVVQLHFSENEGFRIQSDTLDAI EGFKYKYDDVLSKQDILDIIQVANDYHIEIVPSLDSPGHLGAVLQYLPTDYSCRELFPTD DRRNQCFNIFTNPEAREFLVNLMTEFIEFFGDAGCKHFNIGGDEFLAKFSSFSNEQYGQI MTYFNDISKIVKDNGMTPRAWNDGLLFGDYEGYTLDRDIEVCYWAAPENCASVADFVANG NKVINYSDVYMYYVLSWWWQTNAVPEGDRIYNEWHPGKFSNLSGTAQTFEEPYPEYVMGG SYAVWCDDPNYESEQSVEDKIYHRTRATAYKMWNANDNQPDYDVFKAAVDKLGRTPGFNE ALPAPGEVFQGEDTATVTIKYIDNFGKTIAADDVFYGLNDSEYHFEAKELFGYSFIESDL PLDGKYNGNMTITLKYQLHCDKTDLKKEIFEPLAVSEYINETVADYNEALTMAKEIYFDE TSGQLAVNNALRALQDAKNKAVKLEYYSLYVEVTYPLNESDYSSGYQAYKNAVTTGKTLL NSEGLTVEKAKEAYNAIQTAKQSLVKPNAKVPTISATDTYYQSNNYNNMIDGNINTKCWF 
NSNQEIGKEVKFTFTRPINISSIQIIQPSNAGEDIIDGADVQISTDGKNWTTVGTLDNSA FEKTITFDKTLTKYVRIVLTEGKGNWYQIAEVKFELEEIPEDSTLKDMILEAETLNITGK SSGSVNEMIDALIVAQKSYVEGKVDVTEETNNLRNAINALRDTVTSNKEKLTDKINDAKE IENNNYSIESWSRFEKTLLEAIIVNKDSNATQEQVDKALKTLDDAIKGLKAPSDTSELET VIDKANKLDASKYTEKSWNVLQNAIDAGNKVLENKDATQEEINAAVDAINNAIDKLEVKP VDPDKPAKPEKPMDPVKPIKPGVATGDNTNFMAIGTLLVAATGIALLKKKKKEN >gi|223714107|gb|ACDT01000108.1| GENE 3 4388 - 7942 4299 1184 aa, chain - ## HITS:1 COG:L126168 KEGG:ns NR:ns ## COG: L126168 COG3525 # Protein_GI_number: 15673476 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Lactococcus lactis # 202 539 2 313 325 126 28.0 3e-28 MKGKVSMNRLLKVVAPALLVAAIICQSVGFQGVLAKDNVTTASSDHVSNLTDQQLQALTN PILQSYKADNSHKVWKMTADSRLAVLANQTNIENVRLQEVVKLVNSEFVEKGIISSPLAM VFAQEKDVSYGDILINIDKTNPITNNSNSEEAYKITIGEDGVCLTGASENAVMHGLRTIQ NLAITNDGLVYGEIIDYPNVAERRVHVDCARKYISKDWFIRQIREMSYMKMNALQIHFSE NMGFRIECETDPAIVSDQYLTKIEVREILAEAKKYGIKVIPSFDSPGHVDQILKAHPEYG QVNNSGNHYKSGLDVTNPEAIAYIRSLYDEYMDLFEGCTDFHIGGDEYMEFDRPPFTTEY KSVLNSYAVEKYGQGYIWKDVIAGYINDLAEYVHDRGFTPRIWNDGVYYGESSYEGAQKI KMHDYIGIDFWSQMSWNSSIANLQTFINKGHDTIYNINASFFYYVLRNDKPTDGREQHSF DNLNADRKIYNEWTPGKFQGNPAVDDNSDFIKGASMGIWCDNPNLCSEDVITEDIADELR ALASKSWNTSSNTIINFDGFQENYAKLGNVAGFEKGSTLPDAGEFLTAGDLGKITIRFVD ENNQELKHEVIKYGTVGEKFEFSADPIYGYRVIDNTPITGTYTKEGAVYIFTYELYTDKA ALNNEVTNVLKEKDYINETFSEYKTALIAAKAINDKTDATQAEVDEVYDELIEAKAKAVK IEYYSLYVQSTYPLKQDDYVGGYEAYKQAVDAGKALLSSDTLNAETAKGAYEAIKTAKSN LVKPDGNIPTITATDNYYEKNQWYPTKEYSYEKMLDNDLNTKCWFGQEQAADKEFKFTFP TAVNMTSVQVIQPSNVGDDALKAADIEVSLDGETWTKVGSITENDLDYTATFEKIAVKYV RVRLTEAKPGYWYQVSEVKFAYEQPQEDNTLRDMINEAEALDINGKAPILVSNMVDALIA GQKEYVKGTTDTTTVETTLRNAIDALKDAADITNLSGLIEKVQKLNKADYTVDSWNNLEA AYNKALEVLENTTATQEQVDKACDDLDRAINALVKAPIVADKTELLKAIEAAKDLVEKDY TSESWKTLQDALDAANKVVANKDATQEEIDAAVDAINDAIDKLEAKPVEPEKPIEPEKPM DPVKPAKPNQPAKPGVATGDNANFMAIGTLIAATAVIALLKKEN Prediction of potential genes in microbial genomes Time: Thu May 26 10:19:19 2011 Seq name: 
gi|223714106|gb|ACDT01000109.1| Coprobacillus sp. D7 cont1.109, whole genome shotgun sequence Length of sequence - 2559 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 549 - 971 241 ## gi|237733502|ref|ZP_04563983.1| conserved hypothetical protein - Term 1951 - 1983 -0.6 2 2 Tu 1 . - CDS 2196 - 2558 161 ## gi|237733503|ref|ZP_04563984.1| conserved hypothetical protein Predicted protein(s) >gi|223714106|gb|ACDT01000109.1| GENE 1 549 - 971 241 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733502|ref|ZP_04563983.1| ## NR: gi|237733502|ref|ZP_04563983.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 140 23 162 162 235 100.0 7e-61 MIDMSLSNILSKIDRKNVYDYIVYSDEMTFMIDDIIEMESYWYDIELIFKFYGEYSFFRG KVISGSIKENGCIAVIKNTSMVKKDLKVLMKMGRNDMKKAYIELKNKMLNSFEDIWEFLN KLKIFYLKTIRFNRSILLFC >gi|223714106|gb|ACDT01000109.1| GENE 2 2196 - 2558 161 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733503|ref|ZP_04563984.1| ## NR: gi|237733503|ref|ZP_04563984.1| conserved hypothetical protein [Mollicutes bacterium D7] # 8 120 1 113 113 203 99.0 2e-51 DPITTPEILKPGEKPNLSDNVSNLDDLPNGTQIKDITPEENVNLNKPGEYTGLLEITYPD GSKENVEVKIIVEKVTADTNRPKTGDMSNSSLQIFGMFCSSVTLLGLFMRRHKKGKYNIK Prediction of potential genes in microbial genomes Time: Thu May 26 10:19:33 2011 Seq name: gi|223714105|gb|ACDT01000110.1| Coprobacillus sp. D7 cont1.110, whole genome shotgun sequence Length of sequence - 4274 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1827 1063 ## SLGD_00351 cell wall associated biofilm protein - Prom 1861 - 1920 5.1 2 2 Tu 1 . - CDS 2462 - 2866 275 ## COG1959 Predicted transcriptional regulator - Prom 2919 - 2978 9.5 + Prom 3111 - 3170 8.4 3 3 Tu 1 . 
+ CDS 3301 - 4273 696 ## EF0530 AraC family transcriptional regulator Predicted protein(s) >gi|223714105|gb|ACDT01000110.1| GENE 1 3 - 1827 1063 608 aa, chain - ## HITS:1 COG:no KEGG:SLGD_00351 NR:ns ## KEGG: SLGD_00351 # Name: not_defined # Def: cell wall associated biofilm protein # Organism: S.lugdunensis # Pathway: not_defined # 79 436 312 650 3799 221 41.0 6e-56 MSRQKYGDSMYESRKQRWTIRKLSIGVCSVLLGMTMMISYSPVIHAYENSETLAEVARGV NITDRGMDTTTHFTASGIEDSSLRYASTVYNKIDNSGNIFLKMSKWADGSTGWGEDKNNE FAGKYLLSFTNDSFYKEIDRITIDNKSMAKRDDGALWTISITDIPNYALIGVVTNHDVKI TLKNGQTLQSMGLANTAISFSSVWIKNNGSIARESISNGYILENNPNVKNEKQTGFTAGQ MTQKVLFDADNMSIKSIHTFKPNENYLQSDYGWVVYVKEQIPAVLLKYIDTDNITINNSD VFGDNYLNRKVFKISIDDTGLVDTSTVPELSIVGNDTKVQLNEARKNTNEIFWGTLGQSR DYTITYKLKDNVSIADFARAMNDYIQEKNARILFDHWLEADYLNDSDQKLIHKPDGGAAP KQLVNSYSNAYLDTNDTDKDGLFDFVEWNIGSDTTKVDTDGDGVPDGQEFMTDKTNLIDA ADYLVSKPLTNIMNYDPTKNATITGTVPKPLITDPANADKLISITNPAAGNVIVKLQGYD ETNQTYTNEEYATTKIPFDNLITGSFTLNVNANVVPNGNKAVLVAYSPNGKNAVMGDPML FSITDAEK >gi|223714105|gb|ACDT01000110.1| GENE 2 2462 - 2866 275 134 aa, chain - ## HITS:1 COG:TM1527 KEGG:ns NR:ns ## COG: TM1527 COG1959 # Protein_GI_number: 15644275 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Thermotoga maritima # 9 130 11 127 140 75 33.0 3e-14 MHLKMSTDYAVRIIIYLSEKQCRVSSKELSKKLVIPQNYIFKILKILQESNLVNSYSGSD GGFILKKEPKDITMFDIMQATETTTRINRCLENDEYCSRHATTNCKARCFYIKMQMYMDD TFKRMTIQDLLEMK >gi|223714105|gb|ACDT01000110.1| GENE 3 3301 - 4273 696 324 aa, chain + ## HITS:1 COG:no KEGG:EF0530 NR:ns ## KEGG: EF0530 # Name: not_defined # Def: AraC family transcriptional regulator # Organism: E.faecalis # Pathway: not_defined # 110 321 10 221 305 129 36.0 1e-28 MNLDEKEIKKIKALHVTTKIPVFIFDEYSKCIRRYCSGKIFNFSYDFLFKNSSGTNTGVW FNYGIFNEIFISLPYKDVVITIGPFLTNRASKNRMELFMQRPDIINQTAANKKKYLKYYS SLSIYSLGDIRDFIIILGSLFNIDLEQNYSESLHAQVHKNELQIKKEFFENIGTEFIHSE KYKFYYENKIIELVSQGNLTALKQGIADIGCSVIPSLTSDSVRTEKNYTIVILEKLSSLA 
IHVGKDIIDTIRLRDFYIRKVESQKNLAQILAVRDSAIIHFTKELHEFSNVKYSPLTLSM IQYINLKVYDSFKTTELAKLFFYV Prediction of potential genes in microbial genomes Time: Thu May 26 10:19:45 2011 Seq name: gi|223714104|gb|ACDT01000111.1| Coprobacillus sp. D7 cont1.111, whole genome shotgun sequence Length of sequence - 3047 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 101 - 352 97 ## gi|237733507|ref|ZP_04563988.1| predicted protein + Prom 616 - 675 8.8 2 2 Tu 1 . + CDS 740 - 2581 2141 ## gi|237733508|ref|ZP_04563989.1| predicted protein + Term 2585 - 2618 3.1 + Prom 2607 - 2666 8.4 3 3 Tu 1 . + CDS 2731 - 2973 147 ## Predicted protein(s) >gi|223714104|gb|ACDT01000111.1| GENE 1 101 - 352 97 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733507|ref|ZP_04563988.1| ## NR: gi|237733507|ref|ZP_04563988.1| predicted protein [Mollicutes bacterium D7] # 1 83 1 83 83 127 100.0 2e-28 MEYINYSNIAFAYFKDNSYQNRAKPLISKLEKSYKEVIENIDNFKEAVKHALYNHETCIT YEYEKVVSQLSIAEQMQKENGIF >gi|223714104|gb|ACDT01000111.1| GENE 2 740 - 2581 2141 613 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733508|ref|ZP_04563989.1| ## NR: gi|237733508|ref|ZP_04563989.1| predicted protein [Mollicutes bacterium D7] # 1 613 1 613 613 1107 100.0 0 MKGKLFKTLLVSSMLTCNIAPIMAIENNVTTSNVLDGNEVIAEIFKYDASSLKNEYSEAV TSKPTYLIYPDTQMDLTSANALLDQLGIKEHLDRYATAAYIINPEADGYTQEESDEYLKI VDKLIGASINTKVIGIGNGATFVNKEISQKDWMLAGIMTYGGEAGTTAKYSVPAYVSNSA TNSELGYINALSSNKKVTSGSVDTYSNPDCKFETVVVNHEDESLSEAFSNAWKKVLSNNG RFGNIGGTFYTMTDSVEREYEYYTFFEKEDYGFERNVVKYDFDNDGKDSLWYEYISEEVN NASAGSVPLVILLHGNGNDARTQIETSGWAQVAKDNKLMLVEPEWQGTNQFDALTNDDSS SLDNDIISLVEKLKATYPQIDASRIYIEGLSRGSRNSLHIGLVHPELFAGLGIHSGGINP EFVDGLKEFAQDNASKYDMPVYMTIGTKDSFEYLPVASSAGGINIQKAIQYYQTLNDLPV TTTFNNSYFGLDLENEKTINNDGSLLIKSGTLTNSKGVAMSFNAIENFGHWNYEPIAKEM WAFFSNYSRNLETGEIVVKKGDTNTVIPTNQDTTTKTTDKVTSVKTGDNTPIIMLSMLLL MSLTLLVKTKKEL >gi|223714104|gb|ACDT01000111.1| GENE 3 
2731 - 2973 147 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNLKNFCASLGFCSDLNTLRDAFIVSCLGNNINFKYLCKIVGTTNIKRLYVLCPEPDNIY ALTPLEMFFSRITNDTTNNY Prediction of potential genes in microbial genomes Time: Thu May 26 10:20:14 2011 Seq name: gi|223714103|gb|ACDT01000112.1| Coprobacillus sp. D7 cont1.112, whole genome shotgun sequence Length of sequence - 2274 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 350 - 670 158 ## PROTEIN SUPPORTED gi|42783874|ref|NP_981121.1| 30S ribosomal protein S14 homolog-related 2 1 Op 2 . - CDS 672 - 1277 417 ## gi|237733510|ref|ZP_04563991.1| predicted protein - Prom 1343 - 1402 4.6 - Term 1573 - 1620 9.1 3 2 Tu 1 . - CDS 1713 - 2012 183 ## COG1015 Phosphopentomutase - Prom 2151 - 2210 7.4 Predicted protein(s) >gi|223714103|gb|ACDT01000112.1| GENE 1 350 - 670 158 106 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42783874|ref|NP_981121.1| 30S ribosomal protein S14 homolog-related [Bacillus cereus ATCC 10987] # 6 105 10 109 114 65 39 4e-11 MAANTQLLKGLMDGVILEIIKNKETYGYEIYENLQNLGFNDFAEGSLYPLLLRLEKKKYI VGIKRPSEYGPDRKYYSLTSLGLEELTNFKTAWERIDAAMKNLWKK >gi|223714103|gb|ACDT01000112.1| GENE 2 672 - 1277 417 201 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733510|ref|ZP_04563991.1| ## NR: gi|237733510|ref|ZP_04563991.1| predicted protein [Mollicutes bacterium D7] # 1 201 1 201 201 329 100.0 5e-89 MNNQQNYKVQLNNDYTNGLKKIKKGLDLNLFPRNFQKEVLNDLLEVFLRNQAQGKKFSEV VDHDVNSFIKDIIEAYVLNMTKKEIGLNSLKIGLIVSIVFTVLDFCQYGFTVITMLMPLI SFILISITQYILYQTATKISIKLLHSLGFIFGIVLYIFEFSLVKIFPNLIIFNPFNSMFY PVFLAIQIIFYLIVCLRLREN >gi|223714103|gb|ACDT01000112.1| GENE 3 1713 - 2012 183 99 aa, chain - ## HITS:1 COG:yhfW KEGG:ns NR:ns ## COG: yhfW COG1015 # Protein_GI_number: 16131258 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Escherichia coli K12 # 2 98 305 401 408 112 51.0 2e-25 
MNVEWYQEILKICDRLFLNILKLLKKDDIFVVCADYGNDPLIGHSRHTREYVPLLVYKKD AQAVQLGVRETLADIGATVCDYFQVKPCEYGKSFLSKIV Prediction of potential genes in microbial genomes Time: Thu May 26 10:20:23 2011 Seq name: gi|223714102|gb|ACDT01000113.1| Coprobacillus sp. D7 cont1.113, whole genome shotgun sequence Length of sequence - 3212 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 642 - 875 253 ## gi|237733512|ref|ZP_04563993.1| integrase - Prom 901 - 960 7.3 - Term 2048 - 2102 3.3 2 2 Tu 1 . - CDS 2285 - 2749 433 ## gi|237733514|ref|ZP_04563995.1| predicted protein - Prom 2835 - 2894 12.9 Predicted protein(s) >gi|223714102|gb|ACDT01000113.1| GENE 1 642 - 875 253 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733512|ref|ZP_04563993.1| ## NR: gi|237733512|ref|ZP_04563993.1| integrase [Mollicutes bacterium D7] # 1 77 1 77 77 149 100.0 8e-35 MTLNELFDIYIEDVDMINQVTTTDSIKYRYKSHLKPVFGNIELEAIDPKSIKKFQKDMVE GVYGSRSGDVFRCHILI >gi|223714102|gb|ACDT01000113.1| GENE 2 2285 - 2749 433 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733514|ref|ZP_04563995.1| ## NR: gi|237733514|ref|ZP_04563995.1| predicted protein [Mollicutes bacterium D7] # 1 154 1 154 154 300 100.0 2e-80 MKKSIKLFLCLTLIFVCAFMSPVGVNGASDSSSEAVLYEAEFSLSNPNEQTKTFKNGDKT ICLKMGEDLSLKTRETIPIGSFNRYFTYDDGIITMKARYTGSVNPYNSFITEVSDGRYNS LAGTTFIRDRYSWGSLSYAPSNAYGKYEELRHTH Prediction of potential genes in microbial genomes Time: Thu May 26 10:20:36 2011 Seq name: gi|223714101|gb|ACDT01000114.1| Coprobacillus sp. D7 cont1.114, whole genome shotgun sequence Length of sequence - 1521 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 433 - 1071 767 ## gi|167754647|ref|ZP_02426774.1| hypothetical protein CLORAM_00150 - Prom 1204 - 1263 8.5 Predicted protein(s) >gi|223714101|gb|ACDT01000114.1| GENE 1 433 - 1071 767 212 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754647|ref|ZP_02426774.1| ## NR: gi|167754647|ref|ZP_02426774.1| hypothetical protein CLORAM_00150 [Clostridium ramosum DSM 1402] # 1 212 1 212 212 317 100.0 4e-85 MKKILTFALAALMIISMTGCAGNDNTQKEQEKVVTSFFDNIKEYKLDELEKLGTDDYTDV LDISSITDGFKNFEDENVYGESFVKETENFVKKVFDKLIKEYKITFDDEEENTVLVKGKR LDEKSIDMTSAQENINELAQKYQEENMEELSKIYLEQGQEAMMKKIFDDLSKELYNPLYK ELDNAKYTEFTFKFEMVEKDGKWLIDKISKEK Prediction of potential genes in microbial genomes Time: Thu May 26 10:20:54 2011 Seq name: gi|223714100|gb|ACDT01000115.1| Coprobacillus sp. D7 cont1.115, whole genome shotgun sequence Length of sequence - 33846 bp Number of predicted genes - 31, with homology - 30 Number of transcription units - 19, operones - 8 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 531 - 573 2.2 1 1 Tu 1 . - CDS 596 - 2761 2389 ## COG2273 Beta-glucanase/Beta-glucan synthetase - Prom 2987 - 3046 7.3 - Term 3040 - 3082 10.4 2 2 Op 1 . - CDS 3103 - 3342 220 ## CD3370A putative conjugative transposon excisionase 3 2 Op 2 . - CDS 3347 - 3766 261 ## CDR20291_1773 hypothetical protein - Prom 3858 - 3917 5.7 - Term 3939 - 3977 -0.4 4 3 Op 1 . - CDS 4002 - 4610 326 ## COG4832 Uncharacterized conserved protein 5 3 Op 2 . - CDS 4670 - 5569 506 ## COG2378 Predicted transcriptional regulator - Prom 5621 - 5680 3.4 - Term 5658 - 5697 1.2 6 4 Tu 1 . - CDS 5768 - 6403 143 ## gi|237733521|ref|ZP_04564002.1| conserved hypothetical protein - Prom 6514 - 6573 5.2 + Prom 6753 - 6812 2.8 7 5 Op 1 . + CDS 6915 - 8057 609 ## COG2946 Putative phage replication protein RstA 8 5 Op 2 . + CDS 8050 - 8256 173 ## SMU.1031 putative transposon excisionase; Tn916 ORF1-like 9 5 Op 3 . + CDS 8335 - 9528 223 ## COG0582 Integrase - Term 9974 - 10012 1.4 10 6 Tu 1 . 
- CDS 10239 - 10367 65 ## - Prom 10413 - 10472 11.8 11 7 Tu 1 . + CDS 10525 - 11529 900 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Term 11564 - 11621 5.1 12 8 Tu 1 . - CDS 11721 - 12473 672 ## gi|237733526|ref|ZP_04564007.1| predicted protein - Term 12491 - 12538 7.2 13 9 Op 1 8/0.000 - CDS 12549 - 14033 1194 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 14 9 Op 2 . - CDS 14048 - 15349 1461 ## COG1455 Phosphotransferase system cellobiose-specific component IIC - Term 15653 - 15707 -0.8 15 10 Tu 1 . - CDS 15744 - 16361 560 ## COG1285 Uncharacterized membrane protein 16 11 Op 1 . - CDS 16431 - 17189 756 ## COG2365 Protein tyrosine/serine phosphatase 17 11 Op 2 . - CDS 17191 - 18015 754 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) 18 11 Op 3 7/0.000 - CDS 18028 - 19878 1844 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 19 11 Op 4 . - CDS 19966 - 20802 760 ## COG3711 Transcriptional antiterminator - Prom 20903 - 20962 8.9 20 12 Tu 1 . - CDS 21222 - 21755 196 ## gi|237733534|ref|ZP_04564015.1| predicted protein - Term 22057 - 22096 0.3 21 13 Tu 1 . - CDS 22097 - 23872 1557 ## COG0627 Predicted esterase - Prom 24115 - 24174 6.8 22 14 Tu 1 . + CDS 24113 - 24481 85 ## gi|167754663|ref|ZP_02426790.1| hypothetical protein CLORAM_00166 - Term 24577 - 24615 0.2 23 15 Tu 1 . - CDS 24642 - 25706 939 ## gi|167754664|ref|ZP_02426791.1| hypothetical protein CLORAM_00167 - Prom 25761 - 25820 9.6 - Term 25787 - 25826 -1.0 24 16 Op 1 . - CDS 25886 - 26848 920 ## COG1073 Hydrolases of the alpha/beta superfamily 25 16 Op 2 . - CDS 26851 - 27192 454 ## EUBELI_01748 hypothetical protein - Prom 27216 - 27275 8.7 26 17 Tu 1 . + CDS 27491 - 27697 348 ## COG1278 Cold shock proteins + Term 27704 - 27743 6.8 - Term 27695 - 27728 3.0 27 18 Op 1 . - CDS 27733 - 28509 879 ## COG0561 Predicted hydrolases of the HAD superfamily 28 18 Op 2 . 
- CDS 28566 - 30443 1972 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Prom 30516 - 30575 13.2 + Prom 30453 - 30512 7.6 29 19 Op 1 . + CDS 30627 - 31199 589 ## NT01CX_0025 TetR/AcrR family transcriptional regulator 30 19 Op 2 . + CDS 31202 - 31999 546 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Prom 32004 - 32063 5.5 31 19 Op 3 . + CDS 32084 - 33394 765 ## gi|167754672|ref|ZP_02426799.1| hypothetical protein CLORAM_00175 Predicted protein(s) >gi|223714100|gb|ACDT01000115.1| GENE 1 596 - 2761 2389 721 aa, chain - ## HITS:1 COG:TM0024 KEGG:ns NR:ns ## COG: TM0024 COG2273 # Protein_GI_number: 15642799 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Thermotoga maritima # 175 460 155 436 642 130 33.0 7e-30 MKSKFFKSALALTSALSLSGINMFSAVAKNLDTTGLETEYEMLELERTKPGTNLIKNGGF ETNKGQYWSTTGDGQITDYPGAAHNSKVCGLLPSFSKNASVYQTLSLEKNKEYKVTAKVL SADAAAQAIVAIKTPNVENLNPAIEQAITFNEAWQYQDLTFTFNSGENSEVALAFIKWTE STENATYKSQIYIDDVTLSQIGTETTPNDEKYEVIWKDEFDGTGAIDNNGLDNSKWGYEL GCIRGVEQQHYTKDKENVHVDDGKLVLEVTDRAKESQYKNPRGDRQVIYNSGSVRTHGKQ DFLFGRIEVLAKLPEGQATFPAFWTLGSDFTLDGSINGDQGDGWPLSGEIDIMESIGDPN FVYETLHYSDTNKPGYTPGADNGKYAGNGKGSKITTPGVVIDGETYHVFGINWSEGKMEW YIDDQIVRSVDYSDDPAAKAALDRPQYIQLNFATGGNWPGDAGSNLAGQTFKVEYAYYAQ NQEQKAAAEKYYANTAALNVKDLSMVEGVVPDLLNEATLTAGSELVDLSEYTIDYSIDNE HMFTTNPDLNDNSQSNDQNQTKVECLIDGAASKEKIAKLAPGEYNIHYSAMHDSKPSVRK TAKLTVVEKPLLPSDMDLEGVIGNKLESIALPEGWGWVNPNMVIDTHTGEYEIIYTKVDF SNPVKVTVRATEKADVDTIAPTEKPSKDNTSVKTGDNALVATFAMLGTISLAGITLLKKK N >gi|223714100|gb|ACDT01000115.1| GENE 2 3103 - 3342 220 79 aa, chain - ## HITS:1 COG:no KEGG:CD3370A NR:ns ## KEGG: CD3370A # Name: not_defined # Def: putative conjugative transposon excisionase # Organism: C.difficile # Pathway: not_defined # 1 79 2 79 79 68 46.0 8e-11 MKKEYDYPTLKVICQASAGNETAIREILKFYDAYICKLCLRPFYHSESGKITMQVDEELK GQIHTEMMKAILKYEIRVK >gi|223714100|gb|ACDT01000115.1| GENE 3 3347 - 3766 261 139 aa, chain - ## HITS:1 COG:no 
KEGG:CDR20291_1773 NR:ns ## KEGG: CDR20291_1773 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 3 136 4 137 145 87 38.0 1e-16 MNTSSFEHIVRIQFNALMMTVIKCTVKSRNRQFARHSKCEVLFCELSDTKNLECVTKDKY SCDYISFEVLNFTIQISNEKLAVALYKLSSKERDVILLRYFQSMSDQEIAELYHVSRSAI YRRRSNGLKKLKTVLKERN >gi|223714100|gb|ACDT01000115.1| GENE 4 4002 - 4610 326 202 aa, chain - ## HITS:1 COG:FN0105 KEGG:ns NR:ns ## COG: FN0105 COG4832 # Protein_GI_number: 19703453 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 12 202 12 204 204 200 56.0 1e-51 MKHEWKKHEKNLYGVKQSPRIVDIPMQQFIMIKGEGNPNDKEFSDKVSALYSLAYGIKMM YKNTISTNEISDYTVYPLEAIWKDIGDTQQLDKNQLEYTIMIRQPDVITKDIVFDALERV KIKKPNSLYEQIFFDTMQDAKSIEVLHIGAYDDEPISFQKMDLLAKENGLVRSTNYHREI YLNNANRVLKSKLKTILRYCVK >gi|223714100|gb|ACDT01000115.1| GENE 5 4670 - 5569 506 299 aa, chain - ## HITS:1 COG:FN1249 KEGG:ns NR:ns ## COG: FN1249 COG2378 # Protein_GI_number: 19704584 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Fusobacterium nucleatum # 1 297 16 312 314 279 48.0 5e-75 MQESRLFRIVYYLIEKGKSTAPELAQKFEVSARTIYRDIDVISSAGIPIYATQGKNGGII IDENFTLDRSMLSKKEKEQILSGLQALFVANSNNTNELLTKLGALFHLKTTNWIEVDFSD WIQGQPAQNIFNDIKTAILDKYIISFEYCSNREHTVFRYIQPLKLIFKSKAWYVYGFCML RKDYRLFKLTRIEHLKITEEHFTPPDTIPSVDTSIKQEDIITVTLKFDKKMAFRIYDEFP RDSIIEQNDFLFVTTSLPNSNRLYSYILSFGEYVEIIEPQEIRKNIQSQIKKIQEKYQT >gi|223714100|gb|ACDT01000115.1| GENE 6 5768 - 6403 143 211 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733521|ref|ZP_04564002.1| ## NR: gi|237733521|ref|ZP_04564002.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 211 1 211 211 358 100.0 1e-97 MEGYAMKISHLGNNIQTIRKFRGMKQQELADKIGINMQSLSKIERGLNYPAYETLEKIME VLDVTPNELLSGEWKYINQSEKEVCQFLRAEERLNAELKHGQYDNFFDSEEEWLEYELEK LREYITDYINGKSIEASDLYPIKEFIQHLKFQKLLDRYDDLYSMDMFGESTDGHKYGTPY QVVKMINPNSKEDMELLREVLKNNHFDDEDE >gi|223714100|gb|ACDT01000115.1| GENE 7 6915 - 8057 609 380 aa, chain 
+ ## HITS:1 COG:BS_ydcR KEGG:ns NR:ns ## COG: BS_ydcR COG2946 # Protein_GI_number: 16077554 # Func_class: L Replication, recombination and repair # Function: Putative phage replication protein RstA # Organism: Bacillus subtilis # 63 372 25 351 352 147 30.0 3e-35 MNNLEFALEMKNKRLASGLSQGELASLTHVSRYSINRFENGKANASKETQNIILRCLNYY ICDKPFYLLVDYLSVRFPTTDALEVIRKVLGMKADYFIHYDYGYYGYKEHYAYGEIKVMA SDDEHMGVFLELKGAGSRNMEYVLQAQNRDWYSFLNRCLDCGGVIRRFDLAINDMCGLLD IPVLSEKYKNGGADCRCKNYENVQGGKLSGKNRNLASTLYIGSKASTKYFCLYEKQKEQA TKKKHTDIINRFEIRLRDKKAVQAVEELLLTYNPHGLVFYLITDFVQFPDYPLWEIFISH DSLPFEMNPVPVNMERTLQWLERQVMPSIVMIEEIDRLTGSNYMKMIDECTHLSEKQEML VEQMCTDIADVIESEGVFYE >gi|223714100|gb|ACDT01000115.1| GENE 8 8050 - 8256 173 68 aa, chain + ## HITS:1 COG:no KEGG:SMU.1031 NR:ns ## KEGG: SMU.1031 # Name: xis # Def: putative transposon excisionase; Tn916 ORF1-like # Organism: S.mutans # Pathway: not_defined # 3 68 2 67 67 105 84.0 6e-22 MSNAQDIPVWEKYTLTIEEASKYFRIGENKLRRLAEENKDAGWLIMNGNRIQIKRRQFEK VIDKLDAI >gi|223714100|gb|ACDT01000115.1| GENE 9 8335 - 9528 223 397 aa, chain + ## HITS:1 COG:SP1129 KEGG:ns NR:ns ## COG: SP1129 COG0582 # Protein_GI_number: 15900995 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 92 392 79 377 387 82 27.0 1e-15 MKEKRRDNKGRILHTGESQRTDGKYLYKYVDAFGNTKYVYAWRLTPTDPTPKGKREKPSL RELEQQIRRDIEDGIDSTGKKMTLCQLYAKQNAQRANVKKSTMKQREQLMRLLKEDKLGA RSIDTIKPSDAKEWALRMKEKGFSYNTINNHKRSLKASFYIAIQDDCVRKNPFDFKLSEV LENDTKEKIALTEEQEQALLSFIKTDNVYHKYYDDVLILLKTGLRISELCGLTVADIDFK NEVVIIDHQLLKSKEQGYYIETPKTKSGIRQVPLSRETIQAFQRVMKKRPKAEPFVIDGR GNFLFVNPKGKPKVAIDYSTLFVRMVKKYNKHHMDNPLPHITPHTLRHTFCTRLASKNMN PKDLQYIMGHSNISITMNWYAHASIDTAKSEVQRLIA >gi|223714100|gb|ACDT01000115.1| GENE 10 10239 - 10367 65 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MELSQLEQFIAIAKNQTMTKAAQQLHISQPALSIGLKKLEED >gi|223714100|gb|ACDT01000115.1| GENE 11 10525 - 11529 900 334 aa, chain + ## HITS:1 COG:YPO2806 KEGG:ns NR:ns ## COG: YPO2806 COG0667 # 
Protein_GI_number: 16123004 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Yersinia pestis # 1 317 1 318 329 248 41.0 9e-66 MQKREIGHLLVSQIGMECMGFSHGYGQVPSEKYAIEAIRNAYQEGCTFFDTAETYGKEMF EAGHNERLVGKAVEPFRKDITLATKLYLNIQEVENNGLEKVIREHLKKSMKNLKTDYIDL YYLHRINRDIPIEDVALVMGKLIQEKQIRGWGLSQVTKETIQKAHELTPVTAVQNLYSLL ERDCEDDVFPYCLDNQIGVVAFSPIASGLLSGKVTSETKFEGDDVRKFVPQLTQENIVKN QPIIDILHQFAKEKEATPAQISLAWMLHKYPNVVPIPDSKNKDRIIENLQASSITLSNEE FKKLEKALAQHQVFGHRGHNESENSDFSKKWKNK >gi|223714100|gb|ACDT01000115.1| GENE 12 11721 - 12473 672 250 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733526|ref|ZP_04564007.1| ## NR: gi|237733526|ref|ZP_04564007.1| predicted protein [Mollicutes bacterium D7] # 1 250 11 260 260 479 100.0 1e-134 MMVDKLNDVLNHVETYTTMYVFCSYTKSHIHDIADMTIEEVAKACHTSKGQISKCAKNLG FTSYLEFKDACIDYIHSFQDKPSFFSKECDLPQNTKMFAQSMSKTITYVGEKINYSHLNR LINDILRSKKVYLYAQGDNRSLCNVIQVELSTLYIPVVICDADFIKSYQFKEEHLLIILS TNGTIFQLNKRIISRLVHAEVNTWLITCNCDIEFSKNKLIVPSCKAKYNKYAIRHVIDIV IAAMRFVQQL >gi|223714100|gb|ACDT01000115.1| GENE 13 12549 - 14033 1194 494 aa, chain - ## HITS:1 COG:lin0288 KEGG:ns NR:ns ## COG: lin0288 COG2723 # Protein_GI_number: 16799365 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 2 493 6 485 486 568 57.0 1e-162 MRRFPNDFLWGGATAANQLEGAYNEGGKGLSTADMVKFVPREISCGKNTETVTYEEAIEI MNGKHDNEYYPKRWGIDFYHHYKEDIALMAEMGFRCFRMSINWTRIFPNGEEQEPNEEGL RFYDNVFDECLKYGIEPLVTLSHYETPLQLALKYNGWENRKMIEFFVHYAVTVFKRYRNK VKYWIPFNEINMSLHLPYTGGAIFIEKSQNELETIFQALHHQFIASALVTKKAREINPNF QIGSMLGMSLYYPKTNNPEDILAAQWANRVNYFFLDLLSKGEYPEYALSYMKKKNIELKM MEEDLSLLRENTVDFNSFSYYYSLCTSAHPENEEGLIAFVPENEVIDEFHPRKVRNENLE VTDWGFQIDPIGLRVAINEVWDRYHKAIIVSENGLGTYDTLTIDKKIHDDYRIKYLKEHI QQMKICIDEGVKVIGYTSWGCIDIVSAGTSERSKRYGYIYVDSDDYGHGSMKRYKKDSFY WYQKVIKTNGEDLG >gi|223714100|gb|ACDT01000115.1| GENE 14 14048 - 15349 1461 433 aa, 
chain - ## HITS:1 COG:lin2906 KEGG:ns NR:ns ## COG: lin2906 COG1455 # Protein_GI_number: 16801965 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 1 428 4 432 450 350 47.0 3e-96 MQALNSFFERYFMPFTTKLNSIKGLIAIRDAFVQIFPLTFVGSIVVCINVVLLSSSGFIG QFLVKIIPDLDDFQAVLSPVSNGTINLMAVFIVFLIARNMAKQYEVDGLKAGLTALASFF VLYPPKVDGMLTESYLGANGIFVAIITGLLVGYGFSKLTKIDKLQIKMPDQVPPEVSKSF MIAIPTAIIIILISVIAYCISLIEPQGLNALVYAILQAPIQSLGATPFTPMILIFMAMIL WSVGIHGTFTVSPIYITLYASMNIANISYAAQAGTTAGSPYPYTWFALFENYGCIGGTGN TLALIVAILILSRKKGWKRADYTKTAKIGLIPGLFCINEPIIFGLPIVLNPILVIPFILS PIVSMGLGALMISTGLVLPGTLDVGWTTPQPIKAFLSASGSWETAISVCFVFIICVLIYL PFVALANKQQTNQ >gi|223714100|gb|ACDT01000115.1| GENE 15 15744 - 16361 560 205 aa, chain - ## HITS:1 COG:CAC3658 KEGG:ns NR:ns ## COG: CAC3658 COG1285 # Protein_GI_number: 15896891 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Clostridium acetobutylicum # 1 166 15 185 229 154 49.0 8e-38 MIAAFCGCLIGYERSSRNKGAGVRTHAILALGSALIMLVSKYGFIDINEIDGSRIAAQIV SGVGFLGAGVIFVKNGNVSGLTTAAGMWATSGVGMCIGAGLYDIGIITTIILIGLQILFH RGFFLKVTQNNQNIQMEIYYEDYALKDIREVFDKYGIDIHSMKVENLKNKSMFLEMETYV NKGFQKESLVEEMMNKTYFKMLNYY >gi|223714100|gb|ACDT01000115.1| GENE 16 16431 - 17189 756 252 aa, chain - ## HITS:1 COG:lin1914 KEGG:ns NR:ns ## COG: lin1914 COG2365 # Protein_GI_number: 16800980 # Func_class: T Signal transduction mechanisms # Function: Protein tyrosine/serine phosphatase # Organism: Listeria innocua # 8 250 53 296 298 114 30.0 2e-25 MRIVNIKNFRDIKGYSSQDGFVIKPYMIFRGGALDKLTIKEQNHFVDHLKIKYILDFRDE AEASLAPDKQHENILYERISALKMQSHDQYGFDFGTMLQGEMTKEKYNYLMSYIKAGYKE MAFNNPAYHRLFELLLRNDGYVYFHCTAGKDRTGVAGFLIMIALGMSEEDAIQEYLLSNI YLKESNDELCQQLQIPEKLREECRPLLYVQRELIEIMIQSIRVKYRSYDEFLLQEYNFDN EKRRRLKEIYCE >gi|223714100|gb|ACDT01000115.1| GENE 17 17191 - 18015 754 274 aa, chain - ## HITS:1 COG:PM0616 KEGG:ns NR:ns ## COG: PM0616 COG0613 # Protein_GI_number: 15602481 # 
Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Pasteurella multocida # 5 248 6 249 277 105 29.0 9e-23 MVYADLHVHTAYSDGIHSIEETIKLAKEKGIKVIAITDHDTVFHFEEVQKVCRQNGLETI RGVEMSCYDYDVHKKVHVVGLGLNNHPYHVEKLCQKTLQCRDEYHHELIKILNQKGLNIT YEDAKEYAPYNIVFKMHLFLAIVHKYPEYNDINKYRELFMSETSLDVDSKMGYIDIKEGI KAIREDGGIAIIAHPCEYKNYDEIEKYVSYGLQGIEISHPSMESEDYPLTQELANRFHLI RSGGSDFHNIHLTAIGDFGLTKEQFDELKEEMGR >gi|223714100|gb|ACDT01000115.1| GENE 18 18028 - 19878 1844 616 aa, chain - ## HITS:1 COG:SPy0572_2 KEGG:ns NR:ns ## COG: SPy0572_2 COG1263 # Protein_GI_number: 15674662 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Streptococcus pyogenes M1 GAS # 84 454 1 364 364 204 34.0 4e-52 MKYEKLSKNIIEAVGGSNNIVTLQHCMTRLRFTLKDESKANDQKIEAIDGVLSLIKKGGQ YQVVIGTHVHDVYLDVCQIANIKEDENFGNKKEKKGIFNSIFSAIIGCVGPIIPILVGTG LGKCILLFVSMMGWANAETSMTYYVFNFVFDAGFTFLPVFTAVAAAKHFKCNMYMAALLG CALVHPQWSGIVSATDPKFIGDMFGFLPLYGMPYTSTLIPAILIVFVMSKVEFGLNKVLP ELVRGMLTPLFTLLIMTPLAFVVLAPAMGIISIYLGNALLWCYDTFGMFAIAIMCIIYPW MVATGMHATLAIAGIQILSQSGYDPFSRTLTLTANMAQGAAAFACAVKTKNRDFRNTCLS AGFTAFFAGITEPCIYGVSIRLKKPMYAVMIGSFTGGLYAGFCGLKAFAFMTPSIINLPM WVGGNNPNNLMNAIITMIISAVVTFIATLVIGFDDPKESEVKHDDNKLNQVNCPVKGKIV PLNEVNDEMFSKEVLGKGVAIIPVEGKIYAPANGIISATFETKHAIGLTTENGSEILIHV GIDTVKLEGKPFIQYVEKGEYVKAGSLLLEFDLKMIKEAGLDYTTMVVVTNSNDYLEVIP TKLKKVTKTDPVLTII >gi|223714100|gb|ACDT01000115.1| GENE 19 19966 - 20802 760 278 aa, chain - ## HITS:1 COG:BS_licT KEGG:ns NR:ns ## COG: BS_licT COG3711 # Protein_GI_number: 16080959 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 277 1 276 277 218 43.0 1e-56 MIIKKILNNNVVITTNYQMEEIIVMGKGLAYGKQAGDNIDMNKINKTFEVSLKPSQRKMI NMLKDIPLEYMEISDCVIKEAKVDLEVDDSLYISLTDHIHTSIERYKEGVYLKNHMLFEI KNFYPKEFELGLLTLKLIKEKYGINMEEDEAAFLALHIVSSEVGRNISDIYEMTNFILEI 
IGIVKDYFKLELDEDSLSYHRFVTHLKFFGLRVFNKIKQVEDITLNNDLLEIMKEKYVES YLCTSKIQSHIEKTYHYELNDEEVLYLTIHVAKIISKK >gi|223714100|gb|ACDT01000115.1| GENE 20 21222 - 21755 196 177 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733534|ref|ZP_04564015.1| ## NR: gi|237733534|ref|ZP_04564015.1| predicted protein [Mollicutes bacterium D7] # 1 177 1 177 177 317 100.0 2e-85 MKVFLSQNILFKSNEEISEYLFQQCLDNLVATKENLDLATLLKIVQLFKESEDITFLGDE NDLDEFLLLQIKLLTSGKCAYLFKVSETKTVRSYFFNENSVVVIINVCDGFFHYHDALEN AKEVGAKVIYISQDYNKEVADKVDIFYRYGVEWSINTGYVSLYYIGELLEKLCIRNF >gi|223714100|gb|ACDT01000115.1| GENE 21 22097 - 23872 1557 591 aa, chain - ## HITS:1 COG:PM1451 KEGG:ns NR:ns ## COG: PM1451 COG0627 # Protein_GI_number: 15603316 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Pasteurella multocida # 230 431 43 249 269 60 27.0 8e-09 MRQVKRWILQIGICFTLLLSLGVYSMSNVHAVNEGITITDNTTGLSEAKYQVTFVVDSTK LGENVENIQLQGGFQFIKSSEAPWYQENGASNDGIRWYSAYEYEQGMYPTGGCGNTERTE FNYNGNYILYDMVKDENLYSVTLPLPATEYFYGYFVTYSDGSAVVVQDPVNPSKKNEINN HDATWSYFYVGNSSDALAGQSYIYPRNDNMGSYQYDTYIAYDGTENCLGIYLPNGYSLGN NYKTIYLAHGNGGNETEWCQLGSAGNIVDNLIAEGELADSIIVTLNNSHFSGTGFDIKSN VILAQDVVNNVIPFIEKNYKVSTDPKDRAYAGLSAGGVAASTVMEIAPDSFGYFGIISAA VQIDDEVFTDELISKLQTKKIYLSAGTVDFGLINSFFKASILDFMLPKLDAENIDYTFEI QNGGHDWNTWRGAFTTFAKDILWNQENIEYCITDGANKNIKQGEELKIKTDIPSSLFKML IIDDQEIDRSKYTVSGDLITIILPKELISTLTIGEHILVIIANEGQAATMFNITADTNVL DIVENDQITKQSSHTAVLAPKTDDPSLFEVYCLFILLSGGAIVVIKKKYFN >gi|223714100|gb|ACDT01000115.1| GENE 22 24113 - 24481 85 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754663|ref|ZP_02426790.1| ## NR: gi|167754663|ref|ZP_02426790.1| hypothetical protein CLORAM_00166 [Clostridium ramosum DSM 1402] # 1 122 1 122 122 136 99.0 5e-31 MNYFEYSYSTSSTTTIFIFIFILLNIIARWRLFTKARRWGISAFIPIYSDIALCRIVKLS PLWLLFLLIPLVNMYFIIFYHIVISFKLSFAFGKGIMFGLGLLVFNPLFILILGFGNCEY RY >gi|223714100|gb|ACDT01000115.1| GENE 23 24642 - 25706 939 354 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754664|ref|ZP_02426791.1| ## NR: 
gi|167754664|ref|ZP_02426791.1| hypothetical protein CLORAM_00167 [Clostridium ramosum DSM 1402] # 1 354 1 354 354 646 100.0 0 MGDISNGTRILLTQYLLKRLTDEDHALSRDELIKRIGGLGTTISSNTIKADIESIKSYNN LIKDCQDELLTPFICNESGSINYATKQKKGAYVEKTYYTDAQLILLTRLLGRGLHLESQN DSKLLEKVLSDSSLYKINHYHLLDNINSTYEDNCITVYLEQIIKAINTFACVSFKELEYS VVENKVIDRPSHKTILLYPTNIDIKNDQVYLVGYLINSSEDFELAYYRIDRIDQLKITRT PKYIQAKTECFKERIWLSNDPLDIGKDNLINIELRVYLGNENIFEALYQRFKGCTFTVEQ FRRFVNVSIERVPDDKKLVSGFLQLANHVEVFKPLELRDKIKEEINQMYLMYRS >gi|223714100|gb|ACDT01000115.1| GENE 24 25886 - 26848 920 320 aa, chain - ## HITS:1 COG:L15267 KEGG:ns NR:ns ## COG: L15267 COG1073 # Protein_GI_number: 15673556 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Lactococcus lactis # 21 318 21 318 320 272 47.0 7e-73 MGKKTKIALTGSLTMLAAGIGSLFYAGNYLYNLALNKQANKDAIFTNPSTNNQIPDKDRT DINKKGDSSNFFDTIHYQDVFLDSDDGLKLHAYQFLNYGHDYVIIVHGYTSEGKLMHASA KHFYEQGYNLLLPDLRGHGQSEGDYIAMGWLDRLDIINWIKYLIDNDSKVKIILYGVSMG AATVMNVTGEKLPVNVIAAIEDCGFTSTWEMFSYQLKEMYNLPSRPFLDIANIVTQIRAG YSFGKAEAIEQVKKSTTPTLFIHGDKDRFVPFKMLNQLYQSANCPKEKLVIKGAGHAQCE KIGGQVYWSKIDSFIRKYQD >gi|223714100|gb|ACDT01000115.1| GENE 25 26851 - 27192 454 113 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01748 NR:ns ## KEGG: EUBELI_01748 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 113 1 114 114 74 35.0 1e-12 MPFISFTTNHKLTLRQENEIAKRTGELITILPGKKEENLMLHLEDNQIMYFRGDDIPCMM IAVKLYNTIDFDAKKKFTEELVKMIKETTNIEINDVYVSFDEYPNWGKQGTLF >gi|223714100|gb|ACDT01000115.1| GENE 26 27491 - 27697 348 68 aa, chain + ## HITS:1 COG:BH3610 KEGG:ns NR:ns ## COG: BH3610 COG1278 # Protein_GI_number: 15616172 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Bacillus halodurans # 4 65 3 63 65 84 77.0 6e-17 MSTGKVKWFNQEKGFGFITNDEDGKDIFVHFSAINAEGFKTLEEGQVVEFDINESDRGPQ AQNVTVKA >gi|223714100|gb|ACDT01000115.1| GENE 27 27733 - 28509 879 258 aa, chain - ## HITS:1 COG:CAC2423 KEGG:ns NR:ns ## COG: 
CAC2423 COG0561 # Protein_GI_number: 15895689 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 1 251 1 257 263 177 41.0 3e-44 MSKKIFFFDIDGTLAVKNIIPEDTKEALRKLQNLGHYVFICTGRPYIYAKYHFEKYVDGF ICANGRYIVYKEEVLLDEPLTRKQVSYFVDSFDKLECGYNFNGVNLGYAHGIDFNKLADM QTDYEHPYFIEDFKTEEIDAHMFDVHFNDQEHYQKIVSYLQDTVVLNEHFGHNSADATII GYDKGIGIKKLLTILKIERDNSYAFGDGFNDVCMFKVVGHPIAMGNGVDVAKETSEYITS DIYDGGIYKALLHYGIIE >gi|223714100|gb|ACDT01000115.1| GENE 28 28566 - 30443 1972 625 aa, chain - ## HITS:1 COG:CAC3371_1 KEGG:ns NR:ns ## COG: CAC3371_1 COG1902 # Protein_GI_number: 15896613 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Clostridium acetobutylicum # 2 361 17 400 401 254 38.0 4e-67 MKLKNRIVMGPMGTTGEADGAYNIDAINYFTERAKGGTGLIITGANVVSTKYEPRPCTEL SNFHHVERLNMLVERCHNYGAKVCVQVSPGLGRQQFTDPFTPPYSASDCGAFWFPDLKCK PFTKEDIKYIVEKVGYGASLAKMAGADAVELHAYGGYLLDQFHSKQWNTRTDEYGGSLEN RMRFTLECIAAIRTNVGPDFPILVKFTPVHKVEGFRQIDEGIAMAKILEGAGIEALHVDV GCYEAWYKAISTVYNEEGHQLDVVKAIKENVSLPVLGQGKLFDPKKAQEVVAQGVLDYVV LGHQMLADPHWANKVKQGNTMDIAPCVGCNECLLAGFSGKHYYCAVNPLCYAEKEFALPK VDGIKKSVLVIGGGPGGMYAAITARKRGYEVDLYEKEERLGGTLWAAGMPTFKHDVLKLI TYLERQCLKTGVNVYLNTEFTLKDAKKDYDKVILSAGSSPMMPPIPGIEKAGFVSEFLTD RKEAGKQIVIIGGGLAGCEAACYLKENAETVTIIEMQEDILFGCDHCLNNDQALRQMLQD NNINKITKATVTKIEDNRVTYLQDNQEYTIVCDTIINAAGFKPNDQLEDALEEIYDDVTV IGDAKAPRKILTAIHEGYHAIRVME >gi|223714100|gb|ACDT01000115.1| GENE 29 30627 - 31199 589 190 aa, chain + ## HITS:1 COG:no KEGG:NT01CX_0025 NR:ns ## KEGG: NT01CX_0025 # Name: not_defined # Def: TetR/AcrR family transcriptional regulator # Organism: C.novyi # Pathway: not_defined # 5 181 14 187 193 72 27.0 7e-12 MTKITKDIIIETAYQLFFKHGYNNVTINDICKECKITKPTFYTYIKSKDDILAHFYDDIT EAIVANTANIIMAENYWQQLLICFETLMEESIKLGYDLSSQMFIMNLKEDRGSFDFRDHL TNIAIAIIKKGQATKQIRNQNDPEILYQASAYAFTGYELMWCIKKGQFDWKKELRIALEN IYDVAPELRG >gi|223714100|gb|ACDT01000115.1| GENE 30 
31202 - 31999 546 265 aa, chain + ## HITS:1 COG:TM1295 KEGG:ns NR:ns ## COG: TM1295 COG0491 # Protein_GI_number: 15644050 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Thermotoga maritima # 4 185 6 180 218 115 34.0 9e-26 MKLYNTIKINNHSYYINDEDGDSCYLIIGNTHALLIDLGLFKEPLLPTIKNITNKELIIV CTHGHFDHIGTIKEFKEETIYLSHRDRDIYYDNAHIIKELSLIDFNQIKDLKNHQQIELG NFEIEVLALPGHTPGSMIFLDRQNKCIYTGDAIGSGCGVWLQLFHSLDLKTYHTALQQTI DYFEKQGVDDTWHFWGGHNQQEIQSKISAYNKLDFVLMKDLEQLCLKLINSEITGIKNSA PTFDDNQAYYASYGKAEIVYQKNSL >gi|223714100|gb|ACDT01000115.1| GENE 31 32084 - 33394 765 436 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754672|ref|ZP_02426799.1| ## NR: gi|167754672|ref|ZP_02426799.1| hypothetical protein CLORAM_00175 [Clostridium ramosum DSM 1402] # 1 435 1 435 964 727 99.0 0 MKTLHGLDYSLKIKNTIINHINQAQKNPFINYYFIVDDPLYFEEAFFKYTDTLFNIRIIT YNDLIKKLLEHYQLYSYQELSKLDKILITKQLIENSNNLFNTNSKMDLIYELIDIFDLFF LESFSNSELDQLPPLAKQKIATIIGLYQQLTTSLPNNTCYKYEELLFEQIDNSNAEDHYI FISEQIFKQRRFELIKKISLYNDVTILVNNSSDSRDLNKPFNKYHGDQKTVFENDNPYLS HLNKYLFSLRSPKYENHTPLHTIIQTTPKAQIESVALNIYQDIVDNHMHYHDYAIYYPNQ EYLTLLVDTLNNFKIPHNIKKSLIFKELDACLLWIRYCLNHDNNDLLDLLDTKVLSRYND FDYLDLIKKNYLEKGYLEDPFSTLYNFEKAQSLDDYSSVMINFINQEMLFSQNQTMLINF FTNLTSPQSFTLADFL Prediction of potential genes in microbial genomes Time: Thu May 26 10:22:13 2011 Seq name: gi|223714099|gb|ACDT01000116.1| Coprobacillus sp. D7 cont1.116, whole genome shotgun sequence Length of sequence - 13561 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 8/0.000 + CDS 2 - 1123 926 ## COG3857 ATP-dependent nuclease, subunit B 2 1 Op 2 . + CDS 1113 - 4814 2937 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) - Term 4721 - 4766 8.3 3 2 Tu 1 . 
- CDS 4839 - 5846 1115 ## LEUM_0151 hypothetical protein - Prom 5963 - 6022 5.1 + Prom 5679 - 5738 7.4 4 3 Tu 1 . + CDS 5915 - 6640 579 ## COG3022 Uncharacterized protein conserved in bacteria + Term 6641 - 6686 -0.5 5 4 Op 1 . - CDS 6668 - 7723 1097 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 6 4 Op 2 . - CDS 7735 - 8520 753 ## COG0561 Predicted hydrolases of the HAD superfamily 7 4 Op 3 2/0.000 - CDS 8555 - 9514 1011 ## COG0205 6-phosphofructokinase 8 4 Op 4 . - CDS 9524 - 10228 625 ## COG1073 Hydrolases of the alpha/beta superfamily - Prom 10249 - 10308 6.6 - Term 10288 - 10321 1.1 9 5 Op 1 . - CDS 10419 - 11309 1054 ## gi|167754680|ref|ZP_02426807.1| hypothetical protein CLORAM_00183 10 5 Op 2 . - CDS 11318 - 11833 573 ## gi|237733555|ref|ZP_04564036.1| predicted protein 11 5 Op 3 . - CDS 11833 - 13350 1893 ## COG0696 Phosphoglyceromutase - Prom 13499 - 13558 8.2 Predicted protein(s) >gi|223714099|gb|ACDT01000116.1| GENE 1 2 - 1123 926 373 aa, chain + ## HITS:1 COG:CAC2263 KEGG:ns NR:ns ## COG: CAC2263 COG3857 # Protein_GI_number: 15895531 # Func_class: L Replication, recombination and repair # Function: ATP-dependent nuclease, subunit B # Organism: Clostridium acetobutylicum # 25 371 775 1134 1153 128 28.0 2e-29 TMIKHYIQTNNQPDKLTVPLFSNHLSASKLETYNGCPYKYFNQYGLKLYPFKQPLFQINE IGTIIHYVLEKTQALFTDNITASKAEVDDLETVINQHVEQYLNEHGLTERLNYGTNNYII KTVKHDLVNTVIVLINQMKASDFYIVGSEVDIYRNYPDFKFSGIVDRVDQYNQYLKIIDY KSSNKDLDLSLAIQGFNIQMLLYLDTLTKQKKLDKGALLYFNTKKRILTSSLKINESELS ENFFKLYRMNGYVNSEVVEEVDNNIDKDSAIIKAKFVKKDDCYKGNILSSFSFERLIDYV SQHIENLYHELANGNIGINPKGSDDATVFTKVNPCTYCNYRSLCNFDVFYNEYTLVDSNN LEHLLEEDNSDAN >gi|223714099|gb|ACDT01000116.1| GENE 2 1113 - 4814 2937 1233 aa, chain + ## HITS:1 COG:SA0828 KEGG:ns NR:ns ## COG: SA0828 COG1074 # Protein_GI_number: 15926556 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: 
Staphylococcus aureus N315 # 8 1233 14 1215 1217 337 27.0 8e-92 MPINSLNDGQYEAVVTRGCSILVSAPAGSGKTKILVNRIMALIEDDGYNVDQLLVLTFTN AAALEMKQRLQVALDERLQEDINKPLEQHLLKQKQLLPKAYITNFHGFCSTLLKQYGYLI NLNSKFDICTDPSLIKHQILDNCIETWAEDQTFIDFISTYFPEYYFNSFKNAIFKFENLS NTIYDFDNYIDTVKKDIYYHIINNDDDAINNWPISNKIIELLHQQATMGLNKVYELASYA SSHNLSFYYQNPFGDEPKKADLPSPYDCHLKYYFDVIKAIESKNMTNIITASKASLMKSY DSRGLFNEDNELYKKEYNRLKNNIIKFYRDKFNDIVYDDFDEFKLTLSTSLEPLEKLITY LKQFKQAYQDYKLAHNLLDFNDLEANALELLEPKYGIVTLLYQQLKEIMIDEYQDTNQIQ ETLINKIADLHEPKINRFMVGDMKQSIYRFREADPEIFNEKYLTFNLLPKTKRIDLKFNY RSNKIVLDSVNYIFNQIMDKDIGGLDYYLDESAKLNYDFLRKEGAKELEDLPEVTAKASQ RLDEEARFTSEVILSYQTGTISNDAELEAKIVAKKIQDMIGHLELDNFNGTKRLADYKDI VVLMRSTRSFIAFKKIFNKFNIPSHIVLSQGFLQAPEIINLVNVLKAFNNRLDDIAFTSL LKGNYVISHFSENFLAQIKIDETISMFDNCINYLEKKLDNYEALETFINYYQDMRNYFNS HSIKESLMKFLEDSNYLPFLASLVNGPQRVANIELMIQKLDEMHDDSLNTITTKFDDMIN NGVNLSPAMVSSNDDNVVSFMTIHKSKGLEFPIVFVSNMQNKFNQQDARERIISDKKLGI AIRPRVKCDLEPYQDVIVEYENKYRKVIATAQTSEAINEEMRIFYVALTRASQKLIMTGL IKEPQDLIKWQKYVINNSEDSMINPRCKDKVILYHNARKQNSYLDWLGISLMSHSNIINQ GLKKELLDHDQTDDLVNEIKQNFQTILIHQNKHLIQENTKHSKFSIKILSHHDVEEQIIV SNKSDTILDLSSYHRYSNFEYPYPTNLEKSIAVTRKIEDGDRSFKDISYELDDSPVDAGT RGTIIHSVLEHLDIDVHLDLSDNLNKLKQANLYDNEAWQLIDKYQQHLENFITSPVYQLM VNADYLYREKEFSMLENGQIIHGIFDVVCIKDNQITIIDYKTDNLNKNTSKEILISLHKP QMDYYKKILARVFPQANIQAIVYYLYINKYVTI >gi|223714099|gb|ACDT01000116.1| GENE 3 4839 - 5846 1115 335 aa, chain - ## HITS:1 COG:no KEGG:LEUM_0151 NR:ns ## KEGG: LEUM_0151 # Name: not_defined # Def: hypothetical protein # Organism: L.mesenteroides # Pathway: not_defined # 5 335 1 327 330 94 25.0 6e-18 MNNTIIIDKILIHMLDLEHSKIIYSDTFINLTEGTTEYYDKKIEKCLENSGIKELVTGSE HHLLQAGKKMIESNEAFKEESIKITQDLFDLCTKIEEMPNANLMFVELKVDGKKFVLIIK LNYKTMPMSLIEENDGIRSIRFINQQILPNKTTAVEEAIIINVEDNILSIIEKRYMIDGK PGYYLNEQYIKGEPKLTDKQKMSIVNKVVKKVDSEYNVVEGDPLPLVKKELVDLVMDHRP VKPMELAKKVMGNDYNATEEVELIMRDLGIEEDDEIVNVPVSLDRMSRCKLVLDDDRIIE LNVEDYLEGVDIVKEMDEGGMTRIILKNIKDIVVK >gi|223714099|gb|ACDT01000116.1| GENE 4 5915 - 6640 579 241 aa, chain + ## 
HITS:1 COG:FN1762 KEGG:ns NR:ns ## COG: FN1762 COG3022 # Protein_GI_number: 19705081 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 232 1 245 248 128 36.0 1e-29 MKIIIAPAKVMKTKQVNIETTDILFKDKTTQLHNYLMKFNLDQLHDIMKISFKMADTVYS YFHNEYTAVPALYCYQGTVFKQLSLKDYDADDFVYLEQYLNILSAYYGILKYNSAITPYR LDMIMKIDFNLYQFWQDEIDNYFKNEDYIISLASKEFTKMLSHPHIINIDFVEDKAGKMV RNSMYVKQARGKMLDLMVKNKISDLDSLKRLTFDGYIYHEELSTDDNYVYIRNGKQTYKK L >gi|223714099|gb|ACDT01000116.1| GENE 5 6668 - 7723 1097 351 aa, chain - ## HITS:1 COG:L0065 KEGG:ns NR:ns ## COG: L0065 COG0079 # Protein_GI_number: 15673188 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Lactococcus lactis # 1 351 1 351 360 420 59.0 1e-117 MSWQDKLREVEPYVAGEQPKIVNMIKLNTNENPYGPSPKIKEVLTSLDIDRLRLYPNSDA VDLRKELANYYGLEEEQVFLGNGSDEVLALIFLTFFNGKEPILFPDITYSFYPVYCELYQ MAYKLVKLDRDFKINLNDFMQPNSGIIFPNPNAPTGLLVDLDFIETILKNNPDSIVVIDE AYIDFGGTSCVSLIDKYDNLVVTQTYSKSRSLAGIRLGIALGSKEAIRHLYDVKNSFNSY PVDYVAQQICLASILDDQDTKAKCAKVIKTREMTKTRLKELGFIVPDSYANFVFVKHPDV EGQELFTALRQVGIIVRHWNTKRIDQYLRISIGTDEEMERMINFLEEYLNK >gi|223714099|gb|ACDT01000116.1| GENE 6 7735 - 8520 753 261 aa, chain - ## HITS:1 COG:CAC0629 KEGG:ns NR:ns ## COG: CAC0629 COG0561 # Protein_GI_number: 15893917 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 1 250 1 250 268 60 24.0 3e-09 MKKILFCDMDGTIIDLDGLLYPQDKEKVEKLHQAGHLFAFNTGRNYEEALRSTNKLDLYY DYLVLNNGAHIVDRNDQELFKRVISKQAGIDIIEHCLKYDDMWIYFYDGKKTLGYLRGQT YIYDEVGRSIVTNEHDFIKEYPNVEEFDIIAINQDNQQIDEVLKIQKYIDEHYADEAHGT LNMHYLDITPSGCTKGTGISNLVSLLNEDVVSYAIGDSYNDISMFEHADHGYTFNRADEL IKKHSDKQVDYLSELIDEMLK >gi|223714099|gb|ACDT01000116.1| GENE 7 8555 - 9514 1011 319 aa, chain - ## HITS:1 COG:VC2689 KEGG:ns NR:ns ## COG: VC2689 COG0205 # Protein_GI_number: 15642684 # Func_class: G 
Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Vibrio cholerae # 1 318 13 330 331 304 49.0 1e-82 MKKIAVLTSGGDAPGMNGAIRAIVRVGIKHGFEVYGVYEGYRGLVEGKIARLGYRDVSEI LAKGGTILESSRLPEFTLDEVQDKAINNLKEHEIDSLVVIGGDGSYRGAQALAKKGINCI GIPGTIDNDINGTDETIGFHTALCNIIDAVNKLRDTSSSHHRCFVVEVMGNNADNLALYA AISCGSELVITNKTGYDEEHVIKELIRNEIVYNKRHAIVIASEKILDVEKLAQRISEETG FSGRSIVLGHIQRGGSPVPEDRILAARMANKAIELLQDNQSGKCVCLQDGKIVAKDIETA LQEVNKSNQMLYQLFDDLV >gi|223714099|gb|ACDT01000116.1| GENE 8 9524 - 10228 625 234 aa, chain - ## HITS:1 COG:CAC3022 KEGG:ns NR:ns ## COG: CAC3022 COG1073 # Protein_GI_number: 15896274 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Clostridium acetobutylicum # 6 231 4 226 228 210 48.0 2e-54 MKKKYLKIDNIPAVIWGAGSSKVFLAVHGNKSNKEDIVIEMLAKTAVNRGYQIVSFDLPK HGDRVYEDTLCNAQNCTEELLKIYDYVKGKYEKISLWACSLGAFFSLLAYQCVKFNQCLF LSPVVDMQRLIENMMIWFEINEEKLKEQKMIETPIGETLYYDYYCYVKEHSVTEWNSQTM ILYGENDNLCEYEYIKKFASHFNCDLKIMENGEHFFHLKNQLEFYQNWLEEKVK >gi|223714099|gb|ACDT01000116.1| GENE 9 10419 - 11309 1054 296 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754680|ref|ZP_02426807.1| ## NR: gi|167754680|ref|ZP_02426807.1| hypothetical protein CLORAM_00183 [Clostridium ramosum DSM 1402] # 1 296 1 296 296 550 100.0 1e-155 MKQCPHCKKNIPDSAKVCPHCGTRLEKGYQPMKRTNTFPNYLYGILALVLIFSPLISSML FGNFLGDGTASNTVTTPDKAITLGPLGEVNINKEVTEYQFGSLKDFDKLVTNSDSYVKKI KQLENDLTAITDKYGKTTIDKDYDFYVTDQNNVYTSLNYDLKIGKNETMSISFSYDLSGT TNAVNIGYTINGFKDFEAMKINEQSYPMLKEIVKLINGDDMYVSFNKAGEKFNQLENDFN ERNESIGNYGIGITQSEDDTKVSMRILSSEDGYRLKITYKTKADMNKLVGNSKGTE >gi|223714099|gb|ACDT01000116.1| GENE 10 11318 - 11833 573 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733555|ref|ZP_04564036.1| ## NR: gi|237733555|ref|ZP_04564036.1| predicted protein [Mollicutes bacterium D7] # 1 171 1 171 171 322 100.0 8e-87 MKKLLSLIVCVLMISGCQDASINRIKKDNSLSGLHEITYDELENKLSSDDTFVLYIGRPD CGDCKEFHPILTSYLEENKGTYLYYLNIQAFRDAAMQEGASEKEIDFYKNIREELDFNWT 
PILKLVNKGETISEYTYLSEEYYEIKDDTKKAQAKEKFVEDFKTWMANIYE >gi|223714099|gb|ACDT01000116.1| GENE 11 11833 - 13350 1893 505 aa, chain - ## HITS:1 COG:BS_pgm KEGG:ns NR:ns ## COG: BS_pgm COG0696 # Protein_GI_number: 16080444 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Bacillus subtilis # 1 505 1 510 511 592 58.0 1e-169 MKKRPIVLCILDGYGLSERIDGNAVKLANTPNIDDLEMIYPTTRIKASGMPVGLPDGQMG NSEVGHLNIGAGRTVYQSLTLINKAVEDGTFYKNAEFLKAINNAKENNTKLHIWGLLSNG GVHSSNEHIYALLKLAKQEGLEKVYVHAFLDGRDVAPDSGADFVKELADKIEEIGVGEIA TVSGRYYAMDRDKRFDRVELAFDAIVNHKGESFECPVQYVKDSYAKETYDEFVIPGYNKN VDGQVADGDSVIFANFRPDRAIQLATVMTNDGFYDHAFNNVPKNLTFVCMMKYADSVNGA IAFALPSLTNTLGDYLSAKGMKQLRIAETEKYAHVTFFFDGGVDKEIEGATRVLVNSPKV ATYDLQPEMSAYEVKDKLIEELDKDIHDVVIVNFANCDMVGHTGIIPAAIKAVSVVDECV GEVYNKVLELGGTMLITADHGNSEMLLDEDNNPFTAHTTNEVPLIVTNSHLELKEGGKLG DLAPTILQLLGLEIPAEMDGESLLK Prediction of potential genes in microbial genomes Time: Thu May 26 10:22:47 2011 Seq name: gi|223714098|gb|ACDT01000117.1| Coprobacillus sp. D7 cont1.117, whole genome shotgun sequence Length of sequence - 35731 bp Number of predicted genes - 40, with homology - 39 Number of transcription units - 23, operones - 10 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 726 491 ## Acid345_4096 GumN + Term 735 - 769 2.0 - Term 533 - 578 1.1 2 2 Op 1 . - CDS 756 - 941 217 ## gi|237733558|ref|ZP_04564039.1| predicted protein 3 2 Op 2 . - CDS 991 - 1371 384 ## gi|237733559|ref|ZP_04564040.1| predicted protein 4 2 Op 3 . - CDS 1352 - 1942 538 ## COG0558 Phosphatidylglycerophosphate synthase 5 2 Op 4 . - CDS 2004 - 2948 933 ## COG1335 Amidases related to nicotinamidase 6 2 Op 5 1/0.250 - CDS 2941 - 3855 968 ## COG2267 Lysophospholipase - Prom 3875 - 3934 3.5 - Term 3896 - 3943 2.2 7 3 Op 1 . - CDS 3960 - 5150 948 ## COG0628 Predicted permease 8 3 Op 2 . - CDS 5198 - 7858 2599 ## COG0474 Cation transport ATPase - Prom 7879 - 7938 4.8 - Term 8110 - 8147 4.5 9 4 Tu 1 . 
- CDS 8349 - 9086 861 ## COG3142 Uncharacterized protein involved in copper resistance + TRNA 9331 - 9407 74.6 # Arg CCT 0 0 - Term 9404 - 9448 2.7 10 5 Op 1 . - CDS 9640 - 11100 737 ## COG0582 Integrase 11 5 Op 2 . - CDS 11120 - 11320 152 ## gi|237733568|ref|ZP_04564049.1| conserved hypothetical protein - Prom 11367 - 11426 4.1 + Prom 11310 - 11369 2.5 12 6 Tu 1 . + CDS 11463 - 11720 69 ## gi|237733567|ref|ZP_04564048.1| conserved hypothetical protein + Term 11755 - 11820 9.0 13 7 Op 1 . - CDS 11860 - 12099 301 ## CD3328 hypothetical protein 14 7 Op 2 . - CDS 12096 - 12515 228 ## CD3329 hypothetical protein - Prom 12629 - 12688 3.9 + Prom 12966 - 13025 8.4 15 8 Tu 1 . + CDS 13116 - 13472 240 ## CD3330 putative transposon-related DNA-binding protein + Term 13694 - 13739 0.1 16 9 Tu 1 . - CDS 13505 - 14200 302 ## CD3331 hypothetical protein - Prom 14251 - 14310 7.8 + Prom 14293 - 14352 10.2 17 10 Tu 1 . + CDS 14380 - 14940 509 ## COG1309 Transcriptional regulator + Term 14973 - 15020 4.2 - Term 14959 - 15010 3.7 18 11 Op 1 . - CDS 15016 - 15924 427 ## CD3335 hypothetical protein 19 11 Op 2 . - CDS 15941 - 16945 627 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 20 11 Op 3 . - CDS 16942 - 17946 188 ## CD3337 conjugative transposon membrane protein + Prom 18414 - 18473 2.0 21 12 Tu 1 . + CDS 18514 - 19176 -394 ## 22 13 Tu 1 . - CDS 19308 - 20717 58 ## CD3338 hypothetical protein - Prom 20927 - 20986 3.0 23 14 Op 1 . - CDS 21742 - 22302 -81 ## CD3339 conjugative transposon membrane protein 24 14 Op 2 . - CDS 22316 - 22711 253 ## CD3340 putative conjugative transposon antirestriction protein 25 14 Op 3 . - CDS 22737 - 23231 140 ## smi_1327 hypothetical protein - Prom 23335 - 23394 3.0 - Term 23386 - 23443 -0.8 26 15 Tu 1 . - CDS 23632 - 24321 174 ## COG2946 Putative phage replication protein RstA - Prom 24348 - 24407 2.5 - Term 24825 - 24873 -0.8 27 16 Tu 1 . 
- CDS 25007 - 25978 140 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins - Prom 26000 - 26059 1.8 28 17 Op 1 . - CDS 26443 - 26820 277 ## CD3345 hypothetical protein 29 17 Op 2 . - CDS 26842 - 27156 155 ## CD3346 hypothetical protein - Prom 27273 - 27332 7.3 - Term 27280 - 27315 4.4 30 18 Op 1 . - CDS 27341 - 29095 961 ## COG5293 Uncharacterized protein conserved in bacteria 31 18 Op 2 . - CDS 29082 - 29330 216 ## gi|283845702|ref|ZP_06363179.1| hypothetical protein BcellDRAFT_1681 32 18 Op 3 . - CDS 29354 - 30187 558 ## pRALTA_0033 hypothetical protein - Prom 30315 - 30374 6.0 - Term 30870 - 30921 2.0 33 19 Op 1 35/0.000 - CDS 30991 - 31182 143 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 31371 - 31430 4.0 34 19 Op 2 35/0.000 - CDS 31432 - 31650 120 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 35 19 Op 3 . - CDS 31632 - 31856 200 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 32011 - 32070 7.1 + Prom 31737 - 31796 8.0 36 20 Tu 1 . + CDS 31947 - 32402 267 ## COG0583 Transcriptional regulator - Term 32821 - 32877 4.1 37 21 Tu 1 . - CDS 32952 - 33839 917 ## COG3588 Fructose-1,6-bisphosphate aldolase - Prom 33943 - 34002 11.3 + Prom 33938 - 33997 10.2 38 22 Tu 1 . + CDS 34029 - 34361 391 ## PROTEIN SUPPORTED gi|29375603|ref|NP_814757.1| hypothetical protein EF1023 + Term 34372 - 34423 8.5 - Term 34358 - 34412 9.2 39 23 Op 1 . - CDS 34562 - 35389 882 ## COG0627 Predicted esterase 40 23 Op 2 . 
- CDS 35413 - 35688 157 ## COG3933 Transcriptional antiterminator Predicted protein(s) >gi|223714098|gb|ACDT01000117.1| GENE 1 1 - 726 491 241 aa, chain + ## HITS:1 COG:no KEGG:Acid345_4096 NR:ns ## KEGG: Acid345_4096 # Name: not_defined # Def: GumN # Organism: A.bacterium # Pathway: not_defined # 93 239 150 282 287 75 31.0 1e-12 RDQIDKLDITTEKAIKDSQSIVLECSLDQNETAKYQNYLLENSLGELDLINYIPKLKENY PTLKKYRVNEFNAMAVSSLVTSDILDEVDGNNSTSIDAYLFNLTRSKKIPFEEMEGIEFQ MQLFSQLSKESPHAILESLKDRQQLIKSTKLILDSYYNHDLDTLATIYSYQELPDNEYQQ EYQNYQQLLIIDRNQTMQAKIIEYLNQNKVVFIGVGVGHVVGDLGLITSLSAAGYKVEKL G >gi|223714098|gb|ACDT01000117.1| GENE 2 756 - 941 217 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733558|ref|ZP_04564039.1| ## NR: gi|237733558|ref|ZP_04564039.1| predicted protein [Mollicutes bacterium D7] # 1 61 13 73 73 105 100.0 1e-21 MDFRKNAGIVLIILGMILTMDRTKDFQGIVSTIAYYIKGYWPLIFCFIGMYILSSPKKKK K >gi|223714098|gb|ACDT01000117.1| GENE 3 991 - 1371 384 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733559|ref|ZP_04564040.1| ## NR: gi|237733559|ref|ZP_04564040.1| predicted protein [Mollicutes bacterium D7] # 1 126 1 126 126 210 100.0 2e-53 MNKREIKEECYLDMLEDELNSVDAVLNHIEKVEIKEGIFDEHVIQKDLIRAYFDLELALA SLCILLRKMSENLFIHIDEDIRRDINSIIHSNKFEYHDHDRIYVYSQKGKEPVDLNNLLR FARSIL >gi|223714098|gb|ACDT01000117.1| GENE 4 1352 - 1942 538 196 aa, chain - ## HITS:1 COG:CAC3596 KEGG:ns NR:ns ## COG: CAC3596 COG0558 # Protein_GI_number: 15896830 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Clostridium acetobutylicum # 11 181 2 170 174 82 33.0 7e-16 MNMKDKTKLFNIPNCLCYFRILLIPVFLFVYFNAQWQHHYIIAAFILVLSGISDCLDGYI ARKYNMITDFGKLIDPIADKLTQFTIAITLLFTYPLAWVLLIIIVLKDGMLGIVGLYLYD YGLKIKGASWWGKIATAYFYFVVIILIGYHIPDTFASQLMIITSSALMLLSFVLYAKELK QMIKEKDKLVDEQKRN >gi|223714098|gb|ACDT01000117.1| GENE 5 2004 - 2948 933 314 aa, chain - ## HITS:1 COG:CAC3465 KEGG:ns NR:ns ## COG: CAC3465 COG1335 # Protein_GI_number: 15896704 # Func_class: Q Secondary metabolites 
biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Clostridium acetobutylicum # 1 169 1 172 178 146 43.0 5e-35 MRKRCLLIIDVQTALTSQYPVDENKMIEHIRKLIRVCHNQHLEIVYVRHHDEELILNSDG WQIDSRITPRQDDKIFEKEYNSAFKNTNLDAYLKSQGIQDLIIVGMQSEYCIDTSIKVAF ELGYRVIVPKDATTTFDNDIISGKVLKQYLEEKIWKNRFAQVIEVDTLLMMIESKLDFKF SYGKTSFKDAALIRQAVFVEEQGFEKEFDKLDNTAYHLVIYKDEQPIAVGRMYFKDKTTM ILGRIAALKEYRGQKLGSKVVTALENKARELGCLETELSAQQQAQKFYEKLGYKPDGDMY YDEWCPHVTMKKIL >gi|223714098|gb|ACDT01000117.1| GENE 6 2941 - 3855 968 304 aa, chain - ## HITS:1 COG:lin1226 KEGG:ns NR:ns ## COG: lin1226 COG2267 # Protein_GI_number: 16800295 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Listeria innocua # 23 295 24 299 306 161 34.0 1e-39 MEYLKINSISDKLPLDVIVSAPEHPKAIFQIVHGMCEHKERYLDFIEYLNDCGYVVIIHD HRGHGKSVLDETDLGYFYGEGARAIVEDVHQLTNYIKKKYPNLPVCLFGHSMGSLVVRNY IQKYDHEINALIVCGSPSKNKLAGLGKLLCKAIAMVKGDKYHSKLLQKMSFGAFNKGFDK PNEWICSNSQVVDEYNNNPLCTFTFSVNGFYNLLSLMQNTYKNIDYEANKKLPVLFISGK EDPCLINEKAFNNAVTHLKKQGYQHVISILFEHMRHEILNEKYKETVYSTITTFLVDTME ENDA >gi|223714098|gb|ACDT01000117.1| GENE 7 3960 - 5150 948 396 aa, chain - ## HITS:1 COG:SPy1117 KEGG:ns NR:ns ## COG: SPy1117 COG0628 # Protein_GI_number: 15675097 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Streptococcus pyogenes M1 GAS # 22 387 19 376 382 135 31.0 1e-31 MRNFNLDDDLKRHVKPGLIISVFTVFLIYLLMNLGNIYGMISTLLGTLVYLFYGIIIAYI LNQPMKLIEKQITKYCTKTSFLYRKKRGLSIIATLILFICLIVLIASIVIPNLISSLISL ISNTSAFLTSVFNNIDDIFIYFNIDFRMENIASVKDLINMPWQDMVGQALDILTRSAGGI MNNATNFLSTFGVVFTGFIFSLYLLGNKETFLRQLRKAIGALCGYKVTCVIFDYAHKTNE VFSNFISGQLVEACILWVLYYVTMKLFNFPYPELIATIISIFSFVPFFGPIAAMFVGAVL ILSKDALMAIWFMVYFQILSQLEDNFIYPRVVGNSVGLPGIWVLLSIFVFGDLFGVFGMV IAVPSAACLYSLASEVINMILKKRKLVITENTIEQK >gi|223714098|gb|ACDT01000117.1| GENE 8 5198 - 7858 2599 886 aa, chain - ## HITS:1 COG:L85514 KEGG:ns NR:ns ## COG: L85514 COG0474 # Protein_GI_number: 15673239 # Func_class: P Inorganic 
ion transport and metabolism # Function: Cation transport ATPase # Organism: Lactococcus lactis # 5 886 21 910 910 1029 59.0 0 MRKFNSINERLVFASENKTNQVLDSFETNKMGITNETVKHHSDEFGKNIITKQKKDSILK KIFNAFINPFTAILLILAVVSLFTNVIFAETSEKDPTTVIIIVVMVMISGILRFIQEQRS GSAAEKLIAMISNTTNIKRYGQEAKEIPIDEVVVGDIVYLSAGDMIPGDLRIIEAKDLFI SQASLTGESEPVEKFAIETTVGTNVLEAQNLAFMGSDVISGSAVGVVIATGDETMLGRIS VDLNKKRELTTFEIGINSVSWLLIRFMLVMVPVVLFINGFTNGDWLDASLFALSVAVGLT PEMLPMIVTTSLAKGSLAMAKEKTIIKNLNSIQNLGAIDILCTDKTGTLTQDEVILEFPL DVHGKIDLRVLRHAFLNSYYQTGLNNLMDKAIINSTLAEQDNDSSLKDLTNKYEKIDEIP FDFQRRRMSVIIQDQDGKVQMVTKGAIEEMLSVCRYVEYLGKVWPLTQKLEKIIINQVEQ LNEKGLRVLGVSQKTDQDLVQKCTVSDEKDMVLIGYLAFLDPAKESTAPAIKALKEHGVA TKILTGDNEKVTKAICQKVGLNVENILLGQDVAKMGLDKLKEVVETTTIFAKLSPEQKAL IIKVLKENGHSVGYMGDGINDALALKASDVGISVDSGVDIAKEAADVILLDKDLMVLEKG LVEGRKVYANMIKYIKMTASSNFGNMFSVLIASAFLPFLPMAPIQLLLLNLIYDIACITL PFDRVDEEYLKIPRTWEASSIGRFMLWMGPISSIFDIMTYVLMFYFIAPIMAGGSYQSLT NPDYFIAVFQTGWFIESMWSQILVIHLIRTAKVPFVQSKPALFVTIFTLLSAFVLTLIPF SHLAKAIGLTSLPAYYFGLLIIIVILYIALTTVVKRVYLNKYHEWL >gi|223714098|gb|ACDT01000117.1| GENE 9 8349 - 9086 861 245 aa, chain - ## HITS:1 COG:YPO2048 KEGG:ns NR:ns ## COG: YPO2048 COG3142 # Protein_GI_number: 16122287 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Yersinia pestis # 5 203 4 200 254 128 39.0 8e-30 MDKIVEVCCGSYYDALQAQYGGARRIELNSALHLGGLTPSLATLLKVKDNTDLEVICMVR PRGAGFCYNDEDFEVMKLDAEILLDNGADGIAFGCLDEEGDINIPQTREMIDIIKSYHKT AVFHRAIDCVNDIDEAMNILITLGVDRILTSGLQGKATRGKEMIKYLYDAYGENIEILAG SGINAGNAKKMLDYTGIKQLHSSCKDWHTDSTTSRDEVNFSYHGDDYEVVSQELVEILVE LVEGE >gi|223714098|gb|ACDT01000117.1| GENE 10 9640 - 11100 737 486 aa, chain - ## HITS:1 COG:mlr0475 KEGG:ns NR:ns ## COG: mlr0475 COG0582 # Protein_GI_number: 13470699 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 21 442 26 395 399 79 22.0 2e-14 MASIKQRKSKFSVIYWYMDDTGERKQKWDTLETKKEAKARKAFIEFYQQTYGYVLVPLEE 
QFAHQREEAEQAIDTPDDEITLKEFLVTFVNLYGVSKWSANTYSSKLSSINNYINPIIGD WKLNEITTKKLSQYYNDLLSVPEVPRANRKATGRCVQPANIKKIHDIIRCALNQAIRWEY LDSKMRNPASLATLPKVPKVKRKVWNVQTFKEAIKLVDDDLLLLCMHLAFACSLRVGEIT GLTWDDVIVDEEAIANNNARVIVNKELARISQSAMQKLKEKDIIKIFPTQKPHCTTRLVL KTPKTETSNRTVWLPTTLAQLLVQYKKDQQELKEFLGTAYNDYNLVIALENGNPVESRIV RDRFTTLCEEHNFEVVVFHSLRHLSTKYKLKMTHGDIKSVQGDTGHAEAEMVTDVYSEIV DEDRRLNAKKLDEEFYDTLDTEEPEHNLHSEAQEPISDNDKLLLELLKSMTPEMKEKLLK ETLSNC >gi|223714098|gb|ACDT01000117.1| GENE 11 11120 - 11320 152 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733568|ref|ZP_04564049.1| ## NR: gi|237733568|ref|ZP_04564049.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 66 49 114 114 130 100.0 2e-29 MLDGKENTHGITTSEKKTYTVDEIAALLNISMKSAYALVKSGQFHYIRAGRMIRVSKISF DKWLHE >gi|223714098|gb|ACDT01000117.1| GENE 12 11463 - 11720 69 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733567|ref|ZP_04564048.1| ## NR: gi|237733567|ref|ZP_04564048.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 85 1 85 85 165 100.0 8e-40 MGISPLHGSHPAVPFVSGATSSHRLRLYGSIIITIERSWRQLCGKLLMERWQESLSGIIP PLVGYGVPAQRLSFYRNVFGCFMLR >gi|223714098|gb|ACDT01000117.1| GENE 13 11860 - 12099 301 79 aa, chain - ## HITS:1 COG:no KEGG:CD3328 NR:ns ## KEGG: CD3328 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 79 1 79 79 145 100.0 6e-34 MTAKHPMIPFPVIVRATDGDIEAVNQIVRHYSGFIASRSMRPMKDEYGNTHMVVDETLRR RMETRLIAKILSFEIREPN >gi|223714098|gb|ACDT01000117.1| GENE 14 12096 - 12515 228 139 aa, chain - ## HITS:1 COG:no KEGG:CD3329 NR:ns ## KEGG: CD3329 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 139 2 140 140 252 100.0 3e-66 MKPSEFQTTIENQFDYICKVAMEDERKDYLKALSRQCKRETLFCDMDDYTVNLFSSEDTY PSHFHTFEMDGFTVRIENSLLAEALESLDGKKRDVILRYYFLGFDDTEISKILEVNRSTI QRRRHAGLEFIKKFMEDEA >gi|223714098|gb|ACDT01000117.1| GENE 15 13116 - 13472 240 118 aa, chain + ## HITS:1 COG:no KEGG:CD3330 NR:ns ## KEGG: CD3330 # Name: not_defined # Def: 
putative transposon-related DNA-binding protein # Organism: C.difficile # Pathway: not_defined # 1 118 1 118 118 211 100.0 9e-54 MRKKEDKYDFRALGLAIKEARKKQGLTREQVGAMIEIDPRYLTNIENKGQHPSLQVLYDL VSLLNVSVDEFFLPASSQVKSTKRRQLENKIDNFTDADLVIMESVADGIVKSKEVGEM >gi|223714098|gb|ACDT01000117.1| GENE 16 13505 - 14200 302 231 aa, chain - ## HITS:1 COG:no KEGG:CD3331 NR:ns ## KEGG: CD3331 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 231 1 231 231 451 99.0 1e-126 MCTRFVYNGKETIVGFNFDIDLSEWEHTVIAEKDRFFIGIKMSDNKYHSFHGINRNGNVG TLLYVHGNDNAQFCGNESCYTIADLTENFIKGNLSFDDSLEIVKKKKITYAPDTTMQAMF SDRNGRVLIIEPGIGYRLEKEKYSLITNYSILKPELTNPYVLSGDNRYEKAKDLLQGYGE NFSISNAFDVLRSVRQEGLWATRVSFIYSVAKNKVYYVLNNDFKNIAEYQF >gi|223714098|gb|ACDT01000117.1| GENE 17 14380 - 14940 509 186 aa, chain + ## HITS:1 COG:BS_yobS KEGG:ns NR:ns ## COG: BS_yobS COG1309 # Protein_GI_number: 16078967 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 3 179 7 182 191 92 30.0 4e-19 MRVTKAEVIKTASDMADRNGLHNVSLKAIAENLGIRTPSLYNHIGSLDELLREIAHSGMR TMNEKMIRAAIGKTGDSALKLVAVEYLNYMIEHPGVYEIIQWASWNGTEETAIIFNDYLS LLKTLICSCGFNPDKTTEILNMVTGMLHGYTTLQLRYAFSNPDKVRKELSEAIDTLLLGA NQKYKD >gi|223714098|gb|ACDT01000117.1| GENE 18 15016 - 15924 427 302 aa, chain - ## HITS:1 COG:no KEGG:CD3335 NR:ns ## KEGG: CD3335 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 302 1 302 302 496 98.0 1e-139 MIQIRKEDNQKKQKEKKLKVYKVNTHKKTVIALWVLLAVSFLFAVYKNFTAIDIHTVHET KVIEEQILDTHKIENFVKNFAEVYYSWEQSAASIDNRTNALKGYLTGELQALNVDTVRKD IPVSSALTDFQIWEITEEKEQHYQVTYTVEQHITEGESGKTVRSAYQVTVYVDGSGNLTI IQNPTITSVPVKSGYTPKAVQSDGTVDSITTEEINEFLTTFFKLYPTATAKELTYYVNEG VLKPVGKEYIFSELVNPVYNRNGNQVTASLAVKYLDNQTMTTQVSQFDLVLEKNGENWKI VK >gi|223714098|gb|ACDT01000117.1| GENE 19 15941 - 16945 627 334 aa, chain - ## HITS:1 COG:BS_yddH_2 KEGG:ns NR:ns ## COG: BS_yddH_2 COG0791 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus subtilis # 214 333 5 124 124 140 58.0 3e-33 MKLRHLFFACSGVFVMMFSLLLLVVIVFSDEEDGGSGGNLIFGGVSVSQEVLAHKPMLEK YAREYGIEEYLNVLLAIIQVESGGTLEDVMQSSESLGLPPNSLSTEESIKQGCKYFSELL AAAETKGCDLNSVIQSYNYGGGFLDYVAGRGKKYTFELAESFARDKSGGKKVTYTNPIAV EKNGGWRYSYGNMFYVLLVSQYLTVAQFDDETVQAIMEEALKYEGWTYVYGGDSPSTSFD CSGLVQWCYGKAGIALPRTAQEQYNVTQHIPLSEAKAGDLVFFHSTYNAGTYITHVGLYV GNNRMYHAGNPIGYADLTGSYWQQHLAGAGRIKK >gi|223714098|gb|ACDT01000117.1| GENE 20 16942 - 17946 188 334 aa, chain - ## HITS:1 COG:no KEGG:CD3337 NR:ns ## KEGG: CD3337 # Name: not_defined # Def: conjugative transposon membrane protein # Organism: C.difficile # Pathway: not_defined # 1 334 455 793 793 544 87.0 1e-153 MGSRIMRRPRMLMHAHMHRLQHKLGRSVAAFGTGTAAYHAGKQAGSDHRNASHFGSSKRT QADHSRPDGQAAPEKESVWKRAGSAVGTVADTKDKISDTAGQLREQAKDLPVNAKYALYH GKTQVSEGVRDFTSSVTQTRTARAEQRNAQAESRRHTIAERRAELEQAKQPQQKASEAPK GAAPVHERPVTAKQPEDFRHHAVKPAMQPASLSIRERGQVPYGETVAEQASVPVVKAASI HHEQTPPVRTERQIVPSASLSQPNERQKTAPTITQATPRPARPVQNDTAPVIPERKRAAP AVKESNFTIRRTTARKEWTKTVKAAAKQKKGEKP >gi|223714098|gb|ACDT01000117.1| GENE 21 18514 - 19176 -394 220 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLDLNGKQAVPYQIHAVFSLTFRVRQHNLCTKGQAGIADVCGKFIDFPDIIRSISDKGSG QHEHHHEIDNGMDGFCDFPFDKPCVGNVNPHHQNENQQKPDIKSLRGQAACGNACKGLHI FPNGIRRRGNEVQRVGFLNQIPRGVGKIQADCPDKICDGIQPVHHLFSDTVQPVPREPVP SAVHIKIKLVVVQRIFAVFVGSIHRIIHQSGSVGGCAEDC >gi|223714098|gb|ACDT01000117.1| GENE 22 19308 - 20717 58 469 aa, chain - ## HITS:1 COG:no KEGG:CD3338 NR:ns ## KEGG: CD3338 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 469 348 816 817 926 99.0 0 MYKLSYVVRVTAPDLEELKRRCNEVKDFYDDLNVKLVRPFGDMLGLHGEFLPASKRYLND YIQYVTSDFLAGLGFGATQMLGETEGIYIGYSLDTGRNVYLKPALASQGVKGSVTNALAA AFVGSLGGGKSFNNNMIVYYSVLFGAQALIVDPKAERGRWKETLPEIAHEINIVNLTSEE QNRGLLDPYVIMENPKDSESLAIDILTFLTGISSRDGEKFPVLRKAIRAVTNSEERGLFK VIEELRAEGTTISTSIADHIESFTDYDFAHLLFSDGDVTQSISLEKQLNIIQVADLVLPD 
KETSFEEYTTMELLSVAMLIVISTFALDFIHTDRSVFKIVDLDEAWSFLQVAQGKTLSMK LVRAGRAMNAGVYFVTQNTDDLLDEKLKNNLGLKFAFRSTDINEIKKTLAFFGVDSEDEN NQKRLRDLENGQCLISDLYGRVGVIQFHPIFEDLFHAFDTRPPVRKEVE >gi|223714098|gb|ACDT01000117.1| GENE 23 21742 - 22302 -81 186 aa, chain - ## HITS:1 COG:no KEGG:CD3339 NR:ns ## KEGG: CD3339 # Name: not_defined # Def: conjugative transposon membrane protein # Organism: C.difficile # Pathway: not_defined # 57 186 1 130 130 245 100.0 4e-64 MLPMGVTWNWREPLLSRTTAYMKSYDRPVCISFGKCRRLFYFRAALLPHLLERTDTMKKI RSYTSIWSVEKVLYSINDFKLPFPITFTQMAWFVVSVFAVMLLGNLPPLSFIDGAFLKYF GVPFALTWFMCQKTFDGKKPYGFLKSVLAYLVRPKLTYAGKPVKLEKEYPAQPITAVRSD IYGISD >gi|223714098|gb|ACDT01000117.1| GENE 24 22316 - 22711 253 131 aa, chain - ## HITS:1 COG:no KEGG:CD3340 NR:ns ## KEGG: CD3340 # Name: not_defined # Def: putative conjugative transposon antirestriction protein # Organism: C.difficile # Pathway: not_defined # 1 131 4 134 165 229 100.0 3e-59 MRIYIANLGKYNEGELVGAWFTPPVDFEEVKERIGLNDEYEEYAIHDYELPFEIDEYTPI EEVNRLCEMVEDLPEYIQEELSELQSYFGSIEELCEHEDDIICHSGCDDMADVARYYLEE SGQLGELPAHL >gi|223714098|gb|ACDT01000117.1| GENE 25 22737 - 23231 140 164 aa, chain - ## HITS:1 COG:no KEGG:smi_1327 NR:ns ## KEGG: smi_1327 # Name: not_defined # Def: hypothetical protein # Organism: S.mitis_B6 # Pathway: not_defined # 1 163 3 164 166 129 46.0 5e-29 MKIYILNTTRFYHEDFEEYPGAWFSCPVDFEEIRERLGVQSEEEIEIEDYELPFPLEGNT RLWEINALCRMIQEMQGTPLYYEMDVVQKRWFPSFTEFIDHKDQIRCYPVQDGESLARYL VQEVQLFGEVHPDLLNHIDYASIGRELETSENYLFTDNGIFYYR >gi|223714098|gb|ACDT01000117.1| GENE 26 23632 - 24321 174 229 aa, chain - ## HITS:1 COG:BS_ydcR KEGG:ns NR:ns ## COG: BS_ydcR COG2946 # Protein_GI_number: 16077554 # Func_class: L Replication, recombination and repair # Function: Putative phage replication protein RstA # Organism: Bacillus subtilis # 3 229 130 352 352 158 40.0 9e-39 MKRLDLAINDKTGILNIPHLTEKCRNEECISVFRSFKSYRSGELVRREEKECMGNTLYIG SLQSEVYFCIYEKDYEQYKKQDIPIEDAEVKNRFEIRLKNERAFYAIRDLLEHDNPERTA 
FQIINRYVRFVDRDNAKPRSDWRINEEWAWFIGEHRGSLKLTTKPEPYSFERTLHWLSHQ VAPTLKLALRLDKMNHTQIVHDIITHAKLTEKHEKILKQQAAAAKEVVL >gi|223714098|gb|ACDT01000117.1| GENE 27 25007 - 25978 140 323 aa, chain - ## HITS:1 COG:BS_ydcQ KEGG:ns NR:ns ## COG: BS_ydcQ COG1674 # Protein_GI_number: 16077553 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Bacillus subtilis # 1 323 144 467 480 360 53.0 3e-99 MKQGLLHIRVEITLGKYQEQLLHLEKKLESGLYCELTDKELKDSYVEYTLLYDTIANRIS IEDVQAKDGRLRLMENVWWEYDKLPHMLIAGGTGGGKTYFILTLIEALLRTNAVLFVLDP KNADLADLQAVMPDVYYKKEDMLACIDRFYEEMMKRSEDMKLMENYRTGENYAYLGLPAN FLIFDEYVAFMEMLGTKENAAVLNKLKQIVMLGRQAGFFLILACQRPDAKYLGDGIRDQF NFRVALGRMSEMGYGMMFGETTKDFFLKQIKGRGYVDVGTSVISEFYTPLVPKGHDFLKE IKKLIDSRQGVQAACEANAAETD >gi|223714098|gb|ACDT01000117.1| GENE 28 26443 - 26820 277 125 aa, chain - ## HITS:1 COG:no KEGG:CD3345 NR:ns ## KEGG: CD3345 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 125 1 125 125 229 96.0 3e-59 MRLANGIVIDKEATFGALKFSALRREVHLQNEDGSVSEEIKERTYDLKSRGQGRMIQVSI PASVPLKEFDYNAEVELINPVADTVATATFQGAEVDWYIKAEDIVLKKGAAMNPQTPKKD APPVK >gi|223714098|gb|ACDT01000117.1| GENE 29 26842 - 27156 155 104 aa, chain - ## HITS:1 COG:no KEGG:CD3346 NR:ns ## KEGG: CD3346 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 104 1 104 104 200 99.0 1e-50 MELKFVIPNMEKTFGNLEFAGEDKTEQRRINGRMAVLSRSFNLYSDVQRADDIVVILPAE AGEKHFDFEERVKLVNPRITAEGYKIGARGFTNYILHADDMVKA >gi|223714098|gb|ACDT01000117.1| GENE 30 27341 - 29095 961 584 aa, chain - ## HITS:1 COG:alr1855 KEGG:ns NR:ns ## COG: alr1855 COG5293 # Protein_GI_number: 17229347 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 5 567 7 584 586 178 26.0 2e-44 MKLLSLRANKPSFHPIIFKDGINIIVGKQVAPLDENDGNTYNGVGKSLTLHLIHFCLGAN KITSFAQKLPDWEFTLNFEIDGVKYYSTRSTNEQNKINFCGETLTVTVLRQKLLNLCFGI SSNPKNMTWNTLFSRFVRRYRSCYSTFDSFVPKETDYSKILNNCYLLGIDTDLIISKKEL REKQLAASATEKAIKKDPIFKQYYLGKNDAELDVADLEYRISELEKEISAFKVSNNYHEL EKEADEKSYQKKALENRRVLISNYIKNIEESFKETAQVKEEKLLKIYEAANVEIPTMVKK NIDEVLQFHSNLLTTRNARLRKELHKQKAELQEIDDKINVLGQRMDELLDFLNSHGALEE YVALTKQLAALQNELNRIHEYQRILKTYKDTVLDIKANLIIQDKETQTYLDDEAEYLSEL KNKYWTFAKRFYPKKRSGLVIKNNSGENTLRYTLEARIEDDSSDGVNEVRMFCFDFLLLI CRKSKIRFIAHDSRLFANMDPRQRETLFRIVSETCPKENFQYICSINEDALLSFQSLMSD TEYEEIVTNNIILELNDDSPESKLLGIQIDIDLEDKSKSSDDIN >gi|223714098|gb|ACDT01000117.1| GENE 31 29082 - 29330 216 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|283845702|ref|ZP_06363179.1| ## NR: gi|283845702|ref|ZP_06363179.1| hypothetical protein BcellDRAFT_1681 [Bacillus cellulosilyticus DSM 2522] # 1 77 1 79 83 65 43.0 1e-09 MILPKKQLSINESYFGFGAFLLEKLTEPISVEDLWEYYKDSYTNKKYPVKFSFDQFVIAL DYLYIIGAIKINERGLLCYEAA >gi|223714098|gb|ACDT01000117.1| GENE 32 29354 - 30187 558 277 aa, chain - ## HITS:1 COG:no KEGG:pRALTA_0033 NR:ns ## KEGG: pRALTA_0033 # Name: not_defined # Def: hypothetical protein # Organism: C.taiwanensis # Pathway: not_defined # 1 276 34 316 321 116 29.0 1e-24 MTKANPNFQPVKAYGNIGDMKNDGFDRTTGTYYQIFSPEDITKDTTIREAVKKLKTDFSG LYNHWNSLCPIKKFFFVVNDKYKGLPAPIIQMALELNNAPEYAQSDIDTFTAKDLEKVFM SLDDLDKQDIIGFIPDEIIPVVEYEALHETVTYLINAELPMNSMDNLTVPDFDEKIAFNG LSEVVKSRLVVGSYQEGTLMLYFNDNPGVKEILQKKFNALYEQSKENIPDTQENCSDCRF YYILEKASSKNTSPIQTSVLVLMAYYFSSCDIFEEPQ >gi|223714098|gb|ACDT01000117.1| GENE 33 30991 - 31182 143 63 aa, chain - ## HITS:1 COG:CAC3415 KEGG:ns NR:ns ## COG: CAC3415 COG1132 # Protein_GI_number: 15896656 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 3 63 521 581 627 86 67.0 9e-18 MQNGASLSRRQRQLLSIACAAVANLPVMILDEATSSIDIRTEVLVQQGMDNLMRGRTTYV IAH >gi|223714098|gb|ACDT01000117.1| GENE 
34 31432 - 31650 120 72 aa, chain - ## HITS:1 COG:CAC3414 KEGG:ns NR:ns ## COG: CAC3414 COG1132 # Protein_GI_number: 15896655 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 1 70 141 210 577 72 51.0 2e-13 MLAFSLAVSLMINRDLSLIFLIAVPVLEAGLYFIVSRSHPYFEKVFHIYDELNSVVQENL AGIRVVKSFICL >gi|223714098|gb|ACDT01000117.1| GENE 35 31632 - 31856 200 74 aa, chain - ## HITS:1 COG:CAC3414 KEGG:ns NR:ns ## COG: CAC3414 COG1132 # Protein_GI_number: 15896655 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 2 68 73 139 577 95 64.0 3e-20 MLSGSFCAIASAGFAQNLRQNMYYKIHEYSFANIDKFSTSSLVTRLATDVTNVQNAHMMT IRVAVRAPSCWHFH >gi|223714098|gb|ACDT01000117.1| GENE 36 31947 - 32402 267 151 aa, chain + ## HITS:1 COG:ECs2784 KEGG:ns NR:ns ## COG: ECs2784 COG0583 # Protein_GI_number: 15832038 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 144 1 146 305 64 30.0 7e-11 MNTQQLECFVRIADKLNFTKVAEELYITAPAVVHHIINLEELNTSLFIRTSKMVKSTEPG SLFYSEAKDILQKIYMVEKKVKKLANQKLSILRIGCSSQVELEILEIPLIKLKEEFSTVY PQIIIQDYFILKSLFDNKQLDILVSTKEMDC >gi|223714098|gb|ACDT01000117.1| GENE 37 32952 - 33839 917 295 aa, chain - ## HITS:1 COG:CAP0064 KEGG:ns NR:ns ## COG: CAP0064 COG3588 # Protein_GI_number: 15004768 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphate aldolase # Organism: Clostridium acetobutylicum # 1 294 1 294 295 332 62.0 7e-91 MNTKQLERMQYGKGFIAALDQSGGSTPKALKSYGIDESAYSNDKEMFDLIHQMRSRIITS KAFTSKHILGAILFEKTMDSQIEGVDTAEYLWLKKEIVPFLKIDKGLEVEKDGVQLMKPI PELEAILNKARDKYIFGTKMRSVILEADPAGIKKIVDQQFELGLQIAKTGLVPILEPEVS ILSTSKKEAEELLLKEINEHLTLLDNDVKFIFKLSIPTIPDFYAELMKDSHVVRVVALSG GYDREQANGLLRENHGLIASFSRVLLSDLSINQTSVEFDTCLENVILQIYRASIV >gi|223714098|gb|ACDT01000117.1| GENE 38 34029 - 34361 391 110 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|29375603|ref|NP_814757.1| hypothetical protein EF1023 [Enterococcus faecalis V583] # 1 109 1 109 110 155 66 4e-37 MSITVNIYYTGINGNARKFAEEMVTSGIVDAIKAEKGNERYEYFFSMNDPETILLIDSWK DQKSIDAHHATIMMEQITELRNKYDLHMKVERYTSNEENITSADKKFIRL >gi|223714098|gb|ACDT01000117.1| GENE 39 34562 - 35389 882 275 aa, chain - ## HITS:1 COG:lin2527 KEGG:ns NR:ns ## COG: lin2527 COG0627 # Protein_GI_number: 16801589 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Listeria innocua # 1 262 1 252 252 167 35.0 2e-41 MALVQVNFISQSLKRTVSMNVILPVDKFLFKGNCQPVKEYKTLYLLHGLLGNYTDWVTRT RIQEWAEAKNLVVVMPSGDNSFYVDQEVKNNDFGIFIGTELIEVTRKMFHLSNRRDDTFI AGLSMGGFGALRNGLKYYQNFGYIAALSSALNIFELPVHDESRCVMGEDSCFGDIDEAYL SDKNPKVCLENLIQAKKEDNTIVFPKIYMACGCDDELIGVNRKFKGYLENAGFDLVYKED VGSHNWDFWNKYIQDVLEWLPLDPYEEGINSGNVK >gi|223714098|gb|ACDT01000117.1| GENE 40 35413 - 35688 157 91 aa, chain - ## HITS:1 COG:STM0571_2 KEGG:ns NR:ns ## COG: STM0571_2 COG3933 # Protein_GI_number: 16763948 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Salmonella typhimurium LT2 # 4 76 366 438 449 63 41.0 6e-11 MEVFTFLAKTIIGLNIHLCCLIERLIKKEEIVSHINIVDFEKNNQEFINGVRISFKAIAN QYNITIPISEIAYIYDYVLNNYKTDMSIDEI Prediction of potential genes in microbial genomes Time: Thu May 26 10:24:07 2011 Seq name: gi|223714097|gb|ACDT01000118.1| Coprobacillus sp. D7 cont1.118, whole genome shotgun sequence Length of sequence - 1066 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 554 540 ## gi|237733594|ref|ZP_04564075.1| conserved hypothetical protein 2 1 Op 2 . 
- CDS 590 - 1066 470 ## COG1109 Phosphomannomutase Predicted protein(s) >gi|223714097|gb|ACDT01000118.1| GENE 1 2 - 554 540 184 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733594|ref|ZP_04564075.1| ## NR: gi|237733594|ref|ZP_04564075.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 184 1 184 184 300 100.0 2e-80 MQLNATLYQSILNSLCNELKLNEQVILDIIDSAFYMFQQDHQILYIDDLYECYFNIVKRN FTGNIDKVPFYSISRRLKDTDNDGLSLLELLTEENSFSNYLKEYGLTFKFDKEIEMYVNG NKVDIPDEGKYKPYLKNRFSYDYSFKGYAFDDQLMNNEILERVKYGPEFFGHLFNYVDND DEII >gi|223714097|gb|ACDT01000118.1| GENE 2 590 - 1066 470 158 aa, chain - ## HITS:1 COG:BH0267 KEGG:ns NR:ns ## COG: BH0267 COG1109 # Protein_GI_number: 15612830 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Bacillus halodurans # 1 151 287 440 447 144 48.0 9e-35 VGLYKALDKTNIDYVKTALGDKYVNENMVENNHCLGGEESGHIIFSKHATTGDGILTSLK IMEAIIESKQTLAQLTEPVTIYPQLTKNVRVENKKAAREDEAVVKEVEELLREEGRVIVR ESGTELVIRVMVEASTDELCNQYVNHIVSVMTERGLVL Prediction of potential genes in microbial genomes Time: Thu May 26 10:24:21 2011 Seq name: gi|223714096|gb|ACDT01000119.1| Coprobacillus sp. D7 cont1.119, whole genome shotgun sequence Length of sequence - 13932 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 7, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 454 338 ## gi|167755182|ref|ZP_02427309.1| hypothetical protein CLORAM_00687 2 1 Op 2 . - CDS 447 - 1145 471 ## COG3279 Response regulator of the LytR/AlgR family 3 2 Tu 1 . - CDS 1500 - 1877 449 ## COG2190 Phosphotransferase system IIA components - Prom 1995 - 2054 12.9 + Prom 1948 - 2007 7.4 4 3 Op 1 . + CDS 2033 - 2797 748 ## COG1737 Transcriptional regulators 5 3 Op 2 . 
+ CDS 2797 - 3549 826 ## COG1737 Transcriptional regulators - Term 3525 - 3559 0.2 6 4 Op 1 2/0.000 - CDS 3567 - 5093 1690 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 7 4 Op 2 . - CDS 5114 - 6439 1756 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases - Prom 6596 - 6655 10.1 - Term 6764 - 6810 11.3 8 5 Tu 1 . - CDS 6830 - 7753 998 ## COG0679 Predicted permeases - Prom 7858 - 7917 9.1 + Prom 7838 - 7897 8.4 9 6 Tu 1 . + CDS 7929 - 9113 1152 ## COG1158 Transcription termination factor + Term 9179 - 9228 6.5 10 7 Op 1 2/0.000 - CDS 9359 - 10549 923 ## COG0477 Permeases of the major facilitator superfamily 11 7 Op 2 . - CDS 10615 - 10977 308 ## COG0789 Predicted transcriptional regulators 12 7 Op 3 . - CDS 10961 - 11266 302 ## gi|167755171|ref|ZP_02427298.1| hypothetical protein CLORAM_00676 13 7 Op 4 . - CDS 11325 - 11852 204 ## PROTEIN SUPPORTED gi|229878290|ref|ZP_04497790.1| acetyltransferase, ribosomal protein N-acetylase 14 7 Op 5 16/0.000 - CDS 11890 - 12489 723 ## COG1847 Predicted RNA-binding protein 15 7 Op 6 22/0.000 - CDS 12498 - 13394 975 ## COG0706 Preprotein translocase subunit YidC 16 7 Op 7 . - CDS 13395 - 13733 236 ## COG0594 RNase P protein component 17 7 Op 8 . 
- CDS 13774 - 13908 222 ## PROTEIN SUPPORTED gi|167755166|ref|ZP_02427293.1| hypothetical protein CLORAM_00671 Predicted protein(s) >gi|223714096|gb|ACDT01000119.1| GENE 1 1 - 454 338 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755182|ref|ZP_02427309.1| ## NR: gi|167755182|ref|ZP_02427309.1| hypothetical protein CLORAM_00687 [Clostridium ramosum DSM 1402] # 1 151 1 151 434 204 100.0 2e-51 MTSWIILLIKQYIDIAKYLLIGDAIFDLNNDKKKRTKENIILSSIILVIYIIFYSKIEDI WIDLNALICLIYLIIIYRDKLFKTFKVVIITILITTVLEQFMNLFFKYNLYEPPNMQFVI SNILRLLIVICGVKPIKMAKLIYIKKYNLSE >gi|223714096|gb|ACDT01000119.1| GENE 2 447 - 1145 471 232 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 3 225 2 230 234 87 29.0 2e-17 MEINIAICDDDKNYINEIIQYLNNCSWKRNNSFVYHTFYCGEALLESKQYFDFIILDIEM DKVSGIEVKEIFYNRNNQSRIIFLTNYEEYMSDSFGKNVYGYISKDVLHKMDIPLNKIFK ELLEHRLIKISEDIIDLYEVYYVKADGPYINIHTNGSYEVYRMTLKDFANKIHNSNFIRI HKSYLINMRYINEINNKFLILDNDERLPISKSNRKTVLITYRKYLLENIIYD >gi|223714096|gb|ACDT01000119.1| GENE 3 1500 - 1877 449 125 aa, chain - ## HITS:1 COG:BH0844_3 KEGG:ns NR:ns ## COG: BH0844_3 COG2190 # Protein_GI_number: 15613407 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Bacillus halodurans # 1 121 28 147 151 77 36.0 8e-15 MGDGIAIYPDNNIIVAPCDGKITAIMENTEHAVGITLANGVELLIHVGLDTVNLTEKIFS VKVAKESYVHTGEELITFDKVRLKELGYQDITMLVILDTGNTKKLEILSETEAIINQTNI IKYID >gi|223714096|gb|ACDT01000119.1| GENE 4 2033 - 2797 748 254 aa, chain + ## HITS:1 COG:CAC0191 KEGG:ns NR:ns ## COG: CAC0191 COG1737 # Protein_GI_number: 15893484 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 9 254 10 265 283 86 27.0 4e-17 MLINQLDDEIIKNLSQSELYILHYVYDHPDEVIDMSIQELAKAVAFSSATILRFCKKLNF 
SGFAEFKFALKQQNKEIANLKKPISSMDSITSLYDDIESTSLLIKERYLKEIIDLIDSNK RIHLYGEGISRIPVDYLEKLLFSIGRQNVYKYKSSRLAAHIASNANEDDILIAISTSGNY PTTVKIVKLFNLNHATIIAISPYTKNAIADLANINFRFFVNHRENIDTEYTSRLAIFYII DVIFKTYLVRKEKE >gi|223714096|gb|ACDT01000119.1| GENE 5 2797 - 3549 826 250 aa, chain + ## HITS:1 COG:L192289 KEGG:ns NR:ns ## COG: L192289 COG1737 # Protein_GI_number: 15673158 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 13 236 14 245 273 77 25.0 2e-14 MIALLENAKSLDLSENEKELLYYLENSCKEVVSMTLSQLAKATYMSNASIMRFCNKLGLS GFNELKYELKQSMHQEIDYQMIIEKPLSRFLDNLKNLNYEIIEEVVDLLCSPQPIYVVGR SLSLVAADYFQTILSSIDINCILINDLHLSKSVFNNLKAPATIFIVSANNAGKIYDEVVT IAKTHDCKIILLTSNAKGSLVDSCDYVLSSNDENLTYHGVDINSRLGIFTIMQIIIELTA QKLALEKATD >gi|223714096|gb|ACDT01000119.1| GENE 6 3567 - 5093 1690 508 aa, chain - ## HITS:1 COG:malX_1 KEGG:ns NR:ns ## COG: malX_1 COG1263 # Protein_GI_number: 16129579 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 3 406 8 424 450 289 38.0 7e-78 MKKNKVMDFFSALGRSLMMPIAALAACGIILGVTAALLKTQVVEAVPFLQQPVVFYILNT LKTVSNVVFTLIPVLFAISISFGMAKEDKEVAAFAGFIGYYTFLVSASCMINSGFMNFDS LQISTILGVETLDMGAVAGILTGVTVAALHNKYHKVVFPVAIAFYGGKRFVAIVVILAMA LLGQVAPFIWAPVSAGINGLGTLISESGLLGVFSFGFLERLLIPTGLHHVLNGIFRTTAI GGVYQGVEGCLNIFLQFFDSVDISVMREYTQFLGQGKMLFMIFGLPAAAYAIYKTSPGEK KNKVKALMIAGVAASIVSGITEPLEFAFMFIAPPLFLFHAVMGGISFMLMSLLQVAVGNT GGGIIDMFIWGVFQPGSNWYWIIIIGPIYAIIYYNVFKWYFNRKHLSIEVAEDDGDKDTN DTASISDKQQALATKIIEGLGGFDNIITVNNCISRLRVDVKDMSLVNEEELKKTGSMGIV KPSETHIQVIYGPKVEQVANSVREVLKY >gi|223714096|gb|ACDT01000119.1| GENE 7 5114 - 6439 1756 441 aa, chain - ## HITS:1 COG:CAC3426 KEGG:ns NR:ns ## COG: CAC3426 COG1486 # Protein_GI_number: 15896667 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Clostridium acetobutylicum # 1 
440 1 444 445 459 51.0 1e-129 MKKYSVCIVGGGSTFTLGFLKSFCRMKEQFPLKKLVLFDIDAERQEPIGEFGKILFKEQF PELDFSYTTDIKEAYEGMDFVFMQMRAGGLPMRAKDEHIPLSMGLIGQETCGAGGMAYGL RSCVDMIKAVHDIRNYAPEAWILNYSNPAAIVAEALRREFPDDKKILNICDQPENVVRSC SRLLGCDWENLDPVYFGLNHYGWFTNIYDIKTGEDLLPKLRELIKKNGFLPQDAEQRDQS WLDTYGMVETLMNDFPDFLPNTYDQYYLYPDYKAAHLDPNFTRADEVMAGREKRVFDECR EVIAAGVLGDKFDDISDAHAEMMINVAEAIAYNKNTRHILIVENNGAIANMQDDAMVEVV CELGINGPRPMRVGNIPQFYLGLLVNQVSCEKLIVDAYFENSYQKALMAFTLNRLINDGK KARKVLDALIEANKGMWPELK >gi|223714096|gb|ACDT01000119.1| GENE 8 6830 - 7753 998 307 aa, chain - ## HITS:1 COG:CAC2949 KEGG:ns NR:ns ## COG: CAC2949 COG0679 # Protein_GI_number: 15896202 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 6 306 5 305 305 149 33.0 6e-36 MDIMVVINQMIQLFLIIGLGYFMQKKKILNDELNTKLNFIVISITTPALIFSSVCTTSIS EKSMVIYTLAIATAVYVALPVISFVLVKLMRIPMHQKGLYMFMTIFSNTGFMGYPIMKAL FGNDAVFYTALFNILFNLEVFTLGVILINYGNDVKMKLNPKNLLSPGVVASIIAIFIYFL EIPIPDVIANSCGMVGDMTTPLAMMIIGATLANIKVKELFTELRLYYFTIVKQVILPIAV FPIIAYFIKDPLIQGITLVNLAMPVANSAVLFAKEYGGDVELAAKSIFITTLVSVFTIPL IVSLLLV >gi|223714096|gb|ACDT01000119.1| GENE 9 7929 - 9113 1152 394 aa, chain + ## HITS:1 COG:CAC2889 KEGG:ns NR:ns ## COG: CAC2889 COG1158 # Protein_GI_number: 15896143 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Clostridium acetobutylicum # 5 376 102 468 483 408 60.0 1e-113 MESIKNDKNKVNTKEEDELGELASGILEIMVDGYGFLRPKNYTQESRDIYISQSQIRRFN LKNGDEIVGHARNSKNGERYNALMRVDLVNGQNPDQAKKRIAFEFLTPIYPNERVYLGKD NEQLSMRLIDLISPIGKGQRGIIVAPPKAGKTTLLKDIALSVSKNYPNIKIFILLIDERP EEVTDIKEALDADNVEVIYSTFDEKSESHIKVAQTTLERAKRLTEQGYDTMILLDSLTRL TRANNLTVTPSGRTLSGGLDPASFYFPKRLFGAARNTREGGSLTILATALIETGSRMDDM VYEEFKGTGNMELVLSRKLQEKRIFPAIDVQKSGTRREDLLLSKDEYEAMQIIRKSLDQT TIEQATDSLLNMFEKTKDNEVLIKNILVSTKHQK >gi|223714096|gb|ACDT01000119.1| GENE 10 9359 - 10549 923 396 aa, chain - ## HITS:1 COG:BH1174 KEGG:ns NR:ns ## COG: BH1174 COG0477 # Protein_GI_number: 15613737 # Func_class: G 
Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus halodurans # 3 377 6 374 399 98 24.0 2e-20 MNKNKLLMFFIINGLFNLAANYAHPITPTVIKNLQLGDYMFGVAFGAMMGLNFLFSPFWG KLNEQINSKTTMLICSIGYAVGQVLFWQANSQNSVIFARMFSGVFTGGSYVSFLTYVVNV TPPEKRGQNLTVLATVASVCAAFGYFIGGMLGEISIDLTFAVQAATLASCGIFFYLFCEN DVVSPVKQKVALLIKQANPFNSFKQAKVFIDKKYLIMFTICCLAFLGFTAFEQSFNYYIK DIYNLTSGYNGIIKAAIGIISLVANGTICLWIIKNTNVSRSSMLILLLSSVTMLGVLATN NIIIFIVMNIIFFALNAIIIPVTQDLVASRSNDTNSSIVMGFYNAIKSLGGIIGALMAGF FYTFNPRLPFIFGFVAFLLATVLSYYYYQQNKLHKS >gi|223714096|gb|ACDT01000119.1| GENE 11 10615 - 10977 308 120 aa, chain - ## HITS:1 COG:CAP0107 KEGG:ns NR:ns ## COG: CAP0107 COG0789 # Protein_GI_number: 15004810 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 115 1 115 152 92 41.0 1e-19 MEYTIKEVTKKYNLSASTLRYYEKEGLLPKIKKNQSNQRVYDDDDLSWLDIIMCMRKTGM TIAYIKNYIELCQEGDNTINQRYEIFLKQKEILLLQQQELEKNIETVNYKINLYKEKLLK >gi|223714096|gb|ACDT01000119.1| GENE 12 10961 - 11266 302 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755171|ref|ZP_02427298.1| ## NR: gi|167755171|ref|ZP_02427298.1| hypothetical protein CLORAM_00676 [Clostridium ramosum DSM 1402] # 1 101 1 101 101 154 100.0 1e-36 MDKIVINQELLLLLQKNKIIATTSNFNYRDQAVISLIKTLITMGYNSSDILELIDEEMTL LDILEFHQDLVINKQYECCCLINKLKKMIEEVKEETNGIYN >gi|223714096|gb|ACDT01000119.1| GENE 13 11325 - 11852 204 175 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229878290|ref|ZP_04497790.1| acetyltransferase, ribosomal protein N-acetylase [Slackia heliotrinireducens DSM 20476] # 1 175 1 178 181 83 29 9e-16 MKMNIELARATDVNEIEALYGAVVDDLLANVNYPGWKKGIYPTRDEAIFGIEKKELYVMR QNDQIVGSIVINHVQEENYQLAAWKIDAQDHEVYVIHTLAVHPQFKGLKIAQKLLEYADE LAKNNGVKTIRLDVRKGNVPAIKIYERCGYTYVGAINLDFRGSDLGLFELYEKII >gi|223714096|gb|ACDT01000119.1| GENE 14 11890 - 12489 723 199 aa, chain - 
## HITS:1 COG:BH4063 KEGG:ns NR:ns ## COG: BH4063 COG1847 # Protein_GI_number: 15616625 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein # Organism: Bacillus halodurans # 2 196 4 204 207 96 31.0 3e-20 MKRYTARTVQDAVNTACQELGVTIDELNYEVISETKTLFTKKAEIECYTIPMIQEYIESY IRRFIGDMGFEVETVSYLQDGRIYCNINTSNNSILIGKAGVILRAINFIVKNAVSNTFKK RFEISVDINGYKEDRYKKVASMAKRFGKQVLRTKAEIKLDPMPADERKVMHQELSRFEHI KTESHGEGKNRHMVISYVD >gi|223714096|gb|ACDT01000119.1| GENE 15 12498 - 13394 975 298 aa, chain - ## HITS:1 COG:lin2986 KEGG:ns NR:ns ## COG: lin2986 COG0706 # Protein_GI_number: 16802044 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Listeria innocua # 16 275 17 246 287 105 28.0 1e-22 MVLDNKTKKILKVFCLVFLVAILTGCAQNLDANGKLIASRAITSSTPWSFDAGWFDFIFV IPLAKGILFINDYVGNVAFGVIGVTIIVNIITLPIMIKSTVSTQKMQLLQPEMEKIQRKY KGRKDQASQMRQNAEIQNLYKKNNVSMFGSFATFLTLPIMFAMWQAVQRIDILYSTYIFG INLGAKPISHIMDLDIAYIVLLVLVAGTQFFAMQITQIMAKRSPKYRPSQQMKSMNTMNN VMTIFIVYLAATMPAAMSLYWITTNVINIIRTVYIQLFHIEKAKKEVESTTTNFLNKK >gi|223714096|gb|ACDT01000119.1| GENE 16 13395 - 13733 236 112 aa, chain - ## HITS:1 COG:SPy0246 KEGG:ns NR:ns ## COG: SPy0246 COG0594 # Protein_GI_number: 15674428 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase P protein component # Organism: Streptococcus pyogenes M1 GAS # 1 104 1 104 119 70 39.0 7e-13 MKKEFRVRKNEDFSRIIKKKQSMANRSFIIYYLKNDLDHARIGISVSKKLGKAVIRNKIK RQVRMMLQQTINFDDNYDYIVIIRNKYLDLDFNSNLNELKYLYKKILKRMEK >gi|223714096|gb|ACDT01000119.1| GENE 17 13774 - 13908 222 44 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167755166|ref|ZP_02427293.1| hypothetical protein CLORAM_00671 [Clostridium ramosum DSM 1402] # 1 44 1 44 44 90 100 8e-18 MKRTYQPNKRKRSKTHGFRARMATVGGRKVLARRRKRGRKVLSA Prediction of potential genes in microbial genomes Time: Thu May 26 10:24:38 2011 Seq name: gi|223714095|gb|ACDT01000120.1| Coprobacillus sp. 
D7 cont1.120, whole genome shotgun sequence Length of sequence - 1279 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 388 - 447 6.9 1 1 Tu 1 . + CDS 536 - 1277 695 ## COG0593 ATPase involved in DNA replication initiation Predicted protein(s) >gi|223714095|gb|ACDT01000120.1| GENE 1 536 - 1277 695 247 aa, chain + ## HITS:1 COG:BH0001 KEGG:ns NR:ns ## COG: BH0001 COG0593 # Protein_GI_number: 15612564 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Bacillus halodurans # 1 247 1 254 449 157 37.0 2e-38 MSDNSKIWQLCLDTLEKQNLPEERKIDKVIMESVFRSAKIASINNNKVIITTPFSFNVET IKNNQKDIEDILSGMLASSINLEIVGEEDFKKSVAVKQETPFRDNLNRTLTFDNFVVGSS NRMAQNAALLVSTNPGSNFNPLFIYSNPGLGKTHLLNAIGNYAKEVNPALRIRYITSKDF VDEVIGAMKGRDGDEIYDKYKNLDILLIDDIQFLFNKEKSSEIFFHIFNELINNNKQIVI TSDKMPE Prediction of potential genes in microbial genomes Time: Thu May 26 10:24:44 2011 Seq name: gi|223714094|gb|ACDT01000121.1| Coprobacillus sp. D7 cont1.121, whole genome shotgun sequence Length of sequence - 18137 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 5, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 69 - 128 4.5 1 1 Op 1 16/0.000 + CDS 209 - 580 300 ## COG0593 ATPase involved in DNA replication initiation + Prom 646 - 705 7.5 2 1 Op 2 6/0.000 + CDS 736 - 1845 1333 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) + Prom 1969 - 2028 12.8 3 2 Op 1 9/0.000 + CDS 2054 - 2260 327 ## COG2501 Uncharacterized conserved protein 4 2 Op 2 9/0.000 + CDS 2260 - 3357 856 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 5 2 Op 3 24/0.000 + CDS 3357 - 5282 2608 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 6 2 Op 4 . 
+ CDS 5300 - 7759 3043 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit + Term 7762 - 7802 9.4 7 3 Tu 1 . + CDS 7813 - 8322 740 ## COG3153 Predicted acetyltransferase + Term 8325 - 8363 5.1 - Term 8316 - 8347 1.0 8 4 Op 1 . - CDS 8348 - 8734 368 ## gi|237733419|ref|ZP_04563900.1| predicted protein 9 4 Op 2 . - CDS 8749 - 8991 195 ## gi|237733420|ref|ZP_04563901.1| predicted protein 10 4 Op 3 . - CDS 8988 - 9551 493 ## LCRIS_01737 transcriptional regulator - Prom 9707 - 9766 5.7 11 5 Op 1 . + CDS 10054 - 11559 1537 ## COG1757 Na+/H+ antiporter 12 5 Op 2 3/0.000 + CDS 11621 - 14563 2948 ## COG2200 FOG: EAL domain + Term 14571 - 14608 4.4 + Prom 14613 - 14672 5.1 13 5 Op 3 2/0.000 + CDS 14696 - 15283 424 ## COG1309 Transcriptional regulator 14 5 Op 4 36/0.000 + CDS 15349 - 16044 321 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 15 5 Op 5 . + CDS 16045 - 18136 1573 ## COG0577 ABC-type antimicrobial peptide transport system, permease component Predicted protein(s) >gi|223714094|gb|ACDT01000121.1| GENE 1 209 - 580 300 123 aa, chain + ## HITS:1 COG:SA0001 KEGG:ns NR:ns ## COG: SA0001 COG0593 # Protein_GI_number: 15925706 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Staphylococcus aureus N315 # 7 120 338 450 453 74 33.0 5e-14 MNHTNNIDMAFALESFKDDKIVQNPKSSLTKESILKTTAEFYYLTISQLISKNKTRKLTT PREMCMYLMRELLDITFAEIGATFSNRDHSTVMKACARVETKIKKDPDYKLAINKLKDKL GIV >gi|223714094|gb|ACDT01000121.1| GENE 2 736 - 1845 1333 369 aa, chain + ## HITS:1 COG:BS_dnaN KEGG:ns NR:ns ## COG: BS_dnaN COG0592 # Protein_GI_number: 16077070 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Bacillus subtilis # 1 369 1 378 378 247 39.0 3e-65 MNFKIKRLKLLNALAKATKAVSIRSPLPVLTGIKFDLQDDCLILTGSDSDITIQTKIEKD EDLTIYQTGGVVLNSRYILDIVRKIDSDEIKIEILDGALTRISGATSKFDLNGTDVLDYP 
RIDLSKTGTKVLLNALSLKDIISQTKFAASDKEHKPILTGINFKAGNRQLECTATDSYRL AKKIVSLDEDVTFNITIPQKSLDEISKIIERDEMIEMYVSDRKVLYVFDNNIIQTRLIDG TFPDTNRLIPDAFDYELDLDAHYLLNAIDRVSLLTNEQNNIIKLDMSDEKVVLSSYMQEI GSVEEILDKSFYKGTPISISFSSKYATDAIRAFNEPKIKILFTGEMKPFIIKDFEKDDVI QLVLPVRTF >gi|223714094|gb|ACDT01000121.1| GENE 3 2054 - 2260 327 68 aa, chain + ## HITS:1 COG:CAC0003 KEGG:ns NR:ns ## COG: CAC0003 COG2501 # Protein_GI_number: 15893301 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 67 1 67 68 62 52.0 1e-10 MKKIQIRDEYITLGQFLKYAGCVSSGIEAKMVIKDELVMVNGEVETRRGKKLRTGDQIEF NGESFVID >gi|223714094|gb|ACDT01000121.1| GENE 4 2260 - 3357 856 365 aa, chain + ## HITS:1 COG:lin0005 KEGG:ns NR:ns ## COG: lin0005 COG1195 # Protein_GI_number: 16799084 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Listeria innocua # 1 361 1 365 370 280 42.0 3e-75 MKVNSLCLDNFRNYNHFFIEFDRDINILIGSNGQGKTNLIEAIYLLSVGKSFKTHINKQM IMFDCEFAKVKGEVTSNNKLRSLEMILGSDFKRAKIDDQDIYKISEYVGLLNVVVFVPDD LYLIKGSPNNRRRFIDLELSKISPIYVFNLSKYNNLLKERNKYLKILNQKNRDGDEYLEV LDEQMARLQVELIKKRIDFIKNLNQKVTSIYNLIAKNDNEKISLRYSCFLKQELTYENIL ALYKKNHQRDIRYMQSHLGIHKDDLKIFMNGNAADLFASQGQQRTIVLSLKIALIELIKD EIGEYPVLLLDDVLSELDEARKNMLLDILNQKIQTFITTTSIDGINHQIVEKAKKIYIKG GKEAT >gi|223714094|gb|ACDT01000121.1| GENE 5 3357 - 5282 2608 641 aa, chain + ## HITS:1 COG:BS_gyrB KEGG:ns NR:ns ## COG: BS_gyrB COG0187 # Protein_GI_number: 16077074 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Bacillus subtilis # 4 641 2 638 638 822 65.0 0 MSEEKVEHSYDESDIQVLEGLEAVRKRPGMYIGTTSIRGLHHLVWEIVDNSIDEALAGYA SHIRVILKDDDVVQVIDDGRGMPTGTHAKTGISTVETIFTVLHAGGKFGGGGYKVSGGLH GVGASVVNALSEWLEVNVHRDGHEYYQRFENGGKPAGELKIIGDSDITGTIVTFKADKII FKEGTVYDYDTLRQRIRELAFLNKGLRLSLEDQRNGVDRKHEYYYEGGIKEYVAYINKNK 
TPIHEEIIYVEDMQQEITIEVGMQYNDGYQSNIYSFCNNINTHEGGTHEEGFRLALTRVI NNYAKSKGLLKKDEEALSGDDVREGLTAIISVKHPDPQYEGQTKTKLGNAEVRKIASNII SASLDKFLLENPDTAKIIVDKAIVAARARMAAKKAREMTRRKNVLEVSNLPGKLADCSSK DASECEIFIVEGNSAGGSAKMGRDRRIQAILPLRGKILNVEKSRLEKILGNAEIRSMITA FGTGIGEEFNLDKLRYHKIVIMTDADVDGGHIRILMLTFLYRYLRPLVEGGFVYAAQPPL YLLKHGKDEYYCYSDEELDDLKTKIGEGAKYSIQRYKGLGEMNAEQLWETTMDPEKRILN QIDLDEAMEADMIFDMLMGEKVEPRREFIQENAKYVTDLDI >gi|223714094|gb|ACDT01000121.1| GENE 6 5300 - 7759 3043 819 aa, chain + ## HITS:1 COG:BS_gyrA KEGG:ns NR:ns ## COG: BS_gyrA COG0188 # Protein_GI_number: 16077075 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Bacillus subtilis # 6 805 9 806 821 933 59.0 0 MEERGIRKVKLTSEMKNSFMQYAMSVIVSRALPDVRDGLKPVQRRIIYGMSDLGCTPGTP YKKSARIVGEVMGKYHPHGDSSIYEALVRMAQDFSYRYMLVDGHGNFGSIDGDGAAAQRY TEARMSKISSELVRDLKEDTIDFIPNYDGEEQEPAVLPARIPNLLINGSTGIAVGMATNI PPHNLVEVIEAILAVAKDPDITIMELMENYIQGPDFPTGGYILGKTAIKKAYETGNGLIV MRAKTEIEEKGSKKVIVVTEIPYMVNKARMIEKMSQLVREKKIEGITEIRDESNRDGIRV VIELRRDIQPEVILNQLYKLTSLQTTFGANCIALVNNEPKTLNLKEMITLYLDHQEEVIK RRTKFRLTKAEDRAHILEGLKRALDQIDAIIDLIRSSRTTEIIQTRLMEEFNMSDKQAKA IREMQLQRLAGLERQKIEDDLNKLLEIIADLKDILENRERLLSIIHDELEEIKDKYGDKR RTEIIQGTFDLEDEDLIPVEDVIISLTNNGYVKRMPVDTYKTQNRGGRGIKGMATNSDDI VSSLINMSTHDDLLFFTNFGKVYRLKGYNIPEFGRSAKGLPVVNLLNLDKEEQVKSMISI NRKQIEENNDYYLFFVTASGLVKRVHINEFERIRQSGKIAISLRDDDQLVSVKLTGGNDQ ILIAASNGKLVRFEETNVRPMGRSASGVKGINVDGSEVIGMTTDKEGKYIMVVTERGYGK MSPLEEYRLSNRGGKGVKTINATERNGQIVALRAVNGDEDLLIITDDGIMIRLPMEQVKT AGRATQGVRLIKVNAENKVSSVEVVEKAAETEEDKSEEE >gi|223714094|gb|ACDT01000121.1| GENE 7 7813 - 8322 740 169 aa, chain + ## HITS:1 COG:L1005 KEGG:ns NR:ns ## COG: L1005 COG3153 # Protein_GI_number: 15672038 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Lactococcus lactis # 1 167 1 172 173 115 38.0 5e-26 MIIRPEEKQDFNEIYTVVKAAFATAEHSDGNEQDLVIALRTGNNYIQKLSLVAKIDNQLV 
GYIMFTTAKVGNDTVLVLAPLAILPAFQKQGIGSALINKGHQIAKELGYEYCLVLGSEHY YPRFGYLPAEQFGIKVPKGIPSINFMAKKLIFRAGKISGEVKYAKEFGI >gi|223714094|gb|ACDT01000121.1| GENE 8 8348 - 8734 368 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733419|ref|ZP_04563900.1| ## NR: gi|237733419|ref|ZP_04563900.1| predicted protein [Mollicutes bacterium D7] # 1 128 1 128 128 213 100.0 3e-54 MSQSIVINNLEDLENIRNEIISFALEFDSIAQRKQYTFDQIENINVFHNSRYTDILAKDI SKASELLKLFYQLKAEQLYTIGDGENDICMLQCTDNSFTFNHVETVIKNSAQYHFDIIEQ ILAFINQK >gi|223714094|gb|ACDT01000121.1| GENE 9 8749 - 8991 195 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733420|ref|ZP_04563901.1| ## NR: gi|237733420|ref|ZP_04563901.1| predicted protein [Mollicutes bacterium D7] # 1 80 1 80 80 138 100.0 1e-31 MKIIGSDFDGTITTNKVILPETRKQIKQFRDAGNIFIIVTERSPQKFKDGIKKYDIDFFD YVICANGAVTLDQQLKVIIK >gi|223714094|gb|ACDT01000121.1| GENE 10 8988 - 9551 493 187 aa, chain - ## HITS:1 COG:no KEGG:LCRIS_01737 NR:ns ## KEGG: LCRIS_01737 # Name: not_defined # Def: transcriptional regulator # Organism: L.crispatus # Pathway: not_defined # 26 184 236 390 390 97 33.0 3e-19 MLEECQNNTQHLPQIKTIEEADCYYLPYLDKSFDVVIANHVFMYFDDLPKALQEINRILK DDGILYCSTIAKDMMKERDVMLKDFDSKISFNQEILYQPFGYENGKTKLEKYFQDIKLYD RKEVYEITDLDLYYQFILSGKGLSLNLEPLYKKKKQLYEYMQKYLNKNNLFYLTTHAGMF VARKRKK >gi|223714094|gb|ACDT01000121.1| GENE 11 10054 - 11559 1537 501 aa, chain + ## HITS:1 COG:VC1131 KEGG:ns NR:ns ## COG: VC1131 COG1757 # Protein_GI_number: 15641144 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Vibrio cholerae # 1 467 4 496 533 323 40.0 3e-88 MDFVQTGWALLPPVIAILIALKTKEVYLSLFFGIVSGALLLSNFHIIDTVNVMFDTMIGS LSDTWNIGILIFLIFLGIIVTLMTKAGGSRAYGLWAKEKIKSRQGSLFSTFLLGILIFVD DYFNCLTVGSVMREVTDEFSVSRAKLAYIIDSTAAPICIIAPISSWAAAVSGYTSGDGFS LFLQTIPFNLYALLTIVMVTYVIKKDFHIGTMKHHEIAAQNGDVHNGVNDYEEGEEMEVS DQGQVYDLLLPVFFLIISCIVGMLYTGGFFEGVSFIMAFTNCDASRGLVLGSFFTLVFIF ILYLPRKIISYEDFAKCIPMGFKTMVPAMIILTLAWTLGDLVSDKLQAGVFVYNLMQGAS 
ISTAILPVCLFLVGTGLAFATGTSWGTFGILIPIATSIFPEGSQMLVIAIASILAGAVCG DHISPISDTTIMASAGAQCNHVNHVTSQIPYAMLVAGACVVGYLVAGFTTNVVLTFVIAI IALFIFIFITKYFYNFKAKKD >gi|223714094|gb|ACDT01000121.1| GENE 12 11621 - 14563 2948 980 aa, chain + ## HITS:1 COG:BS_ykoWm_4 KEGG:ns NR:ns ## COG: BS_ykoWm_4 COG2200 # Protein_GI_number: 16081166 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Bacillus subtilis # 189 437 8 255 259 147 32.0 8e-35 MKIECDQKGGVKMEESHNFYLNDYTEFVKQATDIIESNKNSKNYVIMYADLTNFQLVNDF YGFIEGDRFLLEFAKFLSMSPRTILCGRVFSDHFLRLSLFEDNCDIESVTRNYEQKLETF ISSQEEHHPDSKIAVAAGLCPVKLEEKAIIKAIDQANIARKEAKKDNHCKLVWFNEDMQK FINKRKILEIEIQNALRNKLFTFYLQPKVNINTGKIVGSEALARLIKDNKVIYPDQFIEI MEKNETIIDLDFLICQQVCRYLRERIDQGKKLIPVSMNMSRMHVYETGFVARVNKIVNKY NIPPYLLEFELTETVILKNLTKVREVADELRSFGYRVSIDDYGTGYSGMNIWQDLNFDIV KLDRSYITKRESNNNRNNIIIPAIVDICNKLQTTIICEGVETLDQCLYMKRFGCNVVQGY YFSKPVSTVEFEKMIDETDGKFNLPWNKDDFKYTSVDNISQSEIKAITNSIFNFIPGGVA GFSYELELLFVSDSLVELSGYSREELLSGGIEWPKKLFYSEEEFKHVFYHNGLADRSSKN LVEFKIKAKNGKTVYINMYGGLVDSPEWGSYYLCCFYDVTNEKVNELKSHELQENINSLL KNINGGIGKMTLDGKGTILFASDGFYKTIGYTRTDFNQPPINNDLSQIVSKEAYFLCNHV IEQIKAGKIDYFEVEVNSKTAQKIWLTIYFSNIHRENGKLVADVFCIDSTVDHQQMMLEQ HCNEMEDNLNIVIENTPGDIVVIKIKDDHITTKFLSHGLSKIFDFEQSELTDILKHNKGF DLVYEDDREVLAKQFIELSKQKSCVNFDYRSYYKNGKVSWHNINANFYRQEEDGTIIYHG IITNINTLKEQQERAEKLKDRYNFILSVLAVDVWNFDVRNNRMDSLYMDDKFSIDLPKGA KIVPEKLLEDALILPEDQNKYLNMLTSIRAGQKTAHDKLHCRIKTGEYNLFDLMIRVKDT NSDMAIIMAKNFDQFFNKKE >gi|223714094|gb|ACDT01000121.1| GENE 13 14696 - 15283 424 195 aa, chain + ## HITS:1 COG:CAC0821 KEGG:ns NR:ns ## COG: CAC0821 COG1309 # Protein_GI_number: 15894108 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 99 1 103 200 58 35.0 5e-09 MQIKKEDLKNDIIEAAKVEFLHHGYEGASMRIIASKSHTTLGNLYNYFTNKEELLGAVLE PSIKSLNKLVEVHLNEEIQVHSLEEVDEALDQFGDFFEESDFRYIMDERLIILFELKTTE YKEKRDWFLLKFKQHMAWHLNIEDINSPYIDIITNMFIDCIKHVLVSHDNLEMQKQEFLK VFKMLCTGIVVNEEK >gi|223714094|gb|ACDT01000121.1| GENE 14 
15349 - 16044 321 231 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 214 1 217 245 128 35 3e-29 MIKFEEVSKIYQNGNKKIYALNKVSFTINKGEFVIILGPSGAGKSTLLNILGGIDRLDEG KIIIDNEVISNKNENELSDYRAHQVGFIFQFYNLIPTLTVYENVALMKELKKDIMPVDNV LDQVGLLVHKKKFPHQLSGGEQQRVSIARAVVKNPEILLCDEPTGALDSDTGAKVLKLLH DMCKEYHKTTIIVTHNTNLQAIADKVIRVKNGQINTITTNSHPLDVMEVEW >gi|223714094|gb|ACDT01000121.1| GENE 15 16045 - 18136 1573 697 aa, chain + ## HITS:1 COG:lin1187 KEGG:ns NR:ns ## COG: lin1187 COG0577 # Protein_GI_number: 16800256 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Listeria innocua # 220 696 580 1084 1136 152 26.0 2e-36 MLFKKMIRDLWKNKVQFGAIFIMMFLGVFIFSGISSEYHGLEVALNNYIKENNLADAWLY QKEFSDSTLNELKKQTDYEQRMYVSTSNKAVDNSNIDLYITDSNKISTLKVVNGKSFNNQ DDGIWLDAMYAKKNGYRLNDQITLKYQNFELSKKIIGLVYSPENIYLAKEDQLTPNRQKY GFAYVNNNSFPNTLYHPNQLVLSSKKDISKIVDQLLAGNEYKLILQKDHPSVNMLTGEIA QHRSIGLVFSLAFLFIAVLITITTMHRVLQSQRMQLGILKALGFKKRKLLIHYLSHSTFI CLVGSILGYWLGLIIMPALIYPMFEEMYVLPVLTSQPIVLGYLLPIGCTILCMLISFSVC FKYLKMNGATILYSNNLISNSSYSLAWSSNLPFRFQWNIRDIIRNKLRSLMTIFGVLGCT ALLFSACGLYDTMTNLTDWNYTKLQNYKYKLNLNDNLNEEQIDYLLNQTKGQTLLESTAK IVIDEQNLDVSLTVLENNDYIKLAQDLSNFKSLKEGIALSKKTADKLDLKVGDIIKWKGN LDHQEYQSKISLIIRTPNIQGITIMKEEYLQTGHQYYPTAIIGSKNQDCLTNLTNISSIQ YQQDLLESFDVLLEAMILIIIILVLGAIILGGVILYNLGVLSYLERYYEFSTLKVLGFKD TQIQKILVQQNIWLSMIGILLGLPVGYLLIEYMISTI Prediction of potential genes in microbial genomes Time: Thu May 26 10:25:00 2011 Seq name: gi|223714093|gb|ACDT01000122.1| Coprobacillus sp. D7 cont1.122, whole genome shotgun sequence Length of sequence - 3696 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 518 549 ## COG0172 Seryl-tRNA synthetase 2 1 Op 2 . 
+ CDS 535 - 1362 659 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 1480 - 1515 -0.5 - Term 1108 - 1143 -0.5 3 2 Tu 1 . - CDS 1364 - 1795 225 ## gi|237733429|ref|ZP_04563910.1| lysR-family transcriptional regulator - Prom 1968 - 2027 7.7 + Prom 2209 - 2268 7.8 4 3 Tu 1 . + CDS 2363 - 2545 62 ## + Prom 2915 - 2974 5.2 5 4 Tu 1 . + CDS 3063 - 3494 438 ## COG3279 Response regulator of the LytR/AlgR family Predicted protein(s) >gi|223714093|gb|ACDT01000122.1| GENE 1 3 - 518 549 171 aa, chain + ## HITS:1 COG:NMB1684 KEGG:ns NR:ns ## COG: NMB1684 COG0172 # Protein_GI_number: 15677532 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Neisseria meningitidis MC58 # 3 169 261 428 431 167 46.0 9e-42 TYTSYSPCFRKEKGAHGIEERGVYRIHQFEKQEMIVVCKPEDSKMWFEKLWNNTVDLFRS LDIPVRTLECCSGDLADLKSKSIDVEAWSPRQKKYFEVGSCSNLTDAQARRLNIRINGEN GKYFAHTLNNTVVAPPRMLIAFLENNLNEDGSVNIPKALQPYMGGKEKITK >gi|223714093|gb|ACDT01000122.1| GENE 2 535 - 1362 659 275 aa, chain + ## HITS:1 COG:MA0317 KEGG:ns NR:ns ## COG: MA0317 COG0561 # Protein_GI_number: 20089215 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Methanosarcina acetivorans str.C2A # 1 133 7 138 151 79 35.0 5e-15 MKTLYISDLDGTLLNSQGKISDYSIKTINNLINEGMIFTYATARSLVSASPVTRGLIKNL PLIIYNGTFIVNGETGKLLHKNIFNSKQVAHIKAIMEKTQLKPMVYALVNEKERVTIINE SLHEGVKYYLSKRKTDYRINLTLDYLSLYEGEVFYFTIIGDYDNLRPAYESLKDDLDYNI TFQQEIYRKEYWLEIMPKSASKASAILKLKELLNCDRIVSFGDAINDLPMFAISDQCYAM ANAVTSLKQQATAVIKSNDEDGVAHWLKEHVINTK >gi|223714093|gb|ACDT01000122.1| GENE 3 1364 - 1795 225 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733429|ref|ZP_04563910.1| ## NR: gi|237733429|ref|ZP_04563910.1| lysR-family transcriptional regulator [Mollicutes bacterium D7] # 1 143 35 177 177 250 100.0 2e-65 MIDIALGLSFPKYPDYQKILIKKYSLIVLINKNNPLANQKTVSQEELPNILFDVRKLYYQ DNYPEFEGNLLKIACNQGCAILHAFTKDNCYNDYLKDIPLTPLSEKSVYLIYDQDNYNPL ISNFINYLKTTSILNHKNYFLSY 
>gi|223714093|gb|ACDT01000122.1| GENE 4 2363 - 2545 62 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MALGIILNTRTDLGVAAFTSLFYAFSKVKNISLGIASIILYLILIIVQICLVRKLTITIF >gi|223714093|gb|ACDT01000122.1| GENE 5 3063 - 3494 438 143 aa, chain + ## HITS:1 COG:lin0983 KEGG:ns NR:ns ## COG: lin0983 COG3279 # Protein_GI_number: 16800052 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Listeria innocua # 12 143 15 146 151 83 34.0 1e-16 MKVLIECEEDNELEVVIRCNKIDDEVRRIISLFEDKHFIIGKLDNRSYQIKIDDIYYLEA NEDRAFIYCEDKVYETSLRLYELEDQLDPRLFVRISKSILLNLNKLDNVRALLNGRYEAL LINRERLIITRHYVSSFKEKFGM Prediction of potential genes in microbial genomes Time: Thu May 26 10:25:23 2011 Seq name: gi|223714092|gb|ACDT01000123.1| Coprobacillus sp. D7 cont1.123, whole genome shotgun sequence Length of sequence - 45955 bp Number of predicted genes - 42, with homology - 41 Number of transcription units - 16, operones - 9 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 349 243 ## COG1131 ABC-type multidrug transport system, ATPase component 2 1 Op 2 . + CDS 367 - 546 214 ## gi|167755142|ref|ZP_02427269.1| hypothetical protein CLORAM_00647 3 1 Op 3 . + CDS 597 - 1676 970 ## FMG_0240 antibiotic ABC transporter permease protein 4 1 Op 4 . + CDS 1663 - 2811 1112 ## FMG_0241 putative ABC transporter 5 1 Op 5 . + CDS 2846 - 3349 278 ## PROTEIN SUPPORTED gi|228000081|ref|ZP_04047083.1| acetyltransferase, ribosomal protein N-acetylase + Term 3350 - 3393 6.3 + Prom 3966 - 4025 10.1 6 2 Op 1 2/0.000 + CDS 4190 - 4648 401 ## COG0590 Cytosine/adenosine deaminases 7 2 Op 2 1/0.333 + CDS 4710 - 6632 1891 ## COG2812 DNA polymerase III, gamma/tau subunits 8 2 Op 3 . + CDS 6645 - 7238 621 ## COG0353 Recombinational DNA repair protein (RecF pathway) 9 2 Op 4 . + CDS 7271 - 7432 157 ## + Term 7437 - 7461 -1.0 - Term 7425 - 7449 -1.0 10 3 Tu 1 . 
- CDS 7454 - 7870 512 ## COG0071 Molecular chaperone (small heat shock protein) - Prom 7941 - 8000 7.3 + Prom 7978 - 8037 12.8 11 4 Op 1 22/0.000 + CDS 8075 - 8695 707 ## COG0125 Thymidylate kinase 12 4 Op 2 7/0.000 + CDS 8692 - 9633 824 ## COG0470 ATPase involved in DNA replication 13 4 Op 3 1/0.333 + CDS 9646 - 10458 1006 ## COG1774 Uncharacterized homolog of PSP1 14 4 Op 4 1/0.333 + CDS 10458 - 11195 757 ## COG4123 Predicted O-methyltransferase 15 4 Op 5 . + CDS 11188 - 12042 864 ## COG0313 Predicted methyltransferases + Prom 12101 - 12160 5.7 16 5 Op 1 . + CDS 12180 - 13496 1382 ## COG0726 Predicted xylanase/chitin deacetylase + Prom 13510 - 13569 6.7 17 5 Op 2 . + CDS 13589 - 14239 453 ## COG2357 Uncharacterized protein conserved in bacteria 18 6 Op 1 . + CDS 15312 - 17243 2216 ## COG0143 Methionyl-tRNA synthetase 19 6 Op 2 . + CDS 17243 - 17944 661 ## COG2188 Transcriptional regulators + Term 17955 - 18005 9.0 - Term 18210 - 18256 8.3 20 7 Tu 1 . - CDS 18286 - 20502 1556 ## COG2200 FOG: EAL domain - Prom 20522 - 20581 10.1 + Prom 20569 - 20628 11.7 21 8 Op 1 . + CDS 20692 - 22143 1179 ## Elen_3094 regulatory protein GntR HTH + Prom 22152 - 22211 9.6 22 8 Op 2 1/0.333 + CDS 22235 - 23245 1141 ## COG0407 Uroporphyrinogen-III decarboxylase 23 8 Op 3 . + CDS 23229 - 24656 1280 ## COG3894 Uncharacterized metal-binding protein 24 8 Op 4 . + CDS 24653 - 25294 369 ## Ccel_3200 hypothetical protein 25 8 Op 5 . + CDS 25297 - 25929 750 ## COG5012 Predicted cobalamin binding protein 26 8 Op 6 . + CDS 25923 - 27020 1359 ## BVU_2930 trimethylamine corrinoid protein 2 + Term 27021 - 27050 -0.2 + Prom 27196 - 27255 8.4 27 9 Tu 1 . + CDS 27319 - 27801 609 ## COG2515 1-aminocyclopropane-1-carboxylate deaminase + Term 27909 - 27953 -0.0 28 10 Tu 1 . - CDS 28175 - 28765 565 ## CLH_1589 hypothetical protein - Prom 28873 - 28932 9.8 + Prom 28709 - 28768 7.8 29 11 Op 1 . + CDS 28968 - 31538 2207 ## COG2199 FOG: GGDEF domain 30 11 Op 2 . 
+ CDS 31581 - 33140 1616 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold 31 11 Op 3 . + CDS 33141 - 33782 767 ## COG0325 Predicted enzyme with a TIM-barrel fold + Term 33857 - 33894 2.1 + Prom 33853 - 33912 11.4 32 12 Op 1 . + CDS 33991 - 34740 793 ## COG4821 Uncharacterized protein containing SIS (Sugar ISomerase) phosphosugar binding domain 33 12 Op 2 . + CDS 34737 - 35483 766 ## COG4821 Uncharacterized protein containing SIS (Sugar ISomerase) phosphosugar binding domain 34 12 Op 3 . + CDS 35533 - 37632 1701 ## COG3711 Transcriptional antiterminator 35 12 Op 4 . + CDS 37625 - 38773 1063 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase 36 12 Op 5 . + CDS 38777 - 39052 342 ## gi|237733468|ref|ZP_04563949.1| predicted protein 37 12 Op 6 . + CDS 39085 - 40371 1571 ## COG3037 Uncharacterized protein conserved in bacteria + Prom 40423 - 40482 7.9 38 13 Op 1 . + CDS 40571 - 41113 393 ## gi|167755103|ref|ZP_02427230.1| hypothetical protein CLORAM_00607 + Prom 41118 - 41177 8.0 39 13 Op 2 . + CDS 41365 - 42126 796 ## Dhaf_3021 sporulation transcriptional activator Spo0A + Term 42147 - 42209 11.1 - Term 42106 - 42146 -0.3 40 14 Tu 1 . - CDS 42238 - 43053 821 ## COG0784 FOG: CheY-like receiver - Prom 43082 - 43141 7.5 - Term 43172 - 43223 2.7 41 15 Tu 1 . - CDS 43251 - 44354 1146 ## COG5279 Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain - Prom 44531 - 44590 8.0 + Prom 44510 - 44569 7.0 42 16 Tu 1 . 
+ CDS 44598 - 45954 1283 ## COG1316 Transcriptional regulator Predicted protein(s) >gi|223714092|gb|ACDT01000123.1| GENE 1 2 - 349 243 115 aa, chain + ## HITS:1 COG:CAC0236 KEGG:ns NR:ns ## COG: CAC0236 COG1131 # Protein_GI_number: 15893528 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Clostridium acetobutylicum # 1 114 131 244 310 149 59.0 2e-36 PKQLSGGLLRRLNIACGIAHKPKLIFLDEPTVAVDPQSRNNILEGIKRLNELGATIVYTT HYMEEVEQLCNQIIIIDKGRVIASGTKDELKNMITLGERITIELKSMNQTFIEQT >gi|223714092|gb|ACDT01000123.1| GENE 2 367 - 546 214 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755142|ref|ZP_02427269.1| ## NR: gi|167755142|ref|ZP_02427269.1| hypothetical protein CLORAM_00647 [Clostridium ramosum DSM 1402] # 1 59 254 312 312 107 98.0 3e-22 MNIEIENNTVHISYNTKEDNLSKLIDYVRENNVNYTSLFSERPTLNDVFLELTGKELRD >gi|223714092|gb|ACDT01000123.1| GENE 3 597 - 1676 970 359 aa, chain + ## HITS:1 COG:no KEGG:FMG_0240 NR:ns ## KEGG: FMG_0240 # Name: not_defined # Def: antibiotic ABC transporter permease protein # Organism: F.magna # Pathway: ABC transporters [PATH:fma02010] # 2 359 19 365 365 202 36.0 2e-50 MVIFWTLVFPLALATFFNLAFANLTASEEFETVKVALVEEQANSQFKETLQELEKGDDKL IDLQITSLDKAQQLLKDEKITGYYVVSNEIKLVVNNTGIDETILESIVNEYNSTMSTVEN IATLNPSALQSNILNTVNLSKNNFENQNIGGNTDLTVVYFYTLIGMNCLNASFCGLRTTS QIEANLSRQGTRITISPISKLVTVLSGILSAFIIQFAIMMVLMAYLIFGLGVGFGEQTGY IILLIAIGCFTGVGLGNFAGNVLKFKKEDTKISFLSSFSLVLSFFSGMMIIDIKYWMQTN LPILTYINPVSLITDGLYALYYYDNLSRYTTNMICLFVIGVLLILGSLFFTRRKQYDSI >gi|223714092|gb|ACDT01000123.1| GENE 4 1663 - 2811 1112 382 aa, chain + ## HITS:1 COG:no KEGG:FMG_0241 NR:ns ## KEGG: FMG_0241 # Name: not_defined # Def: putative ABC transporter # Organism: F.magna # Pathway: ABC transporters [PATH:fma02010] # 1 377 1 361 370 169 32.0 2e-40 MTVFKKYLKIANTYTLMILAYTAIFLGLAIFTGTYNSTSTDYKSMDVKVAIINRDKNTEL INGFKDYIKDHGELISLEDDNESLRDALFYRTVDYIMIIPDNYTTDFINGKDVTIETMEL PDAYSSIYSKNLLNKYLNTANLYLKAGISDTELSKLIKEDLNKKVEVGMLDKQNEVDFTM 
PATYYNFSNYMLITITMVIVTMIMVSFNEEKIKRRNLVSPVSYKSMNRQLMLGNYTVGLA IWLLYVGFSFILYKDAMLTMNGLFLVLNSLVLMIFIQAFSFMIAKFTSNREILSGVGNLF GMGSSFICGAFVPQSMLSPFVLSLAKFLPSYWFIKANNEIIKLTDFSFGSIKPILIDMLI IFGFTLLVYLATQIVTKIRLKK >gi|223714092|gb|ACDT01000123.1| GENE 5 2846 - 3349 278 167 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228000081|ref|ZP_04047083.1| acetyltransferase, ribosomal protein N-acetylase [Brachyspira murdochii DSM 12563] # 7 167 6 166 166 111 39 7e-24 MEIEYLEAVPSDALKIINYLNVVAGQSDNLTFGLNECILNEVQEMQLIKEIHEDPNSVMI VAKDGDEIVGIATLSGNQKTRLSHRAQMGVSVLKSYWHQGIGSKLVAIIIGYAAEASIEI IELEVVTSNENAIALYQKYGFEVIGTYENFMKIDDHYVDAYLMNLYL >gi|223714092|gb|ACDT01000123.1| GENE 6 4190 - 4648 401 152 aa, chain + ## HITS:1 COG:SA0516 KEGG:ns NR:ns ## COG: SA0516 COG0590 # Protein_GI_number: 15926236 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Staphylococcus aureus N315 # 1 152 1 154 156 145 48.0 3e-35 MDQDLEFMEIAYQEALKCLDMDEVPVGAVIVKDGKIIACGRNLRETSKRATAHAEIIAIE EACRTLNSWYLDECTLYVTLEPCVMCSGAIINSRIQRVVFGAFESRWLALTTIYQSDIPV NHQPVIVSGVLGDKCSKVIKDYFKNKRKRDKS >gi|223714092|gb|ACDT01000123.1| GENE 7 4710 - 6632 1891 640 aa, chain + ## HITS:1 COG:lin2852 KEGG:ns NR:ns ## COG: lin2852 COG2812 # Protein_GI_number: 16801912 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Listeria innocua # 1 352 1 357 579 330 48.0 4e-90 MSYKALYRSYRPQTFGEVAGQEHIVTTLKNAIKENRISHAYLFAGPRGTGKTTVAKLLAK ALNCTGENPPCDQCPNCKAITVGEHPDVIEIDAASNNGVDEVRDLIDKVKYAPINGKYKV YIIDEVHMMSTGAFNALLKTLEEPPAHIVFVLATTEPHKILPTIISRCQRFDFKKVENHD IISRLEYVLKSENKKYELPALESVAKLAEGGMRDALSILEQCLAYNNELTVESVNMVYGL LSMDNKISFIKQLLSKDIKGVLTSLDNMLSGSIDIKRLTFDLVDVLKDIIIYKNTQDVSI LFVLTQQDVDNLAPYILVEEAFEIIDILIEASSHYSQSLDANTYFELAMLKICNRIKEEN KLAIDNSKAIEQVNILPVKDTAKAIPVVEETDSLPEEVIEDEIIEEELNKGVIEYDPEIE ESIPEELKAKADDITETVVPEEVAEGTLETVISNSDVSLPVGEDISQEIDENIIVNKSPE 
NIEVSFSDILNILVQADRRVLNDIKEKWTVIARYRFNLNTAKFASMLCDGKPVAAAPGGI IVAFEHQPNVNEVNETQNYYQLKNFLKEVLGENYDFIAIKNSLWPDMRSKYIDMNRAGTL PAPEPIVLHHIGEFKEKRAELNDAQAMAVELFGDLVEFEE >gi|223714092|gb|ACDT01000123.1| GENE 8 6645 - 7238 621 197 aa, chain + ## HITS:1 COG:BS_recR KEGG:ns NR:ns ## COG: BS_recR COG0353 # Protein_GI_number: 16077089 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Bacillus subtilis # 1 197 1 198 198 215 51.0 4e-56 MNYPQAFQDLVDCFKRLPGIGGKSAERLAYHVLSMDKKYVDEFSAAIGSIQTKIHYCKKC GHICEGEICDICQDPDRDQTTICIVEDPKDVFAMEKVKEYHGLYHVLHGAMSIMDGKTMD DLNIASLFERLDDSIKEVIIATNPTRDGETTALYLAKLLSKKNINTSRIANGLPIGSNID YADELTLLKSLEGRKKI >gi|223714092|gb|ACDT01000123.1| GENE 9 7271 - 7432 157 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRRVLFVITRAFVILLILNILTNNYFSYNFINIFVLSMLSLPGIIVIYIISLL >gi|223714092|gb|ACDT01000123.1| GENE 10 7454 - 7870 512 138 aa, chain - ## HITS:1 COG:CAC3714 KEGG:ns NR:ns ## COG: CAC3714 COG0071 # Protein_GI_number: 15896945 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Clostridium acetobutylicum # 31 124 46 138 151 65 39.0 4e-11 MKLLPGFATFDDVFENMFSDPFFKGTSTHMRTDVKEVDDNYVLDMELPGFDKKDISIELK DGYLNITANKSTDGKETDNQGNIIRQERFTGTCSRSFYIGDEVKQEDIRASYDNGELKIT LPKKAPKEITTNTYIPIE >gi|223714092|gb|ACDT01000123.1| GENE 11 8075 - 8695 707 206 aa, chain + ## HITS:1 COG:BS_tmk KEGG:ns NR:ns ## COG: BS_tmk COG0125 # Protein_GI_number: 16077096 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Bacillus subtilis # 1 206 1 207 212 228 56.0 9e-60 MKGKFITLEGPDGSGKTTVSKIVVEQLQMEGYKVLLTREPGGIDIAEQIRKIILDTNNIT MDARTEALLYAAARRQHLVEKVAPALNDGYIVICDRFVDSSLVYQGVGRKIGIEEVYQIN QFAIGNIKPDATIFFDLPYEVGLARINNGERVADRLDLESDDFHKDVYNGYMTICEKYAE RITKIDASKTIDEVVAQVINVIKSKL >gi|223714092|gb|ACDT01000123.1| GENE 12 8692 - 9633 824 313 aa, chain + ## HITS:1 COG:BS_holB KEGG:ns NR:ns 
## COG: BS_holB COG0470 # Protein_GI_number: 16077099 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: Bacillus subtilis # 9 313 12 326 329 130 28.0 4e-30 MKEYIEKNQPIFYNLIKNEFSQQRIPHAFLLIGNNTNIPLTYLAMSLICDETLACENCND CRKVKENKYSDIIRFNGKDNSIKKGNIELIQDTFKKSSLEGKAKIYIIENIEYATKEAMN TLLKMLEEPTEGIYAIFTANNASRVLPTILSRCQVIDIKPDSKEVIIGALCQDGITKESA HILAYLAPSIEEAKTLYDERFEYMQLQVINFIEDLFLKRANLIINTQTNLLKKYKERDDI KLFLNMLVLAMKDMFHVKHTDDIVYCEHQEFLRNIKIDESQLIKQIEIILETIYTIESNA NVPMLMDSMMYRL >gi|223714092|gb|ACDT01000123.1| GENE 13 9646 - 10458 1006 270 aa, chain + ## HITS:1 COG:BH0045 KEGG:ns NR:ns ## COG: BH0045 COG1774 # Protein_GI_number: 15612608 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Bacillus halodurans # 6 248 4 247 275 233 47.0 4e-61 MEKIKLASVKFKSAGKVYYFSTDIELDKGDKVVVETARGVELGEISQSLKSISEFNLDTE LKKITRKATQRDIEAYKKNIIDAQEALVTCRDIISRYDVDMQLTNCEFTLDKAKVIFMYT SDVRVDFRELLKELATVFRCRIELRQIGPRDKAKVIGGMGTCGLPLCCTSLLGEFNGVSI NMAKNQMLAINIEKISGACGRLMCCLKYEDEVYTIEKQRFPKIGSRVKYEGKDVKVLGLN VINDLVKIDNGGAIIFVNLDEIKFKQKDNK >gi|223714092|gb|ACDT01000123.1| GENE 14 10458 - 11195 757 245 aa, chain + ## HITS:1 COG:BS_yabB KEGG:ns NR:ns ## COG: BS_yabB COG4123 # Protein_GI_number: 16077102 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Bacillus subtilis # 1 239 4 243 247 202 42.0 3e-52 MSDQEVLNYLLAYNNMKIIQRKDMFNFSLDTVLLANFCTITKDVKQIIDFGTNNAAIPLL LSQRTNRPITGIEIQKEAVDLAIKNIELNNLETQINIVHADIAEYVKDAKKVGLVICNPP FFKVDEDSNLNENEYLTIARHEIKINLEGIIKSAARILDNKGKFAMVHRPDRMIDILNLM QKYDIEPKRIRFVYPKIDRDSHVLLVEGMYKGKKGLKIEPPLYAHNADGSYSNEVRKMFG ENIDE >gi|223714092|gb|ACDT01000123.1| GENE 15 11188 - 12042 864 284 aa, chain + ## HITS:1 COG:BS_yabC KEGG:ns NR:ns ## COG: BS_yabC COG0313 # Protein_GI_number: 16077104 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Bacillus subtilis # 1 282 2 291 292 278 55.0 
1e-74 MNRQKSFQNDRPCLYLVATPIGNLEEMTYRAIRTLQEVDYIGAEDTRNTVKILNHYNIRT KLISHHEHNLGQSIPKLINLLLDGNNIALVSDAGYPAISDPGYELVKAAIDNEINVIPIS GANACLDALVVSGIAPQPFLFYGFLDHQDKKKKKELQVLKNYQETIVFYESPHRITKTLK LMEDILGDRPIALCREITKKHEEILRGSISEITKVAGDLKGEMVIVVSGNNNVIEETVFE QTIVEHVDEYVSKGMTVKDAIKEVAKLRNIKKNEVYATYHQKDE >gi|223714092|gb|ACDT01000123.1| GENE 16 12180 - 13496 1382 438 aa, chain + ## HITS:1 COG:lin0436 KEGG:ns NR:ns ## COG: lin0436 COG0726 # Protein_GI_number: 16799513 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Listeria innocua # 19 427 8 445 466 166 29.0 7e-41 MEYRHAINNRRKLKKKPTIALVVIVVIAILLGGIFWFINREGPYDKYKVYNKDNKKFGTV EHYEKDDDSFFVSLYYPKTKNSNLDKIVKDYQENYVKEQKINKNSKDILYMDYSINEVYN QFINLKFKTTRYDEDDKVVETKEKLFTYDTKKEKILTVGDSLRNTFKTVLASSQGIDKVD AKSNNLTVEKDKLIIYTTEDLKNKIEVNYKDNKELIKLANKNIPSDAPLDVAGPAAQPEV DPNKKMIAFTLDDGPHKTNTLKVVEMFEKYNGRATFFELGKNITLYPDVVKTVYEHGFEI ASHSWDHPDLRKLDAEGLNKQIVDTQNAIYKITGAEPTLIRPPYGAFNDNVKSVVKNNGM EIALWSVDTLDWKLKDANKIKETIINNSYDGAVVLLHDIHNFSVEGLEMALGELYNRGYQ FVTLDTLKQYKDLKTVFR >gi|223714092|gb|ACDT01000123.1| GENE 17 13589 - 14239 453 216 aa, chain + ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 16 207 23 215 217 172 49.0 5e-43 MNQEMINAFQEQRHFEELMQPYRAAIMIVKTKLEIIDQELKFKCEHSPIHNIQSRIKSPQ SILDKLERKGFSKNFNNIKELNDIAGLRVICQYINDIQYIAQLLVLQDDVIQIKHNNYIQ YPKENGYRSLHLIVSVPVYQREGMMQVPVEIQIRTIAMDCWASLEHELAYKTNFVANNDI KQRLKWCADMMAKTDEEMQKVYLELNQPWSNKKCFT >gi|223714092|gb|ACDT01000123.1| GENE 18 15312 - 17243 2216 643 aa, chain + ## HITS:1 COG:CAC2991_1 KEGG:ns NR:ns ## COG: CAC2991_1 COG0143 # Protein_GI_number: 15896243 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 529 1 523 536 643 59.0 0 
MKNEKDYYLTTAITYTSGKPHIGNTYEIILADAIARFKRQEGYNVYFQTGTDEHGQKIEL KAKEAGIPPKQFVDEKAGEVKRIWDLMDTTYDRFMRTTDDYHKSQVQKIFKKLYDQGDIY LGHYEGKYCTACESFFTESQLVDGKCPDCGGEVHDAKEEAYFFKLSKYADRLIEHIETHP EFIQPVSRKNEMMNNFLKPGLQDLCVSRTSFTWSIPVDFDPKHVVYVWIDALSNYITGLG YDVDGNHGELYQKYWPADLHLIGKDIVRFHTIYWPIMLMALDIPLPKQVFGHPWLLQGES KMSKSKGNVIYADDLVDIFGVDAVRYYVLHEIPYDNDGSITWDLLVERINSDLANVLGNL VNRTIAMSNKYFGGIVNNPKVTEAVDDELIEIALAMPKKVIGKMDEFKSSDALDAIFTLL RRTNKYIDETMPWALAKDETKRDRLATVLYNLIESIRFAAIALIPYMPSTANKILDQINT DQRDLLALDTFGTYKEGTKVAEKPEMLFARLDPKEIQKKVEALQPAKPEVKEEPKEDNTI TIDDFKKIELIVGTVEECKKHPDADKLLVSQINLGKETRQIVSGIADHYTPEEFVGKKVI VVANLKPAKLRGIESQGMILAGDKKGLLEVISVENLPNGTKIH >gi|223714092|gb|ACDT01000123.1| GENE 19 17243 - 17944 661 233 aa, chain + ## HITS:1 COG:BH0914 KEGG:ns NR:ns ## COG: BH0914 COG2188 # Protein_GI_number: 15613477 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 3 233 9 238 248 95 27.0 1e-19 MASLYPFIKEDILKDIQSGVYQEDKMLPPEREFTEKYRVSRMTIRRALDELIQDGVLIRK SGSGVFIAKNKKSRSISKVSIQTDEEIVKTYGKVIIKVVSIKTVVNHPLALRYLDVKTDE EVYQLKRVQYGGKTPIVYENIFLPKRYFNSLDKIDCTRSMSQIVHETIKVKEATNRSIEV EARLASKKLATYLEVAKNAPILQTTIIEKNAEKNPLYCGVNSFDATEFKYTSE >gi|223714092|gb|ACDT01000123.1| GENE 20 18286 - 20502 1556 738 aa, chain - ## HITS:1 COG:slr1692 KEGG:ns NR:ns ## COG: slr1692 COG2200 # Protein_GI_number: 16330979 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 485 736 63 320 332 169 34.0 2e-41 MKKIGKNIVNKKPSIYPVGKLIYIYIILIVIIVLMIFNTINLNNTLGSSTNTYVNDVTSQ LADDISMRIEAIKSSLRLLSDSINILPDNQEISSFLQKKCKSLEFTDLFVIDDKGNTIPE QSETFDLSLVSSSFKGENNLIYLDGQSLLFSTPIYDEKGITKVLIGIRDKENIQNLIKPK SFNGQGLSCIIDQNGNVVISPTDLKPFMQLDNIFKDGTDTDAVNHIKKMRKNIINGKSGV FEFTASDDTSLIISYHYLDHNDWTLLTLVPGELISSGTNQYIFYNFILVACVICVFSVLL YLIFLSYKKHKNWLEKIAFTDPLTAGLNNAGFQLDAQKLVNTAPPSNYTVIFLDIKQFKL INQNYGFNKGNDVLKHIYNVIYNNISSNELVARSENDHFFICLNESNKQIIQSRLDMIIA EINSFAKNKASQLTILQGACLIENPNSEINIIQDYARLACRLNSKKSKCTYYDKEILNTM 
KKEQELNDLFENSIKNKDFHMFLQPKIIISNNKIGGAEALVRWFHPERGTIYPSDFIPLF ERNDNIIKLDLYIFEEVCIFLQDMIARGEKPFVISVNVSRVHFKDNDFLKAFARIKEQYR IPNNLIELELTESIFFDNQQIESIKDTIHQIHQYGFLCSLDDFGAGFSSLGLLKDFKVDG IKMDKKFFDDISSQKSKDIIANFIELADKLNISLVAEGIETLEQLEFLKSVHCNMVQGYI YSKPLPVDEFILWLKTIK >gi|223714092|gb|ACDT01000123.1| GENE 21 20692 - 22143 1179 483 aa, chain + ## HITS:1 COG:no KEGG:Elen_3094 NR:ns ## KEGG: Elen_3094 # Name: not_defined # Def: regulatory protein GntR HTH # Organism: E.lenta # Pathway: not_defined # 1 482 1 486 486 216 28.0 2e-54 MKQKQTLFNYLYNNLHDLIVSGRLPYGSKLPSISELCEFYNIGIRTVKDVLHVLKEEGYI STHERKATTVVYNIHSKFKEDCLEYVLEHRQEIIDVYKTIGLIMPVIFSFAAQIWDEEDL QLCSQRLKESEDKSAEERERICTRIFFELLDKSHNPLLRDIFSSLEIYARPVFFVNYEKY INYFNLEYTFKSITWVTSSLLTRDKSEIEYRFGLMYDTVINVIEKTLTDLALKYPEIKEM TPNYTWSAELGRDHCYTQIARDLINKISLGIYPVGSFLPPEAKLAKMYKVSVSTIRKSLH MLNELGFGETMNVKGTRVVIQDEQTAIKCMQNKQYRQDTLLYLNGVQAMVILIKKAATLA FPNITQEKIKNLQGEIEDSKNLMLECLLNCIVDNLPLLPFKTIIQETNKIIYWGYYFAFY PSEKQSINIINIKSKKALEYLCLGEAECFADEISSSYNHSLNVVRDYLIEYGLGEAKNII SPE >gi|223714092|gb|ACDT01000123.1| GENE 22 22235 - 23245 1141 336 aa, chain + ## HITS:1 COG:MA0146 KEGG:ns NR:ns ## COG: MA0146 COG0407 # Protein_GI_number: 20089044 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III decarboxylase # Organism: Methanosarcina acetivorans str.C2A # 65 331 74 337 339 116 30.0 6e-26 MEIEKWLNDVLEKKNKRAMPIITFPAATKLNISVKKLINDTDQQVAALKIIKEECNPLAI LGFMDLSVEAECFGAEIKVTDNEVPTVVGQLIEDEDDAQALKVPAVGSGRSQLYIDAIKK TKAEIKDIPILGGCIGPFSLAGRLMDVSEAMINCYEEPEMVHIVLEKATKFLIDYISAYK EAGADGILMAEPLAGVLSPVLATEFSSTYVKKIVTAVQDENFMVVYHNCGNNTLKMIDDF KEIKARAYHFGNAINIKKMLELMPKDCLVMGNIDPVEIISNGTPETIKEITLNLLKECKD YRNFIISSGCDIPPLAPWKNIKAFMEVCENYYENDS >gi|223714092|gb|ACDT01000123.1| GENE 23 23229 - 24656 1280 475 aa, chain + ## HITS:1 COG:AF0010 KEGG:ns NR:ns ## COG: AF0010 COG3894 # Protein_GI_number: 11497631 # Func_class: R General function prediction only # Function: Uncharacterized metal-binding protein # Organism: Archaeoglobus 
fulgidus # 99 472 191 595 597 186 28.0 8e-47 MKTILEYLQEQDCNFIAPCNGQKKCGKCKVKATNRIIEVNHDDLKLLTKKELDQGYRLAC SHLYHQGDRFILPQADGVIEDSIYLNDIVVITEPVEKVGIIIDIGTTTVAMKWLNLKSGM IIESQSFFNPQGKFGSDVIARIDFDNRDNDHKLGELIIKEIFSRINVSTKIKEILVCGNA TMINLFLKEKVKTIGVSPFDVPILTMTEYPLNYFIKSSVKIEVMTMNHISAYVGSDIVMG IYATNMDKNKENVLLMDLGTNGEMVIGNKHRLLATSCPAGPAFEGVNIECGGPSIAGAVC ATKVENNKLVYKTIDNQDANSICGSGLISLIANLLRLGIIDDTGNFLNKQKKYYLNDEVY LSIKDIKAFMLAKAAIQAGKEVLLKELNDEVTTIYIAGGFGNYLEKADLVTLNIISSEEV NKVQYIKNSAISGLYKLILTRDFQRVQHISNQTKVIYLEKDPDFNDYWIDAMVFS >gi|223714092|gb|ACDT01000123.1| GENE 24 24653 - 25294 369 213 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3200 NR:ns ## KEGG: Ccel_3200 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 2 148 4 166 253 74 28.0 2e-12 MIKVIACKIYEYYISQLDLSLDNYEFVYLDIQIHNQPHTLAKNIQKEIDKSRGDEKIIVL YGLCGNALVEIQARNIPVYLVRVHDCLTVLLGSKQRFKQLFSNRLSSSWSCYSLKVNGCN VYEDEQYLKWCQRYDQETADYLKSQLQTQDNVYLIYNYCDDVPFLKNQEQIKIDLKFLKQ ILLQTSNELVILEKNQKVVISSDIDQVFEIMEE >gi|223714092|gb|ACDT01000123.1| GENE 25 25297 - 25929 750 210 aa, chain + ## HITS:1 COG:mlr1231 KEGG:ns NR:ns ## COG: mlr1231 COG5012 # Protein_GI_number: 13471298 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Mesorhizobium loti # 1 201 22 224 238 153 39.0 2e-37 MIEEISELIQKGQAKKVVKLVNEALEAGFPATDILNQGLLKGMDAIGVKFKNEEVFVPEV LVAARAMNKGIEILKPYLQDTEHQSKGTVILGTVKGDLHDIGKNLVKIMLEGKGLEVVDL GVDVDAQKFVEEAKIHQAQIICCSALLTTTMPEMKKVVDLVAQEHLNVKVMIGGAPVNQD YCNEIGADYYTDDATSASEVAYAIVGGTLC >gi|223714092|gb|ACDT01000123.1| GENE 26 25923 - 27020 1359 365 aa, chain + ## HITS:1 COG:no KEGG:BVU_2930 NR:ns ## KEGG: BVU_2930 # Name: not_defined # Def: trimethylamine corrinoid protein 2 # Organism: B.vulgatus # Pathway: not_defined # 89 311 89 306 600 73 25.0 2e-11 MLTKAKKEQLVEDFEKWWEHKLERPIIQVTLFDEDYQPQKHRYNRGELLEMLYDLDKPVN EVVQAYQETFFSNIYLGDAFPIFYMRSTGVLGAYLGQTYHIDVKQGTIWFQEMKGCELED 
IHPRLEETYPLYQRSLELIKAFNQFYGDDIAMGIANLGGMMDIVESMRGANNSLIDLYDD PDEVQRLNDDIYKAFEQAYEEMIASIDLNNTLGYTGWISLLSQKPYFISQCDFCCMIGPE QFDEFVFETLKKEASLIERSFYHLDGPGAVRHLDKIIECGFKGIQWINGAGAKPLNDPCW NEIYQKVHDAGLLLQVNISGKEELEYIDYIVDYLGSPKGIAFICTGSSKDKEVFEAYLEK YHIPN >gi|223714092|gb|ACDT01000123.1| GENE 27 27319 - 27801 609 160 aa, chain + ## HITS:1 COG:PAB2303 KEGG:ns NR:ns ## COG: PAB2303 COG2515 # Protein_GI_number: 14520280 # Func_class: E Amino acid transport and metabolism # Function: 1-aminocyclopropane-1-carboxylate deaminase # Organism: Pyrococcus abyssi # 1 159 79 232 330 103 38.0 1e-22 MQSNHARATAYAAAKLSMKSCLLLRGNGSSEPVEGNYFLDRLVGADIVIKEPEIFNRDKD KIMLKLKTAYEAKGYKPYIIPMGASNGIGTLGYVEAFTEILKQEEAMKVEFDTIINAVGS GGTYAGLYIGNELNRTKKQIIGFNVCDDKEYFIKEITKNY >gi|223714092|gb|ACDT01000123.1| GENE 28 28175 - 28765 565 196 aa, chain - ## HITS:1 COG:no KEGG:CLH_1589 NR:ns ## KEGG: CLH_1589 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 2 190 5 194 196 187 53.0 2e-46 MNNKTLKIATIGLMAALAYVSFTFLQIKIPTPGGTTSFHLGNTFCVLAALLLGGLPGGIA GAIGMGIGDLLDPVYVTYAPKTIILKLMIGIVTGLVAHKIFKIEKHDNDSHLIIYVVISC MAGMFFNVVGEPIFGYFYNSMILGAPEKAAATLASWNAITTSVNAVITVALASTIYLVIR PRLAGSGILVKLAPHK >gi|223714092|gb|ACDT01000123.1| GENE 29 28968 - 31538 2207 856 aa, chain + ## HITS:1 COG:DRA0297_2 KEGG:ns NR:ns ## COG: DRA0297_2 COG2199 # Protein_GI_number: 15807957 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Deinococcus radiodurans # 391 549 213 370 373 115 35.0 5e-25 MDKLNNEYKVLLDVIDNVILVIDNEMKILFVNRKGRKLIGENVLMQPCKALKMDNCSTEK CCIMRYLHGLQPLDNLHKDGSVEKVTVSRFYDNQNNPQGFIIVATDITELSNMKKELLIG EEIYKLALKQANTTLWQYDVLNHTIEQLFCPDEVALGILDINKTYYNIPESLVETKIISQ EDGLRVRKLCQEIEKGRPETSIELKMKRGDGEERWISLKCSTIFDEQGRAVKSIGIGKDI TDFVELKSKYEIEREYREALGKDALSYIEVNLTMNEVIDRKIAKNNFIDFYDGKNYEKSI LKLSENSIPSEYGKVLRELNRQNLLNNYRDGKRKLDLEYQFYNKFKDNYNFIQITLYLIE FNKDIYACGYIKDINENKVHDLVLKKKIELDPLTGLYNREAIEIKTNEIIQNFPENNHAI 
MILDIDNFKQVNDNFGHLYGDAFLCEISRKIKSKFRNDDLVARLGGDEYLVLMKNISNSE LAIDKANDLCKLIEGAYGTGGTKVKVTVSVGVVLYPNHGKTFDELYHHGDQALYKIKKNN KNGVCLYDESLEISDGQLDKVNNIVDRATKKFSDNVGEYIFRILYKCQDLNETITAVLEL LGMHYYMSQNLIIVRDLDIGKYKVNNYWSSKNEDLTKVIDVTYHDYWPEYVRQFDDEGIL WVNDIKASDVSPAIKTLYANSKTRMVISGLIKNGDDIIGIISMEHEKPYAYKAEEKEMLL TSIAIISTFLVKKYQEKEKNQYLNAIQMILDFQENGIYVIDPNTYKLVYYNHKIKHIFNQ VKVGDYCYKSFRGYDEPCADCPIKDMKDSDLSYTKLIYNCNISAQLETTVKRVRWINGQD VAAVTSIDITKYYNEK >gi|223714092|gb|ACDT01000123.1| GENE 30 31581 - 33140 1616 519 aa, chain + ## HITS:1 COG:FN0649 KEGG:ns NR:ns ## COG: FN0649 COG1574 # Protein_GI_number: 19703984 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Fusobacterium nucleatum # 23 516 29 539 542 258 34.0 2e-68 MRIYFRGNLYGCNQATAFVEDKGKIIFIGSDKEALKYSGEQIDLNNKYVYPGFNDSHMHL VNYGQSLKNVLLEKHTNSLKALLEELKKHLVKGQWLIGRGWNHDYFTDEQRFPTRKDLDM ISEEEPIVITRTCGHILVANNKAIELANITSEAVEGGYFDLDAGLFQENALYLIYDTIPQ PTIEEIKDNILIAQKELHSYGITSVQSDDLLSATSDYHDALQAFEQLRAENKLTIRVYEQ AQLPTLKALKEFINLGYCTGSGDEFFKIGPLKMLGDGSLGARTAFLSKPYYDAPKTRGIP VFSREEIKMMFDYANRHEMQIAIHAIGDGILDWIFEGYENALKNYSREDPRHGIVHCQIT REDQLLKYQQLHLHAYIQSVFLDYDNHIINQRVSPQLAQTSYNFKTLRNITTISNGSDCP VEAPDVLKGIQLAVTRTSIDGTGPYLKEQALTREEAIESFTIGGAYASFEEEVKGTLEVG KYCDFVVLSDNILDVDVQHIKDIKVLATYVGGQLVYGGQ >gi|223714092|gb|ACDT01000123.1| GENE 31 33141 - 33782 767 213 aa, chain + ## HITS:1 COG:aq_274 KEGG:ns NR:ns ## COG: aq_274 COG0325 # Protein_GI_number: 15605814 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Aquifex aeolicus # 18 212 28 223 228 154 44.0 1e-37 MRVNEQAVKNILKSVGNAKLVAATKYVGIEEINELEKLGVKYFGENRVQAFLEKYEKYHG AQEFHMIGTLQPNKVKYIIDKVSLIHSIDSYSLIKEVEKQAKKHDLKMPILIQVNVAKEE SKHGFAVEEIEDVFDYLKECPHLIVKGLMMMAPHIDSNDTAVYFKQTKELFDQLNCKYSE FEMTELSMGMSNDYHEALKYGATMIRVGSALFE >gi|223714092|gb|ACDT01000123.1| GENE 32 33991 - 34740 793 249 aa, chain + ## HITS:1 COG:BH3325 KEGG:ns NR:ns ## COG: BH3325 
COG4821 # Protein_GI_number: 15615887 # Func_class: R General function prediction only # Function: Uncharacterized protein containing SIS (Sugar ISomerase) phosphosugar binding domain # Organism: Bacillus halodurans # 3 242 4 239 242 154 34.0 2e-37 MEAWENYFEVMEKVVAQVKNTQKDNIKKAAKILADTTEKGGLIYGFGTGHSHLVVDDAFW RAATPANYCALLEQSATGSFEITKSYYIENMYEIGKMIVDYHRITPNDCMIIISNSGNNI APVDAAIRAKEKGIPIIAITAVEYSQFLKTKHKAGVKLKDVADVVLDNCSLIGDAAIEIE NFPMKVGATSTIPNVFLQNAILCEMVDILVKKGIHPDVYYNGHMAFMNEDCADHNDKLVD KYFYRIRNL >gi|223714092|gb|ACDT01000123.1| GENE 33 34737 - 35483 766 248 aa, chain + ## HITS:1 COG:BH0227 KEGG:ns NR:ns ## COG: BH0227 COG4821 # Protein_GI_number: 15612790 # Func_class: R General function prediction only # Function: Uncharacterized protein containing SIS (Sugar ISomerase) phosphosugar binding domain # Organism: Bacillus halodurans # 6 244 3 240 250 181 38.0 2e-45 MMKARSRYYQHIHDILNNIVDSQEAAIQKAALEMTQCIEKGHTIYAFGASHAGILTQELF YRAGGLALINPIMAKEVQLDVRPITLTSQMERLPGYGTLILEHTPIQKEDVLIIHSVSGR NTIAIDMALRAKAMGVIVIAVTNIEYSQAVVSRHETGKKLLDIADIVIDNCGDFEDSSIT IDGLDQKVAPTSTIAGAFILNSVIIQVVENLVNDGFEPPIFHSANIDGGDEYNQNILNQY KNRIHYMK >gi|223714092|gb|ACDT01000123.1| GENE 34 35533 - 37632 1701 699 aa, chain + ## HITS:1 COG:BH3853 KEGG:ns NR:ns ## COG: BH3853 COG3711 # Protein_GI_number: 15616415 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus halodurans # 1 469 3 471 700 131 23.0 5e-30 MNTRSVEILKKLCGGKNYQVEDLAHDFEISIRMIRYVIDDINDFLETINIKKNIMIRNGK IQFNITHQEKEILMKEISNLDNYVYAISPFERKCYILVSMWCCSEPLTSQYFADEMNVSK SSIDKDILSIRQEYEGYDFSINVKTGKGSYIEGDERDIRSCFFRVIEKNIDFNKFMHFQY TPITFIEKNIYRMVFDKYWKIIIKVMNDFESHSEKKLTYLSYKDICIHLAVSLTRIELGY DIILHQDYLIEISKTEGYHDAQFICSQIMQALNIKMSQSEVAIITILLNSARYTTLKRYV IYDWANIQMIATTLAFKVGEKLNVNFSRDEELINSLTLHLGPTVFKLQNQVPVVNPSLTV IKKNYHEVFIALLETIDELCYKELKGIKEDDIAFLTLHFCAALERKNRFKDVYNVVIVCV HGFGTASLMKEMICSRFKNIIVKKTVTEERILNIDLSNIDFIISSIDLQIMQCLVVKVNV ILKESDYRLIENAMEKIVHQNIQESDTFMDGILSIVKRNCEIVNYQQLFDDFSEYFKDLG 
IDAGIKLQPLLNEYLTADKVLLCDTVDHWETAVCCAGQMLVKSQDIAECYIKSMIDTVKK SGSYMVIDEGIALIHGEIDYGVNQTSMSLLIIKNGVKFNHDKYDPVHIILCLAAKDNYTH MKALNNFIKFLENIRNNGIERYFELDYVLDQINEVSKYD >gi|223714092|gb|ACDT01000123.1| GENE 35 37625 - 38773 1063 382 aa, chain + ## HITS:1 COG:SP1330 KEGG:ns NR:ns ## COG: SP1330 COG3010 # Protein_GI_number: 15901184 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Streptococcus pneumoniae TIGR4 # 139 380 4 233 233 160 41.0 5e-39 MIKYIKENCVDFYKECHDWKEAIVYAGYLLEKNGYIDHQYINDMVNIIEKNGPYIVVMPG VALAHARPNGHVYQNSISLVTFKNGVKFGHSVNDPVRVLLALAAKSDEEHLKLFQEVALC LMDRKYLHKIFNARSYQDILKDNKNIDAIKGGLIVSCYADSAINPYMDNSIAIQCLAQSC VAGGAKAIRTNLEHVKAIAEVVDVPLIGIKKIYKGDDPLHSSFRITPTMDEVDQLVAAGV DGIAIDGTQRERYDDLSLEEFVNKIKNKYPELFVIADISTVEEGIRASKAGVDAVGTTLS GYTPYSKNPIIFGTVPSPDPDYEIIKELKTAGVSRVIAEGRINDGSKMKKCIEAGAFAVV IGTSISEPAKIVKTILHDAKEG >gi|223714092|gb|ACDT01000123.1| GENE 36 38777 - 39052 342 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733468|ref|ZP_04563949.1| ## NR: gi|237733468|ref|ZP_04563949.1| predicted protein [Mollicutes bacterium D7] # 1 91 1 91 91 166 100.0 5e-40 MKSFKVITVCGAGVGTSTLLRMNIDKTFKKFALPLEVTVENKGLSMSKGLMCDAVFTFES FADELRSCYEDVIVINNLMDMNELEEKIKNY >gi|223714092|gb|ACDT01000123.1| GENE 37 39085 - 40371 1571 428 aa, chain + ## HITS:1 COG:lin2798 KEGG:ns NR:ns ## COG: lin2798 COG3037 # Protein_GI_number: 16801859 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 7 427 3 431 432 256 38.0 5e-68 MSVFKVIVDIFQNYVFNQPFIFLSIVAMIGLILQKKSIDKIISGSVKTGIGYLILSVGTS TIAGVVTPIATLLEKIMGIETVATGMGTNAFTEQWASTIAIIMVIGFFINLILARFTPFK YVFLTAHQTYYLIFVYLAVAVEVFANPNNTLVIIIGGLLTGIYCTLAPALAQPFLRKVTG SDDFAYGHTTTFGVITGSLVGNLFKKHKNESSENININPKLNFLKDITVSTALVMTILYI VAVCLAGFDYVEANLSNGQPAILYAIVSGVQFGVGITIVLNGVSMMVSEITEAFKGISEK IVPNAVPALDCPVVFNFAPTAVMLGFLSCLGTVILCTIIFGAIGWYALTPPVITTFFGGG PAGVFGNSTGGWRGAILAGVVAGLLLSFGQALTVGVLSTTVADFARWSNDFDYSVFPAFF 
KWILQLFA >gi|223714092|gb|ACDT01000123.1| GENE 38 40571 - 41113 393 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755103|ref|ZP_02427230.1| ## NR: gi|167755103|ref|ZP_02427230.1| hypothetical protein CLORAM_00607 [Clostridium ramosum DSM 1402] # 1 180 1 180 180 336 100.0 3e-91 MITVANSAEYFINEFTQEIYDQVLDHVDFKSLECSCGAKGSFVKIGCYPRFYKTATNKIC IRIQRVMCKHCGKTHAVFVECMVPSSMLLLTTQIEMLRSYYNHRLEEFLSSYPTIDRPNA IYVIKNYERKWVNYLKKSGFTLKSKEREIQRYFFEKYQVQFLQMKCFSNSSSSRLLNHLV >gi|223714092|gb|ACDT01000123.1| GENE 39 41365 - 42126 796 253 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_3021 NR:ns ## KEGG: Dhaf_3021 # Name: not_defined # Def: sporulation transcriptional activator Spo0A # Organism: D.hafniense_DCB-2 # Pathway: Two-component system [PATH:dhd02020] # 1 246 1 257 264 139 31.0 7e-32 MNNKAGIVILTENHLLKEELVNKIRSNDHYEIIATFNDGGICEDYLATHTCDLLVIDLIL TNIDGAGVIGNIRSNNPKALKHVICISDFTNSLVFEMLEGLAVDYCLKKPVDLNYFMEII DRILKIRLKHNLGIDDYHQTVLKKEIHDTFMKVGMPRHLKGYNYLVTAIVLVCGNINLLG EITKELYPRIARTYGTTASRVEQSIRHVLKCTWESGCQDELEQLFGFRAKRKTCNSEFIS TIVDELLSKYKGS >gi|223714092|gb|ACDT01000123.1| GENE 40 42238 - 43053 821 271 aa, chain - ## HITS:1 COG:BH2773_1 KEGG:ns NR:ns ## COG: BH2773_1 COG0784 # Protein_GI_number: 15615336 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Bacillus halodurans # 4 115 1 112 120 64 28.0 2e-10 MTDVKKYTVYILEENKDLLNNIKDSLIATNNFTVIGTSDNATSCLNYLSNNNCDLLIIDL MLTNIDGIGVLNKLKEVNSRAYNKVVCITSFTNPLICETLEKLNVSYCFKQPFDINYFAT TLNSVMKVTLEDIKGMSQNESEKYQKIKLENEITDILHEVGIPAHIKGYMYLRTAILSTY YNIELLGQVTKVLYPDIARQYSTTASRVERAIRHAIEVAWNRGNTDAIDDIFGYTVSATK SKPTNSEFIAMIADKLRLEHQTKAASRHSYL >gi|223714092|gb|ACDT01000123.1| GENE 41 43251 - 44354 1146 367 aa, chain - ## HITS:1 COG:SPy0210 KEGG:ns NR:ns ## COG: SPy0210 COG5279 # Protein_GI_number: 15674407 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain # Organism: 
Streptococcus pyogenes M1 GAS # 45 241 56 263 410 79 27.0 8e-15 MKKILLTILSLFLLTGCFSFREKSNSRPAITIDTDYLSGDASYLYYYQSLSSKEKEIYEN IYNCILDNAKKVTISSNDYELVQKINDYVLYDHPEIYYLDYFELQNQVDICNYIPSYSYS KSERDTLTAQLESVRDELVNSISSESSDYDKLKKIYQFVIEKCRYVDNAKDNQYITSSLI YGETVCSGYVKAIQYLAEAVGIKSAYIVGKEIGASDDEAYHAWNLIYLDDDYYYLDATWG DYDSEGNIFAMMNYFMFDSDDMLKLYEPLDQYEITKQGNYTYFKYENLYNENYNKAALNK MVKQYKHDNIAWMEFKFSDSCYQEAKKRLIDQEEMFDLFNPYTSKQYTVQYFYYDNLNVL IFNQKIE >gi|223714092|gb|ACDT01000123.1| GENE 42 44598 - 45954 1283 452 aa, chain + ## HITS:1 COG:SP0346 KEGG:ns NR:ns ## COG: SP0346 COG1316 # Protein_GI_number: 15900275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 74 449 73 443 481 193 31.0 7e-49 MKDNKFIKFITSKYLILAVQLFATALFTYLIFQLDLVPLKYLIPATGALGLLIIIFFFIM RSGQKKINQGLKSKRSIVTKIISLLMSILLMFASSYVVRGNDFFNTVTKATTQKYLVSVI TMKNNSATKLSDLDGKKFGVSYQHDTTTITKAIADMENDLGEQEDMVKYDDYSGLADALY KGEVDAIIVGQEYKSMLEANHDSFDDETKIIKSYEYESKLSVTTKQTNVTENPFTIYVTG IDTYGSVSTVARSDVNLIVTVNPKTKQILMTSIPRDCEIQLHKNGKMDKLTHTGIYGTSE TISTIEDFLDVEINYFARTNFSGMTNIVDALGGVTIDSDYKFTTLHGNYNIVKGENQMDG DKALCFVRERYSLPNGDFDRGKNQQKLLKAMLEKAMSPKIITNFNNILTAIEGSFETDMS SKEIKSLLNMQLNDMSDWTVYNVQVEGEGYKT Prediction of potential genes in microbial genomes Time: Thu May 26 10:26:24 2011 Seq name: gi|223714091|gb|ACDT01000124.1| Coprobacillus sp. D7 cont1.124, whole genome shotgun sequence Length of sequence - 21476 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 13, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 117 - 176 3.3 1 1 Tu 1 . + CDS 196 - 1704 1581 ## COG1316 Transcriptional regulator + Prom 1763 - 1822 9.6 2 2 Tu 1 . + CDS 1858 - 3369 1621 ## COG1376 Uncharacterized protein conserved in bacteria + Term 3375 - 3431 7.0 + Prom 3440 - 3499 12.2 3 3 Tu 1 . 
+ CDS 3561 - 3953 372 ## COG5015 Uncharacterized conserved protein + Term 3966 - 4017 2.7 - Term 3954 - 4010 3.6 4 4 Op 1 11/0.000 - CDS 4019 - 5044 1179 ## COG1088 dTDP-D-glucose 4,6-dehydratase 5 4 Op 2 13/0.000 - CDS 5045 - 5620 764 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 6 4 Op 3 . - CDS 5632 - 6513 1187 ## COG1209 dTDP-glucose pyrophosphorylase - Prom 6636 - 6695 7.9 + Prom 6621 - 6680 5.9 7 5 Tu 1 . + CDS 6709 - 7563 790 ## COG1284 Uncharacterized conserved protein + Term 7730 - 7768 -0.7 8 6 Tu 1 . - CDS 7634 - 8362 730 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Prom 8425 - 8484 8.6 + Prom 8453 - 8512 11.5 9 7 Op 1 . + CDS 8667 - 10025 1405 ## COG1455 Phosphotransferase system cellobiose-specific component IIC + Prom 10038 - 10097 3.2 10 7 Op 2 . + CDS 10125 - 10706 556 ## SGO_1646 hypothetical protein 11 7 Op 3 . + CDS 10777 - 11577 737 ## CDR20291_2879 putative phosphosugar-binding transcriptional regulator + Term 11586 - 11617 1.7 - Term 11574 - 11605 2.5 12 8 Tu 1 . - CDS 11607 - 12770 747 ## COG0500 SAM-dependent methyltransferases - Prom 12835 - 12894 9.2 + Prom 12727 - 12786 7.2 13 9 Tu 1 . + CDS 12909 - 14444 1257 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily + Prom 16028 - 16087 12.0 14 10 Tu 1 . + CDS 16142 - 17497 245 ## PROTEIN SUPPORTED gi|145223395|ref|YP_001134073.1| NLP/P60 protein + Term 17511 - 17551 3.2 - Term 17499 - 17539 2.4 15 11 Tu 1 . - CDS 17765 - 18241 534 ## COG0671 Membrane-associated phospholipid phosphatase - Prom 18262 - 18321 12.7 + Prom 18264 - 18323 6.0 16 12 Tu 1 . + CDS 18381 - 19745 1177 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis + Term 19818 - 19866 -0.7 + Prom 19796 - 19855 11.6 17 13 Tu 1 . 
+ CDS 19899 - 21474 1549 ## COG4193 Beta- N-acetylglucosaminidase Predicted protein(s) >gi|223714091|gb|ACDT01000124.1| GENE 1 196 - 1704 1581 502 aa, chain + ## HITS:1 COG:SP0346 KEGG:ns NR:ns ## COG: SP0346 COG1316 # Protein_GI_number: 15900275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 73 491 69 481 481 192 28.0 1e-48 MAKKKKKKGMFSKLTSKGIILTIQLIASAVFLGYIYVLKMLPTRYYAILIIVLIALLLLE MIWITSGARRKRRTGKGFRIFISKFVSVALSALLIVGSVYASQGNSFLSNITSAFTQTRV IAVYAMKQGKINKVEDLKGKTIGVENKTSVSSITDALAQVQEKIGKEPEKKNYSDYADLA DALYKGEVDCIVADQSYLGVLETNHESFEDETKVVYKMEVQEKLEAVTTKTDVTENPFIV YITGIDTYGSVNTISRADVNLVVCVNPLEKQVLMVSVPRDTQVNLHKNGKMDKITHSAMY GINETIQTLEDFLDLKVNYYAKTNFSGITNIIDALGGVEVDSPYEFTTLHGNYKIKKGVN ELNGDQALCFVRERYALPSGDFDRGKNQQRLLKAMLKKAMSPKIITNYSNILAAVEGSFE TDMSSDDIKSLVNMQLDDMANWEMFNVQVTGDGAISYDTYSQKGKKTYITIPYKKSIQNI RKVIDKIEAGKKLTEADVKGLS >gi|223714091|gb|ACDT01000124.1| GENE 2 1858 - 3369 1621 503 aa, chain + ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 8 503 8 466 466 163 26.0 8e-40 MAFIKKIKKLKKSVKITGGVIIGVIVVFAIFSFYYRDKWYPNTSINGIDVSSMTYEESKN VLKNQIESYALEIGAEGNQKLTIAGKDINLSADFEKNLESDYKNHRSQRSIFGIFSGYNH DIDLKISYDQNKLSEIVNGSVLINGNEEYQIVQSTNAHIEYDETTKSGKMVKATIGNELN LEKFSNLITTSISKLTTKIDLTDQDKYAEVYQQPVSDISDKHLEEMLNTYNNYLLNWINW DMGEGKVETMTPDDIKNWLSCNDKGEVVLNKEAMSEWIEEFCLRYKTVGKKRNFTTHNGN VIQISGGDYGWRLDYEKIVKQVEAAITEKTDSKLIEAYLSEQSKKNQKALTTELEPTYSN KAYQKDYENFENDWDTQNYSEIDLTEQRVYVYRDGQLAYSCICVSGLPTEKNDRITRTGV WYIKEKKPEKVLVGEDYETPVKYWIRIMWTGTGYHALDRSDWANWTPDLYKVKGSHGCLN LQEEDAKKLYELIRMNDPVFIHY >gi|223714091|gb|ACDT01000124.1| GENE 3 3561 - 3953 372 130 aa, chain + ## HITS:1 COG:CAC2569 KEGG:ns NR:ns ## COG: CAC2569 COG5015 # Protein_GI_number: 15895829 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: 
Clostridium acetobutylicum # 1 130 1 131 131 103 40.0 1e-22 MNEVYEFLKKCGTYYLATVEGYKPRVRPFGTIDLYNNRLYIQTGKVKAVSRQIKENPQIE ICAMDDGKWIRIEATAYLDNNIDAQKHMLAAYPSLQKMYQPGDGNTEVFYLRNVTAKICS FTEEPKIFKF >gi|223714091|gb|ACDT01000124.1| GENE 4 4019 - 5044 1179 341 aa, chain - ## HITS:1 COG:MTH1789 KEGG:ns NR:ns ## COG: MTH1789 COG1088 # Protein_GI_number: 15679777 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Methanothermobacter thermautotrophicus # 2 341 3 335 336 456 61.0 1e-128 MKILVTGGAGFIGGNFVHYMVETYPEDMIVNLDLLTYAGNLETCKPVEGKPNYKFVKGDI ADREFIFKLFEEEKFDVVVNFAAESHVDRSITDPEIFVKTNVMGTTTLLDAAKEFGVKRY HQVSTDEVYGDLPLDRPDLFFTETTPLHTSSPYSSSKASADLFVLAYHRTFGLPVTISRC SNNYGPYHFPEKLIPLMISRALADEELPVYGKGDNVRDWLHVYDHCVAIDLIIRKGRVGE VYNVGGHNERTNLEVVQTILKALNKPESLIKYVEDRKGHDRRYAIDPTKLETELGWKPKY NFDTGIQQTIQWYLDNKEWWQNILSGEYQNYFEKMYAGKVK >gi|223714091|gb|ACDT01000124.1| GENE 5 5045 - 5620 764 191 aa, chain - ## HITS:1 COG:mlr7551 KEGG:ns NR:ns ## COG: mlr7551 COG1898 # Protein_GI_number: 13476272 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Mesorhizobium loti # 8 182 8 180 183 194 50.0 1e-49 MKKIETKIPGVVIIEPDIHGDHRGYFMETYSTSNFHELGIDNVFVQDNMSFTAKKGTLRG LHFQNDPMAQAKLVSCTKGTVIDVAVDIRKGSPTYKQWVAVELSEENKKMFFIPRGFAHG FLTLTDNVEFRYKVDNLYSKEHDRGIRYDDPSVNVDWGGLLNGIEPILSDKDKNGPTLDE SDCNFKFEGDK >gi|223714091|gb|ACDT01000124.1| GENE 6 5632 - 6513 1187 293 aa, chain - ## HITS:1 COG:rfbA KEGG:ns NR:ns ## COG: rfbA COG1209 # Protein_GI_number: 16129979 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Escherichia coli K12 # 2 282 5 285 293 413 69.0 1e-115 MKGIVLAGGSGTRLYPLTQVTSKQLLPIYDKPMIYYPLSILMEAGIKDILIISTPDDTPR FEELLGDGSQFGISLQYKVQPSPDGLAQAFILGEEFIDGEGCAMVLGDNIFHGHGLGKRL REAANKFSGATVFGYYVDDPERFGVVEFDENGKAVSLEEKPANPKSNYAVTGLYFYDKNV VEFAKSIKPSARGELEITDLNKIYLDNGTLDVTLLGQGFTWLDTGTHESLVDATNFVKTV 
ETHQNRKIACLEEIAYNNGWISKDQLNTSYELYKKNQYGKYLKDVLDGKFIDQ >gi|223714091|gb|ACDT01000124.1| GENE 7 6709 - 7563 790 284 aa, chain + ## HITS:1 COG:BH1678 KEGG:ns NR:ns ## COG: BH1678 COG1284 # Protein_GI_number: 15614241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 5 276 8 280 290 132 30.0 1e-30 MKDIRNIIGILLGNTIYALGVTMFILPNNLITGGTTGMALLVNATTGFSITVFVSIFNVS MFLIGWKVLGKKFALTTLISSFYYPFILGILQGVFKNEMMSKDTLLCVILAGLMIGTAIG LVIRCGASTGGMDIPPLILNKKLGIPISVSMYGFDFIILLGQMVIRNREMVFYGILLVLI YTVVLDKVLVIGKSQMQVKIVSAKFNEINEKIISVLDRGTTLIHSETGFKHNQYPVVLTV VNNRELTQLNNYVYDIDPDAFMIINKVNEVRGKGFSSAKKYENK >gi|223714091|gb|ACDT01000124.1| GENE 8 7634 - 8362 730 242 aa, chain - ## HITS:1 COG:lin2869 KEGG:ns NR:ns ## COG: lin2869 COG0363 # Protein_GI_number: 16801929 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Listeria innocua # 1 236 1 239 239 170 38.0 3e-42 MKLIIEDSKEKMSESAMQILLGTMMQDKRVNISLTAGRSPELLYKMMIPYVKDQAKFADV QYYLFDEAPYIGKTAKEDGENWKEMQKVFFEPANIPADRVHITTMDNWETFDEQIRNAGE IDAMLIGLGFDGHFCSNCPRCTPMDSYTYALERKIKNAVNPAYADKPQQPVTLTMGPKSL MRVKHLVMIVTGKEKAEILKQMLDSPITDELPATILKLHPNFTVICDQDAASLLDLNNYK RL >gi|223714091|gb|ACDT01000124.1| GENE 9 8667 - 10025 1405 452 aa, chain + ## HITS:1 COG:VC1282 KEGG:ns NR:ns ## COG: VC1282 COG1455 # Protein_GI_number: 15641295 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Vibrio cholerae # 10 450 13 427 446 305 42.0 1e-82 MNFIEKFNEIANKTLVPIANKLGNQRHLAAIRDGMVVAIPLSILGGFCLIISTPPFTPKS LPDWGVISDLLIGWYNWAQANATALRLPYNMTMALMGLFVAIAIAYHLAKRYEMSGLNAA IVSTTVFLLICAPTTTAVLTSAISEGANTNDLLAQAASFIPTTYLDAKGIFTAIIVSIGC VEIMRVMLKKNIRFKMPEGVPPAIASSFDSILPLFVCMGVFYGLSLIIQNISGELLPSLI MTILAPAISGLDSLLGICLITIIAQTFWFFGLHGASITQPIRLPFMQMYLMANITAFSAG EPIVNFFTQPFWSYVITLGGGGATLGLCILLLRSKSVELKTLGKLSIGPAIFNINEPIIF 
GLPMVLNPLMMIPFIFVPVINTIIAYTCMALNIVGKGVVETPWTTPAPIGAALGCMDIKA GIMVIILIILDMLLYYPFFKLMEKQKLAEENS >gi|223714091|gb|ACDT01000124.1| GENE 10 10125 - 10706 556 193 aa, chain + ## HITS:1 COG:no KEGG:SGO_1646 NR:ns ## KEGG: SGO_1646 # Name: not_defined # Def: hypothetical protein # Organism: S.gordonii # Pathway: not_defined # 24 191 22 188 190 101 33.0 1e-20 MSKKRDNKRKVEIKGYDIAKLRFQRFLAMVIDWYISNMIVAIPVTFFLRGKDYIQPYSFQ LETYGYKIGMIYGLFVVVVGICYYFAVPTYIWKGQTLGKKICKLQVVKTDGQKVDTKTMF LREIIGAVVIEGGIVVSATYIRKLIGLLITAEIIPILKYGAYAITLASIIYAYFNPLSQS FHDKLARTVVIRK >gi|223714091|gb|ACDT01000124.1| GENE 11 10777 - 11577 737 266 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2879 NR:ns ## KEGG: CDR20291_2879 # Name: not_defined # Def: putative phosphosugar-binding transcriptional regulator # Organism: C.difficile_R20291 # Pathway: not_defined # 23 189 20 187 283 66 32.0 8e-10 MYNLTIILLSTINSELINSNNYRIAKYILENMRALEDISITELAKECYVSNSSISRFCRD IGLRDYNELKSQIAKYQPAHQYAKNKFYYQSYQKEVPGQSFVEGVIENLQLLKRTINEKD IYKLVTDIANYSNVAAFGYMQSQSVAQNLQYDLQTCHKFIHTSMKYSDQIEYINNADSSN LIIILSESGTYFKRAFERKTLFRNTNDKPKIYLITCNSDIEIPYVDYYIRYESINDYASH PYSLAAITGMICTCYAERYLEAPEPI >gi|223714091|gb|ACDT01000124.1| GENE 12 11607 - 12770 747 387 aa, chain - ## HITS:1 COG:VNG0503C KEGG:ns NR:ns ## COG: VNG0503C COG0500 # Protein_GI_number: 15789731 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Halobacterium sp. 
NRC-1 # 131 347 7 223 262 93 26.0 6e-19 MEKLYSTGEFAKMAGVTLRTIRYYDKIGLLKPTKILDNGYRRYCNRDLITLQKILSLKEL GFSLEEIYPLIQDNNQDNFKESIKLQTNLIDQKIKHLTNLKDSLKATERLINKKNIAWDK IIELIRLSTIDDNLVTQYMNAKNLDTRIKLHDKFSINQQGWFPWIFEQIDFSTVYRLLEL GCGNGKLWENNTYNLRNREIFLSDNSEGMIDEIRQKLGNDYNYIVADCQSIPFKNNYFDT IVANHMLFYLKDLKQGLTEITRVLKNNGTFYCTTYSKYHMQEISELAQNFDSRISLSDDP LPERFGLENGKDILKSYFNYVELKKYEDYLLITEAQPLIDYILSCHGNQNEYLGNRLKEF KIYIEDLIKESQGIKITKDSGLFICTK >gi|223714091|gb|ACDT01000124.1| GENE 13 12909 - 14444 1257 511 aa, chain + ## HITS:1 COG:STM4541 KEGG:ns NR:ns ## COG: STM4541 COG1368 # Protein_GI_number: 16767785 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Salmonella typhimurium LT2 # 146 417 122 392 750 124 32.0 4e-28 MSKRHRAVSIIIFSIFLFLSLMICTSVLWCNRTFGKVDMDQLLFTVLAPTTSTDHGIIIS WILESLVFSLVIMVIVLISYYLIRKYWANRFVKPRFLYVINRHLWVLGVVLLAGALSMAE SNFGVFDYLRKSNQKTEIYEPKKIAVKKEVSDGDPELIYADPTSVAVSGENPNNLIYIYL ESYENTFMDVVNGGIKEINCLPELTQLANENISFSNTDKAGGALGFTGTTWTIASMVGQS SGLPLKSEVANDMSNYAKFMPGAKMIGDILAENGYIQEFCIGGNATFAGTDKLFEQHGNY KIVDYKALKNDGRVQRGEVCEWGINDQGLFRIAKEEITNLVNSGQKFNFTMATIDCHTTD GIKCSLCPNTYSNRYENIYACQSKQVNNFISWCKEQSWFANTTIVLVGDHNTMAVKYTKD IPADYVRTTYNCFINSKVSSNNIKNRQFSHLDMFPTTLAAMGFKVDGNKLALGTNLFSSL PTAIEKYGQAYIEAEVQKSSTFLDENIYKFN >gi|223714091|gb|ACDT01000124.1| GENE 14 16142 - 17497 245 451 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145223395|ref|YP_001134073.1| NLP/P60 protein [Mycobacterium gilvum PYR-GCK] # 343 447 240 347 348 99 42 2e-20 MKKFLKLFVVVMLTCTTLVFTLGIFTSPLSLSAMELPAINYQAHVEDKGWLDTVHDNEIA GTVNQSKRLEALILNLKENDKSMVRYRAHIADVGWQAWVTSGAQAGTTDKAIAIQAFQAE LTNEYKDMYDIYYRVHVPYRGWLGWAKNGEIAGSIGLALRTEAIQIKLVIKGSQISTEGL SSLTKPTLTYSSHVQDMGWMSYADEGKMAGTTNQKKRMEAVKIKLSDFDGNSGLTYRTHV SDVGWQNWVSSNERAGTEGQNKPIEAIEISLSRTMSDFFDIYYRMHVSNMGWLGWAKNGE TAGTIGGRIQSEAIEIKLVPRNSTFDRGGVAFVDATITIRDRIVDAAHSRLGCPYVWGGN GPNTFDCSGLVKWCYAQVGISIPRTTGELKSYGTQISVSQALPGDILWKSGHVGIYIGNG 
QYIHAPQTGDVVKISSVSSGNFTFARRSNEL >gi|223714091|gb|ACDT01000124.1| GENE 15 17765 - 18241 534 158 aa, chain - ## HITS:1 COG:SP1916 KEGG:ns NR:ns ## COG: SP1916 COG0671 # Protein_GI_number: 15901740 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Streptococcus pneumoniae TIGR4 # 26 154 29 159 167 57 28.0 7e-09 MKFFYTGLNKFMWNHPFIKSCTHFTSRFCPYMVAIFYMLFLLKIYLDRPHNLLGLAAEPI AVFAITAILRIVIDRKRPSEKYDLTPIDGSKKTGHSFPSIHVAMSISIALAVLHFGPNMG LLLSTLAIAISLCRLLSGVHYLTDILASIAIAFIVNLI >gi|223714091|gb|ACDT01000124.1| GENE 16 18381 - 19745 1177 454 aa, chain + ## HITS:1 COG:DRA0034 KEGG:ns NR:ns ## COG: DRA0034 COG2148 # Protein_GI_number: 15807704 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Deinococcus radiodurans # 62 453 102 480 481 210 31.0 4e-54 MNRKIMPIMLFMIEVVMYYFICLYFHMDIDLIVGSGTLYFLFLFIYGHYSLNTCLIWNEI QQLVKSSFCFYIALLVLVPQSTGYERKMHLTLMAISMFLISLLASRFLRIAFREQFARKT LVIGTGYEAARLGKISNNNRFALTNIMGYVDVNKTNELFRFEQENIIKHSPIYEYDKIDE AIENLGIEQIIIAIPEADQQVIDKVMSDIYGKVDSVKYLPDVNGTMTFSSEVQDFDGQLL IATSNDEIGLLDKFIKRLIDILAGIAGVMTLLPLMAYVKYKYVKSGDHDNIMFSQERIGK NGKLIKIYKFRSMIPNAEAELERMMEEDPKIKEEYLTNKKLKDDPRITPVGHFLRKTSLD EWPQFVNVLKGEMSFIGPRPYLPREKEDMGQYYDSIIKLKPGVTGMWQANGRSDVEFSYR CKLDDYYYHNWSIWLDFTIMYKTVKSVVYGKGSL >gi|223714091|gb|ACDT01000124.1| GENE 17 19899 - 21474 1549 525 aa, chain + ## HITS:1 COG:SP0965_2 KEGG:ns NR:ns ## COG: SP0965_2 COG4193 # Protein_GI_number: 15900842 # Func_class: G Carbohydrate transport and metabolism # Function: Beta- N-acetylglucosaminidase # Organism: Streptococcus pneumoniae TIGR4 # 257 419 2 156 156 91 36.0 5e-18 MKKKIAKVSAASLITLSMTVGNVAAFNNHDDLSVEDESSNDKNIDLNSNSTTELENNSSI ETKNGNKEVIGQTKFVDENGNITTVDVYDGTTGEVYNPRLRVVSTANMVNFNCSSAGTTT EFVDYYTGQAGYISKASAADAAFLGYENGKVKFMISGVTGLVDPSKVEVLTQGTYYASNY EVNSSGNLYHYISNNVNATGNQGNSNYVGKGPSYLTKGKEYYSYDGHYFYENYNTMITDY 
KNNVRTNSVNPSTPYYNYFQYLPMRSKTNYTAQELTTYLNNKANSSTSKLNNTGDMFIKY QNKYGVNALMAASFAALESGWGKSSIAQNKNNLFGMNATDANPSEDAKKYSSVEACIEDF ASNWMSKKYLNGTYTSLFRGGYFGDKGSGIFGKYSSDPYEGEKCASIAENMDASISGKDK NYYTIGVKDVAGTSRTNLNVRQSSNISSTVLYTTIKNPSYAFIVRKKTPENEFYEIQSDS VLNSNRTAVSTSAEYNYDSDYAYVSSNHLTIVNNGNDISYEKNEA Prediction of potential genes in microbial genomes Time: Thu May 26 10:26:33 2011 Seq name: gi|223714090|gb|ACDT01000125.1| Coprobacillus sp. D7 cont1.125, whole genome shotgun sequence Length of sequence - 1528 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 7 - 1413 1320 ## smi_1563 choline-binding protein LytC, 4-beta-N-acetylmuramidase, Cpb13 (EC:3.2.1.17) Predicted protein(s) >gi|223714090|gb|ACDT01000125.1| GENE 1 7 - 1413 1320 468 aa, chain + ## HITS:1 COG:no KEGG:smi_1563 NR:ns ## KEGG: smi_1563 # Name: lytC # Def: choline-binding protein LytC, 4-beta-N-acetylmuramidase, Cpb13 (EC:3.2.1.17) # Organism: S.mitis_B6 # Pathway: not_defined # 246 466 312 536 536 173 42.0 1e-41 MNHNYDTGKINTDIYAYDDEGEYSKLELDTKIEENQAPVISDVKVINVTSDEYTVICKVT DDLEVVRVQMPTWTVNNGQDDIVWHEAELKNGVATFTVNRKDHNFEYGDYITHIYAYDRE GLKGFSICGTVNLKEPDINTSTKPIIKNVKISNINDSGYTVTCTVLSNNKITEVLFPTWT YKNEQDDLIWGVGQKLENEYTFRVNRSDHNYEFGTYVTHIYAYDEKGAVNNIELPFHNII NTTERIGWAYIDGQKYFFDNKGNIAGNMPSKKVIDVSSYNGNIDWNTVKQYGDVDGAILR IAAHPNGEYIEDVQFANNLAACRRLKIPFGVYIYDYSNSENDALNEAKFVIDILQKYNVT PDELGYPVYFDLERTTITKEQNIANMNAFISEMNAKGYTTNVYSYRAMLNSSLNDKAILS NVSWMAAYTDTIGWENPYYKGKFGWQYTSSGSIPGISGNVDISCWYTI Prediction of potential genes in microbial genomes Time: Thu May 26 10:26:50 2011 Seq name: gi|223714089|gb|ACDT01000126.1| Coprobacillus sp. 
D7 cont1.126, whole genome shotgun sequence Length of sequence - 42330 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 20, operones - 9 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 560 - 619 7.7 1 1 Op 1 . + CDS 667 - 1980 1102 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 2 1 Op 2 . + CDS 1999 - 2421 419 ## gi|167756139|ref|ZP_02428266.1| hypothetical protein CLORAM_01659 3 1 Op 3 . + CDS 2433 - 2897 361 ## gi|167756140|ref|ZP_02428267.1| hypothetical protein CLORAM_01660 4 1 Op 4 . + CDS 2937 - 3689 696 ## COG1482 Phosphomannose isomerase + Prom 3772 - 3831 9.2 5 2 Op 1 2/0.250 + CDS 3874 - 5310 1356 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 6 2 Op 2 . + CDS 5325 - 6587 712 ## COG3711 Transcriptional antiterminator + Prom 6613 - 6672 6.3 7 3 Tu 1 . + CDS 6715 - 7272 395 ## gi|167756143|ref|ZP_02428270.1| hypothetical protein CLORAM_01663 + Term 7277 - 7325 1.1 - Term 7257 - 7314 4.0 8 4 Tu 1 . - CDS 7337 - 9259 1286 ## COG2200 FOG: EAL domain - Prom 9289 - 9348 8.3 + Prom 9322 - 9381 5.2 9 5 Tu 1 . + CDS 9456 - 10316 1056 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 10375 - 10422 5.0 + Prom 10369 - 10428 5.6 10 6 Tu 1 . + CDS 10453 - 11172 511 ## COG0860 N-acetylmuramoyl-L-alanine amidase + Term 11187 - 11246 12.1 + Prom 11247 - 11306 8.3 11 7 Tu 1 . + CDS 11471 - 11761 241 ## gi|237733326|ref|ZP_04563807.1| predicted protein 12 8 Tu 1 . - CDS 11774 - 12268 480 ## COG4905 Predicted membrane protein - Prom 12295 - 12354 4.7 - Term 12355 - 12402 1.5 13 9 Tu 1 . - CDS 12457 - 12933 574 ## COG0386 Glutathione peroxidase + TRNA 13113 - 13189 100.1 # Ile GAT 0 0 + TRNA 13216 - 13292 100.1 # Ile GAT 0 0 + TRNA 13311 - 13386 94.0 # Ala TGC 0 0 14 10 Op 1 . + CDS 13443 - 14198 647 ## COG4905 Predicted membrane protein 15 10 Op 2 . + CDS 14201 - 15562 1060 ## COG0534 Na+-driven multidrug efflux pump 16 10 Op 3 . 
+ CDS 15564 - 16559 1072 ## COG3641 Predicted membrane protein, putative toxin regulator + Prom 16562 - 16621 7.9 17 11 Tu 1 . + CDS 16663 - 17430 822 ## COG1737 Transcriptional regulators + Term 17630 - 17687 11.7 + Prom 17441 - 17500 12.5 18 12 Op 1 . + CDS 17716 - 22107 4584 ## Cphy_1775 S-layer domain-containing protein 19 12 Op 2 . + CDS 22120 - 24111 1796 ## COG4886 Leucine-rich repeat (LRR) protein + Term 24203 - 24233 1.2 - Term 24191 - 24221 2.0 20 13 Op 1 . - CDS 24268 - 24489 148 ## gi|167756156|ref|ZP_02428283.1| hypothetical protein CLORAM_01679 - Prom 24535 - 24594 7.1 21 13 Op 2 . - CDS 24603 - 27809 2556 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family - Prom 27962 - 28021 79.9 + TRNA 27943 - 28018 78.7 # Thr CGT 0 0 + Prom 28536 - 28595 9.2 22 14 Op 1 7/0.000 + CDS 28703 - 29158 435 ## COG2190 Phosphotransferase system IIA components 23 14 Op 2 7/0.000 + CDS 29174 - 30043 695 ## COG3711 Transcriptional antiterminator + Prom 30046 - 30105 7.1 24 14 Op 3 8/0.000 + CDS 30125 - 31516 1444 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 25 14 Op 4 . + CDS 31520 - 32998 1190 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase - TRNA 33097 - 33172 85.8 # Lys TTT 0 0 26 15 Tu 1 . - CDS 33350 - 34210 624 ## COG1533 DNA repair photolyase - Prom 34277 - 34336 12.5 + Prom 34261 - 34320 12.5 27 16 Tu 1 . + CDS 34342 - 35208 830 ## gi|237733341|ref|ZP_04563822.1| predicted protein + TRNA 35311 - 35386 93.8 # Asn GTT 0 0 + TRNA 35388 - 35463 76.6 # Glu TTC 0 0 + TRNA 35468 - 35543 96.6 # Val TAC 0 0 + 5S_RRNA 35472 - 35525 94.0 # AF302131 [D:490..741] # 5S ribosomal RNA # Streptococcus agalactiae # Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Streptococcus. + TRNA 35556 - 35631 95.8 # Thr TGT 0 0 + TRNA 35640 - 35723 77.6 # Leu TAG 0 0 + Prom 35649 - 35708 80.4 28 17 Op 1 . 
+ CDS 35819 - 36136 293 ## gi|237733342|ref|ZP_04563823.1| predicted protein + Prom 36315 - 36374 5.4 29 17 Op 2 . + CDS 36482 - 37888 205 ## gi|237733343|ref|ZP_04563824.1| predicted protein 30 17 Op 3 . + CDS 37889 - 38362 249 ## gi|237733344|ref|ZP_04563825.1| predicted protein + Prom 38448 - 38507 6.8 31 18 Tu 1 . + CDS 38545 - 39240 481 ## gi|237733345|ref|ZP_04563826.1| predicted protein + Prom 39246 - 39305 5.5 32 19 Op 1 . + CDS 39370 - 40257 680 ## COG1192 ATPases involved in chromosome partitioning 33 19 Op 2 . + CDS 40299 - 41144 808 ## gi|237733347|ref|ZP_04563828.1| predicted protein 34 19 Op 3 . + CDS 41201 - 41416 151 ## gi|237733348|ref|ZP_04563829.1| predicted protein + Term 41421 - 41463 2.2 + Prom 41418 - 41477 6.2 35 20 Op 1 . + CDS 41542 - 41727 247 ## gi|237733349|ref|ZP_04563830.1| predicted protein 36 20 Op 2 . + CDS 41747 - 42319 588 ## gi|237733350|ref|ZP_04563831.1| predicted protein Predicted protein(s) >gi|223714089|gb|ACDT01000126.1| GENE 1 667 - 1980 1102 437 aa, chain + ## HITS:1 COG:BBB04 KEGG:ns NR:ns ## COG: BBB04 COG1455 # Protein_GI_number: 11497019 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Borrelia burgdorferi # 26 421 20 435 443 164 30.0 4e-40 MSKKEKNSFVDKIMGPMDKISSPLIKFGQIPFIQGLQRGMVSSIGVTMVGSIFLVICLFG ADGNITEKALLPFLTPYIDQLSLINSLSMNIMAIYMCVAMGAEYADIKGINKTTGAVGAL FAFILLNYNGIAATSEGVNALEITYWGSAGIVTAIIAMAISVNVIHLCYKYNIRIKLPSS VPPAISDSFSSIVPYLFVALICWSIRTIMGFDIPAVITNILLPVLSGADNIFVFCFAYFF ASLCWVCGIHGDNIVGTVISPMLQTWMIENTEAYTAGLAAPHIWIDQLNRLFQYVSTCWP ILIYMYMSSKKLPQLKPLAVLSTPSMIFCIVEPLMFGLPIVLNPFLAIPFVLIHTITAAV SYLLTSIGFVGRFVISIPWATPSPILAYLATAGSIGAVLLVFINFAIGMVIMYPFWKAYE KNEIKKLEEQKATEVAV >gi|223714089|gb|ACDT01000126.1| GENE 2 1999 - 2421 419 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756139|ref|ZP_02428266.1| ## NR: gi|167756139|ref|ZP_02428266.1| hypothetical protein CLORAM_01659 [Clostridium ramosum DSM 1402] # 1 140 1 140 140 266 
100.0 2e-70 MKTCPCCHNEILDDAIYCDYCGKELTKKEEDVSRVVELKENPQKNYFCQLGLILFLFSMV ILDFFMATVVHNTVGNSRIVFYISSVFYILALATEGFALFVDYNAVKQGYRKNGNLGLAL ATMALSSYFLLVNIFGVILK >gi|223714089|gb|ACDT01000126.1| GENE 3 2433 - 2897 361 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756140|ref|ZP_02428267.1| ## NR: gi|167756140|ref|ZP_02428267.1| hypothetical protein CLORAM_01660 [Clostridium ramosum DSM 1402] # 1 154 1 149 149 276 96.0 3e-73 MYCRKCGNPIDNKTVFCKICGEKIIKLEQKSYEEKYLEKKKKEKNAKSDLKKQDKYCDLK NPYVIPAITVSCIGFILGVFPYPAAWKVGTSLWLSVVILVFALLGAYHSVKATQVNRFYA QKYCYTIKEKTVKVARILSTVTILADLFVFMSRI >gi|223714089|gb|ACDT01000126.1| GENE 4 2937 - 3689 696 250 aa, chain + ## HITS:1 COG:lin2215 KEGG:ns NR:ns ## COG: lin2215 COG1482 # Protein_GI_number: 16801280 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Listeria innocua # 4 212 5 215 318 195 45.0 7e-50 MSILFFKPIPRPAIWGHTLVKDYFDYCDFPDGIGQSWSFSAQENASTVCITEPFVGKTLH ELWNSHQELFGHPNEPFPVIISLVGPEDDLSIQVHPDNIYALKEGFNSGKNEAWYFIEAM PNYNIVYGHNASSPEDLKDYIRNEHWDDLIRYLDVKSDDFVYLPAGLLHALRKGSVVYEI QQATDITYRFYDYHRKDSQGNERELHLDKAIECLNYDQSKMENNVQPIEKNMTTLRKLSI YPMTHLQLQS >gi|223714089|gb|ACDT01000126.1| GENE 5 3874 - 5310 1356 478 aa, chain + ## HITS:1 COG:SP0303 KEGG:ns NR:ns ## COG: SP0303 COG2723 # Protein_GI_number: 15900236 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Streptococcus pneumoniae TIGR4 # 6 473 4 478 478 333 41.0 5e-91 MKKKELRKDFLWGAALSNVQAEGGYLEDGKGLNVYDTLVVTPEPGIESQYSDTKVATDHY HHFKEDIDLMAEMGIKAYRFSIIWSRIHPNGDDEANEKGLDFYEEMIDYMLSKGIEPVVS LVHFDMPDHLMRAYNGFLNKKVIDFYVEHVRCIAKRMNDKVKYWITYNETNLAPYQSDLV AGTNRPDYMSKEEFYSQLYINIQVAHARAVLAIKKEIPDAMVGGMVNYCLFQPATTTSRD MIACEFSHKFMNYLSFDIMTSGYLPEYFVSFMEKRDIECEFNNEEQEDVLKASKELGYLA ISYYRSMMVTSYRTIKNESIIDFENNLLWGKGFVQNPLYNASEWNWTIDPDALRLSLIQL HERYHLPIFIVENGIGIKEEMINGKIYDDNRIDYYQGHIQAMKTAIENDGVDVIGYLAWS 
SIDFLSAHKEMLKRYGFIYINRDVEDLKDLKRYPKKSFYWYKKCIASNGNDLENNVEY >gi|223714089|gb|ACDT01000126.1| GENE 6 5325 - 6587 712 420 aa, chain + ## HITS:1 COG:SP0306 KEGG:ns NR:ns ## COG: SP0306 COG3711 # Protein_GI_number: 15900239 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Streptococcus pneumoniae TIGR4 # 26 292 22 290 493 92 28.0 1e-18 MVRAQIFNQESYCIQIIRYLSFLDDYATSYEIANHIGISRRLVRDEIINVQELLNEFGYR LISRTPKGYRIDFTDYLEALELINKIENHERNHNFDYYLYINRTFYFAKRLLECDTGIKI DDLAEEIFVSRSTLSKELSNLREWFAKHELKINSKANVGLILEGREENKRIILSDIIFIN YKHSYIMFDFLKSLYKNKEMLDYKIISLLKKYNVSLSDTSLIDFLVYILVMISRAKSKHL LTEIDIPEEFNLSIEMQVAKEIALKIKDTINCQITYDEIKQIAIELFAKEDKGSFYYNKE MSLTIVNESLKAIENEFFIIFKEPLLKDLKEELFTMVDKTLFHCSFNTKQRNTYYNKVEK NYSNSFQLALCLKNIIEQKTIHHVSMSSISSFTILFEKFICLTKLPKTESFICMYARSSK >gi|223714089|gb|ACDT01000126.1| GENE 7 6715 - 7272 395 185 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756143|ref|ZP_02428270.1| ## NR: gi|167756143|ref|ZP_02428270.1| hypothetical protein CLORAM_01663 [Clostridium ramosum DSM 1402] # 1 185 464 648 648 328 100.0 1e-88 MDIDINIPIVLISEFLDNNTMRYFKNLVMETRQIEMSEFLFNKKYYFNCVDIDNINGIYD YLYSVLKKEVAISFDIFKKNLLDKDQVEIIDSCSHSLIFSSHMKLNLKNLLIVFVLKNPI IYNKKKVSIIIFSTLQDDFEGNVRTQGLLKSLRYLCDNEQGFDQFINNPTYVCFIKGIKD ISVKK >gi|223714089|gb|ACDT01000126.1| GENE 8 7337 - 9259 1286 640 aa, chain - ## HITS:1 COG:slr1692 KEGG:ns NR:ns ## COG: slr1692 COG2200 # Protein_GI_number: 16330979 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 392 639 78 325 332 181 35.0 4e-45 MIWDITAELVSAITLCIILVYARKGNLLPTVKNKVFQYCLFITFLSVSSNIISTTLLQYY KQVPLFFNSFFLLIYYLSTPLMGAIYFIYALANIYDEKEVKKYAALCSLPSILYVLLVFS NFYTSLLFSFDQVSGYQQGPWIFITYLVFYIYVFFSLILVIHKRKSLERNVSYILGVFPF ISAFVILFQYIHPEYILTGTAATSALLIIYLYLQNKQMFTDTLTNLLNRQEFNKMIDILI DNNKPFIAVVISLKNFKFINDKFGQEIGDQILLEVCHYLRYLLPKQAMYRYGGDEFALIF YNKKNVINALEKIETRMKNPWQISNIDFIISYVAGGIAYPKVAHSKEEIINGLEYAVSLG KKDNAHNINFCVTAMIKQIRRKYQILDLLKICLENNSFEVYFQPIYDLKTHSYCKAEALL 
RLPDNPLGFISPEEFIPIAEENGLIAPITYQVLDKTCLFIKKVIAEKKDFTGVSINFSVL QFMQDDLENKVLKVIEKHQLPYELIKIEITESMLVTNFDAITNFMTNMINRGIQFLLDDF GTGYSNITYVLTIPFQVVKVDKSLIWQAMKDEKAAILIKKMIEAFNQIGLHILAEGIETK EQMEFMKKCGCDLLQGYYFSRPVSFDEALNVIKTTKITEK >gi|223714089|gb|ACDT01000126.1| GENE 9 9456 - 10316 1056 286 aa, chain + ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 # Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 7 107 7 107 130 92 43.0 6e-19 MYAWQEIQVTLDYIEENYDKEIDIDKLAAIAHLSKYYYQRLFFRLTKKTVNDYVKLRRLE KAANQLKNTEKRILDIAIACGFSGHSAFTRAFKEVYKITPDDYRNQKIHLDHFIKPDLQL NYVVIDTGVPLIVEGMVLEINEKQVCNDRIFYGKSKMAKVDELGEPKINNIVSLYQELDV QDAVVVDILTLSDDPALFNYFVGIEIDRPSIDCEQRIIPSGKYVVCSYEAENFESLVNEA LYKASRYLYEVWFVEHRLEPDDLLVQKYYNPYQENCYIELWAKVRS >gi|223714089|gb|ACDT01000126.1| GENE 10 10453 - 11172 511 239 aa, chain + ## HITS:1 COG:BH0239 KEGG:ns NR:ns ## COG: BH0239 COG0860 # Protein_GI_number: 15612802 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 13 226 16 237 238 114 34.0 2e-25 MKRTYLIIAITAALYLAGHYFYPKNDVAVSKDLPLKNVSVVIDPGHGGLDNGASVGKIYE SELNLKISYALKEELESRGATVNMTREDEQDMTKRNHHYSKQDDMYLRVKKIDSYKSDYL ISIHLNSAPASGAWGSQVFYYKNSDKGKRLASEIQTTMKEVTGSAKRISGADFRVLRATQ TVGVLIECGFISNANERGQLQSSKYHQKLAVKICDGIEKYREKYPEDTIDPKDYEKILS >gi|223714089|gb|ACDT01000126.1| GENE 11 11471 - 11761 241 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733326|ref|ZP_04563807.1| ## NR: gi|237733326|ref|ZP_04563807.1| predicted protein [Mollicutes bacterium D7] # 1 96 1 96 96 172 100.0 4e-42 MWDTEIPSDFFVEKLIEESEKGSVKCEIQAYSIAEALENRIDADVVLLSPLICFEQIKIE RLVRCPVDIIGIGAYALFDVKAILLKAFDAMRKKRI >gi|223714089|gb|ACDT01000126.1| GENE 12 11774 - 12268 480 164 aa, chain - ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: lin2818 COG4905 # Protein_GI_number: 16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # 
Organism: Listeria innocua # 1 148 7 156 270 91 35.0 7e-19 MVKFAIYTFLGFILESVYVSILKKEFYLSGLLKGPFIPIYGFGALLILSIVPYHTNNFEI FFYSLFGCTALEYLTHFFLSYDSQIEVWNYGKIPSNYTSRICMFYSLMWGFLGIILVNYI DPFINSLLINLNYYAVNIIALIYILIILYQFYNQQFQVTKKQPK >gi|223714089|gb|ACDT01000126.1| GENE 13 12457 - 12933 574 158 aa, chain - ## HITS:1 COG:SP0313 KEGG:ns NR:ns ## COG: SP0313 COG0386 # Protein_GI_number: 15900246 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione peroxidase # Organism: Streptococcus pneumoniae TIGR4 # 2 157 3 158 158 171 51.0 3e-43 MNLYEIEVLDNNNQTISLNNYKDQVLLIVNTATDCEFTEQYDDLEELYEKYHNYGFCILD FPCNQFGEQAPGTMEEIMSFCADTYGVSFPQFAKIEVNGPNESPLYTYLKEKQDSTLSKR IDWNFTKFLVNQEGDVVARFEPLVKPREIENNIKKLLK >gi|223714089|gb|ACDT01000126.1| GENE 14 13443 - 14198 647 251 aa, chain + ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: lin2818 COG4905 # Protein_GI_number: 16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 7 244 5 246 270 127 30.0 2e-29 MTGKLIILVVSFFIYCLMGWIWESIILPLSRHQKPYNRGFLNGPWIPIYGFGAMLVIVLF DIQQVNYPIYTLFISGGVVACLLEYITSYVMEKLFHRRWWDYSQKAFNLNGRVCLEGFIC FGLFSVVAIDYVQPFFTSKLLLVDENILFVLSGVLSVLFIVDTIVSTHVALDIEKKLELV KQVLEESENRLIASIEEKQNEARAYIEKQRLEWQKQQIVLKSLIRQKKLFKYGHRRLIRA FPDLLKRKKDE >gi|223714089|gb|ACDT01000126.1| GENE 15 14201 - 15562 1060 453 aa, chain + ## HITS:1 COG:CAC0847 KEGG:ns NR:ns ## COG: CAC0847 COG0534 # Protein_GI_number: 15894134 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 2 448 3 450 459 192 27.0 1e-48 MLKKYFGTKQFYRTVISIAFPIMAQQFITTFVNLIDNVMIGSIGNIALTSVTVANRFFLI MNSILFGLCGAAGIYIAQFFGAKQKKNCQEVFNINMAFAVGAGLLFTLIIFIFPQYAIEL FSKTPIIVDEAIKYLHLAKYTYIPFAVSFTCMMALRAVGINKIQLKVGIVAVITNTALNY CLIFGNFGFFKMGIQGAAIATIIARLVEMAIYLIILYRNKHFFSFDLKGIIRINVNLLTS IVRKAIPLTINEILFSLGQALIFMSYMRCDEYLVASISVVDTVANIMFIVFGGLSSAVSI MIGNKLGANKLDEARDNARKLLCFALMVSLMIGMICFLLAPLIPNLYNVDIPIKEAIVRL 
VRIKSIMINIYAFNVCIFFILRAGGDVVSTMIMDSGFLWAAGVLVSTLLSSFTTISLISL YIVVESLDLIKLIIATYFFKKERWVKNIAVVGG >gi|223714089|gb|ACDT01000126.1| GENE 16 15564 - 16559 1072 331 aa, chain + ## HITS:1 COG:FN1900 KEGG:ns NR:ns ## COG: FN1900 COG3641 # Protein_GI_number: 19705205 # Func_class: R General function prediction only # Function: Predicted membrane protein, putative toxin regulator # Organism: Fusobacterium nucleatum # 4 328 2 330 330 249 48.0 5e-66 MKKENYIIKSLNGMAYGFFCSLIIGTIFKQIGNFANISQLVAWGEVATYLMGPAIGAGIA YAIDAKGLNLIAAVIAGTIGAGTFNGNTATTGNPIAAFVAVIIAVEVTRLVQGKTPVDIL LVPFTSIIVAGIVTIFIGPYITKLIVWIGDLINQGVNMQPFFMSIVVAVLMGMALTAPIS SAAIGVMLGLNGLAAGAALAGCCAQMIGFAVMSMDDNDIGDVVAIGIGTSMLQFKNIVKH PIIWLPPILTSAIIAPISTCLLEISCSAIGSGMGTAGLVGILEAVNVMGNNYWIPIIVID LLAPMLISFGIYKAFRKLNYIKAGYLKLDRF >gi|223714089|gb|ACDT01000126.1| GENE 17 16663 - 17430 822 255 aa, chain + ## HITS:1 COG:YPO1253 KEGG:ns NR:ns ## COG: YPO1253 COG1737 # Protein_GI_number: 16121540 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 2 245 1 244 246 201 42.0 1e-51 MLFTYKQIANLNETETFIYRYVMKHIKVVTNMSVRDLAECTHVSTATVVRFCQKLDCNGF VEFKTKLKLFNDGLTLPETDDEIDVLLEFFNYARSVDFKEKINKFVKYIKEAKSICFLGI GTSGTLGKYGARYFSNVGYYSQSIDDPYYPPPVDGNESSLLIALTESGETREVIDQLKMY QSQNTKIVVITNKPGSTIDQMADLGIYYYVKDIILPQTYNISSQVPVLYILERVTHELQN AKEKSLPLKFTSRNL >gi|223714089|gb|ACDT01000126.1| GENE 18 17716 - 22107 4584 1463 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1775 NR:ns ## KEGG: Cphy_1775 # Name: not_defined # Def: S-layer domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 360 1155 432 1154 2117 460 38.0 1e-127 MIKRLNKNFFAAILTFAMAIGMIAPVNVAANDEYLNFDQDSIYAIVSTTTGKALNIKTAS WTSPAEADGEYNAKTNKISNSSQFMFINENDGSVTIKYTGINDGQDYYLRTESGQTYVWA DTRNEDTTAKYKITKTADNQGIIESMKRTNEYLSIDSKGKLVYTSDRDLAEKFNFVKNPG IISNIAWIENVATGKLVTFADQPDENFSAIKVTGDAGNIGDTEKFKVTFHTNDNGIKNVI SFESVSKPGYQIASAKWVDGAVPTIGSYKKVGGGWEAIAVEPIGNGLFVLKDAATGKLVT ANNDEVLEGGYEGELTDREKFIIHNAIDIADVTDLKADDSSRTETTIDLTWTNPNSLYTD 
VTLFKKASNEVVFDKVADVTGKNSYQVTDLKAGMQYSFKLEYVNGNGNLSDINNPKNESA ELKVSTRAGAKPATPADVKLVEKDGEFTISWGAAENATHYQLLRADSMLGEYNVVETVTR DSTSVTVAPIGDKYENYYRIIALNNGQNNDDNLSNAEISERSEYVSLEKNIFGEHTLFFS PKDDVAQLDKILLDLFNQQNDRGADAQFKGQQYQVYFKPGDYTETTCMYLGFYTSFNGLG KTPMDVKLNNIAIPSYLDGNNATCNFWRSAENLAVINTGNEQGKAGFDCNLARPEYFNWA VAQAAPLRRVYSSRPVGYDWNYGWASGGYVADCMFEGVFEDQGNQLSAGTYSGQQFYTRN SKLTGGGYGTTLNNFFQGVDAPNLPTEKDGDALANGNGYSNWGKEAANGEQQVFTNVEKT KKLSEKPFLYLDNGEYKIFVPDLQKDTSGTSWTKDGDMGKGKSLSLSEFYIAKPSDSAKT INEQLDKGKNIYFTPGTYHAEESIKVNKANTILIGTGMTSIIPDNDEAAMEIADVDGIKV AGLIFDAGEKDSNYLLKVGPANSDKNHSANPTILQDLFFRVGGTTANPTRAKNALEINSN DVITDHFWIWRADHGNGVSWSGNAADHGLIVNGDNVTCYALFNEHFNKYDTLWNGENGAT YFYQNEKCYDPISQEAWMSHNGTVNGYAAYKVANDVKEHYAVGLGIYNVFIYTGEDYDSS KVQIQMDSAIEVPNSKGVIIENACLQTFADENKVLQKFNHIINDVGPGVSSGTDKDDFAI KGEGWSRKFLLNYCDGTAVYGKMPAADQKGKFIGTVTEENIKALGDDNIDTKTLSDLYNV NKDKNENNYTVDSWKVFVKALDDAGKQLNADLKYAYQKDFDSAKDVLNDAIDNLVLVGAD YSKVDSIKAEAEKVLASDNYTKDYYTEATRATFDKANKALNDLADDLTILEQNKVDDAIE LLTEAINGLTLKGADYSKLDVVKKEAEKVLISDNYTKGHYTEESKALFDEAYEVLSSLAD NLTIIDQGMVDEVVKALDTAIKGLTLKTEEPGQIVKPNEPGQSTKPSDSSSAKTGDDQMI YGYGIVTLLALGGLMMLKRHHTS >gi|223714089|gb|ACDT01000126.1| GENE 19 22120 - 24111 1796 663 aa, chain + ## HITS:1 COG:lin0372 KEGG:ns NR:ns ## COG: lin0372 COG4886 # Protein_GI_number: 16799449 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Listeria innocua # 400 615 365 591 656 95 30.0 3e-19 MKKSKIIKVAILALFSAVLLTLNNVGAISSNPYNDWKTSAINYPNNGQLVPAGPITITWD RLSIDSHEVIGYEVYLDNVLQNSTIIDEGDIFSCEVYTTKVAQHQVKILAQLSNNTKIST SVRNFYISKKGMGFYSGNGYSAIQDAQNMGLSWYYNWGTAPTYAGTCPNQKINFVPMIWG AYNGSNEQLTTIKNAGYKTVLGYNEPDFVDQSNVPVATAIANQHYFTNSGMRIGAPATAI QAPNSEWFNEYWQGINTDDIDFIPVHNYPGNIGVTDKEIKDNAKSLLNFINETHNKFNKP IWVTEFAVANWDPYWDGYNGANEANKAEVRKFLNYVINGFDNNVGLNDLEFVERYAWFSF DALDRYGGDSGLFNTKADHDKNSMLKIGTLTTLGNDYRNLGNPEGYILPNLMGEIEPSIE DEYVDDYVNVMINGRSENVVLGSKFDKIDTPVKDGYVFSGWYSDEACTKEYDFDSIVTDN VTIYPKWLKMHTVTIDGIDNKVIDGNTAVKPLPSVKDGYVFKGWYSDSEYNNKFDFNEPI 
RGDIAIYSKWVKLVTVSVDNNKVLIESGTVLSRPDMPIKNGYTFAGWYSDEACTKEFDFA KPLLEDVTIYTKWKKVIDPMEKEDQTTSINSVKTGDNFAIQGYVLGLIGVMVAIVMMQKK YNK >gi|223714089|gb|ACDT01000126.1| GENE 20 24268 - 24489 148 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756156|ref|ZP_02428283.1| ## NR: gi|167756156|ref|ZP_02428283.1| hypothetical protein CLORAM_01679 [Clostridium ramosum DSM 1402] # 1 73 1 73 73 124 100.0 1e-27 MIFDELLKGQLLTNDCNNYLIATNTGEIIESCYRCLHRREKKALELYKKDISIINTKIID NNFERTLCIIVDL >gi|223714089|gb|ACDT01000126.1| GENE 21 24603 - 27809 2556 1068 aa, chain - ## HITS:1 COG:FN1160 KEGG:ns NR:ns ## COG: FN1160 COG0553 # Protein_GI_number: 19704495 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Fusobacterium nucleatum # 68 1065 70 1089 1089 585 37.0 1e-166 MYIKDDEINRIFNEKNWKIAHSYYKDNVITNMSVLKQGNEYHIDGTVEIYGRSSTCHIVV DTGGKINNFECDCPYCHNDELACGHIGVLLLKFYSLEMSEIPFQFNQKIDYKAKIEAFER VREEKLIQTKLEESKKLITDYTNQQDPQLTNDSLIDLIPNINYSFNRILISYRIGQNKTY LIKNLKEFIYNYRYHNVYTYGSQLTTDYNRKSFSSEALKQIDFIRDNSEFIDDIERYRSI DINKYNIDEFFETYLTNLNINMFFTTIDLNNLTINIEDKEDYFEISLLPLGGQLIIGNQY LYYLDKNILNRYSLKMSKITKKLISKLDKENLIVAKEDFGYFGKYIIDQILPYIAITGAN IDDYMPSVITLLTYVDLNNFGDLTINLEYRDDEGNILFDGKYIDSYETKLPLNVDTALAM IEDYAQYDELTNMYLITNNDEDIYYFIKNILPKLNRYCDVFVSEDIKNINKPKNISLNIG IRLKNDLLEIDLDSINVSKDEIRDILESYQKHKAYHRLKNGEFVNLEDQTLNEAYNLIQD LNLENKNIQDGAIIVDKSKALFLNELIQDSETINFNRNQQFQELINHLTNCNINNYPVPE PFSDILRDYQCTGYKWIKTMSDYGFGGILADDMGLGKTLQMITVLEDAKKNHKASIVITP ATLILNWQDEIKKFSNDLNVLCISGTLSVRKKMIEQINNYDVIITSYDYIRRDFELYKPF KFEYIVLDEAQYIKNQATKNARAVKELHGTHRFALTGTPIENSLAELWSIFDFLMPNYLY NYNYFREHFERPIVRDEDKDAQIRLKKMVEPFILRRTKQEVLEELPDKIENNIKIAFNKE EENLYIANLSQINSELKTALDVERIDKIQILAMMTRLRQICCDARILYNEIIGPSSKMKA CLDIIKKAKENNQKVLLFSSFTSSLDLLEKELRKEDILYYVLTGATNKIKRHQLVNAFQN DNTDVFLISLKAGGTGLNLTAASIVIHFDPWWNMSAQNQATDRAYRIGQTNNVQVYKLIM KNSIEEKIQELQAQKQDLSNIFIENNDGSITKMSTADIISLFSIDQEG >gi|223714089|gb|ACDT01000126.1| GENE 22 28703 - 29158 435 151 aa, chain + 
## HITS:1 COG:BS_ybfS_3 KEGG:ns NR:ns ## COG: BS_ybfS_3 COG2190 # Protein_GI_number: 16077304 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Bacillus subtilis # 5 135 2 132 137 117 47.0 1e-26 MNKQDGGIVSPIDGKCINIEDVPDKVFSTRMMGDGFAIIPSDNIVVSPVDGIIVMIPNSK HAFGIKTRSGVEILVHVGLDTVNLKGRGFETFVKVGDKVKKNMPILKFDDNIMDEMNIDK TTMVIFTAGYNQSIKLECYDREVHRGDTLIK >gi|223714089|gb|ACDT01000126.1| GENE 23 29174 - 30043 695 289 aa, chain + ## HITS:1 COG:BS_licT KEGG:ns NR:ns ## COG: BS_licT COG3711 # Protein_GI_number: 16080959 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 274 1 273 277 127 32.0 2e-29 MKVIKKINNNIAIAVDGENNEVIVFAKGIGYGNIPYEIKDLSVIERTFYDVDNRYYGLLK EIPNNIFNLVTKMVDVAKKNIEDELSPNLVFILADHINFAIIREKNGMDISLPYSYELEY EYPLLTKISKWMVKKTNEELDTHLPKGEVTSITMHFINALEGTNTKMHAGNIENKISKVI FNVTNIVEKHFNFNIDKKSFNYFRFKNHLKYFVQRKEAREMFTDNNRELYESMIEKYPDT YDCISKIDNYFYQEYKEKCTTEELLYLLVHVNRLYIKEDCHRKGITPEK >gi|223714089|gb|ACDT01000126.1| GENE 24 30125 - 31516 1444 463 aa, chain + ## HITS:1 COG:SPy0572_2 KEGG:ns NR:ns ## COG: SPy0572_2 COG1263 # Protein_GI_number: 15674662 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Streptococcus pyogenes M1 GAS # 107 459 12 364 364 236 38.0 8e-62 MAKKYEKLAKTILDSIGGKENITYFQHCTTRLRFNLKDRSIVKISDIENVDQVLGSQWSN DELQIIIGPAVADAYAEICEFGNLQREEAIDEDLDGNAEKKKVALKSVLMGIMDGISGCI TPIIPMMIGGGMIKVFYMLANMVGILPETSSTYQILYWLGDAFIYFFPVFLGATAAKKFG ANQGVGMLLGALLIYPSFIAMATEGAAMSIFGLPVYVANYSTTVIPIILTVYILSKVEKL INRYCPDILKSFVVPTGSLLVMLPIMLVVTAPLGYYIGTYVAEAVIWIYETIGFLGVSIL CALLPLMVMTGMHTMMTPYWTSAFASLGYDPFFLPAMILSNMNQFAATLAVSLKAKTKKV KSTAMSCAVTAIVGGVTEPAMFGITFKYKKPLYAAMIGNAAGGLIAGLLKVACYAFPGSG GMFAVVTFAGPGNNLIFFLIAALVGMIITFILTYILGIDEKEV >gi|223714089|gb|ACDT01000126.1| GENE 25 31520 - 32998 1190 492 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## 
COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 492 1 473 473 565 57.0 1e-161 MAFPKNFLWGGATAANQFEGGWNEDGKLDSTADHFTLGSRTEPRLFTEKIDEDKYFYPSH KASDFYHRYQEDIALLGEMGFKVYRMSINWSRIFPHGDDEEPNQKGIEFYRNVFLELKKY NIEPLVTISHYEMPYHLAEKYDGWYARETIDFYLKYCQTLFNEYKGLVKYWLTFNEINSL ILGGNGYFSGGILSSSKKDMGAGVIEDLDTLSTVEQHNDIVKQYQALHHQLVASALATRM AHEISGEYQIGCMIAGITQYPYSCRPEDMLLVQKERRNIFYYCPDIQINGAYPRYAHRYL KENDIHINFGENDEDILKSGTVDFFTFSYYSTGCVTTDKNLEKTNGNLIFGVANPYLEKS QWGWQIDPLGLRYFLNEIYDRYHIPIMVVENGLGQDDKLEIDKTVHDDYRIEYMKQHIQA MSEAVDDGVELVGYTPWGCIDLVAASTGEMIKRYGFVYVDVDNLGNGTYNRYKKDSFYWY KKVIASNGRKLD >gi|223714089|gb|ACDT01000126.1| GENE 26 33350 - 34210 624 286 aa, chain - ## HITS:1 COG:MA0795 KEGG:ns NR:ns ## COG: MA0795 COG1533 # Protein_GI_number: 20089679 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Methanosarcina acetivorans str.C2A # 3 275 2 288 309 243 45.0 3e-64 MKYQPLESKSALNKVSGRFPFKWDLNIYRGCQHGCIYCYAIYSHKYLDNNDYFGTIYYKE NILSCLEKELSSPKWKHELVNIGGVCDSYQPLEEQLQIMPEILKLMIKYKTPIIISTKSS LILRDIELINELSKITYVNIACTIITVDDDLRKIIEPGSSPIIERFKVIDQIKKETKAHA GIHIMPIIPYLTDQPGNLNGLYKMAKKVNADYVLPGTLYLRGQTKPYFLNCIKKYNYDLY QKIASLYYQKNALKTYKPKIYQTISELRKIYDLPAQAKRSPKDQNL >gi|223714089|gb|ACDT01000126.1| GENE 27 34342 - 35208 830 288 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733341|ref|ZP_04563822.1| ## NR: gi|237733341|ref|ZP_04563822.1| predicted protein [Mollicutes bacterium D7] # 1 288 1 288 288 503 100.0 1e-141 MKTKYVLVLMAMMFVISGCREEVTKEVGDTTINIAELNGKREIDKNDKIFANNNQGKHDL LMLMANSVDHYNLISGTMLSTLSYQGQKSESKYIFQVNVPEYQSYIYLVPSKNEESQVTF RDSERMYFFVGDLVSLNNNLDELINKQMMTVEDVSKDDSLFKYKDLTLEERVENMDISRV ANDLFGDVAPYLLPEDIALRQIGTAYNNYEIKGEEKYLNRDTYRVEGKINDSLYNSDSTF SMLVDKQTGIVLKYTRTTKDSLVEMKMEGISIDEMNEENMYAKYLSQK >gi|223714089|gb|ACDT01000126.1| GENE 28 35819 - 36136 293 105 aa, chain + 
## HITS:1 COG:no KEGG:no NR:gi|237733342|ref|ZP_04563823.1| ## NR: gi|237733342|ref|ZP_04563823.1| predicted protein [Mollicutes bacterium D7] # 1 105 1 105 105 171 100.0 1e-41 MIVDTKKSDTVEENNRRAERLFELMKVNLIPWIRDLLKENEMQPSDIEKMFDKKKRMISN AMRGDYQPSIQFYIKMGIVLDKSPGEVLDSVILYEQKNNIKKKKT >gi|223714089|gb|ACDT01000126.1| GENE 29 36482 - 37888 205 468 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733343|ref|ZP_04563824.1| ## NR: gi|237733343|ref|ZP_04563824.1| predicted protein [Mollicutes bacterium D7] # 1 468 1 468 468 880 100.0 0 MITNNTYVSYDTCGSRDDINSWLNTGHDIFSNRSDYKTVIAHQCDYKDMHSYTYGRKKHK KVYSNIYAHQMEVMIPVAMTDSLKHDFIKKFMCAVNPLYKLNSFLYCYKFITKGKGSYIS VLCFTRKYFKRQRLMTVTYEKDYIYSSKTKRLCKKDDPDAVLVHKKGDIKKDENGSPVKE TFFVSKTEQEIFKYTSFIRFHKRLLKNAEYAAMLLDRDFWNKKIKYFSRITIKKSGFKNK EKIHIKNQLIIRLNSRINSMQTALQEGRYWNNSLHIDKAFHKMIHRINGLLYLTRWTDPV SGSSIYLGTKQSVTCLKDNISLIEEHIEQLLTQWWADEVYSDDDFTAVQYTEHEADSLKS LVETDKYWKDFNNNKWSKNRYSYEEALEYSRTLVNCRNCINCLDCRNCKNCTDCIDCSCC TGCCNCFGCVFCIRATDQHHYMEKPSNNGYWDDSLINTYLNLKEREAA >gi|223714089|gb|ACDT01000126.1| GENE 30 37889 - 38362 249 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733344|ref|ZP_04563825.1| ## NR: gi|237733344|ref|ZP_04563825.1| predicted protein [Mollicutes bacterium D7] # 1 157 1 157 157 298 100.0 7e-80 MKKLQCYVASSFDELNKKCGYGVIMKEECRPIDIISGFTEESWQIKSYQVGGYLIGVLKG IQYAIDNNYDEVTIFYNFEGADKYTKIRNSEGLSDVVKTYLSGFKNLEQKIDIKFEKANM NLFIPRTNLQRAYKISRKATQEEFDEFHILNKKNFKP >gi|223714089|gb|ACDT01000126.1| GENE 31 38545 - 39240 481 231 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733345|ref|ZP_04563826.1| ## NR: gi|237733345|ref|ZP_04563826.1| predicted protein [Mollicutes bacterium D7] # 1 231 1 231 231 410 100.0 1e-113 MLNDNDKSNNCKRKVNIKLSIENVKVVLNRVSELMLDHAYSVGANISSFSASVIRMSDGS LESGDYKFIKSVCKSNSNEDYICRILEAVNMLKNEKQKNVIILRYFYNATFNEIRYGYYD PVDDVEFEPIWQVYKLRDNALSELIYLIDDVLVYEDENIRSKNTKTVKESKSKERIPVNY VAKNGMHGTIRLKDYEDFLNWSKRMKIFKEVELTDETIKNVKAVFGVKGAT >gi|223714089|gb|ACDT01000126.1| GENE 32 39370 - 40257 680 295 aa, chain + ## HITS:1 
COG:Cgl1387 KEGG:ns NR:ns ## COG: Cgl1387 COG1192 # Protein_GI_number: 19552637 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Corynebacterium glutamicum # 3 293 38 290 290 96 26.0 5e-20 MKVISIINEKGGVGKTTASVNISYGLVERLMNDGKRILLIDADAQGNATKFFLPEFKSIT LEEFNMLEVPQNCDIRSSTKFIKNSLISMLGERNDLNKLLLEGKGVIRECIHQTQYQQLD IIPSIGTELIATDKLLGASTGQRHVILKRALREIRNDYDIVIIDHAPTFNNITVNGLFCS NEIIIPLKPGGFELKGLIDTLEELFDIEDDYECEYKIRILMNMIPRGVRPAYISFINKIR EFYGDTILQTTVGYQDAVASRSTMSGKLLYNSKTGVGEDYRNLVDELIKEFDNEI >gi|223714089|gb|ACDT01000126.1| GENE 33 40299 - 41144 808 281 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733347|ref|ZP_04563828.1| ## NR: gi|237733347|ref|ZP_04563828.1| predicted protein [Mollicutes bacterium D7] # 1 281 7 287 287 487 100.0 1e-136 MALTNLINADKVSKMNKKPVGLKTIDIDDIEVNQHNQFDINGIDELKSSILEYGLRKPLD VYKIGDRYRISDGERRYTALKELVDEGKIEPEVYCIIYEAPESELNDRLRLILGNGTRIM SELDKIKIVAELLDIYKQLDPKPKGNKRDWIAPFIGAKSGKTVQKYINAIEKKEECIDIG MYDEDKKDSCKNKEYGLSELFSDYNKLLKQIDKINKKVENTSIYHKKIAESDDRDITVED SLINIHNVIGKFSRCIQISIEIDEQNNDENVLDGQMSIDEI >gi|223714089|gb|ACDT01000126.1| GENE 34 41201 - 41416 151 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733348|ref|ZP_04563829.1| ## NR: gi|237733348|ref|ZP_04563829.1| predicted protein [Mollicutes bacterium D7] # 1 71 1 71 71 120 100.0 2e-26 MGDLVLRDGDEFFKLSFLKRFLLGTNGIIAGGCFKNIFNGERVKDIDIFFRDKRDLSDVT KSIKIILILPY >gi|223714089|gb|ACDT01000126.1| GENE 35 41542 - 41727 247 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733349|ref|ZP_04563830.1| ## NR: gi|237733349|ref|ZP_04563830.1| predicted protein [Mollicutes bacterium D7] # 1 61 1 61 61 90 100.0 4e-17 MPKIRVYANYEFTNDSDNCNDNGKTNTYGYSYDRYTREEQKEREKAYGYVRSILKSVLED S >gi|223714089|gb|ACDT01000126.1| GENE 36 41747 - 42319 588 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733350|ref|ZP_04563831.1| ## NR: gi|237733350|ref|ZP_04563831.1| predicted protein [Mollicutes bacterium D7] # 1 190 3 192 192 355 100.0 
1e-96 MYVIKYDIDIYAGTYNDMDGCYLYPSSYFNMQKPKVFKTLKGAEKHLSNLKNKVVYGKNA DVYNFKIIEWSENDLESHLMSIGVKPGKNDDPQNSNNISPDKKLEKYWKKVFKPLKNEEI EITNITVSHNVCTINYLMPYCTGSEEYIFQADIDKEKAVDGFCNTVEEKANEMIVMIREC SNIRIYFKEK Prediction of potential genes in microbial genomes Time: Thu May 26 10:28:55 2011 Seq name: gi|223714088|gb|ACDT01000127.1| Coprobacillus sp. D7 cont1.127, whole genome shotgun sequence Length of sequence - 1399 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 25 - 228 303 ## gi|237733351|ref|ZP_04563832.1| predicted protein 2 1 Op 2 . + CDS 244 - 402 170 ## gi|237733352|ref|ZP_04563833.1| predicted protein 3 1 Op 3 . + CDS 439 - 705 83 ## + Term 855 - 898 -0.9 4 2 Tu 1 . - CDS 888 - 1316 199 ## gi|237733353|ref|ZP_04563834.1| predicted protein - Prom 1338 - 1397 4.1 Predicted protein(s) >gi|223714088|gb|ACDT01000127.1| GENE 1 25 - 228 303 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733351|ref|ZP_04563832.1| ## NR: gi|237733351|ref|ZP_04563832.1| predicted protein [Mollicutes bacterium D7] # 1 67 1 67 67 115 100.0 7e-25 MLSKKGRLLYALHTVKTLFRDAGIENAKVEVELKDGTVETIDCMEEIELFITDYINEDNN FNPSEED >gi|223714088|gb|ACDT01000127.1| GENE 2 244 - 402 170 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733352|ref|ZP_04563833.1| ## NR: gi|237733352|ref|ZP_04563833.1| predicted protein [Mollicutes bacterium D7] # 1 52 1 52 52 78 100.0 2e-13 MGRFYKASSGRILLIENDEKLTQEEVDRADELQQIAFPDEWKDQNPNNKKED >gi|223714088|gb|ACDT01000127.1| GENE 3 439 - 705 83 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIKEKELLEDINSYIDQYRCELAKQQVKSILDNLSFIYYKTKNLIIELNDVINKIKNSYR TSIIKLFKKLLMNCLKKEIILDCGRKYN >gi|223714088|gb|ACDT01000127.1| GENE 4 888 - 1316 199 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733353|ref|ZP_04563834.1| ## NR: gi|237733353|ref|ZP_04563834.1| predicted protein [Mollicutes bacterium D7] # 1 142 1 142 142 259 100.0 5e-68 
MKEFAEDIMKSETILVTGSHKSALPAEMLRLNLLKFGKKVLYVPLDLIHDLNHFVTEKDM IIYFTNSGSSLPHYKSALKDVKKEKNCNLSILTMNNKLALKNSCDNFIWLPSSVNQHFDK YLENQIVFFLYIDLLASQLSID Prediction of potential genes in microbial genomes Time: Thu May 26 10:29:16 2011 Seq name: gi|223714087|gb|ACDT01000128.1| Coprobacillus sp. D7 cont1.128, whole genome shotgun sequence Length of sequence - 5693 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 184 132 ## - Prom 204 - 263 8.4 + Prom 207 - 266 11.2 2 2 Tu 1 . + CDS 343 - 1683 1584 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 3 3 Op 1 . - CDS 1972 - 2199 77 ## gi|167756364|ref|ZP_02428491.1| hypothetical protein CLORAM_01897 4 3 Op 2 . - CDS 2201 - 2389 103 ## gi|237733357|ref|ZP_04563838.1| predicted protein - Prom 2429 - 2488 2.4 + Prom 2476 - 2535 5.4 5 4 Op 1 . + CDS 2592 - 3941 1576 ## Clos_1804 hypothetical protein 6 4 Op 2 . + CDS 3941 - 4240 313 ## Amet_4242 hypothetical protein 7 4 Op 3 . 
+ CDS 4243 - 5172 902 ## COG1250 3-hydroxyacyl-CoA dehydrogenase Predicted protein(s) >gi|223714087|gb|ACDT01000128.1| GENE 1 1 - 184 132 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEIGYNKLDITPTHPVRIAGYNRKEKSKGQLDPIEINSIFIKNNDKYIVISLLDSIIIED S >gi|223714087|gb|ACDT01000128.1| GENE 2 343 - 1683 1584 446 aa, chain + ## HITS:1 COG:VC1282 KEGG:ns NR:ns ## COG: VC1282 COG1455 # Protein_GI_number: 15641295 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Vibrio cholerae # 8 439 6 427 446 219 33.0 7e-57 MEKTEKGLIQKIQAILEDRVVPVAMKISQQRHLASIRDGLTILVPITIIGGFAVLLAQPP VAADVMPTNFILQILCAWRDWAASCADTLLIPYNLTIGAISIYVVLGVSYRLCKHYKMDT ISNIITTLLVYLCVAGIPTAFVTDEATVTAVPLTNLGASGMFTAIIVALGVIEINHFFIE KKLVIRLPDSVPPNVAAPFNVLIPGVVSLLVFMGIDGLCIAYLGTGITGLVYAVFQPLLS ATGSLPSIILINLLMTTFWFFGIHGGNMLGIVTTPITTAALALNAEAYVAGKELPCIFAG AFNTVYGGYISYMAVIICILIAGKAAQSRSIAKLAIVSTAFNINEPVIFGLPTVLNPFTL IVFFICNNLNVAIAYILMSSGFLGKFYITLPFTVPGPLQAWLASMDIKAIFVWLALLVLD VVIAMPFMRAYDKQLLAEEVETIAEN >gi|223714087|gb|ACDT01000128.1| GENE 3 1972 - 2199 77 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756364|ref|ZP_02428491.1| ## NR: gi|167756364|ref|ZP_02428491.1| hypothetical protein CLORAM_01897 [Clostridium ramosum DSM 1402] # 1 75 24 98 98 132 100.0 9e-30 MGSINCFLFLNFLDKNQKIKIGFEATGHYGSSLKQFLKAYNFDFMEINSFLIKQFSKAST LRKTKTDKIDSVLIP >gi|223714087|gb|ACDT01000128.1| GENE 4 2201 - 2389 103 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733357|ref|ZP_04563838.1| ## NR: gi|237733357|ref|ZP_04563838.1| predicted protein [Mollicutes bacterium D7] # 1 62 1 62 62 111 100.0 1e-23 MLYEQREYMKKKLVVLLTVRIILWRFFITYLVGIDIAKYKHDCFIHDHNGEAIRHSFTFV NN >gi|223714087|gb|ACDT01000128.1| GENE 5 2592 - 3941 1576 449 aa, chain + ## HITS:1 COG:no KEGG:Clos_1804 NR:ns ## KEGG: Clos_1804 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 1 445 1 445 447 481 54.0 1e-134 
MRKIRIGSGAGYGGDRIEPAIDLMENGNLDYMIFECLAERTIALANLQKIKNPELGYNHL LEYRFQPILGEYKKGNRIKIVTNMGAANPYSAAKKIVEMANCIGVKVKVAAVLGDDVLNV INQYMDESTLETNQPLSTIKDEIVSANAYIGCAGISEALRNGAEIIITGRVADPSLTLGI LVHEFNWSMNNYDLLGKGTIAGHLLECAGQVTGGYYYDGDKKNVPDLWNLGFPIAIVGAD GSIEIAKTETSGGLVNEMTCQEQTLYEIQDPENYFTPDCIANFKDVKIKQIKKDVVAISG VTGKAPNGKYKVSVGYHDCYIGEGQISYGGYNAYKRALKAKEVLEKRLEVTKMDLDEVRF DFIGVNSLYGDTLPNDKDYHEIRLRLVARTKDYQTAVRVGQEVEAMYTNGPSGGGGVTQS VREIVSVASILIDQNVVKQSVEYLESGDE >gi|223714087|gb|ACDT01000128.1| GENE 6 3941 - 4240 313 99 aa, chain + ## HITS:1 COG:no KEGG:Amet_4242 NR:ns ## KEGG: Amet_4242 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 1 99 1 99 99 132 71.0 4e-30 MKLYEIAHSRTGDKGNISVISIIAYDEKDFELLKEKVTEDKVKRYFKDIVFGEVKRYTLE NIYALNFVMDNALGGGVTRSLALDMHGKTLGSALLEMEI >gi|223714087|gb|ACDT01000128.1| GENE 7 4243 - 5172 902 309 aa, chain + ## HITS:1 COG:mlr6793 KEGG:ns NR:ns ## COG: mlr6793 COG1250 # Protein_GI_number: 13475669 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Mesorhizobium loti # 1 294 1 291 309 224 38.0 2e-58 MIKNIVIAGAGTMGASMAECFSGYGFRVYVYDAFEKSLENGKNLIYLNQETLVNEGLISK EKSAQIIQNLSFHTDIEIFKEADLVVESITEEIEIKSDFYSKISNIVNDNCIICSNTSAL PITLLSKSVNKPERFLGMHWFNPPHIIPLIEIIKSNKTDYRYVNLIYDLSKSIDKQPVIV NKDIKGFVANRIQFAVLREALYLVDNDVISVEDIDKVMKYALGFRYACFGPLEIADFGGL DTFEHISKFLNPNLCNDTTISKSLVELVENNNYGVKNGKGFYDYSNGKDTKAIQRRDELY LKLTKLLTD Prediction of potential genes in microbial genomes Time: Thu May 26 10:29:39 2011 Seq name: gi|223714086|gb|ACDT01000129.1| Coprobacillus sp. D7 cont1.129, whole genome shotgun sequence Length of sequence - 2786 bp Number of predicted genes - 5, with homology - 3 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 42 - 101 5.4 1 1 Op 1 . + CDS 208 - 414 114 ## gi|237733361|ref|ZP_04563842.1| predicted protein 2 1 Op 2 . 
+ CDS 417 - 773 200 ## gi|237733362|ref|ZP_04563843.1| predicted protein + Prom 936 - 995 9.3 3 2 Op 1 . + CDS 1171 - 1299 84 ## 4 2 Op 2 . + CDS 1292 - 2263 781 ## COG4227 Antirestriction protein 5 2 Op 3 . + CDS 2303 - 2786 194 ## Predicted protein(s) >gi|223714086|gb|ACDT01000129.1| GENE 1 208 - 414 114 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733361|ref|ZP_04563842.1| ## NR: gi|237733361|ref|ZP_04563842.1| predicted protein [Mollicutes bacterium D7] # 1 68 30 97 97 115 100.0 7e-25 MNDDLYDKLYDYLIKLSRGNDKAVAHTKLMMEECRPIIEKIEKDEQISNDEFNSFMEKFR VFKRKYLM >gi|223714086|gb|ACDT01000129.1| GENE 2 417 - 773 200 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733362|ref|ZP_04563843.1| ## NR: gi|237733362|ref|ZP_04563843.1| predicted protein [Mollicutes bacterium D7] # 1 118 1 118 118 219 100.0 5e-56 MHFQYHKYRTAGIKIIFWKKRIIEIYDLNDKLITGVSNQDQLDEFMDDNEFEYPEVIIKE NGKWNSSGDFIIINFNYNSLTKEQIHFYKCIFVKNHGYFTECKILSSFTVYNPTQKTA >gi|223714086|gb|ACDT01000129.1| GENE 3 1171 - 1299 84 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCENNSKLFEVLGMIEDDIEINDEDIDSEQFFSLKEVGVTNE >gi|223714086|gb|ACDT01000129.1| GENE 4 1292 - 2263 781 323 aa, chain + ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 10 301 221 504 522 112 30.0 9e-25 MNNEMSKTRQMLIEGYIESLEQEQIPWRKGWGNGDSQHNPITNTFYRGINQLLLNYKYDE RRYEDPRWLTFIQIMNEGWTLENAKGQGVPLEKWGIYDREEKKYINVSDMKKIIAEEQLE GREVDARFCWSCKTFTVFNASLVKGIEKYEIIRNEFDIDEVGLEMIENYCDATDLAIIEG RQSKGEAFYNKIHDTVYVPDRYYFDDSYEYLATVLHECCHSTGHPNRLNRNMLNEHDEYA LEELRAEIGSSFLCGDLGLDISNARIDNHKAYIQSWISGFKEKPNVLLSTINDAKKITDY IENKGELSGIIERQKSVIQGRKR >gi|223714086|gb|ACDT01000129.1| GENE 5 2303 - 2786 194 161 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNYPIIYLNEKKFVTLTGANNEYIKILKGAKDTNNESTIRIKIKQDDKMSGLGSREFEL VISESSYPNLFIEMMNYMEMVYLHDCFKGVDADKVIKKLHDKQDELEYWYIEYILENQID 
NNLNELLNNPIDYASTNYLAPHEIAYMIFQCEQWDKEHGFD Prediction of potential genes in microbial genomes Time: Thu May 26 10:30:02 2011 Seq name: gi|223714085|gb|ACDT01000130.1| Coprobacillus sp. D7 cont1.130, whole genome shotgun sequence Length of sequence - 3968 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 222 178 ## + Term 255 - 296 -0.8 + Prom 602 - 661 7.4 2 2 Op 1 . + CDS 704 - 943 190 ## COG1396 Predicted transcriptional regulators + Term 945 - 983 7.0 3 2 Op 2 . + CDS 995 - 1243 309 ## gi|237733366|ref|ZP_04563847.1| predicted protein + Prom 1245 - 1304 2.6 4 3 Op 1 . + CDS 1324 - 2424 1213 ## COG3786 Uncharacterized protein conserved in bacteria 5 3 Op 2 . + CDS 2476 - 3651 1147 ## gi|237733368|ref|ZP_04563849.1| predicted protein + Term 3653 - 3683 -0.3 Predicted protein(s) >gi|223714085|gb|ACDT01000130.1| GENE 1 1 - 222 178 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no RFNIDYHEENINRYINKYASDKIISSKLKRNLDVYSKLKKYQKKGEQELVYKVKKLLIFV LCLLSLFITGYSG >gi|223714085|gb|ACDT01000130.1| GENE 2 704 - 943 190 79 aa, chain + ## HITS:1 COG:L80045 KEGG:ns NR:ns ## COG: L80045 COG1396 # Protein_GI_number: 15674195 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 4 72 7 75 102 59 47.0 1e-09 MNIGKRLRKLRIEHNLTQNEVSILTGIKRSSIASYELNEQLPPVDKLIQLANLYKVSLDY LCGLDNSEIRNTRNEKSKI >gi|223714085|gb|ACDT01000130.1| GENE 3 995 - 1243 309 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733366|ref|ZP_04563847.1| ## NR: gi|237733366|ref|ZP_04563847.1| predicted protein [Mollicutes bacterium D7] # 1 82 7 88 88 141 100.0 1e-32 MKVKELCLNYLDTRNKKIIAGISVVAVILASTLGIYALTKDNLPDFVLSDSKTIKLEYGD KYSVNGMKLLNTEGMDDEDKKY >gi|223714085|gb|ACDT01000130.1| GENE 4 1324 - 2424 1213 366 aa, chain + ## HITS:1 COG:MT0322 KEGG:ns NR:ns ## COG: MT0322 COG3786 # Protein_GI_number: 15839693 # Func_class: S Function 
unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mycobacterium tuberculosis CDC1551 # 184 358 41 216 218 110 33.0 3e-24 MTFNDNTLVKKVKVADTTAPELNTEFRDIDIVKGTDLNNYDFGGLNLFNATDLSPVEVGY DSSAIDTNTVGTYVLKTTARDSSGNETVKEMSTNITEAPNENQELVTETVTNEDGTKSIR NTLRDKATAQDNTANTKIGSNSSKGSSTNKKGSTGSTGSSSNSSSNSGVNQSFVANMSIS RQTTQAITVVGNGGSYATLTLHTKRNGIWTETLSCTARVGKNGITSNKREGDGKTPTGIY SFGQAFGVAGNPGTSRGWLQVNNNHYWVDDVNSPYYNKLVDASQTGIQWSSAEHLIGYPT AYKYAIAVNYNTACTPGAGSAIFLHCSTGGSTAGCISVSQSNMIRILQSLQGDTLIGIYQ NSNSLY >gi|223714085|gb|ACDT01000130.1| GENE 5 2476 - 3651 1147 391 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733368|ref|ZP_04563849.1| ## NR: gi|237733368|ref|ZP_04563849.1| predicted protein [Mollicutes bacterium D7] # 1 391 1 391 391 526 100.0 1e-147 MDIRKDIIDKYLNTKKNRIIAGCVTGLIVLGCGGGIVYATRDTSPKLQLKKNTINLEYGK EFKADFDTLVDTKGLNKEDKEYLKKNLKIKSDIKNDTESVTKEDGTTEEKDRGFAKVGDY KVNLTYKDETKTVKVVVKDTTAPEITVPENIEILQGTDLSTFDFKSLITATDLAQMELNI DYSTINVNVPMEYIAKANVEDVNGNKNEIEFKVTVIALPIVAEDEVVVQETVTNADGTKT VKNTVKKKADSSGDKVVSSGNNSNSSSNSSGSKPSGSTGGSSGSSSSGSNKPSNGGTSGG SSSGGNSGNSGGSSGSGSGSGNTGSTGETTKKYIYVEWSYKVGDRTFNGSWAGREEKWNP NAMSLPNDAYDFSGFNKKSITYEEYKQIMGY Prediction of potential genes in microbial genomes Time: Thu May 26 10:30:44 2011 Seq name: gi|223714084|gb|ACDT01000131.1| Coprobacillus sp. D7 cont1.131, whole genome shotgun sequence Length of sequence - 3810 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 415 229 ## gi|237733369|ref|ZP_04563850.1| conserved hypothetical protein 2 1 Op 2 . + CDS 396 - 2222 1646 ## BDP_0277 putative sortase-anchored surface protein 3 1 Op 3 . + CDS 2200 - 3000 633 ## COG3764 Sortase (surface protein transpeptidase) 4 1 Op 4 . 
+ CDS 2993 - 3595 501 ## gi|237733372|ref|ZP_04563853.1| conserved hypothetical protein Predicted protein(s) >gi|223714084|gb|ACDT01000131.1| GENE 1 2 - 415 229 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733369|ref|ZP_04563850.1| ## NR: gi|237733369|ref|ZP_04563850.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 137 1 137 137 251 100.0 1e-65 EFSLYSDKECTKEIEKGKTDKNGQLNFDRISVGDFYLKETKAPAGYRLLEDPIKISLKNV NGKFTFFVNDKEIKEDDKNNSLTLENGLYTGNLTVINSRGSILPATGSPMTIILTAAGVL CLLITIKRGKNNDEEQI >gi|223714084|gb|ACDT01000131.1| GENE 2 396 - 2222 1646 608 aa, chain + ## HITS:1 COG:no KEGG:BDP_0277 NR:ns ## KEGG: BDP_0277 # Name: not_defined # Def: putative sortase-anchored surface protein # Organism: B.dentium # Pathway: not_defined # 174 602 123 493 495 134 33.0 9e-30 MMKNKFKKLGITALACVLSLTTAGTVANLVNSGNTGNGIFTKLNAEGTGHITINANVGDN GVTQSLAGKKFNIYRIFDAQNSAGMESINYTMNPAYEKSLKKVTGKDTEYAIIDYIQTLN NNQVINDVSQTQKNETAYSNFRYFVESLRNMLVEDKADPTQVLTVPSTATDSYTLDVAYG WYIIDEITSVNGTHSAASMIMVNTANPDVAIDIKSDFPVIQKQILEDDNQKNIGADKDGW NDVGDYEIGQTVPYRYLTYAPDMNGYANYYFAVHDRMDSQLTFNKDSVVVKVGNKTLVKD TDYKVVTTGLTGETFQIQITDLKATINKYFYASEAGANPETEKLYGQKIVVEYNATLNES AQADTGRAGFENDVRLEYSNDPDSNGTGKTGLTPWDTVVAFTFRMDGIKVNDQTPERKLE GAKFRLYSDEDCTKEVYVKKASGGDGYTVINRDSVTGEAVEMVSDSNGVFNIIGLDSQTY YLKETKAPDGYRLLKDPIKIDVKATYGADNRLNYVKGDGATAKTLQKLAATASFKEFYTG AYSQYDKTLSTDVDTGTFNIKVVNKVGSKLPATGSMATLLLVATGTAVMVTVLVRRRKNE KNENSQIS >gi|223714084|gb|ACDT01000131.1| GENE 3 2200 - 3000 633 266 aa, chain + ## HITS:1 COG:SP0467 KEGG:ns NR:ns ## COG: SP0467 COG3764 # Protein_GI_number: 15900383 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase (surface protein transpeptidase) # Organism: Streptococcus pneumoniae TIGR4 # 26 223 1 204 261 161 41.0 1e-39 MRILKFLKKDILIFITGLTLVSYPVISNYIESREQADQIDTYNSSVSELSRANKQLMLNE AHEWNDNLYSKEKGLPEDSGLEYENVLDLGNGVIGTVEIPKIKVNMPVYHGTDSDVLNVG AGHIEDSSLPVGGNNTRTVLTGHRGLPSSKLFTRLDELKKGDLFFIKVIDETLAYKINKI EVILPDKVSYSIIDNQDLATLITCTPYGLNTHRLVITGKRVPYKKKEKKSIKSSIPSLRE 
IIFYVIPVLFSVAGILFFRKRGVKNV >gi|223714084|gb|ACDT01000131.1| GENE 4 2993 - 3595 501 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733372|ref|ZP_04563853.1| ## NR: gi|237733372|ref|ZP_04563853.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 190 1 190 200 325 100.0 1e-87 MCKKLITAVFTVFLFFISITAVSALDEVGSITVNLEEGKKGTSVKNVELELIKVGDVVNG QYLFIDDLQDIEIDLNTLETAEDMKNAAYTISKITVSKNIVGTRKTTNGYGTVKFDQLEK GVYLLQATDINKYENIVSTLISVPVFNNESKNSMNYDISIVPKHSPVIAVKTGDAFDLKL FAVLAGVSAVIIAIIRREAI Prediction of potential genes in microbial genomes Time: Thu May 26 10:31:10 2011 Seq name: gi|223714083|gb|ACDT01000132.1| Coprobacillus sp. D7 cont1.132, whole genome shotgun sequence Length of sequence - 12935 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 3, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1807 1890 ## COG4932 Predicted outer membrane protein + Term 1814 - 1844 -0.5 2 1 Op 2 . + CDS 1870 - 2160 194 ## gi|237733374|ref|ZP_04563855.1| predicted protein 3 1 Op 3 . + CDS 2147 - 2854 237 ## gi|237733375|ref|ZP_04563856.1| predicted protein 4 1 Op 4 . + CDS 2875 - 3279 417 ## gi|237733376|ref|ZP_04563857.1| predicted protein 5 1 Op 5 . + CDS 3284 - 3622 287 ## gi|237733377|ref|ZP_04563858.1| predicted protein 6 1 Op 6 . + CDS 3659 - 5497 1485 ## gi|237733378|ref|ZP_04563859.1| predicted protein 7 1 Op 7 . + CDS 5500 - 5841 328 ## gi|237733379|ref|ZP_04563860.1| predicted protein 8 1 Op 8 . + CDS 5783 - 6487 533 ## gi|237733380|ref|ZP_04563861.1| predicted protein 9 1 Op 9 . + CDS 6474 - 8306 1299 ## COG3451 Type IV secretory pathway, VirB4 components + Term 8309 - 8350 3.0 + Prom 8317 - 8376 10.7 10 2 Op 1 . + CDS 8410 - 9513 597 ## Elen_0713 hypothetical protein 11 2 Op 2 . + CDS 9548 - 10165 643 ## FMG_1594 hypothetical protein 12 2 Op 3 . + CDS 10176 - 11048 662 ## Apre_0680 hypothetical protein 13 2 Op 4 . 
+ CDS 11071 - 11589 407 ## gi|237733385|ref|ZP_04563866.1| predicted protein + Term 11607 - 11643 5.9 + Prom 11781 - 11840 8.7 14 3 Op 1 . + CDS 11876 - 12130 219 ## gi|237733386|ref|ZP_04563867.1| predicted protein 15 3 Op 2 . + CDS 12117 - 12932 583 ## gi|237733387|ref|ZP_04563868.1| predicted protein Predicted protein(s) >gi|223714083|gb|ACDT01000132.1| GENE 1 2 - 1807 1890 601 aa, chain + ## HITS:1 COG:L148778 KEGG:ns NR:ns ## COG: L148778 COG4932 # Protein_GI_number: 15672133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Lactococcus lactis # 41 550 1115 1735 1983 76 23.0 2e-13 KWMDVGDFELRKTNDTGKLIDGSEFSLKHTTLDFEKKLVVKDGKLSMTGLPVGEYLLTET KVPSGHAVLQKTFQVTVNKDQTTEQIVVNKLRPTGTLEINKTLEASNVEANKADYDLTKV QFKITANQDIYDSVSLEKLYSKGDAITVGSGKGSNDDVVKLVNGTELENGIYTCDTNGKL SLSGLPMGTYSVEEIACPDGFVLDKEIKTVQFAQQDFVTLTYTSSLNINNELTKTVFSKV TADGDNLYGVPMEIVDVKTGKPVHTWITDDSSLEIDGLPTGDYIWREVNAPEGYVLAKPI EFTVKDGDIQDIEMKNFSVEFAKNGNDGNKLLKGGEFQVTSTKTKQIIDKWTSGEHIFDI TEDMKTKLTSGEVVSDMFINIEDDSSTYYRISKNADRDDYRLLMQANGETKYYNIDINGD ETTHMIRGTVEDQEYVITELKSPDGYATAEPVSFKTNKEQNLTVDMTDEITQFEFYKKDI TSQEELEGATLQIKDKNGNIVDEWVSGKTPHKITGLTVGQTYTMIEVIAPVNYKIAQNKE FTVSDTGEVQKITMYDELMPVAKKVKTADDMYIGMYMMLGGLSALSIGMFMSNRNKKMHK N >gi|223714083|gb|ACDT01000132.1| GENE 2 1870 - 2160 194 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733374|ref|ZP_04563855.1| ## NR: gi|237733374|ref|ZP_04563855.1| predicted protein [Mollicutes bacterium D7] # 1 96 5 100 100 139 100.0 8e-32 MIKKSKIIMKNKNLKIALIVLACLGLTMFGYVAYKYDFRVAYTLFLSIIVSLIPVVTGIF IIRLSFKFFIKEMPGIMNDFFKIKTIFKKAFMNENN >gi|223714083|gb|ACDT01000132.1| GENE 3 2147 - 2854 237 235 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733375|ref|ZP_04563856.1| ## NR: gi|237733375|ref|ZP_04563856.1| predicted protein [Mollicutes bacterium D7] # 1 235 1 235 235 408 100.0 1e-112 MKITNHSDIKRFSKIEFLVYEYVNLMALVNPAVGFLSEDAVKFFIMYSIDTYDDRNNSSY TTMNESVYTLEKRIENGDYKDNRKLLDRFNHFKQILKDTEPLYLKVNDEKKEKSRWTTIT 
FTKNQLGKTYIANGREVCNIRLPIGCNYERYALTVPKSLISESNYSNTMYFRTTENFIHS ITTFDDKTKSWLKVELDTDTLKKEMHSIFKNHKKDTNINKTELEKINKNSCARSR >gi|223714083|gb|ACDT01000132.1| GENE 4 2875 - 3279 417 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733376|ref|ZP_04563857.1| ## NR: gi|237733376|ref|ZP_04563857.1| predicted protein [Mollicutes bacterium D7] # 1 134 1 134 134 247 100.0 2e-64 MQNILIVKSCDFVYSPMFSCEDVDPAYQQAFIKPNFYFVDDSKLNLKIDISVGKNKTDYD FNMLINYEVDLDECYSTSKKPKHEIVKEIMKEFDILDGCRTLVEQQIEMSKKLNLHLKNN IKNTRVKNLLKGGH >gi|223714083|gb|ACDT01000132.1| GENE 5 3284 - 3622 287 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733377|ref|ZP_04563858.1| ## NR: gi|237733377|ref|ZP_04563858.1| predicted protein [Mollicutes bacterium D7] # 1 112 1 112 112 197 100.0 2e-49 MIKKIKYKLSGFVTMLSLLFFKINYIYAIKAPSTEDGKKIAKPWIDGFITFCQWAMVGVF VGFVIFNAVRYYQLSQQERERTSFWDTVKKGFYILVFAELIVPILSIFGLSW >gi|223714083|gb|ACDT01000132.1| GENE 6 3659 - 5497 1485 612 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733378|ref|ZP_04563859.1| ## NR: gi|237733378|ref|ZP_04563859.1| predicted protein [Mollicutes bacterium D7] # 1 612 1 612 612 1078 100.0 0 MDLLDVEGLIRDLLWAMCNLVLNIVDALYEIAKRICSLDLTKIDFIWQWYDVLVVFFVLF LLFRIFKVFFKSYIDQDFMQKLNVGKVMVMLFVSIFSFSMFPYGFAYLSDISNSAIENIA VFTNSSPDASFSSILVQQASIDISADLSKNADDINNLKKWKNEHKKISESKFNSFIADFD WAEALVGKYSSSKMVIISSAIANNLSSISELTYKQYSDYYDKTYGENIDINKIDINECYE LDDGVPVLGDIPVANWWFGATQIYYFFPSYSSLFFGLMASVSALFMFLPICLQMAVRVIG MILKIMLSPYALSSLVDPESNSFKVWSKSLLSDLLLNFFQLYSMILIFTLYSSSDLDKAI GATGYGMFVKLFLFLGGLLAVVKAPAGITEIIGGGELGAASSAQGFMGLLAAAGFIMHPK QSLQMRLNPASLYGGGRQGGRFSRNGNSEDLGENGLNEGENIDTGFNQNGSMGEGADTGF NQKDSSNEQFDSYGNNMSSEEELGADGNLTGGSGKSENSYSTSSSSPTELQLDSAKYYGI EGAEKMSQQQLKQEFTKRGYGSSFPEQPGYNIDAGIGNSKVNIDNASSNNSVKPASQNSL LNDHSLNKDRRF >gi|223714083|gb|ACDT01000132.1| GENE 7 5500 - 5841 328 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733379|ref|ZP_04563860.1| ## NR: gi|237733379|ref|ZP_04563860.1| predicted protein [Mollicutes bacterium D7] # 1 113 1 113 113 191 100.0 2e-47 
MKFIVPQNFKNGRLIMKRYRLFDLVMLISVVGFSLVVFIAVVQSGIGRNGVIFFFAFFSI LIFITFILTMGAGINHNILELILTIKDYYKTEKKYEWAGRIYADEESFKKEQR >gi|223714083|gb|ACDT01000132.1| GENE 8 5783 - 6487 533 234 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733380|ref|ZP_04563861.1| ## NR: gi|237733380|ref|ZP_04563861.1| predicted protein [Mollicutes bacterium D7] # 8 234 1 227 227 369 100.0 1e-101 MNGQVEYMQMRKASKKNKDKKKKVKLTKSDKKIIKERRKVRYTTLDWMSLEEINPDHCVI VEFGNRFILKGIRIMPRNIYLLDDKTLGSIIQGFNVFLNKEKRRLYWKFVKSEPNMIKQN RHLIDLYEDEDDERIKRFIQYELEKNEWFMSSFNEMSFYILILEPEKYIDKHFQDLCNYF KSTRMQFEIAKKKDFQNMIYADFDNRDISDYYIPKLYSEEFNVTKGEEGTDEEF >gi|223714083|gb|ACDT01000132.1| GENE 9 6474 - 8306 1299 610 aa, chain + ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 148 592 406 834 853 72 25.0 3e-12 MKSFNDMKNLMFKSRAVPRGFHEYANYYVIGDKFHRAMTFTITKDFDEGFLGMFVAGQNY RVEMVTEPSKAKISNYMKTVVKDMEQEYSKTTDRIQRAELREQMDGYNAYIRELVRKRDA TLNCIITVFIEANTLEELDERTYDLDKLFSTYNVEVRAERFLQQRILKCSSPFFESDGFR KEASQNIGQLLSTQTVSGLYPFTFDTLDDEKGMLLGVESSQKGMVCFDQFLYLNNPAKAK KYKRLNGNLIIIGESGSGKTTAMDLILMSHICHTRKVIWIDPENKNRTIIKQAGGSYFNL GTKNARINIFDLKPISCDEDEEDEVDKYDTRLAIYNVVEDIKIVFRYLSKDISEGALDIL GELVLEAYETVGVNINESFQYLKNTDYPTFSVFSAIVDEHIEKFEKSNIVNDEYTALKEI RIKLKPICTEWARYLDGHTTITEDEMKNPFFAFGTKVFFNLTPGLKNALMHIIFQYAWGL CLDETSESVLAVDEAHMFVLAGETAKMIEQFQRRSRKYHNVTILGTQEPHDFADPRVLAE GKAIFNNASYKLILGLDKDATKDLRKLTALSDNEIKRIARFSQGDAILIAGNKRLPIEIK ATPDQLALMR >gi|223714083|gb|ACDT01000132.1| GENE 10 8410 - 9513 597 367 aa, chain + ## HITS:1 COG:no KEGG:Elen_0713 NR:ns ## KEGG: Elen_0713 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 170 362 97 321 549 187 44.0 4e-46 MDFEGGNTRDLFVAKAIDEYNAVNGAVGGDKYRKWYTGSADGANWCATFVSWIADQCGLI DAKVIPKYQGCDAGVDWFKKKSEFQYTPRYGGKSIVPSPGDIIFFCKGNKNDSTHTGIVI 
DVTDNTVHTIEGNSSNTVKRNSYKTDSKSILGYGIPAFPNSISNGVSGMLSEEFKYFAKY ESSCNYDQGFSSSDDYHALGYYQFDNRYDLQKFLKYCYQLDNNKYACFEAYLSCSKASLR NNKGLEKAWHTAYSNDPKDFANKQDQFEYDNYYIPIENALKSLGIDLSSRRDCIKGLVCG MANLFGNSGAKRLIKNSGITNDLGDKEFVTKLCDYLVSYCPKHYAYGKSYANRYKNEKND CLRYITN >gi|223714083|gb|ACDT01000132.1| GENE 11 9548 - 10165 643 205 aa, chain + ## HITS:1 COG:no KEGG:FMG_1594 NR:ns ## KEGG: FMG_1594 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 23 195 10 187 195 112 35.0 8e-24 MYDRNGDMMCLLENKVEEYYKLNGEYMVYSDMLEYGFTKREISRLVEKELLRKLVKGLYI YKNELEDEFFVYQFANKNMIYSHETACYLHGLTTVIPSRCNVTTYSGCHLRNNRLKVSYV KKELLHVGAVEHIDYFGNKIIVYDCERTICDLIRNKKKVDTQVYYQSLQSYFNNKKLDMR KLSKYGKLFNVESEIAEIYSLYKSA >gi|223714083|gb|ACDT01000132.1| GENE 12 10176 - 11048 662 290 aa, chain + ## HITS:1 COG:no KEGG:Apre_0680 NR:ns ## KEGG: Apre_0680 # Name: not_defined # Def: hypothetical protein # Organism: A.prevotii # Pathway: not_defined # 5 248 3 245 278 146 34.0 1e-33 MAYKNIDQWKNGMIKYAKEKNMDLRDVQQRFILEEFAEKISNSKYRDSLIIKGGFVVSNL LGFENRTTLDMDATVNSTVYSVEEINNMITDIIEEDITKSFFDYRIGDIKEGQSDDGYPG YSVSILALKGKTRLNLKIDISNNTLIYPEAIEHTFISLFSDKKINVCTYAVENIIAEKIE TTLDRGIYNTRMRDLFDVYNLLTQKDISIDMNVLIDSFVNVSKSRNTLNNIYDYEELISE LSESAVFKENFNRFKKIKEIEDVSLEDIFTVFKGVIEKIDVNQFINSRSR >gi|223714083|gb|ACDT01000132.1| GENE 13 11071 - 11589 407 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733385|ref|ZP_04563866.1| ## NR: gi|237733385|ref|ZP_04563866.1| predicted protein [Mollicutes bacterium D7] # 1 172 1 172 172 225 100.0 6e-58 MIIDRKYAVIKYFVIGFILLIIIIASIFYFTGNEDSKTKYKYKYLNEDQITFASDEQVDD YKIMLERYLEKNKYSDVETVKFYNRTFKEDNYVYFYCLLDDEFKTLLECKYDKSEEKFLN YFEWVGDKYDDSTEAPASKITYLEIVDKESYESKKFDEEIENREPDENIDSD >gi|223714083|gb|ACDT01000132.1| GENE 14 11876 - 12130 219 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733386|ref|ZP_04563867.1| ## NR: gi|237733386|ref|ZP_04563867.1| predicted protein [Mollicutes bacterium D7] # 1 84 1 84 84 91 100.0 2e-17 
MNKKLINRIKKLEEKKEQLLTQTSEIDVELKKLYELKKEDENLQKKRMCFEEKLDNILVK AKKPDEKSMQRNLCEDGENVNDIQ >gi|223714083|gb|ACDT01000132.1| GENE 15 12117 - 12932 583 271 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733387|ref|ZP_04563868.1| ## NR: gi|237733387|ref|ZP_04563868.1| predicted protein [Mollicutes bacterium D7] # 1 271 1 271 271 459 100.0 1e-127 MTFNKITYSLYQKVKYLIDAAYPNIDENVIGNYKKINIVLSKKTLKQNEKYEDRKCIIYN LYRQESELSNSLLICLAHHIDYLMRGETKNDSEFNKIYIHILHTAINEKMVKYDELKRTD DYKNKKMIQKALDSYWESNKKFDTVYLEIYNCYEIKSDLKRDGFVYNEYYQCWQKEVKTN NISTQKDYCFNLKSDLIFNIREKNHIIFTLYGMICVTGNTYFAKNILKKNKYFFKENCWQ KKIKSSNFLKEKRNLERQLPPAQGIKIEMEY Prediction of potential genes in microbial genomes Time: Thu May 26 10:32:49 2011 Seq name: gi|223714082|gb|ACDT01000133.1| Coprobacillus sp. D7 cont1.133, whole genome shotgun sequence Length of sequence - 8006 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 49 - 810 529 ## COG3177 Uncharacterized conserved protein + Term 959 - 996 3.1 + Prom 1007 - 1066 6.2 2 2 Op 1 . + CDS 1090 - 1455 244 ## gi|237733389|ref|ZP_04563870.1| predicted protein 3 2 Op 2 . + CDS 1421 - 1807 180 ## gi|237733390|ref|ZP_04563871.1| predicted protein + Prom 1809 - 1868 3.8 4 3 Op 1 . + CDS 1895 - 4480 1756 ## COG0790 FOG: TPR repeat, SEL1 subfamily 5 3 Op 2 . + CDS 4461 - 6479 1407 ## COG3505 Type IV secretory pathway, VirD4 components 6 3 Op 3 . 
+ CDS 6493 - 7917 1006 ## gi|237733393|ref|ZP_04563874.1| predicted protein Predicted protein(s) >gi|223714082|gb|ACDT01000133.1| GENE 1 49 - 810 529 253 aa, chain + ## HITS:1 COG:pli0008 KEGG:ns NR:ns ## COG: pli0008 COG3177 # Protein_GI_number: 18450294 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 16 218 4 212 254 73 27.0 2e-13 MNNIEKLKNELIEQRNTKFPGNIYHVTQVLFAYNSNKIEGSRLTEEQIEMIFETNSFIAK DYDVIKTDDITEAVNHFRLFDYILDVIDKPLTKEMIIEMNKIIKRGTSDEFNPRYNVGGF KVLPNVIGLTNVIKTSRPENVEKDINDLLFEYENRRNIDIEDIIDFHLKFERIHPFGDGN GRVGRAIMFKECLRKNIVPFVILDQHKAYYLRGLREYDRDKNFLIDTCLNEQDIYKDLCK PLLNMNFENGKTR >gi|223714082|gb|ACDT01000133.1| GENE 2 1090 - 1455 244 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733389|ref|ZP_04563870.1| ## NR: gi|237733389|ref|ZP_04563870.1| predicted protein [Mollicutes bacterium D7] # 1 121 1 121 121 175 100.0 9e-43 MAMEAVSFRFDNETLFKLSELTKFYKSNDKRISKTKVMADIISTAYDNINTSQKKCYNNE NEIVYREIVELEYTIIKLLTTNLKDVDNIEEIKRIINAKSRFQIYVEDKLNEKNSNMYSD R >gi|223714082|gb|ACDT01000133.1| GENE 3 1421 - 1807 180 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733390|ref|ZP_04563871.1| ## NR: gi|237733390|ref|ZP_04563871.1| predicted protein [Mollicutes bacterium D7] # 1 128 1 128 128 187 100.0 1e-46 MKKIAICIRIDSDIDNKINYLINDKKNEQSKYNDDTNVTKTDIITDAVELLFSYRTRDAF RLLNKEQETYLKGVLKQYQSLNNEFFNNLIFEILILKQLIVSTNGITESQLDETVKKVLD NITNANSN >gi|223714082|gb|ACDT01000133.1| GENE 4 1895 - 4480 1756 861 aa, chain + ## HITS:1 COG:ECU11g0430 KEGG:ns NR:ns ## COG: ECU11g0430 COG0790 # Protein_GI_number: 19074843 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Encephalitozoon_cuniculi # 483 801 175 501 590 84 27.0 1e-15 MTKSNAPIIGDVRFLKYGADAPDESRFKGKISTESLIGFFEYTEQEEKNVDCKDESLHDG GFFGYTSDKNEYTFSSMGWLDKNKHDKFRRELRKAFSKPGDLWWDTVVSVSSFEQLSEFG IVTADDWYDICLKALPEVFKTMNLEYDNMLWWGNYHIDTTHPHIHLCFLEKDKTRERGKL TPTELRKFKSAFIKEILKRAVFEKELKVSVNEFFKSKDVDFKNLMNVIDSRVLEKEKVSM 
NELYKILPKQGRLAYNSCNMLPYKKLIDSLIDKVLSDADVQKEFNIYLSKLEQLEDLMNR QVNTNISNIKSTELEKLRSQIGNHILKNYKRKITSKSKKKININDGSFEKSMIDNVEQNI IENSNTDVVDDLIENYSDILETADRAVNENEFILEWSQSYKEGANLFYSSKNESELHLAE QMLLSEAGRNNVLALELLGKLYSSNKLDEKSNIMYKRSLEGFKCIFDENINDKFIRSYAG YRLGKFYLYGTGTEIDYENAMKYFESSNNKYAYYSLGRMYQYGLGIEKNDEMALYYFEKS SEGNNAYANYEVAHHYEKGIACKVDLEKAETHYKTAYNEFKSMVEKHGDDNLLYRLGQMT YLGKGCGQDINKAVEYLQRAVKFENTNAKLLLAQIYLKEGYIGMYETALKYLHDVNNDTS NYLLGNIYCNGEIVKKDWTKAVQYFNKCKTNEYAYYKMYIIYKKNNQPDRALKYLNKSVE IGYEYSKIILAKEYMDGKYLDKDICKAIKLLKQSNNSYAQYLLGKIYLTGDGVKANNNLA KEYLKKSADQGNEYAQYLLKHSNSYQKTYRTRYSLKRLSNLSARYCRINQELAHKEYYKW LRDNGLLSEQEGREFNEKNSR >gi|223714082|gb|ACDT01000133.1| GENE 5 4461 - 6479 1407 672 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 190 586 162 500 591 86 23.0 1e-16 MKKIVVSIISFVLTKVVVDFIGMQLIKLFVLKDFDFKYIFDYSLLFLDGNKFFLLLSFFA FLVWFEIIYGIFFRDDVKKRRLTKNEQLKYAKIDRTRDVKKSLLRVDFNKTGLVEHRMSK QALKIKMTPIIDKINKRTKDIPILNNTQKVLIPAKCWNIGDKKCYVRAGFPITGNRKCIY VDPGDVHNLVIGTTGSGKTWTFVNEYLELCAMCGESVVVNDPKGELAKHTLQKFKDNGFN TVVINFVDPEFSNCWNPFKLAFDEWKKADDIYIKKYEEWKIKNKNGILEYELLGLDTKEI ENKLRKPKPNYSKAMDLLRDACNTICLEENPKDPIWTETARDMLVGCGALMCEEGKVEYM NATSFKKIIEIGKKQLPGIRNSKTILKAFLDSERSITDYSVNYLEGYVGSSGDTSTGFES QFNNKLSILTANEDIQRITSHSDFDFRQIGSEKTAIFLIVHDEKKTYYPLVTLFIKQLYE VLVRTARDEINLRLKVPINLILDEFGNMPALPDVDAMLTAARSRGIRLTAIIQGLEQLEK HYGKMAETLKGNFTNTIYLLSGSDSTVEEISKKAGSERVWNKDMRKYEDQRIFTPERLKD FKMGEILFLRQRHKNPYYTRLLPYDKYVFFDKSKFYLNLPIVEKPDVELFNIEYEYLKRH GKADKFDRFARN >gi|223714082|gb|ACDT01000133.1| GENE 6 6493 - 7917 1006 474 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733393|ref|ZP_04563874.1| ## NR: gi|237733393|ref|ZP_04563874.1| predicted protein [Mollicutes bacterium D7] # 1 474 1 474 474 795 100.0 0 MSWDEKEQAKKIREEYNIKYNNYMTLNLEQLNLCEDKETNASDFECESRTFADRIIVGKD 
RKRFAVKNSYFEYINFRKFECEETVFINCFFKGVKFIDCDVPNIKFINCNFDDFEIVRSN FIGSIFSGCLFIRSRVEESTLDESIFPLSNYYDIEIDTDKSKPNYSTVKGVLASEQKEVS SIETILKSDEVVPIEVETIENKIYKHETIVIEELKNNKFVKCEFIECVFENDKETDNYIN DLSIEFWNSNFKKCLFNATFFKTVFDGSNFLECRFKDKYIRCSFCDTVWNKNILEDTAVF NYSMIINAVLDFDFTSIPERNRKNVGIGTTANSKNIVNNLSEKDRQIEQLNKENKLLKEE LEKLDCKKKNLISFEEFDDQLLNEAFRLIIARCQRPEPVEFDLWNEIKAHIDSLSDLDFI QFVNSINNYKLSSNNLIDEDRQEKNMDNPDNNYIHEVAQEEKKEKLPGLEDGVV Prediction of potential genes in microbial genomes Time: Thu May 26 10:35:11 2011 Seq name: gi|223714081|gb|ACDT01000134.1| Coprobacillus sp. D7 cont1.134, whole genome shotgun sequence Length of sequence - 74605 bp Number of predicted genes - 86, with homology - 86 Number of transcription units - 17, operones - 15 average op.length - 5.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 671 630 ## COG0655 Multimeric flavodoxin WrbA 2 1 Op 2 . + CDS 682 - 1269 587 ## COG0406 Fructose-2,6-bisphosphatase + Prom 1302 - 1361 8.4 3 2 Op 1 40/0.000 + CDS 1482 - 1790 509 ## PROTEIN SUPPORTED gi|167755911|ref|ZP_02428038.1| hypothetical protein CLORAM_01431 4 2 Op 2 58/0.000 + CDS 1812 - 2540 1172 ## PROTEIN SUPPORTED gi|237733228|ref|ZP_04563709.1| 50S ribosomal protein L3 5 2 Op 3 61/0.000 + CDS 2562 - 3185 1046 ## PROTEIN SUPPORTED gi|167755913|ref|ZP_02428040.1| hypothetical protein CLORAM_01433 6 2 Op 4 61/0.000 + CDS 3185 - 3475 485 ## PROTEIN SUPPORTED gi|167755914|ref|ZP_02428041.1| hypothetical protein CLORAM_01434 7 2 Op 5 60/0.000 + CDS 3528 - 4361 1439 ## PROTEIN SUPPORTED gi|167755915|ref|ZP_02428042.1| hypothetical protein CLORAM_01435 8 2 Op 6 59/0.000 + CDS 4380 - 4658 492 ## PROTEIN SUPPORTED gi|167755916|ref|ZP_02428043.1| hypothetical protein CLORAM_01436 9 2 Op 7 61/0.000 + CDS 4678 - 5010 537 ## PROTEIN SUPPORTED gi|167755917|ref|ZP_02428044.1| hypothetical protein CLORAM_01437 10 2 Op 8 50/0.000 + CDS 5022 - 5675 1099 ## PROTEIN SUPPORTED gi|167755918|ref|ZP_02428045.1| hypothetical protein CLORAM_01438 11 2 Op 9 
50/0.000 + CDS 5678 - 6088 704 ## PROTEIN SUPPORTED gi|167755919|ref|ZP_02428046.1| hypothetical protein CLORAM_01439 12 2 Op 10 50/0.000 + CDS 6091 - 6285 200 ## PROTEIN SUPPORTED gi|227527923|ref|ZP_03957972.1| 50S ribosomal protein L29 13 2 Op 11 50/0.000 + CDS 6297 - 6557 432 ## PROTEIN SUPPORTED gi|167755921|ref|ZP_02428048.1| hypothetical protein CLORAM_01441 14 2 Op 12 57/0.000 + CDS 6592 - 6960 606 ## PROTEIN SUPPORTED gi|167755922|ref|ZP_02428049.1| hypothetical protein CLORAM_01442 15 2 Op 13 48/0.000 + CDS 6975 - 7310 556 ## PROTEIN SUPPORTED gi|167755923|ref|ZP_02428050.1| hypothetical protein CLORAM_01443 16 2 Op 14 50/0.000 + CDS 7327 - 7866 899 ## PROTEIN SUPPORTED gi|237733240|ref|ZP_04563721.1| 50S ribosomal protein L5 17 2 Op 15 50/0.000 + CDS 7884 - 8069 332 ## PROTEIN SUPPORTED gi|167755925|ref|ZP_02428052.1| hypothetical protein CLORAM_01445 18 2 Op 16 55/0.000 + CDS 8104 - 8499 662 ## PROTEIN SUPPORTED gi|167755926|ref|ZP_02428053.1| hypothetical protein CLORAM_01446 19 2 Op 17 46/0.000 + CDS 8530 - 9075 907 ## PROTEIN SUPPORTED gi|167755927|ref|ZP_02428054.1| hypothetical protein CLORAM_01447 20 2 Op 18 56/0.000 + CDS 9097 - 9453 585 ## PROTEIN SUPPORTED gi|167755928|ref|ZP_02428055.1| hypothetical protein CLORAM_01448 21 2 Op 19 . + CDS 9466 - 10029 949 ## PROTEIN SUPPORTED gi|167755929|ref|ZP_02428056.1| hypothetical protein CLORAM_01449 22 2 Op 20 . 
+ CDS 10043 - 10225 168 ## PROTEIN SUPPORTED gi|227871783|ref|ZP_03990188.1| ribosomal protein L30 23 2 Op 21 53/0.000 + CDS 10240 - 10680 743 ## PROTEIN SUPPORTED gi|167755931|ref|ZP_02428058.1| hypothetical protein CLORAM_01451 24 2 Op 22 28/0.000 + CDS 10682 - 11983 1196 ## COG0201 Preprotein translocase subunit SecY 25 2 Op 23 12/0.000 + CDS 12007 - 12654 759 ## COG0563 Adenylate kinase and related kinases 26 2 Op 24 9/0.000 + CDS 12654 - 13400 883 ## COG0024 Methionine aminopeptidase 27 2 Op 25 2/0.000 + CDS 13421 - 13642 273 ## PROTEIN SUPPORTED gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 28 2 Op 26 . + CDS 13674 - 13787 188 ## PROTEIN SUPPORTED gi|18311362|ref|NP_563296.1| ribosomal protein L36 29 2 Op 27 48/0.000 + CDS 13814 - 14179 603 ## PROTEIN SUPPORTED gi|167755936|ref|ZP_02428063.1| hypothetical protein CLORAM_01456 30 2 Op 28 32/0.000 + CDS 14199 - 14591 671 ## PROTEIN SUPPORTED gi|167755937|ref|ZP_02428064.1| hypothetical protein CLORAM_01457 31 2 Op 29 50/0.000 + CDS 14641 - 15585 1246 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 32 2 Op 30 6/0.000 + CDS 15618 - 15980 593 ## PROTEIN SUPPORTED gi|167755939|ref|ZP_02428066.1| hypothetical protein CLORAM_01459 + Prom 15995 - 16054 5.7 33 2 Op 31 15/0.000 + CDS 16080 - 16910 251 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 34 2 Op 32 34/0.000 + CDS 16886 - 17737 401 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 35 2 Op 33 8/0.000 + CDS 17730 - 18530 697 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 36 2 Op 34 . + CDS 18530 - 19267 595 ## COG0101 Pseudouridylate synthase 37 2 Op 35 . + CDS 19257 - 20132 901 ## Lebu_0869 histidinol phosphate phosphatase HisJ family + Term 20133 - 20175 7.2 + Prom 20521 - 20580 7.4 38 3 Op 1 . 
+ CDS 20601 - 21800 1267 ## gi|167755945|ref|ZP_02428072.1| hypothetical protein CLORAM_01465
+ Term 21831 - 21874 -0.2
39 3 Op 2 . + CDS 21897 - 22697 859 ## COG1073 Hydrolases of the alpha/beta superfamily
40 3 Op 3 . + CDS 22711 - 24930 2109 ## COG2199 FOG: GGDEF domain
41 3 Op 4 1/0.000 + CDS 24920 - 26368 1883 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase
+ Term 26369 - 26405 3.5
42 3 Op 5 . + CDS 26435 - 27271 835 ## COG1737 Transcriptional regulators
43 3 Op 6 . + CDS 27283 - 28311 1021 ## COG2008 Threonine aldolase
44 3 Op 7 . + CDS 28379 - 28654 172 ## PROTEIN SUPPORTED gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35
45 3 Op 8 . + CDS 28667 - 29080 228 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16
+ Prom 29248 - 29307 11.1
46 4 Op 1 . + CDS 29387 - 30220 934 ## COG2207 AraC-type DNA-binding domain-containing proteins
47 4 Op 2 . + CDS 30300 - 32963 2999 ## COG4099 Predicted peptidase
+ Term 32971 - 33007 2.1
48 5 Op 1 . - CDS 32988 - 33620 730 ## Apre_1054 hypothetical protein
49 5 Op 2 . - CDS 33691 - 34263 861 ## COG4869 Propanediol utilization protein
+ Prom 34334 - 34393 9.0
50 6 Op 1 21/0.000 + CDS 34566 - 39077 4674 ## COG0069 Glutamate synthase domain 2
51 6 Op 2 . + CDS 39080 - 40561 1718 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases
+ Term 40568 - 40613 5.3
- Term 40554 - 40603 4.5
52 7 Op 1 . - CDS 40617 - 41129 619 ## COG1528 Ferritin-like protein
53 7 Op 2 . - CDS 41201 - 41731 288 ## gi|237733277|ref|ZP_04563758.1| predicted protein
- Prom 41760 - 41819 4.6
+ Prom 41728 - 41787 6.6
54 8 Op 1 7/0.000 + CDS 41904 - 42731 840 ## COG1624 Uncharacterized conserved protein
55 8 Op 2 6/0.000 + CDS 42733 - 44148 1255 ## COG4856 Uncharacterized protein conserved in bacteria
56 8 Op 3 . + CDS 44164 - 45513 1736 ## COG1109 Phosphomannomutase
57 8 Op 4 40/0.000 + CDS 45588 - 46241 854 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
58 8 Op 5 2/0.000 + CDS 46325 - 47578 1252 ## COG0642 Signal transduction histidine kinase
+ Term 47582 - 47614 4.0
59 8 Op 6 . + CDS 47643 - 48524 785 ## COG0583 Transcriptional regulator
+ Prom 48535 - 48594 9.6
60 8 Op 7 . + CDS 48615 - 49550 1248 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases
+ Term 49567 - 49618 13.2
+ Prom 49586 - 49645 8.9
61 9 Op 1 . + CDS 49713 - 50186 459 ## COG4894 Uncharacterized conserved protein
+ Term 50187 - 50218 1.1
+ Prom 50188 - 50247 4.9
62 9 Op 2 . + CDS 50275 - 51636 1410 ## COG0534 Na+-driven multidrug efflux pump
+ Term 51656 - 51703 -0.7
+ Prom 51649 - 51708 6.1
63 10 Op 1 . + CDS 51764 - 53173 1611 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase
64 10 Op 2 . + CDS 53237 - 53740 296 ## gi|167755972|ref|ZP_02428099.1| hypothetical protein CLORAM_01492
65 10 Op 3 . + CDS 53737 - 54402 668 ## THA_612 autolysin response regulator
66 10 Op 4 . + CDS 54399 - 55634 747 ## Cphy_0975 signal transduction histidine kinase regulating citrate/malate metabolism
+ Prom 56937 - 56996 11.8
67 11 Op 1 . + CDS 57022 - 57414 291 ## EUBREC_2353 hypothetical protein
68 11 Op 2 . + CDS 57417 - 58250 1007 ## EUBELI_02032 hypothetical protein
69 11 Op 3 2/0.000 + CDS 58250 - 58882 743 ## COG2214 DnaJ-class molecular chaperone
70 11 Op 4 . + CDS 58898 - 60658 2142 ## COG0443 Molecular chaperone
+ Term 60659 - 60690 2.7
71 11 Op 5 . + CDS 60696 - 61244 672 ## COG0288 Carbonic anhydrase
+ Term 61269 - 61308 4.6
+ Prom 61272 - 61331 8.1
72 12 Op 1 10/0.000 + CDS 61456 - 62367 1085 ## COG0379 Quinolinate synthase
73 12 Op 2 13/0.000 + CDS 62381 - 63682 1435 ## COG0029 Aspartate oxidase
74 12 Op 3 . + CDS 63660 - 64514 517 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6
+ Term 64533 - 64574 6.1
+ Prom 64539 - 64598 10.2
75 13 Tu 1 . + CDS 64652 - 66019 1190 ## COG0665 Glycine/D-amino acid oxidases (deaminating)
+ Term 66024 - 66058 1.2
76 14 Op 1 . - CDS 66033 - 66656 504 ## gi|167755984|ref|ZP_02428111.1| hypothetical protein CLORAM_01504
77 14 Op 2 . - CDS 66637 - 67287 563 ## COG1636 Uncharacterized protein conserved in bacteria
78 14 Op 3 . - CDS 67332 - 67676 370 ## COG1416 Uncharacterized conserved protein
79 14 Op 4 . - CDS 67692 - 68060 379 ## gi|167755987|ref|ZP_02428114.1| hypothetical protein CLORAM_01507
80 14 Op 5 17/0.000 - CDS 68126 - 69151 1262 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase)
81 14 Op 6 . - CDS 69153 - 70289 929 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase
- Prom 70309 - 70368 8.2
82 15 Op 1 . - CDS 70383 - 71174 824 ## COG0561 Predicted hydrolases of the HAD superfamily
- Prom 71205 - 71264 5.2
83 15 Op 2 . - CDS 71267 - 71971 543 ## COG3279 Response regulator of the LytR/AlgR family
- Prom 71993 - 72052 5.9
+ Prom 71883 - 71942 6.7
84 16 Tu 1 . + CDS 72154 - 72720 188 ## gi|237733307|ref|ZP_04563788.1| hypothetical protein MBAG_02731
+ Prom 72863 - 72922 11.6
85 17 Op 1 5/0.000 + CDS 72949 - 73797 1125 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases
86 17 Op 2 .
+ CDS 73797 - 74604 1119 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases Predicted protein(s) >gi|223714081|gb|ACDT01000134.1| GENE 1 3 - 671 630 222 aa, chain + ## HITS:1 COG:CAC3341 KEGG:ns NR:ns ## COG: CAC3341 COG0655 # Protein_GI_number: 15896584 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 17 222 1 208 208 271 64.0 8e-73 IIIDCAIIFIKDGGCGMKVLLINGSFRRNGCTFTALNEVAKALNTNGIDTEIFHIGTKAI RGCMGCGVCHEKGECIDNSDAVNECLKLFEAADGIVIGSPVHYASSSGMIASFMDRLFYA SRFDKRFKVGATVVSCRRGGASATFDQLNKYFTIAQMPIVSSQYWNSIHGNTPEEVKQDF EGMQTMKVLGNNMAFLLKSIALGKEKYGLPDAELKIATNFIR >gi|223714081|gb|ACDT01000134.1| GENE 2 682 - 1269 587 195 aa, chain + ## HITS:1 COG:lin1208 KEGG:ns NR:ns ## COG: lin1208 COG0406 # Protein_GI_number: 16800277 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Listeria innocua # 4 151 5 143 199 89 38.0 3e-18 MTRIYLLRHGQTLFNQKNLIQGSCDSKLTAKGIAQAKAVKEYLDTINFAVVYCSTSERTL DTATYATGGKYPIYPCKEFKEIDFGVVEGEDGSQLFKGMKHDNFVSLLLNEGWSSLEGES GFDFTKRIFTKLDEITAQYPDENILIATHGGTILNTVVNIDHNFVNMDGPANCSITIIDC HGEYRVIEYNLQVAQ >gi|223714081|gb|ACDT01000134.1| GENE 3 1482 - 1790 509 102 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755911|ref|ZP_02428038.1| hypothetical protein CLORAM_01431 [Clostridium ramosum DSM 1402] # 1 102 1 102 102 200 100 2e-50 MANNKIRIRLKSFDHKILDNSAEKIIGAAKKSGAQVVGPVPLPTEKEIYTVLRAVHKYKD SREQFEIRTHKRLIDIVNPTQETVDILTRLELPSGVDIEIKL >gi|223714081|gb|ACDT01000134.1| GENE 4 1812 - 2540 1172 242 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237733228|ref|ZP_04563709.1| 50S ribosomal protein L3 [Mollicutes bacterium D7] # 11 242 1 232 232 456 100 1e-127 MKGILGRKIGMTQVFTTDGLLIPVTVVEADANVVLQKKTVATDGYDAIQVGFEDKREKLA NKAELGIVKKANTAPKRFIKEFRYDEMMSYEVGDKITVDSFVAGEVVDVTGTSKGKGYQG VIKRHGQRIGPKGHGSGAHRIVGSMGPIAPNRIAPGKKLPGQMGHVTRTVQNLEVVAVDV ENNLLLIKGSVPGPKKGLVIVKSGIKAAGKVNEAHELVDFTPVVEETKEADAATETPVEA AE 
>gi|223714081|gb|ACDT01000134.1| GENE 5 2562 - 3185 1046 207 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755913|ref|ZP_02428040.1| hypothetical protein CLORAM_01433 [Clostridium ramosum DSM 1402] # 1 207 1 207 207 407 100 1e-113 MLTVKVYNQEGSEVKDLELNEAVFGIEPNKQALFDMVLLQRASLRQGTHKVKNRTEVRGG GKKPWRQKGTGRARQGSIRAPQWRGGGVVFGPTPRSYKFKLNRKVRRLALKSALSTKIND NEFMALEAIKFDAPKTKEMVKVLANLEAPVKTLIVVDEICPNVARSANNIPGVKLLDAKH VNVYDILNSNKLIMTEAAIKSVEEVLG >gi|223714081|gb|ACDT01000134.1| GENE 6 3185 - 3475 485 96 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755914|ref|ZP_02428041.1| hypothetical protein CLORAM_01434 [Clostridium ramosum DSM 1402] # 1 96 1 96 96 191 100 1e-47 MAHITDVLKKPVLTEKTMTLQANENKYTFDVDVNANKVEIKQAVEAMFGVKVESVNVMNV KPKTKRMGRYEGKTNRRRKAIVKLAEGNEINYFGEE >gi|223714081|gb|ACDT01000134.1| GENE 7 3528 - 4361 1439 277 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755915|ref|ZP_02428042.1| hypothetical protein CLORAM_01435 [Clostridium ramosum DSM 1402] # 1 277 1 277 277 558 100 1e-158 MAIKTYKPVTNGRRNMTSLTYEEITTNKPEKSLVAKVSKNGGRNNQGVITTRHHGGGHKR KYRIIDFKRNKDDIVGTVATIEYDPNRSANIALINYADGEKRYILAPKGLEVGSKIVSGE NADIKVGNALPMGNMPEGTVIHNIEMQPGKGGQIARSAGVSAQILGKEERYVIVRLASGE VRKLLAVCRATVGVVGNEDHGLVNYGKAGRMRWKGVKPTVRGSVMNPNDHPHGGGEGRTS IGRKAPMTPWGKKAMGVKTRKNKKASTKLIVRRRNSK >gi|223714081|gb|ACDT01000134.1| GENE 8 4380 - 4658 492 92 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755916|ref|ZP_02428043.1| hypothetical protein CLORAM_01436 [Clostridium ramosum DSM 1402] # 1 92 1 92 92 194 100 2e-48 MARSLKKGPFVDGHLMKKVEALNAAGKKEVIKTWSRRSTIFPQFIEHTFAVYNGREHIPV YVTEDMVGHKLGEFAPTRTYHGHGADDKKAGK >gi|223714081|gb|ACDT01000134.1| GENE 9 4678 - 5010 537 110 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755917|ref|ZP_02428044.1| hypothetical protein CLORAM_01437 [Clostridium ramosum DSM 1402] # 1 110 1 110 110 211 100 9e-54 MEARAQAKMIRVSPQKARLVVDLVRGKQVKEALGMLEYVNKSATPAIIKVVKSAAQNAVY NEGAEAEKLYIKEIYVDEGPTLKRFAARAKGSGTRILKRTSHITCVVEER >gi|223714081|gb|ACDT01000134.1| GENE 10 5022 - 5675 1099 217 
aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755918|ref|ZP_02428045.1| hypothetical protein CLORAM_01438 [Clostridium ramosum DSM 1402] # 1 217 1 217 217 427 100 1e-119 MGQKVSPIGLRVGVNRNWDSRWYANDQDFAGLLHEDIKIREYLLKKLKTASVAKVEIERS KNRVTIFVHTSRPGVVIGKDGEAVDALRKEVSKMVKDKQVFINIVEIKNPDVVAQLVANN IAEQLENRASFRTVQKRAIQRAMRAGAKGIKTSVSGRLGGADMARAEGYSEGNVPLHTLR ADIDYATAEADTTYGKLGVKVWICKGEILPEKKKGDK >gi|223714081|gb|ACDT01000134.1| GENE 11 5678 - 6088 704 136 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755919|ref|ZP_02428046.1| hypothetical protein CLORAM_01439 [Clostridium ramosum DSM 1402] # 1 136 1 136 136 275 100 4e-73 MLMPKRTKYRRPHRVKYEGKAKGGTEVSFGEYGLQATEGAWITSRQIEAARIAINRRLNR GGQVWIRIFPHLAKTKKPLEVRMGSGKGSPEEWVAVVKTGRVLFEVAGVDEELAREALRL ASHKLPIKCKIIGKGE >gi|223714081|gb|ACDT01000134.1| GENE 12 6091 - 6285 200 64 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227527923|ref|ZP_03957972.1| 50S ribosomal protein L29 [Lactobacillus ruminis ATCC 25644] # 1 64 1 64 64 81 59 1e-14 MTVQEIKELDNAALLAKVEEYKKELFGLRFQQATGSLENTARIRTVRKSIARIKTIIRER ELNQ >gi|223714081|gb|ACDT01000134.1| GENE 13 6297 - 6557 432 86 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755921|ref|ZP_02428048.1| hypothetical protein CLORAM_01441 [Clostridium ramosum DSM 1402] # 1 86 1 86 86 171 100 1e-41 MERNNRKVYTGTVVSTKMDKTITVQVETHVKHKLYGKRMKSTTKFHAHDEENIASVGDTV KIMSTRPLSATKRFRLVEIVKKAETV >gi|223714081|gb|ACDT01000134.1| GENE 14 6592 - 6960 606 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755922|ref|ZP_02428049.1| hypothetical protein CLORAM_01442 [Clostridium ramosum DSM 1402] # 1 122 1 122 122 238 100 9e-62 MIQNESRLKVADNTGAKEVLVIRNLGGSNRKFSNIGDVVVATVKQAAPGGSVKKGEVVRA VIVRSRYGVGRENGSYIKFDDNACVIIKEDKSPKGTRIFGPVARELRDADFMKIVSLAPE VL >gi|223714081|gb|ACDT01000134.1| GENE 15 6975 - 7310 556 111 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755923|ref|ZP_02428050.1| hypothetical protein CLORAM_01443 [Clostridium ramosum DSM 1402] # 1 111 1 111 111 218 100 6e-56 
MKIRVGDTVEVIAGKDKGKQGEVLQVLAKQDKVIVEGVNTVTKHIKPSQADPEGGIVTRE APIHVSNVAFYDSKAKAPVKLGYKIVEKDGKKTKVRVNKKTGAEVDKKKKK >gi|223714081|gb|ACDT01000134.1| GENE 16 7327 - 7866 899 179 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237733240|ref|ZP_04563721.1| 50S ribosomal protein L5 [Mollicutes bacterium D7] # 1 179 1 179 179 350 100 1e-95 MNRLMERYQNDVVKSLVEKFNYSSSMQAPKVEKIVLNIGVGDAVSNSKLLDEAVNELTLI TGQKPVITRAKKSIAGFKLREGAPIGCKVTLRGERMYEFLDKLVNISLPRVRDFRGVSNN SFDGRGNYTLGIKEQLIFPEINFDKVNKLRGMDIVFVTTAKSDEEGRELLAQLGMPFKK >gi|223714081|gb|ACDT01000134.1| GENE 17 7884 - 8069 332 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755925|ref|ZP_02428052.1| hypothetical protein CLORAM_01445 [Clostridium ramosum DSM 1402] # 1 61 1 61 61 132 100 5e-30 MAKTSMKVKQQRPQKYKVREYTRCERCGRPHSVIRKFKLCRICFRELAYKGEIPGVKKAS W >gi|223714081|gb|ACDT01000134.1| GENE 18 8104 - 8499 662 131 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755926|ref|ZP_02428053.1| hypothetical protein CLORAM_01446 [Clostridium ramosum DSM 1402] # 1 131 1 131 131 259 100 3e-68 MVMTDPIADMLTRIRNANRQHHETVMVPASKLKADIAEILKNEGFIKGYKVEGEGPIKNI NITLKYRGNDRVITDLKRISKPGLRVYAKVNEIPKVLNGLGIVILSTSQGLMTDKEARAK QVGGEVLAYIW >gi|223714081|gb|ACDT01000134.1| GENE 19 8530 - 9075 907 181 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755927|ref|ZP_02428054.1| hypothetical protein CLORAM_01447 [Clostridium ramosum DSM 1402] # 1 181 1 181 181 353 100 1e-96 MSRVGNKIINVPEGVTVDIAADNTVTVTGAKGTLVRQFSPVITINKEDNVITVARSSEQK THKQLHGTTRALLANMIEGVHEGFKKQLEIVGIGYRGQMKGNTLVLNIGYSHQVEIEAEE GVTIETPNNTTIVVSGISKERVGQVAAQIREVRKPEPYKGKGIKYSDERIIRKEGKTAGK K >gi|223714081|gb|ACDT01000134.1| GENE 20 9097 - 9453 585 118 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755928|ref|ZP_02428055.1| hypothetical protein CLORAM_01448 [Clostridium ramosum DSM 1402] # 1 118 1 118 118 229 100 2e-59 MSKLQSRNDVRLKRHARVRAKISGTPECPRLNVFRSNAHIHAQVIDDVNGVTLASASSVD MKLENGSNIAAATAVGKAVGEAAIAKNIKKVVFDRGGYVYHGRVKALAEAAREAGLEF >gi|223714081|gb|ACDT01000134.1| GENE 21 9466 - 10029 949 187 aa, 
chain + ## PROTEIN SUPPORTED ## NR: gi|167755929|ref|ZP_02428056.1| hypothetical protein CLORAM_01449 [Clostridium ramosum DSM 1402] # 1 187 1 187 187 370 100 1e-101 MEQRKPRNNGPRNNGPRRGQRNNGPRREEKEFEERVVTINRVTKVVKGGRKFRFAALVVI GDRKGRVGFGTGKANEVPDAIRKASENAKRNVINVPSVHGTIPHEVTGIYGSGRVFIKPA SQGTGIIAGGPVRAVVELAGYSDILSKSLGSRTPINMVRATMEGLASLKTVNQVAELRDK KVEEIFG >gi|223714081|gb|ACDT01000134.1| GENE 22 10043 - 10225 168 60 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227871783|ref|ZP_03990188.1| ribosomal protein L30 [Oribacterium sinus F0268] # 1 59 1 59 60 69 57 6e-11 MAKTVMITLKKSPIGCLPKQRATLKALGLTKVNKTVEKENNEFIQGMIKVVGHLIKVEEK >gi|223714081|gb|ACDT01000134.1| GENE 23 10240 - 10680 743 146 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755931|ref|ZP_02428058.1| hypothetical protein CLORAM_01451 [Clostridium ramosum DSM 1402] # 1 146 1 146 146 290 100 1e-77 MKLHELQYTEGARKTRKRVGRGTSSGTGKTAGRGQKGQGARSGGGKKPGFEGGQTPLFMR LPKRGFTNFNKLEYAIVNLDQLNTFEAGTVVCPKALKEAGLIKKELDGVKVLGNGTLEKA ITVKAHKFSKSALAAIEAAGGKTEVI >gi|223714081|gb|ACDT01000134.1| GENE 24 10682 - 11983 1196 433 aa, chain + ## HITS:1 COG:BS_secY KEGG:ns NR:ns ## COG: BS_secY COG0201 # Protein_GI_number: 16077204 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecY # Organism: Bacillus subtilis # 1 432 1 430 431 333 41.0 5e-91 MFTFLKNVFKKGELRQKVIFTLGILFVYRLGAAITIPSVNTGALSSSDATNGIFGIMNLL GGGTLERFSLFSLGVSPYITSSIIIELLSMDVIPALSQWRKEGNTGKKKKDRVTRYVTLF LAALQGGVLTYAFDNGYKILADSSIWTYIYVVLIMMAGSMLAVWLGDQITNKGIGNGVSL LIFTGIVSNLPSSFISTFNNLVDTSGKAGDMWLGIAWYALFVVVYLSIVIFVVFNEGAVR KIPIIYATSSNTTMRTKDQTHMPIKINSSGVIPVIFASSVLAAPRTVVSFMETSDVTKWI NTIFNYQEPVGFVLYILMIIGFTFFYSNLQIDAEKISDDLKKNGGAIPGVRTGVDTMKHI KTILNRVTVVGSLSLCIIAAIPIITPVIWSQTANASLSLGGTGLIIVTGVALETVKQIKT HITRKEYHGYIRR >gi|223714081|gb|ACDT01000134.1| GENE 25 12007 - 12654 759 215 aa, chain + ## HITS:1 COG:lin2760 KEGG:ns NR:ns ## COG: lin2760 COG0563 # Protein_GI_number: 16801821 # Func_class: F Nucleotide 
transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Listeria innocua # 1 212 1 212 215 236 53.0 3e-62 MNIILMGPPGAGKGTQAANLVKEYGLTHISTGDIFRKALKEQTKYGVIAKYFMQFGHLVP DDYTIQMVREYLQENEFPNGFILDGFPRTIIQARELESIAKEFGFEIDAVINLDIELDRL VPRLSGRRTCKECGASYHIEYNPPKVEGVCDVCGGELYQRPDESEDAVKVRLDTYEKQTS PLIDYYTMKGQITNINGDQSMEDVFKDIKASLEVK >gi|223714081|gb|ACDT01000134.1| GENE 26 12654 - 13400 883 248 aa, chain + ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 1 247 1 247 248 305 61.0 6e-83 MIVTKDRREIELMAEAGRIVALVHEKLKEVIVPGITTKQIDEICEKVIRDNNATPSFLHL YGFPNSVCTSINEVVVHGIPSDRKLEEGDIISVDVGACYKGYHGDSAWTYAVGKISDEAK RLMEVCEASLYAGLEQVKPGNRLSDISHAVQVYLEDHGCTTPLDYTGHGIGTEVHEDPAV PNYGQAGRGPRLKAGMTLAIEPMAHLGGCETEVLQDDWTVVTKDRSLAAHYEHTIVITDD GYEILTKL >gi|223714081|gb|ACDT01000134.1| GENE 27 13421 - 13642 273 73 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 [Mycobacterium tuberculosis H37Rv] # 1 73 1 73 73 109 68 4e-23 MAKKDDVLEVEATVLETLPNAMFKVALENGVEILAHVSGKIRMHYIRILPGDRVTVEISP YDLTRGRITFRHK >gi|223714081|gb|ACDT01000134.1| GENE 28 13674 - 13787 188 37 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|18311362|ref|NP_563296.1| ribosomal protein L36 [Clostridium perfringens str. 
13] # 1 37 1 37 37 77 89 3e-13 MKVRPSVKPMCDKCRVIKRKGRVMVICENPKHKQRQG >gi|223714081|gb|ACDT01000134.1| GENE 29 13814 - 14179 603 121 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755936|ref|ZP_02428063.1| hypothetical protein CLORAM_01456 [Clostridium ramosum DSM 1402] # 1 121 1 121 121 236 100 2e-61 MARIAGVDIPRDKRVVISLTYIYGIGKPTAQEILKKVGINENTRVKDLTEDEIGTIRKEI ESIKVEGDLRREVALNIKRLMEIGSYRGIRHRKGLPVRGQKTKTNARTRKGKAKPIAGKK K >gi|223714081|gb|ACDT01000134.1| GENE 30 14199 - 14591 671 130 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755937|ref|ZP_02428064.1| hypothetical protein CLORAM_01457 [Clostridium ramosum DSM 1402] # 1 130 1 130 130 263 100 3e-69 MAKVKQVRGKKRAKKNIAKGVAHVHSTFNNTIVTITDEHGNVLTWSSAGALGFKGSRKST PFAAQMASEAAAKASMDHGVKSVEVCVKGPGPGREAAVRALQAAGLEVTAINDVTPIPHN GCRPPKRPRG >gi|223714081|gb|ACDT01000134.1| GENE 31 14641 - 15585 1246 314 aa, chain + ## HITS:1 COG:BH0162 KEGG:ns NR:ns ## COG: BH0162 COG0202 # Protein_GI_number: 15612725 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Bacillus halodurans # 1 311 1 311 314 345 58.0 7e-95 MQKFEKANFNIANYDETDNYGKFVIEPLERGFGTTLGNSLRRVLLSSLPGSAVYAIKVQG AIHEFSAVDGVVEDVTSIILNLKKLVFDVDSDESATMIIDVEGPATVTGADIQCPSEVTM ISNDMEIAHVAQGAHLYMELYAKKDRGYVSADQNKKEINTIGIIPTDSIYSPVEKVSYAV EPTRVGESAKYDQLTLEIETNGALKPYEAISLAAKILVEHLNMFVELTDMAVNMEVMSEA QSDTTNKVLDMTIEELDLSVRSYNCLKRAGIQTVQDLAAKSEDDMIKVRNLGKKSLKEVK EKLVELGLGFKPID >gi|223714081|gb|ACDT01000134.1| GENE 32 15618 - 15980 593 120 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167755939|ref|ZP_02428066.1| hypothetical protein CLORAM_01459 [Clostridium ramosum DSM 1402] # 1 120 1 120 120 233 100 3e-60 MAKNRKLGRTSDIRKSMLRSLATEVIMYGKLETTETRAKEVRSVVEELITLGKRGDLHAR RQAAAVLHNVVDAETGKTAVQKLFDDVAPKYSDVNGGYTRILKTYNRKGDNAPMAIIALV >gi|223714081|gb|ACDT01000134.1| GENE 33 16080 - 16910 251 276 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 228 1 
226 305 101 29 1e-20 MDKIIEIKDLSFEYEAGLKTINHISFDINKGDYVAILGHNGSGKSTIAKLLIGLLEKKSG SIIINGYELNLENLYKVRDNIGIVFQNPDNQFIGATVRDDIAFGLENTCVPQSDMDKIIN TYAAKVGMSDFLDHEPTKLSGGQKQRVAIAGILAMAPTIIILDEATSMLDPMGRREINSL VKELNKEKDITIISITHDIEEAKNADQVIMLSAGEVVASGNPKDILSDEANLVKYELDIP FALKVAKGLEKLNIRTSMSLQEEVLITELCQLHLKK >gi|223714081|gb|ACDT01000134.1| GENE 34 16886 - 17737 401 283 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 4 277 131 396 398 159 34 5e-38 MSITFKEVEHIYSENTPFAYHALKGVNLKITDQSFTAIIGQTGSGKSTLIQHINALLLPT SGSINIDEYLITATDKPSKLKPLRKKAGLVFQFPEYQLFEETIEKDIIFGPMNFGVSEEE AKKIAHNVLKTVGLDESYLNKSPFDLSGGQKRRVAIAGILAMNPDILILDEPTAGLDPQG TNEMMSLFKRINESGKTIILVTHDMNHVLQYCDEVVVMNHGKVEKHDTVTNVFKDSEYLN SLGIDLPIITNFIIKLNNQGFNLDSSINNIEQLIAAIGGELNG >gi|223714081|gb|ACDT01000134.1| GENE 35 17730 - 18530 697 266 aa, chain + ## HITS:1 COG:L77627 KEGG:ns NR:ns ## COG: L77627 COG0619 # Protein_GI_number: 15672261 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Lactococcus lactis # 1 247 1 247 266 197 44.0 2e-50 MDNMTFGKYIPVNSLIHQLDPRLKIGALLIFLIAIFFDAGFIGYGIMGVFVLLVAGLSNI SIKHILKAIKPMVFMMLFLMIFNIFLINTGSVLFTIGSLKIYSGALIQSAYIFIRLILII TLTTILTSSTKPLDLTLAIENLLNPFKRFGFPAHELAMMISIALRFIPDLLDEAKRIMKA QASRGVDFNEGTISEKIKAIVSLIIPLFISAFQRAEDLANAMESRNYNPEASRTRYKSLN WQTSDTMAFALTIVVSGTVIIMSFVL >gi|223714081|gb|ACDT01000134.1| GENE 36 18530 - 19267 595 245 aa, chain + ## HITS:1 COG:BS_truA KEGG:ns NR:ns ## COG: BS_truA COG0101 # Protein_GI_number: 16077216 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Bacillus subtilis # 2 243 1 246 247 186 39.0 4e-47 MVRVKCSVSYDGSKFHGFQVQNKLRTVQGEIQKALKNICKEEIIIHGSGRTDAKVHGTKQ VFHFDTTREMPEAQWKRAINHFLPNDIYILDSYFVDENFHSRYSAIKKEYHYLLSTNEYS PFETNYIFQYGRPLDLELMREAAKIFIGEHDFASFCSYDQYGNTIRELYELSITNEKGII 
KFILVGNGFRRYMVRHIVGGLIQVGAKRVTKERLQELLDSKGEQKCLFKAKPQGLYLHEV FYDEN >gi|223714081|gb|ACDT01000134.1| GENE 37 19257 - 20132 901 291 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0869 NR:ns ## KEGG: Lebu_0869 # Name: not_defined # Def: histidinol phosphate phosphatase HisJ family # Organism: L.buccalis # Pathway: Histidine metabolism [PATH:lba00340]; Metabolic pathways [PATH:lba01100]; Biosynthesis of secondary metabolites [PATH:lba01110] # 2 282 3 254 258 160 35.0 5e-38 MKTNFHTHTFRCGHAVGNEEAMVKSAIEEGIEVLGFSEHVPLPRYRKHIFKGMRYTLNHF RSFAVACKAIITNGPAMRMPYKDKQLHLEEVKRLKEKYCDQITIYQGFEAEYFEEYLDYY QGLLNSGEADYLILGNHFNKYTVHTRYYGKMDISDEEIISYKNDLLKALDTNLFSYVAHP DLFMVGKVYFDELCENITREICQKALEKDVPLEVNAGGIRRGYRKVGDEIQYPYPNSHFF DIVGEVGCKVILGIDAHSPDDFNQDDFEALEKFAWRHKLNVINIPEFKKGK >gi|223714081|gb|ACDT01000134.1| GENE 38 20601 - 21800 1267 399 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755945|ref|ZP_02428072.1| ## NR: gi|167755945|ref|ZP_02428072.1| hypothetical protein CLORAM_01465 [Clostridium ramosum DSM 1402] # 1 399 1 399 399 751 100.0 0 MARECIHKYLEEHKSNYQGKYRCHSCVQTKKFEHKFHYYIRDTQFREINVFVTLDYAGPE VKTVFSVDLHEQEEEYITKDALKQIIYFNKYRTILHCHVFQHYINSKNTENMLEPLDYRN ILDYLEYHRGTNQETIDEFYDFFMPYLKRLIRNGNYKRFMDSLNLLLDKIIYEYEWDGTT AKYLDTQYQYHLYYFRIIIRMVFDDLDIFYDQVKEPLLEAIWRLCNSQRFAFAIMTDFGN LVLSHYRVTKAIFNYVDARFQKEGDSNIVIPYLKAIFESDADGYRNAAMDVIRFVMNDML TFANHDLQLAIGNSIVQNEGYDLLINLFSKDYNTFVFVCFPISTFPPEYREPIREELEKA IRFYAGRMEHDEYRLSSFEQVSNINRLLMENYKEYGKNG >gi|223714081|gb|ACDT01000134.1| GENE 39 21897 - 22697 859 266 aa, chain + ## HITS:1 COG:SPy1892 KEGG:ns NR:ns ## COG: SPy1892 COG1073 # Protein_GI_number: 15675706 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Streptococcus pyogenes M1 GAS # 38 263 81 305 308 118 33.0 1e-26 MLYRHHKEDDKPSILETKYDAQSIYIKNHQGIRLRGVLIEAVDAKKTLVILHPFALEAKD MTLYVPFFKERYPDWNILLVDACAHGQSDGYIRGLGIKDVKDLVCWNEYLLKTYGKEHEI ILYGKEAGANTILKAASKHLLKNVKAIISDGAYTSVYDILGYRMIKDYKVPKFPTMRLIK 
RKIKQEIKVNIKEDYAEMVKHNDIPTLYIHMKEDDFVPLSMVYPLYNANRGSKVLFVLKD ERYLYELEETDEFRKTLANFITKYVN >gi|223714081|gb|ACDT01000134.1| GENE 40 22711 - 24930 2109 739 aa, chain + ## HITS:1 COG:BS_ytrP_2 KEGG:ns NR:ns ## COG: BS_ytrP_2 COG2199 # Protein_GI_number: 16080017 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Bacillus subtilis # 566 730 30 189 199 94 35.0 8e-19 MDNKLKIGLLVDDSDNIFTTEIWRGASYAAGKLDVNLVVFFGGFVGSSNIYGNSRYEFQK NTTYKFSKTKDLDLLIISVSSIVHGNNRLKELFVKEFEGVPIVTLNHKFGNNSLVTFDNA KGIEDAVTSLIQEQQCRKIGMIAGPYENKGSDERLAAYKKTLKSHGIVIDEKRIIHTATF ERGYPEYAKKLLDLNPDLDAIVCASDNLAFDAYQVLKERGIRIGEDVQVVGFDDVQEASK VNPPLATIRAEAAQLGFYAVMDGVLSLRGEIPAEFTTLVDVDFIKRDSAAKVGYLEERLY RQIVNYTKSDKDKIVNILMEYIFNNNHSIFIIEARKRVIDLVTFLVELHNDDLFEEKTYT KFSNLIQELIDLDSVEFIDLKRILKVIDVVSQRLSDYRNNLSIIHFNQKILQIIGMHYNT ILSERTRISCEEKRLVNILSRGMLSVRNEYDNYNTIFECLKQLNIGNARLYLYEKPITSY MTQFTVLPNEVILKGSIINGKIIIPDDKIPVKTDRIFSEKRLFSNCNEYVVNSVYSGTTQ YGIFVCDLKYRDFVDLDFVSSQLGTVVNTINLVEKLDNLSKHDELTGLWNRRGFIDKVQG YMNINKGALIFVDLDGLKIINDTYGHEGGDEAIITGAKILKDAFDEFGVIGRIGGDEFAV FLPNQSDFALSEIDQLINDKTIYHNTLIKRNFKVELSYGISLFDQYQDLTVRELLDKADR EMYCHKRNKKGNRRKKDVF >gi|223714081|gb|ACDT01000134.1| GENE 41 24920 - 26368 1883 482 aa, chain + ## HITS:1 COG:lin0297 KEGG:ns NR:ns ## COG: lin0297 COG2723 # Protein_GI_number: 16799374 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 9 482 9 478 478 625 63.0 1e-179 MYFKDSKGFPSDFLWGSASAAYQVEGAWDSDGKGVSNWDKFVRIPGKTFKGTTGDKAVDH YNRYKEDVALMAEMGLKTYRFSIAWTRIYPNGNGEVNEAGLQFYDNLINECLKYGIEPMV TVYHWDMPQALEEQYHGWENRQIVDDYVNYATTLFKRYGDRVKYWITMNEQNIFTSFGWL EGMHPPGKVDDMKMFYQVNHHANMAHAKSVIALKELWPEAKVGASFAYSPSYAIDNKPEN AMAKADYDDLKNYYWMDVYAYGRYPRAALAYLEANGVAPVFEEGDNAIMKEAAQLVDFMG VNYYQTTTVEFNDIDGVGTSHEMNTTGKKGTAKVQGVPGLYKNPQNANLPTTDWDWTIDP MGIRMCCREITSRYDLPIVISENGLGAFDKLEDGEVHDPYRIAYLKAHIEELKKACDDGC RVLAYCTWSYTDLLSWLNGYQKRYGFVYVDREEDENSGTLNRYKKDSYYWYKKVIETNGE 
EL >gi|223714081|gb|ACDT01000134.1| GENE 42 26435 - 27271 835 278 aa, chain + ## HITS:1 COG:PA5438 KEGG:ns NR:ns ## COG: PA5438 COG1737 # Protein_GI_number: 15600631 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pseudomonas aeruginosa # 16 226 2 208 293 77 28.0 3e-14 MNMRRKENTEVDILYNLLTYINASYSQDMYYTICYQVLNNIEKIPDISINELADLCYTSP ATISRFCKALKCDNFAEFKKEVQVGLKQASHEIKLHPDDLVEIHQNPVHCVDMVYDLTIN SLIESKKHINIHEIDRLCDIIYDAKKLHFFGFQFNKILASDIQFKLVKLGKFSYAFADRG DDSQRIELLDEDSVAIVLSVRARLSPVGDLIKSIKNREAKVILVTLNAESEVIELADKTF VVHGKESDFTESSISGTTVMKTFFDVLYVRYGLLYPRR >gi|223714081|gb|ACDT01000134.1| GENE 43 27283 - 28311 1021 342 aa, chain + ## HITS:1 COG:CAC3420 KEGG:ns NR:ns ## COG: CAC3420 COG2008 # Protein_GI_number: 15896661 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Clostridium acetobutylicum # 1 336 1 343 344 353 50.0 2e-97 MYSFTNDYSEGAHPKIMKALIETNDEQCAGYGLDKYCLEATRLLKRQLQNDNVDIHYLVG GTQTNAVFISSVLRPYQAVIAADSGHINVHETGAIEATGHKVLTRPHQDGKLTVAMIESI IEEHSDEHMVQPKMVYISQSTEYGSVYSLDELKDIATLCKKKELYLFVDGARLGSALALP DTPTFKDLADYSDAFYIGGTKMGALFGECLVIVNDSLKADFRYNMKQKGAMMAKGRLLGV QFKELFSNDLYLEIGKYENKMADILRAGLKDFEMMVPSQTNQVFPIMDDELIAKLQQDFS FNIMDRIDDHRHCIRLVTSWATPKEAVNSFVSKLSELLHNNL >gi|223714081|gb|ACDT01000134.1| GENE 44 28379 - 28654 172 91 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 [Haemophilus influenzae PittEE] # 1 91 4 94 96 70 34 2e-11 MNKTELVEAIASSTEMTKTDVDKVVTAFVDVVTEALVKGDKVSLKGFGNFEVRERGERTG RNPRTGETMTIAASKAPAFKSSSALKNAVNK >gi|223714081|gb|ACDT01000134.1| GENE 45 28667 - 29080 228 137 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 7 123 27 146 236 92 40 6e-18 MTFGSFLIGYERQNHLKVAGVRTHLIVCLAAAMIMIVSKYGFNDILSHEGIAPDPSRSAA QIISGVSFLGAGIIFVHKREITGFTAAEIWAAAAVEMAIVAGMYVIGILATILILLVQII FHKQYRWIPTVWESWLI >gi|223714081|gb|ACDT01000134.1| GENE 46 29387 - 30220 934 277 aa, 
chain + ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 33 275 30 288 292 99 29.0 7e-21 MYKYEKVEHDNELPTKLHDFYIEDNIGEPIEKHWHRSIEILIPLYGSFILWINGNKVKII AGNIYIINSQDIHAIQAIEGERIYKGYALQIKYDYLRESYHDIDKIYFQQPNQQINKLLM AKIIDIINFYESDDPYNNIRVKSHIQMLVFLLLDNLSKKRIGYLEVKDSKYKDRITKIIK YLENNYQEDLSVKMIADEFGLSEGYLSKLFKESLGATVKEYLSRVRLWHAEEQLVETDYP VIDIAIGNGFPNVKSFNQAFKRKNVITPAKYREKMRK >gi|223714081|gb|ACDT01000134.1| GENE 47 30300 - 32963 2999 887 aa, chain + ## HITS:1 COG:TM0033 KEGG:ns NR:ns ## COG: TM0033 COG4099 # Protein_GI_number: 15642808 # Func_class: R General function prediction only # Function: Predicted peptidase # Organism: Thermotoga maritima # 44 399 34 385 395 100 28.0 1e-20 MFKKILSCLMSAMMVLGLTILPTHALEGTATGTQKAFVVGDDWGAGVTKSIITFDKKIKA DSISKTDFLVKETATNKVSDRTILDAYASDAQGNKVTTDSNIVTVEMYISPSEGNPISWN NLTWKNAWADPYQLEVTLAAGETLVSENETITTIDVDPTINVAEDGKICPQLDGFKMSEF TNDGTTVSYALYTPENDSHKNALVIWNHGVGETGTDVQIDLLGNEVTALAGDEFQNTMDG AYVLVPQRRSYDSTSNAEAIYQLILKTLRENPDIDQDRIIIGGCSAGGAMTMTMIIAHPE LYAAAYPICPATQSANVSDETIESLKDLPIWFTHAKNDSTVAIATTTEPLVERLRAAGAE VHTSIFDDVHDTTGRFSNEDGTPYQYDGHWSWTYFDNNECYDENGVNLWQWMSVQTKADK VIASGSQKAYIIGDDWGPAVTKTVISLDKAIDADSVVAENFKVVEEKEATINWGTGEIGI ATADRNVTAAYTSDAQGNKVTGSSKYITIEMYVSPSEGSPFIYNLKTGFNSWCTPYKLNV SLAKGATMSSGDEVVTDLNIKADIDVAGDGKICPQGEVFEMKAYTAKDGTTYSYADYTPA KDDKKNALVIWLHGAGEGGTDPYIDILGNEVTSLVSKEFQSLFEGAYVLAPQSPTMWMDD GTGAYQNGDKGSMYAESLFEMIDAYVKANDDIDPNRVIIGGCSNGGYMTMEMVLKHPTYF AAAFPICEAFQDQYITDDQINAIKDMPIWFTYAKNDGTVDPTLCVEPTVARLLAAGANNI HVSVFDDVHDTTGRFFNEDGTPYQYNGHWSWIYFDNNECYDENGVNAWQWLAKQIKTAAP VETPDQPTTPDQPANSVKTGDDVNFAGLGAIMMLTLAGIYVSRRKYN >gi|223714081|gb|ACDT01000134.1| GENE 48 32988 - 33620 730 210 aa, chain - ## HITS:1 COG:no KEGG:Apre_1054 NR:ns ## KEGG: Apre_1054 # Name: not_defined # Def: hypothetical protein # Organism: A.prevotii # Pathway: not_defined # 9 201 2 200 201 123 
37.0 4e-27 MKILDTLGKTKKAIGGIIAILLVALLIFYAGTRFASSSEPKISSTGLSQQLQEIEELATM SYNYTKVGKFSNNLTFNGWDIPLTQKSFLITYDGKLKAGVKMDKIEVAINNNIITVSIPE IEILSNEIDESSIEVYDETKNVFNPISVNDYTTFAKKQKEAVAEEAIENGLLSEAATKTQ STIKKYLNAIPGIDGNYEIKVKFLETKKES >gi|223714081|gb|ACDT01000134.1| GENE 49 33691 - 34263 861 190 aa, chain - ## HITS:1 COG:TM0375 KEGG:ns NR:ns ## COG: TM0375 COG4869 # Protein_GI_number: 15643143 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Propanediol utilization protein # Organism: Thermotoga maritima # 5 178 23 197 210 192 57.0 2e-49 MTKKIIVETSARHIHLSDADLETLFGKGYQLTNKKNLSQPGQFACEEKVTVVGAKGEQKM SVLGPTRKATQVELSLTDARALGLQAAIRESGDIEGTAGCKLVGPAGEVEISCGVIAAKR HIHMTPADAVEFGVTDKQIVKVEIETEGRSLIFGDVVCRVSENYATAMHIDTDESNAAGC GREVYGTIIK >gi|223714081|gb|ACDT01000134.1| GENE 50 34566 - 39077 4674 1503 aa, chain + ## HITS:1 COG:sll1502_2 KEGG:ns NR:ns ## COG: sll1502_2 COG0069 # Protein_GI_number: 16329610 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Synechocystis # 383 1179 1 801 801 952 57.0 0 MNEKKNLGLYDKSFEHDNCGIGAVVNIKGVKTHITVSNALKIVEQLEHRAGKDAAGETGD GVGILTQVPYSYFKKTIKEFPLPLEGKFGVGMLFMPRDEHLRNRVKKMLETIVIKEGMEF LGWREVPTVPKVLGQKALDAMPSIWQCFIKKPHGINKGIEFDRILYQTRRIFEQSNFDDT YVCSFSSRTIVYKGMFLVGQLRSFFKDLQDDEYKSAIALVHSRFSTNTMPSWQRAHPNRF IVHNGEINTIKGNANNMLCREEIMTTNLLDTNKVYPVVDINGSDSAMLDNTLEFLVMSGM ELPRAVMMCIPEPYDNDKSMSQSKKDFYEYNSTLMEPWDGPASILFSDGEVMGAVLDRNG LRPSRYYITRDDYLILSSEVGALDIPAEDIVIKDRLHPGKMLLVDTKAGKLIADDDLKES YAHKQPYGEWLESNLIKLKDMRIPNARVERYQGEQLTKLQKIYGYSYEDVNTYIKKLALN GSEEVVAMGNDSPLAVLSMQHPPLFNYFKQLFAQVTNPPIDAIREEIITSTALCIGTEGN ILADNEKNCRLLKIDHPILSNTDLLKIKNIKKDGFKVAVLSMLYYKNTSLEKALDKLFVE ADKAYHEGANILILSDRGVDENHVPIPSLLAVAAMNEHLVNTKKRMSVALILESGEPREV HHYATLIGYGASAVNGYLAQDTIFELVDEGLLDKDYYAAIEDYNDGILHGIVKIASKMGI STIQSYRGSQNFEAIGLAKDFVAKYFPKTVTRIEGKTIKDIENDVDYRHSKVYDPLGLDV DITLDSRGDHKERSGKEEHLYNPATIHKLQLATRNGDYKLFKEYSAMIDEEGKNLNLRGL 
LQFKKGKSIPLDEVESVDKIVQRFKTGAMSYGSISKEAHETMAIAMNRLKGKSNSGEGGE DPERFILDENGDSRCSAIKQVASGRFGVTSQYLCSAKEIQIKMAQGAKPGEGGQLPAGKV YPWVAKTRHSTPGVGLISPPPHHDIYSIEDLAQLIYDLKNANKDARISVKLVSEAGVGTV AAGVAKAGAGVILISGYDGGTGAAPRNSVYNAGLPWELGLAEAHQTLIMNDLRGRVVLET DGKLMTGRDLAIATLLGAEEYGFATAPLVTMGCVMMRVCNLDTCPVGVATQNPILRKRFT GKPEYVENFMRFVAQEFREYMAELGFKTIDEMVGRSDLLEVRPGVENVDLSRVINNPYIK AKEIRHNPKNNYDFKLEEVKDTTILLKEFKNALENHQSHEINVDITNIDRTLGTLFGSEI TKRYQDTLDDDTFKVNCYGSGGQSFGAFIPNGLTLTLHGDSNDYFGKGLSGGKLVVVPPE NSNFKAAENIIIGNVALYGATSGEAYINGIAGERFAVRNSGAKAVVEGVGDHGLEYMTGG LVVVLGETGRNFAAGMSGGIAYVYDPNNRLYSRINKELVSYSSVTSKYDEEELRAVIQKH YDHTNSEVAKDILDNFGEQVSMFKKVVPHDYRHMIELIKYYEKQGLTNEQAKVEAFNEAR KGV >gi|223714081|gb|ACDT01000134.1| GENE 51 39080 - 40561 1718 493 aa, chain + ## HITS:1 COG:sll1027 KEGG:ns NR:ns ## COG: sll1027 COG0493 # Protein_GI_number: 16329369 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Synechocystis # 1 493 1 493 494 520 51.0 1e-147 MGKPTGFLEIERQVSKAVEPKKRIQNFNEFHKHLSKDDQACQGARCMDCGVPFCQSGREL EGAVSGCPLNNLIPEWNDLIYHNNYKQALARLLKTNNFPEFTARVCPALCEAACTCGLNG EPVTVKENEYGIIEDAYAKGLIKPCPPTHRIDKKIAIVGSGPAGLTAADQLNKRGYQVTV YERDDRIGGLLMYGIPNMKLEKEVIDCRVGIMKAEGVVFKTNCGIEDEKQAKELLDNYDR VILACGSRKPRDIDVPGRQGRGILMAVDYLTAITKSLLNSNLKDKNFAPTKDKHVLVIGG GDTGNDCVGSAIRLGCKSVTQLEMMPELPAIRSDNNPWPEWPRIKKTDYGQEESIAVFNQ DPRIYQTTVKEFILDKNGQVKEAVIVSLEPKENKETKRVEMVPVAGSEQTIPADLVLISA GFIGTEDHVARAFNVTLNPRGNVKSNQDYQTNIKKLFVAGDMRRGQSLVVWAIKEGREVA RAVDFDLMGYSNL >gi|223714081|gb|ACDT01000134.1| GENE 52 40617 - 41129 619 170 aa, chain - ## HITS:1 COG:CAC0845 KEGG:ns NR:ns ## COG: CAC0845 COG1528 # Protein_GI_number: 15894132 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Clostridium acetobutylicum # 12 170 12 170 170 184 58.0 8e-47 MLDKKVVELLNDQVNKEFYSAYLYLDFANYYKDNGLDGFANWYNIQAQEERDHAILFVQY 
LQNNNAKVTLEAIDKPDKEYTKLNDPLIYGLEHEEYVTSLIHNLYDAAYSLKDFRTMQFL DWFVKEQGEEETNANDLITKFNLFGSDSRSLYLLDSELAARVYNAPSLVL >gi|223714081|gb|ACDT01000134.1| GENE 53 41201 - 41731 288 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733277|ref|ZP_04563758.1| ## NR: gi|237733277|ref|ZP_04563758.1| predicted protein [Mollicutes bacterium D7] # 1 176 1 176 176 237 100.0 2e-61 MLKKKNNLICMIMIITALISGISTVILDYNLNSNSSAVTSLSTNSNNLDGDNPFGNNDSS TNNSTNPFGNSDSSIFPTTANNQNSKYRISILATCSIIICALIASLAIIYLIMSKLSQHQ VFINNDKKWIYGLSSCLLTILISFICINATKNLDSKTNNPTSIPNGSEQSNNNDIV >gi|223714081|gb|ACDT01000134.1| GENE 54 41904 - 42731 840 275 aa, chain + ## HITS:1 COG:SA1967 KEGG:ns NR:ns ## COG: SA1967 COG1624 # Protein_GI_number: 15927744 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 11 270 4 262 269 199 42.0 6e-51 MKVGDLMPLISAFTFNISSILNVGRTIIDLILVWYVIYLLISMMKQNMRTMQLFKGVLLI LILKMFTSLLRLSAMDYLVDTILTWGVVAIIIVFQPEIRGLLEKIGRTKLELKHDNLSDD EKERLMDELVGAITKLSEDQTGALITFERRQSLIDYINTGTKINADIKAELFTTIFWEGT PLHDGATIIKGDRVVCAAAFYPPTNQELSPLYGARHRAALGISEITDSLTVVVSEETGTI SFATDGKLRKIPRKELRASLVNELDWFNTQEKDGE >gi|223714081|gb|ACDT01000134.1| GENE 55 42733 - 44148 1255 471 aa, chain + ## HITS:1 COG:BS_ybbR KEGG:ns NR:ns ## COG: BS_ybbR COG4856 # Protein_GI_number: 16077244 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 110 448 52 391 483 131 26.0 3e-30 MSKKENPKNPIKKDSTTDFFLNFITRQKSEQKDEEKTTTSKVVDDTVKVRDKAFELFYHM INSLNIFLDRMLQSNLSMKVLSFVMAVVLLFTITGGIDNIFSTPNGGDYLYDVKIDTEGL QSDYDVVGLPETVNVALVGPSLDIYSTKISKKYKVVADFSTLGEGEHTIELQGKDFPSDL QVMIVPQTVTVKITQKVTKTFELGYKFSNEDSMDSKYSVSVESMEHHEVEVRGSQDNIDK INSVKAVIDLKGKNNDFEQNAKIYAYDRSGKKVDVEIIPNTVKVDCMVSSYSKEVSIVPQ YTGQLASGYGFESIKLKQEKVTIYGKEELLNSINSVGVVIDLSGLSGDKSYSKLPLTGIE NINKLDFNTVDASVRVSPSTKRIITDIPINIVNNNGGYQVNFAEGQDKASVEVDGVAAIL DALTINDFNISIDLANLKAGTNTVKVDLKIDKGYLTGKLVSPERITITLRK >gi|223714081|gb|ACDT01000134.1| GENE 56 44164 - 45513 1736 
449 aa, chain + ## HITS:1 COG:BS_ybbT KEGG:ns NR:ns ## COG: BS_ybbT COG1109 # Protein_GI_number: 16077245 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Bacillus subtilis # 1 443 1 444 448 409 48.0 1e-114 MGKYFGTDGFRGEANVDLTVEHAYKVGRYLGWYYSQNGKAKVVIGKDTRRSSYMFEYALV SGLTASGADVYLLHVTTTPSVSYVVTSEEFDCGIMISASHNPYYDNGIKILDGNGHKMDA GVENLIEQYIDGLIDELPYATKDEIGCALDYSIGRNRYIGYLMSIPTRAFRNYRVGLDCA NGASSAIAKSVFDALGAKTYVINSDPDGLNINTNCGSTHIEVLQQYVKENALDIGFAYDG DADRCICVDEFGRVVDGDLILYVCGKYLKDHGELANDTVVTTIMSNLGLYKAFDKEGIKY EKTAVGDKYVNENMVKNGHVLGGEQSGHIIFSKHATTGDGILTSLKVMEAVIESKRTIAQ LIEPVTIYPQLMKNVPVRDKKEAQEDPDVRAVIAEVETDLGENGRVLVRESGTEPVVRVM VEADTDEKCLINVNKIVDKMKEKGFVIKR >gi|223714081|gb|ACDT01000134.1| GENE 57 45588 - 46241 854 217 aa, chain + ## HITS:1 COG:BS_yvqA KEGG:ns NR:ns ## COG: BS_yvqA COG0745 # Protein_GI_number: 16080354 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 1 216 1 222 225 195 46.0 5e-50 MQFRINIIDDEKNLNDLVRTYLEKEGYIVYSFYTYDEALMHKDDDVHLWLIDIMLDRASG FELFNEIKASRPKMPIIFMSARDQEFDRIIGLEKGSDDYITKPFNIKEVILRINNLIKRS YDDPSKIKMDGYDIDLEKRRVFNGGEEVILTTKEYDLLVYFITNKGLAISREQVLNKVWD ENYYGSDRVVDDTLRRLRKKMPEINVRTIYGFGYRLD >gi|223714081|gb|ACDT01000134.1| GENE 58 46325 - 47578 1252 417 aa, chain + ## HITS:1 COG:BS_yvqB KEGG:ns NR:ns ## COG: BS_yvqB COG0642 # Protein_GI_number: 16080355 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 93 412 112 447 451 168 32.0 2e-41 MLVILILVPLINKNLNALIDEEMFKTLDTSQSAYIDFDYSPIAKSSDKQIYHMTYDKNTN YLFPPSNLTRDKVLALYPVFADKLNEMLKGNKDKIQAKGTLDGDTLYFQITKKDSDSYII SLVYSDYSASLISSIRQQIINILYVSFAVIGAIIFIWVSGLIKPLKLIRNYIEDIRKDKQ SELKIDRGDEIGFVSDELVAMKEEIDKQSKIKEEMIHNISHDLKTPIALIKSYSQSVKDD IYPYGDKNSSMDIIIENAERLDGKVKSLLYLNRLDFISGENSDSEVDMHELIEHIVIQLQ 
GMHPEIEIETDLAFVSFKGDEECWRICVENIVDNAYRYVDKKIKIILKNDYLEIYNDGEP IDNDNIEALFQPYEKGTKGQFGLGLSIVHKTCTMYGYNVTAINQETGVSFIIEKKYN >gi|223714081|gb|ACDT01000134.1| GENE 59 47643 - 48524 785 293 aa, chain + ## HITS:1 COG:CAC3409 KEGG:ns NR:ns ## COG: CAC3409 COG0583 # Protein_GI_number: 15896650 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 4 259 26 286 319 106 25.0 5e-23 MLDKKIEFFIATVETGSFSGAARKLMLSQSAVSQQINLLEEELAVKLFDRSNYRPRLTRA GEFYFKKCQELVAIYEKMNHEVKLIDSYSIRIGITGPFENHHIPFLVKLFKEKYPQIEIT IIKGSFQSCKEWLNNNQIDVAFAIENDFINEPKITYEVLLQHQICAICSYEHPWAKLTKI AAQNLINQPLICLSRKFGEGFYHDFVEAFNKDKIEPNIIKEVDTLDELILSVKLNEGIGL TSREVVNEEEVAILDIINSHHHANYVIGYARDINNQFIFDFVALAKRYFNKTL >gi|223714081|gb|ACDT01000134.1| GENE 60 48615 - 49550 1248 311 aa, chain + ## HITS:1 COG:BH0305 KEGG:ns NR:ns ## COG: BH0305 COG0702 # Protein_GI_number: 15612868 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 3 290 2 271 284 106 31.0 5e-23 MSKVIVTGVDGNFGGYVARNITQLKEKEDLIFTCPFEEGLKEFKDTGIDCRVANFNHPDE QLVEAFKGGDTILIISAPFVGAKRQAAHKNAIDAAIKAGVKKVVYTSLVNARDVENPSIE KIDHAWTEEYIESTELDYIFLRNSQYAEAMITSYLTSNGNLLSCQGDGKMAYISRLDCAM AAMYALTKDDMHKQVLNINGPELLTLHEFAAIGNRETGMDVKVVDVSEEEVYAGFDAIGV PRTTDGAFKDGSPAPYSSDGMVTFARAIRIGKMDNFTDDFEKLTGVKPRTVAYMFANNSE YGVGARNSTDD >gi|223714081|gb|ACDT01000134.1| GENE 61 49713 - 50186 459 157 aa, chain + ## HITS:1 COG:SP1384 KEGG:ns NR:ns ## COG: SP1384 COG4894 # Protein_GI_number: 15901238 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 3 149 8 155 165 84 33.0 7e-17 MKLLFKQRFFSWLDSYDIYDEMGNTVYTVKGELSWGHKLQIYNADGLPVGTIKEEVFTFL PKFAMYLNDEYIGQIKKELTLFKPSFILDYNDWKISGNFWEWDYEIIDSRGIVIGDINKE LFNFTDTYSLTISDPRNAIYVLMIALAIDAQKCSNNS >gi|223714081|gb|ACDT01000134.1| GENE 62 50275 - 51636 1410 453 aa, chain + ## HITS:1 
COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 17 430 16 428 448 266 35.0 9e-71 MGKDENNFARGSIPRHIINLAGPMIVAQLINVLYNVIDRIYIGRIPDVATLAMGGLGLCL PLISIIIAFANLFGMGGAPLCSIARGRGDIEEAEEIMGNSFMLLIIFGIVLTVIGLIFKE DLLWLFGASEHTIGYANDYMTIYLLGTVFVLIGLGMNSFINSQGFAKIGMMTVLLGAIVN IILDPIFIFGLDLGVKGAAFATVISQFISALWTLRFLTGKKTILKIKKQYLSLKVKYVTK IISLGMAGFMMAITNSIVTIVCNATLQQYGGDLYIAIMTIINSIREVASLPGQGMANACQ PVLGFNYGAKEYARVLQGIKFVTLTALSMMLIVWLAITVFPELFIKIFSHNQEIITHGVS ALRLYFFGFFMMSFQMTGQAAAVGLGKSKQAVFFSIFRKVIIVAPLTVILPIYIGIDGVF IAEAISNFIGGGACYITMWFTIAKKLKSGIISE >gi|223714081|gb|ACDT01000134.1| GENE 63 51764 - 53173 1611 469 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 469 1 472 473 586 59.0 1e-167 MKLKENFLWGGATAANQYEGGYNEGGRGLSINDVEKGAKHGVPREIHEYIHENTYYPSHV ATDFYHHYQEDIKLFAQMGFKCFRMSISWSRIFPRGDEEQPNEEGLAFYDKVFDELLKYG IEPVVTLSHYETPLTLVNEYGSWRSRKLVDFFEHYCKTVFSRYKDKVKYWMTFNEINGCL EVARPWHQAGIVYRDDEDHYQTILQASHHMFVASAKAVIAGHEINPNFKIGCMLIYPTTY AATCNPEDQIMMRNKMLNTFYYGDVHVRGRYTNTCTSLLHKHGVEIKMEPGDEELLAKGK VDYIGFSYYFSAVEGNDAEEIEGNVVKGGRNPYLKMTDWGWQIDPLGLRTALNELYDRYQ LPLFIVENGMGAVDTVEVDGSINDAYRIDYIKEHIKAFKDAVEIDEVDLMGYTPWGCIDL VAASTGEMRKRYGFIYVDKDDEGNGTLARKPKKSFYWYQNVIKTNGEEL >gi|223714081|gb|ACDT01000134.1| GENE 64 53237 - 53740 296 167 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755972|ref|ZP_02428099.1| ## NR: gi|167755972|ref|ZP_02428099.1| hypothetical protein CLORAM_01492 [Clostridium ramosum DSM 1402] # 1 167 1 167 167 239 100.0 3e-62 MENYCRKMAWLLYNRGLVDKENVEDLRFLLELILTQVITIISVYIIGLLFMDALSILIIC ACFVAGRKYLDGYHADTFRNCYLLTMLNFIFCIFMSRIVAYVEIFYFIGIGISLYLLFKY RYKKIIIFNLFYILIIVLFNTGIVYQINMVNYFVIVLMRVIRGEELK 
>gi|223714081|gb|ACDT01000134.1| GENE 65 53737 - 54402 668 221 aa, chain + ## HITS:1 COG:no KEGG:THA_612 NR:ns ## KEGG: THA_612 # Name: not_defined # Def: autolysin response regulator # Organism: T.africanus # Pathway: not_defined # 1 202 1 210 227 68 30.0 1e-10 MISIGIIDDEQVYLDKIKDILITNFDDIRVYSYNSASKIDNDLDFILLDIDMPDIDGIIF SKQHRNYRIVFVTNYDTRIKEAFGPNVYGYVSKGNLEEELVEKVKEVIEIIKCDYYVTFK VNGIDINIRIDDIIYCQYLGNHIVSIVYHDKKININNSSLKKVKEVLNEYFIEISQDIII NKHRIINFEDRYVYLDGINSKFEVSVRKRKLVRKSFYETFR >gi|223714081|gb|ACDT01000134.1| GENE 66 54399 - 55634 747 411 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0975 NR:ns ## KEGG: Cphy_0975 # Name: not_defined # Def: signal transduction histidine kinase regulating citrate/malate metabolism # Organism: C.phytofermentans # Pathway: not_defined # 203 405 19 232 242 79 30.0 2e-13 MIEVISEYFYCLIYVIALTFTASYLLEGKVKFKRVVIMAPLYAIVNYLVAGATSNNFWEY MIVNSIIIVIDFCYICILFKKTNVIFYVSLFQLFYIFTVNSLIHVINIIIGFNREVTLSF SIQRLIMVFLINILTVIFIVVLDKLKIIPTQKVINRQVKLFIMLDILVYYAMVIIYNIGV VRLVNLITVLVLILLFLWIMFLKILSMYVETTIRNEELIMEDISNKYISKYLDFYNQESD NLRKLKHDLKNHQLVLESLDKKNQYTQYIDEVFKGIGQVTYIESGNIYIDACLYAKQQEY PEIIFDFDISVAGLVFNEKDLTSLIFNLIDNACNEALKHNKLVSVMIRYTNNLLIIRIKN TCPIKPNFSTDKGEGHGYGLKIIKNIVNKYHGDLFIDYHDDQVIFNIKINT >gi|223714081|gb|ACDT01000134.1| GENE 67 57022 - 57414 291 130 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2353 NR:ns ## KEGG: EUBREC_2353 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 130 1 131 131 99 40.0 3e-20 MKEKFYRFMQGRYGIDQLNSFLMIVCVICFIVNMFIGSIVLTFIAYGTWLFVIFRMFSKN IYARNRENDKYLNFFSPLSRWLKLKLMSKQDPSNKYFSCPKCKQMVRVPKGHGTVVVTCP NCQNKFEKRT >gi|223714081|gb|ACDT01000134.1| GENE 68 57417 - 58250 1007 277 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_02032 NR:ns ## KEGG: EUBELI_02032 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 272 14 285 301 265 48.0 2e-69 MFGYVVINKPELKFKEFDVYQSYYCGLCKTLKEKYGSRAQISLNFDLNFISILLSGLYEP 
ETKIGESRCVMHPLSKHTTRYNECVDYAAKMTIVLTYFKCEDDWLDERKISKQAYKRLLR KAYQEIKEEYPDKLAHIETCLHCINDYESQGITNLDEISKYFGEVMGEICAYKDDEWHDE LYEFGFYLGKFIYFIDAYEDIEQDLKKGTYNPFKELYQTDQFEDKCKDILELMISEATMA FERLPIIENAAIIRNILYGGVWNKYELVRQKRLEGRK >gi|223714081|gb|ACDT01000134.1| GENE 69 58250 - 58882 743 210 aa, chain + ## HITS:1 COG:CAC0648 KEGG:ns NR:ns ## COG: CAC0648 COG2214 # Protein_GI_number: 15893936 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone # Organism: Clostridium acetobutylicum # 2 207 3 194 195 77 28.0 1e-14 MNPYQILGISSDATDDEVKKAYRTLSKKYHPDANINNPNQAAYTEKFKEVQTAYKTIMDD RKRGFTNRTYGNTQGAGYNGQSQSSYQYNGNDQAAFNEAAGFINARRFQEALNILEQIKT RNAMWFYYSAVAMNGIGNNITAVEYAQTAAQMEPGNLQYLLFLQQLKGGQRQYQTGQQTY GSPFGSTMQCCYSIMLLNCMMNCCCGGRMC >gi|223714081|gb|ACDT01000134.1| GENE 70 58898 - 60658 2142 586 aa, chain + ## HITS:1 COG:CAC1282 KEGG:ns NR:ns ## COG: CAC1282 COG0443 # Protein_GI_number: 15894564 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Clostridium acetobutylicum # 1 558 1 554 615 605 59.0 1e-173 MSKVIGIDLGTTNSCMAYMEDGKGIIIPNAEGNKTTPSVVAFTNDGKRLVGENAKRQAVT NPLNTVASIKREMGTAYRKNINGKDYSPQEISAMILQKLKIDAEAYLHEPVTEAVITVPA YFNDAQRQATKDAGRIAGLNVKRIINEPTAAALAYGLENSYGQKIMVFDLGGGTFDVSII EIGNGVIEVLSTSGDNHLGGDDFNEIIAKYIVDEFRRIEGIDLSHDQVAMQRINEAGEKA KIELSTMITTIVNLPFITNTSSGPKNLEVEISRAKFNELTKGLVDRVVIPMQNALNDARL TPQELNKIILVGGSSRIPAVQDKIKEITGKEASKSLNPDECVAQGASIQGGKLSGDIAAT GNFDVLLLDVTPLTLSIETMGGVATPLIERNTTVPTSHSQIFTTSANFQTQVEINVLQGE RPLAKDNKSLGKFKLKKIKRAMRGVPQIEVTFDIDANGIVNVSAKDLASGNMQSITIENT SNMSDDDIERAIHEAKQYEQQDAVTKERLVFKNEVENLILTVEAALSKNGKQIDKAMKSN IKNELASFKKLTKKVDFETCDQIQFDEIKTAKLNLETVAASLLSMQ >gi|223714081|gb|ACDT01000134.1| GENE 71 60696 - 61244 672 182 aa, chain + ## HITS:1 COG:BS_ytiB KEGG:ns NR:ns ## COG: BS_ytiB COG0288 # Protein_GI_number: 16080121 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: 
Bacillus subtilis # 1 179 3 181 187 219 50.0 3e-57 MIDEILAYNRAFVTNKGYKPYTTSKYPDRKLAIVTCMDTRLIELLPAALGIKNGDAKIIK NAGGVIVHPFGSAVRSLLIAIYELNVEEIMIIGHTDCGVGSIDIEAMLKKMEKRGISETV IRDLGYCGIDFNKWLGGFDDVEISVKESVSLLRRHPLLPKNIEIHGLVMDSRTGALSLVE KN >gi|223714081|gb|ACDT01000134.1| GENE 72 61456 - 62367 1085 303 aa, chain + ## HITS:1 COG:FN0008 KEGG:ns NR:ns ## COG: FN0008 COG0379 # Protein_GI_number: 19703360 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Fusobacterium nucleatum # 6 302 3 296 298 286 51.0 4e-77 MNDIIEEIQKLKKEKNAVILAHYYVDGAIQDIADYLGDSYYLSKIAVDCPQDVIVFAGVE FMGESAKLLSPNKTVLMPDKDADCPMAHMVDEYFIARLRKEYEDLCVVCYINSTAEIKTM SDVCVTSSNAKRIVEALPNKNILFIPDQNLGKHIAELVPSKNFLFCNGYCPTHYRITPED ILTAKEKHPGAPVLIHPECKPECVELADYAGSTSGIIDYATNDREHDEFIVVTEIGVIHE MEKRNPGKKFYPATDKLVCPNMKKNTLEKVRDCLKNNTNQVELDEEFMEKAKKPLERMHE LAK >gi|223714081|gb|ACDT01000134.1| GENE 73 62381 - 63682 1435 433 aa, chain + ## HITS:1 COG:CAC1024 KEGG:ns NR:ns ## COG: CAC1024 COG0029 # Protein_GI_number: 15894311 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Clostridium acetobutylicum # 1 425 1 426 434 389 49.0 1e-108 MNNNYDVIIVGTGAAGLFAGLCLPADLKVLMITKDKVENSDSYLAQGGICTLKSADDFDA FYQDTLKAGRNENNPESVKIMIQQAPQIMKDLMDYGVEFDRDTEGNLAYTREGAHSEYRI LHHQDVTGKEITSKLIKQVEMRNNITMVEDATMLDIINHDNIATGIVMENNGEIIQINAK VIILATGGIGGLFTHSSNFRHITGDSFAIALRNNIELENINYIQIHPTTLYTTKPGRSFL ISESVRGEGAHLLNPDMERFVDELLPRDVVSNAIKKEMDKYNVPHVYLSVTHMDPKQIKR RFPHIYDQCLAEGYDMFMDPVPVVPAQHYLMGGIKSDTYGQTSMNNLFAVGETACNGVHG ANRLASNSLLESLVFAKRAAKVIADGIDKISLYEKVVDLDGYDKEKLKKENKEIIMNEIK RKDGEFYDKWCKS >gi|223714081|gb|ACDT01000134.1| GENE 74 63660 - 64514 517 284 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 1 279 1 283 286 203 39 2e-51 MINGVNLKVNIDDYILSALKEDITSDDVTTNAVMPEAKLGHVDLICKQDGIVCGLEVFER TFKLLDENCKFTTKYKDGDVIKNGDYIGEVIGDIRVLLSGERVALNYLQRMSGIATYTSE 
SVAILKGSKTKLLDTRKTTPNNRIFEKYAVKIGGGTNHRYNLSDGVLIKDNHIGAAGSIA KAVAMARDYAPFVRKIEVEVETLEQLQEALEAKADIIMLDNMDNATMKEAVAMIGDKAQS ECSGNVTKERLMEIAEIGVDFVSSGAITYAAPILDFSMKNLRPY >gi|223714081|gb|ACDT01000134.1| GENE 75 64652 - 66019 1190 455 aa, chain + ## HITS:1 COG:CAC2596_1 KEGG:ns NR:ns ## COG: CAC2596_1 COG0665 # Protein_GI_number: 15895855 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Clostridium acetobutylicum # 4 362 2 390 399 158 30.0 2e-38 MENESYWQHTITLPKYPQINKNCNYDVVIVGGGISGVSLAYRLNNSNLKVALFESDTLGS KTTGHTTAKVTYLHGAVYADIYQVYGRSKAKQYLESNYEAYQDIKKIIEMEQIECDFKEN IAYVGASDTTNAKKLDCQIRLFKSWGFEVLENRLKNCQISMGLKQQAIFHPLKYLKGLLA QCKHIDIYEHSLVTGSMHQDGLVCLEVNGFKVQARQVVWMTRYPPNLQHGYFFRIIQEKE HVIFQEGQSNGNSILDLSTNYSKRYLDEQHILMIKRIDDLNQVYWYAQDGKPLRKIPYIG KMNGQEYVAYAYNKWGMTLSHVASKLIYDLIINDESKYASLYQPSYGNYLKSGADMLKLV KNNYHGMIKNRLVSSKKLKLKKQQGKVIRHQGRLLAVYKDAQERIFYFSPYCPHLKCVVQ YNEIDNTWNCPCHGSIFDCHGKLVSGPATKDLKQY >gi|223714081|gb|ACDT01000134.1| GENE 76 66033 - 66656 504 207 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755984|ref|ZP_02428111.1| ## NR: gi|167755984|ref|ZP_02428111.1| hypothetical protein CLORAM_01504 [Clostridium ramosum DSM 1402] # 1 207 1 207 207 359 100.0 5e-98 MMEPIYRCGYQIPSLIKLVFKLICFITIPFLITMMVKNLIVTNGIAVDHSFSFIIYPISL LITLGFVYSYRNSSIYLFYNNFMLYKEKLNLHDRMKLFFMKDDLNELENYLSKRNYANLE ELNIIVRKVSFMRTNDIGIEIKFKLIFNDQKNFIITVTEDTREAYIQFLEPFIKYHVKIN DPSNLIDGLKQPLRLTEYFIRNKSMED >gi|223714081|gb|ACDT01000134.1| GENE 77 66637 - 67287 563 216 aa, chain - ## HITS:1 COG:SPy0233 KEGG:ns NR:ns ## COG: SPy0233 COG1636 # Protein_GI_number: 15674419 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 2 210 16 231 255 181 41.0 7e-46 MKINYDLKFKEELTKIKNTRPRLLLHVCCGPCSGNVIRELADIFEITIYYSNSNIYPNAE YHRRYQELLGFIKQFNQDYHQNITVIEQPYLPKEYLNKLSRYKDEPEGGKRCYLCYQKRM NDAFQYSCEHNFDYWTTVLSVSPHKNSQWINEIGASFAQNKTKFLFSDFKKNNGYLKSVR 
FADSYHLYRQSYCGCVYSYQDMLKRKKDVEDDGTNL >gi|223714081|gb|ACDT01000134.1| GENE 78 67332 - 67676 370 114 aa, chain - ## HITS:1 COG:AF0913 KEGG:ns NR:ns ## COG: AF0913 COG1416 # Protein_GI_number: 11498518 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Archaeoglobus fulgidus # 1 114 1 112 112 77 37.0 7e-15 MKIIFHVDELTKWQLCLGNVTNMSNYYQVQNIDYQIEVLANSEAVIAYQLDYDHEIAKMF TALSKKHVTFTACNNALAANHLTAADIYQFITVVPAGVVELAQKQTEGFAYIKP >gi|223714081|gb|ACDT01000134.1| GENE 79 67692 - 68060 379 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755987|ref|ZP_02428114.1| ## NR: gi|167755987|ref|ZP_02428114.1| hypothetical protein CLORAM_01507 [Clostridium ramosum DSM 1402] # 1 122 1 122 122 186 100.0 5e-46 MFSYHKEDFIIRQIQSAARALAKLFFNSETVSYTIKDESNYTQSDLIYLKIKELMEQHKY LEAQTLLLEHLDSDDLQYLKLILYFYDQLDQMNDEQLAKYYLERQIITNNFIDATSKYNC LI >gi|223714081|gb|ACDT01000134.1| GENE 80 68126 - 69151 1262 341 aa, chain - ## HITS:1 COG:lin1566 KEGG:ns NR:ns ## COG: lin1566 COG0809 # Protein_GI_number: 16800634 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Listeria innocua # 1 340 1 341 342 417 59.0 1e-116 MKISEFDFNLPEELIAQTPLDKRDTSRLMVLNRNQQTIEHKHFYDIIDYLKPGDILVRNN TKVIPARLFGIKEETNGHVEVLLLKDQGNDMWECLVGNARIVKIDTIITFGDGLLKAQCV EIKEEGIRVFKMIYEGIFYEILDQLGTMPLPPYIKEKLDDQNRYQTVYAKIEGSAAAPTA GLHFTDEIIEKIKAKGIEILDVTLHVGLGTFRPVKVDDVLEHHMHSEYYMIEPDVADKLN QAKQNGQRIIAVGTTSTRTLEANMKKYGKFTSVHENTDIFIYPGYKYEAIDCLITNFHLP KSTLLMLISAFASKEFIFKAYQEAIDEKYRFFSFGDSMFIM >gi|223714081|gb|ACDT01000134.1| GENE 81 69153 - 70289 929 378 aa, chain - ## HITS:1 COG:SA1465 KEGG:ns NR:ns ## COG: SA1465 COG0343 # Protein_GI_number: 15927219 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Staphylococcus aureus N315 # 1 373 1 373 379 546 66.0 1e-155 
MAAITYELKHVCKQSGARYGILHTPHGDVETPMFMPVGTLATVKGISPEMLKEMHSQVVL ANTYHLWLRPGEDVVEKAGGLHKFMNYNGPMLTDSGGFQVFSLGKTRKIEEEGVKFKSII DGSSLFLSPEKAIEIQNKLGADMIMSFDECAPYPCTYEYMKNSMERTLRWAKRGKEAHKN TEKQALFGIVQGGEFPDLREQCAKELVAMDFPGYSIGGTSVGEPKDVMYKMVDDTIKWLP EDKPRYLMGVGNPIDLIECAIRGIDMYDCVLPTRVARHGAIMTSRGRLNINNEKFKYDFT PLDPECDCYACKNYTRAYIRHLHKCDEIFGKTLLSIHNVNFLLKLASDIREAIKEDRLLD FKEEFLDKYGHDVYKRAF >gi|223714081|gb|ACDT01000134.1| GENE 82 70383 - 71174 824 263 aa, chain - ## HITS:1 COG:CAC2244 KEGG:ns NR:ns ## COG: CAC2244 COG0561 # Protein_GI_number: 15895512 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 5 263 4 266 266 236 45.0 4e-62 MATNKIMFFDIDGTILSETTHTIPKSTVEGLQKAKEQGHLIFINTGRPFSSIDECIKELD PDGYVCGCGTYIRYHDEVLFSKTLSQERCFEVRDLIRKTNVEGVLEGKNTVYFDQNIRHP FLKGVKERYEATPTFNLSTFDDPNLSFDKLAVWFDEHGDIETFKSEITKDFEYITRAEDF GEIVPLGCSKATGIQFLLDYFGLDKDDAYVFGDSFNDEAMLRYVKHAIVMGNGEPELFKL AYYVTKDIEEDGIYHALQHLNLI >gi|223714081|gb|ACDT01000134.1| GENE 83 71267 - 71971 543 234 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 229 1 229 234 85 30.0 7e-17 MYNIALIDDDQRILQIVQYKVKSIFNSVGIDPTITCFNNPQKLHMDTYYDVLFLDIDMPS LNGIELAQNYLQNHDDTTIIFITNKYDLVFNAFSVHPFDFVKKENLETGLIQPIKHLISK LNRDNRVISLQDKNGLTTIKCKKILYCESYGHICYIHTTDRVIKTNKYKLSNIESIINCE DFYMINQSYLVHWKYVINIENKSTTLEDGTVLPISKRRFKDSLASYKKYTFRNI >gi|223714081|gb|ACDT01000134.1| GENE 84 72154 - 72720 188 188 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733307|ref|ZP_04563788.1| ## NR: gi|237733307|ref|ZP_04563788.1| hypothetical protein MBAG_02731 [Mollicutes bacterium D7] # 1 188 1 188 188 259 100.0 8e-68 MFEFLAEKITQVLETKNIIVRDRAIYKYGIETWISALASILLLLVVGIIFNCILEVIIYE AVFFILRRFTGGYYCTTHFECMSLYIAIFTLYMVLREYLKVSIVMVCIAIAISLIIIISL 
SPVQSNDQQLTELEQRKYHLYSTLLSIIVGISCIVLRFLNIPVYSILTYSFSLLAILMLG GKVVNRRS >gi|223714081|gb|ACDT01000134.1| GENE 85 72949 - 73797 1125 282 aa, chain + ## HITS:1 COG:PAB1737 KEGG:ns NR:ns ## COG: PAB1737 COG0543 # Protein_GI_number: 14521153 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 1 273 1 273 278 255 50.0 8e-68 MYKIVKKEVLNDVVELMEVHAPYVAKKCEPGQFIILRVGEDGERIPLTIADFDREKETIT IIYQIVGYSTKELAKLNEGDELTDFVGPLGVPTVLHEAKHVIGVAGGVGSAPLFPQLREL ANRGVDVDVIIGGREAQYVLLADEFKKFCKNVYIATDDGSLGTKGFVTNVLSDLIEKGES FDEVIAIGPVPMMKAVVNVTKPKNIKTSVSLNPIMIDGTGMCGCCRVSVDGKIKFACVDG PDFDGLQVDFDELMLRQRMFKEEEHTVSENANRMCNLMGGVK >gi|223714081|gb|ACDT01000134.1| GENE 86 73797 - 74604 1119 269 aa, chain + ## HITS:1 COG:PAB1738 KEGG:ns NR:ns ## COG: PAB1738 COG0493 # Protein_GI_number: 14521152 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Pyrococcus abyssi # 1 269 1 282 474 305 56.0 5e-83 MPNMSLTKVVMPEQEPDVRNKNFEEVALGYTKEMAMEEATRCLNCKHQPCKQGCPVGVPI PEFIQEVAAGNMEEAYKIITSENALPAICGRVCPQENQCEGKCVRGIKGEAVGIGRLERF VADYHMANGKAPELDIKSNGIKVAIIGSGPAGITCAGELAKKGYEVTVFEALHKTGGVLS YGIPEFRLPKALVQKEVDSVAALGVKFETNVVVGRSITIDELQEQGYQGIFIGSGAGLPR FQNIPGENLNGVYAANEFLTRVNLMKGYE Prediction of potential genes in microbial genomes Time: Thu May 26 10:36:37 2011 Seq name: gi|223714080|gb|ACDT01000135.1| Coprobacillus sp. D7 cont1.135, whole genome shotgun sequence Length of sequence - 4903 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 580 672 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Term 583 - 621 5.5 + Prom 583 - 642 3.6 2 2 Op 1 . 
+ CDS 662 - 1993 1421 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases
3 2 Op 2 . + CDS 2012 - 2596 570 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B)
4 2 Op 3 . + CDS 2601 - 3143 186 ## PROTEIN SUPPORTED gi|125718620|ref|YP_001035753.1| ribosomal protein N-acetylase
    + Prom 3147 - 3206 5.0
5 3 Op 1 6/0.000 + CDS 3249 - 4124 982 ## COG1396 Predicted transcriptional regulators
    + Prom 4141 - 4200 7.1
6 3 Op 2 . + CDS 4235 - 4651 485 ## COG1396 Predicted transcriptional regulators

Predicted protein(s)
>gi|223714080|gb|ACDT01000135.1| GENE 1 2 - 580 672 192 aa, chain +
## HITS:1 COG:TM1640 KEGG:ns NR:ns
## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 1 192 270 462 468 204 56.0 7e-53
PNHPTPVKITDTVCVIGAGNVSMDCARTAKRLGAKNVYIVYRRSDKEIPARAEEVHHAKE
EGIIFKLLTNPVEIHGEDGWVKSMECVEMELGEPDESGRRRPVVKEGSNFVIETGTVIVS
IGQSPNPLIRQTTPGLETQKWGGIIVDEDSMKTSKEGVYAGGDVVTGAATVILAMGAGKT
AAKAMDEYLANK
>gi|223714080|gb|ACDT01000135.1| GENE 2 662 - 1993 1421 443 aa, chain +
## HITS:1 COG:CAC0738 KEGG:ns NR:ns
## COG: CAC0738 COG0847 # Protein_GI_number: 15894025 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Clostridium acetobutylicum # 6 190 4 188 306 107 34.0 6e-23
MEQSITFVDVETPNYQNNSISSIGVINVDGDGVVTTKYFLVDPEAHFDRFNIELTGITPE
MVADQPNFKEVWSEIEPYFTNSLVVAHNAVFDLGVISACLQRYDLPIFPIFYTCTYRISR
ALKIPSNSYKLNDLSSYYHVTLDNHHNALADSKACMEIFYYLLKEPNLETLDQYVKCFEP
TKGNKDNKKYLEVLIGLLTGIGFDNYLNKKEISFLNNWLTKNQLPYEYANIVKELKAVLK
NEYITHYQYLHILNELQYMKSIKAKNIRSLYEFMAILEGISCDEVINDDEIMELNKWMKE
NEQFKGTYPFNRILNKLEKIIIDKQISTIVTDELLYYIKNFFKPELDQGDLFDVKNKVIC
LTGNFCFGERSQLEKLIVLKQGIISKSVTKKVDYLVLGSKGSAGYKYGKYGAKTNKALTM
KSEGHKIELISEARLMEVLKLSK
>gi|223714080|gb|ACDT01000135.1| GENE 3 2012 - 2596 570 194 aa, chain + ## HITS:1 COG:MA0147 KEGG:ns NR:ns ## COG: MA0147 COG2249 # Protein_GI_number: 20089045 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Methanosarcina acetivorans str.C2A # 1 194 1 197 200 117 35.0 2e-26 MKVLVIYCHPSNKSFTARMKDEFIRGLEDGGHTCQLIDLYQINFDETFSEEEYLREAFYD QALKVPKDVQAHQRLINANDAVVFIYPVFWTEAPGKLVGWFQRVWTYGFAYGTETKMKQL DKVLMLVTMGGDLSEEIRQQQVAAMKVVMLGDRIGERAKTKEMIVFDRMSRDYPVREVNY SKNLKRAYLLGKEF >gi|223714080|gb|ACDT01000135.1| GENE 4 2601 - 3143 186 180 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|125718620|ref|YP_001035753.1| ribosomal protein N-acetylase [Streptococcus sanguinis SK36] # 1 178 1 184 187 76 28 5e-14 MYKERVKENLETTRLRLRKFKLSDKEAVYKYGRDPRTLKYLEWIGVTTREEARISIEEYY LSRAGIYAIALKENDLCIGAIDIRLDEANDKASFGYLLDPDYWNQGYMSEVLSRILKHCF EDIRVNRVEAIHYRLNPASGKVMAKCGMKQEGIGIQELKIKGLYHDVVHYAITQEMWQRR >gi|223714080|gb|ACDT01000135.1| GENE 5 3249 - 4124 982 291 aa, chain + ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 136 1 124 195 75 35.0 8e-14 MSLSDNLRALRKQKGYSQEQLAERLNVSRQAVSKWESDNGYPEMESLIILSDLFECTIDD LLKNDLTQHNPTAKQAYDKHYSLIAKAYTFGVVSILLGVCAYLFAEIYFSENTKSEFIGE IIFLFFVLIGVITFVYFGMKDSHFKNSHSEITDYYTDSQRSEFNQTYSLSIATGVGVILF AVVLQILIENLYNENLANGLFMVFVTIAVGIFVYFGTLKSKYDQVDLKVIKQEKRNKKVS IYCGIIMMIVTAIYLGCSFTTNAWHISWIVYPIGGIICGIVWLLFEAHEED >gi|223714080|gb|ACDT01000135.1| GENE 6 4235 - 4651 485 138 aa, chain + ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 126 1 132 195 60 25.0 7e-10 MSLGKRIQSYRKQKGLSQEQLASRLNISRQALSKWESDINVPNIDKIMDVAKALEITLNE 
LLGLEEDSNDEYAKLESILNQVVLTQNNEIIRNKKYLKMGLVIGTGIFIILALVMWSVLN DIKGIKSTIGNNNAGLSN Prediction of potential genes in microbial genomes Time: Thu May 26 10:36:42 2011 Seq name: gi|223714079|gb|ACDT01000136.1| Coprobacillus sp. D7 cont1.136, whole genome shotgun sequence Length of sequence - 15540 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 3, operones - 2 average op.length - 9.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 66 - 959 407 ## gi|237733153|ref|ZP_04563634.1| predicted protein + Prom 969 - 1028 3.3 2 2 Op 1 . + CDS 1066 - 1245 97 ## gi|237733154|ref|ZP_04563635.1| predicted protein 3 2 Op 2 . + CDS 1249 - 1788 501 ## gi|237733155|ref|ZP_04563636.1| predicted protein 4 2 Op 3 . + CDS 1845 - 5477 2424 ## gi|237733156|ref|ZP_04563637.1| predicted protein 5 2 Op 4 . + CDS 5483 - 5665 120 ## + Prom 5692 - 5751 8.9 6 3 Op 1 . + CDS 5825 - 6514 739 ## gi|237733157|ref|ZP_04563638.1| predicted protein 7 3 Op 2 . + CDS 6569 - 8212 975 ## COG2894 Septum formation inhibitor-activating ATPase 8 3 Op 3 8/0.000 + CDS 8227 - 9699 1091 ## COG4962 Flp pilus assembly protein, ATPase CpaF 9 3 Op 4 . + CDS 9699 - 10646 543 ## COG4965 Flp pilus assembly protein TadB 10 3 Op 5 . + CDS 10657 - 11508 660 ## gi|237733161|ref|ZP_04563642.1| predicted protein 11 3 Op 6 . + CDS 11574 - 11789 368 ## gi|237733162|ref|ZP_04563643.1| predicted protein + Term 11791 - 11826 3.6 12 3 Op 7 . + CDS 11834 - 12019 202 ## gi|237733163|ref|ZP_04563644.1| predicted protein 13 3 Op 8 . + CDS 12040 - 12588 224 ## gi|237733164|ref|ZP_04563645.1| predicted protein 14 3 Op 9 . + CDS 12603 - 13121 482 ## gi|237733165|ref|ZP_04563646.1| predicted protein 15 3 Op 10 . + CDS 13123 - 13605 333 ## gi|237733166|ref|ZP_04563647.1| predicted protein 16 3 Op 11 . + CDS 13619 - 13900 150 ## gi|237733167|ref|ZP_04563648.1| predicted protein 17 3 Op 12 . + CDS 13974 - 14528 360 ## gi|237733168|ref|ZP_04563649.1| predicted protein 18 3 Op 13 . 
+ CDS 14540 - 14860 215 ## gi|237733169|ref|ZP_04563650.1| predicted protein 19 3 Op 14 . + CDS 14865 - 15425 449 ## gi|237733170|ref|ZP_04563651.1| predicted protein Predicted protein(s) >gi|223714079|gb|ACDT01000136.1| GENE 1 66 - 959 407 297 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733153|ref|ZP_04563634.1| ## NR: gi|237733153|ref|ZP_04563634.1| predicted protein [Mollicutes bacterium D7] # 1 297 1 297 297 520 100.0 1e-146 MPLSSKTTDTKPDNNINLNSQENNISSNTVSQKNIETDIVYVNNEPYDIDKTYQKVNTYQ FVLLKLIGDTGLSEFQEILAQIVEKNNFKEKMIRTNIDELINMHILNEQNIGTPLRSKIK LLELTELGKAIYFKETGKKPEISEIQTMLNNHASLRHGYCIKETAKSLQRQGYANVCYDA SKNTIQLADNRRYVPDIIADHSKSVKTYWEVELAHHTDKDFFEKIDKAMKVTPNLYIIAP DKEAKTKLRKQIDKYTKYLFVNSINAKLTIFLGTLNELEKRQIFSNEEECKITINIT >gi|223714079|gb|ACDT01000136.1| GENE 2 1066 - 1245 97 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733154|ref|ZP_04563635.1| ## NR: gi|237733154|ref|ZP_04563635.1| predicted protein [Mollicutes bacterium D7] # 1 59 4 62 62 91 98.0 2e-17 MGDFFNLISTVFINAITVVYFYCQIYSAQIILLGLLTVAGIIEYKEKRFLTVDDERRVI >gi|223714079|gb|ACDT01000136.1| GENE 3 1249 - 1788 501 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733155|ref|ZP_04563636.1| ## NR: gi|237733155|ref|ZP_04563636.1| predicted protein [Mollicutes bacterium D7] # 1 179 1 179 179 342 100.0 8e-93 MINHDYDAVRYKDNTLGIRIFEAIKIAAFYGLCFIMSIYAISLYSYLGKVICIAIIVGVI ASLISWKMVPPERPNQIIAMKRNIFIYSGLLVGAHFLITKLSSLDPNLMGVSLGLPTGEV INNSALGWITMMIQFIIIGTPITHVGYEVKRIWTFYGFGFGKTTKRKRQEQLQKTIVRK >gi|223714079|gb|ACDT01000136.1| GENE 4 1845 - 5477 2424 1210 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733156|ref|ZP_04563637.1| ## NR: gi|237733156|ref|ZP_04563637.1| predicted protein [Mollicutes bacterium D7] # 1 1210 1 1210 1210 2025 100.0 0 MAFTQNFKTLLRSNYIDDEDFDMDDIIIHDTSIEIKDDKLQYLLFDFEKKVSPTKYERVF KAIKLVKLKRVPKKDLGLGGWLEMQTGVITGFYQGQINYIQIMANICHPKKEGLIYAYGV QGVSGRCIEEAMKFADIQMKTVERSITGTFHTLEYTDLTAYDARWIFQKLGCMSDMRVIR GIPTPKITAGRVTSNLFAANENRLIEEQSEEFLLGMDDYEYLFVLTASCVDPLVLSKWKE 
AQLKEQTYWASIKNGQKSMSFGISMPMVYAANMGASEGWGTNRGESFGHSTGTNYSHSVG TNESFSTGTSESVSHGTSNSLSSSNSTSESYGASTGSNLSTGMSKSLGSSESISHGVTEG TSTTTGESFTTGTNSSHSTGTNQSTSISHGTSSSTSTGTGVSHTTNSSVSHSQGVSNSFS SNVSSSTSNSGGFNGGIPGGILGVSGSSSTSHGNGTSEGIGFSSGTTTSSGSGTSYSNSS SNSYGVSSSEGYTQGTSESFTQGTSTSHGTSISQGTSSSNSTSMSSGLSSSVGTSNSYGQ SQSANASAGIGTSQGASSGVTDSYSQGVTNSHSVGASETISNGTSEGYNTGSSVGSSTSG SVTSGSSGSMGLGPSISFAKTFAWEDREVTYLLDMLVYSTNRIIMASNNLGMWFVDIYVA CENEEAASAVSSLSMTAWHDKNALTTPLQVYQPSDTEKDYLFKHLSVFSPSIKKEGIPGQ FESYKYTTMLLSNEIAAYSHPPRVNVGGIQAAVDDPPVLSISNKRQNGEIFLGYVADVEK YSKTNWYKSGFKFCLQNSELHHAYISGASRSGKTVAATRLVAEAYTHVRRGEQGKKLRFL IMDPKQDWRALAKIVPPEHFRFYSLSNPQFRPISLNLMKIPKGVYTERYADKLRELFIRS YGLGDRGFQILGQAIQGVYKRAGCFDEDVMYNEKDPITGKYPATEKSKNVTLADVCNFLE QQTQTAGMPRDKIEAIQRILDRLEQFREPLSSIFKIFCNKGDSGMGIDDLLGADDVIVLE SYGMDTKTSSFIFGLITSSVYQYAVSNGGFVKPEDQYETVLVIEEANQVLIGEDQDNLGG ANPFEIILDQSAGYGLFIWTITQKIADMPRSVLANSALKIIGRQDDEDDIKKTIVQIGKD GLINDRVFKNWLPDQPTGWFIIKSSRNRDFTMNAPQHVLIEYLDIEPPSNEELEYILQEG EIMNKQKIGG >gi|223714079|gb|ACDT01000136.1| GENE 5 5483 - 5665 120 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDFTVHQSIKIIAVIVVSVIIITALLNLDSVTESKQQMQSIADINQSNREKTANIKASFY >gi|223714079|gb|ACDT01000136.1| GENE 6 5825 - 6514 739 229 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733157|ref|ZP_04563638.1| ## NR: gi|237733157|ref|ZP_04563638.1| predicted protein [Mollicutes bacterium D7] # 1 229 1 229 229 352 100.0 9e-96 MLKSSKRISLIIIIVAGLILSGSLFMIFVGNTTEVLVATQDVKANTKIKESMFTTERVDT ASLPDNYLSADSAKDVVGYYTYIGFTKGSVISKSNIATSAKKASGAIEKGKTLLTISAEN LPTGIQSGDKVNIIIGANTDSGNKVVLTYQNIDVSNIYVDDEGSITGLEVSVTPEQSQKI VYAKLNGELSISLLPINYKDINLPIIDEGGFLDTSSSSTATDTNTSEAQ >gi|223714079|gb|ACDT01000136.1| GENE 7 6569 - 8212 975 547 aa, chain + ## HITS:1 COG:BH3027 KEGG:ns NR:ns ## COG: BH3027 COG2894 # Protein_GI_number: 15615589 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Bacillus halodurans # 241 545 5 261 264 68 23.0 3e-11 
MKIIILSSSEPYATELSRAALLLGHTIVATYISIDDVKKNIKNIEYDYLISSDSLYQNEQ LDTLIQFINSIGDIKKLKFILKNPGIHEALLAQYQIMPLYETTPPNTVISILAQEYAINS QFAQQSQYNEQNLQPTYHVPTMNQNTFVHVASASTQQNDPSPTINNSQIYPQGETFLGTP PVQANTPNSPTGTSINTSYNNDQPDTINSNEQIKNTKEESKSILGSRKNNTNFINFKNKF IVVNSPKGGVGKTTLAIELASLISNRAKGMDLNPASKFTSSKEFRTCLIDLNPSFDTMAS TLKCVHETKDYPTILNWVNRIEEKIYTSMTNEEKQAFNENRDLFDISPFCNNKIIKFTWD EVKSLTVYDAQTGLYIIPAVALPMDVNKVLPDYISIIIETIRKYFDISIADTSNNLTYFT VEAFHQADEVILVSSPTISTSTVVNRLIDACKKIDVDTSKFNLVINHPNRADSDLEAEKI ASVLKINLVAELPYDENLGKILEKGTPFSINTPKSKYSQAVTKLAHQIIPLWTMKKQKIK SKKFFNF >gi|223714079|gb|ACDT01000136.1| GENE 8 8227 - 9699 1091 490 aa, chain + ## HITS:1 COG:RSc0652 KEGG:ns NR:ns ## COG: RSc0652 COG4962 # Protein_GI_number: 17545371 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Ralstonia solanacearum # 102 489 91 453 453 260 38.0 4e-69 MPIPNSGINNISVFNSKTTNSNISNKNKPKKLISKNPIDFICENYSVVKHIGDMMSNKLG NELFEKDRKVVESECRTMILDYLKEKELLPIDALVAQYLKLIYDNILGLGMIQPLLDDKE IDEILVLDYNKIYVEKHGKLQLSPYRFPNYEAAYGVAKKIIEPLNRVLDVAHPNVDAQLQ DGSRLSATIPPLRANGAISLTIRKFKDKVEPLSFYAEKYKSSTPEMVKFIETAVKSKISI IVSGGTGSGKTTLLNSCSLAIPKDERIITIEDTLELKLQQENVETYQTVEENVEGKGGFS TQDIVKIALRKRPDRIIVGECRGGEFVEMLNAMNTGHDGSMSTVHSNSAKDFAQRAKTMV LSNPSTNNLEEDAIFNMINSAISLIIQTNRFEDGSRKITQITEIVGYGNEGYSKLREAGV LGPKAAVDKKKLYLQDIFKFTQTGVDPITKRVQGKFEAKGYKPMCVDKMNEKGYTFPDNF FEKRILLEVK >gi|223714079|gb|ACDT01000136.1| GENE 9 9699 - 10646 543 315 aa, chain + ## HITS:1 COG:SMa1564 KEGG:ns NR:ns ## COG: SMa1564 COG4965 # Protein_GI_number: 16263304 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein TadB # Organism: Sinorhizobium meliloti # 114 311 130 328 333 62 23.0 8e-10 MNFLFKTIIALLPLITATLIVVFITLQSVTSEMELKKVIKATTKSMIDAKVSDSVKEQKK KSYIKNFIDKQNSKLNLVGITYKYETCIAAASGVFIIAAIIAKLLFKAGPMLMVYLGLVF AGAVLLAVNKRAETKKEELTLEFLEKINEISAHLSVGKTIPNALDEIIKEQNTSQVLINE 
LITVKQNLNLGYPLSKSFMLAYEHLQIEEIKTFAMTLSVYEETGGNIIEVLKANDNFFQS KIKIKNTQKVYISSLKTSQKLTVGIPLGFIVLVIFINPSFFGDFYGTAVGELVGIIAISF LLFGIFLSNKIAKLK >gi|223714079|gb|ACDT01000136.1| GENE 10 10657 - 11508 660 283 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733161|ref|ZP_04563642.1| ## NR: gi|237733161|ref|ZP_04563642.1| predicted protein [Mollicutes bacterium D7] # 1 283 1 283 283 510 100.0 1e-143 MKELLVYISIICIFLIGLSLYIYIQSKNNNENDIQVLINKIRKSSFVSQEVMMKKMEEIP YIQRKNQEIEKMIDITESEYTLDSIYKYRKIVLICTFAAFILVLIFVGFIPSILILAIGG LCSYVPDFRLKQDLSMKYQIFDNTLPDYISRVSLAMNAGMNLSQAMIIATRALDGDIKKE FVRFLADTERNQDDIAKPYINLRQRFPTKSCERFCSVVITGIKNGNKMSEILEKETNYIN DETLIKMEEQGKKNEIVSTAISTGFIFIPITALLIAPIMMTSV >gi|223714079|gb|ACDT01000136.1| GENE 11 11574 - 11789 368 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733162|ref|ZP_04563643.1| ## NR: gi|237733162|ref|ZP_04563643.1| predicted protein [Mollicutes bacterium D7] # 1 71 3 73 73 106 100.0 5e-22 MKALKNYILDEDAISVEQIVFVIAGLAVAIAIGWFIYNLVAGKADDASDIASDANSSKSA GSEFSGGAFGN >gi|223714079|gb|ACDT01000136.1| GENE 12 11834 - 12019 202 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733163|ref|ZP_04563644.1| ## NR: gi|237733163|ref|ZP_04563644.1| predicted protein [Mollicutes bacterium D7] # 1 61 1 61 61 88 100.0 1e-16 MLFDVKSYIQNTDASASIETVVLVVAALSVSIAVGWWIFNTVKGQADNSSCSGNNSPFCI E >gi|223714079|gb|ACDT01000136.1| GENE 13 12040 - 12588 224 182 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733164|ref|ZP_04563645.1| ## NR: gi|237733164|ref|ZP_04563645.1| predicted protein [Mollicutes bacterium D7] # 1 182 1 182 182 250 100.0 3e-65 MLSTIFLILSLIAMFFGVIVDIKEKKFPNYIVIILLLLGMFYFCQTNSIKELIVPLCLFV LFNFIGIYLHKIKLVAPGDMKFFSLFPFFIVWSPKTSIIFIFILAFISMIYVSIRIFRKE KSLKSIFTNFKNQIFELKVFLLSKIRISPDYTQVSKDEGIAFTVPLFISFLIVLSCQQFG CV >gi|223714079|gb|ACDT01000136.1| GENE 14 12603 - 13121 482 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733165|ref|ZP_04563646.1| ## NR: gi|237733165|ref|ZP_04563646.1| predicted protein [Mollicutes bacterium D7] # 1 172 1 172 172 318 100.0 6e-86 
MLKKFFKDEEGSEAVQMAILLPIILLLVMFITDRFIQYEGLTTTTVASNEALRAAVVQTS KEDAKNVIKDTLENRFSEGGIGWCSGSDLNHCVNWKASNTTSSISSFKNNSNYQLAVQLD GEWEKGSYLTLGVRTHKASIIPSYSNFRRLLSGGPIYHTHTYVIKALVEGEN >gi|223714079|gb|ACDT01000136.1| GENE 15 13123 - 13605 333 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733166|ref|ZP_04563647.1| ## NR: gi|237733166|ref|ZP_04563647.1| predicted protein [Mollicutes bacterium D7] # 1 160 1 160 160 285 100.0 5e-76 MNFLKKEDGNVTVVLVVLLPVLLWCLFFFESKMQVRWIYTQVQSVLDFSTLAGANTGETQ KSGTEYVCSIPYSASEQSKSGYHVAVKLFKENVKTLPEGIKNELLAQLQSGDIEGLKDRD AQMGGYMIMSTSFKYKPNIPIFFHNYIVTVSSTSRCQPDI >gi|223714079|gb|ACDT01000136.1| GENE 16 13619 - 13900 150 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733167|ref|ZP_04563648.1| ## NR: gi|237733167|ref|ZP_04563648.1| predicted protein [Mollicutes bacterium D7] # 1 93 1 93 93 138 100.0 1e-31 MGNLNKEDVLKETRNIQKQNQEIEEKRAILNEDTLKSKIQSKKPKKEEISLDRRLGYRRK YLTEKYKKFKIIFYVAAVLILIYLMFIIMPCKV >gi|223714079|gb|ACDT01000136.1| GENE 17 13974 - 14528 360 184 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733168|ref|ZP_04563649.1| ## NR: gi|237733168|ref|ZP_04563649.1| predicted protein [Mollicutes bacterium D7] # 1 184 21 204 204 278 100.0 9e-74 MVASSLLMLTGCSAEKSPVQVLDETLQYWKNKNEKSFKKSFKNKKVAMYSLIETSADDNK DLIEISKKLEEKIYSGLKYKILSDSQNDASHATVDVQISTIDTNSFLNEFIKKIINFYYE SKSTNSSLKEEQFYEEINEDFEKIIVARNFTVTVKFQKINKEWKITNNETLINALTGGYL EYQY >gi|223714079|gb|ACDT01000136.1| GENE 18 14540 - 14860 215 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733169|ref|ZP_04563650.1| ## NR: gi|237733169|ref|ZP_04563650.1| predicted protein [Mollicutes bacterium D7] # 1 106 1 106 106 173 100.0 4e-42 MAEIYRVYERPNEMITIVGSLAQISANAVELLISFESKNAKDIMLDAVKENKFIPLITKP EKIKTLVITKDGKVYPSTFSMSALSSRIYKATTKDYYKKELNIITR >gi|223714079|gb|ACDT01000136.1| GENE 19 14865 - 15425 449 186 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733170|ref|ZP_04563651.1| ## NR: gi|237733170|ref|ZP_04563651.1| predicted protein [Mollicutes bacterium D7] # 1 186 1 186 186 312 100.0 7e-84 
MEKQNWSSYISRFIENKEYNPSIPPEEDFSNLINELDIHDANIIYRYLYHSGYDPYFDLL SKENFKAYNNTVSSRIDFMKKNFNYTTDFFLKNNIWGNVSYKRKDFDIKDYVNDVVIAFN YGRLEIYVINIEKIKHTIKPLKKYLNTTPNMELKDNNDGYFLCNYDIQTDIFNIIEAIKN IIEARL Prediction of potential genes in microbial genomes Time: Thu May 26 10:39:16 2011 Seq name: gi|223714078|gb|ACDT01000137.1| Coprobacillus sp. D7 cont1.137, whole genome shotgun sequence Length of sequence - 554 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 324 63 ## COG0681 Signal peptidase I 2 1 Op 2 . + CDS 326 - 554 60 ## Predicted protein(s) >gi|223714078|gb|ACDT01000137.1| GENE 1 1 - 324 63 107 aa, chain + ## HITS:1 COG:BS_sipV KEGG:ns NR:ns ## COG: BS_sipV COG0681 # Protein_GI_number: 16078113 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Bacillus subtilis # 9 81 58 143 168 62 46.0 1e-10 LEPKGFEFNRFDIICADVNNNKVIKRLIGLPGETINYKDGLLYINGVQTNESYFKNKENM DTPNFSLTLTSKQYLIIGDNRRLNDNLYNIIEQSQIISSGILFPMVE >gi|223714078|gb|ACDT01000137.1| GENE 2 326 - 554 60 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDSILLASYMCFNDSINQIQNHIDNLGLQPFYSYDIAVPAKENYYNYIPIFSEKRSVPDS YVYFSQKKPSNKVISR Prediction of potential genes in microbial genomes Time: Thu May 26 10:39:33 2011 Seq name: gi|223714077|gb|ACDT01000138.1| Coprobacillus sp. D7 cont1.138, whole genome shotgun sequence Length of sequence - 46925 bp Number of predicted genes - 51, with homology - 47 Number of transcription units - 23, operones - 14 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 56 - 115 4.5 1 1 Op 1 . + CDS 165 - 5003 3128 ## COG5263 FOG: Glucan-binding domain (YG repeat) 2 1 Op 2 . + CDS 5022 - 5393 362 ## gi|237733173|ref|ZP_04563654.1| predicted protein + Term 5500 - 5541 -0.8 + Prom 5907 - 5966 6.0 3 2 Op 1 . 
+ CDS 6039 - 6938 801 ## COG2207 AraC-type DNA-binding domain-containing proteins 4 2 Op 2 . + CDS 6998 - 7213 89 ## gi|255308818|ref|ZP_05352989.1| AraC family transcription regulator + Term 7235 - 7276 -1.0 + Prom 7272 - 7331 4.5 5 3 Op 1 . + CDS 7406 - 7720 151 ## Amet_3916 transcription activator, effector binding 6 3 Op 2 . + CDS 7739 - 8161 279 ## Elen_0700 hypothetical protein 7 3 Op 3 . + CDS 8211 - 8972 768 ## COG0500 SAM-dependent methyltransferases 8 3 Op 4 . + CDS 9050 - 9520 385 ## gi|237733178|ref|ZP_04563659.1| predicted protein + Prom 9552 - 9611 3.0 9 4 Op 1 . + CDS 9631 - 10044 363 ## BF3481 hypothetical protein 10 4 Op 2 . + CDS 10097 - 10429 344 ## mru_0642 hypothetical protein 11 4 Op 3 . + CDS 10451 - 10771 343 ## COG3070 Regulator of competence-specific genes 12 4 Op 4 . + CDS 10822 - 11271 212 ## Sterm_0334 hypothetical protein + Prom 11288 - 11347 3.6 13 5 Op 1 . + CDS 11469 - 11879 322 ## CDR20291_0527 hypothetical protein 14 5 Op 2 . + CDS 11929 - 12342 381 ## BT_2376 hypothetical protein + Prom 12347 - 12406 2.8 15 6 Op 1 . + CDS 12432 - 12683 199 ## CDR20291_3072 hypothetical protein 16 6 Op 2 . + CDS 12680 - 13180 295 ## CD3334 hypothetical protein 17 6 Op 3 . + CDS 13212 - 13649 329 ## Blon_0991 hypothetical protein + Term 13746 - 13791 1.2 + Prom 13702 - 13761 2.2 18 7 Tu 1 . + CDS 13813 - 14154 270 ## COG3547 Transposase and inactivated derivatives + Prom 14743 - 14802 10.2 19 8 Op 1 . + CDS 14973 - 15791 585 ## gi|237733189|ref|ZP_04563670.1| predicted protein 20 8 Op 2 . + CDS 15845 - 17368 1236 ## gi|237733190|ref|ZP_04563671.1| predicted protein + Prom 17392 - 17451 2.3 21 9 Tu 1 . + CDS 17476 - 17871 300 ## gi|237733191|ref|ZP_04563672.1| predicted protein + Prom 18022 - 18081 6.3 22 10 Tu 1 . + CDS 18241 - 18408 189 ## + Term 18419 - 18445 -1.0 + Prom 18468 - 18527 13.3 23 11 Op 1 . + CDS 18558 - 20714 1442 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 24 11 Op 2 . 
+ CDS 20726 - 21088 316 ## + Prom 21090 - 21149 2.8 25 12 Tu 1 . + CDS 21238 - 23625 1556 ## COG0550 Topoisomerase IA 26 13 Op 1 . + CDS 23992 - 24888 615 ## COG4962 Flp pilus assembly protein, ATPase CpaF 27 13 Op 2 . + CDS 24904 - 25686 624 ## gi|237733195|ref|ZP_04563676.1| predicted protein 28 13 Op 3 . + CDS 25705 - 26946 721 ## BMQ_2765 hypothetical protein + Prom 27017 - 27076 10.8 29 14 Op 1 . + CDS 27104 - 27268 119 ## 30 14 Op 2 . + CDS 27268 - 27678 385 ## gi|237733197|ref|ZP_04563678.1| predicted protein 31 14 Op 3 . + CDS 27698 - 28048 344 ## gi|237733198|ref|ZP_04563679.1| predicted protein 32 14 Op 4 . + CDS 28038 - 28397 178 ## gi|237733199|ref|ZP_04563680.1| predicted protein 33 14 Op 5 . + CDS 28397 - 29176 735 ## gi|237733200|ref|ZP_04563681.1| predicted protein 34 14 Op 6 . + CDS 29183 - 29641 228 ## COG0602 Organic radical activating enzymes + Prom 29808 - 29867 5.1 35 15 Tu 1 . + CDS 29978 - 31237 172 ## PROTEIN SUPPORTED gi|145223395|ref|YP_001134073.1| NLP/P60 protein + Term 31303 - 31334 1.1 + TRNA 31810 - 31885 70.2 # Thr TGT 0 0 + TRNA 31892 - 31966 54.2 # Cys GCA 0 0 + TRNA 32055 - 32129 81.3 # Asn GTT 0 0 + TRNA 32131 - 32205 62.2 # Gly TCC 0 0 + TRNA 32209 - 32283 72.7 # Val TAC 0 0 + Prom 32210 - 32269 79.9 36 16 Tu 1 . 
+ CDS 32440 - 32712 98 ## gi|237733203|ref|ZP_04563684.1| predicted protein + Term 32852 - 32918 30.0 + TRNA 32741 - 32817 82.9 # Met CAT 0 0 + TRNA 32822 - 32907 64.1 # Leu TAG 0 0 + TRNA 32913 - 32987 50.7 # Asp GTC 0 0 + TRNA 32990 - 33064 57.8 # Ile GAT 0 0 + TRNA 33068 - 33141 49.1 # Trp CCA 0 0 + TRNA 33295 - 33371 52.4 # Gln TTG 0 0 + TRNA 33387 - 33461 62.0 # Arg TCT 0 0 + TRNA 33465 - 33540 65.8 # Ala TGC 0 0 + TRNA 33542 - 33617 58.4 # Lys TTT 0 0 + TRNA 33621 - 33709 62.0 # Leu TAA 0 0 + TRNA 33711 - 33786 59.0 # Phe GAA 0 0 + TRNA 34070 - 34154 50.1 # Ser TGA 0 0 + TRNA 34413 - 34489 78.0 # Pro TGG 0 0 + TRNA 34491 - 34565 56.8 # Arg ACG 0 0 + TRNA 34857 - 34930 62.8 # His GTG 0 0 + TRNA 34932 - 35021 61.3 # Ser GCT 0 0 + Prom 34946 - 35005 80.4 37 17 Op 1 . + CDS 35134 - 35601 96 ## gi|237733204|ref|ZP_04563685.1| predicted protein 38 17 Op 2 . + CDS 35633 - 36103 155 ## bglu_2p0400 hypothetical protein + Term 36119 - 36153 2.0 39 18 Op 1 . + CDS 36163 - 36858 626 ## EUBREC_0875 hypothetical protein 40 18 Op 2 . + CDS 36860 - 37102 288 ## gi|237733207|ref|ZP_04563688.1| predicted protein 41 18 Op 3 . + CDS 37122 - 37388 355 ## gi|237733208|ref|ZP_04563689.1| predicted protein + Term 37425 - 37461 2.3 + Prom 37494 - 37553 11.7 42 19 Tu 1 . + CDS 37588 - 38826 1086 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Prom 38936 - 38995 12.8 43 20 Op 1 . + CDS 39015 - 39650 456 ## gi|237733210|ref|ZP_04563691.1| predicted protein 44 20 Op 2 . + CDS 39667 - 40152 467 ## gi|237733211|ref|ZP_04563692.1| predicted protein + Term 40154 - 40200 -0.5 + Prom 40185 - 40244 5.5 45 21 Tu 1 . + CDS 40372 - 40791 352 ## gi|237733212|ref|ZP_04563693.1| predicted protein + Term 41014 - 41048 1.4 + Prom 40959 - 41018 6.5 46 22 Tu 1 . + CDS 41217 - 41405 171 ## gi|237733213|ref|ZP_04563694.1| predicted protein + Prom 41411 - 41470 12.9 47 23 Op 1 . + CDS 41636 - 43399 1095 ## COG0608 Single-stranded DNA-specific exonuclease 48 23 Op 2 . 
+ CDS 43414 - 43917 478 ## gi|237733215|ref|ZP_04563696.1| predicted protein 49 23 Op 3 . + CDS 43970 - 44170 161 ## gi|237733216|ref|ZP_04563697.1| predicted protein 50 23 Op 4 . + CDS 44148 - 44303 171 ## 51 23 Op 5 . + CDS 44303 - 46516 1024 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member + Term 46538 - 46577 -0.7 Predicted protein(s) >gi|223714077|gb|ACDT01000138.1| GENE 1 165 - 5003 3128 1612 aa, chain + ## HITS:1 COG:CAC1079_2 KEGG:ns NR:ns ## COG: CAC1079_2 COG5263 # Protein_GI_number: 15894364 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Clostridium acetobutylicum # 830 1249 2016 2422 2566 168 31.0 1e-40 MNVFKRECRYTPAHLIVGLFIGILFLFLNINKINAVGEIKNVNGYNVETAYEINEVYIEN NQIHLKGWYVAKGYQNFAQKNVTHLYSLSITGINKQYNDIGDYNTGGTITNLHKSANAGF CSSQYGQPGINQNGNCNYYYDDVGFHFAIPLQDFQNAVSSGVTDFYFQISVSCLNSRAGY STISQFVYVTNTRVGNGQGINLDIGNGYTANLYSNLEYKEIIVQPINAYVKTGPGKGYSV SKYGSKSRYWGYQYMYSGLSYAGSSELTDWFGLHFSDYGYQYNRWRVANGGSNYGYIPTS YFNVRNYTSNGENAALHLKILNKSPEITATTTQTYYSGQDLTTDMLKSGIIANDREDGKY VLNGTENKLRMEISSKEVTISNDKINTKNLDGSYQFTVKVTDTAGASTTSNLTVKFIKNT APTISGIKNDEEKIMYIRNYNLKSSDILQNITSNDIQDGDLSSELRVLKGNEVYDDSLDR NIPKIYNDFSVEVKDIPLIDIANHNISYSGMTAMTDTFNVTGGRNYQISVPTTQYKIIEY NSSGTQIRDSGWIQTKSGVFANSYTVQANASKVKIIMNSNKVSDPTSIHMIRDVWAQNVA LTTNINFILNIMDYYPPELSGHDYYYFVGYPVNEKEILEDIVITDTRFDVDKIRETLKIT NFDVIDPNTPGEYIIQITATNSGIEENPDLAEEKYTGSLNVVVHIIDTDDSQFIKKSIRY INNKHVKTLSKDSNWYKNEVLTNRLKSSLEKNSDDAILSFVFTGKDMQKLKEYINTKKGT TLFTTDFNQTFFEKIMAYSPDGHLNDGWIKIDGYWCYQYFDDKNRANLYTSTWKEIDNAL YYFDEAGHLAQNKWVTIDDKLYYFTNDGSVAHGWTYINGNYYYFDDNYEKVTGFKNIDGN TYYFDNNGIRIIGWYTIEGSQYYFGNDGIMRKDQILEIDGITYIFMENGQVPYYLVTYKD KLYCIDPVRGMIKNEIYDYNGNKYLFGDDGSALKGWQIFNNKKYYINSDYQLCINMFAVI DGEKYYFNKDGVIQTENGFITVDGKIYYVQDGGTILRDSWKTINGYEYYFNKEGIAATEI VKINSEYYYFDETGKKQINTWHNSYYFNSNGKAVNGWQDLHDPEDTSQDTFKFYFKDYQA IKNTIFEFSNESYYADVNGHIRSGLMNIGNDIYFFDETNYVMKKNQLVKYYTDSYFFGED 
GKAIKNTWKKIDDYNYYFKSDGKMARDANLTINSNNYHFDAEGKMCTDYIDGLWYYNNLG IGEPLLRSYDISGYVIEDDNATDGNIADTGDNRIVVGIFNDGRLKVTGVGDSMIFTNKDN YIAPWLIDTYTDENNQTKNISDLILSVEFSDSVSIKNLDYWFINCKNLENVSNLPITVES MISTFDGDTKLFNVTDMSKMVQLTTIKRAFAGTGLILTPILPDNITDMDYAFTNCLNLVE VINIPKNVVTMQGTFENCPSLQTVPNIPSEVENLNYTFQYDSNLKGQLIIETNKKDVTVS GTFTRAATTDLLELYGNGTNNELVNKMLPNTRKISNIIAGNEILSKNIKLKVNEEAQIKI YHNFINPEFTFTSDKSIVKIDNKGNIKALKSGETIITIAHDSKKCTIKAIVE >gi|223714077|gb|ACDT01000138.1| GENE 2 5022 - 5393 362 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733173|ref|ZP_04563654.1| ## NR: gi|237733173|ref|ZP_04563654.1| predicted protein [Mollicutes bacterium D7] # 1 123 3 125 125 200 100.0 2e-50 MILKGGIAVFALFLLVICIMDSGTSVIRRNETATIAEVASYETLNAVKNNATAINSDQEF IAEVIKNIVLEYDTNSDVVVNIISIDARNGLLDLEIQQTFTHSNGAKETNVERRTVILED YRE >gi|223714077|gb|ACDT01000138.1| GENE 3 6039 - 6938 801 299 aa, chain + ## HITS:1 COG:BH3634_1 KEGG:ns NR:ns ## COG: BH3634_1 COG2207 # Protein_GI_number: 15616196 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 1 117 1 116 132 65 32.0 1e-10 MNDMEAVQIIQDYIRTHYKDDNFCIHDVCSKIGYSRRQVDRFFKRFLKKTLREYINAVCL TESANELLNTRKTILEVALNSHYETHEGFARSFYKRFHIMPSVYRDKKIAIPLFIQYPVS HYNALLKHKEELFMNNALNFCMITAKERNKRKLIYLPSKNAQDYFSYCEEVGCEWEGLLN SIPEKIEPAALIELPEKFVEKGFSRIAVGIEVPLEYDKELPESYKIVELPECIMLYFQGE PYENEEDFCKAIESTYTAIGKYNPTLYGYKFAYDIAPSFNFGADTSTGARVAVPALSID >gi|223714077|gb|ACDT01000138.1| GENE 4 6998 - 7213 89 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255308818|ref|ZP_05352989.1| ## NR: gi|255308818|ref|ZP_05352989.1| AraC family transcription regulator [Clostridium difficile ATCC 43255] # 10 71 93 154 303 115 88.0 9e-25 MCNGTFETIKKVYHITSDEYRKSVPMLNTFDKPELSSRYIMIDENVPLIVGNIVLEIQRK ALRTRKIYFGF >gi|223714077|gb|ACDT01000138.1| GENE 5 7406 - 7720 151 104 aa, chain + ## HITS:1 COG:no KEGG:Amet_3916 NR:ns ## KEGG: Amet_3916 # Name: not_defined # Def: transcription activator, effector binding # Organism: A.metalliredigens # 
Pathway: not_defined # 17 90 167 240 241 88 55.0 9e-17 MVLSTDVCPRNMVQHALLAGEYIVCRIEAESFEELVTTALNQANKYLFETWLPNHKLVTE PFMAEKYYKENAEFSHMGIWVIPVEVGITEKELSYTNGNEIENL >gi|223714077|gb|ACDT01000138.1| GENE 6 7739 - 8161 279 140 aa, chain + ## HITS:1 COG:no KEGG:Elen_0700 NR:ns ## KEGG: Elen_0700 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 7 140 10 142 146 103 35.0 2e-21 MINIKDINYAPNISEISGYIKNPIFDEFYQYMNDKYKAVCKIEYSKDVWARGWNVKLRKA GKSLCVIYPKEQYFTLLIVVGSKEKNKVEELLPQLSKEMQDIYQNTKEGNGQRWLMIDLH KDDCLYQDALKLIHIRRESK >gi|223714077|gb|ACDT01000138.1| GENE 7 8211 - 8972 768 253 aa, chain + ## HITS:1 COG:slr1117 KEGG:ns NR:ns ## COG: slr1117 COG0500 # Protein_GI_number: 16329224 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Synechocystis # 8 251 6 251 253 172 36.0 8e-43 METTHPSMLDLLLETHIGLERQGPGSVEAIEQALEFLKPLNQFSKIADLGCGTGTQTLLL AKYLSGNITGLDIFSNFIDKLNENAKNRNLASRVTGITGSMENLPFQKNSLDLIWSEGAI DNIGFEKGLSYWHGFLKKEGFIAVTCPSWLTKEQPTVVEEFWSDAGSRLDSISDNIETMQ KCGYQFIASFALPEQCWTENYFIPREKVINKLLDKYVGNETMIEYAEQNRHETELYSKYS QYYGYVFYIGRVI >gi|223714077|gb|ACDT01000138.1| GENE 8 9050 - 9520 385 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733178|ref|ZP_04563659.1| ## NR: gi|237733178|ref|ZP_04563659.1| predicted protein [Mollicutes bacterium D7] # 1 156 1 156 156 292 100.0 5e-78 MPYTFQSVRNIEKLTDSSKLYNFIESFSDYGNQFSLICGKIKALFGQPIYETESLENLFS WCILAASEEEEVYLDIYCASSGPAVGGMSDEKSRKAAKALVDYVRQAEPINYAYKAYYLD GPTALEFGIREGTPYYNETELRLSEKEFRELYARLL >gi|223714077|gb|ACDT01000138.1| GENE 9 9631 - 10044 363 137 aa, chain + ## HITS:1 COG:no KEGG:BF3481 NR:ns ## KEGG: BF3481 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 5 137 7 139 142 166 58.0 3e-40 MLETIPSQSNMAELLGQSLFEVWQELCSAIDEKYEMERLWNTGGKKWIYEYKYRRGGKTL CCLYAKSNCVGFLIIFGKDERTKFEDIRDTLSNAVCRQYDEAKTYHDGKWVMFEPTDTSM FCDFIKLLSIKRKPNKK 
>gi|223714077|gb|ACDT01000138.1| GENE 10 10097 - 10429 344 110 aa, chain + ## HITS:1 COG:no KEGG:mru_0642 NR:ns ## KEGG: mru_0642 # Name: not_defined # Def: hypothetical protein # Organism: M.ruminantium # Pathway: not_defined # 1 109 1 109 112 125 63.0 4e-28 MKELIAYCGLDCEKCDAYHATINDDQELREKTAKLWAELNNAPILPEHINCQGCRVEGIK TMFCDNMCDIRQCALKKDVVTCGGCSDLENCPTVGPILENNPSALKNLKG >gi|223714077|gb|ACDT01000138.1| GENE 11 10451 - 10771 343 106 aa, chain + ## HITS:1 COG:SP0951 KEGG:ns NR:ns ## COG: SP0951 COG3070 # Protein_GI_number: 15900829 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Streptococcus pneumoniae TIGR4 # 1 75 1 75 75 99 62.0 2e-21 MASSKEYLEFILGQLSELEEITYRAMMGEFIIYYRGKIVGGIYDDRLLVKPVKSAISYMS TAPHELPYEGAKEMLLVDEVDNKEFLIGLFNAMYEELPVPKLKKKK >gi|223714077|gb|ACDT01000138.1| GENE 12 10822 - 11271 212 149 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0334 NR:ns ## KEGG: Sterm_0334 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 4 147 5 148 157 204 63.0 1e-51 MTPDEILKYCLENLEGTVLVNSWGERGIYYNPDNILKRGVYILTVKEKDGDNDKSSSLNR ENIYRVNLGVRKTTFIEMFDFIPKRPAKGCIVDMNYDFSATNKILPHPVYAWMGWICSLN PSEKTFVELKPLIQEAYDYAKEKYKKRKV >gi|223714077|gb|ACDT01000138.1| GENE 13 11469 - 11879 322 136 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_0527 NR:ns ## KEGG: CDR20291_0527 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 136 1 137 137 177 81.0 7e-44 MKMPEKIDISMFAPCGMNCKVCYKHCYHKKPCAGCLNSDKGKPEHCRKCKIKDCVKAKAL SYCFECAEYPCKLIKNLEKSYNKRYQASLIENSRFVQGQDLEHFMEQQKIKYTCSECGGI ISIHDRECSECQEKMK >gi|223714077|gb|ACDT01000138.1| GENE 14 11929 - 12342 381 137 aa, chain + ## HITS:1 COG:no KEGG:BT_2376 NR:ns ## KEGG: BT_2376 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 137 43 176 178 183 64.0 2e-45 MLDKIPNAEEMTALVGKSLYDIWNKLCILIDEKYEMECLWNKGGKAWTYEYKYRRGGKTI 
CALYARENCVGFMIILGKDERLKFDKDRNSYSKEVQKIYDETKTYHDGKWLMFEPTDTAL FDDFVRLLGIKRKPNRK >gi|223714077|gb|ACDT01000138.1| GENE 15 12432 - 12683 199 83 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_3072 NR:ns ## KEGG: CDR20291_3072 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 83 1 83 83 104 71.0 8e-22 MNTVNDLLRNLNKLHTTRLGVERIKRNLSLDTDDVVDWCKTKINSVNAVITRNGKNWYAN VDSCIITVNAYSYTIITAHRERK >gi|223714077|gb|ACDT01000138.1| GENE 16 12680 - 13180 295 166 aa, chain + ## HITS:1 COG:no KEGG:CD3334 NR:ns ## KEGG: CD3334 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 154 2 155 156 255 86.0 3e-67 MKYTIRELNAFSVIGQEVELTNYQKKNIQISTQFWRKFNSNLKKSYLSQSGNWVKYAFME RRNGKLYYFCSIPKKTIIPDGFNMKEIPSYKYLVVEHIGAMGKIYGTYGKIYQEIIPSIP YIPIKDIILHFEKYDFRFHWNRDDSIIEIWIPIKSKSNPSLYDYRR >gi|223714077|gb|ACDT01000138.1| GENE 17 13212 - 13649 329 145 aa, chain + ## HITS:1 COG:no KEGG:Blon_0991 NR:ns ## KEGG: Blon_0991 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 1 145 1 145 145 246 82.0 2e-64 MCIECYIDESRITPLLNPIECLQNHTQYICGTCGRCICIEHDPKRGLQRWNFPFKSLEIA KMYLRTADYSMKKSCGIYEIVSENGRRSYKIFANNEDLQLYLKKNKGKTCKDTKSIFAVE EYKEYANTQIRKLTSNEIQKYMQEQ >gi|223714077|gb|ACDT01000138.1| GENE 18 13813 - 14154 270 113 aa, chain + ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 2 112 280 390 391 89 42.0 1e-18 MLIESEIGDINNFSSAVKVMGFAGVDPGTYQSGEYSAPRTALSKRGSRYLRKSLYQCILP LCTNNLAFNKYYKLKRLQGKSHSCAQEHSIRRLIRVIYKLLSENIQFVERKLI >gi|223714077|gb|ACDT01000138.1| GENE 19 14973 - 15791 585 272 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733189|ref|ZP_04563670.1| ## NR: gi|237733189|ref|ZP_04563670.1| predicted protein [Mollicutes bacterium D7] # 1 272 1 272 272 496 100.0 1e-139 
MEKKKQGLGFYLNILAALVFFSGCGYFAYDYYTGKQAEQKEAEEAYAKDTSWQVENTRNN KYNGIYTLTKFDENGNNTWTGFIFEVKNDEIKIVNKVNYYTPNDVRNARFEGNPNVSDEN IRQYLDSPSCDLDNEELIGMSGIDVNDYDYSGGGVSDASSITIRGGNNALFGDGAVDYNK IEFNYKDHYHLLKYMNIQTAYSQEDNCLLLSKLLNNPNSIYKDSDGYKLIRYNSAQDLEA KTVSIYWIATSQFSEEFRGYTERKVISEGTFE >gi|223714077|gb|ACDT01000138.1| GENE 20 15845 - 17368 1236 507 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733190|ref|ZP_04563671.1| ## NR: gi|237733190|ref|ZP_04563671.1| predicted protein [Mollicutes bacterium D7] # 1 507 4 510 510 821 99.0 0 MSIKTILNRIKISLLISCVAASGTAIFQDYKGTNISAASYNSLSAADRAAFNTTGANMGQ VTNVTSTKSTSIEGKDNINGYQWTAWFNENSKNFTIDKTQTNGTYTVGVPAKTDSNQINA NMTQAGYYKYLSDPVYSEIRLKAYERFNWTEEWDTSSTSKDHTEWKYFRDKGSLEDKADD KHNFSHAINEINEKMDYTLTSSQKKQLNKKLWAMAENGQTVTSDFFHNNGMNGSSIGGSD NTITNGNQSIQYSLKVEVYGRRQYRWKVTKITTTTYHHVANKSATIEIANTVVAQNVSAT ELSKYVTFTLKSKNGGTGAVNTQVVKSNGGWWGSNHPAYHQGNNTTTDANGTYYYWSNSG SGSNGVNLRNVIPIKEKLYSFAGIPTNLGKDNKYTLTINPNGGRYNGNTGTTLVKQITGT SYTVNTPLRNNYLLSGWDFSGSGTWNPNNQKYTFGTGDGNLTAKWVEKTIDTDPENPNNI NKKYTLTIDPNGGKYNNKISKTTIIKK >gi|223714077|gb|ACDT01000138.1| GENE 21 17476 - 17871 300 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733191|ref|ZP_04563672.1| ## NR: gi|237733191|ref|ZP_04563672.1| predicted protein [Mollicutes bacterium D7] # 1 131 1 131 131 231 100.0 9e-60 MYRFYSDATLQAQWIKKGQSNSGTVIIDPNGGNYDGSTNKVTISGNAGSTTNIKTPTNTG YIFKGWEVINIEKNKFNGNTFTFIEGTTTLKAKWEKIKIDPCIVNGTGVDACGTNQDSMP GTWVHLSKDIY >gi|223714077|gb|ACDT01000138.1| GENE 22 18241 - 18408 189 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDFSIKQALTVVVVLVAAVAVVATAITMTNDNNSKTKTQNDGLWNKIQENSNAQG >gi|223714077|gb|ACDT01000138.1| GENE 23 18558 - 20714 1442 718 aa, chain + ## HITS:1 COG:VCA0511 KEGG:ns NR:ns ## COG: VCA0511 COG1328 # Protein_GI_number: 15601271 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Vibrio cholerae # 1 715 1 702 706 312 29.0 2e-84 MELIVIGKNNQHKKFNKNKILSAVQKSAERLNKSLSKEQEDRLIYLVLRNIENLQITEIP 
VSTMHKIVEVSLDDIDKDIATSYRNYRNYKQDFVKMLDNVYKQSVQINYVRLKENANMDS ALVSTKKTLVGNALGKEFYIKEFLNESEIEACEDGYIYIHDMSSRLDTMNCCLFDMDTVL SGGFEMGNRWYNEPKTLDVAFDVIGDIISSTAAQQYGGFTVPEVDKILEKYAQKSFNKYL EEYLSIVQSLLKDTDKKQQFLAKGLQYAYKKTQEECRQGYQGIEYKLNSVSSSRGDYPFV TFTFGLSKSLLGQMISKTILNVHRIGQGKDGYKTPVLFPKLVFLYDKNLHGKDKELYEVY KTALLCSSKTMYPDWLSLTGEGYVSNIYKKYRKVISPMGCRAFLSPWYERGGMYPADEKD TPIYVGRFNCGAISLHLPMIFEKSLKENKDFYEILDYYLEMIRDIHKRTKEHLARKKASI NPLAFCQGGFYGGTLDLNDNIAPLLESSTFSFGITALNELQELYNNKSLVEDGKFAVEVM EHINNKLNQYKKEDHILYAVYGTPAENLCGVQIKQFRNKYGIIKKVSDRDYVSNSFHCHV TEEISPIQKQNLEKRFWDLFNGGKIQYVRYPLNYNIKAMETLILRAMDLGFYEGVNLALS FCDDCGYEQLEMEVCPKCGSKNLTKIDRMNGYLAYSRVKGESRLNDAKMSEIKERKSM >gi|223714077|gb|ACDT01000138.1| GENE 24 20726 - 21088 316 120 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLNIIDNYYIKKYKYGVILLKKRDNVNATQIELDTDYDDTVDEDNSVYDSKLFKRCGYYS DTYTAFIGLLSKIMNEKLENVKSIQEYVEIQKQYVAVLTRSYQQINNIKKDCDGGQLPHD >gi|223714077|gb|ACDT01000138.1| GENE 25 21238 - 23625 1556 795 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 3 611 5 606 709 302 32.0 2e-81 MKLVIAEKPNVSKQLRDAFEPHAKYGGAPGYFIGEKYTFCCALGHLFTFAMPEEINDNYK RYALEDLPLNLPNDIPLKLTNDVTKQYYSVLKKLFKTGNFDEIIVATDPDREGQSIYERI KSQTRNFPKVPESRIWISEWTPSGLIKAMNNKKPNDNYRGLKAASECRAIEDYLIGMNGS RAMTVKFGGANNVISIGSTQTPTLYMIVSREREITNFKAEDYGVPFIELVNDKGELLTLK HTSKRHLSLLEAQTTQKKLLQHQGTIVNIEEKRRTSKPMKLPNATDIQKEMNKKYGFTAD KTSNILQTLYQDKHLTTYPGTKAREISKSSATLSLNTIKNLLKNNVMPKETEKILENNWE IAKHCIANEGLAHEAITPVFGQINQKDIDSLTQDERKVYDEIVKRFVQAFFPEAVFSDTI ITTTIDSENFIAKGSIIIQKGYKEIMGVGEDNLLPAFVNGKKYPISKVGIEEKQTTPPPR YTTSTLLEAMENASRFVDDKHYAKILSSKEVNGLGTDRTRSDILNKLLQRNYYEFKGKTI YPTQKAIDLFKVLPSSGELDSPIHKAKMEEKLALVELGKLSKKEYLQEVYQDINSFVQAV KDARQQVIANNMTSSKIKCPECGGGLLVSSKVVKCQHCNFILFPIVASKKLSETNIKDLI EKKETKLIKGFKSKTGNSFDAKLKLVKANDKWAVQFDFTKQILGKCPRCGGDIVEGVKGF 
SCENKNCEFFIFKDNKLLAKSKKKVTFSMVKKMLKDNGTGRVKVTGLQSSNGQTYSAFFS LDDTGKYINLKMEFK >gi|223714077|gb|ACDT01000138.1| GENE 26 23992 - 24888 615 298 aa, chain + ## HITS:1 COG:PM0849 KEGG:ns NR:ns ## COG: PM0849 COG4962 # Protein_GI_number: 15602714 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Pasteurella multocida # 1 206 110 319 425 93 32.0 4e-19 MSNEDLESLAYRIGNINNAQFNNVYPILEAELPALRLQFVHNSIAKSGTSLSLRKTPITA RINAASMKEDDYCSEECLNFLIAMIKAKMNSVICGLTGSGKTELAKYLAGHINRWERIIT IEDTLELHLLTIYPNKDVVELQVNNIVDYDGAIKSCMRMKPTWILLSEARGYEVKELIKS ISTGARIITTIHTGDARDIPKRLLNMFEDNELSNDKVETMIYDYIDVGIHIQCDYTEGST HRYIDQLVVFYTDENNNKCRDLIYEVKNGEITYKPIPKYVLNKFQKENIPPFKWGCDE >gi|223714077|gb|ACDT01000138.1| GENE 27 24904 - 25686 624 260 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733195|ref|ZP_04563676.1| ## NR: gi|237733195|ref|ZP_04563676.1| predicted protein [Mollicutes bacterium D7] # 1 260 1 260 260 468 100.0 1e-130 MKGFFDLKGLIQTIEGFGYHYSGIDFFKSIGIYTIVIAALAYLHNLSWLYIALLLLTVVV MFPFMVHSQYYFLNEQKRFQELCTYLKQMIINFKTYQKIYLALKETRNAYDEKSQMYKAI TRAITAIDEGKSFRTALDEIEKDFYNSYVNQLHSYLILGEQEGGEIVYESLTHIDFQEWQ TDTYSFQIQKEKLKKQNGIFAAMAICMTLFVFHFFPAEIMDPLLAQQAYQIYTFIYFEVI LLSFICVRAFMTGKWIKADE >gi|223714077|gb|ACDT01000138.1| GENE 28 25705 - 26946 721 413 aa, chain + ## HITS:1 COG:no KEGG:BMQ_2765 NR:ns ## KEGG: BMQ_2765 # Name: not_defined # Def: hypothetical protein # Organism: B.megaterium_QM_B1551 # Pathway: not_defined # 1 115 1 100 115 64 33.0 9e-09 MNSYLSMLINIFNYKQSTSRNDFWSAVLINFVVSFGILLINFGFYMFFISSPNTSPIIKM VYLTINLFIILYLSLATLSMSIRRLHAINKSGWWILTNCIPVIGFFIYLYLMAQPDAKNI DYFDEFTHYYSSKQKNMHNELDKAIEQSNIQILEESSRKENSLGHKLTAKDKIISYTTET DLNSLIFILENETTSSYVEKRLKKSLYIFLGLAGIVAVIGVANLLLNLHLAINYPLFIIA SVIISYGIYKIDYFMIKSVFKNQQKSVKDAFPLWMSTLEVLIVTNNIPNTLRKSLPSCPK PIVKDLERFIARLDVNPVDKEAYKDFLSQYNIPEIQEIVLDLYQFNFVDKNNISTEFAAL HKRLNRISSDTRKRRQESDTFLIGALNSLPLMVVSLYILMISNLLSSAIMGNM >gi|223714077|gb|ACDT01000138.1| GENE 29 27104 - 
27268 119 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEIGLEEYLELIVSLIAGSCALLFFLGFLTNSNLGYSLQNIIDFFISTVTMIGA >gi|223714077|gb|ACDT01000138.1| GENE 30 27268 - 27678 385 136 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733197|ref|ZP_04563678.1| ## NR: gi|237733197|ref|ZP_04563678.1| predicted protein [Mollicutes bacterium D7] # 1 136 1 136 136 226 100.0 3e-58 MSNLSQYFSGKLIISVIVVSFVFASLFSLKNRWTMYADSTIETQKASETSDVNDNLRKYS APVIEIINGYIDYGSSNYDLKNFVKAYDADTKEDLTDKVKIYGSVNTKEYGVYKIHYVVT SSHGIKADKYGQVIVK >gi|223714077|gb|ACDT01000138.1| GENE 31 27698 - 28048 344 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733198|ref|ZP_04563679.1| ## NR: gi|237733198|ref|ZP_04563679.1| predicted protein [Mollicutes bacterium D7] # 1 116 1 116 116 183 100.0 2e-45 MKQAFEYFIEAVMVTIIIAIIVGAMNITSQIREANEFHNLAIAEVEASDFDDSIIKKYAT GEVNPNVITEFKNASVTTDSSVDYASERIYKVTTTYKISLPILGYVKTTTISGYAR >gi|223714077|gb|ACDT01000138.1| GENE 32 28038 - 28397 178 119 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733199|ref|ZP_04563680.1| ## NR: gi|237733199|ref|ZP_04563680.1| predicted protein [Mollicutes bacterium D7] # 1 119 1 119 119 214 100.0 1e-54 MLDKNKITILSTAIICIFFLTACTQTANYNKTDDTSSLLPVGSVVKVNDLGNMLIISIGE IDSNNQLYDYKGCSYPEGCTGNTYLFNRDDIISIKHKGYNSKEEVNYIKNIEKKMEELK >gi|223714077|gb|ACDT01000138.1| GENE 33 28397 - 29176 735 259 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733200|ref|ZP_04563681.1| ## NR: gi|237733200|ref|ZP_04563681.1| predicted protein [Mollicutes bacterium D7] # 1 259 1 259 259 410 100.0 1e-113 MDIIHEKALATILFSIVFIILIGAATPYIQQQISQTNEITDESLKYSKPALDGTNSDFDV DNIPMDDVVTSYARGDVNMDGEITTADYELILKHIAGNITFNEEQLYLGDVNSDGKIDNN DAERIKKMCEASTPSQYEKGDINRDGKLDLDDLNLLKKYLNGKVTFVLEQKKLADINNDG AIDSKDASLLNTMIGRINYEPGDVDRNEKITLDDAKLILDFLSGDAKLDEEQLELADIDK DGEVTSKDAQKIMALAKNN >gi|223714077|gb|ACDT01000138.1| GENE 34 29183 - 29641 228 152 aa, chain + ## HITS:1 COG:FN0312 KEGG:ns NR:ns ## COG: FN0312 COG0602 # Protein_GI_number: 19703657 # Func_class: O Posttranslational modification, protein turnover, 
chaperones # Function: Organic radical activating enzymes # Organism: Fusobacterium nucleatum # 1 152 1 158 168 140 44.0 1e-33 MNYHNITKDDMLNGDGLRVVLWVAGCNHRCRGCQNPQTWDINSGIPFDNTAKLELFAELD KEYIDGITLSGGDPLHPQNLQSTTKLLKEIKEKYPNKTVWIYTGYTYEQIQEKELIKYID VLVDGPYIKSRRDLNLEYRGSSNQRIIRLSKN >gi|223714077|gb|ACDT01000138.1| GENE 35 29978 - 31237 172 419 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145223395|ref|YP_001134073.1| NLP/P60 protein [Mycobacterium gilvum PYR-GCK] # 261 417 223 347 348 70 37 1e-11 MRSRKFILGILLGGLLITQSEPLEISAINFETQEDKYMKLCASSSLTSSQQNTCKQFNNY LSKKNKELKKEIKQNESSIQTTQSSIDTIKEQINNLEHEISEKETEVQYFATMIQNKENE IKEKETEVKDRMYSMQTVNNSNSFVDYLFGANNFTDIFSRVSNINELTSYDNELIKEIYD NKKELEKQKNTISTAKKNLETQREQQYQLQIKYNELLNKQNDTLTANKEKSEEISEAQKE IDNNLTAIFEAAQKYDKPSTSIPPINGPTTQTGLAIANKALSKQGSRYWWGAPGGGFGDG QGLDSPNAIYFDCSGLIAWAHRQAGVKIGRSTAAGYSRSGSGVSYSNLQIGDVITFNYGS GVAHIGIYIGVVNGQKSFVHASGKGSSTRGQYADQCVKVSSIEPGSYYYKYIYNCRRLY >gi|223714077|gb|ACDT01000138.1| GENE 36 32440 - 32712 98 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733203|ref|ZP_04563684.1| ## NR: gi|237733203|ref|ZP_04563684.1| predicted protein [Mollicutes bacterium D7] # 1 90 1 90 90 105 100.0 1e-21 MSIQHLKSKFDPIYILSITILIAMGISALLNNLLVLLFVGFCGIILLLNHLIFSLRNTSI NQQIPKTPKKPKKIKKVIEVLVDEDGNEIV >gi|223714077|gb|ACDT01000138.1| GENE 37 35134 - 35601 96 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733204|ref|ZP_04563685.1| ## NR: gi|237733204|ref|ZP_04563685.1| predicted protein [Mollicutes bacterium D7] # 1 155 1 155 155 244 100.0 2e-63 MCIVTLYGQITYQKINYQKTRYTVRVTFAVKKHANTDIKHQIYAKSTLKYIRLNNITNMG VKLIKSLIKKTQYKIVYKTSDTSSELLFLTKCQNNHKVTWNYKYNFRNISKFEDDKDSTK LLYLLMTLMNKLNSNEISFWELSSLLPKNIFLIKL >gi|223714077|gb|ACDT01000138.1| GENE 38 35633 - 36103 155 156 aa, chain + ## HITS:1 COG:no KEGG:bglu_2p0400 NR:ns ## KEGG: bglu_2p0400 # Name: not_defined # Def: hypothetical protein # Organism: B.glumae # Pathway: not_defined # 1 118 7 119 319 82 32.0 3e-15 MSIELKELWKNEALNFTPWLAKSGIIEEIIDELNIFNNPLKLYKTEHKIGNYRIDMTYQT 
LDKKNSLIVENQFGLTDSKHLGQNIIYSSLTNIPNILWIADNISTEHKKIDEILKINIIL CSVKIHKVSDGYKLVFSIVNKNKDFIFTLDNSLNRI >gi|223714077|gb|ACDT01000138.1| GENE 39 36163 - 36858 626 231 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0875 NR:ns ## KEGG: EUBREC_0875 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 227 6 244 266 150 35.0 4e-35 MIYITGDTHGDFTKIDFFCKKMNTTKQDIIIILGDAGINYYLNEKDEKIKRHISNLPITL FCVHGNHEARPEIISSYELSTFFGGQVYIEKSYPNIIFAKDAEIYDIKEKKTLVIGGAYS IDKDYRLTMGYKWFPDEQPTKELKNKCMDIVNSKKIDIILSHTGPKKFEPVECFLPFVDQ TAVDKTTEEFLDKLEAVKNYNQWYFGHYHTDKKLLADNKEINILFNDIKEF >gi|223714077|gb|ACDT01000138.1| GENE 40 36860 - 37102 288 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733207|ref|ZP_04563688.1| ## NR: gi|237733207|ref|ZP_04563688.1| predicted protein [Mollicutes bacterium D7] # 1 80 1 80 80 145 100.0 9e-34 MEKDYFKDRPIESTKIKYVYLGQKVFICEKSAQKYAKRLNDLTPGTVIDILTRRNHPRGI KVKIKTPDGKIAIGRIVYFV >gi|223714077|gb|ACDT01000138.1| GENE 41 37122 - 37388 355 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733208|ref|ZP_04563689.1| ## NR: gi|237733208|ref|ZP_04563689.1| predicted protein [Mollicutes bacterium D7] # 1 88 1 88 88 145 100.0 6e-34 MQDFTKLQTLCEKHKVITDIIHELEDIFTNDEKFIINTSLGNKKTKLGVTTSDEATVKAL LAGFKVVETEISKEIGSFLGTASSNYNY >gi|223714077|gb|ACDT01000138.1| GENE 42 37588 - 38826 1086 412 aa, chain + ## HITS:1 COG:MT0027 KEGG:ns NR:ns ## COG: MT0027 COG0791 # Protein_GI_number: 15839398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Mycobacterium tuberculosis CDC1551 # 301 398 176 268 281 103 48.0 8e-22 MIKKNLILLSLILTMSANTTPIFANSINEKLSNNTVYFDEISIANKTLDTLKKQQNKQLK EIESKQVEVDTLFYNKSSAEKNKKETKKRAEKATEDYNKTLNKLENAKESFTNNQNIILH LKNKITQNPTEDTIKEISTLKFKQIKQAKIVTDLESSVNTQAVVTKLTTEQTKKNEYKID NLQKEYESKSTELSDSFENKAEIKQEINNINITVQDSQKKALELKKNSFDLDEESRKEKE EKYRLPTDEELKLIIAENERRKAEEAEAARLAAEEAARIEAEKKAQEEALKKAIGQSIAD 
AALSKIGAPYVWGAQGPNTFDCSGLVWWACKQAGIYFDRTTAAGLSQMGTQISYSQLQPG DIITMNTLGYTSHVVIYIGNSQVVHAPQTGDVVKITSITPTKWNIVNCVRLY >gi|223714077|gb|ACDT01000138.1| GENE 43 39015 - 39650 456 211 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733210|ref|ZP_04563691.1| ## NR: gi|237733210|ref|ZP_04563691.1| predicted protein [Mollicutes bacterium D7] # 1 211 1 211 211 381 100.0 1e-104 MEVYTLKKLLMLLFACILLTGCFSSPQNDAITDFSSIPKTKFNYKNLKNRADMSAYYTLE DENHAFTDMSLETFYNKLTTSKKTLSGAFYFGYDDCPYCKQAVPILNYVAKEYNATVSYV NIFSSRYDENGNKLENGDKYYTELQAYLDEYLDKEDKTIYVPTVIFIKNGQISLYQLGLD TDENFDIKKGLNIKGKNNLAEIYRKGFNSIK >gi|223714077|gb|ACDT01000138.1| GENE 44 39667 - 40152 467 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733211|ref|ZP_04563692.1| ## NR: gi|237733211|ref|ZP_04563692.1| predicted protein [Mollicutes bacterium D7] # 1 161 1 161 161 250 100.0 2e-65 MFEIKKSYYIIFVFVVALVLGYFSISIETPKNNYDDVAAVEDKNETDTSTTDDINKSEDK TNIKEDSNSNTVDTTITEQEKETAKETLGDDIEISDIQKGTQINLVNEPLYVSSKTSIVS TYKTGTYYTWGFEANNRIAITNKLENIGVKGQVTGWINKPS >gi|223714077|gb|ACDT01000138.1| GENE 45 40372 - 40791 352 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733212|ref|ZP_04563693.1| ## NR: gi|237733212|ref|ZP_04563693.1| predicted protein [Mollicutes bacterium D7] # 1 139 1 139 139 247 100.0 2e-64 MDNKFFDFSIYTDNELLKLATINELNTDFNCVQLCLEYAKFDELKQKELDKIISASVSLV LLFEEELKKRNILKKYRESIQHPDYSEGIFYTYNSYSFRIAHLMSICNSGKVTEDSLKQF ICTIQNSQANTDKIPQIIN >gi|223714077|gb|ACDT01000138.1| GENE 46 41217 - 41405 171 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733213|ref|ZP_04563694.1| ## NR: gi|237733213|ref|ZP_04563694.1| predicted protein [Mollicutes bacterium D7] # 1 62 142 203 203 107 100.0 2e-22 MHLHFSDIKTYKNVFYRKNGENKIETFEIKEDFISIFLEDSFAYDLLENLRQATGKKIII IK >gi|223714077|gb|ACDT01000138.1| GENE 47 41636 - 43399 1095 587 aa, chain + ## HITS:1 COG:CAC2232 KEGG:ns NR:ns ## COG: CAC2232 COG0608 # Protein_GI_number: 15895500 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: 
Clostridium acetobutylicum # 39 478 47 491 587 201 32.0 3e-51 MKLNIKSNSNIHFTKNNFLNSYLNNFIDNENTYELIKYPCFEHDPRLLNDAQSAAEYILK TDKKIFICGDYDCDGIMATSILFNTIKQFNSNVNFMIPDRIIDGYGLNIRMIDVIKTQVP PNEVLILTCDNGIACFEAIDYAKSLGMTVIVTDHHTVGEKLPNADFIVHPALGNYPFKEI SGATVAYKLCQLIIEMGGLNIPNLEISNKQFAAITVVSDVMPICSYYYDTMKVNENRRLL QEGITSMRNNPDWHIKMLSEMCNFNMETLDETTIGFNIAPVLNASGRIAKADYAVDFFTK EQKDRDAAIVDCSYLVYLNEERKKMKIAELNKAESVVCGKSIKLAFYPELHKGLIGIVAG QFTDKYKVPSGIFTQVKKDGTVLWTGSMRSNTVHLYNALTEISKKCPIVAFGGHAGAAGI TIADENIDLFKQIAEKYFYTNACEPANYAIEVKDYRSVLDAGECIKELKPFGNGLPKPQV KFNFFCTSVDVFYKSGHVKLSNYKQEELWLFGKRDEIKKHPVLNTLPLKSDNMQKLQETM NKQEAEKNKWERYSQYKKYYNFIIEAEIDYSCFNNQLGTQISVKKFK >gi|223714077|gb|ACDT01000138.1| GENE 48 43414 - 43917 478 167 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733215|ref|ZP_04563696.1| ## NR: gi|237733215|ref|ZP_04563696.1| predicted protein [Mollicutes bacterium D7] # 1 167 1 167 167 307 100.0 1e-82 MKKRFFFDLDGTVCEYKFTSIDSFYEDGFFKNLIPFENIVNTIKQLLKDGNNEVYIVSKY LDSKYAINDKNTWLNNYLPEINENHRIFIPYSAKKTDFIPGGIRYDKTRIDILIDDYNDN LFEWIKAGGFAVKMINGVNSLDSWKNKLYLHANIPAVVNLSILESIY >gi|223714077|gb|ACDT01000138.1| GENE 49 43970 - 44170 161 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733216|ref|ZP_04563697.1| ## NR: gi|237733216|ref|ZP_04563697.1| predicted protein [Mollicutes bacterium D7] # 1 66 1 66 66 107 100.0 2e-22 MEATVTLFLILWASLIEALIDVLPIVFFSLSFIIVAYLVNNLINDFIKRKNNSTHKEEAH ENKSRK >gi|223714077|gb|ACDT01000138.1| GENE 50 44148 - 44303 171 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTKVENEKAIISWKILKKFIENSKSKKLTKETRDELLNNVETIILKKEFN >gi|223714077|gb|ACDT01000138.1| GENE 51 44303 - 46516 1024 737 aa, chain + ## HITS:1 COG:all7071 KEGG:ns NR:ns ## COG: all7071 COG0507 # Protein_GI_number: 17233087 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Nostoc sp. 
PCC 7120 # 51 720 53 728 748 262 28.0 2e-69 MKLSLQNSHTLEKIRGYVTYVFYPKSKLMKDTVQFVGFSIKSGKKNISCWGTMSNIAQGD YFELEGFYQKDGSFKFKSALRVDDDELGATSMLSFLFGPKTAAMLINYFNSPIRCINLFK NNQTDFELNALNIKGIGQKKLEKAYTKYENNIAVDVLYARFSKYGLTINQALKIFSTFGT DCLKRIESNPYILTSIDIPWKCVDFIALNYYKISCIDDNRLFAGIISSLKSIRNRGHVYI NLDSSDYNLLNLATQKLEVDISYIKEAIFKLQKEEKIVITNYNNYKIVYMRKMYDAEQNV SKLCAKLVSNSKLKKRDKNRIHHYIDNYEENHFKLADKQKEAIESALSNTFSIISGPPGS GKTTIIDVICSYFKYYNKNIRINLCAPTAKAAKRMFDSTGIQASTIHRLLEFNPADGEGG FKYNEYNPLKTDILVVDEFSMVDLLMTENLLKAIPETITALIFVGDINQLPSVDAGRVLE DMLQSKIPSTLLNKIYRQQEDSTLLQKALDFSKEKSIELADTKDFFFYSEKNELAIQEGV VNLFISEVEKYGIENVALLIPQNEGTFGVNTLNCLIQDKLNPKLFDTTPELRSGSRKFRI GDRVIHTVNEDNHNVFNGMVGTITDIEIGDKDFDTEDTIIVDYGEDELSEYHRDRFDNIK LAYAMTIHKSQGSEYKSVIMILHMANRFMLTKKLVYTGMTRAKSYLHLVGDEYAVNYAMK RNIPPRNSRLDILLKTI Prediction of potential genes in microbial genomes Time: Thu May 26 10:42:59 2011 Seq name: gi|223714076|gb|ACDT01000139.1| Coprobacillus sp. D7 cont1.139, whole genome shotgun sequence Length of sequence - 5332 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 43 - 102 10.3 1 1 Op 1 . + CDS 280 - 2337 1311 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) 2 1 Op 2 . + CDS 2330 - 2563 259 ## gi|237733220|ref|ZP_04563701.1| predicted protein - Term 2531 - 2565 1.1 3 2 Tu 1 . - CDS 2566 - 2823 323 ## gi|237733221|ref|ZP_04563702.1| predicted protein - Prom 2975 - 3034 11.4 + Prom 2903 - 2962 12.9 4 3 Op 1 . + CDS 3099 - 3668 384 ## GYMC10_5352 transcriptional regulator, XRE family + Term 3673 - 3719 8.0 5 3 Op 2 . + CDS 3729 - 3986 153 ## gi|237733223|ref|ZP_04563704.1| predicted protein 6 3 Op 3 . 
+ CDS 4007 - 4468 161 ## Acid_2221 peptidase A24A, prepilin type IV + Term 4523 - 4568 5.4 Predicted protein(s) >gi|223714076|gb|ACDT01000139.1| GENE 1 280 - 2337 1311 685 aa, chain + ## HITS:1 COG:BH3606 KEGG:ns NR:ns ## COG: BH3606 COG0653 # Protein_GI_number: 15616168 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Bacillus halodurans # 2 679 37 772 838 586 44.0 1e-167 MLPDKVLQENTSKYINDFKQNGYNDTLLINSFANIREAAFRTIGLKAYKVQLMGAIILFN GDIAEMKTGEGKTLTSIFAIYLAVITGNSVHVVTTNEYLAKRDMELNSKVFDFLNISTAV NLKNLSIEFKKNAYQHQVLYSTHSELGFDYLRDNLCKYSENGLLQDNRVQSSHDFIIIDE ADSILIDEAKTPLILSSLEDIEKKQYEKPNNFVQTLIKNEDYEIDYTSFNIYLTEKGNYK AEKFFNISNLYLPQYSPINHRILQALRANFLIKKDHDYIVEDNIIKLIDKSTGRIMNGKS YTNGLHQAIEIKEHLKVTPENQINATITYQNYFRLYKKIGGMTGTAKTEEEELKQVYNMS VRVIPTEKPCIREDDTDVIYKTIKMRNTALCNEIIMRHNNNQPILIGTLSIEDSEIISNL LNNLKIKHNVLNAKNHQYEASIISEAGIAGNVTVATNMAGRGTDIKLDEEAKKAGGLAVL GLGRHESRRVDNQLRGRSGRQGDPGYTKFFLSLEDDLMIRFGLSKIKSMNLNVFDETKPI KSKILSKTIESAQKQIEGINYDQRVSILKYDSINSIQREQYYILRKEIAKIKELKDIIRI FKLDNISINNSENFTDIKKIIEKIYNIFDKNWTKYLSDMNIIRQGITLRQYGKLNPIEEY EKEAYEMFDLMMNTINNEVVRLINV >gi|223714076|gb|ACDT01000139.1| GENE 2 2330 - 2563 259 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733220|ref|ZP_04563701.1| ## NR: gi|237733220|ref|ZP_04563701.1| predicted protein [Mollicutes bacterium D7] # 1 77 1 77 77 133 100.0 4e-30 MSSKKLPLITLRKTRLKNNLRQKDIAYVLGCTTQYYSQLERGINVLSYDYATRLALMFNT TPDELFFEEYKKVYIKM >gi|223714076|gb|ACDT01000139.1| GENE 3 2566 - 2823 323 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733221|ref|ZP_04563702.1| ## NR: gi|237733221|ref|ZP_04563702.1| predicted protein [Mollicutes bacterium D7] # 1 85 1 85 85 145 100.0 1e-33 MGKKEIISRIIGTGFNEEYQKNSSITLLQCALENYNLKYEQISEEITKLCGEDEFAIELQ KFFVNNKHLRYSAINKLNSLCAKYY >gi|223714076|gb|ACDT01000139.1| GENE 4 3099 - 3668 384 189 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_5352 NR:ns ## KEGG: GYMC10_5352 # Name: not_defined # 
Def: transcriptional regulator, XRE family # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 1 61 1 61 122 68 49.0 9e-11 MNTGEKIHELRKNKGLLQEELGTLIGVDTSVISRWESSQRQIKIEDLTKLSDLFEVSVDY LAKDNVFYSANIPLIGYFEEGKSLQTHIEKKKLLIFIAPPHNPWEVDIFKILNNNPIYKV SNGDFCLIKKQINSIEIGDIYLISDNDKPYLAHSVLENGKIGFKYNNQVSFNASIIGKFL SIVHPLSEM >gi|223714076|gb|ACDT01000139.1| GENE 5 3729 - 3986 153 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733223|ref|ZP_04563704.1| ## NR: gi|237733223|ref|ZP_04563704.1| predicted protein [Mollicutes bacterium D7] # 1 85 1 85 85 126 100.0 5e-28 MEKISVYKTSKYFMIISILFSFFAGTYETSKNYFYANNVAKLNSNISKVDSDTASYKSEI MEITNFAKLIDLMKLKGYTYFPDTH >gi|223714076|gb|ACDT01000139.1| GENE 6 4007 - 4468 161 153 aa, chain + ## HITS:1 COG:no KEGG:Acid_2221 NR:ns ## KEGG: Acid_2221 # Name: not_defined # Def: peptidase A24A, prepilin type IV # Organism: S.usitatus # Pathway: not_defined # 17 136 13 136 184 62 31.0 6e-09 MEIFNSNSFHSTLAPFILLVILIACGYSDVKNKKIPNILTISGILFGVIFNTYLYGYTGL LSSLISIVVIFVLFIVPYLLKQFGAGDIKLLITVASIMNFYYCIGATIMSSIFCAIYAII NSIRLKSFKFTIPFGLFMCIGTIFYQILIYIFY Prediction of potential genes in microbial genomes Time: Thu May 26 10:43:20 2011 Seq name: gi|223714075|gb|ACDT01000140.1| Coprobacillus sp. D7 cont1.140, whole genome shotgun sequence Length of sequence - 4322 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 144 - 203 5.8 1 1 Tu 1 . + CDS 274 - 738 358 ## COG1882 Pyruvate-formate lyase - Term 1995 - 2049 17.3 2 2 Tu 1 . - CDS 2273 - 2896 529 ## gi|237733088|ref|ZP_04563569.1| conserved hypothetical protein - Prom 2934 - 2993 12.4 + Prom 2864 - 2923 8.1 3 3 Tu 1 . + CDS 3052 - 3441 295 ## gi|237733089|ref|ZP_04563570.1| conserved hypothetical protein + Term 3663 - 3701 5.0 + Prom 3652 - 3711 9.5 4 4 Tu 1 . 
+ CDS 3774 - 4320 310 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair Predicted protein(s) >gi|223714075|gb|ACDT01000140.1| GENE 1 274 - 738 358 154 aa, chain + ## HITS:1 COG:Z1046m KEGG:ns NR:ns ## COG: Z1046m COG1882 # Protein_GI_number: 15804975 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli O157:H7 EDL933 # 11 107 536 643 810 68 38.0 4e-12 MNNHCGYLFNSILYDNCIDRGKSLLNGGIDYLGGTLETYGNVNSGDSLYAIKKLVFDDKE IRYKDMVDAIRNNFIGYEEIREKCLNVIKYGNDNPEVDEMINELNQYVAVTTKSKSEKYG LSSYLIVIIIKYPFTRGCFTNGFNIRIIRVVLTK >gi|223714075|gb|ACDT01000140.1| GENE 2 2273 - 2896 529 207 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733088|ref|ZP_04563569.1| ## NR: gi|237733088|ref|ZP_04563569.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 207 1 207 207 345 100.0 7e-94 MRDEKVMKSIKIIIALLMIFVCIPVSNISALESTDNSNIIIYTPTEEEFRAAEEIEIAKM KAYFERTKESYAYDYRKVGDPQTSSFSPYKNAYDQPAGGTIFNSENSGFYWSDSSRSPGT WSLGISIAGKYLSVDVAYSPGVVSGSSGGVFVGISSTQVGKAIKLKVARNYKVQKYDVYR KPQYGGSWTYLSSYGAPTKYQSRFIIE >gi|223714075|gb|ACDT01000140.1| GENE 3 3052 - 3441 295 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733089|ref|ZP_04563570.1| ## NR: gi|237733089|ref|ZP_04563570.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 129 1 129 129 184 100.0 1e-45 MKNKYKIIVIVVIVFLLSSFYFGYSLGIDADKPNVIDKISGTYQLFKGNETLTVTIDDQD NLYYQNGFENESIKIKTQINKLYENLYLFENEKLSYNLILFNDNFLLLININDQTYDKLE KVSNTQVFL >gi|223714075|gb|ACDT01000140.1| GENE 4 3774 - 4320 310 182 aa, chain + ## HITS:1 COG:SA1196 KEGG:ns NR:ns ## COG: SA1196 COG0389 # Protein_GI_number: 15926944 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Staphylococcus aureus N315 # 5 180 9 148 420 85 31.0 5e-17 MNAGNRIYMAIDLKSFYASVECVERELDPLNTNTNLVVADSSKTEKTICLAVSPSLKQYG IGGRARLFEVVQKVKEINRKRKKDNRYREFRGKSHIDSELKNDTSLELGFIIAPPRMAFY 
IDYSKKIYEVYLKYTSAEDIHVYSIDEVFIDAMSYLKTVNTSPREFAKMIIQDIYKTTGI TA Prediction of potential genes in microbial genomes Time: Thu May 26 10:43:38 2011 Seq name: gi|223714074|gb|ACDT01000141.1| Coprobacillus sp. D7 cont1.141, whole genome shotgun sequence Length of sequence - 4581 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 5, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 24 - 83 3.5 1 1 Tu 1 . + CDS 163 - 477 276 ## gi|237733091|ref|ZP_04563572.1| conserved hypothetical protein + Term 488 - 544 9.2 - Term 466 - 536 13.1 2 2 Op 1 . - CDS 543 - 1022 257 ## gi|237733092|ref|ZP_04563573.1| predicted protein 3 2 Op 2 . - CDS 1009 - 1380 186 ## gi|237733093|ref|ZP_04563574.1| predicted protein 4 2 Op 3 . - CDS 1373 - 1816 216 ## gi|237733094|ref|ZP_04563575.1| conserved hypothetical protein - Prom 1973 - 2032 8.8 + Prom 1764 - 1823 10.2 5 3 Tu 1 . + CDS 1844 - 2557 507 ## gi|237733095|ref|ZP_04563576.1| conserved hypothetical protein + Prom 2954 - 3013 10.8 6 4 Tu 1 . + CDS 3089 - 3259 176 ## gi|237733097|ref|ZP_04563578.1| conserved hypothetical protein + Term 3461 - 3502 4.2 - Term 3449 - 3490 8.0 7 5 Tu 1 . 
- CDS 3492 - 3893 419 ## gi|237733098|ref|ZP_04563579.1| conserved hypothetical protein - Prom 3943 - 4002 4.3 Predicted protein(s) >gi|223714074|gb|ACDT01000141.1| GENE 1 163 - 477 276 104 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733091|ref|ZP_04563572.1| ## NR: gi|237733091|ref|ZP_04563572.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 104 1 104 104 206 100.0 4e-52 MNKYLVFRCGKCIWKNTKSHELIGVEMAEDIDDVFEDIQQDILDDLKGMKDYSYPVYRVE ADYIDGAADTSKYDYQMLGVVKPERTAPKNHTCWHGIIETELTE >gi|223714074|gb|ACDT01000141.1| GENE 2 543 - 1022 257 159 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733092|ref|ZP_04563573.1| ## NR: gi|237733092|ref|ZP_04563573.1| predicted protein [Mollicutes bacterium D7] # 1 159 1 159 159 245 100.0 6e-64 MRRSKENEKIFASFLTVLTLLGTTPVFALNEESNTQIVERFEDGSYIEVTIEEISTLTRS NTKKGNKTTYYKNTYDVILWSVKTSGTFTYDGSTAKCTAASVDTQCPAINWKLSNIRSSK SGASAYGYATAKSYNGLGMVLQTINETVKLTCSPSGTLS >gi|223714074|gb|ACDT01000141.1| GENE 3 1009 - 1380 186 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733093|ref|ZP_04563574.1| ## NR: gi|237733093|ref|ZP_04563574.1| predicted protein [Mollicutes bacterium D7] # 1 123 1 123 123 204 100.0 1e-51 MNNVVKQYLKECRRTFPFISKNEKLFFKRLEDNLNDIDENVTYKEICLKYGEPKNVMISY IENCDNEYILKRTSIKSIVKKIFITLFCILIALCATLLIYNQLIYNAYDDSNTIKKVTII EEE >gi|223714074|gb|ACDT01000141.1| GENE 4 1373 - 1816 216 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733094|ref|ZP_04563575.1| ## NR: gi|237733094|ref|ZP_04563575.1| conserved hypothetical protein [Mollicutes bacterium D7] # 8 147 1 140 140 233 100.0 2e-60 MIALYLHMSIATYILYVLNYRLYCDLIILKIGDGLMSRSSNNFFRMEMLLLKIIEIGNGE YYGYKIVQQLQELSDDKIKLAEGVMYPILYRLLDKGYIIDEKRLVGKRKTRVYYKLEPKG KEYLNQLYIDYMDINTSIIKIMEAGNE >gi|223714074|gb|ACDT01000141.1| GENE 5 1844 - 2557 507 237 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733095|ref|ZP_04563576.1| ## NR: gi|237733095|ref|ZP_04563576.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 237 1 237 237 429 100.0 1e-119 
MKGRYVIKKSRKFIVSLIIAVFLVGIFPTNISTVTAMEKEISLDADCNIYSESPYNLLEN IVDLKSGGVLNVNGTSNPLFWEFDNMSTAFDIIISKFDDELNEISSVGNLDTLSLNNWKE YFNVFKEMYFDNIASGDIEDINILNQFFTVCENHEINSQARILCNQENVDIFELINYLPY TSPIIDTINQLPMTRAASFNFNKSTAVSYAKKYAVNPNPSYTTYSTDCTNFASQILR >gi|223714074|gb|ACDT01000141.1| GENE 6 3089 - 3259 176 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733097|ref|ZP_04563578.1| ## NR: gi|237733097|ref|ZP_04563578.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 56 56 111 111 100 100.0 4e-20 MKIEEVHSEEIEKNDECNYYMITQSIIGKPNNMGGTFKVKKIDNFYYVTSYDNMPR >gi|223714074|gb|ACDT01000141.1| GENE 7 3492 - 3893 419 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733098|ref|ZP_04563579.1| ## NR: gi|237733098|ref|ZP_04563579.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 133 1 133 133 235 100.0 6e-61 MLLDFIKSNWKEITTWLALFGVVVEVSPIKIYSLRWVGNRMNADIKKELDNVNENLNKHI TDSEKKEMKNLRFQMLDFADRVQNDSKPSKDAFSHIFDTIQDYHDIIDKYELKNGLIDIE TENLKQAYKKLYT Prediction of potential genes in microbial genomes Time: Thu May 26 10:44:26 2011 Seq name: gi|223714073|gb|ACDT01000142.1| Coprobacillus sp. D7 cont1.142, whole genome shotgun sequence Length of sequence - 7369 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 7, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 658 - 717 3.9 1 1 Tu 1 . + CDS 745 - 1608 889 ## gi|237733099|ref|ZP_04563580.1| predicted protein 2 2 Op 1 . + CDS 1960 - 2172 242 ## gi|167756504|ref|ZP_02428631.1| hypothetical protein CLORAM_02039 3 2 Op 2 . + CDS 2156 - 2401 140 ## gi|237733101|ref|ZP_04563582.1| predicted protein + Prom 2462 - 2521 4.4 4 3 Tu 1 . + CDS 2639 - 2818 118 ## gi|167756505|ref|ZP_02428632.1| hypothetical protein CLORAM_02040 + Term 2856 - 2912 -0.5 + Prom 2849 - 2908 7.6 5 4 Tu 1 . 
+ CDS 3009 - 3182 161 ## gi|167756506|ref|ZP_02428633.1| hypothetical protein CLORAM_02041 - TRNA 4037 - 4110 62.3 # Thr GGT 0 0 - TRNA 4113 - 4199 56.5 # Leu GAG 0 0 - Term 4220 - 4268 6.4 6 5 Tu 1 . - CDS 4270 - 5640 1477 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Prom 5670 - 5729 9.3 + Prom 5581 - 5640 7.3 7 6 Tu 1 . + CDS 5740 - 6165 455 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 6203 - 6233 1.3 + Prom 6211 - 6270 6.9 8 7 Tu 1 . + CDS 6300 - 7367 749 ## PROTEIN SUPPORTED gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein Predicted protein(s) >gi|223714073|gb|ACDT01000142.1| GENE 1 745 - 1608 889 287 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733099|ref|ZP_04563580.1| ## NR: gi|237733099|ref|ZP_04563580.1| predicted protein [Mollicutes bacterium D7] # 1 287 1 287 287 405 100.0 1e-111 MKEKILKNCKKYWYVIAFIIVGFIVLNKEPDYLVLKKNTFKVELGTPFNIKPETYLNTEK LDTDVKKDVLENAKVTVDKDLNNIENFCLDVGKYKVTVSYKNEKETFTYKVEDTTAPVIT GAESIDIVQSTDLSTYDFKVLYTIDDLSQTDVKYDTSAIDVNTIGAYTLKIISEDNYGNK CDKEVAINVVAPLSADEVIVEETVSNADGTTTTKIVTRKKTKAQSGDFRIVTSGDTSKSS NNSSYSSSSSSSNSISTIKPSGSTSSNSSSNNNSSAAKPNNGNLIID >gi|223714073|gb|ACDT01000142.1| GENE 2 1960 - 2172 242 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756504|ref|ZP_02428631.1| ## NR: gi|167756504|ref|ZP_02428631.1| hypothetical protein CLORAM_02039 [Clostridium ramosum DSM 1402] # 1 68 4 71 79 131 98.0 1e-29 MGGFQPTIGKWYDYKNYDIDEPQWRRLLNTAEKLENEFGDLFVYERDGSIGYGLEIISSP MTKGYYEKYQ >gi|223714073|gb|ACDT01000142.1| GENE 3 2156 - 2401 140 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733101|ref|ZP_04563582.1| ## NR: gi|237733101|ref|ZP_04563582.1| predicted protein [Mollicutes bacterium D7] # 1 81 1 81 81 107 100.0 2e-22 MKNINKFKNLLNIFNEFHYVSTKGNKCGSHIHFNRKTLGFNSKEYSSLLSNLNNNIRKAE EVDHKRANKTIENVVAVMELY >gi|223714073|gb|ACDT01000142.1| GENE 4 2639 - 2818 118 59 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|167756505|ref|ZP_02428632.1| ## NR: gi|167756505|ref|ZP_02428632.1| hypothetical protein CLORAM_02040 [Clostridium ramosum DSM 1402] # 1 59 89 147 147 83 100.0 5e-15 MYNIVNVSRNYQGLINLEKLIVYKNNDKTIQMMKNYIENNDIENRKVDIKEVKKTIINA >gi|223714073|gb|ACDT01000142.1| GENE 5 3009 - 3182 161 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756506|ref|ZP_02428633.1| ## NR: gi|167756506|ref|ZP_02428633.1| hypothetical protein CLORAM_02041 [Clostridium ramosum DSM 1402] # 1 57 1 57 57 102 100.0 6e-21 MCIIIVKNSRMDLPDKEILKRCWNKNPHGAGFMYNYNDVVIIKKGFMFLMIFMKICK >gi|223714073|gb|ACDT01000142.1| GENE 6 4270 - 5640 1477 456 aa, chain - ## HITS:1 COG:lin1661 KEGG:ns NR:ns ## COG: lin1661 COG0624 # Protein_GI_number: 16800729 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Listeria innocua # 2 454 4 470 470 319 41.0 8e-87 MIDFKAEVLKIKDQMIDDIKMLCAIPSTQDDSTVAEFAPYGAANRRALDAMLEIGRRDGF EVQDVDGHAGHIDIGTGDETFGILGHLDVVLVNKVGWDSDPFEVITKEGKLYGRGVADDK GPLMAGYYAAKIINSLNIPTKMKTRVIFGCNEELGSNCVKYYFTKMPYPKMGFTPDAAFP VVYGEKAGCGFEITGHVEKGGLIYLCAGSRPNIVPETCEAVIEGNYKQYTESYKKFLNTN NLTGDIEEEGNHTKLILKGKSAHASTPEEGINAVVYLCKYLATVVDNKVVDFVLEYLDDC HGKKLGIDHTGLMGPLTLNLGVISYYKEEVKIVLDLRCPHDMDFDAMVTKFKHACANYEF RETHDLGKALYIDPNSKLITNLHEAYVSVTGDTVNKPQAIGGGTYAKSMPNCVAFGAEFL GEDNLIHGNNENIKIDSLLKATEIYCHALYNLIKAD >gi|223714073|gb|ACDT01000142.1| GENE 7 5740 - 6165 455 141 aa, chain + ## HITS:1 COG:lin1662 KEGG:ns NR:ns ## COG: lin1662 COG0494 # Protein_GI_number: 16800730 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Listeria innocua # 10 129 8 120 137 64 36.0 5e-11 MNNVRFHITVKGIVVLNNQILLMKRIRPSSDGLGYWELPGGGLEYGETPNQALIRELQEE TGLDIIIIKPAYTFTKIRKDYQTVGIGYLCIPKNDHVRLSHEHSDYRFVSIQEAKELLNP EIYNDIIFTIEEYYQNVHDIQ >gi|223714073|gb|ACDT01000142.1| GENE 8 6300 - 7367 749 356 
aa, chain + ## PROTEIN SUPPORTED ## NR: gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein [Symbiobacterium thermophilum IAM 14863] # 3 349 4 354 764 293 43 3e-79 MNEEIIAIASKELEITKKQINAVLSLLEQGNTVPFIARYRKEATGGLDEDQIRNIDKYYQ YQVSLLKRKEDVIRLIEEKGMLTDQLRADILKATKLNEVEDLYRPYKEKRKTKATEAKAK GLEPLSKWILSLPRGELKEEAKKYLNDKVETVEEAIQGALDIIAEVISDDIKYRKFVKDI IYKSGTIETKVKKKNPDENKVYEMYYDYHERVNRIVSHRILAINRAENEKVITVNIVLDK EFLIQYINRGVTRNRNSSVNEYLLKAVEDSLNRLLLPSIEREVRNELTEKASEQALKVFS INLEKLLMQAPLKDKMVLGLDPAYRTGCKLAVVDQTGKVLKIDKVFITIPKDNYDK Prediction of potential genes in microbial genomes Time: Thu May 26 10:44:59 2011 Seq name: gi|223714072|gb|ACDT01000143.1| Coprobacillus sp. D7 cont1.143, whole genome shotgun sequence Length of sequence - 4950 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 980 898 ## PROTEIN SUPPORTED gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein 2 1 Op 2 . + CDS 992 - 1735 796 ## gi|167756511|ref|ZP_02428638.1| hypothetical protein CLORAM_02048 3 1 Op 3 2/0.000 + CDS 1800 - 2234 530 ## COG1959 Predicted transcriptional regulator + Prom 2241 - 2300 8.9 4 1 Op 4 . + CDS 2409 - 3341 842 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 + Term 3349 - 3384 4.4 5 2 Tu 1 . 
- CDS 3955 - 4869 685 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 - Prom 4889 - 4948 9.7 Predicted protein(s) >gi|223714072|gb|ACDT01000143.1| GENE 1 3 - 980 898 325 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein [Symbiobacterium thermophilum IAM 14863] # 13 323 406 719 764 350 55 1e-96 SQLIKDNNLDVDYAIISEAGASVYSASKLAKEEFPDYQVEERSAVSIARRLQDPLAELVK IEPKAISVGQYQHDIAQKQLEEQLDFVVEKAVNNVGVDINTASVSLLQYVAGLNSGVAKN IIKYRDENGKFTNRKEIMNVSKLGAKTYQQAIGFLRIPGANDPFDATAIHPENYQQAKKL LKLIDADASLLGTNEIADKLSNINKEEVMNELNIDNYTLDDIIDSFIKPNRSPRDAYATP LLKKDILKIEDLKPGMQLEGTVRNVVDFGAFVDIGLKNDGLVHISKISHERIKHPLDKLS IGDIVEVYVIDVDVNKHRVGLSMIK >gi|223714072|gb|ACDT01000143.1| GENE 2 992 - 1735 796 247 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756511|ref|ZP_02428638.1| ## NR: gi|167756511|ref|ZP_02428638.1| hypothetical protein CLORAM_02048 [Clostridium ramosum DSM 1402] # 1 247 1 247 247 419 100.0 1e-116 MKKFLVGIVIIIIASIGFNLKPDKRDQAVVVQEEPVVKKEEVITYKELLEYLDIFKTGDI INIENTNYLSSYAIPGRWKHTIFYLGTYQQFSQVFTPEDKYYEEINKHYKNQDEVLVLDS NSTGVKIRTFDQMANLKEESYLKALSGYRFNEDDNFIKTYLSRALDYLGTPYDYSMTTYD DKALYCSELVYYALLANNIEVTKTSSIVEHVVITPTDLSDFLETLEDIDHVYLLEKEDNW INDAIND >gi|223714072|gb|ACDT01000143.1| GENE 3 1800 - 2234 530 144 aa, chain + ## HITS:1 COG:DR2094 KEGG:ns NR:ns ## COG: DR2094 COG1959 # Protein_GI_number: 15807088 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Deinococcus radiodurans # 1 141 46 191 197 107 38.0 5e-24 MKISTKGRYAIRVILDIAMHDDGKYIPLKDIAKRQDLTVKYLEQIISLLNKAGYLQSLRG NAGGYRLSKKPEECIIGDILRITEGDLAPIPCLKDEINNCSRANECITLPFWQGLDKVIK DYVESVTIQDLIDHANMQVNNYSI >gi|223714072|gb|ACDT01000143.1| GENE 4 2409 - 3341 842 310 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 1 308 1 305 308 328 55 4e-90 MTYNNHISELVGNTPILKLNNYVTKNQLKANIFAKLEYFNPAGSVKDRIAKAMLFKAKED 
GILKPDSVIIEPTSGNTGIGLASLGTSLGHQVILTMPETMSIERRNLLKAYGAKVVLTPG GLGMKGAIAKAEELAKEYKNAFIPSQFENQANPNAHYLTTGPEIYQQLEGKIDIFVAGVG TGGTISGIGKYLKEKNPSIKVVAIEPAASPVLSKGTPGPHAIQGIGAGFVPNTLNTDIYD EIITIENEAAFATSRAIAREEGVLVGISSGAALYGATVLAKRIENAGKNIVVLLPDTGER YLSTALVEQD >gi|223714072|gb|ACDT01000143.1| GENE 5 3955 - 4869 685 304 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 1 304 1 305 308 268 44 6e-72 MTIKKSILDTIGKTPLIQIDRFKKACEVNNKIFAKVEFFNPGGSVKDRVGLKLIEQAYED QLINKKTTIIEPTSGNTGIGLAIACAIYGNELILTMPETMSLERQKLLKAYGAKIVLTAG EKGMQGSVDKANELANKIENSYIPGQFVNPSNPLAHEETTALEIIDDFDDNIDYFVAGIG TGGTITGIARILKKKYPDIKIIGIEPKDSPLITKGKAGSHDLQGIGANFIPKILDLDLVD EVITVSTDDAYEAARILAKKEGLLVGITAGAALHGATKITDKNKNIVVLLPDTGERYLST TLFE Prediction of potential genes in microbial genomes Time: Thu May 26 10:45:12 2011 Seq name: gi|223714071|gb|ACDT01000144.1| Coprobacillus sp. D7 cont1.144, whole genome shotgun sequence Length of sequence - 3885 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 61 - 1281 1063 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs + Term 1285 - 1311 0.3 - Term 1273 - 1299 0.3 2 2 Op 1 . - CDS 1301 - 2539 1393 ## COG0205 6-phosphofructokinase 3 2 Op 2 . - CDS 2606 - 3265 576 ## gi|167756517|ref|ZP_02428644.1| hypothetical protein CLORAM_02054 - Prom 3302 - 3361 7.5 + Prom 3279 - 3338 8.1 4 3 Tu 1 . 
+ CDS 3406 - 3828 437 ## gi|167756518|ref|ZP_02428645.1| hypothetical protein CLORAM_02055 Predicted protein(s) >gi|223714071|gb|ACDT01000144.1| GENE 1 61 - 1281 1063 406 aa, chain + ## HITS:1 COG:BS_ydeL KEGG:ns NR:ns ## COG: BS_ydeL COG1167 # Protein_GI_number: 16077591 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Bacillus subtilis # 1 404 44 461 463 224 32.0 3e-58 MAKQLDVSRTTVESAYLQLLVEGYIYAKEKVGYYVDVQFSNQIKNKKNPKVITKIDSHHY RYDFSGRLVDVESFNLDTWKKYIKKALNQSEDLMSYGEPLGEMPLRVALQEYSHEYRGVR HPVNNYIIGAGFQILLYHVCSLFGKDNIVGIEEGGFKQAEAVFHDCHMQVVKLPVDEEGI TLEGLRQANLKLLYLNSSSGGYHGHPIKQQRRLEIIEYARANQVYIIEDDHNGELKFNTK PIDAMSKLDNDHIIYIGSFSKLLLPSIRISYMALPNQLVQLFYEKSRDYHQTASKLEQLA LAMYVEDGQLARHLKRLRKHYRNKSTHLLQKLRTTFPQHKFELYETSLKITMAIKADLVD QYIALAKQNDILVNKNSNNQITLSFSGILDQDIDEAVDRLKEIWIN >gi|223714071|gb|ACDT01000144.1| GENE 2 1301 - 2539 1393 412 aa, chain - ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 1 409 11 416 427 239 38.0 7e-63 MKKRNCIIGQSGGPTIAINASLAGIIHKAKHSEEYENVYGMINGIKGLLEGRYMDLLEEF DTGEKINALKQTPAMYLGSCRFKLPTHIDNQETYVKLFEILTELEITDFYYIGGNDSMDT VMKLDDYAKSINSEIQFIGIPKTIDNDLMGTDHTPGFGSAAKYVASSILEVVHDSLIYQL QSVTIIEIMGRNAGWLTAAAALARNEYNEAPDLIYLPEVVFDKEKFLEDIREVFKRKRCV VIAVSEGIKDKDGHFLDDDSKYAKKDAFGHVLHSGTGKMLESLVFKEFQCKVRSIELNVL QRCAMHIGSKTDLNESFVVGQKAIETALTGKSGQVMGFKRISNTPYEIDYISTDVRKVAN LEQKIPTAWINEAGNDITQELYDYLYPLIQGEVHINYLNGIPAYYSCKHLNK >gi|223714071|gb|ACDT01000144.1| GENE 3 2606 - 3265 576 219 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756517|ref|ZP_02428644.1| ## NR: gi|167756517|ref|ZP_02428644.1| hypothetical protein CLORAM_02054 [Clostridium ramosum DSM 1402] # 1 219 2 220 220 355 100.0 2e-96 
MVKKLFLVGIMLLSLTGCVKGNVDIEFIDESNASLTIEVLIQEDLLDSYGTSLTDLKHKL TNSELSTWENKELKKDINGTQYIGFQLIAPKDINKSLLSFFTTNKKEGTYQVTIDHSTIN NIFNTSEIEDINNYSLTNLKTMGLELNLNIKMPGNISETSYGKIQDNQVKINLLDFLTQD ETKSISIISSNRHQTTQPANIFIFVALIIILYIILRKRR >gi|223714071|gb|ACDT01000144.1| GENE 4 3406 - 3828 437 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756518|ref|ZP_02428645.1| ## NR: gi|167756518|ref|ZP_02428645.1| hypothetical protein CLORAM_02055 [Clostridium ramosum DSM 1402] # 1 140 1 140 140 269 100.0 5e-71 MRVEILYTSKSKHIKPVAEAMARWVKTYAKTIDQFDALAPVDLLVIGFDDSAWKDPQLET FIKSLNRQKVHNIALFNAFYINDHKMKNMIKLCQETDLPLMREQYSFKLTFRQLKEIDQC VIDAARLYVEDMVNLVRDYY Prediction of potential genes in microbial genomes Time: Thu May 26 10:45:36 2011 Seq name: gi|223714070|gb|ACDT01000145.1| Coprobacillus sp. D7 cont1.145, whole genome shotgun sequence Length of sequence - 32445 bp Number of predicted genes - 34, with homology - 34 Number of transcription units - 12, operones - 7 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 13.8 1 1 Op 1 24/0.000 + CDS 184 - 522 488 ## COG0347 Nitrogen regulatory protein PII 2 1 Op 2 . + CDS 555 - 1796 1222 ## COG0004 Ammonia permease + Term 1807 - 1834 0.1 + Prom 1806 - 1865 4.8 3 1 Op 3 . + CDS 1885 - 3165 1666 ## COG2873 O-acetylhomoserine sulfhydrylase + Term 3174 - 3217 3.2 + Prom 3167 - 3226 6.2 4 2 Op 1 . + CDS 3258 - 3866 524 ## gi|237733120|ref|ZP_04563601.1| predicted protein 5 2 Op 2 . + CDS 3850 - 4437 367 ## gi|167756523|ref|ZP_02428650.1| hypothetical protein CLORAM_02060 + Term 4472 - 4508 -0.7 6 3 Op 1 . - CDS 4497 - 5066 712 ## COG0242 N-formylmethionyl-tRNA deformylase 7 3 Op 2 . - CDS 5127 - 5345 236 ## gi|167756525|ref|ZP_02428652.1| hypothetical protein CLORAM_02062 - Prom 5422 - 5481 8.4 + Prom 5379 - 5438 6.3 8 4 Op 1 . + CDS 5458 - 5724 364 ## COG4476 Uncharacterized protein conserved in bacteria 9 4 Op 2 . + CDS 5726 - 6946 1298 ## COG0772 Bacterial cell division membrane protein 10 4 Op 3 . 
+ CDS 6979 - 7260 437 ## gi|237733126|ref|ZP_04563607.1| predicted protein 11 4 Op 4 . + CDS 7275 - 7457 114 ## gi|167756529|ref|ZP_02428656.1| hypothetical protein CLORAM_02066 12 4 Op 5 . + CDS 7463 - 7810 330 ## gi|167756530|ref|ZP_02428657.1| hypothetical protein CLORAM_02067 13 4 Op 6 14/0.000 + CDS 7819 - 8370 612 ## COG0742 N6-adenine-specific methylase 14 4 Op 7 . + CDS 8375 - 8863 375 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 + Prom 8868 - 8927 11.1 15 5 Op 1 15/0.000 + CDS 9011 - 9883 938 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 16 5 Op 2 3/0.000 + CDS 9876 - 11144 1463 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 17 5 Op 3 13/0.000 + CDS 11153 - 11908 909 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 18 5 Op 4 . + CDS 11901 - 12818 1292 ## COG0167 Dihydroorotate dehydrogenase - Term 13208 - 13239 1.1 19 6 Tu 1 . - CDS 13253 - 16405 2746 ## COG3525 N-acetyl-beta-hexosaminidase - Prom 16435 - 16494 6.2 20 7 Tu 1 . + CDS 16749 - 17531 835 ## SUB0904 RpiR-family regulatory protein + Term 17536 - 17581 10.2 + Prom 17591 - 17650 5.5 21 8 Op 1 5/0.000 + CDS 17681 - 19825 2049 ## COG0210 Superfamily I DNA and RNA helicases 22 8 Op 2 4/0.000 + CDS 19904 - 21898 2169 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 23 8 Op 3 . + CDS 21943 - 22665 817 ## COG4851 Protein involved in sex pheromone biosynthesis 24 8 Op 4 . + CDS 22680 - 23015 417 ## gi|167756541|ref|ZP_02428668.1| hypothetical protein CLORAM_02078 25 8 Op 5 . + CDS 23026 - 23316 426 ## gi|167756542|ref|ZP_02428669.1| hypothetical protein CLORAM_02079 26 8 Op 6 21/0.000 + CDS 23316 - 24764 1506 ## COG0154 Asp-tRNAAsn/Glu-tRNAGln amidotransferase A subunit and related amidases 27 8 Op 7 . + CDS 24770 - 26203 1526 ## COG0064 Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) 28 8 Op 8 . 
+ CDS 26253 - 26996 913 ## gi|167756545|ref|ZP_02428672.1| hypothetical protein CLORAM_02082 + Term 26998 - 27022 -1.0 29 8 Op 9 . + CDS 27034 - 28380 1226 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 30 9 Tu 1 . + CDS 28718 - 28987 258 ## SSA_1059 hypothetical protein + Prom 29046 - 29105 11.6 31 10 Tu 1 . + CDS 29337 - 29513 201 ## gi|167756549|ref|ZP_02428676.1| hypothetical protein CLORAM_02086 + Term 29525 - 29560 1.0 + Prom 29548 - 29607 10.1 32 11 Op 1 . + CDS 29656 - 30189 565 ## COG0350 Methylated DNA-protein cysteine methyltransferase + Prom 30376 - 30435 11.8 33 11 Op 2 . + CDS 30465 - 30830 551 ## DSY1255 hypothetical protein + Term 30839 - 30884 10.2 + Prom 30854 - 30913 8.7 34 12 Tu 1 . + CDS 31016 - 31945 838 ## SUB0840 transporter protein Predicted protein(s) >gi|223714070|gb|ACDT01000145.1| GENE 1 184 - 522 488 112 aa, chain + ## HITS:1 COG:HI0337 KEGG:ns NR:ns ## COG: HI0337 COG0347 # Protein_GI_number: 16272289 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Haemophilus influenzae # 1 112 1 112 112 113 49.0 7e-26 MKKLEIIIRPEKLEILKDILTDCGITGMMVSNIMGFGNQMGYTQQYRGTKYSVNLVSKLR IETVVKDEVVDQMIKEITTKLSSGNVGDGKVFVYPVEQAVRIRTGETGENAL >gi|223714070|gb|ACDT01000145.1| GENE 2 555 - 1796 1222 413 aa, chain + ## HITS:1 COG:sll0108 KEGG:ns NR:ns ## COG: sll0108 COG0004 # Protein_GI_number: 16331833 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Synechocystis # 1 413 84 492 507 384 51.0 1e-106 MEQLINIVWVFLGAVMVMLMQAGFAILEAGLTRQKNCNNVLMKNIMDFAIGSIIFLVVGF GLMFGESLGGIVGITGFIDPTSLNLSQFEALSPTVFIFFQTVFCATAATIVSGAMAERTK FSSYLIYTLVISLVIYPISGHWIWGGGFLSKIGFIDYAGSTAVHSVGGWAALMGAVVLGP RMGKYNRDGSTNAIPGHNIMMATLGVFILWFCWFGFNSGSSLEAAGYIGHIAMTTNLSAC AGALVAMFLTWKKYGKPDVSMTLNGILAGLVAITAGCHIVSLYGAIAIGAVGGILVVYGC EILDQKLHVDDPVGAVGVHCLNGVWGTLAVGLFACNTPASEGTLGLFFGGGTALLITQLI GVIIVAVWVCSMSFIMFTLIKKTVGLRVTPQEELAGLDLGEHGSEAYPDFLKK 
>gi|223714070|gb|ACDT01000145.1| GENE 3 1885 - 3165 1666 426 aa, chain + ## HITS:1 COG:BH2603 KEGG:ns NR:ns ## COG: BH2603 COG2873 # Protein_GI_number: 15615166 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Bacillus halodurans # 1 422 1 424 430 419 50.0 1e-117 MDNKNYGFETLQIHAGQKPDVTTKATGMPLYLSNAFTFEDADQAARIFALEEGGYFYSRL SNPTVDALQQRMAALDGGFGAVAFASGTAAIMGLIMTVCETGDEIVAANNLYGGTIGSLS GTMAGMGFKTHFINPTDLEALKAAINDKTKIIFVESVGNPNGDMLDFEAISAIAKEYQIL FVVDNTTPTPYLFKPIEYGADLVVYSTTKYLAGHGNVMGGVIVDSGNFKWKDNPRYPLFN MPDKAYHDIVYADLGAGAFCTKAVAKTLRDLGGCMSPFNAYMTLLGLETLSLRMKKHVEN ARLLAKFLNECDDVEWVSYAELPDSPNYELAKKYFPNGFGGLYTFGVKGGVAGGKTVINN IKLFIHVTNFADSRSLLTHPASTTHAQMSEEERIAGGVKQETIRMSVGLEDVEDLIADLK QAFAKI >gi|223714070|gb|ACDT01000145.1| GENE 4 3258 - 3866 524 202 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733120|ref|ZP_04563601.1| ## NR: gi|237733120|ref|ZP_04563601.1| predicted protein [Mollicutes bacterium D7] # 1 202 1 202 202 286 100.0 5e-76 MQSFLAMHINNIVMILLLFIYVVCLNGLFSMRRRLIKWRRFFAFLFIISTAGLMLGTIRN ELMVYWFNNASFNIESNFEWMIKIGYDLISLIIMILMIFLVKNILDFIKNSSRDAIRKLA INADIIAILIIIRTVIVPLIVKLARVLELFGLNNQTSSSNDIFLYALPYLGLPEVAIIVV TIFINILGISYWNRGAEDVMAK >gi|223714070|gb|ACDT01000145.1| GENE 5 3850 - 4437 367 195 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756523|ref|ZP_02428650.1| ## NR: gi|167756523|ref|ZP_02428650.1| hypothetical protein CLORAM_02060 [Clostridium ramosum DSM 1402] # 1 195 1 195 195 263 100.0 3e-69 MLWQNKSYDLVSYLEIIIGIIALICLLSLLYLWYQPKCQNKFNNIVFLGHLIICFDLYLL VDKVAHLLMSFYQAYLHHLKYNIYDYALLSSDILLTVVLIFSTFLFINNLEHLLIKKDSK EEYNLILNRVFKAYNCLMVVFIFKDVVLKIVYDFLVDVRGIRLHYHPYTINNLMVTIVLM LLLGLVINFYKKKTT >gi|223714070|gb|ACDT01000145.1| GENE 6 4497 - 5066 712 189 aa, chain - ## HITS:1 COG:BS_ykrB KEGG:ns NR:ns ## COG: BS_ykrB COG0242 # Protein_GI_number: 16078520 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Bacillus subtilis # 1 189 1 183 
184 175 48.0 5e-44 MITMKDIIDDHNEKIREVSKEVALPITNEERELLLQMHEFLVNSQDPETSEKYDLRPAVG IAAIQLGIPKRMTAIHVLDFDEDGNVIGADDYALVNPKIISHTEKQSYLKDGEGCLSVND EVQGYVPRYAKVTVKGYDILTDQEVKIVARGFLSICLQHELDHFEGTLFYDRINKENPLA PIPNAMVID >gi|223714070|gb|ACDT01000145.1| GENE 7 5127 - 5345 236 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756525|ref|ZP_02428652.1| ## NR: gi|167756525|ref|ZP_02428652.1| hypothetical protein CLORAM_02062 [Clostridium ramosum DSM 1402] # 1 72 1 72 72 77 100.0 2e-13 MNCGFNYSGCGCNSYNPFITNPFGCGGCNSCNISSNGCGCGCGGFGSSNWFAIVLVVFLL LIICGNSRIRVG >gi|223714070|gb|ACDT01000145.1| GENE 8 5458 - 5724 364 88 aa, chain + ## HITS:1 COG:SP1404 KEGG:ns NR:ns ## COG: SP1404 COG4476 # Protein_GI_number: 15901258 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 3 79 5 81 92 74 49.0 4e-14 MDYNYPLDYRWSTEEIIDVMALYNAVEKAYEEGISKEEFMQAYRKFKAIVDSKSEEKQLD KQFYEISHYSIYQVVKAAKNNDFIRVGE >gi|223714070|gb|ACDT01000145.1| GENE 9 5726 - 6946 1298 406 aa, chain + ## HITS:1 COG:SPy0609 KEGG:ns NR:ns ## COG: SPy0609 COG0772 # Protein_GI_number: 15674690 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Streptococcus pyogenes M1 GAS # 6 400 9 405 434 151 30.0 3e-36 MAGIEKIKRLFKSRGVDKGIYASVFVLAIFGIIMIGSASVGSVTSHGVTYALKNMITQSV YVIAGAGVMIFIARVFKTRYITYSASMKLYLLGIFLMIICIFFGSTKGSHAWIKFGSLFS IQPAEFMKVAMILIMSYFLTESDKAFVVKGRFKTQQLKSAFYKEKFLKCVFLPMMLVAIA AGVGIFVQKDFGTTVILVTICFVCFIGTPRQYFKKYKRIVWVFIGVCGVLFLIIGTSVLK GYQLGRISTWLAPLSDPYDTSMQLSNALIAFNNGGLFGVGLGNSTQKFGYIPESQNDFIG AIIYEELGIIGLGLIIIPTCIIIFKLLKYSQEIKENKSRIILLGIASYFFLHLLINLGGI SGLIPMTGVPLLLISAGGSSSVTAFVAVGVAQAIIAKHNRQKFDTD >gi|223714070|gb|ACDT01000145.1| GENE 10 6979 - 7260 437 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733126|ref|ZP_04563607.1| ## NR: gi|237733126|ref|ZP_04563607.1| predicted protein [Mollicutes bacterium D7] # 1 93 1 93 93 178 100.0 8e-44 
MDEFKEFLKTIPSIKQDVLNGRYTWQQLYEIFVMYGKDDKFWLPYKTGNSGFDLNMLLEV IKNIDLNALSSSLGSIEKVLNVASTFLDKKEAC >gi|223714070|gb|ACDT01000145.1| GENE 11 7275 - 7457 114 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756529|ref|ZP_02428656.1| ## NR: gi|167756529|ref|ZP_02428656.1| hypothetical protein CLORAM_02066 [Clostridium ramosum DSM 1402] # 1 60 1 60 60 89 100.0 1e-16 MMNEEIRYYLRYHPNWYIVLSRYPQEYEHLIQEYKDGKNQQFIDKIDQVSMLINMVEMMM >gi|223714070|gb|ACDT01000145.1| GENE 12 7463 - 7810 330 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756530|ref|ZP_02428657.1| ## NR: gi|167756530|ref|ZP_02428657.1| hypothetical protein CLORAM_02067 [Clostridium ramosum DSM 1402] # 1 115 1 115 115 166 100.0 4e-40 MEEILDKLITEIKLDQRYIDYVEAEKKLHNQEIDSLLHEYQNKLNQYDELKKYNQYIDNQ EIKDEIRDLKKKIASNEDILDYYCKYHRLNDFLDEITKVVFGGISNELDLSPYQL >gi|223714070|gb|ACDT01000145.1| GENE 13 7819 - 8370 612 183 aa, chain + ## HITS:1 COG:lin2159 KEGG:ns NR:ns ## COG: lin2159 COG0742 # Protein_GI_number: 16801225 # Func_class: L Replication, recombination and repair # Function: N6-adenine-specific methylase # Organism: Listeria innocua # 1 180 1 182 185 148 47.0 5e-36 MRVIAGKYKSRQLKSVKSNLTRPTTDRNKENLFNIIGPYFDGGAVLDLFAGSGGLGIEAL SRGYEHLYSVDNQYAAFQVIKENFTMLKLANAHVYKLDYRKALKRFANEKLKFDLILLDP PYGKGLVDDILEFLIVNEMLNDECMIVVEELKEVEFKPFKELDLIKKNDYGITALNVFKF MAG >gi|223714070|gb|ACDT01000145.1| GENE 14 8375 - 8863 375 162 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 1 159 1 159 164 149 45 3e-35 MKKNIAVYAGTFDPVTNGHLDIIERASRMFDTLYVTICINPNKQGLFSIDERKELLKAAC QQFDNVIIDSSDKLSVEYAKDVGSRIIVRGIRATMDFEYELQLAFSNQYLDKEVDMVFLM TKPSHSFISSSAVKEMVSHNRSVAGLVPPCVESALRKKYQGE >gi|223714070|gb|ACDT01000145.1| GENE 15 9011 - 9883 938 290 aa, chain + ## HITS:1 COG:BH2539 KEGG:ns NR:ns ## COG: BH2539 COG0540 # Protein_GI_number: 15615102 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Bacillus 
halodurans # 2 290 14 305 305 311 54.0 1e-84 MNIFTLKDFSTEEINQILDEAEEFKNGKKVDFKGQKVVANLFFEPSTRTHYSFDKAAYNL GCRTQNFEAANSSVQKGETLYDTVKFFESIGCDAVVIRHPKENYYEDLIERIKIPVVSGG DGTGNHPSQSLLDLMTIREEFGHFEGLKIVIVGDIVHSRVAHSNYEIMQRLGMEVYTSGP REFEEAGYNYVDFDKILPEVDIVMLLRVQHERHHDLMRLTTDEYHQMFGLNQKRVDKMKE GAIIMHPAPFNRNVEIADEIVECAQSRIFRQMENGVFVRMALLNRVLTND >gi|223714070|gb|ACDT01000145.1| GENE 16 9876 - 11144 1463 422 aa, chain + ## HITS:1 COG:SA1044 KEGG:ns NR:ns ## COG: SA1044 COG0044 # Protein_GI_number: 15926784 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Staphylococcus aureus N315 # 5 419 3 418 424 330 44.0 4e-90 MTKILFKNGNVFYHNKLQKLDVLIENDRIAAIAQELTDADEIIDLKGNLLTPGFVDIHVH LREPGFEHKETIATGTNSAKYGGYTHLVAMANTIPCMDDVATIKDFALRVAKDAKVHTYT YSAITTELKGQELVDFDANSKEEIVVGFSDDGRGVQSSMMMETAMKAAKANNSIIVAHCE DEAELQPGACINDGNYAKEHNLVGINNASEFNHAVRDLKLASEIGNRYHICHISTKETVA ALTEARKTNPLVSGEACPHHLILTDDNIKDMNPNYKMNPPLRSKEDLAALIQGINNGGVT VISTDHAPHAIEEKNKPIDKAPFGIIGNQHAFSLMYTYLVKKGLISLEKVLTCMSINPAK IIGIEHDLEVGLKANLAVFDLEEEYVITKESIKSKSINTPFLETKCFGALKYHVLDGKVT KI >gi|223714070|gb|ACDT01000145.1| GENE 17 11153 - 11908 909 251 aa, chain + ## HITS:1 COG:BS_pyrDII KEGG:ns NR:ns ## COG: BS_pyrDII COG0543 # Protein_GI_number: 16078617 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Bacillus subtilis # 8 247 6 254 256 219 44.0 3e-57 MANLHGMMKITSIKNIADDVYEMILEGEGAKYISAPGQFINIKINDSLQPYLRRPMSISD YDDSHIVIVFKVVGEGTKILKNKEIGSYLDCLIGLGNGFTIKDGKALLIGGGLGTPPLYN LGKKLAAKGIEITTVLGFGSAKDVFYQEKFAEFSNVFVATMDGSTGTKGTVVDVIKNEGL TFDNYYTCGPEPMLDALALNYPENGQLSFEARMGCGFGACMGCSCKVKTRPYKRICVEGP ILESNEVIVNG >gi|223714070|gb|ACDT01000145.1| GENE 18 11901 - 12818 1292 305 aa, chain + ## HITS:1 COG:BS_pyrD KEGG:ns NR:ns ## COG: BS_pyrD COG0167 # Protein_GI_number: 16078618 # Func_class: F Nucleotide transport and metabolism # 
Function: Dihydroorotate dehydrogenase # Organism: Bacillus subtilis # 4 302 2 299 311 379 63.0 1e-105 MANLKVSLPGLEMKNPIIPASGTFGFGYEFAKFYDINILGSMMLKGTTLEPRYGNPLPRI AEGPSGLLNAIGLQNPGVDAVMAEELEKLRPLYNDKVIANIGGSAPDDYVETAKRISTHD MVGALELNISCPNVHSGGIQFGTNPEMAADLTRRVKEVSKVPVYVKLSPNVTDIVAMAKA VEAAGADGITMINTLVGMRFDFKTGKPIIANKTGGYSGPAIFPVALRMVYQVSQAVKIPV IGMGGIQNARDVIEMMSAGASAVAIGCQNLIDPYVCEKIINDLEPLLEQIGINDINDLVG RSHKF >gi|223714070|gb|ACDT01000145.1| GENE 19 13253 - 16405 2746 1050 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 270 692 82 501 757 105 27.0 3e-22 MKKKKRLLPIVMASLMAMSINVGPLKAADAPLITNPSFEHEIDKNEVQLYKAVRTDEKAF DGKYSIKVGNPKPENPSEVPIWRYNYGKGSVNVIFHNVEPNTTYKITTHYWNESGVKMST GVLDIEGKHTTSPWQLASNIQTNQKVSTEWQTNEHTITTGPRTNEIYGFAYMEWTGNDLG SGVFYIDDFKIEKVENKTVEEKASINYDADFSEFPETVPAVQNFEKNGNEIYKLNTAKQV FSTDEFSKAKTQYLADSMVKKGLIKDYTIKTVTDASQGEGIVLVKQPITFELPAAVTTSK IDAYQIDIDQDKTVIHSDYIQGIQNGAMTILQAFAQRKSLPAGSVQDYSDQVVRGLQVDS GRRYYSIQWLKDQIEQMAYYKQNILQLRLKDNEGIRYDSKIAADFVDRKGGFWTQEEVDE LVSYAAKFNIEVIPEIDFPGHAEQEANFHSGWCIPGSTKALDFSKQEVREYMISVYKEAA DFFHAKTIHIGGDEYFQSGYTTAGKTVLQEWARQVTGNEAATDHDAIKLFFNEAGDELIK QNLKVLVWNDNIFDLGGIVPLDSRLIVDFWAGGMYGSIKASTTANAGYTIMSSSSSNYHD LWPQQGQSKLDRPLPKRTYEEFTRYHYSKSSYHYGNDEVLTENLDKSLGQVFPIWDDAHG YVPEYILTRTLFPRYAGFALKTWGAEYQKEMDYESFERLMYTLGSPRDDLFTQSKINYNK TDLDLVINKIETALADKTTTNTAVQENIDNLNAMISDVKANPANYQKDSFYTDIINDLIY NYENVEYVVVKKNLLNIAIEEALKVTEAELNTIVPAVVTEFKAALAEAQQINGTSLATQE EVNASFDRLSDVMQKLSFKKGDKTDLENLIKKIAKLNAEDYLTSTWNAMLPVLDEAKVVV NNQNALEPEVQEAHDKLVRAFLQLRLKPNKDALNELINKAESLNSAEYTSESWAALANIL IDVKAVAANEVATVTEIETAYDNLQNAINNLVKVTPAEPTTPNTPDANKKDDVKQGNVKT GDNTNIALYTSLFIMGALVLPLVLKKKRHN >gi|223714070|gb|ACDT01000145.1| GENE 20 16749 - 17531 835 260 aa, chain + ## HITS:1 COG:no KEGG:SUB0904 NR:ns ## KEGG: SUB0904 # Name: not_defined # Def: RpiR-family regulatory protein 
# Organism: S.uberis # Pathway: not_defined # 8 225 8 236 289 69 27.0 1e-10 MAGFIYKLQNIVNTSPDSDDTNINIARCILKNIDKIKKRTSLQDFADICYTSQSSISRFS QYLGYANFNDFKADCIGIQEETDEMIIDTRATKKINCVDYAQAISESLVRMQETSIDQEI EQLCEYIHNAKRIFIFATHIPGDLGCIMQRAILTTGKFIEFYPRKEHQMEVAKQIRRDDL CIFLSLEGTLLMEKAITIPAIISLAKNVLITQNVHIKFSEQFTQVIGLGGHDSESIGKYK LLFFIDCLLNRYYHKYIVNN >gi|223714070|gb|ACDT01000145.1| GENE 21 17681 - 19825 2049 714 aa, chain + ## HITS:1 COG:SA1721 KEGG:ns NR:ns ## COG: SA1721 COG0210 # Protein_GI_number: 15927479 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Staphylococcus aureus N315 # 7 711 8 727 730 481 40.0 1e-135 MELDELLNKNQKEAATYLDSHLRIIAGAGSGKTRVVTYRIAYLIDEIGVDPRKILAITFT NKAANEMKERITSLLGPHALGSLVCTIHSLCVRILRRHINVINYPSNFMIMDEEDQKALI KKIYNNLDIDAKTISIKSMIASISNYKMANIDPERAMELAGQFQGEIKKAKVYKEYLEYQ ENHFMLDFDDLLLKAVYIFENYPDVLEKWQHKFQYIHVDEFQDVGNIEYRLVKLMSQDAI TCVVGDPDQTIYSFRGANVNYILDFDKDFKPCKTVILDQNYRSTGNILNISNSLIRKNRN RLEKDLYTEATGGSEVIHYMGKSEQEEADYICSQIEKIISEVDGVNYRDFAILYRANYLS RTIEQSMISHGIDYRIFGGLKFFSRKEVKDALSYLQLVCNDEDLAFERIINVPARGIGKK TLENIQLVALNNQVSLYEALTLFSDQIKLSSKAKKEIITLVKAVETAKQSSLPLHEMFEN LMNDVGYIDMLKKDLEDNRIDNIHELQRSIYEFQNQNPDIATIENYLQEISLYTDNDNLD DGQYVSLMSIHMAKGLEFDYVFVLGLSEGIFPSFRALSESDEGLEEERRLAYVAFTRARK QLFLTDSEGFSFVTDSPKISSRFIDEIGSEGIKHVGTKPRFKTANFINTAPTKEELVGNN EVTDWSNGDFVLHDTFGKGVVVKVNGNTLDIAFELPAGLKTLMANHKALKKLTN >gi|223714070|gb|ACDT01000145.1| GENE 22 19904 - 21898 2169 664 aa, chain + ## HITS:1 COG:lin1870 KEGG:ns NR:ns ## COG: lin1870 COG0272 # Protein_GI_number: 16800936 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Listeria innocua # 1 661 1 661 671 524 43.0 1e-148 MDDLSRYKELKATILYHNDRYYNQDNPEITDYEYDMMMQELKGLEQKHPEYITSDSPTQK VGGSAKREAGVLVRHNVPMLSLQDVFSKEDVDAFVADMQEQLVDPTFVVEYKIDGLSMAL RYVNGKLDVAVTRGDGVLQGEDVTVNAKVIKDVKNTLKEPIEYLEVRGEVYMTNEAFDKV 
NEIQEIKGKKLFANPRNCAAGTLRQLDSSITKERNLSMFVFNIQDAKGREFISHSEGYEY LKRQGIKIIEDYKICKTAKEVWEAIEAIGENRDKLGYDIDGAVVKIDSFADRQKLGATAK VPRWAVAYKYPPEEKETKLLAIELSVGRTGRITPTAIFEPIRLCGTTVSRATLHNQDFID DLDVRIGDTIVVYKSGEIIPKVKGVVKEKRPADSVPYVIGNTCPVCGAPAVREGDNADIK CTNHSCPSKLVRNIVNFVGRDAMDIKGFGFAYVETLVDHGYLKDLSDIYGLIDKRQELLD KKIIGLVKSTDNLLNAIEGSKNNDAIKLLTSLGISNVGKSAAKSLMKKFKSIDNLMKASY AQLIEVNDIGDISAMAIINYFKNPDNQAVVQRLKEYGVNMNIIEAQDGDERFDGKTFVVT GTLPTLSRKAASELIEKHGGKVSGSVSKKTDYLLAGENAGSKLTKAQNLGINVISEETLL EMVK >gi|223714070|gb|ACDT01000145.1| GENE 23 21943 - 22665 817 240 aa, chain + ## HITS:1 COG:lin1869 KEGG:ns NR:ns ## COG: lin1869 COG4851 # Protein_GI_number: 16800935 # Func_class: R General function prediction only # Function: Protein involved in sex pheromone biosynthesis # Organism: Listeria innocua # 1 211 1 223 371 59 25.0 5e-09 MKKIVALLAIALVLSGCSDAKETVENQTANDVSTTDSLDDSFYRVVNLNTNLSREDYYTA FGNTMDFQTIGRELQILSTDHFSTNDYYMSEGQYLKTDDMNQLLKRSEDTSKYPYTLQAQ RGTTIGGVANPIMVSTVHEQDYYEKNGSEYVLKGLSLAIVLDPRDEKNERLDTSLDESLV VDFGREAISKLYKYLQSKKDLKDIPANICVYYATNTNESDINGRYILKVIVMAVWEILKH >gi|223714070|gb|ACDT01000145.1| GENE 24 22680 - 23015 417 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756541|ref|ZP_02428668.1| ## NR: gi|167756541|ref|ZP_02428668.1| hypothetical protein CLORAM_02078 [Clostridium ramosum DSM 1402] # 1 111 247 357 357 194 100.0 2e-48 MFTSTEATAADEDTASQFEIFKSNMKKAATEAVGVIGYGRYKNGQIQSMKINLKVNIKTY TELLYLISTAADELNTQFSGFDIKVLVSSQDQLEALIVKDKGEDAKSILLY >gi|223714070|gb|ACDT01000145.1| GENE 25 23026 - 23316 426 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756542|ref|ZP_02428669.1| ## NR: gi|167756542|ref|ZP_02428669.1| hypothetical protein CLORAM_02079 [Clostridium ramosum DSM 1402] # 1 96 1 96 96 165 100.0 9e-40 MENKEEMLKKLGLKTMFNITDAEMPELVEEYEIFMNHVAVLEKIDTDGKAPLAYPYEIET CFLRDDEPIDVISREDVLKNAKSVQDNQIKVPKVVG >gi|223714070|gb|ACDT01000145.1| GENE 26 23316 - 24764 1506 482 aa, chain + ## HITS:1 COG:lin1867 KEGG:ns NR:ns ## COG: lin1867 COG0154 # Protein_GI_number: 16800933 # 
Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase A subunit and related amidases # Organism: Listeria innocua # 4 477 6 479 483 341 41.0 1e-93 MKVYSIEELHNLIVNQEIDLNEYYYDLFKEVEFQQNRLNAFVTITKTQAMDNIKELKISD EEILKGIPGVFKDNYNTKGIKTTASSKMLEDYVPVYDAAVVEKLYNEGVSLIGKSSMDEL AMGGTNKSALTGPVYNPWDSSRIAGGSSGGSAALVGSGVVPFALGSDTGDSIRKPAGFCG VVGFKPTWGRISRFGVIPYASSLDHVGAFTRNVRDMAIVTEALAGRDERDMTSSTRPVPH YLKELNSDIRNMKIAVLKTVSNEIRNDDIKNNFNHVVATLKQLGACVEEVEIPVHLAKAI LPTYTIIANSEATSNHSCLDGIKYGDRQPGSSTDEVMINSRTDGFGAHIKRRFILGNLAL ATENQERMFRKAQRVRRLIVEELNKIYDNYDIILTPNGGSVAPKLDEANDDRLSDEYLIL ENHLALGNFAGTPSISIPSGFSEGMPIAINLMGRLFEEQTVLNVAYALEQALGFENQYSR EG >gi|223714070|gb|ACDT01000145.1| GENE 27 24770 - 26203 1526 477 aa, chain + ## HITS:1 COG:BH0667 KEGG:ns NR:ns ## COG: BH0667 COG0064 # Protein_GI_number: 15613230 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) # Organism: Bacillus halodurans # 1 475 1 476 476 514 54.0 1e-145 MNFETVIGLEVHCELKTKSKMFSGAPVTFGEAPNTNANIVDMGMTGAMPVLNKSGVEFAI RVCSALNMEIDELVQFDRKNYYYSDLPKGFQITQDRFPIGRNGTMKIEVNGQTQVIEIER LHMEEDTAKQLHFDDYSLIDYNRAGIPLIEIVTKPTIRSGAAAAAYLERLRQIFLFTEVS DAKMEEGSMRCDVNVSIRPYGSEKFGTRTEIKNLNSISNVQKAIEFEVARQEKVLISGGE VLQETRRYDEDSKETVVMRAKGDAVDYKYYPEPNILPIRLNHQWVEGIIERIPEMPESRV ARYINEYKIPKTDALILVQTKEVSDFFDATVAYTKHYKIASNWLLGEVQAYLNKNNLIIT DTNITPEYLAKMINYIQDGTISSKQGKKVFEILMNEGKDPEIIIEENNMKQISSPEELTK IINEVLDNNPQSIEDYSNGKDRAVGFLVGQIMKKTGGQANPGLTSKLLIELLKQRIS >gi|223714070|gb|ACDT01000145.1| GENE 28 26253 - 26996 913 247 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756545|ref|ZP_02428672.1| ## NR: gi|167756545|ref|ZP_02428672.1| hypothetical protein CLORAM_02082 [Clostridium ramosum DSM 1402] # 1 247 11 257 257 396 99.0 1e-109 MSQRKRFSFDDEVDDTIDLSKDDLKNSSIANNDKPSNDNVFADEDGLDAKAYPTDNGGDK SMKNKKKRKFKIKKWQIALIAFLSIIIVFVIYVFVAGGNDGPVYGDRCASLIAIDKSKFT 
GVEDAIKADPSVNSINIEVDCRIVKISMNFVDNTSSDTAKQLATNALHTLDDTLGENKEE GRAYSNLFTTANGRGQYNVEFVLTSNGDTNFPIFGTKHPNSDDITFTGANVVDQAATDRA ANTNANQ >gi|223714070|gb|ACDT01000145.1| GENE 29 27034 - 28380 1226 448 aa, chain + ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 1 446 8 454 458 404 45.0 1e-112 MKKNDYFYGECTDLTHDGQGVVKIDNFTYFVKGMLTGERGRLKVIKVLKNYGIARLIELE VTSKYRIDPKCSVFRPCGGCQLQHLNPQGQMIYKTKRVKDCLERIGNCKVAVNDCIMMEK PWHYRNKVQMPVGVKGNHLVTGFYKQKSNEIIPCENCYIQNELSNQITNRVKELMEEFKI YPYDKLAHQGNIKHILTKHGYHTDEVMLVLISYRNKIKDMDKLVEIITKEFPMIKTVIQN INHRTDNVILGEQEVILFGNGYIYDTLLGNKYKISLKSFYQVNPIQVEKLYSKAIEFAGL SKEDIVLDAYCGIGTITLSVAKYVKKVYGVEIVETAIDDAKNNAVLNNISNAEFKCSDAG KYMLELVNQDQHLDVVFVDPPRKGCSTEFLDNLIQAEPKKIIYISCDVATQARDIKYLQE FGYHADVCQPVDMFPHTTHIENIVRLSK >gi|223714070|gb|ACDT01000145.1| GENE 30 28718 - 28987 258 89 aa, chain + ## HITS:1 COG:no KEGG:SSA_1059 NR:ns ## KEGG: SSA_1059 # Name: not_defined # Def: hypothetical protein # Organism: S.sanguinis # Pathway: not_defined # 1 85 1 91 173 73 49.0 3e-12 MNAFILLIPFLLIAFINEGAIKRAGYFAPLQGNEKIAYVIYQMANIGIFIYFFLSVIIDL SWQFCLGLLVYLLGLGMCIVLVKDFLSKK >gi|223714070|gb|ACDT01000145.1| GENE 31 29337 - 29513 201 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756549|ref|ZP_02428676.1| ## NR: gi|167756549|ref|ZP_02428676.1| hypothetical protein CLORAM_02086 [Clostridium ramosum DSM 1402] # 1 58 1 58 58 84 100.0 3e-15 MKIEFKMSYKLGLLEFKVGDQLEISKNTIFESINDKYYMLILKGLRYDLLKEDIKIIQ >gi|223714070|gb|ACDT01000145.1| GENE 32 29656 - 30189 565 177 aa, chain + ## HITS:1 COG:L118481 KEGG:ns NR:ns ## COG: L118481 COG0350 # Protein_GI_number: 15672513 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Lactococcus lactis # 7 165 7 167 169 155 51.0 3e-38 
MQYISKYTSPLGAITLASNGEALTGLWFDGQKYFGANLSKEYKNVELPVFKQTKEWLNLY FNGQKPDFIPLLALQASEFRLAVWQILLEIPYGQTLSYGDISAVLAKQKGLKTMSAQAVG GAVGHNPISIIIPCHRVVGSNGNLTGYAGGLEKKIALLQLEGADMGALFTPKKGTAL >gi|223714070|gb|ACDT01000145.1| GENE 33 30465 - 30830 551 121 aa, chain + ## HITS:1 COG:no KEGG:DSY1255 NR:ns ## KEGG: DSY1255 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 118 13 130 130 117 55.0 2e-25 MNKEVLNYIAEKTRELIVAPTCSNETKEAAQAWLNAIGTESEMAETKKYIAELEEDIMPI DNLIRFAGSDNGKAYFGEENAKNIESHAKEIKANGAKYCDCPACLVVAQILEKKSELFNE K >gi|223714070|gb|ACDT01000145.1| GENE 34 31016 - 31945 838 309 aa, chain + ## HITS:1 COG:no KEGG:SUB0840 NR:ns ## KEGG: SUB0840 # Name: not_defined # Def: transporter protein # Organism: S.uberis # Pathway: not_defined # 1 309 1 309 309 242 47.0 1e-62 MRAVTTIFPVIFMIGIGTFSRIRGFITPSQKDGANKIVFNILFPIMIFNILFTSKLETSA IFIVLYVFIAYCVILFIGRFIINFTGEKFAHISPYLLTTCEGGNVALPLYLSIVGASSNT VIFDLAGVFMAFVIIPIMVARAGAGETRIRELIKTILTNSFVIAVILGLGLNLLGTYNFL SNSSYLGIYSNTITQVTAPIVGVILFIIGYDLNINMEILGAVLKLLVARIIFYILVIIGF FIFFPGLMIDKVFMIAVLIYFMSPTGFAIPMQISPLYKSNDDCGFASAIISLNMIITLVV YALIVVFIA Prediction of potential genes in microbial genomes Time: Thu May 26 10:46:53 2011 Seq name: gi|223714069|gb|ACDT01000146.1| Coprobacillus sp. D7 cont1.146, whole genome shotgun sequence Length of sequence - 1804 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 40 - 372 215 ## Amuc_0618 methyltransferase type 11 + Prom 954 - 1013 7.2 2 2 Op 1 . + CDS 1055 - 1306 78 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 3 2 Op 2 . 
+ CDS 1209 - 1508 367 ## COG0110 Acetyltransferase (isoleucine patch superfamily) Predicted protein(s) >gi|223714069|gb|ACDT01000146.1| GENE 1 40 - 372 215 110 aa, chain + ## HITS:1 COG:no KEGG:Amuc_0618 NR:ns ## KEGG: Amuc_0618 # Name: not_defined # Def: methyltransferase type 11 # Organism: A.muciniphila # Pathway: not_defined # 4 108 144 248 249 80 32.0 1e-14 MELPLNQECFQIYKQYCQNFVGFNGGIKRDDERIKEFFDMHYKRMEFPNPLFFDQKSFIK RCISGSYSLKKTDQNYFEYIAALENVFDKYSNNGQLVMENKTVVYIGRLK >gi|223714069|gb|ACDT01000146.1| GENE 2 1055 - 1306 78 83 aa, chain + ## HITS:1 COG:L0026 KEGG:ns NR:ns ## COG: L0026 COG0110 # Protein_GI_number: 15673963 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Lactococcus lactis # 1 49 1 49 153 75 65.0 3e-14 MFAVIGDGCHIEPPFHANWGGKYVHFGKGVYANFNLTLVDDVEIYVGDYTPDGSKCYNYY WNSPNFTGIKKRSISIQSSSLYW >gi|223714069|gb|ACDT01000146.1| GENE 3 1209 - 1508 367 99 aa, chain + ## HITS:1 COG:L0026 KEGG:ns NR:ns ## COG: L0026 COG0110 # Protein_GI_number: 15673963 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Lactococcus lactis # 2 96 53 147 153 128 65.0 3e-30 MGPNVTIITGTHPILPELRKEAYQFNLPVYIGENVWIGAGTIILPGITIGDNSVIGAGSI VTKDIPKNVVAYGQPCTIKREINDHDKIYYYKERKIKSV Prediction of potential genes in microbial genomes Time: Thu May 26 10:46:56 2011 Seq name: gi|223714068|gb|ACDT01000147.1| Coprobacillus sp. D7 cont1.147, whole genome shotgun sequence Length of sequence - 1540 bp Number of predicted genes - 3, with homology - 1 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 5.1 1 1 Tu 1 . + CDS 255 - 512 161 ## + Prom 640 - 699 8.2 2 2 Op 1 . + CDS 842 - 1264 296 ## gi|237733033|ref|ZP_04563514.1| predicted protein 3 2 Op 2 . 
+ CDS 1179 - 1539 300 ## Predicted protein(s) >gi|223714068|gb|ACDT01000147.1| GENE 1 255 - 512 161 85 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDIHYQQKDAIISVVNEQVSQVLHVFESVIDLMSYQMIQRNRNFKWTKNYYLSLGGASIV GKQIGISTIPLFLMHFLENNPKIKK >gi|223714068|gb|ACDT01000147.1| GENE 2 842 - 1264 296 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733033|ref|ZP_04563514.1| ## NR: gi|237733033|ref|ZP_04563514.1| predicted protein [Mollicutes bacterium D7] # 1 140 1 140 140 283 100.0 3e-75 MCIYYKDRKDLSMPINREHIFPAFLGGVNKLPLGYVSDEVNKLFSPLEKKLSVNSIIALL RMMFGPGKRGKKGESSPVVNVGLVEGKVCFYQVINGVAIPCPCILIKKGFIKDGLQSKLW RIEYRWQYTNFRFYSKFRYF >gi|223714068|gb|ACDT01000147.1| GENE 3 1179 - 1539 300 120 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDYSQSYGELSIDGNTLILDFIANLDTFKGRYIDKKSENMGCDEILIGYKNGKFFIGHSA DTVVDLSEVNKIINYLEKMKFKVEKIITASEAPKIEFNFAENDEYSRVFAKIGFNALAYI Prediction of potential genes in microbial genomes Time: Thu May 26 10:47:15 2011 Seq name: gi|223714067|gb|ACDT01000148.1| Coprobacillus sp. D7 cont1.148, whole genome shotgun sequence Length of sequence - 1494 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 234 - 293 8.5 1 1 Tu 1 . 
+ CDS 367 - 1353 637 ## ECO26_3413 hypothetical protein Predicted protein(s) >gi|223714067|gb|ACDT01000148.1| GENE 1 367 - 1353 637 328 aa, chain + ## HITS:1 COG:no KEGG:ECO26_3413 NR:ns ## KEGG: ECO26_3413 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 23 320 29 320 332 137 32.0 7e-31 MKTYYKNGISKEIPYIIDTSEEFVIQDLGKDYPDRYIYPGDSVKYELDIELVRLYETLQD LYFECSENYYSDIYLQPVWVQGAGQNSYFDISAETFRDFLCNESSAGIPNLYKHLYLADC QFLVGTIQNLLSGIEDAFIRYYITLASLNIDDICHETPNSTIYVMSATSRCASSILETYF TKAYSILDILCKICYEIQNKNEEFSSYRKIKSAKILWGDRKKLLINSSPRTLFESCELVR IIESLRNEYVHNGTWELNPKIFVHFNNNVIEECFMFFPDMEQGRLTTIKGRKHFFSKGIK VNDVLPSIHAEFKNRLLNTIKLLNGTKL Prediction of potential genes in microbial genomes Time: Thu May 26 10:47:31 2011 Seq name: gi|223714066|gb|ACDT01000149.1| Coprobacillus sp. D7 cont1.149, whole genome shotgun sequence Length of sequence - 45798 bp Number of predicted genes - 44, with homology - 44 Number of transcription units - 20, operones - 10 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 76 - 120 1.1 1 1 Op 1 . - CDS 134 - 1132 1210 ## CA_C2564 hypothetical protein 2 1 Op 2 . - CDS 1148 - 1840 1015 ## COG0822 NifU homolog involved in Fe-S cluster formation - Prom 1912 - 1971 7.8 + Prom 2143 - 2202 5.8 3 2 Tu 1 . + CDS 2235 - 2468 57 ## gi|167756373|ref|ZP_02428500.1| hypothetical protein CLORAM_01906 + Prom 2582 - 2641 4.1 4 3 Tu 1 . + CDS 2677 - 3669 1103 ## COG1052 Lactate dehydrogenase and related dehydrogenases + Term 3670 - 3699 -0.2 + Prom 3673 - 3732 8.4 5 4 Op 1 . + CDS 3832 - 5121 1609 ## COG0148 Enolase + Prom 5123 - 5182 5.2 6 4 Op 2 . + CDS 5217 - 5570 483 ## COG1780 Protein involved in ribonucleotide reduction + Prom 6010 - 6069 15.6 7 5 Op 1 . + CDS 6090 - 6452 536 ## COG1321 Mn-dependent transcriptional regulator + Term 6455 - 6498 3.9 + Prom 6455 - 6514 5.6 8 5 Op 2 . + CDS 6534 - 6857 303 ## gi|237733042|ref|ZP_04563523.1| predicted protein + Prom 6861 - 6920 3.7 9 6 Tu 1 . 
+ CDS 7033 - 7287 130 ## gi|167756380|ref|ZP_02428507.1| hypothetical protein CLORAM_01913 + Prom 7555 - 7614 7.7 10 7 Tu 1 . + CDS 7654 - 8652 812 ## COG4927 Predicted choloylglycine hydrolase 11 8 Tu 1 . - CDS 8886 - 9707 645 ## COG0266 Formamidopyrimidine-DNA glycosylase - Prom 9728 - 9787 8.5 12 9 Tu 1 . - CDS 9896 - 10321 485 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 10377 - 10436 6.8 + Prom 10295 - 10354 7.2 13 10 Op 1 . + CDS 10442 - 11458 982 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) + Prom 11473 - 11532 9.6 14 10 Op 2 . + CDS 11555 - 11857 455 ## gi|167756385|ref|ZP_02428512.1| hypothetical protein CLORAM_01918 + Term 11976 - 12030 7.0 15 11 Op 1 . + CDS 12190 - 13860 2189 ## COG2759 Formyltetrahydrofolate synthetase 16 11 Op 2 4/0.000 + CDS 13909 - 14397 678 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase 17 11 Op 3 . + CDS 14419 - 15120 878 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase 18 11 Op 4 21/0.000 + CDS 15137 - 16162 847 ## PROTEIN SUPPORTED gi|126667548|ref|ZP_01738518.1| Ribosomal protein S7 19 11 Op 5 10/0.000 + CDS 16156 - 16749 687 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 20 11 Op 6 17/0.000 + CDS 16751 - 18274 1697 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) 21 11 Op 7 . + CDS 18293 - 19543 1679 ## COG0151 Phosphoribosylamine-glycine ligase 22 11 Op 8 . + CDS 19554 - 23315 4310 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain 23 11 Op 9 . + CDS 23317 - 24702 1648 ## COG0015 Adenylosuccinate lyase + Term 24703 - 24748 7.4 + Prom 24725 - 24784 8.7 24 12 Op 1 . + CDS 24830 - 25642 820 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 25704 - 25757 1.6 + Prom 25697 - 25756 8.6 25 12 Op 2 . 
+ CDS 25788 - 26648 639 ## COG0582 Integrase + Term 26662 - 26699 1.2 + Prom 26863 - 26922 10.1 26 13 Tu 1 . + CDS 27032 - 27223 227 ## gi|167756397|ref|ZP_02428524.1| hypothetical protein CLORAM_01930 + Term 27228 - 27259 1.1 - Term 27216 - 27247 0.1 27 14 Tu 1 . - CDS 27252 - 28259 1076 ## COG2502 Asparagine synthetase A - Prom 28288 - 28347 9.4 + Prom 28332 - 28391 5.7 28 15 Tu 1 . + CDS 28413 - 29687 1410 ## COG3681 Uncharacterized conserved protein + Term 29693 - 29729 -0.4 + Prom 29803 - 29862 6.7 29 16 Tu 1 . + CDS 29960 - 30883 451 ## COG0583 Transcriptional regulator + Term 30959 - 31009 8.0 + Prom 30905 - 30964 4.4 30 17 Op 1 . + CDS 31032 - 31250 125 ## gi|237733063|ref|ZP_04563544.1| predicted protein 31 17 Op 2 . + CDS 31271 - 32125 799 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 32 17 Op 3 . + CDS 32142 - 32447 385 ## gi|237733065|ref|ZP_04563546.1| predicted protein 33 17 Op 4 . + CDS 32455 - 32931 450 ## COG0716 Flavodoxins 34 17 Op 5 . + CDS 32947 - 33504 589 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) + Term 33532 - 33585 -0.8 + Prom 34479 - 34538 9.6 35 18 Op 1 22/0.000 + CDS 34561 - 35694 1183 ## COG0263 Glutamate 5-kinase 36 18 Op 2 . + CDS 35694 - 36920 1432 ## COG0014 Gamma-glutamyl phosphate reductase + Term 36971 - 37032 4.2 + Prom 37071 - 37130 10.0 37 19 Op 1 . + CDS 37358 - 37867 310 ## ZPR_1044 probable secreted protein + Prom 37879 - 37938 10.4 38 19 Op 2 . + CDS 37959 - 38378 638 ## BT_0631 hypothetical protein + Term 38404 - 38434 1.3 + Prom 38881 - 38940 8.0 39 20 Op 1 . + CDS 39096 - 39548 531 ## Tlet_1610 MarR family transcriptional regulator 40 20 Op 2 35/0.000 + CDS 39548 - 41776 263 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 41 20 Op 3 . + CDS 41769 - 43622 207 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Term 43648 - 43679 -1.0 42 20 Op 4 . 
+ CDS 43697 - 45022 1290 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase 43 20 Op 5 . + CDS 45032 - 45406 373 ## gi|237733076|ref|ZP_04563557.1| predicted protein 44 20 Op 6 . + CDS 45471 - 45798 338 ## COG2035 Predicted membrane protein Predicted protein(s) >gi|223714066|gb|ACDT01000149.1| GENE 1 134 - 1132 1210 332 aa, chain - ## HITS:1 COG:no KEGG:CA_C2564 NR:ns ## KEGG: CA_C2564 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 1 316 1 316 333 528 84.0 1e-148 MALFESYERRIDQINAALAKFDIKSIEEAKAICDEKGIDVYEIVKSTQPICFENACWAYT VGAAMAIKNGDVKAVDAAKTIGVGLQSFCIPGSVADDRKVGLGHGNLGSMLLDEGTECFA FLAGHESFAAAEGAIKIAEKANKVRTKPLRVILNGLGKDAAKIISRINGFTYVETQFDYF TGEVKVLSTTAYSDGPRAKVNCYGADDVREGVAIMHKEGVDVSITGNSTNPTRFQHPVAG TYKKECIEQGKKYFSVASGGGTGRTLHPDNMAAGPASYGMTDTMGRMHSDAQFAGSSSVP AHVEMMGLIGMGNNPMVGATVAVAVAIEEACK >gi|223714066|gb|ACDT01000149.1| GENE 2 1148 - 1840 1015 230 aa, chain - ## HITS:1 COG:CAC2565 KEGG:ns NR:ns ## COG: CAC2565 COG0822 # Protein_GI_number: 15895825 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Clostridium acetobutylicum # 1 230 1 230 230 365 78.0 1e-101 MIYSHEVEQMCSVAQGASHGCAPIPEEGKWVYSKEIKDISGLTHGIGWCAPQQGACKLTL NVKEGIIEEALIETIGCSGMTHSAAMASEALVGKTLLEGLNTDLVCDAINTAMRELFLQI VYGRTQSAFSEGGLPVGAGLEDLGKGLRSMTGTSYSTKAKGPRYLELTEGYINRIALDAN NEIIGYEFVNLGRMMDMVKAGKDANEAMKEATGTYGRFDDAAKYIDPRKE >gi|223714066|gb|ACDT01000149.1| GENE 3 2235 - 2468 57 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756373|ref|ZP_02428500.1| ## NR: gi|167756373|ref|ZP_02428500.1| hypothetical protein CLORAM_01906 [Clostridium ramosum DSM 1402] # 1 77 1 77 77 132 100.0 5e-30 MNHELLILLKRNGVKFINIHSIGYDHINIKATKVLGIGISNNPYSVSSIADFISLHIPVS AKTYHAINKDNFYKGER >gi|223714066|gb|ACDT01000149.1| GENE 4 2677 - 3669 1103 330 aa, chain + ## HITS:1 COG:CAC2691 KEGG:ns NR:ns ## COG: CAC2691 COG1052 # Protein_GI_number: 15895949 # Func_class: C 
Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 313 1 313 326 370 56.0 1e-102 MKIIAFEVRSDEMADFEIMNNSNNFEITYYKEYLCQKNLSLLNGYDGIAFLGESKINHEI LDAIAKAGIGVIATRTIGYDHIDIEYAKELGIKVCNSSYGPEGVADFTVMLMLLSLRHYK KALWLGQVNDYSLQGLEGREMKDLTIGIMGTGRIGQAVLKNLSGFGARLLAYDIHQNETA KIYATYVDLEQFYQECDIISLHMPYLKSTHHLINDQTISKMKDGIIIINCARGQLCNTES LIRGIENKKIGALGLDVVEGEEGIYHQDMRTDIIKNKNMAYLRQFPNVVMTQHLAFYTDA AVSSMVQLSLNGLWACYNDLDWDNELTKDL >gi|223714066|gb|ACDT01000149.1| GENE 5 3832 - 5121 1609 429 aa, chain + ## HITS:1 COG:BS_eno KEGG:ns NR:ns ## COG: BS_eno COG0148 # Protein_GI_number: 16080443 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Bacillus subtilis # 1 429 1 430 430 607 73.0 1e-173 MSKIKSVYAREVLDSRGNPTVEVEVTTECGAFASAIVPSGASTGVHEAVELRDGDKNRYL GLGTEKAVKNVNEIIAKELIGKEVTNQREIDETMIALDGTPNKGKLGANAILGVSLAIAK CAATSLDLPLYRYIGGANAYVLPTPMMNIINGGAHADNNVDFQEFMIMPVSAPTFKEAIR MGAEVFHALKAVLKGKGLNTAVGDEGGFAPNLASNEDAIKTILEAVEKAGYKPGEDIKLA MDVASSEFYQDGKYVLPGENNKSFTSKELVDFYAELCSKYPIISIEDGLDQDDWEGWDYL TEKLGDKVQLVGDDFFVTNTKRLEEGIKRGVANSILIKVNQIGTLTETLEAIEMAQKANY TAVISHRSGETEDTTIADIAVATNAGQIKTGSASRTDRIAKYNQLIRIEDRLGKQSKYSG LSGFYQLNK >gi|223714066|gb|ACDT01000149.1| GENE 6 5217 - 5570 483 117 aa, chain + ## HITS:1 COG:SA0685 KEGG:ns NR:ns ## COG: SA0685 COG1780 # Protein_GI_number: 15926407 # Func_class: F Nucleotide transport and metabolism # Function: Protein involved in ribonucleotide reduction # Organism: Staphylococcus aureus N315 # 1 116 1 119 132 97 42.0 8e-21 MKVVFASRTGNVQSIVDRLSVDALEISSGDEAVSEPFLLITYTDGYGDVPMEVESFLNSN GDHLKGVIVSGDQGYGEAFCKAGDVIAEQYNVPCLYKVENDGTDEDIEEIKKIINNQ >gi|223714066|gb|ACDT01000149.1| GENE 7 6090 - 6452 536 120 aa, chain + ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent 
transcriptional regulator # Organism: Clostridium acetobutylicum # 2 118 1 117 122 146 67.0 7e-36 MMHESGEMYLETILLLKKQNGNVRSIDIARELGFSKPSVSRGVGILKNDGYIIVDSKGYI ELTEKGTEKATAVYEKHECLTEFLMHTAKVSKEVAEDDACKIEHIISDEVFEGIKKFLNK >gi|223714066|gb|ACDT01000149.1| GENE 8 6534 - 6857 303 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733042|ref|ZP_04563523.1| ## NR: gi|237733042|ref|ZP_04563523.1| predicted protein [Mollicutes bacterium D7] # 1 107 1 107 107 164 100.0 2e-39 MKTNELEKILGTTKDTLRYYEKEELISPSRDDNGYRNYSYEDIRIIKNIIMLRSFDLSIE DIRRIFNNEISLNTCLNQKKEYLLKEIGKKQRIIDLIEKNLSRKKAF >gi|223714066|gb|ACDT01000149.1| GENE 9 7033 - 7287 130 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756380|ref|ZP_02428507.1| ## NR: gi|167756380|ref|ZP_02428507.1| hypothetical protein CLORAM_01913 [Clostridium ramosum DSM 1402] # 1 84 1 84 84 145 100.0 7e-34 MGICYLQCGHILDYYHGIYIDLDIYAEYSCHLFESISLDNIHEIFAILKDKDIIIEDPLN LNGIFTQYQNLSELEKYFDNNMSK >gi|223714066|gb|ACDT01000149.1| GENE 10 7654 - 8652 812 332 aa, chain + ## HITS:1 COG:FN0032 KEGG:ns NR:ns ## COG: FN0032 COG4927 # Protein_GI_number: 19703384 # Func_class: R General function prediction only # Function: Predicted choloylglycine hydrolase # Organism: Fusobacterium nucleatum # 1 331 1 327 329 256 42.0 3e-68 MYHSRFKGEHYQIGFKWGRNLLKHKQLFLTNVPFEITLEHYDFALQCLPIYQKFFPEIIE EIRGIADGQKCSFKSICAVLLSMYCIVPEVHCSCFAFRNENEILFGRNSDFLTKIEKLYT NCIYQFTNDSYAFNGNTTACVEMEDGINEYGLAIGLTAVYPTVKQLGLNAGMMLRLFLEK CKTVDEVIQLLNTLPIASSQTFTLVDTGGDIALIECNSKQIKINRNNQNAFATNWFFSEA MREYNNHLIDNWQAQKRYQTIKEAILTDKINDFNDAAALLSGRYGYICQYDRKTGKDTVW SVIYDVKNKRIFRVEGNPSRKSFMEDKRFKFK >gi|223714066|gb|ACDT01000149.1| GENE 11 8886 - 9707 645 273 aa, chain - ## HITS:1 COG:L0271 KEGG:ns NR:ns ## COG: L0271 COG0266 # Protein_GI_number: 15672335 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Lactococcus lactis # 1 273 1 271 272 71 26.0 2e-12 MIELPEAYAIADDLKKEILGKTIIDLGGNYTDHKFTFYEGNPNSYKELLVGKKVTGIIKR 
NYYVEIVIENYRLTFRDGANIRYYQKPTKLKKSKLLITFADQSFINVTTSMYCFIGLFDQ ITGSNNEYYQTELTSIGPLDQEFTLNYFKTLITDETEKLSIKAFLATKQRILGIGNGVAQ DIMFNAKLFPKRRIKTLNEQDIKNLYDALIRTLTKMVENHGRDSEKDIYGNPGGYKTILC AKSYKSGCPICHCEIKKEQYLGGSIYYCPNCQK >gi|223714066|gb|ACDT01000149.1| GENE 12 9896 - 10321 485 141 aa, chain - ## HITS:1 COG:CAC3445 KEGG:ns NR:ns ## COG: CAC3445 COG0454 # Protein_GI_number: 15896686 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 3 141 5 143 147 80 35.0 1e-15 MIKEINNKMDYDHLLTESDPNINLVRDYLENGALYGFYLKDEPVSFIAVEIIDGEVEIKN LLTLSEHRGHGYAKALIKFIEETYNSYDTFLVGTANSSLENITFYTRLGYVYSHRIENFF IDYYPNKIIENGMQATDLMYF >gi|223714066|gb|ACDT01000149.1| GENE 13 10442 - 11458 982 338 aa, chain + ## HITS:1 COG:lin0516 KEGG:ns NR:ns ## COG: lin0516 COG2843 # Protein_GI_number: 16799591 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Listeria innocua # 35 337 158 459 475 139 27.0 1e-32 MMLFLVVGCSSSNQKNKKEETVKKEEKTEEKEPVNTSIKLNFAGDCTLGNYAGQAYDGSF NQEYAKQGNDATYFFKNVKNVFENDDLTIVNLEGPLTSATSHVEKQFPFSGPKEYVNILT SGSVELVSIANNHSEDYYEQGMTDTKQVLDEKGIGYFGYETSCVKEIKGIKIGFLGYRSM SLSMNNEKGRATIKAAINDLKNNQGANAVVVFYHWGIEREYYANSDQRELAKFSIDSGAD LVMGSHPHVVQGTEEYNGKQIVYSLGNFCFGGNRNPSDTDSMIYSITMNFTDGVYTGDSH EIIPCSVTSVKGRNNYQPIILEGNEKERVLTKIQKYSY >gi|223714066|gb|ACDT01000149.1| GENE 14 11555 - 11857 455 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756385|ref|ZP_02428512.1| ## NR: gi|167756385|ref|ZP_02428512.1| hypothetical protein CLORAM_01918 [Clostridium ramosum DSM 1402] # 1 100 1 100 100 157 100.0 2e-37 MKRKLVYFLYSGYIVFLVGLLGSRFANSISKGTLNIICLFIFIGVLAFCCGGFVGYPTRQ IDTEQTIKDKLESRDDAILIKELFIMIIPLVIIFIINYFC >gi|223714066|gb|ACDT01000149.1| GENE 15 12190 - 13860 2189 556 aa, chain + ## HITS:1 COG:CAC3201 KEGG:ns NR:ns ## COG: CAC3201 COG2759 # Protein_GI_number: 15896448 # 
Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Clostridium acetobutylicum # 1 556 1 556 556 708 64.0 0 MLTDVEIAQSAKMKPISEIAKKVGLEDDDLELYGKYKAKIALEAINRLENNPDGKLILVT AINPTPAGEGKTTTMIGLSQALNKLGKKSVVAMREPSLGPCFGIKGGAAGGGYAQVVPME DINLHFTGDIHAITAANNLIAAMLDNSIHQGNPLNIDVRNIVWRRVVDLNDRALRNTICG LGGKVNGMPREDGFDITVASEVMAILCLATSLEDLKERAGKMIVAYDYAGNPVTVNDIEA TGAVTLLLKDAIKPNLVQTLDHTPVFVHGGPFANIAHGCNSVMATKLAIKLGDYAITEAG FGADLGAEKFLDIKCRQANLDPQAVVIVATVRALKMHGGVDKKELGTENLEALAKGIKNL EKHIENIAKYNLPSVVAINAFPTDSEAELQLLKDTCNKMGVDVAISKVWEKGADGGIELA EKLLEILDTKEANYQPLYDLNLSIKEKIETIAKEIYGADGVAFDKKVLTKMKKYEAQGLA DLPICIAKTQYSLSDQPTLLGRPRGFTIKINDLIPSAGAGFLVAISGSIMRMPGLPKRPA AVNMDIDKDGKIVGLF >gi|223714066|gb|ACDT01000149.1| GENE 16 13909 - 14397 678 162 aa, chain + ## HITS:1 COG:CAC1390 KEGG:ns NR:ns ## COG: CAC1390 COG0041 # Protein_GI_number: 15894669 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Clostridium acetobutylicum # 1 156 1 155 159 155 57.0 3e-38 MKVAIIMGSTSDLSKVEPAIGILKDYGVEVNVRCLSAHRAHLGLSTFIKETETDGTEVII TAAGMAAALPGVVASQTVLPVIGVPISGATLDGMDALLSIVQMPSGIPVATVAINGSKNA AYLALQIMAIKHDKIKEKLLAYRKDMEKQAMSANEEVIAKYK >gi|223714066|gb|ACDT01000149.1| GENE 17 14419 - 15120 878 233 aa, chain + ## HITS:1 COG:FN0988 KEGG:ns NR:ns ## COG: FN0988 COG0152 # Protein_GI_number: 19704323 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Fusobacterium nucleatum # 2 233 4 235 237 257 57.0 1e-68 MAELLYEGKAKQVYKTDKEDEYLIHYKDDATAGNGVKHDQFEGKGVLNNTISCIIFDMLE EAGIKTHMIKKLNERDILVKKVDIFPLEVIIRNITTGSFCKRLGAPEGIVLDEPIFELSY KNDDYGDPLINEDHAVALKLCTREEYNFIKQETLKINELLKEFFLSLNLKLVDFKIEFGK TPDGQILLADEISPDSCRLWDVETNQKYDKDVFRQDIGDLIETYKAVLARMQK >gi|223714066|gb|ACDT01000149.1| GENE 18 15137 - 16162 847 341 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126667548|ref|ZP_01738518.1| Ribosomal protein 
S7 [Marinobacter sp. ELB17] # 1 334 8 336 354 330 50 7e-90 MDYKKAGVDIEAGYKSVELMKKHVKETMRPEVLGGLGGFAGAFDLSAIKDMDEPVLLSGT DGCGTKVKLAFVMDKHDTIGIDAVAMCVNDIACSGGEPLFFLDYIACGKNYPEKIATIVS GVAEGCKQSGCALVGGETAEHPGLMPEDDYDLAGFAVGVVDKKDIINGENIKPGDVLVGI ASSGVHSNGFSLVRNVFEMNKETLDTYYDELGKTLGEALIAPTRIYVKALKNVKDAGVRI KGCSHITGGGFDENIPRMLPDGVKAVIKKDSYKVPAIFDLIAKKGDIAEEMMYNTFNMGL GMVIALDPADIETSMAAIKAAGDECYIIGSIEAGDKGVQLC >gi|223714066|gb|ACDT01000149.1| GENE 19 16156 - 16749 687 197 aa, chain + ## HITS:1 COG:CAC1394 KEGG:ns NR:ns ## COG: CAC1394 COG0299 # Protein_GI_number: 15894673 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Clostridium acetobutylicum # 1 197 1 203 204 184 49.0 1e-46 MLKIAVFVSGGGTDLQSVIDAVKNNSINGEIAIVISNRKNAYGLERARQAGIETAVVRKD DELIVKMLKERNVGLVVLAGYLAILTDVLIDAYPNKIINIHPSLIPSFCGPGHYGMHVHE KVLARGVKVTGATVHFVSSEVDGGPIILQEACNIDDLDNAEDIQARVLEIEHRILPKAVA LFCDGKIIVENERAKVI >gi|223714066|gb|ACDT01000149.1| GENE 20 16751 - 18274 1697 507 aa, chain + ## HITS:1 COG:CAC1395 KEGG:ns NR:ns ## COG: CAC1395 COG0138 # Protein_GI_number: 15894674 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 1 507 2 499 499 488 52.0 1e-137 MKRALVSVTNKDGIVDFCKGLVDLGFEIVSTGGTMAKLQAENVPCIAIDDVTGFPEILDG RVKTLHPKVHGGLLFRRDLQEHVDTVKKMDIQPIDVVCVNLYEFEKALKAGKPMEDMIEN IDIGGPSMIRSAAKNFKDVLIVTDPADYDNVLTAIKDKTDDFDFRLNLAYKAFSTTGAYD AMISRYFAGVVGDDFPDTLNISLKKTEHLRYGENSQQRANGYVDPFVQETLLDYEQLHGK EISFNNVNDLYGAIAVVREFKDQIVTAAIKHSTPCGVAIGKDGYDSYMKAYEADPQSIFG GIVAVNYKIDKKTAEEMHKIFLEIVAAPDFDDDALEVLKKKKNLRILKLKNLYAREAKYD IKYLEGKVLVQDINTEMIKEMNCVTTAKPTEAQLKDMEFGMRVVKFVKSNAICIVKDGVT LAVGGGQTSRIWALENAIINNKDKDFKGAVLASDAFFPFSDCAEVAYKAGIGAIVQPGGS VRDQDSIDFCNEKGIPMVFTGYRHFRH >gi|223714066|gb|ACDT01000149.1| GENE 21 18293 - 19543 1679 416 aa, chain + ## HITS:1 COG:CAC1396 KEGG:ns NR:ns ## COG: 
CAC1396 COG0151 # Protein_GI_number: 15894675 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Clostridium acetobutylicum # 1 414 1 413 416 402 48.0 1e-112 MKVLVIGSGGREHAIAYKLNQSSKVDKIYAIPGNPGIAKIGECIDGKVEDNEMIVNFAKE HQIDLTVIGPEVPLCNGLANDLEAAGLMAFGPTKHAATLEGSKAFSKDFMIRHSIPTAKY KEVNSYDEAVKAISEFDYPLVIKADGLAAGKGVVIVDNHEDATVTLKEMMIDGSLDGAGS KVVLEEFLTGFECSLLCFTDGETIVPMVSAKDHKQIFDGNKGPNTGGMGTVSPNPFMPEG MDEVIKRDILDPFMQGLKDDNMDYRGVVFIGLMIEDGKAKVLEFNVRFGDPETQSIMLRL DSDLYDIMVGCATKTLKDVEVKWNNQHVACLVLSSGGYPGNYQKGIEIENIDDCDDCVVF HAGTAIKDGKLVTNGGRVLNICATGVSLDEVREKVYAVAKKIDFEGKYYRSDIGLR >gi|223714066|gb|ACDT01000149.1| GENE 22 19554 - 23315 4310 1253 aa, chain + ## HITS:1 COG:CAC1655_1 KEGG:ns NR:ns ## COG: CAC1655_1 COG0046 # Protein_GI_number: 15894932 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Clostridium acetobutylicum # 2 970 3 970 985 982 52.0 0 MSDVKRIYVEKRTAYATEANEIKHNLIEQLGLEIDTLRIINRYDVQGVSDGILKQGINTI LAEPMVDDVYQEEFPTNNQEVVFAIEFLPGQYDQRADSCEQCFAILTGNSSAKVKCARVF AITFKGDKDEILTKIQSFLINPVDQRLAILEKPADLNDVAPEIKPVPTINGFNELDKTGL EKFLTDNGMAMNYEDLKVTQDYFKNEEKRDPTEAELKVLDTYWSDHCRHTTFATVITDLD IENGAFKEILEKDIEDYKTSRHLVYGIDTKRPLTLMDLATISMKELRKTGYLDDLEVSEE INACSVEITVHTTDGDEQWLLMFKNETHNHPTEIEPFGGAATCLGGAIRDPLSGRAYVYQ SMRITGAGDPRKSLAETMAGKLPQRKLCQESAHGFSSYGNQIGLTTGYVHEIYDDGYVAK RMELGAVVAAAPKEQVKRLEPLKDHIVLLIGGRTGRDGIGGATGSSKSHDVKSIETAGAE VQKGNPVEERKIQRLFRNKEVSEKVVRCNDFGAGGVCVAVGELAPGLVIDLDAVLKKYEG LTGTELAISESQERMAIVIDEKDADFIKAECAKENLEVVQVAVVTDQNRLVMKHMGEDIV NISRDFLDAAGAKRYQNVKITLPDFEKTPFDKQSQTSFVETVKETLSKLSVASQKGLIER FDSSIGNGTVLSPFGGIKYHTETEGMAALIPVLGKETTTASVMTYGYNPLISKWSPYHGA MYAIVESVAKIVAMGGNYHTIRFSFQEYFEKLLDNPITWGKPTAALLGAYRVQSALKLAS IGGKDSMSGSFEDLHVPPTLVSFAITSGDVNDIMTPEIKETGHVLAEIIINKDQYHIFDF NHLQKQYDAIMELMKDKKIYSAYTVKDGGVIEAVSKMAFGNGVGVAFNNENSLESYIVRD YGNIIIEVKSEKDLSIFDNYRIVGTTNDSELLSYGGETISLDEAYAINKAPLEKVYPTRE 
KAPTSKIKVAACKTRADLKPKVTVETPLVVIPVFPGTNCEYDSKRAFEKAGAKVQLVLIR NKTEQMLKDSIDELEVAIKQANIVMLPGGFSAGDEPEGSGKFIATVLKNPRLKAAITDLL DNRDGLMIGICNGFQALVKLGLLPYGKIQEMSKDDPTLTFNTIGRHVSQMVDTRIGSVKS PWLKYVNVDDIHTIPVSHGEGRFVAPEEVIEELFENGQVFSQYVDPNRKVTMQTPYNPNG SMYAIEGIVSLDGRVIGKMGHSERQGENRFKNVYGEMDQKLFEAGVDYFKGGK >gi|223714066|gb|ACDT01000149.1| GENE 23 23317 - 24702 1648 461 aa, chain + ## HITS:1 COG:PA2629 KEGG:ns NR:ns ## COG: PA2629 COG0015 # Protein_GI_number: 15597825 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Pseudomonas aeruginosa # 13 456 11 454 456 425 47.0 1e-119 MEKQEFECLTLCPLDGRYSGVKDALGEYFSEYALVKYRVFVEIQWLKFLIENVESDVLAK FDLQDMDKLTTISSEFNYDSFARIKEIENTTRHDVKAVEYFIDEKVDALGFGYLQSFVHI GCTSEDINNTSYACMLKYGLKDVWLPKAKEFAAIIDKWAEEHSNDAMLAHTHGQPATPTT IGKEFKVYAYRFLSSIENVEAVKIKAKFNGATGNYSAILTAFPNEDWQVLAKKFVEEYLG LTFNPLTTQIESHDYTCHILDGIRHFNNVLVDFDVDMWLYISMEYFKQIPVKGEVGSSTM PHKVNPIRFENSEANIDMSNNICIALSNKLPKSRMQRDLSDSSSQRNLGLAFGYSLQAIN ETMNGLAKCVVNKDKLASDLNEKWEVLAEPIQTMLRKYGVPDAYDTLKALTRGKSISKED ILKFAESLDILSDQDRQTLVDMTPASYIGLAKELAKIELNK >gi|223714066|gb|ACDT01000149.1| GENE 24 24830 - 25642 820 270 aa, chain + ## HITS:1 COG:STM0867 KEGG:ns NR:ns ## COG: STM0867 COG0561 # Protein_GI_number: 16764229 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Salmonella typhimurium LT2 # 1 269 1 269 271 254 47.0 2e-67 MNVKLIAVDMDGTFLTNNNEYDQERFTRQYEFMKEKGIHFVVASGNQYYQLRSFFPDIHN ELAFIAENGAYIVDQGKDVFVAKLAKEDIQTVLKVVNQYPEVEKIVCGRKSAYISKYIDE DFYNTMNFYYHRLAKVDNFDDFDDTIFKIYLHCSKDNFETILTELKEKIGHIMTPVDCGH FGIDLIIPGINKAHGINLLMERWKISDAETMAFGDSGNDLEMLEQATYGFAMANAKEAIK KIADYTISSNEEHGVLEVVDWYISKKKMFE >gi|223714066|gb|ACDT01000149.1| GENE 25 25788 - 26648 639 286 aa, chain + ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 13 276 3 
262 265 111 27.0 2e-24 MNKREINQKLIVNYKNHLIDEEKSLATITKYIRDIEKFYQYADKKEVTKELVKLYKEELM KSYKPTSINSMLAALNQFFEYNGWLECKIKELKIQKRVFLEESKELSKDEYKRLVNAARK QKNERLYVLLQAICSTGIRVGEHRYITVQALKDGYAQIYNKGKVREIFFSDDLKRILLKY CHKNKIENGAIFVTRSGRPLDRSNIWKAMKDLCDDAKVERSKVYPHNLRHLFAVTYYNLK KDIARLADLLGHSSMDTTRIYTMSSGREFKRYFDQMDLVFSNRKNN >gi|223714066|gb|ACDT01000149.1| GENE 26 27032 - 27223 227 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756397|ref|ZP_02428524.1| ## NR: gi|167756397|ref|ZP_02428524.1| hypothetical protein CLORAM_01930 [Clostridium ramosum DSM 1402] # 1 63 22 84 84 117 100.0 2e-25 MIISDLRYVGKDVLYSAILSLNVDDYSMKEWHDAIIYLTGNDVDKSIGKAAAKEYLCDYY RHN >gi|223714066|gb|ACDT01000149.1| GENE 27 27252 - 28259 1076 335 aa, chain - ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 9 335 3 327 327 394 58.0 1e-109 MNLIIPENYDPVLDLRQTQEAIKYIRDTFQKEIGRALHLSRISAPLFVQKSSGLNDNLNG FESPVSFTINEIPGEQIEVVHSLAKWKRMALKKYGFKMHEGLYTNMNAIRKDEEVDNLHS YYVDQWDWEKVIAKEERTEETLKRHVKKIFKVIKHMEHEVWYKYPHAVNRLPDKIHFFTS QELEDRYPDLTPNERETAICKELGCVFVMGVGCKLKSGIKHDGRAPDYDDWNLNGDILFW FEPLQCALEISSMGIRVDEDTLVKQLKAENCLDRLKLPYHSQIVNKELPYTIGGGIGQSR LCMLLLKKVHVGEVQASIWPQEMIDECTKHNINLL >gi|223714066|gb|ACDT01000149.1| GENE 28 28413 - 29687 1410 424 aa, chain + ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 22 422 3 410 411 343 47.0 3e-94 MALDEIKYQTYVQILKEELVPAMGCTEPIALSYCASKARDILGTTPTRCLVEISGNIIKN VKSVIVPNTNGLKGIEAAVAAGIIAGNANKVLEVIADVKKDQIKEIKDYLEHNCIIVRPL ETEHILDIKITLFDDDNYAMVRIVDQHTNIISIEKNHRVLFSKENKICNNEMVIDRCSLN VKDILEFANIVDLNDVKEIISRQIAYNSAIAKEGLTNNYGANIGSVLLNSGQDIITRAKA FAAAGSDARMSGCELPVVINSGSGNQGMTVSLPIIEYAKELNVNDEKLYRALVLGNLITI HQKTGIGRLSAYCGAVSAGCAAGCGIAYLYGADYECIAHTIVNSLAITSGIICDGAKSSC 
AAKIAASVDAGILGYKMYLDGQEFKDGDGIVVKGVENTIRNVARLGKEGMKETDKEIIKI MTNC >gi|223714066|gb|ACDT01000149.1| GENE 29 29960 - 30883 451 307 aa, chain + ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 16 305 1 291 291 228 38.0 1e-59 MYFYYSIMKQKAGGEMDIRVLNYFLMVAREENITRASQLLHITQPTLSRQLMQLEEELGV KLFQRSNHSIYLTNDGLLFRRRAQEIVNLADKAQAELKQDTDVLSGNIAIGCGEMRSAQE IAKLITDFQNIYPFVQFELYSGNNENIKERMEQGTLDLGLLLEPVNVVKYDFIRMRTKEQ WGVLIHKDNPLSKKQVIYPGELVGTKVITIHLDTPVHHELASWSGNFAKEMESCANYNLL YNAVIVAKEKKGAVICVKLDNYYDDMKFIPFEPKLELTSVLAWKDRQSYSKATNTFISFI KDCYKHN >gi|223714066|gb|ACDT01000149.1| GENE 30 31032 - 31250 125 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733063|ref|ZP_04563544.1| ## NR: gi|237733063|ref|ZP_04563544.1| predicted protein [Mollicutes bacterium D7] # 1 72 1 72 72 110 100.0 3e-23 MRGEKYNTILNDLGFTNAEIELYIRLSHLGTSTKEKRIQIVSEKRRKILEEIHVKENQLQ EIDFLRHELQNA >gi|223714066|gb|ACDT01000149.1| GENE 31 31271 - 32125 799 284 aa, chain + ## HITS:1 COG:TM1009 KEGG:ns NR:ns ## COG: TM1009 COG0656 # Protein_GI_number: 15643767 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Thermotoga maritima # 4 270 6 276 286 300 55.0 2e-81 MEFITLSNGVKMPALGYVTFTMSNEETERCVLEAIKTGYRLIDTAQAYYNEEGVGNAIIK CGVPREELFITTKIWIENAGYEKAKASLHESLKKLQIEYIDLVLIHQAFNDYYGIWKALE EAYKLGKVRAIGVSNFYLDRFMDLATFSEIKPMVNQLETHVFQQHKNDKKFLERYGTKLE AWAPFARGSQGIFENEVLMKIAKQHNKTIGQVALRFLIQNGIIAIPKSAHKNRMEENFNI FDFSLSDEEMKKIEELDLGENVFMNHENAEDIDKFFKMFHVGQK >gi|223714066|gb|ACDT01000149.1| GENE 32 32142 - 32447 385 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733065|ref|ZP_04563546.1| ## NR: gi|237733065|ref|ZP_04563546.1| predicted protein [Mollicutes bacterium D7] # 1 101 1 101 101 164 100.0 2e-39 MIRHIFIATIKEGVTDSEIQKEMELMRSMKNELPEIKEIIIEKSTGWIGLSDVVTMTIDV ESKQDFDKVIQSVAHQKVSATAPDFFRTDNFILTQVEYEKE 
>gi|223714066|gb|ACDT01000149.1| GENE 33 32455 - 32931 450 158 aa, chain + ## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 3 157 8 175 179 90 31.0 1e-18 MSKSLVAYFSASGVTKKVAERLNEVVQGNLFEIVPEQPYTSADLNWTNPKSRSSVEMNDK SFRPAVKEKISDMGEYDIIYLGFPIWWYIAPTIINTFLEQYDLRDKTIIPFATSGMSGMG KTNDYLENSCKGARLMKGQRFDENVSADELSDWVKKMK >gi|223714066|gb|ACDT01000149.1| GENE 34 32947 - 33504 589 185 aa, chain + ## HITS:1 COG:CC0205 KEGG:ns NR:ns ## COG: CC0205 COG2249 # Protein_GI_number: 16124460 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Caulobacter vibrioides # 4 166 8 169 185 115 35.0 3e-26 MQNILVVSGHTDLNNSVANKAILERLENKLPQAEFVYLDKLYSDFQIDVEAEQEKLLNAD IIVLQFPIFWYAMPSLLSRWLEETFQHGFSHGSTGDKLKGKKLIASFTTGAPEFMYSYEG AQKYPIEDFLPPIKAMCNLCGLDYFGYVYTGGVSYQNRNDIEKMAEMKEKAVMHADKLLE LLETL >gi|223714066|gb|ACDT01000149.1| GENE 35 34561 - 35694 1183 377 aa, chain + ## HITS:1 COG:HI0900 KEGG:ns NR:ns ## COG: HI0900 COG0263 # Protein_GI_number: 16272838 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Haemophilus influenzae # 6 372 2 367 368 269 38.0 8e-72 MRDLSNIKNLIIKIGSSSLCDDEGNINKEKILNLIQQIAYIKRKGISITLVSSGAINAGV HVMNLECRPQTIPQKQALAAIGQASLMQIYEDLFSLFNLKCAQILLNHDDFDDRKRLMNF NNAMQALIEYDVVPIINENDTLAVEEIKVGDNDTLASLVVPAVNADMVVLVSDIDGLYDD NPHTNKNARLIRNVDGITKEIESMAKDASSKVGTGGMITKIRAAKVCNDFGCDMAIVNGN QPNVLIDLIEGKDVGTYFDGKPGRLLNSRQHWIMYRSMSKGTIVVDEGAKKALVTCHSSL LPKGIIEVRGNFLISQIIDIVDGNDSLLARGMVNYSSDEIRLIKGLNTSEIEDVLHYKDY DEVVHANNLVLVEGVKN >gi|223714066|gb|ACDT01000149.1| GENE 36 35694 - 36920 1432 408 aa, chain + ## HITS:1 COG:CAC3254 KEGG:ns NR:ns ## COG: CAC3254 COG0014 # Protein_GI_number: 15896499 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: 
Clostridium acetobutylicum # 7 408 12 417 418 390 50.0 1e-108 MIEEQLKQAKLACRKMQNIDKDTKIKALEAISKNLISNIDYIVAENKKDVACAKENGISE AMVDRLLLTRSRIESIANDVLKVAGLHDCIGEVVREIKRPNGLIIKQVRIPIGTVATIYE SRPNVTVDIAAICIKTNNVCILKGGKEAINSNIALVRVIKEAITNILPENVVNLIEKTDR SVVTEVITANNYIDVVVPRGGAGLIQHVVNNATVPVIETGAGICHLYIDQEADLEMAVEV AVNAKISRPSVCNAIETILVHQGVANEFLTLLKPRFDKIKIFGDEIVLKYLEGNKATTKN YATEYDDYICNIKVVNDINEAIEHIYDYSTKHSESIITDNEDTARYFMDSLDSACVYHNA STRFTDGGEFGFGAEVGISTQKLHARGPIGLQEMTSTKYKIFGNGQIR >gi|223714066|gb|ACDT01000149.1| GENE 37 37358 - 37867 310 169 aa, chain + ## HITS:1 COG:no KEGG:ZPR_1044 NR:ns ## KEGG: ZPR_1044 # Name: not_defined # Def: probable secreted protein # Organism: Z.profunda # Pathway: not_defined # 1 120 105 230 282 62 30.0 4e-09 MANSSGYCYYYFYPTMWILSGYYGLGILSKYSIIEVMAQRLPNSIIREPRILTRTKLYFN EQIINIYNTHLTYADNQYRIKQMDYVKKHVDFNSYSILAGDFNSFKMNNKFKMEGVKCIN ENKKYKTFCEFAAPDNIFYADFFELKNCGLKKSSFSDHDLLYGEFFLKG >gi|223714066|gb|ACDT01000149.1| GENE 38 37959 - 38378 638 139 aa, chain + ## HITS:1 COG:no KEGG:BT_0631 NR:ns ## KEGG: BT_0631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 137 1 147 152 91 33.0 1e-17 MNIDERVKKAYALHHSGYNCAQAVLGAYVDLFDLDMDRAMTIAYGFGGGVGMTREICGTL TGGAMALGLKYGKGEADVKQKKFVNEKVAALCKEFEDFHGSVVCGELLGLRETNKKVNRA TCDDLIAEVVRLLEKYLVD >gi|223714066|gb|ACDT01000149.1| GENE 39 39096 - 39548 531 150 aa, chain + ## HITS:1 COG:no KEGG:Tlet_1610 NR:ns ## KEGG: Tlet_1610 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: T.lettingae # Pathway: not_defined # 12 145 6 143 147 62 28.0 6e-09 MEEISFDDVEEVINNFQGLNKTMHYMMMEKTEKFDISPDQTRLLFMLQNHQNINQNALAK KLNITKATLSVRLQRLEKLGYLTRTQDKNDKRNYILNITKTGEVFIEAAIKIMKEKTMIM FEGVSKEQITVINDVINIMKKNIEKCKGEE >gi|223714066|gb|ACDT01000149.1| GENE 40 39548 - 41776 263 742 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 497 722 132 354 398 105 31 4e-22 
MLKLKHYFKKFWAPILLCVGLLFLQSQSELALPDYMSDIVSVGIQAGGFDSAVSDVLSEE TYNHLLVLMDEEDQQQFMDAYKLVEPSNLDKDTLDKFPKAKGQNIYKLKDLSEKKLDRLE SILVKPMLMVTSIDGMDKNSKEYQEQFGQLPPNMTPYDALAMMDNTTKAKMFSKIDSQME TMGESTLKIAAGNGVKAEYSRLGCDTDKIQNDYILWSGLKMLAIALAGTVCAVACGFLAS KVGAGVSRLLRRDVFEKVESFSNEEFNKFSTASLITRTTNDITQVQMVVIMFIRIVCFAP MMGIGALIKAFNNTPSMTWIIGLVLLVIFCLIGVTFAIVMPKFKIVQSLIDRLNLTMREN LSGMLVIRAFGNEKHSEERFEKANSDLTNVNLFVNRTMASLMPIMMFIFNVVTLLIVYYG AKQIDLGNIAIGQMMAFMQYAMQIIMSFLMIAMISIMLPRASVAADRIYEVISMEPKIVD PKEPKAFIESKKGLVEFNNVTFKYPGANEAVLENISFTAKPGETTAFIGSTGSGKSTLIN LIPRFYEVTEGNIKIDGVDIRDVNQHDLRDKIGLVPQKGLLFSGTIRSNLTYGAPEATDD ELEEVIRVAQAKEFIDNKEERLDSEISQGGTNVSGGQKQRLAIARAIAKNPEIFIFDDSF SALDFKTDAKLRQELDKMVKKTKNTVLIVGQRIASIMGAEQIIVLDEGKIVGKGTHEELM KNCDVYQEIAYSQLSKEELGHE >gi|223714066|gb|ACDT01000149.1| GENE 41 41769 - 43622 207 617 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 365 599 7 251 329 84 27 1e-15 MSNQPKRNGPKFGPGGMHGMRGGEKAKDFKGTLGKLFKYLRPYYFRLVIVVIFASASTVF AIVGPKILAKATDKLSEGIMAKVAGTGGIDFDYIGQIILILVGLYLISALFSYIQGFITS TISQRVAYDLRTSISQKMDRMPLSYFDKHTSGDILSRVTNDIDTIAQSLNQSMSQVITST VTVIGIFIMMLTISPVMTLIAVCVLPVSMVLIGLVMKRSQKYFARQQAALGDVNGHIEEM YGGHNVVKAFNGEAASVEQFNEYNDNLYESAWKSQFFSGLMQPITGFVGNVGYVAVCLLG GVLAGGGSITIGDIQAFIQYVRQFNQPITQLAQTMNMLQSTAAAAERVFEFLGEEELEPE TPKVTADEVAKVEGSVTFADVNFGYLKDQTIINDFSLHVHAGQTVAIVGPTGAGKTTIVK LLMRFYELNSGSILIDGKDIRDFGRQDLRSLFGMVLQDTWLFNGTIKENLMYGKLDASDE EVKEACKVAYVDHFVQTLENGYETMINEESSNISQGQKQLLTIARAFLKDPKILILDEAT SSVDTRTEVLIQKGMEKLMEGRTSFVIAHRLSTIRDADTIIVMKDGDIVELGNHDSLLAK DGFYASLYRSQFEGSDE >gi|223714066|gb|ACDT01000149.1| GENE 42 43697 - 45022 1290 441 aa, chain + ## HITS:1 COG:CAC0326 KEGG:ns NR:ns ## COG: CAC0326 COG2256 # Protein_GI_number: 15893618 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Clostridium acetobutylicum # 1 437 1 434 443 567 59.0 1e-161 
MMMKQQAMFKNMSNEPLANRLRPTTLTEYVGQRHLIGPGKILYQLINSDVVPSMVFWGPP GVGKTTLARIIANQTKAKFINFSAVTSGIKDIRAVMKQAQEVQDLGEKTIVFVDEIHRFN KAQQDAFLPYVEQGSIILIGATTENPSFEINSALLSRCKVFVLKALTTDDLFGLLHYALI SPKGFKDQNVMIDDDLLYMIAGFSNGDARVALNTLEMAVLNGAITHDRIVVDKETIEQCI NQKSLLYDKKGEEHYNIISALHKSMRNSDIDASIYWMSRMIEAGEDPLYVARRLIRFASE DIGMADSRALEIAVATYQACHYNGMPECNVNLAHCVTYMALAPKSNALYKAYERAKHDAL NTIADPVPLQIRNAPTKLMKELNYGKGYQYAHDYDDKITNMQCLPDNIKDHRYYFPTNQG TEAKVIKRMEQIEYLRKHHKD >gi|223714066|gb|ACDT01000149.1| GENE 43 45032 - 45406 373 124 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733076|ref|ZP_04563557.1| ## NR: gi|237733076|ref|ZP_04563557.1| predicted protein [Mollicutes bacterium D7] # 1 124 1 124 124 217 100.0 2e-55 MKDELGLGYCGLVCGLCSENESCVGCKKGGCPEKDFCKNYQCCTKSGYDSCCQCPDFPCQ DSILHKLRIRTFCKFIGQYGEEKLIVCLRKNESNGVIYHYPNSHLGDYDLESEEAIYHLI LKGK >gi|223714066|gb|ACDT01000149.1| GENE 44 45471 - 45798 338 109 aa, chain + ## HITS:1 COG:BH2069 KEGG:ns NR:ns ## COG: BH2069 COG2035 # Protein_GI_number: 15614632 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 3 103 5 102 278 69 37.0 2e-12 MIKNIIGGIAVGIANVIPGVSGGTMMVILGIFNRMMDAISGIFKKENPNRKEDIIFIFQV LVGAGVGIIGFAKILEVLFEYYPTQTIYWFIGLIAFSIPLFLKGEMKGE Prediction of potential genes in microbial genomes Time: Thu May 26 10:48:31 2011 Seq name: gi|223714065|gb|ACDT01000150.1| Coprobacillus sp. D7 cont1.150, whole genome shotgun sequence Length of sequence - 10275 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 485 458 ## gi|167756422|ref|ZP_02428549.1| hypothetical protein CLORAM_01955 + Term 501 - 543 3.1 2 2 Tu 1 . + CDS 628 - 837 119 ## gi|237733079|ref|ZP_04563560.1| predicted protein + Term 1022 - 1074 0.2 + Prom 1109 - 1168 6.5 3 3 Op 1 . 
+ CDS 1365 - 2435 1338 ## COG1058 Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA 4 3 Op 2 . + CDS 2445 - 2696 285 ## Cphy_1340 hypothetical protein + Term 2705 - 2739 0.4 - Term 2693 - 2727 3.1 5 4 Tu 1 . - CDS 2731 - 3618 839 ## COG0583 Transcriptional regulator - Prom 3647 - 3706 11.4 + Prom 3586 - 3645 10.2 6 5 Op 1 . + CDS 3789 - 5060 1683 ## COG0104 Adenylosuccinate synthase 7 5 Op 2 1/0.000 + CDS 5072 - 6619 1885 ## COG0519 GMP synthase, PP-ATPase domain/subunit + Term 6636 - 6667 0.1 + Prom 7122 - 7181 7.2 8 5 Op 3 . + CDS 7215 - 10275 1886 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits Predicted protein(s) >gi|223714065|gb|ACDT01000150.1| GENE 1 3 - 485 458 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756422|ref|ZP_02428549.1| ## NR: gi|167756422|ref|ZP_02428549.1| hypothetical protein CLORAM_01955 [Clostridium ramosum DSM 1402] # 1 160 136 295 295 234 99.0 9e-61 LAIIFGLEFLNPGEGKVVVNPDFPALSAGLFVKMIIIGAVSGATMIMPGVSGSMVLLILG EYYLFKSYLANVTSFSLDVIMPLGFMAIGIAVGIVVSAKLCSYFTKTHKAGFLSLILGLI VASSLVLIPFDVSYNLSLVVTSIIAVIFGGIIVLGLSKIQ >gi|223714065|gb|ACDT01000150.1| GENE 2 628 - 837 119 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733079|ref|ZP_04563560.1| ## NR: gi|237733079|ref|ZP_04563560.1| predicted protein [Mollicutes bacterium D7] # 1 69 1 69 69 98 100.0 1e-19 MLWAEEVVWQNLRGYEKKFDEAEWWTRLQNKPSPKGKRKTKTVNQKPVKIKKMYALPREA LWTMKSSKL >gi|223714065|gb|ACDT01000150.1| GENE 3 1365 - 2435 1338 356 aa, chain + ## HITS:1 COG:CAC3586_1 KEGG:ns NR:ns ## COG: CAC3586_1 COG1058 # Protein_GI_number: 15896820 # Func_class: R General function prediction only # Function: Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA # Organism: Clostridium acetobutylicum # 1 248 1 245 245 197 43.0 3e-50 MNVEIINVGTELLLGEIVNTNATYIQKMCKDLGFNVYYQTTVGDNHDRFKECLDVAFKRG ANCVITTGGLGPTQDDLTKELSAEYLGLELLYNEEEAKKVEEKCRFVTGWGDIPENNFKQ AFFPKDCYILENEVGTANGCVMSKDEKMIVNLPGPPKEMTYVVDHVLKDYLSQFKQDIIY 
TYDFLTMGIGESRVDEVLADLIDHQEEVSIALYASEETVRVRLGCKASDKETADKKIAPI KTEIEKILKDYIIKEKNLKEALANIMPSYWTIYECDNFTLRDDFNLGHNHTEDTMNLLID CREHPLGSIIKVTIDYQGRKDVFEIPLLKDPTLSYNKLESKIIERVYKFLTQPGYN >gi|223714065|gb|ACDT01000150.1| GENE 4 2445 - 2696 285 83 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1340 NR:ns ## KEGG: Cphy_1340 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 83 1 83 83 139 83.0 2e-32 MLEFRYDTQLLIEGKNLDEDVINDYITEHFKGDCLLAVGDDELIKIHFHTNEPWQVLEYC ASLGEIYDIVVEDMERQSRGLKG >gi|223714065|gb|ACDT01000150.1| GENE 5 2731 - 3618 839 295 aa, chain - ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 3 285 1 287 296 163 31.0 4e-40 MYIKLEQYKIFNEAATTLSFSIAARNLFISQSAVSQTISSIEKELQTQLFIRNSKGVSLT KEGKLLHQQIDKALALITGVENQLSNYHELKDGQLIIGAGDTFSEYFLTNYIVKFKQLYP GVKIKVINRTSLETRELLKSDQIDLGFLNLPIKDDSLVIKEYFQVHDIFVSKKPDNHIYT FNQLADQPLILLEKSSNTRNRIDNYFANKGLLLKPTIELGAHNLLLDFARAGLGISCVIK EFSQELLDQGILHEIKTSQPLPKRSIGYAYPKRRTQSVATNKFIELIANDPTLHI >gi|223714065|gb|ACDT01000150.1| GENE 6 3789 - 5060 1683 423 aa, chain + ## HITS:1 COG:BH4028 KEGG:ns NR:ns ## COG: BH4028 COG0104 # Protein_GI_number: 15616590 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Bacillus halodurans # 1 422 1 427 428 527 59.0 1e-149 MAGVVVVGSQWGDEGKGKITDYLAQKADIVARYQGGNNAGHTIEFNGQKFALRLIPSGIF SGNDVILGNGMVINPKALLEEMKYLNDANISTDKIMISDRAHVILPYHLEIDEIQETRRG ANNIGTTKKGIGPTYVDKYERVGIRMGEFIDEELFKERLEEALASKKAQYPELTCSAEEI FEEYKEYAKIIAPMVCDTGVVLDECFQTGKNVLFEGAQGTMLDIDYGTYPFVTSSHPGAN GVAEGSGIGPLYINEAVGIVKAYTTRVGSGAFPTEIEGELADQIRERGHEYGTVTKRARR IGWFDAVVVNQSRRMSSLTGISLMLLDVLSGLKTLKICTAYELDGKIIKALPSTIKQLNR VKPVYEEMPGWDEDITKVTSFEELPENCQKYLRRIEELINCPIVIFSVGPDREQTIVLRE IFK >gi|223714065|gb|ACDT01000150.1| GENE 7 5072 - 6619 1885 515 aa, chain + ## HITS:1 COG:CAC2700_2 KEGG:ns NR:ns ## COG: CAC2700_2 
COG0519 # Protein_GI_number: 15895957 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Clostridium acetobutylicum # 196 515 1 316 316 450 70.0 1e-126 MKNELVIVIDFGGQYNQLVARRVRECNVYCEIYSYKVDIEKIKEMNPKGIILTGGPNSCY LEDSPTYQKELFELGIPVLGLCYGAQLMQHVLGGKVEKADVREYGKSHLIVSKQESKLMK DVAVESICWMSHFDYISKIAPGFEITSYTKDCPVASCEDESKKLYAIQFHPEVLHTEYGT KMLSNFVLDVCNCSGDWRMDSFVEEQIKAIREKVGNGKVLCALSGGVDSSVAAVLLSKAI GNQLTCVFVDHGLLRKNEGDEVEAVFGPDGQYDLNFIRVNAQERYYEKLKGVEEPEAKRK IIGEEFIRVFEEEAKKIGTVDFLVQGTIYPDVVESGLGGESAVIKSHHNVGGLPDAVDFK EIIEPLRDLFKDEVRKVGLELGIPEYLVFRQPFPGPGLGIRIIGEVTAEKVRIVQDADAI YREEIAKAGLDRSIGQYFAALTNMRSVGVMGDERTYDHAIALRAVNTIDFMTAEAAQIPY EVLNKVMSRIINEVRGVNRVMYDITSKPPGTIEFE >gi|223714065|gb|ACDT01000150.1| GENE 8 7215 - 10275 1886 1020 aa, chain + ## HITS:1 COG:SA0089 KEGG:ns NR:ns ## COG: SA0089 COG1112 # Protein_GI_number: 15925797 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Staphylococcus aureus N315 # 5 1013 6 1025 1050 366 31.0 1e-100 MSKKSDVLESWIMVEHLSEGDINLNDKSIITLSNLGDENYYDLFIREIQKKKMKKYQQGG IVVFFDIFSFKEIIDFLRKEYNLLETEEDIKVGNKFSFALYFDKELKIHSEMTFLTESFY IRKYRKIPRESEFSSFEQEYRKIFDEIFECPDEVDYVDHFNNAIKFIIKNNGIDIINCRM KAIENLETDATNLHSFFVGDLEKAKTIESSILDDYILKRRKERVNLDSRVKSSEFDSEVF SKILQPENYPTARFPSNPEYSLTFMQQVAVNLAIGYDNERIRSVNGPPGTGKTTLLKDIF AEFVVEQSYDIARLSTKYVKGSEETKYWNNASIGIMPKKIAEKGIVVASSNNGAVQNIVN ELPLLTGIDKEFSRALIEVDYFKIIANSNVSSKWVEDENGTKHEELVSEENKENNKFWGL FSLEGGRKENMDYIVKVLKHVVSYLEKEYEPDEEIYQKFMNQYKDVCLYREERQKISETI NKLNKLLVEIGTQHDLYVRESANRKKEIEKEKASTLKITIDIQDKIETLKDSLQELSNLQ FKQYEERECINQCIDALKLQKPGIFSSLKRKREYKEKCKLYSNQLQKIILGEHSYKAQIM EHENKIKSLLDTKSKKETELKCKIELFDKLTQKAKDDITYMEKQAEELKKSIDEKNINKL DFTIDYETLQLSNPWFDTEYRNLQSELFISALKVRKQFLYDNVKNIKAAYIIWNKQRDYL CHKMVISEAWNWINMVIPVISSTFASFSRMCFHMGEKTIGHLFIDEAGQALPQAGVGAIF RSRNVMVVGDPSQIEPVLTLDSSLLDMLGEYYGVSKFFLSSNASVQTLVDEISKYGFYKD 
NTREEWIGIPLWVHRRCKYPMFDIANKISYGGNMVQAEKKNGKSEWFDICGSAVDKYVEE QGKFLREKIQKMIFQNADIIDKEKKDIIYVITPFKNVAYHLSQELRKIGFTRYDEKGKPT NVGTVHTFQGKEAPIVFLVLGADETSKGAANWAMGTENPNIMNVAATRAKEEFKPFRFQR Prediction of potential genes in microbial genomes Time: Thu May 26 10:48:47 2011 Seq name: gi|223714064|gb|ACDT01000151.1| Coprobacillus sp. D7 cont1.151, whole genome shotgun sequence Length of sequence - 2263 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 117 - 479 230 ## gi|237732985|ref|ZP_04563466.1| predicted protein 2 1 Op 2 . - CDS 500 - 1504 666 ## gi|237732986|ref|ZP_04563467.1| predicted protein - Prom 1570 - 1629 8.6 3 2 Op 1 . - CDS 1754 - 1897 84 ## 4 2 Op 2 . - CDS 1963 - 2262 149 ## gi|237732988|ref|ZP_04563469.1| predicted protein Predicted protein(s) >gi|223714064|gb|ACDT01000151.1| GENE 1 117 - 479 230 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732985|ref|ZP_04563466.1| ## NR: gi|237732985|ref|ZP_04563466.1| predicted protein [Mollicutes bacterium D7] # 1 120 1 120 120 225 100.0 5e-58 MKTFEYNGKLFFKYNLNMNINARYMNFRKKLISVLAPYYRAKFGTKESQRHFRESNMIPD LIVECLIGKPNYKGELWTQAYYEPNSLRTYTFPAYCLGTFDIGYMYYDKVINGIDKDFIL >gi|223714064|gb|ACDT01000151.1| GENE 2 500 - 1504 666 334 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732986|ref|ZP_04563467.1| ## NR: gi|237732986|ref|ZP_04563467.1| predicted protein [Mollicutes bacterium D7] # 1 334 1 334 334 629 100.0 1e-179 MTLITTDYYKIMTSTKTEIQNLSPEFFNTLHPNVIKAIPETIYQYITLEQARGLNTLVLR RLPSWLLESYKIQDGETPYELVREIDGKKIPILAKPIAVRLRNSSSSLKNKPTTLYTKPF TRENADNLLLKDVKFLTQIYLNSAVKEHFVWQEKKGGSYSADFDEILNIINNHFVPNEFD KDKWFDHIPALISLPNLLAQSNLIPVTTHAKVKHLSYFKFGGSVRYIYKDAYKLLLMLDK LQIASKSCQAQYKATNKMPRIFFENLYKELLSYDDIEKYIHSGELKLYEDESHILQYDSA INLISKISENGYINALPIYVEVKEYFKTYMQMIQ >gi|223714064|gb|ACDT01000151.1| GENE 3 1754 - 1897 84 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MAVTIAQVKPLKGEQAKNFIKAFESSSIKEEAVSKAIDAMKKGFTGK >gi|223714064|gb|ACDT01000151.1| GENE 4 1963 - 2262 149 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732988|ref|ZP_04563469.1| ## NR: gi|237732988|ref|ZP_04563469.1| predicted protein [Mollicutes bacterium D7] # 50 99 1 50 50 87 98.0 2e-16 LKKLSIFNIDAPFTVSAEITDDKGQFIIVEQDAEGNWKRQYRITKKTKNVTITPNDGIVD YFIMISTTQPNSTFEVKNIQIEKGSTATDYEPYNPPILK Prediction of potential genes in microbial genomes Time: Thu May 26 10:49:20 2011 Seq name: gi|223714063|gb|ACDT01000152.1| Coprobacillus sp. D7 cont1.152, whole genome shotgun sequence Length of sequence - 24163 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 9, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 384 - 602 145 ## gi|237732989|ref|ZP_04563470.1| predicted protein 2 1 Op 2 . - CDS 595 - 1113 281 ## gi|237732990|ref|ZP_04563471.1| predicted protein 3 1 Op 3 . - CDS 1113 - 1415 224 ## gi|237732991|ref|ZP_04563472.1| predicted protein 4 1 Op 4 . - CDS 1417 - 2289 300 ## gi|237732992|ref|ZP_04563473.1| predicted protein 5 1 Op 5 . - CDS 2347 - 2634 306 ## gi|237732993|ref|ZP_04563474.1| predicted protein - Prom 2669 - 2728 12.2 - Term 2710 - 2773 16.1 6 2 Op 1 . - CDS 2902 - 4089 760 ## SPs0551 hypothetical protein 7 2 Op 2 . - CDS 4101 - 12245 6567 ## COG4886 Leucine-rich repeat (LRR) protein - Prom 12314 - 12373 10.8 - Term 12526 - 12556 1.2 8 3 Op 1 . - CDS 12644 - 13567 682 ## SPs0551 hypothetical protein 9 3 Op 2 . - CDS 13579 - 14202 348 ## SPs0551 hypothetical protein 10 3 Op 3 . - CDS 14199 - 14807 265 ## SDEG_1614 putative prophage LambdaSa1, minor structural protein - Prom 14866 - 14925 6.7 11 4 Tu 1 . - CDS 15532 - 16215 595 ## gi|237733000|ref|ZP_04563481.1| predicted protein - Prom 16235 - 16294 10.4 12 5 Op 1 . - CDS 16551 - 17045 376 ## gi|237733001|ref|ZP_04563482.1| predicted protein - Prom 17073 - 17132 10.0 13 5 Op 2 . 
- CDS 17135 - 17812 747 ## gi|237733002|ref|ZP_04563483.1| predicted protein - Prom 17854 - 17913 8.6 - Term 18397 - 18450 6.2 14 6 Op 1 . - CDS 18485 - 18709 221 ## 15 6 Op 2 . - CDS 18727 - 21447 2161 ## LMOf2365_0413 cell wall surface anchor family protein - Term 21653 - 21692 -0.1 16 7 Tu 1 . - CDS 21821 - 22426 392 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 22459 - 22518 4.9 + Prom 22724 - 22783 7.4 17 8 Tu 1 . + CDS 22852 - 23637 486 ## COG1737 Transcriptional regulators 18 9 Tu 1 . - CDS 23648 - 24058 431 ## gi|237733006|ref|ZP_04563487.1| conserved hypothetical protein Predicted protein(s) >gi|223714063|gb|ACDT01000152.1| GENE 1 384 - 602 145 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732989|ref|ZP_04563470.1| ## NR: gi|237732989|ref|ZP_04563470.1| predicted protein [Mollicutes bacterium D7] # 1 72 1 72 72 96 100.0 4e-19 MTKKFNLSKSTAINSLKESIKILENRDYSKARKNVFFERLQKNTKETDIYPNLINSDKLT PEILDSFSKKVN >gi|223714063|gb|ACDT01000152.1| GENE 2 595 - 1113 281 172 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732990|ref|ZP_04563471.1| ## NR: gi|237732990|ref|ZP_04563471.1| predicted protein [Mollicutes bacterium D7] # 1 172 1 172 172 291 100.0 8e-78 MFDYYLDLSSTNSGVVLVGEKVIYITSWNFSKFKKKSKIKVEWQIEKMRYISTYINDFIK KYPPKSFTAEGIFIKKEFLASSETLMKVHGMVIEKFINYPIKYIPPANIKKNITGKGNAS KELVRQNICKKLNIQGINYDEADALALMLTDKNLSEFQSLEKQIIYLEEKND >gi|223714063|gb|ACDT01000152.1| GENE 3 1113 - 1415 224 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732991|ref|ZP_04563472.1| ## NR: gi|237732991|ref|ZP_04563472.1| predicted protein [Mollicutes bacterium D7] # 27 100 27 100 100 110 100.0 3e-23 MAKHKKKIIIHKHLSIIYFIIGIFFISNLVSIHLTAINNQKEYEKLNKTYQKVQVENEKT IEEYNNLKNEDYLVRYARENYIFMKDGETVKKFMTIRRNK >gi|223714063|gb|ACDT01000152.1| GENE 4 1417 - 2289 300 290 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732992|ref|ZP_04563473.1| ## NR: gi|237732992|ref|ZP_04563473.1| predicted protein [Mollicutes bacterium D7] # 1 290 16 305 305 487 100.0 1e-136 
MGKYTGKKRNSTQKNRTVYTKNKNIGMCAICGNKIKIYNNRIICNSCKNKINELQNKGKL QYAPPIHTILYLKFKLNKKNILDYLIKFPYMYEQTLNHSSKSVKKHVFNRQKIITEQMSI NKRYPEVEIIKVFQSSNKEVFYYRDKRTKIEFADCCDFLNSNKVGYHLSKTKNLYTDKLK HYKQLKMYLKRNNICYFKYYDTPPVHNPLSGIPMHYDIEIPFSKLMILIKQDNYKIYDSN TYHSLENAQYQSFMDKKREEYAKKNKYKILYLTHNDFVTGLYTKKIKKVI >gi|223714063|gb|ACDT01000152.1| GENE 5 2347 - 2634 306 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732993|ref|ZP_04563474.1| ## NR: gi|237732993|ref|ZP_04563474.1| predicted protein [Mollicutes bacterium D7] # 1 95 7 101 101 170 100.0 3e-41 MTNRYIEKTTSNLLNILQTVKRSAYQKRGHSDVTFSFGNEKVELVADVIPEREAGYKVIA SLICEDTTLIDESNKNIVKKIKETFNGEDSIFVSY >gi|223714063|gb|ACDT01000152.1| GENE 6 2902 - 4089 760 395 aa, chain - ## HITS:1 COG:no KEGG:SPs0551 NR:ns ## KEGG: SPs0551 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_SSI1 # Pathway: not_defined # 196 389 454 616 620 82 31.0 2e-14 MKYNKIVAACFLCFIIISCSLNSVNASTVDNLINEWSEETNKLEKKEYYNLGYIKITYED KNPGTIYGGTWTKIASGKSLIGVSSTIGVRDTGGEKETTLTSNNLPAHNHTGTVNISGNH SHSFYSIYGTSNVYGAYFMTTESGKGLATNAGGRRLVEGYASIGSAGAHNHSFTTNSQGK SQAHNNMQPYVTVYYWEKTSNASKTPTTFSGTTQQKLDKISATLDSLKRTTYYSVGDVIT LTTDIDPATVYGGTWKRTAKGKSLVGVDSTDDDFSQSRNTGGEKTHTLTNDELPIHSHTG STSTDGYHSHGYNKNTSNAYGTAFYTDNYHGWPLYSNNPYDGNCSGWYVRENYIDISSSG THSHSFTTSETGGSGAHNNLMPYYTVYMWEKISDT >gi|223714063|gb|ACDT01000152.1| GENE 7 4101 - 12245 6567 2714 aa, chain - ## HITS:1 COG:lin0372 KEGG:ns NR:ns ## COG: lin0372 COG4886 # Protein_GI_number: 16799449 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Listeria innocua # 227 406 384 562 656 90 36.0 4e-17 MHFSLKQALTWLVIILCAGIIINGLYQHTSESNTKSETIVATDWKIIKTNIGGSTYSEED DDYEIGDNIRLNVKVSFDLNGGEISPDFSTSKIVSTNEKYGTLPVPTRIGYNFEGWYTKV IGGTKITEESIVKINRNHTLYAHWNPIQYNVAYYHNNALLGTFPYLYNQDIIIKDINEFG IKKTGYTFKGWTDLSDILYTAGSTTSRLTEKENSTFKLYSNWKANEYTVNFDSNGGTTPQ NTKTVTYDSAYGALPTPTRQYYTFTGWYTDKENGSKVTENDIYKIDGNSTLYAHWAQNTY TITFIVGDKTYIETAFEGEDVNIPDTSKTGYTFKGWYTKASGGTKIDDFQYKTQTVYAQW 
QPNTYLLTFDANGGTVSPHNKEVTYDSTYGTLPTPTRTGYTFLGWFFDNDKQLKNTNIVQ ITSATTVKAKWKANSYTINFYNGSTKLGSKTFTYDKSDLLPQFSSFHIDKTGYTFAGWGL SNSDKTPSYTDGQSIINLADINNANINLYTVWQANKYTVTYISENKVYGTSQFTYDKAEP LLTVKDLALEKAGYQFEGWTTEEGSTTVKYKDGETVKNLVTTPDGNKNLYAVWKAKEYTV NYYNGTQKVGSSIHVYDKDSNLKTIAELNLSKTGYTFQGWSISPTDLKVTFKDGAPIKNL SIGTAIDLYAVWTPNTYTVTLDSAGGTVSPTIIKVTYDQSYGELPTPSRTGYTFKGWYLG STRITKDSIVKITSNSTLVAKWEANKYTVVFEANGGNVDVINKEVIYDGTFGILPTPTRT GYSFNGWYLGEQHITSSTIVKITSNVTLTARWSANTYTVEYYSDGFKKGSSSHTFDKESN LATLAALKISKPGYTFKGWATNSETMTPVYNDGQTIENLTNGNGAVIKLYAVWDANSYTV EYYDNGTLVGSSSHVYDIAKNLTTSSSLNIKKPGYTFKGWALTEGATTAKYTDGASVKNL TTSFNGTIKLYSVWQANTYKIEYYSLGQLKGSSSHIYDTAKNLTSAATLGLNNPGYTFKG WALTNGSTTVKYQNSQSVINLTSSTNGVVKLYAVWQANAYTVALDPAGGTVNPTSIKVTY DKAYGTLPTPTRSGYTFNGWFLDDTQITKDSIVKITRDSTLVAKWTANKYTINFDANGGT VGTTTKEVTYDSLYGILPVPTYEGYGFSGWYLGNELITKDSIVKITSNSTLIAKWNVNKY TIEYYDNGTLVGSSIHAFNSKQALKTDVDLGIKRTGYSLAGWSETADSTKVIYKNGEEVL NLTSTNNAVVKLYSVWTANTYTITFNPNGGTGGPTEQNFKFNGGEKITTSIPSKTGYTFN QWVAHHNTQYTFKPGATIPNGWESFTLDAQWTANNYNVEYYNGTTKVGTSTHVYDTDKNL TSASTLKLSKTGYTFSGWALTDGGTKKYNDSATVKNLTATNGGTVKLYALWTANQYTVTF DPAGGTVSPATKQVTYNSTYGTLPTPTRDGYAFAGWHLGNTQIIAGTKVTSTSNHTLTAH WTQHQTTKYTVKHYLMNVNGVGYTLKDTDNLTGPTDNNIIIKDHAKTYTGFTYNSCKLDG SSTATSSDTTTILGDGSRVLNLYYTRNKYTITVVKGNGVASTSATFAQYYQSNVSVTASV SSGYSWNKWTSSNTSLLADSTNATYGFTVPAGNITLTATATANTYTIEYYSDGVKSGSST HIYNTAKNLTSLVNLGIKKQGYTFKGWSTTSGGTTITYGDGASVKNLTATKGGVVKLYAV WTANDQLLTINPNGGKYDNTTANTKITGKTDSSINIQVPTRTGYRFNQWTKSGLGTITYD TNNKGTFTFGAGGTTLTAKWLANTYTIEYYNGNAKVGYSNHTYDTAKNLTSASALKIYKT GYVFNGWSTSASASTATYGDEASVKNLTATHGGVVKLYALWKPNATTQFNVKHYLMNTDG TTYTLNSTQSYTGTSDSSITLSSYRKSFTGFTYKETKLDNSSTATNSDTTTIAPDGSRVI HIYYSRNQYTITVNKGTGINTVTSNFTKYYGTSASVTTTLTTGYSWVNWTSSNTGLLANS TAQTYSFTMPAGNVTLTANARANTYTVEYYSDSTKKGSSSHTYNTAKNLTTASALGLTKS GYAFAGWSTTSGGTTVNYADGASIKNLTATQGTVIKLYAVWRSNTQTLTINPNGGTYNSS TNNTQINGTTDSTISINAPTKTGYRFNQWTLTGAGKITYDATNKGTFTFGASQTTLTANW IANNYTVNYYNGNTKVGSSSHTYNSAKNLTTASILNLNKTGYTFAGWSTASGGTTVNYTD 
GASVNNLTSTHNGTVNLYAIWRANTSTSYSVNHYLMNTDGATYALNSTQKYTGTSDATIT LSAYRKTFTGFTYKETRLDNASSTTSSTTTTISPDGSRIINIYYSRNKYQITVAKGTGIT SVTDTFTAYYGTNAYVTSTSATGYHWNKWISSNTSLLPDGPKQTYAFTVPAGNVILTATA TANTYTVKYYNGTTLLGSSNHTYNTSKKLTSGAALKAVKSGYDFVGWDTSKSGNTVKLDD EQNVTNLKNTNDAVIELYAVFSDIAKQIEDTKNTFNFAIGTVIQTTNSTNPSSLYGGIWQ QIATDRILVGLDENDSDFNTIRKTGGTKTENITIDTMPRHSHTGTTNNAGNHSHSFNSNT AQTYGVAYSTSDTSAGNPYIPASHSKDHLKEGYHGIYDAGNHSHSFTTNNTGGSTAHNNM QPYYGVYTWEKVGN >gi|223714063|gb|ACDT01000152.1| GENE 8 12644 - 13567 682 307 aa, chain - ## HITS:1 COG:no KEGG:SPs0551 NR:ns ## KEGG: SPs0551 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_SSI1 # Pathway: not_defined # 140 306 493 620 620 81 33.0 4e-14 MSFTVKHVVRLLLVLIVGTLAINFWIGLNNKNVNESNNVADYIWNNQTVTKKENVMFNAL EGTTDVLMKPIEFGKAYGELPTAIKDGYLFDGWYTEKDGGNKVTEDTIVVNDNLHILYAH YKEDPDIIARIETLEAKSVPIGFVFQSTSNINPSSTIGGSWQLIASGKTLVGYDKDDDDF AKVRNTGGEKEHTLTIAEMPIHGHTGTITTNSYNHSHKINYTNTPKADGSINGGLSPSAW NVGADSYWGSASTNAIPVYNENEQSMTNGSDSGSHSHSFTTSLTGGNRAHNNLMPYYVVY TWEKIAM >gi|223714063|gb|ACDT01000152.1| GENE 9 13579 - 14202 348 207 aa, chain - ## HITS:1 COG:no KEGG:SPs0551 NR:ns ## KEGG: SPs0551 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_SSI1 # Pathway: not_defined # 50 203 489 616 620 92 35.0 7e-18 MMKRYIELKKKIILFLLIILAIPLIPNCTELNRNKIYALSLEERLEALENRQFPVGYIYI SKTATNPSTIYGGTWKKIANGKCLVTIDSNDSDFNKVGKTGGEKKTSLSLEQIPSHSHTG TTSESGTHSHKFKWSHSYNELTGGFSTKGMESNGGYYTGRIAIMGRSSISSSSGNHAHSF ITNNTGQGFSHNNMQPYYTAYIWEKVG >gi|223714063|gb|ACDT01000152.1| GENE 10 14199 - 14807 265 202 aa, chain - ## HITS:1 COG:no KEGG:SDEG_1614 NR:ns ## KEGG: SDEG_1614 # Name: not_defined # Def: putative prophage LambdaSa1, minor structural protein # Organism: S.dysgalactiae # Pathway: not_defined # 37 202 495 625 625 80 30.0 5e-14 MIKFGFLWFLLSIFFLNSIIPVSAESISERLDRLENNELYKIGSIIYMTTSENPSIYFGG TWEETAKGRCIVGVEESDSSLSNSGMSGGEKAVRLLSNQLPSHNHSATTSTAGLHSHSTS LYPNSREGGNPSTFGFRGSNWNGSGGYFYNRILVNTDWNYADTTTNSEFIHSHSFTTNSS GESQTHNNMQPYYVVHIWERVA 
>gi|223714063|gb|ACDT01000152.1| GENE 11 15532 - 16215 595 227 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733000|ref|ZP_04563481.1| ## NR: gi|237733000|ref|ZP_04563481.1| predicted protein [Mollicutes bacterium D7] # 1 227 1 227 227 379 100.0 1e-104 MDSNKIKTFSLYILAFFLFLGLLWSTRYIKINSVLELPPSPANYFSDNYLSTTLYNQEND GCILTYTTDAKTNKNIGYIYWGYVYADDVKEILSDKRTNKEIIKDLETEQEFFQEQLGDK YNPGDSTNREFGKEDYWLMAYYDFDEEEPYAVYEINFWTKDFDINDKGMQKVLQYLGLDK IYDESTEQFTVKKLEKNKSKLKFANIFDFTNISHTVDMDGNEVSNAE >gi|223714063|gb|ACDT01000152.1| GENE 12 16551 - 17045 376 164 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733001|ref|ZP_04563482.1| ## NR: gi|237733001|ref|ZP_04563482.1| predicted protein [Mollicutes bacterium D7] # 1 164 1 164 164 307 100.0 1e-82 MNFRTFFEKLNDAENKSSVENAYSHIIDYVFDVILDKNVFSLIDTSTWNKGADGRIVPKN IISTPDFVITDRNYKFNDNTSNAYGCIEVKYADKDVKQSLRLCDDGDSKGYLHHYQNVLY TNGWIWIYYDRKTTPKWTINFRENQTNKEFGRLLYELCSIDWKN >gi|223714063|gb|ACDT01000152.1| GENE 13 17135 - 17812 747 225 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733002|ref|ZP_04563483.1| ## NR: gi|237733002|ref|ZP_04563483.1| predicted protein [Mollicutes bacterium D7] # 1 225 1 225 225 403 100.0 1e-111 MATLNTSLNTAAKENEGVNFEAPKEVRVYTYKNQNDDNKRLGGAFLQTPLFKYSNLSIIE GQNGPFLSYPSRKSNQTNDEGKPVYFDRYYPASKEAREYLNQVVLDAFNNAPSYDGKISD EYYNEEDIQITDIKITLDDSGSPINGVLGTVKVTTPLMVHPFITIRQSNTDPDDFYLAYP GYKTNDTDENGKPVYQRYFQTTNKAANEYLTKLVISAVEKELNKG >gi|223714063|gb|ACDT01000152.1| GENE 14 18485 - 18709 221 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDRDIFDDMKIETGCTYISDLPYIKKTVEKKLLELRFELYSKEQLQEFCNYVFRDNGAIY QSLMMRYRRDNRYN >gi|223714063|gb|ACDT01000152.1| GENE 15 18727 - 21447 2161 906 aa, chain - ## HITS:1 COG:no KEGG:LMOf2365_0413 NR:ns ## KEGG: LMOf2365_0413 # Name: not_defined # Def: cell wall surface anchor family protein # Organism: L.monocytogenes_F2365 # Pathway: not_defined # 650 906 151 407 407 149 41.0 4e-34 MKLKKFKSILTCVLVLCMTFGIQSNLAFVFAQGNDDTIEQENIVDNMVDITDDTARTTTD EVDVNTEAEKKDENIGDSQHTKYGLTDEEEWEECNQCSKDNPHLIATTADLDKVRTHTHT 
EGNVTTITGYFKLVNDIVFTDEDYQENGAFYNNGWGWTPIGHNNKTSYFSGDQFQGDFNG DDYAVKNIKIERPAGYWYNGLFAGPGDNAHIYNLKLEGFAIHADGAGALYGQSYATKKDS MLVENITVYNSCITGISSAAKLSGIFAGAIKGINRNITISNSTYKFGKSAWKGGFIASEI ATGKLENVTVTDCEMTTYAYTGILVPTICLDAVIKGVVIKDSTVNTTHHAWSYLIYEDFR SNLTGPTPVISDVKIDVNIQGTTAHLKDAKTIIPKLKNSDDSAIQMDNIDVHAILTDKTT NSQSKVDFNVDTNVIGIPSDAEKNIKAEERYIVSLNGGHIRPETVNEEGKLIADRSGYAF DGWYENKELNGDSIEKLNYNTYYYAKWSEKADCEVSFKDNLKLDKIYDENAVSLLESDYI VTEGAGKVTFNYQMKEDNEWKDIDTTPINAGTYRVKAIVAENDTHKRAETDWKEFVISKA MPTYEVPTNLTAIVGQTLADVTLPQDFAWQDDTTTSVGSAGINTFKVTYTPKDTANYNNI TDIEVILTVNPKMEELNAIPTINASNKTLTVGDTFEALDAVTASDKEDGDITEKVEVLSN DVDTSKAGTYTVIYKVTDSKGASSTKIITVTVNPKMEELNAIPTINASDKTLTVGDTFEA LDCVTASDKEDGDITKKVEVLSNNVDTSKAGTYTVIYKVTDSKGASSTKTITVTVKAKDT QNPTTDDNKKPSATDTDKKPASIDKNMTADNPKTGDSSNVTTWLALMFVSFGLLAGISVR KSRKNR >gi|223714063|gb|ACDT01000152.1| GENE 16 21821 - 22426 392 201 aa, chain - ## HITS:1 COG:pli0059 KEGG:ns NR:ns ## COG: pli0059 COG1961 # Protein_GI_number: 18450341 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 3 153 2 149 199 102 39.0 5e-22 MAKYAYIRVSSKDQNIARQVEAMQEIGLTKKQMYIDKQSGKDFIRKNYQRLVKKLKAGDE LYIKSIDRLGRDYDEILEQWRYLTRVKDIELIVLDFPLLDTRNQVNGVTGKFIADLVLQI LSYVAQIERENTKQRQAEGIRIAKEKGVQFGRPKHDFPEDFEEIYLLWEAGKISMREGGR RLNTHHTTFARWIARYQQQKA >gi|223714063|gb|ACDT01000152.1| GENE 17 22852 - 23637 486 261 aa, chain + ## HITS:1 COG:SP1674 KEGG:ns NR:ns ## COG: SP1674 COG1737 # Protein_GI_number: 15901509 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 7 261 10 258 283 71 22.0 1e-12 MKFKEKISIYYTSLTPTEKKICTKILDNPEIIIKNSIIDAGNRCDTSKSAMLRFAKKLGY SGYSEFKYAISEDYDSKQNDETIDVNESFLKRISSLFAESVYNLGDLDYELQLQQLAKMI DEYSYVKSIGIGNTAFCANQLVYSLYSYNKFIECVGDCIQFDYLKNCLNSEYLLIIFSVT CSETKYLDLVKTAKNKGAKIVVITMNNEHKLISLSDLSFILPSQTSPLQNKKVLKQLDNR TMMYFFAEIISYYYGLHLEKK >gi|223714063|gb|ACDT01000152.1| GENE 18 23648 - 24058 431 136 aa, chain 
- ## HITS:1 COG:no KEGG:no NR:gi|237733006|ref|ZP_04563487.1|
## NR: gi|237733006|ref|ZP_04563487.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 136 36 171 171 265 100.0 9e-70
MKAYDYIKIANDIKMKFDDLIKHIIKCIEPYNQTGIEVIGTTNKIINKLGIGTGACTDIF
TMSKYGVDACIVSDDGINNWVAVQWAMDNKIPLIVVNHMTCEAPGIKNMAIFLNNKFENI
EFKYVPNDYGIYHKEK

Prediction of potential genes in microbial genomes
Time: Thu May 26 10:50:58 2011
Seq name: gi|223714062|gb|ACDT01000153.1| Coprobacillus sp. D7 cont1.153, whole genome shotgun sequence
Length of sequence - 2302 bp
Number of predicted genes - 3, with homology - 3
Number of transcription units - 2, operones - 1 average op.length - 2.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 . - CDS 3 - 279 61 ## gi|237733007|ref|ZP_04563488.1| conserved hypothetical protein
2 1 Op 2 . - CDS 298 - 1611 946 ## COG1455 Phosphotransferase system cellobiose-specific component IIC
          - Prom 1738 - 1797 8.2
          - Term 1778 - 1823 9.3
3 2 Tu 1 . - CDS 1829 - 2089 242 ## gi|237733009|ref|ZP_04563490.1| conserved hypothetical protein
          - Prom 2123 - 2182 7.8

Predicted protein(s)
>gi|223714062|gb|ACDT01000153.1| GENE 1 3 - 279 61 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733007|ref|ZP_04563488.1|
## NR: gi|237733007|ref|ZP_04563488.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 92 1 92 93 183 100.0 3e-45
MYIGGKMKIKEVIDYLETQGEWVNRDCTKDHILYGNENNDIHQAIVCWVATLDIIYQAIQ
NDCHFIISHENPFYLASTNLPQPIIKAQKKKK
>gi|223714062|gb|ACDT01000153.1| GENE 2 298 - 1611 946 437 aa, chain - ## HITS:1 COG:BBB04 KEGG:ns NR:ns
## COG: BBB04 COG1455 # Protein_GI_number: 11497019 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Borrelia burgdorferi # 8 437 3 443 443 220 33.0 4e-57
MESLKRFFDKFQKLTEEKILPVTSAISSQRHLAALRDGLTILIPLTVIGGISLMLANPPV
DLEVIKPTNFFFSFLIAWKNWSVDWASILTIPFNLTIGIISIYVVLGVSYRLAQHYKMEA
FPNAITALFTFLCIAAVPESINDGSYINVANLGAASMFAGIIVALAVIEINHFMIAKNIK
ISMPDGVPPMVAGPFEVLLPMVINILLFIGLNQLCINITGSSLTNLVFTIFSPLISATSS
LPSMLFIILLTVIFWFFGIHGDNMVSAITTPIFTGNLVANLEAYNAGKEIPNIIAGNGTF
IFGLAIVYLAILFNLIFICKNKRLKSLGKLAVPSSLFNINEPLVFGVPTVLNILTFIPSL
LCVAIDFIVFYITTDMGLMAKTCMSVPWTLPAPVYAFISTMDYRAIIIWLILFAINVIIF
IPFMKTYDKQMDLEEAE
>gi|223714062|gb|ACDT01000153.1| GENE 3 1829 - 2089 242 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733009|ref|ZP_04563490.1|
## NR: gi|237733009|ref|ZP_04563490.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 86 1 86 86 127 100.0 2e-28
MNEDDIKKWVKILKDNYNEFEGIDYSSKIDKNELIETITIDTSKAKPEAYEVLGLEIDSQ
KTKKVVLNFSSTVKKLEKSGFKCVDK

Prediction of potential genes in microbial genomes
Time: Thu May 26 10:51:13 2011
Seq name: gi|223714061|gb|ACDT01000154.1| Coprobacillus sp. D7 cont1.154, whole genome shotgun sequence
Length of sequence - 22380 bp
Number of predicted genes - 21, with homology - 20
Number of transcription units - 15, operones - 6 average op.length - 2.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
          - Term 208 - 266 -0.1
1 1 Op 1 . - CDS 298 - 1743 863 ## GWCH70_0818 putative IS transposase
2 1 Op 2 . - CDS 1764 - 2063 160 ## COG1943 Transposase and inactivated derivatives
          - Prom 2219 - 2278 11.4
3 2 Tu 1 . + CDS 2339 - 2833 568 ## gi|237733012|ref|ZP_04563493.1| predicted protein
          + Term 3017 - 3055 1.4
          + Prom 3543 - 3602 4.6
4 3 Op 1 . + CDS 3626 - 4447 504 ## COG0789 Predicted transcriptional regulators
5 3 Op 2 . + CDS 4524 - 5519 701 ## COG1680 Beta-lactamase class C and other penicillin binding proteins
          + Prom 6597 - 6656 2.9
6 4 Tu 1 . + CDS 6708 - 6989 122 ## COG3547 Transposase and inactivated derivatives
          + Term 7204 - 7262 0.3
          + Prom 7544 - 7603 7.7
7 5 Op 1 . + CDS 7702 - 8274 685 ## Pput_0412 hypothetical protein
8 5 Op 2 . + CDS 8278 - 8805 285 ## gi|237733018|ref|ZP_04563499.1| predicted protein
          - Term 8891 - 8929 -0.7
9 6 Tu 1 .
- CDS 8935 - 9117 140 ## gi|237733019|ref|ZP_04563500.1| conserved hypothetical protein
          - Prom 9255 - 9314 5.9
          + Prom 9153 - 9212 6.9
10 7 Tu 1 . + CDS 9266 - 9583 375 ## gi|237733020|ref|ZP_04563501.1| conserved hypothetical protein
          + Term 9608 - 9662 7.1
          - Term 9559 - 9603 2.9
11 8 Tu 1 . - CDS 9620 - 10558 425 ## gi|237733021|ref|ZP_04563502.1| predicted protein
          - Prom 10755 - 10814 8.5
          - Term 10764 - 10802 -0.9
12 9 Tu 1 . - CDS 10826 - 12988 1870 ## gi|237733022|ref|ZP_04563503.1| predicted protein
          - Prom 13023 - 13082 7.0
          - Term 13030 - 13078 -0.0
13 10 Op 1 . - CDS 13114 - 13746 545 ## gi|237733023|ref|ZP_04563504.1| predicted protein
14 10 Op 2 . - CDS 13825 - 14583 523 ## gi|237733024|ref|ZP_04563505.1| predicted protein
          - Prom 14616 - 14675 6.7
          - Term 14635 - 14664 -0.2
15 11 Tu 1 . - CDS 14719 - 15465 603 ## gi|237733025|ref|ZP_04563506.1| predicted protein
          - Prom 15525 - 15584 10.6
16 12 Op 1 . - CDS 16412 - 17179 739 ## gi|237733026|ref|ZP_04563507.1| predicted protein
17 12 Op 2 . - CDS 17176 - 18111 944 ## gi|237733027|ref|ZP_04563508.1| predicted protein
          - Prom 18146 - 18205 9.7
18 13 Tu 1 . - CDS 19181 - 19507 153 ## gi|237733029|ref|ZP_04563510.1| predicted protein
          - Prom 19602 - 19661 6.9
          + Prom 19559 - 19618 4.5
19 14 Op 1 . + CDS 19639 - 20040 210 ## COG1943 Transposase and inactivated derivatives
20 14 Op 2 . + CDS 20057 - 21517 604 ## GWCH70_0818 putative IS transposase
          + Term 21520 - 21570 3.1
          + Prom 21703 - 21762 7.2
21 15 Tu 1 .
+ CDS 21792 - 22380 457 ## Predicted protein(s) >gi|223714061|gb|ACDT01000154.1| GENE 1 298 - 1743 863 481 aa, chain - ## HITS:1 COG:no KEGG:GWCH70_0818 NR:ns ## KEGG: GWCH70_0818 # Name: not_defined # Def: putative IS transposase # Organism: Geobacillus_WCH70 # Pathway: not_defined # 3 481 8 487 487 406 47.0 1e-111 MANFVIEFPLRTEIYQEDILNKRFEIGRNIYNALIKVTQKRYKEMIKTKQYRHLMYSLSD DKNLNKILWKQINCIRKECSMSEYAFHRDVKAMQRHFKDNIDAHTAQKIASALWKSYEKF FYGNGKQVHFKKYGELNSLEGKSNKSGIRIINDMLVWKGLKIPIDIDYDNDYEAQAMEHD ICYSRIVRRAHKYKYKFYVQIVFNGQPPIKYNQSTGEVRHPSGTGKVGLDIGPSTIAIVS HQDVKLLELADRVHDIEKQKQLLLRKMDRSRRATNIDNYNKDGTIKKSKKWVRSNHYNKA LARLKELYNKQSRIRKEQHEILANYIISLGDEIYVETMNFAGLAKKGKLETNADGHYKRR KRFGKSIGNRAPAMIIDIINRKLSYYNCSINKINTIKARASQFNHIDGTYIKKSLSQRWA KVGNKDIQRDIYSAFLIMNINNDLSTFNIEACNAEFEHFCKLHDIEIERLKHCYNLKSIG I >gi|223714061|gb|ACDT01000154.1| GENE 2 1764 - 2063 160 99 aa, chain - ## HITS:1 COG:VNG0285C KEGG:ns NR:ns ## COG: VNG0285C COG1943 # Protein_GI_number: 15789568 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Halobacterium sp. 
NRC-1 # 1 95 34 126 129 98 47.0 3e-21 MEQRFKELVKIKCEELKINIIAIECNEDHVHMFVNCLPTLSPSDIMHKIKGFTSKLLRTE FAELSKMPNLWTRSYFVSTAGNVCSETIKQYVENQRKRY >gi|223714061|gb|ACDT01000154.1| GENE 3 2339 - 2833 568 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733012|ref|ZP_04563493.1| ## NR: gi|237733012|ref|ZP_04563493.1| predicted protein [Mollicutes bacterium D7] # 1 164 1 164 164 270 100.0 2e-71 MDSLWNFESAEANGEYQMPKHNSENFIDVDFTRELLFIDIKDSQRGIEDLTYSGVELEEI KNDFIRNFDLDFDFQKIIRICFDFDNNLILIKTNDLKEYTIYSLNLWDKKDDILKTLRLE SVSTSQLQDDFEAMMLLKAFEELTHEEKIVFMKNYISMTKTIFE >gi|223714061|gb|ACDT01000154.1| GENE 4 3626 - 4447 504 273 aa, chain + ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 4 115 7 116 117 75 41.0 1e-13 MLSIGEFSKICKVSTKTLRYYDEIGLIKPSKINQENGYRYYSIEQLETMLFINRLKQYNF SLEEIRAIITSEEIPNEKLSIELYKKKVELEKQIQIYSQITEQLSKDISVLKQGKSIMSY LNKIDVQLVEVPVMYLVSIRKMVCKFEMEEQYACCFNSILRKIQYDKLTVKAPPMVLFHG DEFTPLGLDTEFAIPIEQFVTETRDFCPGLCLKTILHGGYSNLPSIYTKQCEWAEQNGYE NNGPLYEVYITDQTQTSNEDDLVTEIYYPVKKK >gi|223714061|gb|ACDT01000154.1| GENE 5 4524 - 5519 701 331 aa, chain + ## HITS:1 COG:lin1811 KEGG:ns NR:ns ## COG: lin1811 COG1680 # Protein_GI_number: 16800879 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Listeria innocua # 6 321 15 320 323 226 39.0 3e-59 MNREELHDFIKEKQPNICQISCYKDGKEVYSDEWNNYKKIDTCHVMSATKSIVALLVGIA LDKGLIESIDQLVLDFFPEYKIKRGEKTIQQVTIKHLLTMTAPYKYKYEPWTKICSSDDW TVSSLDFLGGRKGLTGQFKYSTLGIHILTGIISKTSGIKVVDFANKFLFEPIGVEKHLNY LAETAEEHKHFTMSKDPQKNIWFCDPQGIGTAGYGLCFSAIDMAKIGQLCLDKGIHNGKQ IVSSKWIEEMTKPNYKCGEEFRNMSYGYLWWIVDENKGIYSAIGNSGNVIYVNPTKNTVI AVTSYFKPIIFDRIDFIQKYIEPFITIRRAL >gi|223714061|gb|ACDT01000154.1| GENE 6 6708 - 6989 122 93 aa, chain + ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, 
recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 2 88 280 366 391 59 40.0 2e-09 MLIESEIGDINNFSSAVKVMGFAGVYFGTFQSGEYSALRTALSKRGLRYLRKSLSQYILL VCTNNLTFNKHYKLKRLQGKSHSCGKGHSVVNS >gi|223714061|gb|ACDT01000154.1| GENE 7 7702 - 8274 685 190 aa, chain + ## HITS:1 COG:no KEGG:Pput_0412 NR:ns ## KEGG: Pput_0412 # Name: not_defined # Def: hypothetical protein # Organism: P.putida_F1 # Pathway: not_defined # 35 189 41 182 188 66 29.0 6e-10 MCVFDAMRKKNNYFKDFYDADKIEEIAAEIREMFELKETPTQIANILNKVGFKIFSLDMD DNLSGRIGIAKEFEKMLGSRKILQINSKDNRGHQRFTMAHELGHYIFDYNGHEEYANAYS LAEDDVNSPGEMRVNRFAAALLMPKNIFIDKYIARKTLGLDEVSICKSLAEEFEVSETAV SKRIVELGLN >gi|223714061|gb|ACDT01000154.1| GENE 8 8278 - 8805 285 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733018|ref|ZP_04563499.1| ## NR: gi|237733018|ref|ZP_04563499.1| predicted protein [Mollicutes bacterium D7] # 1 175 1 175 175 296 100.0 2e-79 MSKNNHNTSFSIINDIVRSVKEQGNSVEVNPIRYDKNYRDANDNQHIIKRNKQYTSILKK YTLYLDSALNFKSKLRKFVVIFFMIILGMVILSLLLFIGYIILNKDFDITTLVAFVTASG TLISTLLIIPTKIAEFIFNRDEEKYMSEIIKNIQEYDKNVRNGLCENDTTETSDK >gi|223714061|gb|ACDT01000154.1| GENE 9 8935 - 9117 140 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733019|ref|ZP_04563500.1| ## NR: gi|237733019|ref|ZP_04563500.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 60 11 70 70 98 100.0 2e-19 MKEKGITQYDLYTTYNVNRSQINRLKHNKNVEINTIDKLCNILHCRVEDIMEHFEDNNIF >gi|223714061|gb|ACDT01000154.1| GENE 10 9266 - 9583 375 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733020|ref|ZP_04563501.1| ## NR: gi|237733020|ref|ZP_04563501.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 105 1 105 105 184 100.0 2e-45 MNINSDKFKMLIYDLMIGNIELSSYSCKESEKVKNEFTIGSKCAVAYQEIFEANLRICSK LGVQEDEDIECIVSNFFDILQYLSLKMFDYGVLFTNSEHDYNKES >gi|223714061|gb|ACDT01000154.1| GENE 11 9620 - 10558 425 312 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733021|ref|ZP_04563502.1| ## NR: gi|237733021|ref|ZP_04563502.1| predicted protein 
[Mollicutes bacterium D7] # 1 312 1 312 312 495 100.0 1e-138 MQENNKSDIKVEGETTLPIVDKYLKNGTVKEIATILNILTVLTGIMTLLFNYLIDLFIEG YSNYLGLNKVSLNLRTSSIWETFFLATIFTIALYMLYYYISKIWKHLNYKFLFLSILIII FTIIVYILLKIINIGMDFLTLFVLFAIMIYPTFLMNHHVDKQDNLSNSNSNKKCGLGTKF YKIYTFIRKIFQKIKPYLYYGALPLVLLLLSIFATLFITNIIQTTKDAGEEFLLNEKTHT IVTIEDNSYLLLQDNNKDKIIMKIVSENTNPDKSKKYYVKKGEYKYLKEYDNLTVRVENC KIDVKAEPSKSE >gi|223714061|gb|ACDT01000154.1| GENE 12 10826 - 12988 1870 720 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733022|ref|ZP_04563503.1| ## NR: gi|237733022|ref|ZP_04563503.1| predicted protein [Mollicutes bacterium D7] # 1 720 1 720 720 1177 100.0 0 MLKFSRNIKRCLISLFVASSVAANIAVPINAYSPNASGSLNSMSATDRAIFNQLVGGNTS NIVRASAGSISTSIEGRSSVNGYQWTAWYNENNKNFALDKTQTNGTYNVGVPTKTGNSQV SANMTKAGYYKYLSDPIYSEIRLKASQTFNWVTANTSKVNTYAEDYYSPGYPKRLKRSDS IKHTATTFSIVNIAKDLYKIINMNEKKELQKGLNAHTSANCGGGLGTSGSHNGGSTFNKT YSNSKQSIIGATNAINNKLNSWINGSVSNISTANYSTVSVIEDSGNRKVIETKIYSGTVS AYNQECKSNYAALKLDVDVYHTVIENHSQTIEVANTVVAQNVSATELSKYATFTLKSKNG GTGAVNTQVVKSNGGWWGSNHPAYHQGNNTTIDSNGTYYYWSNNGGGSHSTNLRNAITIK EKLYTFAGVPTNLGKDNKYTLTINPNGGEYNGNPGITSIKQLIGTSYTVNTPSRDGYILS GWEFSGSGTWNPNNQKYTFGTGDGSLTAKWIAKTPETDPNNPDNTNKKYTLTIDPNGGKY NNKTTPTTITKKIGDSESIKNPKRSDYIFLGWKVVSGDSNCFSQGTSSSMYRFYSDATLQ AQWIKKGQSNSGTVIIDPNGGNYDGSTNKVTISGNAGSTTTIKTPVKDGYIFKGWEVINI EKNKFNGNTFTFIQGTTTLKAKWEVIKIDPCIVNGTGVDACGTNQDSMPGTWVHLSKDIY >gi|223714061|gb|ACDT01000154.1| GENE 13 13114 - 13746 545 210 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733023|ref|ZP_04563504.1| ## NR: gi|237733023|ref|ZP_04563504.1| predicted protein [Mollicutes bacterium D7] # 1 210 15 224 224 346 100.0 6e-94 MSFNHVNLNKDFVIIGSDSRECFQDKSYRNNRQKTFINRDLKICWSYTGLTTFHDIDNIK IIKNIIDMNNSDIVEKLTIIESIMNFQTRRYYEENKRDIYFDMFVCCNEEEQNAVYVLEI KNGLSNIEKKKKYYTDFFISSGVHLEVEKQININNITDSQLAPKELNRIIHLAMDASKND DNTIGGQAYIALMENDGNITTYINGKEAEF >gi|223714061|gb|ACDT01000154.1| GENE 14 13825 - 14583 523 252 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733024|ref|ZP_04563505.1| ## NR: 
gi|237733024|ref|ZP_04563505.1| predicted protein [Mollicutes bacterium D7] # 1 252 1 252 252 464 100.0 1e-129 MASQNKYCNNFGKNGHYEYGTNLINKPKIIPFDKAERIKNMSFNCLYMTDNYIIIGSDSR ETLSSSFYNDSMQKTFINKEQKLVWSMTGICKYKNIDYIKLVNEVMNSQAELRDKCILIQ HLLKYPTKDKSFSTSKIKIRFNLFVGVFENNNPKFYSVEVINGKCSNTNGIYDKINYPFA SGVHFERCNYLNLKIMDNKIKDDCVLEFTKLISDVIEYDKDKTKTVGGDVYVAYMDRDGN ITTYINGKEAEF >gi|223714061|gb|ACDT01000154.1| GENE 15 14719 - 15465 603 248 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733025|ref|ZP_04563506.1| ## NR: gi|237733025|ref|ZP_04563506.1| predicted protein [Mollicutes bacterium D7] # 1 248 1 248 248 470 100.0 1e-131 MENTKKKSKAKKIISLGIILCLVGGLIYYIFNKSAHPGFPEMPSPNINTNIYESIAHDMA VEAHDVPASRRSDKPSYFVGVCQTYDSGNKLYWSNYLTVDGSVSKYWELFYEGKVDNCPV QYTKIMTKLTFNQNYENEEEIIKEQQFMQENLGEHFSNQIKEDLKEDYWVAATYDMNSLT AQYDIAFYRNDFDISNPGLQAVLSYFGFDACYDEATVSFDPEKFVHSTNNFGSRFKGAFN NMIDDGVD >gi|223714061|gb|ACDT01000154.1| GENE 16 16412 - 17179 739 255 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733026|ref|ZP_04563507.1| ## NR: gi|237733026|ref|ZP_04563507.1| predicted protein [Mollicutes bacterium D7] # 1 255 7 261 261 484 100.0 1e-135 MSFFDDLDSFFVTGDIFYPDNKKRKERIKSLEKDCTNFLALAKKEAEASQEFMENLNNKV KNLLNAKDEIPQYLEITFPKQIEETPYESMMNIYATLCMVPLIQITWKTAYAIINHQSVE EFAKDLSVSLGITAATITGALYFGGIAALFVPGPINGAYSRTNLRKAIKNTANSRKKIYR QYYITEMFNRHLEIVLKCLNAYENTGIQNETIANAIVDKVNLFKNSILDEKTIEIEVEKY LNEKDKQEYQWTQEG >gi|223714061|gb|ACDT01000154.1| GENE 17 17176 - 18111 944 311 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733027|ref|ZP_04563508.1| ## NR: gi|237733027|ref|ZP_04563508.1| predicted protein [Mollicutes bacterium D7] # 1 311 1 311 311 547 100.0 1e-154 MAIFIDVFYPENSKISNRLSELVSDCQTLFADASVHKEEIETKLETCNKIIKEAYEKLSK TPPPLVSMEESIQWPVYVSEVLLDIFASTSAVSAIRWAIANNLVKTGKISAERLAELGIN SALKVTLPKWLNVSRVVAESGVAIAIMVAIDMIVDAANGAIERDKLKKAVREAAITRLDA KVVDMFNRNLLKTLDAVIVSYNVFSSMSLPPEVFDKMIQELVLQNKISEDSISRDVAIDW LIDFDETRKSWINEDGNWKEKKVKRSLYSLRMKNIERDRIQAAINQVDDLSDEIKAEFMS KVRGLCVEEKV >gi|223714061|gb|ACDT01000154.1| GENE 18 
19181 - 19507 153 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237733029|ref|ZP_04563510.1| ## NR: gi|237733029|ref|ZP_04563510.1| predicted protein [Mollicutes bacterium D7] # 1 108 1 108 108 164 100.0 2e-39 MYNSISMDSNYDICKTGLNDNNLSSSQKGNKEMRLICTKINNDEIIIAADSCLNMNNGNK INIQKIFYDKAKSLVIAYAGDSSIVMNGQNIKINNIIEHHLKKYLYRF >gi|223714061|gb|ACDT01000154.1| GENE 19 19639 - 20040 210 133 aa, chain + ## HITS:1 COG:VNG0285C KEGG:ns NR:ns ## COG: VNG0285C COG1943 # Protein_GI_number: 15789568 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Halobacterium sp. NRC-1 # 1 129 1 126 129 111 41.0 3e-25 MENKYRHTNSTVALINYHFVFCPRYRRKIFLIPNVEQRFKELVNLKCKELDLELIAIECD KDHSHMFLNCLPTQSPSGIMHQIKGYTGKILREEFAELSKMPSLWTRNYFVSTADNVCSE TIKQYVENQKKRY >gi|223714061|gb|ACDT01000154.1| GENE 20 20057 - 21517 604 486 aa, chain + ## HITS:1 COG:no KEGG:GWCH70_0818 NR:ns ## KEGG: GWCH70_0818 # Name: not_defined # Def: putative IS transposase # Organism: Geobacillus_WCH70 # Pathway: not_defined # 3 486 8 487 487 451 50.0 1e-125 MANFIVQFPLKTELYQEDVLDRRFETGRKIYNSLVHITQNRYKEMVKTKKYRSLMNSLSG DKKSDKPIWKEISEIKKQYGMSEYSFHNDVKKMQQHFSDNIDSFTAQKIASALWKSYDKF FYGNGKKIYYKKYGELNSLEGKSNKTGIRFKDDMILWNRLKIPVIIDYGNYYEYQSMQSN ISYCRIVRKYVRNKYKYYVQIVFKGRPPVKVDAETGEIKHCTGKGDVGIDIGTSTIAYSS STDVKILELADKVQNIENQKRRLLRKMDRSRRATNPNNYNKDGTVKKHGNKKVTWDKSNH YIKYQNQLKELNRKQADVRKYQHECLANEIVSLGDNIYVETMNFSGLAEKSSKTEKNDKG RYKKKKRFGKSIANRAPAMLLSIIDRKLSYYDRQLIKIDTWNAKASQFNHFDGTYHKKAL SRRWNDFNGVKIQRDLYSAFLIMNIADDLKSFDINKCNDRFEIFYKLHNLEVDRLRGHKN LSSIAI >gi|223714061|gb|ACDT01000154.1| GENE 21 21792 - 22380 457 196 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKILLSILAVLVLAGCSSTSSSKSNSDSKDTLKNTENYTKNIITDNNFEGFAPNEQNEL PLNDSNKITLLSDGDELKIATEVTKVDIYNPMSGGVFSSKDEAFENEKAQFKNEQNKFKK YKSIKIEYKQYDDYYIITRTVDIDLLKEEAGKKAKDILNDIAFISAYAIDDEMNEEKAYD ESNQTINVSKMIKNMK Prediction of potential genes in microbial genomes Time: Thu May 26 10:53:33 2011 Seq name: gi|223714060|gb|ACDT01000155.1| 
Coprobacillus sp. D7 cont1.155, whole genome shotgun sequence
Length of sequence - 1930 bp
Number of predicted genes - 2, with homology - 2
Number of transcription units - 2, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 2 - 251 179 ## gi|167756296|ref|ZP_02428423.1| hypothetical protein CLORAM_01829
          - Prom 293 - 352 12.5
          + Prom 240 - 299 8.7
2 2 Tu 1 . + CDS 531 - 1742 714 ## COG1373 Predicted ATPase (AAA+ superfamily)

Predicted protein(s)
>gi|223714060|gb|ACDT01000155.1| GENE 1 2 - 251 179 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756296|ref|ZP_02428423.1|
## NR: gi|167756296|ref|ZP_02428423.1| hypothetical protein CLORAM_01829 [Clostridium ramosum DSM 1402] # 1 83 1 83 93 96 55.0 4e-19
MSTKDVAKYLVENGYNIELIRNVNNPTHYNSEHLMLECQNDIIEFGKNEKVKMYLDNDIF
YSYTLCRYKTKLGTPVLECTLKE
>gi|223714060|gb|ACDT01000155.1| GENE 2 531 - 1742 714 403 aa, chain + ## HITS:1 COG:FN1382 KEGG:ns NR:ns
## COG: FN1382 COG1373 # Protein_GI_number: 19704717 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 400 1 402 402 296 41.0 6e-80
MIYRAMYVDKIMAYADTPFVKILTGVRRCGKSTILKMIMQELETKRNVLPERIVSYRFDS
MEYEDMTAKQMFIELKKRLSKDGRTYFFLDEMQEIKGWEKVVNSLASDYDVDIYITGSNS
RMMSSEIATYLTGRYVSFRIYTLSFDEYLLFKSSYAEVDEPKKELVNFVRLGGFPATHLQ
KYTQDEVYTIVRDIYNSTIFSDIVKRNQVRKIDQLERVVKYTFNNVGNTFSAKSISDYLK
SEKRKLDNETVYSYLEKLEKAYLLHRCSRYDLRGKEILKTQEKFYLADTSLRYSVLGYNS
DTVASSLENVVYLELCRRGYDVQIGKTPDGEIDFVATKQNNKLYVQVTQEIKSEKTEKRE
YERLLEIRDNYPKYVLTTDDFSGGNYLGIKTMHIADFLLSQEY

Prediction of potential genes in microbial genomes
Time: Thu May 26 10:53:42 2011
Seq name: gi|223714059|gb|ACDT01000156.1| Coprobacillus sp. D7 cont1.156, whole genome shotgun sequence
Length of sequence - 12669 bp
Number of predicted genes - 9, with homology - 9
Number of transcription units - 4, operones - 4 average op.length - 2.2
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 .
+ CDS 3 - 710 444 ## gi|237732947|ref|ZP_04563428.1| conserved hypothetical protein 2 1 Op 2 . + CDS 726 - 1589 911 ## COG1578 Uncharacterized conserved protein + Term 1651 - 1698 -0.8 + Prom 1648 - 1707 12.7 3 2 Op 1 . + CDS 1733 - 4294 2147 ## COG0480 Translation elongation factors (GTPases) 4 2 Op 2 . + CDS 4305 - 4688 400 ## gi|167756461|ref|ZP_02428588.1| hypothetical protein CLORAM_01994 + Prom 4696 - 4755 9.6 5 3 Op 1 4/0.000 + CDS 4780 - 5205 333 ## COG1846 Transcriptional regulators 6 3 Op 2 35/0.000 + CDS 5207 - 6925 193 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 7 3 Op 3 . + CDS 6927 - 8783 192 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P + Term 8785 - 8814 -0.2 + Prom 8805 - 8864 5.8 8 4 Op 1 . + CDS 8887 - 10317 1534 ## CD0747 putative lipoprotein 9 4 Op 2 . + CDS 10339 - 12667 1809 ## COG2199 FOG: GGDEF domain Predicted protein(s) >gi|223714059|gb|ACDT01000156.1| GENE 1 3 - 710 444 235 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732947|ref|ZP_04563428.1| ## NR: gi|237732947|ref|ZP_04563428.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 235 1 235 235 425 100.0 1e-117 SELAKNTNVSISQVSRFVRSLGIESFSDFKEALVYHGEQKRHTGLQREIIDYRVYQKMVT EEITYFYQNFDESRVDKLVYDFYKYPKIAIFGILNSDNGAKELQYNLIGWGKICISYTNF QDQLDFIKSADSNTLIVIFSFSGNYVMKNGYSRFYHTYDYLKSSKAKVYVITKNNLVQEL EYVNEVILVPIKHDLYNYTLQCLVDLIFYEISKKFDKIEFPFFGNNIYLNIRMLC >gi|223714059|gb|ACDT01000156.1| GENE 2 726 - 1589 911 287 aa, chain + ## HITS:1 COG:TM0176 KEGG:ns NR:ns ## COG: TM0176 COG1578 # Protein_GI_number: 15642950 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 1 271 4 280 293 152 31.0 1e-36 MKINENCLPCLISQVIKVANITNIKNRDMFYRGVFQYLGKLNFTKTNPEIIGATFEMIKQ QVNDEDPYFELRKYYNELFLSRSTEFENKINSFETAVKYAIIGNIIDFSPIYNTQIKDID KWFENIDQLKLAINQLEEMITDIKSAKVLLYLGDNCGEICLDKLLIRRIKKLNPEIDIYF GVRGKPVVNDSIEADAYFVGMDEYATIISNGDNSLGTVLERTSNEFKRIYRSADIVIAKG 
QANFESLSEQEKNIYFLLMVKCEVIANYIGVAQKSLICLNYYKSMSH >gi|223714059|gb|ACDT01000156.1| GENE 3 1733 - 4294 2147 853 aa, chain + ## HITS:1 COG:CAC0854 KEGG:ns NR:ns ## COG: CAC0854 COG0480 # Protein_GI_number: 15894141 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 6 632 5 640 644 510 41.0 1e-144 MKKIVLGILAHVDAGKTTLTESILYLTGTTRKLGRVDHRDTFLDYDFQERNRGITIFSKQ AIINWHDSQITLIDTPGHIDFSTEMERTLQVLDYAILVISGIDGVQGHSETIWNLLRYYK IPTFIFINKMDVSRYEQSELMQNLKERLDEHCFDFSNLNDEFYESVALNSEDLLNYYLEY NTLNKEMLINEIARCQLFPCFFGSALKMERIDNFFNEFTNYIKNKDYSESFGARVFKISH DEQGNKLTHLKITGGSLKVKTQLLNDEKVDQIRIYSGNKYHLTDEVMAGDICAIKGLKSI VAGQGLGIEDNTTQPLLSPYMDYQIKLPPDCDQHQVLKKLNLLAQEDPQLHINYNVRTKE IHVQLMGEIQIEVLKNIIQERFKVEVDFDYGRIIYKETIEETVEGVGHFEPLRHYAEVHL LLEPGKPGSGLKFDTNCQEYLLANNYQRLILSHLQEKEHLGVLTGSPITDMKITLVAGKA HLKHTEGGDFREATYRAIRQGLKTAKSVLLEPYFDFSLELPIEYLSRAIYDIEAMAGNFK LPENQTDIVVITGSAPVSKMQNYQSEVVRYTKGKGRLICQLSDYRPCQNQAEVIKSFNYD SEADVENPTGSVFCSHGAGYNVRWDQVAQHMHIPFVFKKIKKTANNIQDKSKFDNIDDEL ENIFTRTYGPVKQRSGEHQVTKKIFNESSYKYIPECLLVDGYNVIHAWPELKELAKDNLD AARMRLIDIMCNYQGYKKCILILVFDAYKVKDNIGSTTKYHNIYIAYTKEAQTADMYIER ATHELASKYNITVATSDALEQLIVLGQGGKRISSRELRLEVAQLDQEKLEEFRRKQPKSH NYLLEGLKNFNQD >gi|223714059|gb|ACDT01000156.1| GENE 4 4305 - 4688 400 127 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756461|ref|ZP_02428588.1| ## NR: gi|167756461|ref|ZP_02428588.1| hypothetical protein CLORAM_01994 [Clostridium ramosum DSM 1402] # 1 127 1 127 127 219 100.0 4e-56 MKKTFKYNLNYIGGIAEYQKRAIYTLNLKENSFSVDGIFKSRKFSYNDIIEVKYGEIEEL KKQLDPVNVLLFGRMALLDRRLNRKLKYCLAIILKEEMTIVFAEEYDGTIKTAYNTLSKI YKMTGSK >gi|223714059|gb|ACDT01000156.1| GENE 5 4780 - 5205 333 141 aa, chain + ## HITS:1 COG:CAC0197 KEGG:ns NR:ns ## COG: CAC0197 COG1846 # Protein_GI_number: 15893490 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 141 1 140 140 89 39.0 2e-18 
MANSLYYLLFKTAHTQRKKNIPFLKELNLAPGQPKVLRYLYEQKRECLQKDITKGCDIEP ATVSIMLNKLESNGFIKRNYSEENKRNVYIQLTELGKITFLKWQAYLDEREDITLRDFSK EERKQFINYLERLYDALLKEE >gi|223714059|gb|ACDT01000156.1| GENE 6 5207 - 6925 193 572 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 318 564 266 518 563 79 27 2e-14 MFKQFFGYFKNYKKYLYLSAVFVILETLFELIIPLIMADIIDIGVANKDRNYILIKGGLM IICALLSLGLGLLYAKTAAKAGQGFGYELRKAQYQKIQEFSFKNTDHFSTSSLVTRLTSD VTILQNAICNGIRPLVRAPFMMLTALTMAILINAKLAVVFLIAIPVLATCLIIIMSKVRP LYGKMQRALDSVNSIVQENLIAIRVVKSYVRKDYEQAKFNEVNLNYQQVSRKSFHYAVMN MPCFQFVMYSTIIAILWFGGGMIQVGNMQVGELTGFLSYIMQILNSLMMISNVFLMLTRS LASAYRIQEVFDEEIDIKDEKSDIKITRGKIIFKNVAFKYDLKAKEYVLNNINLEIEPGE TVGIIGGTGSAKTSLVQLIPRLYDITAGDLLIDGHDIKSYGIEHLRDEIAMVLQKNTLFS GTIKENILWGKADASDHELNEVLDIACASEFIDALPKGINTDLGQGGVNVSGGQKQRLCI ARALLKKPKILILDDSTSALDTATERKLTDGLAYYLPKTTKIIISQRLSSLAHADKIVIL TDGKIDDIGTPEELANRNHIYQDLCKIQEGDK >gi|223714059|gb|ACDT01000156.1| GENE 7 6927 - 8783 192 618 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 381 604 132 355 398 78 26 2e-14 MARINGTPRPKNLKKTLKAMASYLTRHIGTLIIIAILVLISAGANILGTYLLKPVINDYI LPGNNAGLLQAIITMAIMYLTGVGACHLYNQLMVKTAQEIVKEIREDLFDKIQNLPLSFF DQRTHGELMSRFTSDIDTILEALNNSFAMLIQSFIMIVGTITMIIILNPYLSIIVIASMV MMFIFIRYSSKKSKQYFNQQQKYMGSLNGFIEEMVEGSKVVKVFNHEQVNFAEFEKRNQA LRQAATKAVTYSGTMIPTVVSISYLNYALSAVIGGFMAIYGIMDLGSLSSYLVFVRQTAM PVNQFSQQMNFILAAMSGAERIFEIMDEKVEIDEGSVTLTNVRNEYGKLVECHENTGLWA WRHPRSNGDVELVLLKGDVRFHDVCFSYVPNHPILKNLNLYAKPGQKIAFVGSTGAGKTT ITNLINRFYELDSGSITYDGIDIKLIKKAALRQSLSMVLQDTHLFTGTIEDNIRYGKLDA SDEEVVAAAKLANADSFIRRLPQGYQTMLTGDGSNLSQGQRQLLAIARAAIANPPVLILD EATSSIDTHTEKLIEKGMDSLMKDRTVFVIAHRLSTVRNSKAIMVLDHGEIIERGNHDEL INQKGRYYQLYTGKMELS >gi|223714059|gb|ACDT01000156.1| GENE 8 8887 - 10317 1534 476 aa, chain + ## HITS:1 COG:no KEGG:CD0747 NR:ns ## KEGG: CD0747 # Name: not_defined # Def: putative lipoprotein # Organism: 
C.difficile # Pathway: not_defined # 2 476 5 485 486 491 53.0 1e-137 MKKVRLVLISLLMFFAVTGCSNEDKNLLDKDDPTTIQVWHYYNGAQQEEFNRLVDEFNKT VGKEKGIIVEGSGQGTISDLERNVLDSINGKAGAADIPNIFAAYGDTAYQVDKLGYAVDL NKYFSKDELSKYVDGYLEEGHFSSKDTLKIFPVAKSVELFMLNKTDWEKFANATGASTND LNTIEGVTKVAEQYYNWTDSLTAAPNDGKAFFGRDAFANYMLVGYRQLATDIFSKKDNKI VLNFEADIAKKLWDNYYVPYISGYFSSSGKFRSDDIKIGNILACVSSSSSVTYFPKEVIL NDEESHSIELETFACPKFKDGKDYVVQQGAGMVVLKSEEKEQQAAVEFLKWFTSDKQNIA FSNASGYLPVTKSANDLDKITDEVEVNESVKKTLNTSLKMISDNNMYTSVPFEKGTDARN VLETTMSNLAKQDRETVITNLSNGMNLEQAISQFNNDTYFNQWYTNTKSQLESLIK >gi|223714059|gb|ACDT01000156.1| GENE 9 10339 - 12667 1809 776 aa, chain + ## HITS:1 COG:PA0575_3 KEGG:ns NR:ns ## COG: PA0575_3 COG2199 # Protein_GI_number: 15595772 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 555 722 8 173 182 99 33.0 2e-20 MFRKIMKNKSIFSYLLVVMCILVLIETTILVGSLSTGGMFAKLNQNAKDIVDQRVINRSS YLQNEMLNNWSNLSQLTDHINTTAKQLVSEGKVDYEHLDDSSETATPLILAVVDQLISTM RSQHVTGAYIIFNNHDLDKGLEDKPGIYLRDLDPLSKASAENGDLLIERAPTEVVKSLNI ATDSSWRPRFEFKKANIKYYDFFYTPYQQAISNSQEFSSTDMGYWGGSFRLRDSENEAFT YSLPLINDQGAVYGVVGIDITLDYLNKLLPSTELLDEGNGSYLLAINQDDSLILSNILTN GNIYTTKSPTTELMQSENDYYINQGSNTLYSSLKYLNIYNSNTPYSNQRWALVGIVDISD LFAFTNKITSILMFAIFLTLTVGIGGSFIFSYIISKPIKKLSDEVVKANIKNKISFRRTN IAEIDHLANMMEQLNQNVRDTALKFTNLLQMASIKLAGFEYNINTEELFISENFFEVLLK YDVNTSELTMTKFKETFEGYKQYIVSHDYSKKEYLFKIPDDDNYVFVNLRLLVNDNAYTG VIENVTNTIVEKNVIEYERDHDALTGLLNRRAFIRIMNSLFENEINKIKIAALLMLDLDN LKYINDNYGHEIGDNYISKAAETFVNSTPENTIISRISGDEFFVFFYGYDDENVIKQLIN KLKEAISKAFIPLADNSNFHVNASGGIAWYPRDSESFEKLQHYADYAMYKIKRTSKGNLT EFNLNDYIADSFLSESKEELNTIIEDRAIQFYFQPIISSKDGTIYAYESLMRSFMP Prediction of potential genes in microbial genomes Time: Thu May 26 10:54:08 2011 Seq name: gi|223714058|gb|ACDT01000157.1| Coprobacillus sp. 
D7 cont1.157, whole genome shotgun sequence Length of sequence - 20332 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 13, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 558 504 ## COG2200 FOG: EAL domain + Term 629 - 667 3.1 2 2 Tu 1 . - CDS 665 - 2995 2000 ## COG0474 Cation transport ATPase - Prom 3119 - 3178 9.7 - Term 3108 - 3154 1.1 3 3 Tu 1 . - CDS 3237 - 4664 1323 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 4695 - 4754 4.7 + Prom 4685 - 4744 12.0 4 4 Op 1 . + CDS 4785 - 6008 913 ## EUBELI_01577 hypothetical protein 5 4 Op 2 . + CDS 6062 - 6466 258 ## Amet_0451 Zn-finger containing protein + Term 6469 - 6506 4.0 6 5 Tu 1 . - CDS 6477 - 6932 266 ## PROTEIN SUPPORTED gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 - Prom 6958 - 7017 8.4 + Prom 6921 - 6980 7.6 7 6 Tu 1 . + CDS 7032 - 7550 631 ## gi|237732962|ref|ZP_04563443.1| predicted protein + Term 7551 - 7581 1.1 - Term 7540 - 7598 1.5 8 7 Op 1 . - CDS 7743 - 8933 653 ## COG0582 Integrase - Term 8956 - 9014 6.0 9 7 Op 2 . - CDS 9022 - 9222 253 ## SMU.1031 putative transposon excisionase; Tn916 ORF1-like - Term 10369 - 10413 1.2 10 8 Tu 1 . - CDS 10618 - 11589 810 ## COG2946 Putative phage replication protein RstA - Prom 11769 - 11828 3.9 - Term 11649 - 11704 10.6 11 9 Tu 1 . - CDS 11948 - 12595 586 ## Cthe_1531 hypothetical protein - Prom 12825 - 12884 5.1 - Term 13192 - 13228 2.4 12 10 Op 1 . - CDS 13234 - 14547 1207 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 13 10 Op 2 . - CDS 14623 - 16986 2147 ## COG0474 Cation transport ATPase - Prom 17054 - 17113 6.5 + Prom 16973 - 17032 10.4 14 11 Tu 1 . + CDS 17199 - 18539 1105 ## COG0372 Citrate synthase + Term 18581 - 18607 0.3 - Term 18569 - 18595 0.3 15 12 Tu 1 . 
- CDS 18598 - 19662 1167 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase - Prom 19685 - 19744 8.4 + Prom 19690 - 19749 7.3 16 13 Tu 1 . + CDS 19905 - 20315 402 ## Cbei_2187 hypothetical protein Predicted protein(s) >gi|223714058|gb|ACDT01000157.1| GENE 1 1 - 558 504 185 aa, chain + ## HITS:1 COG:VC1592 KEGG:ns NR:ns ## COG: VC1592 COG2200 # Protein_GI_number: 15641600 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Vibrio cholerae # 16 171 87 247 284 90 32.0 2e-18 LETFANHLRNERINKDCKIFINSISNQILSNKKIDELEKEYSKYLKNVVLEVTEVEYIDE NYHQQKAELMKKWQAELALDDYGSGYNSDRILLLISPKFIKIDMDIIRNIDSDPDKCKIV ENIVNYAHERDMKLIAEGIETIDELKQVIKLKVDYLQGFLLAKPQYLPPRIQEDVIKLIQ LLNEK >gi|223714058|gb|ACDT01000157.1| GENE 2 665 - 2995 2000 776 aa, chain - ## HITS:1 COG:SP1623 KEGG:ns NR:ns ## COG: SP1623 COG0474 # Protein_GI_number: 15901459 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 3 761 5 773 778 519 37.0 1e-147 MGKIKGLNKQEVAKRISEGKVNLSPKPVVKSNFQIIMGHVFNLFNAYNFIIAAALIAVQA YSSLFFVVIVISNTVIRARQEIKSKNMVAKLNLIVSPKTKVIREGKTISIDNEEIVLDDV VYFETGNQISADSIIIENNVEVDESLLTGEADPISKGPGDHLLSGSFILSGACYGKVEHV GKDNYANKITDQARTRKPVSSVLLNTFNKVTRYTSYLVLPLSILMLYQAYIIRDQGITST IVNTATALLGMLPKGLVLLTSVSLAASVVKLGKMNTLVQEMFSIETLSRIDVLCLDKTGT LTEGKMEIEQVIKLKNPLNLDLDEIMCSFVKGSLDNNITFKTLSNYFTGPAKYETLDRIA FSSARKWSAIELDSIGTIIVGAPEIIRPDYQLSSRIIEMQKSGARVLLIGHHPTLNILQE TLGQTTPIGLIVIKDPIRKDAHEALKFFSDNDVAVKVISGDNPATVSAIASQAGVANAEK YIDASTIVTDQQLEDAILNYNVIGRASPFQKHQMILCLQKHNQKVAMTGDGVNDVLALKD ADVSIAMGSGSDIARQVSQFIIIDGKLQTLVEVVREGRLVINNVTRSASMYYLKTIYTIL LSILSILMNIPYPFIPFQMTLLDMFIEGFPSFMILFERNIEKPKESIGHHAMRFSLPNAL AIVLSVAGIRLLAPTLQLSLAETFSVLYFTTAFVSIHMIYRIYKPLNWYRGGVLIIDIIG FILSTPIFWPLLEMHVLTPKLIQIILITIVISIPVLIILTRSVSYYLKNLNSKKAL >gi|223714058|gb|ACDT01000157.1| GENE 3 3237 - 4664 1323 475 aa, chain - ## HITS:1 COG:CAC3377 KEGG:ns NR:ns ## COG: CAC3377 COG0726 # 
Protein_GI_number: 15896619 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Clostridium acetobutylicum # 268 462 88 281 282 134 40.0 4e-31 MKLFNKFSKKQKITFISVLFILIISTVYLIWQKNNQLLTFVKDPILEINHEFDYQKIIKN IKNVQTSDIKVDTSKLKNDKIGSYPVTYIYQGKKFTVNINVVDKTKPKFEINNIEIDAGM DVDPTMLVDNIVDETKTTISFKDAYNFSKKGKVKVTVIVSDESNNKSTKHATVTVLSKDT TKPTISGLNDLTVTLNGKVDYLAGVKGNDNRDPHPQLTIDNKKVRLDKLGDYIVTYTCKD RSGNKIVAKRKVSVVEKKEIGVYAQTEEKVIYLTFDDGPSKNTKRILDILDKYHAKATFF VTGTNQNYNYLIKEAYQRGHTIGLHTYSHDYKTVYTSVTAYFNDLERVGNMVKNEIGFIP KYIRFPGGASNTISRKYCPGIMSILANEVIDRGYQFYDWNYGTGDAGGNNIPVNQIIATA TAGNANNQVILAHDTDAKNTTVDALPAIIEHYQALGYSFKSIDDNSYVPHHHINN >gi|223714058|gb|ACDT01000157.1| GENE 4 4785 - 6008 913 407 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01577 NR:ns ## KEGG: EUBELI_01577 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 388 1 357 365 256 35.0 1e-66 MSNIYDYLKWRGDLSMRIDPFNDIDGLILSELVYVDFDKIVPTFPNEQSYCLKDVANNFS QVNNIEELLDEISFTKESITLLQELAKCPRYQNILLSNYINELDYQSIKQFAAITFQLED SSIYVAFRGTDDTILGWREDFQMTYQFPVASQQRAVEYLAKIAEIDFSKSLFYLLCHCDY KMKWYQIFRIYITQKISGVKIRVGGHSKGGNLAVYASSCVNSKISKRIIEIYNNDGPGFN QKMLNTDNYRNISTRIKKFIPESSVFGIMLENLEEKIVVKSSAKGLMQHSGFSWSVEGTQ FVCATEMSKDSQTMDAAFKSWLSKVDSKTRKDAVTAVFSILEAAKIEKINDFSEKSLSHL ITAIKEIKNMNSDTRNALIQFFKTLIVETYDCSRKEHKKKDTNENNS >gi|223714058|gb|ACDT01000157.1| GENE 5 6062 - 6466 258 134 aa, chain + ## HITS:1 COG:no KEGG:Amet_0451 NR:ns ## KEGG: Amet_0451 # Name: not_defined # Def: Zn-finger containing protein # Organism: A.metalliredigens # Pathway: not_defined # 6 134 4 132 132 100 41.0 2e-20 MNKFQNALYRFMSGRYGSDQLNNFLLIFALILLILNLFVIRNPYLATIIWIILIINIFRT YSRNIYKRRAENDKFLSLIQPVKKRINIIKSNKNDKMHKYFLCPNCKQTVRVPRGRGQIT ITCPKCKQKFDKKS >gi|223714058|gb|ACDT01000157.1| GENE 6 6477 - 6932 266 151 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 [Streptococcus pneumoniae R6] # 1 141 12 152 165 107 40 8e-23 
MKEQQLDALLENINYEISLLSNASAFIKQVFSDISWVGFYLCQDNQLILGPFQGKTACTL IPFNKGVCGACATNLKTYCVPDVHQFPSHIACDSATNSELVIPLIIDNSLYGVLDLDSRC FARFQPDDQEELEKLALIIIKHLKRIKEETK >gi|223714058|gb|ACDT01000157.1| GENE 7 7032 - 7550 631 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732962|ref|ZP_04563443.1| ## NR: gi|237732962|ref|ZP_04563443.1| predicted protein [Mollicutes bacterium D7] # 1 172 1 172 172 256 100.0 5e-67 MKRILKSICCLLLLFAMSFSIVGCSNNDDTNKDSNNNTTTNESDEDNGLEGIDSRVKNKF SSDYKVNRDDVNEAVNYIRDNIDDVKDKEVAKKLYEHGSYLEMAADKGEVDDDNSIRDLG ISTKEYAAKVYNAKDGEVDDIIADAKVRFDQFKTDLDSGVDTAVDKFMDFFK >gi|223714058|gb|ACDT01000157.1| GENE 8 7743 - 8933 653 396 aa, chain - ## HITS:1 COG:SP1129 KEGG:ns NR:ns ## COG: SP1129 COG0582 # Protein_GI_number: 15900995 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 92 389 82 375 387 76 27.0 8e-14 MSALKRRDNKNRILQRGESQRKDGRYTYKYVDALGKTRYVYAWKLLSTDRTPKGKRDDLS LREKERIIQRDLYDGIDTQGSKMTLCQLYAKKNAQRPNVKQNTLNGRKYLMQALEDDILG SRSIDSIKPSDAREWAIRMKEKGFSYKTISNYKRSLKASFYMAIEDDYVRKNPFDFKLSE VIEDDSEPKVALSEEQEQALLDFMAHDNVYRKYYDDVLILLKTGLRISELCGLTKKDLDF ENHAISVNHQLLKDKDGYYIDEPKTKSGIRKVPMSDETEQAFQRVLKRKQKSNIKEIDGY RNFLFLNTSGSPVHKQMYETILKRMVQKYNKTHEVKLPKITPHTLRHTFCTRLAQKNMNP KNLQYIMGHSNIMLTLNLYAHSSEIGANREMRSMIA >gi|223714058|gb|ACDT01000157.1| GENE 9 9022 - 9222 253 66 aa, chain - ## HITS:1 COG:no KEGG:SMU.1031 NR:ns ## KEGG: SMU.1031 # Name: xis # Def: putative transposon excisionase; Tn916 ORF1-like # Organism: S.mutans # Pathway: not_defined # 1 66 1 67 67 77 62.0 2e-13 MHDLSMPLWERYLMTIEEASSYFRIGENKLRQVANEHKAECVIMNGNRMLIKKKNFEKVI DKLEQI >gi|223714058|gb|ACDT01000157.1| GENE 10 10618 - 11589 810 323 aa, chain - ## HITS:1 COG:BS_ydcR KEGG:ns NR:ns ## COG: BS_ydcR COG2946 # Protein_GI_number: 16077554 # Func_class: L Replication, recombination and repair # Function: Putative phage replication protein RstA # Organism: Bacillus subtilis # 63 318 25 278 352 197 41.0 3e-50 
MNDKNFIEELKQKREEYGVTQTRLAVACGISRTYFNQIENGKVVPSDELKETIEKQIERF NPQEPLFLLIDYFRVRFPTTNALAIIRDVLQLKADYMLLEDYGQYGYESQYVLGDISIMC STNEQLGVLLELKGRGCRQMESYLLAQERSWYDFMLDCLTAGGKMKRLDLAINDRAGILD IPKLKAKYKAGECMTLFRNQKGYDGTEKCGHDIPQNTGETLYLGSTSSELYMCIYQKNYE QSVKKGIELDESEIKNRFEIRLKNERAYYAVVDLLTYYDAEHTAFSIINHYVRFLKHDDT LPKGSWELDEDWAWFIGENRESI >gi|223714058|gb|ACDT01000157.1| GENE 11 11948 - 12595 586 215 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1531 NR:ns ## KEGG: Cthe_1531 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 54 200 2 149 162 87 33.0 2e-16 MKIEIWKDFPQYFKPSYPEEFELFSHFEVTAGIPAVLFAITTWKENEKPNICFHSWSCFH GDKTAFFAVMGNLYQHTHTYANIKREKCFCINFLPISYYDNLVDTINNNDIETDEFEVGH FNLSNAKTIHAPVIQEAFINMECTLKAVQDLSGAGITAMVIGQVQHISVDKEYAQGYERR YGKDGFMMLVPAPQDLVTGKPNQSAIATINIEKFD >gi|223714058|gb|ACDT01000157.1| GENE 12 13234 - 14547 1207 437 aa, chain - ## HITS:1 COG:ydaJ KEGG:ns NR:ns ## COG: ydaJ COG1473 # Protein_GI_number: 16129299 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Escherichia coli K12 # 20 419 24 421 441 308 45.0 2e-83 MYNKIKLLAQQYHQETIQTRRDFHKYAERGWLEMRTASLVARRLDSLGYQVSVGKEVMRE EFRMGLPSKRLLTLNYNRALLQGADPYYIEKVKNGFTGVVGVMHNGIGPVVAIRFDIDAL SITEDDSADHNPGKNGFISCNPGAMHACGHDGHTAMGLTVAKILAQIKEQLHGTIKLIFQ PAEEGVRGAKSIALSGILDDVDYIFAGHIWQQVEGEDYDIYLGMNETFATTKLDVTYHGV SSHAAGMPQFGKNALLSAASCILNLHSIPRNSDGCTRINVGTIRAGTGRNVVPETAKLEI EVRGATTELNNYMEVYARDVIRGAALMHDTIVEINEMGKAYSLDCDKKMMDLIRHICIDH LGDIKVAKEDLSPLDGSDDFSYMIATVQAHGGIGTYMKLTTDLVASPHNCTYDFNEDVLL TGTTVYSSIAYYLLNQD >gi|223714058|gb|ACDT01000157.1| GENE 13 14623 - 16986 2147 787 aa, chain - ## HITS:1 COG:L168650 KEGG:ns NR:ns ## COG: L168650 COG0474 # Protein_GI_number: 15672557 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Lactococcus lactis # 10 770 5 775 775 538 41.0 1e-152 MKINNQDYHGLSTAEVNQRISNGKINTVDNKITKSYKQIFLNNTITFFNILNAALLGLVL 
FVGSYKNTLFILVIIINTIAGIYQEIKAKRTLDRLSILTTSHVEVIRDDELKEIDIDQIV LDDYIVLTTGAQIPTDSILIDGHMETNESLLTGESDPVLKQNGDTLYSGSFITSGKGICK VIHVGDDNYMNQITQEAKRLKKHNSELNRCLNKILKYVSIVILPIGGLLFLKQFFYGNQT FNSSIVSTVAAILGMIPEGLVLLTSVALTLSVLRLAKQNTLVQELFCIETLARVDVLCLD KTGTITEGTMKVEFDVKMSNVDISEIVGNLMHSLTDVNVTAQALKEHYQTKTNFNPYFVI PFSSDRKYSGTSYFNRGTYYIGAYQFLFPKGNEELEEKCVDYASEGYRILVLAHTDEVMS NEALPVDLEPLGIIVLSDVIRKDAKETLAYFNEQGVDLKVISGDDPITVSSIAKRAGLNN AHHYVDATTLSTSEEIAEALNKYSVFGRVTPQQKKAMVVALKQQGHTVAMTGDGVNDVLA FKEADCSIAMASGSDAAKNAANLVLLDNNFDAMPHIVDEGRRVINNITMSASMFLIKTIF SALIAISTIFFGQAYPFEPIQLSLISACGVGIPTFFLTYEANFARVEGNFLETVLEKSFP SAFTIAIGATLITNIGLALNYDPNMLSTVCVLFTGWNYTIALLKIYRPLTKYRKFIIYST QFCYYISMMIGQSILELTGVSFNWLMILLGLIAFSSIIVDLSAELFKYIKIFKEFLEKRR KKRLSKG >gi|223714058|gb|ACDT01000157.1| GENE 14 17199 - 18539 1105 446 aa, chain + ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 10 445 7 441 441 369 44.0 1e-102 MSQFIDGFIEKAKESNNKIDNELYSKFDVKKGLRNEDGTGVLVGLTKIADVVGYKKIDGK KVDCDGELYYRGIAVSDIINKREPHQRFLFEETCFLILFGYLPNKEELENFKKELSERYE LPPHYLESKILGFPSKNLMNKLQQEVLMLYSYDEDPDNISPSSTMYKGLNLIAKIPSIVC YAYRSKVHYFDKESLYIHQIRPDLSIAENILYLLRDHKHFTPKEAEVLDMCLVLHADHGG GNNSTFTNVVISSTGTDIYSSFAGAIGSLKGPKHGGANLAVMNQMKLIINEIGLDASDEQ INEIVERILDKQYNDNSGLIYGIGHAVYTVSDPRCVILRQQCSELALEKGREAEFDFYYR FEQAAIQAIKNRKGITVCANVDFYSGLIYDMLCIPTELYTLMFVVGRAVGWLAHNIENKL YSGRIIRPAAKYVGETYEYIPMNKRK >gi|223714058|gb|ACDT01000157.1| GENE 15 18598 - 19662 1167 354 aa, chain - ## HITS:1 COG:BH2156 KEGG:ns NR:ns ## COG: BH2156 COG0115 # Protein_GI_number: 15614719 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Bacillus halodurans # 3 353 6 356 358 412 57.0 1e-115 MEIKIERAKTLKEKPNQDNLGFGNYFTDHMFMMDYTEGTGWHDARIVPYAPIPMDPATMV 
LHYAQETFEGLKAYRDPNGNITLFRPEMNARRMINSNRRICMAELPEEMFVEAVEAIVKY EQDWIPTAKDTSLYIRPFMFASEAGVGVHPAKTYTFIIILSPVGNYYPEGVNPVKIWVED EYVRAVKGGTGFTKCGGNYAASIAAQVKAESHGYTQVLWLDGVHRKYVEEVGTMNVMFVI DGKVVTAPLEGSVLPGVTRDSIIHILKDWGYEVEERELSIDELMAAGRNGKLTEAFGTGT AAVISPVGQLYYKGEEVIINDFKTGDLTQKLYDTLTGIQWGRLEDKYGWVRHIK >gi|223714058|gb|ACDT01000157.1| GENE 16 19905 - 20315 402 136 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2187 NR:ns ## KEGG: Cbei_2187 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 131 1 131 134 140 47.0 2e-32 MLKILICCGGGFSSSYVTERMKKEVVEKGLQEEVYLEFYPFSLVKEKEADFDIIMCCPHL KIYVERLLQETTISVPIYLLPPKMYGFMQLEEIVADAKDIIEIYRTFPNNPVHFPGEEKL LRITRGVAYRNVNKKR Prediction of potential genes in microbial genomes Time: Thu May 26 10:54:35 2011 Seq name: gi|223714057|gb|ACDT01000158.1| Coprobacillus sp. D7 cont1.158, whole genome shotgun sequence Length of sequence - 9217 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 3, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 513 507 ## COG1335 Amidases related to nicotinamidase - Prom 533 - 592 6.1 + Prom 531 - 590 9.2 2 2 Op 1 5/0.000 + CDS 628 - 960 459 ## COG0640 Predicted transcriptional regulators 3 2 Op 2 . + CDS 974 - 3445 2837 ## COG2217 Cation transport ATPase + Term 3448 - 3487 6.8 + Prom 3454 - 3513 6.4 4 3 Op 1 . + CDS 3546 - 4346 692 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 5 3 Op 2 22/0.000 + CDS 4357 - 4593 432 ## COG1918 Fe2+ transport system protein A 6 3 Op 3 2/0.000 + CDS 4583 - 5470 835 ## COG0370 Fe2+ transport system protein B 7 3 Op 4 . + CDS 5464 - 6594 979 ## COG0370 Fe2+ transport system protein B 8 3 Op 5 . + CDS 6645 - 7286 692 ## CA_C1623 hypothetical protein 9 3 Op 6 . + CDS 7288 - 8703 1148 ## COG1640 4-alpha-glucanotransferase 10 3 Op 7 . 
+ CDS 8705 - 9214 643 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family Predicted protein(s) >gi|223714057|gb|ACDT01000158.1| GENE 1 3 - 513 507 170 aa, chain - ## HITS:1 COG:Cj0119 KEGG:ns NR:ns ## COG: Cj0119 COG1335 # Protein_GI_number: 15791507 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Campylobacter jejuni # 1 164 1 165 173 130 41.0 1e-30 MKKLLVVIDYQNDFVNGSLGFQEAVALEDYLADLIKEYHNNNDDVIFTYDTHQKNYLTTQ EGKNLPVAHCLENTEGWQLYGKIHDLAKNDKKIIKETFGSLALGNYLKNKQYNEITLVGV VSNICVISNAIIIKAALPEASIIIDCLGIASNDPLLEQKSIDIMENLHMK >gi|223714057|gb|ACDT01000158.1| GENE 2 628 - 960 459 110 aa, chain + ## HITS:1 COG:CAC2242 KEGG:ns NR:ns ## COG: CAC2242 COG0640 # Protein_GI_number: 15895510 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 4 110 15 121 122 131 61.0 3e-31 MDKVVNEAIVNKVAAALPEEEILYDVAELFKVFGDSTRIRIICALFESEMCVYDLAACLD MTQSAISHQLRILKQANLVKFRRDGKLMYYSLDDEHVKQIFDAGYKHVIE >gi|223714057|gb|ACDT01000158.1| GENE 3 974 - 3445 2837 823 aa, chain + ## HITS:1 COG:CAC2241 KEGG:ns NR:ns ## COG: CAC2241 COG2217 # Protein_GI_number: 15895509 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 143 823 12 698 699 647 49.0 0 MKLKYSLKGLDCANCAQKVQERVSKLENVRECSVVFATTKMFVETDTDLFDESKIVEAVK SVEPDVEVINLSKNGQDYNVSNNHEECNGEHDHKHHHDHEKCGCGHEHEHHHDHKECSCG HDHEHNIESVQHGDLKGAINIKISGLDCANCAMKVEQAINKMNEIDEAMIIFSTETLKVK PKTSIAQNELLKKLQKVVDQVEDDVTLTLKDEIKVVEKPKLFVPKEHLGLIGGTLVYIAG IIAGEFDYAAIVYGIAYLLVGYKVILKALKNIRRGEVFDENFLMCIATIGAFCISDYKEA IAVMLFYSVGEIFQAYAVNKTRTSISSLMDLKSDYANLLVGEEIKKVAPEEIKIGDEIIV KVGEKVPLDGVVLEGASTLDTSSLTGETLPRNVSKGDEVLAGVVNLTGIINLKVSQVYED STVSRILDLVENAASKKAPIEQFITRFARIYTPTVVFLAVALAIVPMLIFKDAVFTDWLY RALTFLVVSCPCALVISIPLGLYAGLGKASKVGALIKGGNYLELLKDIDTVVFDKTGTLT 
EGSFEVVEINGADDLLMLGAYGESMSNHPIAKSILRKYGQEIDQKRISDFKEIAGKGIEV KIDDKVYNLGNKSYIEGLGITVNNPSTVGTVVHIVCQGKYLGNIVVADKIKETTIEGIKH LKKYGIKNTVMLTGDRSEVAEDIAKKIGIDTVYSELLPQDKVIQVETLINQGAKLSFVGD GINDAPVLARADLGIAMGGVGSDAAIEAADIVLMNDDIVTIGEAISISQKTNKILKQNVT FTLIIKIGVLLLTMFGYANMWMGVFADVGVTLIAILNSMRILR >gi|223714057|gb|ACDT01000158.1| GENE 4 3546 - 4346 692 266 aa, chain + ## HITS:1 COG:CAC1330 KEGG:ns NR:ns ## COG: CAC1330 COG1234 # Protein_GI_number: 15894609 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 1 266 1 267 268 271 50.0 8e-73 MEKVIILGTGNAGVKNCYNTCFALKNKNEYLLVDAGGGNGILKQLELAKIELSQITNMIV THSHTDHVLGVVWIFRMVATKIKSGEYDGNFNIYCHDELVSTIKTIIKLTVQERLYNLID ERIFINEVQDGQKITICEHLVTFFDIYSTKAKQFGFSIELDDGKLTCLGDEPFNELCYQY AVDSKWLLSEAFCLYQERDFFKPYEKHHSTVREASELAQTLNIKNLILYHTEEKNLSHRK KLYTDEAKQYFNGNIFVPDDLEEYRL >gi|223714057|gb|ACDT01000158.1| GENE 5 4357 - 4593 432 78 aa, chain + ## HITS:1 COG:MA3479 KEGG:ns NR:ns ## COG: MA3479 COG1918 # Protein_GI_number: 20092290 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Methanosarcina acetivorans str.C2A # 2 76 8 82 82 62 45.0 2e-10 MTLDKLIPGMSGKVTIVHGEGLLRRRLLEMGLTPKTVVKVRKIAPMGDPIELYLRSYVLT IRKDDAAMIEVEVISDAN >gi|223714057|gb|ACDT01000158.1| GENE 6 4583 - 5470 835 295 aa, chain + ## HITS:1 COG:CAC1031 KEGG:ns NR:ns ## COG: CAC1031 COG0370 # Protein_GI_number: 15894318 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Clostridium acetobutylicum # 6 286 8 292 683 221 42.0 1e-57 MPTKTIALIGNQNCGKTTLFNALTGSNQHVGNFPGVTVEQKSGTIKKHPSIKLVDLPGIY SLTPYTMEEIVSTDFLIKEKPSMIINIIDATSIERNLYLTMQLLELNIPMILALNMMDEV ISSGNSIDVEGLQQALGIRVIPLSASKNEGIEELIEAIETTLKENNCSHLDLCSGEIHKA IHSITHLIEDNAVKAKLPKRFAVTKIIEGDKDIIRQLHLDVQQLHIIKHIIEDMEEKEGL DKDAALVDMRYQVIENITRQTVFKEQETAGQVRSEKIDSILTHKYLEYQSSLSLC >gi|223714057|gb|ACDT01000158.1| 
GENE 7 5464 - 6594 979 376 aa, chain + ## HITS:1 COG:CAC0448 KEGG:ns NR:ns ## COG: CAC0448 COG0370 # Protein_GI_number: 15893739 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Clostridium acetobutylicum # 1 375 220 586 587 315 45.0 9e-86 MLMIFFLTFNVIGAPLQSLMEIAVDFIGSTIITFLTNHQVAPWLISLLRDGVIAGVGSVL SFLPLIVVLFFFLSMLEDSGYMARVAFVMDKLLRKIGLSGKSFVPMLIGFGCSVPAIMAS RTLSSQRDRKMTIIVTPFMSCSAKLPIYGMIIAAFFSTKAPLVMITIYCIGILVAIFSAL LLKATIFPGDPIPFVMELPSYRIPTAKNVIMHMWEKAKDFLKKAFTIIFIASLLIWFLQS FNFRFEMVTDSSKSILAYIGNKLSFIFAPLGFSDWRLSTSLITGITAKESVVSTLSVLTN SSSPDALYHALNTLLTPASAFAFLTFTVLYMPCVAAFAATKRELGSWLQAILTAGYQTGI AYVVAFIVYHLALLIS >gi|223714057|gb|ACDT01000158.1| GENE 8 6645 - 7286 692 213 aa, chain + ## HITS:1 COG:no KEGG:CA_C1623 NR:ns ## KEGG: CA_C1623 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 1 213 1 213 213 220 53.0 3e-56 MDKNLGTIIKIKQDGMIKAFEQNNMQITFVDNFEQLHNYLKKYLCNHKTVALGGSMTLFE TGVIDLIHQSDVLLHDRYQEGLEREQMQEIFRKAFTSDLFITSTNALTTNGCLYNVDGNG NRVAAMIYGPKEVIVIAGKNKIFDSEEEAINHIKSISAPANAVRLNKKTPCTKTGTCMNC LSADRICSSYVKLGYQGNINRIKIVIVDQDLGY >gi|223714057|gb|ACDT01000158.1| GENE 9 7288 - 8703 1148 471 aa, chain + ## HITS:1 COG:alr3871 KEGG:ns NR:ns ## COG: alr3871 COG1640 # Protein_GI_number: 17231363 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Nostoc sp. 
PCC 7120 # 3 450 7 474 502 316 40.0 5e-86 MKSGILFPLSSFPSKHGIGDLGKEAYKVIDQLAANKGQYLQILPFHPSTQANSPYRAVST YAGDEIYINLNSLVENGLLDEVPEYYSETSNVDYDKIRKFKHKYLLLAYNNFKGNDQYDR FLTKCPWVFDYALYCTFASINKSYDWASWPTEFKDYPKYHQVNLTKYTEQIAYYQFVQFI FYHQWFSFKKYANDKGIKIIGDMPFYVDHGSVEVWLDNQSFLLDEAGYPTSVAGVPPDYF SPTGQYWGNPLYDWKYLKFNKFKFWTDRIGWASKIFDITRIDHFRAFDSYWDIPVDAPTA IEGSWKYAPGYELFDQLYASLGSIDLVAEDLGYLRPEVIELKNHYHLKGMKVFQFMLDDG FNDLEHSYFYPGTHDNETIKGWSDSLNEEALNKIKKIVDEKLPLNLAIIKFCLHSNASDV IVPVWDLLEVDNEQRFNIPGEVNDTNWTYRVSNMSLVEKAFKLFNKLKSED >gi|223714057|gb|ACDT01000158.1| GENE 10 8705 - 9214 643 169 aa, chain + ## HITS:1 COG:FN0320 KEGG:ns NR:ns ## COG: FN0320 COG1853 # Protein_GI_number: 19703665 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 1 168 1 180 180 81 32.0 6e-16 MSYKKIDVNELSLNPFKTIGKDWLLISAVKDGKINTMTASWGSVGVIWNKNVVTVYIRPQ RYTKEFVDSADYFTLTFFDGYKKELGVLGSKSGRDSDKIREVGFNVEMINEQPLFKQGTM GMVCKKLYRGKIEPTGFIDVSLDVKNYPDHDYHYIYIGEIESIYVNEKD Prediction of potential genes in microbial genomes Time: Thu May 26 10:54:40 2011 Seq name: gi|223714056|gb|ACDT01000159.1| Coprobacillus sp. D7 cont1.159, whole genome shotgun sequence Length of sequence - 3640 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 678 695 ## COG2323 Predicted membrane protein + Term 697 - 748 10.1 2 2 Op 1 . - CDS 725 - 1501 953 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor 3 2 Op 2 . - CDS 1495 - 2682 1315 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase + Prom 2943 - 3002 10.7 4 3 Tu 1 . 
+ CDS 3023 - 3574 815 ## COG0693 Putative intracellular protease/amidase Predicted protein(s) >gi|223714056|gb|ACDT01000159.1| GENE 1 1 - 678 695 225 aa, chain + ## HITS:1 COG:BH2297 KEGG:ns NR:ns ## COG: BH2297 COG2323 # Protein_GI_number: 15614860 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 1 212 17 227 232 111 30.0 9e-25 IVVPVISLAVLFGITKMMGYRQVSQLNMYDYINGITIGSIASELVMGGFDNMLKPLIAMI VYTIVIIILSKLASTSLTMRNIIDGHPVVLFENDQIYNEELEKAKLDVDEFLMQCRIGGY FDLNELQMVVLETNGSLSFFPKEKYRPTVVNDLNIKINKVNPPTLLVKEGTIFHENLKSI NHDVDWLERELNVLGVPLSDIILMYQEDNSNLGVYTINEIEKKFN >gi|223714056|gb|ACDT01000159.1| GENE 2 725 - 1501 953 258 aa, chain - ## HITS:1 COG:lin0253 KEGG:ns NR:ns ## COG: lin0253 COG1521 # Protein_GI_number: 16799330 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Listeria innocua # 1 254 1 254 259 248 48.0 1e-65 MLIVIDIGNTNITMGLVDKDEIIDNYRLTTKLERTSDEYGFMLLSFLQASSILVDEIEDI IIASVVPKIMYSFTNSIKKYFNKEPIIVGPGIKTGIRIKIDNPKELGADRLVDAAGAYYI HGGPCLVIDFGTATTFDVISKDGDFLGGATAPGIGISINALSSQAAKLPEIEIKKPKSVI AKNTVSSMQAGIIYGYIGLTENIIREIKAEYNEKLKVISTGGLGRIIFNETDLIDIYDPD LTFKGLKIIYDKYRKNNK >gi|223714056|gb|ACDT01000159.1| GENE 3 1495 - 2682 1315 395 aa, chain - ## HITS:1 COG:BH2510 KEGG:ns NR:ns ## COG: BH2510 COG0452 # Protein_GI_number: 15615073 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus halodurans # 2 379 5 386 404 344 46.0 1e-94 MKTIIVGISGSVAAYKTCDLVRALKKLNYDIEVIMTENATKFISPLTLGALINKPVLVDD FDDQGYQIKHISYAKKADCFIIVPATANIIGKIANGVCDDIVTSTFLAATCPKLIAPAMN VNMYDNLATQRNIERCKTYGIKFVEPGYGLLACGDVGRGKLADTDDIINMVEYCLSPKPL KHKRVLVTAGPTQEAIDPVRYITNHSSGKMGYALAKRAYQLGAKVTLISGPTDLRVPYGV EVINIKSARAMFEAVKESYEKQDYIIKAAAVGDYRVKEIATNKIKKHEDSLILEMVKNDD ILTYLGAHKTTQTICGFAMETQNLIENAKDKFKRKNCDLLVANNLNENGAGFKKDTNKVT FISKNDVQSIELMSKDELSDLILKKLIKIKESHSC >gi|223714056|gb|ACDT01000159.1| GENE 4 
3023 - 3574 815 183 aa, chain + ## HITS:1 COG:SP0804 KEGG:ns NR:ns ## COG: SP0804 COG0693 # Protein_GI_number: 15900697 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Streptococcus pneumoniae TIGR4 # 1 176 1 176 184 175 46.0 3e-44 MKKVAVLFHDGFEEVEALSVVDIMRRANVECTMVGMDKLEVTSSHQIKIKMDQIYDGLDN YDAVVIPGGMPGASNLRDDSRVIDLVKQFNHDGKIIGAICAGPIVLQEADVIKGKTVTCY PGFEEQLIGSNYQEALVQRDENIITGKGPAAALAFGYTLLEALGCDSSVIKEGMQYTYLE KNI Prediction of potential genes in microbial genomes Time: Thu May 26 10:54:41 2011 Seq name: gi|223714055|gb|ACDT01000160.1| Coprobacillus sp. D7 cont1.160, whole genome shotgun sequence Length of sequence - 973 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 43 - 112 10.3 1 1 Tu 1 . - CDS 125 - 616 206 ## gi|237732906|ref|ZP_04563387.1| predicted protein + Prom 540 - 599 8.3 2 2 Tu 1 . 
+ CDS 755 - 971 206 ## gi|167757012|ref|ZP_02429139.1| hypothetical protein CLORAM_02561 Predicted protein(s) >gi|223714055|gb|ACDT01000160.1| GENE 1 125 - 616 206 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732906|ref|ZP_04563387.1| ## NR: gi|237732906|ref|ZP_04563387.1| predicted protein [Mollicutes bacterium D7] # 1 163 1 163 163 238 100.0 1e-61 MSLKTSWAYNNFTYLSYQPHLRIYYLIWVTLISCFLLFKVIKLFKHITYLTIANKFLIGS CFISMLTGSYLPYHPDNENIVSVLHVLISSSGTISLLFIIQLLINRIMLDDYQFYKKMST LFRSLVLLIGMFIIMFGVINSIVELFFTFTVLFSLAQIEKHYF >gi|223714055|gb|ACDT01000160.1| GENE 2 755 - 971 206 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167757012|ref|ZP_02429139.1| ## NR: gi|167757012|ref|ZP_02429139.1| hypothetical protein CLORAM_02561 [Clostridium ramosum DSM 1402] # 1 72 1 72 106 130 100.0 4e-29 MKIKMKSIHANIINETVSASATDVRAQIEIGYSHDEKNRLQLNYHISVGKQNSGHLYVIK CSFVLIDFSTEM Prediction of potential genes in microbial genomes Time: Thu May 26 10:54:56 2011 Seq name: gi|223714054|gb|ACDT01000161.1| Coprobacillus sp. D7 cont1.161, whole genome shotgun sequence Length of sequence - 13081 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 6, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 37 - 96 4.4 1 1 Op 1 . + CDS 124 - 1182 1129 ## COG0527 Aspartokinases 2 1 Op 2 . + CDS 1175 - 1849 513 ## LEUM_0670 dihydrodipicolinate synthase (EC:4.2.1.52) + Prom 2031 - 2090 5.9 3 2 Tu 1 . + CDS 2172 - 2366 175 ## gi|237732910|ref|ZP_04563391.1| conserved hypothetical protein + Prom 2547 - 2606 5.3 4 3 Op 1 . + CDS 2660 - 3307 629 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily 5 3 Op 2 . + CDS 3258 - 4367 1176 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily + Prom 4369 - 4428 5.8 6 4 Op 1 . + CDS 4448 - 5167 546 ## BMQ_4133 DNA translocase FtsK 7 4 Op 2 . + CDS 5252 - 6739 1352 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins + Prom 6772 - 6831 6.6 8 5 Op 1 . 
+ CDS 6859 - 7950 1368 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein + Term 7957 - 7983 -1.0 + Prom 7956 - 8015 2.0 9 5 Op 2 . + CDS 8037 - 8204 90 ## gi|167757002|ref|ZP_02429129.1| hypothetical protein CLORAM_02551 10 5 Op 3 24/0.000 + CDS 8173 - 9567 1513 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 11 5 Op 4 26/0.000 + CDS 9557 - 10651 1320 ## COG4603 ABC-type uncharacterized transport system, permease component 12 5 Op 5 1/0.000 + CDS 10651 - 11613 959 ## COG1079 Uncharacterized ABC-type transport system, permease component + Term 11617 - 11663 13.3 + Prom 11647 - 11706 15.1 13 6 Tu 1 . + CDS 11730 - 12989 1316 ## COG0612 Predicted Zn-dependent peptidases Predicted protein(s) >gi|223714054|gb|ACDT01000161.1| GENE 1 124 - 1182 1129 352 aa, chain + ## HITS:1 COG:lin1475 KEGG:ns NR:ns ## COG: lin1475 COG0527 # Protein_GI_number: 16800543 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Listeria innocua # 2 235 3 245 403 124 35.0 3e-28 MIRVLKFGGTALKTNDSIKLVTNVITNNLDHKLLVIVSAMGRYPDAYATDTLNALALNAN YDEHCRLLSCGEIISSIVLSSNLKSQGINAISLSSMQLNLKVKHNRLVGLDTKVINKYFE TYDVIIVPGFQGIDKFKNIRILEKGDSDYTAVYLARRLNLDEVYIYSDVCGIFTGDPKYI NEARLIKHIGYRQALDLAKHKARIICYKALMEGYKKDGFKIKLRSFKDDRIGTLINHDDT NIRTMSINFNYWLIRFEETIAKSDYSFLHDCFDDDYLIFEEDLAKLETDYTKIDLYTKVH FVGCGLENDEIYRNFLEKFAIVSKNEHDSYYVKAKNQKQDLNLLHDLVVRSD >gi|223714054|gb|ACDT01000161.1| GENE 2 1175 - 1849 513 224 aa, chain + ## HITS:1 COG:no KEGG:LEUM_0670 NR:ns ## KEGG: LEUM_0670 # Name: not_defined # Def: dihydrodipicolinate synthase (EC:4.2.1.52) # Organism: L.mesenteroides # Pathway: Lysine biosynthesis [PATH:lme00300]; Metabolic pathways [PATH:lme01100]; Biosynthesis of secondary metabolites [PATH:lme01110] # 4 216 5 223 288 77 28.0 3e-13 MIKIDIFTNLITPFDIDDMIDYQALDNHIEKLVNQGNNKFIIGSRTGEAASLTFLEQKHL LRYVCYHYPGLEIYMQLSESCTKKVIKQIGDLKDISEFAGYMIELPEIFSLTQNGLIKHL 
DLIATATNKSIVIQQHKQNMIAVNHLLELKEKHENITGLITEVNSTGYQIIRNKGLGLYG IEEMLVQNEINNYDGLISNLTNLSYKKYRELITQHNDLYFDFLN >gi|223714054|gb|ACDT01000161.1| GENE 3 2172 - 2366 175 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732910|ref|ZP_04563391.1| ## NR: gi|237732910|ref|ZP_04563391.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 64 1 64 64 104 100.0 2e-21 MKFAFIIMGDFNLKYDHAAIHQQTAQIIGVTNLAEASIVAKNYLLRELAASNYAVLLEKQ ELKL >gi|223714054|gb|ACDT01000161.1| GENE 4 2660 - 3307 629 215 aa, chain + ## HITS:1 COG:lin1473 KEGG:ns NR:ns ## COG: lin1473 COG0595 # Protein_GI_number: 16800541 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Listeria innocua # 14 203 13 198 555 181 48.0 1e-45 MEEKMKKDVVKILPLGGQAEMGKSMYCIEIKEQIFILDAGFRFPEVSKLGVDIIIPSFDY LKGNKERVVAIIITHGHDDVMAALPYLLEAVNAPVYAPALTADLIEQMIKRHKKHNNFRV SYQLNRVKRNDSIIIAGIPVEFFPVTHSIPGSVGVALWTDNGYIVYSGEFIVDFGAPEGF RCDIQKMMEIGKKGVLALLCESSYSKKSRLYITKT >gi|223714054|gb|ACDT01000161.1| GENE 5 3258 - 4367 1176 369 aa, chain + ## HITS:1 COG:lin1473 KEGG:ns NR:ns ## COG: lin1473 COG0595 # Protein_GI_number: 16800541 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Listeria innocua # 7 369 201 555 555 195 31.0 1e-49 MSHLILKNPGYTSPKHKLTDKIDNIFEDSEGRIIISSYAQNIFRTKEIVELTKKYNRKIV FYGRDKYDSTNSIVRIGQRLKKAVIQVPKNLIAFSTDIGKKGIDDNLVVLLSGTPQRIYH DICDIIDGGDEFLKLNKNDTFIVASPVIPGTEKIANKAVNELYKTDSNIHVLKNKELCSM HASQEDIKVIIQIFNPTYFVPVKGEYQHFISNLEVAKNMAIKPENIAIIDNGEILSFKSG KLEDYRDTIEVEDVMIDGIGVGDVGDKVIDDRIQLSNDGVVIIGVTIDSKTHEIIANTDV QSRGFVYLRDSEHIIKGVIDIAEKCVSEMKDDPKLEAVEVRQNIKDKANKFIIKETGKRP VILPIIIEV >gi|223714054|gb|ACDT01000161.1| GENE 6 4448 - 5167 546 239 aa, chain + ## HITS:1 COG:no KEGG:BMQ_4133 NR:ns ## KEGG: BMQ_4133 # Name: ftsK # Def: DNA translocase FtsK # Organism: B.megaterium_QM_B1551 # Pathway: not_defined # 1 164 1 180 785 66 26.0 1e-09 
MAKTSRSKKSKQDQISEDLKVRIIATVGLFLIVVGAMKLGPIGEQLNFLCTYIIGNFVGI SYISLVLISGYVIYFAKLPKFTGPRAIGLYMVVGAVLTFMSSLNDDLMVGMKVINQYVST APCNRGGFLGAILYGILSALFDKTGAMIAAGFILVIGLTLLGSKFYLEHRKNVAERKKKK PTKKILKCIKALVNSQIFLPLKKIKQKAFFQRRYLVMKILQAKSKIPLERVQLESILLI >gi|223714054|gb|ACDT01000161.1| GENE 7 5252 - 6739 1352 495 aa, chain + ## HITS:1 COG:BS_spoIIIE KEGG:ns NR:ns ## COG: BS_spoIIIE COG1674 # Protein_GI_number: 16078743 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Bacillus subtilis # 13 487 311 777 787 533 58.0 1e-151 MKMICLNLKKKINKNYRLPALSLLKNPTAKKSGDNKGNALSKAEALTNVLHEFGVNATIS DIFIGPSVTKYELKLETGTRVNKIMQLQDDIKLALAAKDIRIEAPIPGKPAVGVEIPNSV ATMVSFKEVIKDIPKDLQDNKLLVPLGKDVSGKIIYAELNKMPHLLIAGATGSGKSVCVN TIICSILMRARPDEVKFILVDPKKVELTNYNGIPHLLAPVVTDPKKAAAVLQEVVVEMEH RYDLFAGANVRNIEGYNNYARKKNEELALDEQLEILPFHVVILDEVADLMMVASKQVEDC IMRIAQMARAAGIHLIVATQRPSTDIITGVIKANIPSRIAFAVSSGIDSRTILDASGAEK LLGKGDMLFSPMGSSSPVRVQGAFVSDDEVSAITHHTATQQEASYDDKYINVKLNTTSPS AASKEEEEDEEYEMCRSFVINAQKASTSLLQRQFRIGYNKAARIIDQLEADGVIGPQIGS KPREVYIRGYQEEDI >gi|223714054|gb|ACDT01000161.1| GENE 8 6859 - 7950 1368 363 aa, chain + ## HITS:1 COG:SMc02884 KEGG:ns NR:ns ## COG: SMc02884 COG1744 # Protein_GI_number: 15963962 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Sinorhizobium meliloti # 1 309 1 290 330 123 31.0 6e-28 MKKLLASLVVFSFLLVGCGGSSSEEFEVALVTDAGDINDKSFNQSAWEAVEQYQKDHDAT IKYYKPASFDDAGYKSQIEEAVENGAKIIVCPGFKFCNAIGELQDKYKDVKFILIDATPT VPDEKDSTKSNPVDPGDNVYCALYTETQPGYLAGYAAVKAGHTKLGFMGGIALPAVINYG YGYIQGANDAAKELGVTVDMIYTYTGTFNEDPAIKTKAAKWYDAGTEAIFSCGGGICNSI FPAAKEAGKYAIGVDSNQNDDKTGVVITSALKDVKKTVYDKLVDYKNDKFKGGIETLDAS GGYVGLPDDFDRLGDFNASAYKEVFDKVASGEIKIKSMTDITDNDNGNPTKVPVTNVNIT YDK >gi|223714054|gb|ACDT01000161.1| GENE 9 8037 - 8204 90 55 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|167757002|ref|ZP_02429129.1| ## NR: gi|167757002|ref|ZP_02429129.1| hypothetical protein CLORAM_02551 [Clostridium ramosum DSM 1402] # 1 31 1 31 510 63 96.0 3e-09 MEYVIEMLNITKEFPGIKANDNITVQLKKGDSCFVRRKWSRKINIDVNPIRAVST >gi|223714054|gb|ACDT01000161.1| GENE 10 8173 - 9567 1513 464 aa, chain + ## HITS:1 COG:lin1426 KEGG:ns NR:ns ## COG: lin1426 COG3845 # Protein_GI_number: 16800494 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Listeria innocua # 1 458 47 502 513 481 55.0 1e-135 MSILFGLYQPDEGCIKINGEIAEINDPNDANRYKIGMVHQHFKLVETFTVLDNIILGVEE TKNGLLAKDNARKKVLDLSKQYHLNIEPDALIKDITVGMQQRVEILKMLFRDNEILIFDE PTAVLTPQEIEELLKIMKGLVAEGKSIIFITHKLNEIKAVADRCTVIRKGKYVATVDVTT TSKEELSKMMVGRDVNFIVEKGKATPGQTVLKVENLNYCKKNAMKNALTDINFEVKRGEI VCIAGIDGNGQSELVYALTGMLTGYTGKIILNNEDITHYSVRQRNLKGISHIPEDRHKHG LVLDYNLAENMILKNYFNKDNQKHGFLKFENIYNYADRLIEKFDVRSGQGSHSIVRGMSG GNQQKAIIAREIESNCDLLIAVQPTRGLDVGAIEFIHKQLIQQRDEGHAVLLISLELDEV MNVSDRILVMYEGEIVTDLNPKETNVEELGLYMAGSKRGVGYEK >gi|223714054|gb|ACDT01000161.1| GENE 11 9557 - 10651 1320 364 aa, chain + ## HITS:1 COG:lin1427 KEGG:ns NR:ns ## COG: lin1427 COG4603 # Protein_GI_number: 16800495 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Listeria innocua # 8 354 6 350 350 186 38.0 8e-47 MKNKAWTKSFLSSLIAILCGLIFGLLIMFLVIPNQALEGFSILLRGGFYRGIKSVGEVLY LSVPIMMTGLSVAFAFKCGLFNIGTPGQYIVGAFASMFIAFNATFIPESIRWIFAVLAGG LAGALWASVPGLLKAYRNVNVVISSIMMNYIGMLLVIEGVKKFMYNPTGAESYTIPTSLA IPNLGLDQVFNGSSINLGTFIAIGVCILAYVIMNKTTFGYELKATGYNPDASKYAGMNEK KSIVLSMVIAGFFAGLGGALVYMAGTGRTIGTAEVLAAEGFNGIPVALLGFNNPLGVIFA ALFIAYINLGGNYVQAVHIAVEIIDVIVAAIIYFSSFTLFIRLILERRKGRKKDKKQALI GGDE >gi|223714054|gb|ACDT01000161.1| GENE 12 10651 - 11613 959 320 aa, chain + ## HITS:1 COG:SP0848 KEGG:ns NR:ns ## COG: SP0848 COG1079 # Protein_GI_number: 15900735 # Func_class: R General function prediction only # Function: 
Uncharacterized ABC-type transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 1 319 4 318 318 226 43.0 3e-59 MDLFYFLFRQSLMFSIPLLIVALGGLFSERSGVVNIALEGIMIFGAFFGIWFMHTAQNAG WLSGQALFLVALLISGVIGGLFALLHAFAAISMKADQTISGTALNLFAPAFGIFVAKTIF GGVKSIPFSNEFLIKKVPVLSDIPIIGDMFFKDFYLSTFIGIAILIATYVYLYKTKTGLR IRTCGENPQAADAAGVNVYKMRYLGVVLSGILAGMGGLIYVIPITSEYSSTVAGYGFLAL AVLIFGNWTPWRIALSAIFFAFAKTLAVTYASIPFLLNSGVPGVVFKIFPYVATLILLAF TSKNSAAPKAAGEPFDKAKR >gi|223714054|gb|ACDT01000161.1| GENE 13 11730 - 12989 1316 419 aa, chain + ## HITS:1 COG:BH2393 KEGG:ns NR:ns ## COG: BH2393 COG0612 # Protein_GI_number: 15614956 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Bacillus halodurans # 1 419 4 430 431 235 33.0 2e-61 MKKSYQLAGYTLHVIPSKKFKNITMSLKLEGKLTKENVTKRSLLAFMLTGGTENYPSTQA LSTHLEDLYGMSFGTNLATKGIGQVLNISSVCINETFLPYQENLLVQQIKMFNDVLFHPN VRNGKFDEQTFAIKKKELKERLIVQNDDKFMYGLDQLFKNMGEGGFLSISNNGYVEELDR ITNEEVYKYLVECLENDVKHLYVVGDVDESIVDVFKENLSFSSSQPLDPVTNFKSSKNDI LEVVEKQDITQAKLNIGYVVDCNFKDPGTYAMTVFNAIFGGFSQSRLFKIVREKHSLCYY ISSSYGAFSGIMTVNAGIEGSDYQKAKDLIAQELKNIQNGDFSNDEIDLAKLMLKSSLTK TKDEPISLITLAYNRDLTGVQETNDEYLEKLMRVSKEEIIAASKKVHLDTIFLLTGSDK Prediction of potential genes in microbial genomes Time: Thu May 26 10:55:13 2011 Seq name: gi|223714053|gb|ACDT01000162.1| Coprobacillus sp. D7 cont1.162, whole genome shotgun sequence Length of sequence - 4155 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1057 1053 ## COG0612 Predicted Zn-dependent peptidases + Term 1075 - 1120 4.4 + Prom 1066 - 1125 9.0 2 2 Op 1 4/0.000 + CDS 1299 - 1529 426 ## COG4224 Uncharacterized protein conserved in bacteria + Prom 1532 - 1591 6.9 3 2 Op 2 . + CDS 1621 - 3597 2540 ## COG0021 Transketolase + Term 3602 - 3647 9.3 + Prom 3611 - 3670 3.9 4 3 Tu 1 . 
+ CDS 3700 - 4080 272 ## gi|167756995|ref|ZP_02429122.1| hypothetical protein CLORAM_02544 Predicted protein(s) >gi|223714053|gb|ACDT01000162.1| GENE 1 2 - 1057 1053 351 aa, chain + ## HITS:1 COG:BH2392 KEGG:ns NR:ns ## COG: BH2392 COG0612 # Protein_GI_number: 15614955 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Bacillus halodurans # 5 349 80 425 432 278 44.0 9e-75 MNGTDASDEFAKLGASTNAFTSSSRTAYLFSTTSNEYPCIELLLDFVQRLDITPESVEKE KGIIGQEIKMYDDDPDWRVYFGSIQNLYNNHPVAIDIAGTVETVNRTDKTMLETCYNTFY HPSNMMLFVVGNINADTAINVIRENQAKKNFETAQPIICQKNIEPCKVKTKENILSMDVE MNKIIVSIKINEILSEPKEKIKRELSMNLLFDLLFSKSSKLYNDWLNKGIINDSFSASFT QERDYAFIQIGCDCDDYETLKNHLLDLIKNFKELKIEEKDFERIKKKNIGLFINLFNSPE SIANLFSRYYFEGTIALNLVDEVADIQLKDIYNTFKYFDIEYTSTCIVKKK >gi|223714053|gb|ACDT01000162.1| GENE 2 1299 - 1529 426 76 aa, chain + ## HITS:1 COG:BS_ynzC KEGG:ns NR:ns ## COG: BS_ynzC COG4224 # Protein_GI_number: 16078851 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 9 76 7 74 77 60 58.0 9e-10 MSNIDPELIVRINELAKKKKDGTISEAELEEQKELRQKYLKAFRSGFKQQLMGVKVVDEE GNDITPQKLKDEKLKN >gi|223714053|gb|ACDT01000162.1| GENE 3 1621 - 3597 2540 658 aa, chain + ## HITS:1 COG:BS_tkt KEGG:ns NR:ns ## COG: BS_tkt COG0021 # Protein_GI_number: 16078852 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Bacillus subtilis # 7 658 8 665 667 744 55.0 0 METSSLSIATIRALGIDTINKANSGHPGMVLGSTPAIYTLFTKEMDFYHKQSKWFNRDRF VLASGHASALLYSMLHLSGYQISIDDLKNFRQWGSITPGHPECDITDGVDASSGPLGQGI PMAAGMALAEKFLASKYNREGYDVVDHYTYALCGDGDMQEGVTYEAASLAGHLSLGKLIV LYDANKVTLDGPLSMSFSEDVKKRYEAIGWQVIRVTDGNDVSAIQKAIRKGKKELYKPTL IIIDTVIGFGSSNEGTNKVHGNPLGKEDGKNAKISYGFDHDEFYVPEEVYDDFANTCIKR GKNRYNKWQRLMKDYKNAYPELAQELENAIAGNFSLDLDEIMKKYSPGHNDATRNTSLEM IQEVAQQNPTFMSGTADLASSTKTNIAGEKTFSVENYDGRNLAFGIREFAMVAMMNGITL HTGIKVSAGGFLVFSDYFKGALRMACLMELPIILPLSHDSIAVGEDGPTHQPVEQLTMLR 
SMPNMRVFRPADAVEMASAWKLAVESTNNPTALILTRQNVTTMTATSYEGVCHGAYIVGK EVNKLDAIIIASGSEVNLAMAAKEELLKEDIDVRVVSMPSMELFEQQDAKYKDEILPHNV RARLSIEMASDFGWHKYVGLDGKTMSVNKFGASAPADIVIKNYGFTVENVVKNVKDIL >gi|223714053|gb|ACDT01000162.1| GENE 4 3700 - 4080 272 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756995|ref|ZP_02429122.1| ## NR: gi|167756995|ref|ZP_02429122.1| hypothetical protein CLORAM_02544 [Clostridium ramosum DSM 1402] # 1 126 5 130 130 211 100.0 2e-53 MNTYKIYKLNPNVVGLLEQYPEVKNQIINIEPKNVFFTKQMECLFGNNEGVSDFIEYKLK DRYDYHRDKNKHVIDNQLTKEEMICVIHDYYIIIEANKKTNIFLDILYQISKSYVIMVEQ SKNLEV Prediction of potential genes in microbial genomes Time: Thu May 26 10:55:21 2011 Seq name: gi|223714052|gb|ACDT01000163.1| Coprobacillus sp. D7 cont1.163, whole genome shotgun sequence Length of sequence - 5862 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 47 - 106 5.0 1 1 Tu 1 . + CDS 131 - 751 477 ## COG0406 Fructose-2,6-bisphosphatase + Term 757 - 804 1.1 - Term 798 - 828 -0.9 2 2 Tu 1 . - CDS 850 - 1458 519 ## COG0344 Predicted membrane protein - Prom 1483 - 1542 11.7 + Prom 1487 - 1546 8.5 3 3 Op 1 24/0.000 + CDS 1573 - 3501 2195 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 4 3 Op 2 . + CDS 3533 - 5377 2015 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 5 3 Op 3 . 
+ CDS 5409 - 5862 382 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit Predicted protein(s) >gi|223714052|gb|ACDT01000163.1| GENE 1 131 - 751 477 206 aa, chain + ## HITS:1 COG:PM0634 KEGG:ns NR:ns ## COG: PM0634 COG0406 # Protein_GI_number: 15602499 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Pasteurella multocida # 1 200 4 212 216 99 30.0 6e-21 MKIYLTRHSKTLWNQEKRLQGWQDSPLTAEGIEDAKLLKARITELKIDYCYSSPIERAKA TSEILFDHVLVDERLKEMNFGKYEGCLINELLNDPIYNRLWNLPDDDVSTPGGETYHEVQ MRLKDFFNDIYKKHHDDTIFITIHGMLFIILHGIMLNYKTSELTKINKYVVRGCSLSEVE YDGKEFKIKYIDDDSHLKSSEIITYK >gi|223714052|gb|ACDT01000163.1| GENE 2 850 - 1458 519 202 aa, chain - ## HITS:1 COG:lin1323 KEGG:ns NR:ns ## COG: lin1323 COG0344 # Protein_GI_number: 16800391 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 7 202 6 198 198 141 44.0 9e-34 MDIMISIILIILSYLYGSIPFALVIGRLVYKTDVRNYGSGNLGGTNTGRVLGKKAGVAVI LLDASKALLVMLLTSYLCSLLKMNNDLAYICALACVIGHCYPVFANFKGGKAVSTAIGFF MAVNPIGAVLALVVFFIVLKLSKYVSLSSVLASSSVLFAIPFLNMSMTGKIVTAAVILLL IYRHKENLIRIKNGTENKITWM >gi|223714052|gb|ACDT01000163.1| GENE 3 1573 - 3501 2195 642 aa, chain + ## HITS:1 COG:BH2140 KEGG:ns NR:ns ## COG: BH2140 COG0187 # Protein_GI_number: 15614703 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Bacillus halodurans # 7 641 9 643 655 853 68.0 0 MAAKSNYDESNIQILEGLEAVRKRPGMYIGSTDARGLHHLVWEIVDNAIDEALSGFGDNI QVTICKDNSIKVTDHGRGMPTGMHASGKPTTEVILTILHAGGKFNEDGGYKTSGGLHGVG ASVVNALSSWLEVTIYRDGKIWQQRFEDGGSKVGKLKVIGKSNQTGTTIHFMPDSSIFSV YEYNFSTISERLMESAFLLKGIKIDLVDERTGKEISYQYDNGLTAFVEYLNYEKETLNEI INIEGTNNEIEVDVAMQFTSGYQETTLSFVNLVRTGDGGTHETGFRSALTRTFNEYARKY GLLKEKDKNLEGNDVREGLSAIISIKVPEHYLQFEGQTKSKLGTPEARNVVDTVVYEKLT YFLEEHKETADNLVKKMIKAAQVRDAARKAREAARKGKGPRQEKILSGKLAPAQSKDKKR 
KELYLVEGDSAGGSAKQGRDRKYQAILPLRGKVVNTEKAAMADILKNEEIATIINTIGAG VGADFNEKDSNYNKVIIMTDADTDGAHIQILLLTFFYRYMRDLITAGKVYIALPPLYKVS KKSGKKEDIVYAWEEDELEEAKKKIGRGYNVQRYKGLGEMNASQLWETTMNPETRTLIQV SIEDAAIVERRVSVLMGDKVEPRREWIEQNVQFTLEDNFAIE >gi|223714052|gb|ACDT01000163.1| GENE 4 3533 - 5377 2015 614 aa, chain + ## HITS:1 COG:BS_grlA KEGG:ns NR:ns ## COG: BS_grlA COG0188 # Protein_GI_number: 16078871 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Bacillus subtilis # 1 602 10 612 806 716 60.0 0 MSLETIMGDRFGKYSKYIIQDRALPDVRDGLKPVQRRILYSMYKEGNLYSKPYRKSAKTV GNVIGNYHPHGDISVYDAMVRLSQDWKMRVPLVDMQGNNGSIDNDPAAAMRYTEARLSAI AVELLKDLDKDTVLMSLNFDDTEYEPTVLPAAYPNLLVNGATGISAGYATDIPPHNLEEV IDATIHRIKYPNCSLEKLMEFIKGPDFPTGAIVEGKQGILNAFKSGRGKVIVRSTSEMIE EKTMNKIVITEIPYEVNKAELVRKIDEIRFNRNIDGIIEVRDESDRTGLRIVIDLKKDIN VQNTLNYLYKNTDLQKNYNYNMVAIKDKRPVLMGVIEILDGYIAHQIDVVTRSSIHDLNK ANERKHIVEGLIKAISILDEVVKTIRQSKDKSDAKNNLIEKYGFSEKQAEAIVMLQLYRL TNTDIVTLENENKELDNRIAYLNNILNSDEVLRKVIIDQLKGIKKKYPMPRLSKIRDEVQ EIKIDEKAMILSEDINVSITRDGYIKRISNRSIKASEGIPFGKKENDYLVSMYQANTLDH LLLFTDAGNYLFIPVHKIEEFKWKDAGKHVSYLVKLNAGEKIIGTILVKDFNLPLYVMLA TKKWSNQTDEFKRF >gi|223714052|gb|ACDT01000163.1| GENE 5 5409 - 5862 382 151 aa, chain + ## HITS:1 COG:BH2139 KEGG:ns NR:ns ## COG: BH2139 COG0188 # Protein_GI_number: 15614702 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Bacillus halodurans # 1 105 636 745 816 75 40.0 4e-14 MGLKNDDRVVDVKLTDNNQGIILTTKAGYGALYSEQEVSIVGIKAAGIRAVNLKNDELVS LNIFNPLVSSSLIVISEEGGVKRIKIGDITPCNRATKGTLLYKNPKSKIIYVKDSVIVDN SQLITVTLNDQTSFEFKATDYNNANLEQKPS Prediction of potential genes in microbial genomes Time: Thu May 26 10:55:28 2011 Seq name: gi|223714051|gb|ACDT01000164.1| Coprobacillus sp. 
D7 cont1.164, whole genome shotgun sequence Length of sequence - 18959 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 7, operones - 4 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 116 - 787 645 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit - Prom 862 - 921 7.4 + Prom 818 - 877 12.7 2 2 Op 1 2/0.000 + CDS 956 - 2053 978 ## COG1161 Predicted GTPases 3 2 Op 2 7/0.000 + CDS 2068 - 2355 243 ## PROTEIN SUPPORTED gi|212638657|ref|YP_002315177.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein 4 2 Op 3 14/0.000 + CDS 2352 - 3452 1040 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase 5 2 Op 4 6/0.000 + CDS 3445 - 3789 490 ## COG0799 Uncharacterized homolog of plant Iojap protein 6 2 Op 5 1/0.000 + CDS 3789 - 4511 658 ## COG0500 SAM-dependent methyltransferases 7 2 Op 6 . + CDS 4513 - 5211 802 ## COG0775 Nucleoside phosphorylase 8 2 Op 7 . + CDS 5217 - 6665 1404 ## COG0826 Collagenase and related proteases 9 2 Op 8 . + CDS 6646 - 7554 686 ## CTC02275 protease (EC:3.4.-.-) + Prom 7561 - 7620 11.1 10 3 Tu 1 . + CDS 7669 - 8010 402 ## gi|167756980|ref|ZP_02429107.1| hypothetical protein CLORAM_02529 + Term 8070 - 8113 -0.9 11 4 Tu 1 . - CDS 8020 - 8559 623 ## COG0622 Predicted phosphoesterase - Prom 8667 - 8726 9.6 + Prom 8694 - 8753 12.4 12 5 Op 1 . + CDS 8969 - 9283 527 ## COG1440 Phosphotransferase system cellobiose-specific component IIB + Term 9320 - 9351 -0.8 + Prom 9298 - 9357 4.1 13 5 Op 2 . + CDS 9379 - 9696 543 ## COG1440 Phosphotransferase system cellobiose-specific component IIB + Prom 9744 - 9803 8.9 14 6 Op 1 9/0.000 + CDS 9827 - 10099 419 ## COG3830 ACT domain-containing protein 15 6 Op 2 . + CDS 10111 - 11475 1728 ## COG2848 Uncharacterized conserved protein 16 6 Op 3 . + CDS 11522 - 12019 425 ## Ccel_3420 protein of unknown function DUF815 17 6 Op 4 . 
+ CDS 12091 - 12747 627 ## COG2607 Predicted ATPase (AAA+ superfamily) 18 6 Op 5 . + CDS 12763 - 14490 1352 ## COG4866 Uncharacterized conserved protein 19 6 Op 6 . + CDS 14561 - 15985 1607 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase + Term 16208 - 16249 0.2 - Term 16097 - 16149 6.1 20 7 Op 1 . - CDS 16151 - 16417 170 ## gi|237732943|ref|ZP_04563424.1| predicted protein 21 7 Op 2 4/0.000 - CDS 16426 - 17607 1304 ## COG1015 Phosphopentomutase 22 7 Op 3 . - CDS 17686 - 18594 619 ## COG4974 Site-specific recombinase XerD 23 7 Op 4 . - CDS 18684 - 18905 94 ## gi|237732946|ref|ZP_04563427.1| predicted protein Predicted protein(s) >gi|223714051|gb|ACDT01000164.1| GENE 1 116 - 787 645 223 aa, chain - ## HITS:1 COG:BH1285 KEGG:ns NR:ns ## COG: BH1285 COG1191 # Protein_GI_number: 15613848 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus halodurans # 3 217 2 221 235 206 52.0 2e-53 MFSTLLSLLTNGFFVSYIKSKSFELPLTAKEEERYLEQFFNGDKEARNVLIERNLRLVAH IAKKYENNKDMQEDLISIGTIGLIKAVDSYKQNHKTKLATYASRCIENEILMHLRTNKKT NLDISLNETIGIDKDGSEIVLGDIIAAKQEEFIDIVDKKDTLDKFTTYFSVLEPREKDTL IMRYGLNNTKKYTQKDIAKKLNISRSYVSRLEKRALINYCVNI >gi|223714051|gb|ACDT01000164.1| GENE 2 956 - 2053 978 365 aa, chain + ## HITS:1 COG:BS_yqeH KEGG:ns NR:ns ## COG: BS_yqeH COG1161 # Protein_GI_number: 16079621 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Bacillus subtilis # 3 365 2 366 366 408 53.0 1e-114 MTKEIRCFGCGAIIQSEDEKKIGFVPKNALEKDVILCKRCFRLKNYHELQKTDLTSDDFL KILQEIGNHDCLVVYLVDLFDYNGSLINGLSRHLNNNDILVLGNKRDILPKSLKDRKIEH WLRRQLKESGIKPIDVILSSGTKNYNLDLLIDKINEYRHGRDVYIVGVTNVGKSSLINSL LKHYAGVTDNLITTSEFPGTTLDLIEIPLDDKNYLYDSPGIINNHQMAHLVKDEDLRIII PKSELRPINFQLNDQQTLYFGGLARVDFIKGPRSSFTCFFPKLLKIHRTKLGNSDNLYNR HLTLMPEIESIKDINEMASYDFKLPEGKIDIVISGLGFVSVNCPNANVRVHAPQGVGVFI REALI >gi|223714051|gb|ACDT01000164.1| GENE 3 2068 - 2355 243 95 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|212638657|ref|YP_002315177.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Anoxybacillus flavithermus WK1] # 1 89 1 89 97 98 52 4e-20 MLTGKQKRYLRGIAHNLNAIFQIGKEGVHQTQIEGIDDALEAHELIKVKILESCADSKNE IALELSMKTKADVVQILGRTIILYRPSEKEIYKLP >gi|223714051|gb|ACDT01000164.1| GENE 4 2352 - 3452 1040 366 aa, chain + ## HITS:1 COG:MPN336_1 KEGG:ns NR:ns ## COG: MPN336_1 COG1057 # Protein_GI_number: 13508075 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Mycoplasma pneumoniae # 6 185 2 180 180 107 30.0 3e-23 MIKIGVFGGSFDPIHRSHVRVIEESIRQLKLDKILVMPTANNPWKDSTGATKQQRLAMLE IALKRYKNVEICRYEIDQDSSKKNYTIDTIRYLKKIYPNDQLYFMMGMDQASLYHKWIAA KELSQLAQLVVFDRIGYQINDNLDKYNFIHLDLVASDDASNNIRNGALHALNPEVLKYIV NNGIYLETIVKNKMSLKRYRHTVSMAHLAKEIALANNLDGRKAYVAGMLHDIAKEMPHDE AVVIMQKNYPDYLNKPEAVWHQWLSEYVAKNEYLVDDPEILQAIRHHTTASLYMSKLDMC IYEADKYDPSRDYDSSKKIALCKKDVVAGFKGCLQDFYDFSHEKGRKIDECFFEIYNKYA KGDINE >gi|223714051|gb|ACDT01000164.1| GENE 5 3445 - 3789 490 114 aa, chain + ## HITS:1 COG:BH1328 KEGG:ns NR:ns ## COG: BH1328 COG0799 # Protein_GI_number: 15613891 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Bacillus halodurans # 4 109 7 112 117 99 41.0 2e-21 MNKLEVIIKALDDKLAEDIVVIDMQLASPIFDTFVICSASNERLMNALRDSVEDSCHENG YEVKKIEGLRNSKWLLMDYGDIVVHIFDADERNSYNLEKLWSDMPRIDVSKYLA >gi|223714051|gb|ACDT01000164.1| GENE 6 3789 - 4511 658 240 aa, chain + ## HITS:1 COG:SP1743 KEGG:ns NR:ns ## COG: SP1743 COG0500 # Protein_GI_number: 15901575 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Streptococcus pneumoniae TIGR4 # 2 235 5 248 248 130 35.0 2e-30 MSYENFAYYYDSLMDEQFYDDYFKFINEHADFKSVLELGCGTGEIAIRLAKAGKEIYATD LSKDMLEVSRLKAMEADVNLLLGRVDMTDFVTDKAVDLILCLCDSINYVLSKKKVLRTFK NVYESLKYNGTFIFDVDSLYKMDEILDGYFEEEDADDFYFKWHVEKTALGKVEHEIEIID 
KENNEHIKEMHYQQTYDIEIYLSLLKQAGFTDVQYYGEFEEYHDKSQRIIFICHKRRGGK >gi|223714051|gb|ACDT01000164.1| GENE 7 4513 - 5211 802 232 aa, chain + ## HITS:1 COG:SA1427 KEGG:ns NR:ns ## COG: SA1427 COG0775 # Protein_GI_number: 15927179 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Staphylococcus aureus N315 # 1 229 1 227 228 184 43.0 1e-46 MIGIIGAMEEEVAAIKEYMEITETRSILDCTFYQGTIKERQVVLLQGGVGKVNAAICTTL LLTNYKIDYVINIGSAGGLCLTQEVGDIVISNEVCQHDFDITAFPNRVIGEVPGLPPRIE ADRQLITQAKTILSNLNLNCEIGLIVSGDQFVATPEVATRIKNNFPDAKCTEMEAAAIAQ TCYKFGTSFIITRSLSDVFGKGDSSVQFDEYLKKASQASAKMCIALIAETEH >gi|223714051|gb|ACDT01000164.1| GENE 8 5217 - 6665 1404 482 aa, chain + ## HITS:1 COG:MA0538 KEGG:ns NR:ns ## COG: MA0538 COG0826 # Protein_GI_number: 20089427 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Methanosarcina acetivorans str.C2A # 5 479 7 492 855 397 42.0 1e-110 MQNIELLAPAGSYEALVAAVQNGANAIYLGGNEFSARAFATNFNREELQAAVAYGHLRNV KIYVTVNTLYEDNQFEKLQDYLLFLQTIKVDALIIQDIGLMSFVKQYFPDFEIHMSTQTS IYNLSAVKYFEDAGVDRVVLARENTLDEIADICQNTILDIEVFVHGALCMSYSGQCLMSS MIAKRSGNKGACGQPCRLAYKLQKDSMNLDKIPSYLLSPKDLCTFENIGQLIDAGVTSFK IEGRMKRPEYVATIVKQYREAIDAYLKNTTVNAFEQRIKKMKQMFNRGFTGGYILKDQNF VAKDYPGNRGIEVGTVIDYDKHRKIVKIQLQDKLKQGDRINFKSVGFTRTITKLYLFNNL INQGNAGDIVEIELNTPVKKNETVYKVIDIDLINEALASYKNENIKNTVTMAFSGQINEP ARLTINYKDLKVEKVSNLLIESAAKLPLDPQRIRQQLGKLGNTVFKANDITIDFPDNGFF SY >gi|223714051|gb|ACDT01000164.1| GENE 9 6646 - 7554 686 302 aa, chain + ## HITS:1 COG:no KEGG:CTC02275 NR:ns ## KEGG: CTC02275 # Name: not_defined # Def: protease (EC:3.4.-.-) # Organism: C.tetani # Pathway: not_defined # 3 279 473 767 787 83 25.0 1e-14 MVFLVIKEINEMRRQAIDELSNMIVKVKKVKKPMIKTKHNHINKQIKGIVVKIYNLAQLK ALLTEEVDAYYFPINEELDEAISLAHSVNKEIIPFTSFLNNQDILIKFKNSVSYNKINSI LVGDYGALQIFKDKKCILDSTFNLYNSYALNYFNNHDAILSLEMSRKQVNHLNNIKQNII MTVYGKTINMHLKHCLISDHYFNCKKIKCNLCKQGKFTLIDRKGEQFDIFPDQDCNNLIF 
NSHCLYIDHLEKLEVDFILLSFSNEAPEITKAVFRDFKNNIMFAKPRQISLNTKLTNGYF YD >gi|223714051|gb|ACDT01000164.1| GENE 10 7669 - 8010 402 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756980|ref|ZP_02429107.1| ## NR: gi|167756980|ref|ZP_02429107.1| hypothetical protein CLORAM_02529 [Clostridium ramosum DSM 1402] # 1 113 2 114 114 195 100.0 7e-49 MIDKDLAARLKSYRNFKHLTQKDVAAYLKVPHSAISDIENGKRDITVGELRIFSNLYGRS VEEIMSGKKYDYYNIANIARLLTELPDEDLKEVMFMIEYKRKRNEEKNRSEKF >gi|223714051|gb|ACDT01000164.1| GENE 11 8020 - 8559 623 179 aa, chain - ## HITS:1 COG:SPy0363 KEGG:ns NR:ns ## COG: SPy0363 COG0622 # Protein_GI_number: 15674513 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Streptococcus pyogenes M1 GAS # 3 154 6 160 173 90 35.0 2e-18 MQVLVISDTHLQNELFQKINQTYPKMDYYLHCGDSSLKKDDFLLNKYYTVKGNHDDEDFP VNIILEIGKYRCLITHGNAYDIYYGNDKIKQYMIANNIDICFHGHTHVPAYTQIGNRYII NPGSVMINRGSYGFGTFAIVEIGDTIKVNYYNSETFEECSKMVLDDGKIVLEEFKQILK >gi|223714051|gb|ACDT01000164.1| GENE 12 8969 - 9283 527 104 aa, chain + ## HITS:1 COG:lin2905 KEGG:ns NR:ns ## COG: lin2905 COG1440 # Protein_GI_number: 16801964 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 1 104 1 100 100 104 57.0 3e-23 MKTILLVCSAGMSTSLLVTKMEGAAKDAGVECKIFALPFSDAPRVLEEVDCILLGPQVRF QKAAIEKLAAGRKAGAIPVDVIDMRDYGTMNGKAVFEAAMKMIG >gi|223714051|gb|ACDT01000164.1| GENE 13 9379 - 9696 543 105 aa, chain + ## HITS:1 COG:lin2905 KEGG:ns NR:ns ## COG: lin2905 COG1440 # Protein_GI_number: 16801964 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 4 104 3 100 100 98 56.0 3e-21 MATRIMLVCAAGMSTSLLVTKMEVAAKEAGEEVEIFALPLTEGEKIIDTVDCVLLGPQVR YAKKEVEKIIADTGKDIPYDVIEMKDYGMMNGKAVYEFAKKLMGR >gi|223714051|gb|ACDT01000164.1| GENE 14 9827 - 10099 419 90 aa, chain + ## HITS:1 COG:CAC0478 KEGG:ns NR:ns ## COG: CAC0478 
COG3830 # Protein_GI_number: 15893769 # Func_class: T Signal transduction mechanisms # Function: ACT domain-containing protein # Organism: Clostridium acetobutylicum # 3 90 2 89 89 84 55.0 6e-17 MQKGIITVVGKDQVGIIAKVCSFLAEKQVNILDISQTIIQGYFNMMMIVELSSITEEFGT ICEDLDKLGEEIGVNIKLQHENIFNKMHRI >gi|223714051|gb|ACDT01000164.1| GENE 15 10111 - 11475 1728 454 aa, chain + ## HITS:1 COG:lin0538 KEGG:ns NR:ns ## COG: lin0538 COG2848 # Protein_GI_number: 16799613 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 6 454 5 451 451 570 68.0 1e-162 MINLFEVAETNKMIEQENLDVRTITLGISLLDCMDHDIDVLCENIYKKITTVAKNLVSTG KQIEMKYGIPIVNKRISVTPIGFVGSNACKSTADFVKIAKTLDKCAHETGVNFIGGYTAI VSKGMTKAEELLIRSIPEAMKVTSNVCSSVNVGSTKTGINMDAVKLMGEIIHETADLTKE NDSIGCAKLVVLCNAPDDNPFMAGAFHGVSEADAIINVGVSGPGVVKNAIANAKGKDFGE LCEIIKKTAFKITRVGQLVALEASEKLGIPFGIIDLSLAPTPAIGDSIADCLEGMGLDSV GAPGTTAALAILNDQVKKGGVMASSYVGGLSGAFIPVSEDQGMIDAVNRGALTLEKLEAM TCVCSVGLDMIAIPGDTPVTTLSGIIADEMAIGMINQKTTAVRIIPVIGKNVGEVVEFGG LLGYAPVMPVNQFGCAEFINRGGRIPAPIHSFKN >gi|223714051|gb|ACDT01000164.1| GENE 16 11522 - 12019 425 165 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3420 NR:ns ## KEGG: Ccel_3420 # Name: not_defined # Def: protein of unknown function DUF815 # Organism: C.cellulolyticum # Pathway: not_defined # 41 164 51 179 429 66 33.0 3e-10 MNDLIVFNPLYKNKDIAECFALYTEVVDGKLESSSKFVSLLIKLAEKYNLYDNLFNSLLT YLLINNENSFTLALERKKTIAANLKEIVMNDFKIIFKMFNTDFSVLPLAQQNLITGFTPS KPVINIELFEVSKTLQDNLTACNDVESFYNVLENFFLTYGVGNLD >gi|223714051|gb|ACDT01000164.1| GENE 17 12091 - 12747 627 218 aa, chain + ## HITS:1 COG:CAC3262 KEGG:ns NR:ns ## COG: CAC3262 COG2607 # Protein_GI_number: 15896507 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Clostridium acetobutylicum # 1 213 207 420 426 229 53.0 2e-60 MIGYESQKEKLTQNTEAFLAGKKANNVLLYGDSGTGKSSSIKALLNEYYKDGLRMIEVYK HQFINLPSIIQELQSRNYKFVLFMDDLSFEEFEIEYKYLKAVIEGGLEKKPDNILIYATS 
NRRHLVKQTWGDRQDQDEVNVNDAKQEKTSLSSRFGVKILFMHPDRQNYLDIVDGLAEQY GLMMERNELHQKALTWEIRHGGFSGRTAKQFINAMLGK >gi|223714051|gb|ACDT01000164.1| GENE 18 12763 - 14490 1352 575 aa, chain + ## HITS:1 COG:HP0292 KEGG:ns NR:ns ## COG: HP0292 COG4866 # Protein_GI_number: 15644920 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori 26695 # 27 293 28 290 290 119 32.0 2e-26 MRKLLLNDYHKIKKYLDDANYEGYNSNFVTMMMWDHEYEIYYEIKDNFMVMLHTYLDEKF FAMPFCKPEYYYEAIEYMQNYAKEHNFLFRIDLAVEESMSLIKERYGDKFLYLHNEDFDD YVYTKKSLESLAGKKMQKRRNHFNAFVKENPNYIYKEIEDEDVDNVLQCLKKWDFSHRIE ESVISEYIGIVYLLVHRHELEIKTGCIYINGQLEAFIIGSPLKHSTVQIHVEKANKDIRG LYVAIGKFFLERNFEGYEFVNREEDMGLESLRRAKQMLHPVKMVKKYNIVPNNLSIMPAA DNDLHSIIDLWRDSFEDEDKQTTNFYFMKCYSKENTYLLKNNNFLVSMMQIVPYTIVLDN QEIKAYFILGVATNKNYQKQGMMKKLMNYVLNLPKYAGQKILLQAYMPEIYYNFGFNKDY YHKITNVDKKSYQNDYDLTVIEDLDFVNSLKLYNDFTKDFNGYRKRDIDYYQNYLIARCR AYNDRIKLFYDNAKIIGYAIYHESASKIEVSEIIYTNYESLNKIVCFLSKTNKELIIESD LNNEIIGDCTYICTMLTNFLKNSCQDNKFYINECL >gi|223714051|gb|ACDT01000164.1| GENE 19 14561 - 15985 1607 474 aa, chain + ## HITS:1 COG:BH0630 KEGG:ns NR:ns ## COG: BH0630 COG0034 # Protein_GI_number: 15613193 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Bacillus halodurans # 1 460 1 460 473 574 60.0 1e-163 MFSKDRELHEECGVFAVCGYENAAAMCYYGLHSLQHRGQEAAGIVVKNKEALSIQKGEGL VTEVFDQDKISKMKGDHAIGHVRYSTAGGGGIANVQPLLFRTLNGSLGIAHNGNIVNANV LKADLEKKGSIFSSSSDTEILGHLIMREEGHMIDRICNSLNKLDGAFAFLILIENALYVA RDKYGLRPVSIGQLPNGAYVFSSETCAFEVVGAKFVRDLEPGEIVRVKDGQIKSKLYTND VTDKICAMEYIYFSRPDSNLDGINVHTTRKLAGKQLFKEAPVEADLVIGVPDSSISAAIG YAEAANIPYEMGLIKNKYVGRTFIQPTQEMREQGVRMKLSAVSSIVKGKRVVMIDDSIVR GTTSKRIVRLLKEAGAREVHVRIASPAIKYPCFYGVDMSTMDELISNRLNVNELCNYIEA DSLAFITEEGIDKSIHFNKEKHKCSLCLACFNGEYVTNLYDSFEMDKIKKGKEY >gi|223714051|gb|ACDT01000164.1| GENE 20 16151 - 16417 170 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732943|ref|ZP_04563424.1| ## NR: 
gi|237732943|ref|ZP_04563424.1| predicted protein [Mollicutes bacterium D7] # 1 88 1 88 88 107 100.0 2e-22 MEDNTKRLIVMSILAYAIGTFIFAAGLMTKTAVSIILFYIIASILIICGILALYNNYKKN HQIKLYLYLIVVGIVFLFLNTAALINNL >gi|223714051|gb|ACDT01000164.1| GENE 21 16426 - 17607 1304 393 aa, chain - ## HITS:1 COG:BS_drm KEGG:ns NR:ns ## COG: BS_drm COG1015 # Protein_GI_number: 16079407 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Bacillus subtilis # 2 393 5 394 394 457 57.0 1e-128 MKYNRVFLIVCDSLGIGNAKDANLYNDEGSNTLKHICNACNGLDLPNLEGLGLGSLGNFS GIHQLTSQLGYTLALNEISNGKDTMTGHWEMMGLKTEKPFITFTETGFPQEFIELFEEKT GRKCVGNKAASGTAILDEYGEHQIKTGDWIVYTSADSVFQIAANEEIIPLEELYKACEIA REIAMDERWKVGRVIARPYVGKRVGEFKRTANRHDLALAPFGRTVLDNLKDKQFDVIGVG KIPDIFVDQGITKAVKTVSNQDGMEKTVELAKSEFKGLCFVNLVDFDAVYGHRRNPEGYG QAIVDFDKQLEELMKYLNHDDLLMITADHGNDPTYSGTDHTREQVPLIIYSKELLKPRHL HDMESFAVIGATIADNFGIELPTIGQSILELIK >gi|223714051|gb|ACDT01000164.1| GENE 22 17686 - 18594 619 302 aa, chain - ## HITS:1 COG:BH1529 KEGG:ns NR:ns ## COG: BH1529 COG4974 # Protein_GI_number: 15614092 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus halodurans # 3 295 4 298 299 244 43.0 2e-64 MLIKDALSEYKQYLIVEKGLSKNTIYSYLRDLIAFSNFIGEEYEINQIENINKEHIHLYL KELSKTNCTNSISRKLVSLRMLYIFLVKENIVKENLMSSFTLPKRDKKLPIILSQEEMIE ILDGIIVCDAISSRNRCMVELLYATGMRISELLNLTLKDLNIKMGFIKVIGKGNKERMIP IGSYVGEILEQYINDYRAEFNIKNDSLLFFNKHGQRLSREEFYSILQTIVNSTSITKKVS PHTFRHTFATHLLENGADLRSIQELLGHSDISTTTIYTHISNQKIRSEYQQFHPRIKKHN DG >gi|223714051|gb|ACDT01000164.1| GENE 23 18684 - 18905 94 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732946|ref|ZP_04563427.1| ## NR: gi|237732946|ref|ZP_04563427.1| predicted protein [Mollicutes bacterium D7] # 1 73 19 91 91 64 100.0 2e-09 MLFLGFFPQLIIELALTYIISIISIRLSINCFTISFLTTEVFNSRKIINYILDYVIIILI IVTLSMAFKVYAL Prediction of potential genes in microbial genomes Time: Thu May 26 10:55:52 2011 Seq name: gi|223714050|gb|ACDT01000165.1| 
Coprobacillus sp. D7 cont1.165, whole genome shotgun sequence Length of sequence - 1856 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 134 - 193 11.7 1 1 Op 1 . + CDS 229 - 990 600 ## COG0631 Serine/threonine protein phosphatase + Prom 1036 - 1095 4.0 2 1 Op 2 . + CDS 1176 - 1499 418 ## gi|237732864|ref|ZP_04563345.1| predicted protein + Term 1551 - 1585 4.5 Predicted protein(s) >gi|223714050|gb|ACDT01000165.1| GENE 1 229 - 990 600 253 aa, chain + ## HITS:1 COG:mlr2361 KEGG:ns NR:ns ## COG: mlr2361 COG0631 # Protein_GI_number: 13472160 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Mesorhizobium loti # 16 233 42 244 280 59 22.0 6e-09 MFNIAAYGRSIANGLENNQDSVLIKEINDFVILVVADGNGADSAGMINTGLLANNLMIDY LTKIIHQMSSINEIANQLNLGMYTISKCFLSINSIDEKFSNVYASQSLVIINKNTLDMCF ASIGNTEIHLFRSGELNRMNILQSKAYEMLINKSLLLSDFYNCPERGILTSAYGVFESIN VDLRIGKFNPNDIIILTTDGIFTSLNPNDLIKLLGQGESPTPQSGVENILSHISNKNNKL DNAGMICAYIEEI >gi|223714050|gb|ACDT01000165.1| GENE 2 1176 - 1499 418 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732864|ref|ZP_04563345.1| ## NR: gi|237732864|ref|ZP_04563345.1| predicted protein [Mollicutes bacterium D7] # 1 107 1 107 107 190 100.0 3e-47 MKGADNIVEMKIGSGAKNGEKIRGYAPRCGKSLCELLKQNGEINIKSMGGAATNNAVKAI AHAQNLLEEENLGLIADSFVFKTEHMIREDGAEVDGVLLQIHTKLVA Prediction of potential genes in microbial genomes Time: Thu May 26 10:55:58 2011 Seq name: gi|223714049|gb|ACDT01000166.1| Coprobacillus sp. D7 cont1.166, whole genome shotgun sequence Length of sequence - 1830 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 810 383 ## gi|237732866|ref|ZP_04563347.1| predicted protein + Prom 933 - 992 6.7 2 2 Op 1 . 
+ CDS 1025 - 1231 220 ## gi|237732868|ref|ZP_04563349.1| predicted protein 3 2 Op 2 . + CDS 1300 - 1569 155 ## gi|237732869|ref|ZP_04563350.1| predicted protein 4 2 Op 3 . + CDS 1598 - 1829 174 ## Predicted protein(s) >gi|223714049|gb|ACDT01000166.1| GENE 1 1 - 810 383 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732866|ref|ZP_04563347.1| ## NR: gi|237732866|ref|ZP_04563347.1| predicted protein [Mollicutes bacterium D7] # 24 269 1 246 246 380 100.0 1e-104 NYTKTIYKKLRRANNIKVKEIIEMLANYGFSVSPRTYNRYEDGTRKFPKEIYNLVLDYLQ QDSLSADDVSKFIKNEKKTTNIEICNMKNIKRDHINLNLKHDKLQIDTQRLEFDSINDKN DTNYNHDTQHTLIEESNTVEKLSDTNIINTKRQTNNNYTDTKINSLIKDNEKMNRIINSI LLSIENMHNKTNRQMIQTCLKRLSSFKTYTIKKVTYDYTSITNMLSILLTYRDEELIEYF DLLITRIKNSGYKFKSPDNIINCFLYFYA >gi|223714049|gb|ACDT01000166.1| GENE 2 1025 - 1231 220 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732868|ref|ZP_04563349.1| ## NR: gi|237732868|ref|ZP_04563349.1| predicted protein [Mollicutes bacterium D7] # 1 68 1 68 68 103 100.0 5e-21 MKFHGLPFVSTHNVVYVVIILGAIGFYFLSKGISQLGKESQVEVGRTNVIYSVILLLMDA AIFYFYYL >gi|223714049|gb|ACDT01000166.1| GENE 3 1300 - 1569 155 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732869|ref|ZP_04563350.1| ## NR: gi|237732869|ref|ZP_04563350.1| predicted protein [Mollicutes bacterium D7] # 1 89 6 94 94 145 100.0 9e-34 MKLNLKPSKETKIVANEQIQNKEVTEKKGFTKLYKYTTNKNIIEFSDKGFFLENRNSNGD LNGLNCKVSIQFSEFDKQFHQTKYCPFFY >gi|223714049|gb|ACDT01000166.1| GENE 4 1598 - 1829 174 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFKSGAFFKKLEESQKTNKPAFIFQGGGVYEGKIYARTLQISASKMKDSCCLILWYGEGT KSKTGGYVFKENASRSN Prediction of potential genes in microbial genomes Time: Thu May 26 10:56:26 2011 Seq name: gi|223714048|gb|ACDT01000167.1| Coprobacillus sp. D7 cont1.167, whole genome shotgun sequence Length of sequence - 11493 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 9, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 42 - 1487 1116 ## gi|237732870|ref|ZP_04563351.1| predicted protein 2 2 Tu 1 . - CDS 1797 - 2228 351 ## gi|237732871|ref|ZP_04563352.1| predicted protein - Prom 2282 - 2341 9.5 + Prom 2296 - 2355 3.0 3 3 Tu 1 . + CDS 2444 - 2641 207 ## gi|237732872|ref|ZP_04563353.1| predicted protein - Term 2616 - 2647 -0.3 4 4 Tu 1 . - CDS 2697 - 3119 322 ## HMPREF0424_0748 hypothetical protein - Prom 3144 - 3203 4.8 5 5 Tu 1 . - CDS 3244 - 3486 213 ## stu0710 hypothetical protein - Term 3761 - 3808 6.1 6 6 Tu 1 . - CDS 3811 - 4524 702 ## gi|237732874|ref|ZP_04563355.1| predicted protein - Prom 4559 - 4618 11.2 + Prom 4974 - 5033 12.7 7 7 Tu 1 . + CDS 5070 - 6755 1630 ## COG5271 AAA ATPase containing von Willebrand factor type A (vWA) domain + Term 6757 - 6789 -0.1 + Prom 6761 - 6820 8.7 8 8 Op 1 . + CDS 7009 - 8088 604 ## gi|237732876|ref|ZP_04563357.1| predicted protein 9 8 Op 2 . + CDS 8085 - 9221 430 ## COG4547 Cobalamin biosynthesis protein CobT (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole phosphoribosyltransferase) + Term 9232 - 9277 8.3 + Prom 9277 - 9336 11.3 10 9 Op 1 . + CDS 9369 - 10352 591 ## LM5578_p49 hypothetical protein 11 9 Op 2 . + CDS 10355 - 11167 611 ## BAA_B0112 hypothetical protein 12 9 Op 3 . 
+ CDS 11189 - 11492 299 ## COG0587 DNA polymerase III, alpha subunit Predicted protein(s) >gi|223714048|gb|ACDT01000167.1| GENE 1 42 - 1487 1116 481 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732870|ref|ZP_04563351.1| ## NR: gi|237732870|ref|ZP_04563351.1| predicted protein [Mollicutes bacterium D7] # 1 481 1 481 481 716 100.0 0 MDINKIKKFLINNKVKVAITCLVIVFVITSITYYFYSSNTVSLLKLKKMDFIFELGEKIN KSPEFYLDTSKLKEDEKEKLLNKTTVKLPQKYDKLGKYEVLLSYKNDSVKAKVEIVDTIP PVFETVDNMTVPLGTQYTDEQLKEMFKVSDADEYELKIDNGGYNANAEGVYTITAEAKDK TGNSTKYSFTITVTAAVTEIQYTENTTQAESTVTETKKTTNSDKTNSESTSSNSSHNNST SGSNNNNNPANGGNNNNSNTDNSGIVYVKSVSISGNTTGNVSNNLNFNATVNPNNASNIG YGSYKWTTNSGNLKIVAGQGLANVIVTSDVAGTYTLTCTVNGVSNSINVTFKTVEVSQDV DFNKVKVKIKTPSGGEIVKSVKELNFQQNGKEATASITVGNNYGLVWNNPGDNAIIAAGL WFTADESMWYVQTPGPESYHTTFSGSHTVRYKSYNGYSLSITYNVGSVNVTHEEMIKILY G >gi|223714048|gb|ACDT01000167.1| GENE 2 1797 - 2228 351 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732871|ref|ZP_04563352.1| ## NR: gi|237732871|ref|ZP_04563352.1| predicted protein [Mollicutes bacterium D7] # 1 143 1 143 143 258 100.0 6e-68 MFIAELFIRGDSPWENFGKFYLSQEFIELTPEETSKIQSYDWYDTLSYLEKTYEKLEFFK ILRECNIHDNGDYLFKEGDVYDNFELNIEESNKNILNCKEFIEQGIFMSQNGKVLAYVSI YEIERNTKIPSNKFLYICDNLRL >gi|223714048|gb|ACDT01000167.1| GENE 3 2444 - 2641 207 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732872|ref|ZP_04563353.1| ## NR: gi|237732872|ref|ZP_04563353.1| predicted protein [Mollicutes bacterium D7] # 1 65 1 65 65 96 100.0 5e-19 MNNNEKLLKMSGIYNQIKNLKGVRTNLECSNIPGQHHNVAKNIASIERQMESLEKQYDKL LMSLK >gi|223714048|gb|ACDT01000167.1| GENE 4 2697 - 3119 322 140 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0424_0748 NR:ns ## KEGG: HMPREF0424_0748 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis # Pathway: not_defined # 2 140 115 252 252 157 54.0 7e-38 MGAMMAKGVNETNPDTETIQNIRIMVERSIKFWNEFGPIEQDGFTFEPNGYTETVNTGDG DYLTADTIWDFKVSKSKLTNKHTLQLLMYWIMGQHSGQEIYKNITKLGIFNPRLNSIYTL NIDNIAFEIIEEIEKHVICY >gi|223714048|gb|ACDT01000167.1| GENE 5 3244 - 3486 213 
80 aa, chain - ## HITS:1 COG:no KEGG:stu0710 NR:ns ## KEGG: stu0710 # Name: not_defined # Def: hypothetical protein # Organism: S.thermophilus_LMG # Pathway: not_defined # 1 73 1 73 250 96 61.0 3e-19 MSSVTQRIKKIKQPRGGYIKPSQFKLQKIDDGKILNEQENIHASVIGMAVDYLTRFIMGT DIIEAFKISCMGAKVAEEIF >gi|223714048|gb|ACDT01000167.1| GENE 6 3811 - 4524 702 237 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732874|ref|ZP_04563355.1| ## NR: gi|237732874|ref|ZP_04563355.1| predicted protein [Mollicutes bacterium D7] # 1 237 1 237 237 399 100.0 1e-110 MYIDLTNENIELYSNKHKQVLYNNLKISNEEFKKQFNIDYSLFMTKDELKLHEEEEENSI LQTVEDAFNISKCKKHKFVLPNYDFVLEKFPIFKAITNENRESLCIIELNNEQVIDIHKF VGTECSVGISIQDIFKYLIKKGINKYAIVHNHPNSIVANFTEDDIKNNYILGFYSNTFGL EFIDSFVITPFDCESQYQSDIENKENNLNPKQFKDLSKYKQINPKLYPILLHGLKTK >gi|223714048|gb|ACDT01000167.1| GENE 7 5070 - 6755 1630 561 aa, chain + ## HITS:1 COG:SPCC737.08 KEGG:ns NR:ns ## COG: SPCC737.08 COG5271 # Protein_GI_number: 19075870 # Func_class: R General function prediction only # Function: AAA ATPase containing von Willebrand factor type A (vWA) domain # Organism: Schizosaccharomyces pombe # 290 460 1552 1727 4717 65 31.0 3e-10 MKYEFEKMIMGKGRKNLRIPDTAFSKGEISTQFYSRKNNSTTLKKSVLSDKIVNAIVSAQ KLDFNIFFGQKDGLDVFIGAFDNASVPGSNYIIMFSDRKKITKIAATNINSHYLYTMLLA YNLYYKDKEAIKYYSEIDTTTGMIDSDQQKQLFILNDNIYWTAKELDGINVLDSNTVKSA TNDIIEIINEVVPHLTSNITSLLNENMVVITSSLKVNIDRDIPLYEVPVVVSKRKNSSEY KKEVLNKNYFIYKDEKELAGLPEELVIRYKSGQEAFEKYASTMDDDIFPLVKALKKMNVI CLTGEAGGGKTTIASAIAGCLGLPLITINGNNDTDYASMILDYGAKNGTTFAIEKPALLA YIHGGVVVIDEITRIKGEMSTALNAMLDTRQYYDAPNGIQYKKNPNFKVICTMNQGAGYS TEELDTSLIDRFNIVKYIQDPKKELAIEIISNSTGYTNKDIINKMWDVKTLIDEKAREEE AISRTSLRGVIDWINQSFITGEFVESAINTIIGKLILKDSSIYSQNIDYLMNDCDNELVA YAVSEIKDMFEDEDCLDTDFD >gi|223714048|gb|ACDT01000167.1| GENE 8 7009 - 8088 604 359 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732876|ref|ZP_04563357.1| ## NR: gi|237732876|ref|ZP_04563357.1| predicted protein [Mollicutes bacterium D7] # 1 359 1 359 359 621 100.0 1e-176 
MDIKNFVSNIKNENRIPRIIRRIAGKNISLKYDPNTSSNYCTEIDGNYFINLGLLLAMDE ETIKSVISSQNFPDITYDDTLIRIIGAASHEASHVRFTDFNVLVKHSQEVMKAKNNVKEI AGEWLKTKDDSLFDEIKQETYNYLYGMFLTQMFNSLEDASIEDSCLRNYGRIPFVSSGIN MLRNLLTDNEDEYILKNFNEKEKYTKSYIETVITEIRHMAVAGYRNYPCRYTGLANYTAD EFEEMNYLALYARFNAKNSSEVFSASKVAMKFIENEIKNVANEYANKYINELKNETNGDT ANDITNDFSSEENDLAIAKKMANASTGTPPQKNQNLHLKFLIIFKNKLIMPVKKQITIL >gi|223714048|gb|ACDT01000167.1| GENE 9 8085 - 9221 430 378 aa, chain + ## HITS:1 COG:YPMT1.87 KEGG:ns NR:ns ## COG: YPMT1.87 COG4547 # Protein_GI_number: 16082880 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobT (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole phosphoribosyltransferase) # Organism: Yersinia pestis # 140 364 537 774 788 76 27.0 1e-13 MNSAGGKKTNNNKKQEKAFDNSQDSNSEEENEASNDNKQEKTSLEDYKKNIAPILDKQAK SELSKSYKELNKKKLSAQQKNLALSINDKGHTPDLNETDKKCLSDMHKGVKIRYFSPDEI YATMTNTEHLYEDITPYKKLARKFSKDMKRLLFRDSRIKKIPNLNRGKIDKRNLHRIIID DNCFYNKIDGKAHKARFCILVDQSGSMGGSKSKNAYYASVMLAEACKITSIPLAVYGHNN TSNVLLYHYLDYKKHSKKYYDNLVNIVKNGGDNHDSIPIFYCLKDLVKNRKHDEKLIFIV ISDGAPAGNNGYQGQCAFDDIRNIYKTFENFYGVETIGVGIGSDIGHIPQIYENHCLVPN VEELPIELLKIFKRTLQK >gi|223714048|gb|ACDT01000167.1| GENE 10 9369 - 10352 591 327 aa, chain + ## HITS:1 COG:no KEGG:LM5578_p49 NR:ns ## KEGG: LM5578_p49 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 2 315 16 351 377 73 24.0 1e-11 MNLEEVLKKLSATLLGDYYTLICPKCGKKEAFLYLDDLYKYKQDSTHKVAIRCNRLNKCG EITYLNDLLDFTEKKELNLPTPKSSIQINNRGIELIERFINYSLFSINCNYKDFDFDIRG ISNETLKNNGIFYYSKKFESIVDSNLGKKCFSDKYRIKEYKNRDIFIPILNYENKVERIL LRSKEQGEQKKKEIQLKLKDRSIEIWNIKALINKEKYVFITEGVYDALSIKDVDPNVDAL SLPGVRKYKKLVKEIDKNIEKCKNKIFIIATDNDKAGHEYAKKLEMELKNRNLRVSILDL RQYKDVNDFLQKNRLSLILSIKKAKGK >gi|223714048|gb|ACDT01000167.1| GENE 11 10355 - 11167 611 270 aa, chain + ## HITS:1 COG:no KEGG:BAA_B0112 NR:ns ## KEGG: BAA_B0112 # Name: not_defined # Def: hypothetical protein # Organism: B.anthracis_A0248 # Pathway: not_defined # 7 
169 18 183 591 73 31.0 5e-12 MDYNLYLTEKAKKTIKQLDECLYEVYQNPNKYINWVKQGLKFKKYSANNQSLIYYRFPNA KYVANYKKWQELGFQVNKNEKSISLLRPNNIKGFYDKDGKFITLKKATLEQQKEIKKGNI KTTYKMCGFGVFSVFDISQTNATESDLSKLMQEQINIVVDETKLLCKLANIFEYDVSDDL NYNFIHLKEKTYEYINQNINYTPYEKALICNAMVYSLFSTVNKYTKEIEMLVINDITTLD SLVNMSNIELGKIIKDRIKVFLDEVIPYLD >gi|223714048|gb|ACDT01000167.1| GENE 12 11189 - 11492 299 101 aa, chain + ## HITS:1 COG:SPy1284 KEGG:ns NR:ns ## COG: SPy1284 COG0587 # Protein_GI_number: 15675237 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Streptococcus pyogenes M1 GAS # 1 101 1 101 1036 74 35.0 4e-14 MFTHLNIKTEYSMLNSIAKIDGLIEKAKRNNMKALAITDINNMHIVYNFEEKCKKAGLKP ILGVTLVVKYDNNYKLLGLLAKNEQGYKKITKIVTETNVGE Prediction of potential genes in microbial genomes Time: Thu May 26 10:57:36 2011 Seq name: gi|223714047|gb|ACDT01000168.1| Coprobacillus sp. D7 cont1.168, whole genome shotgun sequence Length of sequence - 6392 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 3105 2166 ## COG0587 DNA polymerase III, alpha subunit + Term 3110 - 3139 1.4 + Prom 3695 - 3754 7.2 2 2 Op 1 . + CDS 3830 - 5311 456 ## Acfer_1192 hypothetical protein 3 2 Op 2 . + CDS 5332 - 5658 283 ## COG1232 Protoporphyrinogen oxidase 4 2 Op 3 . 
+ CDS 5742 - 6390 407 ## COG1232 Protoporphyrinogen oxidase Predicted protein(s) >gi|223714047|gb|ACDT01000168.1| GENE 1 1 - 3105 2166 1034 aa, chain + ## HITS:1 COG:TP0669 KEGG:ns NR:ns ## COG: TP0669 COG0587 # Protein_GI_number: 15639656 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Treponema pallidum # 1 933 146 1029 1170 553 35.0 1e-157 LNSLVSSGFFYDADKLLDEYTNIFGKNNVIIEISLHGIQSEKDFINSNWLQSKIFNEGYM YVATNDIYYLNKEHALHRSIGLSMNPNPDGIDEYSNYVDYNSEFYFKTEKEMEKVFSAYL HKYPDLLTNSEKIVKQCNAQVPIERALPEFPLPNGFSSEQYLYKLVWEGFKEKFPNESAI DNRFTLEDYKERLEHEFDTIVKMGFVDYFLIVQDFINWSKDIEVYKHPEVYFPHIELSKV SKMCIEKNFEIKVGPGRGSAAGSLLAYCLKITNLDPMKYDLLFERFLNPERVSMPDVDID FSNRDREKVVEYVQNKYGYSHVAQIVTFQKLKLKSIIKKVAKSYGIPYSKTDEMTKMIPG KICVSTEKEDGTIEIVEKEPELLCEIENIDYFSNLINKDDDIKLVFKMGKVLEGLTSTTG KHAAGVIIGRQPLQSYLPLMEVDGVLVTQFEKKASESIGLLKMDFLGLQTLDIIQETEEL VKENYNQIIHINDISIEDEKTFEIFQKGNTGRVFQFEGAGMKNLLKQMHPTSLEHLNAAC ALYRPGPMDYIPNYIEGRKFPNKIDYPHELFKEVAEETFGILVYQEQIMQLVQKMAGFSL GEADILRRGISKKEKDYLDKARKQFVDGCKKINVAEETSSKIYDTIELFAEYGFNKSHSC AYSYVSYQTGWLKAHFPEAFMAANLTVAAQNKDKVAAILSECKKMQIEVLPPDINKSEDK FTLENKNGKMCIRFGFNAIQDVGEQFAKQCSRKNKKSLTSFQEFLNELNISQLRKDTLTF MINSGAFDNYGTRLALNKELENLLEYEKAKKRINLVGYKFVLSLPTNNYFDGYEYPLDKL VSNEKEAIHFSLSGHILDGYRNLYQEKTAIADIKIDQSYYDSYVSVLGIIKNIKQITTKK GSQMAFLSLEDETGELEGVIFPKDFEKIKNNYLKFVNLPVIFDGKLQYENKDGEDKFSLI ISSIKIIDSPTKVYIHNNSNINKEFVEKLSSNNGISEVIVVDTQRTKIRILPFKINLEKS KVILKEYNIDYIIV >gi|223714047|gb|ACDT01000168.1| GENE 2 3830 - 5311 456 493 aa, chain + ## HITS:1 COG:no KEGG:Acfer_1192 NR:ns ## KEGG: Acfer_1192 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 18 321 10 313 561 112 29.0 3e-23 MKDKQENYVETIKKNIKFFYIHIIATAIALCSPNNDLWFLISSGRVVQKFGFIKKEMLTF HTNFDIIVQQWPVALLFSFLYDTFGMLGVIIVVSFVACIIAFLIFKIINLICQNKQLSIV YTAILFVFFSLFITTRPYVFSTVFLLLFIYGLEAYIATNNRKYLYLLPLASFVLINFHAS IWFMLFVLFLPYLVDMIQIPFIKTRYYIHDNYNKKPVFIAVLLMFFMGIINPYGTKAMFY 
VFKSYGIPEISNYVGEMKALFYSPYAWNFIIISFVLVVFVFFTRAITKTKNFKLRYIYMV LGTFIMLLLNRRNLIFFIIGIVCELGYCLKSINSVEVSSRNCKKNYLLNFIIYIFIFVHG VMGGISYHSLLNTTPEYEKIFNYIKQYQESNIVLYTDYNTGGHAEFYGFHPFIDPRAEVF LKANNNCEDVYIEYINLISNKSSYEDFMNKYRFTHLIVENNNDKYMKKHLIQDERYEAVV EDDGYILYQIKKY >gi|223714047|gb|ACDT01000168.1| GENE 3 5332 - 5658 283 108 aa, chain + ## HITS:1 COG:CC1101 KEGG:ns NR:ns ## COG: CC1101 COG1232 # Protein_GI_number: 16125353 # Func_class: H Coenzyme transport and metabolism # Function: Protoporphyrinogen oxidase # Organism: Caulobacter vibrioides # 5 77 11 82 504 89 53.0 2e-18 MMTKKIVIIGAGPAGLSAAYKLLEAKKDYKVVIVEEDKQIGGISKTINYKGNRMDLGGHR FFSKNQEVNDFWNKILPIQGAPASDEILINRIKIIKKAQILKMKTRSR >gi|223714047|gb|ACDT01000168.1| GENE 4 5742 - 6390 407 216 aa, chain + ## HITS:1 COG:CC1101 KEGG:ns NR:ns ## COG: CC1101 COG1232 # Protein_GI_number: 16125353 # Func_class: H Coenzyme transport and metabolism # Function: Protoporphyrinogen oxidase # Organism: Caulobacter vibrioides # 32 216 146 327 504 117 36.0 2e-26 MGFKTTFNVGIDYLKTCVSKKTENNLENFYINRFGEKLYSMFFEDYTEKLWGRHPTQIDA EWGSQRVKGISVFALLKNMLNINKEKETSLIEEFMYPKYGPGQMWEEVAQKITDMGGEIY LNCKVEKIIKENNRISRLECKYLGQTVELKGDIFISSMPLKDLIVDMSNIPLDVFQIAEQ LPYRDFVTMGILVDKLKIQNKTKIKTVNNIIPDCWI Prediction of potential genes in microbial genomes Time: Thu May 26 10:57:43 2011 Seq name: gi|223714046|gb|ACDT01000169.1| Coprobacillus sp. D7 cont1.169, whole genome shotgun sequence Length of sequence - 1305 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 32 - 406 246 ## Fisuc_0828 GtrA family protein + Prom 412 - 471 11.6 2 2 Op 1 4/0.000 + CDS 503 - 859 440 ## COG1192 ATPases involved in chromosome partitioning 3 2 Op 2 . 
+ CDS 971 - 1304 240 ## COG1192 ATPases involved in chromosome partitioning Predicted protein(s) >gi|223714046|gb|ACDT01000169.1| GENE 1 32 - 406 246 124 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_0828 NR:ns ## KEGG: Fisuc_0828 # Name: not_defined # Def: GtrA family protein # Organism: F.succinogenes # Pathway: not_defined # 6 123 8 130 134 69 34.0 4e-11 MKQNNIEKFIKYCLVGGCSTVIDWGTYALLYLCLGINYLLATTCGFIIGLITNYVLSKEF VFTQASKYKHEFIIYGIIGLIGLVITAILMIIFVDLITLNAFLARAITTILVLFWNYLAR KMLY >gi|223714046|gb|ACDT01000169.1| GENE 2 503 - 859 440 118 aa, chain + ## HITS:1 COG:Cgl3035 KEGG:ns NR:ns ## COG: Cgl3035 COG1192 # Protein_GI_number: 19554285 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Corynebacterium glutamicum # 6 64 28 86 307 64 50.0 4e-11 MEENKTVELEKRPRVISVFSSKGGVGKTTTSTNIAYNLSQFGYKCLVIDFDPQNNASIAL NVDYGQLGEIDDPIDGTPNIGSILFPYLVKGRTNTFSASHIAKVIQRPTYVVKEMTKR >gi|223714046|gb|ACDT01000169.1| GENE 3 971 - 1304 240 111 aa, chain + ## HITS:1 COG:XF2282 KEGG:ns NR:ns ## COG: XF2282 COG1192 # Protein_GI_number: 15838873 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Xylella fastidiosa 9a5c # 27 107 115 196 264 70 37.0 8e-13 MDTAFINQHLPNSRALLASVVDIIKKRFDYDFIIIDCCPSLGILNMNALNASDYLIIPTS MDYFAINGVKQTISRLEDIKKFTPNFKIIGVLKQMFQSKRMVDTAISGILE Prediction of potential genes in microbial genomes Time: Thu May 26 10:57:47 2011 Seq name: gi|223714045|gb|ACDT01000170.1| Coprobacillus sp. D7 cont1.170, whole genome shotgun sequence Length of sequence - 2650 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 89 - 1336 870 ## gi|237732886|ref|ZP_04563367.1| conserved hypothetical protein + Term 1377 - 1444 4.1 + Prom 1558 - 1617 11.4 2 2 Tu 1 . 
+ CDS 1657 - 2115 321 ## gi|237732887|ref|ZP_04563368.1| predicted protein + Term 2141 - 2174 -0.3 Predicted protein(s) >gi|223714045|gb|ACDT01000170.1| GENE 1 89 - 1336 870 415 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732886|ref|ZP_04563367.1| ## NR: gi|237732886|ref|ZP_04563367.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 415 1 415 415 674 100.0 0 MALKTKSIGAVQKKTSVTDNLFNGTSGGLINNDDKTASELDVLEQELIDLSDIREMKNNK FPLVSIESLAESIDRVRLGQPIILRPLKKSFNLHDGDDVTEISNVSYEIVAGHRRFAAYK YLLSKYKLLNNEEKIRRYSSIPSQLLPEGATEDEIDSLYKATNFETRNISVTDVLLHIDY YLDQIQANDIQEKLINPSFRGTNKGEFISKKFKEININLSPAQVKRYVMVYEKSTEKLLT LFQNGEISMTSISKIIKKYNIKSDVLSGKQDEIADEINNVLNDSSLIPTEREKKKLNIID KYLAVNSDDKEVLNAIKEDKKLHNEETEQEKTKIDYTSLIKKMQSTAKNIKRISEVESFY DTITEITSDGLEMTKINENIKLNTAQKENIKIVFEYLKSEYENLEKWVSMIQNED >gi|223714045|gb|ACDT01000170.1| GENE 2 1657 - 2115 321 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732887|ref|ZP_04563368.1| ## NR: gi|237732887|ref|ZP_04563368.1| predicted protein [Mollicutes bacterium D7] # 1 152 1 152 152 233 100.0 2e-60 MFYSEKKLKDRIVSDCDKIDGYQTRLVKYIYDNPDDEIDFYLNNKQNKTNTFLFALIIFT SVAFGTYGYFSAETRYGVYALPSQFLFLLSFFILVRFSIKYIKYEKEKEFWTKAVKLIKK VQYNNPKEVNHYYRNIDNIFKTKKSNSDNIKN Prediction of potential genes in microbial genomes Time: Thu May 26 10:58:11 2011 Seq name: gi|223714044|gb|ACDT01000171.1| Coprobacillus sp. D7 cont1.171, whole genome shotgun sequence Length of sequence - 7108 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 4, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 232 - 291 8.2 1 1 Tu 1 . + CDS 332 - 613 224 ## gi|237732888|ref|ZP_04563369.1| predicted protein + Term 781 - 818 -0.9 + Prom 692 - 751 8.1 2 2 Tu 1 . + CDS 958 - 1428 357 ## gi|237732890|ref|ZP_04563371.1| predicted protein + Prom 1482 - 1541 11.9 3 3 Op 1 . + CDS 1572 - 1787 275 ## gi|237732891|ref|ZP_04563372.1| predicted protein 4 3 Op 2 . 
+ CDS 1830 - 1991 180 ## gi|237732892|ref|ZP_04563373.1| predicted protein 5 3 Op 3 . + CDS 1982 - 2491 353 ## gi|237732893|ref|ZP_04563374.1| predicted protein 6 3 Op 4 . + CDS 2522 - 2941 262 ## gi|237732894|ref|ZP_04563375.1| predicted protein 7 3 Op 5 . + CDS 2925 - 3752 579 ## gi|237732895|ref|ZP_04563376.1| predicted protein 8 3 Op 6 . + CDS 3776 - 4480 312 ## gi|237732896|ref|ZP_04563377.1| predicted protein 9 3 Op 7 . + CDS 4499 - 5035 450 ## gi|237732897|ref|ZP_04563378.1| predicted protein + Term 5040 - 5069 1.4 + Prom 5142 - 5201 11.9 10 4 Op 1 . + CDS 5239 - 5691 401 ## gi|237732898|ref|ZP_04563379.1| predicted protein 11 4 Op 2 . + CDS 5693 - 6163 561 ## gi|237732899|ref|ZP_04563380.1| predicted protein 12 4 Op 3 . + CDS 6190 - 6432 204 ## gi|237732900|ref|ZP_04563381.1| predicted protein 13 4 Op 4 . + CDS 6445 - 6894 382 ## gi|237732901|ref|ZP_04563382.1| predicted protein 14 4 Op 5 . + CDS 6902 - 7106 225 ## Predicted protein(s) >gi|223714044|gb|ACDT01000171.1| GENE 1 332 - 613 224 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732888|ref|ZP_04563369.1| ## NR: gi|237732888|ref|ZP_04563369.1| predicted protein [Mollicutes bacterium D7] # 1 93 1 93 93 177 100.0 1e-43 MTIPEMYQNTPHAVKELCSKLHQAQITVLKIEDVTALSLDASFNDVEFKWNTKKGVFIWN IKTKKLITPQCNRYIKDIIYNIFYRWTNKYFHA >gi|223714044|gb|ACDT01000171.1| GENE 2 958 - 1428 357 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732890|ref|ZP_04563371.1| ## NR: gi|237732890|ref|ZP_04563371.1| predicted protein [Mollicutes bacterium D7] # 1 156 1 156 156 236 100.0 3e-61 MKKLIAVILPLLFLTGCSKTVETVCTIPEENSSGLTSSLKMTFVSKDGYVKEVEQKETTK LEGMDISNEEFKNASKELKKSYEKYKGLKYNYSYNKKNSEVVETLEIDCSKADSDTLTLI GLSINEQDLKDNEIIKISLKDSIKNMKSYGLECKEK >gi|223714044|gb|ACDT01000171.1| GENE 3 1572 - 1787 275 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732891|ref|ZP_04563372.1| ## NR: gi|237732891|ref|ZP_04563372.1| predicted protein [Mollicutes bacterium D7] # 1 71 1 71 71 90 100.0 2e-17 
MEYVENPMTIGDYPYYQSQEDYEKEMRKQNLEEKIEYLKNEIEIEYDENLRDEYAGFLCF LQEELNYLDSL >gi|223714044|gb|ACDT01000171.1| GENE 4 1830 - 1991 180 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732892|ref|ZP_04563373.1| ## NR: gi|237732892|ref|ZP_04563373.1| predicted protein [Mollicutes bacterium D7] # 1 53 1 53 53 69 100.0 7e-11 MNKIQEQYKINEKLTKLSAKEIEEQVFKKVGGCIIEKELLEAINQQMRELGWL >gi|223714044|gb|ACDT01000171.1| GENE 5 1982 - 2491 353 169 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732893|ref|ZP_04563374.1| ## NR: gi|237732893|ref|ZP_04563374.1| predicted protein [Mollicutes bacterium D7] # 1 169 1 169 169 327 100.0 1e-88 MAMNGKSDVRKMFSKEECEIALENAVDTGIIKKDWNVLNQLIEEHFKLKEKYSKILDDVH DYRFETHCMKMTIRNLCKHFGVKNEKELQNIYLNKPYKPYKFEDLKPNMWVWDNVAKECL YVVRPFITTGVRVKYFTCLGIWNLEKIKKLNMEFEENRFFPVQYAHQFL >gi|223714044|gb|ACDT01000171.1| GENE 6 2522 - 2941 262 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732894|ref|ZP_04563375.1| ## NR: gi|237732894|ref|ZP_04563375.1| predicted protein [Mollicutes bacterium D7] # 1 139 1 139 139 248 100.0 7e-65 MTNLEFYKDEINNLIKDNESPYSVDRISDAFRVFSNSHLQEFKGKSEHFIKWLLRDHKET NKLTQVEIDCLKSWPFSQHDSFQSSNFYTSMKALWYFRGILDVSLTINEIKNNYEVVSED YFTKLENIKKGADAYDKKI >gi|223714044|gb|ACDT01000171.1| GENE 7 2925 - 3752 579 275 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732895|ref|ZP_04563376.1| ## NR: gi|237732895|ref|ZP_04563376.1| predicted protein [Mollicutes bacterium D7] # 1 275 3 277 277 559 100.0 1e-158 MTRKFDFIPFIPNYEWIREQMKADLEYRLKNRLYKTSLGRPLYERINIQLITTQECPYNC PFCLERQNPMEGNQDFKKQIESLKGVLSEHPNARLTITGGEPGLYTDHVVALVDTFKQHS KNIFISVNTSGYDPAIAYIPGVNINLSINDYVKPEILLFPHSTYQTVLPDSKINLVYIKN MMDTEECDKFSFRFISSLKKHNYDTTVWNELQKDQEIQIKTFRIGDFFVYATFDYHGKHA RVTLGDMYQQINNNYQDGYSNIIIHPDGKVKTNWK >gi|223714044|gb|ACDT01000171.1| GENE 8 3776 - 4480 312 234 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732896|ref|ZP_04563377.1| ## NR: gi|237732896|ref|ZP_04563377.1| predicted protein [Mollicutes bacterium D7] # 1 234 12 245 245 469 100.0 1e-131 
MEKIVTLNLSELNEIIKALDLYSRIDIGQYDEIVRVNCWFFSFIHNNRELENLFAKIRSC CIPKLSGQSFNTSLGIWGPDTPLRAKRAYDIQQILRYQLAYHDNPNGGNTVNFNSPYIHG KWICSKESIKIIQKTIQKFGYPDYTRFHNTKYWECPIVITEFLDKKHQIKLQRDPSQIDS IIEAALKIYHLAKDNKITELFNIMYPNFKKSIIPSAKEIEVLLLKRQDYKLFKK >gi|223714044|gb|ACDT01000171.1| GENE 9 4499 - 5035 450 178 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732897|ref|ZP_04563378.1| ## NR: gi|237732897|ref|ZP_04563378.1| predicted protein [Mollicutes bacterium D7] # 1 178 1 178 178 330 100.0 3e-89 MLEKNEVLNLLDGMRGVHSPDEYTRGWDDAVKECMLKIINMEEKTATPQKKLTTPPTAQF KDSNEFVKIRNKKTNNLQIKNQYTDMDIVRMFCDFEKSKSCNSFVKVVPIPSNLEMEFLG QIIDSFEDFLEEKKVSIKNHEKTEDNHSDNPAIIFGSDYDNIEDSIRRLLQNWNILLR >gi|223714044|gb|ACDT01000171.1| GENE 10 5239 - 5691 401 150 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732898|ref|ZP_04563379.1| ## NR: gi|237732898|ref|ZP_04563379.1| predicted protein [Mollicutes bacterium D7] # 1 150 1 150 150 270 100.0 3e-71 MKYFNYQFEAVLNNGTDVDMSKICGCTNNTMSWCEANCKKYYDCHNIALANDILKKYEET EDLKRNEDKHTFFLKIISEIDTAVKQGLLVRDINNYNNILIYRMAGDNEPEGWYSENILR VASELVRDKDNYESFKKAVLRINKDEKETD >gi|223714044|gb|ACDT01000171.1| GENE 11 5693 - 6163 561 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732899|ref|ZP_04563380.1| ## NR: gi|237732899|ref|ZP_04563380.1| predicted protein [Mollicutes bacterium D7] # 1 156 1 156 156 297 100.0 1e-79 MKCDYKIIDFLSTKNVVEVKNEAEYEKLKSKLLDLGFDIFEEKNYREWQNLALINGRNPN VFYFEYDNAKGLTWYDNADEPALWYGVEPLTVDELCINEKECREYMVTVTATGFVYVNAE SEREAFDKVLMMSSQQIIESGDISGWKPGDVEEIQD >gi|223714044|gb|ACDT01000171.1| GENE 12 6190 - 6432 204 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732900|ref|ZP_04563381.1| ## NR: gi|237732900|ref|ZP_04563381.1| predicted protein [Mollicutes bacterium D7] # 1 80 1 80 80 149 100.0 5e-35 MKKLVLCGITLVTLFGLTGCATSFNTSYDKAIVRMPNDEVVELKIDSVRYDDDQLQIKTK DGKIYLIPSRNCVLVKNKER >gi|223714044|gb|ACDT01000171.1| GENE 13 6445 - 6894 382 149 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732901|ref|ZP_04563382.1| ## NR: gi|237732901|ref|ZP_04563382.1| predicted protein 
[Mollicutes bacterium D7] # 1 149 1 149 149 247 100.0 2e-64 MEKNIIIISEGHEGDFHLMSLDEMVEEVKNIAMNEDDDYLDWLEWEVLEDNDINLEKYGK QKSDWAKQMILNGYCYEYNGATFCELNDSYLEEKWKELEDVLFFEDEDKNLVLVSDWLIL NQKEYRDDICHLFDKHHSKGIRWLMNNFE >gi|223714044|gb|ACDT01000171.1| GENE 14 6902 - 7106 225 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEDKVMEKNNIDELSSNEKEEEIDYKEKAMRYLTIAYNAITFAQLDEMYDRERLIEEIGC TEEEYDEI Prediction of potential genes in microbial genomes Time: Thu May 26 10:59:45 2011 Seq name: gi|223714043|gb|ACDT01000172.1| Coprobacillus sp. D7 cont1.172, whole genome shotgun sequence Length of sequence - 2086 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 59 - 373 296 ## gi|237732902|ref|ZP_04563383.1| predicted protein 2 2 Op 1 . + CDS 484 - 927 377 ## gi|237732903|ref|ZP_04563384.1| predicted protein 3 2 Op 2 . + CDS 987 - 1157 173 ## gi|237732904|ref|ZP_04563385.1| predicted protein 4 3 Tu 1 . 
- CDS 1462 - 2007 510 ## gi|237732905|ref|ZP_04563386.1| predicted protein Predicted protein(s) >gi|223714043|gb|ACDT01000172.1| GENE 1 59 - 373 296 104 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732902|ref|ZP_04563383.1| ## NR: gi|237732902|ref|ZP_04563383.1| predicted protein [Mollicutes bacterium D7] # 1 104 5 108 108 179 100.0 4e-44 MEDEVDDGLMMFVHMKVYSAKYDIDNGCIHKYHDVTYQRLTDSYIKGKWKELEDVLFTED KYHNLVLSDDWFIFDKGEYKENIWHWFDKHYSKGIECLLMKTKE >gi|223714043|gb|ACDT01000172.1| GENE 2 484 - 927 377 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732903|ref|ZP_04563384.1| ## NR: gi|237732903|ref|ZP_04563384.1| predicted protein [Mollicutes bacterium D7] # 1 147 1 147 147 228 100.0 8e-59 MEVSIVRYYCNISDIIKNNKNLTKEEIDSLEEIENEYLKLKDKCFELESTDADEKYHFWK DEYKHLEKENYLLQTLIKENYTSALLVCRKTYIFGILMYSEAETTSKELTKELANILENN PNNKEIEKLVQKNLHLKYIPFGVELSK >gi|223714043|gb|ACDT01000172.1| GENE 3 987 - 1157 173 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732904|ref|ZP_04563385.1| ## NR: gi|237732904|ref|ZP_04563385.1| predicted protein [Mollicutes bacterium D7] # 1 56 1 56 56 95 100.0 7e-19 MDYKHLKVTKIGVLTLVFSGVIFLYGALNLVEYLQMIREENKLYRFCKKYSEELDS >gi|223714043|gb|ACDT01000172.1| GENE 4 1462 - 2007 510 181 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732905|ref|ZP_04563386.1| ## NR: gi|237732905|ref|ZP_04563386.1| predicted protein [Mollicutes bacterium D7] # 1 181 1 181 181 296 100.0 3e-79 MKKIIVLLLGIICVFSVVTPTAAIGAITQNEQKILNELEGGVVVENITLGLDDSSLNAAK TYLLNNDLSDNQVTTVLSQIVSAKEYMVNNKITELNSVNSKELLDYVSKAAEALGITLKI GNSKIITFLKNGEVLFTTDGPIKKTGYNFSKSFVTFGGLIAVLALCTIGVYTNIQENKKE E Prediction of potential genes in microbial genomes Time: Thu May 26 11:00:10 2011 Seq name: gi|223714042|gb|ACDT01000173.1| Coprobacillus sp. D7 cont1.173, whole genome shotgun sequence Length of sequence - 558 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 228 164 ## - Prom 411 - 470 8.8 Predicted protein(s) >gi|223714042|gb|ACDT01000173.1| GENE 1 3 - 228 164 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAYSIFLKIPDKMEDVYSEYEDYIDIYDNVSSLISLLITNPYKLVFFKYISEYLNEEEYA NTLRIAYTSSEAPNI Prediction of potential genes in microbial genomes Time: Thu May 26 11:00:23 2011 Seq name: gi|223714041|gb|ACDT01000174.1| Coprobacillus sp. D7 cont1.174, whole genome shotgun sequence Length of sequence - 30544 bp Number of predicted genes - 47, with homology - 45 Number of transcription units - 19, operones - 10 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 435 199 ## gi|237732818|ref|ZP_04563299.1| predicted protein + Prom 444 - 503 6.1 2 1 Op 2 . + CDS 525 - 773 174 ## gi|237732819|ref|ZP_04563300.1| predicted protein + Prom 792 - 851 8.9 3 2 Op 1 . + CDS 890 - 1564 539 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) 4 2 Op 2 . + CDS 1616 - 2320 558 ## gi|237732821|ref|ZP_04563302.1| predicted protein + Prom 2678 - 2737 16.2 5 3 Tu 1 . + CDS 2841 - 3620 472 ## COG0860 N-acetylmuramoyl-L-alanine amidase + Term 3656 - 3701 7.3 - Term 3647 - 3683 2.2 6 4 Tu 1 . - CDS 3687 - 4169 373 ## gi|237732823|ref|ZP_04563304.1| predicted protein - Prom 4373 - 4432 10.5 7 5 Op 1 . - CDS 4612 - 5016 225 ## gi|237732825|ref|ZP_04563306.1| predicted protein 8 5 Op 2 . - CDS 5022 - 6179 825 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) 9 5 Op 3 . - CDS 6194 - 6451 130 ## gi|237732827|ref|ZP_04563308.1| predicted protein 10 5 Op 4 . - CDS 6438 - 6800 268 ## gi|237732828|ref|ZP_04563309.1| predicted protein - Prom 6903 - 6962 6.2 - Term 7001 - 7044 7.2 11 6 Tu 1 . - CDS 7048 - 7254 108 ## gi|237732829|ref|ZP_04563310.1| predicted protein - Prom 7316 - 7375 1.8 12 7 Op 1 . - CDS 7560 - 8321 695 ## gi|237732830|ref|ZP_04563311.1| predicted protein 13 7 Op 2 . - CDS 8335 - 8769 390 ## gi|237732831|ref|ZP_04563312.1| predicted protein 14 7 Op 3 . 
- CDS 8766 - 8996 128 ## gi|237732832|ref|ZP_04563313.1| predicted protein 15 7 Op 4 . - CDS 9010 - 9183 165 ## gi|237732833|ref|ZP_04563314.1| predicted protein - Prom 9211 - 9270 6.5 - Term 9195 - 9252 -0.2 16 8 Op 1 . - CDS 9272 - 10375 886 ## COG0206 Cell division GTPase 17 8 Op 2 . - CDS 10455 - 10886 509 ## gi|237732835|ref|ZP_04563316.1| predicted protein - Prom 11032 - 11091 9.0 + Prom 10988 - 11047 12.0 18 9 Tu 1 . + CDS 11270 - 11842 552 ## 19 10 Tu 1 . - CDS 11956 - 12600 352 ## gi|237732836|ref|ZP_04563317.1| predicted protein - Prom 12657 - 12716 10.3 - TRNA 12729 - 12805 83.1 # Met CAT 0 0 - Term 13661 - 13708 -0.6 20 11 Tu 1 . - CDS 13926 - 15428 1178 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair - Prom 15455 - 15514 8.2 21 12 Tu 1 . - CDS 15554 - 16573 288 ## PROTEIN SUPPORTED gi|163739394|ref|ZP_02146805.1| 50S ribosomal protein L36 - Prom 16766 - 16825 9.6 - Term 16629 - 16674 9.2 22 13 Op 1 . - CDS 16849 - 17709 639 ## COG1690 Uncharacterized conserved protein 23 13 Op 2 . - CDS 17606 - 18154 376 ## COG1690 Uncharacterized conserved protein 24 13 Op 3 . - CDS 18221 - 18595 319 ## gi|237732840|ref|ZP_04563321.1| predicted protein 25 13 Op 4 . - CDS 18636 - 19265 503 ## gi|237732841|ref|ZP_04563322.1| predicted protein - Prom 19286 - 19345 1.7 26 14 Op 1 . - CDS 19365 - 19796 257 ## gi|237732842|ref|ZP_04563323.1| predicted protein 27 14 Op 2 . - CDS 19825 - 20136 314 ## gi|237732843|ref|ZP_04563324.1| predicted protein - Prom 20191 - 20250 3.7 28 15 Op 1 . - CDS 20359 - 20841 186 ## gi|237732844|ref|ZP_04563325.1| predicted protein 29 15 Op 2 . - CDS 20883 - 21344 164 ## LM5578_0677 hypothetical protein 30 15 Op 3 . - CDS 21353 - 21649 308 ## gi|237732846|ref|ZP_04563327.1| predicted protein 31 15 Op 4 . - CDS 21668 - 22027 368 ## gi|237732847|ref|ZP_04563328.1| predicted protein 32 15 Op 5 . - CDS 22041 - 22475 240 ## gi|237732848|ref|ZP_04563329.1| predicted protein 33 15 Op 6 . 
- CDS 22506 - 22910 219 ## gi|237732849|ref|ZP_04563330.1| predicted protein 34 15 Op 7 . - CDS 22911 - 23294 456 ## Geob_0714 hypothetical protein - Prom 23341 - 23400 10.2 35 16 Op 1 . - CDS 23934 - 24155 379 ## gi|224543201|ref|ZP_03683740.1| hypothetical protein CATMIT_02401 36 16 Op 2 . - CDS 24127 - 24747 322 ## gi|237732851|ref|ZP_04563332.1| conserved hypothetical protein 37 16 Op 3 . - CDS 24846 - 25319 429 ## gi|237732852|ref|ZP_04563333.1| predicted protein 38 16 Op 4 . - CDS 25326 - 25526 126 ## gi|237732853|ref|ZP_04563334.1| predicted protein 39 16 Op 5 . - CDS 25547 - 26011 394 ## BPUM_1656 hypothetical protein 40 16 Op 6 . - CDS 26025 - 26189 80 ## 41 16 Op 7 . - CDS 26203 - 26388 227 ## gi|237732856|ref|ZP_04563337.1| predicted protein 42 16 Op 8 . - CDS 26392 - 26676 257 ## gi|237732857|ref|ZP_04563338.1| predicted protein 43 16 Op 9 . - CDS 26700 - 27317 317 ## gi|237732858|ref|ZP_04563339.1| predicted protein - Prom 27345 - 27404 6.9 44 17 Tu 1 . - CDS 27486 - 27911 344 ## gi|237732859|ref|ZP_04563340.1| predicted protein - Prom 27946 - 28005 9.9 + Prom 28304 - 28363 7.0 45 18 Op 1 . + CDS 28443 - 28859 422 ## BpOF4_21949 hypothetical protein + Term 28881 - 28916 -0.5 + Prom 28866 - 28925 4.8 46 18 Op 2 . + CDS 28975 - 29661 654 ## BpOF4_21944 hypothetical protein + Term 29672 - 29715 7.7 - Term 29803 - 29843 5.2 47 19 Tu 1 . 
- CDS 29871 - 30383 297 ## COG0194 Guanylate kinase - Prom 30435 - 30494 4.2 Predicted protein(s) >gi|223714041|gb|ACDT01000174.1| GENE 1 1 - 435 199 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732818|ref|ZP_04563299.1| ## NR: gi|237732818|ref|ZP_04563299.1| predicted protein [Mollicutes bacterium D7] # 103 144 1 42 42 67 100.0 4e-10 KKVNTYNFFSSHYVCSTLKIYSIPSTKLIEVLNSVYPIKNIYQISNNTNEIAKNILLKYM QSKIFRYKRNKKFKPFYLNFITPNSTDAIISNKKELLEAYLTMSKVLNKPISLNISSYRT IVKKIKEMEAAFYYQHLKTNVLDI >gi|223714041|gb|ACDT01000174.1| GENE 2 525 - 773 174 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732819|ref|ZP_04563300.1| ## NR: gi|237732819|ref|ZP_04563300.1| predicted protein [Mollicutes bacterium D7] # 1 82 1 82 82 122 100.0 9e-27 MMTKVNLNMDDEFFDDGIDDIKNDLVAYYFVRVNGKNILLKILFEDSKFKLVNIASNINQ QYINNKEISFFTSLLKQANTLY >gi|223714041|gb|ACDT01000174.1| GENE 3 890 - 1564 539 224 aa, chain + ## HITS:1 COG:BS_polC KEGG:ns NR:ns ## COG: BS_polC COG2176 # Protein_GI_number: 16078721 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Bacillus subtilis # 39 214 419 585 1437 94 36.0 1e-19 MKCPLCGNLIDSTNLNRCVRFPICPGFIDEETLTIQSIQEYVVFDLETTGFNKKEDRIIE IGAIKVKNGRVVDRFSTLCAPIKGGKPMYISAKITEVTGIKNVDLLNQIPESEAVKNFMC WLGESKISVAHNGLKFDIPFLKEACKRSNVEFKFTHILDTMLLSKALNYVNNGNIPNNKQ ETLARYFGVDYQAHRAVNDCEALLKIFDKLKNDAKNINFSLKKI >gi|223714041|gb|ACDT01000174.1| GENE 4 1616 - 2320 558 234 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732821|ref|ZP_04563302.1| ## NR: gi|237732821|ref|ZP_04563302.1| predicted protein [Mollicutes bacterium D7] # 1 234 1 234 234 412 100.0 1e-113 MAKIFSLYNTDKKFEEESFKYLKKTEIFYNDTMKLISEIKEREFQKMGMDTICLINNQVK VVDVKAMAGYIPTFSQELFNINSQRIGWLLNNSLKTDYYLLVYHVLDESIATHNYSKDKK LLTNENIAYTKAILISKKEIFNIITNETSLSSYDLENVLLDEIIPEYKTNHTTKMIYNSY HKCIEKKHGPSNVYFVVSDKIHEKPVNCIIRRELLEKHALKVWEIEENGKTALI >gi|223714041|gb|ACDT01000174.1| GENE 5 2841 - 3620 472 259 aa, chain + ## HITS:1 COG:BH1295_2 KEGG:ns NR:ns ## COG: 
BH1295_2 COG0860 # Protein_GI_number: 15613858 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 20 224 30 200 422 86 33.0 4e-17 MLITSITPLLAQEDTVPINIVIDAGHGGSVEQNLEKKQAVGAEYDGLKEKELNLKVANYL KEELDNYNVNVFMTRTDDSYCLTLKERAKLAKKYNATIIVSLHMNACDKHNANGSEVYIP NSSKFYSSMNQLGSTILSNLSQLGLKNNGIFTKLLKDKESNIEYYQDGSAKDYYGIIRES YLFNIPAIIVEHAYLDNYNDRENYLRTDDQLRELAHADAQALVSYYNLSKKIIMPKMHIE TQKKGMISLLKEIVSTFIE >gi|223714041|gb|ACDT01000174.1| GENE 6 3687 - 4169 373 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732823|ref|ZP_04563304.1| ## NR: gi|237732823|ref|ZP_04563304.1| predicted protein [Mollicutes bacterium D7] # 1 160 7 166 166 272 100.0 5e-72 MSGQNKRIIMYKTFICYINEDLLNAFYAQRITTNFSEWIRKEAKKCYSIELNNKNDILDL KDKVLLEYYDNIGEWIKERMRKAMSQNSNEDTKYIIMRYINENNSLYLYEDTLLSLKCNT PIYNWSSNKLKAKLFDYTEDARKCIEEEISANYKCMVVNI >gi|223714041|gb|ACDT01000174.1| GENE 7 4612 - 5016 225 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732825|ref|ZP_04563306.1| ## NR: gi|237732825|ref|ZP_04563306.1| predicted protein [Mollicutes bacterium D7] # 1 134 1 134 134 236 100.0 3e-61 MEYRYKYDPIGKCSLELNSSGTLSELCFPIAYILNRYDYNTEENVRSFFKDRELLLKIED SEVVEFIVTVDYISDYDRYLISFSEIDDSINYETHFLYFCLVLYKMLIRYQEPINVLDAI ISITLSIGGDTGEN >gi|223714041|gb|ACDT01000174.1| GENE 8 5022 - 6179 825 385 aa, chain - ## HITS:1 COG:BH3961 KEGG:ns NR:ns ## COG: BH3961 COG2333 # Protein_GI_number: 15616523 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Bacillus halodurans # 53 309 48 293 295 108 26.0 2e-23 MKELKDIINQKSKLLKNKYTKVDKNKKASIIVIILVIFGLIISTFLFSSKKTDSMVTMID VGEAQSIVLKNGQDVVLYDAGSDDLHNTAIDDYMSYSKTDKIDTLILSHNDIQNINNAIF VIEKYDIQKIYMSDFGNGSKTYKRLLKFINEKNIEVINPHFGDNFKIGDGTIEFINPDKT YENRNDASLCIRYIDKYGLSVIATGDASDIVEKDIIKSYDINQYNTNEIKYHNYIIAGRR GSSYATSDYFLNSIQPEGVLISSGDYDTYKHPSKRLIERLEQKNIAYFKTNESSTIQLKS NETGISVSTIAIPKANEVIMTDKEAKQYAINVRKQAIAEAKYIGNRNTMHYFKNDNPKVD 
KITDSSITYFNSKEDAEKAGFSYSE >gi|223714041|gb|ACDT01000174.1| GENE 9 6194 - 6451 130 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732827|ref|ZP_04563308.1| ## NR: gi|237732827|ref|ZP_04563308.1| predicted protein [Mollicutes bacterium D7] # 1 85 1 85 85 112 100.0 5e-24 MQLYNVFLSAGNREVLIFSIGFAIIAISLSLMSLVLELVSKFYRTGRRGRILQTVDILKI AGVSFSLLAVISYLIGAANNIFIVI >gi|223714041|gb|ACDT01000174.1| GENE 10 6438 - 6800 268 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732828|ref|ZP_04563309.1| ## NR: gi|237732828|ref|ZP_04563309.1| predicted protein [Mollicutes bacterium D7] # 1 120 44 163 163 198 100.0 8e-50 MDISIRCKITMTEAMNEYLMNRFKKLNRYKVLDNSRVSILVKPHEAAVKIQTLISNKYGN FKVTTMASDYYTAVDLHIDKVKNKLCKIKELKYRSLNKKQIGMSYKYIQESKSNNNNATI >gi|223714041|gb|ACDT01000174.1| GENE 11 7048 - 7254 108 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732829|ref|ZP_04563310.1| ## NR: gi|237732829|ref|ZP_04563310.1| predicted protein [Mollicutes bacterium D7] # 1 68 1 68 68 73 100.0 5e-12 MREYHGNTLYFLLILLNLLKESSVMNISEIIFSGVLTLIIFIIIFLFLKVDKLEKEYHIF TDEHTLKK >gi|223714041|gb|ACDT01000174.1| GENE 12 7560 - 8321 695 253 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732830|ref|ZP_04563311.1| ## NR: gi|237732830|ref|ZP_04563311.1| predicted protein [Mollicutes bacterium D7] # 1 253 11 263 263 430 100.0 1e-119 MLKLIEIPESIKNIYLVDTENCYPEIISSKNNLYIYFLGHNKKINAETISNSESMLQFIN CEANGKNALDFILVGYLSILADRFKNRNYYIYSNDKGYDSVISVIADICSVNIQRIKVEP LQEIERRIGDARLVGALKDSGLFEDAVKSLRTNTNLYEYLIAKYGYDEGIKLFSMFSETY NRRLRNKLRKRRKLERDRLIKNHFSEKQREKIYKLNIVIEVEKCIKENLPLRDILVSKFG EERGAQLYKEYSV >gi|223714041|gb|ACDT01000174.1| GENE 13 8335 - 8769 390 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732831|ref|ZP_04563312.1| ## NR: gi|237732831|ref|ZP_04563312.1| predicted protein [Mollicutes bacterium D7] # 1 144 1 144 144 244 100.0 2e-63 MRKIKSFLICLVTVYLVVFSKPISALSIIDDGDYDLGSNVSMIENTIELNIGEKLKNSDV LNCFRIKNDENVSIKYDSTSIDTDIEKEQELKVTIKDGADTYVVSINVDIQDTEFSNKPV LAYLLLVFSLCLTLTFIVFYGKCV >gi|223714041|gb|ACDT01000174.1| GENE 14 8766 
- 8996 128 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732832|ref|ZP_04563313.1| ## NR: gi|237732832|ref|ZP_04563313.1| predicted protein [Mollicutes bacterium D7] # 1 76 10 85 85 134 100.0 2e-30 MSVIEKVKELWNSNCDDYYITDINNNVISSFYFGEAFKSENADKILKQSEETILNWFIEN KWGHCILKVQLKGENL >gi|223714041|gb|ACDT01000174.1| GENE 15 9010 - 9183 165 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732833|ref|ZP_04563314.1| ## NR: gi|237732833|ref|ZP_04563314.1| predicted protein [Mollicutes bacterium D7] # 1 57 1 57 57 68 100.0 1e-10 MTKDLKEQLQFNIEESIKETEQKLDTIPKTKQNHKKILFYIYKLEKLKSYLPGSEKL >gi|223714041|gb|ACDT01000174.1| GENE 16 9272 - 10375 886 367 aa, chain - ## HITS:1 COG:BH2558 KEGG:ns NR:ns ## COG: BH2558 COG0206 # Protein_GI_number: 15615121 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Bacillus halodurans # 1 323 6 328 382 293 55.0 4e-79 MDINNQLAKIKVIGVGGGGNNAVNRMLEQNIKNVEFFIANTDVQVLHQSKLDSKIALGKT LTKGLGAGGNPDIGKKAALESEKALLNILQDTDMLFIAAGMGGGTGTGAAPIIAKLAKDL GILTVGVVTTPFSFEGKKRNSNALEGIDELMKNVDSLISVSNDRLIKLIGGLPLKESFQE ADKVLAQAIETITDLIATPALINLDFADVCSVMRDKGNSLIGIGHAKGDDKAKDAALKAI SSPLLEVSVAGAKDAIINVTGGPNVSLLDANIALETITSQVGNDLNTYLGISINEDLGDE IIVTIIATGLKDTKNKSVENQPNIAEVIKTNKAKYVYDLQYSKESINKQTDDLEIPNFFK KRSSIYY >gi|223714041|gb|ACDT01000174.1| GENE 17 10455 - 10886 509 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732835|ref|ZP_04563316.1| ## NR: gi|237732835|ref|ZP_04563316.1| predicted protein [Mollicutes bacterium D7] # 1 133 1 133 143 224 100.0 1e-57 MEELVNGAQQAGQNGQGYIDYYAQSQGYMTMIMFLVFALAAISLIFLFVQLKHKKYEAQA RVEASKYSAVGDIKVAEIYSSAGMSKNKRRVTSHDKKVSEIVEVSKLDEIDRLKSLLRSE DIDNRDELLSLLQEIEEKFEEEN >gi|223714041|gb|ACDT01000174.1| GENE 18 11270 - 11842 552 190 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGEDGTNQYTGNSNEGTSEDGTEMTELIPDTQDTGQGYVNYYEKSQGMMGLIFVLVIICA VINIIYRVNKKSQIENTKKTNLMYYAKSFQEHNLITQQQLTDMQNMNLQQLQMEMNRITE SLKQQMNQMEIDHQQQMQQQVREDAINAATGIEFGGVNPDINLNPGLQNQLNELNNINDF GSFNDFNNHM 
>gi|223714041|gb|ACDT01000174.1| GENE 19 11956 - 12600 352 214 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732836|ref|ZP_04563317.1| ## NR: gi|237732836|ref|ZP_04563317.1| predicted protein [Mollicutes bacterium D7] # 1 214 1 214 214 323 100.0 6e-87 MGRFSNAWNILTNNIGNFIKILLFSYFVSLLLGVLCIIPLGVGVILSLIISVLLTYIEVF LTNFLLKKQEITMKSISDAFNESCNMFSSIFLYYLVFFIKTFCLLITVIIIICLNMGITL GISSFSNIYFGFGSLCFFIFILLPITVYFCLRFEAKYYASLVGLLFNDRYNAYLDAINYK SKVWWLLIPIIGKYLYMIELISGASNHYLKREDF >gi|223714041|gb|ACDT01000174.1| GENE 20 13926 - 15428 1178 500 aa, chain - ## HITS:1 COG:SA1196 KEGG:ns NR:ns ## COG: SA1196 COG0389 # Protein_GI_number: 15926944 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Staphylococcus aureus N315 # 1 499 6 419 420 179 29.0 1e-44 MLKNHIYAAIDLKSFYASVECIERGLDPLKNNLVVADNSRTDKTICLAVSPALKAYGMPG RPRLFEVINKIKQINIERQRLNKNRDFIAKSINSVELNNNNQLELDYIVATPRMSFYIEY SKRIYEIYLKHVAFEDIHVYSIDEVFLDITPYLKKEKLKPYDFVRKILLDVFKETGITAT AGIGTNLYLAKVAMDILAKHISPDKNGTRIAGLDEMMYRKKLWSHQPLTDFWRIGKGYER RLTKLKLFTMGDIARQSIIDEDVLYNEFGINAELLIDHAWGYEPCTISDIKAYKPMKNSI STGQVLHRGYKCEQAKIVLCEMIDSLSLDLVEKQLVIKQVVLHISYDRENISRSDYKGKT KKDSYGRMVPMHSHGTVNLDFHTSSTDILRNNVSELFDSIVKKNLLIRKLNITFCNVINF SDIAQHTKKTYEQLDLFTDYESKDKQEKVLQEKLKKELALQKAILKIKNKYGKNAVLKVM DLQDGATAKERNEQIGGHKA >gi|223714041|gb|ACDT01000174.1| GENE 21 15554 - 16573 288 339 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739394|ref|ZP_02146805.1| 50S ribosomal protein L36 [Phaeobacter gallaeciensis BS107] # 31 327 84 377 409 115 26 3e-25 MELLIIGLLLVLIIVVIVLLSMISKQNNKVTKITAGMETINEMEYYVKHQEDTLKNINDN YIESVKQGSQSVASLNRIGAILDVIKDSAITQKEQTNNITQRINALNDIMVNKKSRGNWG EYQLNNLLSVYAGDNQSIFETQYSLKNGYIGDVALKIPGEKVLIIDSKFPLENFRKLDSV ESAENDIKKFESAFKQDIKKHINDISKKYITSETIENAVMFIPSEAIYMEVCAKYSELIE YAHHKHVLLTCPTTLIGVVFTLINITKDFNRNKHIKSLEKEIVAMYNDSQRMMTRLEGLS NTIEKLDKCYKEVFTSCNKIDSKIKKIHDGYMPEKESED >gi|223714041|gb|ACDT01000174.1| GENE 22 16849 - 17709 639 286 aa, chain - ## HITS:1 
COG:all3526 KEGG:ns NR:ns ## COG: all3526 COG1690 # Protein_GI_number: 17231018 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 10 280 167 388 393 85 29.0 1e-16 MEINESNIGQRYLVVHSGSRNLGKQVADIYQDIADEHCNRTMSERNLLKKELIKKLKDQG KSNEIQKALEEFNKQNTEKKLLAHDLCYLEGQDMKDYLHDCLICNRYASLNRRIIVQRIL DFIVEDNGYKNCAVWIDEYQAHQFEFGFENNQIDIRSYGWETIHNYIGDDNILRKGAISA QQGEKVIIPINMRDGSIIGIGKGNPDYNYSGPHGAGRLMSRKTAKDSITLKDFEDTMKSV YTSSVCQSTVDESPMAYKGIDDILENIADSVEVLDIVKPIYNFKAH >gi|223714041|gb|ACDT01000174.1| GENE 23 17606 - 18154 376 182 aa, chain - ## HITS:1 COG:all3526 KEGG:ns NR:ns ## COG: all3526 COG1690 # Protein_GI_number: 17231018 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 39 149 42 158 393 73 34.0 1e-13 MFEIEGKYNKAKVFATVVENECISQITEMCNQKWLKGCQISIMPDTYAGKGATIGTTIKL KDKVSPSLVGVDISCGMLAVEIPKSLNLNLEKIDKFINDNIPAGFEVNKENFKGIYYGFI KNLKCYDKLKNIDHIEKSLGSLGSGNHFFGNQRIEYWTKIFSSSFWFEKFRETGSRYLSR YC >gi|223714041|gb|ACDT01000174.1| GENE 24 18221 - 18595 319 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732840|ref|ZP_04563321.1| ## NR: gi|237732840|ref|ZP_04563321.1| predicted protein [Mollicutes bacterium D7] # 1 105 1 105 124 188 100.0 7e-47 MNNYKLQEIDKNTILLLPIDYQIPLSVIIEDFNMKNEKQNGAVLMDLFIYVGNKKNRYKS FRIKNGQIYLNYSEVYHPDTKIVDAFYNLFAEAPIGMIERIVSPAIKKLILKKHLIKKKV LLEK >gi|223714041|gb|ACDT01000174.1| GENE 25 18636 - 19265 503 209 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732841|ref|ZP_04563322.1| ## NR: gi|237732841|ref|ZP_04563322.1| predicted protein [Mollicutes bacterium D7] # 1 209 1 209 209 330 100.0 3e-89 MSYCITLTFYPDIKANQVFKKAQQIAKNKLDNFEKVIDENYPYCPASMYNANYFSIEEYR EKELYHLEKLWIENIFTQTFLYWKEYNLLAVVGYDRAGKTTITFQNSVDQNYNYSEWNGV PLFEELVQQAKTTSNKNIENYEDSSDEYYRQTYAYDLIYKKLNLDYFLANKPMEKYDYFK LSMMNEHKSFQCHQYLKKKLLDELKSFLE >gi|223714041|gb|ACDT01000174.1| GENE 26 19365 - 19796 257 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732842|ref|ZP_04563323.1| ## NR: 
gi|237732842|ref|ZP_04563323.1| predicted protein [Mollicutes bacterium D7] # 1 143 1 143 143 296 100.0 2e-79 MKEIKSIIYETVDGTLFTDKKEAEKHEKRLLGRRFFRVYAHPDLTEGRHDAQPIGYIMVD AENLSYGNGPISDNICEDYAEAWCELYICSRKYSFAACSFHRGGPFHNWKVEQISEFNLT GIKNSSDNYLGSVDENGFHSFIK >gi|223714041|gb|ACDT01000174.1| GENE 27 19825 - 20136 314 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732843|ref|ZP_04563324.1| ## NR: gi|237732843|ref|ZP_04563324.1| predicted protein [Mollicutes bacterium D7] # 1 103 1 103 103 179 100.0 7e-44 MKDDFNTNEIEVLTAAYNILKKHKKNAIDAPSTLDINKLAQDLLEEIIKGNGDKKIFISS DDEGNSYHRLFYTIDTAPEFLDEIKADSPCMEDFNNEQIALLG >gi|223714041|gb|ACDT01000174.1| GENE 28 20359 - 20841 186 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732844|ref|ZP_04563325.1| ## NR: gi|237732844|ref|ZP_04563325.1| predicted protein [Mollicutes bacterium D7] # 1 160 1 160 160 270 100.0 1e-71 MEKKYIFLDIDGVLNNEASSKTSKSIYNLSKENVIRLKDLLTTMGVKYTPIVVVSSSWRY SPKALYRLRTYLKNYNLGFEDILSTEKMNCRGDEIEAYCKNKGIDIDNIIILDDDSDMGN LVHRLVKTYVRDGLTYKEVEQCLILLGLKEYVWKPERYKR >gi|223714041|gb|ACDT01000174.1| GENE 29 20883 - 21344 164 153 aa, chain - ## HITS:1 COG:no KEGG:LM5578_0677 NR:ns ## KEGG: LM5578_0677 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 5 153 2 150 154 221 69.0 6e-57 MTDKKILDVSCGARMFWFDKNNKNTIYMDNRNANEVLCDGRKLTVAPDIIADFRDIPFAN ETFNLVIFDPPHLFRVGENSWLAKKYGKLDANTWKNDISKGFNECMRVLKKDGILIFKWN EEQIKLKEILTCFDKTPLFGNKRSKTHWLVFMK >gi|223714041|gb|ACDT01000174.1| GENE 30 21353 - 21649 308 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732846|ref|ZP_04563327.1| ## NR: gi|237732846|ref|ZP_04563327.1| predicted protein [Mollicutes bacterium D7] # 1 98 1 98 98 147 100.0 3e-34 MILDKNNEFEAFEKIKEYILNNQIKELYGDAKVVSEAINNKFESTIITTIDLSIETLLVT KKRQDWSSIEQSIQTLERVRKMILGTKYDYIQKLKVED >gi|223714041|gb|ACDT01000174.1| GENE 31 21668 - 22027 368 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732847|ref|ZP_04563328.1| ## NR: gi|237732847|ref|ZP_04563328.1| predicted protein 
[Mollicutes bacterium D7] # 1 119 1 119 119 213 100.0 4e-54 MDKYKLCKTEHDYAGCVGADSYDSYKSWQDWKKQLLPFGATEWDWYDTFNFVIRYDIFEK NNAYGMDLYIFSQRIGNTSEVKIEYIAPDEYEEVIEWLNDRSKYIQSLWKEAENRFKKL >gi|223714041|gb|ACDT01000174.1| GENE 32 22041 - 22475 240 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732848|ref|ZP_04563329.1| ## NR: gi|237732848|ref|ZP_04563329.1| predicted protein [Mollicutes bacterium D7] # 1 144 1 144 144 233 100.0 3e-60 MTKQQRINGNRFASAVTIHLLNIRNYHIYLWRQYDGYYDSYYDYPWHWNNYIKINKNIEL RNIHAKTHKYDVSNIFRTYLKLSFDLYINDELIRKYRAVLSLNQTKDDFGCWEVENTYKR NKAIPQWAVDQIFQWIEKMRAKRD >gi|223714041|gb|ACDT01000174.1| GENE 33 22506 - 22910 219 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732849|ref|ZP_04563330.1| ## NR: gi|237732849|ref|ZP_04563330.1| predicted protein [Mollicutes bacterium D7] # 1 134 1 134 134 166 100.0 6e-40 MNIIVNDFIASIFPICIFILYAIIFFIFLEFILNKKYIFGCSIIILTVIFTLTVSFSKEK VNINIININSKEITYVIENDKTKNINTIKFTDNVRVTVVKHTNKEQLEKKLWSSGAKKIF YIHKDTYKKLITQK >gi|223714041|gb|ACDT01000174.1| GENE 34 22911 - 23294 456 127 aa, chain - ## HITS:1 COG:no KEGG:Geob_0714 NR:ns ## KEGG: Geob_0714 # Name: not_defined # Def: hypothetical protein # Organism: Geobacter_FRC-32 # Pathway: not_defined # 6 127 2 122 132 99 43.0 3e-20 MDSQFRIVVAGCRNFTDYEKVKKRLEIELEVLGSRLVIVSGGAAGADSLGERFAKEHNLE IERFPADWKKYGKAAGPIRNDQMAQVADMVIAFWDGKSKGTENMLRMANKYGVKMDVQLV RIDKEIK >gi|223714041|gb|ACDT01000174.1| GENE 35 23934 - 24155 379 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|224543201|ref|ZP_03683740.1| ## NR: gi|224543201|ref|ZP_03683740.1| hypothetical protein CATMIT_02401 [Catenibacterium mitsuokai DSM 15897] # 1 71 1 71 72 95 60.0 1e-18 MCYIIAKRFKKSGCVALKAKRGKELADFATDLQKKLGYDIQIVAITRPTAYGEYEPYKFV NSFEEFSIEASRL >gi|223714041|gb|ACDT01000174.1| GENE 36 24127 - 24747 322 206 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732851|ref|ZP_04563332.1| ## NR: gi|237732851|ref|ZP_04563332.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 206 1 206 206 356 100.0 5e-97 
MYILDKIGLNIEILESLSYESKLGMSFKRTLSHFNKEEVLKEIELINNWYFSLEIIDDLP LDSRIKSVSSAKMKFERYYPNATYNRVFNDILGFRVICKSYDEVLELEKEDKIRVVDMSR GKSNDDGYRGIHVYYQRDNHHYPIEIQFNTYYDRQLNDWLHDKFYKRGYDSSCGQLLRKY YENGKIKSAEELEEVLEDVLYHCKKI >gi|223714041|gb|ACDT01000174.1| GENE 37 24846 - 25319 429 157 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732852|ref|ZP_04563333.1| ## NR: gi|237732852|ref|ZP_04563333.1| predicted protein [Mollicutes bacterium D7] # 1 157 3 159 159 261 100.0 1e-68 MNKKEYEERMTKLERELKELKEAEIVDDEFPKYGEQYWVEDENGNITSSYWSDDEDDNYC KDFLRIFKTKEECCRYLEIQEAFKAESIKFVPDWKNHTQKKYYLYYNHNHADNCIIIGAT VGFQQATLYFKSEKVLEELIERFGEDDVKKYYFGIEE >gi|223714041|gb|ACDT01000174.1| GENE 38 25326 - 25526 126 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732853|ref|ZP_04563334.1| ## NR: gi|237732853|ref|ZP_04563334.1| predicted protein [Mollicutes bacterium D7] # 1 66 14 79 79 96 100.0 5e-19 MDLKVFLLMFRIIECLLFIISIFMFYKSTKEAMKLNNYLQNRINELDKQIKNESKFRESL DKNDMR >gi|223714041|gb|ACDT01000174.1| GENE 39 25547 - 26011 394 154 aa, chain - ## HITS:1 COG:no KEGG:BPUM_1656 NR:ns ## KEGG: BPUM_1656 # Name: not_defined # Def: hypothetical protein # Organism: B.pumilus # Pathway: not_defined # 8 153 3 147 149 130 47.0 2e-29 MTVIYKKGNILNTKCNIICQQVNCKGVMGAGLALQIRRKWTTVYKDYKQYCNSANKYNNL LGNALFTKVENDKYVANIFGQLNYGRAKQQTNYSALSKSFNTVCNFAKVNGYTVAIPYGI GCGLAGGSWDVVSKIILDIFEHSTVQCEIWIFRD >gi|223714041|gb|ACDT01000174.1| GENE 40 26025 - 26189 80 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MERIFIIGLTVLLILTLGILVFNAKLYPAKGKMVSVSTSIVSFVILIIIFEQIL >gi|223714041|gb|ACDT01000174.1| GENE 41 26203 - 26388 227 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732856|ref|ZP_04563337.1| ## NR: gi|237732856|ref|ZP_04563337.1| predicted protein [Mollicutes bacterium D7] # 1 61 1 61 61 110 100.0 3e-23 MEKINYELLFDLAKKQNIPVHKAEPGEKPGIYISDGSGGKRAFTVNDLLYLNEKTSKKIT K >gi|223714041|gb|ACDT01000174.1| GENE 42 26392 - 26676 257 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732857|ref|ZP_04563338.1| ## NR: gi|237732857|ref|ZP_04563338.1| 
predicted protein [Mollicutes bacterium D7] # 1 94 1 94 94 172 100.0 6e-42 MENTKNKQEKITQDEVLMALAKVRMDGDWCEEEDYKLIQSIVKDYFALMDAFGKDKILGL SLSRSDGDDGIFEFEADVWATIGPRIVKVFYRDI >gi|223714041|gb|ACDT01000174.1| GENE 43 26700 - 27317 317 205 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732858|ref|ZP_04563339.1| ## NR: gi|237732858|ref|ZP_04563339.1| predicted protein [Mollicutes bacterium D7] # 1 205 1 205 205 355 100.0 8e-97 MKYYVYKRKAEWVEPPISTNSDMSDRLNWYRWTEWECLGYTEGAKELYLVLGKYIESVNS FLEQTTFKSLYTPSVQFLILDEKRRIRDYQELIRNIKTHRQYRKYCKSRWRWPENHSEQK RSITPEEVKEIRFKYNIDLKPIKNKRKINGYDLEYQTKLQRNWKRYRKTQYKKCFYNRYD FGGIEETNNIPGVYRTLYEKGKPIV >gi|223714041|gb|ACDT01000174.1| GENE 44 27486 - 27911 344 141 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732859|ref|ZP_04563340.1| ## NR: gi|237732859|ref|ZP_04563340.1| predicted protein [Mollicutes bacterium D7] # 1 141 1 141 141 276 100.0 3e-73 MGLIIENKHSSIDLSYTGFARLRTKIAYLHSKELGKFYDELVNAIPMFGKEREIFLKDYE KRLSTIDEELNVSHYLLDFLYAYDCEATMSVKHCRRIWNIIKNCDEDFSIGYVGKKDCAM FFDFKRIVKECRDKNIQMYWF >gi|223714041|gb|ACDT01000174.1| GENE 45 28443 - 28859 422 138 aa, chain + ## HITS:1 COG:no KEGG:BpOF4_21949 NR:ns ## KEGG: BpOF4_21949 # Name: not_defined # Def: hypothetical protein # Organism: B.pseudofirmus # Pathway: not_defined # 7 135 3 124 135 62 37.0 6e-09 MIKLYGKKLSIIFLVSLLLILCIFYIYFINNKPTTVTMKVNSKKNTELVIPSKESIKNNG YPINEDGQTYGPDMGNIISDSPDLILAEGKNGVIGYISYNSKITSPDDLKRDTLDKEDES IPLYLQDGKTVIGEFLFK >gi|223714041|gb|ACDT01000174.1| GENE 46 28975 - 29661 654 228 aa, chain + ## HITS:1 COG:no KEGG:BpOF4_21944 NR:ns ## KEGG: BpOF4_21944 # Name: not_defined # Def: hypothetical protein # Organism: B.pseudofirmus # Pathway: not_defined # 1 217 14 247 257 127 39.0 2e-28 MSLMLNITSIFAADADGSVKYVTVYGHRYSYNSSVYNDSTETWGYGRTSETNSNNVPTGY MGINARLYKSSGTLVKSSGWKYNTKPLGGLSVNSGSTTTKGTYYAKSQMQFYNGNGYNTY TSNATPNIQRAAIHTENYKINKYGLTYGSDYFAKDKNDSPDLIRVLGQNDIEGYVYSNEL NKEPKNINEVKNYMNEVKLGYTIPVYDENGENVIDSFFVYGEETAIVY >gi|223714041|gb|ACDT01000174.1| GENE 47 29871 - 30383 297 170 
aa, chain - ## HITS:1 COG:SA1052 KEGG:ns NR:ns ## COG: SA1052 COG0194 # Protein_GI_number: 15926792 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Staphylococcus aureus N315 # 13 137 38 162 207 63 36.0 3e-10 MYNSLSNPISEILSTTTRNPRCGELDGVDYNFVDLRAFHKLSKIEEAEYSGEFYGISEEE ILGKVQNNTILFAVVSIEGVICLKRYIKDRFPEIKVDSVFLDVPSNILIDRMVNRGDKID KIQQRIKNMRKNDELKNGLFCNFAFTPTNPGIYDPKICVEEFYQFIKRKF Prediction of potential genes in microbial genomes Time: Thu May 26 11:04:21 2011 Seq name: gi|223714040|gb|ACDT01000175.1| Coprobacillus sp. D7 cont1.175, whole genome shotgun sequence Length of sequence - 30589 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 16, operones - 6 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 53 - 304 171 ## gi|167756329|ref|ZP_02428456.1| hypothetical protein CLORAM_01862 + Term 327 - 363 4.3 + Prom 453 - 512 9.0 2 2 Tu 1 . + CDS 554 - 1306 833 ## gi|237732790|ref|ZP_04563271.1| predicted protein + Term 1308 - 1344 2.1 + Prom 1460 - 1519 9.0 3 3 Tu 1 . + CDS 1608 - 4685 2896 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component + Term 4690 - 4733 7.2 + Prom 4780 - 4839 10.5 4 4 Op 1 8/0.000 + CDS 4867 - 6156 1120 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 5 4 Op 2 1/0.000 + CDS 6149 - 7561 1408 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 6 4 Op 3 . + CDS 7558 - 8442 757 ## COG1940 Transcriptional regulator/sugar kinase 7 5 Tu 1 . - CDS 8454 - 9647 924 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 9694 - 9753 9.3 + Prom 9625 - 9684 8.3 8 6 Op 1 . + CDS 9724 - 11184 1564 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 9 6 Op 2 1/0.000 + CDS 11242 - 12132 812 ## COG0583 Transcriptional regulator + Prom 12163 - 12222 8.4 10 6 Op 3 1/0.000 + CDS 12271 - 12783 461 ## COG0716 Flavodoxins 11 6 Op 4 . 
+ CDS 12787 - 13536 742 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit + Prom 13612 - 13671 3.6 12 7 Op 1 2/0.000 + CDS 13718 - 14425 736 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 13 7 Op 2 1/0.000 + CDS 14441 - 15409 1235 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Prom 15447 - 15506 8.3 14 7 Op 3 . + CDS 15526 - 16092 448 ## COG0655 Multimeric flavodoxin WrbA + Prom 16123 - 16182 5.1 15 8 Tu 1 . + CDS 16243 - 16602 210 ## BCG9842_B1842 hypothetical protein 16 9 Tu 1 . - CDS 16755 - 16928 111 ## gi|167756344|ref|ZP_02428471.1| hypothetical protein CLORAM_01877 - Prom 17061 - 17120 7.4 + Prom 17401 - 17460 8.8 17 10 Op 1 . + CDS 17552 - 17707 108 ## gi|237732805|ref|ZP_04563286.1| hypothetical protein MBAG_03163 + Prom 17785 - 17844 4.4 18 10 Op 2 . + CDS 17868 - 18110 63 ## gi|167756345|ref|ZP_02428472.1| hypothetical protein CLORAM_01878 + Prom 18144 - 18203 14.4 19 11 Op 1 . + CDS 18228 - 19082 977 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 19087 - 19115 -0.0 + Prom 19097 - 19156 7.8 20 11 Op 2 . + CDS 19177 - 20448 1244 ## COG4099 Predicted peptidase + Term 20506 - 20550 -0.9 + Prom 20518 - 20577 11.2 21 12 Tu 1 . + CDS 20774 - 22462 1677 ## COG1164 Oligoendopeptidase F + Term 22463 - 22514 7.8 22 13 Tu 1 . - CDS 22656 - 24029 1281 ## COG0534 Na+-driven multidrug efflux pump - Prom 24063 - 24122 7.5 23 14 Tu 1 . - CDS 24142 - 25245 1217 ## COG0620 Methionine synthase II (cobalamin-independent) + Prom 25537 - 25596 5.5 24 15 Tu 1 . + CDS 25740 - 26864 1199 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family + Term 26872 - 26925 12.3 + Prom 26871 - 26930 7.9 25 16 Op 1 1/0.000 + CDS 26985 - 27881 733 ## COG1737 Transcriptional regulators + Prom 27908 - 27967 5.9 26 16 Op 2 . + CDS 27999 - 29444 1518 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 27 16 Op 3 . 
+ CDS 29425 - 30279 763 ## COG1145 Ferredoxin 28 16 Op 4 . + CDS 30357 - 30588 123 ## gi|237732817|ref|ZP_04563298.1| hypothetical protein MBAG_03175 Predicted protein(s) >gi|223714040|gb|ACDT01000175.1| GENE 1 53 - 304 171 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756329|ref|ZP_02428456.1| ## NR: gi|167756329|ref|ZP_02428456.1| hypothetical protein CLORAM_01862 [Clostridium ramosum DSM 1402] # 1 83 127 209 209 164 100.0 1e-39 MFAQMLQDFSNNNHLPYHFSATSIYDDDTLSKEYDLVLMAPQVQHYTKSMAAKYHKRFLI MNPVDFAQYNCNNIVSRVQEVFL >gi|223714040|gb|ACDT01000175.1| GENE 2 554 - 1306 833 250 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732790|ref|ZP_04563271.1| ## NR: gi|237732790|ref|ZP_04563271.1| predicted protein [Mollicutes bacterium D7] # 1 250 1 250 250 474 100.0 1e-132 MELMISRILKYLNGCLDDDHMYRIGNFIIKNYTLIENHTMVSFIDAAKCSQEELLDFCQH FGIHSFDEFKERLLADHQGRLEQIHARMLNTDINDFLEHVNTDYSKEEFIQWIDELCELI FNKQRIVILGALYPSSVAVDFQTDLISLGKEVVEYHHFDLNFKFSDDDVVFFITATGRMM ERNVKKLKPQNICDAYLVLITQNLAYRDYENVCADYFAHVLGKFDGLQFNYQIMMIFDIL RIRYYQKYYQ >gi|223714040|gb|ACDT01000175.1| GENE 3 1608 - 4685 2896 1025 aa, chain + ## HITS:1 COG:L119891_1 KEGG:ns NR:ns ## COG: L119891_1 COG1136 # Protein_GI_number: 15672696 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Lactococcus lactis # 1 292 3 290 290 239 42.0 3e-62 MLQIQKITKEYITGDLHQIALNGISLNLRDNEFVAILGPSGSGKTTLLNIIGGLDRYDSG DLIINGISTKRYKDRDWDSYRNHTIGFVFQSYNLIPHQTVLANVELALTISGISKNERRK RAIDALKEVGLGEHLHKRPNQMSGGQMQRVAIARALVNDPDILLADEPTGALDSETSVQV MELLKEVARDRLVVMVTHNPELAEAYATRIVELRDGIIRSDSNPYEVSSDKLEVPRHENM GKSSMSFLTSLLLSFNNLRTKKARTILTSFAGSIGIIGIALILSLSNGVNQYIQSIEEET LSEYPLQIQSTGFDITSMMTDTQPNQNNEEKDDTKIHVSQMITNMFSKIGSNDLTSLKKY LDSGKSNIKENTNSIEYTYNVAPQIYSTNTDLIRQVNPDKSFSSLGLGSSANSNSLMSSM MSTDVFYQMPSNTSLYEDQYDIKAGNWPKNYNECVLVLTKNGNINDFMLYTLGLRDYSEL DKMIEQFSKEETVNVPTDIKSYSYKDILGIEFKLVNASDYYQYDSKYNVYKDKTDDENYM KNLIQNGENIKIVGIVQPKDSASATMLQSGIGYPAALTNHVIEQAASSEIVNKQISNPNI 
DVFTGKEFSDKNSNQLDMNSLFTVDGEMMKKAFSFDQNKLSFDMGDLDLSQIKLDSASLP SINMDNIFANMKVDIPKENIESFTQAVMTQFQQYLKDNGLIDPTKMNEYFMVFLQTDQAQ KLMQDEMIKLLQSSGATEQFQAQLERQMQTVMTQYTETITKSLQQQISTQITKQMGNLAN SMQDAIKIDTSVFAQAIKMNMNEEELSELMMSLMTTESSSYERNLKNLGYADFDKPSAIN IYPKDFETKQEVVNILDSYNENIKKIDEDKVISYTDYVGTLMSSVTDIINVISYVLIAFV AISLVVSSIMIGVITYISVLERKKEIGILRAIGASKKNISQVFNAETFIIGLLAGVLGIG ITLILLIPGNALIHEIAGNTSVSATLPIMGAIILIVLSVLLTLLGGLIPSKKAALEDPVT ALRSE >gi|223714040|gb|ACDT01000175.1| GENE 4 4867 - 6156 1120 429 aa, chain + ## HITS:1 COG:BS_ywbA KEGG:ns NR:ns ## COG: BS_ywbA COG1455 # Protein_GI_number: 16080890 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 1 427 1 441 444 197 31.0 4e-50 MEKLEKLLNRFIGPIAKKMSENDTIQSVAEGFMRTGPVTFGVCIFVILGNLPFTGYSDWL TNVGLKVHFDAISNASLNILALYVSFTVAHSFAKRKGDNALSCGILSLLSFLIIIPQTVA GVEGDITAFDITYLSGTGILVALIFAIIVGHLFHYLAGKGLKFKMPEGVPPMVSESFEPI FISMIIVTFAFVVRVGFGYTPFGNFLSFFDQTIGAFIIKIGLSLPTIFLLYFVANLLWFF GIHPNTVYSAFVPLQMTLIMTNIADAQAGKPLTYLTITLVSLFASFGGNGNTLGLCLSMF TARSERYKKMLKLAFIPNLFNINEPLIFGMPVMLNPVFFIPMVFCNVVMGFIGLFATQIF TFTYNPAMSLLPWTTPFFVKAFLAGGISLLIMVLILLVVNTLMYYPFFRIADKKAYEEEQ LAKVGGKIE >gi|223714040|gb|ACDT01000175.1| GENE 5 6149 - 7561 1408 470 aa, chain + ## HITS:1 COG:lin0017 KEGG:ns NR:ns ## COG: lin0017 COG2723 # Protein_GI_number: 16799096 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 2 470 4 476 477 558 57.0 1e-159 MSKLSKDFLWGGAIAANQAEGAYDADGKGLSLMDIATSASKDVSRQFTDGIKKGIYYPNH EGIDFYHKYKEDLALFEEMGFKCFRTSIAWSRIFPNGDEKLPNELGLKYYDDLFDEMIKR GMEPVVTISHYEMPLYLAKQYGGWANRKLIDFYLNFCEVIYKRYSGKVKYWMTFNEINSV IFMPEVAGIIGREQADFKQRSFQAAHHQFVASAKAVKLGHQIDSKNMIGCMVLTLIKYPL TAKPEDVLLAEEKMRYGTFAFSDIQVRGHYPNYVKKLIQKIDLKIEVKPDDLKSLKEGCV DFVGFSYYSSSAASTDTNVETTAGNIVSGVKNPYLPTSEWGWQIDAKGLRYILNRFYERY 
EIPLFIVENGLGYDDQVDENGYVEDDYRITYLKEHIKEMKSAILEDGVDVIGYTPWGCID LVSAGTGEMKKRYGFIYVDRDDQGNGTLKRSKKKSFAWYKKVIATDGEDL >gi|223714040|gb|ACDT01000175.1| GENE 6 7558 - 8442 757 294 aa, chain + ## HITS:1 COG:L118696 KEGG:ns NR:ns ## COG: L118696 COG1940 # Protein_GI_number: 15673470 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Lactococcus lactis # 1 287 1 283 293 157 31.0 3e-38 MNIAVFDVGGTFIKHCLMIDGKITQAGKIPTPQDSQESFLKAIKAVLDKMSNIEGIAFSL PGVIDVYRQYIYAGGSLRYNDHCDLKEWENYFNLPIQAENDARCAAIAELEQGNMQGIQN GLVLTFGTGVGGGIIINGDIFKGSHLISGEVSMIFAHRISDVTNHHLFGALGSIKNLIDK IAAAKGVKTDDGRLIFDWIKTGDLISCELFDDYCDDVIWQLHNIQCILDPQRICIGGGVS ENQIFIDGIKTAVKRFYQSLPIPFPQPEIVKCKYCNDANMVGAYLHYLRKNDER >gi|223714040|gb|ACDT01000175.1| GENE 7 8454 - 9647 924 397 aa, chain - ## HITS:1 COG:all3171 KEGG:ns NR:ns ## COG: all3171 COG2207 # Protein_GI_number: 17230663 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Nostoc sp. 
PCC 7120 # 303 395 209 301 306 69 35.0 1e-11 MIQTKQELINLLPTIYQNLNTPLFLISPDLKIIASPKKFLKLEINYFQQIFNFTIMKQHE IYIHFHQNATYFFFCTKLDEIPYICVGPIFNRKITTQDSPAEYELFQHVISSYTLDDFFN LPSITVETKNHFIFIYQIITGKILDSKLLKITFKGSKDNPLKQENSLEKEIFQIRETPLH EFSYTYEQKILNYIQNEDSTSARILMIELLQIKDERHLSKNQLQSAKYKVVAAIAVFTRG VISIGVPVDKAYSLSDVYIVKVDQSNTINQLHKLISDAIIDFTQLVKRYRNIQNPYWVKI CKNYISHNLHKNITLLDLAKVTEMNTTYLSTQFKKTTGQSIKQYINHKKIQEAQFLIKNS QYSLAQIADILQFSSQSHFNKVFKQIVGKSPIQYKNS >gi|223714040|gb|ACDT01000175.1| GENE 8 9724 - 11184 1564 486 aa, chain + ## HITS:1 COG:lin0017 KEGG:ns NR:ns ## COG: lin0017 COG2723 # Protein_GI_number: 16799096 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 2 485 6 476 477 512 52.0 1e-145 MFKNNFLWGGATAANQFEGGYNLGGRGLATSDTKTNGSMNKKRKHSFLDSNGKIVYLHGS EKIPQAYQSVIDEKMYYPSHRAVDFYHRYEDDIALLAQMGFKCFRMSISWTRIYPRGNES SPNQAGLDFYDHIFDECLKYGIEPIVTLNHFDMPIYLADHQNGWLNRQTVEYFIQYCKTV FKYYRGKVKYWMTFNEINLLRGYDTLGVHEMTPARYYQALHHIFVASAKAVILGHKIDSN NQIGMMLANILTYSETCNPRDVALELDVSRKLKYFYSDVQCRGYYPSYIVNSLKQQGITI NSLEDDDNLLKNGCVDYIGFSYYNSGVVTTRKDAEMTLGNGIKMAANPYLKESEWKWPID PIGLRISLNLLWDRYQKPLFIVENGLGANDQIAEDGKIHDEYRIDYLREHIIEMKKAVDE DGVELIGYTPWGCIDLVSAGTGEMKKRYGFIYVDMDDNGKGTLKRSCKESFYWYQKVIKS NGESLS >gi|223714040|gb|ACDT01000175.1| GENE 9 11242 - 12132 812 296 aa, chain + ## HITS:1 COG:XF1532 KEGG:ns NR:ns ## COG: XF1532 COG0583 # Protein_GI_number: 15838133 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Xylella fastidiosa 9a5c # 1 292 15 308 325 101 26.0 2e-21 MLLKQMKYFISVVECNSFTEAAEQCYISQSAISQQIKALEQELGVDLIKRNNRQFTLTPA GEYFYRHGVELVSEIDNLKRETVRRGQDQELSLKIGYLRCYGAQELHHAIAKFSKTYSEV SLSIVNGTHEELYDLLRSGQVDLVISDQRRAFNDDYYNYELLYSDCYIEISSRHPLSQRD ILTITDLKRNTCILISTKEQQEVEKDFYQNTLGFANQFLFADTLEEGRLMVVSNRGFMPI EAVGTLPPPATGITRIPLYHHHKPLQRNYCAFWHKEKTNYYIEEFAELLRNLLNND >gi|223714040|gb|ACDT01000175.1| GENE 10 12271 - 12783 461 170 aa, chain + ## HITS:1 COG:MA0407 
KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 2 169 7 174 179 177 47.0 1e-44 MKKSLIVYFSHRKENYVAGAIKDLKIGNTEVIAKKVQAIIGAELFEIHPLHEYPSKYDEC TKLAKDELETNARPKIINIISHFEEYENIYLGYPNWWSTMPMCLWTFLESYDFTNKHIYP FCTHEGSGLGKSISDLNKICPNAIIHQGLDIFGSQVFDSDEKIFKWLKEN >gi|223714040|gb|ACDT01000175.1| GENE 11 12787 - 13536 742 249 aa, chain + ## HITS:1 COG:MA0409 KEGG:ns NR:ns ## COG: MA0409 COG0599 # Protein_GI_number: 20089302 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Methanosarcina acetivorans str.C2A # 1 248 1 248 250 279 50.0 3e-75 MAITKNAQNYHERMFPGYESKLLKTDPEFIELFDNFAFDEVVNQNDLDDKTSMIAILAML LGCQGIDEFKAMLTAAYNFGVTPIEMKEIIYQATAYLGIGRVFPFLHVVNDFCTKNGITL PLQGQSTTNQSNRLEKGIQAQVDIFGQDMCEFYKSGTADIKHINYWLTDNCFGDYYTRKG LDYKQREMITFCFLAAQGGCEPQLVSHIEANIRLGNDRAFLIKVISRGIPFLGYPRSLNA LRCINEVIK >gi|223714040|gb|ACDT01000175.1| GENE 12 13718 - 14425 736 235 aa, chain + ## HITS:1 COG:RSc0215 KEGG:ns NR:ns ## COG: RSc0215 COG1028 # Protein_GI_number: 17544934 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Ralstonia solanacearum # 1 229 44 272 275 264 58.0 1e-70 MNDAGFDVIPVEMDLSSRKSILKMIKVAQNYGEISMLVNGAGVSPSQASIETILKVDLYG TAVLLEEIGKVIKDGGVGVTISSQSGHRMPALGVEIDEMLAITETNKLLDLEVLQPHNIQ DTLHAYQLAKRCNEKRVMAEAIKWGQKGARINAISPGIIVTPLAIDEFNGPRGDFYKNMF AKCPAGRPGTADEVANVAELLMGPQGAFITGADFLIDGGATAAYFYGPLKPKNKS >gi|223714040|gb|ACDT01000175.1| GENE 13 14441 - 15409 1235 322 aa, chain + ## HITS:1 COG:lin2113 KEGG:ns NR:ns ## COG: lin2113 COG0667 # Protein_GI_number: 16801179 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to 
aryl-alcohol dehydrogenases) # Organism: Listeria innocua # 1 320 1 325 331 365 57.0 1e-101 MEYVNLGKTGLKVSKICLGCMGFGELDRGGIPGKTAEKEAREIIKYALEHGINFFDTANY YSLGASEEILGQALKDYANREDVVIATKLYYPMFDGVNAKGLSRKAVFTEVENSLKRLQT DYIDLYIIHRWDYTTPIEETMEALNDLVRMGKIRYIGASAMYAWQFVKANSIAEKNGWAK FISMQNMYNLVYREEEKEMIPMCQYENIALTPFSPLAHGVFNKAVDTPRTDGCGKLQNDY QLAHDCEIIKRVYELAKKYQKPTSQIALAWVLSKPYITSPLVGSSKIKYLEDALQALDIV LTPDEIAYLEEPYVPHNQYGFR >gi|223714040|gb|ACDT01000175.1| GENE 14 15526 - 16092 448 188 aa, chain + ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 8 188 3 179 179 159 43.0 2e-39 MKEEAYMKKIVVISSSPRKNGNSEILADQFIKGATSVGHKVIKINLANYRIAPCLACEYC RGHHNRCVLKDDANKVIEEIVNADVFVMATPVYFYSLSAQLKILIDRMFAREYEIRNSPK RKTAYLILTSGSTDREQMHGTIESFRGFIKVLKKVDEGGIIYGLGAFNKGDAYTHPSYDE AYQMGSTI >gi|223714040|gb|ACDT01000175.1| GENE 15 16243 - 16602 210 119 aa, chain + ## HITS:1 COG:no KEGG:BCG9842_B1842 NR:ns ## KEGG: BCG9842_B1842 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_G9842 # Pathway: not_defined # 4 116 37 153 158 68 36.0 5e-11 MDKIIDEAKYYNRPFAARIVYNNKIIRRCFVNSYNEKSIKWLGRKERVCLETKHSSYFIF LDNIDTHNYNQMIFDEEYGLCGGSFPIFIKGVPAGAITVTGLRPHEDHQVIVKALEKLF >gi|223714040|gb|ACDT01000175.1| GENE 16 16755 - 16928 111 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756344|ref|ZP_02428471.1| ## NR: gi|167756344|ref|ZP_02428471.1| hypothetical protein CLORAM_01877 [Clostridium ramosum DSM 1402] # 1 57 1 57 57 97 96.0 2e-19 MTVAGHLIKVTPNRAIHNIGLNSCISEDSSDSLNIFAKTAESIIPINIKLRIGFGKK >gi|223714040|gb|ACDT01000175.1| GENE 17 17552 - 17707 108 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732805|ref|ZP_04563286.1| ## NR: gi|237732805|ref|ZP_04563286.1| hypothetical protein MBAG_03163 [Mollicutes bacterium D7] # 1 51 1 51 51 68 100.0 9e-11 MPIKYRNTLRSKALNKEVLIELLKKKPTRKIIVTDNIKSASISRRTFYARY 
>gi|223714040|gb|ACDT01000175.1| GENE 18 17868 - 18110 63 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756345|ref|ZP_02428472.1| ## NR: gi|167756345|ref|ZP_02428472.1| hypothetical protein CLORAM_01878 [Clostridium ramosum DSM 1402] # 1 80 1 80 80 146 100.0 5e-34 MNQDFSFSFYQDIKKVIILQLLKEYSCSQEVRQTLNIFILGLVVLLREWLENPDLESMEE YGKNFSCPHKTIFSSKSSII >gi|223714040|gb|ACDT01000175.1| GENE 19 18228 - 19082 977 284 aa, chain + ## HITS:1 COG:lin2453 KEGG:ns NR:ns ## COG: lin2453 COG0561 # Protein_GI_number: 16801515 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 283 1 281 281 275 52.0 9e-74 MTKKIIFFDVDGTLVDVRPSQEYIPASTVKAVQATRAKGNLCFLCTGRSKAEIFDHIMDV GFDGIIGAGGGFVEIGDRMLYHKQVSRTAINHVVDYFEANEFDYYIESNGGLYASKNLIP RLERIMYGDLENDPVARRTKAETPNHFIGSLKEGYDLHRDDVNKICFLEKDNFPFEKIKK EFEQEFNVIHCTVPIFGDNSGELSVPGVNKASAINALIDELGIPKENTYAFGDGLNDADM LEFCQYGIAVGNAKEALKEIADEVTDDIKDDGIYNSMKKYGLID >gi|223714040|gb|ACDT01000175.1| GENE 20 19177 - 20448 1244 423 aa, chain + ## HITS:1 COG:TM0033 KEGG:ns NR:ns ## COG: TM0033 COG4099 # Protein_GI_number: 15642808 # Func_class: R General function prediction only # Function: Predicted peptidase # Organism: Thermotoga maritima # 74 404 71 384 395 74 23.0 3e-13 MESKNNNLKDLIVSGSYTAFIKGDDWGCGVNKILLSLDHKIDQVNNLSFVVKEKKLTTDY CDTLYPIIESIIYRTVTNVYLVDYSGQITSEPSNFIMIEMKISPAEGNPLLFSMQTQYNT YNDLYELDIMIADGHEMTSLGQKVKQINIAKKMKDKITSADLFEEALYESCDGVSYKYAA YNPVQASDTLVVWLHGLGEGGTINTDPYITVLANKVSSLAGQEFQNTIGGANILVPQCPT FWMDKTGDSLATKFTEADGTSYYCDSLHELINFYRKETKSEKVIITGCSNGGYMTMLMAL KYGKEYNAYVPICEAFLNEKISDEEIKQLKKLPIFFIYSRDDDVVAPETHELPTIRRLLG AGASNVHTFVSKHVIDTSGRFNDKDGEPLRYSGHWSWIYFFNNEAVSDRCGIRVWQWMAQ QLK >gi|223714040|gb|ACDT01000175.1| GENE 21 20774 - 22462 1677 562 aa, chain + ## HITS:1 COG:FN1145 KEGG:ns NR:ns ## COG: FN1145 COG1164 # Protein_GI_number: 19704480 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Fusobacterium nucleatum # 
42 558 1 551 559 145 25.0 2e-34 MKKKYRLLTCLLVFMLGLTGCTTDGYNIPQPQAEVVPEHSDLNFEDMSYERPDIDAINQK INDLLAKVVLEGNQEEILKGYDEILNDLKEVDQMESLASIKNNIDLSDSYYEEEYQFLTS AFVKLDNRMNELTEAILTTSYKDAFVKKMGEDFIERYEKNKKLNSPEIEELSEQETALIN EYSKTAAKEYTTMINGQSKTIDDLDFNKQEDIDGYYDIYEQKNKELGTIYKKLVSIRVEI AKKLGYENYTDYAYDLLGRDFSKEDAEKFEEAVLEYVAPLAAKMDTKYQEKLQKLDTSEI TVESGFSYLKTALKQEFPKAMQDAYDYMQKHHLYQIDDDSNMLHAGYTTIIGNEPFLFIN TSDYKDPSTLFHEFGHYYNFYLMGGTSWNDGNNLDIAEIHSQAFELLMFEYYDEIYGSDA KLMEIKVINDMLNSILQGCIEDEFQRKVFENPDISLEEMNTLHGQIYQKYMGYPLEYEWV DIHHHFETPFYYISYATSAVSSLELWLVGTKNREDALNAYRNMTQNTLNVDYLDALKDSG FSNPFTSKVIKNISSEIKREFL >gi|223714040|gb|ACDT01000175.1| GENE 22 22656 - 24029 1281 457 aa, chain - ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 13 451 84 527 547 177 28.0 3e-44 MFKFIEPKIQLDKKRQIFTNKHLRALIIPLIIEQLLILLVGIADTLMVSYAGEAAVSGVS LVNQLNTVFILVFTALASGGAVVASQYIGNKNKENGVLAASQLVMITTIISVILLVLSVI FREQLLTLLFGKVAPDVMDACLKYLVISAFSFPALAIYNSCAGLFRSMSKTSVIMYVSII MNIINIIGNAIGIFILHAGVAGVAYPSLISRVVAALILLIISFNQKNIIYIRIKDIFKWN AQMIKRILNIAIPNGIENGLFQLSKVALSSIVAMFGTSQIAANGVAQSFWSMAALFCIAM GPAFITVIGQCMGAGDKEAADYYMKKLLRITYIGGIIWNIVFFLFTPIILKLYSLSNETI KLIIILCFIHNLFNALLCPIASSLSNGLRAAGDVKYTMYTSIFSTVVCRVILSFIFGVWM NLGVIGIALAMVGDWAIKSALILVRYLQGQWKSFKVI >gi|223714040|gb|ACDT01000175.1| GENE 23 24142 - 25245 1217 367 aa, chain - ## HITS:1 COG:L124252 KEGG:ns NR:ns ## COG: L124252 COG0620 # Protein_GI_number: 15672700 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Lactococcus lactis # 6 357 10 357 369 238 40.0 1e-62 MLKIKYDIVGSFLRPEKIKEARAKYFNQEIDLKTLRKIEDDEIAKLVDKQVAHGLKIVTD GEFRRRWWHLDWLKEFNGFTTQHLDKTRNGVTNHIELGMINGKITYDKDKYHPEIEAWDY LFNLVTKYPGIEAKKCISGPNMILVDHFLQLGLKEVPYYNGDIDALIDDIGKAFQIAIKD LYDHGCRYLQIDDTSWTYMIDDNFLLKVSSLGYKKEDILEWFRKVSTKALENKPIDMTIA 
NHFCKGNFKGYPLFSGLYDTVAPIICQIPYDGFFVEYDDERSGSFTPWARLKNTNVTFVA GLISTKNPKLETYDEIKTRYFEAKRIIGKNIALSPQCGFASVEEGNCIDEKTQWAKIDLL VSCQDFL >gi|223714040|gb|ACDT01000175.1| GENE 24 25740 - 26864 1199 374 aa, chain + ## HITS:1 COG:CAC0767 KEGG:ns NR:ns ## COG: CAC0767 COG1453 # Protein_GI_number: 15894054 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Clostridium acetobutylicum # 1 373 1 372 376 452 58.0 1e-127 MDNKKLGFGLMRLPSLDPNDPAKIDIEQTKQMVDLFLERGFTYFDTAWMYCGFASENAAK EALVDRYPRDSFTLTTKLHAGFLKSKEDRDRIFNEQLKKTGVEYFDYYLLHDINTHSIDT YNELDCFNWIVEKKKQGLVKKIGFSYHDGPELLDKVLTEHPEFELVQLQINYLDWDSAGV QSRKCYEVATKHHKPVIVMEPVKGGTLANVPEEVTKMFKDYAPQSSIPSWAIRFVASLDN VVMVLSGMSNMEQLLDNTEYMANFKPLNDKEYKLINKAVAAINSTIKIPCTGCSYCTDGC PMNINIPKYFSLYNADLQEVAEKGWTPQGEYYANLTKTFGKASDCIACGQCENVCPQHLN IIEGLQEVATHFEK >gi|223714040|gb|ACDT01000175.1| GENE 25 26985 - 27881 733 298 aa, chain + ## HITS:1 COG:lin2846 KEGG:ns NR:ns ## COG: lin2846 COG1737 # Protein_GI_number: 16801906 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 8 268 9 268 283 102 28.0 9e-22 MLIEQKIERARLSQSQQLIADYLLEKRSNIKDMTIKELAVATYTSTGTIIRLAKKLGYHG YEDFKADFLKEVYYLDTHFKNIDPNFPFVKTDNIQKIASKVTLLAQETLSDTLALLEHDN LQKALRIMQKANQIHLAAISYSLLLGQIFKLDMLRIGKNVNICNTNGEELFLPAVIKNND CIIIISYSGEIHNLCSLARTLKGRSVPIIAITSLGDNELKKYADVVLHISTREKLYSKIA GYSNENSIKLILDILYSCYFNLMYDDNLARRIAISKQAEVGRESTLEIMKEDNTQSGT >gi|223714040|gb|ACDT01000175.1| GENE 26 27999 - 29444 1518 481 aa, chain + ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 468 3 472 473 523 54.0 1e-148 MDSNFLWGGASAANQYEGGYNLDGRGLSINDVELGAGHGKIREIHEYVHKDTYYPSHIAT DFYHHYQEDIDLMAEMGFKCFRMSIAWSRIFPNGDDLECNEAGLEFYDKIFDELIKYGIE 
PIVTLHHFELPLSLVKKYGGWRNRKLVELSVRYAKTVMERYRNKVKYWMTFNEINALFLT DRPWHMAGIIYQQDEDLNQVKFQVAHYQLLASALTVIEGHKINPNFMIGNMLLYPCTYGA TCNPTDQVIAREKLLPTYYFGDVQVRGYYTNTCKSYLKKCNGHLVIETGDEEILLQGTVD YISFSYYFSAVEGVADIEMVEGNLSSGGKNPYLTTTEWGWQIDPVGLRYSLNQLYDRYQL PLFISENGLGAIDQIEKDGTIIDDYRISYLQEHLRAMKEAIEIDLVNCFGYAMWGPIDII SAGTGEMKKRYGFVYVDLDDFGNGTLTRTKKKSFYWYQQVIASKGKNLINTEDNDNVETS L >gi|223714040|gb|ACDT01000175.1| GENE 27 29425 - 30279 763 284 aa, chain + ## HITS:1 COG:SSO1577 KEGG:ns NR:ns ## COG: SSO1577 COG1145 # Protein_GI_number: 15898393 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Sulfolobus solfataricus # 3 172 109 277 455 60 29.0 5e-09 MLKRVYNQNRCTGCGICTINCPQKILKISNGHCVITDFDKCTRCQICQQVCPYLAIEFKN EEKSTFPVLLKGVTIPFHTGCYQGMIERLLAEVCEAMKLENKLVIFKSKDARFEINVEIY GSDNYLKDALEYKHNHPEKIVVVYYTDEEPWQHKQAISDFKELDNTPITIFHMLNYFSNL KLKPTSDEYAIDLCEILCISKDAALVARGSFTDIKRITEVKRYMKEAIGHQLEANGYTFL ELTLPCHWRLLDKPQGTITSLQVIENIEWFKNIINKMYPLKKYK >gi|223714040|gb|ACDT01000175.1| GENE 28 30357 - 30588 123 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732817|ref|ZP_04563298.1| ## NR: gi|237732817|ref|ZP_04563298.1| hypothetical protein MBAG_03175 [Mollicutes bacterium D7] # 1 77 1 77 78 135 100.0 1e-30 MDRDEVLNNYYMQEDALVKYGFKRLDNLFIVEKTLKNNEFYARFEISNSCFNIDVFENNG VKIFTFLYKKCSWGLYC Prediction of potential genes in microbial genomes Time: Thu May 26 11:05:00 2011 Seq name: gi|223714039|gb|ACDT01000176.1| Coprobacillus sp. D7 cont1.176, whole genome shotgun sequence Length of sequence - 10237 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 24 - 83 4.3 1 1 Op 1 . + CDS 124 - 624 391 ## COG3279 Response regulator of the LytR/AlgR family 2 1 Op 2 . + CDS 621 - 989 153 ## gi|167754749|ref|ZP_02426876.1| hypothetical protein CLORAM_00253 3 2 Tu 1 . 
- CDS 1018 - 2028 999 ## COG1609 Transcriptional regulators - Prom 2146 - 2205 11.6 + Prom 2003 - 2062 10.9 4 3 Tu 1 . + CDS 2309 - 3580 1268 ## COG1455 Phosphotransferase system cellobiose-specific component IIC + Term 3787 - 3834 7.5 + Prom 3641 - 3700 7.8 5 4 Op 1 12/0.000 + CDS 3885 - 4697 878 ## COG3959 Transketolase, N-terminal subunit 6 4 Op 2 2/0.000 + CDS 4698 - 5642 1015 ## COG3958 Transketolase, C-terminal subunit 7 4 Op 3 2/0.000 + CDS 5642 - 7120 1482 ## COG0554 Glycerol kinase 8 4 Op 4 . + CDS 7107 - 8522 1509 ## COG2407 L-fucose isomerase and related proteins 9 4 Op 5 . + CDS 8522 - 9805 1237 ## LMOf2365_1057 hypothetical protein 10 4 Op 6 . + CDS 9805 - 10104 293 ## COG1440 Phosphotransferase system cellobiose-specific component IIB Predicted protein(s) >gi|223714039|gb|ACDT01000176.1| GENE 1 124 - 624 391 166 aa, chain + ## HITS:1 COG:SA2153 KEGG:ns NR:ns ## COG: SA2153 COG3279 # Protein_GI_number: 15927943 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Staphylococcus aureus N315 # 44 165 26 147 147 60 34.0 1e-09 MKLICNEKHRQKLETFFARYQELDIILVEQGIEYAGLYYTFTLNNLKSVESDLNEMLERY LIGYLAEKIHRINYQDIVYIEGFSKEAYLNTQTCQYLCHYKLYELEQLLDKYSFIRINRS IIVNINYIEYMLPELNSRYTLYMQNGIMLVLTRSYLKKFKERLEIR >gi|223714039|gb|ACDT01000176.1| GENE 2 621 - 989 153 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754749|ref|ZP_02426876.1| ## NR: gi|167754749|ref|ZP_02426876.1| hypothetical protein CLORAM_00253 [Clostridium ramosum DSM 1402] # 1 122 1 122 122 146 100.0 3e-34 MIEVLERIGINLSLSFIYILLFRHLITPTVIVNTDFQVIILYFLLALLISIFLEAWHYLI IDLIEYPKVTLTVIAIGMIVCLFIKTNLYMLFCLGYVVIMMLTDFYRTYKIQQALILFKN KH >gi|223714039|gb|ACDT01000176.1| GENE 3 1018 - 2028 999 336 aa, chain - ## HITS:1 COG:VCA0132 KEGG:ns NR:ns ## COG: VCA0132 COG1609 # Protein_GI_number: 15600903 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 5 335 4 332 334 154 29.0 3e-37 
MKIKISDIAKKAGVSDGTVSNALNNRKGISNEKREYILQIAREMGYFRKNSDQDKLIRLL IVNKQAHVVGDTPFFSELIRGIETECSGQGFELVINHVDAEILKNRRLEDILKTDQTSGI LLLGTEMEVQDLNYFRNICIPLVVLDTAFRDTNFDYVAINNVDGTYEIVEHLIKNGHTSI GIINSSHQINNFRERKNGYVHALNDNNLLLSPENEALVEPTPEGAYLDMKAFLNSFLAEF PKEKLPTAFYAVNDNIALGAIKAFNELDLSISICGFDDLPICELITPALTTVHVNKQYLG KTAVSRLVQKINTPDYDIQKILIATKTIQRDSVKKR >gi|223714039|gb|ACDT01000176.1| GENE 4 2309 - 3580 1268 423 aa, chain + ## HITS:1 COG:BS_licC KEGG:ns NR:ns ## COG: BS_licC COG1455 # Protein_GI_number: 16080909 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 9 414 11 433 452 267 37.0 3e-71 MENNKFINKALEVSGRIAANIYLDSISKGLMGTLPILMIGSLALLFGVFPFDPWLNFIQS TGIEKYFLAASTVTTSCLSIYASFLIGHRLAGHFKCDQIPAGLISIFAFFITTPLTEESA LSMQFMGAQGLFGAMIAALIATRLYCFFMTKEKLKIHMPEGVPPMIANTFSALIPGILVG FIFIFVSFGFSFTSWGSFSQMVYSVIVTPLNALGGSVWSLVVLLLVQMFLWFFGIHGSNV ISGVITAVYLPMATANMEAYAAGKELPNILGNTFYDAFSGIGGAGGTLSLCIILLLFAKS KQNKEMGKLGIVPGLFTINEPVVFGYPLIMNPILAIPFILTPIVQTLVAYFAMAIGLVPP LTGVQVPWCMPIIIKPLLAGGWQAAVLQVVCIAIGCLVWYPFFKVSDNQRYKEEISAKGH EEA >gi|223714039|gb|ACDT01000176.1| GENE 5 3885 - 4697 878 270 aa, chain + ## HITS:1 COG:TM0954 KEGG:ns NR:ns ## COG: TM0954 COG3959 # Protein_GI_number: 15644626 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Thermotoga maritima # 2 267 8 273 286 303 53.0 2e-82 MEIELLERQSSELRKDIVEVCYHSKAGHVGGSLSAIDILNVLYFNELRIDPNNPKMENRD RFILSKGHIAEALYVTLAKRGFFAYSELDNFSGYNTKYIGHPNNKVAGVEMNTGALGHGL AIGVGLAIAAKKMGQDYRTYVLMGDGELAEGSIWESAMAAGHYKLDNLCAIVDRNHLQIS GCTETVMTLEDLKGKWQMFGFEVIEIDGNNYEEIINAFTQAKAIKNKPSLILANTVKGKG VSFMENQASWHHGVLDAVQYKQALMELGDL >gi|223714039|gb|ACDT01000176.1| GENE 6 4698 - 5642 1015 314 aa, chain + ## HITS:1 COG:TM0953 KEGG:ns NR:ns ## COG: TM0953 COG3958 # Protein_GI_number: 15644625 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: 
Thermotoga maritima # 4 312 2 309 311 285 47.0 6e-77 MGNKIANRAVICEKLQEFAKTDQKICIVTSDSRGSASLTPFTKEFPDRVIEVGIAEQDLV GIAAGLAAAGMKPYVASPACFLTMRSIEQIKVDVAYSKTDVKLIGISAGVSYGALGMSHH SLQDIAVLNAIPNMTIIVPADPYETKKMMNKLADFHGPVYIRVGRNPVSEVYHDDNFDYE IGKAKIMHDGDDLSIVAYGEMVRVALDAAIQLELQGIQARVINMHTIKPFDQEVIVKAAK DTKRIITIEEHSINGGLGSIVSQIVANQAPCIVKTLAIPDETLISGNSQQLFEYYGLTKE NVVSIAKQLLKDEK >gi|223714039|gb|ACDT01000176.1| GENE 7 5642 - 7120 1482 492 aa, chain + ## HITS:1 COG:TM0952 KEGG:ns NR:ns ## COG: TM0952 COG0554 # Protein_GI_number: 15643714 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Thermotoga maritima # 3 477 2 490 492 409 44.0 1e-114 MKYILTIDQSTFSTKVFIIDEESNMLASSSYQHQQYYPQNGWVEHDGEEIYTNLLKAVST LKKEFPELVAKVAGIAITNQRETTILWDKRTGKPVHHAIVWQCRRTASMCEKLKIHQDLV EKLTGLKINPYFSAPKIKWLIENCKLDEIDNYAFGTMESWLIYNLTNGKKHVSDVTNASR TLLMNIETLKWDEMLCSIFDIPSIILPAILNNDEIFGYSDVNGILDHEVPICGVIGDSQA ALYAQHCFETGSMKVTLGTGSSVMINAGSCKPQLTKGIVNAVGWCVENRIIYAQEAIINC SCDTLNWLKNQIGLFKKEAQLNSIWDEIENNDGVYLVPAFVGLSVPWWNDQARASICGLA RNHDYKYILRAGLESIAYQIYDAATALGYTSNQIMNIDGGAVSNIGLLQFIADILNVEVK VSYNYALSAMGAYEISRLKLFGQTEYYEDKKKYFPTMDEKVRNANIYGWHQAVKGVLAVA KLDNEVEINDKY >gi|223714039|gb|ACDT01000176.1| GENE 8 7107 - 8522 1509 471 aa, chain + ## HITS:1 COG:TM0951 KEGG:ns NR:ns ## COG: TM0951 COG2407 # Protein_GI_number: 15643713 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Thermotoga maritima # 1 399 1 402 471 144 28.0 3e-34 MINIKLGVIPTRRDIFSKEEALKYKELILQKLATMNITFVDIEDANDEGLLFDDNDVDKV VDKMIKEKVDALFFPHCNFGTEYLVAQVAKKMNLPILIWGPRDESPLGDGSRLRDSQCGL FATGKVLRRFRCKFTYLPMCRLEDQEFYEGIRRFLATVNIIKELKDLTILQIATRPSGFW TMMVNEGELLEKFNIKIHPVTLAEVKDEMDAVEQNKKDEVIQVIDYLKKETIIEISDEAL VKVAALKVAIKSIAKRYNCKAAAVQCWNAMQGVLGIFPCAANALLTDEKFPVACETDIHG AITSIIAQAASIEDNVTFFADWTVPHPTNNNAELLQHCGPWPISLMEEKPRLGAPFAFNH SHPGSLHGKIKEGEMTILRFDGDNGEYSLLMGKAKTIPGPFNQGTYIWIEVDDLKKLEDK LVCGPYVHHCTGIYEDILPQVFEACKYIEGLTPDPYDCDNEVLKALVRGEK 
>gi|223714039|gb|ACDT01000176.1| GENE 9 8522 - 9805 1237 427 aa, chain + ## HITS:1 COG:no KEGG:LMOf2365_1057 NR:ns ## KEGG: LMOf2365_1057 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_F2365 # Pathway: not_defined # 3 424 2 419 421 371 46.0 1e-101 MGLKVAINKKSFEKNGKPFFYLADTCWSAFTNINENDWKYYLQYRKRQGYNVLQINILPQ WDASATDLNHSPYQKDEKGNYQFDKLNDSYFNHARSMCEVAKEYGFELALVVLWCNYVPA TWANNMLSHNTMPYEEIDGYIKKVNETFTDLDPLYVISGDTDLNEEISCSYYLKASQLLH ELAPNCLQSLHVRGRLMEIPEKLIKQMDFYMFQSGHNAKLENRAMPYCLPEYFIEHYPVK PLLNSEPCYEQMGYSSNMYGRFHQFEIRRAAWQSILTGAFAGITYGAAGIYSWHTYGKEF ANEVGEGFDSPNPWHLAIHYPGAWDYSDIKVIFDNYEITELSSAQNLLMNGIDDIRAALT NNKLLLIYVPENTKIKLKLETTAIDKITIIDLETRHREVGTVSSCQGGCMVGMHIFEHDA LYVVDLK >gi|223714039|gb|ACDT01000176.1| GENE 10 9805 - 10104 293 99 aa, chain + ## HITS:1 COG:lin2905 KEGG:ns NR:ns ## COG: lin2905 COG1440 # Protein_GI_number: 16801964 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 1 92 1 93 100 60 34.0 9e-10 MKRILLLCAAGITTSLLTKKLDDLIREVNEEYQVIACPVAEVATQGIQAEILLLSPQVRF NYAKIQQMFPDKLVLKIGAEDYKNLNAEKIIKDLKNNLK Prediction of potential genes in microbial genomes Time: Thu May 26 11:05:15 2011 Seq name: gi|223714038|gb|ACDT01000177.1| Coprobacillus sp. D7 cont1.177, whole genome shotgun sequence Length of sequence - 12620 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 746 - 5965 5529 ## Elen_1817 coagulation factor 5/8 type domain protein 2 1 Op 2 . + CDS 5965 - 7203 1322 ## gi|237732774|ref|ZP_04563255.1| predicted protein + Term 7211 - 7268 4.2 3 2 Tu 1 . - CDS 7860 - 8177 192 ## gi|167754742|ref|ZP_02426869.1| hypothetical protein CLORAM_00246 - Prom 8285 - 8344 9.2 + Prom 8233 - 8292 7.5 4 3 Tu 1 . 
+ CDS 8467 - 9228 788 ## COG1349 Transcriptional regulators of sugar metabolism + Term 9233 - 9261 -0.9 + Prom 9249 - 9308 12.6 5 4 Op 1 11/0.000 + CDS 9355 - 11709 1998 ## COG1882 Pyruvate-formate lyase 6 4 Op 2 . + CDS 11696 - 12613 588 ## COG1180 Pyruvate-formate lyase-activating enzyme Predicted protein(s) >gi|223714038|gb|ACDT01000177.1| GENE 1 746 - 5965 5529 1739 aa, chain + ## HITS:1 COG:no KEGG:Elen_1817 NR:ns ## KEGG: Elen_1817 # Name: not_defined # Def: coagulation factor 5/8 type domain protein # Organism: E.lenta # Pathway: not_defined # 120 1732 152 1779 1787 1074 38.0 0 MKKILAFFLVVVVLLGSTQLTIYADNESQGQEITTSVTNGESTKNENDSSMITNEKTVNN ESTSPEKTADVKNSSITGKVKLIINTQYFKKNISTLSAQLANIEQTYTGELDSTKENSQI FEFKNIVNGTYTLTIQGDGYETVSKEVVVNNNETTVKYVNTNSVADNIEGNGFGTFAYGD FNKDGKIDSKDKKILSTAIFNNDSSNSLLDIDDNNKIDLLDLHTFTYAYSDNGTTQVTRL EPVIETQPLLVNNVSVNNDELQGTLEGKIENIFSETAQEETLKLKPEKDEEISQSNPIQM SLEVKKPVAVTQLTIQGTKENAITEGSVIVVAENGEEIEAVISTAKSRARMASRQAIVQA DGSIVIDLGTQIAVKKVTIKVTGTSSKSLAEISKVEFLNGMEDHIPAPKLDIPTGITVDE LDEELYIKWDALNNVTGYKVKITYNGKTEEHASNVNSFNLKSFNDKDLLNGKTYKISVKA INDADKNSLWESDYSAAVEATPQASKVPDAPDNVKAVGAYRSINVSWKDMKSTDSYNVFY REKGNGEYIEAVSKYEGTSYTITSLKDNTKYEVYVTGNNKKGTGPASLVSEATTVNILPA QLPQYKLLNTSNGEGKLSNHINNVYYGNNGVTEIVNSPLDEVSTTTNSALALVDDNYDSY LQINDWDFGVSYHRNWRLTFEFDDTYEMNYISFAGPVNDSSINNIGISYYDESGKEVDAS IDAFRRKTDDNGRIYFIAHLAKPIKTNKVRFGVQSSNRTMRISEFNFYYYDSLEEDVNAL FTDSFHLTVRDDVTSTTLDDLQTRLNTPDEVSQELHPFKDLIQLELNQARQVVEGTALQN IQEIHNGIAASKQGNLGFGGLNSWQPLGYVTYPGDTFIVYVGQEGKRNGQAVNLQLVYSQ YHAESASFVSSPISLKVGKNEISMKELQSIGVEKGGSVYVQYTGNSNEKIAVRVSGGEKI ATLDLYQVSDENERLEKVKTYLQSLQTQINKMASKHEELHRDDNSVNYDYDEKNCILGAT EIMLDHMLYSVSGKQIMAGLKGTTLDEKANQLLNSLNAMDQMMELFYQNKGLNENAAAIN DRYPAQHLNIRYQRMFAGAFMYAGGNHIGIEWGSVSGLSNGIPFEAAENGKYLSGSLFGW GIAHEIGHNINQGSYAIAEITNNYFSLLSQNRDSNDTTRFKYPDVYEKVTSNTVGMSSNV FTQLAMYWQLHLAYDQNYHYKLYDSHEEQLNSLFFARVDYYARNPGKVNIPEGGTALKLN SDVQQNFVRLASAAANKDLTDFFTAWGIIPNEETKAFISQFEAETDKIQYLDDDSMAYRL DGKARMSVDTEINASLSNDKNSNQVTLTISNTNKVDGAMLGYEILRNGQAVGFVNAETGE 
TTFTDTVSTVNNRVFTYSIIGYDKLLNRTAELTLKPIKISHDGSLDKDQWLIETNLISDN DEAVGDEDSTPCLPEQKAVNDLIDNNYQNIYKGSTDTKQGEFIVSLNTLADITGVKLTNP SFNQVQIYVSDDKDNWTLVKEDSLKTGEENKLYFSQIVNDELENNKSLVSYSSSYIKIVA VDENDISLSEIDVLGPTGDNIDMSQENSIGILSGDYKLDANNTIPKGSLIFTGTYAGNPA YNVVELRYNKDEILSGYQAIFAEDPGDGDLADVSEGTWIYYLIPETDENGKIIENSFGYQ DDEGNFVKVELPGSIKPELYRVNDAQTNAGQRLVSDSFEVKIPTELPSITIGNQKRGGK >gi|223714038|gb|ACDT01000177.1| GENE 2 5965 - 7203 1322 412 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732774|ref|ZP_04563255.1| ## NR: gi|237732774|ref|ZP_04563255.1| predicted protein [Mollicutes bacterium D7] # 1 412 1 412 412 741 100.0 0 MKKIFQFFIAVVMCLSLSTNIYAINDNEHEITTDKPHIILTPNQAGDELTVKLHLSQEMF ERQIATMHLSINTSKSIQEYQNAFMLDEGLKKLASIYDYNLSENKIDLLLSNKTKLFEQQ EITLGTIRIDSNKDFEIEFTPAKITYVYVNDKAENSIEFDCPAVILSSVSGGDSGDNGQD PADKITVETVVEHLGGSKLELAEESQKVVQTSVMSYLNKHYAEILGNLPEGTVIKAQLSL KNLTESDITIDQKDKIEKVLEDGAKILAYYDISITATAYSKDDVIIPEINGVEISEMASP IKLSMTIPSEFIKKNRQFGIVRLHDNEANGLKSSLANENIITFETDRFSIYTLVAKDYTQ AEVNKDPDVLNNSIVKTGDDTSSMLPMIFMAASLLVIIGILEMKRREVNKQK >gi|223714038|gb|ACDT01000177.1| GENE 3 7860 - 8177 192 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754742|ref|ZP_02426869.1| ## NR: gi|167754742|ref|ZP_02426869.1| hypothetical protein CLORAM_00246 [Clostridium ramosum DSM 1402] # 1 105 1 105 105 160 100.0 2e-38 MPFFQTHSKSLIFYLQNPELYSPTIIFMMIAIIIVTAGYGNRQIIVIDKKYFKYCSNQNL WFNYQQFYLIIFNWEPTFDIQIPLVNIAKIALSYSDIYILCNQIY >gi|223714038|gb|ACDT01000177.1| GENE 4 8467 - 9228 788 253 aa, chain + ## HITS:1 COG:CAC0113 KEGG:ns NR:ns ## COG: CAC0113 COG1349 # Protein_GI_number: 15893409 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Clostridium acetobutylicum # 1 253 1 253 253 188 41.0 7e-48 MLQQERHNQIIAKLDLEGQVRVRDLSKDFNVTEDCIRKDLTILEKNGKLKRIHGGATQIR TNLHRVNVGERIGLNSPEKKLIAQKAVEIIEPGTMVFLGISTINIELAKLIYQRDMNVTV VTNMIELINIFRGECRARLIFIGGEFNNPKDGFLGAITIDQIKSYKFDLSFLGVVGIDLH 
SGNITTYDIDDGLTKKQVINSSKKSYMLAETAKLKLDGNYVFSHIDDFTGIICEKAVTLQ QRELINKYGLDIL >gi|223714038|gb|ACDT01000177.1| GENE 5 9355 - 11709 1998 784 aa, chain + ## HITS:1 COG:ECs4880 KEGG:ns NR:ns ## COG: ECs4880 COG1882 # Protein_GI_number: 15834134 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli O157:H7 # 7 782 2 764 765 593 40.0 1e-169 MDNLEQTPRITLLKEKMLNEPRYVSIEQARIITRIYQENESLSIPKKRALSLKAALEELE IGVEKEELIVGNRTKGVRYGVVFPESGCSWVNKEFETLPTRPQDKFRIKKEDVKEFKEII YPYWQDRSLEDVIKENYGEEINAIAKVVKINQKDHAQGHICPDTKTWLELGPKGLMTKAY EKLKNCDENQKEFYECTIIVLEGVCHFMMRYHDYILTMLESLEDDNKKSLQRVADICANL ASRPAQSFHEAVQSLWFLFVVLHMESNASSFSPGRMDQYLYPYYQKDIEKGIISKQEALE ILECLWLKFNQIVYLRNQHSAKYFAGFPIGFNIAIGGIDENGCDIYNELSLLLLKAQYHL GLPQPNLSVRLNKNSSHELIQEAIKVVAKGSGMPQFFNDEAIVNSMIKDLGIEEKDARNY AIVGCVELTTHGNNLGWSDAAMFNLNKALELTMNHGKCLLTNEPIGLDLGSIETYESFED LENAFQKQIDYFIEKMMKAEIVVEKAHQDCLPTAFLSTVIDSCLEKGVDVTRGGAKYNLS GIQMIQIANLADSLAAIKVLVYDEKMITRHELLEALQADFKGYEIIQTMLLNKVPKYGND VKWVDELGAKWAGYFRERMKDYTNYRGGLYHTGMYTVSAHVPMGENVGASPDGRNALTPL ADGGMSPVYGRDMSGPTAVLKSVSRMKDSYTTNGGLLNMKFLPEFFKAETGMMKFENFLR AFVDLKIPHIQFNVVRREDLLDAKLHPEQHRSLTIRVAGYTAYFVELAGKLQDEIIERTA YEDI >gi|223714038|gb|ACDT01000177.1| GENE 6 11696 - 12613 588 305 aa, chain + ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 8 273 8 269 302 158 31.0 1e-38 MKISKALVFDIKRFAVHDGYGLRTTVFFKGCPLRCKWCQNPEGLSSQRRPIYFENSCIHC QRCVEFSKKNQIKYENNRPYFNLQYEGTFDNLVKACPGNAIRYDSEAYDIKQLMEKIKED RVFFRDDGGVTFSGGEPLMQGEFLYEILKACQEEKIHTAIETTMYGSLELIKKILPYLDL IYIDLKVFDEKRHMELTNVSSKMIKQHIEYILESEYRNKVIIRTPLIPTMTATDHNIKSI ANFLVNIYPEVKYELLNYNPLAFAKYELVDLEYEVDKQLKMFDKEQMEHFHQLVYQTGLK NLIIE Prediction of potential genes in microbial genomes Time: Thu May 26 11:05:52 2011 Seq name: gi|223714037|gb|ACDT01000178.1| 
Coprobacillus sp. D7 cont1.178, whole genome shotgun sequence Length of sequence - 4169 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 21 - 512 448 ## Cthe_1362 NusG antitermination factor 2 1 Op 2 . + CDS 514 - 1011 369 ## Cthe_1362 NusG antitermination factor + Term 1012 - 1051 5.9 + Prom 1064 - 1123 3.9 3 2 Op 1 . + CDS 1162 - 1401 208 ## gi|237732783|ref|ZP_04563264.1| conserved hypothetical protein 4 2 Op 2 3/0.000 + CDS 1413 - 1973 556 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 5 2 Op 3 . + CDS 2011 - 2565 720 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 6 2 Op 4 . + CDS 2544 - 2798 228 ## gi|237732786|ref|ZP_04563267.1| conserved hypothetical protein + Term 2936 - 2970 -0.5 + Prom 2872 - 2931 4.7 7 3 Op 1 . + CDS 3026 - 3802 613 ## COG0451 Nucleoside-diphosphate-sugar epimerases 8 3 Op 2 . + CDS 3804 - 4167 152 ## LEGAS_0704 glycosyltransferase Predicted protein(s) >gi|223714037|gb|ACDT01000178.1| GENE 1 21 - 512 448 163 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1362 NR:ns ## KEGG: Cthe_1362 # Name: not_defined # Def: NusG antitermination factor # Organism: C.thermocellum # Pathway: not_defined # 1 161 1 165 185 88 31.0 6e-17 MYWYVARCKTGRTKKLVSTLNKQVNMNAFIPKSERCFGRGETAEFIVKEIYPDYVFIKSD LDQEAFDEQFKEYFKTINGLVDLLEYKDTYPLTSEEQSLLEKLLDNTDTIKHTKGVIVDR RFVPTDGPLVGLEDMIKKVDKYRRFATLDTEIFTGKLLVAIDY >gi|223714037|gb|ACDT01000178.1| GENE 2 514 - 1011 369 165 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1362 NR:ns ## KEGG: Cthe_1362 # Name: not_defined # Def: NusG antitermination factor # Organism: C.thermocellum # Pathway: not_defined # 1 164 1 172 185 62 28.0 8e-09 MNWYVLYVFSNKTNKILSNLNQRKELTAFIPKTEVFHRQAKKKTTKDMFDNYIFVKSDLK QNDFNDLLLSMKDKNDGLIKQLENAEVSALREKEIEFFNNILDKDNVARVSVGYQEEGKT IITEGPLLHYQDHIVRVMKHHCTAQLDLPFFDRKIILGVELISKN >gi|223714037|gb|ACDT01000178.1| GENE 3 1162 - 1401 208 79 aa, 
chain + ## HITS:1 COG:no KEGG:no NR:gi|237732783|ref|ZP_04563264.1| ## NR: gi|237732783|ref|ZP_04563264.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 79 1 79 79 116 100.0 4e-25 MTKILEKIESEVICIIDDKQYQYTNGKEAYQQLTNNYSITSIKAFNNQIILNLNPKENNK EQDWQEEYKKQFGEEPSFF >gi|223714037|gb|ACDT01000178.1| GENE 4 1413 - 1973 556 186 aa, chain + ## HITS:1 COG:NMA0639_2 KEGG:ns NR:ns ## COG: NMA0639_2 COG0110 # Protein_GI_number: 15793627 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Neisseria meningitidis Z2491 # 25 186 1 172 190 79 29.0 5e-15 MNLLIIGAGGHGRCCLDIARDMNIFDKISFLDDQNINEVINDCKIIGSIDEMSSYYPEYT HIHIAIGNNKLRSKLLLQAKEIGYSLPILQHSSSVVSNYASINEGTIIFPHAVIEPNATI GKGCIITANTTINHDAMINDGCLIYSNSIIRPMSVIGSNTRIGSGCTITFGTDIKEETDI KDGSII >gi|223714037|gb|ACDT01000178.1| GENE 5 2011 - 2565 720 184 aa, chain + ## HITS:1 COG:BH3716 KEGG:ns NR:ns ## COG: BH3716 COG2148 # Protein_GI_number: 15616278 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Bacillus halodurans # 9 176 2 169 207 249 73.0 3e-66 MITDKQKKYLLLKRGVDIVLSGGAIVVLSPVLGLLALAIKLDSKGPVLFKQKRVGKDKEL FEIYKFRTMRTDTPSDMPTHMLKDPDQFITKTGKLLRKTSLDELPQIFNIFTGKMSIIGP RPALWNQDDLIAERDKYHANDVTPGLTGWAQINGRDELEIDVKAKFDGDYVNEMGLKEWI SNVS >gi|223714037|gb|ACDT01000178.1| GENE 6 2544 - 2798 228 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732786|ref|ZP_04563267.1| ## NR: gi|237732786|ref|ZP_04563267.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 84 1 84 84 149 100.0 6e-35 MDIKCFLATIGSVLLGDGVVEGGTGELEETVDTQIVDPEKINKEAKIGAAVVGTAGAVGL AGLGFVCKHIKNNKKRINFIKDYY >gi|223714037|gb|ACDT01000178.1| GENE 7 3026 - 3802 613 258 aa, chain + ## HITS:1 COG:BH3715 KEGG:ns NR:ns ## COG: BH3715 COG0451 # Protein_GI_number: 15616277 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar 
epimerases # Organism: Bacillus halodurans # 6 255 40 281 282 240 52.0 2e-63 MLDPNWKEFDFSKYDTVFHVAGIAHADVGNVSEEVKQKYYQVNTDLTLEVANIAKNAKVR QFIFMSSMIVYSGCGTTHITKDTKPKAENFYGDSKLQADLKLQEMNDDSFKVVVVRPPMI YGKGSKGNYPQLVKLATKLPVFPIVNNKRSMLHIDNLCEFIKLMIDNEESGVFFPQNDEY TNTSDMVQMIASIKGHKIILLPGTSLFIKLLGNIPGKIGRLASKAFGDSYYDMNLSDYVE GYRINNFRHSIEMTEGEN >gi|223714037|gb|ACDT01000178.1| GENE 8 3804 - 4167 152 121 aa, chain + ## HITS:1 COG:no KEGG:LEGAS_0704 NR:ns ## KEGG: LEGAS_0704 # Name: epsF # Def: glycosyltransferase # Organism: L.gasicomitatum # Pathway: not_defined # 2 121 3 122 383 99 39.0 3e-20 MKRVLVLASVASMIDQFNMPNIRILQKLGYEVDVACNFIEGSTCSDEKIEQLKSKLSSMN IRCFQIDFARNITNMQKNIKAYWQVLNLVNKNDYIFVHCHSPIGGVIGRIICRKKHLKVI Y Prediction of potential genes in microbial genomes Time: Thu May 26 11:06:12 2011 Seq name: gi|223714036|gb|ACDT01000179.1| Coprobacillus sp. D7 cont1.179, whole genome shotgun sequence Length of sequence - 8829 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 493 280 ## Mbar_A0080 hypothetical protein 2 1 Op 2 . - CDS 480 - 728 194 ## Mbar_A0080 hypothetical protein 3 1 Op 3 . - CDS 721 - 1425 179 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 1499 - 1558 11.1 - Term 1525 - 1569 7.3 4 2 Op 1 . - CDS 1580 - 2896 1544 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases 5 2 Op 2 1/0.000 - CDS 2883 - 3632 832 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 6 2 Op 3 13/0.000 - CDS 3642 - 4958 1466 ## COG1455 Phosphotransferase system cellobiose-specific component IIC - Prom 4981 - 5040 6.9 7 2 Op 4 2/0.000 - CDS 5065 - 5715 743 ## COG1447 Phosphotransferase system cellobiose-specific component IIA 8 2 Op 5 . 
- CDS 5730 - 7613 1532 ## COG3711 Transcriptional antiterminator - Prom 7728 - 7787 7.5 + Prom 7518 - 7577 8.1 9 3 Tu 1 . + CDS 7766 - 8335 244 ## Cbei_2704 RDD domain-containing protein + Term 8342 - 8385 9.5 - Term 8324 - 8380 15.0 10 4 Tu 1 . - CDS 8393 - 8638 229 ## gi|237732744|ref|ZP_04563225.1| predicted protein - Prom 8706 - 8765 4.8 Predicted protein(s) >gi|223714036|gb|ACDT01000179.1| GENE 1 1 - 493 280 164 aa, chain - ## HITS:1 COG:no KEGG:Mbar_A0080 NR:ns ## KEGG: Mbar_A0080 # Name: not_defined # Def: hypothetical protein # Organism: M.barkeri # Pathway: not_defined # 8 164 86 242 522 77 29.0 2e-13 MNVFNFRAMLQLLFFLSLGGAFFNTGMFDPSKDKYYAVILMKMDGREFVLSQYLYTLFRH YVGYMTAFMILSFTLQYPFWLSYILPLFIIGLKLMAVSAYLKDFKRNRKVRSENNNSPAI IGFSIGCLLVGYALIFSNLIIPIKMMLIVLIILSILAVILMSYV >gi|223714036|gb|ACDT01000179.1| GENE 2 480 - 728 194 82 aa, chain - ## HITS:1 COG:no KEGG:Mbar_A0080 NR:ns ## KEGG: Mbar_A0080 # Name: not_defined # Def: hypothetical protein # Organism: M.barkeri # Pathway: not_defined # 1 64 1 64 522 70 54.0 2e-11 MLDTFLISFRLKNTYRVNSIIYMFKQIPIIRRVLPMSLYKNQSLKIFVTILNTIWEIISV FLGKNNIFCNYAHNTFEFYECI >gi|223714036|gb|ACDT01000179.1| GENE 3 721 - 1425 179 234 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 3 207 5 214 305 73 28 5e-13 MKLIIEHLQKNFEDKKVLEDISFCFEKGKIYGLLGRNGAGKTTLFNCLNQDLEIDKGDFY LEDNGRKPLTMSDIGYVLSTPTVPEFLTAREFLKFFIEINIKNIKNLKTIDQYFDMVNLL EEDRDRLLKDFSHGMKNKMQMLIQFIAHPDVLLLDEPLTSLDVVVADEMKKLLKSIKKEH IIIFSTHIMELAVDLCDEIVILNKGQLSLINKEDLDDDAFQEKIIKILKDEDHA >gi|223714036|gb|ACDT01000179.1| GENE 4 1580 - 2896 1544 438 aa, chain - ## HITS:1 COG:BS_licH KEGG:ns NR:ns ## COG: BS_licH COG1486 # Protein_GI_number: 16080907 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Bacillus subtilis # 2 437 3 437 442 558 62.0 1e-158 
MKGVKIVTIGGGSSYTPELVEGFIKRYAELPVKELWLVDIEEGKEKLEIVGNLAKRMIKK AGLDMEVHLTLNRREALKDADFVTTQFRVGQLDAREKDELIPLKYGVIGQETNGPGGMFK ALRTIPVIFEIIKDCEELCPNAWIISFTNPSGINAEAVFRHTGWKKFIGLCNGPVNMIKD IEKILKVKEEDHLDVKIGGINHMIYALDIQLNGKNTNRNLIDTMINDGNKESLKNIVDLP WSNEFLNSYGYLGIGYLRYYLQKQQMLEHCQEDAAKHQCRAQVVKKVEKELFEKYKDENL DVKPKELEQRGGAYYSDAACNLISSLYNDKGDIQVVDTLNNGAITNLPNDVVVEVSSIIT KDGPVPIPVGELPVQVVGLIQQLKAFEILTTNVAVSGNYDDALVAMAINPLVQSEITARQ ILDEMLEAHKKYLPQFFK >gi|223714036|gb|ACDT01000179.1| GENE 5 2883 - 3632 832 249 aa, chain - ## HITS:1 COG:BS_ybfT KEGG:ns NR:ns ## COG: BS_ybfT COG0363 # Protein_GI_number: 16077305 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Bacillus subtilis # 24 246 24 248 249 187 43.0 2e-47 MQVYVFKTEQDVDTYVGQQISTFIQDNDAPVIGFATGSTPLGAYDYLIDSYQSGKTDFSK VRAFNLDEYVGIEKDHPQSFARAMKDYLFSKINIKEENIYSLNGNAKDMTKECKEYDQLI INNPIDIQILGIGMDGHIAYNEPGSSFDSESHVVDLHPESIQSSLDYGFTKIEDVPTQGV TQGIKTIMKARQLIMIAKGNKKAKLVERMLYGPVSEDFPSSIIQTHNNVIVVLDQCAAAN LKEETYERR >gi|223714036|gb|ACDT01000179.1| GENE 6 3642 - 4958 1466 438 aa, chain - ## HITS:1 COG:BS_licC KEGG:ns NR:ns ## COG: BS_licC COG1455 # Protein_GI_number: 16080909 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 1 437 1 440 452 323 40.0 3e-88 MDRFTQMLEKTLMPIAEKVEQNRYLSAIREGFTSIMPFLIIGSIFLLLSNLPSEALNNFL GSIFGAENIAKLGYLLEPTFNIMALLVIIGIAKSLLEYHHIKDTSALILPIITFLLLNNF TITLTTEAGETVTSGGVIPMGYLGAAGLFVAMICTILTVESYHWIKERGWVIKMPDSVPP AVSRSFSSLIPAAIILTVVFLIKVIFEYTPYATVFDLIYQCLQQPLSALGNSLPSQMISE GLIGLFWCFGVHGDNVVSAIMGPIWRGLSAENLALVQAGKEPINIICQQFRDVYLIAGGT GATLSLLVSIWFGAKSPELKTVAKLSGPAAIFNINEPVIFGIPIVLNPIMMIPFVIVPIV LCVTTYLAMSLGLVPLLQGIEIPWTTPVFISGWIAGGWNALILQIVNFAVATAIYFPFVK VLDRDLLKNAKKKELLQE >gi|223714036|gb|ACDT01000179.1| GENE 7 5065 - 5715 743 216 aa, chain - ## HITS:1 COG:BS_licA KEGG:ns NR:ns ## COG: BS_licA COG1447 # 
Protein_GI_number: 16080908 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Bacillus subtilis # 1 100 5 104 110 80 47.0 2e-15 MNETSFELILHSGNSKSSSMEAIEYARQGDIQEADFKMNEAQEELLLAHKIQSKLLAKSA KVDHFNPNMLLVHAQDHLCGAQTQLEMAKEIISLYEQVQEIKTYLGIENFQKQKNMRVLL VCGQGMSTSLLVQTMYLYADEGDYIESSSFEELVGVIGDYDVVLVSPQIRYRMPVIERMM TLRTQIVGLIDMKAYGKLDGQKIYNQAKELFMKIKH >gi|223714036|gb|ACDT01000179.1| GENE 8 5730 - 7613 1532 627 aa, chain - ## HITS:1 COG:BS_licR_1 KEGG:ns NR:ns ## COG: BS_licR_1 COG3711 # Protein_GI_number: 16080911 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 26 480 25 495 499 186 26.0 1e-46 MLNERQKQIFLCLDENKNSFIKGIELANKVNCSLKTLQNTVKEMKIILEKYQLEIISMTS KGYNLIIHDQQEYNDFKQLLLESDDKKDFNNQGYRITYILSTLLSDNHFVKADDLASQMY VSRSSVSSDLKIVKRVLEKYQLKIEHKPNYGMIVVGLEKDKRDCIIKEQLELQLGHSTIK KEWLATISDIVVDSLMKAKYRISDVVLQNLVLHICVSIQRMQHGIYIDNPQMARQYGHEH VIAKKILDELASIYHFIITENEISFLAIHLLGKRSYEENDLISKEIDYFVNGMLEEIKIK TDIDFYGDVELKISLALHLVPLLVRLESQMQLQNTMIQDIQANYPLAYDVAVIAASYISE QKHYQLSSDEVGYLAVHFSLSLSKKENKINPKKVLIICNARRGDYLMIQHTFLKEFNEMI SDLEIINALEIPQYNLDDYDCIFATFLNHPLIPKRALRINFFIDQKDIQRIKNSLQGQSE AGELLKYFKEEYFMGMIDAKNQKEVIQLMCEVANKYNEFEEDLFQSCIRREELGSTAYGQ LIALPHPDHLISQKTIVITAILDKPIVWSKQKAQLIFLICVEKGNQKDLRVLFECISKFM MDQQSVQDVICRGDYQTFTKNLGKLIG >gi|223714036|gb|ACDT01000179.1| GENE 9 7766 - 8335 244 189 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2704 NR:ns ## KEGG: Cbei_2704 # Name: not_defined # Def: RDD domain-containing protein # Organism: C.beijerinckii # Pathway: not_defined # 19 186 23 196 199 88 37.0 1e-16 MKRIFNHSSVHNHQAPFVKRLIAYFIDWYIISVMTILPINLIYSIIYHQKNFTSSIVNLP LIPAISAFLIGLLLSLLYLVYFPYKYNGQTIGKKIFGLKIVKNNEINIDLKTLLIRNGVG LILIEGTFYSCSIYFWELINIIFEASIASFALSILGIVSFISMFMSLLNSNHRMLHDYLS KTSVVTVHK >gi|223714036|gb|ACDT01000179.1| GENE 10 8393 - 8638 229 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732744|ref|ZP_04563225.1| ## NR: 
gi|237732744|ref|ZP_04563225.1| predicted protein [Mollicutes bacterium D7] # 1 81 1 81 81 144 100.0 2e-33 MACMVIYGDFSNIADTYLSLVNWLQKMLSIRFQDPIALSVVEVIVIDYQTEIVSIPYIDI NPYTPNLEESAYFAWADEFDF Prediction of potential genes in microbial genomes Time: Thu May 26 11:06:29 2011 Seq name: gi|223714035|gb|ACDT01000180.1| Coprobacillus sp. D7 cont1.180, whole genome shotgun sequence Length of sequence - 15280 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 8, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 225 - 533 272 ## gi|167756319|ref|ZP_02428446.1| hypothetical protein CLORAM_01852 - Prom 714 - 773 7.2 - Term 757 - 804 -0.6 2 2 Op 1 . - CDS 849 - 1616 352 ## DSY3601 hypothetical protein 3 2 Op 2 . - CDS 1613 - 2713 964 ## COG0535 Predicted Fe-S oxidoreductases - Prom 2750 - 2809 2.4 4 3 Op 1 . - CDS 2827 - 3693 570 ## DSY3599 hypothetical protein 5 3 Op 2 . - CDS 3669 - 4241 342 ## DSY3598 hypothetical protein 6 4 Tu 1 . - CDS 4580 - 5413 714 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family - Prom 5435 - 5494 10.8 - Term 6082 - 6120 7.2 7 5 Op 1 . - CDS 6124 - 6702 610 ## COG0406 Fructose-2,6-bisphosphatase 8 5 Op 2 . - CDS 6716 - 7693 1134 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 9 5 Op 3 1/0.000 - CDS 7708 - 8118 536 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain 10 5 Op 4 3/0.000 - CDS 8136 - 8879 1002 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit 11 5 Op 5 2/0.000 - CDS 8893 - 10095 1113 ## COG0477 Permeases of the major facilitator superfamily 12 5 Op 6 . - CDS 10079 - 10450 257 ## COG0789 Predicted transcriptional regulators - Prom 10529 - 10588 8.6 - Term 10557 - 10599 4.9 13 6 Op 1 . - CDS 10608 - 11405 950 ## BDI_1568 hypothetical protein 14 6 Op 2 . 
- CDS 11483 - 12364 850 ## COG0583 Transcriptional regulator - Prom 12422 - 12481 5.3 - Term 12709 - 12753 0.3 15 7 Tu 1 . - CDS 12890 - 13312 504 ## ZPR_2888 hypothetical protein - Prom 13372 - 13431 10.9 16 8 Op 1 . - CDS 13508 - 14875 1343 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 17 8 Op 2 . - CDS 14887 - 15279 264 ## EUBELI_20127 hypothetical protein Predicted protein(s) >gi|223714035|gb|ACDT01000180.1| GENE 1 225 - 533 272 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756319|ref|ZP_02428446.1| ## NR: gi|167756319|ref|ZP_02428446.1| hypothetical protein CLORAM_01852 [Clostridium ramosum DSM 1402] # 1 102 1 102 102 172 100.0 6e-42 MGGVEQFISNYYHHSDFTLEAFRDLELTISKDYLQAAINIYPNKRIEMYSNQEISNQLDQ IDEQYYADHEIEYFDFIDKYTEGNIVKMDEIMFKIGAFSKLT >gi|223714035|gb|ACDT01000180.1| GENE 2 849 - 1616 352 255 aa, chain - ## HITS:1 COG:no KEGG:DSY3601 NR:ns ## KEGG: DSY3601 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 5 222 8 223 260 146 35.0 6e-34 MRKQVTYLGNILCYEIENLNKPVCICFHGMNMTREMYKTDQLQHLLKNYSLVLVDLCGYG DSSIETKNFNMVVFNQLVLKLVKCENIKSCTIVGYCLGGVFALDFAIRNPYFIERLILIE TMIYLPKWLWLTLLPGYCSGYRIFQKQIKLFKILECCTKFKNISSSERIRISSREWNRTV NSFYLKLMNEYEKQNHIKRCQQINCPVNIIYSQASFKNIRKTAQILSQFSFVELHLCQGR GHFLFLDETMNRIVI >gi|223714035|gb|ACDT01000180.1| GENE 3 1613 - 2713 964 366 aa, chain - ## HITS:1 COG:MK0980 KEGG:ns NR:ns ## COG: MK0980 COG0535 # Protein_GI_number: 20094416 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Methanopyrus kandleri AV19 # 26 342 63 371 403 165 31.0 1e-40 MVKCGLNIIINDHLVQHDGYYLVNSFIPPINTEAFRTVVKMVPGKGKNFFVNHTEGIRLA PISTYIAVTDKCMYHCWHCSASKFMKDATSNSQFSTLELKKIIAQLQRLGVAIIGFTGGE PLLREDLEELIASIDERSVSYIFTTGYKLTYQRALALKEAGLFGIAISLDSLDAQRHNGM RHNDHAYDYAVEAIKNAKKAGLYTMSQTVCTRELLINGGILKLALYLKKLEIDEMRIMEP LPCGALKNKLEEVLTDNEKEQLKQLHITLNRDSRYPKASVFPYFESAEQFGCGAGVQHSY 
IDGHGNFGPCDFIEPTFGNVLNENIQYIWVKMSKAINGPHCQCIAKNCDICKQLPRFYKL MRGYKK >gi|223714035|gb|ACDT01000180.1| GENE 4 2827 - 3693 570 288 aa, chain - ## HITS:1 COG:no KEGG:DSY3599 NR:ns ## KEGG: DSY3599 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 9 287 1 285 300 208 39.0 2e-52 MHVKKRKIISLFHLGIKVGYLFITRKIVTRSDIGNSYDLVSKTYQQEYLKIMHTYNDILL NHLIKHISKGRILDLAGGTGYNSNFIQEYNDYSIDLVDISKQMLKQCQNSSINKVCKGML EFLTVQESCCYDAIVCTWALMYEQPQKVLNQCQRLLKNGGYLYILVNDQQTLPQIRKIYP KLLIEYVDSINKLMFDLPMPNNAQQLERWAKKAGLKNCRVTSKKQEFYFTDWNRAAKFVT STGALAGYDAMLDLKNEKIFNTLVEQLQKFFTKPCITHHFVMGIFKKG >gi|223714035|gb|ACDT01000180.1| GENE 5 3669 - 4241 342 190 aa, chain - ## HITS:1 COG:no KEGG:DSY3598 NR:ns ## KEGG: DSY3598 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 2 188 3 199 214 127 35.0 2e-28 MSYLFDSYAKIYDHFMRLFHLDDTSVIEHKLSKGQYHILDVGGGSGTLAADLQAAGHQVT IVDSSLSMLKEAKKKNTNVKLIHASIQKGLPINKVDVIICRDCLHHLMNQEKCLKLMLNY LNNNGFILIHDFNQRAFRIKLLFLFERCCFEKIKPVKPEQLVDFSLQNKLNIVFLHQGKW DYICMLKKEK >gi|223714035|gb|ACDT01000180.1| GENE 6 4580 - 5413 714 277 aa, chain - ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 4 271 5 259 269 126 32.0 4e-29 MFCDYHVHTEYSDDSCYLMEDVIRDAIVLGIKEICFTDHVDYGIKLDWDDPNLIIKENEK TIANVNYPQYFQKIDSLTEKYCNQITIKKGMEFGIQMHTIDKFQKLFNKYPFDFIILSIH QVENKEFWTNDFQAGLSEAEYYQRYYQEMYDVVRNYHDYSVLGHMDLIKRYDDKDGYDAF NEHKEIITRILKYIIEDGKGIELNTSSVRYGLDDLMPSRDIFQLYYDLGGRIITIGSDSH EKVHLGAHIETMKKDLKKIGFKEFCTFKQMKPIFHKL >gi|223714035|gb|ACDT01000180.1| GENE 7 6124 - 6702 610 192 aa, chain - ## HITS:1 COG:lin1208 KEGG:ns NR:ns ## COG: lin1208 COG0406 # Protein_GI_number: 16800277 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: 
Listeria innocua # 1 191 1 189 199 189 52.0 2e-48 MVKKLYLMRHGQTLFNLQNKIQGWCDSPLTELGQYQAKVAGQYFKDHQITFDHAYCSTSE RCSDTLELVTDMPYTRLKGLKENFYGQLEGESERLNCHLTPKDCETFYLQYGGESSKTVR DRMMQTLTNIMERDDHFNVLAVSHSGACFNFLRAIQDPTEELNKGFGNCCIFVYQFNDGV FKLEEVIRHKYN >gi|223714035|gb|ACDT01000180.1| GENE 8 6716 - 7693 1134 325 aa, chain - ## HITS:1 COG:lin2113 KEGG:ns NR:ns ## COG: lin2113 COG0667 # Protein_GI_number: 16801179 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Listeria innocua # 1 323 1 325 331 420 60.0 1e-117 MEYVKLGNTGLEVSKICLGCMSFGDPQRWIHSWVLNETDSRKIIKKALDLGINFFDTANV YGLGASEEILGRALQEYAKREEIVIATKVSGQMHEGPNGKGLSRKAILHEVEQCLKRLGT DYIDLLYIHRWDYNTPIEETMCALNDLVRAGKVHYLGASSMYAWQFQKAQYLAEKNGWTK FSVMQGHYNLLYREEEREMIPLCKDMKVALVPYSPLAAGRLTRDWNSDSKRAQEDLVAKR KYDSTVDTDRKIVERVAEIADKYQVTKSQVAVAWLWAKGVTAPIVGITKEKYLDDFAGAL TVKLTQEDIEYLEECYQPHVIMGHN >gi|223714035|gb|ACDT01000180.1| GENE 9 7708 - 8118 536 136 aa, chain - ## HITS:1 COG:MA0416 KEGG:ns NR:ns ## COG: MA0416 COG1917 # Protein_GI_number: 20089309 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Methanosarcina acetivorans str.C2A # 1 136 1 140 141 145 55.0 2e-35 MNREESLGPSAVFELGEPNDAYAKYFIGQSYLKHLTTEGVGVVNVTFEPGCRNNWHIHHK GGQILIVTGGKGYYQEWGKPVQCLKPGDIVNIPPEIKHWHGAVKDSWFSHLAVEVPAKDA SNEWCEPVTDEEYEKL >gi|223714035|gb|ACDT01000180.1| GENE 10 8136 - 8879 1002 247 aa, chain - ## HITS:1 COG:MA0409 KEGG:ns NR:ns ## COG: MA0409 COG0599 # Protein_GI_number: 20089302 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Methanosarcina acetivorans str.C2A # 2 244 8 247 250 219 45.0 4e-57 MNKKAEQLFSDLNDGKDFLVKDPELREIMINYLYGDVYHHGNLDFKLRELILIVVNTTNH TLKALKEHVTSGLSVGVTPVEIKEAVYQCTPYIGLGKVEEALEIVNLVFEEKNISLPLLP QATVQGDNRFEKGFAVQSAAFGQEHIQAGHDHAPKELKHIQNYLSEYCFGDFYTRNGLDL 
PTRELITMVMLATLGGCENQLRAHVGANLTVGNNRDTLIETITQCQPYIGFPRTLNAIAI INEITKK >gi|223714035|gb|ACDT01000180.1| GENE 11 8893 - 10095 1113 400 aa, chain - ## HITS:1 COG:BH2694 KEGG:ns NR:ns ## COG: BH2694 COG0477 # Protein_GI_number: 15615257 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus halodurans # 9 392 9 402 418 77 24.0 4e-14 MVSKSSKHWLVLIVCCGLAASSIGVSINSSGVFYTPVSTSLGIMRGTFSMHMTIFSLTTA IGALFIPRIMNKVSYKLLLTISVVIAVVATGLMAVTKSIPGFYLLGAIRGLSTGMFSIVP LTIIINNWFEKSHGLATSIVFGFSGLAGSICSPILSSFIENFGWQTGYLIKAAIILGLCL PAVLYPFHLDPQKDGCLPYGHVEQPQDMVTQVSNSFSFTTVAFISFFVFGLLCSCITSVT QHLPGYGQSLGYSAALGAMLLSSGMVGNIISKLIIGVLSDHLGAIKATVTMIIANTVGII LLMWGSTAWLLIIGAFLFGSCYSIGAVALPLLTKYFFGIDNYARVFPKISFASNLGAAIS LSMVGYIYDFFGSYLYAFIIALAMIAVCVITLTLTAKTKE >gi|223714035|gb|ACDT01000180.1| GENE 12 10079 - 10450 257 123 aa, chain - ## HITS:1 COG:lin2876 KEGG:ns NR:ns ## COG: lin2876 COG0789 # Protein_GI_number: 16801936 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Listeria innocua # 1 108 19 127 151 62 35.0 1e-10 MRIKEVSKRFNVSITALRYYEKVGLFEDVNRVNGIREYEDKDIERLSLILTLKNIGLSNE TILKYIELNEQGAHTKKQQIQVLKLERQKLLDSIHNQQKNIDSLDYLIYQLKDKEKIKHG EQK >gi|223714035|gb|ACDT01000180.1| GENE 13 10608 - 11405 950 265 aa, chain - ## HITS:1 COG:no KEGG:BDI_1568 NR:ns ## KEGG: BDI_1568 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 263 29 286 424 315 55.0 8e-85 MTKQAVAPRTIFMTEKDFQPIDETRIYWLSSAGVMINSHGTTLMIDPLLEGFDLPLLVEM PILPKDVPSLDGVLITHCDNDHFSRLTCKNLAPVTKSFHGPHYLAELFAEEDLKGNGYDI GESFEIGNMKIKLTPADHAWQNESSKYSKIRKYQFEDYCGYWIDTPDGTIWMVGDSRLLQ EHLQMPEPDVMLFDFSDNSWHIGLDNAIKLANTYPNTELILIHWGTVDAPEMNAFNGDPK SLNGRVVNPERIRVVAPGEAFVLKK >gi|223714035|gb|ACDT01000180.1| GENE 14 11483 - 12364 850 293 aa, chain - ## HITS:1 COG:lin0450 KEGG:ns 
NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 290 1 288 291 260 42.0 2e-69 MELRVLKYFLAVAREENITKAAEFLHITQPTLSRQLMQLEEELNAQLFIRGKNRIVLTDE GMLLRRRAEEIVDLANKTEKEFLEQDNLVTGEIFIGGGETNAMHILARIIKEFKEEYPQI KYQFYSGNADDIKERLDKGLIDIGLLTEPVDIEKYEFVRLEQKEVWGILAPKDSKLAAKE YATPQDLLKLPLLSTRRTIVQNEIANWFGQDYEQLDIIATYNLIYNAAIMVEEGIGYAIC LEKLVNINDETKIRFIPFYPPLLTGTVIVWKKHQIFSPATARFIEKIKHSLKA >gi|223714035|gb|ACDT01000180.1| GENE 15 12890 - 13312 504 140 aa, chain - ## HITS:1 COG:no KEGG:ZPR_2888 NR:ns ## KEGG: ZPR_2888 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 7 135 159 289 313 94 40.0 1e-18 MENNQKKEVNEALHAIEETLGHLSRAQDYLSSAGNWGLFDMIGGGFITTMIKHGKMNEAE RAMAAARNSIRNLKKELSDVDQLVDVDLNISDFLSFADYFFDGIIADWMVQSKIKDARFQ VDKAIRELNRIKNTLLTLAV >gi|223714035|gb|ACDT01000180.1| GENE 16 13508 - 14875 1343 455 aa, chain - ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 5 383 4 382 384 341 41.0 2e-93 MKNQFSKLTNRWNSGSLKWEIKENELPMWVADMDFETAPEIIEALKQRVEHGIFGYNIVP DEFFESIQTWWLKRHNYYLEKEWMLFCTGVVPAISSLVRKMTSVGENILIQAPVYNIFYN SILNNGRHIISSDLIFDGKEYYIDFVDLEMKLADPQTTMMILCNPHNPIGKIWDIESLER IGEMCAKYNVLVISDEIHCDLTAPNKNYIPFASVSKVNQMNSITCIAPTKAFNLAGLQTA CIVVANPQLRHKVNRGINTDEVAEPNSFAITATVAAFTKGAVWLDELREYIEENKNIVSK FIRCYLPEIYLIPSEATYLLWLDCSRITTDSTMLTQFIRDKTGLYLTSGIEYGENGNGFI RMNIACPQSRLFDGLERFKEGIRAYQQGEIKMEEIDIEKMHQFEDVINMEIFYNKTTQEI MEQYYLQRENADSLAMQDFTLTDEELRVLDQLIKI >gi|223714035|gb|ACDT01000180.1| GENE 17 14887 - 15279 264 130 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20127 NR:ns ## KEGG: EUBELI_20127 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 116 175 290 
290 166 71.0 2e-40 KLISKDAFWNCENVTVYDSTIIGEYLGWNSKNITFINCTIESLQGLCYIENLKLINCKLI NTTLAFEYSTVEVEVNSHIDSVVNPISGVIKAHSIGELILDKTKINPSETKIVITNPPIK SQTNIECICC Prediction of potential genes in microbial genomes Time: Thu May 26 11:06:54 2011 Seq name: gi|223714034|gb|ACDT01000181.1| Coprobacillus sp. D7 cont1.181, whole genome shotgun sequence Length of sequence - 948 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 597 431 ## lp_2889 hypothetical protein 2 1 Op 2 . - CDS 584 - 946 296 ## LHK_02162 hypothetical protein Predicted protein(s) >gi|223714034|gb|ACDT01000181.1| GENE 1 3 - 597 431 198 aa, chain - ## HITS:1 COG:no KEGG:lp_2889 NR:ns ## KEGG: lp_2889 # Name: not_defined # Def: hypothetical protein # Organism: L.plantarum # Pathway: not_defined # 1 193 1 193 288 254 59.0 9e-67 MEIIKQKYLSGERALFKSNNLKIEDSIFADGESPLKESKNIEINETIFKWKYPLWYCNDV KVTNSTLLETARSGIWYTKNISIIDSIIEAPKTFRRAQNIYLENVDLPIAQETLWNCTDI VLKAARVHGDYFGFNSINIKIDDLNLTGNYSFDGGRNIEVHNSKLISKDAFWNCENVTVY DSTIIGEYLGWNSNPFGS >gi|223714034|gb|ACDT01000181.1| GENE 2 584 - 946 296 120 aa, chain - ## HITS:1 COG:no KEGG:LHK_02162 NR:ns ## KEGG: LHK_02162 # Name: not_defined # Def: hypothetical protein # Organism: L.hongkongensis # Pathway: not_defined # 8 108 72 172 178 94 44.0 1e-18 IEINGQMFLATLDDTPTSQALLEKLPMVLTMKELNGNEKFYNLEYSLPVTSQSVNQINKG DLMLFHDNCLVLFYQDFLSKYQYTRIGQIDDAGNINQIVSAGDLVVSFMKYRNGEENGNN Prediction of potential genes in microbial genomes Time: Thu May 26 11:07:00 2011 Seq name: gi|223714033|gb|ACDT01000182.1| Coprobacillus sp. 
D7 cont1.182, whole genome shotgun sequence Length of sequence - 2698 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1132 - 1191 8.2 1 1 Op 1 26/0.000 + CDS 1338 - 2465 1115 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component 2 1 Op 2 . + CDS 2470 - 2698 98 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component Predicted protein(s) >gi|223714033|gb|ACDT01000182.1| GENE 1 1338 - 2465 1115 375 aa, chain + ## HITS:1 COG:CAC2328 KEGG:ns NR:ns ## COG: CAC2328 COG1134 # Protein_GI_number: 15895595 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate transport system, ATPase component # Organism: Clostridium acetobutylicum # 8 241 4 241 419 226 47.0 5e-59 MENIENAIEIRDMSKDFKLIYDKPTTLKERLVFWNRSKPEWHHVLKNINLDIKKGDTVAL VGTNGSGKSTLLKLMTRILYPTKGKLETAGKLTSLLELGAGFHPDFTGRENIYFNASIFG LTKFEIEKRIDEIIEFSELGSFIDNPVRTYSSGMYMRLAFSVAINVDAEILLIDEILAVG DQHFQDKCFSKLKELRDSDKTIVIVSHSLDMIKKLCTRAVWIYNGEVKADGNPNIVIDKY LDQVVKDHANQVKDYVPETINCILTIETPTEFEEKDNSSNIVVSGWEVSNTFTTNVDVYM DNKYIGRASRHKRNDVLMHYQEANGGASMNAQAGWEINVPVKGLEGNHIIYVKVSDGDNI IANKQVQVILKDKSR >gi|223714033|gb|ACDT01000182.1| GENE 2 2470 - 2698 98 76 aa, chain + ## HITS:1 COG:CAC2329 KEGG:ns NR:ns ## COG: CAC2329 COG1682 # Protein_GI_number: 15895596 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Clostridium acetobutylicum # 4 76 5 77 258 60 39.0 8e-10 MKLIENLYQYRELLKSNIKKEIRGKYKGSVLGVLWSFLNPLLMVMVYFLVFPFLFRNTVD NYLIYLVIGVIPWNFF Prediction of potential genes in microbial genomes Time: Thu May 26 11:07:03 2011 Seq name: gi|223714032|gb|ACDT01000183.1| Coprobacillus sp. 
D7 cont1.183, whole genome shotgun sequence Length of sequence - 8695 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 1, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 79 - 495 369 ## Aaci_0275 ABC-2 type transporter 2 1 Op 2 11/0.000 + CDS 497 - 2653 1800 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 3 1 Op 3 1/0.000 + CDS 2670 - 4652 1523 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 4 1 Op 4 1/0.000 + CDS 4663 - 6648 633 ## COG4713 Predicted membrane protein + Prom 6710 - 6769 8.7 5 1 Op 5 . + CDS 6796 - 7515 389 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 6 1 Op 6 . + CDS 7515 - 7871 318 ## gi|167755073|ref|ZP_02427200.1| hypothetical protein CLORAM_00577 7 1 Op 7 . + CDS 7864 - 8695 831 ## COG0463 Glycosyltransferases involved in cell wall biogenesis Predicted protein(s) >gi|223714032|gb|ACDT01000183.1| GENE 1 79 - 495 369 138 aa, chain + ## HITS:1 COG:no KEGG:Aaci_0275 NR:ns ## KEGG: Aaci_0275 # Name: not_defined # Def: ABC-2 type transporter # Organism: A.acidocaldarius # Pathway: ABC transporters [PATH:aac02010] # 13 138 135 257 257 79 35.0 4e-14 MLLFCIFGGVGISWHLILFPVFALIQYFITMGLILGLSAINVYIKDTEYIVQFFINMLFY GTPILYDLSTFKDFPSILMKIINLNPFKHLMVIYRDIFMYHNVPNLGATVYIVIFAAICF FGGLAIFRKLEKGFAEEV >gi|223714032|gb|ACDT01000183.1| GENE 2 497 - 2653 1800 718 aa, chain + ## HITS:1 COG:alr4487_2 KEGG:ns NR:ns ## COG: alr4487_2 COG0463 # Protein_GI_number: 17231979 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 186 628 1 443 519 202 31.0 2e-51 MGVTNFVDSTKKTPTFINDEITGIKVQYLKIILANLKENNELFVYSNDIEIAYNKKMISD DGDIQIELEIPKDCKEIRVCCLSGKCLIDVYRLKISYAFKIRSLFKKIFSIINRSIIVFG RGMRFLWREHRMMIPPTLWRKYFIRFFSRVKERGEILWNPFIQDDYHKWIQKFEKNNIDK EQEYNPLISILIPVYNVERKFLSECLDSILNQTYQNYEVCIVDDCSTNLETINTLKEYEN KDTRIVVKTRLINGHISKASNDALEIARGEFICLVDNDDTLAPNALYENVALLNKHKDAD FIYSDEDKLDLRGERCEPHFKSDFAPDTLLGINYICHLAVLRTSLVREVGGFTVGLEGVQ DHDLFLRITEKTKNIYHIPKILYHWRMIEGSTSLSVDNKAYAVKKGIETIESTLKRRGVK ANVKSLGNSTVYGIEYVLDTEPSVSIIVPTRDFADVTEKCLESIYKLTNYSNFEVVIVDN RSEKQETMELFEKYQMRYENFRVIKADMEFNYSAINNLAVSTCKSDVLVLLNNDTEVLTP NWLKLMVSYAIQKHIGAVGAKLLYPDMTIQHGGVLLGVGNAVAAHAFISHPRDDEGVYGR LKIPYNYSAVTAACLAVERKKYIQVGGLDETLKVAYNDVDFNLKLLDAGYYNLFIPQVEL IHYESKSRGLDSTSEKYKQFLAENNYMHKKWAKYIECDPYYNKNLSKLGLYMLNRNID >gi|223714032|gb|ACDT01000183.1| GENE 3 2670 - 4652 1523 660 aa, chain + ## HITS:1 COG:alr4487_2 KEGG:ns NR:ns ## COG: alr4487_2 COG0463 # Protein_GI_number: 17231979 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 152 536 1 394 519 207 33.0 5e-53 MKYRIEECEVCGIMNPTLVIRGWADSTINHIKIKSPDEILVKEIKKIRSRIVSFDYKIPL KKHTVYKVYLCNDELDIKVKVVRSNFIKRVWQKYKPRGIFINKVFGIIYDPLKREIIKFN KGTLNMQNENEYNEWLKNNDRFVEVKNFNYQPKISIVMPVYNVPGKYLSYCIESILKQTY QNFEICIADDCSTNIDTIETLKAYKNKDNRIKVIFREENGHISRATNSALSLASGEFIGL MDNDDTLDPHALNEVVNILNEDKNIDFIYTDEDKIDMGGKRSDPHFKPDFSLDTLYGGNY ICHFSVIRKNIIEKIGGFRVGYEGAQDFDLFLRVSNETDKIYHIPKILYHWRMIPGSTAV GGDGAKNYAGEAGKRALEDYFKQKAISVKIDNIISTQYFVEYIFENEPKVDILVVFEGNN TSLNNFLRSVYTDNAYKNFVVNIINKDNKPLKIDTNFNHSVKLFINEKNVIETVNNIIQA SDGEYVIFANEYCNFETYDWINLMVGYAKQNEIGAVGCKVMDKNKIVRNAGVILSEQSLF INAYECTSRNDYGNYGRLLVPYNYSIVSSHLMCVSKEKIHNLDTDFNLDFSLYDLCLRLL EDGLRNVVLPQVEVINTAYKRSSDIEQEKIFQRKNKMHIGNDRYYNVNLSKEKAFRLKKI >gi|223714032|gb|ACDT01000183.1| GENE 4 4663 - 6648 633 661 aa, chain + ## HITS:1 COG:CAC0024 KEGG:ns NR:ns ## COG: CAC0024 COG4713 # Protein_GI_number: 15893322 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 227 661 12 450 456 134 26.0 5e-31 MKFKNILSQIFNGLKKISYQWQIYRDKSFASSILKFFPKNFLLVIFLICSFIIIFNEYAA NNNSVIYRNNHLYSSNVGAIYKEPIEFSLDRKNVDQDFDILSIKFATYDRINDSDYTLSV MDKNNVVYETNFNTKKFGDNEYEDFYLNDTIDSKRLKDYTVKISPIKTDTSNNITVLCNE ENGDIAVAYKVSKSPLNISTFIIIGMIILFFILMKIVNERKIIHEKFLILALIFIALATI LTPPYQVPDERLHYLRTLQLSQYNFSKTPSENLGSREIIVPDNLDDVNYSRAEATDAVTD VGLIGDSMVEGRNIEKRYNSSIGGSAVMVAYLLPSIILRLVDCFTNSPLVLFYTGRLVCL GINFLLTYYAIKLTPKFKNTMLIVALMPMSIQSMISYSYDGLLNACILLFIATCLKMIYD KENKINKRYLSISVIMLFMIISIKLPYALLGVLYFFIPSYKFNNSVKKTASIFIVLFATY LSTLLLSKVMSIGALTTSVVSNSGNEQSNFTYILNNPLEIFSIAKNTFKEKIVFYIDSLV GYFGYFSIKMHTIFQYAYLIMAGGLILTEESNFKKKERIFYFLIVLTVIAGIFGALYFAW SGYQLSYVEGVQGRYFIPLILPTIMIFSFRKKILTIKNSTIFSFIDIILLNYIILLLVYN F >gi|223714032|gb|ACDT01000183.1| GENE 5 6796 - 7515 389 239 aa, chain + ## HITS:1 COG:L11285 KEGG:ns NR:ns ## COG: L11285 COG0463 # Protein_GI_number: 15672191 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: 
Lactococcus lactis # 2 235 3 236 241 228 48.0 6e-60 MEQKILIIIPAYNEEKSIAEVYLNICNYNNEFGTNYDVIVINDGSTDDTKHICEINNIPV VNLIHNLGIGGAVQTGYKYARDNGYDFAIQFDGDGQHEVSCIKDILEPLYNDSADFVIGS RFIDKSSSEFKSSFARRIGIRIISNMIKFITGKRIHDITSGFRAANRRVILDFALSYPVE YPEPITNTELLKKGVRIQEIPVKMHEREAGISSINSWKNVYFMVNIILSVFVVGIRRWK >gi|223714032|gb|ACDT01000183.1| GENE 6 7515 - 7871 318 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755073|ref|ZP_02427200.1| ## NR: gi|167755073|ref|ZP_02427200.1| hypothetical protein CLORAM_00577 [Clostridium ramosum DSM 1402] # 1 118 1 118 118 130 100.0 3e-29 MSFGTVLLVLGTGLLIFLVTLQLLRRGRIPVKFSILWFIVAVILLVVGIFPNFIVLISTR IGFISMSNMLVGILIFLLFAMCIALTVIVSGQATKITLLIQEVSMLKKKIVNEEKNNG >gi|223714032|gb|ACDT01000183.1| GENE 7 7864 - 8695 831 277 aa, chain + ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 5 238 5 245 344 102 29.0 7e-22 MDKKVTIIIPVYNAGANIEHCIQSILKQTSRDFKVLLINDGSTDDSLIRITEYANRYPDI FKVLTHENMGVVETRHRGIKEADTEYIMFMDNDDFIDNDYVEVLLNEIEKTDSDIVITGY RRANFDKVLFTIPAYDDEWTKYRITAPWARIFRRNFLVENNVRFLKTYIGEDTYFNMNAY LYTNKIHGLDYVGYNWYYNDESVSNVKQRGLKPECDVLVVLNEIDKLYKEKDEYLNYFVT RHIIWYLLFSGVDATPQRFMEEYKRLFEWLKNQGYKL Prediction of potential genes in microbial genomes Time: Thu May 26 11:07:14 2011 Seq name: gi|223714031|gb|ACDT01000184.1| Coprobacillus sp. D7 cont1.184, whole genome shotgun sequence Length of sequence - 6882 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 4.4 1 1 Op 1 . + CDS 93 - 896 457 ## LCRIS_01755 glycosyl transferase, group 2 2 1 Op 2 . 
+ CDS 898 - 2061 766 ## Cphy_3545 glycosyltransferase + Prom 2300 - 2359 8.0 3 2 Op 1 2/0.000 + CDS 2448 - 3041 229 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 4 2 Op 2 1/0.000 + CDS 2920 - 3504 370 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 5 2 Op 3 . + CDS 3505 - 4347 1063 ## COG1091 dTDP-4-dehydrorhamnose reductase + Term 4357 - 4397 3.5 6 3 Op 1 . + CDS 4400 - 5977 792 ## COG4713 Predicted membrane protein + Term 6007 - 6066 1.0 + Prom 5992 - 6051 8.7 7 3 Op 2 . + CDS 6071 - 6881 393 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily Predicted protein(s) >gi|223714031|gb|ACDT01000184.1| GENE 1 93 - 896 457 267 aa, chain + ## HITS:1 COG:no KEGG:LCRIS_01755 NR:ns ## KEGG: LCRIS_01755 # Name: not_defined # Def: glycosyl transferase, group 2 # Organism: L.crispatus # Pathway: not_defined # 9 258 2 259 267 121 35.0 3e-26 MKEYNINDLYVVIVVYNTSCQDSISYNRAKDSSVKIIVCDNSTSDYGNEEIVKNDGNIYI NMGGNNGLSKAYNAAIQVINKDNRYICLFDDDTDMGKKYFDKVLSYINMSNADILLPVVK TSTRILSPCQFKKKRCVEVKNLEELINKPISAINSGMVIKSEIYKDYKYNENIFLDYLDH DFMREMNRVKKNILVMNDNVLKQDFSMESNSLEASYKRLCILNHDLKEYYKEDYIMYLYQ ILAYKLVMIKKYKSLKFVLLKKFKQEV >gi|223714031|gb|ACDT01000184.1| GENE 2 898 - 2061 766 387 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3545 NR:ns ## KEGG: Cphy_3545 # Name: not_defined # Def: glycosyltransferase # Organism: C.phytofermentans # Pathway: not_defined # 12 383 15 391 391 277 40.0 5e-73 MKKYILDLGTKIIQKTRYRNNIIDYKPIIDNCDFDIVENNYNGGKKIGILLPHLVKSSGG VTSILRLGTNLSKLGYQLTYISMFNDNEKDMVEAAKFNLSNYEGECLPLNKTNKNDFDII VATEWRTVYRIMDFDAYKMYFVQDFEPIFFEMGERYLLSKKTYELGLHIVSLGKWNVDTV VKSCDVKGQIDYIEFPYEKTEYTFVKRNFEKIKEKRKIKVAVYVKEESKRLPVIIPLILD NLKKSLKLKGFELEINYFGNALPICINNGIECGKLSKKQLHELYCECDFGMVASLTNISL VPYEMIAAGLPVIEFKDGTFDYFFSKETAILCDLSYKHLEKEILYYVENPAELEEMTIRA NKSILSLSWEKSANQFVEILDHVDTVE >gi|223714031|gb|ACDT01000184.1| GENE 3 2448 - 3041 229 197 aa, chain + ## HITS:1 COG:FN1682 KEGG:ns NR:ns ## COG: 
FN1682 COG2244 # Protein_GI_number: 19705003 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Fusobacterium nucleatum # 7 172 138 305 486 63 28.0 2e-10 MCHGFFQGMEDFKKTVLRNFLARIVCVSLIFMFVKSSADLPLYVLFYSITLLLGNISMWF YIPKYIKRNDVKNLNIQRHIKPAVMLFLPQIASSLYTLLDKTMIGLITDNNSEVAYYEQS QIIIKTVLTLLTSLSTAMMPRIANLFATDDMETVKKLYDNVYKIYFNVLFSICTRDHWNS SWFCTMVLWKWIRKSNS >gi|223714031|gb|ACDT01000184.1| GENE 4 2920 - 3504 370 194 aa, chain + ## HITS:1 COG:SA0127 KEGG:ns NR:ns ## COG: SA0127 COG2244 # Protein_GI_number: 15925836 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Staphylococcus aureus N315 # 9 179 292 461 476 75 30.0 4e-14 MTMSIKFILMFSFPFVLGIIGIAPGFVPWFYGNGFEKVIPNLMLISPIIIFIGLSAVTGT QYLLALGRQREYTFSIIAAAIVNLILNLILIPQFSSLGAAIATSFAELVALLTQIWFIRK DFNIFSILKQGFRYLVFSLIMFVVVFGISNFLSMSIVSTLIEIFVGGIIYLLLLIIAKDT IFNGVRAKLAERRK >gi|223714031|gb|ACDT01000184.1| GENE 5 3505 - 4347 1063 280 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 275 1 275 280 238 42.0 7e-63 MKLLVTGVKGQLGHDIVNECNNRNIEAVGVDVEEMDITDAGKVAEVIKSGNYNAVIHCAA WTAVDKAEDEVELCTKVNVDGTRNIANICKELDIPMMYFSTDYVFDGQGETEWKEYDERH PLNVYGQTKYEGELIVESLPKHFIVRIAWVFGINGNNFIKTMLRLGKERGAVCVVDDQIG SPTYTYDLSKLVVDMIQTDKYGIYHATNEGLCSWYEFACEIFKQAGMSVEVTPVDSNAFP AKAKRPNNSRMSKAMLDKNGFGRLPTWQDALSRYLKEIDY >gi|223714031|gb|ACDT01000184.1| GENE 6 4400 - 5977 792 525 aa, chain + ## HITS:1 COG:CAC0024 KEGG:ns NR:ns ## COG: CAC0024 COG4713 # Protein_GI_number: 15893322 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 77 524 7 448 456 108 26.0 3e-23 MKHAKIHRINNMNKKLSDSFIKNNKTITITIIIITVFFYFTTLWSIDGLIGFKIIVLGCV 
MSVLLLGAKEILQKNSRIEKLFIICVIPIGLIYTLLIPPGIVPDEWAHMQNEFSLSSQLL GKEIDGKVTMRESELNFLSKQVINPAKGYYDFIYNNIISTDNSAYIHTEIDSFGITQLFA YFPAVLGITFARILHFGVVTTVYFGRLFNFIFYILLTYLSIKKIPFGKLLMFTITMLPMT CQQMFSLSYDTVVNSSAFICIAYGMFFVYQAKEIQFIDILIYGFSGLILLANKGSAYAFI LVIPILAKYFNPGGDRIAKKTKIVIFLIVVICILILNYHSFTNNNQVTAIESVSSAGIVP WAGTPSYTLNAIIGDIPATFSLFLNTFLQKGMWYINTAIGSELGWLNILMPNWIINIWGI ILIISTFSEKSNNDVFTHEHKILYFFIAVTIILIVMLAMALAWTPSGYSTIEGVQGRYFI PIIFLLLICFQNSKLYMNERITKVVLMIIVILPILTIGNLILLVL >gi|223714031|gb|ACDT01000184.1| GENE 7 6071 - 6881 393 270 aa, chain + ## HITS:1 COG:STM4541 KEGG:ns NR:ns ## COG: STM4541 COG1368 # Protein_GI_number: 16767785 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Salmonella typhimurium LT2 # 141 264 136 246 750 61 31.0 2e-09 MKKIKTIEKKYIVGVVLIFLGCLIVTSIIWGNRTFSVKTLNQIIFHLKVPMEGTDNGIYL DWFIWAVPISIVAGSLIVALIFNIHRLCKLNNEKISQSIKKHFIKGGILCIIISLIFAIY NYDIYGYINNVVQETDIYEKYYVDPSTAKISFNNGKRNIIHLYLESVENTYANTTFGGAE EINYIPELSQLAKNNINFSNNDNIGGSRTIDGTQWTIASQVSQNMGIPLKLSIKSQKYDN DTAFLPGGYSLGEVLEANGYINEFKPFRFQ Prediction of potential genes in microbial genomes Time: Thu May 26 11:07:23 2011 Seq name: gi|223714030|gb|ACDT01000185.1| Coprobacillus sp. D7 cont1.185, whole genome shotgun sequence Length of sequence - 1068 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 26 - 727 677 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 2 1 Op 2 . 
+ CDS 740 - 1067 368 ## Fisuc_2046 glycosyl transferase family 2 Predicted protein(s) >gi|223714030|gb|ACDT01000185.1| GENE 1 26 - 727 677 233 aa, chain + ## HITS:1 COG:mll4258 KEGG:ns NR:ns ## COG: mll4258 COG1368 # Protein_GI_number: 13473603 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Mesorhizobium loti # 7 217 264 474 501 70 24.0 2e-12 MCGSDANFGGTSNFYKQHGNYIIRDYNSFKENQEIPEDYFVFWGIEDAKLFEFAKKDITE LASGEKPFNMEITTIDTHTPDGYLCDQCKNTHKNQYANVIECQSKQINDFINWCKTQEWY ENTTIVITGDHNSMSEKFFVGIDTDYVRTPYNCFINSVVEPVNNKNRLFSTIDMYPTMLV AMGASIEGNRLGLGTNLFSDKKTIMEEIGFNELNNEVQKTSRFYDYTILGLEK >gi|223714030|gb|ACDT01000185.1| GENE 2 740 - 1067 368 109 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_2046 NR:ns ## KEGG: Fisuc_2046 # Name: not_defined # Def: glycosyl transferase family 2 # Organism: F.succinogenes # Pathway: not_defined # 1 94 1 97 288 74 40.0 9e-13 MKTGIVVLNYNDAIETIDFVEKISTFNAVDIICVVDNCSTDDSVKQLKKLKNIELIALDT NEGYAAGNNAGLKYLYEQECDNYIISNPDIIIDKRNLMDFIAHMNADTN Prediction of potential genes in microbial genomes Time: Thu May 26 11:07:27 2011 Seq name: gi|223714029|gb|ACDT01000186.1| Coprobacillus sp. D7 cont1.186, whole genome shotgun sequence Length of sequence - 4316 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 368 223 ## CTC01464 hypothetical protein 2 1 Op 2 13/0.000 + CDS 379 - 1377 796 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 3 1 Op 3 . + CDS 1377 - 1664 307 ## COG1343 Uncharacterized protein predicted to be involved in DNA repair 4 2 Tu 1 . - CDS 3346 - 3645 95 ## - Prom 3874 - 3933 8.1 + Prom 3671 - 3730 6.0 5 3 Tu 1 . 
+ CDS 3836 - 4316 381 ## gi|167756704|ref|ZP_02428831.1| hypothetical protein CLORAM_02242 Predicted protein(s) >gi|223714029|gb|ACDT01000186.1| GENE 1 3 - 368 223 121 aa, chain + ## HITS:1 COG:no KEGG:CTC01464 NR:ns ## KEGG: CTC01464 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 4 121 47 163 163 114 59.0 1e-24 WNRKGLNSEIAIGNIRLDKLTAKYLTETKKSDADLVAAKWQLLYYLKILKSKGIIRKGRI EVIEKNKQNKSFIEVELTEIEEKELDKIVIKIKELLENDEIPLVLNESKCKKCAYYAYCY I >gi|223714029|gb|ACDT01000186.1| GENE 2 379 - 1377 796 332 aa, chain + ## HITS:1 COG:FN1177 KEGG:ns NR:ns ## COG: FN1177 COG1518 # Protein_GI_number: 19704512 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 328 9 336 338 228 40.0 2e-59 MKSTKYITSMGDLKRKDDSLCFRKSGKNIYLPVENIKEIYCLNEISLNTKLLDFISSKNI IIHFFNYYQGYSGTFYPKDKYTSGKLLIKQVETFNNYRIDIAKAIVNGIAINVDETLYHY YKHGKNEVKETIDWLRKEVPKRLKKAEDIKMIMAVEGEIWQRFYSMFRYILPEDFIMNKR VKRPPDNPINALISFGNTLLYTKTISAIYQTHLNQSISFLHEPTEQRFSLSLDLSEVFKP IIVFKTIFELVNTKRLTVEKHFDKKTNYCLLNDKGRDIFIEAFENRMETKFLHTKLKRKI TYKTAIKYDGYKLIKTIFENKLFIPFSIKDKY >gi|223714029|gb|ACDT01000186.1| GENE 3 1377 - 1664 307 95 aa, chain + ## HITS:1 COG:TM1796 KEGG:ns NR:ns ## COG: TM1796 COG1343 # Protein_GI_number: 15644540 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Thermotoga maritima # 8 94 2 86 87 60 42.0 6e-10 MKIKNHNYVIVCYDIGEKRVNKIFKICKKYLPHYQYSIFKGPITPSKLILLKKELKKAIN KEEDCVSIIKLQSEDSFDEEILGSQKEGNEDSLII >gi|223714029|gb|ACDT01000186.1| GENE 4 3346 - 3645 95 99 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MISNFNTSYVVIKRVQPVQGNHRKLNFNTSYVVIKQSGLDNLLSFFHNFNTSYVVIKPTR LTILWSRSLYFNTSYVVIKRNLIFFVVAYWKFQYILCCY >gi|223714029|gb|ACDT01000186.1| GENE 5 3836 - 4316 381 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756704|ref|ZP_02428831.1| ## NR: gi|167756704|ref|ZP_02428831.1| 
hypothetical protein CLORAM_02242 [Clostridium ramosum DSM 1402] # 1 160 1 160 301 264 100.0 1e-69 MTILFILLLIKYFFVIIKVKVSRSGVIMTNKYRLKIIYKDTKLTLDIEENITFDELSSII NEKLMLSDSRAYKYQKDDDIIVTRKNSKNNKLADLLELDQKLVYIIGTGNNVYSINIIVW DYIIEADKKILEKFNQMLKNVKQVRPEQVYYLNSSQRKFI Prediction of potential genes in microbial genomes Time: Thu May 26 11:07:46 2011 Seq name: gi|223714028|gb|ACDT01000187.1| Coprobacillus sp. D7 cont1.187, whole genome shotgun sequence Length of sequence - 13408 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 334 159 ## gi|167756704|ref|ZP_02428831.1| hypothetical protein CLORAM_02242 + Term 343 - 386 6.2 - Term 331 - 374 5.4 2 2 Tu 1 . - CDS 378 - 755 465 ## EUBELI_00568 hypothetical protein - Prom 834 - 893 16.3 + Prom 677 - 736 11.3 3 3 Tu 1 . + CDS 937 - 2544 1730 ## COG0513 Superfamily II DNA and RNA helicases + Term 2554 - 2598 8.8 + Prom 2599 - 2658 9.8 4 4 Tu 1 . + CDS 2688 - 5354 3456 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase + Term 5357 - 5401 5.1 + Prom 5356 - 5415 6.8 5 5 Op 1 . + CDS 5437 - 6417 961 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D 6 5 Op 2 . + CDS 6427 - 7425 741 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake 7 5 Op 3 . + CDS 7427 - 8056 593 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 8 5 Op 4 . + CDS 8112 - 9011 1038 ## COG1210 UDP-glucose pyrophosphorylase 9 5 Op 5 . + CDS 9011 - 9478 531 ## gi|167756712|ref|ZP_02428839.1| hypothetical protein CLORAM_02250 + Term 9506 - 9540 1.2 + Prom 9584 - 9643 12.7 10 6 Tu 1 . 
+ CDS 9721 - 13239 3956 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit + Term 13250 - 13283 3.1 Predicted protein(s) >gi|223714028|gb|ACDT01000187.1| GENE 1 2 - 334 159 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756704|ref|ZP_02428831.1| ## NR: gi|167756704|ref|ZP_02428831.1| hypothetical protein CLORAM_02242 [Clostridium ramosum DSM 1402] # 1 110 192 301 301 199 100.0 7e-50 LTTKVIYYILDDKYEIHLYNNNDDLKEDNTEYLITFYDTNRAYFKGYQGVNRNIFILHKD KTIKQNDFEYLYHALNRIIHMFKDGDEDTLFESHETCLCYDIATNKFWTE >gi|223714028|gb|ACDT01000187.1| GENE 2 378 - 755 465 125 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00568 NR:ns ## KEGG: EUBELI_00568 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 5 125 19 138 157 88 39.0 8e-17 MENKIIWQDRKHFMWFPISFTKYEIKNDRLYQETGLFSTHYDELLLYRITDLCLRRNLSQ KIFGTGTVILYTKADSEREIHLKNIKNAREVKDLISKLVEDARDKKKVVGKEFFDENLNF EDHSE >gi|223714028|gb|ACDT01000187.1| GENE 3 937 - 2544 1730 535 aa, chain + ## HITS:1 COG:CAC3010 KEGG:ns NR:ns ## COG: CAC3010 COG0513 # Protein_GI_number: 15896262 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Clostridium acetobutylicum # 1 535 1 524 528 536 52.0 1e-152 MIENKFNELGLSNEVLKAIEDMGFSKPSKIQEEAIPVLLTGVDVIGQAQTGTGKTLAFGS VLLSKITPSQRKLPQAIILSPTRELAMQIHEEMERIGKYNGSRITCVYGGSDIERQIRTI KKGIDIIVGTPGRVMDLMRRNVLKLNDVKFVVLDEADEMLNMGFVEDIETILEKVDDDRQ TILFSATMPAGIKKIAQNYMHDNFEHVAVLSKQTTATSVKQFYYEVKQKDRFEAMCRLID VANVQTGIIFCRTKRSVDEVTEQMQQANYNVEAMHGDLSQNHRMNTLRKFKKGTINFLIA TDVAARGIDVENVTHVINYELPQDIESYVHRIGRTGRADKEGQAYSIITPREKGFLRQIE RVTKSSITKATIPTLQEISEAKIGTLVSKVEDQILAGNHKKFKQLVNEIDPTMLADFTAA LMYMTFQEQLGYDYKRDTIQEASEGKRRERGRGNNKDYTRIFITAGSMDRVKAPQIVNFF VSKAGVRKEDIGDIDIKRKFTFVDINKKVINKVVDKCNKQKINNRKIEIEIANKK >gi|223714028|gb|ACDT01000187.1| GENE 4 2688 - 5354 3456 888 aa, chain + ## HITS:1 COG:TM0272 KEGG:ns 
NR:ns ## COG: TM0272 COG0574 # Protein_GI_number: 15643042 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Thermotoga maritima # 2 859 3 865 881 1019 59.0 0 MKKYVYMFSEGNEMMRDLLGGKGANLAAMVNLGLPVPQGFTVTTEACNEYYADGKIINDE MKKQIDDCLERLEKLADKKLGGLDNPLLVSVRSGAKFSMPGMMDTILNLGLNDETVEVVA KQTGNRRFAFDSYRRFIQMYSDVVCEVDKELFEAKLSELKTSKGYESDLDITAEDFENII IPQYKEIFKTQLGRDFPQEAKEQLMGAILAVFRSWNNDRAIIYRNLNGIPHDLGTAVNVQ QMVFGNMGDDSGTGVLFTRNAANGDNHIYGEYLINAQGEDVVAGIRTPQKIAKLEEDMPE IYKQLVTIVKGLEKHYKDMQDCEFTVENGKLYILQTRNGKRTGKAALKIAVDLVHEGLIN KYEAMTRVEPDQISQLLHPNFTPAALAAAEVLIEGLPASPGAGAGKVYLTAEKVHEQAVA GEKVILVRHETSPEDIQGMVDCEGILTSTGGMTSHAAVVARGMGKCCIVGAKALSIDYDA GTFTIDGKTYPEGTEISLDGTSGKVYMGILDSEESELTGDFAELMSWADEIKKLQVRANA DSPRDAAVAIKFGAEGIGLCRTEHMFFEGDRIEYVRQMILSDTVEERIKALDELYKFQVE DFRGIYRAMVGLPVTVRLLDPPLHEFLPHTDEEYQAVADKLGKTLEEVKTKGATLKETNP MLGHRGSRLAVTYPEIYNMQVRAIIDAAIDVERELGCTIVPEIMLPLIGSESEIVYVKDN VTKAIDAAIMAKNAKIEYKIGTMIEIPRAALTADEIAKHAEFFSFGTNDLTQMTFGFSRD DVGSFLPEYINRKVIQVDPFVSLDQSGVGQLVEMAANKGRSVRPKIKLGICGEHGGDPES IKFCHKTGLTYVSCSPYRVLIARLAAAQAAAEEIILEHTTDKVLVADK >gi|223714028|gb|ACDT01000187.1| GENE 5 5437 - 6417 961 326 aa, chain + ## HITS:1 COG:HP0723 KEGG:ns NR:ns ## COG: HP0723 COG0252 # Protein_GI_number: 15645344 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Helicobacter pylori 26695 # 1 326 5 330 330 246 40.0 4e-65 MKNIIIVATGGTIAGSGKVGKATNYQAGKINIKEIIDSIPMINEVANLKAIQLFNVDSNE MNEEHWIILANKINDLASQKNVDGIVVTHGTDTLDETAYFLNLTINTYKPVVLTGAMRPA SATSADGPINLYQAVCLASCDDALGHGVMAVFSSTIYSGRDIQKISNFKTDAFDQKDFGC LGYMNDDKVMMFSRTFKKHTLQSVFSEKPITELPPVGIVYYYAGAKPDILTMMAQNHRGI VIIGSGSGNYSQAWLNEIETLAAKGIIFVRASRVNQGIVYESDVFDPHNVCIPSNTLSGQ KARVLLMLALSVTQNTKEIKRIFNEY >gi|223714028|gb|ACDT01000187.1| GENE 6 6427 - 7425 741 332 aa, chain + ## HITS:1 COG:alr3452 KEGG:ns NR:ns ## 
COG: alr3452 COG0758 # Protein_GI_number: 17230944 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Nostoc sp. PCC 7120 # 11 324 9 328 415 112 25.0 1e-24 MYVERIKLNLDSIAMMLFCARLKAYKEVPLSNEEWLLIERIIKKKGLKGPASLLSMNQTE LEEILEINEFIAYKMARRIETMNIFLSVLNNLEGNGINVTTKYEDNFPKLLTKHLKKRAP LYVFYCGNIELVGEGISLMGLNKVTKKDRAYTKRLVDKAIEDNLIYISNDAKGIDDVALH YALYHGCHCISFVCERLGAKSKDYTRYIKSGQLVMLSAEDPNCYFDVTNAIERNSYVCAL SKYQIIVSSSINNGATWFTSLQNLHNKWTTPLAVEGLYLGNDRLLDMGVTPIYIKDVLSD YSFDMIYDRNKKIVEDAEVNIDQMSIFEFIGE >gi|223714028|gb|ACDT01000187.1| GENE 7 7427 - 8056 593 209 aa, chain + ## HITS:1 COG:CAC0954 KEGG:ns NR:ns ## COG: CAC0954 COG0705 # Protein_GI_number: 15894241 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Clostridium acetobutylicum # 42 192 177 321 328 80 32.0 3e-15 MNFDYKKYPVTAGVIGICILVYCYTTVKYGFEMNAYQGIRAGGFNPILVLAGNQYWRLIS ANFIHFGIMHIFCNCYSLVNLGSVMEYLLGMKRYLIILIASALATTILPTVFYILTGNGA SSIMGGISGAIFGLMGALLALAWKFKDVYAYLFKQISSSVLLMLLISILVPSISLSGHIS GMIGGFIATLLIINLCLYAFGKEKTGKLS >gi|223714028|gb|ACDT01000187.1| GENE 8 8112 - 9011 1038 299 aa, chain + ## HITS:1 COG:CAC2335 KEGG:ns NR:ns ## COG: CAC2335 COG1210 # Protein_GI_number: 15895602 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose pyrophosphorylase # Organism: Clostridium acetobutylicum # 4 291 2 282 303 343 60.0 2e-94 MKQKVRKAIIPAAGLGTRFLPATKALAKEMLPIVDTPTIQYIIQEAVDSGIEEILIITNS NKHAMENHFDKSYELEARLTESGKMEQVKMIQDIADMANIYYIRQKEPKGLGHAVLCAKS FIGDEPFAVLLGDDIVVNDGGEPALKQLIDAYINKEASVVGVQTVEHKDVCKYGIVSPSH SHPRENGGRLVKLNNMVEKPAVEEAPSDMAVLGRYVLTPKVFELLETQGKGAGGEIQLTD AIKRLMDIQAVYAYDFEGIRYDVGDKFGFIKATIDFALKREDLKEKVQAYINSLVKDVK >gi|223714028|gb|ACDT01000187.1| GENE 9 9011 - 9478 531 155 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|167756712|ref|ZP_02428839.1| ## NR: gi|167756712|ref|ZP_02428839.1| hypothetical protein CLORAM_02250 [Clostridium ramosum DSM 1402] # 1 155 2 156 156 259 100.0 4e-68 MSFQEKYEEYKQRKEAQAFFNQNNDQKVYGKDYLIAILVGFGSTIVMGCILTWIISKIGF NFSYFTILIGIFEAQAIKKVLNKSGQQLAIIAAVTFVLGIVVAQAIFISITLPFFNVSML VETFKYCFQNMITGDVLSTIIYLFGAIAAYMALKD >gi|223714028|gb|ACDT01000187.1| GENE 10 9721 - 13239 3956 1172 aa, chain + ## HITS:1 COG:FN1170_1 KEGG:ns NR:ns ## COG: FN1170_1 COG0674 # Protein_GI_number: 19704505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Fusobacterium nucleatum # 1 408 1 407 410 564 65.0 1e-160 MAKKFLSMDGNTAAAHVAYAFTEVASIYPITPSSPMAENAEAWSAQGKKNIFGSPVNVIE MQSEAGAAGAVHGALQGGALATTFTASQGLLLMIPNLYKIAGELLPGVFHVSARALATRA LNIFGDHQDIYACRQVGMPMICSHSVQEVMDLGGIAHLTAIKSSVPVMHFFDGFRTSHEI QKVEVMDYDVLESLLDKEALAKYKKNAMNPHTNPIERGGAENDDIYFQGREAQNKHYDAV VEVVADYMKKISEITGREYAPFTYYGAPDAARVIVAMGSVTETIKETIDEMNRRGDRVGL IKVHLYRPFSAKYLLKVLPNTVEKVAVLDRTKEMGATGEPLYLDVVSVLKDVKHVRAIVG GRYGMGSKDTTPRQIKAVYDHLLEDAPFTSFTIGINDDVTNLSLKEDPEFNVDADYTACL FYGLGSDGTVSANKSSIKIIGDHTDLYSQAYFAYDSKKAGGATRSNLRFGNTPIRATYYV NNADFISCSLDNYVIKYDMLKNLKDGGTFLLNTEFSKEEIIDYLPNRVKKQLADKKAKFY IINANKIAGEIGMGRRTNTILQSAFFALNPQILPIEKAVEYMKEMAKKTYSKKGDAIVQL NYKAIDAGKDAIEEVAVDNSWSDLTVTATRSTTGDEHFDEFVSVINSLDGYDLPVSAFMD KLDGSMKSGVAIKEKRAIATEVPRWNKDNCIQCNNCVMVCPHATIRAFLLDEEEMANLPE NIGDDVLVPMGKDMGGLVYRIQVSPDNCVGCGLCVTECPGKKGEKALEMVSVKDELVHAP LADYMYANVKYRDDKYPLTTAKGVGFMRPYFEVSGACGGCGETPYYRLASQLFGKDMMIA NATGCSSIYSGSTPSTPCAIDGNGQGPAWANSLFEDNAEFGFGMKLAENYKVGHLLRIIE ENKDACEPELKALLEEYVEINGDRAKERELVPQIMTAVKASSNEAIKELLNHEGDMVSKS QWIIGGDGWAYDIGYGGLDHVIASNQNVNILVLDTEVYSNTGGQSSKSSQAGSIAKFTAG GKSAAKKDLAQIAMAYGHVYVAQVAMGANPVQTIKAFKEAEAYNGPSLIIAYSPCMEHGI KGGLANHQRQQKDAVSCGYFNLLRYDPRLEDAGKNPLQVDSKAPDFDKFKDFLLSENRFA QLLKVNPEHAEALMEKCLADAKKRRLRLDRMA Prediction of potential genes in microbial genomes Time: Thu May 26 11:08:03 
2011 Seq name: gi|223714027|gb|ACDT01000188.1| Coprobacillus sp. D7 cont1.188, whole genome shotgun sequence Length of sequence - 6504 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 554 410 ## gi|167756443|ref|ZP_02428570.1| hypothetical protein CLORAM_01976 2 1 Op 2 . + CDS 577 - 1206 415 ## gi|167756443|ref|ZP_02428570.1| hypothetical protein CLORAM_01976 3 1 Op 3 . + CDS 1281 - 1838 586 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) 4 1 Op 4 . + CDS 1813 - 2604 178 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 5 2 Tu 1 . - CDS 2601 - 3860 1314 ## COG4100 Cystathionine beta-lyase family protein involved in aluminum resistance - Prom 3884 - 3943 7.6 + Prom 4399 - 4458 7.8 6 3 Tu 1 . + CDS 4533 - 4715 137 ## + Term 4955 - 5005 -0.5 7 4 Tu 1 . + CDS 5292 - 6504 582 ## gi|237732703|ref|ZP_04563184.1| hypothetical protein MBAG_03262 Predicted protein(s) >gi|223714027|gb|ACDT01000188.1| GENE 1 3 - 554 410 183 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756443|ref|ZP_02428570.1| ## NR: gi|167756443|ref|ZP_02428570.1| hypothetical protein CLORAM_01976 [Clostridium ramosum DSM 1402] # 1 179 9 187 408 320 96.0 2e-86 IEPKKRLRMIEVIFSIGLCIASIISIGYGLFDINANIEDIQFKQSIQMTRDTDLEDYSED NTICDVTYINGDKQLIVSYSYEDYVKLDDKTITAYEYETKNGTKLYFDHQNINDQEVQYA YKQVRANELASLFNFGIASLILMLSILIMMLFAKQFTTYEKSWFISIMVLATYFFRSFPR RKC >gi|223714027|gb|ACDT01000188.1| GENE 2 577 - 1206 415 209 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756443|ref|ZP_02428570.1| ## NR: gi|167756443|ref|ZP_02428570.1| hypothetical protein CLORAM_01976 [Clostridium ramosum DSM 1402] # 1 209 200 408 408 373 99.0 1e-102 MLLYLLDTFLNILCELLISKQSRYNFLVSILVEIVEIAICVVLMYRFATMATTLFFWLPI DIISYINWSKHKDDEENELTVVRKLRGYQEVLVIIGIIVWTFVIGYLISGLNIATDFYNN ELLETFIIYIDACASAVGIANGLFIFFRLQEQWIAWYICAFLEAVINIISGQYVLLVLKL 
GYFTNTTYGYIKWSRYIKEHTTEKQAQIS >gi|223714027|gb|ACDT01000188.1| GENE 3 1281 - 1838 586 185 aa, chain + ## HITS:1 COG:BS_efp KEGG:ns NR:ns ## COG: BS_efp COG0231 # Protein_GI_number: 16079501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Bacillus subtilis # 3 185 2 185 185 160 47.0 1e-39 MAILAGDFKTGLTLIVDGDPCQVMDFQHVKPGKGAAILKTKMRNLKTGAIQERNFNASTK FEAANISRKDAQYSYEADATFYFMDLETYETYELAEDQVGDNKYYIIEGSTVSLMFFDGL LLSVSVPEKVELTVVETDPAIKGAPSNQTKDAVTDTGLTLRVPQFIDTGEKIVVFTTDGK YAGRA >gi|223714027|gb|ACDT01000188.1| GENE 4 1813 - 2604 178 263 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 18 259 4 240 242 73 28 5e-13 MESMLEELNKLFFKIMNKIVLITGGAKGIGKAIALELAKQGYDIVINYLTSKKEAEALKN MIIDDYGVRCLAIAGDVSKEDKVDEMVSLIESKLGGVDILINNAAVDLSNLFHLKNAEEF KKTLDVNVVGAFNCSKRVYRHMIDQEYGKIINIASTNGINTYYPMCIDYDASKAALISMT HNLAFEFAPFVNVNCIAPGFIGTENELNGYDEQFLKEEVEKIMVNRYGDPQEVAYLVKFL VSKEADFINNTVIRIDGGQKGSC >gi|223714027|gb|ACDT01000188.1| GENE 5 2601 - 3860 1314 419 aa, chain - ## HITS:1 COG:SA1148 KEGG:ns NR:ns ## COG: SA1148 COG4100 # Protein_GI_number: 15926890 # Func_class: P Inorganic ion transport and metabolism # Function: Cystathionine beta-lyase family protein involved in aluminum resistance # Organism: Staphylococcus aureus N315 # 9 417 3 409 412 367 43.0 1e-101 MILEQINPDLLTLAKESDTVLHDIYEKIDSICLTNSNRILSAFIENKVSYSDFSDINGYG NYDEGRNKIEKIFATVLGCEDALVRPQIMSGTNALYITLSALLKHGDTMISLSGAPYDSL QEMIGISGDSSQSLKAAGVKYEQIELINDDFDDQMIIERLKEKNIKLVEIQRSRGYSHRK SLSIAKIERIIKKIREVNDEVIIMVDNCYGELVETKEPGHVGADIVVGSLMKNLGGGIAP TGGYVAGNQNLVYMVAERLTAPGIGKDLGANFNLNNAFFKGIFMAPNAVKNALKTAIFTA YMLEKLEYSNVSPRYDEPRTDIIQTLELKSKENLVSFTQGIQQTSPIDSFVHVLPAPMPG YPFDEVMAAGSFTQGSTIELSADAPVIEPYTLYLQGGLTFEYGKLSILLALSNMKRKLN >gi|223714027|gb|ACDT01000188.1| GENE 6 4533 - 4715 137 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MSISFHYIESNYLYYIDMNKRVDLLKKISSLDFNLLKSKDITKREYDDILDKMKNHHSDS >gi|223714027|gb|ACDT01000188.1| GENE 7 5292 - 6504 582 404 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732703|ref|ZP_04563184.1| ## NR: gi|237732703|ref|ZP_04563184.1| hypothetical protein MBAG_03262 [Mollicutes bacterium D7] # 1 404 1 404 404 806 99.0 0 MCYYYDEDLKFFKDGDEIMKEVIVSAGLVYIYLNGYLENERLHIQSIDTYSCLENGFYQR VIEASTIANEVDEIVHIFAVVGVKEKANQIYSMIEKRLKNNFNNIIAGKNIKKSSKMLDK TGNVFFRCVQEFFCEQLDHFPKQVNYDFNDFTIKGICSSRLPDYVGDSITNEIIYIWLER EKPMLCGRQLMIRSFFVRDFIGRKVIASLPNKENESWRFIFEGGHQISMNEKFSYLKETL HPDDLGGFCSSNIQSILMNPIYAYGQWFQPNDVCEEWHKVFLYLCAISDNEWNEVSISKI YDKFLDFLRKNICITMEAPSLISKSEYYKILLIHIINFRSFLKGEDEPVLSKDLLQTMNS RYVYLPYLWELIPPNTYKNRFSANNFKKQINEAMKENASYIKGI Prediction of potential genes in microbial genomes Time: Thu May 26 11:08:39 2011 Seq name: gi|223714026|gb|ACDT01000189.1| Coprobacillus sp. D7 cont1.189, whole genome shotgun sequence Length of sequence - 1722 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 1 - 726 546 ## COG3410 Uncharacterized conserved protein 2 1 Op 2 . + CDS 710 - 1027 353 ## COG1694 Predicted pyrophosphatase + Term 1234 - 1276 2.2 + Prom 1223 - 1282 8.2 3 2 Tu 1 . 
+ CDS 1420 - 1720 433 ## Cphy_0274 S-adenosyl-methyltransferase MraW Predicted protein(s) >gi|223714026|gb|ACDT01000189.1| GENE 1 1 - 726 546 241 aa, chain + ## HITS:1 COG:BH3996 KEGG:ns NR:ns ## COG: BH3996 COG3410 # Protein_GI_number: 15616558 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 61 237 18 188 188 77 33.0 3e-14 SDIGSSKEIKKWCKYYNAIVYEDKLTSQFRCNGSDAYLSWVDNVLEIEDTANDELDIDYD VRIFDDPLALKKAIFEKNKIANKSRMLAGYCWKWQKNGRNRSDVHDIEIGDFSMSWNFAS TKTWAIDEESINEVGCIHTSQGLEFDYVGVIIGDDLRYENGKIITDFFKRAPTDRSIRGL KGQYKNDKESALKLADEIIKNTYRTLLTRGQKGCYIYCTNKELNEYLKLKLKRGKENVRK S >gi|223714026|gb|ACDT01000189.1| GENE 2 710 - 1027 353 105 aa, chain + ## HITS:1 COG:BH3997 KEGG:ns NR:ns ## COG: BH3997 COG1694 # Protein_GI_number: 15616559 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Bacillus halodurans # 2 92 4 99 101 68 40.0 3e-12 MLEKVREEIIKFNQDRDWDQFHSPENLAKSIAIESGELLECFQWDNSFNKQDVCDELADV VNYCILMADKLDVDLEDIVLKKLKKTEKKYPVEKAKGNSKKYNQL >gi|223714026|gb|ACDT01000189.1| GENE 3 1420 - 1720 433 100 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0274 NR:ns ## KEGG: Cphy_0274 # Name: not_defined # Def: S-adenosyl-methyltransferase MraW # Organism: C.phytofermentans # Pathway: not_defined # 2 100 22 120 366 152 72.0 3e-36 MEPKHKRRIRYKGTHPKKYNEKYKELNPELYPETIEKVIKKGSTPVGMHLSIMVDEILEF LDIQEGQIGLDCTLGYGGHTLKMLECLNYTGHMYALDIDP Prediction of potential genes in microbial genomes Time: Thu May 26 11:08:44 2011 Seq name: gi|223714025|gb|ACDT01000190.1| Coprobacillus sp. D7 cont1.190, whole genome shotgun sequence Length of sequence - 9498 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 76 - 135 5.2 1 1 Op 1 . + CDS 175 - 1383 456 ## gi|237732691|ref|ZP_04563172.1| predicted protein 2 1 Op 2 . 
+ CDS 1373 - 2821 540 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 3 1 Op 3 . + CDS 2832 - 3950 691 ## THA_871 4-alpha-L-fucosyltransferase 4 1 Op 4 1/0.000 + CDS 3967 - 5070 596 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 5 1 Op 5 . + CDS 5081 - 6085 718 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) 6 1 Op 6 . + CDS 6103 - 7506 741 ## COG2515 1-aminocyclopropane-1-carboxylate deaminase + Prom 7532 - 7591 2.5 7 2 Tu 1 . + CDS 7629 - 7904 198 ## gi|237732697|ref|ZP_04563178.1| predicted protein + Term 7965 - 8016 1.0 + Prom 7928 - 7987 3.9 8 3 Tu 1 . + CDS 8106 - 9338 1220 ## COG1004 Predicted UDP-glucose 6-dehydrogenase Predicted protein(s) >gi|223714025|gb|ACDT01000190.1| GENE 1 175 - 1383 456 402 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732691|ref|ZP_04563172.1| ## NR: gi|237732691|ref|ZP_04563172.1| predicted protein [Mollicutes bacterium D7] # 1 402 1 402 402 625 100.0 1e-177 MDVLTYDDYKQKIHLDNLGFILLMPILIDFLSVIVQQFGMSGSSVITMALYGLSLIVIII KLIKIVTVHEILNDIILYFLVLFPFGVNYFWFENTRAELISQEMLIVYLFFILLAIFSIR KIRRWDLFFEALIKPGKIAIFLAVFILLFLDYEKYLVYMGFSYALLPFVCNFYRTARIKK EFKEKLIACIFFAAGMVSILVFGARAAVGFAFVYIIVFEILRNDLTLSLKIISLIILLLI VWIISSNINAIAEMLVKMDAFKDSYLLKNLLSGQLLESNTRDILYQACLNRMSTMGLEIS GFFGDRQYCAGFAYPHNIFYELIMSFGWIIGSILIGTYALLLLKGILTSKPEKREVMIFI IISMLARYVISGSYLVEGKFWVATVLVISISLRKDKRFDNEE >gi|223714025|gb|ACDT01000190.1| GENE 2 1373 - 2821 540 482 aa, chain + ## HITS:1 COG:SMb21050 KEGG:ns NR:ns ## COG: SMb21050 COG2244 # Protein_GI_number: 16264377 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Sinorhizobium meliloti # 1 461 37 500 517 144 22.0 5e-34 MKNKSKTVYTNLMWRLLERFGAQGVTLVVSLILARLLDPEAYGTVALITIFTAVLNVFVD SGLANSLIQKKNADDVDFSSVFYFNVLICCLLYIIMFVSAPAISRFYNKPEIVPMIRVLS LTLIISGIKNVQQAYVSKNMMFKKFFFSTLGGTLTASIIGIALAILGKGAWALIIQNITN 
LLMDTIILWFTVKWRPKLLFSWKRLKGLLSYGYKLLLVSLINNIYNECRQLIIGKMYSSS DLACYNQGNRFPQIIVQNLNTAIDNVVFSAMSAAQDEITEVRNMTRRSIRTGSFILAPLM VGMACVSKSVVLLILTDKWIACVPYLQMFCIMYLFAPIQTANLNAIKAIGKSGVFLIIDI LEIVIGLVGLLVSMWFGPIYIAFSMLVCTILNLFINSFPNKKYLNYGTRDQLRDLFPNIL LAGVMGIVVCSINWLELSPIISLFIQIPLGGIIYIAGAAIFKIESFDYCCGIIRGLLNKK KE >gi|223714025|gb|ACDT01000190.1| GENE 3 2832 - 3950 691 372 aa, chain + ## HITS:1 COG:no KEGG:THA_871 NR:ns ## KEGG: THA_871 # Name: not_defined # Def: 4-alpha-L-fucosyltransferase # Organism: T.africanus # Pathway: not_defined # 11 308 2 289 358 80 28.0 1e-13 MRNNKKFYYKYIHLMYGHDTKFSKLLLDFISNPENGFEINQHLFVTPYKNVFDDLKQYSN VLLDESNKNLYKVYYKHCHLIISHSGEELYRILLTPKKIKNKVVYRYWGGMRILQYDENS KTFGESLKLKVKKYILKKSFSEFAAIGIANITDIIDLSRILKKDTKYYRLSYASNEYYDT VNKLKQKLDIENDINKRYKKRVLLGHRGTEENNHIEILKRLSKYNSENFDIFIPLSYGEK KYIQNVENYVKENSKGNIVIIKQFMKFSEYAEFLSTIDIAIFDGYTSYALGNLGIILFFN KTVYLNENGVIAKALESENNDYKKISDIGKISFEEFSKPMKYPANYTSDLCIISTEERIK NWNKLLADFSDK >gi|223714025|gb|ACDT01000190.1| GENE 4 3967 - 5070 596 367 aa, chain + ## HITS:1 COG:CAC2350 KEGG:ns NR:ns ## COG: CAC2350 COG0399 # Protein_GI_number: 15895617 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Clostridium acetobutylicum # 4 365 2 353 364 303 43.0 4e-82 MNNILVTRSSMPPFDEYCDEIKDIWDSHWLTNMGSKHKELQKELEKYLDIPHVALYTNGH LALENAIAALNLPKGGEVITTPFTFASTTHAIVRNGLVPVFCDIKEDDYTIDTQKLEKLI TDNTVAIVPVHVYGNMCDVEEIDRIAKKYGLKVIYDAAHAFAVKYKGISSACFGEASMFS FHATKVFNTIEGGCVCFKNDAWVQLLNDMKNFGIHGPEAVEFVGGNAKMNEFQAAMGICN LKHLNTEIEKRKKIVEHYRSRLEGVEGIKLSTIQKDVESNYAYFPVVFDGYKYTRNEVFE KLAEVGIGARKYFYPLTNSFECYRNYPTAGTEKTPIAQHIALRVLTLPLYADLALEEVDR ICDVIWR >gi|223714025|gb|ACDT01000190.1| GENE 5 5081 - 6085 718 334 aa, chain + ## HITS:1 COG:CAC2189 KEGG:ns NR:ns ## COG: CAC2189 COG0458 # Protein_GI_number: 15895458 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: 
Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Clostridium acetobutylicum # 1 313 1 299 315 132 30.0 1e-30 MNVMLTSVGRRAYLVKYFKEVLGNDGKVFVCNSDDKSIAFKYADEKVISPLIYDSNYILF LLEYCKENRIDIVISLFDIDLLMLAKHKKQFEKIGTKVIVSDPIIIEVCNDKWKTYKFLI DNGFHAPMSFLDMNEVIEKISERKLSYPVVVKPRYGCGSISVAIAYDEEDLRYLTKKANE DIANSYLKYESAVTNDKVIYQECLIGQEYGADIINDLNGETQNVIIRKKLAMRSGETDIA QLVDEPSIKETLVRLGKITKHIANMDCDIFLVNGVPYVLEMNARFGGGYPFSHMGGCNLP KAIVEWAKGEPVDKETISARTGITGYKEIYITEI >gi|223714025|gb|ACDT01000190.1| GENE 6 6103 - 7506 741 467 aa, chain + ## HITS:1 COG:YPO1845 KEGG:ns NR:ns ## COG: YPO1845 COG2515 # Protein_GI_number: 16122096 # Func_class: E Amino acid transport and metabolism # Function: 1-aminocyclopropane-1-carboxylate deaminase # Organism: Yersinia pestis # 23 306 35 326 330 117 30.0 4e-26 MQDEFKLMQYKTPVVQIPDKDNKIYIKRDDLLPFSFGGNKVRIALEFIADMKNQGKDCIV GYGNSRSNLSRALANLCYQLEIPCHIISPADEDGTHIDTYNSKMALACNAEFHYCRKTNV KASVERVLKELRDKGLNPYYIYGDSTGKGNEHIPLLAYVKVYEDIKAQFDYIFLATGTGM TQGGLLAGKAIHGGDEKIVGISVARSSMQETSVLKNSLECFSTRVQKIDYGEINVQDEYL CNGYGTYNRQIEKTIHQQLTCNGMPLDPTYTGKAFWGMREYIKKNKIVGKKILFIHTGGT PLFFDYMNGIRLTEASNKEAVEEAVIRLEKRLVPSLTDRKINISQYSEKLAQYGKVWIHY DMGKPISIIAGYFNDETTRTVYLSMLAVAEEYQGKRLASSLLSEFEDYAIKNKMNYVKLE VRKHNLVARKLYSKFGYKVIGDASDTSYYMLKKLENLSGGAQNSINF >gi|223714025|gb|ACDT01000190.1| GENE 7 7629 - 7904 198 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732697|ref|ZP_04563178.1| ## NR: gi|237732697|ref|ZP_04563178.1| predicted protein [Mollicutes bacterium D7] # 1 91 1 91 91 175 100.0 8e-43 MCAGYANDLQNRRGYISVVAVLPEYSNKGYGKVAVQGFLKKAKSAGMLEVHLYADSENSS ALHMYEKLGFSEWHIINESRQKDKHLIRKFI >gi|223714025|gb|ACDT01000190.1| GENE 8 8106 - 9338 1220 410 aa, chain + ## HITS:1 COG:STM2080 KEGG:ns NR:ns ## COG: STM2080 COG1004 # Protein_GI_number: 16765410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Salmonella typhimurium LT2 # 1 410 1 388 388 488 61.0 1e-138 
MKIAVAGTGYVGLSIATLLSQNHEVMAVDIIPEKVEKINKRISPIHDEYIEKYLREKELN LTATLDAEAAYKDADFVVIAAPTNYDSKKNFFDTSAVEAVIKLVIEYNPEAVMIIKSTIP VGYTQSIREKTGSKNIIFSPEFLRESKALYDNLYPSRIIVGTDMNDPRLVEAANTFAGLL QEGAIKEDIDTLIMGYTEAEAVKLFANTYLALRVSYFNELDTYAEMKGLDTQNIINGVCL DPRIGSHYNNPSFGYGGYCLPKDTKQLVANYQDVPQEMMSAIVASNRTRKDFIADRVLEI AGAYEANDSWDESKEHNVVIGVYRLTMKSNSDNFRQSSIQGVMKRIKAKGAEVIIYEPTL KDGETFFGSRVVNNLNEFKQQSQAIIANRYDSCLDDVEDKVYTRDLFRRD Prediction of potential genes in microbial genomes Time: Thu May 26 11:09:10 2011 Seq name: gi|223714024|gb|ACDT01000191.1| Coprobacillus sp. D7 cont1.191, whole genome shotgun sequence Length of sequence - 1316 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 299 231 ## gi|237732676|ref|ZP_04563157.1| conserved hypothetical protein 2 1 Op 2 . + CDS 313 - 486 140 ## gi|237732677|ref|ZP_04563158.1| predicted protein 3 1 Op 3 . + CDS 513 - 797 335 ## gi|237732678|ref|ZP_04563159.1| predicted protein 4 1 Op 4 . 
+ CDS 815 - 1294 337 ## gi|237732679|ref|ZP_04563160.1| predicted protein Predicted protein(s) >gi|223714024|gb|ACDT01000191.1| GENE 1 3 - 299 231 98 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732676|ref|ZP_04563157.1| ## NR: gi|237732676|ref|ZP_04563157.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 98 1 98 98 197 100.0 2e-49 KHNFFRAKINGKEVWVSRERISRKDLPCGWVRQELRHDGYDLKKPVKIDSSVILNFYGTM TYQVDWTRGAFVDGTSINEFEVIEEEKCLDITKEFLVF >gi|223714024|gb|ACDT01000191.1| GENE 2 313 - 486 140 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732677|ref|ZP_04563158.1| ## NR: gi|237732677|ref|ZP_04563158.1| predicted protein [Mollicutes bacterium D7] # 1 57 1 57 57 85 100.0 9e-16 MKSVEIVLKKKQIIKYVLDKYKFQNYDTSNIGYDELSDNWDDFGKKYILGKNICKNK >gi|223714024|gb|ACDT01000191.1| GENE 3 513 - 797 335 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732678|ref|ZP_04563159.1| ## NR: gi|237732678|ref|ZP_04563159.1| predicted protein [Mollicutes bacterium D7] # 1 94 3 96 96 163 100.0 3e-39 MDALTLYKHLKQQYLYHDVAYKLICYGALDSFIKEPTLTEYETIATVCIYADAKAEKPNI EQLANSVCKKYANKEYTLDELTEMSSWDVLELLD >gi|223714024|gb|ACDT01000191.1| GENE 4 815 - 1294 337 159 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732679|ref|ZP_04563160.1| ## NR: gi|237732679|ref|ZP_04563160.1| predicted protein [Mollicutes bacterium D7] # 1 159 16 174 174 287 100.0 1e-76 MQKKLIVVMGINNAGKTFYRQKFLNNYPYVDILEMQNKYPFITCDLVLKSYYETSNRVLE LFENNSTVILEHTLLRAVRREEFYFSVLRNKYPDIEIEVVCITPTLSQYLCNLRNRFTDY PLSYRDIVNYVEMLKLLEKPTKEEKVEIKLLTANNCNFF Prediction of potential genes in microbial genomes Time: Thu May 26 11:09:33 2011 Seq name: gi|223714023|gb|ACDT01000192.1| Coprobacillus sp. D7 cont1.192, whole genome shotgun sequence Length of sequence - 3474 bp Number of predicted genes - 7, with homology - 4 Number of transcription units - 5, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 22 - 366 202 ## gi|237732680|ref|ZP_04563161.1| predicted protein + Prom 470 - 529 6.7 2 2 Tu 1 . 
+ CDS 743 - 1228 473 ## gi|237732681|ref|ZP_04563162.1| predicted protein + Prom 1396 - 1455 6.0 3 3 Op 1 . + CDS 1485 - 1652 174 ## 4 3 Op 2 . + CDS 1654 - 1875 185 ## 5 3 Op 3 . + CDS 1893 - 2243 345 ## APP7_0464 hypothetical protein + Term 2266 - 2305 1.0 6 4 Tu 1 . + CDS 2321 - 2884 317 ## Swit_5259 hypothetical protein + Prom 2889 - 2948 5.6 7 5 Tu 1 . + CDS 3044 - 3473 416 ## Predicted protein(s) >gi|223714023|gb|ACDT01000192.1| GENE 1 22 - 366 202 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732680|ref|ZP_04563161.1| ## NR: gi|237732680|ref|ZP_04563161.1| predicted protein [Mollicutes bacterium D7] # 1 114 1 114 114 206 100.0 4e-52 MIFFKALEGSCNERLNQCNRFKKDFFSGSETFKRIVNNNRAFRKRNYTEDIQMYPILQTS IEMLDFYAFLELTVLEFLAPELNMPLNKAKLCISCCFYEKLQFKPTTNITLERY >gi|223714023|gb|ACDT01000192.1| GENE 2 743 - 1228 473 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732681|ref|ZP_04563162.1| ## NR: gi|237732681|ref|ZP_04563162.1| predicted protein [Mollicutes bacterium D7] # 1 161 1 161 161 300 100.0 2e-80 MDLHILYDSEQKICLSYGDKNSIDIKYSSLCNEFSKNGLKTINLKKINLKEFLKKYGCND IIFNQIILAIMSYNNFFSELYELVDSNGDLEVFLGLLNRFEINEPETFICPQCGCIETQH TNSYYEVPYGEIEYTLVCQKCGFKINNYAYGKWELENAFKD >gi|223714023|gb|ACDT01000192.1| GENE 3 1485 - 1652 174 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRQKITIKPNLRKTTKYTEEEKMKLFKEGAIRSMSHVKVSKKIYTRKKKYKEDY >gi|223714023|gb|ACDT01000192.1| GENE 4 1654 - 1875 185 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYREENNKIPKKFQYIVAKMLVCTDVSNIGDIIRVKSSNDKYFAYNKRTMKTMLIPMSIL ANSHLVEIIDIIE >gi|223714023|gb|ACDT01000192.1| GENE 5 1893 - 2243 345 116 aa, chain + ## HITS:1 COG:no KEGG:APP7_0464 NR:ns ## KEGG: APP7_0464 # Name: not_defined # Def: hypothetical protein # Organism: A.pleuropneumoniae_AP76 # Pathway: not_defined # 12 114 217 318 318 79 39.0 3e-14 MKYKLTNKKIDYLNAKVYGNAEVCGNAKVYGNAWVHGDAEVCGDADYLVIGPIGSRNDIT TFFKTANNDIKVSCGCFTGTIEEFLEKVNETHGNNNYAKEYKTAVEIAKIHIFGGE >gi|223714023|gb|ACDT01000192.1| GENE 6 2321 - 2884 317 187 aa, chain + ## HITS:1 COG:no 
KEGG:Swit_5259 NR:ns ## KEGG: Swit_5259 # Name: not_defined # Def: hypothetical protein # Organism: S.wittichii # Pathway: not_defined # 51 120 48 120 198 67 44.0 3e-10 MGWTGIVANHYKNNKIDRMKEFLDIFHNSKNFYNDNVYVVKKARQVGSTIYAACCWQNSK NEDVSDTHGLICLTSVKDGQFYYKDMTESMGPYQSNCPESILSLLSPTVNEYALDWRTRC HNYNKHNKILDKLPVGSVIETELNGYKVTLYKRKYKKITIWWDGQYKYSKNHFISGGYRI IENKAVH >gi|223714023|gb|ACDT01000192.1| GENE 7 3044 - 3473 416 143 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNKLNILARGKDKTGHWIIGSYFPSIISLASPFDGTENSYKHYIICNGCGDWGLPCSMDI HEVCEESIGYYINDVDMNNQRIFSGDIIDIHQTVNGYSKFVVTWDGFSFGAKYYDESKDT IGRNYEYDVKELFDYGIYFADKK Prediction of potential genes in microbial genomes Time: Thu May 26 11:10:07 2011 Seq name: gi|223714022|gb|ACDT01000193.1| Coprobacillus sp. D7 cont1.193, whole genome shotgun sequence Length of sequence - 2750 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 306 253 ## gi|237732685|ref|ZP_04563166.1| predicted protein 2 1 Op 2 . + CDS 346 - 546 331 ## gi|237732686|ref|ZP_04563167.1| predicted protein 3 1 Op 3 . + CDS 578 - 1837 459 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Prom 1923 - 1982 1.7 4 2 Op 1 . + CDS 2007 - 2189 159 ## gi|237732688|ref|ZP_04563169.1| predicted protein 5 2 Op 2 . 
+ CDS 2180 - 2680 483 ## gi|237732689|ref|ZP_04563170.1| predicted protein Predicted protein(s) >gi|223714022|gb|ACDT01000193.1| GENE 1 1 - 306 253 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732685|ref|ZP_04563166.1| ## NR: gi|237732685|ref|ZP_04563166.1| predicted protein [Mollicutes bacterium D7] # 11 101 1 91 91 130 98.0 2e-29 IETILKSDEVVPIEVETIENKIYKHETIVIEELKNNKFVKCEFIECDEVSNMNEAFVSAE SLMNELTKTFSDDEKEKYVFSKIKIGVTPKEDTYSIDITFN >gi|223714022|gb|ACDT01000193.1| GENE 2 346 - 546 331 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732686|ref|ZP_04563167.1| ## NR: gi|237732686|ref|ZP_04563167.1| predicted protein [Mollicutes bacterium D7] # 1 66 1 66 66 95 100.0 1e-18 MKFKVKIIEELARTVIVEADDKESAYEKASILANEEINLDYEDFSTRDIDVQGEATPLEL EIFDQY >gi|223714022|gb|ACDT01000193.1| GENE 3 578 - 1837 459 419 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 381 2 402 815 181 30 6e-46 MLTKFDEQAQEAIVVGESIAFDLGRNNVGSEHLLLSLLKISDSKLRELVKKYDVDDKNIY EDIKRLFGTNDSDNQPFYMEYSEALKSILEAAIEITHQQNKSKVTLNILTIALLQSEESV AHELLKKYGVNFKDVISKLGTKNNDKPQNLSGVKKLTFAEIINPKKKQVIMERDEEVLQI LVGLCCLEKANIAITGPAGVGKSAIIDELSRVLKYESVPKSLKGYTIVKLNLTSMLAGTK YRGDFEERLDKYLKEIRNKKVITFIDEGHQMTATGNSDNTLSLGEMIKPILSRGQYKFII ATTENEYKIIAADTALNRRFRKVPIYEPEKEKVFAMIREKINILKKFHGVDISKETIDEI IEETSKIRNRCFPDKALDVIDMTMAYSQVMNEDFSIDYAKKYISNIQLKNKERKKALVN >gi|223714022|gb|ACDT01000193.1| GENE 4 2007 - 2189 159 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732688|ref|ZP_04563169.1| ## NR: gi|237732688|ref|ZP_04563169.1| predicted protein [Mollicutes bacterium D7] # 1 60 1 60 60 103 100.0 4e-21 MFIKKSTISKMRNCDWGCYRLIGNFYLVKKYSPATKGIKGKIWVIDYAVKDKIKETSKWD >gi|223714022|gb|ACDT01000193.1| GENE 5 2180 - 2680 483 166 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732689|ref|ZP_04563170.1| ## NR: gi|237732689|ref|ZP_04563170.1| predicted protein [Mollicutes bacterium D7] # 1 166 1 166 166 313 100.0 3e-84 
MGLTDYEKYQLEWMIEHGHSLEDIFNLMDDIVDEEYHYTDRPLPSEAFEAFEEIGFKGSE IWASEDEWKNNEALENNEFCKELILNGIKQGIVKFIENPHGEGTDCVIGDNRFYLVDFYD DAIPVSVIRLQFDNKELTEIVFNAIFSLVGDEHKHYISYLSENVKI Prediction of potential genes in microbial genomes Time: Thu May 26 11:10:30 2011 Seq name: gi|223714021|gb|ACDT01000194.1| Coprobacillus sp. D7 cont1.194, whole genome shotgun sequence Length of sequence - 618 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 274 313 ## gi|237732690|ref|ZP_04563171.1| predicted protein + Prom 336 - 395 9.7 2 2 Tu 1 . + CDS 421 - 616 335 ## Predicted protein(s) >gi|223714021|gb|ACDT01000194.1| GENE 1 2 - 274 313 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732690|ref|ZP_04563171.1| ## NR: gi|237732690|ref|ZP_04563171.1| predicted protein [Mollicutes bacterium D7] # 15 90 1 76 76 120 100.0 3e-26 IERNGKVFTLTQEEMSDFRYLDQAITGRACIDLVRTSYNEDSEEYELLSKLMNDEDICYN IENDILDNIMNDVGATEQSVINDYMQSNMK >gi|223714021|gb|ACDT01000194.1| GENE 2 421 - 616 335 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKFSIPVTWQVWDKVEVEAETIEEAIKYVKNNIDEIPLGTEPEYIDGTYKIDDGKNGEK SIEEA Prediction of potential genes in microbial genomes Time: Thu May 26 11:10:41 2011 Seq name: gi|223714020|gb|ACDT01000195.1| Coprobacillus sp. D7 cont1.195, whole genome shotgun sequence Length of sequence - 3426 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 262 - 314 -0.5 1 1 Tu 1 . - CDS 334 - 741 301 ## gi|167756457|ref|ZP_02428584.1| hypothetical protein CLORAM_01990 - Prom 882 - 941 7.1 2 2 Op 1 1/0.000 - CDS 946 - 1980 1036 ## COG1396 Predicted transcriptional regulators - Prom 2017 - 2076 5.3 3 2 Op 2 . 
- CDS 2079 - 3422 1408 ## COG0534 Na+-driven multidrug efflux pump Predicted protein(s) >gi|223714020|gb|ACDT01000195.1| GENE 1 334 - 741 301 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756457|ref|ZP_02428584.1| ## NR: gi|167756457|ref|ZP_02428584.1| hypothetical protein CLORAM_01990 [Clostridium ramosum DSM 1402] # 1 135 27 161 161 225 100.0 6e-58 MTIETNDNLNQLIKAYNHDIELNRLKDILDYLKSKPIYMPSTIKISKQKLLKIPNDDLIM ILDNLSGETTQNYLPVYSSPVLFNSKYRYFVTLSIYDCFMLMKRFKEKNAIIINPNRDNS LVIDNILIKYLKNGG >gi|223714020|gb|ACDT01000195.1| GENE 2 946 - 1980 1036 344 aa, chain - ## HITS:1 COG:CAC3472_1 KEGG:ns NR:ns ## COG: CAC3472_1 COG1396 # Protein_GI_number: 15896711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 118 1 121 125 103 46.0 7e-22 MKINEIIKEKRLALGYTQEQLAKFLNLTTPAVNKWERGISYPDITILPALARILKTDLNT LLSFKDDLSDYEVVLFLNDLAAIKDFQQAYRIAIDKINEYPTCDNLIVNVAVVLEGLLSL NEMDEAIKYQQKIDSLLEQCLTSDNPMIKNQAQAVLINKYINQGKLEQAEKLINELPEQH LIDKKQKLVNLYIAQAKLEAAAKLQEEKLLVTVNEIPTMLLTLMEIAIKQERYDDAEIIA DIYKEMHEVFGLWKYSSYTAHFQLCINRKKRLECLKILKEMFNAINKGWNINTSPLYRHI TAKKIDQTFVQEMKNMLVTSIKSDPDCKFITDDLEMNKILEKYK >gi|223714020|gb|ACDT01000195.1| GENE 3 2079 - 3422 1408 447 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 432 5 434 447 254 38.0 2e-67 MTTGSPLKLIIAFGIPLVIGNIFQQFYSMVDTIIVGKYVGKTALAAVGSTGSLNFMIIGF GIGICSGFGIPIAQSFGGKKIQDMKKYIVNSFYLCMLITAIMTIATVIALPSVLELMQTP SDIYQQAYDYIVVIFIGLFATMLYNILSSILRAIGDSRTPLYFLILSSIINIVLDLIFIT QFDMGAVGAAYATVIAQFLSGAACYIYMKKKTDVLTFEHDDKKFSKKHSIRLLQMGLPMA MQFSITAIGSVVIQSAVNTLGSDVVAAVTAAIKISVMLTQPLETLGLTMATYGGQNLGAN KIGRIFAGLRVSCIIGAAYCAIVFIFVYFTSDYLSLLFIDAKEVVIMAEIKQYLLINSMG YYILCILFILRNLLQGLGYSFLAMFGGVAEMIARCIVAFFFVSSFGFNAICFANPLAWLF ANIVFIGGWIYKRKELKVIQSSEEVTV Prediction of potential genes in microbial genomes Time: Thu May 26 11:10:49 
2011 Seq name: gi|223714019|gb|ACDT01000196.1| Coprobacillus sp. D7 cont1.196, whole genome shotgun sequence Length of sequence - 4511 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 452 - 511 2.4 1 1 Op 1 . + CDS 546 - 716 200 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 2 1 Op 2 . + CDS 741 - 962 221 ## CD3359 ABC transporter, ATP-binding protein + Prom 1233 - 1292 7.7 3 2 Tu 1 . + CDS 1462 - 1656 179 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 1763 - 1802 1.1 4 3 Tu 1 . - CDS 1794 - 2774 1223 ## Cphy_2659 diaminopimelate dehydrogenase (EC:1.4.1.16) - Prom 2908 - 2967 8.2 - Term 4117 - 4176 6.1 5 4 Tu 1 . - CDS 4319 - 4510 187 ## gi|167756449|ref|ZP_02428576.1| hypothetical protein CLORAM_01982 Predicted protein(s) >gi|223714019|gb|ACDT01000196.1| GENE 1 546 - 716 200 56 aa, chain + ## HITS:1 COG:CAC3339 KEGG:ns NR:ns ## COG: CAC3339 COG0488 # Protein_GI_number: 15896582 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 2 55 145 198 518 65 64.0 2e-11 MVTAGLGINLIGLDKKINKISGGQCAKVILAKLLLEKLDVLLLDELTNFLDTKHIN >gi|223714019|gb|ACDT01000196.1| GENE 2 741 - 962 221 73 aa, chain + ## HITS:1 COG:no KEGG:CD3359 NR:ns ## KEGG: CD3359 # Name: not_defined # Def: ABC transporter, ATP-binding protein # Organism: C.difficile # Pathway: not_defined # 2 69 207 275 512 64 50.0 1e-09 MGTYIVISHDFDFLDKITDCILDIDYYTIKNIVENIHLFKTKKRLRTDYLRRYQKQQKHI KNGTIYPQKYSWE >gi|223714019|gb|ACDT01000196.1| GENE 3 1462 - 1656 179 64 aa, chain + ## HITS:1 COG:SP0770 KEGG:ns NR:ns ## COG: SP0770 COG0488 # Protein_GI_number: 15900664 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: 
Streptococcus pneumoniae TIGR4 # 8 61 458 511 513 66 50.0 1e-11 MYVAIITSHLLILDEPTNHLDQEIKKALAKAIKNYSSSAILVCHEPDFYQELVDHIIKIE NCNF >gi|223714019|gb|ACDT01000196.1| GENE 4 1794 - 2774 1223 326 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2659 NR:ns ## KEGG: Cphy_2659 # Name: not_defined # Def: diaminopimelate dehydrogenase (EC:1.4.1.16) # Organism: C.phytofermentans # Pathway: Lysine biosynthesis [PATH:cpy00300] # 2 326 3 328 328 489 70.0 1e-137 MIKIGIVGYGNLGRGVECAVLHSQDMELTGVFTRRNPKSVKTHSDVPVYSMDKVYDMKDQ IDVLVLCGGSANDLPKQTVELAEYFNVVDSFDTHAKIPEHFANVDEASKKNKHISIISVG WDPGLFSLNRLYGQAILPQGHDYTFWGKGVSQGHSDAIRRIDEVKDARQYTIPVEAALQS VRNGENPTLVTRQKHTRECFVVAEEGADLQRIEEEIKTMPNYFADYDTTVHFISEAELLR DHQGIPHGGVVLRSGTTGFEQENKHVIEYKLTLDSNPEFTSSVLVAYARAAHRMYQEGQH GCKTVFDIAPAYLHPESGDELRKKLL >gi|223714019|gb|ACDT01000196.1| GENE 5 4319 - 4510 187 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756449|ref|ZP_02428576.1| ## NR: gi|167756449|ref|ZP_02428576.1| hypothetical protein CLORAM_01982 [Clostridium ramosum DSM 1402] # 1 63 11 73 73 91 100.0 1e-17 SDIYNGENIDDKAVVLVIRNNQVYIPTSKLEENIDIIEENCDDIAANNDKYIKNYYNYKV ALF Prediction of potential genes in microbial genomes Time: Thu May 26 11:11:01 2011 Seq name: gi|223714018|gb|ACDT01000197.1| Coprobacillus sp. D7 cont1.197, whole genome shotgun sequence Length of sequence - 6966 bp Number of predicted genes - 7, with homology - 5 Number of transcription units - 5, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 398 278 ## gi|237732665|ref|ZP_04563146.1| hypothetical protein MBAG_03295 + Term 405 - 439 1.1 + Prom 448 - 507 18.2 2 2 Op 1 . + CDS 702 - 1472 531 ## GWCH70_3452 initiator RepB protein 3 2 Op 2 . + CDS 1485 - 1661 88 ## + Term 1662 - 1701 -0.1 + Prom 2666 - 2725 9.5 4 3 Op 1 . + CDS 2749 - 3513 353 ## gi|237732667|ref|ZP_04563148.1| predicted protein 5 3 Op 2 . 
+ CDS 3550 - 3771 124 ## gi|237732668|ref|ZP_04563149.1| predicted protein + Term 3865 - 3898 -0.2 - Term 5344 - 5406 9.5 6 4 Tu 1 . - CDS 5480 - 5686 68 ## - Prom 5787 - 5846 6.1 + Prom 5670 - 5729 8.0 7 5 Tu 1 . + CDS 5958 - 6966 324 ## EUBELI_01774 hypothetical protein Predicted protein(s) >gi|223714018|gb|ACDT01000197.1| GENE 1 3 - 398 278 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732665|ref|ZP_04563146.1| ## NR: gi|237732665|ref|ZP_04563146.1| hypothetical protein MBAG_03295 [Mollicutes bacterium D7] # 16 131 1 116 116 153 99.0 4e-36 RLKETEKDRDVLKKELHEARKENERLKINIYRIKDILEDMEYVINRIISVPANFLTNMYD RISKIRERVNRLYPIGEVNNIEDRKQVVRQRQTYSNEIDDLIYVTREEIKEYQANERERK IDRQQDRGMSR >gi|223714018|gb|ACDT01000197.1| GENE 2 702 - 1472 531 256 aa, chain + ## HITS:1 COG:no KEGG:GWCH70_3452 NR:ns ## KEGG: GWCH70_3452 # Name: not_defined # Def: initiator RepB protein # Organism: Geobacillus_WCH70 # Pathway: not_defined # 2 238 8 241 381 77 26.0 5e-13 MQSDLKIKQSNNLILATHKMDIQQMRLFFYACSQYKGDLDIKISLEEVNRVLTENTGNRG GNQRERIRETIPTLMRNAIVHIEDEKGEEWCSALRRSRLNKDDTVIFTFDKSIQKELEEL RGYTWMYLSNLTGMTSTYSVRLYEFFAMRLGNANKSSKFDYDINKLRLYLDCTKKYKDFR DFNKRVLAQAEKEINEKSNIHMSYKKIKTGRSITSILFTFKWKTKADVIDVQVVDQQEPM DHTKQMELEEFLKEFE >gi|223714018|gb|ACDT01000197.1| GENE 3 1485 - 1661 88 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MATLKDKMQISAENNLKEYEKLLHALKCSNVPNKEQRIKDCEHAIKKQKQILSNLNHY >gi|223714018|gb|ACDT01000197.1| GENE 4 2749 - 3513 353 254 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732667|ref|ZP_04563148.1| ## NR: gi|237732667|ref|ZP_04563148.1| predicted protein [Mollicutes bacterium D7] # 1 254 19 272 272 431 100.0 1e-119 MKKIIAFLLVFTLSFGAVTNISALNNSAVYSYSYKDYETTILNSTNEYNICFEIVENGEK IKLEYDTDFRYVKVNNNYYDFNDFNNAVLLQSTYIYDKNQPITIVEALTNFSKIDIDNIN DLKSNKISVKNNNSLAPKAEYGKFYFVGSRKKSMLVSLASSAISTIIAFLITNAAGLGVA KANVVSFVSGVLSSEGVNYFTSDVYYKVYQAILNVHNTTKEKRILGIFQPFKAQVVWDDA HPYYRTFETQRPNS >gi|223714018|gb|ACDT01000197.1| GENE 5 3550 - 3771 124 73 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|237732668|ref|ZP_04563149.1| ## NR: gi|237732668|ref|ZP_04563149.1| predicted protein [Mollicutes bacterium D7] # 1 73 1 73 73 141 100.0 1e-32 MKCPYCGKEMDKGFFYSSKWDYTWTPEGKKPHYWRNFPKEYEVVLKKGWANTLQITVFRC ANCKVMIINEDDC >gi|223714018|gb|ACDT01000197.1| GENE 6 5480 - 5686 68 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMALSAFSYSNIFFMFQNLFLSLSNAETLYYFYYFFEKVCHCGTLFQYYDFVVFAHYIDV FLAGWSTL >gi|223714018|gb|ACDT01000197.1| GENE 7 5958 - 6966 324 336 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01774 NR:ns ## KEGG: EUBELI_01774 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 215 1 223 424 143 41.0 8e-33 MEFSMSITKGLGNQKHNNRTQKETTKNVNHDLTFLNVTLLDEDIRSIYHVLFDPSLKKYN AKQKRNDRKINDYYSKIANSKQEKLFHELVVSIGNISNAIRGDIANEIYTEFLKKFIDSN PQMKVFGAYIHHDEIGTVHMHLDYVPFSIGNKRGLETRVSNDKAIEQMGYRNWADWKDRQ FETLEKICHGHGIERVSMDNHSRHMSVESYKKEQRMIENLQNSLEHTLKTQNLPVIEQIE PHVNIITKKKTVPYDEYLVLLKHDKEQNDKISTLEKQIALQRAQISSLTNEKEKYKQDYL NTKKKAYIEENRRLETRVERLENINDSLSYKSGIQK Prediction of potential genes in microbial genomes Time: Thu May 26 11:11:41 2011 Seq name: gi|223714017|gb|ACDT01000198.1| Coprobacillus sp. D7 cont1.198, whole genome shotgun sequence Length of sequence - 6045 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 41 - 511 292 ## gi|237732660|ref|ZP_04563141.1| predicted protein + Term 760 - 804 5.1 - Term 748 - 791 1.1 2 2 Op 1 . - CDS 878 - 1819 998 ## COG1897 Homoserine trans-succinylase - Prom 1840 - 1899 8.7 3 2 Op 2 . - CDS 1909 - 3153 1344 ## COG0232 dGTP triphosphohydrolase - Prom 3182 - 3241 8.8 - Term 3225 - 3273 0.1 4 3 Tu 1 . - CDS 3274 - 4005 159 ## PROTEIN SUPPORTED gi|163781723|ref|ZP_02176723.1| 50S ribosomal protein L13 - Prom 4114 - 4173 5.7 + Prom 4189 - 4248 7.9 5 4 Tu 1 . 
+ CDS 4269 - 5978 1999 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) + Term 5991 - 6025 2.1 Predicted protein(s) >gi|223714017|gb|ACDT01000198.1| GENE 1 41 - 511 292 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732660|ref|ZP_04563141.1| ## NR: gi|237732660|ref|ZP_04563141.1| predicted protein [Mollicutes bacterium D7] # 1 156 1 156 156 251 100.0 1e-65 MKKNIFHELVTINENISLKYINLAFSDKIMNRVQTIFGGFHLINDTVIGHLSDEKSRKLI NNYNDINLNDSYYELNKDIELFLYQMIKELIYHQSKDTNIYPNWFKSLLLEIQDLNKIKE TKDIFIISPYSKQYTIKKFHEYLNMTPSQYLNKKNR >gi|223714017|gb|ACDT01000198.1| GENE 2 878 - 1819 998 313 aa, chain - ## HITS:1 COG:CAC1825 KEGG:ns NR:ns ## COG: CAC1825 COG1897 # Protein_GI_number: 15895101 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Clostridium acetobutylicum # 1 303 1 301 301 360 57.0 3e-99 MPIKIPNDLPASKILKEENIFVMDENRALAQQIRPLQLLILNIMPTKVVTETQLLRMLSN TPLQIEVDWIHMASHESKNTPQEHLLAFYKTFEEIKDNKYDGLIITGAPVEKLRFEDVDY WLEMEKILEWSKTHVFSSFFICWASQAALHYFYGIEKHELKEKLTGVYFHHTNVDKMKRK ILRGFDYQFYAPHSRYTTIMAEDIATISSLDILAASDEAGVYLIAEKDGSRFFVTGHPEY DPDTLDKEYNRDLAISDQATMPKNYYKNDDMHNEILVKWRSHAYLLFSNWLNYYVYQETP YDLNELESLKLGK >gi|223714017|gb|ACDT01000198.1| GENE 3 1909 - 3153 1344 414 aa, chain - ## HITS:1 COG:RSc2968 KEGG:ns NR:ns ## COG: RSc2968 COG0232 # Protein_GI_number: 17547687 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Ralstonia solanacearum # 36 236 41 228 387 122 37.0 2e-27 MTKNLFKEVAMNETNPNYSKAISRLEPLYQRSNDLRSEFGRDYTRIIFSQAYRRLKHKTQ VFFAVKDDHVCTRSEHVNLVESVSYTIANYLGLNTELTKAIAVGHDLGHAPFGHGGERII NELAKIHGLDSFWHERNSLHLIDEIETLEDNEHHRHNLNLTYAIRDGIISHCGEMNQMSI KKRDEYIDLQNYTSPGQYSPYTWEGCVVKMADKIAYLARDIEDALRLKVLKENKVEVLKA SLNNITSEYHFTAINNGTVVNYFIQDVCSHSNPSDGISLSKEAFEIMKTIMKFNYQNIYL IDRVQIHTNYVKLILNSIFDFFIKYDLMARDSNTNIIDEISKDKDQFPHAINGFIHWLEK YSVMNGAKRDSLYQNKVIYDFINDDKAIIKSILDYLSGMSDAYIIQIFNELISF >gi|223714017|gb|ACDT01000198.1| GENE 4 3274 - 4005 159 
243 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163781723|ref|ZP_02176723.1| 50S ribosomal protein L13 [Hydrogenivirga sp. 128-5-R1-1] # 10 239 11 217 228 65 25 7e-11 MKLIIVENYEEASIEAAKVMLEVVKNNPTANLGLATGSTPIRMYELMIEDHKKNGTSYKD IKSFNLDEYFGLEATHPQSYHYFMNKHLFSGIDINSENVHVPNGAGDIQVSCDDYNKLLA ENPIDIQLLGIGSNGHIGFNEPGTSFDSVTHMIELKESTRQDNARLFFDGKIDEVPTHAI TMGISNILQAKKVLLVACGENKAQPIKVLVEGEKTTDVPASALQDHNDVVVIVDKAAASL LTK >gi|223714017|gb|ACDT01000198.1| GENE 5 4269 - 5978 1999 569 aa, chain + ## HITS:1 COG:SA0935 KEGG:ns NR:ns ## COG: SA0935 COG1080 # Protein_GI_number: 15926670 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Staphylococcus aureus N315 # 1 568 4 572 572 704 62.0 0 MIKGIGASSGIAIAKAYKLVMPDLTVTKVTVEDAEAEIKKFDDAMSQTAKELESIKEAAA KNLSAEEAAVFDAHALVLQDPELKTQVEDKIRNEKINADAALDEVANTFIAMFESMDDDY FRERAADIKDVSRRLLANLLGKSLPNPALIDEEVVIIADDLTPSDTAQLNKNLVRGFATN IGGRTSHSAIMARSLEIPAVVSCKTITDETNDADMVVLDGEAGIVIINPSDDEIKEYQAK REAFIAYKEELKKMKNEKSVTLDDHHVELVANIGSPKDIQGVLDNGGEGVGLFRTEFLYM ESAQLPTEEEQFNVYKEVLEGLEGKPAVVRTLDIGGDKEIEAIDLPKEMNPFLGVRAVRL CFQREDIFRVQLRALLRASVYGDLRIMFPMIATLDEFRKAKGILMEEKEKLISEGIEVSD TLQVGIMIEIPAAAVLADQFAKEVDFFSIGTNDLVQYTFAADRMSSGVSYLYQPFHPSIL RLVKHVIDSAHAEGKWTGMCGEMAGEAIAAPLLIGLGLDEFSMSATSILSQRKLIRSMKK SEMNELAAKAINCGTMEEVVALVKEAVEI Prediction of potential genes in microbial genomes Time: Thu May 26 11:11:54 2011 Seq name: gi|223714016|gb|ACDT01000199.1| Coprobacillus sp. D7 cont1.199, whole genome shotgun sequence Length of sequence - 5094 bp Number of predicted genes - 2, with homology - 0 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - 5S_RRNA 91 - 148 91.0 # BA000028 [D:679269..679384] # 5S ribosomal RNA # Oceanobacillus iheyensis HTE831 # Bacteria; Firmicutes; Bacillales; Bacillaceae; Oceanobacillus. 
- 5S_RRNA 266 - 939 89.0 # AF142677 [R:48033..48709] # 5S ribosomal RNA # Bacillus megaterium # Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus. 1 1 Tu 1 . - CDS 1728 - 1979 197 ## - Prom 2113 - 2172 4.5 + Prom 2585 - 2644 3.3 2 2 Tu 1 . + CDS 2721 - 2936 59 ## + Term 3067 - 3099 -0.6 - SSU_RRNA 3504 - 4988 99.0 # EU530454 [D:1..1486] # 16S ribosomal RNA # uncultured Coprobacillus sp. # Bacteria; Firmicutes; Erysipelotrichi; Erysipelotrichales; Erysipelotrichaceae; Coprobacillus; environmental samples. Predicted protein(s) >gi|223714016|gb|ACDT01000199.1| GENE 1 1728 - 1979 197 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDLRVSVSGRGAFYVRRSGTVRSRGAHRRENAGVSSETWVRIPRTENPRFPEEGSSALGK SGPKARPRGVVDGQQVEIPVLTV >gi|223714016|gb|ACDT01000199.1| GENE 2 2721 - 2936 59 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVLVDSDRIPRVPPYSGAVSCFFTISATGLSPCFAGLPIPFAYNDFLLFADSPTTPILRS VWALPLSLAAT Prediction of potential genes in microbial genomes Time: Thu May 26 11:12:05 2011 Seq name: gi|223714015|gb|ACDT01000200.1| Coprobacillus sp. D7 cont1.200, whole genome shotgun sequence Length of sequence - 4011 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 51 - 1232 1076 ## STER_0748 hypothetical protein 2 1 Op 2 . + CDS 1292 - 1684 286 ## COG1051 ADP-ribose pyrophosphatase + Prom 2068 - 2127 8.2 3 2 Tu 1 . + CDS 2286 - 2456 130 ## gi|237732657|ref|ZP_04563138.1| hypothetical protein MBAG_03308 + Term 2572 - 2606 -0.8 - Term 2608 - 2645 -0.7 4 3 Tu 1 . - CDS 2648 - 2866 182 ## NT01CX_2068 hypothetical protein - Prom 2916 - 2975 6.6 5 4 Tu 1 . 
- CDS 3041 - 3880 469 ## HMPREF0424_0945 hypothetical protein - Prom 3914 - 3973 4.6 Predicted protein(s) >gi|223714015|gb|ACDT01000200.1| GENE 1 51 - 1232 1076 393 aa, chain + ## HITS:1 COG:no KEGG:STER_0748 NR:ns ## KEGG: STER_0748 # Name: not_defined # Def: hypothetical protein # Organism: S.thermophilus_LMD9 # Pathway: not_defined # 2 376 3 375 378 119 27.0 2e-25 MENILVIGNGFDIAHGFNTRYEDFINFCKTIATVYQGFKDNQFIYQNEYNDILKEKLDKV FKEKSVSAKAIKYFSNLTPTSETSQFIKACKENYWLTYVLKNKSMIGDKWSDLEYVIAKQ IEILSYISNNLNWHTREETMLASRYSSNLIELFKIVTEERGNNTDFQHQIELTKKHLYRE LEELTWLLEIYLTRFLNTKTKKIELFKYLPVTKLISFNYTDTYTRMYKKETNTVHYIHGF AAKDRIKEENNMVFGIGSEIKNVTDNDKYDYLEFQKYYQRIVKKTGNNYTKWLNDNEQFY IYFFGHSLDIVDGDVIRKLIHCKKAKVIIFYYNQKALNALVVNLAKILGKDELIQFTNEE KITFYKSDDLEIIKKLKNTMYQTRHEKLKSTVR >gi|223714015|gb|ACDT01000200.1| GENE 2 1292 - 1684 286 130 aa, chain + ## HITS:1 COG:PA2769 KEGG:ns NR:ns ## COG: PA2769 COG1051 # Protein_GI_number: 15597965 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Pseudomonas aeruginosa # 1 116 11 120 136 76 32.0 1e-14 MLIKNNQILLGHRIKDGVDTGGIYEPDTWCLPGGKQEYHETIFEGAIREVKEETNLNISQ IEVFNVVDDIQLNKHYVTIHIIAKNYDGDLKAMEPDKQDEWCWFEIEKLPNNIYSPSKKF IEAYLDRSIT >gi|223714015|gb|ACDT01000200.1| GENE 3 2286 - 2456 130 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732657|ref|ZP_04563138.1| ## NR: gi|237732657|ref|ZP_04563138.1| hypothetical protein MBAG_03308 [Mollicutes bacterium D7] # 1 56 1 56 56 82 100.0 1e-14 MSLLFYIENFVVVGRTYPLIYTEEKEKGKTLFVEILLIGVGLAMATLLYQFIKLKE >gi|223714015|gb|ACDT01000200.1| GENE 4 2648 - 2866 182 72 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_2068 NR:ns ## KEGG: NT01CX_2068 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 1 71 55 125 126 89 57.0 3e-17 MGQISGHNFNFNLHGITGSIALLLMAFHAVWATVILIKKNDRAKKTFHKFSIIVWSIWLI PYIIGIYIGMNG >gi|223714015|gb|ACDT01000200.1| GENE 5 3041 - 3880 469 279 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0424_0945 NR:ns ## KEGG: 
HMPREF0424_0945 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis # Pathway: not_defined # 1 135 1 135 146 142 62.0 2e-32 MKVKKHNLLLIASIVWLIAGFNILKIGIETYVGYTKLLNFFLSIIVFIIFWFAIFYKLTK KHTHRIHSYEIEKQFFLNFFDLKSFIIMAFMIIFGITIRTFNLLPDRFIAIFYTGLGAAL FLAGIIFGLNYYKSLNKTLDYSPKSLINIAIIYFILAMAGGVFYREFTKFYAYSMPTVLS VIHPHLLILGTLLFIILAVIAKVTNIQNNRLFKKFVIIYNFSLPFMILTMLIRGILQITN TAINSLIDKMLSGFAGLSHITMMIALLILLISLKKEFTD Prediction of potential genes in microbial genomes Time: Thu May 26 11:12:21 2011 Seq name: gi|223714014|gb|ACDT01000201.1| Coprobacillus sp. D7 cont1.201, whole genome shotgun sequence Length of sequence - 2890 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 451 - 774 301 ## COG1733 Predicted transcriptional regulators - Prom 931 - 990 11.0 + Prom 759 - 818 9.5 2 2 Op 1 . + CDS 976 - 1836 805 ## COG0778 Nitroreductase + Prom 1840 - 1899 7.3 3 2 Op 2 . 
+ CDS 1941 - 2768 783 ## COG1737 Transcriptional regulators Predicted protein(s) >gi|223714014|gb|ACDT01000201.1| GENE 1 451 - 774 301 107 aa, chain - ## HITS:1 COG:BH0737 KEGG:ns NR:ns ## COG: BH0737 COG1733 # Protein_GI_number: 15613300 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 10 106 14 110 118 109 50.0 1e-24 MIKKENLPECPVATTVELIGSKWKLLILKYLLNKTMRYNELKREIDGISQKVLTSTLKSM VEDGIVIRTSYPEVPPRVEYSLSEIGESMRPVIDVMADWGNTYKNKK >gi|223714014|gb|ACDT01000201.1| GENE 2 976 - 1836 805 286 aa, chain + ## HITS:1 COG:CAC3483_2 KEGG:ns NR:ns ## COG: CAC3483_2 COG0778 # Protein_GI_number: 15896720 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 92 272 1 174 185 66 28.0 6e-11 MNITNFKWSLEKCIHCGKCLKVCPGDLFIFDDKNELKIKEIDCFGWDGCWQCDHCLAVCP TGAISIFNKDPQNSVTLPDYQDAGKMMSALVRTRRACRRYLNKDVDKKIIKELMGSIQYA PTGGNKRKLEFTIIDDCKEMDYFRLLVRDEMIKLANNGIYPQGFDKTSYNQMMAWEKSVR PDSIFCGAPHILIPHAPKNIPCAVQDVNMAAAYFELLCNANGLGAICMSYPLAVLGNMPN VMKLLQIPEDHYISLMVGFGYPEIRYARGVQREANGKIKRIVFTQD >gi|223714014|gb|ACDT01000201.1| GENE 3 1941 - 2768 783 275 aa, chain + ## HITS:1 COG:PM1577 KEGG:ns NR:ns ## COG: PM1577 COG1737 # Protein_GI_number: 15603442 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pasteurella multocida # 3 251 4 258 286 70 26.0 3e-12 MNLVDLLNDSKQFNETQNIICQYILNHSEDVVKMSARALAKETYTNASTIIRFVQKIGYE NYNDFKIHLVHDLKEYQAADIKITEKEKSISIVDKISELEKNVIEKTKNQLSLQQIEYIA ELLKKTTYIDFISSDANACIADYACHLFFLERKIANNYTTSNQQLYLTLTELKEHVVFVI SRRGEDEKILKVVQELHKNKEIKIIAITGRKESPIARYCDEILSAIHIGSFVELRDMIFQ VSAQYIINCLFSLLFTDDYQSIVQFNDEYEKIYLK Prediction of potential genes in microbial genomes Time: Thu May 26 11:12:23 2011 Seq name: gi|223714013|gb|ACDT01000202.1| Coprobacillus sp. 
D7 cont1.202, whole genome shotgun sequence Length of sequence - 2177 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 430 358 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 2 2 Tu 1 . - CDS 452 - 1309 849 ## COG0583 Transcriptional regulator - Prom 1338 - 1397 2.8 + Prom 1360 - 1419 5.8 3 3 Op 1 . + CDS 1439 - 1675 251 ## gi|167756299|ref|ZP_02428426.1| hypothetical protein CLORAM_01832 4 3 Op 2 . + CDS 1708 - 1965 329 ## EUBELI_00178 MerR family transcriptional regulator, mercuric resistance operon regulatory protein Predicted protein(s) >gi|223714013|gb|ACDT01000202.1| GENE 1 2 - 430 358 142 aa, chain + ## HITS:1 COG:BH0909 KEGG:ns NR:ns ## COG: BH0909 COG1440 # Protein_GI_number: 15613472 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Bacillus halodurans # 50 141 2 96 98 59 32.0 2e-09 KDSDEILFYLHFQMHDFKSTMDQFNTFFNYLCLPTIKLGESIFNNVNTNKILLSCSGGLT TSYFAYSMQEIFKRQGLDIVVDAVGYMEIDKVIDDYDMVLLAPQVAYLLPQLKNKYGNKI FIIDSLDFATNDFNAIIKKAVN >gi|223714013|gb|ACDT01000202.1| GENE 2 452 - 1309 849 285 aa, chain - ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 284 1 283 291 278 48.0 7e-75 MELRVLKYFITVAREESISAAADYLHMTQPTLSRQLKDLENELGKQLLIRGNRRITLTDD GILFRKRAEEIIDLVNKTEAEMTSDNETISGDIYIGGGETEGMRLITKAIKETQKIHPEI KFHLYSGNAQDVTEKLDKGLLDFGILIEPTNFSKYDFIKLPYTDCWGVLMKKDAFLASKE YINPQDLKDLPLICSNQDLVRNELSGWLKDDFDKLNIVATYNLIYNASLLVDEGSGYALT LDKLINTYNSTLCFKPLEPKLEVGLDLVWKKYQIFSKAADFFLKK >gi|223714013|gb|ACDT01000202.1| GENE 3 1439 - 1675 251 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756299|ref|ZP_02428426.1| ## NR: gi|167756299|ref|ZP_02428426.1| hypothetical protein CLORAM_01832 
[Clostridium ramosum DSM 1402] # 1 78 1 78 78 132 100.0 8e-30 MCSNDTPEAVYQNLLDAGCDSKLVDRFMLLFAHKNYQGQLQILSEQRKELLDELHLVQKE LDCLDYLIYKIKKLYLND >gi|223714013|gb|ACDT01000202.1| GENE 4 1708 - 1965 329 85 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00178 NR:ns ## KEGG: EUBELI_00178 # Name: not_defined # Def: MerR family transcriptional regulator, mercuric resistance operon regulatory protein # Organism: E.eligens # Pathway: not_defined # 3 79 2 78 145 68 44.0 1e-10 MGYTIREVSEILKIPVSTIRYYENLGLLPVIQRIEGKRVFTNENITDLKRIKKLKDMGMT LKDIKNFNQLYSKGKLSFEQKNYDT Prediction of potential genes in microbial genomes Time: Thu May 26 11:12:30 2011 Seq name: gi|223714012|gb|ACDT01000203.1| Coprobacillus sp. D7 cont1.203, whole genome shotgun sequence Length of sequence - 1831 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 755 591 ## gi|237732643|ref|ZP_04563124.1| hypothetical protein MBAG_03319 + Prom 794 - 853 7.9 2 2 Op 1 . + CDS 1086 - 1334 224 ## MARTH_orf236 putative DNA processing protein, smf family 3 2 Op 2 . + CDS 1331 - 1540 178 ## gi|237732645|ref|ZP_04563126.1| predicted protein + Prom 1584 - 1643 8.5 4 3 Tu 1 . 
+ CDS 1673 - 1829 56 ## Predicted protein(s) >gi|223714012|gb|ACDT01000203.1| GENE 1 3 - 755 591 250 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732643|ref|ZP_04563124.1| ## NR: gi|237732643|ref|ZP_04563124.1| hypothetical protein MBAG_03319 [Mollicutes bacterium D7] # 1 250 1 250 250 455 99.0 1e-126 IRDIFIEHFTLSTIDEKKKKFIRACQYNYWIEYINSSDKADPNWCAFEQKITYHLEQLCW LETHFKEFESGQFEEIADYKEIYQLGRSLQLGYDNYYTKIDQFRIDLLSSLEELIYILNY YLSYFLNLKLERISLFESFTINKVVNFNYTSTFEHLYTNENARIDIHHIHGVLNDCDDEF ECQNNIIFGVGTELKYSNEESKIKYLDFQKYFQRIIKKTGIEYAQWLTEVNENVNIIFFG HSLDVPDGIL >gi|223714012|gb|ACDT01000203.1| GENE 2 1086 - 1334 224 82 aa, chain + ## HITS:1 COG:no KEGG:MARTH_orf236 NR:ns ## KEGG: MARTH_orf236 # Name: not_defined # Def: putative DNA processing protein, smf family # Organism: M.arthritidis # Pathway: not_defined # 5 80 1 79 258 65 45.0 6e-10 MNSVMKKVLYYFALKYNGDFDKIYSSLRTREKFNIDEFIRLKIDIQYQHITILDDKYPNY LKGVESPPFALFYEGNLNLIKI >gi|223714012|gb|ACDT01000203.1| GENE 3 1331 - 1540 178 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732645|ref|ZP_04563126.1| ## NR: gi|237732645|ref|ZP_04563126.1| predicted protein [Mollicutes bacterium D7] # 1 69 1 69 69 115 100.0 7e-25 MTSYIKYVVLESGGRLISTVNSIQKCNRIEFDYIVACENQKDLKDMLEHIKSKCLMKNYE KKKDKGMER >gi|223714012|gb|ACDT01000203.1| GENE 4 1673 - 1829 56 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDWLDYRDHLKIGLNDDEKANLFIIKMFNFLNCIDDDIAMIVDIGDYYTFCS Prediction of potential genes in microbial genomes Time: Thu May 26 11:12:51 2011 Seq name: gi|223714011|gb|ACDT01000204.1| Coprobacillus sp. D7 cont1.204, whole genome shotgun sequence Length of sequence - 1727 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 40 - 327 140 ## gi|237732641|ref|ZP_04563122.1| conserved hypothetical protein + Term 353 - 399 -0.9 + Prom 531 - 590 13.1 2 2 Tu 1 . 
+ CDS 616 - 1620 1114 ## Aflv_1639 DEAD-like helicase domain fused to uncharacterized conserved domain Predicted protein(s) >gi|223714011|gb|ACDT01000204.1| GENE 1 40 - 327 140 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732641|ref|ZP_04563122.1| ## NR: gi|237732641|ref|ZP_04563122.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 95 1 95 95 177 100.0 2e-43 MVECKNWSSRVGVHQIRNIAHISNLKGNRTAILFASNGITSDAQREIYRLAIEGCFIVCV TVDDLLQMTSAKECKILILDKWNLLQENVELTTIL >gi|223714011|gb|ACDT01000204.1| GENE 2 616 - 1620 1114 334 aa, chain + ## HITS:1 COG:no KEGG:Aflv_1639 NR:ns ## KEGG: Aflv_1639 # Name: not_defined # Def: DEAD-like helicase domain fused to uncharacterized conserved domain # Organism: A.flavithermus # Pathway: not_defined # 1 310 1 314 649 308 50.0 3e-82 MIVYQETKEQFLLDVYNDKIDEKIEKLVFERLNRRTGYSEYQTWMNSMQYMYKVLENKAI PSNSIVAIEYKVPNSNKRIDFIVSGEDEQGRESAIIIELKQWQSLNKIENMDGLVETKLN GHLTPVAHPSYQAWSYVSLIEDYNEDVRKYKINLQPCAYLHNYRKKEYDDLVDLCYEYYL DKAPVFTRGDTKNYTMFIARYVKKGNPDVMYHIDSGKIKPSKSLQDSLTNMLKGKPEFTL LDEQKVTYEKALAIAKENSARKQVYIIHGGPGTGKTVIAVNMLVELINRDKNTIYVTKNS APREVYQKKLTDGGYKRFILIIYLKVQVVFNKCE Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:02 2011 Seq name: gi|223714010|gb|ACDT01000205.1| Coprobacillus sp. D7 cont1.205, whole genome shotgun sequence Length of sequence - 1591 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 + CDS 3 - 470 258 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 2 1 Op 2 . + CDS 470 - 946 626 ## COG5017 Uncharacterized conserved protein 3 1 Op 3 . 
+ CDS 957 - 1590 462 ## EUBELI_00098 hypothetical protein Predicted protein(s) >gi|223714010|gb|ACDT01000205.1| GENE 1 3 - 470 258 155 aa, chain + ## HITS:1 COG:MA2171 KEGG:ns NR:ns ## COG: MA2171 COG0707 # Protein_GI_number: 20091013 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Methanosarcina acetivorans str.C2A # 2 154 19 162 163 89 34.0 2e-18 TRVIFTSSAGGHFSELCELSELMERYNSFLITEDHEMMKEYKKTNKSRSWYMPAGTKEHL FKFLCNFPINIFKSFKAYLKVKPDVIIATGAHTTVPICYIAKLFGKKVIFIETFANITTK TLSGKLVYPIADLFLVQWEEMLELYPKAKYRGGLK >gi|223714010|gb|ACDT01000205.1| GENE 2 470 - 946 626 158 aa, chain + ## HITS:1 COG:MA2172 KEGG:ns NR:ns ## COG: MA2172 COG5017 # Protein_GI_number: 20091014 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 3 135 2 134 155 65 32.0 6e-11 MRIFVMFGTQDKRFDRLLNAILNSNFVNENEVYVQLGYTKGDYSGINGQEYYTEDELSHQ IEIADLIITHAGVGAIVSALKLKKRVIVVPRLGQYKEQNNDHQVQIMERFDKQGYIIPCT DLSKLDETVNNAYNFEPKEYVADKQGIIDEITDFINTL >gi|223714010|gb|ACDT01000205.1| GENE 3 957 - 1590 462 211 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00098 NR:ns ## KEGG: EUBELI_00098 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 211 10 217 270 171 45.0 2e-41 MHTFVVLAYKESSYLEECIKSVLNQKYPSKVVIATSTPNQYIENIADKYSLAIKVNPNPG KGIGYDFDFAVSCGETELTTVAHQDDIYDYEYSDCIVKAYLKNPNSLIVFSDYYEIRDKG NVYSNINLKIKRILLLPMRSKKISSTKFGKRLSLRFGNSICCPAVTFVIDNIKSKDIFKC DFVCDVDWFAWEKLSLKDGKFTFVNNPLMGH Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:06 2011 Seq name: gi|223714009|gb|ACDT01000206.1| Coprobacillus sp. D7 cont1.206, whole genome shotgun sequence Length of sequence - 1423 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 13 - 1423 1578 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase Predicted protein(s) >gi|223714009|gb|ACDT01000206.1| GENE 1 13 - 1423 1578 470 aa, chain + ## HITS:1 COG:CAP0010 KEGG:ns NR:ns ## COG: CAP0010 COG2723 # Protein_GI_number: 15004715 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 470 1 469 469 632 62.0 0 MLHKKLKPFPSDFLWGASTSAYQVEGANLIDGKGPSCQDVKKVPEGTSELDVCADQYHRY KEDIALMAEMGFKTYRFSIAWTRILPNGTGEVNPKGIEYYNNVINECLKYGIEPLVTMFH FDMPAALDERGSWGNPESVDWFVNFAKVMYENYGDRVKYWLTINEQNMLTLVGPVIGTLH LPEGCTNEIKEIYQQNHHMLVAQAKAMALCHEMIPGAKIGPAPNISLVYPASCKPEDVLA AQNYNAIRNWLYLDMAVYGVYNNLVWAYLEEHDACPTFAPGDAEALKNGHPDFIGFNYYN TATCEASDGTETMDPGADQQTARGEAGFYRGFKNPNLPTTEFGWEIDPMGFRATIREMYS RYRLPMIVTENGLGAYDKLTEDGKVHDQYRIEYLRKHLEQVQLAITDGCEMMGYCPWSAV DLISTHEGMVKRYGFIYVDREEFDLKTLDRYRKDSFYWYKKVIATNGDDL Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:06 2011 Seq name: gi|223714008|gb|ACDT01000207.1| Coprobacillus sp. D7 cont1.207, whole genome shotgun sequence Length of sequence - 1356 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 1355 1112 ## COG4932 Predicted outer membrane protein Predicted protein(s) >gi|223714008|gb|ACDT01000207.1| GENE 1 2 - 1355 1112 451 aa, chain + ## HITS:1 COG:BH2014 KEGG:ns NR:ns ## COG: BH2014 COG4932 # Protein_GI_number: 15614577 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Bacillus halodurans # 22 408 1019 1365 1816 71 27.0 4e-12 NWIDGSQISINKTDNYGENISNAKFDLLQWNKTTRQYEKLREFSYNSSNKLYESDFLEKT DLNEGKFKVEETVPNGYTASSRYEQEFTLDGYIDRSVFTAEIEGKEVNVKLSSTLADNHY IKYEVDGVPDSVTEIKFPTWSVAGGQDDIDWVHLAKESDGVWRANKQMPESGQYVIHIYY NTAVQQNIYSHAVNFWPSDTTINVVNNRIMGKISVDKLDYHAGEKLANATFDVIARNDIR TPQGTVIYRQGEVVGNLTTDENGYGELDNLNLGEYTLKESAVPDGFRYDDSTYDFTITAD KTDSKLHMGLDVTWEINNYPTFINVYKVDKDSGKKLENAEFDLYNVTDKKKVGTYKTDKN GNIAVFYLSRQKTYYLQETNAPDSYKLNDTKYYFYVDEKGAFSVSDMNGTVEDGTFNVPF HGTMTITVKNEIDICNLRITKKNDNSKVLEN Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:07 2011 Seq name: gi|223714007|gb|ACDT01000208.1| Coprobacillus sp. D7 cont1.208, whole genome shotgun sequence Length of sequence - 1334 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 1313 1387 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases Predicted protein(s) >gi|223714007|gb|ACDT01000208.1| GENE 1 2 - 1313 1387 437 aa, chain - ## HITS:1 COG:CAC0533 KEGG:ns NR:ns ## COG: CAC0533 COG1486 # Protein_GI_number: 15893823 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Clostridium acetobutylicum # 3 437 2 435 441 677 74.0 0 MEKKYSVVIAGGGSTFTPGICLLLLDHMDRFPIRKIKFYDNFAERQETIAKACEIYLKEN APEVEFAYTTDPEEAFTDVDFVMCHIRVGLYAMREKDEKIPMKYNVVGQETCGPGGIAYG MRSIGGVIEILDYMEKYSPNAWMLNYSNPAAIVAEATRRLRPDSKILNICDMPIDLEEKM ANMVGLKSRKEMSVGYYGLNHFGWWHKIEDKDGNDLMPEIKKHMAANGFADALSGTNQHV EQSWIETFAKAKDVYALDPETIPNTYLKYYLYPDYVVEHTNPEHTRANEVMEHREKNIFT ACRKIIEKGTAVDGGFEPDAHAEYIVDLACAIAKNTKEKMLLIVPNEGAVENFDRTAMVE IPCIVGSNGYERICQGSIPQFQKGLMEQQVSVEKLVVDAWITGSYQKLWQAITLSKTVPS ARVAKLILDDLIEANKD Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:08 2011 Seq name: gi|223714006|gb|ACDT01000209.1| Coprobacillus sp. D7 cont1.209, whole genome shotgun sequence Length of sequence - 1328 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 6 - 299 266 ## gi|167756917|ref|ZP_02429044.1| hypothetical protein CLORAM_02466 - Prom 337 - 396 10.2 - Term 381 - 430 9.1 2 2 Tu 1 . - CDS 443 - 592 87 ## gi|167756916|ref|ZP_02429043.1| hypothetical protein CLORAM_02465 - Prom 641 - 700 7.4 - Term 634 - 672 -0.1 3 3 Tu 1 . 
- CDS 874 - 1305 374 ## COG2017 Galactose mutarotase and related enzymes Predicted protein(s) >gi|223714006|gb|ACDT01000209.1| GENE 1 6 - 299 266 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756917|ref|ZP_02429044.1| ## NR: gi|167756917|ref|ZP_02429044.1| hypothetical protein CLORAM_02466 [Clostridium ramosum DSM 1402] # 1 93 1 93 163 169 100.0 4e-41 MKKIALILICCLVLIGCKNKDNNDNKFYRQYKEYENKIDNHNEFLNATNEFNIRLVVNKV EDKKYRYDVIIDTPTINMYHLQAIAKVAGDDNEVYQP >gi|223714006|gb|ACDT01000209.1| GENE 2 443 - 592 87 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756916|ref|ZP_02429043.1| ## NR: gi|167756916|ref|ZP_02429043.1| hypothetical protein CLORAM_02465 [Clostridium ramosum DSM 1402] # 1 49 1 49 49 63 100.0 4e-09 MAKGVCCHVDTCVYNHDCNCEAKEITVCNCKCHEAKDIQETACETFKCK >gi|223714006|gb|ACDT01000209.1| GENE 3 874 - 1305 374 143 aa, chain - ## HITS:1 COG:CAC3032 KEGG:ns NR:ns ## COG: CAC3032 COG2017 # Protein_GI_number: 15896283 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Clostridium acetobutylicum # 8 143 164 298 298 90 34.0 7e-19 MENESSNDYYLEFSENETVNQKIIDVEHKGMSDITLPFLDNEKRFFVRQQLFDNDAIVLK DVKSKTLSLKSLNHNKSLVFHMEGFNHLGIWAAKHVGGLIAIEPWVGHSDYVGFTGEFKD KEGVVSLNPSEIFECTFKVEINQ Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:18 2011 Seq name: gi|223714005|gb|ACDT01000210.1| Coprobacillus sp. D7 cont1.210, whole genome shotgun sequence Length of sequence - 1238 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:18 2011 Seq name: gi|223714004|gb|ACDT01000211.1| Coprobacillus sp. D7 cont1.211, whole genome shotgun sequence Length of sequence - 1148 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 456 - 515 6.5 1 1 Tu 1 . 
+ CDS 571 - 1147 301 ## gi|237732631|ref|ZP_04563112.1| conserved hypothetical protein Predicted protein(s) >gi|223714004|gb|ACDT01000211.1| GENE 1 571 - 1147 301 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732631|ref|ZP_04563112.1| ## NR: gi|237732631|ref|ZP_04563112.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 192 1 192 192 326 100.0 6e-88 MENRNNKQYFIQNDKLKIDSFLQIPKLLFKIPKYKKLSLTGRIMYSLYLNRYNDTKYRDS EGPYIIFGDAELQEFLGISRATCIRNKKELVNLNLIKIKKTTGYNKIYLMNYRNPDNNEF YTEEDLDSYAFYRFPRVFFDEQFDELTLNAKFLYTYYFDWMCLSQMNYIIDDYDRIYFRE SNKDQEANLLLN Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:27 2011 Seq name: gi|223714003|gb|ACDT01000212.1| Coprobacillus sp. D7 cont1.212, whole genome shotgun sequence Length of sequence - 1124 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 51 - 740 698 ## CD2968 dipicolinate synthase subunit A 2 1 Op 2 . 
+ CDS 733 - 1123 272 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase Predicted protein(s) >gi|223714003|gb|ACDT01000212.1| GENE 1 51 - 740 698 229 aa, chain + ## HITS:1 COG:no KEGG:CD2968 NR:ns ## KEGG: CD2968 # Name: dpaA, spoVFA # Def: dipicolinate synthase subunit A # Organism: C.difficile # Pathway: not_defined # 69 227 125 288 291 72 31.0 1e-11 MIADGYQIQEDSRLISSCDIIYLGKDGQGFEQVDFKNNAVVLSLLKNQRLCYLSKLKGFN YRYLYSDEDFVVENTHISDEAVIAYMIIDNSISLSNSNVLILGYGHCGRDLAAKLEKFNA KVSISNRSDHYHDEVVEKGYRYIRLDQLTLNGYDFVINTIPSQIIDSDILKTKDVNCKIY DVASTPYGLKTCDRDESYHLLKQLPTKYAYDSSAKALYKAIKRAVDQYA >gi|223714003|gb|ACDT01000212.1| GENE 2 733 - 1123 272 130 aa, chain + ## HITS:1 COG:BS_spoVFB KEGG:ns NR:ns ## COG: BS_spoVFB COG0452 # Protein_GI_number: 16078737 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus subtilis # 2 130 4 132 200 102 38.0 2e-22 MLRNKKIGIGLTGSFCSLQKTLAVIKELAALECDLYIFASEKILNCNTRFNKADELIDEL EKLSKRKVITNVVDSEIFGPKIPLDIMIVMPCSGNTLAKLAIGINDNAVTMACKSTLRNE HNVVLAIATN Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:31 2011 Seq name: gi|223714002|gb|ACDT01000213.1| Coprobacillus sp. D7 cont1.213, whole genome shotgun sequence Length of sequence - 1104 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 1102 736 ## COG2509 Uncharacterized FAD-dependent dehydrogenases Predicted protein(s) >gi|223714002|gb|ACDT01000213.1| GENE 1 1 - 1102 736 367 aa, chain + ## HITS:1 COG:BH1470 KEGG:ns NR:ns ## COG: BH1470 COG2509 # Protein_GI_number: 15614033 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Bacillus halodurans # 2 367 62 432 480 501 67.0 1e-141 IRNNEPGCLPACSITAGFGGAGAYSDGKFNITSEFGGWLTDYLSTNEVEDVINYVDNLYL KHGATREITDPTTDKVKEIEHRGYAVGLKLLRAKVRHLGTEENLRIMTEMSTELKEHIDM AFRTAVKEVLVEDGHAVGVVLENDEVIKAKKIVLAPGRDGSAWLTKVLKQHGLELYNNQV DIGVRVETSNIVMEEINSNLYEGKFVYNTSVGTKVRTFCSNPSGHVVIENHSGTMLANGH AYHDPKLGSKNTNFALLVSHTFSEPFNEPNEFAHEVSRLANKLSNGSVMVQRYGDIKKGR RTTYKRLKEGYTEPTLAEAVPGDLGLVLPYNTMKSIIEMIEALDNVTPGIANEHTLLYGV EAKFYSA Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:32 2011 Seq name: gi|223714001|gb|ACDT01000214.1| Coprobacillus sp. D7 cont1.214, whole genome shotgun sequence Length of sequence - 1018 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 25 - 84 1.8 1 1 Tu 1 . + CDS 105 - 536 302 ## BT_4544 transposase + Term 586 - 620 5.5 - Term 575 - 607 4.2 2 2 Tu 1 . 
- CDS 636 - 1016 79 ## gi|237732627|ref|ZP_04563108.1| hypothetical protein MBAG_03338 Predicted protein(s) >gi|223714001|gb|ACDT01000214.1| GENE 1 105 - 536 302 143 aa, chain + ## HITS:1 COG:no KEGG:BT_4544 NR:ns ## KEGG: BT_4544 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 167 301 308 86 36.0 3e-16 MQRIKNIYSDSPKTIIIEEAPKSSTSSRKIPIPGNLISLFRIFHSSDDCYLLTGRADRFV EPRSLERKFRNYIKTAGIKKANFHMLRHTFATMCIEGGFEIKCLSEILGHSGSQITLDRY VHSSFDLKKSWIDKFSNQNITCH >gi|223714001|gb|ACDT01000214.1| GENE 2 636 - 1016 79 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732627|ref|ZP_04563108.1| ## NR: gi|237732627|ref|ZP_04563108.1| hypothetical protein MBAG_03338 [Mollicutes bacterium D7] # 5 126 1 122 122 225 100.0 9e-58 KFFHMKAKASKERILNTDGKTTFSDLSKDITLKSEKINYDEDNHRLKWLNIISGNIKNNI TGIYHGVTKGSLPLFLHEQEWRFNHRNTGKSIMEKVSKYITKSFPINMEKLSQILDLSKS YFSPCV Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:42 2011 Seq name: gi|223714000|gb|ACDT01000215.1| Coprobacillus sp. D7 cont1.215, whole genome shotgun sequence Length of sequence - 1007 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 590 351 ## gi|237732624|ref|ZP_04563105.1| conserved hypothetical protein 2 1 Op 2 . 
- CDS 646 - 921 186 ## gi|237732625|ref|ZP_04563106.1| conserved hypothetical protein - Prom 941 - 1000 3.9 Predicted protein(s) >gi|223714000|gb|ACDT01000215.1| GENE 1 2 - 590 351 196 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732624|ref|ZP_04563105.1| ## NR: gi|237732624|ref|ZP_04563105.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 196 1 196 196 338 100.0 1e-91 MKKKIFNPVLNINLEKEATVTLIPYKDTIKIIVKSQNFNFSYECEETDIDKEYTLNLLSF QKALNEVSGKINFKIYIGKVIDENGQDLTVLEETEEQNFLIKFEEKELFIKTALIQNLSR HQNVRMKLALYPALGDLVIANTKNGVAFNSSDGMSVLQTRIDCEFKLQYSYLCPFELMKN TNLKKLKITKADENSK >gi|223714000|gb|ACDT01000215.1| GENE 2 646 - 921 186 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732625|ref|ZP_04563106.1| ## NR: gi|237732625|ref|ZP_04563106.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 91 29 119 119 144 100.0 2e-33 MVEIDKYLNKHSSIEEVQLFDYFYFICPKVNSKYSIAKSNIKKINLSQLNILRKYYNLND LENLSEEEADYLISSLGIFRYKQASNSQVKV Prediction of potential genes in microbial genomes Time: Thu May 26 11:13:55 2011 Seq name: gi|223713999|gb|ACDT01000216.1| Coprobacillus sp. D7 cont1.216, whole genome shotgun sequence Length of sequence - 992 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 332 353 ## BLJ_0861 peptidase C26 - Prom 365 - 424 12.2 + Prom 395 - 454 9.1 2 2 Tu 1 . 
+ CDS 485 - 992 588 ## gi|167756867|ref|ZP_02428994.1| hypothetical protein CLORAM_02416 Predicted protein(s) >gi|223713999|gb|ACDT01000216.1| GENE 1 2 - 332 353 110 aa, chain - ## HITS:1 COG:no KEGG:BLJ_0861 NR:ns ## KEGG: BLJ_0861 # Name: not_defined # Def: peptidase C26 # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 24 110 26 113 260 65 41.0 4e-10 MNKKILTPMRSYSQRRVYYIYREYFRMLQTANLDPIIIGPSSDDTLDFLVTHCDGLLLSG GFDIDPVLYHQVLNPLTNKEEAELEELEIKLIHKFSKAHKPILGICRGIQ >gi|223713999|gb|ACDT01000216.1| GENE 2 485 - 992 588 169 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167756867|ref|ZP_02428994.1| ## NR: gi|167756867|ref|ZP_02428994.1| hypothetical protein CLORAM_02416 [Clostridium ramosum DSM 1402] # 1 166 20 185 598 253 99.0 4e-66 MDFLDNILKKFGTKNVIIVGVVLIVTIVILITYWMIKLRIYRKEIVVLENDMNAIKTLPI QYRLGRIKAIGKNMPDVLEKYDDFELEFNDLVNLQSNEIAPLINDIDERLFYRKLKGVRR DLNKLRQDIDNYEKRSKALLKEIEVITEIENVQRVEIIKIKEKFRVGTE Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:05 2011 Seq name: gi|223713998|gb|ACDT01000217.1| Coprobacillus sp. D7 cont1.217, whole genome shotgun sequence Length of sequence - 992 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 43 - 102 10.1 1 1 Tu 1 . 
+ CDS 130 - 991 1173 ## COG0172 Seryl-tRNA synthetase Predicted protein(s) >gi|223713998|gb|ACDT01000217.1| GENE 1 130 - 991 1173 287 aa, chain + ## HITS:1 COG:PH0710 KEGG:ns NR:ns ## COG: PH0710 COG0172 # Protein_GI_number: 14590588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Pyrococcus horikoshii # 1 282 6 305 460 193 35.0 3e-49 MIDIAILRENPTLVKENMKKKFQDDKIGLVDEAYKLDKDYREVLTKASELRSKRNKLSKE IGQFMREGNKDAAENNKKQVSAMADELAKLEDLETQYGSELKAIMMKIPNFIDESVPLGK DDSENVEITKYGEPVVPDFEIPYHTEIMENFDGIDMDAAGRVAGQGFYYLMGDIARLHSA ILSYARDFMIDRGFTYVIPPYMIRSNVVTGVMSFEEMDSMMYKIEGEDLYLIGTSEHSMI GKFIDNILEEDKLPYTYTSYSPCFRKEKGAHGIEERGVYRIQTLSVP Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:06 2011 Seq name: gi|223713997|gb|ACDT01000218.1| Coprobacillus sp. D7 cont1.218, whole genome shotgun sequence Length of sequence - 916 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 29 - 914 580 ## Amet_0211 hypothetical protein Predicted protein(s) >gi|223713997|gb|ACDT01000218.1| GENE 1 29 - 914 580 295 aa, chain + ## HITS:1 COG:no KEGG:Amet_0211 NR:ns ## KEGG: Amet_0211 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 10 291 68 356 369 172 36.0 1e-41 MYHILFNNGYIRFNFKGFESFLSNNLEMTSEVSTKKDLSSLNDLYDYFITGSDQVFNPYC SGFDGNYFLSFVSDKNMKFSYAASVGLENIPVELENYYKDYLNSFCRISIREITGANEIK RVCGIECSTNIDPTLLLDKSDWEKLMADLPSNADTPYLLLYALSEDKNMLKFAKKIAKRK KLKVIYINDRLFRPKGMLSLRNVSPEQWLRLFANANSIVTNSFHGIAFSINFEKEFYPFY LNKNTRVNSRIRDLLDLLNLQSLVINDNNDTLMNENIDYSEAREILKNRKEINLL Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:11 2011 Seq name: gi|223713996|gb|ACDT01000219.1| Coprobacillus sp. 
D7 cont1.219, whole genome shotgun sequence Length of sequence - 864 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 49 - 798 480 ## Dtox_1535 putative transcriptional regulator Predicted protein(s) >gi|223713996|gb|ACDT01000219.1| GENE 1 49 - 798 480 249 aa, chain - ## HITS:1 COG:no KEGG:Dtox_1535 NR:ns ## KEGG: Dtox_1535 # Name: not_defined # Def: putative transcriptional regulator # Organism: D.acetoxidans # Pathway: not_defined # 6 175 48 209 523 95 34.0 1e-18 MANNTTVDMQDGYIIFGIEDKTFKITGVIDDINRKNQENIIGFLNSQIWSGEEIPDIEVK NIQIYGKEIDVLIIHNSNVTPYYLLKDYVKNTPGKDKTIVHAEVIYSRVEDRNTSSAECA TKQATEFLWKKRFGLVGTDSDKVTKRLKNKNHWYSTDEYETFHNSEYGDIVIKQDHNYNL EVNIGKDKSETQIWVMDFPYLFANVFNWNIGEGEIGRRAKWDIFLNGRVLDISLYGCSIH KTILLSNRA Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:15 2011 Seq name: gi|223713995|gb|ACDT01000220.1| Coprobacillus sp. D7 cont1.220, whole genome shotgun sequence Length of sequence - 833 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 833 981 ## CLB_1182 hypothetical protein Predicted protein(s) >gi|223713995|gb|ACDT01000220.1| GENE 1 2 - 833 981 277 aa, chain + ## HITS:1 COG:no KEGG:CLB_1182 NR:ns ## KEGG: CLB_1182 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 19 277 70 330 579 269 49.0 9e-71 TTYTYLREVKSESVEIVSDLRTIFKLLLVMSVEEVLSSYPHLSGYFRDRDEFVKVIEEVY LFWRKLQRYSVVYNENTSRGYQNVQFGDAQAKFEELVLSVYRTIEETALGYRHHVYRQIT AGVNAGLRVTNMRSFLPYEYRNLDDIPFIETVIIHPPFITYSKRNKRDGVFPETKHNPIE DLKFDSEQWFCYPAKVGDLLAFVYFNVAFMAQGVALCNLFELASEEEYRNRKPDIIYVYG YEDGKMNQSFYQDDENDMMVALLSASDDFDYFGYMKK Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:19 2011 Seq name: gi|223713994|gb|ACDT01000221.1| Coprobacillus sp. D7 cont1.221, whole genome shotgun sequence Length of sequence - 820 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 26 - 403 325 ## gi|167754625|ref|ZP_02426752.1| hypothetical protein CLORAM_00127 2 1 Op 2 . 
- CDS 400 - 591 253 ## gi|237732617|ref|ZP_04563098.1| predicted protein - Prom 732 - 791 9.2 Predicted protein(s) >gi|223713994|gb|ACDT01000221.1| GENE 1 26 - 403 325 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754625|ref|ZP_02426752.1| ## NR: gi|167754625|ref|ZP_02426752.1| hypothetical protein CLORAM_00127 [Clostridium ramosum DSM 1402] # 1 125 1 125 306 233 99.0 3e-60 MRLGYDLCEVVSEDIVGLLKSFKEDHITVFQLTKVDDCTYRFYLPIYQRIMVRKYHLTII KSVGILYYLILIFYRKLSIVGVVSFAVTVILCNQFIFRVEIIGNNPSTTKLVNQVLAENH IDVGE >gi|223713994|gb|ACDT01000221.1| GENE 2 400 - 591 253 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732617|ref|ZP_04563098.1| ## NR: gi|237732617|ref|ZP_04563098.1| predicted protein [Mollicutes bacterium D7] # 1 63 1 63 63 104 100.0 2e-21 MIYMENKCLSVKYYECVILLENNVIEIKMANNTLKVTGADLEIRYYSDQEVIIYGKIDCL RFL Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:30 2011 Seq name: gi|223713993|gb|ACDT01000222.1| Coprobacillus sp. D7 cont1.222, whole genome shotgun sequence Length of sequence - 820 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 147 - 641 427 ## COG1943 Transposase and inactivated derivatives - Prom 680 - 739 3.7 Predicted protein(s) >gi|223713993|gb|ACDT01000222.1| GENE 1 147 - 641 427 164 aa, chain - ## HITS:1 COG:CAC3531 KEGG:ns NR:ns ## COG: CAC3531 COG1943 # Protein_GI_number: 15896768 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 6 159 4 157 157 197 61.0 1e-50 MAQKTNSLAHTKWMCKYHIVFTPKYRRKVIYNQLRNDIREIIIRLCQYRGVEIIEGHLMS DHVHMLVMIPPKLSVSSFMGYLKGKSALMIFDRHANLKYKYGNRHFWSEGYYVSTVGLND QTVAKYIREQEGHDIAMDKLSVKEYQNPFEDKEMKKENKKERRR Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:30 2011 Seq name: gi|223713992|gb|ACDT01000223.1| Coprobacillus sp. 
D7 cont1.223, whole genome shotgun sequence Length of sequence - 819 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 817 609 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase Predicted protein(s) >gi|223713992|gb|ACDT01000223.1| GENE 1 1 - 817 609 272 aa, chain + ## HITS:1 COG:CAC0892 KEGG:ns NR:ns ## COG: CAC0892 COG2876 # Protein_GI_number: 15894179 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Clostridium acetobutylicum # 1 272 65 335 337 377 67.0 1e-104 VSEPYKLANRAFHPDDTIVDVAGVKVGGDNLALIAGPCSVESEEQVIEIAKAAKAAGANI LRGGAFKPRTSPYAFQGMGSSGLDILVAAKEATGLPICSELMDAQYLEEFNEKVDLIQIG ARNMQNFDLLKKVGAGTKKPILLKRGLSATFEEWIMSAEYIMANGNPNVILCERGVRTFE TYTRNTLDLQAIPVVKKLTHLPIIIDPSHAGGKWWLVEPMAKAAVAAGADGLMIEVHNNP EKALCDGPQSLRPERYEELLKQISKIAEVVGK Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:31 2011 Seq name: gi|223713991|gb|ACDT01000224.1| Coprobacillus sp. D7 cont1.224, whole genome shotgun sequence Length of sequence - 804 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 291 204 ## gi|237732612|ref|ZP_04563093.1| predicted protein + Term 294 - 338 9.9 - Term 281 - 326 10.1 2 2 Tu 1 . 
- CDS 327 - 794 406 ## gi|237732613|ref|ZP_04563094.1| predicted protein Predicted protein(s) >gi|223713991|gb|ACDT01000224.1| GENE 1 1 - 291 204 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732612|ref|ZP_04563093.1| ## NR: gi|237732612|ref|ZP_04563093.1| predicted protein [Mollicutes bacterium D7] # 1 96 1 96 96 157 100.0 3e-37 DENVDYKKIGKFVTQKNKTIGGTGKKYNVSYKVADGYLLDGENKIMKLSSDIPDIEIING YDFMAIFNDNYEFDYYKSNKGFFAIESNKQRLIMKL >gi|223713991|gb|ACDT01000224.1| GENE 2 327 - 794 406 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732613|ref|ZP_04563094.1| ## NR: gi|237732613|ref|ZP_04563094.1| predicted protein [Mollicutes bacterium D7] # 1 155 1 155 155 249 100.0 3e-65 MLSLFIFCGCSSSIAKTSVSQEIHDLEETHIASDYSLIFQQYDLSTMASTKLYSENLPVD EYNKKLAIIISNSMENLYMICLNRINQDMDLLYQYSDTYIDYIAYLSTYADFYNEDNDTL KELLKIGDISSIKELNSYKYADMMLIEALETVSGN Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:44 2011 Seq name: gi|223713990|gb|ACDT01000225.1| Coprobacillus sp. D7 cont1.225, whole genome shotgun sequence Length of sequence - 788 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 205 199 ## HMPREF0868_0147 hypothetical protein Predicted protein(s) >gi|223713990|gb|ACDT01000225.1| GENE 1 2 - 205 199 67 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0868_0147 NR:ns ## KEGG: HMPREF0868_0147 # Name: not_defined # Def: hypothetical protein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 1 66 979 1044 1050 97 78.0 1e-19 VFLVLGADETSKGAANWAMGTENPNIMNVAATRAKEEFYIIGNKKLYLGLKSDVINDTNE VINKFNK Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:46 2011 Seq name: gi|223713989|gb|ACDT01000226.1| Coprobacillus sp. 
D7 cont1.226, whole genome shotgun sequence Length of sequence - 779 bp Number of predicted genes - 2, with homology - 0 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 210 168 ## - Prom 236 - 295 12.8 - Term 254 - 307 2.2 2 2 Tu 1 . - CDS 525 - 695 145 ## - Prom 719 - 778 4.0 Predicted protein(s) >gi|223713989|gb|ACDT01000226.1| GENE 1 3 - 210 168 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQFIVVDLEANPGDFKEIINIAAHKVNIDYKIFKEKTNNYVISSKKILSNNNFDVFVKPF FNGRLSKKM >gi|223713989|gb|ACDT01000226.1| GENE 2 525 - 695 145 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPTDFLLNILLELLENIYDDTEYIVANINDKSLSLSKILIQIGFQEIETNKKYLLN Prediction of potential genes in microbial genomes Time: Thu May 26 11:14:56 2011 Seq name: gi|223713988|gb|ACDT01000227.1| Coprobacillus sp. D7 cont1.227, whole genome shotgun sequence Length of sequence - 755 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 256 123 ## gi|237732608|ref|ZP_04563089.1| hypothetical protein MBAG_03354 2 1 Op 2 . + CDS 244 - 546 112 ## gi|237732609|ref|ZP_04563090.1| hypothetical protein MBAG_03355 3 1 Op 3 . 
+ CDS 566 - 755 78 ## gi|237732610|ref|ZP_04563091.1| hypothetical protein MBAG_03356 Predicted protein(s) >gi|223713988|gb|ACDT01000227.1| GENE 1 2 - 256 123 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732608|ref|ZP_04563089.1| ## NR: gi|237732608|ref|ZP_04563089.1| hypothetical protein MBAG_03354 [Mollicutes bacterium D7] # 1 84 1 84 84 153 98.0 3e-36 IINVIAEKCCDAKYGFNQLGGDNEILRMSDVEMRDFIITNILNQKQEYIKMITELQNGYL EENRLKEEFKPFRFQQTFHKIWSQ >gi|223713988|gb|ACDT01000227.1| GENE 2 244 - 546 112 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732609|ref|ZP_04563090.1| ## NR: gi|237732609|ref|ZP_04563090.1| hypothetical protein MBAG_03355 [Mollicutes bacterium D7] # 4 100 1 97 97 128 100.0 1e-28 MVTMIQPTSNVTETDFINLSNGIAQAIKTINDRKIEVESFTCPESMKNSRDNLKNNLEDY LTILNNIKNASDEKDITTIKNQYSNFKNIISSLQSVSTSL >gi|223713988|gb|ACDT01000227.1| GENE 3 566 - 755 78 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732610|ref|ZP_04563091.1| ## NR: gi|237732610|ref|ZP_04563091.1| hypothetical protein MBAG_03356 [Mollicutes bacterium D7] # 1 63 1 63 63 95 100.0 7e-19 MEQRIIKSTEYKEFIQYRSSRIPIKVFKEVGFSQIIKTADLKKIPYNKIPVLVCKEFKPF RFQ Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:10 2011 Seq name: gi|223713987|gb|ACDT01000228.1| Coprobacillus sp. D7 cont1.228, whole genome shotgun sequence Length of sequence - 754 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 4 - 399 82 ## gi|237732606|ref|ZP_04563087.1| conserved hypothetical protein Predicted protein(s) >gi|223713987|gb|ACDT01000228.1| GENE 1 4 - 399 82 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732606|ref|ZP_04563087.1| ## NR: gi|237732606|ref|ZP_04563087.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 131 1 131 131 223 99.0 4e-57 MQKTETGPFLTPYTKIHSRWIKDLHVRPKTIKTLEENLGITIQDIRMGKDFMSKTPKAMA TKAKIDKWDLIKLKSFCTAEETTIRVNRQPTEWEKIFATYSSDKGLISRICKELKQIYRK KTTPSKSGQRI Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:17 2011 Seq name: gi|223713986|gb|ACDT01000229.1| Coprobacillus sp. D7 cont1.229, whole genome shotgun sequence Length of sequence - 741 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 741 695 ## BDI_1865 putative formate acetyltransferase Predicted protein(s) >gi|223713986|gb|ACDT01000229.1| GENE 1 3 - 741 695 246 aa, chain + ## HITS:1 COG:no KEGG:BDI_1865 NR:ns ## KEGG: BDI_1865 # Name: not_defined # Def: putative formate acetyltransferase # Organism: P.distasonis # Pathway: not_defined # 33 245 95 326 759 85 27.0 2e-15 NRTKITSDDYFLSCFPSAICGFGSRNTNFIYNCKLERLAKLIEGTHDVEKEELEELYTFW ADENDKERLRRYYPNDIKELMPYDDFENDYYTAYPLYRLGGAYLDFEKLFDLGIDGLIHE IDSQPLNSFLRACKKSLIYLKELIKLYRDDVMDINPELAYTLNELLEHRPQNMKEAIQLM WIYVGVSEIRNYGRMDNQLARFLDDEQDAYKNIAEYFKVIRQRNTIYNGRIILGGEGRHD LEKANK Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:22 2011 Seq name: gi|223713985|gb|ACDT01000230.1| Coprobacillus sp. D7 cont1.230, whole genome shotgun sequence Length of sequence - 736 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 36 - 734 400 ## JDM1_2587 hypothetical protein Predicted protein(s) >gi|223713985|gb|ACDT01000230.1| GENE 1 36 - 734 400 232 aa, chain - ## HITS:1 COG:no KEGG:JDM1_2587 NR:ns ## KEGG: JDM1_2587 # Name: not_defined # Def: hypothetical protein # Organism: L.plantarum_JDM1 # Pathway: not_defined # 4 176 70 242 413 84 26.0 3e-15 IQPVKKKLKDEYGLKEENIIIGCIHSHSAPAYFKPFFEDVQIEEKLQKLLIDQFCTSIMN AHSSLKELDINITQTSIDGLYGNRNVKDAYSNKDLTILEFKDNGEILHSLLFIATHPTIL NGSNLYLSADLIGFIRKKYLEKFSNPCMIANGCCGDVSTRFYRESSGKDELERVSNEIIK QMNHLHKLNYSINTLKTSTVVEEYTYHGNDEFITSELTNLANKDDAAYKNVI Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:25 2011 Seq name: gi|223713984|gb|ACDT01000231.1| Coprobacillus sp. D7 cont1.231, whole genome shotgun sequence Length of sequence - 732 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 284 - 343 10.3 1 1 Tu 1 . + CDS 469 - 633 161 ## Predicted protein(s) >gi|223713984|gb|ACDT01000231.1| GENE 1 469 - 633 161 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKEVGLLHLAVMLFGLAGVIGKFVTLPAILITFGRVSFSSIFLLIIILIKKRS Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:30 2011 Seq name: gi|223713983|gb|ACDT01000232.1| Coprobacillus sp. D7 cont1.232, whole genome shotgun sequence Length of sequence - 722 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 35 - 94 3.8 1 1 Tu 1 . 
+ CDS 200 - 667 568 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases Predicted protein(s) >gi|223713983|gb|ACDT01000232.1| GENE 1 200 - 667 568 155 aa, chain + ## HITS:1 COG:SA1574 KEGG:ns NR:ns ## COG: SA1574 COG1187 # Protein_GI_number: 15927330 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Staphylococcus aureus N315 # 3 154 1 152 231 160 56.0 7e-40 MVMRLDKLLANYGIGTRKEVKSLIRKGFVKVNGMIIKKDDFKVDHEIDEIVFDDELIEYR PYVYIMLNKPAGYISATKDNLHPTVLELIEGYENYDLFPVGRLDLDTEGLLLITNDGDFS HKLLAPSRNHSKIYFANIAGVMDEQDIQAFKDGLF Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:31 2011 Seq name: gi|223713982|gb|ACDT01000233.1| Coprobacillus sp. D7 cont1.233, whole genome shotgun sequence Length of sequence - 713 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 40 - 711 615 ## COG0438 Glycosyltransferase Predicted protein(s) >gi|223713982|gb|ACDT01000233.1| GENE 1 40 - 711 615 223 aa, chain - ## HITS:1 COG:SMb21250 KEGG:ns NR:ns ## COG: SMb21250 COG0438 # Protein_GI_number: 16264502 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 52 219 233 400 427 106 33.0 4e-23 KVYRAVKKFDLFIPASKSIYEFYKEYLTGPQCIYLPLCIDEIPRNRVSLNTNQITVMGRL SKEKAYDDMLRVFKKVLEQNPDAKLNILGDGEERASLSILADQLAISNSVIFHGNTVGEA KNEIMNNTSVFVTTSHYESFGLVLLEAMSYGIPCVSFDSAKGSLDIIEDGKDGFIIRDRN LDEMANKIVELLNKTTKTLQNNAIRKAKRFSYQNVKKQWLKNL Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:31 2011 Seq name: gi|223713981|gb|ACDT01000234.1| Coprobacillus sp. 
D7 cont1.234, whole genome shotgun sequence Length of sequence - 694 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 653 652 ## COG0176 Transaldolase Predicted protein(s) >gi|223713981|gb|ACDT01000234.1| GENE 1 3 - 653 652 216 aa, chain + ## HITS:1 COG:lin2886 KEGG:ns NR:ns ## COG: lin2886 COG0176 # Protein_GI_number: 16801946 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Listeria innocua # 1 212 2 211 214 294 73.0 8e-80 RFFLDTANVEEIKKANDMGVICGVTTNPSIIAKEGRDFNEVIKEIAAIVDGPISGEVKAT TVDAEGMIKEGREIAKIHKNMVVKLPTTAEGLKACRVLSGEGIKTNLTLIFNVPQAILAA RAGATYVSPFVGRIDDISMDGLQLIRDIAAIFKTHDIKAQIISASVRNACHVIECAKAGA DLATVPYSVIEQMLKHPLTAEGIEKFQKDYRAVFGG Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:32 2011 Seq name: gi|223713980|gb|ACDT01000235.1| Coprobacillus sp. D7 cont1.235, whole genome shotgun sequence Length of sequence - 689 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 682 658 ## COG0176 Transaldolase Predicted protein(s) >gi|223713980|gb|ACDT01000235.1| GENE 1 2 - 682 658 226 aa, chain - ## HITS:1 COG:SPy2048 KEGG:ns NR:ns ## COG: SPy2048 COG0176 # Protein_GI_number: 15675818 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Streptococcus pyogenes M1 GAS # 1 217 1 217 222 206 49.0 2e-53 MEFIIDTVNLEEIKDAIDHMPIVGVTSNPSIVKATAPENFFDHMRKVRDIIGKDRSLHVQ VISKNCDEMVNEAHRILKEIDNRVYVKIPVSYEGIKAIKILKAEGINVTATAVYDLMQAY MALAAGADYIAPYVNRIGNLGSDPFELINELSNRIVMDDYDCKILAASFKGVQQVRDSFN SGAQAITAPVSVLKAIFNNPNIEKAVDDFNQDWYSVYGENKGICDL Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:33 2011 Seq name: gi|223713979|gb|ACDT01000236.1| Coprobacillus sp. 
D7 cont1.236, whole genome shotgun sequence Length of sequence - 688 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 686 364 ## COG0582 Integrase Predicted protein(s) >gi|223713979|gb|ACDT01000236.1| GENE 1 2 - 686 364 228 aa, chain - ## HITS:1 COG:BS_ydcL KEGG:ns NR:ns ## COG: BS_ydcL COG0582 # Protein_GI_number: 16077547 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus subtilis # 5 223 90 313 368 86 30.0 5e-17 IEQKIDIHIFTFFEDLEIKNIDKKVIIDWQNWMLDRGQTIKNTNKIRSILYNIIEYFVTE ELMENNPLNNVKRMVDNSPTEEMLFWTYDEFKKFIRVVDEPVYRRFYKFLYYTGCRKSEA KALIWKQIDFNRHVIAITRSLERINDDGTHIVNKPKNKMNRHILMNKELEDELYSYYLDR RMHFDFSMNEYVFGIDKPLADTTIEKKKNKYCNDAMVKQIRIQTLSVP Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:33 2011 Seq name: gi|223713978|gb|ACDT01000237.1| Coprobacillus sp. D7 cont1.237, whole genome shotgun sequence Length of sequence - 648 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 218 - 277 4.1 1 1 Tu 1 . + CDS 389 - 647 67 ## gi|167755080|ref|ZP_02427207.1| hypothetical protein CLORAM_00584 Predicted protein(s) >gi|223713978|gb|ACDT01000237.1| GENE 1 389 - 647 67 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167755080|ref|ZP_02427207.1| ## NR: gi|167755080|ref|ZP_02427207.1| hypothetical protein CLORAM_00584 [Clostridium ramosum DSM 1402] # 11 86 1 76 127 129 100.0 8e-29 MDMYNHYRTLMRLYFPKTIICIDSFRVLKKITVCLNSVRKRILRRYKDSQEYKLLKYRYE LLLKSGDKINDEKYFFDRTLGYTTSE Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:39 2011 Seq name: gi|223713977|gb|ACDT01000238.1| Coprobacillus sp. 
D7 cont1.238, whole genome shotgun sequence Length of sequence - 645 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 243 79 ## + Prom 245 - 304 6.9 2 1 Op 2 . + CDS 334 - 643 91 ## VV1762 hypothetical protein Predicted protein(s) >gi|223713977|gb|ACDT01000238.1| GENE 1 1 - 243 79 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no YQDKDLSYDKALAVIPVFKSKQEHVDFIGYTKTHINEFKEDVVNQNIDEMFPDYAKNVNT VIVYKLGKTMVQWLENWRYR >gi|223713977|gb|ACDT01000238.1| GENE 2 334 - 643 91 103 aa, chain + ## HITS:1 COG:no KEGG:VV1762 NR:ns ## KEGG: VV1762 # Name: not_defined # Def: hypothetical protein # Organism: V.vulnificus_YJ016 # Pathway: not_defined # 5 103 16 120 373 79 41.0 3e-14 MNDIEDEMRSLFEFSSIDEKKVTVENRKKISKFMKESRSAAKRNKCFHCKKEVSSFCNSH SVPAFTLRNIAKNGNLVNLNNFIEIPFQDKQSGLQKTGTFQII Prediction of potential genes in microbial genomes Time: Thu May 26 11:15:46 2011 Seq name: gi|223713976|gb|ACDT01000239.1| Coprobacillus sp. D7 cont1.239, whole genome shotgun sequence Length of sequence - 633 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 34 - 93 5.7 1 1 Tu 1 . 
+ CDS 138 - 617 335 ## COG1943 Transposase and inactivated derivatives

Predicted protein(s)
>gi|223713976|gb|ACDT01000239.1| GENE 1 138 - 617 335 159 aa, chain + ## HITS:1 COG:YPO2928 KEGG:ns NR:ns ## COG: YPO2928 COG1943 # Protein_GI_number: 16123115 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 8 146 3 141 152 208 68.0 3e-54
MNKQYSDDLHSLSHTKWSCKYHIVFAPKYRRRAFYEARRVEVGAILRQLCEWKGVNIIEA
EVCIDHVHMLVEIPPKYSVSGFMGFLKGKSSQMIYERWANTRYKYRHREFWCRGYYVDTV
GKNTKRIKNYIANQLKEDQEREQLTLDLEDPFKKGKGKN

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:15:47 2011
Seq name: gi|223713975|gb|ACDT01000240.1| Coprobacillus sp. D7 cont1.240, whole genome shotgun sequence
Length of sequence - 623 bp
Number of predicted genes - 2, with homology - 1
Number of transcription units - 1, operones - 1 average op.length - 2.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Op 1 . + CDS 3 - 353 351 ## COG0438 Glycosyltransferase
2 1 Op 2 . + CDS 406 - 622 75 ##

Predicted protein(s)
>gi|223713975|gb|ACDT01000240.1| GENE 1 3 - 353 351 116 aa, chain + ## HITS:1 COG:AF0043 KEGG:ns NR:ns ## COG: AF0043 COG0438 # Protein_GI_number: 11497664 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Archaeoglobus fulgidus # 10 111 248 349 363 63 34.0 7e-11
NIDCIKHTHNQEELAHLYSSADVFVNPTRDEVFGMVNIESLACGTPVVTFDTGGSPECID
ENTGIVVKKNDIDDLIDSIVRVCESNDFSAEKCIERAHHFDENILYDEYIDLYKKL
>gi|223713975|gb|ACDT01000240.1| GENE 2 406 - 622 75 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MNKRNSNDITIIALVIFEFINTMFRAFGMVNINQSIILREFDFIILLVLMIKHFISQKKI
RYYYSKIYSFII

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:15:52 2011
Seq name: gi|223713974|gb|ACDT01000241.1| Coprobacillus sp. D7 cont1.241, whole genome shotgun sequence
Length of sequence - 622 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 120 - 590 541 ## gi|167756944|ref|ZP_02429071.1| hypothetical protein CLORAM_02493

Predicted protein(s)
>gi|223713974|gb|ACDT01000241.1| GENE 1 120 - 590 541 156 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167756944|ref|ZP_02429071.1| ## NR: gi|167756944|ref|ZP_02429071.1| hypothetical protein CLORAM_02493 [Clostridium ramosum DSM 1402] # 1 156 1 156 156 212 99.0 5e-54
MNRQAITFLTLFSLILVLSIYCILLPPEGEASEVSVNNEELSQIEVLQQNLEKERADLIS
ENNAIIASSDSDSNKIAAALANISEAKETAALEAKITKIINDAGFKNAFVEVENKTIKVV
IDKKEASSSDANSIIKTVMEKTKNEYQVEVKFISET

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:15:59 2011
Seq name: gi|223713973|gb|ACDT01000242.1| Coprobacillus sp. D7 cont1.242, whole genome shotgun sequence
Length of sequence - 615 bp
Number of predicted genes - 2, with homology - 2
Number of transcription units - 1, operones - 1 average op.length - 2.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Op 1 . + CDS 3 - 422 291 ## Tmz1t_3800 glycosyl transferase group 1
2 1 Op 2 . + CDS 425 - 614 132 ## gi|237732592|ref|ZP_04563073.1| conserved hypothetical protein

Predicted protein(s)
>gi|223713973|gb|ACDT01000242.1| GENE 1 3 - 422 291 139 aa, chain + ## HITS:1 COG:no KEGG:Tmz1t_3800 NR:ns ## KEGG: Tmz1t_3800 # Name: not_defined # Def: glycosyl transferase group 1 # Organism: Thauera # Pathway: not_defined # 8 136 259 390 391 75 31.0 5e-13
RTDLFIYDNIDSSIVFLGKISHDKCLSLVSKCDFSTIIREDTLLSRAGFPTKLAESFNCG
TPVIVTPSSNIKEYINTNYGFVSKSCEYEDVKLLLKEVESVNKESINDMKSVIKSENPLA
YSKFTDELSKVINNSKKGK
>gi|223713973|gb|ACDT01000242.1| GENE 2 425 - 614 132 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732592|ref|ZP_04563073.1| ## NR: gi|237732592|ref|ZP_04563073.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 63 1 63 64 103 100.0 5e-21
MKVVQINPYCGKGSTGKICQSISERLVDNQIENYILYTQYNSDYNLGINYGINKKYVYLQ
SLI

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:06 2011
Seq name: gi|223713972|gb|ACDT01000243.1| Coprobacillus sp. D7 cont1.243, whole genome shotgun sequence
Length of sequence - 598 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . + CDS 6 - 597 366 ## CDR20291_2171 hypothetical protein

Predicted protein(s)
>gi|223713972|gb|ACDT01000243.1| GENE 1 6 - 597 366 197 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2171 NR:ns ## KEGG: CDR20291_2171 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 41 180 134 278 353 65 33.0 1e-09
MIIFAVIQIYTVLELLPNYLENLYQLFSQNHYIIDATKYLDVIYSGSMSIVESISTGFIT
GLIAFIMKIPSILFDLIFVVITSLFILLDYSRIERLVVKRYEMVALIIDTIKEVLSNLFK
AYFIIMIITFGELWLGFKIIGINHSIMLACIIAIFDFMPVLGLDMIMIPWIIISALTNKI
YLAGALLVIYMIIVITK

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:10 2011
Seq name: gi|223713971|gb|ACDT01000244.1| Coprobacillus sp. D7 cont1.244, whole genome shotgun sequence
Length of sequence - 593 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 40 - 552 368 ## COG0637 Predicted phosphatase/phosphohexomutase

Predicted protein(s)
>gi|223713971|gb|ACDT01000244.1| GENE 1 40 - 552 368 170 aa, chain - ## HITS:1 COG:CC2096 KEGG:ns NR:ns ## COG: CC2096 COG0637 # Protein_GI_number: 16126335 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Caulobacter vibrioides # 29 163 84 218 221 94 36.0 7e-20
MNFIKNKYNIDFDTDEKINQFKILEEQYLLKNSVELKKGLIQLLKYLNIHYYKTIVATSS
GKERAERILGEHNLMKYFNGIVCGSEVEHGKPAPDIFLKACDKLNVEPEEALVLEDSEAG
IQAASEAKISVICIPDMKFPQEKYLKKVEHVYDSLEDVISYLEMKKDISK

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:10 2011
Seq name: gi|223713970|gb|ACDT01000245.1| Coprobacillus sp. D7 cont1.245, whole genome shotgun sequence
Length of sequence - 581 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . + CDS 2 - 581 205 ## gi|237732588|ref|ZP_04563069.1| conserved hypothetical protein

Predicted protein(s)
>gi|223713970|gb|ACDT01000245.1| GENE 1 2 - 581 205 193 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732588|ref|ZP_04563069.1| ## NR: gi|237732588|ref|ZP_04563069.1| conserved hypothetical protein [Mollicutes bacterium D7] # 6 193 1 188 188 343 99.0 5e-93
SLLLDLSILRTDIQKEEFLTVSIRKKEKIKKAITELEEGDYIISVNAQLMTINTFKTKEI
ICPHCQRITHIKAKGETTELVLNDFSTIKKDKIQSTLPGVNKVLLMGNLYNDPKFRSAIN
NLDYIKYKIVVTNYEDGNEIKSFPYIVSFNKEAQNAKLYLKKGNTVFIEGALQERHFKQR
TKSFCTHCETEFD

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:18 2011
Seq name: gi|223713969|gb|ACDT01000246.1| Coprobacillus sp. D7 cont1.246, whole genome shotgun sequence
Length of sequence - 580 bp
Number of predicted genes - 0

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:19 2011
Seq name: gi|223713968|gb|ACDT01000247.1| Coprobacillus sp. D7 cont1.247, whole genome shotgun sequence
Length of sequence - 579 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . + CDS 2 - 520 634 ## CLM_1301 hypothetical protein

Predicted protein(s)
>gi|223713968|gb|ACDT01000247.1| GENE 1 2 - 520 634 172 aa, chain + ## HITS:1 COG:no KEGG:CLM_1301 NR:ns ## KEGG: CLM_1301 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A2 # Pathway: not_defined # 1 160 417 576 579 228 68.0 5e-59
EIGAFVRLDDLDAGYSFKELDRSVFMNPDKVNARIVIPITDYKDVVAHHDVDFFLYANNY
EEGGESLSFFDDVNSALDVFRAGNRMAKGTTTEYGLVGSYFANPFGPVQREEQTEVILVD
YFNRMFKQGVKVGQLRTRLGIKGNEHKGPQEAATAILKYITGEDELKQLEEE

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:22 2011
Seq name: gi|223713967|gb|ACDT01000248.1| Coprobacillus sp. D7 cont1.248, whole genome shotgun sequence
Length of sequence - 565 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 192 - 548 158 ## gi|167755064|ref|ZP_02427191.1| hypothetical protein CLORAM_00568

Predicted protein(s)
>gi|223713967|gb|ACDT01000248.1| GENE 1 192 - 548 158 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167755064|ref|ZP_02427191.1| ## NR: gi|167755064|ref|ZP_02427191.1| hypothetical protein CLORAM_00568 [Clostridium ramosum DSM 1402] # 1 117 244 360 820 236 99.0 3e-61
MYYHKTCFKHSAHKLKRLQRTANWACSFARLRTKKDIISQHNGPKQKISILSLHLQVGGI
EKAICSLANMLVDNYDVEIINVYKLCEPSFYIDERVNVSYLSTDLKPNKEEFKYATKK

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:29 2011
Seq name: gi|223713966|gb|ACDT01000249.1| Coprobacillus sp. D7 cont1.249, whole genome shotgun sequence
Length of sequence - 554 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 3 - 363 149 ## gi|237732585|ref|ZP_04563066.1| conserved hypothetical protein
- Prom 388 - 447 2.8

Predicted protein(s)
>gi|223713966|gb|ACDT01000249.1| GENE 1 3 - 363 149 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732585|ref|ZP_04563066.1| ## NR: gi|237732585|ref|ZP_04563066.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 120 17 136 137 172 99.0 4e-42
MEMNGIVIEWNRMDSLNGIRWNHRMDWNGIIEWTRKGSLLNGIEWNHRMVSIGIIIKWNR
MESPNRIEWNNHRMDSNGIILKWNRMELSNESGIIECNRIESSNGLEWNHRMEWNGTVNE

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:35 2011
Seq name: gi|223713965|gb|ACDT01000250.1| Coprobacillus sp. D7 cont1.250, whole genome shotgun sequence
Length of sequence - 547 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 117 - 545 364 ## COG0622 Predicted phosphoesterase

Predicted protein(s)
>gi|223713965|gb|ACDT01000250.1| GENE 1 117 - 545 364 142 aa, chain - ## HITS:1 COG:PA0351 KEGG:ns NR:ns ## COG: PA0351 COG0622 # Protein_GI_number: 15595548 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Pseudomonas aeruginosa # 1 128 8 145 157 99 38.0 2e-21
IRGGVIADTHDILKEEVLNELKQCDYIIHAGDIVDIKILERLQTITKVFVVKGNNDKLSL
NEEEYFNIGGYSFYLVHQLDTKKDVDFYIYGHSHQLACYQKGRTLYLNPGSCGKKRFSLP
LTYIILDLYEDHYEFLVKECSK

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:35 2011
Seq name: gi|223713964|gb|ACDT01000251.1| Coprobacillus sp. D7 cont1.251, whole genome shotgun sequence
Length of sequence - 530 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 180 - 434 219 ## gi|237732583|ref|ZP_04563064.1| predicted protein

Predicted protein(s)
>gi|223713964|gb|ACDT01000251.1| GENE 1 180 - 434 219 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732583|ref|ZP_04563064.1| ## NR: gi|237732583|ref|ZP_04563064.1| predicted protein [Mollicutes bacterium D7] # 1 84 26 109 109 149 100.0 4e-35
MVISERLGHRDRNQTLNRYAHLFPNAQQNGIKKINRLSNIYDEADIRLGAVVVDFLEKLD
SIENPNQQEQEFIESFKELVFKSK

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:41 2011
Seq name: gi|223713963|gb|ACDT01000252.1| Coprobacillus sp. D7 cont1.252, whole genome shotgun sequence
Length of sequence - 524 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . + CDS 1 - 523 295 ## Fnod_1435 methyl-accepting chemotaxis sensory transducer

Predicted protein(s)
>gi|223713963|gb|ACDT01000252.1| GENE 1 1 - 523 295 174 aa, chain + ## HITS:1 COG:no KEGG:Fnod_1435 NR:ns ## KEGG: Fnod_1435 # Name: not_defined # Def: methyl-accepting chemotaxis sensory transducer # Organism: F.nodosum # Pathway: Two-component system [PATH:fno02020]; Bacterial chemotaxis [PATH:fno02030] # 5 122 91 213 678 88 41.0 1e-16
GILQEPLLDYLNQLAQRGGVKRFGIADLQGNVVTSDHQTFNISDRDYFKDSLNGKTVTSK
SITDYTDGEAINVYSVPIYKHNEVSGVLFATFYTDKLSSILSSATYNNVGYAFIFNENGD
VVLSNKKTDDFNNIESLNGIDLNDIDINGKGIINFRDENNTLNYLIYANIEGNV

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:44 2011
Seq name: gi|223713962|gb|ACDT01000253.1| Coprobacillus sp. D7 cont1.253, whole genome shotgun sequence
Length of sequence - 523 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
+ Prom 263 - 322 8.8
1 1 Tu 1 . + CDS 364 - 523 109 ## gi|237732581|ref|ZP_04563062.1| conserved hypothetical protein

Predicted protein(s)
>gi|223713962|gb|ACDT01000253.1| GENE 1 364 - 523 109 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237732581|ref|ZP_04563062.1| ## NR: gi|237732581|ref|ZP_04563062.1| conserved hypothetical protein [Mollicutes bacterium D7] # 1 53 1 53 53 100 100.0 4e-20
MSKKRIMIIAASKLPIPAYKGGATETLITNLLNDKLIVDNDNYSIDVYSHCEG

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:49 2011
Seq name: gi|223713961|gb|ACDT01000254.1| Coprobacillus sp. D7 cont1.254, whole genome shotgun sequence
Length of sequence - 523 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 87 - 521 593 ## SUB0840 transporter protein

Predicted protein(s)
>gi|223713961|gb|ACDT01000254.1| GENE 1 87 - 521 593 144 aa, chain - ## HITS:1 COG:no KEGG:SUB0840 NR:ns ## KEGG: SUB0840 # Name: not_defined # Def: transporter protein # Organism: S.uberis # Pathway: not_defined # 1 144 166 309 309 110 44.0 1e-23
IVGLVLNFTGIYDLLIASQFGEMITGTLSQATTPIVSMILFILGYDLNVDKKTLVPILKL
MGIKIVYYAMVIAGFFILFPAQMADKTFMMAPIIYFMCPTGFGLMPVIAPLYKDEDDASF
TSAFVSIFMIITLIVYTLVVIFIA

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:16:52 2011
Seq name: gi|223713960|gb|ACDT01000255.1| Coprobacillus sp. D7 cont1.255, whole genome shotgun sequence
Length of sequence - 518 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 3 - 516 348 ## gi|237732578|ref|ZP_04563059.1| hypothetical protein MBAG_03386

Predicted protein(s)
>gi|223713960|gb|ACDT01000255.1| GENE 1 3 - 516 348 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237732578|ref|ZP_04563059.1| ## NR: gi|237732578|ref|ZP_04563059.1| hypothetical protein MBAG_03386 [Mollicutes bacterium D7] # 1 171 1 171 172 240 100.0 2e-62
KFEVPGEYTVIYKLSNKVKSIKKKLIVKVIDDIEPVLVLNKTNLTLEYGQNIDLKDYIEK
AEDNIDGNLVEKVDYNQIDTYKPGNHVVTYSVKDKAGNHVSVEMSININEKPKVTEKSTN
SNIENNVTTNKSNNSNNSKSGQENIPSQFDKFFSGNSIDVYNQANSNPFGS

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:17:00 2011
Seq name: gi|223713959|gb|ACDT01000256.1| Coprobacillus sp. D7 cont1.256, whole genome shotgun sequence
Length of sequence - 517 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . + CDS 44 - 382 187 ## Sde_3464 hypothetical protein

Predicted protein(s)
>gi|223713959|gb|ACDT01000256.1| GENE 1 44 - 382 187 112 aa, chain + ## HITS:1 COG:no KEGG:Sde_3464 NR:ns ## KEGG: Sde_3464 # Name: not_defined # Def: hypothetical protein # Organism: S.degradans # Pathway: not_defined # 5 112 176 283 366 92 47.0 4e-18
MQLMEAEYLDYEEYYDGYKKAKRNLEKDWSDYFLIYYKKLDYTVPIAFQDKIALINSFDN
EIINNIYNFEPNYHTKDLHICVFPLESSTAIIMFVHNKDASRYRKFYKNSEK

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:17:03 2011
Seq name: gi|223713958|gb|ACDT01000257.1| Coprobacillus sp. D7 cont1.257, whole genome shotgun sequence
Length of sequence - 515 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 1 - 514 682 ## COG1592 Rubrerythrin

Predicted protein(s)
>gi|223713958|gb|ACDT01000257.1| GENE 1 1 - 514 682 171 aa, chain - ## HITS:1 COG:CAC3598 KEGG:ns NR:ns ## COG: CAC3598 COG1592 # Protein_GI_number: 15896832 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 171 1 172 181 204 66.0 6e-53
MAKFVCSVCGYVYEGDAAPEQCPVCKVGADKFVEQTGEMTWAAEHVVGVAKDVDEEIKKE
LRANFEGECSEVGMYLAMARVAHREGYPEIGLYWEKAAYEEAEHAAKFAELLGEVVTDST
KKNLEMRVEAENGATMGKTELAKKAKALNLDAIHDTVHEMARDEARHGKAF

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:17:03 2011
Seq name: gi|223713957|gb|ACDT01000258.1| Coprobacillus sp. D7 cont1.258, whole genome shotgun sequence
Length of sequence - 515 bp
Number of predicted genes - 0

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:17:04 2011
Seq name: gi|223713956|gb|ACDT01000259.1| Coprobacillus sp. D7 cont1.259, whole genome shotgun sequence
Length of sequence - 512 bp
Number of predicted genes - 0

Prediction of potential genes in microbial genomes
Time: Thu May 26 11:17:04 2011
Seq name: gi|223713955|gb|ACDT01000260.1| Coprobacillus sp. D7 cont1.260, whole genome shotgun sequence
Length of sequence - 510 bp
Number of predicted genes - 0